Taint-based Dynamic Analysis
CoC Research Day - 9/25/2009
Designed at Apple in California; assembled at Georgia Tech
Dynamic Tainting Overview
1 Assign taint marks
2 Propagate taint marks
3 Check taint marks
[Figure: diagram of values A, B, C, and Z annotated with numbered taint marks]
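The three steps above can be sketched in a few lines of C. This is only an illustration, not the instrumentation used in the talk: the `tainted_int` type, the bitmask representation, and the function names `taint_assign`, `taint_add`, and `taint_has` are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shadow state: each value carries a bitmask of taint marks. */
typedef uint32_t taint_set;

typedef struct {
    int value;
    taint_set taint;   /* bit k set means taint mark k+1 is attached */
} tainted_int;

/* Step 1: assign a taint mark (numbered from 1) to a value of interest. */
static tainted_int taint_assign(int value, unsigned mark) {
    assert(mark >= 1 && mark <= 32);
    tainted_int t = { value, (taint_set)1u << (mark - 1) };
    return t;
}

/* Step 2: propagate. The result carries the union of its operands' marks. */
static tainted_int taint_add(tainted_int a, tainted_int b) {
    tainted_int r = { a.value + b.value, a.taint | b.taint };
    return r;
}

/* Step 3: check. Returns 1 if the value carries the given mark, else 0. */
static int taint_has(tainted_int t, unsigned mark) {
    assert(mark >= 1 && mark <= 32);
    return (int)((t.taint >> (mark - 1)) & 1u);
}
```

With `a = taint_assign(3, 1)` and `b = taint_assign(4, 2)`, the sum `taint_add(a, b)` carries marks 1 and 2 but not mark 3, which is the kind of fact a check at a sensitive point would test.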
Dynamic Tainting Applications
Attack detection / prevention: prevent stack smashing, SQL injection, buffer overruns, etc.
Information policy enforcement: ensure classified information does not leave the system
Testing: coverage metrics, test data generation heuristics, etc.
Memory errors: detect illegal memory accesses, leaks, etc.
Data lifetime: track how long sensitive data remains in an application
addhash(char hname[]) {
    int i;
    HASHPTR hptr;
    unsigned int hsum = 0;

    /* Hash the name by summing its characters. */
    for (i = 0; i < strlen(hname); i++) {
        hsum += (unsigned int) hname[i];
    }
    hsum %= 3001;
    if ((hptr = hashtab[hsum]) == (HASHPTR) NULL) {
        /* Empty bucket: allocate the node, then a copy of the name. */
        hptr = hashtab[hsum] = (HASHPTR) malloc(sizeof(HASHBOX));
        hptr->hnext = (HASHPTR) NULL;
        hptr->hnum = ++netctr;
        hptr->hname = (char *) malloc((strlen(hname) + 1) * sizeof(char));
        sprintf(hptr->hname, "%s", hname);
        return(1);
    } else {
        ...
    }
}
Detecting leaks is easy; fixing them is not
Leak Detection Overview
Discover where the last pointer to un-freed memory is lost
Assign taint marks
    ptr1 = malloc(...) ➔ ptr1
    ptr2 = calloc(...) ➔ ptr2
Propagate taint marks
    ptr3 = ptr1 ➔ ptr3, ptr1
    ptr1 = NULL ➔ ptr1, ptr3
    ptr4 = ptr2 + 1 ➔ ptr4, ptr2
    In general, propagation follows standard pointer arithmetic rules
Check taint marks
    Report error if taint mark’s count is zero and memory has not been freed.
Each taint mark carries a count: # of pointers tainted with this color
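The counting scheme above can be sketched as follows. This is a simplified model, not Leakpoint's implementation: the `taint_mark` and `shadow_ptr` types and the `on_alloc`, `on_copy`, `on_clear`, and `on_free` hooks are hypothetical names, and a real tool would attach this bookkeeping to every pointer-sized location automatically.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical taint mark: one per allocation. */
typedef struct {
    int count;    /* # of pointers currently tainted with this mark */
    int freed;    /* set once the allocation has been freed */
    int leaked;   /* set when the last pointer is lost before free */
} taint_mark;

/* Hypothetical shadow slot for one pointer variable. */
typedef struct {
    taint_mark *mark;   /* NULL when the pointer is untainted */
} shadow_ptr;

/* Check: runs whenever a pointer stops carrying its mark. */
static void drop_mark(shadow_ptr *p) {
    taint_mark *m = p->mark;
    p->mark = NULL;
    if (m == NULL) return;
    assert(m->count > 0);
    m->count--;
    if (m->count == 0 && !m->freed) m->leaked = 1;   /* last pointer lost */
}

/* Assign: the pointer returned by malloc/calloc gets a fresh mark. */
static void on_alloc(shadow_ptr *dst, taint_mark *fresh) {
    drop_mark(dst);
    fresh->count = 1;
    fresh->freed = 0;
    fresh->leaked = 0;
    dst->mark = fresh;
}

/* Propagate: dst = src or dst = src + k, so dst takes src's mark. */
static void on_copy(shadow_ptr *dst, const shadow_ptr *src) {
    taint_mark *m = src->mark;
    if (m != NULL) m->count++;   /* count first, so that p = p is safe */
    drop_mark(dst);
    dst->mark = m;
}

/* dst = NULL, or any other untainted value. */
static void on_clear(shadow_ptr *dst) {
    drop_mark(dst);
}

/* free(p): record that the memory is gone, so losing pointers is fine. */
static void on_free(const shadow_ptr *p) {
    if (p->mark != NULL) p->mark->freed = 1;
}
```

Replaying the slide's sequence: after `ptr3 = ptr1` the first mark's count is 2, and after `ptr1 = NULL` it drops to 1. Nothing is reported until the last carrier, `ptr3`, is also overwritten without a preceding free; that overwrite is the location worth reporting.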
Detecting leaks is easy; fixing them is, too
Allocation in addhash that is leaked:
    hptr->hname = (char *) malloc((strlen(hname) + 1) * sizeof(char));
delHtab() {
    int i;
    HASHPTR hptr, zapptr;
    for (i = 0; i < 3001; i++) {
        hptr = hashtab[i];
        if (hptr != (HASHPTR) NULL) {
            zapptr = hptr;
            while (hptr->hnext != (HASHPTR) NULL) {
                hptr = hptr->hnext;
                free(zapptr);
                zapptr = hptr;
            }
            free(hptr);
        }
    }
    free(hashtab);
    return;
}
Fix: free(hptr->hname)
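The slide names only the missing call, not where it goes. One way it might be applied is sketched below, under several assumptions: the `HASHBOX` layout contains just the fields visible on the slides, `HASH_SIZE` is a name introduced here for the 3001 loop bound, and the loop is restructured so that every node, not only the last one in a chain, releases its name. It is not the developers' actual patch.

```c
#include <assert.h>
#include <stdlib.h>

#define HASH_SIZE 3001   /* assumed name for the loop bound on the slide */

/* Assumed layout: only the fields the slides show. */
typedef struct hashbox {
    char *hname;
    int hnum;
    struct hashbox *hnext;
} HASHBOX, *HASHPTR;

static HASHPTR *hashtab;

/* Free every chain, releasing each node's name before the node itself. */
static void delHtab(void) {
    int i;
    HASHPTR hptr, zapptr;

    assert(hashtab != NULL);
    for (i = 0; i < HASH_SIZE; i++) {
        hptr = hashtab[i];
        while (hptr != NULL) {
            zapptr = hptr;
            hptr = hptr->hnext;
            free(zapptr->hname);   /* the call the original never made */
            free(zapptr);
        }
    }
    free(hashtab);
    hashtab = NULL;
}
```

Freeing `hname` immediately before its owning node keeps the last pointer to the string alive until the string itself is released, which is exactly the condition the leak check looks for.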
Leakpoint implementation
Pointer to memory area 0x1C93AC0 (16 bytes)
allocated:
    at malloc
    by addhash (hash.c:50)
    by parser (parser.c:210)
    by readcell (parser.c:34)
    by main (main.c:98)
was leaked:
    at free
    by delHtab (hash.c:28)
    by grdcell (grdcell.c:354)
    by main (main.c:227)
Evaluation
Transmission
Locations identified by Leakpoint correspond to where the leaks were fixed by developers.
Also found thousands of leaks in the SPEC INT benchmarks
static void processCompletedTasks(tr_web *web) {
    ...
    task->done_func(web->session, ..., task->done_func_user_data);
    ...
    evbuffer_free(task->response);
    tr_free(task->url);
    tr_free(task);
    ...
}
static void invokeRequest(void * vreq) {
    ...
    hash = tr_new0(uint8_t, SHA_DIGEST_LENGTH);
    memcpy(hash, req->torrent_hash, SHA_DIGEST_LENGTH);
    tr_webRun(req->session, req->url, req->done_func, hash);
    ...
}
static void onStoppedResponse(tr_session *session, ..., void *torrent_hash) {
    dbgmsg(NULL, "got a response ... message");
    // tr_free(torrent_hash);
    onReqDone(session);
}
Overhead
Powerful but expensive: 50-100x overheads are common
• The cost is execution time, and execution is completely automated
• Developers have to think less
Questions?