Carnegie Mellon
Log Based Dynamic Binary Analysis for Detecting Device Driver Defects
Olatunji Ruwase
Thesis Proposal
Thesis Committee:
• Todd C. Mowry (Chair)
• David Andersen
• Onur Mutlu
• Brad Chen (Google)
• Michael Swift (U. Wisconsin)
Device Drivers: The Good, The Bad, & The Ugly
Good: Enable use of hardware devices
• Kernel module in commodity OS
• Distributed in binary form
Bad: Poor code quality [Chou01, Murphy04]
• Written by non-kernel experts
• Poorly tested
Ugly: Major cause of system failures
• System crashes
• OS corruption
• Application corruption
• Device damage
Detect bugs in production driver executions
04/22/23 Log Based Dynamic Binary Analysis for Detecting Device Driver Defects 2
Program Monitoring Using Lifeguards
Lifeguards: dynamic correctness checking tools
• Dynamic binary analysis to work on unmodified binaries
• Instruction-grained analysis to catch subtle bugs
• Versatility to catch a broad range of bugs
  • Memory [Nethercote07]
  • Security [Newsome05, Castro05]
  • Concurrency [Savage97, Yu05, Flanagan09]
  • Multilingual program interface [Lee10]
[Figure: a program and the Lifeguard that monitors its instruction stream]
  …
  eax = X
  edx = eax
  Y = edx + 1
  jmp ecx
  …
Can Lifeguards be used to catch driver bugs?
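To make "instruction-grained analysis" concrete, here is a minimal sketch of a lifeguard that keeps one metadata bit per location and updates it for every instruction it consumes. The taint-style policy, the tuple encoding of instructions, and the name `run_lifeguard` are assumptions made for illustration; the slides do not specify which lifeguard the example sequence belongs to.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical instruction encoding: (opcode, destination, source locations).
Instr = Tuple[str, str, List[str]]


def run_lifeguard(trace: List[Instr], tainted_inputs: Set[str]) -> List[str]:
    """Propagate a taint bit per location and flag jumps through tainted targets."""
    taint: Dict[str, bool] = {name: True for name in tainted_inputs}
    alerts: List[str] = []
    for opcode, dest, sources in trace:
        if opcode == "jmp":
            # An indirect jump whose target derives from untrusted data is flagged.
            if taint.get(dest, False):
                alerts.append(f"jmp through tainted {dest}")
        else:
            # mov/add: the destination is tainted if any source is tainted.
            taint[dest] = any(taint.get(s, False) for s in sources)
    return alerts


# The instruction sequence from the slide: eax = X; edx = eax; Y = edx + 1; jmp ecx
trace: List[Instr] = [
    ("mov", "eax", ["X"]),
    ("mov", "edx", ["eax"]),
    ("add", "Y", ["edx"]),
    ("jmp", "ecx", []),
]
```

With `X` as the only tainted input, the taint reaches `Y` but not `ecx`, so the jump raises no alert; it would only if `ecx` were later loaded from `Y`.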
Why Drivers Are Difficult To Write Correctly [Ryzhyk09_Dingo]
• Interface issues
  • Network stack
  • Kernel resources
  • Hardware device
• Concurrency issues
  • Reentrant interrupt handling
• Generic C language issues
  • Memory management
Lifeguards effectively detect a similar spectrum of issues in applications
[Figure: user space above the system call boundary; below it, the upper layers of the network stack, kernel resource management, and the driver]
• Synchronous: main memory & CPU registers
• Asynchronous: I/O memory & interrupts
Potential Uses of Driver Lifeguards
• Diagnosing system failures
  • Test sites
  • Customer sites
• Detecting “silent” faults
  • Test sites
  • Customer sites
Outline
• Motivation
• Overview of Lifeguard Deployment
• Thesis Question
• Related work
• Research Challenges
  • Preliminary work
  • Current and Future work
• Timeline
Lifeguard Deployment Approaches
Dynamic Binary Instrumentation [PIN, VALGRIND]
• Fault isolation
• Imprecise checking of parallel execution
Logging [AFTERSIGHT, LBA, SPECK]
• Monitor parallel execution [Pokam09, Vlachos10]
• Accelerate lifeguard execution [Chen08, Nightingale08, Ruwase08, Ruwase10]
✘ Require fault containment
  ✘ Protect Lifeguard
  ✘ Restrict damages to faulting program
[Figure: a multithreaded monitored program (*p = …; check_store(p); p = NULL) and a Memory Lifeguard, connected by an execution trace]
Log Based Lifeguards are more promising for monitoring kernel mode drivers
Thesis Questions
Can Log Based Lifeguards precisely detect faults in the executions of device drivers?
• Can Log Based monitoring be adapted for drivers?
• Will the Lifeguards be efficient enough for production systems (Mobile, Desktop, Cloud)?
Eliminating Driver Faults During Development
Avoid overheads of runtime fault detection or isolation
✖ Cannot find all faults in production drivers
• Static analysis [Metal, RacerX, SLAM]
  ✖ Drivers are too complex
• Testing [DDT]
  ✖ Drivers have too many execution paths
• Synthesize driver code [Termite]
  ✖ Cannot synthesize complex features, e.g. multithreading
Lifeguards to detect other faults
• Customer sites
• Testing sites
[Figure: the driver below the upper layers of the network stack and the syscall boundary]
Using Existing Hardware to Isolate Driver Faults
Prevent system failures due to driver faults
✖ Little information on driver faults
• Page table permissions [Nooks]
• User space drivers [Microdrivers, SUD]
Lifeguards on customer systems
• Pinpoint fault location to aid debugging
• Detect “silent” driver faults
[Figure: the driver below the upper layers of the network stack and the syscall boundary]
Checking Driver Execution to Isolate Faults
• Pinpoint fault location
• Detect “silent” faults
Instrumented software checks [SafeDrive, XFI, BGI]
• Imprecise on parallel execution
• Only memory faults studied
• Logging works for parallel execution
• Lifeguards for high level faults
Hardware breakpoints [DataCollider]
• Sampling approach misses real faults
• Lifeguard finds all faults in execution
[Figure: the driver below the upper layers of the network stack and the syscall boundary]
Related Work Summary
Eliminating Driver faults during development
• Static analysis [Metal, RacerX, SLAM]
• Testing [DDT]
• Synthesizing driver code [Termite]
Using existing hardware to isolate Driver faults
• Page table permissions [Nooks]
• User space drivers [Microdrivers, SUD]
Checking Driver execution to isolate faults
• Instrumented software checks [SafeDrive, XFI, BGI]
• Hardware breakpoints [DataCollider]
Research Challenges
Preliminary work
• Adapting Log Based Monitoring for Drivers
• Understanding Device Drivers
Current and Future work
• Detecting Common Driver Faults (Driver Lifeguards)
• Efficiency of Driver Lifeguards
Log Based Architectures (LBA) [Chen08]
Execution logging
• Toggle when monitored thread is (de)scheduled
Fault containment
• Lifeguard as separate process
• Block program at system calls until Lifeguard catches up
[Figure: simulated LBA design, with the Program and the Lifeguard running on the Operating System and connected by a hardware log]
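The two LBA mechanisms above can be sketched as a toy model: the program appends records to a log without waiting, and a system call stalls it until the lifeguard has drained the log. The class `LogBasedMonitor` and its method names are hypothetical, and the real design uses a hardware log and a separate lifeguard process rather than a single Python object.

```python
from collections import deque


class LogBasedMonitor:
    """Toy model: the program appends records to a log; the lifeguard drains it."""

    def __init__(self) -> None:
        self.log = deque()   # records produced but not yet checked
        self.checked = []    # records the lifeguard has already analyzed

    def record(self, event: str) -> None:
        # Execution logging: the program runs ahead of the lifeguard.
        self.log.append(event)

    def lifeguard_step(self) -> None:
        # The lifeguard consumes one record, if any is pending.
        if self.log:
            self.checked.append(self.log.popleft())

    def system_call(self, name: str) -> int:
        # Fault containment: stall the program until the log is drained, so no
        # unchecked effect escapes through the system call.
        stalled_steps = 0
        while self.log:
            self.lifeguard_step()
            stalled_steps += 1
        self.checked.append(f"syscall:{name}")
        return stalled_steps
```

The return value of `system_call` is the number of records the program had to wait for, which is one way to see why lifeguard speed matters for the monitored program.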
Adapting Execution Logging for Driver Monitoring
Toggle point difficulty
• Complete information for precise fault detection
• Efficient
  • Modest storage and bandwidth costs
  • No lifeguard filtering costs
Option                Toggle        Complete   Efficient
Kernel [AFTERSIGHT]   Ring change   ✔          ✗
I/O stack             I/O syscall   ✔          ✗
Driver                Code region   ✔          ✔
Identify driver entry points at load time
[Figure: the driver below the upper layers of the network stack and the system call boundary]
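The "code region" toggle in the last row can be sketched as a filter on program counters: logging is on only while execution is inside the driver's address range. The function name, the half-open range, and the list-of-addresses trace are assumptions for illustration; in the proposal the toggle is part of the logging mechanism, not a post-hoc filter.

```python
from typing import Iterable, List, Tuple


def log_driver_instructions(pcs: Iterable[int], region: Tuple[int, int]) -> List[int]:
    """Keep only the program counters that fall inside the driver's code region."""
    start, end = region  # half-open [start, end), assumed known once the module is loaded
    logged: List[int] = []
    logging_on = False
    for pc in pcs:
        inside = start <= pc < end
        if inside != logging_on:
            # Toggle point: execution crossed the boundary of the driver's code.
            logging_on = inside
        if logging_on:
            logged.append(pc)
    return logged
```

Because records outside the region are never produced, the lifeguard pays no filtering cost, which is the "efficient" property the table attributes to this option.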
Adapting Fault Containment for Driver Monitoring
Execution logging
• Toggle when monitored thread is (de)scheduled
Fault containment
• Lifeguard as separate process
• Block program at system calls until Lifeguard catches up
[Figure: the Driver inside the Operating System, connected to the Lifeguard by a hardware log]
Adapting Fault Containment for Driver Monitoring
Virtual Machine (VM) separation to protect Lifeguard [AFTERSIGHT]
• Rest of system remains vulnerable to driver faults
• Overhead of VM is high
[Figure: the Driver and the Lifeguard each on their own OS, connected by a hardware log]
Understanding Device Drivers
• Network functions: hard_start_xmit(), irq_handler(), open(), stop(), get_stats(), ...
• PCI bus functions: probe(), remove()
• Required functions: module_init(), module_cleanup()
[Figure: the driver below the upper layers of the network stack and the syscall boundary]
Adapting Data Race Lifeguard for Network Drivers
Data race on X
• Two accesses to X where at least one access is a write
• No explicit synchronization between the accesses
Lockset algorithm for detecting races in applications [Eraser]
• Shared data protected with a consistent set of locks
• Happens-before relation for non-lock synchronization (e.g. fork) [RaceTrack]
[Figure: Thread 1 performs Write(X) and Thread 2 performs Read(X); the accesses are synchronized either by Lock(Mx)/Unlock(Mx) in each thread or by Fork(Thread2)]
Lockset + kernel synchronization (interrupts, spinlocks) = KernelEraser
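The lockset idea can be sketched as follows: for each shared variable, keep the set of locks held on every access so far, and report a race once that set is empty for a variable touched by more than one thread with at least one write. This is a simplified illustration, not Eraser's full state machine and not the KernelEraser implementation; the class and method names are hypothetical. Kernel synchronization such as disabled interrupts could be modeled as a pseudo-lock passed to `lock`, which is an assumption rather than something the slides state.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class VarState:
    candidates: Optional[Set[str]] = None  # None until the first access
    threads: Set[str] = field(default_factory=set)
    written: bool = False


class Lockset:
    """Simplified lockset refinement over a stream of lock and access events."""

    def __init__(self) -> None:
        self.held: Dict[str, Set[str]] = {}   # thread -> locks currently held
        self.vars: Dict[str, VarState] = {}   # variable -> refinement state
        self.races: List[str] = []            # variables reported so far

    def lock(self, thread: str, name: str) -> None:
        self.held.setdefault(thread, set()).add(name)

    def unlock(self, thread: str, name: str) -> None:
        self.held.setdefault(thread, set()).discard(name)

    def access(self, thread: str, var: str, is_write: bool) -> None:
        state = self.vars.setdefault(var, VarState())
        held = set(self.held.get(thread, set()))
        # Refine: keep only the locks held on every access so far.
        state.candidates = held if state.candidates is None else state.candidates & held
        state.threads.add(thread)
        state.written = state.written or is_write
        shared = len(state.threads) > 1
        if shared and state.written and not state.candidates and var not in self.races:
            self.races.append(var)
```

Replaying the figure's scenario, where both threads access `X` under `Mx`, leaves the candidate set at `{Mx}` and reports nothing; an unlocked write and read of a second variable from two threads is reported.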
Network Driver Races Reported by KernelEraser
Simulated LBA environment
• Kernel version: Linux 2.6.17.1
• Drivers: tg3 & tulip
• Driver class: Network
• Bus: PCI
• Driver VM: 2 CPU
• Lifeguard VM: 1 CPU
Workload
• Load driver
• Enable Ethernet
• Transfer file over network
• Disable Ethernet
• Unload driver
Classification of Races
Driver   Serious   Benign   False alarm (net stack synch.)   False alarm (device synch.)   Total
tg3      2*        15       13                               1533                          1563
tulip    0         0        472                              451                           923
* Fixed in versions 2.6.18 & 2.6.21
False Alarms due to Unobserved Invariants
Synchronizations in upper layers of I/O stack
• Upper layers of network stack:
    Lock(rtnl_lock); driver->open(); Unlock(rtnl_lock);
    …
    Lock(rtnl_lock); driver->stop(); Unlock(rtnl_lock);
• tg3:
    open() { … tp->tg3_flags &= … … }
    stop() { … while (tp->tg3_flags & …) … }
Synchronizations due to device states
• tg3:
    probe() { … tp->tg3_flags |= … … }
    open() { … tp->tg3_flags &= … … }
• Device states: inactive → probe() → connected to PCI bus → open() → ready for pkt rx/tx
Preliminary Work Summary
Adapted Log Based Monitoring for Drivers
• Identify driver code region to log only driver execution
• VM separation to protect Lifeguard
Adapted Lockset (KernelEraser) to detect races in network drivers
• Found 2 known but serious data races in tg3
• False alarms due to external synchronizations
Eliminating False Alarms in KernelEraser
+ External synchronizations
• Network stack
  × Log network stack
  • Emulate interface invariants
    open() { Lock(rtnl_lock); … tp->tg3_flags &= … … Unlock(rtnl_lock); }
    stop() { Lock(rtnl_lock); … while (tp->tg3_flags & …) … Unlock(rtnl_lock); }
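Emulating the interface invariant can be sketched as adding the caller's lock to the lockset whenever the driver is entered through open() or stop(), instead of logging the network stack itself. The table `INTERFACE_LOCKS` and both function names are hypothetical; only the fact that `rtnl_lock` is held around `driver->open()` and `driver->stop()` comes from the slides.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

# Hypothetical table: driver entry point -> locks its caller is known to hold.
INTERFACE_LOCKS: Dict[str, Set[str]] = {
    "open": {"rtnl_lock"},
    "stop": {"rtnl_lock"},
}


def effective_lockset(entry_point: str, observed_locks: Set[str]) -> Set[str]:
    """Locks seen in the driver log plus the emulated locks held by the caller."""
    return set(observed_locks) | INTERFACE_LOCKS.get(entry_point, set())


def is_consistently_protected(accesses: Iterable[Tuple[str, Set[str]]]) -> bool:
    """accesses: (entry_point, observed_locks) pairs for one shared variable."""
    common: Optional[Set[str]] = None
    for entry_point, observed in accesses:
        locks = effective_lockset(entry_point, observed)
        common = locks if common is None else common & locks
    return bool(common)
```

With this table, accesses to `tp->tg3_flags` from open() and stop() share `rtnl_lock` even though the driver log never shows the lock being taken, so the lockset check no longer reports them.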
+ External synchronizations (continued)
• Device
  • Model finite state machine
    probe() { (INACTIVE) … tp->tg3_flags |= … … (CONNECTED TO BUS) }
    open() { (CONNECTED TO BUS) … tp->tg3_flags &= … … (READY FOR TX/RX) }
  • Device states: inactive → probe() → connected to PCI bus → open() → ready for pkt rx/tx
Driver   Serious   Benign   False alarm (net stack synch.)   False alarm (device synch.)   Total
tg3      2*        15       0                                0                             17
tulip    0         0        0                                0                             0
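The device state machine can be sketched as a transition table that tells the lifeguard which entry points can never overlap. The dictionary encoding and the function names are illustrative assumptions; the states and the probe()/open() transitions are the ones drawn on the slide, and only directly adjacent transitions are treated as ordered.

```python
from typing import Dict, Tuple

# Entry point -> (state required on entry, state reached on exit), as on the slide.
TRANSITIONS: Dict[str, Tuple[str, str]] = {
    "probe": ("inactive", "connected to pci bus"),
    "open": ("connected to pci bus", "ready for pkt rx/tx"),
}


def ordered_by_device_state(first: str, second: str) -> bool:
    """True if `second` can only start in the state that `first` produces."""
    if first not in TRANSITIONS or second not in TRANSITIONS:
        return False
    return TRANSITIONS[first][1] == TRANSITIONS[second][0]


def is_false_alarm(entry_a: str, entry_b: str) -> bool:
    """A reported race is a false alarm if the device FSM orders the two entry points."""
    return ordered_by_device_state(entry_a, entry_b) or ordered_by_device_state(entry_b, entry_a)
```

Under this model a reported race between probe() and open() on `tp->tg3_flags` is suppressed, while a pair involving an entry point the table does not cover is still reported.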
+ Other driver classes
• SCSI disk
• SOUND
• USB
• GRAPHICS
Lifeguards for Common Driver Faults [Ryzhyk09_Dingo]
• Interface violations
  • Device protocol
  • Kernel protocol
  • I/O stack protocol
• Concurrency faults
  • Data races
• Memory faults
  • Illegal memory access
  • Memory leaks
  • Uninitialized memory use
Scalability?
[Figure: user space above the system call boundary; below it, the upper layers of the network stack, the kernel resource managers, and the network driver]
Efficiency of Driver Lifeguards
Accelerating Lifeguard analysis
• Static analysis
• Dynamic optimizations
• Parallel Lifeguards
• Hardware accelerators
Reduce overhead of VM fault containment
• Hardware enforced fault isolation in same VM
Accelerating Driver Lifeguards
Reduce analysis workload
• Static analysis [XFI]
Run analysis faster
• Dynamic compiler optimizations [Qin06, Ruwase10]
• Parallel Lifeguards [Nightingale08, Ruwase08]
• Hardware accelerators [Vlachos10]
[Figure: the Driver and the Lifeguard each on their own OS, connected by a hardware log]
Avoid Overhead of VM Fault Containment
Hardware enforced fault isolation [Nooks, SUD]
• Issues to consider
  • Protection quality
  • Lifeguard using Driver (e.g. disk)
[Figure: user space and the Lifeguard above the system call boundary; below it, the upper layers of the network stack, the kernel resource managers, and the network driver]
Current and Future Work Summary
Detecting common driver faults
• Data races
• Memory
• Interface violations
Efficiency of Driver Lifeguards
• Accelerating Lifeguard analysis
• More efficient fault containment
Timeline
Thanks to members of the LBA Group for their contributions
• Shimin Chen
• Babak Falsafi
• Phillip Gibbons
• Michelle Goodstein
• Michael Kozuch
• Onur Mutlu
• Todd Mowry
• Gennady Pekhimenko
• Vivek Seshadri
• Theodoros Strigkos
• Evangelos Vlachos
Questions?