Towards Optimization-Safe Systems: Analyzing the Impact of Undefined Behavior
Xi Wang, Nickolai Zeldovich, M. Frans Kaashoek, Armando Solar-Lezama (MIT CSAIL)
24th ACM SOSP (November 2013), Best Paper


A Seminar at Advanced Defense Lab

OUTLINE
  • Introduction
  • Model for Unstable Code
  • Design & Implementation
  • Evaluation

2013/11/26


INTRODUCTION
  • The specifications of C-family languages designate certain code fragments as having undefined behavior, giving compilers the freedom to generate instructions that behave in arbitrary ways in those cases.
  • Aimed at systems programming, the specifications choose to trust programmers and assume that their code will never invoke undefined behavior.


UNDEFINED BEHAVIOR IN C
  • Notation: p, q, p’: n-bit pointers; x, y: n-bit integers; a: array
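The table of constructs on this slide is not preserved. As a hedged illustration, three well-known undefined-behavior conditions in C can be written as predicates over integers x and y. The function names are hypothetical, and the bit-width computation assumes that `int` has no padding bits.

```c
#include <limits.h>
#include <stdbool.h>

/* Signed addition x + y is undefined in C when the result overflows. */
bool add_is_undefined(int x, int y) {
    return (y > 0 && x > INT_MAX - y) || (y < 0 && x < INT_MIN - y);
}

/* x / y and x % y are undefined when y == 0, and for INT_MIN / -1. */
bool div_is_undefined(int x, int y) {
    return y == 0 || (x == INT_MIN && y == -1);
}

/* x << y and x >> y are undefined when y is negative or at least the
 * bit width of the type (assumption: int has no padding bits). */
bool shift_is_undefined(int y) {
    return y < 0 || y >= (int)(sizeof(int) * CHAR_BIT);
}
```

Each predicate is itself free of undefined behavior, so it can guard the corresponding operation.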


COMPILER OPTIMIZATION
  • One way in which compilers exploit undefined behavior is to optimize a program under the assumption that the program NEVER invokes undefined behavior.
  • Consequence: original program ≠ optimized program
  • We call code that is unexpectedly discarded by such optimizations optimization-unstable code, or just unstable code for short.


UNSTABLE CODE EXAMPLE
  • Vulnerability Note VU#162289 (US-CERT) [link]
  => The compiler reasons that the check is always false.
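The code shown on this slide is not preserved. A minimal sketch of the kind of pointer wraparound check that the vulnerability note concerns, next to a stable rewrite, might look as follows. The function and parameter names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Unstable: forming buf + len beyond the object is undefined, so a
 * compiler may assume buf + len >= buf and delete the second check. */
bool len_ok_unstable(const char *buf, const char *buf_end, unsigned int len) {
    if (buf + len >= buf_end)
        return false;            /* len too large */
    if (buf + len < buf)
        return false;            /* wraparound check: may be optimized away */
    return true;
}

/* Stable: compare lengths instead of forming an out-of-range pointer.
 * Assumes buf <= buf_end and both point into the same object. */
bool len_ok_stable(const char *buf, const char *buf_end, unsigned int len) {
    return len < (size_t)(buf_end - buf);
}
```

The stable version never computes a pointer outside the buffer, so its result does not depend on how the optimizer treats pointer overflow.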


UNSTABLE CODE EXAMPLE (CONT.)
  • CVE-2009-1897 [link]
  • Linux Kernel 2.6.30 [LXR link]
  • The programmer put the check in the wrong position, but it can still work...
  => The compiler reasons that the check is always false.
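The kernel code itself is not preserved on this slide. The pattern is a null check placed after the pointer has already been dereferenced; a simplified sketch with hypothetical type and function names (not the kernel source):

```c
#include <stddef.h>

struct sock { int id; };
struct tun  { struct sock *sk; };

/* Unstable: tun is dereferenced before it is checked. Because the
 * dereference is undefined for a null pointer, a compiler may assume
 * tun != NULL and delete the check that follows. */
int poll_unstable(struct tun *tun) {
    struct sock *sk = tun->sk;   /* undefined if tun == NULL */
    if (!tun)                    /* may be optimized away */
        return -1;
    return sk ? sk->id : 0;
}

/* Stable: check first, dereference afterwards. */
int poll_stable(struct tun *tun) {
    if (!tun)
        return -1;
    struct sock *sk = tun->sk;
    return sk ? sk->id : 0;
}
```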


Is this the programmers’ fault?
  • Poor understanding of unstable code is a major obstacle to reasoning about system behavior.
  • However, these bugs are quite subtle, and understanding them requires detailed knowledge of the language specification.


Is this the compilers’ fault?
  • A story: GCC bug #30475 (2007/01/15) [link]
  • “This will create MAJOR SECURITY ISSUES in ALL MANNER OF CODE. I don’t care if your language lawyers tell you gcc is right. . . . FIX THIS! NOW!” (a GCC user)
  • “I am not joking, the C standard explicitly says signed integer overflow is undefined behavior. . . . GCC is not going to change.” (a GCC developer)
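The dispute is about overflow checks of the form `a + 100 > a` on signed integers. A minimal sketch of such a check and a rewrite that does not rely on overflow; the function names are hypothetical:

```c
#include <limits.h>
#include <stdbool.h>

/* Unstable: signed overflow is undefined, so a compiler may fold
 * a + 100 > a to true and the check disappears. */
bool no_overflow_unstable(int a) {
    return a + 100 > a;
}

/* Stable: test against the limit before adding. */
bool no_overflow_stable(int a) {
    return a <= INT_MAX - 100;
}
```

Both functions agree on every input for which `a + 100` does not overflow; only the stable one has a defined answer for the remaining inputs.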


UNSTABLE CODE TEST
  • The default optimization level for release builds is -O2.


MODEL FOR UNSTABLE CODE
  • C*: a C dialect that assigns well-defined semantics to code fragments that have undefined behavior in C.
  • P: program
  • e: expression or code fragment
  • P[e/e’]: replace e in program P with e’
  • Definition: Unstable code
    A code fragment e in program P is unstable w.r.t. language specifications C and C* iff there exists a fragment e’ such that P[e/e’] is legal under C but not under C*.
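As an illustration of the definition, take e to be `x + 1 < x` and e’ to be the constant false. A minimal sketch, assuming a C* that gives signed addition two's-complement wraparound (one possible choice of C*, similar in spirit to GCC's -fwrapv); the function names are hypothetical:

```c
#include <limits.h>
#include <stdbool.h>

/* x + 1 under the assumed C*: wraps from INT_MAX to INT_MIN. */
static int wrapping_add1(int x) {
    return x == INT_MAX ? INT_MIN : x + 1;
}

/* e evaluated under the assumed C*. */
bool e_under_cstar(int x) {
    return wrapping_add1(x) < x;
}

/* e' : the replacement a C optimizer may choose for e. */
bool e_prime(int x) {
    (void)x;
    return false;
}
```

Under C the only input that would distinguish e from e’ is x = INT_MAX, which invokes undefined behavior, so P[e/e’] is legal. Under this C* that input is well defined and e evaluates to true, so the replacement is not legal, and e is unstable.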


APPROACH FOR IDENTIFYING UNSTABLE CODE
  • Stack does this using a two-phase scheme:
    1. Run optimizer O without taking advantage of undefined behavior, which resembles optimizations under C*.
    2. Run optimizer O again, this time taking advantage of undefined behavior, which resembles (more aggressive) optimizations under C.
  • Code that is optimized in the second phase but not in the first is reported as unstable.


WELL-DEFINED PROGRAM ASSUMPTION
  • x: input
  • Re(x): reachability condition
    => under input x, will e be reached?
  • Ue(x) or UB: undefined behavior condition
    => under input x, will e exhibit undefined behavior in C?
  • Definition: Well-defined program assumption
    A code fragment e is well-defined on an input x iff executing e never triggers undefined behavior at e.
    A program P is well-defined on an input x iff every fragment of the program is well-defined on that input, denoted as Δ(x).
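The formulas for these two definitions are not preserved on the slide. A sketch of how they can be written with the symbols defined above, assuming the conjunction ranges over every fragment e of P:

```latex
% Fragment e is well-defined on input x:
% if x reaches e, then e does not exhibit undefined behavior.
\[ R_e(x) \rightarrow \neg U_e(x) \]

% Program P is well-defined on input x:
% every fragment of P is well-defined on x.
\[ \Delta(x) \;=\; \bigwedge_{e \in P} \bigl( R_e(x) \rightarrow \neg U_e(x) \bigr) \]
```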


ELIMINATING UNREACHABLE CODE
  • Theorem: Elimination
    In a well-defined program P, an optimizer can eliminate code fragment e if there is no input x that both reaches e and satisfies the well-defined program assumption Δ(x).
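The elimination query can be tried out on a toy example. The sketch below replaces the solver with brute-force enumeration over a hypothetical 8-bit signed integer x and asks whether the body of `if (x + 100 < x)` is reachable, first ignoring undefined behavior (as under C*) and then requiring a well-defined input (as under C). The 8-bit model and all names are assumptions made for illustration.

```c
#include <stdbool.h>

/* 8-bit two's-complement wraparound, emulating C* semantics. */
static int wrap8(int v) {
    while (v > 127) v -= 256;
    while (v < -128) v += 256;
    return v;
}

/* Reachability condition of the branch body guarded by x + 100 < x. */
static bool reaches(int x) { return wrap8(x + 100) < x; }

/* Undefined-behavior condition: the 8-bit signed addition overflows. */
static bool ub(int x) { return x + 100 > 127; }

/* Query without the well-defined program assumption. */
bool reachable_under_cstar(void) {
    for (int x = -128; x <= 127; x++)
        if (reaches(x)) return true;
    return false;
}

/* Query with the assumption: x must reach e and be well-defined. */
bool reachable_under_c(void) {
    for (int x = -128; x <= 127; x++)
        if (reaches(x) && !ub(x)) return true;
    return false;
}
```

The body is reachable only through overflowing inputs, so the first query finds a witness and the second does not: the fragment can be eliminated under C but not under C*, which is exactly the unstable case.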


SIMPLIFYING UNNECESSARY COMPUTATION
  • Theorem: Simplification
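The statement of the simplification theorem is not preserved on this slide. By analogy with the elimination theorem, a plausible formulation (an assumption, not a quotation) is that an optimizer can replace expression e with e’ when no well-defined input that reaches e makes the two evaluate differently:

```latex
% Requires amssymb for \nexists.
% Elimination, for comparison: no well-defined input reaches e.
\[ \nexists x :\; R_e(x) \wedge \Delta(x) \]

% Simplification (assumed form): no well-defined input that reaches e
% distinguishes e from e'.
\[ \nexists x :\; e(x) \neq e'(x) \;\wedge\; R_e(x) \;\wedge\; \Delta(x) \]
```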


SIMPLIFICATION ORACLE
  • Boolean oracle: propose true and false in turn for a boolean expression, enumerating possible values
  • Algebra oracle: propose to eliminate common terms on both sides of a comparison if one side is a subexpression of the other
    x + y < x => y < 0
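The algebra oracle's proposal `x + y < x => y < 0` can be checked on a toy model. The sketch below enumerates hypothetical 8-bit signed x and y instead of querying a solver, and compares the two expressions with and without the well-defined program assumption; the 8-bit model and all names are assumptions for illustration.

```c
#include <stdbool.h>

/* 8-bit two's-complement wraparound, emulating C* semantics. */
static int wrap8(int v) {
    while (v > 127) v -= 256;
    while (v < -128) v += 256;
    return v;
}

/* Undefined-behavior condition: the 8-bit signed sum is out of range. */
static bool ub(int x, int y) { return x + y > 127 || x + y < -128; }

/* Under C: only well-defined inputs count, so x + y is the exact sum. */
bool simplification_holds_under_c(void) {
    for (int x = -128; x <= 127; x++)
        for (int y = -128; y <= 127; y++)
            if (!ub(x, y) && ((x + y < x) != (y < 0)))
                return false;
    return true;
}

/* Under C*: every input counts and the sum wraps around. */
bool simplification_holds_under_cstar(void) {
    for (int x = -128; x <= 127; x++)
        for (int y = -128; y <= 127; y++)
            if ((wrap8(x + y) < x) != (y < 0))
                return false;
    return true;
}
```

The proposal is valid only when overflowing inputs are excluded (for example, x = 127 and y = 1 is a counterexample under wraparound), so the comparison is unstable.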


LIMITATION
  • An optimizer can also exploit the well-defined program assumption in forms other than elimination and simplification, which this model does not cover.


DESIGN & IMPLEMENTATION
  • Implemented with LLVM and the Boolector solver


COMPILER FRONTEND
  • To reduce false warnings, Stack ignores compiler-generated code (for example, code produced by macro expansion and function inlining) by tracking code origins, at the cost of missing possible bugs.


UB CONDITION INSERTION
  • Stack inserts a special function call into the IR at the corresponding instruction:
    void bug_on(bool expr)
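Stack performs this insertion on LLVM IR, where the call serves as a marker for the analysis. Purely as a source-level analogy, the sketch below shows where such calls would sit for a signed division. The counting body of `bug_on` and the helper names are hypothetical stand-ins so that the example can run.

```c
#include <limits.h>
#include <stdbool.h>

static int bug_hits = 0;

/* Hypothetical runtime stand-in: record that a UB condition held. */
void bug_on(bool expr) {
    if (expr) bug_hits++;
}

int bug_hit_count(void) { return bug_hits; }

/* Division annotated with its undefined-behavior conditions. */
int annotated_div(int x, int y) {
    bug_on(y == 0);                     /* division by zero */
    bug_on(x == INT_MIN && y == -1);    /* quotient overflows */
    if (y == 0 || (x == INT_MIN && y == -1))
        return 0;   /* guard so this sketch itself never executes UB */
    return x / y;
}
```

Each `bug_on` argument is the undefined-behavior condition of the instruction that follows it.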


SOLVER-BASED ALGORITHM
  • To implement these algorithms, Stack consults the Boolector solver to decide satisfiability for elimination and simplification queries.
  • However, it is practically infeasible to compute these queries precisely for large programs.
  • To address this challenge, Stack computes approximate queries by limiting the computation to a single function, using Tu and Padua’s algorithm.


EVALUATION
  • New bugs: 160 (July 2012 to March 2013)


ANALYSIS OF BUG REPORTS
  • Non-optimization bugs
  • Urgent optimization bugs
  • Time bombs
  • Redundant code (false alarm)


ANALYSIS OF BUG REPORTS (CONT.)
  • Non-optimization bugs
  • Example: PostgreSQL [link]
  • Time bomb!!


PRECISION
  • Kerberos: 11 warnings
    Developers accepted every patch
    False warning rate: 0/11
  • Postgres: Stack produced 68 warnings
    9 patches accepted
    29 patches in discussion: developers blamed compilers
    26 time bombs
    4 false warnings


PERFORMANCE
  • 64-bit Ubuntu (Linux)
  • Intel Core i7-980 3.3GHz
  • 24GB memory
  • Solver timeout: 5s


PREVALENCE OF UNSTABLE CODE
  • All packages in the Debian Wheezy archive: 17,432
  • Containing C/C++ code: 8,575
  • Containing unstable code: 3,471 (40%)
  • 150 CPU-days to analyze


COMPLETENESS
  • It is difficult to know precisely how much unstable code Stack would miss in general.
  • We analyze what kind of unstable code Stack misses.
  • A total of ten tests from real systems
  • Result: 7/10


Q & A
