Verifying Dereference Safety via Expanding-Scope Analysis
Alexey Loginov (GrammaTech, Inc.)
Joint work with: E. Yahav, S. Chandra, S. Fink (IBM TJ Watson) N. Rinetzky (Tel-Aviv University) M.G. Nanda (IBM IRL)
Why Null-Dereference Analysis?
Common problem
…or symptom of other problems
› Null-dereference warning may help in identifying root cause
Relevant to all software
Specification is obvious (absence of NPE)
› Requires no user interaction
Why Sound Null-Dereference Analysis?
Safety guarantees are important in some domains
Results can become an in-code specification, e.g., via JSR 305
› Annotations can help with code understanding
› Annotations can simplify future analyses (e.g., after modifications)
Precise and efficient sound analysis is challenging
› Lessons carry over to other static analyses
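As an illustration of results becoming an in-code specification, here is a minimal sketch. The annotation `NonNullSketch` and the class `Account` are hypothetical names declared locally so the example compiles without extra libraries; a real project would use whichever non-null annotation its toolchain provides.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Objects;

// Hypothetical stand-in for a JSR 305-style annotation (an assumption,
// not the real JSR 305 type).
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.PARAMETER, ElementType.METHOD})
@interface NonNullSketch {}

class Account {
    // A verified fact recorded in code: this field is never null.
    @NonNullSketch
    private final String owner;

    Account(@NonNullSketch String owner) {
        // Runtime check that backs up the annotation at the boundary.
        this.owner = Objects.requireNonNull(owner, "owner");
    }

    @NonNullSketch
    String owner() {
        return owner;
    }

    int ownerLength() {
        // Safe dereference: it follows from the field annotation.
        return owner.length();
    }
}
```

A later analysis can start from the annotated field instead of re-deriving the fact, which is the simplification the slide refers to.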
Example (answers expected)
1. class A {
2.   final A a = new A();
3.   static main() {
4.     B b = new B();
5.     initB(b);
6.     a.foo(b);        // okay
7.   }
8.   foo(B b) {
9.     b.f.fun();       // okay
10.    b.f.f.gun();     // null-deref.
11.  }
12.  static initB(B b) {
13.    b.f = new F();   // okay
14.    b.f.f = null;    // okay
15.  }
16. }
Interprocedural information is often needed
– Allocations in callers (e.g., new B()) are common
– Allocations in callees (e.g., new F()) are common
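The slide code omits return types and the declarations of B and F. A runnable variant could look like the sketch below. The bodies of `B` and `F`, the `static` modifier on `a`, and the helper `runExample` are assumptions added only to make the example compile and observe the failure.

```java
class F {
    F f;                      // may be null
    void fun() {}
    void gun() {}
}

class B {
    F f;
}

class A {
    // Declared static here (an assumption) so that static code can use it.
    static final A a = new A();

    static void initB(B b) {
        b.f = new F();        // okay: b was allocated by the caller
        b.f.f = null;         // okay: b.f was just allocated
    }

    void foo(B b) {
        b.f.fun();            // okay: initB assigned b.f
        b.f.f.gun();          // null dereference: initB set b.f.f to null
    }

    // Plays the role of main; reports whether the dereference in foo failed.
    static boolean runExample() {
        B b = new B();
        initB(b);
        try {
            a.foo(b);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```

Deciding that only the last dereference is unsafe requires facts from both the caller (the allocation of `b`) and the callee (the writes in `initB`).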
Common approaches
Most existing tools perform intraprocedural analysis
Have to make assumptions about callers/callees
Option 1: pessimistic assumptions about callers/callees
› Result: a sea of false alarms
Results of pessimistic intraproc. analysis
1. class A {
2.   final A a = new A();
3.   static main() {
4.     B b = new B();
5.     initB(b);
6.     a.foo(b);        // null deref.
7.   }
8.   foo(B b) {
9.     b.f.fun();       // two null derefs.
10.    b.f.f.gun();     // null deref.
11.  }
12.  static initB(B b) {
13.    b.f = new F();   // null deref.
14.    b.f.f = null;    // okay
15.  }
16. }
Reports four false alarms
– Only real error is on line 10
Common approaches
Most existing tools perform intraprocedural analysis
Have to make assumptions about callers/callees
Option 2: optimistic assumptions about callers/callees
› Result: missing real errors (catching the most glaring ones)
Results of optimistic intraproc. analysis
1. class A {
2.   final A a = new A();
3.   static main() {
4.     B b = new B();
5.     initB(b);
6.     a.foo(b);        // okay
7.   }
8.   foo(B b) {
9.     b.f.fun();       // okay
10.    b.f.f.gun();     // okay
11.  }
12.  static initB(B b) {
13.    b.f = new F();   // okay
14.    b.f.f = null;    // okay
15.  }
16. }
Misses the real error on line 10
Common approaches
Most existing tools perform intraprocedural analysis
Have to make assumptions about callers/callees
Option 3: mostly optimistic assumptions
› Detects inconsistencies in programmer’s beliefs
• Test x == null: belief that x could be null before test
• Dereference of x without a test: belief that x cannot be null
› Allow analysis to dismiss assumptions contradicted by beliefs
› Result: missing real errors, reporting safe dereferences as unsafe
• Generally, few false alarms but many missed errors
• Same result as option 2 (optimistic assumptions) in our example
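To make the belief-based option concrete, the sketch below contrasts a method whose code holds two contradictory beliefs about `s` with one that holds a single belief. The class and method names are hypothetical, and the comments describe what a belief-based tool would typically report rather than the behavior of any specific tool.

```java
class BeliefExample {
    // Inconsistent beliefs: the test says s may be null,
    // while the later dereference says it cannot be.
    static int inconsistent(String s) {
        if (s == null) {
            System.err.println("no input");
        }
        return s.length();    // flagged: dereference after a null test
    }

    // Consistent belief: s is never tested, so a mostly optimistic
    // analysis assumes callers pass non-null and reports nothing here,
    // even though a caller may still pass null.
    static int consistent(String s) {
        return s.length();
    }
}
```

Both methods fail when called with null, but only the first contains the contradiction that this style of analysis looks for, which is why it misses errors such as the one in the second method.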
Prospects for interprocedural analysis
Whole-program analysis cannot scale to large software
› Majority of instructions are relevant to null-dereference analysis
• Can’t prune down program to a small relevant subset
Need mechanism to break down a program’s complexity
Expanding-Scope Analysis
Holy Grail
› Cost: INTRAprocedural analysis
› Precision: INTERprocedural (whole-program) analysis
Staged approach
› Analyze dereferences with limited interprocedural context
› Verify dereferences with the least amount of context
› Increase interprocedural context for harder cases
› In simplest form
• Start with local analysis (with pessimistic assumptions)
– Verify some dereferences without considering context
• Consider remaining dereferences with extra level of context
– Verify some dereferences within a call subtree of immediate callers
• …
› We refer to individual analyses as Limited-Scope Analyses
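The staged approach can be sketched as a simple driver loop. The interface `LimitedScopeAnalysis`, the class `ExpandingScopeDriver`, and the representation of dereferences as strings are hypothetical; the sketch only shows how unverified dereferences are carried to the next, more expensive scope.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical interface for one limited-scope analysis run.
interface LimitedScopeAnalysis {
    // Returns true if the dereference is proven safe using
    // at most `scope` levels of calling context.
    boolean verify(String dereference, int scope);
}

class ExpandingScopeDriver {
    // Returns the dereferences still unverified after the last stage.
    static Set<String> run(List<String> dereferences,
                           LimitedScopeAnalysis analysis,
                           int scopeLimit) {
        Set<String> remaining = new LinkedHashSet<>(dereferences);
        for (int scope = 0; scope <= scopeLimit && !remaining.isEmpty(); scope++) {
            final int currentScope = scope;
            // Only the harder cases survive to the next, costlier stage.
            remaining.removeIf(d -> analysis.verify(d, currentScope));
        }
        return remaining;
    }
}
```

Whatever remains after the final scope limit is reported as a warning. The loop stops early once every dereference has been verified.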
Expanding-Scope Analysis
[Figure: diagram of methods labeled f around a dereference “… f.foo() …”]
Expanding-Scope Analysis
[Figure: call graph of the example program]
main: B b = new B(); initB(b); a.foo(b);
initB: b.f = new F(); b.f.f = null
foo: b.f.fun(); b.f.f.gun();
Abstract Domain
Product of three abstract domains
1. Abstract domain for may-alias analysis
• Implementation: flow- & context-insensitive Andersen-style
2. Abstract domain for must-alias analysis
• Implementation: demand-driven (based on def-use chains)
3. Set APnn of non-null access paths
• Access paths denote l-value expressions:
– (VarId | StaticFieldId).InstanceFieldId*
• Finiteness of domain guaranteed by (parameterized) bounds on
– Size of APnn
– Maximal length of access paths in APnn
› Only the final component (set of non-null access paths APnn) changes
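A bounded set of non-null access paths could be represented as in the sketch below. The class names `AccessPath` and `NonNullPaths` are hypothetical, and two choices are assumptions: path length is measured as the number of instance fields, and a path that exceeds a bound is simply dropped.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// A root (variable or static field) followed by zero or more instance fields.
final class AccessPath {
    final String root;
    final List<String> fields;

    AccessPath(String root, String... fields) {
        this.root = root;
        this.fields = Arrays.asList(fields.clone());
    }

    @Override public boolean equals(Object o) {
        return o instanceof AccessPath
            && root.equals(((AccessPath) o).root)
            && fields.equals(((AccessPath) o).fields);
    }

    @Override public int hashCode() {
        return 31 * root.hashCode() + fields.hashCode();
    }
}

// The set APnn, kept finite by two parameters.
final class NonNullPaths {
    private final int maxSize;     // bound on the number of paths
    private final int maxLength;   // bound on fields per path (assumed measure)
    private final Set<AccessPath> paths = new LinkedHashSet<>();

    NonNullPaths(int maxSize, int maxLength) {
        this.maxSize = maxSize;
        this.maxLength = maxLength;
    }

    // Returns true if the path is recorded as non-null after the call.
    // Dropping a path is conservative: an absent path is treated as maybe-null.
    boolean add(AccessPath p) {
        if (paths.contains(p)) return true;
        if (p.fields.size() > maxLength || paths.size() >= maxSize) return false;
        paths.add(p);
        return true;
    }

    boolean isNonNull(AccessPath p) {
        return paths.contains(p);
    }
}
```

Because omitting a path only loses precision, tightening either bound trades verified dereferences for a cheaper analysis without affecting soundness.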
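Transfer Functions (statements)
Let $\Gamma$ = InstanceFieldId* (sequences of instance fields)
Statement                              Transfer function
v = null                               $AP_{nn} \setminus \{ v.\gamma \mid \gamma \in \Gamma \}$
v = new T()                            $AP_{nn} \cup \{ v \}$
v = w                                  $AP_{nn} \cup \{ v.\gamma \mid w.\gamma \in AP_{nn} \}$
v = w.f                                $AP_{nn} \cup \{ v.\gamma \mid w.f.\gamma \in AP_{nn} \} \cup mustAlias(w)$
v.f = null                             $AP_{nn} \setminus \{ e'.f.\gamma \mid e' \in mayAlias(v),\ \gamma \in \Gamma \} \cup mustAlias(v)$
v.f = w                                $AP_{nn} \cup \{ e'.f.\gamma \mid w.\gamma \in AP_{nn},\ e' \in mustAlias(v) \} \cup mustAlias(v)$
…v.foo()…, …v[i]…, …v.length…          $AP_{nn} \cup mustAlias(v)$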
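A few of the statement transfer functions can be sketched over a set of strings, with each access path written as dot-separated text such as `b.f.f`. The class `StatementTransfer` and its method names are hypothetical. The alias sets are passed in by the caller, and the sketch assumes the may-alias set of v contains v itself.

```java
import java.util.HashSet;
import java.util.Set;

class StatementTransfer {
    // True if path is `base` itself or `base` followed by instance fields.
    static boolean extendsPath(String path, String base) {
        return path.equals(base) || path.startsWith(base + ".");
    }

    // v = null: drop v and every path rooted at v.
    static Set<String> assignNull(Set<String> ap, String v) {
        Set<String> out = new HashSet<>(ap);
        out.removeIf(p -> extendsPath(p, v));
        return out;
    }

    // v = new T(): v is non-null afterwards.
    static Set<String> assignNew(Set<String> ap, String v) {
        Set<String> out = new HashSet<>(ap);
        out.add(v);
        return out;
    }

    // v = w: every non-null path through w is also non-null through v.
    static Set<String> copy(Set<String> ap, String v, String w) {
        Set<String> out = new HashSet<>(ap);
        for (String p : ap) {
            if (extendsPath(p, w)) out.add(v + p.substring(w.length()));
        }
        return out;
    }

    // v.f = null: drop e.f and its extensions for every may-alias e of v.
    // The store dereferenced v, so the must-aliases of v are non-null.
    static Set<String> storeNull(Set<String> ap, String v, String f,
                                 Set<String> mayAliasV, Set<String> mustAliasV) {
        Set<String> out = new HashSet<>(ap);
        for (String e : mayAliasV) {
            out.removeIf(p -> extendsPath(p, e + "." + f));
        }
        out.addAll(mustAliasV);
        return out;
    }
}
```

Removal uses may-alias information and addition uses must-alias information, so each function errs toward a smaller set of non-null paths.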
Transfer Functions (conditions)
Condition          Transfer function on true branch                                                              Transfer function on false branch
v == null          $v \in AP_{nn} \;?\; \bot : AP_{nn}$                                                           $AP_{nn} \cup mustAlias(v)$
v instanceof T     $AP_{nn} \cup mustAlias(v)$                                                                    $AP_{nn}$
v == w             $AP_{nn} \cup (mustAlias(w) \text{ if } v \in AP_{nn}) \cup (mustAlias(v) \text{ if } w \in AP_{nn})$    $AP_{nn}$
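The condition transfer functions can be sketched in the same string-based style. The class `ConditionTransfer` is hypothetical, and the sketch assumes the true branch of `v == null` is treated as unreachable when v is already known to be non-null; an empty `Optional` stands for that unreachable state.

```java
import java.util.HashSet;
import java.util.Optional;
import java.util.Set;

class ConditionTransfer {
    // True branch of `v == null`.
    static Optional<Set<String>> nullTestTrue(Set<String> ap, String v) {
        if (ap.contains(v)) {
            return Optional.empty();      // assumed: branch cannot be taken
        }
        Set<String> out = new HashSet<>(ap);
        return Optional.of(out);
    }

    // False branch of `v == null`: the must-aliases of v are non-null.
    static Set<String> nullTestFalse(Set<String> ap, Set<String> mustAliasV) {
        Set<String> out = new HashSet<>(ap);
        out.addAll(mustAliasV);
        return out;
    }

    // True branch of `v == w`: if one side is known non-null, so is the other.
    static Set<String> equalsTrue(Set<String> ap,
                                  String v, Set<String> mustAliasV,
                                  String w, Set<String> mustAliasW) {
        Set<String> out = new HashSet<>(ap);
        if (ap.contains(v)) out.addAll(mustAliasW);
        if (ap.contains(w)) out.addAll(mustAliasV);
        return out;
    }
}
```

The false branch of `v == w` and of `v instanceof T` leave the set unchanged, so they need no code.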
Staged Analysis in SALSA (Scalable Analysis via Lazy Scope expAnsion)
Real OO applications (e.g., web applications) have wide call graphs
› High scope limits are too expensive to analyze
New stages help stave off the need for high scope limits
1. Pruning
• Verifies dereferences of (non-null) final and stationary fields
2. Special local (scope-0) analyses
a. Caller-guarantee analysis (top-down in call graph)
– Propagates callers’ guarantees to callees
– E.g., for references passed as arguments down deep call chains
b. Callee-guarantee analysis (bottom-up in call graph)
– Propagates callees’ guarantees up to callers
– E.g., for field initializations in deep initialization call chains
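The idea behind the caller-guarantee stage can be sketched as a check over call sites: a parameter is treated as non-null inside a method only if every known call site passes a value already proven non-null. The class `CallerGuarantee` and its data layout are hypothetical simplifications and do not describe how SALSA stores this information.

```java
import java.util.List;
import java.util.Map;

class CallerGuarantee {
    // callSites maps a method name to one flag per call site, telling
    // whether the argument was proven non-null at that site.
    static boolean parameterNonNull(String method,
                                    Map<String, List<Boolean>> callSites) {
        List<Boolean> sites = callSites.get(method);
        // No known callers: no guarantee can be assumed.
        if (sites == null || sites.isEmpty()) {
            return false;
        }
        for (boolean nonNullAtSite : sites) {
            if (!nonNullAtSite) {
                return false;
            }
        }
        return true;
    }
}
```

Applied top-down, a guarantee established for one method can serve as the call-site fact for the methods it calls, which is how facts travel down deep call chains while each step stays local.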
Staged Analysis in SALSA (Scalable Analysis via Lazy Scope expAnsion)
[Figure: pipeline of stages]
pruning
caller-guarantee
callee-guarantee
scope-1 (subtrees of depth 1 from parents)
scope-2 (subtrees of depth 2 from grandparents)
…
symbolic (splits remaining warnings into high priority and low priority)
Steps of staged interproc. analysis
[Animated slide: the example program, with its statements revealed stage by stage]
1. class A {
2.   final A a = new A();
3.   static main() {
4.     B b = new B();
5.     initB(b);
6.     a.foo(b);
7.   }
8.   foo(B b) {
9.     b.f.fun();
10.    b.f.f.gun();
11.  }
12.  static initB(B b) {
13.    b.f = new F();
14.    b.f.f = null;
15.  }
16. }
Pruning (final & stationary fields)
Limited-scope analysis
1. Scope-0 (local analysis)
2. Scope-1 analysis
Stage list shown in a later animation step:
1. Caller-guarantee (local) analysis
2. Callee-guarantee (local) analysis
3. Scope-1 analysis
Facts shown on the slide: b.f ∈ APnn, b ∈ APnn, b ∈ APnn
Experimental results
21 (mostly open-source) applications
› ~3K-465K bytecodes; ~300-37K dereferences
Avg: ~90% of dereferences verified soundly and automatically
› ~8% dismissed by Pruning
› ~77% dismissed by caller-guarantee analysis
› ~5% dismissed by remaining stages
Final scope limit: between 2 and 5 (chosen heuristically)
› Diminishing returns after local analyses (caller-/callee-guarantee)
› Higher scope limits useful in the absence of caller/callee guarantees
Max. access-path length: 2 for all but four applications
› Higher access-path lengths had no effect for most applications
› Helped C-like applications (direct field dereferences without getters)
Experimental results
Expected many false alarms due to simple abstract domain
Implemented heuristic symbolic path-validity checking
› This phase selected ~20% as high-priority warnings
› Surprisingly low incidence of false alarms due to path-correlation
Biggest domain shortcoming: not tracking access-path types
› Causes unnecessarily high cost of verifying certain dereferences
• Includes too many irrelevant code portions when verifying a dereference
› Produces false alarms due to examining type-infeasible paths
Results are encouraging for the simplicity of the domain
Tool-User Interaction
The output includes suggested annotations
› Ordered by the number of warnings guaranteed to be dismissed
• Actual number would require an alternate abstract domain
› Current annotation options
• Field f is non-null
• Parameter p or return value of method foo() is non-null
User may choose to accept some annotations
› We studied annotations for 8 benchmarks with high warning counts
› A few hours’ effort for unfamiliar code
• Result: 30% decrease in warning counts
Summary
Novel expanding-scope analysis
› Applicable to multiple abstract domains
Scalable and precise null-dereference analysis
› Staged analysis makes a simple abstract domain effective
Vision: improve programs’ specifications and robustness
› Cleanse programs by examining warnings and suggested annotations
› Check accepted annotations with assertions or symbolic techniques
› Extend the program’s specification and analyzability via annotations