1
Chains of Evidence
(Thesis Proposal)
Tim Halloran
William L. Scherlis (advisor)
James D. Herbsleb
Mary Shaw
Joshua J. Bloch, Sun Microsystems Inc.
A Programmer-Oriented Approach to Assurance of Mechanical Program Properties
2
A thesis proposal should:
• Explain the basic ideas of the thesis topic
• Argue why the topic is interesting
  - I.e., scientific value and engineering impact
• State what kinds of results are expected
• Argue that these results are obtainable within a reasonable amount of time
• Demonstrate the student’s personal qualifications for doing the proposed work
3
A “bug” description
I got a NullPointerException in CallStackRootNode.CallStackChildren.changeChildren() where the CallStackProducer returned a null Location[]. Now in this case my code is incomplete but it seems to me that there is a case for the producer not being able to furnish a stack, or the filter filtering it all out in which case some placeholder Location[] needs to be created and displayed.
NetBeans Bug Report #31423
4
The “answer”
I do not understand, you would prefer to return null rather than new Location[0]? There is something like a convention here that we prefer to not return null values from functions—its more “safe”
What is the problem here?
NetBeans Bug Report #31423
5
Loss of design intent
• People leave and join software teams
• Documents become out of date and inconsistent
• Models are missing
• Source code becomes the only authoritative system artifact
• Maintainability suffers because code does not reveal all the design intent behind it
• Quality suffers because programmers make mistakes complying with tacit or informally expressed design intent
6
What models are missing?
• Low-level models of design intent about “mechanical” program properties not expressible in the language
• Focus on bureaucratic aspects of a program
  - E.g., concurrency policy, exception policy, mutability policy, type use policy, static program structure
• Rather than functional ones
  - E.g., correctly sorting a data structure or correct computation of a value
We hypothesize that expression and assurance of mechanical program properties can provide great value
/** @typerecommendation Collection, List */
public class ArrayList extends...
7
Reasons for this problem
Missing capability in today’s languages, models, tools, and processes to
• Express and capture intent
• Assure our implementations are faithful to that intent
Worse, we don’t know how to keep intent consistent with as-built reality of a system as both evolve.
My research addresses both these problems
8
NetBeans “bug” example
Capture the intent that the getCallStack() method should never return a null Location[]
Annotate the interface as follows:
package org.netbeans.modules.debugger;

public interface CallStackProducer extends CallStackRoot {
    ...
    public /*@not-null*/ Location[] getCallStack();
    ...
}
Programmers might overlook the annotation or not be confident they followed it. To address this problem, our approach uses a tool to statically assure consistency
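The intent captured by the annotation can be illustrated with a minimal Java sketch. The types below are simplified stand-ins for the NetBeans types named on the slide, not the real API. EmptyStackProducer and NotNullCheck are hypothetical names introduced only for illustration, and the runtime check is merely a stand-in for the static assurance the approach proposes.

```java
import java.util.Objects;

// Simplified stand-ins for the NetBeans types named on the slide; not the real API.
final class Location {
    final String method;
    Location(String method) { this.method = method; }
}

interface CallStackProducer {
    /*@not-null*/ Location[] getCallStack();
}

// Hypothetical producer with no stack to offer: it returns an empty array, never null.
final class EmptyStackProducer implements CallStackProducer {
    @Override
    public Location[] getCallStack() {
        return new Location[0];
    }
}

// Hypothetical runtime stand-in for the static check: fails fast if the intent is violated.
final class NotNullCheck {
    static Location[] checkedCallStack(CallStackProducer producer) {
        return Objects.requireNonNull(producer.getCallStack(),
                "getCallStack() must not return null");
    }
}

class NotNullDemo {
    public static void main(String[] args) {
        Location[] stack = NotNullCheck.checkedCallStack(new EmptyStackProducer());
        System.out.println("frames: " + stack.length);
    }
}
```

A producer that returns null, like the one described in the bug report, would be rejected at the call to checkedCallStack; the static tool aims to report the same inconsistency without running the program.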
9
NetBeans “bug” example
We can go further and annotate that filterCallStack() (within the CompactCallStackFilter class) should not raise NullPointerException

/** @never-throws java.lang.NullPointerException */
public Location[] filterCallStack(/*@not-null*/ CallStackProducer producer) {
    ...
    Location[] stack = producer.getCallStack();
    ...
    int i, k = stack.length;
    ...
}
Now our programmer who reported NetBeans bug #31423 and implemented getCallStack() to return null could be informed that two models of design intent are violated
10
NetBeans “bug” example – steps to assurance
1. Evidence filterCallStack() does not raise NullPointerException assuming the @not-null annotations are valid
2. Evidence that calls to the filterCallStack() method will never pass null in the producer parameter
3. Evidence that all implementations of getCallStack() never return null
Evidence that each step is valid, given its assumptions, is gathered by semantics-based program analysis
Each individual piece of evidence is useful to the programmer…but they can be linked
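How the three steps link into a chain can be sketched as a small data structure. This is only an illustration under stated assumptions: Link, EvidenceChain, and openIssues are hypothetical names, and each link's verdict is supplied by hand where the approach would obtain it from semantics-based program analysis.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// One "link": a claim that an analysis checked, valid only under its listed assumptions.
final class Link {
    final String claim;
    final boolean verified;      // analysis verdict; supplied by hand in this sketch
    final List<String> assumes;  // claims this link relies on

    Link(String claim, boolean verified, String... assumes) {
        this.claim = claim;
        this.verified = verified;
        this.assumes = Arrays.asList(assumes);
    }
}

final class EvidenceChain {
    // Returns the gaps that keep the links from forming a complete chain.
    static List<String> openIssues(List<Link> links) {
        Set<String> established = new HashSet<>();
        for (Link link : links) {
            if (link.verified) established.add(link.claim);
        }
        List<String> issues = new ArrayList<>();
        for (Link link : links) {
            if (!link.verified) issues.add("not verified: " + link.claim);
            for (String assumption : link.assumes) {
                if (!established.contains(assumption)) {
                    issues.add("unsupported assumption: " + assumption);
                }
            }
        }
        return issues;
    }

    // The three steps from the slide; the verdict for step 3 is a parameter.
    static List<Link> netBeansExample(boolean producersNeverReturnNull) {
        String step2 = "callers never pass null as producer";
        String step3 = "getCallStack() never returns null";
        return Arrays.asList(
            new Link("filterCallStack() never raises NullPointerException", true, step2, step3),
            new Link(step2, true),
            new Link(step3, producersNeverReturnNull));
    }
}

class EvidenceChainDemo {
    public static void main(String[] args) {
        System.out.println(EvidenceChain.openIssues(EvidenceChain.netBeansExample(true)));
        System.out.println(EvidenceChain.openIssues(EvidenceChain.netBeansExample(false)));
    }
}
```

When step 3 fails, as in the bug report, the sketch reports two gaps: the failed link itself and the assumption of step 1 that it no longer supports.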
11
NetBeans “bug” example – key points
Each step becomes a “link” of evidence we are able to “chain” together to give us program assurance.
Our annotations capture design intent and serve as cut-points for program analysis
The program properties in the example are confusing to the programmer because they are non-local
12
Adoption in practice
• Consistency management
  - Stepwise approach to consistency
  - Support real-world inconsistencies
• Avoiding programming language change
  - “Rising tide of abstraction”
  - Support extra-language assurance
• User experience
  - Different from a compiler
• Assurance selection: Where can we help?
24 October 2002 Post on Apache Jakarta General Mailing List:
Most of the automated code metrics I read complain about things like “duh its an API of course its an unused class” - “or duh it a development utility or test case which isn’t MEANT to be flexible”
A Follow-up:
Exactly! Stuff like “This class is unused” - no, it’s just specified in a properties file somewhere and the static analysis is not picking that up! A couple of false positives like that and people start ignoring the tool. At least I do.
13
Research goals
Effective capture of implementation-level design decisions, incrementality, and tool supported consistency management
Assurance of properties not addressed by widely used programming languages
Design of an effective user experience for extra-language assurance
Understanding defects in widely deployed open source Java projects to understand where we can have the largest impact
14
Outline
• Introduction
  - Loss of design intent
  - NetBeans “bug” example
  - Adoption in practice
  - Research goals
  - Chains of evidence
• Thesis Statement
• Approach
• Hypotheses
• Preliminary Work
• Validation
• Schedule
• Expected Contribution
15
Chains of evidence
Proofs that a software system satisfies the theorem that programmer-expressed models of design intent are consistent with source code
• Models constructed from annotations within code and other documentation and focused on mechanical program properties
Assurance is formed by linking together “chains” forged from small “links” of evidence about the software system
16
Chains of evidence
Partial chains of evidence are essential: they enable focused engagement with the programmer to determine if
• The design intent is wrong
• The design intent is incomplete
• The source code is wrong
• The program analysis algorithms (due to limitations) have insufficient information to provide a result
17
Assurance spectrum of chains of evidence
[Figure: assurance techniques plotted by scalability of assurance technique (horizontal axis, - to +) against semantic “depth” of design intent assured (vertical axis, - to +). Points shown: Type Checking; Program Verification; Concurrency Policy (Greenhouse); Thread Coloring (Sutherland); Java Best Practice; Program Structure; Exception Policy; Mutability Policy; Type Use Policy. A region is labeled “Chains of Evidence – Assurance Focus (tractable)”.]
18
Thesis statement
Chains of evidence enable assurance of useful mechanical properties about programs with respect to explicit models of design intent, and the approach has the potential to be scalable and practical for working programmers to adopt
19
Key ideas
A set of representative and substantive assurances available as part of our prototype tool is necessary to show feasibility and flexibility of our approach
An effective architecture for chains of evidence is required to organize assurance results and scale up to large Java systems
An effective user experience is needed to elicit design intent from and communicate assurance results to programmers
20
Key ideas
A prototype tool set within the context of a Java IDE enables evaluation of the effectiveness of our approach
Selection of what design intent to model and how to assure it can be empirically informed through (formative) analyses of bug and quality practices and (evaluative) analysis and tool use
A business case analysis can show cost-effectiveness of our approach and assurances
21
Approach
• Develop an architecture, framework, tools, and user experience for chains of evidence*
• Develop specific assurances
• Conduct three empirical investigations
• Business case analysis
22
Assurance development
• Concurrency Policy *
• Mutability Policy
• API Protocol Policy
• NullPointerException Policy
• Alias Policy
• Types and Their Use *
• Program Structure

Research challenge to design, using state-of-the-art program analysis, substantive assurances along a representative set of points on our curve
[Figure: the assurance spectrum, plotting scalability of assurance technique (horizontal axis, - to +) against semantic “depth” of design intent assured (vertical axis, - to +). Points shown: Type Checking; Program Verification; Concurrency Policy (Greenhouse); Thread Coloring (Sutherland); Java Best Practice; Program Structure; Exception Policy; Mutability Policy; Type Use Policy. A region is labeled “Chains of Evidence – Assurance Focus (tractable)”.]
23
Empirical investigations
• Survey of open source Java bugs (39,463)
  - Understand: “Where help is needed most?”
  - 2 phases: bug selection and bug analysis
• Sophomore experiment
  - Hypothesis: “Violations of Java best practice correlate with software defects”
• Prototype use studies
  - Qualitative use studies of our prototype tool
  - Understand utility and practicality of chains of evidence
24
Business case analysis
Cost/Benefit Analysis (in the sense of Reifer) to evaluate the programmer time and effort required to provide and maintain design models as compared with the costs of using current techniques
• Done for each individual assurance (eases identification of state-of-the-practice techniques that address similar concerns)
25
Hypotheses
• Safe evolution of software systems can be carried out with less up-front effort using our incremental approach than in approaches that rely on full functional specification
  - Qualitative use studies of our prototype tool
• Bugs of a non-local character (e.g., concurrency) are more difficult for programmers to solve and have great significance to engineering success
  - Survey of open source Java bugs
26
Hypotheses
• Cut points are feasible to provide scalability for a wide range of important program analyses
  - Assurance development
  - Program analysis theory
• Similar techniques can be used for assurances of model compliance and assessment of Java best practice (in the sense of Bloch)
  - Architecture for chains of evidence
  - Prototype tool coupled with assurance development
27
Hypotheses
• Violations of Java best practice correlate with software defects (and overall bad software quality)
  - Sophomore experiment
• Model compliance is a cost-effective approach to improve software quality
  - Business case analysis
• Consistency management can be an independent function that is not coupled to program analysis
  - Architecture for chains of evidence
  - Consistency management (part of user experience)
28
Evidence of Feasibility: Preliminary Work
• Two preliminary assurance prototypes
  - “Models of Thumb”
  - Demonstration of lock policy assurance
• Preliminary Architecture
  - Third prototype
• Empirical investigations
  - Survey of open source quality practices
  - Preliminary survey of Java bugs
29
“Models of Thumb”
• Assurance that Java “rules of thumb” are followed
• Two cases investigated on 2 million SLOC corpus
  - Ignored exceptions
      try { ... } catch (Throwable t) { ; }
  - Overspecific variable declarations
      ArrayList results = new ArrayList();
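As a rough illustration of what flagging an ignored exception involves, the sketch below counts catch blocks whose body is empty. It is a naive textual stand-in and not the analysis used in the prototype: IgnoredExceptionScan and countEmptyCatchBlocks are hypothetical names, and a regular expression cannot see comments, nesting, or types the way a semantics-based analysis can.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Naive textual stand-in for an "ignored exception" check; hypothetical names throughout.
final class IgnoredExceptionScan {
    // Matches a catch clause whose body holds only whitespace and semicolons.
    private static final Pattern EMPTY_CATCH =
        Pattern.compile("catch\\s*\\([^)]*\\)\\s*\\{[\\s;]*\\}");

    // Counts the empty catch blocks found in the given source text.
    static int countEmptyCatchBlocks(String source) {
        Matcher matcher = EMPTY_CATCH.matcher(source);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }
}

class IgnoredExceptionScanDemo {
    public static void main(String[] args) {
        String ignored = "try { ...} catch (Throwable t) { ;}";
        String handled = "try { run(); } catch (Exception e) { log(e); }";
        System.out.println(IgnoredExceptionScan.countEmptyCatchBlocks(ignored));
        System.out.println(IgnoredExceptionScan.countEmptyCatchBlocks(handled));
    }
}
```

A check this shallow cannot tell a deliberate, documented decision to ignore an exception from an oversight, which is the kind of distinction explicit design intent is meant to supply.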
36
User Experience
Early prototype reported the following:
• Mimicking compiler error message reporting
  - Not effective for extra-language assurance
  - Negative focus, no rationale, no next step

Extension.java [line 297] change
  FROM: ArrayList results = new ArrayList();
  TO:   List results = new ArrayList();
  WHY:  Use most abstract interface possible
Research challenge to design an effective user experience for extra-language assurance
39
Empirical Results
                  | Overspecific Variable Declarations     | Ignored Exceptions
                  | Variable Decl.        Violations Found |    catch Block        Violations Found
Name        kSLOC |       Uses (u)        #    %u   /kSLOC |       Uses (u)        #    %u   /kSLOC
Ant            64 |         13,953      434     3      6.7 |            916      163    18      2.5
Tomcat         66 |         13,970      485     3      7.3 |            964      230    24      3.5
J2SDK 1.4     508 |        116,397    3,650     3      7.2 |          3,239      686    21      1.4
NetBeans      571 |         99,201    5,851     6     10.2 |          5,085    1,048    21      1.8
Eclipse       792 |        178,872    8,325     5     10.5 |          6,511    1,110    17      1.4
Subtotal:   2,001 |        422,393   18,745     4      9.4 |         16,715    3,237    19      1.6
Whiteboard     38 |          6,823    1,205    18     28.0 |            199       40    20      1.4
Total:      2,039 |        429,216   19,950     5      9.8 |         16,914    3,257    19      1.6
40
Ignored Exceptions: Why?
                         | Ignored Exceptions
             catch Block | Total (t)      | Commented
Name            Uses (u) |        #    %u |        #    %t
Ant                  916 |      213    23 |       59    28
Tomcat               964 |      248    26 |       66    27
J2SDK              3,239 |      744    23 |      291    39
NetBeans           5,085 |    1,241    24 |      443    36
Eclipse            6,511 |    1,275    20 |      440    35
Tomcat: Sample of 50

Ignored exception                       #     %
Unfinished exception handling           1     2
Catch of an overly-broad exception      5    10
Unsure [comment or log]                 8    16
Default-try-catch [comment]             9    18
Thread: InterruptedException            8    16
IO: IOException [close()]               7    14
Test code [wrapping test]               3     6
OK, well commented [not formal]         9    18
We sampled 50 ignored exceptions from Tomcat and Eclipse and found roughly 90% are false positives (program correctness only)
- Explicit design intent needed
41
Greenhouse Concurrency Assurance
Assurance obtained
• All accesses to shared fields are protected with the correct lock
• All lock preconditions are satisfied for method calls that require callers to hold locks
• Constructor does not allow references to escape (i.e., avoiding leakage)
- Complex design intent models
- What is the next step?
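The three assured properties can be made concrete with a small Java class written to satisfy them. This is a hand-written sketch, not output of the assurance tool: Counter and CounterDemo are hypothetical names, and the comments stating the lock policy are illustrative wording rather than the actual annotation syntax.

```java
// Hypothetical lock policy, stated in a comment: field "count" is protected by "lock".
final class Counter {
    private final Object lock = new Object();
    private int count; // shared field: every access below holds `lock`

    Counter() {
        // The constructor neither publishes `this` nor starts threads, so no reference escapes.
    }

    void increment() {
        synchronized (lock) {
            count++;
        }
    }

    int get() {
        synchronized (lock) {
            return count;
        }
    }

    // Lock precondition (hypothetical wording): the caller must hold `lock`.
    private void resetLocked() {
        count = 0;
    }

    void reset() {
        synchronized (lock) {
            resetLocked(); // precondition satisfied: `lock` is held here
        }
    }
}

class CounterDemo {
    // Runs several threads against one Counter and returns the final count.
    static int runThreads(int threads, int incrementsPerThread) {
        Counter counter = new Counter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(4, 1000));
    }
}
```

Each property corresponds to something a programmer would otherwise have to check by reading every access site, which is why the slides treat concurrency policy as a non-local property.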
43
Prototype problems
• Difficult to understand the network of analyses that make up an assurance
• Difficult to reuse portions of an assurance for another assurance
• No separation between data used to calculate results and actual results
• No benefit from building up assurances from smaller assurances
• No standard approach to communicate results to higher-level analyses
• No standard approach to communicate results to the user interface
• No standard ability to maintain assurance as the software or the design intent model is being changed by a programmer within the IDE (i.e., truth maintenance)
Diagnosis: Our architecture is wrong
44
Toward an Architecture for Chains of Evidence
Preliminary architecture for chains of evidence is based upon:
• A categorized blackboard
• A truth maintenance system
• A network of program analysis components

[Figure: the “Sea Blackboard”, with categorized entries: Region design intent; Lock policy design intent; Lock policy assurance; Thread coloring design intent; Thread coloring assurance; Ignored exception design intent; Ignored exception assurance. Example entries: “OK to ignore InterruptedException within fluid.ex.*”; “OK: Ignored InterruptedException on line 13 of Foo.java”; “ISSUE: Ignored IOException on line 56 of Bar.java”.]
Developed a feasibility prototype
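The combination of a categorized blackboard with truth maintenance can be sketched in a few lines of Java. This is a minimal illustration and makes no claim about the prototype's actual design: Blackboard, post, retract, and facts are hypothetical names, and a real system would re-run the affected analyses rather than only dropping stale results.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical categorized blackboard with a very small truth maintenance step.
final class Blackboard {
    private final Map<String, Set<String>> factsByCategory = new LinkedHashMap<>();
    private final Map<String, String> categoryOf = new HashMap<>();
    private final Map<String, Set<String>> dependents = new HashMap<>();

    // Posts a fact under a category and records which existing facts support it.
    void post(String category, String fact, String... supports) {
        factsByCategory.computeIfAbsent(category, k -> new LinkedHashSet<>()).add(fact);
        categoryOf.put(fact, category);
        for (String support : supports) {
            dependents.computeIfAbsent(support, k -> new LinkedHashSet<>()).add(fact);
        }
    }

    // Retracts a fact together with every fact that was derived from it.
    void retract(String fact) {
        String category = categoryOf.remove(fact);
        if (category == null) {
            return; // unknown or already retracted
        }
        factsByCategory.get(category).remove(fact);
        Set<String> derived = dependents.remove(fact);
        if (derived != null) {
            for (String d : new ArrayList<>(derived)) {
                retract(d);
            }
        }
    }

    Set<String> facts(String category) {
        return Collections.unmodifiableSet(
            factsByCategory.getOrDefault(category, Collections.emptySet()));
    }
}

class BlackboardDemo {
    public static void main(String[] args) {
        Blackboard board = new Blackboard();
        String intent = "OK to ignore InterruptedException within fluid.ex.*";
        board.post("design intent", intent);
        board.post("assurance", "OK: Ignored InterruptedException on line 13 of Foo.java", intent);
        board.post("assurance", "ISSUE: Ignored IOException on line 56 of Bar.java");
        board.retract(intent); // the programmer withdraws the design intent
        System.out.println(board.facts("assurance"));
    }
}
```

Recording which facts support each result is what lets the board withdraw the “OK” result when its supporting design intent is removed, while the unrelated “ISSUE” result stays in place.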
45
Toward an Architecture for Chains of Evidence
Preliminary use has found this design:
• Enhances a programmer’s ability to understand and react to tool results
• Allows a separation of analysis results and design intent model information
• Provides efficient maintenance of assurance as models and program code evolve
Research challenge to design user experience, evaluate (and enhance) scalability and flexibility
46
Validation
• Prototype Tool Capabilities
• Assurance Soundness
• Empirical evidence of adoptability & utility
  - Bug survey, and Prototype use studies
• Cost-Effectiveness
  - Business case analysis
Chains of evidence enable assurance of useful mechanical properties about programs with respect to explicit models of design intent, and the approach has the potential to be scalable and practical for working programmers to adopt
47
Schedule

Date       Milestone Tasks
Jul 2003   Architecture completed and documented
           Prototype using updated architecture
           Representative program assurances designed
           Refine automatic selection for Java bug survey
Aug 2003   Draft ICSE paper
           Complete sophomore experiment plan
Sep 2003   Complete Java bug survey
Dec 2003   Complete and document sophomore experiment
           Complete representative program assurances
Jan 2004   Begin prototype tool use studies
           Begin dissertation draft
Mar 2004   Dissertation draft completed
May 2004   Prototype tool use studies completed and documented
           Oral and written thesis defense
48
Expected Contributions
I expect to
• provide an effective architecture, framework, tools, and user experience for chains of evidence,
• demonstrate the usefulness of the framework for representative assurances,
• provide an empirically informed assessment of the potential for adoption, and
• qualitatively demonstrate cost effectiveness
49
Questions
51
My Proposal in One Slide
• Problem: Increasing source code quality and assurance thereof
• Idea: Chains of Evidence, a tool-supported method to assist programmers in expressing models of low-level design intent and assuring their consistency with code
• Preliminary Results:
  - Java best practice and concurrency policy prototypes
  - Architecture for managing chains of evidence
  - Survey of open source quality practice and Java bugs
• Approach (demonstrating potential for):
  - Develop a set of substantive assurances: feasibility & flexibility
  - Develop architecture: scalability
  - Design an effective user experience: adoption
  - Develop prototype tool in Java IDE: feasibility
  - Empirical investigation (bug survey, experiment): adoption/impact
  - Develop a business case analysis: practicability