MULTIVIEW
Checking System Rules Using System-Specific, Programmer-Written Compiler Extensions
Paper: Dawson Engler, Benjamin Chelf, Andy Chou, and Seth Hallem
Presentation: Emerson Murphy-Hill
Program Validation is Hard
• Theorem Proving / Model Checking
  – Difficult and costly; checks an abstraction rather than the code itself
• Testing
  – Limited coverage; you can’t test all execution paths
• Manual Inspection
  – Very expensive per line, erratic results
An Alternative Approach to Program Validation
• Static Analysis
  – No need for a separate model; the verified object is the code
  – Can build into the compiler, which traverses source anyway
• Authors present “Metalevel Compilation”
  – Compiler extensions for checking domain-specific knowledge
Previous Work
• Other static checkers – find specific problems; metalevel compilation finds general problems
• Validation – finds a few errors with a lot of work; metalevel compilation finds many errors without much work
• Extensible Compilation – a bit different…
• Aspect Oriented Programming
Essential Bits Relating to RCU
• We have properties we want to enforce
  – Read lock/unlocks should be paired
  – No blocking in a read-side critical section
  – Don’t hold global references outside of critical sections
  – Don’t dereference a global variable directly in a read-side critical section – always use rcu_dereference()
  – Free memory only after it is unlinked AND RCU has synchronized
  – (Is that all?)
Current State of Enforcing Properties
• Paul McKenney checks uses of RCU by hand, but:
  – There are > 1000 RCU uses in Linux now
  – The checks are done informally
  – The rules aren’t formalized
  – So this quickly gets unmanageable…
• How do we automate this?
Generic vs. Domain Specific Rules
• Generic Compiler Rules
  – Type Checking
  – Syntax Checking
  – Modularity Enforcement
• Domain Specific Rules
  – Rules for locking, interrupts
  – Functional rules (Herlihy)
• Can we code domain specific rules into the compiler?
The Process
• Specify an MC-Extension
• Compile it with their “metal” extension
• Code linked into xg++
• Run xg++ over desired source code
Increasing Efficiency
• Analysis can be flow sensitive or flow insensitive
• Code traversal is made more efficient by caching redundant paths (loops)
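The caching idea can be sketched as a walk over a control-flow graph that stops whenever it reaches a block in a checker state it has already seen there, so a loop is explored once per state rather than forever. This is only an illustration under assumptions: the graph encoding, the block names, and the `transfer` function are hypothetical and are not taken from the paper or from xg++.

```python
def check_paths(cfg, entry, start_state, transfer):
    """Walk a control-flow graph depth-first while tracking checker state.

    cfg: dict mapping a block name to a list of successor block names.
    transfer: function (block, state) -> state after executing the block.
    Returns the set of (block, incoming state) pairs that were explored.
    """
    seen = set()
    stack = [(entry, start_state)]
    while stack:
        block, state = stack.pop()
        if (block, state) in seen:
            continue  # same block, same state: this path adds nothing new
        seen.add((block, state))
        new_state = transfer(block, state)
        for succ in cfg.get(block, []):
            stack.append((succ, new_state))
    return seen


def irq_transfer(block, state):
    # Hypothetical checker state: are interrupts enabled or disabled?
    if block == "cli":
        return "disabled"
    if block == "sti":
        return "enabled"
    return state


# Hypothetical function body with a loop between cli and sti.
cfg = {
    "entry": ["cli"],
    "cli": ["loop"],
    "loop": ["loop", "sti"],  # self-edge models the loop back-edge
    "sti": ["exit"],
    "exit": [],
}
reached = check_paths(cfg, "entry", "enabled", irq_transfer)
```

The loop's back-edge revisits `("loop", "disabled")`, which is already in `seen`, so the traversal terminates after exploring five (block, state) pairs.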
Limitations
• A Checker, not a Verifier (e.g., aliases)
• Reported errors may not actually be errors (the authors didn’t build the checked code themselves)
• Had to modify g++ front-end to allow for more relaxed type system in GNU C
Another Example
• An Extension (not shown) checks for possibly false assertions, statically
• Importance:
  – Assertions are normally done dynamically
  – Failed assertions will crash system
  – Inspection wouldn’t find some of these errors
• Detected 5 system-crashing errors
More Examples
• One detects floating point use in the kernel (this is a problem?) – exposed one bug
• Another checks for stack overflow – found 10 large stack allocations. Led to patches
Checking copyin/copyout
• An extension looks for unchecked pointer uses
• Found 18 errors in Xok (w/15 false positives)
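One way to picture such a check is as simple taint tracking: pointers that come from user space are marked, the mark follows assignments, and any direct dereference of a marked pointer is reported. The sketch below assumes a toy statement encoding and hypothetical variable names; it does not reproduce the actual extension or the Xok code.

```python
def check_user_pointers(stmts, user_ptrs):
    """Flag direct dereferences of pointers that originate in user space.

    stmts: list of tuples, one of
        ("assign", dst, src)  -- dst = src
        ("deref", var)        -- *var used directly
        ("copyin", var) / ("copyout", var)  -- the safe access routines
    user_ptrs: names known to hold user-supplied pointers (assumed input).
    Returns a list of (statement index, variable) for each violation.
    """
    tainted = set(user_ptrs)
    errors = []
    for i, stmt in enumerate(stmts):
        op = stmt[0]
        if op == "assign":
            _, dst, src = stmt
            if src in tainted:
                tainted.add(dst)      # taint follows the assignment
            else:
                tainted.discard(dst)  # overwritten with a kernel value
        elif op == "deref":
            if stmt[1] in tainted:
                errors.append((i, stmt[1]))
        # copyin/copyout are the approved route, so they are not flagged
    return errors


# Hypothetical system call body: "arg" is a pointer passed in by the user.
stmts = [
    ("assign", "p", "arg"),
    ("copyin", "p"),
    ("deref", "p"),          # violation: user pointer used directly
    ("assign", "q", "kbuf"),
    ("deref", "q"),          # fine: q holds a kernel pointer
]
errors = check_user_pointers(stmts, {"arg"})
```

Because the sketch follows a single straight-line path and ignores aliasing, it would share the false-positive and false-negative behaviour the slides attribute to local checkers.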
Memory Allocation Checks
• One 60-line extension checks that memory is:
  – Checked before use
  – Not used after it has been freed
  – Not double freed
  – Always freed on error paths
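These four rules fit a small per-pointer state machine. The following is a minimal sketch that runs over one straight-line path of events, assuming a toy event encoding (`alloc`, `check`, `use`, `free`) and made-up variable names; the real extension works on compiler ASTs and is not shown in the slides.

```python
def check_memory(events):
    """Run a per-pointer state machine over one path of events.

    events: list of (op, var) with op in {"alloc", "check", "use", "free"}.
    Returns (errors, leaked): errors is a list of (index, var, message),
    leaked lists pointers never freed by the end of the path.
    """
    state = {}   # var -> "unchecked" | "ok" | "freed"
    errors = []
    for i, (op, var) in enumerate(events):
        current = state.get(var)
        if op == "alloc":
            state[var] = "unchecked"
        elif op == "check":
            if current == "unchecked":
                state[var] = "ok"
        elif op == "use":
            if current == "unchecked":
                errors.append((i, var, "used before NULL check"))
            elif current == "freed":
                errors.append((i, var, "used after free"))
        elif op == "free":
            if current == "freed":
                errors.append((i, var, "double free"))
            state[var] = "freed"
    leaked = sorted(v for v, s in state.items() if s != "freed")
    return errors, leaked


# Hypothetical error path that breaks every rule once.
events = [
    ("alloc", "p"),
    ("use", "p"),     # not yet checked against NULL
    ("check", "p"),
    ("free", "p"),
    ("use", "p"),     # already freed
    ("free", "p"),    # freed twice
    ("alloc", "q"),   # never freed on this path
]
errors, leaked = check_memory(events)
```

Treating the end of the path as the error return approximates the "always freed on error paths" rule; a fuller checker would run this machine along every path through the function.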
Enforcing Rules Globally
• So far, we have seen local checks
• There are global rules, though
• For example, transitive blocking calls can be calculated, then used…
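Computing the transitive set amounts to a fixed-point iteration over the call graph: a function can block if it is a known blocking primitive or if it calls anything that can block. The sketch below assumes the call graph is already available as a dictionary; the function names are illustrative only.

```python
def transitive_blockers(call_graph, primitives):
    """Return every function that may block, directly or through callees.

    call_graph: dict mapping a function name to the functions it calls.
    primitives: functions known to block directly (assumed input).
    """
    blocking = set(primitives)
    changed = True
    while changed:  # iterate until no new blocking function is found
        changed = False
        for caller, callees in call_graph.items():
            if caller not in blocking and any(c in blocking for c in callees):
                blocking.add(caller)
                changed = True
    return blocking


# Hypothetical call graph; "schedule" is taken as the blocking primitive.
call_graph = {
    "handler": ["read_file", "log"],
    "read_file": ["wait_for_io"],
    "wait_for_io": ["schedule"],
    "log": [],
}
blocking = transitive_blockers(call_graph, {"schedule"})
```

The resulting set can then be handed to a local checker, which no longer needs to look inside callees to know whether a call might block.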
Deadlock Avoidance
• One Extension checks:
  – No blocking when interrupts are disabled
  – No blocking when holding spin lock
• 123 Violations in Linux
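As a rough illustration of these two rules, the sketch below scans one path of calls while tracking whether interrupts are off and how many spin locks are held, and reports any call to a function in a supplied blocking set. The call names (`cli`, `sti`, `spin_lock`, `spin_unlock`) and the choice to treat `kmalloc` as possibly blocking are assumptions made for the example.

```python
def check_no_blocking(path, blocking):
    """Report blocking calls made with interrupts off or a spin lock held.

    path: list of call names along one execution path.
    blocking: set of functions that may block (assumed to be precomputed).
    Returns a list of (index, call, reason) violations.
    """
    irq_off = False
    locks_held = 0
    violations = []
    for i, call in enumerate(path):
        if call == "cli":
            irq_off = True
        elif call == "sti":
            irq_off = False
        elif call == "spin_lock":
            locks_held += 1
        elif call == "spin_unlock":
            locks_held = max(0, locks_held - 1)
        elif call in blocking:
            if irq_off:
                violations.append((i, call, "interrupts disabled"))
            if locks_held:
                violations.append((i, call, "spin lock held"))
    return violations


# Hypothetical path; kmalloc is assumed to be able to block here.
path = ["spin_lock", "kmalloc", "spin_unlock", "cli", "kmalloc", "sti", "kmalloc"]
violations = check_no_blocking(path, {"kmalloc"})
```

The final `kmalloc` call is not reported because by then the lock has been released and interrupts are back on.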
Module Mis-Usage
• Modules can be loaded/unloaded dynamically, so each module must keep track of its reference count to protect against being unloaded inappropriately
• Found 75 violations in Linux
Lock Checking
• Rules for Locking:
  – Locks acquired in f must be released in f
  – Locks can’t be locked or unlocked twice
  – Upon exit, enable or restore interrupts
  – “Bottom halves” of interrupt handlers not disabled on exit
  – Interrupt flags saved before being restored
Lock Checking
• False Positives…
  – Intentional violation for modularity or efficiency
  – Checker does local analysis only (not transitive)
  – Checker not sophisticated enough to prune impossible paths
Optimization
• Applied rules to the FLASH machine’s cache coherence protocol, where optimizations can increase performance significantly
Conclusion
• Metalevel Compilation:
  – Finds serious errors in real code (about 500 in three kernels)
  – Finds errors in a systematic, system-wide manner
  – Enables easy writing of checkers
Questions
• If we can validate properties automatically, can we weave them in automatically?
• How might you enforce properties between languages? (recall that MC extensions are written in an extension of the source language)
• Are there disadvantages to writing extensions in the source language (e.g., things you might want to express, but can’t easily?)
• What sorts of RCU rules do we want to encode with extensions, but can’t?
• How many code passes are needed? Can there be one pass per extension?
More Questions
• If read lock/unlock instructions must be paired, why not enforce this using a callback?
• The extensions themselves can be difficult to reason about, especially after a few lines. How can we make this easier? Can extensions be modularized?
• What typical compiler rules might better be pushed into the “extension” layer?
• What do you suppose the effort required to find these errors was, in terms of person-hours to implement checks and in terms of computation time to find errors?