Prediction and Certification of Heap Usage
Luca Veraldi, PhD Student, Department of Computer Science, University of Pisa
BISS06 – Bertinoro International Spring School for Graduate Studies in Computer Science
References
Static Prediction of Heap Space Usage for First-Order Functional Programs (M. Hofmann, S. Jost)
Automatic Certification of Heap Consumption (L. Beringer, M. Hofmann, A. Momigliano, O. Shkaravska)
Camelot and Grail: Resource-Aware Functional Programming for the JVM (K. MacKenzie, N. Wolverson)
Agenda (1)
Introduction: MRG + PCC, mobile programs; why heap usage certification; methodology
The language and its Type System: Operational Semantics and Annotated Types; the foundational theorem and proof; inferring annotations
Camelot: syntax, pattern matching, diamonds, transparency
Grail & JVM: compiling tricks and implementation; drawbacks, limitations
Overcoming linearity: the multi-layered sharing approach
Introduction
Guarantees for Resource Usage Requirements in Mobile Computing (MRG)
• time, heap/stack size bounds
• embedded computing devices
• hard resource constraints
Approach based on Proof-Carrying Code (PCC)
• resource-safe programming language
• (linear) type system + resource annotations
• certifying compiler
• binary (JVM bytecode) enriched with a verifiable certificate
• verifying resource needs prior to execution
• asymmetric certifying process
The language and Type System (1)
(L, T, LP)
• L: First-Order, Functional Language
• T: Annotated (Linear) Type System
• LP: Efficient solution of a Linear Constraint System
Given ƒ: L(Bool) → L(Bool), find ƒ’: N → N such that ƒ(w) runs within ƒ’(|w|) cells
We can annotate the code for ƒ with a counter, but:
• no global characterization of the behavior of ƒ
• annotated code requires as much space as ƒ itself
Undecidable problem in its general formulation ⇒ impose restrictions on the language/type system
The language and Type System (2)
[Figure: the list w and its elements wi, labelled with the weights X, Y, Z, A, B]
The main aim: ƒ: L(L(Bool)) → L(Bool)
w: (L(L(Bool, X), Y), Z) ├ e: (L(Bool, A), B)
If we have
fs(init) ≥ Z + Y · |w| + X · ∑i |wi|
then we can execute without any further space needs, leaving fs(final) ≥ B + A · |e|
The language and Type System (3)
w: (L(L(Bool, X), Y), Z) ├ e: (L(Bool, A), B)
• mark different input (output) portions with different weights
• fs(init) = Φ(|w|), fs(final) = Ψ(|e|)
From these annotations of ƒ we derive a Linear Programming problem, whose integer solutions can be computed efficiently
The language and Type System (4)
e ::= nil | tt | ff | x | x ⊗ y | inl(x) | inr(x) | cons(x,x) | let x = e in e | if x then e else e
| f(x1, …, xn) | match x with x ⊗ x ⇒ e
| match x with |inl(x) ⇒ e |inr(x) ⇒ e
| pmatch x with |nil ⇒ e |cons(x,x) ⇒ e
| dmatch x with |nil ⇒ e |cons(x,x) ⇒ e
Zero-order Types: T ::= 1 | Bool | L(T) | T ⊗ T | T + T
First-order Types: F ::= (T, …, T) → T
SIZE function to define heap space requirements for base types
The language and Type System (5)
FreeList: linked list of heap space blocks
• No compaction of the heap
• All blocks have the same size
cons: fails when there is not enough space
Two match statements:
• pmatch x with |nil ⇒ e |cons(x,x) ⇒ e preserves the matched block
• dmatch x with |nil ⇒ e |cons(x,x) ⇒ e returns the cell back to the FreeList
The user is required to choose between the two: transparency in heap space usage and collection
The language and Type System (6)
The problem of (malignant) sharing:
rev(a,b) = dmatch a with |nil ⇒ b |cons(x,y) ⇒ rev(y, cons(x,b))
let x=rev(a,nil) in Ψ(a)
• rev uses destructive matching
• the input value a cannot be reused any more
Is there a static type system to prohibit this?
Linearity…
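The effect of dmatch on the FreeList, and the malignant sharing in let x=rev(a,nil) in Ψ(a), can be tried out with a small simulation. This is only a sketch under assumptions that are not in the slides: every cell has size 1, the class Heap and the helpers from_list and to_list are hypothetical names, and rev is written as a loop instead of a recursive function.

```python
class Heap:
    """Toy heap: equally sized cells handed out from a free list (assumed size 1)."""

    def __init__(self, free_cells):
        self.free = free_cells   # number of cells currently on the free list
        self.cells = {}          # loc -> (head, tail)
        self.next_loc = 0        # locations are never reused in this sketch

    def cons(self, head, tail):
        if self.free == 0:
            raise MemoryError("cons: not enough space on the free list")
        self.free -= 1
        loc = self.next_loc
        self.next_loc += 1
        self.cells[loc] = (head, tail)
        return loc

    def pmatch(self, loc):
        """Read a cell and preserve the matched block."""
        return self.cells[loc]

    def dmatch(self, loc):
        """Read a cell and return it to the free list."""
        head, tail = self.cells.pop(loc)
        self.free += 1
        return head, tail


NIL = None


def from_list(heap, values):
    lst = NIL
    for v in reversed(values):
        lst = heap.cons(v, lst)
    return lst


def to_list(heap, lst):
    out = []
    while lst is not NIL:
        head, lst = heap.pmatch(lst)
        out.append(head)
    return out


def rev(heap, a, b=NIL):
    # Loop form of: rev(a,b) = dmatch a with |nil => b |cons(x,y) => rev(y, cons(x,b))
    while a is not NIL:
        x, y = heap.dmatch(a)
        a, b = y, heap.cons(x, b)
    return b
```

With a heap of exactly three cells, building a three-element list a exhausts the free list, yet rev(heap, a) still succeeds because each dmatch frees the cell that the next cons consumes. Reading a afterwards fails with a KeyError in this sketch, which is the dangling use that linearity is meant to rule out.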
Operational Semantics (1)
(stack) S: Var→Val    (heap) h: Loc→Val
Judgement: m, S, h ├ e ⇝ v, h’, m’
S(x1)=v1, …, S(xn)=vn    m, [y1←v1, …, yn←vn], h ├ ef ⇝ v, h’, m’
──────────────────────────────────────── (fun)
m, S, h ├ f(x1, …, xn) ⇝ v, h’, m’
m, S, h ├ e1 ⇝ v1, h1, m1    m1, S[x←v1], h1 ├ e2 ⇝ v, h’, m’
──────────────────────────────────────── (let)
m, S, h ├ let x=e1 in e2 ⇝ v, h’, m’
Operational Semantics (2)
S(x)=nil    m, S, h ├ e1 ⇝ v, h’, m’
──────────────────────────────────────── (nil case, for both pmatch and dmatch)
m, S, h ├ match x with |nil ⇒ e1 |cons(h,t) ⇒ e2 ⇝ v, h’, m’
S(x)=loc    h(loc)=(vh,vt)    j = m + SIZE( h(loc) )
j, S[h←vh, t←vt], h \ {loc} ├ e2 ⇝ v, h’, m’
──────────────────────────────────────── (dmatch)
m, S, h ├ dmatch x with |nil ⇒ e1 |cons(h,t) ⇒ e2 ⇝ v, h’, m’
S(x)=loc    h(loc)=(vh,vt)
m, S[h←vh, t←vt], h ├ e2 ⇝ v, h’, m’
──────────────────────────────────────── (pmatch)
m, S, h ├ pmatch x with |nil ⇒ e1 |cons(h,t) ⇒ e2 ⇝ v, h’, m’
Operational Semantics (3)
Modeling benign sharing: a function R for reachable locations
R : heap × ℘(Val) → ℘(Loc)
R(h, {nil}) = R(h, {c}) = ∅
R(h, {loc}) = {loc} ∪ R(h, {h(loc)})
R(h, {inl(v)}) = R(h, {inr(v)}) = R(h, {v})
R(h, {(x,y)}) = R(h, {x}) ∪ R(h, {y})
R(h, S) = R(h, { v | ∃x∈dom(S) . v=S(x) }) = ∪x∈dom(S) R(h, {S(x)})
stronger preconditions in semantics
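The function for reachable locations can be sketched as a worklist traversal. The value encoding is an assumption made only for this example: None stands for nil, Python booleans for constants, and tagged tuples ("loc", n), ("inl", v), ("inr", v), ("pair", x, y) for the remaining values; the names reachable and reachable_from_stack are hypothetical. Unlike the recursive equations, the sketch keeps a visited set so that it also terminates on cyclic heaps.

```python
def reachable(heap, values):
    """Set of locations reachable in `heap` from the given collection of values."""
    seen = set()
    todo = list(values)
    while todo:
        v = todo.pop()
        if v is None or isinstance(v, bool):
            continue                      # nil and constants reach nothing
        tag = v[0]
        if tag == "loc":
            if v[1] not in seen:
                seen.add(v[1])            # {loc} united with what h(loc) reaches
                todo.append(heap[v[1]])
        elif tag in ("inl", "inr"):
            todo.append(v[1])
        elif tag == "pair":
            todo.extend(v[1:])            # union of both components
        else:
            raise ValueError(f"unknown value: {v!r}")
    return seen


def reachable_from_stack(heap, stack):
    """Locations reachable from any variable bound in the stack."""
    return reachable(heap, stack.values())
```

For a heap with three cells where only cell 1 (pointing to cell 0) is bound on the stack, the result is {0, 1}; cell 2 is unreachable.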
Operational Semantics (4)
let with the stronger precondition:
m, S, h ├ e1 ⇝ v1, h1, m1    m1, S[x←v1], h1 ├ e2 ⇝ v, h’, m’
S’ = S↓FreeVar(e2)    h↓R(h, S’) = h1↓R(h, S’)
──────────────────────────────────────── (let)
m, S, h ├ let x=e1 in e2 ⇝ v, h’, m’
• e2 cannot use locations modified during the evaluation of e1
• this rules out let x=rev(a,nil) in Ψ(a)
dmatch with the stronger precondition:
S(x)=loc    h(loc)=(vh,vt)    j = m + SIZE( h(loc) )
j, S[h←vh, t←vt], h \ {loc} ├ e2 ⇝ v, h’, m’
S’ = S[h←vh, t←vt]    S’’ = S’↓FreeVar(e2)    loc ∉ R(h, S’’)
──────────────────────────────────────── (dmatch)
m, S, h ├ dmatch x with |nil ⇒ e1 |cons(h,t) ⇒ e2 ⇝ v, h’, m’
Annotated Types (1)
Extend the Type System with space usage
Zero-order Types: T ::= 1 | Bool | L(T) | T ⊗ T | T + T    R ::= (T, k)
First-order Types: F ::= (T, …, T, k) → R
Use new types to rewrite the typing rules
Annotated Types (2)
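Σ(f) = (A1, …, An, k) → (C, k’)    m ≥ k    m – k + k’ ≥ m’
──────────────────────────────────────── (fun)
Γ, x1:A1, …, xn:An, m ├ f(x1, …, xn) : (C, m’)
m ≥ SIZE(A ⊗ L(A, k)) + k + m’
──────────────────────────────────────── (cons)
Γ, xh:A, xt:L(A, k), m ├ cons(xh, xt) : (L(A, k), m’)
Γ, m ├ e : (A, p)    m’ ≤ p + k
──────────────────────────────────────── (waste)
Γ, m + k ├ e : (A, m’)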
Annotated Types (3)
Γ, m ├ e1 : (C, m’)
Γ, xh:A, xt:L(A, k), m + SIZE(A ⊗ L(A, k)) + k ├ e2 : (C, m’)
──────────────────────────────────────── (dmatch)
Γ, x:L(A, k), m ├ dmatch x with |nil ⇒ e1 |cons(xh,xt) ⇒ e2 : (C, m’)
Γ, m ├ e1 : (C, m’)
Γ, xh:A, xt:L(A, k), m + k ├ e2 : (C, m’)
──────────────────────────────────────── (pmatch)
Γ, x:L(A, k), m ├ pmatch x with |nil ⇒ e1 |cons(xh,xt) ⇒ e2 : (C, m’)
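The side conditions of the cons, dmatch and pmatch typing rules are plain arithmetic, so their interplay can be checked numerically. The sketch below assumes SIZE = 1 for every cell, which the slides do not fix, and the function names are hypothetical.

```python
def cons_ok(m, m_out, size, k):
    """Side condition of the cons rule: m >= SIZE + k + m'."""
    return m >= size + k + m_out


def dmatch_cons_budget(m, size, k):
    """Free space granted to e2 in the cons branch of dmatch: m + SIZE + k."""
    return m + size + k


def pmatch_cons_budget(m, k):
    """Free space granted to e2 in the cons branch of pmatch: m + k."""
    return m + k


SIZE = 1  # assumed cell size

# Destructive step, input list annotated with k = 0 and no free space (m = 0):
# dmatch hands back one cell, which pays for one cons into a list annotated with 0.
destructive_step = cons_ok(dmatch_cons_budget(0, SIZE, 0), 0, SIZE, 0)

# The same step with pmatch and k = 0 has no budget for the cons ...
preserving_step_k0 = cons_ok(pmatch_cons_budget(0, 0), 0, SIZE, 0)

# ... unless the input list carries k = 1, i.e. one free cell per element.
preserving_step_k1 = cons_ok(pmatch_cons_budget(0, 1), 0, SIZE, 0)
```

Under these assumptions the destructive step type-checks with an annotation of 0 on the input list, while the preserving variant needs an annotation of 1 per element.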
The foundational theorem (1)
Introducing the heap requirement function Υ : heap × ℘(Val) × ℘(T) → Q+
Υ(h, {nil}, {L(A, k)}) = Υ(h, {c}, {Bool}) = 0
Υ(h, {loc}, {L(A, k)}) = k + Υ(h, {h(loc)}, {A ⊗ L(A, k)})
Υ(h, {inl(x)}, {(A, k)+(B, l)}) = k + Υ(h, {x}, {A})
Υ(h, {inr(y)}, {(A, k)+(B, l)}) = l + Υ(h, {y}, {B})
Υ(h, {(x,y)}, {A ⊗ B}) = Υ(h, {x}, {A}) + Υ(h, {y}, {B})
Υ(h, S, Γ) = Υ(h, { v | ∃x∈dom(S) . v=S(x) }, Γ) = ∑x∈dom(Γ) Υ(h, {S(x)}, Γ(x))
Once determined, the global resource usage requirement derived from the Type System could be used to drop resource annotations away from operational semantics
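The heap requirement function can be evaluated directly on a concrete heap. The following sketch assumes a tuple encoding that is not part of the slides: values are None (nil), booleans, ("loc", n), ("inl", v), ("inr", v) or ("pair", x, y), and types are ("unit",), ("bool",), ("list", A, k), ("pair", A, B) or ("sum", A, k, B, l). The names requirement, requirement_ctx and build_list are hypothetical.

```python
def requirement(heap, v, ty):
    """Heap requirement of value v at annotated type ty."""
    kind = ty[0]
    if kind in ("unit", "bool"):
        return 0
    if kind == "list":
        _, elem, k = ty
        if v is None:                       # nil costs nothing
            return 0
        # k for this cell, plus the requirement of its (head, tail) pair
        return k + requirement(heap, heap[v[1]], ("pair", elem, ty))
    if kind == "pair":
        return requirement(heap, v[1], ty[1]) + requirement(heap, v[2], ty[2])
    if kind == "sum":
        _, a, k, b, l = ty
        if v[0] == "inl":
            return k + requirement(heap, v[1], a)
        return l + requirement(heap, v[1], b)
    raise ValueError(f"unknown type: {ty!r}")


def requirement_ctx(heap, stack, ctx):
    """Sum of the requirements of all variables typed by the context."""
    return sum(requirement(heap, stack[x], ctx[x]) for x in ctx)


def build_list(heap, items):
    """Helper: allocate a list in the heap dictionary, returning its value."""
    lst = None
    for item in reversed(items):
        loc = len(heap)
        heap[loc] = ("pair", item, lst)
        lst = ("loc", loc)
    return lst


heap = {}
w = build_list(heap, [build_list(heap, [True, False]), build_list(heap, [True])])
X, Y = 2, 3
nested = ("list", ("list", ("bool",), X), Y)
total = requirement(heap, w, nested)        # Y * |w| + X * sum of |wi|
```

For w = [[tt, ff], [tt]] typed L(L(Bool, X), Y) with X = 2 and Y = 3, the result is 3 · 2 + 2 · 3 = 12, matching the bound Y · |w| + X · ∑i |wi| from the earlier slide (the constant Z is added separately).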
The foundational theorem (2)
The theorem statement:
IF
• P is a valid program
• Γ, m ├ e : A, m’
• S, h ├ e ⇝ v, h’
THEN
∀k∈N, ∀a∈N . a ≥ m + Υ(h, S, Γ) + k ⇒
∃b∈N . b ≥ m’ + Υ(h’, v, A) + k ∧
a, S, h ├ e ⇝ v, h’, b
The foundational theorem (3)
Proof (main idea): by induction on the length of the derivations of
Γ, m ├ e : A, m’ and S, h ├ e ⇝ v, h’
with a separate case for each syntactic construct
Γ, S, h, m → Γ0, S0, h0, m0 →* Γ’, S’, h’, m’
The foundational theorem (4)
Last step is fun:
Σ(f) = (A1, …, An, k) → (C, k’)    m ≥ k    m – k + k’ ≥ m’
──────────────────────────────────────── (fun, typing)
Γ, x1:A1, …, xn:An, m ├ f(x1, …, xn) : (C, m’)
S(x1)=v1, …, S(xn)=vn    m, [y1←v1, …, yn←vn], h ├ ef ⇝ v, h’, m’
──────────────────────────────────────── (fun, semantics)
m, S, h ├ f(x1, …, xn) ⇝ v, h’, m’
Γ, S, h, m → Γ0, S0, h0, m0 →* Γ’, S’, h’, m’ (the first step unfolds f to ef, the remaining steps evaluate ef to v)
• Γ0: y1:A1, …, yn:An
• S0: [y1←v1, …, yn←vn], with the values taken from S
• h0 = h
• Υ(h, S, Γ) ≥ Υ(h0, S0, Γ0)
• a ≥ m + Υ(h, S, Γ) + q
    ≥ k + Υ(h0, S0, Γ0) + (m-k+q)
• induction hypothesis on the evaluation of ef
• a, S0, h0 ├ ef ⇝ v, h’, b with
  b ≥ k’ + Υ(h’, v, C) + (m-k+q)
    = q + Υ(h’, v, C) + (m-k+k’)
    ≥ m’ + Υ(h’, v, C) + q
The foundational theorem (5)
Last step is dmatch:
S(x)=loc    h(loc)=(vh,vt)    m0 = m + SIZE( h(loc) )
m0, S[h←vh, t←vt], h \ {loc} ├ e2 ⇝ v, h’, m’
──────────────────────────────────────── (dmatch, semantics)
m, S, h ├ dmatch x with |nil ⇒ e1 |cons(h,t) ⇒ e2 ⇝ v, h’, m’
Γ, m ├ e1 : (C, m’)
Γ, xh:A, xt:L(A, k), m + SIZE(A ⊗ L(A, k)) + k ├ e2 : (C, m’)
──────────────────────────────────────── (dmatch, typing)
Γ, x:L(A, k), m ├ dmatch x with |nil ⇒ e1 |cons(h,t) ⇒ e2 : (C, m’)
Γ, S, h, m → Γ0, S0, h0, m0 →* Γ’, S’, h’, m’ (the first step opens the cons branch, the remaining steps evaluate e2 to v)
• Γ0 = Γ \ {x:L(A,k)} ∪ {h:A, t:L(A,k)}
• S0 = S[h←vh, t←vt]
• h0 = h \ {loc}
• Υ(h, S, Γ)
    = Υ(h, {loc}, {L(A,k)}) + Υ(h0, S \ {x}, Γ \ {x:L(A,k)})
    = k + Υ(h, (vh,vt), A ⊗ L(A,k)) + Υ(h0, S \ {x}, Γ \ {x:L(A,k)})
    = k + Υ(h0, S0, Γ0)
• a ≥ m + Υ(h, S, Γ) + q
    ≥ m + SIZE(A ⊗ L(A,k)) + k + Υ(h0, S0, Γ0) + (q-SIZE(h(loc)))
• induction hypothesis on the evaluation of e2
• a, S0, h0 ├ e2 ⇝ v, h’, b with
  b ≥ m’ + Υ(h’, v, C) + (q-SIZE(h(loc)))
Inferring annotations (1)
Find a valid (integral) assignment for all a, b… in type derivations
Associate P with an LP { ai,1xi,1 + … + ai,nxi,n ≤ bi } and an Objective Function Ψ = c1x1 + … + cnxn
• Variables are the free heap space variables in type derivations
• All variables need to be (integral) positive numbers
• Constraints are the inequalities in the side conditions of type derivations
• The Objective Function is simply Ψ = x1 + … + xn (minimize overall space requirements, modulo the waste rule)
We need integral optimal solutions. NP-Hard!
Inferring annotations (2)
Imposing further constraints: almost positive constraints
• all variables for first-order types: (1, 0) | (Bool, 0) | (T ⊗ T, 0) | (T+T, 0) | (L(T), 0)
• all variables for the right-hand side in first-order types: (T, …, T, k) → (T, 0)
All linear constraints become: { xi,0 ≥ ai,1xi,1 + … + ai,nxi,n + bi }
The optimal solution is necessarily integral. Proof by contradiction:
• if a non-integral xi,0 ∈ Q+ is optimal, then xi,0 ≥ ai,1xi,1+…+ai,nxi,n+bi
• but then ⌊xi,0⌋ ≥ ai,1xi,1+…+ai,nxi,n+bi
• therefore ⌊xi,0⌋ would be a better solution than the optimal one, xi,0
Inferring annotations (3)
Imposing further constraints: almost conical constraints
• renaming variables: the only place where non-null constants are introduced is when we consider SIZE(A ⊗ L(A,k)) + k
All linear constraints become: { ai,1xi,1 + … + ai,nxi,n ≤ 0 } or { xi,j ≥ bi,j }
An integral solution can be found from the rational one by multiplying by the LCD (least common denominator)
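The scaling step works because constraints of the form ai,1xi,1 + … + ai,nxi,n ≤ 0 are preserved when every variable is multiplied by the same positive integer. A minimal sketch with exact rational arithmetic follows; the constraint rows and the rational solution are made-up illustration data, the function names are hypothetical, and the lower-bound constraints xi,j ≥ bi,j are left out (they survive scaling as long as the bounds and the solution are non-negative).

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def lcd(values):
    """Least common denominator of a list of Fractions."""
    return reduce(lambda acc, q: acc * q.denominator // gcd(acc, q.denominator),
                  values, 1)


def satisfies(rows, x):
    """Check every homogeneous constraint a1*x1 + ... + an*xn <= 0."""
    return all(sum(a * xi for a, xi in zip(row, x)) <= 0 for row in rows)


def scale_to_integers(x):
    """Multiply a rational solution by its LCD to obtain an integral one."""
    d = lcd(x)
    return [int(q * d) for q in x]


# Hypothetical conical constraints: 2*x0 - 3*x1 <= 0 and 3*x1 - 4*x2 <= 0
rows = [[2, -3, 0], [0, 3, -4]]
rational = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 4)]
integral = scale_to_integers(rational)
```

Here the LCD is 12 and the scaled solution is (6, 4, 3), which still satisfies both rows. The scaled solution is feasible but not necessarily minimal for the objective function.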
Camelot
• First-Order functional language
• Polymorphism
• Elementary match construct
• Explicit resource usage: heap cells are visible at the language level:
  match (l) with |Nil … |Cons(h,t)@d …
  free(@d)
• Null constructor for types: Nil or !Nil
• In-place modification
Grail & JVM
• Simpler functional language
• No inheritance
• Simplicity ⇒ easy verifiability, even on mobile devices with constrained space and time resources
Compilation of Camelot implies:
• all user-defined data types are represented through a simple class, the union of all features (and space requirements): the diamond class
• monomorphisation
• normalisation of expressions and match statements
Overcoming linearity
Linearity in the Type System can be a pretty restrictive policy
Approach based on layered sharing:
• Layer 1: modifying usage
• Layer 2: read-only, shared with result
• Layer 3: read-only, not shared
Variables get decorated with the corresponding usage layer
We allow duplication w.r.t. several constraints
Example: let x=e1 in e2
Konečný’s system for layered sharing
Splitting contexts:
• Γ1 for FreeVar(e1) ∩ FreeVar(e2) used in e1
• Γ2 for FreeVar(e1) ∩ FreeVar(e2) used in e2
• Δ1 for FreeVar(e1) \ Γ1
• Δ2 for FreeVar(e2) \ Γ2
Formulating constraints:
Γ1, Δ1 ├ e1 : A (i)
Γ2, Δ2, xi : A ├ e2 : B
12→i, 2, 1
2→i 2├ let x=e1 in e2 : B