Prediction and Certification of Heap Usage

Prediction and Certification of Heap Usage • Luca Veraldi, PhD Student • Department of Computer Science, University of Pisa • BISS06 – Bertinoro International Spring School for Graduate Studies in Computer Science

References • Static Prediction of Heap Space Usage for First-Order Functional Programs (M. Hofmann, S. Jost) • Automatic Certification of Heap Consumption (L. Beringer, M. Hofmann, A. Momigliano, O. Shkaravska) • Camelot and Grail: Resource-Aware Functional Programming for the JVM (K. MacKenzie, N. Wolverson)

Agenda (1) • Introduction • MRG + PCC, mobile programs: why heap usage certification • methodology • The language and its Type System • Operational Semantics and Annotated Types • The foundational theorem and proof • Inferring annotations • Camelot • syntax, pattern matching, diamonds, transparency • Grail & JVM • compiling tricks and implementation drawbacks, limitations • Overcoming linearity • the multi-layered sharing approach

Introduction • Guarantees for Resource Usage Requirements in Mobile Computing (MRG) • time, heap/stack size bounds • embedded computing devices • hard resource constraints • Approach based on Proof-Carrying Code (PCC) • resource-safe programming language • (linear) type system + resource annotations • certifying compiler • binary (JVM bytecode) enriched with a verifiable certificate • verifying resource needs prior to execution • asymmetric certifying process

The language and Type System (1) • (L, T, LP) • First-Order, Functional Language • Annotated (Linear) Type System • Efficient solution of a Linear Constraint System • ƒ: L(Bool) → L(Bool) • ∃ ƒ’: N → N . ƒ(w) runs within ƒ’(|w|) cells • We can annotate the code for ƒ with a counter • no global characterization of the behavior of ƒ • annotated code requires as much space as ƒ itself • Undecidable problem, in its general formulation • impose restrictions on the language and the type system

The language and Type System (2) • The main aim: • ƒ: L(L(Bool)) → L(Bool) • w: (L(L(Bool, X), Y), Z) ├ e: (L(Bool, A), B) • If we have • fs(init) ≥ Z + Y · |w| + X · ∑i |wi| • then we can execute without any further space needs, leaving • fs(final) ≥ B + A · |e|
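To make the bound on this slide concrete, here is a minimal Python sketch that evaluates it for an input list of lists. The function names `required_initial` and `guaranteed_final` and the sample weights are assumptions made for illustration; they do not come from the slides.

```python
def required_initial(w, X, Y, Z):
    """Free-list size needed up front: Z + Y*|w| + X*sum_i |w_i|."""
    return Z + Y * len(w) + X * sum(len(wi) for wi in w)


def guaranteed_final(e, A, B):
    """Free-list size guaranteed after evaluation: B + A*|e|."""
    return B + A * len(e)


# Hypothetical input: a list of three inner lists with 2, 1 and 0 elements.
w = [[True, False], [True], []]
print(required_initial(w, X=1, Y=2, Z=3))   # 3 + 2*3 + 1*3 = 12
print(guaranteed_final([True, False], A=2, B=1))  # 1 + 2*2 = 5
```

Each inner element contributes X, each outer cell contributes Y, and Z is a constant overhead, so the requirement is linear in the sizes of the input's parts.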

The language and Type System (3) • w: (L(L(Bool, X), Y), Z) ├ e: (L(Bool, A), B) • mark different input (output) portions with different weights • fs(init) = Φ(|w|) fs(final) = Ψ(|e|) • From these annotations of ƒ, we derive a Linear Programming Problem, whose integer solutions can be computed efficiently

The language and Type System (5) • FreeList: linked list of heap space blocks • No compaction of the heap. All blocks have the same size • Cons: fails when there is not enough space • Two match statements • pmatch • dmatch • The user is required to choose between the two • transparency in heap space usage and collection • pmatch x with |nil → e1 |cons(h,t) → e2 • preserves the matched block • dmatch x with |nil → e1 |cons(h,t) → e2 • returns the cell back to the FreeList
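The free-list discipline described on this slide can be mimicked with a small Python model. This is only a sketch under assumptions: the class `Heap`, the exception `OutOfHeap`, the use of `None` for nil and of callbacks for the two match branches are all choices made for this example, not part of the language on the slides.

```python
class OutOfHeap(Exception):
    """Raised when cons finds the free list empty."""


class Heap:
    def __init__(self, n_blocks):
        self.free = list(range(n_blocks))  # free list of equally sized blocks
        self.cells = {}                    # loc -> (head, tail); None is nil

    def cons(self, head, tail):
        if not self.free:
            raise OutOfHeap("cons: not enough space")
        loc = self.free.pop()
        self.cells[loc] = (head, tail)
        return loc

    def pmatch(self, x, on_nil, on_cons):
        """Preserving match: the matched block stays allocated."""
        if x is None:
            return on_nil()
        head, tail = self.cells[x]
        return on_cons(head, tail)

    def dmatch(self, x, on_nil, on_cons):
        """Destructive match: the matched block returns to the free list."""
        if x is None:
            return on_nil()
        head, tail = self.cells.pop(x)
        self.free.append(x)
        return on_cons(head, tail)


heap = Heap(2)
l = heap.cons(1, None)
heap.pmatch(l, lambda: 0, lambda h, t: h)
print(len(heap.free))   # 1: pmatch kept the cell
heap.dmatch(l, lambda: 0, lambda h, t: h)
print(len(heap.free))   # 2: dmatch released it
```

The only difference between the two match forms in this model is whether the cell is handed back to `free`, which is the choice the slide asks the programmer to make explicitly.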

The language and Type System (6) • The problem of (malignant) sharing: • rev(a,b) = dmatch a with |nil → b |cons(x,y) → rev(y, cons(x,b)) • let x=rev(a,nil) in Ψ(a) • rev uses destructive matching • the input value a cannot be reused any more • Is there a static type system to prohibit this? Linearity…
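The effect of malignant sharing can be reproduced in a few lines of Python. The sketch below assumes a three-block free list modelled as a Python list, a heap modelled as a dictionary, and `None` for nil; the helper `to_list` is a hypothetical addition used only to inspect the result.

```python
free = [0, 1, 2]   # free list of three blocks
cells = {}         # heap: loc -> (head, tail); None is nil


def cons(head, tail):
    loc = free.pop()
    cells[loc] = (head, tail)
    return loc


def rev(a, b):
    # dmatch a with | nil -> b | cons(x, y) -> rev(y, cons(x, b))
    if a is None:
        return b
    x, y = cells.pop(a)   # destructive: the cell goes back to the free list
    free.append(a)
    return rev(y, cons(x, b))


def to_list(l):
    out = []
    while l is not None:
        head, l = cells[l]
        out.append(head)
    return out


a = cons(1, cons(2, cons(3, None)))
x = rev(a, None)
print(to_list(x))   # [3, 2, 1]
print(to_list(a))   # [1]: the cells of a were recycled by rev
```

Because rev recycles each cell as soon as it is matched, the reversal runs in place, and the old name `a` now points into the middle of the result. A linear type system rejects the second use of `a`.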

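Operational Semantics (1) • (stack) S: Var→Val (heap) h: Loc→Val • m, S, h ├ e ⇝ v, h’, m’
• Rule fun:
  S(x1)=v1, …, S(xn)=vn     m, [y1←v1, …, yn←vn], h ├ ef ⇝ v, h’, m’
  ------------------------------------------------------------------
  m, S, h ├ f(x1, …, xn) ⇝ v, h’, m’
• Rule let:
  m, S, h ├ e1 ⇝ v1, h1, m1     m1, S[x←v1], h1 ├ e2 ⇝ v, h’, m’
  ------------------------------------------------------------------
  m, S, h ├ let x=e1 in e2 ⇝ v, h’, m’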

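Operational Semantics (2)
• Rule for the nil case (pmatch and dmatch):
  S(x)=nil     m, S, h ├ e1 ⇝ v, h’, m’
  ------------------------------------------------------------------
  m, S, h ├ match x with |nil → e1 |cons(h,t) → e2 ⇝ v, h’, m’
• Rule dmatch (cons case):
  S(x)=loc     h(loc)=(vh,vt)     j = m + SIZE( h(loc) )     j, S[h←vh, t←vt], h \ {loc} ├ e2 ⇝ v, h’, m’
  ------------------------------------------------------------------
  m, S, h ├ dmatch x with |nil → e1 |cons(h,t) → e2 ⇝ v, h’, m’
• Rule pmatch (cons case):
  S(x)=loc     h(loc)=(vh,vt)     m, S[h←vh, t←vt], h ├ e2 ⇝ v, h’, m’
  ------------------------------------------------------------------
  m, S, h ├ pmatch x with |nil → e1 |cons(h,t) → e2 ⇝ v, h’, m’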

Operational Semantics (3) • Modeling benign sharing • a function for reachable locations: R: heap × ℘(Val) → ℘(Loc) • R(h, {nil}) = R(h, {c}) = {} • R(h, {loc}) = {loc} ∪ R(h, {h(loc)}) • R(h, {inl(v)}) = R(h, {inr(v)}) = R(h, {v}) • R(h, {(x,y)}) = R(h, {x}) ∪ R(h, {y}) • R(h, S) = R(h, { v | ∃ x∈dom(S) . v=S(x) }) = ∪x∈dom(S) R(h, {S(x)}) • stronger preconditions in semantics
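The reachable-locations function on this slide translates almost clause by clause into Python. The sketch below rests on assumptions made for illustration: locations, left injections and right injections are modelled by the hypothetical classes `Loc`, `Inl` and `Inr`, pairs are Python tuples, nil is `None`, and the heap is taken to be acyclic so that the recursion terminates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Loc:
    addr: int


@dataclass(frozen=True)
class Inl:
    value: object


@dataclass(frozen=True)
class Inr:
    value: object


def reach(h, v):
    """Locations reachable from value v in heap h (a dict Loc -> value)."""
    if isinstance(v, Loc):
        return {v} | reach(h, h[v])
    if isinstance(v, (Inl, Inr)):
        return reach(h, v.value)
    if isinstance(v, tuple):        # a pair (x, y)
        x, y = v
        return reach(h, x) | reach(h, y)
    return set()                    # nil or a constant


def reach_stack(h, S):
    """Union of the locations reachable from every variable in the stack S."""
    out = set()
    for v in S.values():
        out |= reach(h, v)
    return out


h = {Loc(0): (True, Loc(1)), Loc(1): (False, None), Loc(2): (True, None)}
S = {"x": Loc(0), "b": True}
print(reach_stack(h, S) == {Loc(0), Loc(1)})   # True: Loc(2) is unreachable
```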

Operational Semantics (4) • In let x=rev(a,nil) in Ψ(a), e2 cannot use locations modified during the evaluation of e1
• Rule let, with the stronger precondition:
  m, S, h ├ e1 ⇝ v1, h1, m1     m1, S[x←v1], h1 ├ e2 ⇝ v, h’, m’     S’ = S↓FreeVar(e2)     h↓R(h, S’) = h1↓R(h, S’)
  ------------------------------------------------------------------
  m, S, h ├ let x=e1 in e2 ⇝ v, h’, m’
• Rule dmatch, with the stronger precondition:
  S(x)=loc     h(loc)=(vh,vt)     j = m + SIZE( h(loc) )     j, S[h←vh, t←vt], h \ {loc} ├ e2 ⇝ v, h’, m’     S’ = S[h←vh, t←vt]     S’’ = S’↓FreeVar(e2)     loc ∉ R(h, S’’)
  ------------------------------------------------------------------
  m, S, h ├ dmatch x with |nil → e1 |cons(h,t) → e2 ⇝ v, h’, m’

Annotated Types (1) • Extend the Type System with space usage • Zero-order Types: • T ::= 1 | Bool | L(T) | T ⊗ T | T + T • R ::= (T, k) • First-order Types: • F ::= (T, …, T, k) → R • Use the new types to rewrite the typing rules

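Annotated Types (2)
• Rule waste:
  Γ, m ├ e : (A, p)     m’ ≤ p + k
  ------------------------------------------------------------------
  Γ, m + k ├ e : (A, m’)
• Rule fun:
  Σ(f) = (A1, …, An, k) → (C, k’)     m ≥ k     m – k + k’ ≥ m’
  ------------------------------------------------------------------
  Γ, x1:A1, …, xn:An, m ├ f(x1, …, xn) : (C, m’)
• Rule cons:
  m ≥ SIZE( A ⊗ L(A, k) ) + k + m’
  ------------------------------------------------------------------
  Γ, xh:A, xt:L(A, k), m ├ cons(xh, xt) : (L(A, k), m’)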

Annotated Types (3)
• Rule dmatch:
  Γ, m ├ e1 : (C, m’)     Γ, xh:A, xt:L(A, k), m + SIZE( A ⊗ L(A, k) ) + k ├ e2 : (C, m’)
  ------------------------------------------------------------------
  Γ, x:L(A, k), m ├ dmatch x with |nil → e1 |cons(h,t) → e2 : (C, m’)
• Rule pmatch:
  Γ, m ├ e1 : (C, m’)     Γ, xh:A, xt:L(A, k), m + k ├ e2 : (C, m’)
  ------------------------------------------------------------------
  Γ, x:L(A, k), m ├ pmatch x with |nil → e1 |cons(h,t) → e2 : (C, m’)
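The arithmetic behind the match and cons rules can be tried out with a short Python sketch. Everything named here is an assumption for illustration: `SIZE` is fixed to 2 units per cell regardless of the element type, and the helper names `cons_branch_budget` and `cons_allowed` are hypothetical.

```python
SIZE = 2  # assumed size of one cons cell, in heap units (hypothetical value)


def cons_branch_budget(m, k, destructive):
    """Free space available to e2 in the cons branch of a match on x : L(A, k)."""
    return m + k + (SIZE if destructive else 0)


def cons_allowed(m, k_out, m_after):
    """Side condition of the cons rule: m >= SIZE + k_out + m_after."""
    return m >= SIZE + k_out + m_after


# Can the cons branch allocate one output cell of annotation 0, starting from m = 0?
for k in (0, 2):
    p = cons_branch_budget(0, k, destructive=False)
    d = cons_branch_budget(0, k, destructive=True)
    print(k, cons_allowed(p, 0, 0), cons_allowed(d, 0, 0))
```

With an input annotation of 0, only the destructive match frees enough space to allocate a new cell; with pmatch the input list must carry at least `SIZE` units per element to pay for the allocation.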

The foundational theorem (1) • Introducing the heap requirement function: Υ: heap × ℘(Val) × ℘(T) → Q+ • Υ(h, {nil}, {L(A, k)}) = Υ(h, {c}, {Bool}) = 0 • Υ(h, {loc}, {L(A, k)}) = k + Υ(h, {h(loc)}, {A⊗L(A, k)}) • Υ(h, {inl(x)}, {(A, k)+(B, l)}) = k + Υ(h, {x}, {A}) • Υ(h, {inr(y)}, {(A, k)+(B, l)}) = l + Υ(h, {y}, {B}) • Υ(h, {(x,y)}, {A⊗B}) = Υ(h, {x}, {A}) + Υ(h, {y}, {B}) • Υ(h, S, Γ) = Υ(h, { v | ∃ x∈dom(S) . v=S(x) }, Γ) = ∑x∈dom(Γ) Υ(h, {S(x)}, Γ(x)) • Once determined, the global resource usage requirement derived from the Type System can be used to drop the resource annotations from the operational semantics
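The heap requirement function can be sketched in Python for the list and pair cases. The encoding is an assumption made for this example: types are written as `"Bool"`, `"1"`, `("L", A, k)` and `("pair", A, B)`, locations are integers, nil is `None`, and the sum-type cases are left out to keep the sketch short. The name `potential` is hypothetical.

```python
def potential(h, v, ty):
    """Heap requirement of value v at annotated type ty in heap h (dict loc -> value)."""
    if ty == "Bool" or ty == "1":
        return 0
    tag = ty[0]
    if tag == "L":
        _, elem, k = ty
        if v is None:                       # nil carries no requirement
            return 0
        # a cons cell contributes k plus the requirement of its contents
        return k + potential(h, h[v], ("pair", elem, ty))
    if tag == "pair":
        _, a, b = ty
        x, y = v
        return potential(h, x, a) + potential(h, y, b)
    raise ValueError(f"unsupported type: {ty!r}")


def potential_stack(h, S, gamma):
    """Sum of the requirements of all variables in the typing context gamma."""
    return sum(potential(h, S[x], gamma[x]) for x in gamma)


# A list of two inner lists with 2 and 1 elements, typed L(L(Bool, 1), 2).
h = {0: (10, 1), 1: (11, None), 10: (True, 12), 12: (False, None), 11: (True, None)}
inner = ("L", "Bool", 1)
outer = ("L", inner, 2)
print(potential(h, 0, outer))   # 2*2 + 1*(2 + 1) = 7
```

The result is linear in the sizes of the nested parts: each outer cell contributes its annotation 2 and each inner cell contributes its annotation 1.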

The foundational theorem (2) • The theorem statement: • IF • P is a valid program • Γ, m ├ e : A, m’ • S, h ├ e ⇝ v, h’ • THEN • ∀ k∈N, a∈N . a ≥ m + Υ(h, S, Γ) + k implies • ∃ b∈N . b ≥ m’ + Υ(h’, v, A) + k such that • a, S, h ├ e ⇝ v, h’, b

The foundational theorem (3) • Proof (main idea) • by induction on the length of the derivations of Γ, m ├ e : A, m’ and S, h ├ e ⇝ v, h’ • with a separate case for each syntactic construct • [Diagram relating the configurations (Γ, S, h, m), (Γ0, S0, h0, m0) and (Γ’, S’, h’, m’)]

The foundational theorem (4) • Last step is fun:
• Typing rule:
  Σ(f) = (A1, …, An, k) → (C, k’)     m ≥ k     m – k + k’ ≥ m’
  ------------------------------------------------------------------
  Γ, x1:A1, …, xn:An, m ├ f(x1, …, xn) : (C, m’)
• Evaluation rule:
  S(x1)=v1, …, S(xn)=vn     m, [y1←v1, …, yn←vn], h ├ ef ⇝ v, h’, m’
  ------------------------------------------------------------------
  m, S, h ├ f(x1, …, xn) ⇝ v, h’, m’
• Γ0: y1:A1, …, yn:An ⊆ Γ (up to renaming xi to yi) • S0: [y1←v1, …, yn←vn] ⊆ S (up to the same renaming) • h0 = h • Υ(h, S, Γ) ≥ Υ(h0, S0, Γ0) • a ≥ m + Υ(h, S, Γ) + q • ≥ k + Υ(h0, S0, Γ0) + (m–k+q) • induction hypothesis on • a, S0, h0 ├ ef ⇝ v, h’, b with • b ≥ k’ + Υ(h’, v, C) + (m–k+q) • = q + Υ(h’, v, C) + (m–k+k’) • ≥ m’ + Υ(h’, v, C) + q

The foundational theorem (5) • Last step is dmatch:
• Typing rule:
  Γ, m ├ e1 : (C, m’)     Γ, xh:A, xt:L(A, k), m + SIZE( A ⊗ L(A, k) ) + k ├ e2 : (C, m’)
  ------------------------------------------------------------------
  Γ, x:L(A, k), m ├ dmatch x with |nil → e1 |cons(h,t) → e2 : (C, m’)
• Evaluation rule:
  S(x)=loc     h(loc)=(vh,vt)     m0 = m + SIZE( h(loc) )     m0, S[h←vh, t←vt], h \ {loc} ├ e2 ⇝ v, h’, m’
  ------------------------------------------------------------------
  m, S, h ├ dmatch x with |nil → e1 |cons(h,t) → e2 ⇝ v, h’, m’
• Γ0 = Γ \ {x:L(A,k)} ∪ {h:A, t:L(A,k)} • S0 = S[h←vh, t←vt] • h0 = h \ {loc} • Υ(h, S, Γ) • = Υ(h, {loc}, {L(A,k)}) + Υ(h0, S \ {x}, Γ \ {x:L(A,k)}) • = k + Υ(h, (vh,vt), A⊗L(A,k)) + Υ(h0, S \ {x}, Γ \ {x:L(A,k)}) • = k + Υ(h0, S0, Γ0) • hence Υ(h, S, Γ) = k + Υ(h0, S0, Γ0) • a ≥ m + Υ(h, S, Γ) + q • ≥ m + SIZE(A⊗L(A,k)) + k + Υ(h0, S0, Γ0) + (q–SIZE(h(loc))) • induction hypothesis on • a, S0, h0 ├ e2 ⇝ v, h’, b with • b ≥ m’ + Υ(h’, v, C) + (q–SIZE(h(loc)))

Inferring annotations (1) • Find a valid (integral) assignment for all a, b… in type derivations • Associate P with an LP • { ai,1xi,1 + … + ai,nxi,n ≤ bi } • Objective Function Ψ = c1x1 + … + cnxn • Variables are the free heap space variables in type derivations • All variables need to be non-negative integers • Constraints are the inequalities in the side conditions of type derivations • The Objective Function is simply Ψ = x1 + … + xn (minimize overall space requirements, modulo the waste rule) • We need integral optimal solutions. NP-Hard!
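To see the shape of the optimisation problem on this slide, the sketch below enumerates small non-negative integer assignments and keeps the one with the smallest sum. This brute-force search is a stand-in chosen for illustration, not the LP technique used in the referenced work; the function name `solve_small`, the search bound and the two sample constraints are assumptions.

```python
from itertools import product


def solve_small(A, b, bound):
    """Minimise x1 + ... + xn subject to A x <= b, with each xi in 0..bound."""
    best = None
    for x in product(range(bound + 1), repeat=len(A[0])):
        feasible = all(
            sum(a * xi for a, xi in zip(row, x)) <= bi
            for row, bi in zip(A, b)
        )
        if feasible and (best is None or sum(x) < sum(best)):
            best = x
    return best


# Hypothetical side conditions: x1 >= x2 + 2 and x2 >= 1,
# rewritten in the form a1*x1 + a2*x2 <= b.
A = [[-1, 1],
     [0, -1]]
b = [-2, -1]
print(solve_small(A, b, bound=5))   # (3, 1)
```

The enumeration is exponential in the number of variables, which is why the next slides restrict the constraint shapes so that an ordinary LP solver already yields integral solutions.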

Inferring annotations (2) • Imposing further constraints: • Almost positive constraints: • all variables for first-order types: (1, 0) | (Bool, 0) | (T⊗T, 0) | (T+T, 0) | (L(T), 0) • all variables for the right-hand side in first-order types (T, …, T, k) → (T, 0) • All linear constraints become: • { xi,0 ≥ ai,1xi,1 + … + ai,nxi,n + bi } • The optimal solution is necessarily integral: • Proof by contradiction: if a non-integral xi,0 ∈ Q+ is optimal, then xi,0 ≥ ai,1xi,1+…+ai,nxi,n+bi. But then ⌊xi,0⌋ ≥ ai,1xi,1+…+ai,nxi,n+bi, since the right-hand side is integral. Therefore ⌊xi,0⌋ would be a better solution than the optimal one, xi,0

Inferring annotations (3) • Imposing further constraints: • Almost conical constraints: • renaming variables: the only place where non-null constants are introduced is when we consider SIZE(A⊗L(A,k)) + k • All linear constraints become: • { ai,1xi,1 + … + ai,nxi,n ≤ 0 } or { xi,j ≥ bi,j } • An integral solution can be found from the rational one by multiplying by the least common denominator (LCD)
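The scaling step mentioned on this slide can be sketched with Python's `fractions` module. The helper name `to_integral` and the sample constraints are assumptions for illustration. Scaling by a positive integer keeps the homogeneous constraints satisfied, and it keeps lower bounds x ≥ b satisfied as long as b is non-negative; the scaled point is feasible but need not be the cheapest integral one.

```python
from fractions import Fraction
from math import lcm


def to_integral(solution):
    """Scale a rational solution by the least common multiple of its denominators."""
    scale = lcm(*(x.denominator for x in solution))
    return [int(x * scale) for x in solution]


# Hypothetical constraints: 2*x1 - 3*x2 <= 0 (conical) and x2 >= 1 (lower bound).
rational = [Fraction(3, 2), Fraction(1)]
integral = to_integral(rational)
print(integral)                                  # [3, 2]
print(2 * integral[0] - 3 * integral[1] <= 0)    # True
print(integral[1] >= 1)                          # True
```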

Camelot • First-Order functional language • Polymorphism • Elementary match construct • Explicit resource usage: • heap cells are visible at the language level: match (l) with |Nil → … |Cons(h,t)@d → … • free(@d) • Null constructor for types: Nil or !Nil • In-place modification

Grail & JVM • Simpler functional language • No inheritance • Simplicity ⇒ easy verifiability even on mobile devices with constrained space and time resources • Compilation of Camelot implies • all user-defined data types are represented through a single class, the union of all features (and space requirements): the diamond class • monomorphisation • normalisation of expressions and match statements

Overcoming linearity • Linearity in the Type System can be a rather restrictive policy • Approach based on layered sharing • Layer 1: modifying usage • Layer 2: read-only, shared with result • Layer 3: read-only, not shared • Variables get decorated with the corresponding usage layer • We allow duplication subject to several constraints • Example: let x=e1 in e2 • Konečný’s system for layered sharing • Splitting contexts: • Δ1 for FreeVar(e1)∩FreeVar(e2) used in e1 • Δ2 for FreeVar(e1)∩FreeVar(e2) used in e2 • Γ1 for FreeVar(e1) \ Δ1 • Γ2 for FreeVar(e2) \ Δ2
• Formulating constraints:
  Γ1, Δ1 ├ e1 : A(i)     Γ2, Δ2, xi : A ├ e2 : B
  ------------------------------------------------------------------
  Γ1^(2→i), Γ2, Δ1^(2→i) ∧ Δ2 ├ let x=e1 in e2 : B
