Efficient Inference & Specialization for Array Bound Checks Elimination

A Practical and Precise Inference and Specializer for Array Bound Checks Elimination
Dana N. Xu (Univ of Cambridge), Corneliu Popeea (Natl Univ of Singapore), Wei-Ngan Chin (Natl Univ of Singapore)
PEPM 2008 - 8 January

Array Bound Check Elimination
• Problem:
  • without array bound checks (e.g. C), programs may be unsafe.
  • with array bound checks (e.g. Java), program execution is slowed down.
• Solution: eliminate redundant checks.
[Diagram: input program → Inference → method summaries → Specialization → optimized program]

Inference
Goal: derive preconditions that make checks redundant.

    float foo (float a[], int j, int n) {
      float v=0;
      int i = j+1;
      if (0<i<=n) then v=a[i] else ();     // L1
      int m = abs(random());
      v + a[m];                            // L2
    }

• L1 checks: i≥0 (SAFE), i<len(a) (PRECONDITION).
  Symb. program state: i=j+1 ∧ 0<i≤n
• L2 checks: m≥0 (SAFE), m<len(a) (UNSAFE).
  Symb. program state: i=j+1 ∧ m≥0

Our contributions:
• modular inference of preconditions.
• handling indirection arrays.

Specialization
Goal: eliminate runtime checks guided by inference results.
• If we assume all callers satisfy (j+1 < len(a)), the checks at L1 are removed and only the upper-bound check on a[m] remains:

    float foo (float a[], int j, int n) {
      float v = 0;
      int i = j+1;
      if (0<i<=n) then v=a[i] else ();     // L1
      int m = abs(random());
      v + (if (m<len(a)) then a[m] else error);
    }

Our contribution: integrate modular inference with specializer.
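The effect of this specialization can be sketched in Python. This is only an illustration, not the authors' specializer: the names `foo_checked`, `foo_specialized` and `BoundsError` are hypothetical, and `m` is passed as a parameter (assumed non-negative, as `abs(random())` guarantees on the slide) so that the behaviour is deterministic.

```python
class BoundsError(Exception):
    """Raised by an explicit runtime bound check (the slide's `error`)."""


def foo_checked(a, j, n, m):
    """Unspecialized foo: both accesses keep their low and high checks."""
    v = 0.0
    i = j + 1
    if 0 < i <= n:
        if not (0 <= i < len(a)):      # L1.low and L1.high
            raise BoundsError("L1")
        v = a[i]
    if not (0 <= m < len(a)):          # L2.low and L2.high
        raise BoundsError("L2")
    return v + a[m]


def foo_specialized(a, j, n, m):
    """foo specialized for callers that satisfy j + 1 < len(a).

    Assumption (hypothetical harness): m >= 0, as abs(random()) ensures.
    """
    v = 0.0
    i = j + 1
    if 0 < i <= n:
        v = a[i]                       # L1: both checks eliminated
    if not (m < len(a)):               # L2.high is the only check left
        raise BoundsError("L2")
    return v + a[m]
```

Whenever a caller satisfies j + 1 < len(a) and m ≥ 0, the two versions return the same value or raise the same error. Python wraps negative indices instead of failing, which is why the sketch relies on m ≥ 0 rather than on the language's own indexing behaviour.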

Overview • Introduction • Our approach • Modular inference: postcondition + preconditions. • Flexi-variant specialization. • Experimental results. • Conclusion.

Setting
• First-order imperative language:

    meth ::= t mn ( ([ref] t v)* ) { e }        - method
    t    ::= int | float | t[int, .. , int]     - type
    e    ::= k | v | if v then e1 else e2       - expression
           | v=e | t v=e1;e2 | mn(v*)

• Invariants expressed as linear formulae:

    Q ::= { q(v*) = φ }                         - recursive formula
    φ ::= φ1∧φ2 | φ1∨φ2 | q(v*) | s             - formula
    s ::= a1v1 + .. + anvn ≤ a                  - linear inequality

Forward Derivation
• Compute sps (symb. program state) at each program point (L0, L1, L2).
• To support modularity, symbolic transitions relate initial values (j,n) and latest values (j',n'):

    float foo (float a[], int j, int n) {
      float v=0;
      int i = j+1;
      if (0<i<=n) then v=a[i] else ();     // L1
      int m = abs(random());
      v + a[m];                            // L2
    }

    sps(L0) = len(a)>0 ∧ j'=j ∧ n'=n
    sps(L1) = sps(L0) ∧ i'=j'+1 ∧ 0<i'≤n'
    sps(L2) = sps(L0) ∧ i'=j'+1 ∧ m'≥0

Forward Derivation for Recursion • Each method is first translated to a recursive constraint. • Compute an over-approximation of the least fixed point of this recursive constraint: • precise disjunctive polyhedron abstract domain. • with hulling and widening operators. • Details and examples in the paper.
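The role of hulling and widening can be illustrated with a much simpler domain than the one named on the slide. The sketch below uses plain integer intervals, not disjunctive polyhedra, and a hypothetical loop `i = 0; while (i < 10) i = i + 1`; the helper names (`join`, `widen`, `step`, `lfp_with_widening`) are illustrative assumptions, not part of the described tool.

```python
import math


def join(a, b):
    """Interval hull: the smallest interval containing both arguments."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def widen(old, new):
    """Standard interval widening: any bound that grows jumps to infinity."""
    lo = old[0] if new[0] >= old[0] else -math.inf
    hi = old[1] if new[1] <= old[1] else math.inf
    return (lo, hi)


def step(x):
    """One unfolding of the constraint X = {0} U { i+1 | i in X, i < 10 }."""
    lo, hi = x
    guarded_hi = min(hi, 9)            # apply the loop guard i < 10
    if lo > guarded_hi:
        return (0, 0)                  # the loop body is unreachable
    return join((0, 0), (lo + 1, guarded_hi + 1))


def lfp_with_widening(f, start):
    """Iterate f with hull and widening until the interval stabilizes."""
    x = start
    while True:
        nxt = widen(x, join(x, f(x)))
        if nxt == x:
            return x
        x = nxt


head = lfp_with_widening(step, (0, 0))   # over-approximation at the loop head
refined = step(head)                     # one more application tightens it
```

Widening forces termination at the price of precision: the iteration stabilizes at an interval with an infinite upper bound, and one further application of `step` recovers the bound implied by the guard.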

Indirection Arrays
• Hold indexes for accessing another array.
• Used intensively for sparse matrix operations.
• Need to capture universal properties about elements inside array:
    ∀ i ∈ indexes(a) · 0 ≤ a[i] ≤ 10
  represented as:
    0 ≤ a_elem ≤ 10

Indirection Arrays
• Given method:

    void initArr(int a[], int i, int j, int n) {
      if (i>j) then ()
      else { a[i]=n; initArr(a,i+1,j,n+1) }
    }

• Compute postcondition:

    (i>j ∧ a_elem'=a_elem)
    ∨ (0≤i≤j<len(a) ∧ (a_elem'=a_elem ∨ n≤a_elem'≤n+j-i))
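As a sanity check, the postcondition can be tested against a direct transcription of initArr. This is a sketch under one assumption: a_elem is read as ranging over every element of the array, so the formula is checked element by element. The names `init_arr` and `post_holds` are hypothetical.

```python
def init_arr(a, i, j, n):
    """Python transcription of initArr; mutates `a` in place."""
    if i > j:
        return
    a[i] = n
    init_arr(a, i + 1, j, n + 1)


def post_holds(old, new, i, j, n):
    """Check the slide's postcondition for every element (assumed reading)."""
    if i > j:
        # First disjunct: nothing was written.
        return all(e2 == e1 for e1, e2 in zip(old, new))
    if not (0 <= i <= j < len(old)):
        return False
    # Second disjunct: each element is unchanged or lies in [n, n + j - i].
    return all(e2 == e1 or n <= e2 <= n + j - i for e1, e2 in zip(old, new))


old = [7] * 6
new = list(old)
init_arr(new, 1, 4, 10)      # writes 10, 11, 12, 13 at indexes 1..4
```

The call is only meaningful when 0 ≤ i holds, since Python would silently wrap a negative index rather than report an out-of-bounds access.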

Inference of Preconditions
• Classify checks with pre = ∀L · (sps ⇒ chk):
  • pre is valid: safe check.
  • pre is unsatisfiable: unsafe check.
  • otherwise: propagate pre as a check for the caller.

Example: Preconditions
• Derive weakest precondition for each check:

    float foo (float a[], int j, int n) {
      float v = 0;
      int i = j+1;
      if (0<i<=n) then v=a[i] else ();     // L1
      int m = abs(random());
      v + a[m];                            // L2
    }

    sps(L1) = len(a)>0 ∧ i'=j'+1 ∧ 0<i'≤n' ∧ j'=j ∧ n'=n
    pre(L1.low)  = ∀{i',j',n'} · (sps(L1) ⇒ i'≥0) = true
    pre(L1.high) = ∀{i',j',n'} · (sps(L1) ⇒ i'<len(a))
                 = (j<len(a)-1) ∨ (n≤j ∧ j≥len(a)-1)

    sps(L2) = len(a)>0 ∧ i'=j'+1 ∧ m'≥0 ∧ j'=j ∧ n'=n
    pre(L2.high) = ∀{i',j',n',m'} · (sps(L2) ⇒ m'<len(a)) = false
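The three results above can be reproduced by brute force on a small box of integers. This is only a sketch: the real tool eliminates the quantifier symbolically with a Presburger solver, whereas the code below enumerates values. The ranges `PARAMS`, `LENS`, `LOCALS` and all function names are assumptions chosen so that the enumeration stays small while still containing the relevant witnesses.

```python
from itertools import product

PARAMS = range(-2, 3)     # values tried for the initial j and n
LENS = range(1, 4)        # values tried for len(a), which is always > 0
LOCALS = range(-3, 5)     # values tried for the primed (latest) variables


def sps_L1(ln, j, n, i1, j1, n1):
    return ln > 0 and i1 == j1 + 1 and 0 < i1 <= n1 and j1 == j and n1 == n


def sps_L2(ln, j, n, i1, j1, n1, m1):
    return ln > 0 and i1 == j1 + 1 and m1 >= 0 and j1 == j and n1 == n


def pre(sps, chk, n_locals, ln, j, n):
    """pre = forall locals . (sps => chk), decided by enumeration."""
    return all((not sps(ln, j, n, *loc)) or chk(ln, *loc)
               for loc in product(LOCALS, repeat=n_locals))


def classify(sps, chk, n_locals):
    """Valid pre: safe. Unsatisfiable pre: unsafe. Otherwise: precondition."""
    vals = [pre(sps, chk, n_locals, ln, j, n)
            for ln, j, n in product(LENS, PARAMS, PARAMS)]
    if all(vals):
        return "safe"
    if not any(vals):
        return "unsafe"
    return "precondition"


def l1_low(ln, i1, j1, n1):
    return i1 >= 0


def l1_high(ln, i1, j1, n1):
    return i1 < ln


def l2_high(ln, i1, j1, n1, m1):
    return m1 < ln


def pre_l1_high_formula(ln, j, n):
    """Closed form from the slide: (j<len(a)-1) or (n<=j and j>=len(a)-1)."""
    return j < ln - 1 or (n <= j and j >= ln - 1)
```

Calling `classify(sps_L1, l1_low, 3)`, `classify(sps_L1, l1_high, 3)` and `classify(sps_L2, l2_high, 4)` yields the three categories used on the slides, and `pre_l1_high_formula` agrees with the enumerated `pre` on every point of the box.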

Efficient Preconditions
• Problem: negation of sps results in large preconditions.
  • naïve pre-derivation (too large):
      (len(a)≤0) ∨ (j<len(a)-1 ∧ 1≤len(a)) ∨ (n≤j ∧ 1≤len(a)≤j+1)
• Simplify preconditions via strengthening:
  • weak pre-derivation drops disjuncts that violate type-invariants (no loss in precision):
      (j<len(a)-1) ∨ (n≤j ∧ len(a)≤j+1)
  • strong pre-derivation drops disjuncts that allow the avoidance of the check (less precise, but more efficient):
      (j<len(a)-1)
  • selective pre-derivation between weak and strong.
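The relationship between the three formulas can be checked by enumeration. The sketch below assumes the type invariant is len(a) > 0, as in sps(L0); the function names and the search ranges are hypothetical.

```python
from itertools import product


def naive(ln, j, n):
    return ln <= 0 or (j < ln - 1 and 1 <= ln) or (n <= j and 1 <= ln <= j + 1)


def weak(ln, j, n):
    return j < ln - 1 or (n <= j and ln <= j + 1)


def strong(ln, j, n):
    return j < ln - 1


# Every point satisfies the assumed type invariant len(a) > 0.
BOX = list(product(range(1, 6), range(-5, 6), range(-5, 6)))

# Weak loses no precision: it agrees with naive wherever len(a) > 0.
weak_is_exact = all(naive(*p) == weak(*p) for p in BOX)

# Strong is a strengthening: it implies weak ...
strong_implies_weak = all(weak(*p) for p in BOX if strong(*p))

# ... but not the other way round (the dropped disjunct skips the check).
lost_cases = [p for p in BOX if weak(*p) and not strong(*p)]
```

The points in `lost_cases` are exactly the call contexts in which the access at L1 is never reached (n ≤ j), so the strong precondition rejects them even though the check could not fail there.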

Inference Result: Method Summary

    float foo (float a[], int j, int n) {
      float v = 0;
      int i = j+1;
      if (0<i<=n) then v=a[i] else ();     // L1
      int m = abs(random());
      v + a[m];                            // L2
    }

• Postcondition: (j<len(a)-1 ∨ (j≥len(a)-1 ∧ n≤j)) ∧ j'=j ∧ n'=n
• Preconditions: { L1.high: (j<len(a)-1) }
• Unsafe-checks: { L2.high }

Overview • Introduction • Our approach • Modular inference: postcondition + preconditions. • Flexi-variant specialization. • Experimental results. • Conclusion.

Specialization
• If we assume all contexts satisfy (j+1 < len(a)):

    float foo (float a[], int j, int n) {
      float v = 0;
      int i = j+1;
      if (0<i<=n) then v=a[i] else ();     // L1
      int m = abs(random());
      v + (if (m<len(a)) then a[m] else error);
    }

• If we assume all contexts do not satisfy (j+1 < len(a)): specialize foo with 2 runtime checks.
• Otherwise … ?

Specialization
• Monovariant specializer
  • One specialized code for each method.
  • Gives the lower bound on optimization: the single copy must be valid for every call context.
  • Compact code size.
• Polyvariant specializer
  • Multiple optimized codes per method.
  • Each call site is replaced by a specialized call.
  • Highly optimized but may have code blow-up.

Flexivariant Specialization
• Allows trade-off between optimization and code size.
• Decides how many copies to generate per method, based on frequency and size constraint.
  • Less optimization, 1 copy: foo (2 runtime checks).
  • More optimization, 2 copies: foo1 (1 runtime check) + foo2 (2 runtime checks).
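The slides do not give the decision procedure, so the following is only a hypothetical greedy heuristic showing how frequency and a size constraint could drive the number of copies. The `Variant` record, its fields and `choose_copies` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    checks: int   # runtime checks left in this copy
    size: int     # code size estimate (arbitrary units, assumed)
    calls: int    # estimated calls whose context allows this copy (assumed)


def choose_copies(variants, size_budget):
    """Keep the most-checked copy as a fallback, then add others greedily.

    Hypothetical heuristic: rank the remaining copies by
    calls * (checks saved relative to the fallback) and add each one
    that still fits in the size budget.
    """
    fallback = max(variants, key=lambda v: v.checks)
    chosen, used = [fallback], fallback.size
    rest = sorted((v for v in variants if v is not fallback),
                  key=lambda v: v.calls * (fallback.checks - v.checks),
                  reverse=True)
    for v in rest:
        if used + v.size <= size_budget:
            chosen.append(v)
            used += v.size
    return [v.name for v in chosen]


foo1 = Variant("foo1", checks=1, size=10, calls=50)
foo2 = Variant("foo2", checks=2, size=10, calls=5)
```

With a budget that fits one copy, `choose_copies([foo1, foo2], 10)` keeps only the fully checked `foo2`; with a budget of 20 it also emits `foo1`, matching the two configurations listed on the slide.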

Soundness
• Inference + Specialization = Well-typed program
Theorem: Given a program P, let PI be the result of the inference judgment on P, and let PT be the flexivariant specialization of PI. Then, if PT is well-typed, its execution will never proceed to invalid array accesses.

Implementation
• Prototype written in Haskell:
  • uses an efficient Presburger solver [W. Pugh et al].
  • uses a disjunctive fixed-point analyzer [Popeea and Chin].
• Test programs:
  • small programs: binary search, merge sort, quick sort.
  • numerical benchmarks: Fast Fourier Transform, LU decomposition, Linpack.

Experimental Results

Precondition Strengthening
• Weak pre-derivation may generate preconditions that are too large to be manipulated (* signifies a timing over an hour).
• Strong pre-derivation keeps preconditions small (simplifies 81% from weak-pre).
• Selective pre-derivation: both efficient and precise (simplifies 63.4% from weak-pre).

Conclusion • Modular summary-based analysis: • Disjunctive postcondition inference • Derivation of efficient, scalable preconditions. • Integration with a flexi-variant specializer. • Implementation of a prototype system. • Correctness proof.

A Practical and Precise Inference and Specializer for Array Bound Checks Elimination Corneliu Popeea, Dana N. Xu, Wei-Ngan Chin We thank Siau-Cheng Khoo for sound and insightful suggestions. Thanks to anonymous referees for comments.

Related Work
• Global analyses:
  • Techniques: Suzuki and Ishihata [POPL'77], Cousot and Halbwachs [POPL'78]
  • Tools: Astrée [PLDI'03], C Global Surveyor [PLDI'04]
• Modular analyses:
  • Cousot and Cousot [IFIP'77, CC'02]
  • Chatterjee, Ryder and Landi [POPL'99]
  • Moy [VMCAI'08]
• Dependent type checking:
  • Xi and Pfenning [PLDI'98]

Limitations: • Large formulae: currently under-approx. formulae are propagated. Over-approx. formulae are more compact, since sps appears in a positive position. • Future work: • Dual analysis to validate some alarms as true bugs. • Extend the analysis with sound treatment of reference types. • Handle more (existential) properties about array elements.

Two Kinds of Recursive Invariants • For loops: • compute a loop invariant. • For methods with general recursion: • compute a loop invariant. • the method postcondition cannot be determined directly from the loop invariant: a separate fixed-point is computed.

VCgen: Verification Condition Generator
• Backward VCgen:
  • given: {P} assert chk {Q}
  • derives: P = (Q ∧ chk)
• Our precondition derivation:
  • given: {pre} …; assert chk {sps}
  • derives: pre = (sps ⇒ chk)
• Differences:
  • sps is a transition relation: pre holds at the beginning of the current method.
  • sps is computed by a separate forward derivation.
