Merging Equivalent Contexts for Scalable Heap-cloning-based Points-to Analysis

Merging Equivalent Contexts for Scalable Heap-cloning-based Points-to Analysis Guoqing Xu and Atanas Rountev Ohio State University Supported by NSF under CAREER grant CCF-0546040

Precise and Scalable Points-to Analysis Analysis precision Context sensitivity – e.g. chain of call sites Heap cloning [Nystrom-PASTE’04, Lhotak-CC’06] The most precise analysis: refinement-based analysis [Sridharan-PLDI’06] Analysis scalability Millions of distinct call chains in a moderate-size Java program Sacrifice precision: k-length chain Merging equivalent relationships using BDDs BDDs incurs running time overhead, and may not scale for heap-cloning-based analysis 2

Merge Equivalent Contexts • Equivalence classes exists in the representation of calling contexts [Lhotak-CC’06] • Merging such contexts will not affect precision • Can we find and merge equivalent calling contexts? • We would be able to scale the points-to analysis without relying on the merging inside the BDD “black box” • A unique replacement context (URC) can replace all contexts from the same equivalence class

Outline • A model of equivalent contexts • Abstraction functions for pointer variables and targets • Proposed for pointer analysis, but can be applied to other context-sensitive analysis algorithms • A whole-program points-to analysis for Java • Implements the model • Context-sensitive for both pointer variables and targets • Does not limit the length of context strings (not k-CFA) • Bottom-up, summary-based • Experimental evaluation • Much more precise and efficient than state-of-the-art 1-object-sensitive analysis with BDDs [Lhotak-CC’06] • More efficient than the refinement-based analysis

Motivating Example void main(String[] args){ A a1 = new A(); A a2 = new A(); foo(a1); //call site 1 foo(a2); //call site 2 } void foo(A a){ t = new B(); bar(t); //call site 3 } void bar(B b){ p = b; } t(1)  new B(1) t(2)  new B(2) p(1,3)  new B(1) p(2,3)  new B(2) Observation: 1. t points to new B, under all calling contexts 2. p points to new B, under calling contexts (*,3)

A Better Representation? • Can we represent the points-to relationships like this? • t new B • p3new B • 1 copy of t , p, and new B • Key insights • Context-sensitivity corresponds to inlining; full context-sensitivity is achieved if all reachable methods are inlined in main • If a points-to relationship can be determined at method mduring inlining, it will not be affected by m’s callers • This is the conceptual source of context equivalence and merging

Hypothetical Inlining-based Analysis • (v, cv, o, co) represents a fully context-sensitive points-to relationship; v is local var; o is alloc site • Conceptual inlining • At call graph edge e, statement p := q from the callee is cloned as p(e) := q(e) in the caller • One caller up, the clone is p(e2, e) := q(e2, e) and so on … • Bottom-up analysis • After all call sites in a method m are inlined, an intraprocedural analysis is performed for m • Suppose (v, cv, o, co) is produced for m • For a call edge e from n to m: (v, (e) cv, o, (e) co) will definitely be produced later for n

Calling Context Reduction • Consider a tuple (v, cv, o, co) computed for main • Its lifetime consists of • A single creation event in some method m • This method is theflowing point for the tuple • A sequence of inlining steps that increase both cv and co in synch • URCs computed by abstraction functions for calling context; m is the flowing point • cv is mapped to a suffix (e0, e1, …, ei) where e0.src = m • co is mapped to a suffix (f0, f1, …, fj) where f0.src = m • Keep only the relevant suffix of the call chains

Using the Reduced Contexts • A URC is used to represent a set of calling contexts in the points-to relationships • A query (v, c)can be answered as follows: • Find all (v, urcv, o, urco) such that suffix(urcv, c) holds • Return all (o, co) such that suffix(urco, co) holds • Generalization for recursion – see the paper • A BDD may not be as effective for an analysis with heap cloning • Without heap cloning, an equivalence class is defined by a single string urcv • For an analysis with heap cloning, an equivalence class is defined by a pair (urcv , urco) • More classes = fewer opportunities for merging

Points-to Analysis • A specific algorithm that implements this model • Using URCs to represent calling contexts • The use of URCs could be applicable to other categories of points-to analysis • Resembles bottom-up inlining • Heap-cloning-based • Context-sensitively treat both pointer variables and targets • Partial unification (bi-directional flow of values)

Intraprocedural Analysis • Symbolic points-to graph (SPG) • A symbolic object node is introduced for each (1) formal parameter, (2) base variable v of load a = v.f, and (3) lhs vof a call site v = a.b(…) • Standard points-to analysis algorithm [Lhotak-CC’03] is used for SPG construction • SPG contains much fewer nodes and edges than the original program • Example void add (Integer t) { this.names = t; }

Escape Analysis • An allocation node new C or symbolic node SO directly escapes a method, if • It is pointed to by a formal parameter • It is pointed to by a returned variable • It is pointed to by a static field • A node indirectly escapes a method, if • It is reachable from nodes that directly escape • Compute a set of allocation/symbolic nodes that escape the method where they are defined

Interprocedural Analysis • Summary-based • Bottom-up traverse the call graph SCC-DAG • Summary function definition: set of [Of, Gf] • Of: … • Gf : the subgraph of all escaping objects (reachable from Of) and their points-to edges • Clone a summary function for each incoming call graph edge e • If o1c1 o2c2where o1ando2escape: in the caller, create o1(e) c1  o2(e) c2 • If v c1 s c2where s is an escaping symbolic node: create v(e) c1 s(e) c2 f f

Interprocedural Analysis • Composition of summary functions • For each [Oactual , Gactual ] • Find [Oformal , Gformal ] • Merge Gactual and Gformal • Subgraph merging • Simultaneously traverse Gactual and Gformal from Oactual and O formal • Merge b and c, if • where d and e have been merged

Merging Two Nodes

Interprocedural Analysis • The points-to solution is built on the fly • Once v c1 o c2 is formed, (v, c1, o, c2) is added to the points-to solution • The edge is removed from the SPG: we have found the flowing point and no further propagation is necessary • (c1, c2) defines an equivalence class for contexts • Node merging is essentially a dynamic transitive closure computation for identifying memory aliases

K-Last-Substring-Merging • A scalability-precision tradeoff • When composing summary functions, do not clone node oc1in the callee if there already exists a node oc2in the caller such that suffix(c1, k) == suffix(c2, k) • Does not limit the context length • Example • k =3 • O(d,e,f,g)and O(h,e,f,g) are distinct nodes if d.src != h.src • O(d,e,f,g) and O(h,e,f,g) are merged if d.src == h.src

Experiments • Benchmark set contains 19 Java programs, from SPECJVM, Ashes, and DaCapo • Experimentally compared our analysis with • Refinement: the refinement-based analysis from [Sridharan-PLDI’06] with its default budget for refinement • the most precise publicly-available analysis for computing an on-demand solution • queried the points-to sets for all possible (v, c) • 1H: 1-object-sensitive analysis with heap cloning, using BDDs [Lhotak-CC’06]; the most precise publicly-available analysis for computing a whole-program solution • Our analysis: computes a whole-program solution • 1Equiv: 1-last-substring-merging • 2Equiv: 2-last-substring-merging

Precision: #downcasts proven to be safe 2Equiv proves 59% more safe downcasts than 1H But less than Refinement

Cost: time to get a whole-program solution 2Equiv is 13 times faster than 1H 5.3 times faster than Refinement

Conclusions Context equivalence class identification Only URCs need to be explicitly represented in the data structures of the analysis A points-to analysis for Java Heap cloning, summary-based, bottom-up, with partial unification Experimental evaluation Precision approaches that of the refinement-based analysis, and is much higher than that of 1H Significantly faster than both refinement-based and 1H

Approximation in the Presence of Recursion • Two-phase approximation: (1) map an infinite call chain to a finite one, collapsing cycles while going backwards; (2) then, use the abstraction function shown earlier • Precision loss may result from phase 1: e.g., p points to o under context eabcdabf, but does not under context eabf

Merging Equivalent Contexts for Scalable Heap-cloning-based Points-to Analysis