
Context-sensitive points-to analysis: is it worth it?

Article by Ondřej Lhoták & Laurie Hendren from McGill University. Presentation by Roza Pogalnikova.


Presentation Transcript


  1. Context-sensitive points-to analysis: is it worth it? • Article by Ondřej Lhoták & Laurie Hendren from McGill University • Presentation by Roza Pogalnikova

  2. Abstract • Evaluate the precision of subset-based points-to analysis • Compare different context-sensitivity approaches: • call-site strings • object sensitivity • the algorithm by Zhu and Calman, and by Whaley and Lam (ZCWL)

  3. Subset-based PTA • Find the allocation sites that reach a variable: • S: a = new A() // allocation statement • for a variable x somewhere in the program: can x point to the object allocated at S?

  4. Context Sensitivity • Call site: the context is the program statement of the method invocation • Object sensitivity: the context is the receiver object of the method invocation • ZCWL: k-CFA, where k is the call graph depth without SCCs; it runs a context-insensitive algorithm on a cloned, context-sensitive call graph • Example invocation (call site S, receiver this): S: this->call_method()

  5. Parameters • What to specialize by context: • only pointer variables • pointer variables and the heap abstraction as well • Different lengths of context strings

  6. Measurements • Measures that guide implementation: • number of contexts • number of distinct contexts • number of distinct points-to sets • Measures that evaluate precision: • size of the call graph (methods/edges) • devirtualizable call sites • casts statically provable to be safe

  7. Results • Object sensitivity is the most precise and the most scalable • A context-sensitive heap abstraction improves the precision of the analysis • Precision drops when call graph cycles are analyzed without context sensitivity

  8. What • Compare three kinds of context-sensitive points-to analysis: • call sites as context abstraction • object-sensitive analysis • ZCWL algorithm

  9. How • Implemented with the Jedd system: • a language extension of Java • abstracts working with Binary Decision Diagrams (BDDs) • Analyses in the Soot framework written in Jedd: • points-to analysis • call graph construction • side-effect analysis in BDDs • virtual call resolution

  10. BDDs • Binary decision tree and truth table for the function f(x1, x2, x3) = ¬x1·¬x2·¬x3 + x1·x2 + x2·x3 • BDD for the function f • credit: http://en.wikipedia.org/wiki/Binary_decision_diagram
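The function on this slide can be tabulated and then reduced in a few lines. The following is only a minimal sketch, assuming the variable order x1, x2, x3; the helper names `mk` and `unique` are illustrative and not taken from the presentation. It builds a reduced ordered BDD by Shannon expansion, skipping a test when both branches agree and sharing identical nodes.

```python
from itertools import product


def f(x1: bool, x2: bool, x3: bool) -> bool:
    """f = (not x1)(not x2)(not x3) + x1 x2 + x2 x3, as on the slide."""
    return (not x1 and not x2 and not x3) or (x1 and x2) or (x2 and x3)


# Truth table: one row per assignment of (x1, x2, x3).
truth_table = {bits: int(f(*map(bool, bits))) for bits in product((0, 1), repeat=3)}

# Reduced ordered BDD: terminals are 0 and 1, internal nodes get ids from 2.
unique = {}  # (level, low child, high child) -> node id


def mk(level: int = 0, assignment: tuple = ()) -> int:
    if level == 3:
        return int(f(*assignment))
    low = mk(level + 1, assignment + (False,))
    high = mk(level + 1, assignment + (True,))
    if low == high:  # the test on this variable is redundant
        return low
    key = (level, low, high)
    return unique.setdefault(key, len(unique) + 2)  # share identical nodes


root = mk()
for bits, value in truth_table.items():
    print(bits, value)
print("internal BDD nodes:", len(unique))
```

The full decision tree for three variables has 7 internal nodes, while the reduced diagram built here needs fewer because equal subfunctions are stored once.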

  11. PTA using BDDs • Program: A: a = new O(); B: b = new O(); C: c = new O(); a = b; b = a; c = b • Points-to: (a, A), (b, B), (c, C), (a, B), (b, A), (c, A), (c, B)
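The points-to pairs for this program can be reproduced with a naive subset-based propagation. This is a hedged sketch using plain Python sets rather than BDDs; the function name `solve` and the tuple encoding of statements are assumptions made for illustration.

```python
def solve(allocs, assigns):
    """Subset-based propagation: for dst = src, pt(src) must be a subset of pt(dst)."""
    pts = {}
    for var, site in allocs:
        pts.setdefault(var, set()).add(site)
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for dst, src in assigns:
            before = len(pts.setdefault(dst, set()))
            pts[dst] |= pts.get(src, set())
            changed |= len(pts[dst]) != before
    return pts


# The program from the slide: allocation sites A, B, C and three copies.
allocs = [("a", "A"), ("b", "B"), ("c", "C")]
assigns = [("a", "b"), ("b", "a"), ("c", "b")]  # (destination, source)

points_to = solve(allocs, assigns)
pairs = sorted((v, o) for v, objs in points_to.items() for o in objs)
print(pairs)
```

The resulting pairs match the seven listed on the slide: a and b each point to A and B, and c points to A, B and C.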

  12. PTA using BDDs • Binary representation: • a and A as 00 • b and B as 01 • c and C as 10 • Points-to representation: (a, A) as 0000, (a, B) as 0001, (b, A) as 0100, (b, B) as 0101, (c, A) as 1000, (c, B) as 1001, (c, C) as 1010
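The encoding above is a concatenation of the variable bits and the object bits. A small sketch, with the dictionary and function names chosen here for illustration, shows how the relation becomes a set of 4-bit strings, that is, a Boolean function over four bits that a BDD can store.

```python
# Two bits per variable and two bits per allocation site, as on the slide.
VAR_BITS = {"a": "00", "b": "01", "c": "10"}
OBJ_BITS = {"A": "00", "B": "01", "C": "10"}


def encode(var: str, obj: str) -> str:
    """Encode the pair (var, obj) as variable bits followed by object bits."""
    return VAR_BITS[var] + OBJ_BITS[obj]


points_to = [("a", "A"), ("a", "B"), ("b", "A"), ("b", "B"),
             ("c", "A"), ("c", "B"), ("c", "C")]
encoded = sorted(encode(v, o) for v, o in points_to)


def in_relation(bits: str) -> bool:
    """Characteristic function of the relation: true exactly on the encoded pairs."""
    return bits in encoded


print(encoded)
```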

  13. PTA using BDDs • Compact way to represent points-to relations: * credit: [2] Points-to Analysis using BDDs

  14. Determine • How many contexts are generated? • How does the number of contexts relate to the precision of the analysis? • How likely is a scalable solution to be feasible?

  15. Background • O: pointer targets (objects) • P: pointers • I: method invocations • p may point to o: O(o) ∈ pt(P(p))

  16. Background • Oas: the program statement where the object was allocated • Pvar: a pointer that is a local variable • [O(o), f]: field f of object o • Pfs(o.f): a pointer that is field f of object o

  17. Background • Compare 2 families of invocation abstraction: • call site Ics(i) (the program statement of the method call) • receiver object Iro(i) = O(o) (the object on which the method was invoked)

  18. Background • String of contexts given a base abstraction Ibase: Istring(i) = [Ibase(i), Ibase(i2), Ibase(i3), ...] • ij is the j-th topmost invocation on the stack during i (i = i1) • Two approaches to make the string finite: • limit the length of the context string to k • ZCWL: exclude cycle edges from the call graph
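The k-limited context string can be sketched directly from this definition. The code below is an illustration under stated assumptions: the `Invocation` record, its field names, and the labels s1..s3 are hypothetical, and the stack is given explicitly with the current invocation first.

```python
from typing import Callable, NamedTuple, Sequence, Tuple


class Invocation(NamedTuple):
    call_site: str       # label of the invoking statement (hypothetical)
    receiver_site: str   # allocation site of the receiver object (hypothetical)


def call_site_base(inv: Invocation) -> str:
    """Base abstraction Ics: the call site."""
    return inv.call_site


def receiver_base(inv: Invocation) -> str:
    """Base abstraction Iro: the abstract receiver object."""
    return inv.receiver_site


def context_string(stack: Sequence[Invocation],
                   base: Callable[[Invocation], str],
                   k: int) -> Tuple[str, ...]:
    """Istring limited to length k; stack[0] is i = i1, stack[1] is i2, and so on."""
    return tuple(base(inv) for inv in stack[:k])


stack = [Invocation("s3", "A"), Invocation("s2", "A"), Invocation("s1", "B")]
cs_context = context_string(stack, call_site_base, 2)
ro_context = context_string(stack, receiver_base, 2)
print(cs_context, ro_context)
```

With k = 2 the third invocation on the stack is dropped, so any two stacks that agree on their two topmost entries share a context.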

  19. Background • Another choice: which pointers/objects to model context-sensitively? • Given a context-insensitive abstraction Pci and a context abstraction I, model a run-time pointer p: • context-sensitively by P(p) = [I(ip), Pci(p)] (ip is the method invocation containing p) • context-insensitively by P(p) = Pci(p)

  20. Background • Given the allocation site abstraction Oas and a context abstraction I, model an object o: • context-sensitively by O(o) = [I(io), Oas(o)] (io is the method invocation in which o was allocated) • context-insensitively by O(o) = Oas(o)
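The difference between the two heap abstractions shows up when one allocation site is reached in several contexts. A minimal sketch, assuming a single allocation site S executed under two contexts c1 and c2 (all names here are illustrative):

```python
def abstract_pointer(var: str, ctx: str, sensitive: bool = True):
    """P(p) = [I(ip), Pci(p)] when context-sensitive, otherwise Pci(p)."""
    return (ctx, var) if sensitive else var


def abstract_object(site: str, ctx: str, sensitive_heap: bool = False):
    """O(o) = [I(io), Oas(o)] when the heap is context-sensitive, otherwise Oas(o)."""
    return (ctx, site) if sensitive_heap else site


contexts = ["c1", "c2"]  # two invocations of the method containing site S

insensitive_heap = {abstract_object("S", c, sensitive_heap=False) for c in contexts}
sensitive_heap = {abstract_object("S", c, sensitive_heap=True) for c in contexts}
print(len(insensitive_heap), len(sensitive_heap))
```

Without heap context both run-time objects collapse into the single abstract object S; with heap context they stay distinct, which is where the extra precision and the extra cost come from.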

  21. Benchmarks • The study was performed on: • SPECjvm98 benchmark suite • DaCapo benchmark suite (ver. beta050224) • Ashes benchmark suite • Polyglot extensible Java front-end • Sun standard library 1.3.1_01

  22. Benchmarks

  23. Contexts Number • Context-sensitive analysis is considered intractable: • context is propagated from each call site to the called method • the number of context strings grows exponentially in the length of call chains
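The exponential growth is easy to see on a toy call graph. The sketch below is an assumption-laden illustration, not a benchmark from the study: a hypothetical chain m0..m10 in which every method calls the next one from two call sites, with call paths counted by a pass in topological order.

```python
def count_call_paths(edges, entry, order):
    """Count distinct call paths from entry to each method of an acyclic call graph.

    edges lists one (caller, callee) pair per call site; order is a topological order.
    """
    paths = {m: 0 for m in order}
    paths[entry] = 1
    for m in order:
        for caller, callee in edges:
            if caller == m:
                paths[callee] += paths[m]
    return paths


depth = 10
order = [f"m{i}" for i in range(depth + 1)]
# Two call sites in m_i, both invoking m_{i+1}.
edges = [(f"m{i}", f"m{i + 1}") for i in range(depth) for _ in range(2)]

paths = count_call_paths(edges, "m0", order)
print(paths["m10"])
```

Each extra level doubles the number of unbounded call-string contexts of the last method, so 10 levels already give 2^10 of them.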

  24. Contexts Number • Issues to clarify: • how many of these contexts improve the analysis results? • why can BDDs represent so many contexts, and is there hope of representing them with traditional techniques?

  25. Total contexts number • Count method-context pairs • Empty spots: the analysis did not complete in the available memory • The BDD library could allocate 41 million BDD nodes (~820 MB)

  26. Total contexts number

  27. Total contexts number • Explicit context representation does not scale well • The number of contexts grows slowly in the object-sensitive analysis (method invocations on the this pointer) • ZCWL: • k is the maximum call depth in the call graph after merging SCCs • large variations because k differs for each benchmark

  28. Equivalent contexts • Method-context pairs (m1, c1) and (m2, c2) are equivalent if: • m1 = m2 • ∀ local pointer p in the method, pt(P(p)) is the same for c1 and c2 • Equivalence classes reflect precision improvement due to context sensitivity
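Counting equivalence classes amounts to grouping method-context pairs by method and by the points-to sets of the method's local pointers. A minimal sketch, assuming the analysis results are available as a dictionary (the data layout and the sample values are hypothetical):

```python
def count_equivalence_classes(results):
    """results maps (method, context) to {local pointer: frozenset of abstract objects}."""
    classes = set()
    for (method, _context), pts in results.items():
        # Two pairs are equivalent if the method and all local points-to sets agree.
        classes.add((method, frozenset(pts.items())))
    return len(classes)


results = {
    ("id", "c1"): {"p": frozenset({"A"})},
    ("id", "c2"): {"p": frozenset({"A"})},   # same as c1: this context adds nothing
    ("id", "c3"): {"p": frozenset({"B"})},
}

total_contexts = len(results)
distinct_contexts = count_equivalence_classes(results)
print(total_contexts, distinct_contexts)
```

Here three method-context pairs fall into two classes, so one of the three contexts contributes no precision.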

  29. Equivalent contexts

  30. Equivalent contexts • BDDs “automatically” merge equal points-to relations, which is why they are effective • Object-sensitive analysis is more precise than call-site analysis • Context string length does not have a great impact • Surprisingly, ZCWL is less precise because it is context-insensitive within SCCs

  31. Distinct points-to sets • Measures analysis cost • Approximates the space requirements of a “traditional” representation, such as shared bit-vectors • Similar results for all context-sensitive variations • The number of distinct points-to sets increases with a context-sensitive heap abstraction

  32. Distinct points-to sets

  33. Call Graph • Compare the context-insensitive projections of the context-sensitive call graphs: • each node is a method (not a method-context pair) • reachable methods are preserved • ZCWL is excluded (its call graph is the same as the input context-insensitive graph)

  34. Reachable methods

  35. Reachable methods • Context sensitivity discovers more unreachable methods (bloat) • Context sensitivity for heap objects: • adds precision in the object-sensitive analysis (sablecc-j) • has no impact in the call-site analysis

  36. Call edges

  37. Call edges • Compare the size of the call graph in call edges • The same, with the exception of a large difference in sablecc-j (caused by a specific code pattern)

  38. Virtual call resolution • Count the virtual calls with more than one possible implementation • Object-sensitive analysis has a clear advantage over call-site analysis • Context for heap objects adds precision (sablecc-j)
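Whether a virtual call can be resolved depends only on the types in the receiver's points-to set. The following sketch uses a hypothetical class hierarchy and dispatch table (Shape, Circle, Square are invented for illustration) to show how a smaller points-to set turns a polymorphic call into a devirtualizable one.

```python
# Hypothetical dispatch table: (run-time type, method name) -> implementation.
DISPATCH = {
    ("Circle", "area"): "Circle.area",
    ("Square", "area"): "Square.area",
}


def call_targets(receiver_types, method):
    """Implementations a virtual call may reach, given the receiver's possible types."""
    return {DISPATCH[(t, method)] for t in receiver_types}


def is_devirtualizable(receiver_types, method):
    """The call can be bound statically if at most one implementation is possible."""
    return len(call_targets(receiver_types, method)) <= 1


imprecise = {"Circle", "Square"}   # e.g. a context-insensitive result
precise = {"Circle"}               # e.g. after adding context sensitivity

print(is_devirtualizable(imprecise, "area"), is_devirtualizable(precise, "area"))
```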

  39. Virtual call resolution

  40. Cast safety • A cast cannot fail if the pointer can point only to objects of the “right” type (a subtype of the type in the cast) • Count the casts that cannot be proven safe • Object sensitivity, especially with context for heap objects, is the best (polyglot, javac)
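The cast check described here can be sketched as a subtype test over the points-to set of the cast operand. The type hierarchy below is hypothetical and single-inheritance only; a real Java analysis would also have to handle interfaces and arrays.

```python
# Hypothetical hierarchy: each type maps to its direct supertype.
SUPERTYPE = {"Object": None, "Shape": "Object", "Circle": "Shape", "Square": "Shape"}


def is_subtype(t, target):
    """Walk up the hierarchy from t looking for target."""
    while t is not None:
        if t == target:
            return True
        t = SUPERTYPE[t]
    return False


def cast_provably_safe(pointee_types, cast_type):
    """The cast cannot fail if every object the pointer may reference has a fitting type."""
    return all(is_subtype(t, cast_type) for t in pointee_types)


print(cast_provably_safe({"Circle"}, "Shape"))             # every pointee is a Shape
print(cast_provably_safe({"Circle", "Square"}, "Circle"))  # a Square may reach the cast
```

A more precise points-to set can only remove types from `pointee_types`, so it can turn an unprovable cast into a provable one but never the reverse.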

  41. Cast safety

  42. Conclusions • Evaluated effects: • generated contexts • distinct points-to sets • precision of call graph construction • virtual call resolution • cast safety analysis • Context-sensitive variations: • object-sensitive analysis • call sites as context abstraction • ZCWL algorithm

  43. Conclusions • Context-sensitivity improvements: • small: call graph precision • medium: virtual call resolution • major: cast safety analysis • Object-sensitive analysis was the best: • analysis precision • potential scalability

  44. Conclusions • Improvements from variations of the object-sensitive analysis: • small: length of context strings • significant: heap objects with context • Implementable with other existing techniques

  45. Conclusions • ZCWL algorithm: • disappointing results • caused by the context-insensitive treatment of calls within SCCs of the initial call graph • a large proportion of call edges lies within SCCs
