## Pointer Analysis.


**Rupesh Nasre.**
Advisor: Prof R Govindarajan. Apr 05, 2008. Pointer Analysis.

**Outline.**
• Motivation and Introduction.
• Related Work.
• Preliminary Results.
• Research Directions.

**What is Pointer Analysis?**
Pointer analysis is the mechanism of statically finding out possible run-time values of a pointer and the relation of various pointers with each other.

**Relation between pointers.**
• p = arr + ii; q = arr + jj; if (p == q) { fun(); }
• q = p; ... if (p == q) { fun(); }

**Variants of Pointer Analysis.**
• Alias analysis: do p and q point to the same memory location?
• Points-to analysis: does p point to memory location x?

**Why Pointer Analysis?**
• for parallelization: fun(p); fun(q);
• for common subexpression elimination: x = p + 2; y = q + 2;
• for dead code elimination: if (p == q) { fun(); }
• for other optimizations.

**Introduction.**
• Flow sensitivity.
• Context sensitivity.
• Field sensitivity.
• Unification based.
• Inclusion based.

**Flow sensitivity.**
p = &x; p = &y; label: ...
flow-sensitive: {(p, &y)}.
flow-insensitive: {(p, &x), (p, &y)}.

**Context sensitivity.**
caller1() { fun(p); }
caller2() { fun(q); }
fun(int *ptr) { r = ptr; }
context-insensitive: {(r, p), (r, q)}.
context-sensitive: {(r, p)} along call-path caller1, {(r, q)} along call-path caller2.

**Field sensitivity.**
x.f = p; or p = x.f;
field-sensitive: {(x.f, p)}.
field-insensitive: {(x, p)}.

**Unification based.**
one(&s1); one(&s2); two(&s3);
one(struct s *p) { p->a = 3; two(p); }
two(struct s *q) { q->b = 4; }
unification-based: {(p, &s1), (p, &s2), (p, &s3), (q, &s1), (q, &s2), (q, &s3)}.

**Inclusion based.**
one(&s1); one(&s2); two(&s3);
one(struct s *p) { p->a = 3; two(p); }
two(struct s *q) { q->b = 4; }
inclusion-based: {(p, &s1), (p, &s2), (q, &s1), (q, &s2), (q, &s3)}.

**Like all other important problems in Computer Science...**
• Alias analysis without memory allocation, intra-procedural, flow-sensitive, supporting arbitrary levels of indirection, is NP-hard.
• For two levels of indirection, it is still NP-hard.
• Even flow-insensitive analysis is NP-hard (for arbitrary levels of indirection).
• With dynamic memory allocation, allowing structs, it becomes undecidable.
• Even for scalars (no structs), it remains undecidable.
G Ramalingam, The undecidability of aliasing, TOPLAS 1994.
Venkatesan Chakaravarthy, New results on the computability and complexity of points-to analysis, POPL 2003.

**But the good news is...**
• For single pointer dereference, even a flow-sensitive analysis with only scalars and well-defined types is in P, if dynamic memory allocation is not allowed.
• For arbitrary number of dereferences, if the analysis is flow-insensitive, it is in P.
G Ramalingam, The undecidability of aliasing, TOPLAS 1994.
Venkatesan Chakaravarthy, New results on the computability and complexity of points-to analysis, POPL 2003.

**Open Problems.**
• When dynamic memory allocation is not allowed, but arbitrary number of levels of dereferencing is allowed, the problem is NP-hard. Is it in NP?
• Is the above problem for bounded number of dereferences in P?
• When dynamic memory is allowed, is the problem decidable?

**Related Work.**
• Choi et al, POPL 1993.
  • flow sensitive.
  • solution set for each program point.
  • alias sets for each CFG node.
  • uses worklists for efficiency.
  • precise but inefficient.
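The difference between the unification-based and inclusion-based results on the one()/two() example above can be reproduced with a very small solver. The following is only a sketch under stated assumptions: points-to sets are encoded as bitmasks, the call two(p) is modelled as the copy q = p, and the names `solve_inclusion`, `solve_unification` and `slide_example` are hypothetical. The unification variant is deliberately simplified (it equates both sides of a copy) and is not a full Steensgaard-style implementation.

```c
#include <assert.h>

/* Assumed encoding: objects s1, s2, s3 are bits of a points-to bitmask. */
enum { S1 = 1, S2 = 2, S3 = 4 };
enum { P, Q, NVARS };                 /* the pointer variables p and q */

struct copy { int dst, src; };        /* models the assignment "dst = src" */

/* Inclusion-based: pts[dst] must contain pts[src]; iterate to a fixed point. */
void solve_inclusion(unsigned pts[NVARS], const struct copy *c, int n)
{
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int i = 0; i < n; i++) {
            assert(c[i].dst < NVARS && c[i].src < NVARS);
            unsigned merged = pts[c[i].dst] | pts[c[i].src];
            if (merged != pts[c[i].dst]) {
                pts[c[i].dst] = merged;   /* information flows src -> dst only */
                changed = 1;
            }
        }
    }
}

/* Unification-based (simplified): both sides of a copy end up with one set. */
void solve_unification(unsigned pts[NVARS], const struct copy *c, int n)
{
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int i = 0; i < n; i++) {
            assert(c[i].dst < NVARS && c[i].src < NVARS);
            unsigned merged = pts[c[i].dst] | pts[c[i].src];
            if (pts[c[i].dst] != merged || pts[c[i].src] != merged) {
                pts[c[i].dst] = pts[c[i].src] = merged;  /* flows both ways */
                changed = 1;
            }
        }
    }
}

/* Constraints read off the slide: one(&s1); one(&s2); two(&s3); two(p). */
void slide_example(unsigned inc[NVARS], unsigned uni[NVARS])
{
    const struct copy copies[] = { { Q, P } };   /* two(p) binds q = p */
    inc[P] = uni[P] = S1 | S2;                   /* one(&s1); one(&s2); */
    inc[Q] = uni[Q] = S3;                        /* two(&s3); */
    solve_inclusion(inc, copies, 1);
    solve_unification(uni, copies, 1);
}
```

After `slide_example`, the inclusion result leaves p with {s1, s2} and q with {s1, s2, s3}, while the unified result gives both pointers all three objects, which matches the sets listed on the two slides.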
J D Choi, M Burke, P Carini, Efficient flow-sensitive interprocedural computation of pointer-induced aliases and side effects, POPL 1993.

**Related Work.**
• Andersen, PhD Thesis, 1994.
  • flow insensitive.
  • context insensitive.
  • inclusion based.
  • each variable represented using separate node.
  • precision used as upper bound.
Lars Ole Andersen, Program Analysis and Specialization for the C Programming Language, PhD thesis, 1994.

**Related Work.**
• Burke et al, LCPC 1995.
  • flow insensitive.
  • alias solution for each procedure.
  • worklist used for efficiency.
  • can filter alias information based on scoping.
  • nearly as precise as Andersen's.
M Burke, P Carini, J D Choi, M Hind, Flow-insensitive interprocedural alias analysis in the presence of function pointers, LCPC 1995.

**Related Work.**
• Reps et al, POPL 1995.
  • problem formulated using graph reachability.
  • poly-time algorithm for interprocedural finite distributive subset-based problems.
  • graph reachability used for aliasing.
Thomas Reps, Susan Horwitz, Mooly Sagiv, Precise Interprocedural Dataflow Analysis via Graph Reachability, POPL 1995.

**Related Work.**
• Steensgaard, POPL 1996.
  • flow insensitive.
  • context insensitive.
  • field insensitive.
  • unification based.
  • linear space and almost linear time algorithm.
  • imprecise but sets lower bound on time complexity.
Bjarne Steensgaard, Points-to Analysis in Almost Linear Time, POPL 1996.

**Related Work.**
• Ghiya et al, POPL 1996.
  • flow sensitive.
  • context sensitive.
  • field insensitive.
  • makes use of direction, interference and shape.
  • classifies as tree, dag or cyclic graph.
Rakesh Ghiya, Laurie Hendren, Is it a Tree, a DAG, or a Cyclic Graph? A Shape Analysis for Heap-Directed Pointers in C, POPL 1996.

**Related Work.**
• Cheng et al, PLDI 2000.
  • uses access paths.
  • flow insensitive.
  • field sensitive.
  • cost effective context sensitivity.
  • works well for large number of indirect function calls.
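The "almost linear time" figure on the Steensgaard slide above is the kind of bound obtained when equivalence classes of memory locations are kept in a union-find structure. The sketch below shows only that building block, with union by rank and path halving; it is not the paper's algorithm, and the names `uf_init`, `uf_find`, `uf_union` and the fixed capacity `MAX_NODES` are assumptions made for illustration.

```c
#include <assert.h>

#define MAX_NODES 64                  /* assumed capacity for this sketch */

static int uf_parent[MAX_NODES];
static int uf_rank[MAX_NODES];

/* Put each of the first n locations into its own equivalence class. */
void uf_init(int n)
{
    assert(n >= 0 && n <= MAX_NODES);
    for (int i = 0; i < n; i++) {
        uf_parent[i] = i;
        uf_rank[i] = 0;
    }
}

/* Return the representative of x's class, shortening the path as we go. */
int uf_find(int x)
{
    assert(x >= 0 && x < MAX_NODES);
    while (uf_parent[x] != x) {
        uf_parent[x] = uf_parent[uf_parent[x]];   /* path halving */
        x = uf_parent[x];
    }
    return x;
}

/* Merge the classes of a and b, attaching the shallower tree to the deeper. */
void uf_union(int a, int b)
{
    a = uf_find(a);
    b = uf_find(b);
    if (a == b)
        return;
    if (uf_rank[a] < uf_rank[b]) {
        int t = a; a = b; b = t;
    }
    uf_parent[b] = a;
    if (uf_rank[a] == uf_rank[b])
        uf_rank[a]++;
}
```

In a unification-based analysis, an assignment such as q = p would trigger a `uf_union` on the classes that p and q may point to, so each statement is processed once and the cost per operation stays close to constant. The price is the loss of precision noted on the slide, because merged classes are never separated again.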
Ben-Chung Cheng, Wen-Mei Hwu, Modular Interprocedural Pointer Analysis using Access Paths: Design, Implementation, and Evaluation, PLDI 2000.

**Related Work.**
• Whaley et al, PLDI 2004.
  • context sensitive.
  • field sensitive.
  • partially flow sensitive.
  • inclusion based.
  • scalable (10 min, 400 MB, 8000 methods).
  • ordered BDDs.
John Whaley, Monica Lam, Cloning-based Context-sensitive Pointer Alias Analysis Using Binary Decision Diagrams, PLDI 2004.

**Related Work.**
• Lattner et al, PLDI 2007.
  • context sensitive.
  • flow insensitive.
  • field sensitive.
  • unification based.
  • scalable.
  • efficient (3 sec for 200K lines).
  • low storage requirement (30 MB).
Chris Lattner, Andrew Lenharth, Vikram Adve, Making Context Sensitive Points-to Analysis with Heap Cloning Practical For The Real World, PLDI 2007.

**Our Experiments.**
• framework = LLVM.
• algorithm = Andersen.
• benchmark = SPEC 2000.

**Research Directions.**
• Pointer arithmetic.
void f(struct list *p, struct list *q) {
  struct list *tmp;
  tmp = p->next;
  p->next = q->next;
  q->next = q->next->next;
  p->next->next = tmp;
}

**Research Directions.**
• Profiling.
  • at specific program points like function entry, exit.
  • for hot functions.
  • for fat pointers.

**Research Directions.**
• Complex data structures.
  • a recursive data structure is merged into a single node.
  • some programs have a single global data structure to operate on, like symbol table, dictionary.
  • how to characterize complexity of a data structure?

**188.ammp Description.**
Benchmark Program General Category: Computational Chemistry. Modeling large systems of molecules usually associated with Biology.
Benchmark Description: The benchmark runs molecular dynamics (i.e. solves the ODE defined by Newton's equations for the motions of the atoms in the system) on a protein-inhibitor complex which is embedded in water (see Harrison 1993 for descriptions of the algorithm and stability analysis on it). The energy is approximated by a classical potential or "force field". The protein is HIV protease complexed with the inhibitor indinavir. There are 9582 atoms in the water and protein, making this representative of a typical large simulation. This benchmark is derived from published work on understanding drug resistance in HIV (Weber and Harrison 1999).
Input Description: The problem tracks how the atoms move from initial coordinates and initial velocities.

**Conferences.**
• POPL: Principles of Programming Languages.
• PLDI: Programming Language Design and Implementation.
• MSP: Memory Systems Performance.
• LCPC: Languages and Compilers for Parallel Computing.

**Related Work.**
• Raman et al, MSP 2005.
  • uses executable instructions.
  • run time (dynamic).
  • collects RDS profile.
  • no type information.
  • interesting properties of data structures are found out.
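One way to read the list-manipulation function f on the first Research Directions slide above is that its effect depends on whether p and q alias. The sketch below wraps the slide's function in a self-contained form to make that visible. The `struct list` layout, the precondition assert inside f and the helper `has_cycle` are assumptions added for illustration and are not on the slide.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed node layout; the slide only requires a `next` field. */
struct list {
    int value;
    struct list *next;
};

/* The function from the slide, with one added precondition check. */
void f(struct list *p, struct list *q)
{
    struct list *tmp;
    assert(p && q && q->next);        /* added here, not on the slide */
    tmp = p->next;
    p->next = q->next;                /* node after q is linked after p */
    q->next = q->next->next;          /* and unlinked from q's list */
    p->next->next = tmp;              /* rest of p's list follows it */
}

/* Floyd's cycle check: returns 1 if the list starting at head loops. */
int has_cycle(const struct list *head)
{
    const struct list *slow = head, *fast = head;
    while (fast && fast->next) {
        slow = slow->next;
        fast = fast->next->next;
        if (slow == fast)
            return 1;
    }
    return 0;
}
```

With two separate lists a, b, c and x, y, z, the call f(&a, &x) moves y to sit right after a and both lists stay acyclic. Calling f(&a, &a) on the single list a, b, c instead leaves b and c pointing at each other, so the same code turns a list into a cyclic structure when its arguments alias.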