Pointer Analysis Survey.

Rupesh Nasre. Aug 24, 2007. Pointer Analysis Survey.

Outline. • The problem. • Background. • Representative papers. • Discussion: trends, similarities, differences. • Directions for research.

The problem. Statically find groups of program variables such that all variables in a group may point to the same memory block during program execution.
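As a minimal sketch of what the output of such an analysis looks like, the snippet below groups variables by the memory block they may point to. It assumes the points-to sets have already been computed; the function name `alias_groups` and the variable names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def alias_groups(points_to):
    """Invert a points-to map: memory block -> variables that may point to it."""
    groups = defaultdict(set)
    for var, blocks in points_to.items():
        for block in blocks:
            groups[block].add(var)
    return dict(groups)


# Hypothetical analysis result: p may point to x or y, q to y, r to z.
example = {"p": {"x", "y"}, "q": {"y"}, "r": {"z"}}
groups = alias_groups(example)
```

Here `groups["y"]` contains both `p` and `q`, so those two variables form a group that may refer to the same block.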

Background (1 of 7). • Static analysis. • done on static representation of a program. • does not require program execution. • is conservative by definition. • Dynamic analysis. • done on traces of program executions. • does not cover all possible behaviors. • precise for a run of the program.

Background (2 of 7). • Clients. • program transformations that depend on pointer analysis. • for instance, queries related to pointers and compiler optimizations. • typically, the more time spent on pointer analysis, the less time clients need to resolve their queries.

Background (3 of 7). • Precision. • a measure of how accurately pointer analysis yields the required information. • for pointer analysis, the required information is whether two pointers are aliases or non-aliases. • dynamic analysis is precise with respect to the execution it observes.
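The alias query can be sketched as a set intersection over points-to sets. The snippet below is only an illustration under assumed inputs: `static` stands for a conservative static result and `observed` for what one program run exhibited. Both maps, and the helper names `may_alias` and `alias_pairs`, are hypothetical.

```python
from itertools import combinations


def may_alias(points_to, p, q):
    """Two pointers may alias if their points-to sets share a memory block."""
    return bool(points_to.get(p, set()) & points_to.get(q, set()))


def alias_pairs(points_to):
    """All unordered pointer pairs reported as possible aliases."""
    return {
        (p, q)
        for p, q in combinations(sorted(points_to), 2)
        if may_alias(points_to, p, q)
    }


# Assumed data: a conservative static result and one observed execution.
static = {"p": {"x", "y"}, "q": {"y"}, "r": {"z"}}
observed = {"p": {"x"}, "q": {"y"}, "r": {"z"}}

# Pairs the static analysis reports that this particular run never exhibited.
unconfirmed = alias_pairs(static) - alias_pairs(observed)
```

A pair in `unconfirmed` is not necessarily a false positive: another execution might still exhibit it, which is why the dynamic result is precise only for the run it observed.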

Background (4 of 7). • Efficiency. • amount of time taken by an algorithm. • Scalability. • asymptotic time complexity of an algorithm. • An algorithm can be efficient, but not scalable.

Background (5 of 7). • Flow-sensitivity. • algorithm considers control flow in the program. • Context-sensitivity. • algorithm considers calling context of a function. • Field-sensitivity. • algorithm separates individual fields of an aggregate, from each other and from the aggregate itself.
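To illustrate flow-sensitivity, the sketch below analyzes a straight-line program consisting only of statements of the form `var = &target`. This restriction is an assumption made to keep the example short; a real analysis also handles branches, loops, and indirect assignments. The function names are hypothetical.

```python
def flow_insensitive(stmts):
    """One points-to map for the whole program; statement order is ignored."""
    pts = {}
    for var, target in stmts:
        pts.setdefault(var, set()).add(target)
    return pts


def flow_sensitive(stmts):
    """One points-to map per program point (after each statement)."""
    state, history = {}, []
    for var, target in stmts:
        state = dict(state)
        state[var] = {target}  # strong update: the old targets of var are killed
        history.append(state)
    return history


# p = &x; p = &y;
program = [("p", "x"), ("p", "y")]
insensitive = flow_insensitive(program)
sensitive = flow_sensitive(program)
```

The flow-insensitive result says `p` may point to `x` or `y` anywhere in the program, while the flow-sensitive result records that after the second statement `p` points only to `y`.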

Background (6 of 7). • Unification-based. • algorithm merges equivalence classes of variables in an assignment. • less storage requirement. • fast. • low precision.
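A unification-based analysis can be sketched with a union-find structure in which every equivalence class has at most one pointee class. The code below is a simplified illustration in the spirit of Steensgaard's approach, not a reproduction of the published algorithm. The statement encoding (`addr`, `copy`, `load`, `store`) and all class and function names are assumptions made for this example.

```python
from itertools import count


class Unifier:
    """Union-find over variables; each class has at most one pointee class."""

    def __init__(self):
        self.parent = {}
        self.pointee = {}  # class representative -> a member of its pointee class
        self._fresh = count()

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def target(self, x):
        """Class that x points to; creates an empty placeholder if none exists."""
        r = self.find(x)
        if r not in self.pointee:
            self.pointee[r] = self.find(("tmp", next(self._fresh)))
        return self.find(self.pointee[r])

    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        pa, pb = self.pointee.pop(a, None), self.pointee.pop(b, None)
        self.parent[b] = a
        if pa is None:
            pa, pb = pb, None
        if pa is not None:
            self.pointee[a] = pa
        if pb is not None:
            self.union(pa, pb)  # merging two classes also merges their pointees


def unify(stmts):
    u = Unifier()
    for kind, lhs, rhs in stmts:
        if kind == "addr":     # lhs = &rhs
            u.union(u.target(lhs), rhs)
        elif kind == "copy":   # lhs = rhs
            u.union(u.target(lhs), u.target(rhs))
        elif kind == "load":   # lhs = *rhs
            u.union(u.target(lhs), u.target(u.target(rhs)))
        elif kind == "store":  # *lhs = rhs
            u.union(u.target(u.target(lhs)), u.target(rhs))
    return u


# p = &x; q = &y; p = q; r = &z;
result = unify([("addr", "p", "x"), ("addr", "q", "y"),
                ("copy", "p", "q"), ("addr", "r", "z")])
```

The assignment `p = q` merges the pointee classes of `p` and `q`, so `x` and `y` end up in one class even though `q` never points to `x`. This merging is the source of both the speed and the low precision.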

Background (7 of 7). • Inclusion-based (also called subset-based or constraint-based). • algorithm processes assignments directionally and each symbol is represented by a single node. • higher storage requirement. • slower. • high precision.
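An inclusion-based analysis treats each assignment as a one-directional subset constraint and iterates to a fixed point. The sketch below is a naive illustration in the spirit of Andersen's approach, without the worklists or cycle elimination that practical implementations use. The statement encoding and the function name `inclusion_analysis` are assumptions made for this example.

```python
from collections import defaultdict


def inclusion_analysis(stmts):
    """Naive fixed-point iteration over subset constraints."""
    pts = defaultdict(set)

    def add(dst, values):
        before = len(pts[dst])
        pts[dst] |= values
        return len(pts[dst]) != before

    changed = True
    while changed:
        changed = False
        for kind, lhs, rhs in stmts:
            if kind == "addr":     # lhs = &rhs   =>  {rhs} is a subset of pts(lhs)
                changed |= add(lhs, {rhs})
            elif kind == "copy":   # lhs = rhs    =>  pts(rhs) is a subset of pts(lhs)
                changed |= add(lhs, set(pts[rhs]))
            elif kind == "load":   # lhs = *rhs   =>  pts(t) flows to lhs for t in pts(rhs)
                for t in list(pts[rhs]):
                    changed |= add(lhs, set(pts[t]))
            elif kind == "store":  # *lhs = rhs   =>  pts(rhs) flows to t for t in pts(lhs)
                for t in list(pts[lhs]):
                    changed |= add(t, set(pts[rhs]))
    return dict(pts)


# p = &x; q = &y; p = q;
result = inclusion_analysis([("addr", "p", "x"), ("addr", "q", "y"),
                             ("copy", "p", "q")])
```

Because the constraint for `p = q` flows only from `q` to `p`, `q` keeps the points-to set `{y}` while `p` gets `{x, y}`. A unification-based analysis would instead merge `x` and `y` for both pointers.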

Representative papers (1 of 4). • Choi et al, Efficient flow-sensitive interprocedural computation of pointer-induced aliases and side effects, POPL 1993. • Andersen, PhD Thesis, 1994. • Burke et al, Flow-insensitive interprocedural alias analysis in the presence of pointers, LCPC 1995. • Reps et al, Precise interprocedural dataflow analysis via graph reachability, POPL 1995.

Representative papers (2 of 4). • Steensgaard, Points-to analysis in almost linear time, POPL 1996. • Ghiya et al, Is it a tree, DAG, or a cyclic graph? A shape analysis for heap-directed pointers in C, PLDI 1996. • Hind et al, Which pointer analysis should I use?, ISSTA 2000.

Representative papers (3 of 4). • Cheng et al, Modular interprocedural pointer analysis using access paths: design, implementation, and evaluation, PLDI 2000. • Liang et al, Evaluating the precision of static reference analysis using profiling, ISSTA 2002. • Whaley et al, Cloning-based context-sensitive pointer alias analysis using binary decision diagrams, PLDI 2004.

Representative papers (4 of 4). • Raman et al, Recursive data structure profiling, MSP 2005. • Lattner et al, Making context sensitive points-to analysis with heap-cloning practical for the real world, PLDI 2007.

Discussion: similarities, differences. • Flow-sensitive: Choi93, Ghiya96, Reps95, Whaley04. • Context-sensitive: Andersen94, Cheng00, Ghiya96, Lattner07, Whaley04. • Field-sensitive: Cheng00, Lattner07, Whaley04. • Unification-based: Steensgaard96, Lattner07. • Inclusion-based: Andersen94, Cheng00, Whaley04.

Discussion: trends (1 of 2). • Recursion is handled using strongly-connected components. • A recursive data structure is represented using a single representative node. • Stack pointers are often treated in a different manner than heap pointers. • For better precision, inclusion-based analyses are preferred. For better efficiency, unification-based analyses are preferred.
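Handling recursion through strongly-connected components can be sketched with Tarjan's algorithm on the call graph: mutually recursive functions collapse into one component, and components are emitted callees-first, which suits a bottom-up pass. The call graph below and the function name `sccs_bottom_up` are hypothetical, and the surveyed papers may compute components differently.

```python
def sccs_bottom_up(call_graph):
    """Tarjan's algorithm; returns SCCs with callees before their callers."""
    index, low = {}, {}
    stack, on_stack = [], set()
    result = []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in call_graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            result.append(comp)

    for v in call_graph:
        if v not in index:
            visit(v)
    return result


# Hypothetical call graph: f and g are mutually recursive.
calls = {"main": ["f"], "f": ["g"], "g": ["f", "h"], "h": []}
order = sccs_bottom_up(calls)
```

In this example `f` and `g` fall into a single component that is processed after `h` and before `main`. The recursive `visit` is adequate for a sketch; very deep call chains would need an iterative version.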

Discussion: trends (2 of 2). • Flow-sensitivity does not improve precision significantly: pointers are typically not reassigned, and when they are, they usually point to another part of the same data structure, which is represented as a whole by a single node. • Graph-based algorithms typically involve three phases: intraprocedural, bottom-up, and top-down. • A single level of context-sensitivity proves sufficiently precise and efficient.

Discussion. • Most of the papers differ in the techniques used to solve the pointer analysis problem. • The representation of alias information differs considerably across techniques. • matrices: Ghiya96. • graphs: Das00, Lattner07, Raman05, Reps95, Steensgaard96. • access-paths: Cheng00. • ordered binary decision diagrams: Whaley04.

Directions for research (1 of 4). • Complex data structures. • most algorithms do not handle them well. • occur when large hash tables, dictionaries, symbol tables form the main data structure of a program. • need to characterize complexity of a data structure. • adaptive algorithm depending on the complexity.

Directions for research (2 of 4). • Out-of-order execution for multithreaded programs. • some research has been done on multithreaded programs. • none of the papers discusses the effect of out-of-order execution of instructions on aliases in multithreaded programs. • instructions may be reordered by the compiler or the hardware.

Directions for research (3 of 4). • Combination of techniques. • no single existing technique is best in all aspects. • hybrid approaches are necessary. • one way is to combine static pointer analysis with dynamic profile information. • another way is to use an adaptive algorithm that internally selects among the different existing sub-algorithms.

Directions for research (4 of 4). • Representation of alias information. • historically, differences in the representation of alias information have often led to new algorithms. • finding novel ways to represent aliases is an interesting area to explore.
