Shape Analysis With Reference Sets

Mark Marron, IMDEA-Software (Madrid, Spain), mark.marron@imdea.org

Motivation • We want to provide basic information about the program heap to support a range of client applications • IDE tools (query, refactoring, etc.) • Optimization • Error Detection • Focus on scalable, manageable models/tools, even at the cost of overall expressivity/analytic power

Demo • Fix sharing info extraction • Add disjoint/overlaps for set information • Point out that more than just variable relations is desirable, since variables are transient

Goal • Track basic set relations • Membership, Overlapping, Non-Overlapping • Subset, Set Equality • Ensure small computational cost • High precision is not required but must handle common cases accurately • Iterative subset construction/mutation • Set style library operations • Union (AddAll) • Intersection • IsSubset • Contains

Approach Overview • Start with an existing model that decomposes the heap into related regions • Reduces the complexity of the set formulas that are needed • Storage shape graph works well • Nodes represent sets of objects (or data structures), edges represent sets of pointers • Fine-grained partitioning is possible • Disjointness properties are natural (and mostly free) • Annotate edges with additional properties to track reference set relations

Logical Structure Identification • Key issue for the shape graph approach is how to group concrete objects into abstract nodes • Too many nodes are confusing and computationally expensive • Too few nodes lead to imprecision (as a single node must represent multiple logical structures) • Often done via allocation site or types • Solution: nodes are sets of similar objects • Recursive type information (recursive vs. non-recursive types) • Objects stored in the same collection, array or structure

Concrete Expression Heap

Abstract Expression Heap

Target Set Definition • Given a set of heap references R, the corresponding target set is: • {Object o | ∃ r ∈ R that points to o} • Two sets of heap references can be related with ⊆ on their target sets • As the heap is partitioned into regions of objects, we also define a notion of coverage • A reference set covers a region if every object in the region is in the corresponding target set
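The definitions on this slide can be illustrated with a minimal sketch over a toy concrete heap. The dictionary encoding of the heap and the names `target_set` and `covers` are assumptions made for illustration; they do not come from the talk.

```python
from typing import Dict, Hashable, Iterable, Optional, Set

# Hypothetical concrete heap: reference name -> object it points to (None models null).
Heap = Dict[str, Optional[Hashable]]


def target_set(heap: Heap, refs: Iterable[str]) -> Set[Hashable]:
    """Objects o such that some reference r in refs points to o."""
    return {heap[r] for r in refs if heap[r] is not None}


def covers(heap: Heap, refs: Iterable[str], region: Set[Hashable]) -> bool:
    """True if every object in the region is in the target set of refs."""
    return region <= target_set(heap, refs)


heap: Heap = {"r1": "a", "r2": "b", "r3": "a", "r4": None}
region = {"a", "b"}

small = target_set(heap, ["r1", "r3"])        # both references point to object a
large = target_set(heap, ["r1", "r2", "r4"])  # r4 is null and adds nothing

subset_holds = small <= large                 # the subset relation on target sets
small_covers = covers(heap, ["r1", "r3"], region)
large_covers = covers(heap, ["r1", "r2", "r4"], region)
```

Here the first reference set is related to the second by ⊆ on target sets, and only the second covers the region, since the first never reaches object `b`.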

Considerations in Abstraction • Several possible choices for representing these relations • Theory of sets over all objects/references • Full binary relations on power sets of edges • Reduced set of relations • For efficiency we use a reduced set of relations • Equality of the reference sets abstracted by pairs of edges (E × E) • Relation from sets of edges to nodes that are covered by the abstracted references (℘(E) × N)
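One possible way to store the two reduced relations is sketched below. The class name `RefSetRelations`, its methods, and the use of strings for edges and nodes are hypothetical; the talk does not specify a data layout. The sketch keeps only the facts that are recorded and does not close edge equivalence transitively.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterable, Set, Tuple

Edge = str  # assumed identifier type for abstract edges
Node = str  # assumed identifier type for abstract nodes


@dataclass
class RefSetRelations:
    # Unordered pairs of edges whose abstracted reference sets have equal target sets (E x E).
    equiv: Set[FrozenSet[Edge]] = field(default_factory=set)
    # Pairs (set of edges, node): the references abstracted by the edges cover the node (P(E) x N).
    covered: Set[Tuple[FrozenSet[Edge], Node]] = field(default_factory=set)

    def add_equiv(self, e1: Edge, e2: Edge) -> None:
        if e1 != e2:
            self.equiv.add(frozenset((e1, e2)))

    def are_equiv(self, e1: Edge, e2: Edge) -> bool:
        return e1 == e2 or frozenset((e1, e2)) in self.equiv

    def add_cover(self, edges: Iterable[Edge], node: Node) -> None:
        self.covered.add((frozenset(edges), node))

    def is_covered_by(self, edges: Iterable[Edge], node: Node) -> bool:
        # A superset of a covering edge set has a larger target set, so it covers too.
        es = frozenset(edges)
        return any(n == node and c <= es for c, n in self.covered)


rel = RefSetRelations()
rel.add_equiv("e1", "e2")
rel.add_cover({"e1"}, "n")
```

Storing equivalence as unordered pairs makes the relation symmetric by construction, and both relations stay small because they only mention edges and nodes of the shape graph.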

Abstract Edge Equivalence • Track target set equality of the pointers abstracted by pairs of edges

Abstract Node Coverage • Track if all objects in the region represented by a node are contained in the target sets of the given edges

Useful Inferences • There are a number of useful inferences that can be made from these two properties • If e, eʹ are edge equivalent and e has an empty concretization, then eʹ must have an empty concretization as well • If an edge e covers node n, then any other in-edge of n represents a target set that is ⊆ the target set of e
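The two inferences can be sketched as small functions over the abstract facts. The function names and the encoding (edge pairs for equivalence, `(edge, node)` pairs for single-edge coverage, a map from each node to its in-edges) are assumptions for illustration only. The second inference is sound because every in-edge of n only targets objects in n's region, and that region is contained in the target set of the covering edge.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = str
Node = str


def infer_empty(equiv_pairs: Iterable[Tuple[Edge, Edge]],
                known_empty: Iterable[Edge]) -> Set[Edge]:
    """Propagate emptiness along edge equivalence until nothing changes."""
    pairs = list(equiv_pairs)
    empty = set(known_empty)
    changed = True
    while changed:
        changed = False
        for a, b in pairs:
            # Exactly one side is known empty: the other must be empty as well.
            if (a in empty) != (b in empty):
                empty.update((a, b))
                changed = True
    return empty


def infer_subsets(single_covers: Iterable[Tuple[Edge, Node]],
                  in_edges: Dict[Node, Set[Edge]]) -> Set[Tuple[Edge, Edge]]:
    """Return pairs (other, e) meaning Target(other) is a subset of Target(e)."""
    return {(other, e)
            for e, n in single_covers
            for other in in_edges.get(n, set())
            if other != e}


empties = infer_empty([("e1", "e2"), ("e2", "e3")], {"e1"})
subsets = infer_subsets([("e", "n")], {"n": {"e", "f", "g"}})
```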

Example With Ref. Relations

Subsumes Aliasing • Note that the proposed reference set relations subsume classic must-alias • In the concrete model variables x == y (x, y non-null) iff Target(x) = Target(y) • In the abstract model the variables x, y must-alias iff the corresponding edges ex and ey are edge equivalent
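In the concrete model, the target set of a single non-null variable is a singleton, so equality of target sets coincides with pointer equality. A minimal sketch, assuming a toy heap encoded as a dictionary and a hypothetical helper `target`:

```python
from typing import Dict, Hashable, Optional, Set

# Hypothetical variable environment: variable name -> object it points to (None models null).
Env = Dict[str, Optional[Hashable]]


def target(env: Env, var: str) -> Set[Hashable]:
    """Target set of a single variable: empty for null, otherwise a singleton."""
    obj = env[var]
    return set() if obj is None else {obj}


env: Env = {"x": "o1", "y": "o1", "z": "o2"}

x_aliases_y = target(env, "x") == target(env, "y")  # same object, equal target sets
x_aliases_z = target(env, "x") == target(env, "z")  # different objects
```

The non-null side condition matters: two null variables would have equal (empty) target sets without pointing to a shared object.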

Loop Invariant With Exit Test ... for(int i = 0; i < V.Length; ++i) V[i].f = 0;

Result ... for(int i = 0; i < V.Length; ++i) V[i].f = 0;

Static Analysis Statistics

Summary • Tracking reference set information is computationally inexpensive • Results are precise enough to model many interesting/important relations • In fact, surprisingly so • Why? Most conditions end up being simple • Is this a general property? Are most programs made of simple relations/concepts that are composed into complex concepts? (We hope so) • Could we use rich set decision procedures? E.g. all conditions are simple ⇒ most proofs are easy/fast with the right decomposition

Future Work • Build strong foundation for other tools to utilize • Transform core concepts from prototype to robust tools • Finish implementation of static analysis for CLI bytecode + core libraries (also runtime support) • Export results to Visual Studio for inspection, spec. generation, or other tools • Apply results in optimization, refactoring, and error detection applications

Questions
