Compactly Representing First-Order Structures for Static Analysis

Compactly Representing First-Order Structures for Static Analysis Tel-Aviv University Roman Manevich Mooly Sagiv I.B.M T.J. Watson Ganesan RamalingamJohn FieldDeepak Goyal

Motivation • TVLA is a powerful and general abstract interpretation system • Abstract interpretation in TVLA • Operational semantics is expressed with first-order logic formulae • Program states are represented assets of Evolving First-Order Structures • Space is a major bottleneck

Desired Properties • Sparse data structures • Share common sub-structures • Inherited sharing • Incidental sharing due to program invariants • But feasible time performance • Phase sensitive data structures

Outline • Background • First-order structure representations • Base representation (TVLA 0.91) • BDD representation • Empirical evaluation • Conclusion

First-Order Logical Structures • Generalize shape graphs • Arbitrary set of individuals • Arbitrary set of predicates on individuals • Dynamically evolving • Usually small changes • Properties are extracted by evaluating first order formula: ∃v1 , v: x(v1) ∧ n(v1, v) • Join operator requires isomorphism testing

First-Order Structure ADT • Structure : new() /* empty structure */ • SetOfNodes : nodeSet(Structure) • Node : newNode(Structure) • removeNode(Structure, node) • Kleeneeval(Structure, p(r), <u1, . . . ,ur>) • update(Structure, p(r), <u1, . . . ,ur>, Kleene) • Structurecopy(Structure)

print_all Example /* list.h */typedef struct node { struct node * n; int data;} * L; /* print.c */#include “list.h”void print_all(L y) { L x;x = y; while (x != NULL) { /* assert(x != NULL) */ printf(“elem=%d”, xdata);x = xn; }}

print_all Example n=½ usm=½ u1y=1 n=½ S1 n=½ usm=½ u1y=1 n=½ S0 x = y x’(v) := y(v) copy(S0) : S1 nodeset(S0) : {u1, u} eval(S0, y, u1) : 1 update(S1, x, u1, 1) x=1 eval(S0, y, u) : 0 update(S1, x, u, 0)

print_all Example n=½ while (x != NULL)precondition : ∃v x(v) u1x=1y=1 usm=½ n=½ S1 n=½ x = x  nfocus : ∃v1 x(v1) ∧ n(v1, v)x’(v) := ∃v1 x(v1) ∧ n(v1, v) usm=½ u1y=1 S2.0 n=½ u1y=1 ux=1 S2.1 n=1 n=½ n=½ n=½ u.0sm=½ u1y=1 n=1 S2.2 u.1x=1

Overview and Main Results • Two novel representations of first-order structures • New BDD representation • New representation using functional maps • Implementation techniques • Empirical evaluation • Comparison of different representations • Space is reduced by a factor of 4–10 • New representations scale better

Base Representation (Tal Lev-Ami SAS 2000) • Two-Level Map : Predicate  (Node Tuple  Kleene) • Sparse Representation • Limited inherited sharing by “Copy-On-Write”

BDDs in a Nutshell (Bryant 86) • Ordered Binary Decision Diagrams • Data structure for Boolean functions • Functions are represented as (unique) DAGs x1 x2 x2 x3 x3 x3 x3 0 0 0 1 0 1 0 1

BDDs in a Nutshell (Bryant 86) • Ordered Binary Decision Diagrams • Data structure for Boolean functions • Functions are represented as (unique) DAGs • Also achieve sharing across functions x1 x1 x1 x2 x2 x2 x2 x2 x3 x3 x3 x3 x3 x3 x3 0 1 0 1 0 1 Duplicate Terminals Duplicate Nonterminals Redundant Tests

Encoding Structures Using Integers • Static encoding of • Predicates • Kleene values • Dynamic encoding of nodes • 0, 1, …, n-1 • Encode predicate p’s values as • ep(p).en(u1). en(u2) . … . en(un) . ek(Kleene)

x1 x2 x2 x3 0 1 BDD Representation of Integer Sets • Characteristic function • S={1,5} 1=<001>5=<101> S = (¬x1¬x2x3) (x1¬x2x3)

x1 x2 x2 x3 1 BDD Representation of Integer Sets • Characteristic function • S={1,5} 1=<001>5=<101> S = (¬x1¬x2x3) (x1¬x2x3)

BDD Representation Example n=½ usm=½ S0 n=½ S0 u1y=1 1

BDD Representation Example n=½ usm=½ S0 S1 n=½ S0 u1y=1 x=y n=½ u1x=1y=1 usm=½ n=½ S1 1

S2.2 BDD Representation Example n=½ usm=½ S0 S1 n=½ S0 u1y=1 x=y n=½ u1x=1y=1 usm=½ n=½ S1 x=xn n=½ n=½ n=½ u.0sm=½ u1y=1 n=1 S2.2 u.1x=1 1

Improved BDD Representation • Using this representation directlydoesn’t save space • Observation • Node names can be arbitrarily remapped without affecting the ADT semantics • Our heuristics • Use canonic node names to encode nodes • Increases incidental sharing • Reduces isomorphism test to pointer comparison • 4-10 space reduction

Reducing Time Overhead • Current implementation not optimized • Expensive formula evaluation • Hybrid representation • Distinguish between phases:mutable phase  Join  immutable phase • Dynamically switch representations

Functional Representation • Alternative representation for first-order structures • Structures represented by maps from integers to Kleene values • Tailored for representing first-order structures • Achieves better results than BDDs • Techniques similar to the BDD representation • More details in the paper

Empirical Evaluation • Benchmarks: • Cleanness Analysis (SAS 2000) • Garbage Collector • CMP (PLDI 2002) of Java Front-End and Kernel Benchmarks • Mobile Ambients (ESOP 2000) • Stress testing the representations • We use “relational analysis” • Save structures in every CFG location

Space Results

Abstract Counters • Ignore language/implementation details • A more reliable measurement technique • Count only crucial space information • Independent of C/Java

Abstract Counters Results

Trends in theCleanness Analysis Benchmark

What’s Missing from this Work? • Investigate other node mapping heuristics • Compactly represent sets of structures • Time optimizations

Conclusions • Two novel representations of first-order structures • New BDD representation • New representation using functional maps • Implementation techniques • Normalization techniques are crucial • Empirical evaluation • Comparison of different representations • Space is reduced by a factor of 4–10 • New representations scale better

Conclusions • The use of BDDs for static analysis is not a panacea for space saving • Domain-specific encoding crucial for saving space • Failed attempts • Original implementation of Veith’s encoding • PAG

The End

Compactly Representing First-Order Structures for Static Analysis

Compactly Representing First-Order Structures for Static Analysis

Presentation Transcript

Dynamic Analysis of First Order Instruments

STATIC TIMING ANALYSIS

Static Pushover Analysis

Static Analysis for Memory Safety

Static Analysis Techniques –

Static Analysis @ CTI

Static Analysis

Transient Analysis - First Order Circuits

Static Analysis for Security

Static analysis for security

Representing Data using Static and Moving Patterns

Static Analysis of Communication Structures in Parallel Programs

WP6: Static Analysis

Static Analysis

Compactly Representing Parallel Program Executions

Static Analysis for Security

Representing Part Relationships Between Developing Structures

Static Code Analysis

Static Analysis of Communication Structures in Parallel Programs

Representing Data using Static and Moving Patterns

Static Structural Analysis

STATIC FORCE ANALYSIS