Course Outline

Course Outline • Traditional Static Program Analysis • Classic analyses and applications • Soot • Software Testing, Refactoring • Dynamic Program Analysis

Announcements • The new page for the class is at www.rpi.edu/~milana2/csci6961 • Email: milana2@rpi.edu, milanova@cs.rpi.edu • All, send me an email with your CS login

Outline • Analysis of object references • Class Hierarchy Analysis (CHA) • Rapid Type Analysis (RTA) • Call Graphs

Analysis of object references • Analysis of object-oriented programs • Java • Class Analysis problem: Given a reference variable x, what are the classes of the objects that x refers to at runtime? • Points-to Analysis problem: Given a reference variable x, what are the objects that x refers to at runtime?

Java Example: BoolExp hierarchy
    public class AndExp extends BoolExp {
      private BoolExp _operand1;
      private BoolExp _operand2;
      public AndExp(BoolExp op1, BoolExp op2) { _operand1 = op1; _operand2 = op2; }
      public boolean Evaluate(Context c) {
        return _operand1.Evaluate(c) && _operand2.Evaluate(c);
      }
    }
    _operand1: {Constant}   _operand2: {OrExp}
    public class OrExp extends BoolExp {
      private BoolExp _operand1;
      private BoolExp _operand2;
      public OrExp(BoolExp op1, BoolExp op2) { _operand1 = op1; _operand2 = op2; }
      public boolean Evaluate(Context c) {
        return _operand1.Evaluate(c) || _operand2.Evaluate(c);
      }
    }
    _operand1: {VarExp}   _operand2: {VarExp}

Class information: applications • Compilers: can we devirtualize a virtual function call x.m()/x->m()? • Software engineering • The calling relations in the program: call graph • Testing • Most interesting analyses require this information

Some terminology • Intraprocedural analysis • So far, we assumed there are no procedure calls! • Analysis that works within a procedure and approximates (or does not need) flow into and from procedures • Interprocedural analysis • Takes into account procedure calls and tracks flow into and from procedures • Many issues: • Parameter passing mechanisms • Context • Call graph! • Functions as parameters! • We will get back to this in a few classes…

Scalability • For most analyses (including class analysis) we need interprocedural analysis on very large programs • Can the analysis handle large programs? • 100K LOC, up to 45M LOC? • Approximations of standard fixed point iteration • Reduce Lattice • Reduce CFG • Make transfer functions converge faster • Other…

Today’s class • Class analysis: Given a reference variable x, what are the classes of the objects that x refers to at runtime? • Class Hierarchy Analysis (CHA) • Rapid Type Analysis (RTA) • Call graphs

Class Hierarchy Analysis (CHA) • The simplest method of inferring information about reference variables • Look at the class hierarchy • In Java, if a reference variable r has declared type A, the possible classes of run-time objects are included in the subtree of the hierarchy rooted at A, denoted cone(A) • At a virtual call site r.m(), find the methods that may be called based on the hierarchy information • Reference: J. Dean, D. Grove, and C. Chambers, Optimization of OO Programs Using Static Class Hierarchy Analysis, ECOOP’95
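As an illustration of cone(A), here is a minimal Java sketch that collects a class together with all of its transitive subclasses. The class name ConeSketch and the child-to-parent map representation are assumptions made for this example and do not come from the slides. The sample hierarchy uses the class names from the example that follows and assumes that G extends B, which the flattened figure suggests but does not state.

```java
import java.util.*;

// Hypothetical helper: the hierarchy is given as a child -> parent map.
class ConeSketch {
    // Returns cone(root): root plus all of its transitive subclasses.
    static Set<String> cone(Map<String, String> superclassOf, String root) {
        // Invert the child -> parent map into parent -> children.
        Map<String, List<String>> subclasses = new HashMap<>();
        for (Map.Entry<String, String> e : superclassOf.entrySet()) {
            subclasses.computeIfAbsent(e.getValue(), k -> new ArrayList<>()).add(e.getKey());
        }
        // Depth-first walk down the subtree rooted at root.
        Set<String> result = new TreeSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(root);
        while (!work.isEmpty()) {
            String c = work.pop();
            if (result.add(c)) {
                for (String sub : subclasses.getOrDefault(c, List.of())) {
                    work.push(sub);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed hierarchy: B and C extend A, G extends B, D and E extend C.
        Map<String, String> h = Map.of("B", "A", "C", "A", "G", "B", "D", "C", "E", "C");
        System.out.println(cone(h, "A")); // prints [A, B, C, D, E, G]
        System.out.println(cone(h, "C")); // prints [C, D, E]
    }
}
```

Under this assumed hierarchy, a variable declared with type A gets the full set of six classes, while a variable declared with type D gets only {D}.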

Example
[Figure: class hierarchy over A, B, C, D, E, and G; A, B, C, and G each declare f()]
    public class A {
      public static void main() {
        A a;
        D d = new D();
        E e = new E();
        if (…) a = d; else a = e;
        a.f();
      }
      …
    }
    public class B extends A {
      public void foo() {
        G g = new G();
        …
      }
      …
    }
    // there are no other creation sites or calls in the program

Example (continued) [Figure: the same class hierarchy, with Cone(C) marked] • Same program as on the previous slide • The solution for reference variables by CHA is: a may refer to objects of classes {A,B,C,D,E,G}, d may refer to objects of class {D}, e may refer to objects of class {E}, and g to {G}.

Example (continued) [Figure: the same class hierarchy, next to a call graph rooted at main] • Same program as on the previous slides • Possible targets of the call a.f() in main according to CHA: A.f, B.f, C.f, G.f

Example: Applies-to Sets [Figure: the same class hierarchy, next to a call graph rooted at main] • Same program as on the previous slides • Possible targets of the call a.f() in main according to CHA: A.f, B.f, C.f, G.f • Applies-to sets: A.f = {A}; B.f = {B}; G.f = {G}; C.f = {C,D,E}

Observations on CHA • Do we need to resolve the class of the receiver uniquely in order to devirtualize a call? • Applies-to set for each method • At a call site r.f(), take the set of possible classes for the receiver r; intersect this set with each possible method’s applies-to set. • If only one method’s set has a non-empty intersection, then invoke the method directly. • Otherwise, the call cannot be resolved.
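The applies-to check described above can be sketched in Java as follows. The class AppliesToSketch and the method resolve are hypothetical names chosen for this illustration. The applies-to sets in main are the ones given on the example slide.

```java
import java.util.*;

class AppliesToSketch {
    // Returns the single method whose applies-to set intersects the receiver's
    // possible classes, or Optional.empty() if zero or several methods qualify.
    static Optional<String> resolve(Set<String> receiverClasses,
                                    Map<String, Set<String>> appliesTo) {
        String unique = null;
        for (Map.Entry<String, Set<String>> e : appliesTo.entrySet()) {
            // Collections.disjoint is true when the two sets share no element.
            if (!Collections.disjoint(receiverClasses, e.getValue())) {
                if (unique != null) {
                    return Optional.empty(); // a second candidate: cannot devirtualize
                }
                unique = e.getKey();
            }
        }
        return Optional.ofNullable(unique);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> appliesTo = Map.of(
            "A.f", Set.of("A"),
            "B.f", Set.of("B"),
            "G.f", Set.of("G"),
            "C.f", Set.of("C", "D", "E"));
        // The receiver may be a D or an E: both inherit C.f, so the call resolves.
        System.out.println(resolve(Set.of("D", "E"), appliesTo)); // prints Optional[C.f]
        // The receiver may be any class in cone(A): four candidates, unresolved.
        System.out.println(resolve(Set.of("A", "B", "C", "D", "E", "G"), appliesTo)); // prints Optional.empty
    }
}
```

The first call shows why the receiver's class does not have to be unique: {D, E} contains two classes, but only C.f applies to them.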

Rapid Type Analysis • Improves on Class Hierarchy Analysis • Interleaves construction of the call graph with the analysis (known as on-the-fly call graph construction) • Only expands a call if it has seen an instantiated object of an appropriate type • Assumes that the whole program is available! • Reference: David Bacon and Peter Sweeney, “Fast Static Analysis of C++ Virtual Function Calls”, OOPSLA ’96

Example
    public class A {
      public static void main() {
        A a;
        D d = new D();
        E e = new E();
        if (…) a = d; else a = e;
        a.f();
      }
      …
    }
    public class B extends A {
      public void foo() {
        G g = new G();
        …
      }
      …
    }
    // there are no other creation sites or calls in the program
• RTA starts in main • Sees that D and E are instantiated • Expands a.f() into C.f() only • Never reaches B.foo() and never sees G instantiated
[Figure: call graph over main, A.f, B.f, C.f, G.f]

RTA • Keeps two sets, I (the set of instantiated classes) and R (the set of reachable methods) • Starts from main, with I = {} and R = {main} • Analyze calls in reachable methods: r.f() • Finds potential targets according to CHA: X.f, Y.f, etc. • If Applies-to(X.f) intersects with I, make X.f a real target, and add X.f to R • Analyze instantiation sites in reachable methods: r = new A() • Add A to I • Find all analyzed calls r.f() with potential targets X.f triggered by A (i.e., A in Applies-to(X.f) at r.f()). Make X.f a real target, and add X.f to R.
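The two-set scheme above can be sketched in Java as follows. This is a simplified illustration, not the algorithm as published. The names RtaSketch, Call, MethodBody, and exampleProgram are hypothetical. Each method body is reduced to its instantiation sites and its virtual call sites, and the CHA potential targets with their applies-to sets are assumed to be precomputed. Methods whose bodies the slides elide ("…") are treated as empty, which is also an assumption.

```java
import java.util.*;

class RtaSketch {
    /** A virtual call site: each CHA potential target mapped to its applies-to set. */
    static final class Call {
        final Map<String, Set<String>> potentialTargets;
        Call(Map<String, Set<String>> potentialTargets) {
            this.potentialTargets = potentialTargets;
        }
    }

    /** A method body reduced to what RTA inspects. */
    static final class MethodBody {
        final List<String> instantiated; // classes created with "new" in this method
        final List<Call> calls;          // virtual call sites in this method
        MethodBody(List<String> instantiated, List<Call> calls) {
            this.instantiated = instantiated;
            this.calls = calls;
        }
    }

    static Set<String> reachable(Map<String, MethodBody> program, String entry) {
        Set<String> instantiated = new HashSet<>();    // the set I
        Set<String> reachable = new LinkedHashSet<>(); // the set R
        List<Call> analyzed = new ArrayList<>();       // call sites seen so far
        Deque<String> worklist = new ArrayDeque<>();
        reachable.add(entry);
        worklist.add(entry);
        while (!worklist.isEmpty()) {
            MethodBody body = program.get(worklist.poll());
            if (body == null) continue; // body elided in the example: nothing to analyze
            for (Call call : body.calls) {
                analyzed.add(call);
                expand(call, instantiated, reachable, worklist);
            }
            for (String cls : body.instantiated) {
                // A newly instantiated class may trigger targets of earlier calls.
                if (instantiated.add(cls)) {
                    for (Call call : analyzed) {
                        expand(call, instantiated, reachable, worklist);
                    }
                }
            }
        }
        return reachable;
    }

    // Makes X.f a real target when Applies-to(X.f) intersects I.
    private static void expand(Call call, Set<String> instantiated,
                               Set<String> reachable, Deque<String> worklist) {
        for (Map.Entry<String, Set<String>> e : call.potentialTargets.entrySet()) {
            if (!Collections.disjoint(e.getValue(), instantiated) && reachable.add(e.getKey())) {
                worklist.add(e.getKey());
            }
        }
    }

    // The example program from the slides, with applies-to sets as given there.
    static Map<String, MethodBody> exampleProgram() {
        Call aDotF = new Call(Map.of(
            "A.f", Set.of("A"), "B.f", Set.of("B"),
            "C.f", Set.of("C", "D", "E"), "G.f", Set.of("G")));
        Map<String, MethodBody> program = new HashMap<>();
        program.put("main", new MethodBody(List.of("D", "E"), List.of(aDotF)));
        program.put("B.foo", new MethodBody(List.of("G"), List.of()));
        return program;
    }

    public static void main(String[] args) {
        System.out.println(reachable(exampleProgram(), "main")); // prints [main, C.f]
    }
}
```

B.foo is present in the program map but is never put on the worklist, so G never enters I and G.f is never expanded.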

Example (continued) • Same program as on the previous example slide • [Figure: call graph from main to the potential targets of a.f(), each labeled with its applies-to set: A.f {A}, B.f {B}, C.f {C,D,E}, G.f {G}]

Comparisons (Bacon-Sweeney, OOPSLA’96)
    class A {
    public:
      virtual int foo() { return 1; }
    };
    class B : public A {
    public:
      virtual int foo() { return 2; }
      virtual int foo(int i) { return i + 1; }
    };
    int main() {
      B* p = new B;
      int result1 = p->foo(1);
      int result2 = p->foo();
      A* q = p;
      int result3 = q->foo();
    }
• CHA resolves the result2 call uniquely to B::foo(); however, it does not resolve the result3 call, because q has static type A* and both A::foo() and B::foo() are possible targets • RTA resolves the result3 call uniquely because only B has been instantiated

Type Safety Limitations • CHA and RTA assume type safety of the code they examine!
[Figure: class A declares foo(); its subclass B declares foo() and foo(int)]
    // #1
    void* x = (void*) new B;
    B* q = (B*) x;             // a safe downcast
    int case1 = q->foo();
    // #2
    void* x = (void*) new A;
    B* q = (B*) x;             // an unsafe downcast
    int case2 = q->foo();      // probably no error
    // #3
    void* x = (void*) new A;
    B* q = (B*) x;             // an unsafe downcast
    int case3 = q->foo(66);    // run-time error

Call Graphs • Class analysis: Given a reference variable x, what are the classes of the objects that x refers to at runtime? • We saw CHA and RTA • Deal with polymorphic/virtual calls: x.m() • Compilers: can we devirtualize a virtual call x.m()? • Software engineering • Construct the call graph of the program

BoolExp Example Call Graph • Note: constructors not shown • [Figure: call graph rooted at main. Call sites in main: theContext.Assign (twice) and exp.Evaluate. Method nodes: Context.Assign, AndExp.Evaluate, OrExp.Evaluate, Constant.Evaluate, VarExp.Evaluate. AndExp.Evaluate and OrExp.Evaluate each contain the call sites _operand1.Evaluate and _operand2.Evaluate]

Constructing the call graph: A General Reachability Model • Your project - a series of class analyses for Java • CHA, RTA, etc. • Constructing the call graph using CHA • Function dispatch: the effect of run-time virtual dispatch • As a reachability computation starting from main() • Constructing the call graph using RTA • Minor, but important changes from CHA

dispatch
    dispatch(call_site s, receiver_class rc)
      sig = signature_of_static_target(s)
      ret = return_type_of_static_target(s)
      c = rc;
      while (c != null) {
        if class c contains a method m with signature sig and return type ret
          return m;
        c = superclass(c)
      }
      print "error: this should be unreachable!!!"
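For comparison, the same superclass walk can be sketched over real Java classes using reflection. This is only an illustration of the lookup, not how an analysis such as the course project would represent classes. DispatchSketch and the nested classes A, B, and C are hypothetical. The reflection calls getDeclaredMethod, getReturnType, and getSuperclass are standard java.lang.Class and java.lang.reflect.Method APIs.

```java
import java.lang.reflect.Method;

class DispatchSketch {
    // Hypothetical hierarchy: C inherits foo() from B, which overrides A.foo().
    static class A { public int foo() { return 1; } }
    static class B extends A { @Override public int foo() { return 2; } }
    static class C extends B { }

    // Walks from the receiver class up the superclass chain and returns the
    // first method with the given name, parameter types, and return type.
    static Method dispatch(String name, Class<?>[] paramTypes,
                           Class<?> returnType, Class<?> receiverClass) {
        for (Class<?> c = receiverClass; c != null; c = c.getSuperclass()) {
            try {
                Method m = c.getDeclaredMethod(name, paramTypes);
                if (m.getReturnType().equals(returnType)) {
                    return m;
                }
            } catch (NoSuchMethodException e) {
                // Not declared in c: keep looking in the superclass.
            }
        }
        throw new IllegalStateException("error: this should be unreachable");
    }

    public static void main(String[] args) {
        Method m = dispatch("foo", new Class<?>[0], int.class, C.class);
        System.out.println(m.getDeclaringClass().getSimpleName()); // prints B
    }
}
```

A receiver of class C dispatches to B.foo because C declares no foo() of its own and B is the nearest superclass that does.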

Reachability Computation
    Queue worklist
    CallGraph Graph
    worklist.addAtTail(main())
    Graph.addNode(main())
    while (worklist.notEmpty()) {
      m = worklist.getFromHead();
      process_method_body(m);
    }

process_method_body(method m)
      for each call site s inside m
        if s is a static call, a constructor call or a supercall
          add_edge(s)
        if s is a virtual call x.n(…) {
          rcv_class = type_of(x)
          for each non-abstract class c in cone(rcv_class) {
            n’ = dispatch(s,c);
            add_edge(s,n’);
          }
        }

add_edge
    add_edge(call_site s, run_time_target n’)   // for virtual calls
      m = encl_method(s);
      if n’ is not in Graph {
        Graph.addNode(n’);
        worklist.addAtTail(n’);
      }
      Graph.addEdge(m,n’,s);   // an edge from m to n’ labeled with s
    add_edge(call_site s)   // for static calls, constructor calls and super calls
      // same…

Example
    class A {
      void m() {}
      void n() {}
      static void main(…) {
        B b = new B();
        b.m();   // c1
        A a = b;
        a.m();   // c2
      }
    }
    class B extends A {
      void m() {
        A x = new A();
        x.n();   // c3
      }
    }
    class C extends B {
      void m() {}
      void n() {}
    }
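The worklist computation from the preceding slides can be tried on this example with the following Java sketch. It is a simplified model under stated assumptions: ChaCallGraphSketch, CallSite, and example are hypothetical names, only the virtual call sites c1, c2, and c3 are modeled (constructor calls are left out), methods are identified by name only, and the non-abstract filter is omitted because the example has no abstract classes.

```java
import java.util.*;

class ChaCallGraphSketch {
    static final class CallSite {
        final String label, receiverType, method;
        CallSite(String label, String receiverType, String method) {
            this.label = label; this.receiverType = receiverType; this.method = method;
        }
    }

    final Map<String, String> superclassOf = new HashMap<>();   // child -> parent
    final Map<String, Set<String>> declared = new TreeMap<>();  // class -> declared methods
    final Map<String, List<CallSite>> bodies = new HashMap<>(); // "Class.method" -> call sites
    final Set<String> nodes = new LinkedHashSet<>();            // Graph nodes
    final Set<String> edges = new TreeSet<>();                  // "caller -> target [site]"
    final Deque<String> worklist = new ArrayDeque<>();

    // cone(root): every class whose superclass chain reaches root.
    Set<String> cone(String root) {
        Set<String> result = new TreeSet<>();
        for (String c : declared.keySet()) {
            for (String k = c; k != null; k = superclassOf.get(k)) {
                if (k.equals(root)) { result.add(c); break; }
            }
        }
        return result;
    }

    // Walks up from the receiver class to the first class declaring the method.
    String dispatch(String method, String receiverClass) {
        for (String c = receiverClass; c != null; c = superclassOf.get(c)) {
            if (declared.getOrDefault(c, Set.of()).contains(method)) return c + "." + method;
        }
        throw new IllegalStateException("error: this should be unreachable");
    }

    void addEdge(String caller, CallSite s, String target) {
        if (nodes.add(target)) worklist.addLast(target); // first time seen: schedule it
        edges.add(caller + " -> " + target + " [" + s.label + "]");
    }

    void build(String entry) {
        nodes.add(entry);
        worklist.addLast(entry);
        while (!worklist.isEmpty()) {
            String m = worklist.removeFirst();
            for (CallSite s : bodies.getOrDefault(m, List.of())) {
                for (String c : cone(s.receiverType)) {
                    addEdge(m, s, dispatch(s.method, c));
                }
            }
        }
    }

    // Encodes the example: classes A, B, C and call sites c1, c2, c3.
    static ChaCallGraphSketch example() {
        ChaCallGraphSketch g = new ChaCallGraphSketch();
        g.superclassOf.put("B", "A");
        g.superclassOf.put("C", "B");
        g.declared.put("A", Set.of("m", "n", "main"));
        g.declared.put("B", Set.of("m"));
        g.declared.put("C", Set.of("m", "n"));
        g.bodies.put("A.main", List.of(new CallSite("c1", "B", "m"), new CallSite("c2", "A", "m")));
        g.bodies.put("B.m", List.of(new CallSite("c3", "A", "n")));
        g.build("A.main");
        return g;
    }

    public static void main(String[] args) {
        example().edges.forEach(System.out::println);
    }
}
```

In this model, c1 has two targets (B.m and C.m) because cone(B) is {B, C}, c2 has three (A.m, B.m, C.m), and c3 has two (A.n, which both A and B dispatch to, and C.n). CHA therefore reports every method in the example as reachable.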

BoolExp Hierarchy Example • Construct the call graph using CHA? • RTA?
