# Efficient Field-Sensitive Pointer Analysis for C

##### Presentation Transcript

1. Efficient Field-Sensitive Pointer Analysis for C
   - David J. Pearce, Paul H.J. Kelly and Chris Hankin
   - Imperial College, London, UK
   - d.pearce@doc.ic.ac.uk
   - www.doc.ic.ac.uk/~djp1/

2. What is Pointer Analysis?
   - Determine pointer targets without running the program
   - What is flow-insensitive pointer analysis?
   - One solution for all statements, so precision is lost
   - This trades precision for efficiency
   - This work considers flow-insensitive pointer analysis only

         int a,b,*p,*q = NULL;
         p = &a;
         if(…) q = p;    // p → {a,b}, q → {a,NULL}
         p = &b;

3. Pointer analysis via set-constraints
   - Generate set-constraints from program and solve them
   - Use constraint graph for efficient solving

         int a,b,c,*p,*q,*r;
         p = &a;
         r = &b;
         q = &c;
         if(...) q = p;
         else q = r;

   Panel: program.

4. Pointer analysis via set-constraints
   - Generate set-constraints from program and solve them
   - Use constraint graph for efficient solving

         int a,b,c,*p,*q,*r;
         p = &a;            // p ⊇ { a }
         r = &b;            // r ⊇ { b }
         q = &c;            // q ⊇ { c }
         if(...) q = p;     // q ⊇ p
         else q = r;        // q ⊇ r

   Panels: program, constraints.

5. Pointer analysis via set-constraints
   - Generate set-constraints from program and solve them
   - Use constraint graph for efficient solving

         int a,b,c,*p,*q,*r;
         p = &a;            // p ⊇ { a }
         r = &b;            // r ⊇ { b }
         q = &c;            // q ⊇ { c }
         if(...) q = p;     // q ⊇ p
         else q = r;        // q ⊇ r

   Panels: program, constraints, constraint graph with nodes p, q, r and initial solutions p = {a}, r = {b}, q = {c}.

6. Pointer analysis via set-constraints
   - Generate set-constraints from program and solve them
   - Use constraint graph for efficient solving

         int a,b,c,*p,*q,*r;
         p = &a;            // p ⊇ { a }
         r = &b;            // r ⊇ { b }
         q = &c;            // q ⊇ { c }
         if(...) q = p;     // q ⊇ p
         else q = r;        // q ⊇ r

   Panels: program, constraints, constraint graph with nodes p, q, r and solved sets p = {a}, r = {b}, q = {a,b,c}.
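
The propagation step on slides 3 to 6 can be sketched in C as a naive fixpoint over a small constraint graph. This is only an illustration under stated assumptions: the `Graph` type, the helper names and the bitset encoding of solution sets are hypothetical, and the simple iterate-until-stable loop stands in for whatever solving strategy the authors actually use.

```c
#include <assert.h>
#include <stdbool.h>

enum { P, Q, R, NVARS };        /* graph nodes: one per pointer variable */
enum { A = 1, B = 2, C = 4 };   /* one bit per pointed-to variable */

typedef struct {
    unsigned sol[NVARS];        /* current solution set of each node */
    bool edge[NVARS][NVARS];    /* edge[src][dst]: dst is a superset of src */
} Graph;

/* v = &target: seed the solution set of v. */
static void add_base(Graph *g, int v, unsigned targets) {
    assert(v >= 0 && v < NVARS);
    g->sol[v] |= targets;
}

/* dst = src: add an edge from src to dst. */
static void add_subset(Graph *g, int dst, int src) {
    assert(dst >= 0 && dst < NVARS && src >= 0 && src < NVARS);
    g->edge[src][dst] = true;
}

/* Propagate solutions along edges until nothing changes. */
static void solve(Graph *g) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (int src = 0; src < NVARS; src++) {
            for (int dst = 0; dst < NVARS; dst++) {
                if (!g->edge[src][dst]) continue;
                unsigned merged = g->sol[dst] | g->sol[src];
                if (merged != g->sol[dst]) {
                    g->sol[dst] = merged;
                    changed = true;
                }
            }
        }
    }
}

/* Constraints of the p, q, r example from the slides. */
static Graph solve_slide_example(void) {
    Graph g = {0};
    add_base(&g, P, A);     /* p = &a */
    add_base(&g, R, B);     /* r = &b */
    add_base(&g, Q, C);     /* q = &c */
    add_subset(&g, Q, P);   /* q = p  */
    add_subset(&g, Q, R);   /* q = r  */
    solve(&g);
    return g;
}
```

Calling `solve_slide_example()` leaves p and r with their seeded sets and gives q the union of all three, matching the final graph on slide 6.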

7. Field-Sensitivity
   - How to deal with aggregate types?
   - Standard approach treats them as single variables

         typedef struct { int *f1; int *f2; } t1;
         int a,b,*p,*q,*r;
         t1 x;
         p = &a;      // p ⊇ { a }
         q = &b;      // q ⊇ { b }
         x.f1 = p;    // x ⊇ p
         x.f2 = q;    // x ⊇ q
         r = x.f1;    // r ⊇ x

   Constraint graph: nodes p, q, x, r with initial solutions p = {a}, q = {b}, x = {}, r = {}.

8. Field-Sensitivity
   - How to deal with aggregate types?
   - Standard approach treats them as single variables

         typedef struct { int *f1; int *f2; } t1;
         int a,b,*p,*q,*r;
         t1 x;
         p = &a;      // p ⊇ { a }
         q = &b;      // q ⊇ { b }
         x.f1 = p;    // x ⊇ p
         x.f2 = q;    // x ⊇ q
         r = x.f1;    // r ⊇ x

   Constraint graph: nodes p, q, x, r with solved sets p = {a}, q = {b}, x = {a,b}, r = {a,b}.

9. Field-Sensitivity – A simple solution
   - Use a separate node per field for each aggregate
   - Node “x” split in two

         typedef struct { int *f1; int *f2; } t1;
         int a,b,*p,*q,*r;
         t1 x;
         p = &a;      // p ⊇ { a }
         q = &b;      // q ⊇ { b }
         x.f1 = p;    // xf1 ⊇ p
         x.f2 = q;    // xf2 ⊇ q
         r = x.f1;    // r ⊇ xf1

   Constraint graph: nodes p, q, xf1, xf2, r with initial solutions p = {a}, q = {b}, xf1 = {}, xf2 = {}, r = {}.

10. Field-Sensitivity – A simple solution
    - Use a separate node per field for each aggregate
    - Node “x” split in two

          typedef struct { int *f1; int *f2; } t1;
          int a,b,*p,*q,*r;
          t1 x;
          p = &a;      // p ⊇ { a }
          q = &b;      // q ⊇ { b }
          x.f1 = p;    // xf1 ⊇ p
          x.f2 = q;    // xf2 ⊇ q
          r = x.f1;    // r ⊇ xf1

    Constraint graph: nodes p, q, xf1, xf2, r with solved sets p = {a}, q = {b}, xf1 = {a}, xf2 = {b}, r = {a}.
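
To make the precision difference between slides 7 to 8 and slides 9 to 10 concrete, the following sketch solves the same program twice: once with `x` as a single node and once with one node per field. The `Graph` type, the function names and the bitset encoding are assumptions made for this illustration, not part of the presented system.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8
enum { A = 1, B = 2 };                  /* one bit per pointed-to variable */

typedef struct {
    int n;                              /* number of nodes in use */
    unsigned sol[MAX_NODES];            /* solution set of each node */
    bool edge[MAX_NODES][MAX_NODES];    /* edge[src][dst]: dst is a superset of src */
} Graph;

/* Record the constraint "dst is a superset of src". */
static void subset(Graph *g, int dst, int src) {
    assert(dst >= 0 && dst < g->n && src >= 0 && src < g->n);
    g->edge[src][dst] = true;
}

/* Naive fixpoint: propagate along edges until no set grows. */
static void solve(Graph *g) {
    bool changed = true;
    while (changed) {
        changed = false;
        for (int src = 0; src < g->n; src++) {
            for (int dst = 0; dst < g->n; dst++) {
                if (!g->edge[src][dst]) continue;
                unsigned merged = g->sol[dst] | g->sol[src];
                if (merged != g->sol[dst]) {
                    g->sol[dst] = merged;
                    changed = true;
                }
            }
        }
    }
}

/* Field-insensitive model: the whole struct x is one node. */
static unsigned r_field_insensitive(void) {
    enum { P, Q, R, X, N };
    Graph g = { .n = N };
    g.sol[P] = A;           /* p = &a    */
    g.sol[Q] = B;           /* q = &b    */
    subset(&g, X, P);       /* x.f1 = p  */
    subset(&g, X, Q);       /* x.f2 = q  */
    subset(&g, R, X);       /* r = x.f1  */
    solve(&g);
    return g.sol[R];
}

/* Field-sensitive model: x is split into nodes xf1 and xf2. */
static unsigned r_field_sensitive(void) {
    enum { P, Q, R, XF1, XF2, N };
    Graph g = { .n = N };
    g.sol[P] = A;           /* p = &a    */
    g.sol[Q] = B;           /* q = &b    */
    subset(&g, XF1, P);     /* x.f1 = p  */
    subset(&g, XF2, Q);     /* x.f2 = q  */
    subset(&g, R, XF1);     /* r = x.f1  */
    solve(&g);
    return g.sol[R];
}
```

With the single node, r ends up pointing to both a and b; with split nodes it points only to a, as on slide 10.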

11. Problem – can take address of field in C
    - System thus far has no mechanism for this
    - First idea – use string concatenation operator ||
    - Works well for this example

          typedef struct { int *f1; int *f2; } t1;
          int **p;
          t1 x,*s;
          s = &x;          // s ⊇ { x }
          p = &(s->f2);    // p ⊇ ?

    Constraint graph: nodes xf1 and xf2 (solutions elided as {..}).

12. Problem – can take address of field in C
    - System thus far has no mechanism for this
    - First idea – use string concatenation operator ||
    - Works well for this example

          typedef struct { int *f1; int *f2; } t1;
          int **p;
          t1 x,*s;
          s = &x;          // s ⊇ { x }
          p = &(s->f2);    // p ⊇ (*s) || f2

    Constraint graph: nodes xf1 and xf2 (solutions elided as {..}).

13. Problem – can take address of field in C
    - System thus far has no mechanism for this
    - First idea – use string concatenation operator ||
    - Works well for this example

          typedef struct { int *f1; int *f2; } t1;
          int **p;
          t1 x,*s;
          s = &x;          // s ⊇ { x }
          p = &(s->f2);    // p ⊇ (*s) || f2
                           // ⇒ p ⊇ { x } || f2
                           // ⇒ p ⊇ { xf2 }

    Constraint graph: nodes xf1 and xf2 (solutions elided as {..}).

14. Problem – compatible types
    - First idea – use string concatenation operator ||
    - Casting between types that are identical except for field names
    - Derivation is the same as before, but node xf2 no longer exists!

          typedef struct { int *f1; int *f2; } t1;
          typedef struct { int *f3; int *f4; } t2;
          int **p;
          t1 *s;
          t2 x;
          s = (t1*) &x;    // s ⊇ { x }
          p = &(s->f2);    // p ⊇ (*s) || f2

    Constraint graph: nodes xf3 and xf4 (solutions elided as {..}).

15. Problem – compatible types
    - First idea – use string concatenation operator ||
    - Casting between types that are identical except for field names
    - Derivation is the same as before, but node xf2 no longer exists!

          typedef struct { int *f1; int *f2; } t1;
          typedef struct { int *f3; int *f4; } t2;
          int **p;
          t1 *s;
          t2 x;
          s = (t1*) &x;    // s ⊇ { x }
          p = &(s->f2);    // p ⊇ (*s) || f2
                           // ⇒ p ⊇ { x } || f2
                           // ⇒ p ⊇ { xf2 }

    Constraint graph: nodes xf3 and xf4 (solutions elided as {..}).
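
The name-based idea from slides 11 to 15 can be imitated with a plain string lookup, which shows both why it works for the first example and why it breaks after the cast. The helper `find_field_node` and the node-name tables are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Look up the node named base || field; return its index or -1 if absent. */
static int find_field_node(const char *const nodes[], int n,
                           const char *base, const char *field) {
    char name[64];
    int len = snprintf(name, sizeof name, "%s%s", base, field);
    assert(len >= 0 && (size_t)len < sizeof name);
    for (int i = 0; i < n; i++) {
        if (strcmp(nodes[i], name) == 0) return i;
    }
    return -1;
}

/* x has type t1, so its field nodes are xf1 and xf2. */
static int lookup_same_type(void) {
    const char *const nodes[] = { "xf1", "xf2" };
    return find_field_node(nodes, 2, "x", "f2");
}

/* x has type t2 and is accessed through a t1*, so only xf3 and xf4 exist. */
static int lookup_after_cast(void) {
    const char *const nodes[] = { "xf3", "xf4" };
    return find_field_node(nodes, 2, "x", "f2");
}
```

The first lookup finds `xf2`; the second fails because concatenating `x` with the field name from `t1` produces a name that no node of the `t2` variable carries.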

16. Field-Sensitivity – Our Solution
    - Our solution – map variables to integers
    - Solution sets become integer sets
    - Use integer addition to model taking address of field
    - Address of aggregate modelled by address of its first field

          typedef struct { int *f1; int *f2; } t1;
          typedef struct { int *f3; int *f4; } t2;
          int **p;
          t1 *s;
          t2 x;
          s = (t1*) &x;    // s ⊇ { xf3 }
          p = &(s->f2);    // p ⊇ s + 1

    Constraint graph: nodes p, s, xf3, xf4 mapped to the integers 0 to 3.

17. Field-Sensitivity – Our Solution
    - Our solution – map variables to integers
    - Solution sets become integer sets
    - Use integer addition to model taking address of field
    - Address of aggregate modelled by address of its first field

          typedef struct { int *f1; int *f2; } t1;
          typedef struct { int *f3; int *f4; } t2;
          int **p;
          t1 *s;
          t2 x;
          s = (t1*) &x;    // s ⊇ { xf3 } ⇒ s ⊇ { 2 }
          p = &(s->f2);    // p ⊇ s + 1

    Constraint graph: nodes p, s, xf3, xf4 mapped to the integers 0 to 3, with xf3 = 2.

18. Field-Sensitivity – Our Solution
    - Our solution – map variables to integers
    - Solution sets become integer sets
    - Use integer addition to model taking address of field
    - Address of aggregate modelled by address of its first field

          typedef struct { int *f1; int *f2; } t1;
          typedef struct { int *f3; int *f4; } t2;
          int **p;
          t1 *s;
          t2 x;
          s = (t1*) &x;    // s ⊇ { xf3 } ⇒ s ⊇ { 2 }
          p = &(s->f2);    // p ⊇ s + 1
                           // ⇒ p ⊇ { 2 } + 1
                           // ⇒ p ⊇ { 3 }

    Constraint graph: nodes p, s, xf3, xf4 mapped to the integers 0 to 3, with xf3 = 2 and xf4 = 3.
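
The integer encoding on slides 16 to 18 can be sketched as follows. Solution sets are bitsets over variable indices, and a constraint of the form `dst ⊇ src + k` adds `k` to every member of the source set. The numbering of `p` and `s` as 0 and 1 is an assumption (the slides only fix xf3 = 2 and xf4 = 3), as are the `Constraint` type and the choice to discard indices that fall outside the variable range.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed numbering: the slides fix xf3 = 2 and xf4 = 3; p and s are guesses. */
enum { P, S, XF3, XF4, NVARS };

/* dst is a superset of { v + offset : v in src }. */
typedef struct { int dst, src, offset; } Constraint;

/* Naive fixpoint over offset constraints; sol[v] is a bitset of indices. */
static void solve(unsigned sol[NVARS], const Constraint *cs, int ncs) {
    const unsigned in_range = (1u << NVARS) - 1u;
    bool changed = true;
    while (changed) {
        changed = false;
        for (int i = 0; i < ncs; i++) {
            assert(cs[i].dst >= 0 && cs[i].dst < NVARS);
            assert(cs[i].src >= 0 && cs[i].src < NVARS);
            assert(cs[i].offset >= 0 && cs[i].offset < NVARS);
            /* Shifting left by k adds k to every member of the set.
               Members pushed past the last variable are dropped here;
               this sketch does not model how a real solver treats them. */
            unsigned shifted = (sol[cs[i].src] << cs[i].offset) & in_range;
            unsigned merged = sol[cs[i].dst] | shifted;
            if (merged != sol[cs[i].dst]) {
                sol[cs[i].dst] = merged;
                changed = true;
            }
        }
    }
}

/* s = (t1*) &x; p = &(s->f2); returns the solution set of p. */
static unsigned points_to_of_p(void) {
    unsigned sol[NVARS] = {0};
    sol[S] = 1u << XF3;                         /* s contains { xf3 } = { 2 } */
    const Constraint cs[] = { { P, S, 1 } };    /* p is a superset of s + 1   */
    solve(sol, cs, 1);
    return sol[P];
}
```

Because the field is identified by its position rather than its name, the cast from `t2` to `t1` no longer matters: adding 1 to the index of xf3 yields xf4.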

19. Experimental Study

20. Conclusion
    - Field-sensitive Pointer Analysis
    - Presented new technique for C language
    - Elegantly copes with language features
      - Taking address of field
      - Compatible types and casting
    - Technique also handles function pointers without modification
    - Experimental evaluation over 7 common C programs
      - Considerable improvements in precision obtained
      - But, much higher solving times
      - And, relative gains appear to diminish with larger benchmarks

21. Constraint Graphs (continued)
    - What about statements involving a pointer dereference?
    - Cannot be represented in the constraint graph
    - Instead, add edges as solution of q becomes known
    - Thus, computation similar to dynamic transitive closure

          int a,*r,*s,**p,**q;
          p = &r;    // p ⊇ { r }
          s = &a;    // s ⊇ { a }
          q = p;     // q ⊇ p
          *q = s;    // *q ⊇ s

    Panels: program, constraints, constraint graph with nodes p, s, q, r and solutions p = {r}, s = {a}, q = {}, r = {}.

22. Constraint Graphs (continued)
    - What about statements involving a pointer dereference?
    - Cannot be represented in the constraint graph
    - Instead, add edges as solution of q becomes known
    - Thus, computation similar to dynamic transitive closure

          int a,*r,*s,**p,**q;
          p = &r;    // p ⊇ { r }
          s = &a;    // s ⊇ { a }
          q = p;     // q ⊇ p
          *q = s;    // *q ⊇ s ⇒ r ⊇ s

    Panels: program, constraints, constraint graph with nodes p, s, q, r and solutions p = {r}, s = {a}, q = {r}, r = {}.

23. Constraint Graphs (continued)
    - What about statements involving a pointer dereference?
    - Cannot be represented in the constraint graph
    - Instead, add edges as solution of q becomes known
    - Thus, computation similar to dynamic transitive closure

          int a,*r,*s,**p,**q;
          p = &r;    // p ⊇ { r }
          s = &a;    // s ⊇ { a }
          q = p;     // q ⊇ p
          *q = s;    // *q ⊇ s ⇒ r ⊇ s

    Panels: program, constraints, constraint graph with nodes p, s, q, r and solutions p = {r}, s = {a}, q = {r}, r = {}.

24. Constraint Graphs (continued)
    - What about statements involving a pointer dereference?
    - Cannot be represented in the constraint graph
    - Instead, add edges as solution of q becomes known
    - Thus, computation similar to dynamic transitive closure

          int a,*r,*s,**p,**q;
          p = &r;    // p ⊇ { r }
          s = &a;    // s ⊇ { a }
          q = p;     // q ⊇ p
          *q = s;    // *q ⊇ s ⇒ r ⊇ s

    Panels: program, constraints, constraint graph with nodes p, s, q, r and solved sets p = {r}, s = {a}, q = {r}, r = {a}.
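
The way edges appear during solving on slides 21 to 24 can be sketched by keeping dereference constraints in a separate table and converting them into edges whenever the pointer's solution grows. The `Graph` layout, the `store` table and the helper names are assumptions for this illustration, and the loop is a naive fixpoint rather than an efficient dynamic transitive closure algorithm.

```c
#include <assert.h>
#include <stdbool.h>

enum { A, P, Q, R, S, NVARS };      /* every variable is a node and a possible target */

typedef struct {
    unsigned sol[NVARS];            /* bit v set: node may point to variable v */
    bool edge[NVARS][NVARS];        /* edge[src][dst]: dst is a superset of src */
    bool store[NVARS][NVARS];       /* store[ptr][src]: *ptr is a superset of src */
} Graph;

/* Record "*ptr = src", which cannot be drawn as an edge up front. */
static void add_store(Graph *g, int ptr, int src) {
    assert(ptr >= 0 && ptr < NVARS && src >= 0 && src < NVARS);
    g->store[ptr][src] = true;
}

static void solve(Graph *g) {
    bool changed = true;
    while (changed) {
        changed = false;
        /* For each store through ptr, add an edge src -> t for every
           target t currently known for ptr. */
        for (int ptr = 0; ptr < NVARS; ptr++)
            for (int src = 0; src < NVARS; src++) {
                if (!g->store[ptr][src]) continue;
                for (int t = 0; t < NVARS; t++)
                    if (((g->sol[ptr] >> t) & 1u) && !g->edge[src][t]) {
                        g->edge[src][t] = true;
                        changed = true;
                    }
            }
        /* Propagate solutions along all edges found so far. */
        for (int src = 0; src < NVARS; src++)
            for (int dst = 0; dst < NVARS; dst++) {
                if (!g->edge[src][dst]) continue;
                unsigned merged = g->sol[dst] | g->sol[src];
                if (merged != g->sol[dst]) {
                    g->sol[dst] = merged;
                    changed = true;
                }
            }
    }
}

/* Constraints of the p, q, r, s example from the slides. */
static Graph solve_deref_example(void) {
    Graph g = {0};
    g.sol[P] = 1u << R;       /* p = &r  */
    g.sol[S] = 1u << A;       /* s = &a  */
    g.edge[P][Q] = true;      /* q = p   */
    add_store(&g, Q, S);      /* *q = s  */
    solve(&g);
    return g;
}
```

Once q is known to point to r, the store constraint yields the edge from s to r, and r then receives a, matching slide 24.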