Assume/Guarantee Reasoning using Abstract Interpretation

Assume/Guarantee Reasoning using Abstract Interpretation. Nurit Dor, Tom Reps, Greta Yorsh, Mooly Sagiv. Limitations of whole program analysis: complexity of chaotic iterations; not all the source code is available; large libraries; software components; no interaction with the client.


Presentation Transcript


  1. Assume/Guarantee Reasoning using Abstract Interpretation • Nurit Dor, Tom Reps, Greta Yorsh, Mooly Sagiv

  2. Limitations of Whole Program Analysis • Complexity of Chaotic Iterations • Not all the source code is available • Large libraries • Software components • No interaction with the client • Program design

  3. A Motivating Example
     Contract:
       List rev(List x)
         requires acyclic(x)
         ensures $$ = reverse(x)
     List rev(List x) {
       if (x == null) return null;
       return append(rev(x->next), x);
     }
     Contract:
       List append(List x, List y)
         requires acyclic(x) ∧ acyclic(y)
         ensures $$ = x || y
     List append(List x, List y) {
       List e;
       if (x == null) return y;
       e = malloc(…);
       e->data = x->data;
       e->next = append(x->next, y);
       return e;
     }
     • Can also be used for runtime testing

  4. Challenges in A/G Reasoning • Specifying procedure contracts • Performing abstract interpretation using contracts

  5. Specifying Contracts • Executable specifications • assert • Can use loops • Expressive • Natural • But what about side-effects • Declarative specifications • Types • First order logic • Z • Hybrid • Larch • Java Modeling Language

  6. Procedure Contracts and Modularity • The postcondition does not reveal the whole story
     List foo(List x) requires acyclic(x) ∧ acyclic(y) ensures true
     void foo(List x, List z) { List y, t; y = rev(x); t = rev(z); }
     List rev(List x) requires acyclic(x) ensures $$ = reverse(x)

  7. Procedure Contracts and Modularity • Specify parts of the state which may be modified • But difficult to define potential side-effects • Can use abstract interpretation
     List foo(List x) requires acyclic(x) ∧ acyclic(y) ensures true
     void foo(List x, List z) { List y, t; y = rev(x); t = rev(z); }
     List rev(List x) requires acyclic(x) ensures $$ = reverse(x)

  8. Issues in Specifying Contracts • Expressible • Conciseness • Natural • Reuse • Cost of dynamic check (model checking) • Decidability • Cost of abstract interpretation

  9. Plan • CSSV: A tool for verifying absence of buffer overruns (N. Dor) • An algorithm for performing abstract interpretation in the most precise way using specification

  10. CSSV: Towards a Realistic Tool for Statically Detecting All Buffer Overflows in C • Nurit Dor, Michael Rodeh, Mooly Sagiv • DAEDALUS project

  11. Vulnerabilities of C programs • Null dereference • Dereference to unallocated storage • Out of bound pointer arithmetic • Out of bound update
      /* from web2c [strpascal.c] */
      void foo(char *s) {
        while ( *s != ' ' ) s++;
        *s = 0;
      }
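The loop in `foo` keeps advancing `s` until it sees a space, so a string without one is read (and then written) past its end. As a minimal sketch of what a defensive version might look like, the variant below takes the buffer size explicitly. The name `foo_bounded`, the parameter `n`, and the return convention are assumptions made for illustration; they are not part of web2c.

```c
#include <stddef.h>

/* Hypothetical bounded variant of foo: cuts s at the first space,
 * but never reads or writes outside s[0 .. n-1].
 * Returns 1 if a space was found and overwritten, 0 otherwise. */
int foo_bounded(char *s, size_t n)
{
    size_t i = 0;

    /* Stop at the end of the buffer or at the terminating null byte. */
    while (i < n && s[i] != '\0' && s[i] != ' ')
        i++;

    if (i < n && s[i] == ' ') {
        s[i] = 0;
        return 1;
    }
    return 0;
}
```

A caller with `char a[] = "ab cd";` would pass `sizeof a` as `n` and end up with the string `"ab"`; a string with no space is left unchanged.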

  12. Is it common? • General belief – yes! • FUZZ study • Test reliability by random input • Tens of applications on 9 different UNIX systems • 18% – 23% hang or crash • CERT advisory • Up to 50% of attacks are due to buffer overflow • COMMON AND DANGEROUS

  13. CSSV’s Goals • Efficient conservative static checking algorithm • Verify the absence of buffer overflow • not just finding bugs • All C constructs • Pointer arithmetic, casting, dynamic memory, … • Real programs • Minimum false alarms

  14. Verifying Absence of Buffer Overflow is non-trivial
      { string(src) ∧ string(dst) ∧ (size > len(src)+len(dst) ⇒ alloc(dst+len(dst)) > len(src)) }
      void safe_cat(char *dst, int size, char *src ) {
        if ( size > strlen(src) + strlen(dst) ) {
          { string(src) ∧ string(dst) ∧ alloc(dst+len(dst)) > len(src) }
          dst = dst + strlen(dst);
          { string(src) ∧ alloc(dst) > len(src) }
          strcpy(dst, src);
        }
      }

  15. Can this be done for real programs? • Complex linear relationships: use polyhedra [CH78] • Pointer arithmetic: pointer analysis • Loops: widening • Procedures: procedure contracts • Very few false alarms!

  16. Linear Relation Analysis • Cousot and Halbwachs, 78 • Statically analyze program variable relations: a1*var1 + a2*var2 + … + an*varn ≤ b • Polyhedron: y ≥ 1, x + y ≥ 3, −x + y ≤ 1 [Figure: the polyhedron in the x-y plane (axes 0 to 3), generated by vertices V = { (1,2), (2,1) } and rays R = { (1,0), (1,1) }]
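The polyhedron on this slide has two descriptions: the constraints y ≥ 1, x + y ≥ 3, −x + y ≤ 1 (inequality signs restored so that they agree with the generators) and the generators V = { (1,2), (2,1) }, R = { (1,0), (1,1) }. The sketch below checks that the two descriptions are compatible. The function names are hypothetical, and this is only a membership test, not the polyhedra library used by the tool.

```c
#include <stdbool.h>

/* Constraint form of the slide's polyhedron. */
bool in_polyhedron(long x, long y)
{
    return y >= 1 && x + y >= 3 && -x + y <= 1;
}

/* A direction (dx, dy) is a ray of the polyhedron if moving along it
 * can never violate a constraint, i.e. it satisfies the homogeneous
 * version of each inequality (right-hand side replaced by 0). */
bool is_ray(long dx, long dy)
{
    return dy >= 0 && dx + dy >= 0 && -dx + dy <= 0;
}
```

Both vertices satisfy `in_polyhedron`, both rays satisfy `is_ray`, and any vertex plus a non-negative multiple of a ray, for example (1,2) + 5·(1,1) = (6,7), stays inside.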

  17. C String Static Verifier • Detects string violations • Buffer overflow (update beyond bounds) • Unsafe pointer arithmetic • References beyond null termination • Unsafe library calls • Handles full C • Multi-level pointers, pointer arithmetic, structures, casting, … • Applied to real programs • Public domain software • C code from Airbus

  18. Plan • Semantics for C program • Contract language • Static analysis algorithm • Implementation

  19. Standard C Semantics
      void safe_cat( char *dst, int size, char *src ) {
        if ( size > strlen(src) + strlen(dst) ) {
          dst = dst + strlen(dst);
          strcpy(dst, src);
        }
      }
      [Figure: memory snapshot. dst at 0x480580 holds 0x5058510; size at 0x480584 holds 125; src at 0x480588 holds 0x6000009; one buffer runs from 0x5050510 ('x') to 0x5050518 (0); another runs from 0x6000009 ('y') to 0x6000A00 (0)]

  20. Instrumented C Semantics [Figure: the memory snapshot of slide 19 with base and asize columns being added; visible values: 4, 4, 0x6000009, 4, 130, 245]

  21. Instrumented C Semantics [Figure: the memory snapshot of slide 19 with base, asize, and offset columns. dst: asize 4, offset 0; size: asize 4, value 125; src: asize 4, offset 9; buffers: asize 130, and base 0x6000000 with asize 245]

  22. Safety • The instrumented semantics checks validity of C expressions • ANSI C • Cleanness • dst = dst + i is valid when offset(dst) + i ≤ asize(base(dst)) [Figure: a buffer starting at base(dst) of size asize(base(dst)), with dst at offset(dst) and the step i marked]
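The safety condition for `dst = dst + i` can be made concrete with a small runtime model of an instrumented pointer. This is a sketch under assumptions: the `IPtr` struct and `iptr_add` are hypothetical names, and the lower bound `0 <= offset + i` is added here for completeness; the slide states only the upper bound.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instrumented pointer: a base address, the allocation
 * size of that base, and the current offset from the base. */
typedef struct {
    char  *base;
    size_t asize;
    long   offset;
} IPtr;

/* Models dst = dst + i. The update is performed only when
 * 0 <= offset + i <= asize (pointing one past the end is allowed);
 * otherwise the pointer is left unchanged and false is returned. */
bool iptr_add(IPtr *p, long i)
{
    long next = p->offset + i;

    if (next < 0 || (size_t)next > p->asize)
        return false;
    p->offset = next;
    return true;
}
```

For a 10-byte buffer, advancing by 10 from offset 0 is accepted, while one further step of 1 is rejected. A static analysis proves the same inequality for all executions instead of testing it on one.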

  23. Contracts • Defined in the instrumented semantics • Specify string behavior of procedures (C expressions) • Precondition • Postcondition • Use of values at procedure entry • Side-effects • Can be approximated from pointer information • No need to specify pointer information • Not aiming for modular pointer analysis

  24. Contracts’ Advantages • Modular analysis • Use contracts on call statements • Not all the code is available • Enable more expensive analyses • User control of the verification • Detect errors at point of logical error • Improve the precision of the analysis • Check additional properties • Beyond ANSI-C

  25. Example
      char* strcpy(char* dst, char* src)
        requires ( string(src) ∧ alloc(dst) > len(src) )
        mod dst
        ensures ( len(dst) == [len(src)]pre ∧ return == [dst]pre )

  26. safe_cat's contract
      void safe_cat(char* dst, int size, char* src)
        requires ( string(src) ∧ string(dst) ∧ alloc(dst) == size )
        mod dst
        ensures ( len(dst) <= [len(src)]pre + [len(dst)]pre ∧ len(dst) >= [len(dst)]pre )

  27. Contracts and Soundness • All errors are detected • Violation of statement’s precondition • …a[i]… • Violation of procedure’s precondition • Call • Violation of procedure's postcondition • Return • Violation messages depend on the contracts • But may lead to more false alarms (e.g., trivial contracts)

  28. CSSV Static Analysis • Inline contracts • Expose behavior of called procedures • Pointer analysis (global) • Find relationship between base addresses • Integer analysis • Compute offset information

  29. Step 1: Inliner
      void safe_cat( char *dst, int size, char *src )
        requires ( string(src) ∧ string(dst) ∧ alloc(dst) == size )
        mod dst
        ensures ( len(dst) == [len(src)]pre + [len(dst)]pre )
      void safe_cat( char *dst, int size, char *src ) { … strcpy(dst, src); … }
      char* strcpy( char *dst, char *src )
        requires ( string(src) ∧ alloc(dst) > len(src) )
        mod dst
        ensures ( len(dst) == [len(src)]pre ∧ return == [dst]pre )

  30. Step 1: Inliner (contracts and code as on slide 29, with the labels "assume" and "assert" added)

  31. Step 1: Inliner (contracts and code as on slide 29, with the labels "assert" and "assume" added)
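One way to picture the inliner's output is to write the contracts into the body of `safe_cat` as explicit checks. The sketch below is a runtime rendering under several assumptions, not the tool's actual output: `safe_cat_inlined` and the extra parameter `alloc_dst` (standing in for alloc(dst), which plain C cannot observe) are hypothetical, string(src) and string(dst) are taken on trust, an assumed condition that fails makes the function return false, and the final check uses the inequality form of the postcondition from slide 26.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical runtime rendering of safe_cat with contracts inlined.
 * Returns false when the assumed precondition does not hold, so such
 * executions are simply not considered. */
bool safe_cat_inlined(char *dst, size_t alloc_dst, int size, const char *src)
{
    /* assume safe_cat's precondition: alloc(dst) == size */
    if (size < 0 || alloc_dst != (size_t)size)
        return false;

    const size_t pre_len_src = strlen(src);   /* [len(src)]pre */
    const size_t pre_len_dst = strlen(dst);   /* [len(dst)]pre */
    char *const  pre_dst     = dst;           /* [dst]pre      */

    if ((size_t)size > pre_len_src + pre_len_dst) {
        dst = dst + strlen(dst);

        /* assert strcpy's precondition: alloc(dst) > len(src),
         * where alloc(dst) is what remains after moving dst */
        assert(alloc_dst - pre_len_dst > pre_len_src);

        char *ret = strcpy(dst, src);

        /* strcpy's postcondition: the analysis would assume it,
         * a runtime sketch can only check it */
        assert(strlen(dst) == pre_len_src && ret == dst);
    }

    /* assert safe_cat's postcondition on the caller's view of dst */
    assert(strlen(pre_dst) <= pre_len_src + pre_len_dst);
    assert(strlen(pre_dst) >= pre_len_dst);
    return true;
}
```

With `char buf[8] = "ab";`, the call `safe_cat_inlined(buf, sizeof buf, 8, "cd")` appends and leaves `"abcd"`; a second call with `"wxyz"` does not fit and leaves the buffer unchanged, which the inequality postcondition permits.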

  32. Step 2: Compute Pointer Information • Required for reasoning about pointers • Every base address is abstracted by an abstract location • Relationships between base addresses are computed (points-to) • Global analysis • Scalable • Imprecise • Flow insensitive • (Almost) Context insensitive

  33. Global Points-To
      safe_cat( char *dst, int size, char *src ) { … strcpy(dst, src); … }
      main() {
        char s[10], t[20],r;
        char *p1, *p2;
        …
        p1 = r + i; safe_cat(s,10,p1);
        p2 = r + j; safe_cat(t,10,p2);
        …
      }
      [Figure: points-to graph over dst, src, s, t, r, p1, p2]

  34. Procedural Points-to (PPT) • “Project” pointer information on visible variables of the procedure • Introduce abstract locations for formal parameters • Allow destructive updates through formal parameters (well behaved programs) • Can decrease precision in some procedures

  35. PPT
      safe_cat( char *dst, int size, char *src ) { … strcpy(dst, src); … }
      [Figure: procedural points-to graph with dst, src, Param #1, Param #2]

  36. Step 3: Static Analysis • Prove linear inequalities on string indices • Abstract string properties using constraint variables • Use abstract interpretation to conservatively interpret program statements • Verify safety preconditions

  37. Back to Semantics [Figure: the instrumented memory snapshot of slide 21, with base, asize, and offset columns]

  38. Abstract Representation [Figure: the memory snapshot of slide 19 next to its abstraction, with abstract locations dst, size, src, n1, n2 and the base address relationship between them]

  39. Constraint Variables • For every abstract location a: a.offset • Example: src.offset = 9

  40. Constraint Variables • For every integer abstract location a: a.val • Example: size.val = 125

  41. Constraint Variables • For every abstract location a: a.is_nullt, a.len, a.asize • Example: n1.len, n1.asize [Figure: buffer n1 with its terminating 0]

  42. Abstract Representation • dst.offset < n1.len • size.val + dst.offset = n1.asize • n1.is_nullt = true • n2.is_nullt = true [Figure: abstract locations dst, size, src, n1, n2]

  43. What does it represent? • n1.is_nullt = true • dst.offset < n1.len • size.val + dst.offset = n1.asize [Figure: a buffer with dst.offset, n1.len, n1.asize, size.val, and the terminating 0 marked; the concrete values of dst and size are shown as "?"]

  44. Abstract Interpretation • Before: dst.offset < n1.len, size.val = n1.asize - dst.offset • Statement: dst = dst + strlen(dst); • After: dst.offset = n1.len, size.val = n1.asize - dst.offset + n1.len

  45. Verify Safety Condition • dst = dst + i • Concrete semantics: offset(dst) + i ≤ asize(base(dst)) • Abstract semantics: dst.offset + i.val ≤ n1.asize [Figure: the concrete buffer (base(dst), offset(dst), i, asize(base(dst))) next to its abstraction (n1, dst.offset, i, n1.asize)]

  46. The Assume-Operation • Use two copies of constraint variables • Set modified values to ⊤ • Meet the post
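The three steps of the assume-operation can be sketched on a much simpler domain than the polyhedra the slides use. The code below swaps in intervals (an assumption made to keep the example short) and applies strcpy's postcondition len(dst) == [len(src)]pre. The `Interval` and `State` types and the function names are hypothetical.

```c
#include <limits.h>

/* Interval abstract value; empty when lo > hi. */
typedef struct { long lo, hi; } Interval;

static const Interval TOP = { LONG_MIN, LONG_MAX };

/* Meet = intersection of two intervals. */
Interval meet(Interval a, Interval b)
{
    Interval r = { a.lo > b.lo ? a.lo : b.lo,
                   a.hi < b.hi ? a.hi : b.hi };
    return r;
}

/* Abstract state holding only the two lengths in strcpy's contract. */
typedef struct { Interval len_dst, len_src; } State;

/* Assume strcpy's postcondition after a call:
 *   1. keep a second copy of the variables (values at entry),
 *   2. set the modified value (mod dst) to TOP,
 *   3. meet with the postcondition len(dst) == [len(src)]pre. */
State assume_strcpy_post(State s)
{
    State pre = s;                               /* step 1 */
    s.len_dst = TOP;                             /* step 2 */
    s.len_dst = meet(s.len_dst, pre.len_src);    /* step 3 */
    return s;
}
```

Starting from len(dst) in [0, 50] and len(src) in [3, 7], the result has len(dst) in [3, 7]. Intervals keep only the range and lose the equality between the two lengths, which is one reason a relational domain such as polyhedra is more precise here.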

  47. CSSV Implementation [Figure: pipeline diagram. Inputs: C files, procedure name, contracts (Pre, Mod, Post). Components: Pointer Analysis (produces the procedure's pointer info), Inliner (produces C' files), C2IP (produces an integer procedure), Integer Analysis. Output: potential error messages]

  48. Used Software • ASToolKit [Microsoft] • Core C [TAU - Greta Yorsh] • GOLF [Microsoft - Manuvir Das] • New Polka [Inria - Bertrand Jeannet]

  49. Applications • Verified string library from Airbus with 6 false alarms • Could be avoided by analyzing correlated conditions • Found 8 real errors in another string intensive application with 2 false alarms • In one case safety depends on correctness • Could be avoided by defensive programming • 1 - 206 CPU seconds per procedure • No optimizations • Very few false alarms

  50. Related Work • Non-conservative: Wagner et al. [NDSS'00]; LCLint's extension [USENIX'01]; Eau Claire [IEEE Oakland 02] • Conservative: Polyspace verifier; Dor, Rodeh and Sagiv [SAS'01]
