
CS527 Advanced Topics in Software Engineering Lecture 20, 1 Nov 2007 Lorinc Hever (lhever2)


Presentation Transcript


  1. CS527 Advanced Topics in Software Engineering, Lecture 20, 1 Nov 2007, Lorinc Hever (lhever2)

  2. Introduction
     • The paper: A System and Language for Building System-Specific, Static Analyses
     • PLDI 2002, Berlin, Germany
     • A static analysis system for the C language
     • Extensible rules
     • No source code annotation
     • Focuses on important bugs
     • The technology evolved into a commercial product, Coverity Prevent

  3. The Analysis Challenge: one simple example
     • Something easy to evaluate:
         x = 1;
         y = 1;
         assert(x < y);
     • But how can you evaluate the following code snippet?
         x = v;
         if (x < y) {
           y = v;
         }
         assert(x < y);
     • Going down the path where x < y is true (the facts known before each statement are in braces):
         x = v;            {no fact}
         if (x < y) {      {x=v}
           y = v;          {x=v, x<y}
         }
         assert(x < y);    {x=v, x<y, y=v}
       The assert fails, since the facts would require v < v.
     • Going down the path where x < y is false:
         x = v;            {no fact}
         if (x < y)        {x=v}
         assert(x < y);    {x=v, !(x<y)}
       The result is false again.
     Source: Secure Programming with Static Analysis, Brian Chess, Jacob West, Addison-Wesley Professional (June 29, 2007)
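The conclusion of the second snippet can be cross-checked by brute force. The sketch below is not the symbolic reasoning from the slide: it simply runs the snippet for every pair of small integers and collects the outcomes of the assert. The function name and the value range are arbitrary choices made for illustration.

```python
from itertools import product


def snippet_assert_holds(v: int, y: int) -> bool:
    """Run the slide's second snippet concretely and return the assert's outcome."""
    x = v
    if x < y:
        y = v
    return x < y


# Try every (v, y) pair in a small, arbitrarily chosen range.
outcomes = {snippet_assert_holds(v, y) for v, y in product(range(-5, 6), repeat=2)}
print(outcomes)  # {False}
```

Both branches agree with the slide: on the true path y becomes v, so x < y is v < v, and on the false path x < y was already false.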

  4. The system and language: xgcc + metal
     • xgcc: a modified gcc compiler front-end; it generates the abstract syntax tree (AST) from the source and drives the analysis
     • metal: a high-level language for writing checkers as state machines; interpreted by YACC?
     Source: http://metacomp.stanford.edu/osdi2000/node3.html
     • High-level diagram from the Coverity manual (labels: source, frontend, AST, analyzer, checker 1 ... checker n, report)

  5. Metal features, simplified
     • Global state: start
     • Variable-specific state: one for every variable in the program
     • Declaration: state decl any_pointer v
     • Example state: v.freed
     • Patterns: match source code actions, e.g. kfree(v), {*v}
     • Transitions: define a state change, e.g. v.freed: {*v} ==> v.stop
     • The complete checker:
         state decl any_pointer v;
         start: { kfree(v); } ==> v.freed ;
         v.freed: { *v } ==> v.stop, { err("err1"); }
                | { kfree(v) } ==> v.stop, { err("err2"); }
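The transitions of this checker can be mimicked with a small state machine. This is only a toy model under stated assumptions, not how xgcc executes metal: the class `FreeChecker` and its methods `on_kfree` and `on_deref` are hypothetical names, and source patterns are replaced by explicit method calls.

```python
from dataclasses import dataclass, field


@dataclass
class FreeChecker:
    """Toy model of the slide's free checker: one state per tracked pointer name."""
    state: dict = field(default_factory=dict)   # pointer name -> "freed" or "stop"
    errors: list = field(default_factory=list)  # (error id, pointer name)

    def on_kfree(self, v: str) -> None:
        if self.state.get(v) == "freed":
            # v.freed: { kfree(v) } ==> v.stop, err("err2")
            self.state[v] = "stop"
            self.errors.append(("err2", v))
        elif v not in self.state:
            # start: { kfree(v) } ==> v.freed
            self.state[v] = "freed"

    def on_deref(self, v: str) -> None:
        if self.state.get(v) == "freed":
            # v.freed: { *v } ==> v.stop, err("err1")
            self.state[v] = "stop"
            self.errors.append(("err1", v))


checker = FreeChecker()
checker.on_kfree("p")
checker.on_deref("p")   # use after free
checker.on_kfree("w")
checker.on_kfree("w")   # double free
print(checker.errors)   # [('err1', 'p'), ('err2', 'w')]
```

Once a pointer reaches the stop state, neither method matches it again, so each pointer is reported at most once.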

  6. Executing the free checker
         int contrived(int *p, int *w, int x) {
           int *q;
           if(x)
           {
             kfree(w);
             q = p;
             p = 0;
           }
           if(!x)
             return *w; // safe
           return *q; // using 'q' after free!
         }
         int contrived_caller (int *w, int x, int *p) {
           kfree (p);
           contrived (p, w, x);
           return *w; // using 'w' after free!
         }
     Trace (the state set in braces is the state before each step; "prune" marks the infeasible branch of if(!x) for that state):
         {}                                 contrived_caller(*p,*w,x)
         {}                                 kfree(p)
         {p.freed}                          contrived(p,w,x)
         {p.freed}                          if(x)
         {x!=0,p.freed}                     kfree(w)
         {x!=0,p.freed,w.freed}             q=p
         {x!=0,p.freed,w.freed,q.freed}     p=0
         {x==0,p.freed}                     prune
         {x!=0,w.freed,q.freed}             prune
         {x==0,p.freed}                     return *w
         {x!=0,w.freed,q.freed}             return *q
         {p.freed,w.freed}                  return *w

  7. Intraprocedural analysis
     • Concept: computes the final state within a single function
     • Traversal: extensions are applied to the AST nodes, which are visited in execution order with a depth-first search (DFS)
     • Transition: at every program point, the transition rules are checked for each tracked variable
     • Assumes the extension is deterministic: applying the extension to the same program in the same state produces the same result
     • Caching: a block is re-analyzed only if the incoming state changes an existing state or adds a new one
     • Based on the independence condition, all state tuples that reach a block are combined into a single set (the block summary)
     • Stop condition: meet over paths, i.e. when the block summary contains all the tuples that can reach that block along any control path
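The caching idea can be sketched as a depth-first traversal that skips a block when the incoming state is already in that block's summary. This is a minimal sketch under assumptions, not the xgcc algorithm: the control-flow graph is a plain dict, states are frozensets of facts, and `analyze` and `transfer` are hypothetical names.

```python
def analyze(cfg, transfer, entry, start_state):
    """DFS over a CFG. cfg maps block -> successor list; transfer(block, state) -> state."""
    summaries = {block: set() for block in cfg}  # block -> states already seen on entry
    stack = [(entry, start_state)]
    visits = 0
    while stack:
        block, state = stack.pop()
        if state in summaries[block]:
            continue  # cache hit: a deterministic extension would repeat earlier work
        summaries[block].add(state)
        visits += 1
        out = transfer(block, state)
        for succ in cfg[block]:
            stack.append((succ, out))
    return summaries, visits


# Diamond-shaped CFG: both branches reach "join" in the same state.
cfg = {"entry": ["b1", "b2"], "b1": ["join"], "b2": ["join"], "join": ["exit"], "exit": []}


def transfer(block, state):
    # Only the entry block changes the state in this toy example.
    return state | {"p.freed"} if block == "entry" else state


summaries, visits = analyze(cfg, transfer, "entry", frozenset())
print(visits)  # 5: "join" and "exit" are each analyzed once, not once per branch
```

The traversal terminates once no block receives a state it has not seen before, which corresponds to the stop condition on the slide.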

  8. Interprocedural analysis
     • Concept: carries state from the caller to the callee and back
     • Creates the control flow graph (CFG) and determines the entry points
     • Dynamic programming approach: the extension does not need a finite state space, and only the states that can actually be reached along a path are explored
     • Refine: when the algorithm follows a function call, a passed variable should remain in the same state inside the callee
     • Restore: when it returns from a function call, the state of the variable in the caller should reflect what the callee did to it
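Refine and restore can be pictured as two mappings between the caller's and the callee's variable names. The sketch below is a strong simplification under assumptions: states are dicts from variable name to state name, arguments are matched to parameters by position, and the functions `refine` and `restore` are hypothetical helpers, not the system's API.

```python
def refine(caller_state, actuals, formals):
    """Carry state from actual arguments onto the callee's formal parameters."""
    return {f: caller_state[a] for a, f in zip(actuals, formals) if a in caller_state}


def restore(caller_state, callee_state, actuals, formals):
    """Copy the callee's final state for each formal back onto the matching actual."""
    result = dict(caller_state)
    for a, f in zip(actuals, formals):
        if f in callee_state:
            result[a] = callee_state[f]
    return result


# The call contrived(p, w, x) from the free checker example, on the x != 0 path.
actuals = ["p", "w", "x"]
formals = ["p", "w", "x"]
caller_state = {"p": "freed"}

callee_entry = refine(caller_state, actuals, formals)
# Inside the callee: kfree(w) frees w, p = 0 kills p's state, and q is a local.
callee_exit = {"w": "freed"}
after_call = restore(caller_state, callee_exit, actuals, formals)
print(callee_entry, after_call)
```

In this run the caller ends up with both p and w in the freed state, which matches the final state set in the trace of the free checker example.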

  9. Increasing the accuracy (false positive suppression)
     • Killing variables: once a variable goes to the stop state, or is redefined as in p=0, it is removed from the list of tracked variables; but what about a[i]=0 when i is later redefined?
     • Synonyms: after p=q=malloc(...), every operation applies to both synonyms
     • False path pruning: value tracking plus a congruence closure algorithm
         • Tracks assignments and comparisons
         • Evaluates expressions along the way
         • After a loop, all loop variables go to unknown
         • Equal values go into the same equivalence class
         • Block summary entries are removed for the pruned path
     • Targeted suppression, handled explicitly in metal
     • History: remember false positives from the past and suppress them (keyed on file name, function name, variable name and the actual error, but not the line number)
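The effect of false path pruning can be shown with a much simpler stand-in for value tracking and congruence closure. The sketch below only remembers whether a variable was seen as true or false on the current path and drops branch combinations that contradict it; `feasible_paths` and its input encoding are assumptions made for illustration.

```python
def feasible_paths(conditions):
    """Enumerate branch choices for a sequence of (variable, negated) tests,
    dropping paths whose choices contradict what is already known."""
    paths = [({}, [])]  # (known truthiness per variable, branch choices so far)
    for var, negated in conditions:
        next_paths = []
        for known, taken in paths:
            for branch in (True, False):
                truth = branch != negated  # value var must have for this branch
                if known.get(var, truth) != truth:
                    continue  # contradicts an earlier branch: prune this path
                next_paths.append(({**known, var: truth}, taken + [branch]))
        paths = next_paths
    return [taken for _, taken in paths]


# contrived() from the free checker example: if (x) ... followed by if (!x) ...
paths = feasible_paths([("x", False), ("x", True)])
print(paths)  # [[True, False], [False, True]]
```

Of the four syntactic paths through the two tests, only two survive, which is why the trace of the free checker example reports *q but treats the first *w as safe.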

  10. Reporting (ranking)
     • Approach: first list the errors that are difficult to diagnose with testing
     • Generic ranking
         • Distance between the source and the sink
         • Number of conditionals: each counts as 10 lines of distance
         • Aliasing/synonyms
         • Local errors first
         • The more analysis steps, the more likely it is a false positive
         • High-density areas are less important
     • Z-statistic: count the violations (c) and the cases where the rule is kept (e); a rule that is almost always followed is probably correct, so its violations rank higher
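One common form of the z-statistic for proportions can be computed from the two counts on the slide. The formula and the baseline `p0 = 0.9` below are assumptions for illustration; the slide does not state which variant or baseline the tool uses.

```python
import math


def z_statistic(e: int, c: int, p0: float = 0.9) -> float:
    """z-score of the observed rule-following ratio e/n against an assumed baseline p0.

    e = times the rule was followed, c = times it was violated, n = e + c.
    """
    n = e + c
    if n == 0:
        raise ValueError("need at least one observation")
    return (e / n - p0) / math.sqrt(p0 * (1 - p0) / n)


# A rule followed 99 times out of 100 ranks above one followed 9 times out of 10.
strong = z_statistic(99, 1)
weak = z_statistic(9, 1)
print(round(strong, 3), round(weak, 3))
```

With these inputs the first rule scores about 3 and the second about 0, so the single violation of the first rule would be listed earlier.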

  11. Soundness, Performance
     • Unsound analysis tool: not all the defects it reports are guaranteed to be genuine; the focus is on "good" results
     • The interprocedural algorithm does not follow recursive call loops
     • Vulnerable to both false positives and false negatives
     • Incomplete: it does not report all the bugs; you need a checker to detect a given kind of bug
     • Coverity Prevent performance: varies with the code, from 15 sec (12,000 lines) to under 1 hour for MySQL (600,000 lines)
     • Coverity Prevent defect density: 478 defects over 1,352,343 lines of code, an average defect density of 0.353 defects per 1,000 lines
     • Coverity Prevent accuracy: overall false positive rate somewhere between 12.7% and 35.7%, with a mean of 24.2%
     Source: Analysis Tool Evaluation: Coverity Prevent; Ali Almossawi, Kelvin Lim, Tanmay Sinha, Carnegie Mellon University (May 1, 2006)

  12. Thank you!
