
Concrete, Symbolic, and Concolic Testing: Dealing with Large Code

Learn about the limitations of SMT solving and existing software when dealing with large and complex code. Explore techniques like symbolic testing and concolic testing to find rare errors and improve code coverage.

markhayes




Presentation Transcript


  1. Announcements • No class April 7 • Makeup class May 3, 9-12

  2. Concrete, Symbolic, and Concolic Testing • Mooly Sagiv • Slides taken from Cristian Cadar

  3. Recap • Bounded model checking is powerful for small and intermediate size code • But how can we deal with large code with complicated features? • Limits on the power of SMT solvers • Complexity of SMT solving • Complexity of existing software • Large code base • Heterogeneous systems • Missing code

  4. Program Path • Program Path • A path in the control flow of the program • Can start and end at any point • Appropriate for imperative programs • Feasible program path • There exists an input that leads to the execution of this path • Infeasible program path • No input that leads to the execution

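  5. Infeasible Paths • void grade(int score) { A: if (score < 45) { B: printf("fail"); } else { C: printf("pass"); } D: if (score > 85) { E: printf("with honors"); } F: } • Control-flow graph: A tests score < 45; true goes to B printf("fail"), false (score ≥ 45) goes to C printf("pass"); both continue to D, which tests score > 85; true goes to E printf("with honors"); all paths end at F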

  6. Concrete vs. Symbolic Executions • Real programs have many infeasible paths • Ineffective concrete testing • Symbolic execution aims to find rare errors

  7. Symbolic Testing Tools • EFFIGY [King, IBM 76] • PEX [MSR] • SAGE [MSR] • SATURN [Stanford] • KLEE [Stanford] • Java PathFinder [NASA] • BitScope [Berkeley] • CUTE [UIUC, Berkeley] • Calysto [UBC]

  8. Finding Infeasible Paths Via SMT • void grade(int score) { A: if (score < 45) { B: printf("fail"); } else { C: printf("pass"); } D: if (score > 85) { E: printf("with honors"); } F: } • Control-flow graph: A tests score < 45; true goes to B, false (score ≥ 45) goes to C; both continue to D, which tests score > 85; true goes to E; all paths end at F • The path A, B, D, E requires score < 45 ∧ score > 85, which is UNSAT, so the path is infeasible
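The infeasibility claim on this slide can be checked with a small sketch. It does not call an SMT solver: a brute-force scan over a bounded score range stands in for the query score < 45 ∧ score > 85. The names `grade_path` and `path_b_then_e_feasible` and the flag-based instrumentation are assumptions made for illustration, not part of the slides.

```c
#include <stdbool.h>

/* Records which labelled statements of grade() a given score reaches.
   The printf calls are replaced by flags so the path can be inspected. */
struct path { bool b, c, e; };

struct path grade_path(int score) {
    struct path p = { false, false, false };
    if (score < 45) p.b = true;   /* B: "fail" */
    else            p.c = true;   /* C: "pass" */
    if (score > 85) p.e = true;   /* E: "with honors" */
    return p;
}

/* Brute-force stand-in for the SMT query (score < 45) && (score > 85):
   returns true if some score in [lo, hi] drives execution through B and E. */
bool path_b_then_e_feasible(int lo, int hi) {
    for (int s = lo; s <= hi; s++) {
        struct path p = grade_path(s);
        if (p.b && p.e) return true;
    }
    return false;
}
```

A solver proves the same fact for all integers at once; the scan only covers the range it is given.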

  9. Plan • Random Testing • Symbolic Testing • Concolic Testing

  10. Fuzzing [Miller 1990] • Test programs on random unexpected data • Can be realized using black-box or white-box testing • Can be quite effective • Operating Systems • Networks • … • Usually implemented via instrumentation • Tricky to scale for programs with many paths • if (x == 10001) { …. if (f(*y) == *z) { …. • int f(int *p) { if (p != NULL) { return q; }
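Why random inputs struggle with a guard such as x == 10001 can be sketched as follows. A uniformly random 32-bit input satisfies the guard with probability 2^-32 per trial, so a black-box fuzzer rarely gets past it. The generator `next_random` (a xorshift32 step), the function `fuzz`, and the seed handling are hypothetical helpers chosen for this sketch.

```c
#include <stdint.h>

/* Program under test: the guarded branch is taken for exactly one input. */
int reaches_rare_branch(int32_t x) {
    return x == 10001;
}

/* Hypothetical helper: one xorshift32 step, so runs are reproducible
   for a given seed. */
uint32_t next_random(uint32_t *state) {
    uint32_t s = *state;
    s ^= s << 13;
    s ^= s >> 17;
    s ^= s << 5;
    *state = s;
    return s;
}

/* Black-box fuzzing loop: feed `trials` pseudo-random inputs and count
   how many of them reach the guarded branch. */
unsigned fuzz(unsigned trials, uint32_t seed) {
    unsigned hits = 0;
    uint32_t state = seed ? seed : 1u;   /* xorshift state must be nonzero */
    for (unsigned i = 0; i < trials; i++)
        hits += (unsigned)reaches_rare_branch((int32_t)next_random(&state));
    return hits;
}
```

A symbolic or concolic tool would instead read the constraint x == 10001 off the branch and ask a solver for a satisfying input.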

  11. Symbolic Exploration • Execute a program on symbolic inputs • Track set of values symbolically • Update symbolic states when instructions are executed • Whenever a branch is encountered check if the path is feasible using a theorem prover call

  12. Symbolic Execution Tree • The constructed symbolic execution paths • Nodes • Symbolic Program States • Edges • Potential Transitions • Constructed during symbolic evaluation • Each edge requires a theorem prover call

  13. Simple Example • 1) int x, y; 2) if (x > y) { 3) x = x + y; 4) y = x - y; 5) x = x - y; 6) if (x > y) 7) assert false; 8) } • Symbolic execution tree: pc=1, x=s1, y=s2 → pc=2, x=s1, y=s2 • Branch x ≤ y: pc=8, x=s1, y=s2, s1≤s2 • Branch x > y: pc=3, x=s1, y=s2, s1>s2 • x = x + y: pc=4, x=s1+s2, y=s2, s1>s2 • y = x - y: pc=5, x=s1+s2, y=s1, s1>s2 • x = x - y: pc=6, x=s2, y=s1, s1>s2 • Branch x ≤ y: pc=8, x=s2, y=s1, s1>s2
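In the tree above, the inner branch x > y would need s2 > s1 together with s1 > s2, so line 7 is never reached. A minimal sketch that cross-checks this by brute force over a small input box (not by symbolic execution) follows; `reaches_assert` and `count_violations` are names invented for this sketch.

```c
/* Returns 1 if the `assert false` on line 7 would be reached for (x, y). */
int reaches_assert(int x, int y) {
    if (x > y) {
        x = x + y;
        y = x - y;   /* y now holds the original x */
        x = x - y;   /* x now holds the original y */
        if (x > y)
            return 1;
    }
    return 0;
}

/* Counts the inputs in [lo, hi] x [lo, hi] that reach the assertion. */
int count_violations(int lo, int hi) {
    int count = 0;
    for (int x = lo; x <= hi; x++)
        for (int y = lo; y <= hi; y++)
            count += reaches_assert(x, y);
    return count;
}
```

The bounds should stay small enough that x + y cannot overflow an int.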

  14. Another Example • int f(int x) { return 2 * x; } int h(int x, int y) { 1) if (x != y) { 2) if (f(x) == x + 10) { 3) abort(); /* error */ } } 4) return 0; } • Symbolic execution tree: pc=1, x=s1, y=s2 • Branch x == y: pc=4, x=s1, y=s2, s1=s2 • Branch x != y: pc=2, x=s1, y=s2, s1≠s2 • Branch f(x) != x+10: pc=4, x=s1, y=s2, s1≠s2, 2*s1 ≠ s1+10 • Branch f(x) == x+10: pc=3, x=s1, y=s2, s1≠s2, 2*s1 = s1+10
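For the condition f(x) == x + 10 as written in the code, the error path needs x != y and 2*x == x + 10, which gives x = 10 with any y other than 10. The sketch below replaces abort() with a return value so the path can be observed; that substitution is an assumption made for testing, not part of the slide.

```c
int f(int x) { return 2 * x; }

/* Same control flow as h() on the slide; returns 1 where the slide
   calls abort() and 0 where it returns normally. */
int h(int x, int y) {
    if (x != y) {
        if (f(x) == x + 10)
            return 1;   /* error path: x == 10 and y != 10 */
    }
    return 0;
}
```

For example, h(10, 0) takes the error path, while h(10, 10) and h(9, 0) do not.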

  15. Pointers • struct foo { int i; char c; }; void bar(struct foo *a) { 1) if (a->c == 0) { 2) *((char *)a + sizeof(int)) = 1; 3) if (a->c != 0) { 4) abort(); } } 5) }
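The difficulty here is aliasing: the write on line 2 goes through a char pointer, yet it can land on the field c. A hedged sketch of that effect follows, assuming the common layout in which c sits at offset sizeof(int). The helpers `write_aliases_c` and `run_on_zeroed_struct` are invented for this sketch, and abort() is replaced by a return value.

```c
#include <stddef.h>

struct foo { int i; char c; };

/* Returns 1 where the slide calls abort(), 0 otherwise. */
int bar(struct foo *a) {
    if (a->c == 0) {
        *((char *)a + sizeof(int)) = 1;   /* writes the byte at offset sizeof(int) */
        if (a->c != 0)
            return 1;
    }
    return 0;
}

/* True when the byte written above is exactly the field c. */
int write_aliases_c(void) {
    return offsetof(struct foo, c) == sizeof(int);
}

/* Runs bar() on a zero-initialised struct. */
int run_on_zeroed_struct(void) {
    struct foo s = { 0, 0 };
    return bar(&s);
}
```

A symbolic executor that tracks a->c and the raw byte write as unrelated locations would wrongly conclude that line 4 is unreachable.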

  16. Non-Deterministic Behavior • int x, y; 1) if (nondet()) { 2) x = 7; } else { 3) x = 19; } 4)

  17. Loops • int i = 0; while (i < n) { i = i + 1; } if (n == 106) { abort(); }

  18. Scaling Issues for Symbolic Exploration

  19. Challenge 1: Limitations of Theorem Provers • void foobar(int x, int y) { if (x * x * x > 0) { if (x > 0 && y == 10) { abort(); } } else { if (x > 0 && y == 20) { abort(); } } }

  20. Challenge 2: External Calls • FILE *fp; fp = fopen("test.txt", "w"); if (fp) { struct stat buffer; if (stat("text.txt", &buffer) != 0) { abort(); } }

  21. Challenge 3: Number of theorem prover calls • int i = 0; while (i < n) { i = i + 1; } if (n == 106) { abort(); }
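To make the cost concrete, the sketch below counts the branch decisions taken along one execution of this loop. If a symbolic explorer queries the prover at every branch, the count grows linearly with n along a single path. The function `branch_decisions` is a hypothetical instrumented version of the slide's code, not something taken from the deck.

```c
/* Counts the branch decisions made by the loop program for a concrete n.
   Each decision is a potential theorem prover query in symbolic exploration. */
long branch_decisions(int n) {
    long decisions = 0;
    int i = 0;
    for (;;) {
        decisions++;             /* evaluation of (i < n) */
        if (!(i < n)) break;
        i = i + 1;
    }
    decisions++;                 /* evaluation of (n == 106) */
    return decisions;
}
```

For n = 106 the loop condition is evaluated 107 times, so the path to abort() involves 108 branch decisions.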

  22. Concolic Testing Concrete + Symbolic = Concolic • Combine concrete testing (concrete execution) and symbolic testing (symbolic execution) • Trade coverage (miss bugs) for scalability • Reduce the number of theorem prover calls • Reduce the complexity of path formulas • Can cope with external calls

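  23. Motivating Example • int double (int v) { return 2*v; } void testme (int x, int y) { int z = double (y); if (z == x) { if (x > y+10) { ERROR; } } } • (Pseudocode: double is a reserved word in C, so a compilable version needs a different function name)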

  24. Concolic Testing Approach (testme code as on slide 23) • Concrete execution, concrete state: x = 22, y = 7 • Symbolic execution, symbolic state: x = x0, y = y0 • Path condition: none yet

  25. Concolic Testing Approach (testme code as on slide 23) • After z = double (y) • Concrete state: x = 22, y = 7, z = 14 • Symbolic state: x = x0, y = y0, z = 2*y0 • Path condition: none yet

  26. Concolic Testing Approach (testme code as on slide 23) • Concrete state: x = 22, y = 7, z = 14 • Symbolic state: x = x0, y = y0, z = 2*y0 • Path condition: 2*y0 != x0

  27. Concolic Testing Approach (testme code as on slide 23) • Concrete state: x = 22, y = 7, z = 14 • Symbolic state: x = x0, y = y0, z = 2*y0 • Path condition: 2*y0 != x0 • Solve: 2*y0 == x0 • Solution: x0 = 2, y0 = 1

  28. Concolic Testing Approach (testme code as on slide 23) • New run • Concrete state: x = 2, y = 1 • Symbolic state: x = x0, y = y0 • Path condition: none yet

  29. Concolic Testing Approach (testme code as on slide 23) • After z = double (y) • Concrete state: x = 2, y = 1, z = 2 • Symbolic state: x = x0, y = y0, z = 2*y0 • Path condition: none yet

  30. Concolic Testing Approach (testme code as on slide 23) • Concrete state: x = 2, y = 1, z = 2 • Symbolic state: x = x0, y = y0, z = 2*y0 • Path condition: 2*y0 == x0

  31. Concolic Testing Approach (testme code as on slide 23) • Concrete state: x = 2, y = 1, z = 2 • Symbolic state: x = x0, y = y0, z = 2*y0 • Path condition: 2*y0 == x0, x0 ≤ y0+10

  32. Concolic Testing Approach (testme code as on slide 23) • Concrete state: x = 2, y = 1, z = 2 • Symbolic state: x = x0, y = y0, z = 2*y0 • Path condition: 2*y0 == x0, x0 ≤ y0+10 • Solve: (2*y0 == x0) ∧ (x0 > y0 + 10) • Solution: x0 = 30, y0 = 15

  33. Concolic Testing Approach (testme code as on slide 23) • New run • Concrete state: x = 30, y = 15 • Symbolic state: x = x0, y = y0 • Path condition: none yet

  34. Concolic Testing Approach (testme code as on slide 23) • Concrete state: x = 30, y = 15 • Symbolic state: x = x0, y = y0 • Path condition: 2*y0 == x0, x0 > y0+10 • Program Error
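The three runs of the walk-through can be replayed with a small sketch. Because double is a reserved word in C, the helper is renamed `twice` here, and testme is rewritten as `testme_path` to return a code for the path taken; both names are assumptions made for illustration.

```c
/* `twice` stands in for the slide's `double`, which is reserved in C. */
int twice(int v) { return 2 * v; }

/* Returns which path testme takes for (x, y):
   0 = (z == x) is false,
   1 = (z == x) is true and (x > y + 10) is false,
   2 = both are true, i.e. ERROR is reached. */
int testme_path(int x, int y) {
    int z = twice(y);
    if (z == x) {
        if (x > y + 10)
            return 2;   /* ERROR */
        return 1;
    }
    return 0;
}
```

The inputs (22, 7), (2, 1) and (30, 15) from the slides take paths 0, 1 and 2 respectively.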

  35. The Concolic Testing Algorithm • Classify input variables into symbolic / concrete • Instrument to record symbolic vars and path conditions • Choose an arbitrary input • Execute the program • Symbolically re-execute the program • Negate the last unexplored path condition • Is there an input satisfying the constraint? (branches: T / F)
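A heavily simplified sketch of this loop, specialised to the testme example, is shown below. It is not DART or any real tool: the "solver" is a brute-force search over a small input box that stands in for an SMT query, the strategy only ever flips the last branch of the most recent run (no backtracking), and every name (`trace`, `run_testme`, `solve_flip`, `concolic_search`) is an assumption of the sketch.

```c
#include <stdbool.h>

#define MAX_BRANCHES 8

/* Branch outcomes observed during one concrete run. */
struct trace { int len; bool taken[MAX_BRANCHES]; bool error; };

static void record(struct trace *t, bool outcome) {
    if (t->len < MAX_BRANCHES) t->taken[t->len++] = outcome;
}

/* Instrumented program under test (testme with the call inlined as 2*y). */
struct trace run_testme(int x, int y) {
    struct trace t = { 0, { false }, false };
    int z = 2 * y;
    bool b1 = (z == x);
    record(&t, b1);
    if (b1) {
        bool b2 = (x > y + 10);
        record(&t, b2);
        if (b2) t.error = true;
    }
    return t;
}

/* Stand-in for the solver: search a small box for an input whose run
   agrees with `prev` on branches 0..k-1 and differs at branch k. */
bool solve_flip(const struct trace *prev, int k, int *x, int *y) {
    for (int cx = -50; cx <= 50; cx++)
        for (int cy = -50; cy <= 50; cy++) {
            struct trace t = run_testme(cx, cy);
            if (t.len <= k) continue;
            bool ok = (t.taken[k] != prev->taken[k]);
            for (int i = 0; ok && i < k; i++)
                ok = (t.taken[i] == prev->taken[i]);
            if (ok) { *x = cx; *y = cy; return true; }
        }
    return false;
}

/* Concolic loop: execute, negate the last branch, solve, repeat.
   Returns true and stores the failing input if ERROR is reached. */
bool concolic_search(int x, int y, int max_runs, int *err_x, int *err_y) {
    for (int run = 0; run < max_runs; run++) {
        struct trace t = run_testme(x, y);
        if (t.error) { *err_x = x; *err_y = y; return true; }
        if (t.len == 0 || !solve_flip(&t, t.len - 1, &x, &y)) return false;
    }
    return false;
}
```

Starting from the slides' first input (22, 7), the search is expected to end at some input with 2*y == x and x > y + 10; the particular values depend on the search order and need not match the slides' (30, 15).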

  36. Interesting Example: Concolic Testing • void foobar(int x, int y) { if (x * x * x > 0) { if (x > 0 && y == 10) { abort(); } } else { if (x > 0 && y == 20) { abort(); } } }
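One way to read this example: the cubic term is hard for a linear solver, but a concolic run can keep x at its concrete value (say x = 1, an assumed first input), which leaves only the linear constraint y == 10 to solve. The sketch below replaces the two abort() calls with return codes and uses a brute-force scan, not a prover, to look for inputs reaching the second one; `count_second_abort` is a name invented here.

```c
/* Returns 1 where the slide's first abort() is, 2 where the second is,
   and 0 otherwise. */
int foobar(int x, int y) {
    if (x * x * x > 0) {
        if (x > 0 && y == 10) return 1;
    } else {
        if (x > 0 && y == 20) return 2;
    }
    return 0;
}

/* Counts inputs in [lo, hi] x [lo, hi] that reach the second abort(). */
int count_second_abort(int lo, int hi) {
    int count = 0;
    for (int x = lo; x <= hi; x++)
        for (int y = lo; y <= hi; y++)
            count += (foobar(x, y) == 2);
    return count;
}
```

With x fixed at 1, setting y = 10 reaches the first abort(). Within the scanned box no input reaches the second one, since x*x*x <= 0 there forces x <= 0. The bounds should stay small so that x*x*x does not overflow.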

  37. Pointers • struct foo { int i; char c; }; void bar(struct foo *a) { 1) if (a->c == 0) { 2) *((char *)a + sizeof(int)) = 1; 3) if (a->c != 0) { 4) abort(); } } 5) }

  38. Loops • int i = 0; while (i < n) { i = i + 1; } if (n == 106) { abort(); }

  39. Handling External Calls • FILE *fp; fp = fopen("test.txt", "w"); if (fp) { struct stat buffer; if (stat("text.txt", &buffer) != 0) { abort(); } }

  40. Concolic Testing Techniques • Linear underapproximation • Models for system calls • Simple heuristics • Two possible values for pointers • … • Efficient instrumentation • Interface extraction • Generating random inputs of arbitrary types • …

  41. Original DART Approach [PLDI’05] • Tailored for C • Three types of functions • Program functions • External functions handled non-deterministically • Library functions handled as black-box (concrete only)

  42. SAGE: Whitebox Fuzzing for Security Testing • Check correctness of Win 7, Win 8 • 200+ machine years • 1 Billion+ SMT constraints • 100s of apps, 100s of bugs • 1/3 of all Win 7 WEX security bugs found • Millions of dollars saved • Now available as a service

  43. LLVM • A toolchain for creating compilers and more • An intermediate language for C, C++ programs and beyond • Portable • Developed at Urbana-Champaign • Adopted by Apple and more

  44. KLEE • Symbolic testing for systems code • Developed at Stanford/Imperial • Optimize search space • Models for common functions • Built on top of LLVM • SMT solver agnostic

  45. Writing Systems Code Is Hard • Code complexity • Tricky control flow • Complex dependencies • Abusive use of pointer operations • Environmental dependencies • Code has to anticipate all possible interactions • Including malicious ones

  46. KLEE [OSDI 2008, Best Paper Award] • Based on symbolic execution and constraint solving techniques • Automatically generates high coverage test suites • Over 90% on average on ~160 user-level apps • Finds deep bugs in complex systems programs • Including higher-level correctness ones

  47. Toy Example • int bad_abs(int x) { if (x < 0) return -x; if (x == 1234) return -x; return x; } • Execution tree: x starts symbolic (unconstrained) • Branch x < 0: TRUE (x < 0) → return -x, test1.out with x = -2 • FALSE (x ≥ 0) → branch x == 1234: TRUE (x = 1234) → return -x, test2.out with x = 1234 • FALSE (x ≠ 1234) → return x, test3.out with x = 3
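The three generated test inputs can be replayed against the toy function directly. The sketch below only restates the slide's code in compilable form; it does not involve KLEE itself.

```c
/* Buggy absolute value from the slide: the x == 1234 case
   wrongly negates a non-negative input. */
int bad_abs(int x) {
    if (x < 0)
        return -x;
    if (x == 1234)
        return -x;   /* bug: returns -1234 */
    return x;
}
```

Each input exercises one path: x = -2 takes the first return, x = 1234 the buggy second return, and x = 3 the final return. A test oracle that checks for a non-negative result would flag the x = 1234 case.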

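  48. KLEE Architecture • C code → LLVM → LLVM bytecode → KLEE • KLEE runs the bytecode against a symbolic environment • Path constraints go to the constraint solver (STP), e.g. x ≥ 0 ∧ x ≠ 1234 yields x = 3 • Generated test inputs: x = -2, x = 1234, x = 3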

  49. Outline • Motivation • Example and Basic Architecture • Scalability Challenges • Experimental Evaluation

  50. Three Big Challenges • Motivation • Example and Basic Architecture • Scalability Challenges • Exponential number of paths • Expensive constraint solving • Interaction with environment • Experimental Evaluation
