
ZING Systematic State Space Exploration of Concurrent Software

ZING: Systematic State Space Exploration of Concurrent Software. Jakob Rehof, Microsoft Research (http://research.microsoft.com/~rehof). Joint work with Tony Andrews (MS), Shaz Qadeer (MSR), and Sriram K. Rajamani (MSR). Project page: http://research.microsoft.com/zing


Presentation Transcript


  1. ZING Systematic State Space Exploration of Concurrent Software Jakob Rehof Microsoft Research http://research.microsoft.com/~rehof http://research.microsoft.com/zing Joint work with Tony Andrews (MS) Shaz Qadeer (MSR) Sriram K. Rajamani (MSR)

  2. http://research.microsoft.com/zing

  3. Lecture I : Outline • Software Model Checking • Overview of ZING

  4. Software Validation • Large scale reliable software is hard to build and test. • Different groups of programmers write different components. • Integration testing is a nightmare.

  5. Property Checking • Programmer provides redundant partial specifications • Code is automatically checked for consistency • Different from proving whole program correctness • Specifications are not complete

  6. Interface Usage Rules • Rules in documentation • Incomplete, unenforced, wordy • Order of operations & data access • Resource management • Disobeying rules causes bad behavior • System crash or deadlock • Unexpected exceptions • Failed runtime checks

  7. Does a given usage rule hold? • Checking this is computationally impossible! • Equivalent to solving Turing’s halting problem (undecidable) • Even restricted computable versions of the problem (finite state programs) are prohibitively expensive

  8. Why bother? Just because a problem is undecidable, it doesn’t go away!

  9. Automatic property checking = Study of tradeoffs • Soundness vs completeness • Missing errors vs reporting false alarms • Annotation burden on the programmer • Complexity of the analysis • Local vs Global • Precision vs Efficiency • Space vs Time

  10. Broad classification • Underapproximations • Testing • After passing testing, a program may still violate a given property • Overapproximations • Type checking • Even if a program satisfies a property, the type checker for the property could still reject it

  11. Current trend • Confluence of techniques from different fields: • Model checking • Automatic theorem proving • Program analysis • Significant emphasis on practicality • Several new projects in academia and industry

  12. Model Checking • Algorithmic exploration of state space of the system • Several advances in the past decade: • symbolic model checking • symmetry reductions • partial order reductions • compositional model checking • bounded model checking using SAT solvers • Most hardware companies use a model checker in the validation cycle

  13. enum {N, T, C} state[1..2] int turn init state[1] = N; state[2] = N turn = 0 trans state[i]= N & turn = 0 -> state[i] = T; turn = i state[i] = N & turn !=0 -> state[i] = T state[i] = T & turn = i -> state[i] = C state[i] = C & state[2-i+1] = N -> state[i] = N state[i] = C & state[2-i+1] != N -> state[i] = N; turn = 2-i+1

  14. T1,N2 turn=1 N1,T2 turn=2 C1,N2 turn=1 N1,C2 turn=2 T1,T2 turn=1 T1,T2 turn=2 C1,T2 turn=1 T1,C2 turn=2 N1,N2 turn=0 N = noncritical, T = trying, C = critical
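The state graph on slide 14 can be reproduced by explicit-state exploration of the guarded commands on slide 13. The following is a minimal sketch, not the tool used in the lecture: the tuple encoding `(state1, state2, turn)` and the function names are assumptions made for illustration. Read literally, the rules never reset `turn` to 0, so the sketch may visit a few states beyond those drawn on the slide.

```python
from collections import deque

def successors(state):
    """Yield the successors of (state1, state2, turn) under the slide-13 rules."""
    procs, turn = list(state[:2]), state[2]
    for i in (1, 2):
        me, other = procs[i - 1], procs[2 - i]   # 'other' is process 2-i+1
        step = None
        if me == "N" and turn == 0:
            step = ("T", i)            # start trying and claim the turn
        elif me == "N":
            step = ("T", turn)         # start trying, turn already taken
        elif me == "T" and turn == i:
            step = ("C", turn)         # enter the critical section
        elif me == "C" and other == "N":
            step = ("N", turn)         # leave, nobody is waiting
        elif me == "C":
            step = ("N", 3 - i)        # leave and hand the turn over
        if step is not None:
            new_procs = procs.copy()
            new_procs[i - 1] = step[0]
            yield (new_procs[0], new_procs[1], step[1])

def reachable(init):
    """Breadth-first exploration of all states reachable from init."""
    seen = {init}
    queue = deque([init])
    while queue:
        s = queue.popleft()
        for t in successors(s):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return seen

states = reachable(("N", "N", 0))
# Safety property: both processes are never critical at the same time.
mutual_exclusion = all(not (s[0] == "C" and s[1] == "C") for s in states)
print(sorted(states), mutual_exclusion)
```

The set `states` is an inductive invariant in the sense of the next slide: applying the transition rules to any state in it stays inside it.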

  15. Model Checking • Strengths • Fully automatic (when it works) • Computes inductive invariants • $I$ such that $F(I) \subseteq I$ • Provides error traces • Weaknesses • Scale • Operates only on models • How do you get from the program to the model?

  16. Theorem proving • Early theorem provers were proof checkers • They were built to support assertional reasoning in the Hoare style • Cumbersome and hard to use • Greg Nelson’s thesis in early 80s paved the way for automatic theorem provers • Theory of equality with uninterpreted functions • Theory of lists • Theory of linear arithmetic • Combination of the above ! • Automatic theorem provers based on Nelson-Oppen method are widely used • ESC • Proof Carrying Code

  17. Theory of Equality • Symbols: $=$, $\neq$, $f$, $g$, … • Axiomatically defined: • Example of a satisfiability problem: • $g(g(g(x))) = x \land g(g(g(g(g(x))))) = x \land g(x) \neq x$ • Satisfiability problem decidable in $O(n \log n)$
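The example formula, read as g(g(g(x))) = x and g(g(g(g(g(x))))) = x and g(x) ≠ x, is unsatisfiable, and congruence closure shows why. The sketch below is a deliberately naive fixpoint version written for this one family of terms, not the O(n log n) algorithm the slide refers to. The integer encoding (term `k` stands for g applied k times to x) and the function name are assumptions.

```python
def equalities_force_gx_equals_x(depth=5):
    """Return True if g^3(x) = x and g^5(x) = x together imply g(x) = x."""
    parent = list(range(depth + 1))      # union-find over terms g^0(x)..g^depth(x)

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]   # path halving
            t = parent[t]
        return t

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return False
        parent[ra] = rb
        return True

    union(3, 0)                          # g(g(g(x))) = x
    union(5, 0)                          # g(g(g(g(g(x))))) = x
    changed = True
    while changed:                       # congruence rule: a = b implies g(a) = g(b)
        changed = False
        for a in range(depth):
            for b in range(a + 1, depth):
                if find(a) == find(b) and union(a + 1, b + 1):
                    changed = True
    return find(1) == find(0)

# The equalities force g(x) = x, which contradicts the disequality g(x) != x.
unsatisfiable = equalities_force_gx_equals_x()
print(unsatisfiable)
```

The derivation it finds is: x = g³(x) gives g(x) = g⁴(x), which gives g²(x) = g⁵(x) = x, which gives g³(x) = g(x), hence g(x) = x.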

  18. a : array [1..len] of int; int max := -MAXINT; i := 1; {$\forall\, 1 \le j < i.\ a[j] \le max$} while ($i \le len$) if (a[i] > max) max := a[i]; i := i+1; endwhile {$\forall\, 1 \le j \le len.\ a[j] \le max$} • Proof obligation at loop exit: $(\forall\, 1 \le j < i.\ a[j] \le max) \land (i > len) \Rightarrow (\forall\, 1 \le j \le len.\ a[j] \le max)$
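The annotated loop can be exercised with the invariant and postcondition turned into runtime assertions. This is a sketch under two assumptions: the invariant is read as "every element before position i is at most max" (the comparison symbols are lost in the transcript, and this is the reading that holds on loop entry), and `MAXINT` is given an arbitrary 32-bit value. A theorem prover would discharge these conditions for all inputs, whereas the assertions here only check the inputs actually run.

```python
MAXINT = 2**31 - 1   # assumed bound; the slide does not give a value

def array_max(a):
    """Slide-18 loop with its annotations checked at run time (1-based index i)."""
    length = len(a)
    maximum = -MAXINT
    i = 1
    while i <= length:
        # Loop invariant: for all 1 <= j < i, a[j] <= max
        assert all(a[j - 1] <= maximum for j in range(1, i))
        if a[i - 1] > maximum:
            maximum = a[i - 1]
        i += 1
    # Postcondition: for all 1 <= j <= len, a[j] <= max
    assert all(a[j - 1] <= maximum for j in range(1, length + 1))
    return maximum

largest = array_max([3, 1, 4, 1, 5])
print(largest)
```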

  19. Automatic theorem proving • Strengths • Handles unbounded domains naturally • Good implementations for • equality with uninterpreted functions • linear inequalities • combination of theories • Weaknesses • Hard to compute fixpoints • Requires inductive invariants • Pre and post conditions • Loop invariants

  20. Program analysis • Originated in optimizing compilers • constant propagation • live variable analysis • dead code elimination • loop index optimization • Type systems use similar analysis • Are the type annotations consistent?

  21. Program analysis • Strengths • Works on code • Pointer aware • Integrated into compilers • Precision-efficiency tradeoffs well studied • flow (in)sensitive • context (in)sensitive • Weaknesses • Abstraction is hardwired and done by the designer of the analysis • Not targeted at property checking (traditionally)

  22. Model Checking, Theorem Proving and Program Analysis • Very related to each other • Different histories • different emphasis • different tradeoffs • Complementary, in some ways • Combination can be extremely powerful

  23. What is the key design challenge in a model checker for software? It is the model!

  24. Model Checking Hardware Primitive values are booleans States are boolean vectors of fixed size Models are finite state machines !!

  25. Characteristics of Software Primitive values are more complicated • Pointers • Objects Control flow (transition relation) is more complicated • Functions • Function pointers • Exceptions States are more complicated • Unbounded graphs over values Variables are scoped • Locals • Shared scopes Much richer modularity constructs • Functions • Classes

  26. • “Computing power doubles every 18 months” (Gordon Moore) • “When I use a model checker, it runs and runs for ever and never comes back… when I use a static analysis tool, it comes back immediately and says ‘I don’t know’” (Patrick Cousot)

  27. Problem • Check if programs written in common programming languages (C, C++, C#, Java) satisfy certain safety properties • Examples of properties: • API usage rules – ordering of calls • Absence of races • Absence of deadlocks • Protocol (state machines) on objects • Language-based safety properties

  28. Approach • Extract abstract “model” from the program that captures all “relevant” portions of the program with respect to property of interest • Systematically explore the state space of the extracted model. • Example: SLAM • Check if a sequential C program uses an interface “correctly” as specified by a safety property, using boolean program models

  29. Traditional approach model checker FSM Finite state machines Source code Sequential C program

  30. Software model checking: SLAM • Source code: sequential program in rich programming language (e.g. C) with C data structures, pointers, procedure calls, parameter passing, scoping, control flow • Abstraction: Boolean program (push-down model rather than finite state machine) • Model checker: data flow analysis implemented using BDDs • Related work: BLAST, MAGIC, …

  31. Zing model checker Rich control constructs: thread creation, function call, exception, objects, dynamic allocation Model checking is undecidable! abstraction Source code Device driver (taking concurrency into account), web services code, transaction management system (2pc)

  32. Zing model checker • 3 core constructs: • Procedure calls with call-stack • Objects with dynamic allocation • Threads with dynamic creation • Inter-process communication: • Shared memory • Channels with blocking-receives, non-blocking sends, FIFO abstraction Source code Concurrent program in rich programming language

  33. Lecture I : Outline • Software Model Checking • Overview of ZING

  34. Zing: Challenges and Approach • Handling programming language features • Compile Zing to an intermediate “object model” (ZOM) • Build model checker on top of ZOM • State explosion • Expose program structure in ZOM • Exploit program structure to do efficient model checking

  35. Zing Object Model: Internal State View • State • Globals: simple types & refs • Heap: complex types • Processes: each process has a stack of frames, each frame with IP, Locals, Params

  36. Zing Object Model: External State View • Simplified view to query and update state • How many processes? • Is process(i) runnable? • Are two states equal? • Execute process(i) for one atomic step • Can write simple DFS search in 10 lines

  37. private void doDfs() {
          while (stateStack.Count > 0) {
              State s = (State) stateStack.Peek();
              bool foundSuccessor = false;
              // find the next process to execute and execute it
              for (int p = s.LastProcessExplored + 1; p < s.NumProcesses; p++) {
                  if (s.RunnableProcesses[p]) {
                      State newS = s.Execute(p);
                      // push newS only if it has not been visited before
                      if (!stateHash.Contains(newS)) {
                          stateHash.Add(newS);
                          stateStack.Push(newS);
                          foundSuccessor = true;
                          break;
                      }
                  }
              }
              // every successor of s has been visited: backtrack
              if (!foundSuccessor) stateStack.Pop();
          }
      }
      DOESN’T SCALE: NEED TO EXPLOIT PROGRAM STRUCTURE!

  38. Optimizations to make model checking scale • Exploring states efficiently • Finger-printing • State-delta • Parallel model checking with multiple nodes • Exploring fewer states (using mathematical properties) • Reduction • Summarization • Symbolic execution • Iterative Refinement • Compositional conformance checking

  39. Saving storage • Only states on the checker’s stack are stored • For states not on the stack only a fingerprint is stored • Store only “deltas” from previous states
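The fingerprinting idea can be sketched in a few lines: full states are kept only while they are pending on the search stack, and the visited set holds a short hash of each state. The toy transition system (each process increments its own counter up to `LIMIT`), the 8-byte truncated SHA-256 fingerprint, and all names below are assumptions for illustration. The lecture does not say which hash Zing uses, and state deltas are not shown.

```python
import hashlib

LIMIT = 3   # each toy process counts from 0 up to LIMIT

def successors(state):
    """Each process may increment its own counter while it is below LIMIT."""
    for p in range(len(state)):
        if state[p] < LIMIT:
            yield state[:p] + (state[p] + 1,) + state[p + 1:]

def fingerprint(state):
    """Short hash standing in for the full state in the visited set."""
    return hashlib.sha256(repr(state).encode()).digest()[:8]

def count_reachable(init):
    seen = {fingerprint(init)}   # fingerprints only
    stack = [init]               # full states only while pending
    count = 1
    while stack:
        s = stack.pop()
        for t in successors(s):
            fp = fingerprint(t)
            if fp not in seen:
                seen.add(fp)
                stack.append(t)
                count += 1
    return count

total = count_reachable((0, 0))
print(total)
```

The trade-off is that two distinct states with the same fingerprint would be treated as one, so a state could be missed. A longer fingerprint makes this less likely at the cost of memory.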

  40. State reduction • Abstract notion of equality between states • Avoid exploring “similar” states several times • Exploit structure and do this fully automatically while computing the fingerprint: $s_1 \equiv s_2 \Rightarrow f(s_1) = f(s_2)$ • [Figure: Heap1 with labels 100, 200, a, b, 0, ptr; Heap2 with labels ptr, 150, 300, b, 0, a]
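One way to make "similar" heaps produce the same fingerprint is to rename addresses in the order they are discovered from the roots before hashing. The sketch below assumes a hypothetical heap encoding (address mapped to a `(value, next_address)` pair, with 0 as null) and reads the numbers in the slide's figure as addresses. Neither is confirmed by the transcript, and this is not necessarily how Zing canonicalizes its heap.

```python
def canonical_heap(heap, root):
    """Rename addresses by discovery order from root; 0 is the null address."""
    order = {}                       # original address -> canonical index
    addr = root
    while addr != 0 and addr not in order:
        order[addr] = len(order) + 1
        addr = heap[addr][1]         # follow the next pointer
    cells = []
    for addr in order:               # dicts keep insertion (discovery) order
        value, nxt = heap[addr]
        cells.append((value, order.get(nxt, 0)))
    return tuple(cells)

# Same shape and contents, different concrete addresses.
heap1 = {100: ("a", 200), 200: ("b", 0)}
heap2 = {300: ("a", 150), 150: ("b", 0)}

same = canonical_heap(heap1, 100) == canonical_heap(heap2, 300)
print(canonical_heap(heap1, 100), same)
```

Hashing the canonical form instead of the raw heap means the checker explores only one representative of each class of heaps that differ only in where objects happen to be allocated.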

  41. Architecture & Communication • [Diagram: a server holding the reachable states and the frontier, connected to four model checker nodes; connection labels: Reached States, Frontier States, Trace]

  42. Optimizations to make model checking scale • Exploring states efficiently • Finger-printing • State-delta • Parallel model checking with multiple nodes • Exploring fewer states (using mathematical properties) • Reduction • Summarization • Symbolic execution • Iterative Refinement • Compositional conformance checking

  43. Racy program: need to explore all interleavings! • Shared: //initialize int x := 0; • Thread 1: local int y = 0; x := x + 1; x := x + 1; x := x + 1; x := x + 1; assert(x div 4); y = y+1; y = y+1; • Thread 2: local int z = 0; x := x + 1; x := x + 1; x := x + 1; x := x + 1; assert(x div 4); z = z+1; z = z+1;

  44. Race-free program: need to explore two interleavings! • Shared: //initialize int x := 0; mutex m; • Thread 1: local int y; acquire (m); x := x + 1; x := x + 1; x := x + 1; x := x + 1; assert(x div 4); release (m); y = y+1; y = y+1; • Thread 2: local int z; acquire (m); x := x + 1; x := x + 1; x := x + 1; x := x + 1; assert(x div 4); release (m); z = z+1; z = z+1;
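The difference between the racy and the race-free program can be checked by exhaustively exploring thread interleavings. The sketch below makes several assumptions: each `x := x + 1` is one atomic step, `assert(x div 4)` means "x is divisible by 4", and the updates to the locals `y` and `z` are omitted because they cannot affect the assertion. The state encoding and names are illustrative only.

```python
def assertion_can_fail(program):
    """Explore all interleavings of two threads that both run 'program'."""
    init = (0, 0, 0, None)               # (pc of thread 0, pc of thread 1, x, lock holder)
    seen = {init}
    stack = [init]
    while stack:
        pc0, pc1, x, holder = stack.pop()
        pcs = [pc0, pc1]
        for t in (0, 1):
            if pcs[t] == len(program):
                continue                 # thread t has finished
            op = program[pcs[t]]
            new_x, new_holder = x, holder
            if op == "acquire":
                if holder is not None:
                    continue             # blocked: the mutex is taken
                new_holder = t
            elif op == "release":
                new_holder = None
            elif op == "inc":
                new_x = x + 1
            elif op == "assert":
                if x % 4 != 0:
                    return True          # found a violating interleaving
            new_pcs = pcs.copy()
            new_pcs[t] += 1
            state = (new_pcs[0], new_pcs[1], new_x, new_holder)
            if state not in seen:
                seen.add(state)
                stack.append(state)
    return False

RACY = ["inc"] * 4 + ["assert"]
LOCKED = ["acquire"] + RACY + ["release"]

racy_fails = assertion_can_fail(RACY)
locked_fails = assertion_can_fail(LOCKED)
print(racy_fails, locked_fails)
```

In the racy version one thread can reach its assertion after the other has performed a single increment (x = 5), which is why every interleaving matters there. With the mutex, the assertion only ever sees x = 4 or x = 8.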

  45. Four atomicities • R: right movers • lock acquire • L: left movers • lock release • B: both right + left movers • variable access holding lock • N: non-movers • access unprotected variable • [Diagram: execution traces over states S0 to S7 and T1, T3, T6, with one thread's actions acq(this), r=bal, rel(this) and other threads' actions x, y, z]

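  46. Transaction • Lipton ‘75: any sequence $(R+B)^*;\ (N+\epsilon);\ (L+B)^*$ is a transaction • Other threads need not be scheduled in the middle of a transaction • [Diagram: two executions from S0 to S5, one with other threads' actions x and Y interleaved between R*, N and L*, the other with R*, N, L* contiguous]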

  47. Recall example: each thread has one transaction! • Shared: //initialize int x := 0; mutex m; • Thread 1: local int y; acquire (m); x := x + 1; x := x + 1; x := x + 1; x := x + 1; assert(x div 4); release (m); y = y+1; y = y+1; • Thread 2: local int z; acquire (m); x := x + 1; x := x + 1; x := x + 1; x := x + 1; assert(x div 4); release (m); z = z+1; z = z+1;

  48. Transaction-based reduction • ZOM extended to expose “mover-ness” of each action • Model checker maintains a state machine to track the “phase” of each transaction • Continues scheduling one thread as long as it is inside a transaction! • Current implementation: • Classifies all heap accesses as non-movers • Can improve the scalability using better analysis (ownership?)
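The phase-tracking state machine can be sketched as follows. It assumes a transaction has the shape "right or both movers, then at most one non-mover, then left or both movers", and that each action of a thread is already labelled R, L, B or N. The two phase names and the function name are hypothetical and are not taken from the Zing implementation.

```python
def split_into_transactions(actions):
    """Split one thread's action labels into maximal transactions.

    Phase 'pre'  : still in the right-mover part (R or B seen so far).
    Phase 'post' : an N or L has been seen; only L or B may follow.
    """
    transactions = []
    current = []
    phase = "pre"
    for action in actions:
        if phase == "post" and action in ("R", "N"):
            # The transaction is over: other threads may be scheduled here.
            transactions.append(current)
            current = []
            phase = "pre"
        current.append(action)
        if action in ("N", "L"):
            phase = "post"
    if current:
        transactions.append(current)
    return transactions

# One thread of the race-free example: acquire, five accesses to x under
# the lock (four increments and the assert), release, two local updates.
locked_thread = ["R"] + ["B"] * 5 + ["L"] + ["B"] * 2
# One thread of the racy example: every access to x is unprotected.
racy_thread = ["N"] * 5

print(len(split_into_transactions(locked_thread)),
      len(split_into_transactions(racy_thread)))
```

A scheduler built on this only needs to consider context switches at the boundaries between the returned transactions, which is the source of the reduction in the race-free example.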

  49. Optimizations to make model checking scale • Exploring states efficiently • Finger-printing • State-delta • Parallel model checking with multiple nodes • Exploring fewer states (using mathematical properties) • Reduction • Summarization • Symbolic execution • Iterative Refinement • Compositional conformance checking

  50. Summarization for sequential programs • Procedure summarization (Sharir-Pnueli 81, Reps-Horwitz-Sagiv 95) is the key to efficiency int x; void incr_by_2() { x++; x++; } void main() { … x = 0; incr_by_2(); … x = 0; incr_by_2(); … } • Bebop, ESP, Moped, MC, Prefix, …
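In this small example a procedure summary amounts to remembering, for each procedure and entry state, the exit state that was computed the first time. The sketch below is plain memoization over concrete values of `x`, which is a strong simplification of the cited summarization algorithms (they work on abstract states and handle recursion). The dictionary layout and helper names are assumptions.

```python
def incr_by_2_body(x):
    """Body of incr_by_2 from the slide, acting on the global x."""
    x += 1
    x += 1
    return x

PROCEDURES = {"incr_by_2": incr_by_2_body}
summaries = {}          # (procedure name, entry value of x) -> exit value of x
bodies_explored = 0     # how many times a procedure body was actually analysed

def call(name, x):
    """Apply the summary if one exists, otherwise explore the body once."""
    global bodies_explored
    key = (name, x)
    if key not in summaries:
        bodies_explored += 1
        summaries[key] = PROCEDURES[name](x)
    return summaries[key]

def main():
    x = 0
    x = call("incr_by_2", x)    # first call site: body explored, summary stored
    first = x
    x = 0
    x = call("incr_by_2", x)    # second call site: summary reused
    return first, x

result = main()
print(result, bodies_explored)
```

Both call sites enter `incr_by_2` with x = 0, so the body is explored once and the second call is answered from the summary.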
