The verification of concurrent programs poses unique challenges compared to sequential ones, primarily due to the inherent non-determinism and timing dependencies that lead to subtle bugs. This discussion highlights key concepts such as assertions, abstractions, and atomicity, emphasizing their vital role in ensuring correctness in concurrent software. We explore difficulties like annotation and state explosion, as well as strategies to control interference, thereby providing a structured approach to program verification in multicore environments.
Taming Concurrency: A Program Verification Perspective. Shaz Qadeer, Microsoft Research
Reliable concurrent software? • Correctness problem • does the program behave correctly for all inputs and all interleavings? • Bugs due to concurrency are insidious • non-deterministic, timing dependent • data corruption, crashes • difficult to detect, reproduce, eliminate
P satisfies S: undecidable problem! • Why is verification of concurrent programs more difficult than verification of sequential programs?
P satisfies S • Assertions: Provide contracts to decompose the problem into a collection of decidable problems • pre-condition and post-condition for each procedure • loop invariant for each loop • Abstractions: Provide an abstraction of the program for which verification is decidable • Finite-state systems • finite automata • Infinite-state systems • pushdown automata, counter automata, timed automata, Petri nets
Interference
  pre x = 0;
    int t; t := x; t := t + 1; x := t;
  post x = 1;
  Correct
Interference
  pre x = 0;
    Thread A: int t; t := x; t := t + 1; x := t;
    Thread B: int t; t := x; t := t + 1; x := t;
  post x = 2;
  Incorrect!
Controlling interference
  pre x = 0;
    Thread A: int t; acquire(l); t := x; t := t + 1; x := t; release(l);
    Thread B: int t; acquire(l); t := x; t := t + 1; x := t; release(l);
  post x = 2;
  Correct!
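The three interference slides above can be checked mechanically. The Python sketch below is an illustration and not part of the original talk: it enumerates every interleaving of threads A and B and collects the possible final values of x. The names `interleavings`, `run`, and the step labels are hypothetical, and the lock is modelled very simply: a schedule in which a thread tries to acquire a held lock is treated as infeasible and discarded.

```python
def interleavings(a, b):
    """Yield every merge of step lists a and b that preserves each list's own order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Execute (thread, step) pairs starting from x = 0.

    Returns the final x, or None if the schedule is infeasible because a
    thread would have to block on a lock that is already held.
    """
    x, local, holder = 0, {}, None
    for thread, step in schedule:
        if step == "acq":
            if holder is not None:
                return None
            holder = thread
        elif step == "rel":
            holder = None
        elif step == "read":      # t := x
            local[thread] = x
        elif step == "inc":       # t := t + 1
            local[thread] += 1
        elif step == "write":     # x := t
            x = local[thread]
    return x


BODY = ["read", "inc", "write"]
LOCKED_BODY = ["acq"] + BODY + ["rel"]


def outcomes(body):
    """Set of final values of x over all feasible interleavings of two threads."""
    a = [("A", s) for s in body]
    b = [("B", s) for s in body]
    return {run(s) for s in interleavings(a, b)} - {None}


unlocked = outcomes(BODY)         # {1, 2}: post x = 2 can fail
locked = outcomes(LOCKED_BODY)    # {2}: post x = 2 always holds
```

Without the lock, the interleaving in which both threads read x before either writes yields x = 1, which violates the postcondition; with the lock, only the two serial orders remain feasible.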
Interference makes program verification difficult • Annotation explosion with the assertion approach • State explosion with the abstraction approach
Annotation explosion • For sequential programs • assertion for each loop • assertion refers only to variables in scope • For concurrent programs • assertion for each control location • assertion may need to refer to private state of other threads
State explosion
                        Sequential         Concurrent
  Finite-state systems  $P \cdot G$        PSPACE-complete
  Pushdown systems      $(P \cdot G)^3$    Undecidable
  P = # of program locations, G = # of global states, n = # of threads
Taming interference • Atomicity for combating annotation explosion • Interference-bounding for combating state explosion
Bank account
  Critical_Section l;
  /*# guarded_by l */ int balance;

  /*# atomic */ void deposit(int x) {
    acquire(l);
    int r = balance;
    balance = r + x;
    release(l);
  }

  /*# atomic */ int read() {
    int r;
    acquire(l);
    r = balance;
    release(l);
    return r;
  }

  /*# atomic */ void withdraw(int x) {
    int r = read();
    acquire(l);
    balance = r - x;
    release(l);
  }
Atomicity violation in StringBuffer (Flanagan-Q, PLDI 03)
  public final class StringBuffer {
    private int count;
    private char[] value;
    . .
    public synchronized StringBuffer append(StringBuffer sb) {
      if (sb == null) sb = NULL;
      int len = sb.length();
      int newcount = count + len;
      if (newcount > value.length) expandCapacity(newcount);
      sb.getChars(0, len, value, count);  // use of stale len !!
      count = newcount;
      return this;
    }
    public synchronized int length() { return count; }
    public synchronized void getChars(. . .) { . . . }
  }
Inadequate atomicity is a good predictor of undesirable interference! • a better predictor than data races • Numerous tools for detecting atomicity violations • static (Type systems, ESPC, QED) • dynamic (Atomizer, AVIO, AtomAid, Velodrome, …) • Significant effort cutting across communities • architecture and operating systems • programming languages and compilers • testing and formal methods
Definition of atomicity
• Serialized execution of deposit
    x  y  acq(l)  r=bal  bal=r+n  rel(l)  z
• Non-serialized executions of deposit
    acq(l)  x  r=bal  y  bal=r+n  z  rel(l)
    acq(l)  x  y  r=bal  bal=r+n  z  rel(l)
• deposit is atomic if for every non-serialized execution, there is a serialized execution with the same behavior
Reduction (Lipton, CACM 75)
  S0 acq(l) S1 x S2 r=bal S3 y S4 bal=r+n S5 z S6 rel(l) S7
  S0 acq(l) S1 x S2 y T3 r=bal S4 bal=r+n S5 z S6 rel(l) S7
  S0 x T1 acq(l) S2 y T3 r=bal S4 bal=r+n S5 z S6 rel(l) S7
  S0 x T1 y T2 acq(l) T3 r=bal S4 bal=r+n S5 z S6 rel(l) S7
  S0 x T1 y T2 acq(l) T3 r=bal S4 bal=r+n S5 rel(l) T6 z S7
Four atomicities • R: right commutes • lock acquire • L: left commutes • lock release • B: both right + left commutes • variable access holding lock • A: atomic action, non-commuting • access unprotected variable
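Sequential composition
• Theorem: Sequence (R+B)*; (A+ε); (L+B)* is atomic
• [Figure: reduction of the execution R* . x . A . Y . L* between states S0 and S5]
• Composition table (row ; column), where C denotes a compound, non-atomic sequence
    ;  | B  L  R  A  C
    ---+--------------
    B  | B  L  R  A  C
    R  | R  A  R  A  C
    L  | L  L  C  C  C
    A  | A  A  C  C  C
    C  | C  C  C  C  C
• Examples: R;B;A;L = A (atomic) • R;A;L;R;A;L = A;A = C (not atomic)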
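The composition table above is small enough to encode directly. The Python sketch below is an illustration rather than any tool's implementation: `ROWS`, `seq`, and `atomicity` are hypothetical names, and the table entries are copied from the slide. A sequence of actions is folded left to right, starting from B, which acts as the identity of the table.

```python
COLS = "BLRAC"

# ROWS[a][i] is the atomicity of "a ; COLS[i]", copied from the slide's table.
ROWS = {
    "B": "BLRAC",
    "R": "RARAC",
    "L": "LLCCC",
    "A": "AACCC",
    "C": "CCCCC",
}


def seq(a, b):
    """Atomicity of action a followed by action b."""
    return ROWS[a][COLS.index(b)]


def atomicity(actions):
    """Fold a string of atomicities such as "RBBL" into a single atomicity."""
    result = "B"  # B ; x = x for every x, so B is a safe starting value
    for action in actions:
        result = seq(result, action)
    return result


def is_atomic(actions):
    """A body counts as atomic when its composed atomicity is anything but C."""
    return atomicity(actions) != "C"
```

For example, `atomicity("RBAL")` evaluates to `"A"`, while `atomicity("RALRAL")` evaluates to `"C"`, matching the two examples on the slide.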
Bank account
  Critical_Section l;
  /*# guarded_by l */ int balance;

  /*# atomic */ void deposit(int x) {
    acquire(l);          // R
    int r = balance;     // B
    balance = r + x;     // B
    release(l);          // L
  }                      // R;B;B;L = A

  /*# atomic */ int read() {
    int r;
    acquire(l);          // R
    r = balance;         // B
    release(l);          // L
    return r;            // B
  }                      // R;B;L;B = A

  /*# atomic */ void withdraw(int x) {
    int r = read();      // A
    acquire(l);          // R
    balance = r - x;     // B
    release(l);          // L
  }                      // A;R;B;L = C   Incorrect!
Bank account
  Critical_Section l;
  /*# guarded_by l */ int balance;

  /*# atomic */ void deposit(int x) {
    acquire(l);          // R
    int r = balance;     // B
    balance = r + x;     // B
    release(l);          // L
  }                      // R;B;B;L = A

  /*# atomic */ int read() {
    int r;
    acquire(l);          // R
    r = balance;         // B
    release(l);          // L
    return r;            // B
  }                      // R;B;L;B = A

  /*# atomic */ void withdraw(int x) {
    acquire(l);          // R
    int r = balance;     // B
    balance = r - x;     // B
    release(l);          // L
  }                      // R;B;B;L = A   Correct!
Taming interference • Atomicity for combating annotation explosion • Interference-bounding for combating state explosion
Interference bounding • An approach to the state-explosion problem • Explore all executions with a bounded amount of interference • increase the interference bound iteratively • Good idea if low interference bound • can be exploited by algorithms • is enough to expose bugs
Context-bounded verification [Wu-Q, PLDI 04] [Rehof-Q, TACAS 05] • Interference proportional to number of context switches • Explore all executions with few context switches • Unbounded computation within each context • Different from bounded model checking • [Figure: an execution divided into contexts separated by context switches]
Context-bounding today • CHESS [Musuvathi-Q, PLDI 07] • JMoped [Suwimonteerabuth et al., SPIN 08] • SPIN for multithreaded C programs [Zaks-Joshi, SPIN 08] • CBA [Lal-Reps, CAV 08] • Static Driver Verifier
Testing concurrent programs is HARD • Bugs hidden in rare thread interleavings • Today, concurrency testing = stress testing • Poor coverage of interleavings • Unpredictable coverage results in “Heisenbugs” • The mark of reliability of the system still remains its ability to withstand stress
CHESS in a nutshell • Replace the OS scheduler with a demonic scheduler • Systematically explore all scheduling choices • [Figure: Concurrent Program, Win32 API, Kernel Scheduler, Demonic Scheduler]
CHESS architecture • CHESS runs the scenario in a loop: while (not done) { TestScenario() } • Each run takes a different interleaving • Each run is repeatable • Intercept synchronization and threading calls • To control and introduce nondeterminism • Detect • Assertion violations • Deadlocks • Data races • Livelocks • [Figure: Program with TestScenario() { … }, CHESS, Win32 API, Kernel: Threads, Scheduler, Synchronization Objects]
CHESS methodology generalizes • CHESS works for • Unmanaged programs (written in C, C++) • Managed programs (written in C#, …) • Singularity applications • With appropriate wrappers, can work for Java, Linux applications • [Figure: Singularity Program / CHESS / Singularity; .NET Program / CHESS / .NET CLR; Win32 Program / CHESS / Win32 / OS]
CHESS: Systematic testing for concurrency • CHESS runs the scenario in a loop: while (not done) { TestScenario() } • Each run is a different interleaving • Each run is repeatable • [Figure: Program with TestScenario() { … }, CHESS, Win32 API, Kernel: Threads, Scheduler, Synchronization Objects]
State-space explosion • n threads (Thread 1 … Thread n), k steps each (x = 1; … x = k;) • Number of executions = $O(n^{nk})$ • Exponential in both n and k • Typically: n < 10, k > 100 • Limits scalability to large programs • Goal: Scale CHESS to large programs (large k)
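To get a feel for the numbers on this slide, the Python sketch below counts interleavings exactly under a simplifying assumption that the slide does not state: every thread is straight-line code with exactly k steps and never blocks. Under that assumption the count is the multinomial coefficient $(nk)! / (k!)^n$, which stays below the slide's $n^{nk}$ bound. The function names are hypothetical.

```python
from math import factorial


def interleaving_count(n, k):
    """Exact number of interleavings of n straight-line threads with k steps each."""
    return factorial(n * k) // factorial(k) ** n


def slide_bound(n, k):
    """The n^(nk) upper bound quoted on the slide."""
    return n ** (n * k)


# Two threads with three steps each already give 20 interleavings;
# the count grows exponentially as k increases.
small = interleaving_count(2, 3)
growth = [interleaving_count(2, k) for k in (1, 2, 4, 8)]
```

Even with n = 2, doubling k roughly squares the count, which is why exhaustive enumeration does not scale to large k.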
Preemption-bounding • Prioritize executions with small number of preemptions • Two kinds of context switches: • Preemptions: forced by the scheduler • e.g. time-slice expiration • Non-preemptions: a thread voluntarily yields • e.g. blocking on an unavailable lock, thread end
  Thread 1: x = 1; if (p != 0) { x = p->f; }
  Thread 2: p = 0;
  Example interleaving:
    Thread 1: x = 1; if (p != 0) {
      (preemption)
    Thread 2: p = 0;
      (non-preemption)
    Thread 1: x = p->f; }
Preemption-bounding in CHESS • The scheduler has a budget of c preemptions • Nondeterministically choose the preemption points • Resort to non-preemptive scheduling after c preemptions • Once all executions explored with c preemptions • Try with c+1 preemptions
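The budgeted search described above can be sketched on the two-thread example from the preemption-bounding slide. The Python code below is a toy model written for illustration; it is not CHESS and uses none of its APIs. The names `THREADS`, `explore`, and `smallest_bug` are hypothetical, each thread is a list of atomic steps, and a switch counts as a preemption only when the thread being left could still run.

```python
class NullDeref(Exception):
    """Raised when the modelled program dereferences a null pointer."""


def t1_assign(s):   # x = 1;
    s["x"] = 1


def t1_test(s):     # if (p != 0)
    s["guard"] = s["p"] is not None


def t1_deref(s):    # x = p->f;
    if s["guard"]:
        if s["p"] is None:
            raise NullDeref()
        s["x"] = s["p"]["f"]


def t2_clear(s):    # p = 0;
    s["p"] = None


THREADS = [[t1_assign, t1_test, t1_deref], [t2_clear]]


def explore(budget):
    """Return every schedule (list of thread ids) that fails with at most `budget` preemptions."""
    bugs = []

    def dfs(state, pcs, current, left, trace):
        enabled = [t for t, steps in enumerate(THREADS) if pcs[t] < len(steps)]
        for t in enabled:
            # Leaving a thread that could still run is a preemption.
            preempt = current in enabled and t != current
            if preempt and left == 0:
                continue
            s = dict(state)  # shallow copy is enough: the object behind p is never mutated
            try:
                THREADS[t][pcs[t]](s)
            except NullDeref:
                bugs.append(trace + [t])
                continue
            next_pcs = list(pcs)
            next_pcs[t] += 1
            dfs(s, next_pcs, t, left - (1 if preempt else 0), trace + [t])

    dfs({"x": 0, "p": {"f": 7}, "guard": False}, [0, 0], None, budget, [])
    return bugs


def smallest_bug(max_budget):
    """Raise the preemption budget iteratively; return (budget, schedule) for the first failure."""
    for c in range(max_budget + 1):
        bugs = explore(c)
        if bugs:
            return c, bugs[0]
    return None
```

In this model no schedule fails with a budget of 0, and the null dereference appears at budget 1 with the schedule `[0, 0, 1, 0]`: Thread 1 passes the test, is preempted, Thread 2 clears p, and Thread 1 then dereferences it.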
Property 1: Polynomial bound • Terminating program with fixed inputs and deterministic threads • n threads, k steps each, c preemptions • Choose c preemption points • Permute n+c atomic blocks • Number of executions $\le \binom{nk}{c} \cdot (n+c)! = O((n^2 k)^c \cdot n!)$ • Exponential in n and c, but not in k
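The bound on this slide is easy to evaluate numerically. The Python sketch below simply computes $\binom{nk}{c} \cdot (n+c)!$ for a few values of k; the function name `preemption_bound` and the sample parameters are chosen here for illustration and do not come from the talk.

```python
from math import comb, factorial


def preemption_bound(n, k, c):
    """Upper bound from the slide: choose c preemption points among the nk steps,
    then permute the resulting n + c atomic blocks."""
    return comb(n * k, c) * factorial(n + c)


# With n = 2 threads and c = 2 preemptions the bound grows only
# quadratically in k, in contrast to the n^(nk) growth of full enumeration.
bounds = {k: preemption_bound(2, k, 2) for k in (10, 100)}
unbounded_k10 = 2 ** (2 * 10)
```

Going from k = 10 to k = 100 raises the bound from 4560 to 477600, roughly a factor of 100, which is the polynomial growth in k that the slide claims.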
Property 2: Simple error traces • Finds smallest number of preemptions to the error • Number of preemptions better metric of error complexity than execution length
Property 3: Coverage metric • If search terminates with preemption-bound of c, then any remaining error must require at least c+1 preemptions • Intuitive estimate for • The complexity of the bugs remaining in the program • The chance of their occurrence in practice
Property 4: Many bugs with 2 preemptions Acknowledgement: testers from PCP team
CHESS status • Being used by testers in many Microsoft product groups • Demo and session at Professional Developers Conference 2008 • CHESS for Win32 • http://research.microsoft.com/projects/chess • CHESS for .NET • Coming soon
So far … • Interference makes program verification difficult • Taming interference • atomicity • interference-bounding
Whither next? • Concurrent programs as a composition of modules implementing atomic actions • Interference-bounding for general concurrent systems • Symbolic context-bounded verification
Structuring concurrent programs • In addition to atomicity, we need • behavioral abstraction analogous to pre/post-conditions for sequential code fragments • intuitive and simple contract language • A calculus of atomic actions (Elmas-Tasiran-Q, POPL 08) • QED verifier
Interference-bounding for general concurrent systems • Task = unit-of-work • Shared-memory systems • Task (usually) corresponds to a thread • Message-passing systems • Task corresponds to a sequence of messages • Need linguistic support for the task abstraction in software models and implementations
Context-bounded reachability is NP-complete
                        Unbounded          Context-bounded
  Finite-state systems  PSPACE-complete    NP-complete
  Pushdown systems      Undecidable        NP-complete
  P = # of program locations, G = # of global states, n = # of threads, c = # of context switches
Symbolic context-bounded verification • The transition relations of tasks are usually encoded symbolically • reachability analysis is PSPACE-complete even for a single task • Scalable tools for single symbolic task • Bebop, Moped, Getafix • Can we extend these techniques to deal with multiple tasks? • Lal and Reps (TACAS 2008, CAV 2008) • Suwimonteerabuth et al. (SPIN 2008)