
Ganesh Gopalakrishnan Associate Professor Computer Science University of Utah

Intel MPG talk of 11/12/99, Santa Clara: * Overview of the Utah Verifier Group * Verification of Coherence Protocols against Shared Memory Consistency Models using Test Model-Checking. Ganesh Gopalakrishnan, Associate Professor, Computer Science, University of Utah


Presentation Transcript


  1. Intel MPG talk of 11/12/99, Santa Clara: * Overview of the Utah Verifier Group * Verification of Coherence Protocols against Shared Memory Consistency Models using Test Model-Checking Ganesh Gopalakrishnan Associate Professor Computer Science University of Utah www.cs.utah.edu/~ganesh

  2. Past Utah Verifier group Members • Ratan Nalumasu, PhD ‘98 (HP) • new partial-order reduction algorithm and model-checker PV • approach to write high-level specs for coherency protocols and obtain split transaction protocols automatically • test model-checking approach • Abdel Mokkedem, Postdoc (Compaq) • help in above, plus modeling & verifying the PCI 2.1 protocol • Rajnish Ghughal, MS ‘99 (Intel, Oregon) • test model-checking for weak memory models

  3. Present group Members • Ravi Hosabettu, PhD student • approach to pipelined processor modeling and verification using layered abstraction map • recently finished verification of high-level design model of CPU with reorder buffer, branches, speculation, exceptions (PVS proof - 35 days) • Michael Jones, PhD student • verifying the PCI 2.1 protocol using an abstraction map to PCI_abstract followed by a special-purpose SML model-checker for PCI_abstract • Annette Bunker, PhD student • background research • New group members: Ritwik Bhattacharya, Jason Yang, Ali Sezgin, Prosenjit Chatterjee

  4. Verification of Coherence Protocols Against Shared Memory Consistency Models Using Test Model-Checking

  5. FM and shared-memory system design • Processor speed grows faster than memory speed • Mismatch exacerbated by shared memory multiprocessors • Complex protocols employed to hide memory latencies • Need for formal verification techniques that can be employed during design • Handle strong (e.g. sequential consistency) and weak (e.g. TSO) memory models

  6. Related Work • Graf (CAV’94) • for more than SC (hence unsound for SC) • properties depend on design • Alur, McMillan, Peled (LICS’96) • undecidable if data can be compared • Nalumasu, Ghughal, Mokkedem, Gopalakrishnan (CAV’98) • Henzinger, Qadeer, Rajamani (CAV’99) • needs invariants • invariants depend on design • assumes address-symmetry • Collier (‘80s) • not available at design-time Ganesh, Utah Verifier group -- Intel MPG talk

  7. Memory Models • Describe a memory system’s behavior in response to memory operations [Figure: memory operations (read or write) from various processes arrive at the Memory System]

  8. Uniprocessor Memory Model: the von Neumann model • Memory operations (reads and writes) execute in the order in which they appear in the program [Figure: processor P attached to Memory]

  9. Sequential Consistency: A multiprocessor memory model • Memory operations complete in program order • A Write becomes instantly visible to all processors [Figure: processors P1, P2, ..., Pn attached to Memory]

  10. Weaker Memory Models • Sequential Consistency: an intuitive and strong memory model, but it does not allow many architectural optimizations • Weaker memory models: • Memory operations can occur out of order • Allow more architectural optimizations, enabling significant performance gains • Many real processors implement weaker memory models, e.g. Sun Ultra 4, Alpha, PowerPC, Intel etc.

  11. An Example Weaker Memory Model: SPARC Total Store Order (TSO) • The presence of local caches + write buffers + out of order memory accesses • Performance vs. programming complexity [Figure: processors P1, P2, ..., Pn attached to Memory]

  12. Memory Model Verification Problem [Figure labels: three CPU+Cache nodes, Snooping bus, Mem; “=”; CPU, CPU, …, Mem]

  13. Why are informal methods insufficient? • Danger of using incorrect optimizations • uniprocessor optimizations may not be legal for multiprocessors • Danger of incorrect implementations of legal optimizations • Concurrency: informal methods are inadequate • Memory system semantics are complex and non-intuitive • more so for weaker memory models

  14. An optimization: fine for uni-processor... • P1: F1 := 1; R1 := F2 • Writes have higher latencies than reads • A simple optimization: let the read of F2 bypass the write of F1 • Works fine for uni-processor machines

  15. … but not so for multiprocessors • P1: F1 := 1; R1 := F2; if (R1 == 0) critical section • P2: F2 := 1; R2 := F1; if (R2 == 0) critical section • If the read bypasses the write, then both P1 and P2 can be in the critical section! • Many optimizations in uni-processor designs are not applicable to multiprocessors
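The two-flag example on slides 14 and 15 can be checked by brute force. The sketch below is only an illustration, not part of the talk: it enumerates every interleaving of P1 and P2 over a single shared memory, first with each processor's operations in program order and then with a hypothetical "optimized" order in which each read is issued before the preceding write. The helper names (`interleavings`, `outcomes`) are made up for this example.

```python
from itertools import combinations

# Each op is (kind, variable): "w" writes 1 to a flag, "r" reads a flag.
P1 = [("w", "F1"), ("r", "F2")]   # F1 := 1; R1 := F2
P2 = [("w", "F2"), ("r", "F1")]   # F2 := 1; R2 := F1


def interleavings(a, b):
    """Yield every merge of a and b that keeps each sequence's own order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        ia, ib = iter(a), iter(b)
        yield [(0, next(ia)) if i in slots else (1, next(ib)) for i in range(n)]


def outcomes(p1, p2):
    """Return the set of final (R1, R2) pairs over all interleavings."""
    results = set()
    for schedule in interleavings(p1, p2):
        mem = {"F1": 0, "F2": 0}
        regs = [None, None]
        for proc, (kind, var) in schedule:
            if kind == "w":
                mem[var] = 1
            else:
                regs[proc] = mem[var]
        results.add(tuple(regs))
    return results


in_order = outcomes(P1, P2)
# Assumed model of the optimization: the read is issued before the write.
bypassed = outcomes(P1[::-1], P2[::-1])

print((0, 0) in in_order)   # False: at most one process enters
print((0, 0) in bypassed)   # True: both can enter the critical section
```

The outcome `(0, 0)` is the one in which both `if` tests succeed, so its presence in `bypassed` is exactly the failure described on the slide.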

  16. Our main example: a Symmetric Multi-Processor (SMP) bus [Figure: three CPUs, each with a cache ($), on a coherent snooping bus with Memory] • Problem studied: how can the CPU designer - specify desired orderings of reads and writes - verify the implementation for adherence (in appearance)

  17. The `Utah Runway Bus Model’ (URM) [Figure labels: Broadcast, Host, a, b, Client0, Client1, Cache lines, Runout, Runin, Noncoh, Coh_chans]

  18. How test model-checking works • Drive memory system model using test automata • See if error-state(s) reached [Figure labels: Broadcast, Host, a, b, Client0, Client1]

  19. Deriving Test-automata • Assume that memory-systems do not decode ‘data’ and use addresses only in = and != tests • Establish Limited Address Theorems for the chosen memory model (PO in our case) • for an interesting class of programs, examining all two-address programs is sufficient • List all possible violations over 1- and 2-addresses • Abstract these violations into test-automata • Test automata • are sound • completeness results under investigation • found effective in practice

  20. An Illustrative Example • P1: A := 1; A := 2; A := 3; ....; A := k • P2: X1 := A; X2 := A; X3 := A; ....; Xk := A • Suppose the observed executions are: [not recoverable from the slide] • Then a_i are: [not recoverable from the slide] • There exists some i,j s.t. j < i /\ X(j) < X(i) • [Figure: test automaton for P1 and P2 with transitions wr(0), wr(1), wr(1), rd(0), rd(1), rd(1), rd(0) and an Error state] • Achieves the effect of k = infinity • Considers all interleavings
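The idea on slides 18 to 20 (drive a memory-system model with a small test automaton and ask whether an error state is reachable) can be sketched as an explicit-state search. Everything concrete here is an assumption made for illustration: the writer automaton writes 0 and may switch once to writing 1, the reader's error state is taken to be "rd(1) followed by rd(0)", and the memory system is a toy with `num_replicas` copies of one location that are updated non-atomically. It is not the URM model and not the VIS, SPIN or PV encodings used in the talk.

```python
from collections import deque


def step(state):
    """Yield (next_state, is_error) for every enabled move of the composed system."""
    phase, cells, seen_one = state
    if phase == 0:
        # Non-deterministic switch of the writer from wr(0) to wr(1).
        yield (1, cells, seen_one), False
    # Writer P1 writes its current value (0 or 1) into replica 0.
    yield (phase, (phase,) + cells[1:], seen_one), False
    # The toy memory system copies replica 0 into another replica.
    for i in range(1, len(cells)):
        yield (phase, cells[:i] + (cells[0],) + cells[i + 1:], seen_one), False
    # Reader P2 may read any replica; rd(1) followed by rd(0) is the error.
    for v in cells:
        yield (phase, cells, seen_one or v == 1), (seen_one and v == 0)


def error_reachable(num_replicas):
    """Breadth-first search of the product state space for an error move."""
    start = (0, (0,) * num_replicas, False)
    seen, queue = {start}, deque([start])
    while queue:
        for nxt, is_error in step(queue.popleft()):
            if is_error:
                return True
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


print(error_reachable(1))  # False: a single serial copy never goes backwards
print(error_reachable(2))  # True: a stale second copy lets a read of 0 follow a read of 1
```

Because the writer only distinguishes "old" (0) from "new" (1), the finite automaton stands in for a program with an unbounded number of writes, which is the sense in which the slide says it achieves the effect of k = infinity.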

  21. All one-address PO violations (1-3 of 5) • P_i: ... rd(a,v1) … rd(a,v2) ...; P_j: … wr(a,v2) … wr(a,v1) ... (P_i and P_j could be the same process) • P_i: ... rd(a,v) … rd(a,T) ...; P_j: … wr(a,v) … (P_i and P_j could be the same process) • P_i: ... rd(a,v) …; P_j: ... … ... (v is not the initial value T of a, and a is not written anywhere)

  22. ...All one-address PO violations (4-5 of 5) • (4) P_i: ... rd(a,v) … wr(a,v) ... • (5) P_i: ... wr(a,v) … rd(a,T) ... • v is not the initial value T of a, and a is not written before being read

  23. Verification of Program Ordering for all one-address programs • S(x) means Write(A,x) • Read(A,-) • Error states: E1, E2 [Figure labels: Broadcast, Host, a, b, Client0, Client1]

  24. Verification of Program Ordering for all two-address programs • S(x,y) means Write(A,x) Write(B,y) • Read(B,-) Read(A,-) • Error states: E1, E2 [Figure labels: Broadcast, Host, a, b, Client0, Client1]

  25. Can run demo of this model-checking on this laptop if there is interest (need to boot Linux) • Error states: E1, E2 [Figure labels: Broadcast, Host, a, b, Client0, Client1]

  26. How to Handle Weaker Memory Models? • Identify new rules (if necessary) • Create new tests and test model-checking automata • Consider memory operations other than read and write • fences, barriers etc.

  27. Weaker memory models - relaxations • Partial-PO Relaxation : • Relaxes PO partially - WR is always relaxed • May relax WA in various orders • examples : SPARC V9 TSO, PSO, Intel Pentium Pro, Processor consistency etc. • Complete-PO relaxation : • Relaxes PO completely • typically does not relax WA • examples : SPARC V9 RMO, Alpha, PowerPC, Release Consistency

  28. SPARC Total Store Order (TSO) • Relaxes Write-Read (WR) sub-rule • Also relaxes WA in a subtle way [Figure: processors P1, P2, ..., Pn attached to Memory]

  29. TSO and PSO Specification (Ghughal, MS ‘99) • TSO = (UPO,RO,WO,RW,WA-S,MB-WR) • PSO = (UPO,RO,RW,WA-S,MB-WR,MB-WW) • A series of “pure tests” are defined to test for individual ordering rules (e.g. RO) in isolation
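Reading the two rule tuples on slide 29 as sets makes the difference between the models explicit. This is only a bookkeeping sketch over the rule names given on the slide; it says nothing about what each rule means.

```python
# Rule names copied from slide 29; treating them as sets is an illustration only.
TSO = {"UPO", "RO", "WO", "RW", "WA-S", "MB-WR"}
PSO = {"UPO", "RO", "RW", "WA-S", "MB-WR", "MB-WW"}

print(sorted(TSO - PSO))  # ['WO']: write ordering is required by TSO only
print(sorted(PSO - TSO))  # ['MB-WW']: PSO adds a write-write membar rule
print(sorted(TSO & PSO))  # rules the two specifications share
```

Since each pure test targets one rule in isolation, the tests for the shared rules can in principle be reused across both models, and only the differing rules need model-specific tests.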

  30. Motivation for Pure Tests • P1: A := 1; A := 2; A := 3; ....; A := k • P2: X1 := A; X2 := A; X3 := A; ....; Xk := A • There exists some i,j s.t. j < i /\ X(j) < X(i) • [Figure: test automaton for P1 and P2 with transitions wr(0), wr(1), wr(1), rd(0), rd(1), rd(1), rd(0) and an Error state] • A visit to Error-state tells us that ONE OF RO, WO, RW, or WR is violated -- NOT which one

  31. Steps for creating test-automata • Identify violation in the setting of a simple example • Argue that regardless of WO, this violates RO • Generalize error to execution sequence (next slide) • Build test automata (following that) • Example: initialize all variables to 0 • P1: A := 1; A := 2; • P2: X := A; Y := A; Z := A; • Finally A==2; X==Z==1 or 2, Y==1 or 2, Y!=X

  32. Pure Test for RO over the same operand (WO is NOT assumed!) • New Test for RO • P1: A:=1; A:=2; ..; A:=k • P2: X[1]:=A; X[2]:=A; ..; X[k]:=A • Condition : for all p, q, r : p < q < r : X[p] = X[r] => X[p] = X[q] = X[r] • Formally proved that this (+ all others) are pure tests • Completeness still open.
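The condition on slide 32 can be written directly as a check over the list of values read by P2. The function name `ro_condition_holds` is made up for this sketch, and the sample read sequences are invented inputs, not results from the talk.

```python
from itertools import combinations


def ro_condition_holds(x):
    """for all p < q < r : X[p] = X[r] => X[p] = X[q] = X[r]"""
    return all(
        x[p] != x[r] or x[q] == x[p]
        for p, q, r in combinations(range(len(x)), 3)
    )


print(ro_condition_holds([1, 1, 2, 2]))  # True
print(ro_condition_holds([1, 2, 1]))     # False: the value 1 is seen on both sides of a 2
print(ro_condition_holds([2, 1, 1]))     # True
```

The last call is accepted even though the values decrease. One way to read that, consistent with the slide's remark that WO is not assumed, is that the pattern could be explained by the writes themselves being reordered, so it is not evidence against RO.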

  33. Test Automata for RO on Same Operand, Obtained Assuming Data Independence [Figure labels: P1: A := 0, A := 1, A := 0, non-deterministic switch; P2: read(A) loops, X1 := read(A), X2 := read(A), X3 := read(A); states s0, s1, s2] • Safety Property : Finally, X1 = X3 = 1 => X1 = X2 = X3

  34. Pure Test for RO- different operands- WO not assumed • P1: B:=1 • P2: X := A; Y := B; C := Y; • P3: U := C; A := U; • Initially all vars == 0 • Finally all vars == 1 • => In P2, B must have been read before A
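The argument on slide 34 can be checked by enumerating interleavings of the three programs over a single shared memory. This is a sketch under simplifying assumptions: every operation is atomic, all variables (including X, Y and U) live in one dictionary, and the "reordered" P2 is a hypothetical variant in which the read of A is performed last. The helper names are made up for this example.

```python
def merges(seqs):
    """Yield every interleaving of the sequences, preserving each one's order."""
    if all(not s for s in seqs):
        yield []
        return
    for i, s in enumerate(seqs):
        if s:
            rest = seqs[:i] + [s[1:]] + seqs[i + 1:]
            for tail in merges(rest):
                yield [s[0]] + tail


# Each op is (destination, source); a source is a constant or a variable name.
P1 = [("B", 1)]
P2 = [("X", "A"), ("Y", "B"), ("C", "Y")]
P3 = [("U", "C"), ("A", "U")]


def all_ones_reachable(p1, p2, p3):
    """True if some interleaving ends with every variable equal to 1."""
    for schedule in merges([p1, p2, p3]):
        env = dict.fromkeys("ABCXYU", 0)
        for dst, src in schedule:
            env[dst] = src if isinstance(src, int) else env[src]
        if all(v == 1 for v in env.values()):
            return True
    return False


# Hypothetical reordering inside P2: B is read (and C written) before A is read.
P2_reordered = [P2[1], P2[2], P2[0]]

print(all_ones_reachable(P1, P2, P3))            # False
print(all_ones_reachable(P1, P2_reordered, P3))  # True
```

With P2 in program order, X can only read 1 after A := U has stored a 1, which in turn needs C := Y to have run, and that comes after X := A in P2; the search confirms that no interleaving escapes this cycle.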

  35. Pure Test for RO- different operands- WO not assumed • P1: B:=1; B:=2; ..; B:=k • P2: Y[0] := 0; X[1] := A; Y[1] := B; C := Y[1]; X[2] := A; Y[2] := B; C := Y[2]; …; X[k] := A; Y[k] := B; C := Y[k]; • P3: U[1] := C; A := U[1]; U[2] := C; A := U[2]; ...; U[k] := C; A := U[k]; • Condition : Exists i:1<= i<= k Forall j:0<=i: X[i] != Y[j] • “X is getting ahead of all the Y’s so far” -- need to examine a history of values... • Turn into OR accumulator via data-independence!

  36. Test Automata for RO (diff opnds) [Figure labels: P1 (s0): B:=0, B:=1; P2 (s0, s1): read(A); t := read(B); C := t; y := y \/ t; and x := read(A); t := read(B); C := t; P3 (s0): u := read(C); A := u;] • Safety Property : (P2 in S1 /\ y==0) => x==0

  37. A Pure Test for (UPO, WO) • P1: A := 1; B := 2; U[1] := B; ...; A := 2k-1; B := 2k; U[k] := B • P2: B := 1; A := 2; V[1] := A; ...; B := 2k-1; A := 2k; V[k] := A • Condition : forall i,j : U[i] is even or U[i] >= 2j or V[j] is even or V[j] >= 2i • will need 2 bits for test model-checking automata

  38. Test Automata for UPO,WO (diff opnds) [Figure labels: P1 (s0, s1): A := 01; B := 00; read(B); A := 01; B := 00; u := read(B); A := 11; B := 10; read(B); P2 (s0, s1): B := 01; A := 00; read(A); B := 01; A := 00; v := read(A); B := 11; A := 10; read(A);] • (P1 and P2 in their S1) => u is even \/ u = 11 \/ v is even \/ v = 11

  39. WA-Relaxation of TSO • P1: A := 1; C := 1; U := C; X := B • P2: B := 1; D := 1; V := D; Y := A; • Initially A = B = C = D = U = V = X = Y = 0; • Finally, A = B = C = D = U = V = 1; X = Y = 0; • Execution valid under TSO but not under SC. • WA Relaxation - captured by new rule WA-S
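The claim on slide 39 (the outcome U = V = 1, X = Y = 0 is valid under TSO but not under SC) can be reproduced with a small operational sketch. The model below is an assumption chosen for illustration, not the specification from the talk: each processor has a FIFO store buffer, a load first looks in its own buffer and then in memory, and buffered stores drain to memory at arbitrary times. With `buffered=False` stores go straight to memory, which gives plain interleaving semantics. U, V, X and Y are treated as local registers.

```python
# ("st", var) stores 1 to var; ("ld", var, reg) loads var into a local register.
PROGS = (
    (("st", "A"), ("st", "C"), ("ld", "C", "U"), ("ld", "B", "X")),
    (("st", "B"), ("st", "D"), ("ld", "D", "V"), ("ld", "A", "Y")),
)


def final_registers(buffered):
    """Explore every schedule; return the set of final (U, V, X, Y) tuples."""
    start = ((0, 0), ((), ()), frozenset(), ())
    stack, seen, finals = [start], {start}, set()
    while stack:
        pcs, bufs, mem, regs = stack.pop()
        succ = []
        for p in (0, 1):
            if bufs[p]:
                # Drain the oldest buffered store of processor p to memory.
                nb = bufs[:p] + (bufs[p][1:],) + bufs[p + 1:]
                succ.append((pcs, nb, mem | {bufs[p][0]}, regs))
            if pcs[p] == len(PROGS[p]):
                continue
            op = PROGS[p][pcs[p]]
            npcs = pcs[:p] + (pcs[p] + 1,) + pcs[p + 1:]
            if op[0] == "st":
                if buffered:
                    nb = bufs[:p] + (bufs[p] + (op[1],),) + bufs[p + 1:]
                    succ.append((npcs, nb, mem, regs))
                else:
                    succ.append((npcs, bufs, mem | {op[1]}, regs))
            else:
                # A load sees the processor's own buffered store, else memory.
                val = int(op[1] in bufs[p] or op[1] in mem)
                succ.append((npcs, bufs, mem, regs + ((op[2], val),)))
        if not succ:
            r = dict(regs)
            finals.add((r["U"], r["V"], r["X"], r["Y"]))
        for s in succ:
            if s not in seen:
                seen.add(s)
                stack.append(s)
    return finals


print((1, 1, 0, 0) in final_registers(buffered=True))   # True: allowed with store buffers
print((1, 1, 0, 0) in final_registers(buffered=False))  # False: impossible without them
```

In the buffered run each processor reads its own pending write (U = 1, V = 1) before that write is visible to the other processor, which is one way to picture the weaker atomicity that the slide attributes to WA-S.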

  40. Rule of WA-S • WA : • a write becomes visible to all processors “instantly” • atomic set of events - all write events • WA-S : • a write becomes visible to all other processors “instantly” • atomic set of events - all write events in stores of other processors

  41. Memory Barriers - membar • A special type of memory operation that enforces additional PO constraints as required • could select a particular sub-rule of PO • example : R1 := A; membar LoadStore; B := R2; • also known as fences etc.

  42. Rule of MB (MemBar) • Define one event corresponding to each membar instruction • Pi L : membar storestore • Enforce orderings between all relevant operations before and after membar • Consists of 4 sub-rules : MB-RR , MB-RW, MB-WW, MB-WR

  43. What about Rule of MB? • only orders some reads and writes with respect to each other • Hence, could use test for sub-rules of PO to check for various sub-rules of MB • e.g. (CMP, RO) could be used for (CMP,MB-RR) • will need a MB-RR instruction between every two reads in Tests, but only 1 in test model-checking automata

  44. Test Automata for (CMP, MB-RR) [Figure labels: P1: A := 0, A := 1, non-deterministic switch; P2: read(A) loops, X1 := read(A) ; MB-RR, X2 := read(A) ; MB-RR, X3 := read(A) ; MB-RR; states s0, s1, s2] • Finally, X1 = X3 => X1 = X2 = X3

  45. New Tests and Test model-checking automata • Also developed new tests for • CMP, UPO, RO - checks for read ordering between two different operands • CMP, UPO, WO - checks for write ordering • CMP, UPO,CON - checks for coherency • Developed corresponding test automata • Provided formal proofs for each test and the test model-checking automata abstraction

  46. How to handle models such as the Alpha weaker memory model? • Relaxes Program Order completely • Orderings guaranteed by explicit membar when needed • Write atomicity is relaxed in a manner similar to TSO • Specification as (UPO, ROO, WA-S, MB, MB-WW) • Tests developed for the same

  47. Memory Systems Verified • Verified three memory systems using VIS for SC • Also did last example in Promela and SPIN / PV • Serial Memory : a simple memory system • Lazy Caching : A Simple bus-based protocol involving queues • Runway-PA8000 Memory system : A fairly complex commercial multiprocessor memory system from Hewlett Packard (the URM)

  48. Experimental Results (VIS)

  49. SC verification of the HP/Runway model: Promela, with SPIN and PV (#states)

  50. Experimental Results for TSO operational model (in VIS)
      TA             States   Bdds   Time
      CMP, RO, WO    3k       4k     < 1 s
      CMP, PO        6.5M     50k    2:38 s
      CMP, WR        6.5k     50k    1:25 s
      CMP, RW        6.5k     50k    3:02 s
      CMP, RO        10k      2k     1:25 s
      Green is Pass ; Red is Fail (as expected for TSO)
