Explore the formal foundation of partial orders and their relation to directed acyclic graphs in transaction systems. Learn how transactions are modeled and analyzed in both informal and formal terms. Understand the concepts of committed projection and serializable histories.
Transactions Lecture 2 (BHG, Chap. 2) The formal foundation (c) Oded Shmueli 2004
Partial order • L=(Σ, <), where Σ is the domain and < is a binary relation on Σ that is: • irreflexive: for all a ∈ Σ, a ≮ a (i.e., a < a is false). • transitive: for all a, b, c in Σ, a < b and b < c implies a < c. • If a < b then a is a predecessor of b and b follows a. • If neither a < b nor b < a then a and b are incomparable. • L’=(Σ’, <’) is a restriction of L=(Σ, <) on domain Σ’ if Σ’ ⊆ Σ and for all a, b ∈ Σ’, a <’ b iff a < b. • L’ is a prefix of L, written L’ ≤ L, if L’ is a restriction of L and for each a ∈ Σ’, all predecessors of a in L are in Σ’.
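The definitions on this slide can be checked mechanically on small finite examples. The following is a minimal sketch, assuming the relation < is given explicitly as a set of pairs; the function names and the three-operation example are illustrative and not part of the lecture.

```python
from itertools import product

def is_strict_partial_order(domain, lt):
    """lt is a set of pairs (a, b) meaning a < b over the given domain."""
    irreflexive = all((a, a) not in lt for a in domain)
    transitive = all(
        (a, c) in lt
        for (a, b), (b2, c) in product(lt, lt)
        if b == b2
    )
    return irreflexive and transitive

def restriction(lt, sub):
    """Restrict the relation to the sub-domain sub."""
    return {(a, b) for (a, b) in lt if a in sub and b in sub}

def is_prefix(domain, lt, sub):
    """True if the restriction to sub contains all predecessors of its elements."""
    return sub <= domain and all(a in sub for (a, b) in lt if b in sub)

# Hypothetical example: r[x] < w[x] < c, transitively closed.
domain = {"r[x]", "w[x]", "c"}
lt = {("r[x]", "w[x]"), ("w[x]", "c"), ("r[x]", "c")}

print(is_strict_partial_order(domain, lt))
print(is_prefix(domain, lt, {"r[x]", "w[x]"}))
print(is_prefix(domain, lt, {"w[x]"}))
```

The last call reports that {w[x]} is not a prefix, because its predecessor r[x] is missing from the sub-domain.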
Partial order and DAGs • A partial order L=(Σ, <) can be viewed as a directed graph G=(N, E): • N = Σ. • (a, b) ∈ E iff a < b. • G is acyclic since, by transitivity, a cycle would imply a < a for some a ∈ Σ. G is also transitively closed. • Conversely, given a DAG G=(N, E), we can construct a partial order (Σ, <) by transitively closing G to produce (N, E+) and setting Σ = N and a < b iff (a, b) ∈ E+.
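The converse direction (DAG to partial order) amounts to computing a transitive closure. A small sketch, assuming edges are given as a set of pairs; the simple fixed-point loop below is chosen for readability rather than efficiency.

```python
def transitive_closure(edges):
    """Return E+ for a set of directed edges (a, b)."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def is_acyclic(edges):
    """A graph is acyclic iff its closure contains no pair (a, a)."""
    return all(a != b for (a, b) in transitive_closure(edges))

dag = {("a", "b"), ("b", "c")}
print(sorted(transitive_closure(dag)))
print(is_acyclic(dag), is_acyclic({("a", "b"), ("b", "a")}))
```

The closure of the example adds the pair (a, c), which is exactly the relation a < c required by transitivity.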
Transactions • In the system context, a transaction is a particular program execution that manipulates the database using read and write operations. • In the theory context, a transaction is a model of such an execution in which the operations against the database are modeled, as well as their order. • Since a transaction may be generated by concurrent programs, a transaction is best modeled as a partial order. • We will not model all aspects of transactions: • No initial values. • No values read or written. • The analysis will apply to any situation (view each write as an arbitrary function of all values read). • Input and output statements can be modeled via unique data items.
Transactions, informally • T = (S, <) is a partial order: • S is the collection of the transaction’s read and write operations (each appears once). • Exactly one of a (abort) or c (commit) is in S. • All other operations precede a or c. • p < q indicates that p happened before q. • For all x, if wi[x] and ri[x] are both in S, they are comparable (one precedes the other). • [Figure: an example transaction T2 drawn as a DAG over the operations r2[x], r2[y], w2[z], c2.]
Transactions, formally • Ti is a partial order with ordering relation <i: • Ti ⊆ {ri[x], wi[x] | x is a data item} ∪ {ai, ci}. • ai ∈ Ti iff ci ∉ Ti. • If t ∈ Ti is either ai or ci, then for all other p ∈ Ti, p <i t. • If ri[x], wi[x] ∈ Ti then either ri[x] <i wi[x] or wi[x] <i ri[x].
Complete History • Two operations conflict if they operate on the same data item and at least one of them is a write. • A complete history over the transaction set T={T1,…,Tn} is a partial order (H, <H): • H is the union of the Ti’s, H = ∪i Ti. • <H contains the union of the <i, <H ⊇ ∪i <i. • For any two conflicting p, q ∈ H: p <H q or q <H p.
History • Histories model system-wide, not necessarily complete, executions. • A history is a prefix of a complete history. • We usually represent histories as DAGs. • In DAG representation, usually not all transitive edges are drawn.
Committed Projection of a History • Ti is committed (aborted) in H if ci (ai) is present in H. • C(H): the restriction of H to the set of operations of transactions committed in H. • C(H) is a complete history. • C(H) defines the semantics of a history H, that is, the kind of database state transformation performed. • For this interpretation to be sound, the system must actually achieve this effect.
History example • T1 = r1[x] w1[x] c1 • T3 = r3[x] w3[y] w3[x] c3 • T4 = r4[y] w4[x] w4[y] c4 • [Figure: H1, a complete history over T1, T3, T4 drawn as a DAG; all transactions are committed.] • [Figure: H1’, a history that is a prefix of H1, containing r3[x] w3[y] w3[x], r4[y] w4[y] w4[x] and r1[x] w1[x] c1; T3 and T4 are still active.]
C(H) • T1 = r1[x] w1[x] c1 • T3 = r3[x] w3[y] w3[x] c3 • T4 = r4[y] w4[x] w4[y] c4 • [Figure: H1’, a history that is a prefix of H1, containing r3[x] w3[y] w3[x], r4[y] w4[y] w4[x] and r1[x] w1[x] c1.] • Committed projection of H1’, the restriction to the domain of committed transactions: C(H1’) = r1[x] w1[x] c1.
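The committed projection is easy to compute once a history is written down. The sketch below assumes a simplified representation that is not in the slides: a history is a totally ordered Python list of (kind, transaction, item) tuples, and the particular interleaving chosen for H1’ is just one linearization consistent with each transaction's own order.

```python
def committed_projection(history):
    """Keep only the operations of transactions that have a commit in history."""
    committed = {txn for (kind, txn, _) in history if kind == "c"}
    return [op for op in history if op[1] in committed]

# One possible linearization of H1' (assumed interleaving): T3 and T4 are active.
h1_prefix = [
    ("r", 3, "x"), ("r", 4, "y"), ("r", 1, "x"),
    ("w", 3, "y"), ("w", 1, "x"), ("c", 1, None),
    ("w", 4, "y"), ("w", 3, "x"), ("w", 4, "x"),
]

print(committed_projection(h1_prefix))
```

Only T1's operations r1[x], w1[x], c1 survive, which agrees with the committed projection shown on the slide.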
Serializable Histories • Define equivalence of histories. • Define serial histories. • Define serializable histories.
(Conflict) Equivalence of Histories • Histories H and H’ are equivalent if: • H and H’ have the same set of transactions and operations. • H and H’ have the same order on conflicting operations of transactions that are not aborted in H. • Formally, for conflicting pi and pj such that ai, aj ∉ H, if pi <H pj then pi <H’ pj (implying pi <H pj iff pi <H’ pj). • Informally, in ordering conflicting operations we determine what is computed, so equivalent histories perform the same database state transformation. Formally, CSR ⟹ VSR.
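The two conditions for equivalence translate directly into a check. This is a sketch under a simplifying assumption that the slides do not make: histories are total orders, written as strings such as "w1[x] r2[x] c1". The parse helper and the three example histories are hypothetical.

```python
def parse(text):
    """Turn 'w1[x] c1' into [('w', 1, 'x'), ('c', 1, None)]."""
    ops = []
    for tok in text.split():
        if tok[0] in "ca":
            ops.append((tok[0], int(tok[1:]), None))
        else:
            txn, item = tok[1:-1].split("[")
            ops.append((tok[0], int(txn), item))
    return ops

def conflicts(p, q):
    """Same data item and at least one of the two operations is a write."""
    return p[2] is not None and p[2] == q[2] and "w" in (p[0], q[0])

def ordered_conflicts(history):
    """All pairs (p, q), p before q, that conflict and belong to non-aborted transactions."""
    aborted = {txn for (kind, txn, _) in history if kind == "a"}
    pairs = set()
    for i, p in enumerate(history):
        for q in history[i + 1:]:
            if p[1] not in aborted and q[1] not in aborted and conflicts(p, q):
                pairs.add((p, q))
    return pairs

def equivalent(h1, h2):
    return set(h1) == set(h2) and ordered_conflicts(h1) == ordered_conflicts(h2)

ha = parse("w1[x] r2[x] w1[y] c1 c2")
hb = parse("w1[x] w1[y] r2[x] c1 c2")   # swaps two non-conflicting operations
hc = parse("r2[x] w1[x] w1[y] c1 c2")   # reverses the conflicting pair on x

print(equivalent(ha, hb), equivalent(ha, hc))
```

Swapping r2[x] with w1[y] keeps the histories equivalent because the two operations touch different items; moving r2[x] ahead of w1[x] does not.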
Equivalence example • [Figure: histories H2, H3 and H4 drawn as DAGs over the operations of T1 (r1[x], r1[y], w1[y], w1[x], c1) and T2 (r2[z], w2[y], w2[x], c2).] • H3 is equivalent to H2. • H4 is not equivalent to H2 or H3; consider, for example, the conflicting pair w1[y], w2[y].
Serializable Histories • A complete history is serial if for all Ti, Tj, all operations of Ti precede those of Tj or vice versa. • We would like “correct” to mean “same as serial”. • Technical problem: a serial history is complete by definition, while a history in general is not. • “Solution”: allow serial histories over incomplete transactions. • But incomplete histories may represent an incorrect database transformation. • A serial execution is a correct database state transformation. • So, for a history H to be “correct” we require it to be “equivalent” to a complete history H’. • H itself is not necessarily complete, but C(H) is complete. • Also, C(H) is the semantics of H. So, we define: • H is serializable (SR) if C(H) is equivalent to a serial history.
The Serialization Graph • Consider a history H over T={T1,…,Tn}. • SG(H) has a node for each committed transaction in H. • There is an edge from Ti to Tj (i ≠ j) if one of Ti’s operations conflicts with and precedes one of Tj’s operations.
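A serialization graph can be built by scanning all ordered pairs of operations. The sketch below again assumes totally ordered histories written as strings; parse and the example history are hypothetical helpers, not notation from the lecture.

```python
def parse(text):
    """Turn 'w1[x] c1' into [('w', 1, 'x'), ('c', 1, None)]."""
    ops = []
    for tok in text.split():
        if tok[0] in "ca":
            ops.append((tok[0], int(tok[1:]), None))
        else:
            txn, item = tok[1:-1].split("[")
            ops.append((tok[0], int(txn), item))
    return ops

def serialization_graph(history):
    """Return (nodes, edges) of SG(H) for a totally ordered history."""
    nodes = {txn for (kind, txn, _) in history if kind == "c"}
    edges = set()
    for i, (k1, t1, x1) in enumerate(history):
        for (k2, t2, x2) in history[i + 1:]:
            same_item = x1 is not None and x1 == x2
            if (t1 != t2 and t1 in nodes and t2 in nodes
                    and same_item and "w" in (k1, k2)):
                edges.add((t1, t2))
    return nodes, edges

# T4 never commits, so it contributes neither a node nor an edge.
h = parse("r1[x] w2[x] w2[y] r4[y] c2 r3[y] c3 w1[z] c1")
nodes, edges = serialization_graph(h)
print(sorted(nodes), sorted(edges))
```

In this example the graph has edges T1 → T2 and T2 → T3 but no edge T1 → T3, which also illustrates that SG(H) need not be transitively closed.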
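Serialization Graph • [Figure: history H5, drawn as a DAG over the operations of T1 (r1[x], w1[x], w1[y], c1), T2 (r2[x], w2[y], c2) and T3 (r3[x], w3[x], c3), together with its serialization graph SG(H5) on the nodes T1, T2, T3.] • Note: SG is not transitively closed in general, e.g., replace w3[x] with w3[z].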
Topological sort • Consider a DAG G=(V, E). • List the nodes of V as v1,…,vn so that for all edges (vi, vj), i < j. • A directed graph is acyclic iff it has a topological sort. • Finding a t.s.: • find a source v (a node with no incoming edges). • output v. • delete v together with its outgoing edges. • repeat until no nodes remain.
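The source-removal procedure is commonly implemented with in-degree counters (often called Kahn's algorithm). A minimal sketch, assuming nodes can be sorted so that ties are broken deterministically; it returns None when no topological sort exists, i.e. when the graph has a cycle.

```python
from collections import deque

def topological_sort(nodes, edges):
    """Return a list of nodes in topological order, or None if there is a cycle."""
    indegree = {v: 0 for v in nodes}
    successors = {v: [] for v in nodes}
    for (u, v) in edges:
        successors[u].append(v)
        indegree[v] += 1
    sources = deque(sorted(v for v in indegree if indegree[v] == 0))
    order = []
    while sources:
        v = sources.popleft()          # find a source and output it
        order.append(v)
        for w in successors[v]:        # "delete" its outgoing edges
            indegree[w] -= 1
            if indegree[w] == 0:
                sources.append(w)
    return order if len(order) == len(indegree) else None

print(topological_sort({"T1", "T2", "T3"}, {("T2", "T1"), ("T1", "T3")}))
print(topological_sort({"T1", "T2"}, {("T1", "T2"), ("T2", "T1")}))
```

The second call returns None because neither node of the two-node cycle is ever a source.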
The Serializability Theorem • H is serializable iff SG(H) is acyclic. • (if) Build a serial history Hs by listing the transactions of C(H) in a topological sort order of SG(H). Then C(H) is equivalent to Hs, because conflicting operations appear in the same order in C(H) and Hs.
The Serializability Theorem (if): detailed • H over T={T1,…,Tn}. • W.l.o.g., T1,…,Tm are the transactions committed in H. • Consider SG(H). Sort it topologically: Ti1,…,Tim. • Let Hs = Ti1 Ti2 … Tim. • Claim: C(H) ≡ Hs. • Proof: we need to show the same operations and the same order on conflicting operations. • C(H) and Hs have the same set of operations. • Let pi (of Ti) and pj (of Tj) be conflicting operations with pi <H pj; all such pairs are ordered in H. • Then there is an edge Ti → Tj in SG(H). • So, in the t.s., Ti must precede Tj. • So Ti precedes Tj in Hs, and hence pi precedes pj in Hs.
The Serializability Theorem (Cont.) • H is serializable iff SG(H) is acyclic. • (only if) Consider a serial history Hs equivalent to C(H). • Ti → Tj in SG(H) ⟹ Ti precedes Tj in Hs. • So a cycle in SG(H) implies that a transaction precedes itself in Hs, which is impossible.
The Serializability Theorem (only if): detailed • H is SR. • Hs ≡ C(H), where Hs is serial. • Consider Ti → Tj in SG(H). • This edge is due to conflicting pi (of Ti) and pj (of Tj), where pi precedes pj in C(H). • Since Hs ≡ C(H), pi precedes pj in Hs. • Since Hs is serial, Ti precedes Tj in Hs. • If there is a cycle T1 → T2 → … → Tk = T1 in SG(H): • Then T1 precedes T2 in Hs, …, which precedes T1 in Hs. • But T1 cannot precede itself ⟹ no cycle can exist.
Example • H6 = w1[x] w1[y] c1 r2[x] r3[y] w2[x] c2 w3[y] c3 • SG(H6): T1 → T2, T1 → T3 • There are two t.s.’s: • T1 T3 T2 • T1 T2 T3 • Both provide equivalent serial histories.
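For a graph this small, all topological sorts can be enumerated by brute force. The sketch below takes the edges of SG(H6) as T1 → T2 (w1[x] precedes r2[x] and w2[x]) and T1 → T3 (w1[y] precedes r3[y] and w3[y]); the function name serial_orders is illustrative.

```python
from itertools import permutations

def serial_orders(nodes, edges):
    """All orderings of nodes in which every edge (u, v) has u before v."""
    return [
        order
        for order in permutations(sorted(nodes))
        if all(order.index(u) < order.index(v) for (u, v) in edges)
    ]

sg_nodes = {"T1", "T2", "T3"}
sg_edges = {("T1", "T2"), ("T1", "T3")}

for order in serial_orders(sg_nodes, sg_edges):
    print(" ".join(order))
```

Trying every permutation is only practical for a handful of transactions; it is used here because it makes the "two topological sorts" claim easy to confirm.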
Recoverable Histories • Ti reads x from Tj if: • wj[x] < ri[x], • aj ≮ ri[x], and • if wj[x] < wk[x] < ri[x] for some wk[x], then ak < ri[x]. • Note: i = j is possible. • Ti reads from Tj if Ti reads some data item from Tj.
Examples: Additional Requirements • w1[x] r2[x] w2[y] c2 • T1 may still abort, so this is not recoverable (not RC). • w1[x] r2[x] w2[y] is RC • but if T1 aborts, so must T2 (not ACA). • w1[x,2] w1[y,3] w2[y,1] c1 r2[x] a2 • RC and ACA. On a2 we should restore y=3. Seems OK. • Initially x=1: w1[x,2] w2[x,3] a1 • After a1, should x be 1 (or 3)? If a2 follows, should we restore 2, the before-image of w2? No, x should be 1!
Formally: Additional Requirements (i ≠ j) • RC (recoverable): Ti reads from Tj and ci ∈ H ⟹ cj < ci. • Don’t commit if you read uncommitted data. • ACA (avoids cascading aborts): Ti reads, via ri[x], from Tj ⟹ cj < ri[x]. • Only read data produced by committed transactions. Here i ≠ j. • ST (strict): wj[x] < oi[x] ⟹ aj < oi[x] or cj < oi[x], where oi[x] is ri[x] or wi[x]. • Allows abort to be implemented by restoring before-images. • Each category is more restrictive than the previous one.
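The three conditions can be tested on concrete histories. This is a sketch under assumptions not made in the slides: histories are total orders written as strings, and parse, reads_from, position and the is_* functions are hypothetical helper names. The two test histories are the ones usually labeled H8 and H9 in this lecture.

```python
def parse(text):
    """Turn 'w1[x] c1' into [('w', 1, 'x'), ('c', 1, None)]."""
    ops = []
    for tok in text.split():
        if tok[0] in "ca":
            ops.append((tok[0], int(tok[1:]), None))
        else:
            txn, item = tok[1:-1].split("[")
            ops.append((tok[0], int(txn), item))
    return ops

def reads_from(h):
    """List of (i, j, pos): the read at index pos, issued by Ti, reads from Tj."""
    result = []
    for pos, (kind, i, x) in enumerate(h):
        if kind != "r":
            continue
        aborted_before = {t for (k, t, _) in h[:pos] if k == "a"}
        for (k, j, y) in reversed(h[:pos]):
            if k == "w" and y == x and j not in aborted_before:
                result.append((i, j, pos))
                break
    return result

def position(h, kind, txn):
    """Index of the commit or abort of txn, or None if absent."""
    for pos, (k, t, _) in enumerate(h):
        if k == kind and t == txn:
            return pos
    return None

def is_rc(h):
    for (i, j, _) in reads_from(h):
        ci, cj = position(h, "c", i), position(h, "c", j)
        if i != j and ci is not None and (cj is None or cj > ci):
            return False
    return True

def is_aca(h):
    for (i, j, pos) in reads_from(h):
        cj = position(h, "c", j)
        if i != j and (cj is None or cj > pos):
            return False
    return True

def is_st(h):
    for pos, (kind, i, x) in enumerate(h):
        if kind not in ("r", "w"):
            continue
        ended = {t for (k, t, _) in h[:pos] if k in ("a", "c")}
        for (k, j, y) in h[:pos]:
            if k == "w" and y == x and j != i and j not in ended:
                return False
    return True

h9 = parse("w1[x] w1[y] r2[u] w2[x] w1[z] c1 r2[y] w2[y] c2")
h8 = parse("w1[x] w1[y] r2[u] w2[x] r2[y] w2[y] w1[z] c1 c2")
for name, h in (("H9", h9), ("H8", h8)):
    print(name, "RC:", is_rc(h), "ACA:", is_aca(h), "ST:", is_st(h))
```

H9 comes out as ACA but not ST (w2[x] overwrites T1's uncommitted write), and H8 as RC but not ACA (r2[y] reads T1's write before c1).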
ST ⊂ ACA ⊂ RC • Let H ∈ ST. • Suppose Ti reads x from Tj in H. • Then wj[x] < ri[x] and aj ≮ ri[x]. • By ST, cj < ri[x]. So H ∈ ACA and ST ⊆ ACA. • H9 = w1[x] w1[y] r2[u] w2[x] w1[z] c1 r2[y] w2[y] c2 is ACA but not ST. So ST ⊂ ACA. • Let H ∈ ACA. • Suppose Ti reads x from Tj in H and ci ∈ H. • H ∈ ACA ⟹ wj[x] < cj < ri[x]. • ci ∈ H ⟹ ri[x] < ci ⟹ cj < ci. So H ∈ RC and ACA ⊆ RC. • H8 = w1[x] w1[y] r2[u] w2[x] r2[y] w2[y] w1[z] c1 c2 is RC but not ACA. So ACA ⊂ RC.
State of the world • [Figure: Venn diagram relating the classes of histories SR, RC, ACA, ST and Serial.]
Prefix Commit Closed (PCC) Properties • PCC property: if it holds on a history H then it also holds on C(H’) for any prefix H’ of H. • Any correctness criterion had better be PCC. • Otherwise, the system may fail after producing a prefix H’ such that the property does not hold on C(H’). • ACA, ST, RC, SR are all PCC properties. • SR: H is SR. Look at SG(H). Look at a prefix H’. Look at C(H’). SG(C(H’)) is a subgraph of SG(H), hence acyclic. Hence C(H’) is SR.
Operations other than read/write • Two operations conflict if the order in which they are performed may matter. • Computational effect: the value returned and the data items’ values. • Need to extend the definition of conflict. • The theorems still apply: same definition of SG(H), same Serializability Theorem. • Can create a compatibility matrix. • Important feature: the ordering of conflicting operations.
Operations other than read/write - example • Consider increment (inc) that adds 1 and decrement (dec) that subtracts 1. • No value is returned. • Conflict table • n means conflict • y means no conflict
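A conflict table for read, write, inc and dec can be derived from the definition "two operations conflict if the order of their performance may matter". The sketch below is an illustration rather than the slide's own table: it models each operation as a function from a value to (new value, returned result), and compares both execution orders on a few sample values. The sample range and the written constants 10 and 20 are assumptions, and testing samples is evidence, not a proof.

```python
def read(v):
    return v, v                      # value unchanged, returns the value read

def write(c):
    return lambda v: (c, None)       # overwrite with c, returns nothing

def inc(v):
    return v + 1, None

def dec(v):
    return v - 1, None

def effect(first, second, v):
    """Final value plus the result returned by each of the two operations."""
    v, r_first = first(v)
    v, r_second = second(v)
    return v, r_first, r_second

def order_matters(p, q, samples=range(-3, 4)):
    for v in samples:
        final_pq, rp, rq = effect(p, q, v)
        final_qp, rq2, rp2 = effect(q, p, v)
        if (final_pq, rp, rq) != (final_qp, rp2, rq2):
            return True
    return False

# Two distinct constants so that a write/write pair is not trivially identical.
FIRST = {"read": read, "write": write(10), "inc": inc, "dec": dec}
SECOND = {"read": read, "write": write(20), "inc": inc, "dec": dec}

# Slide convention: n means conflict, y means no conflict.
table = {
    (a, b): "n" if order_matters(FIRST[a], SECOND[b]) else "y"
    for a in FIRST for b in SECOND
}

print("      " + " ".join(f"{b:>5}" for b in SECOND))
for a in FIRST:
    print(f"{a:>5} " + " ".join(f"{table[(a, b)]:>5}" for b in SECOND))
```

Under this model, inc and dec are compatible with each other because they commute and return nothing, while read and write conflict with both of them.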
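Operations other than read/write – example history • [Figure: history H11, drawn as a DAG over the operations of T1 (w1[x], w1[y], c1), T2 (inc2[y], dec2[x], c2), T3 (r3[x], inc3[y], c3) and T4 (dec4[y], w4[x], r4[y], c4), together with its serialization graph SG(H11) on the nodes T1, T2, T3, T4 and the serial order T1 T3 T2 T4.]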
View Equivalence • Transactions are deterministic transformers. • If a transaction reads the same values in two executions, it’ll produce the same values. • So, if in two executions transactions read the same values, they’ll produce the same values. • If, in addition, for all items x, the last transaction to write into x is the same one in the two executions, the final DB will be the same.
View Equivalence, formally • Final write: wi[x] in H, ai not in H, for all other wj[x], wj[x] < wi[x] or aj in H. • H is view-equivalent to H’ if: • H, H’ are over the same set of transactions, • For all Ti, Tj s.t. ai, aj not in H (and H’), if Ti reads x from Tj in H, Ti also does so in H’. • Same final writes in H and H’.
View Serializability • We’d like a definition that captures “a history is view equivalent to a serial history”. • And use it as a correctness criterion. • Let’s try “a history is v-serializable if it’s view equivalent to a serial history”. • H12 = w1[x] w2[x] w2[y] c2 w1[y] c1 | w3[x] w3[y] c3. • H12 is view equivalent to T1 T2 T3. • Suppose the system crashes at the point marked |. • The resulting execution, H12’ = w1[x] w2[x] w2[y] c2 w1[y] c1, is not view equivalent to either T1 T2 or T2 T1. • So “v-serializable” is not an appropriate correctness criterion. We need to enforce PCC.
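The H12 argument can be replayed with a small view-equivalence checker. This is a sketch assuming totally ordered histories written as strings; parse, reads_from, final_writes, serial and view_equivalent are hypothetical helper names, and reads with no preceding write (reads of the initial state) are simply not recorded.

```python
def parse(text):
    """Turn 'w1[x] c1' into [('w', 1, 'x'), ('c', 1, None)]."""
    ops = []
    for tok in text.split():
        if tok[0] in "ca":
            ops.append((tok[0], int(tok[1:]), None))
        else:
            txn, item = tok[1:-1].split("[")
            ops.append((tok[0], int(txn), item))
    return ops

def reads_from(h):
    """Set of (i, x, j): non-aborted Ti reads x from non-aborted Tj."""
    aborted = {t for (k, t, _) in h if k == "a"}
    result = set()
    for pos, (kind, i, x) in enumerate(h):
        if kind != "r" or i in aborted:
            continue
        aborted_before = {t for (k, t, _) in h[:pos] if k == "a"}
        for (k, j, y) in reversed(h[:pos]):
            if k == "w" and y == x and j not in aborted_before:
                if j not in aborted:
                    result.add((i, x, j))
                break
    return result

def final_writes(h):
    """Map each item to the last non-aborted transaction that writes it."""
    aborted = {t for (k, t, _) in h if k == "a"}
    last = {}
    for (k, t, x) in h:
        if k == "w" and t not in aborted:
            last[x] = t
    return last

def serial(h, order):
    """Serial history that runs the transactions of h one after another."""
    return [op for t in order for op in h if op[1] == t]

def view_equivalent(h1, h2):
    return (set(h1) == set(h2)
            and reads_from(h1) == reads_from(h2)
            and final_writes(h1) == final_writes(h2))

h12 = parse("w1[x] w2[x] w2[y] c2 w1[y] c1 w3[x] w3[y] c3")
h12_crashed = parse("w1[x] w2[x] w2[y] c2 w1[y] c1")

print(view_equivalent(h12, serial(h12, [1, 2, 3])))
print(view_equivalent(h12_crashed, serial(h12_crashed, [1, 2])))
print(view_equivalent(h12_crashed, serial(h12_crashed, [2, 1])))
```

In H12 the writes of T3 hide the inconsistency; once they are cut off, the final writes of x (by T2) and y (by T1) cannot both be produced by any serial order of T1 and T2.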
View Serializability, formally • H is VSR if for each prefix H’ of H, C(H’) is view equivalent to a serial history. • “for each prefix”, so it’s a PCC property!
View Serializability, properties • CSR ⊂ VSR (next slide) • VSR ⊄ CSR: • w1[x] w2[x] w2[y] c2 w1[y] w3[y] c3 w1[z] c1 is VSR, • but not CSR: T1 → T2 → T1 in SG(H). • VSR is more inclusive but not a practical notion (a scheduler that outputs exactly the VSR histories would need to “solve” P=NP first).
View Serializability, CSR ⊆ VSR • CSR ⊆ VSR: Let H be SR, so SG(H) is acyclic. • Consider an arbitrary prefix H’ of H. • SG(H’) is acyclic (it is a subgraph of SG(H)). • So H’ is SR: C(H’) ≡ Hs where Hs is serial. • In C(H’) and Hs: • Same reads-from: otherwise conflicting ops would be in the wrong order. • Same final writes: similar reason. • Conclusion: C(H’) is view equivalent to Hs. • H’ was chosen arbitrarily, so H is VSR.