transaction management in database systems

1. 1 Transaction management in database systems Amr El Abbadi Department of Computer Science University of California Santa Barbara, CA 93106

2. 2 Overview Serializability Theory and Concurrency Control. Recovery and Atomic commitment. Ref: Concurrency Control and Recovery in Database Systems, Bernstein, Hadzilacos and Goodman.

3. 3 Preliminaries A database consists of a set of objects: x,y,z. Each object has a value. The values of all the objects form the state of the database, and these states must satisfy the database integrity constraints. Database objects support 2 atomic operations: read[x], write[x].

4. 4 Preliminaries A transaction is a set of operations executed in some order. We will assume total order. A transaction is assumed to be correct, i.e., if executed alone on a consistent database, it transforms it into another consistent state. Example: r1[x] r1[y] w1[x] w1[y] is an example of a transaction t1 that transfers some amount of money from account x to account y.

5. 5 Preliminaries Transaction execution must be atomic: no interference among transactions. Either all its operations are executed or none. Each transaction ends with a commit or an abort operation. A concurrency control protocol ensures that concurrent transactions do not interfere with each other. A recovery protocol ensures the all or nothing property.

6. 6 Preliminaries 2 operations conflict if the order of execution is important, i.e. if one of them is a write. Given a set of transactions T, a history H over T is a partial order over all transaction operations and the order reflects the operation execution order (transaction order and conflicting operations order).

7. 7 Example of a history T1: r1[x] w1[x] c1 T2: r2[x] w2[y] w2[x] c2 T3: r3[y] w3[x] w3[y] w3[z] c3 r1[x] w1[x] c1 r3[y] w3[x] w3[y] w3[z] c3 r2[x] w2[y] w2[x] c2

8. 8 Correctness A history is serial if for every 2 transactions, either all operations of one appear before the other or vice-versa. Since every transaction is correct, a serial history must be correct, and if executed on a consistent database, will result in a consistent database. But we want to allow concurrent transactions�

9. 9 Example of concurrent execution: transfer 100 from account x to y Serial execution Concurrent execution r1[x] returns 200 r1[x] returns 200 w1[x] writes 100 w1[x] writes 100 r1[y] returns 200 r2[x] returns 100 w1[y] writes 300 r1[y] returns 200 commit t1 w1[y] writes 300 r2[x] returns 100 commit t1 r2[y] returns 300 r2[y] returns 300 commit t2 commit t2 BOTH TRANSACTIONS OBSERVE AND WRITE SAME VALUES!

10. 10 Serializability A history is serializable if it is equivalent to a serial history over the same set of transactions. 2 histories are view equivalent of they have the same effects, i.e. same values are written by all transactions. Since we do not know what transactions write, we require that transactions read from the same transactions and final written values are the same.

11. 11 View Equivalence t2 reads-x-from t1 in history H if: r2[x] and w1[x] are in H, w1[x] < r2[x], there is no w3[x] such that w1[x] < w3[x] < r2[x] We augment the history with 2 transactions: Tinit, which writes all values initially Tfinal, which reads all values at the end. 2 histories, H1 and H2 are view equivalent if for all transactions ti and tj: tj reads-x-from ti in H1 iff tj reads-x-from ti in H2.

12. 12 View Serializability A history is view serializable if it is equivalent to a serial history. PROBLEM: recognizing view serializable histories is NP-Complete! Basic intuition: w1[x] r2[x] w3[x]

13. 13 Conflict Serializability Recall: 2 operations conflict if one of them is a write operation. Two histories, H1 and H2, are conflict equivalent if the order of conflicting operations is the same in both histories, i.e., if o1 in t1 and o2 in t2 conflict, then o1 < o2 in H1 iff o1 < o2 in H2. H is conflict serializable if it is conflict equivalent to a serial history.

14. 14 Serialization Graphs How do we prove a history H is (conflict) serializable? Serialization Graph SG(H): nodes are transactions, t1 -> t2 if o1 in t1 and o2 in t2 conflict and o1 < o2 in H H: w1[x]w1[y]c1r2[x]r3[y]w2[x]c2w3[y]c3 t1 t2 t3 Serializability Theorem: A history H is serializable if and only if (iff) SG(H) is acyclic. A concurrency control protocol ensures serializability.

15. 15 Recoverable Histories When a transaction aborts, the system must wipe out all its effects: on data: use before images on transactions: cascading aborts. Consider: w1[x,2] r2[x] w2[y,3] c2 a1 What do we do? Semantic dilemma! Solution: Only allow recoverable histories. A history is recoverable if whenever tj reads-x-from ti, ci < cj.

16. 16 Cascading Aborts w1[x,2] r2[x] w2[y,3] a1 Cascading aborts: abort t2 A history avoids cascading aborts (ACA) if whenever ti reads-x-from tj, cj < ri[x], i.e.,a transaction only reads committed values.

17. 17 Strict Executions w1[x,2] w2[x,3] a1 For recovery we want to use before images. What value to restore? What if t2 aborts? Solution: read and write committed data only. A history is strict if whenever wj[x] < oi[x] either aj<oi[x] or cj<oi[x], where oi is read or write, i.e., no object is read or written until the transaction that last wrote its value terminated (commit or abort).

18. 18 Relationships between Histories

19. 19 Two Phase Locking Locking: Associate with each object a lock. A transaction must acquire a lock before executing an operation on the object, and releases it after execution. 2 types of locks: read locks and write locks. Read write read X read and write conflict write X X write and write conflict Two Phase rule: Once a transaction has released a lock it cannot acquire any locks.

20. 20 Two Phase Locking Theorem: Every 2PL history H is serializable. In SG(H), if ti tj, then for some object x, there are conflicting ops pi[x] and qj[x] such that unlock of pi precedes the lock of qj. In SG(H), if there is a path: t1 t2 �. tn then there are operations p1[x] and qn[y] such that unlock of p1 precedes the lock of qn. Finally, if there is a cycle t1 t2 �. tn t1, then there are operations p1[x] and q1[y] such that unlock of p1 precedes the lock of q1, a contradiction.

21. 21 Two Phase Locking Strict two phase locking: Release all locks together at termination. This ensures that all executions are strict. Most commercial implementations are strict 2PL. Deadlock: Two transactions t1 and t2. t1 acquires read lock on x t2 acquires read lock on y now t1 cannot acquire write lock on y and t2 cannot acquire write lock on x.

22. 22 Locking Performance In a multiprogramming system, resource contention arises over memory, processors, I/O channels, etc. In a locking system, data contention arises due to queues , which form due to conflicting operations. Locking can cause thrashing: as number of transactions increases, throughput increases up to a point and then decreases. In DB, thrashing can be caused by data contention alone, ie even with infinite resources. Thruput trans/sec transactions

23. 23 Tree Locking [KS 80] Objects are structured as nodes in a tree. For simplicity, assume one type of operations, access. 2 access operations conflict. Lock coupling rule: if x is not the root, transaction t can acquire a lock on x only if t is holding a lock on the parent of x. Once t releases a lock, it may not subsequently obtain that lock again.

24. 24 Ordered Sharing [AE90] w1[x] r2[x] w3[y] w1[y] r2[y] (t1 t2 t3) Not allowed by 2PL, although serializable. 2PL solve the conflict problem by blocking. IDEA: Instead of blocking allow sharing but restrict lock releases. Introduce a third mode of locking: Ordered sharing Sharing Exclusive

25. 25 Ordered Sharing Lock Acquisition Rule:If ti acquires a lock pl1[x] before t2 acquires ql2[x] then p1 [x] is executed before q2[x]. Two Phase locking rule. A transaction t2 is waiting for transaction t1 if t2 has acquired a lock after t1 and t1 has not released any of its locks. Lock Relinquish Rule: t2 cannot release any of its locks if it is waiting for some transaction t1. Some transactions may have to delay their commit. If all locks are ordered sharing, the protocol accepts all (order-preserving) conflict serializable executions, i.e., the order of non-interleaving transactions is preserved.

26. 26 Timestamp Ordering Associate with each transaction a timestamp. The CC protocol orders conflicting operations according to timestamp order. TO rule: If pi[x] and qj[x] are conflicting operations, then pi is executed before qj if time(ti) < time(tj). Every object maintains: max_read and max_write. Read: if time(ti) < max_write, then reject read Write: if time(ti) < max_read or time(ti) < max_write, then reject write. Update max_read and max_write appropriately. Thomas Write Rule: if time(ti) < max_write then ignore write operation.

27. 27 Serialization Graph Testing Maintain a serialization graph as part of CC protocol. When SGT receives an operation, it adds appropriate nodes and edges to graph. If no cycle, accept operation, else reject. Garbage collection: when do we delete a transaction? When a transaction commits? r1[x] w2[x] w2[y] c2 r3[y] w3[z] c3��w1[z] Safe rule: delete a transaction t from SG if t committed and is a source in SG.

28. 28 Certification-based CC Locking may block resources for long periods. Simple Certification Approach: Immediately execute all operations of t1. At commit, check if any active transaction has executed a conflicting operation, if so, abort t1. Proof Idea: if t1 t2 then t1 certified before t2. Several variations, e.g., SGT certification, most famous is optimistic concurrency control protocol by Kung and Robinson.

29. 29 Optimistic Concurrency Control [KR81] Transactions execute in 3 phases: Read phase: unrestricted reading of any object Validation phase: ensure that no conflicts occurred. Write phase: after successful validation, write values in db. Validation of transaction t1: Check all concurrent transactions t2, i.e., the write phase of t2 overlaps with read phase of t1: if readset (t1) overlaps with writeset (t2) then abort t1. Further optimizations have been explored.

30. 30 Centralized Recovery We need to recover disk failures during transaction execution so as to ensure the all or nothing property. 3 Approaches: Shadow paging: 2 copies of database. Before images: store on disk log of before values and update database immediately. If failure occurs and transaction has not committed restore db based on log. After images: Perform updates in a log of after images. If transaction commits, install values in db from log.

31. 31 Distributed Recovery DBs reside on sites in a distributed system. Communication between sites by messages only. Each transaction has a home site or coordinator, and a number of participants. Goal: Either all sites commit or all abort. When a transaction wants to commit, it must be sure that all sites agree to commit too.

32. 32 Atomic Commitment At commit time, the coordinator requests votes from all participants. Atomic commitment requires: All processes reach same decision Commit only if all processes vote Yes. If there are no failures and all processes vote Yes, decision will be commit.

33. 33 Two Phase Commit (2PC) Coordinator send vote-request Collect votes. If all Yes, then Commit, else Abort. Send decision Participant receive vote-request send Yes or No receive decision

34. 34 Failures and Blocking What does a process do if it does not receive a message it is expecting? I.e., on timeout? 3 cases: participant waiting for vote-request abort coordinator waiting for vote abort participant waiting for decision uncertain Note: coordinator never uncertain

35. 35 Termination Protocol Can participant find help from other participants? Send to all participants: ``Help! What is decision?�� if any participant has committed or aborted send commit or abort decision. If a participant has not yet voted abort and send abort decision. If all participants voted Yes all live participants uncertain Transaction BLOCKED!

36. 36 Blocking of 2PC 2PC is a blocking protocol. Basic intuition: When a participant is in wait (uncertain) state, some other participants may be in commit and others in abort states. Solution: Introduce a buffer state so that if any operational site is uncertain, no process can have decided to Commit [Skeen 82]. 3 Phase commit protocol only assumes site failures.

37. 37 Three Phase Commit (3PC) Coordinator send vote-request Collect votes. If all Yes, then send Pre-Commit, else send Abort. Collect all Acks, and send Commit Participant receive vote-request send Yes or No if receive abort, then Abort, else, send Ack If recive commit, then Commit.

38. 38 Failure handling in 3PC 5 cases: participant waiting for vote-request abort coordinator waiting for vote abort coordinator waiting for Ack commit participant waiting for decision elect new leader participant waiting for commit elect new leader Note: In (5) a participant may still be waiting for decision.

39. 39 Termination for 3PC Leader sends to all participants requesting state. if any participant has committed or aborted send commit or abort decision. If all participants are uncertain abort and send abort decision. If some participant has pre-committed leader sends Pre-commit to all and wait for acks send commit

40. 40 Commit Protocols Summary 2 PC blocks with failures 3PC is non-blocking with site failures only. 3PC blocks with partitioning failures. Partition 1 Partition 2 Theorem [Skeen82]: There is no non-blocking atomic commitment protocol in the presence of partitioning failures.

41. 41 Conclusion Concurrency Control: Serializability Theory 2 phase locking Other locking and non-locking protocols Recovery Centralized. Distributed: 2PC and 3PC

transaction management in database systems

transaction management in database systems

Presentation Transcript

Database Transactions and Transaction Management

F28DM Database Management Systems Transaction Management

Massively Distributed Database Systems - Transaction management

Database Management Systems

Database Management Systems

Fundamentals of Database Systems Transaction Management

Database Management Systems

Database Management Systems

F28DM Database Management Systems Transaction Management

DATABASE Management systems

Database Management Systems

Database Management Systems

Database Management Systems

Database Systems: Transaction Management

UNIT: User-ceNtrIc Transaction Management in Web-Database Systems

DATABASE TRANSACTION

Database Management Systems

Fundamentals of Database Systems Transaction Management

Database Management Systems

Enterprise Database Systems Transaction Processing

Database Management Systems

DATABASE MANAGEMENT SYSTEMS