480 likes | 609 Vues
This guide delves into the intricacies of concurrency control within database systems, emphasizing its crucial role in ensuring that operations from multiple users do not interfere with each other. We cover essential concepts such as atomic transactions, resource locking, deadlock prevention, and the trade-offs between optimistic and pessimistic locking strategies. Additionally, we highlight the importance of ACID properties for ensuring reliable transaction management. This resource is valuable for database practitioners and students alike, helping them to grasp the fundamentals of effective concurrency control.
E N D
Foundations of Database Systems Concurrency Controls Instructor: Zhijun Wang Database Systems
Announcement • Quiz #2 will be given in the last hour today • Project Demo has been scheduled on May 8, 20:00-22:00pm at M402. You need to submit your project report at your Demo time. Database Systems
Concurrency Control • Concurrency control ensures that one user’s work does not inappropriately influence another user’s work • No single concurrency control technique is ideal for all circumstances • Trade-offs need to be made between level of protection and throughput Database Systems
Atomic Transactions • A transaction, or logical unit of work (LUW), is a series of actions taken against the database that occurs as an atomic unit • Either all actions in a transaction occur or none of them do Database Systems
Errors Introduced Without Atomic Transaction Database Systems
Errors Prevented With Atomic Transaction Database Systems
Concurrent Transaction • Concurrent transactions refer to two or more transactions that appear to users as they are being processed against a database at the same time • In reality, CPU can execute only one instruction at a time • Transactions are interleaved meaning that the operating system quickly switches CPU services among tasks so that some portion of each of them is carried out in a given interval • Concurrency problems: lost update and inconsistent reads Database Systems
Concurrent Transaction Processing Database Systems
Lost-Update Problem Database Systems
Resource Locking • Resource locking prevents multiple applications from obtaining copies of the same record when the record is about to be changed Database Systems
Lock Terminology • Implicit locks are locks placed by the DBMS • Explicit locks are issued by the application program • Lock granularity refers to size of a locked resource • Rows, page, table, and database level • Large granularity is easy to manage but frequently causes conflicts • Types of lock • An exclusive lock prohibits other users from reading the locked resource • A shared lock allows other users to read the locked resource, but they cannot update it Database Systems
Locking Granularity Database Systems
Concurrent Processing with Explicit Locks Database Systems
Serializable Transactions • Serializable transactions refer to two transactions that run concurrently and generate results that are consistent with the results that would have occurred if they had run separately • Two-phased locking is one of the techniques used to achieve serializability Database Systems
Two-phased Locking • Two-phased locking • Transactions are allowed to obtain locks as necessary (growing phase) • Once the first lock is released (shrinking phase), no other lock can be obtained • A special case of two-phased locking • Locks are obtained throughout the transaction • No lock is released until the COMMIT or ROLLBACK command is issued • This strategy is more restrictive but easier to implement than two-phased locking Database Systems
2PL Implementation Database Systems
Deadlock • Deadlock, or the deadly embrace, occurs when two transactions are each waiting on a resource that the other transaction holds • Preventing deadlock • Allow users to issue all lock requests at one time • Require all application programs to lock resources in the same order • Breaking deadlock • Almost every DBMS has algorithms for detecting deadlock • When deadlock occurs, DBMS aborts one of the transactions and rollbacks partially completed work Database Systems
Deadlock Database Systems
Optimistic versus Pessimistic Locking • Optimistic locking assumes that no transaction conflict will occur: • DBMS processes a transaction; checks whether conflict occurred: • If not, the transaction is finished • If so, the transaction is repeated until there is no conflict • Pessimistic locking assumes that conflict will occur: • Locks are issued before a transaction is processed, and then the locks are released • Optimistic locking is preferred for the Internet and for many intranet applications Database Systems
Optimistic Locking Database Systems
Pessimistic Locking Database Systems
Declaring Lock Characteristics • Most application programs do not explicitly declare locks due to its complication • Instead, they mark transaction boundaries and declare locking behavior they want the DBMS to use • Transaction boundary markers: BEGIN, COMMIT, and ROLLBACK TRANSACTION • Advantage • If the locking behavior needs to be changed, only the lock declaration need be changed, not the application program Database Systems
Marking Transaction Boundaries Database Systems
ACID Transactions • Acronym ACID transaction is one that is Atomic, Consistent, Isolated, and Durable • Atomic means either all or none of the database actions occur • Durable means database committed changes are permanent Database Systems
ACID Transactions • Consistency means either statement level or transaction level consistency • Statement level consistency: each statement independently processes rows consistently • Transaction level consistency: all rows impacted by either of the SQL statements are protected from changes during the entire transaction • With transaction level consistency, a transaction may not see its own changes Database Systems
ACID Transactions • Isolation means application programmers are able to declare the type of isolation level and to have the DBMS manage locks so as to achieve that level of isolation • SQL-92 defines four transaction isolation levels: • Read uncommitted • Read committed • Repeatable read • Serializable Database Systems
Database Recovery • In the event of system failure, that database must be restored to a usable state as soon as possible • Two recovery techniques: • Recovery via reprocessing • Recovery via rollback/rollforward Database Systems
Recovery via Reprocessing • Recovery via reprocessing: the database goes back to a known point (database save) and reprocesses the workload from there • Unfeasible strategy because • The recovered system may never catch up if the computer is heavily scheduled • Asynchronous events, although concurrent transactions, may cause different results Database Systems
Rollback/Rollforward • Recovery via rollback/rollforward: • Periodically save the database and keep a database change log since the save • Database log contains records of the data changes in chronological order • When there is a failure, either rollback or rollforward is applied • Rollback: undo the erroneous changes made to the database and reprocess valid transactions • Rollforward: restored database using saved data and valid transactions since the last save Database Systems
Rollback Before-image: a copy of every database record (or page) before it was changed. Database Systems
Rollforward After-image: a copy of every database record (or page) after it was changed Database Systems
Checkpoint • A checkpoint is a point of synchronization between the database and the transaction log • DBMS refuses new requests, finishes processing outstanding requests, and writes its buffers to disk • The DBMS waits until the writing is successfully completed the log and the database are synchronized • Checkpoints speed up database recovery process • Database can be recovered using after-images since the last checkpoint • Checkpoint can be done several times per hour • Most DBMS products automatically checkpoint themselves Database Systems
Transaction Log Database Systems
Database Recovery:A Processing Problem Occurs Database Systems
Database Recovery: Recovery Processing Database Systems
Schedules • Schedule: An interleaving of actions from a set of transactions, where the actions of any transaction are in the original order. • Represents some actual sequence of database actions. • Example: R1(A), W1(A), R2(B), W2(B), R1(C), W1(C) • In a complete schedule, each transaction ends in commit or abort. • Initial State + Schedule Final State Database Systems
Acceptable Schedules • One sensible “isolated, consistent” schedule: • Run transactions one at a time, in a series. • This is called a serial schedule. • NOTE: Different serial schedules can have different final states; all are “OK” • DBMS makes no guarantees about the order in which concurrently submitted transactions are executed. • Serializable schedules: • Final state is what some serial schedule would have produced. • Aborted transactions are not part of schedule • they are made to ‘disappear’ by using logging. Database Systems
Serializability Violations • Two actions conflict when 2 transactions access the same item: • W-R conflict: T2 reads something T1 wrote. • R-W and W-W conflicts: Similar. • WR conflict (dirty read): • Result is not equal to any serial execution! transfer $100 from A to B add 6% interest to A & B Database is inconsistent! Database Systems
More Conflicts • RW Conflicts (Unrepeatable Read) • T2 overwrites what T1 read. • If T1 reads it again, it will see something new! • Example when this would happen? • The increment(T1)/decrement(T2) example. • Again, not equivalent to a serial execution. • WW Conflicts (Overwriting Uncommited Data) • T2 overwrites what T1 wrote. • Example: 2 transactions (T1 to increase, T2 to decrease) to update 2 items to be kept equal. • Usually occurs in conjunction w/other anomalies. • Unless you have “blind writes”. Database Systems
RW conflict A = 5 T1: A = A + 1, T2: A = A – 1 T1 T2T1 T2T1 T2 R(A) R(A) R(A) W(A) W(A) R(A) R(A) R(A) W(A) W(A) W(A) W(A) Database Systems
WW Conflict T1: H=1000, L=1000 T2: H=2000, L=2000 T1 T2 W(H) W(L) W(L) W(H) Database Systems
Aborted Transactions • Serializable schedule: A schedule which is equivalent to a serial schedule of committed transactions. • as if aborted transactions never happened. • Two Issues: • How does one undo the effects of an transaction? • We’ll cover this in logging/recovery • What if another transaction sees these effects?? • Must undo that transaction as well! Database Systems
Cascading Aborts • Abort of T1 requires abort of T2! • Cascading Abort • What about WW conflicts & aborts? • T2 overwrites a value that T1 writes. • T1 aborts: its “remembered” values are restored. • Lose T2’s write! We will see how to solve this, too. • An ACA (avoids cascading abort)schedule is one in which cascading abort cannot arise. • A transaction only reads/writes data from committed transactions. Database Systems
Recoverable Schedules • Abort of T1 requires abort of T2! • But T2 has already committed! • A recoverable schedule is one in which this cannot happen. • i.e. a transaction commits only after all the transactions it “depends on” (i.e. it reads from or overwrites) commit. • Recoverable implies ACA (but not vice-versa!). • Real systems typically ensure that only recoverable schedules arise (through locking). Database Systems
Precedence Graph T1 T2 • A Precedence (or Serializability) graph: • Node for each committed transaction. • Arc from Ti to Tj if an action of Ti precedes and conflicts with an action of Tj. • T1 transfers $100 from A to B, T2 adds 6% to both • R1(A), W1(A), R2(A), W2(A), R2(B), W2(B), R1(B), W1(B) Database Systems
Conflict Serializability • 2 schedules are conflict equivalentif: • they have the same sets of actions, and • each pair of conflicting actions is ordered in the same way. • (they have the same effect on a DB) • A schedule is conflict serializableif it is conflict equivalent to a serial schedule. • It is serializable if the set of items in the DB does not grow or shrink • Note: Some serializable schedules are not conflict serializable! Database Systems
Conflict Serializability & Graphs • Theorem: A schedule is conflict serializable iff its precedence graph is acyclic. • Theorem: 2PLensures that the precedence graph will be acyclic! • Strict 2PLimproves on this by avoiding cascading aborts, problems with undoing WW conflicts; i.e., ensuring recoverable schedules. Database Systems
Summary • Concurrency control key to a DBMS. • More than just mutexes! • Transactions and the ACID properties: • C & I are handled by concurrency control. • A & D coming soon with logging & recovery. • Conflicts arise when two transactions access the same object, and one of the transactions is modifying it. • Serial execution is our model of correctness. Database Systems