C-Store: Concurrency Control and Recovery Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY Jun. 5, 2009
Concurrency Control vs. Recovery • Concurrency Control • Provides correct control of concurrently running transactions to maximize system throughput, • i.e., the average number of transactions completed in a given time. • Recovery • Ensures that the database is fault tolerant and is not corrupted by software, system, or media failure.
Concurrency Control in C-Store • Uses strict two-phase locking to control concurrently running read-write transactions. • Each node (a site in the shared-nothing system architecture) sets locks on the data objects that the runtime system reads or writes. • Resolves deadlocks via timeouts, • aborting one of the deadlocked transactions. • Does not use strict two-phase distributed commit, • avoiding the PREPARE phase.
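Timeout-based deadlock resolution can be sketched as follows. This is a minimal illustration built on Python's `threading.Lock`, not C-Store's lock manager; the names `lock_with_timeout` and `TransactionAborted` are hypothetical. A transaction that waits longer than the timeout presumes a deadlock, releases what it holds, and aborts.

```python
import threading


class TransactionAborted(Exception):
    """Raised when a lock wait times out and the transaction gives up."""


def lock_with_timeout(lock, held, timeout_s):
    """Acquire `lock` within `timeout_s` seconds or abort the transaction.

    `held` is the list of locks this transaction already owns. On timeout,
    every held lock is released so that other transactions can proceed.
    """
    if lock.acquire(timeout=timeout_s):
        held.append(lock)
        return
    # Timed out: presume a deadlock and abort (assumed policy for this sketch).
    for owned in reversed(held):
        owned.release()
    held.clear()
    raise TransactionAborted("lock wait timed out")
```

A timeout cannot distinguish a real deadlock from a long wait, so some aborts are unnecessary; the benefit is that no wait-for graph has to be maintained across nodes.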
Strict Two-Phase Locking (Strict 2PL) • It is the most widely used locking protocol. • Two rules (1) If a transaction T wants to read (respectively, modify) a database object, it first requests a shared (respectively, exclusive) lock on the object. (2) All locks held by a transaction are released when the transaction is completed.
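The two rules can be sketched with a small in-memory lock table. This is an illustrative model only: `LockTable`, `LockConflict`, and the method names are hypothetical, and a conflicting request raises an error instead of blocking.

```python
from collections import defaultdict


class LockConflict(Exception):
    """A requested lock conflicts with a lock held by another transaction."""


class LockTable:
    def __init__(self):
        self.shared = defaultdict(set)  # object -> transactions holding S locks
        self.exclusive = {}             # object -> transaction holding the X lock

    def acquire_shared(self, txn, obj):
        # Rule 1 (read): a shared lock is incompatible with another txn's X lock.
        owner = self.exclusive.get(obj)
        if owner is not None and owner != txn:
            raise LockConflict(f"{obj} is exclusively locked by {owner}")
        self.shared[obj].add(txn)

    def acquire_exclusive(self, txn, obj):
        # Rule 1 (modify): an exclusive lock is incompatible with any other lock.
        owner = self.exclusive.get(obj)
        if owner is not None and owner != txn:
            raise LockConflict(f"{obj} is exclusively locked by {owner}")
        if self.shared[obj] - {txn}:
            raise LockConflict(f"{obj} is share-locked by another transaction")
        self.exclusive[obj] = txn

    def release_all(self, txn):
        # Rule 2: locks are released only here, when the transaction completes.
        for holders in self.shared.values():
            holders.discard(txn)
        for obj in [o for o, t in self.exclusive.items() if t == txn]:
            del self.exclusive[obj]
```

Because there is no method for releasing a single lock early, every transaction using this table holds all of its locks until it completes, which is what makes the protocol strict.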
Distributed COMMIT Processing in C-Store (1): Master and Worker • Each transaction T has a master that is responsible for • assigning T's sub-transactions to appropriate nodes (workers), • and determining the ultimate commit state of T.
Distributed COMMIT Processing in C-Store (2): The Protocol • 1st Phase • When the master receives a COMMIT statement for the transaction T, it waits until all workers have completed all outstanding actions, • and then issues a commit (or abort) message to each worker. • 2nd Phase • Once a worker has received a commit message, it can release all locks related to the transaction T • and delete the UNDO log for T. • T is completed, and hence has no need for UNDO in recovery.
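The message flow above can be sketched in a single process. This is a toy model under stated assumptions: `Worker`, `finish_transaction`, and `drain` are hypothetical names, messages are plain method calls, `drain` stands in for waiting on outstanding actions, and `None` is assumed never to be a stored value.

```python
class Worker:
    def __init__(self):
        self.data = {}      # key -> current value
        self.locks = {}     # txn -> set of locked keys
        self.undo_log = {}  # txn -> list of (key, old_value) records

    def write(self, txn, key, value):
        self.locks.setdefault(txn, set()).add(key)
        self.undo_log.setdefault(txn, []).append((key, self.data.get(key)))
        self.data[key] = value

    def drain(self, txn):
        """Stand-in for finishing all outstanding actions of `txn`."""

    def on_commit(self, txn):
        # 2nd phase: release the locks and delete the UNDO log for txn.
        self.locks.pop(txn, None)
        self.undo_log.pop(txn, None)

    def on_abort(self, txn):
        # Roll back using the UNDO records, newest first, then release locks.
        for key, old in reversed(self.undo_log.pop(txn, [])):
            if old is None:
                self.data.pop(key, None)
            else:
                self.data[key] = old
        self.locks.pop(txn, None)


def finish_transaction(txn, workers, commit=True):
    """Master side: wait for the workers, then send the outcome.

    There is no PREPARE round: workers are never asked to vote.
    """
    for w in workers:
        w.drain(txn)
    for w in workers:
        if commit:
            w.on_commit(txn)
        else:
            w.on_abort(txn)
    return "commit" if commit else "abort"
```

Compared with classic two-phase commit, the master sends one round of messages instead of two, at the cost of relying on other replicas when a worker crashes after being told to commit.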
Distributed COMMIT Processing in C-Store (3): The Implications • In C-Store, the master does not PREPARE the workers. • So a worker that the master has told to commit may crash before writing any updates or log records related to the transaction to stable storage. • The failed worker will recover its state from other projections on other nodes during recovery.
Overview of Recovery in C-Store • Uses a standard write-ahead logging protocol for recovery. • Uses a STEAL, NO-FORCE policy for writing database objects, • which can require both UNDO and REDO. • Only logs UNDO records. • Performs REDO by executing updates that have been queued on other nodes.
Write-Ahead Logging (WAL) Property • The Protocol • Each write must be recorded in the log (on disk) before the corresponding change is reflected in the database itself. • To ensure this protocol, the DBMS must be able to selectively force a page in memory to disk, • i.e., the log page containing information on the write.
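The WAL rule can be sketched with a toy buffer manager in which "disk" is a pair of Python containers. All names here (`WalBuffer`, `flush_log`, `flush_page`) are hypothetical; the point is only the ordering: the log is forced up to a page's last LSN before the page itself is written.

```python
class WalBuffer:
    def __init__(self):
        self.log_tail = []      # log records still in memory
        self.stable_log = []    # log records on "disk"
        self.buffer_pages = {}  # page_id -> (value, lsn of last update)
        self.stable_pages = {}  # page_id -> value on "disk"
        self.next_lsn = 0

    def write(self, page_id, new_value):
        """Update a page in the buffer and append a log record for it."""
        if page_id in self.buffer_pages:
            old_value = self.buffer_pages[page_id][0]
        else:
            old_value = self.stable_pages.get(page_id)
        lsn = self.next_lsn
        self.next_lsn += 1
        self.log_tail.append((lsn, page_id, old_value, new_value))
        self.buffer_pages[page_id] = (new_value, lsn)
        return lsn

    def flush_log(self, up_to_lsn):
        """Selectively force log records with LSN <= up_to_lsn to disk."""
        while self.log_tail and self.log_tail[0][0] <= up_to_lsn:
            self.stable_log.append(self.log_tail.pop(0))

    def flush_page(self, page_id):
        """Write a data page to disk, forcing its log records first (WAL)."""
        value, lsn = self.buffer_pages[page_id]
        self.flush_log(lsn)
        self.stable_pages[page_id] = value
```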
Contents of an Update Log Record • <prevLSN, transID, type, pageID, length, offset, before-image, after-image> • The first 3 fields are common to all log records. • The other fields are for updates.
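As an illustration, the record layout maps naturally onto a small data class. The field names follow the slide; the `redo` and `undo` helpers and the use of a `bytearray` as the page are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UpdateLogRecord:
    # Fields common to all log records.
    prev_lsn: Optional[int]  # previous log record of the same transaction
    trans_id: str
    type: str                # "update" for this kind of record
    # Fields specific to update records.
    page_id: str
    length: int              # number of bytes changed
    offset: int              # position of the change within the page
    before_image: bytes      # old bytes, used for UNDO
    after_image: bytes       # new bytes, used for REDO

    def redo(self, page: bytearray) -> None:
        """Reapply the change by writing the after-image into the page."""
        page[self.offset:self.offset + self.length] = self.after_image

    def undo(self, page: bytearray) -> None:
        """Reverse the change by writing the before-image into the page."""
        page[self.offset:self.offset + self.length] = self.before_image
```

A system that logs only UNDO information, as C-Store does, would need the before-image but not the after-image.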
STEAL / NO-FORCE • STEAL • Allows an updated page P of an uncommitted transaction T to be swapped from memory to disk. • T can abort later, so the DBMS must remember the old value of P to support UNDO. • NO-FORCE • When a transaction T commits, pages in the buffer that are modified by T are not forced to disk. • The system can crash before all the pages are written to disk, so the DBMS must remember the updates of T to support REDO.
The Recovery Algorithm ARIES: Three Phases • Analysis: Identifies dirty pages in the buffer (i.e., changes that have not been written to disk) and the transactions that were active at the time of the crash. • REDO: Repeats all actions and restores the database state to what it was at the time of the crash. • UNDO: Undoes the actions of transactions that did not commit.
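The three phases can be sketched over a toy log. This is a heavily simplified model of textbook ARIES, not C-Store's recovery: there are no LSNs, checkpoints, dirty page table, or compensation records, each page holds a single value, and the function name `recover` and the record layout are assumptions.

```python
def recover(log, disk_pages):
    """Return (recovered pages, set of transactions that were undone).

    `log` is a list of dicts in log order. Update records look like
    {"txn", "type": "update", "page", "before", "after"}; commit records
    look like {"txn", "type": "commit"}.
    """
    pages = dict(disk_pages)

    # Analysis: find transactions with updates but no commit record.
    losers = set()
    for rec in log:
        if rec["type"] == "update":
            losers.add(rec["txn"])
        elif rec["type"] == "commit":
            losers.discard(rec["txn"])

    # REDO: repeat history by reapplying every update in log order.
    for rec in log:
        if rec["type"] == "update":
            pages[rec["page"]] = rec["after"]

    # UNDO: walk the log backwards, reversing the updates of the losers.
    for rec in reversed(log):
        if rec["type"] == "update" and rec["txn"] in losers:
            pages[rec["page"]] = rec["before"]

    return pages, losers
```

For example, if T1 committed but its page was never forced (NO-FORCE) and T2's uncommitted page was stolen to disk (STEAL), REDO restores T1's update and UNDO removes T2's.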
Recovery in C-Store • Basic idea • A crashed node recovers by running a query (copying state) from other projections. • K-Safety • Sufficient projections and join indexes are maintained, • So that K nodes can fail within time t, the time to recover, • And the system will be able to maintain transactional consistency. • Three cases to consider.
Recovery: Case 1 • If the failed node suffered no data loss, • No dirty pages are found for aborted transactions. • Then we can restore it by executing updates that will be queued for it elsewhere in the system, • assuming those updates are successfully saved on other nodes and can be identified by conditions on timestamp, transaction ID, etc. • Pages of committed transactions were not written to disk, so we simply need REDO.
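Case 1 amounts to replaying the queued updates that the failed node has not yet applied. The sketch below assumes each queued update carries a timestamp and that the node knows the timestamp of the last update it applied; the function name, the dictionary layout, and the key-value state are all hypothetical.

```python
def replay_queued_updates(state, queued, last_applied_ts):
    """Apply queued updates newer than `last_applied_ts`, in timestamp order.

    `state` maps keys to values on the recovering node. `queued` is a list of
    {"ts", "txn", "key", "value"} records saved elsewhere in the system.
    Returns the number of updates applied.
    """
    pending = sorted(
        (u for u in queued if u["ts"] > last_applied_ts),
        key=lambda u: u["ts"],
    )
    for update in pending:
        state[update["key"]] = update["value"]
    return len(pending)
```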
Recovery: Case 2 • If both the RS and WS are destroyed in the failed node, • Then we have to reconstruct both segments from other projections and join indexes in the system. • First, restore the segments by exploiting Insertion Vectors and Deleted Record Vectors from other nodes. • Second, run the queued updates as in Case 1.
Recovery: Case 3 • If WS is damaged but RS is good in the failed node, • Then we can reconstruct the WS from other corresponding WS segments and/or RS segments, • identifying corresponding WS segments by checking their sort-key ranges, • using the sort keys to find storage keys, • and then finding the other tuple columns by following the appropriate join indexes.
Queries for Recovering WS • Note that each WS segment, S, contains only tuples with an insertion timestamp later than some time t_lastmove(S).
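The timestamp condition suggests a filter of the following shape when pulling tuples from another segment. This is only a sketch under assumptions: the tuple layout, the half-open sort-key range, and the function name are hypothetical, and the actual recovery queries are not reproduced here.

```python
def tuples_for_ws_recovery(source_tuples, t_lastmove, key_lo, key_hi):
    """Select tuples that belong in a damaged WS segment S.

    Keeps tuples inserted after `t_lastmove` (older ones are assumed to
    live in RS already) whose sort key lies in S's range [key_lo, key_hi).
    """
    return [
        t for t in source_tuples
        if t["insertion_ts"] > t_lastmove and key_lo <= t["sort_key"] < key_hi
    ]
```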
References • Mike Stonebraker, Daniel Abadi, Adam Batkin, Xuedong Chen, Mitch Cherniack, Miguel Ferreira, Edmond Lau, Amerson Lin, Sam Madden, Elizabeth O'Neil, Pat O'Neil, Alex Rasin, Nga Tran, and Stan Zdonik. C-Store: A Column-oriented DBMS. VLDB, pages 553-564, 2005. • Raghu Ramakrishnan and Johannes Gehrke. Database Management Systems (3rd edition).