1 / 50

CS 440 Database Management Systems

CS 440 Database Management Systems. Transaction Management - Recovery. What we have discussed so far:. RDBMS Implementation. Use indexing to speed up queries Which type of index to use Which attributes to index … Storage Which block size to choose

michi
Télécharger la présentation

CS 440 Database Management Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CS 440 Database Management Systems Transaction Management - Recovery

  2. What we have discussed so far:

  3. RDBMS Implementation • Use indexing to speed up queries • Which type of index to use • Which attributes to index • … • Storage • Which block size to choose • Clustered versus unclustered relations • …

  4. RDBMS Implementation • Query execution • Algorithms to implement our operators • Analyze their costs • … • Query optimization • Find the best plan to execute the queries • Check if your RDBMS picks the best plan • Change RDBMS setting to get the best time • Build hash-index if RDBMS uses hash-based join

  5. RDBMS Implementation • You can start tuning your RDBMS • or writing data management programs • A rare skill! • We assumed the input is a single query • Programs use more than one query to do the job • What is different?

  6. Example: Balance Transfer • Transfer $200 from account 170 to account 103 • Three SQL queries: • Check if Account 170 has at least $200. • Reduce the balance of 170 by $200. • Increase the balance of 103 by $200. • There is a crash after step 2. • Power loss, hardware problem, software bug in OS,… • Account 170 loses $200.

  7. Balance Transfer Example: Lessons • Check if Account 170 has at least $200. • Reduce the balance of 170 by $200. • Increase the balance of 103 by $200. • Queries in step 2 and 3 must be executed together. • All or none. • Atomicity.

  8. What should we care? • Failures do not happen that often! • Ostrich method: let them happen • This makes the database inconsistent • Deposit and withdrawal do not sum up • A DB that is 1% inconsistent is 100% useless. • Nobody can rely on the data • Prohibitively expensive to find the inconsistent accounts

  9. RDBMS Implementation • We assumed that there is only one user • Multiple users concurrently read/write the data. • What is different?

  10. Example 6.40 Flight(fltNo, fltDate, seatNo, seatStatus) • Check the available seats SELECT seatNo FROM Flight WHERE fltNo=123 AND fltDate=DATE’08-1-1’ AND seatStatus = ‘available’; 2. Book the available seat UPDATE Flight SET seatStatus = ‘occupied’ WHERE fltNo=123 AND fltDate=DATE‘08-1-1’ AND seatNo = 22A

  11. Example 6.40 • John checks for availability and gets seat 22A • John books seat 22A • Mary checks for availability and gets seat 22B • Mary books seat 22B

  12. Example 6.40 • John checks for availability and gets seat 22A • Mary checks for availability and gets seat 22A • John books seat 22A • Mary books seat 22A Double booking!

  13. Example 6.40: Lessons • Either John’s program must execute first and then Mary’s, or the other way. They cannot run in parallel. • Or they seem to run serially! • They must not interfere with each other • Serializablity

  14. Transaction • An execution of a DB program • Or a coherent fraction of a DB program • A large DB program may contain more than one transaction • Flight: one transaction for booking, one transaction for canceling, …

  15. Transaction: ACID Properties • Atomicity • All or nothing • Consistency • Each transaction maps database form one consistent state to another consistent state. • Isolation • Concurrent transactions must not interfere with each other. • Durability • The result of a committed transaction does no vanish due to failures: power loss, errors, …

  16. Balance Transfer Example • Check if Account 170 has at least $200. • Reduce the balance of 170 by $200. • Increase the balance of 103 by $200. • Power outage in the middle violates Atomicity,Consistency, and durability

  17. Example 6.40 • John checks for availability and gets seat 22A • Mary checks for availability abd gets seat 22A • John books seat 22A • Mary books seat 22A • It violates consistency and isolation.

  18. Transaction START TRANSACTION SQL-statement 1 SQL-statement 2 ... SQL-statement n COMMIT SQL-statement1 to SQL-statement n will be treated as an atomic programming fragment: “All or none”.

  19. Transaction • We can also undo a transaction programmatically: START TRANSACTION SQL-statement 1 SQL-statement 2 ... IF (Some condition) ROLLBACK SQL-statement n COMMIT • ROLLBACK undoes the effect of all executed statements -> aborts the transaction.

  20. Transaction Manager • A component in DBMS that supports transaction processing. • COMMIT, ROLLBACK • It guarantees ACID properties.

  21. Transaction Manager Modules • Recovery • Rolls back the database to consistent state in the case system failures • Guarantees atomicity, durability, and consistency. • Recovery manager. • Concurrency control • Does not allow concurrent transactions to interfere with each other. • Guarantees isolation and consistency. • Concurrency control manager.

  22. Recovery

  23. Types of Failures • Human error • Enter wrong data, wrong logic in transactions • Use integrity constraints • Disk crashes • Redundancy: RAID, Archive on tape, … • System failures • Power outage, (other) hardware errors, software errors,… • Recovery

  24. Database Log • Transactions have some intermediate state • May be in memory buffers • Lost in case of system failures • Log • A file that records every action in the transaction. • Log Manager • Maintains the log of a database.

  25. Transaction Notations • Database is a set of data items • Usually data item = block • Could be record (smaller) or table (larger) • Transaction is a sequence of read/write data items • Three types of storage • Disk • Memory buffers: managed by buffer manager • Transaction’s local memory (variables): managed by transaction manager.

  26. Operations of Transactions • INPUT(A) • Read data item A from disk to memory buffer • READ(A,v) • Read data item A to local variable v • WRITE(A,v) • Write local variable v to data item A in memory buffer • OUTPUT(A) • Write data item A from memory buffer to disk.

  27. Example 17.1 • Salary increase • Data items A and B has the same value in DB • Consistent state • Transaction T • A = A * 2; • B = B * 2; • After T is done A and B must have the same value • A new consistent state

  28. Example 17.1

  29. Example 17.1 • Crash happens before OUTPUT(A) • DB in consistent state. • Crash happens after OUTPUT(B) • DB in a new consistent state. • Crash happens after OUTPUT(A) but before OUTPUT(B), • DB is in an inconsistent state! • Recovery guarantees that the transactions starts with and ends with consistent states. • Atomicity: All or none

  30. More on Log • Contains log records • An append-only file • It is not used to answer queries (very inefficient). It is used only to recover information. • Every transaction has some records in the DB log • After a system failure, TM uses log to • Undo the operations of the uncommitted transactions • Redo all or some operations of the committed transactions.

  31. Undo Logging

  32. Undo Logging • Records every modification on the log before they are written to the disk. • We can undo all modifications, if system crashes in the middle of transaction.

  33. Log Records in an Undo Log • <START T> • Transaction T has started • <COMMIT T> • Transaction T is committed. • <ABORT T> • Transaction T has aborted. TM must undo all operations of T (not covered in this course). • <T,A,v> • Transaction T has updated data item A whose old value was v. One per WRITE operation, none for OUTPUT.

  34. Rules of Undo Logging • Rule 1: • If T modifies data item A, record <T,A,v> must be written to log before OUTPUT(A). • The new information is in the log, before they are written on disk. • Rule 2: • If T commits, record <COMMIT T> must be written to log only after all the modifications of T are written on disk.

  35. Sequence of Operations • According to Rule 1 and Rule 2: • The modifications to data items are recorded in the log. • The updated data items are written to disk. • COMMIT record is written to the log.

  36. Recovery Algorithm • System crash-> perform recovery: • Figure out which transaction is completed and which one is not • <START T>…<COMMIT T> ... : completed • <START T>…<ABORT T> ... : completed • <START T>… : incomplete • Undo all updates done by incomplete transactions.

  37. Recovery Algorithm • Read log from the end • If <COMMIT T> : mark T as completed • If <ABORT T> : mark T as completed • If <T,A,v> If (T is incomplete) write A=v to disk

  38. Example … … … … … <START T5> <T2,A2,v2> • <START T4> • <T5,A5,v5> <T4,A4,v4> <T3,A3,v3> • <COMMIT T4> <T1,A1,v1> • Write v2 to A2 on disk • Write v5 to A5 on disk • Write v3 to A3 on disk • Mark T4 as completed • Write v1 to A1 on disk

  39. How far up in the Log? • The recovery algorithm has to examine all records in the log. • Very inefficient. • We can use check-pointing to limit the number of examined log records.

  40. Crash During Recovery • Undo operations are idempotent • We can repeat them without losing consistency. • In the case crash, just re-start the recovery from the initial state.

  41. Checkpointing • Periodically create check points: • Do not accept any new transaction • Wait for the active transactions to complete • Flush log information to the log file. • Write checkpoint entry (<CKPT>) to log file. • Start accepting new transactions

  42. Example … … • <T16, A16, v16> • <T14, A14, v14> … … … <CKPT> <START T5> <T2,A2,v2> • <START T4> • <T5,A5,v5> <T4,A4,v4> <T3,A3,v3> • <COMMIT T4> <T1,A1,v1> Other transactions (T14, T16, …) are completed. Stop here • Write v2 to A2 on disk • Write v5 to A5 on disk • Write v3 to A3 on disk • Mark T4 as completed • Write v1 to A1 on disk

  43. Any problem in checkpointing? • It makes the database frequently unavailable. • Not acceptable in large systems. • Can we do checkpointing and accept transactions?

  44. NonquiescentCheckpointing • Find active transactions: T1, …, Tn • Write <START CKPT T1, …, Tn> to the log file. • Wait for T1,…,Tn to complete • Other transaction can operate and use the database • Write <END CKPT> to the log file.

  45. NonquiescentCheckpointing … … … … <START CKPT T1, T2> … … … … … … <END CKPT> … … … … … … … Other completed transactions and active transactions: T1, T2 T1, T2, and other active transactions Undo till the <START CKPT> for this <END CKPT> Crash Other active transactions

  46. NonquiescentCheckpointing … … … … <START T1> <START T2> … … <START CKPT T1, T2> … … … … … … … Other completed transactions and active transactions: T1, T2 Undo till the earliest record of T1 and T2 Crash T1, T2, and other active transactions

  47. NonquiescentCheckpointing • Space optimization • When inserting an <END CKPT> record, we remove all log records before the last <START CKPT T1, …, Tn>

  48. Problems with Undo Logging? • Rule 1: • If T modifies data item A, record <T,A,v> must be written to log before OUTPUT(A). • Rule 2: • If T commits, record <COMMIT T> must be written to log only after all the modifications of T are written on disk.

  49. Problems with Undo Logging? • Rule 1: • If T modifies data item A, record <T,A,v> must be written to log before OUTPUT(A). • Rule 2: • If T commits, record <COMMIT T> must be written to log only after all the modifications of T are written on disk. INEFFICIENT

More Related