1 / 27

Making Snapshot Isolation Serializable

Making Snapshot Isolation Serializable. Presenter:Ferhat Hoke ALAN FEKETE University of Sydney DIMITRIOS LIAROKAPIS, ELIZABETH O’NEIL, and PATRICK O’NEIL University of Massachusetts and DENNIS SHASHA Courant Institute. Roadmap. Introduction Key Words / Phrases and background The Details

berit
Télécharger la présentation

Making Snapshot Isolation Serializable

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Making Snapshot Isolation Serializable Presenter:Ferhat Hoke ALAN FEKETE University of Sydney DIMITRIOS LIAROKAPIS, ELIZABETH O’NEIL, and PATRICK O’NEIL University of Massachusetts and DENNIS SHASHA Courant Institute

  2. Roadmap • Introduction • Key Words / Phrases and background • The Details • Develop a theory • Apply this theory in TPC-C benchmark application • How to generalize this to other applications • How to modify program logic so that under SI serializability will be guaranteed • Conclusion – Comments - QA

  3. Introduction

  4. Introduction cont….

  5. Introduction cont….

  6. Introduction cont…. • What is Snapshot Isolation? • Snapshot Isolation (SI) is a multi-version concurrency control algorithm. • SI is attractive because it provides an “isolation level” that avoids many of the “common concurrency anomalies”. • Follows First-committer-wins (FCW) rule • Each data item might have multiple versions, created by active and committed transactions

  7. Introduction cont….

  8. Introduction cont….

  9. Introduction cont…. • What are we trying to achieve? GOAL: Bring concurrency safety to the many applications running on systems under SI.

  10. Introduction cont…. • However , SI does not guarantee serializability in all cases. • SI does not ensure that all executed histories are serializable, it has problem like “write skew”

  11. Key Words / Phrases and background “Write Skew” • Suppose data items: X, Y ( constraint: X+Y >0 ) Assume initially X0 =70 , Y0 =80. • T1 reads X0 & Y0 ,then subtracts 100 from X0, T2 reads X0 & Y0 concurrently, then subtracts 100 from Y0. • Each update is safe by itself, but under SI, the final state X+Y = -30-20 = -50 < 0 (violate X+Y >0) H: r1[x0=70] r1[y0=80] r2[x0=70] r2[y0=80] w1[x1=-30] w2[y2=-40] c1 c2 • FCW can not detect this problem.

  12. Key Words / Phrases and background “ common concurrency anomalies “ Three phenomenain ANSI SQL (92) Isolation: • dirty reads (P1) • non repeatable or Fuzzy Read (P2) • phantom (P3)

  13. Key Words / Phrases and background “ Isolation Levels “ ANSI/ISO SQL Isolation Levels Dirty Read Non-Repeatable Read Phantom Read P1 P2 P3 Read uncommitted Possible Possible Possible Read committed Not possiblePossible Possible Repeatable read Not possible Not possiblePossible Serializable Not possible Not possible Not possible

  14. Key Words / Phrases and background… How about Commercial Database? Some database products like Oracle and PostgreSQL choose READ COMMITTED as default level for better performance, and also provide SERIALIZABLE level for correctness SQL Server 2005 a.k.a ‘Yukon’ Row Versioning-Based transaction Isolation.

  15. Key Words / Phrases and background… How about Commercial Database? Some database products like Oracle and PostgreSQL choose READ COMMITTED as default level for better performance, and also provide SERIALIZABLE level for correctness

  16. The Details Develop a theory This paper presented new specifications for ANSI, they apply not only to locking implementations, but also to optimistic and multi-version CC schemes. Furthermore, they handle predicates in a correct and flexible manner at all levels. The new definitions are both correct and implementation–independent (it is important since it provides flexibility to implementers, which can lead to better performance.)

  17. The Details Method: Presents a new theorywith which the DBA can examine the program logic of the application to achieve one of the following two goals: 1. To verify that only serializable executions will occur when running on a DBMS which has SI as its concurrency control algorithm. 2. If 1 fails, to modify the application programs so that such serializability will be guaranteed.

  18. The Details…… Making Snapshot Isolation Serializable • Since the result of any read by Tj is taken from the snapshot as of start(Tj ),if Tj directly read-depends on Ti , Ti must commit before Tj starts; that is, the transactions are not concurrent. • Because of the FCW rule, if Tj directly write-depends on Ti, then Tj and Ti cannot be concurrent.

  19. The Details…… Making Snapshot Isolation Serializable Lemma 1In a history obtained using SI, if Tj directly read-, write-or anti-depends on Ti ,and Tj and Ti are concurrent, then Tj must anti-dependon T. Lemma 2Suppose h is a multi-version, multi-T history produced under SI. If h is not serializable, then in every cycle in the serialization graph DSG(h),there are three consecutive transactions Ti ,Tj ,Tk ,such that Ti and Tj are concurrentwith an edge Ti ->Tj, and also Tj and Tk are concurrent with an edge Tj ->Tk . By Lemma 1, both edges must be anti-dependencies.

  20. The Details…… Making Snapshot Isolation Serializable Theorem : If a collection of application programs has an SC-graph with no dangerous cycle, then every execution of the programs, running on a database using SI, is serializable. • Theorem will often allow the DBA to see that the mix of application programs is safe, in that every execution will be serializable. • If some dangerous cycles exist in SC-graph, non-serializable are possible, violations of integrity constraints may arise, even though each application program individually is correct.

  21. The Details…… Making Snapshot Isolation Serializable How does the DBA solve above problem ? The DBA modifies the programs so that the theorem does apply without changing the functionality of each program. 1. identify where a dangerous cycle exist 2. choose one of the Vulnerable edges in each dangerous cycle, and change the applications so the edge ceases to be vulnerable.

  22. The Details…… Making Snapshot Isolation Serializable How to remove vulnerability from a R-W conflict ? Materialize the conflict: Add a special conflict table, which both transactions update in order to create a direct write-write conflict. Promotion: Have one transaction "update" a read-only location (replacing a value with the same value) in order to create a direct write-write conflict (or use an equivalent promotion, e.g. Oracle's SELECT FOR UPDATE).

  23. The Details……Apply this theory in TPC-C benchmark application What is TPC-C? (revision 5.0 01) • TPC-C was introduced in 1992, as a mean to measure the performance of transaction processing systems. • It is an OLTP (on-line transaction processing) workload and a mix of read-only and update intensive transactions that simulate the activities found in complex OLTP application environment. • TPC-C offersa rich environment that emulates many OLTP applications, but it does not reflect the entire range of OLTP requirements. • It is designed to represent any industry in which one must manage, sell, or distribute a product or service.

  24. The Details……Apply this theory in TPC-C benchmark application • Comprised of a set of basic operations designed to exercise system functionalities in a manner representative of complex OLTP application environments. These basic operations have been given a lifelike context, portraying the activity of a company (wholesale supplier), to help users relate intuitively to the components of the benchmark. • Components of its database: 9 tables (warehouse, stock, item, history, new-owner, order-line, district, customer and order.) 5 types of transaction( New-Order, Payment, Order-Status, Delivery and Stock-Level.) results are highly dependent upon workload, specific application requirements and systems design and implementation. given the context of an order-entry application. (TPC-A: was given the lifelike context of a banking application.

  25. Conclusion – Comments - QA • This article have shown that potential inconsistency problems arising from write skew anomalies can be fixed by adding (otherwise unnecessary) updates to the transactions. • In general, therefore, snapshot isolation puts some of the problem of maintaining non-trivial constraints onto the user, who may not appreciate either the potential pitfalls or the possible solutions. The upside to this transfer is better performance.

  26. References ADYA, A., LISKOV, B., AND O’NEIL, P. 2000. Generalized isolation level definitions. In Proceedings • of IEEE International Conference on Data Engineering, (Feb.) IEEE Computer Society Press, Los • Alamitos, Calif., 67–78. • ANDERSON, T.,BRETBART, Y.,KORTH, H., ANDWOOL, A. 1998. Replication, consistency and practicality: • are these mutually exclusive? In Proceedings of the ACM SIGMOD International Conference • on Management of Data (June). ACM, New York, 484–495. • BERENSON, H., BERNSTEIN, P., GRAY, J., MELTON, J., O’NEIL, E., AND O’NEIL, P. 1995. A critique of • ANSI SQL isolation levels. In Proceedings of the ACM SIGMOD International Conference on • Management of Data (June) ACM, New York, 1–10. • ACM Transactions on Database Systems, Vol. 30, No. 2, June 2005. • 528 • A. Fekete et al. • BERGE, C. 1976. Graphs and Hypergraphs (2nd edition). North-Holland Mathematical Library, • Volume 6. • BERNSTEIN, P., HADZILACOS, V., AND GOODMAN, N. 1987. Concurrency Control and Recovery in • Database Systems. Addison-Wesley. (This text is now out of print but can be downloaded from • http://research.microsoft.com/pubs/ccontrol/default.htm) • BERNSTEIN, A., LEWIS, P., AND LU, S. 2000. Semantic conditions for correctness at different isolation • levels. In Proceedings of IEEE International Conference on Data Engineering (Feb.). IEEE • Computer Society Press, Los Alamitos, Calif., 57–66. • BREITBART, Y., KOMONDOOR, R., RASTOGI, R., SESHADRI, S., AND SILBERSCHATZ, A. 1999. Update propagation • protocols for replicated databases. In Proceedings of the ACM SIGMOD International • Conference on Management of Data (June). ACM, New York, 97–108. • CHAMBERLIN, D., ASTRAHAN, M., BLASGEN, M., GRAY, J.,KING, W., LINDSAY, B., LORIE, R., MEHL, J., PRICE, • T., PUTZOLU, F., SELINGER, P., SCHKOLNICK, M., SLUTZ, D., TRAIGER, I., WADE, B., AND YOST, R. 1981. • A history and evaluation of System R. Commun. ACM 24, 10 (Oct.). 632–646. (Also in: M. Stonebraker • and J. Hellerstein, Readings in Database Systems, Third Edition, Morgan Kaufmann • 1998.)

  27. References… ELNIKETY, S., PEDONE, F., AND ZWAENEPOEL, W. 2004. Generalized snapshot isolation and a prefixconsistent • implementation. Tech. Rep. IC/2004/21, EPFL, Mar. • ESWARAN, K., GRAY, J., LORIE, R., AND TRAIGER, I. 1976. The notions of consistency and predicate • locks in a database system. Commun. ACM 19, 11 (Nov.), 624–633. • FEKETE, A. 1999. Serializability and snapshot isolation. In Proceedings of the Australian • Database Conference (Auckland, New Zealand, Jan.). 201–210. • FEKETE, A., O’NEIL, E., AND O’NEIL, P. 2004. A read-only transaction anomaly under snapshot • isolation. ACM SIGMOD Record 33, 3 (Sept.), 12–14. • GRAY, J. (ED.). 1993. The Benchmark Handbook (2nd edition). Morgan-Kaufmann, San Francisco, • Calif. • GRAY, J. AND REUTER, A. 1993. Transaction Processing: Concepts and Techniques. Morgan- • Kaufmann, San Francisco, Calif. • GRAY, J.,HELLAND, P., O’NEIL, P., AND SHASHA, D. 1996. The dangers of replication and a solution. • In Proceedings of the ACM SIGMOD International Conference on Management of Data (June). • ACM, New York, 173–182. • JACOBS, K., BAMFORD, R., DOHERTY, G., HAAS, K., HOLT, M., PUTZOLU, F., AND QUIGLEY, B. 1995. Concurrency • Control: Transaction Isolation and Serializability in SQL92 and Oracle7. Oracle White • Paper, Part No. A33745 (July). • LIAROKAPIS, D. 2001. Testing Isolation Levels of Relational Database Management Systems, Ph.D. • dissertation, University of Massachusetts, Boston, Mass. (Dec.). (This thesis can be downloaded • from http://www.cs.umb.edu/∼dimitris/thesis/thesisDec20.pdf.) • PAPADIMITRIOU, C. 1986. The Theory of Database Concurrency Control. Computer Science Press. • SHASHA, D. AND BONNET, P. 2002. Database Tuning: Principles, Experiments, and Troubleshooting • Techniques. Morgan-Kaufmann, San Francisco, Calif. • SHASHA, D., LLIRBAT, F., SIMON, E., AND VALDURIEZ, P. 1995. Transaction chopping: Algorithms and • performance studies. ACM Trans. Datab. Syst. 20, 3 (Sept.), 325–363. • SCHENKEL, R. AND WEIKUM, G. 2000. Integrating snapshot isolation into transactional federations. • In Proceedings of 5th IFCIS International Conference on Cooperative Information Systems • (CoopIS 2000) (Sept.), 90–101. • TPC-C BENCHMARK SPECIFICATION, available at http://www.tpc.org/tpcc/.

More Related