Download
consistent join queries for cloud data stores n.
Skip this Video
Loading SlideShow in 5 Seconds..
Consistent Join Queries for Cloud Data Stores PowerPoint Presentation
Download Presentation
Consistent Join Queries for Cloud Data Stores

Consistent Join Queries for Cloud Data Stores

89 Vues Download Presentation
Télécharger la présentation

Consistent Join Queries for Cloud Data Stores

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

  1. Consistent Join Queries for Cloud Data Stores Zhou Wei 1,2 Guillaume Pierre1, Chi-Hung Chi2 1 VU University Amsterdam 2Tsinghua University Beijing

  2. Background • Cloud Computing Platforms for Web Apps • Unlimited resources • Scalable NoSQL data stores • Pay-as-you-go

  3. Problems of Cloud data stores • Problem 1: No Join queries • Only support queries within a single table • No queries across tables • Problem 2: Relaxed data consistency • Single-row transaction only • No consistency guarantee across multi-items • Web apps need consistent join queries

  4. Why Web App needs join? • Data Normalization  Foreign-key Equi-joins Post Post Comment

  5. Why consistent? • Application correctness is not an option • Hard to maintain correctness with an inconsistent data store • Require skillful programmers • Error-Prone • Cost more time

  6. Alternatives • SQL databases • Centralized • Full Replication • NOT scalable to update workload • Complicated queries • ACID Transaction • NoSQL data stores • Distributed • Partitioned • Scalable to update workload • No Join Queries • Single-row Transaction MySQL Oracle … Google Bigtable Amazon SimpleDB Cassandra …

  7. Our Position • Scalable NoSQL data stores supporting equi-joins queries for Web applications • Join query as a ACID transaction • Queries of Web applications access only a small portion of data • Also true for join queries • All of them are foreign-key equi-joins

  8. How it works? Web Application Join Queries Transactions CloudTPS TM TM TM TM Load Data Checkpoint Cloud Data Store

  9. Outline • Join Algorithm • Transaction Protocol • Performance Evaluation • Conclusion

  10. Join Algorithm • Web applications are response-time sensitive • Table scan is not an option • The basic idea • Linking records by FK relationships in advance • Each join query has well identified “root records” • Begin from accessing each “root record” • Recursively obtaining its related records

  11. Link records for forward join Given referring row  identify referenced row (FK) (PK) Post Comment

  12. Link records for backward join Give referenced row  identify referring rows (FK) (PK) Comment Post Index

  13. Joining two tables • SELECT * FROM post, comment WHERE post.id = 1 and comment.post_id = post.id post id=1 post.id = comment.post_id comment

  14. Multi-Joins • The table of the “root record” is the root • Recursively join child tables • Length of critical execution path = 2 Post Id=1 Comment Revision User

  15. Outline • Join Algorithm • Transaction Protocol • Performance Evaluation • Conclusion

  16. Distributed Transaction TM TM TM RW-TC RW-TC RW-TC SubT(1) SubT(2) SubT(3) Data1 Data2 Data3 2 Phase Commit (2PC)

  17. Problem • 2PC requires all records identified in advance • Join queries/index management identify accessed records on the fly • Solution: extend 2PC to allow adding records dynamically

  18. The Original 2PC 1. Submit 2. Vote 3. Commit/Abort TC TC TC Data1 Data2 Data1 Data2 Data1 Data2

  19. Extended 2PC • Dynamically add data items to the transaction 2. Vote 1. Submit TC TC Commit Add Data3 Data1 Data2 Data1 Data2

  20. Extended 2PC 3. Derive TC Data1 Data2 Data3

  21. Extended 2PC 4. Vote TC Commit Data1 Data2 Data3

  22. Extended 2PC 5. Commit TC Data1 Data2 Data3

  23. Outline • Join Algorithm • Transaction Protocol • Performance Evaluation • Conclusion

  24. Experiment Setup • Implement a prototype of CloudTPS • Java Servlet • Deployed in Tomcat v6.0 • Environments • 1. Das3+Hbase+Tomcat (99% reqs < 100ms) • 2. EC2+SimpleDB+Tomcat (90% reqs < 100ms) • High-CPU medium instances

  25. Micro-benchmarks • Generate specific workload • Containing only Join queries/RW transactions • Changing the number of accessed data items • Changing the length of critical execution path • Switch between Update indexes/Not update • All configured with 10 CloudTPS nodes • Measuring maximum sustainable throughput • Under response time constraint

  26. Two-table join evaluation Increasing number of accessed records in a transaction Transaction Per Second Record Per Second

  27. Multi-Joins Evaluation Increasing the height of join-query tree All join queries access 12 records

  28. RW transactions evaluation Transaction Per Second Record Per Second Increasing number of accessed records in a transaction

  29. Scalability Evaluation • Macro-Benchmark • Derived from TPC-W, simulating an online book store • Our workload includes only the join queries and RW transactions • No simple queries

  30. Scalability Results Linearly Scalability: Any increase of workload can be addressed by adding more machines

  31. CloudTPS vs. Postgresql Postgresql: binary-replication

  32. Fault Tolerance • Recovers from 3 network partitions and 2 machine failures

  33. Conclusion • We demonstrate supporting consistent join queries over partitioned and distributed data • Join queries from Web applications only access a small portion of data • It scales linearly even under an intensive workload of only join queries and read-write transactions

  34. Questions? Thank you!

  35. Details in Macro-benchmark workload Join queries RW transactions

  36. Secondary-key Queries • Select * from Table1 Where SK = ‘B’ • Create a separate index table for the SK • Translate into a join query Index Table Table 1

  37. Index Management • Indexes link the related records post Index comment

  38. Index Management (cont.) • Case: a new comment inserted • Updating indexes atomically post Index comment

  39. Index Management (cont.) • Consistency Violated Case 1 post Index comment ?

  40. Index Management (cont.) • Consistency Violated Case 2 post Index comment ?

  41. Fault tolerance • Maintain ACID during server failure • Transactions that reach an agreement of “COMMIT” should Commit • Abort other transactions • Replication to N transaction managers • Values of Data Items • States of Transactions (Vote result and “redo logs”) • Failover of a failed server • Transfer ownership of data items • Transfer leadership of transactions being coordinated

  42. Optimization for Join Queries • Join Query is a Read-only transaction • Remove the final Commit phase • Sub-transactions removed immediately after local execution • RO transactions will not block other transactions

  43. Concurrency Control 6 6 SubT(1) SubT(2) 5 5 SubT(1) SubT(2) Data1 Data2 Timestamp Ordering

  44. Read-Only Transaction TM TM TM Join(Data1) 7 SubT(1) 6 ROSubT(1) RW-TC 5 5 SubT(1) SubT(2) RO-TC Data1 Data2 Data3

  45. Read-Only Transaction (Cont.) TM TM TM Join(Data1) 7 SubT(1) 6 ROSubT(1) RW-TC Data2 RO-TC Data1 Data2 Data3

  46. Read-only Transaction (cont.) TM TM Result 7 SubT(1) 6 RW-TC ROSubT(2) None RO-TC Data1 Data2 Data3