Database Scalability, Elasticity, and Autonomy in the Cloud Agrawal et al.

Presentation Transcript


  1. Database Scalability, Elasticity, and Autonomy in the Cloud • Agrawal et al. • Oct 24, 2011

  2. Framing • Survey paper • Identifies necessary qualities of cloud storage • Scalability • Sensible consistency / programming model • Scale-down and migration • Autonomic management • Pointers to different work in the space

  3. Scalability • Add more resources, get more performance • Handle more requests per second • Store more data • Achievable with scale-up or scale-out • Scale-out is the only paradigm for the cloud • App’s parallelism is limited by Amdahl’s Law
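
The Amdahl's Law bullet can be made concrete with a small calculation. This is a minimal sketch: the function name `amdahl_speedup` and the 95% parallel fraction are illustrative assumptions, not values from the paper.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only `parallel_fraction` of the work scales out."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unaffected by adding workers; only the parallel part shrinks.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical workload: 95% of the work parallelizes.
    for n in (1, 10, 100, 1000):
        print(n, round(amdahl_speedup(0.95, n), 2))
```

With a 5% serial fraction the bound approaches 1 / 0.05 = 20x however many nodes are added, which is why scale-out helps only as far as the application itself parallelizes.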

  4. Finding the right design point • What’s the right consistency / programming model? • Pure key-value stores are too weak • Only have transactions on single records • Traditional RDBMSs are too strong • Can’t just run MySQL at scale • Instead, provide strong consistency within a portion of the data • Megastore • Vertica, Aster, Teradata, Greenplum, …

  5. Data Fusion vs. Data Fission • [Figure: systems placed along a consistency axis, weak to strong] • Weak end: Dynamo, BigTable, PNUTS • Fusion: Megastore, G-Store • Fission: Azure, ElasTraS, Rel Cloud • Strong end: MySQL

  6. Data Fusion • Start with a key-value store • Partition records into groups • Provide multi-record updates within a group • Cross-group operations handled separately • Assumes that cross-group ops are rare
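
The fusion idea above can be sketched as a toy in-memory store. Everything here is an illustrative assumption (the class name `GroupedKVStore`, the key-prefix grouping rule, one lock per group); it is not the design of Megastore or G-Store.

```python
import threading
from typing import Callable, Dict


class GroupedKVStore:
    """Toy key-value store with atomic multi-record updates inside one group."""

    def __init__(self, group_of: Callable[[str], str]):
        self._group_of = group_of  # maps a key to its group id (assumed app-defined)
        self._data: Dict[str, object] = {}
        self._locks: Dict[str, threading.Lock] = {}
        self._locks_guard = threading.Lock()

    def _lock_for(self, group: str) -> threading.Lock:
        # Create the per-group lock lazily, guarded so two threads share one lock.
        with self._locks_guard:
            return self._locks.setdefault(group, threading.Lock())

    def get(self, key: str):
        return self._data.get(key)

    def update_group(self, writes: Dict[str, object]) -> None:
        """Apply all writes under one group lock; refuse cross-group writes."""
        if not writes:
            return
        groups = {self._group_of(key) for key in writes}
        if len(groups) != 1:
            # Cross-group operations would need a separate, more expensive path.
            raise ValueError(f"writes span multiple groups: {sorted(groups)}")
        (group,) = groups
        with self._lock_for(group):
            self._data.update(writes)


if __name__ == "__main__":
    store = GroupedKVStore(group_of=lambda key: key.split("/", 1)[0])
    store.update_group({"alice/profile": "v1", "alice/inbox": ["hello"]})
    print(store.get("alice/inbox"))
```

The cheap path covers only single-group updates, which matches the slide's assumption that cross-group operations are rare.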

  7. Data Fission • Start with a relational database • Partition tables into shards • Provide ACID within each shard • Cross-shard ops are expensive • Assumes that cross-shard ops are rare
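
A minimal routing sketch for the fission approach, assuming hash partitioning on the primary key (the slide does not say which partitioning scheme is used). The helper names `shard_for`, `shards_touched`, and `is_single_shard` are hypothetical.

```python
import hashlib
from typing import Iterable, Set


def shard_for(primary_key: str, num_shards: int) -> int:
    """Map a primary key to a shard with a stable hash (assumed scheme)."""
    digest = hashlib.sha256(primary_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shards_touched(keys: Iterable[str], num_shards: int) -> Set[int]:
    """Set of shards a transaction over `keys` would have to visit."""
    return {shard_for(key, num_shards) for key in keys}


def is_single_shard(keys: Iterable[str], num_shards: int) -> bool:
    """True when the transaction can run with local ACID guarantees only."""
    return len(shards_touched(keys, num_shards)) <= 1


if __name__ == "__main__":
    print(is_single_shard(["customer:42", "customer:42"], num_shards=8))
    print(sorted(shards_touched(["customer:1", "customer:2", "customer:3"], 8)))
```

A transaction that fails the `is_single_shard` check is the expensive cross-shard case the slide refers to.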

  8. What’s the difference? • Is Fusion vs. Fission a worthwhile distinction? • Seems like they both arrive at the same place • Megastore “Fusion” vs. ElasTraS “Fission” • Shard tables based on a table’s primary key • Shard is co-located on the same machine • ACID transactions within a shard • Primary and secondary indexes • All Megastore is missing is an SQL interface!

  9. The difference • Different targeted users • Fusion is for people who own datacenters • Fission is for people who want SQL in the cloud • Different exposed API • Fusion is more explicit about performance • Fission tries to hide partitioning from user • Anything else?

  10. Elasticity • Dynamically scaling up and down on demand • Important with pay-as-you-go cloud pricing • Consolidate to reduce costs • Expand to increase performance • Need to move state and processing duties around within the system
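
The consolidate/expand trade-off can be illustrated with a simple sizing rule. This is a sketch under stated assumptions: the function `desired_nodes`, the per-node capacity figure, and the target utilization are hypothetical and do not come from the paper.

```python
import math


def desired_nodes(
    load_rps: float,
    node_capacity_rps: float,
    target_utilization: float = 0.5,
    min_nodes: int = 1,
) -> int:
    """Smallest node count that keeps each node at or below the target utilization."""
    if node_capacity_rps <= 0:
        raise ValueError("node_capacity_rps must be positive")
    if not 0.0 < target_utilization <= 1.0:
        raise ValueError("target_utilization must be in (0, 1]")
    usable_per_node = node_capacity_rps * target_utilization
    return max(min_nodes, math.ceil(load_rps / usable_per_node))


if __name__ == "__main__":
    # Hypothetical daily pattern: consolidate at night, expand at peak.
    for load in (200, 2_000, 9_000):
        print(load, desired_nodes(load, node_capacity_rps=1_000))
```

Every change in the returned node count implies moving state between nodes, which is why migration cost matters for elasticity.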

  11. Live migration of databases • Shared-disk • “Global disk” shared by all DB nodes • Just need to copy in-memory state • Iterative copy: sync up cached pages + transaction state to minimize the availability hit • Shared-nothing • Each DB node is its own separate DB instance • Need to copy both local disk state and memory • Push/pull: gradually shift new requests to the new node, sync state in the background
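
The shared-disk "iterative copy" bullet can be sketched as a loop that keeps re-copying pages dirtied during the previous round, then performs a short final hand-off. This is a simplified simulation; the function name, the `dirtied_during_round` callback, and the thresholds are assumptions for illustration rather than the protocol from the paper.

```python
from typing import Callable, Dict, Set, Tuple


def iterative_copy(
    source: Dict[int, bytes],
    dirtied_during_round: Callable[[int], Set[int]],
    stop_threshold: int = 2,
    max_rounds: int = 10,
) -> Tuple[Dict[int, bytes], int, int]:
    """Copy cached pages while the source keeps serving requests.

    `dirtied_during_round(r)` stands in for the live workload: it returns the
    page ids written on the source while round `r` was being copied.
    Returns (destination pages, rounds run, pages copied in the final hand-off).
    """
    dest: Dict[int, bytes] = {}
    to_copy: Set[int] = set(source)
    rounds = 0
    # Keep iterating while the dirty set is still too large to copy in a brief pause.
    while len(to_copy) > stop_threshold and rounds < max_rounds:
        for page_id in to_copy:
            dest[page_id] = source[page_id]
        to_copy = dirtied_during_round(rounds)
        rounds += 1
    # Final hand-off: the source stops serving, so nothing new gets dirtied.
    for page_id in to_copy:
        dest[page_id] = source[page_id]
    return dest, rounds, len(to_copy)
```

The size of the final hand-off set is what determines the availability hit: the smaller the dirty set when the loop exits, the shorter the pause.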

  12. Database Autonomy • Need management to be more automatic • Elasticity and load balancing based on usage and ML predictions • Performance modeling • Migration costs (availability, performance, $$$) • Resource isolation (consolidated services) • SLAs

  13. Questions?

  14. Tree schema • Primary table’s primary key used for sharding • Secondary tables are sharded into row groups • Row groups are co-located and transactional • Global tables are write-rarely, and replicated on all nodes
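
A toy model of the tree schema, assuming every row in a secondary table carries the primary key of its root row and that shards are chosen by hashing that key. The class `TreeShardedStore` and the CRC32-based placement are illustrative assumptions, not the actual ElasTraS or Megastore design.

```python
import zlib
from collections import defaultdict
from typing import Dict, List


class TreeShardedStore:
    """Rows sharing a root key land on one shard; global tables go everywhere."""

    def __init__(self, num_shards: int):
        self.num_shards = num_shards
        self.shards = [
            {"local": defaultdict(list), "global": defaultdict(list)}
            for _ in range(num_shards)
        ]

    def shard_of(self, root_key: str) -> int:
        return zlib.crc32(root_key.encode("utf-8")) % self.num_shards

    def insert(self, table: str, root_key: str, row: dict) -> None:
        """Primary and secondary tables: place the row with its root key."""
        self.shards[self.shard_of(root_key)]["local"][table].append((root_key, row))

    def insert_global(self, table: str, row: dict) -> None:
        """Global tables: write-rarely, so replicate the row on every shard."""
        for shard in self.shards:
            shard["global"][table].append(row)

    def row_group(self, root_key: str) -> Dict[str, List[dict]]:
        """All rows for one root key, read from a single shard."""
        local = self.shards[self.shard_of(root_key)]["local"]
        group: Dict[str, List[dict]] = {}
        for table, rows in local.items():
            matching = [row for key, row in rows if key == root_key]
            if matching:
                group[table] = matching
        return group
```

Because a row group never spans shards in this model, a transaction over one row group needs no cross-node coordination.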

  15. Tree schema
