Dynamo, Amazon’s NoSQL Database

Presentation Transcript


  1. Dynamo, Amazon’s NoSQL Database Bogdan Ghidireac amazon.com

  2. NoSQL Databases Dynamo Architecture

  3. NoSQL Databases Dynamo Architecture

  4. Document ID01 -> { "glossary": { "title": "example glossary", "GlossDiv": { "title": "S", "GlossList": { "GlossEntry": { "ID": "SGML", "SortAs": "SGML", "GlossTerm": "Standard Generalized Markup Language", "Acronym": "SGML", "Abbrev": "ISO 8879:1986", "GlossDef": { "para": "A meta-markup language, used to create markup languages such as DocBook.", "GlossSeeAlso": ["GML", "XML"] }, "GlossSee": "markup" } } } } }

  5. Key-Value ID01 -> readme.txt ID02 -> big_picture.png ID03 -> book.doc

  6. Column ID101 -> { ProductName = "Book 101 Title" ISBN = "111-1111111111" Authors = [ "Author 1", "Author 2" ] Price = -2 Dimensions = "8.5 x 11.0 x 0.5" PageCount = 500 InPublication = 1 ProductCategory = "Book" } ID201 -> { ProductName = "18-Bicycle 201" Description = "201 description" BicycleType = "Road" Brand = "Brand-Company A" Price = 100 Gender = "M" Color = [ "Red", "Black" ] ProductCategory = "Bike" }

  7. Graph

  8. Object

  9. NoSQL Databases Dynamo Architecture

  10. Motivation • Highly available storage for shopping cart • Existing Oracle solution could not scale • 99.99% availability

  11. Key Principles • Keep It Simple • Replication is necessary for high availability • Symmetry: nobody’s special • Decentralization: favor peer-to-peer techniques over centralized control

  12. Consistency vs. Availability • There’s a fundamental tension between consistency and availability • Formalized by the “CAP Dilemma”: can’t simultaneously achieve strong consistency and availability in the presence of network partitions (Brewer, 1998) • For the highest availability, have to be willing to sacrifice consistency • Our design embraces this consistency tradeoff as a first principle

  13. Dynamo: Replicated DHT with Consistency Management • Consistent hashing • Optimistic replication • “Sloppy quorum” • Anti-entropy mechanisms • Object versioning

  14. Load Balancing: Partitioning and Replication [Figure: hash ring spanning 0 to 2^128 with nodes A to F; h(key1) and h(key2) are placed on the ring; N=3]

  15. Load Balancing [Figure: hash ring spanning 0 to 2^128 on which each of the labels A, B, C and D appears four times]
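The ring on slides 14 and 15 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Dynamo's implementation: the names `Ring`, `ring_position` and `preference_list`, the use of MD5 to obtain a 128-bit position, and the choice of four ring positions per node are all hypothetical choices made for this example.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 128  # MD5 yields a 128-bit value, matching the 0..2^128 ring


def ring_position(name: str) -> int:
    """Hash a string onto the ring [0, 2**128). MD5 is an assumption here."""
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest(), "big")


class Ring:
    def __init__(self, nodes, tokens_per_node=4):
        # Each physical node is placed at several positions on the ring,
        # which spreads its share of the key space more evenly.
        self._tokens = sorted(
            (ring_position(f"{node}#{i}"), node)
            for node in nodes
            for i in range(tokens_per_node)
        )
        self._positions = [pos for pos, _ in self._tokens]

    def preference_list(self, key, n=3):
        """Walk clockwise from h(key) and collect the first n distinct nodes."""
        start = bisect.bisect_right(self._positions, ring_position(key))
        owners = []
        for step in range(len(self._tokens)):
            _, node = self._tokens[(start + step) % len(self._tokens)]
            if node not in owners:
                owners.append(node)
            if len(owners) == n:
                break
        return owners


ring = Ring(["A", "B", "C", "D", "E", "F"])
print(ring.preference_list("key1", n=3))  # three distinct replica nodes
```

Because only ring neighbours are affected when a node joins or leaves, most keys keep their replica set after a membership change, which is the property the load-balancing slides rely on.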

  16. “Sloppy Quorum” • Configurable N, R, W • N replicas in ideal state • Successful read involves at least R nodes • Successful write involves at least W nodes • Sloppy Quorum: dynamic membership based on node availability • “Always Writable” with tunable probability of “Read Your Writes” consistency
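The N, R, W rules above can be illustrated with a small in-memory simulation. It is a sketch under assumptions, not Dynamo's code: `Node`, `sloppy_targets`, `put` and `get` are hypothetical names, nodes are plain Python objects, and the "ring order" for the key is given as a fixed list. The point it shows is that membership is dynamic: unreachable nodes are skipped, so a write can still reach W replicas, at the cost of replicas that may disagree afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    up: bool = True
    store: dict = field(default_factory=dict)


def sloppy_targets(ring_order, n):
    """First n reachable nodes in ring order; unreachable nodes are skipped."""
    return [node for node in ring_order if node.up][:n]


def put(ring_order, key, value, n=3, w=2):
    """Write to the current targets; succeed if at least w of them took the write."""
    targets = sloppy_targets(ring_order, n)
    for node in targets:
        node.store[key] = value  # not rolled back if the quorum is missed
    return len(targets) >= w


def get(ring_order, key, n=3, r=2):
    """Read from the current targets; needs at least r reachable nodes."""
    targets = sloppy_targets(ring_order, n)
    if len(targets) < r:
        raise RuntimeError("fewer than R nodes reachable")
    return [node.store[key] for node in targets if key in node.store]


# Ring order for key1 (hypothetical): B, C, D, then E and F as fallbacks.
b, c, d, e, f = (Node(name) for name in "BCDEF")
order = [b, c, d, e, f]

ok_v1 = put(order, "key1", "v1")   # lands on B, C, D
b.up = c.up = False                # B and C become unreachable
ok_v2 = put(order, "key1", "v2")   # lands on D, E, F instead
b.up = c.up = True                 # B and C return, still holding v1
replies = get(order, "key1")       # read from B, C, D
print(ok_v1, ok_v2, replies)
```

In this run both writes succeed, yet the read sees a mix of v1 and v2. A reader that waited for only R=2 replies could see v1 alone, which is why "Read Your Writes" holds with a tunable probability rather than as a guarantee.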

  17. “Sloppy Quorum” [Figure: hash ring spanning 0 to 2^128 with nodes A to F and h(key1) marked; N=3, R=2, W=2. Labels: put(key1,v1) = success; put(key1,v2) = success; get(key1) = v1; local read, local write, forwarded reads, forwarded writes, anti-entropy. Some replicas hold key1=v1 and others key1=v2.]

  18. Configurability

  19. Consistency Management • Each put() creates new, immutable version • Dynamo tracks version history • Automatic reconciliation • Application-level reconciliation • System Interfaces • put(key, object, context) • get(key) -> object[], context [Figure: version history across nodes A, B, D and F: put(v1); get() -> v1; put(v2) based on v1; put(v3) based on v1; get() -> [v2,v3]; put(v4) based on [v2,v3]]
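The version history on this slide can be replayed with vector clocks. The slide does not say how the history is tracked, so treat this as a sketch using one common technique. The helper names (`next_clock`, `descends`, `reconcile`) and the assignment of each write to node A, B, F or D are assumptions made for illustration.

```python
def next_clock(context, node):
    """Merge the clocks from the read context and bump this node's counter."""
    merged = {}
    for clock in context:
        for name, count in clock.items():
            merged[name] = max(merged.get(name, 0), count)
    merged[node] = merged.get(node, 0) + 1
    return merged


def descends(a, b):
    """True if clock a has seen everything recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def reconcile(versions):
    """Drop versions that are ancestors of another; survivors are concurrent."""
    return [
        (value, clock)
        for value, clock in versions
        if not any(other != clock and descends(other, clock) for _, other in versions)
    ]


# Replay of the slide: each put() creates a new immutable version.
c1 = next_clock([], "A")            # put(v1)
c2 = next_clock([c1], "B")          # put(v2) based on v1
c3 = next_clock([c1], "F")          # put(v3) based on v1, concurrent with v2
history = [("v1", c1), ("v2", c2), ("v3", c3)]

siblings = reconcile(history)       # get() -> [v2, v3]; v1 is superseded
context = [clock for _, clock in siblings]

c4 = next_clock(context, "D")       # put(v4) based on [v2, v3]
final = reconcile(history + [("v4", c4)])
print([v for v, _ in siblings], [v for v, _ in final])
```

Dropping v1 is the automatic part of reconciliation. Choosing how to combine v2 and v3 into v4 is left to the application, which passes back the context it received from get().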

  20. Conclusion • NoSQL databases are specialized storage systems created to solve scalability and availability problems found in relational systems. • Quorum, versioning, and consistent hashing techniques can be combined to yield a highly available system with user-perceived consistency.

  21. Questions?