
Continuous Consistency and Availability

Haifeng Yu, CPS 212, Fall 2002


Presentation Transcript


  1. Continuous Consistency and Availability Haifeng Yu CPS 212 Fall 2002

  2. Consistency in Replication [Figure: three servers and three clients] • Replication comes with a consistency cost: • Reasons for replication: better performance and availability • Replication turns client-server communication into server-server communication: • Decreases performance • Decreases availability

  3. Strong Consistency and Optimistic Consistency [Figure: availability / performance / scalability versus consistency, with optimistic consistency and strong consistency marked] • Traditionally, two choices for consistency level: • Strong consistency: strictly “in sync” • Optimistic consistency: no guarantee at all • Each model has its associated tradeoffs

  4. Problems with Binary Choice • Strong consistency incurs prohibitive overheads for many WAN apps • Replication may even decrease performance, availability and scalability relative to a single server! • Optimistic consistency provides no consistency guarantee at all • Resulting in upset users: unbounded reservation conflicts • Potentially rendering the app unusable: if traffic data is more than 1 hour stale, it is probably of little use • Applications cannot tune their consistency level based on their environment • Need to adapt to client, service and network characteristics

  5. Continuous Consistency [Figure: availability / performance / scalability versus consistency, first with only optimistic and strong consistency marked, then with continuous consistency spanning the range between them] • Consistency is continuous rather than binary for many WAN apps • These apps can benefit from exploiting the consistency spectrum between strong and optimistic consistency

  6. Quantifying Consistency [Figure: a replica holding buffered updates to be sent to other replicas] • Many ways: • Staleness (TTL in web caching): invalidate • Limit the number of locally buffered writes
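
The two measures on this slide (a staleness bound and a cap on locally buffered writes) can be sketched as follows. This is only a minimal illustration: the class name `BoundedReplica`, its methods, and the injectable `clock` parameter are hypothetical and do not come from the lecture.

```python
import time


class BoundedReplica:
    """Toy replica that bounds inconsistency in two ways (hypothetical API)."""

    def __init__(self, max_buffered_writes, ttl_seconds, clock=time.monotonic):
        self.max_buffered_writes = max_buffered_writes
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.buffer = []              # writes not yet sent to other replicas
        self.last_sync = clock()      # time of the last exchange with other replicas

    def write(self, update):
        """Accept a write only while the buffer is below its bound."""
        if len(self.buffer) >= self.max_buffered_writes:
            return False
        self.buffer.append(update)
        return True

    def is_fresh(self):
        """Staleness check: local data counts as valid for ttl_seconds after a sync."""
        return self.clock() - self.last_sync <= self.ttl_seconds

    def sync(self):
        """Stand-in for pushing buffered writes to other replicas and refreshing."""
        pushed, self.buffer = self.buffer, []
        self.last_sync = self.clock()
        return pushed


if __name__ == "__main__":
    replica = BoundedReplica(max_buffered_writes=2, ttl_seconds=10.0)
    print(replica.write("seat 12A"), replica.write("seat 12B"), replica.write("seat 12C"))
```

Tightening either bound moves the replica toward strong consistency; loosening both moves it toward optimistic consistency.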

  7. Applications? • Applications: • Web caching • Airline reservation • Distributed games • Shared editor • Non-applications: • Some scientific computing problems • Banking systems • Any application that has binary output • The nature of the application determines whether continuous consistency is applicable

  8. Trading Consistency for Performance • Airline reservation: running at Berkeley, Utah, Duke [Figure: plot with labels “Optimistic Consistency” and “Strong Consistency”, from [Yu’02, TOCS]]

  9. The Cost of Increased Performance • Increased performance comes with a cost • Adaptively trade consistency for performance based on client, network, and service conditions

  10. Model vs. Protocol • Continuous consistency model is a spec. • Protocol is anything that can enforce the spec. • Corollary: Strong consistency protocol is a protocol for any model • Many protocols for a specific model, some are good, others are not

  11. Designing a Continuous Consistency Model • The model is a spec, thus quantifying consistency (in a bad way) is trivial • Only the application knows its own definition of consistency • Airline reservation vs. distributed games • What is a “good” continuous consistency model? • Can be used by diverse apps • Practical

  12. Distributed Consensus and Leader Election • What does “continuous consistency” mean? • Allow at most k decision values • Allow at most k leaders • Helps overcome some impossibilities • A unique decision value requires a majority (more than ½ of the nodes) • k decision values allow any partition with more than 1/(k + 1) of the nodes to decide
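
A small sketch of the partition-size rule on this slide, assuming the threshold is read as "strictly more than a 1/(k + 1) fraction of all nodes" (with k = 1 this is the usual majority rule). The function names are hypothetical.

```python
def can_decide(partition_size, total_nodes, k=1):
    """True if the partition holds strictly more than 1/(k+1) of all nodes."""
    # Integer arithmetic avoids floating-point comparisons.
    return partition_size * (k + 1) > total_nodes


def deciding_partitions(partition_sizes, k=1):
    """Count the partitions that are large enough to decide on their own."""
    total = sum(partition_sizes)
    return sum(1 for size in partition_sizes if can_decide(size, total, k))


if __name__ == "__main__":
    # 9 nodes split 4 / 4 / 1: no majority exists, but with k = 2 two partitions may decide.
    print(deciding_partitions([4, 4, 1], k=1), deciding_partitions([4, 4, 1], k=2))
```

Because each deciding partition holds more than 1/(k + 1) of the nodes, at most k disjoint partitions can pass the check, which is what keeps the number of decision values at k or fewer.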

  13. Group Membership Service • Def: keep track of which nodes belong to which group • Traditionally, a group membership service maintains only a single group • Primary-partition membership services • Corresponds to strong consistency • Recently, partitionable membership services • Still an active area of research • Corresponds to optimistic consistency • Continuous consistency: • Allow at most k groups • Again, helps overcome the ½ majority limitation

  14. Continuous Consistency Summary • WAN replication needs dynamically tunable consistency • Tradeoff between consistency and performance • How to design a continuous consistency model • Continuous consistency in other contexts • Next: Availability

  15. What is Availability ? • No well-accepted availability metric for Internet services • “Uptime” metric can be misleading for Internet services • Server may be inaccessible because of network partition • Available: “present or ready for immediate use” • From Webster’s Collegiate Dictionary • What does “immediate” mean? • Time-out • Availability = (accepted accesses) / (submitted accesses) • Implicit time-out in the definition
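
The ratio on this slide, with its implicit time-out made explicit, might be computed as below. The input format (a list of response latencies, with `None` for a rejected or lost access) and the function name are assumptions made for illustration.

```python
def availability(latencies, timeout_seconds):
    """Fraction of submitted accesses that were answered within the time-out.

    latencies: one entry per submitted access; a latency in seconds,
               or None if the access was rejected or never answered.
    """
    submitted = 0
    accepted = 0
    for latency in latencies:
        submitted += 1
        if latency is not None and latency <= timeout_seconds:
            accepted += 1
    if submitted == 0:
        raise ValueError("no accesses were submitted")
    return accepted / submitted


if __name__ == "__main__":
    # Four accesses: two answered in time, one rejected, one answered too late.
    print(availability([0.1, 0.5, None, 12.0], timeout_seconds=5.0))
```

Changing `timeout_seconds` changes the result for the same trace, which is why the definition only makes sense once "immediate" is pinned down.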

  16. Perform-ability • User satisfaction is not binary • What if a partial result is returned before time-out ? • What if the result is sent back after an hour, or a day ? • Availability is related to performance • Performability = reward function (quality and timeliness of result) • Determining reward function is hard !

  17. Availability of an Internet Service [Figure: a client accessing a server; labels: “2% [Chandra et al., USITS’01]”, “reject due to server failure”, “0.1% [MS press release, Jan’01]”] • We use user-observed availability in our study: Availability = (accepted accesses) / (submitted accesses)

  18. Effects of Replication [Figure: a client accessing replicas; labels: “communication to maintain consistency failed”, “< 2%”, “reject”, “> 0.1%”] • Consistency may force a replica to reject an otherwise acceptable request • Network failure rate vs. replica rejection rate

  19. Limitations of Strong Consistency [Figure: replicas and clients split into two partitions] • Option 1: both partitions accept reads and reject writes • Option 2: one partition accepts reads and writes; the other rejects reads and writes

  20. Effects of Continuous Consistency (each replica allowed to buffer 5 writes) [Figure: replicas and clients split into two partitions] • Option 1: both partitions accept reads and reject writes • New Option 1: both partitions accept reads; one partition accepts its first 10 writes, the other its first 5 writes

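  21. Effects of Continuous Consistency (each replica allowed to buffer 5 writes) [Figure: replicas and clients split into two partitions] • Option 2: one partition accepts reads and writes; the other rejects reads and writes • New Option 2: one partition accepts reads and writes; the other accepts the first few reads and the first 5 writes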
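
To make the effect concrete, here is a toy count of the requests one partitioned replica accepts under Option 1, with and without a write buffer. The request encoding (`"r"` / `"w"` characters) and the function name are hypothetical; a budget of 0 stands for strong consistency and a budget of 5 for the relaxed setting on these slides.

```python
def accepted_during_partition(requests, write_budget):
    """Count accepted requests at a single replica cut off from its peers.

    requests:     string of 'r' (read) and 'w' (write) in arrival order
    write_budget: number of writes the replica may buffer while partitioned
    Reads are always accepted, as in Option 1.
    """
    accepted = 0
    buffered = 0
    for kind in requests:
        if kind == "r":
            accepted += 1
        elif kind == "w" and buffered < write_budget:
            buffered += 1
            accepted += 1
    return accepted


if __name__ == "__main__":
    trace = "rwwrwwwwwr"  # 3 reads and 7 writes arriving during the partition
    for budget in (0, 5):
        print(budget, accepted_during_partition(trace, budget) / len(trace))
```

The buffer only helps for the first few writes of a partition, so the gain depends on how long partitions last relative to the write rate.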

  22. Consistency Impact is Inherent [Figure: availability versus inconsistency, showing a hard bound curve with labels “0% Consistency”, “100% Availability” and “100% Consistency”] • A hard bound always exists • We always know the two end points, but may not know the exact shape of the curve

  23. Effects of Consistency Protocol [Figure: availability versus inconsistency, showing the upper bound and the curves of Protocol A and Protocol B] • Achieved availability also depends on the protocol • Design better protocols • Job of system designers

  24. Availability Optimizations • Technique should not be tied to model • Focus on two techniques: • Retiring replicas • Aggressive write propagation

  25. Limitations of Strong Consistency [Figure: replicas and clients split into two partitions] • Option 1: both partitions accept reads and reject writes • Option 2: one partition accepts reads and writes; the other rejects reads and writes

  26. Retiring Replicas • Obviously, such a decision may not be optimal unless we have future knowledge • Importance of prediction • Even with future knowledge, it is hard • In option 2, all replicas must reach an agreement • Leader election • We are experiencing partitions • One option: voting • What if we don’t have a majority?

  27. Aggressive Write Propagation • Applicable to continuous consistency • Continuous consistency gives us “buffers” that can be utilized in case of network partition • Keep the buffer empty: • Cannot predict the occurrence of network partitions • Propagate writes more aggressively • Cut down the amount of inconsistency accumulated in times of good connectivity

  28. Effects of Aggressive Propagation • Baseline: propagate writes only when necessary (lazily) • Aggressive: when necessary and every 3 seconds [Figure: results for 8 replicas with measured faultload, from [Yu’01, SOSP]]
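
The difference between the lazy baseline and aggressive propagation can be illustrated with a deliberately simple simulation. Everything here is an assumption made for the sketch: one write arrives per second at a single replica, "when necessary" is modelled as "when the buffer is full", and propagation is instantaneous while the network is connected. It is not the experiment reported in [Yu’01, SOSP].

```python
def writes_accepted_in_partition(bound, partition_start, partition_length, period=None):
    """Simulate one write per second and count writes accepted during a partition.

    bound:  maximum number of locally buffered writes
    period: if set, also propagate every `period` seconds while connected
            (the aggressive policy); None means lazy propagation only
    """
    buffered = 0
    accepted_in_partition = 0
    for t in range(partition_start + partition_length):
        connected = t < partition_start
        if connected:
            # Lazy rule: propagate only when the buffer has reached its bound.
            if buffered == bound:
                buffered = 0
            # Aggressive rule: additionally flush on a fixed period.
            if period is not None and t % period == 0:
                buffered = 0
        # The write arriving at time t is accepted only if there is room.
        if buffered < bound:
            buffered += 1
            if not connected:
                accepted_in_partition += 1
    return accepted_in_partition


if __name__ == "__main__":
    lazy = writes_accepted_in_partition(bound=5, partition_start=10, partition_length=10)
    eager = writes_accepted_in_partition(bound=5, partition_start=10, partition_length=10, period=3)
    print(lazy, eager)
```

In this toy run the lazy replica happens to enter the partition with a full buffer and accepts no further writes, while the aggressive one enters with a nearly empty buffer and accepts four.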

  29. More Aggressive Propagation • Aggressive write propagation does not work in all cases • Availability optimizations can incur more communication • Best availability achieved when we use a strong consistency protocol • Speaks of availability / performance tradeoffs

  30. Availability of Other Systems • Consensus and leader election • Blocks without majority • Group membership • Blocks without majority • Relaxing consistency enables them to make progress • Open Question: But will these systems still be useful ?

  31. Availability Summary • Availability definition • Inherent impact of consistency on availability • Availability also depends on consistency protocols • Availability optimizations: • Replica retirement • Aggressive write propagation

  32. Why can we easily approach the upper bound? • Simple protocols in our study can approach the upper bound closely • Remember reaching the upper bound in general needs future knowledge • Related to the characteristics of the faultloads we measured and simulated • Most partitions are singleton partitions • Most transitions are: fully-connected → singleton partition → fully-connected • These characteristics are consistent with • Internet hierarchical architecture

  33. Dual Effects of Replication Scale on Availability • Consistency may force a replica to reject a request • Adding more replicas: • Decreases the network failure rate • Increases the replica rejection rate • Availability = (1 - Network Failure Rate) * (1 - Rejection Rate) • Too large or too small a replication scale can hurt availability
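
The product formula on this slide is easy to evaluate directly. The sketch below simply plugs in two rates; the example values of 2% and 0.1% are the figures quoted earlier in the deck and are used only to show the arithmetic, and the function name is hypothetical.

```python
def replicated_availability(network_failure_rate, rejection_rate):
    """Availability = (1 - network failure rate) * (1 - rejection rate)."""
    for name, rate in (("network_failure_rate", network_failure_rate),
                       ("rejection_rate", rejection_rate)):
        if not 0.0 <= rate <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {rate}")
    return (1.0 - network_failure_rate) * (1.0 - rejection_rate)


if __name__ == "__main__":
    # 0.98 * 0.999 = 0.97902
    print(round(replicated_availability(0.02, 0.001), 5))
```

Because the two factors multiply, lowering one rate helps only as long as the other does not rise by more.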

  34. Optimal Replication Scale • Optimal replication scale: Adding more replicas can hurt! • Increase in “replica rejection rate” outweighs decrease in “network failure rate” • Optimal replication scale depends on • Consistency level • Network failure rate among replicas
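
Why an optimum can exist is easiest to see with a toy model. The two rate curves below are invented purely for illustration and are not measurements or formulas from the lecture: the network failure rate is assumed to fall as `link_failure ** n` (the client fails only if it can reach none of n replicas), and the rejection rate is assumed to grow linearly with the number of peers a replica must stay consistent with.

```python
def toy_availability(n, link_failure=0.02, rejection_per_peer=0.005):
    """Availability under a made-up model of how both rates depend on n replicas."""
    network_failure_rate = link_failure ** n                   # assumed: falls with n
    rejection_rate = min(1.0, rejection_per_peer * (n - 1))    # assumed: rises with n
    return (1.0 - network_failure_rate) * (1.0 - rejection_rate)


def best_scale(max_replicas, **model_params):
    """Replication scale in 1..max_replicas with the highest toy availability."""
    return max(range(1, max_replicas + 1),
               key=lambda n: toy_availability(n, **model_params))


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, round(toy_availability(n), 6))
    print("best scale:", best_scale(10))
```

With these particular made-up parameters the second replica helps and every further replica hurts; a looser consistency level (a smaller `rejection_per_peer`) or a less reliable network (a larger `link_failure`) shifts the optimum toward more replicas.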
