Explore the tradeoffs between strong and optimistic consistency in replication systems, balancing performance and availability. Learn about continuous consistency models and protocols, and how to design them for diverse applications.
Continuous Consistency and Availability Haifeng Yu CPS 212 Fall 2002
Consistency in Replication [Figure: three servers and three clients] • Reasons for replication: Better performance and availability • Replication comes with a consistency cost: • Replication transforms client-server communication into server-server communication, which can: • Decrease performance • Decrease availability
Strong Consistency and Optimistic Consistency [Figure: Availability / Performance / Scalability plotted against Consistency, with Optimistic Consistency and Strong Consistency marked] • Traditionally, two choices for consistency level: • Strong consistency: Strictly “in sync” • Optimistic consistency: No guarantee at all • Associated tradeoffs with each model
Problems with Binary Choice • Strong consistency incurs prohibitive overheads for many WAN apps • Replication may even decrease performance, availability and scalability relative to a single server! • Optimistic consistency provides no consistency guarantee at all • Results in upset users: Unbounded reservation conflicts • Potentially renders the app unusable: If traffic data is more than 1 hour stale, it is probably of little use • Applications cannot tune their consistency level based on their environment • Need to adapt to client, service and network characteristics
Continuous Consistency [Figure: Availability / Performance / Scalability plotted against Consistency, with Optimistic Consistency and Strong Consistency marked] • Consistency is continuous rather than binary for many WAN apps • These apps can benefit from exploiting the consistency spectrum between strong and optimistic consistency. [Figure: the same axes, with Continuous Consistency marked]
Quantifying Consistency • Many ways: • Staleness (TTL in web caching) • Limit the number of locally buffered writes [Figure labels: Invalidate; buffered updates; To Other Replicas]
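The two metrics on this slide can be written down as explicit bounds. The following is a minimal sketch, not the model from the lecture: the class name `ConsistencyBound`, its fields, and the mapping of strong and optimistic consistency to the two extremes are assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsistencyBound:
    """Hypothetical container for the two metrics named on the slide."""
    max_staleness_s: float      # oldest acceptable age of local data, in seconds
    max_buffered_writes: float  # most writes a replica may hold back locally

    def satisfied_by(self, staleness_s: float, buffered_writes: int) -> bool:
        """Return True if a replica's current state is within both limits."""
        return (staleness_s <= self.max_staleness_s
                and buffered_writes <= self.max_buffered_writes)


# The two traditional extremes expressed as bounds (an illustrative mapping).
STRONG = ConsistencyBound(max_staleness_s=0, max_buffered_writes=0)
OPTIMISTIC = ConsistencyBound(max_staleness_s=math.inf,
                              max_buffered_writes=math.inf)

# A point in between: data at most 60 s old, at most 5 unpropagated writes.
middle = ConsistencyBound(max_staleness_s=60, max_buffered_writes=5)
print(middle.satisfied_by(staleness_s=30, buffered_writes=5))  # within both limits
print(middle.satisfied_by(staleness_s=30, buffered_writes=6))  # one write too many
```

Treating the extremes as ordinary bound values is one way to see the "spectrum": every setting between `STRONG` and `OPTIMISTIC` is a different consistency level.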
Applications ? • Applications: • Web caching • Airline reservation • Distributed games • Shared editor • Non-Applications: • Some scientific computing problems • Banking system • Any application that has binary output • Application’s nature determines whether continuous consistency is applicable
Trading Consistency for Performance • Airline reservation: running at Berkeley, Utah, Duke [Figure from [Yu’02, TOCS]; labeled points: Optimistic Consistency, Strong Consistency]
The Cost of Increased Performance • Increased performance comes with a cost • Adaptively trade consistency for performance based on client, network, and service conditions
Model vs. Protocol • Continuous consistency model is a spec. • Protocol is anything that can enforce the spec. • Corollary: Strong consistency protocol is a protocol for any model • Many protocols for a specific model, some are good, others are not
Designing a Continuous Consistency Model • Model is a spec, thus quantifying consistency (in a bad way) is trivial • Only the application knows its own definition of consistency • Airline reservation vs. distributed games • What is a “good” continuous consistency model? • Can be used by diverse apps • Practical
Distributed Consensus and Leader Election • What does “continuous consistency” mean? • Allow at most k decision values • Allow at most k leaders • Helps overcome some impossibilities • A unique decision value requires a majority (more than ½ of the nodes) • k decision values allow any partition with more than 1/(k + 1) of the nodes to decide
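The partition-size threshold can be checked with a few lines of code. This is a sketch under one assumption: the slide's "1/(k + 1) nodes" is read as "strictly more than a 1/(k + 1) fraction of all nodes", which is what keeps the number of deciding partitions at k or fewer (k + 1 disjoint partitions of that size would need more nodes than exist). The function name `can_decide` is hypothetical.

```python
def can_decide(partition_size: int, total_nodes: int, k: int = 1) -> bool:
    """True if a partition is large enough to decide when up to k values are allowed.

    Assumed rule: the partition must hold strictly more than
    total_nodes / (k + 1) nodes. With k = 1 this is the usual majority rule.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Integer arithmetic avoids rounding problems with the fraction.
    return partition_size * (k + 1) > total_nodes


# 10 nodes split into partitions of 6 and 4.
print(can_decide(6, 10, k=1))  # the majority side can decide
print(can_decide(4, 10, k=1))  # the minority side blocks
print(can_decide(4, 10, k=2))  # with k = 2 the same side may decide
```

With k = 2 both sides of the 6/4 split can make progress, at the price of possibly ending up with two decision values.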
Group Membership Service • Def: Keep track of which nodes belong to which group • Traditionally, group membership services maintain only a single group • Primary-partition membership services • Corresponds to strong consistency • Recently, partitionable membership services • Still an active area of research • Corresponds to optimistic consistency • Continuous consistency: • Allow at most k groups • Again, helps overcome the ½ majority limitation
Continuous Consistency Summary • WAN replication needs dynamically tunable consistency • Tradeoff between consistency and performance • How to design a continuous consistency model • Continuous consistency in other contexts • Next: Availability
What is Availability ? • No well-accepted availability metric for Internet services • “Uptime” metric can be misleading for Internet services • Server may be inaccessible because of network partition • Available: “present or ready for immediate use” • From Webster’s Collegiate Dictionary • What does “immediate” mean? • Time-out • Availability = (accepted accesses) / (submitted accesses) • Implicit time-out in the definition
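The ratio on this slide, including its implicit time-out, can be computed from a log of accesses. The sketch below assumes a hypothetical log format: one entry per submitted access, holding the response time in seconds or `None` when the access was rejected or never answered. The function name and the sample values are made up for illustration.

```python
from typing import Iterable, Optional


def availability(latencies_s: Iterable[Optional[float]], timeout_s: float) -> float:
    """Fraction of submitted accesses that were accepted within the time-out.

    Each entry is the response time of one submitted access, or None if the
    access was rejected or no reply ever arrived.
    """
    submitted = 0
    accepted = 0
    for latency in latencies_s:
        submitted += 1
        # An answer that arrives after the time-out counts as unavailable.
        if latency is not None and latency <= timeout_s:
            accepted += 1
    if submitted == 0:
        raise ValueError("no accesses were submitted")
    return accepted / submitted


sample = [0.2, 1.5, None, 0.4, 12.0]        # hypothetical measurements
print(availability(sample, timeout_s=5.0))   # 3 of 5 accesses count as accepted
print(availability(sample, timeout_s=20.0))  # a longer time-out accepts 4 of 5
```

The same log yields different availability figures for different time-outs, which is why the time-out has to be stated along with the number.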
Perform-ability • User satisfaction is not binary • What if a partial result is returned before time-out ? • What if the result is sent back after an hour, or a day ? • Availability is related to performance • Performability = reward function (quality and timeliness of result) • Determining reward function is hard !
Availability of an Internet Service [Figure: a client accessing a server; labels: 2% [Chandra et al., USITS’01]; server reject due to server failure, 0.1% [MS press release, Jan ’01]] • We use user-observed availability in our study: Availability = (accepted accesses) / (submitted accesses)
Effects of Replication [Figure: a client accessing replicas; labels: communication to maintain consistency failed; < 2%; reject; > 0.1%] • Consistency may force a replica to reject an otherwise acceptable request • Network failure rate decreases • Replica rejection rate increases
Limitations of Strong Consistency [Figure: replicas and clients split into two sides] • Option 1: both sides accept reads and reject writes • Option 2: one side accepts reads and writes; the other side rejects reads and writes
Effects of Continuous Consistency • Allow a replica to buffer 5 writes • Option 1: both sides accept reads and reject writes • New Option 1: both sides accept reads; one side accepts the first 10 writes, the other side accepts the first 5 writes
Effects of Continuous Consistency • Allow a replica to buffer 5 writes • Option 2: one side accepts reads and writes; the other side rejects reads and writes • New Option 2: one side accepts reads and writes; the other side accepts the first few reads and the first 5 writes
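The "accept first 5 writes" behavior can be sketched as a replica with a bounded write buffer. This is an illustrative toy, not the protocol from the lecture: the `Replica` class, its `partitioned` flag, and the assumption that a connected replica propagates each write immediately are all hypothetical simplifications.

```python
class Replica:
    """Hypothetical replica that may buffer a bounded number of unpropagated writes."""

    def __init__(self, max_buffered_writes: int) -> None:
        self.max_buffered_writes = max_buffered_writes
        self.buffered = []        # writes accepted locally but not yet propagated
        self.partitioned = False  # True while the other replicas are unreachable

    def write(self, value) -> bool:
        """Accept the write if it can be propagated now or still fits in the buffer."""
        if not self.partitioned:
            return True           # connected: assume the write propagates right away
        if len(self.buffered) < self.max_buffered_writes:
            self.buffered.append(value)
            return True
        return False              # bound reached: reject to preserve the guarantee

    def heal(self) -> list:
        """Partition ends: return the buffered writes for propagation and clear the buffer."""
        self.partitioned = False
        flushed, self.buffered = self.buffered, []
        return flushed


replica = Replica(max_buffered_writes=5)
replica.partitioned = True
results = [replica.write(i) for i in range(7)]
print(results)  # the first 5 writes are accepted, the remaining 2 are rejected
```

Setting `max_buffered_writes=0` reproduces the strong-consistency behavior of the isolated side: every write during the partition is rejected.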
Consistency Impact is Inherent [Figure: Availability plotted against Inconsistency, showing a Hard Bound between the end points “100% Consistency” and “0% Consistency, 100% Availability”] • A hard bound always exists • We always know the two end points, but may not know the exact shape of the curve
Effects of Consistency Protocol [Figure: Availability plotted against Inconsistency, with curves labeled Upper Bound, Protocol A, and Protocol B] • Achieved availability also depends on protocol • Design better protocols • Job of system designers
Availability Optimizations • Technique should not be tied to model • Focus on two techniques: • Retiring replicas • Aggressive write propagation
Limitations of Strong Consistency [Figure: replicas and clients split into two sides] • Option 1: both sides accept reads and reject writes • Option 2: one side accepts reads and writes; the other side rejects reads and writes
Retiring Replicas • Obviously, such a decision may not be optimal unless we have future knowledge • Importance of prediction • Even with future knowledge, it is hard • In option 2, all replicas must reach an agreement • Leader election • We are experiencing partitions • One option: Voting • What if we don’t have a majority?
Aggressive Write Propagation • Applicable to continuous consistency • Continuous consistency gives us “buffers” that can be utilized in case of network partition • Keep the buffer empty: • Cannot predict the occurrence of network partitions • Propagate writes more aggressively • Cut down the amount of inconsistency accumulated in times of good connectivity
Effects of Aggressive Propagation • Baseline: Propagate writes only when necessary (lazily) • Aggressive: When necessary and every 3 seconds [Figure: 8 replicas with measured faultload; from [Yu’01, SOSP]]
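The difference between the baseline and the aggressive policy can be seen in a toy simulation. This is a sketch under strong assumptions, not the experiment from [Yu’01, SOSP]: one write arrives per time step, "when necessary" is modeled as "the buffer is full", and the function name, parameters, and connectivity trace are all hypothetical.

```python
def accepted_writes(connected, buffer_limit, flush_period=None):
    """Count accepted writes when one write arrives per time step.

    connected:    list of booleans, one per step (True = other replicas reachable).
    buffer_limit: maximum number of unpropagated writes the replica may hold.
    flush_period: if set, also propagate every flush_period steps (aggressive);
                  if None, propagate only when the buffer is full (lazy).
    """
    buffered = 0
    accepted = 0
    for step, up in enumerate(connected):
        # Aggressive mode: periodic propagation while connectivity is good.
        if up and flush_period is not None and step % flush_period == 0:
            buffered = 0
        if buffered < buffer_limit:
            buffered += 1   # room left: accept and buffer the write
            accepted += 1
        elif up:
            buffered = 0    # buffer full: propagate everything, this write included
            accepted += 1
        # Otherwise the buffer is full and the network is down: reject.
    return accepted


# 10 steps of good connectivity followed by a 10-step partition.
trace = [True] * 10 + [False] * 10
lazy = accepted_writes(trace, buffer_limit=5)
aggressive = accepted_writes(trace, buffer_limit=5, flush_period=3)
print(lazy, aggressive)  # the aggressive policy enters the partition with more room
```

In this trace the lazy replica reaches the partition with a nearly full buffer, while the aggressive one has recently emptied it and can accept more writes before it must start rejecting.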
More Aggressive Propagation • Aggressive write propagation does not work in all cases • Availability optimizations can incur more communication • Best availability achieved when we use a strong consistency protocol • Speaks of availability / performance tradeoffs
Availability of Other Systems • Consensus and leader election • Blocks without majority • Group membership • Blocks without majority • Relaxing consistency enables them to make progress • Open Question: But will these systems still be useful ?
Availability Summary • Availability definition • Inherent impact of consistency on availability • Availability also depends on consistency protocols • Availability optimizations: • Replica retirement • Aggressive write propagation
Why can we easily approach the upper bound? • Simple protocols in our study can approach the upper bound closely • Remember that reaching the upper bound in general needs future knowledge • Related to the characteristics of the faultloads we measured and simulated • Most partitions are singleton partitions • Most transitions are: fully-connected → singleton partition → fully-connected • These characteristics are consistent with the Internet’s hierarchical architecture
Dual Effects of Replication Scale on Availability • Consistency may force a replica to reject a request • Adding more replicas: • Network failure rate decreases • Replica rejection rate increases • Availability = (1 - Network Failure Rate) * (1 - Rejection Rate) • Too large or too small a replication scale can hurt availability
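The availability formula on this slide is easy to evaluate for a few settings. In the sketch below, only the single-server rates (2% and 0.1%) come from the earlier slide; the rates for the replicated configurations are invented solely to show how the two factors pull in opposite directions, and the function name is hypothetical.

```python
def replicated_availability(network_failure_rate: float, rejection_rate: float) -> float:
    """Availability = (1 - network failure rate) * (1 - rejection rate)."""
    for rate in (network_failure_rate, rejection_rate):
        if not 0.0 <= rate <= 1.0:
            raise ValueError("rates must lie between 0 and 1")
    return (1.0 - network_failure_rate) * (1.0 - rejection_rate)


# (network failure rate, rejection rate) per configuration.
# Only the single-server row uses numbers from the slides; the rest are made up.
configs = {
    "single server":  (0.02, 0.001),
    "few replicas":   (0.005, 0.01),
    "many replicas":  (0.004, 0.03),
}

for name, (nfr, rr) in configs.items():
    print(f"{name}: {replicated_availability(nfr, rr):.4f}")
```

With these made-up rates the middle configuration comes out best: going from few to many replicas lowers the network failure rate only slightly while the rejection rate grows faster.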
Optimal Replication Scale • Optimal replication scale: Adding more replicas can hurt! • Increase in “replica rejection rate” outweighs decrease in “network failure rate” • Optimal replication scale depends on • Consistency level • Network failure rate among replicas