Replication: optimistic approaches
These slides survey optimistic replication in decentralised systems: peer-to-peer write sharing, merging of independent updates, and conflict resolution. They motivate the approach by availability and performance over high-latency networks and during disconnected operation, with CVS (cooperative engineering) and Bayou (a general-purpose database) as examples. Topics include consistency, dependency versus concurrency detection, operation-based reconciliation, concurrency control and scheduling, and convergence, with a simple formal model of the trade-offs.
Presentation Transcript
Replication: optimistic approaches • Marc Shapiro, with Yasushi Saito (HP Labs) • Cambridge Distributed Systems Group
Motivations for this work • Peer-to-peer, decentralised write sharing • Lessons and commonalities • Understand limitations • Different solutions: spectrum or discrete points? • Simple formal model
Optimistic replication • Replicas of shared objects on sites • Without synchronisation: • peer-to-peer read • and update! • Consistency: a posteriori, offline • Merge independent updates • Applications: • high latency networks • disconnected operation • cooperative work • Improves availability & performance
Example: cooperative engineering with CVS • CVS: developing shared code • Local, disconnected replica: no interference • Conflicts: • Write same file = syntactic • Overlap in file = violates edit semantics • Doesn’t compile or fails tests = violates application semantics • Both sides of a conflict are excluded • Manual repair
Example: Bayou • General-purpose database • Any replica can update, log actions • action = { dependency check, operation, merge-procedure } • Optimistic replication: • epidemic exchange of logs • { roll-back, replay }*; commit • dep-check: semantic check for conflict • merge-proc: semantic repair
Basic vocabulary • While isolated: tentative updates • When connected, reconcile: • Propagate & collect updates • (Conceptually) Restart from initial state • Replay updates (if possible) • Overriding goal: consistency
1. Consistency Study component issues of consistency
What is consistency? • Consistent with user intents • apply operations • according to user scenario • Consistent with data invariants • dependent actions • pre- and post-conditions • conflict resolution • Replicas consistent with each other • converge towards same values
Consistency: problem taxonomy • Objects & updates • Internal vs. external consistency • Value / value log / operation log • Single master / multi-master • Detecting dependence vs. concurrency • Concurrency control • Laziness of concurrency control • Pessimistic / advanced concurrency / optimistic • Convergence
Operation-based reconciliation • Updates: concurrent, unsynchronised • Local log of actions = operation descriptions • object identifier, method, arguments • Multi-log collects local + remote logs • Reconciliation schedule: merge multi-log & run sequentially • Scheduling issues: • Include vs. exclude • Execution order
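The reconciliation step on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Action` fields `site` and `seq`, the `METHODS` table, and the policy of including every action and ordering by `(seq, site)` are all assumptions chosen to keep the example small.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Action:
    """One log entry: an operation description, not a value."""
    site: str               # originating site (hypothetical field)
    seq: int                # position in that site's local log (hypothetical field)
    obj: str                # object identifier
    method: str             # method name
    args: Tuple[Any, ...]   # method arguments


# Hypothetical method table: each method maps (old value, *args) to a new value.
METHODS: Dict[str, Callable[..., Any]] = {
    "add": lambda value, n: value + n,
    "set": lambda value, v: v,
}


def reconcile(initial: Dict[str, Any], logs: List[List[Action]]) -> Dict[str, Any]:
    """Merge local and remote logs into a multi-log, then run it sequentially."""
    multi_log = [a for log in logs for a in log]
    # Assumed policy: include everything; order by (seq, site). This keeps
    # each site's own log order and is the same at every site.
    schedule = sorted(multi_log, key=lambda a: (a.seq, a.site))
    state = dict(initial)   # conceptually restart from the initial state
    for a in schedule:
        state[a.obj] = METHODS[a.method](state[a.obj], *a.args)
    return state
```

Because the schedule depends only on the set of actions, two sites that hold the same logs compute the same state regardless of the order in which the logs arrived. Real systems replace the "include everything" policy with the include/exclude and ordering decisions discussed on later slides.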
Operation-based model • [Figure not reproduced in this transcript]
Dependence vs. concurrency • Two actions are either dependent, or commutative / concurrent • Dependent actions: • do not conflict • must be scheduled in dependence order • Concurrent actions: • potentially conflict • Dependence / concurrency detection is a fundamental mechanism
Concurrency control • Concurrent & no conflict (commute): execute both, arbitrary order • Conflict detection options • Conflict resolution options
Convergence • Liveness: sites receive same/all actions • Safety: given same actions, sites compute the same value • Stability: actions eventually not undone
2. Dependency & Concurrency Mechanisms to detect if actions are dependent or concurrent
Scalar clocks and timestamps • Wall clock: total order • Lamport clock: total order, consistent with causal dependence • Schedule in timestamp order • Can’t detect concurrency
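A minimal sketch of a Lamport clock illustrates both points on this slide: timestamps respect causal dependence, yet they cannot reveal concurrency. The class and method names are illustrative, and the `(time, site)` tie-break used in the usage note is an assumption commonly made to obtain a total order.

```python
class LamportClock:
    """Scalar logical clock: one integer counter per site."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Timestamp a local event."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Timestamp a send event; the result travels with the message."""
        return self.tick()

    def receive(self, message_time: int) -> int:
        """Timestamp a receive event, jumping past the sender's clock."""
        self.time = max(self.time, message_time) + 1
        return self.time
```

If site `p` ticks once and site `q` ticks twice without exchanging messages, the pairs `(1, "p")` and `(2, "q")` compare as ordered even though the two events are concurrent. The scalar timestamp alone cannot tell this case apart from a real dependence.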
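Happens-before • $e_1$ precedes $e_2$ in the same process, or $e_1$ sends and $e_2$ receives $\Rightarrow e_1 \rightarrow e_2$ • $\neg(e_1 \rightarrow e_2) \land \neg(e_2 \rightarrow e_1) \iff e_1 \parallel e_2$ • $e_1 \parallel e_2$: $e_1$ does not cause $e_2$ • $e_1 \rightarrow e_2$: $e_1$ might cause $e_2$ • Partial order, consistent with causal dependence • Schedule consistent with $\rightarrow$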
Syntactic vs. semantic mechanisms • Scalar timestamps • no concurrency detection • very conservative approx. of causality • Vector timestamps • detect concurrency • conservative approx. of causality • Alternative: explicit semantic constraints
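The difference between scalar and vector timestamps can be made concrete with a small comparison function. This is a sketch under the usual definition of vector timestamps (one counter per site, missing entries read as zero); the name `compare` and its string results are illustrative.

```python
from typing import Dict

VectorTimestamp = Dict[str, int]   # site id -> number of events seen from that site


def compare(a: VectorTimestamp, b: VectorTimestamp) -> str:
    """Classify two vector timestamps as equal, before, after, or concurrent."""
    sites = set(a) | set(b)
    a_le_b = all(a.get(s, 0) <= b.get(s, 0) for s in sites)
    b_le_a = all(b.get(s, 0) <= a.get(s, 0) for s in sites)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a happens-before b
    if b_le_a:
        return "after"       # b happens-before a
    return "concurrent"      # neither dominates: concurrency detected
```

The "concurrent" outcome is exactly what a scalar timestamp cannot express. The approximation is still conservative: "before" only means that the first event might have caused the second.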
Locks as semantic constraints • Read(x) depends on • previous Write(x) in same process, or • previously-received Write(x), whichever is later • Write(x) depends on • previous Read(*) in same process • More semantic information than Happens-Before • Step in the right direction, but still too coarse
IceCube: Primitive constraints • Declarative (“static”), for a schedule $s$: • MustHave, $a \triangleright b$: if $a \in s$ and $a \triangleright b$ then $b \in s$ (not necessarily contiguous nor in order) • Order, $a \rightarrow b$: if $a, b \in s$ and $a \rightarrow b$ then $a$ before $b$ in $s$ (not necessarily both nor contiguous) • Within log, across logs • Imperative (dynamic): preCondition(State)
Log constraints • Express user intents ($\rightarrow$ is Order, $\triangleright$ is MustHave): • Predecessor/successor: $a \rightarrow b \land b \triangleright a$; b uses effect of a; “a causes b” • Parcel: $a \triangleright b \land b \triangleright a$; transaction • Alternatives: $a \rightarrow b \land b \rightarrow a$
3. Concurrency control & scheduling Policies for dealing with concurrent actions
Optimistic concurrency control & scheduling • Two actions are either: • dependent: schedule in dependence order • concurrent and non-conflicting or commutative: schedule in any order • concurrent and conflicting: • schedule in non-conflicting order • or exclude one, the other, or both
Concurrency control • Concurrent & no conflict (commute): execute both, arbitrary order • Conflict detection options: • 2 concurrent actions conflict • only if operate on same object • only if both write • only if violate semantic invariant • Conflict resolution options: • exclude both • exclude 1st, include 2nd (or vice-versa) • execute both in favorable order • (rewrite and execute both)
What is a conflict? • 1 site executes code + pre/post-conditions • Pre/post-conditions often unknown • Dependency between successive actions • Schedule execution must satisfy pre/post-conditions • Violation: conflict • [Figure: x1 := f(x0), with pre(x0) and post(x0, f(x0)); x2 := g(x1), with pre(x1) and post(x1, g(x1)); x’1 := g(x0), with pre(x0) and post(x’1, g(x0))]
Thomas’ Write Rule • Pre- / post-conditions unknown • Scalar clocks • no concurrency detection • implicit concurrency control • schedule in clock order • a later action excludes earlier ones • Lost updates • Delete ambiguity: “tombstone” state
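The rule on this slide can be sketched as a last-writer-wins store. This is an illustration only: `TwrReplica`, its methods, and the `(clock, site)` timestamp format are assumed names and conventions, not taken from any particular system.

```python
from typing import Any, Dict, Optional, Tuple

Timestamp = Tuple[int, str]    # (scalar clock, site id); the site id breaks ties
TOMBSTONE = object()           # marker kept in place of a deleted value


class TwrReplica:
    """Per-object store that applies Thomas' Write Rule."""

    def __init__(self) -> None:
        self.store: Dict[str, Tuple[Timestamp, Any]] = {}

    def apply(self, key: str, ts: Timestamp, value: Any) -> bool:
        """Apply a write unless a write with an equal or later timestamp is held."""
        current = self.store.get(key)
        if current is not None and current[0] >= ts:
            return False                 # earlier write is excluded (lost update)
        self.store[key] = (ts, value)
        return True

    def delete(self, key: str, ts: Timestamp) -> bool:
        """A delete is a write of the tombstone, so it keeps its timestamp."""
        return self.apply(key, ts, TOMBSTONE)

    def read(self, key: str) -> Optional[Any]:
        current = self.store.get(key)
        if current is None or current[1] is TOMBSTONE:
            return None
        return current[1]
```

Two replicas that receive the same writes in different orders end with the same value, which is why no explicit concurrency control is needed. The tombstone resolves the delete ambiguity: without it, a replica that had forgotten the deleted key could not tell a late, older write from a fresh one and would resurrect the object.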
Value-based Version Vector concurrency control • Pre- / post-conditions unknown • Independent objects • actions to different objects commute • VV = per-object vector timestamp • any concurrent writes to object conflict • Resolution: • Manual • Values: “Resolver” per data type
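A sketch of value-based reconciliation with a per-object version vector follows. The function names and the `(vector, value)` pair layout are assumptions. The resolver is assumed to be deterministic and commutative so that both sides of a conflict compute the same result.

```python
from typing import Any, Callable, Dict, Tuple

VersionVector = Dict[str, int]
Versioned = Tuple[VersionVector, Any]    # (per-object version vector, value)


def dominates(a: VersionVector, b: VersionVector) -> bool:
    """True if a has seen every write that b has seen."""
    return all(a.get(site, 0) >= n for site, n in b.items())


def local_write(site: str, current: Versioned, value: Any) -> Versioned:
    """A write at `site` increments that site's entry in the object's vector."""
    vv = dict(current[0])
    vv[site] = vv.get(site, 0) + 1
    return (vv, value)


def merge(local: Versioned, remote: Versioned,
          resolver: Callable[[Any, Any], Any]) -> Versioned:
    """Keep the dominating version; on concurrent writes, call the resolver."""
    (lvv, lval), (rvv, rval) = local, remote
    if dominates(lvv, rvv):
        return local
    if dominates(rvv, lvv):
        return remote
    # Neither dominates: the writes are concurrent, hence in conflict.
    joined = {s: max(lvv.get(s, 0), rvv.get(s, 0)) for s in set(lvv) | set(rvv)}
    return (joined, resolver(lval, rval))
```

For a shopping-list object, a set-union resolver is a natural choice of per-type resolver. For types with no sensible automatic merge, the conflict branch would instead hand both values to the user for manual repair.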
Bayou scheduling • Disjoint databases; 1 primary / database • Transaction: single database • Action = { dependency check, operation, merge-procedure } • Optimistic replication: • epidemic exchange of logs • { roll-back, replay }*; commit • Conflict = dependency check fails: • run the merge-procedure
Bayou dependency checks • Write-write conflicts: on replay check that data unchanged • Read-write conflicts: check input data • can detect concurrent updates • semantic: only relevant changes • Application-specific checks • bank account balance > £100 • fine grain
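The action structure { dependency check, operation, merge-procedure } and the roll-back/replay loop can be sketched as below. This is not Bayou's actual API: `Write`, `replay`, `withdraw`, and the `"rejected"` counter are hypothetical names, and the dependency check (balance covers the amount) is an assumed application-specific check in the spirit of the bank-account example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, int]


@dataclass
class Write:
    """One logged action: dependency check, operation, merge procedure."""
    name: str
    dep_check: Callable[[State], bool]    # semantic check for conflict
    operation: Callable[[State], None]    # the intended update
    merge_proc: Callable[[State], None]   # semantic repair if the check fails


def replay(initial: State, log: List[Write]) -> State:
    """Roll back to the initial state, then replay the log in order."""
    state = dict(initial)
    for w in log:
        if w.dep_check(state):
            w.operation(state)
        else:
            w.merge_proc(state)           # conflict: run the repair instead
    return state


def withdraw(name: str, account: str, amount: int) -> Write:
    """Hypothetical application action: withdraw only if funds suffice."""
    def dep_check(s: State) -> bool:
        return s[account] >= amount

    def operation(s: State) -> None:
        s[account] -= amount

    def merge_proc(s: State) -> None:
        s["rejected"] = s.get("rejected", 0) + 1   # record the refusal

    return Write(name, dep_check, operation, merge_proc)
```

A withdrawal that succeeds tentatively at one replica can fail its dependency check once a remote withdrawal is ordered ahead of it in the log. Replaying from the initial state is what lets the check see the remote update.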
IceCube: Object constraints • Shared data type advertises static semantics ($\rightarrow$ is Order): • mutually exclusive: $a \rightarrow b \land b \rightarrow a$ • best order (e.g. bank: credits before debits): $a \rightarrow b$ • Only between concurrent actions • Also: dynamic constraints • [Figure labels: mutually exclusive, best order, commute]
IceCube scheduling • Insight: • conflict: choice of which action to exclude • maximise value
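The insight can be illustrated with an exhaustive search for the largest schedule that satisfies the MustHave and Order constraints. This is a toy sketch, not IceCube's scheduler: it is exponential in the number of actions, takes "value" to mean the number of actions included (an assumption), and all names are illustrative.

```python
from itertools import combinations, permutations
from typing import List, Sequence, Set, Tuple

Constraint = Tuple[str, str]    # (a, b): "a MustHave b" or "a Order-before b"


def is_sound(schedule: Sequence[str], must_have: Set[Constraint],
             order: Set[Constraint]) -> bool:
    """Check closure under MustHave and consistency with Order."""
    pos = {a: i for i, a in enumerate(schedule)}
    closed = all(b in pos for (a, b) in must_have if a in pos)
    ordered = all(pos[a] < pos[b] for (a, b) in order if a in pos and b in pos)
    return closed and ordered


def best_schedule(actions: Sequence[str], must_have: Set[Constraint],
                  order: Set[Constraint]) -> List[str]:
    """Return a sound schedule that includes as many actions as possible."""
    for size in range(len(actions), -1, -1):           # largest subsets first
        for subset in combinations(actions, size):     # which actions to include
            for candidate in permutations(subset):     # in which order
                if is_sound(candidate, must_have, order):
                    return list(candidate)
    return []
```

With `a` and `b` mutually exclusive (Order in both directions) and `c` requiring `a`, the search keeps `a` and `c` and excludes `b`. The conflict is thus resolved by choosing which action to drop so that the schedule retains the most value.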
IceCube execution model • [Figure: log constraints, object constraints and dynamic constraints; not reproduced in this transcript]
Search vs. syntactic order
Performance of IceCube heuristics
4. Convergence Can a peer-to-peer system converge? Hard in the general case Formalise to understand limitations, trade-offs
Convergence • Liveness: sites receive same/all operations • epidemic multicast • quickly • Safety: sites compute the same value • equivalent schedules • Stability: actions eventually not undone • stable schedules • Users, external world dependency • Garbage collection
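Schedule soundness & equivalence • $s$ sound: • Closed for MustHave ($\triangleright$): $a \in s \land a \triangleright b \Rightarrow b \in s$ • Consistent with Order ($\rightarrow$): $(a, b \in s \land a \rightarrow b) \Rightarrow a$ before $b$ in $s$ • Equivalence: $s \equiv t$ • $s$, $t$ sound • $a \in s \iff a \in t$ • ordering is irrelevant!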
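The two definitions on this slide translate almost directly into code. The sketch below assumes a schedule is a list of distinct action names and that each constraint is stored as a pair `(a, b)`; the function names are illustrative.

```python
from typing import Sequence, Set, Tuple

Constraint = Tuple[str, str]    # (a, b): "a MustHave b" or "a Order-before b"


def is_sound(schedule: Sequence[str], must_have: Set[Constraint],
             order: Set[Constraint]) -> bool:
    """Sound = closed for MustHave and consistent with Order."""
    pos = {a: i for i, a in enumerate(schedule)}
    # Closed for MustHave: if a is scheduled and a MustHave b, b is scheduled.
    closed = all(b in pos for (a, b) in must_have if a in pos)
    # Consistent with Order: if both are scheduled, a comes before b.
    ordered = all(pos[a] < pos[b] for (a, b) in order if a in pos and b in pos)
    return closed and ordered


def equivalent(s: Sequence[str], t: Sequence[str],
               must_have: Set[Constraint], order: Set[Constraint]) -> bool:
    """Two sound schedules are equivalent when they contain the same actions."""
    return (is_sound(s, must_have, order)
            and is_sound(t, must_have, order)
            and set(s) == set(t))
```

Note that `equivalent` ignores the relative order of actions that no Order constraint relates, which is the sense in which ordering is irrelevant: any ordering that matters is expected to be declared as a constraint.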
Stability • Peer-to-peer, indefinite tentative update + advisory reconciliation OK • But stability needed: • Users, external world depend on it • Garbage collect multilog • Stable: eventually decisions not changed • committed: definitely included in all schedules • aborted: definitely excluded
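Correctness of stability • Actions known to be stable at site $i$: $\mathit{stable}_i = \mathit{committed}_i \cup \mathit{aborted}_i$ • Live: $\forall$ action $a$, $\forall$ site $i$: eventually $a \in \mathit{stable}_i$ • Safe: • $\forall$ site $i$, schedule $s_i$: $s_i$ sound $\land\ \mathit{committed}_i \subseteq s_i$ • $\forall$ sites $i, k$: $\mathit{committed}_i \cap \mathit{aborted}_k = \emptyset$ • Safety invariant: strong, global!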
Maintaining disjointness • $\forall$ sites $i, k$: $\mathit{committed}_i \cap \mathit{aborted}_k = \emptyset$ • Different possibilities: • Unilateral abort (TWR, Holliday 2000) • Unilateral commit • Deterministic abort / commit rule (TWR) • Primary (only one) site decides (Bayou, CVS) • Consensus before deciding (Deno, Holliday 2000-2002)
Maintaining soundness • $\forall$ site $i$, schedule $s_i$: $s_i$ sound $\land\ \mathit{committed}_i \subseteq s_i$ • When aborting a, also abort actions that MustHave a • When committing a, also abort uncommitted actions that are ‘Order’ed before a • Maintain both soundness and disjointness. • Peer-to-peer commitment is hard!
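The two cascading rules on this slide can be sketched for a single site as follows. This is an illustration under stated assumptions: constraints are pairs `(a, b)` meaning "a MustHave b" or "a Order-before b", the function names are hypothetical, and the sketch only maintains the local `committed` and `aborted` sets. It does nothing to guarantee disjointness across sites, which is the hard part.

```python
from typing import Set, Tuple

Constraint = Tuple[str, str]    # (a, b): "a MustHave b" or "a Order-before b"


def abort(a: str, must_have: Set[Constraint],
          committed: Set[str], aborted: Set[str]) -> None:
    """Abort a and, transitively, every action that MustHave an aborted action."""
    pending = [a]
    while pending:
        x = pending.pop()
        if x in aborted:
            continue
        if x in committed:
            # Soundness cannot be kept: a committed action needs an aborted one.
            raise ValueError(f"cannot abort committed action {x!r}")
        aborted.add(x)
        pending.extend(y for (y, z) in must_have if z == x)


def commit(a: str, must_have: Set[Constraint], order: Set[Constraint],
           committed: Set[str], aborted: Set[str]) -> None:
    """Commit a; abort uncommitted actions that are Order-ed before a."""
    if a in aborted:
        raise ValueError(f"cannot commit aborted action {a!r}")
    committed.add(a)
    for (b, c) in order:
        if c == a and b not in committed:
            abort(b, must_have, committed, aborted)
```

The `ValueError` cases mark decisions that would break soundness and must therefore be prevented by whichever commitment protocol is in use. This is one reason a purely local rule is not enough.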
Stability with TWR • Independent objects • Independent writes (no MustHave nor Order) • All sites take same decision: • Given two writes to same object, abort the earlier • Whether concurrent or not • Write stable when seen by all sites • Disjointness: $\mathit{committed}_i = \emptyset$ • Soundness: no MustHave (no transactions)
Stability in Bayou • Databases: • Disjoint • Independent: no multi-DB transaction • 1 primary / database • Log constraints: transactions, time order • Disjointness: Only 1 site decides about a: the primary for the database that a updates • Soundness: whole transaction commits or aborts
Holliday’s pre-commit protocol • Log constraints: • multi-object transactions • happens-before order • Read transactions commit locally • Read-Write transactions: consensus to commit • convert locks to intentions • pre-commit, vote • commit if quorum ‘yes’ • abort if anti-quorum ‘no’ or conflict with committed
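The voting outcome on this slide can be sketched as a small decision function. The slide does not say how large a quorum is; the sketch assumes a simple majority and takes an anti-quorum to be enough ‘no’ votes that a ‘yes’ quorum is no longer reachable. The function name and parameters are illustrative and are not taken from the protocol's description.

```python
from typing import Dict, Optional


def decide(votes: Dict[str, bool], n_sites: int,
           conflicts_with_committed: bool = False) -> Optional[str]:
    """Return "commit", "abort", or None while the outcome is still open.

    votes maps each site that has voted so far to its vote (True = yes).
    """
    if conflicts_with_committed:
        return "abort"                       # conflict with a committed transaction
    quorum = n_sites // 2 + 1                # assumption: simple majority
    yes = sum(1 for v in votes.values() if v)
    no = len(votes) - yes
    if yes >= quorum:
        return "commit"
    if no > n_sites - quorum:                # anti-quorum: yes can never reach quorum
        return "abort"
    return None                              # keep collecting votes
```

With five sites, three ‘yes’ votes commit and three ‘no’ votes abort. Any smaller tally leaves the transaction pending, which is where the latency cost of consensus shows up.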
Trade-offs • Deterministic rule • fast, inflexible • Partition + primary • single point of failure • no MustHave across partition boundaries • Consensus • slow • scalability • impossibility of consensus in asynchronous systems with failure
Need for optimistic replication (OR) not going away • “Network technology improving: keep everything consistent pessimistically.” • True, but: • Constant latency; unavailable bandwidth • Mobile access: unbounded latency • Increasing numbers of replicas • “Conflicts are rare.” • True, but: • Do occur • Very high cost
OR pros & cons • Peer-to-peer read/write sharing • OR accepts more updates: • Performance despite latency • Availability despite failures • Increased complexity • Semantic information • Not transparent • Bottleneck moved to commit • Hard to make peer-to-peer • Unless (unacceptable?) restrictions • Unavoidable
The end