This paper explores Linearizable Resilient Data Types (LRDT) and their role in achieving consistency without the need for consensus in asynchronous distributed systems. We examine concepts such as commuting updates, lattice agreement, and implications for designing abstract data types (ADTs). The study emphasizes how linearizability can be approached independently of consensus protocols, the challenges posed by failures, and the implementation of efficient algorithms in practical distributed systems.
Consistency without consensus: Linearizable Resilient Data Types (LRDT) Kaushik Rajan, Sagar Chordia, Kapil Vaswani, Ganesan Ramalingam, Sriram Rajamani
Consistency & consensus • Processes agree on ordering of operations • No deterministic algorithm in the presence of failures [FLP] • (Figure: concurrent Add(The Hobbit) and Add(Kindle), each followed by GetCart())
Commuting updates • What if all update operations commute? • Ordering of updates doesn’t matter! • Eventual consistency reduces to eventual message delivery • Single round trip latency • What if we desire linearizability? • Updates don’t commute with arbitrary reads • Reads must be consistently ordered with updates • Semantics of queries like the current top(k) elements well understood
Commuting updates • Reads must observe comparable sets of operations • (Figure: concurrent Add(The Hobbit) and Add(Kindle); one GetCart() returns {}, another returns {The Hobbit, Kindle})
Linearizable resilient data types • P1 : commutes(s,op1,op2) • P2 : nullify(s,op1,op2) • (Figure: state diagrams applying op1 and op2 from state S, reaching S1, S2 and S’; cases labelled Possible, Impossible, Don’t know)
Examples • Read write register : every pair of writes nullify • Read write memory : writes to the same location nullify, writes to different locations commute
Examples • Set : add, remove and read the whole set • Add(u), Remove(v) commute • Add(u), Remove(u) nullify • Add(*), Add(*) commute • Remove(*), Remove(*) commute • Counter : IncrBy(x), DecrBy(x), SetTo(v), Read() • SetTo(v) nullifies all other operations • Other pairs of updates commute • Other examples: heaps, union-find, atomic snapshot objects…
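The commute and nullify claims on the two Examples slides can be checked mechanically. The sketch below is an illustration, not the authors' code: the helper names `commutes` and `nullifies` are hypothetical, and `nullifies(s, later, earlier)` assumes the reading "applying `later` after `earlier` gives the same state as applying `later` alone".

```python
def commutes(s, op1, op2):
    """True if op1 and op2 yield the same state in either order from s."""
    return op2(op1(s)) == op1(op2(s))

def nullifies(s, later, earlier):
    """Assumed reading: `later` erases the effect of `earlier` at state s."""
    return later(earlier(s)) == later(s)

# Set ADT: states are frozensets, updates are functions state -> state.
def add(u):
    return lambda s: s | {u}

def remove(u):
    return lambda s: s - {u}

# Counter ADT: states are integers.
def incr_by(x):
    return lambda c: c + x

def decr_by(x):
    return lambda c: c - x

def set_to(v):
    return lambda c: v

cart = frozenset({"Kindle"})
set_checks = {
    "Add(u), Remove(v) commute": commutes(cart, add("The Hobbit"), remove("Kindle")),
    "Add(u), Add(v) commute": commutes(cart, add("The Hobbit"), add("Lumia")),
    "Remove(u) nullifies Add(u)": nullifies(cart, remove("The Hobbit"), add("The Hobbit")),
    "Add(u) nullifies Remove(u)": nullifies(cart, add("The Hobbit"), remove("The Hobbit")),
}
counter_checks = {
    "IncrBy, DecrBy commute": commutes(10, incr_by(3), decr_by(4)),
    "SetTo nullifies IncrBy": nullifies(10, set_to(0), incr_by(3)),
}
for claim, holds in {**set_checks, **counter_checks}.items():
    print(claim, holds)
```

Note that the check is made at one state only; the slides state the properties for all states, which a test would have to enumerate or prove.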
Lattice agreement • Consistency reduces to lattice agreement • Weaker problem than consensus • Solvable in an asynchronous distributed system • Assumptions • t < n/2 failures • Eventual message delivery
Lattice agreement • n processes, each process starts with a value belonging to a join semilattice • Each non-faulty process outputs a value • (Validity) Each process’ output is a join of one or more input values including its own • (Consistency) Any two output values are comparable • (Liveness) Every correct process eventually outputs a value
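The validity and consistency conditions above can be written as a small checker. This is a sketch under stated assumptions: the lattice is a power set ordered by inclusion with union as join, every process starts with a distinct singleton set, and the function names are hypothetical. Liveness is not checked, since it is a property of executions rather than of a finished set of outputs.

```python
from itertools import combinations

def comparable(x, y):
    """Two sets are comparable in the power-set lattice if one contains the other."""
    return x <= y or y <= x

def check_lattice_agreement(inputs, outputs):
    """inputs, outputs: dicts mapping process id -> frozenset.

    Assumes singleton inputs, so "a join of input values including its own"
    is the same as "contains its own input and nothing outside the inputs".
    """
    all_inputs = frozenset().union(*inputs.values())
    validity = all(inputs[p] <= out <= all_inputs for p, out in outputs.items())
    consistency = all(comparable(x, y)
                      for x, y in combinations(outputs.values(), 2))
    return validity and consistency

inputs = {1: frozenset({"a"}), 2: frozenset({"b"}), 3: frozenset({"c"})}

# Outputs that form a chain: acceptable.
chain = {1: frozenset({"a"}), 2: frozenset({"a", "b"}), 3: frozenset({"a", "b", "c"})}

# Outputs {a} and {b} are incomparable: violates consistency.
forked = {1: frozenset({"a"}), 2: frozenset({"b"}), 3: frozenset({"a", "b", "c"})}

print(check_lattice_agreement(inputs, chain))
print(check_lattice_agreement(inputs, forked))
```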
Lattice agreement a = Add(The Hobbit) b = Add(Kindle) c = Add(Lumia)
Lattice agreement algorithm (flowchart) • PROPOSERS: send to all acceptors, wait for majority of acceptors to respond • All Acks? Y: output, N: propose again • ACCEPTORS: respond on receiving a proposal
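The proposer/acceptor flow can be illustrated with a small single-threaded simulation. The details are assumptions, not taken from the slide: values are sets in a power-set lattice, an acceptor acks when the proposal contains everything it has accepted and otherwise replies with the join, and a proposer that sees a nack joins the replies into its proposal and tries again. Proposers run one after another here, so no real concurrency or failure is modelled.

```python
from dataclasses import dataclass

@dataclass
class Acceptor:
    accepted: frozenset = frozenset()

    def on_proposal(self, proposed):
        """Ack if the proposal covers the accepted value; otherwise nack with the join."""
        if self.accepted <= proposed:
            self.accepted = proposed
            return True, proposed
        self.accepted = self.accepted | proposed
        return False, self.accepted

def propose(initial, acceptors, reachable):
    """Run one proposer against the acceptors listed in `reachable` (a majority)."""
    majority = len(acceptors) // 2 + 1
    assert len(reachable) >= majority, "need responses from a majority"
    proposed = frozenset(initial)
    rounds = 0
    while True:
        rounds += 1
        replies = [acceptors[i].on_proposal(proposed) for i in reachable]
        if all(ack for ack, _ in replies):
            return proposed, rounds          # all acks: output the value
        for _, value in replies:             # some nack: join and retry
            proposed = proposed | value

acceptors = [Acceptor() for _ in range(3)]
out_a, rounds_a = propose({"a"}, acceptors, [0, 1])
out_b, rounds_b = propose({"b"}, acceptors, [1, 2])
out_c, rounds_c = propose({"c"}, acceptors, [0, 2])
print(sorted(out_a), sorted(out_b), sorted(out_c))
```

Because any two majorities intersect, a later proposer always meets an acceptor that has seen the earlier output, which is why the outputs in this run form a chain.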
Safety and liveness • Safety always guaranteed • Lattice agreement is t-resilient • Liveness guaranteed if a quorum of processes is non-faulty and communication is reliable • Processes output a value in at most n round trips, where n is the number of processes
Generalized lattice agreement • Generalization of lattice agreement • Processes receive a sequence of values • Values belong to an infinite lattice • Processes output a sequence of values • (Validity) Every output value is a join of some received values • (Consistency) Any two output values are comparable (i.e. output values form a chain) • (Liveness) Every value received by a correct process is eventually included in an output value
GLA algorithm • Liveness (t-resilient) • Every received value is eventually included in some output in n round trips • Adaptive, complexity depends on contention • Fast path • Received values output in one round trip • Reconfigurable • Replicas can be added/removed dynamically
From GLA to linearizability • Update commands form a power set lattice • Updates return once a majority of processes have learnt a command set that includes the update command • Reads are performed by an ABD-style algorithm • Reading the learnt command set from a quorum of processes • Writing back the largest among these to a quorum • Constructing the state corresponding to the largest command set by exploiting commutativity and nullification • Multi-master replication • Does not require a single primary/leader
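The three read steps listed above can be sketched as follows. This is a simplified illustration under assumptions: replicas are modelled as a local list of learnt command sets that already form a chain, quorums are given as index lists, and the ADT is a set with Add commands only, so all commands commute and the state can be built in any order. The names `abd_style_read` and `build_state` are hypothetical.

```python
def abd_style_read(replicas, read_quorum, write_quorum):
    """Read learnt command sets from one quorum, write the largest back to another."""
    learnt = [replicas[i] for i in read_quorum]
    largest = max(learnt, key=len)
    # Learnt sets are assumed to form a chain, so the biggest contains the rest.
    assert all(s <= largest for s in learnt)
    for i in write_quorum:
        replicas[i] = replicas[i] | largest
    return largest

def build_state(commands):
    """Add-only commands commute, so the state is just the set of added items."""
    return {item for (name, item) in commands if name == "Add"}

add_hobbit = ("Add", "The Hobbit")
add_kindle = ("Add", "Kindle")

# Three replicas; replica 1 lags behind the other two.
replicas = [
    frozenset({add_hobbit, add_kindle}),
    frozenset({add_hobbit}),
    frozenset({add_hobbit, add_kindle}),
]

commands = abd_style_read(replicas, read_quorum=[1, 2], write_quorum=[0, 1])
cart = build_state(commands)
print(sorted(cart))
```

The write-back step is what brings the lagging replica up to date, so a later read from any quorum cannot observe a smaller command set.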
Impossibility • Consensus reduction • Pair of idempotent update operations op1, op2 that neither commute nor nullify at some state S0 • Consensus(b): if (b) then op1 else op2; s = read(); if (s ∈ {S1, S12}) return true else return false • (Figure: Op* takes Si to S0; from S0, op1 leads to S1 and then op2 to S12; op2 leads to S2 and then op1 to S21)
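The reduction can be exercised on a toy object. The sketch below assumes a sequence whose two updates append a marker if it is absent, chosen because such updates are idempotent and neither commute nor nullify at the empty state; this object and the function names are illustrative, not from the slides. Each process performs its update and then its read, and every interleaving of two processes is checked for agreement.

```python
from itertools import permutations

def op1(state):
    """Append 'x' unless already present (idempotent)."""
    return state if "x" in state else state + ("x",)

def op2(state):
    """Append 'y' unless already present (idempotent)."""
    return state if "y" in state else state + ("y",)

S0 = ()
S1 = op1(S0)      # ('x',)
S12 = op2(S1)     # ('x', 'y')

def run(schedule, inputs):
    """schedule lists process ids; a process's first step updates, its second reads."""
    state = S0
    updated = set()
    decisions = {}
    for pid in schedule:
        if pid not in updated:
            state = op1(state) if inputs[pid] else op2(state)
            updated.add(pid)
        else:
            # Consensus(b): decide true iff op1 was linearized first.
            decisions[pid] = state in (S1, S12)
    return decisions

schedules = sorted(set(permutations([0, 0, 1, 1])))
results = [run(s, {0: True, 1: False}) for s in schedules]
agreement = all(len(set(d.values())) == 1 for d in results)
print(len(schedules), agreement)
```

The simulation relies on the local object being trivially linearizable; the slide's argument is that a fault-tolerant linearizable implementation of such an object would therefore solve consensus.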
Implications for designing ADTs Most commands commute
Implications for designing ADTs neither commute nor nullify at ;
The Gap : Open problems • Doubly saturating counter with states 0, 1, 2, …, n; Incr() moves up, Decr() moves down • Incr() and Decr() commute at 1 … n-1 • Incr() and Decr() nullify at 0 and n • Don’t know if this is possible or impossible
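The commute and nullify claims for the doubly saturating counter can be checked directly. This is a minimal sketch assuming Incr() clamps at n and Decr() clamps at 0, with n = 5 chosen arbitrarily; "nullify" is read as "the later operation gives the same state whether or not the earlier one ran".

```python
N = 5  # upper bound of the counter (arbitrary choice for the illustration)

def incr(s):
    """Increment, saturating at N."""
    return min(s + 1, N)

def decr(s):
    """Decrement, saturating at 0."""
    return max(s - 1, 0)

# Interior states: the two orders give the same result.
commute_inside = all(incr(decr(s)) == decr(incr(s)) for s in range(1, N))

# Boundary states: the orders differ, but one operation nullifies the other.
incr_nullifies_decr_at_0 = incr(decr(0)) == incr(0)
decr_nullifies_incr_at_n = decr(incr(N)) == decr(N)
commute_at_0 = incr(decr(0)) == decr(incr(0))
commute_at_n = incr(decr(N)) == decr(incr(N))

print(commute_inside, incr_nullifies_decr_at_0, decr_nullifies_incr_at_n)
```

The pair therefore switches between commuting and nullifying depending on the state, which is the situation the slide leaves open.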
Summary • Possible: graph, RW mem… • Impossible: queues, sequences • ??: Saturating counter