
On the Cost of Fault-Tolerant Consensus When There are no Faults


Presentation Transcript


  1. On the Cost of Fault-Tolerant Consensus When There are no Faults Idit Keidar & Sergio Rajsbaum Appears in SIGACT News; IPL

  2. Consensus • Every process has an input and outputs a decision • Agreement: any two correct processes that decide, decide the same value • Validity: the decision is the input of some process • Termination: eventually all correct processes decide • Binary consensus - values 0 and 1
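
These properties can be checked mechanically on the outcome of a run. Below is a minimal Python sketch (illustrative only, not from the paper): inputs and decisions map process ids to 0/1 values, and correct is the set of processes that did not crash.

    # Property checks for one run outcome (illustrative, not from the paper).
    # A process missing from `decisions` never decided.

    def agreement(decisions, correct):
        # two correct processes that decide, decide the same
        return len({decisions[p] for p in correct if p in decisions}) <= 1

    def validity(inputs, decisions):
        # every decision is the input of some process
        return all(d in inputs.values() for d in decisions.values())

    def termination(decisions, correct):
        # at the end of the run, every correct process has decided
        return all(p in decisions for p in correct)

    # e.g. agreement({1: 0, 2: 1, 3: 1}, correct={2, 3}) is True: the crashed
    # process 1 decided differently, which plain (non-uniform) consensus allows.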

  3. Models • Processes communicate by message passing • Processes fail by crashing • t<n potential failures of n>1 processes • Messages not lost among correct processes • Models: • Asynchronous • Synchronous • Partial Synchrony

  4. Asynchronous Model • Unbounded message delay, processor speed • Consensus impossible even for t=1 [FLP85] • Reason: can never tell faulty process from slow one

  5. Synchronous Model • Constant message delay, processor speed • Algorithm runs in synchronous rounds; in each round a process: • sends messages to any number of processes, • waits a fixed time to receive messages, • does local processing (possibly decide, halt) • If process i fails in a round, then any subset of the messages i sends in this round can be lost

  6. Synchronous Consensus • Solvable • Consider a run with f failures (f<t) • Processes can decide in f+1 rounds [LF82,DRS90] (early-deciding) • 1 round with no failures
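
To make "solvable" concrete, here is a minimal simulation sketch in Python. It is an assumption on my part that a flooding-style algorithm is a fair illustration; this is the simple (t+1)-round version, not the early-deciding f+1-round algorithm of [LF82,DRS90]. The adversary may crash a process in some round and drop any subset of its messages in that round, as in the model above.

    # Simple (t+1)-round flooding consensus, simulated with a crash adversary
    # (a sketch; assumes at most t entries in `crashes`).
    # inputs:  {pid: 0 or 1}
    # crashes: {pid: (crash_round, lost_to)} - pid sends nothing after
    #          crash_round, and in crash_round its messages to `lost_to` are lost

    def flood_consensus(inputs, t, crashes):
        seen = {p: {inputs[p]} for p in inputs}     # values each process has seen
        alive = set(inputs)
        for r in range(1, t + 2):                   # t+1 synchronous rounds
            delivered = {p: set() for p in inputs}
            for p in list(alive):
                crash = crashes.get(p)
                for q in inputs:
                    if crash and crash[0] == r and q in crash[1]:
                        continue                    # message lost in p's crash round
                    delivered[q] |= seen[p]
                if crash and crash[0] == r:
                    alive.discard(p)                # p is silent from now on
            for q in inputs:
                seen[q] |= delivered[q]
        return {p: min(seen[p]) for p in alive}     # surviving processes decide min

    # e.g. flood_consensus({1: 0, 2: 1, 3: 1}, t=1, crashes={1: (1, {2, 3})})
    # returns {2: 1, 3: 1}: the value 0 is never seen, both survivors decide 1.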

  7. Partial Synchrony [DLS88] • There is a global stabilization time GST • until GST system asynchronous • after GST system is synchronous • GST is not known • Realistic: practical networks are not really asynchronous • Many variants and similar models, e.g., unreliable failure detectors [CT96]

  8. Consensus with Partial Synchrony • Consensus solvable with < n/2 failures • Eventually correct processes won’t be suspected • Running time unbounded • by [FLP85] • because models can be asynchronous for unbounded time

  9. In a Practical System Can we say more than: “consensus will be solved eventually”?

  10. Our Approach • Look at well-behaved runs • no failures • messages arrive within a known time Δ • most common in practice • Known algorithms decide in 2 rounds of communication in well-behaved runs • 2Δ time when the maximum delay is Δ • Paxos [Lam98]; atomic commit [KD98]; failure detectors [Sch97,MR97]; atomic broadcast [KD96];...

  11. Why are there no 1-Round Algorithms? • We will show a lower bound of 2 communication rounds • Follows from similar bound on Uniform Consensus in synchronous model

  12. Uniform Consensus • Uniform agreement: every two processes that decide, decide the same • Recall: with consensus, only correct processes have to agree • Synchronous lower bound of f+2 rounds [CBS00] • as opposed to f+1 for consensus
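
The gap between the two agreement properties is easy to see on a run record. A minimal sketch (illustrative, not from the paper):

    # Uniform agreement also constrains processes that decide and later crash.

    def agreement(decisions, correct):
        # plain agreement: only correct processes must agree
        return len({decisions[p] for p in correct if p in decisions}) <= 1

    def uniform_agreement(decisions):
        # uniform agreement: every two processes that decide, decide the same
        return len(set(decisions.values())) <= 1

    # A run where process 1 decides 0 and then crashes while the correct
    # processes decide 1 satisfies agreement but violates uniform agreement:
    decisions = {1: 0, 2: 1, 3: 1}
    assert agreement(decisions, correct={2, 3})
    assert not uniform_agreement(decisions)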

  13. From Consensus to Uniform Consensus • In the partial synchrony model, any algorithm A for consensus also solves uniform consensus [Gue95, Gue98] • Assume by contradiction that A does not solve uniform consensus • then in some run, p and q decide differently, and p fails • but the same run could occur with p non-faulty: p may simply wake up after q decides, violating (non-uniform) agreement

  14. Deriving the Lower Bound • We will now show a synchronous 2 round lower bound for uniform consensus in runs with no failures • Implies 2 round lower bound for well-behaved executions in partial synchrony model • Any algorithm for consensus solves uniform consensus (previous slide) • QED

  15. Theorem: Uniform Consensus Failure-Free Lower Bound • Assume n>2 and t>1 • Then there is a failure-free run in which not all processes decide after one round

  16. Deterministic Algorithms • Run determined by initial values and adversary actions • failures, message loss • (Global) state = list of values in processes’ local states (data structures)

  17. Connectivity • States x, x' are similar, x~x', if they look the same to all but at most one correct process • E.g., the initial states of a binary consensus algorithm form a chain: 000 ~ 001 ~ 011 ~ 111 • Intuition: in connected states there cannot be different decisions
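
A minimal sketch of this similarity relation on initial states (illustrative; for failure-free initial states, "look the same to all but at most one correct process" reduces to differing in at most one input):

    # Initial states of binary consensus = input vectors; x ~ y iff they
    # differ in at most one coordinate. The whole set is connected, e.g.
    # 000 ~ 001 ~ 011 ~ 111.
    from itertools import product

    def similar(x, y):
        return sum(a != b for a, b in zip(x, y)) <= 1

    def is_connected(states):
        states = list(states)
        seen, stack = {states[0]}, [states[0]]
        while stack:
            x = stack.pop()
            for y in states:
                if y not in seen and similar(x, y):
                    seen.add(y)
                    stack.append(y)
        return len(seen) == len(states)

    initial_states = list(product((0, 1), repeat=3))   # all input vectors, n=3
    assert similar((0, 0, 0), (0, 0, 1)) and not similar((0, 0, 0), (0, 1, 1))
    assert is_connected(initial_states)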

  18. Coloring • Classical coloring: valency, potential decisions a state can lead to [FLP85] • Our coloring: val(x) = decision of correct processes in the failure-free extension of x (0 or 1)

  19. Theorem Proof • Consider a colored graph of initial states: a chain 0…0 ~ … ~ x ~ x' ~ … ~ 1…1, where adjacent states differ only in the state of some correct process j • By validity, val(0…0)=0 and val(1…1)=1, so somewhere along the chain there are adjacent x, x' with val(x)=0 and val(x')=1 • Assume, by contradiction, that in the failure-free runs from x and x', all processes decide in 1 round

  20. Illustrating the Contradiction • x and x' differ only in the state of process j=1 • val(x)=0, so x leads to decision 0 in one failure-free round; likewise x' leads to decision 1 • [Figure: one-round runs from x and x' in which process 1 fails and loses some of its messages look the same to process 3, and further such runs look the same to process 2, so they must end with the same decision] • A contradiction to uniform agreement!

  21. The General Lower Bound f+2 rounds in runs with f failures

  22. States (Configurations) • (Global) state = list of values in processes’ local states (data structures) • Given a fixed deterministic algorithm, the state of a run can be denoted as x.E1.E2.E3, where x is an initial state and each Ei is an environment (adversary) action

  23. To Prove Lower Bounds • Sufficient to look at a subset of the runs (a limited adversary) • called a system • Simplifies the proof

  24. Considered Environment Actions • (i, [k]) - process i fails, its messages to processes {1,…,k} are lost (if sent); [0] is the empty set - no loss • applicable if i is non-failed and there are < t failures • (0, [0]) - no failure • always applicable • So at most one process fails in each round, and its messages are lost by a prefix of the processes

  25. Layering [MR98] • Layering L = set of environment actions • L(X) = {x.E | x ∈ X, E ∈ L applicable to x} • L^0(X) = X • L^k(X) = L(L^(k-1)(X)) • Define the system using layers • X0 = set of initial states • System: L^k(X0)
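
A minimal sketch of this layering construction in Python (illustrative, not from the paper): a state is kept symbolically as an initial state plus its action history, exactly as in the x.E1.E2... notation above, and an environment action is a pair (i, k) standing for (i,[k]).

    # Symbolic layering L^k(X0) over the environment actions of slide 24.
    from itertools import product

    def actions(n):
        yield (0, 0)                            # (0,[0]): no failure this round
        for i in range(1, n + 1):
            for k in range(0, n + 1):
                yield (i, k)                    # (i,[k]): i fails, msgs to 1..k lost

    def applicable(state, action, t):
        init, history = state
        failed = {i for (i, _) in history if i != 0}
        i, _ = action
        return i == 0 or (i not in failed and len(failed) < t)

    def layer(states, n, t):
        # L(X) = { x.E | x in X, E in L applicable to x }
        return {(init, hist + (a,))
                for (init, hist) in states
                for a in actions(n) if applicable((init, hist), a, t)}

    def layers(X0, n, t, k):
        # L^k(X0), with L^0(X0) = X0
        X = {(x, ()) for x in X0}
        for _ in range(k):
            X = layer(X, n, t)
        return X

    X0 = set(product((0, 1), repeat=3))         # initial states for n=3
    print(len(layers(X0, n=3, t=2, k=1)))       # one-round extensions of X0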

  26. Coloring • How to color non-decided states? • Classical coloring: valency, potential decisions a state can lead to [FLP85] • Our coloring: val(x) = decision of correct processes in the failure-free extension of x (0 or 1)

  27. Proof Strategy • Uniform Lemma: from a connected set, under some conditions, 2 more rounds are needed for uniform consensus (recall: 1 for consensus) • Connectivity Lemma: for f < t+1, L^f(X0) is connected • feature of the model • also implies the consensus f+1 lower bound • can be proven for all L^i(X0) in other models, e.g., the mobile failure model [MR98,SW89]

  28. Uniform Lemma • If • X is connected • there are x, x' ∈ X s.t. val(x)=0, val(x')=1 • in all states in X there exist at least 3 non-failed processes and 2 can fail • Then • there is y ∈ X s.t. in y.(0,[0]) (the 1-round failure-free extension of y) not all processes decide

  29. Uniform Lemma: Proof • X connected, val(x)=0, val(x')=1 • Along the chain from x to x' there are adjacent states y ~ y' with val(y)=0, val(y')=1, differing only in the state of some correct process j • Assume, by contradiction, that in the failure-free extensions of y and y', all processes decide after 1 round • 2 cases: j either failed or non-failed

  30. Illustrating the Contradiction • Case 1: j is correct • val(y)=0, so y leads to decision 0 in one failure-free round • [Figure: the extensions y.(0,[0]), y.(1,[2]), y.(1,[2]).(3,[3]) and the corresponding extensions of y' are chained; y.(1,[2]) and y'.(1,[2]) look the same to process 3, and the two-round extensions look the same to process 2] • A contradiction to uniform agreement!

  31. Corollary: Failure-Free Case • n > 2, t > 1, f = 0 • X0 = {initial failure-free states} is connected • there are x, x' ∈ X0 s.t. val(x)=0, val(x')=1 (by validity) • By the Uniform Lemma, from some initial state 2 rounds are needed to decide

  32. Connectivity Lemma: L^f(X0) Connected for f < t+1 • Proof by induction, base immediate • For a state x, L(x) is connected (next slide) • Let x ~ x' ∈ X, where x, x' differ in the state of i only and i can fail • Then x.(i,[n]) = x'.(i,[n]): failing i and losing all of its messages erases the only difference, so L(x) and L(x') are connected to each other

  33. L(x) is Connected • [Figure, for n=3: every one-round extension of x is linked to the failure-free extension x.(0,[0]) by a chain of similar states] • x.(0,[0]) ~ x.(1,[0]) ~ x.(1,[2]) ~ x.(1,[3]) • x.(0,[0]) ~ x.(2,[0]) ~ x.(2,[1]) ~ x.(2,[3]) • x.(0,[0]) ~ x.(3,[0]) ~ x.(3,[1]) ~ x.(3,[2])

  34. Theorem: f+2 Lower Bound • Assume n > t and f < t-1 • L^f(X0) - final states of runs with f failures • connected • in any state in L^f(X0) there exist at least 3 non-failed processes and 2 can fail • Take z, z' ∈ X0 s.t. val(z) ≠ val(z') • let x, x' be the failure-free extensions of z, z': x = z.(0,[0])^f ∈ L^f(X0) • By the Uniform Lemma, from some state in L^f(X0) two more rounds are needed, i.e., f+2 rounds in total

  35. Why a New Proof Technique?

  36. Classical Technique: Bivalency • Bivalent state = state that can lead to different decisions [FLP85] • defined w.r.t. system [MR98] • 1-valent state always leads to decision 1 • 0-valent state always leads to decision 0 • Used for, e.g., • asynchronous consensus impossibility[FLP85] • consensus f+1 lower bound [AT99,MR98]

  37. Bivalency-Based Proofs • Show that initial bivalent state exists • Show by induction that adversary can keep system in bivalent state • No decision is possible in a bivalent state

  38. Bivalency-Based Proofs: Base • Consider the chain of initial states 00..0 ~ 0..01 ~ … ~ x ~ x' ~ … ~ 01..1 ~ 11..1, where x and x' differ in the state of one process j, and the endpoints are 0-valent and 1-valent by validity • If j fails, x and x' lead to the same decision • Impossible for x to be 1-valent and for x' to be 0-valent • Validity implies: an initial bivalent state exists

  39. Bivalency Doesn’t Work • Bivalency proofs use validity only to show that an initial bivalent state exists • Proofs work if validity is replaced by: Weak Validity: an initial bivalent state (w.r.t. the system) exists • the consensus f+1 lower bound proofs still work • but we show that under weak validity the uniform consensus f+2 round lower bound does not hold

  40. Counter Example to Weak Validity

    Round 1: send m1 to all
             if (got m1 from all) then return 1 fi
    Round 2: send m2 to all
             if (#m1 = #m2) then v = 0 else v = 1 fi
             return Uniform-Consensus(v)

• Decides 1 in one round in failure-free runs • Decides 0 with one “clean” failure in Round 1
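
A direct Python rendering of this pseudocode for one process (a sketch; the round primitives send_to_all / received and the underlying uniform_consensus call are hypothetical helpers, not defined here):

    # Per-process counter-example (hypothetical helpers: proc.send_to_all,
    # proc.received, proc.uniform_consensus).

    def counter_example(proc, n):
        proc.send_to_all("m1")                   # Round 1
        m1_senders = proc.received("m1")         # processes whose m1 arrived
        if len(m1_senders) == n:
            return 1                             # failure-free: decide 1 in one round

        proc.send_to_all("m2")                   # Round 2
        m2_senders = proc.received("m2")
        # a "clean" round-1 failure leaves #m1 = #m2, so v = 0
        v = 0 if len(m1_senders) == len(m2_senders) else 1
        return proc.uniform_consensus(v)         # hand off to uniform consensus on v

Note that in a failure-free run every process gets m1 from all and returns 1 regardless of its input, so the algorithm is consistent with weak validity but not with validity; this is why it can decide in one round without contradicting the failure-free lower bound.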

  41. Conclusions • f+2 lower bound for uniform consensus • synchronous, crash failure model • 1 more round than consensus • New proof technique (new coloring) • because bivalency does not work • Implies lower bound of f+2 communication rounds for synchronous runs in partial synchrony model (for f < t-1)

  42. On an Optimistic Note • Consensus requires 2 rounds in partial synchrony model because of false suspicions • 1-round algorithms work correctly while there are no false suspicions • group communication: Horus, Amoeba, ... • Optimistic approach: • use 1-round algorithm • reconcile conflicts in case of false suspicions
