We've seen (FischerLynchPaterson, WaitFreeHierarchy) that consensus is difficult in many systems if we assume deterministic processes. Randomized consensus is a way to avoid these results by allowing processes to flip coins. The intuition is that the bad executions constructed in FLP are a tiny fraction of the space of all executions; so if we can make these bad executions improbable, we can solve consensus with high probability. Formalizing this intuition requires changes both to the model (to allow randomization) and the problem statement (to allow consensus protocols to terminate only with probability 1, instead of in all executions.

1. The intuitive story

Imagine two pedestrians trying to pass each other in a hallway, where they need to reach consensus on whether to pass on the left or on the right. Each initially has a preference for which way they want to go (e.g. right for Americans and left for Britons), and from the FLP result we know that if the pedestrians are deterministic and have their timing controlled by an adversary, each will give up and adopt the other pedestrian's preference at exactly the same time. The results are both embarrassing (starvation) and implausible (doesn't happen in real life).

A solution is for one or both of the pedestrians to announce that they are choosing a new preference by flipping a coin. This gives a 50% probability that both will agree with each other, even if only one of the pedestrians flips a coin. Thus after a constant number of coin-flips, both pedestrians agree, and (assuming they can detect this agreement), they will have solved consensus.

This is the basic idea behind randomized consensus: build a framework for detecting agreement, then flip coins to get the agreement. A complication is that with n processes we may have to wait for Θ(n²) rounds before all their coin-flips happen to come up the same. So we want some way to build a shared coin that usually gives the same answer to all of the processes, and have them flip that instead.

2. Randomization

To add randomization to the model, we allow processes to do internal steps, known as local coin-flips, where the state of the process goes from some state q to some new state q' where q' is chosen according to a probability distribution specified by the algorithm. The simplest version of this is when the process just flips a coin that returns heads or tails with equal probability, and we can describe this algorithmically as if the process has access to a coin-flip subroutine (or coin-flip operation) that returns the outcome of the coin. Note that an implicit assumption is that only one process learns the outcome of the coin—this is what makes the coin local. A common trick in designing randomized distributed algorithms is to build from these local coins a global coin that all processes agree on and that the adversary doesn't have too much influence over. (This is actually a harder problem than solving consensus, and as we will see below if we can solve it we can solve consensus.)

Once we add randomization, we have to revisit the role of the adversary. Without randomization, the adversary was just a universal quantifier over admissible executions. Now we have to think in terms of adversary strategies, where the choices made by the adversary may depend on the outcome of coin-flips that occur during the execution of the protocol. An adversary strategy is typically represented as a function that chooses in each global configurations of the protocol which process executes the next operation. Fixing some single adversary strategy removes all the non-probabilistic nondeterminism from the system, leaving a probability distribution over executions that is determined by the probability distribution over sequences of local coin-flips.

Implicit in the definition of an adversary strategy is the assumption that the adversary can't predict future coin-flips (because their outcomes aren't visible in the current configuration). We may choose to impose additional restrictions on what the adversary can see, e.g. by preventing it from observing anything about the state of the protocol (an oblivious adversary) or limiting its ability to observe the internal states of processes, values contained in messages, or values written to registers (a semi-oblivious adversary). However, in the most general case we will allow the adversary to see (and react to) everything that has happened so far; this gives an adaptive adversary, which we will assume by default. (See TypesOfAdversaries for slightly more detail.)

3. Randomized termination

We replace the usual termination requirement with:

Randomized termination
With probability 1, every non-faulty process decides.

The agreement and validity conditions stay the same.

Probability 1 does not mean that termination holds in all executions, just that the set of non-terminating executions becomes vanishingly improbable in the limit. This is analogous to the probability that an infinite sequence of coin-flips contains at least one head ; while there exists a sequence that doesn't have this property (all tails), the probability of this bad sequences is equal to limn→∞ (1/2)n = 0. Similarly we'll show that the sequences of bad coin-flips that prevent consensus require that we get unlucky forever, which doesn't usually happen.

In computing the cost of a protocol that might not terminate for a while, we have to look at expected cost. This could be expected total work, or expected work done by any single process. We can't guarantee any fixed finite bound holds with probability 1, because if we could, we could get a deterministic protocol by running the guaranteed-to-terminate protocol with all local coin-flips returning heads, which contradicts FLP.

4. Upper bounds

With an adaptive adversary, we can solve randomized consensus in a wait-free shared memory system in expected Θ(n²) total operations; this bound is tight (Hagit Attiya and Keren Censor, Tight bounds for asynchronous randomized consensus, STOC 2007). The Attiya-Censor protocol is a culmination of roughly 25 years of work on the problem; a survey of the history emphasizing wait-free shared-memory results up to 2001 or so can be found in James Aspnes, Randomized protocols for asynchronous consensus, Distributed Computing 16(2-3):165-175, 2003. We'll present the Attiya-Censor protocol for building a global coin embedded in an algorithm for turning a global coin into a consensus protocol that was first described by Tushar Chandra Polylog randomized wait-free consensus, PODC 1996 for a semi-oblivious adversary model.

4.1. Reduction to shared coin

Basic idea: build two infinitely long arrays mark[0] and mark[1] of multi-writer bits, where mark[b][i] indicates that some process that prefers b has gotten to round i. In each round, a process looks to see if the processes ahead of it agree with each other, and if so it adopts their common preference; otherwise it flips a shared coin to decide on its new preference. Because slow processes adopt the common values of fast processes, if a fast process looks over its shoulder and sees that nobody has disagreed with it in the last two rounds, it can decide knowing that the slow processes will join it before they get around to flipping any coins.

Here's the actual algorithm:

procedure Consensus(input):
    p←input
    for r ← 1 to ∞:

        mark[p][r]←true
        if mark[1-p][r+1]:
            # somebody is ahead of us, join them
            p' ← 1-p
        else if mark[1-p][r]:
            # disagreement in our round, run shared coin
            p' ← SharedCoin[r]()
        else if mark[1-p][r-1]:
            # no disagreement in this round, keep current value
            p' ← p
        else:
            # no disagreement in previous round either, terminate
            return p
        endif

        if mark[p][r+1] = false:
            # abandon our possibly-losing team
            p ← p'
        endif
    end
end

The proof of validity comes from observing that if nobody has input p, then nobody ever marks a bit in mark[p], and everybody decides 1-p after two rounds.

The proof of agreement comes from carefully analyzing the behavior of slow processes once some process terminates; the essential idea is that once I read mark[1-p][r-1] = 0 after writing mark[p][r] = 1, then any process that comes later either (a) already agrees with me, or (b) hasn't written to mark[1-p][r-1] yet, in which case it reads mark[p][r] = 1 and mark[1-p][r] = 0 and switches its preference before it reaches round r. So every process enters round r with preference p, and in they all decide p in round r+1 at the latest.

For termination, we need to know that the SharedCoin protocol returns each value 0 or 1 with probability at least δ for some constant δ (called the agreement parameter), no matter what the adversary does. The reason for this is that in a round where some processes execute SharedCoin, there may be a few fast process that don't execut SharedCoin because they only saw one value in round r. So we need a constant probability that the coin-flippers agree with their fixed value no matter what the fixed value is. But if our SharedCoin has this property, then there is a constant probability that the coin-flippers agree with the leaders, and we get a constant probability per round of termination, with an expected time to termination of O(1/δ) asynchronous rounds.

5. Building a shared coin

Basic idea:

An approach similar to this (terminating when a random walk reached ±Θ(n) instead of at Θ(n²) total votes) was used for the first polynomial-time randomized consensus protocol of Aspnes and Herlihy James Aspnes and Maurice Herlihy, Fast randomized consensus using shared memory, J. Alg. 11(3):441–461, September 1990; their algorithm used O(n4) total register operations. Terminating at Θ(n²) total votes was suggested by Bracha and Rachman Gabi Bracha and Ophir Rachman, Randomized consensus in expected O(n² log n) operations, WDAG 1990, which produced a dramatic reduction in the overhead of testing termination. The optimal total work of O(n²) operations was only achieved recently by Attiya and Censor Hagit Attiya and Keren Censor, Tight bounds for asynchronous randomized consensus, STOC 2007, by augmenting the Bracha-Rachman protocol with a termination bit that ensures that all processes detect termination at roughly the same time. This is the protocol we describe below:

count ← 0
sum ← 0
while not done:
    vote ← localCoin()
    count++
    sum += flip
    A[i].(count, sum) ← (count, sum)
    if count mod n = 0:
        if ∑ A[i].count ≥ n²:
            done ← true
            return sgn(∑ A[i].sum)

Analysis:

Putting this together gives a constant probability that all processes see the same sign for the total vote as obtained from the first n² votes. This gives a constant probability that they agree on each possible value ±1.


CategoryDistributedComputingNotes

RandomizedConsensus (last edited 2008-02-29 05:50:41 by JamesAspnes)