So far we’ve looked at generating a random choice out of seven, or out of five. How do we generalize this?

## Problem statement

Given a positive integer *N*, choose a random non-negative integer *X* uniformly distributed among {0, 1, … *N* – 1} = {*x*: 0 <= *x* < *N*}, using only flips of a fair coin. If *F* is the number of flips required, try to minimize *E*(*F*).

The first blog post dealt with the case where *N* = 7, and the second where *N* = 5.

Write *N* in base 2 as *n*_{d} 2^{d} + *n*_{d – 1} 2^{d – 1} + … + *n*_{1} 2^{1} + *n*_{0} 2^{0} = Σ_{i = 0 to d} *n*_{i} 2^{i}, where *d* is the largest integer such that 2^{d} <= *N* (that is, *d* = ⌊log_{2} *N*⌋) and each *n*_{i} ∈ {0, 1}.

Note that by construction, *n*_{d} = 1.

For ease of notation, we can write *N* = 0b*n*_{d}*n*_{d – 1}…*n*_{1}*n*_{0}. We will put a ` before every *n*_{4k + 3} as a separator, in much the same way as , is used as a separator in decimal numbers.
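Concretely, the digits *n*_{i} can be read off with ordinary bit operations. A quick sketch in Python (used here just for illustration; the series' own implementation, in PowerShell, comes later):

```python
def binary_digits(n):
    """Return [n_0, n_1, ..., n_d]: the base-2 digits of n, least significant first."""
    assert n > 0
    digits = []
    while n > 0:
        digits.append(n & 1)  # the lowest bit is the next digit
        n >>= 1
    return digits
```

Note that `len(binary_digits(n)) - 1` is *d*, and the last entry, *n*_{d}, is always 1.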

Some examples:

| *N* | Binary | *d* | *n*_{i} |
| --- | --- | --- | --- |
| 7 | 0b111 | 2 | *n*_{2} = 1, *n*_{1} = 1, *n*_{0} = 1 |
| 5 | 0b101 | 2 | *n*_{2} = 1, *n*_{1} = 0, *n*_{0} = 1 |
| 128 | 0b1000`0000 | 7 | *n*_{7} = 1, *n*_{6} = 0, *n*_{5} = 0, *n*_{4} = 0, *n*_{3} = 0, *n*_{2} = 0, *n*_{1} = 0, *n*_{0} = 0 |
| 129 | 0b1000`0001 | 7 | *n*_{7} = 1, *n*_{6} = 0, *n*_{5} = 0, *n*_{4} = 0, *n*_{3} = 0, *n*_{2} = 0, *n*_{1} = 0, *n*_{0} = 1 |
| 204 | 0b1100`1100 | 7 | *n*_{7} = 1, *n*_{6} = 1, *n*_{5} = 0, *n*_{4} = 0, *n*_{3} = 1, *n*_{2} = 1, *n*_{1} = 0, *n*_{0} = 0 |
| 255 | 0b1111`1111 | 7 | *n*_{7} = 1, *n*_{6} = 1, *n*_{5} = 1, *n*_{4} = 1, *n*_{3} = 1, *n*_{2} = 1, *n*_{1} = 1, *n*_{0} = 1 |

We wish to choose *X* = 0b*x*_{d}*x*_{d – 1}…*x*_{1}*x*_{0} = Σ_{i = 0 to d} *x*_{i} 2^{i} where each *x*_{i} ∈ {0, 1} and *X* < *N*.

If *N* = 2^{d}, this is easy. *x*_{d} must be 0, and the other *x*_{i} can be assigned arbitrarily. We can do this in a single pass with exactly *d* coin flips. But what if *N* is not a power of 2?
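This power-of-2 case can be sketched as follows, with a hypothetical `flip()` helper standing in for the fair coin:

```python
import random

def flip():
    """Stand-in for one fair coin flip: returns 0 or 1 with equal probability."""
    return random.getrandbits(1)

def choose_power_of_two(d):
    """Uniformly choose X in {0, ..., 2**d - 1} using exactly d flips."""
    x = 0  # x_d is implicitly 0
    for _ in range(d):
        x = 2 * x + flip()  # fill in x_{d-1} down to x_0
    return x
```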

## A naïve algorithm

The naïve algorithm used in the first blog post is:

1. Start with *X* ← 0.
2. Repeat the following steps exactly *d* + 1 times:
   - Let *X* ← 2*X*.
   - Flip a coin. This is *x*_{i}.
   - If it comes up Tails, let *X* ← *X* + 1.
3. If *X* >= *N*, start over with step 1.
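The steps above can be sketched like so (again with a `flip()` helper standing in for the coin):

```python
import random

def flip():
    """One fair coin flip: 0 for Heads, 1 for Tails."""
    return random.getrandbits(1)

def naive_choice(n):
    """Uniformly choose X in {0, ..., n - 1} by the naive rejection algorithm."""
    d = n.bit_length() - 1       # n has d + 1 binary digits
    while True:
        x = 0
        for _ in range(d + 1):   # exactly d + 1 flips per pass
            x = 2 * x + flip()
        if x < n:                # otherwise discard the pass and start over
            return x
```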

## Performance of the naïve algorithm

How many flips does this take?

Let *r* be the chance that the algorithm produces a result on the first pass. Let *F*_{1} be the number of flips it takes on the first pass, and let *F* be the number of flips it takes overall.

Then *E*(*F*) = *E*(*F*_{1}) + *r* · 0 + (1 – *r*) *E*(*F*). Solving for *E*(*F*) we have *E*(*F*) = *E*(*F*_{1}) / *r*.

- *r* = *N*/2^{d + 1}. This is somewhere between 1/2 and 1.
- *E*(*F*_{1}) = *d* + 1.
- So *E*(*F*) = (*d* + 1) 2^{d + 1} / *N*.
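As a quick sanity check of the formula (plain arithmetic, nothing more):

```python
def naive_expected_flips(n):
    """E(F) = (d + 1) * 2**(d + 1) / n for the naive algorithm."""
    d = n.bit_length() - 1
    return (d + 1) * 2 ** (d + 1) / n

# N = 7: 3 * 8 / 7 = 24/7, about 3.43 flips on average.
# N = 5: 3 * 8 / 5 = 4.8 flips on average.
```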

In the best case, *N* = 2^{d + 1} – 1 and all the *n*_{i} are 1. *E*(*F*) = (*d* + 1) 2^{d + 1} / (2^{d + 1} – 1) = (*d* + 1) / (1 – 2^{-(d + 1)}). This is basically *d* + 1, divided by a corrective term that gets closer and closer to 1 as *d* increases.

In the worst case, *N* = 2^{d} + 1 and all the *n*_{i} are 0 except *n*_{d} = *n*_{0} = 1. *E*(*F*) = (*d* + 1) (2^{d + 1}) / (2^{d} + 1) = (*d* + 1) 2 / (1 + 2^{–d}). This is basically 2*d* + 2, divided by a corrective term that gets closer and closer to 1 as *d* increases.

So if we blur our eyes a little bit, the number of flips for the naïve algorithm **varies from *d* + 1 to 2*d* + 2** (with a small corrective factor) depending on the density of 1s in *n*_{i}.

## A better algorithm

The key to the better algorithm hinted at in part 2 is to do much the same thing, but **stop flipping as soon as the current pass is doomed**. This lowers *E*(*F*_{1}).

Find the most significant “on” bit *n*_{d} = 1. Also find the *least* significant “on” bit *n*_{s} = 1 with *n*_{i} = 0 for all *i* < *s*.

These may be the same, in which case *N* is a power of 2.
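Both bits fall out of standard integer idioms; a sketch:

```python
def significant_bits(n):
    """Return (most, least): the most and least significant "on" bits of n."""
    assert n > 0
    most = 1 << (n.bit_length() - 1)   # 2**d
    least = n & -n                     # two's-complement trick: isolates 2**s
    return most, least
```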

| *N* | Binary | Most significant “on” bit | Least significant “on” bit |
| --- | --- | --- | --- |
| 7 | 0b111 | 0b100 | 0b001 |
| 5 | 0b101 | 0b100 | 0b001 |
| 128 | 0b1000`0000 | 0b1000`0000 | 0b1000`0000 |
| 129 | 0b1000`0001 | 0b1000`0000 | 0b0000`0001 |
| 204 | 0b1100`1100 | 0b1000`0000 | 0b0000`0100 |
| 255 | 0b1111`1111 | 0b1000`0000 | 0b0000`0001 |

If *N* is a power of 2, the algorithm is simple. Set *x*_{d} = 0, and each of the other *x*_{i} to a random bit. This can be done in *d* coin flips. Call this **running the table** on *x*_{d – 1} through *x*_{0}.

Otherwise, generate random bits one at a time, starting at *x*_{d}. If your *x*_{i} < *n*_{i}, run the table and you’re done. If your *x*_{i} > *n*_{i}, you’re doomed; abort the pass and start over. If your *x*_{i} = *n*_{i} then you’re on the edge; move on to the next bit and repeat until you’ve exhausted *x*_{s}.

More explicitly:

1. Start with *i* ← *d*.
2. Generate a random bit *x*_{i}.
3. If *x*_{i} < *n*_{i}, we have won. Run the table on *x*_{i – 1} through *x*_{0}. **We have chosen *X* = 0b*x*_{d}*x*_{d – 1}…*x*_{1}*x*_{0}.**
4. If *x*_{i} > *n*_{i}, we're doomed, because *X* > *N*, which is too big. **No result.** Discard all the *x*_{i} generated so far and go back to step 1.
5. If *x*_{i} = *n*_{i}, we're on the edge. See if we have any “on” bits remaining:
   - If *i* > *s*, we still have a chance – move on to the next bit. Save *x*_{i}, set *i* ← *i* – 1, and loop around to step 2 again.
   - If *i* = *s*, we're doomed, because *X* >= *N*, which is too big. **No result.** Discard all the *x*_{i} generated so far and go back to step 1.
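Putting the steps together, a sketch of the better algorithm (with the same hypothetical `flip()` standing in for the fair coin):

```python
import random

def flip():
    """One fair coin flip: returns 0 or 1."""
    return random.getrandbits(1)

def better_choice(n):
    """Uniformly choose X in {0, ..., n - 1}, aborting a pass as soon as it is doomed."""
    d = n.bit_length() - 1           # index of the most significant "on" bit
    s = (n & -n).bit_length() - 1    # index of the least significant "on" bit
    if n == 1 << d:                  # power of 2: just run the table
        x = 0
        for _ in range(d):
            x = 2 * x + flip()
        return x
    while True:                      # one pass per iteration
        x = 0
        for i in range(d, s - 1, -1):
            x_i = flip()
            n_i = (n >> i) & 1
            x = 2 * x + x_i
            if x_i < n_i:            # won: run the table on x_{i-1} .. x_0
                for _ in range(i):
                    x = 2 * x + flip()
                return x
            if x_i > n_i:            # doomed: X would exceed N
                break
            # x_i == n_i: on the edge; fall through to the next bit
        # Reached i == s still on the edge, or broke out doomed: no result.
```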

## Performance of the better algorithm

How many coin flips does this take?

As before, if *N* = 2^{d} then *E*(*F*) = *d*.

As before, if *N* is not a power of 2:

- *E*(*F*) = *E*(*F*_{1}) / *r*
- *r* = *N* / 2^{d + 1}

As before, in the best case, *N* = 2^{d + 1} – 1 and all the *n*_{i} are 1. *E*(*F*_{1}) = *d* + 1 and *E*(*F*) = (*d* + 1) / (1 – 2^{-(d + 1)}).

What about the worst case *N* = 2^{d} + 1 where *n*_{d} = 1, *n*_{0} = 1, and all the rest of the *n*_{i} are 0?

As before, *r* = *N* / 2^{d + 1} = (2^{d} + 1) / 2^{d + 1} = 2^{-1} + 2^{-(d + 1)}. This is basically 1/2 plus a corrective factor that gets smaller as *d* increases.

But *E*(*F*_{1}) is lower now. Let’s calculate it:

Half of the time, *x*_{d} = 0 and we get a result in *d* + 1 flips.

1/4 of the time, *x*_{d} = 1 and *x*_{d – 1} = 1 and we get no result in 2 flips.

1/8 of the time, *x*_{d} = 1, *x*_{d – 1} = 0, and *x*_{d – 2} = 1 so we get no result in 3 flips.

…

1/2^{i} of the time, *x*_{d} = 1, *x*_{d – 1} = 0 through *x*_{d – i + 2} = 0, and *x*_{d – i + 1} = 1 so we get no result in *i* flips.

…

1/2^{d} of the time, *x*_{d} = 1, *x*_{d – 1} = 0 through *x*_{2} = 0, and *x*_{1} = 1 so we get no result in *d* flips.

1/2^{d + 1} of the time, *x*_{d} = 1, *x*_{d – 1} = 0 through *x*_{1} = 0, and *x*_{0} = 1 so we get no result in *d* + 1 flips.

1/2^{d + 1} of the time, *x*_{d} = 1, and *x*_{d – 1} = 0 through *x*_{0} = 0, so we get a result of *X* = 2^{d} = *N* – 1 in *d* + 1 flips.

Adding these up, we have:

*E*(*F*_{1}) = (*d* + 1)/2 + (2/4 + 3/8 + … + *i*/2^{i} + … + *d*/2^{d} + (*d* + 1)/2^{d + 1}) + (*d* + 1)/2^{d + 1}

= (*d* + 1)(2^{-1} + 2^{-(d + 1)}) + Σ_{i = 2 to d + 1} *i*/2^{i}

= (*d* + 1)(2^{-1} + 2^{-(d + 1)}) – 1/2 + Σ_{i = 1 to d + 1} *i*/2^{i}

So *E*(*F*) = *E*(*F*_{1}) / *r* = ((*d* + 1)(2^{-1} + 2^{-(d + 1)}) – 1/2 + Σ_{i = 1 to d + 1} *i*/2^{i}) / (2^{-1} + 2^{-(d + 1)})

= *d* + 1 – 1/(1 + 2^{–d}) + 2(Σ_{i = 1 to d + 1} *i*/2^{i}) / (1 + 2^{–d})

As *d* gets large, 2^{–d} gets close to 0, and Σ_{i = 1 to d + 1} *i*/2^{i} gets close to 2. (For a derivation of this fact see Solution: baby boys and baby girls.) So:

*E*(*F*) approaches *d* + 1 – 1/(1 + 0) + 2(2) / (1 + 0) = *d* + 4.
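This derivation can be checked mechanically. The sketch below computes *E*(*F*_{1}) exactly, with rational arithmetic, by walking the probability tree of a single pass, and restates the closed form above for comparison:

```python
from fractions import Fraction

def expected_first_pass_flips(n):
    """Exact E(F_1) for one pass of the better algorithm (no simulation)."""
    d = n.bit_length() - 1          # most significant "on" bit
    s = (n & -n).bit_length() - 1   # least significant "on" bit
    total = Fraction(0)
    for i in range(d, s - 1, -1):
        flips = d - i + 1           # flips used by the time bit i is decided
        p = Fraction(1, 2 ** flips) # probability of each branch at bit i
        if (n >> i) & 1:
            total += p * (d + 1)    # x_i = 0: won; the table brings us to d + 1 flips
            if i == s:
                total += p * flips  # x_i = 1 at the last "on" bit: doomed
        else:
            total += p * flips      # x_i = 1 over n_i = 0: doomed
    return total

def worst_case_closed_form(d):
    """(d + 1)(2^-1 + 2^-(d+1)) - 1/2 + sum_{i = 1 to d + 1} i/2^i, from the text."""
    return (Fraction(d + 1) * (Fraction(1, 2) + Fraction(1, 2 ** (d + 1)))
            - Fraction(1, 2)
            + sum(Fraction(i, 2 ** i) for i in range(1, d + 2)))
```

For every worst case *N* = 2^{d} + 1 the two agree; for example *N* = 5 gives *E*(*F*_{1}) = 11/4.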

So the number of flips for the better algorithm **varies from *d* + 1 to *d* + 4** (with a small corrective factor) depending on the density of 1s in *n*_{i}. This is a vast improvement over the naïve algorithm, which varied from *d* + 1 to 2*d* + 2.

Next time, we’ll take a look at an implementation of this algorithm in PowerShell which generates a System.Numerics.BigInteger using System.Random.