I’ve talked a few times about ElGamal encryption, where a participant *A* has a private key *a* and a public key *g ^{a}* mod

*p*for some public prime

*p*and generator

*g*.

Now let’s suppose we’re an attacker and we’re after *A*‘s private key. We know her public key *g ^{a}*, we know

*g*, and we know

*p*. Can we figure out

*a*= log

_{g}*g*mod

^{a}*p*? This is called the “discrete logarithm problem.”

## Brute force

If *p* is extremely small or we are extremely patient we can do it very easily. Start with 1 = *g*^{0}. Multiply by *g* to get *g*. Multiply again to get *g*^{2} mod *p*. Multiply a third time to get *g*^{3} mod *p*. Keep going until you get *g ^{a}*. Then

*a*is just the number of times you had to multiply.

For example, consider implementing ElGamal encrypt, decrypt, sign, and verify signature. In this toy example *p* = 101, *g* = 2, *A*‘s private key was 57, and her public key was 2^{57} mod 101 = 74. Since we (as the attacker) know *g* and *p* we can just calculate, doubling as we go, and subtracting 101 when we can:

- 2
^{0}mod 101 = 1 - 2
^{1}mod 101 = 2 - 2
^{2}mod 101 = 4 - …
- 2
^{6}mod 101 = 64 - 2
^{7}mod 101 = 27 - 2
^{8}mod 101 = 54 - …
- 2
^{56}mod 101 = 37 - 2
^{57}mod 101 = 74

So log_{2} 74 = 57 mod 101 and now we know *A*‘s private key.

This approach takes O(*p*) time so it quickly becomes impractical as *p* gets big.

## Baby-step giant-step

I blogged earlier about a speed improvement: David Shanks’ baby-step/giant-step algorithm. The basic idea is to precompute *g ^{i}* mod

*p*only up to

*s*= ⌈√

*p*⌉ and then store them in a lookup table. Then use a little algebra to query the lookup table up to

*s*times. This takes O(√

*p*) time and space so it can let us squeeze out a little more juice but as √

*p*gets big too this also becomes impractical.

## Pohlig-Hellman

Stephen Pohlig and Martin Hellman (yes, *that* Hellman) published their algorithm for taking discrete logarithms which works by analyzing the group *Z*_{p}^{x}.

The group has *p* – 1 elements {1, 2, …, *p* – 1}. (Note that 0 is not included.) This means that the order of the subgroup generated by any given element divides *p* – 1.

Factor *p* – 1 as a product of powers of primes: *p* – 1 = *p*_{1}^{k1} *p*_{2}^{k2} … *p*_{n}^{kn}.

To calculate *y* = log_{g} *x* mod *p*:

- Calculate
*y*mod*p*_{1}^{k1}(more on this later) - Calculate
*y*mod*p*_{2}^{k2} - …
- Calculate
*y*mod*p*_{n}^{kn} - Use the Chinese Remainder Theorem to reassemble
*y*from its various modular residues.

That’s all well and good, but how do we calculate *y* mod *p*_{i}^{ki}?

Let *q* = *p _{i}* and let

*k*=

*k*. The trick to the algorithm is to realize that

_{i}*y*mod

*q*can be written in base

^{k}*q*as

*y*

_{0}+

*y*

_{1}

*q*+

*y*

_{2}

*q*

^{2}+ … +

*y*

_{k – 1}

*q*

^{k – 1}. Each of the

*y*is a number in {0, 1, …,

_{i}*q*– 1} and can be evaluated in O(√

*q*) time and space using a little algebra, e.g. the fact that

*g*

^{(p – 1)/q}is an element of order

*q*.

So even if *p* is big, **if p – 1 has all small prime factors**, private keys can be extracted from public keys very quickly!

I wrote up an implementation of Pohlig-Hellman in PowerShell and tested it out.

First I found a (*p* – 1, *p*) pair where *p* was reasonably big and *p* – 1 had all small prime factors:

> cd pohlig-hellman > .\find-unsafe-prime.ps1 -n 10000000 Sieving primes up to 10000000 + 1... Finding largest small prime and the product of small primes from 100000000000000 to 100000020000000 Finding the prime between 100000000000000 and 100000020000000 with the smoothest p - 1 Prime 100000008359681 is one more than a number whose biggest prime factor is 53 > $p = [bigint]::Parse("100000008359681") > $p 100000008359681

100000008359681 is prime, and 100000008359680 = 2^{8} 5 11^{2} 13 19 31 37 43 53 has only small prime factors.

Then I found a generator:

> .\find-generator.ps1 -p 100000008359681 2^((100000008359681 - 1)/2) mod 100000008359681 = 1, rejecting 2 3^((100000008359681 - 1)/5) mod 100000008359681 = 1, rejecting 3 4^((100000008359681 - 1)/2) mod 100000008359681 = 1, rejecting 4 5^((100000008359681 - 1)/2) mod 100000008359681 = 1, rejecting 5 6 passes all checks > $g = [bigint]::Parse("6") > $g 6

Then I (arbitrarily) picked public key 6* ^{a}* = 2 mod 100000008359681. Now I have enough to calculate the discrete logarithm to find the private key

*a*.

Actually, √*p* is still small enough that baby-step giant-step *can* crack this logarithm directly, but it is slow. On my machine it finds log_{6} 2 = 67708279093016 mod 100000008359681 in a little over eleven minutes:

> cd baby-step-giant-step > Measure-Command { .\baby-step-giant-step.ps1 -p $p -g $g -x 2 } Find y so that 6^y = 2 mod 100000008359681 log_6 2 mod 100000008359681 = 67708279093016 6^67708279093016 mod 100000008359681 = 2 ... TotalMinutes : 11.3536922983333 ...

By comparison, Pohlig-Hellman finds the same logarithm on the same machine **in less than one-tenth of a second**:

> cd pohlig-hellman > Measure-Command { .\pohlig-hellman.ps1 -p $p -g $g -x 2 } log_6 2 = 67708279093016 mod 100000008359681 6^67708279093016 = 2 mod 100000008359681 ... TotalMilliseconds : 74.797 ...