Tuesday, June 17, 2014

a bit of number theory: Greatest Common Divisor

We'll discuss one of the most ancient algorithms in the history of Number Theory.

Problem Statement:
Given two positive integers \( M\) and \(N\), find the greatest integer \(D\) such that \(D\) divides both \(M\) and \(N\).

In case you have forgotten, here are a few examples:

  • for \(M=10\), \(N=55\), the greatest common divisor \(D\) is \( 5\)
  • for \(M=72\), \(N=64\), the greatest common divisor \(D\) is \( 8\)
For small \( (M,N) \), \(D\) can usually be found easily by trial and error (and sometimes by terror). For large values of \( (M,N) \), trying every integer up to \(\min(M,N)\) still works, but it is slow, and we can certainly do better than this.

An efficient algorithm to solve this problem hinges on the following observation:
Theorem 1:
For positive integers \(a,b,d\), if \(d\mid a\) and \(d \mid b\), then \(d\mid ax+by\) for any \(x,y \in \mathbb{Z}\).

Directly substituting \(a = kd\) and \(b = ld\) shows that the claim indeed holds. \(\blacksquare\)

With this, we can devise an efficient algorithm due to Euclid:

Euclid(M, N):
Input: integers \(M, N\) with \(M \geq N \geq 0\)
Output: integer \(D\), the greatest common divisor of \(M\) and \(N\)
if \(N == 0\):
   return \(M\)
return Euclid(\(N\), \(M \bmod N\))

Convince yourself that this algorithm terminates: the second argument is always nonnegative and strictly decreases with each call.
Moreover, every remainder produced along the way is an integer combination of \(M\) and \(N\), so Euclid(M,N) eventually terminates with \( D = xM + yN\) for some \(x,y\in \mathbb{Z}\).
We will prove that it indeed returns the greatest common divisor of \(M\) and \(N\).

Theorem 2:
Given integers \(d,x,y,a,b\), if \(d = ax + by\) and \(d\) divides both \(a\) and \(b\), then \(d\) is the greatest common divisor of \(a\) and \(b\).

Suppose that \(d\) is not the greatest common divisor of \(a\) and \(b\). Hence there is \(D\) where \(D > d\) such that \(D\) divides both \(a\) and \(b\). By Theorem 1, \(D \mid ax + by = d\), which implies that \(D \leq d\), a contradiction. Hence \(d\) must be the greatest common divisor of \(a\) and \(b\). \(\blacksquare\)

Finally, what is the running time of Euclid(M,N)? Euclid(M,N) is guaranteed to terminate after \(O(\log{N})\) recursive calls, since every two consecutive calls shrink \(M\) by at least half, and each call performs a single modular reduction, which we take to be a constant-time operation. The claim below justifies this bound on the number of recursive calls.

Claim: If \(M \geq N\), then \(M \bmod N < \frac{M}{2}\).

\(N\) is either at most \(\frac{M}{2}\) or greater than \(\frac{M}{2}\), so we have 2 cases to consider:
Case 1: \(N \leq \frac{M}{2}\)
Then \(M \bmod N < N \leq \frac{M}{2}\), since a remainder is always less than the divisor.
Case 2: \(N > \frac{M}{2}\)
Here \(M \geq N\) gives \(0 \leq M - N < N\), so \(M \bmod N = M - N\), and since \(M-N < M - \frac{M}{2} = \frac{M}{2}\) we have \(M \bmod N = M-N < \frac{M}{2}\). \(\blacksquare\)

Here is an implementation in C:
/* Recursive greatest common divisor */
int gcd(int M, int N){
   if (N > M) return gcd(N, M);  /* ensure M >= N */
   if (N == 0) return M;
   return gcd(N, M % N);
}