CS 173

Spring 2021

Spring 2021

Number theory algorithms are used in n cryptography e.g. secure data exchange protocols (e.g. ssh, sftp, https). You've used these to communicate with business like your bank, even if you weren't aware of it. Number theory is used to construct random (actually pseudo-random) number functions for randomized data structures (e.g. hash tables), randomized algorithms (e.g. quicksort), and applications (e.g. games) that require randomization.

We won't go very deeply into number theory. We'll primarily use it as an excuse to practice direct proofs.

Here's a really nice algorithm for computing the gcd of two integers, called the Euclidean Algorithm. To keep things simple, I'm assuming that inputs are both positive. It's not hard to extend this code to handle negative inputs.

gcd(a,b) while (b > 0) r = remainder(a,b) a = b b = r return a

Notice that this is pseudocode not real computer code. Pseudocode is written for humans. So it often leaves out small details and the syntax isn't precisely defined. That's usually not a problem.

remainder(a,b) is the remainder left when we divide a by b.

If you prefer, we can write this recursively:

gcd(a,b) r = remainder(a,b) if (r = 0) return b else return gcd(b,r)

This is an extremely nice algorithm. It's very efficient and it's very easy to simulate by hand. Our job today will be to understand why it works.

First, let's pin down what the remainder function does. Here's a possible definition of quotient and remainder:

Suppose that a and b are integers, with b positive. Then the quotient (q) and the remainder (r) of a divided by b satisfy the equation: \(a = bq + r\)

So if a=15 and b=6, then we can satisfy the equation using q=2 and r=3.

However, notice that this equation doesn't completely pin down the values of q and r. We could also choose q=1 and r=9. Or q=3 and r= -3. So the actual mathematical definition contains two constraints:

Suppose that a and b are integers, with b positive. Then the quotient (q) and the remainder (r) of a divided by b satisfy the equations: \(a = bq + r\) and \(0 \le r < b\)

For example, suppose that we want to compute remainder(-15,4). The only solution to \(a = bq + r\) with b in the correct range is \(-15 = 4\cdot(-4) + 1 \) So remainder(-15,4) = 1. If you're uncertain of the correct value, remember that the remainder is never negative and set up the equation explicitly.

Beware: many programming languages do not follow the mathematical definition, and do a variety of different things when one or both inputs to remainder are negative. Always check the manual before feeding negative inputs to remainder functions.

Let's compute gcd(255,483) by tracing the values in the iterative version.

a | b | r = remainder(a,b) | |
---|---|---|---|

255 | 483 | 255 | |

483 | 255 | 228 | We now have \(a > b\) |

255 | 228 | 27 | |

228 | 27 | 12 | |

27 | 12 | 3 | |

12 | 3 | 0 | |

3 | 0 | --- | return 3 |

To show that the Euclidean Algorithm is correct, i.e. computes the gcd, we need to show three things:

- The algorithm halts in a finite amount of time.
- Resetting the variable values inside the while loop doesn't change the value of gcd(a,b).
- The return value at the end is correct, i.e. gcd(a,0) = a.

(1) is true because remainder(a,b) is always strictly smaller than b. So each iteration reduces the size of b. Since the values are positive integers, this means that it must reach zero in a finite number of steps. (This wouldn't necessarily be true if the values were real numbers.)

We'll get back to (2).

To understand why (3) is true, we need to remember exactly how divides is defined in the textbook:

An integer p divides an integer q (written \(p \mid q\)) if and only if q = pn, for some integer n.

I claim that every integer divides 0. To see this, let's set up the equation from the definition: \(0 = pn\). No matter what value p has, we can make this equation balance by setting n to be zero.

Now remember that gcd(a,b) is the largest integer that divides both a and b. Since everything divides zero, a is the largest integer that divides both 0 and a. So \(\gcd(a,0)= a\).

In case you were wondering, gcd(0,0) isn't defined. Since all integers divide zero, there is no maximum divisor.

Now we need to show that resetting the variable values in our loop doesn't change the value of gcd(a,b). That is:

Claim: for any positive integers a and b, gcd(a,b) = gcd(b,remainder(a,b)).

It's actually just as easy to prove a more general claim:

Claim 2: for any integers a, b,q,r, where b is positive, if a = bq + r, then gcd(a,b) = gcd(b,r).

Here's the outline of how to prove this:

- Suppose a = bq + r
- Any k that divides a and b also divides r. i.e. any common factor of a and b is also a factor of r.
- Any k that divides b and r also divides a.
- So if we make the common factors into sets, we have {common factors of a and b} = {common factors of b and r}
- Since the two sets are equal, the maximum values in the two sets must be equal. That is, gcd(a,b) = gcd(b,r).

We will prove (2) in the next video. The proof for (3) is extremely similar, and left as an exercise. Proving these will complete the proof that our gcd algorithm is correct.