Extended Euclidean algorithm for non-negative integers

2022-08-29
tl;dr: here's a Rust function (playground link) that performs the "extended Euclidean algorithm".
```rust
fn extended_gcd(a: usize, b: usize) -> (usize, usize, usize) {
    let remainder = a % b;
    if remainder == 0 {
        return (b, 0, if a == b { 0 } else { 1 });
    }
    let quotient = a / b;
    let (gcd, x, y) = extended_gcd(b, remainder);
    let new_y = (((b * x + remainder - gcd) / remainder) * quotient + x + a - quotient)
        % (a / gcd); // what in the 0x5F3759DF?
    (gcd, y, new_y)
}
```
It assumes `a >= 0` and `b > 0` (it divides by `b`, and so crashes if `b == 0`) and outputs `(d, x, y)` such that:
- \(d = \gcd(a, b)\);
- \(0 \le x < b/d\) and \(0 \le y < a/d\) (if \(a > 0\); otherwise \(y = 1\));
- \(ax + by \equiv d \pmod{\operatorname{lcm}(a, b)}\).
It will also not underflow, but might overflow if `a` or `b` is too big. It uses `usize` as the type as an example, but it will work with any unsigned integer type, in particular big-integer types that support the required arithmetic operations. It would not be much harder to avoid recursive calls (or to handle the case \(b = 0\), once you decide what the output there should be), but I think it's tricky enough already.
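To make the contract concrete, here is a small usage sketch (the function is repeated verbatim so the snippet runs on its own; the sample inputs are mine):

```rust
// The function from the top of the post, repeated so this snippet is self-contained.
fn extended_gcd(a: usize, b: usize) -> (usize, usize, usize) {
    let remainder = a % b;
    if remainder == 0 {
        return (b, 0, if a == b { 0 } else { 1 });
    }
    let quotient = a / b;
    let (gcd, x, y) = extended_gcd(b, remainder);
    let new_y = (((b * x + remainder - gcd) / remainder) * quotient + x + a - quotient)
        % (a / gcd);
    (gcd, y, new_y)
}

fn main() {
    // gcd(7, 5) = 1, and 7*3 + 5*3 = 36 = 1 + lcm(7, 5).
    assert_eq!(extended_gcd(7, 5), (1, 3, 3));
    // gcd(10, 6) = 2, and 10*2 + 6*2 = 32 = 2 + lcm(10, 6).
    assert_eq!(extended_gcd(10, 6), (2, 2, 2));
    // a = 0: d = b, x = 0, y = 1 (see the discussion of 0 at the end of the post).
    assert_eq!(extended_gcd(0, 5), (5, 0, 1));
}
```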
This code is in Rust because that's the only language with a built-in unsigned integer type that I'm comfortable using. If you want to port it to another language, keep in mind the conventions of how `/` and `%` work. For reference, the function as written assumes that the quotient is rounded towards \(0\), so `a / b` is \(\lfloor a/b \rfloor\) and `a % b` satisfies `a = (a / b) * b + (a % b)`.
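In Rust itself these conventions are easy to check; a throwaway sketch (sample values mine):

```rust
fn main() {
    let (a, b): (usize, usize) = (23, 7);
    assert_eq!(a / b, 3); // quotient rounded towards 0 (= floor, for unsigned values)
    assert_eq!(a % b, 2); // 23 - 3 * 7
    assert_eq!((a / b) * b + (a % b), a); // the identity relied on above
}
```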
The rest of this post is about why this should be called the extended Euclidean algorithm.
Abstract(?) algebra
Let's talk about gcds. Given \(a, b \in \mathbf{Z}\), not both \(0\), there are a few possible definitions of \(\gcd(a, b)\):
- \(\gcd(a, b)\) is the largest among the common divisors of \(a\) and \(b\).
- \(\gcd(a, b)\) is the smallest positive integer that can be written in the form \(ax + by\) with \(x, y \in \mathbf{Z}\).
- \(\gcd(a, b)\) is the unique non-negative generator of the subgroup/ideal \(a\mathbf{Z} + b\mathbf{Z}\) of \(\mathbf{Z}\).
- \(\gcd(a, b)\) is what the Euclidean algorithm returns when started with the arguments \(a\) and \(b\).
If we relax the meaning of largest and smallest, these definitions are leveraging increasing amounts of "niceness" for the ring of integers:
- \(\mathbf{Z}\) is a unique factorization domain, so, up to units, it has meets (and joins) as a poset under divisibility.
- \(\mathbf{Z}\) is a Bezout domain, ie the meet \(a \land b\) can be written as a linear combination of \(a\) and \(b\).
- \(\mathbf{Z}\) is a principal ideal domain, so the ideal \(a\mathbf{Z} + b\mathbf{Z}\) is generated by a single element.
- \(\mathbf{Z}\) is a Euclidean domain, ie has a well-defined sense of division with smaller remainder, and hence is a domain for running the Euclidean algorithm.
You might notice that definitions 2 and 3 are very close to each other, in fact they are generally equivalent: a finitely generated ideal in a Bezout domain must be principal. (Therefore a Bezout domain that is not a PID has to be non-Noetherian. The most reasonable example seems to be the ring of algebraic integers, see eg this math.stackexchange question.)
Given definition 4, it seems a pertinent question how "computable" the other definitions are. A reasonable interpretation of this for definition 1 is to exhibit the entire set of common divisors of \(a\) and \(b\), ie find the factors of \(d = \gcd(a, b)\). This is believed to be hard.
This brings us to the extended Euclidean algorithm, which given \(a, b \in \mathbf{Z}\), not both \(0\), computes \(x, y \in \mathbf{Z}\) such that \(ax + by = \gcd(a, b)\). This is not much more than the usual Euclidean algorithm: you just keep track of the quotients. The key point is that if dividing \(a\) by \(b\) gives quotient \(q\) and remainder \(r\) then \(r = a - q b\), so we can recursively/inductively express the remainder at each step as an explicit linear combination of \(a\) and \(b\).
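For contrast with the non-negative version above, here is a sketch of this classic signed formulation (the name `egcd` and the choice of `i64` are mine):

```rust
// Classic extended Euclidean algorithm on signed integers.
// Returns (d, x, y) with a*x + b*y == d == gcd(a, b) (d >= 0 when a, b >= 0).
fn egcd(a: i64, b: i64) -> (i64, i64, i64) {
    if b == 0 {
        return (a, 1, 0);
    }
    let (d, x, y) = egcd(b, a % b);
    // a = (a/b)*b + (a%b), so d = b*x + (a%b)*y = a*y + b*(x - (a/b)*y).
    (d, y, x - (a / b) * y)
}

fn main() {
    let (d, x, y) = egcd(240, 46);
    assert_eq!(d, 2);
    assert_eq!(240 * x + 46 * y, d); // Bezout identity holds exactly
}
```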
Uniqueness
How unique are these \(x\) and \(y\)? Not at all: \(ax + by = a(x + nb) + b (y - na)\) for each \(n \in \mathbf{Z}\), or in fact for \(n \in \frac{1}{d} \mathbf{Z}\). In fact that is exactly the amount of ambiguity: \(x\) can be changed to anything in the coset \(x + (b/d)\mathbf{Z}\), along with a corresponding change of \(y\) to something in \(y + (a/d)\mathbf{Z}\). Note that independent of these choices, \(ax\) and \(by\) take values in fixed cosets of \((ab/d)\mathbf{Z} = \operatorname{lcm}(a, b) \mathbf{Z}\) (and add up to \(d + (ab/d)\mathbf{Z}\)).
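A quick numeric sanity check of this (sample values mine): the shift \(x \mapsto x + n(b/d)\), \(y \mapsto y - n(a/d)\) leaves \(ax + by\) unchanged exactly, not just modulo \(\operatorname{lcm}(a, b)\):

```rust
fn main() {
    let (a, b, d): (i64, i64, i64) = (6, 4, 2); // gcd(6, 4) = 2
    let (x0, y0) = (1i64, -1i64); // 6*1 + 4*(-1) = 2 = d
    for n in -5i64..=5 {
        let (x, y) = (x0 + n * (b / d), y0 - n * (a / d));
        // a*(n*b/d) and b*(-n*a/d) are both n*ab/d up to sign, so they cancel.
        assert_eq!(a * x + b * y, d);
    }
}
```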
What about uniqueness of \(d = \gcd(a, b)\)? The generator of a principal ideal is only defined up to units. Similarly divisibility is unaffected by multiplication by units. So if we want the gcd to be unique, the sense of "greatest" has to involve the ordering of integers, and not just the divisibility order.
Positive integers
We can try to fix this by adopting the convention of identifying subgroups of \(\mathbf{Z}\) with their unique non-negative generator, and restricting our domain to \(\mathbf{N} = \mathbf{Z}_{\ge 0}\). Let's see how the definitions of \(\gcd\) are affected:
- \(\gcd(a, b)\) is still the largest among the common divisors of \(a\) and \(b\). Now \(\mathbf{N}\) is an actual poset with meets.
- \(\gcd(a, b)\) in general cannot be written in the form \(ax + by\) with \(x, y \in \mathbf{N}\). The smallest positive integer of this form is \(\min(a, b)\) (unless one of \(a, b\) is \(0\), in which case it is the other one).
- The submonoid/cone \(a\mathbf{N} + b\mathbf{N}\) of \(\mathbf{N}\) is not necessarily "principal" (ie generated by a single element) and is in general smaller than \(\mathbf{N} \cap (a\mathbf{Z} + b\mathbf{Z})\).
- The Euclidean algorithm still works and still returns \(\gcd(a, b)\) when started with the arguments \(a\) and \(b\).
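The claim in the second point can be brute-forced for small inputs; a throwaway sketch (ranges are mine):

```rust
// Brute-force check: for a, b > 0, the smallest positive value of a*x + b*y
// with x, y in N is min(a, b) -- any term with x >= 1 already contributes >= a,
// and likewise for y, so nothing positive below min(a, b) is reachable.
fn main() {
    for a in 1usize..=12 {
        for b in 1usize..=12 {
            let smallest = (0..=b)
                .flat_map(|x| (0..=a).map(move |y| a * x + b * y))
                .filter(|&v| v > 0)
                .min()
                .unwrap();
            assert_eq!(smallest, a.min(b));
        }
    }
}
```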
The extended Euclidean algorithm no longer works, since \(a - bq\) is no longer a valid operation. So we seem to have broken things.
All hope is not lost. Remember that our goal was to compute the cosets \(x + (b/d)\mathbf{Z}\) and \(y + (a/d)\mathbf{Z}\), after replacing subgroups by their non-negative generators. In this description, what should we represent cosets by? The subgroup generated by \(0\) behaves differently, so let's ignore that. The cosets of \(n \in \mathbf{N}_{> 0}\) (ie of \(n\mathbf{Z}\)) exactly correspond to elements in \([n] := \{0, \dots, n - 1\}\) (this choice is better than \(\{1, \dots, n\}\) since it corresponds to the usual convention of the operation \(a \mapsto a \mathop{\%} n = a \bmod n\)). So the question translates to, given \(a, b \in \mathbf{N}_{> 0}\), finding \(d = \gcd(a, b)\), \(x \in [b/d]\) and \(y \in [a/d]\) such that \((ax + by) \bmod (ab/d) = d\). This is exactly what the Rust function in the beginning does.
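As a sanity check of exactly that statement, here is a small property test over a range of inputs (the function is repeated verbatim so the snippet runs on its own; the ranges are mine). By the uniqueness discussion above, the pair \((x, y)\) in these ranges is in fact unique.

```rust
// The function from the top of the post, repeated so this snippet is self-contained.
fn extended_gcd(a: usize, b: usize) -> (usize, usize, usize) {
    let remainder = a % b;
    if remainder == 0 {
        return (b, 0, if a == b { 0 } else { 1 });
    }
    let quotient = a / b;
    let (gcd, x, y) = extended_gcd(b, remainder);
    let new_y = (((b * x + remainder - gcd) / remainder) * quotient + x + a - quotient)
        % (a / gcd);
    (gcd, y, new_y)
}

// Reference gcd, for checking the first component independently.
fn gcd(a: usize, b: usize) -> usize {
    if b == 0 { a } else { gcd(b, a % b) }
}

fn main() {
    for a in 1usize..=40 {
        for b in 1usize..=40 {
            let (d, x, y) = extended_gcd(a, b);
            assert_eq!(d, gcd(a, b));
            assert!(x < b / d && y < a / d); // x in [b/d], y in [a/d]
            // (a*x + b*y) mod (ab/d) = d mod (ab/d); the right side is d itself
            // except in the degenerate case d = lcm(a, b) (ie a = b).
            assert_eq!((a * x + b * y) % (a * b / d), d % (a * b / d));
        }
    }
}
```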
Finer points of the implementation
Let \(a = bq + r\) with \(r \in [b]\), ie \(0 \le r < b\). The key step is to find \(x\) and \(y\) given \(x'\) and \(y'\) with \(d \equiv bx' + ry' \pmod {br/d}\). Substituting,
\(bx' + ry' = bx' + (a-bq)y' = ay' + b(x' - qy').\)
So \(x = y'\). For \(y\), note that the hypothesis is only a congruence: \(bx' + ry' = d + m \cdot br/d\) for some integer \(m \ge 0\), and adding \(b \cdot mqb/d\) to both sides of the substituted equation gives \(ay' + b(x' - qy' + mqb/d) = d + m \cdot ab/d\), so \(y = (x' - qy' + mqb/d) \bmod (a/d)\). I will leave it as an exercise for the reader to figure out why the arcane expression `(((b * x' + r - d) / r) * q + x' + a - q) % (a / d)` is equivalent to this (and is ensured to not underflow). Suggestions for simpler equivalent expressions are welcome.
The recursive call can be avoided by reworking the above inductive step "outside-in" instead of "inside-out". I have not checked what expressions you get but I would guess it involves more magic.
The divisors of \(0\)
Let's say \(a = 0\) and \(b > 0\). Then \(d = \gcd(a, b) = b\), \(a/d = 0\), \(b/d = 1\) and \(ab/d = 0\). So we want to find \(x \in [1]\) and \(y\) representing a coset of \(0\mathbf{Z}\), ie \(y \in \mathbf{Z}\) such that \(d \equiv ax + by \pmod{0}\), ie \(d = by\). So it's reasonable to say that the output here should be \(x = 0\) and \(y = 1\). The implementation above handles only this case and not the symmetric case \(a > 0\), \(b = 0\), but it would be easy to modify it to do so.
What about \(a = b = 0\)? Now \(d\) should be \(0\), so \(a/d\), \(b/d\) and \(ab/d\) don't make sense. Going back to the original definition, any \(x\) and \(y\) satisfy \(ax + by = d\), so \(x\) and \(y\) are well-defined as elements of \(\mathbf{Z}/\mathbf{Z}\), so by our convention we should take \(x = y = 0\). This is probably the most sensible answer in this case, but that's not saying much.