4 Representing numbers as sums of two squares

Lemma 4.1 Lemma 1

For primes \(p = 4m + 1\) the equation \(s^2 \equiv -1 (\mod p)\) has two solutions \(s \in \{ 1, 2, \dots , p - 1\} \), for \(p = 2\) there is one such solution, while for primes of the form \(p = 4m + 3\) there is no solution.

Proof ▶

For \(p = 2\) take \(s = 1\). For odd \(p\), we construct the equivalence relation on \(\{ 1, 2, \dots , p - 1\} \) that is generated by identifying every element with its additive inverse and with its multiplicative inverse in \(\mathbb {Z}_p\). Thus the “general” equivalence classes will contain four elements

\[ \{ x, -x, \overline{x}, -\overline{x}\} \]

since such a 4-element set contains both inverses for all its elements. However, there are smaller equivalence classes if some of the four numbers are not distinct:

\(x \equiv -x\) is impossible for odd \(p\).
\(x \equiv \overline{x}\) is equivalent to \(x^2 \equiv 1\). This has two solutions, namely \(x = 1\) and \(x = p - 1\), leading to the equivalence class \(\{ 1, p - 1\} \) of size 2.
\(x \equiv -\overline{x}\) is equivalent to \(x^2 \equiv -1\). This equation may have no solution or two distinct solutions \(x_0, p - x_0\): in this case the equivalence class is \(\{ x_0, p - x_0\} \).

The set \(\{ 1, 2, \dots , p - 1\} \) has \(p - 1\) elements, and we have partitioned it into quadruples (equivalence classes of size 4), plus one or two pairs (equivalence classes of size 2). For \(p - 1 = 4m + 2\) we find that there is only the one pair \(\{ 1, p - 1\} \), the rest is quadruples, and thus \(s^2 \equiv -1 \pmod{p}\) has no solution. For \(p - 1 = 4m\) there has to be the second pair, and this contains the two solutions of \(s^2 \equiv -1\) that we were looking for.

Lemma 4.2 Lemma 2

✓

No number \(n = 4m + 3\) is a sum of two squares.

Proof ▶

The square of any even number is \((2k)^2 = 4k^2 \equiv 0 \pmod{4}\), while squares of odd numbers yield \((2k + 1)^2 = 4(k^2 + k) + 1 \equiv 1 \pmod{4}\). Thus any sum of two squares is congruent to \(0\), \(1\) or \(2 \pmod{4}\). □

Proposition 4.3 First proof

Every prime of the form \(p = 4m + 1\) is a sum of two squares, that is, it can be written as \(p = x^2 + y^2\) for some natural numbers \(x,y \in \mathbb {N}\).

Proof ▶

Consider the pairs \((x', y')\) of integers with \(0 \leq x', y' \leq \sqrt{p}\), that is, \(x', y' \in \{ 0, 1, \dots , \lfloor \sqrt{p} \rfloor \} \). There are \((\lfloor \sqrt{p} \rfloor + 1)^2\) such pairs. Using the estimate \(\lfloor x \rfloor + 1 {\gt} x\) for \(x = \sqrt{p}\), we see that we have more than \(p\) such pairs of integers. Thus for any \(s \in \mathbb {Z}\), it is impossible that all the values \(x' - sy'\) produced by the pairs \((x', y')\) are distinct modulo \(p\). That is, for every \(s\) there are two distinct pairs

\[ (x', y'), (x'', y'') \in \{ 0, 1, \dots , \lfloor \sqrt{p} \rfloor \} ^2 \]

with \(x' - sy' \equiv x'' - sy'' \pmod{p}\). Now we take differences: We have \(x' - x'' \equiv s(y' - y'') \pmod{p}\). Thus if we define \(x := |x' - x''|\), \(y := |y' - y''|\), then we get

\[ (x, y) \in \{ 0, 1, \dots , \lfloor \sqrt{p} \rfloor \} ^2 \quad \text{with} \quad x \equiv \pm sy \pmod{p}. \]

Also we know that not both \(x\) and \(y\) can be zero, because the pairs \((x', y')\) and \((x'', y'')\) are distinct.

Now let \(s\) be a solution of \(s^2 \equiv -1 \pmod{p}\), which exists by Lemma 1. Then \(x^2 \equiv s^2 y^2 = -y^2 \pmod{p}\), and so we have produced

\[ (x, y) \in \mathbb {Z}^2 \quad \text{with} \quad 0 {\lt} x^2 + y^2 {\lt} 2p \quad \text{and} \quad x^2 + y^2 \equiv 0 \pmod{p}. \]

But \(p\) is the only number between \(0\) and \(2p\) that is divisible by \(p\). Thus \(x^2 + y^2 = p\): done!

Proposition 4.4 Second proof

Every prime of the form \(p = 4m + 1\) is a sum of two squares, that is, it can be written as \(p = x^2 + y^2\) for some natural numbers \(x,y \in \mathbb {N}\).

Proof ▶

We study the set

\[ S := \{ (x, y, z) \in \mathbb {Z}^3 : 4xy + z^2 = p, \ x {\gt} 0, \ y {\gt} 0\} . \]

This set is finite. Indeed, \(x \geq 1\) and \(y \geq 1\) implies \(y \leq \frac{p}{4}\) and \(x \leq \frac{p}{4}\). So there are only finitely many possible values for \(x\) and \(y\), and given \(x\) and \(y\), there are at most two values for \(z\).

The first linear involution is given by

\[ f : S \to S, \quad (x, y, z) \mapsto (y, x, -z), \]

that is, “interchange \(x\) and \(y\), and negate \(z\).” This clearly maps \(S\) to itself, and it is an involution: Applied twice, it yields the identity. Also, \(f\) has no fixed points, since \(z = 0\) would imply \(p = 4xy\), which is impossible. Furthermore, \(f\) maps the solutions in

\[ T := \{ (x, y, z) \in S : z {\gt} 0\} \]

to the solutions in \(S \setminus T\), which satisfy \(z {\lt} 0\). Also, \(f\) reverses the signs of \(x - y\) and of \(z\), so it maps the solutions in

\[ U := \{ (x, y, z) \in S : (x - y) + z {\gt} 0\} \]

to the solutions in \(S \setminus U\). For this we have to see that there is no solution with \((x - y) + z = 0\), but there is none since this would give \(p = 4xy + z^2 = 4xy + (x - y)^2 = (x + y)^2\).

What do we get from the study of \(f\)? The main observation is that since \(f\) maps the sets \(T\) and \(U\) to their complements, it also interchanges the elements in \(T \setminus U\) with these in \(U \setminus T\). That is, there is the same number of solutions in \(U\) that are not in \(T\) as there are solutions in \(T\) that are not in \(U\) — so \(T\) and \(U\) have the same cardinality.
The second involution that we study is an involution on the set \(U\):

\[ g : U \to U, \quad (x, y, z) \mapsto (x - y + z, y, 2y - z). \]

First we check that indeed this is a well-defined map: If \((x, y, z) \in U\), then \(x - y + z {\gt} 0\), \(y {\gt} 0\) and \(4(x - y + z)y + (2y - z)^2 = 4xy + z^2\), so \(g(x, y, z) \in S\). By \((x - y + z) - y + (2y - z) = x {\gt} 0\) we find that indeed \(g(x, y, z) \in U\).

Also \(g\) is an involution: \(g(x, y, z) = (x - y + z, y, 2y - z)\) is mapped by \(g\) to \(((x - y + z) - y + (2y - z), y, 2y - (2y - z)) = (x, y, z)\).

And finally \(g\) has exactly one fixed point:

\[ (x, y, z) = g(x, y, z) = (x - y + z, y, 2y - z) \]

implies that \(y = z\), but then \(p = 4xy + y^2 = (4x + y)y\), which holds only for \(y = z = 1\) and \(x = \frac{p - 1}{4}\).

But if \(g\) is an involution on \(U\) that has exactly one fixed point, then the cardinality of \(U\) is odd.
The third, trivial, involution that we study is the involution on \(T\) that interchanges \(x\) and \(y\):

\[ h : T \to T, \quad (x, y, z) \mapsto (y, x, z). \]

This map is clearly well-defined, and an involution. We combine now our knowledge derived from the other two involutions: The cardinality of \(T\) is equal to the cardinality of \(U\), which is odd. But if \(h\) is an involution on a finite set of odd cardinality, then it has a fixed point: There is a point \((x, y, z) \in T\) with \(x = y\), that is, a solution of

\[ p = 4x^2 + z^2 = (2x)^2 + z^2. \]

Proposition 4.5 Third proof

Every prime of the form \(p = 4m + 1\) is a sum of two squares, that is, it can be written as \(p = x^2 + y^2\) for some natural numbers \(x,y \in \mathbb {N}\).

Proof ▶

Again we fix a prime number \(p = 4n + 1\) and consider the set of solutions

\[ T = \{ (x, y, z) \in \mathbb {N}^3 : 4xy + z^2 = p\} . \]

Each element of this set gives rise to a winged square: This is the figure consisting of a square and four rectangles in the plane that you get if you start with a square of side length \(z\) and at each vertex attach a rectangle of side-lengths \(x\) and \(y\) in a rotation-symmetric way, such that the edge of length \(x\) points away from the square, while the edge of length \(y\) runs along the side of the square.

We consider two winged squares “the same” if they are congruent. One way to make this unique, such that the representation of the winged square depends only on its boundary curve, is to require that the L formed by the two edges in the upper right-hand corner is at least as high as it is wide. If this condition is not satisfied, then a mirror image (reflected, e.g., in a vertical axis), will repair this. So each solution in \(T\) corresponds to a unique winged square of area \(4xy + z^2 = p\), and indeed this is reversible: From each winged square we can read off a solution.

Taking the union of the square and the four rectangles, we get for each winged square what we will call a unique winged shape: This is a polyomino of area \(p\) with four-fold rotation symmetry, which has twelve vertices: eight convex ones with inner right angle and four non-convex ones with outer right angle. (We can’t get a square shape, since \(p\) is a prime, so it can’t be a square number.) Again we will consider winged shapes “the same” if they are congruent, so we might assume that the L shape in the upper right-hand corner is at least as high as it is wide.

Now we are getting very close to the punch line: For each winged shape we get either one or two winged squares, by simultaneously drawing, in a rotation-symmetric way, vertical and horizontal lines to the interior starting at the non-convex vertices. We get only one solution if the shape has the symmetry of a square, that is, if the two arms of the L shapes have the same length. This happens exactly if \(y = z\), but then \(p = 4xz + z^2 = (4x + z)z\); assuming that \(p\) is a prime, this implies that \(z = 1\) and \(x = n\). In other words: Exactly one winged shape yields a single winged square, while all other winged shapes yield two winged squares each. Consequently, the number \(|T|\) of winged squares is odd.

However, the winged squares with non-square rectangles (with \(x \neq y\)) come in pairs, as we can always flip the four rectangular wings between vertical and horizontal format (that is, exchange \(x\) and \(y\)). As \(|T|\) is odd, this implies that there is an odd number of winged squares whose wings are squares, that is, \(T\) contains an odd number of triples \((x, y, z)\) with \(x = y\), and hence at least one, and this yields a solution to \((2x)^2 + z^2 = p\).

Theorem 4.6

A natural number \(n\) can be represented as a sum of two squares if and only if every prime factor of the form \(p = 4m + 3\) appears with an even exponent in the prime decomposition of \(n\).

Proof ▶

Call a number \(n\) representable if it is a sum of two squares, that is, if \(n = x^2 + y^2\) for some \(x, y \in \mathbb {N}_0\). The theorem is a consequence of the following five facts.

\(1 = 1^2 + 0^2\) and \(2 = 1^2 + 1^2\) are representable. Every prime of the form \(p = 4m + 1\) is representable.
The product of any two representable numbers \(n_1 = x_1^2 + y_1^2\) and \(n_2 = x_2^2 + y_2^2\) is representable: \(n_1 n_2 = (x_1 x_2 + y_1 y_2)^2 + (x_1 y_2 - x_2 y_1)^2\).
If \(n\) is representable, \(n = x^2 + y^2\), then also \(nz^2\) is representable, by \(nz^2 = (xz)^2 + (yz)^2\).

Facts (1), (2) and (3) together yield the “if” part of the theorem.

If \(p = 4m + 3\) is a prime that divides a representable number \(n = x^2 + y^2\), then \(p\) divides both \(x\) and \(y\), and thus \(p^2\) divides \(n\). In fact, if we had \(x \not\equiv 0 \pmod{p}\), then we could find \(\overline{x}\) such that \(x\overline{x} \equiv 1 \pmod{p}\), multiply the equation \(x^2 + y^2 \equiv 0\) by \(\overline{x}^2\), and thus we would obtain that \(1 + y^2 \overline{x}^2 = 1 + \left(\overline{x} y\right)^2 \equiv 0 \pmod{p}\), which is impossible for \(p = 4m + 3\) by Lemma 1.
If \(n\) is representable, and \(p = 4m + 3\) divides \(n\), then \(p^2\) divides \(n\), and \(n/p^2\) is representable. This follows from (4), and completes the proof. □