9 Four times \(\pi ^2/6\)
The first proof consists in two different evaluations of the double integral
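\[
I \;:=\; \int _0^1 \!\! \int _0^1 \frac{1}{1-xy}\, dx \, dy .
\]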
For the first one, we expand \(\frac{1}{1-xy}\) as a geometric series, decompose the summands as products, and integrate effortlessly:
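\[
I \;=\; \int _0^1 \!\! \int _0^1 \sum _{n \ge 0} (xy)^n \, dx \, dy
\;=\; \sum _{n \ge 0} \Bigl( \int _0^1 x^n \, dx \Bigr) \Bigl( \int _0^1 y^n \, dy \Bigr)
\;=\; \sum _{n \ge 0} \frac{1}{(n+1)^2}
\;=\; \sum _{n \ge 1} \frac{1}{n^2}
\;=\; \zeta (2).
\]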
This evaluation also shows that the double integral (over a positive function with a pole at \(x=y=1\)) is finite. Note that the computation is also easy and straightforward if we read it backwards — thus the evaluation of \(\zeta (2)\) leads one to the double integral \(I\).
The second way to evaluate \(I\) comes from a change of coordinates: in the new coordinates given by \(u := \frac{y+x}{2}\) and \(v := \frac{y-x}{2}\) the domain of integration is a square of side length \(\frac{1}{2}\sqrt{2}\), which we get from the old domain by first rotating it by \(45^\circ \) and then shrinking it by a factor of \(\sqrt{2}\). Substitution of \(x = u-v\) and \(y = u+v\) yields
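\[
\frac{1}{1-xy} \;=\; \frac{1}{1-u^2+v^2}.
\]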
To transform the integral, we have to replace \(dx \, dy\) by \(2 \, du \, dv\), to compensate for the fact that our coordinate transformation reduces areas by a constant factor of \(2\) (which is the Jacobi determinant of the transformation). The new domain of integration, and the function to be integrated, are symmetric with respect to the \(u\)-axis, so we just need to compute two times (another factor of \(2\) arises here!) the integral over the upper half domain, which we split into two parts in the most natural way:
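\[
I \;=\; 4 \int _0^{1/2} \Bigl( \int _0^{u} \frac{dv}{1-u^2+v^2} \Bigr) du
\;+\; 4 \int _{1/2}^{1} \Bigl( \int _0^{1-u} \frac{dv}{1-u^2+v^2} \Bigr) du .
\]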
Using \(\int \frac{dx}{a^2+x^2} = \frac{1}{a} \arctan \frac{x}{a} + C\), this becomes
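\[
I \;=\; 4 \int _0^{1/2} \frac{1}{\sqrt{1-u^2}} \arctan \frac{u}{\sqrt{1-u^2}} \, du
\;+\; 4 \int _{1/2}^{1} \frac{1}{\sqrt{1-u^2}} \arctan \frac{1-u}{\sqrt{1-u^2}} \, du .
\]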
These integrals can be simplified and finally evaluated by substituting \(u = \sin \theta \) resp. \(u = \cos \theta \). But we proceed more directly, by computing that the derivative of \(g(u) := \arctan \left(\frac{u}{\sqrt{1-u^2}}\right)\) is \(g'(u) = \frac{1}{\sqrt{1-u^2}}\), while the derivative of \(h(u) := \arctan \left(\frac{1-u}{\sqrt{1-u^2}}\right) = \arctan \left(\sqrt{\frac{1-u}{1+u}}\right)\) is \(h'(u) = -\frac{1}{2} \frac{1}{\sqrt{1-u^2}}\). So we may use \(\int _a^b f'(x) f(x) dx = \left[ \frac{1}{2} f(x)^2 \right]_a^b = \frac{1}{2} f(b)^2 - \frac{1}{2} f(a)^2\) and get
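\[
I \;=\; 4 \Bigl[ \tfrac{1}{2}\, g(u)^2 \Bigr]_0^{1/2} \;-\; 8 \Bigl[ \tfrac{1}{2}\, h(u)^2 \Bigr]_{1/2}^{1}
\;=\; 2\, g\bigl(\tfrac{1}{2}\bigr)^2 + 4\, h\bigl(\tfrac{1}{2}\bigr)^2
\;=\; 2 \Bigl(\frac{\pi }{6}\Bigr)^2 + 4 \Bigl(\frac{\pi }{6}\Bigr)^2
\;=\; \frac{\pi ^2}{6},
\]
since \(g(0) = h(1) = 0\) and \(g\bigl(\tfrac{1}{2}\bigr) = h\bigl(\tfrac{1}{2}\bigr) = \arctan \frac{1}{\sqrt{3}} = \frac{\pi }{6}\). Together with the first evaluation \(I = \zeta (2)\), this proves \(\zeta (2) = \frac{\pi ^2}{6}\).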
For the second proof, we start from the sum of the reciprocals of the odd squares, \(\sum _{k \ge 0} \frac{1}{(2k+1)^2} = \zeta (2) - \sum _{k \ge 1} \frac{1}{(2k)^2} = \frac{3}{4}\, \zeta (2)\). As above, we may express this sum as a double integral, namely
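\[
J \;:=\; \int _0^1 \!\! \int _0^1 \frac{1}{1-x^2y^2}\, dx \, dy \;=\; \sum _{k \ge 0} \frac{1}{(2k+1)^2}.
\]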
So we have to compute this integral \(J\). And for this Beukers, Calabi and Kolk proposed the new coordinates
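\[
u \;:=\; \arccos \sqrt{\frac{1-x^2}{1-x^2y^2}}
\qquad \text{and} \qquad
v \;:=\; \arccos \sqrt{\frac{1-y^2}{1-x^2y^2}}.
\]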
To compute the double integral, we may ignore the boundary of the domain, and consider \(x, y\) in the range \(0 {\lt} x {\lt} 1\) and \(0 {\lt} y {\lt} 1\). Then \(u, v\) will lie in the triangle \(u {\gt} 0, v {\gt} 0, u+v {\lt} \pi /2\). The coordinate transformation can be inverted explicitly, which leads one to the substitution
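\[
x \;=\; \frac{\sin u}{\cos v}
\qquad \text{and} \qquad
y \;=\; \frac{\sin v}{\cos u}.
\]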
It is easy to check that these formulas define a bijective coordinate transformation between the interior of the unit square \(S = \{ (x, y) : 0 \le x, y \le 1\} \) and the interior of the triangle \(T = \{ (u, v) : u, v \ge 0, u+v \le \pi /2\} \). Now we have to compute the Jacobi determinant of the coordinate transformation, and magically it turns out to be
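\[
\frac{\partial (x,y)}{\partial (u,v)}
\;=\; \det \begin{pmatrix} \dfrac{\cos u}{\cos v} & \dfrac{\sin u \sin v}{\cos ^2 v} \\[2mm] \dfrac{\sin u \sin v}{\cos ^2 u} & \dfrac{\cos v}{\cos u} \end{pmatrix}
\;=\; 1 - \frac{\sin ^2 u \, \sin ^2 v}{\cos ^2 u \, \cos ^2 v}
\;=\; 1 - x^2 y^2 .
\]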
But this means that the integral that we want to compute is transformed into
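\[
J \;=\; \int \!\! \int _{T} 1 \, du \, dv ,
\]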
which is just the area \(\frac{1}{2} (\frac{\pi }{2})^2 = \frac{\pi ^2}{8}\) of the triangle \(T\).
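In view of \(\sum _{k \ge 0} \frac{1}{(2k+1)^2} = \frac{3}{4}\, \zeta (2)\), this again gives
\[
\zeta (2) \;=\; \tfrac{4}{3}\, J \;=\; \tfrac{4}{3} \cdot \frac{\pi ^2}{8} \;=\; \frac{\pi ^2}{6}.
\]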
For the third proof, the first step is to establish a remarkable relation between values of the (squared) cotangent function. Namely, for all \(m \ge 1\) one has
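\[
\cot ^2\Bigl(\frac{\pi }{2m+1}\Bigr) + \cot ^2\Bigl(\frac{2\pi }{2m+1}\Bigr) + \cdots + \cot ^2\Bigl(\frac{m\pi }{2m+1}\Bigr)
\;=\; \frac{2m(2m-1)}{6}.
\]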
To establish this, we start with the relation \(e^{ix} = \cos x + i \sin x\). Taking the \(n\)-th power \(e^{inx} = (e^{ix})^n\), we get
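\[
\cos nx + i \sin nx \;=\; (\cos x + i \sin x)^n .
\]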
The imaginary part of this is
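\[
\sin nx \;=\; \binom{n}{1} \cos ^{n-1}x \, \sin x \;-\; \binom{n}{3} \cos ^{n-3}x \, \sin ^3 x \;\pm\; \cdots .
\]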
Now we let \(n = 2m+1\), while for \(x\) we will consider the \(m\) different values \(x = \frac{r\pi }{2m+1}\), for \(r = 1, 2, \dots , m\). For each of these values we have \(nx = r\pi \), and thus \(\sin nx = 0\), while \(0 {\lt} x {\lt} \frac{\pi }{2}\) implies that for \(\sin x\) we get \(m\) distinct positive values.
In particular, we can divide the formula for \(\sin nx\) above by \(\sin ^n x\), which yields
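\[
0 \;=\; \binom{n}{1} \cot ^{n-1}x \;-\; \binom{n}{3} \cot ^{n-3}x \;\pm\; \cdots ,
\]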
that is,
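\[
0 \;=\; \binom{2m+1}{1} \cot ^{2m}x \;-\; \binom{2m+1}{3} \cot ^{2m-2}x \;\pm\; \cdots \;+\; (-1)^m \binom{2m+1}{2m+1}
\]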
for each of the \(m\) distinct values of \(x\). Thus for the polynomial of degree \(m\)
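\[
p(t) \;:=\; \binom{2m+1}{1}\, t^m \;-\; \binom{2m+1}{3}\, t^{m-1} \;\pm\; \cdots \;+\; (-1)^m \binom{2m+1}{2m+1}
\]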
we know \(m\) distinct roots
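\[
\cot ^2\Bigl(\frac{\pi }{2m+1}\Bigr),\ \ \cot ^2\Bigl(\frac{2\pi }{2m+1}\Bigr),\ \ \dots ,\ \ \cot ^2\Bigl(\frac{m\pi }{2m+1}\Bigr).
\]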
The roots are distinct because \(\cot ^2 x = \cot ^2 y\) implies \(\sin ^2 x = \sin ^2 y\) and thus \(x = y\) for \(x, y \in \{ \frac{r\pi }{2m+1} : 1 \le r \le m \} \).
Hence the polynomial coincides with
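\[
p(t) \;=\; \binom{2m+1}{1} \Bigl(t - \cot ^2\tfrac{\pi }{2m+1}\Bigr) \Bigl(t - \cot ^2\tfrac{2\pi }{2m+1}\Bigr) \cdots \Bigl(t - \cot ^2\tfrac{m\pi }{2m+1}\Bigr).
\]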
Comparison of the coefficients of \(t^{m-1}\) in \(p(t)\) now yields that the sum of the roots is
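\[
\cot ^2\Bigl(\frac{\pi }{2m+1}\Bigr) + \cot ^2\Bigl(\frac{2\pi }{2m+1}\Bigr) + \cdots + \cot ^2\Bigl(\frac{m\pi }{2m+1}\Bigr)
\;=\; \frac{\binom{2m+1}{3}}{\binom{2m+1}{1}}
\;=\; \frac{2m(2m-1)}{6},
\]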
which proves the cotangent identity claimed above.
We also need a second identity, of the same type,
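\[
\csc ^2\Bigl(\frac{\pi }{2m+1}\Bigr) + \csc ^2\Bigl(\frac{2\pi }{2m+1}\Bigr) + \cdots + \csc ^2\Bigl(\frac{m\pi }{2m+1}\Bigr)
\;=\; \frac{2m(2m+2)}{6}
\]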
for the cosecant function \(\csc x = \frac{1}{\sin x}\). But
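\[
\csc ^2 x \;=\; \frac{1}{\sin ^2 x} \;=\; \frac{\cos ^2 x + \sin ^2 x}{\sin ^2 x} \;=\; \cot ^2 x + 1,
\]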
so we can derive the cosecant identity from the cotangent identity by adding \(m\) to both sides of the equation.
Now the stage is set, and everything falls into place. We use that in the range \(0 {\lt} y {\lt} \frac{\pi }{2}\) we have
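\[
0 \;{\lt}\; \sin y \;{\lt}\; y \;{\lt}\; \tan y ,
\]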
and thus
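\[
\cot y \;{\lt}\; \frac{1}{y} \;{\lt}\; \csc y ,
\]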
which implies
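\[
\cot ^2 y \;{\lt}\; \frac{1}{y^2} \;{\lt}\; \csc ^2 y .
\]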
Now we take this double inequality, apply it to each of the \(m\) distinct values of \(x\), and add the results. Using the cotangent identity for the left-hand side, and the cosecant identity for the right-hand side, we obtain
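\[
\frac{2m(2m-1)}{6} \;{\lt}\; \Bigl(\frac{2m+1}{\pi }\Bigr)^2 + \Bigl(\frac{2m+1}{2\pi }\Bigr)^2 + \cdots + \Bigl(\frac{2m+1}{m\pi }\Bigr)^2 \;{\lt}\; \frac{2m(2m+2)}{6},
\]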
that is,
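\[
\frac{\pi ^2}{6}\, \frac{2m}{2m+1}\, \frac{2m-1}{2m+1} \;{\lt}\; \sum _{r=1}^{m} \frac{1}{r^2} \;{\lt}\; \frac{\pi ^2}{6}\, \frac{2m}{2m+1}\, \frac{2m+2}{2m+1}.
\]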
Both the left-hand and the right-hand side converge to \(\frac{\pi ^2}{6}\) for \(m \to \infty \): end of proof.
The first trick in the fourth proof is to consider the Gregory–Leibniz series \(\frac{\pi }{4} = \sum _{n=0}^\infty \frac{(-1)^n}{2n+1}\) in the doubly-infinite form \(\sum _{n=-\infty }^\infty \frac{(-1)^n}{2n+1}\). As for negative \(n = -k {\lt} 0\) we get the same terms as for \(n = k-1 \ge 0\), since \(\frac{(-1)^{-k}}{2(-k)+1} = \frac{(-1)^k}{-(2k-1)} = \frac{(-1)^{k-1}}{2(k-1)+1}\), we infer that \(\sum _{n=-N}^N \frac{(-1)^n}{2n+1}\) converges to \(\pi /2\) as \(N \to \infty \), and thus the square of this sum converges to \(\pi ^2/4\). You may write this as
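\[
\frac{\pi ^2}{4} \;=\; \lim _{N \to \infty } \Bigl( \sum _{m=-N}^{N} \frac{(-1)^m}{2m+1} \Bigr) \Bigl( \sum _{n=-N}^{N} \frac{(-1)^n}{2n+1} \Bigr)
\;=\; \lim _{N \to \infty } \sum _{m=-N}^{N} \sum _{n=-N}^{N} \frac{(-1)^{m+n}}{(2m+1)(2n+1)}.
\]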
The double sum may be interpreted as the sum of all entries of a square matrix of size \((2N+1) \times (2N+1)\), and we know that for \(N \to \infty \) this sum of all entries tends to \(\pi ^2/4\). We want to know, however, that the sum of only the diagonal entries, for \(m=n\), also tends to \(\pi ^2/4\),
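\[
\lim _{N \to \infty } \sum _{n=-N}^{N} \frac{1}{(2n+1)^2} \;=\; \frac{\pi ^2}{4},
\]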
because then \(\sum _{n=0}^\infty \frac{1}{(2n+1)^2} = \pi ^2/8\) will follow, and this, as we know, is equivalent to Euler’s theorem. So let’s show that the sum of all off-diagonal terms tends to \(0\)! We write \(\delta _N\) for this sum, and use a prime to denote that the diagonal terms with \(m=n\) are deleted, so
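\[
\delta _N \;:=\; \sum _{m=-N}^{N} \ {\sum _{n=-N}^{N}}' \ \frac{(-1)^{m+n}}{(2m+1)(2n+1)}.
\]
By the partial fraction decomposition \(\frac{1}{(2m+1)(2n+1)} = \frac{1}{2(n-m)} \bigl( \frac{1}{2m+1} - \frac{1}{2n+1} \bigr)\) for \(m \ne n\), and by the symmetry between \(m\) and \(n\), this can be rearranged into
\[
\delta _N \;=\; \sum _{m=-N}^{N} \frac{(-1)^m}{2m+1} \ {\sum _{n=-N}^{N}}' \ \frac{(-1)^n}{n-m}.
\]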
We only need to show that the terms
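\[
c_{m,N} \;:=\; {\sum _{n=-N}^{N}}' \ \frac{(-1)^n}{n-m}
\]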
are small enough in absolute value. What do we know about them? It is easy to see that \(c_{-m,N} = -c_{m,N}\), so in particular \(c_{0,N} = 0\). Thus we may assume that \(m {\gt} 0\), and note that the summands for \(n = m+k\) and \(n = m-k\) cancel as long as they are in the range between \(-N\) and \(N\), that is, for \(1 \le k \le N-m\). Thus \(c_{m,N}\) equals the alternating sum of fractions of decreasing size given by the remaining terms, where the largest one occurs for \(n = m-(N-m)-1 = 2m-N-1\), that is \(m-n = N-m+1\). Hence
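\[
\bigl| c_{m,N} \bigr| \;\le\; \frac{1}{N-m+1} \qquad \text{for } m = 1, 2, \dots , N,
\]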
which implies that
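\[
\bigl| \delta _N \bigr| \;\le\; \sum _{m=1}^{N} \Bigl( \frac{1}{2m-1} + \frac{1}{2m+1} \Bigr) \frac{1}{N-m+1},
\]
since the summands for \(m\) and \(-m\) combine (using \(c_{-m,N} = -c_{m,N}\)) and the term for \(m = 0\) vanishes.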
This finally yields
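\[
\bigl| \delta _N \bigr| \;\le\; \sum _{m=1}^{N} \frac{2}{(2m-1)(N-m+1)}
\;=\; \frac{2}{2N+1} \sum _{m=1}^{N} \Bigl( \frac{2}{2m-1} + \frac{1}{N-m+1} \Bigr)
\;\le\; \frac{6}{2N+1} \bigl( 1 + \log N \bigr)
\]
(here we use \(\frac{1}{2m+1} \le \frac{1}{2m-1}\), the partial fraction decomposition \(\frac{2}{(2m-1)(N-m+1)} = \frac{2}{2N+1} \bigl( \frac{2}{2m-1} + \frac{1}{N-m+1} \bigr)\), and the estimates \(\frac{1}{2m-1} \le \frac{1}{m}\) and \(\sum _{k=1}^{N} \frac{1}{k} \le 1 + \log N\)),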
and this goes to \(0\) as \(N\) goes to infinity.