Saturday, March 27, 2021

The Fermat–Lévy constant

Undergraduate analysis is full of computations of strange trigonometric series, and I posted on Twitter, as a challenge, to compute the following constant: $$ c = \sum_{p=3}^\infty \frac1{p^2} \int_0^{2\pi} \left( \sum_{n=1}^\infty \frac1{n^2} \cos(n^p x)\right)^3\, dx. $$ As some followers quickly remarked, there could be some irony with this exercise because, quite unexpectedly, they could solve it using the solution to Fermat's Last Theorem. So I'll start with the origin of this problem.

(Light) philosophy of mathematics

This constant came as a screen capture from a 1950 paper by Paul Lévy, Axiome de Zermelo et nombres transfinis (Zermelo's axiom and transfinite numbers), where he discusses various reasons to accept, or refuse, Zermelo's principle according to which every set can be endowed with a well-ordering.

Zermelo had proved this principle in 1905, and we know today that it is equivalent to the “axiom of choice”. In fact, given a set $S$, Zermelo chooses once and for all, for every nonempty subset $A$ of $S$, some element $f(A)$ of $A$, and it seems that it is precisely this simultaneous infinity of choices that led his contemporaries to realize the necessity of the axiom of choice as an axiom, first for philosophical reasons, because they objected to the very possibility of making so many choices at once.

We know today that Zermelo's theorem is equivalent to the axiom of choice, that this axiom is essentially innocuous (Gödel, 1938, showing that if set theory without the axiom of choice is consistent, then set theory with the axiom of choice is consistent), but that it is also unprovable from the other axioms of set theory, say those (ZF) proposed by Zermelo and Fraenkel (Cohen, 1963, showing that if set theory without the axiom of choice is consistent, then so is set theory where the axiom of choice fails).

We also know how this axiom, whose usefulness is recognized in the proofs of many theorems of algebra (existence and uniqueness of an algebraic closure, existence of prime or maximal ideals, of bases for vector spaces, etc.), also leads to paradoxical constructions (such as nonmeasurable sets, or the Banach–Tarski paradoxical decomposition of the unit sphere into finitely many parts that can be recombined into two unit spheres). For many modern mathematicians, the question is now that of a risk–benefit balance: using this axiom on safe grounds, say. However, for mathematicians such as Lebesgue, Borel or Brouwer, who were inspired by a constructive or intuitionistic philosophy of mathematics, this axiom had to be rejected, because they objected to the idea of using objects that cannot be named.

In his 1950 paper, Paul Lévy discusses these objections and explains why he rejects them. As he recognizes, the nature of language only allows for countably many mathematical formulas, hence we can only name countably many mathematical objects, and we know – this is an 1874 theorem of Cantor – that the continuum, for example, is not countable. There must exist, says Paul Lévy, numbers that cannot be named, « des nombres innommables ». I should add that Loren Graham and Jean-Michel Kantor wrote a book on that topic, Naming Infinity: A True Story of Religious Mysticism and Mathematical Creativity, which contains a whole chapter on what they call “the French trio: Borel, Lebesgue, Baire”.

At this point, Lévy introduces the constant whose definition starts this note. Let us now read what Lévy tells us (my translation):

Do mathematical objects exist in themselves, or only as products of the human mind and of our human logic? The point of view of Mr Brouwer regarding the truth of the statements of theorems forces us to ask the question.

Let us consider a precise statement, to fix ideas, that of Fermat's theorem. For us, it is true or false, and the sum $$ c = \sum_{p=3}^\infty \frac1{p^2} \int_0^{2\pi} \left( \sum_{n=1}^\infty \frac{\cos (n^p x)}{n^2}\right)^3\, dx $$ is zero in the first case and positive in the second. If it is false, one can check that it is false (although we do not know in advance how much time is needed). If it is true, an infinite number of checks would be necessary to assert it, and nothing guarantees a priori that this infinite number can be replaced by a finite reasoning. So there are three possible hypotheses (for some theorems, there would be four): the theorem is false; it is true and provable; it is true and unprovable.

In this latter case, Mr Brouwer refuses to consider this theorem as true, and the constant $c$ is neither zero nor positive (nor negative, of course).

Let us detail his argument a little bit. Lévy says that his strangely defined constant $c$ is zero in case FLT is true, and is strictly positive otherwise. But what is its value? For some mathematicians such as Brouwer, this constant can only have a value if we can prove that we are in one case or the other. At the time when Lévy wrote his article, in 1950, one did not know yet whether Fermat's last theorem was true or false; it would only be proved in 1995! There is one way to prove that FLT is false: it consists in enumerating all quadruples $(x,y,z,p)$, where $x,y,z$ are strictly positive integers and $p$ is a prime number $\geq 3$, until one reaches some quadruple where $x^p+y^p=z^p$; this guarantees that $c>0$, Lévy says. But if FLT is true, then this enumeration will go on indefinitely and we will never reach the conclusion. Lévy has an objection to this disappointment, which I find quite amusing, based on the paradox of Zeno of Elea: imagine that we are quicker and quicker at each verification, say twice as fast each time. We could need 1 second for the first quadruple, half a second for the second one, a quarter of a second for the third one, and so on, so that the whole verification would take 2 seconds. This is obviously a thought experiment and I do not know whether physics forbids such a phenomenon, which in any case Lévy qualifies as supernatural. Computer scientists could object as well, because the complexity of the required computations grows at each step. But let us stop this philosophical discussion and go back to Lévy's constant.
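For the fun of it, the enumeration that Lévy has in mind is easy to write down, if not to finish. Here is a minimal sketch (the function names and the search bound are mine; the bound caps both the exponent and the variables, so it only explores a finite corner of the infinite search):

```python
def is_prime(p):
    """Trial-division primality test, enough for a toy search."""
    return p >= 2 and all(p % d for d in range(2, int(p ** 0.5) + 1))

def search_flt_counterexample(bound):
    """Look for x^p + y^p = z^p with 1 <= x <= y < z <= bound and
    prime exponent 3 <= p <= bound; return the quadruple or None."""
    for p in range(3, bound + 1):
        if not is_prime(p):
            continue
        for z in range(2, bound + 1):
            target = z ** p
            for x in range(1, z):
                for y in range(x, z):
                    if x ** p + y ** p == target:
                        return (x, y, z, p)
    return None

print(search_flt_counterexample(30))  # None
```

Of course, since Wiles we know this prints `None` for every bound; before 1995, each completed bound only pushed the question further.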

Computing the Lévy constant

The indicated sum is a series, indexed by the prime numbers $p\geq 3$ (we could consider all integers as well), of some integral from $0$ to $2\pi$ of some trigonometric expression, divided by $p^2$; I will denote this integral by $c_p$.

The point is that $c_p$ is always positive or zero, and that $c_p$ is strictly positive if and only if Fermat's last theorem fails for exponent $p$. Lévy does not spell this out, but the purpose of the division by $p^2$ is to make sure that the infinite series converges, whatever its sum.

Since cosines are at most $1$ in absolute value, the series of the $\cos(n^p x)/n^2$ converges normally, and its sum is a continuous function of $x$ which I will denote by $f(x)$. One has $c_p=\int_0^{2\pi} f(x)^3\,dx$. (Since $|f|$ is bounded independently of $p$, for example by $\pi^2/6$, each integral is uniformly bounded, and the division by $p^2$ makes the whole series convergent.) To understand this integral, let us expand the cube.

What we get is a sum, indexed by triples of integers $(k,m,n)$, of terms of the form $\cos(k^px)\cos(m^px)\cos(n^px)/k^2m^2n^2$, and we need to integrate each of them from $0$ to $2\pi$.

Several methods are possible, such as knowing one's formulas for products of cosines, but I prefer replacing the cosines by their complex exponential expression (Euler's formula), so that each term gets replaced by a sum of 8 terms of the form $e^{\pm i k^p x} e^{\pm i m^p x} e^{\pm i n^p x} / 8 k^2m^2n^2$. Let us group the exponentials: we get $e^{i(\pm k^p\pm m^p\pm n^p)x}/8k^2m^2n^2$, whose integral from $0$ to $2\pi$ is equal to $\pi/4k^2m^2n^2$ if the coefficient of $x$ vanishes, and is zero otherwise.

In other words, the triple $(k,m,n)$ contributes a strictly positive quantity to the sum as soon as some choice of signs makes $\pm k^p\pm m^p\pm n^p=0$, that is, as soon as it furnishes a counterexample to Fermat's theorem for exponent $p$. Amusing, isn't it? Yes, but you will admit that it is essentially tautological… However…
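This expansion can even be watched numerically on a truncated series. Since the truncated $f$ is a trigonometric polynomial, the trapezoidal rule on a fine enough uniform grid computes $\int_0^{2\pi}f(x)^3\,dx$ exactly, up to rounding. A sketch of mine (the truncation level $N$ and grid size $M$ are arbitrary choices, subject to $M > 3N^p$):

```python
import math

def c_p(p, N=10, M=20000):
    """Approximate c_p = int_0^{2pi} f(x)^3 dx for the truncated
    f(x) = sum_{n=1}^N cos(n^p x)/n^2, by the trapezoidal rule on a
    uniform grid; exact (up to rounding) when M exceeds the degree 3*N^p."""
    h = 2 * math.pi / M
    total = 0.0
    for j in range(M):
        x = j * h
        f = sum(math.cos(n ** p * x) / n ** 2 for n in range(1, N + 1))
        total += f ** 3
    return total * h

print(abs(c_p(3)) < 1e-8)  # True: no signed sum of two or three cubes <= 10^3 vanishes
print(c_p(1) > 1)          # True: for p = 1, triples like 1 + 1 = 2 contribute pi/16 each
```

The contrast between $p=3$ and $p=1$ is exactly the tautology above: the integral sees the solutions of $\pm k^p\pm m^p\pm n^p=0$ and nothing else.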

The circle method

I now reach deeper, or in any case mathematically more efficient, considerations, and wish to say a few words about the circle method. I'll take as an example the problem that the English mathematician Waring posed in 1770.

In his Arithmetica, Diophantus had asked whether every integer is a sum of 4 perfect squares of integers, and Lagrange proved in 1770 that this is indeed the case. Waring generalized the question to higher powers: is it true that every integer is the sum of at most 9 perfect cubes of integers, or of 19 fourth powers, etc.? These two numbers, 9 and 19, correspond to the fact that we need 9 cubes to write 23 ($23 = 2^3 + 2^3 + 1+1+1+1+1+1+1$) and 19 fourth powers to write 79 ($79 = 4\cdot 2^4 + 15\cdot 1$). So Waring was asking whether this necessary bound was, in fact, sufficient.
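These two counts are easy to double-check with a tiny dynamic program (a sketch of mine, in the spirit of the classical coin-change recursion): it computes the minimal number of positive $k$th powers summing to a small $N$.

```python
def min_powers(N, k):
    """Minimal number of positive k-th powers summing to N (small N only),
    by dynamic programming over all intermediate sums."""
    powers = [n ** k for n in range(1, int(round(N ** (1 / k))) + 1) if n ** k <= N]
    INF = float("inf")
    best = [0] + [INF] * N  # best[m] = minimal count of k-th powers summing to m
    for m in range(1, N + 1):
        best[m] = 1 + min(best[m - q] for q in powers if q <= m)
    return best[N]

print(min_powers(23, 3))  # 9
print(min_powers(79, 4))  # 19
```

For 23 the program rediscovers $2\cdot 2^3 + 7\cdot 1$, and for 79 it rediscovers $4\cdot 2^4 + 15\cdot 1$.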

We know today that this is the case for cubes (Wieferich, 1909; Kempner, 1912) and for fourth powers (Balasubramanian, Deshouillers, Dress, 1986). We also know that for every integer $k\geq3$ there is some smallest integer $g(k)$ such that every integer is the sum of at most $g(k)$ perfect $k$th powers, as well as an upper bound which conjecturally is an equality, namely $g(k)=2^k+\lfloor (3/2)^k\rfloor - 2$.
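As a sanity check, this formula reproduces the classical small values ($g(2)=4$, $g(3)=9$, $g(4)=19$); the one-liner below evaluates it with exact integer arithmetic, since $\lfloor (3/2)^k\rfloor = \lfloor 3^k/2^k\rfloor$:

```python
def g_conjectural(k):
    """Conjectural value g(k) = 2^k + floor((3/2)^k) - 2, computed exactly."""
    return 2 ** k + 3 ** k // 2 ** k - 2

print([(k, g_conjectural(k)) for k in range(2, 6)])  # [(2, 4), (3, 9), (4, 19), (5, 37)]
```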

However, this question seems to me rather less interesting, when $k$ grows, than its asymptotic analogue: every large enough integer is the sum of at most $G(k)$ perfect $k$th powers, because for this problem a much better bound is known to hold: $G(k)\leq 2k \log(k)+\cdots$ (Vinogradov, Karatsuba, Vaughan, Wooley…).

And to establish these results, trigonometric series strike back, at the heart of the circle method of Hardy and Littlewood.

To write $N$ as a sum of $k$th powers, the idea is to define $f(x)$ as the sum, over all integers $n$ smaller than $P=N^{1/k}$, of $\exp(2i\pi n^k x)$, and to observe that the number of solutions of Waring's equation $n_1^k+\cdots+n_r^k=N$ is given by the integral from $0$ to $1$ of the quantity $f(x)^r \exp(-2i\pi Nx)$.
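For small $N$ this identity can be tested literally: the integrand is a trigonometric polynomial, so averaging it over more than $rN$ equally spaced points of $[0;1]$ gives the integral exactly. A sketch of mine (names and parameters are not standard), compared against brute-force enumeration:

```python
import cmath
from itertools import product

def waring_count(N, k, r):
    """Number of ordered r-tuples of positive integers with
    n_1^k + ... + n_r^k = N, via the circle-method integral
    int_0^1 f(x)^r exp(-2 i pi N x) dx, discretized exactly."""
    M = r * N + 1  # more sample points than the degree of the integrand
    ns = [n for n in range(1, N + 1) if n ** k <= N]
    total = 0.0
    for m in range(M):
        x = m / M
        f = sum(cmath.exp(2 * cmath.pi * 1j * (n ** k) * x) for n in ns)
        total += (f ** r * cmath.exp(-2 * cmath.pi * 1j * N * x)).real
    return round(total / M)

def brute(N, k, r):
    """Direct enumeration of the same ordered representations."""
    ns = [n for n in range(1, N + 1) if n ** k <= N]
    return sum(1 for t in product(ns, repeat=r) if sum(n ** k for n in t) == N)

print(waring_count(23, 3, 9), brute(23, 3, 9))  # 36 36
```

(The 36 ordered representations of 23 as 9 cubes are the placements of the two cubes of 2 among nine slots.)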

At this point, we have been no less tautological than Paul Lévy was, and it required the genius of Hardy, Littlewood and their followers to observe that this integral can actually be evaluated. To that aim, the idea is to split the interval $[0;1]$ into two regions: first, small intervals (the major arcs) centered around rational numbers of the form $a/q$, where $q$ is “small”; then the complementary set (the minor arcs). Roughly, the contribution of the major arcs can be reduced to trigonometric sums in finite fields, using the Chinese remainder theorem and other tricks, and it turns out to have a simple asymptotic expansion – this is number theory. When the Gods favor us, the contribution of the minor arcs can be bounded efficiently, using more analytic methods, such as the Cauchy–Schwarz inequality or Weyl differencing. This requires the integer $r$ to be large enough, and then one gets an asymptotic expansion for the number of ways of writing $N$ as a sum of $r$ perfect $k$th powers, which is strictly positive for $N$ large enough!

What one needs to remember is that this works only for $r$ quite large. In other words, we can study Diophantine equations whose number of variables is very large compared to the degree. An article by Birch (which I have not studied well enough…) is called Forms in many variables. Alas, this is completely inoperative for questions such as Fermat's last theorem, which has 3 variables and quite a large degree. On the other hand, the circle method is very robust: it allows the equation to be modified – Birch's paper is very flexible – and we know that small modifications of an equation can give rise to dramatic differences in its behaviour.

So, as computer scientists would say, this may be a feature rather than a bug.