Wednesday, May 13, 2020

Using texexec to modify a PDF file (from the shell)

We all regularly have to modify a PDF file, either taking a part of it. The texexec program which is shipped with ConTeXt allows to do most of it, but its man page is not very helpful and its documentation is hard to find online. The previous link does not have all the important information, this other explanation (PDF) is also useful.

Here are some commands that I find the most useful, but never remind of.

The following command extract pages 1 to 5, page 7 and pages 8 to 12 from file.pdf and put it in outputfile.pdf

texexec --pdfselect --select=1:5,7,8:12 --result=outputfile.pdf file.pdf

(To be continued)

Sunday, April 12, 2020

Some proofs of the quadratic reciprocity law

Gauss's quadratic reciprocity law is a mysterious relation between prime numbers. If $p$ and $q$ are distinct prime numbers, it shows that there is a relation between the property that $p$ is a square modulo $q$, and the property that $q$ is a square modulo $p$: both are not independent. Gauss has given many proofs of this law, the first two of them were published in his 1801 Disquisitiones arithmeticae. Franz Lemmermeyer references 246 proofs of the quadratic reciprocity law on his website! In his book Reciprocity laws, from Eisenstein to Kronecker, he adds that “modern number theory dates from the discovery of the reciprocity law” (preface, p. v).

In this post, I want to discuss a few proofs of this law, that will range to the most basic to the most elaborate. But first, the statement.

Stating the quadratic reciprocity law

If $p$ is an odd prime number, we denote by $\mathbf F_p$ the field with $p$ elements. In this field, exactly half of the nonzero elements are squares. This is embodied in the following exact sequence of abelian groups:
\[1 \to \{\pm1\} \to \mathbf F_p^\times \xrightarrow{a\mapsto a^2} \mathbf F_p^\times \xrightarrow {a\mapsto a^{(p-1)/2}} \{\pm1\} \to 1 .\]
This means that the image of every morphism is the kernel of the next one. In explicit terms:
  1. Squares are the image of the map $a\mapsto a^2$ whose kernel is $\{\pm1\}$;
  2. An element $a\in\mathbf F_p^\times$ is a square if and only if $a^{(p-1)/2}=1$; otherwise, $a^{(p-1)/2}=-1$.
To prove that this is indeed an exact sequence, we work left to right. The kernel of $a\mapsto a^2$ consists exactly of $\{\pm1\}$; the cardinality of the image of this map, the squares,  is thus equal to $(p-1)/2$. If $a\in\mathbf F_p^\times$, one has $(a^{(p-1)/2})^2=a^{p-1}=1$, by Fermat's little theorem, so that the image of $a\mapsto a^{(p-1)/2}$ is contained in $\{\pm1\}$. If $a$ is a square, writing $a=b^2$, one has $a^{(p-1)/2}=b^{p-1}=1$, so that the squares are contained in the kernel of this morphism. On the other hand, there are at most $(p-1)/2$ elements $a\in \mathbf F_p^\times $ such that $a^{(p-1)/2}=1$, and the squares account for all of them, so that non squares satisfy $a^{(p-1)/2}=-1$.

For $a\in\mathbf F_p$, or even $a\in\mathbf Z$, one classically denotes by $(\frac ap)$ the element of $\{0,1,-1\}$ which is equal to expression $a^{(p-1)/2}$ in $\mathbf F_p$. This is the Legendre symbol of $a$ modulo $p$.

Let now $q$ be another odd prime number. We can consider the Legendre symbols $(\frac qp)$ and $(\frac qp)$ of $q$ modulo $p$ and of $p$ modulo $q$. Thise are elements of $\{\pm1\}$ and the quadratic reciprocity law states:
\[ \big(\frac pq\big) \, \big(\frac qp\big) = (-1)^{(p-1)(q-1)/4}. \]
Since this may look frightening, let us make this slightly more explicit:
  1. If $p$ or $q$ is congruent to $1$ modulo $4$, then $p$ is a square modulo $q$ exactly when $q$ is a square modulo $p$.
  2. If $p$ and $q$ both are congruent to $3$ modulo $4$, then $p$ is a square modulo $q$ exactly when $q$ is not a square modulo $p$.
This property may be used to compute Legendre symbols relatively quickly, by successive euclidean divisions.

The complementary laws

A few particular cases are either important, or classical enough to deserve being quoted. They may also have relatively easier proofs which already indicate a relation between the quadratic reciprocity law and roots of unity.

$-1$ is a square modulo $p$ if and only if $p\equiv 1\pmod 4$.

Indeed, $(\frac{-1}p)=(-1)^{(p-1)/2}$, which is $1$ if and only if $(p-1)/2$ is even, that is $p\equiv 1\pmod 4$. A possibly simple way to understand this is the following: $-1$ is a square in $\mathbf F_p^\times$ if and only if there are elements of order $4$ in $\mathbf F_p^\times$, which is equivalent to the divisibility $4\mid p-1$ because the group $\mathbf F_p^\times $ is cyclic.

$-3$ is a square modulo $p$ if and only if $p \equiv 1\pmod 3$.

Indeed, the usual formula for the cubic roots of unity $(-1\pm i \sqrt 3)/2$ shows that $-3$ is a square modulo $p$ if and only if there are elements of order 3 in $\mathbf F_p^\times$, that is, if and only if $3\mid p-1$.

$3$ is a square modulo $p$ if and only if $p\equiv \pm1 \pmod{12}$.

Indeed, the formula $(\frac 3p)=(\frac{-1}p)(\frac{-3}p) $ shows that $3$ is a square modulo $p$ if and only if either $-3$ and $-1$ both are squares modulo $p$, or both are non-squares modulo $p$. The first case holds when $p\equiv 1\pmod 3$ and $p\equiv 1\pmod 4$, that is, $p\equiv 1\pmod{12}$; the second case holds when $p\equiv -1\pmod3$ and $p\equiv-1\pmod4$, that is, $p\equiv-1\pmod{12}$.

$2$ is a square modulo $p$ if and only if $p\equiv \pm1\pmod 8$.

We play the same game with octic roots of unity, classically given by $(\pm 1+\pm i)/\sqrt 2$. So if $-1$ and $2$ are squares modulo $p$, then there exists an octic root of unity in $\mathbf F_p$, and $p\equiv 1\pmod 8$, and conversely. In the case $p\equiv 1\pmod 4$, this show that $2$ is a square modulo $p$ if and only if $p\equiv 1\pmod 8$.

On the other hand, one has $p^2\equiv 1\pmod8$, whatever odd prime number $p$ is, so that there exists an octic root of unity $\omega$ in the field $\mathbf F_{p^2}$ with $p^2$ elements. The question is whether $\omega$ belongs to $\mathbf F_p$. This is settled by Fermat's little theorem again: it holds if and only if $\omega^p=\omega$, equivalently, $\omega^{p-1}=1$ or $p\equiv 1\pmod 8$, and we have not made a single step.

However, if $\omega=(1+i)/\sqrt 2$, say, the other octic roots of unity are $\omega^7=\omega^{-1}=(1-i)/\sqrt 2$, $\omega^3= (-1+i)/\sqrt 2$ and $\omega^5=-\omega=-(1+i)/\sqrt 2$, so that $\sqrt 2=\omega+\omega^{-1}$. Independely of the previous computations, choose an octic root of unity $\omega$ and set $\alpha=\omega+\omega^{-1}$. This is an element of $\mathbf F_{p^2}$ such that $\alpha^2=\omega^2+\omega^{-2}+2=2$ since $\omega^4=-1$. So the question is whether this element $\alpha$ belongs to $\mathbf F_p$ of not. This can be settled by Fermat's little theorem, and we lead to the conclusion of this complementary law. Indeed $\alpha^p=\omega^p+\omega^{-p}$ is equal to $\alpha$ when $p\equiv \pm1\pmod 4$, and to $-\alpha$ otherwise.

Gauss's lemma and a product formula for the Legendre symbol

Lemma. — (1) Let $S\subset\mathbf F_p^\times$ be a “half-system” modulo $p$, that is, a subset of cardinality $(p-1)/2$ such that every element of $\mathbf F_p^\times$ is of the form $\pm s$ for a unique $s\in S$. For $a\in \mathbf F_p^\times$, one has $ \big( \frac ap \big) = (-1) ^m $, where $m=\mathop{\rm Card}( aS \cap (-S))$.
(2) Let $f\colon \mathbf R\to\mathbf C^\times$ be an odd $1$-periodic function. If $p,q$ are distinct odd prime numbers, then
\[ \big( \frac qp\big) = \prod_{a=1}^{(p-1)/2}\frac{ f(qa/p) }{f(a/p)}. \]

Proof of (1). For $s\in S$, write $as=\varepsilon(s) s_a$, where $s_a\in S$ and $\varepsilon(s)\in\{\pm1\}$. The map $s\mapsto s_a$ is bijective, so that
\[ a^{(p-1)/2} \prod_{s\in S} s = \prod_{s\in S} as = \prod_{s\in S} \varepsilon(s) s_a =\prod_{s\in S} \varepsilon(s) \cdot \prod_{s\in S} s_ a= =\prod_{s\in S} \varepsilon(s) \cdot \prod_{s\in S} s.\]
Simplifying, we get $a^{(p-1)/2}=\prod_{s\in S}\varepsilon(s)$. In this product, there are $m$ factors equal to $-1$, and the other are equal to $1$, so that it equals $(-1)^m$.

Proof of (2). Since $f$ is $1$-periodic, we may define a function $F\colon \mathbf F_p\to \mathbf C^\times$ by setting $F(\alpha)=f(a/p)$ for every lift $a$ of $\alpha$. Then $S=\{1,\dots,(p-1)/2\}$ is a hals-system modulo $p$. It then suffices to compare $\prod_{s\in S} F(qs)$ and $\prod_{s\in S}F(s)$, similarly as in (1).

Application to the complementary law

The relevance of this lemma to the quadratic reciprocity proof is apparent if one considers the case $a=2$. Take $S=\{1,\dots,(p-1)/2\}$. Then, $\varepsilon(s)=1$ if $s\leq (p-1)/4$ and $\varepsilon(s)=-1$ otherwise.
Consequently, $m=(p-1)/4$ if $p\equiv 1\pmod 4$, and $m=(p+1)/4$ if $p\equiv 1\pmod 4$, the discrepancy resulting from the thereshold $(p-1)/4$ for $s$ being an integer or not. If $p\equiv1\pmod4$, this implies $(\frac2p)=1$ if and only if $p\equiv 1\pmod 8$; if $p\equiv 3\pmod3$, this implies $(\frac2p)=1$ if and only if $p\equiv -1\pmod 8$.

The proof of Kronecker — or is it Eisenstein? – and some variations

Eisenstein starts from Gauss's lemma with $f(x)=\sin(2\pi x)$. We get
\[\big( \frac qp\big) = \prod_{a=1}^{(p-1)/2} \frac{\sin(2\pi qa/p)}{\sin(2\pi a/p)}. \]
On the other hand, for every odd integer $q$, there exists a polynomial $U_q$, the Chebyshev polynomial of second kind, such that
\[\sin (qx) = \sin(x) U_q( 4\sin^2(x) ) .\]
The relation
\[\sin((q+2)x)+\sin((q-2)x) = 2\sin(qx)\cos(2x)=2\sin(qx)(1-2\cos^2(x))
=2\sin(qx)(1-2\sin^2(x)) \]
shows that
\[U_{q+2}(T)+U_{q-2}(T) = (2-T)U_q(T).\]
This recurrence relation is initiated with $U_1=1$ and $U_3=3-T$ (I'd even take $U_{-1}=-1$),
so that $U_q$ has integer coefficients, degree $(q-1)/2$ and its leading coefficient is $(-1)^{(q-1)/2}$.
Since $\sin(qx)$ vanishes for $x=k\pi/q\pmod{2\pi}$, the roots of $U_q$ are the $3\sin^2(k\pi/q)$, for $1\leq k\leq (q-1)/2$, and we have the factorization
\[ \sin (qx) = (-1)^{(q-1)/2} \sin(x) \prod_{k=1}^{(q-1)/2} \big( 4\sin^2(x) - 4\sin^2(k\pi/q ) \big)
\[\big( \frac qp\big) = (-4)^{(q-1)(p-1)/4}
\prod_{a=1}^{(p-1)/2} \prod_{k=1}^{(q-1)/2} \left( \sin ^2(2\pi a/p)- \sin^2(2\pi k/q)\right). \]
By symmetry,
\[ \big( \frac pq\big) = (-4)^{(q-1)(p-1)/4}
\prod_{a=1}^{(q-1)/2} \prod_{k=1}^{(p-1)/2} \left( \sin ^2(2\pi a/q)- \sin^2(2\pi k/p)\right). \]
The big product in the two expressions just differ by the sign of each factor. Consequently,
\[\big( \frac pq\big) = (-1)^{(p-1)(q-1)/4} \big( \frac qp\big), \]
which is the quadratic reciprocity law.

First variation (Kronecker)

The analytic property of the sine function is not really used.
In fact, one can deduce from Gauss's lemma the formula
\[ \big(\frac qp\big) = \mathop{\rm sign} \left( \prod_{a=1}^{(p-1)/2} \prod_{k=1}^{(q-1)/2} \big( \frac qp-\frac kq\big)\right) . \]
This implies the quadratic reciprocity law in a similar way.

Second variation (learnt from Jean-François Mestre)

The products appearing in Eisenstein's proof are reminiscent of resultants. In fact,
one has the following formula:
\[\big(\frac qp\big) = \mathop{\rm Res}(U_q,U_p), \]
unless I fucked up the signs.
This equality follows from the analysis above, since the zeroes of $U_q$ are $4\sin^2(k\pi/q)$, and those of $U_p$ are $4\sin^2(a\pi/p)$, but it can also be proved directly.
Indeed, $U_p$ and $U_q$ are polynomials with integer coefficients, so that their resultant is an integer. Moreover, the roots of $U_p$ are of the form $4\sin^2(a\pi/p)$, but is is better to write them $2(1-\cos(2a\pi /p)) = 2-\omega^a-\omega^{-a}$, where $\omega=e^{2i\pi/p}$ is a primitive $p$th root of unity. From this, one can observe than $U_p$ and $U_q$ have no common root, in whatever field of whatever characteristic. Necessarily, their resultant is $\pm1$. To compute the sign, we compute the resultant modulo $p$. Modulo $p$, one has $U_p=(-T)^{(p-1)/2}$, or something like that, and one finds the correct sign.

The proof of Zolotarev — signatures

Lemma (Zolotarev). — For every $a\in\mathbf F_p^\times$, the Legendre symbol $(\frac ap)$ is the signature of the permutation induced by multiplication by $a$ in $\mathbf F_p$.

Many proofs are possible. Let $\varepsilon(a)$ be this signature. One can use the definition of the signature as $(-1)$ to the number of inversions. This definition implies that
\[ \varepsilon(a)=\prod_{0\leq x<y\leq p-1} \frac{ay - ax}{y-x} = a^{(p-1)(p-2)/2}=a^{(p-1)/2} \]
since $p-2$ is odd.

On the other hand, the map $\varepsilon\colon\mathbf F_p^\times\to\{\pm1\}$ is a morphism of groups, and the Legendre symbol is the only nontrivial one. If one knows that the group $\mathbf F_p^\times$ is cyclic, it is enough to understand the case of a generator $a$; then multiplication by $a$ has two orbits, $\{0\}$ and $\mathbf F_p^\times$, so that $\varepsilon(a)=(-1)^{p-2}=-1$. This proves that $\varepsilon$ coincides with the Legendre symbol.

One can also derive Zolotarev's lemma from Gauss's lemma.

This lemma admits many generalizations, such as the following one, classically attributed to Frobenius-Zolotarev in French books for future high school teachers (agrégation). The reference I know of it is the 1970 paper Sur une généralisation des symboles de Legendre et de Jacobi by Pierre Cartier.

Lemma. — Let $n$ be a positive integer. For $g\in\mathrm{GL}(n,\mathbf F_p)$ be an invertible matrix, let $\varepsilon(g)$ be the permutation of $\mathbf F_p^n$ induced by $g$. One has $\varepsilon(g)=\big( \frac{\det(g)}p\big)$.

To prove this formula, one can use the fact that $\varepsilon$ is a morphism of groups with values in $\{\pm1\}$. It is trivial on transvections (because transvections have odd order), hence on the $\mathrm{SL}(n,\mathbf F_p)$ they generate, so that it factors through the determinant. If $g$ is diagonal $(a,1,\dots,1)$, one has $\det(g)=a$; on the other hand, the orbit of $(x_1,\dots,x_n)$ is $O(x_1)\times (x_2,\dots,x_n)$, so that it has $m_1p^{n-1}$ orbits, $m_1$ being the number of orbits of multiplication by $a$ on $\mathbf F_p$. Consequently, $\varepsilon(g)=(-1)^{p^n-m_1p^{n-1}}$. Since $p^{n-1}$ is odd, this implies $\varepsilon(g)=(-1)^{p-m_1}=\varepsilon(a)=(\frac ap)$ by Zolotarev's lemma.

It must be noted that only two formulas can hold. Since $\mathrm{SL}(n,\mathbf F_p)$ is the derived subgroup of $\mathrm{GL}(n,\mathbf F_p)$, any morphism to a commutative group factors through the determinant, so that there exists a morphism of groups $\chi\colon \mathbf F_p^\times\to\{\pm1\}$ such that $\varepsilon(g)=\chi(\det(g))$. But there are only two possibilities of $\chi$:
either $\chi$ is the Legendre symbol, or $\chi=1$. So it would have been sufficient to find a matrix $g\in\mathrm{GL}(n,\mathbf F_p)$ such that $\varepsilon(g)=-1$ to conclude. This, however, is not so easy — in the same way that it is not so easy to explicitly produce an element of $\mathbf F_p$ which is not a square.

Another proof of the quadratic reciprocity law

In his Nouvelle démonstration de la loi de réciprocité de Legendre, Zolotarev used this lemma to give another proof of Gauss's quadratic reciprocity law. To prove it, one may consider the permutation $\tau$ of $\mathbf Z/pq\mathbf Z$ that maps $a+pb$ to $b+qa$, for $0\leq a<p$ and $0\leq b<q$. Under the identification of $\mathbf Z/pq\mathbf Z$ with $\mathbf Z/p\mathbf Z\times\mathbf Z/q\mathbf Z$ provided by the Chinese remainder theorem, this permutation is the composite of two permutations, respectively $(x,y)\mapsto (qx+y,y)$ and the inverse of $(x,y)\mapsto (x,py+x)$. Using Zolotarev's lemma, one finds that their signatures are $(\frac qp)$ and $(\frac pq)$ respectively,
so that $\varepsilon(\tau)=(\frac qp )(\frac pq)$. On the other hand, $a+pb<a'+pb'$ if and only if $b<b'$ or $b=b'$ and $a< a'$, so that $\tau(a+pb)=b+qa$ and $\tau(a'+pb')=b'+q'a$ give an inversion if and only if $b<b'$ and $a>a'$ or $b>b'$ and $a<a'$. Taking two pairs $(a,a')$, $(b,b')$ of distinct elements, exactly one of the two arrangements $a+pb, a'+pb'$ or $a+pb',a'+pb$ gives an inversion, so that there are $p(p-1)/2 \cdot q(q-1)/2$ inversions. Since $pq$ is odd, this implies $\varepsilon(\tau)=(-1)^{(p-1)(q-1)/4}$.

Signatures and Galois theory

Galois theory furnishes another relation between Zolotarev's lemma and quadratic reciprocity. Let us indeed consider the finite extension $\mathbf F_q\subset  E$ generated by a primitive $p$th root of unity. This is the decomposition extension of the polynomial $T^p-1$ over $\mathbf F_q$. Its Galois group is generated by the map $a\mapsto a^q$ of $E$ whose action on the $p$th roots of unity is exactly the same as multiplication by $q$ on $\mathbf Z/p\mathbf Z$. By Zolotarev's lemma, the Galois group is contained in the alternate group of the roots if and only if $q$ is a square modulo $p$. On the other hand, there is a well known criterion for the Galois group of a polynomial to be alternate, namely, that the discriminant of the polynomial is a square. The discriminant of $T^p-1$ is classically computed as $(-1)^{p(p-1)/2} p^p  (-1)^{p-1}=(-1)^{(p-1)/2}p \cdot p^{p-1}$ since $p$ is odd. This proves that $(\frac qp)=1$ if and only if $(-1)^{(p-1)/2}p$ is a square in $\mathbf F_q$, in other words,
\[ \big( \frac qp\big) = \left( (-1)^{(p-1)/2} p \right)^{(q-1)/2} = (-1)^{(p-1)(q-1)/4} \big( \frac pq\big), \]
which proves once again the quadratic reciprocity law.

Proofs via Gauss sums

Let $\omega$ be a primitive $n$th root of unity in some field $K$. The Gauss sums are the expressions
\[ G(\omega)=\sum_{a=1}^{n-1} \omega^{a^2}. \]
If $n=p$ is a prime number, then this can be rewritten as follows, using the fact that the sum of all $p$th roots of unity vanishes:
\[ G(\omega)=\sum_{a=0}^{p-1} \big(\frac ap\big) \omega^{a}. \]
The following computation is of the same spirit as the computation done with octic roots of unity (except for the minor fact that $8$ is not an odd prime number).

Lemma. — One has $G(\omega)^2=(-1)^{(p-1)/2} p$.

This starts by a direct computation:
\[ G(\omega)^2=\sum_{a,b\in\mathbf F_p} \big(\frac ap\big) \big(\frac bp\big) \omega^{a+b}
= \sum_{x\in\mathbf F_p} \sum_{a\in\mathbf F_p} \big( \frac{a(x-a)}p\big) \omega^x. \]
Fix $x\in\mathbf F_p$.
If $x=0$, then one has $(\frac{a(x-a)}p)=(\frac{-a^2}p)=(-1)^{(p-1)/2}$ for every $a$, and the coefficient of $\omega^x$ is thus $(-1)^{(p-1)/2}p$.
Assume $x\neq0$. $a\neq 0$, one has $a(x-a)=-a^2(1-x/a)$, hence
\[ \big(\frac{a(x-a)}p\big)=(-1)^{(p-1)/2} \big( \frac{1-x/a}p\big) \]
and when $a$ varies, $1-x/a$ takes all values in $\mathbf F_p$ but $1$. Would it take all values, the sum of all $(\frac{1-x/a}p)$ would be zero, because there are as many squares as non squares, and $0$ does not count, so that the sum is $-1$. This gives
\[ G(\omega)^2=(-1)^{(p-1)/2}p - (-1)^{(p-1)/2} \sum_{x\in\mathbf F_p^\times} \omega^x
= (-1)^{(p-1)/2}p \]
because the sum of all non-trivial $p$th roots of unity vanishes.

Proving quadratic reciprocity via finite fields

One way to prove quadratic reciprocity from there is to take for $K$ a finite extension of $\mathbf F_q$, the second prime number. One then has
\[ G(\omega)^q = \sum_{a=0}^{p-1} \big( \frac ap\big)^q \omega^{aq}
= \sum_{b=0}^{p-1} \big(\frac{b/q}p\big) \omega^b , \]
using the change of variables $b=aq$. Since $(\frac{b/q}p\big)=(\frac bp) q^{(p-1)/2}$, we get
\[ G(\omega)^q = q^{(p-1)/2} \sum_{b=0}^{p-1} \big(\frac bp\big) \omega^b = \big(\frac qp\big) G(\omega),\]
\[ G(\omega)^{q-1} = \big( \frac qp\big). \]
Combining this result with the previous lemma, it follows that
\[ \left( (-1)^{(p-1)/2} p \right)^{(q-1)/2} = \big( \frac qp \big), \]
which implies quadratic reciprocity after some straightforward manipulations.

Proving quadratic reciprocity via complex numbers

Another way to prove quadratic reciprocity is to choose a specific root of unity in $\mathbf C$, namely $\omega=e^{2i\pi/n}$. If $n=p$ is an odd prime number, then $G(\omega)^2=(-1)^{(p-1)/2}p$ has two roots in $\mathbf C$, and one needs to identify which of them $G(\omega)$ is. Determining the sign of the Gauss sum proved to be a much more difficult problem, and Gauss proved it only later.

Theorem. — One has $G_n=\sqrt n$, $0$, $i\sqrt n$, $(1+i)\sqrt n$ according to $n\equiv 0, 1, 2, 3\pmod 4$.

Writing any integer in $\{1,\dots,pq\}$ as $py+qx$, with $x\in\{1,\dots,p\}$ and $y\in\{1,\dots,q\}$, one can prove that $G_{pq}=G_pG_q (\frac qp)(\frac pq)$. Given the explicit value for $G_p,G_q$ and $G_{pq}$, this implies the quadratic reciprocity law.

Wednesday, February 12, 2020

Dynamics of Hecke correspondences

This morning, Sebastián Herrero, Ricardo Menares and Juan Rivera-Letelier released a preprint on the arXiv entitled “p-Adic distribution of CM points and Hecke orbits. I. Convergence towards the Gauss point.” In fact, there are two other papers on the arXiv with similar titles, by two other groups of authors. The first one, “p-adic Dynamics of Hecke Operators on Modular Curves” is by
Eyal Z. Goren and Payman L Kassaei, and the second one is by Daniele Disegni, “p-adic equidistribution of CM points”.

In fact, as @_nilradical pointed out initially, the words “Hecke correspondences” are generally used in modular forms books to describe some (pairwise commuting) endomorphisms of the spaces of modular forms. So what are these authors talking about?

As he said: “literally what the hell is going on in math?”. My answer, at a time where I should have been doing something else, such as sleeping, was direct:

This is a very nice question! Take an elliptic curve over the $p$-adics, say, and draw its images under Hecke correspondences. You get a cloud of dots on the $p$-adic modular curve. Normalize by its size. What are the limit measures ? The picture is richer than over the complex.

what do u mean by "correspondences"? i know only about hecke operators to the extent of what is in chapter 3 of shimura's intro book and the end of serre's "course in arithmetic"

You remember that paragraph of Serre's Book where he says modular forms of wt k are functions on pairs (E,ω) — E, ell curve ; ω, nonzero inv diff form on E — with some homogeneity and holomorphy condition ? (Idea: τ -> ell curve C/(Z+τZ) and ω = dz) ...

yea, im ok with weight k modular forms being sections of \Omega^2k

In this point of view, the Hecke operator T_n sends a modular form f to the function that maps (E,ω) to the sum of all f(E',ω') where E' is a quotient E/D (D, cyclic subgroup of order n) and ω' I don't tell here. Now, forget about modular forms (and differential forms ω, as well).

And I went to explain those things in a few Twitter posts, but it's time to say it at a quieter pace.

1. Elliptic curves, modular curves, and modular correspondences.

The reason for the adjective “modular” in the expression “modular forms” or in the expression “modular curves” is that they refer to the “moduli” of some extremly classical geometric objects called elliptic curves. Elliptic curves have various descriptions, all equally beautiful and important: either, complex analytically, as one-dimensional tori, quotients of the complex plane $\mathbf C$ by a lattice $\Lambda\simeq\mathbf Z^2$, or as cubic plane curves given by a (Weierstrass) equation of the form $y^2=x^3+ax+b$ in the affine plane with coordinates $(x,y)$ — forgetting a point at infinity.

In fact, the “set” of all elliptic curve can themselved be arranged in a curve provided one identifies “isomorphic” elliptic curves, so that some simplifications arise. Not any lattice must be considered, one first may assume $\Lambda=\mathbf Z+\mathbf Z\tau$, for some complex number $\tau$ with strictly positive imaginary part, but then the complex number $\tau$ and any complex number of the form $(a\tau+b)/(c\tau+d)$ gives the same elliptic curve, for any matrix $\begin{pmatrix}a & b\\c & d\end{pmatrix}$ with integer entries and determinant $1$. So, in some sense, the set of elliptic curves coincides with the quotient of the (Poincaré) upper half-space by the action of the group $\mathrm{SL}(2,\mathbf Z)$. Alternatively, in the Weierstrass model, there are two parameters $(a,b)$, but all pairs of the form $(u^4a,u^6b)$ give the same curve (via the change of variables $x'=u^2x$, $y'=u^3y$). One can guess some subtleties here, because some the matrix $-I_2$ acts trivially, and also because some specific $\tau$ have a nontrivial stabilizer (even taking $-I_2$ into acount); on the other side, $u=-1$ does not act, and some pairs $(a,b)$ are fixed by some more roots of unity. I will ignore these here; technically, they are solved in two ways, one is to add a “level structure”, the other is to consider this modular curve $M$ as an orbifold — and algebraic geometers say stack.

“Correspondence” is a long-forgotten topic in set theory, since the time where multivalued functions were considered. We forgot them because it's hard to talk consistently about them, but there are instances in which they arise naturally. In set theory, one way to define a correspondence $T$ from a set $X$ to a set $Y$ is to consider its graph $G_T$, a subset of $X\times Y$. If the graph contains a pair $(x,y)$, one says that the correspondence maps $x$ to $y$; but the graph could also contain a pair $(x,y')$, in which case it also maps $x$ to $y'$. And it could very well contain no pair with first element $x$, and then $x$ has no image under the correspondence. Correspondences can be composed: if the correspondence $S$ maps $x$ to $y$ and a correspondence $T$ maps $y$ to $z$,
then $T\circ S$ maps $x$ to $z$. They can also be inverted: just flip the graph.

Modular curves admit natural modular correspondences, indexed by strictly positive integers.
Specifically, the correspondence $T_n$ maps an elliptic curve $E$ to all elliptic curves $E'$ of
the form $E'=E/D$, where $D$ is a cyclic subgroup of rank $n$ of $E$.
The graph of this correspondence, when drawn on the surface $M\times M$, has a natural structure of an algebraic curve.

2. Modular dynamics

One virtue of correspondences in algebraic geometry is that they act naturally on objects that are attached “linearly” to the curve. For example, considering a function $f$ (or a differential form on $M$), rather than looking at all the images $E'$ of $E$, one could add the corresponding values $f(E')$, and consider thus sum as a function of $E$. This is exactly what is done to define the Hecke correspondence on modular forms. Geometrically this corresponds to pulling back $f$ from $M$ to $M\times M$ (using the first projection), then restricting on the graph of the correspondence, then “pushing-out” (a kind of trace) to the second factor — this is where the sum happens.

Another way to linearize the correspondence is to formally add all images $E'$, for example considering that the correspondence maps a Dirac measure $\delta_E$ at a point $E$
to the sum of the Dirac measures $\delta_{E'}$ at its images; maybe dividing by the number of images so that one keeps a probability measure. In this framework, the correspondence $T_n$ is a map from the space of probability measures on the modular curve to itself.

Which is cool because you can now iterate the process and wonder about the possible limit measures.

This is analogous to what is done in complex dynamics, for example when you study the dynamics
of the map $z\mapsto z^2+c$ (here, $c$ is a parameter): a construction of the Julia set consists in taking the 2 preimages of a point $z$ (any point, with a few exceptions), their 4 preimages, etc. As long as the construction goes on, the cloud of points that one gets becomes closer and closer to the Julia set.

In the case of the modular curve and the dynamics of Hecke correspondences (on can compose a given Hecke correspondence, or consider $T_n$ for large $n$, it does not really matter), what happens is described by theorems of William Duke / Laurent Clozel and Emmanuel Ullmo / Rodolphe Richard.
Whatever point one starts with, whatever probability measure one starts with, the probability measures on $M$ that are constructed by the dynamics converge to the Poincaré measure on the modular curve — that is, to the measure that is given by the hyperbolic measure $dx\, dy/ y^2$ on the Poincaré upper half-plane, restricted to a fundamental domain of the action.

In fact, Duke and Clozel/Ullmo have different goals. What they consider is a sequence of probability measures formed by considering elliptic curves with complex multiplication and all their conjugates. The limit theorem that they prove is then quite subtle and relies on deep properties of Maass forms of half integral weight.

3. Modular dynamics: $p$-adic fields

Over the $p$-adics, the dynamics is even more complicated, although its behaviour does not seem to be governed by analytic properties of analytic objects such as Maass forms, but by the arithmetic of the elliptic curve one starts with.

A first difficulty comes from the framework required to define $p$-adic dynamics. One wants to start from the field $\mathbf Q_p$, but it is not algebraically closed, and so the construction of the images of a curve by the correspondence require to take its algebraic closure $\overline{\mathbf Q_p}$. But then $p$-adic analysis is not so cool, because although that field has a natural $p$-adic absolute value, it is not complete anymore — so let's take its completion, $\mathbf C_p$. And this field is algebraically closed.

However, and this is a reflection that the Galois theory of $\mathbf Q_p$ is complicated, the field $\mathbf C_p$ is not locally compact, so that measure theory on such a field is not very well behaved. For example, a basic tool in measure theory over compact (metrizable, say) spaces is that the set of probability measures is itself compact for the vague topology on measures, so that any sequence of probability measures has a converging subsequence, etc.

In our case, this won't hold anymore and one needs to consider a suitable “compactification” of the $p$-adic modular curve — in fact, its analytification $M_p$ in the sense of Berkovich. One then gets a locally compact topological space on which the Hecke correspondences still act naturally, and
questions of dynamics now can be formulated properly.

The only thing I'll write about the dynamics is that it is subtle: there are domains which are stable by the correspondences. For example, if the reduction mod $p$ of an elliptic curve $E$ is supersingular, then all of its images $E' = E/D$ also have supersingular reduction. In other words, the supersingular locus in the Berkovich modular curve $M_p$ is totally invariant by the Hecke correspondence — this locus is a finite union of open disks. This shows that there are many possible limit measure and the papers that I have quoted above study the various limit phenomaena.

They also consider the analogue of Duke's result in that setting. Disegni also considers the analogous problem for Shimura curves (instead of elliptic curves, they parameterize abelian surfaces with real multiplication).

This will be all for this blog spot, now go and read these papers!

Friday, December 27, 2019

Behaviour of conjugacy by reduction modulo integers

Let $A$ and $B\in\mathrm{M}_n(\mathbf Z)$ be two square matrices with integer coefficients. Assume that they are conjugate by $\mathrm{GL}_n(\mathbf Z)$, namely, that there exists a matrix $P\in\mathrm{GL}_n(\mathbf Z)$ such that $B=P^{-1}AP$. Then we can reduce this relation modulo every integer $d\geq 2$ and obtain a similar relation between the images of $A$ and $B$ in $\mathrm M_n(\mathbf Z/d\mathbf Z)$.

Almost the same holds if $A$ and $B$ are only conjugate by $\mathrm{GL}_n(\mathbf Q)$, except for a few exceptions: we just need to take care to reduce the relation modulo integers $d$ that are coprime to the denominators of the coefficients of $P$ or of $P^{-1}$.

I was quite surprised at first to learn that the converse assertion is false. There are matrices $A$ and $B$
in $\mathrm M_2(\mathbf Z)$ whose images modulo every integer $d\geq 2$ are conjugate, but which are not conjugate by a matrix in $\mathrm{GL}_2(\mathbf Z)$.

An example is given by Peter Stebe in his paper “Conjugacy separability of groups of integer matrices”, Proc. of the AMS, 32 (1), mars 1972, p. 1—7.
Namely, set
\[ A = \begin{pmatrix} 188 & 275 \\ 121 & 177 \end{pmatrix} = \begin{pmatrix} 11\cdot 17+1 & 25\cdot 11 \\ 11^2 & 11\cdot 16+1 \end{pmatrix} \]
\[ B = \begin{pmatrix} 188 & 11 \\ 3025 & 177 \end{pmatrix} =
\begin{pmatrix} 11\cdot 17+1 & 11 \\ 11^2\cdot 25 & 11\cdot 16+1 \end{pmatrix}. \]
These matrices $A$ and $B$ have integer coefficients, their determinant is $1$, hence they belong to $\mathrm{SL}_2(\mathbf Z)$. They also have the same trace, hence the same characteristic polynomial, which is $T^2-365T+1$. The discriminant of this polynomial is $3\cdot 11^2\cdot 367$. This implies that their complex eigenvalues are distinct, hence these matrices are diagonalizable over $\mathbf C$, and are conjugate over $\mathbf C$.

In the same way, we see that they remain conjugate modulo every prime number $p$
that does not divide the discriminant. Modulo $3$ and $11$, we check that both matrices become
conjugate to $\begin{pmatrix}1 & 1 \\ 0 & 1\end{pmatrix}$, while they become conjugate to $\begin{pmatrix}-1 & 1 \\ 0 & -1 \end{pmatrix}$ modulo $367$.

It is a bit more delicate to prove that if we reduce modulo any integer $d\geq 2$, then $A$ and $B$ become conjugate under $\mathrm{SL}_2(\mathbf Z/d\mathbf Z)$. Stebe's argument runs in two steps.
He first computes the set of matrices $V$ that conjugate $A$ to $B$, namely he solves the equation $VA=BV$. The answer is given by
\[ V=V(x,y) = \begin{pmatrix} x & y \\ 11 y & 25 x-y \end{pmatrix}. \]
Moreover, one has
\[ \det(V(x,y))=25x^2 - xy -11y^2. \]
Consequently, to prove that $A$ and $B$ are conjugate in $\mathrm{SL}_2(\mathbf Z/d\mathbf Z)$, it suffices to find $x,y\in\mathbf Z/d\mathbf Z$ such that $\det(V(x,y))=1 \pmod d$.
To prove that they are conjugate by $\mathrm{SL}_2(\mathbf Z)$, we need to find $x,y\in\mathbf Z$ such that $\det(V(x,y))=1$, and if we agree to be content with a conjugacy by $\mathrm{GL}_2(\mathbf Z)$, then solutions of $\det(V(x,y))=-1$ are also admissible.

Let us first start with the equations modulo $d$. By the Chinese remainder theorem, we may assume that $d=p^m$ is a power of a prime number $p$. Now, if $p\neq 5$, we can take $y=0$ and $x$ such that $5x=1\pmod {p^m}$. If $p=5$, we take $x=0$ and we solve $y$ for $-11y^2=1\pmod {5^m}$, which is possible since $-11\equiv 4\pmod 5$ is a square, and it is easy, by induction (anyway, this is an instance of Hensel's lemma), to produce $y$ modulo $5^m$ such that $y\equiv 3\pmod 5$ and $-11y^2=1\pmod{5^m}$.

On the other hand, the equation $25x^2-xy-11y^2=\pm 1$ has no solutions in integers.
The case of $-1$ is easy by reduction modulo $3$: it becomes $x^2+2xy+y^2=2$, which has no solution since $x^2+2xy+y^2=(x+y)^2$ and $2$ is not a square modulo $3$.
The case of $+1$ is rather more difficult. Stebe treats it by reducing to the Pell equation $u^2=1101y^2+1$ and shows by analysing the minimal solution to this Pell equation that $y$ is divisible by $5$, which is incompatible with the initial equation.

From a more elaborate point of view, $V$ is a smooth scheme over $\mathbf Z$ that violates the integral Hasse principle. In fact, $V$ is a torsor under the centralizer of $A$, which is a torus, and the obstruction has been studied by Colliot-Thélène and Xu, precisely in this context. However, I did not make the calculations that could use their work to reprove Stebe's theorem.

Tuesday, July 2, 2019

Irreducibility of cyclotomic polynomials

For every integer $n\geq 1$, the $n$th cyclotomic polynomial $\Phi_n$ is the monic polynomial whose complex roots are the primitive $n$th roots of unity. A priori, this is a polynomial with complex coefficients, but since every $n$th root of unity is a primitive $d$th root of unity, for a unique divisor $d$ of $n$, one has the relation
\[ T^n-1 = \prod_{d\mid n} \Phi_d(T), \]
which implies, by induction and euclidean divisions, that $\Phi_n \in \mathbf Z[T]$ for every $n$.
The degree of the polynomial $\Phi_n$ is $\phi(n)$, the Euler indicator, number of units in $\mathbf Z/n\mathbf Z$, or number of integers in $\{0,1,\dots,n-1\}$ which are prime to $n$.

The goal of this note is to explain a few proofs that these polynomials are irreducible in $\mathbf Q[T]$ — or equivalently, in view of Gauss's lemma, in $\mathbf Z[T]$. This also amounts to saying that $\deg(\Phi_n)=\phi(n)$ or that the cyclotomic extension has degree $\phi(n)$, or that the canonical group homomorphism from the Galois group of $\mathbf Q(\zeta_n)$ to $(\mathbf Z/n\mathbf Z)^\times$ is an isomorphism.

1. The case where $n=p$ is a prime number.

One has $T^p-1=(T-1)(T^{p+1}+\dots+1)$, hence $\Phi_p=T^{p-1}+\dots+1$. If one reduces it modulo $p$, one finds $\Phi_p(T)\equiv (T-1)^{p-1}$, because $(T-1)\Phi_p(T)=T^p-1\equiv (T-1)^p$. Moreover, $\Phi_p(1)=p$ is not a multiple of $p^2$. By the Eisenstein criterion (after a change of variables $T=1+U$, if one prefers), the polynomial $\Phi_p$ is irreducible.

This argument also works when $n=p^e$ is a power of a prime number. Indeed, since a complex number $\alpha$ is a primitive $p^e$th root of unity if and only if $\alpha^{p^{e-1}}$ is a primitive $p$th root of unity, one has $\Phi_{p^e}= \Phi_p(T^{p^{e-1}})$. Then the Eisenstein criterion gives the result.

Comment.From the point of view of algebraic number theory, this proof makes use of the fact that the cyclotomic extension $\mathbf Q(\zeta_p)$ is totally ramified at $p$, of ramification index $p-1$.
Consequently, it must have degree $p-1$. More generally, it will prove that $\Phi_p$ is irreducible over the field $\mathbf Q_p$ of $p$-adic numbers, or even over any unramified extension of it, or even over any algebraic extension of $\mathbf Q_p$ for which the ramification index is prime to $p-1$.

2. The classical proof

Let us explain a proof that works for all integer $n$. Let $\alpha$ be a primitive $n$th root of unity, and let $P\in\mathbf Z[T]$ be its minimal polynomial — one has $P\mid \Phi_n$ in $\mathbf Z[T]$. Let (A priori, the divisibility is in $\mathbf Q[T]$, but Gauss's lemma implies that it holds in $\mathbf Z[T]$ as well.) Fix a polynomial $Q\in\mathbf Z[T]$ such that $\Phi_n=PQ$.

By euclidean division, one sees that the set $\mathbf Z[\alpha]$ of complex numbers of the form $S(\alpha)$, for $S\in\mathbf Z[T]$, is a free abelian group of rank $\deg(P)$, with basis $1,\alpha,\dots,\alpha^{\deg(P)-1}$.

Let $p$ be a prime number which does not divide $n$. By Fermat's little theorem, one has $P(T)^p \equiv P(T^p) \pmod p$, so that there exists $P_1\in\mathbf Z[T]$ such that $P(X)^p-P(X^p)=pP_1(T)$. This implies that $P(\alpha^p)=p P_1(\alpha)\in p\mathbf Z[\alpha]$.

Since $p$ is prime to $n$, $\alpha^p$ is a primitive $n$th root of unity, hence $\Phi_n(\alpha^p)=0$. Assume that $P(\alpha^p)\neq 0$. Then one has $Q(\alpha^p)=0$. Differentiating the equality $\Phi_n=PQ$, one gets $nT^{n}=T\Phi'_n(T)=TP'Q+TPQ'$; let us evaluate this at $\alpha_p$, we obtain $n=\alpha^p P(\alpha_p) Q'(\alpha^p)=p \alpha^p P^1(\alpha^p)Q'(\alpha^p)$. In other words, $n\in p\mathbf Z[\alpha]$, which is absurd because $n$ does not divide $p$. Consequently, $P(\alpha^p)=0$, and $P$ is also the minimal polynomial of $\alpha^p$.

By induction, one has $P(\alpha^m)=0$ for every integer $m$ which is prime to $n$. All primitive $n$th roots of unity are roots of $P$ and $\deg(P)=\phi(n)=\deg(\Phi_n)$. This shows that $P=\Phi_n$.

Comment.Since this proof considers prime numbers $p$ which do not divide $n$, it makes implicit use of the fact that the cyclotomic extension is unramified away from primes dividing $n$. The differentiation that appears in the proof is a way of proving this non-ramification: if $P(\alpha^p)$ is zero modulo $p$, it must be zero.

3. Landau's proof

A 1929 paper by Landau gives a variant of this classical proof which I just learnt from Milne's notes on Galois theory and which I find significantly easier.

We start as previously, $\alpha$ being a primitive $n$th root of unity and $P\in\mathbf Z[T]$ being its minimal polynomial.

Let us consider, when $k$ varies, the elements $P(\alpha^k)$ of $\mathbf Z[\alpha]$. There are finitely many of them, since this sequence is $n$-periodic, so that they can be written as finitely polynomials of degree $<\deg(P)$ in $\alpha$. Let $A$ be an upper-bound for their coefficients. If $p$ is a prime number, we have $P(\alpha^p) \in p\mathbf Z[\alpha]$ (by an already given argument). This implies $P(\alpha^p)=0$ if $p>A$.

By induction, one has $P(\alpha^m)=0$ for any integer $m$ whose prime factors are all $>A$.

One the other hand, if $m$ is an integer prime to $n$ and $P$ is the product of all prime number $p$ such that $p\leq A$ and $p$ does not divide $m$, then $m'=m+nP $ is another integer all of which prime factors are $>A$. (Indeed, if $p\leq A$, then either $p\mid m$ in
which case $p\nmid nP$ so that then $p\nmid m'$, or $p\nmid m$ in which case $p\mid nP$ so that $p\nmid m'$.) Since $m'\equiv m \pmod n$, one has $P(\alpha^{m})=P(\alpha^{m'})=0$.

This shows that all primitive $n$th roots of unity are roots of $P$, hence $P=\Phi_n$.

Comment. —This proof is quite of a mysterious nature to me.

4. Using Galois theory to pass from local information to global information

The cyclotomic extension $K_n$ contains, as subextension, the cyclotomic extensions $K_{p^e}$, where $n=\prod p_i^{e_i}$ is the decomposition of $n$ has a product of powers of prime numbers. By the first case, $K_{p^e}$ has degree $\phi(p^e)=p^{e-1}(p-1)$ over $\mathbf Q$. To prove that $\Phi_n$ is irreducible, it suffices to prove that these extensions are linearly disjoint, which is the object of the following lemma.

Lemma. — Let $m$ and $n$ be integers and let $d$ be their gcd. Then $K_m\cap K_n=K_d$.

This is an application of Galois theory (and the result holds for every ground field as soon as its characteristic does not divide $m$ and $n$). Let $M$ be the least common multiple of $m$ and $n$. One has $K_N=K_m\cdot K_n$, and the cyclotomic character furnishes a group morphism $\operatorname{Gal}(K_N/\mathbf Q)\to (\mathbf Z/N\mathbf Z)^\times$. The Galois groups $\operatorname{Gal}(K_N/K_m)$ and $\operatorname{Gal}(K_N/K_n)$ corresponding to the subfields $K_m$ and $K_n$ are the kernels of the composition of the cyclotomic character with the projections to $(\mathbf Z/m\mathbf Z)^\times$ and $(\mathbf Z/n\mathbf Z)^\times$, and their intersection to the subgroup generated by these two kernels, which is none but the kernel of the composition of the cyclotomic character with the projection to $(\mathbf Z/d\mathbf Z)^\times$.

Sunday, May 19, 2019

Designs, Skolem sequences, and partitions of integers

Recently, my father offered me the first volume of a graphic novel by Jean-François Kierzkowski and Marek called La suite de Skolem — Skolem's sequences. I knew about the norwegian mathematician Thoralf Skolem for two different reasons (the Löwenheim-Skolem theorem in model theory, and some diophantine equations that Laurent Moret-Bailly put in a more geometric setting — see his series of papers on Problèmes de Skolem), but I had never heared about Skolem sequences.

They appear in his 1957 paper, On certain distributions of integers in pairs with given differences (Math Scand., 5, 57-68).
The question is to put the integers $1,2,\dots,2n$ in $n$ pairs $(a_1,b_1),\dots,(a_n,b_n)$ such that the differences are all different, namely $b_i-a_i=i$ for $i\in\{1,\dots,r\}$. One can put it differently: write a sequence of $2n$ integers, where each of the integers from $1$ to $n$ appear exactly twice, the two $1$s being at distance $1$, the two $2$s at distance $2$, etc.
For example, $4,2,3,2,4,3,1,1$ is a Skolem sequence of length $n$, corresponding to the pairs $(7,8), (2,4), (3,6),(1,5)$.

Le jeu des cavaliers (Jessica Stockholder) — photo V. Pantaloni
Le jeu des cavaliers (photo V. Pantaloni)
The possibility of such sequences has been materialized under the form of a giant sculpture Le jeu des cavaliers by Jessica Stockholder at the Institut des Hautes Études Scientifiques (IHÉS) at Bures sur Yvette.

There is a basic necessary and sufficient condition for such a sequence to exist, namely $n$ has to be congruent to $0$ or $1$ modulo $4$. The proof of necessity is easy (attributed by Skolem to Bang): one has $\sum_{i=1}^n(b_i-a_i)=n(n+1)/2$, and $\sum_{i=1}^n (b_i+a_i)=2n(2n+1)/2$, so that $\sum_{i=1}^n b_i=n(n+1)/4+2n(2n+1)/2=n(5n+3)/4$. If $n$ is even, this forces $n\equiv 0 \pmod 4$, while if $n$ is odd, this forces $5n+3\equiv 0\pmod 4$, hence $n\equiv 1\pmod 4$. The proof of  existence consists in an explicit example of such a sequence, which is written down in Skolem's paper.

Skolem's motivation is only alluded to in that paper, but he explains it a bit further next year. In Some Remarks on the Triple Systems of Steiner, he gives the recipe that furnishes such a system from a Skolem sequence. Steiner triple systems on a set $S$ is the datum of triplets of elements of $S$ such that each pair of two elements of $S$ appears exactly once. In other words, they are a $(3,2,1)$-design on $S$ — a $(m,p,q)$-design on a set $S$ being a family of $m$-subsets of $S$ such that each $q$-subset appears in exactly $p$ of those subsets. Some relatively obvious divisibility conditions can be written down that give a necessary condition for the existence of designs with given parameters, but actual existence is much more difficult. In fact, it has been shown only recently by Peter Keevash that these necessary conditions are sufficient, provided the cardinality of the set $S$ is large enough, see Gil Kalai's talk Designs exist! at the Bourbaki Seminar.

In the case of Steiner triple systems, the condition is that the number $s$ of elements of $S$ be congruent to $1$ or $3$ modulo $6$. Indeed, there are $s(s-1)/2$ pairs of elements of $S$, and each 3-subset of the triple system features 3 such pairs, so that there are $N=s(s-1)/6$ triples. On the other hand, each element of $S$ appears exactly $3N/s$ times, so that $(s-1)/2$ is an integer. So $s$ has to be odd, and either $3$ divides $s$ (in which case $s\equiv 3\pmod 6$) or $s\equiv 1\pmod 6$.

And Skolem's observation is that a family of $n$ pairs $(a_i,b_i)$ as above furnishes a triple system on the set $S=\mathbf Z/(6n+1)\mathbf Z$, namely the triples $(i,i+j,i+b_j+n)$ where $1\leq i,j\leq n$, thus constructing Steiner triple systems on a set whose cardinality $6n+1$, when $n\equiv 0,1\pmod 4$.

My surprise came at the reading of the rest of Skolem's 1957 paper, because I knew the result he then described but had no idea it was due to him. (In fact, it was one of the first homework my math teacher Johan Yebbou gave to us when I was in classes préparatoires.) And since this result is very nice, let me tell you about it.

Theorem.Let $\alpha>1$ and $\beta>1$ be irrational real numbers such that $\alpha^{-1}+\beta^{-1}=1$. Then each strictly positive integer can be written either as $\lfloor \alpha n\rfloor$, or $\lfloor \beta n\rfloor$ for some integer $n\geq 1$, but not of both forms.

First of all, assume $N=\lfloor \alpha n\rfloor=\lfloor \beta m\rfloor$. Using that $\alpha,\beta$ are irrational, we thus write $N< \alpha n<N+1$ and $N<\beta m<N+1$. Dividing these inequalities by $\alpha$ and $\beta$ and adding them, we get $N<n+m<N+1$, since $\alpha^{-1}+\beta^{-1}=1$. This proves that any given integer can be written only of one of those two forms.

Since $\alpha^{-1}+\beta^{-1}=1$, one of $\alpha,\beta$ has to be $<2$. Assume that $1<\alpha<2$. The integers of the form $\lfloor \alpha n\rfloor$ form a strictly increasing sequence, and we want to show that any integer it avoids can be written $\lfloor \beta m\rfloor$.

Set $\gamma=\alpha-1$, so that $\beta=\alpha/(\alpha-1)=1+1/\gamma$. 

For every integer, we have $\lfloor \alpha(n+1)\rfloor = \lfloor \alpha n\rfloor + 1$ or $\lfloor \alpha(n+1)\rfloor=\lfloor \alpha n\rfloor+2$, so that if $\lfloor \alpha n\rfloor + 1$ is avoided, one has $\lfloor \alpha (n+1)\rfloor=\lfloor\alpha n\rfloor +2$.

Then, $\lfloor \alpha n\rfloor = n+\lfloor \gamma n\rfloor=n+k-1$, where $k=1+\lfloor \gamma n\rfloor$. The inequalities $k-1<\gamma n <k$ imply $k/\gamma - 1/\gamma< n<k/\gamma$. Moreover, $\lfloor \alpha(n+1)\rfloor=n+1+\lfloor \gamma(n+1)\rfloor=n+k+1$, so that $k+1<\gamma(n+1)<k+2$, hence $n>k/\gamma+1/\gamma-1>k/\gamma-1$. This proves that $n=\lfloor k/\gamma\rfloor$. Then, $\lfloor k\beta\rfloor=k+\lfloor k/\gamma\rfloor=k+n=\lfloor \alpha n\rfloor +1$.

Saturday, May 12, 2018

A theorem of Lee—Yang on root of polynomials

A recent MathOverflow post asked for a proof that the roots of certain polynomials were located on the unit circle. A comment by Richard Stanley pointed to a beautiful theorem of T. D. Lee and C. N. Yang. By the way, these two authors were physicists, got the Nobel prize in physics in 1957, and were the first Chinese scientists to be honored by the Nobel prize.

This theorem appears in an appendix to their paper, Statistical Theory of Equations of State and Phase Transitions.IL Lattice Gas and Ising Model, published in Phys. Review in 1952, and devoted to properties of the partition function of some lattice gases. Here is a discussion of this theorem, following both the initial paper and notes by Shayan Oveis Gharan.

Let $A=(a_{i,j})_{1\leq i,j\leq n}$ be a Hermitian matrix ($a_{i,j}=\overline{a_{j,i}}$ for all $i,j$). Define a polynomial
\[ F(T) = \sum_{S\subset\{1,\dots,n\}} \prod_{\substack{i\in S \\ j\not\in S}} a_{i,j} T^{\# S}. \]

Theorem 1 (Lee-Yang). — If $\lvert a_{i,j}\rvert\leq 1$ for all $i,j$, then all roots of $F$ have absolute value $1$.

This theorem follows from a multivariate result. Let us define
\[ P(T_1,\dots,T_n) = \sum_{S\subset\{1,\dots,n\}} \prod_{\substack{i\in S \\ j\not\in S}} a_{i,j} \prod_{i\in S} T_i .\]
Say that a polynomial $F\in\mathbf C[T_1,\dots,T_n]$ is good if it has no root $(z_1,\dots,z_n)\in\mathbf C^n$ such that $\lvert z_i\rvert <1$ for all $i$.

Proposition 2 (Lee-Yang). — If $\lvert a_{i,j}\rvert\leq 1$ for all $i,j$, then $P$ is good.

For every pair $(i,j)$, set $a^S_{i,j}=a_{i,j}$ if $i$ belongs to $S$, but not $j$, and set $a^S_{i,j}=1$ otherwise. Consequently,
\[ P(T_1,\dots,T_n) = \sum_{S\subset\{1,\dots,n\}}\prod_{i,j} a^S_{i,j} \prod_{k\in S} T_k. \]
In other words, if we define polynomials
\[ P_{i,j} (T_1,\dots,T_n) = \sum_{S\subset\{1,\dots,n\}} a^S_{i,j} \prod_{k\in S} T_k, \]
then $P$ is the “coefficientwise product” of the polynomials $P_{i,j}$.
We also note that these polynomials have degree at most one with respect to every variable. These observations may motivate the following lemmas concerning good polynomials.

Lemma 1. — If $P,Q$ are good, then so is their product.

Lemma 2. — If $P(T_1,\dots,T_n)$ is good, then $P(a,T_2,\dots,T_n)$ is good for every $a\in\mathbf C$ such that $\lvert a\rvert \leq 1$.

This is obvious if $\lvert a\rvert <1$; in the general case, this follows from the Rouché theorem — the set polynomials (of bounded degree) whose roots belong to some closed subset is closed.

Lemma 3. — If $\lvert a\rvert \leq 1$, then $1+aT$ is good.

This is obvious.

Lemma 4. — If $\lvert a\rvert\leq 1$, then $P=1+aT_1+\bar a T_2+ T_1T_2$ is good.

If $\lvert a\rvert =1$, then $P=(1+aT_1)(1+\bar aT_2)$ is good, as the product of two good polynomials.
Now assume that $\lvert a\rvert <1$. Let $(z_1,z_2)$ be a root of $P$ such that $\lvert z_1\rvert<1$. One has
\[ z_2 = - \frac{1+az_1}{z_1+\bar a}. \]
Since the Möbius transformation $z\mapsto (z+\bar a)/(1+a z)$ defines a bijection from the unit open disk to itself, one has $\lvert z_2\rvert >1$.

Lemma 5. — If $P=a+bT_1+cT_2+dT_1T_2$ is good, then $Q=a+dT$ is good.

Assume otherwise, so that $\lvert a\rvert <\lvert d\rvert$. By symmetry, we assume $\lvert b\rvert\geq \lvert c\rvert$. We write $P (T_1,T_2) = (a+cT_2) + (b+dT_2) T_1$.
Choose $z_2\in\mathbf C$ such that $dz_2$ and $b$ have the same argument; if, moreover $z_2$ is close enough to $1$ and satisfies $\lvert z_2\rvert <1$, then
\[ \lvert b+dz_2\rvert=\lvert b\rvert +\lvert dz_2\rvert > \lvert a\rvert+\lvert c\rvert>\lvert a+cz_2\rvert. \]
Consequently, the polynomial $P(T_1,z_2)$ is not good; a contradiction.

Lemma 6. — If $P,Q$ are good polynomials of degree at most one in each variable, then so is their coefficientwise product.

We first treat the case of one variable: then $P=a+bT$ and $Q=a'+b'T$, so that their coefficientwise product is given by $R=aa'+bb'T$. By assumption $\lvert a\rvert \geq \lvert b\rvert$ and $a\neq 0$;
similarly, $\lvert a'\rvert \geq \lvert b'\rvert $ and $a'\neq 0$. Consequently, $aa'\neq0$ and $\lvert aa'\rvert \geq \lvert bb'\rvert$, which shows that $R$ is good.
We prove the result by induction on $n$. For every subset $S$ of $\{1,\dots,n-1\}$, let $a_S$ and $b_S$ be the coefficients of $\prod_{i\in S}T_i$ and of $\prod_{i\in S} T_i \cdot T_n$ in $P$; define similarly $c_S$ and $d_S$with $Q$. The coefficientwise product of $P$ and $Q$ is equal to
\[ R= \sum_S (a_S c_S +b_S d_S T_n ) \prod_{i\in S} T_i . \]
Let $z\in\mathbf C$ be such that $\lvert z\rvert \leq 1$, so that
\[ P(T_1,\dots,T_{n-1},z)= \sum _S (a_S+b_S z) \prod_{i\in S} T_i \] is good, by lemma 2. Similarly, for $w\in\mathbf C$ such that $\lvert w\rvert\leq 1$, $Q(T_1,\dots,T_{n-1},w)
=\sum _S (c_S+d_S w) \prod_{i\in S} T_i$ is good. By induction, their coefficientwise product, given by
\[ R_{z,w} = \sum_S (a_S+b_S z)(c_S+d_S w) \prod_{i\in S} T_i \]
is good as well.
We now fix complex numbers $z_1,\dots,z_{n-1}$ of absolute value $<1$. By what precedes, the polynomial
\[ S(T,U) = (\sum_S a_S c_S z_S) + (\sum_S b_Sc_S z_S) T + (\sum_S a_S d_S z_S) U
+ (\sum_S b_S d_S z_S) TU \]
is good, where $z_S=\prod_{i\in S}z_i$. According to lemma 4, the polynomial
\[ R(z_1,\dots,z_{n-1},T) = (\sum_S a_S c_S z_S) + (\sum_S b_S d_S z_S) T \]
is good. This proves that $R$ is good.

Proof of theorem 2. — We have already observed that the polynomial $P$ is the coefficientwise product of polynomials $P_{i,j}$, each of them has degree at most one in each variable. On the other hand, one has
\[ P_{i,j} = (1+a_{i,j} T_i + a_{j,i} T_j + T_i T_j) \prod_{k\neq i,j} (1+T_k), \]
a product of good polynomials, so that $P_{i,j}$ is good. This proves that $P$ is good.

In fact, more is true. Indeed, one has
\begin{align*} T_1\dots T_n P(1/T_1,\dots,1/T_n)
& = \sum_ S \prod_{\substack{i\in S \\ j\notin S}} a_{i,j} \prod_{i\notin S} T_i \\
& =P^*(T_1,\dots,T_n)
where $P^*$ is defined using the transpose matrix of $A$. Consequently, $P$ has no root $(z_1,\dots,z_n)$ with $\lvert z_i\rvert >1$ for every $i$.

Proof of theorem 1. — Let $z$ be a root of $P$. Since the polynomial $P$ is good, so is the one-variable polynomial $F(T)=P(T,\dots,T)$. In particular, $F(z)=0$ implies $\lvert z\rvert \geq 1$. But the polynomial has a symmetry property, inherited by that of $P$, namely $ T^n F(1/T)=F^*(T)$, where $F^*$ is defined using the transpose matrix of $A$. Consequently, $F^*(1/z)=0$ and $\lvert 1/z\rvert \geq 1$. We thus have shown that $\lvert z\rvert=1$.