tag:blogger.com,1999:blog-82319176110066333752020-02-12T11:16:46.093+01:00Freedom Math DanceA blog about math (mainly), computer tricks (sometimes) and jazz music.Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.comBlogger46125tag:blogger.com,1999:blog-8231917611006633375.post-27231173655279932912020-02-12T00:58:00.001+01:002020-02-12T11:16:46.083+01:00Dynamics of Hecke correspondencesThis morning, Sebastián Herrero, Ricardo Menares and Juan Rivera-Letelier released a preprint on the arXiv entitled “<a href="https://arxiv.org/abs/2002.03232">p-Adic distribution of CM points and Hecke orbits. I. Convergence towards the Gauss point.</a>” In fact, there are two other papers on the arXiv with similar titles, by two other groups of authors. The first one, “<a href="https://arxiv.org/abs/1711.00269">p-adic Dynamics of Hecke Operators on Modular Curves</a>” is by<br />Eyal Z. Goren and Payman L Kassaei, and the second one is by Daniele Disegni, “<a href="https://arxiv.org/abs/1904.07743">p-adic equidistribution of CM points</a>”. <br /><br />In fact, as <a href="https://twitter.com/_nilradical">@_nilradical</a> pointed out initially, the words “Hecke correspondences” are generally used in modular forms books to describe some (pairwise commuting) endomorphisms of the spaces of modular forms. So what are these authors talking about?<br /><br />As he said: “<i>literally what the hell is going on in math?</i>”. My answer, at a time where I should have been doing something else, such as sleeping, was direct:<br /><br />— <i>This is a very nice question! Take an elliptic curve over the $p$-adics, say, and draw its images under Hecke correspondences. You get a cloud of dots on the $p$-adic modular curve. Normalize by its size. What are the limit measures ? The picture is richer than over the complex.</i><br /><br />— <i>what do u mean by "correspondences"? 
i know only about hecke operators to the extent of what is in chapter 3 of shimura's intro book and the end of serre's "course in arithmetic"</i><br /><br />— <i>You remember that paragraph of Serre's Book where he says modular forms of wt k are functions on pairs (E,ω) — E, ell curve ; ω, nonzero inv diff form on E — with some homogeneity and holomorphy condition ? (Idea: τ -> ell curve C/(Z+τZ) and ω = dz) ...</i><br /><br />— <i>yea, im ok with weight k modular forms being sections of \Omega^2k</i><br /><br />— <i>In this point of view, the Hecke operator T_n sends a modular form f to the function that maps (E,ω) to the sum of all f(E',ω') where E' is a quotient E/D (D, cyclic subgroup of order n) and ω' I don't tell here. Now, forget about modular forms (and differential forms ω, as well).</i><br /><br />And I went on to explain those things in a few Twitter posts, but it's time to say it at a quieter pace.<br /><br /><b>1. Elliptic curves, modular curves, and modular correspondences.</b><br /><br />The reason for the adjective “modular” in the expression “modular forms” or in the expression “modular curves” is that they refer to the “moduli” of some extremely classical geometric objects called elliptic curves. Elliptic curves have various descriptions, all equally beautiful and important: either, complex analytically, as one-dimensional tori, quotients of the complex plane $\mathbf C$ by a lattice $\Lambda\simeq\mathbf Z^2$, or as cubic plane curves given by a (Weierstrass) equation of the form $y^2=x^3+ax+b$ in the affine plane with coordinates $(x,y)$, forgetting a point at infinity. <br /><br />In fact, the “set” of all elliptic curves can itself be arranged in a curve provided one identifies “isomorphic” elliptic curves, so that some simplifications arise. 
Not every lattice needs to be considered: one may first assume $\Lambda=\mathbf Z+\mathbf Z\tau$, for some complex number $\tau$ with strictly positive imaginary part, but then the complex number $\tau$ and any complex number of the form $(a\tau+b)/(c\tau+d)$ give the same elliptic curve, for any matrix $\begin{pmatrix}a & b\\c & d\end{pmatrix}$ with integer entries and determinant $1$. So, in some sense, the set of elliptic curves coincides with the quotient of the (Poincaré) upper half-plane by the action of the group $\mathrm{SL}(2,\mathbf Z)$. Alternatively, in the Weierstrass model, there are two parameters $(a,b)$, but all pairs of the form $(u^4a,u^6b)$ give the same curve (via the change of variables $x'=u^2x$, $y'=u^3y$). One can guess some subtleties here, because the matrix $-I_2$ acts trivially, and also because some specific $\tau$ have a nontrivial stabilizer (even taking $-I_2$ into account); on the other side, $u=-1$ acts trivially on the pairs $(a,b)$, and some pairs $(a,b)$ are fixed by some more roots of unity. I will ignore these here; technically, they are solved in two ways: one is to add a “level structure”, the other is to consider this modular curve $M$ as an orbifold (algebraic geometers say <i>stack</i>).<br /><br />“Correspondence” is a long-forgotten topic in set theory, dating from the time when multivalued functions were considered. We forgot them because it's hard to talk consistently about them, but there are instances in which they arise naturally. In set theory, one way to define a correspondence $T$ from a set $X$ to a set $Y$ is to consider its graph $G_T$, a subset of $X\times Y$. If the graph contains a pair $(x,y)$, one says that the correspondence maps $x$ to $y$; but the graph could also contain a pair $(x,y')$, in which case it also maps $x$ to $y'$. And it could very well contain no pair with first element $x$, and then $x$ has no image under the correspondence. 
Correspondences can be composed: if the correspondence $S$ maps $x$ to $y$ and a correspondence $T$ maps $y$ to $z$, then $T\circ S$ maps $x$ to $z$. They can also be inverted: just flip the graph.<br /><br />Modular curves admit natural modular correspondences, indexed by strictly positive integers.<br />Specifically, the correspondence $T_n$ maps an elliptic curve $E$ to all elliptic curves $E'$ of the form $E'=E/D$, where $D$ is a cyclic subgroup of order $n$ of $E$.<br />The graph of this correspondence, when drawn on the surface $M\times M$, has a natural structure of an algebraic curve.<br /><br /><b>2. Modular dynamics</b><br /><br />One virtue of correspondences in algebraic geometry is that they act naturally on objects that are attached “linearly” to the curve. For example, considering a function $f$ (or a differential form on $M$), rather than looking at all the images $E'$ of $E$, one could add the corresponding values $f(E')$, and consider this sum as a function of $E$. This is exactly what is done to define the Hecke operators on modular forms. Geometrically this corresponds to pulling back $f$ from $M$ to $M\times M$ (using the first projection), then restricting to the graph of the correspondence, then “pushing-out” (a kind of trace) to the second factor; this is where the sum happens.<br /><br />Another way to linearize the correspondence is to formally add all images $E'$, for example considering that the correspondence maps a Dirac measure $\delta_E$ at a point $E$ to the sum of the Dirac measures $\delta_{E'}$ at its images; maybe dividing by the number of images so that one keeps a probability measure. 
In this framework, the correspondence $T_n$ is a map from the space of probability measures on the modular curve to itself.<br /><br />Which is <i>cool</i> because you can now iterate the process and wonder about the possible limit measures.<br /><br />This is analogous to what is done in complex dynamics, for example when you study the dynamics of the map $z\mapsto z^2+c$ (here, $c$ is a parameter): a construction of the Julia set consists in taking the 2 preimages of a point $z$ (any point, with a few exceptions), their 4 preimages, etc. As the construction goes on, the cloud of points that one gets becomes closer and closer to the Julia set.<br /><br />In the case of the modular curve and the dynamics of Hecke correspondences (one can compose a given Hecke correspondence with itself, or consider $T_n$ for large $n$, it does not really matter), what happens is described by theorems of William Duke / Laurent Clozel and Emmanuel Ullmo / Rodolphe Richard.<br />Whatever point one starts with, whatever probability measure one starts with, the probability measures on $M$ that are constructed by the dynamics converge to the Poincaré measure on the modular curve, that is, to the measure that is given by the hyperbolic measure $dx\, dy/ y^2$ on the Poincaré upper half-plane, restricted to a fundamental domain of the action.<br /><br />In fact, Duke and Clozel/Ullmo have different goals. What they consider is a sequence of probability measures formed by considering elliptic curves with complex multiplication and all their conjugates. The limit theorem that they prove is then quite subtle and relies on deep properties of Maass forms of half-integral weight.<br /><br /><b>3. 
Modular dynamics: $p$-adic fields</b><br /><br />Over the $p$-adics, the dynamics is even more complicated, although its behaviour does not seem to be governed by analytic properties of analytic objects such as Maass forms, but by the arithmetic of the elliptic curve one starts with.<br /><br />A first difficulty comes from the framework required to define $p$-adic dynamics. One wants to start from the field $\mathbf Q_p$, but it is not algebraically closed, and so the construction of the images of a curve by the correspondence requires passing to its algebraic closure $\overline{\mathbf Q_p}$. But then $p$-adic analysis is not so cool, because although that field has a natural $p$-adic absolute value, it is not complete anymore, so let's take its completion, $\mathbf C_p$. And this field is still algebraically closed. <br /><br />However, and this reflects the fact that the Galois theory of $\mathbf Q_p$ is complicated, the field $\mathbf C_p$ is not locally compact, so that measure theory on such a field is not very well behaved. For example, a basic tool in measure theory over compact (metrizable, say) spaces is that the set of probability measures is itself compact for the vague topology on measures, so that any sequence of probability measures has a converging subsequence, etc.<br /><br />In our case, this won't hold anymore and one needs to consider a suitable “compactification” of the $p$-adic modular curve, namely its analytification $M_p$ in the sense of Berkovich. One then gets a locally compact topological space on which the Hecke correspondences still act naturally, and questions of dynamics can now be formulated properly.<br /><br />The only thing I'll write about the dynamics is that it is <i>subtle</i>: there are domains which are stable under the correspondences. For example, if the reduction mod $p$ of an elliptic curve $E$ is supersingular, then all of its images $E' = E/D$ also have supersingular reduction. 
In other words, the supersingular locus in the Berkovich modular curve $M_p$ is totally invariant under the Hecke correspondences; this locus is a finite union of open disks. This shows that there are many possible limit measures, and the papers that I have quoted above study the various limit phenomena.<br /><br />They also consider the analogue of Duke's result in that setting. Disegni also considers the analogous problem for Shimura curves (instead of elliptic curves, they parameterize abelian surfaces with quaternionic multiplication).<br /><br />This will be all for this blog post, now go and read these papers!Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com0tag:blogger.com,1999:blog-8231917611006633375.post-28923140973532004982019-12-27T20:32:00.002+01:002019-12-27T21:11:58.628+01:00Behaviour of conjugacy by reduction modulo integersLet $A$ and $B\in\mathrm{M}_n(\mathbf Z)$ be two square matrices with integer coefficients. Assume that they are conjugate by $\mathrm{GL}_n(\mathbf Z)$, namely, that there exists a matrix $P\in\mathrm{GL}_n(\mathbf Z)$ such that $B=P^{-1}AP$. Then we can reduce this relation modulo every integer $d\geq 2$ and obtain a similar relation between the images of $A$ and $B$ in $\mathrm M_n(\mathbf Z/d\mathbf Z)$.<br /><br />Almost the same holds if $A$ and $B$ are only conjugate by $\mathrm{GL}_n(\mathbf Q)$, with a few exceptions: we just need to take care to reduce the relation modulo integers $d$ that are coprime to the denominators of the coefficients of $P$ and of $P^{-1}$.<br /><br />I was quite surprised at first to learn that the converse assertion is false. 
There are matrices $A$ and $B$ in $\mathrm M_2(\mathbf Z)$ whose images modulo every integer $d\geq 2$ are conjugate, but which are not conjugate by a matrix in $\mathrm{GL}_2(\mathbf Z)$.<br /><br />An example is given by Peter Stebe in his paper “<a href="https://www.ams.org/journals/proc/1972-032-01/S0002-9939-1972-0289666-X/S0002-9939-1972-0289666-X.pdf">Conjugacy separability of groups of integer matrices</a>”, <a href="https://www.ams.org/journals/proc/1972-032-01/S0002-9939-1972-0289666-X/">Proc. of the AMS, 32 (1), March 1972, p. 1–7</a>.<br />Namely, set<br />\[ A = \begin{pmatrix} 188 & 275 \\ 121 & 177 \end{pmatrix} = \begin{pmatrix} 11\cdot 17+1 & 25\cdot 11 \\ 11^2 & 11\cdot 16+1 \end{pmatrix} \]<br />and<br />\[ B = \begin{pmatrix} 188 & 11 \\ 3025 & 177 \end{pmatrix} = \begin{pmatrix} 11\cdot 17+1 & 11 \\ 11^2\cdot 25 & 11\cdot 16+1 \end{pmatrix}. \]<br />These matrices $A$ and $B$ have integer coefficients, their determinant is $1$, hence they belong to $\mathrm{SL}_2(\mathbf Z)$. They also have the same trace, hence the same characteristic polynomial, which is $T^2-365T+1$. The discriminant of this polynomial is $3\cdot 11^2\cdot 367$. This implies that their complex eigenvalues are distinct, hence these matrices are diagonalizable over $\mathbf C$, and are conjugate over $\mathbf C$.<br /><br />In the same way, we see that they remain conjugate modulo every prime number $p$ that does not divide the discriminant. Modulo $3$ and $11$, we check that both matrices become conjugate to $\begin{pmatrix}1 & 1 \\ 0 & 1\end{pmatrix}$, while they become conjugate to $\begin{pmatrix}-1 & 1 \\ 0 & -1 \end{pmatrix}$ modulo $367$.<br /><br />It is a bit more delicate to prove that if we reduce modulo any integer $d\geq 2$, then $A$ and $B$ become conjugate under $\mathrm{SL}_2(\mathbf Z/d\mathbf Z)$. Stebe's argument runs in two steps. 
<br />He first computes the set of matrices $V$ that conjugate $A$ to $B$, namely he solves the equation $VA=BV$. The answer is given by<br />\[ V=V(x,y) = \begin{pmatrix} x & y \\ 11 y & 25 x-y \end{pmatrix}. \]<br />Moreover, one has <br />\[ \det(V(x,y))=25x^2 - xy -11y^2. \]<br />Consequently, to prove that $A$ and $B$ are conjugate in $\mathrm{SL}_2(\mathbf Z/d\mathbf Z)$, it suffices to find $x,y\in\mathbf Z/d\mathbf Z$ such that $\det(V(x,y))=1 \pmod d$. <br />To prove that they are conjugate by $\mathrm{SL}_2(\mathbf Z)$, we need to find $x,y\in\mathbf Z$ such that $\det(V(x,y))=1$, and if we agree to be content with a conjugacy by $\mathrm{GL}_2(\mathbf Z)$, then solutions of $\det(V(x,y))=-1$ are also admissible.<br /><br />Let us start with the equations modulo $d$. By the Chinese remainder theorem, we may assume that $d=p^m$ is a power of a prime number $p$. Now, if $p\neq 5$, we can take $y=0$ and $x$ such that $5x=1\pmod {p^m}$, so that $\det(V(x,0))=(5x)^2=1$. If $p=5$, we take $x=0$ and we solve the equation $-11y^2=1\pmod {5^m}$ for $y$, which is possible since $-11\equiv 4\pmod 5$ is a square, and it is easy, by induction (anyway, this is an instance of Hensel's lemma), to produce $y$ modulo $5^m$ such that $y\equiv 3\pmod 5$ and $-11y^2=1\pmod{5^m}$.<br /><br />On the other hand, the equation $25x^2-xy-11y^2=\pm 1$ has no solutions in integers.<br />The case of $-1$ is easy by reduction modulo $3$: it becomes $x^2+2xy+y^2=2$, which has no solution since $x^2+2xy+y^2=(x+y)^2$ and $2$ is not a square modulo $3$.<br />The case of $+1$ is rather more difficult. Stebe treats it by reducing to the Pell equation $u^2=1101y^2+1$ and shows by analysing the minimal solution to this Pell equation that $y$ is divisible by $5$, which is incompatible with the initial equation.<br /><br /><br />From a more elaborate point of view, $V$ is a smooth scheme over $\mathbf Z$ that violates the integral Hasse principle. 
In fact, $V$ is a torsor under the centralizer of $A$, which is a torus, and the obstruction has been studied by <a href="https://arxiv.org/abs/0712.1957v1">Colliot-Thélène and Xu</a>, precisely in this context. However, I have not carried out the calculations that would use their work to reprove Stebe's theorem.Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com0tag:blogger.com,1999:blog-8231917611006633375.post-27318764240902068042019-07-02T15:56:00.000+02:002019-07-05T08:22:45.419+02:00Irreducibility of cyclotomic polynomialsFor every integer $n\geq 1$, the $n$th cyclotomic polynomial $\Phi_n$ is the monic polynomial whose complex roots are the primitive $n$th roots of unity. A priori, this is a polynomial with complex coefficients, but since every $n$th root of unity is a primitive $d$th root of unity, for a unique divisor $d$ of $n$, one has the relation<br />\[ T^n-1 = \prod_{d\mid n} \Phi_d(T), \]<br />which implies, by induction and euclidean divisions, that $\Phi_n \in \mathbf Z[T]$ for every $n$.<br />The degree of the polynomial $\Phi_n$ is $\phi(n)$, the Euler totient of $n$, that is, the number of units in $\mathbf Z/n\mathbf Z$, or the number of integers in $\{0,1,\dots,n-1\}$ which are prime to $n$.<br /><br />The goal of this note is to explain a few proofs that these polynomials are irreducible in $\mathbf Q[T]$ or equivalently, in view of Gauss's lemma, in $\mathbf Z[T]$. This also amounts to saying that the cyclotomic extension $\mathbf Q(\zeta_n)$ has degree $\phi(n)=\deg(\Phi_n)$ over $\mathbf Q$, or that the canonical group homomorphism from the Galois group of $\mathbf Q(\zeta_n)$ to $(\mathbf Z/n\mathbf Z)^\times$ is an isomorphism.<br /><br /><b>1. The case where $n=p$ is a prime number.</b><br /><br />One has $T^p-1=(T-1)(T^{p-1}+\dots+1)$, hence $\Phi_p=T^{p-1}+\dots+1$. If one reduces it modulo $p$, one finds $\Phi_p(T)\equiv (T-1)^{p-1}$, because $(T-1)\Phi_p(T)=T^p-1\equiv (T-1)^p$. Moreover, $\Phi_p(1)=p$ is not a multiple of $p^2$. 
By the Eisenstein criterion (after a change of variables $T=1+U$, if one prefers), the polynomial $\Phi_p$ is irreducible.<br /><br />This argument also works when $n=p^e$ is a power of a prime number. Indeed, since a complex number $\alpha$ is a primitive $p^e$th root of unity if and only if $\alpha^{p^{e-1}}$ is a primitive $p$th root of unity, one has $\Phi_{p^e}= \Phi_p(T^{p^{e-1}})$. Then the Eisenstein criterion gives the result, since $\Phi_{p^e}(T)\equiv (T-1)^{p^{e-1}(p-1)} \pmod p$ and $\Phi_{p^e}(1)=p$.<br /><br /><b>Comment.</b> — <i>From the point of view of algebraic number theory, this proof makes use of the fact that the cyclotomic extension $\mathbf Q(\zeta_p)$ is totally ramified at $p$, of ramification index $p-1$. <br />Consequently, it must have degree $p-1$. More generally, it will prove that $\Phi_p$ is irreducible over the field $\mathbf Q_p$ of $p$-adic numbers, or even over any unramified extension of it, or even over any algebraic extension of $\mathbf Q_p$ for which the ramification index is prime to $p-1$.</i><br /><br /><b>2. The classical proof</b><br /><br />Let us explain a proof that works for every integer $n$. Let $\alpha$ be a primitive $n$th root of unity, and let $P\in\mathbf Z[T]$ be its minimal polynomial; one has $P\mid \Phi_n$ in $\mathbf Z[T]$. (A priori, the divisibility is in $\mathbf Q[T]$, but Gauss's lemma implies that it holds in $\mathbf Z[T]$ as well.) Fix a polynomial $Q\in\mathbf Z[T]$ such that $\Phi_n=PQ$.<br /><br />By euclidean division, one sees that the set $\mathbf Z[\alpha]$ of complex numbers of the form $S(\alpha)$, for $S\in\mathbf Z[T]$, is a free abelian group of rank $\deg(P)$, with basis $1,\alpha,\dots,\alpha^{\deg(P)-1}$.<br /><br />Let $p$ be a prime number which does not divide $n$. By Fermat's little theorem, one has $P(T)^p \equiv P(T^p) \pmod p$, so that there exists $P_1\in\mathbf Z[T]$ such that $P(T)^p-P(T^p)=pP_1(T)$. 
Evaluating at $\alpha$, where $P$ vanishes, this implies that $P(\alpha^p)=-p P_1(\alpha)\in p\mathbf Z[\alpha]$.<br /><br />Since $p$ is prime to $n$, $\alpha^p$ is a primitive $n$th root of unity, hence $\Phi_n(\alpha^p)=0$. Assume that $P(\alpha^p)\neq 0$. Then one has $Q(\alpha^p)=0$. Write $T^n-1=\Phi_n R=PQR$, with $R\in\mathbf Z[T]$. Differentiating this equality and multiplying by $T$, one gets $nT^{n}=TP'QR+TP(QR)'$; let us evaluate this at $\alpha^p$: since $(\alpha^p)^n=1$ and $Q(\alpha^p)=0$, we obtain $n=\alpha^p P(\alpha^p) Q'(\alpha^p)R(\alpha^p)=-p \alpha^p P_1(\alpha)Q'(\alpha^p)R(\alpha^p)$. In other words, $n\in p\mathbf Z[\alpha]$, which is absurd because $p$ does not divide $n$ (recall that $1$ is part of a basis of the free abelian group $\mathbf Z[\alpha]$). Consequently, $P(\alpha^p)=0$, and $P$ is also the minimal polynomial of $\alpha^p$.<br /><br />By induction, one has $P(\alpha^m)=0$ for every integer $m$ which is prime to $n$. All primitive $n$th roots of unity are roots of $P$ and $\deg(P)=\phi(n)=\deg(\Phi_n)$. This shows that $P=\Phi_n$.<br /><br /><b>Comment.</b> — <i>Since this proof considers prime numbers $p$ which do not divide $n$, it makes implicit use of the fact that the cyclotomic extension is unramified away from primes dividing $n$. The differentiation that appears in the proof is a way of proving this non-ramification: if $P(\alpha^p)$ is zero modulo $p$, it must be zero.</i><br /><br /><b>3. Landau's proof</b><br /><br />A 1929 paper by Landau gives a variant of this classical proof which I just learnt from Milne's notes on Galois theory and which I find significantly easier.<br /><br />We start as previously, $\alpha$ being a primitive $n$th root of unity and $P\in\mathbf Z[T]$ being its minimal polynomial.<br /><br />Let us consider, when $k$ varies, the elements $P(\alpha^k)$ of $\mathbf Z[\alpha]$. There are finitely many of them, since this sequence is $n$-periodic, so that they can be written as finitely many polynomials of degree $<\deg(P)$ in $\alpha$. Let $A$ be an upper bound for the absolute values of their coefficients. If $p$ is a prime number, we have $P(\alpha^p) \in p\mathbf Z[\alpha]$ (by an already given argument). This implies $P(\alpha^p)=0$ if $p>A$. 
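<br /><br /><b>Example.</b> As a small illustration (it is not taken from Landau's paper or from Milne's notes), take $n=3$, so that $\alpha$ is a primitive cube root of unity and $P=T^2+T+1$, which is irreducible since it has no rational root. The elements $P(\alpha^k)$ only depend on $k$ modulo $3$:<br />\[ P(\alpha^0)=P(1)=3, \qquad P(\alpha)=P(\alpha^2)=0, \]<br />so that one can take $A=3$ with respect to the basis $1,\alpha$ of $\mathbf Z[\alpha]$. For $p=3$, one finds $P(\alpha^3)=3\in 3\mathbf Z[\alpha]$, as expected; for a prime number $p>3$, the element $P(\alpha^p)$ is either $0$ or $3$ and its coefficients have to be multiples of $p$, hence $P(\alpha^p)=0$.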
<br /><br />By induction, one has $P(\alpha^m)=0$ for any integer $m$ whose prime factors are all $>A$.<br /><br />On the other hand, if $m$ is an integer prime to $n$ and $\Pi$ is the product of all prime numbers $p$ such that $p\leq A$ and $p$ does not divide $m$, then $m'=m+n\Pi$ is another integer all of whose prime factors are $>A$. (Indeed, if $p\leq A$, then either $p\mid m$, in which case $p\nmid n\Pi$ so that $p\nmid m'$, or $p\nmid m$, in which case $p\mid n\Pi$ so that $p\nmid m'$.) Since $m'\equiv m \pmod n$, one has $P(\alpha^{m})=P(\alpha^{m'})=0$.<br /><br />This shows that all primitive $n$th roots of unity are roots of $P$, hence $P=\Phi_n$.<br /><br /><b>Comment. —</b> <i>This proof is of a rather mysterious nature to me.</i><br /><br /><b>4. Using Galois theory to pass from local information to global information</b><br /><br />The cyclotomic extension $K_n$ contains, as subextensions, the cyclotomic extensions $K_{p_i^{e_i}}$, where $n=\prod p_i^{e_i}$ is the decomposition of $n$ as a product of powers of prime numbers. By the first case, $K_{p^e}$ has degree $\phi(p^e)=p^{e-1}(p-1)$ over $\mathbf Q$. To prove that $\Phi_n$ is irreducible, it suffices to prove that these extensions are linearly disjoint, which is the object of the following lemma.<br /><br /><b>Lemma. — </b><i>Let $m$ and $n$ be integers and let $d$ be their gcd. Then $K_m\cap K_n=K_d$.</i><br /><br />This is an application of Galois theory (and the result holds for every ground field as soon as its characteristic does not divide $m$ and $n$). Let $N$ be the least common multiple of $m$ and $n$. One has $K_N=K_m\cdot K_n$, and the cyclotomic character furnishes a group morphism $\operatorname{Gal}(K_N/\mathbf Q)\to (\mathbf Z/N\mathbf Z)^\times$. 
The Galois groups $\operatorname{Gal}(K_N/K_m)$ and $\operatorname{Gal}(K_N/K_n)$ corresponding to the subfields $K_m$ and $K_n$ are the kernels of the composition of the cyclotomic character with the projections to $(\mathbf Z/m\mathbf Z)^\times$ and $(\mathbf Z/n\mathbf Z)^\times$, and the intersection $K_m\cap K_n$ corresponds to the subgroup generated by these two kernels, which is none other than the kernel of the composition of the cyclotomic character with the projection to $(\mathbf Z/d\mathbf Z)^\times$.<br /><br />Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com0tag:blogger.com,1999:blog-8231917611006633375.post-76878207760457764142019-05-19T01:01:00.000+02:002019-05-19T01:01:25.818+02:00Designs, Skolem sequences, and partitions of integers<a href="https://www.bedetheque.com/media/Couvertures/Couv_249157.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" data-original-height="800" data-original-width="562" height="320" src="https://www.bedetheque.com/media/Couvertures/Couv_249157.jpg" width="225" /></a>Recently, my father gave me the first volume of a graphic novel by Jean-François Kierzkowski and Marek called <i>La suite de Skolem</i> (Skolem's sequences). I knew about the Norwegian mathematician Thoralf Skolem for two different reasons (the Löwenheim-Skolem theorem in model theory, and some diophantine equations that Laurent Moret-Bailly put in a more geometric setting, see his series of papers on <i>Problèmes de Skolem</i>), but I had never heard about Skolem sequences.<br /><br />They appear in his 1957 paper, <a href="https://doi.org/10.7146/math.scand.a-10490"><i>On certain distributions of integers in pairs with given differences</i></a> (Math Scand., <i>5</i>, 57-68).<br />The question is to put the integers $1,2,\dots,2n$ in $n$ pairs $(a_1,b_1),\dots,(a_n,b_n)$ such that the differences are all different, namely $b_i-a_i=i$ for $i\in\{1,\dots,n\}$. 
One can put it differently: write a sequence of $2n$ integers, where each of the integers from $1$ to $n$ appears exactly twice, the two $1$s being at distance $1$, the two $2$s at distance $2$, etc.<br />For example, $4,2,3,2,4,3,1,1$ is a Skolem sequence for $n=4$, corresponding to the pairs $(7,8), (2,4), (3,6),(1,5)$.<br /><br /><table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; text-align: left;"><tbody><tr><td style="text-align: center;"><a href="https://4.bp.blogspot.com/-x-0r2TAhXTI/XOCBpP7eJwI/AAAAAAAAJ2g/qK1_yeRti-gJITBpZeJEPggzHy6cPq96QCLcBGAs/s1600/cavaliers-IHES-Pantaloni.jpg" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"><img alt="Le jeu des cavaliers (Jessica Stockholder) — photo V. Pantaloni" border="0" data-original-height="626" data-original-width="1024" height="195" src="https://4.bp.blogspot.com/-x-0r2TAhXTI/XOCBpP7eJwI/AAAAAAAAJ2g/qK1_yeRti-gJITBpZeJEPggzHy6cPq96QCLcBGAs/s320/cavaliers-IHES-Pantaloni.jpg" title="Le jeu des cavaliers (Jessica Stockholder) — photo V. Pantaloni" width="320" /></a></td></tr><tr><td class="tr-caption" style="text-align: center;">Le jeu des cavaliers (photo V. Pantaloni)</td></tr></tbody></table>The possibility of such sequences has been materialized in the form of a giant sculpture, <i>Le jeu des cavaliers</i> by Jessica Stockholder, at the Institut des Hautes Études Scientifiques (IHÉS) at Bures-sur-Yvette.<br /><br />There is a basic necessary and sufficient condition for such a sequence to exist, namely $n$ has to be congruent to $0$ or $1$ modulo $4$. The proof of necessity is easy (attributed by Skolem to Bang): one has $\sum_{i=1}^n(b_i-a_i)=n(n+1)/2$, and $\sum_{i=1}^n (b_i+a_i)=2n(2n+1)/2$, so that $\sum_{i=1}^n b_i=n(n+1)/4+2n(2n+1)/4=n(5n+3)/4$. If $n$ is even, this forces $n\equiv 0 \pmod 4$, while if $n$ is odd, this forces $5n+3\equiv 0\pmod 4$, hence $n\equiv 1\pmod 4$. 
The proof of existence consists in an explicit example of such a sequence, which is written down in Skolem's paper.<br /><br />Skolem's motivation is only alluded to in that paper, but he explains it a bit further the following year. In <a href="https://doi.org/10.7146/math.scand.a-10551"><i>Some Remarks on the Triple Systems of Steiner</i></a>, he gives the recipe that furnishes such a system from a Skolem sequence. A Steiner triple system on a set $S$ is the datum of triples of elements of $S$ such that each pair of elements of $S$ appears in exactly one triple. In other words, it is a $(3,2,1)$-design on $S$, an $(m,p,q)$-design on a set $S$ being a family of $m$-subsets of $S$ such that each $p$-subset appears in exactly $q$ of those subsets. Some relatively obvious divisibility conditions can be written down that give a necessary condition for the existence of designs with given parameters, but actual existence is much more difficult. In fact, it has been shown only recently by Peter Keevash that these necessary conditions are sufficient, provided the cardinality of the set $S$ is large enough, see Gil Kalai's talk <a href="http://www.bourbaki.ens.fr/TEXTES/1100.pdf"><i>Designs exist!</i></a> at the Bourbaki Seminar.<br /><br />In the case of Steiner triple systems, the condition is that the number $s$ of elements of $S$ be congruent to $1$ or $3$ modulo $6$. Indeed, there are $s(s-1)/2$ pairs of elements of $S$, and each 3-subset of the triple system features 3 such pairs, so that there are $N=s(s-1)/6$ triples. On the other hand, each element of $S$ appears in exactly $3N/s=(s-1)/2$ triples, so that $(s-1)/2$ is an integer. 
So $s$ has to be odd, and either $3$ divides $s$ (in which case $s\equiv 3\pmod 6$) or $s\equiv 1\pmod 6$.<br /><br />And Skolem's observation is that a family of $n$ pairs $(a_i,b_i)$ as above furnishes a triple system on the set $S=\mathbf Z/(6n+1)\mathbf Z$, namely the triples $(i,i+j,i+b_j+n)$ where $i\in\mathbf Z/(6n+1)\mathbf Z$ and $1\leq j\leq n$, thus constructing Steiner triple systems on a set of cardinality $6n+1$, whenever $n\equiv 0,1\pmod 4$.<br /><br />My surprise came when reading the rest of Skolem's 1957 paper, because I knew the result he then described but had no idea it was due to him. (In fact, it was one of the first homework assignments my math teacher Johan Yebbou gave to us when I was in classes préparatoires.) And since this result is very nice, let me tell you about it.<br /><br /><b>Theorem.</b> — <i>Let $\alpha>1$ and $\beta>1$ be irrational real numbers such that $\alpha^{-1}+\beta^{-1}=1$. Then each strictly positive integer can be written either as $\lfloor \alpha n\rfloor$, or as $\lfloor \beta n\rfloor$, for some integer $n\geq 1$, but not in both ways.</i><br /><br />First of all, assume $N=\lfloor \alpha n\rfloor=\lfloor \beta m\rfloor$. Using that $\alpha,\beta$ are irrational, we thus write $N< \alpha n<N+1$ and $N<\beta m<N+1$. Dividing these inequalities by $\alpha$ and $\beta$ and adding them, we get $N<n+m<N+1$, since $\alpha^{-1}+\beta^{-1}=1$; this is absurd because $n+m$ is an integer. This proves that any given integer can be written in at most one of those two forms.<br /><br />Since $\alpha^{-1}+\beta^{-1}=1$, one of $\alpha,\beta$ has to be $<2$. Assume that $1<\alpha<2$. The integers of the form $\lfloor \alpha n\rfloor$ form a strictly increasing sequence, and we want to show that any integer it avoids can be written $\lfloor \beta m\rfloor$.<br /><br />Set $\gamma=\alpha-1$, so that $\beta=\alpha/(\alpha-1)=1+1/\gamma$. 
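<br /><br /><b>Example.</b> To fix ideas, here is a numerical illustration (the numbers are chosen for this purpose and do not come from Skolem's paper): take for $\alpha$ the golden ratio $(1+\sqrt 5)/2\approx 1.618$. Then $\gamma=\alpha-1=1/\alpha$ and $\beta=1+1/\gamma=1+\alpha=\alpha^2\approx 2.618$, and the first values are<br />\[ \lfloor \alpha n\rfloor = 1,3,4,6,8,9,11,12,\dots \qquad \lfloor \beta n\rfloor = 2,5,7,10,13,\dots \]<br />Each integer from $1$ to $13$ indeed appears in exactly one of the two lists.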
<br /><br />For every integer $n\geq 1$, we have $\lfloor \alpha(n+1)\rfloor = \lfloor \alpha n\rfloor + 1$ or $\lfloor \alpha(n+1)\rfloor=\lfloor \alpha n\rfloor+2$, so that if $\lfloor \alpha n\rfloor + 1$ is avoided, one has $\lfloor \alpha (n+1)\rfloor=\lfloor\alpha n\rfloor +2$. <br /><br />Then, $\lfloor \alpha n\rfloor = n+\lfloor \gamma n\rfloor=n+k-1$, where $k=1+\lfloor \gamma n\rfloor$. The inequalities $k-1<\gamma n <k$ imply $k/\gamma - 1/\gamma< n<k/\gamma$. Moreover, $\lfloor \alpha(n+1)\rfloor=n+1+\lfloor \gamma(n+1)\rfloor=n+k+1$, so that $\lfloor \gamma(n+1)\rfloor=k$, that is, $k<\gamma(n+1)<k+1$, hence $n>k/\gamma-1$. Together with the inequality $n<k/\gamma$, this proves that $n=\lfloor k/\gamma\rfloor$. Then, $\lfloor k\beta\rfloor=k+\lfloor k/\gamma\rfloor=k+n=\lfloor \alpha n\rfloor +1$.<br /><br />Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com0tag:blogger.com,1999:blog-8231917611006633375.post-84351311558196060102018-05-12T23:47:00.001+02:002018-10-10T12:08:58.889+02:00A theorem of Lee—Yang on root of polynomialsA recent <a href="https://mathoverflow.net/questions/299304/a-family-of-polynomials-whose-zeros-all-lie-on-the-unit-circle/299334#299334">MathOverflow</a> post asked for a proof that the roots of certain polynomials were located on the unit circle. A comment by <a href="http://www-math.mit.edu/~rstan/">Richard Stanley</a> pointed to a beautiful theorem of <a href="https://en.wikipedia.org/wiki/Tsung-Dao_Lee">T. D. Lee</a> and <a href="https://en.wikipedia.org/wiki/Chen-Ning_Yang">C. N. Yang</a>. By the way, these two authors were physicists, got the Nobel prize in physics in 1957, and were the first Chinese scientists to be honored by the Nobel prize.<br /><br />This theorem appears in an appendix to their paper, <a href="https://doi.org/10.1103/PhysRev.87.410">Statistical Theory of Equations of State and Phase Transitions. II. Lattice Gas and Ising Model</a>, published in Phys. 
Review in 1952, and devoted to properties of the partition function of some lattice gases. Here is a discussion of this theorem, following both the initial paper and <a href="https://homes.cs.washington.edu/~shayan/courses/cse599-s/counting-17.pdf">notes</a> by <a href="https://homes.cs.washington.edu/~shayan/">Shayan Oveis Gharan</a>.<br /><br />Let $A=(a_{i,j})_{1\leq i,j\leq n}$ be a Hermitian matrix ($a_{i,j}=\overline{a_{j,i}}$ for all $i,j$). Define a polynomial<br />\[ F(T) = \sum_{S\subset\{1,\dots,n\}} \prod_{\substack{i\in S \\ j\not\in S}} a_{i,j} T^{\# S}. \]<br /><br /><b>Theorem 1 (Lee-Yang).</b> — <i>If $\lvert a_{i,j}\rvert\leq 1$ for all $i,j$, then all roots of $F$ have absolute value $1$.</i><br /><br />This theorem follows from a multivariate result. Let us define<br />\[ P(T_1,\dots,T_n) = \sum_{S\subset\{1,\dots,n\}} \prod_{\substack{i\in S \\ j\not\in S}} a_{i,j} \prod_{i\in S} T_i .\]<br />Say that a polynomial $F\in\mathbf C[T_1,\dots,T_n]$ is good if it has no root $(z_1,\dots,z_n)\in\mathbf C^n$ such that $\lvert z_i\rvert <1$ for all $i$.<br /><br /><b>Proposition 2 (Lee-Yang).</b> — <i>If $\lvert a_{i,j}\rvert\leq 1$ for all $i,j$, then $P$ is good.</i><br /><br />For every pair $(i,j)$, set $a^S_{i,j}=a_{i,j}$ if $i$ belongs to $S$, but not $j$, and set $a^S_{i,j}=1$ otherwise. Consequently,<br />\[ P(T_1,\dots,T_n) = \sum_{S\subset\{1,\dots,n\}}\prod_{i,j} a^S_{i,j} \prod_{k\in S} T_k. \]<br />In other words, if we define polynomials<br />\[ P_{i,j} (T_1,\dots,T_n) = \sum_{S\subset\{1,\dots,n\}} a^S_{i,j} \prod_{k\in S} T_k, \]<br />then $P$ is the “coefficientwise product” of the polynomials $P_{i,j}$.<br />We also note that these polynomials have degree at most one with respect to every variable. These observations may motivate the following lemmas concerning good polynomials.<br /><br /><i>Lemma 1. — If $P,Q$ are good, then so is their product.</i><br /><br /><i>Lemma 2. 
— If $P(T_1,\dots,T_n)$ is good, then $P(a,T_2,\dots,T_n)$ is good for every $a\in\mathbf C$ such that $\lvert a\rvert \leq 1$.</i><br /><br />This is obvious if $\lvert a\rvert <1$; in the general case, this follows from the Rouché theorem — the set of polynomials (of bounded degree) whose roots belong to some closed subset is closed.<br /><br /><i>Lemma 3. — If $\lvert a\rvert \leq 1$, then $1+aT$ is good.</i><br /><br />This is obvious.<br /><br /><i>Lemma 4. — If $\lvert a\rvert\leq 1$, then $P=1+aT_1+\bar a T_2+ T_1T_2$ is good.</i><br /><br />If $\lvert a\rvert =1$, then $P=(1+aT_1)(1+\bar aT_2)$ is good, as the product of two good polynomials.<br />Now assume that $\lvert a\rvert <1$. Let $(z_1,z_2)$ be a root of $P$ such that $\lvert z_1\rvert<1$. One has<br />\[ z_2 = - \frac{1+az_1}{z_1+\bar a}. \]<br />Since the Möbius transformation $z\mapsto (z+\bar a)/(1+a z)$ defines a bijection from the unit open disk to itself, one has $\lvert z_2\rvert >1$.<br /><br /><i>Lemma 5. — If $P=a+bT_1+cT_2+dT_1T_2$ is good, then $Q=a+dT$ is good.</i><br /><br />Assume otherwise, so that $\lvert a\rvert <\lvert d\rvert$. By symmetry, we assume $\lvert b\rvert\geq \lvert c\rvert$. We write $P (T_1,T_2) = (a+cT_2) + (b+dT_2) T_1$.<br />Choose $z_2\in\mathbf C$ such that $dz_2$ and $b$ have the same argument; if, moreover, $\lvert z_2\rvert <1$ and $\lvert z_2\rvert$ is close enough to $1$, then<br />\[ \lvert b+dz_2\rvert=\lvert b\rvert +\lvert dz_2\rvert > \lvert a\rvert+\lvert c\rvert\geq\lvert a+cz_2\rvert. \]<br />Consequently, the polynomial $P(T_1,z_2)$ is not good; a contradiction.<br /><br /><i>Lemma 6. — If $P,Q$ are good polynomials of degree at most one in each variable, then so is their coefficientwise product.</i><br /><br />We first treat the case of one variable: then $P=a+bT$ and $Q=a'+b'T$, so that their coefficientwise product is given by $R=aa'+bb'T$. 
By assumption $\lvert a\rvert \geq \lvert b\rvert$ and $a\neq 0$;<br />similarly, $\lvert a'\rvert \geq \lvert b'\rvert $ and $a'\neq 0$. Consequently, $aa'\neq0$ and $\lvert aa'\rvert \geq \lvert bb'\rvert$, which shows that $R$ is good.<br />We prove the result by induction on $n$. For every subset $S$ of $\{1,\dots,n-1\}$, let $a_S$ and $b_S$ be the coefficients of $\prod_{i\in S}T_i$ and of $\prod_{i\in S} T_i \cdot T_n$ in $P$; define similarly $c_S$ and $d_S$ with $Q$. The coefficientwise product of $P$ and $Q$ is equal to<br />\[ R= \sum_S (a_S c_S +b_S d_S T_n ) \prod_{i\in S} T_i . \]<br />Let $z\in\mathbf C$ be such that $\lvert z\rvert \leq 1$, so that<br />\[ P(T_1,\dots,T_{n-1},z)= \sum _S (a_S+b_S z) \prod_{i\in S} T_i \] is good, by lemma 2. Similarly, for $w\in\mathbf C$ such that $\lvert w\rvert\leq 1$, $Q(T_1,\dots,T_{n-1},w)<br />=\sum _S (c_S+d_S w) \prod_{i\in S} T_i$ is good. By induction, their coefficientwise product, given by<br />\[ R_{z,w} = \sum_S (a_S+b_S z)(c_S+d_S w) \prod_{i\in S} T_i \]<br />is good as well.<br />We now fix complex numbers $z_1,\dots,z_{n-1}$ of absolute value $<1$. By what precedes, the polynomial<br />\[ S(T,U) = (\sum_S a_S c_S z_S) + (\sum_S b_Sc_S z_S) T + (\sum_S a_S d_S z_S) U<br />+ (\sum_S b_S d_S z_S) TU \]<br />is good, where $z_S=\prod_{i\in S}z_i$. According to lemma 5, the polynomial<br />\[ R(z_1,\dots,z_{n-1},T) = (\sum_S a_S c_S z_S) + (\sum_S b_S d_S z_S) T \]<br />is good. This proves that $R$ is good.<br /><br /><i>Proof of proposition 2. — </i>We have already observed that the polynomial $P$ is the coefficientwise product of polynomials $P_{i,j}$, each of which has degree at most one in each variable. On the other hand, one has<br />\[ P_{i,j} = (1+a_{i,j} T_i + a_{j,i} T_j + T_i T_j) \prod_{k\neq i,j} (1+T_k), \]<br />a product of good polynomials, so that $P_{i,j}$ is good. This proves that $P$ is good.<br /><br /> In fact, more is true. 
Indeed, one has<br />\[<br />\begin{align*} T_1\dots T_n P(1/T_1,\dots,1/T_n)<br />& = \sum_ S \prod_{\substack{i\in S \\ j\notin S}} a_{i,j} \prod_{i\notin S} T_i \\<br />& =P^*(T_1,\dots,T_n)<br />\end{align*}<br />\]<br />where $P^*$ is defined using the transpose matrix of $A$. Consequently, $P$ has no root $(z_1,\dots,z_n)$ with $\lvert z_i\rvert >1$ for every $i$.<br /><br /><i>Proof of theorem 1. —</i> Let $z$ be a root of $F$. Since the polynomial $P$ is good, so is the one-variable polynomial $F(T)=P(T,\dots,T)$. In particular, $F(z)=0$ implies $\lvert z\rvert \geq 1$. But the polynomial $F$ has a symmetry property, inherited from that of $P$, namely $ T^n F(1/T)=F^*(T)$, where $F^*$ is defined using the transpose matrix of $A$. Since the transpose matrix satisfies the same hypotheses as $A$, the polynomial $F^*$ is good as well. Consequently, $F^*(1/z)=0$ and $\lvert 1/z\rvert \geq 1$. We thus have shown that $\lvert z\rvert=1$.<br /><br /><br /><br /><br />Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com2tag:blogger.com,1999:blog-8231917611006633375.post-55691953856887292572018-05-02T11:24:00.003+02:002018-05-02T11:24:43.671+02:00Combinatorics and probability : Greene, Nijenhuis and Wilf's proof of the hook length formulaI've never been very good at remembering representation theory, past the general facts that hold for all finite groups, especially the representation theory of symmetric groups. So here are personal notes to help me understand the 1979 paper by Greene, Nijenhuis and Wilf, where they give a probabilistic proof of the <a href="https://en.wikipedia.org/wiki/Hook_length_formula">hook length formula</a>. If you already know what it is about, then you'd be quicker browsing the Wikipedia page that I just linked to!<br /><br />The heroes of this story are partitions, Ferrers diagrams, and Young tableaux. 
Let us first give definitions.<br /><br />A <i>partition</i> of an integer $n$ is a decreasing (US: non-increasing) sequence $\lambda =(\lambda_1\geq \lambda_2\geq \dots\geq \lambda_m)$ of strictly positive integers such that $|\lambda|=\lambda_1+\dots+\lambda_m=n$.<br /><br />The <i>Ferrers diagram</i> $F(\lambda)$ of this partition $\lambda$ is the stairs-like graphic representation consisting of the first quadrant cells indexed by integers $(i,j)$, where $1\leq i\leq m$ and $1\leq j\leq \lambda_i$. Visualizing a partition by means of its Ferrers diagram makes clear that there is an involution on partitions, $\lambda\mapsto \lambda^*$, which, geometrically, consists in applying the symmetry with respect to the diagonal that cuts the first quadrant. In formulas, $j\leq \lambda_i$ if and only if $i\leq \lambda_j^*$.<br /><br />A <i>Young tableau</i> of shape $\lambda$ is an enumeration of the $n$ cells of this Ferrers diagram such that the enumeration strictly increases in rows and columns.<br /><br />There is a graphic way of representing Ferrers diagrams and Young tableaux — the tradition says it's the Soviet way — consisting in rotating the picture by 45° ($\pi/4$) to the left and viewing the first quadrant as a kind of bowl in which $n$ balls with diameter $1$ are thrown from the top.<br /><br />The <i>hook length formula</i> is a formula for the number $f_\lambda$ of Young tableaux of given shape $\lambda$.<br /><br />For every $(i,j)$ such that $1\leq i\leq m$ and $1\leq j\leq \lambda_i$, define its <i>hook</i> $H_{ij}$ to be the set of cells in the diagram that are either above it, or on its right — in formulas, the set of all pairs $(a,b)$ such that $a=i$ and $j\leq b\leq \lambda_i$, or $i\leq a\leq m$ and $b=j\leq \lambda_a$. Let $h_{ij}$ be the cardinality of the hook $H_{ij}$.<br /><br /><i>Lemma. 
— One has $h_{a,b} = (\lambda_a-b)+(\lambda_b^*-a) +1$.</i><br /><br />Indeed, $\lambda_a-b$ is the number of cells above $(a,b)$ in the Ferrers diagram of $\lambda$, excluding $(a,b)$, while $(\lambda_b^*-a)$ is the number of cells on the right of $(a,b)$.<br /><br /><b>Theorem (Frame, Thrall, Robinson; 1954). —</b> <i>Let $n$ be an integer and let $\lambda$ be a partition of $n$. The number of Young tableaux of shape $\lambda$ is given by </i><br /><i>\[ f_\lambda = \frac{n!}{\prod_{(i,j)\in F(\lambda)} h_{ij}}. \]</i><br /><br />From the point of view of representation theory, partitions of $n$ are in bijection with conjugacy classes of elements in the symmetric group $\mathfrak S_n$ (the lengths of the orbits of a permutation of $\{1,\dots,n\}$ can be sorted into a partition of $n$, and this partition characterizes the conjugacy class of the given permutation). Then, to each partition of $n$ corresponds an irreducible representation of $\mathfrak S_n$, and $f_\lambda$ appears to be its dimension. (In a future post, I plan to explain this part of the story.)<br /><br />As already said, the rest of this post is devoted to explaining the probabilistic proof due to Greene, Nijenhuis, and Wilf. (Aside: Nijenhuis is a Dutch name that should be pronounced roughly like Nay-en uys.)<br /><br />A corner of a Ferrers diagram is a cell $(i,j)$ which is both on top of its column, and on the right of its row; in other words, it is a cell whose associated hook is made of itself only. A bit of thought convinces one that a corner can be removed, and that this furnishes a Ferrers diagram with one cell less. Conversely, starting from a Ferrers diagram with $n-1$ cells, one may add a cell on the boundary so as to get a Ferrers diagram with $n$ cells. In the partition point of view, either one part gets one more item, or there is one more part, with only one item. 
In a Young tableau with $n$ cells, the cell numbered $n$ is at a corner, and removing it furnishes a Young tableau with $n-1$ cells; conversely, starting from a Young tableau with $n-1$ cells, one can add a cell so that it becomes a corner of the new tableau, and label it with $n$.<br /><br />Let $P(\lambda_1,\dots,\lambda_m)$ be the number on the right hand side of the Frame-Thrall-Robinson formula. By convention, it is set to be $0$ if $(\lambda_1,\dots,\lambda_m)$ does not satisfy $\lambda_1\geq\dots\geq\lambda_m\geq 1$. By induction, one wants to prove<br />\[ P(\lambda_1,\dots,\lambda_m) = \sum_{i=1}^m P(\lambda_1,\dots,\lambda_{i-1},\lambda_i-1,\lambda_{i+1},\dots,\lambda_m). \]<br />For every partition $\lambda$, set<br />\[ p_i(\lambda_1,\dots,\lambda_m) = \frac{P(\lambda_1,\dots,\lambda_{i-1},\lambda_i-1,\lambda_{i+1},\dots,\lambda_m)}{P(\lambda_1,\dots,\lambda_m)}. \]<br />We thus need to prove<br />\[ \sum_{i=1}^m p_i(\lambda)=1; \]<br />which we will do by interpreting the $p_i(\lambda)$ as the probabilities of disjoint events.<br /><br />Given the Ferrers diagram $F(\lambda)$, let us pick, at random, one cell $(i,j)$, each of them given equal probability $1/n$; then we pick a new cell, at random, in the hook of $(i,j)$, each of them given equal probability $1/(h_{ij}-1)$, etc., until we reach a corner of the given diagram. Such a trial defines a path in the Ferrers diagram, ending at a corner $(a,b)$; its projections are denoted by $A=\{a_1<a_2<\dots\}$ and $B=\{b_1<b_2<\dots\}$. Let $p(a,b)$ be the probability that we reach the corner $(a,b)$; let $q(A,B)$ be the probability that its projections be $A$ and $B$ conditioned to the hypothesis that it start at $(\inf(A),\inf(B))$.<br /><br /><i>Lemma. — Let $A,B$ be sets of integers, let $a=\sup(A)$ and let $b=\sup(B)$; assume that $(a,b)$ is a corner of $\lambda$. 
One has \[ q(A,B)= \prod_{\substack{i\in A\\ i\neq a}} \frac1{h_{i,b}-1} \prod_{\substack{j\in B\\ j\neq b}} \frac1{h_{a,j}-1}.\]</i><br /><br />We argue by induction on the cardinalities of $A$ and $B$. If $A=\{a\}$ and $B=\{b\}$, then $q(A,B)=1$, since both products are empty; this proves the formula in this case. As above, let $a_1<a_2<\dots$ be the enumeration of the elements of $A$ and $b_1<b_2<\dots$ be that for $B$; let also $A'=A\setminus\{a_1\}$ and $B'=B\setminus\{b_1\}$. By construction of the process, after having chosen the initial cell $(a_1,b_1)$, it either moves above the initially chosen cell $(a_1,b_1)$, hence to $(a_1,b_2)$, or to its right, that is, to $(a_2,b_1)$. One thus has<br />\[ \begin{align*}<br />q(A,B) = \mathbf P(A,B\mid a_1,b_1)<br />& = \mathbf P(a_1,b_1,b_2\mid a_1,b_1) \mathbf P(A,B\mid a_1,b_1,b_2)<br />+ \mathbf P(a_1,a_2,b_1\mid a_1,b_1) \mathbf P(A,B\mid a_1,a_2,b_1)\\<br />&= \frac1{h_{a_1,b_1}-1} \mathbf P(A,B'\mid a_1,b_2)<br />+ \frac1{h_{a_1,b_1}-1} \mathbf P(A',B\mid a_2,b_1) \\<br />&= \frac1{h_{a_1,b_1}-1}\left( q(A',B) + q(A,B') \right). \end{align*}<br />\]<br />By induction, we may assume that the given formula holds for $(A',B)$ and $(A,B')$. Then, one has<br />\[ \begin{align*}<br />q(A,B) & = \frac1{h_{a_1,b_1}-1} \left(<br /> \prod_{\substack{i\in A'\\ i\neq a}} \frac1{h_{i,b}-1} \prod_{\substack{j\in B \\ j\neq b}} \frac1{h_{a,j}-1}<br />+<br /> \prod_{\substack{i\in A\\ i\neq a}} \frac1{h_{i,b}-1} \prod_{\substack{j\in B' \\ j\neq b}} \frac1{h_{a,j}-1}\right) \\<br />& =\frac{ (h_{a_1,b}-1)+(h_{a,b_1}-1)}{h_{a_1,b_1}-1}<br /> \prod_{\substack{i\in A\\ i\neq a}} \frac1{h_{i,b}-1} \prod_{\substack{j\in B \\ j\neq b}} \frac1{h_{a,j}-1},<br />\end{align*}<br />\]<br />which implies the desired formula once one remembers that<br />\[ h_{a,b_1} + h_{a_1,b}<br />= h_{a_1,b_1} + h_{a,b}= h_{a_1,b_1}+1\]<br />since $(a,b)$ is a corner of $F(\lambda)$.<br /><br /><b>Proposition. 
—</b> <i>Let $(a,b)$ be a corner of the diagram $F(\lambda)$; one has $p(a,b)=p_a(\lambda)$. </i>(Note that $b=\lambda_a$.)<br /><br />Write $F_a(\lambda)$ for the Ferrers diagram with corner $(a,b)$ removed. Its $(i,j)$-hook is the same as that of $F(\lambda)$ if $i\neq a$ and $j\neq b$; otherwise, it has one element less. Consequently, writing $h'_{i,j}$ for the cardinalities of its hooks, one has<br />\[\begin{align*}<br />p_a(\lambda) &= \frac1n \frac{\prod_{(i,j)\in F(\lambda)}h_{i,j}}{\prod_{(i,j)\in F_a(\lambda)} h'_{i,j}} \\ &= \frac1n \prod_{i<a} \frac{h_{i,b}}{h_{i,b}-1} \prod_{j<b}\frac{h_{a,j}}{h_{a,j}-1}\\<br />&=\frac1n \prod_{i<a}\left(1+ \frac1{h_{i,b}-1}\right) \prod_{j<b} \left(1+\frac1{h_{a,j}-1}\right). \end{align*}<br />\]<br />Let us now expand the products. We get<br />\[<br />p_a(\lambda) = \frac1n \sum_{\sup(A)<a} \sum_{\sup(B)<b} \prod_{i\in A} \frac1{h_{i,b}-1}\prod_{j\in B}\frac1{h_{a,j}-1},<br />\]<br />where $A$ and $B$ range over the (possibly empty) subsets of $\{1,\dots,n\}$ satisfying<br />the given conditions $\sup(A)<a$ and $\sup(B)<b$. (Recall that, by convention, or by definition, one has $\sup(\emptyset)=-\infty$.) Consequently, one has<br />\[ \begin{align*}<br />p_a(\lambda) & =\frac1n \sum_{\sup(A)=a} \sum_{\sup(B)=b} \prod_{\substack{i\in A\\ i\neq a}}<br /> \frac1{h_{i,b}-1} \prod_{\substack{j\in B \\ j\neq b}} \frac1{h_{a,j}-1} \\<br />& = \frac1n \sum_{\sup(A)=a} \sum_{\sup(B)=b} q(A,B) \\<br />& = \sum_{\sup(A)=a} \sum_{\sup(B)=b} \mathbf P (A,B ) \\<br />& = p(a,b),<br />\end{align*}<br />\]<br />as claimed.<br /><br />Now, every trial has to end at some corner $(a,b)$, so that<br />\[ \sum_{\text{$(a,b)$ is a corner}} p(a,b) = 1.\] <br />On the other hand, if $(a,b)$ is a corner, then $b=\lambda_a$, while if $(a,\lambda_a)$ is not a corner, then $P_a(\lambda)=0$. 
We thus get $\sum_a P_a(\lambda)=P(\lambda)$, as was to be shown.<br /><br />Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com3tag:blogger.com,1999:blog-8231917611006633375.post-45790136252514638242018-02-07T16:10:00.000+01:002018-02-07T16:10:24.469+01:00Contemporary homological algebra — Ignoramus et ignorabimus (?)The title of this post is a quotation of Emil Dubois-Reymond (1818-1896), a 19th century German physiologist, and the elder brother of the mathematician Paul Dubois-Reymond. Meaning <i>we are ignorant, and we will remain ignorant, </i>it adopts a pessimistic point of view on science, which would have intrinsic limitations. As such, this slogan was strongly opposed by David Hilbert who declared, in 1900, at the International congress of mathematicians, that there is no <i>ignorabimus</i> in mathematics. (In fact, there is some <i>ignorabimus</i>, because of Gödel's incompleteness theorem, but that is not the subject of this post.)<br /><br />I would like to discuss here, in a particularly informal way, some frustration of mine relative to homological algebra, in particular to its most recent developments. I am certainly ill-informed on those matters, and one of my goals is to clarify my own ideas, my expectations, my hopes,...<br /><br />The mere existence of this post is due to the kind invitation of a colleague of the computer science department working in (higher) category theory, namely François Metayer, who was interested to understand my motivation for wanting to understand this topic.<br /><br /><br />Let me begin with a brief historical summary of the development of homological algebra, partly borrowed from Charles Weibel's <a href="http://www.math.rutgers.edu/~weibel/HA-history.pdf"><i>History of homological algebra</i></a>.<br /><ul><li>B. Riemann (1857), E. Betti (1871), H. Poincaré (1895) define homology numbers. </li><li>E. 
Noether (1925) introduces abelian groups, whose elementary divisors recover the previously defined homology numbers.</li><li>J. Leray (1946) introduces sheaves, their cohomology, the spectral sequence... </li><li>During the years 1940–1955, under the hands of Cartan, Serre, Borel, etc., the theory develops itself in various directions (cohomology of groups, new spectral sequences, etc.).</li><li>In their foundational book, <i>Homological algebra,</i> H. Cartan and S. Eilenberg (1956) introduce derived functors, projective/injective resolutions,...</li><li>Around 1950, A. Dold, D. Kan, J. Moore, D. Puppe introduce simplicial methods. D. Kan introduces adjoint functors.</li><li>A. Grothendieck, in <i>Sur quelques points d'algèbre homologique</i> (1957), introduces general abelian categories, as well as convenient axioms that guarantee the existence of enough injective objects, thus giving birth to a generalized homological algebra.</li><li>P. Gabriel and M. Zisman (1967) developed the abstract calculus of fractions in categories, and proved that the homotopy category of topological spaces coincides with that of simplicial sets.</li><li>J.-L. Verdier (1963) defines derived categories. This acknowledges that objects give rise to, say, injective resolutions which are canonical up to homotopy, and that the corresponding complex is an object in its own right, that has to be seen as equivalent to the initial object. The framework is that of triangulated categories. Progressively, derived categories came to play an important rôle in algebraic geometry (Grothendieck duality, Verdier duality, deformation theory, intersection cohomology and perverse sheaves, the Riemann–Hilbert correspondence, mirror symmetry,...) and representation theory.</li><li>D. Quillen (1967) introduces model categories, which allow a parallel treatment of homological algebra in linear contexts (modules, sheaves of modules...) and non-linear ones (algebraic topology)... This is completed by A. 
Grothendieck's (1991) notion of derivators.</li><li>At some point, the theory of dg-categories appears, but I can't locate it precisely, nor do I understand precisely its relation with other approaches.</li><li>A. Joyal (2002) begins the study of quasi-categories (which were previously defined by J. M. Boardman and R. M. Vogt, 1973). Under the name of $(\infty,1)$-categories or $\infty$-categories, these quasi-categories are used extensively in Lurie's work (his books <i>Higher topos theory,</i> 2006; <i>Higher algebra,</i> 2017; the 10+ papers on derived algebraic geometry,...).</li></ul><div>My main object of interest (up to now) is “classical” algebraic geometry, with homological algebra as an important tool via the cohomology of sheaves, and while I have barely used anything more abstract than cohomology sheaves (almost never complexes), I do agree that there are three main approaches to homological algebra: derived categories, model categories, and $\infty$-categories.<br /><br /></div><div>While I am not absolutely ignorant of the first one (I even lectured on them), the two other approaches still look esoteric to me and I can't say I master them (yet?). Moreover, their learning curve seems to be quite steep (Lurie's books total more than 2000 pages, plus the innumerable papers on derived algebraic geometry, etc.) and I do not really see how an average geometer should/could embark on this journey.</div><br /><div>However, I believe that this is now a necessary journey, and I would like to mention some recent theorems that support this idea.</div><br /><div>First of all, and despite its usefulness, the theory of triangulated/derived categories has many defects. 
Here are some of them:</div><div><ul><li>There is no (and there cannot be any) functorial construction of a cone; </li><li>When a triangulated category is endowed with a truncation structure, there is no natural functor from the derived category of its heart to the initial triangulated category; </li><li>Derived categories are not well suited for non-abelian categories (filtered derived categories seem to require additional, non-trivial, work, for example);</li><li>Unbounded derived functors are often hard to define: we now have homotopically injective resolutions at our disposal (Spaltenstein, Serpé, Alonso-Tarrió et al.), but unbounded Verdier duality still requires some unnatural hypotheses on the morphism, for example.</li></ul><div>Three results, now.</div></div><div><br /></div><div>The first theorem I want to mention is due to M. Greenberg (1966). <i>Given a scheme $X$ of finite type over a complete discrete valuation ring $R$ with uniformizer $\pi$, there exists an integer $a\geq 1$, such that for any integer $n\geq1$, a point $x\in X(R/\pi^n)$ lifts to $X(R)$ if and only if it lifts to $X(R/\pi^{an})$.</i><br /><br />It may be worth stating it in more concrete terms. Two particular cases of such a ring $R$ are the ring $k[[t]]$ of power series over some field $k$, in which case $\pi=t$, and the ring $\mathbf Z_p$ of $p$-adic integers (for some fixed prime number $p$), in which case one has $\pi=p$. It is then important to consider the case of affine schemes. Then $X=V(f_1,\dots,f_m)$ is defined by the vanishing of a finite family $f_1,\dots,f_m$ of polynomials in $R[T_1,\dots,T_n]$ in $n$ variables, so that, for any $R$-algebra $A$, $X(A)$ is the set of solutions in $A^n$ of the system $f_1(T_1,\dots,T_n)=\dots=f_m(T_1,\dots,T_n)=0$. By reduction modulo $\pi^r$, a solution in $R$ gives rise to a solution in $R/\pi^r$, and Greenberg's result is about the converse: given a solution $x$ in $R/\pi^r$, how to decide whether it is the reduction of a solution in $R$. 
A necessary condition is that $x$ lifts to a solution in $R/\pi^s$, for every $s\geq r$. Greenberg's theorem asserts that it is sufficient that $x$ lift to a solution in $R/\pi^{ar}$, for some integer $a\geq 1$ which depends on $X$ but not on $r$ or $x$.<br /><br /></div><div>The proof of this theorem is non-trivial, but relatively elementary. After some preparation, it boils down to Hensel's lemma or, equivalently, Newton's method for solving equations.</div><div>However, it seems to me that there should be an extremely conceptual way to prove this theorem, based on general deformation theory such as the one developed by Illusie (1971). Namely, obstructions to lifting $x$ are encoded by various cohomology classes, and knowing that it lifts far enough should suffice to see — on the nose — that these obstructions vanish.</div><div><br /></div><div>The second one is about cohomology of Artin stacks. Y. Laszlo and M. Olsson (2006) established the 6-operations package for $\ell$-adic sheaves on Artin stacks, but their statements have some hypotheses which look a bit unnatural. For example, the base scheme $S$ needs to be such that all schemes of finite type have finite $\ell$-cohomological dimension — this forbids $S=\operatorname{Spec}(\mathbf R)$. More recently, Y. Liu and W. Zheng developed a more general theory, apparently devoid of restrictive hypotheses, and their work builds on $\infty$-categories, more precisely, a stable $\infty$-category enhancing the unbounded derived category. On page 7 of their paper, they carefully explain why derived categories are insufficient to take care of the necessary descent data, but I can't say I understand their explanation yet...<br /><br />The last one is about the general formalism of 6-operations. 
While it is clear what these 6 operations should reflect (direct and inverse images; proper direct images and extraordinary inverse images; tensor product, internal hom), the list of the properties they should satisfy is not clear at all (to me). In the case of coherent sheaves, there is such a <a href="http://matematicas.unex.es/~navarro/res/sixoperations.pdf" style="font-style: italic;">formulaire</a>, written by A. Grothendieck himself on the occasion of a talk in 1983, but it is quite informal, and not at all a general formalism. Recently, F. Hörmann proposed such a formalism (2015–2017), based on Grothendieck's theory of derivators.<br /><br />Now, how should the average mathematician embark on learning these theories?<br /><br />Who will write the analogue of Godement's book for the homological algebra of the 21st century? Can we hope that it be shorter than 3000 pages?<br /><br />I hope to find, some day, some answer to these questions, and that they will allow me to hear with satisfaction the words of Hilbert: <i>Wir müssen wissen, wir werden wissen.</i></div><div></div>Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com2tag:blogger.com,1999:blog-8231917611006633375.post-23625378032695012672017-03-22T18:14:00.000+01:002017-03-22T21:28:08.377+01:00Warning! — Theorems aheadDon't worry, no danger ahead! — This is just a short post about the German mathematician Ewald Warning and the theorems that bear his name.<br /><br />It seems that Ewald Warning's name will be forever linked with that of Chevalley, for the Chevalley-Warning theorem is one of the rare modern results that can be taught to undergraduate students; in France, it is especially famous at the Agrégation level. 
(Warning published a second paper, in 1959, about the axioms of plane geometry.)<br /><br />Warning's paper, <i>Bemerkung zur vorstehenden Arbeit von Herrn Chevalley</i> (Remark on the preceding work of Mr Chevalley), was published in 1935 in the Publications of the mathematical seminar of Hamburg University (Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg), just after the mentioned paper of Chevalley. Emil Artin had a position in Hamburg at that time, which probably made the seminar very attractive; as a matter of fact, the same 1935 volume features a paper of Weil about Riemann-Roch, one of Burau about braids, one of Élie Cartan about homogeneous spaces, one of Santalo on geometric measure theory, etc.<br /><br />1. The classic statement of the Chevalley—Warning theorem is the following.<br /><br /><b>Theorem 1. – </b><i>Let $p$ be a prime number, let $q$ be a power of $p$ and let $F$ be a field with $q$ elements. Let $f_1,\dots,f_m$ be polynomials in $n$ variables and coefficients in $F$, of degrees $d_1,\dots,d_m$; let $d=d_1+\dots+d_m$. Let $Z=Z(f_1,\dots,f_m)$ be their zero-set in $F^n$. If $d<n$, then $p$ divides $\mathop{\rm Card}(Z)$.</i><br /><br />This is really a theorem of Warning, and Chevalley's theorem was the weaker consequence that if $Z\neq\emptyset$, then $Z$ contains at least two points. (In fact, Chevalley only considers the case $q=p$, but his proof extends readily.) The motivation of Chevalley lay in the possibility to apply this remark to the reduced norm of a possibly noncommutative finite field (a polynomial of degree $d$ in $d^2$ variables which vanishes exactly at the origin), thus providing a proof of Wedderburn's theorem. <br /><br />a) Chevalley's proof begins with a remark. For any polynomial $f\in F[T_1,\ldots,T_n]$, let $f^*$ be the polynomial obtained by replacing iteratively $T_i^q$ by $T_i$ in $f$, until the degree in each variable is $<q$. 
For all $a\in F^n$, one has $f(a)=f^*(a)$; moreover, using the fact that a nonzero polynomial in one variable of degree $<q$ has fewer than $q$ roots, one proves that if $f(a)=0$ for all $a\in F^n$, then $f^*=0$.<br /><br />Assume now that $Z$ contains exactly one point, say $a\in F^n$, let $f=\prod (1-f_j^{q-1})$, let $g_a=\prod (1-(x_i-a_i)^{q-1})$. Both polynomials take the value $1$ at $x=a$, and $0$ elsewhere; moreover, $g_a$ is reduced. Consequently, $f^*=g_a$. Then<br />$$ (q-1)n=\deg(g_a)=\deg(f^*)\leq \deg(f)=(q-1)\sum \deg(f_j)=(q-1)d, $$<br />contradicting the hypothesis that $d<n$.<br /><br />b) Warning's proof is genuinely different. He first defines, for any subset $A$ of $F^n$ a reduced polynomial $g_A=\sum_{a\in A} g_a=\sum_{a\in A}\prod (1-(x_i-a_i)^{q-1})$, and observes that $g_A(a)=1$ if $a\in A$, and $g_A(a)=0$ otherwise. <br />Take $A=Z$, so that $f^*=g_Z$. Using that $\deg(f^*)\leq \deg(f)=(q-1)d$ and the expansion <br />$ (x-a)^{q-1} = \sum_{i=0}^{q-1} x^i a^{q-1-i}$, Warning derives from the equality $f^*=g_Z$<br />the relations<br />\[ \sum_{a\in Z} a_1^{\nu_1}\dots a_n^{\nu_n}=0, \]<br />for all $(\nu_1,\dots,\nu_n)$ such that $0\leq \nu_i\leq q-1$ and $\sum\nu_i <(q-1)(n-d)$. The particular case $\nu=0$ implies that $p$ divides $\mathop{\rm Card}(Z)$. More generally: <br /><br /><b>Proposition 2. — </b><i>For every polynomial $\phi\in F[T_1,\dots,T_n]$ of reduced degree $<(q-1)(n-d)$, one has $\sum_{a\in Z} \phi(a) = 0$.</i><br /><br />c) The classic proof of that result is even easier. Let us recall it swiftly. First of all, for every integer $\nu$ such that $0\leq \nu <q-1$, one has $\sum_{a\in F} a^\nu=0$. This can be proved in many ways, for example by using the fact that the multiplicative group of $F$ is cyclic; on the other hand, for every nonzero element $t$ of $F$, the change of variables $a=tb$ leaves this sum both unchanged and multiplied by $t^\nu,$ so that taking $t$ such that $t^\nu\neq 1$, one sees that this sum vanishes. 
It follows that for every polynomial $f\in F[T_1,\dots,T_n]$ whose degree in <i>some</i> variable is $<q-1$, one has $\sum_{a\in F^n} f(a)=0$. This holds in particular if the total degree of $f$ is $< (q-1)n$. <br />Taking $f$ as above proves theorem 1.<br /><br />2. On the other hand there is a <i>second</i> Warning theorem, which seems to be absolutely neglected in France. It says the following:<br /><br /><b>Theorem 3. — </b><i>Keep the same notation as in theorem 1. If $Z$ is nonempty, then $\mathop{\rm Card}(Z)\geq q^{n-d}$.</i><br /><br />To prove this result, Warning starts from the following proposition:<br /><br /><b>Proposition 4. – </b><i>Let $L,L'$ be two parallel subspaces of dimension $d$ in $F^n$. Then $\mathop{\rm Card}(Z\cap L)$ and $\mathop{\rm Card}(Z\cap L')$ are congruent modulo $p$.</i><br /><br />Let $r=n-d$. Up to a change of coordinates, one may assume that $L=\{x_1=\dots=x_{r}=0\}$ and $L'=\{x_1-1=x_2=\dots=x_{r}=0\}$. Let<br />\[ \phi = \frac{1-x_1^{q-1}}{1-x_1} (1-x_2^{q-1})\cdots (1-x_{r}^{q-1}).\]<br />This is a polynomial of total degree $(q-1)r-1<(q-1)(n-d)$. For $a\in F^n$, one has $\phi(a)=1$ if $a\in L$, $\phi(a)=-1$ if $a\in L'$, and $\phi(a)=0$ otherwise. Proposition 4 thus follows from proposition 2. It is now very easy to prove theorem 3 in the particular case where there exists <i>one</i> subspace $L$ of dimension $d$ such that $\mathop{\rm Card}(Z\cap L)\not\equiv 0\pmod p$. Indeed, by proposition 4, the same congruence will hold for every translate $L'$ of $L$. In particular, $\mathop{\rm Card}(Z\cap L')\neq0$ for every translate $L'$ of $L$, and there are $q^{n-d}$ distinct translates.<br /><br />To prove the general case, let us choose a subspace $M$ of $F^n$ of dimension $s\leq d$ such that <br />$\mathop{\rm Card}(Z\cap M)\not\equiv 0\pmod p$, and let us assume that $s$ is maximal.<br />Assume that $s< d$. Let $t\in\{1,\dots,p-1\}$ be the integer such that $\mathop{\rm Card}(Z\cap M)\equiv t\pmod p$. 
For every $(s+1)$-dimensional subspace $L$ of $F^n$ that contains $M$, one has $\mathop{\rm Card}(Z\cap L)\equiv 0\pmod p$, by maximality of $s$, so that $Z\cap (L\setminus M)$ contains at least $p-t$ points. Since these subspaces $L$ are in 1-1 correspondence with the lines of the quotient space $F^n/M$, their number is equal to $(q^{n-s}-1)/(q-1)$. Consequently,<br />\[ \mathop{\rm Card}(Z) = \mathop{\rm Card}(Z\cap M) + \sum_L \mathop{\rm Card}(Z\cap (L\setminus M))<br />\geq t + (p-t) \frac{q^{n-s}-1}{q-1} \geq q^{n-s-1}\geq q^{n-d}, \]<br />as was to be shown.<br /><br />3. Classic theorems seem to be an everlasting source of food for thought.<br /><br />a) In 1999, Alon observed that Chevalley's theorem follows from the <i>Combinatorial Nullstellensatz</i> he had just proved. On the other hand, this approach allowed Brink (2011) to prove a similar result in general fields $F$, but restricting the roots to belong to a product set $A_1\times\dots\times A_n$, where $A_1,\dots,A_n$ are finite subsets of $F$ of cardinality $q$. See <a href="https://arxiv.org/pdf/1404.7793.pdf">that paper of Clark, Forrow and Schmitt</a> for further developments, in particular a version of Warning's second theorem.<br /><br />b) In the case of hypersurfaces (with the notation of theorem 1, $m=1$), Ax proved in 1964 that the cardinality of $Z$ is divisible not only by $p$, but by $q$. This led to renewed interest in the following years, especially in the works of Katz, Esnault, Berthelot, and the well has not dried up yet.<br /><br />c) In 2011, Heath-Brown published a paper where he uses Ax's result to strengthen the congruence modulo $p$ of proposition 4 to a congruence modulo $q$.<br /><br />d) By a Weil restriction argument, a 1995 paper of Moreno-Moreno partially deduces the Chevalley-Warning theorem over a field of cardinality $q$ from its particular case over the prime field. 
I write partially because they obtain a divisibility by an expression of the form $p^{\lceil f \alpha\rceil}$, while one expects $q^{\lceil \alpha\rceil}=p^{f\lceil\alpha\rceil}$. However, the same argument allows them to obtain a stronger bound which involves not the degrees of the polynomials, but the $p$-weights of these degrees, that is the sum of their digits in their base $p$ expansions. Again, they obtain a divisibility by an expression of the form $p^{\lceil f\beta\rceil}$, and it is a natural question to wonder whether the divisibility by $p^{f\lceil\beta\rceil}$ can be proved.<br /><br /><br />Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com0tag:blogger.com,1999:blog-8231917611006633375.post-75784199580195287572017-02-05T15:59:00.002+01:002017-02-05T15:59:18.637+01:00 Counting points and counting curves on varieties — Tribute to Daniel Perrin$\require{enclose}\def\VarC{\mathrm{Var}_{\mathbf C}}\def\KVarC{K_0\VarC}$<br /><a href="https://2.bp.blogspot.com/-vausKhjuLsI/WJY6bQivrGI/AAAAAAAAHbw/KW7rIN0xmCgUm4YnOr8xDY2o5waDzVm6wCLcB/s1600/Perrin%2B-%2BCours%2Bd%2527alge%25CC%2580bre.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="200" src="https://2.bp.blogspot.com/-vausKhjuLsI/WJY6bQivrGI/AAAAAAAAHbw/KW7rIN0xmCgUm4YnOr8xDY2o5waDzVm6wCLcB/s200/Perrin%2B-%2BCours%2Bd%2527alge%25CC%2580bre.png" width="133" /></a><a href="https://www.math.u-psud.fr/~perrin/">Daniel Perrin</a> is a French algebraic geometer who turned 70 last year. He is also well known in France for his wonderful teaching abilities. He was one of the cornerstones of the former École normale supérieure de jeunes filles, before it merged in 1985 with the rue d'Ulm school. From this time remains a <i>Cours d'algèbre</i> which is a must for all the students (and their teachers) who prepare the agrégation, the highest recruitment process for French high schools. 
He actually taught me Galois theory (at École normale supérieure in 1990/1991) and Algebraic Geometry (the year after, at Orsay). His teaching restlessly stresses the importance of examples. He has also been deeply involved in training future primary school teachers, as well as in devising the mathematical curriculum of high school students: he was responsible for the report on geometry. It has been a great honor for me to be invited to lecture during the <a href="https://www.math.u-psud.fr/jdp16/">celebration of his achievements</a> that took place at Orsay on November 23, 2016.<br /><br />Diophantine equations are a source of numerous arithmetic problems. One of them has been put forward by Manin in the 80s and consists in studying the behavior of the number of solutions of such equations of given size, when the bound grows to infinity. A geometric analogue of this question considers the space of all curves with given degree which are drawn on a fixed complex projective variety, and is interested in their behavior when the degree tends to infinity. This was the topic of my lecture and is the subject of this post.<br /><br />Let us first begin with an old problem, apparently studied by Dirichlet around 1840, and given a rigorous solution by Chebyshev and Cesàro around 1880: <i>the probability that two integers be coprime is equal to $6/\pi^2$.</i> Of course, there is no probability on the integers that has the properties one would expect, such as being invariant by translation, and the classical formalization of this problem states that the number of pairs $(a,b)$ of integers such that $1\leq a,b\leq n$ and $\gcd(a,b)=1$ grows as $n^2 \cdot 6/\pi^2$ when $n\to+\infty$.<br /><br />This can be proved relatively easily, for example as follows. Without the coprimality condition, there are $n^2$ such pairs. 
Now one needs to remove those pairs both of whose entries are multiples of $2$, and there are $\lfloor n/2\rfloor^2$ of those, those where $a,b$ are both multiples of $3$ ($\lfloor n/3\rfloor^2$), and then comes $5$, because we have already removed those even pairs, etc. for all prime numbers. But in this process, we have removed twice the pairs of integers both of whose entries are multiples of $2\cdot 3=6$, so we have to add them back, and then remove the pairs of integers both of which are multiples of $2\cdot 3\cdot 5$, etc. This leads to the following formula for<br />the cardinality $C(n)$ we are interested in:<br /><br />$\displaystyle<br /> C(n) = n^2 - \lfloor\frac n2\rfloor^2 - \lfloor \frac n3\rfloor^2-\lfloor \frac n5\rfloor^2 - \dots<br />+ \lfloor \frac n{2\cdot 3}\rfloor^2+\lfloor\frac n{2\cdot 5}\rfloor^2+\dots<br />- \lfloor \frac n{2\cdot 3\cdot 5} \rfloor^2 - \dots $.<br /><br />Approximating $\lfloor n/a\rfloor$ by $n/a$, this becomes<br /><br />$\displaystyle<br />C(n) \approx n^2 - \left(\frac n2\right)^2 - \left (\frac n3\right)^2-\left( \frac n5\right)^2 - \dots<br />+ \left (\frac n{2\cdot 3}\right)^2+\left(\frac n{2\cdot 5}\right)^2+\dots<br />- \left (\frac n{2\cdot 3\cdot 5} \right)^2 - \dots $<br /><br />which we recognize as<br /><br />$\displaystyle<br />C(n)\approx n^2 \left(1-\frac1{2^2}\right) \left(1-\frac1{3^2}\right)\left(1-\frac1{5^2}\right) \dots<br />=n^2/\zeta(2)$,<br /><br />where $\zeta(2)$ is the value at $s=2$ of Riemann's zeta function $\zeta(s)$. Now, Euler had revealed the truly arithmetic nature of $\pi$ by proving in 1734 that $\zeta(2)=\pi^2/6$. 
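<br /><br />As a quick numerical sanity check of this heuristic, here is a minimal Python sketch that counts the coprime pairs by brute force and compares the ratio $C(n)/n^2$ with $6/\pi^2$. The function name <code>coprime_pairs</code> is only an illustrative choice, and the sketch is an experiment, not a proof:<br /><br />

```python
import math


def coprime_pairs(n):
    """Count the pairs (a, b) with 1 <= a, b <= n and gcd(a, b) = 1 by brute force."""
    return sum(
        1
        for a in range(1, n + 1)
        for b in range(1, n + 1)
        if math.gcd(a, b) == 1
    )


# Compare the observed density C(n)/n^2 with the predicted limit 6/pi^2.
for n in (10, 100, 1000):
    print(n, coprime_pairs(n) / n**2, 6 / math.pi**2)
```

<br /><br />The brute-force count takes $n^2$ gcd computations, so it is only meant for small values of $n$.<br /><br />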
The approximations we made in this calculation can be justified, and this furnishes a proof of the above claim.<br /><br />We can put this question about integers in a broader perspective if we recall that the ring $\mathbf Z$ is a principal ideal domain (PID) and study the analogue of our problem in other PIDs, in particular for $\mathbf F[T]$, where $\mathbf F$ is a finite field; set $q=\operatorname{Card}(\mathbf F)$. The above proof can be adapted easily (with simplifications, in fact) and shows that, among the pairs $(A,B)$ of monic polynomials of degrees $\leq n$, the proportion of those such that $\gcd(A,B)=1$ tends to $1-1/q$ when $n\to+\infty$. The analogy becomes stronger if one observes that $1/(1-1/q)$ is the value at $s=2$ of $1/(1-q^{1-s})$, the Hasse-Weil zeta function of the affine line over $\mathbf F$.<br /><br />What can we say about our initial question if we replace the ring $\mathbf Z$ with the PID $\mathbf C[T]$? Of course, there's no point in counting the set of pairs $(A,B)$ of coprime monic polynomials of degree $\leq n$ in $\mathbf C[T]$, because this set is infinite. Can we, however, describe this set? For simplicity, we will consider here the set $V_n$ of pairs of coprime monic polynomials of degree precisely $n$. If we identify a monic polynomial of degree $n$ with the sequence of its coefficients, we then view $V_n$ as a subset of $\mathbf C^{n}\times\mathbf C^n$. 
We first observe that $V_n$ is a Zariski open subset of $\mathbf C^{2n}$: its complement $W_n$ is defined by the vanishing of a polynomial in $2n$ variables — the resultant of $A$ and $B$.<br /><br />When $n=0$, we have $V_0=\mathbf C^0=\{\mathrm{pt}\}$.<br /><br />Let's look at $n=1$: the polynomials $A=T+a$ and $B=T+b$ are coprime if and only if $a\neq b$;<br />consequently, $V_1$ is the complement of the diagonal in $\mathbf C^2$.<br /><br />For $n=2$, this becomes more complicated: the resultant of the polynomials $T^2+aT+b$ and $T^2+cT+d$ is equal to $a^2d-abc-adc+b^2-2bd+bc^2+d^2$; however, it looks hard to guess some relevant properties of $V_n$ (or of its complement) just by staring at this equation. In any case, we can say that $V_2$ is the complement in $\mathbf C^4$ of the union of two sets, corresponding to the degree of the gcd of $(A,B)$. When $\gcd(A,B)$ has degree $2$, one has $A=B$; this gives the diagonal, a subset of $\mathbf C^4$ isomorphic to $\mathbf C^2$; the set of pairs of polynomials $(A,B)$ whose gcd has degree $1$ is essentially $\mathbf C\times V_1$: multiply a pair $(A_1,B_1)$ of coprime polynomials of degree $1$ by an arbitrary polynomial of the form $(T-d)$.<br />Consequently,<br />\begin{align}V_2&=\mathbf C^4 - \left( \mathbf C^2 \cup \mathbf C\times V_1\right)\\<br />&= \mathbf C^4 - \left( \enclose{updiagonalstrike}{\mathbf C^2}\cup \left(\mathbf C\times (\mathbf C^2-\enclose{updiagonalstrike}{\mathbf C})\right)\right)\\<br />&=\mathbf C^4-\mathbf C^3<br />\end{align}<br />if we cancel the two $\mathbf C^2$ that appear. Except that this makes no sense!<br /><br />However, there is a way to make this computation both meaningful and rigorous, and it consists in working in the Grothendieck ring $\KVarC$ of complex algebraic varieties. Its additive group is generated by isomorphism classes of algebraic varieties, with relations of the form $[X]=[U]+[Z]$ for every Zariski closed subset $Z$ of an algebraic variety $X$, with complement $U=X-Z$. 
This group has a natural ring structure for which $[X][Y]=[X\times Y]$. Its unit element is the class of the point, $[\mathbf A^0]$ if one wishes. An important element of this ring $\KVarC$ is the class $\mathbf L=[\mathbf A^1]$ of the affine line. The natural map $e\colon \VarC\to \KVarC$ given by $e(X)=[X]$ is the universal Euler characteristic: it is the universal map from $\VarC$ to a ring such that $e(X)=e(X-Z)+e(Z)$ and $e(X\times Y)=e(X)e(Y)$, where $X,Y$ are complex varieties and $Z$ is a Zariski closed subset of $X$. <br /><br />In particular, it generalizes the classical Euler characteristic, the alternating sum of the dimensions of the cohomology groups (with compact support, if one wishes) of a variety. A subtler invariant of $\KVarC$ is given by mixed Hodge theory: there exists a unique ring morphism $\chi_{\mathrm H}\colon\KVarC\to\mathbf Z[u,v]$ such that for every complex variety $X$, $\chi_{\mathrm H}([X])$ is the Hodge-Deligne polynomial of $X$. In particular, if $X$ is projective and smooth, $\chi_{\mathrm H}([X])=\sum_{p,q} \dim H^q(X,\Omega^p_X) u^pv^q$. If one replaces the field of complex numbers with a finite field $\mathbf F$, one may actually <i>count</i> the numbers of $\mathbf F$-points of $X$, and this furnishes yet another generalized Euler characteristic.<br /><br />The preceding calculation shows that $e(V_0)=1$, $e(V_1)=\mathbf L^2-\mathbf L$ and $e(V_2)=\mathbf L^4-\mathbf L^3$; more generally, one proves by induction that $e(V_n)=\mathbf L^{2n}-\mathbf L^{2n-1}$ for every integer $n\geq 1$. <br /><br />Equivalently, one has $e(W_n)=\mathbf L^{2n-1}$ for all $n\geq 1$. I have to admit that I see no obvious reason for the class of $W_n$ to be equal to that of an affine space. 
However, as Ofer Gabber and Jean-Louis Colliot-Thélène pointed out to me during the talk, this resultant is the difference of two homogeneous polynomials $p-q$ of degrees $d=2$ and $d+1=3$; consequently, the locus it defines is a <i>rational variety</i> — given $a,b,c$, there is generically a unique $t$ such that $p-q$ vanishes at $(at,bt,ct,t)$.<br /><br />These three results have a common interpretation if one brings in the projective line $\mathbf P_1$. Indeed, pairs $(a,b)$ of coprime integers (up to $\pm1$) correspond to rational points on $\mathbf P_1$, and if $\mathbf F$ is a field, then pairs $(A,B)$ of coprime polynomials in $\mathbf F[T]$ correspond (up to $\mathbf F^\times$) to elements of $\mathbf P_1(\mathbf F(T))$. <br />In both examples, the numerical datum $\max(|a|,|b|)$ or $\max(\deg(A),\deg(B))$ is called the height of the corresponding point. <br /><br />In the case of the ring $\mathbf Z$, or in the case of the ring $\mathbf F[T]$ where $\mathbf F$ is a finite field, one has an obvious but fundamental finiteness theorem: there are only finitely many points of $\mathbf P_1$ with bounded height. In the latter case, $\mathbf C[T]$, this naïve finiteness does not hold. Nevertheless, if one sees $\mathbf P_1(\mathbf C(T))$ as an infinite dimensional variety — one needs infinitely many complex numbers to describe a rational function, then the points of bounded height constitute what is called a bounded family, a “finite dimensional” constructible set. <br /><br />The last two examples have a common geometric interpretation. Namely, $\mathbf F(T)$ is the field of functions of a projective smooth algebraic curve $C$ over $\mathbf F$; in fact, $C$ is the projective line again, but we may better ignore this coincidence. 
Then a point $x\in\mathbf P_1(\mathbf F(T))$<br />corresponds to a morphism $\varepsilon_x\colon C\to\mathbf P_1$, and the formula $H(x)=\deg(\varepsilon_x^*\mathscr O(1))$ relates the height $H(x)$ of $x$ to the degree of the morphism $\varepsilon_x$.<br /><br />Since the notion of height generalizes from $\mathbf P_1$ to projective spaces $\mathbf P_n$ of higher dimension (and from $\mathbf Q$ to general number fields), this suggests a general question. Let $V\subset\mathbf P_n$ be a projective variety over a base field $k$. What can one say about the set of points $x\in V(k)$ such that $H(x)\leq B$, when the bound $B$ grows to $\infty$?<br />The base field $k$ can be either a number field, or the field of functions $\mathbf F(C)$ of a curve $C$ over a finite field $\mathbf F$, or the field of functions $\mathbf C(C)$ of a curve over the complex numbers. In the last two cases, the variety can even be taken to be constant, deduced from a variety $V_0$ over $\mathbf F$ or $\mathbf C$.<br /><br /><ol><li>When $k$ is a number field, this set is a finite set; how does its cardinality grow? This is a question that Batyrev and Manin have put forward at the end of the 80s, and which has attracted a lot of attention since.</li><li>When $k=\mathbf F(C)$ is a function field over a finite field, this set is again a finite set; how does its cardinality grow? This question has been proposed by Emmanuel Peyre by analogy with the question of Batyrev and Manin.</li><li>When $k=\mathbf C(C)$ is a function field over $\mathbf C$, this set identifies with a closed subscheme of the Grothendieck-Hilbert scheme of $V$; what can one say about its geometry, in particular about its class in $\KVarC$? Again, this question has been proposed by Emmanuel Peyre around 2000.</li></ol><br />In a forthcoming post, I shall recall some results on these questions, especially the first one, and in particular explain an approach based on the Fourier summation formula. 
I will then explain a theorem proved with François Loeser where we make use of Hrushovski–Kazhdan's motivic Fourier summation formula in motivic integration to prove an instance of the third question.Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com1tag:blogger.com,1999:blog-8231917611006633375.post-67695977572303416362017-01-09T10:01:00.000+01:002017-01-09T10:01:11.619+01:00“May you and all your students flourish.”<div class="separator" style="clear: both; text-align: center;"><a href="https://www.math.hmc.edu/~su/su.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="133" src="https://www.math.hmc.edu/~su/su.jpg" width="200" /></a></div><a href="https://www.math.hmc.edu/~su/">Francis Su</a> is a former president of the Mathematical Association of America. He just gave a beautiful address at the AMS-MAA Joint Meeting, entitled “Mathematics for Human Flourishing”.<br /><br />Basically, when asked about the goal of mathematics, the answer is often related to its contribution to the progress of mankind through the advancement of science. Francis Su spells out what the deepest goal of mathematics may be: contribute to the flourishing not only of mankind as a whole, but of each of us as human beings. He starts from Aristotle's view that a well-lived life goes through the exercise of “virtue” — excellence of character leading to the excellence of conduct. 
He then quotes five basic desires which mathematics helps fulfill while cultivating such virtues: <i>play, beauty, truth, justice and love.</i><br /><br />Francis Su's address is full of personal stories, encounters, and quotes, and I invite all of you either to <a href="https://www.facebook.com/maanews/videos/1015487916%20of%20this9165419/">watch the video</a> on the <a href="https://www.facebook.com/maanews/">Facebook page of the MAA</a>, or to <a href="https://mathyawp.wordpress.com/2017/01/08/mathematics-for-human-flourishing/">read its transcript</a> on <a href="https://mathyawp.wordpress.com/">Francis Su's blog</a>.<br /><br />At the beginning of this New Year, I would just like to conclude this short message by repeating his <br />final wish: <i>“May you and all your students flourish!”<br /></i><br /><br />Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com0tag:blogger.com,1999:blog-8231917611006633375.post-77654581240175828012016-06-11T23:10:00.003+02:002016-06-11T23:10:56.489+02:00Triviality of vector bundles with connections on simply connected varietiesI would like to discuss today a beautiful theorem of Grothendieck concerning differential equations. It was mentioned by Yves André in a wonderful <a href="https://www.youtube.com/watch?v=gG1wxAImQng">talk at IHÉS</a> in March 2016 and Hélène Esnault kindly explained its proof to me during a nice walk in the Bavarian Alps last April... The statement is as follows:<br /><br /><b>Theorem (<a href="https://eudml.org/doc/154006">Grothendieck, 1970</a>). — </b><i>Let $X$ be a smooth projective complex algebraic variety. Assume that $X$ is simply connected. 
Then every vector bundle with an integrable connection on $X$ is trivial.</i><br /><br />Let indeed $(E,\nabla)$ be a vector bundle with an integrable connection on $X$ and let us show that it is trivial, namely, that there exist $n$ global sections $e_1,\dots,e_n$ of $E$ which are horizontal ($\nabla e_i=0$) and form a basis of $E$ at each point.<br /><br />Considering the associated analytic picture, we get a vector bundle $(E^{\mathrm{an}},\nabla)$ with an integrable connection on the analytic manifold $X(\mathbf C)$. Let $x\in X(\mathbf C)$. By the theory of linear differential equations, this furnishes a representation $\rho$ of the topological fundamental group $\pi_1(X(\mathbf C),x)$ in the fiber $E_x$ of the vector bundle $E$ at the point $x$. Saying that $(E^{\mathrm{an}},\nabla)$ is trivial on $X(\mathbf C)$ means that this representation $\rho$ is trivial, which seems to be a triviality since $X$ is simply connected.<br /><br />However, in this statement, <i>simple connectedness</i> is meant in the sense of algebraic geometry, namely that $X$ has no non-trivial finite étale covering. And this is why the theorem can be surprising, for this hypothesis does not imply that $\pi_1(X(\mathbf C),x)$ is trivial, only that it has no non-trivial finite quotient. This is Grothendieck's version of Riemann's existence theorem, proved in SGA 1.<br /><br />However, it is known that $X(\mathbf C)$ is topologically equivalent to a finite cellular space, so that its fundamental group $\pi_1(X(\mathbf C),x)$ is finitely presented. <br /><br /><b>Proposition (<a href="http://mi.mathnet.ru/eng/msb6037">Mal</a></b><b><a href="http://mi.mathnet.ru/eng/msb6037">čev, 1940</a>). — </b><i>Let $G$ be a finitely generated subgroup of $\mathrm{GL}(n,\mathbf C)$. Then $G$ is residually finite: for every finite subset $T$ of $G$ not containing $\mathrm I_n$, there exists a finite group $K$ and a morphism $f\colon G\to K$ such that $T\cap \operatorname{Ker}(f)=\varnothing$. 
</i><br /><br />Consequently, the image of $\rho$ is residually finite. If it were non-trivial, there would exist a non-trivial finite quotient $K$ of $\operatorname{im}(\rho)$, hence a non-trivial finite quotient of $\pi_1(X(\mathbf C),x)$, which, as we have seen, is impossible. Consequently, the image of $\rho$ is trivial and $(E^{\mathrm{an}},\nabla)$ is trivial.<br /><br />In other words, there exists a basis $(e_1,\dots,e_n)$ of horizontal sections of $E^{\mathrm{an}}$. By Serre's GAGA theorem, $e_1,\dots,e_n$ are in fact algebraic, i.e., induced by actual global sections of $E$ on $X$. By construction, they are horizontal and form a basis of $E$ at each point. Q.E.D.<br /><br />It now remains to explain the <i>proof of the proposition.</i> Let $S$ be a finite symmetric generating subset of $G$ containing $T$, not containing $\mathrm I_n$, and let $R$ be the subring of $\mathbf C$ generated by the entries of the elements of $S$ and their inverses. It is a non-zero finitely generated $\mathbf Z$-algebra; the elements of $S$ are contained in $\mathrm {GL}(n,R)$, hence $G$ is a subgroup of $\mathrm{GL}(n,R)$. Let $\mathfrak m$ be a maximal ideal of $R$ and let $k$ be its residue field; the point of the story is that <i>this field $k$ is finite</i> (I'll explain why in a minute.) Then the reduction map $R\to k$ induces a morphism of groups $\mathrm{GL}(n,R)\to \mathrm {GL}(n,k)$, hence a morphism $f\colon G\to \mathrm{GL}(n,k)$. By construction, a non-zero entry of an element of $S$ is invertible in $R$ hence is mapped to a non-zero element in $k$. Consequently, $S$ is disjoint from the kernel of $f$, as was to be shown.<br /><br /><b>Lemma. —</b> <i>Let $R$ be a finitely generated $\mathbf Z$-algebra and let $\mathfrak m$ be a maximal ideal of $R$. The residue field $R/\mathfrak m$ is finite.</i><br /><br /><i>Proof of the lemma. 
— </i>This could be summarized by saying that $\mathbf Z$ is a Jacobson ring: if $A$ is a Jacobson ring, then every finitely generated $A$-algebra $K$ which is a field is finite over $A$; in particular, $K$ is a finite extension of a quotient field of $A$. In the case $A=\mathbf Z$, the quotient fields of $\mathbf Z$ are the finite fields $\mathbf F_p$, so that $K$ is a finite extension of a finite field, hence is a finite field. Let us however explain the argument. Let $K$ be the field $R/\mathfrak m$; let us replace $\mathbf Z$ by its quotient $A=\mathbf Z/P$, where $P$ is the kernel of the map $\mathbf Z\to R/\mathfrak m$. There are two cases: either $P=(0)$ and $A=\mathbf Z$, or $P=(p)$, for some prime number $p$, and $A$ is the finite field $\mathbf F_p$;<br />we will eventually see that the first case cannot happen.<br /><br />Now, $K$ is a field which is a finitely generated algebra over a subalgebra $A$; let $k$ be the fraction field of $A$. The field $K$ is now a finitely generated algebra over its subfield $k$; by Zariski's form of Hilbert's Nullstellensatz, $K$ is a finite <i>algebraic</i> extension of $k$. Let us choose a finite generating subset $S$ of $K$ as a $k$-algebra; each element of $S$ is algebraic over $k$; let us consider the product $f$ of the leading coefficients of their minimal polynomials, chosen to belong to $A[T]$ and let $A'=A[1/f]$. By construction, the elements of $S$ are integral over $A'$, hence $K$ is integral over $A'$. Since $K$ is a field, we deduce that $A'$ is a field. To conclude, we split the discussion into the two cases stated above.<br /><br />If $P=(p)$, then $A=\mathbf F_p$, hence $k=\mathbf F_p$ as well, and $K$ is a finite extension of $\mathbf F_p$, hence is a finite field.<br /><br />Let us assume, by contradiction, that $P=(0)$, hence $A=\mathbf Z$ and $k=\mathbf Q$. By what precedes, there exists an element $f\in\mathbf Z$ such that $\mathbf Q=\mathbf Z[1/f]$. 
But this cannot be true, because $\mathbf Z[1/f]$ is not a field. Indeed, any prime number which does not divide $f$ is not invertible in $\mathbf Z[1/f]$. This concludes the proof of the lemma.<br /><br /><b>Remarks. — </b>1) The theorem does not hold if $X$ is not proper. For example, the affine line $\mathbf A^1_{\mathbf C}$ is simply connected, both algebraically and topologically, but the trivial line bundle $E=\mathscr O_X\cdot e$ endowed with the connection defined by $\nabla (e)=e$ is not trivial. It is analytically trivial though, but its horizontal analytic sections are of the form $\lambda \exp(z) e$, for $\lambda\in\mathbf C$, and except for $\lambda=0$, none of them are algebraic.<br />However, the theorem holds if one assumes moreover that the connection has regular singularities at infinity.<br /><br />2) The group theoretical property that we used is that on a complex algebraic variety, the monodromy group of a vector bundle with connection is residually finite. It is not always true that the topological fundamental group of a complex algebraic variety is residually finite. Examples have been given by Domingo Toledo in <a href="http://www.numdam.org/item?id=PMIHES_1993__77__103_0">“Projective varieties with non-residually finite fundamental group”, <i>Publications mathématiques de l’I.H.É.S.</i>, 77 (1993), p. 103–119. 
</a><br /><br />3) The analogous result in positive characteristic is a conjecture by Johan De Jong formulated in 2010: <i>If $X$ is a projective smooth simply connected algebraic variety over an algebraically closed field of characteristic $p$, then every isocrystal is trivial.</i> It is still open, despite beautiful progress by Hélène Esnault, together with Vikram Mehta and Atsushi Shiho.Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com0tag:blogger.com,1999:blog-8231917611006633375.post-65057167364768345972016-05-05T01:29:00.000+02:002016-05-07T10:00:33.873+02:00Bourbaki and Felix KleinA colleague just sent me Xerox copies of a few pages of a 1899 biography of <a href="https://en.wikipedia.org/wiki/Charles-Denis_Bourbaki">the général Bourbaki</a>. Its author, François Bournand, was the private secretary of Édouard Drumont, an antisemitic writer and journalist. The book would probably not be worth much being mentioned here without its dedication:<br /><br /><div style="text-align: center;">À l'abbé Félix Klein</div><div style="text-align: center;">de l'Institut catholique</div><div style="text-align: center;">Hommage respectueux de son dévoué en N.-S.</div><div style="text-align: center;">François Bournand</div><div style="text-align: center;">Professeur d'histoire de l'art à l'École professionnelle catholique</div><br /><br /><div class="separator" style="clear: both; text-align: center;"><a href="https://4.bp.blogspot.com/-DiCmr2tAKao/VyqCCMddxaI/AAAAAAAAGXU/-CFrq3R9V20ZdCJYZ5AXmFmSdju0Xx7vACLcB/s1600/bourbaki-dedicace.png" imageanchor="1" style="float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="320" src="https://4.bp.blogspot.com/-DiCmr2tAKao/VyqCCMddxaI/AAAAAAAAGXU/-CFrq3R9V20ZdCJYZ5AXmFmSdju0Xx7vACLcB/s320/bourbaki-dedicace.png" width="267" /></a><a href="https://4.bp.blogspot.com/-bw5BHs1xeXw/VyqCQx68PtI/AAAAAAAAGXY/b_z4hfzplG8xUi4o-xWkqn0gv54wI-h6QCLcB/s1600/bourbaki-cover.png" 
imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://4.bp.blogspot.com/-bw5BHs1xeXw/VyqCQx68PtI/AAAAAAAAGXY/b_z4hfzplG8xUi4o-xWkqn0gv54wI-h6QCLcB/s320/bourbaki-cover.png" width="204" /></a></div><br /><i>Abbé</i> means abbot; in this context, a Catholic priest without a parish; the French initials N.-S. mean <i>Notre Seigneur</i>, Our Lord. It appears that this <a href="https://fr.wikipedia.org/wiki/F%C3%A9lix_Klein_%28pr%C3%AAtre%29">Félix Klein</a> (note the accent on the <i>e</i>) also has a Wikipedia page.Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com0tag:blogger.com,1999:blog-8231917611006633375.post-91324744998226817562016-04-29T16:09:00.000+02:002016-04-29T17:01:29.735+02:00Roth's theoremsA few days ago, The Scotsman published <a href="http://www.scotsman.com/news/mathematician-leaves-1m-to-help-sick-patients-in-inverness-1-4111648">an article</a> about Klaus Roth's legacy, explaining how he donated his fortune (1 million pounds) to various charities. This article was shared by some friends on Facebook. Yuri Bilu added the mention that he knew <i>two</i> important theorems of Roth, and since one of them did not immediately come to my mind, I decided to write this post.<br /><br />The first theorem was a 1935 conjecture of Erdős and Turán concerning arithmetic progressions of length 3 that Roth proved in 1952. That is, one is given a set $A$ of positive integers and one looks for triples $(a,b,c)$ of distinct elements of $A$ such that $a+c=2b$; Roth proved that infinitely many such triples exist as soon as the upper density of $A$ is positive, that is:<br />\[ \limsup_{x\to+\infty} \frac{\mathop{\rm Card}(A\cap [0;x])}x >0. \]<br />In 1975, Endre Szemerédi proved that such sets of integers contain (finite) arithmetic progressions of arbitrarily large length. 
Other proofs have been given by Hillel Furstenberg (using ergodic theory) and Tim Gowers (by Fourier/combinatorial methods); Roth had used Hardy-Littlewood's circle method.<br /><br />In 1976, Erdős strengthened his initial conjecture with Turán and predicted that arithmetic progressions of arbitrarily large length exist in $A$ as soon as<br />\[ \sum_{a\in A} \frac 1a =+\infty.\]<br />Such a result is still a conjecture, even for arithmetic progressions of length $3$, but a remarkable particular case has been proved by Ben Green and Terry Tao in 2004, when $A$ is the set of all prime numbers.<br /><br />Outstanding as these results are (Tao has been given the Fields medal in 2006 and Szemerédi the Abel prize in 2012), the second theorem of Roth was proved in 1955 and was certainly the main reason for awarding him the Fields medal in 1958. Indeed, Roth gave a definitive answer to a long standing question in diophantine approximation that originated from the works of Joseph Liouville (1844). Given a real number $\alpha$, one is interested in rational fractions $p/q$ that are close to $\alpha$, and in the quality of the approximation, namely the exponent $n$ such that $\left| \alpha- \frac pq \right|\leq 1/q^n$. Precisely, the approximation exponent $\kappa(\alpha)$ is the least upper bound of all real numbers $n$ such that the previous inequality has infinitely many solutions in fractions $p/q$, and Roth's theorem asserts that one has $\kappa(\alpha)=2$ when $\alpha$ is an irrational algebraic number.<br /><br />One part of this result goes back to Dirichlet, showing that for any irrational number $\alpha$, there exist many good approximations with exponent $2$. This can be proved using the theory of continued fractions and is also a classical application of Dirichlet's box principle. 
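<br /><br />The box-principle argument is constructive, and a small Python sketch may help to follow it. The function name <code>dirichlet_approximation</code> and the choice of $Q$ boxes of width $1/Q$ are illustrative assumptions; the computation uses floating-point numbers, so the inequalities only hold up to rounding errors:<br /><br />

```python
import math


def dirichlet_approximation(alpha, Q):
    """Return (p, q) with 1 <= q <= Q and |q*alpha - p| <= 1/Q (up to float rounding).

    The Q + 1 fractional parts of q*alpha, for q = 0, ..., Q, are placed into
    Q boxes of width 1/Q; two of them must share a box.
    """
    boxes = {}  # box index -> the value of q whose fractional part fell into it
    for q in range(Q + 1):
        frac = q * alpha - math.floor(q * alpha)
        box = min(int(frac * Q), Q - 1)  # clamp to guard against rounding
        if box in boxes:
            q_first = boxes[box]
            p = math.floor(q * alpha) - math.floor(q_first * alpha)
            return p, q - q_first
        boxes[box] = q
    raise AssertionError("unreachable: Q + 1 points cannot avoid sharing one of Q boxes")


alpha = math.sqrt(2)
for Q in (10, 100, 1000):
    p, q = dirichlet_approximation(alpha, Q)
    print(Q, p, q, abs(alpha - p / q), 1 / q**2)
```

<br /><br />Each returned fraction $p/q$ satisfies $\left|\alpha-\frac pq\right|\leq 1/q^2$, as in the argument that follows.<br /><br />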
Take a positive integer $Q$ and consider the $Q+1$ numbers $q\alpha- \lfloor q\alpha\rfloor$ in $[0,1]$, for $0\leq q\leq Q$; two of them must be less than $1/Q$ apart; this furnishes integers $p',p'',q',q''$, with $0\leq q'<q''\leq Q$ such that $\left| (q''\alpha-p'')-(q'\alpha-p')\right|\leq 1/Q$; then set $p=p''-p'$ and $q=q''-q'$; one has $\left| q\alpha -p \right|\leq 1/Q$, hence $\left|\alpha-\frac pq\right|\leq 1/(Qq)\leq 1/q^2$.<br /><br />To prove an inequality in the other direction, Liouville's argument was that if $\alpha$ is an irrational root of a nonzero polynomial $P\in\mathbf Z[T]$, then $\kappa(\alpha)\leq\deg(P)$. The proof is now standard: write $d=\deg(P)$; given an approximation $p/q$ of $\alpha$, observe that $q^d P(p/q)$ is a non-zero integer (if, say, $P$ is irreducible), so that $\left| q^d P(p/q)\right|\geq 1$. On the other hand, $P(p/q)\approx (p/q-\alpha) P'(\alpha)$, hence an inequality $\left|\alpha-\frac pq\right|\gg q^{-d}$.<br /><br />This result has been generalized, first by Axel Thue in 1909 (who proved an inequality $\kappa(\alpha)\leq \frac12 d+1$), then by Carl Ludwig Siegel in 1921 and Freeman Dyson in 1947 (showing respectively $\kappa(\alpha)\leq 2\sqrt d$ and $\kappa(\alpha)\leq\sqrt{2d}$). While Liouville's result was based on the minimal polynomial of $\alpha$, these generalisations required polynomials in two variables, and the non-vanishing of a quantity such as $q^dP(p/q)$ above was definitely less trivial. Roth's proof made use of polynomials in arbitrarily many variables, and his remarkable achievement was a proof of the required non-vanishing result.<br /><br />Roth's proof was “elementary”, making use only of polynomials and Wronskians. 
There are today more geometric proofs, such as the one by Hélène Esnault and Eckart Viehweg (1984) or Michael Nakamaye's subsequent proof (1995) which is based on Faltings's product theorem.<br /><br />What is still missing, however, is the proof of an <i>effective</i> version of Roth's theorem, which would provide, for any real number $n>\kappa(\alpha)$, an actual integer $Q$ such that every rational fraction $p/q$ in lowest terms such that $\left|\alpha-\frac pq\right|\leq 1/q^n$ satisfies $q\leq Q$. It seems that this defect lies at the very heart of almost all of the current approaches in diophantine approximations... Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com5tag:blogger.com,1999:blog-8231917611006633375.post-24517356692692711412016-04-13T15:55:00.001+02:002016-04-13T17:43:39.284+02:00Weierstrass's approximation theoremI had to mentor an Agrégation leçon entitled <i>Examples of dense subsets.</i> For my own edification (and that of the masses), I want to try to record here as many proofs of the Weierstrass density theorem as I can: <i>Every complex-valued continuous function on the closed interval $[-1;1]$ can be uniformly approximated by polynomials.</i> I'll also include as a bonus the trigonometric variant: <i>Every complex-valued continuous and $2\pi$-periodic function on $\mathbf R$ can be uniformly approximated by trigonometric polynomials.</i><br /><br /><b>1. Using the Stone theorem.</b><br /><br />This 1937–1948 theorem is probably the final conceptual brick to the edifice of which Weierstrass laid the first stone in 1885. It asserts that a subalgebra of continuous functions on a compact completely regular (e.g., metric) space is dense for the uniform norm if and only if it separates points. 
In all presentations that I know of, its proof requires establishing that the absolute value function can be uniformly approximated by polynomials on $[-1;1]$: <br /><ul><li>Stone truncates the power series expansion of the function \[ x\mapsto \sqrt{1-(1-x^2)}=\sum_{n=0}^\infty \binom{1/2}n (x^2-1)^n, \] bounding by hand the error term.</li><li>Bourbaki (<i>Topologie générale</i>, X, p. 36, lemme 2) follows a more elementary approach and begins by proving that the function $x\mapsto \sqrt x$ can be uniformly approximated by polynomials on $[0;1]$. (The absolute value function is recovered since $\mathopen|x\mathclose|=\sqrt{x^2}$.) To this aim, he introduces the sequence of polynomials given by $p_0=0$ and $p_{n+1}(x)=p_n(x)+\frac12\left(x-p_n(x)^2\right)$ and proves by induction the inequalities \[ 0\leq \sqrt x-p_n(x) \leq \frac{2\sqrt x}{2+n\sqrt x} \leq \frac 2n\] for $x\in[0;1]$ and $n\geq 1$. This implies the desired result.</li></ul>The algebra of polynomials separates points on the compact set $[-1;1]$, hence is dense. To treat the case of trigonometric polynomials, consider Laurent polynomials on the unit circle.<br /><br /><b>2. Convolution.</b><br /><br />Consider an approximation $(\rho_n)$ of the Dirac distribution, i.e., a sequence of continuous, nonnegative and compactly supported functions on $\mathbf R$ such that $\int\rho_n=1$ and such that for every $\delta>0$, $\int_{\mathopen| x\mathclose|>\delta} \rho_n(x)\,dx\to 0$. Given a continuous function $f$ on $\mathbf R$, form the convolutions defined by $f*\rho_n(x)=\int_{\mathbf R} \rho_n(t) f(x-t)\, dt$. It is classical that $f*\rho_n$ converges uniformly on every compact set to $f$.<br /><br />Now, given a continuous function $f$ on $[-1;1]$, one can extend it to a continuous function with compact support on $\mathbf R$ (defining $f$ to be affine linear on $[-2;-1]$ and on $[1;2]$, and to be zero outside of $[-2;2]$). We want to choose $\rho_n$ so that $f*\rho_n$ is a polynomial on $[-1;1]$. 
The basic idea is just to choose a parameter $a>0$, and to take $\rho_n(x)= c_n (1-(x/a)^2)^n$ for $\mathopen|x\mathclose|\leq a$ and $\rho_n(x)=0$ otherwise, with $c_n$ adjusted so that $\int\rho_n=1$. Let us write $f*\rho_n(x)=\int_{-2}^2 \rho_n(x-t) f(t)\, dt$; if $x\in[-1;1]$ and $t\in[-2;2]$, then $x-t\in [-3;3]$ so we just need to be sure that $\rho_n$ is a polynomial on that interval, which we get by taking, say, $a=3$. This shows that the restriction of $f*\rho_n$ to $[-1;1]$ is a polynomial function, and we're done.<br /><br />This approach is more or less that of D. Jackson (“<a href="http://www.jstor.org/stable/2300993">A Proof of Weierstrass's Theorem</a>,” <i>Amer. Math. Monthly,</i> 1934). The difference is that he considers continuous functions on a closed interval contained in $\mathopen]0;1\mathclose[$ which he extends linearly to $[0;1]$ so that they vanish at $0$ and $1$; he considers the same convolution, taking the parameter $a=1$.<br /><br />Weierstrass's own proof (“<a href="http://bibliothek.bbaw.de/bibliothek-digital/digitalequellen/schriften/anzeige/index_html?band=10-sitz/1885-2&seite:int=109">Über die analytische Darstellbarkeit sogenannter willkürlicher Functionen einer reellen Veränderlichen</a>,” <i>Sitzungsberichte der Königlich Preussischen Akademie der Wissenschaften zu Berlin</i>, 1885) was slightly more sophisticated: he first showed approximation by convolution with the Gaussian kernel defined by $ \rho_n(t) =\sqrt{ n} e^{- \pi n t^2}$, and then expanded the kernel as a power series, a suitable truncation of which furnishes the desired polynomials.<br /><br />As shown by Jackson, the same approach works easily (in a sense, more easily) for $2\pi$-periodic functions, considering the kernel defined by $\rho_n(x)=c_n(1+\cos(x))^n$, where $c_n$ is chosen so that $\int_{-\pi}^\pi \rho_n=1$.<br /><br /><b>3. 
Bernstein polynomials.</b><br /><br />Take a continuous function $f$ on $[0;1]$ and, for $n\geq 0$, set \[ B_nf(x) = \sum_{k=0}^n f(k/n) \binom nk x^k (1-x)^{n-k}.\] It is classical that $B_nf$ converges uniformly to $f$ on $[0;1]$.<br /><br />There are two classical proofs of Bernstein's theorem. One is probabilistic and consists in observing that $B_nf(x)$ is the expected value of $f(S_n/n)$, where $S_n$ is the sum of $n$ i.i.d. Bernoulli random variables with parameter $x\in[0;1]$. Another (generalized as the Korovkin theorem, <span class="reference-text"><span class="ouvrage" id="Korovkin1953"><span class="ouvrage" id="P._P._Korovkin1953"><span lang="en">“<cite style="font-style: normal;">On convergence of linear positive operators in the space of continuous functions</cite>”</span>,</span></span></span> Dokl. Akad. Nauk SSSR (N.S.), vol. 90, <span class="reference-text"><span class="ouvrage" id="Korovkin1953"><span class="ouvrage" id="P._P._Korovkin1953"><time>1953</time></span></span></span>) consists in showing (i) that for $f=1,x,x^2$, $B_nf$ converges uniformly to $f$ (an explicit calculation), (ii) that if $f\geq 0$, then $B_nf\geq 0$ as well, (iii) for every $x\in[0;1]$, squeezing $f$ between two quadratic polynomials $f^+$ and $f^-$ such that $f^+(x)-f^-(x)$ is as small as desired.<br /><br />A trigonometric variant would be given by Fejér's theorem that the Cesàro averages of a Fourier series of a continuous, $2\pi$-periodic function converge uniformly to that function. In turn, Fejér's theorem can be proved in both ways, either by convolution (the Fejér kernel is nonnegative), or by a Korovkin-type argument (replacing $1,x,x^2$ on $[0;1]$ by $1,z,z^2,z^{-1},z^{-2}$ on the unit circle).<br /><br /><br /><b>4. Using approximation by step functions.</b><br /><br />This proof originates with a paper of H. Kuhn, “<a href="http://www.ams.org/mathscinet-getitem?mr=173738">Ein elementarer Beweis des Weierstrasschen Approximationssatzes</a>,” Arch. Math. 
<b>15</b> (1964), p. 316–317.<br /><br />Let us show that for every $\delta\in\mathopen]0,1\mathclose[$ and every $\varepsilon>0$, there exists a polynomial $p$ satisfying the following properties:<br /><ul><li> $0\leq p(x)\leq \varepsilon$ for $-1\leq x\leq-\delta$;</li><li> $0\leq p(x)\leq 1$ for $-\delta\leq x\leq \delta$;</li><li> $1-\varepsilon\leq p(x)\leq 1$ for $\delta\leq x\leq 1$.</li></ul>In other words, these polynomials approximate the (discontinuous) function $f$ on $[-1;1]$ defined by $f(x)=0$ for $x< 0$, $f(x)=1$ for $x> 0$ and $f(0)=1/2$.<br /><br />A possible formula is $p(x)=(1- ((1-x)/2)^n)^{2^n}$, where $n$ is a large enough integer. First of all, one has $0\leq (1-x)/2\leq 1$ for every $x\in[-1;1]$, so that $0\leq p(x)\leq 1$. Let $x\in[-1;-\delta]$; then one has $(1-x)/2\geq (1+\delta)/2$, hence $p(x)\leq (1-((1+\delta)/2)^n)^{2^n}$, which can be made arbitrarily small when $n\to\infty$. Let finally $x\in[\delta;1]$; then $(1-x)/2\leq (1-\delta)/2$, hence $p(x)\geq (1-((1-\delta)/2)^n)^{2^n}\geq 1- (1-\delta)^n$, which can be made arbitrarily close to $1$ when $n\to\infty$.<br /><br />By translations and dilations, the discontinuity can be placed at any element of $[-1;1]$. Let now $f$ be an arbitrary step function and let us write it as a linear combination $f=\sum a_i f_i$, where $f_i$ is a $\{0,1\}$-valued step function. For every $i$, let $p_i$ be a polynomial that approximates $f_i$ as given above. The linear combination $\sum a_i p_i$ approximates $f$ with maximal error $\sup(\mathopen|a_i\mathclose|)$.<br /><br />Using uniform continuity of continuous functions on $[-1;1]$, every continuous function can be uniformly approximated by a step function. This concludes the proof.<br /><br /><b>5. Using approximation by piecewise linear functions.</b><br /><br />As in the proof of Stone's theorem, one uses the fact that the function $x\mapsto \mathopen|x\mathclose|$ is uniformly approximated by a sequence of polynomials on $[-1;1]$. 
Consequently, so are the functions $x\mapsto \max(0,x)=(x+\mathopen|x\mathclose|)/2 $ and $x\mapsto\min(0,x)=(x-\mathopen|x\mathclose|)/2$. By translation and dilation, every continuous piecewise linear function on $[-1;1]$ with only one break point is uniformly approximated by polynomials. By linear combination, every continuous piecewise linear affine function is uniformly approximated by polynomials.<br />By uniform continuity, every continuous function can be uniformly approximated by continuous piecewise linear affine functions. Weierstrass's theorem follows.<br /><br /><b>6. Moments.</b><br /><br />A linear subspace $A$ of a Banach space is dense if and only if every continuous linear form which vanishes on $A$ is identically $0$. In the present case, the dual of $C^0([-1;1],\mathbf C)$ is the space of complex measures on $[-1;1]$ (Riesz theorem, if one wishes, or the definition of a measure). So let $\mu$ be a complex measure on $[-1;1]$ such that $\int_{-1}^1 t^n \,d\mu(t)=0$ for every integer $n\geq 0$; let us show that $\mu=0$. This is the classical problem of showing that a complex measure on $[-1;1]$ is determined by its <i>moments</i>. In fact, the classical proof of this fact runs the other way round, and there must exist ways to reverse the arguments.<br /><br />One such solution is given in Rudin's <i>Real and complex analysis</i>, where it is more convenient to consider functions on the interval $[0;1]$. So, let $F(z)=\int_0^1 t^z \,d\mu(t)$. The function $F$ is holomorphic and bounded on the half-plane $\Re(z)> 0$ and vanishes at the positive integers. 
At this point, Rudin makes a conformal transformation to the unit disk (setting $w=(z-1)/(z+1)$) and gets a bounded holomorphic function on the unit disk with zeroes at $(n-1)/(n+1)=1-2/(n+1)$, for $n\in\mathbf N$; if this function were not identically zero, this would contradict the fact that the series $\sum 1/(n+1)$ diverges.<br /><br />In Rudin, this method is used to prove the more general Müntz–Szász theorem according to which the family $(t^{\lambda_n})$ generates a dense subset of $C([0;1])$ if and only if $\sum 1/\lambda_n=+\infty$.<br /><br />Here is another solution I learnt in a paper by L. Carleson (“<a href="http://www.ams.org/mathscinet-getitem?mr=198209">Mergelyan's theorem on uniform polynomial approximation</a>”, <i>Math. Scand.,</i> 1964).<br /><br />For every complex number $a$ such that $\mathopen|a\mathclose|>1$, one can write $1/(t-a)$ as a converging power series in $t$. By summation, this quickly gives that<br />\[ F(a) = \int_{-1}^1 \frac{1}{t-a}\, d\mu(t) \equiv 0. \]<br />Observe that this formula defines a holomorphic function on $\mathbf C\setminus[-1;1]$; by analytic continuation, one thus has $F(a)=0$ for every $a\not\in[-1;1]$.<br />Take a $C^2$-function $g$ with compact support on the complex plane. For every $t\in\mathbf C$, one has the following formula (where $z=x+iy$ and $\bar\partial=\frac12(\partial_x+i\partial_y)$)<br />\[ \frac1\pi \iint \bar\partial g(z) \frac{1}{t-z} \, dx\,dy = g(t), \]<br />which implies, by integration and Fubini, that<br />\[ \int_{-1}^1 g(t)\,d\mu(t) = \frac1\pi \iint \int \bar\partial g(z) \frac1{t-z}\,d\mu(t)\,dx\,dy = \frac1\pi \iint \bar\partial g(z) F(z)\,dx\, dy= 0. \]<br />On the other hand, every $C^2$ function on $[-1;1]$ can be extended to such a function $g$, so that the measure $\mu$ vanishes on every $C^2$ function on $[-1;1]$. Approximating a continuous function by a $C^2$ function (first take a piecewise linear approximation, and round the corners), we get that $\mu$ vanishes on every continuous function, as was to be proved.<br /><br /><b>7. Chebyshev/Markov systems.</b><br /><br />This proof is due to P. 
Borwein and taken from the book <i>Polynomials and polynomial inequalities,</i> by P. Borwein and T. Erdélyi (Graduate Texts in Maths, vol. 161, 1995). Let us say that a sequence $(f_n)$ of continuous functions on an interval $I$ is a Markov system (resp. a weak Markov system) if for every integer $n$, every nonzero linear combination of $(f_0,\dots,f_n)$ has at most $n$ zeroes (resp. $n$ sign changes) in $I$. <br /><br />Given a Markov system $(f_n)$, one defines a sequence $(T_n)$, where $f_n-T_n$ is the element of $\langle f_0,\dots,f_{n-1}\rangle$ which is the closest to $f_n$. The function $T_n$ has $n$ zeroes on the interval $I$; let $M_n$ be the maximum distance between two consecutive zeroes.<br /><br />Borwein's theorem (Theorem 4.1.1 in the mentioned book) then asserts that if the sequence $(f_n)$ is a Markov system consisting of $C^1$ functions, then its linear span is dense in $C(I)$ if and only if $M_n\to 0$.<br /><br />The sequence of monomials $(x^n)$ on $I=[-1;1]$ is of course a Markov system. In this case, the polynomial $T_n$ is, up to a constant factor, the $n$th Chebyshev polynomial, so that $T_n(\cos(x))$ is proportional to $\cos(nx)$; its roots are given by $\cos((\pi+2k\pi)/2n)$, for $k=0,\dots,n-1$, and $M_n\leq \pi/n$. 
This gives yet another proof of Weierstrass's approximation theorem.Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com4tag:blogger.com,1999:blog-8231917611006633375.post-45043015030362352102016-02-24T07:22:00.003+01:002016-02-24T07:22:57.966+01:00Sound and colorJust back home from <a href="http://thestonenyc.com/">The Stone</a> where I could hear two very interesting sets with pianist <a href="http://www.russlossing.com/">Russ Lossing</a> and drummer <a href="http://www.gerryhemingway.com/">Gerry Hemingway</a>, first in duet, and then in quartet with <a href="http://lorenstillman.com/">Loren Stillman</a> on alto saxophone and <a href="http://www.samuelblaser.com/">Samuel Blaser</a> on trombone.<br /><br />I was absolutely excited at the prospect of returning to this avant-garde jazz hall (it was my 3rd concert there, the first one was in 2010, with Sylvie Courvoisier, Thomas Morgan and Ben Perowski, and the second, last year, with <a href="http://freedommathdance.blogspot.com/2015/01/vijay-iyer-and-wadada-leo-smith-at-stone.html">Wadada Leo Smith and Vijay Iyer</a>) to listen to Gerry Hemingway, and the cold rain falling on New York City did not diminish my enthusiasm. (Although I had to take care on the streets, for one could see almost nothing...) I feared I would arrive late, but Gerry Hemingway was still setting up his tools, various sticks, small cymbals, woodblocks, as well as a cello bow...<br /><br />I admit, it took me some time to appreciate the music. Of course, it was free jazz (so what?) and I couldn't really follow the stream of music. Both musicians were acting delicately and skillfully (no discussion) at creating sound, as a painter would spread brush strokes on a canvas—and actually, Hemingway was playing a lot with brushes, those drum sticks made of many (wire or plastic) strings that have a delicate and not very resonant sound... 
Color after color, something was emerging, sound was being shaped.<br /><br />There is an eternal discussion about the nature of music (is it rhythm? melody? harmony?) and consequently about the role of each instrument in the shaping of the music. A related question is the way a given instrument should be used to produce sound.<br /><br />None of the obvious answers was to be heard tonight. Russ Lossing sometimes struck the strings of the grand piano with mallets, something almost classical in avant-garde piano music. I should have been prepared by the concert of Tony Malaby's Tubacello, which I attended with François Loeser at <a href="http://www.lylo.fr/concert/8d6b4b-festival-sons-d-hiver-tony-malaby-s-tubacello-oliver-lake-organ-quartet-theatre-paul-eluard">Sons d'hiver</a> a few weeks ago, where <a href="http://johnhollenbeck.com/">John Hollenbeck</a> simultaneously played drums and prepared piano, but the playing of Gerry Hemingway brought me much surprise. He could blow on the heads of the drums, hit them with a woodblock or strange plastic mallets; he could make the cymbals vibrate by pressing the cello bow on them; he could also take the top hi-hat cymbal in his left hand, and then either hit it with a stick, or press it on the snare drum, thereby producing a mixture of snare/cymbal sound; during a long drum roll, he could also vary the pitch of the sound by pressing the drum head with his right foot—can you imagine the scene?<br /><br />It was while talking with him between the two sets that I gradually understood (some of) his musical conception. How everything is about sound and color. That's why he uses an immense palette of tools, to produce the sounds he feels would best fit the music. He also discussed <i>extended technique</i>, by which he means not the kind of drumistic virtuosity that could allow you (unfortunately, not me...) 
to play the <a href="https://en.wikipedia.org/wiki/Drum_rudiment">26 drum rudiments</a> at 300bpm, but the extension of the range of sounds he can consistently produce with his “basic Buddy Rich type instrument”—Google a picture of Terry Bozzio's drumkit if you don't see what I mean. He described himself as a colorist, who thinks of his instrument in terms of pitches; he also said how rhythm also exists in negative, when it is not played explicitly. A striking remark because it exactly depicted how I understand the playing of one of my favorite jazz drummers, Paul Motian, whom I couldn't appreciate until I became able to hear what he did not play.<br /><br />The second set did not sound as abstract as the first one. Probably the two blowing instruments helped give the sound more flesh and more texture. <a href="http://www.samuelblaser.com/">Samuel Blaser</a>, on the trombone, was absolutely exceptional—go listen at once to his <a href="http://www.whirlwindrecordings.com/spring-rain/">Spring Rain</a> album, an alliance of Jimmy Giuffre and contemporary jazz—and Loren Stillman sang very beautiful melodic lines on the alto sax. The four of them could also play in all combinations, and with extremely interesting dynamics, going effortlessly from one to another. And when a wonderful moment of thunder ended abruptly with the first notes of Paul Motian's Etude, music turned into pure emotion. Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com0tag:blogger.com,1999:blog-8231917611006633375.post-62001002132311143232016-02-09T23:09:00.002+01:002016-02-14T12:02:41.524+01:00Happy New Year!As was apparently <a href="https://girlsangle.wordpress.com/2016/01/01/happy-new-year-2016/">first noticed by Noam Elkies</a>, 2016 is the cardinality of the general linear group over the field with 7 elements, $G=\mathop{\rm GL}(2,\mathbf F_7)$. 
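<br /><br />The claim is easy to confirm by brute force. Here is a minimal Python sketch, under the assumption that $p$ is prime so that the integers modulo $p$ form the field $\mathbf F_p$ (the helper name <code>gl2_order</code> is only an illustrative choice): it enumerates all $p^4$ matrices of size $2$ with entries in $\mathbf F_p$ and counts those whose determinant is nonzero.

```python
from itertools import product


def gl2_order(p):
    """Count the invertible 2x2 matrices over F_p by brute force (p is assumed prime)."""
    count = 0
    # A matrix [[a, b], [c, d]] is invertible iff its determinant ad - bc is nonzero mod p.
    for a, b, c, d in product(range(p), repeat=4):
        if (a * d - b * c) % p != 0:
            count += 1
    return count


print(gl2_order(7))  # prints 2016
```

For small primes, the result can also be compared with the value $(p^2-1)(p^2-p)$.<br /><br />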
I was mentoring an <i>agrégation</i> lesson on finite fields this afternoon, and I could not resist having the student check this. Then came the natural question of describing the Sylow subgroups of this finite group. This is what I describe here.<br /><br />First of all, let's recall the computation of the cardinality of $G$. The first column of a matrix in $G$ must be non-zero, hence there are $7^2-1$ possibilities; for the second column, it only needs to be non-collinear to the first one, and each choice of the first column forbids $7$ second columns, hence $7^2-7$ possibilities. In the end, one has $\mathop{\rm Card}(G)=(7^2-1)(7^2-7)=48\cdot 42=2016$. The same argument shows that the cardinality of the group $\mathop{\rm GL}(n,\mathbf F_q)$ is equal to $(q^n-1)(q^n-q)\cdots (q^n-q^{n-1})=q^{n(n-1)/2}(q-1)(q^2-1)\cdots (q^n-1)$.<br /><br />Let's go back to our example. The factorization of this cardinality comes easily: $2016=(7^2-1)(7^2-7)=(7-1)(7+1)7(7-1)=6\cdot 8\cdot 7\cdot 6= 2^5\cdot 3^2\cdot 7$. Consequently, there are three Sylow subgroups to find, for the prime numbers $2$, $3$ and $7$.<br /><br />The case $p=7$ is the most classical one. One needs to find a group of order 7, and one such subgroup is given by the group of upper triangular matrices $\begin{pmatrix} 1 & * \\ 0 & 1\end{pmatrix}$. What makes things work is that $p$ is the characteristic of the chosen finite field. In general, if $q$ is a power of $p$, then the subgroup of upper-triangular matrices in $\mathop{\rm GL}(n,\mathbf F_q)$ with $1$s on the diagonal has cardinality $q\cdot q^2\cdots q^{n-1}=q^{n(n-1)/2}$, which is exactly the highest power of $p$ dividing the cardinality of $\mathop{\rm GL}(n,\mathbf F_q)$.<br /><br />Let's now study $p=3$. We need to find a group $S$ of order $3^2=9$ inside $G$. 
There are a priori two possibilities, either $S\simeq (\mathbf Z/3\mathbf Z)^2$, or $S\simeq (\mathbf Z/9\mathbf Z)$.<br />We will find a group of the first sort, which will show that the second case doesn't happen, because all $3$-Sylows are pairwise conjugate, hence isomorphic.<br /><br />Now, the multiplicative group $\mathbf F_7^\times$ is of order $6$, and is cyclic, hence contains a subgroup of order $3$, namely $C=\{1,2,4\}$. Consequently, the group of diagonal matrices with coefficients in $C$ is isomorphic to $(\mathbf Z/3\mathbf Z)^2$ and is our desired $3$-Sylow.<br /><br />Another reason why $G$ does not contain a subgroup $S$ isomorphic to $\mathbf Z/9\mathbf Z$ is that it does not contain elements of order $9$. Indeed, consider a matrix $A\in G$ such that $A^9=I$; then its minimal polynomial $P$ divides $T^9-1$. Since $7\nmid 9$, the matrix $A$ is diagonalizable over the algebraic closure of $\mathbf F_7$. The eigenvalues of $A$ are $9$th roots of unity, and belong to $\mathbf F_{49}$ since $\deg(P)\leq 2$. On the other hand, if $\alpha$ is a $9$th root of unity belonging to $\mathbf F_{49}$, one has $\alpha^9=\alpha^{48}=1$, hence $\alpha^3=1$ since $\gcd(9,48)=3$. Consequently, $\alpha$ is a cubic root of unity and $A^3=I$, showing that the order of $A$ divides $3$.<br /><br />It remains to treat the case $p=2$, which I find slightly trickier. Let's try to find elements $A$ in $G$ whose order divides $2^5$. As above, such a matrix is diagonalizable in an algebraic closure, its minimal polynomial divides $T^{32}-1$, and its roots belong to $\mathbf F_{49}$, hence satisfy $\alpha^{32}=\alpha^{48}=1$, hence $\alpha^{16}=1$. Conversely, $\mathbf F_{49}^\times$ is cyclic of order $48$, hence contains an element of order $16$, and such an element is quadratic over $\mathbf F_7$, hence its minimal polynomial $P$ has degree $2$. 
The corresponding companion matrix $A$ in $G$ is an element of order $16$, generating a subgroup $S_1$ of $G$ isomorphic to $\mathbf Z/16\mathbf Z$. We also observe that $\alpha^8=-1$ (because its square is $1$ and $\alpha^8\neq 1$); since $A^8$ is diagonalizable in an algebraic closure with $-1$ as the only eigenvalue, this shows $A^8=-I$.<br /><br />Now, there exists a $2$-Sylow subgroup $S$ containing $S_1$, and $S_1$ will be a normal subgroup of $S$ (because its index is the smallest prime number dividing the order of $S$, which is $2$). This suggests introducing the normalizer $N$ of $S_1$ in $G$. One then has $S_1\subset S\subset N$. Let $s\in S$ be such that $s\not\in S_1$; then there exists a unique $k\in\{1,\dots,15\}$ such that $s^{-1}As=A^k$, and $s^{-2}As^2=A^{k^2}=A$ (because $s$ has order $2$ modulo $S_1$), hence $k^2\equiv 1\pmod{16}$—in other words, $k\equiv \pm1\pmod 8$.<br /><br />There exists a natural choice of $s$: the involution ($s^2=I$) which exchanges the two eigenspaces of $A$. To finish the computation, it's useful to take a specific example of polynomial $P$ of degree $2$ whose roots in $\mathbf F_{49}$ are primitive $16$th roots of unity. In other words, we need to factor the $16$th cyclotomic polynomial $\Phi_{16}=T^8+1$ over $\mathbf F_7$ and find a factor of degree $2$; actually, Galois theory shows that all factors have the same degree, so that there should be 4 factors of degree $2$. To explain the following computation, a remark is useful. Let $\alpha$ be a primitive $16$th root of unity in $\mathbf F_{49}$; we have $(\alpha^8)^2=1$ but $\alpha^8\neq 1$, hence $\alpha^8=-1$. If $P$ is the minimal polynomial of $\alpha$, the other root is $\alpha^7$, hence the constant term of $P$ is equal to $\alpha\cdot \alpha^7=\alpha^8=-1$.<br /><br />We start from $T^8+1=(T^4+1)^2-2T^4$ and observe that $2\equiv 4^2\pmod 7$, so that $T^8+1=(T^4+1)^2-4^2T^4=(T^4+4T^2+1)(T^4-4T^2+1)$. 
To find the factors of degree $2$, we remember that their constant terms should be equal to $-1$. We thus go on differently, writing $T^4+4T^2+1=(T^2+aT-1)(T^2-aT-1)$ and solving for $a$: this gives $-2-a^2=4$, hence $a^2=-6=1$ and $a=\pm1$. The other factors are found similarly and we get<br />\[ T^8+1=(T^2-T-1)(T^2+T-1)(T^2-4T-1)(T^2+4T-1). \]<br />We thus choose the factor $T^2-T-1$ and set $A=\begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}$.<br /><br />Two eigenvectors for $A$ are $v=\begin{pmatrix} 1 \\ \alpha \end{pmatrix}$ and $v'=\begin{pmatrix}1 \\ \alpha'\end{pmatrix}$, where $\alpha'=\alpha^7$ is the other root of $T^2-T-1$. Let $B$ be the matrix of the involution $s$ that exchanges the two eigenspaces. The equations for $B$ are $Bv=v'$ and $Bv'=v$; since $\alpha'=1-\alpha$, this gives $B=\begin{pmatrix} 1 & 0 \\ 1 & - 1\end{pmatrix}$. The subgroup $S=\langle A,B\rangle$ generated by $A$ and $B$ has order $32$ and is a $2$-Sylow subgroup of $G$. <br /><br />Generalizing this method involves finding large commutative $p$-subgroups (such as $S_1$) which belong to appropriate (possibly non-split) tori of $\mathop{\rm GL}(n)$ and combining them with adequate parts of their normalizer, which is close to considering Sylow subgroups of the symmetric group. The paper <a href="http://www.ams.org/journals/proc/1955-006-04/S0002-9939-1955-0072143-9/S0002-9939-1955-0072143-9.pdf">Sylow $p$-subgroups of the classical groups over finite fields with characteristic prime to $p$</a> by A.J. Weir gives the general description (as well as for orthogonal and symplectic groups), building on an earlier paper in which he constructed Sylow subgroups of symmetric groups. See also the paper <a href="http://www.digizeitschriften.de/dms/img/?PID=GDZPPN00243170X">Some remarks on Sylow subgroups of the general linear groups</a> by <span class="AuthorName_container"><span class="AuthorName">C. R. Leedham-Green and </span></span><span class="AuthorName_container"><span class="AuthorName">W. 
Plesken which says a lot about maximal $p$-subgroups of the general linear group (over not necessarily finite fields).</span></span> Also, the question was recently the subject of <a href="http://mathoverflow.net/questions/88017/sylow-subgroups-of-projective-general-linear-groups">interesting discussions on MathOverflow</a>.<br /><br />[Edited on Feb. 14 to correct the computation of the 2-Sylow...]Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com1tag:blogger.com,1999:blog-8231917611006633375.post-82787505676248941212016-01-04T20:40:00.000+01:002016-01-04T20:42:26.398+01:00Model theory and algebraic geometry, 5 — Algebraic differential equations from coveringsIn this final post of this series, I return to elimination of imaginaries in DCF and explain the main theorem from Tom Scanlon's paper <a href="http://front.math.ucdavis.edu/1408.5177">Algebraic differential equations from covering maps</a>.<br /><br />The last ingredient to be discussed is jet spaces.<br /><br />Differential algebra is seldom used explicitly in algebraic geometry. However, differential techniques have furnished a crucial tool for the study of the Mordell conjecture over function fields (beginning with the proof of this conjecture by Grauert and Manin), and its generalizations in higher dimension (theorem of Bogomolov on surfaces satisfying $c_1^2>c_2$), or for holomorphic curves (conjecture of Green-Griffiths). They are often reformulated within the language of <i>jet bundles</i>.<br /><br />Let us assume that $X$ is a smooth variety over a field $k$. Its tangent bundle $T(X)$ is a vector bundle over $X$ whose fiber at a (geometric) point $x$ is the tangent space $T_x(X)$ of $X$ at $x$. By construction, every morphism $f\colon Y\to X$ of algebraic varieties induces a tangent morphism $Tf\colon T(Y)\to T(X)$: it maps a tangent vector $v\in T_y(Y)$ at a (geometric) point $y\in Y$ to the tangent vector $T_yf(v)\in T_{f(y)}(X)$ at $f(y)$. 
This can be rephrased in the language of differential algebra as follows: for every differential field $(K,\partial)$ whose field of constants contains $k$, one has a derivative map $\nabla_1\colon X(K)\to T(X)(K)$. Here is the relation, where we assume that $K$ is the field of functions of a variety $Y$. A derivation $\partial$ on $K$ can be viewed as a vector field $V$ on $Y$, possibly not defined everywhere; replacing $Y$ by a dense open subset if needed, we assume that it is defined everywhere. Now, a point $x\in X(K)$ can be identified with a rational map $f\colon Y\dashrightarrow X$, defined on an open subset $U$ of $Y$. Then, we simply consider the morphism from $U$ to $T(X)$ given by $p\mapsto T_pf (V_p)$. At the level of function fields, this is our point $\nabla_1(x)\in T(X)(K)$.<br /><br />If one wants to look at higher derivatives, the construction of the tangent bundle can be iterated and gives rise to jet bundles which are varieties $J_m(X)$, defined for all integers $m\geq 0$, such that $J_0(X)=X$, $J_1(X)=T(X)$, and for $m\geq 1$, $J_m(X)$ is a vector bundle over $J_{m-1}X$ modelled on the $m$th symmetric product of $\Omega^1_X$. 
For every differential field $(K,\partial)$ whose field of constants contains $k$, there is a canonical $m$th derivative map $\nabla_m\colon X(K) \to J_m(X) (K)$.<br /><br />The construction of the jet bundles can be given so that the following four requirements are satisfied:<br /><ul><li>If $X=\mathbf A^1$ is the affine line, then $J_m(X)$ is an affine space of dimension $m+1$, and $\nabla_m$ is just given by $ \nabla_m (x) = (x,\partial(x),\dots,\partial^m(x)) $ for $x\in X(K)=K$;</li><li>Products: $J_m(X\times Y)=J_m(X)\times_k J_m(Y)$;</li><li>Open immersions: if $U$ is an open subset of $X$, then $J_m(U)$ is the open subset of $J_m(X)$ given by the preimage of $U$ under the projection $J_m(X)\to J_{m-1}(X)\to \dots\to J_0(X)=X$;</li><li>When $X$ is an algebraic group, with origin $e$, then $J_m(X) $ is canonically isomorphic to the product of $X$ by the affine space $J_m(X)_e$ of $m$-jets at $e$.</li></ul>We now describe Scanlon's application.<br /><br />Let $G$ be a complex algebraic group acting on a complex algebraic variety $X$; let $S\colon X\to Z$ be the corresponding generalized Schwarzian map. Here, $Z$ is a complex algebraic variety, but $S$ is a differential map of some order $m$. In other words, there exists a constructible algebraic map $\tilde S\colon J_m(X)\to Z$ such that $S(x)=\tilde S(\nabla_m(x))$ for every differential field $(K,\partial)$ and every point $x\in X(K)$.<br /><br />Let $U$ be an open subset of $X(\mathbf C)$, for the complex topology, and let $\Gamma$ be a Zariski dense subgroup of $G(\mathbf C)$ which stabilizes $U$. We assume that there exists a complex algebraic variety $Y$ and a biholomorphic map $p\colon \Gamma\backslash U \to Y(\mathbf C)$.<br /><br />Locally, every open holomorphic map $\phi\colon\Omega\to Y(\mathbf C)$ can be lifted to a holomorphic map $\tilde\phi\colon \Omega\to U$. 
Two liftings differ locally by the action of an element of $\Gamma$, so that the composition $S\circ\tilde\phi$ does not depend on the choice of the lifting, by definition of the generalized Schwarzian map $S$. This gives a well-defined differential-analytic map $T\colon Y\to Z$. Let $m$ be the maximal order of derivatives appearing in a formula defining $T$. Then one may write $T\circ\phi =\tilde T\circ \nabla_m\tilde\phi$, where $\tilde T$ is a constructible analytic map from $J_m(Y)$ to $Z$.<br /><br /><b>Theorem</b> (Scanlon). — <i>Assume that there exists a fundamental domain $\mathfrak F\subset U$ such that the map $p|_{\mathfrak F}\colon \mathfrak F\to Y(\mathbf C)$ is definable in an o-minimal structure. Then $T$ is differential-algebraic: there exists a constructible map $\tilde T\colon J_m(Y)\to Z$ such that $T\circ \phi=\tilde T \circ J_m(\phi)$ for every $\phi$ as above.</i><br /><br />For the <i>proof</i>, observe that the map $\tilde T$ is definable in an o-minimal structure, because it comes, by quotient of a definable map from the preimage in $J_m(U)$ of $\mathfrak F$, and o-minimal structures allow elimination of imaginaries. By the theorem of Peterzil and Starchenko, it is constructible algebraic. Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com0tag:blogger.com,1999:blog-8231917611006633375.post-18871081428028233382015-11-11T22:04:00.000+01:002015-11-12T13:12:42.545+01:00When Baire meets Krasner<br />Here is a well-but-ought-to-be-better known theorem. <br /><br /><b>Theorem. —</b> <i>Let $\ell$ be a prime number and let $G$ be a compact subgroup of $\mathop{\rm GL}_d(\overline{\mathbf Q_\ell})$. Then there exists a finite extension $E$ of $\mathbf Q_\ell$ such that $G$ is contained in $\mathop{\rm GL}_d(E)$.</i><br /><br />Before explaining its proof, let us recall why such a theorem can be of any interest at all. 
The keyword here is <i>Galois representations.</i> <br /><br />It is now a well-established fact that linear representations are an extremely useful tool to study groups. This is standard for finite groups, for which complex linear representations appear at one point or another of graduate studies, and its topological version is even more classical for the abelian groups $\mathbf R/\mathbf Z$ (Fourier series) and $\mathbf R$ (Fourier integrals). On the other hand, some groups are extremely difficult to grasp while their representations are ubiquitous, namely the absolute Galois groups $G_K=\operatorname{Gal}(\overline K/K)$ of fields $K$.<br /><br />With the notable exception of real closed fields, these groups are infinite and have a natural (profinite) topology with open subgroups the groups $\operatorname{Gal}(\overline K/L)$, where $L$ is a finite extension of $K$ lying in $\overline K$. It is therefore important to study their continuous linear representations. Complex representations are important but since $G_K$ is totally disconnected, their image is always finite. Therefore, $\ell$-adic representations, namely continuous morphisms from $G_K$ to $\mathop{\rm GL}_d(\mathbf Q_\ell)$, are more important. Here $\mathbf Q_\ell$ is the field of $\ell$-adic numbers.<br /><br />Their use goes back to Weil's proof of the Riemann hypothesis for curves over finite fields, via the action on $\ell^\infty$-division points of its Jacobian variety. Here $\ell$ is a prime different from the characteristic of the ground field. More generally, every Abelian variety $A$ over a field $K$ of characteristic $\neq\ell$ gives rise to a Tate module $T_\ell(A)$ which is a free $\mathbf Z_\ell$-module of rank $d=2\dim(A)$, endowed with a continuous action $\rho_{A,\ell}$ of $G_K$. 
Taking a basis of $T_\ell(A)$, one thus has a continuous morphism $G_K\to \mathop{\rm GL}_d(\mathbf Z_\ell)$, and, embedding $\mathbf Z_\ell$ in the field of $\ell$-adic numbers, a continuous morphism $G_K\to\mathop{\rm GL}_d(\mathbf Q_\ell)$. Even more generally, one can consider the $\ell$-adic étale cohomology of algebraic varieties over $K$.<br /><br />For various reasons, such as the need to diagonalize additional group actions, one can be led to consider similar representations where $\mathbf Q_\ell$ is replaced by a finite extension of $\mathbf Q_\ell$, or even by the algebraic closure $\overline{\mathbf Q_\ell}$. Since $G_K$ is a compact topological group, its image by a continuous representation $\rho\colon G_K\to\mathop{\rm GL}_d(\overline{\mathbf Q_\ell})$ is a compact subgroup of $\mathop{\rm GL}_d(\overline{\mathbf Q_\ell})$ to which the above theorem applies.<br /><br />This being said for the motivation, one proof (attributed to Warren Sinnott) is given by Keith Conrad in his short note, <a href="http://www.math.uconn.edu/~kconrad/blurbs/gradnumthy/GLnQpbar.pdf">Compact subgroups of ${\rm GL}_n(\overline{\mathbf Q}_p)$</a>. In fact, while browsing his large set of excellent expository notes, I came across that one and felt urged to write this blog post.<br /><br />The following proof had been explained to me by Jean-Benoît Bost almost exactly 20 years ago. I believe that it ought to be much more widely known.<br /><br />It relies on the Baire category theorem and on Krasner's lemma. <br /><br /><b>Lemma 1</b> (essentially Baire). — <i>Let $G$ be a compact topological group and let $(G_n)$ be an increasing sequence of closed subgroups of $G$ such that $\bigcup G_n=G$. There exists an integer $n$ such that $G_n=G$.</i><br /><br /><i>Proof.</i> Since $G$ is compact Hausdorff, it satisfies the Baire category theorem and there exists an integer $m$ such that $G_m$ contains a non-empty open subset $V$. 
For every $g\in V$, the set $V\cdot g^{-1}$ is an open neighborhood of identity contained in $G_m$. This shows that $G_m$ is open in $G$. Since $G$ is compact, it has finitely many cosets $g_iG_m$ modulo $G_m$; there exists an integer $n\geq m$ such that $g_i\in G_n$ for every $i$, hence $G=G_n$. QED.<br /><br /><b>Lemma 2</b> (essentially Krasner). — <i>For every integer $d$, the set of all extensions of $\mathbf Q_\ell$ of degree $d$, contained in $\overline{\mathbf Q_\ell}$, is finite.</i><br /><br /><i>Proof.</i> Every finite extension of $\mathbf Q_\ell$ has a primitive element whose minimal polynomial can be taken monic and with coefficients in $\mathbf Z_\ell$; its degree is the degree of the polynomial. On the other hand, Krasner's lemma asserts that for every such irreducible polynomial $P$, there exists a real number $c_P>0$ such that every monic polynomial $Q$ of the same degree for which the coefficients of $Q-P$ have absolute values $<c_P$ has a root in the field $E_P=\mathbf Q_\ell[T]/(P)$. By compactness of $\mathbf Z_\ell$, the set of all finite subextensions of given degree of $\overline{\mathbf Q_\ell}$ is finite. QED.<br /><br />Let us now give the <b>proof of the theorem.</b> Let $(E_n)$ be an increasing sequence of finite subextensions of $\overline{\mathbf Q_\ell}$ such that $\overline{\mathbf Q_\ell}=\bigcup_n E_n$ (lemma 2; take for $E_n$ the subfield generated by $E_{n-1}$ and all the subextensions of degree $n$ of $\overline{\mathbf Q_\ell}$). Then $G_n=G\cap \mathop{\rm GL}_d(E_n)$ is a closed subgroup of $G$, and $G$ is the increasing union of all $G_n$. By lemma 1, there exists an integer $n$ such that $G_n=G$. 
QED.<br /> <br /><br />Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com3tag:blogger.com,1999:blog-8231917611006633375.post-16856967133981798612015-10-25T23:56:00.000+01:002015-11-11T21:40:15.798+01:00On Lp-spaces, when 0<p<1, convex sets and linear formsWhile the theory of normed vector spaces is now extensively taught at the undergraduate level, the more general theory of topological vector spaces usually does not reach the curriculum. There may be good reasons for that, and here is an example, taken from a paper of Mahlon M. Day, <a href="http://www.ams.org/journals/bull/1940-46-10/S0002-9904-1940-07308-2/S0002-9904-1940-07308-2.pdf">The spaces $L^p$ with $0<p<1$</a> (<i>Bull. Amer. Math. Soc.</i> <b>46</b> (1940), 816–823), of which I learned from a nice <a href="http://www.math.uconn.edu/~kconrad/blurbs/analysis/lpspace.pdf">analysis blurb</a> by <a href="http://www.math.uconn.edu/~kconrad/">Keith Conrad</a> which has almost the same title.<br /><br />For simplicity, I consider here the simple case when the measured space is $[0;1]$, with the Lebesgue measure, and $p=1/2$. Let $E$ be the set of measurable real valued functions $f$ on the interval $[0;1]$ such that $\int_0^1|f(t)|^{1/2}dt<+\infty$, where we identify two functions which coincide almost everywhere. For $f,g\in E$, let us define $d(f,g)=\int_0^1 \mathopen|f(t)-g(t) \mathclose|^{1/2}dt$.<br /><br /><b>Lemma. —</b> <i></i><br /><ol><li><i>The set $E$ is a vector subspace of the space of all measurable functions (modulo coincidence almost everywhere).</i></li><li><i>The mapping $d$ is a distance on $E$.</i></li><li><i>With respect to the topology defined by $d$, the addition of $E$ and the scalar multiplication are continuous, so that $E$ is a topological vector space.</i></li></ol><br /><i>Proof. 
—</i> We will use the following basic inequality: For $u,v\in\mathbf R$, one has $\mathopen|u+v\mathclose|^{1/2}\leq |u|^{1/2}+|v|^{1/2}$; it can be shown by squaring both sides of the inequality and using the usual triangular inequality. Let $f,g\in E$; taking $u=f(t)$ and $v=g(t)$, and integrating the inequality, we obtain that $f+g\in E$. It is clear that $af\in E$ for $a\in\mathbf R$ and $f\in E$. This proves that $E$ is a vector subspace of the space of measurable functions. For $f,g\in E$, one has $f-g\in E$, so that $d(f,g)$ is finite. Let then $f,g,h\in E$; taking $u=f(t)-g(t)$ and $v=g(t)-h(t)$, and integrating this inequality for $t\in[0;1]$, we then obtain the triangular inequality $d(f,h)\leq d(f,g)+d(g,h)$ for $d$. Moreover, if $d(f,g)=0$, then $f=g$ almost everywhere, hence $f=g$ by definition of $E$. This proves that $d$ is a distance on $E$. Let us now show that $E$ is a topological vector space with respect to the topology defined by $d$. Let $f,g\in E$. For $f',g'\in E$, one then has $d(f'+g',f+g)=\int_0^1\mathopen|(f-f')+(g-g')\mathclose|^{1/2}\leq d(f,f')+d(g,g')$. This proves that addition is continuous on $E$. Similarly, let $a\in \mathbf R$ and $f\in E$. For $b\in\mathbf R$ and $g\in E$, one has $d(af,bg)\leq d(af,bf)+d(bf,bg)\leq \mathopen|b-a\mathclose|^{1/2} d(f,0)+|b|^{1/2}d(f,g)$. This implies that scalar multiplication is continuous. QED.<br /><br /><br />The following theorem shows one unusual feature of this topological vector space.<br /><br /><b>Theorem. —</b> <i>One has $E^*=0$: every continuous linear form on $E$ vanishes identically.</i><br /><br /><i>Proof. —</i> Let $\phi$ be a non-zero continuous linear form on $E$. Let $f\in E$ be such that $\phi(f)\neq 0$; we may assume that $\phi(f)\geq 1$. For $s\in[0,1]$, let $g_s\colon[0;1]\to\mathbf R$ be the function defined by $g_s(t)=0$ for $0\leq t\leq s$ and $g_s(t)=1$ for $s< t\leq 1$. When $s$ goes from $0$ to $1$, $d(g_s f,0)$ goes from $d(f,0)$ to $0$. 
Consequently, there exists $s$ such that $d(g_s f,0)=d(f,0)/2$. Then $d((1-g_s)f,0)=\int_0^s |f(t)|^{1/2}dt=\int_0^1|f(t)|^{1/2}dt-\int_s^1|f(t)|^{1/2}dt=d(f,0)-d(g_sf,0)=d(f,0)/2$ as well. Moreover the inequality $1\leq\phi(f)=\phi(g_sf)+\phi((1-g_s)f)$ shows that either $\phi(g_sf)\geq1/2$ or $\phi((1-g_s)f)\geq 1/2$. Set $f'=2g_s f$ in the first case, and $f'=2(1-g_s)f$ in the latter; one has $\phi(f')\geq 1$ and $d(f',0)=d(f,0)/\sqrt 2$. Iterating, we obtain a sequence $(f^{(n)})$ of elements of $E$ which converges to $0$ but such that $\phi(f^{(n)})\geq 1$ for every $n$, contradicting the continuity of $\phi$. QED.<br /><br /><br />On the other hand, we may seem to remember the Hahn-Banach theorem according to which, for every non-zero function $f\in E$, there exists a continuous linear form $\phi\in E^*$ such that $\phi(f)=1$. Obviously, the previous theorem seems to violate the Hahn-Banach theorem. <br />So why is this not so? Precisely because the Hahn-Banach theorem makes the fundamental hypothesis that the topological vector space be a normed vector space or, more generally, a locally convex vector space, which means that $0$ admits a basis of <i>convex</i> neighborhoods. According to the following proposition, this is far from being so.<br /><br /><b>Proposition. —</b> <i>$E$ is the only non-empty convex open subset of $E$.</i><br /><br /><i>Proof. —</i> Let $V$ be a non-empty convex open subset of $E$. Up to an affine transformation, in order to prove that $V=E$, we may assume that $0\in V$ and that $V$ contains the unit ball of center $0$. We first show that $V$ is unbounded. For every $n\geq 1$, we split the interval $[0,1]$ into $n$ intervals $[(k-1)/n,k/n]$, for $1\leq k\leq n$, with characteristic functions $g_k$. One has $d(n^2g_k,0)=1$ for every $k$, hence $n^2 g_k\in V$; moreover, $1=\sum_{k=1}^n g_k$, so that $n=\frac 1n \sum_{k=1}^n n^2 g_k$ belongs to $V$. 
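<br /><br />To make this first step concrete, here is the computation spelled out (a sketch added for illustration; as in the text, $n$ also denotes the constant function with value $n$): \[ d(n^2g_k,0)=\int_{(k-1)/n}^{k/n} (n^2)^{1/2}\,dt = n\cdot\frac1n = 1, \qquad d(n,0)=\int_0^1 n^{1/2}\,dt=\sqrt n. \] Thus the convex set $V$ contains elements at distance $\sqrt n$ from $0$ for every $n\geq 1$, although each of them is an average of $n$ elements at distance $1$ from $0$.<br /><br />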
More generally, given $f\in E$ and $n\geq 1$, we split the interval $[0;1]$ into $n$ successive intervals, with characteristic functions $g_k$, such that $d(fg_k,0)=d(f,0)/n$ for every $k$; one also has $f=\sum fg_k$. Then $d(nfg_k,0)=\sqrt n\, d(fg_k,0)=d(f,0)/\sqrt n$; choosing $n\geq d(f,0)^2$, this is $\leq 1$, hence $n fg_k\in V$ and the relation $f=\frac1n \sum nf g_k$ shows that $f\in V$. QED.<br /><br /><br /><br />When $(X,\mu)$ is a measured space and $p$ is a real number such that $0<p<1$, the space $L^p(X,\mu)$ has similar properties. For this, I refer the interested reader to the above cited paper of Day and to Conrad's note.Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com1tag:blogger.com,1999:blog-8231917611006633375.post-38721121955031834282015-06-06T17:34:00.004+02:002015-06-06T17:46:07.320+02:00Model theory and algebraic geometry, 4 — Elimination of imaginariesThe fourth post of this series is devoted to an important concept of model theory, that of elimination of imaginaries. The statement of Scanlon's theorem will appear in a subsequent one.<br /><br /><b>Definition. — </b><i>Let $T$ be a theory in a language $L$. One says that $T$ eliminates imaginaries (resp. weakly eliminates imaginaries) if for every model $M$ and every formula $f(x;a)$ with parameters $a\in M^p$, there exists a formula $g(x;y)$ such that $\{ b\in M^q\;;\; \forall x, f(x;a)\Leftrightarrow g(x;b)\}$ is a singleton (resp. is a non-empty finite set).</i><br /><br />What does this mean? View the formula $f(x;y)$ as defining a <i>family</i> of definable subsets, where $f(x;a)$ is the slice given by the choice of parameters $a$. It may happen that many fibers are equal. The property of elimination of imaginaries asserts that one can define the same family of definable subsets via another formula $g(x;y)$, with different parameters, so that every definable set in the original family appears once and only once. 
For the case of weak elimination, every definable set of the initial family appears only finitely many times.<br /><br />There is an alternative, Galois-theoretic, description: a theory $T$ (weakly) eliminates imaginaries if and only if, for every formula $f(x;a)$ with parameters in a model $M$, there exists a finite subset $B\subset M$ such that for every elementary extension $N$ of $M$ and every automorphism $\sigma$ of $N$, $\sigma$ preserves the formula (meaning $f(x;a)\leftrightarrow f(\sigma(x);a)$, or, equivalently, $\sigma$ leaves globally invariant the definable subset of $N^n$ defined by the formula $f(x;a)$) if and only if $\sigma$ leaves $B$ pointwise (resp. globally) invariant. One direction is obvious: take for $B$ the coordinates of the elements of the singleton (resp. the finite set) given by applying the definition. For the converse, elementary extensions must enter the picture because some models are too small to possess the necessary automorphisms that should exist; under “saturation hypotheses”, the model $M$ will witness them already.<br /><br />This property is related to the possibility of representing equivalence classes modulo a definable equivalence relation. Namely, let $M$ be a model and let $E$ be an equivalence relation on $M^n$ whose graph is a definable subset of $M^n\times M^n$. Assume that the theory $T$ eliminates imaginaries and allows one to define two distinct elements. Then there exists a definable map $f_E\colon M^n\to M^m$ such that for every $y,z\in M^n$, $y \mathrel{E} z$ if and only if $f_E(y)=f_E(z)$. In particular, the quotient set $M^n/E$ is represented by the image of the definable map $f_E$.<br /><br />Conversely, let $f(x;a)$ be a formula with parameters $a\in M^p$ and consider the equivalence relation $E$ on $M^p$ given by $yEz$ if and only if $\forall x,\ f(x;y)\Leftrightarrow f(x;z)$. Its graph is obviously definable. 
Assume that there exists a definable map $f_E\colon M^p\to M^q$ such that $yEz$ if and only if $f_E(y)=f_E(z)$. Then an automorphism of (an elementary extension of) $M$ will fix the definable set defined by $f(x;a)$ if and only if it fixes $f_E(a)$, so that one has elimination of imaginaries.<br /><br /><b>Theorem </b>(Poizat). — <i>The theory of algebraically closed fields eliminates imaginaries.</i><br /><br />This is more or less equivalent to Weil's theorem on the field of definition of a variety. It is my feeling, however, that this property is underestimated in algebraic geometry. Indeed, it is closely related to a theorem of Rosenlicht that asserts that given a variety $X$ and an algebraic group $G$ acting on $X$, there exists a dense $G$-invariant open subset $U$ of $X$ such that a geometric quotient $U/G$ exists in the sense of Mumford's <i>Geometric Invariant Theory</i>.<br /><br /><b>Examples. —</b> Let $K$ be an algebraically closed field.<br /><br />a) Let $X$ be a Zariski closed subset of $K^n$ and let $G$ be a finite group of (regular) automorphisms of $X$. Let us consider the formula $f(x;y)=\bigvee_{g\in G} (x=g\cdot y)$ which asserts that $x$ belongs to the orbit of $y$ under the given action of $G$, so that $f(x;y)$ parameterizes $G$-orbits. Since $G$ is finite, weak elimination of imaginaries is a trivial matter, but elimination of imaginaries is possible. Let indeed $A$ be the affine algebra of $X$; this is a $K$-algebra of finite type with an action of $G$ and the algebra $A^G$ is finitely generated. Consequently, there exists a Zariski closed subset $Y$ of some $K^m$ and a polynomial morphism $\phi\colon K^n\to K^m$ such that, for every $y,z\in X$, $\phi(y)=\phi(z)$ if and only if there exists $g\in G$ such that $z=g\cdot y$. 
Consequently, for $a\in X$, $b=\phi(a)$ is the only element such that the formula $f(x;a)$ be equivalent to the formula $g(x;b)=(b\in Y) \wedge (\exists y\in X)\big((\phi(y)=b) \wedge f(x;y)\big)$.<br /><br />The simplest instance would be the symmetric group $G=\mathfrak S_n$ acting on $K^n$ by permutation of coordinates. Then $G$-orbits are unordered $n$-tuples of elements of $K$, and it is a both trivial and fundamental fact that the orbit of $(x_1,\dots,x_n)$ is faithfully represented by the first $n$ elementary symmetric functions of $(x_1,\dots,x_n)$, equivalently, by the coefficients of the polynomial $\prod_{j=1}^n (T-x_j)$.<br /><br />b) Let $X=K^{n^2}$ be the set of all $n\times n$ matrices on which the group $G=\mathop{\rm GL}(n,K)$ acts by conjugation. The Jordan decomposition gives a partition of $X$ into constructible sets, stable under the action of $G$, and on each of them, there exists a regular representation of the equivalence classes. For example, the set $U$ of all matrices with pairwise distinct eigenvalues is Zariski open — it is defined by the non-vanishing of the discriminant of the characteristic polynomial — and on this set $U$, the conjugacy class of a matrix is represented by its characteristic polynomial.<br /><br /><b>Theorem. — </b><i>An o-minimal theory eliminates imaginaries. More precisely any surjective definable map $f\colon X\to Y$ between definable sets admits a definable section.</i><br /><br />This follows from the fact that one can define a canonical point in every non-empty definable set. By induction on dimension, it suffices to prove this for a subset $A$ of the line. Then, let $J_A$ be the leftmost interval of $A$ (if the formula $f$ defines $A$, then $J_A$ is defined by the formula $f(x)\wedge\forall y\,\big((y\leq x\wedge f(y))\rightarrow\forall z\,(y\leq z\leq x\rightarrow f(z))\big)$); let $u$ and $v$ be the “endpoints” of $J_A$; if $u=-\infty$ and $v=+\infty$, set $x_A=0$; if $u=-\infty$ and $v<\infty$, set $x_A=v-1$; if $-\infty<u$ and $v=+\infty$, set $x_A=u+1$; if $-\infty<u\leq v<+\infty$, set $x_A=(u+v)/2$. 
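<br /><br />As a quick illustration, assuming $M=\mathbf R$ (the sets below are hypothetical examples chosen for this note, not taken from the argument itself): \[ A=\mathopen]1;3\mathclose[\cup\{5\}:\ J_A=\mathopen]1;3\mathclose[,\ x_A=2; \qquad A=\mathopen]-\infty;4\mathclose[:\ x_A=3; \qquad A=\{5\}:\ x_A=5; \qquad A=\mathbf R:\ x_A=0. \] In each of these cases, $x_A$ belongs to $A$ and depends only on the set $A$, not on the formula chosen to define it.<br /><br />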
It is easy to write down a formula that expresses $x_A$ in terms of a formula for $A$. Consequently, in a family $A_t\subset M$ of non-empty definable sets, the function $t\mapsto x_{A_t}$ is definable.<br /><br /><b>Theorem (Poizat). — </b><i>The theory of differentially closed fields eliminates imaginaries in the language $\{+,-,\cdot,0,1,\partial\}$.</i><br /><br /><b>Examples. — </b>Let $K$ be a differentially closed field. Let $X$ be an algebraic variety with the action of an algebraic group $G$, all defined over the field of constants $C=K^\partial$. We can then endow $X(K)$ with the equivalence relation given by $x\sim y$ if and only if there exists $g\in G(C)$ such that $y=g\cdot x$. The following three special instances of elimination of imaginaries in DCF are classical results of function theory:<br /><br />a) If $X=\mathbf A^1$ is the affine line and $G=\mathbf G_a$ is the additive group acting by translation, then the map $\partial\colon x\mapsto \partial (x)$ gives a bijection from $X(K)/G(C)$ to $K$. Indeed, two elements $x,y$ of $K$ differ by the addition of a constant element if and only if $\partial(x)=\partial(y)$. (Moreover, every element of $K$ has a primitive.)<br /><br />b) Let $X=\mathbf A^1\setminus\{0\}$ be the affine line minus the origin and let $G=\mathbf G_m$ be the multiplicative group acting by multiplication. Then the logarithmic derivative $\partial\log\colon x\mapsto \partial(x)/x$ gives a bijection from $X(K)/G(C)=K^\times/C^\times$ to $K$ — two elements $x,y$ of $K^\times$ differ by multiplication by a constant if and only if $\partial(x)/x=\partial(y)/y$, and every element of $K$ is a logarithmic derivative.<br /><br />c) Let $X=\mathbf P^1$ be the projective line endowed with the action of the group $G=\operatorname{\rm PGL}(2)$. 
Then two points $x,y\in X(K)$ differ by an action of $G(C)$ if and only if their Schwarzian derivatives are equal, where the Schwarzian derivative of $x\in K$ is defined by<br />\[ S(x) = \partial\big(\partial^2 (x)/\partial (x)\big) -\frac12 \big(\partial^2(x)/\partial(x)\big)^2. \]<br /><br />Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com0tag:blogger.com,1999:blog-8231917611006633375.post-17688228837285915502015-05-11T22:51:00.001+02:002015-05-11T22:52:11.178+02:00Model theory and algebraic geometry, 3 — Real closed fields and o-minimalityIn this third post devoted to some interactions between model theory and algebraic geometry, we describe the concept of o-minimality and the o-minimal complex analysis of Peterzil and Starchenko.<br /><br /><b>1. Real closed fields and the theorem of Tarski-Seidenberg</b><br /><br />To begin with, we work in the language $L_{\mathrm{or}}$ of ordered rings which is the language of rings $L_{\mathrm r}=\{+,-,\cdot,0,1\}$ enlarged with an order relation $\leq$.<br /><br />Let us recall the definition of a real closed field: this is a field $K$ endowed with an ordering which is compatible with the field laws (the sum of positive elements is positive and the product of positive elements is positive) and which satisfies the intermediate value theorem for polynomials: for every polynomial $P\in K[T]$ and any pair $(a,b)$ of elements of $K$ such that $a<b$, $P(a)<0$ and $P(b)>0$, there exists $c\in K$ such that $P(c)=0$ and $a<c<b$. Observe that this property can be expressed by a sequence of first-order formulas, one for each degree.<br /><br />The field $\mathbf R$ of real numbers is real closed, but there are many others. For example, the field of formal Puiseux series with real coefficients is also real closed. <br /><br />A theorem of Artin-Schreier asserts that a field $K$ is real closed if and only if $\sqrt{-1}\not\in K$ and $K(\sqrt{-1})$ is an algebraic closure of $K$. 
This is also equivalent to the fact that “the” algebraic closure of $K$ is a finite non-trivial extension of $K$. While the algebraic notion adapted to the language of rings is that of an algebraically closed field, the notion of a real closed field is the one which is adapted to the language of ordered rings. In model theoretic terms, the theory of real closed fields is the model companion of the theory of ordered fields.<br /><br />The analogue of the theorem of Chevalley is the classical theorem of Tarski-Seidenberg:<br /><br /><b>Theorem</b> (Tarski-Seidenberg). — <i>The theory of real closed fields eliminates quantifiers in the language of ordered rings.</i><br /><br />There is a very classical example of this theorem, namely, the resolution of polynomial equations of degree 2. Indeed, in a real closed field, every positive element has a square root (if $a>0$, then the polynomial $T^2-a$ is negative at $0$ and positive at $\max(a,1)$, so that it admits a positive root). The usual algebraic computation thus shows that the formula $\exists x, x^2+ax+b=0$ is equivalent to the formula $a^2-4b\geq 0$.<br /><br /><b>Corollary 1. —</b> <i>If $M$ is a real closed field and $A$ is a subset of $M$, then $\mathop{\rm Def}(M^n,A)$ is the set of all semi-algebraic subsets of $M^n$ defined by polynomials with coefficients in $A$.</i><br /><br /><b>Corollary 2. —</b> <i>If $M$ is a real closed field, the definable subsets of $M$ are the finite unions of intervals (open, closed or half-open, $\mathopen]a;b\mathclose[$, $\mathopen]a;b]$, $[a;b\mathclose[$, $[a;b]$, possibly unbounded, possibly reduced to singletons).</i><br /><br /><b>2. O-minimality</b><br /><br />The seemingly innocuous property stated in corollary 2 leads to a definition which is surprisingly important and powerful.<br /><br /><b>Definition. —</b> <i>Let $T$ be the theory of a real closed field $M$ in an expansion $L$ of the language of ordered rings. 
One says that $T$ is <b>o-minimal</b> if the definable subsets of $M$ are the finite unions of intervals.</i><br /><br />It is a non-trivial result that o-minimality is indeed a property of the theory $T$, and not a property of the model $M$: if it holds, then for every elementary extension $N$ of $M$, the definable subsets of $N$ still are finite unions of intervals. <br /><br />By the theorem of Tarski-Seidenberg, the theory of real closed fields is o-minimal. The discovery of more complicated o-minimal theories is a remarkable fact from the 80s.<br /><br /><b>Example. —</b> Let $L_{\mathrm{an},\mathrm{exp}}$ be the language obtained by adjoining to the language $L_{\mathrm{or}}$ of ordered rings symbols of functions $\exp$ and $f$, for every real analytic function $f\colon [0;1]^n\to\mathbf R$. The field of real numbers is viewed as a structure for this language by interpreting $\exp$ as the exponential function from $\mathbf R$ to $\mathbf R$, and every function symbol $f$ as the function from $\mathbf R^n$ to $\mathbf R$ that maps $x$ to $f(x)$ if $x\in [0;1]^n$, and to $0$ otherwise.<i> The theory </i><i><i>(denoted $\mathbf R_{\mathrm{an},\mathrm{exp}}$) </i>of $\mathbf R$ in this language is o-minimal.</i><br /><br />This is a theorem of van den Dries and Miller; the case of $L_{\mathrm{an}}$ (without the exponential function) had been established by Denef and van den Dries, while the case of $L_{\mathrm{exp}}$ is due to Wilkie.<br /><br />To give a non-example, let us consider the language obtained by adjoining a symbol $\sin$ and view $\mathbf R$ as a structure for this language, the symbol $\sin$ being interpreted as the sine function from $\mathbf R$ to $\mathbf R$. Then the theory of $\mathbf R$ in this language is not o-minimal. 
Indeed, the set $\pi\mathbf Z$ is definable by the formula $\sin(x)=0$, but $\pi\mathbf Z$ has infinitely many connected components, so is not a finite union of intervals.<br /><br />One motivation for o-minimality is that it realizes (part of) Grothendieck's quest towards <b>tame topology</b> as described in his <i>Esquisse d'un programme.</i> Indeed, sets which are definable in an o-minimal structure have many tameness properties:<br /><ul><li>The interior, the closure, the boundary of a definable set is definable. </li><li>Every definable set is homeomorphic to (the topological realization of) a simplicial complex</li><li>Every definable set has a cellular decomposition. Precisely, let us call a cell of $\mathbf R^{n+1}$ any subset $C$ of the following form: one is given a definable subset $A$ of $\mathbf R^n$ and definable functions $f,g\colon A\to\mathbf R$ such that $f(x)<g(x)$ for every $x\in A$, and the set $C$ is defined by the condition $x\in A$, and by one of the conditions $t<f(x)$, or $t=f(x)$, or $f(x)<t<g(x)$, or $t>f(x)$. 
Then for every finite family $(B_i)$ of definable subsets of $\mathbf R^{n+1}$, there is a finite partition of $\mathbf R^{n+1}$ into cells such that every $B_i$ is a union of cells.</li><li>Every definable function is piecewise smooth.</li><li>Definable continuous functions are definably piecewise trivial (theorem of Hardt): for every function $f\colon X\to Y$ between definable sets which is definable and continuous, there is a finite partition $(Y_i)$ of $Y$ into definable subsets such that the map $f_i\colon f^{-1}(Y_i)\to Y_i$ deduced from $f$ by restriction is isomorphic to a projection $Y_i\times S_i\to Y_i$.</li></ul><br />Recently, o-minimality has had spectacular and fantastic applications via the approach of Pila-Zannier to the conjecture of Pink, leading to new proofs of the Manin-Mumford conjecture (Pila-Zannier), and to proofs of the André-Oort conjecture (Pila, Pila-Tsimerman, Klingler-Ullmo-Yafaev), and, more recently, to partial results towards the conjecture of Pink (Gao, Habegger-Pila,...). However, this is not the goal of that post, so let me refer the interested reader to <a href="http://www.bourbaki.ens.fr/TEXTES/1037.pdf">Tom Scanlon's Bourbaki talk</a> on that topic. <br /><br /><b>3. O-minimal complex analysis</b><br /><br />The standard identification of the field $\mathbf C$ of complex numbers with $\mathbf R^2$ (associating with a complex number its real and imaginary parts) allows to talk of complex valued functions (on a subset of $\mathbf C^n$) which are definable in a given language. In a remarkable series of papers, Peterzil and Starchenko have shown that holomorphic functions which are definable in an o-minimal structure possess very rigid properties. Let us quote some of their theorems.<br /><br />So we fix an expansion of the language $L_{\mathrm{or}}$ of which the field $\mathbf R$ is a structure whose theory is o-minimal. By “definable”, we mean definable in that language. 
The typical language considered in the applications here is the language $L_{\mathrm{an},\mathrm{exp}}$.<br /><br /><b>Theorem. —</b> <i>Let $A$ be a finite subset of $\mathbf C$ and let $f\colon \mathbf C\setminus A\to \mathbf C$ be a holomorphic function. If $f$ is definable, then it is a rational function.</i><br /><br /><b>Theorem. —</b> <i>Let $V\subset\mathbf C^n$ be a closed analytic subset. If $V$ is definable, then $V$ is algebraic.</i><br /><br /><b>Corollary</b> (Theorem of Chow). — <i>Let $V\subset\mathbf P^n(\mathbf C)$ be a closed analytic subset. Then $V$ is algebraic.</i><br /><br />Indeed, working on the standard charts of $\mathbf P^n(\mathbf C)$, we see that $V$ is locally definable by analytic functions. By compactness of $\mathbf P^n(\mathbf C)$, it is thus definable in the language $L_{\mathrm{an}}$. Since the theory of $\mathbf R$ in this language is o-minimal, the corollary is a consequence of the previous theorem.<br /><br />Let us finally give an important example. Let $X$ be a bounded symmetric domain. This means that $X$ is a bounded open subset of $\mathbf C^n$ such that for every point $p\in X$, there exists a biholomorphic involution $f\colon X\to X$ such that $p$ is an isolated fixed point of $f$. This implies that $X$ is a homogeneous space $G/K$ under a semisimple Lie group $G$ which acts by holomorphic automorphisms, and $K$ is a maximal compact subgroup of $G$. Moreover, $X$ has a canonical Kähler metric which is invariant under $G$.<br /><br />The most classical example is given by the Poincaré upper half-plane on which $\mathrm{PGL}(2,\mathbf R)$ acts by homographies; of course, the upper half-plane is not bounded, but is biholomorphic to the open unit disk. <br /><br />A more sophisticated example is given by the Siegel upper half-plane or, rather, its bounded version. That is, $X$ is the set of $n\times n$ symmetric complex matrices $Z$ such that $\mathrm I_n-Z^* Z$ is positive definite. 
It is a homogeneous space for the symplectic group $\mathrm{Sp}(2n,\mathbf R)$; the stabilizer of $Z=0$ is the unitary group $U(n)$.<br /><br />Let now $\Gamma$ be an arithmetic subgroup of $\mathrm{Sp}(2n,\mathbf R)$; for example, let us take $\Gamma$ to be a subgroup of finite index of $\mathrm{Sp}(2n,\mathbf Z)$. Then the quotient $S=X/\Gamma$ admits a structure of an analytic set and the projection $p\colon X\to S$ is an analytic map. If $\Gamma$ is “small enough” (torsion free, say), then $S$ is even a complex manifold, and $p$ is a covering. An important and difficult theorem of Baily-Borel asserts that $S$ is an algebraic variety.<br /><br />In fact, it is classical in this context that there exist Siegel sets, which are explicit subsets $F$ of $X$ such that $\Gamma\cdot F=X$ and such that the set of $\gamma\in\Gamma$ such that $\gamma\cdot F\cap F\neq\emptyset$ is finite. So Siegel sets are almost fundamental domains. An important remark is that they are semi-algebraic, that is, definable in the language of ordered rings. For example in the upper half-plane, one may take $F$ to be the set of all $z\in\mathbf C$ such that $-\frac12\leq \Re(z)\leq \frac12$ and $\Im(z)\geq \sqrt 3/2$. One may even take “fundamental sets” (which are fundamental domains up to something of empty interior) such as the one defined by the inequalities $-\frac12\leq \Re(z)\leq\frac12$ and $\lvert z\rvert \geq1$.<br /><br />Peterzil and Starchenko have proved that the restriction to $F$ of the projection $p$ is definable in the language $L_{\mathrm{an},\mathrm{exp}}$. An immediate consequence is that $S$ is definable in this language, hence is algebraic.<br /><br />These results have been generalized by Klingler, Ullmo and Yafaev to any bounded symmetric domain. 
This is an important technical part of their proof of the hyperbolic Ax-Lindemann conjecture.Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com0tag:blogger.com,1999:blog-8231917611006633375.post-80165356477484383212015-05-02T23:22:00.002+02:002015-05-10T18:23:06.358+02:00Model theory and algebraic geometry, 2 — Definable sets, types; quantifier eliminationThis is the second post in a series of 4 devoted to the exposition of interactions between model theory and algebraic geometry. In the first one, I explained the notions of language, structures and theories, with examples taken from algebra. Here, I shall discuss the notion of definable set, of types, as well as basic results from dimension theory ($\omega$-stability).<br /><br />So we fix a theory $T$ in a language $L$. A definable set is defined, in a given model $M$ of $T$, by a formula. More precisely, we consider definable sets in cartesian powers $M^n$ of the model $M$, which can be defined by a formula in $n$ free variables with parameters in some subset $A$ of $M$. By definition, such a formula is a formula of the form $\phi(x;a)$, where $\phi(x;y)$ is a formula in $n+m$ free variables, split into two groups $x=(x_1,\dots,x_n)$ and $y=(y_1,\dots,y_m)$ and $a=(a_1,\dots,a_m)\in A^m$ is an $m$-tuple of parameters; the formula $\phi(x;y)$ can have quantifiers and bound variables too. Given such a formula, we define a subset $[\phi(x;a)]$ of $M^n$ by $\{ x\in M^n\mid \phi(x;a)\}$. 
We write $\mathrm{Def}(M^n;A)$ for the set of all subsets of $M^n$ which are definable with parameters in $A$.<br /><br />Let us give examples, where $L$ is the language of rings and $T$ is the theory $\mathrm{ACF}$ of algebraically closed fields:<br /><ul><li>$V_1=\{x\mid x\neq 0 \}\subset M $, given by the formula “$x\neq 0$” with 1 free variable and no parameter;</li><li>$V_2=\{x\mid \exists y, 2xy=1\} \subset M $, given by the formula “$\exists y, 2xy=1$” with 1 free variable $x$, and one bound variable $y$;</li><li>$V_3=\{(x,y)\mid x^2+\sqrt 2 y^2=\pi \}\subset \mathbf C^2$, where the model $\mathbf C$ is the field of complex numbers, $\phi((x,y),(a,b))$ is the formula $x^2+ay^2=b$ in 4 free variables, and the parameters are given by $(a,b)=(\sqrt 2,\pi)$.</li></ul><b>Theorem</b> (Chevalley). — <i>Let $L$ be the language of rings, $T=\mathrm{ACF}$ and $M$ be an algebraically closed field; let $A$ be a subset of $M$. The set $\mathrm{Def}(M^n;A)$ is the smallest boolean algebra of subsets of $M^n$ which contains all subsets of $M^n$ of the form $[P(x;a)=0]$ where $P$ is a polynomial in $n+m$ variables with coefficients in $\mathbf Z$ and $a=(a_1,\dots,a_m)$ is an $m$-tuple of elements of $A$. In other words, a subset of $M^n$ is definable with parameters in $A$ if and only if it is constructible with parameters in $A$.</i><br /><br />The reason behind this theorem is the following set-theoretic interpretation of quantifiers and logical connectives. Precisely, if $\phi$ is a formula in $n+m+p$ variables, and $a\in A^p$, the definable subset $[\exists y \phi(x,y,a)]$ of $M^n$ coincides with the image of the definable subset $[\phi(x,y;a)]$ of $M^{n+m}$ under the projection $p_x \colon M^{n+m}\to M^n$. Similarly, if $\phi(x)$ and $\psi(x)$ are two formulas in $n$ free variables, then the definable subset $[\phi(x)\wedge\psi(x)]$ is the intersection of the definable subsets $[\phi(x)]$ and $[\psi(x)]$. 
And if $\phi(x)$ is a formula in $n$ variables, then the definable subset $[\neg\phi(x)]$ is the complement in $M^n$ of the definable subset $[\phi(x)]$.<br /><br />For example, the subset $V_2=[\exists y, 2xy=1]$ defined above can also be defined by $M\setminus [2x=0]$.<br /><br />One says that the theory ACF admits <i>elimination of quantifiers</i>: modulo the axioms of algebraically closed fields, every formula of the language $L$ is equivalent to a formula without quantifiers.<br /><br />An important consequence of this property is that for every extension $M\hookrightarrow M'$ of models of ACF, the theory of $M'$ is <i>equal </i>to the theory of $M$—one says that every extension of models is <i>elementary</i>.<br /><br />Let $p$ be either $0$ or a prime number. Observe that every algebraically closed field of characteristic $p$ is an extension of $\overline{\mathbf Q}$ if $p=0$, or of $\overline{\mathbf F_p}$ if $p$ is a prime number. As a consequence, for every characteristic $p\geq0$, the theory $\mathrm{ACF}_p$ of algebraically closed fields of characteristic $p$ (defined by the axioms of $\mathrm{ACF}$, and the axiom $1+1+\dots+1=0$ that the characteristic is $p$ if $p$ is a prime number, or the infinite list of axioms that assert that the characteristic is $\neq \ell$, if $p=0$) is <i>complete</i>: this list of axioms determines everything that can be said about algebraically closed fields of characteristic $p$.<br /><br /><b>Definition. —</b> <i>Let $a\in M^n$ and let $A$ be a subset of $M$. The </i>type of $a$<i> (with parameters in $A$) is the set $\mathrm{tp}(a/A)$ of all formulas $\phi(x;b)$ in $n$ free variables with parameters in $A$ such that $\phi(a;b)$ holds in the model $M$.</i><br /><br /><b>Definition. —</b> <i>Let $A$ be a subset of $M$. For every integer $n\geq 0$, the set $S_n(A)$ of </i>types<i> (with parameters in $A$) is the set of all types $\mathrm{tp}(a/A)$, where $N$ is an extension of $M$ which is a model of $T$ and $a\in N^n$. 
One then says that this type is </i>realized<i> in $N$.</i><br /><br />Gödel's completeness theorem allows us to give an alternative description of $S_n(A)$. Namely, let $p$ be a set of formulas in $n$ free variables and parameters in $A$ which contains the diagram of $A$ (that is, all formulas which involve only elements of $A$ and are true in $M$). Assume that $p$ is consistent (there exist a model $N$ which is an extension of $M$ and an element $a\in N^n$ such that $\phi(a)$ holds in $N$ for every $\phi\in p$) and maximal (for every formula $\phi\not\in p$, every model $N$ and every $a\in N^n$ such that $p\subset \mathrm{tp}(a/A)$, the formula $\phi(a)$ does not hold). Then $p\in S_n(A)$.<br /><br />For every formula $\phi\in L(A)$ in $n$ free variables and parameters in $A$, let $V_\phi$ be the set of types $p\in S_n(A)$ such that $\phi\in p$. Then the subsets $V_\phi$ of $S_n(A)$ constitute a basis of open sets for a natural topology on $S_n(A)$.<br /><br /><b>Theorem. —</b> <i>The topological space $S_n(A)$ is compact and totally disconnected.</i><br /><br />Let us detail the case of the theory ACF in the language of rings. I claim that if $K$ is a field, then $S_n(K)$ is homeomorphic to the spectrum $\mathop{\rm Spec}(K[T_1,\dots,T_n])$ endowed with its constructible topology. Concretely, for every algebraically closed extension $M$ of $K$ and every $a\in M^n$, the homeomorphism $j$ maps $\mathrm{tp}(a/K)$ to the prime ideal $\mathfrak p_a$ consisting of all polynomials $P\in K[T_1,\dots,T_n]$ such that $P(a)=0$.<br /><br />A type $p=\mathrm{tp}(a/K)$ is isolated if and only if the prime ideal $\mathfrak p_a$ is maximal. 
Consequently, if $n=1$, there is exactly one non-isolated type in $S_1(K)$, corresponding to the generic point of the spectrum $\mathop{\rm Spec}(K[T])$.<br /><br />As for any compact topological space, a space of types can be studied via its Cantor-Bendixson analysis, which is a decreasing sequence of subspaces, indexed by ordinals, defined by transfinite induction. First of all, for every topological space $X$, one denotes by $D(X)$ the set of all non-isolated points of $X$. One then defines $X_0=X$, $X_{\alpha}=D(X_\beta)$ if $\alpha=\beta+1$ is a successor-ordinal, and $X_\alpha=\bigcap_{\beta<\alpha} X_\beta$ if $\alpha$ is a limit-ordinal. For $x\in X$, the Cantor-Bendixson rank of $x$ is defined by $r_{CB}(x)=\alpha$ if $x\in X_\alpha$ and $x\not\in X_\beta$ for $\beta>\alpha$, and $r_{CB}(x)=\infty$ if $x\in X_\alpha$ for every ordinal $\alpha$. The set of points of infinite rank is the largest perfect subset of $X$.<br /><br />Let us return to the example of the theory ACF. If a type $p\in S_n(K)$ corresponds to a prime ideal $\mathfrak p=j(p)$ of $\mathop{\rm Spec}(K[T_1,\dots,T_n])$, its Cantor-Bendixson rank is the Zariski dimension of $V(\mathfrak p)$. More generally, if $F$ is a constructible subset of $\mathop{\rm Spec}(K[T_1,\dots,T_n])$, then $r_{CB}(F)$ is the Zariski-dimension of the Zariski-closure of $F$. Moreover, the points of maximal Cantor-Bendixson rank correspond to the generic points of the irreducible components of maximal dimension; in particular, there are only finitely many of them.<br /><br /><b>Definition. —</b> <i>One says that a theory $T$ is $\omega$-stable if for every finite or countable set of parameters $A$, the space of 1-types $S_1(A)$ is finite or countable.</i><br /><br />The theory ACF is $\omega$-stable. 
Indeed, if $K$ is the field generated by $A$, then $K[T]$ being a countable noetherian ring, it has only countably many prime ideals.<br /><br />Since any non-empty perfect set is uncountable, one has the following lemma.<br /><br /><b>Lemma. —</b> Let $T$ be an $\omega$-stable theory and let $M$ be a model of $T$. Then the Cantor-Bendixson rank of every type $x\in S_n(M)$ is finite.<br /><br />Let us assume that $T$ is $\omega$-stable and let $F$ be a closed subset of $S_n(M)$. Then $r_{CB}(F)=\sup \{ r_{CB}(x)\,;\, x\in F\}$ is finite, and the set of points $x\in F$ such that $r_{CB}(x)=r_{CB}(F)$ is finite and non-empty.<br /><br />This example gives a strong indication that the model theory approach may be extremely fruitful for the study of algebraic theories whose geometry is not as well developed as algebraic geometry.Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com2tag:blogger.com,1999:blog-8231917611006633375.post-1800639573150176092015-04-23T02:37:00.002+02:002015-05-10T18:23:35.351+02:00Model theory and algebraic geometry, 1 — Structures, languages, theories, modelsLast November, I was invited to lecture at the GAGC conference on the use of model theoretic methods in algebraic geometry. In the last two decades, important results of “general mathematics” have been proved using sophisticated techniques, see for example Hrushovski's proofs of the Manin-Mumford and of the Mordell-Lang conjecture over function fields, or Chatzidakis-Hrushovski's proof of a descent result in algebraic dynamics (generalizing a theorem of Néron for abelian varieties), or Hrushovski-Loeser's approach to the topology of Berkovich spaces, or Medvedev-Scanlon's results on invariant varieties in polynomial dynamics, or Hrushovski's generalization of the Lang-Weil estimates, or the applications to the André-Oort conjecture (by Pila and others) of a theorem of Pila-Wilkie in o-minimal geometry... 
All these wonderful results were however too complicated to be discussed from scratch in this series of lectures and I decided to discuss a beautiful paper of Scanlon that “explains” why coverings from analytic geometry lead to algebraic differential equations.<br />There will be 4 posts:<br /><ol><li>Structures, languages, theories, models (this one)</li><li>Definable sets, types, quantifier elimination</li><li>Real closed fields and o-minimality</li><li>Elimination of imaginaries</li></ol>Model theory — a branch of mathematical logic — has two aspects:<br /><ul><li>The first one, that one could name “pure”, studies mathematical theories as mathematical objects. It introduced important concepts, such as quantifier elimination, elimination of imaginaries, types and their dimensions, stability theory, Zariski geometries, and provides a rough classification of mathematical theories.</li><li>The second one is “applied”: it studies classical mathematical theories using these tools. It may be for algebraic theories, such as fields, differential fields, valued fields, ordered groups or fields, difference fields, etc., that it works the best, and for theories which are primitive enough so that they escape undecidability <i>à la</i> Gödel. 
</li></ul> Let us begin with an empirical observation; classical mathematical theories feature:<br /><ul><li><i>sets</i> (which may be receptacles for groups, rings, fields, modules, etc.);</li><li><i>functions</i> and <i>relations</i> between those sets (composition laws, order relations, equality);</li><li>certain axioms which are well-formed <i>formulas</i> using these functions, these relations, basic logical symbols ($\forall$, $\exists$, $\vee$, $\wedge$, $\neg$) or their variants ($\Rightarrow$, $\Leftrightarrow$, $\exists!$, etc.).</li></ul>Model theory (to be precise, first-order model theory) introduces the concepts of a <i>language</i> (the letters and symbols that allow one to express a mathematical theory), of a <i>theory</i> (sets of formulas in a given language, using a fixed infinite supply of variables), of a <i>structure</i> (sets, functions and relations that allow one to interpret all formulas in the language) and finally of a <i>model</i> of a theory (a structure where the formulas of the given theory are interpreted as true). The theory of a structure is the set of all formulas which are interpreted as true. A morphism of structures is a map which is compatible with all the given relations. <br /><br />Let us give three examples from algebra: groups, fields, differential fields.<br /><br /><b><i>a) Groups</i></b><br /><br />The language of groups has one symbol $\cdot$ which represents a binary law. Consequently, a structure for this language is just a set $S$ together with a binary law $S\times S\to S$. In this language, one can axiomatize groups using two axioms:<br /><ul><li>Associativity: $\forall x \forall y \forall z \quad x\cdot (y\cdot z)= (x\cdot y)\cdot z$</li><li>Existence of a neutral element and of inverses: $\exists e\forall x \exists y \quad (x\cdot e=e\cdot x=x \wedge x\cdot y=y\cdot x=e)$. </li></ul>Observe that in writing these formulas, we allow ourselves the usual shortcuts to which we are used as mathematicians. 
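<br /><br />To make one of these shortcuts explicit (a small illustration, written with the same symbols as the axioms above, and displayed with $$ delimiters on the assumption that the page renders them): a chained equality such as $x\cdot y=y\cdot x=e$ is not itself a formula of the language; it abbreviates a conjunction of two atomic formulas, $$ x\cdot y=y\cdot x=e \quad\text{stands for}\quad (x\cdot y=e)\wedge(y\cdot x=e). $$ Written without this shortcut, and with explicit parentheses, the axiom on the neutral element and inverses would read $$ \exists e\,\forall x\,\exists y\ \bigl((x\cdot e=x)\wedge(e\cdot x=x)\wedge(x\cdot y=e)\wedge(y\cdot x=e)\bigr). $$<br /><br />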
In fact, the foundations of model theory require spending a few pages discussing how formulas should be written, with or without parentheses, so that they can be unambiguously read, etc.<br /><br />However, it may be more useful to study groups in a language with 3 symbols $\cdot,e,i$, where $\cdot$ represents the binary law, $e$ the neutral element and $i$ the inversion. Then a structure is a set together with a binary law, a distinguished element and a self-map; in particular, what is a structure depends on the language. In this new language, groups are axiomatized with three axioms:<br /><ul><li>Associativity as above;</li><li>Neutral element: $\forall x \quad x\cdot e=e\cdot x=x$;</li><li>Inverse: $\forall x\quad x\cdot i(x)=i(x)\cdot x=e$. </li></ul>The two theories of groups are essentially equivalent: one can translate any formula of the first language into the second, and conversely. Indeed, if a formula of the second language involves the symbol $e$, it suffices to copy $\exists e\,\forall x\ (x\cdot e=e\cdot x=x)$ in front of it; and if a formula involves $i(x)$, it suffices to add $\exists y$ in front of it, as well as the requirement $x\cdot y=y\cdot x=e$, and to replace $i(x)$ by $y$. Since the neutral element and the inverse law of a group are unambiguously defined by the composition law, this shows that the new formula is equivalent, albeit longer and less practical, to the initial one.<br /><br />The possibility of interpreting a theory in a language in a second language is a very important tool in mathematical logic.<br /><br /><b><i>b) Rings</i></b><br /><br />The language used to study rings has 5 symbols: $+,-,0,1,\cdot$. In this language, structures are just sets with three binary laws and two distinguished elements. 
One can of course axiomatize rings, using the well-known formulas that express that the law $+$ is associative and commutative, that $0$ is a neutral element and that $-$ gives subtraction, that the law $\cdot$ is associative and commutative with $1$ as a neutral element, and that the multiplication $\cdot$ distributes over addition.<br /><br />Adding the axioms $\forall x (x\neq 0 \Rightarrow \exists y \quad xy=1)$ and $1\neq 0$ gives rise to fields.<br /><br />That a field has characteristic 2, say, is axiomatized by the formula $1+1=0$, that it has characteristic 3 is axiomatized by the formula $1+1+1=0$, etc. That a field has characteristic 0 is axiomatized by an infinite list of axioms, one for each prime number $p$, saying that $1+1+\cdots+1\neq 0$ (with $p$ symbols $1$ on the left). We will see below why fields of characteristic 0 must be axiomatized by infinitely many axioms.<br /><br />That a field is algebraically closed means that every monic polynomial has a root. To express this property, one needs to write down all possible polynomials. However, the language of rings does not give us access to integers, nor to sets of polynomials. Consequently, we must write down an infinite list of axioms, one for each positive integer $n$: $\forall x_1\forall x_2\cdots \forall x_n \exists y \quad y^n+x_1 y^{n-1}+\cdots+x_{n-1}y+x_n=0$. Here $y^m$ is an abbreviation for the product $y\cdot y \cdots y$ of $m$ factors equal to $y$.<br /><br />As we will see, the language of rings and the theory ACF of algebraically closed fields are well suited to study algebraic geometry.<br /><br /><i><b>c) Differential fields</b></i><br /><br />A differential ring/field is a ring/field $A$ endowed with a derivation $\partial\colon A\to A$, that is, with an additive map satisfying the Leibniz relation $\partial(ab)=a\partial(b)+b\partial(a)$. 
They can be naturally axiomatized in the language of rings augmented with a symbol $\partial$.<br /><br />There is a notion of a differentially closed field, analogous to the notion of an algebraically closed field, but encompassing differential equations. A differential field is differentially closed if any differential equation which has a solution in some differential extension already has a solution. This property is analogous to the consequence of Hilbert's Nullstellensatz according to which a field is algebraically closed if any system of polynomial equations which has a solution in an extension already has a solution. At least in characteristic zero, Robinson showed that their theory DCF$_0$ can be axiomatized by various families of axioms. For example, the one devised by Blum asserts the existence of an element $x$ such that $P(x)=0$ and $Q(x)\neq0$, for every pair $(P,Q)$ of non-zero differential polynomials in one indeterminate such that the order of $Q$ is strictly smaller than the order of $P$. This study requires the development of important and difficult results in differential algebra due to Ritt and Seidenberg.<br /><br /><br />At this level, there are two important basic theorems to mention: Gödel's completeness theorem, and the theorems of Löwenheim-Skolem.<br /><br /><b>Completeness theorem</b> (Gödel). — <i>Let $T$ be a theory in a language $L$. Assume that every finite subset $S$ of $T$ admits a model. Then $T$ admits a model.</i><br /><br />There are two classical proofs of this theorem.<br /><br />The first one uses ultraproducts and consists in choosing a model $M_S$ for every finite subset $S$ of $T$. Let then $\mathcal U$ be a non-principal ultrafilter on the set of finite subsets of $T$ and let $M$ be the ultraproduct of the family of models $(M_S)$. It inherits functions and relations from those of the models $M_S$, so that it is a structure in the language $L$. 
Moreover, one deduces from the definition of an ultrafilter that for every axiom $\alpha$ of $T$, the structure $M$ satisfies the axiom $\alpha$. Consequently, $M$ is a model of $T$.<br /><br />A second proof, due to Henkin, is more syntactical. It considers the set of all terms in the language $L$ (formulas without logical connectors), together with an equivalence relation that equates two terms for which some axiom says that they are equal, and with symbols representing objects of which an axiom affirms the existence. The quotient set modulo the equivalence relation is a model. In essence, this proof is very close to the construction of a free group as words.<br /><br />It is important to observe that the first proof of this theorem uses the existence of non-principal ultrafilters, which is a weak form of the axiom of choice. In fact, as in all classical mathematics, the axiom of choice — and set theory in general — is used in model theory to establish theorems. That does not prevent logicians from studying the model theory of set theory without choice as a particular mathematical theory, but even to do that, one uses choice.<br /><br /><b>Theorem of Löwenheim-Skolem.</b> — <i>Let $T$ be a theory in a language $L$. If it admits an infinite model $M$, then it admits a model in every cardinality $\geq \sup(\mathop{\rm Card}(L),\aleph_0)$.</i><br /><br />To show the existence of a model of cardinality $\geq\kappa$, one enlarges the language $L$ and the theory $T$ by adding symbols $c_i$, indexed by a set of cardinality $\kappa$, and the axioms $c_i\neq c_j$ if $i\neq j$, giving rise to a theory $T'$ in a language $L'$. A structure for $L'$ is a structure for $L$ together with distinguished elements $c_i$; such a structure is a model of $T'$ if and only if it is a model of $T$ and if the elements $c_i$ are pairwise distinct. 
If the initial theory $T$ has an infinite model, then this model is a model of every finite fragment of the theory $T'$, because there are only finitely many axioms of the form $c_i\neq c_j$ to satisfy, and the model is assumed to be infinite. By Gödel's completeness theorem, the theory $T'$ has a model $M'$; forgetting the choice of distinguished elements, $M'$ is a model of the theory $T$, but the mere existence of the elements $c_i$ forces its cardinality to be at least $\kappa$.<br /><br />To show that there exists a model of cardinality exactly $\kappa$ (assumed to be larger than $\sup(\mathop{\rm Card}(L),\aleph_0)$), one starts from a model $M$ of cardinality $\geq\kappa$ and defines a substructure by induction, starting from the constant symbols and adding step by step only the elements which are required by the function symbols, the axioms and the elements already constructed. This construction furnishes a model of $T$ whose cardinality is equal to $\kappa$.<br /><br /><br />Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com0tag:blogger.com,1999:blog-8231917611006633375.post-10663376903085119822015-03-23T18:08:00.000+01:002015-03-23T18:08:13.579+01:00When Lagrange meets GaloisJean-Benoît Bost told me a beautiful proof of the main ingredient in the proof of Galois correspondence, which was published by Lagrange in his 1772 “<a href="http://gallica.bnf.fr/ark:/12148/bpt6k229222d/f206.tableDesMatieres">Réflexions sur la résolution algébrique des équations</a>”, almost 60 years before Galois. (See <a href="http://gallica.bnf.fr/ark:/12148/bpt6k229222d/f356.image">Section 4</a> of that paper, I think; it is often difficult to recognize our modern mathematics in the language of these old masters.)<br /><br />In modernized notations, Lagrange considers the following situation. 
He is given a polynomial equation $ T^n + a_{n-1} T^{n-1}+\cdots + a_0 = 0$, with roots $x_1,\dots,x_n$, and two “rational functions” of its roots $f(x_1,\dots,x_n)$ and $\phi(x_1,\dots,x_n)$. (This means that $f$ and $\phi$ are the evaluation at the $n$-tuple $(x_1,\dots,x_n)$ of two rational functions in $n$ variables.) Lagrange says that $f$ and $\phi$ are similar (“semblables”) if every permutation of the roots which leaves $f(x_1,\dots,x_n)$ unchanged leaves $\phi(x_1,\dots,x_n)$ unchanged as well (and conversely). He then proves that $\phi(x_1,\dots,x_n)$ is a rational function of $a_0,\dots,a_{n-1}$ and $f(x_1,\dots,x_n)$.<br /><br />Let us restate this in a more modern language. Let $K\to L$ be a finite Galois extension of fields, in the sense that $K= L^{G}$, where $G=\mathop{\rm Aut}_K(L)$. Let $a, b\in L$ and let us assume that every element $g\in G$ which fixes $a$ fixes $b$ as well; then Lagrange proves that $b\in K(a)$.<br /><br />Translated in our language, his proof could be as follows. In formulas, the assumption is that $g\cdot a=a$ implies $g\cdot b=b$; consequently, there exists a unique <i>function</i> $\phi\colon G\cdot a\to G\cdot b$ which is $G$-equivariant and maps $a$ to $b$. Let $d=\mathop{\rm Card}(G\cdot a)$ and let us consider Lagrange's interpolation polynomial, that is, the unique polynomial $P\in L[T]$ of degree at most $d-1$ such that $P(x)=\phi(x)$ for every $x\in G\cdot a$. If $h\in G$, the polynomial $P^h$ obtained by applying $h$ to the coefficients of $P$ has degree at most $d-1$ and, since $\phi$ is $G$-equivariant, coincides with $\phi$ on $G\cdot a$; consequently, $P^h=P$. 
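<br /><br />(To make this polynomial concrete, recall the classical interpolation formula $$ P(T)=\sum_{x\in G\cdot a}\phi(x)\prod_{y\in G\cdot a,\ y\neq x}\frac{T-y}{x-y}. $$ As an illustration which is not taken from Lagrange's text, take $K=\mathbf Q$, $L=\mathbf Q(\sqrt2,\sqrt3)$, $a=\sqrt2+\sqrt3$ and $b=\sqrt2$. Only the identity fixes $a$, the orbit $G\cdot a$ consists of the four elements $\pm\sqrt2\pm\sqrt3$, and $\phi$ maps $\varepsilon\sqrt2+\eta\sqrt3$ to $\varepsilon\sqrt2$, for signs $\varepsilon,\eta\in\{1,-1\}$. Since $a^3=11\sqrt2+9\sqrt3$, one has $a^3-9a=2\sqrt2$, and the same computation at the three other points of the orbit gives $$ P(T)=\tfrac12\,(T^3-9T)\in\mathbf Q[T],\qquad \sqrt2=P(\sqrt2+\sqrt3). $$)<br /><br />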
Since $K=L^G$, the polynomial $P$ belongs to $K[T]$ and $b=P(a)$, hence $b\in K(a)$, as claimed.<br /><br />Combined with the primitive element theorem, this allows one to give another short, and fairly elementary, presentation of the Galois correspondence.Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com0tag:blogger.com,1999:blog-8231917611006633375.post-61510820524072205152015-02-28T17:21:00.001+01:002015-04-15T12:54:44.896+02:00Galois Theory, Geck's styleThis note aims at popularizing a short note of Meinolf Geck, <span style="font-family: Garamond;">“</span>On the characterization of Galois extensions<span style="font-family: Garamond;">”</span>, Amer. Math. Monthly 121 (2014), no. 7, 637–639 (<a href="http://www.jstor.org/discover/10.4169/amer.math.monthly.121.07.637?uid=3738016&uid=2&uid=4&sid=21105979816053">Article</a>, <a href="http://www.ams.org/mathscinet/search/publdoc.html?pg1=MR&s1=3229113">Math Reviews</a>, <a href="http://arxiv.org/abs/1306.3853">arXiv</a>), that proposes a radical shortcut to the treatment of Galois theory at an elementary level. The proof of the pudding is in the eating, so let's see how it works. The novelty lies in theorem 2, but I give the full story so as to be sure that I do not hide something under the rug.<br /><br /><b>Proposition 1.</b><i> Let $K\to L$ be a finite field extension. Then $L$ is not the union of finitely many subfields $M$ such that $K\to M\subsetneq L$.</i><br /><blockquote class="tr_bq">Proof. It splits into two parts, according to whether $K$ is finite or infinite.<br /><br />Assume that $K$ is finite and let $q=\mathop{\rm Card}( K)$. Then $L$ is finite as well, and let $n=[L:K]$ so that $\mathop{\rm Card}(L)=q^n$. If $M$ is a subextension of $L$, then $\mathop{\rm Card}( M)=q^m$, for some integer $m$ dividing $n$; moreover, $x^{q^m}=x$ for every $x\in M$, so that there is at most one subextension of each cardinality. 
Then the union of all strict sub-extensions of $L$ has cardinality at most $\sum_{m=1}^{n-1} q^m =\frac{q^n-q}{q-1}<q^n$.<br /><br />It remains to treat the case where $K$ is infinite; then the proposition follows from the fact that a finite union of strict subspaces of a $K$-vector space $E$ is not equal to $E$. Let indeed $(E_i)_{1\leq i\leq n}$ be a family of strict subspaces of $E$ and let us prove by induction on $n$ that $E\neq \bigcup_{i=1}^n E_i$. The cases $n\leq1$ are obvious. By induction we know that for every $j\in\{1,\dots,n\}$, the union $\bigcup_{i\neq j}E_i$ is distinct from $E$, hence we may select an element $x\in E$ such that $x\not\in E_2\cup \dots\cup E_n$. The desired result follows if, by chance, $x\not\in E_1$. Otherwise, choose $y\in E\setminus E_1$. For $s\neq t\in K$, and $i\in\{2,\dots,n\}$, observe that $y+sx$ and $y+tx$ cannot both belong to $E_i$, for this would imply that $(s-t)x\in E_i$, hence $x\in E_i$ since $s\neq t$. Consequently, there are at most $n-1$ elements $s\in K$ such that $y+sx\in \bigcup_{i=2}^nE_i$. Since $K$ is infinite, there exists $s\in K$ such that $y+sx\not\in\bigcup_{i=2}^n E_i$. Then $y+sx\not\in E_1$ either, since $x\in E_1$ and $y\not\in E_1$. This proves that $E\neq \bigcup_{i=1}^nE_i$.</blockquote><br />Let $K\to L$ be a field extension and let $P\in K[T]$. We say that $P$ is split in $L$ if it is a product of linear factors in $L[T]$. We say that $P$ is separable if all of its roots (in some extension where it is split) have multiplicity $1$. We say that $K\to L$ is a splitting extension of $P$ if $P$ is split in $L$ and if $L$ is generated over $K$ by the roots of $P$ in $L$. Finally, we let $\mathop{\rm Aut}_K(L)$ be the set of $K$-linear automorphisms of $L$; it is a group under composition.<br /><br /><b>Theorem 2.</b><i> Let $K\to L$ be a finite extension of fields and let $G=\mathop{\rm Aut}_K(L)$. Then $\mathop{\rm Card}( G)\leq [L:K]$. 
Moreover, the following conditions are equivalent:<br /></i><br /><ol><li><i> One has $\mathop{\rm Card}( G)=[L:K]$;</i></li><i><li>There exists an irreducible separable polynomial $P\in K[T]$ such that $\deg(P)=[L:K]$ and which is split in $L$;</li><li>The extension $K\to L$ is a splitting extension of a separable polynomial in $K[T]$;</li><li>One has $K=L^G$.</li></i></ol><br /><br /><b>Remark 3.</b> In the conditions of (2), let us fix a root $z\in L$ of $P$. One has $L=K(z)$. Moreover, the map $f\mapsto f(z)$ is a bijection from $\mathop{\rm Aut}_K(L)$ to the set of roots of $P$ in $L$.<br /><br /><blockquote class="tr_bq">Proof of Theorem 2.<br />(a) Let us prove that $\mathop{\rm Card} (G)\leq [L:K]$. Let $m\in\mathbf N$ be such that $m\leq \mathop{\rm Card}( G)$ and let $\sigma_1,\dots,\sigma_m$ be distinct elements of $G$. For $1\leq i<j\leq m$, let $M_{i,j}$ be the subfield of $L$ consisting of all $x\in L$ such that $\sigma_i(x)=\sigma_j(x)$. It is a strict subextension of $L$ because $\sigma_i\neq\sigma_j$. Consequently, by Proposition 1, $L$ is not the union of the subfields $M_{i,j}$ and there exists an element $z\in L$ such that $\sigma_i(z)\neq \sigma_j(z)$ for all $i\neq j$. Let $P$ be the minimal polynomial of $z$ over $K$. Then the set $\{\sigma_1(z),\dots,\sigma_m(z)\}$ consists of distinct roots of $P$, hence $\deg(P)\geq m$. Since $\deg(P)=[K(z):K]\leq [L:K]$, one has $m\leq [L:K]$. Since this holds for every $m\leq \mathop{\rm Card}( G)$, this shows that $\mathop{\rm Card}( G)\leq [L:K]$.<br /><br />(b) If one has $\mathop{\rm Card}( G)=[L:K]$, then taking $m=\mathop{\rm Card}( G)$, we get an irreducible polynomial $P\in K[T]$ of degree $m$, with $m$ distinct roots in $L$. Necessarily, $P$ is separable and split in $L$. This gives (1)$\Rightarrow$(2).<br /><br />The implication (2)$\Rightarrow$(3) is obvious.<br /><br />(1)$\Rightarrow$(4). Let $M=L^G$. One has $\mathop{\rm Aut}_K(L)=\mathop{\rm Aut}_M(L)=G$. Consequently, $\mathop{\rm Card}(G)\leq [L:M]$. 
Since $\mathop{\rm Card}( G)=[L:K]=[L:M][M:K]$, this forces $M=K$.<br /><br />(4)$\Rightarrow$(3). There exists a finite $G$-invariant subset $A$ of $L$ such that $L=K(A)$. Then $P=\prod_{a\in A}(T-a)$ is split in $L$, and is $G$-invariant. Consequently, $P\in K[T]$. By construction, $P$ is separable and $L$ is a splitting extension of $P$.<br /><br />(3)$\Rightarrow$(1). Let $P\in K[T]$ be a separable polynomial of which $K\to L$ is a splitting extension and let $A$ be the set of its roots in $L$, so that $L=K(A)$. Let $M$ be a subextension of $L$ and let $f\colon M\to L$ be a $K$-morphism. Let $a\in A$, let $Q_a$ be the minimal polynomial of $a$ over $M$ and let $Q_a^f\in L[T]$ be the polynomial obtained by applying $f$ to the coefficients of $Q_a$. The association $g\mapsto g(a)$ defines a bijection between the set of extensions of $f$ to $M(a)$ and the set of roots of $Q_a^f$ in $L$. Since $P(a)=0$, the polynomial $Q_a$ divides $P$ in $M[T]$; since $P$ has coefficients in $K$, the polynomial $Q_a^f$ divides $P$ as well, hence it is separable and split in $L$. Consequently, $f$ has exactly $\deg(Q_a)=[M(a):M]$ extensions to $M(a)$.<br /><br />By a straightforward induction on $\mathop{\rm Card}(B)$, for every subset $B$ of $A$, the set of $K$-morphisms from $K(B)$ to $L$ has cardinality $[K(B):K]$. When $B=A$, every such morphism is surjective, hence $\mathop{\rm Card}(\mathop{\rm Aut}_K(L))=[L:K]$.</blockquote><br />If these equivalent conditions hold, we say that the finite extension $K\to L$ is Galois.<br /><br /><b>Corollary 4.</b><i> Let $K\to L$ be a finite Galois extension. The maps $H\mapsto L^H$ and $M\mapsto \mathop{\rm Aut}_M(L)$ are bijections, inverse to each other, between the set of subgroups of $\mathop{\rm Aut}_K(L)$ and the set of subextensions $K\to M\subset L$.</i><br /><blockquote class="tr_bq">Proof. a) For every subextension $K\to M\subset L$, the extension $M\subset L$ is Galois. In particular, $M=L^{\mathop{\rm Aut}_M(L)}$ and $\mathop{\rm Card}(\mathop{\rm Aut}_M(L))=[L:M]$.<br /><br />b) Let $H\subset\mathop{\rm Aut}_K(L)$ be a subgroup and let $M=L^H$. Then $M\to L$ is a Galois extension and $[L:M]=\mathop{\rm Card}(\mathop{\rm Aut}_M(L))$; moreover, one has $H\subset\mathop{\rm Aut}_M(L)$ by construction. Let us prove that $H=\mathop{\rm Aut}_M(L)$. 
Let $z\in L$ be an element such that $L=M(z)$ and whose minimal polynomial $P_z$ over $M$ is split and separable in $L$; such an element exists by condition (2) of Theorem 2 and Remark 3. One has $\mathop{\rm Card}(\mathop{\rm Aut}_M(L))=[L:M]=\deg(P_z)$. On the other hand, the polynomial $Q_z=\prod_{\sigma\in H}(T-\sigma(z))\in L[T]$ divides $P_z$ and is $H$-invariant, hence it belongs to $L^H[T]=M[T]$. This implies that $P_z=Q_z$, hence $\mathop{\rm Card}(H)=\deg(P_z)=\mathop{\rm Card}(\mathop{\rm Aut}_M(L))$. Consequently, $H=\mathop{\rm Aut}_M(L)$.</blockquote><br /><b>Corollary 5.</b><i> Let $K\to L$ be a finite Galois extension and let $K\to M\to L$ be an intermediate extension. The extension $M\to L$ is Galois too. Moreover, the following assertions are equivalent:<br /></i><br /><ol><li><i>The extension $K\to M$ is Galois;</i></li><i><li>$\mathop{\rm Aut}_M(L)$ is a normal subgroup of $\mathop{\rm Aut}_K(L)$;</li><li>For every $\sigma\in\mathop{\rm Aut}_K(L)$, one has $\sigma(M)\subset M$.<br /></li></i></ol><br /><blockquote class="tr_bq">Proof. (a) Let $P\in K[T]$ be a separable polynomial of which $K\to L$ is a splitting extension. Then $M\to L$ is a splitting extension of $P$, hence $M\to L$ is Galois. <br /><br />(b) (1)$\Rightarrow$(2): Let $\sigma\in \mathop{\rm Aut}_K(L)$. Let $z$ be any element of $M$ and let $P\in K[T]$ be its minimal polynomial. One has $P(\sigma(z))=\sigma(P(z))=0$, hence $\sigma(z)$ is a root of $P$; in particular, $\sigma(z)\in M$. Consequently, the restriction of $\sigma$ to $M$ is a $K$-morphism from $M$ to itself; it is necessarily a $K$-automorphism. We thus have defined a map from $\mathop{\rm Aut}_K(L)$ to $\mathop{\rm Aut}_K(M)$; this map is a morphism of groups. Its kernel is $\mathop{\rm Aut}_M(L)$, so that this group is normal in $\mathop{\rm Aut}_K(L)$. <br /><br />(2)$\Rightarrow$(3): Let $\sigma\in\mathop{\rm Aut}_K(L)$ and let $G=\sigma\mathop{\rm Aut}_M(L)\sigma^{-1}$. By construction, one has $\sigma(M)\subset L^G$. 
On the other hand, the hypothesis that $\mathop{\rm Aut}_M(L)$ is normal in $\mathop{\rm Aut}_K(L)$ implies that $G=\mathop{\rm Aut}_M(L)$, so that $L^G=M$. We thus have proved that $\sigma(M)\subset M$. <br /><br />(3)$\Rightarrow$(1): Let $A$ be a finite subset of $M$ such that $M=K(A)$ and let $B$ be its orbit under $\mathop{\rm Aut}_K(L)$. The polynomial $\prod_{b\in B}(T-b)$ is separable and invariant under $\mathop{\rm Aut}_K(L)$, hence belongs to $K[T]$. By assumption, one has $B\subset M$. This implies that $K\to M$ is Galois. </blockquote><br /><b>Remark 6.</b> Let $L$ be a field, let $G$ be a finite group of automorphisms of $L$ and let $K=L^G$. Every element $a$ of $L$ is algebraic and separable over $K$; indeed, $a$ is a root of the separable polynomial $\prod_{b\in G\cdot a}(T-b)$, which is $G$-invariant hence belongs to $K[T]$. There exists a finite extension $M$ of $K$, contained in $L$, such that $G\cdot M=M$ and such that the restriction map $G\to \mathop{\rm Aut}_K(M)$ is injective. Then $K\to M$ is Galois, and $G=\mathop{\rm Aut}_K(M)$. Indeed, one has $G\subset\mathop{\rm Aut}_K(M)$, hence $K\subset M^{\mathop{\rm Aut}_K(M)}\subset M^G\subset L^G=K$. This implies that $K\to M$ is Galois and the Galois correspondence then implies $G=\mathop{\rm Aut}_K(M)$. The argument applies to every finite extension of $K$ which is contained in $L$ and contains $M$. Consequently, they all have degree $\mathop{\rm Card}(G)$; necessarily, $L=M$.<br /><br /><b>Remark 7 (edits).</b> Matt Baker points out that the actual novelty of the treatment lies in theorem 2; the rest is standard. Also, remark 6 has been edited following an observation of Christian Naumovic that it is not a priori obvious that the extension $K\to L$ is finite. Antoine Chambert-Loirhttp://www.blogger.com/profile/02115924053842869740noreply@blogger.com2