Wednesday, November 11, 2015

When Baire meets Krasner


Here is a well-but-ought-to-be-better known theorem.

Theorem. — Let $\ell$ be a prime number and let $G$ be a compact subgroup of $\mathop{\rm GL}_d(\overline{\mathbf Q_\ell})$. Then there exists a finite extension $E$ of $\mathbf Q_\ell$ such that $G$ is contained in $\mathop{\rm GL}_d(E)$.

Before explaining its proof, let us recall why such a theorem can be of any interest at all. The keyword here is Galois representations.

It is now a well-established fact that linear representations are an extremely useful tool to study groups. This is standard for finite groups, for which complex linear representations appear at one point or another of graduate studies, and its topological version is even more classical for the abelian groups $\mathbf R/\mathbf Z$ (Fourier series) and $\mathbf R$ (Fourier integrals). On the other hand, some groups are extremely difficult to grasp while their representations are ubiquitous, namely the absolute Galois groups $G_K=\operatorname{Gal}(\overline K/K)$ of fields $K$.

With the notable exception of real closed fields, these groups are infinite and have a natural (profinite) topology with open subgroups the groups $\operatorname{Gal}(\overline K/L)$, where $L$ is a finite extension of $K$ lying in $\overline K$. It is therefore important to study their continuous linear representations. Complex representations are important but, since $G_K$ is totally disconnected, their image is always finite. Therefore, $\ell$-adic representations, namely continuous morphisms from $G_K$ to $\mathop{\rm GL}_d(\mathbf Q_\ell)$, are more important. Here $\mathbf Q_\ell$ is the field of $\ell$-adic numbers.

Their use goes back to Weil's proof of the Riemann hypothesis for curves over finite fields, via the action on $\ell^\infty$-division points of their Jacobian varieties. Here $\ell$ is a prime different from the characteristic of the ground field. More generally, every Abelian variety $A$ over a field $K$ of characteristic $\neq\ell$ gives rise to a Tate module $T_\ell(A)$ which is a free $\mathbf Z_\ell$-module of rank $d=2\dim(A)$, endowed with a continuous action $\rho_{A,\ell}$ of $G_K$. Taking a basis of $T_\ell(A)$, one thus has a continuous morphism $G_K\to \mathop{\rm GL}_d(\mathbf Z_\ell)$, and, embedding $\mathbf Z_\ell$ in the field of $\ell$-adic numbers, a continuous morphism $G_K\to\mathop{\rm GL}_d(\mathbf Q_\ell)$. Even more generally, one can consider the $\ell$-adic étale cohomology of algebraic varieties over $K$.

For various reasons, such as the need to diagonalize additional group actions, one can be led to consider similar representations where $\mathbf Q_\ell$ is replaced by a finite extension of $\mathbf Q_\ell$, or even by the algebraic closure $\overline{\mathbf Q_\ell}$. Since $G_K$ is a compact topological group, its image by a continuous representation $\rho\colon G_K\to\mathop{\rm GL}_d(\overline{\mathbf Q_\ell})$ is a compact subgroup of $\mathop{\rm GL}_d(\overline{\mathbf Q_\ell})$ to which the above theorem applies.

This being said for the motivation, one proof (attributed to Warren Sinnott) is given by Keith Conrad in his short note, Compact subgroups of ${\rm GL}_n(\overline{\mathbf Q}_p)$. In fact, while browsing his large set of excellent expository notes, I came across that one and felt urged to write this blog post.

The following proof had been explained to me by Jean-Benoît Bost almost exactly 20 years ago. I believe that it ought to be much more widely known.

It relies on the Baire category theorem and on Krasner's lemma.

Lemma 1 (essentially Baire). — Let $G$ be a compact topological group and let $(G_n)$ be an increasing sequence of closed subgroups of $G$ such that $\bigcup G_n=G$. There exists an integer $n$ such that $G_n=G$.

Proof. Since $G$ is compact Hausdorff, it satisfies the Baire category theorem and there exists an integer $m$ such that $G_m$ contains a non-empty open subset $V$. For every $g\in V$, the set $V\cdot g^{-1}$ is an open neighborhood of the identity contained in $G_m$. This shows that $G_m$ is open in $G$. Since $G$ is compact, it has finitely many cosets $g_iG_m$ modulo $G_m$; there exists an integer $n\geq m$ such that $g_i\in G_n$ for every $i$, hence $G=G_n$. QED.

Lemma 2 (essentially Krasner). — For every integer $d$, the set of all extensions of $\mathbf Q_\ell$ of degree $d$, contained in $\overline{\mathbf Q_\ell}$, is finite.

Proof. Every finite extension of $\mathbf Q_\ell$ has a primitive element whose minimal polynomial can be taken monic and with coefficients in $\mathbf Z_\ell$; its degree is the degree of the polynomial. On the other hand, Krasner's lemma asserts that for every such irreducible polynomial $P$, there exists a real number $c_P>0$ such that every monic polynomial $Q$ for which the coefficients of $Q-P$ have absolute values $<c_P$ has a root in the field $E_P=\mathbf Q_\ell[T]/(P)$. By compactness of $\mathbf Z_\ell$, the set of all finite subextensions of given degree of $\overline{\mathbf Q_\ell}$ is finite. QED.

Let us now give the proof of the theorem. Let $(E_n)$ be an increasing sequence of finite subextensions of $\overline{\mathbf Q_\ell}$ such that $\overline{\mathbf Q_\ell}=\bigcup_n E_n$ (lemma 2; take for $E_n$ the subfield generated by $E_{n-1}$ and all the subextensions of degree $n$ of $\overline{\mathbf Q_\ell}$). Then $G_n=G\cap \mathop{\rm GL}_d(E_n)$ is a closed subgroup of $G$, and $G$ is the increasing union of all $G_n$. By lemma 1, there exists an integer $n$ such that $G_n=G$. QED.
 

Sunday, October 25, 2015

On Lp-spaces, when 0<p<1, convex sets and linear forms

While the theory of normed vector spaces is now extensively taught at the undergraduate level, the more general theory of topological vector spaces usually does not reach the curriculum. There may be good reasons for that, and here is an example, taken from a paper of Mahlon M. Day, The spaces $L^p$ with $0<p<1$ (Bull. Amer. Math. Soc. 46 (1940), 816–823), which I learned about from a nice analysis blurb by Keith Conrad with almost the same title.

For simplicity, I consider here the case when the measured space is $[0;1]$, with the Lebesgue measure, and $p=1/2$. Let $E$ be the set of measurable real valued functions $f$ on the interval $[0;1]$ such that $\int_0^1|f(t)|^{1/2}dt<+\infty$, where we identify two functions which coincide almost everywhere. For $f,g\in E$, let us define $d(f,g)=\int_0^1 \mathopen|f(t)-g(t) \mathclose|^{1/2}dt$.

Lemma. —
  1. The set $E$ is a vector subspace of the space of all measurable functions (modulo coincidence almost everywhere).
  2. The mapping $d$ is a distance on $E$.
  3. With respect to the topology defined by $d$, the addition of $E$ and the scalar multiplication are continuous, so that $E$ is a topological vector space.

Proof. — We will use the following basic inequality: For $u,v\in\mathbf R$, one has $\mathopen|u+v\mathclose|^{1/2}\leq |u|^{1/2}+|v|^{1/2}$; it can be shown by squaring both sides of the inequality and using the usual triangular inequality. Let $f,g\in E$; taking $u=f(t)$ and $v=g(t)$, and integrating the inequality, we obtain that $f+g\in E$. It is clear that $af\in E$ for $a\in\mathbf R$ and $f\in E$. This proves that $E$ is a vector subspace of the space of measurable functions. For $f,g\in E$, one has $f-g\in E$, so that $d(f,g)$ is finite. Let then $f,g,h\in E$; taking $u=f(t)-g(t)$ and $v=g(t)-h(t)$, and integrating this inequality for $t\in[0;1]$, we then obtain the triangular inequality $d(f,h)\leq d(f,g)+d(g,h)$ for $d$. Moreover, if $d(f,g)=0$, then $f=g$ almost everywhere, hence $f=g$ by definition of $E$. This proves that $d$ is a distance on $E$. Let us now show that $E$ is a topological vector space with respect to the topology defined by $d$. Let $f,g\in E$. For $f',g'\in E$, one then has $d(f'+g',f+g)=\int_0^1\mathopen|(f-f')+(g-g')\mathclose|^{1/2}\leq d(f,f')+d(g,g')$. This proves that addition is continuous on $E$. Similarly, let $a\in \mathbf R$ and $f\in E$. For $b\in\mathbf R$ and $g\in E$, one has $d(af,bg)\leq d(af,bf)+d(bf,bg)\leq \mathopen|b-a\mathclose|^{1/2} d(f,0)+|b|^{1/2}d(f,g)$. This implies that scalar multiplication is continuous. QED.


The following theorem shows one unusual feature of this topological vector space.

Theorem. — One has $E^*=0$: every continuous linear form on $E$ vanishes identically.

Proof. — Let $\phi$ be a non-zero continuous linear form on $E$. Let $f\in E$ be such that $\phi(f)\neq 0$; we may assume that $\phi(f)\geq 1$. For $s\in[0;1]$, let $g_s\colon[0;1]\to\mathbf R$ be the function defined by $g_s(t)=0$ for $0\leq t\leq s$ and $g_s(t)=1$ for $s< t\leq 1$. When $s$ goes from $0$ to $1$, $d(g_s f,0)$ varies continuously from $d(f,0)$ to $0$. Consequently, there exists $s$ such that $d(g_s f,0)=d(f,0)/2$. Then $d((1-g_s)f,0)=\int_0^s |f(t)|^{1/2}dt=\int_0^1|f(t)|^{1/2}dt-\int_s^1|f(t)|^{1/2}dt=d(f,0)-d(g_sf,0)=d(f,0)/2$ as well. Moreover, the relation $1\leq\phi(f)=\phi(g_sf)+\phi((1-g_s)f)$ shows that either $\phi(g_sf)\geq1/2$ or $\phi((1-g_s)f)\geq 1/2$. Set $f'=2g_s f$ in the first case, and $f'=2(1-g_s)f$ in the latter; one has $\phi(f')\geq 1$ and $d(f',0)=d(f,0)/\sqrt 2$. Iterating, we obtain a sequence $(f^{(n)})$ of elements of $E$ which converges to $0$ but such that $\phi(f^{(n)})\geq 1$ for every $n$, contradicting the continuity of $\phi$. QED.


On the other hand, we may remember the Hahn-Banach theorem according to which, for every non-zero function $f\in E$, there exists a continuous linear form $\phi\in E^*$ such that $\phi(f)=1$. At first sight, the previous theorem seems to contradict the Hahn-Banach theorem.
So why is this not so? Precisely because the Hahn-Banach theorem makes the fundamental hypothesis that the topological vector space be a normed vector space or, more generally, a locally convex vector space, which means that $0$ admits a basis of convex neighborhoods. According to the following proposition, this is far from being so.

Proposition. — $E$ is the only non-empty convex open subset of $E$.

Proof. — Let $V$ be a non-empty convex open subset of $E$. Up to an affine transformation, in order to prove that $V=E$, we may assume that $0\in V$ and that $V$ contains the unit ball of center $0$. We first show that $V$ is unbounded. For every $n\geq 1$, we split the interval $[0;1]$ into $n$ intervals $[(k-1)/n;k/n]$, for $1\leq k\leq n$, with characteristic functions $g_k$. One has $d(n^2g_k,0)=1$ for every $k$, hence $n^2 g_k\in V$; moreover, $1=\sum_{k=1}^n g_k$, so that the constant function $n=\frac 1n \sum_{k=1}^n n^2 g_k$ belongs to $V$. More generally, given $f\in E$ and $n\geq 1$, we split the interval $[0;1]$ into $n$ successive intervals, with characteristic functions $g_k$, such that $d(fg_k,0)=d(f,0)/n$ for every $k$; one also has $f=\sum fg_k$. Then $d(nfg_k,0)=\sqrt n\, d(fg_k,0)=d(f,0)/\sqrt n$, which is $\leq 1$ as soon as $n\geq d(f,0)^2$. For such $n$, one has $n fg_k\in V$ for every $k$, and the relation $f=\frac1n \sum nf g_k$ shows that $f\in V$. QED.
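The first step of this proof can be checked numerically on step functions. The following is only a sketch: it represents a function that is constant on each of the $n$ intervals of length $1/n$ by its list of values, and the helper names `dist0`, `spike` and `average` are hypothetical, introduced for this illustration.

```python
from math import sqrt

def dist0(values):
    """d(f, 0) for the step function taking values[k] on the k-th of n equal intervals."""
    n = len(values)
    return sum(sqrt(abs(v)) for v in values) / n

def spike(n, k):
    """The function n^2 * g_k, where g_k is the indicator of the k-th interval."""
    return [float(n * n) if j == k else 0.0 for j in range(n)]

def average(functions):
    """Convex combination, with equal weights, of step functions on the same grid."""
    m = len(functions)
    return [sum(column) / m for column in zip(*functions)]

n = 4
spikes = [spike(n, k) for k in range(n)]
mean = average(spikes)

# Each spike is at distance 1 from 0, but their average is the constant
# function n, whose distance to 0 is sqrt(n).
print([dist0(s) for s in spikes])
print(mean, dist0(mean))
```

Increasing `n` makes the averaged function arbitrarily far from $0$, although every function being averaged stays in the unit ball; this is the failure of local convexity in concrete terms.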



When $(X,\mu)$ is a measured space and $p$ is a real number such that $0<p<1$, the space $L^p(X,\mu)$ has similar properties. For this, I refer the interested reader to the above cited paper of Day and to Conrad's note.

Saturday, June 6, 2015

Model theory and algebraic geometry, 4 — Elimination of imaginaries

The fourth post of this series is devoted to an important concept of model theory, that of elimination of imaginaries. The statement of Scanlon's theorem will appear in a subsequent one.

Definition. — Let $T$ be a theory in a language $L$. One says that $T$ eliminates imaginaries (resp. weakly eliminates imaginaries) if for every model $M$ and every formula $f(x;a)$ with parameters $a\in M^p$, there exists a formula $g(x;y)$ such that $\{ b\in M^q\;;\; \forall x, f(x;a)\Leftrightarrow g(x;b)\}$ is a singleton (resp. is a non-empty finite set).

What does this mean? View the formula $f(x;y)$ as defining a family of definable subsets, where $f(x;a)$ is the slice given by the choice of parameters $a$. It may happen that many fibers are equal. The property of elimination of imaginaries asserts that one can define the same family of definable subsets via another formula $g(x;y)$, with different parameters, so that every definable set in the original family appears once and only once. For the case of weak elimination, every definable set of the initial family appears only finitely many times.

There is an alternative, Galois theoretic style, description: a theory $T$ (weakly) eliminates imaginaries if and only if, for every formula $f(x;a)$ with parameters in a model $M$, there exists a finite subset $B\subset M$ such that for every elementary extension $N$ of $M$ and every automorphism $\sigma$ of $N$, the automorphism $\sigma$ preserves the formula (meaning $f(x;a)\leftrightarrow f(\sigma(x);a)$, or, equivalently, $\sigma$ leaves globally invariant the definable subset of $N^n$ defined by the formula $f(x;a)$) if and only if $\sigma$ leaves $B$ pointwise (resp. globally) invariant. One direction is obvious: take for $B$ the coordinates of the elements of the singleton (resp. the finite set) given by applying the definition. For the converse, elementary extensions must enter the picture because some models are too small to possess the necessary automorphisms that should exist; under “saturation hypotheses”, the model $M$ will witness them already.

This property is related to the possibility of representing equivalence classes modulo a definable equivalence relation. Namely, let $M$ be a model and let $E$ be an equivalence relation on $M^n$ whose graph is a definable subset of $M^n\times M^n$. Assume that the theory $T$ eliminates imaginaries and allows one to define two distinct elements. Then there exists a definable map $f_E\colon M^n\to M^m$ such that for every $y,z\in M^n$, $y \mathrel{E} z$ if and only if $f_E(y)=f_E(z)$. In particular, the quotient set $M^n/E$ is represented by the image of the definable map $f_E$.

Conversely, let $f(x;a)$ be a formula with parameters $a\in M^p$ and consider the equivalence relation $E$ on $M^p$ given by $y \mathrel{E} z$ if and only if $\forall x,\ f(x;y)\Leftrightarrow f(x;z)$. Its graph is obviously definable. Assume that there exists a definable map $f_E\colon M^p\to M^q$ such that $y \mathrel{E} z$ if and only if $f_E(y)=f_E(z)$. Then an automorphism of (an elementary extension of) $M$ will fix the definable set defined by $f(x;a)$ if and only if it fixes $f_E(a)$, so that one has elimination of imaginaries.

Theorem (Poizat). — The theory of algebraically closed fields eliminates imaginaries.

This is more or less equivalent to Weil's theorem on the field of definition of a variety. It is my feeling, however, that this property is under-estimated in algebraic geometry. Indeed, it is closely related to a theorem of Rosenlicht that asserts that given a variety $X$ and an algebraic group $G$ acting on $X$, there exists a dense $G$-invariant open subset $U$ of $X$ such that a geometric quotient $U/G$ exists in the sense of Mumford's Geometric Invariant Theory.

Examples. — Let $K$ be an algebraically closed field.

a) Let $X$ be a Zariski closed subset of $K^n$ and let $G$ be a finite group of (regular) automorphisms of $X$. Let us consider the formula $f(x;y)=\bigvee_{g\in G} (x=g\cdot y)$ which asserts that $x$ belongs to the orbit of $y$ under the given action of $G$, so that $f(x;y)$ parameterizes $G$-orbits. Since $G$ is finite, weak elimination of imaginaries is a trivial matter, but full elimination of imaginaries is possible as well. Let indeed $A$ be the affine algebra of $X$; this is a $K$-algebra of finite type with an action of $G$ and the algebra $A^G$ is finitely generated. Consequently, there exists a Zariski closed subset $Y$ of some $K^m$ and a polynomial morphism $\phi\colon K^n\to K^m$ such that, for every $y,z\in X$, $\phi(y)=\phi(z)$ if and only if there exists $g\in G$ such that $z=g\cdot y$. Consequently, for $a\in X$, $b=\phi(a)$ is the only element such that the formula $f(x;a)$ be equivalent to the formula $g(x;b)=(b\in Y) \wedge (\exists y\in X)\big((\phi(y)=b) \wedge f(x;y)\big)$.

The simplest instance would be the symmetric group $G=\mathfrak S_n$ acting on $K^n$ by permutation of coordinates. Then $G$-orbits are unordered $n$-tuples of elements of $K$, and it is both a trivial and a fundamental fact that the orbit of $(x_1,\dots,x_n)$ is faithfully represented by the first $n$ elementary symmetric functions of $(x_1,\dots,x_n)$, equivalently, by the coefficients of the polynomial $\prod_{j=1}^n (T-x_j)$.
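As a small illustration of this coding of orbits, here is a sketch over the integers rather than over an algebraically closed field; the function name `monic_coefficients` is an assumption made for the example. It computes the coefficients of $\prod_j (T-x_j)$ and checks that all reorderings of a tuple give the same code.

```python
from itertools import permutations

def monic_coefficients(xs):
    """Coefficients of prod_j (T - x_j), leading coefficient first."""
    coeffs = [1]
    for x in xs:
        # Multiply the current polynomial by (T - x).
        shifted = coeffs + [0]
        scaled = [0] + [-x * c for c in coeffs]
        coeffs = [a + b for a, b in zip(shifted, scaled)]
    return coeffs

point = (2, 5, 5, -1)
# Every permutation of the coordinates yields the same list of coefficients,
# so the set below has exactly one element: a canonical code for the orbit.
codes = {tuple(monic_coefficients(p)) for p in permutations(point)}
print(codes)
```

Two tuples in different orbits, such as `(2, 5, 5, -1)` and `(2, 5, -1, -1)`, receive different codes, because a monic polynomial determines its roots with multiplicities.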

b) Let $X=K^{n^2}$ be the set of all $n\times n$ matrices, on which the group $G=\mathop{\rm GL}(n,K)$ acts by conjugation. The Jordan decomposition gives a partition of $X$ into constructible sets, stable under the action of $G$, and on each of them, there exists a regular representation of the equivalence classes. For example, the set $U$ of all matrices with pairwise distinct eigenvalues is Zariski open (it is defined by the non-vanishing of the discriminant of the characteristic polynomial) and on this set $U$, the conjugacy class of a matrix is represented by its characteristic polynomial.
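For $n=2$, this can be made concrete with exact rational arithmetic. The sketch below works over $\mathbf Q$ instead of an algebraically closed field, and all helper names (`mat`, `conjugate`, `charpoly`, `discriminant`) are hypothetical. It checks that the characteristic polynomial is unchanged by conjugation, and that outside of $U$ it no longer separates conjugacy classes.

```python
from fractions import Fraction

def mat(rows):
    """Build a 2x2 matrix with exact rational entries."""
    return [[Fraction(x) for x in row] for row in rows]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(p):
    det = p[0][0] * p[1][1] - p[0][1] * p[1][0]
    return [[p[1][1] / det, -p[0][1] / det],
            [-p[1][0] / det, p[0][0] / det]]

def conjugate(p, a):
    """The matrix p * a * p^(-1)."""
    return matmul(matmul(p, a), inverse(p))

def charpoly(a):
    """(trace, determinant), i.e. the polynomial T^2 - trace*T + determinant."""
    return (a[0][0] + a[1][1], a[0][0] * a[1][1] - a[0][1] * a[1][0])

def discriminant(a):
    trace, det = charpoly(a)
    return trace * trace - 4 * det

a = mat([[1, 2], [3, 4]])
p = mat([[2, 1], [1, 1]])
b = conjugate(p, a)
print(discriminant(a) != 0, charpoly(a) == charpoly(b))

# Outside U: same characteristic polynomial, but different conjugacy classes,
# since every conjugate of the identity is the identity.
identity = mat([[1, 0], [0, 1]])
jordan = mat([[1, 1], [0, 1]])
print(charpoly(identity) == charpoly(jordan), conjugate(p, identity) == jordan)
```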

Theorem. — An o-minimal theory eliminates imaginaries. More precisely any surjective definable map $f\colon X\to Y$ between definable sets admits a definable section.

This follows from the fact that one can define a canonical point in every non-empty definable set. By induction on dimension, it suffices to prove this for a subset $A$ of the line. Then, let $J_A$ be the leftmost interval of $A$ (if the formula $f$ defines $A$, then $J_A$ is defined, for instance, by the formula $f(x)\wedge\forall y\,\big((y\leq x\wedge f(y))\rightarrow \forall z\,(y\leq z\leq x\rightarrow f(z))\big)$); let $u$ and $v$ be the “endpoints” of $J_A$; if $u=-\infty$ and $v=+\infty$, set $x_A=0$; if $u=-\infty$ and $v<+\infty$, set $x_A=v-1$; if $-\infty<u$ and $v=+\infty$, set $x_A=u+1$; if $-\infty<u\leq v<+\infty$, set $x_A=(u+v)/2$. It is easy to write down a formula that expresses $x_A$ in terms of a formula for $A$. Consequently, in a family $A_t\subset M$ of non-empty definable sets, the function $t\mapsto x_{A_t}$ is definable.
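The choice of the canonical point can be sketched in code, under simplifying assumptions that are not in the text: a definable subset of the line is given directly as a list of disjoint intervals `(u, v)`, infinite endpoints are encoded by `math.inf`, and whether the endpoints belong to the set is ignored. The case of a leftmost interval that is unbounded only on the right is handled as $u+1$, by symmetry with the case $v-1$; this is an assumption of the sketch. The name `canonical_point` is hypothetical.

```python
from math import inf

def canonical_point(intervals):
    """Canonical point of a non-empty finite union of disjoint intervals (u, v)."""
    if not intervals:
        raise ValueError("the set must be non-empty")
    # For disjoint intervals, the leftmost one has the smallest left endpoint.
    u, v = min(intervals)
    if u == -inf and v == inf:
        return 0
    if u == -inf:
        return v - 1
    if v == inf:
        return u + 1  # assumption: symmetric to the previous case
    return (u + v) / 2

print(canonical_point([(1, 5), (7, 8)]))
print(canonical_point([(-inf, 3), (10, inf)]))

# In the family A_t = [t; t+2], the chosen point depends on t by the
# explicit rule t -> t + 1.
print([canonical_point([(t, t + 2)]) for t in range(3)])
```

The point of the construction is that the rule uses only the order and the field operations, so the same recipe applies uniformly to every set of a definable family.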

Theorem (Poizat). — The theory of differentially closed fields eliminates imaginaries in the language $\{+,-,\cdot,0,1,\partial\}$.

Examples. — Let $K$ be a differentially closed field. Let $X$ be an algebraic variety with the action of an algebraic group $G$, all defined over the field of constants $C=K^\partial$. We can then endow $X(K)$ with the equivalence relation given by $x\sim y$ if and only if there exists $g\in G(C)$ such that $y=g\cdot x$. The following three special instances of elimination of imaginaries in DCF are classical results of function theory:

a) If $X=\mathbf A^1$ is the affine line and $G=\mathbf G_a$ is the additive group acting by translation, then the map $\partial\colon x\mapsto \partial (x)$ gives a bijection from $X(K)/G(C)$ to $K$. Indeed, two elements $x,y$ of $K$ differ by the addition of a constant element if and only if $\partial(x)=\partial(y)$. (Moreover, every element of $K$ has a primitive.)

b) Let $X=\mathbf A^1\setminus\{0\}$ be the affine line minus the origin and let $G=\mathbf G_m$ be the multiplicative group acting by multiplication. Then the logarithmic derivative $\partial\log\colon x\mapsto \partial(x)/x$ gives a bijection from $X(K)/G(C)=K^\times/C^\times$ to $K$: two elements $x,y$ of $K^\times$ differ by multiplication by a constant if and only if $\partial(x)/x=\partial(y)/y$, and every element of $K$ is a logarithmic derivative.

c) Let $X=\mathbf P^1$ be the projective line endowed with the action of the group $G=\operatorname{PGL}(2)$. Then two points $x,y\in X(K)$ differ by an action of $G(C)$ if and only if their Schwarzian derivatives are equal, where the Schwarzian derivative of $x\in K$ is defined by
$$ S(x) = \partial\big(\partial^2 (x)/\partial (x)\big) -\frac12 \big(\partial^2(x)/\partial(x)\big)^2. $$

Link to Part 5 — Algebraic differential equations from coverings

Monday, May 11, 2015

Model theory and algebraic geometry, 3 — Real closed fields and o-minimality

In this third post devoted to some interactions between model theory and algebraic geometry, we describe the concept of o-minimality and the o-minimal complex analysis of Peterzil and Starchenko.

1. Real closed fields and the theorem of Tarski-Seidenberg

To begin with, we work in the language $L_{\mathrm{or}}$ of ordered rings which is the language of rings $L_{\mathrm r}=\{+,-,\cdot,0,1\}$ enlarged with an order relation $\leq$.

Let us recall the definition of a real closed field: this is a field $K$ endowed with an ordering which is compatible with the field laws (the sum of positive elements is positive and the product of positive elements is positive) and which satisfies the intermediate value theorem for polynomials: for every polynomial $P\in K[T]$ and every pair $(a,b)$ of elements of $K$ such that $a<b$, $P(a)<0$ and $P(b)>0$, there exists $c\in K$ such that $P(c)=0$ and $a<c<b$. Observe that this property can be expressed by a sequence of first-order formulas, one for each degree.

The field $\mathbf R$ of real numbers is real closed, but there are many others. For example, the field of formal Puiseux series with real coefficients is also real closed.

A theorem of Artin-Schreier asserts that a field $K$ is real closed if and only if $\sqrt{-1}\not\in K$ and $K(\sqrt{-1})$ is an algebraic closure of $K$. This is also equivalent to the fact that “the” algebraic closure of $K$ is a finite non-trivial extension of $K$. While the algebraic notion adapted to the language of rings is that of an algebraically closed field, the notion of a real closed field is the one which is adapted to the language of ordered rings. In model theoretic terms, the theory of real closed fields is the model companion of the theory of ordered fields.

The analogue of the theorem of Chevalley is the classical theorem of Tarski-Seidenberg:

Theorem (Tarski-Seidenberg). — The theory of real closed fields eliminates quantifiers in the language of ordered rings.

There is a very classical example of this theorem, namely, the solution of polynomial equations of degree 2. Indeed, in a real closed field, every positive element has a square root (if $a>0$, then the polynomial $T^2-a$ is negative at $0$ and positive at $a+1$, so that it admits a positive root). The usual algebraic computation thus shows that the formula $\exists x, x^2+ax+b=0$ is equivalent to the formula $a^2-4b\geq 0$.
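This equivalence can be illustrated over the real numbers, approximated here by floating point numbers. The sketch below is only an illustration, with hypothetical function names: `has_root_quantifier_free` evaluates the quantifier-free formula, while `witness` produces an explicit $x$ realizing the existential quantifier when there is one.

```python
from math import sqrt

def has_root_quantifier_free(a, b):
    """Truth value of the quantifier-free formula a^2 - 4b >= 0."""
    return a * a - 4 * b >= 0

def witness(a, b):
    """A root of x^2 + a x + b if one exists, and None otherwise.

    Completing the square gives x^2 + a x + b = (x + a/2)^2 - (a^2 - 4b)/4,
    which is positive for every x when a^2 - 4b < 0.
    """
    disc = a * a - 4 * b
    if disc < 0:
        return None
    return (-a + sqrt(disc)) / 2

for a, b in [(0, -1), (2, 1), (1, 1), (-3, 2)]:
    x = witness(a, b)
    value = None if x is None else x * x + a * x + b
    print((a, b), has_root_quantifier_free(a, b), x, value)
```

The existential formula holds exactly when `witness` returns a number, and this agrees with the sign condition on the discriminant for each pair tested.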

Corollary 1. — If $M$ is a real closed field and $A$ is a subset of $M$, then $\mathop{\rm Def}(M^n,A)$ is the set of all semi-algebraic subsets of $M^n$ defined by polynomials with coefficients in $A$.

Corollary 2. — If $M$ is a real closed field, the definable subsets of $M$ are the finite unions of intervals (open, closed or half-open, $\mathopen]a;b\mathclose[$, $\mathopen]a;b]$, $[a;b\mathclose[$, $[a;b]$, possibly unbounded, possibly reduced to singletons).

2. O-minimality

The seemingly innocuous property stated in corollary 2 leads to a definition which is surprisingly important and powerful.

Definition. — Let $T$ be the theory of a real closed field $M$ in an expansion $L$ of the language of ordered rings. One says that $T$ is o-minimal if the definable subsets of $M$ are the finite unions of intervals.

It is a non-trivial result that o-minimality is indeed a property of the theory $T$, and not a property of the model $M$: if it holds, then for every elementary extension $N$ of $M$, the definable subsets of $N$ still are finite unions of intervals.

By the theorem of Tarski-Seidenberg, the theory of real closed fields is o-minimal. The discovery of more complicated o-minimal theories is a remarkable fact from the 80s.

Example. — Let $L_{\mathrm{an},\mathrm{exp}}$ be the language obtained by adjoining to the language $L_{\mathrm{or}}$ of ordered rings symbols of functions $\exp$ and $f$, for every real analytic function $f\colon [0;1]^n\to\mathbf R$. The field of real numbers is viewed as a structure for this language by interpreting $\exp$ as the exponential function from $\mathbf R$ to $\mathbf R$, and every function symbol $f$ as the function from $\mathbf R^n$ to $\mathbf R$ that maps $x$ to $f(x)$ if $x\in [0;1]^n$, and to $0$ otherwise. The theory (denoted $\mathbf R_{\mathrm{an},\mathrm{exp}}$) of $\mathbf R$ in this language is o-minimal.

This is a theorem of van den Dries and Miller; the case of $L_{\mathrm{an}}$ (without the exponential function) had been established by Denef and van den Dries, while the case of $L_{\mathrm{exp}}$ is due to Wilkie.

To give a non-example, let us consider the language obtained by adjoining a symbol $\sin$ and view $\mathbf R$ as a structure for this language, the symbol $\sin$ being interpreted as the sine function from $\mathbf R$ to $\mathbf R$. Then the theory of $\mathbf R$ in this language is not o-minimal. Indeed, the set $\pi\mathbf Z$ is definable by the formula $\sin(x)=0$, but $\pi\mathbf Z$ has infinitely many connected components, so is not a finite union of intervals.

One motivation for o-minimality is that it realizes (part of) Grothendieck's quest towards tame topology as described in his Esquisse d'un programme. Indeed, sets which are definable in an o-minimal structure have many tameness properties:
  • The interior, the closure, the boundary of a definable set is definable.
  • Every definable set is homeomorphic to (the topological realization of) a simplicial complex.
  • Every definable set has a cellular decomposition. Precisely, let us call a cell of $\mathbf R^{n+1}$ any subset $C$ of the following form: one is given a definable subset $A$ of $\mathbf R^n$ and definable functions $f,g\colon A\to\mathbf R$ such that $f(x)<g(x)$ for every $x\in A$, and the set $C$ of points $(x,t)$ is defined by the condition $x\in A$, and by one of the conditions $t<f(x)$, or $t=f(x)$, or $f(x)<t<g(x)$, or $t>f(x)$. Then for every finite family $(B_i)$ of definable subsets of $\mathbf R^{n+1}$, there is a finite partition of $\mathbf R^{n+1}$ into cells such that every $B_i$ is a union of cells.
  • Every definable function is piecewise smooth.
  • Definable continuous functions are definably piecewise trivial (theorem of Hardt): for every function $f\colon X\to Y$ between definable sets which is definable and continuous, there is a finite partition $(Y_i)$ of $Y$ into definable subsets such that the map $f_i\colon f^{-1}(Y_i)\to Y_i$ deduced from $f$ by restriction is isomorphic to a projection $Y_i\times S_i\to Y_i$.

Recently, o-minimality has had spectacular applications via the approach of Pila-Zannier to the conjecture of Pink, leading to new proofs of the Manin-Mumford conjecture (Pila-Zannier), and to proofs of the André-Oort conjecture (Pila, Pila-Tsimerman, Klingler-Ullmo-Yafaev), and, more recently, to partial results towards the conjecture of Pink (Gao, Habegger-Pila,...). However, this is not the goal of this post, so let me refer the interested reader to Tom Scanlon's Bourbaki talk on that topic.

3. O-minimal complex analysis

The standard identification of the field $\mathbf C$ of complex numbers with $\mathbf R^2$ (associating with a complex number its real and imaginary parts) allows one to talk of complex valued functions (on a subset of $\mathbf C^n$) which are definable in a given language. In a remarkable series of papers, Peterzil and Starchenko have shown that holomorphic functions which are definable in an o-minimal structure possess very rigid properties. Let us quote some of their theorems.

So we fix an expansion of the language $L_{\mathrm{or}}$ of which the field $\mathbf R$ is a structure whose theory is o-minimal. By “definable”, we mean definable in that language. The typical language considered in the applications here is the language $L_{\mathrm{an},\mathrm{exp}}$.

Theorem. — Let $A$ be a finite subset of $\mathbf C$ and let $f\colon \mathbf C\setminus A\to \mathbf C$ be a holomorphic function. If $f$ is definable, then it is a rational function.

Theorem. — Let $V\subset\mathbf C^n$ be a closed analytic subset. If $V$ is definable, then $V$ is algebraic.

Corollary (Theorem of Chow). — Let $V\subset\mathbf P^n(\mathbf C)$ be a closed analytic subset. Then $V$ is algebraic.

Indeed, working on the standard charts of $\mathbf P^n(\mathbf C)$, we see that $V$ is locally definable by analytic functions. By compactness of $\mathbf P^n(\mathbf C)$, it is thus definable in the language $L_{\mathrm{an}}$. Since the theory of $\mathbf R$ in this language is o-minimal, the corollary is a consequence of the previous theorem.

Let us finally give an important example. Let $X$ be a bounded symmetric domain. This means that $X$ is a bounded open subset of $\mathbf C^n$ such that for every point $p\in X$, there exists a biholomorphic involution $f\colon X\to X$ such that $p$ is an isolated fixed point of $f$. This implies that $X$ is a homogeneous space $G/K$ under a semisimple Lie group $G$ which acts by holomorphisms, and $K$ is a maximal compact subgroup of $G$. Moreover, $X$ has a canonical Kähler metric which is invariant under $G$.

The most classical example is given by the Poincaré upper half-plane on which $\mathrm{PGL}(2,\mathbf R)$ acts by homographies; of course, the upper half-plane is not bounded, but is biholomorphic to the open unit disk.

A more sophisticated example is given by the Siegel upper half-plane or, rather, its bounded version. That is, $X$ is the set of $n\times n$ symmetric complex matrices $Z$ such that $\mathrm I_n-Z^* Z$ is positive definite. It is a homogeneous space for the symplectic group $\mathrm{Sp}(2n,\mathbf R)$; the stabilizer of $Z=0$ is the unitary group $U(n)$.

Let now $\Gamma$ be an arithmetic subgroup of $\mathrm{Sp}(2n,\mathbf R)$; for example, let us take for $\Gamma$ a subgroup of finite index of $\mathrm{Sp}(2n,\mathbf Z)$. Then the quotient $S=X/\Gamma$ admits a structure of an analytic set and the projection $p\colon X\to S$ is an analytic map. If $\Gamma$ is “small enough” (torsion free, say), then $S$ is even a complex manifold, and $p$ is a covering. An important and difficult theorem of Baily-Borel asserts that $S$ is an algebraic variety.

In fact, it is classical in this context that there exist Siegel sets, which are explicit subsets $F$ of $X$ such that $\Gamma\cdot F=X$ and such that the set of $\gamma\in\Gamma$ such that $\gamma\cdot F\cap F\neq\emptyset$ is finite. So Siegel sets are almost fundamental domains. An important remark is that they are semi-algebraic, that is, definable in the language of ordered rings. For example in the upper half-plane, one may take $F$ to be the set of all $z\in\mathbf C$ such that $-\frac12\leq \Re(z)\leq \frac12$ and $\Im(z)\geq \sqrt 3/2$. One may even take “fundamental sets” (which are fundamental domains up to something of empty interior) such as the one defined by the inequalities $-\frac12\leq \Re(z)\leq\frac12$ and $\lvert z\rvert \geq1$.
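For the upper half-plane, the covering property $\Gamma\cdot F=X$ can be sketched by the classical reduction algorithm. The following assumes that $\Gamma$ is the modular group $\mathrm{SL}(2,\mathbf Z)$ acting by homographies, generated by $z\mapsto z+1$ and $z\mapsto -1/z$; this choice of $\Gamma$ and the function name `reduce_to_fundamental_set` are assumptions made for the illustration, and the computation is in floating point.

```python
def reduce_to_fundamental_set(z, max_steps=1000):
    """Move z (with Im z > 0) into the set |Re z| <= 1/2, |z| >= 1.

    Only the moves z -> z + k (k an integer) and z -> -1/z are used, so the
    result lies in the orbit of z under the modular group.
    """
    if z.imag <= 0:
        raise ValueError("z must lie in the upper half-plane")
    for _ in range(max_steps):
        # Translate by an integer so that the real part lies in [-1/2; 1/2].
        z = complex(z.real - round(z.real), z.imag)
        if abs(z) >= 1:
            return z
        # The inversion multiplies Im z by 1/|z|^2 > 1, which forces termination.
        z = -1 / z
    raise RuntimeError("no reduced point found within max_steps")

w = reduce_to_fundamental_set(complex(0.3, 0.01))
print(w, abs(w.real) <= 0.5, abs(w) >= 1)
```

A reduced point automatically satisfies $\Im(z)\geq\sqrt 3/2$, so it also lies in the Siegel set described above; both sets are cut out by polynomial inequalities in the real and imaginary parts, which is the semi-algebraicity that matters here.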

Peterzil and Starchenko have proved that the restriction to $F$ of the projection $p$ is definable in the language $L_{\mathrm{an},\mathrm{exp}}$. An immediate consequence is that $S$ is definable in this language, hence is algebraic.

These results have been generalized by Klingler, Ullmo and Yafaev to any bounded symmetric domain. This is an important technical part of their proof of the hyperbolic Ax-Lindemann conjecture.

Link to Part 4 — Elimination of imaginaries

Saturday, May 2, 2015

Model theory and algebraic geometry, 2 — Definable sets, types; quantifier elimination

This is the second post in a series of 4 devoted to the exposition of interactions between model theory and algebraic geometry. In the first one, I explained the notions of language, structures and theories, with examples taken from algebra. Here, I shall discuss the notion of definable set, of types, as well as basic results from dimension theory (ω\omega-stability).

So we fix a theory $T$ in a language $L$. A definable set is defined, in a given model $M$ of $T$, by a formula. More precisely, we consider definable sets in cartesian powers $M^n$ of the model $M$, which can be defined by a formula in $n$ free variables with parameters in some subset $A$ of $M$. By definition, such a formula is a formula of the form $\phi(x;a)$, where $\phi(x;y)$ is a formula in $n+m$ free variables, split into two groups $x=(x_1,\dots,x_n)$ and $y=(y_1,\dots,y_m)$, and $a=(a_1,\dots,a_m)\in A^m$ is an $m$-tuple of parameters; the formula $\phi(x;y)$ can have quantifiers and bound variables too. Given such a formula, we define a subset $[\phi(x;a)]$ of $M^n$ by $\{ x\in M^n\mid \phi(x;a)\}$. We write $\mathrm{Def}(M^n;A)$ for the set of all subsets of $M^n$ which are definable with parameters in $A$.

Let us give examples, where LL is the language of rings and TT is the theory ACF\mathrm{ACF} of algebraically closed fields:
  • $V_1=\{x\mid x\neq 0 \}\subset M$, given by the formula “$x\neq 0$” with 1 free variable and no parameters;
  • $V_2=\{x\mid \exists y, 2xy=1\} \subset M$, given by the formula “$\exists y, 2xy=1$” with 1 free variable $x$ and one bound variable $y$;
  • V3={(x,y)x2+2y2=π}C2V_3=\{(x,y)\mid x^2+\sqrt 2 y^2=\pi \}\subset \mathbf C^2, where the model C\mathbf C is the field of complex numbers, ϕ((x,y),(a,b))\phi((x,y),(a,b)) is the formula x2+ay2=bx^2+ay^2=b in 4 free variables, and the parameters are given by (a,b)=(2,π)(a,b)=(\sqrt 2,\pi).
Theorem (Chevalley). — Let $L$ be the language of rings, $T=\mathrm{ACF}$ and $M$ be an algebraically closed field; let $A$ be a subset of $M$. The set $\mathrm{Def}(M^n;A)$ is the smallest boolean algebra of subsets of $M^n$ which contains all subsets of $M^n$ of the form $[P(x;a)=0]$ where $P$ is a polynomial in $n+m$ variables with coefficients in $\mathbf Z$ and $a=(a_1,\dots,a_m)$ is an $m$-tuple of elements of $A$. In other words, a subset of $M^n$ is definable with parameters in $A$ if and only if it is constructible with parameters in $A$.

The reason behind this theorem is the following set-theoretic interpretation of quantifiers and logical connectives. Precisely, if $\phi$ is a formula in $n+m+p$ variables, and $a\in A^p$, the definable subset $[\exists y\, \phi(x,y;a)]$ of $M^n$ coincides with the image of the definable subset $[\phi(x,y;a)]$ of $M^{n+m}$ under the projection $p_x \colon M^{n+m}\to M^n$. Similarly, if $\phi(x)$ and $\psi(x)$ are two formulas in $n$ free variables, then the definable subset $[\phi(x)\wedge\psi(x)]$ is the intersection of the definable subsets $[\phi(x)]$ and $[\psi(x)]$, while $[\phi(x)\vee\psi(x)]$ is their union. And if $\phi(x)$ is a formula in $n$ variables, then the definable subset $[\neg\phi(x)]$ is the complement in $M^n$ of the definable subset $[\phi(x)]$.

For example, the subset V2=[y,2xy=1]V_2=[\exists y, 2xy=1] defined above can also be defined by M[2x=0]M\setminus [2x=0].
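The dictionary between formulas and set operations can be tried out by brute force on a small field. The sketch below is only an illustration: it uses the finite field $\mathbf Z/7\mathbf Z$, which is not algebraically closed, but the equivalence between $\exists y,\ 2xy=1$ and $2x\neq 0$ holds in every field. The names `P`, `points`, `project_first` and `complement` are hypothetical helpers, not part of any library.

```python
from itertools import product

P = 7                 # illustrative prime; the model is M = Z/PZ
FIELD = range(P)

def points(phi, n):
    """The subset [phi] of M^n, as a set of n-tuples."""
    return {x for x in product(FIELD, repeat=n) if phi(*x)}

def project_first(subset):
    """Image of a subset of M^2 under (x, y) -> x: interprets 'exists y'."""
    return {(x,) for (x, _y) in subset}

def complement(subset, n):
    """Complement in M^n: interprets negation."""
    return set(product(FIELD, repeat=n)) - subset

graph = points(lambda x, y: (2 * x * y) % P == 1, 2)   # [2xy = 1] in M^2
v2 = project_first(graph)                              # [exists y, 2xy = 1]
quantifier_free = complement(points(lambda x: (2 * x) % P == 0, 1), 1)
```

Here `v2` and `quantifier_free` are the same subset of $M$, which is the quantifier-free description of $V_2$ given above.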

One says that the theory ACF admits elimination of quantifiers: modulo the axioms of algebraically closed fields, every formula of the language LL is equivalent to a formula without quantifiers.

An important consequence of this property is that for every extension MMM\hookrightarrow M' of models of ACF, the theory of MM' is equal to the theory of MM—one says that every extension of models is elementary.

Let $p$ be either $0$ or a prime number. Observe that every algebraically closed field of characteristic $p$ is an extension of $\overline{\mathbf Q}$ if $p=0$, or of $\overline{\mathbf F_p}$ if $p$ is a prime number. As a consequence, for every characteristic $p\geq0$, the theory $\mathrm{ACF}_p$ of algebraically closed fields of characteristic $p$ (defined by the axioms of $\mathrm{ACF}$, together with the axiom $1+1+\dots+1=0$, with $p$ terms, asserting that the characteristic is $p$ if $p$ is a prime number, or the infinite list of axioms asserting that the characteristic is $\neq \ell$ for every prime number $\ell$, if $p=0$) is complete: this list of axioms determines everything that can be said about algebraically closed fields of characteristic $p$.

Definition. — Let aMna\in M^n and let AA be a subset of MM. The type of aa (with parameters in AA) is the set tp(a/A)\mathrm{tp}(a/A) of all formulas ϕ(x;b)\phi(x;b) in nn free variables with parameters in AA such that ϕ(a;b)\phi(a;b) holds in the model MM.

Definition. — Let AA be a subset of MM. For every integer n0n\geq 0, the set Sn(A)S_n(A) of types (with parameters in AA) is the set of all types tp(a/A)\mathrm{tp}(a/A), where NN is an extension of MM which is a model of TT and aNna\in N^n. One then says that this type is realized in NN.

Gödel's completeness theorem allows us to give an alternative description of $S_n(A)$. Namely, let $p$ be a set of formulas in $n$ free variables and parameters in $A$ which contains the diagram of $A$ (that is, all formulas which involve only elements of $A$ and are true in $M$). Assume that $p$ is consistent (there exist a model $N$ which is an extension of $M$ and an element $a\in N^n$ such that $\phi(a)$ holds in $N$ for every $\phi\in p$) and maximal (for every formula $\phi\not\in p$, every model $N$ and every $a\in N^n$ such that $p\subset \mathrm{tp}(a/A)$, the formula $\phi(a)$ does not hold in $N$). Then $p\in S_n(A)$.

For every formula $\phi\in L(A)$ in $n$ free variables and parameters in $A$, let $V_\phi$ be the set of types $p\in S_n(A)$ such that $\phi\in p$. Then the subsets $V_\phi$ of $S_n(A)$ constitute a basis of open sets for a natural topology on $S_n(A)$.

Theorem. — The topological space Sn(A)S_n(A) is compact and totally discontinuous.

Let us detail the case of the theory ACF in the language of rings. I claim that if $K$ is a field, then $S_n(K)$ is homeomorphic to the spectrum $\mathop{\rm Spec}(K[T_1,\dots,T_n])$ endowed with its constructible topology. Concretely, for every algebraically closed extension $M$ of $K$ and every $a\in M^n$, the homeomorphism $j$ maps $\mathrm{tp}(a/K)$ to the prime ideal $\mathfrak p_a$ consisting of all polynomials $P\in K[T_1,\dots,T_n]$ such that $P(a)=0$.

A type p=tp(a/K)p=\mathrm{tp}(a/K) is isolated if and only if the prime ideal pa\mathfrak p_a is maximal. Consequently, if n=1n=1, there is exactly one non-isolated type in S1(K)S_1(K), corresponding to the generic point of the spectrum Spec(K[T])\mathop{\rm Spec}(K[T]).

As for any compact topological space, a space of types can be studied via its Cantor-Bendixson analysis, which is a decreasing sequence of subspaces, indexed by ordinals, defined by transfinite induction. First of all, for every topological space XX, one denotes by D(X)D(X) the set of all non-isolated points of XX. One then defines X0=XX_0=X, Xα=D(Xβ)X_{\alpha}=D(X_\beta) if α=β+1\alpha=\beta+1 is a successor-ordinal, and Xα=β<αXβX_\alpha=\bigcap_{\beta<\alpha} X_\beta if α\alpha is a limit-ordinal. For xXx\in X, the Cantor-Bendixson rank of xx is defined by rCB(x)=αr_{CB}(x)=\alpha if xXαx\in X_\alpha and x∉Xβx\not\in X_\beta for β>α\beta>\alpha, and rCB(x)=r_{CB}(x)=\infty if xXαx\in X_\alpha for every ordinal α\alpha. The set of points of infinite rank is the largest perfect subset of XX.
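As a sanity check on these definitions, here is a small worked example (an illustration chosen for this purpose, not taken from the theory of types): the compact subspace of $\mathbf R$ consisting of a convergent sequence and its limit.

```latex
\[
\begin{aligned}
X &= \{0\}\cup\{1/n : n\geq 1\}\subset\mathbf R, \\
X_0 &= X,\qquad X_1 = D(X_0)=\{0\},\qquad X_2 = D(X_1)=\varnothing, \\
r_{CB}(1/n) &= 0 \quad (n\geq 1),\qquad r_{CB}(0)=1 .
\end{aligned}
\]
```

Each point $1/n$ is isolated in $X$, so only $0$ survives the first derivation; $0$ is then isolated in $X_1$, so $X_2$ is empty and no point has infinite rank.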

Let us return to the example of the theory ACF. If a type $p\in S_n(K)$ corresponds to a prime ideal $\mathfrak p=j(p)$ of $\mathop{\rm Spec}(K[T_1,\dots,T_n])$, its Cantor-Bendixson rank is the Zariski dimension of $V(\mathfrak p)$. More generally, if $F$ is a constructible subset of $\mathop{\rm Spec}(K[T_1,\dots,T_n])$, then $r_{CB}(F)$ is the Zariski dimension of the Zariski closure of $F$. Moreover, the points of maximal Cantor-Bendixson rank correspond to the generic points of the irreducible components of maximal dimension; in particular, there are only finitely many of them.

Definition. — One says that a theory TT is ω\omega-stable if for every finite or countable set of parameters AA, the space of 1-types S1(A)S_1(A) is finite or countable.

The theory ACF is $\omega$-stable. Indeed, if $K$ is the field generated by $A$, then $K[T]$ being a countable noetherian ring, it has only countably many prime ideals.

Since any non-empty perfect set is uncountable, one has the following lemma.

Lemma. — Let TT be an ω\omega-stable theory and let MM be a model of TT. Then the Cantor-Bendixson rank of every type xSn(M)x\in S_n(M) is finite.

Let us assume that TT is ω\omega-stable and let FF be a closed subset of Sn(M)S_n(M). Then rCB(F)=sup{rCB(x);xF}r_{CB}(F)=\sup \{ r_{CB}(x)\,;\, x\in F\} is finite, and the set of points xFx\in F such that rCB(x)=rCB(F)r_{CB}(x)=r_{CB}(F) is finite and non-empty.

This example gives a strong indication that the model-theoretic approach may be extremely fruitful for the study of algebraic theories whose geometry is not as well developed as algebraic geometry.

Link to Part 3 — Real closed fields and o-minimality

Thursday, April 23, 2015

Model theory and algebraic geometry, 1 — Structures, languages, theories, models

Last November, I was invited to lecture at the GAGC conference on the use of model theoretic methods in algebraic geometry. In the last two decades, important results of “general mathematics” have been proved using sophisticated model-theoretic techniques, see for example Hrushovski's proofs of the Manin-Mumford and of the Mordell-Lang conjecture over function fields, or Chatzidakis-Hrushovski's proof of a descent result in algebraic dynamics (generalizing a theorem of Néron for abelian varieties), or Hrushovski-Loeser's approach to the topology of Berkovich spaces, or Medvedev-Scanlon's results on invariant varieties in polynomial dynamics, or Hrushovski's generalization of the Lang-Weil estimates, or the applications to the André-Oort conjecture (by Pila and others) of a theorem of Pila-Wilkie in o-minimal geometry... All these wonderful results were however too complicated to be discussed from scratch in this series of lectures and I decided to discuss a beautiful paper of Scanlon that “explains” why coverings from analytic geometry lead to algebraic differential equations.
There will be 4 posts:
  1. Structures, languages, theories, models (this one)
  2. Definable sets, types, quantifier elimination
  3. Real closed fields and o-minimality
  4. Elimination of imaginaries
Model theory — a branch of mathematical logic — has two aspects:
  • The first one, that one could name “pure”, studies mathematical theories as mathematical objects. It introduced important concepts, such as quantifier elimination, elimination of imaginaries, types and their dimensions, stability theory, Zariski geometries, and provides a rough classification of mathematical theories.
  • The second one is “applied”: it studies classical mathematical theories using these tools. It seems to work best for algebraic theories, such as fields, differential fields, valued fields, ordered groups or fields, difference fields, etc., and for theories which are primitive enough to escape undecidability à la Gödel.
 Let us begin with an empirical observation; classical mathematical theories feature:
  • sets (which may be receptacles for groups, rings, fields, modules, etc.);
  • functions and relations between those sets (composition laws, order relations, equality);
  • certain axioms which are well-formed formulas using these functions, these relations, basic logical symbols (\forall, \exists, \vee, \wedge, ¬\neg) or their variants (\Rightarrow, \Leftrightarrow, !\exists!, etc.).
Model theory (to be precise, first-order model theory) introduces the concepts of a language (the letters and symbols that allow one to express a mathematical theory), of a theory (sets of formulas in a given language, using a fixed infinite supply of variables), of a structure (sets, functions and relations that allow one to interpret all formulas in the language) and finally of a model of a theory (a structure where the formulas of the given theory are interpreted as true). The theory of a structure is the set of all formulas which are interpreted as true. A morphism of structures is a map which is compatible with all the given functions and relations.

Let us give three examples from algebra: groups, rings and fields, differential fields.

a) Groups

The language of groups has one symbol \cdot which represents a binary law. Consequently, a structure for this language is just a set SS together with a binary law S×SSS\times S\to S. In this language, one can axiomatize groups using two axioms:
  • Associativity: xyzx(yz)=(xy)z\forall x \forall y \forall z \quad x\cdot (y\cdot z)= (x\cdot y)\cdot z
  • Existence of a neutral element and of inverses: $\exists e\,\forall x\, \exists y \quad (x\cdot e=e\cdot x=x \wedge  x\cdot y=y\cdot x=e)$.
Observe that in writing these formulas, we allow ourselves the usual shortcuts to which we are used as mathematicians. In fact, the foundations of model theory require spending a few pages discussing how formulas should be written, with or without parentheses, so that they can be read unambiguously, etc.

However, it may be more useful to study groups in a language with 3 symbols ,e,i\cdot,e,i, where \cdot represents the binary law, ee the neutral element and ii the inversion. Then a structure is a set together with a binary law, a distinguished element and a self-map; in particular, what is a structure depends on the language. In this new language, groups are axiomatized with three axioms:
  • Associativity as above;
  • Neutral element: xxe=ex=x\forall x \quad x\cdot e=e\cdot x=x;
  • Inverse: xxi(x)=i(x)x=e\forall x\quad x\cdot i(x)=i(x)\cdot x=e.
The two theories of groups are essentially equivalent: one can translate any formula of the first language into the second, and conversely. Indeed, if a formula of the second language involves the symbol $e$, it suffices to put $\exists e$ in front of it, together with the requirement $\forall z\ (z\cdot e=e\cdot z=z)$; and if a formula involves $i(x)$, it suffices to add $\exists y$ in front of it, as well as the requirement $x\cdot y=y\cdot x=e$, and to replace $i(x)$ by $y$. Since the neutral element and the inverse law of a group are unambiguously defined by the composition law, this shows that the new formula is equivalent to the initial one, albeit longer and less practical.

The possibility of interpreting a theory in a language in a second language is a very important tool in mathematical logic.
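On a finite structure, both axiomatizations of groups can be checked by brute force, which makes the translation between the two languages tangible. The following Python sketch is only an illustration; the function names are hypothetical, and the neutral element axiom is read as $x\cdot e=e\cdot x=x$.

```python
from itertools import product

def is_associative(S, op):
    return all(op(x, op(y, z)) == op(op(x, y), z)
               for x, y, z in product(S, repeat=3))

def is_group_one_symbol(S, op):
    """Axioms in the language with the single symbol '.':
    associativity, and 'exists e, for all x, exists y' as in the text."""
    neutral_and_inverses = any(
        all(op(x, e) == x and op(e, x) == x
            and any(op(x, y) == e and op(y, x) == e for y in S)
            for x in S)
        for e in S)
    return is_associative(S, op) and neutral_and_inverses

def is_group_three_symbols(S, op, e, inv):
    """Axioms in the language with symbols '.', e, i: no existential quantifier needed."""
    neutral = all(op(x, e) == x and op(e, x) == x for x in S)
    inverse = all(op(x, inv(x)) == e and op(inv(x), x) == e for x in S)
    return is_associative(S, op) and neutral and inverse
```

For instance, with `S = range(6)`, addition modulo 6 passes both tests (taking `e = 0` and `inv = lambda x: -x % 6`), while multiplication modulo 6 fails both, since 0 has no inverse.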

b) Rings

The language used to study rings has 5 symbols: +,,0,1,+,-,0,1,\cdot. In this language, structures are just sets with three binary laws and two distinguished elements. One can of course axiomatize rings, using the well-known formulas that express that the law ++ is associative and commutative, that 00 is a neutral element and that - gives subtraction, that the law \cdot is associative and commutative with 11 as a neutral element, and that the multiplication \cdot distributes over addition.

Adding the axioms x(x0yxy=1)\forall x (x\neq 0 \Rightarrow \exists y \quad xy=1) and 101\neq 0 gives rise to fields.

That a field has characteristic 2, say, is axiomatized by the formula $1+1=0$, that it has characteristic 3 is axiomatized by the formula $1+1+1=0$, etc. That a field has characteristic 0 is axiomatized by an infinite list of axioms, one for each prime number $p$, saying that $1+1+\cdots+1\neq 0$ (with $p$ symbols $1$ on the left). We will see below why fields of characteristic 0 must be axiomatized by infinitely many axioms.

That a field is algebraically closed means that every monic polynomial has a root. To express this property, one needs to write down all possible polynomials. However, the language of rings does not give us access to integers, nor to sets of polynomials. Consequently, we must write down an infinite list of axioms, one for each positive integer nn: x1x2xnyyn+x1yn1++xn1y+xn=0\forall x_1\forall x_2\cdots \forall x_n \exists y \quad y^n+x_1 y^{n-1}+\cdots+x_{n-1}y+x_n=0. Here ymy^m is an abbreviation for the product yyyy\cdot y \cdots y of mm factors equal to yy.
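Written out without abbreviations, the first instances of this list of axioms look as follows (the cases $n=2$ and $n=3$, spelled out here purely for illustration):

```latex
\[
\begin{aligned}
n=2:&\quad \forall x_1\,\forall x_2\,\exists y\quad
      y\cdot y + x_1\cdot y + x_2 = 0, \\
n=3:&\quad \forall x_1\,\forall x_2\,\forall x_3\,\exists y\quad
      y\cdot y\cdot y + x_1\cdot y\cdot y + x_2\cdot y + x_3 = 0 .
\end{aligned}
\]
```

Each axiom is a single formula of the language of rings; what cannot be expressed in this language is the quantification over the degree $n$ itself.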

As we will see, the language of rings and the theory ACF of algebraically closed fields are well suited to the study of algebraic geometry.

c) Differential fields

A differential ring/field is a ring/field AA endowed with a derivation  ⁣:AA\partial\colon A\to A, that is, with an additive map satisfying the Leibniz relation (ab)=a(b)+b(a)\partial(ab)=a\partial(b)+b\partial(a). They can be naturally axiomatized in the language of rings augmented with a symbol \partial.
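The basic example of a differential ring is a polynomial ring with the usual derivative. As a minimal sketch, assuming polynomials in one variable $T$ with integer coefficients stored as coefficient lists (lowest degree first), one can check the Leibniz relation on examples; all function names below are illustrative.

```python
def normalize(a):
    """Strip trailing zero coefficients."""
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a

def add(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]

def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def d(a):
    """Formal derivative d/dT, the derivation of the ring Z[T]."""
    return [i * a[i] for i in range(1, len(a))]

def leibniz_holds(a, b):
    """Check d(ab) = a d(b) + b d(a) for the two given polynomials."""
    return normalize(d(mul(a, b))) == normalize(add(mul(a, d(b)), mul(b, d(a))))
```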

There is a notion of a differentially closed field, analogous to the notion of an algebraically closed field, but encompassing differential equations. A differential field is differentially closed if any differential equation which has a solution in some differential extension already has a solution. This property is analogous to the consequence of Hilbert's Nullstellensatz according to which a field is algebraically closed if any system of polynomial equations which has a solution in an extension already has a solution. At least in characteristic zero, Robinson showed that their theory $\mathrm{DCF}_0$ can be axiomatized by various families of axioms. For example, the one devised by Blum asserts the existence of an element $x$ such that $P(x)=0$ and $Q(x)\neq0$, for every pair $(P,Q)$ of non-zero differential polynomials in one indeterminate such that the order of $Q$ is strictly smaller than the order of $P$. This study requires the development of important and difficult results in differential algebra due to Ritt and Seidenberg.


At this level, there are two important basic theorems to mention: Gödel's completeness theorem, and the theorems of Löwenheim-Skolem.

Completeness theorem (Gödel). — Let TT be a theory in a language LL. Assume that every finite subset SS of TT admits a model. Then TT admits a model.

There are two classical proofs of this theorem.

The first one uses ultraproducts and consists in choosing a model $M_S$ for every finite subset $S$ of $T$. Let then $\mathcal U$ be an ultrafilter on the set of finite subsets of $T$ which contains, for every finite subset $S_0$ of $T$, the set of all finite subsets $S$ such that $S_0\subset S$ (such an ultrafilter is non-principal as soon as $T$ is infinite), and let $M$ be the ultraproduct of the family of models $(M_S)$ with respect to $\mathcal U$. It inherits functions and relations from those of the models $M_S$, so that it is a structure in the language $L$. Moreover, one deduces from the definition of an ultrafilter that for every axiom $\alpha$ of $T$, the structure $M$ satisfies the axiom $\alpha$. Consequently, $M$ is a model of $T$.

A second proof, due to Henkin, is more syntactical. It considers the set of all terms in the language $L$ (formulas without logical connectors), together with an equivalence relation that equates two terms for which some axiom says that they are equal, and with symbols representing objects of which an axiom affirms the existence. The quotient set modulo the equivalence relation is a model. In essence, this proof is very close to the construction of a free group as words.

It is important to observe that the proof of this theorem uses the existence of non-principal ultrafilters, which is a weak form of the axiom of choice. In fact, as in all classical mathematics, the axiom of choice (and set theory in general) is used in model theory to establish theorems. That does not prevent logicians from studying the model theory of set theory without choice as a particular mathematical theory, but even to do that, one uses choice.

Theorem of Löwenheim-Skolem. Let $T$ be a theory in a language $L$. If it admits an infinite model $M$, then it admits a model in every cardinality $\geq \sup(\mathop{\rm Card}(L),\aleph_0)$.

To show the existence of a model of cardinality $\geq\kappa$, one enlarges the language $L$ and the theory $T$ by adding symbols $c_i$, indexed by a set of cardinality $\kappa$, and the axioms $c_i\neq c_j$ if $i\neq j$, giving rise to a theory $T'$ in a language $L'$. A structure for $L'$ is a structure for $L$ together with distinguished elements $c_i$; such a structure is a model of $T'$ if and only if it is a model of $T$ and if the elements $c_i$ are pairwise distinct. If the initial theory $T$ has an infinite model, then this model is a model of every finite fragment of the theory $T'$, because there are only finitely many axioms of the form $c_i\neq c_j$ to satisfy, and the model is assumed to be infinite. By Gödel's completeness theorem, the theory $T'$ has a model $M'$; forgetting the choice of distinguished elements, $M'$ is a model of the theory $T$, but the mere existence of the elements $c_i$ forces its cardinality to be at least $\kappa$.

To show that there exists a model of cardinality exactly κ\kappa (assumed to be larger than sup(Card(L),0)\sup(\mathop{\rm Card}(L),\aleph_0)), one starts from a model MM of cardinality κ\geq\kappa and defines a substructure by induction, starting from the constant symbols and adding step by step only the elements which are required by the function symbols, the axioms and the elements already constructed. This construction furnishes a model of TT whose cardinality is equal to κ\kappa.


Link to Part 2 — Definable sets, types; quantifier elimination

Monday, March 23, 2015

When Lagrange meets Galois

Jean-Benoît Bost told me a beautiful proof of the main ingredient in the proof of the Galois correspondence, which had been published by Lagrange in his 1772 “Réflexions sur la résolution algébrique des équations”, almost 60 years before Galois. (See Section 4 of that paper, I think; it is often difficult to recognize our modern mathematics in the language of these old masters.)

In modernized notations, Lagrange considers the following situation. He is given a polynomial equation Tn+an1Tn1++a0=0 T^n + a_{n-1} T^{n-1}+\cdots + a_0 = 0, with roots x1,,xnx_1,\dots,x_n, and two “rational functions” of its roots  f(x1,,xn)f(x_1,\dots,x_n) and ϕ(x1,,xn)\phi(x_1,\dots,x_n). (This means that ff and ϕ\phi are the evaluation at the nn-tuple (x1,,xn)(x_1,\dots,x_n) of two rational functions in nn variables.) Lagrange says that ff and ϕ\phi are similar (“semblables”) if every permutation of the roots which leaves f(x1,,xn)f(x_1,\dots,x_n) unchanged leaves ϕ(x1,,xn)\phi(x_1,\dots,x_n) unchanged as well (and conversely). He then proves that ϕ(x1,,xn)\phi(x_1,\dots,x_n) is a rational function of a0,,an1a_0,\dots,a_{n-1} and f(x1,,xn)f(x_1,\dots,x_n).

Let us restate this in a more modern language. Let KLK\to L be a finite Galois extension of fields, in the sense that K=LGK= L^{G}, where G=AutK(L)G=\mathop{\rm Aut}_K(L). Let a,bLa, b\in L and let us assume that every element gGg\in G which fixes aa fixes bb as well; then Lagrange proves that bK(a)b\in K(a).

Translated into our language, his proof could go as follows. In formulas, the assumption is that $g\cdot a=a$ implies $g\cdot b=b$; consequently, there exists a unique function $\phi\colon G\cdot a\to G\cdot b$ which is $G$-equivariant and maps $a$ to $b$. Let $d=\mathop{\rm Card}(G\cdot a)$ and let us consider Lagrange's interpolation polynomial, that is, the unique polynomial $P\in L[T]$ of degree $<d$ such that $P(x)=\phi(x)$ for every $x\in G\cdot a$. If $h\in G$, the polynomial $P^h$ obtained by applying $h$ to the coefficients of $P$ has degree $<d$ and, since $\phi$ is $G$-equivariant, satisfies $P^h(h\cdot x)=h\cdot P(x)=\phi(h\cdot x)$ for every $x\in G\cdot a$, so that it coincides with $\phi$ on $G\cdot a$; consequently, $P^h=P$. Since $K=L^G$, this shows that $P$ belongs to $K[T]$; moreover $b=P(a)$, hence $b\in K(a)$, as claimed.
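Here is a small numerical illustration of this argument, under assumptions chosen for simplicity: $K=\mathbf Q$, $L=\mathbf Q(\sqrt 2)$, so that $G$ consists of the identity and the conjugation $\sqrt2\mapsto-\sqrt2$, and $a$ is not rational, so that its orbit has two elements and the interpolation polynomial has degree less than 2. Elements $u+v\sqrt2$ are stored as pairs `(u, v)` of fractions; all names are hypothetical.

```python
from fractions import Fraction as Fr

# An element u + v*sqrt(2) of L = Q(sqrt 2) is stored as the pair (u, v).
def sub(x, y):
    return (x[0] - y[0], x[1] - y[1])

def mul(x, y):
    return (x[0] * y[0] + 2 * x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def inv(x):
    norm = x[0] * x[0] - 2 * x[1] * x[1]   # nonzero whenever x != 0
    return (x[0] / norm, -x[1] / norm)

def conj(x):
    """The nontrivial element of G: sqrt(2) -> -sqrt(2)."""
    return (x[0], -x[1])

def interpolate_through_orbit(a, b):
    """Coefficients (c0, c1) of P = c0 + c1*T with P(a) = b and P(conj a) = conj b.

    Assumes that a is not in Q, so that its orbit {a, conj a} has two elements.
    """
    a2, b2 = conj(a), conj(b)
    if a2 == a:
        raise ValueError("a must not be rational")
    c1 = mul(sub(b, b2), inv(sub(a, a2)))
    c0 = sub(b, mul(c1, a))
    return c0, c1

a = (Fr(1), Fr(1))                   # a = 1 + sqrt(2)
b = mul(a, a)                        # b = a^2 = 3 + 2*sqrt(2)
c0, c1 = interpolate_through_orbit(a, b)
```

In this example both coefficients have vanishing $\sqrt2$-component, that is, $P$ has rational coefficients, as the argument predicts; concretely, $P=1+2T$, reflecting the relation $a^2=2a+1$.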

Combined with the primitive element theorem, this allows one to give another short, and fairly elementary, presentation of the Galois correspondence.

Saturday, February 28, 2015

Galois Theory, Geck's style

This note aims at popularizing a short note of Meinolf Geck, On the characterization of Galois extensions, Amer. Math. Monthly 121 (2014), no. 7, 637–639 (Article, Math Reviews, arXiv), that proposes a radical shortcut to the treatment of Galois theory at an elementary level. The proof of the pudding is in the eating, so let's see how it works. The novelty lies in theorem 2, but I give the full story so as to be sure that I do not hide something under the rug.

Proposition 1. Let $K\to L$ be a finite field extension. Then $L$ is not the union of finitely many subfields $M$ such that $K\to M\subsetneq L$.
Proof. It splits into two parts, according to whether $K$ is finite or infinite.

Assume that $K$ is finite and let $q=\mathop{\rm Card}(K)$. Then $L$ is finite as well; let $n=[L:K]$, so that $\mathop{\rm Card}(L)=q^n$. If $M$ is a subextension of $L$, then $\mathop{\rm Card}(M)=q^m$ for some integer $m$ dividing $n$; moreover, $x^{q^m}=x$ for every $x\in M$, so that $M$ is the set of roots of $T^{q^m}-T$ in $L$ and there is at most one subextension of each cardinality. Consequently, the union of all strict subextensions of $L$ has cardinality at most $\sum_{m=1}^{n-1} q^m =\frac{q^n-q}{q-1}<q^n$.
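The counting estimate is a plain geometric sum, and it can be checked numerically. The sketch below simply evaluates the bound $\sum_{m=1}^{n-1}q^m$ and compares it with $q^n$; the function name is illustrative.

```python
def strict_subfields_union_bound(q, n):
    """The bound sum_{m=1}^{n-1} q^m on the size of the union of
    all strict subextensions of a field with q^n elements."""
    return sum(q ** m for m in range(1, n))

def bound_is_strict(q, n):
    """Check the closed form (q^n - q)/(q - 1) and the inequality < q^n."""
    bound = strict_subfields_union_bound(q, n)
    return bound * (q - 1) == q ** n - q and bound < q ** n
```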

It remains to treat the case where $K$ is infinite; then the proposition follows from the fact that a finite union of strict subspaces of a $K$-vector space $E$ is not equal to $E$. Let indeed $(E_i)_{1\leq i\leq n}$ be a family of strict subspaces of $E$ and let us prove by induction on $n$ that $E\neq \bigcup_{i=1}^n E_i$. The cases $n\leq1$ are obvious. By induction, we know that for every $j\in\{1,\dots,n\}$, the union $\bigcup_{i\neq j}E_i$ is distinct from $E$; hence we may select an element $x\in E$ such that $x\not\in E_2\cup \dots\cup E_n$. The desired result follows if, by chance, $x\not\in E_1$. Otherwise, choose $y\in E\setminus E_1$. For $s\neq t\in K$ and $i\in\{2,\dots,n\}$, observe that $y+sx$ and $y+tx$ cannot both belong to $E_i$, for this would imply that $(s-t)x\in E_i$, hence $x\in E_i$ since $s\neq t$. Consequently, there are at most $n-1$ elements $s\in K$ such that $y+sx\in \bigcup_{i=2}^nE_i$. Since $K$ is infinite, there exists $s\in K$ such that $y+sx\not\in\bigcup_{i=2}^n E_i$. Then $y+sx\not\in E_1$ either, since $x\in E_1$ and $y\not\in E_1$. This proves that $E\neq \bigcup_{i=1}^nE_i$.
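The induction in this argument is effective, and it can be turned into a short recursive procedure. The following Python sketch assumes $K=\mathbf Q$ and $E=\mathbf Q^{\dim}$, with each strict subspace described by a non-empty list of linear forms, at least one of them nonzero, whose common kernel it is; this representation and the function names are choices made for the illustration.

```python
from fractions import Fraction

def in_subspace(v, forms):
    """v lies in the subspace iff every defining linear form vanishes at v."""
    return all(sum(c * x for c, x in zip(f, v)) == 0 for f in forms)

def standard_basis(dim):
    return [[Fraction(int(i == j)) for j in range(dim)] for i in range(dim)]

def vector_outside(subspaces, dim):
    """Return a vector of Q^dim lying in none of the given strict subspaces."""
    if not subspaces:
        return [Fraction(0)] * dim
    first, rest = subspaces[0], subspaces[1:]
    x = vector_outside(rest, dim)      # induction: x avoids E_2, ..., E_n
    if not in_subspace(x, first):
        return x
    # Here x lies in E_1; some basis vector y is outside E_1 since E_1 is strict.
    y = next(e for e in standard_basis(dim) if not in_subspace(e, first))
    s = 0
    while True:                        # at most len(rest) values of s can fail
        v = [yi + s * xi for yi, xi in zip(y, x)]
        if not any(in_subspace(v, E) for E in rest):
            return v                   # v is outside E_1 too: x is in E_1, y is not
        s += 1
```

For example, in $\mathbf Q^3$ one may call `vector_outside` on the three coordinate planes, the plane $x=y$ and the $z$-axis; the search over $s$ terminates because at most $n-1$ values of $s$ are excluded at each stage.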

Let $K\to L$ be a field extension and let $P\in K[T]$. We say that $P$ is split in $L$ if it is a product of linear factors in $L[T]$. We say that $P$ is separable if all of its roots (in some extension where it is split) have multiplicity $1$. We say that $K\to L$ is a splitting extension of $P$ if $P$ is split in $L$ and if $L$ is generated over $K$ by the roots of $P$ in $L$. Finally, we let $\mathop{\rm Aut}_K(L)$ be the set of $K$-linear field automorphisms of $L$; it is a group under composition.

Theorem 2. Let KLK\to L be a finite extension of fields and let G=AutK(L)G=\mathop{\rm Aut}_K(L). Then Card(G)[L:K]\mathop{\rm Card}( G)\leq [L:K]. Moreover, the following conditions are equivalent:

  1. One has Card(G)=[L:K]\mathop{\rm Card}( G)=[L:K];
  2. There exists an irreducible separable polynomial PK[T]P\in K[T] such that deg(P)=[L:K]\deg(P)=[L:K] and which is split in LL;
  3. The extension KLK\to L is a splitting extension of a separable polynomial in K[T]K[T];
  4. One has K=LGK=L^G.


Remark 3. In the conditions of (2), let us fix a root zLz\in L of PP. One has L=K(z)L=K(z). Moreover, the map ff(z)f\mapsto f(z) is a bijection from AutK(L)\mathop{\rm Aut}_K(L) to the set of roots of PP in LL.

Proof of Theorem 2.
(a) Let us prove that Card(G)[L:K]\mathop{\rm Card} (G)\leq [L:K]. Let mNm\in\mathbf N be such that mCard(G)m\leq \mathop{\rm Card}( G) and let σ1,,σm\sigma_1,\dots,\sigma_m be distinct elements of GG. For 1i<jm1\leq i<j\leq m, let Mi,jM_{i,j} be the subfield of LL consisting of all xLx\in L such that σi(x)=σj(x)\sigma_i(x)=\sigma_j(x). It is a strict subextension of LL because σiσj\sigma_i\neq\sigma_j. Consequently, LL is not the union of the subfields Mi,jM_{i,j} and there exists an element zLz\in L such that σi(z)σj(z)\sigma_i(z)\neq \sigma_j(z) for all iji\neq j. Let PP be the minimal polynomial of zz. Then the set {σ1(z),,σm(z)}\{\sigma_1(z),\dots,\sigma_m(z)\} consists of distinct roots of PP, hence deg(P)m\deg(P)\geq m. In particular, m[L:K]m\leq [L:K]. Since this holds for every mCard(G)m\leq \mathop{\rm Card}( G), this shows that Card(G)[L:K]\mathop{\rm Card}( G)\leq [L:K].

(b) If one has Card(G)=[L:K]\mathop{\rm Card}( G)=[L:K], then taking m=Card(G)m=\mathop{\rm Card}( G), we get an irreducible polynomial PK[T]P\in K[T] of degree mm, with mm distinct roots in LL. Necessarily, PP is separable and split in LL. This gives (1)\Rightarrow(2).

The implication (2)\Rightarrow(3) is obvious.

(1)\Rightarrow(4). Let M=LGM=L^G. One has AutK(L)=AutM(L)=G\mathop{\rm Aut}_K(L)=\mathop{\rm Aut}_M(L)=G. Consequently, Card(G)[L:M]\mathop{\rm Card}(G)\leq [L:M]. Since Card(G)=[L:K]=[L:M][M:K]\mathop{\rm Card}( G)=[L:K]=[L:M][M:K], this forces M=KM=K.

(4)$\Rightarrow$(3). There exists a finite $G$-invariant subset $A$ of $L$ such that $L=K(A)$ (for instance, the union of the $G$-orbits of the elements of a finite generating set; recall that $G$ is finite by (a)). Then $P=\prod_{a\in A}(T-a)$ is split in $L$, and is $G$-invariant. Consequently, $P\in K[T]$. By construction, $P$ is separable and $L$ is a splitting extension of $P$.

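(3)$\Rightarrow$(1). Let $P\in K[T]$ be a separable polynomial of which $K\to L$ is a splitting extension and let $A$ be the set of roots of $P$ in $L$, so that $L=K(A)$. Let $M$ be a subextension of $L$ and let $f\colon M\to L$ be a $K$-morphism. Let $a\in A$, let $Q_a$ be the minimal polynomial of $a$ over $M$ and let $Q_a^f$ be the polynomial obtained by applying $f$ to its coefficients. The association $g\mapsto g(a)$ defines a bijection between the set of extensions of $f$ to $M(a)$ and the set of roots of $Q_a^f$ in $L$. Since $P(a)=0$, the polynomial $Q_a$ divides $P$, hence $Q_a^f$ divides $P^f=P$, so that it is separable and split in $L$. Consequently, $f$ has exactly $\deg(Q_a)=[M(a):M]$ extensions to $M(a)$.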

By a straightforward induction on Card(B)\mathop{\rm Card}(B), for every subset BB of AA, the set of KK-morphisms from K(B)K(B) to LL has cardinality [K(B):K][K(B):K]. When B=AB=A, every such morphism is surjective, hence Card(AutK(L))=[L:K]\mathop{\rm Card}(\mathop{\rm Aut}_K(L))=[L:K].

If these equivalent conditions hold, we say that the finite extension KLK\to L is Galois.

Corollary 4. Let KLK\to L be a finite Galois extension. The maps HLHH\to L^H and MAutM(L)M\to \mathop{\rm Aut}_M(L) are bijections, inverse one of the other, between subgroups of AutK(L)\mathop{\rm Aut}_K(L) and subextensions KMLK\to M\subset L.
Proof. a) For every subextension $K\to M\subset L$, the extension $M\subset L$ is Galois. In particular, $M=L^{\mathop{\rm Aut}_M(L)}$ and $\mathop{\rm Card}(\mathop{\rm Aut}_M(L))=[L:M]$.

b) Let $H\subset\mathop{\rm Aut}_K(L)$ be a subgroup and let $M=L^H$. Then $M\to L$ is a Galois extension and $[L:M]=\mathop{\rm Card}(\mathop{\rm Aut}_M(L))$; moreover, one has $H\subset\mathop{\rm Aut}_M(L)$ by construction. Let us prove that $H=\mathop{\rm Aut}_M(L)$. Let $z\in L$ be an element whose minimal polynomial $P_z$ over $M$ is separable, split in $L$ and of degree $[L:M]$, as in condition (2) of Theorem 2. One has $\mathop{\rm Card}(\mathop{\rm Aut}_M(L))=\deg(P_z)$. On the other hand, the polynomial $Q_z=\prod_{\sigma\in H}(T-\sigma(z))\in L[T]$ divides $P_z$ and is $H$-invariant, hence it belongs to $L^H[T]=M[T]$. Since $P_z$ is irreducible in $M[T]$, this implies that $P_z=Q_z$, hence $\mathop{\rm Card}(H)=\deg(P_z)=\mathop{\rm Card}(\mathop{\rm Aut}_M(L))$. Consequently, $H=\mathop{\rm Aut}_M(L)$.

Corollary 5. Let $K\to L$ be a Galois extension and let $K\to M\to L$ be an intermediate extension. The extension $M\to L$ is Galois too. Moreover, the following assertions are equivalent:

  1. The extension $K\to M$ is Galois;
  2. $\mathop{\rm Aut}_M(L)$ is a normal subgroup of $\mathop{\rm Aut}_K(L)$;
  3. For every $\sigma\in\mathop{\rm Aut}_K(L)$, one has $\sigma(M)\subset M$.

Proof. (a) Let $P\in K[T]$ be a separable polynomial of which $K\to L$ is a splitting extension. Then $M\to L$ is a splitting extension of $P$ as well, hence $M\to L$ is Galois.

(b) (1)$\Rightarrow$(2): Let $\sigma\in \mathop{\rm Aut}_K(L)$. Let $z$ be any element of $M$ and let $P\in K[T]$ be its minimal polynomial. One has $P(\sigma(z))=\sigma(P(z))=0$, hence $\sigma(z)$ is a root of $P$; since $K\to M$ is Galois, $P$ is split in $M$, so that $\sigma(z)\in M$. Consequently, the restriction of $\sigma$ to $M$ is a $K$-morphism from $M$ to itself; it is necessarily a $K$-automorphism. We have thus defined a map from $\mathop{\rm Aut}_K(L)$ to $\mathop{\rm Aut}_K(M)$; this map is a morphism of groups. Its kernel is $\mathop{\rm Aut}_M(L)$, so that this group is normal in $\mathop{\rm Aut}_K(L)$.

(2)$\Rightarrow$(3): Let $\sigma\in\mathop{\rm Aut}_K(L)$ and let $H=\sigma\mathop{\rm Aut}_M(L)\sigma^{-1}$. By construction, one has $\sigma(M)\subset L^H$. On the other hand, the hypothesis that $\mathop{\rm Aut}_M(L)$ is normal in $\mathop{\rm Aut}_K(L)$ implies that $H=\mathop{\rm Aut}_M(L)$, so that $L^H=M$. We have thus proved that $\sigma(M)\subset M$.

(3)$\Rightarrow$(1): Let $A$ be a finite subset of $M$ such that $M=K(A)$ and let $B$ be its orbit under $\mathop{\rm Aut}_K(L)$. The polynomial $\prod_{b\in B}(T-b)$ is separable and invariant under $\mathop{\rm Aut}_K(L)$, hence belongs to $K[T]$. By assumption, one has $B\subset M$, so that $M=K(B)$ is a splitting extension of this polynomial. This implies that $K\to M$ is Galois.
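A standard example where the three assertions of Corollary 5 fail together may help. The fields and the automorphisms $\sigma,\tau$ below are chosen for illustration and are not taken from the text above.

```latex
\[
K=\mathbf Q,\qquad L=\mathbf Q(\alpha,\omega),\qquad M=\mathbf Q(\alpha),
\qquad \alpha^3=2,\quad \omega^2+\omega+1=0 .
\]
The extension $K\to L$ is a splitting extension of the separable polynomial
$T^3-2=(T-\alpha)(T-\omega\alpha)(T-\omega^2\alpha)$, hence it is Galois and
$\mathop{\rm Card}(\mathop{\rm Aut}_K(L))=[L:K]=6$.
Let $\sigma,\tau\in\mathop{\rm Aut}_K(L)$ be given by
\[
\sigma(\alpha)=\omega\alpha,\quad \sigma(\omega)=\omega,
\qquad
\tau(\alpha)=\alpha,\quad \tau(\omega)=\omega^2 .
\]
Then $\mathop{\rm Aut}_M(L)=\{\mathrm{id},\tau\}$, and
\[
\sigma(M)=\mathbf Q(\omega\alpha)\neq M,
\qquad
\sigma\tau\sigma^{-1}(\alpha)=\omega^2\alpha\neq\alpha .
\]
```

So assertion 3 fails for this $\sigma$, the conjugate $\sigma\tau\sigma^{-1}$ does not belong to $\mathop{\rm Aut}_M(L)$, which is therefore not normal, and $K\to M$ is not Galois: the polynomial $T^3-2$ has only one root in $M$.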

Remark 6. Let $L$ be a field, let $G$ be a finite group of automorphisms of $L$ and let $K=L^G$. Every element $a$ of $L$ is algebraic and separable over $K$; indeed, $a$ is a root of the separable polynomial $\prod_{b\in G\cdot a}(T-b)$, which is $G$-invariant, hence belongs to $K[T]$. There exists a finite extension $M$ of $K$, contained in $L$, such that $G\cdot M=M$ and such that the map $\mathop{\rm Aut}_K(L)\to \mathop{\rm Aut}_K(M)$ is injective. Then $K\to M$ is Galois, and $G=\mathop{\rm Aut}_K(M)$. Indeed, one has $G\subset\mathop{\rm Aut}_K(M)$, hence $K\subset M^{\mathop{\rm Aut}_K(M)}\subset M^G\subset L^G=K$. This implies that $K\to M$ is Galois and the Galois correspondence then implies $G=\mathop{\rm Aut}_K(M)$. The argument applies to every finite extension of $K$ which contains $M$. Consequently, they all have degree $\mathop{\rm Card}(G)$; necessarily, $L=M$.

Remark 7 (edits). Matt Baker points out that the actual novelty of the treatment lies in theorem 2; the rest is standard. Also, remark 6 has been edited following an observation of Christian Naumovic that it is not a priori obvious that the extension $K\to L$ is finite.

Monday, January 26, 2015

Vijay Iyer and Wadada Leo Smith at The Stone

I just had the chance to attend two sets with Vijay Iyer and Wadada Leo Smith tonight! That happened at The Stone, a small music room in NYC owned by John Zorn that features avant-garde jazz music (but not only).

The first set was a plain duet of these two artists. The Stone was packed and we had to sit on the floor. After 10 quite boring minutes during which Vijay played electronics only, he moved to the piano and music emerged. Although Vijay had sheets of music prepared, this set sounded very free, especially Wadada Leo Smith's playing; he seemed to use everything a trumpet allows to create sound. However, the atmosphere was peaceful. For those who know some of Wadada Leo Smith's music, this was closer to Kulture Jazz than to Ten Freedom Summers, which I had discussed on this blog last year.

For the second set, they were joined by Reggie Workman on bass, Nitin Mitta on tablas, and Patricia Franceschy on vibes. This made the music sound quite different. The musicians had decided on a few melodic lines and ostinatos, and grooved on them. The tablas gave a wonderful color to the music, similar to the one on Tirtha (with Prasanna on guitar and Nitin Mitta on tablas). The vibes also added a good touch. It seems that there are nice vibes players in free jazz nowadays; I'm thinking for example of Jason Adasiewicz, who plays in Nicole Mitchell's Ice Crystals group.

It was my first night in New York City in two years. I am happy to have had the opportunity to hear these great artists. Tomorrow night, if the announced snow storm permits, I'll go listen to Ari Hoenig at Smalls!