Saturday, June 11, 2016

Triviality of vector bundles with connections on simply connected varieties

I would like to discuss today a beautiful theorem of Grothendieck concerning differential equations. It was mentioned by Yves André in a wonderful talk at IHÉS in March 2016 and Hélène Esnault kindly explained its proof to me during a nice walk in the Bavarian Alps last April... The statement is as follows:

Theorem (Grothendieck, 1970). — Let $X$ be a smooth projective complex algebraic variety. Assume that $X$ is simply connected. Then every vector bundle with an integrable connection on $X$ is trivial.

Let indeed $(E,\nabla)$ be a vector bundle of rank $n$ with an integrable connection on $X$ and let us show that it is trivial, namely, that there exist $n$ global sections $e_1,\dots,e_n$ of $E$ which are horizontal ($\nabla e_i=0$) and form a basis of $E$ at each point.

Considering the associated analytic picture, we get a vector bundle $(E^{\mathrm{an}},\nabla)$ with an integrable connection on the analytic manifold $X(\mathbf C)$. Let $x\in X(\mathbf C)$. By the theory of linear differential equations, this furnishes a representation $\rho$ of the topological fundamental group $\pi_1(X(\mathbf C),x)$ in the fiber $E_x$ of the vector bundle $E$ at the point $x$. Saying that $(E^{\mathrm{an}},\nabla)$ is trivial on $X(\mathbf C)$ means that this representation $\rho$ is trivial, which seems to be a triviality since $X$ is simply connected.

However, in this statement, simple connectedness is meant in the sense of algebraic geometry, namely that $X$ has no non-trivial finite étale covering. And this is why the theorem can be surprising, for this hypothesis does not imply that $\pi_1(X(\mathbf C),x)$ is trivial, only that it has no non-trivial finite quotient. This is Grothendieck's version of Riemann's existence theorem, proved in SGA 1.

However, it is known that $X(\mathbf C)$ is topologically equivalent to a finite cellular space, so that its fundamental group $\pi_1(X(\mathbf C),x)$ is finitely presented.

Proposition (Malčev, 1940). — Let $G$ be a finitely generated subgroup of $\mathrm{GL}(n,\mathbf C)$. Then $G$ is residually finite: for every finite subset $T$ of $G$ not containing $\mathrm I_n$, there exist a finite group $K$ and a morphism $f\colon G\to K$ such that $T\cap \operatorname{Ker}(f)=\varnothing$.

Consequently, the image of $\rho$ is residually finite. If it were non-trivial, there would exist a non-trivial finite quotient $K$ of $\operatorname{im}(\rho)$, hence a non-trivial finite quotient of $\pi_1(X(\mathbf C),x)$, which, as we have seen, is impossible. Consequently, the image of $\rho$ is trivial and $(E^{\mathrm{an}},\nabla)$ is trivial.

In other words, there exists a basis $(e_1,\dots,e_n)$ of horizontal sections of $E^{\mathrm{an}}$. By Serre's GAGA theorem, $e_1,\dots,e_n$ are in fact algebraic, i.e., induced by actual global sections of $E$ on $X$. By construction, they are horizontal and form a basis of $E$ at each point. Q.E.D.

It now remains to explain the proof of the proposition. Let $S$ be a finite symmetric generating subset of $G$ containing $T$ and not containing $\mathrm I_n$, and let $R$ be the subring of $\mathbf C$ generated by the entries of the elements of $S$ and by the inverses of the non-zero entries of the matrices $s-\mathrm I_n$, for $s\in S$. It is a non-zero finitely generated $\mathbf Z$-algebra; since $S$ is symmetric, the elements of $S$ are contained in $\mathrm {GL}(n,R)$, hence $G$ is a subgroup of $\mathrm{GL}(n,R)$. Let $\mathfrak m$ be a maximal ideal of $R$ and let $k$ be its residue field; the point of the story is that this field $k$ is finite (I'll explain why in a minute). Then the reduction map $R\to k$ induces a morphism of groups $\mathrm{GL}(n,R)\to \mathrm {GL}(n,k)$, hence a morphism $f\colon G\to \mathrm{GL}(n,k)$. By construction, for every $s\in S$, the matrix $s-\mathrm I_n$ has a non-zero entry; this entry is invertible in $R$, hence is mapped to a non-zero element of $k$, so that $f(s)\neq\mathrm I_n$. Consequently, $S$, and in particular $T$, is disjoint from the kernel of $f$, as was to be shown.
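To make the reduction step concrete, here is a minimal Python sketch in a toy situation that is an assumption of this illustration, not part of the proof: the matrices have integer entries, so one can take $R=\mathbf Z$ and simply search for a prime $p$ such that no element of $T$ reduces to the identity modulo $p$. The helper names (`mat_mul`, `is_identity_mod`, `separating_prime`) and the sample matrices are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square integer matrices given as tuples of tuples."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )


def is_identity_mod(m, p):
    """Tell whether the matrix m reduces to the identity matrix modulo p."""
    n = len(m)
    return all(
        (m[i][j] - (1 if i == j else 0)) % p == 0
        for i in range(n)
        for j in range(n)
    )


def separating_prime(T, primes=(2, 3, 5, 7, 11, 13, 17, 19, 23)):
    """Return the first listed prime p such that no matrix of T is trivial mod p."""
    for p in primes:
        if not any(is_identity_mod(m, p) for m in T):
            return p
    return None


# Sample data (hypothetical): two unipotent matrices and a few products.
a = ((1, 2), (0, 1))
b = ((1, 0), (2, 1))
ab = mat_mul(a, b)               # ((5, 2), (2, 1))
a3 = mat_mul(a, mat_mul(a, a))   # ((1, 6), (0, 1))
T = [a, b, ab, a3]

# a is trivial mod 2 and a3 is trivial mod 3, so the search stops at 5.
print(separating_prime(T))  # prints 5
```

Reduction modulo the prime found gives a morphism to the finite group $\mathrm{GL}(2,\mathbf F_p)$ whose kernel avoids $T$; in the general case one cannot search primes of $\mathbf Z$ and has to pass to the ring $R$ as in the proof.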

Lemma. — Let $R$ be a finitely generated $\mathbf Z$-algebra and let $\mathfrak m$ be a maximal ideal of $R$. The residue field $R/\mathfrak m$ is finite.

Proof of the lemma. — This could be summarized by saying that $\mathbf Z$ is a Jacobson ring: if $A$ is a Jacobson ring, then every finitely generated $A$-algebra $K$ which is a field is finite over $A$; in particular, $K$ is a finite extension of a quotient of $A$ which is a field. In the case $A=\mathbf Z$, the quotients of $\mathbf Z$ which are fields are the finite fields $\mathbf F_p$, so that $K$ is a finite extension of a finite field, hence is a finite field. Let us however explain the argument. Let $K$ be the field $R/\mathfrak m$; let us replace $\mathbf Z$ by its quotient $A=\mathbf Z/P$, where $P$ is the kernel of the map $\mathbf Z\to R/\mathfrak m$. There are two cases: either $P=(0)$ and $A=\mathbf Z$, or $P=(p)$, for some prime number $p$, and $A$ is the finite field $\mathbf F_p$; we will eventually see that the first case cannot happen.

Now, $K$ is a field which is a finitely generated algebra over its subring $A$; let $k$ be the fraction field of $A$. The field $K$ is now a finitely generated algebra over its subfield $k$; by Zariski's form of Hilbert's Nullstellensatz, $K$ is a finite algebraic extension of $k$. Let us choose a finite generating subset $S$ of $K$ as an $A$-algebra; each element of $S$ is algebraic over $k$; let us consider the product $f$ of the leading coefficients of their minimal polynomials, chosen to belong to $A[T]$, and let $A'=A[1/f]$. By construction, the elements of $S$ are integral over $A'$, hence $K$ is integral over $A'$. Since $K$ is a field, we deduce that $A'$ is a field. To conclude, we split the discussion into the two cases stated above.

If $P=(p)$, then $A=\mathbf F_p$, hence $k=\mathbf F_p$ as well, and $K$ is a finite extension of $\mathbf F_p$, hence is a finite field.

Let us assume, by contradiction, that $P=(0)$, hence $A=\mathbf Z$ and $k=\mathbf Q$. By what precedes, there exists a non-zero element $f\in\mathbf Z$ such that $\mathbf Q=\mathbf Z[1/f]$. But this cannot be true, because $\mathbf Z[1/f]$ is not a field. Indeed, any prime number which does not divide $f$ is not invertible in $\mathbf Z[1/f]$. This concludes the proof of the lemma.

Remarks. — 1) The theorem does not hold if $X$ is not proper. For example, the affine line $\mathbf A^1_{\mathbf C}$ is simply connected, both algebraically and topologically, but the trivial line bundle $E=\mathscr O_X\cdot e$ endowed with the connection defined by $\nabla (e)=e\,dz$ is not trivial. It is analytically trivial though, but its horizontal analytic sections are of the form $\lambda \exp(-z) e$, for $\lambda\in\mathbf C$, and except for $\lambda=0$, none of them are algebraic.
However, the theorem holds if one assumes moreover that the connection has regular singularities at infinity.

2) The group theoretical property that we used is that on a complex algebraic variety, the monodromy group of a vector bundle with connection is residually finite. It is not always true that the topological fundamental group of a complex algebraic variety is residually finite. Examples have been given by Domingo Toledo in “Projective varieties with non-residually finite fundamental group”, Publications mathématiques de l’I.H.É.S., 77 (1993), p. 103–119.

3) The analogous result in positive characteristic is a conjecture by Johan de Jong formulated in 2010: If $X$ is a projective smooth simply connected algebraic variety over an algebraically closed field of characteristic $p$, then every isocrystal is trivial. It is still open, despite beautiful progress by Hélène Esnault, together with Vikram Mehta and Atsushi Shiho.

Thursday, May 5, 2016

Bourbaki and Felix Klein

A colleague just sent me Xerox copies of a few pages of an 1899 biography of the général Bourbaki. Its author, François Bournand, was the private secretary of Édouard Drumont, an antisemitic writer and journalist. The book would probably not be worth mentioning here were it not for its dedication:

À l'abbé Félix Klein
de l'Institut catholique
Hommage respectueux de son dévoué en N.-S.
François Bournand
Professeur d'histoire de l'art à l'École professionnelle catholique



Abbé is abbot; in this context, it means a Catholic priest without a parish. The French initials N.-S. mean Notre Seigneur, Our Lord. It appears that this Félix Klein (note the accent on the e) also has a Wikipedia page.

Friday, April 29, 2016

Roth's theorems

A few days ago, The Scotsman published a paper about Klaus Roth's legacy, explaining how he donated his fortune (1 million pounds) to various charities. This paper was reported by some friends on Facebook. Yuri Bilu added the mention that he knew two important theorems of Roth, and since one of them did not immediately come to my mind, I decided to write this post.

The first theorem was a 1935 conjecture of Erdős and Turán concerning arithmetic progressions of length 3 that Roth proved in 1952. That is, one is given a set $A$ of positive integers and one looks for triples $(a,b,c)$ of distinct elements of $A$ such that $a+c=2b$; Roth proved that infinitely many such triples exist as soon as the upper density of $A$ is positive, that is:
$$ \limsup_{x\to+\infty} \frac{\mathop{\rm Card}(A\cap [0;x])}x >0. $$
In 1975, Endre Szemerédi proved that such sets of integers contain (finite) arithmetic progressions of arbitrarily large length. Other proofs have been given by Hillel Furstenberg (using ergodic theory) and Tim Gowers (by Fourier/combinatorial methods); Roth had used Hardy-Littlewood's circle method.

In 1976, Erdős strengthened his initial conjecture with Turán and predicted that arithmetic progressions of arbitrarily large length exist in $A$ as soon as
$$ \sum_{a\in A} \frac 1a =+\infty. $$
Such a result is still a conjecture, even for arithmetic progressions of length $3$, but a remarkable particular case has been proved by Ben Green and Terry Tao in 2004, when $A$ is the set of all prime numbers.

Outstanding as these results are (Tao has been given the Fields medal in 2006 and Szemerédi the Abel prize in 2012), the second theorem of Roth was proved in 1955 and was certainly the main reason for awarding him the Fields medal in 1958. Indeed, Roth gave a definitive answer to a long standing question in diophantine approximation that originated from the works of Joseph Liouville (1844). Given a real number $\alpha$, one is interested in rational fractions $p/q$ that are close to $\alpha$, and in the quality of the approximation, namely the exponent $n$ such that $\left| \alpha- \frac pq \right|\leq 1/q^n$. Precisely, the approximation exponent $\kappa(\alpha)$ is the least upper bound of all real numbers $n$ such that the previous inequality has infinitely many solutions in fractions $p/q$, and Roth's theorem asserts that one has $\kappa(\alpha)=2$ when $\alpha$ is an irrational algebraic number.

One part of this result goes back to Dirichlet, showing that for any irrational number $\alpha$, there exist many good approximations with exponent $2$. This can be proved using the theory of continued fractions and is also a classical application of Dirichlet's box principle. Take a positive integer $Q$ and consider the $Q+1$ numbers $q\alpha- \lfloor q\alpha\rfloor$ in $[0,1]$, for $0\leq q\leq Q$; two of them must be less than $1/Q$ apart; this furnishes integers $p',p'',q',q''$, with $0\leq q'<q''\leq Q$ such that $\left| (q''\alpha-p'')-(q'\alpha-p')\right|\leq 1/Q$; then set $p=p''-p'$ and $q=q''-q'$; one has $\left| q\alpha -p \right|\leq 1/Q$, hence $\left|\alpha-\frac pq\right|\leq 1/Qq\leq 1/q^2$.
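The box principle argument is effective, and can be sketched in a few lines of Python. This is only an illustration in floating-point arithmetic (so $\alpha$ is really a rational approximation of an irrational number); the function name `dirichlet_approximation` is hypothetical.

```python
import math


def dirichlet_approximation(alpha, Q):
    """Return (p, q) with 1 <= q <= Q and |q*alpha - p| <= 1/Q (box principle)."""
    # The Q+1 fractional parts q*alpha - floor(q*alpha), for 0 <= q <= Q,
    # lie in [0, 1); once sorted, two neighbours are at most 1/Q apart.
    fracs = sorted((q * alpha - math.floor(q * alpha), q) for q in range(Q + 1))
    (_, q1), (_, q2) = min(
        zip(fracs, fracs[1:]),
        key=lambda pair: pair[1][0] - pair[0][0],
    )
    q_small, q_large = min(q1, q2), max(q1, q2)
    q = q_large - q_small
    p = math.floor(q_large * alpha) - math.floor(q_small * alpha)
    return p, q


alpha = math.sqrt(2)
p, q = dirichlet_approximation(alpha, 100)
print(p, q, abs(alpha - p / q) <= 1 / q**2)
```

Letting $Q$ grow produces infinitely many distinct fractions $p/q$ when $\alpha$ is irrational, since $\left|q\alpha-p\right|$ is then non-zero and tends to $0$.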

To prove an inequality in the other direction, Liouville's argument was that if $\alpha$ is an irrational root of a nonzero polynomial $P\in\mathbf Z[T]$, then $\kappa(\alpha)\leq\deg(P)$. The proof is now standard: let $d=\deg(P)$; given an approximation $p/q$ of $\alpha$, observe that $q^d P(p/q)$ is a non-zero integer (if, say, $P$ is irreducible), so that $\left| q^d P(p/q)\right|\geq 1$. On the other hand, $P(p/q)\approx (p/q-\alpha) P'(\alpha)$, hence an inequality $\left|\alpha-\frac pq\right|\gg q^{-d}$.
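As a worked instance of Liouville's argument (an illustration with $\alpha=\sqrt 2$ and $P=T^2-2$ chosen as an example; the constant obtained is not claimed to be optimal), take a fraction $p/q$ with $q\geq 1$ and $\left|\sqrt2-\frac pq\right|\leq 1$. Then $0<\sqrt 2+\frac pq\leq 2\sqrt2+1$, and $p^2-2q^2$ is a non-zero integer because $\sqrt 2$ is irrational, so that
$$ \left|\sqrt2-\frac pq\right| = \frac{\left|p^2-2q^2\right|}{q^2\left(\sqrt2+\frac pq\right)} \geq \frac{1}{(2\sqrt2+1)\,q^2}. $$
Consequently, for every $n>2$, the inequality $\left|\sqrt2-\frac pq\right|\leq 1/q^n$ has only finitely many solutions, and $\kappa(\sqrt2)\leq 2$.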

This result has been generalized, first by Axel Thue in 1909 (who proved an inequality $\kappa(\alpha)\leq \frac12 d+1$), then by Carl Ludwig Siegel, and by Freeman Dyson in 1947 (showing respectively $\kappa(\alpha)\leq 2\sqrt d$ and $\kappa(\alpha)\leq\sqrt{2d}$). While Liouville's result was based on the minimal polynomial of $\alpha$, these generalisations required polynomials in two variables, and the non-vanishing of a quantity such as $q^dP(p/q)$ above was definitely less trivial. Roth's proof made use of polynomials of arbitrarily large degree, and his remarkable achievement was a proof of the required non-vanishing result.

Roth's proof was “elementary”, making use only of polynomials and wronskians. There are today more geometric proofs, such as the one by Hélène Esnault and Eckart Viehweg (1984) or Michael Nakamaye's subsequent proof (1995) which is based on Faltings's product theorem.

What is still missing, however, is the proof of an effective version of Roth's theorem, that would give, given any real number $n>\kappa(\alpha)$, an actual integer $Q$ such that every rational fraction $p/q$ in lowest terms such that $\left|\alpha-\frac pq\right|\leq 1/q^n$ satisfies $q\leq Q$. It seems that this defect lies at the very heart of almost all of the current approaches in diophantine approximations...

Wednesday, April 13, 2016

Weierstrass's approximation theorem

I had to mentor an Agrégation leçon entitled Examples of dense subsets. For my own edification (and that of the masses), I want to try to record here as many proofs of the Weierstrass density theorem as I can: Every complex-valued continuous function on the closed interval $[-1;1]$ can be uniformly approximated by polynomials. I'll also include as a bonus the trigonometric variant: Every complex-valued continuous and $2\pi$-periodic function on $\mathbf R$ can be uniformly approximated by trigonometric polynomials.

1. Using the Stone theorem.

This 1937–1948 theorem is probably the final conceptual brick to the edifice of which Weierstrass laid the first stone in 1885. It asserts that a subalgebra of continuous functions on a compact completely regular (e.g., metric) space is dense for the uniform norm if and only if it separates points. In all presentations that I know of, its proof requires establishing that the absolute value function can be uniformly approximated by polynomials on $[-1;1]$:
  • Stone truncates the power series expansion of the function $$ x\mapsto \sqrt{1-(1-x^2)}=\sum_{n=0}^\infty \binom{1/2}n (x^2-1)^n, $$ bounding by hand the error term.
  • Bourbaki (Topologie générale, X, p. 36, lemme 2) follows a more elementary approach and begins by proving that the function $x\mapsto \sqrt x$ can be uniformly approximated by polynomials on $[0;1]$. (The absolute value function is recovered since $\mathopen|x\mathclose|=\sqrt{x^2}$.) To this aim, he introduces the sequence of polynomials given by $p_0=0$ and $p_{n+1}(x)=p_n(x)+\frac12\left(x-p_n(x)^2\right)$ and proves by induction the inequalities $$ 0\leq \sqrt x-p_n(x) \leq \frac{2\sqrt x}{2+n\sqrt x} \leq \frac 2n $$ for $x\in[0;1]$ and $n\geq 1$. This implies the desired result.
The algebra of polynomials separates points on the compact set $[-1;1]$, hence is dense. To treat the case of trigonometric polynomials, consider Laurent polynomials on the unit circle.
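The recursion used by Bourbaki is easy to experiment with. The following Python sketch (an illustration only; the names `bourbaki_sqrt_poly` and `check_bounds` are hypothetical) evaluates $p_n(x)$ numerically, without expanding the polynomial, and checks the stated inequalities on a grid of $[0;1]$, up to a small floating-point tolerance.

```python
import math


def bourbaki_sqrt_poly(n, x):
    """Evaluate p_n(x), where p_0 = 0 and p_{k+1} = p_k + (x - p_k**2) / 2."""
    p = 0.0
    for _ in range(n):
        p = p + 0.5 * (x - p * p)
    return p


def check_bounds(n, samples=200, tol=1e-12):
    """Check 0 <= sqrt(x) - p_n(x) <= 2 sqrt(x) / (2 + n sqrt(x)) on a grid."""
    for i in range(samples + 1):
        x = i / samples
        gap = math.sqrt(x) - bourbaki_sqrt_poly(n, x)
        upper = 2 * math.sqrt(x) / (2 + n * math.sqrt(x))
        if not (-tol <= gap <= upper + tol):
            return False
    return True


print(all(check_bounds(n) for n in (1, 5, 50, 500)))
print(abs(bourbaki_sqrt_poly(200, 0.25) - 0.5))
```

The convergence is slow (of order $1/n$ in the worst case), but only uniform convergence matters for the proof.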

2. Convolution.

Consider an approximation $(\rho_n)$ of the Dirac distribution, i.e., a sequence of continuous, nonnegative and compactly supported functions on $\mathbf R$ such that $\int\rho_n=1$ and such that for every $\delta>0$, $\int_{\mathopen| x\mathclose|>\delta} \rho_n(x)\,dx\to 0$. Given a continuous function $f$ on $\mathbf R$, form the convolutions defined by $f*\rho_n(x)=\int_{\mathbf R} \rho_n(t) f(x-t)\, dt$. It is classical that $f*\rho_n$ converges to $f$ uniformly on every compact set.

Now, given a continuous function $f$ on $[-1;1]$, one can extend it to a continuous function with compact support on $\mathbf R$ (defining $f$ to be affine linear on $[-2;-1]$ and on $[1;2]$, and to be zero outside of $[-2;2]$). We want to choose $\rho_n$ so that $f*\rho_n$ is a polynomial on $[-1;1]$. The basic idea is just to choose a parameter $a>0$, and to take $\rho_n(x)= c_n (1-(x/a)^2)^n$ for $\mathopen|x\mathclose|\leq a$ and $\rho_n(x)=0$ otherwise, with $c_n$ adjusted so that $\int\rho_n=1$. Let us write $f*\rho_n(x)=\int_{-2}^2 \rho_n(x-t) f(t)\, dt$; if $x\in[-1;1]$ and $t\in[-2;2]$, then $x-t\in [-3;3]$ so we just need to be sure that $\rho_n$ is a polynomial on that interval, which we get by taking, say, $a=3$. This shows that the restriction of $f*\rho_n$ to $[-1;1]$ is a polynomial function, and we're done.

This approach is more or less that of D. Jackson (“A Proof of Weierstrass's Theorem,” Amer. Math. Monthly, 1934). The difference is that he considers continuous functions on a closed interval contained in $\mathopen]0;1\mathclose[$ which he extends linearly to $[0;1]$ so that they vanish at $0$ and $1$; he considers the same convolution, taking the parameter $a=1$.

Weierstrass's own proof (“Über die analytische Darstellbarkeit sogenannter willkürlicher Functionen einer reellen Veränderlichen,” Sitzungsberichte der Königlich Preussischen Akademie der Wissenschaften zu Berlin, 1885) was slightly more sophisticated: he first showed approximation by convolution with the Gaussian kernel defined by $\rho_n(t) =\sqrt{ n} e^{- \pi n t^2}$, and then expanded the kernel as a power series, a suitable truncation of which furnishes the desired polynomials.

As shown by Jackson, the same approach works easily (in a sense, more easily) for $2\pi$-periodic functions, considering the kernel defined by $\rho_n(x)=c_n(1+\cos(x))^n$, where $c_n$ is chosen so that $\int_{-\pi}^\pi \rho_n=1$.

3. Bernstein polynomials.

Take a continuous function $f$ on $[0;1]$ and, for $n\geq 1$, set $$ B_nf(x) = \sum_{k=0}^n f(k/n) \binom nk x^k (1-x)^{n-k}. $$ It is classical that $B_nf$ converges uniformly to $f$ on $[0;1]$.
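Bernstein polynomials are simple to evaluate directly from their definition. Here is a minimal Python sketch (illustrative; the names `bernstein` and `sup_error` are hypothetical, and the uniform error is only estimated on a finite grid) applied to the function $x\mapsto\mathopen|x-\frac12\mathclose|$, which is continuous but not differentiable at $\frac12$.

```python
from math import comb


def bernstein(f, n, x):
    """Evaluate the n-th Bernstein polynomial of f at x, for n >= 1."""
    return sum(
        f(k / n) * comb(n, k) * x**k * (1 - x) ** (n - k) for k in range(n + 1)
    )


def sup_error(f, n, samples=200):
    """Estimate sup |f - B_n f| on [0, 1] using a uniform grid."""
    return max(
        abs(f(i / samples) - bernstein(f, n, i / samples))
        for i in range(samples + 1)
    )


def kink(t):
    """Test function with a corner at 1/2."""
    return abs(t - 0.5)


for n in (10, 50, 200):
    print(n, sup_error(kink, n))
```

The estimated error decreases slowly with $n$, which is typical of Bernstein polynomials near a corner of $f$.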

There are two classical proofs of Bernstein's theorem. One is probabilistic and consists in observing that $B_nf(x)$ is the expected value of $f(S_n/n)$, where $S_n$ is the sum of $n$ i.i.d. Bernoulli random variables with parameter $x\in[0;1]$. Another (generalized as the Korovkin theorem, On convergence of linear positive operators in the space of continuous functions, Dokl. Akad. Nauk SSSR (N.S.), vol. 90) consists in showing (i) that for $f=1,x,x^2$, $B_nf$ converges uniformly to $f$ (an explicit calculation), (ii) that if $f\geq 0$, then $B_nf\geq 0$ as well, and (iii) in squeezing $f$, for every $x\in[0;1]$, in between two quadratic polynomials $f^+$ and $f^-$ such that $f^+(x)-f^-(x)$ is as small as desired.

A trigonometric variant would be given by Fejér's theorem that the Cesàro averages of a Fourier series of a continuous, $2\pi$-periodic function converge uniformly to that function. In turn, Fejér's theorem can be proved in both ways, either by convolution (the Fejér kernel is nonnegative), or by a Korovkin-type argument (replacing $1,x,x^2$ on $[0;1]$ by $1,z,z^2,z^{-1},z^{-2}$ on the unit circle).


4. Using approximation by step functions.

This proof originates with a paper of H. Kuhn, “Ein elementarer Beweis des Weierstrasschen Approximationssatzes,” Arch. Math. 15 (1964), p. 316–317.

Let us show that for every $\delta\in\mathopen]0,1\mathclose[$ and every $\varepsilon>0$, there exists a polynomial $p$ satisfying the following properties:
  • $0\leq p(x)\leq \varepsilon$ for $-1\leq x\leq-\delta$;
  • $0\leq p(x)\leq 1$ for $-\delta\leq x\leq \delta$;
  • $1-\varepsilon\leq p(x)\leq 1$ for $\delta\leq x\leq 1$.
In other words, these polynomials approximate the (discontinuous) function $f$ on $[-1;1]$ defined by $f(x)=0$ for $x< 0$, $f(x)=1$ for $x> 0$ and $f(0)=1/2$.

A possible formula is $p(x)=\left(1- ((1-x)/2)^n\right)^{2^n}$, where $n$ is a large enough integer. First of all, one has $0\leq (1-x)/2\leq 1$ for every $x\in[-1;1]$, so that $0\leq p(x)\leq 1$. Let $x\in[-1;-\delta]$; then one has $(1-x)/2\geq (1+\delta)/2$, hence $p(x)\leq (1-((1+\delta)/2)^n)^{2^n}$, which can be made arbitrarily small when $n\to\infty$. Let finally $x\in[\delta;1]$; then $(1-x)/2\leq (1-\delta)/2$, hence $p(x)\geq (1-((1-\delta)/2)^n)^{2^n}\geq 1- (1-\delta)^n$, which can be made arbitrarily close to $1$ when $n\to\infty$.
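These estimates can be checked numerically. The Python sketch below (illustrative only; the name `kuhn_poly` and the sample values $\delta=0.2$, $n=20$ are choices made for this example) evaluates the polynomial $\left(1-((1-x)/2)^n\right)^{2^n}$ and compares it with the two bounds on both sides of the gap $[-\delta;\delta]$.

```python
def kuhn_poly(n, x):
    """Evaluate (1 - ((1 - x) / 2)**n) ** (2**n) for x in [-1, 1]."""
    u = (1 - x) / 2
    return (1 - u**n) ** (2**n)


delta, n = 0.2, 20

# Upper bound on [-1, -delta] and lower bound on [delta, 1] from the text.
left_bound = (1 - ((1 + delta) / 2) ** n) ** (2**n)
right_bound = 1 - (1 - delta) ** n

left_points = [-1 + i * (1 - delta) / 50 for i in range(51)]
right_points = [delta + i * (1 - delta) / 50 for i in range(51)]

print(max(kuhn_poly(n, x) for x in left_points), left_bound)
print(min(kuhn_poly(n, x) for x in right_points), right_bound)
```

With these values the polynomial is numerically indistinguishable from $0$ on $[-1;-\delta]$ and within about $0.012$ of $1$ on $[\delta;1]$; taking $n$ larger sharpens the step.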

By translation and dilation, the discontinuity can be placed at any element of $[-1;1]$. Let now $f$ be an arbitrary step function and let us write it as a linear combination $f=\sum a_i f_i$, where $f_i$ is a $\{0,1\}$-valued step function. For every $i$, let $p_i$ be a polynomial that approximates $f_i$ as given above. The linear combination $\sum a_i p_i$ approximates $f$ with maximal error $\sup(\mathopen|a_i\mathclose|)$.

Using uniform continuity of continuous functions on $[-1;1]$, every continuous function can be uniformly approximated by a step function. This concludes the proof.

5. Using approximation by piecewise linear functions.

As in the proof of Stone's theorem, one uses the fact that the function $x\mapsto \mathopen|x\mathclose|$ is uniformly approximated by a sequence of polynomials on $[-1;1]$. Consequently, so are the functions $x\mapsto \max(0,x)=(x+\mathopen|x\mathclose|)/2$ and $x\mapsto\min(0,x)=(x-\mathopen|x\mathclose|)/2$. By translation and dilation, every continuous piecewise linear function on $[-1;1]$ with only one break point is uniformly approximated by polynomials. By linear combination, every continuous piecewise affine function is uniformly approximated by polynomials.
By uniform continuity, every continuous function can be uniformly approximated by continuous piecewise affine functions. Weierstrass's theorem follows.

6. Moments.

A linear subspace $A$ of a Banach space is dense if and only if every continuous linear form which vanishes on $A$ is identically $0$. In the present case, the dual of $C^0([-1;1],\mathbf C)$ is the space of complex measures on $[-1;1]$ (Riesz theorem, if one wishes, or the definition of a measure). So let $\mu$ be a complex measure on $[-1;1]$ such that $\int_{-1}^1 t^n \,d\mu(t)=0$ for every integer $n\geq 0$; let us show that $\mu=0$. This is the classical problem of showing that a complex measure on $[-1;1]$ is determined by its moments. In fact, the classical proof of this fact runs the other way round, and there must exist ways to reverse the arguments.

One such solution is given in Rudin's Real and complex analysis, where it is more convenient to consider functions on the interval $[0;1]$. So, let $F(z)=\int_0^1 t^z \,d\mu(t)$. The function $F$ is holomorphic and bounded on the half-plane $\Re(z)> 0$ and vanishes at the positive integers. At this point, Rudin makes a conformal transformation to the unit disk (setting $w=(z-1)/(z+1)$) and gets a bounded holomorphic function on the unit disk with zeroes at $(n-1)/(n+1)=1-2/(n+1)$, for $n\in\mathbf N$; if this function were not identically zero, this would contradict the fact that the series $\sum 1/(n+1)$ diverges.

In Rudin, this method is used to prove the more general Müntz–Szász theorem according to which the family $(t^{\lambda_n})$ generates a dense subset of $C([0;1])$ if and only if $\sum 1/\lambda_n=+\infty$.

Here is another solution I learnt in a paper by L. Carleson (“Mergelyan's theorem on uniform polynomial approximation”, Math. Scand., 1964).

For every complex number $a$ such that $\mathopen|a\mathclose|>1$, one can write $1/(t-a)$ as a converging power series in $t/a$. By summation, this quickly gives that
$$ F(a) = \int_{-1}^1 \frac{1}{t-a}\, d\mu(t) \equiv 0. $$
Observe that this formula defines a holomorphic function on $\mathbf C\setminus[-1;1]$; by analytic continuation, one thus has $F(a)=0$ for every $a\not\in[-1;1]$.
Take a $C^2$-function $g$ with compact support on the complex plane. For every $t\in\mathbf C$, one has the following formula
$$ \frac1\pi \iint \bar\partial g(z) \frac{1}{t-z} \, dx\,dy = g(t), $$
which implies, by integration and Fubini, that
$$ \int_{-1}^1 g(t)\,d\mu(t) = \frac1\pi \iint \int \bar\partial g(z) \frac1{t-z}\,d\mu(t)\,dx\,dy = \frac1\pi \iint \bar\partial g(z) F(z)\,dx\, dy= 0. $$
On the other hand, every $C^2$ function on $[-1;1]$ can be extended to such a function $g$, so that the measure $\mu$ vanishes on every $C^2$ function on $[-1;1]$. Approximating a continuous function by a $C^2$ function (first take a piecewise linear approximation, and round the corners), we get that $\mu$ vanishes on every continuous function, as was to be proved.

7. Chebyshev/Markov systems.

This proof is due to P. Borwein and taken from the book Polynomials and polynomial inequalities, by P. Borwein and T. Erdélyi (Graduate Texts in Maths, vol. 161, 1995). Let us say that a sequence $(f_n)$ of continuous functions on an interval $I$ is a Markov system (resp. a weak Markov system) if for every integer $n$, every linear combination of $(f_0,\dots,f_n)$ has at most $n$ zeroes (resp. $n$ sign changes) in $I$.

Given a Markov system $(f_n)$, one defines a sequence $(T_n)$, where $T_n-f_n$ is the element of $\langle f_0,\dots,f_{n-1}\rangle$ which is the closest to $f_n$. The function $T_n$ has $n$ zeroes on the interval $I$; let $M_n$ be the maximum distance between two consecutive zeroes.

Borwein's theorem (Theorem 4.1.1 in the mentioned book) then asserts that if the sequence $(f_n)$ is a Markov system consisting of $C^1$ functions, then its linear span is dense in $C(I)$ if and only if $M_n\to 0$.

The sequence of monomials $(x^n)$ on $I=[-1;1]$ is of course a Markov system. In this case, the polynomial $T_n$ is, up to a constant factor, the $n$th Chebyshev polynomial, given by $T_n(\cos(x))=\cos(nx)$; its roots are given by $\cos((\pi+2k\pi)/2n)$, for $k=0,\dots,n-1$, and $M_n\leq \pi/n$. This gives yet another proof of Weierstrass's approximation theorem.
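The bound on the spacing of the Chebyshev roots is easy to test numerically. The following Python sketch assumes the normalization on $[-1;1]$ in which the $n$th Chebyshev polynomial satisfies $T_n(\cos x)=\cos(nx)$, so that its roots are the numbers $\cos((2k+1)\pi/2n)$; the function names are hypothetical.

```python
import math


def chebyshev_roots(n):
    """Roots cos((2k+1) pi / (2n)), k = 0..n-1, sorted increasingly."""
    return sorted(math.cos((2 * k + 1) * math.pi / (2 * n)) for k in range(n))


def max_gap(n):
    """Largest distance between two consecutive roots, for n >= 2."""
    roots = chebyshev_roots(n)
    return max(b - a for a, b in zip(roots, roots[1:]))


for n in (2, 5, 50, 500):
    print(n, max_gap(n), math.pi / n)
```

The inequality reflects the fact that the cosine function is $1$-Lipschitz, while consecutive angles differ by $\pi/n$.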

Wednesday, February 24, 2016

Sound and color

Just back home from The Stone where I could hear two very interesting sets with pianist Russ Lossing and drummer Gerry Hemingway, first in duet, and then in quartet with Loren Stillman on alto saxophone and Samuel Blaser on trombone.

I was absolutely excited at the prospect of returning to this avant-garde jazz hall (it was my third concert there; the first one was in 2010, with Sylvie Courvoisier, Thomas Morgan and Ben Perowski, and the second, last year, with Wadada Leo Smith and Vijay Iyer) to listen to Gerry Hemingway, and the cold rain falling on New York City did not diminish my enthusiasm. (Although I had to take care on the streets, for one could see almost nothing...) I feared I would arrive late, but Gerry Hemingway was still installing his tools, various sticks, small cymbals, woodblocks, as well as a cello bow...

I admit, it took me some time to appreciate the music. Of course, it was free jazz (so what?) and I couldn't really follow the stream of music. Both musicians were acting delicately and skillfully (no discussion) at creating sound, as a painter would spread brush strokes on a canvas—and actually, Hemingway was playing a lot of brushes, those drum sticks made of many (wire or plastic) strings that have a delicate and not very resonating sound... Color after color, something was emerging, sound was being shaped.

There is an eternal discussion about the nature of music (is it rhythm? melody? harmony?) and consequently about the role of each instrument in the shaping of the music. A related question is the way a given instrument should be used to produce sound.

None of the obvious answers was to be heard tonight. Russ Lossing sometimes struck the strings of the grand piano with mallets, something almost classical in avant-garde piano music. I should have been prepared by the concert of Tony Malaby's Tubacello, that I attended with François Loeser in Sons d'hiver a few weeks ago, where John Hollenbeck simultaneously played drums and prepared piano, but the playing of Gerry Hemingway brought me much surprise. He could blow on the heads of the drums, hit them with a woodblock or strange plastic mallets; he could make the cymbals vibrate by pressing the cello bow on them; he could also take the top hi-hat cymbal in the left hand, and then either hit it with a stick, or press it on the snare drum, thereby producing a mixture of snare/cymbal sound; during a long drum roll, he could also vary the pitch of the sound by pressing the drum head with his right foot. Can you imagine the scene?

It is while discussing with him in between the two sets that I gradually understood (some of) his musical conception. How everything is about sound and color. That's why he uses an immense palette of tools, to produce the sounds he feels would best fit the music. He also discussed extended technique, by which he means not the kind of drumistic virtuosity that could allow you (unfortunately, not me...) to play the 26 drum rudiments at 300bpm, but extending the range of sounds he can consistently produce with his “basic Buddy Rich type instrument” (Google a picture of Terry Bozzio's drumkit if you don't see what I mean). He described himself as a colorist, who thinks of his instrument in terms of pitches; he also said how rhythm also exists in negative, when it is not played explicitly. A striking remark because it exactly depicted how I understand the playing of one of my favorite jazz drummers, Paul Motian, whom I couldn't appreciate until I became able to hear what he did not play.

The second set did not sound as abstract as the first one. Probably the two blowing instruments helped give the sound more flesh and more texture. Samuel Blaser, on the trombone, was absolutely exceptional (go listen at once to his Spring Rain album, an alliance of Jimmy Giuffre and contemporary jazz) and Loren Stillman sang very beautiful melodic lines on the alto sax. The four of them could also play in all combinations, and with extremely interesting dynamics, going effortlessly from one to another. And when a wonderful moment of thunder ended abruptly with the first notes of Paul Motian's Etude, music turned into pure emotion.

Tuesday, February 9, 2016

Happy New Year!

As was apparently first noticed by Noam Elkies, 2016 is the cardinality of the general linear group over the field with 7 elements, $G=\mathop{\rm GL}(2,\mathbf F_7)$. I was mentoring an agrégation lesson on finite fields this afternoon, and I could not resist having the student check this. Then came the natural question of describing the Sylow subgroups of this finite group. This is what I describe here.

First of all, let's recall the computation of the cardinality of $G$. The first column of a matrix in $G$ must be non-zero, hence there are $7^2-1$ possibilities; for the second column, it only needs to be non-collinear to the first one, and each choice of the first column forbids $7$ second columns, hence $7^2-7$ possibilities. In the end, one has $\mathop{\rm Card}(G)=(7^2-1)(7^2-7)=48\cdot 42=2016$. The same argument shows that the cardinality of the group $\mathop{\rm GL}(n,\mathbf F_q)$ is equal to $(q^n-1)(q^n-q)\cdots (q^n-q^{n-1})=q^{n(n-1)/2}(q-1)(q^2-1)\cdots (q^n-1)$.
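The count can be double-checked by machine. The Python sketch below (illustrative; the function names are hypothetical) evaluates the product formula and, for $n=2$ and a prime $p$, compares it with a brute-force enumeration of the matrices with non-zero determinant modulo $p$.

```python
from itertools import product


def gl_order(n, q):
    """Order of GL(n, F_q), computed as (q^n - 1)(q^n - q)...(q^n - q^(n-1))."""
    order = 1
    for i in range(n):
        order *= q**n - q**i
    return order


def count_gl2(p):
    """Count 2x2 matrices over Z/pZ (p prime) with non-zero determinant."""
    return sum(
        1
        for a, b, c, d in product(range(p), repeat=4)
        if (a * d - b * c) % p != 0
    )


print(gl_order(2, 7), count_gl2(7))  # prints 2016 2016
```

The brute-force count only makes sense for a prime $p$, where arithmetic modulo $p$ models the field $\mathbf F_p$; for a general prime power $q$ one would need an actual implementation of $\mathbf F_q$.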

Let's go back to our example. The factorization of this cardinality comes easily: $2016=(7^2-1)(7^2-7)=(7-1)(7+1)7(7-1)=6\cdot 8\cdot 7\cdot 6= 2^5\cdot 3^2\cdot 7$. Consequently, there are three Sylow subgroups to find, for the prime numbers $2$, $3$ and $7$.

The case $p=7$ is the most classical one. One needs to find a group of order 7, and one such subgroup is given by the group of upper triangular matrices $\begin{pmatrix} 1 & * \\ 0 & 1\end{pmatrix}$. What makes things work is that $p$ is the characteristic of the chosen finite field. In general, if $q$ is a power of $p$, then the subgroup of upper-triangular matrices in $\mathop{\rm GL}(n,\mathbf F_q)$ with $1$s on the diagonal has cardinality $q\cdot q^2\cdots q^{n-1}=q^{n(n-1)/2}$, which is exactly the highest power of $p$ dividing the cardinality of $\mathop{\rm GL}(n,\mathbf F_q)$.

Let's now study $p=3$. We need to find a group $S$ of order $3^2=9$ inside $G$. There are a priori two possibilities, either $S\simeq (\mathbf Z/3\mathbf Z)^2$, or $S\simeq (\mathbf Z/9\mathbf Z)$.
We will find a group of the first sort, which will show that the second case doesn't happen, because all $3$-Sylows are pairwise conjugate, hence isomorphic.

Now, the multiplicative group $\mathbf F_7^\times$ is of order $6$, and is cyclic, hence contains a subgroup of order $3$, namely $C=\{1,2,4\}$. Consequently, the group of diagonal matrices with coefficients in $C$ is isomorphic to $(\mathbf Z/3\mathbf Z)^2$ and is our desired $3$-Sylow.

Another reason why $G$ does not contain a subgroup $S$ isomorphic to $\mathbf Z/9\mathbf Z$ is that it does not contain elements of order $9$. Let's argue by contradiction and consider a matrix $A\in G$ of order $9$; since $A^9=I$, its minimal polynomial $P$ divides $T^9-1$. Since $7\nmid 9$, the matrix $A$ is diagonalizable over the algebraic closure of $\mathbf F_7$. The eigenvalues of $A$ are $9$th roots of unity, and are at most quadratic over $\mathbf F_7$ since $\deg(P)\leq 2$, so they belong to $\mathbf F_{49}$. On the other hand, if $\alpha$ is a $9$th root of unity belonging to $\mathbf F_{49}$, one has $\alpha^9=\alpha^{48}=1$, hence $\alpha^3=1$ since $\gcd(9,48)=3$. Consequently, $\alpha$ is a cubic root of unity and $A^3=I$, so that the order of $A$ divides $3$, a contradiction.
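Since $G$ has only 2016 elements, both claims about $p=3$ can be checked by exhaustive search. The following Python sketch is only an illustration: matrices are encoded as tuples `(a, b, c, d)` for $\begin{pmatrix} a & b \\ c & d\end{pmatrix}$, and the helper names `mul` and `order` are my own.

```python
from itertools import product
from collections import Counter

P = 7
I2 = (1, 0, 0, 1)  # identity matrix, stored row by row as (a, b, c, d)

def mul(x, y):
    """Product of two 2x2 matrices over F_7."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a*e + b*g) % P, (a*f + b*h) % P,
            (c*e + d*g) % P, (c*f + d*h) % P)

def order(m):
    """Order of an invertible matrix m in GL(2, F_7)."""
    k, x = 1, m
    while x != I2:
        x = mul(x, m)
        k += 1
    return k

# All invertible matrices, then the number of elements of each order.
group = [m for m in product(range(P), repeat=4)
         if (m[0]*m[3] - m[1]*m[2]) % P != 0]
orders = Counter(order(m) for m in group)
print(9 in orders)  # False: no element of order 9

# The diagonal matrices with entries in C = {1, 2, 4}.
diagonal = [(a, 0, 0, d) for a in (1, 2, 4) for d in (1, 2, 4)]
print(len(diagonal), {order(m) for m in diagonal})  # 9 {1, 3}
```

The set `diagonal` has 9 elements, all of order 1 or 3, which is consistent with the $3$-Sylow being $(\mathbf Z/3\mathbf Z)^2$.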

It remains to treat the case $p=2$, which I find slightly trickier. Let's try to find elements $A$ in $G$ whose order divides $2^5$. As above, such an element is diagonalizable in an algebraic closure, its minimal polynomial divides $T^{32}-1$, and its eigenvalues belong to $\mathbf F_{49}$, hence satisfy $\alpha^{32}=\alpha^{48}=1$, hence $\alpha^{16}=1$. Conversely, $\mathbf F_{49}^\times$ is cyclic of order $48$, hence contains an element $\alpha$ of order $16$, and such an element is quadratic over $\mathbf F_7$, hence its minimal polynomial $P$ has degree $2$. The corresponding companion matrix $A$ in $G$ is an element of order $16$, generating a subgroup $S_1$ of $G$ isomorphic to $\mathbf Z/16\mathbf Z$. We also observe that $\alpha^8=-1$ (because its square is $1$ while $\alpha^8\neq 1$); since $A^8$ is diagonalizable in an algebraic closure with $-1$ as the only eigenvalue, this shows $A^8=-I$.

Now, there exists a $2$-Sylow subgroup $S$ containing $S_1$, and $S_1$ will be a normal subgroup of $S$ (because its index is the smallest prime number dividing the order of $S$, which is $2$). This suggests introducing the normalizer $N$ of $S_1$ in $G$. One then has $S_1\subset S\subset N$. Let $s\in S$ be such that $s\not\in S_1$; then there exists a unique $k\in\{1,\dots,15\}$ such that $s^{-1}As=A^k$, and $s^{-2}As^2=A^{k^2}=A$ (because $s$ has order $2$ modulo $S_1$, so that $s^2$ belongs to the commutative group $S_1$), hence $k^2\equiv 1\pmod{16}$; in other words, $k\equiv \pm1\pmod 8$.
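The last equivalence is a one-line check: the following Python snippet lists the solutions of $k^2\equiv 1\pmod{16}$ in the range $1,\dots,15$.

```python
# Exponents k in {1, ..., 15} with k^2 = 1 modulo 16.
solutions = [k for k in range(1, 16) if (k * k) % 16 == 1]
print(solutions)  # [1, 7, 9, 15]
```

These four values are exactly the residues congruent to $\pm 1$ modulo $8$.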

There exists a natural choice of $s$: the involution ($s^2=I$) which exchanges the two eigenspaces of $A$. To finish the computation, it's useful to take a specific example of a polynomial $P$ of degree $2$ whose roots in $\mathbf F_{49}$ are primitive $16$th roots of unity. In other words, we need to factor the $16$th cyclotomic polynomial $\Phi_{16}=T^8+1$ over $\mathbf F_7$ and find a factor of degree $2$; actually, Galois theory shows that all factors have the same degree, so that there should be 4 factors of degree $2$. To explain the following computation, a remark is useful. Let $\alpha$ be a primitive $16$th root of unity in $\mathbf F_{49}$; we have $(\alpha^8)^2=1$ but $\alpha^8\neq 1$, hence $\alpha^8=-1$. If $P$ is the minimal polynomial of $\alpha$, the other root is $\alpha^7$, hence the constant term of $P$ is equal to $\alpha\cdot \alpha^7=\alpha^8=-1$.

We start from $T^8+1=(T^4+1)^2-2T^4$ and observe that $2\equiv 4^2\pmod 7$, so that $T^8+1=(T^4+1)^2-4^2T^4=(T^4+4T^2+1)(T^4-4T^2+1)$. To find the factors of degree $2$, we remember that their constant terms should be equal to $-1$. We thus proceed differently, writing $T^4+4T^2+1=(T^2+aT-1)(T^2-aT-1)$ and solving for $a$: this gives $-2-a^2=4$, hence $a^2=-6=1$ and $a=\pm1$. The other factors are found similarly and we get
$$ T^8+1=(T^2-T-1)(T^2+T-1)(T^2-4T-1)(T^2+4T-1). $$
We thus choose the factor $T^2-T-1$ and set $A=\begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}$.
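The factorization is easy to confirm by multiplying the four quadratic factors modulo $7$. Here is a small Python sketch; the encoding of polynomials as coefficient lists (lowest degree first) and the helpers `poly_mul` and `evaluate` are my own choices for this illustration.

```python
P = 7

def poly_mul(f, g):
    """Product of two polynomials over F_7 (coefficients, lowest degree first)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % P
    return out

def evaluate(f, x):
    """Value of the polynomial f at x, modulo 7."""
    return sum(c * x**i for i, c in enumerate(f)) % P

# T^2 + aT - 1 is encoded as [-1, a, 1], for a = -1, 1, -4, 4.
factors = [[-1, -1, 1], [-1, 1, 1], [-1, -4, 1], [-1, 4, 1]]

result = [1]
for f in factors:
    result = poly_mul(result, f)
print(result)  # [1, 0, 0, 0, 0, 0, 0, 0, 1], that is, T^8 + 1

# A quadratic polynomial without a root in F_7 is irreducible over F_7.
print(all(evaluate(f, x) != 0 for f in factors for x in range(P)))  # True
```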

Two eigenvectors for $A$ are $v=\begin{pmatrix} 1 \\ \alpha \end{pmatrix}$ and $v'=\begin{pmatrix}1 \\ \alpha'\end{pmatrix}$, where $\alpha$ is a root of $T^2-T-1$ and $\alpha'=\alpha^7$ is the other root. Writing $B$ for the matrix of the involution $s$, the equations for $B$ are $Bv=v'$ and $Bv'=v$; since $\alpha+\alpha'=1$, this gives $B=\begin{pmatrix} 1 & 0 \\ 1 & - 1\end{pmatrix}$. The subgroup $S=\langle A,B\rangle$ generated by $A$ and $B$ has order $32$ and is a $2$-Sylow subgroup of $G$.
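All the claims about $A$ and $B$ can be verified by direct computation modulo $7$. The Python sketch below is only meant as a sanity check; matrices are stored as tuples `(a, b, c, d)`, and the helper names (`mul`, `power`, `order`, `generated`) are hypothetical ones introduced for this illustration.

```python
P = 7
I2 = (1, 0, 0, 1)
A = (0, 1, 1, 1)       # rows (0 1) and (1 1): companion matrix of T^2 - T - 1
B = (1, 0, 1, P - 1)   # rows (1 0) and (1 -1)

def mul(x, y):
    """Product of two 2x2 matrices over F_7."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a*e + b*g) % P, (a*f + b*h) % P,
            (c*e + d*g) % P, (c*f + d*h) % P)

def power(m, k):
    """k-th power of m, for k >= 0."""
    out = I2
    for _ in range(k):
        out = mul(out, m)
    return out

def order(m):
    """Order of an invertible matrix m."""
    k, x = 1, m
    while x != I2:
        x = mul(x, m)
        k += 1
    return k

def generated(gens):
    """Subgroup generated by gens: closure of {I} under right multiplication."""
    seen, frontier = {I2}, [I2]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = mul(x, g)
            if y not in seen:
                seen.add(y)
                frontier.append(y)
    return seen

print(order(A))                            # 16
print(power(A, 8))                         # (6, 0, 0, 6), that is, -I
print(mul(B, B))                           # (1, 0, 0, 1)
print(mul(mul(B, A), B) == power(A, 7))    # True: B^{-1} A B = A^7
print(len(generated([A, B])))              # 32
```

In a finite group, closing up under multiplication by the generators is enough to obtain the whole subgroup, which is why `generated` does not need inverses. The relation $B^{-1}AB=A^7$ corresponds to the value $k=7$ among the exponents allowed by the congruence $k^2\equiv 1\pmod{16}$.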

Generalizing this method involves finding large commutative $p$-subgroups (such as $S_1$) which belong to appropriate (possibly non-split) tori of $\mathop{\rm GL}(n)$ and combining them with adequate parts of their normalizer, which is close to considering Sylow subgroups of the symmetric group. The paper Sylow $p$-subgroups of the classical groups over finite fields with characteristic prime to $p$ by A.J. Weir gives the general description (as well as for orthogonal and symplectic groups), building on an earlier paper in which he constructed Sylow subgroups of symmetric groups. See also the paper Some remarks on Sylow subgroups of the general linear groups by C. R. Leedham-Green and W. Plesken which says a lot about maximal $p$-subgroups of the general linear group (over not necessarily finite fields). Also, the question was recently the subject of interesting discussions on MathOverflow.

[Edited on Febr. 14 to correct the computation of the 2-Sylow...]

Monday, January 4, 2016

Model theory and algebraic geometry, 5 — Algebraic differential equations from coverings

In this final post of this series, I return to elimination of imaginaries in DCF and explain the main theorem from Tom Scanlon's paper Algebraic differential equations from covering maps.

The last ingredient to be discussed is jet spaces.

Differential algebra is seldom used explicitly in algebraic geometry. However, differential techniques have furnished a crucial tool for the study of the Mordell conjecture over function fields (beginning with the proof of this conjecture by Grauert and Manin), and its generalizations in higher dimension (theorem of Bogomolov on surfaces satisfying $c_1^2>3c_2$), or for holomorphic curves (conjecture of Green-Griffiths). They are often reformulated within the language of jet bundles.

Let us assume that $X$ is a smooth variety over a field $k$. Its tangent bundle $T(X)$ is a vector bundle over $X$ whose fiber at a (geometric) point $x$ is the tangent space $T_x(X)$ of $X$ at $x$. By construction, every morphism $f\colon Y\to X$ of algebraic varieties induces a tangent morphism $Tf\colon T(Y)\to T(X)$: it maps a tangent vector $v\in T_y(Y)$ at a (geometric) point $y\in Y$ to the tangent vector $T_yf(v)\in T_{f(y)}(X)$ at $f(y)$. This can be rephrased in the language of differential algebra as follows: for every differential field $(K,\partial)$ whose field of constants contains $k$, one has a derivative map $\nabla_1\colon X(K)\to T(X)(K)$. Here is the relation, where we assume that $K$ is the field of functions of a variety $Y$. A derivation $\partial$ on $K$ can be viewed as a vector field $V$ on $Y$, possibly not defined everywhere; replacing $Y$ by a dense open subset if needed, we assume that it is defined everywhere. Now, a point $x\in X(K)$ can be identified with a rational map $f\colon Y\dashrightarrow X$, defined on an open subset $U$ of $Y$. Then, we simply consider the morphism from $U$ to $T(X)$ given by $p\mapsto T_pf (V_p)$. At the level of function fields, this is our point $\nabla_1(x)\in T(X)(K)$.

If one wants to look at higher derivatives, the construction of the tangent bundle can be iterated and gives rise to jet bundles which are varieties $J_m(X)$, defined for all integers $m\geq 0$, such that $J_0(X)=X$, $J_1(X)=T(X)$, and for $m\geq 1$, $J_m(X)$ is a vector bundle over $J_{m-1}(X)$ modelled on the $m$th symmetric product of $\Omega^1_X$. For every differential field $(K,\partial)$ whose field of constants contains $k$, there is a canonical $m$th derivative map $\nabla_m\colon X(K) \to J_m(X) (K)$.

The construction of the jet bundles can be given so that the following requirements are satisfied:
  • If $X=\mathbf A^1$ is the affine line, then $J_m(X)$ is an affine space of dimension $m+1$, and $\nabla_m$ is just given by $\nabla_m (x) = (x,\partial(x),\dots,\partial^m(x))$ for $x\in X(K)=K$;
  • Products: $J_m(X\times Y)=J_m(X)\times_k J_m(Y)$;
  • Open immersions: if $U$ is an open subset of $X$, then $J_m(U)$ is the open subset of $J_m(X)$ given by the preimage of $U$ under the projection $J_m(X)\to J_{m-1}(X)\to \dots\to J_0(X)=X$;
  • When $X$ is an algebraic group, with origin $e$, then $J_m(X)$ is canonically isomorphic to the product of $X$ by the affine space $J_m(X)_e$ of $m$-jets at $e$.
We now describe Scanlon's application.

Let $G$ be a complex algebraic group acting on a complex algebraic variety $X$; let $S\colon X\to Z$ be the corresponding generalized Schwarzian map. Here, $Z$ is a complex algebraic variety, but $S$ is a differential map of some order $m$. In other words, there exists a constructible algebraic map $\tilde S\colon J_m(X)\to Z$ such that $S(x)=\tilde S(\nabla_m(x))$ for every differential field $(K,\partial)$ and every point $x\in X(K)$.

Let $U$ be an open subset of $X(\mathbf C)$, for the complex topology, and let $\Gamma$ be a Zariski dense subgroup of $G(\mathbf C)$ which stabilizes $U$. We assume that there exists a complex algebraic variety $Y$ and a biholomorphic map $p\colon \Gamma\backslash U \to Y(\mathbf C)$.

Locally, every open holomorphic map $\phi\colon\Omega\to Y(\mathbf C)$ can be lifted to a holomorphic map $\tilde\phi\colon \Omega\to U$. Two liftings differ locally by the action of an element of $\Gamma$, so that the composition $S\circ\tilde\phi$ does not depend on the choice of the lifting, by definition of the generalized Schwarzian map $S$. This gives a well-defined differential-analytic map $T\colon Y\to Z$. Let $m$ be the maximal order of derivatives appearing in a formula defining $T$. Then one may write $T\circ\phi =\tilde T\circ \nabla_m\phi$, where $\tilde T$ is a constructible analytic map from $J_m(Y)$ to $Z$.

Theorem (Scanlon). — Assume that there exists a fundamental domain $\mathfrak F\subset U$ such that the map $p|_{\mathfrak F}\colon \mathfrak F\to Y(\mathbf C)$ is definable in an o-minimal structure. Then $T$ is differential-algebraic: there exists a constructible map $\tilde T\colon J_m(Y)\to Z$ such that $T\circ \phi=\tilde T \circ J_m(\phi)$ for every $\phi$ as above.

For the proof, observe that the map $\tilde T$ is definable in an o-minimal structure, because it is obtained, by passing to the quotient, from a definable map on the preimage in $J_m(U)$ of $\mathfrak F$, and o-minimal structures allow elimination of imaginaries. By the theorem of Peterzil and Starchenko, it is constructible algebraic.