I would like to discuss today a beautiful theorem of Grothendieck concerning differential equations. It was mentioned by Yves André in a wonderful talk at IHÉS in March 2016 and Hélène Esnault kindly explained its proof to me during a nice walk in the Bavarian Alps last April... The statement is as follows:
Theorem (Grothendieck, 1970). — Let $X$ be a smooth projective complex algebraic variety. Assume that $X$ is simply connected. Then every vector bundle with an integrable connection on $X$ is trivial.
Let indeed $(E,\nabla)$ be a vector bundle with an integrable connection on $X$ and let us show that it is trivial, namely, that there exist global sections $e_1,\dots,e_n$ of $E$ which are horizontal ($\nabla e_i=0$) and form a basis of $E$ at each point.
Considering the associated analytic picture, we get a vector bundle $(E^{\mathrm{an}},\nabla^{\mathrm{an}})$ with an integrable connection on the analytic manifold $X(\mathbf C)$. Let $x\in X(\mathbf C)$. By the theory of linear differential equations, this furnishes a representation $\rho$ of the topological fundamental group $\pi_1(X(\mathbf C),x)$ in the fiber $E_x$ of the vector bundle $E$ at the point $x$. Saying that $(E^{\mathrm{an}},\nabla^{\mathrm{an}})$ is trivial on $X(\mathbf C)$ means that this representation $\rho$ is trivial, which seems to be a triviality since $X$ is simply connected.
However, in this statement, simple connectedness is meant in the sense of algebraic geometry, namely that $X$ has no non-trivial finite étale covering. And this is why the theorem can be surprising, for this hypothesis does not imply that $\pi_1(X(\mathbf C),x)$ is trivial, only that it has no non-trivial finite quotient. This is Grothendieck's version of Riemann's existence theorem, proved in SGA 1.
It is known, however, that $X(\mathbf C)$ is topologically equivalent to a finite cellular space, so that its fundamental group $\pi_1(X(\mathbf C),x)$ is finitely presented.
Proposition (Malčev, 1940). — Let $G$ be a finitely generated subgroup of $\mathrm{GL}(n,\mathbf C)$. Then $G$ is residually finite: for every finite subset $T$ of $G$ not containing $\mathrm I_n$, there exists a finite group $K$ and a morphism $f\colon G\to K$ such that $T\cap\ker(f)=\emptyset$.
Consequently, the image of $\rho$ is residually finite. If it were non-trivial, there would exist a non-trivial finite quotient of $\rho(\pi_1(X(\mathbf C),x))$, hence a non-trivial finite quotient of $\pi_1(X(\mathbf C),x)$, which, as we have seen, is impossible. Consequently, the image of $\rho$ is trivial and $(E^{\mathrm{an}},\nabla^{\mathrm{an}})$ is trivial.
In other words, there exists a basis $(e_1,\dots,e_n)$ of horizontal sections of $E^{\mathrm{an}}$. By Serre's GAGA theorem, $e_1,\dots,e_n$ are in fact algebraic, i.e., induced by actual global sections of $E$ on $X$. By construction, they are horizontal and form a basis of $E$ at each point. Q.E.D.
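It now remains to explain the proof of the proposition. Write $G$ for the given finitely generated subgroup of $\mathrm{GL}(n,\mathbf C)$ and $T$ for the given finite subset of $G$ not containing $\mathrm I_n$. Let $S$ be a finite symmetric generating subset of $G$ containing $T$ and not containing $\mathrm I_n$, and let $R$ be the subring of $\mathbf C$ generated by the entries of the matrices $s-\mathrm I_n$, for $s\in S$, and by the inverses of those entries which are non-zero. It is a non-zero finitely generated $\mathbf Z$-algebra; the elements of $S$ are contained in $\mathrm{GL}(n,R)$, hence $G$ is a subgroup of $\mathrm{GL}(n,R)$. Let $\mathfrak m$ be a maximal ideal of $R$ and let $k=R/\mathfrak m$ be its residue field; the point of the story is that this field is finite (I'll explain why in a minute). Then the reduction map $R\to k$ induces a morphism of groups $\mathrm{GL}(n,R)\to\mathrm{GL}(n,k)$, hence a morphism $f\colon G\to\mathrm{GL}(n,k)$ to a finite group. By construction, a non-zero entry of a matrix $s-\mathrm I_n$, for $s\in S$, is invertible in $R$, hence is mapped to a non-zero element of $k$. Consequently, $S$, and in particular $T$, is disjoint from the kernel of $f$, as was to be shown.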
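The reduction step can be tried on a toy case. The following is only a sketch under a strong simplifying assumption: the matrix has integer entries, so the ring is $\mathbf Z$ itself and a maximal ideal is generated by a prime number. The function names `is_identity_mod` and `separating_prime` are hypothetical and chosen for this illustration.

```python
def is_identity_mod(matrix, p):
    """Return True if the integer matrix reduces to the identity modulo p."""
    n = len(matrix)
    return all(
        (matrix[i][j] - (1 if i == j else 0)) % p == 0
        for i in range(n)
        for j in range(n)
    )


def separating_prime(matrix):
    """Smallest prime p not dividing some non-zero entry of (matrix - I).

    Assumes matrix is a square integer matrix different from the identity.
    For such a prime, the reduction of matrix modulo p is not the identity.
    """
    n = len(matrix)
    # Pick the first non-zero entry of (matrix - I).
    entry = next(
        matrix[i][j] - (1 if i == j else 0)
        for i in range(n)
        for j in range(n)
        if matrix[i][j] != (1 if i == j else 0)
    )
    p = 2
    while True:
        is_prime = all(p % d for d in range(2, int(p ** 0.5) + 1))
        if is_prime and entry % p != 0:
            return p
        p += 1


# A unipotent matrix of infinite order in GL(2, Z).
g = [[1, 6], [0, 1]]
p = separating_prime(g)
```

Here `g` becomes the identity modulo 2 and modulo 3, but not modulo the prime returned by `separating_prime`, so its image in the finite group of invertible matrices modulo that prime is non-trivial.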
Lemma. — Let $R$ be a finitely generated $\mathbf Z$-algebra and let $\mathfrak m$ be a maximal ideal of $R$. The residue field $R/\mathfrak m$ is finite.
Proof of the lemma. — This could be summarized by saying that $\mathbf Z$ is a Jacobson ring: if $A$ is a Jacobson ring, then every finitely generated $A$-algebra $K$ which is a field is finite over $A$; in particular, $K$ is a finite extension of a quotient field of $A$. In the case $A=\mathbf Z$, the quotient fields of $\mathbf Z$ are the finite fields $\mathbf F_p$, so that $K$ is a finite extension of a finite field, hence is a finite field. Let us however explain the argument. Let $K$ be the field $R/\mathfrak m$; let us replace $\mathbf Z$ by its quotient $A=\mathbf Z/P$, where $P$ is the kernel of the map $\mathbf Z\to K$. There are two cases: either $P=(0)$ and $A=\mathbf Z$, or $P=(p)$, for some prime number $p$, and $A$ is the finite field $\mathbf F_p$; we will eventually see that the first case cannot happen.
Now, $K$ is a field which is a finitely generated algebra over its subring $A$; let $k$ be the fraction field of $A$. The field $K$ is then a finitely generated algebra over its subfield $k$; by Zariski's form of Hilbert's Nullstellensatz, $K$ is a finite algebraic extension of $k$. Let us choose a finite generating subset $S$ of $K$ as an $A$-algebra; each element of $S$ is algebraic over $k$; let us consider the product $f$ of the leading coefficients of their minimal polynomials, chosen to belong to $A[T]$, and let $A'=A[1/f]$. By construction, the elements of $S$ are integral over $A'$, hence $K$ is integral over $A'$. Since $K$ is a field, we deduce that $A'$ is a field. To conclude, we split the discussion into the two cases stated above.
If $P=(p)$, then $A=\mathbf F_p$, hence $k=\mathbf F_p$ as well, and $K$ is a finite extension of $\mathbf F_p$, hence is a finite field.
Let us assume, by contradiction, that $P=(0)$, hence $A=\mathbf Z$ and $k=\mathbf Q$. By what precedes, there exists a non-zero element $f\in\mathbf Z$ such that $\mathbf Z[1/f]$ is a field. But this cannot be true, because $\mathbf Z[1/f]$ is not a field. Indeed, any prime number which does not divide $f$ is not invertible in $\mathbf Z[1/f]$. This concludes the proof of the lemma.
Remarks. — 1) The theorem does not hold if $X$ is not proper. For example, the affine line $\mathbf A^1$ is simply connected, both algebraically and topologically, but the trivial line bundle $E=\mathscr O_X e$ endowed with the connection defined by $\nabla(e)=e\otimes dz$ is not trivial. It is analytically trivial though, but its horizontal analytic sections are of the form $\lambda\exp(-z)e$, for $\lambda\in\mathbf C$, and except for $\lambda=0$, none of them are algebraic.
However, the theorem holds if one assumes moreover that the connection has regular singularities at infinity.
2) The group theoretical property that we used is that on a complex algebraic variety, the monodromy group of a vector bundle with connection is residually finite. It is not always true that the topological fundamental group of a complex algebraic variety is residually finite. Examples have been given by Domingo Toledo in “Projective varieties with non-residually finite fundamental group”, Publications mathématiques de l’I.H.É.S., 77 (1993), p. 103–119.
3) The analogous result in positive characteristic is a conjecture by Johan de Jong formulated in 2010: If $X$ is a projective smooth simply connected algebraic variety over an algebraically closed field of characteristic $p$, then every isocrystal is trivial. It is still open, despite beautiful progress by Hélène Esnault, together with Vikram Mehta and Atsushi Shiho.
Saturday, June 11, 2016
Thursday, May 5, 2016
Bourbaki and Felix Klein
Labels: Felix Klein, N. Bourbaki
A colleague just sent me Xerox copies of a few pages of an 1899 biography of General Bourbaki. Its author, François Bournand, was the private secretary of Édouard Drumont, an antisemitic writer and journalist. The book would probably not be worth mentioning here were it not for its dedication:
À l'abbé Félix Klein
de l'Institut catholique
Hommage respectueux de son dévoué en N.-S.
François Bournand
Professeur d'histoire de l'art à l'École professionnelle catholique
Abbé means abbot; in this context, a Catholic priest without a parish. The French initials N.-S. stand for Notre Seigneur, Our Lord. It appears that this Félix Klein (note the accent on the e) also has a Wikipedia page.
Friday, April 29, 2016
Roth's theorems
Labels: combinatorics, number theory
A few days ago, The Scotsman published an article about Klaus Roth's legacy, explaining how he donated his fortune (1 million pounds) to various charities. This article was shared by some friends on Facebook. Yuri Bilu added that he knew two important theorems of Roth, and since one of them did not immediately come to my mind, I decided to write this post.
The first theorem was a 1935 conjecture of Erdős and Turán concerning arithmetic progressions of length 3 that Roth proved in 1952. That is, one is given a set $A$ of positive integers and one looks for triples $(a,b,c)$ of distinct elements of $A$ such that $a+c=2b$; Roth proved that infinitely many such triples exist as soon as the upper density of $A$ is positive, that is:
$$\limsup_{x\to+\infty}\frac{\operatorname{Card}(A\cap[0;x])}{x}>0.$$
In 1975, Endre Szemerédi proved that such sets of integers contain (finite) arithmetic progressions of arbitrarily large length. Other proofs have been given by Hillel Furstenberg (using ergodic theory) and Tim Gowers (by Fourier/combinatorial methods); Roth had used Hardy-Littlewood's circle method.
In 1976, Erdős strengthened his initial conjecture with Turán and predicted that arithmetic progressions of arbitrarily large length exist in $A$ as soon as
$$\sum_{a\in A}\frac1a=+\infty.$$
Such a result is still a conjecture, even for arithmetic progressions of length 3, but a remarkable particular case has been proved by Ben Green and Terry Tao in 2004, when A is the set of all prime numbers.
Outstanding as these results are (Tao was awarded the Fields medal in 2006 and Szemerédi the Abel prize in 2012), the second theorem of Roth was proved in 1955 and was certainly the main reason for awarding him the Fields medal in 1958. Indeed, Roth gave a definitive answer to a long-standing question in diophantine approximation that originated from the works of Joseph Liouville (1844). Given a real number $\alpha$, one is interested in rational fractions $p/q$ that are close to $\alpha$, and in the quality of the approximation, namely the exponent $n$ such that $\left|\alpha-\frac pq\right|\le 1/q^n$. Precisely, the approximation exponent $\kappa(\alpha)$ is the least upper bound of all real numbers $n$ such that the previous inequality has infinitely many solutions in fractions $p/q$, and Roth's theorem asserts that one has $\kappa(\alpha)=2$ when $\alpha$ is an irrational algebraic number.
One part of this result goes back to Dirichlet, showing that for any irrational number $\alpha$, there exist many good approximations with exponent $2$. This can be proved using the theory of continued fractions and is also a classical application of Dirichlet's box principle. Take a positive integer $Q$ and consider the $Q+1$ numbers $q\alpha-\lfloor q\alpha\rfloor$ in $[0,1]$, for $0\le q\le Q$; two of them must be less than $1/Q$ apart; this furnishes integers $p',p'',q',q''$, with $0\le q'<q''\le Q$ such that $|(q''\alpha-p'')-(q'\alpha-p')|\le 1/Q$; then set $p=p''-p'$ and $q=q''-q'$; one has $|q\alpha-p|\le 1/Q$, hence $\left|\alpha-\frac pq\right|\le 1/Qq\le 1/q^2$.
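The box principle argument is constructive, and the following Python sketch follows it literally. It works in floating point, so it is only an illustration for moderate values of $Q$; the function name `dirichlet_approximation` is hypothetical.

```python
import math


def dirichlet_approximation(alpha, Q):
    """Return integers (p, q) with 1 <= q <= Q and |q*alpha - p| < 1/Q.

    Box principle: among the Q + 1 fractional parts of q*alpha, for
    0 <= q <= Q, two must fall in the same box [k/Q, (k+1)/Q).
    """
    boxes = {}  # box index -> (q, floor(q*alpha)) of its first occupant
    for q in range(Q + 1):
        p = math.floor(q * alpha)
        frac = q * alpha - p  # fractional part, in [0, 1)
        k = min(int(frac * Q), Q - 1)  # index of the box containing frac
        if k in boxes:
            q_first, p_first = boxes[k]
            return p - p_first, q - q_first
        boxes[k] = (q, p)
    raise AssertionError("Q + 1 numbers cannot occupy Q boxes injectively")


alpha = math.sqrt(2)
p, q = dirichlet_approximation(alpha, 10)
```

The returned fraction $p/q$ then satisfies $|\alpha-p/q|\le 1/(Qq)\le 1/q^2$, up to rounding errors in the floating point arithmetic.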
To prove an inequality in the other direction, Liouville's argument was that if $\alpha$ is an irrational root of a nonzero polynomial $P\in\mathbf Z[T]$, then $\kappa(\alpha)\le\deg(P)$. The proof is now standard: set $d=\deg(P)$; given an approximation $p/q$ of $\alpha$, observe that $q^dP(p/q)$ is a non-zero integer (if, say, $P$ is irreducible), so that $|q^dP(p/q)|\ge 1$. On the other hand, $P(p/q)\approx(p/q-\alpha)P'(\alpha)$, hence an inequality $\left|\alpha-\frac pq\right|\gg q^{-d}$.
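Liouville's estimate can be written out in one line. The following is a sketch under the assumptions that $P(p/q)\neq0$ and $|\alpha-p/q|\le1$; the constant $M$ and the letter $d=\deg(P)$ are notation introduced here for the illustration.

```latex
\[
  \frac{1}{q^{d}}
  \le \left| P\!\left(\tfrac{p}{q}\right) \right|
  = \left| P\!\left(\tfrac{p}{q}\right) - P(\alpha) \right|
  \le M \left| \alpha - \tfrac{p}{q} \right|,
  \qquad
  M = \sup_{|t-\alpha|\le 1} |P'(t)|,
\]
\[
  \text{hence}\qquad
  \left| \alpha - \tfrac{p}{q} \right| \ge \frac{1}{M\,q^{d}} .
\]
```

The middle inequality is the mean value inequality on the segment between $\alpha$ and $p/q$. For any exponent $n>d$, the inequality $|\alpha-p/q|\le 1/q^n$ then forces $q^{n-d}\le M$, so it has only finitely many solutions.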
This result has been generalized, first by Axel Thue in 1909 (who proved an inequality $\kappa(\alpha)\le\frac12d+1$), then by Carl Ludwig Siegel (1921) and Freeman Dyson (1947), showing $\kappa(\alpha)\le2\sqrt d$ and $\kappa(\alpha)\le\sqrt{2d}$ respectively. While Liouville's result was based on the minimal polynomial of $\alpha$, these generalisations required polynomials in two variables, and the non-vanishing of a quantity such as $q^dP(p/q)$ above was definitely less trivial. Roth's proof made use of polynomials of arbitrarily large degree, and his remarkable achievement was a proof of the required non-vanishing result.
Roth's proof was “elementary”, making use only of polynomials and wronskians. There are today more geometric proofs, such as the one by Hélène Esnault and Eckart Viehweg (1984) or Michael Nakamaye's subsequent proof (1995) which is based on Faltings's product theorem.
What is still missing, however, is the proof of an effective version of Roth's theorem, that would give, given any real number $n>\kappa(\alpha)$, an actual integer $Q$ such that every rational fraction $p/q$ in lowest terms such that $\left|\alpha-\frac pq\right|\le 1/q^n$ satisfies $q\le Q$. It seems that this defect lies at the very heart of almost all of the current approaches in diophantine approximation...
Wednesday, April 13, 2016
Weierstrass's approximation theorem
Labels: agrégation, density, topology
I had to mentor an Agrégation leçon entitled Examples of dense subsets. For my own edification (and that of the masses), I want to try to record here as many proofs of the Weierstrass density theorem as I can: Every complex-valued continuous function on the closed interval $[-1;1]$ can be uniformly approximated by polynomials. I'll also include as a bonus the trigonometric variant: Every complex-valued continuous and $2\pi$-periodic function on $\mathbf R$ can be uniformly approximated by trigonometric polynomials.
1. Using the Stone theorem.
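This 1937–1948 theorem is probably the final conceptual brick to the edifice of which Weierstrass laid the first stone in 1885. It asserts that a subalgebra of continuous functions on a compact completely regular (e.g., metric) space is dense for the uniform norm if and only if it separates points. In all presentations that I know of, its proof requires establishing that the absolute value function can be uniformly approximated by polynomials on $[-1;1]$:
- Stone truncates the power series expansion of the function $x\mapsto\sqrt{1-(1-x^2)}=\sum_{n=0}^\infty\binom{1/2}{n}(x^2-1)^n$, bounding by hand the error term.
- Bourbaki (Topologie générale, X, p. 36, lemme 2) follows a more elementary approach and begins by proving that the function $x\mapsto\sqrt x$ can be uniformly approximated by polynomials on $[0;1]$. (The absolute value function is recovered since $|x|=\sqrt{x^2}$.) To this aim, he introduces the sequence of polynomials given by $p_0=0$ and $p_{n+1}(x)=p_n(x)+\frac12(x-p_n(x)^2)$ and proves by induction the inequalities $0\le\sqrt x-p_n(x)\le\frac{2\sqrt x}{2+n\sqrt x}\le\frac2n$ for $x\in[0;1]$ and $n\ge0$ (the last bound for $n\ge1$). This implies the desired result.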
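Bourbaki's recursion for the square root, $p_0=0$ and $p_{n+1}(x)=p_n(x)+\frac12(x-p_n(x)^2)$, is easy to try numerically. The sketch below evaluates $p_n$ pointwise rather than expanding it as a polynomial, and estimates the uniform error on a grid only; the names `bourbaki_sqrt` and `max_error` are hypothetical.

```python
import math


def bourbaki_sqrt(x, n):
    """Evaluate p_n(x), where p_0 = 0 and p_{k+1} = p_k + (x - p_k**2) / 2."""
    p = 0.0
    for _ in range(n):
        p = p + 0.5 * (x - p * p)
    return p


def max_error(n, samples=1001):
    """Largest value of sqrt(x) - p_n(x) over a uniform grid of [0, 1]."""
    grid = [k / (samples - 1) for k in range(samples)]
    return max(math.sqrt(x) - bourbaki_sqrt(x, n) for x in grid)
```

On such a grid, `max_error(n)` should stay between $0$ and $2/n$, in agreement with the inequalities proved by induction, and `bourbaki_sqrt(x * x, n)` gives the corresponding approximation of $|x|$ on $[-1;1]$.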
2. Convolution.
Consider an approximation $(\rho_n)$ of the Dirac distribution, i.e., a sequence of continuous, nonnegative and compactly supported functions on $\mathbf R$ such that $\int\rho_n=1$ and such that for every $\delta>0$, $\int_{|x|>\delta}\rho_n(x)\,dx\to0$. Given a continuous function $f$ on $\mathbf R$, form the convolutions defined by $f*\rho_n(x)=\int_{\mathbf R}\rho_n(t)f(x-t)\,dt$. It is classical that $f*\rho_n$ converges uniformly on every compact to $f$.
Now, given a continuous function $f$ on $[-1;1]$, one can extend it to a continuous function with compact support on $\mathbf R$ (defining $f$ to be affine linear on $[-2;-1]$ and on $[1;2]$, and to be zero outside of $[-2;2]$). We want to choose $\rho_n$ so that $f*\rho_n$ is a polynomial on $[-1;1]$. The basic idea is just to choose a parameter $a>0$, and to take $\rho_n(x)=c_n(1-(x/a)^2)^n$ for $|x|\le a$ and $\rho_n(x)=0$ otherwise, with $c_n$ adjusted so that $\int\rho_n=1$. Let us write $f*\rho_n(x)=\int_{-2}^2\rho_n(x-t)f(t)\,dt$; if $x\in[-1;1]$ and $t\in[-2;2]$, then $x-t\in[-3;3]$ so we just need to be sure that $\rho_n$ is a polynomial on that interval, which we get by taking, say, $a=3$. This shows that the restriction of $f*\rho_n$ to $[-1;1]$ is a polynomial function, and we're done.
This approach is more or less that of D. Jackson (“A Proof of Weierstrass's Theorem,” Amer. Math. Monthly, 1934). The difference is that he considers continuous functions on a closed interval contained in ]0;1[ which he extends linearly to [0;1] so that they vanish at 0 and 1; he considers the same convolution, taking the parameter a=1.
Weierstrass's own proof (“Über die analytische Darstellbarkeit sogenannter willkürlicher Functionen einer reellen Veränderlichen,” Sitzungsberichte der Königlich Preussischen Akademie der Wissenschaften zu Berlin, 1885) was slightly more sophisticated: he first showed approximation by convolution with the Gaussian kernel defined by $\rho_n(t)=\sqrt n\,e^{-\pi nt^2}$, and then expanded the kernel as a power series, a suitable truncation of which furnishes the desired polynomials.
As shown by Jackson, the same approach works easily (in a sense, more easily) for $2\pi$-periodic functions, considering the kernel defined by $\rho_n(x)=c_n(1+\cos(x))^n$, where $c_n$ is chosen so that $\int_{-\pi}^\pi\rho_n=1$.
3. Bernstein polynomials.
Take a continuous function $f$ on $[0;1]$ and, for $n\ge1$, set $B_nf(x)=\sum_{k=0}^nf(k/n)\binom nkx^k(1-x)^{n-k}$. It is classical that $B_nf$ converges uniformly to $f$ on $[0;1]$.
There are two classical proofs of Bernstein's theorem. One is probabilistic and consists in observing that $B_nf(x)$ is the expected value of $f(S_n/n)$, where $S_n$ is the sum of $n$ i.i.d. Bernoulli random variables with parameter $x\in[0;1]$. Another (generalized as the Korovkin theorem, “On convergence of linear positive operators in the space of continuous functions”, Dokl. Akad. Nauk SSSR (N.S.), vol. 90) consists in showing (i) that for $f=1,x,x^2$, $B_nf$ converges uniformly to $f$ (an explicit calculation), (ii) that if $f\ge0$, then $B_nf\ge0$ as well, (iii) for every $x\in[0;1]$, squeezing $f$ in between two quadratic polynomials $f_+$ and $f_-$ such that $f_+(x)-f_-(x)$ is as small as desired.
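The Bernstein polynomials are simple to evaluate directly from their definition. Here is a minimal Python sketch; the helper names `bernstein` and `sup_error` are hypothetical, and the uniform error is only estimated on a finite grid.

```python
from math import comb


def bernstein(f, n, x):
    """Evaluate the n-th Bernstein polynomial of f at x in [0, 1] (n >= 1)."""
    return sum(
        f(k / n) * comb(n, k) * x**k * (1 - x) ** (n - k)
        for k in range(n + 1)
    )


def sup_error(f, n, samples=201):
    """Approximate sup |B_n f - f| on [0, 1] using a uniform grid."""
    grid = [i / (samples - 1) for i in range(samples)]
    return max(abs(bernstein(f, n, x) - f(x)) for x in grid)


def kink(x):
    """A continuous function that is not differentiable at 1/2."""
    return abs(x - 0.5)
```

For $f(x)=x^2$ the explicit calculation of step (i) gives $B_nf(x)=x^2+x(1-x)/n$, which the sketch reproduces, and for a function such as `kink` the grid error decreases slowly as $n$ grows.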
A trigonometric variant would be given by Fejér's theorem that the Cesàro averages of the Fourier series of a continuous, $2\pi$-periodic function converge uniformly to that function. In turn, Fejér's theorem can be proved in both ways, either by convolution (the Fejér kernel is nonnegative), or by a Korovkin-type argument (replacing $1,x,x^2$ on $[0;1]$ by $1,z,z^2,z^{-1},z^{-2}$ on the unit circle).
4. Using approximation by step functions.
This proof originates with a paper of H. Kuhn, “Ein elementarer Beweis des Weierstrasschen Approximationsatzes,” Arch. Math. 15 (1964), p. 316–317.
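Let us show that for every $\delta\in]0,1[$ and every $\varepsilon>0$, there exists a polynomial $p$ satisfying the following properties:
- $0\le p(x)\le\varepsilon$ for $-1\le x\le-\delta$;
- $0\le p(x)\le1$ for $-\delta\le x\le\delta$;
- $1-\varepsilon\le p(x)\le1$ for $\delta\le x\le1$.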
A possible formula is $p(x)=(1-((1-x)/2)^n)^{2^n}$, where $n$ is a large enough integer. First of all, one has $0\le(1-x)/2\le1$ for every $x\in[-1;1]$, so that $0\le p(x)\le1$. Let $x\in[-1;-\delta]$; then one has $(1-x)/2\ge(1+\delta)/2$, hence $p(x)\le(1-((1+\delta)/2)^n)^{2^n}\le\exp(-(1+\delta)^n)$, which can be made arbitrarily small when $n\to\infty$. Let finally $x\in[\delta;1]$; then $(1-x)/2\le(1-\delta)/2$, hence $p(x)\ge(1-((1-\delta)/2)^n)^{2^n}\ge1-(1-\delta)^n$, which can be made arbitrarily close to $1$ when $n\to\infty$.
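The behaviour of this polynomial is easy to observe numerically. The sketch below evaluates $p(x)=(1-((1-x)/2)^n)^{2^n}$ in floating point; the name `kuhn_polynomial` and the sample values of $\delta$, $\varepsilon$ and $n$ are choices made for this illustration only.

```python
def kuhn_polynomial(x, n):
    """Evaluate p(x) = (1 - ((1 - x) / 2) ** n) ** (2 ** n) for x in [-1, 1]."""
    return (1 - ((1 - x) / 2) ** n) ** (2 ** n)


delta, eps, n = 0.2, 0.02, 20
grid = [-1 + i / 500 for i in range(1001)]  # uniform grid of [-1, 1]
left = [kuhn_polynomial(x, n) for x in grid if x <= -delta]
right = [kuhn_polynomial(x, n) for x in grid if x >= delta]
```

With these values, every entry of `left` is at most `eps` and every entry of `right` is at least `1 - eps`, so $p$ is a polynomial approximation of the step function that jumps from $0$ to $1$ at the origin, away from the interval $[-\delta;\delta]$.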
By translation and dilation, the discontinuity can be placed at any element of $[0;1]$. Let now $f$ be an arbitrary step function and let us write it as a linear combination $f=\sum a_if_i$, where $f_i$ is a $\{0,1\}$-valued step function. For every $i$, let $p_i$ be a polynomial that approximates $f_i$ as given above. The linear combination $\sum a_ip_i$ approximates $f$ with maximal error $\sup(|a_i|)$.
Using uniform continuity of continuous functions on [−1;1], every continuous function can be uniformly approximated by a step function. This concludes the proof.
5. Using approximation by piecewise linear functions.
As in the proof of Stone's theorem, one uses the fact that the function $x\mapsto|x|$ is uniformly approximated by a sequence of polynomials on $[-1;1]$. Consequently, so are the functions $x\mapsto\max(0,x)=(x+|x|)/2$ and $x\mapsto\min(0,x)=(x-|x|)/2$. By translation and dilation, every continuous piecewise affine function on $[-1;1]$ with only one break point is uniformly approximated by polynomials. By linear combination, every continuous piecewise affine function is uniformly approximated by polynomials.
By uniform continuity, every continuous function can be uniformly approximated by continuous piecewise affine functions. Weierstrass's theorem follows.
6. Moments.
A linear subspace $A$ of a Banach space is dense if and only if every continuous linear form which vanishes on $A$ is identically $0$. In the present case, the dual of $C^0([-1;1],\mathbf C)$ is the space of complex measures on $[-1;1]$ (Riesz theorem, if one wishes, or the definition of a measure). So let $\mu$ be a complex measure on $[-1;1]$ such that $\int_{-1}^1t^n\,d\mu(t)=0$ for every integer $n\ge0$; let us show that $\mu=0$. This is the classical problem of showing that a complex measure on $[-1;1]$ is determined by its moments. In fact, the classical proof of this fact runs the other way round, and there must exist ways to reverse the arguments.
One such solution is given in Rudin's Real and complex analysis, where it is more convenient to consider functions on the interval $[0;1]$. So, let $F(z)=\int_0^1t^z\,d\mu(t)$. The function $F$ is holomorphic and bounded on the half-plane $\Re(z)>0$ and vanishes at the positive integers. At this point, Rudin makes a conformal transformation to the unit disk (setting $w=(z-1)/(z+1)$) and gets a bounded holomorphic function on the unit disk with zeroes at $(n-1)/(n+1)=1-2/(n+1)$, for $n\in\mathbf N$. If this function were not identically zero, its zeroes $w_n$ would satisfy the Blaschke condition $\sum(1-|w_n|)<\infty$, and this contradicts the fact that the series $\sum1/(n+1)$ diverges.
In Rudin, this method is used to prove the more general Müntz–Szász theorem according to which the family $(t^{\lambda_n})$ generates a dense subset of $C([0;1])$ if and only if $\sum1/\lambda_n=+\infty$.
Here is another solution I learnt in a paper by L. Carleson (“Mergelyan's theorem on uniform polynomial approximation”, Math. Scand., 1964).
For every complex number $a$ such that $|a|>1$, one can write $1/(t-a)$ as a converging power series in $t$. By summation, this quickly gives that
$$F(a)=\int_{-1}^1\frac1{t-a}\,d\mu(t)\equiv0.$$
Observe that this formula defines a holomorphic function on $\mathbf C\setminus[-1;1]$; by analytic continuation, one thus has $F(a)=0$ for every $a\notin[-1;1]$.
Take a $C^2$-function $g$ with compact support on the complex plane. For every $t\in\mathbf C$, one has the following formula
$$\frac1\pi\iint\bar\partial g(z)\frac1{t-z}\,dx\,dy=g(t),$$
which implies, by integration and Fubini, that
$$\int_{-1}^1g(t)\,d\mu(t)=\frac1\pi\iint\int_{-1}^1\bar\partial g(z)\frac1{t-z}\,d\mu(t)\,dx\,dy=\frac1\pi\iint\bar\partial g(z)F(z)\,dx\,dy=0.$$
On the other hand, every $C^2$ function on $[-1;1]$ can be extended to such a function $g$, so that the measure $\mu$ vanishes on every $C^2$ function on $[-1;1]$. Approximating a continuous function by a $C^2$ function (first take a piecewise linear approximation, and round the corners), we get that $\mu$ vanishes on every continuous function, as was to be proved.
7. Chebyshev/Markov systems.
This proof is due to P. Borwein and taken from the book Polynomials and polynomial inequalities, by P. Borwein and T. Erdélyi (Graduate Texts in Maths, vol. 161, 1995). Let us say that a sequence $(f_n)$ of continuous functions on an interval $I$ is a Markov system (resp. a weak Markov system) if for every integer $n$, every non-zero linear combination of $(f_0,\dots,f_n)$ has at most $n$ zeroes (resp. $n$ sign changes) in $I$.
Given a Markov system $(f_n)$, one defines a sequence $(T_n)$, where $f_n-T_n$ is the element of $\langle f_0,\dots,f_{n-1}\rangle$ which is the closest to $f_n$. The function $T_n$ has $n$ zeroes on the interval $I$; let $M_n$ be the maximum distance between two consecutive zeroes.
Borwein's theorem (Theorem 4.1.1 in the mentioned book) then asserts that if the sequence $(f_n)$ is a Markov system consisting of $C^1$ functions, then its linear span is dense in $C(I)$ if and only if $M_n\to0$.
The sequence of monomials $(x^n)$ on $I=[-1;1]$ is of course a Markov system. In this case, the polynomial $T_n$ is the $n$th Chebyshev polynomial, given by $T_n(2\cos(x))=2\cos(nx)$, and its roots are given by $2\cos((\pi+2k\pi)/2n)$, for $k=0,\dots,n-1$, and $M_n\le\pi/n$. This gives yet another proof of Weierstrass's approximation theorem.
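The identity $T_n(2\cos(x))=2\cos(nx)$ and the formula for the roots can be checked numerically. The sketch below uses the standard three-term recurrence $T_0=2$, $T_1(y)=y$, $T_{k+1}(y)=yT_k(y)-T_{k-1}(y)$, which is not stated in the text above and is equivalent to the identity by the addition formula for the cosine; with this normalization the variable $y=2\cos(x)$ ranges over $[-2;2]$. The function name `chebyshev` is hypothetical.

```python
import math


def chebyshev(n, y):
    """Evaluate T_n(y) from T_0 = 2, T_1 = y, T_{k+1} = y*T_k - T_{k-1}."""
    if n == 0:
        return 2.0
    prev, cur = 2.0, y
    for _ in range(n - 1):
        prev, cur = cur, y * cur - prev
    return cur


n = 7
# Candidate roots 2*cos((pi + 2*k*pi) / (2*n)), for k = 0, ..., n - 1.
roots = [2 * math.cos((math.pi + 2 * k * math.pi) / (2 * n)) for k in range(n)]
```

Evaluating `chebyshev(n, r)` at each candidate root `r` should return a value that vanishes up to rounding errors.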
1. Using the Stone theorem.
This 1937—1948 theorem is probably the final conceptual brick to the edifice of which Weierstrass laid the first stone in 1885. It asserts that a subalgebra of continuous functions on a compact totally regular (e.g., metric) space is dense for the uniform norm if and only if it separates points. In all presentations that I know of, its proof requires to establish that the absolute value function can be uniformly approximated by polynomials on [−1;1]:
- Stone truncates the power series expansion of the function x↦1−(1−x2)=n=0∑∞(n1/2)(x2−1)n, bounding by hand the error term.
- Bourbaki (Topologie générale, X, p. 36, lemme 2) follows a more elementary approach and begins by proving that the function x↦x can be uniformly approximated by polynomials on [0;1]. (The absolute value function is recovered since ∣x∣x2.) To this aim, he introduces the sequence of polynomials given by p0=0 and pn+1(x)=pn(x)+21(x−pn(x)2) and proves by induction the inequalities 0≤x−pn(x)≤2+nx2x≤n2 for x∈[0;1] and n≥0. This implies the desired result.
2. Convolution.
Consider an approximation (ρn) of the Dirac distribution, i.e., a sequence of continuous, nonnegative and compactly supported functions on R such that ∫ρn=1 and such that for every δ>0, ∫∣x∣>δρn(x)dx→0. Given a continuous function f on R, form the convolutions defined by f∗ρn(x)=∫Rρn(t)f(x−t)dt. It is classical that f∗ρn converges uniformly on every compact to f.
Now, given a continuous function f on [−1;1], one can extend it to a continuous function with compact support on R (defining f to be affine linear on [−2;−1] and on [1;2], and to be zero outside of [−2;2]. We want to choose ρn so that f∗ρn is a polynomial on [−1;1]. The basic idea is just to choose a parameter a>0, and to take ρn(x)=cn(1−(x/a)2)n for ∣x∣≤a and ρn(x)=0 otherwise, with cn adjusted so that ∫ρn=1. Let us write f∗ρn(x)=∫−22ρn(x−t)f(t)dt; if x∈[−1;1] and t∈[−2:2], then x−t∈[−3;3] so we just need to be sure that ρn is a polynomial on that interval, which we get by taking, say, a=3. This shows that the restriction of f∗ρn to [−1;1] is a polynomial function, and we're done.
This approach is more or less that of D. Jackson (“A Proof of Weierstrass's Theorem,” Amer. Math. Monthly, 1934). The difference is that he considers continuous functions on a closed interval contained in ]0;1[ which he extends linearly to [0;1] so that they vanish at 0 and 1; he considers the same convolution, taking the parameter a=1.
Weierstrass's own proof (“Über die analytische Darstellbarkeit sogenannter willkurlicher Functionen einer reellen Veranderlichen Sitzungsberichteder,” Königlich Preussischen Akademie der Wissenschaften zu Berlin, 1885) was slightly more sophisticated: he first showed approximation by convolution with the Gaussian kernel defined by ρn(t)=ne−πnt2, and then expanded the kernel as a power series, a suitable truncation of which furnishes the desired polynomials.
As shown by Jacskon, the same approach works easily (in a sense, more easily) for 2π-periodic functions, considering the kernel defined by ρn(x)=cn(1+cos(x))n, where cn is chosen so that \int_{-\pi}^\pi \rho_n=1$.
3. Bernstein polynomials.
Take a continuous function f on [0;1] and, for n≥0, set Bnf(x)=k=0∑nf(k/n)(kn)tk(1−t)n−k. It is classical that Bnf converges uniformly to f on [0;1].
There are two classical proofs of Bernstein's theorem. One is probabilistic and consists in observing that Bnf(x) is the expected value of f(Sn), where Sn is the sum of n i.i.d. Bernoulli random variables with parameter x∈[0;1]. Another (generalized as the Korovkin theorem, “On convergence of linear positive operators in the space of continuous functions”, Dokl. Akad. Nauk SSSR (N.S.), vol. 90, ) consists in showing (i) that for f=1,x,x2, Bnf converges uniformly to f (an explicit calculation), (ii) that if f≥0, then Bnf≥0 as well, (iii) for every x∈[0;1], squeezing f inbetween two quadratic polynomials f+ and f− such that f+(x)−f−(x) is as small as desired.
A trigonometric variant would be given by Fejér's theorem that the Cesàro averages of a Fourier series of a continuous, 2π-periodic function converge uniformly to that function. In turn, Fejér's theorem can be proved in both ways, either by convolution (the Fejér kernel is nonnegative), or by a Korovkine-type argument (replacing 1,x,x2 on [0;1] by 1,z,z2,z−1,z−2 on the unit circle).
4. Using approximation by step functions.
This proof originates with a paper of H. Kuhn, “Ein elementarer Beweis des Weierstrasschen Approximationsatzes,” Arch. Math. 15 (1964), p. 316–317.
Let us show that for every δ∈]0,1[ and every ε>0, there exists a polynomial p satisfying the following properties:
- 0≤p(x)≤ε for −1≤x≤−δ;
- 0≤p(x)≤1 for −δ≤x≤δ;
- 1−ε≤p(x)≤1 for δ≤x≤1.
A possible formula is p(x)=(1−((1−x)/2))n)2n, where n is a large enough integer. First of all, one has 0≤(1−x)/2≤1 for every x∈[−1;1], so that 0≤p(x)≤1. Let x∈[−1;−δ]; then one has (1−x)/2≥(1+δ)/2, hence p(x)≤(1−((1+δ)/2)n)2n, which can be made arbitrarily small when n→∞. Let finally x∈[δ;1]; then (1−x)/2≥(1−δ)/2, hence p(x)≥(1−((1−δ)/2)n)2n≥1−(1−δ)n, which can be made arbitrarily close to 1 when n→∞.
By translation and dilations, the discontinuity can be placed at any element of [0;1]. Let now f be an arbitrary step function and let us write it as a linear combination f=∑aifi, where fi is a {0,1}-valued step function. For every i, let pi be a polynomial that approximates fi as given above. The linear combination ∑aipi approximates f with maximal error sup(∣ai∣).
Using uniform continuity of continuous functions on [−1;1], every continuous function can be uniformly approximated by a step function. This concludes the proof.
5. Using approximation by piecewise linear functions.
As in the proof of Stone's theorem, one uses the fact that the function x↦∣x∣ is uniformly approximated by a sequence of polynomial on [−1;1]. Consequently, so are the functions x↦max(0,x)=(x+∣x∣)/2 and x↦min(0,x)=(x−∣x∣)/2. By translation and dilation, every continuous piecewise linear function on [−1;1] with only one break point is uniformly approximated by polynomials. By linear combination, every continuous piecewise linear affine function is uniformly approximated by polynomials.
By uniform continuity, every continuous function can be uniformly approximated by continuous piecewise linear affine functions. Weierstrass's theorem follows.
6. Moments.
A linear subspace A of a Banach space is dense if and only if every continuous linear form which vanishes on A is identically 0. In the present case, the dual of C0([−1;1],C) is the space of complex measures on [−1;1] (Riesz theorem, if one wish, or the definition of a measure). So let μ be a complex measure on [−1;1] such that ∫−11tndμ(t)=0 for every integer n≥0; let us show that μ=0. This is the classical problem of showing that a complex measure on [−1;1] is determined by its moments. In fact, the classical proof of this fact runs the other way round, and there must exist ways to reverse the arguments.
One such solution is given in Rudin's Real and complex analysis, where it is more convenient to consider functions on the interval $[0;1]$. So, let $F(z)=\int_0^1t^z\,d\mu(t)$. The function $F$ is holomorphic and bounded on the half-plane $\Re(z)>0$ and vanishes at the positive integers. At this point, Rudin makes a conformal transformation to the unit disk (setting $w=(z-1)/(z+1)$) and gets a bounded holomorphic function on the unit disk with zeroes at $w_n=(n-1)/(n+1)=1-2/(n+1)$, for $n\in\mathbf N$. If this function were not identically zero, its zeroes would satisfy the Blaschke condition $\sum(1-|w_n|)<\infty$, and this contradicts the fact that the series $\sum1/(n+1)$ diverges.
In Rudin, this method is used to prove the more general Müntz–Szász theorem according to which the family $(t^{\lambda_n})$ generates a dense subset of $C([0;1])$ if and only if $\sum1/\lambda_n=+\infty$.
Here is another solution I learnt in a paper by L. Carleson (“Mergelyan's theorem on uniform polynomial approximation”, Math. Scand., 1964).
For every complex number $a$ such that $|a|>1$, one can write $1/(t-a)$ as a converging power series in $t$. By summation, this quickly gives that
$$F(a)=\int_{-1}^1\frac{1}{t-a}\,d\mu(t)\equiv0.$$
Observe that this formula defines a holomorphic function on $\mathbf C\setminus[-1;1]$; by analytic continuation, one thus has $F(a)=0$ for every $a\notin[-1;1]$.
Take a $C^2$-function $g$ with compact support on the complex plane. For every $t\in\mathbf C$, one has the following formula
$$\iint\bar\partial g(z)\,\frac{1}{t-z}\,dx\,dy=g(t),$$
which implies, by integration and Fubini, that
$$\int_{-1}^1g(t)\,d\mu(t)=\iint\!\int\bar\partial g(z)\,\frac{1}{t-z}\,d\mu(t)\,dx\,dy=\iint\bar\partial g(z)F(z)\,dx\,dy=0.$$
On the other hand, every $C^2$ function on $[-1;1]$ can be extended to such a function $g$, so that the measure $\mu$ vanishes on every $C^2$ function on $[-1;1]$. Approximating a continuous function by a $C^2$ function (first take a piecewise linear approximation, and round the corners), we get that $\mu$ vanishes on every continuous function, as was to be proved.
7. Chebyshev/Markov systems.
This proof is due to P. Borwein and taken from the book Polynomials and polynomial inequalities, by P. Borwein and T. Erdélyi (Graduate Texts in Maths, vol. 161, 1995). Let us say that a sequence $(f_n)$ of continuous functions on an interval $I$ is a Markov system (resp. a weak Markov system) if for every integer $n$, every linear combination of $(f_0,\dots,f_n)$ has at most $n$ zeroes (resp. $n$ sign changes) in $I$.
Given a Markov system $(f_n)$, one defines a sequence $(T_n)$, where $f_n-T_n$ is the element of $\langle f_0,\dots,f_{n-1}\rangle$ which is the closest to $f_n$. The function $T_n$ has $n$ zeroes on the interval $I$; let $M_n$ be the maximum distance between two consecutive zeroes.
Borwein's theorem (Theorem 4.1.1 in the mentioned book) then asserts that if the sequence $(f_n)$ is a Markov system consisting of $C^1$ functions, then its linear span is dense in $C(I)$ if and only if $M_n\to0$.
The sequence of monomials $(x^n)$ on $I=[-1;1]$ is of course a Markov system. In this case, the polynomial $T_n$ is, up to a constant factor, the $n$th Chebyshev polynomial, given by $T_n(\cos(x))=\cos(nx)$; its roots are given by $\cos((\pi+2k\pi)/2n)$, for $k=0,\dots,n-1$, and $M_n\leq\pi/n$. This gives yet another proof of Weierstrass's approximation theorem.
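The bound on $M_n$ is easy to check numerically. The following Python sketch assumes the normalization $T_n(\cos\theta)=\cos(n\theta)$ on $[-1;1]$, so that the roots are $\cos((\pi+2k\pi)/2n)$; the function names are mine, chosen for illustration only.

```python
import math


def chebyshev_roots(n: int) -> list[float]:
    """Roots cos((pi + 2k*pi) / (2n)), k = 0..n-1, sorted increasingly."""
    return sorted(math.cos((math.pi + 2 * k * math.pi) / (2 * n)) for k in range(n))


def max_gap(n: int) -> float:
    """Largest distance between two consecutive roots (needs n >= 2)."""
    roots = chebyshev_roots(n)
    return max(b - a for a, b in zip(roots, roots[1:]))


for n in (2, 5, 50):
    # The second column should never exceed the third one.
    print(n, max_gap(n), math.pi / n)
```

The inequality comes from the fact that cosine is 1-Lipschitz, and consecutive angles differ by $\pi/n$.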
Wednesday, February 24, 2016
Sound and color
Just back home from The Stone where I could hear two very interesting sets with pianist Russ Lossing and drummer Gerry Hemingway, first in duet, and then in quartet with Loren Stillman on alto saxophone and Samuel Blaser on trombone.
I was absolutely excited at the prospect of returning to this avant-garde jazz hall (this was my third concert there; the first one was in 2010, with Sylvie Courvoisier, Thomas Morgan and Ben Perowski, and the second, last year, with Wadada Leo Smith and Vijay Iyer) to listen to Gerry Hemingway, and the cold rain falling on New York City did not diminish my enthusiasm. (Although I had to take care on the streets, for one could see almost nothing...) I feared I would arrive late, but Gerry Hemingway was still installing his tools, various sticks, small cymbals, woodblocks, as well as a cello bow...
I admit, it took me some time to appreciate the music. Of course, it was free jazz (so what?) and I couldn't really follow the stream of music. Both musicians were acting delicately and skillfully (no discussion) at creating sound, as a painter would spread brush strokes on a canvas—and actually, Hemingway was playing a lot of brushes, those drum sticks made of many (wire or plastic) strings that have a delicate and not very resonating sound... Color after color, something was emerging, sound was being shaped.
There is an eternal discussion about the nature of music (is it rhythm? melody? harmony?) and consequently about the role of each instrument in the shaping of the music. A related question is the way a given instrument should be used to produce sound.
None of the obvious answers was to be heard tonight. Russ Lossing sometimes struck the strings of the grand piano with mallets, something almost classical in avant-garde piano music. I should have been prepared by the concert of Tony Malaby's Tubacello, which I attended with François Loeser at Sons d'hiver a few weeks ago, where John Hollenbeck simultaneously played drums and prepared piano, but the playing of Gerry Hemingway brought me much surprise. He could blow on the heads of the drums, hit them with a woodblock or strange plastic mallets; he could make the cymbals vibrate by pressing the cello bow on them; he could also take the top hi-hat cymbal in his left hand, and then either hit it with a stick, or press it on the snare drum, thereby producing a mixture of snare/cymbal sound; during a long drum roll, he could also vary the pitch of the sound by pressing the drum head with his right foot. Can you imagine the scene?
It was while talking with him in between the two sets that I gradually understood (some of) his musical conception: how everything is about sound and color. That's why he uses an immense palette of tools, to produce the sounds he feels would best fit the music. He also discussed extended technique, by which he means not the kind of drumistic virtuosity that could allow you (unfortunately, not me...) to play the 26 drum rudiments at 300bpm, but extending the range of sounds he can consistently produce with his “basic Buddy Rich type instrument” (Google a picture of Terry Bozzio's drumkit if you don't see what I mean). He described himself as a colorist, who thinks of his instrument in terms of pitches; he also said how rhythm also exists in negative, when it is not played explicitly. A striking remark, because it exactly depicted how I understand the playing of one of my favorite jazz drummers, Paul Motian, whom I couldn't appreciate until I became able to hear what he did not play.
The second set did not sound as abstract as the first one. Probably the two wind instruments helped give the sound more flesh and more texture. Samuel Blaser, on the trombone, was absolutely exceptional (go listen at once to his Spring Rain album, an alliance of Jimmy Giuffre and contemporary jazz) and Loren Stillman sang very beautiful melodic lines on the alto sax. The four of them could also play in all combinations, and with extremely interesting dynamics, going effortlessly from one to another. And when a wonderful moment of thunder ended abruptly with the first notes of Paul Motian's Etude, music turned into pure emotion.
Tuesday, February 9, 2016
Happy New Year!
Labels: algebra, linear algebra, Sylow subgroups
As was apparently first noticed by Noam Elkies, 2016 is the cardinality of the general linear group over the field with 7 elements, $G=\mathrm{GL}(2,\mathbf F_7)$. I was mentoring an agrégation lesson on finite fields this afternoon, and I could not resist having the student check this. Then came the natural question of describing the Sylow subgroups of this finite group. This is what I describe here.
First of all, let's recall the computation of the cardinality of $G$. The first column of a matrix in $G$ must be non-zero, hence there are $7^2-1$ possibilities; for the second column, it only needs to be non-collinear to the first one, and each choice of the first column forbids 7 second columns, hence $7^2-7$ possibilities. In the end, one has $\mathrm{Card}(G)=(7^2-1)(7^2-7)=48\cdot42=2016$. The same argument shows that the cardinality of the group $\mathrm{GL}(n,\mathbf F_q)$ is equal to $(q^n-1)(q^n-q)\cdots(q^n-q^{n-1})=q^{n(n-1)/2}(q-1)(q^2-1)\cdots(q^n-1)$.
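For the skeptical reader, the count is small enough to be checked by brute force. Here is a minimal Python sketch (the function names are mine): one function enumerates all $2\times2$ matrices over $\mathbf F_p$ with non-zero determinant, the two others implement both forms of the general formula.

```python
from itertools import product


def gl2_order_bruteforce(p: int) -> int:
    """Count 2x2 matrices (a b; c d) over F_p with non-zero determinant."""
    return sum(1 for a, b, c, d in product(range(p), repeat=4)
               if (a * d - b * c) % p != 0)


def gl_order_columns(n: int, q: int) -> int:
    """(q^n - 1)(q^n - q)...(q^n - q^(n-1)): choose the columns one by one."""
    result = 1
    for i in range(n):
        result *= q ** n - q ** i
    return result


def gl_order_factored(n: int, q: int) -> int:
    """q^(n(n-1)/2) (q - 1)(q^2 - 1)...(q^n - 1)."""
    result = q ** (n * (n - 1) // 2)
    for i in range(1, n + 1):
        result *= q ** i - 1
    return result


print(gl2_order_bruteforce(7), gl_order_columns(2, 7), gl_order_factored(2, 7))
```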
Let's go back to our example. The factorization of this cardinality comes easily: $2016=(7^2-1)(7^2-7)=(7-1)(7+1)\cdot7\cdot(7-1)=6\cdot8\cdot7\cdot6=2^5\cdot3^2\cdot7$. Consequently, there are three Sylow subgroups to find, for the prime numbers 2, 3 and 7.
The case $p=7$ is the most classical one. One needs to find a group of order 7, and one such subgroup is given by the group of upper triangular matrices $\begin{pmatrix}1&*\\0&1\end{pmatrix}$. What makes things work is that $p$ is the characteristic of the chosen finite field. In general, if $q$ is a power of $p$, then the subgroup of upper-triangular matrices in $\mathrm{GL}(n,\mathbf F_q)$ with 1s on the diagonal has cardinality $q\cdot q^2\cdots q^{n-1}=q^{n(n-1)/2}$, which is exactly the highest power of $p$ dividing the cardinality of $\mathrm{GL}(n,\mathbf F_q)$.
Let's now study $p=3$. We need to find a group $S$ of order $3^2=9$ inside $G$. There are a priori two possibilities, either $S\simeq(\mathbf Z/3\mathbf Z)^2$, or $S\simeq\mathbf Z/9\mathbf Z$.
We will find a group of the first sort, which will show that the second case doesn't happen, because all 3-Sylows are pairwise conjugate, hence isomorphic.
Now, the multiplicative group $\mathbf F_7^\times$ is of order 6, and is cyclic, hence contains a subgroup of order 3, namely $C=\{1,2,4\}$. Consequently, the group of diagonal matrices with coefficients in $C$ is isomorphic to $(\mathbf Z/3\mathbf Z)^2$ and is our desired 3-Sylow.
Another reason why $G$ does not contain a subgroup $S$ isomorphic to $\mathbf Z/9\mathbf Z$ is that it does not contain elements of order 9. Let's argue by contradiction and consider a matrix $A\in G$ of order 9, so that $A^9=I$; then its minimal polynomial $P$ divides $T^9-1$. Since $7\nmid9$, the matrix $A$ is diagonalizable over the algebraic closure of $\mathbf F_7$. The eigenvalues of $A$ are 9th roots of unity, and are at most quadratic over $\mathbf F_7$ since $\deg(P)\leq2$. On the other hand, if $\alpha$ is a 9th root of unity belonging to $\mathbf F_{49}$, one has $\alpha^9=\alpha^{48}=1$, hence $\alpha^3=1$ since $\gcd(9,48)=3$. Consequently, every eigenvalue of $A$ is a cubic root of unity and $A^3=I$, so that the order of $A$ divides 3, a contradiction.
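Again, the group is small enough for a direct check. The sketch below (Python, with matrices stored as row-major 4-tuples, a convention of mine) computes the order of every element of $G$, confirms that 9 does not occur, and checks that the nine diagonal matrices with coefficients in $C=\{1,2,4\}$ all have order 1 or 3.

```python
from collections import Counter
from itertools import product

P = 7
I2 = (1, 0, 0, 1)  # identity matrix, stored row by row as (a, b, c, d)


def mul(x, y):
    """Product of two 2x2 matrices over F_7."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a * e + b * g) % P, (a * f + b * h) % P,
            (c * e + d * g) % P, (c * f + d * h) % P)


def order(m):
    """Multiplicative order of an invertible matrix m."""
    k, acc = 1, m
    while acc != I2:
        acc = mul(acc, m)
        k += 1
    return k


# All invertible matrices, then the number of elements of each order.
G = [m for m in product(range(P), repeat=4) if (m[0] * m[3] - m[1] * m[2]) % P]
orders = Counter(order(m) for m in G)

# Candidate 3-Sylow: diagonal matrices with entries in C = {1, 2, 4}.
sylow3 = [(a, 0, 0, d) for a in (1, 2, 4) for d in (1, 2, 4)]

print(sorted(orders.items()))
print(sorted(order(m) for m in sylow3))
```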
It remains to treat the case $p=2$, which I find slightly trickier. Let's try to find elements $A$ in $G$ whose order divides $2^5$. As above, such an element is diagonalizable in an algebraic closure, its minimal polynomial divides $T^{32}-1$, and its roots belong to $\mathbf F_{49}$, hence satisfy $\alpha^{32}=\alpha^{48}=1$, hence $\alpha^{16}=1$. Conversely, $\mathbf F_{49}^\times$ is cyclic of order 48, hence contains an element $\alpha$ of order 16, and such an element is quadratic over $\mathbf F_7$ (it cannot belong to $\mathbf F_7$, since 16 does not divide 6), hence its minimal polynomial $P$ has degree 2. The corresponding companion matrix $A$ in $G$ is an element of order 16, generating a subgroup $S_1$ of $G$ isomorphic to $\mathbf Z/16\mathbf Z$. We also observe that $\alpha^8=-1$ (because its square is 1 and $\alpha^8\neq1$); since $A^8$ is diagonalizable in an algebraic closure with $-1$ as the only eigenvalue, this shows $A^8=-I$.
Now, there exists a 2-Sylow subgroup $S$ containing $S_1$, and $S_1$ will be a normal subgroup of $S$ (because its index is the smallest prime number dividing the order of $S$, which is 2). This suggests introducing the normalizer $N$ of $S_1$ in $G$. One then has $S_1\subset S\subset N$. Let $s\in S$ be such that $s\notin S_1$; then there exists a unique $k\in\{1,\dots,15\}$ such that $s^{-1}As=A^k$, and $s^{-2}As^2=A^{k^2}=A$ (because $s$ has order 2 modulo $S_1$), hence $k^2\equiv1\pmod{16}$; in other words, $k\equiv\pm1\pmod8$.
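The last equivalence is a one-line check; here it is in Python, just listing the square roots of 1 modulo 16 among the possible exponents.

```python
# Exponents k in {1, ..., 15} with k^2 = 1 modulo 16.
solutions = [k for k in range(1, 16) if (k * k) % 16 == 1]
print(solutions)  # [1, 7, 9, 15]
```

These are exactly the residues congruent to $\pm1$ modulo 8.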
There exists a natural choice of $s$: the involution ($s^2=I$) which exchanges the two eigenspaces of $A$. To finish the computation, it's useful to take a specific example of polynomial $P$ of degree 2 whose roots in $\mathbf F_{49}$ are primitive 16th roots of unity. In other words, we need to factor the 16th cyclotomic polynomial $\Phi_{16}=T^8+1$ over $\mathbf F_7$ and find a factor of degree 2; actually, Galois theory shows that all factors have the same degree, so that there should be 4 factors of degree 2. To explain the following computation, a remark is useful. Let $\alpha$ be a primitive 16th root of unity in $\mathbf F_{49}$; we have $(\alpha^8)^2=1$ but $\alpha^8\neq1$, hence $\alpha^8=-1$. If $P$ is the minimal polynomial of $\alpha$, the other root is $\alpha^7$, hence the constant term of $P$ is equal to $\alpha\cdot\alpha^7=\alpha^8=-1$.
We start from $T^8+1=(T^4+1)^2-2T^4$ and observe that $2\equiv4^2\pmod7$, so that $T^8+1=(T^4+1)^2-4^2T^4=(T^4+4T^2+1)(T^4-4T^2+1)$. To find the factors of degree 2, we remember that their constant terms should be equal to $-1$. We thus go on differently, writing $T^4+4T^2+1=(T^2+aT-1)(T^2-aT-1)$ and solving for $a$: this gives $-2-a^2=4$, hence $a^2=-6=1$ and $a=\pm1$. The other factors are found similarly and we get
$$T^8+1=(T^2-T-1)(T^2+T-1)(T^2-4T-1)(T^2+4T-1).$$
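This factorization can be double-checked by multiplying everything out modulo 7. The following Python sketch does this with a hand-written polynomial product (coefficients listed from the constant term upwards, a convention of mine), and also checks that none of the four factors has a root in $\mathbf F_7$, so that they are indeed irreducible.

```python
P = 7


def poly_mul(f, g):
    """Multiply two polynomials over F_7 (coefficients by increasing degree)."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % P
    return out


def has_root(f):
    """True if the polynomial f has a root in F_7."""
    return any(sum(c * x ** i for i, c in enumerate(f)) % P == 0 for x in range(P))


# T^2 - T - 1, T^2 + T - 1, T^2 - 4T - 1, T^2 + 4T - 1
factors = [[-1, -1, 1], [-1, 1, 1], [-1, -4, 1], [-1, 4, 1]]

prod = [1]
for f in factors:
    prod = poly_mul(prod, f)

print(prod)  # coefficients of T^8 + 1
print([has_root(f) for f in factors])
```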
We thus choose the factor $T^2-T-1$ and set $A=\begin{pmatrix}0&1\\1&1\end{pmatrix}$.
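Two eigenvectors for $A$ are $v=\begin{pmatrix}1\\\alpha\end{pmatrix}$ and $v'=\begin{pmatrix}1\\\alpha'\end{pmatrix}$, where $\alpha'=\alpha^7$ is the other root of $T^2-T-1$. The equations for $B$ are $Bv=v'$ and $Bv'=v$; this gives $B=\begin{pmatrix}1&0\\1&-1\end{pmatrix}$. The subgroup $S=\langle A,B\rangle$ generated by $A$ and $B$ has order 32 and is a 2-Sylow subgroup of $G$.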
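Since an earlier version of this computation was wrong, it does not hurt to let a machine confirm the final claim. In the Python sketch below, the matrices are stored as row-major 4-tuples, with $A$ having rows $(0,1)$, $(1,1)$ and $B$ having rows $(1,0)$, $(1,-1)$; the helper names are mine. It checks that $A^8=-I$, that $B$ is an involution, that $BAB=A^7$, and it enumerates the subgroup generated by $A$ and $B$.

```python
P = 7
I2 = (1, 0, 0, 1)
A = (0, 1, 1, 1)        # companion matrix of T^2 - T - 1
B = (1, 0, 1, P - 1)    # the involution exchanging the two eigenspaces of A


def mul(x, y):
    """Product of two 2x2 matrices over F_7, stored row by row."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a * e + b * g) % P, (a * f + b * h) % P,
            (c * e + d * g) % P, (c * f + d * h) % P)


def power(m, k):
    """k-th power of m, for k >= 0."""
    acc = I2
    for _ in range(k):
        acc = mul(acc, m)
    return acc


def generated_subgroup(gens):
    """Closure of the generators under multiplication (enough in a finite group)."""
    seen = {I2}
    frontier = [I2]
    while frontier:
        new = []
        for x in frontier:
            for g in gens:
                y = mul(x, g)
                if y not in seen:
                    seen.add(y)
                    new.append(y)
        frontier = new
    return seen


S = generated_subgroup([A, B])
print(power(A, 8), mul(B, B), mul(mul(B, A), B) == power(A, 7), len(S))
```

The relation $BAB=A^7$ corresponds to the exponent $k=7$ among the possibilities $k\equiv\pm1\pmod8$ found above.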
Generalizing this method involves finding large commutative $p$-subgroups (such as $S_1$) which belong to appropriate (possibly non-split) tori of $\mathrm{GL}(n)$ and combining them with adequate parts of their normalizer, which is close to considering Sylow subgroups of the symmetric group. The paper Sylow p-subgroups of the classical groups over finite fields with characteristic prime to p by A.J. Weir gives the general description (as well as for orthogonal and symplectic groups), building on an earlier paper in which he constructed Sylow subgroups of symmetric groups. See also the paper Some remarks on Sylow subgroups of the general linear groups by C. R. Leedham-Green and W. Plesken which says a lot about maximal $p$-subgroups of the general linear group (over not necessarily finite fields). Also, the question was recently the subject of interesting discussions on MathOverflow.
[Edited on Febr. 14 to correct the computation of the 2-Sylow...]
Monday, January 4, 2016
Model theory and algebraic geometry, 5 — Algebraic differential equations from coverings
Labels: algebraic geometry, differential algebra, model theory
In this final post of this series, I return to elimination of imaginaries in DCF and explain the main theorem from Tom Scanlon's paper Algebraic differential equations from covering maps.
The last ingredient to be discussed is jet spaces.
Differential algebra is seldom used explicitly in algebraic geometry. However, differential techniques have furnished a crucial tool for the study of the Mordell conjecture over function fields (beginning with the proof of this conjecture by Grauert and Manin), and its generalizations in higher dimension (theorem of Bogomolov on surfaces satisfying $c_1^2>3c_2$), or for holomorphic curves (conjecture of Green-Griffiths). They are often reformulated within the language of jet bundles.
Let us assume that $X$ is a smooth variety over a field $k$. Its tangent bundle $T(X)$ is a vector bundle over $X$ whose fiber at a (geometric) point $x$ is the tangent space $T_x(X)$ of $X$ at $x$. By construction, every morphism $f\colon Y\to X$ of algebraic varieties induces a tangent morphism $Tf\colon T(Y)\to T(X)$: it maps a tangent vector $v\in T_y(Y)$ at a (geometric) point $y\in Y$ to the tangent vector $T_yf(v)\in T_{f(y)}(X)$ at $f(y)$. This can be rephrased in the language of differential algebra as follows: for every differential field $(K,\partial)$ whose field of constants contains $k$, one has a derivative map $\nabla_1\colon X(K)\to T(X)(K)$. Here is the relation, where we assume that $K$ is the field of functions of a variety $Y$. A derivation $\partial$ on $K$ can be viewed as a vector field $V$ on $Y$, possibly not defined everywhere; replacing $Y$ by a dense open subset if needed, we assume that it is defined everywhere. Now, a point $x\in X(K)$ can be identified with a rational map $f\colon Y\dashrightarrow X$, defined on an open subset $U$ of $Y$. Then, we simply consider the morphism from $U$ to $T(X)$ given by $p\mapsto T_pf(V_p)$. At the level of function fields, this is our point $\nabla_1(x)\in T(X)(K)$.
If one wants to look at higher derivatives, the construction of the tangent bundle can be iterated and gives rise to jet bundles which are varieties $J_m(X)$, defined for all integers $m\geq0$, such that $J_0(X)=X$, $J_1(X)=T(X)$, and for $m\geq1$, $J_m(X)$ is a vector bundle over $J_{m-1}(X)$ modelled on the $m$th symmetric product of $\Omega^1_X$. For every differential field $(K,\partial)$ whose field of constants contains $k$, there is a canonical $m$th derivative map $\nabla_m\colon X(K)\to J_m(X)(K)$.
The construction of the jet bundles can be given so that the following requirements are satisfied:
- If $X=\mathbf A^1$ is the affine line, then $J_m(X)$ is an affine space of dimension $m+1$, and $\nabla_m$ is just given by $\nabla_m(x)=(x,\partial(x),\dots,\partial^m(x))$ for $x\in X(K)=K$;
- Products: $J_m(X\times Y)=J_m(X)\times_kJ_m(Y)$;
- Open immersions: if $U$ is an open subset of $X$, then $J_m(U)$ is the open subset of $J_m(X)$ given by the preimage of $U$ under the projection $J_m(X)\to J_{m-1}(X)\to\dots\to J_0(X)=X$.
- When $X$ is an algebraic group, with origin $e$, then $J_m(X)$ is canonically isomorphic to the product of $X$ by the affine space $J_m(X)_e$ of $m$-jets at $e$.
Let $G$ be a complex algebraic group acting on a complex algebraic variety $X$; let $S\colon X\to Z$ be the corresponding generalized Schwarzian map. Here, $Z$ is a complex algebraic variety, but $S$ is a differential map of some order $m$. In other words, there exists a constructible algebraic map $\tilde S\colon J_m(X)\to Z$ such that $S(x)=\tilde S(\nabla_m(x))$ for every differential field $(K,\partial)$ and every point $x\in X(K)$.
Let U be an open subset of X(C), for the complex topology, and let Γ be a Zariski dense subgroup of G(C) which stabilizes U. We assume that there exists a complex algebraic variety Y and a biholomorphic map p:Γ\U→Y(C).
Locally, every open holomorphic map $\phi\colon\Omega\to Y(\mathbf C)$ can be lifted to a holomorphic map $\tilde\phi\colon\Omega\to U$. Two liftings differ locally by the action of an element of $\Gamma$, so that the composition $S\circ\tilde\phi$ does not depend on the choice of the lifting, by definition of the generalized Schwarzian map $S$. This gives a well-defined differential-analytic map $T\colon Y\to Z$. Let $m$ be the maximal order of derivatives appearing in a formula defining $T$. Then one may write $T\circ\phi=\tilde T\circ\nabla_m\phi$, where $\tilde T$ is a constructible analytic map from $J_m(Y)$ to $Z$.
Theorem (Scanlon). — Assume that there exists a fundamental domain $F\subset U$ such that the map $p|_F\colon F\to Y(\mathbf C)$ is definable in an o-minimal structure. Then $T$ is differential-algebraic: there exists a constructible map $\tilde T\colon J_m(Y)\to Z$ such that $T\circ\phi=\tilde T\circ J_m(\phi)$ for every $\phi$ as above.
For the proof, observe that the map $\tilde T$ is definable in an o-minimal structure, because it is obtained, by passing to the quotient, from a definable map on the preimage in $J_m(U)$ of $F$, and o-minimal structures allow elimination of imaginaries. By the theorem of Peterzil and Starchenko, it is constructible algebraic.