Showing posts with label finite fields. Show all posts
Showing posts with label finite fields. Show all posts

Friday, September 13, 2024

The combinatorial Nullstellensatz

The “combinatorial Nullstellensatz” is a relatively elementary statement due to Noga Alon (1999) whose name, while possibly frightening, really says what it is and what it is good for. (A freely available version is there.) Nullstellensatz is the classic name for a theorem of David Hilbert that relates loci in $F^n$ defined by polynomial equations and the ideals of the polynomial ring $F[T_1,\dots,T_n]$ generated by these equations, when $F$ is an algebraically closed field. The word itself is German and means “theorem about the location of zeroes”. The precise correspondence is slightly involved, and stating it here precisely would derail us from the goal of this post, so let us stay with these informal words for the moment. The adjective combinatorial in the title refers to the fact that Alon deduced many beautiful consequences of this theorem in the field of combinatorics, and for some of them, furnishes quite simple proofs. We'll give one of them below, the Cauchy—Davenport inequality, but my main motivation in writing this text was to discuss the proof of the combinatorial Nullstellensatz, in particular untold aspects of the proofs which took me some time to unravel.$\gdef\Card{\operatorname{Card}}$

The context is the following: $F$ is a field, $n$ is a positive integer, and one considers finite sets $S_1,\dots,S_n$ in $F$, and the product set $S=S_1\times \dots \times S_n$ in $F^n$. One considers polynomials $f\in F[T_1,\dots,T_n]$ in $n$ indeterminates and their values at the points $(s_1,\dots,s_n)$ of $S$. For every $i\in\{1,\dots,n\}$, set $g_i = \prod_{a\in S_i} (T_i-a)$; this is a polynomial of degree $\Card(S_i)$ in the indeterminate $T_i$.

Theorem 1. — If a polynomial $f\in F[T_1,\dots,T_n]$ vanishes at all points $s\in S$, there exist polynomials $h_1,\dots,h_n\in F[T_1,\dots,T_n]$ such that $f=g_1 h_1+\dots+g_n h_n$, and for each $i$, one has $\deg(g_i h_i)\leq \deg (f)$.

This theorem is really close to the classic Nullstellensatz, but the specific nature of the set $S$ allows to have a weaker hypothesis (the field $F$ is not assumed to be algebraically closed) and a stronger conclusion (the Nullstellensatz would assert that there exists some power $f^k$ of $f$ that is of this form, saying nothing of the degrees).

Its proof relies on a kind of Euclidean division algorithm. Assume that $f$ has some monomial $c_mT^m=c_mT_1^{m_1}\dots T_n^{m_n}$ where $m_i\geq \Card(S_i)$; then one can consider a Euclidean division (in the variable $T_i$), $T_i^{m_i}=p_i g_i + r_i$, where $\deg(r_i)<\Card(S_i)$. One can then replace this monomial $c_mT^m$ in $f$ by $c_m T^m/T_i^{m_i})r_i$, and, at the same time register $c_m T^m/T_i^{m_i} p_i$ to $h_i$. Since this amounts to subtract some multiple of $g_i$, the vanishing assumption still holds. Applying this method repeatedly, one reduces to the case where the degree of $f$ in the variable $T_i$ is $<\Card(S_i)$ for all $i$.
Then a variant of the theorem that says that a polynomial in one indeterminate has no more roots than its degree implies that $f\equiv 0$.

A weak point in the written exposition of this argument is the reason why the iterative construction would terminate. Intuitively, something has to decrease, but one needs a minute (or an hour, or a week) of reflexion to understand what precisely decreases. The problem is that if one works one monomial at a time, the degrees might stay the same. The simplest reason why this procedure indeed works belongs to the theory of Gröbner bases and to the proof that the division algorithm actually works: instead of the degrees with respect to all variables, or of the total degree, one should consider a degree with respect to some lexicographic ordering of the variables — the point is that it is a total ordering of the monomials, so that if one consider the leading term, given by the largest monomial, the degree will actually decrease.

The second theorem is in fact the one which is needed for the applications to combinatorics. Its statement is rather written in a contrapositive way — it will show that $f$ does not vanish at some point of $S$.

Theorem 2.Let $f\in F[T_1,\dots,T_m]$ and assume that $f$ has a nonzero monomial $c_mT^m$, where $|m|$ is the total degree of $f$. If, moreover, $m_i<\Card(S_i)$ for all $i$, then there exists a point $s=(s_1,\dots,s_n)\in S$ such that $f(s)\neq 0$.

It is that theorem whose proof took me a hard time to understand. I finally managed to formalize it in Lean, hence I was pretty sure I had understood it. In fact, writing this blog post helped me simplify it further, removing a couple of useless arguments by contradiction! Anyway, I feel it is worth being told with a bit more detail than in the original paper.

We argue by contradiction, assuming that $f(s)=0$ for all $s\in S_1\times\dots \times S_n$. Applying theorem 1, we get an expression $f=\sum_{i=1}^n g_i h_i$ where $\deg(g_ih_i)\leq \deg(f)$ for all $i$. The coefficient of $T^m$ in $f$ is non zero, by assumption; it is also the sum of the coefficients of the coefficients of $T^m$ in $g_i h_i$, for $1\leq i\leq n$, and we are going to prove that all of them vanish — hence getting the desired contradiction. $\gdef\coeff{\operatorname{coeff}}$

Fix $i\in\{1,\dots, n\}$.  If $h_i=0$, then $\coeff_m(g_ih_i)=\coeff_m(0)=0$, hence we may assume that $h_i\neq0$. This implies that $\deg(g_i h_i)=\Card(S_i)+\deg(h_i) \leq \deg(f)=|m|$.
One then has
\[ \coeff_m(g_i h_i)=\sum_{p+q=m} \coeff_p (g_i) \coeff_q(h_i), \]
and it suffices to prove that all terms of this sum vanish. Fix $p,q$ such that $p+q=m$, assume that  $\coeff_p(g_i)\neq 0$, and let us prove that $\coeff_q(h_i)=0$.

By the very definition of $g_i$ as a product of $\Card(S_i)$ factors of the form $T_i-a$, this implies that $p_j=0$ for all $j\neq i$. Moreover, $p_i\leq p_i+q_i=m_i < \Card(S_i)$, by the assumption of the theorem, hence $p_i<\Card(S_i)$. This implies \[\Card(S_i)+\deg(h_i)\leq |m|= |p+q|=|p|+|q|=p_i+|q|<\Card(S_i)+|q|.\]
Subtracting $\Card(S_i)$ on both sides, one gets $\deg(h_i)<|q|$, hence $\coeff_q(h_i)=0$, as was to be shown.

To conclude, let us add the combinatorial application to the Cauchy—Davenport theorem.
$\gdef\F{\mathbf F}$

Theorem 3.Let $p$ be a prime number and let $A$ and $B$ be nonempty subsets of $\F_p$, the field with $p$ elements. Denote by $A+B$ the set of all $a+b$, for $a\in A$ and $b\in B$. Then
\[ \Card(A+B) \geq \min (p, \Card(A)+\Card(B)-1).\]


First consider the case where $\Card(A)+\Card(B)>p$, so that $\min(p,\Card(A)+\Card(B)-1)=p$.  In this case, for every $c\in\F_p$, one has
\[\Card(A \cap (c-B))\cup \Card(A\cup (c-B))=\Card(A)+\Card(c-B)>\Card(A)+\Card(B)>p,\]
so that the sets $A$ and $c-B$ intersect. In particular, there exist $a\in A$ and $b\in B$ such that $a+b=c$ and $A+B=\F_p$, and the desired inequality holds.

Let us now treat the case where $\Card(A)+\Card(B)\leq p$, so that $\min(p,\Card(A)+\Card(B)-1)=\Card(A)+\Card(B)-1$. We will assume that the conclusion of the theorem does not hold, that is, $\Card(A+B)\leq \Card(A)+\Card(B)-2$, and derive a contradiction. Let us consider the polynomial $f\in \F_p[X,Y]$ defined by $f=\prod _{c\in A+B} (X+Y-c)$. The degree of $f$ is equal to $\Card(A+B)\leq \Card(A)+\Card(B)-2$. Choose natural integers $u$ and $v$ such that  $u+v=\Card(A+B)$, $u\leq\Card(A)-1$ and $v\leq \Card(B)-1$. The coefficient of $X^u Y^v$ is equal to the binomial coefficient $\binom{u+v}u$. Since $u+v\leq \Card(A)+\Card(B)-2\leq p-2$, this coefficient is nonzero in $\F_p$.  By theorem 2, there exists $(a,b)\in A\times B$ such that $f(a,b)\neq 0$. However, one has $a+b\in A+B$, hence $f(a,b)=0$ by the definition of $f$. This contradiction concludes the proof that $\Card(A+B)\geq \Card(A)+\Card(B)-1$. It also concludes this post.

Wednesday, March 22, 2017

Warning! — Theorems ahead

Don't worry, no danger ahead! — This is just a short post about the German mathematician Ewald Warning and the theorems that bear his name.

It seems that Ewald Warning's name will be forever linked with that of Chevalley, for the Chevalley-Warning theorem is one of the rare modern results that can be taught to undergraduate students; in France, it is especially famous at the Agrégation level. (Warning published a second paper, in 1959, about the axioms of plane geometry.)

Warning's paper, Bemerkung zur vorstehenden Arbeit von Herrn Chevalley (About a previous work of Mr Chevalley), has been published in 1935 in the Publications of the mathematical seminar of Hamburg University (Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg), just after the mentioned paper of Chevalley. Emil Artin had a position in Hamburg at that time, which probably made the seminar very attractive; as a matter of fact, the same 1935 volume features a paper of Weil about Riemann-Roch, one of Burau about braids, one of Élie Cartan about homogeneous spaces, one of Santalo on geometric measure theory, etc.

1. The classic statement of the Chevalley—Warning theorem is the following.

Theorem 1. – Let $p$ be a prime number, let $q$ be a power of $p$ and let $F$ be a field with $q$ elements. Let $f_1,\dots,f_m$ be polynomials in $n$ variables and coefficients in $F$, of degrees $d_1,\dots,d_m$; let $d=d_1+\dots+d_m$. Let $Z=Z(f_1,\dots,f_m)$ be their zero-set in $F^n$. If $d<n$, then $p$ divides $\mathop{\rm Card}(Z)$.

This is really a theorem of Warning, and Chevalley's theorem was the weaker consequence that if $Z\neq\emptyset$, then $Z$ contains at least two points. (In fact, Chevalley only considers the case $q=p$, but his proof extends readily.) The motivation of Chevalley lied in the possibility to apply this remark to the reduced norm of a possibly noncommutative finite field (a polynomial of degree $d$ in $d^2$ variables which vanishes exactly at the origin), thus providing a proof of Wedderburn's theorem.

a) Chevalley's proof begins with a remark. For any polynomial $f\in F[T_1,\ldots,T_n]$, let $f^*$ be the polynomial obtained by replacing iteratively $X_i^q$ by $X_i$ in $f$, until the degree of $f$ in each variable is $<q$. For all $a\in F^n$, one has $f(a)=f^*(a)$; moreover, using the fact that a polynomial in one variable of degree $<q$ has at most $q$ roots, one proves that if $f(a)=0$ for all $a\in F^n$, then $f^*=0$.

Assume now that $Z$ contains exactly one point, say $a\in F^n$, let $f=\prod (1-f_j^{q-1})$, let $g_a=\prod (1-(x_i-a_i)^{q-1})$. Both polynomials take the value $1$ at $x=a$, and $0$ elsewhere; moreover, $g_a$ is reduced. Consequently, $f^*=g$. Then
$$ (q-1)n=\deg(g_a)=\deg(f^*)\leq \deg(f)=(q-1)\sum \deg(f_j)=(q-1)d, $$
contradicting the hypothesis that $d<n$.

b) Warning's proof is genuinely different. He first defines, for any subset $A$ of $F^n$ a reduced polynomial $g_A=\sum_{a\in A} g_a=\sum_{a\in A}\prod (1-(x_i-a_i)^{q-1})$, and observes that $g_A(a)=1$ if $a\in A$, and $g_A(a)=0$ otherwise.
Take $A=Z$, so that $f^*=g_Z$. Using that $\deg(f^*)\leq \deg(f)=(q-1)d$ and the expansion
$ (x-a)^{q-1} = \sum_{i=0}^{q-1} x^i a^{q-1-i}$, Warning derives from the equality $f^*=g_Z$
the relations
\[ \sum_{a\in Z} a_1^{\nu_1}\dots a_n^{\nu_n}=0, \]
for all $(\nu_1,\dots,\nu_n)$ such that $0\leq \nu_i\leq q-1$ and $\sum\nu_i <(q-1)(n-d)$. The particular case $\nu=0$ implies that $p$ divides $\mathop{\rm Card}(Z)$. More generally:

Proposition 2. — For every polynomial $\phi\in F[T_1,\dots,T_n]$ of reduced degree $<(q-1)(n-d)$, one has $\sum_{a\in A} \phi(a) = 0$.

c) The classic proof of that result is even easier. Let us recall it swiftly. First of all, for every integer $\nu$ such that $0\leq \nu <q$, one has $\sum_{a\in F} a^\nu=0$. This can be proved in many ways, for example by using the fact that the multiplicative group of $F$ is cyclic; on the other hand, for every nonzero element $t$ of $F$, the change of variables $a=tb$ leaves this sum both unchanged and multiplied by $t^\nu,$ so that taking $t$ such that $t^\nu\neq 1$, one sees that this sum vanishes. It follows from that that for every polynomial $f\in F[T_1,\dots,T_n]$ whose degree in some variable is $\lt q-1$, one has $\sum_{a\in F^n} f(a)=0$. This holds in particular if the total degree of $f$ is $\lt (q-1)n$.
Taking $f$ as above proves theorem 1.

2. On the other hand there is a second Warning theorem, which seems to be absolutely neglected in France. It says the following:

Theorem 3. — Keep the same notation as in theorem 1. If $Z$ is nonempty, then $\mathop{\rm Card}(Z)\geq q^{n-d}$.

To prove this result, Warning starts from the following proposition:

Proposition 4. – Let $L,L'$ be two parallel subspaces of dimension $d$ in $F^n$. Then $\mathop{\rm Card}(Z\cap L)$ and $\mathop{\rm Card}(Z\cap L')$ are congruent modulo $p$.

Let $r=n-d$. Up to a change of coordinates, one may assume that $L=\{x_1=\dots=x_{r}=0\}$ and $L'=\{x_1-1=x_2=\dots=x_{r}=0\}$. Let
\[ \phi = \frac{1-x_1^{q-1}}{1-x_1} (1-x_2^{q-1})\cdots (1-x_{r}^{q-1}).\]
This is a polynomial of total degree is $(q-1)r-1<(q-1)(n-d)$. For $a\in F^n$, one has $\phi(a)=1$ if $a\in L$, $\phi(a)=-1$ if $a\in L'$, and $\phi(a)=0$ otherwise. Proposition 4 thus follows from proposition 2. It is now very easy to prove theorem 3 in the particular case where there exists one subspace $L$ of dimension $d$ such that $\mathop{\rm Card}(Z\cap L)\not\equiv 0\pmod p$. Indeed, by proposition 4, the same congruence will hold for every translate $L'$ of $L$. In particular, $\mathop{\rm Card}(Z\cap L')\neq0$ for every translate $L'$ of $L$, and there are $q^{n-d}$ distinct translates.

To prove the general case, let us choose a subspace $M$ of $F^n$ of dimension $s\leq d$ such that $\mathop{\rm Card}(Z\cap M)\not\equiv 0\pmod p$, and let us assume that $s$ is maximal.
Assume that $s< d$. Let $t\in\{1,\dots,p-1\}$ be the integer such that $\mathop{\rm Card}(Z\cap M)\equiv t\pmod p$. For every $(s+1)$-dimensional subspace $L$ of $F^n$ that contains $M$, one has $\mathop{\rm Card}(Z\cap L)\equiv 0\pmod p$, by maximaility of $s$, so that $Z\cap (L\setminus M)$ contains at least $p-t$ points. Since these subspaces $L$ are in 1-1 correspondence with the lines of the quotient space $F^n/M$, their number is equal to $(q^{n-s}-1)/(q-1)$. Consequently, \[ \mathop{\rm Card}(Z) = \mathop{\rm Card}(Z\cap M) + \sum_L \mathop{\rm Card}(Z\cap (L\setminus M)) \geq t + (p-t) \frac{q^{n-s}-1}{q-1} \geq q^{n-s-1}\geq q^{n-d}, \] as was to be shown.

3. Classic theorems seem to an everlasting source of food for thought.

a) In 1999, Alon observed that Chevalley's theorem follows from the Combinatorial Nullstellensatz he had just proved. On the other hand, this approach allowed Brink (2011) to prove a similar result in general fields $F$, but restricting the roots to belong to a product set $A_1\times\dots\times A_n$, where $A_1,\dots,A_n$ are finite subsets of $F$ of cardinality $q$. See that paper of Clarke, Forrow and Schmitt for further developments, in particular a version of Warning's second theorem.

b) In the case of hypersurfaces (with the notation of theorem 1, $m=1$), Ax proved in 1964 that the cardinality of $Z$ is divisible not only by $p$, but by $q$. This led to renewed interest in the following years, especially in the works of Katz, Esnault, Berthelot, and the well has not dried up yet.

c) In 2011, Heath-Brown published a paper where he uses Ax's result to strengthen the congruence modulo $p$ of proposition 4 to a congruence modulo $q$.

d) By a Weil restriction argument, a 1995 paper of Moreno-Moreno partially deduces the Chevalley-Warning theorem over a field of cardinality $q$ from its particular case over the prime field. I write partially because they obtain a divisibility by an expression of the form $p^{\lceil f \alpha\rceil}$, while one expects $q^{\lceil \alpha}=p^{f\lceil\alpha\rceil}$. However, the same argument allows them to obtain a stronger bound which does not involve not the degrees of the polynomials, but the $p$-weights of these degrees, that is the sum of their digits in their base $p$ expansions. Again, they obtain a divisibility by an expression of the form $p^{\lceil f\beta\rceil}$, and it is a natural question to wonder whether the divisibility by $p^{f\lceil\beta\rceil}$ can be proved.