Friday, September 13, 2024

The combinatorial Nullstellensatz

The “combinatorial Nullstellensatz” is a relatively elementary statement due to Noga Alon (1999) whose name, while possibly frightening, really says what it is and what it is good for. (A freely available version is there.) Nullstellensatz is the classic name for a theorem of David Hilbert that relates loci in FnF^n defined by polynomial equations and the ideals of the polynomial ring F[T1,,Tn]F[T_1,\dots,T_n] generated by these equations, when FF is an algebraically closed field. The word itself is German and means “theorem about the location of zeroes”. The precise correspondence is slightly involved, and stating it here precisely would derail us from the goal of this post, so let us stay with these informal words for the moment. The adjective combinatorial in the title refers to the fact that Alon deduced many beautiful consequences of this theorem in the field of combinatorics, and for some of them, furnishes quite simple proofs. We'll give one of them below, the Cauchy—Davenport inequality, but my main motivation in writing this text was to discuss the proof of the combinatorial Nullstellensatz, in particular untold aspects of the proofs which took me some time to unravel.\gdef\Card{\operatorname{Card}}

The context is the following: FF is a field, nn is a positive integer, and one considers finite sets S1,,SnS_1,\dots,S_n in FF, and the product set S=S1××SnS=S_1\times \dots \times S_n in FnF^n. One considers polynomials fF[T1,,Tn]f\in F[T_1,\dots,T_n] in nn indeterminates and their values at the points (s1,,sn)(s_1,\dots,s_n) of SS. For every i{1,,n}i\in\{1,\dots,n\}, set gi=aSi(Tia)g_i = \prod_{a\in S_i} (T_i-a); this is a polynomial of degree Card(Si)\Card(S_i) in the indeterminate TiT_i.

Theorem 1. — If a polynomial fF[T1,,Tn]f\in F[T_1,\dots,T_n] vanishes at all points sSs\in S, there exist polynomials h1,,hnF[T1,,Tn]h_1,\dots,h_n\in F[T_1,\dots,T_n] such that f=g1h1++gnhnf=g_1 h_1+\dots+g_n h_n, and for each ii, one has deg(gihi)deg(f)\deg(g_i h_i)\leq \deg (f).

This theorem is really close to the classic Nullstellensatz, but the specific nature of the set SS allows to have a weaker hypothesis (the field FF is not assumed to be algebraically closed) and a stronger conclusion (the Nullstellensatz would assert that there exists some power fkf^k of ff that is of this form, saying nothing of the degrees).

Its proof relies on a kind of Euclidean division algorithm. Assume that ff has some monomial cmTm=cmT1m1Tnmnc_mT^m=c_mT_1^{m_1}\dots T_n^{m_n} where miCard(Si)m_i\geq \Card(S_i); then one can consider a Euclidean division (in the variable TiT_i), Timi=pigi+riT_i^{m_i}=p_i g_i + r_i, where deg(ri)<Card(Si)\deg(r_i)<\Card(S_i). One can then replace this monomial cmTmc_mT^m in ff by cmTm/Timi)ric_m T^m/T_i^{m_i})r_i, and, at the same time register cmTm/Timipic_m T^m/T_i^{m_i} p_i to hih_i. Since this amounts to subtract some multiple of gig_i, the vanishing assumption still holds. Applying this method repeatedly, one reduces to the case where the degree of ff in the variable TiT_i is <Card(Si)<\Card(S_i) for all ii.
Then a variant of the theorem that says that a polynomial in one indeterminate has no more roots than its degree implies that f0f\equiv 0.

A weak point in the written exposition of this argument is the reason why the iterative construction would terminate. Intuitively, something has to decrease, but one needs a minute (or an hour, or a week) of reflexion to understand what precisely decreases. The problem is that if one works one monomial at a time, the degrees might stay the same. The simplest reason why this procedure indeed works belongs to the theory of Gröbner bases and to the proof that the division algorithm actually works: instead of the degrees with respect to all variables, or of the total degree, one should consider a degree with respect to some lexicographic ordering of the variables — the point is that it is a total ordering of the monomials, so that if one consider the leading term, given by the largest monomial, the degree will actually decrease.

The second theorem is in fact the one which is needed for the applications to combinatorics. Its statement is rather written in a contrapositive way — it will show that ff does not vanish at some point of SS.

Theorem 2.Let fF[T1,,Tm]f\in F[T_1,\dots,T_m] and assume that ff has a nonzero monomial cmTmc_mT^m, where m|m| is the total degree of ff. If, moreover, mi<Card(Si)m_i<\Card(S_i) for all ii, then there exists a point s=(s1,,sn)Ss=(s_1,\dots,s_n)\in S such that f(s)0f(s)\neq 0.

It is that theorem whose proof took me a hard time to understand. I finally managed to formalize it in Lean, hence I was pretty sure I had understood it. In fact, writing this blog post helped me simplify it further, removing a couple of useless arguments by contradiction! Anyway, I feel it is worth being told with a bit more detail than in the original paper.

We argue by contradiction, assuming that f(s)=0f(s)=0 for all sS1××Sns\in S_1\times\dots \times S_n. Applying theorem 1, we get an expression f=i=1ngihif=\sum_{i=1}^n g_i h_i where deg(gihi)deg(f)\deg(g_ih_i)\leq \deg(f) for all ii. The coefficient of TmT^m in ff is non zero, by assumption; it is also the sum of the coefficients of the coefficients of TmT^m in gihig_i h_i, for 1in1\leq i\leq n, and we are going to prove that all of them vanish — hence getting the desired contradiction. \gdef\coeff{\operatorname{coeff}}

Fix i{1,,n}i\in\{1,\dots, n\}.  If hi=0h_i=0, then coeffm(gihi)=coeffm(0)=0\coeff_m(g_ih_i)=\coeff_m(0)=0, hence we may assume that hi0h_i\neq0. This implies that deg(gihi)=Card(Si)+deg(hi)deg(f)=m\deg(g_i h_i)=\Card(S_i)+\deg(h_i) \leq \deg(f)=|m|.
One then has
coeffm(gihi)=p+q=mcoeffp(gi)coeffq(hi), \coeff_m(g_i h_i)=\sum_{p+q=m} \coeff_p (g_i) \coeff_q(h_i),
and it suffices to prove that all terms of this sum vanish. Fix p,qp,q such that p+q=mp+q=m, assume that  coeffp(gi)0\coeff_p(g_i)\neq 0, and let us prove that coeffq(hi)=0\coeff_q(h_i)=0.

By the very definition of gig_i as a product of Card(Si)\Card(S_i) factors of the form TiaT_i-a, this implies that pj=0p_j=0 for all jij\neq i. Moreover, pipi+qi=mi<Card(Si)p_i\leq p_i+q_i=m_i < \Card(S_i), by the assumption of the theorem, hence pi<Card(Si)p_i<\Card(S_i). This implies Card(Si)+deg(hi)m=p+q=p+q=pi+q<Card(Si)+q.\Card(S_i)+\deg(h_i)\leq |m|= |p+q|=|p|+|q|=p_i+|q|<\Card(S_i)+|q|.
Subtracting Card(Si)\Card(S_i) on both sides, one gets deg(hi)<q\deg(h_i)<|q|, hence coeffq(hi)=0\coeff_q(h_i)=0, as was to be shown.

To conclude, let us add the combinatorial application to the Cauchy—Davenport theorem.
\gdef\F{\mathbf F}

Theorem 3.Let pp be a prime number and let AA and BB be nonempty subsets of Fp\F_p, the field with pp elements. Denote by A+BA+B the set of all a+ba+b, for aAa\in A and bBb\in B. Then
Card(A+B)min(p,Card(A)+Card(B)1). \Card(A+B) \geq \min (p, \Card(A)+\Card(B)-1).


First consider the case where Card(A)+Card(B)>p\Card(A)+\Card(B)>p, so that min(p,Card(A)+Card(B)1)=p\min(p,\Card(A)+\Card(B)-1)=p.  In this case, for every cFpc\in\F_p, one has
Card(A(cB))Card(A(cB))=Card(A)+Card(cB)>Card(A)+Card(B)>p,\Card(A \cap (c-B))\cup \Card(A\cup (c-B))=\Card(A)+\Card(c-B)>\Card(A)+\Card(B)>p,
so that the sets AA and cBc-B intersect. In particular, there exist aAa\in A and bBb\in B such that a+b=ca+b=c and A+B=FpA+B=\F_p, and the desired inequality holds.

Let us now treat the case where Card(A)+Card(B)p\Card(A)+\Card(B)\leq p, so that min(p,Card(A)+Card(B)1)=Card(A)+Card(B)1\min(p,\Card(A)+\Card(B)-1)=\Card(A)+\Card(B)-1. We will assume that the conclusion of the theorem does not hold, that is, Card(A+B)Card(A)+Card(B)2\Card(A+B)\leq \Card(A)+\Card(B)-2, and derive a contradiction. Let us consider the polynomial fFp[X,Y]f\in \F_p[X,Y] defined by f=cA+B(X+Yc)f=\prod _{c\in A+B} (X+Y-c). The degree of ff is equal to Card(A+B)Card(A)+Card(B)2\Card(A+B)\leq \Card(A)+\Card(B)-2. Choose natural integers uu and vv such that  u+v=Card(A+B)u+v=\Card(A+B), uCard(A)1u\leq\Card(A)-1 and vCard(B)1v\leq \Card(B)-1. The coefficient of XuYvX^u Y^v is equal to the binomial coefficient (u+vu)\binom{u+v}u. Since u+vCard(A)+Card(B)2p2u+v\leq \Card(A)+\Card(B)-2\leq p-2, this coefficient is nonzero in Fp\F_p.  By theorem 2, there exists (a,b)A×B(a,b)\in A\times B such that f(a,b)0f(a,b)\neq 0. However, one has a+bA+Ba+b\in A+B, hence f(a,b)=0f(a,b)=0 by the definition of ff. This contradiction concludes the proof that Card(A+B)Card(A)+Card(B)1\Card(A+B)\geq \Card(A)+\Card(B)-1. It also concludes this post.