Friday, October 13, 2023

A combinatorial proof of the Newton identities

Let T1,,TnT_1,\dots,T_n be indeterminates. For all m0m\geq0, denote by SmS_m the mmth elementary symmetric polynomial, given by  Sm=i1<<imTi1Tim. S_m = \sum_{i_1 \lt \dots \lt i_m} T_{i_1}\dots T_{i_m}. These are polynomials in T1,,TnT_1,\dots, T_n, and as their names suggest, these polynomials are symmetric, meaning that  Sm(Tσ(1),,Tσ(n))=Sm(T1,,Tn) S_m (T_{\sigma(1)},\dots,T_{\sigma(n)}) = S_m (T_1,\dots,T_n) for any permutation σ\sigma of {1,,n}\{1,\dots,n\}. By definition, one has S0=1S_0=1 (there is exactly one family of integers, the empty one, and the empty product is equal to~11) and Sm=0S_m=0 for m>nm>n (there are no families of integers 1i1<<imn1\leq i_1\lt\dots\lt i_m\leq n). A well-known theorem asserts that any symmetric polynomial in T1,,TnT_1,\dots,T_n (with coefficients in a commutative ring RR) can be written uniquely as a polynomial in S1,,SnS_1,\dots,S_n (with coefficients in RR). It is in particular the case for the Newton polynomials, defined for p0p\geq 0 by  Np=T1p++Tnp. N_p = T_1^p + \dots+T_n^p . In particular, N0=nN_0=n and N1=S1N_1=S_1. The next Newton polynomial can be computed by “completing the square”:  N2=T12++Tn2=(T1++Tn)22S1=S122S1. N_2 = T_1^2+\dots+T_n^2 = (T_1+\dots+T_n)^2 - 2 S _1 = S_1^2 - 2S_1. These are the first two (or three) of a family of relations, called the Newton identities, that relate the polynomials NpN_p and the polynomials SmS_m:  m=0p1(1)mNpmSm+(1)ppSp=0.  \sum_{m=0}^{p-1} (-1)^m N_{p-m}\cdot S_m + (-1)^p p \cdot S_p = 0. Using that the coefficient of NpN_p is given by S0=1S_0=1, they allow to express inductively the polynomials NpN_p in terms of S1,,SpS_1,\dots,S_p. If p>np>n, all terms with m>nm>n vanish. Using that the coefficient of SpS_p is N0=pN_0=p, these formulas also show, conversely, that if p!p! is invertible in the commutative ring RR, then S1,,SpS_1,\dots,S_p can be expressed in terms of N1,,NpN_1,\dots,N_p. There are various proofs of these identities. Most of the one I have seen are algebraic in nature, in so that they rely on the relations between the roots of a polynomial and its coefficients, and/or on expansion in power series. The following proof, due to Doron Zeilberger (1984) (Discrete Math. 49, p. 319, doi:10.1016/0012-365X(84)90171-7), is purely combinatorial. I have been made aware of it because it is the one that is formalized in the proof assistant system Lean. (See there for the source code.) We consider the set S={1,,n}S=\{1,\dots,n\} and define a set XX whose elements are the pairs (A,k)(A,k) satisfying the following properties:
  • AA is a subset of SS;
  • kk is an element of SS;
  • the cardinality of AA is less or equal than pp;
  • if Card(A)=p{\mathop {\rm Card}}(A)=p, then kAk\in A.
Define a map f ⁣:XXf\colon X\to X by the following rule: for (A,k)(A,k) in XX, set
  • f(A,k)=(A{k},k)f(A,k) = (A \setminus\{k\}, k) if kAk\in A;
  • f(A,k)=(A{k},k)f(A,k) = (A\cup\{k\}, k) if kAk\notin A.
Write f(A,k)=(A,k)f(A,k)=(A',k); observe that this is again an element of XX. The first two conditions are obvious. In the first case, when kAk\in A, then the cardinality of AA' is one less than the cardinality of AA, hence is at most p1p-1, and the last condition is vacuous. In the second case, when kAk\notin A, the cardinality of AA is strictly smaller than pp, because (A,k)X(A,k)\in X, so that the cardinality of AA' is at most pp; moreover, kAk\in A' hence the last condition holds. Observe that f ⁣:XXf\colon X\to X is an involution: ff=idXf\circ f={\mathrm{id}}_X. Indeed, with the same notation, one has f(A,k)=(A,k)f(A,k)=(A',k). In the first case, kAk\in A, hence kAk\notin A', so that f(A,k)=(A{k},k)=(A,k)f(A',k)=(A'\cup\{k\},k)=(A,k); in the second case, one has mAm\in A', hence f(A,k)=(A{k},k)=(A,k)f(A',k)=(A'\setminus\{k\}, k)=(A,k) since A=A{k}A'=A\cup\{k\} and kAk\notin A. Moreover, ff has no fixed point because it switches pairs (A,k)(A,k) such that kAk\in A and pairs such that kAk\notin A. Zeilberger's proof of the Newton identities build on a function w ⁣:XR[T1,,Tn]w\colon X\to R[T_1,\dots,T_n] such that w(f(A,k))=f(A,k)w(f(A,k))=-f(A,k) for all (A,k)X(A,k)\in X. Since ff has no fixed point, the expression vanishes  (A,k)Xw(A,k) \sum_{(A,k)\in X} w(A,k) since one can cancel a term (A,k)(A,k) with its image f(A,k)f(A,k). Zeilberger's definition of the function ww is  w(A,k)=(1)Card(A)TkpCard(A)aATa. w(A,k) = (-1)^{\mathrm{Card}(A)} T_k^{p-\mathrm{Card}(A)}\cdot \prod_{a\in A} T_a. It satisfies the desired relation: for (A,k)X(A,k)\in X such that kAk\notin A, one has f(A,k)=(A{k},k)f(A,k)=(A\cup\{k\}, k), so that  w(f(A,k))=(1)Card(A)+1TkkCard(A)1aA{k}Ta=(1)Card(A)TkkCard(A)aATa=w(A,k). w(f(A,k)) = (-1)^{\mathrm{Card}(A)+1} T_k^{k-\mathrm{Card}(A)-1} \prod_{a\in A\cup\{k\}} T_a = - (-1)^{\mathrm{Card}(A)} T_k^{k-\mathrm{Card}(A)} \prod_{a\in A} T_a = w(A,k). When kAk\in A, one has f(A,k)=(A,k)f(A,k)=(A',k) with kAk\notin A', and the formula applied to (A,k)(A',k) implies the one for (A,k)(A,k), because ff is an involution. Now, in the relation (A,k)w(A,k)=0 \sum_{(A,k)} w(A,k) =0, we first sum on AA and then on kk:  ASk(A,k)X(1)Card(A)TkpCard(A)aATa=0 \sum_{A\subset S} \sum_{k | (A,k) \in X} (-1)^{\mathrm{Card}(A)} T_k^{p-{\mathrm{Card}(A)}} \prod_{a\in A} T_a = 0 and put together all those sets AA which have the same cardinality, say, mm. This gives  m=0n(1)mCard(A)=maATak (A,k)XTkpm=0. \sum_{m=0}^n (-1)^m \sum_{\mathrm{Card}(A)=m} \prod_{a\in A} T_a \cdot \sum_{k | (A,k)\in X} T_k^{p-m} = 0. Let us compute separately the contribution to the sum of each mm. When m>pm\gt p, no subset AA is authorized so that the corresponding sum is zero. On the opposite, when m<pm\lt p, for any subset AA such that Card(A)=m\mathrm{Card}(A)=m, all pairs (A,k)(A,k) belong to XX, so that the inner sum is NpmN_{p-m}, and the whole term is (1)mSmNpm(-1)^mS_m N_{p-m}. Finally, when m=pm=p, the only kk that appear in the inner sum are the elements of AA, and the corresponding term is kATk0=Card(A)=p\sum_{k\in A} T_k^0=\mathrm {Card}(A)=p; this term contributes to (1)ppSp(-1)^p p S_p.