A recent MathOverflow post asked for a proof that the roots of certain polynomials were located on the unit circle. A comment by Richard Stanley pointed to a beautiful theorem of T. D. Lee and C. N. Yang. By the way, these two authors were physicists, got the Nobel prize in physics in 1957, and were the first Chinese scientists to be honored by the Nobel prize.
This theorem appears in an appendix to their paper, Statistical Theory of Equations of State and Phase Transitions.IL Lattice Gas and Ising Model, published in Phys. Review in 1952, and devoted to properties of the partition function of some lattice gases. Here is a discussion of this theorem, following both the initial paper and notes by Shayan Oveis Gharan.
Let be a Hermitian matrix ( for all ). Define a polynomial
Theorem 1 (Lee-Yang). — If for all , then all roots of have absolute value .
This theorem follows from a multivariate result. Let us define
Say that a polynomial is good if it has no root such that for all .
Proposition 2 (Lee-Yang). — If for all , then is good.
For every pair , set if belongs to , but not , and set otherwise. Consequently,
In other words, if we define polynomials
then is the “coefficientwise product” of the polynomials .
We also note that these polynomials have degree at most one with respect to every variable. These observations may motivate the following lemmas concerning good polynomials.
Lemma 1. — If are good, then so is their product.
Lemma 2. — If is good, then is good for every such that .
This is obvious if ; in the general case, this follows from the Rouché theorem — the set polynomials (of bounded degree) whose roots belong to some closed subset is closed.
Lemma 3. — If , then is good.
This is obvious.
Lemma 4. — If , then is good.
If , then is good, as the product of two good polynomials.
Now assume that . Let be a root of such that . One has
Since the Möbius transformation defines a bijection from the unit open disk to itself, one has .
Lemma 5. — If is good, then is good.
Assume otherwise, so that . By symmetry, we assume . We write .
Choose such that and have the same argument; if, moreover is close enough to and satisfies , then
Consequently, the polynomial is not good; a contradiction.
Lemma 6. — If are good polynomials of degree at most one in each variable, then so is their coefficientwise product.
We first treat the case of one variable: then and , so that their coefficientwise product is given by . By assumption and ;
similarly, and . Consequently, and , which shows that is good.
We prove the result by induction on . For every subset of , let and be the coefficients of and of in ; define similarly and with . The coefficientwise product of and is equal to
Let be such that , so that
is good, by lemma 2. Similarly, for such that , $Q(T_1,\dots,T_{n-1},w)
=\sum _S (c_S+d_S w) \prod_{i\in S} T_i$ is good. By induction, their coefficientwise product, given by
is good as well.
We now fix complex numbers of absolute value . By what precedes, the polynomial
\[ S(T,U) = (\sum_S a_S c_S z_S) + (\sum_S b_Sc_S z_S) T + (\sum_S a_S d_S z_S) U
+ (\sum_S b_S d_S z_S) TU \]
is good, where . According to lemma 4, the polynomial
is good. This proves that is good.
Proof of theorem 2. — We have already observed that the polynomial is the coefficientwise product of polynomials , each of them has degree at most one in each variable. On the other hand, one has
a product of good polynomials, so that is good. This proves that is good.
In fact, more is true. Indeed, one has
\[
\begin{align*} T_1\dots T_n P(1/T_1,\dots,1/T_n)
& = \sum_ S \prod_{\substack{i\in S \\ j\notin S}} a_{i,j} \prod_{i\notin S} T_i \\
& =P^*(T_1,\dots,T_n)
\end{align*}
\]
where is defined using the transpose matrix of . Consequently, has no root with for every .
Proof of theorem 1. — Let be a root of . Since the polynomial is good, so is the one-variable polynomial . In particular, implies . But the polynomial has a symmetry property, inherited by that of , namely , where is defined using the transpose matrix of . Consequently, and . We thus have shown that .
Saturday, May 12, 2018
Wednesday, May 2, 2018
Combinatorics and probability : Greene, Nijenhuis and Wilf's proof of the hook length formula
I've never been very good at remembering representation theory, past the general facts that hold for all finite groups, especially the representation theory of symmetric groups. So here are personal notes to help me understand the 1979 paper by Greene, Nijenhuis and Wilf, where they give a probabilistic proof of the hook length formula. If you already know what it is about, then you'd be quicker by browsing at the Wikipedia page that I just linked to!
Heroes of this story are partitions, Ferrers diagrams, and Young tableaux. Let us first give definitions.
A partition of an integer n is a decreasing (US: non-increasing) sequence of integers λ=(λ1≥λ2≥⋯≥λm) of strictly positive integers such that ∣λ∣=λ1+⋯+λm=n.
The Ferrrers diagram F(λ) of this partition λ is the stairs-like graphic representation consisting of the first quadrant cells indexed by integers (i,j), where 1≤i≤m and 1≤j≤λi. Visualizing a partition by means of its Ferrers diagram makes clear that there is an involution on partitions, λ↦λ∗, which, geometrically consists in applying the symmetry with respect to the diagonal that cuts the first quadrant. In formulas, j≤λi if and only if i≤λj∗.
A Young tableau of shape λ is an enumeration of the n cells of this Ferrers diagram such that the enumeration strictly increases in rows and columns.
There is a graphic way of representing Ferrers diagrams and Young tableaux — the tradition says it's the Soviet way — consisting in rotating the picture by 45° (π/4) to the left and viewing the first quadrant as a kind of bowl in which n balls with diameter 1 are thrown from the top.
The hook length formula is a formula for the number fλ of Young tableaux of given shape λ.
For every (i,j) such that 1≤i≤m and 1≤j≤λi, define its hook Hij to be the set of cells in the diagram that are either above it, or on its the right — in formula, the set of all pairs (a,b) such that a=i and j≤b≤λi, or i≤a≤m and b=j≤λa. Let hij be the cardinality of the hook Hij.
Lemma. — One has ha,b=(λa−b)+(λb∗−a)+1.
Indeed, λa−b is the number of cells above (a,b) in the Ferrers diagram of λ, excluding (a,b), while (λb∗−a) is the number of cells on the right of (a,b).
Theorem (Frame, Thrall, Robinson; 1954). — Let n be an integer and let λ be a partition of n. The number of Young tableaux of shape λ is given by
fλ=∏(i,j)∈F(λ)hijn!.
From the point of view of representation theory, partitions of n are in bijection with conjugacy classes of elements in the symmetric group Sn (the lengths of the orbits of a permutation of {1,…,n} can be sorted into a partition of n, and this partition characterizes the conjugacy class of the given permutation). Then, to each partition of n corresponds an irreducible representation of Sn, and fλ appears to be its dimension. (In a future post, I plan to explain this part of the story.)
As already said, the rest of this post is devoted to explaining the probabilistic proof due to Greene, Nijenhuis, and Wilf. (Aside: Nijenhuis is a Dutch name that should pronounced roughly like Nay-en uys.)
A corner of a Ferrers diagram is a cell (i,j) which is both on top of its column, and on the right of its row; in other words, it is a cell whose associated hook is made of itself only. A bit of thought convinces that a corner can be removed, and furnishes a Ferrers diagram with one cell less. Conversely, starting from a Ferrers diagram with n−1 cells, one may add a cell on the boundary so as to get a Ferrers diagram with n cells. In the partition point of view, either one part gets one more item, or there is one more part, with only one item. In a Young tableau with n cells, the cell numbered n is at a corner, and removing it furnishes a Young tableau with n−1 cells; conversely, starting from a Young tableau with n−1 cells, one can add a cell so that it becomes a corner of the new tableau, and label it with n.
Let P(λ1,…,λm) be the number on the right hand side of the Frame-Thrall-Robinson formula. By convention, it is set to be 0 if (λ1,…,λm) does not satisfy λ1≥⋯≥λm≥1. By induction, one wants to prove
P(λ1,…,λm)=i=1∑mP(λ1,…,λi−1,λi−1,λi+1,…,λm).
For every partition λ, set
pi(λ1,…,λm)=P(λ1,…,λm)P(λ1,…,λi−1,λi−1,λi+1,…,λm).
We thus need to prove
i=1∑mpi(λ)=1;
which we will do by interpreting the pi(λ) as the probabilities of disjoint events.
Given the Ferrers diagram F(λ), let us pick, at random, one cell (i,j), each of them given equal probability 1/n; then we pick a new cell, at random, in the hook of (i,j), each of them given equal probability 1/(hij−1), etc., until we reach a corner of the given diagram. Such a trial defines a path in the Ferrers diagram, ending at a corner (a,b); its projections are denoted by A={a1<a2<…} and B={b1<b2<…}. Let p(a,b) be the probability that we reach the corner (a,b); let q(A,B) be the probability that its projections be A and B conditioned to the hypothesis that it start at (inf(A),inf(B)).
Lemma. — Let A,B be sets of integers, let a=sup(A) and let b=sup(B); assume that (a,b) is a corner of λ. One has q(A,B)=i∈Ai=a∏hi,b−11j∈Bj=b∏ha,j−11.
We argue by induction on the cardinalities of A and B. If A={a} and B={b}, then q(A,B)=1, since both products are empty; this proves the formula in this case. As above, let a1<a2<… be the enumeration of the elements of A and b1<b2<… be that for B; let also A′=A∖{a1} and B′=B∖{b1}. By construction of the process, after having chosen the initial cell (a1,b1), it either goes on above the initially chosen cell (a1,b1), hence at (a1,b2), or on its right, that is, at (a2,b1). One thus has
\[ \begin{align*}
q(A,B) = \mathbf P(A,B\mid a_1,b_1)
& = \mathbf P(a_1,b_1,b_2\mid a_1,b_1) \mathbf P(A,B\mid a_1,b_1,b_2)
+ \mathbf P(a_1,a_2,b_1\mid a_1,b_1) \mathbf P(A,B\mid a_1,a_2,b_1)\\
&= \frac1{f_{a_1,b_1}-1} \mathbf P(A,B'\mid a_1,b_2)
+ \frac1{f_{a_1,b_1}-1} \mathbf P(A',B\mid a_2,b_1) \\
&= \frac1{h_{a_1,b_1}-1}\left( q(A',B) + q(A,B') \right). \end{align*}
\]
By induction, we may assume that the given formula holds for (A′,B) and (A,B′). Then, one has
\[ \begin{align*}
q(A,B) & = \frac1{h_{a_1,b_1}-1} \left(
\prod_{\substack{i\in A'\\ i\neq a}} \frac1{h_{i,b}-1} \prod_{\substack{j\in B \\ j\neq b}} \frac1{h_{a,j}-1}
+
\prod_{\substack{i\in A\\ i\neq a}} \frac1{h_{i,b}-1} \prod_{\substack{j\in B' \\ j\neq b}} \frac1{h_{a,j}-1}\right) \\
& =\frac{ (h_{a_1,b}-1)+(h_{a,b_1}-1)}{h_{a_1,b_1}-1}
\prod_{\substack{i\in A\\ i\neq a}} \frac1{h_{i,b}-1} \prod_{\substack{j\in B \\ j\neq b}} \frac1{h_{a,j}-1},
\end{align*}
\]
which implies the desired formula once one remembers that
\[ h_{a,b_1} + h_{a_1,b}
= h_{a_1,b_1} + h_{a,b}= h_{a_1,b_1}+1\]
since (a,b) is a corner of F(λ).
Proposition. — Let (a,b) be a corner of the diagram F(λ); one has p(a,b)=pa(λ). (Note that b=λa.)
Write Fa(λ) for the Ferrers diagram with corner (a,b) removed. Its (i,j)-hook is the same as that of F(λ) if i=a and j=b; otherwise, it has one element less. Consequently, writing hi,j′ for the cardinalities of its hooks, one has
\[\begin{align*}
p_a(\lambda) &= \frac1n \frac{\prod_{(i,j)\in F(\lambda)}h_{i,j}}{\prod_{(i,j)\in F_a(\lambda)} h'_{i,j}} \\ &= n \prod_{i<a} \frac{h_{i,b}}{h_{i,b}-1} \prod_{j<b}\frac{h_{a,j}}{h_{a,j}-1}\\
&=\frac1n \prod_{i<a}\left(1+ \frac1{h_{i,b}-1}\right) \prod_{j<b} \left(1+\frac1{h_{a,j}-1}\right). \end{align*}
\]
Let us now expand the products. We get
\[
p_a(\lambda) = \frac1n \sum_{\sup(A)<a} \sum_{\sup(B)<b} \prod_{i\in A} \frac1{h_{i,b}-1}\prod_{j\in B}\frac1{h_{a,j}-1},
\]
where A and B range over the (possibly empty) subsets of {1,…,n} satisfying
the given conditions sup(A)<a and sup(B)<b. (Recall that, by convention, or by definition, one has sup(∅)=−∞.) Consequently, one has
\[ \begin{align*}
p_a(\lambda) & =\frac1n \sum_{\sup(A)=a} \sup_{\sup(B)=b} \prod_{\substack{i\in A\\ i\neq a}}
\frac1{h_{i,b}-1} \prod_{\substack{j\in B \\ j\neq b}} \frac1{h_{a,j}-1} \\
& = \frac1n \sum_{\sup(A)=a} \sum_{\sup(B)=b} q(A,B) \\
& = \sum_{\sup(A)=a} \sum_{\sup(B)=b} \mathbf P (A,B ) \\
& = p(a,b),
\end{align*}
\]
as claimed.
Now, every trial has to end at some corner (a,b), so that
(a,b) is a corner∑p(a,b)=1.
On the other hand, if( a , b ) (a,b) (a,b) is a corner, then b = λ a b=\lambda_a b=λa, while if ( a , λ a ) (a,\lambda_a) (a,λa) is not a corner, then P a ( λ ) = 0 P_a(\lambda)=0 Pa(λ)=0. We thus get ∑ a P a ( λ ) = P ( λ ) \sum_a P_a(\lambda)=P(\lambda) ∑aPa(λ)=P(λ), as was to be shown.
Heroes of this story are partitions, Ferrers diagrams, and Young tableaux. Let us first give definitions.
A partition of an integer n is a decreasing (US: non-increasing) sequence of integers λ=(λ1≥λ2≥⋯≥λm) of strictly positive integers such that ∣λ∣=λ1+⋯+λm=n.
The Ferrrers diagram F(λ) of this partition λ is the stairs-like graphic representation consisting of the first quadrant cells indexed by integers (i,j), where 1≤i≤m and 1≤j≤λi. Visualizing a partition by means of its Ferrers diagram makes clear that there is an involution on partitions, λ↦λ∗, which, geometrically consists in applying the symmetry with respect to the diagonal that cuts the first quadrant. In formulas, j≤λi if and only if i≤λj∗.
A Young tableau of shape λ is an enumeration of the n cells of this Ferrers diagram such that the enumeration strictly increases in rows and columns.
There is a graphic way of representing Ferrers diagrams and Young tableaux — the tradition says it's the Soviet way — consisting in rotating the picture by 45° (π/4) to the left and viewing the first quadrant as a kind of bowl in which n balls with diameter 1 are thrown from the top.
The hook length formula is a formula for the number fλ of Young tableaux of given shape λ.
For every (i,j) such that 1≤i≤m and 1≤j≤λi, define its hook Hij to be the set of cells in the diagram that are either above it, or on its the right — in formula, the set of all pairs (a,b) such that a=i and j≤b≤λi, or i≤a≤m and b=j≤λa. Let hij be the cardinality of the hook Hij.
Lemma. — One has ha,b=(λa−b)+(λb∗−a)+1.
Indeed, λa−b is the number of cells above (a,b) in the Ferrers diagram of λ, excluding (a,b), while (λb∗−a) is the number of cells on the right of (a,b).
Theorem (Frame, Thrall, Robinson; 1954). — Let n be an integer and let λ be a partition of n. The number of Young tableaux of shape λ is given by
fλ=∏(i,j)∈F(λ)hijn!.
From the point of view of representation theory, partitions of n are in bijection with conjugacy classes of elements in the symmetric group Sn (the lengths of the orbits of a permutation of {1,…,n} can be sorted into a partition of n, and this partition characterizes the conjugacy class of the given permutation). Then, to each partition of n corresponds an irreducible representation of Sn, and fλ appears to be its dimension. (In a future post, I plan to explain this part of the story.)
As already said, the rest of this post is devoted to explaining the probabilistic proof due to Greene, Nijenhuis, and Wilf. (Aside: Nijenhuis is a Dutch name that should pronounced roughly like Nay-en uys.)
A corner of a Ferrers diagram is a cell (i,j) which is both on top of its column, and on the right of its row; in other words, it is a cell whose associated hook is made of itself only. A bit of thought convinces that a corner can be removed, and furnishes a Ferrers diagram with one cell less. Conversely, starting from a Ferrers diagram with n−1 cells, one may add a cell on the boundary so as to get a Ferrers diagram with n cells. In the partition point of view, either one part gets one more item, or there is one more part, with only one item. In a Young tableau with n cells, the cell numbered n is at a corner, and removing it furnishes a Young tableau with n−1 cells; conversely, starting from a Young tableau with n−1 cells, one can add a cell so that it becomes a corner of the new tableau, and label it with n.
Let P(λ1,…,λm) be the number on the right hand side of the Frame-Thrall-Robinson formula. By convention, it is set to be 0 if (λ1,…,λm) does not satisfy λ1≥⋯≥λm≥1. By induction, one wants to prove
P(λ1,…,λm)=i=1∑mP(λ1,…,λi−1,λi−1,λi+1,…,λm).
For every partition λ, set
pi(λ1,…,λm)=P(λ1,…,λm)P(λ1,…,λi−1,λi−1,λi+1,…,λm).
We thus need to prove
i=1∑mpi(λ)=1;
which we will do by interpreting the pi(λ) as the probabilities of disjoint events.
Given the Ferrers diagram F(λ), let us pick, at random, one cell (i,j), each of them given equal probability 1/n; then we pick a new cell, at random, in the hook of (i,j), each of them given equal probability 1/(hij−1), etc., until we reach a corner of the given diagram. Such a trial defines a path in the Ferrers diagram, ending at a corner (a,b); its projections are denoted by A={a1<a2<…} and B={b1<b2<…}. Let p(a,b) be the probability that we reach the corner (a,b); let q(A,B) be the probability that its projections be A and B conditioned to the hypothesis that it start at (inf(A),inf(B)).
Lemma. — Let A,B be sets of integers, let a=sup(A) and let b=sup(B); assume that (a,b) is a corner of λ. One has q(A,B)=i∈Ai=a∏hi,b−11j∈Bj=b∏ha,j−11.
We argue by induction on the cardinalities of A and B. If A={a} and B={b}, then q(A,B)=1, since both products are empty; this proves the formula in this case. As above, let a1<a2<… be the enumeration of the elements of A and b1<b2<… be that for B; let also A′=A∖{a1} and B′=B∖{b1}. By construction of the process, after having chosen the initial cell (a1,b1), it either goes on above the initially chosen cell (a1,b1), hence at (a1,b2), or on its right, that is, at (a2,b1). One thus has
\[ \begin{align*}
q(A,B) = \mathbf P(A,B\mid a_1,b_1)
& = \mathbf P(a_1,b_1,b_2\mid a_1,b_1) \mathbf P(A,B\mid a_1,b_1,b_2)
+ \mathbf P(a_1,a_2,b_1\mid a_1,b_1) \mathbf P(A,B\mid a_1,a_2,b_1)\\
&= \frac1{f_{a_1,b_1}-1} \mathbf P(A,B'\mid a_1,b_2)
+ \frac1{f_{a_1,b_1}-1} \mathbf P(A',B\mid a_2,b_1) \\
&= \frac1{h_{a_1,b_1}-1}\left( q(A',B) + q(A,B') \right). \end{align*}
\]
By induction, we may assume that the given formula holds for (A′,B) and (A,B′). Then, one has
\[ \begin{align*}
q(A,B) & = \frac1{h_{a_1,b_1}-1} \left(
\prod_{\substack{i\in A'\\ i\neq a}} \frac1{h_{i,b}-1} \prod_{\substack{j\in B \\ j\neq b}} \frac1{h_{a,j}-1}
+
\prod_{\substack{i\in A\\ i\neq a}} \frac1{h_{i,b}-1} \prod_{\substack{j\in B' \\ j\neq b}} \frac1{h_{a,j}-1}\right) \\
& =\frac{ (h_{a_1,b}-1)+(h_{a,b_1}-1)}{h_{a_1,b_1}-1}
\prod_{\substack{i\in A\\ i\neq a}} \frac1{h_{i,b}-1} \prod_{\substack{j\in B \\ j\neq b}} \frac1{h_{a,j}-1},
\end{align*}
\]
which implies the desired formula once one remembers that
\[ h_{a,b_1} + h_{a_1,b}
= h_{a_1,b_1} + h_{a,b}= h_{a_1,b_1}+1\]
since (a,b) is a corner of F(λ).
Proposition. — Let (a,b) be a corner of the diagram F(λ); one has p(a,b)=pa(λ). (Note that b=λa.)
Write Fa(λ) for the Ferrers diagram with corner (a,b) removed. Its (i,j)-hook is the same as that of F(λ) if i=a and j=b; otherwise, it has one element less. Consequently, writing hi,j′ for the cardinalities of its hooks, one has
\[\begin{align*}
p_a(\lambda) &= \frac1n \frac{\prod_{(i,j)\in F(\lambda)}h_{i,j}}{\prod_{(i,j)\in F_a(\lambda)} h'_{i,j}} \\ &= n \prod_{i<a} \frac{h_{i,b}}{h_{i,b}-1} \prod_{j<b}\frac{h_{a,j}}{h_{a,j}-1}\\
&=\frac1n \prod_{i<a}\left(1+ \frac1{h_{i,b}-1}\right) \prod_{j<b} \left(1+\frac1{h_{a,j}-1}\right). \end{align*}
\]
Let us now expand the products. We get
\[
p_a(\lambda) = \frac1n \sum_{\sup(A)<a} \sum_{\sup(B)<b} \prod_{i\in A} \frac1{h_{i,b}-1}\prod_{j\in B}\frac1{h_{a,j}-1},
\]
where A and B range over the (possibly empty) subsets of {1,…,n} satisfying
the given conditions sup(A)<a and sup(B)<b. (Recall that, by convention, or by definition, one has sup(∅)=−∞.) Consequently, one has
\[ \begin{align*}
p_a(\lambda) & =\frac1n \sum_{\sup(A)=a} \sup_{\sup(B)=b} \prod_{\substack{i\in A\\ i\neq a}}
\frac1{h_{i,b}-1} \prod_{\substack{j\in B \\ j\neq b}} \frac1{h_{a,j}-1} \\
& = \frac1n \sum_{\sup(A)=a} \sum_{\sup(B)=b} q(A,B) \\
& = \sum_{\sup(A)=a} \sum_{\sup(B)=b} \mathbf P (A,B ) \\
& = p(a,b),
\end{align*}
\]
as claimed.
Now, every trial has to end at some corner (a,b), so that
(a,b) is a corner∑p(a,b)=1.
On the other hand, if
Subscribe to:
Posts
(
Atom
)