Wednesday, May 2, 2018

Combinatorics and probability : Greene, Nijenhuis and Wilf's proof of the hook length formula

I've never been very good at remembering representation theory, past the general facts that hold for all finite groups, especially the representation theory of symmetric groups.  So here are personal notes to help me understand the 1979 paper by Greene, Nijenhuis and Wilf, where they give a probabilistic proof of the hook length formula. If you already know what it is about, then you'd be quicker by browsing at the Wikipedia page that I just linked to!

Heroes of this story are partitions, Ferrers diagrams, and Young tableaux. Let us first give definitions.

A partition of an integer nn is a decreasing (US: non-increasing) sequence of integers λ=(λ1λ2λm)\lambda =(\lambda_1\geq \lambda_2\geq \dots\geq \lambda_m) of strictly positive integers such that λ=λ1++λm=n|\lambda|=\lambda_1+\dots+\lambda_m=n.

The Ferrrers diagram F(λ)F(\lambda) of this partition λ\lambda is the stairs-like graphic representation consisting of the first quadrant cells indexed by integers (i,j)(i,j), where 1im1\leq i\leq m and 1jλi1\leq j\leq \lambda_i. Visualizing a partition by means of its Ferrers diagram makes clear that there is an involution on partitions, λλ\lambda\mapsto \lambda^*, which, geometrically consists in applying the symmetry with respect to the diagonal that cuts the first quadrant. In formulas, jλij\leq \lambda_i if and only if iλji\leq \lambda_j^*.

A Young tableau of shape λ\lambda is an enumeration of the nn cells of this Ferrers diagram such that the enumeration strictly increases in rows and columns.

There is a graphic way of representing Ferrers diagrams and Young tableaux — the tradition says it's the Soviet way — consisting in rotating the picture by 45° (π/4\pi/4) to the left and viewing the first quadrant as a kind of bowl in which nn balls with diameter 11 are thrown from the top.

The hook length formula is a formula for the number fλf_\lambda of Young tableaux of given shape λ\lambda.

For every (i,j)(i,j) such that 1im1\leq i\leq m and 1jλi1\leq j\leq \lambda_i, define its hook HijH_{ij} to be the set of cells in the diagram that are either above it, or on its the right — in formula, the set of all pairs (a,b)(a,b) such that a=ia=i and jbλij\leq b\leq \lambda_i, or iami\leq a\leq m and b=jλab=j\leq \lambda_a. Let hijh_{ij} be the cardinality of the hook HijH_{ij}.

Lemma. — One has ha,b=(λab)+(λba)+1h_{a,b} = (\lambda_a-b)+(\lambda_b^*-a) +1.

Indeed, λab\lambda_a-b is the number of cells above (a,b)(a,b) in the Ferrers diagram of λ\lambda, excluding (a,b)(a,b), while (λba)(\lambda_b^*-a) is the number of cells on the right of (a,b)(a,b).

Theorem (Frame, Thrall, Robinson; 1954). — Let nn be an integer and let λ\lambda be a partition of nn. The number of Young tableaux of shape λ\lambda is given by 
fλ=n!(i,j)F(λ)hij. f_\lambda = \frac{n!}{\prod_{(i,j)\in F(\lambda)} h_{ij}}.

From the point of view of representation theory, partitions of nn are in bijection with conjugacy classes of elements in the symmetric group Sn\mathfrak S_n (the lengths of the orbits of a permutation of {1,,n}\{1,\dots,n\} can be sorted into a partition of nn, and this partition characterizes the conjugacy class of the given permutation). Then, to each partition of nn corresponds an irreducible representation of Sn\mathfrak S_n, and fλf_\lambda appears to be its dimension. (In a future post, I plan to explain this part of the story.)

As already said, the rest of this post is devoted to explaining the probabilistic proof due to Greene, Nijenhuis, and Wilf. (Aside: Nijenhuis is a Dutch name that should pronounced roughly like Nay-en uys.)

A corner of a Ferrers diagram is a cell (i,j)(i,j) which is both on top of its column, and on the right of its row; in other words, it is a cell whose associated hook is made of itself only. A bit of thought convinces that a corner can be removed, and furnishes a Ferrers diagram with one cell less. Conversely, starting from a Ferrers diagram with n1n-1 cells, one may add a cell on the boundary so as to get a Ferrers diagram with nn cells. In the partition point of view, either one part gets one more item, or there is one more part, with only one item. In a Young tableau with nn cells, the cell numbered nn is at a corner, and removing it furnishes a Young tableau with n1n-1 cells; conversely, starting from a Young tableau with n1n-1 cells, one can add a cell so that it becomes a corner of the new tableau, and label it with nn.

Let P(λ1,,λm)P(\lambda_1,\dots,\lambda_m) be the number on the right hand side of the Frame-Thrall-Robinson formula. By convention, it is set to be 00 if (λ1,,λm)(\lambda_1,\dots,\lambda_m) does not satisfy λ1λm1\lambda_1\geq\dots\geq\lambda_m\geq 1. By induction, one wants to prove
P(λ1,,λm)=i=1mP(λ1,,λi1,λi1,λi+1,,λm). P(\lambda_1,\dots,\lambda_m) = \sum_{i=1}^m P(\lambda_1,\dots,\lambda_{i-1},\lambda_i-1,\lambda_{i+1},\dots,\lambda_m).
For every partition λ\lambda, set
pi(λ1,,λm)=P(λ1,,λi1,λi1,λi+1,,λm)P(λ1,,λm). p_i(\lambda_1,\dots,\lambda_m) = \frac{P(\lambda_1,\dots,\lambda_{i-1},\lambda_i-1,\lambda_{i+1},\dots,\lambda_m)}{P(\lambda_1,\dots,\lambda_m)}.
We thus need to prove
i=1mpi(λ)=1; \sum_{i=1}^m p_i(\lambda)=1;
which we will do by interpreting the pi(λ)p_i(\lambda) as the probabilities of disjoint events.

Given the Ferrers diagram F(λ)F(\lambda), let us pick, at random, one cell (i,j)(i,j), each of them given equal probability 1/n1/n; then we pick a new cell, at random, in the hook of (i,j)(i,j), each of them given equal probability 1/(hij1)1/(h_{ij}-1), etc., until we reach a corner of the given diagram. Such a trial defines a path in the Ferrers diagram, ending at a corner (a,b)(a,b); its projections  are denoted by A={a1<a2<}A=\{a_1<a_2<\dots\} and B={b1<b2<}B=\{b_1<b_2<\dots\}. Let p(a,b)p(a,b) be the probability that we reach the corner (a,b)(a,b); let q(A,B)q(A,B) be the probability that its projections be AA and BB conditioned to the hypothesis that it start at (inf(A),inf(B))(\inf(A),\inf(B)).

Lemma. — Let A,BA,B be sets of integers, let a=sup(A)a=\sup(A) and let b=sup(B)b=\sup(B);  assume that (a,b)(a,b) is a corner of λ\lambda. One has q(A,B)=iAia1hi,b1jBjb1ha,j1. q(A,B)= \prod_{\substack{i\in A\\ i\neq a}} \frac1{h_{i,b}-1} \prod_{\substack{j\in B\\ j\neq b}} \frac1{h_{a,j}-1}.

We argue by induction on the cardinalities of AA and BB. If A={a}A=\{a\} and B={b}B=\{b\}, then q(A,B)=1q(A,B)=1, since both products are empty; this proves the formula in this case. As above, let a1<a2<a_1<a_2<\dots be the enumeration of the elements of AA and b1<b2<b_1<b_2<\dots be that for BB; let also A=A{a1}A'=A\setminus\{a_1\} and B=B{b1}B'=B\setminus\{b_1\}. By construction of the process, after having chosen the initial cell (a1,b1)(a_1,b_1),  it either goes on above the initially chosen cell (a1,b1)(a_1,b_1), hence at (a1,b2)(a_1,b_2), or on its right, that is, at (a2,b1)(a_2,b_1). One thus has
\[ \begin{align*}
q(A,B) =   \mathbf P(A,B\mid a_1,b_1)
& =  \mathbf P(a_1,b_1,b_2\mid a_1,b_1) \mathbf P(A,B\mid a_1,b_1,b_2)
+ \mathbf P(a_1,a_2,b_1\mid a_1,b_1) \mathbf P(A,B\mid a_1,a_2,b_1)\\
&= \frac1{f_{a_1,b_1}-1} \mathbf P(A,B'\mid a_1,b_2)
+ \frac1{f_{a_1,b_1}-1} \mathbf P(A',B\mid a_2,b_1) \\
&= \frac1{h_{a_1,b_1}-1}\left( q(A',B) + q(A,B') \right). \end{align*}
\]
By induction, we may assume that the given formula holds for (A,B)(A',B) and (A,B)(A,B'). Then, one has
\[ \begin{align*}
q(A,B) & = \frac1{h_{a_1,b_1}-1} \left(
 \prod_{\substack{i\in A'\\ i\neq a}} \frac1{h_{i,b}-1} \prod_{\substack{j\in B \\ j\neq b}} \frac1{h_{a,j}-1}
+
 \prod_{\substack{i\in A\\ i\neq a}} \frac1{h_{i,b}-1} \prod_{\substack{j\in B' \\ j\neq b}} \frac1{h_{a,j}-1}\right) \\
& =\frac{  (h_{a_1,b}-1)+(h_{a,b_1}-1)}{h_{a_1,b_1}-1}
 \prod_{\substack{i\in A\\ i\neq a}} \frac1{h_{i,b}-1} \prod_{\substack{j\in B \\ j\neq b}} \frac1{h_{a,j}-1},
\end{align*}
\]
which implies the desired formula once one remembers that
\[ h_{a,b_1} + h_{a_1,b}
= h_{a_1,b_1} + h_{a,b}=  h_{a_1,b_1}+1\]
since (a,b)(a,b) is a corner of F(λ)F(\lambda).

Proposition. — Let (a,b)(a,b) be a corner of the diagram F(λ)F(\lambda); one has p(a,b)=pa(λ)p(a,b)=p_a(\lambda)(Note that b=λab=\lambda_a.)

Write Fa(λ)F_a(\lambda) for the Ferrers diagram with corner (a,b)(a,b) removed. Its (i,j)(i,j)-hook is the same as that of F(λ)F(\lambda) if iai\neq a and jbj\neq b; otherwise, it has one element less. Consequently, writing hi,jh'_{i,j} for the cardinalities of its hooks, one has
\[\begin{align*}
p_a(\lambda) &= \frac1n \frac{\prod_{(i,j)\in F(\lambda)}h_{i,j}}{\prod_{(i,j)\in F_a(\lambda)} h'_{i,j}} \\ &= n \prod_{i<a} \frac{h_{i,b}}{h_{i,b}-1} \prod_{j<b}\frac{h_{a,j}}{h_{a,j}-1}\\
&=\frac1n \prod_{i<a}\left(1+ \frac1{h_{i,b}-1}\right) \prod_{j<b} \left(1+\frac1{h_{a,j}-1}\right). \end{align*}
\]
Let us now expand the products. We get
\[
p_a(\lambda) = \frac1n \sum_{\sup(A)<a} \sum_{\sup(B)<b} \prod_{i\in A} \frac1{h_{i,b}-1}\prod_{j\in B}\frac1{h_{a,j}-1},
\]
where AA and BB range over the (possibly empty) subsets of {1,,n}\{1,\dots,n\} satisfying
the given conditions sup(A)<a\sup(A)<a and sup(B)<b\sup(B)<b. (Recall that, by convention, or by definition, one has sup()=\sup(\emptyset)=-\infty.) Consequently, one has
\[ \begin{align*}
p_a(\lambda) & =\frac1n \sum_{\sup(A)=a} \sup_{\sup(B)=b} \prod_{\substack{i\in A\\ i\neq a}}
 \frac1{h_{i,b}-1} \prod_{\substack{j\in B \\ j\neq b}} \frac1{h_{a,j}-1} \\
& = \frac1n \sum_{\sup(A)=a} \sum_{\sup(B)=b} q(A,B) \\
& = \sum_{\sup(A)=a} \sum_{\sup(B)=b} \mathbf P (A,B ) \\
& = p(a,b),
\end{align*}
\]
as claimed.

Now, every trial has to end at some corner (a,b)(a,b), so that
(a,b) is a cornerp(a,b)=1. \sum_{\text{(a,b)(a,b) is a corner}} p(a,b) = 1.
On the other hand, if (a,b)(a,b) is a corner, then b=λab=\lambda_a, while if (a,λa)(a,\lambda_a) is not a corner, then Pa(λ)=0P_a(\lambda)=0. We thus get aPa(λ)=P(λ)\sum_a P_a(\lambda)=P(\lambda), as was to be shown.

3 comments :

  1. Errata:

    - "sequence of integers" -> "sequence".

    - "Ferrrers" -> "Ferrers".

    - Add comma after "geometrically".

    - Your "Young tableau" is commonly called "standard Young tableau". Not asking you to change the name, but it's worth mentioning at least once.

    - "on its the right" -> "on its right, or (i,j)(i, j) itself".

    - "In the partition point of view, either one part gets one more item, or there is one more part, with only one item." I'd speak of incrementing a part by 11 or creating a new part equal to 11, rather than of "items" (these are integer partitions, not set partitions). Also, worth saying what "part" means.

    - You don't literally want P(λ1,,λm)P(\lambda_1,\dots,\lambda_m) to be 00 when λ1λm1\lambda_1\geq\dots\geq\lambda_m\geq 1 is not satisfied. After all, subtracting 11 from the last entry of the partition might make that last entry 00, and this should simply be considered as a partition with one fewer entry.

    - After "in the hook of (i,j)(i, j)", add "(but not (i,j)(i, j) itself)".

    - "its projections" should be better explained IMHO.

    - In the lemma, I'd require AA and BB to be nonzero, to avoid the question of what sup\sup \varnothing is.

    - In the second line of
    "\begin{align*}
    q(A,B) = \mathbf P(A,B\mid a_1,b_1)
    & = \mathbf P(a_1,b_1,b_2\mid a_1,b_1) \mathbf P(A,B\mid a_1,b_1,b_2)
    + \mathbf P(a_1,a_2,b_1\mid a_1,b_1) \mathbf P(A,B\mid a_1,a_2,b_1)\\
    &= \frac1{f_{a_1,b_1}-1} \mathbf P(A,B'\mid a_1,b_2)
    + \frac1{f_{a_1,b_1}-1} \mathbf P(A',B\mid a_2,b_1) \\
    &= \frac1{h_{a_1,b_1}-1}\left( q(A',B) + q(A,B') \right). \end{align*}"
    replace both ffs by hhs.

    - In the second line of
    "\begin{align*}
    p_a(\lambda) &= \frac1n \frac{\prod_{(i,j)\in F(\lambda)}h_{i,j}}{\prod_{(i,j)\in F_a(\lambda)} h'_{i,j}} \\ &= n \prod_{i "pa(λ)=0p_a(\lambda) = 0".

    - "aPa(λ)=P(λ)\sum_a P_a(\lambda)=P(\lambda)" -> "apa(λ)=1\sum_a p_a(\lambda)=1".

    PS. While I'm picking nits: On the first line of the proof of Proposition 2.4 of https://webusers.imj-prg.fr/~antoine.chambert-loir/enseignement/2015-16/agreg/resultant.pdf , "A[T]m+nA[T]_{m+n}" should be "A[T]<m+nA[T]_{< m+n}". Also, in the proof of Corollary 2.6, it is useless to define m=m1+m2m = m_1 + m_2 and P=P1P2P = P_1P_2 (you don't use these notations). In Section 3, you probably want to assume the ground ring to be a field. In Section 5.3, "Soit pp un nombre impair" should probably also require pp to be prime, and "et soit aKa \in K une racine de Φp\Phi_p" should be "et soit aKa \in K une racine de TpT_p". In the proof of Théorème 5.4, "Q(α)Q(\alpha)" should be "Tq(α)T_q(\alpha)". And in Exemple 1.8, "et soit KK une extension de KK" should be "et soit KK une extension de kk".

    PPS. Your lecture notes are gorgeous; thanks a lot for them!

    ReplyDelete
    Replies
    1. Just realized parts of my post have been cropped, probably due to <-signs being misunderstood as HTML. So let me repost them:

      - In the second line of
      "\begin{align*}
      p_a(\lambda) &= \frac1n \frac{\prod_{(i,j)\in F(\lambda)}h_{i,j}}{\prod_{(i,j)\in F_a(\lambda)} h'_{i,j}} \\ &= n \prod_{i < a} \frac{h_{i,b}}{h_{i,b}-1} \prod_{j < b}\frac{h_{a,j}}{h_{a,j}-1}\\
      &=\frac1n \prod_{i < a}\left(1+ \frac1{h_{i,b}-1}\right) \prod_{j < b} \left(1+\frac1{h_{a,j}-1}\right). \end{align*}"
      replace nn by 1n\frac1n.

      - Honestly I'd find it easier to replace "sup(A)"\sup(A) "p_a(\lambda) = 0$".

      - "aPa(λ)=P(λ)\sum_a P_a(\lambda)=P(\lambda)" -> "apa(λ)=1\sum_a p_a(\lambda)=1".

      Delete
    2. OK, that was better, but some parts are still missing. Next try:

      - Honestly I'd find it easier to replace "sup(A)<a\sup(A) < a" by "A{1,,a1}A \subseteq \left\{1,\ldots,a-1\right\}", and similarly for BB. That way you won't have to explain what you mean by the supremum of the empty set.

      - The "sup" in "\sup_{\sup(B)=b}" should be a \sum.

      - The boldfaced P in "\sum_{\sup(A)=a} \sum_{\sup(B)=b} \mathbf P (A,B ) " has never been defined. I assume it is meant to be the same as q(A,B)q(A, B), except that we don't condition to any hypothesis on the starting cell of the trial.

      - "Pa(λ)=0P_a(\lambda) = 0" -> "pa(λ)=0p_a(\lambda) = 0".

      - "aPa(λ)=P(λ)\sum_a P_a(\lambda)=P(\lambda)" -> "apa(λ)=1\sum_a p_a(\lambda)=1".

      Delete