Monday, December 12, 2022

Multiplicative square roots

I will briefly discuss the first section of a paper by William Waterhouse (2012), “Square Root as a Homomorphism” (American Mathematical Monthly 119 (3), 235–239), which addresses the following question: given a field $F$, when is it possible to define square roots for all squares compatibly with products, i.e., so that $\sqrt {ab}=\sqrt a\,\sqrt b$ whenever $a,b\in F$ are squares?

Real numbers. — Such a square root operation exists when $F$ is the field of real numbers: we are familiar with the process of taking the positive square root of a positive real number.

Finite fields. — It also exists in some finite fields. So let $F$ be a finite field and let $q$ be its number of elements; then $q$ is a power of a prime number $p$, but if you wish, you may already assume that $q=p$ is prime. For simplicity, we assume that $q$ is odd. By Fermat's little theorem, every nonzero element $a\in F$ satisfies $a^{q-1}=1$. Since $q-1$ is even, we can write $a^{q-1}=(a^{(q-1)/2})^2=1$, so that $a^{(q-1)/2}=\pm1$, and Euler's criterion asserts that $a$ is a square if and only if $a^{(q-1)/2}=1$. (That this condition is necessary is obvious: writing $a=b^2$, one gets $a^{(q-1)/2}=b^{q-1}=1$ by Fermat's little theorem. A counting argument then shows that it is sufficient: the map $b\mapsto b^2$ is $2$ to $1$ on nonzero elements, hence its image consists of $(q-1)/2$ elements, all of which are squares; since the polynomial equation $T^{(q-1)/2}=1$ has at most $(q-1)/2$ solutions in $F$, we obtain all of them in this way.)
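Euler's criterion is easy to verify numerically; here is a quick Python sketch (the prime $p=13$ is just an illustration):

```python
p = 13  # an illustrative odd prime

# The nonzero squares of F_p, computed directly.
squares = {b * b % p for b in range(1, p)}
assert len(squares) == (p - 1) // 2    # the squaring map is 2-to-1

# Euler's criterion: a is a square iff a^((p-1)/2) = 1 in F_p.
for a in range(1, p):
    assert (pow(a, (p - 1) // 2, p) == 1) == (a in squares)
```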

For example, $-1$ is a square if and only if $(-1)^{(q-1)/2}=1$, which happens if and only if $(q-1)/2$ is even, that is, $q\equiv 1\pmod 4$. In this case, do we have a formula for a square root of $-1$? When $q=p$, yes, but it is not an easy one: Wilson's theorem states that $(p-1)!\equiv -1\pmod p$, just because you may pair each integer $a$ such that $1\lt a\lt p-1$ with its multiplicative inverse modulo $p$; then only two factors remain in the product and $(p-1)!\equiv 1\cdot (p-1)\equiv -1\pmod p$. Now, pair instead each integer $a$ such that $1\leq a\leq p-1$ with its additive inverse $p-a$; we get $(p-1)!\equiv (((p-1)/2)!)^2\,(-1)^{(p-1)/2}\pmod p$, and since $(p-1)/2$ is even, $(((p-1)/2)!)^2\equiv -1\pmod p$. This is not an easy formula, because computing the factorial takes a long time for large $p$.
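This formula can be checked numerically in Python (again with the illustrative prime $p=13$, which is $\equiv 1\pmod 4$):

```python
import math

p = 13  # an illustrative prime with p ≡ 1 (mod 4)

# Wilson's theorem gives (((p-1)/2)!)² ≡ -1 (mod p).
b = math.factorial((p - 1) // 2) % p   # 6! = 720 ≡ 5 (mod 13)
assert b * b % p == p - 1              # 5² = 25 ≡ -1 (mod 13)
```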

It is possible to do much better, but you need to have a die at your disposal. Indeed, choose an element $a$ such that $1\leq a\leq p-1$ and compute $b=a^{(p-1)/4}$. Since $b^2=a^{(p-1)/2}=\pm1$, two possibilities arise: when $a$ is a square, we get $b^2=1$, but if $a$ is not a square, then we get $b^2=-1$. And if we choose $a$ at random, we have one chance out of two of not having chosen a square, hence one chance out of two of getting an element $b$ such that $b^2=-1$.
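A sketch of this randomized procedure in Python (the helper name `sqrt_of_minus_one` is mine):

```python
import random

p = 13  # an illustrative prime with p ≡ 1 (mod 4)

def sqrt_of_minus_one(p):
    """Find b with b² ≡ -1 (mod p) by random trials."""
    while True:
        a = random.randrange(1, p)
        b = pow(a, (p - 1) // 4, p)
        # b² = a^((p-1)/2) is ±1, and it is -1 exactly when a is not
        # a square, which happens for half of the choices of a.
        if b * b % p == p - 1:
            return b

b = sqrt_of_minus_one(p)
assert b * b % p == p - 1
```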

At this point you may ask why computing the power $a^{(p-1)/4}$ isn't as long as computing the factorial $((p-1)/2)!$, and you would be right to ask. The reason is that there is a fast recursive way to compute a power $a^n$: write $a^n=(a^2)^{n/2}$ if $n$ is even, and $a^n=a\cdot (a^2)^{(n-1)/2}$ if $n$ is odd. This leads to roughly $\log_2(n)$ multiplications and squarings, and not the $n-1$ multiplications that the naïve expression $a\cdot a\cdots a$ might have led you to expect.
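This square-and-multiply recursion can be sketched as follows (Python's built-in three-argument `pow` does the same thing natively; the function name `fast_pow` is mine):

```python
def fast_pow(a, n, mod):
    """Compute a^n mod `mod` with about log2(n) steps (square and multiply)."""
    if n == 0:
        return 1 % mod
    half = fast_pow(a * a % mod, n // 2, mod)       # aⁿ = (a²)^⌊n/2⌋ …
    return half if n % 2 == 0 else a * half % mod   # … times a when n is odd

assert fast_pow(3, 100, 13) == pow(3, 100, 13)  # matches the built-in pow
```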

But let us go back to the question of computing square roots. As the last three paragraphs indicate, it can be difficult to do so when $q\equiv 1\pmod 4$. However, it is extremely easy in the other case, $q\equiv 3\pmod 4$. Take a nonzero element $a$ which is a square, and write $a^{(q-1)/2}=1$. Since $q\equiv 3\pmod 4$, we can write $q=-1+4m$, so that $a^{2m-1}=1$, hence $a=a^{2m}=(a^m)^2$. We have our square root: it is simply given by $b=a^m=a^{(q+1)/4}$. The resulting map, $a\mapsto a^m$, gives us the desired multiplicative square root on squares.
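A quick Python check of this map, on an illustrative prime $p\equiv 3\pmod 4$:

```python
p = 19  # an illustrative prime with p ≡ 3 (mod 4)
m = (p + 1) // 4

squares = {b * b % p for b in range(1, p)}
r = {a: pow(a, m, p) for a in squares}   # the map a ↦ a^((p+1)/4) on squares

for a in squares:
    assert r[a] * r[a] % p == a                 # r(a)² = a
    for b in squares:
        assert r[a * b % p] == r[a] * r[b] % p  # r(ab) = r(a)·r(b)
```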

Complex numbers. — Now for a negative result: there is no multiplicative square root on the complex numbers, essentially for the reason we are taught that such a thing leads to fallacies. All complex numbers are squares, so let us assume that we have a multiplicative square root $r\colon \mathbf C\to\mathbf C$. Letting $i=r(-1)$, the contradiction comes from the relation $$-i = r(-i)^2=r((-i)^2)=r(-1)=i.$$

We can now state and prove Waterhouse's theorem:

Theorem. — Let $F$ be a field (of characteristic $\neq 2$) and let $S\subseteq F$ be the multiplicative monoid of squares. There exists a multiplicative homomorphism $r\colon S\to F$ if and only if $-1\notin S$.
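Before the proof, the theorem can be tested by brute force on small prime fields, where a multiplicative square root should exist exactly when $-1$ is not a square, i.e. when $p\equiv 3\pmod 4$; a sketch (the helper name is mine):

```python
from itertools import product

def has_mult_sqrt(p):
    """Brute force: is there a multiplicative square root on the squares of F_p?"""
    squares = sorted({b * b % p for b in range(1, p)})
    roots = {a: [b for b in range(1, p) if b * b % p == a] for a in squares}
    # Try every possible choice of a square root for each square.
    for choice in product(*(roots[a] for a in squares)):
        r = dict(zip(squares, choice))
        if all(r[a * b % p] == r[a] * r[b] % p
               for a in squares for b in squares):
            return True
    return False

# Waterhouse: a multiplicative square root exists iff -1 is not a square,
# which for F_p (p an odd prime) means p ≡ 3 (mod 4).
for p in (3, 5, 7, 11, 13):
    assert has_mult_sqrt(p) == (p % 4 == 3)
```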

Proof. — The same negative argument as for the complex numbers works whenever $-1$ is a square in $F$. So let us assume that $-1$ is not a square and let us explain why a multiplicative square root exists. The proof, however, is not explicit but relies on a maximality principle. Moreover, we won't define the square root map directly, but rather its image.
Let us first analyse the situation. Assume that $r\colon S\to F$ is a multiplicative square root. It is simpler to remove $0$ from the discussion, so we consider its restriction $S^\times \to F^\times$ and still denote it by $r$. By assumption, it is a morphism of groups, so that its image $R^\times$ is a subgroup of $F^\times$. Observe that it does not contain $-1$: if $r(a)=-1$, then $a=r(a)^2=(-1)^2=1$, but $r(1)=1$. Moreover, for every element $a\in F^\times$, we have $r(a^2)^2=a^2$, hence $r(a^2)=\pm a$, so that either $a$ or $-a$ belongs to $R^\times$, but not both, since $-1\notin R^\times$. As a consequence, $R^\times$ is a maximal subgroup of $F^\times$ among those which do not contain $-1$: adding to $R^\times$ any element $a\in F^\times$ such that $a\notin R^\times$ would lead to a subgroup $\langle R^\times,a\rangle$ which contains $-1$.

Let us now consider a maximal subgroup $R^\times$ of $F^\times$ which contains the squares but does not contain $-1$. Starting from $S^\times$, which does not contain $-1$, this can be done using Zorn's lemma, or by transfinite induction: well order the elements of $F^\times$ and construct $R^\times$ by induction. At each step, given an element $a$, observe that since $R^\times$ contains the squares, the union $R^\times \cup a R^\times$ is a subgroup of $F^\times$; if it does not contain $-1$, we replace $R^\times$ by it; otherwise we discard $a$ and keep $R^\times$.

Let $a\in F^\times$. If $a\notin R^\times$, the construction means that $-1\in aR^\times$, hence $-a\in R^\times$ (multiply by $a^2\in R^\times$). But we can't have both $a$ and $-a$ in $R^\times$, for that would imply that $-1\in R^\times$.

If $a\in F^\times$ is a nonzero square, it has two square roots, of the form $\pm b$, and we define $r(a)$ to be the one of its square roots which belongs to $R^\times$. One has $r(1)=1$, because $1\in S^\times\subset R^\times$. For nonzero squares $a,b$, the product $r(a)r(b)$ is a square root of $ab$, and it belongs to $R^\times$, hence it equals $r(ab)$. This proves that the map $r$ is multiplicative. This concludes the proof.
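In a finite prime field with $p\equiv 3\pmod 4$, the whole construction can be replayed with a plain loop instead of a transfinite induction (here the squares already form a subgroup of index 2, so the maximal subgroup $R^\times$ ends up being $S^\times$ itself); a sketch:

```python
p = 19  # an illustrative prime with p ≡ 3 (mod 4), so -1 is not a square

S = {b * b % p for b in range(1, p)}   # the group S^× of nonzero squares
assert p - 1 not in S                  # -1 is not a square

# Enlarge S^× to a maximal subgroup R^× not containing -1; the field is
# finite, so a plain loop replaces the transfinite induction.
R = set(S)
for a in range(1, p):
    candidate = R | {a * r % p for r in R}  # R^× ∪ aR^×, a subgroup since R ⊇ S
    if p - 1 not in candidate:
        R = candidate

# Exactly one of a, -a lies in R^× …
for a in range(1, p):
    assert (a in R) != (p - a in R)

# … and choosing the square root inside R^× is multiplicative on squares.
r = {a: next(b for b in R if b * b % p == a) for a in S}
for a in S:
    for b in S:
        assert r[a * b % p] == r[a] * r[b] % p
```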

Remark. — If you've studied some abstract algebra, you may have recognized something in the middle of the proof. Indeed, the quotient group $V=F^\times/S^\times$ has exponent 2: for every $\alpha$ in this group, $\alpha^2=1$. Consequently, even if it is written multiplicatively, this abelian group is a vector space over the field with 2 elements. Since $-1$ is not a square in $F^\times$, its class $[-1]$ is nonzero in $F^\times/S^\times$, and the quotient group $W=R^\times/S^\times$ is just a maximal vector subspace of $V$ that does not contain $[-1]$. It is a hyperplane and is defined by a linear form $\phi$ on $V$. Since $V$ is written multiplicatively, this linear form corresponds to a group homomorphism $f\colon F^\times \to\{\pm1\}$ which maps $S^\times$ to $1$ and such that $f(-1)=-1$. For every square $a=b^2$, we then have $r(a)=b f(b)$.
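In the finite field case $q=p\equiv 3\pmod 4$, the homomorphism $f$ is the Legendre symbol, and the identity $r(a)=b\,f(b)$ for $a=b^2$ can be checked numerically against the explicit square root $a\mapsto a^{(p+1)/4}$ (the prime $p=23$ is an illustration):

```python
p = 23  # an illustrative prime with p ≡ 3 (mod 4)

def legendre(b, p):
    """The homomorphism f: F_p^× → {±1}: it is 1 on squares and f(-1) = -1."""
    return 1 if pow(b, (p - 1) // 2, p) == 1 else -1

# For a = b², the square root r(a) = a^((p+1)/4) equals b·f(b) in F_p.
for b in range(1, p):
    a = b * b % p
    assert pow(a, (p + 1) // 4, p) == b * legendre(b, p) % p
```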

In his paper, Waterhouse goes on to view “fields $F$ with a multiplicative square root $r$” as a basic algebraic object, considering such structures $(F,r)$ which can't be extended by adding algebraic elements. The final theorem of the paper shows that the Galois group $\mathop{\rm Gal}(\overline F/F)$ is either cyclic of order 2, or is the additive group of the 2-adic integers.

Tuesday, November 1, 2022

#Mathober2022

Sophia Wood (@fractalkitty) had the good idea to set up a #Mathober project: for each day of October, she proposed one word of mathematics to react to. I did something on the Mastodon server, which was also crossposted on Twitter. I will copy it here, but meanwhile you can enjoy it there.

You can also enjoy Sophia's work there:

Link to Mastodon: https://mathstodon.xyz/web/@antoinechambertloir/109131452332129714

Link to Twitter: https://twitter.com/achambertloir/status/1578647553205276672

Link to Sophia Wood's sketches: https://fractalkitty.com/2022/10/01/mathober2022-sketches

Tuesday, September 13, 2022

Yet another post on simplicity

I see that I have finally arrived at the end of my journey of formalizing in Lean the simplicity of the alternating group on 5 letters or more, so it may be a good time to summarize what I did, from the mathematical side.

In a first blog post, “Not simple proofs of simplicity”, I had described my initial plan, but it was not clear at that time that I would arrive at a final proof, nor that I would be able to formalize it in Lean. In fact, a few weeks after I had started this experiment, I doubted I would make it and went on to formalize the traditional proof that the alternating group is simple. I added a few simplifications—which I was later told were already explained in Jacobson's Basic Algebra, but that's life…—leading to “The very simple proof that the alternating group of five letters (or more) is simple”. I managed to formalize that proof at the end of 2021, and spent a lot of energy during the next 8 months formalizing the proof that I initially had in mind.

As I had already explained, the goal/constraint is to apply the Iwasawa criterion to the alternating group. This criterion says that if a group $G$ acts primitively on a set $X$, and if we attach to each point $x\in X$ a commutative subgroup $T_x$ of $G$, in such a way that $T_{g\cdot x}=g\, T_x\, g^{-1}$ for every $g\in G$ and every $x\in X$, and if the subgroups $T_x$ generate $G$, then every normal subgroup of $G$ that acts nontrivially on $X$ contains the commutator subgroup. We take $G=\mathfrak A_n$. For $n\geq 5$, its commutator subgroup is $\mathfrak A_n$ itself (for example because any two 3-cycles are conjugate; in particular, a 3-cycle is conjugate to its square, which implies that it maps to $1$ in the abelianization of $\mathfrak A_n$). So we need to get primitive actions of $\mathfrak A_n$ and commutative subgroups.
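The parenthetical claim, that a 3-cycle is conjugate to its square inside $\mathfrak A_5$, can be checked by brute force; a small sketch (permutations are encoded as tuples of images, and this encoding is mine, not taken from the Lean code):

```python
from itertools import permutations

n = 5

def sign(s):
    """Signature of a permutation, computed by counting inversions."""
    return (-1) ** sum(s[i] > s[j] for i in range(n) for j in range(i + 1, n))

def compose(s, t):  # (s∘t)(i) = s[t[i]]
    return tuple(s[t[i]] for i in range(n))

def inverse(s):
    inv = [0] * n
    for i, si in enumerate(s):
        inv[si] = i
    return tuple(inv)

A5 = [s for s in permutations(range(n)) if sign(s) == 1]

c = (1, 2, 0, 3, 4)   # the 3-cycle (0 1 2), as a tuple of images
c2 = compose(c, c)    # its square, the 3-cycle (0 2 1)

# Some element of A_5 conjugates the 3-cycle to its square:
assert any(compose(compose(g, c), inverse(g)) == c2 for g in A5)
```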

One of the equivalent criteria for primitivity of a transitive action is that the stabilizers of points are maximal subgroups. As I had explained at the end of the first post, the maximal subgroups of $\mathfrak S_n$ and $\mathfrak A_n$ are known by the O'Nan–Scott theorem, combined with its converse, which is a theorem of Liebeck, Praeger and Saxl. These theorems give a precise list of the maximal subgroups of $\mathfrak S_n$ and $\mathfrak A_n$, of which the first entry is precisely $\mathfrak S_p\times \mathfrak S_{n-p}$ (where the first factor acts on $\{1;\dots;p\}$ and the second acts on $\{p+1;\dots;n\}$) and its intersection with $\mathfrak A_n$, for $0<p<n$ and $n\neq 2p$.

We need to understand the restriction $n\neq 2p$. The point is that if $n=2p$, the subgroup $\mathfrak S_p\times\mathfrak S_p$ is not maximal in $\mathfrak S_{2p}$: it is a subgroup of index 2 of a “wreath product” obtained by adding one permutation that exchanges the two blocks $\{1,\dots,p\}$ and $\{p+1,\dots,2p\}$, for example $(1\,p+1)(2\,p+2)\dots (p\,2p)$. This group is the second entry in the O'Nan–Scott theorem.

These two entries are labelled as intransitive and imprimitive respectively, because $\mathfrak S_p\times \mathfrak S_{n-p}$ has two orbits on $\{1;\dots;n\}$, while the wreath product is transitive but it preserves the partition consisting of the two blocks $\{1,\dots,p\}$ and $\{p+1,\dots,2p\}$.

The maximality of these two entries seems to be obvious to group theorists; it is given without proof in the paper of Liebeck, Praeger and Saxl.

The case of $\mathfrak S_n$ is easy, and occupies a subsection of Wilson's book on Finite Simple Groups. It is even funny to prove by hand, and not so hard to formalize in Lean. Take a subgroup $K$ of $\mathfrak S_n$ such that $\mathfrak S_p\times \mathfrak S_{n-p} \subsetneq K$ and let us prove that $K=\mathfrak S_n$.  To that end, it suffices to show that $K$ contains any transposition $(a\,b)$. This is obvious if both $a$ and $b$ belong to $\{1;\dots;p\}$, or if they both belong to $\{p+1;\dots;n\}$, so assume that $a\in\{1;\dots;p\}$ and $b\in\{p+1;\dots;n\}$. Since $K$ does not stabilize $\{1;\dots;p\}$, there are $x\in\{1;\dots;p\}$ and $k\in K$ such that $y=k\cdot x \in\{p+1;\dots;n\}$. If $n>2p$, there exists $z\in\{p+1;\dots;n\}$ such that $z\neq y$ and $t=k^{-1}\cdot z\in\{p+1;\dots;n\}$; from the relation $k^{-1} \cdot (y\,z) \cdot k=(x\,t)$ and the fact that $(y\,z)\in \mathfrak S_p\times\mathfrak S_{n-p}$, we deduce that $(x\,t)$ belongs to $K$. This gives us one transposition of the desired form; finally, the relation $(a\,b)=h (x\,t) h^{-1}$ with $h=(x\,a)(t\,b)\in\mathfrak S_p\times\mathfrak S_{n-p}$ shows that $(a\,b)\in K$. The other case, $n<2p$, is symmetric.
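For small $n$ this maximality can be checked by brute force; the following sketch (with $n=5$, $p=2$, and a naive closure computation of my own) verifies that adjoining any permutation outside $\mathfrak S_2\times\mathfrak S_3$ generates all of $\mathfrak S_5$:

```python
from itertools import permutations

n, p = 5, 2  # verify that S_2 × S_3 is maximal in S_5

def compose(s, t):  # (s∘t)(i) = s[t[i]]
    return tuple(s[t[i]] for i in range(n))

Sn = set(permutations(range(n)))
# The intransitive subgroup S_p × S_{n-p}: permutations preserving {0,1}.
H = {s for s in Sn if set(s[:p]) == set(range(p))}

def generated(gens):
    """Closure of gens under composition (finite, so this is the subgroup)."""
    group, frontier = set(gens), set(gens)
    while frontier:
        new = {compose(s, t) for s in frontier for t in group}
        new |= {compose(s, t) for s in group for t in frontier}
        frontier = new - group
        group |= frontier
    return group

# Adjoining any permutation outside H generates all of S_5.
assert all(len(generated(H | {g})) == len(Sn) for g in Sn - H)
```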

Bizarrely, the analogous result for the alternating group looked more difficult to me, although one colleague assured me that it could be done, another one that I could certainly do it, and a third one did it for $n>7$. Since Liebeck, Praeger and Saxl gave no reference at all, I asked Liebeck about it, and he explained to me a short proof that uses totally different ideas.

Let $G=\mathfrak A_n$ or $\mathfrak S_n$ and consider a subgroup $K$ such that $(\mathfrak S_p\times\mathfrak S_{n-p})\cap G \subsetneq K\subseteq G$; we wish to prove that $K=G$. Arguments as given above already show that $K$ acts transitively on $\{1;\dots;n\}$. But we can do more: it acts primitively. One then just needs to invoke an 1870 theorem of Jordan: a primitive subgroup of $\mathfrak S_n$ that contains a transposition is $\mathfrak S_n$, and a primitive subgroup of $\mathfrak S_n$ that contains a 3-cycle contains $\mathfrak A_n$!

To prove that $K$ acts primitively, it is convenient to use the standard definition of a primitive action. If a group $G$ acts on a set $X$, call a block of the action a nonempty subset $B$ of $X$ which, for every $g\in G$, is either fixed or moved to a disjoint subset by $g$; it follows from the definition that the translates of a block are pairwise equal or disjoint, so that for a transitive action they form a partition of $X$. Singletons are blocks, the full set is a block, and one definition of a primitive action is that the only blocks are these trivial ones (and $X$ is nonempty). Orbits are blocks, so that a primitive action is transitive. Conversely, one can prove that if the action is transitive, then it is primitive if and only if the stabilizers of points of $X$ are maximal subgroups. A more general result is that for every point $a\in X$, associating with a block $B$ its stabilizer $G_B$ gives a bijection from the set of blocks that contain $a$ to the set of subgroups of $G$ that contain $G_a$; the inverse bijection associates with a subgroup $K$ containing $G_a$ the orbit $K\cdot a$, and these bijections preserve inclusion.
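The definition of a block is easy to test on the imprimitive example from above: the wreath-type subgroup of $\mathfrak S_4$ preserving the partition $\{\{0,1\},\{2,3\}\}$. A small sketch (the encoding is mine):

```python
from itertools import combinations, permutations

n = 4
# The wreath-type subgroup of S_4 preserving the partition {{0,1},{2,3}}
# (possibly swapping the two parts); it has order 2·(2!)² = 8.
parts = [{0, 1}, {2, 3}]
G = [g for g in permutations(range(n))
     if all({g[i] for i in P} in parts for P in parts)]

def is_block(B):
    """Each g in G either fixes B (setwise) or moves it to a disjoint set."""
    return all({g[i] for i in B} == B or {g[i] for i in B}.isdisjoint(B)
               for g in G)

blocks = [set(B) for k in range(1, n + 1)
          for B in combinations(range(n), k) if is_block(set(B))]

# The trivial blocks are there, but so are the two parts themselves:
assert {0, 1} in blocks and {2, 3} in blocks  # hence the action is imprimitive
```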

Proof. — Let $B,B'$ be blocks such that $B\subseteq B'$ and let $g\in G_B$; then $g\cdot B'$ contains $g\cdot B=B$, hence $g\cdot B'$ is not disjoint from $B'$, so that $g\cdot B'=B'$ by definition of a block. This proves that $G_B$ is a subgroup of $ G_{B'}$.

Let $B$ be a block that contains $a$; then $G_B \cdot a=B$. Indeed, the inclusion $G_B\cdot a\subseteq B$ follows from the definition of $G_B$. To prove the other inclusion, let $b\in B$. Since the action is transitive, there exists $g\in G$ such that $g\cdot a=b$; then $g\cdot B$ and $B$ both contain $b$, hence $g\cdot B=B$, so that $g\in G_B$ and $b\in G_B\cdot a$.

Finally, let $K$ be a subgroup of $G$ containing $G_a$ and let $B=K\cdot a$. Let us prove that $B$ is a block such that $K=G_B$. Let $g\in G$ be such that $g\cdot B$ and $B$ are not disjoint; let $b,c\in B$ be such that $b=g\cdot c$; write $b=k\cdot a$ and $c=h\cdot a$ with $k,h\in K$. Then $k\cdot a = gh\cdot a$, so that $k^{-1}gh\in G_a$, hence $k^{-1}gh\in K$; we conclude that $g\in K$, hence $g\cdot B=gK\cdot a = K\cdot a=B.$ So $B$ is a block. This also shows that $G_B\subseteq K$, and the converse inclusion is obvious.

Going back to our initial problem, it remains to show that the action of $K$ on $\{1;\dots;n\}$ only has trivial blocks. The proof uses two remarks.

  1. The trace of a block on $\{1;\dots;p\}$, respectively $\{p+1;\dots;n\}$, is either a singleton, or all of it. Indeed, this trace is a block for the induced action of $(\mathfrak S_p\times\mathfrak S_{n-p})\cap G$ on $\{1;\dots;p\}$ (respectively $\{p+1;\dots;n\}$), and this action contains that of $\mathfrak A_p$ (respectively $\mathfrak A_{n-p}$), and even that of $\mathfrak S_p$ if $p\neq n-1$. On the other hand, the symmetric group acts 2-transitively, hence primitively.  (The cases $p=1$ or $p=n-1$ need minor adjustments.)
  2. If $2p<n$, then no nontrivial block can contain $\{p+1;\dots;n\}$. Indeed, there is not enough space in the complementary subset for disjoint translates of this block to make a partition of $\{1;\dots;n\}$.

Let us now conclude the proof. (I still find the following argument a bit convoluted but have nothing really better to propose yet.) Consider a block $B\subset\{1;\dots;n\}$ for the action of $K$, and assume that $B$ is not a singleton, nor the full set. If $B$ meets $\{p+1;\dots;n\}$ in at least two elements, then it contains $\{p+1;\dots;n\}$, hence is the full set, a contradiction. If $B$ meets $\{1;\dots;p\}$ in at least two elements, then it contains $\{1;\dots;p\}$, and some disjoint translate of it is contained in $\{p+1;\dots;n\}$; this translate is a block that contains $\{p+1;\dots;n\}$, hence is the full set, so that the initial block is the full set as well. By similar arguments, $B$ meets both $\{1;\dots;p\}$ and $\{p+1;\dots;n\}$ in exactly one element, and the same holds for any translate $k\cdot B$ of $B$. However, using the hypothesis that $p\neq n-p$ and that $K$ strictly contains $(\mathfrak S_p\times\mathfrak S_{n-p})\cap G$, we find $k\in K$ such that $k\cdot B$ meets $\{p+1;\dots;n\}$ in at least two elements, and we can conclude as earlier that $B$ is the full set.

To conclude this blog post, I need to say something about Jordan's theorem. Jordan was concerned with the concept of multiple transitivity: a group $G$ acting on a set $X$ is $m$-transitive if whenever systems of distinct elements $a_1,\dots,a_m$ on the one side, and $b_1,\dots,b_m$ on the other side, are given, there exists $g\in G$ such that $g\cdot a_1=b_1,\dots, g \cdot a_m=b_m$ (one assumes here that $m\leq {\mathrm{Card}(X)}$). Many theorems from this time (Mathieu, Bertrand, Serret, Jordan…), partly in relation with the Galois theory of equations, aim at limiting the multiple transitivity of subgroups of the symmetric group. The symmetric group itself is $n$-transitive, if $n={\mathrm {Card}(X)}$, the alternating group is $(n-2)$-transitive, and other subgroups have to be much less transitive.

The general result of Jordan, proved in Note C (page 664) to §398 of his Traité des substitutions et des équations algébriques (1870, Gauthier-Villars), is that a primitive subgroup of $\mathfrak S_n$ containing a cycle of prime order $p$ is $(n-p+1)$-transitive. For $p=2$, we get that this subgroup is $(n-1)$-transitive, hence is $\mathfrak S_n$; for $p=3$, we get that it is $(n-2)$-transitive, which implies that it contains the alternating group $\mathfrak A_n$. I formalized these results in Lean, following the presentation of Wielandt's book on Finite permutation groups (theorem 13.3 of that reference). A later theorem of Jordan (1873; see theorem 13.9 in Wielandt's book) asserts that such a subgroup always contains the alternating group provided $n-p\geq 3$; I have not (not yet?) formalized it in Lean.

All in all, this gives a fairly sophisticated proof that the alternating group is simple. One of its merits is to follow a general line that applies to many other groups. In particular, Iwasawa's criterion is also used by Wilson in his book Finite simple groups to prove the simplicity of the Mathieu groups $M_{11}, M_{12}$, and of many other finite groups.

I just opened Jordan's book to write this blog post. Let me add that it contains (§85) another proof of the simplicity of the alternating group, and I will try to explain it in a later post.