Monday, December 12, 2022

Multiplicative square roots

I will briefly discuss the first section of a paper by William Waterhouse (2012), “Square Root as a Homomorphism” (American Mathematical Monthly 119 (3), 235–239), which addresses the following question: given a field $F$, when is it possible to define square roots for all squares compatibly with products, i.e., so that $\sqrt{ab}=\sqrt a\,\sqrt b$ whenever $a,b\in F$ are squares?

Real numbers. — Such a square root operation exists when $F$ is the field of real numbers: we are familiar with the process of taking the positive square root of a positive real number.

Finite fields. — It also exists in some finite fields. So let $F$ be a finite field and let $q$ be its number of elements; then $q$ is a power of a prime number $p$, but if you wish, you may already assume that $q=p$ is prime. For simplicity, we assume that $q$ is odd. By Fermat's little theorem, every nonzero element $a\in F$ satisfies $a^{q-1}=1$. Since $q-1$ is even, we can write $a^{q-1}=(a^{(q-1)/2})^2=1$, so that $a^{(q-1)/2}=\pm1$, and Euler's criterion asserts that $a$ is a square if and only if $a^{(q-1)/2}=1$. (That this condition is necessary is obvious: writing $a=b^2$, one gets $a^{(q-1)/2}=b^{q-1}=1$ by Fermat's theorem. A counting argument shows that it is sufficient: the map $b\mapsto b^2$ is $2$-to-$1$ on nonzero elements, hence its image consists of $(q-1)/2$ elements, all of which are squares; since the polynomial equation $T^{(q-1)/2}=1$ has at most $(q-1)/2$ solutions in $F$, we obtain all of them in this way.)
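Euler's criterion and the counting argument above can be checked directly in a small prime field. A minimal sketch (mine, not from the paper), using Python's built-in three-argument `pow` for modular exponentiation:

```python
def is_square_mod(a, p):
    """Euler's criterion: a nonzero a is a square mod the odd prime p
    exactly when a^((p-1)/2) ≡ 1 (mod p)."""
    return pow(a, (p - 1) // 2, p) == 1

p = 13
# the squaring map b ↦ b² is 2-to-1 on nonzero elements,
# so its image has exactly (p-1)/2 elements
squares = {b * b % p for b in range(1, p)}
assert len(squares) == (p - 1) // 2
# Euler's criterion detects exactly these squares
assert all(is_square_mod(a, p) == (a in squares) for a in range(1, p))
```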

For example, $-1$ is a square if and only if $(-1)^{(q-1)/2}=1$, which happens if and only if $(q-1)/2$ is even, that is, $q\equiv 1\pmod 4$. In this case, do we have a formula for a square root of $-1$? When $q=p$, yes, but it is not an easy one. Wilson's theorem states that $(p-1)!\equiv -1\pmod p$, just because you may pair each integer $a$ such that $1<a<p-1$ with its multiplicative inverse modulo $p$; then only two factors remain in the product and $(p-1)!\equiv 1\cdot (p-1)\equiv -1\pmod p$. Now, pair each integer $a$ such that $1\leq a\leq p-1$ with its additive inverse $p-a$; we get $(p-1)!\equiv (((p-1)/2)!)^2\,(-1)^{(p-1)/2}$, hence $(((p-1)/2)!)^2\equiv -1\pmod p$ when $p\equiv 1\pmod 4$. This is not an easy formula, because computing the factorial takes a long time for large $p$.
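The Wilson-style formula can be tested numerically. A sketch (names are mine) computing $((p-1)/2)!\bmod p$ and checking that its square is $-1$:

```python
def sqrt_of_minus_one_wilson(p):
    """For a prime p ≡ 1 (mod 4), return b = ((p-1)/2)! mod p,
    which satisfies b² ≡ -1 (mod p)."""
    b = 1
    for a in range(2, (p - 1) // 2 + 1):  # ((p-1)/2)! mod p, reducing as we go
        b = b * a % p
    return b

for p in (5, 13, 29, 101):
    b = sqrt_of_minus_one_wilson(p)
    assert b * b % p == p - 1  # b² ≡ -1 (mod p)
```

Note the loop makes about $p/2$ multiplications, which is exactly the slowness the text complains about.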

It is possible to be much quicker, but you need to have a die at your disposal. Indeed, choose an element $a$ such that $1\leq a\leq p-1$ and compute $b=a^{(p-1)/4}$. Since $b^2=a^{(p-1)/2}=\pm1$, two possibilities arise: when $a$ is a square, we get $1$, but if $a$ is not a square, then we get $-1$. And if we choose $a$ randomly, we have one chance in two of not having chosen a square, hence one chance in two of getting an element $b$ such that $b^2=-1$.
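The die-rolling procedure, as a sketch (the function name and the fixed random seed are mine):

```python
import random

def random_sqrt_of_minus_one(p, rng=random.Random(0)):
    """Find b with b² ≡ -1 (mod p), for a prime p ≡ 1 (mod 4),
    by trying random elements until a non-square is drawn."""
    while True:
        a = rng.randrange(1, p)
        b = pow(a, (p - 1) // 4, p)
        if b * b % p == p - 1:  # a was a non-square: b² = -1
            return b
        # otherwise a was a square and b = ±1; roll the die again

p = 65537  # a prime ≡ 1 (mod 4)
b = random_sqrt_of_minus_one(p)
assert b * b % p == p - 1
```

Each trial succeeds with probability $1/2$, so the expected number of rolls is $2$.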

At this point you may ask why computing the power $a^{(p-1)/4}$ isn't as long as computing the factorial $((p-1)/2)!$, and you would be right to ask. The reason is that there is a fast recursive way to compute a power $a^n$: write $a^n=(a^2)^{n/2}$ if $n$ is even, and $a^n=a\cdot (a^2)^{(n-1)/2}$ if $n$ is odd. This leads to roughly $\log_2(n)$ multiplications and squarings, and not the $n-1$ multiplications that the naïve expression $a\cdot a\cdots a$ might have led you to expect.
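The recursion above, written out as a sketch (Python's built-in `pow(a, n, p)` already implements this square-and-multiply idea):

```python
def fast_pow(a, n, p):
    """Compute a^n mod p with about log2(n) squarings and multiplications."""
    if n == 0:
        return 1
    if n % 2 == 0:
        return fast_pow(a * a % p, n // 2, p)            # a^n = (a²)^(n/2)
    return a * fast_pow(a * a % p, (n - 1) // 2, p) % p  # a^n = a·(a²)^((n-1)/2)

m = 2**61 - 1
assert fast_pow(3, 1000003, m) == pow(3, 1000003, m)
```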

But let us go back to the question of computing square roots. As the last three paragraphs indicate, it could be difficult to do so when $q\equiv 1\pmod 4$. However, it is extremely easy in the other case, $q\equiv 3\pmod 4$. Take a nonzero element $a$ which is a square, so that $a^{(q-1)/2}=1$. Since $q\equiv 3\pmod 4$, we write $q=-1+4m$, so that $a^{2m-1}=1$, hence $a=a^{2m}=(a^m)^2$. We have our square root: it is simply $b=a^m=a^{(q+1)/4}$. The resulting map, $a\mapsto a^m$, gives us our desired multiplicative square root on squares.
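A sketch of this formula (the prime below is my own example), together with a check that the map $a\mapsto a^{(p+1)/4}$ really is multiplicative on squares:

```python
def sqrt_mod(a, p):
    """Square root of the nonzero square a modulo a prime p ≡ 3 (mod 4),
    via the formula b = a^((p+1)/4)."""
    assert p % 4 == 3 and pow(a, (p - 1) // 2, p) == 1  # a must be a square
    return pow(a, (p + 1) // 4, p)

p = 10007  # a prime ≡ 3 (mod 4)
for a in (4, 9, 2500):
    b = sqrt_mod(a, p)
    assert b * b % p == a            # b is indeed a square root of a
# multiplicativity: sqrt(ab) = sqrt(a)·sqrt(b)
assert sqrt_mod(36, p) == sqrt_mod(4, p) * sqrt_mod(9, p) % p
```

Note that `sqrt_mod(4, p)` need not return `2`: the formula picks out the square root lying in the image of $a\mapsto a^m$, which is what makes the choice multiplicative.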

Complex numbers. — Now for a negative result: there is no multiplicative square root on the complex numbers, basically for the reason we have been taught that it leads to fallacies. All complex numbers are squares, so let us assume that we have a multiplicative square root $r\colon \mathbf C\to\mathbf C$. Letting $i=r(-1)$, the contradiction comes from the relation $-i = r(-i)^2=r((-i)^2)=r(-1)=i$.
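The familiar numerical face of this impossibility: Python's `cmath.sqrt` is the principal square root, and (like any choice of square root on $\mathbf C$) it fails to be multiplicative. A quick sketch:

```python
import cmath

a = b = -1 + 0j
lhs = cmath.sqrt(a * b)              # sqrt(1)  = 1
rhs = cmath.sqrt(a) * cmath.sqrt(b)  # i · i    = -1
# the classic fallacy: sqrt(ab) ≠ sqrt(a)·sqrt(b)
assert abs(lhs - 1) < 1e-12 and abs(rhs + 1) < 1e-12
```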

We can now state and prove Waterhouse's theorem:

Theorem. — Let $F$ be a field (of characteristic $\neq 2$) and let $S\subseteq F$ be the multiplicative monoid of squares. There exists a multiplicative homomorphism $r\colon S\to F$ if and only if $-1\notin S$.

Proof. — The same negative argument as for the complex numbers works whenever $-1$ is a square in $F$. So let us assume that $-1$ is not a square and let us explain why a multiplicative square root exists. The proof, however, is not explicit, but relies on some maximality principle. Moreover, we won't define the square root map directly, but rather its image.
Let us first analyse the situation. Assume that $r\colon S\to F$ is a multiplicative square root. It is simpler to remove $0$ from the discussion, so we consider its restriction $S^\times \to F^\times$ and still denote it by $r$. By assumption, it is a morphism of groups, so that its image $R^\times$ is a subgroup of $F^\times$. Observe that it does not contain $-1$: for if $r(a)=-1$, then $a=r(a)^2=(-1)^2=1$, but $r(1)=1$. Moreover, for every element $a\in F^\times$, we have $r(a^2)^2=a^2$, hence $r(a^2)=\pm a$, so that either $a$ or $-a$ belongs to $R^\times$, but not both, since $-1\notin R^\times$. As a consequence, $R^\times$ is a maximal subgroup of $F^\times$ among those which do not contain $-1$: adding to $R^\times$ any element $a\in F^\times$ such that $a\notin R^\times$ would lead to a subgroup $\langle R^\times,a\rangle$ which contains $-1$.

Let us now consider a maximal subgroup $R^\times$ of $F^\times$ which contains the squares and does not contain $-1$. Starting from $S^\times$, which does not contain $-1$, such a subgroup can be obtained using Zorn's lemma, or by transfinite induction: well-order the elements of $F^\times$ and construct $R^\times$ by induction. At each step, given an element $a$: since $R^\times$ contains the squares, the union $R^\times \cup a R^\times$ is a subgroup of $F^\times$; if it does not contain $-1$, then we replace $R^\times$ by it; otherwise we discard $a$ and keep $R^\times$.

Let $a\in F^\times$. If $a\notin R^\times$, the construction means that $-1\in aR^\times$, hence $-a\in R^\times$. But we can't have both $a$ and $-a$ in $R^\times$, for that would imply $-1\in R^\times$.

If $a\in F^\times$ is a nonzero square, it has two square roots, of the form $\pm b$, and we define $r(a)$ to be the one which belongs to $R^\times$. One has $r(1)=1$, because $1\in S^\times\subseteq R^\times$. For nonzero squares $a,b$, the product $r(a)r(b)$ is a square root of $ab$, and it belongs to $R^\times$, hence it equals $r(ab)$. This proves that the map $r$ is multiplicative, and concludes the proof.

Remark. — If you've studied some abstract algebra, you may have recognized something in the middle of the proof. Indeed, the quotient group $V=F^\times/S^\times$ has exponent 2: for every $\alpha$ in this group, $\alpha^2=1$. Consequently, even if it is written multiplicatively, this abelian group is a vector space over the field with 2 elements. Since $-1$ is not a square in $F^\times$, its class $[-1]$ is nonzero in $F^\times/S^\times$, and the quotient group $W=R^\times/S^\times$ is just a maximal vector subspace that does not contain $[-1]$. It is a hyperplane, and is defined by a linear form $\phi$ on $V$. Since $V$ is written multiplicatively, this linear form corresponds to a group homomorphism $f\colon F^\times \to\{\pm1\}$ which maps $S^\times$ to $1$ and such that $f(-1)=-1$. For every square $a=b^2$, we then have $r(a)=b\,f(b)$.
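Over the real numbers, the homomorphism $f$ of this remark is just the sign character, and $r(b^2)=b\,f(b)=|b|$ recovers the usual positive square root. A toy sketch (the function names are mine):

```python
def f(x):
    """Group homomorphism R* -> {±1}: trivial on squares (positives), f(-1) = -1."""
    return 1 if x > 0 else -1

def r(b):
    """Multiplicative square root of the square b², namely b·f(b) = |b|."""
    return b * f(b)

for b in (-3.0, 2.0, -0.5):
    assert r(b) == abs(b)        # r picks the positive square root
    assert r(b) ** 2 == b * b    # and it is indeed a square root of b²
```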

In his paper, Waterhouse goes on to view “fields $F$ with a multiplicative square root $r$” as basic algebraic objects, and to consider such structures $(F,r)$ which can't be extended by adding algebraic elements. The final theorem of the paper shows that the Galois group $\mathop{\rm Gal}(\overline F/F)$ is then either cyclic of order 2 or the additive group of the 2-adic integers.

Tuesday, November 1, 2022

#Mathober2022

Sophia Wood (@fractalkitty) had the good idea to set up a #Mathober project: for each day of October, she invites you to react to one word of mathematics. I did something on the Mastodon server, which was also crossposted to Twitter. I will copy it here, but meanwhile you can enjoy it there.

You can also enjoy Sophia's work there:

Link to Mastodon: https://mathstodon.xyz/web/@antoinechambertloir/109131452332129714

Link to Twitter: https://twitter.com/achambertloir/status/1578647553205276672

Link to Sophia Wood's sketches: https://fractalkitty.com/2022/10/01/mathober2022-sketches

Tuesday, September 13, 2022

Yet another post on simplicity

I see that I am finally arriving at the end of my journey of formalizing in Lean the simplicity of the alternating group on 5 letters or more, so it may be a good time to summarize what I did, from the mathematical side.

In a first blog post, “Not simple proofs of simplicity”, I had described my initial plan, but it was not clear at that time that I would arrive at a final proof, nor that I would be able to formalize it in Lean. In fact, a few weeks after I had started this experiment, I doubted I would make it and went on to formalize the traditional proof that the alternating group is simple. I added a few simplifications—which I was later told were already explained in Jacobson's Basic Algebra; that's life…—leading to “The very simple proof that the alternating group on five letters (or more) is simple”. I managed to formalize that proof at the end of 2021, and spent a lot of energy over the next 8 months formalizing the proof that I initially had in mind.

As I had already explained, the goal/constraint is to apply the Iwasawa criterion to the alternating group. This criterion says that if a group $G$ acts primitively on a set $X$, and if we attach to each point $x\in X$ a commutative subgroup $T_x$ of $G$, in such a way that $T_{g\cdot x}=g\,T_x\,g^{-1}$ for every $g\in G$ and every $x\in X$, and if the subgroups $T_x$ generate $G$, then every normal subgroup of $G$ that acts nontrivially on $X$ contains the commutator subgroup. We take $G=\mathfrak A_n$. For $n\geq 5$, its commutator subgroup is $\mathfrak A_n$ itself (for example because any two 3-cycles are conjugate; in particular, a 3-cycle is conjugate to its square, which implies that it maps to $1$ in the abelianization of $\mathfrak A_n$). So we need to find primitive actions of $\mathfrak A_n$ and commutative subgroups.

One of the equivalent criteria for primitivity of a transitive action is that the stabilizers of points are maximal subgroups. As I had explained at the end of the first post, the maximal subgroups of $\mathfrak S_n$ and $\mathfrak A_n$ are known by the O'Nan–Scott theorem, combined with its converse, which is a theorem of Liebeck, Praeger and Saxl. These theorems give a precise list of the maximal subgroups of $\mathfrak S_n$ and $\mathfrak A_n$, of which the first entry is precisely $\mathfrak S_p\times \mathfrak S_{n-p}$ (where the first factor acts on $\{1,\dots,p\}$ and the second acts on $\{p+1,\dots,n\}$) and its intersection with $\mathfrak A_n$, provided $0<p<n$ and $n\neq 2p$.

We need to understand the restriction $n\neq 2p$: the point is that if $n=2p$, the subgroup $\mathfrak S_p\times\mathfrak S_p$ is not maximal in $\mathfrak S_{2p}$; it is a subgroup of index 2 of a “wreath product” obtained by adding one permutation that exchanges the two blocks $\{1,\dots,p\}$ and $\{p+1,\dots,2p\}$, for example $(1\,p{+}1)(2\,p{+}2)\cdots (p\,2p)$. This wreath product is the second entry in the O'Nan–Scott theorem.

These two entries are labelled intransitive and imprimitive respectively, because $\mathfrak S_p\times \mathfrak S_{n-p}$ has two orbits on $\{1,\dots,n\}$, while the wreath product is transitive but preserves the partition consisting of the two blocks $\{1,\dots,p\}$ and $\{p+1,\dots,2p\}$.

The maximality of these two entries seems to be obvious to group theorists: it is given without proof in the paper of Liebeck, Praeger and Saxl.

The case of $\mathfrak S_n$ is easy, and occupies a subsection of Wilson's book on finite simple groups. It is even fun to prove by hand, and not so hard to formalize in Lean. Take a subgroup $K$ of $\mathfrak S_n$ such that $\mathfrak S_p\times \mathfrak S_{n-p} \subsetneq K$ and let us prove that $K=\mathfrak S_n$. To that end, it suffices to show that $K$ contains every transposition $(a\,b)$. This is obvious if both $a$ and $b$ belong to $\{1,\dots,p\}$ or if they both belong to $\{p+1,\dots,n\}$, so assume that $a\in\{1,\dots,p\}$ and $b\in\{p+1,\dots,n\}$. Since $K$ does not stabilize $\{1,\dots,p\}$, there are $x\in\{1,\dots,p\}$ and $k\in K$ such that $y=k\cdot x \in\{p+1,\dots,n\}$. If $n>2p$, there exists $z\in\{p+1,\dots,n\}$ such that $z\neq y$ and $t=k^{-1}\cdot z\in\{p+1,\dots,n\}$; from the relation $k^{-1} (y\,z)\, k=(x\,t)$ and the fact that $(y\,z)\in \mathfrak S_p\times\mathfrak S_{n-p}$, we deduce that $(x\,t)$ belongs to $K$. This gives us one transposition of the desired form; finally, the relation $(a\,b)=h\, (x\,t)\, h^{-1}$ with $h=(x\,a)(t\,b)\in\mathfrak S_p\times\mathfrak S_{n-p}$ shows that $(a\,b)\in K$. The other case, $n<2p$, is symmetric.
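The final conjugation identity $(a\,b)=h\,(x\,t)\,h^{-1}$ is easy to check by machine. A toy sketch (mine, not part of the formalization), with permutations of $\{0,\dots,n-1\}$ represented as tuples:

```python
def compose(s, g):
    """Composition s∘g, i.e. i ↦ s[g[i]]."""
    return tuple(s[i] for i in g)

def inverse(g):
    inv = [0] * len(g)
    for i, j in enumerate(g):
        inv[j] = i
    return tuple(inv)

def transposition(i, j, n):
    t = list(range(n))
    t[i], t[j] = t[j], t[i]
    return tuple(t)

# x, a in the first block {0,…,p-1}; t, b in the second block
n, x, a, t, b = 7, 0, 1, 4, 5
h = compose(transposition(x, a, n), transposition(t, b, n))  # h = (x a)(t b)
conj = compose(compose(h, transposition(x, t, n)), inverse(h))
assert conj == transposition(a, b, n)  # (a b) = h (x t) h⁻¹
```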

Bizarrely, the analogous result for the alternating group looked more difficult to me, although one colleague assured me that it could be done, another one that I could certainly do it, and a third one did it for $n>7$. Since Liebeck, Praeger and Saxl gave no reference at all, I asked Liebeck about it, and he explained to me a short proof that uses totally different ideas.

Let $G=\mathfrak A_n$ or $\mathfrak S_n$ and consider a subgroup $K$ such that $(\mathfrak S_p\times\mathfrak S_{n-p})\cap G \subsetneq K\subseteq G$; we wish to prove that $K=G$. Arguments like those given above already show that $K$ acts transitively on $\{1,\dots,n\}$. But we can do more: it acts primitively. Now, one just needs to invoke an 1870 theorem of Jordan: a primitive subgroup of $\mathfrak S_n$ that contains a transposition is $\mathfrak S_n$, and a primitive subgroup of $\mathfrak S_n$ that contains a 3-cycle contains $\mathfrak A_n$!

To prove that $K$ acts primitively, it is convenient to use the standard definition of a primitive action. If a group $G$ acts on a set $X$, call a block of the action a nonempty subset $B$ of $X$ which, for every $g\in G$, is either fixed or moved to a disjoint subset by $g$; it follows from the definition that the translates of a block by the action form a partition of $X$. Singletons are blocks, the full set is a block, and one definition of a primitive action is that the only blocks are these trivial ones (and $X$ is nonempty). Orbits are blocks, so that a primitive action is transitive. Conversely, one can prove that if the action is transitive, then it is primitive if and only if the stabilizers of points of $X$ are maximal subgroups. A more general result is that for every point $a\in X$, associating with a block $B$ its stabilizer $G_B$ gives a bijection from the set of blocks that contain $a$ to the set of subgroups of $G$ that contain $G_a$, the inverse bijection associating with a subgroup $K$ containing $G_a$ the orbit $K\cdot a$; moreover, these bijections preserve inclusion.

Proof. — Let $B,B'$ be blocks such that $B\subseteq B'$ and let $g\in G_B$; then $g\cdot B'$ contains $g\cdot B=B$, hence $g\cdot B'$ is not disjoint from $B'$, so that $g\cdot B'=B'$ by the definition of a block. This proves that $G_B$ is a subgroup of $G_{B'}$.

Let $B$ be a block that contains $a$; then $G_B \cdot a=B$. Indeed, the inclusion $G_B\cdot a\subseteq B$ follows from the definition of $G_B$. To prove the other inclusion, let $b\in B$. Since the action is transitive, there exists $g\in G$ such that $g\cdot a=b$; then $g\cdot B$ and $B$ both contain $b$, hence $g\cdot B=B$, so that $g\in G_B$ and $b\in G_B\cdot a$.

Finally, let $K$ be a subgroup of $G$ containing $G_a$ and let $B=K\cdot a$. Let us prove that $B$ is a block such that $K=G_B$. Let $g\in G$ be such that $g\cdot B$ and $B$ are not disjoint; let $b,c\in B$ be such that $b=g\cdot c$; write $b=k\cdot a$ and $c=h\cdot a$ for some $k,h\in K$. Then $k\cdot a = gh\cdot a$, so that $k^{-1}gh\in G_a$, hence $k^{-1}gh\in K$; we conclude that $g\in K$, hence $g\cdot B=gK\cdot a = K\cdot a=B$. So $B$ is a block. This also shows that $G_B\subseteq K$, and the converse inclusion is obvious.
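The definition of a block can be checked by brute force for small permutation groups. A sketch (mine) with permutations as tuples on $\{0,\dots,n-1\}$, applied to the wreath-product example from earlier, which acts transitively but imprimitively on 4 points:

```python
def generate(gens, n):
    """All elements of the subgroup of S_n generated by gens (closure under
    left multiplication, starting from the identity)."""
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = tuple(s[g[i]] for i in range(n))  # composition s∘g
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def is_block(B, group):
    """B is a block iff every translate g·B is equal to or disjoint from B."""
    B = frozenset(B)
    return all((gB := frozenset(g[i] for i in B)) == B or not (gB & B)
               for g in group)

# S_2 × S_2 extended by the swap of the two blocks {0,1} and {2,3}
gens = [(1, 0, 2, 3), (0, 1, 3, 2), (2, 3, 0, 1)]
G = generate(gens, 4)
assert is_block({0, 1}, G)      # the preserved partition gives a nontrivial block
assert not is_block({0, 2}, G)  # a mixed pair is not a block
```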

Going back to our initial problem, it remains to show that the action of $K$ on $\{1,\dots,n\}$ only has trivial blocks. The proof uses two remarks.

  1. The trace of a block on $\{1,\dots,p\}$, respectively on $\{p+1,\dots,n\}$, is either a singleton or all of it. Indeed, this trace is a block for the induced action of $(\mathfrak S_p\times\mathfrak S_{n-p})\cap G$ on $\{1,\dots,p\}$ (respectively $\{p+1,\dots,n\}$), and this action contains that of $\mathfrak A_p$ (respectively…), and even that of $\mathfrak S_p$ if $p\neq n-1$. On the other hand, the symmetric group acts 2-transitively, hence primitively. (The cases $p=1$ or $p=n-1$ need minor adjustments.)
  2. If $2p<n$, then no nontrivial block can contain $\{p+1,\dots,n\}$. Indeed, there is not enough space in the complementary subset for disjoint translates of this block to make a partition of $\{1,\dots,n\}$.

Let us now conclude the proof. (I still find the following argument a bit convoluted, but have nothing really better to propose yet.) Consider a block $B\subseteq\{1,\dots,n\}$ for the action of $K$, and assume that $B$ is neither a singleton nor the full set. If $B$ meets $\{p+1,\dots,n\}$ in at least two elements, then it contains $\{p+1,\dots,n\}$, hence is the full set, a contradiction. If $B$ meets $\{1,\dots,p\}$ in at least two elements, then it contains $\{1,\dots,p\}$, and some disjoint translate of it is contained in $\{p+1,\dots,n\}$; this translate is a block that contains $\{p+1,\dots,n\}$, hence is the full set, so that the initial block is the full set as well. By similar arguments, $B$ meets both $\{1,\dots,p\}$ and $\{p+1,\dots,n\}$ in exactly one element, and the same holds for any translate $k\cdot B$ of $B$. However, using the hypothesis that $p\neq n-p$ and that $K$ strictly contains $(\mathfrak S_p\times\mathfrak S_{n-p})\cap G$, we find $k\in K$ such that $k\cdot B$ meets $\{p+1,\dots,n\}$ in at least two elements, and we conclude as earlier that $B$ is the full set.

To end this blog post, I need to say something about Jordan's theorem. Jordan was concerned with the concept of multiple transitivity: a group $G$ acting on a set $X$ is $m$-transitive if, whenever systems of distinct elements $a_1,\dots,a_m$ on the one side and $b_1,\dots,b_m$ on the other side are given, there exists $g\in G$ such that $g\cdot a_1=b_1,\dots, g\cdot a_m=b_m$ (one assumes here that $m\leq \mathrm{Card}(X)$). Many theorems from this period (Mathieu, Bertrand, Serret, Jordan…), partly in relation with the Galois theory of equations, aim at limiting the multiple transitivity of subgroups of the symmetric group. The symmetric group itself is $n$-transitive, where $n=\mathrm{Card}(X)$, the alternating group is $(n-2)$-transitive, and other subgroups have to be much less transitive.
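For tiny groups, multiple transitivity can be verified exhaustively. A brute-force sketch (mine): on 4 points, $\mathfrak S_4$ is 4-transitive and $\mathfrak A_4$ is $(4-2)=2$-transitive but not 3-transitive:

```python
from itertools import permutations

def is_m_transitive(group, n, m):
    """Check that every m-tuple of distinct points can be mapped
    to every other such m-tuple by some element of the group."""
    tuples = list(permutations(range(n), m))
    return all(len({tuple(g[i] for i in src) for g in group}) == len(tuples)
               for src in tuples)

def sign(p):
    """Parity of a permutation via its inversion count."""
    n = len(p)
    return (-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))

S4 = list(permutations(range(4)))
A4 = [p for p in S4 if sign(p) == 1]

assert is_m_transitive(S4, 4, 4)      # S_n is n-transitive
assert is_m_transitive(A4, 4, 2)      # A_n is (n-2)-transitive
assert not is_m_transitive(A4, 4, 3)  # and no better: |A_4| = 12 < 24 triples
```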

The general result of Jordan, proved in Note C (page 664) to §398 of his Traité des substitutions et des équations algébriques (1870, Gauthier-Villars), is that a primitive subgroup of $\mathfrak S_n$ containing a cycle of prime order $p$ is $(n-p+1)$-transitive. For $p=2$, we get that this subgroup is $(n-1)$-transitive, hence is $\mathfrak S_n$; for $p=3$, we get that it is $(n-2)$-transitive, and that implies that it contains the alternating group $\mathfrak A_n$. I formalized these results in Lean, following the presentation of Wielandt's book on finite permutation groups (Theorem 13.3 of that reference). A later theorem of Jordan (1873; see Theorem 13.9 in Wielandt's book) asserts that such a subgroup always contains the alternating group provided $n-p\geq 3$; I have not (not yet?) formalized it in Lean.

All in all, this gives a fairly sophisticated proof that the alternating group is simple. One of its merits is to follow a general line that applies to many other groups. In particular, Iwasawa's criterion is also used by Wilson in his book Finite simple groups to prove the simplicity of the Mathieu groups $M_{11}, M_{12}$, and of many other finite groups.

I just opened Jordan's book to write this blog post. Let me add that it contains (§85) another proof of the simplicity of the alternating group, and I will try to explain it in a later post.