Sunday, April 14, 2024

Evaluating the operator norms of matrices

Let $E$ and $F$ be normed vector spaces, over the real or complex numbers, and let $u\colon E\to F$ be a linear map. The continuity of $u$ is equivalent to the existence of a real number $c$ such that $\|u(x)\|\leq c\,\|x\|$ for every $x\in E$, and the least such real number is called the operator norm of $u$; we denote it by $\|u\|$. It defines a norm on the linear space $\mathscr L(E;F)$ of continuous linear maps from $E$ to $F$ and as such is quite important. When $E=F$, it is also related to the spectrum of $u$ and is implicitly at the heart of the Gershgorin criterion for the localization of eigenvalues.

\gdef\R{\mathbf R}\gdef\norm#1{\lVert#1\rVert}\gdef\Abs#1{\left|#1\right|}\gdef\abs#1{\lvert#1\rvert}

However, even in the simplest cases of matrices, the explicit computation of this norm is not trivial at all and, as we will see, it is even less trivial than what is told in algebra classes, as I learned by browsing Wikipedia when I wanted to prepare a class on the topic.

Since I'm more of an abstract kind of guy, I will use both languages, that of normed spaces and that of matrices, for the first one allows to explain a few things at a more fundamental level. I'll make the translation, though. Also, to be specific, I'll work with real vector spaces.

So $E=\R^m$, $F=\R^n$, and linear maps in $\mathscr L(E;F)$ are represented by $n\times m$ matrices. There are plenty of interesting norms on $E$, but I will concentrate the discussion on the $\ell^p$-norm given by $\norm{(x_1,\dots,x_m)}=(\abs{x_1}^p+\dots+\abs{x_m}^p)^{1/p}$. Similarly, I will consider the $\ell^q$-norm on $F$ given by $\norm{(y_1,\dots,y_n)}=(\abs{y_1}^q+\dots+\abs{y_n}^q)^{1/q}$. Here, $p$ and $q$ are elements of $[1;+\infty\mathclose[$. It is also interesting to allow $p=\infty$ or $q=\infty$; in this case, the expression defining the norm is just replaced by $\sup(\abs{x_1},\dots,\abs{x_m})$ and $\sup(\abs{y_1},\dots,\abs{y_n})$ respectively.

Duality

Whatever norm is given on $E$, the dual space $E^*=\mathscr L(E;\R)$ is endowed with the dual norm, which is just the operator norm on that space: for $\phi\in E^*$, $\norm\phi$ is the least real number such that $\abs{\phi(x)}\leq\norm\phi\,\norm x$ for all $x\in E$. And similarly for $F$. To emphasize duality, we will write $\langle x,\phi\rangle$ instead of $\phi(x)$.

Example. — The dual norm of the $\ell^p$-norm can be computed explicitly, thanks to the Hölder inequality $$ \Abs{x_1y_1+\dots+x_ny_n}\leq(\abs{x_1}^p+\dots+\abs{x_n}^p)^{1/p}\,(\abs{y_1}^q+\dots+\abs{y_n}^q)^{1/q} $$ if $p,q$ are related by the relation $\frac1p+\frac1q=1$. (When $p=1$, this gives $q=\infty$, and symmetrically $p=\infty$ if $q=1$.) Moreover, this inequality is optimal, in the sense that for any $(x_1,\dots,x_n)$, one may find a nonzero $(y_1,\dots,y_n)$ for which the inequality is an equality. What this inequality says about norms/dual norms is that if one identifies $\R^n$ with its dual via the duality bracket $\langle x,y\rangle=x_1y_1+\dots+x_ny_n$, the dual of the $\ell^p$-norm is the $\ell^q$-norm, for that relation $1/p+1/q=1$.
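This duality can be checked numerically. The sketch below (NumPy; the helper name `dual_norm_lp` is mine) uses the equality case of the Hölder inequality: for $1<p<\infty$, the supremum of $\langle x,y\rangle/\norm y_q$ is attained at $y_i=\operatorname{sign}(x_i)\abs{x_i}^{p-1}$, and equals $\norm x_p$.

```python
import numpy as np

def dual_norm_lp(x, p):
    """Value of the dual norm of the l^q norm at x, computed by exhibiting
    the optimal y: for 1 < p < infinity, sup <x,y>/||y||_q is attained at
    y_i = sign(x_i) |x_i|^{p-1}, and equals ||x||_p."""
    y = np.sign(x) * np.abs(x) ** (p - 1)
    q = p / (p - 1)  # conjugate exponent, 1/p + 1/q = 1
    return np.dot(x, y) / np.linalg.norm(y, q)

x = np.array([3.0, -1.0, 2.0])
for p in (1.5, 2.0, 3.0):
    # the dual of the l^q norm is the l^p norm: the ratio equals ||x||_p
    assert np.isclose(dual_norm_lp(x, p), np.linalg.norm(x, p))
```

For $p=2$ the optimal $y$ is $x$ itself, which recovers the self-duality of the Euclidean norm.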

If $u\colon E\to F$ is a continuous linear map, it has an adjoint (or transpose) $u^*\colon F^*\to E^*$, which is defined by $u^*(\phi)=\phi\circ u$, for $\phi\in F^*$. In terms of the duality bracket, this rewrites as $$ \langle\phi,u(x)\rangle=\langle u^*(\phi),x\rangle $$ for $x\in E$ and $\phi\in F^*$.

Proposition. — One has $\norm{u^*}=\norm u$.

For $\phi\in F^*$, $\norm{u^*(\phi)}$ is the least real number such that $\abs{u^*(\phi)(x)}\leq\norm{u^*(\phi)}\,\norm x$ for all $x\in E$. Since one has $$ \abs{u^*(\phi)(x)}=\abs{\langle u^*(\phi),x\rangle}=\abs{\langle\phi,u(x)\rangle}\leq\norm\phi\,\norm{u(x)}\leq\norm\phi\,\norm u\,\norm x, $$ we see that $\norm{u^*(\phi)}\leq\norm\phi\,\norm u$ for all $\phi$. As a consequence, $\norm{u^*}\leq\norm u$.
To get the other inequality, we wish to find a nonzero $\phi$ such that $\norm{u^*(\phi)}=\norm u\,\norm\phi$. This $\phi$ should thus be such that there exists $x$ with $\abs{\langle u^*(\phi),x\rangle}=\norm u\,\norm\phi\,\norm x$ which, by the preceding computation, means that $\abs{\langle\phi,u(x)\rangle}=\norm\phi\,\norm u\,\norm x$. Such $\phi$ and $x$ need not exist in general, but we can find reasonable approximations. Start with a nonzero $x\in E$ such that $\norm{u(x)}$ is close to $\norm u\,\norm x$; then, using the Hahn–Banach theorem, find a nonzero $\phi\in F^*$ such that $\norm\phi=1$ and $\abs{\phi(u(x))}=\norm{u(x)}$. We see that $\abs{\langle\phi,u(x)\rangle}$ is close to $\norm u\,\norm\phi\,\norm x$, and this concludes the proof.
In some cases, in particular in the finite dimensional case, we can use biduality to get the other inequality. Indeed, $E^{**}$ identifies with $E$, with its initial norm, and $u^{**}$ identifies with $u$. By the first case, we thus have $\norm{u^{**}}\leq\norm{u^*}$, hence $\norm u\leq\norm{u^*}$.

The case $p=1$

We compute $\norm u$ when $E=\R^m$ is endowed with the $\ell^1$-norm, and $F$ is arbitrary. The linear map $u\colon E\to F$ thus corresponds with $m$ vectors $u_1,\dots,u_m$ of $F$, and one has $$ u((x_1,\dots,x_m))=x_1u_1+\dots+x_mu_m. $$ By the triangle inequality, we have $$ \norm{u((x_1,\dots,x_m))}\leq\abs{x_1}\,\norm{u_1}+\dots+\abs{x_m}\,\norm{u_m}, $$ hence $$ \norm{u((x_1,\dots,x_m))}\leq(\abs{x_1}+\dots+\abs{x_m})\,\sup(\norm{u_1},\dots,\norm{u_m}). $$ Consequently, $$ \norm u\leq\sup(\norm{u_1},\dots,\norm{u_m}). $$ On the other hand, taking $x=(x_1,\dots,x_m)$ of the form $(0,\dots,1,0,\dots)$, where the $1$ is at a place $k$ such that $\norm{u_k}$ is largest, we have $\norm x=1$ and $\norm{u(x)}=\norm{u_k}$. The preceding inequality is thus an equality.

In the matrix case, this shows that the $(\ell^1,\ell^q)$-norm of an $n\times m$ matrix $A$ is the supremum of the $\ell^q$-norms of the columns of $A$.
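The column formula is easy to test against a random search (a NumPy sketch; the function names and the Monte-Carlo check are mine). Since any ratio $\norm{Ax}_q/\norm x_1$ is bounded by the operator norm, random trial vectors can only produce lower bounds.

```python
import numpy as np

def norm_1_to_q(A, q):
    """(l^1, l^q) operator norm: the largest l^q norm of a column of A."""
    return max(np.linalg.norm(A[:, j], q) for j in range(A.shape[1]))

def monte_carlo_lower_bound(A, p, q, trials=20000, seed=0):
    """Random lower bound for sup ||Ax||_q / ||x||_p (a check, not a proof)."""
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(trials):
        x = rng.standard_normal(A.shape[1])
        best = max(best, np.linalg.norm(A @ x, q) / np.linalg.norm(x, p))
    return best

A = np.array([[1.0, -2.0], [3.0, 0.5], [0.0, 4.0]])
# no trial vector can beat the exact value given by the column formula
assert monte_carlo_lower_bound(A, 1, 2) <= norm_1_to_q(A, 2) + 1e-9
```

In the proof above, the supremum is attained at a basis vector $\pm e_k$, which is exactly why only the columns of $A$ matter.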

The case $q=\infty$

We compute $\norm u$ when $F=\R^n$ is endowed with the $\ell^\infty$-norm, and $E$ is arbitrary. A direct computation is possible in the matrix case, but it is not really illuminating, and I find it better to argue geometrically, using a duality argument.

Namely, we can use $u^*\colon F^*\to E^*$ to compute $\norm u$, since $\norm u=\norm{u^*}$. We have seen above that $F^*$ is $\R^n$, endowed with the $\ell^1$-norm, so that we have computed $\norm{u^*}$ in the preceding section. The basis $(e_1,\dots,e_n)$ of $F$ gives a dual basis $(\phi_1,\dots,\phi_n)$, and one has $$ \norm u=\norm{u^*}=\sup(\norm{u^*(\phi_1)},\dots,\norm{u^*(\phi_n)}), $$ the norms on the right hand side being dual norms in $E^*$.

In the matrix case, this shows that the $(\ell^p,\ell^\infty)$-norm of an $n\times m$ matrix $A$ is the supremum of the $\ell^{p'}$-norms of the rows of $A$, where $p'$ is the exponent conjugate to $p$ (given by $1/p+1/p'=1$): indeed, $u^*(\phi_i)$ is the $i$th row of $A$, viewed as a linear form on $E$, and the dual of the $\ell^p$-norm is the $\ell^{p'}$-norm.
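Here is a numerical illustration of the row formula (a NumPy sketch; the helper names and the random check are mine). I take $p=2$, whose conjugate exponent is again $2$, so the formula reads: largest Euclidean norm of a row.

```python
import numpy as np

def norm_p_to_inf(A, p):
    """(l^p, l^infty) operator norm: largest l^{p'} norm of a row of A,
    where p' = p/(p-1) is the exponent conjugate to p."""
    pp = np.inf if p == 1 else p / (p - 1)
    return max(np.linalg.norm(A[i, :], pp) for i in range(A.shape[0]))

A = np.array([[1.0, -2.0], [3.0, 0.5]])
exact = norm_p_to_inf(A, 2)
# random directions give lower bounds for sup ||Ax||_inf / ||x||_2 ...
rng = np.random.default_rng(1)
best = max(np.linalg.norm(A @ x, np.inf) / np.linalg.norm(x, 2)
           for x in rng.standard_normal((20000, 2)))
assert best <= exact + 1e-9
# ... and in dimension 2 the random search essentially attains the norm
assert best > 0.99 * exact
```

The extreme direction is the one aligned with the longest row, which is how the supremum in the formula is attained.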

Relation with the Gershgorin circle theorem

I mentioned the Gershgorin circle theorem as being in the same spirit as the computation of an operator norm, because its proof relies on the same kind of estimates. In fact, no additional computation is necessary!

Theorem (Gershgorin “circle theorem”). — Let $A=(a_{ij})$ be an $n\times n$ matrix and let $\lambda$ be an eigenvalue of $A$. There exists an integer $i$ such that $$ \abs{\lambda-a_{ii}}\leq\sum_{j\neq i}\abs{a_{ij}}. $$

For the proof, one writes $A=D+N$, where $D$ is diagonal and $N$ has zeroes on its diagonal, and writes $\lambda x=Ax=Dx+Nx$ for a nonzero eigenvector $x$, hence $(\lambda I-D)x=Nx$. Endow $\R^n$ with the $\ell^\infty$-norm. We may assume that $\norm x=1$. Then the norm of the right hand side is bounded above by $\norm N$, while the norm of the left hand side is $\sup_i(\abs{\lambda-a_{ii}}\,\abs{x_i})\geq\abs{\lambda-a_{ii}}$ if $i$ is chosen so that $\abs{x_i}=\norm x=1$. Given the above formula for $\norm N$ (the largest $\ell^1$-norm of a row of $N$), this implies the theorem.
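The theorem is easy to check numerically (a NumPy sketch; the function name and the test matrix are mine): every eigenvalue must lie in at least one of the discs centered at $a_{ii}$ with radius $\sum_{j\neq i}\abs{a_{ij}}$.

```python
import numpy as np

def in_some_gershgorin_disc(A):
    """Check that every eigenvalue of A lies in at least one Gershgorin disc
    |lambda - a_ii| <= sum_{j != i} |a_ij|."""
    radii = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
    eigenvalues = np.linalg.eigvals(A)
    return all(
        any(abs(lam - A[i, i]) <= radii[i] + 1e-9 for i in range(A.shape[0]))
        for lam in eigenvalues
    )

A = np.array([[10.0, 1.0, 0.5],
              [0.2, 3.0, 0.1],
              [-0.3, 0.4, -2.0]])
assert in_some_gershgorin_disc(A)
```

For this diagonally dominant matrix the discs are small and disjoint, so the theorem even separates the three eigenvalues near $10$, $3$ and $-2$.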

The case $p=q=2$

Since Euclidean spaces are very useful in applications, this may be the most important case to consider, and we will see that the answer is not at all straightforward from the coefficients of the matrix.

We have to bound $\norm{u(x)}$ from above. Using the scalar product, we write $$ \norm{u(x)}^2=\langle u(x),u(x)\rangle=\langle u^*u(x),x\rangle, $$ where $u^*\colon F\to E$ now denotes the adjoint of $u$, which identifies with the transpose of $u$ if one identifies $E$ with $E^*$ and $F$ with $F^*$ by means of their scalar products. Using the Cauchy–Schwarz inequality, we get $\norm{u(x)}^2\leq\norm{u^*u(x)}\,\norm x\leq\norm{u^*u}\,\norm x^2$, hence $\norm u\leq\norm{u^*u}^{1/2}$. This inequality is remarkable because, on the other hand, we have $\norm{u^*u}\leq\norm{u^*}\,\norm u=\norm u^2$. Consequently, $\norm u=\norm{u^*u}^{1/2}$.

This formula might not appear to be so useful, since it reduces the computation of the operator norm of $u$ to that of $u^*u$. However, the linear map $u^*u$ is a positive self-adjoint endomorphism of $E$ hence (assuming that $E$ is finite dimensional here), it can be diagonalized in an orthonormal basis. We then see that $\norm{u^*u}$ is the largest eigenvalue of $u^*u$, which is also its spectral radius. The square roots of the eigenvalues of $u^*u$ are also called the singular values of $u$, hence $\norm u$ is the largest singular value of $u$.
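This chain of identifications can be verified on a random matrix (a NumPy sketch): the $(\ell^2,\ell^2)$ operator norm, the largest singular value, and the square root of the largest eigenvalue of $A^{\mathsf T}A$ all coincide.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))

# largest singular value of A (np.linalg.svd returns them in decreasing order)
sigma_max = np.linalg.svd(A, compute_uv=False)[0]
# largest eigenvalue of the positive self-adjoint matrix A^T A
lam_max = np.linalg.eigvalsh(A.T @ A)[-1]

assert np.isclose(sigma_max, np.sqrt(lam_max))
# np.linalg.norm(A, 2) is precisely the (l^2, l^2) operator norm
assert np.isclose(sigma_max, np.linalg.norm(A, 2))
```

Note that `np.linalg.eigvalsh` (for symmetric matrices) returns real eigenvalues in increasing order, so the last one is the spectral radius of $A^{\mathsf T}A$.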

One can play with duality as well, and we have $\norm u=\norm{uu^*}^{1/2}$.

Other cases?

There are general inequalities relating the various $\ell^p$-norms of a vector $x\in\R^m$, and these can be used to deduce inequalities for $\norm u$, when $E=\R^m$ has an $\ell^p$-norm and $F=\R^n$ has an $\ell^q$-norm. However, given the explicit value of $\norm u$ for $(p,q)=(2,2)$ and the fact that no closed form expression exists for the spectral radius, it is unlikely that there is a closed form expression in the remaining cases.

Worse: the exact computation of $\norm u$ in the cases $(\infty,1)$, $(\infty,2)$ or $(2,1)$ is known to be NP-hard, and I try to explain this result below, following J. Rohn (2000), “Computing the Norm $\|A\|_{\infty,1}$ Is NP-Hard”, Linear and Multilinear Algebra 47 (3), p. 195–204. I concentrate on the $(\infty,1)$ case; the $(\infty,2)$ case is supposed to be analogous (see Joel Tropp's thesis, top of page 48, quoted by Wikipedia, but no arguments are given there), and the case $(2,1)$ would follow by duality.

A matrix from a graph

Let us consider a finite (undirected, simple, without loops) graph $G$ on the set $V=\{1,\dots,n\}$ of $n$ vertices, with set of edges $E$, and let us introduce the following $n\times n$ matrix $A=(a_{ij})$, a variant of the adjacency matrix of the graph $G$ (actually $nI-B$, where $I$ is the identity matrix and $B$ is the adjacency matrix of $G$):

  • One has $a_{ii}=n$ for all $i$;
  • If $i\neq j$ and the vertices $i$ and $j$ are connected by an edge, then $a_{ij}=-1$;
  • Otherwise, $a_{ij}=0$.
For any subset $S$ of $V$, the cut $c(S)$ of $S$ is the number of edges which have one endpoint in $S$ and the other outside of $S$.

Proposition. — The $(\ell^\infty,\ell^1)$-norm of $A$ is given by $$ \norm A=4\sup_{S\subseteq V}c(S)-2\operatorname{Card}(E)+n^2. $$

The proof starts with the following observation, valid for more general matrices.

Lemma. — The $(\ell^\infty,\ell^1)$-norm of a symmetric positive $n\times n$ matrix $A$ is given by $\norm A=\sup_z\langle z,Az\rangle$, where $z$ runs among the set $Z$ of vectors in $\R^n$ with coordinates $\pm1$.

The vectors of $Z$ are the vertices of the polytope $[-1;1]^n$, which is the unit ball of $\R^n$ for the $\ell^\infty$-norm. Consequently, every vector of $[-1;1]^n$ is a convex combination of vectors of $Z$. Writing $x=\sum_{z\in Z}c_zz$, with $c_z\geq0$ and $\sum c_z=1$, we have $$ \norm{Ax}=\norm{\sum c_zAz}\leq\sum c_z\norm{Az}\leq\sup_{z\in Z}\norm{Az}. $$ The other inequality being obvious, we already see that $\norm A=\sup_{z\in Z}\norm{Az}$. Note that this formula holds for any norm on the codomain.
If, for $z\in Z$, one writes $Az=(y_1,\dots,y_n)$, one has $\norm{Az}=\abs{y_1}+\dots+\abs{y_n}$, because the codomain is endowed with the $\ell^1$-norm, so that $\langle z,Az\rangle=\sum z_iy_i\leq\norm{Az}$. We thus get the inequality $\sup_{z\in Z}\langle z,Az\rangle\leq\norm A$.
Let us now use the fact that $A$ is symmetric and positive. Fix $z\in Z$, set $Az=(y_1,\dots,y_n)$ as above, and define $x\in Z$ by $x_i=1$ if $y_i\geq0$ and $x_i=-1$ otherwise. One thus has $\langle x,Az\rangle=\sum\abs{y_i}=\norm{Az}$. Since $A$ is symmetric and positive, one has $\langle x-z,A(x-z)\rangle\geq0$, and this implies $$ 2\norm{Az}=2\langle x,Az\rangle\leq\langle x,Ax\rangle+\langle z,Az\rangle, $$ so that $\norm{Az}\leq\sup_{x\in Z}\langle x,Ax\rangle$. This concludes the proof.

To prove the proposition, we will apply the preceding lemma. We observe that $A$ is symmetric, by construction. It is also positive since, for every $x\in\R^n$, one has $$ \langle x,Ax\rangle=\sum a_{ij}x_ix_j\geq n\sum x_i^2-\sum_{i\neq j}\abs{x_i}\,\abs{x_j}=(n+1)\sum x_i^2-\Big(\sum\abs{x_i}\Big)^2\geq\sum x_i^2, $$ using the Cauchy–Schwarz inequality $(\sum\abs{x_i})^2\leq n\sum x_i^2$. By the preceding lemma, we thus have $$ \norm A=\sup_{z\in\{\pm1\}^n}\langle z,Az\rangle. $$ The $2^n$ vectors $z\in Z$ are in bijection with the $2^n$ subsets of $V=\{1,\dots,n\}$, by associating with $z\in Z$ the subset $S$ of $V$ consisting of the vertices $i$ such that $z_i=1$. Then, one can compute $$ \langle z,Az\rangle=\sum_{i,j}a_{ij}z_iz_j=4c(S)-2\operatorname{Card}(E)+n^2, $$ since the diagonal terms contribute $n\cdot n=n^2$, while each edge $\{i,j\}$ contributes $-2z_iz_j$, which is $+2$ if the edge is cut by $S$ and $-2$ otherwise. It follows that $\norm A$ is equal to the indicated expression.
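The whole computation can be checked by brute force on a small graph (a Python sketch; the function names are mine, and the $2^n$ enumerations are of course only feasible for tiny $n$, which is the entire point of the NP-hardness result below).

```python
import itertools
import numpy as np

def graph_matrix(n, edges):
    """The matrix A with a_ii = n, a_ij = -1 on edges, 0 otherwise."""
    A = n * np.eye(n)
    for i, j in edges:
        A[i, j] = A[j, i] = -1.0
    return A

def norm_inf_to_1(A):
    """(l^infty, l^1) norm, via the lemma: sup over sign vectors z of ||Az||_1."""
    n = A.shape[1]
    return max(np.abs(A @ np.array(z)).sum()
               for z in itertools.product([1, -1], repeat=n))

def max_cut(n, edges):
    """Brute-force maximum cut over all subsets S of the vertices."""
    return max(sum(bits[i] != bits[j] for i, j in edges)
               for bits in itertools.product([0, 1], repeat=n))

n, edges = 4, [(0, 1), (1, 2), (2, 3), (0, 3), (0, 2)]  # a 4-cycle plus a chord
A = graph_matrix(n, edges)
# the proposition: ||A|| = 4 max-cut - 2 |E| + n^2
assert norm_inf_to_1(A) == 4 * max_cut(n, edges) - 2 * len(edges) + n * n
```

For this graph the maximum cut is $4$ (take $S=\{1,3\}$), and both sides of the identity evaluate to $22$.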

The last step of the proof is an application of the “simple max-cut” NP-hardness theorem of Garey, Johnson and Stockmeyer (1976), itself a strengthening of Karp (1972)'s seminal result that “max-cut” is NP-complete. I won't explain the proofs of these results here, but let me explain what they mean and how they relate to the present discussion. First of all, computer scientists categorize problems according to the time that is required to solve them, in terms of the size of the entries. This notion depends on the actual computer that is used, but the theory of Turing machines allows one to single out two classes, P and EXP, consisting of problems which can be solved in polynomial, respectively exponential, time in terms of the size of the entries. A second notion, introduced by Karp, is that of NP problems, problems which can be solved in polynomial time by a “nondeterministic Turing machine” — “nondeterministic” means the computer can parallelize itself at will when it needs to consider various possibilities. This class is contained in EXP (because one can simulate in exponential time a polynomial time nondeterministic algorithm) and also corresponds to the class of problems whose solution can be checked in polynomial time.

Our problem is to find a subset $S$ of $\{1,\dots,n\}$ that maximizes $c(S)$. This is a restriction of the “general max-cut” problem where, given an integer valued function $w$ on the set of edges, one wishes to find a subset that maximizes $c(S;w)$, the sum of the weights of the edges which have one endpoint in $S$ and the other outside of $S$. Karp (1972) observed that the existence of $S$ such that $c(S;w)\geq m$ is an NP problem (if one is provided with $S$, it takes polynomial time to compute $c(S;w)$ and to decide that it is at least $m$), and the naïve search algorithm is in EXP, since there are $2^n$ such subsets. Moreover, Karp proved that any NP problem can be reduced to it in polynomial time. This is what is meant by the assertion that it is NP-complete. Consequently, determining $\sup_Sc(S;w)$ is NP-hard: if you can solve that problem, then you can solve the “max-cut” problem in polynomial time, hence any other NP problem. A subsequent theorem of Garey, Johnson and Stockmeyer (1976) established that restricting the max-cut problem to $\pm1$ weights is still NP-hard, and this completes the proof of Rohn's theorem.

(An aside, to insist that signs matter: by a theorem of Edmonds and Karp (1972), one can solve the “min-cut” problem in polynomial time; it consists in deciding, for some given integer $m$, whether there exists $S$ such that $c(S;w)\leq m$.)

Saturday, April 13, 2024

The topology on the ring of polynomials and the continuity of the evaluation map

Polynomials are an algebraic gadget, and one is rarely led to think about the topology a ring of polynomials should carry. That happened to me, though, more or less by accident, when María Inés de Frutos Fernández and I worked on implementing in Lean the evaluation of power series. So let's start with them. To simplify the discussion, I only consider the case of one indeterminate. When there are finitely many of them, the situation is the same; in the case of infinitely many indeterminates, there might be some additional subtleties, but I have not thought about it.

\gdef\lbra{[\![}\gdef\rbra{]\!]} \gdef\lpar{(\!(}\gdef\rpar{)\!)} \gdef\bN{\mathbf N} \gdef\coeff{\operatorname{coeff}} \gdef\eval{\operatorname{eval}} \gdef\colim{\operatorname{colim}}

Power series

A power series over a ring $R$ is just an expression $\sum a_nT^n$, where $(a_0,a_1,\dots)$ is a family of elements of $R$ indexed by the integers. After all, this is just what is meant by “formal series”: coefficients and nothing else.

Defining a topology on the ring $R\lbra T\rbra$ should allow us to say what it means for a sequence $(f_m)$ of power series to converge to a power series $f$, and the most natural thing to require is that for every $n$, the coefficient $a_{m,n}$ of $T^n$ in $f_m$ converges to the corresponding coefficient $a_n$ of $T^n$ in $f$. In other words, we endow $R\lbra T\rbra$ with the product topology when it is identified with the product set $R^{\bN}$. The explicit definition may look complicated, but the important point for us is the following characterization of this topology: let $X$ be a topological space and let $f\colon X\to R\lbra T\rbra$ be a map; for $f$ to be continuous, it is necessary and sufficient that all maps $f_n\colon X\to R$ are continuous, where, for any $x\in X$, $f_n(x)$ is the $n$th coefficient of $f(x)$. In particular, the coefficient maps $R\lbra T\rbra\to R$ are continuous.

What can we do with that topology, then? The first thing, maybe, is to observe its adequacy with respect to the ring structure on $R\lbra T\rbra$.

Proposition. — If addition and multiplication on $R$ are continuous, then addition and multiplication on $R\lbra T\rbra$ are continuous.

Let's start with addition. We need to prove that $s\colon R\lbra T\rbra\times R\lbra T\rbra\to R\lbra T\rbra$ is continuous. By the characterization above, it is enough to prove that all coordinate functions $s_n\colon R\lbra T\rbra\times R\lbra T\rbra\to R$, $(f,g)\mapsto\coeff_n(f+g)$, are continuous. But these functions factor through the $n$th coefficient maps: $\coeff_n(f+g)=\coeff_n(f)+\coeff_n(g)$, which is continuous, since addition, the coefficient maps and the projections are continuous. This is similar, but slightly more complicated, for multiplication: if the multiplication map is denoted by $m$, we have to prove that the maps $m_n$ defined by $m_n(f,g)=\coeff_n(f\cdot g)$ are continuous. However, they can be written as $$ m_n(f,g)=\coeff_n(f\cdot g)=\sum_{p=0}^n\coeff_p(f)\,\coeff_{n-p}(g). $$ Since the projections and the coefficient maps are continuous, it is sufficient to prove that the maps from $R^{n+1}\times R^{n+1}$ to $R$ given by $((a_0,\dots,a_n),(b_0,\dots,b_n))\mapsto\sum_{p=0}^na_pb_{n-p}$ are continuous, and this follows from the continuity of addition and multiplication on $R$, because it is a polynomial expression.
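The key point is that each coefficient of a product only involves finitely many coefficients of the factors. A minimal sketch of this Cauchy product (plain Python; the function name is mine, and lists stand in for truncated power series):

```python
def cauchy_product(f, g):
    """Coefficients of f*g up to the order available: the k-th coefficient
    is sum_{p=0}^{k} f_p g_{k-p}, a finite sum for each k."""
    n = min(len(f), len(g))
    return [sum(f[p] * g[k - p] for p in range(k + 1)) for k in range(n)]

# (1 + T + T^2 + ...) * (1 - T) = 1, up to the truncation order
ones = [1] * 6
assert cauchy_product(ones, [1, -1, 0, 0, 0, 0]) == [1, 0, 0, 0, 0, 0]
```

Because the $k$th output coefficient depends on coefficients of index at most $k$ only, each $m_n$ factors through finitely many coordinates, which is exactly what the continuity argument uses.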

Polynomials

At this point, let's go back to our initial question of endowing polynomials with a natural topology.

An obvious candidate is the topology induced by that of power series. This looks correct; in any case, it is such that addition and multiplication on $R[T]$ are continuous. However, it lacks an interesting property with respect to evaluation.

Recall that for every $a\in R$, there is an evaluation map $\eval_a\colon R[T]\to R$, defined by $f\mapsto f(a)$, and even, if one wishes, the two-variable evaluation map $R[T]\times R\to R$.
The first claim is that this map is not continuous.

An example will serve as proof. I take $R$ to be the real numbers, $f_n=T^n$ and $a=1$. Then $f_n$ converges to zero, because for each integer $m$, the real numbers $\coeff_m(f_n)$ are zero for $n>m$. On the other hand, $f_n(a)=f_n(1)=1$ for all $n$, and this does not converge to zero!

So we have to change the topology on polynomials if we want this map to be continuous, and we now give the correct definition. The ring of polynomials is the increasing union of the subsets $R[T]_n$, indexed by the integers $n$, consisting of all polynomials of degree at most $n$. Each of these subsets is given the product topology, as above, but we endow their union with the “inductive limit” topology. Explicitly, if $Y$ is a topological space and $u\colon R[T]\to Y$ is a map, then $u$ is continuous if and only if, for each integer $n$, its restriction to $R[T]_n$ is continuous.

The inclusion map $R[T]\to R\lbra T\rbra$ is continuous, hence the topology on polynomials is finer than the topology induced by the topology on power series. As the behaviour of the evaluation maps indicates, it is usually strictly finer.

We can also observe that addition and multiplication on $R[T]$ are still continuous. The same proof as above works, once we observe that the coefficient maps are continuous. (On the other hand, one may be tempted to compare the product topology of the inductive limit topologies with the inductive limit topology of the product topologies, a comparison which is not obvious in the direction that we need.)

Proposition. — Assume that addition and multiplication on $R$ are continuous. Then the evaluation maps $\eval_a\colon R[T]\to R$ are continuous.

We have to prove that for every integer $n$, the evaluation map $\eval_a$ induces a continuous map from $R[T]_n$ to $R$. Now, this map is the composition of the identification $R[T]_n\simeq R^{n+1}$ with the polynomial map $(c_0,\dots,c_n)\mapsto c_0+c_1a+\dots+c_na^n$. It is therefore continuous.

Laurent series

We can upgrade the preceding discussion and define a natural topology on the ring $R\lpar T\rpar$ of Laurent series, which are the power series with at most finitely many negative exponents. For this, for every integer $d$, we let $R\lpar T\rpar_d$ be the set of Laurent series of the form $f=\sum_{n=-d}^\infty c_nT^n$; we endow that set with the product topology, and take the corresponding inductive limit topology on the union. We leave it to the reader to check that this is a ring topology, but that the naïve product topology on $R\lpar T\rpar$ would not be one in general.

Back to the continuity of evaluation

The continuity of the evaluation maps $f\mapsto f(a)$ was an important guide to the topology of the ring of polynomials. This suggests a more general question, for which I don't have a full answer: whether the two-variable evaluation map, $(f,a)\mapsto f(a)$, is continuous. On each subspace $R[T]_d\times R$, the evaluation map is given by a polynomial map, $(c_0,\dots,c_d,a)\mapsto c_0+c_1a+\dots+c_da^d$, hence is continuous, but that does not imply the desired continuity, because that only tells us about $R[T]\times R$ with the topology $\colim_d(R[T]_d\times R)$, while we are interested in the topology $(\colim_dR[T]_d)\times R$. To compare these topologies, note that the natural bijection $\colim_d(R[T]_d\times R)\to(\colim_dR[T]_d)\times R$ is continuous (because it is continuous at each level $d$), but the continuity of its inverse is not so clear.

I find it amusing, then, to observe that sequential continuity holds in the important case where $R$ is a field. This relies on the following proposition.

Proposition. — Assume that $R$ is a field. Then, for every converging sequence $(f_n)$ in $R[T]$, the degrees $\deg(f_n)$ are bounded.

Otherwise, we may assume that $(f_n)$ converges to $0$ and that $\deg(f_{n+1})>\deg(f_n)$ for all $n$. We construct a continuous linear form $\phi$ on $R[T]$ such that $\phi(f_n)$ does not converge to $0$. This linear form is given by a formal power series: $\phi(f)=\sum a_dc_d$ for $f=\sum c_dT^d$, and we choose the coefficients $(a_d)$ by induction so that $\phi(f_n)=1$ for all $n$. Indeed, if the coefficients are chosen up to $\deg(f_n)$, then we set $a_d=0$ for $\deg(f_n)<d<\deg(f_{n+1})$ and choose $a_{\deg(f_{n+1})}$ so that $\phi(f_{n+1})=1$, which is possible because the leading coefficient of $f_{n+1}$ is invertible. This linear form is continuous because its restriction to any $R[T]_d$ is given by a polynomial expression, hence is continuous.

Corollary. — If $R$ is a topological ring which is a field, then the evaluation map $R[T]\times R\to R$ is sequentially continuous.

Consider sequences $(f_n)$ in $R[T]$ and $(a_n)$ in $R$ that converge to $f$ and $a$ respectively. By the proposition, there is an integer $d$ such that $\deg(f_n)\leq d$ for all $n$, and $\deg(f)\leq d$. Since evaluation is continuous on $R[T]_d\times R$, one has $f_n(a_n)\to f(a)$, as claimed.

Remark. — The previous proposition does not hold over general rings. In fact, if $R=\mathbf Z_p$ is the ring of $p$-adic integers, then $\phi(p^nT^n)=p^n\phi(T^n)$ converges to $0$ for every continuous linear form $\phi$ on $R[T]$. More is true since, in that case, evaluation is continuous! The point is that in $\mathbf Z_p$, the ideals $(p^n)$ form a basis of neighborhoods of the origin.

Proposition. — If the topology of $R$ is linear, namely if the origin of $R$ has a basis of neighborhoods consisting of ideals, then the evaluation map $R[T]\times R\to R$ is continuous.

By translation, one reduces to proving continuity at $(0,0)$. Let $V$ be a neighborhood of $0$ in $R$ and let $I$ be an ideal of $R$ such that $I\subseteq V$. Since it is a subgroup of the additive group of $R$, the ideal $I$ is open. Then the set $I\cdot R[T]$ is open, because for every $d$, its intersection with $R[T]_d$ is equal to $I\cdot R[T]_d$, hence is open. Finally, for $f\in I\cdot R[T]$ and $a\in R$, one has $f(a)\in I$, hence $f(a)\in V$.

Here is another case where I can prove that evaluation is continuous.

Proposition. — If the topology of $R$ is given by a family of absolute values, then the evaluation map $(f,a)\mapsto f(a)$ is continuous.

I just treat the case where the topology of $R$ is given by one absolute value. By translation and linearity, it suffices to prove continuity at $(0,0)$. Consider the norm $\|\cdot\|_1$ on $R[T]$ defined by $\|f\|_1=\sum\abs{c_n}$ for $f=\sum c_nT^n$. By the triangle inequality, one has $\abs{f(a)}\leq\|f\|_1$ for any $a\in R$ such that $\abs a\leq1$. For every $r>0$, the set $V_r$ of polynomials $f\in R[T]$ such that $\|f\|_1<r$ is an open neighborhood of the origin since, for every integer $d$, its intersection with $R[T]_d$ is an open neighborhood of the origin in $R[T]_d$. Let also $W$ be the set of $a\in R$ such that $\abs a\leq1$. Then $V_r\times W$ is a neighborhood of $(0,0)$ in $R[T]\times R$ such that $\abs{f(a)}<r$ for every $(f,a)\in V_r\times W$. This implies the desired continuity.
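The bound $\abs{f(a)}\leq\|f\|_1$ for $\abs a\leq1$ is elementary but worth seeing in action (a plain Python sketch over the reals; the helper name is mine, and evaluation uses Horner's scheme):

```python
import random

def eval_poly(coeffs, a):
    """Evaluate sum c_n a^n by Horner's scheme; coeffs[n] is c_n."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * a + c
    return acc

coeffs = [0.5, -2.0, 1.5, 3.0]
norm1 = sum(abs(c) for c in coeffs)  # ||f||_1
rng = random.Random(0)
for _ in range(1000):
    a = rng.uniform(-1.0, 1.0)       # any a with |a| <= 1
    # each term |c_n a^n| <= |c_n|, so |f(a)| <= ||f||_1
    assert abs(eval_poly(coeffs, a)) <= norm1 + 1e-12
```

The restriction $\abs a\leq1$ is essential: the same polynomial evaluated at $a=2$, say, easily exceeds $\|f\|_1$.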

Wednesday, April 10, 2024

Flatness and projectivity: when is the localization of a ring a projective module?

Projective modules and flat modules are two important concepts in algebra, because they characterize those modules for which a general functorial construction (the Hom module and the tensor product, respectively) behaves better than it does for general modules.

This blog post came out of a confusion I read on a student's exam: projective modules are flat, but not all flat modules are projective. Since localization gives flat modules, it is easy to obtain an example of a flat module which is not projective (see below; $\mathbf Q$ works, as a $\mathbf Z$-module), but my question was to understand when the localization of a commutative ring is a projective module.

\gdef\Hom{\operatorname{Hom}}\gdef\Spec{\operatorname{Spec}}\gdef\id{\mathrm{id}}

Let me first recall the definitions. Let $R$ be a ring and let $M$ be a (right) $R$-module.

The $\Hom_R(M,\bullet)$ functor associates with a right $R$-module $X$ the abelian group $\Hom_R(M,X)$. By composition, any linear map $f\colon X\to Y$ induces an additive map $\Hom_R(M,f)\colon\Hom_R(M,X)\to\Hom_R(M,Y)$: it maps $u\colon M\to X$ to $f\circ u$. When $R$ is commutative, these are even $R$-modules and morphisms of $R$-modules. If $f$ is injective, $\Hom_R(M,f)$ is injective as well, but if $f$ is surjective, it is not always the case that $\Hom_R(M,f)$ is surjective, and one says that the $R$-module $M$ is projective if $\Hom_R(M,f)$ is surjective for all surjective linear maps $f$.

The $M\otimes_R\bullet$ functor associates with a left $R$-module $X$ the abelian group $M\otimes_RX$, and with any linear map $f\colon X\to Y$, the additive map $M\otimes_RX\to M\otimes_RY$ that maps a split tensor $m\otimes x$ to $m\otimes f(x)$. When $R$ is commutative, these are even $R$-modules and morphisms of $R$-modules. If $f$ is surjective, then $M\otimes_Rf$ is surjective, but if $f$ is injective, it is not always the case that $M\otimes_Rf$ is injective. One says that $M$ is flat if $M\otimes_Rf$ is injective for all injective linear maps $f$.

These notions are quite abstract, and the development of homological algebra made them prevalent in modern algebra.

Example. — Free modules are projective and flat.

Proposition. — An $R$-module $M$ is projective if and only if there exists an $R$-module $N$ such that $M\oplus N$ is free.
Indeed, taking a generating family of $M$, we construct a free module $L$ and a surjective linear map $u\colon L\to M$. Since $M$ is projective, the map $\Hom_R(M,u)$ is surjective and there exists $v\colon M\to L$ such that $u\circ v=\id_M$. Then $v$ is an isomorphism from $M$ onto $v(M)$, and one can check that $L=v(M)\oplus\ker(u)$. (Conversely, a direct summand of a free module is projective, since a linear map can be lifted on a basis of the free module and then restricted to the summand.)

Corollary. — Projective modules are flat.

Theorem (Kaplansky). — If $R$ is a local ring, then a projective $R$-module is free.

The theorem has a reasonably easy proof for a finitely generated $R$-module $M$ over a commutative local ring. Let $J$ be the maximal ideal of $R$ and let $k=R/J$ be the residue field. Then $M/JM$ is a finite dimensional $k$-vector space; let us consider a family $(e_1,\dots,e_n)$ in $M$ whose images form a basis of $M/JM$. Now, one has $\langle e_1,\dots,e_n\rangle+JM=M$, hence Nakayama's lemma implies that $M=\langle e_1,\dots,e_n\rangle$. Let then $u\colon R^n\to M$ be the morphism given by $u(a_1,\dots,a_n)=\sum a_ie_i$; by what precedes, it is surjective, and we let $N$ be its kernel. Since $M$ is projective, the morphism $\Hom_R(M,u)$ is surjective, and there exists $v\colon M\to R^n$ such that $u\circ v=\id_M$. We then have an isomorphism $M\oplus N\simeq R^n$. Modding out by $J$, we get $M/JM\oplus N/JN\simeq k^n$. Necessarily, $N/JN=0$, hence $N=JN$; since $N$ is a direct summand of $R^n$, it is finitely generated, and Nakayama's lemma implies that $N=0$.

Example. — Let $R$ be a commutative ring and let $S$ be a multiplicative subset of $R$. Then the fraction ring $S^{-1}R$ is a flat $R$-module.
Let $u\colon X\to Y$ be an injective morphism of $R$-modules. First of all, one identifies the morphism $S^{-1}R\otimes_R u\colon S^{-1}R\otimes_R X\to S^{-1}R\otimes_R Y$ with the morphism $S^{-1}u\colon S^{-1}X\to S^{-1}Y$ induced by $u$ on fraction modules. Then, it is easy to see that $S^{-1}u$ is injective. Let indeed $x/s\in S^{-1}X$ be an element that maps to $0$; one then has $u(x)/s=0$, hence there exists $t\in S$ such that $tu(x)=0$. Consequently, $u(tx)=0$, hence $tx=0$ because $u$ is injective. This implies $x/s=0$.

Theorem.Let RR be a commutative ring. If MM is a finitely presented RR-module, then MM is locally free: there exists a finite family (f1,,fn)(f_1,\dots,f_n) in RR such that R=f1,,fnR=\langle f_1,\dots,f_n\rangle and such that for every ii, MfiM_{f_i} is a free RfiR_{f_i}-module.
The proof is a variant of the case of local rings. Starting from a point pSpec(R)p\in\Spec(R), we know that MpM_p is a finitely presented flat RpR_p-module. As above, we get a surjective morphism u ⁣:RnMu\colon R^n\to M which induces an isomorphism κ(p)nκ(p)M\kappa(p)^n\to \kappa(p)\otimes M, and we let NN be its kernel. By flatness of MM (and an argument involving the snake lemma), the exact sequence 0NRpM00\to N\to R_p\to M\to 0 induces an exact sequence 0κ(p)Nκ(p)nκ(p)M00\to \kappa(p)\otimes N\to \kappa(p)^n\to \kappa(p)\otimes M\to 0. And since the last sequence is an isomorphism, we have κ(p)N\kappa(p)\otimes N. Since MM is finitely presented, the module NN is finitely generated, and Nakayama's lemma implies that Np=0N_p=0; moreover, there exists f∉pf\not\in p such that Nf=0N_f=0, so that uf ⁣:RfnMfu_f\colon R_f^n\to M_f is an isomorphism. One concludes by using the quasicompactness of Spec(R)\Spec(R).
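The role of flatness in the middle step can be made explicit via the long exact sequence of Tor, one way of packaging the snake lemma argument alluded to above (this expansion is mine):

```latex
% Tensoring 0 -> N_p -> R_p^n -> M_p -> 0 with κ(p) yields the exact sequence
%   Tor_1^{R_p}(κ(p), M_p) -> κ(p)⊗N -> κ(p)^n -> κ(p)⊗M -> 0,
% and Tor_1^{R_p}(κ(p), M_p) = 0 since M_p is flat, so that
0 \to \kappa(p)\otimes N \to \kappa(p)^n \to \kappa(p)\otimes M \to 0
% is exact, as used in the proof.
```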

However, not all flat modules are projective. The most basic example is the following one.

Example.The Z\mathbf Z-module Q\mathbf Q is flat, but is not projective.
It is flat because it is the total fraction ring of Z\mathbf Z. To show that it is not projective, we consider the free module L=Z(N)L={\mathbf Z}^{(\mathbf N)} with basis (en)(e_n) and the morphism u ⁣:LQu\colon L\to\mathbf Q that maps ene_n to 1/n1/n (if n>0n>0, say). This morphism is surjective. If Q\mathbf Q were projective, there would exist a morphism v ⁣:QLv\colon \mathbf Q\to L such that uv=idQu\circ v=\id_{\mathbf Q}. Consider a fraction a/bQa/b\in\mathbf Q; one has b1/b=1b\cdot 1/b=1, hence bv(1/b)=v(1)b v(1/b)=v(1). We thus see that all coeffiencients of v(1)v(1) are divisible by bb, for any integer bb; they must be zero, hence v(1)=0v(1)=0 and 1=u(v(1))=01=u(v(1))=0, a contradiction.
The proof generalizes. For example, if RR is a domain and SS does not consist of units, and does not contain 00, then S1RS^{-1}R is not projective. (With analogous notation, take a nonzero coefficient aa of v(1)v(1) and set b=asb=as, where sSs\in S is not 00; then asas divides aa, hence ss divides 11 and ss is a unit.)

These recollections are meant to motivate the forthcoming question: When is it the case that a localization $S^{-1}R$ is a projective $R$-module?

Example. — Let $e$ be an idempotent of $R$, so that the ring $R$ decomposes as a product of two rings, $R\simeq eR\times(1-e)R$, and both factors are projective submodules of $R$ since their direct sum is the free $R$-module $R$. Now, one can observe that $R_e=eR$. Consequently, $R_e$ is projective. Geometrically, $\Spec(R)$ decomposes as a disjoint union of two closed subsets $\mathrm V(e)$ and $\mathrm V(1-e)$; the first one can be viewed as the open subset $\Spec(R_{1-e})$ and the second one as the open subset $\Spec(R_e)$.
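Here is a concrete instance (my illustration): $R=\mathbf Z/6\mathbf Z$ with the idempotent $e=3$.

```latex
% e = 3 is idempotent in Z/6Z:  3^2 = 9 ≡ 3 (mod 6),
% and 1 - e = -2 ≡ 4, with 4^2 = 16 ≡ 4 (mod 6).
% Then eR = {0, 3} ≅ Z/2Z and (1-e)R = {0, 2, 4} ≅ Z/3Z,
% recovering the Chinese remainder decomposition:
R \simeq eR \times (1-e)R, \qquad
\mathbf Z/6\mathbf Z \simeq \mathbf Z/2\mathbf Z \times \mathbf Z/3\mathbf Z.
% Localizing at S = {1, 3} kills the element 2 (because 2·3 = 6 ≡ 0),
% so R_e ≅ Z/2Z = eR: a projective R-module which is not free.
```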

The question was to decide whether this geometric condition furnishes the basic conditions for a localization $S^{-1}R$ to be projective. With the above notation, we recall that $\Spec(S^{-1}R)$ is homeomorphic to the subset of $\Spec(R)$ consisting of prime ideals $p$ such that $p\cap S=\emptyset$. The preceding example corresponds to the case where $\Spec(S^{-1}R)$ is open and closed in $\Spec(R)$. In this case, viewing $S^{-1}R$ as a quasicoherent sheaf on $\Spec(R)$, it is free of rank one on the open subset $\Spec(S^{-1}R)$, and zero on the complementary open subset. It is therefore locally free, hence the $R$-module $S^{-1}R$ is projective.

Observation. — The set $\Spec(S^{-1}R)$ is stable under generization. If $S^{-1}R$ is a projective $R$-module, then this set is open.
The first part is obvious: if $p$ and $q$ are prime ideals of $R$ such that $p\subseteq q$ and $q\cap S=\emptyset$, then $p\cap S=\emptyset$. The second part follows from the observation that the support of $S^{-1}R$ is exactly $\Spec(S^{-1}R)$, combined with the following proposition.

Proposition. — The support of a projective module is open.
I learnt this result in the paper by Vasconcelos (1969), “On Projective Modules of Finite Rank” (Proceedings of the American Mathematical Society 22 (2): 430–33). The proof relies on the trace ideal $\tau_R(M)$ of a module: this is the image of the canonical morphism $t\colon M^\vee\otimes_R M\to R$. (It is called the trace ideal because, when $M$ is free, $M^\vee\otimes_R M$ can also be identified with the module of endomorphisms of finite rank of $M$, a split tensor $\phi\otimes m$ corresponding to the endomorphism $x\mapsto\phi(x)m$, and then $t(\phi\otimes m)=\phi(m)$ is its trace.) Now, if $p$ belongs to the support of $M$, then $\tau_R(M)_p=R_p$, while if $p$ does not belong to the support of $M$, one has $M_p=0$, hence $\tau_R(M)_p=0$. In other words, the support of $M$ is the complement of the closed locus $\mathrm V(\tau_R(M))$ of $\Spec(R)$.
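As an illustration (my computation, under the identifications above), one can work out the trace ideal of the projective module $eR$ from the idempotent example.

```latex
% An R-linear map φ : eR -> R satisfies φ(e) = φ(e·e) = e·φ(e) ∈ eR,
% and is determined by φ(e); hence (eR)^∨ = Hom_R(eR, R) ≅ eR.
% The trace ideal is generated by the values φ(m) for m ∈ eR, so
\tau_R(eR) = eR = (e),
% and Supp(eR) = Spec(R) ∖ V(e) = D(e), which is indeed open.
```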

On the other hand, one should remember the following basic property of the support of a module.

Proposition. — The support of a module is stable under specialization. The support of a finitely generated module is closed.
Indeed, for every $m\in M$ and $p\in\Spec(R)$, saying that $m=0$ in $M_p$ means that there exists $s\in R$ with $s\notin p$ such that $sm=0$. In other words, the set of prime ideals $p$ such that $m\neq 0$ in $M_p$ is $\mathrm V(\mathrm{ann}_R(m))$. This shows that the support of $M$ is the union of the closed subsets $\mathrm V(\mathrm{ann}_R(m))$, for $m\in M$; it is in particular stable under specialization. If $M$ is finitely generated, this also shows that its support is $\mathrm V(\mathrm{ann}_R(M))$, hence is closed.
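Finite generation is needed in the second assertion; a standard example (my addition) is the following $\mathbf Z$-module.

```latex
% M = ⊕_p Z/pZ, the direct sum over all prime numbers p.
% For each prime q, M_(q) contains Z/qZ ≠ 0, while M ⊗ Q = 0,
% so the support of M is the set of closed points of Spec(Z):
\operatorname{Supp}\Bigl(\bigoplus_{p\ \text{prime}}\mathbf Z/p\mathbf Z\Bigr)
= \{\,(p) : p\ \text{prime}\,\}.
% This set is stable under specialization, but it is not closed:
% any closed subset V(I) containing all the (p) forces I = 0,
% so its closure is all of Spec(Z).
```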

At this point, one can follow Vasconcelos (1969), who shows that a projective module $M$ of the form $S^{-1}R$ is finitely generated if and only if its trace ideal is. In particular, if $R$ is noetherian and $S^{-1}R$ is a projective $R$-module, then $\Spec(S^{-1}R)$ is closed. It is thus open and closed, and we are in the situation of the basic example above.

One can also use a topological argument explained to me by Daniel Ferrand: a minimal prime ideal of $R$ whose closure meets $\Spec(S^{-1}R)$ is disjoint from $S$ (by stability under generization), hence belongs to $\Spec(S^{-1}R)$. Consequently, $\Spec(S^{-1}R)$ is the union of the irreducible components of $\Spec(R)$ that it meets. If this set of irreducible components is finite (or locally finite), for example if $\Spec(R)$ is noetherian, in particular if $R$ is a noetherian ring, then $\Spec(S^{-1}R)$ is closed.

I did not find the time to think more about this question, and it would be nice to have an example of a projective localization which does not come from this situation.