Friday, April 2, 2021

On the Hadamard-Lévy theorem, or is it Banach-Mazur?

During the preparation of an agrégation lecture on connectedness, I came across the following theorem, attributed to Hadamard–Lévy: 

Theorem. — Let $f\colon \mathbf R^n\to\mathbf R^n$ be a $\mathscr C^1$-map which is proper and a local diffeomorphism. Then $f$ is a global diffeomorphism.

In this context, that $f$ is proper means that $\| f(x)\| \to+\infty$ when $\| x\|\to+\infty$, while, by the inverse function theorem, the condition that $f$ is a local diffeomorphism is equivalent to the property that its differential $f'(x)$ is invertible, for every $x\in\mathbf R^n$. The conclusion is that $f$ is a diffeomorphism from $\mathbf R^n$ to itself; in particular, $f$ is bijective and its inverse is continuous.
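The properness hypothesis cannot be dropped. A standard counterexample is the complex exponential, viewed as a map from $\mathbf R^2$ to itself; the small Python sketch below (function names are hypothetical, chosen for this illustration) checks that its Jacobian determinant is positive, that it is not injective, and that $\|f(x)\|$ does not tend to infinity along the negative real axis.

```python
import math

def f(x, y):
    """The complex exponential z -> exp(z), written in real coordinates."""
    return (math.exp(x) * math.cos(y), math.exp(x) * math.sin(y))

def jacobian_det(x, y):
    """Determinant of f'(x, y), computed from the four partial derivatives."""
    ex = math.exp(x)
    a, b = ex * math.cos(y), -ex * math.sin(y)  # first row of f'
    c, d = ex * math.sin(y), ex * math.cos(y)   # second row of f'
    return a * d - b * c                        # equals exp(2x), never zero

# Local diffeomorphism: the determinant is positive at a sample point.
print(jacobian_det(-3.0, 1.0) > 0)
# Not injective: (0, 0) and (0, 2*pi) have (numerically) the same image.
print(f(0.0, 0.0), f(0.0, 2 * math.pi))
# Not proper: the point (-50, 0) is far from the origin, its image is tiny.
print(math.hypot(*f(-50.0, 0.0)))
```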

This theorem is stated in this form neither by Hadamard (1906) nor by Lévy (1920); it is essentially due to Banach & Mazur (1934). The purpose of this note is to clarify the history and to explain a few proofs, as well as more recent consequences for partial differential equations.

A proper map is closed: the image $f(A)$ of a closed subset $A$ of $\mathbf R^n$ is closed in $\mathbf R^n$. Indeed, let $(a_m)$ be a sequence in $A$ whose image $(f(a_m))$ converges in $\mathbf R^n$ to an element $b$; let us show that there exists $a\in A$ such that $b=f(a)$. The properness assumption on $f$ implies that $(a_m)$ is bounded. Consequently, it has a limit point $a$, and $a\in A$ because $A$ is closed. Necessarily, $f(a)$ is a limit point of the sequence $(f(a_m))$, hence $b=f(a)$.

In this respect, let us note the following strengthening of the previous theorem, due to Browder (1954):
Theorem (Browder). — Let $f\colon \mathbf R^n\to\mathbf R^n$ be a local homeomorphism. If $f$ is closed, then $f$ is a global homeomorphism.

A surprising aspect of these results and their descendants is that they are based on two really different ideas. The results of Banach & Mazur and of Browder are based on the notion of covering, with ideas of homotopy theory and, ultimately, the fact that $\mathbf R^n$ is simply connected. On the other hand, the motivation of Hadamard was to generalize to dimension $n$ the following elementary discussion in the one-dimensional case: Let $f\colon\mathbf R\to\mathbf R$ be a $\mathscr C^1$-function whose derivative is $>0$ everywhere (so that $f$ is strictly increasing); give a condition for $f$ to be surjective. In this case, the condition is easy to find: the improper integral $\int f'(x)\,dx$ has to be divergent both at $-\infty$ and $+\infty$. In the $n$-dimensional case, the theorem of Hadamard is the following:

Theorem. Let $f\colon\mathbf R^n\to\mathbf R^n$ be a $\mathscr C^1$-map whose differential $f'(x)$ is invertible for every $x\in\mathbf R^n$. For $r\in\mathbf R_+$, let $\omega(r)$ be the infimum, for $x\in\mathbf R^n$ such that $\|x\|=r$, of $\|f'(x)^{-1}\|^{-1}$, the inverse of the norm of the linear map $f'(x)^{-1}$; if $\int_0^\infty \omega(r)\,dr=+\infty$, then $f$ is a global diffeomorphism.

In Hadamard's paper, the quantity $\omega(r)$ is described geometrically as the minor axis of the ellipsoid defined by $f'(x)$, and Hadamard insists that using the volume of this ellipsoid only, essentially given by the determinant of $f'(x)$, would not suffice to characterize global diffeomorphisms. (Examples are furnished by maps of the form $f(x_1,x_2)=(f_1(x_1),f_2(x_2))$. The determinant condition considers $f_1'(x_1)f_2'(x_2)$, while one needs individual conditions on $f'_1(x_1)$ and $f'_2(x_2)$.)
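As a sanity check in dimension one (an illustration that is not taken from Hadamard's paper), the minor axis at $x$ is simply $f'(x)$, and its smallest value on the sphere $\|x\|=r$ is $m(r)=\min(f'(r),f'(-r))$, where the notation $m$ is introduced only for this example. For $f=\arctan$ and $f=\operatorname{arsinh}$, whose derivatives are even functions that both tend to $0$ at infinity, one gets \[ \int_0^\infty \frac{dr}{1+r^2}=\frac\pi2<+\infty, \qquad \int_0^\infty\frac{dr}{\sqrt{1+r^2}}=\lim_{R\to+\infty}\operatorname{arsinh}(R)=+\infty, \] and indeed $\arctan$ is not surjective onto $\mathbf R$, while $\operatorname{arsinh}$ is a diffeomorphism of $\mathbf R$. So the criterion is about how slowly the derivative decays, not about whether it stays bounded away from $0$.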

In fact, as explained in Plastock (1974), both versions (closedness hypothesis or quantitative assumptions on the differential) imply that the map $f$ is a topological covering of $\mathbf R^n$. Since the target $\mathbf R^n$ is simply connected and the source $\mathbf R^n$ is connected, $f$ has to be a homeomorphism. I will explain this proof below, but I would first like to explain another one, due to Zuily & Queffelec (1995), which is quite interesting.

A dynamical system approach

The goal is to prove that $f$ is bijective and, to that aim, we will prove that every preimage set $f^{-1}(b)$ is reduced to one element. Replacing $f$ by $f-b$, it suffices to treat the case of $b=0$. In other words, we wish to show that the equation $f(x)=0$ has exactly one solution. For that, it is natural to try to start from some point $\xi\in\mathbf R^n$ and to force $\|f\|$ to decrease. This can be done by following the flow of the vector field given by $v(x)=-f'(x)^{-1}(f(x))$. This is a vector field on $\mathbf R^n$ and we can consider its flow: a map $\Phi$ defined on an open subset of $\mathbf R\times\mathbf R^n$ such that $\partial_t \Phi(t,x)=v(\Phi(t,x))$ for all $(t,x)$ and $\Phi(0,x)=x$ for all $x$. In fact, the Cauchy–Lipschitz theorem guarantees the existence of such a flow only when the vector field $v$ is locally Lipschitz, which happens if, for example, $f$ is assumed to be $\mathscr C^2$. In this case, there is even uniqueness of a maximal flow, and we will make this assumption, for safety. (In fact, the paper of De Marco, Gorni & Zampieri (1994) constructs the flow directly thanks to the hypothesis that the vector field is pulled back from the Euler vector field on $\mathbf R^n$.)

What are we doing here? Note that in $\mathbf R^n$, the opposite of the Euler vector field, defined by $u(y)=-y$, has a very simple flow: the flow lines are straight lines going to $0$. The formula above just pulls back this vector field $u$ via the local diffeomorphism $f$, and the flow lines of the vector field $v$ will just be the ones obtained by pulling back those of $u$ by $f$, which will explain the behaviour described below.

In particular, let $a\in\mathbf R^n$ be such that $f(a)=0$ and let $U$ be a neighborhood of $a$ such that $f$ induces a diffeomorphism from $U$ to a ball around $0$. Pulling back the solution of the minus-Euler vector field by $f$, we see that once a flow line enters the open set $U$, it converges to $a$. The goal is now to prove that it will indeed enter such a neighborhood (and, in particular, that such a point $a$ exists).

We consider a flow line starting from a point $x$, that is, $\phi(t)=\Phi(t,x)$ for all times $t$. Let $g(t)= f(\phi(t))$; observe that $g$ satisfies $g'(t)=f'(\phi(t))(\phi'(t))=-g(t)$, hence $g(t)=g(0)e^{-t}$. Assume that the flow line is defined on $[0;t_1\mathopen[$, with $t_1<+\infty$. By what precedes, $g$ is bounded in the neighborhood of $t_1$; since $f$ is assumed to be proper, this implies that $\phi(t)$ is bounded as well. The vector field $v$ is continuous, hence bounded on bounded subsets of $\mathbf R^n$, so that $\phi'=v\circ\phi$ is bounded; consequently, $\phi$ is uniformly continuous, hence it has a limit at $t_1$. We may then extend the flow line a bit to the right of $t_1$. As a consequence, the flow line is defined for all positive times, and $g(t)\to0$ when $t\to+\infty$. By the same properness argument, this implies that $\phi(t)$ is bounded when $t\to+\infty$, hence it has limit points $a$ which satisfy $f(a)=0$. Once $\phi$ enters an appropriate neighborhood of such a point, we have seen that the flow line automatically converges to some point $a\in f^{-1}(0)$.
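The relation $g(t)=g(0)e^{-t}$ is easy to observe numerically. Here is a minimal sketch in Python, under assumptions that are mine and not the source's: the example map $f(x,y)=(x+\tfrac12\sin y,\ y+\tfrac12\sin x)$ (a bounded perturbation of the identity, hence proper, with Jacobian determinant at least $3/4$), a classical Runge–Kutta integrator, and hypothetical function names.

```python
import math

def f(p):
    """Example map (an assumption of this sketch): a bounded perturbation of the identity."""
    x, y = p
    return (x + 0.5 * math.sin(y), y + 0.5 * math.sin(x))

def v(p):
    """Vector field v(p) = -f'(p)^{-1} f(p), with the 2x2 inverse written out by hand."""
    x, y = p
    a, b = 1.0, 0.5 * math.cos(y)   # first row of f'(p)
    c, d = 0.5 * math.cos(x), 1.0   # second row of f'(p)
    det = a * d - b * c             # at least 0.75, so f'(p) is invertible
    u, w = f(p)
    return (-(d * u - b * w) / det, -(a * w - c * u) / det)

def rk4_step(p, h):
    """One classical Runge-Kutta step of size h for the equation p' = v(p)."""
    def shifted(q, s):
        return (p[0] + s * q[0], p[1] + s * q[1])
    k1 = v(p)
    k2 = v(shifted(k1, h / 2))
    k3 = v(shifted(k2, h / 2))
    k4 = v(shifted(k3, h))
    return (p[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            p[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

def flow(p, t, steps):
    """Approximate Phi(t, p) by `steps` Runge-Kutta steps."""
    h = t / steps
    for _ in range(steps):
        p = rk4_step(p, h)
    return p

start = (3.0, -2.0)
g0 = f(start)
g2 = f(flow(start, 2.0, 2000))
# The two pairs should agree closely: f(Phi(2, start)) versus f(start) * exp(-2).
print(g2, (g0[0] * math.exp(-2.0), g0[1] * math.exp(-2.0)))
# After a long time, the flow line should be close to the zero (0, 0) of f.
print(flow(start, 30.0, 3000))
```

This is the continuous analogue of Newton's method: an explicit Euler discretization with step $1$ of the same vector field gives the usual Newton iteration.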

Let us now consider the map $\lambda\colon\mathbf R^n\to f^{-1}(0)$ that associates with a point $\xi$ the limit of the flow line $t\mapsto \Phi(t,\xi)$ starting from the initial condition $\xi$. By continuity of the flow of a vector field with respect to the initial condition, the map $\lambda$ is continuous. On the other hand, the hypothesis that $f$ is a local diffeomorphism implies that $f^{-1}(0)$ is a closed discrete subset of $\mathbf R^n$. Since $\mathbf R^n$ is connected, the map $\lambda$ is constant. Since one has $\lambda(\xi)=\xi$ for every $\xi\in f^{-1}(0)$, this establishes that $f^{-1}(0)$ is reduced to one element, as claimed.

Once $f$ is shown to be bijective, the fact that it is proper (closed would suffice) implies that its inverse bijection $f^{-1}$ is continuous. This concludes the proof.

The theorem of Banach and Mazur

The paper of Banach and Mazur is written in greater generality. They consider multivalued continuous maps $F\colon X\to Y$ ($k$-deutige stetige Abbildungen) by which they mean that for every $x$, a subset $F(x)$ of $Y$ is given, of cardinality $k$, the continuity being expressed by sequences: if $x_n\to x$, one can order, for every $n$, the elements of $F(x_n)=\{y_{n,1},\dots,y_{n,k}\}$, as well as the elements of $F(x)=\{y_1,\dots,y_k\}$, in such a way that $y_{n,j}\to y_j$ for all $j$. (In their framework, $X$ and $Y$ are metric spaces, but one could transpose their definition to topological spaces if needed.) They say that such a map is decomposed (zerfällt) if there are continuous functions $f_1,\dots,f_k$ from $X$ to $Y$ such that $F(x)=\{f_1(x),\dots,f_k(x)\}$ for all $x\in X$.

In essence, the definition that Banach and Mazur are proposing contains as a particular case the finite coverings. Namely, if $p\colon Y\to X$ is a finite covering of degree $k$, then the map $x\mapsto p^{-1}(x)$ is a continuous $k$-valued map from $X$ to $Y$. Conversely, let us consider the graph $Z$ of $F$, namely the set of all points $(x,y)\in X\times Y$ such that $y\in F(x)$. Then all fibers of the first projection $p\colon Z\to X$ have cardinality $k$, but it is not clear that $p$ has local sections, hence that it is a covering map of degree $k$.

It would however not be so surprising to 21st-century mathematicians that if one makes a suitable assumption of simple connectedness on $X$, then every such $F$ should be decomposed. Banach and Mazur assume that $X$ satisfies two properties:

  1. The space $X$ is semilocally arcwise connected: for every point $x\in X$ and every neighborhood $U$ of $x$, there exists an open neighborhood $U'$ contained in $U$ such that for every point $x'\in U'$, there exists a path $c\colon[0;1]\to U$ such that $c(0)=x$ and $c(1)=x'$. (Semilocally means that the path is not necessarily in $U'$ but in $U$.)
  2. The space $X$ is arcwise simply connected: two paths $c_0,c_1\colon[0;1]\to X$ with the same endpoints ($c_0(0)=c_1(0)$ and $c_0(1)=c_1(1)$) are strictly homotopic, that is, there exists a continuous map $h\colon[0;1]\times[0;1]\to X$ such that $h(0,t)=c_0(t)$ and $h(1,t)=c_1(t)$ for all $t$, and $h(s,0)=c_0(0)$ and $h(s,1)=c_0(1)$ for all $s$.

Consider a $k$-valued continuous map $F$ from $X$ to $Y$, where $X$ is connected. Banach and Mazur first prove that for every path $c\colon [0;1]\to X$ and every point $y_0\in F(c(0))$, there exists a continuous function $f\colon[0;1]\to Y$ such that $f(0)=y_0$ and $f(t)\in F(c(t))$ for all $t$. To that aim, they consider disjoint neighborhoods $V_1,\dots,V_k$ of the elements of $F(c(0))$, with $y_0\in V_1$, say, and observe that for $t$ small enough, there is a unique element in $F(c(t))\cap V_1$. This defines a bit of the path $f$, and one can go on. Now, given two paths $c,c'$ such that $c(0)=c'(0)$ and $c(1)=c'(1)$, and two maps $f,f'$ as above with $f(0)=f'(0)$, they consider a homotopy $h\colon[0;1]\times[0;1]\to X$ linking $c$ to $c'$. Subdividing this square into small enough subsquares, one sees by induction that $f(1)=f'(1)$. (This is analogous to the proof that a topological covering of the square is trivial.) Fixing a point $x_0\in X$ and a point $y_0\in F(x_0)$, one gets in this way a map from $X$ to $Y$ whose value at $x$ is equal to $f(1)$, for every path $c\colon[0;1]\to X$ such that $c(0)=x_0$ and $c(1)=x$, and every continuous map $f\colon [0;1]\to Y$ such that $f(t)\in F(c(t))$ for all $t$ and $f(0)=y_0$. One then proves that this map is continuous. If one considers all such maps, for all points in $F(x_0)$, one obtains the decomposition of the multivalued map $F$.
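The lifting step, and the role of simple connectedness, can be illustrated on the $2$-valued square root $F(z)=\{\pm\sqrt z\}$ on $X=\mathbf C\setminus\{0\}$, a space which is not simply connected. The sketch below (an example of mine, with hypothetical function names) follows a path in small steps and, at each step, picks the element of $F(c(t))$ closest to the previous value, a discrete stand-in for the "unique element in $V_1$" of the argument above.

```python
import cmath
import math

def F(z):
    """The two square roots of a nonzero complex number z."""
    r = cmath.sqrt(z)
    return (r, -r)

def lift(path, y0, steps=2000):
    """Follow `path` from t=0 to t=1, choosing by continuity the nearest square root."""
    y = y0
    for i in range(1, steps + 1):
        candidates = F(path(i / steps))
        y = min(candidates, key=lambda w: abs(w - y))
    return y

def loop_around_zero(t):
    """Unit circle: a loop based at 1 that is not contractible in C minus {0}."""
    return cmath.exp(2j * math.pi * t)

def loop_away_from_zero(t):
    """Circle of radius 1 centered at 2: a loop based at 3, contractible in C minus {0}."""
    return 2 + cmath.exp(2j * math.pi * t)

# Around the origin, the lift starting at 1 should come back to (about) -1.
print(lift(loop_around_zero, 1.0))
# Along the contractible loop, the lift should return to its starting value sqrt(3).
print(lift(loop_away_from_zero, math.sqrt(3)))
```

Two loops with the same endpoints can thus give different values of $f(1)$ when they are not homotopic, which is exactly what the simple connectedness hypothesis rules out.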

To prove their version of the Hadamard–Lévy theorem, Banach and Mazur observe that if $f\colon Y\to X$ is a local homeomorphism which is proper, then setting $F(x)=f^{-1}(x)$ gives a multivalued continuous map. It is not obvious that the cardinalities $k(x)$ of the sets $F(x)$ are constant, but this follows (if $X$ is connected) from the fact that $f$ is both a local homeomorphism and proper. Then $F$ is decomposed, so that there exist continuous maps $g_1,\dots,g_k\colon X\to Y$ such that $f^{-1}(x)=\{g_1(x),\dots,g_k(x)\}$ for all $x\in X$. This implies that $Y$ is the disjoint union of the $k$ connected subsets $g_j(X)$. If $Y$ is connected, then $k=1$ and $f$ is a homeomorphism.

The versions of Hadamard and Lévy, after Plastock

Hadamard considered the finite dimensional case, and Lévy extended it to the case of Hilbert spaces.

Plastock considers a Banach-space version of the theorem above: $f\colon E\to F$ is a $\mathscr C^1$-map between Banach spaces with invertible differentials and such that, setting $\omega(r)=\inf_{\|x\| = r}\|f'(x)^{-1}\|^{-1}$, one has $\int_0^\infty \omega(r)\,dr=+\infty$. Of course, under these hypotheses, the Banach spaces $E$ and $F$ are isomorphic, but it may be useful that they are not identical. Note that $f(E)$ is open in $F$, and the proposition that will ensure that $f$ is a global diffeomorphism is the following one, in the spirit of covering theory.

Proposition. (Assuming that $f$ is a local diffeomorphism.) It suffices to prove that the map $f$ satisfies the path lifting property: for every point $x\in E$ and every $\mathscr C^1$ map $c\colon[0;1]\to f(E)$ such that $c(0)=f(x)$, there exists a $\mathscr C^1$ map $d\colon[0;1]\to E$ such that $c(t)=f(d(t))$ for all $t$ and $d(0)=x$.

The goal is now to prove that $f$ satisfies this path lifting property. Using that $f$ is a local homeomorphism, one sees that lifts are unique, and are defined on a maximal subinterval of $[0;1]$ which is either $[0;1]$ itself, or of the form $[0;s\mathclose[$. To prevent the latter case, one needs to impose conditions on the norm $\| f'(x)^{-1}\|$ such as the one phrased in terms of $\omega(r)$ as in the Hadamard–Lévy theorem. In fact, Plastock starts with a simpler case.

Proposition. The path lifting property follows from the following additional hypotheses:

  1. One has $\|f(x)\|\to+\infty$ when $\|x\|\to+\infty$;
  2. There exists a positive continuous function $M\colon\mathbf R_+\to\mathbf R_+$ such that $\|f'(x)^{-1}\|\leq M(\|x\|)$ for all $x$.

Assume indeed that a path $c$ has a maximal lift $d$, defined over the interval $[0;s\mathclose[$. By hypothesis 1, $d(t)$ remains bounded when $t\to s$, because $c(t)=f(d(t))$ tends to $c(s)$. Differentiating the relation $c(t)=f(d(t))$, one gets $c'(t)=f'(d(t))(d'(t))$, hence $d'(t)=f'(d(t))^{-1}(c'(t))$, so that $\| d'(t)\|\leq M(\|d(t)\|) \|c'(t)\|$. Since $M$ is continuous, hence bounded on bounded intervals, and $c'$ is bounded on $[0;1]$, this implies that $\|d'\|$ is bounded, so that $d$ is uniformly continuous, hence it has a limit at $s$. Then the path $d$ can be extended by setting $d(s)$ to this limit and using the local diffeomorphism property to go beyond $s$.

The Hadamard–Lévy theorem is related to completeness of some length-spaces. So we shall modify the distance of the Banach space $E$ as follows: if $c\colon[0;1]\to E$ is a path in $E$, then its length is defined by \[ \ell(c) = \int_0^1 \| f'(c(t))^{-1}\|^{-1} \|{c'(t)}\|\, dt. \] Observe that $\|f'(c(t))^{-1}\|^{-1} \geq \omega(\|c(t)\|)$, so that \[ \ell(c) \geq \int_0^1 \omega(\|c(t)\|) \|{c'(t)}\|\, dt. \] The modified distance of two points in $E$ is then defined as the infimum of the lengths of all paths joining these two points.

Lemma. With respect to the modified distance, the space $E$ is complete.

One proves that $\ell(c) \geq \int_{\|{c(0)}\|}^{\|{c(1)}\|}\omega(r)\,dr$. Since $\int_0^\infty \omega(r)\,dr=+\infty$, this implies that Cauchy sequences for the modified distance are bounded in $E$ for the original norm. On the other hand, on any bounded subset of $E$, the Banach norm and the modified distance are equivalent, so that they have the same Cauchy sequences.
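One possible way to obtain the first inequality, sketched here under the assumption that $\omega$ is continuous and with the notation $\rho(t)=\|c(t)\|$ introduced only for this computation: the function $\rho$ is Lipschitz and satisfies $|\rho'(t)|\leq\|c'(t)\|$ wherever it is differentiable, so that \[ \ell(c)\geq\int_0^1\omega(\rho(t))\,\|c'(t)\|\,dt \geq \int_0^1 \omega(\rho(t))\,|\rho'(t)|\,dt \geq \left|\int_0^1\omega(\rho(t))\,\rho'(t)\,dt\right| = \left|\int_{\|c(0)\|}^{\|c(1)\|}\omega(r)\,dr\right|, \] the last equality being the change of variables $r=\rho(t)$. In particular, a path that starts in a fixed ball and goes far away in norm must have large length when $\int_0^\infty\omega(r)\,dr$ diverges.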

Other conditions can be derived from Plastock's general theorem. For example, assuming that $E$ and $F$ are both equal to a Hilbert space $H$, he shows that it suffices to assume the existence of a decreasing function $\lambda\colon\mathbf R_+\to\mathbf R_+$ such that $\langle f'(x)(u),u\rangle \geq \lambda(\|x\|) \| u\|^2$ for all $x,u$ and $\int_0^\infty \lambda(r)\,dr=+\infty$. Indeed, under this assumption, one has $\omega(r)\geq\lambda(r)$, so that $\lambda$ may be used in place of $\omega$.
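The comparison between $\lambda$ and the norm of $f'(x)^{-1}$ can be checked by a short computation (a sketch, taking the invertibility of $f'(x)$ for granted and noting that a decreasing function with divergent integral is positive everywhere). By the Cauchy–Schwarz inequality, \[ \lambda(\|x\|)\,\|u\|^2 \leq \langle f'(x)(u),u\rangle \leq \|f'(x)(u)\|\,\|u\|, \qquad\text{hence}\qquad \|f'(x)(u)\|\geq\lambda(\|x\|)\,\|u\|. \] Applying this to $u=f'(x)^{-1}(w)$ gives $\|f'(x)^{-1}(w)\|\leq \lambda(\|x\|)^{-1}\|w\|$ for all $w\in H$, that is, \[ \|f'(x)^{-1}\|^{-1}\geq\lambda(\|x\|). \]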

Application to periodic solutions of differential equations

Spectral theory can be seen as the infinite dimensional generalization of classical linear algebra. Linear differential operators and linear partial differential operators furnish prominent examples of such operators. The theorems of Hadamard–Lévy type have been applied to solve nonlinear differential equations.

I just give an example here, to give an idea of how this works, and also because I am too lazy to check the details.

Following Brown & Lin (1979), we consider the Newtonian equation of motion: \[ u''(t) + \nabla G (u(t)) = p(t) \] where $G$ represents the ambient potential, assumed to be smooth enough, and $p\colon \mathbf R\to\mathbf R^n$ is some external control. The problem studied by Brown and Lin is to prove the existence of periodic solutions when $p$ is itself periodic. The method consists in interpreting the left-hand side as a nonlinear map defined on the Sobolev space $E$ of $2\pi$-periodic $\mathscr C^1$-functions with a second derivative in $F=L^2([0;2\pi];\mathbf R^n)$, with values in $F$. Write $L$ for the linear operator $u\mapsto u''$ and $N$ for the (nonlinear) operator $u\mapsto \nabla G(u)$. Then $L$ is linear continuous (hence $L'(u)(v)=L(v)$), and $N$ is continuously differentiable, with differential given by \[ N'(u) (v) = \left( t \mapsto Q (u(t)) (v(t)) \right) \] for $u,v\in E$, where $Q$ is the Hessian of $G$.

In other words, the differential $(L+N)'(u)$ is the linear map $v\mapsto L(v) + Q(u(t)) v$. It is invertible if the eigenvalues of $Q(u(t))$ stay away from the squares of integers. Concretely, Brown and Lin assume that there are two constant symmetric matrices $A$ and $B$ such that $A\leq Q(x) \leq B$ for all $x$, and whose eigenvalues $\lambda_1\leq \dots\leq\lambda_n$ and $\mu_1\leq\dots\leq \mu_n$ are such that there are integers $N_1,\dots,N_n$ with $N_k^2<\lambda_k\leq\mu_k<(N_k+1)^2$ for all $k$. Using spectral theory in Hilbert spaces, these conditions imply that the linear operator $L+Q(u)\colon E\to F$ is an isomorphism, and that $\|(L+Q(u))^{-1}\|$ is bounded from above by the constant expression \[ c= \sup_{1\leq k\leq n} \sup \bigl( (\lambda_k-N_k^2)^{-1},((N_k+1)^2-\mu_k)^{-1} \bigr).\]
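To get a feeling for where the constant $c$ comes from, here is a minimal sketch in the scalar case $n=1$, with a constant coefficient $q$ in place of $Q(u(t))$ and with norms measured in $L^2$ on both sides; all three simplifications are assumptions of this example, not the setting of Brown and Lin, and the function names are hypothetical. On the Fourier mode $e^{ikt}$, the operator $v\mapsto v''+qv$ acts as multiplication by $q-k^2$, so its inverse has norm $1/\min_k |q-k^2|$.

```python
def inverse_norm(q, kmax=1000):
    """L^2 norm of the inverse of v -> v'' + q*v on 2*pi-periodic functions.

    The mode exp(i*k*t) is an eigenvector with eigenvalue q - k**2 (the modes
    k and -k share it), so the inverse has norm 1 / min_k |q - k**2|.
    Frequencies are truncated at kmax, which is harmless once kmax**2 is far
    above q. The value q must not be the square of an integer.
    """
    return max(1.0 / abs(q - k * k) for k in range(kmax + 1))

def constant_c(q, N):
    """The constant c of the text for n = 1, lambda_1 = mu_1 = q, N**2 < q < (N+1)**2."""
    return max(1.0 / (q - N * N), 1.0 / ((N + 1) ** 2 - q))

# The two quantities should coincide, for instance for q = 5 (N = 2) and q = 2.5 (N = 1).
print(inverse_norm(5.0), constant_c(5.0, 2))
print(inverse_norm(2.5), constant_c(2.5, 1))
```

The bound degenerates as $q$ approaches the square of an integer, which is the resonance that the hypothesis $N_k^2<\lambda_k\leq\mu_k<(N_k+1)^2$ excludes.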

Thanks to this differential estimate, the theorem of Hadamard–Lévy implies that the nonlinear differential operator $L+N$ is a global diffeomorphism from $E$ to $F$. In particular, there is a unique $2\pi$-periodic solution for every $2\pi$-periodic control function $p$.

I thank Thomas Richard for his comments.
