Friday, April 25, 2025

Yet another proof of the Weierstrass approximation theorem

Browsing through my Zotero database, I fall upon a paper by Harald Kuhn where he proposes an elementary proof of the Weierstrass approximation theorem. The proof is indeed neat, so here it is.

Theorem.Let $f\colon[0;1]\to\mathbf R$ be a continuous function and let $\varepsilon$ be a strictly positive real number. There exists a polynomial $P\in\mathbf R[T]$ such that $\left|P(x)-f(x)\right|<\varepsilon$ for every $x \in[0;1]$.

The proof goes out of the world of continuous functions and considers the Heaviside function $H$ on $\mathbf R$ defined by $H(x)=0$ for $x<0$ and $H(x)=1$ for $x\geq 0$ and will construct a polynomial approximation of $H.$

Lemma.There exists a sequence $(P_n)$ of polynomials which are increasing on $[-1;1]$ and which, for every $\delta>0$, converge uniformly to the function $H$ on $[-1;1]\setminus [-\delta;\delta]$.

For $n\in\mathbf N$, consider the polynomial $Q_n=(1-T^n)^{2^n}$. On $[0;1]$, it defines an decreasing function, with $Q_n(0)=1$ and $Q_n(1)=0$.

Let $q\in[0;1]$ be such that $0\leq q<1/2$. One has $$ Q_n(q) = (1-q^n)^{2^n} \geq 1-2^n q^n $$ in view of the inequality $(1-t)^n \geq 1-nt$ which is valid for $t\in[0;1]$. Since $2q<1$, we see that $Q_n(q)\to 1$. Since $Q_n$ is decreasing, one has $Q_n(q)\leq Q_n(x)\leq Q_n(1)=1$ for every $x\in[0;q]$, which shows that $Q_n$ converges uniformly to the constant function $1$ on the interval $[0;q]$.

Let now $q\in[0;1]$ be such that $1/2<q\leq 1$. Then $$ \frac1{Q_n(q)} = \frac1{(1-q^n)^{2^n}} = \left(1 + \frac{q^n}{1-q^{n}}\right)^{2^n} \geq 1 + \frac{2^nq^n}{1-q^{n}} $$ so that $Q_n(q)\to 0$. Since $Q_n$ is decreasing, one has $0=Q_n(1)\leq Q_n(x)\leq Q_n(q)$ for every $x\in[q;1]$, so that $Q_n$ converges uniformly to the constant function $0$ on the interval $[q;1]$.

Make an affine change of variables and set $P_n = Q_n((1-T)/2)$. The function defined by $P_n$ on $[-1:1]$ is increasing; for any real number $\delta$ such that $\delta>0,$ it converges uniformly to $0$ on $[-1;-\delta]$, and it converges uniformly to $1$ on $[\delta;1]$. This concludes the proof of the lemma.

We can now turn to the proof of the Weierstrass approximation theorem. Let $f$ be a continuous function on $[0;1]$. We may assume that $f(0)=0.$

The first step, that follows from Heine's uniform continuity theorem, consists in noting that there exists a uniform approximation of $f$ by a function of the form $ F(x)=\sum_{m=1}^N a_m H(x-c_m)$, where $(a_1,\dots,a_m)$ and $(c_1,\dots,c_m)$ are real numbers in $[0;1].$ Namely, Heine's theorem implies that there exists $\delta>0$ such that $|f(x)-f(y)|<\varepsilon$ if $|x-y|<\delta$. Choose $N$ such that $N\delta\geq 1$ and set $c_m=m/N$; then define $a_1,\dots,a_N$ so that $a_1=f(c_1)$, $a_1+a_2=f(c_2)$, etc. It is easy to check that $|F(x)-f(x)|\leq \varepsilon$ for every $x\in[0;1]$. Moreover, $\lvert a_m\rvert\leq\varepsilon/2$ for all $m$.

Now, fix a real number $\delta>0$ small enough so that the intervals $[c_m-\delta;c_m+\delta]$ are pairwise disjoint, and $n\in\mathbf N$ large enough so that $|P_n(x)-H(x)|\leq\varepsilon/2A$ for all $x\in[-1;1]$ such that $\delta\leq|x|$, where $A=\lvert a_0\rvert+\dots+\lvert a_N\rvert$. Finally, set $P(T) = \sum_{m=1}^N a_m P_n(T-c_m)$.

Let $x\in[-1;1]$. If $x$ doesn't belong to any interval of the form $[c_m-\delta;c_m+\delta]$, one can write $$\lvert P(x)-F(x)\rvert\leq \sum_{m} \lvert a_m\rvert \,\lvert P_n(x-c_m)- H(x-c_m)\rvert \leq \sum_m \lvert a_m\rvert (\varepsilon/A)\leq \varepsilon. $$ On the other hand, if there exists $m\in\{1,\dots,N\}$ such that $x\in[c_m-\delta;c_m+\delta]$, then there exists a unique such integer $m$. Writing $$\lvert P(x)-F(x)\rvert\leq \sum_{k\neq m} \lvert a_k\rvert \,\lvert P_n(x-c_k)- H(x-c_k)\rvert + \lvert a_m\rvert\, \lvert P_n(x-c_m)-H(x-c_m)\rvert, $$ the term with index $k$ in the first sum is bounded by $\lvert a_k\rvert \varepsilon/A$, while the last term is bounded by $ \lvert a_m\rvert$, because $0\leq P_n\leq H\leq 1$. Consequently, $$\lvert P(x)-F(x)\rvert \leq (\varepsilon/2A)\sum_{k\neq m} \lvert a_k\rvert +\lvert a_m\rvert \leq 2\varepsilon.$$ Finally, $\lvert P(x)-f(x)\rvert\leq \lvert P(x)-F(x)\rvert+\lvert F(x)-f(x)\rvert\leq 3\varepsilon$. This concludes the proof.

Yet another proof of the inequality between the arithmetic and the geometric means

This is an exposition of the proof of the inequality between arithmetic and geometric means given by A. Pełczyński (1992), “Yet another proof of the inequality between the means”, Annales Societatis Mathematicae Polonae. Seria II. Wiadomości Matematyczne, 29, p. 223–224. The proof might look bizarre, but I can guess some relation with another paper of the author where he proves uniqueness of the John ellipsoid. And as bizarre as it is, and despite the abundance of proofs of this inequality, I found it nice. (The paper is written in Polish, but the formulas allow to understand it.)

For $n\geq 1$, let $a_1,\dots,a_n$ be positive real numbers. Their arithmetic mean is $$ A = \dfrac1n \left(a_1+\dots + a_n\right)$$ while their geometric mean is $$G = \left(a_1\dots a_n\right)^{1/n}.$$ The inequality of the title says $G\leq A,$ with equality if and only if all $a_k$ are equal. By homogeneity, it suffices to prove that $A\geq 1$ if $G=1$, with equality if and only if $a_k=1$ for all $k.$ In other words, we have to prove the following theorem.

Theorem.If $a_1,\dots,a_n$ are positive real numbers such that $a_1\dots a_n=1$ and $a_1+\dots+a_n\leq n,$ then $a_1=\dots=a_n=1.$

The case $n=1$ is obvious and we argue by induction on $n$.

Lemma.If $a_1\cdots a_n=1$ and $a_1+\dots+a_n\leq n,$ then $a_1^2+\dots+a_n^2\leq n$

Indeed, we can write $$ a_1^2+\dots+a_n^2 = (a_1+\dots+a_n)^2 - \sum_{i\neq j} a_i a_j \leq n^2 - \sum_{i\neq j} a_i a_j,$$ and we have to give a lower bound for the second term. For given $i\neq j$, the product $a_i a_j$ and the remaining $a_k$, for $k\neq i,j$, are $n-1$ positive real numbers whose product is equal to $1$. By induction, one has $$ n-1 \leq a_i a_j + \sum_{k\neq i,j}a_k.$$ Summing these $n(n-1)$ inequalities, we have $$ n(n-1)^2 \leq \sum_{i\neq j} a_i a_j + \sum_{i\neq j} \sum_{k\neq i,j} a_k.$$ In the second term, every element $a_k$ appears $(n-1)(n-2)$ times, hence $$ n(n-1)^2 \leq \sum_{i\neq j} a_i a_j + (n-1)(n-2) \sum_{k} a_k \leq \sum_{i\neq j} a_i a_j + n(n-1)(n-2), $$ so that $$ \sum_{i\neq j} a_i a_j \geq n(n-1)^2-n(n-1)(n-2)=n(n-1).$$ Finally, we obtain $$a_1^2+\dots+a_n^2 \leq n^2-n(n-1)=n,$$ as claimed.

We can iterate this lemma: if $a_1+\dots+a_n\leq n$, then $$a_1^{2^m}+\dots+a_n^{2^m}\leq n$$ for every integer $m\geq 0$. When $m\to+\infty$, we obtain that $a_k\leq 1$ for every $n$. Since $a_1\dots a_n=1$, we must have $a_1=\dots=a_n=1$, and this concludes the proof.

Monday, April 7, 2025

A generalization of the Eisenstein criterion

Recently, in the Zulip server for Lean users, somebody went with something that looked like homework, but managed to sting me a little bit.

It was about irreducibility of polynomials with integer coefficients. Specifically, the guy wanted a proof that the polynomial $T^4-10 T^2+1$ is irreducible, claiming that the Eisenstein criterion was not good at it.

What was to proven (this is what irreducible means) is that it is impossible to write that polynomial as the product of two polynomials with integer coefficients, except by writing $T^4-10 T^2+1$ as $(1)·(T^4-10 T^2+1)$ or as $ (-1)·(-T^4+10 T^2-1)$.

This is both a complicated and a trivial question

Trivial because there are general bounds (initially due to Mignotte) for the integers that appear in any such factorization, and it could just be sufficient to try any possibility in the given range and conclude. Brute force, not very intelligent, but with a certain outcome.

Complicated because those computations would often be long, and there are many criteria in number theory to assert irreducibility.

One of the easiest criteria is the aforementioned Eisenstein criterion.

This criterion requires an auxiliary prime number, and I'll state it as an example, using another polynomial, say $T ^ 4 - 10 T ^ 2 + 2$. Here, one takes the prime number 2, and one observes that modulo 2, the polynomial is equal to $T ^ 4$, while the constant term, 2, is not divisible by $2^2=4$. In that case, the criterion immediately asserts that the polynomial $T^4-10T^2+2$ is irreducible.

With all due respect to Eisenstein, I don't like that criterion too much, though, because it only applies in kind of exceptional situations. Still, it is often useful.

There is a classic case of application, by the way, of the Eisenstein criterion, namely the irreducibility of cyclotomic polynomials of prime index, for example, $T^4+T^3+T^2+T+1$ (for the prime number $p=5$). But it is not visible that the criterion applies, and one needs to perform the change of variable $T = X+1$.

I had noticed that in that case, it maybe interesting to generalize the Eisenstein criterion to avoid this change of variables. Indeed, for a monic polynomial $f$ in $\mathbf Z[T]$, a variant of the criterion states: take an integer $a$ and a prime number $p$, and assume that :

  • $f(T) \equiv (T-a)^d \mod p$
  • $f'(a)$ (derivative at $a$) is not divisible by $p^2$.

Then $f$ is irreducible.

For the cyclotomic polynomial of index $5$, $f(T) = T^4+T^3+T^2+T+1$, still taking $p=5$, one has $f(T) \equiv (T-1)^4 \pmod5$ and $f'(1)=4·5/2=10$ is not divisible by $5^2=25$. Consequently, it is irreducible. And the same argument works for all cyclotomic polynomials of prime index.

The reason, that avoids any strange computation, is that $f(T)=(T^5-1)/(T-1)$, which modulo 5 is $(T-1)^4$ — by the divisibility of the binomial coefficients.

To go back to the initial example, $T^4-10T^2+1$, there are indeed no prime numbers with which the Eisenstein criterion can be applied. This is obvious in the standard form, because the constant coefficient is 1. But the variant doesn't help neither. The only prime it could seems to be 2, but its derivative at 1 is equal to $-16$, and is divisible by 4.

This is where a new variant of the criterion can be applied, this time with the prime number 3.

Theorem.Let $q\in\mathbf Z[T]$ be a monic polynomial, let $p$ be a prime number such that $q$ is irreducible in $\mathbf F_p[T]$. Let $f\in\mathbf Z[T]$ be a monic polynomial. Assume that $f\equiv q^d$ modulo $p$, but that $f\not\equiv 0$ modulo $\langle q, p^2\rangle$. Then $f$ is irreducible in $\mathbf Z[T]$.

To see how this criterion applies, observe that modulo 3, one has $f(T)\equiv T^4+2T^2+1=(T^2+1)^2\pmod 3$. So we are almost as in the initial criterion, but the polynomial $T$ is not $T^2+1$. The first thing that allows this criterion to apply is that $T^2+1$ is irreducible modulo 3. In this case, this is because $-1$ is not a square mod 3.

The criterion also requires of variant of the condition on the derivative — it holds because the polynomial is not zero modulo $\langle T^2+1, 9\rangle. Here, one has \[ T^4-10T^2+1=(T^2+1)^2-12T^2 = (T^2+1)^2-12(T^2+1)+12\] hence it is equal to 3 modulo $\langle T^2+1, 9\rangle$.

And so we have an Eisenstein-type proof that the polynomial $T^4-10T^2+1$ is irreducible over the integers. CQFD!

I made the fun last a bit longer by formalizing the proof in Lean, first of the generalized criterion, and then of the particular example. It is not absolutely convincing yet, because Lean/Mathlib still lacks a bit of tools for handling explicit computations. And probably many parts can be streamlined. Still, it was a fun exercise to do.

The proof works in a more general context and gives the following theore:

Theorem. Let $R$ be an integral domain, let $P$ be a prime ideal of $R$ and let $K$ be the field of fractions of $R/P$. Let $q\in R[T]$ be a monic polynomial such that $q$ is irreducible in $K[T]$. Let $f\in R[T]$ be a monic polynomial. Assume that $f\equiv q^d$ modulo $P$, but that $f\not\equiv 0$ modulo $\langle q\rangle + P^2$. Then $f$ is irreducible in $R[T]$.