Lecture 3

Motivation: Convolution and approximation

Last time we worked on convolution and showed how to approximate an integrable function by continuous functions using convolution. The following code illustrates the approximation process in problem 4 of problem set 1. It plots $\Pi*\Lambda_{1/n}$ for $n = 5, 10, 20$.

# Sage code: convolve the box function with narrowing triangle kernels.
x = PolynomialRing(QQ,'x').gen()
# T: triangle function on (-1,1); f: box function (indicator of (-1/2,1/2))
T = piecewise([[(-1,0),1+x],[(0,1),1-x]])
f = piecewise([[(-1/2,1/2),1]])
P = plot(f)
Q = plot(T)
for N in [5,10,20]:
    # TN: triangle of height N supported on (-1/N,1/N), so its integral is 1
    TN = piecewise([[(-1/N,0),N*(1+x*N)],[(0,1/N),N*(1-x*N)]])
    Q = Q + plot(TN,color='red')
    # TNF: convolution of the box function with TN
    TNF = f.convolution(TN)
    P = P + plot(TNF)
P.show(title='convolution')
Q.show(title='lambda n')

The idea behind pointwise approximation is as follows. Suppose we want to approximate $f(x)$ by a sequence $f*g_n$. We have to control $|f(x)-f*g_n(x)| = \left|\int_{0}^1 f(x-y)g_n(y)\,dy - f(x)\right|$. Assuming $\int_{0}^1 g_n(y)\,dy = 1$ for every $n$, we can write the right-hand side as $\left|\int_{0}^1 (f(x-y) - f(x))g_n(y)\,dy\right| \leq \int_{0}^1 |f(x-y)-f(x)||g_n(y)|\,dy$.

In particular, when $g_n$ has very small support $(-\delta,\delta)$, the integral is taken only over $y\in(-\delta,\delta)$, so that $f(x-y)$ is a small perturbation of $f$ near $x$. For example, when $f$ is continuous at $x$, the term $|f(x-y)-f(x)|$ will be small.
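The estimate above can be checked numerically. The following is a minimal sketch, not part of the problem set: it assumes the test function $f(x)=\cos(2\pi x)$, the point $x_0 = 0.1$, and the triangle kernel $g_n(y) = n(1-n|y|)$ on $(-1/n,1/n)$; the helper names `triangle_kernel` and `convolve_at` are made up for illustration. The integral is approximated by a midpoint rule.

```python
import math

def triangle_kernel(n, y):
    """g_n(y) = n(1 - n|y|) on (-1/n, 1/n), zero elsewhere; it integrates to 1."""
    return n * (1.0 - n * abs(y)) if abs(y) < 1.0 / n else 0.0

def convolve_at(f, n, x, steps=2000):
    """Midpoint-rule approximation of (f * g_n)(x), integrating over the support of g_n."""
    h = (2.0 / n) / steps
    total = 0.0
    for j in range(steps):
        y = -1.0 / n + (j + 0.5) * h
        total += f(x - y) * triangle_kernel(n, y) * h
    return total

def f(t):
    # Assumed test function: continuous and 1-periodic.
    return math.cos(2 * math.pi * t)

x0 = 0.1
errors = [abs(convolve_at(f, n, x0) - f(x0)) for n in (5, 10, 20)]
for n, err in zip((5, 10, 20), errors):
    print(n, err)
```

The printed errors shrink as $n$ grows, in line with the bound $\int|f(x-y)-f(x)||g_n(y)|\,dy$ over a shrinking support.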

Good kernels

Let $(K_n)_{n=1}^{\infty}$ be a family of kernels. It is called good provided that it satisfies the following properties:

(1) $\int_{0}^1 K_n(x)\,dx = 1$ for every $n$.

(2) For every $\delta > 0$, $\int_{(0,1)\setminus(-\delta,\delta)}|K_n(x)|\,dx \to 0$ as $n\to\infty$, where $(-\delta,\delta)$ is understood as a neighbourhood of $0$ in $\mathbb{T}$.

(3) Either $K_n\geq 0$, or $\|K_n\|_{L^1}<M$ for some $M>0$ and all $n$.

A good family of kernels is also called an “approximation to the identity”.

The $\Lambda_n$ satisfy (1), are nonnegative, and have support shrinking to $\{0\}$, so they automatically form a good family of kernels. The Fejer kernel and the Dirichlet kernel, however, have no “shrinking support”, as can be seen from the appendix of lecture 1.
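Without shrinking support, conditions (2) and (3) have to be checked by hand, and they can fail. As a rough numerical illustration of condition (3), the sketch below compares the $L^1$ norms of the Dirichlet kernels $D_n(x)=\sum_{k=-n}^{n}e^{2\pi i k x}$ with those of their averages $F_N = (D_0+\cdots+D_{N-1})/N$. The function names and the midpoint-rule helper `l1_norm` are illustrative choices, not taken from the lecture.

```python
import math

def dirichlet(n, x):
    """D_n(x) = sum_{k=-n}^{n} e^{2 pi i k x} = 1 + 2 sum_{k=1}^{n} cos(2 pi k x)."""
    return 1.0 + 2.0 * sum(math.cos(2 * math.pi * k * x) for k in range(1, n + 1))

def fejer(N, x):
    """F_N = (D_0 + ... + D_{N-1}) / N.
    Frequency k appears in D_n exactly when n >= |k|, i.e. in N - |k| of the N kernels."""
    return 1.0 + 2.0 * sum((1 - k / N) * math.cos(2 * math.pi * k * x) for k in range(1, N))

def l1_norm(kernel, n, steps=2000):
    """Midpoint-rule approximation of the integral of |kernel(n, x)| over [-1/2, 1/2]."""
    h = 1.0 / steps
    return sum(abs(kernel(n, -0.5 + (j + 0.5) * h)) for j in range(steps)) * h

sizes = (4, 16, 64)
dirichlet_norms = [l1_norm(dirichlet, n) for n in sizes]
fejer_norms = [l1_norm(fejer, n) for n in sizes]
for n, d, fj in zip(sizes, dirichlet_norms, fejer_norms):
    print(n, d, fj)
```

The Dirichlet norms keep growing (roughly like $\log n$), so no uniform bound $M$ exists for them, while the averaged kernels stay at $L^1$ norm $1$.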

The notion of “good kernel” is designed so that the following holds:

Theorem. Let $(K_n)_{n=1}^{\infty}$ be a good family of kernels and let $f\in L^1(\mathbb{T})$. Then

  • If $f$ is continuous at $x$, then $\lim_{n\to \infty}(f*K_n)(x) = f(x)$.
  • If $f$ is continuous everywhere on $\mathbb{T}$, the limit is uniform.

Proof. Property (1) implies

$$\begin{split} |f*K_n(x) - f(x)| &= \left|\int_{0}^1 (f(x-y)-f(x))K_n(y)\,dy\right|\\ &\leq \int_{0}^1 |f(x-y)-f(x)||K_n(y)|\,dy. \end{split}$$

For $\epsilon > 0$ choose $\delta$ such that $|f(x-y) - f(x)|<\epsilon$ whenever $y\in (-\delta,\delta)$, so $\int_{-\delta}^{\delta}|f(x-y)-f(x)||K_n(y)|\,dy\leq \epsilon \int_{-\delta}^\delta |K_n(y)|\,dy \leq \epsilon M$, where $M$ is the bound from property (3) (if $K_n\geq 0$, then $\|K_n\|_{L^1} = 1$ by (1), so any $M>1$ works).

The other part of the integral is

$\int_{(0,1)\setminus(-\delta,\delta)}|f(x-y)-f(x)||K_n(y)|\,dy\leq 2\sup_{t\in [0,1]}|f(t)| \cdot \int_{(0,1)\setminus(-\delta,\delta)}|K_n(y)|\,dy \to 0$ as $n\to \infty$ by property (2) of good kernels. (This step uses that $f$ is bounded.)

We conclude that when $n$ is large enough (depending on $x$), $|f*K_n(x)-f(x)|\leq \int_{0}^1|f(x-y)-f(x)||K_n(y)|\,dy < C\epsilon$ for some constant $C>0$ independent of $n$ and $\epsilon$ (for instance $C = M+1$).

If $f$ is continuous everywhere on $\mathbb{T}$, then it is uniformly continuous on $\mathbb{T}$, so the $\delta$ in the proof can be chosen independently of $x$. Consequently the threshold for “$n$ large enough” no longer depends on $x$, which gives uniform convergence. $\square$

Cesaro means and Fejer kernel

Definition. (Cesaro sum) Let $f\in L^1(\mathbb{T})$. Let $S_n f$ be its $n$-th Fourier partial sum. The $N$-th Cesaro sum of $f$ is

$$\sigma_N f(x):= \frac{S_0+\cdots+S_{N-1}}{N}(f)(x).$$

Let $D_n = \sum_{k=-n}^n e^{2\pi i k x}$ be the Dirichlet kernels. Let

$$F_N:=\frac{D_0+\cdots+D_{N-1}}{N}.$$

Since $S_n f = f*D_n$, we get $\sigma_N(f)(x) = f*F_N(x)$. The kernel $F_N$ is called the Fejer kernel.

Like the Dirichlet kernel, the Fejer kernel also has a simpler closed form.

Lemma. $F_N(x) = \frac{1}{N}\frac{\sin^2(\pi Nx)}{\sin^2(\pi x)}$.

Proof. In problem set 2.
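Before working through the proof, the closed form can be sanity-checked numerically. The sketch below compares the average of Dirichlet kernels with the closed form at a few arbitrarily chosen sample points away from $x=0$ (where the closed form is a $0/0$ expression); the function names are illustrative.

```python
import math

def dirichlet(n, x):
    """D_n(x) = sum_{k=-n}^{n} e^{2 pi i k x} = 1 + 2 sum_{k=1}^{n} cos(2 pi k x)."""
    return 1.0 + 2.0 * sum(math.cos(2 * math.pi * k * x) for k in range(1, n + 1))

def fejer_from_definition(N, x):
    """F_N(x) = (D_0(x) + ... + D_{N-1}(x)) / N."""
    return sum(dirichlet(n, x) for n in range(N)) / N

def fejer_closed_form(N, x):
    """Closed form sin^2(pi N x) / (N sin^2(pi x)), valid for x not an integer."""
    return math.sin(math.pi * N * x) ** 2 / (N * math.sin(math.pi * x) ** 2)

# Arbitrary sample points in [-1/2, 1/2], none of them equal to 0.
sample_points = (0.03, 0.17, 0.25, 0.41, -0.33)
max_gap = max(
    abs(fejer_from_definition(N, x) - fejer_closed_form(N, x))
    for N in (1, 2, 5, 12)
    for x in sample_points
)
print(max_gap)
```

Agreement up to rounding error is of course no substitute for the proof, but it catches sign or indexing mistakes in the formula.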

Lemma. The Fejer kernel is a good kernel.

Proof. Note that since $F_N\geq 0$, it suffices to verify (1) and (2) in the definition of a good kernel.

  • To verify (1): $\int_{-\frac{1}{2}}^{\frac{1}{2}} F_N(x)\,dx = \int_{0}^1 \frac{D_0(x)+\cdots+D_{N-1}(x)}{N}\,dx = \frac{N}{N} = 1$, since $\int_0^1 D_n(x)\,dx = 1$ for every $n$.
  • To verify (2), let $A_\delta$ denote the set $[-\frac{1}{2},\frac{1}{2}]\setminus (-\delta,\delta)$. Observe that for $\delta > 0$, $F_N\to 0$ uniformly on $A_\delta$. Indeed, choose $c_\delta>0$ such that $\sin^2(\pi x)\geq c_\delta$ on $A_\delta$ (for instance $c_\delta = \sin^2(\pi\delta)$ when $\delta\leq\frac{1}{2}$), so by the previous lemma $F_N(x)\leq \frac{1}{N}\cdot \frac{1}{c_\delta}$ on $A_\delta$, which goes to $0$ as $N\to \infty$. By uniform convergence, $\lim_{N\to \infty}\int_{A_\delta}F_N = \int_{A_\delta}\lim_{N\to \infty}F_N = 0.$
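The bound used for (2) can be illustrated numerically. This is a sketch under assumed parameters: $\delta = 0.1$ and $N = 10, 40, 160$ are arbitrary choices, and `tail_mass` is a hypothetical helper that applies a midpoint rule to the closed form of $F_N$. It compares $\int_{A_\delta}F_N$ with the bound $|A_\delta|/(N c_\delta)$, where $c_\delta = \sin^2(\pi\delta)$.

```python
import math

def fejer_closed_form(N, x):
    """F_N(x) = sin^2(pi N x) / (N sin^2(pi x)), valid for x not an integer."""
    return math.sin(math.pi * N * x) ** 2 / (N * math.sin(math.pi * x) ** 2)

def tail_mass(N, delta, steps=20000):
    """Midpoint-rule approximation of the integral of F_N over A_delta.
    F_N is even, so integrate over [delta, 1/2] and double the result."""
    h = (0.5 - delta) / steps
    return 2.0 * sum(fejer_closed_form(N, delta + (j + 0.5) * h) for j in range(steps)) * h

delta = 0.1                              # assumed value, any 0 < delta < 1/2 works
c_delta = math.sin(math.pi * delta) ** 2
sizes = (10, 40, 160)
tails = [tail_mass(N, delta) for N in sizes]
bounds = [(1 - 2 * delta) / (N * c_delta) for N in sizes]  # |A_delta| / (N c_delta)
for N, t, b in zip(sizes, tails, bounds):
    print(N, t, b)
```

Both columns decay like $1/N$, and the computed tail mass stays below the bound from the proof.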

As a consequence of the goodness of $F_N$, we obtain Fejer’s theorem.

Theorem. Let $f\in L^1(\mathbb{T})$. Then

  • $\sigma_N(f)(x)\to f(x)$ when $f$ is continuous at $x$.
  • If $f$ is continuous on $\mathbb{T}$, then the convergence is uniform.

In particular, we have shown that every continuous function on $\mathbb{T}$ can be uniformly approximated by trigonometric polynomials, and that these polynomials can be taken to be its Cesaro sums. Later, we shall use this result to prove the $L^2$ convergence of Fourier series.
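As an illustration of uniform approximation by Cesaro sums, consider the assumed example $f(x) = |x|$ on $[-\frac{1}{2},\frac{1}{2})$, extended 1-periodically (continuous on $\mathbb{T}$, with kinks at $0$ and $\frac{1}{2}$). Its Fourier coefficients are $\hat f(0)=\frac14$ and $\hat f(k) = \frac{(-1)^k-1}{2\pi^2k^2}$ for $k\neq 0$. Averaging the partial sums gives $\sigma_N f(x) = \sum_{|k|<N}\left(1-\frac{|k|}{N}\right)\hat f(k)e^{2\pi i k x}$, because frequency $k$ appears in $N-|k|$ of the sums $S_0 f,\dots,S_{N-1}f$. The sketch below estimates the sup-norm error on a grid; the names `fhat` and `cesaro_mean` are illustrative.

```python
import math

def fhat(k):
    """Fourier coefficients of f(x) = |x| on [-1/2, 1/2), extended 1-periodically."""
    if k == 0:
        return 0.25
    return ((-1) ** k - 1) / (2 * math.pi ** 2 * k ** 2)

def cesaro_mean(N, x):
    """sigma_N f(x) = sum_{|k| < N} (1 - |k|/N) fhat(k) e^{2 pi i k x}.
    The coefficients are real and even in k, so the sum reduces to cosines."""
    return fhat(0) + 2.0 * sum(
        (1 - k / N) * fhat(k) * math.cos(2 * math.pi * k * x) for k in range(1, N)
    )

# Grid on [-1/2, 1/2] that contains the kinks x = 0 and x = +-1/2.
grid = [-0.5 + j / 200 for j in range(201)]
sizes = (8, 32, 128)
sup_errors = [max(abs(cesaro_mean(N, x) - abs(x)) for x in grid) for N in sizes]
for N, err in zip(sizes, sup_errors):
    print(N, err)
```

The grid maximum is only a proxy for the true supremum, but it shows the error decreasing with $N$, as Fejer's theorem predicts.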

Further remarks on “approximate identity”

A family of good kernels is also called an “approximate identity”, because it really is an “approximation” to the “identity”. An identity element is something that behaves like 1: multiplying any element by it returns that element. In our case, the identity is the identity for convolution. Imagine a “function” $I$ such that $f*I = f$ for every $f$; what should $I$ be? By the convolution theorem, $\hat{f}(n) = \widehat{f*I}(n) = \hat{f}(n)\hat{I}(n)$, so one has $\hat{I}(n) = 1$ for every $n$. Then the Fourier series of $I$ should look like $\sum_{n = -\infty}^{+\infty}e^{2\pi i n x}$, so $I$ should be something like “$\lim_{N \to \infty}D_N$”. With the notions of distributions and measures we can make this discussion precise: the “identity” is the $\delta$-measure, and an approximate identity is an approximation to the $\delta$-measure in the weak sense.
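The heuristic $\hat{I}(n) = 1$ can be compared with the Fourier coefficients of the two kernels from this lecture. The following short computation is a sketch that uses only the definitions of $D_N$ and $F_N$: the frequency $n$ appears in $D_m$ exactly when $m\geq |n|$, hence in $N-|n|$ of the kernels $D_0,\dots,D_{N-1}$.

```latex
\widehat{D_N}(n) =
\begin{cases}
  1, & |n| \le N,\\
  0, & |n| > N,
\end{cases}
\qquad
\widehat{F_N}(n) = \frac{1}{N}\sum_{m=0}^{N-1}\widehat{D_m}(n) =
\begin{cases}
  1 - \dfrac{|n|}{N}, & |n| < N,\\
  0, & |n| \ge N.
\end{cases}
```

For each fixed $n$, both sequences of coefficients tend to $1 = \hat{I}(n)$ as $N\to\infty$. Convergence of the coefficients alone does not make a family good, though: the conditions (1) to (3) still have to be checked separately.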

The “goodness” of good kernels is not restricted to the continuous category. One can show, for example, that if $f\in L^p(\mathbb{T})$ with $1\leq p<\infty$, then $\sigma_N f \to f$ in $L^p$. See Hoffman’s book, chapter 2, for more general results in this direction.