In this note we identify the circle $S^1$ with $\mathbb{T} := \mathbb{R}/\mathbb{Z}$ via $e^{2\pi i x} \leftrightarrow x \in [0,1)$. When we say the point $x$ on the circle, we mean the point $e^{2\pi i x} \in S^1$. In our discussion “time” is a real variable $t \in [0,+\infty)$.
The distribution of temperature on the circle is modeled by a density function $f(x)$ for $x \in \mathbb{T}$. The temperature on an interval $[a,b]$ is given by $\int_a^b f(x)\,dx$.
Let $u(x,t)$ be the temperature distribution on the circle at time $t$. Note that $u(0,t) = u(1,t)$, and that the temperature of an interval $(a,b)$ at time $t$ is given by $T(a,b,t) := \int_a^b u(x,t)\,dx$. Let $u(x,0) = f(x)$ be the initial temperature. We study how heat diffuses as time increases, under the following postulates:
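The increase of temperature of an interval $(a,b)$ during a time $\epsilon$ is proportional to the increase of heat on $(a,b)$. In other words, writing $H(a,b,t)$ for the heat on $(a,b)$ at time $t$, we have $T(a,b,t+\epsilon) - T(a,b,t) = c \cdot [H(a,b,t+\epsilon) - H(a,b,t)]$ for some constant $c > 0$, or $\frac{\partial}{\partial t} T(a,b,t) = c\,\frac{\partial}{\partial t} H(a,b,t)$ if we look at an infinitesimal change of time $t$.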
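Fix a time $t$ and consider the heat transfer during an infinitesimal time interval. Let $\omega_t(x)$ be the flux of heat at $x$, i.e. the amount of heat which goes from the left side to the right side at $x$, so that $\omega_t(x)$ is positive if heat transfers from left to right and negative if heat transfers from right to left. Since heat enters $(a,b)$ through $a$ and leaves through $b$, we get $\frac{\partial}{\partial t} H(a,b,t) = \omega_t(a) - \omega_t(b)$.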
Fourier’s law: the heat flux is negatively proportional to the derivative of temperature, i.e. at time $t$, $\omega_t(x) = -\kappa \lim_{\delta \to 0} \frac{u(x+\delta,t) - u(x,t)}{\delta} = -\kappa \frac{\partial}{\partial x} u(x,t)$ for a constant $\kappa > 0$. The sign is negative because heat transfers from higher temperature to lower temperature.
For example, if the temperature is higher on the right, then $u(x,t)$ is increasing in the $x$ variable and the partial derivative $\frac{\partial}{\partial x}u(x,t)$ is positive. In this case the heat goes from the right (higher) to the left (lower), so $\omega_t(x) < 0$, because $\omega_t(x)$ measures the amount of heat going from left to right.
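Consequently, $\frac{\partial}{\partial x}\omega_t(x) = -\kappa \frac{\partial^2}{\partial x^2} u(x,t)$.
From the first two postulates, we get $\frac{d}{dt}\int_a^b u(x,t)\,dx = \int_a^b \frac{\partial}{\partial t} u(x,t)\,dx = c\,(\omega_t(a) - \omega_t(b)) = -c \int_a^b \omega_t'(x)\,dx$. By the third postulate, $\omega_t'(x) = -\kappa \frac{\partial^2}{\partial x^2} u(x,t)$, so we get $\int_a^b \frac{\partial}{\partial t} u(x,t)\,dx = \int_a^b c\kappa \frac{\partial^2}{\partial x^2} u(x,t)\,dx$. Equality holds for any interval, so we get the heat equation $\frac{\partial}{\partial t} u(x,t) = c\kappa \frac{\partial^2}{\partial x^2} u(x,t)$. In what follows the constants are normalized so that $c\kappa = \frac{1}{2}$, which is the choice consistent with the decay rate $e^{-2\pi^2 n^2 t}$ obtained below.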
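The derivation can be checked numerically. The following is a minimal sketch, not part of the original notes, that discretizes the postulates directly: the flux between neighbouring cells follows Fourier's law, and each cell's temperature changes by $c$ times inflow minus outflow. The function name `simulate_heat`, the grid size, the time step, and the values $c = 1$, $\kappa = 0.5$ are assumptions chosen for illustration. For the initial temperature $\cos(2\pi x)$, the equation $\partial_t u = c\kappa\,\partial_x^2 u$ has the exact solution $e^{-4\pi^2 c\kappa t}\cos(2\pi x)$, which the simulation should approximate.

```python
import math

def simulate_heat(initial, c, kappa, t_end, cells=50, dt=1e-4):
    """Explicit time stepping of the flux postulates on a periodic grid."""
    dx = 1.0 / cells
    u = [initial(i * dx) for i in range(cells)]
    steps = round(t_end / dt)
    for _ in range(steps):
        # Fourier's law: flux[i] is the flux through the interface
        # between cell i and cell i+1 (indices wrap around the circle).
        flux = [-kappa * (u[(i + 1) % cells] - u[i]) / dx for i in range(cells)]
        # Conservation plus proportionality: inflow from the left
        # interface minus outflow through the right interface.
        u = [u[i] + dt * c * (flux[i - 1] - flux[i]) / dx for i in range(cells)]
    return u

def initial(x):
    return math.cos(2 * math.pi * x)

c, kappa, t_end, cells = 1.0, 0.5, 0.05, 50  # illustrative values
u_numeric = simulate_heat(initial, c, kappa, t_end, cells=cells)
decay = math.exp(-4 * math.pi ** 2 * c * kappa * t_end)
u_exact = [decay * initial(i / cells) for i in range(cells)]
max_error = max(abs(p - q) for p, q in zip(u_numeric, u_exact))
print(max_error)
```

The time step is kept below $dx^2/(2c\kappa)$, the usual stability restriction for this explicit scheme; a larger step would make the iteration blow up.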
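Since $u(x,t)$ is periodic in the $x$ variable, Fourier’s idea is to look for solutions of the form $u(x,t) = \sum_{n=-\infty}^{+\infty} c_n(t) e^{2\pi i n x}$. It follows that $c_n(t) = \int_0^1 u(x,t) e^{-2\pi i n x}\,dx$. Taking $\frac{d}{dt}$ of $c_n(t)$, we get $c_n'(t) = \int_0^1 \frac{\partial}{\partial t} u(x,t)\, e^{-2\pi i n x}\,dx$.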
(It makes sense to assume that $u$ depends smoothly on $t$, so that we can interchange the derivative and the integral here by applying the mean value theorem.)
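By the derivative theorem, $\widehat{g'}(n) = 2\pi i n\,\hat{g}(n)$, and hence $\widehat{g''}(n) = (2\pi i n)^2 \hat{g}(n) = -4\pi^2 n^2 \hat{g}(n)$. Applying this to $u(x,t)$ in the $x$ variable, together with the heat equation $\frac{\partial}{\partial t} u = \frac{1}{2} \frac{\partial^2}{\partial x^2} u$, we get $c_n'(t) = -2\pi^2 n^2 c_n(t)$. Solving the ODE, we get $c_n(t) = c_n(0) e^{-2\pi^2 n^2 t}$. Here $c_n(0)$ is the Fourier coefficient of $u(x,0) = f(x)$, so we get $u(x,t) = \sum_{n=-\infty}^{+\infty} \hat{f}(n) e^{-2\pi^2 n^2 t} e^{2\pi i n x}$.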
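As an illustration of the series solution, here is a minimal sketch that approximates the Fourier coefficients by Riemann sums and sums the truncated series. The function names, the sample count and the truncation `n_max` are assumptions for illustration. For $f(x) = \cos(2\pi x)$ only $\hat{f}(\pm 1) = \frac{1}{2}$ are nonzero, so the series reduces to $e^{-2\pi^2 t}\cos(2\pi x)$, which the code compares against.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=512):
    """Riemann-sum approximation of the n-th Fourier coefficient of f."""
    return sum(f(k / samples) * cmath.exp(-2j * math.pi * n * k / samples)
               for k in range(samples)) / samples

def heat_solution(f, x, t, n_max=20):
    """Truncated series: sum over |n| <= n_max of f^(n) e^{-2 pi^2 n^2 t} e^{2 pi i n x}."""
    total = 0j
    for n in range(-n_max, n_max + 1):
        decay = math.exp(-2 * math.pi ** 2 * n ** 2 * t)
        total += fourier_coefficient(f, n) * decay * cmath.exp(2j * math.pi * n * x)
    return total.real  # f is real, so the imaginary part is rounding noise

def initial(x):
    return math.cos(2 * math.pi * x)

x, t = 0.3, 0.05
approx = heat_solution(initial, x, t)
exact = math.exp(-2 * math.pi ** 2 * t) * math.cos(2 * math.pi * x)
print(abs(approx - exact))
```

Each mode decays at the rate $e^{-2\pi^2 n^2 t}$, so high frequencies disappear almost immediately; this is why a small truncation such as `n_max=20` is already enough for moderate $t > 0$.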
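Writing back $\hat{f}(n) = \int_0^1 f(y) e^{-2\pi i n y}\,dy$, we get $u(x,t) = \sum_{n=-\infty}^{+\infty} \int_0^1 f(y) e^{-2\pi i n y} e^{2\pi i n x} e^{-2\pi^2 n^2 t}\,dy = \int_0^1 f(y) \sum_{n=-\infty}^{+\infty} e^{-2\pi^2 n^2 t} e^{2\pi i n (x-y)}\,dy$, provided we can interchange the integral and the summation. Let $g(x,t) = \sum_{n=-\infty}^{+\infty} e^{-2\pi^2 n^2 t} e^{2\pi i n x}$. For every $t > 0$ the series converges absolutely, hence uniformly in the $x$ variable: $\sum_n e^{-2\pi^2 n^2 t} < \infty$ by comparison with a geometric series, since $e^{-2\pi^2 n^2 t} \le (e^{-2\pi^2 t})^{|n|}$. It follows that $g(x,t)$ is a continuous function on the circle, and this also justifies the interchange of integral and summation.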
We have obtained a simpler form of the solution: $u(x,t) = \int_0^1 g(x-y,t) f(y)\,dy$. This type of integral is called a convolution, as we shall study next.
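The convolution form can be tested in the same spirit. The sketch below, with hypothetical helper names `heat_kernel` and `convolve_at`, evaluates the kernel $g(x,t)$ by pairing the terms $n$ and $-n$ into $2\cos(2\pi n x)$, truncates the sum at an assumed `n_max`, and approximates the integral by a Riemann sum. For $f(y) = \cos(2\pi y)$ the result should again be $e^{-2\pi^2 t}\cos(2\pi x)$.

```python
import math

def heat_kernel(x, t, n_max=30):
    """g(x,t) = 1 + 2 * sum_{n>=1} e^{-2 pi^2 n^2 t} cos(2 pi n x), truncated at n_max."""
    return 1.0 + 2.0 * sum(
        math.exp(-2 * math.pi ** 2 * n * n * t) * math.cos(2 * math.pi * n * x)
        for n in range(1, n_max + 1))

def convolve_at(g, f, x, samples=400):
    """Riemann-sum approximation of the integral of g(x - y) f(y) over [0, 1]."""
    return sum(g(x - k / samples) * f(k / samples) for k in range(samples)) / samples

def initial(y):
    return math.cos(2 * math.pi * y)

x, t = 0.3, 0.05
u_value = convolve_at(lambda z: heat_kernel(z, t), initial, x)
exact = math.exp(-2 * math.pi ** 2 * t) * math.cos(2 * math.pi * x)
print(abs(u_value - exact))
```

The truncation is only reasonable for $t > 0$ that is not too small; as $t \to 0$ more and more terms of the kernel contribute.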
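By the change of variable $y' = x - y$, we get $\int_0^1 f(x-y) g(y)\,dy = \int_x^{x-1} f(y') g(x-y')\,d(-y') = -\int_0^{-1} f(y') g(x-y')\,dy' = \int_0^1 g(x-y) f(y)\,dy$, where the last two equalities use that the integrand has period 1 in $y'$, so its integral over any interval of length 1 is the same. So we can also define convolution by $(f*g)(x) = \int_0^1 g(x-y) f(y)\,dy$.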
Our definition of convolution here is periodic convolution. Since convolution means taking an average, for functions of a general period $T$ we need to normalize by the period, so that $(f*g)(x) = \frac{1}{T}\int_0^T f(x-y) g(y)\,dy$. The standard notion of convolution is on the real line and plays an important role in Fourier transforms.
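The $N$-th Fourier partial sum $S_N(f)$ can be written as $f * D_N$. In fact, $S_N(f)(x) = \sum_{n=-N}^{N} \hat{f}(n) e^{2\pi i n x} = \sum_{n=-N}^{N} \int_0^1 f(y) e^{-2\pi i n y}\,dy \cdot e^{2\pi i n x} = \int_0^1 f(y) \sum_{n=-N}^{N} e^{2\pi i n (x-y)}\,dy = (f * D_N)(x)$, where $D_N(x) = \sum_{n=-N}^{N} e^{2\pi i n x}$ is the Dirichlet kernel on $\mathbb{T}$.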
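The identity $S_N(f) = f * D_N$ only exchanges a finite sum with an integral, so it survives discretization: if every integral is replaced by the same Riemann sum, both sides agree up to rounding. The following sketch checks this for a sawtooth function; the helper names and the sample count are assumptions for illustration. It also compares $D_N$ with the well-known closed form $\sin((2N+1)\pi x)/\sin(\pi x)$, valid when $x$ is not an integer.

```python
import cmath
import math

def dirichlet_kernel(x, N):
    """D_N(x) = sum_{|n| <= N} e^{2 pi i n x}, written with cosines."""
    return 1.0 + 2.0 * sum(math.cos(2 * math.pi * n * x) for n in range(1, N + 1))

def partial_sum(f, x, N, samples=256):
    """S_N(f)(x), with each Fourier coefficient approximated by a Riemann sum."""
    total = 0j
    for n in range(-N, N + 1):
        coeff = sum(f(k / samples) * cmath.exp(-2j * math.pi * n * k / samples)
                    for k in range(samples)) / samples
        total += coeff * cmath.exp(2j * math.pi * n * x)
    return total

def convolve_with_dirichlet(f, x, N, samples=256):
    """(f * D_N)(x), using the same Riemann sum for the integral."""
    return sum(f(k / samples) * dirichlet_kernel(x - k / samples, N)
               for k in range(samples)) / samples

def sawtooth(x):
    return x - math.floor(x)

x_point, N = 0.3, 5
s_value = partial_sum(sawtooth, x_point, N)
conv_value = convolve_with_dirichlet(sawtooth, x_point, N)
closed_form = math.sin((2 * N + 1) * math.pi * x_point) / math.sin(math.pi * x_point)
print(abs(s_value - conv_value), abs(dirichlet_kernel(x_point, N) - closed_form))
```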
From these first properties one may feel that convolution looks quite similar to multiplication. This intuition is justified by the fact that taking Fourier coefficients turns convolution into multiplication.
Let $f, g$ be integrable functions with period 1. Then $\widehat{f*g}(n) = \hat{f}(n)\,\hat{g}(n)$.
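Proof. By the definition of convolution and Fubini’s theorem, $\widehat{f*g}(n) = \int_0^1 (f*g)(x) e^{-2\pi i n x}\,dx = \int_x \int_y f(x-y) g(y)\,dy\; e^{-2\pi i n x}\,dx = \int_y \left( \int_x f(x-y) e^{-2\pi i n (x-y)}\,dx \right) e^{-2\pi i n y} g(y)\,dy = \hat{f}(n)\,\hat{g}(n)$, where the inner integral equals $\hat{f}(n)$ for every $y$ by periodicity.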
Fubini’s theorem applies here: the integrand satisfies $|f(x-y) g(y) e^{-2\pi i n x}| = |f(x-y) g(y)|$, and $\int_y \int_x |f(x-y)|\,|g(y)|\,dx\,dy = \int_y |g(y)| \int_x |f(x)|\,dx\,dy = \|f\|_{L^1} \|g\|_{L^1} < \infty$ because $f, g$ are both in $L^1$ on $[0,1]$.
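The same exchange of summation order gives a discrete version of this identity, which makes for a quick sanity check. In the sketch below, functions are sampled at `M` equally spaced points and every integral is replaced by a Riemann sum; the sample functions, the grid size and the helper names are assumptions for illustration. With this discretization the identity holds up to rounding error, not merely approximately.

```python
import cmath
import math

M = 64  # number of sample points on [0, 1); an arbitrary illustrative choice

def coefficient(values, n):
    """Riemann-sum Fourier coefficient of a function sampled at k/M."""
    return sum(v * cmath.exp(-2j * math.pi * n * k / M)
               for k, v in enumerate(values)) / M

def periodic_convolution(f_vals, g_vals):
    """Samples of (f * g)(j/M), with the integral replaced by a Riemann sum."""
    return [sum(f_vals[(j - k) % M] * g_vals[k] for k in range(M)) / M
            for j in range(M)]

# Two sample functions: a rect function and a tent-like function.
f_vals = [1.0 if k / M < 0.25 else 0.0 for k in range(M)]
g_vals = [abs(k / M - 0.5) for k in range(M)]

conv_vals = periodic_convolution(f_vals, g_vals)
errors = [abs(coefficient(conv_vals, n) - coefficient(f_vals, n) * coefficient(g_vals, n))
          for n in range(-5, 6)]
print(max(errors))
```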
Another good property of convolution is its regularity. The convolution is defined by integration, which is good because integrals usually have more regularity than their integrands. In other words, $f*g$ is “nicer” than the nicer one of $f$ and $g$; that is why convolution is widely used for approximation. As an example, we present the approximation of integrable functions in detail, to give you some feeling for this technique.
Proposition 5.1. Let $f, g$ be periodic functions with period 1. Then
If $f$ is continuous and $g$ is integrable, then $f*g$ is continuous.
Moreover, if $f$ is integrable and $g$ is bounded, then $f*g$ is also continuous.
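Proof. Let $h$ be a real number. Then $|f*g(x+h) - f*g(x)| = \left| \int_0^1 (f(x+h-y) - f(x-y))\, g(y)\,dy \right| \le \int_0^1 |f(x+h-y) - f(x-y)|\,|g(y)|\,dy$. Let $\epsilon > 0$. Since $f$ is continuous and periodic, it is uniformly continuous, so when $|h|$ is small enough, $|f(x'+h) - f(x')| < \epsilon$ for every $x' \in \mathbb{R}$. It follows that $|f(x+h-y) - f(x-y)| < \epsilon$ for every $y \in (0,1)$. Then the right hand side is $\le \epsilon \|g\|_{L^1}$, where $\|g\|_{L^1} := \int_0^1 |g(y)|\,dy < \infty$ because $g$ is integrable. This shows that $f*g$ is continuous. In fact, since the bound does not depend on $x$, $f*g$ is uniformly continuous in this case.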
To prove the second, stronger assertion we need an approximation lemma, which will be proved in problem set 1.
Lemma 5.2. Let $f$ be an integrable function on $[0,1]$. Then for every $\epsilon > 0$, there exists a continuous function $\tilde{f}$ such that $\|f - \tilde{f}\|_{L^1} < \epsilon$.
By the approximation lemma, choose a sequence of continuous functions $f_k$ such that $\|f - f_k\|_{L^1} \to 0$ when $k \to \infty$. Then $|f*g(x) - f_k*g(x)| = |(f - f_k)*g(x)| \le \int_0^1 |(f - f_k)(y)|\,|g(x-y)|\,dy \le B \|f - f_k\|_{L^1}$, where $0 < B < \infty$ is an upper bound of $|g|$. Since the right hand side goes to 0 when $k \to \infty$ and does not depend on $x$, we conclude that $f_k * g \to f*g$ uniformly. Each $f_k * g$ is continuous by the first assertion, because $f_k$ is continuous. Hence $f*g$ is a uniform limit of continuous functions, and is therefore continuous.
Definition 5.4. A rect function is a function of the form $\chi_{(a,b)}$, where $(a,b)$ is an interval. Here rect stands for rectangle: the graph of a rect function looks like a rectangle, hence the name.
Definition 5.5. A step function is a function of the form $\sum_{k=1}^{m} c_k \chi_{(a_k,b_k)}$, where the $c_k$ are real numbers and the $(a_k,b_k)$ are disjoint intervals. Again, the name explains its shape. In other words, a step function is a finite linear combination of rect functions of disjoint intervals.
Now let $f$ be a Riemann integrable function and let $\epsilon > 0$.
Step 0. By definition, there exists a step function $h$ such that $\|f - h\|_{L^1} < \epsilon$. In fact, this is how the Riemann integral is defined: the integral of a function is the limit of integrals of step functions, for those functions for which the limit makes sense.
Step 1. Approximate the step function by continuous functions. That is, find a continuous function $g$ such that $\|g - h\|_{L^1} < \epsilon$. Since a step function is a linear combination of rect functions, it suffices to do so for rect functions. I’ll walk you through the proof in problem set 1. The idea is to use convolution.
Step 2. Now we have $\|f - g\|_{L^1} \le \|f - h\|_{L^1} + \|g - h\|_{L^1} \le 2\epsilon$ by the triangle inequality. We are done.
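To get a feeling for Step 1, here is a minimal numerical sketch of one possible route, which is not necessarily the one taken in the problem set. It convolves a rect function $\chi_{(a,b)}$ with the normalized narrow rect $\frac{1}{2\delta}\chi_{(-\delta,\delta)}$; the result is the length of $(a,b) \cap (x-\delta, x+\delta)$ divided by $2\delta$, a continuous trapezoid. A direct computation suggests that its $L^1$ distance to $\chi_{(a,b)}$ is exactly $\delta$ when $2\delta < b - a$. The names `rect`, `smoothed_rect`, `l1_distance` and the numbers $a = 0.3$, $b = 0.7$, $\delta = 0.01$ are assumptions for illustration, chosen so that nothing wraps around the circle.

```python
def rect(x, a, b):
    """Indicator function of the interval (a, b)."""
    return 1.0 if a < x < b else 0.0

def smoothed_rect(x, a, b, delta):
    """Convolution of rect with (1 / (2 delta)) * indicator of (-delta, delta).

    Equals the length of the overlap of (a, b) with (x - delta, x + delta),
    divided by 2 * delta; assumes 0 < a - delta and b + delta < 1.
    """
    overlap = min(b, x + delta) - max(a, x - delta)
    return max(0.0, overlap) / (2 * delta)

def l1_distance(u, v, samples=20000):
    """Midpoint-rule approximation of the integral of |u - v| over [0, 1]."""
    return sum(abs(u((k + 0.5) / samples) - v((k + 0.5) / samples))
               for k in range(samples)) / samples

a, b, delta = 0.3, 0.7, 0.01
error = l1_distance(lambda x: rect(x, a, b),
                    lambda x: smoothed_rect(x, a, b, delta))
print(error)
```

Shrinking $\delta$ makes the $L^1$ error as small as desired while the approximating function stays continuous, which is exactly what Step 1 asks for.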
Note that the approximation lemma also holds for the more general Lebesgue integral, and the proof is similar: integrals are designed to be approximated by simple functions.