Chapter 4
Heat Equation

In this chapter we investigate the heat equation

\[ \dot u - \Delta u = 0 \]

and the corresponding inhomogeneous variant

\[ \dot u - \Delta u = f. \]

The unknown function $u$ is defined on an open domain $\Omega \times (0,T) \subset \mathbb{R}^n \times \mathbb{R}$. We shall extend some statements about harmonic functions to solutions of the heat equation, but also try to understand the important differences.

The heat equation describes a diffusion process. This means a time-like evolution of space-like distributed quantities like heat or chemical concentration, or even probability. Let us provide a short justification of the equation as a model of heat. We have seen for the mean value property that the Laplacian measures the difference of a function from its mean value: for small $r$, from the proof of Theorem 3.5 we have $\mathcal{S}'(r) \approx n^{-1} r \Delta u(x)$, which implies $\mathcal{S}(r) - u(x) \approx \tfrac{1}{2n} r^2 \Delta u(x)$. If the temperature $u$ at $x$ is cooler than the points around it, then $\dot u$ should be positive, and vice versa if $u$ is hotter. Moreover we have seen from the general conservation law (with $F = -\nabla u$) that the quantity $u$ is preserved by the heat equation (under appropriate assumptions). The simplicity of the equation together with these properties makes it a useful model to study. There is no widely agreed upon name for solutions to the homogeneous heat equation, similar to harmonic functions for the Laplace equation, though some books use the term caloric. A previous class suggested calling them flames, similar to how solutions of the wave equation are waves, which I find cute.

There are two boundary value problems that we will examine in particular. The first is the initial value problem on $\mathbb{R}^n \times (0,T)$

\[ \dot u - \Delta u = f \text{ on } \mathbb{R}^n \times (0,T), \qquad u(x,0) = h(x) \text{ on } \mathbb{R}^n. \]

This is sometimes called the Cauchy problem. It purports to model how the temperature within an infinitely large body changes given the initial temperature $h$ at every point. The inhomogeneous term $f$ represents the infusion or removal of heat at points within the body. The second problem applies to a bounded spatial domain $\Omega$

\[ \dot u - \Delta u = f \text{ on } \Omega \times (0,T), \qquad u = g \text{ on } \partial\Omega \times [0,T], \qquad u(x,0) = h(x) \text{ on } \Omega. \]

This problem is called the Dirichlet problem, in analogy to the corresponding problem for the Laplace equation. This models the temperature within a finite body but where additionally the temperature of the boundary is also controlled (specified by g). In both problems any solution should at least extend continuously to the boundary, so that the boundary conditions are meaningful.

Before we begin to develop the theory that we will use, let’s study some monstrous examples, to show us what to be wary of. The first shows the importance of the negative sign in the heat equation. We give an illustration that the heat equation is not time symmetric in the way that many models in physics are (at least conceptually) and that the ‘reverse time’ problem is not well-posed. Consider $n = 1$ and for any integer $m$ define the function

\[ u_m(x,t) = e^{m^2(T-t)} \sin mx. \]

They have the property that $\dot u_m = -m^2 u_m$ as well as $\partial_x^2 u_m = -m^2 u_m$. Therefore they all solve the homogeneous heat equation with ‘terminal’ condition $u_m(x,T) = \sin mx$. This example can even be applied to a Dirichlet-type problem. Consider the spatial domain $\Omega = (0, 2\pi)$ with the boundary values $g \equiv 0$. Because $m$ is an integer, all these functions satisfy this boundary condition. Even though the terminal conditions are smooth and uniformly bounded by 1, the solutions at any time $t < T$ can still be arbitrarily large:

\[ \sup_x |u_m(x,t)| = e^{m^2(T-t)}. \]

This is one reason to only study the forward time Dirichlet problem.
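To get a feeling for how violent this growth is, here is a minimal Python sketch that evaluates the functions $u_m$ from the example. The function names `u_m` and `amplification` are ours and purely illustrative.

```python
import math


def u_m(m: int, x: float, t: float, T: float) -> float:
    """Backward-in-time example u_m(x,t) = exp(m^2 (T - t)) sin(m x)."""
    return math.exp(m * m * (T - t)) * math.sin(m * x)


def amplification(m: int, t: float, T: float) -> float:
    """sup_x |u_m(x,t)| relative to the terminal data, whose supremum is 1."""
    return math.exp(m * m * (T - t))
```

Even a short step back in time, such as `amplification(10, 0.9, 1.0)`, gives a factor of about $e^{10}$, although the terminal data $\sin mx$ never exceeds 1 in absolute value.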

Similarly for the Cauchy problem introduced above, there is also the possibility of rapidly growing solutions. Again for n = 1, we make the ansatz

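\[ u(x,t) = \sum_{l=0}^\infty g_l(t) x^l, \qquad \dot u(x,t) - \Delta u(x,t) = \sum_{l=0}^\infty \big(\dot g_l(t) - (l+2)(l+1) g_{l+2}(t)\big) x^l. \]

Thus if $u$ solves the heat equation then we must have a recursion relation between $g_l$ and $g_{l+2}$, namely $\dot g_l = (l+2)(l+1) g_{l+2}$. For a given function $g_0(t) = g(t)$ and setting $g_1(t) \equiv 0$ we thus obtain the following formal solution of the homogeneous heat equation:

\[ u(x,t) = \sum_{l=0}^\infty \frac{g^{(l)}(t)}{(2l)!} x^{2l}. \]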

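We now show that for $g(t) = \exp(-t^{-2})$ this power series indeed converges to a smooth solution and further that on every compact subset of $\mathbb{R}$ the solution converges uniformly to zero as $t \downarrow 0$. We first express $g^{(l)}(t)$ for any $l \in \mathbb{N}_0$ through a real polynomial $p_l$ of degree $l$ by the relation

\[ g^{(l)}(t) = t^{-l} p_l(t^{-2}) \exp(-t^{-2}) \quad\text{with}\quad p_{l+1}(z) = 2z\,p_l(z) - l\,p_l(z) - 2z\,p_l'(z). \]

This recursion relation for $p_l$ follows by differentiating with respect to $t$. The first two polynomials are $p_0(z) = 1$ and $p_1(z) = 2z$. We claim that the absolute value of the coefficient of $z^k$ in $p_l(z)$ is bounded by $\frac{l!\,7^l}{2^k k!}$. For $l = 0$, $k = 0$ this is clear. By induction we obtain with $k \le l + 1$

\[ 2\,\frac{l!\,7^l}{2^{k-1}(k-1)!} + l\,\frac{l!\,7^l}{2^k k!} + 2k\,\frac{l!\,7^l}{2^k k!} = \frac{l!\,7^l (4k + l + 2k)}{2^k k!} \le \frac{l!\,7^l\,7(l+1)}{2^k k!} = \frac{(l+1)!\,7^{l+1}}{2^k k!}. \]

This proves the claim. Using the inequalities $\frac{l!}{(2l)!} = \frac{1}{2^l \cdot 1 \cdot 3 \cdots (2l-1)} \le \frac{1}{2^l\, l!}$ we conclude for $t > 0$

\[ |u(x,t)| \le \sum_{l=0}^\infty \frac{l!\,7^l x^{2l}}{(2l)!\,t^l} \sum_{k=0}^{l} \frac{g(t)}{2^k k!\,t^{2k}} \le \sum_{l=0}^\infty \frac{1}{l!}\Big(\frac{7x^2}{2t}\Big)^l \sum_{k=0}^\infty \frac{g(t)}{k!}\Big(\frac{1}{2t^2}\Big)^k = \exp\Big(\frac{7x^2}{2t} - \frac{1}{2t^2}\Big). \]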

Therefore the series converges absolutely and, as $t \downarrow 0$, uniformly on compact sets to 0. This means that we can extend $u$ smoothly to $t \le 0$ by giving it the value 0. Hence the Cauchy problem with initial value $h \equiv 0$ has a non-zero solution: the space is the same temperature everywhere and suddenly wild temperature fluctuations begin. Even though it seems as if the Cauchy problem should be well-posed, additional constraints will be required.
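The recursion $p_{l+1}(z) = 2z\,p_l(z) - l\,p_l(z) - 2z\,p_l'(z)$ and the coefficient bound $l!\,7^l/(2^k k!)$ from this example can be checked by machine for small $l$. The following Python sketch does this in exact arithmetic. The helper names `next_poly`, `polys` and `bound_holds` are ours, and a finite check is only a sanity test, not a replacement for the induction.

```python
from fractions import Fraction
from math import factorial


def next_poly(p: list[int], l: int) -> list[int]:
    """Coefficients (lowest degree first) of p_{l+1} = 2 z p_l - l p_l - 2 z p_l'."""
    q = [0] * (len(p) + 1)
    for k, a in enumerate(p):
        q[k + 1] += 2 * a          # contribution of 2 z p_l(z)
        q[k] -= (l + 2 * k) * a    # contribution of -l p_l(z) - 2 z p_l'(z)
    return q


def polys(l_max: int) -> list[list[int]]:
    """Return [p_0, p_1, ..., p_{l_max}], starting from p_0(z) = 1."""
    ps = [[1]]
    for l in range(l_max):
        ps.append(next_poly(ps[-1], l))
    return ps


def bound_holds(l_max: int) -> bool:
    """Check |coefficient of z^k in p_l| <= l! 7^l / (2^k k!) for all l <= l_max."""
    for l, p in enumerate(polys(l_max)):
        for k, a in enumerate(p):
            if abs(a) > Fraction(factorial(l) * 7**l, 2**k * factorial(k)):
                return False
    return True
```

For instance `polys(2)` ends with the coefficient list of $p_2(z) = -6z + 4z^2$, matching a direct computation of $g''$.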

4.1 Spectral Theory and the Fourier Transform

Let us give some motivation for introducing spectral theory, which is the theory of the eigenvalues of the operator $-\Delta$. Let us look for ‘separable’ solutions of the homogeneous heat equation. They are solutions that neatly factorise as $u(x,t) = \varphi(t)h(x)$. These solve the heat equation if

\[ \dot\varphi(t) h(x) - \varphi(t) \Delta h(x) = 0 \quad\Leftrightarrow\quad -\frac{\dot\varphi(t)}{\varphi(t)} = -\frac{\Delta h(x)}{h(x)}. \]

Because the left is a function of $t$ and the right is a function of $x$, the only way that this is possible is if the two sides are equal to some constant $\lambda$. This means that $h$ is an eigenfunction of the (negative) Laplace operator:

\[ -\Delta h = \lambda h \text{ on } \Omega, \]

and $\dot\varphi = -\lambda\varphi$. The factorisation is only determined up to a scaling, so we set $\varphi(0) = 1$. Thus $\varphi(t) = e^{-\lambda t}$ and $u$ has the initial value $u(x,0) = h(x)$.

Turning this around, if we are given an initial value problem where $h$ is an eigenfunction of the Laplacian, then this method gives a solution. More generally, if the initial condition is a linear combination of eigenfunctions then a linear combination of separable solutions solves the problem. The question now arises: can every function be written as a linear combination of eigenfunctions in some suitable sense?
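As a small illustration of superposing separable solutions, the sketch below takes $n = 1$ and the eigenfunctions $\sin mx$ of $-\partial_x^2$ with eigenvalue $\lambda = m^2$. This choice, the names `separable_sum` and `heat_residual`, and the finite-difference step are assumptions made for the example. It checks numerically that the heat operator applied to a finite linear combination is close to zero.

```python
import math


def separable_sum(coeffs: dict[int, float], x: float, t: float) -> float:
    """u(x,t) = sum_m c_m exp(-m^2 t) sin(m x), a sum of separable solutions."""
    return sum(c * math.exp(-m * m * t) * math.sin(m * x) for m, c in coeffs.items())


def heat_residual(u, x: float, t: float, step: float = 1e-3) -> float:
    """Central-difference approximation of u_t - u_xx at the point (x, t)."""
    u_t = (u(x, t + step) - u(x, t - step)) / (2 * step)
    u_xx = (u(x + step, t) - 2 * u(x, t) + u(x - step, t)) / step**2
    return u_t - u_xx
```

With `coeffs = {1: 1.0, 2: -0.5, 3: 0.25}` the residual at an interior point is small, up to the discretisation error of the difference quotients, and at $t = 0$ the sum reproduces the chosen combination of eigenfunctions.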

What are the eigenfunctions of $-\Delta$? The trigonometric functions provide many examples for every $\lambda > 0$:

\[ -\Delta e^{2\pi i k\cdot x} = 4\pi^2 |k|^2 e^{2\pi i k\cdot x}. \]

The drawback of these functions is that they are not integrable on $\mathbb{R}^n$ because they have modulus 1 at every point. But in a limiting sense they are all orthogonal to one another in the $L^2$ inner product: for $k_1 \neq k_2$

\[ \langle e^{2\pi i k_1\cdot x}, e^{2\pi i k_2\cdot x} \rangle = \int_{\mathbb{R}^n} e^{2\pi i k_1\cdot x}\, \overline{e^{2\pi i k_2\cdot x}}\, d^nx = \int_{\mathbb{R}^n} e^{2\pi i (k_1 - k_2)\cdot x}\, d^nx = 0 \]

because the integrand is periodic and the integral over a single period is zero. This leads us to define the Fourier transform as the coefficients of the orthogonal projection of a function onto these functions, in the sense that for a finite dimensional inner product space $h = \sum_i \langle h, e_i \rangle e_i$ for an orthonormal basis $\{e_i\}$.

Definition 4.1. The Fourier transform of a function $h : \mathbb{R}^n \to \mathbb{C}$ is defined to be

\[ \hat h(k) = \mathcal{F}[h](k) := \int_{\mathbb{R}^n} e^{-2\pi i k\cdot x} h(x)\, d^nx. \]

Be aware: there are several definitions of Fourier transform that differ by a constant scaling and a scaling of k. Always check which is being used.

When one learns Fourier analysis in detail, a major theme is under what conditions this definition makes sense, how it can be extended to other classes of functions, and which of the important properties are retained for these extensions. For example, a basic result that we will soon prove is that if the function $h \in L^1(\mathbb{R}^n)$ then its Fourier transform is continuous and bounded.

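Let us compute the Fourier transform for an important example: the Gaussian curve $e^{-|\pi x|^2}$. We begin

\[ \int_{\mathbb{R}^n} e^{-2\pi i k\cdot x} e^{-|\pi x|^2}\, d^nx = \int_{\mathbb{R}^n} e^{-|k|^2 + |k|^2 - 2ik\cdot(\pi x) - |\pi x|^2}\, d^nx = \int_{\mathbb{R}^n} e^{-|k|^2 - (ik + \pi x)\cdot(ik + \pi x)}\, d^nx = e^{-|k|^2} \int_{\mathbb{R}^n} e^{-(ik + \pi x)\cdot(ik + \pi x)}\, d^nx = \pi^{-n} e^{-|k|^2} \int_{ik + \mathbb{R}^n} e^{-y\cdot y}\, d^ny. \]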

To finish we need to compute the value of the final integral. It is so famous that it has its own name, ‘the Gaussian integral’. Its value is $\pi^{n/2}$. Several methods to compute this will be explored in the tutorial. By rescaling we also have the Fourier transforms for other Gaussians. In conclusion, for $a > 0$,

\[ \mathcal{F}\big[e^{-a|x|^2}\big](k) = \Big(\frac{\pi}{a}\Big)^{n/2} e^{-\frac{1}{a}|\pi k|^2}. \]
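The closed formula can be compared with a direct numerical evaluation of the defining integral. The sketch below does this for $n = 1$ with a midpoint rule on a truncated interval. The function names, the truncation width and the number of steps are assumptions chosen for illustration.

```python
import cmath
import math


def fourier_transform_1d(h, k: float, half_width: float = 12.0, n: int = 24000) -> complex:
    """Midpoint-rule approximation of the integral of exp(-2 pi i k x) h(x) over [-half_width, half_width]."""
    dx = 2 * half_width / n
    total = 0j
    for j in range(n):
        x = -half_width + (j + 0.5) * dx
        total += cmath.exp(-2j * math.pi * k * x) * h(x)
    return total * dx


def gaussian_transform_exact(a: float, k: float) -> float:
    """Closed form (pi/a)^(1/2) exp(-(pi k)^2 / a) for n = 1."""
    return math.sqrt(math.pi / a) * math.exp(-((math.pi * k) ** 2) / a)
```

For example, `fourier_transform_1d(lambda x: math.exp(-2.0 * x * x), 0.7)` should agree with `gaussian_transform_exact(2.0, 0.7)` to many digits, because the integrand is smooth and decays rapidly.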

One obvious class of functions that can be Fourier transformed is the test functions because they have compact support. But this turns out to be a little too restrictive. Instead we consider functions that decay rapidly at infinity.

Definition 4.2. The Schwartz space $\mathcal{S}$ contains all smooth complex valued functions $f$ on $\mathbb{R}^n$ for which $\rho_{l,\alpha}(f) := \sup_{x \in \mathbb{R}^n} |x|^{2l} |\partial^\alpha f(x)|$ is finite for all $l \in \mathbb{N}_0$ and all $\alpha \in \mathbb{N}_0^n$.

There are other equivalent definitions in the literature. A common alternative is to use $(1 + |x|^2)^l$ instead of $|x|^{2l}$. One characterisation of $\mathcal{S}$ is that it is the largest subspace of integrable functions that is closed under differentiation and multiplication with polynomials. The following lemma however is perhaps the more important justification for considering this space.

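Lemma 4.3. The Fourier transformation maps $\mathcal{S}$ onto $\mathcal{S}$. For any function $h \in \mathcal{S}$ and $\hat h = \mathcal{F}[h]$ we have

\[ \mathcal{F}[\partial_j h](k) = 2\pi i k_j\, \hat h(k), \quad\text{and}\quad \mathcal{F}[-2\pi i x_j h](k) = \partial_j \hat h(k). \]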

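Proof. If we simply take the absolute value of the definition of the Fourier transform we get $|\hat h(k)| \le \int_{\mathbb{R}^n} |h(y)|\, d^ny = \|h\|_{L^1(\mathbb{R}^n)}$. Any $h \in C_0^\infty(\mathbb{R}^n, \mathbb{C})$ certainly has finite $L^1$-norm and by taking the supremum over $k$ we obtain

\[ \|\hat h\|_\infty \le \|h\|_{L^1(\mathbb{R}^n)}. \]

This shows that $\mathcal{F}$ is a continuous linear operator from $C_0^\infty(\mathbb{R}^n, \mathbb{C})$ with the $L^1$-norm to $C_b(\mathbb{R}^n, \mathbb{C})$ with the supremum norm. Since $C_0^\infty(\mathbb{R}^n, \mathbb{C})$ is dense in $L^1(\mathbb{R}^n)$, the Fourier transform extends to a continuous linear map from $L^1(\mathbb{R}^n)$ into the Banach space $C_b(\mathbb{R}^n, \mathbb{C})$, as we claimed above.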

But let us return to Schwartz functions and prove what is stated in the lemma. By integration by parts

\[ \mathcal{F}[\partial_j h](k) = -\int_{\mathbb{R}^n} \frac{\partial}{\partial x_j}\big(e^{-2\pi i k\cdot x}\big) h(x)\, d^nx = -\int_{\mathbb{R}^n} (-2\pi i k_j) e^{-2\pi i k\cdot x} h(x)\, d^nx = 2\pi i k_j\, \hat h(k). \]

To make this calculation rigorous, one should integrate by parts on a large cube $[-R,R]^n$. The decay properties of $h$ ensure that the boundary terms vanish in the limit $R \to \infty$. Applying this formula with higher derivatives gives a polynomial in $k$ on the right. Turning this relation around proves that any polynomial times $\hat h$ is the Fourier transform of a Schwartz function and thus bounded.

Similarly we can differentiate ĥ:

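\[ \partial_{k_j} \hat h = \int_{\mathbb{R}^n} \frac{\partial}{\partial k_j}\big(e^{-2\pi i k\cdot x}\big) h(x)\, d^nx = \int_{\mathbb{R}^n} e^{-2\pi i k\cdot x} \big(-2\pi i x_j h(x)\big)\, d^nx = \mathcal{F}[-2\pi i x_j h(x)](k). \]

This is justified by the estimate

\[ |\partial_j \hat h(k)| = \Big| \int_{\mathbb{R}^n} -2\pi i x_j e^{-2\pi i k\cdot x} h(x)\, d^nx \Big| \le 2\pi\, \big\| |x| h(x) \big\|_{L^1(\mathbb{R}^n)}. \]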

Because h decays faster than any power of |x| the right hand side is bounded. Repeatedly applying this differentiation formula shows that ĥ is smooth. The combination of the differentiation and polynomial rules for the Fourier transform therefore proves that ĥ is Schwartz. □

The property of transforming derivatives into polynomials is what makes the Fourier transform a useful tool in solving ODEs and PDEs. Let’s see how it applies to the heat equation. The Fourier transform of the Laplacian is $\mathcal{F}[\Delta u] = (2\pi i)^2 |k|^2 \hat u = -4\pi^2 |k|^2 \hat u$, where we only Fourier transform the space variables and leave $t$ out from the integral. Under sufficient regularity assumptions a solution to the heat equation obeys

\[ \mathcal{F}[\partial_t u] + 4\pi^2 |k|^2 \hat u = \partial_t \hat u + 4\pi^2 |k|^2 \hat u = 0 \]

by interchanging $\partial_t$ and the integration. For each value of $k$ this is an ODE for $\hat u(k,t)$ in the variable $t$. We even get initial conditions by applying the Fourier transform to the initial condition of the PDE: $\hat u(k,0) = \hat h(k)$. It has the solution

\[ \hat u(k,t) = e^{-4\pi^2 |k|^2 t}\, \hat u(k,0) = e^{-4\pi^2 |k|^2 t}\, \hat h(k). \]

So if we are able to find a function that has this as its Fourier transform, we have solved the heat equation. For this we need to understand how the Fourier transform behaves with respect to products and convolutions.

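Lemma 4.4. Let $u, v \in \mathcal{S}$. Then $\mathcal{F}[u * v] = \hat u\, \hat v$ and $\mathcal{F}[uv] = \hat u * \hat v$.

Proof. This follows by direct calculation.

\[ \mathcal{F}[u * v](k) = \int e^{-2\pi i k\cdot x} \Big( \int u(x - y) v(y)\, d^ny \Big) d^nx = \int \Big( \int e^{-2\pi i k\cdot x} u(x - y)\, d^nx \Big) v(y)\, d^ny = \int e^{-2\pi i k\cdot y} \Big( \int e^{-2\pi i k\cdot z} u(z)\, d^nz \Big) v(y)\, d^ny = \int e^{-2\pi i k\cdot z} u(z)\, d^nz \int e^{-2\pi i k\cdot y} v(y)\, d^ny = \hat u(k)\, \hat v(k). \]

The second half of the lemma is an easy consequence of the first half together with the inverse Fourier transform, which is given after Theorem 4.7. We really only need the first half of the lemma, but it is much prettier to present the two results side-by-side. □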

Because of our earlier example, we know that

\[ \mathcal{F}\Big[ \frac{1}{(4\pi t)^{n/2}} e^{-\frac{|x|^2}{4t}} \Big] = e^{-4\pi^2 |k|^2 t}. \]

Therefore we can conclude that

\[ u(x,t) = \frac{1}{(4\pi t)^{n/2}} e^{-\frac{|x|^2}{4t}} *_x h(x) \]

is a solution to the heat equation with initial condition $u(x,0) = h(x)$, where the convolution is only taken over the spatial variables.

Our derivation of the solution has assumed that the functions in question have sufficient regularity such that we were able to interchange the order of integration or differentiate under the integral sign as needed. In the next section we will take the formula for the solution that we have derived and prove directly, under weaker assumptions on h, that it solves the Cauchy problem.

4.2 Fundamental Solution

Our method of the previous section to solve the homogeneous heat equation through a Fourier transform uncovered a particular Gaussian function. It turns out to be a fundamental solution for the heat equation that is well-suited to the case t > 0, which holds for both problems we are interested in.

Definition 4.5. The fundamental solution of the heat equation is defined as

\[ \Phi(x,t) = \begin{cases} \dfrac{1}{(4\pi t)^{n/2}} e^{-\frac{|x|^2}{4t}} & \text{for } x \in \mathbb{R}^n,\ t > 0 \\ 0 & \text{for } x \in \mathbb{R}^n,\ t \le 0. \end{cases} \]

For $t \neq 0$ one can check that this solves the homogeneous heat equation by direct calculation (Exercise). For $x \neq 0$ we also know that $t \mapsto \Phi(x,t)$ is a smooth function, so in fact $\Phi$ solves the heat equation in the strong sense everywhere except $(0,0)$. We will show that $(\partial_t - \Delta)\Phi = \delta$ soon. Similar to the fundamental solution of the Laplace equation, this fundamental solution has the scaling property $\Phi(ax, a^2 t) = a^{-n} \Phi(x,t)$ for $a > 0$. You may be wondering if the odd scaling factor for $\Phi$ is meaningful. It is, as the following lemma shows.

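Lemma 4.6. For all $t > 0$ the fundamental solution satisfies $\int_{\mathbb{R}^n} \Phi(x,t)\, d^nx = 1$.

Proof. $\frac{1}{(4\pi t)^{n/2}} \int_{\mathbb{R}^n} e^{-\frac{|x|^2}{4t}}\, d^nx = \frac{1}{\pi^{n/2}} \int_{\mathbb{R}^n} e^{-|x|^2}\, d^nx = \frac{1}{\pi^{n/2}} \Big( \int_{\mathbb{R}} e^{-x^2}\, dx \Big)^n = 1$. □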
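Both the normalisation of Lemma 4.6 and the scaling property $\Phi(ax, a^2t) = a^{-n}\Phi(x,t)$ are easy to test numerically. The following Python sketch does so. The names `phi` and `mass_1d`, and the truncation of the integral at twelve standard deviations, are choices made for this illustration.

```python
import math


def phi(x: tuple[float, ...], t: float) -> float:
    """Fundamental solution on R^n with n = len(x); zero for t <= 0."""
    if t <= 0:
        return 0.0
    n = len(x)
    r2 = sum(xi * xi for xi in x)
    return (4 * math.pi * t) ** (-n / 2) * math.exp(-r2 / (4 * t))


def mass_1d(t: float, n_steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of Phi(x,t) over R for n = 1."""
    half_width = 12 * math.sqrt(2 * t)  # the tails beyond this are negligible
    dx = 2 * half_width / n_steps
    return sum(phi((-half_width + (j + 0.5) * dx,), t) for j in range(n_steps)) * dx
```

The value of `mass_1d(t)` stays close to 1 for small and large $t$ alike, even though the peak height $(4\pi t)^{-1/2}$ changes considerably.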

We can therefore understand the fundamental solution as being similar to a mollifier on $\mathbb{R}^n$. As $t \downarrow 0$ the function grows and concentrates near the origin. It is not a mollifier because it does not have compact support, but it does lie in $\mathcal{S}$ and we should expect that the convolution with $\Phi$ converges in the limit $t \downarrow 0$ to the identity. This is the content of the following theorem. This theorem also gives a solution to the Cauchy problem for the homogeneous heat equation under the assumption that the initial condition is continuous and bounded.

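Theorem 4.7. For $h \in C_b(\mathbb{R}^n, \mathbb{R})$ the following function $u$ has the properties (i)-(iii):

\[ u(x,t) = \int_{\mathbb{R}^n} \Phi(x - y, t) h(y)\, d^ny \]

(i) $u \in C^\infty(\mathbb{R}^n \times \mathbb{R}^+)$

(ii) $\dot u - \Delta u = 0$ on $\mathbb{R}^n \times \mathbb{R}^+$

(iii) $u$ extends continuously to $\mathbb{R}^n \times [0,\infty)$ with $\lim_{t \downarrow 0} u(x,t) = h(x)$.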

Proof. For $t > 0$, by the smoothness of $\Phi$ and the boundedness of $h$, the function is well-defined and we can pass derivatives into the integral. This shows that $u$ is smooth. Likewise (ii) follows, since $\Phi$ solves the heat equation on $\mathbb{R}^n \times \mathbb{R}^+$.

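The harder argument is (iii). For any $\epsilon > 0$ and any $x$ in a compact subset of $\mathbb{R}^n$ there exists $\delta > 0$, such that $|h(x) - h(y)| < \epsilon$ for all $|x - y| < \delta$ (continuity implies uniform continuity on any compact subset). Furthermore there exists $T > 0$, such that

\[ \int_{\mathbb{R}^n \setminus B(0,\delta)} \Phi(y,t)\, d^ny = \int_{\mathbb{R}^n \setminus B(0,\delta/\sqrt{t})} \Phi(z,1)\, d^nz < \epsilon \quad\text{for all } 0 < t < T. \]

Together with Lemma 4.6 this implies

\[ |u(x,t) - h(x)| = \Big| \int_{\mathbb{R}^n} \Phi(x - y, t)\big(h(y) - h(x)\big)\, d^ny \Big| \le \int_{B(x,\delta)} \Phi(x - y, t) |h(y) - h(x)|\, d^ny + \int_{\mathbb{R}^n \setminus B(x,\delta)} \Phi(x - y, t) |h(y) - h(x)|\, d^ny \le \epsilon + 2\epsilon \sup\{ |h(y)| : y \in \mathbb{R}^n \} \]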

for all $0 < t < T$. So $u(x,t)$ converges in the limit $t \downarrow 0$ uniformly on compact subsets of $\mathbb{R}^n$ to $h$. □
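The statement of Theorem 4.7 can be observed numerically for a concrete bounded initial condition. As an assumption for this sketch we take $n = 1$ and $h(y) = \cos y$, for which the convolution equals $e^{-t}\cos x$ (this follows from the Fourier transform of $\Phi$ computed earlier). The names `phi_1d` and `heat_convolution` and the quadrature parameters are illustrative.

```python
import math


def phi_1d(z: float, t: float) -> float:
    """Fundamental solution for n = 1 and t > 0."""
    return (4 * math.pi * t) ** -0.5 * math.exp(-z * z / (4 * t))


def heat_convolution(h, x: float, t: float, n_steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of Phi(x - y, t) h(y) dy for t > 0."""
    half_width = 12 * math.sqrt(2 * t)  # Phi is negligible beyond this distance
    dz = 2 * half_width / n_steps
    total = 0.0
    for j in range(n_steps):
        z = -half_width + (j + 0.5) * dz  # substitution z = y - x; Phi is even in z
        total += phi_1d(z, t) * h(x + z)
    return total * dz
```

Evaluating `heat_convolution(math.cos, 0.3, t)` for $t = 10^{-1}, 10^{-2}, 10^{-3}$ shows the values approaching $\cos 0.3$, in line with property (iii).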

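Part (iii) of this theorem is also an important lemma in Fourier analysis, because it leads to an explicit formula for the inverse of the Fourier transform. Suppose that $u, v \in \mathcal{S}$. We compute the following integral parameterised in $x$

\[ \int \hat u(k) v(k) e^{2\pi i k\cdot x}\, d^nk = \int \Big( \int u(y) e^{-2\pi i k\cdot(y - x)}\, d^ny \Big) v(k)\, d^nk = \int \Big( \int u(z + x) e^{-2\pi i k\cdot z}\, d^nz \Big) v(k)\, d^nk = \int u(z + x) \Big( \int v(k) e^{-2\pi i k\cdot z}\, d^nk \Big) d^nz = \int u(z + x)\, \hat v(z)\, d^nz. \]

The trick is to now choose $\hat v$ to be the fundamental solution $\Phi(\cdot, \epsilon)$, which by the Gaussian example means $v(k) = e^{-4\pi^2 |k|^2 \epsilon}$. This gives

\[ \int \hat u(k) e^{-4\pi^2 |k|^2 \epsilon} e^{2\pi i k\cdot x}\, d^nk = \int u(z + x) \Phi(z, \epsilon)\, d^nz = \int u(y) \Phi(y - x, \epsilon)\, d^ny. \]

Taking the limit as $\epsilon \downarrow 0$ and applying Theorem 4.7(iii) on the right hand side proves

\[ \int \hat u(k) e^{2\pi i k\cdot x}\, d^nk = u(x). \]

To summarise, the inverse Fourier transform is

\[ \mathcal{F}^{-1}[u](x) = \int u(k) e^{2\pi i k\cdot x}\, d^nk = \mathcal{F}[u](-x). \]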

The fact that the Fourier transform and its inverse differ only by a sign in the exponent of the exponential is the reason that it has so many ‘dual’ properties, such as for multiplication and convolution, or for differentiation and multiplication by polynomials.

The equation above for $u$ and $v$ is also the important step to extend the Fourier transform to (some) distributions. When $x = 0$ we have

\[ \int \hat u(k) v(k)\, d^nk = \int u(z)\, \hat v(z)\, d^nz. \]

If this was written in the notation of distributions it would be $F_{\hat u}(v) = F_u(\hat v)$. This seems as if it would be a suitable definition of the Fourier transform of a distribution. However, even if $v$ is a test function, we can’t be sure that $\hat v$ is a test function, only that it is Schwartz, and thus $F(\hat v)$ is not defined for all distributions $F$.

Unfortunately there is no way to fix this. Instead we must restrict ourselves to consider only distributions that can act on Schwartz functions. But what does this mean? First we recognise that the $\rho_{l,\alpha}$ from Definition 4.2 of $\mathcal{S}$ constitute a family of seminorms for Schwartz space. Further the inclusion of the space of test functions $\mathcal{D}$ into the Schwartz space $\mathcal{S}$ is continuous and dense with respect to this topology. Therefore we can identify the subspace of distributions that can be extended continuously to act on $\mathcal{S}$.

Definition 4.8. Let $F \in \mathcal{D}'$ be a distribution. Suppose that $\phi_m$ is a sequence of test functions that converges to zero in $\mathcal{S}$, i.e. $\lim_{m\to\infty} \rho_{l,\alpha}(\phi_m) = 0$ for all $l, \alpha$. We say that $F$ is a tempered distribution, $F \in \mathcal{S}'$, if $\lim_{m\to\infty} F(\phi_m) = 0$ for every such sequence. If $F$ is a tempered distribution then it acts on a Schwartz function $\phi$ by

\[ F(\phi) = \lim_{m\to\infty} F(\phi_m) \]

for any sequence of test functions $\phi_m$ that converges to $\phi$ in $\mathcal{S}$. For tempered distributions, we define the Fourier transform $\hat F(\phi) = F(\hat\phi)$.

Many of the properties of Fourier transforms on $\mathcal{S}$ carry over to $\mathcal{S}'$, in particular the differentiation and polynomial multiplication rules. Defining the Fourier transform on distributions is not just a convenient way to extend it to a large class of functions but actually essential for understanding the Fourier transforms of many common functions. For example, the Fourier transform of the constant function 1 is the delta distribution.

Fourier analysis can also solve the inhomogeneous heat equation on $\mathbb{R}^n \times \mathbb{R}^+$. Taking the transform of the PDE results in the inhomogeneous ODE

\[ \partial_t \hat u + 4\pi^2 |k|^2 \hat u = \hat f. \]

This has the solution

\[ \hat u(k,t) = e^{-4\pi^2 |k|^2 t}\, \hat h(k) + \int_0^t e^{-4\pi^2 |k|^2 (t - s)} \hat f(k,s)\, ds. \]

We recognise the first term from the homogeneous case. The second term is new, but it is the integral over time of the product of $\hat\Phi(k, t - s)$ and $\hat f$. Performing the inverse transform suggests the following solution

\[ u(x,t) = \int_{\mathbb{R}^n} \Phi(x - y, t) h(y)\, d^ny + \int_0^t \int_{\mathbb{R}^n} \Phi(x - y, t - s) f(y,s)\, d^ny\, ds. \]

It remains to consider the regularity of the second integral.
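As a quick sanity check of the solution formula for the transformed ODE, consider the special case, assumed here only for illustration, where $\hat f$ does not depend on time. Then the time integral can be evaluated in closed form and the ODE verified by hand:

```latex
% Assumption for illustration: \hat f(k,s) = \hat f(k) is independent of s, and k \neq 0.
% Abbreviation: c = 4\pi^2 |k|^2 > 0.
\begin{align*}
\hat u(k,t) &= e^{-ct}\,\hat h(k) + \hat f(k) \int_0^t e^{-c(t-s)}\,ds
             = e^{-ct}\,\hat h(k) + \frac{1 - e^{-ct}}{c}\,\hat f(k), \\
\partial_t \hat u(k,t) + c\,\hat u(k,t)
            &= \big(-c\,e^{-ct}\hat h(k) + e^{-ct}\hat f(k)\big)
             + \big(c\,e^{-ct}\hat h(k) + (1 - e^{-ct})\hat f(k)\big) = \hat f(k), \\
\hat u(k,0) &= \hat h(k).
\end{align*}
```

In this special case $\hat u(k,t)$ tends to $\hat f(k) / (4\pi^2 |k|^2)$ as $t \to \infty$ for each $k \neq 0$, which is the Fourier transform of the equation $-\Delta u = f$.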

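Theorem 4.9 (Solution of the inhomogeneous heat equation). If $f$ is twice continuously differentiable on $\mathbb{R}^n \times [0,\infty)$ and bounded together with its derivatives up to second order, then

\[ u(x,t) = \int_0^t \int_{\mathbb{R}^n} \Phi(x - y, t - s) f(y,s)\, d^ny\, ds \]

solves the inhomogeneous initial value problem

\[ \dot u - \Delta u = f \text{ on } \mathbb{R}^n \times \mathbb{R}^+ \quad\text{and}\quad \lim_{t \downarrow 0} u(x,t) = 0. \]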

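Proof. The integrand has a singularity when $s = t$. Therefore consider

\[ u_\epsilon(x,t) = \int_0^{t - \epsilon} \int_{\mathbb{R}^n} \Phi(x - y, t - s) f(y,s)\, d^ny\, ds. \]

To this function we can apply the heat operator with impunity:

\[ \dot u_\epsilon(x,t) - \Delta u_\epsilon(x,t) = \int_{\mathbb{R}^n} \Phi(x - y, t - (t - \epsilon)) f(y, t - \epsilon)\, d^ny + \int_0^{t - \epsilon} \int_{\mathbb{R}^n} (\partial_t - \Delta)\Phi(x - y, t - s) f(y,s)\, d^ny\, ds = \int_{\mathbb{R}^n} \Phi(x - y, \epsilon) f(y, t - \epsilon)\, d^ny. \]

Theorem 4.7 (iii) implies $\lim_{\epsilon \downarrow 0} (\dot u_\epsilon - \Delta u_\epsilon) = f$ on $\mathbb{R}^n \times \mathbb{R}^+$. Additionally $u_\epsilon(x,\epsilon) = 0$. The assumptions on $f$ are sufficient to conclude that

\[ f = \lim_{\epsilon \downarrow 0} \big( \dot u_\epsilon(x,t) - \Delta u_\epsilon(x,t) \big) = (\partial_t - \Delta) \lim_{\epsilon \downarrow 0} u_\epsilon(x,t) = (\partial_t - \Delta) u(x,t) \]

and $0 = \lim_{\epsilon \downarrow 0} u_\epsilon(x,\epsilon) = u(x,0)$. Properly one should bound the difference between $u$ and $u_\epsilon$, which is the integral in time over the short interval $[t - \epsilon, t]$, in a similar manner to Theorem 3.2. □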

We summarise our inquiries with the following statement.

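Corollary 4.10. Suppose $f$ is twice continuously differentiable on $\mathbb{R}^n \times [0,\infty)$ and bounded together with its derivatives up to second order, and $h$ is continuous and bounded on $\mathbb{R}^n$. The inhomogeneous initial value problem has the following solution:

\[ \dot u - \Delta u = f, \quad u(x,0) = h(x): \qquad u(x,t) = \int_{\mathbb{R}^n} \Phi(x - y, t) h(y)\, d^ny + \int_0^t \int_{\mathbb{R}^n} \Phi(x - y, t - s) f(y,s)\, d^ny\, ds. \]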

To finish the section we make some qualitative remarks on the behaviour of these solutions. The two integrals are a homogeneous solution that satisfies the initial condition $u(x,0) = h(x)$ and an inhomogeneous solution that vanishes initially. One is reminded of the Green’s representation formula, which was also two integrals dividing the task between themselves. We can also see that as a physics model of heat it violates the principle of locality and the speed of light. Consider $f = 0$, so there are no additional sources of heat, and suppose the initial temperature $h$ is non-negative, not identically zero, and has compact support. Then for any point and time $(x,t) \in \mathbb{R}^n \times \mathbb{R}^+$ the solution is positive, because $\Phi$ is everywhere positive. The interpretation is that the heat that was present in the support of $h$ has instantly spread out to the whole space.

4.3 Maximum Principle

Like elliptic PDEs, parabolic PDEs also have a maximum principle. In this section we will prove a weak maximum principle for the heat equation and apply it to the question of uniqueness of the Dirichlet and Cauchy problems. There is an approach to the maximum principle based on so-called ‘heat balls’ that mimic the mean value property for the Laplace equation (see Evans), but this is computationally messy. Instead we follow Han and give a proof in the style of Theorem 3.13.

The domain of the heat equation distinguishes time and spatial directions. We therefore make special definitions adapted to this distinction. For any open domain $\Omega \subset \mathbb{R}^n$ we define the parabolic cylinder as $\Omega_T = \Omega \times (0,T]$. The parabolic boundary $\partial\Omega_T$ of $\Omega_T$ is defined as $\bar\Omega_T \setminus \Omega_T$. It is the union $(\partial\Omega \times (0,T]) \cup (\bar\Omega \times \{0\})$ and does not contain the points of $\Omega$ at time $t = T$.

Theorem 4.11 (Weak maximum principle for the heat equation). Let $\Omega \subset \mathbb{R}^n$ be open and bounded and $u$ a twice differentiable function on $\Omega_T$ that extends continuously to $\bar\Omega_T$. Suppose that $u$ is a subsolution to the heat equation:

\[ \dot u - \Delta u \le 0 \]

on $\Omega_T$. Then the maximum of $u$ is taken on $\partial\Omega_T$.

Proof. Note that, because $\Omega$ is bounded, $\bar\Omega_T$ is compact, and thus $u$ must have a maximum. The theorem claims that the maximum occurs on the boundary, but does not forbid it from also occurring on the interior. The constant function would be an example where the maximum is taken both on the boundary and the interior.

We first prove the theorem under the stronger assumption that $\dot u - \Delta u < 0$. Suppose that $u$ has a maximum at $(x_0,t_0) \in \Omega_T$. If $t_0 < T$ then we can also say that $\partial_t u(x_0,t_0) = 0$, otherwise if $t_0 = T$ we can only say that $\partial_t u(x_0,t_0) \ge 0$. In either case we see that $0 > \dot u(x_0,t_0) - \Delta u(x_0,t_0) \ge -\Delta u(x_0,t_0)$. Also because this point is a maximum, $\nabla_x u(x_0,t_0) = 0$ and the Hessian $H$ in the spatial coordinates is negative semidefinite. As argued in Theorem 3.13, at such a point $\Delta u(x_0,t_0) \le 0$. But now we have a contradiction. Therefore the maximum cannot occur in $\Omega_T$.

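Next we handle the general case with a trick similar to Theorem 3.13. For any $\epsilon > 0$ define

\[ u_\epsilon(x,t) := u(x,t) - \epsilon t. \]

This forces

\[ (\partial_t - \Delta) u_\epsilon = \dot u - \Delta u - \epsilon \le -\epsilon < 0. \]

Thus the special case applies to $u_\epsilon$ and we conclude that the maximum of $u_\epsilon$ occurs on the parabolic boundary. But we can now argue

\[ \max_{\bar\Omega_T} u = \max_{\bar\Omega_T} (u_\epsilon + \epsilon t) \le \max_{\bar\Omega_T} u_\epsilon + \epsilon T = \max_{\partial\Omega_T} u_\epsilon + \epsilon T \le \max_{\partial\Omega_T} u + \epsilon T. \]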

Taking $\epsilon \downarrow 0$ yields the result. □
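The weak maximum principle can be illustrated (not proved) on an explicit solution. In the sketch below we assume $n = 1$, $\Omega = (0,\pi)$ and the solution $u = e^{-t}\sin x + 0.3\,e^{-4t}\sin 2x$. These choices and the grid sizes are ours. The code compares the maximum over a grid on the closed cylinder with the maximum over the grid points on the parabolic boundary.

```python
import math


def u(x: float, t: float) -> float:
    """An exact solution of u_t - u_xx = 0 on (0, pi) that vanishes at x = 0 and x = pi."""
    return math.exp(-t) * math.sin(x) + 0.3 * math.exp(-4 * t) * math.sin(2 * x)


def grid_maxima(T: float = 1.0, nx: int = 200, nt: int = 100) -> tuple[float, float]:
    """Return (max over the whole grid, max over grid points on the parabolic boundary)."""
    xs = [math.pi * i / nx for i in range(nx + 1)]
    ts = [T * j / nt for j in range(nt + 1)]
    max_all = max(u(x, t) for x in xs for t in ts)
    max_boundary = max(
        [u(x, 0.0) for x in xs]            # initial boundary, t = 0
        + [u(xs[0], t) for t in ts]        # spatial boundary, x = 0
        + [u(xs[-1], t) for t in ts]       # spatial boundary, x = pi
    )
    return max_all, max_boundary
```

For this example the two maxima coincide: the largest value is attained at time $t = 0$, as the theorem predicts.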

The following is an easy consequence, similar to the uniqueness of the Dirichlet problem for the Laplace equation.

Theorem 4.12. On an open and bounded domain $\Omega \subset \mathbb{R}^n$ there exists at most one solution $u$ of the Dirichlet problem for the inhomogeneous heat equation.

Proof. Suppose that there were two solutions. Consider their difference $u$. This function must solve the homogeneous heat equation and vanishes on both the initial boundary $\Omega \times \{0\}$ and the spatial boundary $\partial\Omega \times (0,T)$. In other words, it is zero on the parabolic boundary. By the weak maximum principle applied to $u$ and $-u$, the maximum and minimum of $u$ are zero. Thus $u \equiv 0$ and the two solutions are equal. □

We can also conclude the ‘comparison principle’ or ‘monotonicity property’ for the heat equation: If one body starts hotter than another at every point, $h_1 \ge h_2$, stays hotter on the boundary, $g_1 \ge g_2$, and receives more heat on the interior, $f_1 \ge f_2$, then at every point and every time the first body is hotter than the second.

Remarkably we can also use the weak maximum principle to show a form of uniqueness in the Cauchy problem, even though it is on an unbounded domain. We must be careful however, as we have seen that the solution is not unique: we began the chapter with the example of a function that is identically zero initially and then springs to life. Any such example however must be a monster.

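Theorem 4.13. Let $u$ be a solution on $\mathbb{R}^n \times (0,T]$ of the Cauchy problem:

\[ \dot u - \Delta u = 0 \text{ on } \mathbb{R}^n \times (0,T), \qquad u(x,0) = 0 \text{ on } \mathbb{R}^n \times \{0\}, \]

which is bounded by $|u(x,t)| \le M e^{A|x|^2}$ on $\mathbb{R}^n \times [0,T]$ for some positive constants $A, M > 0$. Then $u$ is identically zero.

Proof. Choose $a > A$. We will prove that $u \equiv 0$ on $\mathbb{R}^n \times [0, \tfrac{1}{4a}]$. The result then holds on $[0,T]$ by induction on the decomposition $[0,T] = [0, \tfrac{1}{4a}] \cup [\tfrac{1}{4a}, \tfrac{2}{4a}] \cup \dots$.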

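For any $R > 0$, define the function

\[ v_R(x,t) = \frac{M e^{-(a - A)R^2}}{(1 - 4at)^{n/2}} \exp\Big( \frac{a|x|^2}{1 - 4at} \Big) \]

on $B(0,R) \times (0, \tfrac{1}{4a})$. It is an easy check that $v_R$ solves the homogeneous heat equation and it is clearly positive. Moreover, on the sphere $x \in \partial B(0,R)$ it is larger than $|u|$, since

\[ v_R = \frac{M e^{-(a - A)R^2}}{(1 - 4at)^{n/2}} \exp\Big( \frac{aR^2}{1 - 4at} \Big) \ge M e^{-(a - A)R^2} \exp(aR^2) = M e^{AR^2} \ge |u|. \]

Hence by the maximum principle, applied to $\pm u - v_R$, we know that $v_R \ge |u|$ on all of $\overline{B(0,R)} \times [0, \tfrac{1}{4a})$.

Now choose any point $(x,t) \in \mathbb{R}^n \times [0, \tfrac{1}{4a})$. For all $R > |x|$ we know that $|u(x,t)| \le v_R(x,t)$. But

\[ \lim_{R \to \infty} v_R(x,t) = \frac{M}{(1 - 4at)^{n/2}} \exp\Big( \frac{a|x|^2}{1 - 4at} \Big) \lim_{R \to \infty} e^{-(a - A)R^2} = 0. \]

Thus $u(x,t) = 0$ too, and by continuity this also holds at $t = \tfrac{1}{4a}$. □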

The obvious question is whether the solution given by Corollary 4.10 meets this growth condition. If it does, then it is the unique solution that does. Suppose therefore that $h$ and $f$ are bounded by $|h(x)| \le M e^{A|x|^2}$ and $|f(x,t)| \le M e^{A|x|^2}$ on $(x,t) \in \mathbb{R}^n \times [0,T]$ for some $A > 0$, $M > 0$. Observe the following doubling relation for the fundamental solution

\[ \Phi(x,t) = \frac{2^{n/2}}{(4\pi(2t))^{n/2}} \exp\Big( -2\, \frac{|x|^2}{4(2t)} \Big) = 2^{n/2}\, \Phi(x, 2t) \exp\Big( -\frac{|x|^2}{8t} \Big). \]

For $t \le \tfrac{1}{16A} =: T_0$ this implies $\Phi(x,t) \le 2^{n/2}\, \Phi(x, 2t) \exp(-2A|x|^2)$. We compute the first integral from the formula for the solution:

\[ \Big| \int_{\mathbb{R}^n} \Phi(x - y, t) h(y)\, d^ny \Big| \le \int_{\mathbb{R}^n} 2^{n/2}\, \Phi(x - y, 2t)\, e^{-2A|x - y|^2} M e^{A|y|^2}\, d^ny = 2^{n/2} M \int_{\mathbb{R}^n} \Phi(x - y, 2t)\, e^{2A|x|^2 - A|2x - y|^2}\, d^ny \le 2^{n/2} M e^{2A|x|^2}. \]

The last step of the calculation was achieved by the estimate $e^{-A|2x - y|^2} \le 1$ and using the fact that for any positive time the fundamental solution has integral 1, Lemma 4.6. For the second integral in the formula of the solution, the above estimate also applies, but further we need to integrate in time. Again for $t < T_0$ we have

\[ \Big| \int_0^t \int_{\mathbb{R}^n} \Phi(x - y, t - s) f(y,s)\, d^ny\, ds \Big| \le \int_0^t 2^{n/2} M e^{2A|x|^2}\, ds \le 2^{n/2} M e^{2A|x|^2}\, T_0. \]

Together this proves that $|u(x,t)| \le M' e^{A'|x|^2}$ on $\mathbb{R}^n \times [0,T_0]$ for $A' = 2A$, $M' = 2^{n/2} M (1 + T_0)$ and $T_0 = \tfrac{1}{16A}$. Thus we have proven short time existence and uniqueness for the Cauchy problem. The short time limitation is unavoidable. Consider the solution $u(x,t) = (T - t)^{-n/2} \exp\big( \frac{|x|^2}{4(T - t)} \big)$ of the homogeneous heat equation. It has the initial condition $h(x) = T^{-n/2} \exp\frac{|x|^2}{4T}$ but explodes for $t \uparrow T$.
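The exploding solution can be examined numerically. The sketch below, for $n = 1$ and with $T = 1$ as an assumed default, checks by finite differences that $u(x,t) = (T-t)^{-1/2}\exp\big(x^2/(4(T-t))\big)$ has a small heat-equation residual and evaluates it close to the blow-up time. The function names and step size are illustrative.

```python
import math


def blow_up_solution(x: float, t: float, T: float = 1.0) -> float:
    """u(x,t) = (T - t)^(-1/2) exp(x^2 / (4 (T - t))) for n = 1 and t < T."""
    s = T - t
    return s ** -0.5 * math.exp(x * x / (4 * s))


def heat_residual(u, x: float, t: float, step: float = 1e-4) -> float:
    """Central-difference approximation of u_t - u_xx at the point (x, t)."""
    u_t = (u(x, t + step) - u(x, t - step)) / (2 * step)
    u_xx = (u(x + step, t) - 2 * u(x, t) + u(x - step, t)) / step**2
    return u_t - u_xx
```

Already at $x = 0$ the value is $(T - t)^{-1/2}$, which is unbounded as $t \uparrow T$, while the initial condition at $t = 0$ is a perfectly smooth function of $x$.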

4.4 Heat Kernels

In the last section we proved that we had found the unique (non-monstrous) solution to the Cauchy problem, and we proved uniqueness for the Dirichlet problem. It remains to solve the Dirichlet problem, at least in some special cases. That is the goal of this section. In analogy to the Green’s function of the Laplace equation we define:

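Definition 4.14. For a bounded open domain $\Omega \subset \mathbb{R}^n$ the heat kernel $H_\Omega : \Omega \times \Omega \times \mathbb{R}^+ \to \mathbb{R}$ of $\Omega$ is characterised by the following two properties:

(i) For $x \in \Omega$ the function $(y,t) \mapsto H_\Omega(x,y,t) - \Phi(x - y, t)$ solves the homogeneous heat equation and extends continuously to $\bar\Omega \times \mathbb{R}_0^+$ with value 0 on $(y,t) \in \bar\Omega \times \{0\}$.

(ii) For $(x,t) \in \Omega \times \mathbb{R}^+$ the function $y \mapsto H_\Omega(x,y,t)$ extends continuously to $\bar\Omega$ with value 0 on $\partial\Omega$.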

Some properties of Green’s functions carry over with essentially the same proof.

Lemma 4.15. For any bounded open domain $\Omega \subset \mathbb{R}^n$ the heat kernel is unique, if it exists.

Proof. For each $x \in \Omega$ let $u(y,t) = H_\Omega(x,y,t) - \Phi(x - y, t)$. This solves the homogeneous heat equation with initial condition $h \equiv 0$ and boundary condition $u(y,t) = -\Phi(x - y, t)$ for $y \in \partial\Omega$, since $H_\Omega(x,y,t) = 0$ on the boundary. This defines a Dirichlet problem and we know that there is at most one solution, due to Theorem 4.12. □

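However, the heat kernel has a nice property that Green’s functions do not have: the heat kernel of the Cartesian product of two domains can be easily calculated in terms of the heat kernels of both domains:

Lemma 4.16. If $\Omega \subset \mathbb{R}^m$ and $\Omega' \subset \mathbb{R}^n$ are two open, bounded and connected domains with given heat kernels $H_\Omega$ and $H_{\Omega'}$, then the heat kernel of $\Omega \times \Omega'$ is given by

\[ H_{\Omega \times \Omega'}((x,x'), (y,y'), t) = H_\Omega(x,y,t)\, H_{\Omega'}(x',y',t) \quad\text{for } (x,x'), (y,y') \in \bar\Omega \times \bar\Omega',\ t \in \mathbb{R}^+. \]

Proof. For any $(x,x',t) \in \Omega \times \Omega' \times \mathbb{R}^+$ the function $(y,y') \mapsto H_\Omega(x,y,t) H_{\Omega'}(x',y',t)$ extends by the value zero continuously to $\partial(\Omega \times \Omega') = (\partial\Omega \times \bar\Omega') \cup (\bar\Omega \times \partial\Omega')$. The Laplace operator of the Cartesian product is the sum of the corresponding Laplace operators. We calculate

\[ \partial_t (H_\Omega H_{\Omega'}) - (\Delta_y + \Delta_{y'}) H_\Omega H_{\Omega'} = (\partial_t H_\Omega) H_{\Omega'} + H_\Omega (\partial_t H_{\Omega'}) - (\Delta_y H_\Omega) H_{\Omega'} - H_\Omega (\Delta_{y'} H_{\Omega'}) = (\partial_t H_\Omega - \Delta_y H_\Omega) H_{\Omega'} + H_\Omega (\partial_t H_{\Omega'} - \Delta_{y'} H_{\Omega'}) = 0. \]

Hence for all $(x,x') \in \Omega \times \Omega'$ the function $(y,y',t) \mapsto H_\Omega(x,y,t) H_{\Omega'}(x',y',t)$ solves the homogeneous heat equation. The product of both fundamental solutions is the fundamental solution on $\mathbb{R}^{m+n}$. Hence for all $(x,x') \in \Omega \times \Omega'$ the function

\[ (y,y',t) \mapsto H_\Omega(x,y,t) H_{\Omega'}(x',y',t) - \Phi_m(x - y, t) \Phi_n(x' - y', t) = [H_\Omega - \Phi_m(x - y, t)][H_{\Omega'} - \Phi_n(x' - y', t)] + \Phi_m(x - y, t)[H_{\Omega'} - \Phi_n(x' - y', t)] + [H_\Omega - \Phi_m(x - y, t)]\, \Phi_n(x' - y', t) \]

extends continuously to $\bar\Omega \times \bar\Omega' \times \mathbb{R}_0^+$ by setting it zero on $(y,y',t) \in \bar\Omega \times \bar\Omega' \times \{0\}$. □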

The minor technicality is that the boundaries of the Cartesian products $\Omega \times \Omega' \subset \mathbb{R}^{m+n}$ are not continuously differentiable submanifolds and our proof of the divergence theorem does not apply to these Cartesian products. However, the divergence theorem can be extended to these Cartesian products, so this is indeed only a technicality.

We want to develop a formula for the solution to the Dirichlet problem similar to the Poisson formula. Therefore we begin by giving a representation formula. To start, take Green’s second formula with $u(y,s)$ and $v(y,s) = H_\Omega(x,y,t - s)$, two functions on $\Omega \times \mathbb{R}^+$ with appropriate regularity, with $x$ and $t$ treated as additional parameters. Now integrate this over $s$ from 0 to $t - \epsilon$ to obtain

\[ \int_0^{t - \epsilon} \int_\Omega \big[ H_\Omega(x,y,t - s) \Delta_y u(y,s) - \Delta_y H_\Omega(x,y,t - s)\, u(y,s) \big]\, d^ny\, ds = \int_0^{t - \epsilon} \int_{\partial\Omega} \big[ H_\Omega(x,y,t - s) \nabla_y u(y,s) - \nabla_y H_\Omega(x,y,t - s)\, u(y,s) \big] \cdot N(y)\, d\sigma(y)\, ds = -\int_0^{t - \epsilon} \int_{\partial\Omega} \nabla_y H_\Omega(x,y,t - s)\, u(y,s) \cdot N(y)\, d\sigma(y)\, ds. \]

We should explain some of the choices. The choice of $t - s$ in $H_\Omega$ creates a convolution type formula, which we expect from our experience with the Laplace equation and fundamental solutions in general. But if we were to integrate all the way to $t$, then we would have a singularity in $H_\Omega$. Integrating to $t - \epsilon$ is akin to using a ball $B(x,\epsilon)$ in the derivation of the Green’s representation formula for the Laplace equation. Finally, $H_\Omega(x,y,t - s)$ is zero for $y \in \partial\Omega$, so this term drops out.

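We need a similar formula with $\partial_s$ in place of the Laplacian so that we can combine them and get the heat operator. Therefore we take the expression we need and integrate by parts. Writing $\partial_t H_\Omega$ for the derivative of $H_\Omega$ with respect to its third argument, so that $\partial_s [H_\Omega(x,y,t - s)] = -(\partial_t H_\Omega)(x,y,t - s)$, we obtain

\[ \int_0^{t - \epsilon} \int_\Omega H_\Omega(x,y,t - s)\, \partial_s u(y,s)\, d^ny\, ds = \int_\Omega H_\Omega(x,y,t - s)\, u(y,s)\, d^ny \Big|_{s=0}^{s=t-\epsilon} + \int_0^{t - \epsilon} \int_\Omega (\partial_t H_\Omega)(x,y,t - s)\, u(y,s)\, d^ny\, ds, \]

that is

\[ \int_0^{t - \epsilon} \int_\Omega \big[ H_\Omega(x,y,t - s)\, \partial_s u(y,s) - (\partial_t H_\Omega)(x,y,t - s)\, u(y,s) \big]\, d^ny\, ds = \int_\Omega \big[ H_\Omega(x,y,\epsilon)\, u(y,t - \epsilon) - H_\Omega(x,y,t)\, u(y,0) \big]\, d^ny. \]

When subtracting the two equations, the terms containing $(\partial_t H_\Omega - \Delta_y H_\Omega)(x,y,t - s) = 0$ drop out, leaving

\[ \int_0^{t - \epsilon} \int_\Omega H_\Omega(x,y,t - s) \big[ \partial_s u(y,s) - \Delta_y u(y,s) \big]\, d^ny\, ds = \int_\Omega \big[ H_\Omega(x,y,\epsilon)\, u(y,t - \epsilon) - H_\Omega(x,y,t)\, u(y,0) \big]\, d^ny + \int_0^{t - \epsilon} \int_{\partial\Omega} \nabla_y H_\Omega(x,y,t - s)\, u(y,s) \cdot N(y)\, d\sigma(y)\, ds. \]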

Finally we wish to take 𝜖 0. The interesting term is the first term after the equal sign. We use Property (i) of the heat kernel and Theorem 4.7 to deduce the limit:

lim 𝜖↓0ΩHΩ(x,y,𝜖)u(y,t 𝜖) dny = lim 𝜖↓0Ω[HΩ(x,y,𝜖) Φ(x y,𝜖)]u(y,t 𝜖) dny + lim 𝜖↓0ΩΦ(x y,𝜖)u(y,t 𝜖) dny = lim 𝜖↓0Ω0u(y,t) dny + u(x,t) = u(x,t).

Rearranging terms we arrive at the following representation formula:

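$$\begin{aligned}
u(x,t) &= \int_0^{t}\!\int_\Omega H_\Omega(x,y,t-s)\big[\partial_s u(y,s) - \Delta_y u(y,s)\big]\,d^ny\,ds \\
&\quad- \int_0^{t}\!\int_{\partial\Omega} \nabla_y H_\Omega(x,y,t-s)\,u(y,s)\cdot N(y)\,d\sigma(y)\,ds + \int_\Omega H_\Omega(x,y,t)\,u(y,0)\,d^ny.
\end{aligned}$$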

As with the Laplace equation, inserting the boundary conditions and inhomogeneities into this formula defines a valid solution, furnishing us with a solution to the Dirichlet problem.

Theorem 4.17 (Solution of the Dirichlet problem). Let $f$ be a function on $\Omega \times (0,T)$, $g$ a function on $\partial\Omega \times [0,T]$ and $h$ a function on $\Omega$ which together with the open domain $\Omega \subset \mathbb{R}^n$ have appropriate regularity such that all appearing integrals converge absolutely. Then

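$$u(x,t) = \int_0^{t}\!\int_\Omega H_\Omega(x,y,t-s)\,f(y,s)\,d^ny\,ds - \int_0^{t}\!\int_{\partial\Omega} \nabla_y H_\Omega(x,y,t-s)\,g(y,s)\cdot N(y)\,d\sigma(y)\,ds + \int_\Omega H_\Omega(x,y,t)\,h(y)\,d^ny$$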

is the unique solution of the initial and boundary value problem

$$\dot{u} - \Delta u = f \ \text{ on } \Omega \times (0,T), \qquad u = g \ \text{ on } \partial\Omega \times [0,T], \qquad u(x,0) = h(x) \ \text{ on } \Omega.$$

We do not give a full proof of this statement. Let us think through how we might try to prove this theorem in the general case, using as few assumptions on $H_\Omega$ as possible. The proof should be similar to the proof of Poisson’s representation formula 3.21, but we must argue from the definition of the heat kernel rather than from a concrete formula for the Green’s function.

Perhaps the most important property is the symmetry in $x$ and $y$. This allows us to conclude that, away from the singularity, $H_\Omega$ is also a solution to the homogeneous heat equation in $x$. Applying the heat operator to the second and third terms should then cause them to vanish.

Lemma 4.18. For all $t > 0$ and $x,y \in \bar{\Omega}$ we have $H_\Omega(x,y,t) = H_\Omega(y,x,t)$.

Proof. We insert $u(y,s) = H_\Omega(z,y,s)$ into the representation formula, using limits where appropriate to avoid the singularities:

$$H_\Omega(z,x,t) = 0 - 0 + \lim_{\epsilon\downarrow 0}\int_\Omega H_\Omega(x,y,t)\,H_\Omega(z,y,\epsilon)\,d^ny = H_\Omega(x,z,t).$$
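
The three terms can be spelled out as follows. This is only a sketch of one possible reading of the limit: the time-shifted function $u_\epsilon(y,s) = H_\Omega(z,y,s+\epsilon)$ is notation introduced here and not taken from the text, and it is assumed that $u_\epsilon$ is regular enough for the representation formula to apply.

$$\begin{aligned}
u_\epsilon(x,t) &= \int_0^{t}\!\int_\Omega H_\Omega(x,y,t-s)\,\underbrace{\big[\partial_s u_\epsilon - \Delta_y u_\epsilon\big](y,s)}_{=\,0}\,d^ny\,ds \\
&\quad- \int_0^{t}\!\int_{\partial\Omega} \nabla_y H_\Omega(x,y,t-s)\,\underbrace{u_\epsilon(y,s)}_{=\,0 \text{ on } \partial\Omega}\cdot N(y)\,d\sigma(y)\,ds + \int_\Omega H_\Omega(x,y,t)\,u_\epsilon(y,0)\,d^ny \\
&= \int_\Omega H_\Omega(x,y,t)\,H_\Omega(z,y,\epsilon)\,d^ny.
\end{aligned}$$

As $\epsilon \downarrow 0$ the left hand side $u_\epsilon(x,t) = H_\Omega(z,x,t+\epsilon)$ tends to $H_\Omega(z,x,t)$, while the right hand side tends to $H_\Omega(x,z,t)$ by the same limit argument that was used in the derivation of the representation formula, now applied to the continuous function $y \mapsto H_\Omega(x,y,t)$.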

For the Laplace equation, we had Weyl’s lemma to prove the regularity of harmonic functions. However, we also have the result that harmonic functions are analytic, which we obtained by applying the explicit formula for the Green’s function of a ball to a neighborhood of any point in the domain of a harmonic function. In the next section we will derive the heat kernel on a cube. This can likewise be used to prove the regularity of solutions to the homogeneous heat equation. Then we can use the trick $H_\Omega = [H_\Omega - \Phi(x-y,t)] + \Phi(x-y,t)$ to show that the first and third integrals have the same behavior as the integrals in Corollary 4.10.

Thus it again comes down to understanding the integral over $\partial\Omega$. In the proof of the Poisson formula, we abstracted out the properties that were required of the normal derivative $K = -\nabla G \cdot N$. For the heat kernel the necessary properties are more difficult to establish, so we will stop the proof here. Hopefully this gives you a taste of what a proof of Theorem 4.17 for a general domain requires. Instead, we close with one more property of general heat kernels (which might give you an idea of how $K \geq 0$ is proven).

Lemma 4.19. For any bounded open domain $\Omega \subset \mathbb{R}^n$ the corresponding heat kernel is positive on the corresponding parabolic cylinder, if it exists.

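Proof. The fundamental solution $\Phi(x,t)$ is positive for $(x,t) \in \mathbb{R}^n \times \mathbb{R}^+$. For a bounded open domain $\Omega \subset \mathbb{R}^n$ and given $x \in \Omega$, the difference $\Phi(x-y,t) - H_\Omega(x,y,t)$ of the fundamental solution minus the heat kernel is the unique solution of the heat equation on $\Omega \times [0,T]$ which vanishes on $\Omega \times \{t = 0\}$ and coincides on $\partial\Omega \times [0,T]$ with $\Phi(x-y,t)$. For every $\epsilon > 0$ this solution is not larger than $\Phi(x-y,t)$ on $\Omega \times \{t = \epsilon\}$ and on $\partial\Omega \times [0,T]$. By the Maximum Principle it is therefore not larger than $\Phi(x-y,t)$ on the whole parabolic cylinder, and so $H_\Omega(x,y,t)$ is positive. □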

4.5 Heat Kernel of (0, 1)

Despite our hard work, we still haven’t actually solved the Dirichlet problem for even a single domain Ω. It is long past time to rectify that. We begin with the simplest case n = 1 where every open bounded domain is the union of intervals. Up to scaling and translation then, we need only consider the unit interval (0, 1).

There are several ideas that lead to the heat kernel. The method of images will be explored in the exercises. Here we give an argument based on the eigenfunctions. If you recall from the beginning of the chapter, the special class of separable solutions is connected to the eigenfunctions of the Laplacian $\Delta$. In dimension one the eigenfunctions $e^{\pm 2\pi i|k|x}$ have eigenvalues $-4\pi^2|k|^2$. If we look for eigenfunctions that vanish on the boundary, then this is only possible if $k \in \tfrac{1}{2}\mathbb{Z}$ and then

$$h_k(x) = \sqrt{2}\,\sin(2\pi k x)$$

is the unique solution up to scaling. This particular scaling has been chosen because it makes these functions orthonormal with respect to the inner product on $L^2([0,1])$. Due to the Stone-Weierstrass theorem, the linear combinations of these functions are also dense in the space of functions that vanish at $x = 0, 1$. But by Property (ii) of heat kernels, $H_{(0,1)}$ is such a function. Therefore we expect

$$H_{(0,1)}(x,y,t) = \sum_{k \in \frac{1}{2}\mathbb{Z}^+} a_k(x,t)\,h_k(y).$$

This is essentially the Fourier series of the heat kernel. The unique solution to the homogeneous heat equation with $h_k$ as initial condition and vanishing for $x = 0, 1$ is

$$u_k(x,t) = e^{-4\pi^2k^2t}\,\sqrt{2}\,\sin(2\pi k x).$$
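
The stated properties of $u_k$ can be verified directly. The following lines are a worked check added for convenience, using only the formula for $u_k$ and the condition $k \in \tfrac{1}{2}\mathbb{Z}^+$:

$$\begin{aligned}
\partial_t u_k(x,t) &= -4\pi^2k^2\,u_k(x,t), \qquad \partial_x^2 u_k(x,t) = -(2\pi k)^2\,u_k(x,t), \qquad\text{so}\quad \partial_t u_k - \partial_x^2 u_k = 0, \\
u_k(0,t) &= 0, \qquad u_k(1,t) = e^{-4\pi^2k^2t}\sqrt{2}\,\sin(2\pi k) = 0 \quad\text{since } 2k \in \mathbb{Z}, \\
u_k(x,0) &= \sqrt{2}\,\sin(2\pi k x) = h_k(x).
\end{aligned}$$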

If $H_{(0,1)}$ is the heat kernel of $(0,1)$ then it must fulfil the representation formula for these functions. Hence

$$u_l(x,t) = \int_0^1 H_{(0,1)}(x,y,t)\,h_l(y)\,dy + 0 + 0 = \sum_{k \in \frac{1}{2}\mathbb{Z}^+} a_k(x,t)\int_0^1 h_k(y)\,h_l(y)\,dy = a_l(x,t).$$

This brings us to a formula for the heat kernel

$$H_{(0,1)}(x,y,t) = \sum_{k \in \frac{1}{2}\mathbb{Z}^+} u_k(x,t)\,h_k(y) = \sum_{n=1}^{\infty} 2\,e^{-\pi^2n^2t}\,\sin(\pi n x)\,\sin(\pi n y).$$
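
The series lends itself to a quick numerical check. The following Python sketch is not part of the original notes: the function names `heat_kernel_unit_interval` and `evolve_initial_data`, the truncation parameter `terms` and the quadrature grid are illustrative assumptions. It truncates the sine series and uses it to evolve initial data that vanishes at the boundary via $u(x,t) = \int_0^1 H_{(0,1)}(x,y,t)\,h(y)\,dy$, which is the representation formula with $f = 0$ and zero boundary values.

```python
import numpy as np

def heat_kernel_unit_interval(x, y, t, terms=200):
    """Truncated series sum_{n=1}^{terms} 2 exp(-pi^2 n^2 t) sin(pi n x) sin(pi n y).

    x and t are scalars (t > 0); y may be a scalar or a 1-D array.
    The truncation 'terms' is an assumption of this sketch.
    """
    n = np.arange(1, terms + 1).reshape(-1, 1)       # mode numbers as a column
    y = np.atleast_1d(np.asarray(y, dtype=float))    # evaluation points as a row
    modes = (2.0 * np.exp(-np.pi**2 * n**2 * t)
             * np.sin(np.pi * n * x) * np.sin(np.pi * n * y))
    return modes.sum(axis=0)                         # sum over the modes

def evolve_initial_data(h, x, t, intervals=2000):
    """Approximate u(x,t) = int_0^1 H(x,y,t) h(y) dy with the trapezoidal rule."""
    y = np.linspace(0.0, 1.0, intervals + 1)
    values = heat_kernel_unit_interval(x, y, t) * h(y)
    dy = 1.0 / intervals
    return dy * (values.sum() - 0.5 * (values[0] + values[-1]))

if __name__ == "__main__":
    # Initial data built from two eigenfunctions, so the exact solution is known.
    h = lambda y: np.sin(np.pi * y) + 0.5 * np.sin(3 * np.pi * y)
    x, t = 0.3, 0.05
    exact = (np.exp(-np.pi**2 * t) * np.sin(np.pi * x)
             + 0.5 * np.exp(-9 * np.pi**2 * t) * np.sin(3 * np.pi * x))
    print("quadrature:", evolve_initial_data(h, x, t), "exact:", exact)
    # Symmetry in x and y (Lemma 4.18) at a sample point.
    print(heat_kernel_unit_interval(0.2, 0.7, t)[0],
          heat_kernel_unit_interval(0.7, 0.2, t)[0])
```

Because the factor $e^{-\pi^2n^2t}$ decays so quickly, a moderate number of terms is enough for any fixed $t > 0$; for very small $t$ more terms are needed.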

The method of images leads to the equivalent formula

$$H_{(0,1)}(x,y,t) = \tfrac{1}{2}\,\Theta\!\left(\tfrac{x-y}{2},\,\pi i t\right) - \tfrac{1}{2}\,\Theta\!\left(\tfrac{x+y}{2},\,\pi i t\right)$$

where Θ(x,τ) is Jacobi’s Theta function, a well-studied ‘special’ function defined by the series

$$\Theta(x,\tau) = \sum_{k \in \mathbb{Z}} e^{2\pi i k x + \pi i \tau k^2}.$$

This sum converges very rapidly on the domain $(x,\tau) \in \mathbb{C} \times \{\tau \in \mathbb{C} \mid \Im(\tau) > 0\}$, since $e^{\pi i \tau k^2}$ decays exponentially with respect to $k^2$, making it useful for computation. The sine formula for the heat kernel also has this property, but nonetheless it is useful to be able to call on standard functions when using a program such as Mathematica or Matlab. The Theta function is theoretically important because of its quasiperiodicity:

$$\Theta(x+1,\tau) = \Theta(x,\tau), \qquad \Theta(x+\tau,\tau) = \Theta(x,\tau)\,e^{-\pi i \tau - 2\pi i x}.$$
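
Both the equivalence of the two kernel formulas and the quasiperiodicity can be tested numerically. The Python sketch below is an illustration added here, not the notes' own method: the names `jacobi_theta`, `heat_kernel_theta` and `heat_kernel_sine` and the cutoffs are assumptions, and the Theta function is evaluated by naively truncating its defining series rather than by calling a library routine.

```python
import numpy as np

def jacobi_theta(x, tau, cutoff=60):
    """Truncated series sum_{|k| <= cutoff} exp(2 pi i k x + pi i tau k^2).

    Requires Im(tau) > 0; x may be complex. The cutoff is an assumption.
    """
    k = np.arange(-cutoff, cutoff + 1)
    return np.sum(np.exp(2j * np.pi * k * x + 1j * np.pi * tau * k**2))

def heat_kernel_theta(x, y, t):
    """Heat kernel of (0, 1) from the Theta function formula with tau = pi i t."""
    tau = 1j * np.pi * t
    value = 0.5 * jacobi_theta((x - y) / 2, tau) - 0.5 * jacobi_theta((x + y) / 2, tau)
    return value.real  # the imaginary part vanishes up to rounding

def heat_kernel_sine(x, y, t, terms=200):
    """Heat kernel of (0, 1) from the truncated sine series."""
    n = np.arange(1, terms + 1)
    return np.sum(2.0 * np.exp(-np.pi**2 * n**2 * t)
                  * np.sin(np.pi * n * x) * np.sin(np.pi * n * y))

if __name__ == "__main__":
    # The two formulas for the kernel should agree up to rounding.
    print(heat_kernel_theta(0.3, 0.7, 0.05), heat_kernel_sine(0.3, 0.7, 0.05))
    # Quasiperiodicity in the second period tau.
    tau, x = 0.3 + 0.8j, 0.2 + 0.1j
    lhs = jacobi_theta(x + tau, tau)
    rhs = jacobi_theta(x, tau) * np.exp(-1j * np.pi * tau - 2j * np.pi * x)
    print(abs(lhs - rhs))
```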

From the heat kernel on $(0,1)$ we can construct the heat kernel on any interval. The fundamental solution scales according to $\Phi(x-y,t) = \frac{1}{r^n}\,\Phi\!\left(\frac{x}{r} - \frac{y}{r}, \frac{t}{r^2}\right)$. It is also invariant if we translate $x$ and $y$ by the same amount. Since the heat kernel is unique, it must be

$$H_{(a,b)}(x,y,t) = \frac{1}{b-a}\,H_{(0,1)}\!\left(\frac{x-a}{b-a},\ \frac{y-a}{b-a},\ \frac{t}{(b-a)^2}\right).$$
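
The scaling rule is easy to try out. The Python sketch below is a hedged illustration with assumed names (`heat_kernel_unit_interval`, `heat_kernel_interval`, `total_heat`) and an assumed truncation of the sine series for $H_{(0,1)}$. It rescales the kernel to an interval $(a,b)$ and integrates it over $y$: for small times almost no heat has reached the boundary, so the integral should be close to $1$, while for large times heat is lost through the boundary.

```python
import numpy as np

def heat_kernel_unit_interval(x, y, t, terms=400):
    """Truncated sine series for the heat kernel of (0, 1); y may be an array."""
    n = np.arange(1, terms + 1).reshape(-1, 1)
    y = np.atleast_1d(np.asarray(y, dtype=float))
    modes = (2.0 * np.exp(-np.pi**2 * n**2 * t)
             * np.sin(np.pi * n * x) * np.sin(np.pi * n * y))
    return modes.sum(axis=0)

def heat_kernel_interval(a, b, x, y, t, terms=400):
    """Heat kernel of (a, b), obtained by translating and rescaling to (0, 1)."""
    length = b - a
    y = np.asarray(y, dtype=float)
    return heat_kernel_unit_interval((x - a) / length, (y - a) / length,
                                     t / length**2, terms) / length

def total_heat(a, b, x, t, intervals=4000):
    """Trapezoidal approximation of the integral of H_(a,b)(x, y, t) over y."""
    y = np.linspace(a, b, intervals + 1)
    values = heat_kernel_interval(a, b, x, y, t)
    dy = (b - a) / intervals
    return dy * (values.sum() - 0.5 * (values[0] + values[-1]))

if __name__ == "__main__":
    a, b = -1.0, 3.0
    print("small time:", total_heat(a, b, 1.0, 0.016))  # close to 1
    print("large time:", total_heat(a, b, 1.0, 4.0))    # noticeably below 1
```

The prefactor $\frac{1}{b-a}$ is what keeps the total heat normalized: without it the small-time integral would come out as $b-a$ instead of $1$.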

And further, by Lemma 4.16 we have the heat kernel on any box $[a,b]^n \subset \mathbb{R}^n$.

We close this chapter with a final result on regularity. Due to the existence of monster solutions, we cannot hope for analyticity in the time coordinate, but we at least have smoothness.

Corollary 4.20. Any solution $u$ of the homogeneous heat equation on an open domain in $\mathbb{R}^n \times \mathbb{R}$ is smooth and for fixed $t$ analytic with respect to $x$.

Proof. For any point in the domain, we can find a small cube in space and time that contains the point. By translation, assume that the cube is $[0,r]^n \times [0,t]$ and that the point lies at time $t$. Then using the heat kernel on this domain, we obtain from the representation formula

$$u(x,t) = -\int_0^{t}\!\int_{\partial[0,r]^n} u(z,s)\,\nabla_z H_{[0,r]^n}(x,z,t-s)\cdot N(z)\,d\sigma(z)\,ds + \int_{[0,r]^n} u(y,0)\,H_{[0,r]^n}(x,y,t)\,d^ny.$$

It remains to show that the regularity of the heat kernel is transferred to $u$. This can be calculated using the explicit formula, but we give a more conceptual argument. In the proof of Theorem 4.7 we showed that $\Phi(x-y,t)$ converges uniformly to zero for $y$ in the complement of $B(x,\delta)$ in the limit $t \downarrow 0$. The same is true for all partial derivatives and, due to condition (ii) in Definition 4.14, also for $H_{(0,1)^n}(x,y,t)$. By Lemma 4.18 the integral for $u(x,t)$ is smooth at all $x \in (0,r)^n$. For $(z,s) \in \partial(0,r)^n \times [0,t]$ the Taylor series of $x \mapsto H_{[0,r]^n}(x,z,t-s)$ converges uniformly on compact subsets of $x \in (0,r)^n$ to $H_{[0,r]^n}(x,z,t-s)$. □