In this chapter we investigate the heat equation
$$\partial_t u - \Delta u = 0$$
and the corresponding inhomogeneous variant
$$\partial_t u - \Delta u = f.$$
The unknown function $u$ is defined on an open domain in $\mathbb{R}^n \times \mathbb{R}$. We shall extend some statements about harmonic functions to solutions of the heat equation, but also try to understand the important differences.
The heat equation describes a diffusion process. This means a time-like evolution of space-like distributed quantities like heat or chemical concentration, or even probability. Let us provide a short justification of the equation as a model of heat. We have seen for the mean value property that the Laplacian measures the difference of a function from its mean value: for small from the proof of Theorem 3.5 we have which implies . If the temperature at is cooler than the points around it, then should be positive, and vice versa if is hotter. Moreover we have seen from the general conservation law (with ) that the quantity is preserved by the heat equation (under appropriate assumptions). The simplicity of the equation together with these properties makes it a useful model to study. There is no widely agreed-upon name for solutions to the homogeneous heat equation, similar to harmonic functions for the Laplace equation, though some books use the term caloric. A previous class suggested calling them flames, similar to how solutions of the wave equation are waves, which I find cute.
There are two boundary value problems that we will examine in particular. The first is the initial value problem on
This is sometimes called the Cauchy problem. It purports to model how the temperature within an infinitely large body changes given the initial temperature at every point. The inhomogeneous term represents the infusion or removal of heat at points within the body. The second problem applies to a bounded spatial domain
This problem is called the Dirichlet problem, in analogy to the corresponding problem for the Laplace equation. This models the temperature within a finite body but where additionally the temperature of the boundary is also controlled (specified by ). In both problems any solution should at least extend continuously to the boundary, so that the boundary conditions are meaningful.
Before we begin to develop the theory that we will use, let’s study some monstrous examples, to show us what to be wary of. The first shows the importance of the negative sign in the heat equation. We give an illustration that the heat equation is not time symmetric in the way that many models in physics are (at least conceptually) and that the ‘reverse time’ problem is not well-posed. Consider and for any integer define the function
They have the property that as well as . Therefore they all solve the homogeneous heat equation with ‘terminal’ condition . This example can even be applied to a Dirichlet-type problem. Consider the spatial domain with the boundary values . Because is an integer, all these functions satisfy it. Even though these boundary conditions are smooth and uniformly bounded by , the solutions at any time can still be arbitrarily large
This is one reason to only study the forward time Dirichlet problem.
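The blow-up in backward time can be made concrete with a few lines of Python. The formulas were lost from the text above, so the sketch below assumes one space dimension and the family $u_k(x,t) = \sin(kx)\,e^{-k^2 t}$ on the interval $[0,\pi]$; the family, the names `u` and `sup_on_interval`, and the sample time $t=-0.1$ are assumptions for illustration only.

```python
import math

def u(k, x, t):
    """Assumed family u_k(x, t) = sin(k x) * exp(-k^2 t) in one space dimension."""
    return math.sin(k * x) * math.exp(-k * k * t)

def sup_on_interval(k, t, samples=2001):
    """Approximate the supremum of |u_k(., t)| over [0, pi] on a uniform grid."""
    return max(abs(u(k, math.pi * i / (samples - 1), t)) for i in range(samples))

if __name__ == "__main__":
    # At t = 0 every u_k is bounded by 1, but at the earlier time t = -0.1
    # the size grows like exp(0.1 * k^2).
    for k in (1, 2, 5, 10):
        print(k, sup_on_interval(k, 0.0), sup_on_interval(k, -0.1))
```

The terminal data at time $0$ stay bounded by $1$ for every $k$, while the values at any fixed earlier time grow without bound as $k$ increases. This is exactly the failure of continuous dependence on the data.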
Similarly for the Cauchy problem introduced above, there is also the possibility of rapidly growing solutions. Again for , we make the ansatz
Thus if solves the heat equation then we must have a recursion relation between and . For a given function and setting we thus obtain the following formal solution of the homogeneous heat equation:
We now show that for this power series indeed converges to a smooth solution and further that on every compact subset of the uniform limit of this solution vanishes as . We first calculate for any by a real polynomial of degree solving the relation
This recursion relation for follows by differentiating by . The first two polynomials are and . We claim that the coefficient of in front of is bounded by . For , this is clear. By induction we obtain with
This proves the claim. Using the inequalities we conclude
Therefore the series converges absolutely and for uniformly on compact sets to . This means that we can extend smoothly to by giving it the value . This means that the Cauchy problem with initial value has a non-zero solution: The space is the same temperature everywhere and suddenly wild temperature fluctuations begin. Even though it seems as if the Cauchy problem should be well-posed, additional constraints will be required.
Let us give some motivation for introducing spectral theory, which is the theory of the eigenvalues of the operator . Let us look for ‘separable’ solutions of the homogeneous heat equation. They are solutions that neatly factorise as . These solve the heat equation if
Because the left-hand side is a function of and the right-hand side is a function of , the only way that this is possible is if the two sides are equal to some constant . This means that is an eigenfunction of the (negative) Laplace operator:
and . The factorisation is only determined up to a scaling, so we set . Thus and has the initial value .
Turning this around, if we are given an initial value problem where is an eigenfunction of the Laplacian, then this method gives a solution. More generally, if the initial condition is a linear combination of eigenfunctions then a linear combination of separable solutions solves the problem. The question now arises: can every function be written as a linear combination of eigenfunctions in some suitable sense?
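As a small sanity check of the superposition idea, the following sketch works in one space dimension, where $\sin(kx)$ is an eigenfunction of $-\frac{d^2}{dx^2}$ with eigenvalue $k^2$. The coefficients, the helper names `separable_sum` and `heat_residual`, and the finite-difference check are illustrative assumptions, not part of the text.

```python
import math

def separable_sum(coeffs, x, t):
    """u(x, t) = sum_k c_k exp(-k^2 t) sin(k x): each term pairs the eigenfunction
    sin(k x) (eigenvalue k^2) with the decay factor exp(-k^2 t)."""
    return sum(c * math.exp(-k * k * t) * math.sin(k * x) for k, c in coeffs.items())

def heat_residual(u, x, t, h=1e-3):
    """Central-difference approximation of u_t - u_xx at the point (x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / (h * h)
    return u_t - u_xx

# Hypothetical initial condition 2 sin(x) - 0.5 sin(3x).
coeffs = {1: 2.0, 3: -0.5}
u = lambda x, t: separable_sum(coeffs, x, t)

if __name__ == "__main__":
    print("residual of the heat equation:", heat_residual(u, 0.7, 0.1))
    print("initial value at x = 0.7:", u(0.7, 0.0))
```

The residual is zero only up to the discretisation error of the difference quotients, so this is evidence rather than proof that the linear combination solves the equation.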
What are the eigenfunctions of ? The trigonometric functions provide many examples for every :
The drawback of these functions is that they are not integrable on the plane because they have modulus $1$ at every point. But in a limiting sense they are all orthogonal to one another in the inner product
because the integrand is periodic and the integral over a single period is zero. This leads us to define the Fourier transform as the coefficients of the orthogonal projection of a function onto these functions, in the sense that for a finite dimensional inner product space for an orthonormal basis .
Definition 4.1. The Fourier transform of a function is defined to be
Be aware: there are several definitions of Fourier transform that differ by a constant scaling and a scaling of . Always check which is being used.
When one learns Fourier analysis in detail, a major theme is under what conditions this definition makes sense, how it can be extended to other classes of functions, and which of the important properties are retained for these extensions. For example, a basic result that we will soon prove is that if the function then its Fourier transform is continuous and bounded.
Let us compute the Fourier transform for an important example: the Gaussian curve . We begin
To finish we need to compute the value of the final integral. It is so famous that it has its own name ‘the Gaussian integral’. Its value is . Several methods to compute this will be explored in the tutorial. By rescaling we also have the Fourier transforms for other Gaussians. In conclusion
One obvious class of functions that can be Fourier transformed is the test functions because they have compact support. But this turns out to be a little too restrictive. Instead we consider functions that decay rapidly at infinity.
Definition 4.2. The Schwartz space contains all smooth complex valued functions on for which are finite for all and all .
There are other equivalent definitions in the literature. A common alternative is to use instead of . One characterisation of is that it is the largest subspace of integrable functions that is closed under differentiation and multiplication with polynomials. The following lemma, however, is perhaps the more important justification for considering this space.
Proof. If we simply take the absolute value of the definition of the Fourier transform we get . Any certainly has finite -norm and by taking supremum we obtain
This shows that is a continuous linear operator from with the -norm to with the supremum norm. Since is dense in , the Fourier transform extends to a continuous linear map from into the Banach space , as we claimed above.
But let us return to Schwartz functions and prove what is stated in the lemma. By integration by parts
To make this calculation rigorous, one should integrate by parts on a large cube . But the decay properties of ensure that the boundary terms vanish in the limit. Applying this formula with higher derivatives gives a polynomial in on the right. Turning this relation around proves that any polynomial times is the Fourier transform of a Schwartz function and thus bounded.
Similarly we can differentiate :
This is justified by the estimate
Because decays faster than any power of the right hand side is bounded. Repeatedly applying this differentiation formula shows that is smooth. The combination of the differentiation and polynomial rules for the Fourier transform therefore proves that is Schwartz. □
The property of transforming derivatives into polynomials is what makes the Fourier transform a useful tool in solving ODEs and PDEs. Let’s see how it applies to the heat equation. The Fourier transform of the Laplacian is , where we only Fourier transform the space variables and leave out from the integral. Under sufficient regularity assumptions a solution to the heat equation obeys
by interchanging the and integration. For each value of this is an ODE for in the variable . We even get initial conditions by applying the Fourier transform to the initial condition of the PDE . It has the solution
So if we are able to find a function that has this as its Fourier transform, we have solved the heat equation. For this we need to understand how the Fourier transform behaves with respect to products and convolutions.
Proof. This follows by direct calculation.
The second half of the lemma is an easy consequence of the first half together with the inverse Fourier transform, which is given after Theorem 4.7. We really only need the first half of the lemma, but it is much prettier to present the two results side-by-side. □
Because of our earlier example, we know that
Therefore we can conclude that
is a solution to the heat equation with initial condition , where the convolution is only taken over the spatial variables.
Our derivation of the solution has assumed that the functions in question have sufficient regularity such that we were able to interchange the order of integration or differentiate under the integral sign as needed. In the next section we will take the formula for the solution that we have derived and prove directly, under weaker assumptions on , that it solves the Cauchy problem.
Our method of the previous section to solve the homogeneous heat equation through a Fourier transform uncovered a particular Gaussian function. It turns out to be a fundamental solution for the heat equation that is well-suited to the case , which holds for both problems we are interested in.
For one can check that this solves the homogeneous heat equation by direct calculation (Exercise). For we also know that is a smooth function, so in fact solves the heat equation in the strong sense everywhere except . We will show that soon. Similar to the fundamental solution of the Laplace equation, this fundamental solution has the scaling property . You may be wondering if the odd scaling factor for is meaningful. It is, as the following lemma shows.
Proof. . □
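The normalisation and the scaling property can be checked numerically. The sketch below assumes the standard form of the fundamental solution, $\Phi(x,t) = (4\pi t)^{-n/2} e^{-|x|^2/(4t)}$, in dimension $n=1$; the function names and the quadrature parameters are illustrative choices.

```python
import math

def phi(x, t):
    """One-dimensional heat kernel (4 pi t)^(-1/2) exp(-x^2 / (4 t)), for t > 0."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def total_mass(t, half_width=12.0, steps=4000):
    """Trapezoid rule for the integral of phi(., t) over [-L, L],
    where L = half_width * sqrt(4 t) so that the tails are negligible."""
    L = half_width * math.sqrt(4.0 * t)
    h = 2.0 * L / steps
    s = 0.5 * (phi(-L, t) + phi(L, t))
    s += sum(phi(-L + i * h, t) for i in range(1, steps))
    return s * h

if __name__ == "__main__":
    for t in (0.01, 1.0, 25.0):
        print(t, total_mass(t))
    # Scaling in dimension one: phi(lam * x, lam^2 * t) = phi(x, t) / lam.
    print(phi(2 * 0.3, 4 * 0.5), phi(0.3, 0.5) / 2)
```

The integral stays equal to one for every positive time, although the peak height $\Phi(0,t)$ grows like $t^{-1/2}$ as $t \to 0$.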
We can therefore understand the fundamental solution as being similar to a mollifier on . As the function grows and concentrates near the origin. It is not a mollifier because it does not have compact support, but it does lie in and we should expect that the convolution with converges in the limit to the identity. This is the content of the following theorem. This theorem also gives a solution to the Cauchy problem for the homogeneous heat equation under the assumption that the initial condition is continuous and bounded.
Theorem 4.7. For the following function has the properties (i)-(iii):
on
extends continuously to with .
Proof. For by the smoothness of and the boundedness of , the function is well-defined and we can pass derivatives into the integral. This shows that is smooth. Likewise (ii) follows, since solves the heat equation on .
The harder argument is (iii). For any and any in a compact subset of there exists , such that for all (continuity implies uniform continuity on any compact subset). Furthermore there exists , such that
This implies
for all . So converges in the limit uniformly on compact subsets of to . □
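To see the theorem at work, one can evaluate the convolution of the fundamental solution with a bounded continuous function numerically. The sketch below is one-dimensional and uses the hypothetical initial condition $\cos x$, for which the convolution equals $e^{-t}\cos x$ exactly; the name `heat_solution` and the quadrature parameters are assumptions for illustration.

```python
import math

def heat_solution(h0, x, t, cutoff=8.0, steps=1600):
    """Approximate (Phi(., t) * h0)(x) in one dimension.

    The substitution y = x + sqrt(4 t) s turns the convolution into
    (1 / sqrt(pi)) * integral of exp(-s^2) h0(x + sqrt(4 t) s) ds,
    which is evaluated by the trapezoid rule on [-cutoff, cutoff]."""
    if t == 0.0:
        return h0(x)
    scale = math.sqrt(4.0 * t)
    ds = 2.0 * cutoff / steps
    total = 0.0
    for i in range(steps + 1):
        s = -cutoff + i * ds
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.exp(-s * s) * h0(x + scale * s)
    return total * ds / math.sqrt(math.pi)

if __name__ == "__main__":
    x = 0.3
    for t in (1.0, 0.1, 1e-3, 1e-6):
        print(t, heat_solution(math.cos, x, t), math.exp(-t) * math.cos(x))
```

As $t$ decreases the computed values approach $\cos(0.3)$, in line with part (iii).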
Part (iii) of this theorem is also an important lemma in Fourier analysis, because it leads to an explicit formula for the inverse of the Fourier transform. Suppose that . We compute the following integral parameterised in
The trick is to now choose to be the fundamental solution . This gives
Taking the limit as and applying Theorem 4.7(iii) on the right hand side proves
To summarise, the inverse Fourier transform is
The fact that the Fourier transform and its inverse differ only by a sign in the exponent of the exponential is the reason that it has so many ‘dual’ properties, such as for multiplication and convolution, or for differentiation and multiplication by polynomials.
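The inversion formula can be tested numerically on a single Schwartz function. Since the displayed definition is missing above, the sketch assumes the convention $\hat f(k) = \int e^{-2\pi i k x} f(x)\,dx$ with inverse $f(x) = \int e^{2\pi i k x} \hat f(k)\,dk$ in one dimension; with a different convention the constants change. The test function and all names are hypothetical.

```python
import cmath
import math

STEP = 0.03
GRID = [-6.0 + STEP * i for i in range(401)]  # uniform grid on [-6, 6]

def f(x):
    """Hypothetical Schwartz function: a Gaussian centred at 0.5."""
    return math.exp(-math.pi * (x - 0.5) ** 2)

def fourier(g, k, sign=-1):
    """Riemann sum for the integral of exp(sign * 2 pi i k x) g(x) dx over GRID.
    The integrands used here are negligible at the ends of the grid."""
    return STEP * sum(cmath.exp(sign * 2j * math.pi * k * x) * g(x) for x in GRID)

def f_hat(k):
    """Forward transform under the assumed convention."""
    return fourier(f, k, sign=-1)

def inverse_at(x):
    """Inverse transform of f_hat, evaluated at x."""
    return fourier(f_hat, x, sign=+1)

if __name__ == "__main__":
    print("recovered:", inverse_at(0.2), "original:", f(0.2))
```

Under this convention the exact transform of the shifted Gaussian is $e^{-\pi i k} e^{-\pi k^2}$, and the only difference between the two directions in the code is the sign passed to `fourier`.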
The equation above for and is also the important step to extend the Fourier transform to (some) distributions. When we have
If this were written in the notation of distributions it would be . This seems as if it would be a suitable definition of the Fourier transform of a distribution. However, even if is a test function, we can’t be sure that is a test function, only that it is Schwartz, and thus is not defined for all distributions.
Unfortunately there is no way to fix this. Instead we must restrict ourselves to consider only distributions that can act on Schwartz functions. But what does this mean? First we recognise that from Definition 4.2 of constitutes a family of seminorms for Schwartz space. Further the inclusion of the space of test functions into the Schwartz space is continuous and dense with respect to this topology. Therefore we can identify the subspace of distributions that can be extended continuously to act on .
Definition 4.8. Let be a distribution. Suppose that is a sequence of test functions that converges to zero in , i.e. for all . We say that is a tempered distribution if . If is a tempered distribution then it acts on a Schwartz function by
for any sequence of test functions that converges to in . For tempered distributions, we define the Fourier transform .
Many of the properties of Fourier transforms on carry over to , in particular the differentiation and polynomial multiplication rules. Defining the Fourier transform on distributions is not just a convenient way to extend it to a large class of functions but actually essential for understanding the Fourier transforms of many common functions. For example, the Fourier transform of the constant function is the delta distribution.
Fourier analysis can also solve the inhomogeneous heat equation on . Taking the transform of the PDE results in the inhomogeneous ODE
This has the solution
We recognise the first term from the homogeneous case. The second term is new, but it is the integral over time of the product of and . Performing the inverse transform suggests the following solution
It remains to consider the regularity of the second integral.
Theorem 4.9 (Solution of the inhomogeneous heat equation). If is twice continuously differentiable on and bounded together with its derivatives up to second order, then
solves the inhomogeneous initial value problem
Proof. The integrand has a singularity when . Therefore consider
To this function we can apply the heat equation with impunity:
Theorem 4.7 (iii) implies on . Additionally . The assumptions on are sufficient to conclude that
and . Properly one should bound the difference between and , which is the integral in time over the short interval , in a similar manner to Theorem 3.2. □
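The time-integrated convolution can also be evaluated numerically. The sketch below is one-dimensional and uses the hypothetical source $f(x,t) = \cos x$ with zero initial condition, for which the solution is $(1-e^{-t})\cos x$. The midpoint rule in time never evaluates the kernel at the singular time; the names and step counts are illustrative assumptions.

```python
import math

def phi_conv(g, x, tau, cutoff=7.0, steps=280):
    """(Phi(., tau) * g)(x) in one dimension via the substitution
    y = x + sqrt(4 tau) s; the integrand is negligible at |s| = cutoff."""
    scale = math.sqrt(4.0 * tau)
    ds = 2.0 * cutoff / steps
    total = 0.0
    for i in range(steps + 1):
        s = -cutoff + i * ds
        total += math.exp(-s * s) * g(x + scale * s)
    return total * ds / math.sqrt(math.pi)

def duhamel(f, x, t, time_steps=200):
    """Midpoint rule in s for the integral over (0, t) of
    (Phi(., t - s) * f(., s))(x). Midpoints avoid the singular time s = t."""
    dt = t / time_steps
    total = 0.0
    for j in range(time_steps):
        s = (j + 0.5) * dt
        total += phi_conv(lambda y: f(y, s), x, t - s)
    return total * dt

if __name__ == "__main__":
    source = lambda y, s: math.cos(y)  # hypothetical time-independent source
    print(duhamel(source, 0.3, 1.0), (1 - math.exp(-1.0)) * math.cos(0.3))
```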
We summarise our inquiries with the following statement.
Corollary 4.10. Suppose is twice continuously differentiable on and bounded together with its derivatives up to second order, and is continuous and bounded on . The inhomogeneous initial value problem has the following solution:
To finish the section we make some qualitative remarks on the behaviour of these solutions. The two integrals are a homogeneous solution that satisfies the initial condition and an inhomogeneous solution that vanishes initially. One is reminded of the Green’s representation formula, which was also two integrals dividing the task between themselves. We can also see that as a physics model of heat it violates the principle of locality and the speed of light. Consider , so there are no additional sources of heat, and suppose the initial temperature is non-negative and has compact support. Then for any point and time the solution is positive, because is everywhere positive. The interpretation is that the heat that was present in the support of has instantly spread out to the whole space.
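The instantaneous spreading can be quantified in a simple one-dimensional case. As a sketch, take the initial temperature to be the indicator function of $[-1,1]$ (discontinuous, so strictly outside the continuity assumption, but chosen because the convolution has a closed form in terms of the complementary error function). The function name and sample points are illustrative.

```python
import math

def box_solution(x, t):
    """Convolution of the 1D fundamental solution with the indicator of [-1, 1]:
    0.5 * (erfc((x - 1) / sqrt(4 t)) - erfc((x + 1) / sqrt(4 t)))."""
    r = math.sqrt(4.0 * t)
    return 0.5 * (math.erfc((x - 1.0) / r) - math.erfc((x + 1.0) / r))

if __name__ == "__main__":
    # Far outside the initial support the temperature is tiny but strictly positive.
    for x in (1.5, 2.0, 3.0):
        print(x, box_solution(x, 0.1))
```

Using `erfc` rather than differences of `erf` avoids cancellation in the tails. For points very far from the support the true value is positive but smaller than the smallest representable floating-point number, so the computed value underflows to zero.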
Like elliptic PDEs, parabolic PDEs also have a maximum principle. In this section we will prove a weak maximum principle for the heat equation and apply it to the question of uniqueness of the Dirichlet and Cauchy problems. There is an approach to the maximum principle based on so-called ‘heat balls’ that mimic the mean value property for the Laplace equation (see Evans), but this is computationally messy. Instead we follow Han and give a proof in the style of Theorem 3.13.
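The domain of the heat equation distinguishes time and spatial directions. We therefore make special definitions adapted to this distinction. For any open domain $\Omega \subset \mathbb{R}^n$ and any $T > 0$ we define the parabolic cylinder as $\Omega_T := \Omega \times (0, T]$. The parabolic boundary of $\Omega_T$ is defined as $\partial\Omega_T := \bar{\Omega}_T \setminus \Omega_T$. It is the union of $\bar{\Omega} \times \{0\}$ and $\partial\Omega \times [0, T]$ and does not contain the points inside of $\Omega$ at time $T$.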
Theorem 4.11 (Weak maximum principle for the heat equation). Let be open and bounded and a twice differentiable function on that extends continuously to . Suppose that is a subsolution to the heat equation:
on . Then the maximum of is taken on .
Proof. Note because is bounded that is compact, and thus must have a maximum. The theorem claims that the maximum occurs on the boundary, but does not forbid it from also occurring on the interior. The constant function would be an example where the maximum is taken both on the boundary and the interior.
We first prove the theorem under the stronger assumption that . Suppose that has a maximum at . If then we can also say that , otherwise if we can only say that . In either case we see that . Also because this point is a maximum and the Hessian in the spatial coordinates is negative semidefinite. As argued in Theorem 3.13 at such a point . But now we have a contradiction. Therefore the maximum cannot occur on .
Next we handle the general case with a trick similar to Theorem 3.13. For any define
This forces
Thus the special case applies to and we conclude that the maximum of occurs on the boundary. But we can now argue
Taking yields the result. □
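A discrete analogue of the weak maximum principle can be observed with the explicit finite-difference scheme on the unit interval. This scheme is an illustrative choice and is not the method of proof above. When the ratio $r = \Delta t/\Delta x^2$ is at most $1/2$, each new interior value is a convex combination of three old values, so it cannot exceed the largest value seen earlier. The initial and boundary data below are hypothetical.

```python
import math

def explicit_heat(u0, left, right, nx=50, nt=2000, T=0.2):
    """Explicit finite differences for u_t = u_xx on [0, 1] x [0, T].
    Returns all time slices; requires r = dt / dx^2 <= 1/2 for stability."""
    dx = 1.0 / nx
    dt = T / nt
    r = dt / (dx * dx)
    assert r <= 0.5, "time step too large for the explicit scheme"
    first = [u0(i * dx) for i in range(nx + 1)]
    first[0], first[nx] = left(0.0), right(0.0)
    grid = [first]
    for n in range(1, nt + 1):
        prev = grid[-1]
        inner = [prev[i] + r * (prev[i - 1] - 2 * prev[i] + prev[i + 1])
                 for i in range(1, nx)]
        grid.append([left(n * dt)] + inner + [right(n * dt)])
    return grid

def boundary_and_interior_max(grid):
    """Maximum over the discrete parabolic boundary (initial slice and both
    spatial ends) and over the remaining grid points, final time included."""
    nx = len(grid[0]) - 1
    boundary = max(max(grid[0]), max(max(row[0], row[nx]) for row in grid))
    interior = max(max(row[1:nx]) for row in grid[1:])
    return boundary, interior

if __name__ == "__main__":
    grid = explicit_heat(lambda x: math.sin(math.pi * x),
                         lambda t: 2.0 * t, lambda t: 0.0)
    print(boundary_and_interior_max(grid))
```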
The following is an easy consequence, similar to the uniqueness of the Dirichlet problem for the Laplace equation.
Theorem 4.12. On an open and bounded domain there exists at most one solution of the Dirichlet problem for the inhomogeneous heat equation.
Proof. Suppose that there were two solutions. Consider their difference . This function must solve the homogeneous heat equation and vanishes on both the initial boundary and the spatial boundary . In other words, it is zero on the parabolic boundary. By the weak maximum principle applied to and the maximum and minimum of is zero. Thus and the two solutions are equal. □
We can also conclude the ‘comparison principle’ or ‘monotonicity property’ for the heat equation: If one body starts hotter than another at every point , stays hotter on the boundary and receives more heat on the interior , then at every point and every time the first body is hotter than the second.
Remarkably we can also use the weak maximum principle to show a form of uniqueness in the Cauchy problem, even though it is on an unbounded domain. We must be careful however, as we have seen that the solution is not unique: we began the chapter with the example of a function that is identically zero initially and then springs to life. Any such example however must be a monster.
Theorem 4.13. Let be a solution on of the Cauchy problem:
which is bounded by on for some positive constants . Then is identically zero.
Proof. Choose . We will prove that on . The result then holds on by induction on the decomposition .
For any , define the function
on . It is an easy check that solves the homogeneous heat equation and it is clearly positive. Moreover, on the sphere it is larger than , since
Hence by the maximum principle we know that on all of .
Now choose any point . For all we know that . But
Thus too. □
The obvious question is whether the solution given by Corollary 4.10 meets this growth condition. If it does, then it is the unique solution that does. Suppose therefore that and are bounded by and on for some , . Observe the following doubling relation for the fundamental solution
For this implies We compute the first integral from the formula for the solution:
The last step of the calculation was achieved by the estimate and using the fact that for any positive time the fundamental solution has integral 1, Lemma 4.6. For the second integral of in the formula of the solution, the above estimate also applies, but further we need to integrate. Again for we have
Together this proves that on for , and . Thus we have proven short-time existence and uniqueness for the Cauchy problem. The short-time limitation is unavoidable. Consider the solution of the homogeneous heat equation. It has the initial condition but explodes for .
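The formula of the exploding solution did not survive in the text above. A standard example of this kind in one dimension, which we assume here purely for illustration, is $u(x,t) = (1-4at)^{-1/2} \exp\big(a x^2/(1-4at)\big)$ with initial condition $e^{a x^2}$ and blow-up at $t = 1/(4a)$. The sketch checks the heat equation by finite differences and displays the growth near the blow-up time.

```python
import math

def exploding(x, t, a=1.0):
    """Assumed example u = (1 - 4 a t)^(-1/2) exp(a x^2 / (1 - 4 a t)),
    defined only for t < 1 / (4 a)."""
    d = 1.0 - 4.0 * a * t
    return math.exp(a * x * x / d) / math.sqrt(d)

def heat_residual(u, x, t, h=1e-4):
    """Central-difference approximation of u_t - u_xx at the point (x, t)."""
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / (h * h)
    return u_t - u_xx

if __name__ == "__main__":
    print("residual:", heat_residual(exploding, 0.5, 0.1))
    for t in (0.0, 0.2, 0.24, 0.2499):
        print(t, exploding(0.0, t))
```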
In the last section we showed that we had found the unique (non-monstrous) solution to the Cauchy problem and proved uniqueness for the Dirichlet problem. It remains to solve the Dirichlet problem, at least in some special cases. That is the goal of this section. In analogy to the Green’s function of the Laplace equation we define:
Definition 4.14. For a bounded open domain the heat kernel of is characterised by the following two properties:
For the function solves the homogeneous heat equation and extends continuously to with value on .
For extends continuously to with value on .
Some properties of Green’s functions carry over with essentially the same proof.
Proof. For each let . This solves the homogeneous heat equation with initial condition and boundary condition for , since on the boundary. This defines a Dirichlet problem and we know that there is at most one solution, due to Theorem 4.12. □
However, the heat kernel has a nice property that Green’s functions do not have: the heat kernel of the Cartesian product of two domains can be easily calculated in terms of the heat kernels of both domains:
Lemma 4.16. If and are two open, bounded and connected domains with given heat kernels and , then the heat kernel of is given by
Proof. For any the function extends by the value zero continuously to . The Laplace operator of the Cartesian product is the sum of the corresponding Laplace operators. We calculate
Hence for all the function solves the homogeneous heat equation. The product of both fundamental solutions is the fundamental solution on . Hence for all the function
extends continuously to by setting it zero on . □
The minor technicality is that the boundaries of the Cartesian products are not continuously differentiable submanifolds and our proof of the divergence theorem does not apply to these Cartesian products. However, the divergence theorem can be extended to these Cartesian products, so this is indeed only a technicality.
We want to develop a formula for the solution to the Dirichlet problem similar to the Poisson formula. Therefore we begin by giving a representation formula. To start, take Green’s second formula with and , two functions on with appropriate regularity, with and treated as additional parameters. Now integrate this over from to to obtain
We should explain some of the choices. The choice of in creates a convolution type formula, which we expect from our experience with the Laplace equation and fundamental solutions in general. But if we were to integrate all the way to , then we would have a singularity in . Integrating to is akin to using a ball in the derivation of the Green’s representation formula for the Laplace equation. Finally, is zero for , so this term drops out.
We need a similar formula with in place of the Laplacian so that we can combine them and get the heat operator. Therefore we take the expression we need and integrate by parts
When subtracting the two equations, , leaving
Finally we wish to take . The interesting term is the first term after the equal sign. We use Property (i) of the heat kernel and Theorem 4.7 to deduce the limit:
Rearranging terms we arrive at the following representation formula:
As with the Laplace equation, inserting the boundary conditions and inhomogeneities into this formula defines a valid solution, furnishing us with a solution to the Dirichlet problem.
Theorem 4.17 (Solution of the Dirichlet problem). Let be a function on , a function on and a function on which together with the open domain have appropriate regularity such that all appearing integrals converge absolutely. Then
is the unique solution of the initial and boundary value problem
We do not give a full proof of this statement. Let us think through how we might try to prove this theorem in the general case, using as few assumptions on as possible. The proof should be similar to the proof of Poisson’s representation formula 3.21, but we must argue from the definition of the heat kernel rather than having a concrete formula for the Green’s function.
Perhaps the most important property is the symmetry of and . This allows us to conclude that, away from the singularity, is also a solution to the homogeneous heat equation in . Applying the heat operator to the second and third terms should then cause them to vanish.
Proof. We insert into the representation formula, using limits where appropriate to avoid the singularities:
For the Laplace equation, we had Weyl’s lemma to prove the regularity of harmonic functions. However, we also have the result that harmonic functions are analytic, which was obtained by applying the specific formula for the Green’s function of a ball on a neighborhood of any point of a harmonic function. In the next section we will derive the heat kernel on a cube. This can also be used to prove the regularity of solutions to the homogeneous heat equation. Then we can use the trick to show that the first and third integrals have the same behavior as the integrals in Corollary 4.10.
Thus it again comes down to understanding the integral over . In the proof of the Poisson formula, we abstracted out the properties that were required of the normal derivative . The necessary properties are more difficult to establish, so we will stop the proof here. Hopefully this gives you a taste of the task required of a proof of Theorem 4.17 for a general domain. Instead, we close with one more property of general heat kernels (which might give you an idea of how is proven.)
Lemma 4.19. For any bounded open domain the corresponding heat kernel is positive on the corresponding parabolic cylinder, if it exists.
Proof. The fundamental solution is positive on . For bounded open domains and given the difference of the fundamental solution minus the heat kernel is the unique solution of the heat equation on which vanishes on and coincides on with . This solution is for all on and on not larger than . By the Maximum Principle it is not larger than and is positive. □
Despite our hard work, we still haven’t actually solved the Dirichlet problem for even a single domain . It is long past time to rectify that. We begin with the simplest case where every open bounded domain is the union of intervals. Up to scaling and translation then, we need only consider the unit interval .
There are several ideas that lead to the heat kernel. The method of images will be explored in the exercises. Here we give an argument based on the eigenfunctions. If you recall from the beginning of the chapter, the special class of separable solutions is connected to the eigenfunctions of the Laplacian . In dimension one the eigenfunctions have eigenvalues . If we look for eigenfunctions that vanish on the boundary, then this is only possible if and then
is the unique solution up to scaling. This particular scaling has been chosen because it makes these functions orthonormal with respect to the inner product on . Due to the Stone-Weierstrass theorem, these functions are also dense in the space of functions that vanish at . But by Property (ii) of heat kernels, is such a function. Therefore we expect
This is essentially the Fourier series of the heat kernel. The unique solution to the homogeneous heat equation with as initial condition and vanishing for is
If is the heat kernel of then it must fulfil the representation for these functions. Hence
This brings us to a formula for the heat kernel
The method of images leads to the equivalent formula
where is Jacobi’s Theta function, a well-studied ‘special’ function defined by the series
This sum converges on the domain very rapidly since decays exponentially with respect to , making it useful for computation. The sine formula for the heat kernel also has this property, but nonetheless it is useful to be able to call on standard functions when using a program such as Mathematica or Matlab. The Theta function is theoretically important because of its quasiperiodicity:
From the heat kernel on we can construct the heat kernel on any interval. The fundamental solution scales according to . It is also invariant if we translate and by the same amount. Since the heat kernel is unique, it must be
And further, by Lemma 4.16 we have the heat kernel on any box .
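The agreement between the eigenfunction expansion and the method of images can be checked numerically on the unit interval. The sketch assumes the standard forms $2\sum_{k\ge 1} e^{-k^2\pi^2 t}\sin(k\pi x)\sin(k\pi y)$ for the sine series and $\sum_{m\in\mathbb{Z}}\big(\Phi(x-y+2m,t) - \Phi(x+y+2m,t)\big)$ for the image sum; the truncation parameters and function names are illustrative.

```python
import math

def phi(x, t):
    """One-dimensional fundamental solution (4 pi t)^(-1/2) exp(-x^2 / (4 t))."""
    return math.exp(-x * x / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def kernel_sine(x, y, t, terms=60):
    """Truncated eigenfunction expansion of the heat kernel of (0, 1)."""
    return 2.0 * sum(math.exp(-(k * math.pi) ** 2 * t)
                     * math.sin(k * math.pi * x) * math.sin(k * math.pi * y)
                     for k in range(1, terms + 1))

def kernel_images(x, y, t, reflections=10):
    """Truncated method-of-images sum: sources at y + 2m, sinks at -y + 2m."""
    return sum(phi(x - y + 2 * m, t) - phi(x + y + 2 * m, t)
               for m in range(-reflections, reflections + 1))

if __name__ == "__main__":
    for t in (0.01, 0.05, 0.5):
        print(t, kernel_sine(0.3, 0.6, t), kernel_images(0.3, 0.6, t))
```

The two sums complement each other in practice: the sine series needs few terms for large $t$, the image sum needs few terms for small $t$. For very small $t$ the fixed truncation of the sine series used here is no longer sufficient and `terms` would have to be increased.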
We close this chapter with a final result on regularity. Due to the existence of monster solutions, we cannot hope for analyticity in the time coordinate, but we at least have smoothness.
Corollary 4.20. Any solution of the homogeneous heat equation on an open domain in is smooth and for fixed analytic with respect to .
Proof. For any point in the domain, we can find a small cube in space and time that contains the point. By translation, assume that the cube is and the point is time . Then using the heat kernel on this domain, we obtain from the representation formula
It remains to show that the regularity of the heat kernel is transferred to . This can be calculated using the explicit formula, but we give a more conceptual argument. In the proof of Theorem 4.7 we showed that converges on the complement of uniformly to zero in the limit . The same is true for all partial derivatives and due to condition (ii) in Definition 4.14 also for . By Lemma 4.18 the integral for is smooth at all . For the Taylor series of converges uniformly on compact subsets of to . □