Chapter 1
First Order PDEs

In this introductory chapter we first define partial differential equations and then study first order partial differential equations. We shall see that these are simpler than higher order partial differential equations: in contrast to higher order equations, they behave much like ordinary differential equations and can be solved using the theory of ordinary differential equations. After this introductory chapter we shall focus on second order partial differential equations. Before we treat the three main examples of second order differential equations we introduce some general concepts in the next chapter; these concepts are partially motivated by observations made in the present chapter.

A partial differential equation is an equation relating the partial derivatives of a function that depends on at least two variables.

Definition 1.1. A possibly vector valued equation of the form

\[
F\left(D^k u(x), D^{k-1} u(x), \ldots, D u(x), u(x), x\right) = 0
\]

is called a partial differential equation of order $k$. Here $F$ is a given function and $u$ an unknown function. The expression $D^k u$ denotes the vector of all partial derivatives of order $k$ of the function $u$. The function $u$ is called a solution of the differential equation if $u$ is $k$ times differentiable and obeys the partial differential equation.

On open subsets $\Omega \subset \mathbb{R}^n$ we denote partial derivatives of higher order by $\partial^\gamma = \prod_i \partial_i^{\gamma_i} = \prod_i \left(\frac{\partial}{\partial x_i}\right)^{\gamma_i}$ with multi-indices $\gamma \in \mathbb{N}_0^n$ of length $|\gamma| = \sum_i \gamma_i$. The multi-indices are partially ordered by $\delta \le \gamma \Leftrightarrow \delta_i \le \gamma_i$ for $i = 1, \ldots, n$. A partial derivative acts only on the immediately following function; it acts on a product of functions only if the product is grouped together in brackets.
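For instance (our own illustration of this notation, not an example from the text), for $n = 2$ and $\gamma = (1, 2)$:

\[
\partial^\gamma u = \left(\frac{\partial}{\partial x_1}\right)^{1} \left(\frac{\partial}{\partial x_2}\right)^{2} u = \frac{\partial^3 u}{\partial x_1 \partial x_2^2}, \qquad |\gamma| = 1 + 2 = 3.
\]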

1.1 Homogeneous Transport Equation

One of the simplest partial differential equations is the transport equation:

\[
\dot u + b \cdot \nabla u = 0.
\]

Here $\dot u$ denotes the partial derivative $\frac{\partial u}{\partial t}$ of the unknown function $u : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$, $b \in \mathbb{R}^n$ is a vector, and the product $b \cdot \nabla u$ denotes the scalar product of the vector $b$ with the vector of the first partial derivatives of $u$ with respect to $x$:

\[
b \cdot \nabla u(x,t) = b_1 \frac{\partial u(x,t)}{\partial x_1} + \cdots + b_n \frac{\partial u(x,t)}{\partial x_n}.
\]

Let us first assume that $u(x,t)$ is a differentiable solution of the transport equation. For every fixed $(x_0, t_0) \in \mathbb{R}^n \times \mathbb{R}$ the function

\[
z(s) = u(x_0 + s b, t_0 + s)
\]

is a differentiable function of $s \in \mathbb{R}$ whose first derivative vanishes:

\[
z'(s) = b \cdot \nabla u(x_0 + s b, t_0 + s) + \dot u(x_0 + s b, t_0 + s) = 0.
\]

Therefore $u$ is constant along all parallel straight lines in the direction of $(b, 1)$. Furthermore, $u$ is completely determined by its values on all these parallel straight lines.

Initial Value Problem 1.2. We seek a solution $u : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ of the transport equation $\dot u + b \cdot \nabla u = 0$ with given $b \in \mathbb{R}^n$, which at $t = 0$ is equal to some given function $g : \mathbb{R}^n \to \mathbb{R}$. We call this the Cauchy problem (or initial value problem) for the transport equation.

With the additional initial data, we can now uniquely determine a solution. All parallel straight lines in the direction of $(b, 1)$ intersect $\mathbb{R}^n \times \{0\}$ exactly once. So choose $t_0 = 0$, giving the parameterised lines

\[
(x(s), t(s)) = (x_0 + s b, s), \qquad (x_0, 0) \in \mathbb{R}^n \times \{0\}.
\]

The initial point of any line can be determined by $x_0 = x - s b = x - t b$. Thus the value of $u$ on each straight line is determined by the initial condition. These lines are in general called characteristic curves. The solution has to be equal to

\[
u(x,t) = u(x_0 + t b, t) = u(x_0, 0) = g(x_0) = g(x - t b).
\]

If $g$ is differentiable on $\mathbb{R}^n$, then this function indeed solves the transport equation, and in this case the initial value problem has a unique solution. Otherwise, if $g$ is not differentiable on $\mathbb{R}^n$, then the initial value problem does not have a solution. As we have seen above, whenever the initial value problem has a solution, the function $u(x,t) = g(x - t b)$ is the unique solution. So it might be that this candidate is a solution in a more general sense.
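As a quick numerical sanity check (a minimal sketch of our own, not part of the text's development; the bump $g$ and the vector $b$ below are illustrative assumptions), one can verify with finite differences that $u(x,t) = g(x - t b)$ satisfies the transport equation:

```python
import numpy as np

# Illustrative data: a smooth initial value g and a fixed vector b in R^2.
b = np.array([1.0, -0.5])

def g(x):
    # a smooth bump; x has shape (..., 2)
    return np.exp(-np.sum(x**2, axis=-1))

def u(x, t):
    # the solution formula u(x,t) = g(x - t b) derived above
    return g(x - t * b)

# Central-difference check of u_t + b . grad u = 0 at a sample point.
x0, t0, h = np.array([0.3, -0.2]), 0.7, 1e-5
u_t = (u(x0, t0 + h) - u(x0, t0 - h)) / (2 * h)
grad = np.array([(u(x0 + h * e, t0) - u(x0 - h * e, t0)) / (2 * h)
                 for e in np.eye(2)])
print(u_t + b @ grad)  # approximately 0
```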

1.2 Inhomogeneous Transport Equation

Now we consider the corresponding inhomogeneous transport equation:

\[
\dot u + b \cdot \nabla u = f.
\]

Again $b \in \mathbb{R}^n$ is a given vector, $f : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ is a given function and $u : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ is the unknown function.

Initial Value Problem 1.3. Given a vector $b \in \mathbb{R}^n$, a function $f : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ and an initial value $g : \mathbb{R}^n \to \mathbb{R}$, we seek a solution to the Cauchy problem for the inhomogeneous transport equation: a function $u : \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ that satisfies

\[
\dot u + b \cdot \nabla u = f \quad\text{with}\quad u(x, 0) = g(x).
\]

Similar to the homogeneous case, we define for each $(x_0, 0) \in \mathbb{R}^n \times \mathbb{R}$ the function $z(s) = u(x_0 + s b, s)$, which solves

\[
z'(s) = b \cdot \nabla u(x_0 + s b, s) + \dot u(x_0 + s b, s) = f(x_0 + s b, s).
\]

Notice that the right hand side is only a function of $s$. Moreover $z(0) = u(x_0, 0) = g(x_0)$ is known. Thus we can integrate and determine $z(s)$ completely. This tells us the value of $u$ at any point on the line $(x_0 + s b, s) \in \mathbb{R}^n \times \mathbb{R}$.

We can also gather this information into a formula for $u$. The point $(x,t)$ lies on the line $(x_0 + s b, s)$ with $s = t$ and $x_0 = x - t b$. Therefore

\[
u(x,t) = z(t) = z(0) + \int_0^t z'(s)\,ds = g(x_0) + \int_0^t f(x_0 + s b, s)\,ds = g(x - t b) + \int_0^t f(x + (s - t) b, s)\,ds.
\]

We observe that this formula is analogous to the formula for solutions of inhomogeneous initial value problems of linear ODEs. The unique solution is the sum of the unique solution of the corresponding homogeneous initial value problem and the integral over solutions of the homogeneous equation with the inhomogeneity as initial values. We obtained these solutions of the first order homogeneous and inhomogeneous transport equations by solving an ODE. We shall generalise this method in Section 1.5 and solve more general first order PDEs by solving an appropriately chosen system of first order ODEs.
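The formula is easy to evaluate numerically. The following sketch (our own illustration; the one-dimensional data $b$, $g$ and $f$ are assumptions chosen for the demo) computes the integral with quadrature and checks the PDE residual by finite differences:

```python
import numpy as np
from scipy.integrate import quad

# Illustrative one-dimensional data (n = 1).
b = 2.0
g = lambda x: np.sin(x)
f = lambda x, t: x * np.exp(-t)

def u(x, t):
    # u(x,t) = g(x - t b) + int_0^t f(x + (s - t) b, s) ds
    integral, _ = quad(lambda s: f(x + (s - t) * b, s), 0.0, t)
    return g(x - t * b) + integral

print(u(1.5, 0.0), g(1.5))   # the initial condition holds by construction
# finite-difference residual of u_t + b u_x - f at a sample point
x0, t0, h = 0.4, 0.9, 1e-5
res = ((u(x0, t0 + h) - u(x0, t0 - h)) / (2 * h)
       + b * (u(x0 + h, t0) - u(x0 - h, t0)) / (2 * h)
       - f(x0, t0))
print(res)                    # approximately 0
```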

1.3 Scalar Conservation Laws

In this section we consider the following class of non-linear first order differential equations

\[
\dot u(x,t) + \frac{\partial f(u(x,t))}{\partial x} = \dot u(x,t) + f'(u(x,t))\,\frac{\partial u(x,t)}{\partial x} = 0
\]

for a smooth function $f : \mathbb{R} \to \mathbb{R}$. Here $u : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ is the unknown function. This equation is called a scalar conservation law and is a non-linear first order PDE. For any compact interval $[a,b]$ we calculate

\[
\frac{d}{dt} \int_a^b u(x,t)\,dx = \int_a^b \dot u(x,t)\,dx = -\int_a^b \frac{\partial f(u(x,t))}{\partial x}\,dx = f(u(a,t)) - f(u(b,t)).
\]

This is the meaning of a conservation law: the change of the integral of $u(\cdot,t)$ over $[a,b]$ is equal to the 'flux' of $f(u(x,t))$ through the 'boundary' $\partial [a,b] = \{a, b\}$.

Thinking of $t$ as time, the natural boundary condition to consider is $u(x, 0) = g(x)$ for all $x \in \mathbb{R}$ with some given function $g : \mathbb{R} \to \mathbb{R}$. Let us try to apply the method of characteristics to these equations: we assume that there exists a solution $u$ and try to understand how the value of $u$ changes along a curve $(x(s), t(s))$ in its domain. The difference from the transport equation is that we do not assume that the curves are straight lines; it remains to be seen which curves we should choose. Let $z(s) = u(x(s), t(s))$. The derivative is

\[
z'(s) = \frac{\partial u(x(s), t(s))}{\partial x}\, x'(s) + \frac{\partial u(x(s), t(s))}{\partial t}\, t'(s).
\]

Hence if we choose the curve $x(s)$ with the property that $x'(s) = f'(u(x(s), s))$ and $t(s)$ with the property that $t'(s) = 1$, so that $t(s) = s$, then

\[
z'(s) = \frac{\partial u(x(s), s)}{\partial x}\, f'(u(x(s), s)) + \dot u(x(s), s) = 0.
\]

This shows that z is constant along these particular curves.

There remain two things to determine: what is the value of $z$, and does there even exist a curve $x(s)$ with the required property? We make the assumption that the characteristic curve begins at the point $(x_0, 0)$, in other words $x(0) = x_0$. By the constancy of $z$ and the initial conditions we have $z(s) = u(x(0), 0) = u(x_0, 0) = g(x_0)$. This answers the first question. The second question is now answerable too: the derivative of $x(s)$ is constant and equal to

\[
x'(s) = f'(u(x(s), s)) = f'(z(s)) = f'(g(x_0)).
\]

The characteristic curve is therefore $x(s) = x_0 + s f'(g(x_0))$. Together this shows that the solution of the PDE is uniquely determined by the initial condition, if it exists.

Instead of thinking about a single characteristic curve and initial point, let us think about all characteristic curves. This point of view implies the solution obeys

\[
u(x_0 + t f'(g(x_0)), t) = g(x_0) \quad\text{for all}\quad x_0 \in \mathbb{R},\; t \in \mathbb{R}.
\]

The characteristic curves with initial points $x_1, x_2$ with $g(x_1) \ne g(x_2)$ might intersect at some $t \in \mathbb{R}^+$. In this case the method of characteristics implies $g(x_1) = u(x_1 + t f'(g(x_1)), t) = u(x_2 + t f'(g(x_2)), t) = g(x_2)$, which is impossible. This situation is called crossing characteristics. But otherwise the above implicit equation for $u$ can be solved and defines a solution to the PDE.
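The time of the first crossing can be computed directly from the initial values. The following sketch (our own illustration; Burgers' flux $f(u) = u^2/2$ and the decreasing initial value $g$ are assumptions) finds where two given characteristics intersect:

```python
import numpy as np

# Illustrative assumptions: f(u) = u^2/2, so f'(u) = u, and a decreasing g.
g = lambda x: -np.tanh(x)
fprime = lambda u: u

def crossing_time(x1, x2):
    # Solve x1 + t f'(g(x1)) = x2 + t f'(g(x2)) for t.
    s1, s2 = fprime(g(x1)), fprime(g(x2))
    return (x2 - x1) / (s1 - s2)  # positive iff the characteristics cross for t > 0

print(crossing_time(-1.0, 1.0))   # a finite positive crossing time
```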

Theorem 1.4. If $f \in C^2(\mathbb{R}, \mathbb{R})$ and $g \in C^1(\mathbb{R}, \mathbb{R})$ with $f''(g(x))\,g'(x) \ge -\alpha$ for all $x \in \mathbb{R}$ and some $\alpha \ge 0$, then there is a unique $C^1$-solution of the initial value problem for the scalar conservation law

\[
\frac{\partial u(x,t)}{\partial t} + f'(u(x,t))\,\frac{\partial u(x,t)}{\partial x} = 0 \quad\text{with}\quad u(x, 0) = g(x)
\]

for $(x,t) \in \mathbb{R} \times [0, \alpha^{-1})$ if $\alpha > 0$ and for $(x,t) \in \mathbb{R} \times [0, \infty)$ if $\alpha = 0$.

Proof. By the method of characteristics the solution $u(x,t)$ is equal to $g(x_0)$ on the lines $x = x_0 + t f'(g(x_0))$. For all $t \ge 0$ with $1 - t\alpha > 0$ the derivative of $x_0 \mapsto x_0 + t f'(g(x_0))$ obeys

\[
1 + t f''(g(x_0))\,g'(x_0) \ge 1 - t\alpha > 0.
\]

Hence $x_0 + t f'(g(x_0))$ is a strictly increasing function of $x_0$ and therefore injective. Moreover $\lim_{x_0 \to \pm\infty} x_0 + t f'(g(x_0)) = \pm\infty$, because there is a minimum rate of growth. So $x_0 \mapsto x_0 + t f'(g(x_0))$ is a $C^1$-diffeomorphism from $\mathbb{R}$ onto $\mathbb{R}$. Therefore there exists for any $x \in \mathbb{R}$ a unique $x_0$ with $x_0 + t f'(g(x_0)) = x$. Then $u(x,t) = g(x_0)$ solves the initial value problem. □

Example 1.5. For $n = 1$ and $f(u) = \frac{1}{2} u^2$ we obtain Burgers' equation:

\[
\dot u(x,t) + u(x,t)\,\frac{\partial u(x,t)}{\partial x} = 0.
\]

The solutions of the corresponding characteristic equations are $x(t) = x_0 + g(x_0)\,t$. Therefore the solutions of the corresponding initial value problem obey

\[
u(x + t g(x), t) = g(x).
\]

If $g$ is continuously differentiable and monotonically increasing, then for all $t \in [0, \infty)$ the map $x \mapsto x + t g(x)$ is a $C^1$-diffeomorphism from $\mathbb{R}$ onto $\mathbb{R}$ and there is a unique $C^1$-solution on $\mathbb{R} \times [0, \infty)$. More generally, if $g'(x) \ge -\alpha$ with $\alpha \ge 0$, then there is a unique $C^1$-solution on $\mathbb{R} \times [0, \alpha^{-1})$ for $\alpha > 0$ and on $\mathbb{R} \times [0, \infty)$ for $\alpha = 0$.
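The implicit equation $u(x + t g(x), t) = g(x)$ can be evaluated pointwise with a root finder, since for increasing $g$ the map $x_0 \mapsto x_0 + t g(x_0)$ is strictly monotone. A sketch (our own; the increasing initial value $g(x) = \arctan(x)$ is an assumption):

```python
import numpy as np
from scipy.optimize import brentq

g = np.arctan  # increasing initial value, bounded by pi/2

def u(x, t):
    # Find the foot point x0 of the characteristic through (x, t), i.e. solve
    # x0 + t g(x0) = x; the left hand side is strictly increasing in x0.
    # Since |g| < pi/2, the bracket below always contains the root.
    x0 = brentq(lambda x0: x0 + t * g(x0) - x,
                x - 2 - t * np.pi, x + 2 + t * np.pi)
    return g(x0)

print(u(0.5, 3.0))  # the value transported along the characteristic through (0.5, 3)
```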

1.4 Noncharacteristic Hypersurfaces

Until now we have only considered specific PDEs where one variable was labelled 'time' and the initial condition was posed at $t = 0$. In this section we shall consider boundary conditions for the general first order PDE

\[
F(\nabla u(x), u(x), x) = 0
\]

on the domain $\Omega \subset \mathbb{R}^n$ with the boundary condition $u(y) = g(y)$ for all $y \in \Sigma$. Here $u$ is a real unknown function on an open domain $\Omega \subset \mathbb{R}^n$ and $F$ is a real function on an open subset $W \subset \mathbb{R}^n \times \mathbb{R} \times \Omega$. For the boundary condition we assume that $\Sigma = \{x \in \Omega \mid \varphi(x) = \varphi(x_0)\}$ is the level-set of a function $\varphi$, which we call a hypersurface.

We will first show that locally every Cauchy problem can be brought into the following form:

\[
u(y) = g(y) \text{ for all } y \in \Omega \cap H \quad\text{with}\quad H = \{x \in \mathbb{R}^n \mid x \cdot e_n = x_0 \cdot e_n\}.
\]

Here $e_n = (0, \ldots, 0, 1)$ denotes the $n$-th element of the canonical basis and $H$ the unique hyperplane through $x_0 \in \Omega$ orthogonal to $e_n$. If $\nabla \varphi(x_0) \ne 0$ we may assume without loss of generality that $\frac{\partial \varphi}{\partial x_n}(x_0) \ne 0$ (relabel the variables if necessary). Then we apply the inverse function theorem to $x \mapsto \Phi(x) = (x_1, \ldots, x_{n-1}, \varphi(x))$ to get a continuously differentiable coordinate transformation $x = \Phi^{-1}(y)$ in a neighbourhood of $x_0$. This coordinate change has the property that $\varphi(x) = \varphi(x_0)$ if and only if $y \cdot e_n = y_n = \varphi(x_0)$. We say that the boundary has been straightened at $x_0$. Then by the chain rule the composition $u = v \circ \Phi$ of a function $v$ with $\Phi$ obeys

\[
\nabla u(x) = \nabla v(\Phi(x))\, \Phi'(x) = \nabla v(y)\, \Phi'(\Phi^{-1}(y)).
\]

Here $\nabla v$ and $\nabla u$ are row vectors and $\Phi'(x)$ is the Jacobi matrix. Hence $u$ solves the PDE

\[
F(\nabla u(x), u(x), x) = 0
\]

if and only if $v$ solves the PDE

\[
G(\nabla v(y), v(y), y) := F\left(\nabla v(y)\, \Phi'(\Phi^{-1}(y)),\, v(y),\, \Phi^{-1}(y)\right) = 0.
\]

Thus we can indeed assume locally (the coordinate change is only guaranteed to exist in a neighbourhood of $x_0$) that the boundary is a hyperplane, at the cost of changing the form of the PDE.

Next we ask the question: given the values of u on the hypersurface H is there anything else we can determine about u on the hypersurface? Can we determine the value of its derivatives for example, or can we see immediately that there is no possible u (like for some situations of Burgers’ equation)?

We can compute the partial derivatives in most directions at $x_0 \in H$. Observe

\[
\frac{\partial u(x_0)}{\partial x_1} = \lim_{h \to 0} \frac{u(x_0 + h e_1) - u(x_0)}{h} = \lim_{h \to 0} \frac{g(x_0 + h e_1) - g(x_0)}{h} = \frac{\partial g(x_0)}{\partial x_1}.
\]

This also works for the directions $x_2, \ldots, x_{n-1}$, which lie in the hyperplane. This idea does not determine $\frac{\partial u(x_0)}{\partial x_n}$, but we have not used the PDE yet. If we substitute all the values we know, there is only one free variable $p_n$ in the PDE:

\[
F(\nabla u(x_0), u(x_0), x_0) = F\left(\frac{\partial g(x_0)}{\partial x_1}, \ldots, \frac{\partial g(x_0)}{\partial x_{n-1}}, p_n, g(x_0), x_0\right) = 0.
\]

Whether or not this has a solution depends on both the PDE $F$ and the initial condition $g$. However, if there does exist a solution, then there is a simple criterion depending only on $F$ which further ensures that the equation is solvable in a neighbourhood of $x_0$.

Definition 1.6. Consider the PDE as a function of $2n + 1$ variables $F(p, z, x) = 0$ and suppose that there is a solution $(p_0, z_0, x_0)$. The hyperplane $H = \{x \in \mathbb{R}^n \mid x_n = x_{0,n}\}$ is called noncharacteristic at $x_0$ if

\[
\frac{\partial F}{\partial p_n}(p_0, z_0, x_0) \ne 0.
\]

To understand the name ‘noncharacteristic’ let us consider the example

\[
\frac{\partial u}{\partial x_1} = 0, \qquad u(x_1, 0) = g(x_1).
\]

The PDE in this case is $F(p_1, p_2, z, x_1, x_2) = p_1$, which clearly does not enjoy the noncharacteristic property. We see that the initial condition is fighting against the PDE; they are only compatible if $g$ is constant. And even if they happen to be compatible, the initial condition does not determine $\frac{\partial u}{\partial x_2}$ on $H = \{x_2 = 0\}$. If we apply the method of characteristics to this PDE, we must try to find a curve $(x_1(s), x_2(s))$ along which $z(s) = u(x_1(s), x_2(s))$ is nicely behaved. Differentiating $z$ gives

\[
z' = \frac{\partial u}{\partial x_1}\, x_1' + \frac{\partial u}{\partial x_2}\, x_2',
\]

which 'aligns' with the PDE if we choose $x_1' = 1$ and $x_2' = 0$. However this choice of characteristics gives $x_1(s) = x_{0,1} + s$, $x_2(s) = x_{0,2}$, which lies in the hyperplane. The method fails to be useful because no points in the domain can be reached by characteristics starting on the hyperplane.

Lemma 1.7. Let $F : W \to \mathbb{R}$ and $g : H \to \mathbb{R}$ be continuously differentiable, $x_0 \in \Omega \cap H$, $z_0 = g(x_0)$ and $p_{0,1} = \frac{\partial g(x_0)}{\partial x_1}, \ldots, p_{0,n-1} = \frac{\partial g(x_0)}{\partial x_{n-1}}$. If there exists $p_{0,n}$ with $F(p_0, z_0, x_0) = 0$ and $H$ is noncharacteristic at $x_0$, then in an open neighbourhood $\Omega_{x_0} \subset \Omega$ of $x_0$ there exists for $x \in \Omega_{x_0} \cap H$ a unique solution $q$ of

\[
F(q(x), g(x), x) = 0, \qquad q_i(x) = \frac{\partial g(x)}{\partial x_i} \text{ for } i = 1, \ldots, n-1 \qquad\text{and}\qquad q(x_0) = p_0.
\]

Proof. Consider the function $(x, q_n) \mapsto F(q_1(x), \ldots, q_{n-1}(x), q_n, g(x), x)$. This takes the value $0$ at $(x_0, p_{0,n})$. The noncharacteristic assumption means that we can apply the implicit function theorem to define $q_n$ as a unique function of $x$ in a neighbourhood of $x_0$. □
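As a concrete illustration of this lemma (our own example, not from the text), take $n = 2$ and the eikonal-type PDE $F(p, z, x) = p_1^2 + p_2^2 - 1$ with boundary values $g$ on $H = \{x_2 = 0\}$. The lemma then reduces to a scalar equation for $q_2(x)$, which a root finder can solve:

```python
from scipy.optimize import brentq

# Illustrative assumptions: n = 2, F(p, z, x) = p1^2 + p2^2 - 1 and
# boundary values g on H = {x2 = 0}.
g = lambda x1: 0.5 * x1   # boundary values on H
dg = lambda x1: 0.5       # tangential derivative, so q1(x) = dg/dx1

def q(x1):
    q1 = dg(x1)
    # Solve F(q1, q2, g, x) = q1^2 + q2^2 - 1 = 0 for q2. The hyperplane is
    # noncharacteristic at the root since dF/dp2 = 2 q2 != 0 there; we pick
    # the branch with q2 > 0, corresponding to the choice of p0 in the lemma.
    q2 = brentq(lambda q2: q1**2 + q2**2 - 1.0, 0.0, 2.0)
    return q1, q2

print(q(0.3))  # (0.5, 0.8660...) since q2 = sqrt(1 - 0.25)
```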

1.5 Method of Characteristics

In this section we continue to consider the general first order PDE and try to formalise the method of characteristics, which thus far we have developed only ad hoc. We try to obtain the solution of the PDE by understanding the function $u$ along a curve in the domain. For a clever choice of the curve this reduces to the solution of an appropriate system of first order ODEs. So let $x(s)$ be a curve in the domain of the PDE and $z(s) = u(x(s))$ be the value of $u$ along the curve. The new ingredient is that we must also consider $p(s) = \nabla u(x(s))$, the gradient of $u$ along this curve. But how should we choose the curve $s \mapsto x(s)$? For this purpose we first differentiate

\[
p_i'(s) = \frac{d}{ds} \frac{\partial u(x(s))}{\partial x_i} = \sum_{j=1}^n \frac{\partial^2 u(x(s))}{\partial x_j \partial x_i}\, x_j'(s).
\]

The total derivative of $F(\nabla u(x), u(x), x) = 0$ with respect to $x_i$ gives

\[
0 = \frac{d F(\nabla u(x), u(x), x)}{d x_i} = \sum_{j=1}^n \frac{\partial F(\nabla u(x), u(x), x)}{\partial p_j}\, \frac{\partial^2 u(x)}{\partial x_i \partial x_j} + \frac{\partial F(\nabla u(x), u(x), x)}{\partial z}\, \frac{\partial u(x)}{\partial x_i} + \frac{\partial F(\nabla u(x), u(x), x)}{\partial x_i}.
\]

Due to the commutativity $\partial_i \partial_j u = \partial_j \partial_i u$ of the second partial derivatives we obtain

\[
\sum_{j=1}^n \frac{\partial F(p(s), z(s), x(s))}{\partial p_j}\, \frac{\partial^2 u(x(s))}{\partial x_j \partial x_i} = -\frac{\partial F(p(s), z(s), x(s))}{\partial z}\, p_i(s) - \frac{\partial F(p(s), z(s), x(s))}{\partial x_i}.
\]

We want to eliminate the explicit dependence on $u$ from all our equations. If we compare this equation with the derivative of $p_i$ we see that we should choose the vector field for the characteristic curves as

\[
x_j'(s) = \frac{\partial F(p(s), z(s), x(s))}{\partial p_j}.
\]

This choice allows us to rewrite the equation above for p as

\[
p_i'(s) = \sum_{j=1}^n \frac{\partial^2 u(x(s))}{\partial x_j \partial x_i}\, \frac{\partial F(p(s), z(s), x(s))}{\partial p_j} = -\frac{\partial F(p(s), z(s), x(s))}{\partial z}\, p_i(s) - \frac{\partial F(p(s), z(s), x(s))}{\partial x_i}.
\]

Finally we differentiate

\[
z'(s) = \frac{d}{ds} u(x(s)) = \sum_{j=1}^n \frac{\partial u(x(s))}{\partial x_j}\, x_j'(s) = \sum_{j=1}^n p_j(s)\, \frac{\partial F(p(s), z(s), x(s))}{\partial p_j}.
\]

In this way we indeed obtain the following system of first order ODEs:

\begin{align*}
x_i'(s) &= \frac{\partial F(p(s), z(s), x(s))}{\partial p_i} \\
p_i'(s) &= -\frac{\partial F(p(s), z(s), x(s))}{\partial x_i} - \frac{\partial F(p(s), z(s), x(s))}{\partial z}\, p_i(s) \\
z'(s) &= \sum_{j=1}^n \frac{\partial F(p(s), z(s), x(s))}{\partial p_j}\, p_j(s).
\end{align*}

This is a system of first order ODEs with $2n + 1$ unknown real functions. Importantly this is a 'closed' system; it only depends on these $2n + 1$ functions, not on any other information from $u$. This is a little surprising, particularly that $p'$, which is effectively a certain second derivative of $u$, only depends on the location $x$, the value $z$, and the first derivatives $p$. The fact that this idea of characteristics leads to a finite system of ODEs is what makes this an effective method. Let us summarise these calculations in the following theorem:
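The characteristic system is straightforward to integrate numerically for a concrete $F$. The following sketch (our own illustration) takes $F(p, z, x) = p_2 + z\, p_1$, i.e. Burgers' equation with $x = (x_1, x_2) =$ (space, time), and the initial value $g(x_1) = \sin(x_1)$ as assumptions, and checks that $z$ and $F$ stay constant along a characteristic:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative assumptions: F(p, z, x) = p2 + z*p1 (Burgers' equation with
# x = (x1, x2) = (space, time)) and initial value g(x1) = sin(x1) on {x2 = 0}.
g, gp = np.sin, np.cos

def rhs(s, w):
    x1, x2, z, p1, p2 = w
    # x' = dF/dp, z' = p . dF/dp, p' = -dF/dx - (dF/dz) p
    return [z,            # dF/dp1 = z
            1.0,          # dF/dp2 = 1
            p1 * z + p2,  # p1 dF/dp1 + p2 dF/dp2
            -p1 * p1,     # -dF/dx1 - (dF/dz) p1 = -p1^2
            -p1 * p2]     # -dF/dx2 - (dF/dz) p2

y1 = 0.3  # foot point on the hyperplane H = {x2 = 0}
w0 = [y1, 0.0, g(y1), gp(y1), -g(y1) * gp(y1)]  # p2 chosen so that F = 0
sol = solve_ivp(rhs, (0.0, 0.5), w0, rtol=1e-10, atol=1e-12)

x1, x2, z, p1, p2 = sol.y[:, -1]
print(z - g(y1))               # z is constant along the characteristic
print(p2 + z * p1)             # F stays 0 along the characteristic
print(x1 - (y1 + g(y1) * x2))  # the characteristic is the line x1 = y1 + g(y1) s
```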

Theorem 1.8. Let $F$ be a real differentiable function on an open subset $W \subset \mathbb{R}^n \times \mathbb{R} \times \mathbb{R}^n$ and $u : \Omega \to \mathbb{R}$ a twice differentiable solution on an open subset $\Omega \subset \mathbb{R}^n$ of the first order PDE $F(\nabla u(x), u(x), x) = 0$. For every solution $s \mapsto x(s)$ of the ODE

\[
x_i'(s) = \frac{\partial F}{\partial p_i}(\nabla u(x(s)), u(x(s)), x(s))
\]

the functions $p(s) = \nabla u(x(s))$ and $z(s) = u(x(s))$ solve the ODEs

\[
p_i'(s) = -\frac{\partial F(p(s), z(s), x(s))}{\partial x_i} - \frac{\partial F(p(s), z(s), x(s))}{\partial z}\, p_i(s) \qquad\text{and}\qquad z'(s) = \sum_{j=1}^n \frac{\partial F(p(s), z(s), x(s))}{\partial p_j}\, p_j(s).
\]

This theorem can be used to address the uniqueness of solutions of the PDE, reducing it to the uniqueness of solutions of this system of ODEs. This is useful because we have many theorems that tell us when the solution of a system of ODEs is unique. For example, the Picard-Lindelöf theorem tells us that the solution is uniquely determined by its initial conditions if the right hand side is Lipschitz continuous.

We must also pay attention to the logical structure of this theorem. It says if a solution to the PDE exists then it solves the ODE; it tells us where to look for potential solutions. But that was not the task we set for ourselves at the outset of this section. We want to prove that a solution of the PDE does in fact exist. We have seen that global solutions may not exist due to crossing characteristics, so the best we can hope for is a local existence result. This takes a little work but is achieved in the following theorem.

Theorem 1.9. Let $F : W \to \mathbb{R}$ and $g : H \to \mathbb{R}$ be three times differentiable functions. Suppose we have a point $(p_0, z_0, x_0) \in W$ with

\[
F(p_0, z_0, x_0) = 0, \qquad z_0 = g(x_0), \qquad p_{0,1} = \frac{\partial g(x_0)}{\partial x_1}, \ldots, p_{0,n-1} = \frac{\partial g(x_0)}{\partial x_{n-1}}.
\]

Furthermore, assume that $H$ is noncharacteristic at $x_0$. Then in a neighbourhood $\Omega_{x_0} \subset \Omega$ of $x_0$ there exists a unique solution of the boundary value problem

\[
F(\nabla u(x), u(x), x) = 0 \text{ for } x \in \Omega_{x_0} \qquad\text{and}\qquad u(y) = g(y) \text{ for } y \in \Omega_{x_0} \cap H.
\]

Proof. The strategy of this proof is to solve the system of ODEs given by the method of characteristics and show that the result solves the PDE and the initial conditions. First we need to translate the initial conditions of the PDE into initial conditions for the ODEs. By Lemma 1.7 there exists a solution $q$ on an open neighbourhood of $x_0$ in $H$ of the following equations:

\[
F(q(y), g(y), y) = 0, \qquad q_i(y) = \frac{\partial g(y)}{\partial x_i} \text{ for } i = 1, \ldots, n-1 \qquad\text{and}\qquad q(x_0) = p_0.
\]

If $F$ is twice differentiable and $g$ is three times differentiable, then the implicit function theorem yields a twice differentiable solution. The Picard-Lindelöf theorem shows that the following initial value problem has for all $y$ in the intersection of an open neighbourhood of $x_0$ with $H$ a unique solution:

\begin{align*}
x_i'(s) &= \frac{\partial F}{\partial p_i}(p(s), z(s), x(s)) &&\text{with } x(0) = y, \\
p_i'(s) &= -\frac{\partial F}{\partial x_i}(p(s), z(s), x(s)) - \frac{\partial F}{\partial z}(p(s), z(s), x(s))\, p_i(s) &&\text{with } p(0) = q(y), \\
z'(s) &= \sum_{j=1}^n \frac{\partial F}{\partial p_j}(p(s), z(s), x(s))\, p_j(s) &&\text{with } z(0) = g(y).
\end{align*}

We denote the family of solutions by $(x(y,s), p(y,s), z(y,s))$. For a neighbourhood $\Omega_{x_0} \ni x_0$ there exists an $\epsilon > 0$ such that these solutions are uniquely defined for $(y,s) \in (\Omega_{x_0} \cap H) \times (-\epsilon, \epsilon)$. This is a local proof, so let us just write $\Omega$ instead of $\Omega_{x_0}$. Since $F$ and $g$ are three times differentiable, all coefficients and initial values are twice differentiable. The theorem on the dependence of solutions of ODEs on their initial values gives that $(y,s) \mapsto (x(y,s), p(y,s), z(y,s))$ is twice differentiable on $(\Omega \cap H) \times (-\epsilon, \epsilon)$.

Now let us examine the characteristic curves in more detail. The function $(y,s) \mapsto x(y,s)$ from $(\Omega \cap H) \times (-\epsilon, \epsilon)$ to $\mathbb{R}^n$ has at $(y,s) = (x_0, 0)$ the Jacobi matrix

\[
\begin{pmatrix}
1 & \cdots & 0 & \frac{\partial F(p_0, z_0, x_0)}{\partial p_1} \\
\vdots & \ddots & \vdots & \vdots \\
0 & \cdots & 1 & \frac{\partial F(p_0, z_0, x_0)}{\partial p_{n-1}} \\
0 & \cdots & 0 & \frac{\partial F(p_0, z_0, x_0)}{\partial p_n}
\end{pmatrix}.
\]

Since $\frac{\partial F(p_0, z_0, x_0)}{\partial p_n} \ne 0$ this matrix is invertible. The inverse function theorem implies that on the (possibly diminished) neighbourhood $\Omega$ of $x_0$ and for suitable $\epsilon > 0$ this map is a twice differentiable homeomorphism $(\Omega \cap H) \times (-\epsilon, \epsilon) \to \Omega$ with twice differentiable inverse mapping. Because we know that the inverse mapping exists, the function $u : \Omega \to \mathbb{R}$ defined in implicit form by

\[
u(x(y,s)) = z(y,s) \quad\text{for all } (y,s) \in (\Omega \cap H) \times (-\epsilon, \epsilon)
\]

is well-defined.

This function $u$ satisfies the initial conditions of the PDE: we have $x(y, 0) = y$ and so

\[
u(y) = u(x(y, 0)) = z(y, 0) = g(y)
\]

for all $y \in \Omega \cap H$. It remains to show that $u$ solves the PDE $F(\nabla u(x), u(x), x) = 0$. Observe that the ODEs imply

\[
\frac{d}{ds} F(p(y,s), z(y,s), x(y,s)) = \sum_{j=1}^n \frac{\partial F}{\partial p_j}\, p_j' + \frac{\partial F}{\partial z}\, z' + \sum_{j=1}^n \frac{\partial F}{\partial x_j}\, x_j' = 0,
\]

since inserting the right hand sides of the ODEs makes all terms cancel.

Since $F(q(y), g(y), y)$ vanishes for all $y \in \Omega \cap H$ we conclude

\[
F(p(y,s), z(y,s), x(y,s)) = 0 \quad\text{for all } (y,s) \in (\Omega \cap H) \times (-\epsilon, \epsilon).
\]

Hence to show that $u$ solves the PDE it suffices to show $p(y,s) = \nabla u(x(y,s))$ for all $(y,s) \in (\Omega \cap H) \times (-\epsilon, \epsilon)$.

To this end, we need to establish the following equalities

\[
\frac{\partial z(y,s)}{\partial s} = \sum_{j=1}^n p_j(y,s)\, \frac{\partial x_j(y,s)}{\partial s} \qquad\text{and}\qquad \frac{\partial z(y,s)}{\partial y_i} = \sum_{j=1}^n p_j(y,s)\, \frac{\partial x_j(y,s)}{\partial y_i}
\]

for all $(y,s) \in (\Omega \cap H) \times (-\epsilon, \epsilon)$ and all $i = 1, \ldots, n-1$. The first equation follows from the ODEs for $x(y,s)$ and $z(y,s)$. For $s = 0$ the second equation follows from the initial conditions for $z(y,s)$, $p(y,s)$ and $x(y,s)$. For $s \ne 0$, let us write $v(y,s)$ for the difference between the left and right hand sides of the second equation:

\[
v(y,s) := \frac{\partial z(y,s)}{\partial y_i} - \sum_{j=1}^n p_j(y,s)\, \frac{\partial x_j(y,s)}{\partial y_i}.
\]

We need to show that $v$ is always zero. The derivative of the first equation with respect to $y_i$ yields

\[
\frac{\partial^2 z(y,s)}{\partial y_i\, \partial s} = \sum_{j=1}^n \left( \frac{\partial p_j(y,s)}{\partial y_i}\, \frac{\partial x_j(y,s)}{\partial s} + p_j(y,s)\, \frac{\partial^2 x_j(y,s)}{\partial y_i\, \partial s} \right).
\]

By the commutativity of the second partial derivatives we obtain

\begin{align*}
\frac{\partial v(y,s)}{\partial s} &= \frac{\partial^2 z(y,s)}{\partial s\, \partial y_i} - \sum_{j=1}^n \frac{\partial p_j(y,s)}{\partial s}\, \frac{\partial x_j(y,s)}{\partial y_i} - \sum_{j=1}^n p_j(y,s)\, \frac{\partial^2 x_j(y,s)}{\partial s\, \partial y_i} \\
&= \sum_{j=1}^n \left( \frac{\partial p_j(y,s)}{\partial y_i}\, \frac{\partial x_j(y,s)}{\partial s} - \frac{\partial p_j(y,s)}{\partial s}\, \frac{\partial x_j(y,s)}{\partial y_i} \right) \\
&= \sum_{j=1}^n \frac{\partial p_j(y,s)}{\partial y_i}\, \frac{\partial F(p(y,s), z(y,s), x(y,s))}{\partial p_j} + \sum_{j=1}^n \left( \frac{\partial F(p(y,s), z(y,s), x(y,s))}{\partial x_j} + \frac{\partial F(p(y,s), z(y,s), x(y,s))}{\partial z}\, p_j(y,s) \right) \frac{\partial x_j(y,s)}{\partial y_i} \\
&= \frac{\partial}{\partial y_i} F(p(y,s), z(y,s), x(y,s)) - \frac{\partial F(p(y,s), z(y,s), x(y,s))}{\partial z} \left( \frac{\partial z(y,s)}{\partial y_i} - \sum_{j=1}^n p_j(y,s)\, \frac{\partial x_j(y,s)}{\partial y_i} \right).
\end{align*}

Notice that the bracketed expression is exactly $v$. Since $F(p(y,s), z(y,s), x(y,s))$ vanishes identically, so does its derivative with respect to $y_i$, and we obtain

\[
\frac{\partial v(y,s)}{\partial s} = -\frac{\partial F(p(y,s), z(y,s), x(y,s))}{\partial z}\, v(y,s).
\]

For each $y$ this is a linear homogeneous ODE for $v(y,s)$ in the variable $s$ with initial value $0$ at $s = 0$. The unique solution is $v(y,s) \equiv 0$. This implies the second equation for all $y$ and $s$:

\[
\frac{\partial z(y,s)}{\partial y_i} = \sum_{j=1}^n p_j(y,s)\, \frac{\partial x_j(y,s)}{\partial y_i}.
\]

Now that we have established the two equalities, we demonstrate that they are not only necessary but also sufficient for the conclusion $p(y,s) = \nabla u(x(y,s))$ for all $(y,s) \in (\Omega \cap H) \times (-\epsilon, \epsilon)$. The solution $u$ is defined as the composition of the inverse of $(y,s) \mapsto x(y,s)$ with $(y,s) \mapsto z(y,s)$. The chain rule implies

\begin{align*}
\frac{\partial u}{\partial x_j} &= \frac{\partial z}{\partial s}\, \frac{\partial s}{\partial x_j} + \sum_{i=1}^{n-1} \frac{\partial z}{\partial y_i}\, \frac{\partial y_i}{\partial x_j} = \left( \sum_{k=1}^n p_k\, \frac{\partial x_k}{\partial s} \right) \frac{\partial s}{\partial x_j} + \sum_{i=1}^{n-1} \left( \sum_{k=1}^n p_k\, \frac{\partial x_k}{\partial y_i} \right) \frac{\partial y_i}{\partial x_j} \\
&= \sum_{k=1}^n p_k \left( \frac{\partial x_k}{\partial s}\, \frac{\partial s}{\partial x_j} + \sum_{i=1}^{n-1} \frac{\partial x_k}{\partial y_i}\, \frac{\partial y_i}{\partial x_j} \right) = \sum_{k=1}^n p_k\, \frac{\partial x_k}{\partial x_j} = p_j.
\end{align*}

Thus we have shown that the function u, which was constructed from the method of characteristics, solves the PDE.

Theorem 1.8 and the theorem of Picard-Lindelöf imply the uniqueness of the solutions. □

The relation between the method of characteristics as explained in this section and the ad hoc versions we used in previous sections will be explored in the exercises. The important point is that they are really the same method, but in many cases the system decouples and the ODEs for $x$ and $z$ do not depend on $p$. This is a nice simplification because it makes solving the equations for $p$ redundant.

1.6 Weak Solutions

In the first few sections there were situations with no solutions, or the method of characteristics gave a 'solution' that was not differentiable. In this section we take a scalar conservation law and look for more general notions of solutions, which allow us to extend solutions across the crossing characteristics by allowing a limited amount of non-differentiability. But if we don't have differentiability, what does it mean to satisfy a PDE? For this purpose we use the conserved integrals. Since we will restrict ourselves to the one-dimensional situation for the moment, the natural domains are intervals $\Omega = [a, b]$ with $a < b$. In this case the conservation law implies

\[
\frac{d}{dt} \int_a^b u(x,t)\,dx = f(u(a,t)) - f(u(b,t)).
\]

Now we look for functions $u$ with discontinuities along the graph $\{(x,t) \mid x = y(t)\}$ of a $C^1$-function $y$. In the case that $y(t)$ belongs to $[a,b]$, we split the integral over $[a,b]$ into the integrals over $[a, y(t)]$ and $[y(t), b]$. In such a case let us calculate the derivative of the integral over $[a,b]$:

\begin{align*}
\frac{d}{dt} \int_a^b u(x,t)\,dx &= \frac{d}{dt} \int_a^{y(t)} u(x,t)\,dx + \frac{d}{dt} \int_{y(t)}^b u(x,t)\,dx \\
&= \dot y(t) \lim_{x \uparrow y(t)} u(x,t) + \int_a^{y(t)} \dot u(x,t)\,dx - \dot y(t) \lim_{x \downarrow y(t)} u(x,t) + \int_{y(t)}^b \dot u(x,t)\,dx.
\end{align*}

We abbreviate $\lim_{x \uparrow y(t)} u(x,t)$ as $u_l(y(t), t)$ and $\lim_{x \downarrow y(t)} u(x,t)$ as $u_r(y(t), t)$ and assume that on both sides of the graph of $y$ the function $u$ is a classical solution of the conservation law:

\begin{align*}
\frac{d}{dt} \int_a^b u(x,t)\,dx &= \dot y(t)\big(u_l(y(t), t) - u_r(y(t), t)\big) - \int_a^{y(t)} \tfrac{d}{dx} f(u(x,t))\,dx - \int_{y(t)}^b \tfrac{d}{dx} f(u(x,t))\,dx \\
&= \dot y(t)\big(u_l(y(t), t) - u_r(y(t), t)\big) + f(u(a,t)) - f(u(b,t)) + f(u_r(y(t), t)) - f(u_l(y(t), t)).
\end{align*}

Hence the integrated version of the conservation law still holds if the following Rankine-Hugoniot condition is fulfilled:

\[
\dot y(t) = \frac{f(u_r(y,t)) - f(u_l(y,t))}{u_r(y,t) - u_l(y,t)}.
\]
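For a given jump the shock speed is just a difference quotient of the flux. A short sketch (our own; the Burgers flux is an illustrative assumption):

```python
# Shock speed from the Rankine-Hugoniot condition.
def shock_speed(f, ul, ur):
    return (f(ur) - f(ul)) / (ur - ul)

f = lambda u: 0.5 * u * u  # Burgers' flux, an illustrative assumption
print(shock_speed(f, 1.0, 0.0))  # 0.5, the velocity appearing in Examples 1.10 and 1.11
```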

Example 1.10. We consider Burgers' equation $\dot u(x,t) + u(x,t)\,\frac{\partial u}{\partial x}(x,t) = 0$ for $(x,t) \in \mathbb{R} \times \mathbb{R}^+$ with the continuous initial values $u(x, 0) = g(x)$ and

\[
g(x) = \begin{cases} 1 & \text{for } x \le 0, \\ 1 - x & \text{for } 0 \le x < 1, \\ 0 & \text{for } 1 \le x. \end{cases}
\]

The first crossing of characteristics happens at $t = 1$:

\[
x = x_0 + t g(x_0) = \begin{cases} x_0 + t & \text{for } x_0 \le 0, \\ x_0 + t(1 - x_0) & \text{for } 0 < x_0 < 1, \\ x_0 & \text{for } 1 \le x_0. \end{cases}
\]

For $t < 1$ the evaluation at $t$ is a homeomorphism from $\mathbb{R}$ onto itself with inverse

\[
x \mapsto \begin{cases} x - t & \text{for } x \le t, \\ \frac{x - t}{1 - t} & \text{for } t < x < 1, \\ x & \text{for } 1 \le x. \end{cases}
\]

Therefore the solution for $0 < t < 1$ is equal to

\[
u(x,t) = \begin{cases} 1 & \text{for } x < t, \\ \frac{x - 1}{t - 1} & \text{for } t < x < 1, \\ 0 & \text{for } 1 \le x. \end{cases}
\]

At $t = 1$ the solutions of the characteristic equations starting at $x_0 \in [0, 1]$ all meet at $x = 1$. For $t > 1$ there exists a unique solution satisfying the Rankine-Hugoniot condition, which is $1$ on some interval $(-\infty, y(t))$ and $0$ on the interval $(y(t), \infty)$. The corresponding regions have to be separated by a path with velocity $\frac{1}{2}$ which starts at $(x,t) = (1, 1)$. This gives $y(t) = 1 + \frac{t-1}{2}$. For $t \ge 1$ this solution is equal to

\[
u(x,t) = \begin{cases} 1 & \text{for } x < 1 + \frac{t-1}{2}, \\ 0 & \text{for } 1 + \frac{t-1}{2} < x. \end{cases}
\]

The second initial value problem below is not continuous but monotonically increasing. For continuous monotonically increasing functions $g$ the evaluation at $t$ of the solutions of the characteristic equation would be a homeomorphism for all $t > 0$, so in such cases there exists a unique continuous solution for all $t > 0$. But for non-continuous initial values this is not the case.

Example 1.11. We again consider Burgers' equation $\dot u(x,t) + u(x,t)\,\frac{\partial u}{\partial x}(x,t) = 0$ for $(x,t) \in \mathbb{R} \times \mathbb{R}^+$, now with the non-continuous initial values $u(x, 0) = g(x)$ and

\[
g(x) = \begin{cases} 0 & \text{for } x < 0, \\ 1 & \text{for } 0 < x. \end{cases}
\]

Again there is a unique discontinuous solution which is $0$ on some interval $(-\infty, y(t))$ and $1$ on the interval $(y(t), \infty)$. By the Rankine-Hugoniot condition both regions are separated by a path with velocity $\frac{1}{2}$. This solution is equal to

\[
u(x,t) = \begin{cases} 0 & \text{for } x < \frac{t}{2}, \\ 1 & \text{for } \frac{t}{2} < x. \end{cases}
\]

But there exists another, continuous solution, which clearly also satisfies the Rankine-Hugoniot condition:

\[
u(x,t) = \begin{cases} 0 & \text{for } x \le 0, \\ \frac{x}{t} & \text{for } 0 < x < t, \\ 1 & \text{for } t \le x. \end{cases}
\]

These solutions are constant along the lines $x = ct$ for $c \in [0, 1]$. These lines all intersect in the discontinuity at $(x,t) = (0, 0)$. Besides these two extreme cases there exist infinitely many other solutions with several regions of discontinuity, which all satisfy the Rankine-Hugoniot condition.

These examples show that such weak solutions exist for all $t \ge 0$ but are not unique. We now restrict the space of weak solutions so that the initial value problem has a unique solution for all $t \ge 0$. Since we want to maximise the regularity we only accept discontinuities if there are no continuous solutions. In the last example we prefer the continuous solution. So for Burgers' equation this means we only accept discontinuous solutions that take larger values for smaller $x$ and smaller values for larger $x$.

Definition 1.12 (Lax entropy condition). A discontinuity of a weak solution along a $C^1$-path $t \mapsto y(t)$ satisfies the Lax entropy condition if along the path the following inequality is fulfilled:

\[
f'(u_l(y,t)) > \dot y(t) > f'(u_r(y,t)).
\]

A weak solution with discontinuities along $C^1$-paths is called an admissible solution if along each path both the Rankine-Hugoniot condition and the Lax entropy condition are satisfied.
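Both conditions are easy to test for a single jump. A sketch (our own; the Burgers flux is an illustrative assumption):

```python
# Admissibility of a single jump (ul, ur): the Rankine-Hugoniot speed must lie
# strictly between f'(ul) and f'(ur).
def is_admissible(f, fprime, ul, ur):
    ydot = (f(ur) - f(ul)) / (ur - ul)     # Rankine-Hugoniot condition
    return fprime(ul) > ydot > fprime(ur)  # Lax entropy condition

f = lambda u: 0.5 * u * u   # Burgers' flux, an illustrative assumption
fp = lambda u: u
print(is_admissible(f, fp, 1.0, 0.0))  # True:  the shock of Example 1.10
print(is_admissible(f, fp, 0.0, 1.0))  # False: the jump of Example 1.11
```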

There is a justification of the entropy condition on physical grounds in Evans' book, pp. 142-143.

For continuous $g$ there is a crossing of characteristics if $f'(g(x_1)) > f'(g(x_2))$ for some $x_1 < x_2$. So this condition ensures that discontinuities can only show up if we cannot avoid a crossing of characteristics.

Theorem 1.13. Let $f \in C^1(\mathbb{R}, \mathbb{R})$ be convex and $u$ and $v$ two admissible solutions of

\[
\dot u(x,t) + f'(u(x,t))\,\frac{\partial u}{\partial x}(x,t) = 0
\]

in $L^1(\mathbb{R})$. Then $t \mapsto \|u(\cdot,t) - v(\cdot,t)\|_{L^1(\mathbb{R})}$ is monotonically decreasing.

Proof. We divide $\mathbb{R}$ into maximal intervals $I = [a(t), b(t)]$ with the property that either $u(x,t) > v(x,t)$ or $v(x,t) > u(x,t)$ for all $x \in (a(t), b(t))$. This means that $u - v$ either vanishes at the boundary, or is discontinuous and changes sign at the boundary. We claim that the boundaries $a(t)$ and $b(t)$ of these maximal intervals are differentiable. We prove this only for $a(t)$; for $b(t)$ the proof is analogous. If either $u(\cdot,t)$ or $v(\cdot,t)$ is discontinuous at $a$, then by definition of an admissible solution the locus of the discontinuity $a$ is differentiable with respect to $t$. On the other hand, suppose $u$ and $v$ are both continuously differentiable at $(a(t_1), t_1)$ with $u(a(t_1), t_1) = v(a(t_1), t_1)$. Then we know that $u$ and $v$ have a common characteristic through this point, $s \mapsto (a(t_1) + s f'(u(a(t_1), t_1)),\, t_1 + s)$, and moreover they are equal along this characteristic. Hence the line of equality is given by $a(t) = a(t_1) + (t - t_1) f'(u(a(t_1), t_1))$.

To simplify notation we will sometimes write $a$ and $b$ instead of $a(t)$ and $b(t)$. Additionally, we only consider intervals on whose interior $u > v$; on the other intervals these arguments apply with $u$ and $v$ interchanged. Now we calculate

\begin{align*}
\frac{d}{dt} \int_{a(t)}^{b(t)} (u(x,t) - v(x,t))\,dx &= \int_{a(t)}^{b(t)} (\dot u(x,t) - \dot v(x,t))\,dx + \dot b(t)\big(u(b,t) - v(b,t)\big) - \dot a(t)\big(u(a,t) - v(a,t)\big) \\
&= \int_{a(t)}^{b(t)} \frac{d}{dx}\big(f(v(x,t)) - f(u(x,t))\big)\,dx + \dot b(t)\big(u(b,t) - v(b,t)\big) - \dot a(t)\big(u(a,t) - v(a,t)\big) \\
&= f(v(b,t)) - f(u(b,t)) + \dot b(t)\big(u(b,t) - v(b,t)\big) + f(u(a,t)) - f(v(a,t)) + \dot a(t)\big(v(a,t) - u(a,t)\big).
\end{align*}

If $u$ and $v$ are both differentiable at $(a,t)$, then they take the same values at $(a,t)$ and the corresponding terms in the last line vanish. The same holds if $u$ and $v$ are both differentiable at $(b,t)$. For convex $f$ the derivative $f'$ is monotonically increasing, and the Lax entropy condition implies at all discontinuities $y$ of $u(\cdot,t)$ and $v(\cdot,t)$

\[
u_l(y,t) > u_r(y,t), \qquad v_l(y,t) > v_r(y,t),
\]

respectively. If one of the two solutions $u$ and $v$ is continuous at the boundary of $I$ and the other is discontinuous, then the value of the continuous solution belongs to the closed interval between the one-sided limits of the discontinuous solution, because at the boundary $u - v$ either becomes zero or changes sign. If $v$ were continuous and $u$ discontinuous at $a$, we would have $u_l(a,t) \le v(a,t) \le u_r(a,t)$ by $u > v$ on $(a,b)$, in contradiction to the former inequality. So if exactly one of the solutions is discontinuous at $a(t)$, then it is $v(\cdot,t)$, while $u(\cdot,t)$ is continuous and differentiable at $a$; analogously at $b$ only $u$ can be discontinuous while $v$ is continuous and differentiable. The Rankine-Hugoniot condition determines $\dot a(t)$ and $\dot b(t)$. At $a(t)$ the corresponding contribution to $\frac{d}{dt}\|u(\cdot,t) - v(\cdot,t)\|_1$ is

\begin{align*}
f(u(a,t)) - f(v_r(a,t)) &+ \dot a(t)\big(v_r(a,t) - u(a,t)\big) \\
&= f(u(a,t)) - f(v_r(a,t)) + \frac{f(v_r(a,t)) - f(v_l(a,t))}{v_r(a,t) - v_l(a,t)}\big(v_r(a,t) - u(a,t)\big) \\
&= f(u(a,t)) - \left( f(v_r(a,t))\, \frac{v_l(a,t) - u(a,t)}{v_l(a,t) - v_r(a,t)} + f(v_l(a,t))\, \frac{u(a,t) - v_r(a,t)}{v_l(a,t) - v_r(a,t)} \right).
\end{align*}

Since $f$ is convex the secant lies above the graph of $f$. Since $u(a,t) \in [v_r(a,t), v_l(a,t)]$ this expression is non-positive. At $b(t)$ this contribution is

\begin{align*}
f(v(b,t)) - f(u_l(b,t)) &+ \dot b(t)\big(u_l(b,t) - v(b,t)\big) \\
&= f(v(b,t)) - f(u_l(b,t)) + \frac{f(u_r(b,t)) - f(u_l(b,t))}{u_r(b,t) - u_l(b,t)}\big(u_l(b,t) - v(b,t)\big) \\
&= f(v(b,t)) - \left( f(u_r(b,t))\, \frac{u_l(b,t) - v(b,t)}{u_l(b,t) - u_r(b,t)} + f(u_l(b,t))\, \frac{v(b,t) - u_r(b,t)}{u_l(b,t) - u_r(b,t)} \right).
\end{align*}

Again due to $v(b,t) \in [u_r(b,t), u_l(b,t)]$ this expression is non-positive.

Finally, suppose both solutions are discontinuous at $a(t)$ or at $b(t)$. Since $u(\cdot,t) - v(\cdot,t)$ is positive on $I$, the Lax entropy condition implies $[u_r(a,t), u_l(a,t)] \subset [v_r(a,t), v_l(a,t)]$ and $[v_r(b,t), v_l(b,t)] \subset [u_r(b,t), u_l(b,t)]$, respectively. The corresponding contributions to $\frac{d}{dt}\|u(\cdot,t) - v(\cdot,t)\|_1$ are again non-positive:

\begin{align*}
f(u_r(a,t)) - f(v_r(a,t)) &+ \dot a(t)\big(v_r(a,t) - u_r(a,t)\big) \\
&= f(u_r(a,t)) - f(v_r(a,t)) + \frac{f(v_r(a,t)) - f(v_l(a,t))}{v_r(a,t) - v_l(a,t)}\big(v_r(a,t) - u_r(a,t)\big) \\
&= f(u_r(a,t)) - \left( f(v_r(a,t))\, \frac{v_l(a,t) - u_r(a,t)}{v_l(a,t) - v_r(a,t)} + f(v_l(a,t))\, \frac{u_r(a,t) - v_r(a,t)}{v_l(a,t) - v_r(a,t)} \right),
\end{align*}

\begin{align*}
f(v_l(b,t)) - f(u_l(b,t)) &+ \dot b(t)\big(u_l(b,t) - v_l(b,t)\big) \\
&= f(v_l(b,t)) - f(u_l(b,t)) + \frac{f(u_r(b,t)) - f(u_l(b,t))}{u_r(b,t) - u_l(b,t)}\big(u_l(b,t) - v_l(b,t)\big) \\
&= f(v_l(b,t)) - \left( f(u_r(b,t))\, \frac{u_l(b,t) - v_l(b,t)}{u_l(b,t) - u_r(b,t)} + f(u_l(b,t))\, \frac{v_l(b,t) - u_r(b,t)}{u_l(b,t) - u_r(b,t)} \right).
\end{align*}

Hence the contributions to $\frac{d}{dt}\|u(\cdot,t) - v(\cdot,t)\|_1$ of all intervals are non-positive. □

This implies that admissible solutions to an IVP are unique, if they exist. By utilising an explicit formula for admissible solutions one can also prove the existence of admissible solutions. The following theorem is Theorem 10.3 in the lecture notes “Hyperbolic Partial Differential Equations” by Peter Lax, Courant Lecture Notes in Mathematics 14, American Mathematical Society (2006), which also supplies a proof.

Theorem 1.14. For strictly convex $f \in C^2(\mathbb{R}, \mathbb{R})$ and $g \in L^1(\mathbb{R}) \cap L^\infty(\mathbb{R})$ there exists a unique admissible solution $u(x,t)$ of

\[
\dot u(x,t) + f'(u(x,t))\,\frac{\partial u}{\partial x}(x,t) = 0 \qquad\text{and}\qquad u(x, 0) = g(x) \quad\text{for all } x \in \mathbb{R}.
\]