David Ketcheson

Dispersion relations for linear systems of PDEs

2014-05-28T00:00:00+03:00

Fourier analysis is an essential tool for understanding the behavior of solutions to linear equations. Often, this analysis is introduced to students in the context of scalar equations with real coefficients. If nothing more is said, students may mistakenly apply assumptions based on the scalar case to systems, leading to erroneous conclusions. I’m surprised at how often I’ve seen this, and I’ve even made the mistake myself.

Scalar equations

Students in any undergraduate PDE course learn that solutions of the heat equation

\[ \label{heat} u_t(x,t) = u_{xx}(x,t) \]

diffuse in time whereas solutions of the wave equation

\[ \label{wave} u_{tt} = u_{xx} \]

oscillate in time without growing or decaying. They may even be introduced to a general approach for the Cauchy problem: given an evolution equation

\[ \label{evol} u_t = \sum_{j=0}^n a_j \frac{\partial^j u}{\partial x^j}, \]

one inserts the Fourier mode solution

\[ \label{fourier} u(x,t) = e^{i(kx - \omega(k) t)} \]

to obtain

\[-i\omega(k) = \sum_{j=0}^n a_j (ik)^j\]

or simply

\[\omega(k) = \sum_{j=0}^n a_j i^{j+1} k^j.\]

The function \(\omega(k)\) is often referred to as the dispersion relation for the PDE. Any solution can be expressed as a sum of Fourier modes, and each mode propagates in a manner dictated by the dispersion relation. It’s easy to see that

If \(\omega(k)\) is real, then energy is conserved and each mode simply translates. This occurs if only odd-numbered spatial derivatives appear in the evolution equation \eqref{evol}.
If \(\omega(k)\) has negative imaginary part, energy decays in time. The heat equation \eqref{heat} behaves this way.
If \(\omega(k)\) has positive imaginary part, then the energy will grow exponentially in time. This doesn’t usually occur in physical systems. An example of this behavior is obtained by changing the sign of the right side in the heat equation to get \(u_t = - u_{xx}\).
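The three cases above are easy to check numerically. Here is a minimal sketch that evaluates the formula \(\omega(k) = \sum_j a_j i^{j+1} k^j\) for a few scalar equations; the function names `omega` and `classify` and the tolerance are my own choices for illustration.

```python
def omega(k, a):
    """Dispersion relation for u_t = sum_j a[j] * d^j u / dx^j.

    Implements omega(k) = sum_j a_j * i^(j+1) * k^j, where `a` is a list of
    (real) coefficients indexed by the derivative order j.
    """
    return sum(a_j * (1j) ** (j + 1) * k ** j for j, a_j in enumerate(a))


def classify(w, tol=1e-12):
    """Classify a Fourier mode by the sign of the imaginary part of omega."""
    if w.imag < -tol:
        return "decays"
    if w.imag > tol:
        return "grows"
    return "conserved"


examples = {
    "advection u_t = u_x": [0, 1],
    "heat u_t = u_xx": [0, 0, 1],
    "backward heat u_t = -u_xx": [0, 0, -1],
    "dispersive u_t = u_xxx": [0, 0, 0, 1],
}

for name, coeffs in examples.items():
    w = omega(2.0, coeffs)
    print(name, w, classify(w))
```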

What about the wave equation, which has two time derivatives? Using the same Fourier mode ansatz \eqref{fourier}, one obtains \[ \begin{align} \omega^2 & = k^2 \end{align} \] or \(\omega = \pm k\). Since \(\omega\) is real, energy is conserved.

In the discussion above, we have assumed that \(u\) is a scalar and that the coefficients \(a_j\) are real. Many undergraduate courses stop at this point, and students are left with the intuition that even-numbered derivative terms are diffusive while odd-numbered derivative terms are dispersive.

In practice, we often deal with systems of PDEs or PDEs with complex coefficients, and this intuition is then no longer correct. There is nothing deep or mysterious about this topic, but it’s easy to jump to incorrect conclusions if one is not careful. To take a common example, consider the time-dependent Schroedinger equation: \[i \psi_t = \psi_{xx} + V\psi.\] At first glance, we have on the right side a diffusion term (\(\psi_{xx}\)) and a reaction term (\(V\psi\)). But what about that pesky factor of \(i\) (the imaginary unit) on the left hand side? It’s easy to find the answer using the usual ansatz, but let’s take a little detour first.

Systems of equations

Consider the linear system \[ \begin{align*} u_t = A \frac{\partial^j u}{\partial x^j}, \end{align*} \] where \(u\in \mathbb{R}^m\) and \(A\) is a square real matrix. Let \(\lambda_m\) and \(s_m\) denote the eigenvalues and eigenvectors (respectively) of \(A\). Inserting the Fourier mode solution \[u(x,t) = s_m e^{i(kx - \omega(k) t)},\] we obtain \[\omega(k) = i^{j+1} k^j \lambda_m,\] and any solution can be written as a superposition of these modes. We see now that the behavior of the energy with respect to time depends on both the number \(j\) of spatial derivatives and the nature of the eigenvalues of \(A\). For instance, if \(j=1\) and \(A\) has real eigenvalues, energy is conserved. We can obtain just such an example by rewriting the wave equation \eqref{wave} as a first-order system: \[ \begin{align} u_t & = v_x \label{w1} \\ v_t & = u_x. \label{w2} \end{align} \] (If you’re not familiar with this, just differentiate \eqref{w1} w.r.t. \(t\) and \eqref{w2} w.r.t. \(x\), then equate partial derivatives to get back the second-order wave equation \eqref{wave}.) We have a linear system with \(j=1\) and \[ A = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.\] This matrix has eigenvalues \(\lambda=\pm 1\), so \(\omega(k)\) has zero imaginary part.

In this example, our intuition from the scalar case works: our first-order system, with only odd-numbered derivatives, leads to wave-like behavior. But notice that if \(A\) had imaginary eigenvalues, our intuition would be wrong; for instance, the system \[ \begin{align*} u_t & = -v_x \\ v_t & = u_x, \end{align*} \] corresponding to the second-order equation \(u_{tt} = - u_{xx},\) admits exponentially growing solutions.
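Here is a small sketch comparing the two first-order systems. It computes the eigenvalues of each \(2\times 2\) matrix from the trace and determinant and then applies \(\omega = i^{j+1}k^j\lambda\) with \(j=1\). The helper names `eig2` and `omegas` are just for illustration.

```python
import cmath


def eig2(A):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2


def omegas(A, j, k):
    """omega(k) = i^(j+1) * k^j * lambda for each eigenvalue lambda of A."""
    return [(1j) ** (j + 1) * k ** j * lam for lam in eig2(A)]


wave = [[0, 1], [1, 0]]        # u_t = v_x,  v_t = u_x
unstable = [[0, -1], [1, 0]]   # u_t = -v_x, v_t = u_x

for name, A in [("wave", wave), ("unstable", unstable)]:
    print(name, eig2(A), omegas(A, j=1, k=1.0))
```

For the first matrix both values of \(\omega\) are real; for the second, one of them has positive imaginary part, which is the exponentially growing mode.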

Scalar problems with complex coefficients

Now that we understand the dispersion relation for systems, it’s easy to understand the dispersion relation for the Schrodinger equation. Multiply by \(-i\) to get \[\psi_t = -i\psi_{xx} - iV\psi.\] Now we can think of this in the same way as a system, where the coefficient matrices have purely imaginary eigenvalues (for real \(V\)). Then it’s clear that the (even-derivative) terms on the right hand side are both related to wave behavior (i.e., energy is conserved).
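As a quick check with the usual ansatz, take \(V\) to be a real constant (an assumption made here only so that a single Fourier mode is an exact solution). Inserting \(\psi = e^{i(kx - \omega t)}\) into \(i \psi_t = \psi_{xx} + V\psi\) gives

\[ i(-i\omega) = (ik)^2 + V \quad \Longrightarrow \quad \omega(k) = V - k^2. \]

Since \(\omega(k)\) is real for every real \(k\), each mode keeps a constant amplitude: no growth and no decay.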

Systems with derivatives of different orders

In the most general case, we have systems of linear PDEs with multiple spatial derivatives of different order: \[ \label{gensys} u_t = \sum_{j=0}^n A_j \frac{\partial^j u}{\partial x^j}. \]

Here’s a real example from my research. It comes from homogenization of the wave equation in a spatially varying medium (see Equation (5.17) of this paper for more details). It’s the wave equation plus some second-derivative terms: \[ \begin{align*} u_t & = v_x + v_{xx} \\ v_t & = u_x - u_{xx}. \end{align*} \] You might (if you hadn’t read the example above) assume that this system is dissipative due to the second derivatives. This system is of the form \eqref{gensys} with \[ \begin{align} A_1 & = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} & A_2 & = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \end{align} \] Of course, \(A_1\) has real eigenvalues and leads to wave-like behavior. But \(A_2\) has pure imaginary eigenvalues, so it also leads to wave-like behavior! The second derivative terms are dispersive. In fact, it’s easy to show that the energy \(E=\int (u^2+v^2)\,dx\) is a conserved quantity for this system, for periodic or decaying solutions (try it!).
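To see the dispersive character directly, here is a sketch of the Fourier calculation with both terms treated at once (so no simultaneous diagonalization is needed). Inserting \((u,v)^T = s\, e^{i(kx-\omega t)}\) gives the eigenvalue problem

\[ -i\omega s = \left( ik A_1 + (ik)^2 A_2 \right) s = \begin{pmatrix} 0 & ik - k^2 \\ ik + k^2 & 0 \end{pmatrix} s . \]

The eigenvalues \(\mu\) of the matrix on the right satisfy \(\mu^2 = (ik-k^2)(ik+k^2) = -k^2(1+k^2)\), so

\[ \omega(k) = \pm k\sqrt{1+k^2}. \]

This is real for all real \(k\), and \(\omega/k\) depends on \(k\), so modes neither grow nor decay but travel at different speeds.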

Strictly speaking, Fourier analysis like what we’ve described can’t usually be applied to \eqref{gensys} because the matrices \(A_j\) will not generally be simultaneously diagonalizable (though this analysis can still give us intuition for what each set of terms may do). Worse yet, the individual matrices may not be diagonalizable. Let’s illustrate with a simple case.

Returning to the wave equation, let’s consider a different way of writing it as a system: \[ \begin{align*} u_t & = v \\ v_t & = u_{xx}. \end{align*} \] It’s easy to check that this system is equivalent to the wave equation – but notice that it’s composed of parts with only even derivatives! (reaction and diffusion equations in the terminology of scalar PDEs). This system is of the form \eqref{gensys} with \[ \begin{align} A_0 & = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} & A_2 & = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}. \end{align} \] Notice that both eigenvalues of both matrices are equal to zero.
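Looking at the two matrices separately tells us nothing here, but the combined symbol does. A mode \(s\,e^{i(kx-\omega t)}\) must satisfy \(-i\omega s = (A_0 + (ik)^2 A_2)s\), so \(\omega = i\mu\) where \(\mu\) is an eigenvalue of \(A_0 - k^2 A_2\). A minimal sketch (the function name `symbol_eigs` is mine):

```python
import cmath


def symbol_eigs(k):
    """Eigenvalues mu of A_0 + (ik)^2 A_2 = [[0, 1], [-k^2, 0]].

    A mode s*exp(i(kx - omega t)) requires -i*omega = mu, i.e. omega = i*mu.
    """
    tr, det = 0.0, k * k          # trace and determinant of the symbol
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2


for k in (0.5, 1.0, 3.0):
    oms = [1j * mu for mu in symbol_eigs(k)]
    print(k, oms)
```

The computed values of \(\omega\) are real and equal to \(\pm k\), which is the dispersion relation of the wave equation, even though each matrix on its own has only zero eigenvalues.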

Notes 2014.03.03

2014-03-03T00:00:00+03:00

Finally figured out what was wrong with the stability regions for the deferred correction methods in Nodepy when \(\theta \ne 0\). See these notebooks.

I also sat with Roland and got the latest version of PeanoClaw running on my workstation.

Notes 2014.02.27

2014-02-27T00:00:00+03:00

Investigated stability regions for high order deferred correction schemes; see this notebook.

Finally, after about a year of searching, found a way to redirect all output from distutils to a file. This will avoid the massive amount of warnings that are currently printed to the screen when installing PyClaw. See the patch here, based on this StackOverflow answer.

I also put together an IPython notebook on shallow water solitary waves over periodic bathymetry. It will be in the Github repo soon.

The Schrodinger equation is not a reaction-diffusion equation

2014-02-22T00:00:00+03:00

Recently, a stackexchange answer claimed that the Schrodinger equation is effectively a reaction-diffusion equation. I’ll set aside semantic arguments about the meaning of “effectively”, and give a more obvious example to explain why I think this statement is misleading.

Consider the wave equation

\[u_{tt} = u_{xx}\]

Introducing a new variable \(v=u_t\) we can rewrite the wave equation as

\[ \begin{align*} v_t & = u_{xx} \\ u_t & = v. \end{align*} \]

Observe that the first of these equations is the diffusion equation, while the second is a reaction equation. Thus we have reaction-diffusion!

Right?

Wrong. We’ve disguised the true nature of this equation by applying our intuition (which is based on scalar PDEs) to a system of PDEs. In the same way, the “reaction-diffusion” label for Schrodinger is obtained by applying intuition based on PDEs with real coefficients to a PDE with complex coefficients.

Of course, in both cases you can use numerical methods that are appropriate for reaction-diffusion problems in order to solve a wave equation.
Here is a quick ipython notebook implementation of the obvious method for the system above.

Notes 2014.02.22

2014-02-22T00:00:00+03:00

Discussed time stepping for aeroacoustics with Antony Jameson at Stanford. Also reviewed a couple of his group’s papers on high order flux reconstruction schemes.

Insights from von Neumann analysis of high-order flux reconstruction schemes

Vincent, Castonguay, Jameson
JCP 2011

Investigate a 1-parameter family of stable flux reconstruction methods suggested by earlier work. For certain parameter values you get DG or SD schemes. Some values admit spurious modes. The size of the largest stable step size and the order of accuracy are determined as a function of the parameter (c). Nonlinear 2D experimental results are predicted relatively well by the 1D von Neumann analysis.

Section 4 is a nice description of how to do von Neumann analysis for FE methods.

On the Non-linear Stability of Flux Reconstruction Schemes

Jameson, Vincent, Castonguay
J. Sci. Comput. 2011

They look at energy stability in a very general way. Nonlinear stability depends on solution point locations, and on the accuracy of the determination of the transformed flux.

Time integration ideas that could be useful for aeronautics simulations:

Optimization of stability regions
Multirate time stepping. Some work has been done in Nonuniform time-step Runge–Kutta discontinuous Galerkin method for Computational Aeroacoustics.
Large time step methods. See A class of large time step Godunov schemes for hyperbolic conservation laws and applications.

Large time step methods might work very well as the coarse propagator for parareal-type algorithms.

The parallel EPPEER code

2012-10-17T00:00:00+03:00

I tried out the EPPEER code, which uses two-step Runge-Kutta methods and OpenMP, because I’m thinking of writing a shared-memory parallel ODE solver code myself.

I downloaded the code from

http://www.mathematik.uni-marburg.de/~schmitt/peer/eppeer.zip

unzipped, and ran

gfortran -c mbod4h.f90 
gfortran -c ivprkp.f90 
gfortran -c -fopenmp ivpepp.f90 
gfortran -fopenmp ivprkp.o ivpepp.o mbod4h.o ivp_pmain.f90
./a.out

I had to fix one line that was trying to open a logfile and failed. I also set

export OMP_NUM_THREADS=4

This runs the code with increasingly tight tolerances on a 400-body problem. The output was as follows (I killed it before it finished the runs with the tightest tolerances):

 tol, err, otime, cpu  0.10E-01 0.10702      2.9556      10.534    
 steps,rej,nfcn:  337   88     1399
 tol, err, otime, cpu  0.10E-02 0.93692E-01  4.9853      18.585    
 steps,rej,nfcn:  605  159     2465
 tol, err, otime, cpu  0.10E-03 0.66604E-01  7.9798      30.365    
 steps,rej,nfcn:  994  244     4015
 tol, err, otime, cpu  0.10E-04 0.47637E-01  12.026      46.477    
 steps,rej,nfcn: 1534  324     6175
 tol, err, otime, cpu  0.10E-05 0.24241E-01  18.239      70.756    
 steps,rej,nfcn: 2338  415     9391

If I understand correctly, the last column is total CPU time; the next to last is wall time. For comparison, I ran it without parallelism:

export OMP_NUM_THREADS=1

Then I got the following:

 tol, err, otime, cpu  0.10E-01 0.10702      10.382      10.382    
 steps,rej,nfcn:  337   88     1399
 tol, err, otime, cpu  0.10E-02 0.93692E-01  18.297      18.297    
 steps,rej,nfcn:  605  159     2465
 tol, err, otime, cpu  0.10E-03 0.66604E-01  29.814      29.815    
 steps,rej,nfcn:  994  244     4015
 tol, err, otime, cpu  0.10E-04 0.47637E-01  45.854      45.855    
 steps,rej,nfcn: 1534  324     6175
 tol, err, otime, cpu  0.10E-05 0.24241E-01  69.725      69.726    
 steps,rej,nfcn: 2338  415     9391
 tol, err, otime, cpu  0.10E-06 0.53727E-02  105.47      105.48    
 steps,rej,nfcn: 3539  484    14195

The numbers of function evaluations were identical, confirming that the computations being performed were the same. The speedup (roughly 3.5x to 3.8x on 4 threads) is very nice. We should be able to achieve something similar with extrapolation.
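For the record, here is a short sketch of how the speedup can be computed from the wall times ("otime") in the two listings above. The dictionary names are mine; the numbers are copied from the output.

```python
# Wall-clock times ("otime") copied from the two runs above, keyed by tolerance.
wall_4_threads = {1e-2: 2.9556, 1e-3: 4.9853, 1e-4: 7.9798, 1e-5: 12.026, 1e-6: 18.239}
wall_1_thread = {1e-2: 10.382, 1e-3: 18.297, 1e-4: 29.814, 1e-5: 45.854, 1e-6: 69.725}

speedup = {tol: wall_1_thread[tol] / wall_4_threads[tol] for tol in wall_4_threads}

for tol, s in speedup.items():
    # Parallel efficiency = speedup / number of threads
    print(f"tol={tol:.0e}  speedup={s:.2f}  efficiency={s / 4:.0%}")
```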

These results are actually plotted in the user guide, at the end of Section 4.

This was originally posted on mathwiki.

Blogging an iPython notebook with Jekyll

2012-10-11T00:00:00+03:00

Update as of December 2014: Don’t bother using what’s below; go to Christop Corley’s blog for a much better setup!

I’ve been playing around with iPython notebooks for a while and planning to use them instead of SAGE worksheets for my numerical analysis course next spring. As a warmup, I wrote an iPython notebook explaining a bit about internal stability of Runge-Kutta methods and showing some new research results using NodePy.

I also wanted to post the notebook on my blog here; the ability to more easily include math and code in blog posts was one of my main motivations for moving away from Blogger to my own site. I first tried following the instructions given by Fernando Perez. That was quite painless and worked flawlessly, using nbconvert.py to convert the .ipynb file directly to HTML, with graphics embedded. The only issue was that I didn’t love the look of the output quite as much as I love how Carl Boettiger’s Markdown + Jekyll posts with code and math look (see an example here). Besides, Markdown is so much nicer than HTML, and nbconvert.py has a Markdown output option.

So I tried the markdown option:

nbconvert.py my_nb.ipynb -f markdown

I copied the result to my _posts/ directory, added the YAML front-matter that Jekyll expects, and took a look. Everything was great except that all my plots were gone, of course. After considering a few options, I decided for now to put plots for such posts in a subfolder jekyll_images/ of my public Dropbox folder. Then it was a simple matter of search/replace all the paths to the images. At that point, it looked great; you can see the source and the result.

The only issue was that I didn’t want to manually do all that work every time. I considered creating a new Converter class in nbconvert to handle it, but finally decided that it would be more convenient to just write a shell script that calls nbconvert and then operates on the result.
Here it is:

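#!/bin/bash
# Convert an iPython notebook to a Jekyll post.
# Usage: pass the notebook name without the .ipynb extension.

fname=$1

# Convert the notebook to Markdown
nbconvert.py "${fname}.ipynb" -f markdown
# Point image paths at the public Dropbox folder.
# Note: "-i ''" is the BSD/macOS form of in-place editing.
sed -i '' "s#${fname}_files#https://dl.dropbox.com/u/656693/jekyll_images/${fname}_files#g" "${fname}.md"

# Today's date, used in the front matter and in the post filename
dt=$(date "+%Y-%m-%d")

# Prepend the YAML front matter with ed
echo "0a
---
layout:    post
time:      ${dt}
title:     TITLE-ME
subtitle:  SUBTITLE-ME
tags:      TAG-ME
---
.
w" | ed "${fname}.md"

# Move the post into the Jekyll _posts directory
mv "${fname}.md" ~/labnotebook/_posts/"${dt}-${fname}.md"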

It’s also on Github here. This was a nice educational exercise in constructing shell scripts, in which I learned or re-learned:

how to use command-line arguments
how to use sed and ed
how to use date

You can expect a lot more iPython-notebook based posts in the future.

Internal stability of Runge-Kutta methods

2012-10-11T00:00:00+03:00

Note: this post was generated from an iPython notebook. You can download the notebook from github and execute all the code yourself.

Internal stability deals with the growth of errors (such as roundoff) introduced at the Runge-Kutta stages during a single Runge-Kutta step. It is usually important only for methods with a large number of stages, since that is when the internal amplification factors can be large. An excellent explanation of internal stability is given in this paper. Here we demonstrate some tools for studying internal stability in NodePy.

First, let’s load a couple of RK methods:

from nodepy import rk
rk4 = rk.loadRKM('RK44')
ssprk4 = rk.loadRKM('SSP104')
print(rk4)
print(ssprk4)

Classical RK4
The original four-stage, fourth-order method of Kutta
 0   |
 1/2 |  1/2
 1/2 |  0    1/2
 1   |  0    0    1
_____|____________________
     |  1/6  1/3  1/3  1/6
SSPRK(10,4)
The optimal ten-stage, fourth order SSP Runge-Kutta method
 0   |
 1/6 |  1/6
 1/3 |  1/6   1/6
 1/2 |  1/6   1/6   1/6
 2/3 |  1/6   1/6   1/6   1/6
 1/3 |  1/15  1/15  1/15  1/15  1/15
 1/2 |  1/15  1/15  1/15  1/15  1/15  1/6
 2/3 |  1/15  1/15  1/15  1/15  1/15  1/6   1/6
 5/6 |  1/15  1/15  1/15  1/15  1/15  1/6   1/6   1/6
 1   |  1/15  1/15  1/15  1/15  1/15  1/6   1/6   1/6   1/6
_____|____________________________________________________________
     |  1/10  1/10  1/10  1/10  1/10  1/10  1/10  1/10  1/10  1/10

Absolute stability regions

First we can use NodePy to plot the region of absolute stability for each method. The absolute stability region is the set

\(\{ z \in \mathbb{C} : |\psi (z)|\le 1 \}\)

where \(\psi(z)\) is the stability function of the method (here \(e\) denotes the vector with all entries equal to 1):

\(\psi(z) = 1 + z b^T (I-zA)^{-1} e.\)

If we solve \(u'(t) = \lambda u\) with a given method, then \(z=\lambda \Delta t\) must lie inside this region or the computation will be unstable.

p,q = rk4.stability_function()
print(p)
h1=rk4.plot_stability_region()

         4          3       2
0.04167 x + 0.1667 x + 0.5 x + 1 x + 1

p,q = ssprk4.stability_function()
print(p)
h2=ssprk4.plot_stability_region()

           10             9            8             7           6
3.969e-09 x  + 2.381e-07 x + 6.43e-06 x + 0.0001029 x + 0.00108 x
            5           4          3       2
 + 0.00787 x + 0.04167 x + 0.1667 x + 0.5 x + 1 x + 1
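The RK4 polynomial printed above can be reproduced by hand from the Butcher tableau. This is a minimal sketch in exact rational arithmetic that does not use NodePy; the function name `stability_polynomial` is my own, and it relies on the method being explicit.

```python
from fractions import Fraction as F

# Butcher coefficients of the classical RK4 method (from the tableau above)
A = [[0, 0, 0, 0],
     [F(1, 2), 0, 0, 0],
     [0, F(1, 2), 0, 0],
     [0, 0, 1, 0]]
b = [F(1, 6), F(1, 3), F(1, 3), F(1, 6)]


def stability_polynomial(A, b):
    """Coefficients [c_0, c_1, ...] of 1 + z b^T (I - zA)^{-1} e.

    For an explicit method A is nilpotent, so (I - zA)^{-1} = sum_m z^m A^m
    and the coefficient of z^(m+1) is b^T A^m e.
    """
    s = len(b)
    v = [F(1)] * s                      # v = A^m e, starting with m = 0
    coeffs = [F(1)]
    for _ in range(s):
        coeffs.append(sum(b_i * v_i for b_i, v_i in zip(b, v)))
        v = [sum(A[i][j] * v[j] for j in range(s)) for i in range(s)]
    return coeffs


print(stability_polynomial(A, b))
```

The coefficients are listed from the constant term up, so they should be read in the reverse order of the printed polynomial.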

Internal stability

The stability function tells us by how much errors from one step are amplified in the next one. This is important since we introduce truncation errors at every step. However, we also introduce roundoff errors at each stage within a step. Internal stability tells us about the growth of those. Internal stability is typically less important than (step-by-step) absolute stability for two reasons:

Roundoff errors are typically much smaller than truncation errors, so moderate amplification of them typically is not significant
Although the propagation of stage errors within a step is governed by internal stability functions, in later steps these errors are propagated according to the (principal) stability function

Nevertheless, in methods with many stages, internal stability can play a key role.

Questions: In the solution of PDEs, large spatial truncation errors enter at each stage. Does this mean internal stability becomes more significant? How does this relate to stiff accuracy analysis and order reduction?

Internal stability functions

We can write the equations of a Runge-Kutta method compactly as

\(y = u^n e + h A F(y),\)

\(u^{n+1} = u^n + h b^T F(y),\)

where \(y\) is the vector of stage values, \(u^n\) is the previous step solution, \(e\) is a vector with all entries equal to 1, \(h\) is the step size, \(A\) and \(b\) are the coefficients in the Butcher tableau, and \(F(y)\) is the vector of stage derivatives. In floating point arithmetic, roundoff errors will be made at each stage. Representing these errors by a vector \(r\), we have

\(y = u^n e + h A F(y) + r.\)

Considering the test problem \(F(y)=\lambda y\) and solving for \(y\) gives

\(y = u^n (I-zA)^{-1}e + (I-zA)^{-1}r,\)

where \(z=h\lambda\). Substituting this result in the equation for \(u^{n+1}\) gives

\(u^{n+1} = u^n (1 + zb^T(I-zA)^{-1}e) + zb^T(I-zA)^{-1}r = \psi(z) u^n + \theta(z)^T r.\)

Here \(\psi(z)\) is the stability function of the method, which we already encountered above. Meanwhile, the vector \(\theta(z)\) contains the internal stability functions that govern the amplification of roundoff errors \(r\) within a step:

\(\theta(z)^T = z b^T (I-zA)^{-1}.\)

Let’s compute \(\theta\) for the classical RK4 method:

theta=rk4.internal_stability_polynomials()
theta

    [poly1d([1/24, 1/12, 1/6, 1/6, 0], dtype=object),
     poly1d([1/12, 1/6, 1/3, 0], dtype=object),
     poly1d([1/6, 1/3, 0], dtype=object),
     poly1d([1/6, 0], dtype=object)]

for theta_j in theta:
    print(theta_j)

         4           3          2
0.04167 x + 0.08333 x + 0.1667 x + 0.1667 x
         3          2
0.08333 x + 0.1667 x + 0.3333 x
        2
0.1667 x + 0.3333 x
 
0.1667 x

Thus the roundoff errors in the first stage are amplified by a factor \(z^4/24 + z^3/12 + z^2/6 + z/6\), while the errors in the last stage are amplified by a factor \(z/6\).
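As a sanity check on the formula \(u^{n+1} = \psi(z) u^n + \theta(z)^T r\), here is a sketch that takes one RK4 step on the test problem with artificial stage perturbations and compares the result with the prediction. It does not use NodePy; the function names and the particular values of \(z\) and \(r\) are arbitrary choices of mine.

```python
# Classical RK4 coefficients (A is strictly lower triangular)
A = [[0, 0, 0, 0],
     [0.5, 0, 0, 0],
     [0, 0.5, 0, 0],
     [0, 0, 1.0, 0]]
b = [1 / 6, 1 / 3, 1 / 3, 1 / 6]
s = len(b)


def theta(z):
    """theta(z)^T = z b^T (I - zA)^{-1} = sum_m z^(m+1) b^T A^m (A is nilpotent)."""
    row = list(b)                      # row = b^T A^m, starting with m = 0
    out = [0j] * s
    for m in range(s):
        out = [o + z ** (m + 1) * w for o, w in zip(out, row)]
        row = [sum(row[i] * A[i][j] for i in range(s)) for j in range(s)]
    return out


def perturbed_step(u, z, r):
    """One RK4 step for u' = lambda*u with stage perturbations r; z = h*lambda."""
    y = []
    for i in range(s):
        y.append(u + z * sum(A[i][j] * y[j] for j in range(i)) + r[i])
    return u + z * sum(b[j] * y[j] for j in range(s))


z = -1.5 + 0.7j
r = [1e-3, -2e-3, 5e-4, 1e-3]
th = theta(z)
psi = 1 + sum(th)                      # psi(z) = 1 + theta(z)^T e
formula = psi * 1.0 + sum(t * r_j for t, r_j in zip(th, r))
print(abs(perturbed_step(1.0, z, r) - formula))   # should be at roundoff level
```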

Internal instability

Usually internal stability is unimportant since it relates to amplification of roundoff errors, which are very small. Let’s think about when things can go wrong in terms of internal instability. If \(|\theta(z)|\) is of the order \(1/\epsilon_{machine}\), then roundoff errors could be amplified so much that they destroy the accuracy of the computation. More specifically, we should be concerned if \(|\theta(z)|\) is of the order \(tol/\epsilon_{machine}\) where \(tol\) is our desired error tolerance. Of course, we only care about values of \(z\) that lie inside the absolute stability region \(S\), since internal stability won’t matter if the computation is not absolutely stable.

We can get some idea about the amplification of stage errors by plotting the curves \(|\theta(z)|=1\) along with the stability region. Ideally these curves will all lie outside the stability region, so that all stage errors are damped.

rk4.internal_stability_plot()

ssprk4.internal_stability_plot()

For both methods, we see that some of the curves intersect the absolute stability region, so some stage errors are amplified. But by how much? We’d really like to know the maximum amplification of the stage errors under the condition of absolute stability. We therefore define the maximum internal amplification factor \(M\):

\(M = \max_j \max_{z \in S} |\theta_j(z)|\)

print(rk4.maximum_internal_amplification())
print(ssprk4.maximum_internal_amplification())

2.15239281554
4.04399941143

We see that both methods have small internal amplification factors, so internal stability is not a concern in either case. This is not surprising for the method with only four stages; it is a surprisingly good property of the method with ten stages.
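To make the definition of \(M\) concrete, here is a brute-force sketch for classical RK4 that samples a grid, keeps the points with \(|\psi(z)|\le 1\), and records the largest \(|\theta_j(z)|\). This is not how NodePy computes the value; the grid size and bounding box are guesses of mine, so the result is only a rough estimate from below.

```python
def psi(z):
    """Stability polynomial of classical RK4."""
    return 1 + z + z**2 / 2 + z**3 / 6 + z**4 / 24


def thetas(z):
    """Internal stability polynomials of classical RK4 (as printed earlier)."""
    return [z**4 / 24 + z**3 / 12 + z**2 / 6 + z / 6,
            z**3 / 12 + z**2 / 6 + z / 3,
            z**2 / 6 + z / 3,
            z / 6]


def max_internal_amplification(n=301, box=(-4.0, 1.0, -4.0, 4.0)):
    """Brute-force estimate of max_j max_{z in S} |theta_j(z)| on an n-by-n grid."""
    x0, x1, y0, y1 = box
    best = 0.0
    for ix in range(n):
        for iy in range(n):
            z = complex(x0 + (x1 - x0) * ix / (n - 1),
                        y0 + (y1 - y0) * iy / (n - 1))
            if abs(psi(z)) <= 1.0:      # keep only points in the stability region
                best = max(best, max(abs(t) for t in thetas(z)))
    return best


print(max_internal_amplification())
```

A finer grid should approach the value reported by NodePy for RK4.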

Questions: Do SSP RK methods always (necessarily) have small amplification factors? Can we prove it?

Now let’s look at some methods with many stages.

Runge-Kutta Chebyshev methods

The paper of Verwer, Hundsdorfer, and Sommeijer deals with RKC methods, which can have very many stages. The construction of these methods is implemented in NodePy, so let’s take a look at them. The functions RKC1(s) and RKC2(s) construct RKC methods of order 1 and 2, respectively, with \(s\) stages.

s=4
rkc = rk.RKC1(s)
print(rkc)

RKC41

 0    |
 1/16 |  1/16
 1/4  |  1/8   1/8
 9/16 |  3/16  1/4   1/8
______|________________________
      |   1/4   3/8   1/4   1/8

rkc.internal_stability_plot()

It looks like there could be some significant internal amplification here. Let’s see:

rkc.maximum_internal_amplification()

    11.760869405962685

Nothing catastrophic. Let’s try a larger value of \(s\):

s=20
rkc = rk.RKC1(s)
rkc.maximum_internal_amplification()

    42.665327220219126

As promised, these methods seem to have good internal stability properties. What about the second-order methods?

s=20
rkc = rk.RKC2(s)
rkc.maximum_internal_amplification()

    106.69110992619214

Again, nothing catastrophic. We could take \(s\) much larger than 20, but the calculations get to be rather slow (in Python) and since we’re using floating point arithmetic, the accuracy deteriorates.

Remark: we could do the calculations in exact arithmetic using Sympy, but things would get even slower. Perhaps there are some optimizations that could be done to speed this up. Or perhaps we should use Mathematica if we need to do this kind of thing.

Remark 2: of course, for the RKC methods the internal stability polynomials are shifted Chebyshev polynomials. So we could evaluate them directly in a stable manner using the three-term recurrence (or perhaps scipy’s special functions library). This would also be a nice check on the calculations above.
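For reference, here is a sketch of the three-term recurrence mentioned in Remark 2, for the plain Chebyshev polynomials \(T_n\). The shift and scaling that turn these into the RKC internal stability polynomials are not given in this post and are not reproduced here, so this only illustrates the stable evaluation technique.

```python
import math


def chebyshev_T(n, x):
    """Evaluate the Chebyshev polynomial T_n(x) with the three-term recurrence
    T_0 = 1, T_1 = x, T_{m+1} = 2 x T_m - T_{m-1}."""
    t_prev, t = 1.0, x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t = t, 2 * x * t - t_prev
    return t


# Sanity check against the identity T_n(cos(a)) = cos(n a)
a = 0.3
print(chebyshev_T(20, math.cos(a)), math.cos(20 * a))
```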

Other methods with many stages

Three other classes of methods with many stages have been implemented in NodePy:

SSP families
Integral deferred correction (IDC) methods
Extrapolation methods

SSP Families

s=20
ssprk = rk.SSPRK2(s)
ssprk.internal_stability_plot()
ssprk.maximum_internal_amplification()

    2.0212921484995547

s=25 # number of stages
ssprk = rk.SSPRK3(s)
ssprk.internal_stability_plot()
ssprk.maximum_internal_amplification()

    3.8049237837215397

The SSP methods seem to have excellent internal stability properties.

IDC methods

p=6 #order
idc = rk.DC(p-1)
print(len(idc))
idc.internal_stability_plot()
idc.maximum_internal_amplification()

    6.4140166271998815

IDC methods also seem to have excellent internal stability.

Extrapolation methods

p=6 #order
ex = rk.extrap(p)
print(len(ex))
ex.internal_stability_plot()
ex.maximum_internal_amplification()

16

    97.6097450835

Not so good. Let’s try a method with even more stages (this next computation will take a while; go stretch your legs).

p=10 #order
ex = rk.extrap(p)
print(len(ex))
ex.maximum_internal_amplification()

    28073.244376758907

Now we’re starting to see something that might cause trouble, especially since such high order extrapolation methods are usually used when extremely tight error tolerances are required. Internal amplification will cause a loss of about 5 digits of accuracy here, so the best we can hope for is about 10 digits of accuracy in double precision. Higher order extrapolation methods will make things even worse. How large are their amplification factors? (Really long calculation here…)

import numpy as np

pmax = 12
ampfac = np.zeros(pmax+1)
for p in range(1,pmax+1):
    ex = rk.extrap(p)
    ampfac[p] = ex.maximum_internal_amplification()
    print(p, ampfac[p])

1 1.99777378912
2 2.40329384375
3 5.07204078733
4 17.747335803
5 69.62805786
6 97.6097450835
7 346.277441462
8 1467.40356089
9 6344.16303534
10 28073.2443768
11 126011.586473
12 169897.662582

import matplotlib.pyplot as plt
plt.semilogy(ampfac, linewidth=3)

We see roughly geometric growth of the internal amplification factor as a function of the order \(p\). It seems clear that very high order extrapolation methods applied to problems with high accuracy requirements will fall victim to internal stability issues.
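To put a rough number on that growth, here is a sketch of a least-squares fit of \(\log M\) against \(p\), using the values copied from the output above. The fit is my own quick estimate, not something computed by NodePy.

```python
import math

# Amplification factors for orders p = 1..12, copied from the output above
ampfac = [1.99777378912, 2.40329384375, 5.07204078733, 17.747335803,
          69.62805786, 97.6097450835, 346.277441462, 1467.40356089,
          6344.16303534, 28073.2443768, 126011.586473, 169897.662582]

p = list(range(1, len(ampfac) + 1))
y = [math.log(m) for m in ampfac]

# Least-squares slope of log(M) versus p
p_mean = sum(p) / len(p)
y_mean = sum(y) / len(y)
slope = (sum((pi - p_mean) * (yi - y_mean) for pi, yi in zip(p, y))
         / sum((pi - p_mean) ** 2 for pi in p))

# Average factor by which M grows when the order increases by one
growth_per_order = math.exp(slope)
print(growth_per_order)
```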

A curious upwind implicit scheme for advection

2012-10-11T00:00:00+03:00

The CFL condition

The CFL condition is one of the most basic and intuitive principles in the numerical solution of hyperbolic PDEs. First formulated by Courant, Friedrichs and Lewy in their seminal paper (available in English for free at http://www.stat.uchicago.edu/~lekheng/courses/302/classics/courant-friedrichs-lewy.pdf), it states that the domain of dependence of a numerical method for solving a PDE must contain the true domain of dependence. Otherwise, the numerical method cannot be convergent.

The CFL condition is geometric and easily understood in the context of, say, a first-order upwind discretization of advection. Usually it says nothing interesting about implicit schemes, since they include all points in their domain of dependence. But sometimes understanding the CFL condition for a particular scheme can be subtle.

An implicit scheme

Consider the advection equation

\[u_t + a u_x = 0.\]

Discretization using a backward difference in space and in time gives the scheme

\[U^{n+1}_j = U^n_j - \nu(U^{n+1}_j - U^{n+1}_{j-1}).\]

where \(\nu = ka/h\) is the CFL number and \(k,h\) are the step sizes in time and space, respectively. This very simple scheme illustrates the concepts of the CFL condition and stability in a remarkable way.

For simplicity, suppose that the problem is posed on the domain \(0\le x \le 1\), with an appropriate boundary condition. Since this scheme computes \(U^{n+1}_j\) in terms of \(U^n_j\) and \(U^{n+1}_{j-1}\), it seems that the numerical domain of dependence for \(U^n_j\) is \((x,t)\in (0,x_j)\times[0,t_n]\). Based on this, we may conclude that the scheme is not convergent for \(\nu<0\). Simple enough.

But what if \(\nu=-1\)? Then the scheme reads \[U^{n+1}_{j-1} = U^n_j,\] which gives the exact solution! This is a sort of “anti-unit CFL condition”.

How can this scheme be convergent (in fact, exact!) for a negative CFL number when it doesn’t use any values to the right?

Understanding the CFL condition

Look at the exact formula above. In this case the scheme is not a method for computing \(U^{n+1}_j\) but for computing \(U^{n+1}_{j-1}\), and it does use a value from the previous time step that lies to the right.

So we can view the scheme with \(\nu=-1\) as a method for computing \(U^{n+1}_j\), in which case the CFL condition is satisfied only for \(\nu\ge0\), or we can view the scheme as a method for computing \(U^{n+1}_{j-1}\), in which case the CFL condition is satisfied only for \(\nu\le-1\). Which viewpoint is correct?

To answer that question, remember that the CFL condition is purely algebraic – that is, it relates to which values are actually used to compute which other values. To understand this scheme, we need to think about how we actually solve for \(U^{n+1}\) when using it. Notice that the scheme can be written as \[A U^{n+1} = U^n\] where the matrix \(A\) is lower-triangular. Hence the system can be solved by substitution. To go further, we must consider two cases:

\(\nu>0\): in this case, boundary values must be supplied along the left boundary at \(x=0\). Then, starting from the known value at the boundary, we work to the right by substitution: \[U^{n+1}_j = \frac{U^n_j+\nu U^{n+1}_{j-1}}{1+\nu}.\] Hence the scheme is truly a way of computing \(U^{n+1}_j\) based on \(U^n_j, U^{n+1}_{j-1}\) and the resulting CFL condition is \(\nu\ge0\).
\(\nu<0\): in this case, boundary values must be supplied along the right boundary at \(x=1\). Then, starting from the known value at the boundary, we work to the left by substitution: \[U^{n+1}_{j-1} = \frac{(1+\nu)U^{n+1}_j - U^n_j}{\nu}.\] Hence the scheme is truly a way of computing \(U^{n+1}_{j-1}\) based on \(U^n_j, U^{n+1}_{j}\) and the resulting CFL condition is \(\nu\le-1\).
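The two sweeps can be sketched in a few lines. The function name `step`, the sample data, and the boundary values below are my own choices for illustration; the update formulas are the two substitution formulas above.

```python
def step(U, nu, boundary):
    """Advance U^{n+1}_j = U^n_j - nu*(U^{n+1}_j - U^{n+1}_{j-1}) by one step.

    `boundary` is the known new-time value at the end where the sweep starts:
    the left end when nu > 0, the right end when nu < 0.
    """
    n = len(U)
    if nu == 0:
        return list(U)                             # nothing moves
    new = [0.0] * n
    if nu > 0:
        new[0] = boundary
        for j in range(1, n):                      # sweep left to right
            new[j] = (U[j] + nu * new[j - 1]) / (1 + nu)
    else:
        new[-1] = boundary
        for j in range(n - 1, 0, -1):              # sweep right to left
            new[j - 1] = ((1 + nu) * new[j] - U[j]) / nu
    return new


U = [0.0, 1.0, 4.0, 9.0, 16.0]
print(step(U, -1.0, boundary=25.0))   # nu = -1: each value moves one cell left
print(step(U, 1.0, boundary=0.0))
```

With \(\nu=-1\) the right-to-left sweep reduces to \(U^{n+1}_{j-1} = U^n_j\), so the data is shifted exactly by one cell.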

This post was originally published on the KAUST Mathwiki here (login required).