David Ketcheson2023-04-17T14:18:03+03:00David I. Ketchesondketch@gmail.comDispersion relations for linear systems of PDEs2014-05-28T00:00:00+03:00h/2014/05/28/dispersion_relations<p>Fourier analysis is an essential tool for understanding the behavior
of solutions to linear equations. Often, this analysis is introduced to
students in the context of scalar equations with real coefficients. If
nothing more is said, students may mistakenly apply assumptions based on
the scalar case to systems, leading to erroneous conclusions. I’m
surprised at how often I’ve seen this, and I’ve even made the mistake
myself.</p>
<h2 id="scalar-equations">Scalar equations</h2>
<p>Students in any undergraduate PDE course learn that solutions of the
heat equation</p>
<p><span class="math display">\[
\label{heat}
u_t(x,t) = u_{xx}(x,t)
\]</span></p>
<p>diffuse in time whereas solutions of the wave equation</p>
<p><span class="math display">\[
\label{wave}
u_{tt} = u_{xx}
\]</span></p>
<p>oscillate in time without growing or decaying. They may even be
introduced to a general approach for the Cauchy problem: given an
evolution equation</p>
<p><span class="math display">\[ \label{evol}
u_t = \sum_{j=0}^n a_j \frac{\partial^j u}{\partial x^j},
\]</span></p>
<p>one inserts the Fourier mode solution</p>
<p><span class="math display">\[ \label{fourier}
u(x,t) = e^{i(kx - \omega(k) t)}
\]</span></p>
<p>to obtain</p>
<p><span class="math display">\[-i\omega(k) = \sum_{j=0}^n a_j
(ik)^j\]</span></p>
<p>or simply</p>
<p><span class="math display">\[\omega(k) = \sum_{j=0}^n a_j i^{j+1}
k^j.\]</span></p>
<p>The function <span class="math inline">\(\omega(k)\)</span> is often
referred to as the <em>dispersion relation</em> for the PDE. Any
solution can be expressed as a sum of Fourier modes, and each mode
propagates in a manner dictated by the dispersion relation. It’s easy to
see that</p>
<ul>
<li>If <span class="math inline">\(\omega(k)\)</span> is
<strong>real</strong>, then energy is conserved and each mode simply
translates. This occurs if only odd-numbered spatial derivatives appear
in the evolution equation \eqref{evol}.</li>
<li>If <span class="math inline">\(\omega(k)\)</span> has
<strong>negative imaginary part</strong>, energy decays in time. The
heat equation \eqref{heat} behaves this way.</li>
<li>If <span class="math inline">\(\omega(k)\)</span> has
<strong>positive imaginary part</strong>, then the energy will grow
exponentially in time. This doesn’t usually occur in physical systems.
An example of this behavior is obtained by changing the sign of the
right side in the heat equation to get <span class="math inline">\(u_t =
- u_{xx}\)</span>.</li>
</ul>
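<p>The three cases above can be checked numerically. What follows is a
minimal Python sketch, not part of any library: the helper names
<code>omega</code> and <code>classify</code> and the test wavenumber are
illustrative choices. It evaluates the dispersion relation for a list of
coefficients <span class="math inline">\(a_j\)</span> and reports the sign of
the imaginary part.</p>

```python
def omega(a, k):
    """Return omega(k) = sum_j a[j] * i**(j+1) * k**j for u_t = sum_j a[j] * d^j u / dx^j."""
    return sum(a_j * 1j ** (j + 1) * k ** j for j, a_j in enumerate(a))


def classify(w, tol=1e-12):
    """Describe the mode exp(i(kx - w t)) from the sign of Im(w)."""
    if abs(w.imag) <= tol:
        return "translates (energy conserved)"
    return "decays" if w.imag < 0 else "grows"


k = 2.0  # arbitrary test wavenumber (an assumption for illustration)
for name, a in [("advection u_t = u_x", [0, 1]),
                ("heat u_t = u_xx", [0, 0, 1]),
                ("backward heat u_t = -u_xx", [0, 0, -1])]:
    w = omega(a, k)
    print(name, w, classify(w))
```

<p>Passing complex coefficients to the same helper is one way to experiment
with equations whose coefficients are not real.</p>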
<p>What about the wave equation, which has two time derivatives? Using
the same Fourier mode ansatz \eqref{fourier}, one obtains <span
class="math display">\[
\begin{align}
\omega^2 & = k^2
\end{align}
\]</span> or <span class="math inline">\(\omega = \pm k\)</span>. Since
<span class="math inline">\(\omega\)</span> is real, energy is
conserved.</p>
<p>In the discussion above, we have assumed that <span
class="math inline">\(u\)</span> is a scalar and that the coefficients
<span class="math inline">\(a_j\)</span> are real. Many undergraduate
courses stop at this point, and students are left with the intuition
that <strong>even-numbered derivative terms are diffusive</strong> while
<strong>odd-numbered derivative terms are dispersive</strong>.</p>
<p>In practice, we often deal with systems of PDEs or PDEs with complex
coefficients, and this intuition is then no longer correct. There is
nothing deep or mysterious about this topic, but it’s easy to jump to
incorrect conclusions if one is not careful. To take a common example,
consider the time-dependent Schroedinger equation: <span
class="math display">\[i \psi_t = \psi_{xx} + V\psi.\]</span> At first
glance, we have on the right side a diffusion term (<span
class="math inline">\(\psi_{xx}\)</span>) and a reaction term (<span
class="math inline">\(V\psi\)</span>). But what about that pesky factor
of <span class="math inline">\(i\)</span> (the imaginary unit) on the
left hand side? It’s easy to find the answer using the usual ansatz, but
let’s take a little detour first.</p>
<h2 id="systems-of-equations">Systems of equations</h2>
<p>Consider the linear system <span class="math display">\[
\begin{align*}
u_t = A \frac{\partial^j u}{\partial x^j},
\end{align*}
\]</span> where <span class="math inline">\(u\in \mathbb{R}^m\)</span>
and <span class="math inline">\(A\)</span> is a square real matrix. Let
<span class="math inline">\(\lambda_m\)</span> and <span
class="math inline">\(s_m\)</span> denote the eigenvalues and
eigenvectors (respectively) of <span class="math inline">\(A\)</span>.
Inserting the Fourier mode solution <span class="math display">\[u(x,t)
= s_m e^{i(kx - \omega(k) t)},\]</span> we obtain <span
class="math display">\[\omega(k) = i^{j+1} k^j \lambda_m,\]</span>
and any solution can be written as a superposition of these. We see now
that the behavior of the energy with respect to time depends on both the
number <span class="math inline">\(j\)</span> of spatial derivatives and
the nature of the eigenvalues of <span class="math inline">\(A\)</span>.
For instance, if <span class="math inline">\(j=1\)</span> and <span
class="math inline">\(A\)</span> has real eigenvalues, energy is
conserved. We can obtain just such an example by rewriting the wave
equation \eqref{wave} as a first-order system: <span
class="math display">\[
\begin{align}
u_t & = v_x \label{w1} \\
v_t & = u_x. \label{w2}
\end{align}
\]</span> (If you’re not familiar with this, just differentiate
\eqref{w1} w.r.t. <span class="math inline">\(t\)</span> and \eqref{w2}
w.r.t. <span class="math inline">\(x\)</span>, then equate partial
derivatives to get back the second-order wave equation \eqref{wave}). We
have a linear system with <span class="math inline">\(j=1\)</span> and
<span class="math display">\[ A = \begin{pmatrix}
0 & 1 \\ 1 & 0
\end{pmatrix}.\]</span> This matrix has eigenvalues <span
class="math inline">\(\lambda=\pm 1\)</span>, so <span
class="math inline">\(\omega(k)\)</span> has zero imaginary part.</p>
<p>In this example, our intuition from the scalar case works: our
first-order system, with only odd-numbered derivatives, leads to
wave-like behavior. But notice that if <span
class="math inline">\(A\)</span> had imaginary eigenvalues, our
intuition would be wrong; for instance, the system <span
class="math display">\[
\begin{align*}
u_t & = -v_x \\
v_t & = u_x,
\end{align*}
\]</span> corresponding to the second-order equation <span
class="math inline">\(u_{tt} = - u_{xx},\)</span> admits exponentially
growing solutions.</p>
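<p>The two first-order systems above can be compared numerically. This is a
small sketch using NumPy; the function name <code>system_omega</code> and the
test wavenumber are assumptions made for illustration. It computes <span
class="math inline">\(\omega = i^{j+1} k^j \lambda\)</span> for each
eigenvalue <span class="math inline">\(\lambda\)</span> of <span
class="math inline">\(A\)</span>.</p>

```python
import numpy as np


def system_omega(A, j, k):
    """Frequencies omega = i**(j+1) * k**j * lambda for u_t = A d^j u / dx^j."""
    lam = np.linalg.eigvals(np.asarray(A, dtype=complex))
    return 1j ** (j + 1) * k ** j * lam


k = 1.5  # arbitrary test wavenumber
wave = system_omega([[0, 1], [1, 0]], j=1, k=k)         # u_t = v_x,  v_t = u_x
modified = system_omega([[0, -1], [1, 0]], j=1, k=k)    # u_t = -v_x, v_t = u_x

print("wave system:    ", wave)
print("modified system:", modified)
print("largest growth rate Im(omega):", modified.imag.max())
```

<p>For the wave system both frequencies are real, while the modified system
has one frequency with positive imaginary part, which is the growing
mode.</p>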
<h2 id="scalar-problems-with-complex-coefficients">Scalar problems with
complex coefficients</h2>
<p>Now that we understand the dispersion relation for systems, it’s easy
to understand the dispersion relation for the Schrodinger equation.
Multiply by <span class="math inline">\(-i\)</span> to get <span
class="math display">\[\psi_t = -i\psi_{xx} - iV\psi.\]</span> Now we
can think of this in the same way as a system, where the coefficient
matrices have purely imaginary eigenvalues. Then it’s clear that the
(even-derivative) terms on the right hand side are both related to wave
behavior (i.e., energy is conserved).</p>
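<p>As a quick check, here is the computation written out. It is a sketch
that assumes a constant, real potential <span
class="math inline">\(V\)</span>, which the discussion above does not
specify. Inserting the Fourier mode <span class="math inline">\(\psi =
e^{i(kx - \omega t)}\)</span> into the rescaled equation gives</p>
<p><span class="math display">\[
-i\omega = -i(ik)^2 - iV = ik^2 - iV
\quad\Longrightarrow\quad
\omega(k) = V - k^2,
\]</span></p>
<p>which is real for every real <span class="math inline">\(k\)</span>, so
each mode oscillates without growing or decaying.</p>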
<h2 id="systems-with-derivatives-of-different-orders">Systems with
derivatives of different orders</h2>
<p>In the most general case, we have systems of linear PDEs with
multiple spatial derivatives of different order: <span
class="math display">\[ \label{gensys}
u_t = \sum_{j=0}^n A_j \frac{\partial^j u}{\partial x^j}.
\]</span></p>
<p>Here’s a real example from my research. It comes from homogenization
of the wave equation in a spatially varying medium (see Equation (5.17)
of <a
href="http://faculty.washington.edu/rjl/pubs/solitary/40815.pdf">this
paper</a> for more details). It’s the wave equation plus some
second-derivative terms: <span class="math display">\[
\begin{align*}
u_t & = v_x + v_{xx} \\
v_t & = u_x - u_{xx}.
\end{align*}
\]</span> You might (if you hadn’t read the example above) assume that
this system is dissipative due to the second derivatives. This system is
of the form \eqref{gensys} with <span class="math display">\[
\begin{align}
A_1 & = \begin{pmatrix}
0 & 1 \\ 1 & 0
\end{pmatrix}
&
A_2 & = \begin{pmatrix}
0 & 1 \\ -1 & 0
\end{pmatrix}.
\end{align}
\]</span> Of course, <span class="math inline">\(A_1\)</span> has real
eigenvalues and leads to wave-like behavior. But <span
class="math inline">\(A_2\)</span> has pure imaginary eigenvalues, so it
also leads to wave-like behavior! The second derivative terms are
<em>dispersive</em>. In fact, it’s easy to show that the total energy <span
class="math inline">\(E=\int (u^2+v^2)\,dx\)</span> is a conserved quantity for
this system, given periodic or decaying solutions (try it!).</p>
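<p>One way to check the conservation claim numerically, offered as a sketch
rather than as the pencil-and-paper argument: for a Fourier mode with
wavenumber <span class="math inline">\(k\)</span>, the system reduces to an
ODE system with matrix <span class="math inline">\(M(k) = ik A_1 + (ik)^2
A_2\)</span>. If <span class="math inline">\(M(k)\)</span> is skew-Hermitian,
the sum of squared amplitudes of that mode is constant in time and the
frequencies are real. The function name <code>symbol</code> and the sample
wavenumbers are illustrative assumptions.</p>

```python
import numpy as np

A1 = np.array([[0, 1], [1, 0]], dtype=complex)
A2 = np.array([[0, 1], [-1, 0]], dtype=complex)


def symbol(k):
    """Fourier symbol M(k) = (ik) A1 + (ik)^2 A2 of u_t = A1 u_x + A2 u_xx."""
    return (1j * k) * A1 + (1j * k) ** 2 * A2


for k in [0.5, 1.0, 3.0]:
    M = symbol(k)
    is_skew = np.allclose(M + M.conj().T, 0)  # skew-Hermitian: M^H = -M
    omega = 1j * np.linalg.eigvals(M)         # from -i * omega * s = M s
    print(k, is_skew, omega)
```

<p>The computed frequencies should have negligible imaginary part and
magnitude <span class="math inline">\(\sqrt{k^2 + k^4}\)</span>.</p>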
<p>Strictly speaking, Fourier analysis like what we’ve described can’t
usually be applied to \eqref{gensys} because the matrices <span
class="math inline">\(A_j\)</span> will not generally be simultaneously
diagonalizable (though this analysis can still give us intuition for
what each set of terms may do). Worse yet, the individual matrices may
not be diagonalizable. Let’s illustrate with a simple case.</p>
<p>Returning to the wave equation, let’s consider a different way of
writing it as a system: <span class="math display">\[
\begin{align*}
u_t & = v \\
v_t & = u_{xx}.
\end{align*}
\]</span> It’s easy to check that this system is equivalent to the wave
equation – but notice that it’s composed of parts with only even
derivatives! (<em>reaction</em> and <em>diffusion</em> equations in the
terminology of scalar PDEs). This system is of the form \eqref{gensys}
with <span class="math display">\[
\begin{align}
A_0 & = \begin{pmatrix}
0 & 1 \\ 0 & 0
\end{pmatrix}
&
A_2 & = \begin{pmatrix}
0 & 0 \\ 1 & 0
\end{pmatrix}.
\end{align}
\]</span> Notice that both eigenvalues of both matrices are equal to
zero.</p>
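<p>A short numerical sketch of this point, under the assumption that we look
at the combined Fourier symbol <span class="math inline">\(M(k) = A_0 +
(ik)^2 A_2\)</span> instead of the two matrices separately (the name
<code>symbol</code> is an illustrative choice): each matrix alone has only
zero eigenvalues, yet the symbol has eigenvalues <span
class="math inline">\(\pm ik\)</span>, which correspond to the familiar <span
class="math inline">\(\omega = \pm k\)</span> of the wave equation.</p>

```python
import numpy as np

A0 = np.array([[0, 1], [0, 0]], dtype=complex)
A2 = np.array([[0, 0], [1, 0]], dtype=complex)

# Taken one at a time, the matrices say nothing: every eigenvalue is zero.
print(np.linalg.eigvals(A0), np.linalg.eigvals(A2))


def symbol(k):
    """Fourier symbol M(k) = A0 + (ik)^2 A2 of u_t = v, v_t = u_xx."""
    return A0 + (1j * k) ** 2 * A2


k = 2.0  # arbitrary test wavenumber
omega = 1j * np.linalg.eigvals(symbol(k))  # from -i * omega * s = M s
print(np.sort(omega.real), np.abs(omega.imag).max())
```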
Notes 2014.03.032014-03-03T00:00:00+03:00h/2014/03/03/notes<p>Finally figured out what was wrong with the stability regions for the
deferred correction methods in NodePy when <span
class="math inline">\(\theta \ne 0\)</span>. See <a
href="https://bitbucket.org/ketch/rkextrapolation/src/cce934e20cf514c8c5450e7ad09f5774052ff575/code/SDC%20Stability%20regions%20when%20theta%20is%20nonzero.ipynb?at=master">these</a>
<a
href="https://bitbucket.org/ketch/rkextrapolation/src/cce934e20cf514c8c5450e7ad09f5774052ff575/code/Reproduce_DC_stability_region.ipynb?at=master">notebooks</a>.</p>
<p>I also sat with Roland and got the latest version of PeanoClaw
running on my workstation.</p>
Notes 2014.02.272014-02-27T00:00:00+03:00h/2014/02/27/notes<p>Investigated stability regions for high order deferred correction
schemes; see <a
href="https://bitbucket.org/ketch/rkextrapolation/src/a78b4aa2d336491cac35c1df81e703ce103d6937/SDC%20Stability%20regions%20when%20theta%20is%20nonzero.ipynb">this
notebook</a>.</p>
<p>Finally, after about a year of searching, found a way to redirect all
output from distutils to a file. This will avoid the massive amount of
warnings that are currently printed to the screen when installing
PyClaw. See the patch <a
href="https://github.com/clawpack/clawpack/pull/35">here</a>, based on
<a href="http://stackoverflow.com/a/11632982/786902">this StackOverflow
answer</a>.</p>
<p>I also put together an <a
href="http://nbviewer.ipython.org/urls/dl.dropboxusercontent.com/u/656693/shallow_water_diffraction.ipynb">IPython
notebook on shallow water solitary waves over periodic bathymetry</a>.
It will be in the Github repo soon.</p>
The Schrodinger equation is not a reaction-diffusion equation2014-02-22T00:00:00+03:00h/2014/02/22/schrodinger-is-not-diffusion<p>Recently, a stackexchange answer claimed that <a
href="http://scicomp.stackexchange.com/a/10878/123">the Schrodinger
equation is effectively a reaction-diffusion equation</a>. I’ll set
aside semantic arguments about the meaning of “effectively”, and give a
more obvious example to explain why I think this statement is
misleading.</p>
<p>Consider the wave equation</p>
<p><span class="math display">\[u_{tt} = u_{xx}\]</span></p>
<p>Introducing a new variable <span class="math inline">\(v=u_t\)</span>
we can rewrite the wave equation as</p>
<p><span class="math display">\[
\begin{align*}
v_t & = u_{xx} \\
u_t & = v.
\end{align*}
\]</span></p>
<p>Observe that the first of these equations is the diffusion equation,
while the second is a reaction equation. Thus we have
reaction-diffusion!</p>
<p>Right?</p>
<p>Wrong. We’ve disguised the true nature of this equation by applying
our intuition (which is based on scalar PDEs) to a system of PDEs. In
the same way, the “reaction-diffusion” label for Schrodinger is obtained
by applying intuition based on PDEs with real coefficients to a PDE with
complex coefficients.</p>
<p>Of course, in both cases you can use numerical methods that are
appropriate for reaction-diffusion problems in order to solve a wave
equation.<br />
<a
href="http://nbviewer.ipython.org/github/ketch/exposition/blob/master/Wave%20equation%20as%20reaction-diffusion.ipynb">Here
is a quick ipython notebook implementation of the obvious method for the
system above</a>.</p>
Notes 2014.02.222014-02-22T00:00:00+03:00h/2014/02/22/notes<p>Discussed time stepping for aeroacoustics with Antony Jameson at
Stanford. Also reviewed a couple of his group’s papers on high order
flux reconstruction schemes.</p>
<h3
id="insights-from-von-neumann-analysis-of-high-order-flux-reconstruction-schemes">Insights
from von Neumann analysis of high-order flux reconstruction schemes</h3>
<ul>
<li>Vincent, Castonguay, Jameson</li>
<li>JCP 2011</li>
</ul>
<blockquote>
<p>Investigate a 1-parameter family of stable flux reconstruction
methods suggested by earlier work. For certain parameter values you get
DG or SD schemes. Some values admit spurious modes. The size of the
largest stable step size and the order of accuracy are determined as a
function of the parameter (c). Nonlinear 2D experimental results are
predicted relatively well by the 1D von Neumann analysis.</p>
</blockquote>
<blockquote>
<p>Section 4 is a nice description of how to do von Neumann analysis for
FE methods.</p>
</blockquote>
<h3 id="on-the-non-linear-stability-of-flux-reconstruction-schemes">On
the Non-linear Stability of Flux Reconstruction Schemes</h3>
<ul>
<li>Jameson, Vincent, Castonguay</li>
<li>J. Sci. Comput. 2011</li>
</ul>
<blockquote>
<p>They look at energy stability in a very general way. Nonlinear
stability depends on solution point locations, and on the accuracy of
the determination of the transformed flux.</p>
</blockquote>
<p>Time integration ideas that could be useful for aeronautics
simulations:</p>
<ul>
<li>Optimization of stability regions</li>
<li>Multirate time stepping. Some work has been done in <a
href="http://dx.doi.org/10.1016/j.jcp.2010.05.028">Nonuniform time-step
Runge–Kutta discontinuous Galerkin method for Computational
Aeroacoustics</a>.</li>
<li>Large time step methods. See <a
href="http://dx.doi.org/10.1016/j.jcp.2011.06.008">A class of large time
step Godunov schemes for hyperbolic conservation laws and
applications</a>.</li>
</ul>
<p>Large time step methods might work very well as the coarse propagator
for parareal-type algorithms.</p>
The parallel EPPEER code2012-10-17T00:00:00+03:00h/2012/10/17/eppeer<p>I tried out the EPPEER code, which uses two-step Runge-Kutta methods
and OpenMP, because I’m thinking of writing a shared-memory parallel ODE
solver code myself.</p>
<p>I downloaded the code from</p>
<p><a
href="http://www.mathematik.uni-marburg.de/~schmitt/peer/eppeer.zip"
title="Go to wiki page">http://www.mathematik.uni-marburg.de/~schmitt/peer/eppeer.zip</a></p>
<p>unzipped, and ran</p>
<pre><code>gfortran -c mbod4h.f90
gfortran -c ivprkp.f90
gfortran -c -fopenmp ivpepp.f90
gfortran -fopenmp ivprkp.o ivpepp.o mbod4h.o ivp_pmain.f90
./a.out</code></pre>
<p>I had to fix one line that was trying to open a logfile and failed. I
also set</p>
<pre><code>export OMP_NUM_THREADS=4</code></pre>
<p>This runs the code with increasingly tight tolerances on a 400-body
problem. The output was as follows (I killed it before it finished the really tight
tolerance runs):</p>
<pre><code> tol, err, otime, cpu 0.10E-01 0.10702 2.9556 10.534
steps,rej,nfcn: 337 88 1399
tol, err, otime, cpu 0.10E-02 0.93692E-01 4.9853 18.585
steps,rej,nfcn: 605 159 2465
tol, err, otime, cpu 0.10E-03 0.66604E-01 7.9798 30.365
steps,rej,nfcn: 994 244 4015
tol, err, otime, cpu 0.10E-04 0.47637E-01 12.026 46.477
steps,rej,nfcn: 1534 324 6175
tol, err, otime, cpu 0.10E-05 0.24241E-01 18.239 70.756
steps,rej,nfcn: 2338 415 9391</code></pre>
<p>If I understand correctly, the last column is total CPU time; the
next to last is wall time. For comparison, I ran it without
parallelism:</p>
<pre><code>export OMP_NUM_THREADS=1</code></pre>
<p>Then I got the following:</p>
<pre><code> tol, err, otime, cpu 0.10E-01 0.10702 10.382 10.382
steps,rej,nfcn: 337 88 1399
tol, err, otime, cpu 0.10E-02 0.93692E-01 18.297 18.297
steps,rej,nfcn: 605 159 2465
tol, err, otime, cpu 0.10E-03 0.66604E-01 29.814 29.815
steps,rej,nfcn: 994 244 4015
tol, err, otime, cpu 0.10E-04 0.47637E-01 45.854 45.855
steps,rej,nfcn: 1534 324 6175
tol, err, otime, cpu 0.10E-05 0.24241E-01 69.725 69.726
steps,rej,nfcn: 2338 415 9391
tol, err, otime, cpu 0.10E-06 0.53727E-02 105.47 105.48
steps,rej,nfcn: 3539 484 14195</code></pre>
<p>The numbers of function evaluations were identical, confirming that
the computations being performed were the same. The speedup (roughly 3.5x to 3.8x in wall time on 4 threads)
is very nice. We should be able to achieve something similar with
extrapolation.</p>
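<p>The speedup can be computed directly from the two tables. The sketch below
copies the wall-clock column from each run, assuming (as guessed above) that
<code>otime</code> is wall time, and uses only the five tolerances that
completed in both runs. The variable names are illustrative.</p>

```python
# Wall-clock times (the "otime" column) copied from the two runs above.
tols = [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]
wall_4_threads = [2.9556, 4.9853, 7.9798, 12.026, 18.239]
wall_1_thread = [10.382, 18.297, 29.814, 45.854, 69.725]
n_threads = 4

speedups = [t1 / t4 for t1, t4 in zip(wall_1_thread, wall_4_threads)]
for tol, s in zip(tols, speedups):
    # Parallel efficiency is the speedup divided by the number of threads.
    print(f"tol={tol:.0e}  speedup={s:.2f}  efficiency={s / n_threads:.0%}")
```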
<p>These results are actually plotted in <a
href="http://www.mathematik.uni-marburg.de/~schmitt/peer/man_epp.pdf">the
user guide</a>, at the end of Section 4.</p>
<p>This was originally posted on <a
href="https://mathwiki.kaust.edu.sa/david/eppeer">mathwiki</a>.</p>
Blogging an iPython notebook with Jekyll2012-10-11T00:00:00+03:00h/2012/10/11/blogging_ipython_notebooks_with_jekyll<blockquote>
<p><strong>Update as of December 2014: Don’t bother using what’s below;
go to <a
href="http://cscorley.github.io/2014/02/21/blogging-with-ipython-and-jekyll/">Christopher
Corley’s blog</a> for a much better setup!</strong></p>
</blockquote>
<p>I’ve been playing around with <a
href="http://ipython.org/ipython-doc/dev/interactive/htmlnotebook.html">iPython
notebooks</a> for a while and planning to use them instead of <a
href="http://www.sagemath.org/">SAGE</a> worksheets for my numerical
analysis course next spring. As a warmup, I wrote an iPython notebook
explaining a bit about internal stability of Runge-Kutta methods and
showing some new research results using <a
href="http://numerics.kaust.edu.sa/nodepy/">NodePy</a>.</p>
<p>I also wanted to post the notebook on my blog here; the ability to
more easily include math and code in blog posts was one of my main
motivations for moving away from Blogger to my own site. I first tried
following <a
href="http://blog.fperez.org/2012/09/blogging-with-ipython-notebook.html">the
instructions given by Fernando Perez</a>. That was quite painless and
worked flawlessly, using <code>nbconvert.py</code> to convert the .ipynb
file directly to HTML, with graphics embedded. The only issue was that I
didn’t love the look of the output quite as much as I love how Carl
Boettiger’s Markdown + Jekyll posts with code and math look (see an
example <a
href="http://www.carlboettiger.info/2012/09/14/analytic-solution-to-multiple-uncertainty.html">here</a>).
Besides, Markdown is so much nicer than HTML, and
<code>nbconvert.py</code> has a Markdown output option.</p>
<p>So I tried the markdown option:</p>
<pre><code>nbconvert.py my_nb.ipynb -f markdown</code></pre>
<p>I copied the result to my <code>_posts/</code> directory, added the
<a href="https://github.com/mojombo/jekyll/wiki/YAML-Front-Matter">YAML
front-matter</a> that Jekyll expects, and took a look. Everything was
great except that all my plots were gone, of course. After considering a
few options, I decided for now to put plots for such posts in a
subfolder <code>jekyll_images/</code> of my public Dropbox folder. Then
it was a simple matter of search/replace all the paths to the images. At
that point, it looked great; you can see the <a
href="https://github.com/ketch/nodepy/blob/master/examples/Internal_stability.ipynb">source</a>
and the <a
href="http://davidketcheson.info/2012/10/11/Internal_stability.html">result</a>.</p>
<p>The only issue was that I didn’t want to manually do all that work
every time. I considered creating a new Converter class in
<code>nbconvert</code> to handle it, but finally decided that it would
be more convenient to just write a shell script that calls
<code>nbconvert</code> and then operates on the result.<br />
Here it is:</p>
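<pre><code>#!/bin/bash
# Usage: pass the notebook name without the .ipynb extension
fname=$1
# Convert the notebook to Markdown
nbconvert.py ${fname}.ipynb -f markdown
# Point image paths at the public Dropbox folder (BSD/macOS sed syntax)
sed -i '' "s#${fname}_files#https:\/\/dl.dropbox.com\/u\/656693\/jekyll_images\/${fname}_files#g" ${fname}.md
dt=$(date "+%Y-%m-%d")
# Use ed to prepend the YAML front matter
echo "0a
---
layout: post
time: ${dt}
title: TITLE-ME
subtitle: SUBTITLE-ME
tags: TAG-ME
---
.
w" | ed ${fname}.md
# Move the post into the Jekyll _posts directory with a dated file name
mv ${fname}.md ~/labnotebook/_posts/${dt}-${fname}.md</code></pre>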
<p>It’s also on Github <a
href="https://github.com/ketch/labnotebook/blob/master/nbconv.sh">here</a>.
This was a nice educational exercise in constructing shell scripts, in
which I learned or re-learned:</p>
<ul>
<li>how to use command-line arguments</li>
<li>how to use sed and ed</li>
<li>how to use date</li>
</ul>
<p>You can expect a lot more iPython-notebook based posts in the
future.</p>
Internal stability of Runge-Kutta methods2012-10-11T00:00:00+03:00h/2012/10/11/Internal_stability<p>Note: this post was generated from an iPython notebook. You can <a
href="https://github.com/ketch/nodepy/blob/master/examples/Internal_stability.ipynb">download
the notebook from github</a> and execute all the code yourself.</p>
<p>Internal stability deals with the growth of errors (such as roundoff)
introduced at the Runge-Kutta stages during a single Runge-Kutta step.
It is usually important only for methods with a large number of stages,
since that is when the internal amplification factors can be large. An
excellent explanation of internal stability is given in <a
href="http://oai.cwi.nl/oai/asset/1652/1652A.pdf">this paper</a>. Here
we demonstrate some tools for studying internal stability in NodePy.</p>
<p>First, let’s load a couple of RK methods:</p>
<div class="highlight">
<pre><span class="kn">from</span> <span class="nn">nodepy</span> <span class="kn">import</span> <span class="n">rk</span>
<span class="nb">reload</span><span class="p">(</span><span class="n">rk</span><span class="p">)</span>
<span class="n">rk4</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">loadRKM</span><span class="p">(</span><span class="s">'RK44'</span><span class="p">)</span>
<span class="n">ssprk4</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">loadRKM</span><span class="p">(</span><span class="s">'SSP104'</span><span class="p">)</span>
<span class="k">print</span> <span class="n">rk4</span>
<span class="k">print</span> <span class="n">ssprk4</span>
</pre>
</div>
<pre><code>Classical RK4
The original four-stage, fourth-order method of Kutta
0 |
1/2 | 1/2
1/2 | 0 1/2
1 | 0 0 1
_____|____________________
| 1/6 1/3 1/3 1/6
SSPRK(10,4)
The optimal ten-stage, fourth order SSP Runge-Kutta method
0 |
1/6 | 1/6
1/3 | 1/6 1/6
1/2 | 1/6 1/6 1/6
2/3 | 1/6 1/6 1/6 1/6
1/3 | 1/15 1/15 1/15 1/15 1/15
1/2 | 1/15 1/15 1/15 1/15 1/15 1/6
2/3 | 1/15 1/15 1/15 1/15 1/15 1/6 1/6
5/6 | 1/15 1/15 1/15 1/15 1/15 1/6 1/6 1/6
1 | 1/15 1/15 1/15 1/15 1/15 1/6 1/6 1/6 1/6
_____|____________________________________________________________
| 1/10 1/10 1/10 1/10 1/10 1/10 1/10 1/10 1/10 1/10</code></pre>
<h2 id="absolute-stability-regions">Absolute stability regions</h2>
<p>First we can use NodePy to plot the region of absolute stability for
each method. The absolute stability region is the set</p>
<center>
<span class="math inline">\(\{ z \in \mathbb{C} : |\phi(z)| \le 1 \}\)</span>
</center>
<p>where <span class="math inline">\(\phi(z)\)</span> is the
<em>stability function</em> of the method:</p>
<center>
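<span class="math inline">\(\phi(z) = 1 + z b^T (I-zA)^{-1} e.\)</span>
</center>
<p>Here <span class="math inline">\(A\)</span> and <span class="math inline">\(b\)</span> are the coefficients in the Butcher tableau and <span class="math inline">\(e\)</span> is a vector with all entries equal to 1. If we solve <span class="math inline">\(u'(t) = \lambda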
u\)</span> with a given method, then <span
class="math inline">\(z=\lambda \Delta t\)</span> must lie inside this
region or the computation will be unstable.</p>
<div class="highlight">
<pre><span class="n">p</span><span class="p">,</span><span class="n">q</span> <span class="o">=</span> <span class="n">rk4</span><span class="o">.</span><span class="n">stability_function</span><span class="p">()</span>
<span class="k">print</span> <span class="n">p</span>
<span class="n">h1</span><span class="o">=</span><span class="n">rk4</span><span class="o">.</span><span class="n">plot_stability_region</span><span class="p">()</span>
</pre>
</div>
<pre><code> 4 3 2
0.04167 x + 0.1667 x + 0.5 x + 1 x + 1</code></pre>
<p><img
src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_00.png" /></p>
<div class="highlight">
<pre><span class="n">p</span><span class="p">,</span><span class="n">q</span> <span class="o">=</span> <span class="n">ssprk4</span><span class="o">.</span><span class="n">stability_function</span><span class="p">()</span>
<span class="k">print</span> <span class="n">p</span>
<span class="n">h2</span><span class="o">=</span><span class="n">ssprk4</span><span class="o">.</span><span class="n">plot_stability_region</span><span class="p">()</span>
</pre>
</div>
<pre><code> 10 9 8 7 6
3.969e-09 x + 2.381e-07 x + 6.43e-06 x + 0.0001029 x + 0.00108 x
5 4 3 2
+ 0.00787 x + 0.04167 x + 0.1667 x + 0.5 x + 1 x + 1</code></pre>
<p><img
src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_01.png" /></p>
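<p>The stability function can also be evaluated directly from the Butcher
coefficients, without NodePy. The following sketch uses only NumPy and the
classical RK4 tableau printed earlier; the function names <code>phi</code>
and <code>phi_poly</code> and the sample points are illustrative choices. It
evaluates <span class="math inline">\(1 + z b^T (I-zA)^{-1} e\)</span>, with
<span class="math inline">\(e\)</span> the vector of ones, and compares it
with the degree-4 polynomial shown above.</p>

```python
import numpy as np

# Butcher coefficients of the classical RK4 method (from the tableau above)
A = np.array([[0.0, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0, 0.0],
              [0.0, 0.5, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
b = np.array([1 / 6, 1 / 3, 1 / 3, 1 / 6])
e = np.ones(4)


def phi(z):
    """Stability function 1 + z b^T (I - zA)^{-1} e at a single complex z."""
    return 1 + z * (b @ np.linalg.solve(np.eye(4) - z * A, e))


def phi_poly(z):
    """The same function as a polynomial with exact coefficients."""
    return 1 + z + z**2 / 2 + z**3 / 6 + z**4 / 24


z = -1.0 + 0.5j
print(phi(z), phi_poly(z))

# Membership in the absolute stability region: |phi(z)| <= 1
for z_real in [-2.5, -3.0]:
    print(z_real, abs(phi(z_real)) <= 1)
```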
<h1 id="internal-stability">Internal stability</h1>
<p>The stability function tells us by how much errors from one step are
amplified in the next one. This is important since we introduce
truncation errors at every step. However, we also introduce roundoff
errors at each stage within a step. Internal stability tells us
about the growth of those. Internal stability is typically less
important than (step-by-step) absolute stability for two reasons:</p>
<ul>
<li>Roundoff errors are typically much smaller than truncation errors,
so moderate amplification of them typically is not significant</li>
<li>Although the propagation of stage errors within a step is governed
by internal stability functions, in later steps these errors are
propagated according to the (principal) stability function</li>
</ul>
<p>Nevertheless, in methods with many stages, internal stability can
play a key role.</p>
<p>Questions: <em>In the solution of PDEs, large spatial truncation
errors enter at each stage. Does this mean internal stability becomes
more significant? How does this relate to stiff accuracy analysis and
order reduction?</em></p>
<h2 id="internal-stability-functions">Internal stability functions</h2>
<p>We can write the equations of a Runge-Kutta method compactly as</p>
<center>
<span class="math inline">\(y = u^n e + h A F(y)\)</span>
</center>
<center>
<span class="math inline">\(u^{n+1} = u^n + h b^T F(y),\)</span>
</center>
<p>where <span class="math inline">\(y\)</span> is the vector of stage
values, <span class="math inline">\(u^n\)</span> is the previous step
solution, <span class="math inline">\(e\)</span> is a vector with all
entries equal to 1, <span class="math inline">\(h\)</span> is the step
size, <span class="math inline">\(A\)</span> and <span
class="math inline">\(b\)</span> are the coefficients in the Butcher
tableau, and <span class="math inline">\(F(y)\)</span> is the vector of
stage derivatives. In floating point arithmetic, roundoff errors will be
made at each stage. Representing these errors by a vector <span
class="math inline">\(r\)</span>, we have</p>
<center>
<span class="math inline">\(y = u^n e + h A F(y) + r.\)</span>
</center>
<p>Considering the test problem <span class="math inline">\(F(y)=\lambda
y\)</span> and solving for <span class="math inline">\(y\)</span>
gives</p>
<center>
<span class="math inline">\(y = u^n (I-zA)^{-1}e +
(I-zA)^{-1}r,\)</span>
</center>
<p>where <span class="math inline">\(z=h\lambda\)</span>. Substituting
this result in the equation for <span
class="math inline">\(u^{n+1}\)</span> gives</p>
<center>
<span class="math inline">\(u^{n+1} = u^n (1 + zb^T(I-zA)^{-1}e) +
zb^T(I-zA)^{-1}r = \phi(z) u^n + \theta(z)^T r.\)</span>
</center>
<p>Here <span class="math inline">\(\phi(z)\)</span> is the
<em>stability function</em> of the method, which we already encountered
above. Meanwhile, the vector <span
class="math inline">\(\theta(z)\)</span> contains the <em>internal
stability functions</em> that govern the amplification of roundoff
errors <span class="math inline">\(r\)</span> within a step:</p>
<center>
<span class="math inline">\(\theta(z)^T = z b^T (I-zA)^{-1}.\)</span>
</center>
<p>Let’s compute <span class="math inline">\(\theta\)</span> for the
classical RK4 method:</p>
<div class="highlight">
<pre><span class="n">theta</span><span class="o">=</span><span class="n">rk4</span><span class="o">.</span><span class="n">internal_stability_polynomials</span><span class="p">()</span>
<span class="n">theta</span>
</pre>
</div>
<pre>
[poly1d([1/24, 1/12, 1/6, 1/6, 0], dtype=object),
poly1d([1/12, 1/6, 1/3, 0], dtype=object),
poly1d([1/6, 1/3, 0], dtype=object),
poly1d([1/6, 0], dtype=object)]
</pre>
<div class="highlight">
<pre><span class="k">for</span> <span class="n">theta_j</span> <span class="ow">in</span> <span class="n">theta</span><span class="p">:</span>
<span class="k">print</span> <span class="n">theta_j</span>
</pre>
</div>
<pre><code> 4 3 2
0.04167 x + 0.08333 x + 0.1667 x + 0.1667 x
3 2
0.08333 x + 0.1667 x + 0.3333 x
2
0.1667 x + 0.3333 x
0.1667 x</code></pre>
<p>Thus the roundoff errors in the first stage are amplified by a factor
<span class="math inline">\(z^4/24 + z^3/12 + z^2/6 + z/6\)</span>,
while the errors in the last stage are amplified by a factor <span
class="math inline">\(z/6\)</span>.</p>
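<p>These polynomials can be cross-checked without NodePy. The sketch below
(plain NumPy; the function name <code>theta</code> and the sample point are
illustrative assumptions) evaluates <span class="math inline">\(\theta(z)^T =
z b^T (I-zA)^{-1}\)</span> for classical RK4 with one linear solve and
compares the result with the four polynomials printed above.</p>

```python
import numpy as np

# Butcher coefficients of the classical RK4 method
A = np.array([[0.0, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0, 0.0],
              [0.0, 0.5, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
b = np.array([1 / 6, 1 / 3, 1 / 3, 1 / 6])


def theta(z):
    """Internal stability functions: theta(z)^T = z b^T (I - zA)^{-1}."""
    # Transposing the definition gives (I - zA)^T theta = z b.
    return np.linalg.solve((np.eye(4) - z * A).T, z * b)


z = -1.5 + 0.7j  # arbitrary sample point
expected = np.array([z**4 / 24 + z**3 / 12 + z**2 / 6 + z / 6,
                     z**3 / 12 + z**2 / 6 + z / 3,
                     z**2 / 6 + z / 3,
                     z / 6])
print(np.allclose(theta(z), expected))
```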
<h2 id="internal-instability">Internal instability</h2>
<p>Usually internal stability is unimportant since it relates to
amplification of roundoff errors, which are very small. Let’s think
about when things can go wrong in terms of internal instability. If
<span class="math inline">\(|\theta(z)|\)</span> is of the order <span
class="math inline">\(1/\epsilon_{machine}\)</span>, then roundoff
errors could be amplified so much that they destroy the accuracy of the
computation. More specifically, we should be concerned if <span
class="math inline">\(|\theta(z)|\)</span> is of the order <span
class="math inline">\(tol/\epsilon_{machine}\)</span> where <span
class="math inline">\(tol\)</span> is our desired error tolerance. Of
course, we only care about values of <span
class="math inline">\(z\)</span> that lie inside the absolute stability
region <span class="math inline">\(S\)</span>, since internal stability
won’t matter if the computation is not absolutely stable.</p>
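<p>To get a feel for the size of <span
class="math inline">\(tol/\epsilon_{machine}\)</span>, here is a tiny
calculation in double precision. The tolerance values are arbitrary examples
chosen for illustration.</p>

```python
import sys

eps = sys.float_info.epsilon  # double-precision machine epsilon, about 2.2e-16

for tol in [1e-3, 1e-6, 1e-9]:
    # Amplification at which roundoff of size eps would reach the tolerance
    threshold = tol / eps
    print(f"tol={tol:.0e}: worry when |theta| is of order {threshold:.1e}")
```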
<p>We can get some idea about the amplification of stage errors by
plotting the curves <span class="math inline">\(|\theta(z)|=1\)</span>
along with the stability region. Ideally these curves will all lie
outside the stability region, so that all stage errors are damped.</p>
<div class="highlight">
<pre><span class="n">rk4</span><span class="o">.</span><span class="n">internal_stability_plot</span><span class="p">()</span>
</pre>
</div>
<p><img
src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_02.png" /></p>
<div class="highlight">
<pre><span class="n">ssprk4</span><span class="o">.</span><span class="n">internal_stability_plot</span><span class="p">()</span>
</pre>
</div>
<p><img
src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_03.png" /></p>
<p>For both methods, we see that some of the curves intersect the
absolute stability region, so some stage errors are amplified. But by
how much? We’d really like to know the maximum amplification of the
stage errors under the condition of absolute stability. We therefore
define the <em>maximum internal amplification factor</em> <span
class="math inline">\(M\)</span>:</p>
<center>
<span class="math inline">\(M = \max_j \max_{z \in S}
|\theta_j(z)|\)</span>
</center>
<div class="highlight">
<pre><span class="nb">print</span><span class="p">(</span><span class="n">rk4</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">())</span>
<span class="nb">print</span><span class="p">(</span><span class="n">ssprk4</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">())</span>
</pre>
</div>
<pre><code>2.15239281554
4.04399941143</code></pre>
<p>We see that both methods have small internal amplification factors
(about 2.2 and 4.0), so internal stability is not a concern in either
case. That is to be expected for the method with only four stages; for
the method with ten stages it is a pleasant surprise.</p>
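<p>The definition of <span class="math inline">\(M\)</span> can also be
checked by brute force for the classical four-stage method. The sketch
below is only an illustration and not NodePy’s implementation: it
assumes Python 3 with NumPy, uses the identity <span
class="math inline">\(\theta_j(z) = z\, b^T (I - zA)^{-1} e_j\)</span>
(which reproduces the polynomials quoted above), and samples <span
class="math inline">\(S\)</span> on a grid whose extent and resolution
are arbitrary choices. The function name
<code>max_internal_amplification</code> is hypothetical.</p>

```python
import numpy as np

# Classical fourth-order Runge-Kutta method (Butcher coefficients).
A = np.array([[0.0, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0, 0.0],
              [0.0, 0.5, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
b = np.array([1/6, 1/3, 1/3, 1/6])


def max_internal_amplification(A, b, n=401):
    """Estimate M = max_j max_{z in S} |theta_j(z)| on an n-by-n grid.

    Assumes an explicit method, so A is strictly lower triangular and A^s = 0.
    """
    s = len(b)
    # rows[k] holds the row vector b^T A^k for k = 0, ..., s-1.
    rows = [b]
    for _ in range(s - 1):
        rows.append(rows[-1] @ A)
    rows = np.array(rows)

    # Sample a box that comfortably contains the stability region of RK4.
    x = np.linspace(-4.0, 1.0, n)
    y = np.linspace(-4.0, 4.0, n)
    Z = x[None, :] + 1j * y[:, None]
    powers = [Z ** (k + 1) for k in range(s)]

    # Stability function R(z) = 1 + sum_k z^(k+1) b^T A^k e.
    R = 1 + sum(p * r.sum() for p, r in zip(powers, rows))
    in_S = np.abs(R) <= 1.0

    # theta_j(z) = sum_k z^(k+1) (b^T A^k)_j, maximised over grid points in S.
    return max(
        np.abs(sum(p * r[j] for p, r in zip(powers, rows)))[in_S].max()
        for j in range(s)
    )


M = max_internal_amplification(A, b)
print(M)
```

<p>Because the grid only samples <span class="math inline">\(S\)</span>,
the estimate can only fall short of the true maximum; it should come out
close to the value NodePy reports for <code>rk4</code> above.</p>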
<p>Questions: <em>Do SSP RK methods always (necessarily) have small
amplification factors? Can we prove it?</em></p>
<p>Now let’s look at some methods with many stages.</p>
<h2 id="runge-kutta-chebyshev-methods">Runge-Kutta Chebyshev
methods</h2>
<p>The paper of Verwer, Hundsdorfer, and Sommeijer deals with RKC
methods, which can have very many stages. The construction of these
methods is implemented in NodePy, so let’s take a look at them. The
functions <code>RKC1(s)</code> and <code>RKC2(s)</code> construct RKC
methods of order 1 and 2, respectively, with <span
class="math inline">\(s\)</span> stages.</p>
<div class="highlight">
<pre><span class="n">s</span><span class="o">=</span><span class="mi">4</span>
<span class="n">rkc</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">RKC1</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="n">rkc</span><span class="p">)</span>
</pre>
</div>
<pre><code>RKC41

 0    |
 1/16 | 1/16
 1/4  | 1/8   1/8
 9/16 | 3/16  1/4   1/8
______|________________________
      | 1/4   3/8   1/4   1/8</code></pre>
<div class="highlight">
<pre><span class="n">rkc</span><span class="o">.</span><span class="n">internal_stability_plot</span><span class="p">()</span>
</pre>
</div>
<p><img
src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_04.png" /></p>
<p>It looks like there could be some significant internal amplification
here. Let’s see:</p>
<div class="highlight">
<pre><span class="n">rkc</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
</pre>
</div>
<pre>
11.760869405962685
</pre>
<p>Nothing catastrophic. Let’s try a larger value of <span
class="math inline">\(s\)</span>:</p>
<div class="highlight">
<pre><span class="n">s</span><span class="o">=</span><span class="mi">20</span>
<span class="n">rkc</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">RKC1</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>
<span class="n">rkc</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
</pre>
</div>
<pre>
42.665327220219126
</pre>
<p>As promised, these methods seem to have good internal stability
properties. What about the second-order methods?</p>
<div class="highlight">
<pre><span class="n">s</span><span class="o">=</span><span class="mi">20</span>
<span class="n">rkc</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">RKC2</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>
<span class="n">rkc</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
</pre>
</div>
<pre>
106.69110992619214
</pre>
<p>Again, nothing catastrophic. We could take <span
class="math inline">\(s\)</span> much larger than 20, but the
calculations get to be rather slow (in Python) and since we’re using
floating point arithmetic, the accuracy deteriorates.</p>
<p>Remark: <em>we could do the calculations in exact arithmetic using
SymPy, but things would get even slower. Perhaps there are some
optimizations that could be done to speed this up. Or perhaps we should
use Mathematica if we need to do this kind of thing.</em></p>
<p>Remark 2: <em>of course, for the RKC methods the internal stability
polynomials are shifted Chebyshev polynomials. So we could evaluate them
directly in a stable manner using the three-term recurrence (or perhaps
SciPy’s special functions library). This would also be a nice check on
the calculations above.</em></p>
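<p>As a rough illustration of Remark 2, the sketch below evaluates a
Chebyshev polynomial with the three-term recurrence and cross-checks it
against NumPy’s <code>chebval</code>. It assumes Python 3 with NumPy and
takes the stability polynomial of the undamped first-order method to be
<span class="math inline">\(T_s(1+z/s^2)\)</span>; the internal
stability polynomials of NodePy’s RKC methods are not reproduced here,
and the helper name <code>cheb_T</code> is hypothetical.</p>

```python
import numpy as np
from numpy.polynomial import chebyshev


def cheb_T(n, x):
    """Evaluate T_n(x) via T_0 = 1, T_1 = x, T_{k+1} = 2 x T_k - T_{k-1}."""
    x = np.asarray(x, dtype=complex)
    t_prev, t_curr = np.ones_like(x), x.copy()
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2 * x * t_curr - t_prev
    return t_curr


s = 20
# Points on the negative real axis; 1 + z/s^2 then covers [-1, 1].
z = np.linspace(-2.0 * s**2, 0.0, 2001)
R = cheb_T(s, 1 + z / s**2)
max_modulus = np.abs(R).max()
print(max_modulus)

# Cross-check against NumPy's own evaluation of the same polynomial.
coeffs = np.zeros(s + 1)
coeffs[s] = 1.0  # series with a single term, T_s
reference = chebyshev.chebval(1 + z / s**2, coeffs)
max_difference = np.abs(R - reference).max()
print(max_difference)
```

<p>Since <span class="math inline">\(|T_s(x)|\le 1\)</span> on <span
class="math inline">\([-1,1]\)</span>, the first number should be 1 up
to roundoff, and the second should be at the level of roundoff.</p>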
<h2 id="other-methods-with-many-stages">Other methods with many
stages</h2>
<p>Three other classes of methods with many stages have been implemented
in NodePy:</p>
<ul>
<li>SSP families</li>
<li>Integral deferred correction (IDC) methods</li>
<li>Extrapolation methods</li>
</ul>
<h3 id="ssp-families">SSP Families</h3>
<div class="highlight">
<pre><span class="n">s</span><span class="o">=</span><span class="mi">20</span>
<span class="n">ssprk</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">SSPRK2</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>
<span class="n">ssprk</span><span class="o">.</span><span class="n">internal_stability_plot</span><span class="p">()</span>
<span class="n">ssprk</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
</pre>
</div>
<pre>
2.0212921484995547
</pre>
<p><img
src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_05.png" /></p>
<div class="highlight">
<pre><span class="n">s</span><span class="o">=</span><span class="mi">25</span> <span class="c"># # of stages</span>
<span class="n">ssprk</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">SSPRK3</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>
<span class="n">ssprk</span><span class="o">.</span><span class="n">internal_stability_plot</span><span class="p">()</span>
<span class="n">ssprk</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
</pre>
</div>
<pre>
3.8049237837215397
</pre>
<p><img
src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_06.png" /></p>
<p>The SSP methods seem to have excellent internal stability
properties.</p>
<h3 id="idc-methods">IDC methods</h3>
<div class="highlight">
<pre><span class="n">p</span><span class="o">=</span><span class="mi">6</span> <span class="c">#order</span>
<span class="n">idc</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">DC</span><span class="p">(</span><span class="n">p</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">idc</span><span class="p">))</span>
<span class="n">idc</span><span class="o">.</span><span class="n">internal_stability_plot</span><span class="p">()</span>
<span class="n">idc</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
</pre>
</div>
<pre><code>26</code></pre>
<pre>
6.4140166271998815
</pre>
<p><img
src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_07.png" /></p>
<p>IDC methods also seem to have excellent internal stability.</p>
<h3 id="extrapolation-methods">Extrapolation methods</h3>
<div class="highlight">
<pre><span class="n">p</span><span class="o">=</span><span class="mi">6</span> <span class="c">#order</span>
<span class="n">ex</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">extrap</span><span class="p">(</span><span class="n">p</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">ex</span><span class="p">))</span>
<span class="n">ex</span><span class="o">.</span><span class="n">internal_stability_plot</span><span class="p">()</span>
<span class="n">ex</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
</pre>
</div>
<pre><code>16
6</code></pre>
<p><img
src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_08.png" /></p>
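<p>Not so good: the table of amplification factors further down lists a
value of about 98 for this 16-stage method. Let’s try a method with even
more stages (this next computation will take a while; go stretch your
legs).</p>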
<div class="highlight">
<pre><span class="n">p</span><span class="o">=</span><span class="mi">10</span> <span class="c">#order</span>
<span class="n">ex</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">extrap</span><span class="p">(</span><span class="n">p</span><span class="p">)</span>
<span class="nb">print</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">ex</span><span class="p">))</span>
<span class="n">ex</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
</pre>
</div>
<pre><code>46</code></pre>
<pre>
28073.244376758907
</pre>
<p>Now we’re starting to see something that might cause trouble,
especially since such high order extrapolation methods are usually used
when extremely tight error tolerances are required. Internal
amplification will cause a loss of about 5 digits of accuracy here, so
the best we can hope for is about 10 digits of accuracy in double
precision. Higher order extrapolation methods will make things even
worse. How large are their amplification factors? (Really long
calculation here…)</p>
<div class="highlight">
<pre><span class="n">pmax</span> <span class="o">=</span> <span class="mi">12</span>
<span class="n">ampfac</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="n">pmax</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span>
<span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="n">pmax</span><span class="o">+</span><span class="mi">1</span><span class="p">):</span>
    <span class="n">ex</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">extrap</span><span class="p">(</span><span class="n">p</span><span class="p">)</span>
    <span class="n">ampfac</span><span class="p">[</span><span class="n">p</span><span class="p">]</span> <span class="o">=</span> <span class="n">ex</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
    <span class="nb">print</span><span class="p">(</span><span class="n">p</span><span class="p">,</span> <span class="n">ampfac</span><span class="p">[</span><span class="n">p</span><span class="p">])</span>
</pre>
</div>
<pre><code>1 1.99777378912
2 2.40329384375
3 5.07204078733
4 17.747335803
5 69.62805786
6 97.6097450835
7 346.277441462
8 1467.40356089
9 6344.16303534
10 28073.2443768
11 126011.586473
12 169897.662582</code></pre>
<pre>
[&lt;matplotlib.lines.Line2D at 0x2611bbe10&gt;]
</pre>
<p><img
src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_09.png" /></p>
<div class="highlight">
<pre><span class="n">semilogy</span><span class="p">(</span><span class="n">ampfac</span><span class="p">,</span><span class="n">linewidth</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span>
</pre>
</div>
<pre>
[&lt;matplotlib.lines.Line2D at 0x2611a6710&gt;]
</pre>
<p><img
src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_10.png" /></p>
<p>We see roughly geometric growth of the internal amplification factor
as a function of the order <span class="math inline">\(p\)</span>. It
seems clear that very high order extrapolation methods applied to
problems with high accuracy requirements will fall victim to internal
stability issues.</p>
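<p>To put these amplification factors in terms of lost digits, here is a
small back-of-the-envelope sketch. It assumes Python 3 and IEEE double
precision, and it uses the rule of thumb that roundoff errors of size
<span class="math inline">\(\epsilon_{machine}\)</span> amplified by
<span class="math inline">\(M\)</span> cost about <span
class="math inline">\(\log_{10} M\)</span> digits; the helper name
<code>digits_lost</code> is hypothetical, and the factors are copied
from the output above.</p>

```python
import math
import sys

eps = sys.float_info.epsilon  # about 2.2e-16 in IEEE double precision

# Amplification factors copied from the output above (order p -> M).
ampfac = {10: 28073.2443768, 11: 126011.586473, 12: 169897.662582}


def digits_lost(M):
    """Rule-of-thumb number of decimal digits lost to an amplification M."""
    return math.log10(M)


for p, M in ampfac.items():
    # M * eps is a rough floor on the error reachable at this order.
    print(f"p={p:2d}  digits lost ~ {digits_lost(M):.1f}  "
          f"error floor ~ {M * eps:.1e}")
```

<p>This is only an order-of-magnitude estimate: it ignores how many
stage errors accumulate and how they combine over many steps.</p>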
A curious upwind implicit scheme for advection2012-10-11T00:00:00+03:00h/2012/10/11/A_curious_upwind_implicit_scheme_for_advection<h2 id="the-cfl-condition">The CFL condition</h2>
<p>The CFL condition is one of the most basic and intuitive principles
in the numerical solution of hyperbolic PDEs. First formulated by
Courant, Friedrichs and Lewy in their seminal paper (available in
English for free <a
href="http://www.stat.uchicago.edu/~lekheng/courses/302/classics/courant-friedrichs-lewy.pdf">here</a>),
it states that the domain of dependence of a numerical method for
solving a PDE must contain the true domain of dependence. Otherwise, the
numerical method cannot be convergent.</p>
<p>The CFL condition is geometric and easily understood in the context
of, say, a first-order upwind discretization of advection. Usually it
says nothing interesting about implicit schemes, since they include all
points in their domain of dependence. But sometimes understanding the
CFL condition for a particular scheme can be subtle.</p>
<h3 id="an-implicit-scheme">An implicit scheme</h3>
<p>Consider the advection equation</p>
<p><span class="math display">\[u_t + a u_x = 0.\]</span></p>
<p>Discretization using a backward difference in space and in time gives
the scheme</p>
<p><span class="math display">\[U^{n+1}_j = U^n_j - \nu(U^{n+1}_j -
U^{n+1}_{j-1}).\]</span></p>
<p>Here <span class="math inline">\(\nu = ka/h\)</span> is the CFL
number and <span class="math inline">\(k,h\)</span> are the step sizes
in time and space, respectively. This very simple scheme illustrates the
concepts of the CFL condition and stability in a remarkable way.</p>
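<p>For reference, here is a short derivation using only the notation
above. Replacing both derivatives by backward differences at the new
time level and multiplying through by <span
class="math inline">\(k\)</span> gives</p>
<p><span class="math display">\[
\frac{U^{n+1}_j - U^n_j}{k} + a\,\frac{U^{n+1}_j - U^{n+1}_{j-1}}{h} = 0
\quad\Longrightarrow\quad
(1+\nu)\,U^{n+1}_j - \nu\,U^{n+1}_{j-1} = U^n_j,
\]</span></p>
<p>which is the scheme above with the unknowns at time level <span
class="math inline">\(n+1\)</span> collected on the left.</p>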
<p>For simplicity, suppose that the problem is posed on the domain <span
class="math inline">\(0\le x \le 1\)</span>, with an appropriate
boundary condition. Since this scheme computes <span
class="math inline">\(U^{n+1}_j\)</span> in terms of <span
class="math inline">\(U^n_j\)</span> and <span
class="math inline">\(U^{n+1}_{j-1}\)</span>, it seems that the
numerical domain of dependence for <span
class="math inline">\(U^n_j\)</span> is <span
class="math inline">\((x,t)\in (0,x_j)\times[0,t_n]\)</span>. For <span
class="math inline">\(\nu<0\)</span> (that is, <span
class="math inline">\(a<0\)</span>) the true solution at <span
class="math inline">\(x_j\)</span> depends on data to the right of <span
class="math inline">\(x_j\)</span>, which lies outside this set, so we
may conclude that the scheme is not convergent for <span
class="math inline">\(\nu<0\)</span>. Simple enough.</p>
<p>But what if <span class="math inline">\(\nu=-1\)</span>? Then the
scheme reads <span class="math display">\[U^{n+1}_{j-1} =
U^n_j,\]</span> which gives <strong>the exact solution</strong>! This is
a sort of “anti-unit CFL condition”.</p>
<p>How can this scheme be convergent (in fact, exact!) for a negative
CFL number when it doesn’t use any values to the right?</p>
<h3 id="understanding-the-cfl-condition">Understanding the CFL
condition</h3>
<p>Look at the exact formula above. In this case the scheme is not a
method for computing <span class="math inline">\(U^{n+1}_j\)</span> but
for computing <span class="math inline">\(U^{n+1}_{j-1}\)</span>, and it
<em>does</em> use a value from the previous time step that lies to the
right.</p>
<p>So we can view the scheme as a method for computing <span
class="math inline">\(U^{n+1}_j\)</span>, in which case the CFL
condition is satisfied only for <span
class="math inline">\(\nu\ge0\)</span>, or as a method for computing
<span class="math inline">\(U^{n+1}_{j-1}\)</span>, in which case the
CFL condition is satisfied only for <span
class="math inline">\(\nu\le-1\)</span>. <strong>Which viewpoint is
correct?</strong></p>
<p>To answer that question, remember that the CFL condition is purely
algebraic – that is, it relates to which values are actually used to
compute which other values. To understand this scheme, we need to think
about how we actually solve for <span
class="math inline">\(U^{n+1}\)</span> when using it. Notice that the
scheme can be written as <span class="math display">\[A U^{n+1} =
U^n\]</span> where the matrix <span class="math inline">\(A\)</span> is
lower-triangular. Hence the system can be solved by substitution. To go
further, we must consider two cases:</p>
<ol type="1">
<li><p><span class="math inline">\(\nu>0\)</span>: in this case,
boundary values must be supplied along the left boundary at <span
class="math inline">\(x=0\)</span>. Then, starting from the known value
at the boundary, we work to the right by substitution: <span
class="math display">\[U^{n+1}_j = \frac{U^n_j+\nu
U^{n+1}_{j-1}}{1+\nu}.\]</span> Hence the scheme is truly a way of
computing <span class="math inline">\(U^{n+1}_j\)</span> based on <span
class="math inline">\(U^n_j, U^{n+1}_{j-1}\)</span> and the resulting
CFL condition is <span class="math inline">\(\nu\ge0\)</span>.</p></li>
<li><p><span class="math inline">\(\nu<0\)</span>: in this case,
boundary values must be supplied along the right boundary at <span
class="math inline">\(x=1\)</span>. Then, starting from the known value
at the boundary, we work to the left by substitution: <span
class="math display">\[U^{n+1}_{j-1} = \frac{(1+\nu)U^{n+1}_j -
U^n_j}{\nu}.\]</span> Hence the scheme is truly a way of computing <span
class="math inline">\(U^{n+1}_{j-1}\)</span> based on <span
class="math inline">\(U^n_j, U^{n+1}_{j}\)</span> and the resulting CFL
condition is <span class="math inline">\(\nu\le-1\)</span>.</p></li>
</ol>
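<p>The two sweeps described above can be sketched in a few lines. This
is a minimal illustration, assuming Python 3 with NumPy; the function
name <code>upwind_implicit_step</code> and its
<code>boundary_value</code> argument are hypothetical, and the boundary
treatment is the simplest one consistent with the two cases.</p>

```python
import numpy as np


def upwind_implicit_step(U, nu, boundary_value):
    """One step of U^{n+1}_j = U^n_j - nu (U^{n+1}_j - U^{n+1}_{j-1})."""
    U = np.asarray(U, dtype=float)
    V = np.empty_like(U)  # V holds the values at time level n+1
    if nu > 0:
        # Inflow at x = 0: start from the left boundary and sweep right.
        V[0] = boundary_value
        for j in range(1, len(U)):
            V[j] = (U[j] + nu * V[j - 1]) / (1 + nu)
    elif nu < 0:
        # Inflow at x = 1: start from the right boundary and sweep left.
        V[-1] = boundary_value
        for j in range(len(U) - 1, 0, -1):
            V[j - 1] = ((1 + nu) * V[j] - U[j]) / nu
    else:
        # nu = 0: nothing moves.
        V[:] = U
    return V


x = np.linspace(0.0, 1.0, 11)
U0 = np.sin(2 * np.pi * x)

# With nu = -1 the leftward sweep reduces to U^{n+1}_{j-1} = U^n_j.
V = upwind_implicit_step(U0, -1.0, boundary_value=0.0)
print(np.allclose(V[:-1], U0[1:]))
```

<p>With <span class="math inline">\(\nu=-1\)</span> every interior value
is a copy of its right-hand neighbour from the previous step, which is
the exact shift discussed above.</p>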
<p>This post was originally published on the KAUST Mathwiki <a
href="https://mathwiki.kaust.edu.sa/david/A%20curious%20upwind%20implicit%20scheme%20for%20advection">here</a>
(login required).</p>