David Ketcheson2016-12-21T15:21:08+03:00David I. Ketchesondketch@gmail.comDispersion relations for linear systems of PDEs2014-05-28T00:00:00+03:00h/2014/05/28/dispersion_relations<p>Fourier analysis is an essential tool for understanding the behavior of solutions to linear equations. Often, this analysis is introduced to students in the context of scalar equations with real coefficients. If nothing more is said, students may mistakenly apply assumptions based on the scalar case to systems, leading to erroneous conclusions. I’m surprised at how often I’ve seen this, and I’ve even made the mistake myself.</p>
<h2 id="scalar-equations">Scalar equations</h2>
<p>Students in any undergraduate PDE course learn that solutions of the heat equation</p>
<p><span class="math display">\[
\label{heat}
u_t(x,t) = u_{xx}(x,t)
\]</span></p>
<p>diffuse in time whereas solutions of the wave equation</p>
<p><span class="math display">\[
\label{wave}
u_{tt} = u_{xx}
\]</span></p>
<p>oscillate in time without growing or decaying. They may even be introduced to a general approach for the Cauchy problem: given an evolution equation</p>
<p><span class="math display">\[ \label{evol}
u_t = \sum_{j=0}^n a_j \frac{\partial^j u}{\partial x^j},
\]</span></p>
<p>one inserts the Fourier mode solution</p>
<p><span class="math display">\[ \label{fourier}
u(x,t) = e^{i(kx - \omega(k) t)}
\]</span></p>
<p>to obtain</p>
<p><span class="math display">\[-i\omega(k) = \sum_{j=0}^n a_j (ik)^j\]</span></p>
<p>or simply</p>
<p><span class="math display">\[\omega(k) = \sum_{j=0}^n a_j i^{j+1} k^j.\]</span></p>
<p>The function <span class="math inline">\(\omega(k)\)</span> is often referred to as the <em>dispersion relation</em> for the PDE. Any solution can be expressed as a sum of Fourier modes, and each mode propagates in a manner dictated by the dispersion relation. It’s easy to see that</p>
<ul>
<li>If <span class="math inline">\(\omega(k)\)</span> is <strong>real</strong>, then energy is conserved and each mode simply translates. This occurs if only odd-numbered spatial derivatives appear in the evolution equation \eqref{evol}.</li>
<li>If <span class="math inline">\(\omega(k)\)</span> has <strong>negative imaginary part</strong>, energy decays in time. The heat equation \eqref{heat} behaves this way.</li>
<li>If <span class="math inline">\(\omega(k)\)</span> has <strong>positive imaginary part</strong>, then the energy will grow exponentially in time. This doesn’t usually occur in physical systems. An example of this behavior is obtained by changing the sign of the right side in the heat equation to get <span class="math inline">\(u_t = - u_{xx}\)</span>.</li>
</ul>
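<p>The three cases above can be checked numerically. The following is a minimal sketch, not part of the original analysis: the helper names <code>omega</code> and <code>classify</code> are hypothetical, and the coefficient list <code>a</code> is assumed to hold the real coefficients <span class="math inline">\(a_0, a_1, \dots, a_n\)</span> of \eqref{evol}.</p>

```python
def omega(k, a):
    """Evaluate omega(k) = sum_j a[j] * i**(j+1) * k**j for u_t = sum_j a[j] d^j u / dx^j."""
    return sum(a_j * 1j ** (j + 1) * k ** j for j, a_j in enumerate(a))


def classify(w, tol=1e-12):
    """Classify a Fourier mode by the sign of the imaginary part of omega."""
    if w.imag > tol:
        return "growth"
    if -w.imag > tol:
        return "decay"
    return "conserved"


# (name, coefficients a_0..a_n); these example equations are illustrative choices
examples = [
    ("advection  u_t = u_x", [0, 1]),
    ("heat       u_t = u_xx", [0, 0, 1]),
    ("backward   u_t = -u_xx", [0, 0, -1]),
]
for name, a in examples:
    w = omega(2.0, a)
    print(name, w, classify(w))
```

<p>Only the sign of the imaginary part of <span class="math inline">\(\omega\)</span> matters for the classification, so the particular wavenumber <span class="math inline">\(k=2\)</span> used here is arbitrary.</p>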
<p>What about the wave equation, which has two time derivatives? Using the same Fourier mode ansatz \eqref{fourier}, one obtains <span class="math display">\[
\begin{align}
\omega^2 & = k^2
\end{align}
\]</span> or <span class="math inline">\(\omega = \pm k\)</span>. Since <span class="math inline">\(\omega\)</span> is real, energy is conserved.</p>
<p>In the discussion above, we have assumed that <span class="math inline">\(u\)</span> is a scalar and that the coefficients <span class="math inline">\(a_j\)</span> are real. Many undergraduate courses stop at this point, and students are left with the intuition that <strong>even-numbered derivative terms are diffusive</strong> while <strong>odd-numbered derivative terms are dispersive</strong>.</p>
<p>In practice, we often deal with systems of PDEs or PDEs with complex coefficients, and this intuition is then no longer correct. There is nothing deep or mysterious about this topic, but it’s easy to jump to incorrect conclusions if one is not careful. To take a common example, consider the time-dependent Schroedinger equation: <span class="math display">\[i \psi_t = \psi_{xx} + V\psi.\]</span> At first glance, we have on the right side a diffusion term (<span class="math inline">\(\psi_{xx}\)</span>) and a reaction term (<span class="math inline">\(V\psi\)</span>). But what about that pesky factor of <span class="math inline">\(i\)</span> (the imaginary unit) on the left hand side? It’s easy to find the answer using the usual ansatz, but let’s take a little detour first.</p>
<h2 id="systems-of-equations">Systems of equations</h2>
<p>Consider the linear system <span class="math display">\[
\begin{align*}
u_t = A \frac{\partial^j u}{\partial x^j},
\end{align*}
\]</span> where <span class="math inline">\(u\in \mathbb{R}^m\)</span> and <span class="math inline">\(A\)</span> is a square real matrix. Let <span class="math inline">\(\lambda_m\)</span> and <span class="math inline">\(s_m\)</span> denote the eigenvalues and eigenvectors (respectively) of <span class="math inline">\(A\)</span>. Inserting the Fourier mode solution <span class="math display">\[u(x,t) = s_m e^{i(kx - \omega(k) t)},\]</span> and using <span class="math inline">\(A s_m = \lambda_m s_m\)</span>, we obtain <span class="math display">\[\omega(k) = i^{j+1} k^j \lambda_m,\]</span> and any solution can be written as a superposition of these modes. We see now that the behavior of the energy with respect to time depends on both the number <span class="math inline">\(j\)</span> of spatial derivatives and the nature of the eigenvalues of <span class="math inline">\(A\)</span>. For instance, if <span class="math inline">\(j=1\)</span> and <span class="math inline">\(A\)</span> has real eigenvalues, energy is conserved. We can obtain just such an example by rewriting the wave equation \eqref{wave} as a first-order system: <span class="math display">\[
\begin{align}
u_t & = v_x \label{w1} \\
v_t & = u_x. \label{w2}
\end{align}
\]</span> (If you’re not familiar with this, just differentiate \eqref{w1} w.r.t. <span class="math inline">\(t\)</span> and \eqref{w2} w.r.t. <span class="math inline">\(x\)</span>, then equate partial derivatives to get back the second-order wave equation \eqref{wave}). We have a linear system with <span class="math inline">\(j=1\)</span> and <span class="math display">\[ A = \begin{pmatrix}
0 & 1 \\ 1 & 0
\end{pmatrix}.\]</span> This matrix has eigenvalues <span class="math inline">\(\lambda=\pm 1\)</span>, so <span class="math inline">\(\omega(k)\)</span> has zero imaginary part.</p>
<p>In this example, our intuition from the scalar case works: our first-order system, with only odd-numbered derivatives, leads to wave-like behavior. But notice that if <span class="math inline">\(A\)</span> had imaginary eigenvalues, our intuition would be wrong; for instance, the system <span class="math display">\[
\begin{align*}
u_t & = -v_x \\
v_t & = u_x,
\end{align*}
\]</span> corresponding to the second-order equation <span class="math inline">\(u_{tt} = - u_{xx},\)</span> admits exponentially growing solutions.</p>
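<p>Both first-order systems can be compared with a few lines of NumPy. This is only a sketch: the function name <code>omega_system</code> is hypothetical, and it uses the relation <span class="math inline">\(\omega(k) = i^{j+1} k^j \lambda\)</span> that follows from inserting a Fourier mode with an eigenvector of <span class="math inline">\(A\)</span> as its amplitude.</p>

```python
import numpy as np


def omega_system(A, j, k):
    """Return omega = i**(j+1) * k**j * lambda for each eigenvalue lambda of A."""
    lam = np.linalg.eigvals(np.asarray(A, dtype=float))
    return 1j ** (j + 1) * k ** j * lam


A_wave = [[0.0, 1.0], [1.0, 0.0]]     # u_t = v_x,  v_t = u_x
A_growing = [[0.0, -1.0], [1.0, 0.0]]  # u_t = -v_x, v_t = u_x

w_wave = omega_system(A_wave, j=1, k=3.0)
w_growing = omega_system(A_growing, j=1, k=3.0)

print("wave system:   ", w_wave)     # imaginary parts vanish: modes only translate
print("growing system:", w_growing)  # one frequency has positive imaginary part
```

<p>A positive imaginary part in either frequency means that the corresponding mode grows like <span class="math inline">\(e^{\mathrm{Im}(\omega) t}\)</span>.</p>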
<h2 id="scalar-problems-with-complex-coefficients">Scalar problems with complex coefficients</h2>
<p>Now that we understand the dispersion relation for systems, it’s easy to understand the dispersion relation for the Schrodinger equation. Multiply by <span class="math inline">\(-i\)</span> to get <span class="math display">\[\psi_t = -i\psi_{xx} - iV\psi.\]</span> Now we can think of this in the same way as a system, where the coefficient matrices have purely imaginary eigenvalues. Then it’s clear that the (even-derivative) terms on the right hand side are both related to wave behavior (i.e., energy is conserved).</p>
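<p>As a quick check (a sketch that assumes, purely for illustration, that <span class="math inline">\(V\)</span> is a real constant), the usual ansatz <span class="math inline">\(\psi = e^{i(kx - \omega(k) t)}\)</span> can be inserted directly into <span class="math inline">\(i\psi_t = \psi_{xx} + V\psi\)</span>:</p>
<p><span class="math display">\[
\begin{align*}
i(-i\omega)\psi & = (ik)^2 \psi + V\psi \\
\omega(k) & = V - k^2.
\end{align*}
\]</span></p>
<p>Under that assumption <span class="math inline">\(\omega(k)\)</span> is real for every real <span class="math inline">\(k\)</span>, so each mode keeps its amplitude and only its phase changes, in agreement with the argument based on imaginary coefficients.</p>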
<h2 id="systems-with-derivatives-of-different-orders">Systems with derivatives of different orders</h2>
<p>In the most general case, we have systems of linear PDEs with multiple spatial derivatives of different order: <span class="math display">\[ \label{gensys}
u_t = \sum_{j=0}^n A_j \frac{\partial^j u}{\partial x^j}.
\]</span></p>
<p>Here’s a real example from my research. It comes from homogenization of the wave equation in a spatially varying medium (see Equation (5.17) of <a href="http://faculty.washington.edu/rjl/pubs/solitary/40815.pdf">this paper</a> for more details). It’s the wave equation plus some second-derivative terms: <span class="math display">\[
\begin{align*}
u_t & = v_x + v_{xx} \\
v_t & = u_x - u_{xx}.
\end{align*}
\]</span> You might (if you hadn’t read the example above) assume that this system is dissipative due to the second derivatives. This system is of the form \eqref{gensys} with <span class="math display">\[
\begin{align}
A_1 & = \begin{pmatrix}
0 & 1 \\ 1 & 0
\end{pmatrix}
&
A_2 & = \begin{pmatrix}
0 & 1 \\ -1 & 0
\end{pmatrix}.
\end{align}
\]</span> Of course, <span class="math inline">\(A_1\)</span> has real eigenvalues and leads to wave-like behavior. But <span class="math inline">\(A_2\)</span> has pure imaginary eigenvalues, so it also leads to wave-like behavior! The second derivative terms are <em>dispersive</em>. In fact, it’s easy to show that the energy <span class="math inline">\(E=\int (u^2+v^2)\,dx\)</span> is a conserved quantity for this system, given periodic or decaying boundary conditions (try it!).</p>
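<p>For this particular system the combined effect of all terms can be checked without treating <span class="math inline">\(A_1\)</span> and <span class="math inline">\(A_2\)</span> separately. Inserting <span class="math inline">\(u = s\, e^{i(kx-\omega t)}\)</span> with a constant vector <span class="math inline">\(s\)</span> into \eqref{gensys} gives <span class="math inline">\(-i\omega s = \sum_j (ik)^j A_j s\)</span>, so <span class="math inline">\(\omega\)</span> is <span class="math inline">\(i\)</span> times an eigenvalue of the matrix on the right. The sketch below (the function name <code>frequencies</code> is hypothetical) computes those eigenvalues numerically.</p>

```python
import numpy as np


def frequencies(A_list, k):
    """A_list[j] multiplies the j-th x-derivative; returns omega = i * eig(sum_j (ik)**j A_j)."""
    symbol = sum((1j * k) ** j * np.asarray(A_j, dtype=complex)
                 for j, A_j in enumerate(A_list))
    return 1j * np.linalg.eigvals(symbol)


A0 = np.zeros((2, 2))
A1 = np.array([[0.0, 1.0], [1.0, 0.0]])
A2 = np.array([[0.0, 1.0], [-1.0, 0.0]])

for k in (0.5, 1.0, 2.0, 4.0):
    w = frequencies([A0, A1, A2], k)
    # The imaginary parts are zero up to roundoff, so no mode grows or decays.
    print(k, np.sort(w.real), np.max(np.abs(w.imag)))
```

<p>Working the eigenvalues out by hand gives <span class="math inline">\(\omega(k) = \pm k\sqrt{1+k^2}\)</span>, which is real but not linear in <span class="math inline">\(k\)</span>: the second-derivative terms change the speed of each mode without damping it.</p>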
<p>Strictly speaking, Fourier analysis like what we’ve described can’t usually be applied to \eqref{gensys} because the matrices <span class="math inline">\(A_j\)</span> will not generally be simultaneously diagonalizable (though this analysis can still give us intuition for what each set of terms may do). Worse yet, the individual matrices may not be diagonalizable. Let’s illustrate with a simple case.</p>
<p>Returning to the wave equation, let’s consider a different way of writing it as a system: <span class="math display">\[
\begin{align*}
u_t & = v \\
v_t & = u_{xx}.
\end{align*}
\]</span> It’s easy to check that this system is equivalent to the wave equation – but notice that it’s composed of parts with only even derivatives! (<em>reaction</em> and <em>diffusion</em> equations in the terminology of scalar PDEs). This system is of the form \eqref{gensys} with <span class="math display">\[
\begin{align}
A_0 & = \begin{pmatrix}
0 & 1 \\ 0 & 0
\end{pmatrix}
&
A_2 & = \begin{pmatrix}
0 & 0 \\ 1 & 0
\end{pmatrix}.
\end{align}
\]</span> Notice that both eigenvalues of both matrices are equal to zero.</p>
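<p>One way the analysis could be carried further (a sketch, not part of the argument above) is to insert <span class="math inline">\(u = s\, e^{i(kx-\omega t)}\)</span>, with a constant vector <span class="math inline">\(s\)</span>, into the whole system rather than looking at <span class="math inline">\(A_0\)</span> and <span class="math inline">\(A_2\)</span> one at a time. This gives the eigenvalue problem</p>
<p><span class="math display">\[
-i\omega s = \left(A_0 + (ik)^2 A_2\right) s = \begin{pmatrix}
0 & 1 \\ -k^2 & 0
\end{pmatrix} s.
\]</span></p>
<p>The matrix on the right has eigenvalues <span class="math inline">\(\pm ik\)</span>, so <span class="math inline">\(\omega = \pm k\)</span>, which is the dispersion relation of the wave equation. For <span class="math inline">\(k \ne 0\)</span> the combined matrix has distinct eigenvalues and is therefore diagonalizable, even though neither <span class="math inline">\(A_0\)</span> nor <span class="math inline">\(A_2\)</span> is.</p>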
Notes 2014.03.032014-03-03T00:00:00+03:00h/2014/03/03/notes<p>Finally figured out what was wrong with the stability regions for the deferred correction methods in Nodepy when <span class="math inline">\(\theta \ne 0\)</span>. See <a href="https://bitbucket.org/ketch/rkextrapolation/src/cce934e20cf514c8c5450e7ad09f5774052ff575/code/SDC%20Stability%20regions%20when%20theta%20is%20nonzero.ipynb?at=master">these</a> <a href="https://bitbucket.org/ketch/rkextrapolation/src/cce934e20cf514c8c5450e7ad09f5774052ff575/code/Reproduce_DC_stability_region.ipynb?at=master">notebooks</a>.</p>
<p>I also sat with Roland and got the latest version of PeanoClaw running on my workstation.</p>
Notes 2014.02.272014-02-27T00:00:00+03:00h/2014/02/27/notes<p>Investigated stability regions for high order deferred correction schemes; see <a href="https://bitbucket.org/ketch/rkextrapolation/src/a78b4aa2d336491cac35c1df81e703ce103d6937/SDC%20Stability%20regions%20when%20theta%20is%20nonzero.ipynb">this notebook</a>.</p>
<p>Finally, after about a year of searching, found a way to redirect all output from distutils to a file. This will avoid the massive amount of warnings that are currently printed to the screen when installing PyClaw. See the patch <a href="https://github.com/clawpack/clawpack/pull/35">here</a>, based on <a href="http://stackoverflow.com/a/11632982/786902">this StackOverflow answer</a>.</p>
<p>I also put together an <a href="http://nbviewer.ipython.org/urls/dl.dropboxusercontent.com/u/656693/shallow_water_diffraction.ipynb">IPython notebook on shallow water solitary waves over periodic bathymetry</a>. It will be in the Github repo soon.</p>
The Schrodinger equation is not a reaction-diffusion equation2014-02-22T00:00:00+03:00h/2014/02/22/schrodinger-is-not-diffusion<p>Recently, a stackexchange answer claimed that <a href="http://scicomp.stackexchange.com/a/10878/123">the Schrodinger equation is effectively a reaction-diffusion equation</a>. I’ll set aside semantic arguments about the meaning of “effectively”, and give a more obvious example to explain why I think this statement is misleading.</p>
<p>Consider the wave equation</p>
<p><span class="math display">\[u_{tt} = u_{xx}\]</span></p>
<p>Introducing a new variable <span class="math inline">\(v=u_t\)</span> we can rewrite the wave equation as</p>
<p><span class="math display">\[
\begin{align*}
v_t & = u_{xx} \\
u_t & = v.
\end{align*}
\]</span></p>
<p>Observe that the first of these equations is the diffusion equation, while the second is a reaction equation. Thus we have reaction-diffusion!</p>
<p>Right?</p>
<p>Wrong. We’ve disguised the true nature of this equation by applying our intuition (which is based on scalar PDEs) to a system of PDEs. In the same way, the “reaction-diffusion” label for Schrodinger is obtained by applying intuition based on PDEs with real coefficients to a PDE with complex coefficients.</p>
<p>Of course, in both cases you can use numerical methods that are appropriate for reaction-diffusion problems in order to solve a wave equation.<br />
<a href="http://nbviewer.ipython.org/github/ketch/exposition/blob/master/Wave%20equation%20as%20reaction-diffusion.ipynb">Here is a quick ipython notebook implementation of the obvious method for the system above</a>.</p>
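<p>To see the wave behavior of the system above concretely, here is a minimal sketch. It is not the method from the linked notebook: the discretization (centered differences on a periodic grid, with the second equation using the freshly updated <span class="math inline">\(v\)</span>), the function name <code>solve_wave_as_system</code>, and all parameter values are assumptions chosen for illustration.</p>

```python
import numpy as np


def solve_wave_as_system(n=64, cfl=0.5, t_final=2 * np.pi):
    """Integrate v_t = u_xx, u_t = v on a periodic grid; return max|u| after each step."""
    x = np.linspace(0.0, 2 * np.pi, n, endpoint=False)
    dx = x[1] - x[0]
    steps = int(round(t_final / (cfl * dx)))
    dt = t_final / steps
    u = np.sin(x)          # initial displacement
    v = np.zeros_like(x)   # initial velocity
    amplitudes = []
    for _ in range(steps):
        u_xx = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx ** 2
        v = v + dt * u_xx  # the "diffusion" equation, explicit step
        u = u + dt * v     # the "reaction" equation, using the updated v
        amplitudes.append(np.max(np.abs(u)))
    return np.array(amplitudes)


amps = solve_wave_as_system()
print("largest amplitude over one period:", amps.max())
print("amplitude after one period:       ", amps[-1])
```

<p>If the system really behaved like a scalar diffusion equation, the initial sine wave would have decayed by a factor of about <span class="math inline">\(e^{-2\pi}\)</span> by <span class="math inline">\(t=2\pi\)</span>; instead the amplitude oscillates and returns close to its initial value.</p>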
Notes 2014.02.222014-02-22T00:00:00+03:00h/2014/02/22/notes<p>Discussed time stepping for aeroacoustics with Antony Jameson at Stanford. Also reviewed a couple of his group’s papers on high order flux reconstruction schemes.</p>
<h3 id="insights-from-von-neumann-analysis-of-high-order-flux-reconstruction-schemes">Insights from von Neumann analysis of high-order flux reconstruction schemes</h3>
<ul>
<li>Vincent, Castonguay, Jameson</li>
<li>JCP 2011</li>
</ul>
<blockquote>
<p>Investigate a 1-parameter family of stable flux reconstruction methods suggested by earlier work. For certain parameter values you get DG or SD schemes. Some values admit spurious modes. The size of the largest stable step size and the order of accuracy are determined as a function of the parameter (c). Nonlinear 2D experimental results are predicted relatively well by the 1D von Neumann analysis.</p>
</blockquote>
<blockquote>
<p>Section 4 is a nice description of how to do von Neumann analysis for FE methods.</p>
</blockquote>
<h3 id="on-the-non-linear-stability-of-flux-reconstruction-schemes">On the Non-linear Stability of Flux Reconstruction Schemes</h3>
<ul>
<li>Jameson, Vincent, Castonguay</li>
<li>J. Sci. Comput. 2011</li>
</ul>
<blockquote>
<p>They look at energy stability in a very general way. Nonlinear stability depends on solution point locations, and on the accuracy of the determination of the transformed flux.</p>
</blockquote>
<p>Time integration ideas that could be useful for aeronautics simulations:</p>
<ul>
<li>Optimization of stability regions</li>
<li>Multirate time stepping. Some work has been done in <a href="http://dx.doi.org/10.1016/j.jcp.2010.05.028">Nonuniform time-step Runge–Kutta discontinuous Galerkin method for Computational Aeroacoustics</a>.</li>
<li>Large time step methods. See <a href="http://dx.doi.org/10.1016/j.jcp.2011.06.008">A class of large time step Godunov schemes for hyperbolic conservation laws and applications</a>.</li>
</ul>
<p>Large time step methods might work very well as the coarse propagator for parareal-type algorithms.</p>
The parallel EPPEER code2012-10-17T00:00:00+03:00h/2012/10/17/eppeer<p>I tried out the EPPEER code, which uses two-step Runge-Kutta methods and OpenMP, because I’m thinking of writing a shared-memory parallel ODE solver code myself.</p>
<p>I downloaded the code from</p>
<p><a href="http://www.mathematik.uni-marburg.de/~schmitt/peer/eppeer.zip" class="uri" title="Go to wiki page">http://www.mathematik.uni-marburg.de/~schmitt/peer/eppeer.zip</a></p>
<p>unzipped, and ran</p>
<pre><code>gfortran -c mbod4h.f90
gfortran -c ivprkp.f90
gfortran -c -fopenmp ivpepp.f90
gfortran -fopenmp ivprkp.o ivpepp.o mbod4h.o ivp_pmain.f90
./a.out</code></pre>
<p>I had to fix one line that was trying to open a logfile and failed. I also set</p>
<pre><code>export OMP_NUM_THREADS=4</code></pre>
<p>This runs the code with increasingly tight tolerances on a 400-body problem. The output was as follows (I killed it before it finished the really tight tolerance runs):</p>
<pre><code> tol, err, otime, cpu 0.10E-01 0.10702 2.9556 10.534
steps,rej,nfcn: 337 88 1399
tol, err, otime, cpu 0.10E-02 0.93692E-01 4.9853 18.585
steps,rej,nfcn: 605 159 2465
tol, err, otime, cpu 0.10E-03 0.66604E-01 7.9798 30.365
steps,rej,nfcn: 994 244 4015
tol, err, otime, cpu 0.10E-04 0.47637E-01 12.026 46.477
steps,rej,nfcn: 1534 324 6175
tol, err, otime, cpu 0.10E-05 0.24241E-01 18.239 70.756
steps,rej,nfcn: 2338 415 9391</code></pre>
<p>If I understand correctly, the last column is total CPU time; the next to last is wall time. For comparison, I ran it without parallelism:</p>
<pre><code>export OMP_NUM_THREADS=1</code></pre>
<p>Then I got the following:</p>
<pre><code> tol, err, otime, cpu 0.10E-01 0.10702 10.382 10.382
steps,rej,nfcn: 337 88 1399
tol, err, otime, cpu 0.10E-02 0.93692E-01 18.297 18.297
steps,rej,nfcn: 605 159 2465
tol, err, otime, cpu 0.10E-03 0.66604E-01 29.814 29.815
steps,rej,nfcn: 994 244 4015
tol, err, otime, cpu 0.10E-04 0.47637E-01 45.854 45.855
steps,rej,nfcn: 1534 324 6175
tol, err, otime, cpu 0.10E-05 0.24241E-01 69.725 69.726
steps,rej,nfcn: 2338 415 9391
tol, err, otime, cpu 0.10E-06 0.53727E-02 105.47 105.48
steps,rej,nfcn: 3539 484 14195</code></pre>
<p>The numbers of function evaluations were identical, confirming that the computations being performed were the same. The speedup (roughly 3.5x to 3.8x on four threads, judging by the wall times) is very nice. We should be able to achieve something similar with extrapolation.</p>
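<p>The speedup can be read off from the two tables by dividing the wall times (the <code>otime</code> column) of the single-thread run by those of the four-thread run. The short script below does this for the five tolerances that both runs completed; the variable names are arbitrary, and the numbers are copied from the output above.</p>

```python
tolerances = [1e-2, 1e-3, 1e-4, 1e-5, 1e-6]
wall_4_threads = [2.9556, 4.9853, 7.9798, 12.026, 18.239]  # OMP_NUM_THREADS=4
wall_1_thread = [10.382, 18.297, 29.814, 45.854, 69.725]   # OMP_NUM_THREADS=1

speedups = [t1 / t4 for t1, t4 in zip(wall_1_thread, wall_4_threads)]
for tol, s in zip(tolerances, speedups):
    # Parallel efficiency is the speedup divided by the number of threads.
    print(f"tol={tol:.0e}  speedup={s:.2f}  efficiency={s / 4:.2f}")
```

<p>The speedup improves slightly as the tolerance tightens, which is what one would expect if a fixed startup cost is being amortized over more steps, though these few data points do not establish that.</p>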
<p>These results are actually plotted in <a href="http://www.mathematik.uni-marburg.de/~schmitt/peer/man_epp.pdf">the user guide</a>, at the end of Section 4.</p>
<p>This was originally posted on <a href="https://mathwiki.kaust.edu.sa/david/eppeer">mathwiki</a>.</p>
Blogging an iPython notebook with Jekyll2012-10-11T00:00:00+03:00h/2012/10/11/blogging_ipython_notebooks_with_jekyll<blockquote>
<p><strong>Update as of December 2014: Don’t bother using what’s below; go to <a href="http://cscorley.github.io/2014/02/21/blogging-with-ipython-and-jekyll/">Christopher Corley’s blog</a> for a much better setup!</strong></p>
</blockquote>
<p>I’ve been playing around with <a href="http://ipython.org/ipython-doc/dev/interactive/htmlnotebook.html">iPython notebooks</a> for a while and planning to use them instead of <a href="http://www.sagemath.org/">SAGE</a> worksheets for my numerical analysis course next spring. As a warmup, I wrote an iPython notebook explaining a bit about internal stability of Runge-Kutta methods and showing some new research results using <a href="http://numerics.kaust.edu.sa/nodepy/">NodePy</a>.</p>
<p>I also wanted to post the notebook on my blog here; the ability to more easily include math and code in blog posts was one of my main motivations for moving away from Blogger to my own site. I first tried following <a href="http://blog.fperez.org/2012/09/blogging-with-ipython-notebook.html">the instructions given by Fernando Perez</a>. That was quite painless and worked flawlessly, using <code>nbconvert.py</code> to convert the .ipynb file directly to HTML, with graphics embedded. The only issue was that I didn’t love the look of the output quite as much as I love how Carl Boettiger’s Markdown + Jekyll posts with code and math look (see an example <a href="http://www.carlboettiger.info/2012/09/14/analytic-solution-to-multiple-uncertainty.html">here</a>). Besides, Markdown is so much nicer than HTML, and <code>nbconvert.py</code> has a Markdown output option.</p>
<p>So I tried the markdown option:</p>
<pre><code>nbconvert.py my_nb.ipynb -f markdown</code></pre>
<p>I copied the result to my <code>_posts/</code> directory, added the <a href="https://github.com/mojombo/jekyll/wiki/YAML-Front-Matter">YAML front-matter</a> that Jekyll expects, and took a look. Everything was great except that all my plots were gone, of course. After considering a few options, I decided for now to put plots for such posts in a subfolder <code>jekyll_images/</code> of my public Dropbox folder. Then it was a simple matter of search/replace all the paths to the images. At that point, it looked great; you can see the <a href="https://github.com/ketch/nodepy/blob/master/examples/Internal_stability.ipynb">source</a> and the <a href="http://davidketcheson.info/2012/10/11/Internal_stability.html">result</a>.</p>
<p>The only issue was that I didn’t want to manually do all that work every time. I considered creating a new Converter class in <code>nbconvert</code> to handle it, but finally decided that it would be more convenient to just write a shell script that calls <code>nbconvert</code> and then operates on the result.<br />
Here it is:</p>
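<pre><code>#!/bin/bash
# Usage: pass the notebook name without the .ipynb extension
fname=$1
# Convert the notebook to Markdown
nbconvert.py "${fname}.ipynb" -f markdown
# Point the image paths at the public Dropbox folder (BSD/macOS form of sed -i)
sed -i '' "s#${fname}_files#https://dl.dropbox.com/u/656693/jekyll_images/${fname}_files#g" "${fname}.md"
dt=$(date "+%Y-%m-%d")
# Prepend the YAML front matter using ed
echo "0a
---
layout: post
time: ${dt}
title: TITLE-ME
subtitle: SUBTITLE-ME
tags: TAG-ME
---
.
w" | ed "${fname}.md"
# Move the post into the Jekyll _posts directory under a dated filename
mv "${fname}.md" ~/labnotebook/_posts/"${dt}-${fname}.md"</code></pre>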
<p>It’s also on Github <a href="https://github.com/ketch/labnotebook/blob/master/nbconv.sh">here</a>. This was a nice educational exercise in constructing shell scripts, in which I learned or re-learned:</p>
<ul>
<li>how to use command-line arguments</li>
<li>how to use sed and ed</li>
<li>how to use <code>date</code></li>
</ul>
<p>You can expect a lot more iPython-notebook based posts in the future.</p>
Internal stability of Runge-Kutta methods2012-10-11T00:00:00+03:00h/2012/10/11/Internal_stability<p>Note: this post was generated from an iPython notebook. You can <a href="https://github.com/ketch/nodepy/blob/master/examples/Internal_stability.ipynb">download the notebook from github</a> and execute all the code yourself.</p>
<p>Internal stability deals with the growth of errors (such as roundoff) introduced at the Runge-Kutta stages during a single Runge-Kutta step. It is usually important only for methods with a large number of stages, since that is when the internal amplification factors can be large. An excellent explanation of internal stability is given in <a href="http://oai.cwi.nl/oai/asset/1652/1652A.pdf">this paper</a>. Here we demonstrate some tools for studying internal stability in NodePy.</p>
<p>First, let’s load a couple of RK methods:</p>
<div class="highlight">
<pre><span class="kn">from</span> <span class="nn">nodepy</span> <span class="kn">import</span> <span class="n">rk</span>
<span class="nb">reload</span><span class="p">(</span><span class="n">rk</span><span class="p">)</span>
<span class="n">rk4</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">loadRKM</span><span class="p">(</span><span class="s">'RK44'</span><span class="p">)</span>
<span class="n">ssprk4</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">loadRKM</span><span class="p">(</span><span class="s">'SSP104'</span><span class="p">)</span>
<span class="k">print</span> <span class="n">rk4</span>
<span class="k">print</span> <span class="n">ssprk4</span>
</pre>
</div>
<pre><code>Classical RK4
The original four-stage, fourth-order method of Kutta
0 |
1/2 | 1/2
1/2 | 0 1/2
1 | 0 0 1
_____|____________________
| 1/6 1/3 1/3 1/6
SSPRK(10,4)
The optimal ten-stage, fourth order SSP Runge-Kutta method
0 |
1/6 | 1/6
1/3 | 1/6 1/6
1/2 | 1/6 1/6 1/6
2/3 | 1/6 1/6 1/6 1/6
1/3 | 1/15 1/15 1/15 1/15 1/15
1/2 | 1/15 1/15 1/15 1/15 1/15 1/6
2/3 | 1/15 1/15 1/15 1/15 1/15 1/6 1/6
5/6 | 1/15 1/15 1/15 1/15 1/15 1/6 1/6 1/6
1 | 1/15 1/15 1/15 1/15 1/15 1/6 1/6 1/6 1/6
_____|____________________________________________________________
| 1/10 1/10 1/10 1/10 1/10 1/10 1/10 1/10 1/10 1/10</code></pre>
<h2 id="absolute-stability-regions">Absolute stability regions</h2>
<p>First we can use NodePy to plot the region of absolute stability for each method. The absolute stability region is the set</p>
<center>
<span class="math inline">\(\{ z \in \mathbb{C} : |\phi (z)|\le 1 \}\)</span>
</center>
<p>where <span class="math inline">\(\phi(z)\)</span> is the <em>stability function</em> of the method:</p>
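<center>
<span class="math inline">\(\phi(z) = 1 + z b^T (I-zA)^{-1} e.\)</span>
</center>
<p>Here <span class="math inline">\(A\)</span> and <span class="math inline">\(b\)</span> are the coefficients in the Butcher tableau of the method and <span class="math inline">\(e\)</span> is the vector with all entries equal to 1. If we solve <span class="math inline">\(u'(t) = \lambda u\)</span> with a given method, then <span class="math inline">\(z=\lambda \Delta t\)</span> must lie inside this region or the computation will be unstable.</p>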
<div class="highlight">
<pre><span class="n">p</span><span class="p">,</span><span class="n">q</span> <span class="o">=</span> <span class="n">rk4</span><span class="o">.</span><span class="n">stability_function</span><span class="p">()</span>
<span class="k">print</span> <span class="n">p</span>
<span class="n">h1</span><span class="o">=</span><span class="n">rk4</span><span class="o">.</span><span class="n">plot_stability_region</span><span class="p">()</span>
</pre>
</div>
<pre><code> 4 3 2
0.04167 x + 0.1667 x + 0.5 x + 1 x + 1</code></pre>
<figure>
<img src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_00.png" alt="" />
</figure>
<div class="highlight">
<pre><span class="n">p</span><span class="p">,</span><span class="n">q</span> <span class="o">=</span> <span class="n">ssprk4</span><span class="o">.</span><span class="n">stability_function</span><span class="p">()</span>
<span class="k">print</span> <span class="n">p</span>
<span class="n">h2</span><span class="o">=</span><span class="n">ssprk4</span><span class="o">.</span><span class="n">plot_stability_region</span><span class="p">()</span>
</pre>
</div>
<pre><code> 10 9 8 7 6
3.969e-09 x + 2.381e-07 x + 6.43e-06 x + 0.0001029 x + 0.00108 x
5 4 3 2
+ 0.00787 x + 0.04167 x + 0.1667 x + 0.5 x + 1 x + 1</code></pre>
<figure>
<img src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_01.png" alt="" />
</figure>
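<p>The stability polynomial printed above for the classical RK4 method can be cross-checked without NodePy. The sketch below evaluates <span class="math inline">\(1 + z b^T (I-zA)^{-1} e\)</span> with plain NumPy, using the Butcher coefficients shown in the tableau earlier; the function name <code>stability_function</code> here is a standalone helper, not the NodePy method of the same name.</p>

```python
import numpy as np

# Butcher coefficients of the classical four-stage, fourth-order method
A = np.array([[0.0, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0, 0.0],
              [0.0, 0.5, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
b = np.array([1 / 6, 1 / 3, 1 / 3, 1 / 6])


def stability_function(z, A, b):
    """Evaluate 1 + z b^T (I - zA)^{-1} e at a single complex number z."""
    s = len(b)
    e = np.ones(s)
    return 1 + z * (b @ np.linalg.solve(np.eye(s) - z * A, e))


z = -1.0 + 2.0j
poly = 1 + z + z ** 2 / 2 + z ** 3 / 6 + z ** 4 / 24  # degree-4 Taylor polynomial of exp(z)
print(stability_function(z, A, b), poly)

# Membership in the absolute stability region at two points on the real axis
print(abs(stability_function(-2.5, A, b)), abs(stability_function(-3.0, A, b)))
```

<p>The two printed complex numbers agree up to roundoff, consistent with the coefficients 0.04167, 0.1667 and 0.5 in the NodePy output being 1/24, 1/6 and 1/2.</p>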
<h1 id="internal-stability">Internal stability</h1>
<p>The stability function tells us by how much errors from one step are amplified in the next one. This is important since we introduce truncation errors at every step. However, we also introduce roundoff errors at each stage within a step. Internal stability tells us about the growth of those. Internal stability is typically less important than (step-by-step) absolute stability for two reasons:</p>
<ul>
<li>Roundoff errors are typically much smaller than truncation errors, so moderate amplification of them typically is not significant</li>
<li>Although the propagation of stage errors within a step is governed by internal stability functions, in later steps these errors are propagated according to the (principal) stability function</li>
</ul>
<p>Nevertheless, in methods with many stages, internal stability can play a key role.</p>
<p>Questions: <em>In the solution of PDEs, large spatial truncation errors enter at each stage. Does this mean internal stability becomes more significant? How does this relate to stiff accuracy analysis and order reduction?</em></p>
<h2 id="internal-stability-functions">Internal stability functions</h2>
<p>We can write the equations of a Runge-Kutta method compactly as</p>
<center>
<span class="math inline">\(y = u^n e + h A F(y)\)</span>
</center>
<center>
<span class="math inline">\(u^{n+1} = u^n + h b^T F(y),\)</span>
</center>
<p>where <span class="math inline">\(y\)</span> is the vector of stage values, <span class="math inline">\(u^n\)</span> is the previous step solution, <span class="math inline">\(e\)</span> is a vector with all entries equal to 1, <span class="math inline">\(h\)</span> is the step size, <span class="math inline">\(A\)</span> and <span class="math inline">\(b\)</span> are the coefficients in the Butcher tableau, and <span class="math inline">\(F(y)\)</span> is the vector of stage derivatives. In floating point arithmetic, roundoff errors will be made at each stage. Representing these errors by a vector <span class="math inline">\(r\)</span>, we have</p>
<center>
<span class="math inline">\(y = u^n e + h A F(y) + r.\)</span>
</center>
<p>Considering the test problem <span class="math inline">\(F(y)=\lambda y\)</span> and solving for <span class="math inline">\(y\)</span> gives</p>
<center>
<span class="math inline">\(y = u^n (I-zA)^{-1}e + (I-zA)^{-1}r,\)</span>
</center>
<p>where <span class="math inline">\(z=h\lambda\)</span>. Substituting this result in the equation for <span class="math inline">\(u^{n+1}\)</span> gives</p>
<center>
<span class="math inline">\(u^{n+1} = u^n (1 + zb^T(I-zA)^{-1}e) + zb^T(I-zA)^{-1}r = \phi(z) u^n + \theta(z)^T r.\)</span>
</center>
<p>Here <span class="math inline">\(\phi(z)\)</span> is the <em>stability function</em> of the method, which we already encountered above. Meanwhile, the vector <span class="math inline">\(\theta(z)\)</span> contains the <em>internal stability functions</em> that govern the amplification of roundoff errors <span class="math inline">\(r\)</span> within a step:</p>
<center>
<span class="math inline">\(\theta(z)^T = z b^T (I-zA)^{-1}.\)</span>
</center>
<p>Let’s compute <span class="math inline">\(\theta\)</span> for the classical RK4 method:</p>
<div class="highlight">
<pre><span class="n">theta</span><span class="o">=</span><span class="n">rk4</span><span class="o">.</span><span class="n">internal_stability_polynomials</span><span class="p">()</span>
<span class="n">theta</span>
</pre>
</div>
<pre>
[poly1d([1/24, 1/12, 1/6, 1/6, 0], dtype=object),
poly1d([1/12, 1/6, 1/3, 0], dtype=object),
poly1d([1/6, 1/3, 0], dtype=object),
poly1d([1/6, 0], dtype=object)]
</pre>
<div class="highlight">
<pre><span class="k">for</span> <span class="n">theta_j</span> <span class="ow">in</span> <span class="n">theta</span><span class="p">:</span>
<span class="k">print</span> <span class="n">theta_j</span>
</pre>
</div>
<pre><code> 4 3 2
0.04167 x + 0.08333 x + 0.1667 x + 0.1667 x
3 2
0.08333 x + 0.1667 x + 0.3333 x
2
0.1667 x + 0.3333 x
0.1667 x</code></pre>
<p>Thus the roundoff errors in the first stage are amplified by a factor <span class="math inline">\(z^4/24 + z^3/12 + z^2/6 + z/6\)</span>, while the errors in the last stage are amplified by a factor <span class="math inline">\(z/6\)</span>.</p>
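<p>These polynomials can be reproduced by evaluating <span class="math inline">\(z b^T (I-zA)^{-1}\)</span> directly. The following sketch uses plain NumPy rather than NodePy; the function name <code>internal_stability</code> is hypothetical, and the test point <span class="math inline">\(z\)</span> is arbitrary.</p>

```python
import numpy as np

# Butcher coefficients of the classical RK4 method
A = np.array([[0.0, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.0, 0.0],
              [0.0, 0.5, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
b = np.array([1 / 6, 1 / 3, 1 / 3, 1 / 6])


def internal_stability(z, A, b):
    """Return the entries of z b^T (I - zA)^{-1}, one per stage."""
    s = len(b)
    # Solving (I - zA)^T y = b gives y^T = b^T (I - zA)^{-1}.
    return z * np.linalg.solve((np.eye(s) - z * A).T, b)


z = -1.0 + 2.0j
theta = internal_stability(z, A, b)

# The four polynomials printed by NodePy above, evaluated at the same point
expected = np.array([z ** 4 / 24 + z ** 3 / 12 + z ** 2 / 6 + z / 6,
                     z ** 3 / 12 + z ** 2 / 6 + z / 3,
                     z ** 2 / 6 + z / 3,
                     z / 6])
print(np.max(np.abs(theta - expected)))  # difference is at roundoff level
```

<p>Solving with the transposed matrix avoids forming the inverse explicitly, which is a common choice but not the only reasonable one for a 4-by-4 system.</p>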
<h2 id="internal-instability">Internal instability</h2>
<p>Usually internal stability is unimportant since it relates to amplification of roundoff errors, which are very small. Let’s think about when things can go wrong in terms of internal instability. If <span class="math inline">\(|\theta(z)|\)</span> is of the order <span class="math inline">\(1/\epsilon_{machine}\)</span>, then roundoff errors could be amplified so much that they destroy the accuracy of the computation. More specifically, we should be concerned if <span class="math inline">\(|\theta(z)|\)</span> is of the order <span class="math inline">\(tol/\epsilon_{machine}\)</span> where <span class="math inline">\(tol\)</span> is our desired error tolerance. Of course, we only care about values of <span class="math inline">\(z\)</span> that lie inside the absolute stability region <span class="math inline">\(S\)</span>, since internal stability won’t matter if the computation is not absolutely stable.</p>
<p>We can get some idea about the amplification of stage errors by plotting the curves <span class="math inline">\(|\theta(z)|=1\)</span> along with the stability region. Ideally these curves will all lie outside the stability region, so that all stage errors are damped.</p>
<div class="highlight">
<pre><span class="n">rk4</span><span class="o">.</span><span class="n">internal_stability_plot</span><span class="p">()</span>
</pre>
</div>
<figure>
<img src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_02.png" alt="" />
</figure>
<div class="highlight">
<pre><span class="n">ssprk4</span><span class="o">.</span><span class="n">internal_stability_plot</span><span class="p">()</span>
</pre>
</div>
<figure>
<img src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_03.png" alt="" />
</figure>
<p>For both methods, we see that some of the curves intersect the absolute stability region, so some stage errors are amplified. But by how much? We’d really like to know the maximum amplification of the stage errors under the condition of absolute stability. We therefore define the <em>maximum internal amplification factor</em> <span class="math inline">\(M\)</span>:</p>
<center>
<span class="math inline">\(M = \max_j \max_{z \in S} |\theta_j(z)|\)</span>
</center>
<div class="highlight">
<pre><span class="k">print</span> <span class="n">rk4</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
<span class="k">print</span> <span class="n">ssprk4</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
</pre>
</div>
<pre><code>2.15239281554
4.04399941143</code></pre>
<p>We see that both methods have small internal amplification factors, so internal stability is not a concern in either case. This is not surprising for the method with only four stages; it is a surprisingly good property of the method with ten stages.</p>
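<p>As a rough independent check of the RK4 value, here is a minimal sketch that evaluates the definition of <span class="math inline">\(M\)</span> by brute force on a grid. It is not how NodePy computes the number; the function name <code>max_internal_amplification_grid</code>, the bounding box, and the grid resolution are assumptions made for illustration. The polynomials <span class="math inline">\(\theta_j\)</span> are copied from the output printed earlier in this post, and the stability polynomial is the usual degree-four Taylor polynomial of <span class="math inline">\(e^z\)</span>.</p>

```python
import numpy as np

# Internal stability polynomials of classical RK4, copied from the output
# printed earlier in this post (highest-degree coefficient first).
thetas = [
    np.poly1d([1/24., 1/12., 1/6., 1/6., 0.]),
    np.poly1d([1/12., 1/6., 1/3., 0.]),
    np.poly1d([1/6., 1/3., 0.]),
    np.poly1d([1/6., 0.]),
]
# Stability polynomial of classical RK4: 1 + z + z^2/2 + z^3/6 + z^4/24.
R = np.poly1d([1/24., 1/6., 1/2., 1., 1.])

def max_internal_amplification_grid(thetas, R, box=(-4.0, 1.0, -4.0, 4.0), n=801):
    """Approximate max_j max_{z in S} |theta_j(z)| by sampling a grid.

    Hypothetical helper: S is approximated by the grid points with |R(z)| <= 1,
    so the result is a slight underestimate of the true maximum.
    """
    x = np.linspace(box[0], box[1], n)
    y = np.linspace(box[2], box[3], n)
    X, Y = np.meshgrid(x, y)
    Z = X + 1j * Y
    inside = np.abs(R(Z)) <= 1.0          # grid points in the stability region
    return max(np.abs(theta(Z[inside])).max() for theta in thetas)

M_rk4 = max_internal_amplification_grid(thetas, R)
print(M_rk4)
```

<p>Because each <span class="math inline">\(\theta_j\)</span> is a polynomial, its maximum modulus over <span class="math inline">\(S\)</span> is attained on the boundary, so a grid that resolves the boundary well should give a value close to the one reported above.</p>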
<p>Questions: <em>Do SSP RK methods always (necessarily) have small amplification factors? Can we prove it?</em></p>
<p>Now let’s look at some methods with many stages.</p>
<h2 id="runge-kutta-chebyshev-methods">Runge-Kutta Chebyshev methods</h2>
<p>The paper of Verwer, Hundsdorfer, and Sommeijer deals with RKC methods, which can have very many stages. The construction of these methods is implemented in NodePy, so let’s take a look at them. The functions <code>RKC1(s)</code> and <code>RKC2(s)</code> construct RKC methods of order 1 and 2, respectively, with <span class="math inline">\(s\)</span> stages.</p>
<div class="highlight">
<pre><span class="n">s</span><span class="o">=</span><span class="mi">4</span>
<span class="n">rkc</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">RKC1</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>
<span class="k">print</span> <span class="n">rkc</span>
</pre>
</div>
<pre><code>RKC41
 0    |
 1/16 | 1/16
 1/4  | 1/8   1/8
 9/16 | 3/16  1/4   1/8
______|________________________
      | 1/4   3/8   1/4   1/8</code></pre>
<div class="highlight">
<pre><span class="n">rkc</span><span class="o">.</span><span class="n">internal_stability_plot</span><span class="p">()</span>
</pre>
</div>
<figure>
<img src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_04.png" alt="" />
</figure>
<p>It looks like there could be some significant internal amplification here. Let’s see:</p>
<div class="highlight">
<pre><span class="n">rkc</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
</pre>
</div>
<pre>
11.760869405962685
</pre>
<p>Nothing catastrophic. Let’s try a larger value of <span class="math inline">\(s\)</span>:</p>
<div class="highlight">
<pre><span class="n">s</span><span class="o">=</span><span class="mi">20</span>
<span class="n">rkc</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">RKC1</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>
<span class="n">rkc</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
</pre>
</div>
<pre>
42.665327220219126
</pre>
<p>As promised, these methods seem to have good internal stability properties. What about the second-order methods?</p>
<div class="highlight">
<pre><span class="n">s</span><span class="o">=</span><span class="mi">20</span>
<span class="n">rkc</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">RKC2</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>
<span class="n">rkc</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
</pre>
</div>
<pre>
106.69110992619214
</pre>
<p>Again, nothing catastrophic. We could take <span class="math inline">\(s\)</span> much larger than 20, but the calculations get to be rather slow (in Python) and since we’re using floating point arithmetic, the accuracy deteriorates.</p>
<p>Remark: <em>we could do the calculations in exact arithmetic using Sympy, but things would get even slower. Perhaps there are some optimizations that could be done to speed this up. Or perhaps we should use Mathematica if we need to do this kind of thing.</em></p>
<p>Remark 2: <em>of course, for the RKC methods the internal stability polynomials are shifted Chebyshev polynomials. So we could evaluate them directly in a stable manner using the three-term recurrence (or perhaps scipy’s special functions library). This would also be a nice check on the calculations above.</em></p>
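<p>To illustrate the recurrence mentioned in Remark 2, here is a minimal sketch that evaluates a shifted Chebyshev polynomial stably. As an assumption for the example, it uses <span class="math inline">\(T_s(1+z/s^2)\)</span>, the stability polynomial of the undamped first-order RKC method; the internal stability polynomials themselves are not written out here. The function name <code>shifted_chebyshev</code> is hypothetical and is not part of NodePy.</p>

```python
import numpy as np

def shifted_chebyshev(s, z):
    """Evaluate T_s(1 + z/s**2) using the three-term recurrence
    T_{k+1}(x) = 2 x T_k(x) - T_{k-1}(x), with T_0 = 1 and T_1 = x."""
    z = np.asarray(z, dtype=complex)
    x = 1.0 + z / s**2
    T_prev = np.ones_like(x)              # T_0
    if s == 0:
        return T_prev
    T_curr = x.copy()                     # T_1
    for _ in range(2, s + 1):
        T_prev, T_curr = T_curr, 2.0 * x * T_curr - T_prev
    return T_curr

s = 20
# Sample the negative real axis over the interval [-2 s^2, 0].
z = np.linspace(-2.0 * s**2, 0.0, 2001)
R_vals = shifted_chebyshev(s, z)
print(np.abs(R_vals).max())
```

<p>On this interval the argument of <span class="math inline">\(T_s\)</span> lies in <span class="math inline">\([-1,1]\)</span>, so the computed modulus should not exceed one beyond roundoff. The same loop accepts complex <span class="math inline">\(z\)</span>, which is what one would need for a check over the whole stability region.</p>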
<h2 id="other-methods-with-many-stages">Other methods with many stages</h2>
<p>Three other classes of methods with many stages have been implemented in NodePy:</p>
<ul>
<li>SSP families</li>
<li>Integral deferred correction (IDC) methods</li>
<li>Extrapolation methods</li>
</ul>
<h3 id="ssp-families">SSP Families</h3>
<div class="highlight">
<pre><span class="n">s</span><span class="o">=</span><span class="mi">20</span>
<span class="n">ssprk</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">SSPRK2</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>
<span class="n">ssprk</span><span class="o">.</span><span class="n">internal_stability_plot</span><span class="p">()</span>
<span class="n">ssprk</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
</pre>
</div>
<pre>
2.0212921484995547
</pre>
<figure>
<img src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_05.png" alt="" />
</figure>
<div class="highlight">
<pre><span class="n">s</span><span class="o">=</span><span class="mi">25</span> <span class="c"># # of stages</span>
<span class="n">ssprk</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">SSPRK3</span><span class="p">(</span><span class="n">s</span><span class="p">)</span>
<span class="n">ssprk</span><span class="o">.</span><span class="n">internal_stability_plot</span><span class="p">()</span>
<span class="n">ssprk</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
</pre>
</div>
<pre>
3.8049237837215397
</pre>
<figure>
<img src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_06.png" alt="" />
</figure>
<p>The SSP methods seem to have excellent internal stability properties.</p>
<h3 id="idc-methods">IDC methods</h3>
<div class="highlight">
<pre><span class="n">p</span><span class="o">=</span><span class="mi">6</span> <span class="c">#order</span>
<span class="n">idc</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">DC</span><span class="p">(</span><span class="n">p</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span>
<span class="k">print</span> <span class="nb">len</span><span class="p">(</span><span class="n">idc</span><span class="p">)</span>
<span class="n">idc</span><span class="o">.</span><span class="n">internal_stability_plot</span><span class="p">()</span>
<span class="n">idc</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
</pre>
</div>
<pre><code>26</code></pre>
<pre>
6.4140166271998815
</pre>
<figure>
<img src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_07.png" alt="" />
</figure>
<p>IDC methods also seem to have excellent internal stability.</p>
<h3 id="extrapolation-methods">Extrapolation methods</h3>
<div class="highlight">
<pre><span class="n">p</span><span class="o">=</span><span class="mi">6</span> <span class="c">#order</span>
<span class="n">ex</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">extrap</span><span class="p">(</span><span class="n">p</span><span class="p">)</span>
<span class="k">print</span> <span class="nb">len</span><span class="p">(</span><span class="n">ex</span><span class="p">)</span>
<span class="n">ex</span><span class="o">.</span><span class="n">internal_stability_plot</span><span class="p">()</span>
<span class="n">ex</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
</pre>
</div>
<pre><code>16
6</code></pre>
<figure>
<img src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_08.png" alt="" />
</figure>
<p>Not so good. Let’s try a method with even more stages (this next computation will take a while; go stretch your legs).</p>
<div class="highlight">
<pre><span class="n">p</span><span class="o">=</span><span class="mi">10</span> <span class="c">#order</span>
<span class="n">ex</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">extrap</span><span class="p">(</span><span class="n">p</span><span class="p">)</span>
<span class="k">print</span> <span class="nb">len</span><span class="p">(</span><span class="n">ex</span><span class="p">)</span>
<span class="n">ex</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
</pre>
</div>
<pre><code>46</code></pre>
<pre>
28073.244376758907
</pre>
<p>Now we’re starting to see something that might cause trouble, especially since such high order extrapolation methods are usually used when extremely tight error tolerances are required. Internal amplification will cause a loss of about 5 digits of accuracy here, so the best we can hope for is about 10 digits of accuracy in double precision. Higher order extrapolation methods will make things even worse. How large are their amplification factors? (Really long calculation here…)</p>
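<p>The arithmetic behind that estimate can be sketched as follows, under the heuristic used above that roundoff errors of size <span class="math inline">\(\epsilon_{machine}\)</span> are amplified by at most <span class="math inline">\(M\)</span>. The variable names are chosen for illustration only.</p>

```python
import numpy as np

M = 28073.244376758907            # amplification factor reported above for p = 10
eps = np.finfo(float).eps         # double-precision machine epsilon

digits_lost = np.log10(M)         # decimal digits lost to internal amplification
best_error = M * eps              # rough floor on the attainable error

print("digits lost: %.2f, error floor: %.2e" % (digits_lost, best_error))
```

<p>This is only an order-of-magnitude estimate, which is all the discussion here relies on.</p>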
<div class="highlight">
<pre><span class="n">pmax</span> <span class="o">=</span> <span class="mi">12</span>
<span class="n">ampfac</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="n">pmax</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span>
<span class="k">for</span> <span class="n">p</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span><span class="n">pmax</span><span class="o">+</span><span class="mi">1</span><span class="p">):</span>
    <span class="n">ex</span> <span class="o">=</span> <span class="n">rk</span><span class="o">.</span><span class="n">extrap</span><span class="p">(</span><span class="n">p</span><span class="p">)</span>
    <span class="n">ampfac</span><span class="p">[</span><span class="n">p</span><span class="p">]</span> <span class="o">=</span> <span class="n">ex</span><span class="o">.</span><span class="n">maximum_internal_amplification</span><span class="p">()</span>
    <span class="k">print</span> <span class="n">p</span><span class="p">,</span> <span class="n">ampfac</span><span class="p">[</span><span class="n">p</span><span class="p">]</span>
</pre>
</div>
<pre><code>1 1.99777378912
2 2.40329384375
3 5.07204078733
4 17.747335803
5 69.62805786
6 97.6097450835
7 346.277441462
8 1467.40356089
9 6344.16303534
10 28073.2443768
11 126011.586473
12 169897.662582</code></pre>
<figure>
<img src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_09.png" alt="" />
</figure>
<div class="highlight">
<pre><span class="n">semilogy</span><span class="p">(</span><span class="n">ampfac</span><span class="p">,</span><span class="n">linewidth</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span>
</pre>
</div>
<figure>
<img src="https://dl.dropbox.com/u/656693/jekyll_images/Internal_stability_files/Internal_stability_fig_10.png" alt="" />
</figure>
<p>We see roughly geometric growth of the internal amplification factor as a function of the order <span class="math inline">\(p\)</span>. It seems clear that very high order extrapolation methods applied to problems with high accuracy requirements will fall victim to internal stability issues.</p>
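<p>One way to put a number on "roughly geometric" is a least-squares line through <span class="math inline">\(\log_{10}\)</span> of the amplification factors. The sketch below uses the values printed above; the variable names are illustrative, and a straight-line fit is only one possible way to summarize the trend.</p>

```python
import numpy as np

# Amplification factors for orders p = 1, ..., 12, copied from the output above.
p = np.arange(1, 13)
ampfac = np.array([1.99777378912, 2.40329384375, 5.07204078733, 17.747335803,
                   69.62805786, 97.6097450835, 346.277441462, 1467.40356089,
                   6344.16303534, 28073.2443768, 126011.586473, 169897.662582])

# Fit log10(ampfac) ~ slope * p + intercept.
slope, intercept = np.polyfit(p, np.log10(ampfac), 1)
growth_per_order = 10.0**slope    # average multiplicative growth per unit of order

print("growth per order: %.2f" % growth_per_order)
```

<p>The fitted ratio is an average over the whole range; the data themselves grow unevenly from one order to the next.</p>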
A curious upwind implicit scheme for advection2012-10-11T00:00:00+03:00h/2012/10/11/A_curious_upwind_implicit_scheme_for_advection<h2 id="the-cfl-condition">The CFL condition</h2>
<p>The CFL condition is one of the most basic and intuitive principles in the numerical solution of hyperbolic PDEs. First formulated by Courant, Friedrichs and Lewy in their seminal paper (available for free in English <a href="http://www.stat.uchicago.edu/~lekheng/courses/302/classics/courant-friedrichs-lewy.pdf">here</a>), it states that the domain of dependence of a numerical method for solving a PDE must contain the true domain of dependence. Otherwise, the numerical method cannot be convergent.</p>
<p>The CFL condition is geometric and easily understood in the context of, say, a first-order upwind discretization of advection. Usually it says nothing interesting about implicit schemes, since they include all points in their domain of dependence. But sometimes understanding the CFL condition for a particular scheme can be subtle.</p>
<h3 id="an-implicit-scheme">An implicit scheme</h3>
<p>Consider the advection equation</p>
<p><span class="math display">\[u_t + a u_x = 0.\]</span></p>
<p>Discretization using a backward difference in space and in time gives the scheme</p>
<p><span class="math display">\[U^{n+1}_j = U^n_j - \nu(U^{n+1}_j - U^{n+1}_{j-1}).\]</span></p>
<p>where <span class="math inline">\(\nu = ka/h\)</span> is the CFL number and <span class="math inline">\(k,h\)</span> are the step sizes in time and space, respectively. This very simple scheme illustrates the concepts of the CFL condition and stability in a remarkable way.</p>
<p>For simplicity, suppose that the problem is posed on the domain <span class="math inline">\(0\le x \le 1\)</span>, with an appropriate boundary condition. Since this scheme computes <span class="math inline">\(U^{n+1}_j\)</span> in terms of <span class="math inline">\(U^n_j\)</span> and <span class="math inline">\(U^{n+1}_{j-1}\)</span>, it seems that the numerical domain of dependence for <span class="math inline">\(U^n_j\)</span> is <span class="math inline">\((x,t)\in [0,x_j]\times[0,t_n]\)</span>. Based on this, we may conclude that the scheme is not convergent for <span class="math inline">\(\nu<0\)</span>, since the characteristics then arrive from the right, where the scheme appears to use no values. Simple enough.</p>
<p>But what if <span class="math inline">\(\nu=-1\)</span>? Then the scheme reads <span class="math display">\[U^{n+1}_{j-1} = U^n_j,\]</span> which gives <strong>the exact solution</strong>! This is a sort of “anti-unit CFL condition”.</p>
<p>How can this scheme be convergent (in fact, exact!) for a negative CFL number when it doesn’t use any values to the right?</p>
<h3 id="understanding-the-cfl-condition">Understanding the CFL condition</h3>
<p>Look at the exact formula above. In this case the scheme is not a method for computing <span class="math inline">\(U^{n+1}_j\)</span> but for computing <span class="math inline">\(U^{n+1}_{j-1}\)</span>, and it <em>does</em> use a value from the previous time step that lies to the right.</p>
<p>So we can view the scheme as a method for computing <span class="math inline">\(U^{n+1}_j\)</span>, in which case the CFL condition is satisfied only for <span class="math inline">\(\nu\ge0\)</span>, or as a method for computing <span class="math inline">\(U^{n+1}_{j-1}\)</span>, in which case the CFL condition is satisfied only for <span class="math inline">\(\nu\le-1\)</span>. <strong>Which viewpoint is correct?</strong></p>
<p>To answer that question, remember that the CFL condition is purely algebraic – that is, it relates to which values are actually used to compute which other values. To understand this scheme, we need to think about how we actually solve for <span class="math inline">\(U^{n+1}\)</span> when using it. Notice that the scheme can be written as <span class="math display">\[A U^{n+1} = U^n\]</span> where the matrix <span class="math inline">\(A\)</span> is lower-triangular. Hence the system can be solved by substitution. To go further, we must consider two cases:</p>
<ol type="1">
<li><p><span class="math inline">\(\nu>0\)</span>: in this case, boundary values must be supplied along the left boundary at <span class="math inline">\(x=0\)</span>. Then, starting from the known value at the boundary, we work to the right by substitution: <span class="math display">\[U^{n+1}_j = \frac{U^n_j+\nu U^{n+1}_{j-1}}{1+\nu}.\]</span> Hence the scheme is truly a way of computing <span class="math inline">\(U^{n+1}_j\)</span> based on <span class="math inline">\(U^n_j, U^{n+1}_{j-1}\)</span> and the resulting CFL condition is <span class="math inline">\(\nu\ge0\)</span>.</p></li>
<li><p><span class="math inline">\(\nu<0\)</span>: in this case, boundary values must be supplied along the right boundary at <span class="math inline">\(x=1\)</span>. Then, starting from the known value at the boundary, we work to the left by substitution: <span class="math display">\[U^{n+1}_{j-1} = \frac{(1+\nu)U^{n+1}_j - U^n_j}{\nu}.\]</span> Hence the scheme is truly a way of computing <span class="math inline">\(U^{n+1}_{j-1}\)</span> based on <span class="math inline">\(U^n_j, U^{n+1}_{j}\)</span> and the resulting CFL condition is <span class="math inline">\(\nu\le-1\)</span>.</p></li>
</ol>
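<p>The two substitution sweeps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production solver; the function name <code>implicit_upwind_step</code> and its arguments are hypothetical, and the boundary value is simply passed in by the caller.</p>

```python
import numpy as np

def implicit_upwind_step(U, nu, boundary_value):
    """Advance one step of U^{n+1}_j = U^n_j - nu (U^{n+1}_j - U^{n+1}_{j-1}).

    Hypothetical helper: for nu > 0 the boundary value is imposed at the left
    end and we sweep to the right; for nu < 0 it is imposed at the right end
    and we sweep to the left.
    """
    U = np.asarray(U, dtype=float)
    Unew = np.empty_like(U)
    m = len(U)
    if nu > 0:
        Unew[0] = boundary_value
        for j in range(1, m):
            # Solve the scheme for U^{n+1}_j.
            Unew[j] = (U[j] + nu * Unew[j - 1]) / (1.0 + nu)
    elif nu < 0:
        Unew[-1] = boundary_value
        for j in range(m - 1, 0, -1):
            # Solve the scheme for U^{n+1}_{j-1}.
            Unew[j - 1] = ((1.0 + nu) * Unew[j] - U[j]) / nu
    else:
        Unew[:] = U                       # nu = 0: nothing moves
    return Unew

# With nu = -1 every interior value is copied one cell to the left.
x = np.linspace(0.0, 1.0, 11)
U0 = np.sin(2.0 * np.pi * x)
U1 = implicit_upwind_step(U0, -1.0, boundary_value=0.0)
print(np.allclose(U1[:-1], U0[1:]))
```

<p>The direction of the loop is the whole point: the same algebraic scheme reads values from the left or from the right depending on which unknown it is solved for.</p>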
<p>This post was originally published on the KAUST Mathwiki <a href="https://mathwiki.kaust.edu.sa/david/A%20curious%20upwind%20implicit%20scheme%20for%20advection">here</a> (login required).</p>