<p>David I. Ketcheson (dketch@gmail.com)</p>
<h1>Modeling Coronavirus part V -- try the model yourself (2020-03-22)</h1>
<p>As a follow-up to this series of posts, I’ve created a version of the
model that you can experiment with on your own, without needing to know
any computer programming. <a
href="https://mybinder.org/v2/gh/ketch/covid-blog-posts/master?filepath=Interactive_SIR_model.ipynb">Try
it here</a>.</p>
<p>Note that it may take some time for the model to load.</p>
<p>If you arrived here and haven’t already read the series of posts on
this topic, I recommend that you <a
href="http://www.davidketcheson.info/2020/03/17/SIR_model.html">start at
the beginning</a>.</p>
<h1>Modeling Coronavirus part IV -- understanding exponential growth (2020-03-20)</h1>
<p>This is my fourth post on modeling the Coronavirus epidemic. I
recommend starting with <a
href="http://www.davidketcheson.info/2020/03/17/SIR_model.html">the
first post</a> and reading them in order.</p>
<p>So far, we’ve <a
href="http://www.davidketcheson.info/2020/03/17/SIR_model.html">learned
about the SIR model</a> and <a
href="http://www.davidketcheson.info/2020/03/19/SIR_Estimating_parameters.html">used
available data</a> combined with the model to <a
href="http://www.davidketcheson.info/2020/03/19/SIR_predictions.html">predict
the epidemic</a>. We’re now going to detour into some additional
mathematical ideas that will help us get further understanding. At the
end of this post, we’ll try to judge how effective the mitigation
strategies have been in some countries.</p>
<h1 id="exponential-growth">Exponential growth</h1>
<p>In the second post, we saw that the initial spread of a disease
follows a differential equation of the form</p>
<p><span class="math display">\[
\frac{dI}{dt} = \beta I(t).
\]</span></p>
<p>Once you’re comfortable with the idea that <span
class="math inline">\(dI/dt\)</span> just means “the rate of change of
<span class="math inline">\(I\)</span>”, you realize that this is one of
the simplest equations imaginable. So it shouldn’t be surprising that
this equation comes up a lot in the real world. Its solution is</p>
<p><span class="math display">\[
I(t) = e^{\beta t} I(0).
\]</span></p>
<p>Here <span class="math inline">\(e\approx 2.72\)</span> is
<strong>Euler’s number</strong>. This equation tells us that the number
of infected grows very quickly. In the case of Coronavirus, the number
<span class="math inline">\(I\)</span> can double in about 3 days (<a
href="https://ourworldindata.org/coronavirus#growth-country-by-country-view">you
can see more estimates of the doubling time here</a>). We refer to this
kind of growth (where a given quantity doubles over a certain time
interval) as <strong>exponential growth</strong>.</p>
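<p>A growth rate translates directly into a doubling time: from <span class="math inline">\(I(t) = e^{\beta t} I(0)\)</span>, the count doubles after <span class="math inline">\(t = \ln(2)/\beta\)</span>. A quick sanity check in Python, taking 0.2 per day as an assumed, illustrative net growth rate:</p>

```python
import math

# Doubling time for I(t) = exp(r*t) * I(0): solve exp(r*t) = 2 for t.
r = 0.2                          # assumed net growth rate per day
doubling_time = math.log(2) / r
print(doubling_time)             # about 3.5 days
```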
<p>Financial advisors love to talk about the power of exponential growth
because that’s also how compound interest works. Of course, with most
investments it takes a lot more than 3 days to double your money, but
the principle is the same. In fact, exponentially growing functions are
all around us, and learning how they behave can help you to understand a
lot of things.</p>
<p>Let’s look again at our first prediction from the last post:</p>
<p><img src="/assets/img/covid19/exp_4_0.png" /></p>
<p>Right now, we are at the very left edge of this plot. How many people
are infected at the start of the plot? It looks like zero, but we know
there are over 100 thousand current cases. It just looks like zero
because the scale here is in billions, and 100 thousand is much too
small to see on that scale! This is a common problem with exponentially
growing functions. We can try to solve the problem by zooming in on the
left end of the graph, showing results for just the next 30 days:</p>
<p><img src="/assets/img/covid19/exp_6_0.png" /></p>
<p>It’s a little better, but even on this scale the current number of
infections is too small to be seen. We can try again to fix it by just
changing the vertical scale:</p>
<p><img src="/assets/img/covid19/exp_8_0.png" /></p>
<p>Now we can see that the starting value is not zero, but we can’t see
the right part of the graph at all! Let’s find a better solution.</p>
<h1 id="logarithmic-scaling">Logarithmic scaling</h1>
<p>In the plots above, the scale of the <span
class="math inline">\(y\)</span>-axis is <strong>linear</strong>. That
means that equal distances in <span class="math inline">\(y\)</span>
represent equal changes in the value of the function. The problem with
using this for an exponentially growing function is that the changes in
the function at early times are tiny compared to the later growth.</p>
<p>Instead, we can visualize the growth using a
<strong>logarithmic</strong> scale:</p>
<p><img src="/assets/img/covid19/exp_12_0.png" /></p>
<p>Look carefully at the <span class="math inline">\(y\)</span> axis
here. As you can see, each of the evenly spaced labels on the axis
represents a value <strong>10 times greater</strong> than the value
below it. In other words, equal distances on this axis represent equal
<strong>ratios</strong>. The great thing about this scaling is that we
can easily see how the function varies over the whole graph. It might
seem strange that it looks almost like a straight line, but that’s
exactly how an exponentially growing function should look on this kind
of plot. Remember, equal distances in <span
class="math inline">\(y\)</span> represent equal ratios, and we said
that this kind of function doubles over each interval of some fixed
size.</p>
<p>With this plot, it’s easy to answer questions like “when do we expect
to have more than 1 million cases?” We couldn’t possibly answer that by
looking at the previous plots.</p>
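<p>If you want to reproduce this kind of plot, matplotlib can switch the vertical axis to a logarithmic scale with a single call. A minimal sketch (the growth rate and starting count here are assumed, illustrative values):</p>

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')            # render without a display
import matplotlib.pyplot as plt

t = np.linspace(0, 120, 200)     # days
I = 1e5 * np.exp(0.2 * t)        # exponential growth from 100,000 cases

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(t, I)                   # linear scale: early values invisible
ax2.semilogy(t, I)               # log scale: a straight line
for ax in (ax1, ax2):
    ax.set_xlabel('days')
    ax.set_ylabel('infected')
fig.savefig('exponential_scales.png')
```

<p>On the logarithmic axis the exponential curve becomes a straight line, with slope proportional to the growth rate.</p>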
<p>We can also see more easily how rapidly the epidemic goes from
something small to a true global crisis. <strong>At present there are
fewer than a million cases, but before the end of April we would have
(without mitigation) over 1 billion</strong>. Never in history have that
many human beings been ill at the same time with a single disease.</p>
<p>In technical terms, we say this is a <strong>semi-log plot</strong>,
because the <span class="math inline">\(y\)</span> axis is logarithmic
while the <span class="math inline">\(x\)</span> axis is linear. If both
the <span class="math inline">\(x\)</span> and <span
class="math inline">\(y\)</span> axes were logarithmic, we would call it
a <strong>log-log plot</strong>.</p>
<p>Understanding logarithmic plots is a bit of a superpower, because it
allows you to study data by looking at plots like the last one above,
which let you simultaneously see the parts where the function is small
and the parts where it is large.</p>
<h1 id="the-model-predictions-on-a-log-scale">The model predictions on a
log scale</h1>
<p>Here’s a semilog plot of our basic model for the epidemic over the
next year:</p>
<p><img src="/assets/img/covid19/exp_18_0.png" /></p>
<p>Remember, this is exactly the same data that’s in the first plot near
the top of this post. We’re just looking at it in a different way. But
with this new plot we can much more easily see how many susceptibles
remain at the end of a year: about a tenth of a billion, or 100
million.</p>
<p>We can also see that there are still about 1000 infected people in
this model at the end of a year. Notice that after the infection peaks,
it declines in a way that also looks like a straight (downward-trending)
line on this plot; that means that the decrease is also
<strong>exponential</strong>. In other words, after we pass the peak of
infection, the number of infected individuals will consistently reduce
by a factor of two over a certain time interval. Looking back at the
model, we see that the rate of decrease is determined by <span
class="math inline">\(\gamma\)</span>. In the late stages of the
epidemic, because the fraction of susceptible people is small, we have
approximately:</p>
<p><span class="math display">\[
\frac{dI}{dt} = -\gamma I(t)
\]</span></p>
<p>whose solution is</p>
<p><span class="math display">\[
I(t+\tau) = e^{-\gamma \tau} I(t),
\]</span></p>
<p>which, for our estimate <span class="math inline">\(\gamma \approx
0.05\)</span>, implies that the number of infections is reduced by half
about every 14 days.</p>
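<p>The 14-day figure is just the halving time <span class="math inline">\(\ln(2)/\gamma\)</span>; checking the arithmetic with <span class="math inline">\(\gamma \approx 0.05\)</span>:</p>

```python
import math

gamma = 0.05                       # recovery rate per day (estimated earlier)
halving_time = math.log(2) / gamma
print(halving_time)                # about 13.9 days
```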
<h1 id="assessing-the-effectiveness-of-mitigation">Assessing the
effectiveness of mitigation</h1>
<p>Armed with our new superpower, let’s look at our data from specific
countries with this logarithmic scaling. We’ll start again with
Italy.</p>
<p><img src="/assets/img/covid19/exp_22_0.png" /></p>
<p>Several things become clearer with this scaling. Notice that the plot
is <strong>not</strong> a straight line, but we can make good guesses as
to why. Before day 30 (February 21st), there were only one or two known
cases, and after day 30 there was a very abrupt increase. It seems
likely that the virus was spreading before day 30 and the new cases were
only detected later. This matches with <a
href="https://en.wikipedia.org/wiki/2020_coronavirus_pandemic_in_Italy#cite_note-250">statements
from experts in Italy</a>.</p>
<p>Recall our full model equation for <span
class="math inline">\(I(t)\)</span>:</p>
<p><span class="math display">\[
\frac{dI}{dt} = \left(\beta \frac{S}{N}-\gamma\right) I
\]</span></p>
<p>Since only a tiny fraction of the whole Italian population is
infected, we have <span class="math inline">\(S/N\approx 1\)</span>, so
the slope of our exponential growth line on a semilog plot should be
<span class="math inline">\(\beta-\gamma \approx 0.2\)</span>. Let’s see
how a line with that slope matches the data:</p>
<p><img src="/assets/img/covid19/exp_24_0.png" /></p>
<p>Note that we are doing essentially the same thing that we did when
trying to determine <span class="math inline">\(\beta\)</span> in the
second post of this series; but now we are looking at the results on a
log scale to reveal more detail.</p>
<p>It seems plausible (not utterly convincing) that the virus has been
spreading at this expected rate in Italy for about 50 days now. However,
notice that in the last week the slope seems to have decreased. Since
Italy is now making great efforts to detect all new cases, it seems most
likely that this is the result of mitigation. It’s probably too soon to
try to assess the effectiveness of that mitigation, but let’s make an
attempt anyway:</p>
<p><img src="/assets/img/covid19/exp_26_0.png" /></p>
<p>We see that the slope over the last several days is about 0.14,
corresponding to a mitigation factor <span class="math inline">\(q
\approx 0.7\)</span> in the model I introduced in the last post.</p>
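<p>The step from an observed semilog slope to a mitigation factor is just algebra: the observed growth rate is <span class="math inline">\(q\beta - \gamma\)</span>, so <span class="math inline">\(q = (\text{slope} + \gamma)/\beta\)</span>. A sketch of that computation, using the earlier estimates <span class="math inline">\(\beta \approx 0.25\)</span> and <span class="math inline">\(\gamma \approx 0.05\)</span>:</p>

```python
beta, gamma = 0.25, 0.05          # parameter estimates from part II

def mitigation_factor(slope, beta=beta, gamma=gamma):
    """Infer q from an observed growth rate slope = q*beta - gamma."""
    return (slope + gamma) / beta

print(mitigation_factor(0.14))    # Italy: about 0.76, i.e. q ~ 0.7
print(mitigation_factor(0.02))    # South Korea: about 0.28, i.e. q ~ 1/4
```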
<p><img src="/assets/img/covid19/exp_28_0.png" /></p>
<p>For Spain, we see a similar pattern, but no sign of any impact from
mitigation yet.</p>
<p><img src="/assets/img/covid19/exp_30_0.png" /></p>
<p>For South Korea, up to about day 40 we see the same pattern (with
again a similar plateau in late February followed by a rapid rise when
testing increased). But afterward we see completely different behavior,
as the curve flattens. Given that South Korea has perhaps the most
aggressive testing strategy in the world, it seems unlikely that this can
be attributed to new infections going undetected. Instead, the evidence
suggests that mitigation strategies have been very successful. We can
measure this success by looking at how much the slope of the curve has
changed:</p>
<p><img src="/assets/img/covid19/exp_32_0.png" /></p>
<p>An approximate fit to the data from recent days suggests that the
growth rate has been cut to approximately <span
class="math inline">\(0.02\)</span>; this would correspond to a value of
<span class="math inline">\(q\)</span> (from my previous post) of about
1/4, meaning each infected person on average transmits the disease to
only 1/4 as many people as they naturally would. But again it is
probably too early to estimate this number with confidence.</p>
<p><a
href="http://www.davidketcheson.info/2020/03/22/SIR_interactive.html">In
the next post in the series, you can experiment with the model
yourself</a>.</p>
<h1>Modeling Coronavirus part III -- predictions (2020-03-19)</h1>
<p>Welcome to the third post in my series on modeling the Coronavirus
pandemic. In the <a
href="http://www.davidketcheson.info/2020/03/17/SIR_model.html">first</a>
and <a
href="http://www.davidketcheson.info/2020/03/19/SIR_Estimating_parameters.html">second</a>
posts, we introduced the SIR model and used the available data to
estimate the model parameters, in the absence of any mitigation
(i.e. with no social distancing, quarantine, etc.). In this post we’ll
put model and parameters together to see what they predict for the
current pandemic. We’ll do this first assuming no mitigation, and then
we’ll make a very rough attempt to understand how mitigation may impact
the predictions.</p>
<p>Remember, I am not an epidemiologist and I make no guarantees as to
the accuracy of these predictions. I don’t recommend that you make
precise plans based on the details of these predictions. My goal is
merely to show that with a bit of mathematics and common sense, we can
have a reasonable idea of what the future holds.</p>
<h1 id="predictions-in-the-absence-of-mitigation">Predictions in the
absence of mitigation</h1>
<p>First, let’s look at what we expect to happen without any mitigation.
In other words, this is a scenario in which schools and workplaces
remain open, people continue to shake hands or kiss cheeks when greeting
one another, and so forth. <strong>Note that the predictions here are
NOT what we expect will actually happen, because we have intentionally
ignored the effects of containment measures.</strong></p>
<p>Here are the basic assumptions leading to the predictions below:</p>
<ul>
<li>The dynamics of the COVID-19 epidemic follow the SIR model, with
parameters <span class="math inline">\(\beta \approx 0.25\)</span> and
<span class="math inline">\(\gamma \approx 0.05\)</span>.</li>
<li>No containment measures (like quarantine, closures, and social
distancing) are implemented.</li>
<li>The number of confirmed cases at present is about 15 percent of the
actual cases (this is a very rough guess based on some expert
opinions).</li>
</ul>
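<p>The curves below come from integrating the SIR system numerically. A minimal sketch of such a simulation (forward Euler time stepping; the world population and initial infected count are rough, illustrative values, not the exact ones behind the plots):</p>

```python
# Forward-Euler integration of the SIR model with the estimated parameters.
beta, gamma = 0.25, 0.05       # parameter estimates from the previous post
N = 7.8e9                      # approximate world population
S, I, R = N - 1e6, 1e6, 0.0    # illustrative initial conditions

dt = 0.1                       # time step in days
peak_I = I
for step in range(int(400 / dt)):
    flow = beta * I * S / N    # new infections per day
    S, I, R = S - dt * flow, I + dt * (flow - gamma * I), R + dt * gamma * I
    peak_I = max(peak_I, I)

print(peak_I / 1e9)            # peak simultaneous infections, in billions
```

<p>With these values the peak comes out close to 4 billion, consistent with the plot below.</p>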
<p><img src="/assets/img/covid19/output_5_0.png" /></p>
<p>There are several important things to notice here.</p>
<h2
id="when-does-the-maximum-number-of-infected-individuals-occur-and-how-high-is-it">When
does the maximum number of infected individuals occur, and how high is
it?</h2>
<p>As we expect, there is an initial exponential growth of infection
that eventually levels off and then decreases after most of the
population has been infected. Recall from the first post that we expect
the infection peak to occur when the fraction of susceptible individuals
is</p>
<p><span class="math display">\[\gamma/\beta = 0.05/0.25 =
0.2\]</span></p>
<p>i.e., when four-fifths of the world population has already been
infected (including those who have recovered). With the current
parameters, this peak occurs around the middle of May and results in
almost 4 billion concurrent cases. Of course, the vast majority of those
cases would not need medical care; based on our estimate of 15%
reporting and less than 20% of reported cases being severe, perhaps only
about 3% of all cases would require medical attention. Even so, that
would mean a peak of 100 million cases requiring medical attention
simultaneously, worldwide.</p>
<h2 id="how-many-people-catch-the-virus">How many people catch the
virus?</h2>
<p>In this scenario, almost everyone in the world is eventually
infected; by day 400 only 54 million susceptible individuals remain.</p>
<h2 id="how-long-does-the-epidemic-last">How long does the epidemic
last?</h2>
<p>After mid-May, the epidemic begins to trail off gradually, but it is
not until about early November that the number of infected drops back to
below 1 million. Thus (in this scenario) we should expect the severe
epidemic to last at least several months. Of course, it would make sense
for those who have already recovered from the virus – which is most of
the world population – to go back to work/school in the early Fall.</p>
<h1 id="variations-on-the-average-scenario">Variations on the average
scenario</h1>
<p>In the last post, we saw that there is significant uncertainty in the
value of <span class="math inline">\(\gamma\)</span> and especially
<span class="math inline">\(\beta\)</span>. How do these predictions
change if we vary <span class="math inline">\(\beta\)</span>? We found
values in the range <span class="math inline">\((0.2, 0.3)\)</span>, so
let’s look at what happens for each of the extremes of this
interval:</p>
<p><img src="/assets/img/covid19/output_8_0.png" /></p>
<p>With a smaller value of <span
class="math inline">\(\beta=0.2\)</span>, the virus spreads more slowly;
the peak occurs near June 1st with just over 3 billion infected. Again,
almost the entire world catches the virus (eventually 150 million
susceptibles remain), and the number of cases does not drop below 1
million until November.</p>
<p><img src="/assets/img/covid19/output_10_0.png" /></p>
<p>With a larger value of <span
class="math inline">\(\beta=0.3\)</span>, the virus spreads more
quickly; the peak occurs in early May with just over 4 billion infected.
The epidemic ends a bit sooner, with the number of cases dropping below
1 million some time in October. Only 19 million people remain
susceptible after a year.</p>
<p>As we can see, even with these fairly large changes in <span
class="math inline">\(\beta\)</span>, the general picture remains the
same. We can expect the epidemic to peak in May or June and last into
the fall.</p>
<h1 id="the-effect-of-mitigation">The effect of mitigation</h1>
<p>As I write this, almost every country in the world is adopting
measures to mitigate the spread of the virus. This includes closing
schools and workplaces, quarantining infected individuals and their
contacts, and encouraging people to stay home.</p>
<p>For simplicity we can view all of these mitigation measures as having
a single effect: increasing the average time between encounters, or in
other words reducing <span class="math inline">\(\beta\)</span>.
Remember that <span class="math inline">\(\beta\)</span> is the average
number of people per day that a given individual has close contact with.
Since <span class="math inline">\(\beta\)</span> is a constant in our
model but the mitigation techniques and their effectiveness may vary
over time, we can incorporate mitigation by adding a new factor <span
class="math inline">\(q(t)\in[0,1]\)</span> multiplying <span
class="math inline">\(\beta\)</span>:</p>
<p><span class="math display">\[\begin{align}
\frac{dS}{dt} & = -q(t)\beta I \frac{S}{N} \\
\frac{dI}{dt} & = q(t)\beta I \frac{S}{N}-\gamma I \\
\frac{dR}{dt} & = \gamma I
\end{align}\]</span></p>
<p>What is the meaning of <span class="math inline">\(q(t)\)</span>? If
there were an absolute quarantine, with no human contact at all, we
would have <span class="math inline">\(q=0\)</span>, whereas if no
measures are implemented then we would have <span
class="math inline">\(q=1\)</span> (corresponding to the predictions
above). In the real world, <span class="math inline">\(q\)</span> will
be somewhere between these extremes.</p>
<p>What is the correct value of <span class="math inline">\(q\)</span>?
Frankly, I have no idea and I doubt that even the experts can say with
confidence. But we can hypothesize some values and explore their impact.
For simplicity, we’ll assume that some value <span
class="math inline">\(q<1\)</span> is achieved through mitigation
measures starting now and lasting for the next <span
class="math inline">\(N_q\)</span> days. After <span
class="math inline">\(N_q\)</span> days, these measures are lifted so
that society (and <span class="math inline">\(\beta\)</span>) returns to
normal.</p>
<p>If we could maintain mitigation measures forever (i.e. <span
class="math inline">\(N_q=\infty\)</span>) then this mitigation would
have exactly the same effect as reducing <span
class="math inline">\(\beta\)</span>. As we saw above, smaller values of
<span class="math inline">\(\beta\)</span> lead to a smaller infection
peak, but a longer epidemic.</p>
<p>Of course, we do not expect mitigation to last forever; people must
go back to work eventually. This can have some surprising effects.</p>
<p>As a first scenario, let’s imagine that <span
class="math inline">\(q=1/2\)</span>; i.e., we are able to reduce the
amount of human contact by 50%, and <span
class="math inline">\(N_q=180\)</span> (about six months).</p>
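<p>In a simulation, <span class="math inline">\(q(t)\)</span> enters simply as a time-dependent multiplier on <span class="math inline">\(\beta\)</span>. A sketch of this scenario (forward Euler, with the same illustrative population and initial values as before):</p>

```python
beta, gamma = 0.25, 0.05        # parameter estimates from the previous post
N = 7.8e9                       # approximate world population
q_value, N_q = 0.5, 180         # mitigation factor and duration (days)

def q(t):
    """Contact-reduction factor: q < 1 while mitigation lasts, 1 afterward."""
    return q_value if t < N_q else 1.0

S, I, R = N - 1e6, 1e6, 0.0     # illustrative initial conditions
dt = 0.1                        # time step in days
peak_I, t = I, 0.0
while t < 730:                  # simulate two years
    flow = q(t) * beta * I * S / N
    S, I, R = S - dt * flow, I + dt * (flow - gamma * I), R + dt * gamma * I
    peak_I = max(peak_I, I)
    t += dt

print(peak_I / 1e9)             # peak in billions: below 2, as in the plot
```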
<p><img src="/assets/img/covid19/output_20_0.png" /></p>
<p>We see that this mitigation makes a substantial difference. Now the
peak point of infection is delayed until late June, with less than 2
billion simultaneous cases at maximum. Notice the bump in the number of
cases around mid-September when the restrictions are relaxed. Another
effect of this mitigation is that the time in which there are some
significant number of cases lasts much longer – in this scenario there
are still more than 1 million cases even 1 year from now. It’s still
true that most of the world catches the virus eventually, but about 480
million susceptibles remain even after two years.</p>
<p>Next let’s consider an even more successful mitigation: suppose that
we cut human contact by three-fifths, so <span
class="math inline">\(q=0.4\)</span>, with measures again lasting <span
class="math inline">\(N_q\)</span>=180 days.</p>
<p><img src="/assets/img/covid19/output_23_0b.png" /></p>
<p>Here the initial growth is even slower, and we see an interesting
phenomenon: the infection appears to reach a peak around mid-September,
right when restrictions are relaxed. But the relaxing of restrictions
allows the virus to suddenly spread much faster, leading to a higher
peak in early October. What’s quite surprising and counterintuitive is
that <strong>even though we have stronger mitigation, the peak is
actually higher in this case than in the previous case</strong>. Why?
It’s because the very strong mitigation for 180 days means that at the
end of that period there is still a large proportion of susceptible
people, leading to stronger growth of the epidemic when mitigation
ends.</p>
<p>Finally, if we have extremely strong mitigation and then suddenly
remove all restrictions, we simply get a delayed version of the
full-scale outbreak:</p>
<p><img src="/assets/img/covid19/output_new.png" /></p>
<p>This is because our mitigation was so effective that hardly anyone
caught the virus and everyone was still susceptible after our 180-day
mitigation ended. Of course, the ideal scenario (in terms of minimizing
infection) would be to maintain strong mitigation until a vaccine can be
deployed.</p>
<h1 id="when-should-we-let-society-go-back-to-normal">When should we let
society go back to normal?</h1>
<p>A natural question at this point is, when is it okay to let everyone
go back to school and to work? There are many possible answers, but
perhaps a reasonable one can be obtained as follows. Let’s assume that
our goal is to ensure that the number of infected <span
class="math inline">\(I(t)\)</span> will not increase when we send
everyone back to work; i.e., when we let <span
class="math inline">\(q\)</span> go back up to 1. We have</p>
<p><span class="math display">\[\begin{align}
\frac{dI}{dt} & = (q\beta \frac{S}{N}-\gamma) I.
\end{align}\]</span></p>
<p>To ensure that <span class="math inline">\(dI/dt<0\)</span> even
with <span class="math inline">\(q=1\)</span>, we need the susceptible
fraction <span class="math inline">\(S/N\)</span> to have fallen to
<span class="math inline">\(\gamma/\beta\)</span> – the same condition
we found previously for when the infection peak would occur (without
mitigation). Using the parameter values we found in the last post, this
means we should wait until 80% of the population has been infected
before returning to normal. This is probably too cautious, since ending
mitigation a bit sooner (but after the initial peak of infection) would
only lead to a small subsequent rise, with a second infection peak
smaller than the first:</p>
<p><img src="/assets/img/covid19/output_26_0b.png" /></p>
<p>Of course, other strategies are possible and could make sense, like
sending people back to work as soon as they have tested positive and
then recovered from the virus. And more complicated mitigation
strategies are possible (and likely) in which some restrictions are
relaxed while others remain in place. All of these will have to be
weighed against the cost they impose on our lives.</p>
<p>In the <a
href="https://github.com/ketch/covid-blog-posts/blob/master/03_Predictions.ipynb">Jupyter
notebook for this post</a> there is an interactive widget that lets you
experiment with the model, including all five of the parameters we have
discussed. The possible effects are quite interesting and we have only
touched on the basics here. I encourage you to try it out!</p>
<h1 id="further-considerations">Further considerations</h1>
<p>There are a number of potentially important factors that we have
ignored here, including:</p>
<ul>
<li>Because countries will have differing strategies, there may be
significantly <strong>earlier infection peaks in some countries and
later peaks in others</strong>. This is likely to be the case if
international travel restrictions remain in place for an extended
period.</li>
<li>Possible <strong>seasonal effects</strong> on the spread of
Coronavirus. Currently there seems to be no way to know if COVID-19 will
be seasonal like the flu.</li>
<li>Development of a <strong>vaccine</strong>. As we have seen,
significantly reducing the number of susceptible individuals can rapidly
halt the spread of a disease, even if not every individual is
vaccinated. So if a vaccine appears and can be mass produced before the
fall, it could substantially shorten the duration of the epidemic and
reduce its impact. On the other hand, it seems that a vaccine arriving
after a year would be too late to have much impact.</li>
</ul>
<p><a
href="http://www.davidketcheson.info/2020/03/20/SIR_exponential.html">In
the next post, we take a deeper look at exponential growth and its
implications for the epidemic</a>.</p>
<h1>Modeling Coronavirus part II -- estimating parameters (2020-03-19)</h1>
<p>Welcome back! In the <a
href="http://www.davidketcheson.info/2020/03/17/SIR_model.html">first
post</a> of this series, we learned about the SIR model, which consists
of three differential equations describing the rate of change of
susceptible (S), infected (I), and recovered (R) populations:</p>
<p><span class="math display">\[\begin{align*}
\frac{dS}{dt} & = -\beta I \frac{S}{N} \\
\frac{dI}{dt} & = \beta I \frac{S}{N}-\gamma I \\
\frac{dR}{dt} & = \gamma I
\end{align*}\]</span></p>
<p>As we discussed, the model contains two key parameters (<span
class="math inline">\(\beta\)</span> and <span
class="math inline">\(\gamma\)</span>) that influence the spread of a
disease. In this second post on modeling the COVID-19 outbreak, we will
take the existing data and use it to estimate the values of those
parameters.</p>
<p>The parameters we want to estimate are:</p>
<ul>
<li><span class="math inline">\(\beta\)</span>: The average number of
people that come in close contact with a given infected individual, per
day</li>
<li><span class="math inline">\(\gamma\)</span>: The reciprocal of the
average duration of the disease (in days)</li>
</ul>
<h2 id="estimating-gamma">Estimating <span
class="math inline">\(\gamma\)</span></h2>
<p>A rough estimate of <span class="math inline">\(\gamma\)</span> is
available directly from medical sources. Most cases of COVID-19 are mild
and recovery occurs after about two weeks, which would give <span
class="math inline">\(\gamma = 1/14 \approx 0.07\)</span>. However, a
smaller portion of cases are more severe and can last for several weeks,
so <span class="math inline">\(\gamma\)</span> will be somewhat smaller
than this value. Estimates I have seen put the value in the range <span
class="math inline">\(0.03 - 0.06\)</span>.</p>
<h2 id="estimating-beta">Estimating <span
class="math inline">\(\beta\)</span></h2>
<p>It’s much more difficult to get a good estimate of <span
class="math inline">\(\beta\)</span>. To be clear, we are trying to
estimate, for an infected individual, the average number of other
individuals with whom they have close contact per day. Here <em>close
contact</em> means contact that would lead to infection of the other
individual (if that individual is still susceptible).</p>
<p>As we discussed earlier, this number is affected by many factors. It
will also be affected by mitigation strategies implemented to reduce
human contact. For now, we want to estimate the value of <span
class="math inline">\(\beta\)</span> <em>in the absence of
mitigation</em>. Later, we will try to take mitigation into account.</p>
<p>Recall that our equation for the number of infected is</p>
<p><span class="math display">\[
\frac{dI}{dt} = \left(\beta \frac{S}{N}-\gamma \right) I(t)
\]</span></p>
<p>Very early in an outbreak, the ratio <span class="math inline">\(S/N
\approx 1\)</span> since hardly anyone has been infected. Also, at
extremely early times, we can ignore <span
class="math inline">\(\gamma\)</span> because the disease is so new that
nobody has been sick for long enough to recover. For COVID-19, this is
true for about the first two weeks of the disease’s spread in a new
population. During that time we have simply</p>
<p><span class="math display">\[
\frac{dI}{dt} = \beta I(t)
\]</span></p>
<p>This is one of the simplest differential equations, and its solution
is just a growing exponential:</p>
<p><span class="math display">\[
I(t) = e^{\beta t} I(0).
\]</span></p>
<p>Here <span class="math inline">\(I(0)\)</span> is of course the
number of initially infected individuals. Thus we can try to estimate
<span class="math inline">\(\beta\)</span> by fitting an exponential
curve to the initial two weeks of spread. This is not the only way to
estimate <span class="math inline">\(\beta\)</span>; using this approach
is the first of several choices that we’ll make, and those choices will
influence our eventual predictions.</p>
<h3 id="getting-the-data">Getting the data</h3>
<p>Fortunately for us, comprehensive data on the spread of COVID-19 is
available from <a href="https://github.com/CSSEGISandData/COVID-19">this
Github repository</a> provided by the Johns Hopkins University Center
for Systems Science and Engineering. Specifically, I’ll be using the
data in <a
href="https://github.com/CSSEGISandData/COVID-19/blob/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Deaths.csv">this
file</a>. Note that the file gets updated daily; as I write it is March
17th.</p>
<p>I’m using Python and Pandas to work with the data. For this blog
post, I have removed most of the computer code, but you can <a
href="https://github.com/ketch/covid-blog-posts/blob/master/02_Estimating_parameters.ipynb">download
the Jupyter notebook</a> and play with the code and data yourself.</p>
<p>To estimate <span class="math inline">\(\beta\)</span>, we just pick
a particular country from this dataset, plot the number of cases over
time, and fit an exponential function to it. We can use a standard
mathematical tool called <em>least squares fitting</em> to find a
reasonable value.</p>
<h2 id="fitting-the-data-from-italy">Fitting the data from Italy</h2>
<p>For instance, here is the data from Italy:</p>
<p><img src="/assets/img/covid19/output_12_0.png" /></p>
<p>Since this data starts back in January, before the virus reached
Italy, the number of cases at the beginning is zero. We can use the
interval from day 30 to day 43 (inclusive) to try to fit <span
class="math inline">\(\beta\)</span>, since this seems to be when the
outbreak began to take off. Here it must be emphasized that the choice
of this particular interval is somewhat arbitrary; different choices
will give somewhat different values for <span
class="math inline">\(\beta\)</span>.</p>
<div class="sourceCode" id="cb1"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="kw">def</span> exponential_fit(cases,start,length):</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> <span class="kw">def</span> resid(beta):</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> prediction <span class="op">=</span> cases[start]<span class="op">*</span>np.exp(beta<span class="op">*</span>(dd<span class="op">-</span>start))</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> <span class="cf">return</span> prediction[start:start<span class="op">+</span>length]<span class="op">-</span>cases[start:start<span class="op">+</span>length]</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> soln <span class="op">=</span> optimize.least_squares(resid,<span class="fl">0.2</span>)</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> beta <span class="op">=</span> soln.x[<span class="dv">0</span>]</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> <span class="bu">print</span>(<span class="st">'Estimated value of beta: </span><span class="sc">{:.3f}</span><span class="st">'</span>.<span class="bu">format</span>(beta))</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> <span class="cf">return</span> beta</span></code></pre></div>
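<p>The snippet above relies on a few globals defined elsewhere in the
notebook (<code>np</code>, <code>optimize</code>, and the day-index
array <code>dd</code>). A self-contained version of the same idea, with
hypothetical synthetic data standing in for the JHU counts, might look
like this:</p>

```python
import numpy as np
from scipy import optimize

# Day indices and synthetic case counts standing in for the JHU data.
# The counts here are hypothetical, generated with beta = 0.247.
dd = np.arange(56)
total_cases = 2.0 * np.exp(0.247 * dd)

def exponential_fit(cases, start, length):
    # Residual between an exponential curve with growth rate beta
    # and the observed counts, over the chosen fitting window
    def resid(beta):
        prediction = cases[start] * np.exp(beta * (dd - start))
        return prediction[start:start + length] - cases[start:start + length]

    soln = optimize.least_squares(resid, 0.2)  # initial guess: beta = 0.2
    return soln.x[0]

beta = exponential_fit(total_cases, start=35, length=14)
print(beta)  # recovers the beta used to generate the data, 0.247
```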
<p>Let’s see how well this value predicts the data:</p>
<div class="sourceCode" id="cb2"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="kw">def</span> plot_fit(cases,start,end<span class="op">=</span><span class="dv">56</span>):</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> length<span class="op">=</span>end<span class="op">-</span>start</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> plt.plot(cases)</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> beta <span class="op">=</span> exponential_fit(cases,start,length)</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> prediction <span class="op">=</span> cases[start]<span class="op">*</span>np.exp(beta<span class="op">*</span>(dd<span class="op">-</span>start))</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> plt.plot(dd[start:start<span class="op">+</span>length],prediction[start:start<span class="op">+</span>length],<span class="st">'--k'</span>)<span class="op">;</span></span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> plt.legend([<span class="st">'Data'</span>,<span class="st">'fit'</span>])<span class="op">;</span></span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> plt.xlabel(<span class="st">'Days'</span>)<span class="op">;</span> plt.ylabel(<span class="st">'Total cases'</span>)<span class="op">;</span></span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> <span class="cf">return</span> beta</span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a> </span>
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a>beta <span class="op">=</span> plot_fit(total_cases,start<span class="op">=</span><span class="dv">35</span>,end<span class="op">=</span><span class="dv">49</span>)</span>
<span id="cb2-12"><a href="#cb2-12" aria-hidden="true" tabindex="-1"></a>plt.title(<span class="st">'Italy'</span>)<span class="op">;</span></span></code></pre></div>
<pre><code>Estimated value of beta: 0.247</code></pre>
<p><img src="/assets/img/covid19/output_16_0.png" /></p>
<p>The fit seems reasonably good over the interval we used. How well
does it match if we plot the fit over the whole time interval?</p>
<div class="sourceCode" id="cb4"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>start<span class="op">=</span><span class="dv">35</span></span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a>plt.plot(total_cases)</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a>dd <span class="op">=</span> np.arange(<span class="bu">len</span>(days))</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a>prediction <span class="op">=</span> total_cases[start]<span class="op">*</span>np.exp(beta<span class="op">*</span>(dd<span class="op">-</span>start))</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a>plt.plot(dd[start:],prediction[start:],<span class="st">'--k'</span>)<span class="op">;</span></span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a>plt.legend([<span class="st">'Data'</span>,<span class="st">'fit'</span>])<span class="op">;</span></span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>plt.xlabel(<span class="st">'Days'</span>)<span class="op">;</span> plt.ylabel(<span class="st">'Total cases'</span>)<span class="op">;</span></span></code></pre></div>
<p><img src="/assets/img/covid19/output_18_0.png" /></p>
<p>Clearly, the prediction is not accurate at later times. There are two
main reasons for this:</p>
<ul>
<li>Our assumption of exponential growth was based on other assumptions
that are only valid at the very start of the outbreak;</li>
<li>Italian society has taken measures to combat the spread of the
virus, effectively reducing <span class="math inline">\(\beta\)</span>
at later times.</li>
</ul>
<p>We can resolve the first issue by using the full SIR model (instead
of just exponential growth) to make predictions. The second issue is
more complicated; we will try to deal with it in a later blog post.</p>
<h2 id="fitting-to-data-from-other-regions">Fitting to data from other
regions</h2>
<p>To have more confidence in our value of <span
class="math inline">\(\beta\)</span>, we can perform a similar fit with
data from other regions, and see if we get a similar value. Next, let’s
try fitting the data from the USA. Here’s the data:</p>
<p><img src="/assets/img/covid19/output_21_0.png" /></p>
<p>Notice that we only have about 1 week of meaningful data. Let’s try
to fit an exponential to it:</p>
<p><img src="/assets/img/covid19/output_23_0.png" /></p>
<p>We get a fairly similar value for <span
class="math inline">\(\beta\)</span>. Furthermore, the fit using this
value seems to be pretty good.</p>
<h3 id="spain">Spain</h3>
<p><img src="/assets/img/covid19/output_26_1.png" /></p>
<h3 id="uk">UK</h3>
<p><img src="/assets/img/covid19/output_28_1.png" /></p>
<h3 id="france">France</h3>
<p><img src="/assets/img/covid19/output_30_0.png" /></p>
<h3 id="hubei-province-china">Hubei Province, China</h3>
<p>Let’s look at the data from where it all started: Hubei province,
China. Here it makes sense to start the fit from day zero of the JHU
data set.</p>
<p><img src="/assets/img/covid19/output_32_0.png" /></p>
<p>Each of these countries seems to fit the model reasonably well and to
give a more or less similar value of <span
class="math inline">\(\beta\)</span>, in the range <span
class="math inline">\(0.22\)</span> to <span
class="math inline">\(0.29\)</span>. It would be wrong to feel
completely confident about this value, or to try to extrapolate too much
from such a short time interval of data, but the consistency of these
results does seem to suggest that our estimate is meaningful.</p>
<p>Now let’s look at some countries that don’t fit this pattern.</p>
<h3 id="iran-and-south-korea">Iran and South Korea</h3>
<p>Here is the number of confirmed cases for Iran:</p>
<p><img src="/assets/img/covid19/output_36_0.png" /></p>
<p>And here is Korea:</p>
<p><img src="/assets/img/covid19/output_38_0.png" /></p>
<p>At a glance we can see that this data doesn’t follow the pattern of
the previous countries. In Iran, after the first week, the growth seems
to be linear. In Korea, the initial exponential growth eventually slows
down drastically and is beginning to level off. This tells us that
something we left out of our model must be at play.</p>
<p>In the case of Korea, it seems straightforward to understand what is
going on. Korea has deployed the most extensive COVID-19 testing system
in the world, with over 270,000 people tested to date. This is combined
with an extensive effort to isolate infected people and those they have
been in recent contact with. Essentially, South Korea has reduced the
value of <span class="math inline">\(\beta\)</span>. Based on our
earlier analysis, to prevent future exponential growth, they will need
to keep <span class="math inline">\(\beta\)</span> down to approximately
the value of <span class="math inline">\(\gamma\)</span> or less. If we
believe that <span class="math inline">\(\gamma \approx 0.05\)</span>
and <span class="math inline">\(\beta \approx 0.25\)</span>, this means
reducing the amount of human contact by infected people by a factor of
five.</p>
<p>Iran’s case is at first more puzzling, since the testing and
quarantine measures there have not been exceptional compared to
countries like Italy and Spain. Instead, there are <a
href="https://www.theatlantic.com/ideas/archive/2020/03/irans-coronavirus-problem-lot-worse-it-seems/607663/">strong</a>
<a
href="https://www.nytimes.com/2020/02/28/world/middleeast/coronavirus-iran-confusion.html">suspicions</a>
that <a
href="https://medicalxpress.com/news/2020-03-covid-outbreak-iran-larger.html">the
official numbers from Iran are wildly inaccurate</a> and the real number
of cases (and deaths) is <a
href="https://www.washingtonpost.com/world/middle_east/coronavirus-pummels-iran-leadership-as-data-show-spread-is-far-worse-than-reported/2020/03/04/7b1196ae-5c9f-11ea-ac50-18701e14e06d_story.html">drastically
higher than what is reported</a>.</p>
<h2 id="problems-with-the-our-approach">Problems with our
approach</h2>
<p>Before we finish, it’s important to understand the limitations of
the data we’re working with and the technique we have used. Most
importantly, the numbers we have certainly <strong>do not represent the
real number of infected individuals</strong>. That’s because many
infected individuals are never tested for the virus. This is especially
true for diseases like COVID-19 in which the majority of cases are mild
and do not require professional medical care. Estimates I have seen
claim that only about 10-20% of all cases are detected.</p>
<p>If we assume that the fraction of cases that are actually detected is
constant over time, then this discrepancy does not hinder our ability to
estimate <span class="math inline">\(\beta\)</span>, since dividing the
initial and final number of infected by the same constant will lead to
the same estimate of <span class="math inline">\(\beta\)</span> that
would be obtained if we counted all the cases. However, it’s clear that
in many places this factor changes over time as a country starts doing
more and more testing. This would cause the number of reported cases to
grow even faster than the real number. This is most likely occurring,
for instance, in the US where previously many individuals with symptoms
were not tested due to a lack of test availability.</p>
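<p>This invariance is easy to check numerically. Here is a small sketch
(synthetic data, not the notebook’s code): fitting an exponential to a
full case count and to a constant 15% sample of it gives the same growth
rate:</p>

```python
import numpy as np

# Synthetic case counts growing exponentially with beta = 0.25
days = np.arange(14)
true_cases = 100 * np.exp(0.25 * days)

# Suppose only a constant 15% of cases are ever detected
reported = 0.15 * true_cases

# Fitting an exponential is a linear fit in log space:
# the slope of log(cases) vs. time is the estimate of beta
beta_true = np.polyfit(days, np.log(true_cases), 1)[0]
beta_reported = np.polyfit(days, np.log(reported), 1)[0]

print(beta_true, beta_reported)  # both 0.25: the constant factor drops out
```

<p>A constant detection fraction only shifts the curve vertically in log
space; the slope, and hence the estimate of <span
class="math inline">\(\beta\)</span>, is unchanged.</p>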
<p>Another issue is that in some cases governments may be intentionally
hiding the true number of infections. As we have seen, this is likely
the case in Iran.</p>
<p>Finally, mitigation strategies may already be in place and
influencing the rate of spread in some countries, even in the early days
of the outbreak. This would lead us to underestimate the natural value of
<span class="math inline">\(\beta\)</span>.</p>
<h1 id="conclusion">Conclusion</h1>
<p>What we can take away from this analysis are the rough estimates for
the SIR parameters:</p>
<p><span class="math display">\[\gamma \approx 0.05\]</span> <span
class="math display">\[\beta \approx 0.25.\]</span></p>
<p>Notice that the behavior in this initial phase of the epidemic that
we have focused on is very similar to the simple behavior we considered
at the start of the first post. There, the number of infected
individuals doubled each day, but we knew that was unrealistic. Here,
the number of infected individuals doubles every few days. How many days
does it take for the number to double? If it takes <span
class="math inline">\(m\)</span> days for the number of cases to double,
then we have</p>
<p><span class="math display">\[
e^{\beta m} = 2
\]</span></p>
<p>so <span class="math inline">\(m = \log(2)/\beta\)</span> where <span
class="math inline">\(\log\)</span> means the natural logarithm. For
<span class="math inline">\(\beta=0.25\)</span>, this gives a doubling
time of about 2.8 days. This growth will slow down somewhat after the
first couple of weeks for reasons we have already discussed.</p>
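<p>As a quick check of that arithmetic (a minimal sketch, not from the
original notebook), here are the doubling times for the range of values
we estimated:</p>

```python
import numpy as np

# Doubling time m = log(2)/beta for the betas estimated above
for beta in (0.22, 0.25, 0.29):
    print(f"beta = {beta:.2f}: cases double every {np.log(2)/beta:.1f} days")
```

<p>This gives doubling times between roughly 2.4 and 3.2 days.</p>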
<p>It should be emphasized that the value of <span
class="math inline">\(\beta\)</span> here is what we expect <strong>in
the absence of mitigation strategies</strong>. In later posts, we’ll
look at what these values mean for the future spread of the epidemic,
and what the potential effect of mitigation may be.</p>
<p>In the <a
href="https://github.com/ketch/covid-blog-posts/blob/master/02_Estimating_parameters.ipynb">Jupyter
notebook for this post</a> there is an interactive setup where you can
make your own fits to the data from a variety of regions.</p>
<p><a
href="http://www.davidketcheson.info/2020/03/19/SIR_predictions.html">Click
here to go to the next post</a>, in which we use what we’ve found to
predict the future.</p>
Modeling Coronavirus part I -- the SIR model2020-03-17T00:00:00+03:00h/2020/03/17/SIR_model<p>This post is the first in a series in which we’ll use a simple but
effective mathematical model to predict the ongoing Coronavirus
outbreak. As I write, the number of officially confirmed global cases is
just under 200,000 and many schools and workplaces all over the world
have closed in order to slow its spread. If you’re like me, you’re
wondering:</p>
<ul>
<li>Am I likely to catch this virus?</li>
<li>How long will it be until my school or workplace opens up
again?</li>
</ul>
<p>My claim is that we can reach some reasonable approximate answers
using straightforward mathematics. Math is quite effective at predicting
the average behavior of large groups, and a little math can go a long
way in telling us what will happen next with COVID-19. My goal here is
to help you make and understand those predictions with little more than
high school mathematics.</p>
<p><em>Disclaimer</em>: I am not an epidemiologist and I make no
guarantees about the predictions we’ll arrive at here. By reading this
you agree not to sue me. What I write here does not reflect the opinion
of my employer or anyone else.</p>
<p>Each post in this series is written in a <a
href="https://jupyter.org/">Jupyter notebook</a>, which you can download
and experiment with yourself if you are so inclined. The notebook for
this first post is <a
href="https://github.com/ketch/covid-blog-posts/blob/master/01_SIR_Model.ipynb">here</a></p>
<h1 id="modeling-the-spread-of-infectious-disease">Modeling the spread
of infectious disease</h1>
<p>An infectious disease spreads from one individual to another.
Consider the following simple model:</p>
<ul>
<li>On day zero, a single individual is infected</li>
<li>On each subsequent day, each infected individual passes the disease
to one more individual</li>
</ul>
<p>How quickly does the number of infected individuals grow?</p>
<p>1, 2, 4, 8, 16, …</p>
<p>On each day, the number of infected doubles! How many days would it
take for everyone on earth to be infected?</p>
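<p>You can answer this with one line of arithmetic. Taking the world’s
population to be about 7.8 billion (my round figure for 2020, not a
number from the post), everyone would be infected in just over a
month:</p>

```python
import math

world_population = 7.8e9  # approximate 2020 world population (assumption)

# Starting from one case and doubling daily, the count after n days is 2**n,
# so we need the smallest n with 2**n >= world_population
days = math.ceil(math.log2(world_population))
print(days)  # 33
```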
<p>This is not a reasonable model for any of the diseases we know of. Of
course, the rate of new infections per infected person (1 per day) was
an arbitrary choice and real values are likely to be smaller. What other
effects are missing from this model?</p>
<ul>
<li><strong>Recovery and immunity</strong>: eventually, an individual
recovers and can no longer infect others</li>
<li><strong>Spread</strong>: The infection can only spread to people who
don’t yet have it. If most of the individuals in contact with an
infected person are already infected, that person is less likely to
spread the disease to someone new</li>
</ul>
<p>Those two factors are vital to understanding the true dynamics of
epidemics. Of course, there are many other important details we have
left out; for instance:</p>
<ul>
<li>The disease may affect different individuals in different ways.</li>
<li>Individuals are spread out geographically.</li>
<li>Certain individuals are likely to infect many others, while others
are less likely. This depends on many factors including culture,
personality, and lifestyle, as well as the mode of transmission of the
disease.</li>
<li>Individuals might take actions to avoid getting infected
(e.g. washing hands, avoiding sick people) or to avoid spreading the
disease (e.g. staying home when sick).</li>
</ul>
<p>All of these effects (and many others) will influence the spread of
the disease. A model that tries to incorporate them all would be very
complex.</p>
<h2 id="the-sir-model">The SIR model</h2>
<p>One of the simplest but most relevant models is based on the idea
that the population consists of three groups:</p>
<ul>
<li><strong>S(t)</strong> Susceptible (those who have not yet been
infected)</li>
<li><strong>I(t)</strong> Infected (those who can currently spread the
disease)</li>
<li><strong>R(t)</strong> Recovered (those who are now immune and cannot
contract or spread the disease)</li>
</ul>
<p>When we write <span class="math inline">\(S(t)\)</span>, we mean that
the number of susceptible individuals (<span
class="math inline">\(S\)</span>) is given as a function of time (<span
class="math inline">\(t\)</span>). We can’t write down this function
exactly; instead it will be described by a <em>differential
equation</em>. Now, differential equations are a bit like <a
href="https://en.wikipedia.org/wiki/Whale_shark">whale sharks</a>: they
sound scary at first, but in reality they are simple and friendly
creatures.</p>
<p>A differential equation is just a description of how some quantity
changes. In this case, we will have three differential equations,
describing the rate of change of <span class="math inline">\(S\)</span>,
<span class="math inline">\(I\)</span>, and <span
class="math inline">\(R\)</span>. The idea is that susceptible people
can become infected and infected people can become recovered:</p>
<p><span class="math display">\[ S \to I \to R\]</span></p>
<p>To define differential equations for the three groups, we only need
to determine the rate at which each of these transitions occurs.</p>
<h3 id="rate-of-infection">Rate of infection</h3>
<p>In our first simple model, we assumed the rate of infection was
proportional to the number of infected. This is very reasonable, but for
someone new to become infected we need both an infected individual
<strong>and</strong> a susceptible one. If we imagine that people
encounter each other randomly at some rate <span
class="math inline">\(\beta\)</span>, then the rate of new infections is
just the number of infected multiplied by the probability of
encountering a susceptible individual:</p>
<p><span class="math display">\[
\frac{dI}{dt} = \beta I \frac{S}{N}.
\]</span></p>
<p>Here <span class="math inline">\(N=S+I+R\)</span> is the total
population, so <span class="math inline">\(S/N\)</span> is the
probability that a randomly chosen individual is susceptible. This is
probably the most complicated point of our discussion, so take some time
to think about it until it makes sense to you.</p>
<p>Of course, since newly infected people were previously susceptible, the
number of susceptible individuals must decrease at the same rate:</p>
<p><span class="math display">\[
\frac{dS}{dt} = -\beta I \frac{S}{N}.
\]</span></p>
<h3 id="rate-of-recovery">Rate of recovery</h3>
<p>The other transition is from infected to recovered. A proper model
for this should involve a time delay, since (for many diseases) newly
infected individuals typically become recovered after a certain interval
of time. For instance, with the flu or the new Coronavirus, the number
of new recovered individuals might depend on how many became infected
about one or two weeks ago. Incorporating such an effect would lead to a
more complicated model known as a <strong>delay differential
equation</strong>.</p>
<p>Instead, we will simply assume that over any time interval, a certain
fraction of the infected become recovered. Denoting the recovery rate by
<span class="math inline">\(\gamma\)</span>, we have</p>
<p><span class="math display">\[
\frac{dR}{dt} = \gamma I.
\]</span></p>
<p>The number of infected must decrease at the same rate, so we must
modify our differential equation for <span
class="math inline">\(I(t)\)</span> to read</p>
<p><span class="math display">\[
\frac{dI}{dt} = \beta I \frac{S}{N}-\gamma I.
\]</span></p>
<h3 id="the-full-model">The full model</h3>
<p>Taking these three equations together, we have</p>
<p><span class="math display">\[\begin{align}
\frac{dS}{dt} & = -\beta I \frac{S}{N} \\
\frac{dI}{dt} & = \beta I \frac{S}{N}-\gamma I \\
\frac{dR}{dt} & = \gamma I
\end{align}\]</span></p>
<p>Notice that if we add the 3 equations together, we get</p>
<p><span class="math display">\[
\frac{dN}{dt} = 0.
\]</span></p>
<p>What do <span class="math inline">\(\beta\)</span> and <span
class="math inline">\(\gamma\)</span> really mean? We can think of <span
class="math inline">\(\beta\)</span> as the number of others that one
infected person encounters per unit time, and <span
class="math inline">\(\gamma^{-1}\)</span> as the typical time from
infection to recovery. So the number of new infections generated by one
infected individual is, on average, <span
class="math display">\[\beta/\gamma = R_0,\]</span> the <strong>basic
reproduction number</strong>.</p>
<h3 id="sir-dynamics">SIR dynamics</h3>
<p>Notice that <span class="math inline">\(S(t)\)</span> can only
decrease and <span class="math inline">\(R(t)\)</span> can only
increase, but <span class="math inline">\(I(t)\)</span> may increase or
decrease. A key question is, under what conditions will <span
class="math inline">\(I(t)\)</span> increase? This will tell us whether
a small number of cases could become an epidemic.</p>
<p>We can write</p>
<p><span class="math display">\[
\frac{dI}{dt} = \left(\beta \frac{S}{N}-\gamma \right) I
\]</span></p>
<p>from which we see that <span class="math inline">\(I(t)\)</span>
grows if <span class="math display">\[\beta S/N >
\gamma.\]</span></p>
<p>Initially in a population we have <span class="math display">\[S/N
\approx 1,\]</span></p>
<p>so an epidemic of some size can occur if <span
class="math inline">\(\beta > \gamma\)</span>. As the epidemic grows,
the ratio <span class="math inline">\(S/N\)</span> becomes smaller, so
eventually the spread slows down.</p>
<p>What fraction of the population must be infected before <span
class="math inline">\(I(t)\)</span> will start to decrease?</p>
<p>The epidemic will begin to subside when <span
class="math display">\[S/N = (\beta/\gamma)^{-1} =
R_0^{-1}.\]</span></p>
<p>This determines the infection peak. After this point, there will
still be new infections but the overall number of infected will
decrease.</p>
<h2 id="an-example">An example</h2>
<p>So what does an epidemic look like, using the SIR model? We can
easily compute the solution using standard numerical methods; I’ve
omitted the code here since I want to focus on the model, but feel free
to look at <a
href="https://github.com/ketch/covid-blog-posts/blob/master/01_SIR_Model.ipynb">the
original Jupyter notebook with code</a> and modify the parameters
yourself.</p>
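<p>If you’d rather see the idea without opening the notebook, here is a
minimal sketch of how such a solution could be computed (illustrative
code, not the notebook’s; it assumes SciPy is available):</p>

```python
import numpy as np
from scipy.integrate import solve_ivp

def sir_rhs(t, y, beta, gamma, N):
    """Right-hand side of the SIR equations above."""
    S, I, R = y
    dS = -beta * I * S / N
    dI = beta * I * S / N - gamma * I
    dR = gamma * I
    return [dS, dI, dR]

N = 1.0                     # normalized population
beta, gamma = 1.0, 0.1      # the parameter values used in the plots here
y0 = [N - 1e-6, 1e-6, 0.0]  # a tiny initial infected fraction

t = np.linspace(0, 60, 601)
sol = solve_ivp(sir_rhs, [0, 60], y0, args=(beta, gamma, N),
                t_eval=t, rtol=1e-8, atol=1e-10)
S, I, R = sol.y

# At the infection peak, S/N should have dropped to gamma/beta = 0.1
print(S[np.argmax(I)] / N)
```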
<p><img src="/assets/img/covid19/01_1.png" /></p>
<p>Here I’ve set <span class="math inline">\(N=1\)</span> so the numbers
on the vertical axis represent fractions of the total population.
Initially, only a tiny fraction is infected while the remainder is
susceptible. The plot above shows the typical behavior of an epidemic:
an initial rapid exponential spread, until much of the population is
infected or recovered, at which point the number of infections begins to
decline.</p>
<p>According to our analysis above, the number of infections should
begin to decrease when the susceptible fraction <span
class="math inline">\(S/N\)</span> is equal to <span
class="math inline">\(\gamma/\beta\)</span>. Here I’ve taken <span
class="math inline">\(\beta=1\)</span> and <span
class="math inline">\(\gamma=1/10\)</span>, so the infection peak should
occur when the susceptible population has dropped to 1/10. Let’s
check:</p>
<p><img src="/assets/img/covid19/01_2.png" /></p>
<p>The results are in perfect agreement with what we predicted. In the
<a
href="https://github.com/ketch/covid-blog-posts/blob/master/01_SIR_Model.ipynb">original
notebook</a> there is an interactive model where you can adjust the
parameters and see the results.</p>
<p><a
href="https://mybinder.org/v2/gh/ketch/covid-blog-posts/master"><img
src="https://mybinder.org/badge_logo.svg" alt="Binder" /></a></p>
<p><a
href="http://www.davidketcheson.info/2020/03/19/SIR_Estimating_parameters.html">Click
here to go to the next post</a>, in which we look at real-world data to
estimate the values of <span class="math inline">\(\beta\)</span> and
<span class="math inline">\(\gamma\)</span>.</p>
How and why I'm teaching my kids to code2014-12-09T00:00:00+03:00h/2014/12/09/teaching_kids_to_program<p>I think that most of the world today drastically underestimates kids
– and by so doing, often harms them. Kids love learning, creating, and
achieving. We do them no service by providing everything for them, or by
“protecting” them from challenging tasks. This troubling trend is
manifest across the physical, social, and mental aspects of life. But
today I want to focus on one thing to which I think we should introduce
kids earlier and oftener: programming.</p>
<h1 id="why-programming">Why programming?</h1>
<p>In my children’s school (which I consider to be quite good), students
are introduced to computers early on. But this introduction focuses on
things like preparing a PowerPoint presentation, writing an essay in
Word, etc. Learning to use a computer by focusing on such canned
applications is a bit like learning to cook by mastering the operation
of a microwave. Yes, it will allow you to produce edible results –
assuming your local supermarket has a freezer section – but it hardly
acquaints you with the breadth of the culinary arts. With a microwave,
the only choice you really have in the process is how long to heat the
thing up – all other details have been determined for you, in advance,
by someone else. You cannot choose a recipe that fits your tastes or
dietary preferences, and you certainly can’t adapt a recipe or invent
something new.</p>
<p><strong><em>To really learn to cook, you have to start working with
the raw ingredients. To really learn to compute, you have to learn to
program.</em></strong></p>
<p>I think it is a shame to go through life without ever learning to
cook, since food is such a central part of human existence. But that
concern is mainly philosophical. Those of the coming generation who go
through life without ever learning to program will be, in a sense,
relegated to second-class status, unable to understand, control, or
create that which governs so much of life. Think of it: how much of your
time is spent interacting directly or indirectly with some electronic
device? For the computationally-illiterate, the landscape of daily life
is thus one of immovable, incomprehensible objects which they must adapt
to or work around. But for those who can program, these objects become
tools that are understood and can be modified to fit any desired
purpose.</p>
<h1 id="piquing-their-interest">Piquing their interest</h1>
<p>Small children are fascinated by whatever they see adults doing. All
three of my daughters (ages 9, 6, and 2) are interested in programming,
though I have never told them they should be. They became interested in
programming by seeing me at it. Of course, simply typing in a terminal
doesn’t really grab their attention. But they get curious when I’m
running wave simulations, and ask questions about the visualizations. It
was surprising to me to find that even quite abstract things can grab
their interest, if there is an interesting plot to go along with it. But
when I recently showed them simulations of water waves, their excitement
became palpable.</p>
<p>I was running some simple simulations of waves breaking on a beach,
for a talk I was to give in front of a general scientific audience. The
girls began asking what-if questions, and the experimental fun began. We
put a big wall on the beach and then tested how big the waves needed to
be before they would go over the wall. Then we added a big dip in front
of the wall. We tried starting the simulation with all the water flowing
in toward the shore. And so forth. They came up with the ideas, and I
would implement them. Importantly, the code that sets up the problem was
easy to change and run in just a matter of seconds. I think if they had
had to wait even a full minute to find out “what if”, I would have lost
them.</p>
<figure>
<img src="/assets/img/shallow_water_fun.jpg"
alt="Solving PDEs: fun for all ages" />
<figcaption aria-hidden="true">Solving PDEs: fun for all
ages</figcaption>
</figure>
<p>Did they learn how to program from this? No, none of them typed any
code, and I made only a minimal attempt to explain to them the code I
was changing. But they understood that by typing instructions, one can
make a computer do whatever one wants. They learned that computers can
be used to answer fun and interesting questions. And they got a little
exposure to some programming tools and concepts. Most importantly, they
<strong>want</strong> to understand how to make the computer do whatever
they can imagine.</p>
<h1 id="simple-programming-for-kids">Simple programming for kids</h1>
<p>There are a number of tools designed to give kids a “softer”
introduction to programming. Perhaps the best-known is MIT’s <a
href="">Scratch</a>. I guess the idea is that the connection between
typed instructions and computer output is too abstract. Also, young
children may still be developing reading, writing, and typing skills. So
the text editor is replaced by a GUI with cute animals and buttons that
add actions. This may be great for some kids, but again there is the
sense that one is only learning to microwave pre-cooked meals. My
experience in introducing my oldest daughter to programming (at 6) is
that she was much more excited by the blank slate of a Python
interpreter.</p>
<p>Of course, we didn’t jump into decorators and class inheritance.
There should be fast, fun feedback, especially at the start when the
learning curve is steepest. My daughter got a huge kick out of learning
she could make the computer talk (using the system command “say” on a
Mac). This was easily incorporated as part of programming some simple
games (like guessing a number or hangman). Those games naturally
introduce simple ideas like loops and if-statements. The goal is always
to create something fun or useful; the programming ideas are only
incidental. In my opinion, programming should work that way at all ages
and all levels.</p>
<p>Some of the things she has programmed so far include:</p>
<ul>
<li>A “guess the number” game (that tells you to guess higher/lower at
each iteration)</li>
<li>A game that asks simple math questions</li>
<li>Hangman (this one was easier than I expected, though she didn’t
implement any graphics)</li>
<li>A very simple adventure game (that lets you move around an imaginary
world)</li>
<li>A “countdown to Christmas” that announces the number of hours and
days left before Christmas. I helped her use crontab to make it run
every hour.</li>
</ul>
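<p>In case you want to try something similar with your own kids, here is
a minimal sketch of a number-guessing game like the one described above
(this is hypothetical code, not my daughter’s actual program; on
non-Mac systems the <code>say</code> fallback just prints instead of
speaking):</p>

```python
import random
import subprocess
import sys

def hint(secret, guess):
    """Tell the player which direction to guess next."""
    if guess < secret:
        return "higher"
    if guess > secret:
        return "lower"
    return "correct"

def say(text):
    """Speak aloud via the Mac 'say' command; elsewhere, just print."""
    if sys.platform == "darwin":
        subprocess.run(["say", text])
    else:
        print(text)

def play():
    """One interactive round; call play() to try it yourself."""
    secret = random.randint(1, 100)
    while True:
        guess = int(input("Guess a number from 1 to 100: "))
        result = hint(secret, guess)
        say(result)
        if result == "correct":
            break
```

<p>Even a game this small naturally introduces variables, loops,
if-statements, and functions – and the talking computer is what makes it
fun.</p>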
<h1 id="maintaining-interest">Maintaining interest</h1>
<p>My oldest daughter is 9 now, and while I haven’t taught her as much
programming as I’d like, she continues to be interested in and excited
about it. Let me mention a couple of things that I think have helped
maintain that excitement:</p>
<ul>
<li><strong>Ownership</strong>: when she is programming, the program is
hers. I don’t write code for her, although in some cases I have nearly
dictated bits and pieces before explaining them. I also don’t impose the
design or details of the end goals. I provide ideas and suggestions when
she is stuck, and I ask lots of probing questions.</li>
<li><strong>Freedom</strong>: I let her dictate the pace. We don’t have
a regularly-scheduled time for her to learn – we do it when the fancy
strikes. If she becomes frustrated or bored, we stop.</li>
<li><strong>Fun</strong>: Every project is something that she has
decided would be fun. Sometimes the ideas come from her, and sometimes
from me, but the decision to pursue one is hers.</li>
</ul>
<h1 id="difficulties">Difficulties</h1>
<p>Along the way, I’ve run into some challenges that I haven’t solved.
My daughter often comes up with projects that would be much too
complicated, especially where graphics are concerned. I try to find
something similar that would be simple enough. When working on a
project, she usually wants to do it in a way that is far from my
preconceived “optimal” implementation. I try to be patient and
hands-off, and to let her learn for herself from her attempts. She also
tends to get interested in some non-essential aspect of a project, which
may not involve much programming skill – like making the computer say a
lot of silly things when you guess wrong in the number-guessing game.
Again, I try to be enthusiastic and not to interfere. Most importantly,
our programming sessions are never too serious and are not a source of
tension.</p>
<p>The biggest challenge for me is that teaching programming can be
frustrating. It’s easy to forget how difficult the programming mindset
is. Teaching requires a lot of patience as the student grapples with
ideas that seem obvious to the teacher. It’s important that the teacher
not jump in and “fix” the student’s work – the grappling (however slow
and painful it may seem) is essential to learning. I try to stick to the
Socratic method – that is, I can only guide by asking questions. I also
find that my daughter benefits a lot (though she never wants to do it)
from “rubber ducking”, which means reading the code out loud.</p>
<h1 id="hour-of-code">Hour of Code</h1>
<p>This morning I participated in part of an amazing world-wide effort
to help kids learn to code. Last year it exposed 15 million kids to
programming. The idea is that each kid spend at least one hour learning
about programming. A number of teachers at the KAUST schools have chosen
to participate. If you want to start teaching your own kids to program –
or if you want to learn! – the <a href="http://hourofcode.com/us">Hour
of Code website</a> has modules for all levels. For instance, in my
daughter’s 3rd-grade class the kids worked through a set of lessons
using a graphical interface (moving code around with a mouse, rather
than typing) in order to make Elsa and Anna (from Frozen) ice skate in
snowflake-shaped patterns. The lessons are an amazingly well-designed
sequence that also teaches some geometry and is very appealing to kids.
My hat is off to the people behind Hour of Code and all the teachers who
use it to make programming part of their curriculum.</p>
<h1 id="conclusion">Conclusion</h1>
<p>Programming literacy presupposes literacy in reading and writing. For
future generations, these two types of literacy will be of similarly
profound importance. Computer programs run our world. If approached in
the right way, programming can be a fun and playful pastime that builds
creativity and reasoning skills while teaching kids to see the devices
that surround them as malleable tools rather than some kind of opaque
oracle.</p>
<p>One last note: you may be thinking, “but I only have boys. Can boys
learn to program too?” Sure they can, and you’d better teach them now or
they won’t have a chance against all the great women coders of the
future. ;-)</p>
Clawpack turns 202014-12-01T00:00:00+03:00h/2014/12/01/clawpack-20<p>Twenty years ago, version 1.0 of the Conservation LAWs PACKage
(CLAWPACK, now <a href="http://clawpack.org">Clawpack</a>) was first
released by <a href="http://faculty.washington.edu/rjl/">Randy
LeVeque</a>. It seems fitting to take the occasion to look back on the
intervening years. What follows are my thoughts on some of the great
things that have resulted.</p>
<p><img src="/assets/img/clawpack_bday.jpg" /></p>
<p>As far as I can tell, <a
href="http://www.netlib.org/na-digest-html/94/v94n44.html#5">this item
in the NA-Digest</a> is the first public announcement of its existence.
It was also announced more verbosely the same year in <a
href="http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=70F5140D445F7B89AF162827E10443A7?doi=10.1.1.48.3424&rep=rep1&type=pdf">this
conference paper</a>, from the proceedings of the 5th HYP conference.
Reading that conference paper now, I am struck by how it incorporated
many of the ideals of scientific software development that we now
discuss as if they were new ideas. For instance,</p>
<ul>
<li>Code that is <strong>easy to read and use</strong>, with plentiful
<strong>documentation and examples</strong>.</li>
<li><strong>Modular design</strong>, that allows low-level functions to
be reused and disparate parts of the code to be modified independently.
In the case of Clawpack, this is epitomized by the fact that it allows
the solution of <em>any</em> system of hyperbolic PDEs by changing just
a single routine (the Riemann solver).</li>
<li>An interface that allows methods and parameters to be changed
easily, so that <strong>different methods can be conveniently
compared</strong>.</li>
<li>Clawpack was proposed as a <strong>benchmark</strong> against which
to easily <strong>test new algorithms</strong>.</li>
<li>Clawpack was released <strong>open source</strong> and for
<strong>free</strong> on a public FTP server (netlib).</li>
</ul>
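<p>To see what “changing just a single routine” means in practice, here
is a toy sketch in Python/NumPy (this is <em>not</em> Clawpack’s actual
interface – the real Riemann solvers are Fortran routines that return
waves, speeds, and fluctuations – and the function names here are made
up): a generic first-order finite-volume step that knows nothing about
the physics, with all PDE-specific information supplied through a
Riemann-solver callback.</p>

```python
import numpy as np

def advection_rp(q_l, q_r, speed=1.0):
    """Riemann solver for scalar advection q_t + a*q_x = 0.

    Returns the upwind flux at the interface between states q_l and q_r
    (take the left state when the speed is positive).
    """
    return speed * np.where(speed > 0, q_l, q_r)

def step(q, dx, dt, riemann_solver):
    """One first-order Godunov step on a periodic grid.

    The PDE being solved enters only through riemann_solver.
    """
    # Flux f[i] at the interface between cell i and cell i+1.
    f = riemann_solver(q, np.roll(q, -1))
    # Conservative update: difference of fluxes on each cell's two edges.
    return q - dt / dx * (f - np.roll(f, 1))

# Advect a square pulse for one step.
x = np.linspace(0, 1, 100, endpoint=False)
q = np.where((x > 0.2) & (x < 0.4), 1.0, 0.0)
q_new = step(q, dx=0.01, dt=0.005, riemann_solver=advection_rp)
```

<p>Swapping <code>advection_rp</code> for a solver for, say, the shallow
water equations would reuse <code>step</code> unchanged – that
separation of the update algorithm from the physics is the heart of
Clawpack’s design.</p>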
<p>In this day of so much ado about credit for software, it’s also
interesting to view this paper as an early example of a mathematical
publication that is all about software.</p>
<p>Looking through the code snippets in the paper, I was astonished to
recognize how much of the original Fortran 77 code remains virtually
unaltered – including many variable names, function interfaces, and
overall design. This is a testament to the quality of the original code
design.</p>
<p>The central algorithms in Clawpack have also stood the test of time.
The 1980s saw the heyday of research into second-order TVD methods for
conservation laws, and Clawpack was released just as that era came to a
close. Since then, research has gone in other directions – high-order
methods, well-balancing, and positivity preservation, to name a few.
While these new directions have provided additions to Clawpack, the
“classic” algorithms have not changed and are still hard to beat as a
robust general-purpose tool.</p>
<p>Of course, much has happened in the intervening twenty years. The
original library handled 1- and 2-dimensional problems on regular
cartesian grids. In the next few years, subsequent versions added
algorithms for <a
href="http://faculty.washington.edu/rjl/pubs/wp3d/index.html">3D</a>,
mapped grids, and <a
href="http://faculty.washington.edu/rjl/pubs/amrclaw/index.html">adaptively
refined meshes</a>.</p>
<p>Additional algorithmic innovations are too numerous to try to list,
but one that has had a lot of impact is <a
href="http://epubs.siam.org/doi/abs/10.1137/S106482750139738X">the
f-wave technique</a>.</p>
<p>The problems to which Clawpack has been applied are certainly
<strong>much</strong> too numerous to list. But you can start to get an
idea by looking at citations of major Clawpack papers like <a
href="http://scholar.google.com/scholar?cites=17725852758276924945&as_sdt=2005&sciodt=0,5&hl=en">this</a>,
<a
href="http://scholar.google.com/scholar?cites=7413565796955546825&as_sdt=2005&sciodt=0,5&hl=en">this</a>,
<a
href="http://scholar.google.com/scholar?cites=13042879341964900053&as_sdt=2005&sciodt=0,5&hl=en">this</a>,
<a
href="http://scholar.google.com/scholar?cites=13610946858631858859&as_sdt=2005&sciodt=0,5&hl=en">this</a>,
and <a
href="http://scholar.google.com/scholar?cites=7380133178147045066&as_sdt=2005&sciodt=0,5&hl=en">this</a>.
Perhaps the heaviest use in recent years has involved geophysical flows
such as tsunamis and storm surges, in GeoClaw.</p>
<h2 id="the-clawpack-family-of-codes">The Clawpack family of codes</h2>
<p>Clawpack has spawned numerous offshoots and extensions, including
(but not limited to) <a
href="http://www.clawpack.org/amrclaw.html">AMRClaw</a>, <a
href="http://mitran-lab.amath.unc.edu:8084/redmine/projects/bearclaw/wiki">BearClaw</a>,
<a href="http://cedb.asce.org/cgi/WWWdisplay.cgi?121287">ZPLClaw</a>, <a
href="http://www.clawpack.org/doc/pyclaw/solvers.html?highlight=sharpclaw#pyclaw.sharpclaw.solver.SharpClawSolver">SharpClaw</a>,
<a
href="https://scholarworks.aub.edu.lb/handle/10938/9322">CUDAClaw</a>,
<a
href="http://math.boisestate.edu/~calhoun/ForestClaw/">ForestClaw</a>,
<a href="https://github.com/manyclaw/manyclaw">ManyClaw</a>, <a
href="http://www.clawpack.org/pyclaw/index.html">PyClaw</a>, and <a
href="http://www.clawpack.org/geoclaw.html">GeoClaw</a>. Some of these
have become part of the Clawpack-5 suite while others have forked and
gone in other directions.</p>
<p>Nowadays, the term Clawpack refers to a collection of interrelated
packages that are maintained and developed at <a
href="http://github.com/clawpack">github.com/clawpack</a>. They
include:</p>
<ul>
<li>The original (“Classic”) Clawpack;</li>
<li>AMRClaw (with adaptive mesh refinement)</li>
<li>GeoClaw (with special tools for geophysical flows)</li>
<li>PyClaw (a Python interface to both the classic code and the
high-order “SharpClaw” algorithms)</li>
<li>Riemann (a library of approximate Riemann solvers, which can be used
with all of the above codes)</li>
<li>VisClaw (an in-house visualization tool)</li>
</ul>
<p>The Github organization also includes repositories for the docs and
for contributed applications.</p>
<h2 id="the-clawpack-community">The Clawpack community</h2>
<p>As far as I know, the original release was a one-person effort. But
like most open-source projects, Clawpack quickly became a broader
collaboration. I won’t attempt to credit everyone here; you can see some
of the major contributors <a
href="http://www.clawpack.org/about.html#authors">here</a>, and many
more by looking at the contributors pages on Github.</p>
<p><img src="/assets/img/HPC3-2012-group-photo.jpg" /></p>
<p>I was surprised to realize that I’ve now been involved with Clawpack
for half of its existence – ten years! During those years I’ve gotten to
work with a group of exceptional researchers who are also just
outstanding people. They say that the culture of an open-source software
community is shaped strongly by its founder, and I think Clawpack is no
exception.<br />
It seems to me that its creator is not only a great applied
mathematician, but also someone who consistently leads the way in terms
of improving the way we do science. Clawpack exemplifies his commitment
to reproducibility and sustainable scientific software development, long
before those words came into scientific vogue. He was an advocate for
publishing in journals with low subscription prices, long before open
access became a movement.<br />
Most significantly, he has always been interested first in finding and
solving interesting problems, and only secondarily in publishing papers.
Both through his personal influence and as chair of the SIAM Journals
Committee, he has been influential in making progress in these
directions, including the establishment of a Software section in SISC,
the acceptance by SIAM journals of supplementary materials (including
code), and a new policy allowing authors to post published articles on
their own or institutional websites.</p>
<p><img src="/assets/img/hpc3_attendees.jpg" /></p>
<p>As a result, the culture surrounding Clawpack has always encouraged
openness and a willingness to accept new contributions. Furthermore, I
think that the Clawpack developers have maintained a healthy skepticism
toward our own algorithms and code. Although we try to make our code
useful to as many people as possible, there has never been any attempt
to evangelize the community in order to increase use of a particular set
of algorithms or to increase metrics like citation counts. Because of
this attitude, the code is continually improved through incorporation of
new algorithmic innovations.</p>
<h2 id="lessons-learned">Lessons learned</h2>
<p>Of course, it would be wrong to say that Clawpack has been a perfect
model for scientific code development. There are plenty of things we’ve
done wrong or could learn to improve.</p>
<p>The original announcement says that “contributions from other users
will be gratefully accepted,” and that has always been true.
Nevertheless, the widely accepted development model for a long time was
that most users would take the code, fork it, and make their own
enhancements that would never get back to the main codebase. While this
prevented feature bloat, it also meant that a great wealth of knowledge
– largely in the form of sophisticated approximate Riemann solvers –
would perish on some dusty hard drive rather than benefiting the larger
community. We’re trying to change that now by encouraging users to
submit pull requests for Riemann solvers and for entire
applications.</p>
<p>Another example of where I see room for improvement is in output and
visualization, where we have, to some degree, reinvented the wheel.
Clawpack has long used custom ASCII and binary file formats that can
only be read in by Clawpack (or by reverse-engineering code for the
relatively simple formats). We are now pushing to move to a more
standard default format (probably HDF5), which would allow easier
integration with standard visualization and post-processing
libraries.</p>
<p>On the visualization side, the Clawpack developers have created some
extremely useful tools for plotting time-dependent data on structured
grids (including block-structured AMR). These tools sit on top of MATLAB
and matplotlib. A large amount of work has gone into these “in-house”
tools rather than into leveraging and contributing to dedicated
visualization tools. Meanwhile, individual users have occasionally
connected Clawpack to powerful visualization tools, but their custom
code never got back to the main codebase. The limited capabilities of
matplotlib in 3D seem to finally be providing sufficient impetus to
force us to integrate with a sophisticated visualization library. I have
been working lately on integration with <a
href="http://yt-project.org">yt</a>.</p>
<h2 id="the-next-20-years">The next 20 years?</h2>
<p>It may come as a surprise for a code that’s so long in the tooth, but
I think Clawpack development at present is more vibrant than ever. Since
2011, we’ve held annual developer workshops, the latest of which took
place last week here at KAUST. The pictures on this page are from those
workshops (the cake in the first photo, which shows the Clawpack logo,
was made by my wife, and is a fondant version of a fluid-dynamical
shockwave hitting a low-density bubble).</p>
<p>As for the future, I won’t claim enough clairvoyance to see 20 years
ahead. But here are some things I hope we can accomplish in the next few
years:</p>
<ul>
<li><strong>Massively parallel adaptive mesh refinement</strong> (at
present, it exists only in the unreleased ForestClaw code; a concurrent
effort aims to bring this to PyClaw through BoxLib);</li>
<li>An ever-growing <strong>library of Riemann solvers</strong> for
increasingly complex systems;</li>
<li><strong>Code-generation</strong> for solving problems where a custom
Riemann solver is not yet available;</li>
<li>Incorporation of code that runs on <strong>accelerators</strong>
(like CUDAClaw and ManyClaw) into the main library in a way that allows
users to change hardware seamlessly;</li>
<li>More <strong>teaching tools</strong> based on Clawpack and IPython
notebooks, including a book showcasing Riemann solutions of important
physical systems;</li>
<li><strong>Additional algorithms</strong> (such as Discontinuous
Galerkin methods and new time stepping techniques) that can be accessed
through the same problem setup and use the same Riemann solvers;</li>
<li>Better <strong>interoperability</strong> between Clawpack and other
codes (such as <a href="http://proteus.usace.army.mil/">Proteus</a>), by
making Clawpack more of a true library.</li>
</ul>
<p>Are you excited yet? I certainly am. Come join the fun!</p>
<p><img src="/assets/img/clawlogo.jpg" /></p>
Teaching in the open2014-07-18T00:00:00+03:00h/2014/07/18/teaching_in_the_open<p>If you examine the menu bar above, you’ll notice that my site has a
new top-level page: <a href="/teaching.html">Teaching</a>. This is a
direct result of my attending <a
href="http://www.youtube.com/watch?v=1e26rp6qPbA&t=26m12s">Greg
Wilson’s inspiring keynote at Scipy 2014</a>. That link will take you to
the key (for me) part of the talk, but I recommend watching the whole
thing. His message is: massive collaboration is the real revolution.
Michael Nielsen made the same statement in <em>Reinventing
Discovery</em>; here Greg applies this statement to university
education, and asks:</p>
<blockquote>
<p>Why don’t instructors open-source their teaching materials?</p>
</blockquote>
<p>This new page is my own effort to enable that revolution. In fact,
I’ve been gradually putting my teaching materials online for the past
couple of years, without giving it much thought. The teaching page
collects all the resources I’ve made available in one place.</p>
<p>Something even more exciting in this vein is coming in the fall. If
you want to know a little about it, watch <a
href="http://www.youtube.com/watch?v=TWxwKDT88GU&t=56m2s">the last
few minutes of Lorena Barba’s excellent Scipy keynote on computational
thinking and teaching</a>.</p>
<p>Stay tuned.</p>
KAUST goes open access2014-07-01T00:00:00+03:00h/2014/07/01/KAUST_goes_open_access<p>I’m proud to announce that as of today, KAUST has officially adopted
an open access policy!</p>
<h1 id="what-it-means">What it means</h1>
<p>Institutional open access (OA) policies are a primary tool in the
effort to allow academics to retain control of their own work. MIT and
Harvard were the first to adopt such policies; now hundreds of
institutions have similar policies.</p>
<p>In short, <strong>the policy ensures that KAUST has non-exclusive
rights to distribute all research done at KAUST. This right precedes any
publishing or copyright agreement terms</strong>. It also places a
responsibility on KAUST faculty to provide a pre-print of each paper to
the library.</p>
<p>The policy has nothing to do with publishing in open access journals
(so-called Gold OA). Authors continue to publish in the same manner –
and the same journals – as before.</p>
<p>KAUST’s OA policy is based closely on the text <a
href="http://cyber.law.harvard.edu/hoap/Good_practices_for_university_open-access_policies">recommended
by the Harvard Open Access Project (HOAP)</a>. HOAP was an extremely
valuable resource for us in developing a policy and convincing the
faculty, administration, and legal team to approve it.</p>
<h1 id="how-it-happened">How it happened</h1>
<p>This is the culmination of a process that started back in 2011 with a
lunch conversation between Rick Johnson (KAUST librarian and long-time
OA advocate) and myself. We were both frustrated that KAUST theses were
being “published” in a way that was inaccessible to anyone outside the
University. Over the next several months, we successfully worked to
ensure that all KAUST theses would be accessible for free to the general
public. In fact, the first thesis to be published openly (and for months
the only one) was <a
href="http://archive.kaust.edu.sa/kaust/handle/10754/209415">that of my
MS student, Manuel Quezada de Luna</a>.<br />
Now anyone can read the currently 367 completed KAUST theses <a
href="http://archive.kaust.edu.sa/kaust/handle/10754/124545/">here</a>.</p>
<p>We decided the next order of business was a full institutional
open-access policy. With strong support from Jim Calvin (VP for academic
affairs) and my faculty colleagues Suzana Nunes and Sahraoui Chaieb, we
eventually hammered out something that all could agree on (even the
lawyers!). The policy was championed by our new library director, Molly
Tamarkin, as soon as she arrived at KAUST earlier this year.</p>
<h1 id="the-policy">The policy</h1>
<p>Here’s the full text of the policy, which at the moment is only
available on an internal site. I’ll post a link here when a public
announcement is made.</p>
<blockquote>
<p>University faculty members, research scientists, post-doctoral
fellows, students and employees (“University Research Authors”) grant to
the University non-exclusive permission to make available their
scholarly research articles and to exercise the copyright in those
articles for the purpose of open dissemination.</p>
</blockquote>
<blockquote>
<p>More specifically, each University Research Author grants to the
University a non-exclusive, irrevocable, worldwide license to exercise
any and all rights under copyright relating to each of his or her
scholarly research articles, in any medium, provided that the articles
are not sold for a profit, and to authorize others to do the same.</p>
</blockquote>
<blockquote>
<p>The Office of the Vice President for Academic Affairs or its
designate may waive application of the license for a particular article
or delay access for a specified period of time upon express direction by
the author.</p>
</blockquote>
<blockquote>
<p>Each faculty member or researcher will provide an electronic copy of
the author’s final version of each article no later than the date of its
publication at no charge in accordance with the guidelines published
from time to time by the Office of the Vice President for Academic
Affairs.</p>
</blockquote>
<blockquote>
<p>The Office of the Vice President for Academic Affairs charges the
KAUST Library to develop and monitor a plan to comply with this policy
and existing copyright obligations in a manner as convenient for the
faculty as possible.</p>
</blockquote>
<blockquote>
<p>The Office of the Vice President for Academic Affairs or its delegate
will be responsible for interpreting this policy, resolving disputes
concerning its interpretation and application, and recommending changes
to the Academic Council from time to time.</p>
</blockquote>
<blockquote>
<p>The KAUST Library will review this policy after three years.</p>
</blockquote>
Teaching with SageMathCloud2014-05-31T00:00:00+03:00h/2014/05/31/teaching_with_SMC<p>During the past Spring semester at KAUST, I again taught AMCS 252,
our masters-level course on numerical analysis for differential
equations. I’ve been teaching the course using Python for 5 years now.
This year, for the first time, <em>I didn’t spend any time helping
students install Python, numpy, matplotlib, or scipy</em>. In fact, I
even had them use Clawpack – and they didn’t need to install it. Why?
Because they all used <a
href="http://cloud.sagemath.com">SageMathCloud</a> for the course.</p>
<h2 id="a-little-history">A little history</h2>
<p>For the past several years, I have been increasingly integrating into
the course <a href="https://github.com/ketch/AMCS252">a set of
electronic notebooks</a> in which the students are presented with some
explanations and code, followed by exercises that involve modifying,
running, and understanding the numerical algorithms implemented in the
notebook. At first these were a set of Sage worksheets, and I ran a
local Sage server within the KAUST network. When the VM that held the
server died a horrible and irreversible death, I decided to switch to
the IPython notebook format that had become increasingly popular. It
wasn’t too hard to <a
href="http://www.davidketcheson.info/2013/01/16/sage_to_ipython.html">convert
all my Sage worksheets to IPython notebooks</a>. But my students had to
either do all their work in the computer lab or figure out how to
install the necessary Python packages on their own machines. This was a
bit of a time sink for me, although it has gotten easier each year
thanks to packages like <a
href="https://store.continuum.io/cshop/anaconda/">Anaconda</a> and <a
href="https://www.enthought.com/products/canopy/">Canopy</a>. This also
meant that they all ended up working in slightly different environments,
which occasionally caused problems.</p>
<h2 id="ipython-notebooks-in-the-cloud">IPython notebooks in the
cloud</h2>
<p>In the last year, two new cloud services emerged, both offering free
accounts with the ability to run IPython notebooks:</p>
<ul>
<li><a href="http://wakari.io">Wakari</a></li>
<li><a href="http://cloud.sagemath.com">Sage Math Cloud</a></li>
</ul>
<p>I realized that by using one of these services, I could avoid dealing
with installation issues and ensure that everyone worked in an identical
environment. Though I have found both Wakari and SMC to be useful, I
ended up going with SMC for the course because it has, in my opinion, a
more intuitive user interface.</p>
<h2 id="getting-started">Getting started</h2>
<p>On the first day of class, students had only to create a free SMC
account, create a new project, and type the URL of the course github
repo into the “new file” box, which automatically caused it to be cloned
into their SMC project. As I updated materials during the semester, all
they had to do was open a SMC terminal and type “git pull” (in fact,
none of the students had ever used git before, but none of them had any
difficulty with this during the course).</p>
<p><img src="https://cloud.sagemath.com/art/templates.png" alt="Git clone via SMC" height="200" align="center"></p>
<p>Another great advantage of using a cloud service was that students
could work or show their work from any computer. Since it was a small
class, I had them present homework solutions in-class. They could all
present solutions using the computer attached to the projector in the
room by just logging into their own SMC account. That meant we avoided
losing 5 or 10 minutes of class time in order to switch cables or
transfer files.</p>
<h2 id="feedback">Feedback</h2>
<p>Overall, the students’ feedback was very positive. Most notably,
although some of them did eventually install Python and the related
packages locally on their laptops, they all chose to use SMC for their
homework assignments throughout the course. There were some noticeable
latency issues (the ping time between Saudi Arabia and Seattle is
200ms), and SMC currently has a 10-20 second delay the first time you
open an IPython notebook (there’s no such delay for Sage worksheets).
But those were not showstoppers, and I think by the time I teach my next
course those issues will be resolved (by an IPython upgrade on SMC and
by the launch of a European SMC server, respectively). William Stein,
creator of SMC (and Sage) was extremely responsive and helpful (in fact,
he created a trial European server recently in response to my and
others’ comments about latency).</p>
<p><img src="https://dl.dropboxusercontent.com/u/656693/smc_screenshot.png" alt="SMC" align="center"></p>
<p>I used SMC again to <a
href="https://github.com/ketch/HyperPython/blob/master/README.md">teach
a 1-day tutorial</a> at <a href="http://jkk.sze.hu/en_GB/program">a
workshop</a> this month. Other than a couple of minor hiccups, it again
worked very well. I plan to continue using it for teaching in the
future. One feature I haven’t used yet (but intend to) is the ability to
“collaborate” on a project so that multiple users can edit it at the
same time. I understand that <a
href="http://sagemath.blogspot.com/2014/04/the-sagemathcloud-roadmap.html">many
other great features are in the works</a>.</p>
<p>I would strongly recommend SMC to other teachers of
computationally-oriented courses, even if you’re not using IPython
notebooks or Sage worksheets. As long as all the software for your
course is freely available, you can install it all on SMC so that
students have identical environments, accessible from anything with a
web browser, with no need to do any installation of their own.</p>
<p>If you’re interested in my notebooks, you can find them here:</p>
<ul>
<li><a href="https://github.com/ketch/finite-difference-course">Spring
2013 course</a></li>
<li><a href="https://github.com/ketch/AMCS252">Spring 2014
course</a></li>
<li><a href="https://github.com/ketch/HyperPython">HyperPython
tutorial</a></li>
</ul>
<p>Just be warned that some are more polished than others, and they’re
likely to all get a makeover soon.</p>
<p>Now that I keep a lot of my <a
href="https://github.com/ketch/shallow_water_periodic_bathymetry/blob/master/pyclaw/shallow_water_diffraction.ipynb">research
in IPython notebooks on Github</a>, I’m also thinking that SMC is a way
to be able to show that research to anyone, anywhere. Heck, I can create
a project, clone a Github repo, and run PyClaw in a notebook <strong>on
my phone!</strong> Just amazing.</p>
HyperPython2014-05-28T00:00:00+03:00h/2014/05/28/hyperpython<p><img src="https://raw.githubusercontent.com/ketch/HyperPython/master/figures/finite_volume.png" alt="Finite volumes" height="200" align="center"></p>
<p>Last week, I ran a 1-day tutorial at the <a
href="http://jkk.sze.hu/en_GB/program">Workshop on Design, Simulation,
Optimization and Control of Green Vehicles and Transportation</a>. The
idea was to teach attendees about Python programming, basic theory of
hyperbolic conservation laws, finite volume methods, and how to use <a
href="http://clawpack.github.io/doc/pyclaw/">PyClaw</a>, all in the
space of a few hours.</p>
<p>Inspired by Lorena Barba’s recent release of <a
href="http://lorenabarba.com/blog/announcing-aeropython/">AeroPython</a>,
I decided to develop a short set of IPython notebooks for the tutorial.
The result is <a
href="https://github.com/ketch/HyperPython">HyperPython</a>, a set of 5
lessons (plus Python crash course):</p>
<ul>
<li><a
href="http://nbviewer.ipython.org/github/ketch/HyperPython/blob/master/Lesson_00_Python.ipynb">Lesson
0: Python</a></li>
<li><a
href="http://nbviewer.ipython.org/github/ketch/HyperPython/blob/master/Lesson_01_Advection.ipynb">Lesson
1: Advection</a></li>
<li><a
href="http://nbviewer.ipython.org/github/ketch/HyperPython/blob/master/Lesson_02_Traffic.ipynb">Lesson
2: Traffic</a></li>
<li><a
href="http://nbviewer.ipython.org/github/ketch/HyperPython/blob/master/Lesson_03_High-resolution_methods.ipynb">Lesson
3: High-resolution methods</a></li>
<li><a
href="http://nbviewer.ipython.org/github/ketch/HyperPython/blob/master/Lesson_04_Fluid_dynamics.ipynb">Lesson
4: Fluid dynamics</a></li>
<li><a
href="http://nbviewer.ipython.org/github/ketch/HyperPython/blob/master/Lesson_05_PyClaw.ipynb">Lesson
5: PyClaw</a></li>
</ul>
<p>These won’t make you an expert, but if you’re looking for something
short, practical, and fun, please give them a try. You may also find the
last two notebooks useful if you’re looking for a good introduction to
PyClaw.</p>
<p>These may be greatly expanded in the future into a full-fledged
semester-length course.</p>
Open access is about open access, not journals2013-12-13T00:00:00+03:00h/2013/12/13/open_access_means_open_access<p>In October, <a
href="http://news.sciencemag.org/sites/default/files/media/Open%20Access%20SurveySummary_11082013_0.pdf">Science
Magazine conducted a survey regarding open access</a>. Among the
questions:</p>
<ul>
<li><em>How important is it for scientific papers to be freely
accessible to the public?</em></li>
<li><em>Of the papers that you published in the last 3 years, what
percentage did you submit to fully open access journals?</em></li>
</ul>
<p><strong>72%</strong> replied “extremely important” to the first
question, while only <strong>58%</strong> indicated they had submitted
any paper to an open access journal. Does this mean that scientists are
not acting in agreement with their own principles?</p>
<p><strong>No!</strong></p>
<p>It may shock the editors of Science, but the open access movement is
not about changing the funding model for academic publishers.</p>
<p><strong>Open access means that research results can be read by
anyone, for free.</strong></p>
<p>Scientists can accomplish that without any help from publishers. The
fact is that most scientists don’t view <em>open access journals</em> as
the best way to make their work accessible. Another question from the
Science survey asked</p>
<ul>
<li><em>Which options for making papers freely available do you
prefer?</em></li>
</ul>
<p>The most common answer (66%) was <strong>“Immediate access through a
repository, such as PubMedCentral or Arxiv, or on an author’s web
site”</strong>.</p>
<p>This is quick and painless. <a
href="http://www.sherpa.ac.uk/romeo/statistics.php?la=en&fIDnum=%7C&mode=simple">It
is allowed by an overwhelming majority of publishers</a>. It requires no
mandates from governments or universities. It requires no extra funding.
Anyone can do it, and every scientist who cares a whit about open access
already has done it.</p>
<p>If someone tells you that we need governments or publishers to
intervene to make open access possible, you can be sure that his agenda
is something other than open access. The only obstacle left is our own
apathy.</p>
A Tale of Two Theorems2013-10-14T00:00:00+03:00h/2013/10/14/CFL-disk<p>In their <a href="http://dx.doi.org/10.1147/rd.112.0215">celebrated
1928 paper</a>, Courant, Friedrichs, and Lewy proved a geometric
condition that must be satisfied by any convergent discretization of a
<strong>partial</strong> differential equation – the
famous CFL condition. Briefly, the CFL theorem says that the numerical
method must transport information at least as quickly as information
travels in the true PDE solution. The proof is geometric and is conveyed
through numerous diagrams.</p>
<p>Exactly fifty years later, in 1978, Rolf Jeltsch and Olavi Nevanlinna
<a href="http://dx.doi.org/10.1007/BF01932030">published a theorem</a>
[JN] that deals with bounding the modulus of a polynomial <span
class="math inline">\(\psi(z)\)</span> over a disk of the form <span
class="math display">\[D_r = \{z \in \mathbb{C} : |z+r|\le
r\}.\]</span> Their theorem says that if <span
class="math inline">\(\psi(z) = 1 + z + a_2 z^2 + \cdots + a_s
z^s\)</span> and <span class="math inline">\(|\psi(z)|\le 1\)</span> for
all <span class="math inline">\(z\)</span> in such a disk <span
class="math inline">\(D_r\)</span>, then the disk radius <span
class="math inline">\(r\)</span> is at most <span
class="math inline">\(s\)</span>. The proof of this result is, of
course, purely algebraic.</p>
<p>These results apparently have nothing to do with one another. And yet
it turns out that <strong>they are equivalent statements!</strong> That
is, the CFL theorem can be proved using the JN disk theorem. And the JN
disk theorem can be proved using the CFL condition (and no algebraic
techniques). This was explained in <a
href="http://dx.doi.org/10.1007/BF01389633">a beautiful paper of
Sanz-Serna and Spijker</a> [SS] in 1986, and the result deserves to be
much more well known.</p>
<h3 id="first-order-upwinding">First order upwinding</h3>
<p>Consider the problem of approximating the value <span
class="math inline">\(u(x_i,t_n)\)</span> for the advection equation
<span class="math display">\[u_t + u_x = 0.\]</span> The exact solution
can be obtained by characteristics from the previous time level: <span
class="math display">\[u(x_i,t_n) = u(x_i-k,t_{n-1}),\]</span> where
<span class="math inline">\(k\)</span> is the time step size. The CFL
theorem says that the stencil used for approximating <span
class="math inline">\(u(x_i,t_n)\)</span> must enclose the point <span
class="math inline">\(x_i-k\)</span>.</p>
<p>Let’s discretize the advection equation in space using upwind
differences: <span class="math display">\[U_i'(t) =
-\left(U_i-U_{i-1}\right).\]</span> Here for simplicity we’ve assumed a
spatial mesh width of 1. Taking periodic boundary conditions, this
semi-discretization is a system of ODEs of the form <span
class="math inline">\(U'=LU\)</span> where <span
class="math inline">\(L\)</span> is the circulant matrix <span
class="math display">\[
\begin{pmatrix}
-1 & & & 1 \\
1 & -1 & & \\
& \ddots & \ddots \\
& & 1 & -1 \\
\end{pmatrix}\]</span> (as usual, all the omitted entries are zero). The
eigenvalues of this matrix all lie on the boundary of the disk of radius
one centered at <span class="math inline">\(z=-1\)</span>, which we
denote by <span class="math inline">\(D_1\)</span>. Here are the
eigenvalues of a 50-point discretization:</p>
<p><img
src="https://dl.dropboxusercontent.com/u/656693/wiki_images/disk_eigen.png" /></p>
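<p>These eigenvalues are easy to verify numerically. Here is a short
Python/NumPy check (an illustrative sketch, not code from the original
computation) that builds the 50-point upwind matrix and confirms that
every eigenvalue lies on the boundary of <span
class="math inline">\(D_1\)</span>:</p>
<pre><code>import numpy as np

# Upwind semi-discretization U' = L U with periodic BCs and mesh width 1:
# L is circulant with -1 on the diagonal and 1 on the subdiagonal.
N = 50
L = -np.eye(N) + np.roll(np.eye(N), 1, axis=0)
lam = np.linalg.eigvals(L)

# Every eigenvalue satisfies |lambda + 1| = 1, i.e. it lies on the
# boundary of the disk D_1 centered at z = -1.
print(np.allclose(np.abs(lam + 1), 1.0))</code></pre>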
<p>If we discretize in time with Euler’s method, we get the scheme <span
class="math display">\[U^n_i = U^{n-1}_i -
k\left(U_i-U_{i-1}\right).\]</span> This scheme computes the solution at
<span class="math inline">\((x_i,t_n)\)</span> using values at <span
class="math inline">\((x_{i-1},t_{n-1})\)</span> and <span
class="math inline">\((x_i,t_{n-1})\)</span>, so the CFL theorem says it
can be convergent only if <span class="math inline">\(x_i-k\)</span>
lies in the interval <span class="math inline">\((x_{i-1},x_i)\)</span>.
Since <span class="math inline">\(x_{i-1} = x_i - 1\)</span>, this holds
iff the step size <span class="math inline">\(k\)</span> is smaller than
1.</p>
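<p>This threshold is easy to observe in practice. The following Python
experiment (a sketch for illustration; the number of steps and the use of
random initial data are arbitrary choices) applies the upwind/Euler
scheme repeatedly for step sizes on either side of <span
class="math inline">\(k=1\)</span>:</p>
<pre><code>import numpy as np

def upwind_euler_growth(k, N=50, steps=200, seed=0):
    """Max-norm of the solution of the periodic upwind/Euler scheme
    U^n_i = U^{n-1}_i - k (U^{n-1}_i - U^{n-1}_{i-1})
    after the given number of steps, from random initial data."""
    L = -np.eye(N) + np.roll(np.eye(N), 1, axis=0)
    A = np.eye(N) + k * L              # one Euler step
    U = np.random.default_rng(seed).random(N)
    for _ in range(steps):
        U = A @ U
    return np.max(np.abs(U))

print(upwind_euler_growth(0.9))   # bounded: k below the CFL limit
print(upwind_euler_growth(1.1))   # explodes: k above the CFL limit</code></pre>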
<p>This result – that the first-order upwind method is stable and
convergent only for CFL number at most one – is well known, and can also
be derived using basic method of lines stability theory. The stability
function for Euler’s method is <span class="math inline">\(\psi(z) = 1 +
z\)</span>, so it is stable only if <span
class="math inline">\(z=k\lambda\)</span> lies in the disk <span
class="math inline">\(\{z : |1+z|\le 1\} = D_1\)</span> for each
eigenvalue <span class="math inline">\(\lambda\)</span> of <span
class="math inline">\(L\)</span>. What we have seen in the foregoing is
that this stability condition can be derived directly from the CFL
condition, without considering the eigenvalues of <span
class="math inline">\(L\)</span> or the stability region of Euler’s
method.</p>
<h3 id="proving-the-jn-disk-theorem-via-the-cfl-theorem">Proving the JN
disk theorem via the CFL theorem</h3>
<p>For higher order discretizations, the CFL condition is necessary but
not generally sufficient for stability. Nevertheless, we can use it to
derive the JN disk theorem. I’ll restrict the explanation here to
Runge-Kutta methods, but the extension to multistep methods is very
simple. Suppose that we discretize in time using a Runge-Kutta method
with <span class="math inline">\(s\)</span> stages. In each stage, one
point further to the left is used, so typically the stencil for
computing <span class="math inline">\(u(x_i,t_n)\)</span> includes the
values from the previous step at <span class="math inline">\(x_{i-s},
x_{i-s+1}, \dots, x_i\)</span>. Thus the CFL theorem says the method
cannot be convergent unless <span class="math inline">\(x_i-k\)</span>
lies in the interval <span class="math inline">\((x_{i-s},x_i)\)</span>;
i.e., unless <span class="math inline">\(k\le s\)</span>. Meanwhile, the
stability function <span class="math inline">\(\psi(z)\)</span> of the
Runge-Kutta method is a polynomial of degree at most <span
class="math inline">\(s\)</span>. Method of lines analysis tells us that
the full discretization is stable if <span
class="math inline">\(kD_1\)</span> lies inside the region <span
class="math inline">\(\{z : |\psi(z)|\le 1\}.\)</span> Since we know it
is unstable for <span class="math inline">\(k>s\)</span>, this
implies that if <span class="math inline">\(|\psi(z)|\le 1\)</span> over
the disk <span class="math inline">\(D_k\)</span>, then <span
class="math inline">\(k \le s\)</span>.</p>
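<p>The bound is sharp: the degree-<span class="math inline">\(s\)</span>
polynomial <span class="math inline">\(\psi(z) = (1+z/s)^s\)</span>
(a standard example, not taken from [JN]) has the required form <span
class="math inline">\(1 + z + \cdots\)</span> and satisfies <span
class="math inline">\(|\psi(z)| \le 1\)</span> on all of <span
class="math inline">\(D_s\)</span>. A quick numerical check in
Python:</p>
<pre><code>import numpy as np

s = 4
theta = np.linspace(0, 2 * np.pi, 1000)
z = -s + s * np.exp(1j * theta)    # boundary of D_s: |z + s| = s

# psi(z) = (1 + z/s)^s = 1 + z + ... has degree s, and |psi| = 1
# on the whole boundary of D_s, so the JN bound r <= s is attained.
psi = (1 + z / s) ** s
print(np.allclose(np.abs(psi), 1.0))</code></pre>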
<h3 id="recap">Recap</h3>
<ol type="1">
<li>An <span class="math inline">\(s\)</span>-stage upwind
discretization has stencil width <span
class="math inline">\(s\)</span>.</li>
<li>The CFL condition implies that this discretization cannot be
convergent for Courant numbers larger than <span
class="math inline">\(s\)</span>.</li>
<li>The spectrum of the semi-discretization is the boundary of the disk
<span class="math inline">\(D_1\)</span>.</li>
<li>Stability analysis implies that the full discretization is
convergent if the scaled spectrum <span class="math inline">\(kD_1 =
D_k\)</span> lies inside the stability region of the time
discretization.</li>
<li>Thus no <span class="math inline">\(s\)</span>-stage time
discretization can have a stability region containing a disk larger
than <span class="math inline">\(D_s\)</span> (this is the content of
the JN disk theorem).</li>
</ol>
<h3 id="ellipses">Ellipses</h3>
<p>Of course, we didn’t have to choose first-order upwinding in space;
we could have taken any spatial discretization. For instance, if we use
centered differences: <span class="math display">\[U_i'(t) =
-\tfrac{1}{2}\left(U_{i+1}-U_{i-1}\right)\]</span> then the spectrum of the
semi-discretization lies on the imaginary axis in the interval <span
class="math inline">\([-i,i]\)</span>. The same line of reasoning
then tells us that the largest imaginary-axis interval of stability for
an <span class="math inline">\(s\)</span>-stage method is <span
class="math inline">\([-is,is]\)</span>. By considering convex
combinations of upwind and centered differences, we get similar results
for a family of ellipses; this is the content of Theorem 5 of [SS].</p>
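<p>The imaginary-axis claim is also easy to check numerically. In the
sketch below (my own illustration), the centered difference carries the
standard factor of 1/2, which is what places the spectrum in <span
class="math inline">\([-i,i]\)</span>; the matrix is skew-symmetric, so
its eigenvalues are purely imaginary:</p>
<pre><code>import numpy as np

# Centered differences, periodic BCs, mesh width 1:
# U_i' = -(U_{i+1} - U_{i-1})/2, i.e. U' = L U with L = (E - E^T)/2,
# where E is the periodic shift (E U)_i = U_{i-1}.
N = 50
E = np.roll(np.eye(N), 1, axis=0)
L = (E - E.T) / 2
lam = np.linalg.eigvals(L)

print(np.allclose(lam.real, 0.0))               # purely imaginary
print(np.max(np.abs(lam.imag)) <= 1.0 + 1e-12)  # contained in [-i, i]</code></pre>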
<h3 id="parabolic-problems">Parabolic problems</h3>
<p>It’s well known that the largest interval of stability of a
consistent <span class="math inline">\(s\)</span>-stage method on the
negative real axis has length <span class="math inline">\(s^2\)</span>;
the corresponding polynomials are (shifted) Chebyshev polynomials. You
might hope that this could also be deduced by considering a centered
difference semi-discretization of the heat equation and applying the CFL
theorem. That would be very neat, since it would provide a connection
between PDE stability theory and the optimality of Chebyshev
polynomials.</p>
<p>Indeed, explicit time discretizations generally lead to step size
restrictions depending on the square of the spatial mesh width when
paired with the usual centered spatial discretization. But the CFL
theorem is not sharp for these discretizations; it only tells us that
<span class="math inline">\(k\)</span> must vanish more quickly
than the spatial mesh width. So no deduction along these lines seems
possible.</p>
<p>#spnetwork #recommend doi:10.1007/BF01389633</p>
<p>#discusses doi:10.1147/rd.112.0215 #discusses
doi:10.1007/BF01932030</p>
Documentation, testing, and default arguments for your MATLAB packages2013-10-12T00:00:00+03:00h/2013/10/12/MATLAB-docs-testing<p>I primarily develop code in Python and Fortran, but I also use MATLAB
for certain things. For instance, I haven’t found a Python-friendly
nonlinear optimization package that measures up to the capabilities of
MATLAB’s optimization toolbox (fmincon). So my RK-opt package for
optimizing Runge-Kutta methods is written entirely in MATLAB.</p>
<p>The trouble is that working in Python has spoiled me for other
languages. Python has the excellent <a
href="http://sphinx-doc.org/">Sphinx</a> package for writing
<strong>beautiful documentation</strong>. Python has the <a
href="http://nose.readthedocs.org/">nosetests</a> harness for easily
writing and running <strong>tests</strong>. And Python has <a
href="http://www.diveintopython.net/power_of_introspection/optional_arguments.html">a
simple syntax for including <strong>optional function arguments</strong>
with default values</a>.</p>
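<p>For reference, the Python pattern in question is just keyword
arguments with default values, e.g. (a made-up illustration, not code
from RK-opt):</p>
<pre><code>def bisect(f, a, b, tol=1e-10, max_iter=50):
    """Bisection with optional tolerance and iteration cap."""
    for _ in range(max_iter):
        m = (a + b) / 2
        if f(a) * f(m) <= 0:
            b = m
        else:
            a = m
        if b - a < tol:
            break
    return (a + b) / 2

# Any subset of the optional arguments can be overridden, by name:
root = bisect(lambda x: x * x - 2, 0, 2, max_iter=80)
print(abs(root - 2 ** 0.5) < 1e-8)</code></pre>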
<p>MATLAB doesn’t support any of these things so elegantly*.</p>
<p>*<em>This was true one year ago, when I started writing this. But it
seems things have improved – see below</em>.</p>
<p>In any case, all is not lost – I have found reasonable approximations
in the MATLAB ecosystem, and in some cases I’ve adapted the Python tools
to work with MATLAB.</p>
<h3 id="documenting-matlab-projects-using-sphinx">Documenting MATLAB
projects using Sphinx</h3>
<p>In principle, Sphinx can be used to write documentation for packages
written in any language. However, its <a
href="http://sphinx-doc.org/ext/autodoc.html">autodoc</a> functionality,
which automatically extracts Python docstrings, doesn’t work with
MATLAB. For RK-Opt, I hacked together a simple workaround in <a
href="https://github.com/ketch/RK-opt/blob/master/doc/m2rst.py">this
74-line Python file</a>. It goes through a given directory, extracts the
MATLAB docstring for each function, and compiles them into an .rst file
for Sphinx processing. You can see an <a
href="http://numerics.kaust.edu.sa/RK-opt/RK-coeff-opt.html">example of
the results here</a>.</p>
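<p>The idea behind the script is simple enough to sketch in a few lines.
The following is a simplified illustration (not the actual m2rst.py),
assuming each docstring is the leading %-comment block after the
<code>function</code> line:</p>
<pre><code>import os

def matlab_docstring(path):
    """Return the leading %-comment block of a MATLAB file."""
    doc = []
    with open(path) as f:
        for line in f:
            s = line.strip()
            if s.startswith('function'):
                continue            # docstring follows the function line
            elif s.startswith('%'):
                doc.append(s.lstrip('%').strip())
            else:
                break
    return '\n'.join(doc)

def directory_to_rst(directory):
    """Collect the docstrings of every .m file into one .rst document."""
    parts = []
    for name in sorted(os.listdir(directory)):
        if name.endswith('.m'):
            title = name[:-2]
            parts.append(title + '\n' + '=' * len(title))
            parts.append(matlab_docstring(os.path.join(directory, name)))
    return '\n\n'.join(parts)</code></pre>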
<p><strong>Update</strong>: <em>as I’m writing this, I’ve discovered a
new <a
href="https://bitbucket.org/bwanamarko/sphinx-contrib/src/tip/matlabdomain/README.rst">MATLAB
extension for Sphinx’s autodoc</a>. I will have to try it out sometime;
please let me know in the comments if you’ve used it.</em></p>
<h3 id="automated-testing-in-matlab">Automated testing in MATLAB</h3>
<p>I’ve become convinced that writing at least one or two tests is
worthwhile for even small, experimental packages. In Python, it’s simple
to include tests in the docs and run them with doctest, or write test
suites and run them with nosetests. For MATLAB, I would have recommended
the third-party <a
href="http://www.mathworks.com/matlabcentral/fileexchange/22846-matlab-xunit-test-framework">xunit
framework</a>. But it seems that this year <a
href="http://www.mathworks.com/help/matlab/matlab-unit-test-framework.html">Mathworks
finally added this functionality to MATLAB</a>. Even so, you might want
to use xunit because <a
href="https://github.com/tgs/matlab-xunit-doctest">it’s possible to run
doctests with it</a> but not with MATLAB’s new built-in framework. Also,
you can get XML output from xunit, which a number of other tools can
analyze (for instance, to tell you about code coverage). For an example
of how to use xunit, <a
href="https://github.com/ketch/RK-opt/blob/master/RK-coeff-opt/test_rkopt.m">see
RK-Opt</a>.</p>
<p>Again, I’d be interested to hear from you in the comments if you’ve
used MATLAB’s new built-in test harness.</p>
<h3 id="optional-arguments-with-default-values">Optional arguments with
default values</h3>
<p>MATLAB does allow the user to specify only some subset of the input
arguments to a function – as long as the omitted ones all come after the
included ones. I used to take advantage of this, with this kind of code
inside the function:</p>
<pre><code>% supply defaults for any omitted trailing arguments
if nargin<5, rmax=50; end
if nargin<4, eps=1.e-10; end</code></pre>
<p>This is a reasonable solution in very small functions, but it breaks
if you want to add new arguments that don’t come at the end, and if you
want to specify the very last value then you have to specify them all. A
better general solution is the <a
href="http://www.mathworks.com/help/matlab/ref/inputparserclass.html">inputParser
object</a>. It’s much less natural than Python’s syntax, but the result
for the user is the same: arbitrary subsets of the optional arguments
can be specified; default values will be used for the rest. As a bonus,
you can check the types of the inputs. <a
href="https://github.com/ketch/RK-opt/blob/master/polyopt/opt_poly_bisect.m#L258">Here’s
an example of usage</a>.</p>
<p>If you know of better ways to do any of these things, please let me
know in the comments!</p>
<p>Of course, it’s entirely possible to develop large, well-documented,
well-tested, user-friendly packages in MATLAB – <a
href="http://www.chebfun.org/">Chebfun</a> is one example. It’s just
that this is the exception and not the rule in the MATLAB community.
Hopefully better integration with testing and documentation tools will
improve this situation.</p>
Giving a math talk using IPython notebook slides and Wakari2013-09-21T00:00:00+03:00h/2013/09/21/ipython_notebook_slides_talks<h1
id="giving-a-math-talk-using-ipython-notebook-slides-and-wakari">Giving
a math talk using IPython notebook slides and Wakari</h1>
<p>Last week I gave my first full-length <em>executable talk</em>: one
in which I showed the code that produced (almost) all the results I
presented. You can <a
href="http://www.davidketcheson.info/talks/SciCADE-talk.slides.html#/">see
the talk</a> and <a
href="https://www.wakari.io/sharing/bundle/ketch/SciCADE-talk">run the
talk on Wakari</a> (or download it and run it locally). All you need is
Python with its scientific packages (numpy, scipy, sympy – I recommend
just installing <a href="http://www.continuum.io/downloads">Anaconda
Python</a> if you haven’t already). I took things a step further and
actually ran a bunch of demo code live on Wakari. I was excited
beforehand, and judging by the number of people that came into the room
immediately before my talk (and left immediately afterward), so
was the audience. But I was disappointed with how it went. Here’s
why.</p>
<p><strong>Composing a talk in an IPython notebook is
counterintuitive</strong>. When I give a talk, I try to tell a
compelling and coherent story. This requires a certain mindset, and
somehow the IPython notebook hinders rather than helps – at least, for
me. I think there is too much of a disconnect between how things look
when I’m writing them and how they look as slides. In theory Beamer should be
worse in this respect, but in practice it felt worse with the notebook.</p>
<p><strong>It is hard to engage your audience with code</strong>. Almost
nobody can digest complicated formulas during a talk, which is why even
when I speak to mathematicians I usually have very few equations and
lots of pictures. Well, the same goes for code – nobody can digest more
than a few simple lines on a slide. I think I did a good job of keeping
the code short, high-level, and intuitive, but it still felt flat.</p>
<p><strong>Code in the talk needs to execute very quickly</strong>. This
is obvious for code that you run as a live demo, but I found it
necessary also for code snippets that I didn’t run live (but where I
wanted to show the results). That’s because when you recompile your talk
(which I do <em>many, many times</em> during the composing process), you
have to wait for all that code to execute again. It doesn’t help that
things seem to run significantly slower on Wakari than on my laptop.</p>
<p><strong>The IPython notebook format is not (yet) good at displaying
graphs and tables</strong>. Talks full of text put people to sleep, and
code is text, so this kind of talk already has a strike against it. But
to make matters worse, I can’t insert images into my notebook slides
without putting an ugly line of code above them. And the notebook
refuses to let me embed vector graphics formats (like PDF), so I have to
degrade them to slightly blurry pngs.</p>
<p><strong>It’s hard to judge how long a code-based talk will
take</strong>. I usually judge conservatively so I can move at a relaxed
pace. But my demo took much longer than I planned (partly due to the
difficulty of using a Spanish keyboard), and I had to rush through the
last third of the talk in about 2 minutes. I guess this is something to
learn with practice.</p>
<p><strong>The default fonts in notebook-converted slides are just too
small</strong>. They are fine for someone sitting at a computer screen,
but much too small for the projector screen at the front of a large
room. You can adjust the size in the browser using ‘+’, but the result
looks ugly for some reason. I know the fonts can be changed using CSS,
and I’ll make them larger next time.</p>
<p>For me, the worst condemnation of any talk is that no questions are
asked afterward. I haven’t had that happen in a long time, but this was
close: there was only one question, and that question demonstrated that
I had completely failed to convey what was going on behind a lot of the
code I had shown.</p>
<p>It feels too soon to give up on this approach to talks; I will try it
again some time. Perhaps I just haven’t found the right use for this
medium. If you have tried giving a similar talk, I’d love to hear your
opinion or suggestions.</p>
<p>One note about the slides: parts of them will not make sense in the
absence of my verbal explanations. I generally avoid including a lot of
explanatory text in the slides. I actually added a lot more than usual
in this case because I was planning to post them online.</p>
Don't scrap the DOE CSGF program2013-06-23T00:00:00+03:00h/2013/06/23/csgf-letter-to-congress<p>The US federal government <a
href="http://energy.gov/sites/prod/files/2013/04/f0/Volume4.pdf">has
proposed to eliminate a number of smaller graduate fellowship
programs</a> and lump them together with the <a
href="http://www.nsfgrfp.org/">NSF graduate fellowship</a>.
Unfortunately, this includes the Dept. of Energy’s illustrious <a
href="http://www.krellinst.org/csgf/">Computational Science Graduate
Fellowship (CSGF) program</a>, of which I’m an alumnus. I think the CSGF
program is an irreplaceable asset that is nurturing the <em>third
pillar</em> of science across disciplines in a way that a much larger
program never could.</p>
<p>Here are just a few of the definite, tangible impacts the CSGF
program had on me, off the top of my head:</p>
<ul>
<li>Because of the program of study requirements, I took a course in
optimization, without which I would never have written <a
href="http://dx.doi.org/10.1090/S0025-5718-09-02209-1">this paper</a>,
parts of <a href="http://arxiv.org/abs/1105.5798">this paper</a>, and
probably <a href="http://dx.doi.org/10.2140/camcos.2012.7.247">this
paper</a>.</li>
<li>I met David Keyes (a member of the steering committee) and came to
KAUST! I almost certainly would not be here if it weren’t for the CSGF
program.</li>
<li>I met Carl Boettiger and learned about using Jekyll for open
notebook science, resulting in the site you are reading.</li>
</ul>
<p>Here is the letter I sent to four congressional committee members
asking them to save the program. If you know the CSGF program and its
significance, I urge you to do so too.</p>
<blockquote>
<p>Dear Senator/Congressman,</p>
</blockquote>
<blockquote>
<p>I am writing to you because of your leadership role on the
Appropriations Subcommittee on Energy and Water Development. I am an
applied mathematician and an alumnus of the DOE computational science
graduate fellowship (CSGF) program. I am writing because I have learned
that funding for the CSGF program is slated to be merged into a much
larger NSF graduate fellowship program. I think this would be a terrible
decision, because it would destroy the unique benefits of the CSGF
program.</p>
</blockquote>
<blockquote>
<p>The CSGF program was the second federal graduate fellowship that I
received while pursuing a Ph.D. The first was a fellowship from the
Dept. of Homeland Security. I would like to emphasize the value of the
CSGF program by contrasting it with the DHS fellowship program. Under
the DHS fellowship, I received valuable funding, but that was
essentially all. In contrast, the CSGF program dramatically altered my
career path in several positive ways. It required me to receive a
broader graduate education including computer science and physics, which
has allowed me to pursue interdisciplinary research that would otherwise
be impossible. It sent me to a practicum at Sandia National Laboratory,
where I established collaborations that continue to this day. Most
importantly, it introduced me to the network of CSGF fellows and alumni,
a small and very cohesive community of outstanding computational
scientists that is beginning to transform this relatively new scientific
discipline. It is no exaggeration to say that my career has been shaped
by my interaction with that community. I am now a successful professor
with my own research funding and have no obligation to attend the annual
CSGF conference. But that interaction is important enough that last year
I flew half way around the world, using my own research funds, to spend
a few days with the current fellows and other alumni.</p>
</blockquote>
<blockquote>
<p>I think that many federal fellowships, including the DHS fellowship,
may be well served by their being merged into a larger NSF program. But
the unique benefits of the CSGF program, and especially the scientific
community that it fosters, could not exist under a larger program with
less focus. Please keep the DOE CSGF program intact and keep the funding
for it within the Advanced Scientific Computing Research (ASCR)
office.</p>
</blockquote>
<blockquote>
<p>Sincerely,</p>
</blockquote>
<blockquote>
<p>Professor David I. Ketcheson</p>
</blockquote>
How to avoid javascript errors when copy-pasting Bibtex citations in Mendeley on Mac OS X2013-02-15T00:00:00+03:00h/2013/02/15/mendeley_bibtex_javascript_solution<p>I use Mendeley to manage references. Mendeley has a nice auto-import
feature that will pull down bibliographic data from the web to my
database. When writing, my workflow typically involves grabbing
references from Mendeley in bibtex format. The simplest way to do this
involves right-clicking on a publication and selecting “copy citation”.
Provided that one has already selected “bibtex generic citation style”
in the <em>View->Citation Style</em> menu, this action results in the
full bibtex entry being copied to the clipboard.</p>
<p>At least, that’s how it’s supposed to work.</p>
<p>For a couple of years now, I’ve had the problem that I get this on
the clipboard instead:</p>
<blockquote>
<p>Error: JavaScript error found: CSL error: Exception: TypeError:
‘undefined’ is not a function, 515,
file:///Applications/Mendeley%20Desktop.app/Contents/Resources/citeproc-js/citeproc.js</p>
</blockquote>
<p>Despite this problem being <a
href="https://www.google.com/search?q=mendeley+javascript+error">reported
by numerous users</a>, Mendeley has never provided a fix that worked for
me. But today, after discussion with Mendeley support, I found my own
fix.</p>
<p><strong>What to do:</strong> Just replace the file</p>
<p>~/Library/Application Support/Mendeley
Desktop/citationStyles-1.0/bibtex.csl</p>
<p>with the one found at</p>
<p><a
href="http://www.zotero.org/styles/bibtex">http://www.zotero.org/styles/bibtex</a></p>
<p>Then re-open Mendeley. That’s it. Of course, I recommend just moving
your bibtex.csl rather than deleting it, in case anything goes
wrong.</p>
5 reasons why you should submit your next paper to CAMCoS2013-01-17T00:00:00+03:00h/2013/01/17/why_to_publish_in_camcos<p>I have a new favorite journal: <a
href="http://msp.org/camcos/">Communications in Applied Mathematics and
Computational Science</a>. I just published a paper with them for the
first time (technically it’s still in press, but you can download it <a
href="http://msp.org/camcos/2012/7-2/p04.xhtml">here (paywall)</a> or <a
href="http://numerics.kaust.edu.sa/papers/stability_polynomials/stability_polynomials_2012.html">here
(free; same version)</a>).</p>
<p>CAMCoS is a hidden gem – it is relatively new (6 years old) and not
yet as widely known as most established journals. I believe that within
a few years it will be as coveted a publishing venue as any applied
mathematics journal. Here’s why.</p>
<ol type="1">
<li><p><strong>A respected publisher with an exceptional editorial
board.</strong> We all know that a primary consideration when submitting
an article is the prestige of the publisher and the journal. CAMCoS is
published by <a href="http://msp.org">Mathematical Sciences Publishers
(MSP)</a>, a non-profit run by mathematicians for mathematicians; they
also publish Annals of Mathematics, Geometry and Topology, and a number
of other excellent journals. Their website says “<em>our aim is to
transform scientific publishing into an industry that helps rather than
hinders scholarly activity</em>”, and their actions back that up. The
CAMCoS editorial board is an outstanding group of some of the world’s
leading applied mathematicians; <a
href="http://msp.org/camcos/about/journal/editorial.html">take a look
for yourself</a>.</p></li>
<li><p><strong>Timely and thorough peer review and
copy-editing.</strong> A respected publisher and a famed editorial board
are nice, but how well is the journal actually operated?<br />
My experience with CAMCoS puts it far ahead of most other journals I’ve
dealt with. We submitted the article in early July, and it came back in
early November: 4 months, which is not lightning-fast but not too
shabby. The referees seemed to be well chosen and to have done a
thorough job, suggesting several valuable improvements. We resubmitted
in late November (minor revisions) and the article was accepted five
days later. We submitted the TeX files in early December, and received
the galley proofs with copy editing one month later. The really
astonishing part: we approved the proofs on January 7, with one added
correction; our article was made available online, in final form
<strong>the next day</strong>, on January 8. I’ve never before
experienced or even heard of that kind of turnaround from a publisher.
For comparison, my SISC paper that was accepted in August still hasn’t
been assigned an issue or a DOI (it’s now mid-January).</p></li>
<li><p><strong>Electronic PDF features that no other publisher I know
offers.</strong> The copy editing is high quality, but what really blew
me away is that the copy editor went through our bibliography and made
every paper title into a hyperlink to the published journal article. As
if that wasn’t enough, he added links to Mathematical Reviews and
Zentralblatt for every article that possessed such entries. These
hyperlinks are active in the PDF, as are hyperlinks from references in
the paper to the bibliography, references to equations and theorems,
etc.<br />
This may seem like a small thing, but I think it’s very powerful. It
means that as you read through the paper, when you see a citation in a
sentence that puzzles or interests you, you can just click on the citation,
which will take you to the bibliography. Then you can click on the
bibliographic entry to go immediately to the paper cited, or to a
review of it! <a
href="http://numerics.kaust.edu.sa/papers/stability_polynomials/camcos-v7-n2-p04-s.pdf">Click
through to the paper</a> and try it for yourself. This is a capability
that all journal articles obviously should have had for the last 15
years, but this is the first publisher I’ve seen who understands
that.</p></li>
<li><p><strong>You can choose whether to keep your copyright, and you
can post the final version of the paper on your website or institutional
server.</strong> <a
href="http://msp.berkeley.edu/editorial/uploads/camcos/accepted/120712-Ketcheson/copyright.pdf">MSP’s
publishing agreement</a> has two options: if you wish, you may sign over
your copyright to them. You will still retain the rights to “reproduce
[the article] by any means for educational and scientific purposes …
without fee or permission” as long as you don’t try to charge anyone for
it. Alternatively, you can retain the copyright to your work, granting
MSP only a license to publish; the only restriction is again that you
can’t charge others a fee for accessing your work. Their policy has
clearly been designed with the author’s interests as primary
concern.</p></li>
<li><p><strong>(Virtually) Diamond open access</strong>. <a
href="http://symomega.wordpress.com/2012/08/09/green-gold-or-diamond-access/">Diamond
open access</a> is the OA movement’s dream; a model that avoids the
author charges of Gold OA while still providing peer review and a stable
DOI (which green OA often lacks). In strict terms, CAMCoS is not diamond
OA, since it requires a subscription. However, I claim it is virtually
diamond OA, for two reasons. First, CAMCoS uses a moving paywall, under
which articles become OA after one year (thus, only the 2012 issue
requires a subscription at present). Second, during that first year, open
access can easily be provided by the author posting a copy somewhere.</p></li>
</ol>
<p>By way of disclosure, I have no affiliation with CAMCoS and no reason
to promote them except that they represent the kind of journal I think
the applied mathematics community should support.</p>
Convert SAGE worksheets to IPython notebooks2013-01-16T00:00:00+03:00h/2013/01/16/sage_to_ipython<h1 id="converting-a-sage-worksheet-to-an-ipython-notebook">Converting a
SAGE worksheet to an IPython notebook</h1>
<p>Download link: <a
href="http://github.com/ketch/sage2ipython/">http://github.com/ketch/sage2ipython/</a></p>
<p>I use Python to teach numerical methods here at KAUST, and I’m in the
process of switching from using <a
href="http://www.sagemath.org">SAGE</a> worksheets to <a
href="http://ipython.org">IPython</a> notebooks (more on the reasons in
a later post). I’ve invested a lot of time over the past three years in
developing a set of SAGE worksheets and it would be a substantial amount
of tedious work to manually copy-paste their contents into IPython
notebooks. So I decided to write an automated converter.</p>
<p>Each SAGE worksheet is usually stored in a .sws file that is a
bzipped tarball; underneath, there is a text version (called
worksheet.html). If you run SAGE on your own machine, the text versions
of your worksheets can usually be found in
<code>~/.sage/sage_notebook.sagenb/home/username/number/</code>.</p>
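<p>Getting at that text programmatically is straightforward. Here is a minimal sketch (the member name ending in <code>worksheet.html</code> is an assumption based on the layout just described):</p>

```python
import tarfile

def extract_worksheet_text(sws_path):
    """Pull the plain-text worksheet out of a .sws file.

    A .sws file is a bzip2-compressed tarball; we look for the member
    whose name ends in 'worksheet.html' (an assumed naming convention).
    """
    with tarfile.open(sws_path, 'r:bz2') as tar:
        for member in tar.getmembers():
            if member.name.endswith('worksheet.html'):
                return tar.extractfile(member).read().decode('utf-8', 'replace')
    raise ValueError('no worksheet.html found in ' + sws_path)
```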
<p>It’s a simple matter to convert the SAGE format (that uses triple
braces to delimit code cells) into the IPython format (that I believe is
JSON).<br />
Rather than write an actual parser, which seemed like overkill, I just
created a script that steps through the file line-by-line and keeps
track of whether it’s in a cell. Debugging it was slightly painful
because if you have the tiniest syntax error, then the IPython notebook
server just tells you something is wrong and displays nothing.</p>
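<p>The heart of that line-by-line approach can be sketched in a few lines of Python. This is a simplified illustration, not the actual sage2ipython code: real worksheets attach a cell id to the opening delimiter (e.g. <code>{{{id|</code>), which this sketch ignores, and a <code>///</code> line inside a code cell separates input from output.</p>

```python
def sws_text_to_cells(text):
    """Split SAGE worksheet text into (cell_type, source) pairs.

    SAGE delimits code cells with {{{ and }}}; a /// line inside a code
    cell separates input from output, and output is simply deleted.
    Everything outside the braces becomes a Markdown cell.
    """
    cells = []
    buf, in_code, in_output = [], False, False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == '{{{':
            if buf:
                cells.append(('markdown', '\n'.join(buf)))
            buf, in_code, in_output = [], True, False
        elif stripped == '}}}' and in_code:
            cells.append(('code', '\n'.join(buf)))
            buf, in_code, in_output = [], False, False
        elif in_code and stripped == '///':
            in_output = True  # lines from here to }}} are output: drop them
        elif not in_output:
            buf.append(line)
    if buf:
        cells.append(('markdown', '\n'.join(buf)))
    return cells
```

<p>Each resulting <code>('code', src)</code> pair then becomes a code cell in the notebook file, and each <code>('markdown', src)</code> pair a Markdown cell.</p>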
<p>You can download the converter from the <a
href="http://github.com/ketch/sage2ipython/">Github repository</a>. It
has been tested with SAGE version 4.2.1 and IPython version 0.13.1. Note
that it has several limitations (see the list below). But it has served
my needs well.</p>
<p>Usage:</p>
<div class="sourceCode" id="cb1"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> sage2ipython</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a>sage2ipython.sage2ipy(<span class="st">'/path/to/sage/worksheet/html/file'</span>,<span class="st">'output_file_name.ipynb'</span>)</span></code></pre></div>
<p>To convert all your SAGE worksheets, do:</p>
<pre><code>>>> import sage2ipython
>>> sage2ipython.convert_all_sage_worksheets('username')</code></pre>
<p>where <em>username</em> is your account name. You may also need to
edit the SAGE notebook account name that occurs in the path in
<code>convert_all_sage_worksheets()</code>.</p>
<p>If you have any problems, it is likely that your worksheet contains
some special characters that need to be escaped in the IPython notebook.
I’ve included fixes for several of those, but almost certainly not all
of them. Please let me know.</p>
<p>General notes/limitations:</p>
<ul>
<li>All code blocks are assumed to be Python code blocks.</li>
<li>Output is simply deleted.</li>
<li>Everything else is put in Markdown cells.</li>
<li>Double backslashes are handled properly only if you have the
development version of IPython. Otherwise, you should convert them to
quadruple backslashes.</li>
</ul>
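<p>The escaping problem comes down to the fact that the notebook file is JSON, where backslashes, double quotes, and literal newlines are illegal inside a string. Rather than escaping characters case by case, letting a JSON library serialize each cell handles all of them at once. A small illustration (not the converter’s actual code):</p>

```python
import json

# Markdown with math is full of characters that are illegal raw inside
# a JSON string: backslashes, double quotes, and literal newlines.
md_source = 'Euler: $e^{i\\pi} + 1 = 0$, a "quoted" word,\nand a second line'

# Pasting the text into the notebook file with string formatting
# produces invalid JSON:
naive = '{"cell_type": "markdown", "source": "%s"}' % md_source
try:
    json.loads(naive)
    naive_is_valid = True
except ValueError:
    naive_is_valid = False
assert not naive_is_valid

# json.dumps escapes everything correctly in one step:
cell = json.dumps({'cell_type': 'markdown', 'source': md_source})
assert json.loads(cell)['source'] == md_source
```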
Why I signed the Cost of Knowledge2012-12-23T00:00:00+03:00h/2012/12/23/why-i-signed-thecostofknowledge<p>It has been almost a year since <a
href="http://gowers.wordpress.com/2012/01/21/elsevier-my-part-in-its-downfall/">Tim
Gowers’ blog post</a> about boycotting Elsevier triggered the Cost of
Knowledge movement. <a href="http://thecostofknowledge.com/">The boycott
has been signed by more than 13,000 people</a>. But the vast majority of
academics – including most of my own friends and collaborators –
continue to support Elsevier by gifting to it much of their research
output and their labor, enabling <a
href="http://tylerneylon.com/files/Neylon_Open_Sci_Sum_Talk.pdf">Elsevier
to operate at greater profits than Starbucks, Amazon, or Nike</a>.</p>
<p>One of the most common arguments I’ve heard for this continued
support goes as follows:</p>
<blockquote>
<p><em>Sure, it would be better if journal prices were lower, but it’s
not such a big deal that we should all get worked up about it.
Corporations always seek to maximize profits – why focus on academic
publishers?</em></p>
</blockquote>
<h3 id="commercial-publishers-are-pillaging-academia">Commercial
publishers are pillaging academia</h3>
<p>In September, I spoke with a colleague who is a professor at a small
university in Spain. She told me that she currently has no
Ph.D. students and sees no possibility of supervising students “in the
next 5-6 years” due to the financial situation at her university.
Furthermore, she confided that each month she is uncertain whether her
next paycheck will come. Last year, all faculty at her university took
an involuntary 5% pay-cut as the university struggled to pay its bills.
Another colleague from a major university in Texas told me that the Math
department lost 20% of its faculty in the last two years over funding
problems, as the university’s budget has decreased by 25%.</p>
<p>Of course, both of these universities continue to pay huge sums for
Elsevier journal bundles – that’s a cost they simply can’t cut if they
want to continue as a respected institution of higher learning. Elsevier
continues to pillage academic institutions through its strangle-hold on
scientific publishing, while professors face salary cuts and students
cope with ever-rising tuition.</p>
<h3 id="who-is-to-blame">Who is to blame?</h3>
<p>Many people on both sides of the boycott have argued that we
shouldn’t expect commercial publishers to behave any better. As
corporations, the argument goes, their highest allegiance is to their
shareholders, not their stakeholders. Commercial publishers simply can’t
be expected to do anything but plunder academia and the general public
in order to enrich themselves. But then who is responsible for
channeling funds that should support scientific research but instead go
to pay for overpriced journals? <strong>The guilty party must be the
academics who prop up commercial publishers by providing all the content
and labor!</strong></p>
<h3 id="its-not-you-its-me">It’s not you, it’s me</h3>
<p>Researchers who continue to support Elsevier can pretend that they
are passively “staying out of the fight”, but they are deluding
themselves. Submitting or refereeing a paper is a very active and
expensive decision. When I submit a paper, I decide where to deposit a
huge investment of my own time and my institution’s (or funding agency’s)
money. That is why I joined the boycott – why I had to. It is the
scientific publishing equivalent of the Hippocratic oath: <em>First, do
no harm.</em> Locking away publicly-funded research for the profit of a
few is harmful. Forcing my own university (and taxpayers) to buy back
the research they already paid me to do is harmful. I’m not boycotting
in order to stop Elsevier from doing harm, I’m boycotting <em>to prevent
myself from doing harm</em>.</p>
<p>One editor-in-chief of an Elsevier journal pointed out to me that
SIAM also operates its journals “at a profit”, meaning that the
subscription fees generate more revenue than what it costs to run the
journal. This is true! But where does that extra revenue go? It is used
to subsidize SIAM conferences, reducing the registration fees I pay.
This can be seen as a shrewd way of leveraging central university
budgets (which pay for journal subscriptions) to support my discipline
specifically.</p>
<p>Why would I send my work to a publisher like Elsevier that aims
primarily to enrich shareholders when I could send it to a journal of
the same quality with a non-profit publisher (like SIAM) that charges
much lower prices and uses its income to benefit its customers
(members)?</p>
Open scientific collaboration2012-12-22T00:00:00+03:00h/2012/12/22/habits-of-open-scientist-4-collab<p>This is the fourth post in my <a
href="http://davidketcheson.info/2012/07/31/habits-of-open-scientist.html">series
on habits of the open scientist</a>. Here I discuss the fourth habit,
<strong>open collaboration</strong>. The previous post was on <a
href="http://davidketcheson.info/2012/08/22/habits-of-open-scientist-3-pre.html">Pre-publication
dissemination of research</a>.</p>
<p>As mentioned in the introduction to this series, the first three
habits are truly essential for any conscientious scientist. With the
fourth habit, we’re moving into things that are valuable but less
essential – <em>advanced open science</em>, if you will.</p>
<p>What do I mean by open collaboration? The use of online tools and
social media to connect with new collaborators and provide your own
expertise where it is needed most. For an excellent introduction to the
subject, go read <a
href="http://michaelnielsen.org/blog/reinventing-discovery/">Michael
Nielsen’s book, <em>Reinventing Discovery</em></a>. Here I’ll just focus
on a few examples from my own experience:</p>
<h3 id="scientific-qa-sites">Scientific Q&A sites</h3>
<p>Often scientific research involves elements of work that have been
done before or are already well understood – by someone, somewhere.
Sometimes this work is published and readily available, but other times
it is unpublished or perhaps published in a place you wouldn’t know to
look. Finding the person with the specialized knowledge you need might
take much longer than “reinventing the wheel”, i.e. redoing the work
yourself. Enter StackExchange, an engine for connecting questions with
correct answers and making them readily available.</p>
<p>I’m an avid participant in (and former moderator of) the <a
href="http://scicomp.stackexchange.com">Stack Exchange for Computational
Science</a>. I also use <a
href="http://mathoverflow.net">Mathoverflow</a> and <a
href="http://stackoverflow.com">Stack Overflow</a>. Some personal
examples of the kind of connections I’m talking about are <a
href="http://math.stackexchange.com/questions/86977/polynomials-that-are-orthogonal-over-curves-in-the-complex-plane/">here</a>
and <a
href="http://scicomp.stackexchange.com/questions/65/are-there-operator-splitting-approaches-for-multiphysics-pdes-that-achieve-high">here</a>.
These are conversations that would never have taken place “in real life”
simply because the people involved have never met each other.</p>
<p>I also find the <a href="http://tex.stackexchange.com">TeX stack
exchange</a> to be a gold mine, and typically far more useful than
browsing through package documentation on CTAN.</p>
<h3 id="social-networks-like-google">Social networks like <a
href="http://plus.google.com">Google+</a></h3>
<p>I use Google+ (and previously Reader, which was a far superior tool)
for sharing new papers that I think may be of interest to my
collaborators. I’ve also used it to debate journals’ editorial policies
(with the editors) and for preliminary planning of conferences and
proposals – to find out who may be interested in participating. It’s
certainly not suited to discussing scientific or mathematical concepts
in any detail, and it is annoyingly difficult to sort through new things
that are posted. I think that Facebook is less useful for this purpose
because Facebook is used primarily for personal content whereas a large
community of G+ users (of which I am part) consider it to be a platform
for sharing professional content. But I’m not a good judge – I don’t
even have a Facebook account.</p>
<h3 id="github"><a href="http://github.com">Github</a></h3>
<p>I wanted to say “sites like Github”, but I don’t think there are any
others. Online code hosting sites have long facilitated collaboration
between existing teams, but Github takes this to a new level by
explicitly promoting collaboration between people who have never met.
Surprisingly, this paradigm shift didn’t require any new technology.
Rather, it stems from a combination of their “code first, ask permission
later” pull-request mindset and subtle differences in the user interface
– like a “fork me” button on every page, just begging you to modify some
stranger’s code.</p>
<p>Now this philosophy – and use of Github – has moved beyond just
sharing what we usually think of as computer code. For instance, Carl
Boettiger puts <a href="http://github.com/cboettig/labnotebook">the full
source of his Jekyll-based website on Github</a>, which enabled me
(simply by forking it) to easily set up this site.</p>
<h3 id="a-word-of-caution">A word of caution</h3>
<p>As useful as all the above are, I’ve found that they can also be a
way of wasting time. You may find this to be the case if you’re merely
trading opinions with strangers or consuming tidbits of information that
aren’t really relevant to your research – for instance, I find that my
time spent on the <a href="http://academia.stackexchange.com">Academia
Stack Exchange</a> is of dubious value. I stepped down from moderating
the SciComp Stack Exchange because I felt it was too time-consuming. But
if used in a focused way, open collaboration tools can accelerate,
enrich, and expand your research.</p>
<p>What other tools or sites ought to be mentioned here? Let me know in
the comments.</p>
Reflections on the 2012 ICERM Reproducibility Workshop2012-12-14T00:00:00+03:00h/2012/12/14/icerm-reproducibility<p>I spent the last five days at ICERM (the new Math institute at Brown
University) attending the workshop <a
href="http://icerm.brown.edu/tw12-5-rcem">Reproducibility in
Computational and Experimental Mathematics</a>. The workshop was focused
on discussing how mathematicians can ensure that their computations are
reproducible, in order to ensure correctness and facilitate their use by
others. It’s a topic dear to my heart and one that I’ve <a
href="http://www.davidketcheson.info/tags.html#reproducible-research">blogged
about before</a>.</p>
<p>My hat is off to the organizers for managing to assemble a highly
diverse group of experts, including not only academic luminaries from
both pure and applied math, open source software gurus, and leaders from
companies like Github and Google (yes, Peter Norvig himself attended).
Most of the talks were excellent. Many included live demos of great
tools, and others introduced me to things that I never thought you could
do with computation – like discovering new formulas for pi.</p>
<p>Going into the workshop, I felt that I already knew a lot about
reproducibility and had relatively good habits in this regard. So what
did I learn? I picked up a new tool, Andrew Davison’s <a
href="http://packages.python.org/Sumatra/introduction.html">Sumatra</a>,
which I had heard of before but now have begun to use in earnest (more
on that in a future post). I was impressed with Lorena Barba’s <a
href="http://dx.doi.org/10.6084/m9.figshare.104539">Reproducibility PI
Manifesto</a> and learned a new trick from her: put your figures up on
Figshare before submitting a paper in order to retain copyright on the
figures. I marveled at Greg Wilson’s goal of reaching 20% of all
scientists with his <a href="http://software-carpentry.org/">Software
Carpentry</a> courses, and I determined to host such a course at KAUST
in the near future.</p>
<p>I also learned that the reproducibility movement in computational
science and mathematics involves a wide range of opinions and concerns.
For instance, some consider that the primary motivation for
reproducibility is to ensure correctness of results, while others feel
that it is scientific productivity. There is disagreement about how much
value should be placed on code development, on how reproducibility
should be taught, and on ways in which journals and funding agencies
should encourage reproducibility. In the end, we had difficulty even
agreeing on a well-defined terminology for concepts related to
reproducibility. Nevertheless, there is broad agreement that we need to
improve our habits in recording and presenting our computational work.
On the final day, in a spurt of crazy massive Google Doc collaboration
(have you ever edited a document live with 30 others at once?) we
drafted a report that I’ll link to here once it appears.</p>
<p>If you want to know more, take a look at the great <a
href="http://wiki.stodden.net/ICERM_Reproducibility_in_Computational_and_Experimental_Mathematics:_Readings_and_References#Thought_Pieces_Submitted_for_the_ICERM_Workshop">thought
pieces</a> submitted and the rest of the material on the <a
href="http://wiki.stodden.net/Main_Page">wiki</a>.</p>
Adopting the Reproducible Research Standard2012-12-06T00:00:00+03:00h/2012/12/06/reproducible-research-standard<p>Back in July, I read Victoria Stodden’s work on licensing
reproducible research. Victoria has proposed the Reproducible Research
Standard (RRS), which is an amalgamation of recommended licenses for
what she calls the <em>research compendium</em>. The research compendium
is the full set of outputs of a research project, including:</p>
<ul>
<li>The research paper</li>
<li>Additional media, such as movies</li>
<li>Computer code</li>
<li>Data</li>
<li>A record of the computing environment used to process the code and
data</li>
</ul>
<p>The idea is that all of these components are part of your research
and someone wanting to understand your research may need access to all
of them. The RRS consists of the following licenses:</p>
<ul>
<li><a href="http://creativecommons.org/licenses/by/3.0/">Creative
Commons Attribution (BY)</a> for <strong>media</strong> (text, figures,
movies)</li>
<li><a
href="http://en.wikipedia.org/wiki/BSD_licenses%233-clause_license_.28.22New_BSD_License.22_or_.22Modified_BSD_License.22.29">Modified
BSD</a> for <strong>code</strong></li>
<li><a
href="http://sciencecommons.org/resources/faq/database-protocol">Science
Commons Database Protocol</a> for <strong>data</strong></li>
</ul>
<p>For the most part, this is easy enough to implement: the current
academic research system frankly doesn’t care what you do with your
code, data or miscellaneous media outputs. And I think that actually
releasing those is the most important part of the RRS. But the text and
figures of the paper itself must be published in a journal, and
typically the journal will want the copyright – preventing you from
releasing those media under CC-BY.</p>
<p>Nevertheless, I’ve attempted to follow the full RRS with each of the
two papers I’ve had accepted since then. <a
href="http://arxiv.org/abs/1111.3499">The first</a> (still in press) was
accepted to the SIAM Journal on Scientific Computing (SISC). The code is
licensed under modified BSD as part of the <a
href="https://github.com/clawpack/sharpclaw">SharpClaw</a> package (now
rolled into <a href="https://github.com/clawpack/pyclaw">PyClaw</a>).
After reading <a
href="http://adamdsmith.wordpress.com/2009/07/07/copyright-copywrong/">one
author’s experience retaining copyright to an article published by
SIAM</a>, I decided to try the same approach of modifying the copyright
transfer agreement by <a
href="http://adamdsmith.wordpress.com/2009/07/07/copyright-copywrong/#jp-carousel-138">striking
out the transfer of copyright</a>. I suspected that the instance just
linked to went “below the radar”, and I wanted to be completely
above-board, so I pointed out to SIAM that I had modified the agreement.
What made this particularly interesting is that one of my co-authors on
the paper is Randy LeVeque, chair of the SIAM journals committee.</p>
<p>Eventually, SIAM objected “on the grounds that non-exclusive right to
publish doesn’t prohibit others from publishing for profit, which may be
to [the authors’] disadvantage as well.” They agreed instead to an
addendum generated via http://scholars.sciencecommons.org/ that retains
for the authors the right to post the final article on any public
server, as long as publication in SISC is stated. Since this gave me
what I wanted in practical terms, I agreed and signed the copyright
transfer + addendum. I’ve been told that an ad hoc committee of SIAM
leadership is now discussing how SIAM should handle these copyright
questions like this.</p>
<p>I came away from this feeling like we had made progress, but I still
wanted to see if I could implement the full RRS with respect to the next
paper. My <a href="http://arxiv.org/pdf/1201.3035v3.pdf">next accepted
paper</a> (also still in press) was a submission to <a
href="http://msp.org/camcos/">Communications in Applied Mathematics and
Computational Science</a>, published by the extremely progressive
not-for-profit <a href="http://msp.org/about/">Mathematical Sciences
Publishers</a>. This is a truly remarkable journal that will be the
subject of another blog post in the near future, but what’s important in
this context is that the journal doesn’t require authors to transfer
copyright! They only require a <a
href="http://msp.berkeley.edu/editorial/uploads/camcos/accepted/120712-Ketcheson/copyright.pdf">license
to publish</a> which includes this clause:</p>
<blockquote>
<p><em>The copyright holder retains the right to duplicate the Work by
any means and to permit others to do the same with the exception of
reproduction by services that collect fees for delivery of documents,
which may be licensed only by the Publisher. In each case of authorized
duplication of the Work in whole or in part, the Author(s) must still
ensure that the original publication by the Publisher is properly
credited.</em></p>
</blockquote>
<p>After discussion with my co-author Aron Ahmadia, we’re retaining
copyright and licensing the paper under CC-BY-NC. The NC (non-commercial
clause) seems necessary to comply with the paragraph above, and seems
reasonable to me. The code for the paper is released as part of the <a
href="https://github.com/ketch/RK-opt">RK-opt package</a>. So I’m
calling this mission accomplished.</p>
<p>I have mixed feelings about whether it makes sense for journals to
let authors keep copyright – I can see some sense in SIAM’s objection,
and I think that non-profit publishers need to protect enough of a
revenue stream to support their activities. I think it is better that
that revenue come from (low-cost) subscriptions than from author fees.
It will be interesting to see where SIAM’s policy falls.</p>
Switching from Blogger to Jekyll2012-10-25T00:00:00+03:00h/2012/10/25/switching-from-blogger-to-jekyll<p>If you’re reading this, then you’ve probably noticed: I moved my blog
from <a href="http://scienceinthesands.blogspot.com">blogspot</a> to my
own new site. Among other things, that meant a change in the engine that
runs the blog, from <a href="http://www.blogger.com">blogger</a> to <a
href="https://github.com/mojombo/jekyll">Jekyll</a>. It was a big jump
from the simplest, hosted blogging platform out there to a rather
advanced engine designed by hackers for hackers.</p>
<h2 id="why-switch">Why switch?</h2>
<p>I had been wanting for some time to include a lot more math and code
in my blog posts, and it was a hassle with Blogger. The output often
looked funny and was hard to control. With Jekyll, I get <a
href="http://davidketcheson.info/2012/10/11/Internal_stability.html">beautiful
results like this</a>. I also wanted more control over my blog’s
appearance and greater interoperability, which meant <a
href="http://pragprog.com/the-pragmatic-programmer/extracts/tips">keeping
things in plain text</a> and using (generated) static HTML, both of
which Jekyll enables me to do.</p>
<p>But really the switch was part of a much bigger change: I’ve migrated
the content of my professional home page here to davidketcheson.info and
begun an open science notebook. That’s why the link at the top of the
page reads <strong>NoteBlog</strong>: it’s intended to be a combination
<strong>notebook</strong> and <strong>blog</strong>. On the blog side,
I’ll keep posting about issues like scientific publishing, open science,
and reproducibility. On the notebook side, there will be a lot more posts of
raw results and experiments from my current research projects, not
intended for a general audience. And somewhere in-between there’ll be
reasonably polished expository math-y posts accessible to students and
researchers in my field.</p>
<h2 id="how-i-switched">How I switched</h2>
<p>It was easy, thanks primarily to Carl Boettiger. This site was built
based on Carl Boettiger’s <a
href="http://carlboettiger.info">labnotebook site</a>. Carl publishes
the source for his site on Github as the <a
href="http://github.com/cboettig/labnotebook">labnotebook project</a>
and releases it all under <a
href="http://creativecommons.org/publicdomain/zero/1.0/">CC0</a>, so
setting my site up was as easy as following <a
href="http://www.carlboettiger.info/README.html">his instructions</a>,
replacing the _posts directory, and making a few CSS customizations.</p>
<p>I migrated all my Blogger content following <a
href="http://coolaj86.info/articles/migrate-from-blogger-to-jekyll.html">these
instructions</a>. This didn’t manage to bring in the tags or comments,
unfortunately. I had done a poor job of tagging my posts in the past
anyway, so I manually re-tagged my 45 existing posts.</p>
<h2 id="subscribing-to-my-new-blog-andor-notebook">Subscribing to my new
blog and/or notebook</h2>
<p>One nice thing about having more control is that I can set up
separate feeds for different kinds of posts. On the right you’ll see
three RSS feed links: one for all entries (notebook and blog), and one
each for the separate notebook and blog feeds. I imagine most of you
will only want to subscribe to the blog, unless you’re interested in my
research niche (you can look at the <a
href="http://www.davidketcheson.info/categories.html">categories
page</a> to get an idea of what each will include).</p>
You might be a low-quality scientific journal if...2012-10-16T00:00:00+03:00h/2012/10/16/low_quality_journal<p>In the spirit of David Letterman’s top 10 lists, here are my top 10
signs you might be a low quality scientific journal, inspired by an
e-mail I received today from <a
href="http://www.ampublisher.com/Canadian-Journal-Computing.html">Computing
in Mathematics, Natural Sciences, Engineering and Medicine</a>.</p>
<ol type="1">
<li><p>You are incorporated in Canada, and have “Canadian” in your
title, but none of your editors lives in Canada.</p></li>
<li><p>Your scope is so broad that it includes abstract algebra, textile
engineering, and dermatology all in one journal.</p></li>
<li><p>Most of your abstracts include misspellings of common English
words, like <a
href="http://www.ampublisher.com/June%202010/CMNSEM%20Jun%202010.html">“stander”
for standard</a>.</p></li>
<li><p>Most of your article titles could be used as exercises for grade
school students learning to fix improper verb conjugations and
subject-noun agreement.</p></li>
<li><p>You promise to complete reviews of articles in mathematics or
similar fields in two weeks.</p></li>
<li><p>You require math and physics manuscript submissions to be in
Microsoft Word format.</p></li>
<li><p>You spam researchers in other disciplines with your calls for
papers.</p></li>
<li><p>Your chief editor received a Ph.D. in the last five
years.</p></li>
<li><p>Your funding model is <s>vanity press</s> gold open
access.</p></li>
<li><p>You regularly invite graduate students to serve on your editorial
board.</p></li>
</ol>
<p>On a serious note: a few of these, taken by themselves, might not
necessarily be a bad sign. And I must say that AM Publishers’ author
charges (95 USD) are the lowest I’ve ever seen.</p>
<p>For a related, more serious analysis, see <a
href="http://scholarlyoa.com/publishers/">Beall’s List of Predatory
Open-Access Publishers</a>.</p>
Impact of the Elsevier boycott2012-10-11T00:00:00+03:00h/2012/10/11/elsevier_boycott_impact<p>Seven months ago, I signed the Elsevier boycott at <a
href="http://thecostofknowledge.com">thecostofknowledge.com</a>. What
impact has this had? So far, I’ve</p>
<ul>
<li>Submitted 2 manuscripts to SIAM journals that would otherwise have
gone to Elsevier journals</li>
<li>Declined to referee 3 manuscripts from Elsevier journals</li>
</ul>
<p>If every signee had a similar impact (admittedly, that’s a very
optimistic view), that would be more than 24,000 journal articles
effectively pulled from Elsevier journals and published elsewhere. Which
might be a good thing for the editors, since they’d be having a
difficult time finding qualified referees in the communities where the
boycott has been adopted.</p>
<p>It will be impossible to quantify my impact going forward, since I
now automatically rule out Elsevier journals when planning a new paper,
and since I’ve asked editors to remove me from their list of potential
referees.</p>
<p>Some people I’ve met seem to have the perception that the boycotters
are deeply angry people who spend their time muttering curses at
commercial publishers. That simply isn’t the case, and anyone who has
read the documents that helped launch the boycott must know that. When I
refuse to referee for Elsevier journals, I do so politely and I always
suggest alternate reviewers to the editor. In every case the editor has
been equally polite and understanding.</p>
Blogging an iPython notebook with Jekyll2012-10-11T00:00:00+03:00h/2012/10/11/blogging_ipython_notebooks_with_jekyll<blockquote>
<p><strong>Update as of December 2014: Don’t bother using what’s below;
go to <a
href="http://cscorley.github.io/2014/02/21/blogging-with-ipython-and-jekyll/">Christopher
Corley’s blog</a> for a much better setup!</strong></p>
</blockquote>
<p>I’ve been playing around with <a
href="http://ipython.org/ipython-doc/dev/interactive/htmlnotebook.html">iPython
notebooks</a> for a while and planning to use them instead of <a
href="http://www.sagemath.org/">SAGE</a> worksheets for my numerical
analysis course next spring. As a warmup, I wrote an iPython notebook
explaining a bit about internal stability of Runge-Kutta methods and
showing some new research results using <a
href="http://numerics.kaust.edu.sa/nodepy/">NodePy</a>.</p>
<p>I also wanted to post the notebook on my blog here; the ability to
more easily include math and code in blog posts was one of my main
motivations for moving away from Blogger to my own site. I first tried
following <a
href="http://blog.fperez.org/2012/09/blogging-with-ipython-notebook.html">the
instructions given by Fernando Perez</a>. That was quite painless and
worked flawlessly, using <code>nbconvert.py</code> to convert the .ipynb
file directly to HTML, with graphics embedded. The only issue was that I
didn’t love the look of the output quite as much as I love how Carl
Boettiger’s Markdown + Jekyll posts with code and math look (see an
example <a
href="http://www.carlboettiger.info/2012/09/14/analytic-solution-to-multiple-uncertainty.html">here</a>).
Besides, Markdown is so much nicer than HTML, and
<code>nbconvert.py</code> has a Markdown output option.</p>
<p>So I tried the markdown option:</p>
<pre><code>nbconvert.py my_nb.ipynb -f markdown</code></pre>
<p>I copied the result to my <code>_posts/</code> directory, added the
<a href="https://github.com/mojombo/jekyll/wiki/YAML-Front-Matter">YAML
front-matter</a> that Jekyll expects, and took a look. Everything was
great except that all my plots were gone, of course. After considering a
few options, I decided for now to put plots for such posts in a
subfolder <code>jekyll_images/</code> of my public Dropbox folder. Then
it was a simple matter of search/replace all the paths to the images. At
that point, it looked great; you can see the <a
href="https://github.com/ketch/nodepy/blob/master/examples/Internal_stability.ipynb">source</a>
and the <a
href="http://davidketcheson.info/2012/10/11/Internal_stability.html">result</a>.</p>
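<p>The search/replace step amounts to a single <code>sed</code>
substitution. Here is a minimal standalone sketch; the notebook name,
figure filename, and Dropbox path are illustrative placeholders:</p>

```shell
# Rewrite local image references in a converted notebook so they point
# at a public Dropbox folder. 'my_nb' and the URL are placeholders.
fname=my_nb
printf '![png](%s_files/fig0.png)\n' "$fname" > "${fname}.md"
sed "s#${fname}_files#https://dl.dropbox.com/u/656693/jekyll_images/${fname}_files#g" \
    "${fname}.md" > "${fname}.tmp" && mv "${fname}.tmp" "${fname}.md"
cat "${fname}.md"
```

<p>(Writing to a temporary file sidesteps the BSD/GNU difference in
<code>sed -i</code>.)</p>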
<p>The only issue was that I didn’t want to manually do all that work
every time. I considered creating a new Converter class in
<code>nbconvert</code> to handle it, but finally decided that it would
be more convenient to just write a shell script that calls
<code>nbconvert</code> and then operates on the result.<br />
Here it is:</p>
<pre><code>#!/bin/bash
# Convert an iPython notebook to a Jekyll post.
# Usage: ./nbconv.sh notebook_name   (without the .ipynb extension)
fname=$1

# Convert the notebook to Markdown (produces ${fname}.md and ${fname}_files/).
nbconvert.py ${fname}.ipynb -f markdown

# Point the image references at the public Dropbox folder.
sed -i '' "s#${fname}_files#https:\/\/dl.dropbox.com\/u\/656693\/jekyll_images\/${fname}_files#g" ${fname}.md

# Prepend the YAML front matter that Jekyll expects
# ('0a' inserts at the top, '.' ends the input, 'w' writes the file).
dt=$(date "+%Y-%m-%d")
echo "0a
---
layout: post
time: ${dt}
title: TITLE-ME
subtitle: SUBTITLE-ME
tags: TAG-ME
---
.
w" | ed ${fname}.md

# Move the finished post into place with the date-prefixed filename Jekyll requires.
mv ${fname}.md ~/labnotebook/_posts/${dt}-${fname}.md</code></pre>
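<p>If <code>ed</code> isn’t available, the front-matter step can be
approximated with a temporary file instead. This is an alternative
sketch, not what the script above does, and the filename and body text
are placeholders:</p>

```shell
# Prepend Jekyll YAML front matter to a converted notebook (portable variant).
fname=demo
dt=$(date "+%Y-%m-%d")
echo 'Post body.' > "${fname}.md"
{ printf -- '---\nlayout: post\ntime: %s\ntitle: TITLE-ME\n---\n' "$dt"
  cat "${fname}.md"; } > "${fname}.tmp"
mv "${fname}.tmp" "${fname}.md"
head -n 1 "${fname}.md"   # first line is now '---'
```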
<p>It’s also on Github <a
href="https://github.com/ketch/labnotebook/blob/master/nbconv.sh">here</a>.
This was a nice educational exercise in constructing shell scripts, in
which I learned or re-learned:</p>
<ul>
<li>how to use command-line arguments</li>
<li>how to use sed and ed</li>
<li>how to use <code>date</code></li>
</ul>
<p>You can expect a lot more iPython-notebook-based posts in the
future.</p>