<p>David I. Ketcheson (dketch@gmail.com)</p>
<h1>Modeling Coronavirus part V -- try the model yourself (2020-03-22)</h1>
<p>As a follow-up to this series of posts, I’ve created a version of the
model that you can experiment with on your own, without needing to know
any computer programming. <a
href="https://mybinder.org/v2/gh/ketch/covid-blog-posts/master?filepath=Interactive_SIR_model.ipynb">Try
it here</a>.</p>
<p>Note that it may take some time for the model to load.</p>
<p>If you arrived here and haven’t already read the series of posts on
this topic, I recommend that you <a
href="http://www.davidketcheson.info/2020/03/17/SIR_model.html">start at
the beginning</a>.</p>
<h1>Modeling Coronavirus part IV -- understanding exponential growth (2020-03-20)</h1>
<p>This is my fourth post on modeling the Coronavirus epidemic. I
recommend starting with <a
href="http://www.davidketcheson.info/2020/03/17/SIR_model.html">the
first post</a> and reading them in order.</p>
<p>So far, we’ve <a
href="http://www.davidketcheson.info/2020/03/17/SIR_model.html">learned
about the SIR model</a> and <a
href="http://www.davidketcheson.info/2020/03/19/SIR_Estimating_parameters.html">used
available data</a> combined with the model to <a
href="http://www.davidketcheson.info/2020/03/19/SIR_predictions.html">predict
the epidemic</a>. We’re now going to detour into some additional
mathematical ideas that will help us get further understanding. At the
end of this post, we’ll try to judge how effective the mitigation
strategies have been in some countries.</p>
<h1 id="exponential-growth">Exponential growth</h1>
<p>In the second post, we saw that the initial spread of a disease
follows a differential equation of the form</p>
<p><span class="math display">\[
\frac{dI}{dt} = \beta I(t).
\]</span></p>
<p>Once you’re comfortable with the idea that <span
class="math inline">\(dI/dt\)</span> just means “the rate of change of
<span class="math inline">\(I\)</span>”, you realize that this is one of
the simplest equations imaginable. So it shouldn’t be surprising that
this equation comes up a lot in the real world. Its solution is</p>
<p><span class="math display">\[
I(t) = e^{\beta t} I(0).
\]</span></p>
<p>Here <span class="math inline">\(e\approx 2.72\)</span> is
<strong>Euler’s number</strong>. This equation tells us that the number
of infected grows very quickly. In the case of Coronavirus, the number
<span class="math inline">\(I\)</span> can double in about 3 days (<a
href="https://ourworldindata.org/coronavirus#growth-country-by-country-view">you
can see more estimates of the doubling time here</a>). We refer to this
kind of growth (where a given quantity doubles over a certain time
interval) as <strong>exponential growth</strong>.</p>
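<p>A growth rate translates directly into a doubling time: from <span class="math inline">\(I(t) = e^{\beta t} I(0)\)</span>, the count doubles after <span class="math inline">\(t = \ln(2)/\beta\)</span>. A quick sanity check in Python, taking 0.2 per day as an assumed, illustrative net growth rate:</p>

```python
import math

# Doubling time for I(t) = exp(r*t) * I(0): solve exp(r*t) = 2 for t.
r = 0.2                          # assumed net growth rate per day
doubling_time = math.log(2) / r
print(doubling_time)             # about 3.5 days
```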
<p>Financial advisors love to talk about the power of exponential growth
because that’s also how compound interest works. Of course, with most
investments it takes a lot more than 3 days to double your money, but
the principle is the same. In fact, exponentially growing functions are
all around us, and learning how they behave can help you to understand a
lot of things.</p>
<p>Let’s look again at our first prediction from the last post:</p>
<p><img src="/assets/img/covid19/exp_4_0.png" /></p>
<p>Right now, we are at the very left edge of this plot. How many people
are infected at the start of the plot? It looks like zero, but we know
there are over 100 thousand current cases. It just looks like zero
because the scale here is in billions, and 100 thousand is much too
small to see on that scale! This is a common problem with exponentially
growing functions. We can try to solve the problem by zooming in on the
left end of the graph, showing results for just the next 30 days:</p>
<p><img src="/assets/img/covid19/exp_6_0.png" /></p>
<p>It’s a little better, but even on this scale the current number of
infections is too small to be seen. We can try again to fix it by just
changing the vertical scale:</p>
<p><img src="/assets/img/covid19/exp_8_0.png" /></p>
<p>Now we can see that the starting value is not zero, but we can’t see
the right part of the graph at all! Let’s find a better solution.</p>
<h1 id="logarithmic-scaling">Logarithmic scaling</h1>
<p>In the plots above, the scale of the <span
class="math inline">\(y\)</span>-axis is <strong>linear</strong>. That
means that equal distances in <span class="math inline">\(y\)</span>
represent equal changes in the value of the function. The problem with
using this for an exponentially growing function is that the changes in
the function at early times are tiny compared to the later growth.</p>
<p>Instead, we can visualize the growth using a
<strong>logarithmic</strong> scale:</p>
<p><img src="/assets/img/covid19/exp_12_0.png" /></p>
<p>Look carefully at the <span class="math inline">\(y\)</span> axis
here. As you can see, each of the evenly spaced labels on the axis
represents a value <strong>10 times greater</strong> than the value
below it. In other words, equal distances on this axis represent equal
<strong>ratios</strong>. The great thing about this scaling is that we
can easily see how the function varies over the whole graph. It might
seem strange that it looks almost like a straight line, but that’s
exactly how an exponentially growing function should look on this kind
of plot. Remember, equal distances in <span
class="math inline">\(y\)</span> represent equal ratios, and we said
that this kind of function doubles over each interval of some fixed
size.</p>
<p>With this plot, it’s easy to answer questions like “when do we expect
to have more than 1 million cases?” We couldn’t possibly answer that by
looking at the previous plots.</p>
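<p>If you want to reproduce this kind of plot, matplotlib can switch the vertical axis to a logarithmic scale with a single call. A minimal sketch (the growth rate and starting count here are assumed, illustrative values):</p>

```python
import numpy as np
import matplotlib
matplotlib.use('Agg')            # render without a display
import matplotlib.pyplot as plt

t = np.linspace(0, 120, 200)     # days
I = 1e5 * np.exp(0.2 * t)        # exponential growth from 100,000 cases

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(t, I)                   # linear scale: early values invisible
ax2.semilogy(t, I)               # log scale: a straight line
for ax in (ax1, ax2):
    ax.set_xlabel('days')
    ax.set_ylabel('infected')
fig.savefig('exponential_scales.png')
```

<p>On the logarithmic axis the exponential curve becomes a straight line, with slope proportional to the growth rate.</p>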
<p>We can also see more easily how rapidly the epidemic goes from
something small to a true global crisis. <strong>At present there are
fewer than a million cases, but before the end of April we would have
(without mitigation) over 1 billion</strong>. Never in history have that
many human beings been ill at the same time with a single disease.</p>
<p>In technical terms, we say this is a <strong>semi-log plot</strong>,
because the <span class="math inline">\(y\)</span> axis is logarithmic
while the <span class="math inline">\(x\)</span> axis is linear. If both
the <span class="math inline">\(x\)</span> and <span
class="math inline">\(y\)</span> axes were logarithmic, we would call it
a <strong>log-log plot</strong>.</p>
<p>Understanding logarithmic plots is a bit of a superpower, because it
allows you to study data by looking at plots like the last one above,
which let you simultaneously see the parts where the function is small
and the parts where it is large.</p>
<h1 id="the-model-predictions-on-a-log-scale">The model predictions on a
log scale</h1>
<p>Here’s a semilog plot of our basic model for the epidemic over the
next year:</p>
<p><img src="/assets/img/covid19/exp_18_0.png" /></p>
<p>Remember, this is exactly the same data that’s in the first plot near
the top of this post. We’re just looking at it in a different way. But
with this new plot we can much more easily see how many susceptibles
remain at the end of a year: about a tenth of a billion, or 100
million.</p>
<p>We can also see that there are still about 1000 infected people in
this model at the end of a year. Notice that after the infection peaks,
it declines in a way that also looks like a straight (downward-trending)
line on this plot; that means that the decrease is also
<strong>exponential</strong>. In other words, after we pass the peak of
infection, the number of infected individuals will consistently reduce
by a factor of two over a certain time interval. Looking back at the
model, we see that the rate of decrease is determined by <span
class="math inline">\(\gamma\)</span>. In the late stages of the
epidemic, because the fraction of susceptible people is small, we have
approximately:</p>
<p><span class="math display">\[
\frac{dI}{dt} = -\gamma I(t)
\]</span></p>
<p>whose solution is</p>
<p><span class="math display">\[
I(t+\tau) = e^{-\gamma \tau} I(t),
\]</span></p>
<p>which, for our estimate <span class="math inline">\(\gamma \approx
0.05\)</span>, implies that the number of infections is reduced by half
about every 14 days.</p>
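<p>The 14-day figure is just the halving time <span class="math inline">\(\ln(2)/\gamma\)</span>; checking the arithmetic with <span class="math inline">\(\gamma \approx 0.05\)</span>:</p>

```python
import math

gamma = 0.05                       # recovery rate per day (estimated earlier)
halving_time = math.log(2) / gamma
print(halving_time)                # about 13.9 days
```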
<h1 id="assessing-the-effectiveness-of-mitigation">Assessing the
effectiveness of mitigation</h1>
<p>Armed with our new superpower, let’s look at our data from specific
countries with this logarithmic scaling. We’ll start again with
Italy.</p>
<p><img src="/assets/img/covid19/exp_22_0.png" /></p>
<p>Several things become clearer with this scaling. Notice that the plot
is <strong>not</strong> a straight line, but we can make good guesses as
to why. Before day 30 (February 21st), there were only one or two known
cases, and after day 30 there was a very abrupt increase. It seems
likely that the virus was spreading before day 30 and the new cases were
only detected later. This matches with <a
href="https://en.wikipedia.org/wiki/2020_coronavirus_pandemic_in_Italy#cite_note-250">statements
from experts in Italy</a>.</p>
<p>Recall our full model equation for <span
class="math inline">\(I(t)\)</span>:</p>
<p><span class="math display">\[
\frac{dI}{dt} = \left(\beta \frac{S}{N}-\gamma\right) I
\]</span></p>
<p>Since only a tiny fraction of the whole Italian population is
infected, we have <span class="math inline">\(S/N\approx 1\)</span>, so
the slope of our exponential growth line on a semilog plot should be
<span class="math inline">\(\beta-\gamma \approx 0.2\)</span>. Let’s see
how a line with that slope matches the data:</p>
<p><img src="/assets/img/covid19/exp_24_0.png" /></p>
<p>Note that we are doing essentially the same thing that we did when
trying to determine <span class="math inline">\(\beta\)</span> in the
second post of this series; but now we are looking at the results on a
log scale to reveal more detail.</p>
<p>It seems plausible (not utterly convincing) that the virus has been
spreading at this expected rate in Italy for about 50 days now. However,
notice that in the last week the slope seems to have decreased. Since
Italy is now making great efforts to detect all new cases, it seems most
likely that this is the result of mitigation. It’s probably too soon to
try to assess the effectiveness of that mitigation, but let’s make an
attempt anyway:</p>
<p><img src="/assets/img/covid19/exp_26_0.png" /></p>
<p>We see that the slope over the last several days is about 0.14,
corresponding to a mitigation factor <span class="math inline">\(q
\approx 0.7\)</span> in the model I introduced in the last post.</p>
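<p>The step from an observed semilog slope to a mitigation factor is just algebra: the observed growth rate is <span class="math inline">\(q\beta - \gamma\)</span>, so <span class="math inline">\(q = (\text{slope} + \gamma)/\beta\)</span>. A sketch of that computation, using the earlier estimates <span class="math inline">\(\beta \approx 0.25\)</span> and <span class="math inline">\(\gamma \approx 0.05\)</span>:</p>

```python
beta, gamma = 0.25, 0.05          # parameter estimates from part II

def mitigation_factor(slope, beta=beta, gamma=gamma):
    """Infer q from an observed growth rate slope = q*beta - gamma."""
    return (slope + gamma) / beta

print(mitigation_factor(0.14))    # Italy: about 0.76, i.e. q ~ 0.7
print(mitigation_factor(0.02))    # South Korea: about 0.28, i.e. q ~ 1/4
```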
<p><img src="/assets/img/covid19/exp_28_0.png" /></p>
<p>For Spain, we see a similar pattern, but no sign of any impact from
mitigation yet.</p>
<p><img src="/assets/img/covid19/exp_30_0.png" /></p>
<p>For South Korea, up to about day 40 we see the same pattern (with
again a similar plateau in late February followed by a rapid rise when
testing increased). But afterward we see completely different behavior,
as the curve flattens. Given that South Korea has perhaps the most
aggressive testing strategy in the world, it seems unlikely that this can
be attributed to new infections going undetected. Instead, the evidence
suggests that mitigation strategies have been very successful. We can
measure this success by looking at how much the slope of the curve has
changed:</p>
<p><img src="/assets/img/covid19/exp_32_0.png" /></p>
<p>An approximate fit to the data from recent days suggests that the
growth rate has been cut to approximately <span
class="math inline">\(0.02\)</span>; this would correspond to a value of
<span class="math inline">\(q\)</span> (from my previous post) of about
1/4, meaning each infected person on average transmits the disease to
only 1/4 as many people as they naturally would. But again it is
probably too early to estimate this number with confidence.</p>
<p><a
href="http://www.davidketcheson.info/2020/03/22/SIR_interactive.html">In
the next post in the series, you can experiment with the model
yourself</a>.</p>
<h1>Modeling Coronavirus part III -- predictions (2020-03-19)</h1>
<p>Welcome to the third post in my series on modeling the Coronavirus
pandemic. In the <a
href="http://www.davidketcheson.info/2020/03/17/SIR_model.html">first</a>
and <a
href="http://www.davidketcheson.info/2020/03/19/SIR_Estimating_parameters.html">second</a>
posts, we introduced the SIR model and used the available data to
estimate the model parameters, in the absence of any mitigation
(i.e. with no social distancing, quarantine, etc.). In this post we’ll
put model and parameters together to see what they predict for the
current pandemic. We’ll do this first assuming no mitigation, and then
we’ll make a very rough attempt to understand how mitigation may impact
the predictions.</p>
<p>Remember, I am not an epidemiologist and I make no guarantees as to
the accuracy of these predictions. I don’t recommend that you make
precise plans based on the details of these predictions. My goal is
merely to show that with a bit of mathematics and common sense, we can
have a reasonable idea of what the future holds.</p>
<h1 id="predictions-in-the-absence-of-mitigation">Predictions in the
absence of mitigation</h1>
<p>First, let’s look at what we expect to happen without any mitigation.
In other words, this is a scenario in which schools and workplaces
remain open, people continue to shake hands or kiss cheeks when greeting
one another, and so forth. <strong>Note that the predictions here are
NOT what we expect will actually happen, because we have intentionally
ignored the effects of containment measures.</strong></p>
<p>Here are the basic assumptions leading to the predictions below:</p>
<ul>
<li>The dynamics of the COVID-19 epidemic follow the SIR model, with
parameters <span class="math inline">\(\beta \approx 0.25\)</span> and
<span class="math inline">\(\gamma \approx 0.05\)</span>.</li>
<li>No containment measures (like quarantine, closures, and social
distancing) are implemented.</li>
<li>The number of confirmed cases at present is about 15 percent of the
actual cases (this is a very rough guess based on some expert
opinions).</li>
</ul>
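<p>The curves below come from integrating the SIR system numerically. A minimal sketch of such a simulation (forward Euler time stepping; the world population and initial infected count are rough, illustrative values, not the exact ones behind the plots):</p>

```python
# Forward-Euler integration of the SIR model with the estimated parameters.
beta, gamma = 0.25, 0.05       # parameter estimates from the previous post
N = 7.8e9                      # approximate world population
S, I, R = N - 1e6, 1e6, 0.0    # illustrative initial conditions

dt = 0.1                       # time step in days
peak_I = I
for step in range(int(400 / dt)):
    flow = beta * I * S / N    # new infections per day
    S, I, R = S - dt * flow, I + dt * (flow - gamma * I), R + dt * gamma * I
    peak_I = max(peak_I, I)

print(peak_I / 1e9)            # peak simultaneous infections, in billions
```

<p>With these values the peak comes out close to 4 billion, consistent with the plot below.</p>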
<p><img src="/assets/img/covid19/output_5_0.png" /></p>
<p>There are several important things to notice here.</p>
<h2
id="when-does-the-maximum-number-of-infected-individuals-occur-and-how-high-is-it">When
does the maximum number of infected individuals occur, and how high is
it?</h2>
<p>As we expect, there is an initial exponential growth of infection
that eventually levels off and then decreases after most of the
population has been infected. Recall from the first post that we expect
the infection peak to occur when the fraction of susceptible individuals
is</p>
<p><span class="math display">\[\gamma/\beta = 0.05/0.25 =
0.2\]</span></p>
<p>i.e., when four-fifths of the world population has already been
infected (including those who have recovered). With the current
parameters, this peak occurs around the middle of May and results in
almost 4 billion concurrent cases. Of course, the vast majority of those
cases would not need medical care; based on our estimate of 15%
reporting and less than 20% of reported cases being severe, perhaps only
about 3% of all cases would require medical attention. Even so, that
would mean a peak of 100 million cases requiring medical attention
simultaneously, worldwide.</p>
<h2 id="how-many-people-catch-the-virus">How many people catch the
virus?</h2>
<p>In this scenario, almost everyone in the world is eventually
infected; by day 400 only 54 million susceptible individuals remain.</p>
<h2 id="how-long-does-the-epidemic-last">How long does the epidemic
last?</h2>
<p>After mid-May, the epidemic begins to trail off gradually, but it is
not until about early November that the number of infected drops back to
below 1 million. Thus (in this scenario) we should expect the severe
epidemic to last at least several months. Of course, it would make sense
for those who have already recovered from the virus – which is most of
the world population – to go back to work/school in the early Fall.</p>
<h1 id="variations-on-the-average-scenario">Variations on the average
scenario</h1>
<p>In the last post, we saw that there is significant uncertainty in the
value of <span class="math inline">\(\gamma\)</span> and especially
<span class="math inline">\(\beta\)</span>. How do these predictions
change if we vary <span class="math inline">\(\beta\)</span>? We found
values in the range <span class="math inline">\((0.2, 0.3)\)</span>, so
let’s look at what happens for each of the extremes of this
interval:</p>
<p><img src="/assets/img/covid19/output_8_0.png" /></p>
<p>With a smaller value of <span
class="math inline">\(\beta=0.2\)</span>, the virus spreads more slowly;
the peak occurs near June 1st with just over 3 billion infected. Again,
almost the entire world catches the virus (eventually 150 million
susceptibles remain), and the number of cases does not drop below 1
million until November.</p>
<p><img src="/assets/img/covid19/output_10_0.png" /></p>
<p>With a larger value of <span
class="math inline">\(\beta=0.3\)</span>, the virus spreads more
quickly; the peak occurs in early May with just over 4 billion infected.
The epidemic ends a bit sooner, with the number of cases dropping below
1 million some time in October. Only 19 million people remain
susceptible after a year.</p>
<p>As we can see, even with these fairly large changes in <span
class="math inline">\(\beta\)</span>, the general picture remains the
same. We can expect the epidemic to peak in May or June and last into
the fall.</p>
<h1 id="the-effect-of-mitigation">The effect of mitigation</h1>
<p>As I write this, almost every country in the world is adopting
measures to mitigate the spread of the virus. This includes closing
schools and workplaces, quarantining infected individuals and their
contacts, and encouraging people to stay home.</p>
<p>For simplicity we can view all of these mitigation measures as having
a single effect: increasing the average time between encounters, or in
other words reducing <span class="math inline">\(\beta\)</span>.
Remember that <span class="math inline">\(\beta\)</span> is the average
number of people per day that a given individual has close contact with.
Since <span class="math inline">\(\beta\)</span> is a constant in our
model but the mitigation techniques and their effectiveness may vary
over time, we can incorporate mitigation by adding a new factor <span
class="math inline">\(q(t)\in[0,1]\)</span> multiplying <span
class="math inline">\(\beta\)</span>:</p>
<p><span class="math display">\[\begin{align}
\frac{dS}{dt} & = -q(t)\beta I \frac{S}{N} \\
\frac{dI}{dt} & = q(t)\beta I \frac{S}{N}-\gamma I \\
\frac{dR}{dt} & = \gamma I
\end{align}\]</span></p>
<p>What is the meaning of <span class="math inline">\(q(t)\)</span>? If
there were an absolute quarantine, with no human contact at all, we
would have <span class="math inline">\(q=0\)</span>, whereas if no
measures are implemented then we would have <span
class="math inline">\(q=1\)</span> (corresponding to the predictions
above). In the real world, <span class="math inline">\(q\)</span> will
be somewhere between these extremes.</p>
<p>What is the correct value of <span class="math inline">\(q\)</span>?
Frankly, I have no idea and I doubt that even the experts can say with
confidence. But we can hypothesize some values and explore their impact.
For simplicity, we’ll assume that some value <span
class="math inline">\(q<1\)</span> is achieved through mitigation
measures starting now and lasting for the next <span
class="math inline">\(N_q\)</span> days. After <span
class="math inline">\(N_q\)</span> days, these measures are lifted so
that society (and <span class="math inline">\(\beta\)</span>) returns to
normal.</p>
<p>If we could maintain mitigation measures forever (i.e. <span
class="math inline">\(N_q=\infty\)</span>) then this mitigation would
have exactly the same effect as reducing <span
class="math inline">\(\beta\)</span>. As we saw above, smaller values of
<span class="math inline">\(\beta\)</span> lead to a smaller infection
peak, but a longer epidemic.</p>
<p>Of course, we do not expect mitigation to last forever; people must
go back to work eventually. This can have some surprising effects.</p>
<p>As a first scenario, let’s imagine that <span
class="math inline">\(q=1/2\)</span>; i.e., we are able to reduce the
amount of human contact by 50%, and <span
class="math inline">\(N_q=180\)</span> (about six months).</p>
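<p>In a simulation, <span class="math inline">\(q(t)\)</span> enters simply as a time-dependent multiplier on <span class="math inline">\(\beta\)</span>. A sketch of this scenario (forward Euler, with the same illustrative population and initial values as before):</p>

```python
beta, gamma = 0.25, 0.05        # parameter estimates from the previous post
N = 7.8e9                       # approximate world population
q_value, N_q = 0.5, 180         # mitigation factor and duration (days)

def q(t):
    """Contact-reduction factor: q < 1 while mitigation lasts, 1 afterward."""
    return q_value if t < N_q else 1.0

S, I, R = N - 1e6, 1e6, 0.0     # illustrative initial conditions
dt = 0.1                        # time step in days
peak_I, t = I, 0.0
while t < 730:                  # simulate two years
    flow = q(t) * beta * I * S / N
    S, I, R = S - dt * flow, I + dt * (flow - gamma * I), R + dt * gamma * I
    peak_I = max(peak_I, I)
    t += dt

print(peak_I / 1e9)             # peak in billions: below 2, as in the plot
```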
<p><img src="/assets/img/covid19/output_20_0.png" /></p>
<p>We see that this mitigation makes a substantial difference. Now the
peak point of infection is delayed until late June, with less than 2
billion simultaneous cases at maximum. Notice the bump in the number of
cases around mid-September when the restrictions are relaxed. Another
effect of this mitigation is that the time in which there are some
significant number of cases lasts much longer – in this scenario there
are still more than 1 million cases even 1 year from now. It’s still
true that most of the world catches the virus eventually, but about 480
million susceptibles remain even after two years.</p>
<p>Next let’s consider an even more successful mitigation: suppose that
we cut human contact by three-fifths, so <span
class="math inline">\(q=0.4\)</span>, with measures again lasting <span
class="math inline">\(N_q\)</span>=180 days.</p>
<p><img src="/assets/img/covid19/output_23_0b.png" /></p>
<p>Here the initial growth is even slower, and we see an interesting
phenomenon: the infection appears to reach a peak around mid-September,
right when restrictions are relaxed. But the relaxing of restrictions
allows the virus to suddenly spread much faster, leading to a higher
peak in early October. What’s quite surprising and counterintuitive is
that <strong>even though we have stronger mitigation, the peak is
actually higher in this case than in the previous case</strong>. Why?
It’s because the very strong mitigation for 180 days means that at the
end of that period there is still a large proportion of susceptible
people, leading to stronger growth of the epidemic when mitigation
ends.</p>
<p>Finally, if we have extremely strong mitigation and then suddenly
remove all restrictions, we simply get a delayed version of the
full-scale outbreak:</p>
<p><img src="/assets/img/covid19/output_new.png" /></p>
<p>This is because our mitigation was so effective that hardly anyone
caught the virus and everyone was still susceptible after our 180-day
mitigation ended. Of course, the ideal scenario (in terms of minimizing
infection) would be to maintain strong mitigation until a vaccine can be
deployed.</p>
<h1 id="when-should-we-let-society-go-back-to-normal">When should we let
society go back to normal?</h1>
<p>A natural question at this point is, when is it okay to let everyone
go back to school and to work? There are many possible answers, but
perhaps a reasonable one can be obtained as follows. Let’s assume that
our goal is to ensure that the number of infected <span
class="math inline">\(I(t)\)</span> will not increase when we send
everyone back to work; i.e., when we let <span
class="math inline">\(q\)</span> go back up to 1. We have</p>
<p><span class="math display">\[\begin{align}
\frac{dI}{dt} & = (q\beta \frac{S}{N}-\gamma) I.
\end{align}\]</span></p>
<p>To ensure that <span class="math inline">\(dI/dt<0\)</span> even
with <span class="math inline">\(q=1\)</span>, we need the susceptible
fraction <span class="math inline">\(S/N\)</span> to have fallen to
<span class="math inline">\(\gamma/\beta\)</span> – the same condition
we found previously for when the infection peak would occur (without
mitigation). Using the parameter values we found in the last post, this
means we should wait until 80% of the population has been infected
before returning to normal. This is probably too cautious, since ending
mitigation a bit sooner (but after the initial peak of infection) would
only lead to a small subsequent rise, with a second infection peak
smaller than the first:</p>
<p><img src="/assets/img/covid19/output_26_0b.png" /></p>
<p>Of course, other strategies are possible and could make sense, like
sending people back to work as soon as they have tested positive and
then recovered from the virus. And more complicated mitigation
strategies are possible (and likely) in which some restrictions are
relaxed while others remain in place. All of these will have to be
weighed against the cost they impose on our lives.</p>
<p>In the <a
href="https://github.com/ketch/covid-blog-posts/blob/master/03_Predictions.ipynb">Jupyter
notebook for this post</a> there is an interactive widget that lets you
experiment with the model, including all five of the parameters we have
discussed. The possible effects are quite interesting and we have only
touched on the basics here. I encourage you to try it out!</p>
<h1 id="further-considerations">Further considerations</h1>
<p>There are a number of potentially important factors that we have
ignored here, including:</p>
<ul>
<li>Because countries will have differing strategies, there may be
significantly <strong>earlier infection peaks in some countries and
later peaks in others</strong>. This is likely to be the case if
international travel restrictions remain in place for an extended
period.</li>
<li>Possible <strong>seasonal effects</strong> on the spread of
Coronavirus. Currently there seems to be no way to know if COVID-19 will
be seasonal like the flu.</li>
<li>Development of a <strong>vaccine</strong>. As we have seen,
significantly reducing the number of susceptible individuals can rapidly
halt the spread of a disease, even if not every individual is
vaccinated. So if a vaccine appears and can be mass produced before the
fall, it could substantially shorten the duration of the epidemic and
reduce its impact. On the other hand, it seems that a vaccine arriving
after a year would be too late to have much impact.</li>
</ul>
<p><a
href="http://www.davidketcheson.info/2020/03/20/SIR_exponential.html">In
the next post, we take a deeper look at exponential growth and its
implications for the epidemic</a>.</p>
<h1>Modeling Coronavirus part II -- estimating parameters (2020-03-19)</h1>
<p>Welcome back! In the <a
href="http://www.davidketcheson.info/2020/03/17/SIR_model.html">first
post</a> of this series, we learned about the SIR model, which consists
of three differential equations describing the rate of change of
susceptible (S), infected (I), and recovered (R) populations:</p>
<p><span class="math display">\[\begin{align*}
\frac{dS}{dt} & = -\beta I \frac{S}{N} \\
\frac{dI}{dt} & = \beta I \frac{S}{N}-\gamma I \\
\frac{dR}{dt} & = \gamma I
\end{align*}\]</span></p>
<p>As we discussed, the model contains two key parameters (<span
class="math inline">\(\beta\)</span> and <span
class="math inline">\(\gamma\)</span>) that influence the spread of a
disease. In this second post on modeling the COVID-19 outbreak, we will
take the existing data and use it to estimate the values of those
parameters.</p>
<p>The parameters we want to estimate are:</p>
<ul>
<li><span class="math inline">\(\beta\)</span>: The average number of
people that come in close contact with a given infected individual, per
day</li>
<li><span class="math inline">\(\gamma\)</span>: The reciprocal of the
average duration of the disease (in days)</li>
</ul>
<h2 id="estimating-gamma">Estimating <span
class="math inline">\(\gamma\)</span></h2>
<p>A rough estimate of <span class="math inline">\(\gamma\)</span> is
available directly from medical sources. Most cases of COVID-19 are mild
and recovery occurs after about two weeks, which would give <span
class="math inline">\(\gamma = 1/14 \approx 0.07\)</span>. However, a
smaller portion of cases are more severe and can last for several weeks,
so <span class="math inline">\(\gamma\)</span> will be somewhat smaller
than this value. Estimates I have seen put the value in the range <span
class="math inline">\(0.03 - 0.06\)</span>.</p>
<h2 id="estimating-beta">Estimating <span
class="math inline">\(\beta\)</span></h2>
<p>It’s much more difficult to get a good estimate of <span
class="math inline">\(\beta\)</span>. To be clear, we are trying to
estimate, for an infected individual, the average number of other
individuals with whom they have close contact per day. Here <em>close
contact</em> means contact that would lead to infection of the other
individual (if that individual is still susceptible).</p>
<p>As we discussed earlier, this number is affected by many factors. It
will also be affected by mitigation strategies implemented to reduce
human contact. For now, we want to estimate the value of <span
class="math inline">\(\beta\)</span> <em>in the absence of
mitigation</em>. Later, we will try to take mitigation into account.</p>
<p>Recall that our equation for the number of infected is</p>
<p><span class="math display">\[
\frac{dI}{dt} = \left(\beta \frac{S}{N}-\gamma \right) I(t)
\]</span></p>
<p>Very early in an outbreak, the ratio <span class="math inline">\(S/N
\approx 1\)</span> since hardly anyone has been infected. Also, at
extremely early times, we can ignore <span
class="math inline">\(\gamma\)</span> because the disease is so new that
nobody has been sick for long enough to recover. For COVID-19, this is
true for about the first two weeks of the disease’s spread in a new
population. During that time we have simply</p>
<p><span class="math display">\[
\frac{dI}{dt} = \beta I(t)
\]</span></p>
<p>This is one of the simplest differential equations, and its solution
is just a growing exponential:</p>
<p><span class="math display">\[
I(t) = e^{\beta t} I(0).
\]</span></p>
<p>Here <span class="math inline">\(I(0)\)</span> is of course the
number of initially infected individuals. Thus we can try to estimate
<span class="math inline">\(\beta\)</span> by fitting an exponential
curve to the initial two weeks of spread. This is not the only way to
estimate <span class="math inline">\(\beta\)</span>; using this approach
is the first of several choices that we’ll make, and those choices will
influence our eventual predictions.</p>
<h3 id="getting-the-data">Getting the data</h3>
<p>Fortunately for us, comprehensive data on the spread of COVID-19 is
available from <a href="https://github.com/CSSEGISandData/COVID-19">this
Github repository</a> provided by the Johns Hopkins University Center
for Systems Science and Engineering. Specifically, I’ll be using the
data in <a
href="https://github.com/CSSEGISandData/COVID-19/blob/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Deaths.csv">this
file</a>. Note that the file gets updated daily; as I write it is March
17th.</p>
<p>I’m using Python and Pandas to work with the data. For this blog
post, I have removed most of the computer code, but you can <a
href="https://github.com/ketch/covid-blog-posts/blob/master/02_Estimating_parameters.ipynb">download
the Jupyter notebook</a> and play with the code and data yourself.</p>
<p>To estimate <span class="math inline">\(\beta\)</span>, we just pick
a particular country from this dataset, plot the number of cases over
time, and fit an exponential function to it. We can use a standard
mathematical tool called <em>least squares fitting</em> to find a
reasonable value.</p>
<h2 id="fitting-the-data-from-italy">Fitting the data from Italy</h2>
<p>For instance, here is the data from Italy:</p>
<p><img src="/assets/img/covid19/output_12_0.png" /></p>
<p>Since this data starts back in January, before the virus reached
Italy, the number of cases at the beginning is zero. We can use the
interval from day 30 to day 43 (inclusive) to try to fit <span
class="math inline">\(\beta\)</span>, since this seems to be when the
outbreak began to take off. Here it must be emphasized that the choice
of this particular interval is somewhat arbitrary; different choices
will give somewhat different values for <span
class="math inline">\(\beta\)</span>.</p>
<div class="sourceCode" id="cb1"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="kw">def</span> exponential_fit(cases,start,length):</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a> <span class="kw">def</span> resid(beta):</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a> prediction <span class="op">=</span> cases[start]<span class="op">*</span>np.exp(beta<span class="op">*</span>(dd<span class="op">-</span>start))</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a> <span class="cf">return</span> prediction[start:start<span class="op">+</span>length]<span class="op">-</span>cases[start:start<span class="op">+</span>length]</span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a> soln <span class="op">=</span> optimize.least_squares(resid,<span class="fl">0.2</span>)</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a> beta <span class="op">=</span> soln.x[<span class="dv">0</span>]</span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a> <span class="bu">print</span>(<span class="st">'Estimated value of beta: </span><span class="sc">{:.3f}</span><span class="st">'</span>.<span class="bu">format</span>(beta))</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a> <span class="cf">return</span> beta</span></code></pre></div>
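<p>The snippet above relies on a few globals defined elsewhere in the
notebook (<code>np</code>, <code>optimize</code>, and the day-index
array <code>dd</code>). A self-contained version of the same idea, with
hypothetical synthetic data standing in for the JHU counts, might look
like this:</p>

```python
import numpy as np
from scipy import optimize

# Day indices and synthetic case counts standing in for the JHU data.
# The counts here are hypothetical, generated with beta = 0.247.
dd = np.arange(56)
total_cases = 2.0 * np.exp(0.247 * dd)

def exponential_fit(cases, start, length):
    # Residual between an exponential curve with growth rate beta
    # and the observed counts, over the chosen fitting window
    def resid(beta):
        prediction = cases[start] * np.exp(beta * (dd - start))
        return prediction[start:start + length] - cases[start:start + length]

    soln = optimize.least_squares(resid, 0.2)  # initial guess: beta = 0.2
    return soln.x[0]

beta = exponential_fit(total_cases, start=35, length=14)
print(beta)  # recovers the beta used to generate the data, 0.247
```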
<p>Let’s see how well this value predicts the data:</p>
<div class="sourceCode" id="cb2"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="kw">def</span> plot_fit(cases,start,end<span class="op">=</span><span class="dv">56</span>):</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a> length<span class="op">=</span>end<span class="op">-</span>start</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a> plt.plot(cases)</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a> beta <span class="op">=</span> exponential_fit(cases,start,length)</span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a> prediction <span class="op">=</span> cases[start]<span class="op">*</span>np.exp(beta<span class="op">*</span>(dd<span class="op">-</span>start))</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a> plt.plot(dd[start:start<span class="op">+</span>length],prediction[start:start<span class="op">+</span>length],<span class="st">'--k'</span>)<span class="op">;</span></span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a> plt.legend([<span class="st">'Data'</span>,<span class="st">'fit'</span>])<span class="op">;</span></span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a> plt.xlabel(<span class="st">'Days'</span>)<span class="op">;</span> plt.ylabel(<span class="st">'Total cases'</span>)<span class="op">;</span></span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a> <span class="cf">return</span> beta</span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a> </span>
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a>beta <span class="op">=</span> plot_fit(total_cases,start<span class="op">=</span><span class="dv">35</span>,end<span class="op">=</span><span class="dv">49</span>)</span>
<span id="cb2-12"><a href="#cb2-12" aria-hidden="true" tabindex="-1"></a>plt.title(<span class="st">'Italy'</span>)<span class="op">;</span></span></code></pre></div>
<pre><code>Estimated value of beta: 0.247</code></pre>
<p><img src="/assets/img/covid19/output_16_0.png" /></p>
<p>The fit seems reasonably good over the interval we used. How well
does it match if we plot the fit over the whole time interval?</p>
<div class="sourceCode" id="cb4"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a>start<span class="op">=</span><span class="dv">35</span></span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a>plt.plot(total_cases)</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a>dd <span class="op">=</span> np.arange(<span class="bu">len</span>(days))</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a>prediction <span class="op">=</span> total_cases[start]<span class="op">*</span>np.exp(beta<span class="op">*</span>(dd<span class="op">-</span>start))</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a>plt.plot(dd[start:],prediction[start:],<span class="st">'--k'</span>)<span class="op">;</span></span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a>plt.legend([<span class="st">'Data'</span>,<span class="st">'fit'</span>])<span class="op">;</span></span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>plt.xlabel(<span class="st">'Days'</span>)<span class="op">;</span> plt.ylabel(<span class="st">'Total cases'</span>)<span class="op">;</span></span></code></pre></div>
<p><img src="/assets/img/covid19/output_18_0.png" /></p>
<p>Clearly, the prediction is not accurate at later times. There are two
main reasons for this:</p>
<ul>
<li>Our assumption of exponential growth was based on other assumptions
that are only valid at the very start of the outbreak;</li>
<li>Italian society has taken measures to combat the spread of the
virus, effectively reducing <span class="math inline">\(\beta\)</span>
at later times.</li>
</ul>
<p>We can resolve the first issue by using the full SIR model (instead
of just exponential growth) to make predictions. The second issue is
more complicated; we will try to deal with it in a later blog post.</p>
<h2 id="fitting-to-data-from-other-regions">Fitting to data from other
regions</h2>
<p>To have more confidence in our value of <span
class="math inline">\(\beta\)</span>, we can perform a similar fit with
data from other regions, and see if we get a similar value. Next, let’s
try fitting the data from the USA. Here’s the data:</p>
<p><img src="/assets/img/covid19/output_21_0.png" /></p>
<p>Notice that we only have about 1 week of meaningful data. Let’s try
to fit an exponential to it:</p>
<p><img src="/assets/img/covid19/output_23_0.png" /></p>
<p>We get a fairly similar value for <span
class="math inline">\(\beta\)</span>. Furthermore, the fit using this
value seems to be pretty good.</p>
<h3 id="spain">Spain</h3>
<p><img src="/assets/img/covid19/output_26_1.png" /></p>
<h3 id="uk">UK</h3>
<p><img src="/assets/img/covid19/output_28_1.png" /></p>
<h3 id="france">France</h3>
<p><img src="/assets/img/covid19/output_30_0.png" /></p>
<h3 id="hubei-province-china">Hubei Province, China</h3>
<p>Let’s look at the data from where it all started: Hubei province,
China. Here it makes sense to start the fit from day zero of the JHU
data set.</p>
<p><img src="/assets/img/covid19/output_32_0.png" /></p>
<p>Each of these countries seems to fit the model reasonably well and to
give a more or less similar value of <span
class="math inline">\(\beta\)</span>, in the range <span
class="math inline">\(0.22\)</span> to <span
class="math inline">\(0.29\)</span>. It would be wrong to feel
completely confident about this value, or to try to extrapolate too much
from such a short time interval of data, but the consistency of these
results does seem to suggest that our estimate is meaningful.</p>
<p>Now let’s look at some countries that don’t fit this pattern.</p>
<h3 id="iran-and-south-korea">Iran and South Korea</h3>
<p>Here is the number of confirmed cases for Iran:</p>
<p><img src="/assets/img/covid19/output_36_0.png" /></p>
<p>And here is Korea:</p>
<p><img src="/assets/img/covid19/output_38_0.png" /></p>
<p>At a glance we can see that this data doesn’t follow the pattern of
the previous countries. In Iran, after the first week, the growth seems
to be linear. In Korea, the initial exponential growth eventually slows
down drastically and is beginning to level off. This tells us that
something we left out of our model must be at play.</p>
<p>In the case of Korea, it seems straightforward to understand what is
going on. Korea has deployed the most extensive COVID-19 testing system
in the world, with over 270,000 people tested to date. This is combined
with an extensive effort to isolate infected people and those they have
been in recent contact with. Essentially, South Korea has reduced the
value of <span class="math inline">\(\beta\)</span>. Based on our
earlier analysis, to prevent future exponential growth, they will need
to keep <span class="math inline">\(\beta\)</span> down to approximately
the value of <span class="math inline">\(\gamma\)</span> or less. If we
believe that <span class="math inline">\(\gamma \approx 0.05\)</span>
and <span class="math inline">\(\beta \approx 0.25\)</span>, this means
reducing the amount of human contact by infected people by a factor of
five.</p>
<p>Iran’s case is at first more puzzling, since the testing and
quarantine measures there have not been exceptional compared to
countries like Italy and Spain. Instead, there are <a
href="https://www.theatlantic.com/ideas/archive/2020/03/irans-coronavirus-problem-lot-worse-it-seems/607663/">strong</a>
<a
href="https://www.nytimes.com/2020/02/28/world/middleeast/coronavirus-iran-confusion.html">suspicions</a>
that <a
href="https://medicalxpress.com/news/2020-03-covid-outbreak-iran-larger.html">the
official numbers from Iran are wildly inaccurate</a> and the real number
of cases (and deaths) is <a
href="https://www.washingtonpost.com/world/middle_east/coronavirus-pummels-iran-leadership-as-data-show-spread-is-far-worse-than-reported/2020/03/04/7b1196ae-5c9f-11ea-ac50-18701e14e06d_story.html">drastically
higher than what is reported</a>.</p>
<h2 id="problems-with-the-our-approach">Problems with our
approach</h2>
<p>Before we finish, it’s important to understand the limitations of
the data we’re working with and the technique we have used. Most
importantly, the numbers we have certainly <strong>do not represent the
real number of infected individuals</strong>. That’s because many
infected individuals are never tested for the virus. This is especially
true for diseases like COVID-19 in which the majority of cases are mild
and do not require professional medical care. Estimates I have seen
claim that only about 10-20% of all cases are detected.</p>
<p>If we assume that the fraction of cases that are actually detected is
constant over time, then this discrepancy does not hinder our ability to
estimate <span class="math inline">\(\beta\)</span>, since dividing the
initial and final number of infected by the same constant will lead to
the same estimate of <span class="math inline">\(\beta\)</span> that
would be obtained if we counted all the cases. However, it’s clear that
in many places this factor changes over time as a country starts doing
more and more testing. This would cause the number of reported cases to
grow even faster than the real number. This is most likely occurring,
for instance, in the US where previously many individuals with symptoms
were not tested due to a lack of test availability.</p>
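<p>This invariance is easy to check numerically. Here is a small sketch
(synthetic data, not the notebook’s code): fitting an exponential to a
full case count and to a constant 15% sample of it gives the same growth
rate:</p>

```python
import numpy as np

# Synthetic case counts growing exponentially with beta = 0.25
days = np.arange(14)
true_cases = 100 * np.exp(0.25 * days)

# Suppose only a constant 15% of cases are ever detected
reported = 0.15 * true_cases

# Fitting an exponential is a linear fit in log space:
# the slope of log(cases) vs. time is the estimate of beta
beta_true = np.polyfit(days, np.log(true_cases), 1)[0]
beta_reported = np.polyfit(days, np.log(reported), 1)[0]

print(beta_true, beta_reported)  # both 0.25: the constant factor drops out
```

<p>A constant detection fraction only shifts the curve vertically in log
space; the slope, and hence the estimate of <span
class="math inline">\(\beta\)</span>, is unchanged.</p>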
<p>Another issue is that in some cases governments may be intentionally
hiding the true number of infections. As we have seen, this is likely
the case in Iran.</p>
<p>Finally, mitigation strategies may already be in place and
influencing the rate of spread in some countries, even in the early days
of the outbreak. This would lead us to underestimate the natural value of
<span class="math inline">\(\beta\)</span>.</p>
<h1 id="conclusion">Conclusion</h1>
<p>What we can take away from this analysis are the rough estimates for
the SIR parameters:</p>
<p><span class="math display">\[\gamma \approx 0.05\]</span> <span
class="math display">\[\beta \approx 0.25.\]</span></p>
<p>Notice that the behavior in this initial phase of the epidemic that
we have focused on is very similar to the simple behavior we considered
at the start of the first post. There, the number of infected
individuals doubled each day, but we knew that was unrealistic. Here,
the number of infected individuals doubles every few days. How many days
does it take for the number to double? If it takes <span
class="math inline">\(m\)</span> days for the number of cases to double,
then we have</p>
<p><span class="math display">\[
e^{\beta m} = 2
\]</span></p>
<p>so <span class="math inline">\(m = \log(2)/\beta\)</span> where <span
class="math inline">\(\log\)</span> means the natural logarithm. For
<span class="math inline">\(\beta=0.25\)</span>, this gives a doubling
time of about 2.8 days. This growth will slow down somewhat after the
first couple of weeks for reasons we have already discussed.</p>
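<p>As a quick check of that arithmetic (a minimal sketch, not from the
original notebook), here are the doubling times for the range of values
we estimated:</p>

```python
import numpy as np

# Doubling time m = log(2)/beta for the betas estimated above
for beta in (0.22, 0.25, 0.29):
    print(f"beta = {beta:.2f}: cases double every {np.log(2)/beta:.1f} days")
```

<p>This gives doubling times between roughly 2.4 and 3.2 days.</p>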
<p>It should be emphasized that the value of <span
class="math inline">\(\beta\)</span> here is what we expect <strong>in
the absence of mitigation strategies</strong>. In later posts, we’ll
look at what these values mean for the future spread of the epidemic,
and what the potential effect of mitigation may be.</p>
<p>In the <a
href="https://github.com/ketch/covid-blog-posts/blob/master/02_Estimating_parameters.ipynb">Jupyter
notebook for this post</a> there is an interactive setup where you can
make your own fits to the data from a variety of regions.</p>
<p><a
href="http://www.davidketcheson.info/2020/03/19/SIR_predictions.html">Click
here to go to the next post</a>, in which we use what we’ve found to
predict the future.</p>
Modeling Coronavirus part I -- the SIR model2020-03-17T00:00:00+03:00h/2020/03/17/SIR_model<p>This post is the first in a series in which we’ll use a simple but
effective mathematical model to predict the ongoing Coronavirus
outbreak. As I write, the number of officially confirmed global cases is
just under 200,000 and many schools and workplaces all over the world
have closed in order to slow its spread. If you’re like me, you’re
wondering:</p>
<ul>
<li>Am I likely to catch this virus?</li>
<li>How long will it be until my school or workplace opens up
again?</li>
</ul>
<p>My claim is that we can reach some reasonable approximate answers
using straightforward mathematics. Math is quite effective at predicting
the average behavior of large groups, and a little math can go a long
way in telling us what will happen next with COVID-19. My goal here is
to help you make and understand those predictions with little more than
high school mathematics.</p>
<p><em>Disclaimer</em>: I am not an epidemiologist and I make no
guarantees about the predictions we’ll arrive at here. By reading this
you agree not to sue me. What I write here does not reflect the opinion
of my employer or anyone else.</p>
<p>Each post in this series is written in a <a
href="https://jupyter.org/">Jupyter notebook</a>, which you can download
and experiment with yourself if you are so inclined. The notebook for
this first post is <a
href="https://github.com/ketch/covid-blog-posts/blob/master/01_SIR_Model.ipynb">here</a></p>
<h1 id="modeling-the-spread-of-infectious-disease">Modeling the spread
of infectious disease</h1>
<p>An infectious disease spreads from one individual to another.
Consider the following simple model:</p>
<ul>
<li>On day zero, a single individual is infected</li>
<li>On each subsequent day, each infected individual passes the disease
to one more individual</li>
</ul>
<p>How quickly does the number of infected individuals grow?</p>
<p>1, 2, 4, 8, 16, …</p>
<p>On each day, the number of infected doubles! How many days would it
take for everyone on earth to be infected?</p>
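<p>You can answer this with one line of arithmetic. Taking the world’s
population to be about 7.8 billion (my round figure for 2020, not a
number from the post), everyone would be infected in just over a
month:</p>

```python
import math

world_population = 7.8e9  # approximate 2020 world population (assumption)

# Starting from one case and doubling daily, the count after n days is 2**n,
# so we need the smallest n with 2**n >= world_population
days = math.ceil(math.log2(world_population))
print(days)  # 33
```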
<p>This is not a reasonable model for any of the diseases we know of. Of
course, the rate of new infections per infected person (1 per day) was
an arbitrary choice and real values are likely to be smaller. What other
effects are missing from this model?</p>
<ul>
<li><strong>Recovery and immunity</strong>: eventually, an individual
recovers and can no longer infect others</li>
<li><strong>Spread</strong>: The infection can only spread to people who
don’t yet have it. If most of the individuals in contact with an
infected person are already infected, that person is less likely to
spread the disease to someone new</li>
</ul>
<p>Those two factors are vital to understanding the true dynamics of
epidemics. Of course, there are many other important details we have
left out; for instance:</p>
<ul>
<li>The disease may affect different individuals in different ways.</li>
<li>Individuals are spread out geographically.</li>
<li>Certain individuals are likely to infect many others, while others
are less likely. This depends on many factors including culture,
personality, and lifestyle, as well as the mode of transmission of the
disease.</li>
<li>Individuals might take actions to avoid getting infected
(e.g. washing hands, avoiding sick people) or to avoid spreading the
disease (e.g. staying home when sick).</li>
</ul>
<p>All of these effects (and many others) will influence the spread of
the disease. A model that tries to incorporate them all would be very
complex.</p>
<h2 id="the-sir-model">The SIR model</h2>
<p>One of the simplest but most relevant models is based on the idea
that the population consists of three groups:</p>
<ul>
<li><strong>S(t)</strong> Susceptible (those who have not yet been
infected)</li>
<li><strong>I(t)</strong> Infected (those who can currently spread the
disease)</li>
<li><strong>R(t)</strong> Recovered (those who are now immune and cannot
contract or spread the disease)</li>
</ul>
<p>When we write <span class="math inline">\(S(t)\)</span>, we mean that
the number of susceptible individuals (<span
class="math inline">\(S\)</span>) is given as a function of time (<span
class="math inline">\(t\)</span>). We can’t write down this function
exactly; instead it will be described by a <em>differential
equation</em>. Now, differential equations are a bit like <a
href="https://en.wikipedia.org/wiki/Whale_shark">whale sharks</a>: they
sound scary at first, but in reality they are simple and friendly
creatures.</p>
<p>A differential equation is just a description of how some quantity
changes. In this case, we will have three differential equations,
describing the rate of change of <span class="math inline">\(S\)</span>,
<span class="math inline">\(I\)</span>, and <span
class="math inline">\(R\)</span>. The idea is that susceptible people
can become infected and infected people can become recovered:</p>
<p><span class="math display">\[ S \to I \to R\]</span></p>
<p>To define differential equations for the three groups, we only need
to determine the rate at which each of these transitions occurs.</p>
<h3 id="rate-of-infection">Rate of infection</h3>
<p>In our first simple model, we assumed the rate of infection was
proportional to the number of infected. This is very reasonable, but for
someone new to become infected we need both an infected individual
<strong>and</strong> a susceptible one. If we imagine that people
encounter each other randomly at some rate <span
class="math inline">\(\beta\)</span>, then the rate of new infections is
just the number of infected multiplied by the probability of
encountering a susceptible individual:</p>
<p><span class="math display">\[
\frac{dI}{dt} = \beta I \frac{S}{N}.
\]</span></p>
<p>Here <span class="math inline">\(N=S+I+R\)</span> is the total
population, so <span class="math inline">\(S/N\)</span> is the
probability that a randomly chosen individual is susceptible. This is
probably the most complicated point of our discussion, so take some time
to think about it until it makes sense to you.</p>
<p>Of course, since newly infected people were previously susceptible, the
number of susceptible individuals must decrease at the same rate:</p>
<p><span class="math display">\[
\frac{dS}{dt} = -\beta I \frac{S}{N}.
\]</span></p>
<h3 id="rate-of-recovery">Rate of recovery</h3>
<p>The other transition is from infected to recovered. A proper model
for this should involve a time delay, since (for many diseases) newly
infected individuals typically become recovered after a certain interval
of time. For instance, with the flu or the new Coronavirus, the number
of new recovered individuals might depend on how many became infected
about one or two weeks ago. Incorporating such an effect would lead to a
more complicated model known as a <strong>delay differential
equation</strong>.</p>
<p>Instead, we will simply assume that over any time interval, a certain
fraction of the infected become recovered. Denoting the recovery rate by
<span class="math inline">\(\gamma\)</span>, we have</p>
<p><span class="math display">\[
\frac{dR}{dt} = \gamma I.
\]</span></p>
<p>The number of infected must decrease at the same rate, so we must
modify our differential equation for <span
class="math inline">\(I(t)\)</span> to read</p>
<p><span class="math display">\[
\frac{dI}{dt} = \beta I \frac{S}{N}-\gamma I.
\]</span></p>
<h3 id="the-full-model">The full model</h3>
<p>Taking these three equations together, we have</p>
<p><span class="math display">\[\begin{align}
\frac{dS}{dt} & = -\beta I \frac{S}{N} \\
\frac{dI}{dt} & = \beta I \frac{S}{N}-\gamma I \\
\frac{dR}{dt} & = \gamma I
\end{align}\]</span></p>
<p>Notice that if we add the 3 equations together, we get</p>
<p><span class="math display">\[
\frac{dN}{dt} = 0.
\]</span></p>
<p>What do <span class="math inline">\(\beta\)</span> and <span
class="math inline">\(\gamma\)</span> really mean? We can think of <span
class="math inline">\(\beta\)</span> as the number of others that one
infected person encounters per unit time, and <span
class="math inline">\(\gamma^{-1}\)</span> as the typical time from
infection to recovery. So the number of new infections generated by one
infected individual is, on average, <span
class="math display">\[\beta/\gamma = R_0,\]</span> the <strong>basic
reproduction number</strong>.</p>
<h3 id="sir-dynamics">SIR dynamics</h3>
<p>Notice that <span class="math inline">\(S(t)\)</span> can only
decrease and <span class="math inline">\(R(t)\)</span> can only
increase, but <span class="math inline">\(I(t)\)</span> may increase or
decrease. A key question is, under what conditions will <span
class="math inline">\(I(t)\)</span> increase? This will tell us whether
a small number of cases could become an epidemic.</p>
<p>We can write</p>
<p><span class="math display">\[
\frac{dI}{dt} = \left(\beta \frac{S}{N}-\gamma \right) I
\]</span></p>
<p>from which we see that <span class="math inline">\(I(t)\)</span>
grows if <span class="math display">\[\beta S/N >
\gamma.\]</span></p>
<p>Initially in a population we have <span class="math display">\[S/N
\approx 1,\]</span></p>
<p>so an epidemic of some size can occur if <span
class="math inline">\(\beta > \gamma\)</span>. As the epidemic grows,
the ratio <span class="math inline">\(S/N\)</span> becomes smaller, so
eventually the spread slows down.</p>
<p>What fraction of the population must be infected before <span
class="math inline">\(I(t)\)</span> will start to decrease?</p>
<p>The epidemic will begin to subside when <span
class="math display">\[S/N = (\beta/\gamma)^{-1} =
R_0^{-1}.\]</span></p>
<p>This determines the infection peak. After this point, there will
still be new infections but the overall number of infected will
decrease.</p>
<h2 id="an-example">An example</h2>
<p>So what does an epidemic look like, using the SIR model? We can
easily compute the solution using standard numerical methods; I’ve
omitted the code here since I want to focus on the model, but feel free
to look at <a
href="https://github.com/ketch/covid-blog-posts/blob/master/01_SIR_Model.ipynb">the
original Jupyter notebook with code</a> and modify the parameters
yourself.</p>
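<p>If you’d rather see the idea without opening the notebook, here is a
minimal sketch of how such a solution could be computed (illustrative
code, not the notebook’s; it assumes SciPy is available):</p>

```python
import numpy as np
from scipy.integrate import solve_ivp

def sir_rhs(t, y, beta, gamma, N):
    """Right-hand side of the SIR equations above."""
    S, I, R = y
    dS = -beta * I * S / N
    dI = beta * I * S / N - gamma * I
    dR = gamma * I
    return [dS, dI, dR]

N = 1.0                     # normalized population
beta, gamma = 1.0, 0.1      # the parameter values used in the plots here
y0 = [N - 1e-6, 1e-6, 0.0]  # a tiny initial infected fraction

t = np.linspace(0, 60, 601)
sol = solve_ivp(sir_rhs, [0, 60], y0, args=(beta, gamma, N),
                t_eval=t, rtol=1e-8, atol=1e-10)
S, I, R = sol.y

# At the infection peak, S/N should have dropped to gamma/beta = 0.1
print(S[np.argmax(I)] / N)
```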
<p><img src="/assets/img/covid19/01_1.png" /></p>
<p>Here I’ve set <span class="math inline">\(N=1\)</span> so the numbers
on the vertical axis represent fractions of the total population.
Initially, only a tiny fraction is infected while the remainder is
susceptible. The plot above shows the typical behavior of an epidemic:
an initial rapid exponential spread, until much of the population is
infected or recovered, at which point the number of infections begins to
decline.</p>
<p>According to our analysis above, the number of infections should
begin to decrease when the susceptible fraction <span
class="math inline">\(S/N\)</span> is equal to <span
class="math inline">\(\gamma/\beta\)</span>. Here I’ve taken <span
class="math inline">\(\beta=1\)</span> and <span
class="math inline">\(\gamma=1/10\)</span>, so the infection peak should
occur when the susceptible population has dropped to 1/10. Let’s
check:</p>
<p><img src="/assets/img/covid19/01_2.png" /></p>
<p>The results are in perfect agreement with what we predicted. In the
<a
href="https://github.com/ketch/covid-blog-posts/blob/master/01_SIR_Model.ipynb">original
notebook</a> there is an interactive model where you can adjust the
parameters and see the results.</p>
<p><a
href="https://mybinder.org/v2/gh/ketch/covid-blog-posts/master"><img
src="https://mybinder.org/badge_logo.svg" alt="Binder" /></a></p>
<p><a
href="http://www.davidketcheson.info/2020/03/19/SIR_Estimating_parameters.html">Click
here to go to the next post</a>, in which we look at real-world data to
estimate the values of <span class="math inline">\(\beta\)</span> and
<span class="math inline">\(\gamma\)</span>.</p>
How and why I'm teaching my kids to code2014-12-09T00:00:00+03:00h/2014/12/09/teaching_kids_to_program<p>I think that most of the world today drastically underestimates kids
– and by so doing, often harms them. Kids love learning, creating, and
achieving. We do them no service by providing everything for them, or by
“protecting” them from challenging tasks. This troubling trend is
manifest across the physical, social, and mental aspects of life. But
today I want to focus on one thing to which I think we should introduce
kids earlier and oftener: programming.</p>
<h1 id="why-programming">Why programming?</h1>
<p>In my children’s school (which I consider to be quite good), students
are introduced to computers early on. But this introduction focuses on
things like preparing a PowerPoint presentation, writing an essay in
Word, etc. Learning to use a computer by focusing on such canned
applications is a bit like learning to cook by mastering the operation
of a microwave. Yes, it will allow you to produce edible results –
assuming your local supermarket has a freezer section – but it hardly
acquaints you with the breadth of the culinary arts. With a microwave,
the only choice you really have in the process is how long to heat the
thing up – all other details have been determined for you, in advance,
by someone else. You cannot choose a recipe that fits your tastes or
dietary preferences, and you certainly can’t adapt a recipe or invent
something new.</p>
<p><strong><em>To really learn to cook, you have to start working with
the raw ingredients. To really learn to compute, you have to learn to
program.</em></strong></p>
<p>I think it is a shame to go through life without ever learning to
cook, since food is such a central part of human existence. But that
concern is mainly philosophical. Those of the coming generation who go
through life without ever learning to program will be, in a sense,
relegated to second-class status, unable to understand, control, or
create that which governs so much of life. Think of it: how much of your
time is spent interacting directly or indirectly with some electronic
device? For the computationally-illiterate, the landscape of daily life
is thus one of immovable, incomprehensible objects which they must adapt
to or work around. But for those who can program, these objects become
tools that are understood and can be modified to fit any desired
purpose.</p>
<h1 id="piquing-their-interest">Piquing their interest</h1>
<p>Small children are fascinated by whatever they see adults doing. All
three of my daughters (ages 9, 6, and 2) are interested in programming,
though I have never told them they should be. They became interested in
programming by seeing me at it. Of course, simply typing in a terminal
doesn’t really grab their attention. But they get curious when I’m
running wave simulations, and ask questions about the visualizations. It
was surprising to me to find that even quite abstract things can grab
their interest, if there is an interesting plot to go along with it. But
when I recently showed them simulations of water waves, their excitement
became palpable.</p>
<p>I was running some simple simulations of waves breaking on a beach,
for a talk I was to give in front of a general scientific audience. The
girls began asking what-if questions, and the experimental fun began. We
put a big wall on the beach and then tested how big the waves needed to
be before they would go over the wall. Then we added a big dip in front
of the wall. We tried starting the simulation with all the water flowing
in toward the shore. And so forth. They came up with the ideas, and I
would implement them. Importantly, the code that sets up the problem was
easy to change and run in just a matter of seconds. I think if they had
had to wait even a full minute to find out “what if”, I would have lost
them.</p>
<figure>
<img src="/assets/img/shallow_water_fun.jpg"
alt="Solving PDEs: fun for all ages" />
<figcaption aria-hidden="true">Solving PDEs: fun for all
ages</figcaption>
</figure>
<p>Did they learn how to program from this? No, none of them typed any
code, and I made only a minimal attempt to explain to them the code I
was changing. But they understood that by typing instructions, one can
make a computer do whatever one wants. They learned that computers can
be used to answer fun and interesting questions. And they got a little
exposure to some programming tools and concepts. Most importantly, they
<strong>want</strong> to understand how to make the computer do whatever
they can imagine.</p>
<h1 id="simple-programming-for-kids">Simple programming for kids</h1>
<p>There are a number of tools designed to give kids a “softer”
introduction to programming. Perhaps the best-known is MIT’s <a
href="">Scratch</a>. I guess the idea is that the connection between
typed instructions and computer output is too abstract. Also, young
children may still be developing reading, writing, and typing skills. So
the text editor is replaced by a GUI with cute animals and buttons that
add actions. This may be great for some kids, but again there is the
sense that one is only learning to microwave pre-cooked meals. My
experience in introducing my oldest daughter to programming (at 6) is
that she was much more excited by the blank slate of a Python
interpreter.</p>
<p>Of course, we didn’t jump into decorators and class inheritance.
There should be fast, fun feedback, especially at the start when the
learning curve is steepest. My daughter got a huge kick out of learning
she could make the computer talk (using the system command “say” on a
Mac). This was easily incorporated as part of programming some simple
games (like guessing a number or hangman). Those games naturally
introduce simple ideas like loops and if-statements. The goal is always
to create something fun or useful; the programming ideas are only
incidental. In my opinion, programming should work that way at all ages
and all levels.</p>
<p>Some of the things she has programmed so far include:</p>
<ul>
<li>A “guess the number” game (that tells you to guess higher/lower at
each iteration)</li>
<li>A game that asks simple math questions</li>
<li>Hangman (this one was easier than I expected, though she didn’t
implement any graphics)</li>
<li>A very simple adventure game (that lets you move around an imaginary
world)</li>
<li>A “countdown to Christmas” that announces the number of hours and
days left before Christmas. I helped her use crontab to make it run
every hour.</li>
</ul>
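<p>In case you want to try something similar with your own kids, here is
a minimal sketch of a number-guessing game like the one described above
(this is hypothetical code, not my daughter’s actual program; on
non-Mac systems the <code>say</code> fallback just prints instead of
speaking):</p>

```python
import random
import subprocess
import sys

def hint(secret, guess):
    """Tell the player which direction to guess next."""
    if guess < secret:
        return "higher"
    if guess > secret:
        return "lower"
    return "correct"

def say(text):
    """Speak aloud via the Mac 'say' command; elsewhere, just print."""
    if sys.platform == "darwin":
        subprocess.run(["say", text])
    else:
        print(text)

def play():
    """One interactive round; call play() to try it yourself."""
    secret = random.randint(1, 100)
    while True:
        guess = int(input("Guess a number from 1 to 100: "))
        result = hint(secret, guess)
        say(result)
        if result == "correct":
            break
```

<p>Even a game this small naturally introduces variables, loops,
if-statements, and functions – and the talking computer is what makes it
fun.</p>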
<h1 id="maintaining-interest">Maintaining interest</h1>
<p>My oldest daughter is 9 now, and while I haven’t taught her as much
programming as I’d like, she continues to be interested in and excited
about it. Let me mention a couple of things that I think have helped
maintain that excitement:</p>
<ul>
<li><strong>Ownership</strong>: when she is programming, the program is
hers. I don’t write code for her, although in some cases I have nearly
dictated bits and pieces before explaining them. I also don’t impose the
design or details of the end goals. I provide ideas and suggestions when
she is stuck, and I ask lots of probing questions.</li>
<li><strong>Freedom</strong>: I let her dictate the pace. We don’t have
a regularly-scheduled time for her to learn – we do it when the fancy
strikes. If she becomes frustrated or bored, we stop.</li>
<li><strong>Fun</strong>: Every project is something that she has
decided would be fun. Sometimes the ideas come from her, and sometimes
from me, but the decision to pursue one is hers.</li>
</ul>
<h1 id="difficulties">Difficulties</h1>
<p>Along the way, I’ve run into some challenges that I haven’t solved.
My daughter often comes up with projects that would be much too
complicated, especially where graphics are concerned. I try to find
something similar that would be simple enough. When working on a
project, she usually wants to do it in a way that is far from my
preconceived “optimal” implementation. I try to be patient and
hands-off, and to let her learn for herself from her attempts. She also
tends to get interested in some non-essential aspect of a project, which
may not involve much programming skill – like making the computer say a
lot of silly things when you guess wrong in the number-guessing game.
Again, I try to be enthusiastic and not to interfere. Most importantly,
our programming sessions are never too serious and are not a source of
tension.</p>
<p>The biggest challenge for me is that teaching programming can be
frustrating. It’s easy to forget how difficult the programming mindset
is. Teaching requires a lot of patience as the student grapples with
ideas that seem obvious to the teacher. It’s important that the teacher
not jump in and “fix” the student’s work – the grappling (however slow
and painful it may seem) is essential to learning. I try to stick to the
Socratic method – that is, I can only guide by asking questions. I also
find that my daughter benefits a lot (though she never wants to do it)
from “rubber ducking”, which means reading the code out loud.</p>
<h1 id="hour-of-code">Hour of Code</h1>
<p>This morning I participated in part of an amazing world-wide effort
to help kids learn to code. Last year it exposed 15 million kids to
programming. The idea is that each kid spend at least one hour learning
about programming. A number of teachers at the KAUST schools have chosen
to participate. If you want to start teaching your own kids to program –
or if you want to learn! – the <a href="http://hourofcode.com/us">Hour
of Code website</a> has modules for all levels. For instance, in my
daughter’s 3rd-grade class the kids worked through a set of lessons
using a graphical interface (moving code around with a mouse, rather
than typing) in order to make Elsa and Anna (from Frozen) ice skate in
snowflake-shaped patterns. The lessons are an amazingly well-designed
sequence that also teaches some geometry and is very appealing to kids.
My hat is off to the people behind Hour of Code and all the teachers who
use it to make programming part of their curriculum.</p>
<h1 id="conclusion">Conclusion</h1>
<p>Programming literacy presupposes literacy in reading and writing. For
future generations, these two types of literacy will be of similarly
profound importance. Computer programs run our world. If approached in
the right way, programming can be a fun and playful pastime that builds
creativity and reasoning skills while teaching kids to see the devices
that surround them as malleable tools rather than some kind of opaque
oracle.</p>
<p>One last note: you may be thinking, “but I only have boys. Can boys
learn to program too?” Sure they can, and you’d better teach them now or
they won’t have a chance against all the great women coders of the
future. ;-)</p>
Clawpack turns 202014-12-01T00:00:00+03:00h/2014/12/01/clawpack-20<p>Twenty years ago, version 1.0 of the Conservation LAWs PACKage
(CLAWPACK, now <a href="http://clawpack.org">Clawpack</a>) was first
released by <a href="http://faculty.washington.edu/rjl/">Randy
LeVeque</a>. It seems fitting to take the occasion to look back on the
intervening years. What follows are my thoughts on some of the great
things that have resulted.</p>
<p><img src="/assets/img/clawpack_bday.jpg" /></p>
<p>As far as I can tell, <a
href="http://www.netlib.org/na-digest-html/94/v94n44.html#5">this item
in the NA-Digest</a> is the first public announcement of its existence.
It was also announced more verbosely the same year in <a
href="http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=70F5140D445F7B89AF162827E10443A7?doi=10.1.1.48.3424&rep=rep1&type=pdf">this
conference paper</a>, from the proceedings of the 5th HYP conference.
Reading that conference paper now, I am struck by how it incorporated
many of the ideals of scientific software development that we now
discuss as if they were new ideas. For instance,</p>
<ul>
<li>Code that is <strong>easy to read and use</strong>, with plentiful
<strong>documentation and examples</strong>.</li>
<li><strong>Modular design</strong>, that allows low-level functions to
be reused and disparate parts of the code to be modified independently.
In the case of Clawpack, this is epitomized by the fact that it allows
the solution of <em>any</em> system of hyperbolic PDEs by changing just
a single routine (the Riemann solver).</li>
<li>An interface that allows methods and parameters to be changed
easily, so that <strong>different methods can be conveniently
compared</strong>.</li>
<li>Clawpack was proposed as a <strong>benchmark</strong> against which
to easily <strong>test new algorithms</strong>.</li>
<li>Clawpack was released <strong>open source</strong> and for
<strong>free</strong> on a public FTP server (netlib).</li>
</ul>
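<p>To see what “changing just a single routine” means in practice, here
is a toy sketch in Python/NumPy (this is <em>not</em> Clawpack’s actual
interface – the real Riemann solvers are Fortran routines that return
waves, speeds, and fluctuations – and the function names here are made
up): a generic first-order finite-volume step that knows nothing about
the physics, with all PDE-specific information supplied through a
Riemann-solver callback.</p>

```python
import numpy as np

def advection_rp(q_l, q_r, speed=1.0):
    """Riemann solver for scalar advection q_t + a*q_x = 0.

    Returns the upwind flux at the interface between states q_l and q_r
    (take the left state when the speed is positive).
    """
    return speed * np.where(speed > 0, q_l, q_r)

def step(q, dx, dt, riemann_solver):
    """One first-order Godunov step on a periodic grid.

    The PDE being solved enters only through riemann_solver.
    """
    # Flux f[i] at the interface between cell i and cell i+1.
    f = riemann_solver(q, np.roll(q, -1))
    # Conservative update: difference of fluxes on each cell's two edges.
    return q - dt / dx * (f - np.roll(f, 1))

# Advect a square pulse for one step.
x = np.linspace(0, 1, 100, endpoint=False)
q = np.where((x > 0.2) & (x < 0.4), 1.0, 0.0)
q_new = step(q, dx=0.01, dt=0.005, riemann_solver=advection_rp)
```

<p>Swapping <code>advection_rp</code> for a solver for, say, the shallow
water equations would reuse <code>step</code> unchanged – that
separation of the update algorithm from the physics is the heart of
Clawpack’s design.</p>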
<p>In this day of so much ado about credit for software, it’s also
interesting to view this paper as an early example of a mathematical
publication that is all about software.</p>
<p>Looking through the code snippets in the paper, I was astonished to
recognize how much of the original Fortran 77 code remains virtually
unaltered – including many variable names, function interfaces, and
overall design. This is a testament to the quality of the original code
design.</p>
<p>The central algorithms in Clawpack have also stood the test of time.
The 1980s saw the heyday of research into second-order TVD methods for
conservation laws, and Clawpack was released just as that era came to a
close. Since then, research has gone in other directions – high-order
methods, well-balancing, and positivity preservation, to name a few.
While these new directions have provided additions to Clawpack, the
“classic” algorithms have not changed and are still hard to beat as a
robust general-purpose tool.</p>
<p>Of course, much has happened in the intervening twenty years. The
original library handled 1- and 2-dimensional problems on regular
cartesian grids. In the next few years, subsequent versions added
algorithms for <a
href="http://faculty.washington.edu/rjl/pubs/wp3d/index.html">3D</a>,
mapped grids, and <a
href="http://faculty.washington.edu/rjl/pubs/amrclaw/index.html">adaptively
refined meshes</a>.</p>
<p>Additional algorithmic innovations are too numerous to try to list,
but one that has had a lot of impact is <a
href="http://epubs.siam.org/doi/abs/10.1137/S106482750139738X">the
f-wave technique</a>.</p>
<p>The problems to which Clawpack has been applied are certainly
<strong>much</strong> too numerous to list. But you can start to get an
idea by looking at citations of major Clawpack papers like <a
href="http://scholar.google.com/scholar?cites=17725852758276924945&as_sdt=2005&sciodt=0,5&hl=en">this</a>,
<a
href="http://scholar.google.com/scholar?cites=7413565796955546825&as_sdt=2005&sciodt=0,5&hl=en">this</a>,
<a
href="http://scholar.google.com/scholar?cites=13042879341964900053&as_sdt=2005&sciodt=0,5&hl=en">this</a>,
<a
href="http://scholar.google.com/scholar?cites=13610946858631858859&as_sdt=2005&sciodt=0,5&hl=en">this</a>,
and <a
href="http://scholar.google.com/scholar?cites=7380133178147045066&as_sdt=2005&sciodt=0,5&hl=en">this</a>.
Perhaps the heaviest use in recent years has involved geophysical flows
such as tsunamis and storm surges, in GeoClaw.</p>
<h2 id="the-clawpack-family-of-codes">The Clawpack family of codes</h2>
<p>Clawpack has spawned numerous offshoots and extensions, including
(but not limited to) <a
href="http://www.clawpack.org/amrclaw.html">AMRClaw</a>, <a
href="http://mitran-lab.amath.unc.edu:8084/redmine/projects/bearclaw/wiki">BearClaw</a>,
<a href="http://cedb.asce.org/cgi/WWWdisplay.cgi?121287">ZPLClaw</a>, <a
href="http://www.clawpack.org/doc/pyclaw/solvers.html?highlight=sharpclaw#pyclaw.sharpclaw.solver.SharpClawSolver">SharpClaw</a>,
<a
href="https://scholarworks.aub.edu.lb/handle/10938/9322">CUDAClaw</a>,
<a
href="http://math.boisestate.edu/~calhoun/ForestClaw/">ForestClaw</a>,
<a href="https://github.com/manyclaw/manyclaw">ManyClaw</a>, <a
href="http://www.clawpack.org/pyclaw/index.html">PyClaw</a>, and <a
href="http://www.clawpack.org/geoclaw.html">GeoClaw</a>. Some of these
have become part of the Clawpack-5 suite while others have forked and
gone in other directions.</p>
<p>Nowadays, the term Clawpack refers to a collection of interrelated
packages that are maintained and developed at <a
href="http://github.com/clawpack">github.com/clawpack</a>. They
include:</p>
<ul>
<li>The original (“Classic”) Clawpack;</li>
<li>AMRClaw (with adaptive mesh refinement)</li>
<li>GeoClaw (with special tools for geophysical flows)</li>
<li>PyClaw (a Python interface to both the classic code and the
high-order “SharpClaw” algorithms)</li>
<li>Riemann (a library of approximate Riemann solvers, which can be used
with all of the above codes)</li>
<li>VisClaw (an in-house visualization tool)</li>
</ul>
<p>The Github organization also includes repositories for the docs and
for contributed applications.</p>
<h2 id="the-clawpack-community">The Clawpack community</h2>
<p>As far as I know, the original release was a one-person effort. But
like most open-source projects, Clawpack quickly became a broader
collaboration. I won’t attempt to credit everyone here; you can see some
of the major contributors <a
href="http://www.clawpack.org/about.html#authors">here</a>, and many
more by looking at the contributors pages on Github.</p>
<p><img src="/assets/img/HPC3-2012-group-photo.jpg" /></p>
<p>I was surprised to realize that I’ve now been involved with Clawpack
for half of its existence – ten years! During those years I’ve gotten to
work with a group of exceptional researchers who are also just
outstanding people. They say that the culture of an open-source software
community is shaped strongly by its founder, and I think Clawpack is no
exception.<br />
It seems to me that its creator is not only a great applied
mathematician, but also someone who consistently leads the way in terms
of improving the way we do science. Clawpack exemplifies his commitment
to reproducibility and sustainable scientific software development, long
before those words came into scientific vogue. He was an advocate for
publishing in journals with low subscription prices, long before open
access became a movement.<br />
Most significantly, he has always been interested first in finding and
solving interesting problems, and only secondarily in publishing papers.
Both through his personal influence and as chair of the SIAM Journals
Committee, he has been influential in making progress in these
directions, including the establishment of a Software section in SISC,
the acceptance by SIAM journals of supplementary materials (including
code), and a new policy allowing authors to post published articles on
their own or institutional websites.</p>
<p><img src="/assets/img/hpc3_attendees.jpg" /></p>
<p>As a result, the culture surrounding Clawpack has always encouraged
openness and a willingness to accept new contributions. Furthermore, I
think that the Clawpack developers have maintained a healthy skepticism
toward our own algorithms and code. Although we try to make our code
useful to as many people as possible, there has never been any attempt
to evangelize the community in order to increase use of a particular set
of algorithms or to increase metrics like citation counts. Because of
this attitude, the code is continually improved through incorporation of
new algorithmic innovations.</p>
<h2 id="lessons-learned">Lessons learned</h2>
<p>Of course, it would be wrong to say that Clawpack has been a perfect
model for scientific code development. There are plenty of things we’ve
done wrong or could learn to improve.</p>
<p>The original announcement says that “contributions from other users
will be gratefully accepted,” and that has always been true.
Nevertheless, the widely accepted development model for a long time was
that most users would take the code, fork it, and make their own
enhancements that would never get back to the main codebase. While this
prevented feature bloat, it also meant that a great wealth of knowledge
– largely in the form of sophisticated approximate Riemann solvers –
would perish on some dusty hard drive rather than benefiting the larger
community. We’re trying to change that now by encouraging users to
submit pull requests for Riemann solvers and for entire
applications.</p>
<p>Another example of where I see room for improvement is in output and
visualization, where we have, to some degree, reinvented the wheel.
Clawpack has long used custom ASCII and binary file formats that can
only be read in by Clawpack (or by reverse-engineering code for the
relatively simple formats). We are now pushing to move to a more
standard default format (probably HDF5), which would allow easier
integration with standard visualization and post-processing
libraries.</p>
<p>On the visualization side, the Clawpack developers have created some
extremely useful tools for plotting time-dependent data on structured
grids (including block-structured AMR). These tools sit on top of MATLAB
and matplotlib. A large amount of work has gone into these “in-house”
tools rather than into leveraging and contributing to dedicated
visualization tools. Meanwhile, individual users have occasionally
connected Clawpack to powerful visualization tools, but their custom
code never got back to the main codebase. The limited capabilities of
matplotlib in 3D seem to finally be providing sufficient impetus to
force us to integrate with a sophisticated visualization library. I have
been working lately on integration with <a
href="http://yt-project.org">yt</a>.</p>
<h2 id="the-next-20-years">The next 20 years?</h2>
<p>It may come as a surprise for a code that’s so long in the tooth, but
I think Clawpack development at present is more vibrant than ever. Since
2011, we’ve held annual developer workshops, the latest of which took
place last week here at KAUST. The pictures on this page are from those
workshops (the cake in the first photo, which shows the Clawpack logo,
was made by my wife, and is a fondant version of a fluid-dynamical
shockwave hitting a low-density bubble).</p>
<p>As for the future, I won’t claim enough clairvoyance to see 20 years
ahead. But here are some things I hope we can accomplish in the next few
years:</p>
<ul>
<li><strong>Massively parallel adaptive mesh refinement</strong> (at
present, it exists only in the unreleased ForestClaw code; a concurrent
effort aims to bring this to PyClaw through BoxLib);</li>
<li>An ever-growing <strong>library of Riemann solvers</strong> for
increasingly complex systems;</li>
<li><strong>Code-generation</strong> for solving problems where a custom
Riemann solver is not yet available;</li>
<li>Incorporation of code that runs on <strong>accelerators</strong>
(like CUDAClaw and ManyClaw) into the main library in a way that allows
users to change hardware seamlessly;</li>
<li>More <strong>teaching tools</strong> based on Clawpack and IPython
notebooks, including a book showcasing Riemann solutions of important
physical systems;</li>
<li><strong>Additional algorithms</strong> (such as Discontinuous
Galerkin methods and new time stepping techniques) that can be accessed
through the same problem setup and use the same Riemann solvers;</li>
<li>Better <strong>interoperability</strong> between Clawpack and other
codes (such as <a href="http://proteus.usace.army.mil/">Proteus</a>), by
making Clawpack more of a true library.</li>
</ul>
<p>Are you excited yet? I certainly am. Come join the fun!</p>
<p><img src="/assets/img/clawlogo.jpg" /></p>
Teaching in the open2014-07-18T00:00:00+03:00h/2014/07/18/teaching_in_the_open<p>If you examine the menu bar above, you’ll notice that my site has a
new top-level page: <a href="/teaching.html">Teaching</a>. This is a
direct result of my attending <a
href="http://www.youtube.com/watch?v=1e26rp6qPbA&t=26m12s">Greg
Wilson’s inspiring keynote at Scipy 2014</a>. That link will take you to
the key (for me) part of the talk, but I recommend watching the whole
thing. His message is: massive collaboration is the real revolution.
Michael Nielsen made the same statement in <em>Reinventing
Discovery</em>; here Greg applies this statement to university
education, and asks:</p>
<blockquote>
<p>Why don’t instructors open-source their teaching materials?</p>
</blockquote>
<p>This new page is my own effort to enable that revolution. In fact,
I’ve been gradually putting my teaching materials online for the past
couple of years, without giving it much thought. The teaching page
collects all the resources I’ve made available in one place.</p>
<p>Something even more exciting in this vein is coming in the fall. If
you want to know a little about it, watch <a
href="http://www.youtube.com/watch?v=TWxwKDT88GU&t=56m2s">the last
few minutes of Lorena Barba’s excellent Scipy keynote on computational
thinking and teaching</a>.</p>
<p>Stay tuned.</p>
KAUST goes open access2014-07-01T00:00:00+03:00h/2014/07/01/KAUST_goes_open_access<p>I’m proud to announce that as of today, KAUST has officially adopted
an open access policy!</p>
<h1 id="what-it-means">What it means</h1>
<p>Institutional open access (OA) policies are a primary tool in the
effort to allow academics to retain control of their own work. MIT and
Harvard were the first to adopt such policies; now hundreds of
institutions have similar policies.</p>
<p>In short, <strong>the policy ensures that KAUST has non-exclusive
rights to distribute all research done at KAUST. This right precedes any
publishing or copyright agreement terms</strong>. It also places a
responsibility on KAUST faculty to provide a pre-print of each paper to
the library.</p>
<p>The policy has nothing to do with publishing in open access journals
(so-called Gold OA). Authors continue to publish in the same manner –
and the same journals – as before.</p>
<p>KAUST’s OA policy is based closely on the text <a
href="http://cyber.law.harvard.edu/hoap/Good_practices_for_university_open-access_policies">recommended
by the Harvard Open Access Project (HOAP)</a>. HOAP was an extremely
valuable resource for us in developing a policy and convincing the
faculty, administration, and legal team to approve it.</p>
<h1 id="how-it-happened">How it happened</h1>
<p>This is the culmination of a process that started back in 2011 with a
lunch conversation between Rick Johnson (KAUST librarian and long-time
OA advocate) and myself. We were both frustrated that KAUST theses were
being “published” in a way that was inaccessible to anyone outside the
University. Over the next several months, we successfully worked to
ensure that all KAUST theses would be accessible for free to the general
public. In fact, the first thesis to be published openly (and for months
the only one) was <a
href="http://archive.kaust.edu.sa/kaust/handle/10754/209415">that of my
MS student, Manuel Quezada de Luna</a>.<br />
Now anyone can read the currently 367 completed KAUST theses <a
href="http://archive.kaust.edu.sa/kaust/handle/10754/124545/">here</a>.</p>
<p>We decided the next order of business was a full institutional
open-access policy. With strong support from Jim Calvin (VP for academic
affairs) and my faculty colleagues Suzana Nunes and Sahraoui Chaieb, we
eventually hammered out something that all could agree on (even the
lawyers!). The policy was championed by our new library director, Molly
Tamarkin, as soon as she arrived at KAUST earlier this year.</p>
<h1 id="the-policy">The policy</h1>
<p>Here’s the full text of the policy, which at the moment is only
available on an internal site. I’ll post a link here when a public
announcement is made.</p>
<blockquote>
<p>University faculty members, research scientists, post-doctoral
fellows, students and employees (“University Research Authors”) grant to
the University non-exclusive permission to make available their
scholarly research articles and to exercise the copyright in those
articles for the purpose of open dissemination.</p>
</blockquote>
<blockquote>
<p>More specifically, each University Research Author grants to the
University a non-exclusive, irrevocable, worldwide license to exercise
any and all rights under copyright relating to each of his or her
scholarly research articles, in any medium, provided that the articles
are not sold for a profit, and to authorize others to do the same.</p>
</blockquote>
<blockquote>
<p>The Office of the Vice President for Academic Affairs or its
designate may waive application of the license for a particular article
or delay access for a specified period of time upon express direction by
the author.</p>
</blockquote>
<blockquote>
<p>Each faculty member or researcher will provide an electronic copy of
the author’s final version of each article no later than the date of its
publication at no charge in accordance with the guidelines published
from time to time by the Office of the Vice President for Academic
Affairs.</p>
</blockquote>
<blockquote>
<p>The Office of the Vice President for Academic Affairs charges the
KAUST Library to develop and monitor a plan to comply with this policy
and existing copyright obligations in a manner as convenient for the
faculty as possible.</p>
</blockquote>
<blockquote>
<p>The Office of the Vice President for Academic Affairs or its delegate
will be responsible for interpreting this policy, resolving disputes
concerning its interpretation and application, and recommending changes
to the Academic Council from time to time.</p>
</blockquote>
<blockquote>
<p>The KAUST Library will review this policy after three years.</p>
</blockquote>
Teaching with SageMathCloud2014-05-31T00:00:00+03:00h/2014/05/31/teaching_with_SMC<p>During the past Spring semester at KAUST, I again taught AMCS 252,
our masters-level course on numerical analysis for differential
equations. I’ve been teaching the course using Python for 5 years now.
This year, for the first time, <em>I didn’t spend any time helping
students install Python, numpy, matplotlib, or scipy</em>. In fact, I
even had them use Clawpack – and they didn’t need to install it. Why?
Because they all used <a
href="http://cloud.sagemath.com">SageMathCloud</a> for the course.</p>
<h2 id="a-little-history">A little history</h2>
<p>For the past several years, I have been increasingly integrating into
the course <a href="https://github.com/ketch/AMCS252">a set of
electronic notebooks</a> in which the students are presented with some
explanations and code, followed by exercises that involve modifying,
running, and understanding the numerical algorithms implemented in the
notebook. At first these were a set of Sage worksheets, and I ran a
local Sage server within the KAUST network. When the VM that held the
server died a horrible and irreversible death, I decided to switch to
the IPython notebook format that had become increasingly popular. It
wasn’t too hard to <a
href="http://www.davidketcheson.info/2013/01/16/sage_to_ipython.html">convert
all my Sage worksheets to IPython notebooks</a>. But my students had to
either do all their work in the computer lab or figure out how to
install the necessary Python packages on their own machines. This was a
bit of a time sink for me, although it has gotten easier each year
thanks to packages like <a
href="https://store.continuum.io/cshop/anaconda/">Anaconda</a> and <a
href="https://www.enthought.com/products/canopy/">Canopy</a>. This also
meant that they all ended up working in slightly different environments,
which occasionally caused problems.</p>
<h2 id="ipython-notebooks-in-the-cloud">IPython notebooks in the
cloud</h2>
<p>In the last year, two new cloud services emerged, both offering free
accounts with the ability to run IPython notebooks:</p>
<ul>
<li><a href="http://wakari.io">Wakari</a></li>
<li><a href="http://cloud.sagemath.com">Sage Math Cloud</a></li>
</ul>
<p>I realized that by using one of these services, I could avoid dealing
with installation issues and ensure that everyone worked in an identical
environment. Though I have found both Wakari and SMC to be useful, I
ended up going with SMC for the course because it has, in my opinion, a
more intuitive user interface.</p>
<h2 id="getting-started">Getting started</h2>
<p>On the first day of class, students had only to create a free SMC
account, create a new project, and type the URL of the course github
repo into the “new file” box, which automatically caused it to be cloned
into their SMC project. As I updated materials during the semester, all
they had to do was open a SMC terminal and type “git pull” (in fact,
none of the students had ever used git before, but none of them had any
difficulty with this during the course).</p>
<p><img src="https://cloud.sagemath.com/art/templates.png" alt="Git clone via SMC" height="200" align="center"></p>
<p>Another great advantage of using a cloud service was that students
could work or show their work from any computer. Since it was a small
class, I had them present homework solutions in-class. They could all
present solutions using the computer attached to the projector in the
room by just logging into their own SMC account. That meant we avoided
losing 5 or 10 minutes of class time in order to switch cables or
transfer files.</p>
<h2 id="feedback">Feedback</h2>
<p>Overall, the students’ feedback was very positive. Most notably,
although some of them did eventually install Python and the related
packages locally on their laptops, they all chose to use SMC for their
homework assignments throughout the course. There were some noticeable
latency issues (the ping time between Saudi Arabia and Seattle is
200ms), and SMC currently has a 10-20 second delay the first time you
open an IPython notebook (there’s no such delay for Sage worksheets).
But those were not showstoppers, and I think by the time I teach my next
course those issues will be resolved (by an IPython upgrade on SMC and
by the launch of a European SMC server, respectively). William Stein,
creator of SMC (and Sage) was extremely responsive and helpful (in fact,
he created a trial European server recently in response to my and
others’ comments about latency).</p>
<p><img src="https://dl.dropboxusercontent.com/u/656693/smc_screenshot.png" alt="SMC" align="center"></p>
<p>I used SMC again to <a
href="https://github.com/ketch/HyperPython/blob/master/README.md">teach
a 1-day tutorial</a> at <a href="http://jkk.sze.hu/en_GB/program">a
workshop</a> this month. Other than a couple of minor hiccups, it again
worked very well. I plan to continue using it for teaching in the
future. One feature I haven’t used yet (but intend to) is the ability to
“collaborate” on a project so that multiple users can edit it at the
same time. I understand that <a
href="http://sagemath.blogspot.com/2014/04/the-sagemathcloud-roadmap.html">many
other great features are in the works</a>.</p>
<p>I would strongly recommend SMC to other teachers of
computationally-oriented courses, even if you’re not using IPython
notebooks or Sage worksheets. As long as all the software for your
course is freely available, you can install it all on SMC so that
students have identical environments, accessible from anything with a
web browser, with no need to do any installation of their own.</p>
<p>If you’re interested in my notebooks, you can find them here:</p>
<ul>
<li><a href="https://github.com/ketch/finite-difference-course">Spring
2013 course</a></li>
<li><a href="https://github.com/ketch/AMCS252">Spring 2014
course</a></li>
<li><a href="https://github.com/ketch/HyperPython">HyperPython
tutorial</a></li>
</ul>
<p>Just be warned that some are more polished than others, and they’re
likely to all get a makeover soon.</p>
<p>Now that I keep a lot of my <a
href="https://github.com/ketch/shallow_water_periodic_bathymetry/blob/master/pyclaw/shallow_water_diffraction.ipynb">research
in IPython notebooks on Github</a>, I’m also thinking that SMC is a way
to be able to show that research to anyone, anywhere. Heck, I can create
a project, clone a Github repo, and run PyClaw in a notebook <strong>on
my phone!</strong> Just amazing.</p>
HyperPython2014-05-28T00:00:00+03:00h/2014/05/28/hyperpython<p><img src="https://raw.githubusercontent.com/ketch/HyperPython/master/figures/finite_volume.png" alt="Finite volumes" height="200" align="center"></p>
<p>Last week, I ran a 1-day tutorial at the <a
href="http://jkk.sze.hu/en_GB/program">Workshop on Design, Simulation,
Optimization and Control of Green Vehicles and Transportation</a>. The
idea was to teach attendees about Python programming, basic theory of
hyperbolic conservation laws, finite volume methods, and how to use <a
href="http://clawpack.github.io/doc/pyclaw/">PyClaw</a>, all in the
space of a few hours.</p>
<p>Inspired by Lorena Barba’s recent release of <a
href="http://lorenabarba.com/blog/announcing-aeropython/">AeroPython</a>,
I decided to develop a short set of IPython notebooks for the tutorial.
The result is <a
href="https://github.com/ketch/HyperPython">HyperPython</a>, a set of 5
lessons (plus Python crash course):</p>
<ul>
<li><a
href="http://nbviewer.ipython.org/github/ketch/HyperPython/blob/master/Lesson_00_Python.ipynb">Lesson
0: Python</a></li>
<li><a
href="http://nbviewer.ipython.org/github/ketch/HyperPython/blob/master/Lesson_01_Advection.ipynb">Lesson
1: Advection</a></li>
<li><a
href="http://nbviewer.ipython.org/github/ketch/HyperPython/blob/master/Lesson_02_Traffic.ipynb">Lesson
2: Traffic</a></li>
<li><a
href="http://nbviewer.ipython.org/github/ketch/HyperPython/blob/master/Lesson_03_High-resolution_methods.ipynb">Lesson
3: High-resolution methods</a></li>
<li><a
href="http://nbviewer.ipython.org/github/ketch/HyperPython/blob/master/Lesson_04_Fluid_dynamics.ipynb">Lesson
4: Fluid dynamics</a></li>
<li><a
href="http://nbviewer.ipython.org/github/ketch/HyperPython/blob/master/Lesson_05_PyClaw.ipynb">Lesson
5: PyClaw</a></li>
</ul>
<p>These won’t make you an expert, but if you’re looking for something
short, practical, and fun, please give them a try. You may also find the
last two notebooks useful if you’re looking for a good introduction to
PyClaw.</p>
<p>These may be greatly expanded in the future into a full-fledged
semester-length course.</p>
Open access is about open access, not journals2013-12-13T00:00:00+03:00h/2013/12/13/open_access_means_open_access<p>In October, <a
href="http://news.sciencemag.org/sites/default/files/media/Open%20Access%20SurveySummary_11082013_0.pdf">Science
Magazine conducted a survey regarding open access</a>. Among the
questions:</p>
<ul>
<li><em>How important is it for scientific papers to be freely
accessible to the public?</em></li>
<li><em>Of the papers that you published in the last 3 years, what
percentage did you submit to fully open access journals?</em></li>
</ul>
<p><strong>72%</strong> replied “extremely important” to the first
question, while only <strong>58%</strong> indicated they had submitted
any paper to an open access journal. Does this mean that scientists are
not acting in agreement with their own principles?</p>
<p><strong>No!</strong></p>
<p>It may shock the editors of Science, but the open access movement is
not about changing the funding model for academic publishers.</p>
<p><strong>Open access means that research results can be read by
anyone, for free.</strong></p>
<p>Scientists can accomplish that without any help from publishers. The
fact is that most scientists don’t view <em>open access journals</em> as
the best way to make their work accessible. Another question from the
Science survey asked</p>
<ul>
<li><em>Which options for making papers freely available do you
prefer?</em></li>
</ul>
<p>The most common answer (66%) was <strong>“Immediate access through a
repository, such as PubMedCentral or Arxiv, or on an author’s web
site”</strong>.</p>
<p>This is quick and painless. <a
href="http://www.sherpa.ac.uk/romeo/statistics.php?la=en&fIDnum=%7C&mode=simple">It
is allowed by an overwhelming majority of publishers</a>. It requires no
mandates from governments or universities. It requires no extra funding.
Anyone can do it, and every scientist who cares a whit about open access
already has done it.</p>
<p>If someone tells you that we need governments or publishers to
intervene to make open access possible, you can be sure that his agenda
is something other than open access. The only obstacle left is our own
apathy.</p>
A Tale of Two Theorems2013-10-14T00:00:00+03:00h/2013/10/14/CFL-disk<p>In their <a href="http://dx.doi.org/10.1147/rd.112.0215">celebrated
1928 paper</a>, Courant, Friedrichs, and Lewy proved a geometric
condition that must be satisfied by any convergent discretization of a
<strong>partial</strong> differential equation – the
famous CFL condition. Briefly, the CFL theorem says that the numerical
method must transport information at least as quickly as information
travels in the true PDE solution. The proof is geometric and is conveyed
through numerous diagrams.</p>
<p>Exactly fifty years later, in 1978, Rolf Jeltsch and Olavi Nevanlinna
<a href="http://dx.doi.org/10.1007/BF01932030">published a theorem</a>
[JN] that deals with bounding the modulus of a polynomial <span
class="math inline">\(\psi(z)\)</span> over a disk of the form <span
class="math display">\[D_r = \{z \in \mathbb{C} : |z+r|\le
r\}.\]</span> Their theorem says that if <span
class="math inline">\(\psi(z) = 1 + z + a_2 z^2 + \cdots + a_s
z^s\)</span> and <span class="math inline">\(|\psi(z)|\le 1\)</span> for
all <span class="math inline">\(z\)</span> in such a disk <span
class="math inline">\(D_r\)</span>, then the disk radius <span
class="math inline">\(r\)</span> is at most <span
class="math inline">\(s\)</span>. The proof of this result is, of
course, purely algebraic.</p>
<p>These results apparently have nothing to do with one another. And yet
it turns out that <strong>they are equivalent statements!</strong> That
is, the CFL theorem can be proved using the JN disk theorem. And the JN
disk theorem can be proved using the CFL condition (and no algebraic
techniques). This was explained in <a
href="http://dx.doi.org/10.1007/BF01389633">a beautiful paper of
Sanz-Serna and Spijker</a> [SS] in 1986, and the result deserves to be
much more well known.</p>
<h3 id="first-order-upwinding">First order upwinding</h3>
<p>Consider the problem of approximating the value <span
class="math inline">\(u(x_i,t_n)\)</span> for the advection equation
<span class="math display">\[u_t + u_x = 0.\]</span> The exact solution
can be obtained by characteristics from the previous time level: <span
class="math display">\[u(x_i,t_n) = u(x_i-k,t_{n-1}),\]</span> where
<span class="math inline">\(k\)</span> is the time step size. The CFL
theorem says that the stencil used for approximating <span
class="math inline">\(u(x_i,t_n)\)</span> must enclose the point <span
class="math inline">\(x_i-k\)</span>.</p>
<p>Let’s discretize the advection equation in space using upwind
differences: <span class="math display">\[U_i'(t) =
-\left(U_i-U_{i-1}\right).\]</span> Here for simplicity we’ve assumed a
spatial mesh width of 1. Taking periodic boundary conditions, this
semi-discretization is a system of ODEs of the form <span
class="math inline">\(U'=LU\)</span> where <span
class="math inline">\(L\)</span> is the circulant matrix <span
class="math display">\[
\begin{pmatrix}
-1 & & & 1 \\
1 & -1 & & \\
& \ddots & \ddots \\
& & 1 & -1 \\
\end{pmatrix}\]</span> (as usual, all the omitted entries are zero). The
eigenvalues of this matrix all lie on the boundary of the disk of radius
one centered at <span class="math inline">\(z=-1\)</span>, which we
denote by <span class="math inline">\(D_1\)</span>. Here are the
eigenvalues of a 50-point discretization:</p>
<p><img
src="https://dl.dropboxusercontent.com/u/656693/wiki_images/disk_eigen.png" /></p>
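<p>These eigenvalues are easy to verify numerically. Here is a short
Python/NumPy check (an illustrative sketch, not code from the original
computation) that builds the 50-point upwind matrix and confirms that
every eigenvalue lies on the boundary of <span
class="math inline">\(D_1\)</span>:</p>
<pre><code>import numpy as np

# Upwind semi-discretization U' = L U with periodic BCs and mesh width 1:
# L is circulant with -1 on the diagonal and 1 on the subdiagonal.
N = 50
L = -np.eye(N) + np.roll(np.eye(N), 1, axis=0)
lam = np.linalg.eigvals(L)

# Every eigenvalue satisfies |lambda + 1| = 1, i.e. it lies on the
# boundary of the disk D_1 centered at z = -1.
print(np.allclose(np.abs(lam + 1), 1.0))</code></pre>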
<p>If we discretize in time with Euler’s method, we get the scheme <span
class="math display">\[U^n_i = U^{n-1}_i -
k\left(U_i-U_{i-1}\right).\]</span> This scheme computes the solution at
<span class="math inline">\((x_i,t_n)\)</span> using values at <span
class="math inline">\((x_{i-1},t_{n-1})\)</span> and <span
class="math inline">\((x_i,t_{n-1})\)</span>, so the CFL theorem says it
can be convergent only if <span class="math inline">\(x_i-k\)</span>
lies in the interval <span class="math inline">\((x_{i-1},x_i)\)</span>.
Since <span class="math inline">\(x_{i-1} = x_i - 1\)</span>, this holds
iff the step size <span class="math inline">\(k\)</span> is smaller than
1.</p>
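<p>This threshold is easy to observe in practice. The following Python
experiment (a sketch for illustration; the number of steps and the use of
random initial data are arbitrary choices) applies the upwind/Euler
scheme repeatedly for step sizes on either side of <span
class="math inline">\(k=1\)</span>:</p>
<pre><code>import numpy as np

def upwind_euler_growth(k, N=50, steps=200, seed=0):
    """Max-norm of the solution of the periodic upwind/Euler scheme
    U^n_i = U^{n-1}_i - k (U^{n-1}_i - U^{n-1}_{i-1})
    after the given number of steps, from random initial data."""
    L = -np.eye(N) + np.roll(np.eye(N), 1, axis=0)
    A = np.eye(N) + k * L              # one Euler step
    U = np.random.default_rng(seed).random(N)
    for _ in range(steps):
        U = A @ U
    return np.max(np.abs(U))

print(upwind_euler_growth(0.9))   # bounded: k below the CFL limit
print(upwind_euler_growth(1.1))   # explodes: k above the CFL limit</code></pre>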
<p>This result – that the first-order upwind method is stable and
convergent only for CFL number at most one – is well known, and can also
be derived using basic method of lines stability theory. The stability
function for Euler’s method is <span class="math inline">\(\psi(z) = 1 +
z\)</span>, so it is stable only if <span
class="math inline">\(z=k\lambda\)</span> lies in the disk <span
class="math inline">\(\{z : |1+z|\le 1\} = D_1\)</span> for each
eigenvalue <span class="math inline">\(\lambda\)</span> of <span
class="math inline">\(L\)</span>. What we have seen in the foregoing is
that this stability condition can be derived directly from the CFL
condition, without considering the eigenvalues of <span
class="math inline">\(L\)</span> or the stability region of Euler’s
method.</p>
<h3 id="proving-the-jn-disk-theorem-via-the-cfl-theorem">Proving the JN
disk theorem via the CFL theorem</h3>
<p>For higher order discretizations, the CFL condition is necessary but
not generally sufficient for stability. Nevertheless, we can use it to
derive the JN disk theorem. I’ll restrict the explanation here to
Runge-Kutta methods, but the extension to multistep methods is very
simple. Suppose that we discretize in time using a Runge-Kutta method
with <span class="math inline">\(s\)</span> stages. In each stage, one
point further to the left is used, so typically the stencil for
computing <span class="math inline">\(u(x_i,t_n)\)</span> includes the
values from the previous step at <span class="math inline">\(x_{i-s},
x_{i-s+1}, \dots, x_i\)</span>. Thus the CFL theorem says the method
cannot be convergent unless <span class="math inline">\(x_i-k\)</span>
lies in the interval <span class="math inline">\((x_{i-s},x_i)\)</span>;
i.e., unless <span class="math inline">\(k\le s\)</span>. Meanwhile, the
stability function <span class="math inline">\(\psi(z)\)</span> of the
Runge-Kutta method is a polynomial of degree at most <span
class="math inline">\(s\)</span>. Method of lines analysis tells us that
the full discretization is stable if <span
class="math inline">\(kD_1\)</span> lies inside the region <span
class="math inline">\(\{z : |\psi(z)|\le 1\}.\)</span> Since we know it
is unstable for <span class="math inline">\(k>s\)</span>, this
implies that if <span class="math inline">\(|\psi(z)|\le 1\)</span> over
the disk <span class="math inline">\(D_k\)</span>, then <span
class="math inline">\(k \le s\)</span>.</p>
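<p>The bound is sharp: the degree-<span class="math inline">\(s\)</span>
polynomial <span class="math inline">\(\psi(z) = (1+z/s)^s\)</span>
(a standard example, not taken from [JN]) has the required form <span
class="math inline">\(1 + z + \cdots\)</span> and satisfies <span
class="math inline">\(|\psi(z)| \le 1\)</span> on all of <span
class="math inline">\(D_s\)</span>. A quick numerical check in
Python:</p>
<pre><code>import numpy as np

s = 4
theta = np.linspace(0, 2 * np.pi, 1000)
z = -s + s * np.exp(1j * theta)    # boundary of D_s: |z + s| = s

# psi(z) = (1 + z/s)^s = 1 + z + ... has degree s, and |psi| = 1
# on the whole boundary of D_s, so the JN bound r <= s is attained.
psi = (1 + z / s) ** s
print(np.allclose(np.abs(psi), 1.0))</code></pre>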
<h3 id="recap">Recap</h3>
<ol type="1">
<li>An <span class="math inline">\(s\)</span>-stage upwind
discretization has stencil width <span
class="math inline">\(s\)</span>.</li>
<li>The CFL condition implies that this discretization cannot be
convergent for Courant numbers larger than <span
class="math inline">\(s\)</span>.</li>
<li>The spectrum of the semi-discretization is the boundary of the disk
<span class="math inline">\(D_1\)</span>.</li>
<li>Stability analysis implies that the full discretization is
convergent if the scaled spectrum <span class="math inline">\(kD_1 =
D_k\)</span> lies inside the stability region of the time
discretization.</li>
<li>Thus no <span class="math inline">\(s\)</span>-stage time
discretization can have a stability region containing a disk larger
than <span class="math inline">\(D_s\)</span> (this is the content of
the JN disk theorem).</li>
</ol>
<h3 id="ellipses">Ellipses</h3>
<p>Of course, we didn’t have to choose first-order upwinding in space;
we could have taken any spatial discretization. For instance, if we use
centered differences: <span class="math display">\[U_i'(t) =
-\tfrac{1}{2}\left(U_{i+1}-U_{i-1}\right)\]</span> then the spectrum of the
semi-discretization lies on the imaginary axis in the interval <span
class="math inline">\([-i,i]\)</span>. The same line of reasoning
then tells us that the largest imaginary-axis interval of stability for
an <span class="math inline">\(s\)</span>-stage method is <span
class="math inline">\([-is,is]\)</span>. By considering convex
combinations of upwind and centered differences, we get similar results
for a family of ellipses; this is the content of Theorem 5 of [SS].</p>
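<p>The imaginary-axis claim is also easy to check numerically. In the
sketch below (my own illustration), the centered difference carries the
standard factor of 1/2, which is what places the spectrum in <span
class="math inline">\([-i,i]\)</span>; the matrix is skew-symmetric, so
its eigenvalues are purely imaginary:</p>
<pre><code>import numpy as np

# Centered differences, periodic BCs, mesh width 1:
# U_i' = -(U_{i+1} - U_{i-1})/2, i.e. U' = L U with L = (E - E^T)/2,
# where E is the periodic shift (E U)_i = U_{i-1}.
N = 50
E = np.roll(np.eye(N), 1, axis=0)
L = (E - E.T) / 2
lam = np.linalg.eigvals(L)

print(np.allclose(lam.real, 0.0))               # purely imaginary
print(np.max(np.abs(lam.imag)) <= 1.0 + 1e-12)  # contained in [-i, i]</code></pre>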
<h3 id="parabolic-problems">Parabolic problems</h3>
<p>It’s well known that the largest interval of stability of a
consistent <span class="math inline">\(s\)</span>-stage method on the
negative real axis has length <span class="math inline">\(s^2\)</span>;
the corresponding polynomials are (shifted) Chebyshev polynomials. You
might hope that this could also be deduced by considering a centered
difference semi-discretization of the heat equation and applying the CFL
theorem. That would be very neat, since it would provide a connection
between PDE stability theory and the optimality of Chebyshev
polynomials.</p>
<p>Indeed, explicit time discretizations generally lead to step size
restrictions depending on the square of the spatial mesh width when
paired with the usual centered spatial discretization. But the CFL
theorem is not sharp for these discretizations; it only tells us that
<span class="math inline">\(k\)</span> must vanish more quickly
than the spatial mesh width. So no deduction along these lines seems
possible.</p>
<p>#spnetwork #recommend doi:10.1007/BF01389633</p>
<p>#discusses doi:10.1147/rd.112.0215 #discusses
doi:10.1007/BF01932030</p>
Documentation, testing, and default arguments for your MATLAB packages2013-10-12T00:00:00+03:00h/2013/10/12/MATLAB-docs-testing<p>I primarily develop code in Python and Fortran, but I also use MATLAB
for certain things. For instance, I haven’t found a Python-friendly
nonlinear optimization package that measures up to the capabilities of
MATLAB’s optimization toolbox (fmincon). So my RK-opt package for
optimizing Runge-Kutta methods is written entirely in MATLAB.</p>
<p>The trouble is that working in Python has spoiled me for other
languages. Python has the excellent <a
href="http://sphinx-doc.org/">Sphinx</a> package for writing
<strong>beautiful documentation</strong>. Python has the <a
href="http://nose.readthedocs.org/">nosetests</a> harness for easily
writing and running <strong>tests</strong>. And Python has <a
href="http://www.diveintopython.net/power_of_introspection/optional_arguments.html">a
simple syntax for including <strong>optional function arguments</strong>
with default values</a>.</p>
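<p>For reference, the Python pattern in question is just keyword
arguments with default values, e.g. (a made-up illustration, not code
from RK-opt):</p>
<pre><code>def bisect(f, a, b, tol=1e-10, max_iter=50):
    """Bisection with optional tolerance and iteration cap."""
    for _ in range(max_iter):
        m = (a + b) / 2
        if f(a) * f(m) <= 0:
            b = m
        else:
            a = m
        if b - a < tol:
            break
    return (a + b) / 2

# Any subset of the optional arguments can be overridden, by name:
root = bisect(lambda x: x * x - 2, 0, 2, max_iter=80)
print(abs(root - 2 ** 0.5) < 1e-8)</code></pre>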
<p>MATLAB doesn’t support any of these things so elegantly*.</p>
<p>*<em>This was true one year ago, when I started writing this. But it
seems things have improved – see below</em>.</p>
<p>In any case, all is not lost – I have found reasonable approximations
in the MATLAB ecosystem, and in some cases I’ve adapted the Python tools
to work with MATLAB.</p>
<h3 id="documenting-matlab-projects-using-sphinx">Documenting MATLAB
projects using Sphinx</h3>
<p>In principle, Sphinx can be used to write documentation for packages
written in any language. However, its <a
href="http://sphinx-doc.org/ext/autodoc.html">autodoc</a> functionality,
which automatically extracts Python docstrings, doesn’t work with
MATLAB. For RK-Opt, I hacked together a simple workaround in <a
href="https://github.com/ketch/RK-opt/blob/master/doc/m2rst.py">this
74-line Python file</a>. It goes through a given directory, extracts the
MATLAB docstring for each function, and compiles them into an .rst file
for Sphinx processing. You can see an <a
href="http://numerics.kaust.edu.sa/RK-opt/RK-coeff-opt.html">example of
the results here</a>.</p>
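<p>The idea behind the script is simple enough to sketch in a few lines.
The following is a simplified illustration (not the actual m2rst.py),
assuming each docstring is the leading %-comment block after the
<code>function</code> line:</p>
<pre><code>import os

def matlab_docstring(path):
    """Return the leading %-comment block of a MATLAB file."""
    doc = []
    with open(path) as f:
        for line in f:
            s = line.strip()
            if s.startswith('function'):
                continue            # docstring follows the function line
            elif s.startswith('%'):
                doc.append(s.lstrip('%').strip())
            else:
                break
    return '\n'.join(doc)

def directory_to_rst(directory):
    """Collect the docstrings of every .m file into one .rst document."""
    parts = []
    for name in sorted(os.listdir(directory)):
        if name.endswith('.m'):
            title = name[:-2]
            parts.append(title + '\n' + '=' * len(title))
            parts.append(matlab_docstring(os.path.join(directory, name)))
    return '\n\n'.join(parts)</code></pre>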
<p><strong>Update</strong>: <em>as I’m writing this, I’ve discovered a
new <a
href="https://bitbucket.org/bwanamarko/sphinx-contrib/src/tip/matlabdomain/README.rst">MATLAB
extension for Sphinx’s autodoc</a>. I will have to try it out sometime;
please let me know in the comments if you’ve used it.</em></p>
<h3 id="automated-testing-in-matlab">Automated testing in MATLAB</h3>
<p>I’ve become convinced that writing at least one or two tests is
worthwhile for even small, experimental packages. In Python, it’s simple
to include tests in the docs and run them with doctest, or write test
suites and run them with nosetests. For MATLAB, I would have recommended
the third-party <a
href="http://www.mathworks.com/matlabcentral/fileexchange/22846-matlab-xunit-test-framework">xunit
framework</a>. But it seems that this year <a
href="http://www.mathworks.com/help/matlab/matlab-unit-test-framework.html">Mathworks
finally added this functionality to MATLAB</a>. Even so, you might want
to use xunit because <a
href="https://github.com/tgs/matlab-xunit-doctest">it’s possible to run
doctests with it</a> but not with MATLAB’s new built-in framework. Also,
you can get XML output from xunit, which a number of other tools can
analyze (for instance, to tell you about code coverage). For an example
of how to use xunit, <a
href="https://github.com/ketch/RK-opt/blob/master/RK-coeff-opt/test_rkopt.m">see
RK-Opt</a>.</p>
<p>Again, I’d be interested to hear from you in the comments if you’ve
used MATLAB’s new built-in test harness.</p>
<h3 id="optional-arguments-with-default-values">Optional arguments with
default values</h3>
<p>MATLAB does allow the user to specify only some subset of the input
arguments to a function – as long as the omitted ones all come after the
included ones. I used to take advantage of this, with this kind of code
inside the function:</p>
<pre><code>% supply defaults for any omitted trailing arguments
if nargin<5, rmax=50; end
if nargin<4, eps=1.e-10; end</code></pre>
<p>This is a reasonable solution in very small functions, but it breaks
if you want to add new arguments that don’t come at the end, and if you
want to specify the very last value then you have to specify them all. A
better general solution is the <a
href="http://www.mathworks.com/help/matlab/ref/inputparserclass.html">inputParser
object</a>. It’s much less natural than Python’s syntax, but the result
for the user is the same: arbitrary subsets of the optional arguments
can be specified; default values will be used for the rest. As a bonus,
you can check the types of the inputs. <a
href="https://github.com/ketch/RK-opt/blob/master/polyopt/opt_poly_bisect.m#L258">Here’s
an example of usage</a>.</p>
<p>If you know of better ways to do any of these things, please let me
know in the comments!</p>
<p>Of course, it’s entirely possible to develop large, well-documented,
well-tested, user-friendly packages in MATLAB – <a
href="http://www.chebfun.org/">Chebfun</a> is one example. It’s just
that this is the exception and not the rule in the MATLAB community.
Hopefully better integration with testing and documentation tools will
improve this situation.</p>
Giving a math talk using IPython notebook slides and Wakari2013-09-21T00:00:00+03:00h/2013/09/21/ipython_notebook_slides_talks<h1
id="giving-a-math-talk-using-ipython-notebook-slides-and-wakari">Giving
a math talk using IPython notebook slides and Wakari</h1>
<p>Last week I gave my first full-length <em>executable talk</em>: one
in which I showed the code that produced (almost) all the results I
presented. You can <a
href="http://www.davidketcheson.info/talks/SciCADE-talk.slides.html#/">see
the talk</a> and <a
href="https://www.wakari.io/sharing/bundle/ketch/SciCADE-talk">run the
talk on Wakari</a> (or download it and run it locally). All you need is
Python with its scientific packages (numpy, scipy, sympy – I recommend
just installing <a href="http://www.continuum.io/downloads">Anaconda
Python</a> if you haven’t already). I took things a step further and
actually ran a bunch of demo code live on Wakari. I was excited
beforehand, and judging by the number of people that came into the room
immediately before my talk (and left immediately afterward), so
was the audience. But I was disappointed with how it went. Here’s
why.</p>
<p><strong>Composing a talk in an IPython notebook is
counterintuitive</strong>. When I give a talk, I try to tell a
compelling and coherent story. This requires a certain mindset, and
somehow the IPython notebook hinders rather than helps – at least, for
me. I think there is too much of a disconnect between how things look
when I’m writing them and how they look as slides. In theory Beamer should be
worse in this respect, but in practice it felt worse with the notebook.</p>
<p><strong>It is hard to engage your audience with code</strong>. Almost
nobody can digest complicated formulas during a talk, which is why even
when I speak to mathematicians I usually have very few equations and
lots of pictures. Well, the same goes for code – nobody can digest more
than a few simple lines on a slide. I think I did a good job of keeping
the code short, high-level, and intuitive, but it still felt flat.</p>
<p><strong>Code in the talk needs to execute very quickly</strong>. This
is obvious for code that you run as a live demo, but I found it
necessary also for code snippets that I didn’t run live (but where I
wanted to show the results). That’s because when you recompile your talk
(which I do <em>many, many times</em> during the composing process), you
have to wait for all that code to execute again. It doesn’t help that
things seem to run significantly slower on Wakari than on my laptop.</p>
<p><strong>The IPython notebook format is not (yet) good at displaying
graphs and tables</strong>. Talks full of text put people to sleep, and
code is text, so this kind of talk already has a strike against it. But
to make matters worse, I can’t insert images into my notebook slides
without putting an ugly line of code above them. And the notebook
refuses to let me embed vector graphics formats (like PDF), so I have to
degrade them to slightly blurry pngs.</p>
<p><strong>It’s hard to judge how long a code-based talk will
take</strong>. I usually judge conservatively so I can move at a relaxed
pace. But my demo took much longer than I planned (partly due to the
difficulty of using a Spanish keyboard), and I had to rush through the
last third of the talk in about 2 minutes. I guess this is something to
learn with practice.</p>
<p><strong>The default fonts in notebook-converted slides are just too
small</strong>. They are fine for someone sitting at a computer screen,
but much too small for the projector screen at the front of a large
room. You can adjust the size in the browser using ‘+’, but the result
looks ugly for some reason. I know the fonts can be changed using CSS,
and I’ll make them larger next time.</p>
<p>For me, the worst condemnation of any talk is that no questions are
asked afterward. I haven’t had that happen in a long time, but this was
close: there was only one question, and that question demonstrated that
I had completely failed to convey what was going on behind a lot of the
code I had shown.</p>
<p>It feels too soon to give up on this approach to talks; I will try it
again some time. Perhaps I just haven’t found the right use for this
medium. If you have tried giving a similar talk, I’d love to hear your
opinion or suggestions.</p>
<p>One note about the slides: parts of them will not make sense in the
absence of my verbal explanations. I generally avoid including a lot of
explanatory text in the slides. I actually added a lot more than usual
in this case because I was planning to post them online.</p>
Don't scrap the DOE CSGF program2013-06-23T00:00:00+03:00h/2013/06/23/csgf-letter-to-congress<p>The US federal government <a
href="http://energy.gov/sites/prod/files/2013/04/f0/Volume4.pdf">has
proposed to eliminate a number of smaller graduate fellowship
programs</a> and lump them together with the <a
href="http://www.nsfgrfp.org/">NSF graduate fellowship</a>.
Unfortunately, this includes the Dept. of Energy’s illustrious <a
href="http://www.krellinst.org/csgf/">Computational Science Graduate
Fellowship (CSGF) program</a>, of which I’m an alumnus. I think the CSGF
program is an irreplaceable asset that is nurturing the <em>third
pillar</em> of science across disciplines in a way that a much larger
program never could.</p>
<p>Here are just a few of the definite, tangible impacts the CSGF
program had on me, off the top of my head:</p>
<ul>
<li>Because of the program of study requirements, I took a course in
optimization, without which I would never have written <a
href="http://dx.doi.org/10.1090/S0025-5718-09-02209-1">this paper</a>,
parts of <a href="http://arxiv.org/abs/1105.5798">this paper</a>, and
probably <a href="http://dx.doi.org/10.2140/camcos.2012.7.247">this
paper</a>.</li>
<li>I met David Keyes (a member of the steering committee) and came to
KAUST! I almost certainly would not be here if it weren’t for the CSGF
program.</li>
<li>I met Carl Boettiger and learned about using Jekyll for open
notebook science, resulting in the site you are reading.</li>
</ul>
<p>Here is the letter I sent to four congressional committee members
asking them to save the program. If you know the CSGF program and its
significance, I urge you to do so too.</p>
<blockquote>
<p>Dear Senator/Congressman,</p>
</blockquote>
<blockquote>
<p>I am writing to you because of your leadership role on the
Appropriations Subcommittee on Energy and Water Development. I am an
applied mathematician and an alumnus of the DOE computational science
graduate fellowship (CSGF) program. I am writing because I have learned
that funding for the CSGF program is slated to be merged into a much
larger NSF graduate fellowship program. I think this would be a terrible
decision, because it would destroy the unique benefits of the CSGF
program.</p>
</blockquote>
<blockquote>
<p>The CSGF program was the second federal graduate fellowship that I
received while pursuing a Ph.D. The first was a fellowship from the
Dept. of Homeland Security. I would like to emphasize the value of the
CSGF program by contrasting it with the DHS fellowship program. Under
the DHS fellowship, I received valuable funding, but that was
essentially all. In contrast, the CSGF program dramatically altered my
career path in several positive ways. It required me to receive a
broader graduate education including computer science and physics, which
has allowed me to pursue interdisciplinary research that would otherwise
be impossible. It sent me to a practicum at Sandia National Laboratory,
where I established collaborations that continue to this day. Most
importantly, it introduced me to the network of CSGF fellows and alumni,
a small and very cohesive community of outstanding computational
scientists that is beginning to transform this relatively new scientific
discipline. It is no exaggeration to say that my career has been shaped
by my interaction with that community. I am now a successful professor
with my own research funding and have no obligation to attend the annual
CSGF conference. But that interaction is important enough that last year
I flew half way around the world, using my own research funds, to spend
a few days with the current fellows and other alumni.</p>
</blockquote>
<blockquote>
<p>I think that many federal fellowships, including the DHS fellowship,
may be well served by their being merged into a larger NSF program. But
the unique benefits of the CSGF program, and especially the scientific
community that it fosters, could not exist under a larger program with
less focus. Please keep the DOE CSGF program intact and keep the funding
for it within the Advanced Scientific Computing Research (ASCR)
office.</p>
</blockquote>
<blockquote>
<p>Sincerely,</p>
</blockquote>
<blockquote>
<p>Professor David I. Ketcheson</p>
</blockquote>
How to avoid javascript errors when copy-pasting Bibtex citations in Mendeley on Mac OS X2013-02-15T00:00:00+03:00h/2013/02/15/mendeley_bibtex_javascript_solution<p>I use Mendeley to manage references. Mendeley has a nice auto-import
feature that will pull down bibliographic data from the web to my
database. When writing, my workflow typically involves grabbing
references from Mendeley in bibtex format. The simplest way to do this
involves right-clicking on a publication and selecting “copy citation”.
Provided that one has already selected “bibtex generic citation style”
in the <em>View->Citation Style</em> menu, this action results in the
full bibtex entry being copied to the clipboard.</p>
<p>At least, that’s how it’s supposed to work.</p>
<p>For a couple of years now, I’ve had the problem that I get this on
the clipboard instead:</p>
<blockquote>
<p>Error: JavaScript error found: CSL error: Exception: TypeError:
‘undefined’ is not a function, 515,
file:///Applications/Mendeley%20Desktop.app/Contents/Resources/citeproc-js/citeproc.js</p>
</blockquote>
<p>Despite this problem being <a
href="https://www.google.com/search?q=mendeley+javascript+error">reported
by numerous users</a>, Mendeley has never provided a fix that worked for
me. But today, after discussion with Mendeley support, I found my own
fix.</p>
<p><strong>What to do:</strong> Just replace the file</p>
<p>~/Library/Application Support/Mendeley
Desktop/citationStyles-1.0/bibtex.csl</p>
<p>with the one found at</p>
<p><a
href="http://www.zotero.org/styles/bibtex">http://www.zotero.org/styles/bibtex</a></p>
<p>Then re-open Mendeley. That’s it. Of course, I recommend just moving
your bibtex.csl rather than deleting it, in case anything goes
wrong.</p>
5 reasons why you should submit your next paper to CAMCoS2013-01-17T00:00:00+03:00h/2013/01/17/why_to_publish_in_camcos<p>I have a new favorite journal: <a
href="http://msp.org/camcos/">Communications in Applied Mathematics and
Computational Science</a>. I just published a paper with them for the
first time (technically it’s still in press, but you can download it <a
href="http://msp.org/camcos/2012/7-2/p04.xhtml">here (paywall)</a> or <a
href="http://numerics.kaust.edu.sa/papers/stability_polynomials/stability_polynomials_2012.html">here
(free; same version)</a>).</p>
<p>CAMCoS is a hidden gem – it is relatively new (6 years old) and not
yet as widely known as most established journals. I believe that within
a few years it will be as coveted a publishing venue as any applied
mathematics journal. Here’s why.</p>
<ol type="1">
<li><p><strong>A respected publisher with an exceptional editorial
board.</strong> We all know that a primary consideration when submitting
an article is the prestige of the publisher and the journal. CAMCoS is
published by <a href="http://msp.org">Mathematical Sciences Publishers
(MSP)</a>, a non-profit run by mathematicians for mathematicians; they
also publish Annals of Mathematics, Geometry and Topology, and a number
of other excellent journals. Their website says “<em>our aim is to
transform scientific publishing into an industry that helps rather than
hinders scholarly activity</em>”, and their actions back that up. The
CAMCoS editorial board is an outstanding group of some of the world’s
leading applied mathematicians; <a
href="http://msp.org/camcos/about/journal/editorial.html">take a look
for yourself</a>.</p></li>
<li><p><strong>Timely and thorough peer review and
copy-editing.</strong> A respected publisher and a famed editorial board
are nice, but how well is the journal actually operated?<br />
My experience with CAMCoS puts it far ahead of most other journals I’ve
dealt with. We submitted the article in early July, and it came back in
early November: 4 months, which is not lightning-fast but not too
shabby. The referees seemed to be well chosen and to have done a
thorough job, suggesting several valuable improvements. We resubmitted
in late November (minor revisions) and the article was accepted five
days later. We submitted the TeX files in early December, and received
the galley proofs with copy editing one month later. The really
astonishing part: we approved the proofs on January 7, with one added
correction; our article was made available online, in final form
<strong>the next day</strong>, on January 8. I’ve never before
experienced or even heard of that kind of turnaround from a publisher.
For comparison, my SISC paper that was accepted in August still hasn’t
been assigned an issue or a DOI (it’s now mid-January).</p></li>
<li><p><strong>Electronic PDF features that no other publisher I know
offers.</strong> The copy editing is high quality, but what really blew
me away is that the copy editor went through our bibliography and made
every paper title into a hyperlink to the published journal article. As
if that wasn’t enough, he added links to Mathematical Reviews and
Zentralblatt for every article that possessed such entries. These
hyperlinks are active in the PDF, as are hyperlinks from references in
the paper to the bibliography, references to equations and theorems,
etc.<br />
This may seem like a small thing, but I think it’s very powerful. It
means that as you read through the paper, when you see a citation in a
sentence that puzzles or interests you, you can just click on the citation,
which will take you to the bibliography. Then you can click on the
bibliographic entry to go immediately to the paper cited, or to a
review of it! <a
href="http://numerics.kaust.edu.sa/papers/stability_polynomials/camcos-v7-n2-p04-s.pdf">Click
through to the paper</a> and try it for yourself. This is a capability
that all journal articles obviously should have had for the last 15
years, but this is the first publisher I’ve seen who understands
that.</p></li>
<li><p><strong>You can choose whether to keep your copyright, and you
can post the final version of the paper on your website or institutional
server.</strong> <a
href="http://msp.berkeley.edu/editorial/uploads/camcos/accepted/120712-Ketcheson/copyright.pdf">MSP’s
publishing agreement</a> has two options: if you wish, you may sign over
your copyright to them. You will still retain the rights to “reproduce
[the article] by any means for educational and scientific purposes …
without fee or permission” as long as you don’t try to charge anyone for
it. Alternatively, you can retain the copyright to your work, granting
MSP only a license to publish; the only restriction is again that you
can’t charge others a fee for accessing your work. Their policy has
clearly been designed with the author’s interests as primary
concern.</p></li>
<li><p><strong>(Virtually) Diamond open access</strong>. <a
href="http://symomega.wordpress.com/2012/08/09/green-gold-or-diamond-access/">Diamond
open access</a> is the OA movement’s dream; a model that avoids the
author charges of Gold OA while still providing peer review and a stable
DOI (which green OA often lacks). In strict terms, CAMCoS is not diamond
OA, since it requires a subscription. However, I claim it is virtually
diamond OA, for two reasons. First, CAMCoS uses a moving paywall, under
which articles become OA after one year (thus, only the 2012 issue
requires a subscription at present). Second, during that first year, open
access can easily be provided by the author posting a copy somewhere.</p></li>
</ol>
<p>By way of disclosure, I have no affiliation with CAMCoS and no reason
to promote them except that they represent the kind of journal I think
the applied mathematics community should support.</p>
Convert SAGE worksheets to IPython notebooks2013-01-16T00:00:00+03:00h/2013/01/16/sage_to_ipython<h1 id="converting-a-sage-worksheet-to-an-ipython-notebook">Converting a
SAGE worksheet to an IPython notebook</h1>
<p>Download link: <a
href="http://github.com/ketch/sage2ipython/">http://github.com/ketch/sage2ipython/</a></p>
<p>I use Python to teach numerical methods here at KAUST, and I’m in the
process of switching from using <a
href="http://www.sagemath.org">SAGE</a> worksheets to <a
href="http://ipython.org">IPython</a> notebooks (more on the reasons in
a later post). I’ve invested a lot of time over the past three years in
developing a set of SAGE worksheets and it would be a substantial amount
of tedious work to manually copy-paste their contents into IPython
notebooks. So I decided to write an automated converter.</p>
<p>Each SAGE worksheet is usually stored in a .sws file that is a
bzipped tarball; underneath, there is a text version (called
worksheet.html). If you run SAGE on your own machine, the text versions
of your worksheets can usually be found in
<code>~/.sage/sage_notebook.sagenb/home/username/number/</code>.</p>
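<p>Getting at that text programmatically is straightforward. Here is a minimal sketch (the member name ending in <code>worksheet.html</code> is an assumption based on the layout just described):</p>

```python
import tarfile

def extract_worksheet_text(sws_path):
    """Pull the plain-text worksheet out of a .sws file.

    A .sws file is a bzip2-compressed tarball; we look for the member
    whose name ends in 'worksheet.html' (an assumed naming convention).
    """
    with tarfile.open(sws_path, 'r:bz2') as tar:
        for member in tar.getmembers():
            if member.name.endswith('worksheet.html'):
                return tar.extractfile(member).read().decode('utf-8', 'replace')
    raise ValueError('no worksheet.html found in ' + sws_path)
```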
<p>It’s a simple matter to convert the SAGE format (that uses triple
braces to delimit code cells) into the IPython format (that I believe is
JSON).<br />
Rather than write an actual parser, which seemed like overkill, I just
created a script that steps through the file line-by-line and keeps
track of whether it’s in a cell. Debugging it was slightly painful
because if you have the tiniest syntax error, then the IPython notebook
server just tells you something is wrong and displays nothing.</p>
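<p>The heart of that line-by-line approach can be sketched in a few lines of Python. This is a simplified illustration, not the actual sage2ipython code: real worksheets attach a cell id to the opening delimiter (e.g. <code>{{{id|</code>), which this sketch ignores, and a <code>///</code> line inside a code cell separates input from output.</p>

```python
def sws_text_to_cells(text):
    """Split SAGE worksheet text into (cell_type, source) pairs.

    SAGE delimits code cells with {{{ and }}}; a /// line inside a code
    cell separates input from output, and output is simply deleted.
    Everything outside the braces becomes a Markdown cell.
    """
    cells = []
    buf, in_code, in_output = [], False, False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == '{{{':
            if buf:
                cells.append(('markdown', '\n'.join(buf)))
            buf, in_code, in_output = [], True, False
        elif stripped == '}}}' and in_code:
            cells.append(('code', '\n'.join(buf)))
            buf, in_code, in_output = [], False, False
        elif in_code and stripped == '///':
            in_output = True  # lines from here to }}} are output: drop them
        elif not in_output:
            buf.append(line)
    if buf:
        cells.append(('markdown', '\n'.join(buf)))
    return cells
```

<p>Each resulting <code>('code', src)</code> pair then becomes a code cell in the notebook file, and each <code>('markdown', src)</code> pair a Markdown cell.</p>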
<p>You can download the converter from the <a
href="http://github.com/ketch/sage2ipython/">Github repository</a>. It
has been tested with SAGE version 4.2.1 and IPython version 0.13.1. Note
that it has several limitations (see the list below). But it has served
my needs well.</p>
<p>Usage:</p>
<div class="sourceCode" id="cb1"><pre
class="sourceCode python"><code class="sourceCode python"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="im">import</span> sage2ipython</span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a>sage2ipython.sage2ipy(<span class="st">'/path/to/sage/worksheet/html/file'</span>,<span class="st">'output_file_name.ipynb'</span>)</span></code></pre></div>
<p>To convert all your SAGE worksheets, do:</p>
<pre><code>>>> import sage2ipython
>>> sage2ipython.convert_all_sage_worksheets('username')</code></pre>
<p>where <em>username</em> is your account name. You may also need to
edit the SAGE notebook account name that occurs in the path in
<code>convert_all_sage_worksheets()</code>.</p>
<p>If you have any problems, it is likely that your worksheet contains
some special characters that need to be escaped in the IPython notebook.
I’ve included fixes for several of those, but almost certainly not all
of them. Please let me know.</p>
<p>General notes/limitations:</p>
<ul>
<li>All code blocks are assumed to be Python code blocks.</li>
<li>Output is simply deleted.</li>
<li>Everything else is put in Markdown cells.</li>
<li>Double backslashes are handled properly only if you have the
development version of IPython. Otherwise, you should convert them to
quadruple backslashes.</li>
</ul>
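<p>The escaping problem comes down to the fact that the notebook file is JSON, where backslashes, double quotes, and literal newlines are illegal inside a string. Rather than escaping characters case by case, letting a JSON library serialize each cell handles all of them at once. A small illustration (not the converter’s actual code):</p>

```python
import json

# Markdown with math is full of characters that are illegal raw inside
# a JSON string: backslashes, double quotes, and literal newlines.
md_source = 'Euler: $e^{i\\pi} + 1 = 0$, a "quoted" word,\nand a second line'

# Pasting the text into the notebook file with string formatting
# produces invalid JSON:
naive = '{"cell_type": "markdown", "source": "%s"}' % md_source
try:
    json.loads(naive)
    naive_is_valid = True
except ValueError:
    naive_is_valid = False
assert not naive_is_valid

# json.dumps escapes everything correctly in one step:
cell = json.dumps({'cell_type': 'markdown', 'source': md_source})
assert json.loads(cell)['source'] == md_source
```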
Why I signed the Cost of Knowledge2012-12-23T00:00:00+03:00h/2012/12/23/why-i-signed-thecostofknowledge<p>It has been almost a year since <a
href="http://gowers.wordpress.com/2012/01/21/elsevier-my-part-in-its-downfall/">Tim
Gowers’ blog post</a> about boycotting Elsevier triggered the Cost of
Knowledge movement. <a href="http://thecostofknowledge.com/">The boycott
has been signed by more than 13,000 people</a>. But the vast majority of
academics – including most of my own friends and collaborators –
continue to support Elsevier by gifting to it much of their research
output and their labor, enabling <a
href="http://tylerneylon.com/files/Neylon_Open_Sci_Sum_Talk.pdf">Elsevier
to operate at greater profits than Starbucks, Amazon, or Nike</a>.</p>
<p>One of the most common arguments I’ve heard for this continued
support goes as follows:</p>
<blockquote>
<p><em>Sure, it would be better if journal prices were lower, but it’s
not such a big deal that we should all get worked up about it.
Corporations always seek to maximize profits – why focus on academic
publishers?</em></p>
</blockquote>
<h3 id="commercial-publishers-are-pillaging-academia">Commercial
publishers are pillaging academia</h3>
<p>In September, I spoke with a colleague who is a professor at a small
university in Spain. She told me that she currently has no
Ph.D. students and sees no possibility of supervising students “in the
next 5-6 years” due to the financial situation at her university.
Furthermore, she confided that each month she is uncertain whether her
next paycheck will come. Last year, all faculty at her university took
an involuntary 5% pay-cut as the university struggled to pay its bills.
Another colleague from a major university in Texas told me that the Math
department lost 20% of its faculty in the last two years over funding
problems, as the university’s budget has decreased by 25%.</p>
<p>Of course, both of these universities continue to pay huge sums for
Elsevier journal bundles – that’s a cost they simply can’t cut if they
want to continue as a respected institution of higher learning. Elsevier
continues to pillage academic institutions through its strangle-hold on
scientific publishing, while professors face salary cuts and students
cope with ever-rising tuition.</p>
<h3 id="who-is-to-blame">Who is to blame?</h3>
<p>Many people on both sides of the boycott have argued that we
shouldn’t expect commercial publishers to behave any better. As
corporations, the argument goes, their highest allegiance is to their
shareholders, not their stakeholders. Commercial publishers simply can’t
be expected to do anything but plunder academia and the general public
in order to enrich themselves. But then who is responsible for
channeling funds that should support scientific research but instead go
to pay for overpriced journals? <strong>The guilty party must be the
academics who prop up commercial publishers by providing all the content
and labor!</strong></p>
<h3 id="its-not-you-its-me">It’s not you, it’s me</h3>
<p>Researchers who continue to support Elsevier can pretend that they
are passively “staying out of the fight”, but they are deluding
themselves. Submitting or refereeing a paper is a very active and
expensive decision. When I submit a paper, I decide where to deposit a
huge investment of my own time and my institution’s (or funding agency’s)
money. That is why I joined the boycott – why I had to. It is the
scientific publishing equivalent of the Hippocratic oath: <em>First, do
no harm.</em> Locking away publicly-funded research for the profit of a
few is harmful. Forcing my own university (and taxpayers) to buy back
the research they already paid me to do is harmful. I’m not boycotting
in order to stop Elsevier from doing harm, I’m boycotting <em>to prevent
myself from doing harm</em>.</p>
<p>One editor-in-chief of an Elsevier journal pointed out to me that
SIAM also operates its journals “at a profit”, meaning that the
subscription fees generate more revenue than what it costs to run the
journal. This is true! But where does that extra revenue go? It is used
to subsidize SIAM conferences, reducing the registration fees I pay.
This can be seen as a shrewd way of leveraging central university
budgets (which pay for journal subscriptions) to support my discipline
specifically.</p>
<p>Why would I send my work to a publisher like Elsevier that aims
primarily to enrich shareholders when I could send it to a journal of
the same quality with a non-profit publisher (like SIAM) that charges
much lower prices and uses its income to benefit its customers
(members)?</p>
Open scientific collaboration2012-12-22T00:00:00+03:00h/2012/12/22/habits-of-open-scientist-4-collab<p>This is the fourth post in my <a
href="http://davidketcheson.info/2012/07/31/habits-of-open-scientist.html">series
on habits of the open scientist</a>. Here I discuss the fourth habit,
<strong>open collaboration</strong>. The previous post was on <a
href="http://davidketcheson.info/2012/08/22/habits-of-open-scientist-3-pre.html">Pre-publication
dissemination of research</a>.</p>
<p>As mentioned in the introduction to this series, the first three
habits are truly essential for any conscientious scientist. With the
fourth habit, we’re moving into things that are valuable but less
essential – <em>advanced open science</em>, if you will.</p>
<p>What do I mean by open collaboration? The use of online tools and
social media to connect with new collaborators and provide your own
expertise where it is needed most. For an excellent introduction to the
subject, go read <a
href="http://michaelnielsen.org/blog/reinventing-discovery/">Michael
Nielsen’s book, <em>Reinventing Discovery</em></a>. Here I’ll just focus
on a few examples from my own experience:</p>
<h3 id="scientific-qa-sites">Scientific Q&A sites</h3>
<p>Often scientific research involves elements of work that have been
done before or are already well understood – by someone, somewhere.
Sometimes this work is published and readily available, but other times
it is unpublished or perhaps published in a place you wouldn’t know to
look. Finding the person with the specialized knowledge you need might
take much longer than “reinventing the wheel”, i.e. redoing the work
yourself. Enter StackExchange, an engine for connecting questions with
correct answers and making them readily available.</p>
<p>I’m an avid participant in (and former moderator of) the <a
href="http://scicomp.stackexchange.com">Stack Exchange for Computational
Science</a>. I also use <a
href="http://mathoverflow.net">Mathoverflow</a> and <a
href="http://stackoverflow.com">Stack Overflow</a>. Some personal
examples of the kind of connections I’m talking about are <a
href="http://math.stackexchange.com/questions/86977/polynomials-that-are-orthogonal-over-curves-in-the-complex-plane/">here</a>
and <a
href="http://scicomp.stackexchange.com/questions/65/are-there-operator-splitting-approaches-for-multiphysics-pdes-that-achieve-high">here</a>.
These are conversations that would never have taken place “in real life”
simply because the people involved have never met each other.</p>
<p>I also find the <a href="http://tex.stackexchange.com">TeX stack
exchange</a> to be a gold mine, and typically far more useful than
browsing through package documentation on CTAN.</p>
<h3 id="social-networks-like-google">Social networks like <a
href="http://plus.google.com">Google+</a></h3>
<p>I use Google+ (and previously Reader, which was a far superior tool)
for sharing new papers that I think may be of interest to my
collaborators. I’ve also used it to debate journals’ editorial policies
(with the editors) and for preliminary planning of conferences and
proposals – to find out who may be interested in participating. It’s
certainly not suited to discussing scientific or mathematical concepts
in any detail, and it is annoyingly difficult to sort through new things
that are posted. I think that Facebook is less useful for this purpose
because Facebook is used primarily for personal content whereas a large
community of G+ users (of which I am part) consider it to be a platform
for sharing professional content. But I’m not a good judge – I don’t
even have a Facebook account.</p>
<h3 id="github"><a href="http://github.com">Github</a></h3>
<p>I wanted to say “sites like Github”, but I don’t think there are any
others. Online code hosting sites have long facilitated collaboration
between existing teams, but Github takes this to a new level by
explicitly promoting collaboration between people who have never met.
Surprisingly, this paradigm shift didn’t require any new technology.
Rather, it stems from a combination of their “code first, ask permission
later” pull-request mindset and subtle differences in the user interface
– like a “fork me” button on every page, just begging you to modify some
stranger’s code.</p>
<p>Now this philosophy – and use of Github – has moved beyond just
sharing what we usually think of as computer code. For instance, Carl
Boettiger puts <a href="http://github.com/cboettig/labnotebook">the full
source of his Jekyll-based website on Github</a>, which enabled me
(simply by forking it) to easily set up this site.</p>
<h3 id="a-word-of-caution">A word of caution</h3>
<p>As useful as all the above are, I’ve found that they can also be a
way of wasting time. You may find this to be the case if you’re merely
trading opinions with strangers or consuming tidbits of information that
aren’t really relevant to your research – for instance, I find that my
time spent on the <a href="http://academia.stackexchange.com">Academia
Stack Exchange</a> is of dubious value. I stepped down from moderating
the SciComp Stack Exchange because I felt it was too time-consuming. But
if used in a focused way, open collaboration tools can accelerate,
enrich, and expand your research.</p>
<p>What other tools or sites ought to be mentioned here? Let me know in
the comments.</p>
Reflections on the 2012 ICERM Reproducibility Workshop2012-12-14T00:00:00+03:00h/2012/12/14/icerm-reproducibility<p>I spent the last five days at ICERM (the new Math institute at Brown
University) attending the workshop <a
href="http://icerm.brown.edu/tw12-5-rcem">Reproducibility in
Computational and Experimental Mathematics</a>. The workshop was focused
on discussing how mathematicians can ensure that their computations are
reproducible, in order to ensure correctness and facilitate their use by
others. It’s a topic dear to my heart and one that I’ve <a
href="http://www.davidketcheson.info/tags.html#reproducible-research">blogged
about before</a>.</p>
<p>My hat is off to the organizers for managing to assemble a highly
diverse group of experts, including not only academic luminaries from
both pure and applied math, open source software gurus, and leaders from
companies like Github and Google (yes, Peter Norvig himself attended).
Most of the talks were excellent. Many included live demos of great
tools, and others introduced me to things that I never thought you could
do with computation – like discovering new formulas for pi.</p>
<p>Going into the workshop, I felt that I already knew a lot about
reproducibility and had relatively good habits in this regard. So what
did I learn? I picked up a new tool, Andrew Davison’s <a
href="http://packages.python.org/Sumatra/introduction.html">Sumatra</a>,
which I had heard of before but now have begun to use in earnest (more
on that in a future post). I was impressed with Lorena Barba’s <a
href="http://dx.doi.org/10.6084/m9.figshare.104539">Reproducibility PI
Manifesto</a> and learned a new trick from her: put your figures up on
Figshare before submitting a paper in order to retain copyright on the
figures. I marveled at Greg Wilson’s goal of reaching 20% of all
scientists with his <a href="http://software-carpentry.org/">Software
Carpentry</a> courses, and I determined to host such a course at KAUST
in the near future.</p>
<p>I also learned that the reproducibility movement in computational
science and mathematics involves a wide range of opinions and concerns.
For instance, some consider that the primary motivation for
reproducibility is to ensure correctness of results, while others feel
that it is scientific productivity. There is disagreement about how much
value should be placed on code development, on how reproducibility
should be taught, and on ways in which journals and funding agencies
should encourage reproducibility. In the end, we had difficulty even
agreeing on a well-defined terminology for concepts related to
reproducibility. Nevertheless, there is broad agreement that we need to
improve our habits in recording and presenting our computational work.
On the final day, in a spurt of crazy massive Google Doc collaboration
(have you ever edited a document live with 30 others at once?) we
drafted a report that I’ll link to here once it appears.</p>
<p>If you want to know more, take a look at the great <a
href="http://wiki.stodden.net/ICERM_Reproducibility_in_Computational_and_Experimental_Mathematics:_Readings_and_References#Thought_Pieces_Submitted_for_the_ICERM_Workshop">thought
pieces</a> submitted and the rest of the material on the <a
href="http://wiki.stodden.net/Main_Page">wiki</a>.</p>
Adopting the Reproducible Research Standard2012-12-06T00:00:00+03:00h/2012/12/06/reproducible-research-standard<p>Back in July, I read Victoria Stodden’s work on licensing
reproducible research. Victoria has proposed the Reproducible Research
Standard (RRS), which is an amalgamation of recommended licenses for
what she calls the <em>research compendium</em>. The research compendium
is the full set of outputs of a research project, including:</p>
<ul>
<li>The research paper</li>
<li>Additional media, such as movies</li>
<li>Computer code</li>
<li>Data</li>
<li>A record of the computing environment used to process the code and
data</li>
</ul>
<p>The idea is that all of these components are part of your research
and someone wanting to understand your research may need access to all
of them. The RRS consists of the following licenses:</p>
<ul>
<li><a href="http://creativecommons.org/licenses/by/3.0/">Creative
Commons Attribution (BY)</a> for <strong>media</strong> (text, figures,
movies)</li>
<li><a
href="http://en.wikipedia.org/wiki/BSD_licenses%233-clause_license_.28.22New_BSD_License.22_or_.22Modified_BSD_License.22.29">Modified
BSD</a> for <strong>code</strong></li>
<li><a
href="http://sciencecommons.org/resources/faq/database-protocol">Science
Commons Database Protocol</a> for <strong>data</strong></li>
</ul>
<p>For the most part, this is easy enough to implement: the current
academic research system frankly doesn’t care what you do with your
code, data or miscellaneous media outputs. And I think that actually
releasing those is the most important part of the RRS. But the text and
figures of the paper itself must be published in a journal, and
typically the journal will want the copyright – preventing you from
releasing those media under CC-BY.</p>
<p>Nevertheless, I’ve attempted to follow the full RRS with each of the
two papers I’ve had accepted since then. <a
href="http://arxiv.org/abs/1111.3499">The first</a> (still in press) was
accepted to the SIAM Journal on Scientific Computing (SISC). The code is
licensed under modified BSD as part of the <a
href="https://github.com/clawpack/sharpclaw">SharpClaw</a> package (now
rolled into <a href="https://github.com/clawpack/pyclaw">PyClaw</a>).
After reading <a
href="http://adamdsmith.wordpress.com/2009/07/07/copyright-copywrong/">one
author’s experience retaining copyright to an article published by
SIAM</a>, I decided to try the same approach of modifying the copyright
transfer agreement by <a
href="http://adamdsmith.wordpress.com/2009/07/07/copyright-copywrong/#jp-carousel-138">striking
out the transfer of copyright</a>. I suspected that the instance just
linked to went “below the radar”, and I wanted to be completely
above-board, so I pointed out to SIAM that I had modified the agreement.
What made this particularly interesting is that one of my co-authors on
the paper is Randy LeVeque, chair of the SIAM journals committee.</p>
<p>Eventually, SIAM objected “on the grounds that non-exclusive right to
publish doesn’t prohibit others from publishing for profit, which may be
to [the authors’] disadvantage as well.” They agreed instead to an
addendum generated via http://scholars.sciencecommons.org/ that retains
for the authors the right to post the final article on any public
server, as long as publication in SISC is stated. Since this gave me
what I wanted in practical terms, I agreed and signed the copyright
transfer + addendum. I’ve been told that an ad hoc committee of SIAM
leadership is now discussing how SIAM should handle these copyright
questions like this.</p>
<p>I came away from this feeling like we had made progress, but I still
wanted to see if I could implement the full RRS with respect to the next
paper. My <a href="http://arxiv.org/pdf/1201.3035v3.pdf">next accepted
paper</a> (also still in press) was a submission to <a
href="http://msp.org/camcos/">Communications in Applied Mathematics and
Computational Science</a>, published by the extremely progressive
not-for-profit <a href="http://msp.org/about/">Mathematical Sciences
Publishers</a>. This is a truly remarkable journal that will be the
subject of another blog post in the near future, but what’s important in
this context is that the journal doesn’t require authors to transfer
copyright! They only require a <a
href="http://msp.berkeley.edu/editorial/uploads/camcos/accepted/120712-Ketcheson/copyright.pdf">license
to publish</a> which includes this clause:</p>
<blockquote>
<p><em>The copyright holder retains the right to duplicate the Work by
any means and to permit others to do the same with the exception of
reproduction by services that collect fees for delivery of documents,
which may be licensed only by the Publisher. In each case of authorized
duplication of the Work in whole or in part, the Author(s) must still
ensure that the original publication by the Publisher is properly
credited.</em></p>
</blockquote>
<p>After discussion with my co-author Aron Ahmadia, we’re retaining
copyright and licensing the paper under CC-BY-NC. The NC (non-commercial
clause) seems necessary to comply with the paragraph above, and seems
reasonable to me. The code for the paper is released as part of the <a
href="https://github.com/ketch/RK-opt">RK-opt package</a>. So I’m
calling this mission accomplished.</p>
<p>I have mixed feelings about whether it makes sense for journals to
let authors keep copyright – I can see some sense in SIAM’s objection,
and I think that non-profit publishers need to protect enough of a
revenue stream to support their activities. I think it is better that
that revenue come from (low-cost) subscriptions than from author fees.
It will be interesting to see where SIAM’s policy falls.</p>
Switching from Blogger to Jekyll2012-10-25T00:00:00+03:00h/2012/10/25/switching-from-blogger-to-jekyll<p>If you’re reading this, then you’ve probably noticed: I moved my blog
from <a href="http://scienceinthesands.blogspot.com">blogspot</a> to my
own new site. Among other things, that meant a change in the engine that
runs the blog, from <a href="http://www.blogger.com">blogger</a> to <a
href="https://github.com/mojombo/jekyll">Jekyll</a>. It was a big jump
from the simplest, hosted blogging platform out there to a rather
advanced engine designed by hackers for hackers.</p>
<h2 id="why-switch">Why switch?</h2>
<p>I had been wanting for some time to include a lot more math and code
in my blog posts, and it was a hassle with Blogger. The output often
looked funny and was hard to control. With Jekyll, I get <a
href="http://davidketcheson.info/2012/10/11/Internal_stability.html">beautiful
results like this</a>. I also wanted more control over my blog’s
appearance and greater interoperability, which meant <a
href="http://pragprog.com/the-pragmatic-programmer/extracts/tips">keeping
things in plain text</a> and using (generated) static HTML, both of
which Jekyll enables me to do.</p>
<p>But really the switch was part of a much bigger change: I’ve migrated
the content of my professional home page here to davidketcheson.info and
begun an open science notebook. That’s why the link at the top of the
page reads <strong>NoteBlog</strong>: it’s intended to be a combination
<strong>notebook</strong> and <strong>blog</strong>. On the blog side,
I’ll keep posting about issues like scientific publishing, open science,
and reproducibility. On the notebook side, there will be a lot more posts of
raw results and experiments from my current research projects, not
intended for a general audience. And somewhere in-between there’ll be
reasonably polished expository math-y posts accessible to students and
researchers in my field.</p>
<h2 id="how-i-switched">How I switched</h2>
<p>It was easy, thanks primarily to Carl Boettiger. This site was built
based on Carl Boettiger’s <a
href="http://carlboettiger.info">labnotebook site</a>. Carl publishes
the source for his site on Github as the <a
href="http://github.com/cboettig/labnotebook">labnotebook project</a>
and releases it all under <a
href="http://creativecommons.org/publicdomain/zero/1.0/">CC0</a>, so
setting my site up was as easy as following <a
href="http://www.carlboettiger.info/README.html">his instructions</a>,
replacing the _posts directory, and making a few CSS customizations.</p>
<p>I migrated all my Blogger content following <a
href="http://coolaj86.info/articles/migrate-from-blogger-to-jekyll.html">these
instructions</a>. This didn’t manage to bring in the tags or comments,
unfortunately. I had done a poor job of tagging my posts in the past
anyway, so I manually re-tagged my 45 existing posts.</p>
<h2 id="subscribing-to-my-new-blog-andor-notebook">Subscribing to my new
blog and/or notebook</h2>
<p>One nice thing about having more control is that I can set up
separate feeds for different kinds of posts. On the right you’ll see
three RSS feed links: one for all entries (notebook and blog), and one
each for the separate notebook and blog feeds. I imagine most of you
will only want to subscribe to the blog, unless you’re interested in my
research niche (you can look at the <a
href="http://www.davidketcheson.info/categories.html">categories
page</a> to get an idea of what each will include).</p>
You might be a low-quality scientific journal if...2012-10-16T00:00:00+03:00h/2012/10/16/low_quality_journal<p>In the spirit of David Letterman’s top 10 lists, here are my top 10
signs you might be a low quality scientific journal, inspired by an
e-mail I received today from <a
href="http://www.ampublisher.com/Canadian-Journal-Computing.html">Computing
in Mathematics, Natural Sciences, Engineering and Medicine</a>.</p>
<ol type="1">
<li><p>You are incorporated in Canada, and have “Canadian” in your
title, but none of your editors lives in Canada.</p></li>
<li><p>Your scope is so broad that it includes abstract algebra, textile
engineering, and dermatology all in one journal.</p></li>
<li><p>Most of your abstracts include misspellings of common English
words, like <a
href="http://www.ampublisher.com/June%202010/CMNSEM%20Jun%202010.html">“stander”
for standard</a>.</p></li>
<li><p>Most of your article titles could be used as exercises for grade
school students learning to fix improper verb conjugations and
subject-noun agreement.</p></li>
<li><p>You promise to complete reviews of articles in mathematics or
similar fields in two weeks.</p></li>
<li><p>You require math and physics manuscript submissions to be in
Microsoft Word format.</p></li>
<li><p>You spam researchers in other disciplines with your calls for
papers.</p></li>
<li><p>Your chief editor received a Ph.D. in the last five
years.</p></li>
<li><p>Your funding model is <s>vanity press</s> gold open
access.</p></li>
<li><p>You regularly invite graduate students to serve on your editorial
board.</p></li>
</ol>
<p>On a serious note: a few of these, taken by themselves, might not
necessarily be a bad sign. And I must say that AM Publishers’ author
charges (95 USD) are the lowest I’ve ever seen.</p>
<p>For a related, more serious analysis, see <a
href="http://scholarlyoa.com/publishers/">Beall’s List of Predatory
Open-Access Publishers</a>.</p>
Impact of the Elsevier boycott2012-10-11T00:00:00+03:00h/2012/10/11/elsevier_boycott_impact<p>Seven months ago, I signed the Elsevier boycott at <a
href="http://thecostofknowledge.com">thecostofknowledge.com</a>. What
impact has this had? So far, I’ve</p>
<ul>
<li>Submitted 2 manuscripts to SIAM journals that would otherwise have
gone to Elsevier journals</li>
<li>Declined to referee 3 manuscripts from Elsevier journals</li>
</ul>
<p>If every signee had a similar impact (admittedly, that’s a very
optimistic view), that would be more than 24,000 journal articles
effectively pulled from Elsevier journals and published elsewhere. Which
might be a good thing for the editors, since they’d be having a
difficult time finding qualified referees in the communities where the
boycott has been adopted.</p>
<p>It will be impossible to quantify my impact going forward, since I
now automatically rule out Elsevier journals when planning a new paper,
and since I’ve asked editors to remove me from their list of potential
referees.</p>
<p>Some people I’ve met seem to have the perception that the boycotters
are deeply angry people who spend their time muttering curses at
commercial publishers. That simply isn’t the case, and anyone who has
read the documents that helped launch the boycott must know that. When I
refuse to referee for Elsevier journals, I do so politely and I always
suggest alternate reviewers to the editor. In every case the editor has
been equally polite and understanding.</p>
Blogging an iPython notebook with Jekyll2012-10-11T00:00:00+03:00h/2012/10/11/blogging_ipython_notebooks_with_jekyll<blockquote>
<p><strong>Update as of December 2014: Don’t bother using what’s below;
go to <a
href="http://cscorley.github.io/2014/02/21/blogging-with-ipython-and-jekyll/">Christopher
Corley’s blog</a> for a much better setup!</strong></p>
</blockquote>
<p>I’ve been playing around with <a
href="http://ipython.org/ipython-doc/dev/interactive/htmlnotebook.html">iPython
notebooks</a> for a while and planning to use them instead of <a
href="http://www.sagemath.org/">SAGE</a> worksheets for my numerical
analysis course next spring. As a warmup, I wrote an iPython notebook
explaining a bit about internal stability of Runge-Kutta methods and
showing some new research results using <a
href="http://numerics.kaust.edu.sa/nodepy/">NodePy</a>.</p>
<p>I also wanted to post the notebook on my blog here; the ability to
more easily include math and code in blog posts was one of my main
motivations for moving away from Blogger to my own site. I first tried
following <a
href="http://blog.fperez.org/2012/09/blogging-with-ipython-notebook.html">the
instructions given by Fernando Perez</a>. That was quite painless and
worked flawlessly, using <code>nbconvert.py</code> to convert the .ipynb
file directly to HTML, with graphics embedded. The only issue was that I
didn’t love the look of the output quite as much as I love how Carl
Boettiger’s Markdown + Jekyll posts with code and math look (see an
example <a
href="http://www.carlboettiger.info/2012/09/14/analytic-solution-to-multiple-uncertainty.html">here</a>).
Besides, Markdown is so much nicer than HTML, and
<code>nbconvert.py</code> has a Markdown output option.</p>
<p>So I tried the markdown option:</p>
<pre><code>nbconvert.py my_nb.ipynb -f markdown</code></pre>
<p>I copied the result to my <code>_posts/</code> directory, added the
<a href="https://github.com/mojombo/jekyll/wiki/YAML-Front-Matter">YAML
front-matter</a> that Jekyll expects, and took a look. Everything was
great except that all my plots were gone, of course. After considering a
few options, I decided for now to put plots for such posts in a
subfolder <code>jekyll_images/</code> of my public Dropbox folder. Then
it was a simple matter of search/replace all the paths to the images. At
that point, it looked great; you can see the <a
href="https://github.com/ketch/nodepy/blob/master/examples/Internal_stability.ipynb">source</a>
and the <a
href="http://davidketcheson.info/2012/10/11/Internal_stability.html">result</a>.</p>
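<p>The search/replace step amounts to a single <code>sed</code>
substitution. Here is a minimal standalone sketch; the notebook name,
figure filename, and Dropbox path are illustrative placeholders:</p>

```shell
# Rewrite local image references in a converted notebook so they point
# at a public Dropbox folder. 'my_nb' and the URL are placeholders.
fname=my_nb
printf '![png](%s_files/fig0.png)\n' "$fname" > "${fname}.md"
sed "s#${fname}_files#https://dl.dropbox.com/u/656693/jekyll_images/${fname}_files#g" \
    "${fname}.md" > "${fname}.tmp" && mv "${fname}.tmp" "${fname}.md"
cat "${fname}.md"
```

<p>(Writing to a temporary file sidesteps the BSD/GNU difference in
<code>sed -i</code>.)</p>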
<p>The only issue was that I didn’t want to manually do all that work
every time. I considered creating a new Converter class in
<code>nbconvert</code> to handle it, but finally decided that it would
be more convenient to just write a shell script that calls
<code>nbconvert</code> and then operates on the result.<br />
Here it is:</p>
<pre><code>#!/bin/bash
# Convert an iPython notebook to a Jekyll post.
# Usage: ./nbconv.sh notebook_name   (without the .ipynb extension)
fname=$1

# Convert the notebook to Markdown (produces ${fname}.md and ${fname}_files/).
nbconvert.py ${fname}.ipynb -f markdown

# Point the image references at the public Dropbox folder.
sed -i '' "s#${fname}_files#https:\/\/dl.dropbox.com\/u\/656693\/jekyll_images\/${fname}_files#g" ${fname}.md

# Prepend the YAML front matter that Jekyll expects
# ('0a' inserts at the top, '.' ends the input, 'w' writes the file).
dt=$(date "+%Y-%m-%d")
echo "0a
---
layout: post
time: ${dt}
title: TITLE-ME
subtitle: SUBTITLE-ME
tags: TAG-ME
---
.
w" | ed ${fname}.md

# Move the finished post into place with the date-prefixed filename Jekyll requires.
mv ${fname}.md ~/labnotebook/_posts/${dt}-${fname}.md</code></pre>
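<p>If <code>ed</code> isn’t available, the front-matter step can be
approximated with a temporary file instead. This is an alternative
sketch, not what the script above does, and the filename and body text
are placeholders:</p>

```shell
# Prepend Jekyll YAML front matter to a converted notebook (portable variant).
fname=demo
dt=$(date "+%Y-%m-%d")
echo 'Post body.' > "${fname}.md"
{ printf -- '---\nlayout: post\ntime: %s\ntitle: TITLE-ME\n---\n' "$dt"
  cat "${fname}.md"; } > "${fname}.tmp"
mv "${fname}.tmp" "${fname}.md"
head -n 1 "${fname}.md"   # first line is now '---'
```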
<p>It’s also on Github <a
href="https://github.com/ketch/labnotebook/blob/master/nbconv.sh">here</a>.
This was a nice educational exercise in constructing shell scripts, in
which I learned or re-learned:</p>
<ul>
<li>how to use command-line arguments</li>
<li>how to use sed and ed</li>
<li>how to use <code>date</code></li>
</ul>
<p>You can expect a lot more iPython-notebook-based posts in the
future.</p>