A few years ago I was introduced to the idea of testing the scientific code I write, and now I am a huge fan of this approach. It saves a great deal of time and vastly improves the quality of my research by helping me catch bugs early and often. I use nose for automatically finding and running all the tests. I use Travis to automatically run the tests every time I update the code – or issue a pull request – on GitHub.
Usually, I can test the output of a routine by comparing numbers (generally, by taking norms of large arrays of numbers). However, I also develop routines that are used to visualize numerical output, using matplotlib and other Python plotting tools. Sometimes the numerical output is fine but the visualization code breaks. It’s easy to write tests that just make sure the visualization code still runs, and that perhaps check some assertions about the properties of the resulting matplotlib objects in memory. But that doesn’t ensure that the image still looks right. How does one automatically check whether an image is correct?
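A smoke test of the kind described above might look like the following sketch (the routine `plot_sine` and its details are hypothetical, just for illustration):

```python
# A "smoke test": run the plotting routine and assert properties of
# the resulting matplotlib objects in memory. This catches crashes,
# but does not ensure the image still looks right.
import matplotlib
matplotlib.use('agg')  # non-interactive backend, safe on headless machines
import numpy as np
import matplotlib.pyplot as plt

def plot_sine():
    fig, ax = plt.subplots()
    x = np.linspace(0.0, 2.0 * np.pi, 50)
    ax.plot(x, np.sin(x))
    return fig

def test_plot_sine_smoke():
    fig = plot_sine()
    (ax,) = fig.axes                           # exactly one Axes was created
    assert len(ax.lines) == 1                  # exactly one curve was drawn
    assert len(ax.lines[0].get_xdata()) == 50  # with the expected number of points
```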
Obviously, correct in this case must refer to some reference image that you produced once and whose correctness you confirmed by visual inspection. In other words, I’m talking about regression testing, which simply ensures that things function as they did before. To regression test an image, you could write an image file and check whether it is bitwise identical to the reference image. But this is very problematic, since running the same bit of matplotlib code on two different machines will often lead to slightly different output, depending on the versions of other libraries that are installed, the fonts available, etc. Some more intelligent method of comparison is needed.
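The naive bitwise check would amount to something like this sketch (`images_identical` is a hypothetical helper, shown only to illustrate why the approach is fragile):

```python
# Naive bitwise regression check: compare the raw bytes of two image
# files. Any difference in fonts, library versions, or renderer
# between machines produces different bytes and breaks this test,
# even when the images look identical to a human.
def images_identical(path_a, path_b):
    with open(path_a, 'rb') as fa, open(path_b, 'rb') as fb:
        return fa.read() == fb.read()
```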
Fortunately, the matplotlib developers have been grappling with these issues for a long time and have a nice packaged solution in the form of a Python decorator, `@image_comparison`, whose usage within matplotlib is explained here. The decorator automatically deals with saving the test image and comparing it to the reference image; all you do is add the decorator before your test function.
Clearly, I thought, it would be great if I could use this decorator for image testing in my own projects. So I followed the instructions linked above, but things didn’t immediately work. Apparently others have had the same experience, so I thought I’d share what I had to do to get things working.
The decorator makes a few assumptions that are based on the directory structure of matplotlib’s own test directories. So you have two options:

1. Conform to the directory structure the decorator expects, and use it as-is.
2. Write your own image comparison machinery from scratch.

I went with option 1, since I generally prefer to outsource as much functionality as reasonably possible. I had to do the following:
1. Put your tests in a subdirectory named `tests`. Thus if your package is named `pystuff`, you might have your tests in `pystuff/tests/my_tests.py`, and there should be an `__init__.py` file in the `tests` directory (it can be just an empty file).
2. Put the reference images in `tests/baseline_images/my_tests` (replacing `my_tests` with the name of the file containing your tests). Note that the image names are specified as arguments to the decorator.
Once I did that, the tests passed on my machine. However, they didn’t pass on Travis.
I ran into two issues on Travis: the Travis machines are headless, so matplotlib cannot connect to a display with an interactive backend; and images rendered with different backends are not pixel-for-pixel identical, so comparisons against baselines generated with another backend fail.
Both of these are solved by ensuring that matplotlib uses the agg backend for the tests (both locally and on Travis). To set the backend to agg, just add the command
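In code, that is matplotlib's standard backend-selection call:

```python
import matplotlib
matplotlib.use('agg')  # select the non-interactive agg backend
```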
This line must be executed before you import `pylab` or `matplotlib.pyplot`. This is very tricky if you are using nose, because nose actually imports every file in your package during the test collection phase. What I had to do to get things to work (and the matplotlib package itself does the same!) is to put a `tests.py` script in the top directory of my package. This script has the following contents:
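A minimal sketch of such a script, assuming nose is invoked programmatically (the exact contents may differ from the original), is:

```python
# tests.py -- lives in the top-level directory of the package.
# The backend is set here, before nose's collection phase can
# import any test module (and hence matplotlib.pyplot).
import matplotlib
matplotlib.use('agg')

try:
    import nose
except Exception:  # nose unavailable (e.g., on very new Python versions)
    nose = None

if nose is not None and __name__ == '__main__':
    # Discover and run the whole test suite under the agg backend.
    nose.main()
```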
Instead of calling nosetests directly on Travis, I use the following in my `.travis.yml`:
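The key change is to run the wrapper script in the `script` section rather than invoking `nosetests`; a minimal sketch might be:

```yaml
# .travis.yml (sketch) -- run the wrapper script instead of nosetests,
# so the agg backend is set before nose's test collection begins
script:
  - python tests.py
```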
That’s it! Since examples are the best documentation, you can check out my working setup here.