For part of my sophomore year of college, I was a computer science major. When I realized that I loved my CS theory courses while my classmates hated them, I decided to major in math instead. I enjoyed the programming classes enough, but programming is not what I wanted to spend my time doing.
The summer after my junior year, I was accepted to a math REU at Rochester Institute of Technology. The first thing my adviser Stanislaw Radziszowski asked me was whether or not I could program! I spent the whole summer programming combinatorial graph theory-related algorithms in C.[1]
Now I, like many of my operations research classmates, spend much of my time programming. Despite the importance of writing code for solving operations research problems, I am surprised by how little programming is discussed. The admissions page for my program says nothing about programming ability, but it is implicitly assumed that programming is a skill that students have.
Moreover, I suspect the operations research-specific parts of the research behind many journal articles are only a fraction of the actual work done by the authors. Much of the required work is implementation and debugging of their algorithms. Yet articles contain little-to-no discussion of the actual code. Even worse, the code is often not published or reviewed. I can only imagine how many coding errors underlie the results of peer-reviewed papers.
Marc Kuo recently blogged about how operations researchers need to get with the program (pun intended). His post kicked off tons of discussion in its comments, on Google+, on Hacker News, and on OR-Exchange.
This discussion came at a good time for me. I’m in the middle of my first big coding project of my PhD research. Despite completing a computer science minor and spending two summers doing nothing but coding, I never learned good software engineering practices. I decided at the beginning of the summer to force myself not to just write this code to get the job done but to write good code.
To start, I finally began using git and github for version control. I have tried several times before, but I have always found it rather confusing.[2] This git tutorial finally got me over the hump. Now I can easily branch my code into different versions, and I have the ability to go back to old versions when I screw something up.
Second, I started teaching myself about unit testing. Code testing was never mentioned in any of my classes in college, and I never hear operations researchers talk about it. Again, I have no doubt that the code behind much published work is full of mistakes. Operations researchers need good testing practices.[3]
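For readers who have not seen a unit test before, here is a minimal sketch in Python using the standard `unittest` module. The `tour_length` function and the test cases are hypothetical and chosen only for illustration (a closed tour over points in the plane); they are not taken from any real research code.

```python
import math
import unittest


def tour_length(points, tour):
    """Return the length of the closed tour that visits `points` in `tour` order.

    Hypothetical helper used only to illustrate testing.
    """
    # A valid tour must be a permutation of the point indices.
    if sorted(tour) != list(range(len(points))):
        raise ValueError("tour must visit every point exactly once")
    # Sum consecutive edge lengths, wrapping around to the starting point.
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


class TourLengthTest(unittest.TestCase):
    # Corners of the unit square, listed in order around the boundary.
    SQUARE = [(0, 0), (0, 1), (1, 1), (1, 0)]

    def test_perimeter_of_unit_square(self):
        self.assertAlmostEqual(tour_length(self.SQUARE, [0, 1, 2, 3]), 4.0)

    def test_crossing_tour_uses_both_diagonals(self):
        expected = 2 + 2 * math.sqrt(2)
        self.assertAlmostEqual(tour_length(self.SQUARE, [0, 2, 1, 3]), expected)

    def test_incomplete_tour_is_rejected(self):
        with self.assertRaises(ValueError):
            tour_length(self.SQUARE, [0, 1, 2])
```

Saved in a file such as `test_tour.py` (the filename is also just an example), the tests can be run with `python -m unittest test_tour`. The idea is to check the code against small instances whose answers can be worked out by hand, plus at least one malformed input.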
Third, I’m trying to write clean, object-oriented, well-commented code. My intention is to publish this code on github when the corresponding paper is published. I want my results to be easily reproducible by others and open to scrutiny. I would also like my code to be reusable for future research. My design patterns might not be quite there yet, but I’m trying to move in that direction.
I realize that I’ve used the word I as much as a Stephen Wolfram blog post. I have no desire to toot my own horn here; I’m just thankful this conversation is happening, and I want to continue it. Good software is crucial to good operations research (both in the academy and out), and yet academic operations researchers, in my experience, talk very little about good software engineering practices. We can do better.
[1] I’m eternally indebted to my brilliant research partner Evan, who taught me how to use bash, vim, and subversion, among other things.
[2] I feel vindicated by a recent thread on Hacker News.
[3] Incidentally, here’s an interesting Quora thread about testing stochastic algorithms.