Taking Leave

On Monday I started a full-time job at RTI International in Durham, NC. At the end of the semester, I am going on a leave of absence from the PhD program in operations research at North Carolina State University. I finished my masters of operations research in May, and I am putting the PhD on hold. I made this decision for a number of reasons that I’m not going to go in to here.

If you’ve never heard of RTI, I cannot explain it to you better than our website:

RTI is an independent, nonprofit institute that provides research, development, and technical services to government and commercial clients worldwide. Our mission is to improve the human condition by turning knowledge into practice.

Established in 1958 as the Research Triangle Institute, RTI has a distinguished history of scientific achievement in the areas of health and pharmaceuticals, education and training, surveys and statistics, advanced technology, international development, economic and social policy, energy and the environment, and laboratory testing and chemical analysis. RTI’s staff of more than 2,800 supports projects in more than 40 countries.

RTI is the cornerstone of the Research Triangle Park which also includes companies such as IBM, Cisco Systems, Lenovo, and GlaxoSmithKline.

I am joining the Health Sciences Division where I will be doing predictive modeling problems across a variety of domains (including, but not limited to, healthcare modeling). This position is not operations research per se (depending, of course, on how one defines both “operations research” and “analytics”), but I am already finding opportunities that will blur the lines between OR and analytics problems.

My first week at RTI has been great. I’m surrounded by incredibly bright colleagues in many disciplines: statistics, math, operations research, economics, education, engineering, epidemiology, and more. For me, this gives opportunities to work on mathematical modeling problems across these many disciplines. I could hardly ask for more.

Operations Research and Computer Programming

For part of my sophomore year of college, I was a computer science major. When I realized that I loved my CS theory courses while my classmates hated them, I decide to major in math instead. I enjoyed the programming classes enough, but programming is not what I wanted to spend my time doing.

The summer after my junior year, I was accepted to a math REU at Rochester Institute of Technology. The first thing my adviser Stanislaw Radziszowski asked me was whether or not I could program! I spent the whole summer programming combinatorial graph theory-related algorithms in C.1

Now I, like many of my operations research classmates, spend much of my time programming. Despite the importance of writing code for solving operations research problems, I am surprised how little programming is discussed. The admissions page for my program says nothing about programming ability, but it is implicitly assumed that programming is a skill that students have.

Moreover, I suspect the operations research-specific parts of the research behind many journal articles is only a fraction of the actual work done by the authors. Much of the required work is implementation and debugging of their algorithms. Yet, articles contain little-to-no discussion of the actual code. Even worse, the code is often not published or reviewed. I can only imagine how many coding errors underly the results of peer-reviewed papers.

Marc Kuo recently blogged about how operations researchers need to get with the program (pun intended). His post kicked of tons of discussion in its comments, on Google+, on Hacker News, and on OR-Exchange.

This discussion came at a good time for me. I’m in the middle of my first big coding project of my PhD research. Despite completing a computer science minor and spending two summers doing nothing but coding, I never learned good software engineering practices. I decided at the beginning of the summer to force myself not to just write this code to get the job done but to write good code.

To start, I finally started using git and github for version control. I have tried several times before, but I have always found it rather confusing.2 This git tutorial finally got me over the hump. Now I can easily branch my code into different versions, and I have the ability to go back to old versions when I screw something up.

Second, I started teaching myself about unit testing. Code testing was never mentioned in any of my classes in college, and I never hear operations researchers talk about it. Again, I have no doubt that the code behind much published work is full of mistakes. Operations researchers need good testing practices?3

Third, I’m trying to write clean, object-oriented, well-commented code. My intention is to publish this code on github when the corresponding paper is published. I want my results to be easily reproducible by others and open to scrutiny. I would also like my code to be reusable for future research. My design patterns might not be quite there yet, but I’m trying to move in that direction.

I realized that I’ve used the word I as much as Stephen Wolfram blog post. I have no desire to toot my own horn here; I’m just thankful this conversation is happening, and I want to continue it. Good software is crucial to good operations research (both in the academy and out), and yet academic operations researchers, in my experience, talk very little about good software engineering practices. We can do better.

  1. I’m eternally indebted to my brilliant research partner Evan who taught me how to use bash, vim, and subversion, among other things. []
  2. I feel vindicated by a recent thread on Hacker News. []
  3. Incidentally, here’s an interesting Quora thread about testing stochastic algorithms. []

A Markdown Syllabus

Instead of maintaining my syllabus in HTML this semester, I decided to try Markdown, a “a text-to-HTML conversion tool for web writers.”1 Markdown provides simple, readable formatting syntax for plain text files which can then be converted to various other formats, in particular, HTML.

Markdown allowed me to keep my syllabus clean and readable while also having various formatting (italics, bold, links, etc). Take a look for yourself. I actually used an extension called Multimarkdown which permits, among other things, tables.

With a few keystrokes in Textmate, I can convert the Markdown file to an HTML file ready to be uploaded to the web. Notice the fourth line in the Markdown file says style: style.css. That tells the Markdown interpreter to add a line of HTML that calls the stylesheet style.css. You can see my simple CSS file here.

With Multimarkdown, you can also convert Markdown formatted files to PDF and LaTeX.

If you’re going to use tables in Multimarkdown, check out this Python script that automagically reformats the table to fit the contents. That’s how I got my table looking so pretty with a few keystokes in Textmate.

|:-----|:------|:----------------------------|:---------------------|
| Week | Class | Lecture Topic               | Book Sections        |
| 1    | 1     | Substitution Rule           | 5.5, 5.6             |
| -    | 2     | Partial Fractions           | 5.7, 5.8, Appendix G |
| 2    | 1     | Approximate Integration     | 5.9                  |
| -    | 2     | Improper Integrals          | 5.10                 |
| 3    | 1     | Areas, Volumes              | 6.1, 6.2             |
| -    | 2     | Volumes                     | 6.2, 6.3, 6.4        |
| 4    | 1     | Average Value of a Function | 6.5, Page 429-430    |
| -    | 2     | **Test 1**                  | 6.6                  |

I won’t be teaching again for a while, so it’s hard to say if I’d use Markdown again. If I do, I need to figure out how to streamline my upload process. I’d like to be able to save, convert to HTML, and upload with a single command.

Laura McLay just posted about using Google Docs in teaching. A Google Doc shared with the whole class might be an even easier way to maintain a syllabus.

  1. http://daringfireball.net/projects/markdown/ []

Calculus Haikus and Limericks

I offered my calculus students bonus points to write a limerick or haiku on their final exam. I got some great answers! And the students seem to enjoy it.

My favorite limerick (on improper integrals):

An integral has bounds from zero to one
You finish the problem and think you are done
Then your mind double checks
The equation is one over x
And the improper integral has a different sum

My favorite haiku gives the formula for integration by parts:

Int of mu dv:
Equal to mu v minus
Int of v d mu

Here are some other good ones:

Doing Integrals…
Oh How I shall miss thee, Calc!
Goodbye, It’s been real?

Calculus is hard
Derivative and limit
¡Yo no se hombre!

MacLaurin Series
is an infinite series
Centered at zero

Calculus is great
I’m sure I’ll use this in life
Or maybe I won’t

Studying ’til 5
And almost sleeping thru test
Was a real bad call

Finding integrals
Is used to find area
Underneath the curve

Calc is fun to do
If you like to integrate
x from a to b

Calculus two, sigh
Why must you torment me so?
I thought we were friends

F of e to x
Derived is e to the x
Lone function like it

I like calculus
The feeling is mutual
But not all the time

Integrals are fun
u-sub can be tricky… yes!
Don’t forget plus C

Calculus is hard
Eight A.M. is too early
Had a good time though

The ratio test:
If the limit equals one
Use another test

Does this series telescope?
For my sake, I truly hope
Comparisons are icky,
Integrals make me sickly,
Turns out the answer is nope. :-(

Here are a few from when I taught at UVA:

An integral is
A derivative reversed
Don’t forget constants!

Thanks to L’Hopital
We can use derivatives
to find a limit!

Find the area,
Between the two stated curves,
Using integrals.

Natural log of
Zero, does not exist, but
Ln of 1 doesn’t

Calculus is great
But only if taught by Tim
At U of V.A.

Teaching Students to Fail

I am teaching the dreaded calculus II this semester. I’ve known many students who flew through calc I in college (having taken calculus in high school) only to receive a reality check from calc II the next semester.

In the US, calc II often involves a significant section on “techniques of integration” where students learn techniques such as partial fractions, trig substitutions, integration by parts. Unlike much of differential calculus, which is taught in calc I, and unlike much of the math taught before college, integration is harder to do algorithmically. That is, a calc II professor cannot simply outline surefire steps guaranteed to give an antiderivative for any function.

The inimitable Robert Ghrist explains it this way in his “funny little calculus text“:

Of course, algorithms for computing many antiderivatives do exist (and are used in Maple, Mathematica, and Wolfram Alpha), but I’d be fired if I tried to take my undergrads through Symbolic Integration I: Transcendental Functions.

Instead, calculus II teachers teach a handful of methods and attempt to teach students intuition for where to use what technique. Even more important, I try to teach my students the skill of trying a method, seeing that the method does not work, then trying something else. Try-fail-try-fail-try again.

I do not think that high school students learn that skill—a skill vital to success not just in calc II, but in every discipline requiring analytical problem solving. Yesterday, my adviser and I were discussing the first big research problem that I’ll be tackling this summer. He noted that our first attempt at solving a massive problem would probably fail; they usually do. Fortunately, I’ve been failing at solving problems at least since taking number theory with Dan Krider in 2003. I know what Edison meant by, “I have not failed. I’ve just found 10,000 ways that won’t work.”

I hope my students are learning how to fail and how to try again. However, I think that kids need to learn earlier. High school assignments should not be set up for students to succeed the first time, every time. Somehow, teachers need to allow students to take risks, learn from their mistakes, and rebound.

I’d love to hear feedback from students who are learning these lessons and teachers trying to teach them.

Animating Plots in Mathematica

A friend asked me how to animate a polar plot graphing in Mathematica. It’s really easy. The Animate and Manipulate commands added in version 6 are a huge step in interactive mathematical visualizations.

The easiest way to visualize it is to have the upper plot range vary from a 0+ϵ to 2π.

 
Animate[
  PolarPlot[2 Cos[4 x], {x, 0, y}, 
    PlotRange -> {{-2, 2}, {-2, 2}}], 
  {y, 0.001, 2}
]

But, I thought this would be a good opportunity to try out Wolfram’s recently introduced Computable Document Format which makes Mathematica files interactive in the web browser. Instead of having to buy an expensive Mathematica license to play with interactive Mathematica tools, CDFs are accessible right from the browser.

If you haven’t already, you’ll have to install the CDF Player (free from Wolfram) to view my animation.1 Try clicking the different functions to see their polar plots drawn.

Here’s code which you should be able to copy and paste right into Mathematica.

 
Manipulate[
  PolarPlot[f[\[Theta]], {\[Theta], 0, \[Mu]},
    PlotRange -> {{-2, 2}, {-2, 2}}
  ], 
    {{f, 2 Cos[4 #] &, "Function"},
    {
      Sin -> "r=Sin[\[Theta]]",
      2 Cos[4 #] & -> "r=2 Cos[4\[Theta]]",
      2 Cos[16 #] & -> "r=2 Cos[16\[Theta]]"
    }, ControlType -> SetterBar},
  {\[Mu], 0.001, 2 \[Pi]},
  ControlType -> Animator,
  AppearanceElements -> None
]
  1. Wolfram has recently made their massive collection of Demonstrations interactive through the CDF format. This is a great feature for math teacher and students. []

My Email Analytics

Last month, Stephen Wolfram did a blog post on the Personal Analytics of his life. For years, he’s recorded every phone call, keyboard stroke, email, and step. He made beautiful graphs to show his activity over the years.

A Wolfram Alpha developer just posted a Mathematica notebook on the Wolfram blog allowing anyone to do the same email analysis that Wolfram did with any IMAP email account. Of course, I dropped what I was doing to try it out with my Gmail account.

At first, it failed to finish processing my incoming email because the JVM ran out of memory. It took me a while to figure out how to tell JLink to let Java have more memory.1

Here’s a plot of emails sent by from Gmail me over the last six years:

I started using Gmail regularly after I graduated from college in 2008 (once my college Exchange-based email account was gone).

My emailing was noticeably sparse from May 2008-May 2009. During that time I was a teacher, and I didn’t spend nearly as much time on a computer as I do now. You can also see a gap during the summer of 2011. I was working at Kiva Systems at the time and primarily used my company email.

On the horizontal axis, you notice I’m pretty silent between 11 PM and 7 AM. I need my sleep, and I never work at night! My emailing is light from 6-9 PM too.

Here’s a graph of my email received over the past six years. It comes in pretty heavy from 8 AM to midnight!

The thick line just under 6 AM is the Google Calendar email updates I used to get every morning. I stopped getting those once I got an iPod Touch this past Christmas. I can’t remember what email used to come at 3 AM for a few years.

This next graphic shows the average number of emails I receive per day for each month. My amount emailing  ramped up once I started using email. Notice the downward trend on incoming emails recently: I’ve been unsubscribing from unnecessary mailing lists and circulars. My emailing had pretty serious peak last May right before I moved to Boston. Not sure why.

Here we have a histogram of the time at which I send emails. Apparently I’m most likely to send an email just after 10 PM. I wouldn’t have guess that. Don’t expect to hear from me after midnight!

As a good operations researcher, I wondered if I received email according to a Poisson process. I pulled the email time stamp data into R. I get email pretty steadily between 8 AM and 10 PM. I looked at the emails that arrived in that interval since September 2011. The mean interarrival time is 0.53 hours. The standard deviation is 0.92. If it were a poisson process, interarrival times would be exponentially distributed, and the mean and standard deviation would be equal.

Below is a histogram of the interarrival times of my emails. The red line is an exponential distribution with the rate set to 1 over the mean interarrival time of my emails. It’s not a terrible fit!2 One reason the email arrival rate might not be exponential is that I frequently have back-and-forth email conversations with people, which skews the distribution towards short interarrival times. I might do some more statistics later, but I have homework to do.

  1. ReinstallJava[CommandLine -> "java", JVMArguments -> "-Xmx4024m"] []
  2. That’s not a official statistical statement! []

Average Area of a Random Hull

Yesterday, someone on MathOverflow asked

Consider n points generated randomly and uniformly on a unit square. What is the expected value of the area (as a function of n) enclosed by the convex hull of the set of points?

Someone quickly cited 2004 paper provides an analytical result for the cases where n=3 and n=4:

For n=3 the expected value is 11/144 and for n=4 it is 11/72.

This is certainly a nontrivial result. However, the value can be approximated by generating a large number of random points, finding the area of the convex hull, and averaging the areas. Of course, finding the convex hull and the area of the convex hull of a set of points requires a little work.

Mathematica provides functions for generating random points and finding the area of the convex hull of a set of points quickly. As a result, I was able to perform a Monte Carlo simulation for the n=3 and n=4 case in a couple of lines of Mathematica code:

Sampling 5000 cases for each returned results fairly close to the predicted average.

 

 

Operations Research, Machine Learning, and Optimization

Over the past 18 months, I’ve been slowly learning some machine learning. One thing I’ve noticed is that most of the math in machine learning is optimization. Regression is typically minimization of some error term. Support vector machines are a quadratic optimization problem with linear constraints. Learning a neural network is simply solving a nonconvex optimization problem. Clustering often takes the form of expectation-maximization. I’m currently learning Bayesian network structure learning which is an extremely difficult combinatorial optimization problem.

Yesterday on Twitter, I commented that I am surprised at how little operations research people and machine learning people talk. Most of the math of OR is, like machine learning, optimization. All the same theorems apply, and we use many of the same algorithms; we just apply them in different ways.

I got helpful feedback from the nerds that follow me. Jeff Linderoth pointed to the recent book Optimization for Machine Learning by his colleague (et al) Stephen J. Wright at University of Wisconsin, Madison. From what I can tell, Wright is an OR guy in a computer scientists clothing. There’s a two-hour lecture by Wright on the same topic that I look forward to watching.

Jeff also pointed to the work of his colleague Ben Recht who’s looking at the optimization problems in ML from a theoretical standpoint. Paul Kerl linked to Jorge Nocedal‘s work at Northwestern. Nocedal and Recht seem to have feet in both worlds. John Myles White noted that the legendary optimizer Stephen Boyd came to the New York Academy of Science’s Machine Learning event last year.

I also came across a 2006 paper on The Interplay of Optimization and Machine Learning Research. The authors note some difference between an OR and ML perspective on optimization:

We observe that the qualities of good optimization algorithms from the machine learning and optimization perspectives can be quite different. Mathematical programming puts a premium on accuracy, speed, and robustness. Since generalization is the bottom line in machine learning and training is normally done off-line, accuracy and small speed improvements are of little concern in machine learning. Machine learning prefers simpler algorithms that work in reasonable computational time for specific classes of problems.1

A bigger question might be where optimization lies as a discipline. Since I’ve been in OR, I’ve always considered optimization as a subfield of OR. But as I read applied OR literature, I find it jarring to see the details of solving a difficult optimization problem mixed with the application of the solution to a real world problem.

Of course, both ML and OR require practitioners to understand how the algorithms work. Optimization problems are hard, and a black box solution rarely works for any of us. But perhaps optimization will become a field of its own that OR and ML can both feed from instead of the two working independently.

  1. I haven’t read the whole paper (that’s from the abstract), but I’m not entirely convinced that is true. Modern machine learning often requires large scale problems to be solved quickly on-line, while optimizers often solve a problem  offline and speed is negotiable. []

Stephen Wolfram’s AMA

Stephen Wolfram, of Wolfram Research and Mathematica fame, did a Q&A (i.e. AMA) on Reddit today. I just enjoyed reading through his answers. A few interesting answers stood out to me.

Someone ask Wolfram’s opinion on P=NP. He thinks it’s undecidable.1

Some smart aleck threw the Riemann hypothesis at him. Interestingly, Wolfram also suspects this is undecidable.2

One questioner asked about open sourcing old versions of Mathematica. Wolfram responded very winsomely, in my view. I didn’t know that they’ve thought about making the core language more freely available. I’d like to see that.

His most interesting answer is his opinion on Matlab. He argues that Matlab has remained matrix-centric when so much of contemporary mathematics goes beyond that. “In the complete web of algorithms in Mathematica, things that can reasonably be represented as numerical matrices are perhaps 5 or 10% of the total.” However, Wolfram believes that Mathematica isn’t outdone by Maple in the realm of matrices.

Wolfram relays that a major goal of Mathematica is “to make a single coherent system in which one can work, and in which everything fits nicely together.” I argued that that’s one thing they’ve done quite well.

I appreciate Wolfram doing this. I continue to be optimistic about Mathematica as a product, and I hope they have a bright future ahead of them.

  1. See the Wikipedia page on undecidability for more. []
  2. Both the Riemann hypothesis and P=NP have been around for many years and have a big bounty on solving them: http://www.claymath.org/millennium/. []
Page 1 of 3123»