Twenty weeks down to twenty minutes

I had a simulation this week that I estimated would take 20 weeks to run. That estimate was based on a run that was too small to be reliable. A more reliable estimate, based on a longer run, was 300 CPU hours, about two weeks on a single processor. Before splitting the problem up to run in parallel, I wanted to see whether I could make each thread run faster.

The most expensive part of the simulation is a function that takes four integer parameters and does a numerical integration. Each parameter can be any value from 0 to 120, so in principle there are 121⁴ = 214,358,881 possible parameter combinations. I knew that some of these combinations were extremely unlikely or impossible, so I thought I could reduce the execution time by caching the results for the most commonly used parameter combinations. Maybe a few thousand combinations would account for most of the function calls.

I saved the function arguments and computed values in a map, a.k.a. dictionary, associative array, etc. (The buzzword for this is memoization.) I set a maximum size on the map because I thought it could grow very large. Even if, say, a thousand arguments accounted for most of the function calls, there might be many more that appeared only once.
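Here's a minimal sketch of the idea in Python. The integrand below is a made-up stand-in, since the post doesn't show the real integration; the point is the size-capped map sitting in front of the expensive call.

```python
import math

def expensive_integral(a, b, c, d, n=10_000):
    # Hypothetical stand-in for the real integration: a midpoint rule on [0, 1]
    # driven by the four integer parameters.
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += math.exp(-(a + b) * x) * math.cos((c - d) * x)
    return total * h

MAX_CACHE_SIZE = 100_000   # cap the map in case many argument tuples appear only once
_cache = {}

def cached_integral(a, b, c, d):
    key = (a, b, c, d)
    if key in _cache:
        return _cache[key]                  # cache hit: cheap dictionary lookup
    value = expensive_integral(a, b, c, d)
    if len(_cache) < MAX_CACHE_SIZE:
        _cache[key] = value                 # store only while under the size cap
    return value
```

In Python you can get much the same effect from functools.lru_cache(maxsize=...), which evicts the least recently used entries once the cache is full rather than refusing to store new ones.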

It turned out that the function was only called with 408 distinct parameter combinations. That means the same integrals were being evaluated millions of times. Caching these values reduced the execution time by a factor of about 800. In other words, the software could retrieve a previously computed value about 800 times faster than it could compute it from scratch.

The simulation that would have taken 300 hours took only 22 minutes, and so there was no need to make it run in parallel.

Update: I found a bug in my hashing algorithm. After fixing the bug, the time to run my simulations increased to 38 minutes, so my speed-up was about a factor of 470 rather than 800, but still enough to run on one thread over lunch. The number of distinct arguments to the numerical integration was much larger than 408 in the corrected program, but still small enough to cache.
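As an aside (not something from the post): since each parameter lies in 0 to 120, the four arguments can be packed into a single integer in base 121, which gives a collision-free key and removes the hand-written hashing step entirely.

```python
def pack_key(a, b, c, d):
    # Each parameter is in 0..120, so there are 121**4 = 214,358,881 possible
    # tuples; base-121 packing maps each tuple to a distinct integer key.
    return ((a * 121 + b) * 121 + c) * 121 + d
```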

6 thoughts on “Twenty weeks down to twenty minutes”

  1. Perhaps this won’t work for your problem, but it does work on some Monte Carlo problems. That’s what I was doing.

    Because of the randomness in the program, I could not determine with certainty which parameter arguments would or would not be needed. I thought perhaps there could be thousands of values only used once. But that was far from the case.

  2. This type of idea is something that should be taught to any grad student using MC simulation or anything similar. I remember lots of discussions about efficiently generating random numbers, variance reduction, etc. Using bit-wise operations instead of the more usual numerical calculations also came up briefly. But never anything about saving values you end up computing several times, which is too bad. It’s a concept that even someone who’s relatively green at programming could pick up for scientific computing.

  3. This concept is heavily exploited in the dynamic programming technique. The basic idea of dynamic programming is to decompose a huge optimisation problem into small subproblems, optimise the small subproblems, and save the results in a table for future lookup.
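A generic illustration of that decompose-and-tabulate pattern, not taken from the comment or the post: a memoized coin-change recursion, where each subproblem’s optimum is computed once and then looked up from the table.

```python
from functools import lru_cache

COINS = (1, 5, 10, 25)   # example denominations

@lru_cache(maxsize=None)
def min_coins(amount):
    # Fewest coins from COINS summing to `amount` (amount >= 0).
    # The optimum for `amount` is built from optima of smaller amounts,
    # each computed once and cached for future lookup.
    if amount == 0:
        return 0
    return 1 + min(min_coins(amount - c) for c in COINS if c <= amount)
```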
