Evaluating weather forecast accuracy: an interview with Eric Floehr

Eric Floehr is the owner of ForecastWatch, a company that evaluates the accuracy of weather forecasts. In this interview Eric explains what his business does, how he got started, and some of the technology he uses.

JC: Let’s talk about your business and how you got started.

EF: I’m a programmer by trade. I got a computer science degree from Ohio State University and took a number of programming jobs, eventually ending up in management.

I’ve also always been interested in weather. A couple years ago my Mom showed me my baby book. At five years old it said “He’s interested in space, dinosaurs, and the weather.” I’m not as interested in dinosaurs now, but still interested in space and the weather.

When I was working as a programmer, and especially when I was a manager, I liked to do little programming projects to learn things. So when I ran across Python I thought about what I could write. I’d wondered whether there was any difference in the accuracy of various weather services (AccuWeather, Weather.gov, etc.). Did they use different models, or did they all get their data from the National Weather Service and just package it up differently? So I wrote a little Python web scraper to pull forecasts from various places and compare them with observations. I kept doing that and realized there really were differences between the forecasters.

I didn’t set out for this to be a business. It just started out to satisfy personal curiosity. It just kept growing every year. In my last position before going out on my own I was CTO for a company that made a backup appliance. We got to the point where the product was mature and doing well. ForecastWatch was taking more and more of my time because I was getting more business from it, and so I decided to make the switch. That was March 2010. Revenue doubled over the next year and it looks like this year it will double again. Things are going well and I really enjoy it.

JC: So you hadn’t been doing this that long when we met last year at SciPy in Austin.

EF: No, I’d only been doing this full time for a few months. But I’d been doing this part-time since 2004.

I didn’t have full-time revenue when I was doing this part-time. But it’s amazing. Once you have the time to focus on something, the opportunities that you hadn’t had time to notice before suddenly open up. Just the act of making something your focus almost makes your goal come to fruition. For years you think “too risky, too risky” and then once you make that jump, things fall in place.

JC: So what exactly is the product you sell?

EF: There are two main components. There’s an online component that is subscription-based. It provides monthly aggregated statistics on forecasts versus actual observations. It has absolute errors, min and max errors, Brier score, all kinds of statistics. It evaluates forecasts for precipitation, high and low temperature, opacity, wind speed and direction, etc. Meteorologists use those statistics to evaluate their forecasts to see how they’re doing relative to their peers.

The second component is research reports. Sometimes meteorologists will commission a report to show how well they’re doing. These reports are based on standard, widely accepted metrics and time-frames, so they can’t just cherry-pick criteria they happened to do well on. But if they see there are statistics in ForecastWatch where they are doing really well, they might want to tell their customers. I’ve also created reports for media companies, large Internet service providers, energy trading companies and other companies who were evaluating weather forecast providers or wanted some other data analysis related to weather forecasts.

Something else, and I don’t know whether this will become a major component, but another area some people are interested in is historical forecasts. I have agreements with some of the weather forecasting companies to sell their forecasts that are no longer forecasts. Some people find this information valuable. For example, a marketer with a major sports league wanted to know how weather forecasts affected attendance. Another example was an investment manager who was looking to invest in a business whose performance he believed had some correlation with weather forecasts. For example, a ski lodge might want to know how far out people base their decisions on forecasts.

I have this data back to 2004. It’s funny, but most weather forecasting companies historically have not kept their forecasts. Their bread-and-butter is the forecast in the future. Once that future becomes the past, they saw no value in that data until recently.

Incidentally, because I’m monitoring weather forecasters’ web sites, I sometimes let them know about errors they were unaware of.

JC: What volume of data are you dealing with?

EF: I have about 200,000,000 forecast data points back to 2004. I’m adding about 130,000 data points a day. My database is something on the order of 70 GB. That’s observation data, hourly forecasts, metadata, etc. Right now I’m looking at data from about 850 locations in the US and about 50 in Canada. I’m looking to expand that both domestically and internationally.

JC: So what kind of technology are you using?

EF: I’m running a LAMP stack: Linux, Apache, MySQL, Python. Originally I was on Red Hat Linux but I’ve switched to Ubuntu server. I’m using Django for the web site. Everything is in Python: the scrapers are in Python, the web site is in Python, all the administrative back-end is in Python.

There are two web sites right now: ForecastWatch.com, which is the subscription, professional site, and a free consumer site ForecastAdvisor.com. The consumer site will give you a local forecast and a measure of the accuracy for various forecasters for your weather.

JC: And who are your customers?

EF: All the major weather forecast companies. Also some financial companies, logistics and transportation companies, etc. I’m just starting to expand more into serving companies that depend on meteorological forecasts, whereas in the past I’ve focused directly on meteorologists.

JC: Let’s talk a little more about the entrepreneurial aspect of your business.

EF: Well, for one thing, I don’t think I’d ever have done this if I’d thought about doing it to make money. There’s not an enormous market for this service, but in a way that’s good. I came from a completely technical background. There’s not a marketing or sales gene in my body and I’ve had to learn a lot. ForecastWatch has given me a great opportunity to learn about those non-technical areas of a business that were so foreign to me before.

I got into this entirely for my own use. And I thought that maybe there was already something that did what I wanted, and in the process of trying to find what’s out there I discovered an unmet need. Even though all the major forecasters said that accuracy was the number one thing they were interested in, they weren’t effectively measuring their accuracy. I thought that if I’m interested in this, maybe other people are too.

At first pricing was a mystery to me. Maybe I needed a new laptop, so I’d charge someone the price of a laptop for some analysis. I had to learn the value of my time and my product.

* * *

Some talks by Eric:

PyOhio 2009 talk about ForecastWatch
PyOhio 2010 panel on Python and entrepreneurship
SciPy 2010 talk

Slide rules

Mike Croucher raises an important point for teachers: Are graphical calculators pointless? I think they are. I resented having to buy my daughter an expensive calculator when I could have bought her a netbook for not much more money.

Calculators are obsolete. I can’t remember the last time I used one. On the other hand, it could be valuable to have students use something really obsolete: a slide rule. Not for long, maybe just for a week or two.

  1. Slide rules are basically strips of log-scale paper. If you play with a slide rule long enough, you might get a tangible feel for logarithms.
  2. Slide rules make you concentrate on orders of magnitude. A slide rule will give you the significant digits, but you have to know what power of ten to use.
  3. Slide rules give you a tangible sense of significant figures. You can’t report more than three significant figures because you can’t see more than three significant figures. Maybe some experience with a slide rule would break students of the habit of reporting every decimal that comes out of their calculators.
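
To make the three points above concrete, here is a minimal Python sketch of slide-rule multiplication. The function name `slide_rule_multiply` and the rounding to three figures are illustrative assumptions of mine, not a model of any particular slide rule.

```python
import math

def slide_rule_multiply(a, b, figures=3):
    """Multiply two positive numbers roughly the way a slide rule does.

    Returns (mantissa, exponent), meaning mantissa * 10**exponent.
    Hypothetical helper, for illustration only.
    """
    # The user keeps track of the powers of ten; the rule only sees mantissas.
    exp_a = math.floor(math.log10(a))
    exp_b = math.floor(math.log10(b))

    # Sliding the scales adds lengths proportional to the logs of the mantissas.
    position = (math.log10(a) - exp_a) + (math.log10(b) - exp_b)
    exponent = exp_a + exp_b

    # Running off the end of the scale means carrying a power of ten.
    if position >= 1:
        position -= 1
        exponent += 1

    # Only about three significant figures can be read off the scale.
    mantissa = round(10 ** position, figures - 1)
    return mantissa, exponent

print(slide_rule_multiply(4.56, 789))
```

The exact product is 3597.84, but the sketch reports only 3.6 and a power of ten of 3. It adds logarithms, leaves the order of magnitude to be tracked separately, and keeps no more digits than could be read by eye.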

I’m not saying that being able to use a slide rule is a valuable skill. It’s not anymore. But the process of using a slide rule for a little while might teach some skills that are valuable. It would be fine if they forgot how to use a slide rule but retained an intuition for logarithms, orders of magnitude, and significant digits.

I’d recommend using a slide rule in high school for the same reason as using an abacus in elementary school: because it’s tangible, not because it’s practical.

Atomic skills versus molecular skills

Scott Adams has an essay in the Wall Street Journal today entitled How to Get a Real Education. He starts by saying the brightest students should get an academic education and the rest should learn entrepreneurship. I disagree. I don’t see why the choice between a traditional academic education and an education emphasizing entrepreneurship should depend on IQ. I also don’t see why there should be a sharp division between the two. Future professors would do well to learn entrepreneurship and future business owners would do well to learn math and history.

But I want to talk here about what I do agree with Scott Adams on. Here’s my favorite part of his essay.

Combine Skills. The first thing you should learn in a course on entrepreneurship is how to make yourself valuable. It’s unlikely that any average student can develop a world-class skill in one particular area. But it’s easy to learn how to do several different things fairly well. I succeeded as a cartoonist with negligible art talent, some basic writing skills, an ordinary sense of humor and a bit of experience in the business world. The “Dilbert” comic is a combination of all four skills. The world has plenty of better artists, smarter writers, funnier humorists and more experienced business people. The rare part is that each of those modest skills is collected in one person. That’s how value is created.

Academia trains people to think in terms of departments. Achievement is measured in ways that fit into a course catalog: chemistry, French, art, math, history, etc. Those who do the best at the academic game have the hardest time shaking these categories. Someone like Scott Adams could berate himself for not excelling as an artist or a writer. But rather than focusing on these atomic skills, he prides himself on how he combines these skills to do something few could do.

When Adams talks about combining skills, I don’t believe he’s talking about the myth of the Renaissance man. The Renaissance ideal is to be great at several atomic skills, each practiced in isolation. Adams is talking about combining skills that may not be remarkable individually and doing something remarkable.

Words that are primes base 36

This morning on Twitter, Alexander Bogomolny posted a link to his article that gives examples of words that are prime numbers when interpreted as numbers in base 36. Some examples are “Brooklyn”, “paleontologist”, and “deodorant.” (Numbers in base 36 are written using 0, 1, 2, …, 9, A, B, C, …, Z as “digits.”)
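
As a small illustration of what “interpreted as a number in base 36” means, here is a Python sketch. The helper names are my own, and the trial-division primality test is a simple stand-in that is adequate for short words but too slow for long ones.

```python
def base36_value(word):
    """Interpret a word as a base 36 number (digits 0-9, then A-Z)."""
    # int() accepts bases up to 36 and ignores letter case.
    return int(word, 36)

def is_prime(n):
    """Primality by trial division; fine for short words, slow for long ones."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def word_is_prime(word):
    """True if the word, read as a base 36 number, is prime."""
    return is_prime(base36_value(word))

for w in ["B", "Z", "10"]:
    print(w, base36_value(w), word_is_prime(w))
```

Here “B” is 11 (prime), “Z” is 35 (not prime), and “10” is 36 (not prime).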

Tim Hopper replied with a snippet of Mathematica code that lists all words with up to four letters that correspond to base 36 primes.

Rest[ Flatten[ Union[
    DictionaryLookup /@ IntegerString[
        Table[Prime[n], {n, 1, 300000}], 36]]]]

That made me wonder whether you could estimate how many such words there are without doing an exhaustive search.

The Prime Number Theorem says that the probability of a number less than N being prime is approximately 1/log(N), where log is the natural logarithm. If we knew how many English words there were of a certain length, then we could guess that 1/log(N) of those words would be prime when interpreted as base 36 numbers. This assumes that forming an English word and being prime have independent probabilities, which may be approximately true.

How well would our guess have worked on Tim’s example? He prints out all the words corresponding to the first 300,000 primes. The last of these primes is 4,256,233. The exact probability that a number less than that upper limit is prime is then

300,000 / 4,256,233 ≈ 0.07.

There are about 4200 English words with four or fewer letters. (I found this out by running

grep -ciE '^[a-z]{1,4}$'

on the words file on a Linux box. See similar tricks here.) If we estimate that 7% of these are prime, we’d expect 294 words from Tim’s program. His program produces 275 words, so our prediction is pretty good.

If we didn’t know the exact probability of a number in our range being prime, we could have estimated the probability at

1/log(4,256,233) ≈ 0.0655

using the Prime Number Theorem. Using this approximation we’d estimate 4200*0.0655 = 275.1 words; our estimate would be exactly correct! There’s good reason to believe our estimate would be reasonably close, but we got lucky to get this close.
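
The arithmetic above is easy to reproduce. Here is a short Python sketch using the numbers quoted in this post; the variable names are mine.

```python
import math

n_primes = 300_000    # number of primes generated by Tim's snippet
largest = 4_256_233   # the last of those primes, as quoted above
n_words = 4200        # approximate number of words with four or fewer letters

# Exact fraction of numbers below the limit that are prime.
exact_density = n_primes / largest

# Prime Number Theorem approximation, using the natural logarithm.
pnt_density = 1 / math.log(largest)

print(f"exact density: {exact_density:.4f}, words: {n_words * exact_density:.1f}")
print(f"PNT density:   {pnt_density:.4f}, words: {n_words * pnt_density:.1f}")
```

Without rounding the exact density to 7%, the first estimate comes out near 296 rather than 294. The Prime Number Theorem estimate stays at about 275.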

Picking classes

Here’s a little advice to students picking electives.

Consider taking classes in those things that would be hardest to learn on your own after you graduate. Taking the most advanced courses available in your major may not be the best choice. Presumably you’ve learned how to learn more about your area of concentration. (If not, your education has failed you.) So the advanced courses might teach you the material you’re best prepared to learn on your own.

Maybe it would be better to take a foundational course in a related area than an advanced course in your main area. For example, I suggested to some statistics graduate students yesterday that they take a really good linear algebra class rather than taking all the statistics they can. If they become professional statisticians, they’ll continue to learn statistics (I hope!) but they may find it harder to take the time to really understand mathematical foundations.

A knight’s tour magic square

This magic square was created by Leonhard Euler (1707-1783). Each row and each column sum to 260. Each half-row and half-column sum to 130. The square is also a knight’s tour: a knight could visit each square on a chessboard exactly once by following the numbers in sequence.

Here is Python code to verify that the square has the properties listed above.
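
The linked code is not reproduced in this text. A minimal sketch of such a check might look like the following. The grid is entered from memory as the square usually attributed to Beverley, so treat it as an assumption rather than a copy of the figure. The function names are also mine.

```python
# Assumed grid: the 8x8 knight's tour square as commonly attributed to Beverley.
SQUARE = [
    [ 1, 48, 31, 50, 33, 16, 63, 18],
    [30, 51, 46,  3, 62, 19, 14, 35],
    [47,  2, 49, 32, 15, 34, 17, 64],
    [52, 29,  4, 45, 20, 61, 36, 13],
    [ 5, 44, 25, 56,  9, 40, 21, 60],
    [28, 53,  8, 41, 24, 57, 12, 37],
    [43,  6, 55, 26, 39, 10, 59, 22],
    [54, 27, 42,  7, 58, 23, 38, 11],
]

def rows_and_columns(square):
    """Yield every row and every column as a list."""
    yield from square
    yield from (list(col) for col in zip(*square))

def sums_ok(square):
    """Each row and column sums to 260, and each half sums to 130."""
    return all(
        sum(line) == 260 and sum(line[:4]) == 130 and sum(line[4:]) == 130
        for line in rows_and_columns(square)
    )

def is_knights_tour(square):
    """Consecutive numbers 1..64 must be a knight's move apart."""
    # Map each number to its (row, column) position.
    pos = {n: (r, c) for r, row in enumerate(square) for c, n in enumerate(row)}
    if sorted(pos) != list(range(1, 65)):
        return False
    for n in range(1, 64):
        (r1, c1), (r2, c2) = pos[n], pos[n + 1]
        # A knight moves one step along one axis and two along the other.
        if sorted((abs(r1 - r2), abs(c1 - c2))) != [1, 2]:
            return False
    return True

print(sums_ok(SQUARE), is_knights_tour(SQUARE))
```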

Update: It seems the attribution to Euler is a persistent error. Euler did publish the first paper on knight’s tours, but the knight’s tour square above was published by William Beverley in 1848. Thanks to George Jelliss for the correction.

Update 2: Notes from George Jelliss on magic king and queen tours.

Mersenne primes and world records

Here’s an interesting account of the largest known primes over time. Thanks to @mathematicsprof for pointing this out.

Ever since 1952, the largest known prime has been a Mersenne prime, with one exception in 1989. One reason is that it is simple to test whether Mersenne numbers are prime using the Lucas-Lehmer test. The algorithm is described in seven lines of pseudo-code here.
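
The linked pseudo-code is not reproduced here. As a rough sketch, the Lucas-Lehmer test can be written in Python as follows. The function name is mine, and the special case for p = 2 is added because the recurrence applies to odd prime exponents.

```python
def lucas_lehmer(p):
    """Return True if the Mersenne number 2**p - 1 is prime.

    Assumes p itself is prime; p = 2 is handled as a special case.
    """
    if p == 2:
        return True  # 2**2 - 1 = 3 is prime
    m = (1 << p) - 1  # the Mersenne number 2**p - 1
    s = 4
    # Iterate s -> s^2 - 2 (mod m) a total of p - 2 times.
    for _ in range(p - 2):
        s = (s * s - 2) % m
    # 2**p - 1 is prime exactly when the final residue is zero.
    return s == 0

exponents = [p for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31) if lucas_lehmer(p)]
print(exponents)
```

Among the prime exponents up to 31, this keeps 2, 3, 5, 7, 13, 17, 19, and 31, and rejects 11, 23, and 29.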

Here are a couple connections with Mersenne and his primes I’ve written about before. First, Mersenne is one of my mathematical ancestors. Second, Mersenne primes are intimately connected with even perfect numbers, a connection that has been known since Euclid.

Better for whom?

Software generally gets better over time, but this does not mean it’s getting better and better every day in every way.

Software quality has so many dimensions that it is impossible to make progress along every front with every release of every product. Life’s full of trade-offs. A successful software project will improve over time in the ways that matter to most of its constituents. That doesn’t mean that every user will be better served by each subsequent release, especially if the user base changes.

It’s inevitable that some software will get worse over time, as far as a minority of users is concerned. See, for example, this post about WordPerfect.

Commercial software may disappoint tech savvy users over time as such users make up a diminishing proportion of the software market. One reason programmers often prefer open source software is that they are the target market for the software.

The dynamics of open source software are more complex. Software written by volunteers is driven by what volunteers find interesting. This could result in software becoming wonkier over time, delighting geeks and alienating the general population. However, many volunteer developers find it interesting to make software easy to use for a wide audience.

And not all open source software is developed by volunteers. For example, the majority of work on the Linux kernel is done by corporate employees. The companies paying for the development have a commercial interest in the software, even though they don’t sell the software. Commercial and non-commercial are fuzzy concepts.

A company may sponsor an open source project because they rely on the software. Or maybe they want to undermine a competitor who sells an analogous product. Or maybe they’re sponsoring a project because they want to crow that they sponsor open source projects. Each of these motivations could make a project better for a different constituency.

Related post: Software development and the myth of progress