My mathematical opposite

Eugenia Cheng may be my mathematical opposite. She did a great interview with Peter Rowlett in which she bubbles over with enthusiasm for category theory. She explains that she couldn’t stand applied math, but stuck with math because she believed there was something there she could love. The further she moved from applicable math, the happier she became. Abstract algebra was a big improvement, but still too concrete. When she discovered category theory, she was home.

Category theory is a sort of meta-mathematics. It aims to identify patterns across diverse areas of math the way a particular area of math may identify patterns in nature. I like the idea of category theory, but I get that deer-in-the-headlights look almost immediately when I look into category theory in any detail.

I enjoy pure math, though I prefer analysis to algebra. I even enjoyed my first abstract algebra class, but when I ran into category theory I knew I’d exceeded my abstraction tolerance. I moved more toward the applied end of the spectrum the longer I was in college. Afterward, I moved so far toward the applied end that you might say I fell off the end and moved into things that are so applied that they’re not strictly mathematics: mathematical modeling, software development, statistics, etc. I call myself a very applied mathematician because I actually apply math and don’t just study areas of math that could potentially be applied.

I appreciate Eugenia Cheng’s enthusiasm even though I don’t share her taste in math. I have long intended to go back and learn a little category theory. It would be great mental exercise precisely because it is so foreign to my way of thinking. Cheng’s interview inspired me to give it one more try.

Update: After this post was written, I did give category theory another try, and gave up again. Then a couple years later I finally committed to digging into category theory and have found practical uses for it.

How Michelangelo worked

Michelangelo's Pietà

The following quote from Irving Stone describes how Michelangelo worked on his Pietà.

He carved in a fury from first light to dark, then threw himself across his bed, without supper and fully clothed, like a dead man. He awoke around midnight, refreshed, his mind seething with sculptural ideas, craving to get at the marble.

ACID versus BASE for database transactions

Database developers all know the ACID acronym. It says that database transactions should be:

  • Atomic: Everything in a transaction succeeds or the entire transaction is rolled back.
  • Consistent: A transaction cannot leave the database in an inconsistent state.
  • Isolated: Transactions cannot interfere with each other.
  • Durable: Completed transactions persist, even when servers restart etc.

These qualities seem indispensable, and yet they are incompatible with availability and performance in very large systems. For example, suppose you run an online book store and you proudly display how many of each book you have in your inventory. Every time someone is in the process of buying a book, you lock part of the database until they finish so that all visitors around the world will see accurate inventory numbers. That works well if you run The Shop Around the Corner but not if you run Amazon.com.
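
Here is a minimal sketch of that locked, all-or-nothing style in Python, using the standard library’s sqlite3 module; the inventory table and its column names are made up for illustration.

    import sqlite3

    # Toy schema; the table and column names are hypothetical.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE inventory (isbn TEXT PRIMARY KEY, copies INTEGER)")
    conn.execute("INSERT INTO inventory VALUES ('978-0000000000', 1)")
    conn.commit()

    def buy_book(conn, isbn):
        """Sell one copy atomically: either the whole purchase commits or nothing does."""
        try:
            with conn:  # begins a transaction; commits on success, rolls back on error
                (copies,) = conn.execute(
                    "SELECT copies FROM inventory WHERE isbn = ?", (isbn,)
                ).fetchone()
                if copies < 1:
                    raise ValueError("out of stock")
                conn.execute(
                    "UPDATE inventory SET copies = copies - 1 WHERE isbn = ?", (isbn,)
                )
            return True
        except ValueError:
            return False  # rolled back; the count never goes negative

    print(buy_book(conn, "978-0000000000"))  # True: the last copy is sold
    print(buy_book(conn, "978-0000000000"))  # False: rolled back, state stays consistent

The with-block commits only if every statement succeeds; any error rolls the whole purchase back, which is the atomicity and consistency the acronym promises.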

Amazon might instead use cached data. Users would see not the inventory count at this second, but what it was, say, an hour ago when the last snapshot was taken. Also, Amazon might violate the “I” in ACID by tolerating a small probability that simultaneous transactions interfere with each other. For example, two customers might both believe they just purchased the last copy of a certain book. The company might risk having to apologize to one of the two customers (and maybe compensate them with a gift card) rather than slow down the site and irritate myriad other customers.

There is a computer science theorem that quantifies the inevitable trade-offs. Eric Brewer’s CAP theorem says that if you want consistency, availability, and partition tolerance, you have to settle for two out of three. (For a distributed system, partition tolerance means the system will continue to work unless there is a total network failure. A few nodes can fail and the system keeps going.)

An alternative to ACID is BASE:

  • Basic Availability
  • Soft-state
  • Eventual consistency

Rather than requiring consistency after every transaction, it is enough for the database to eventually be in a consistent state. (Accounting systems do this all the time. It’s called “closing out the books.”) It’s OK to use stale data, and it’s OK to give approximate answers.
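
As a rough sketch of the “stale data is OK” idea, here is a hypothetical read-through cache in Python that serves an inventory count up to an hour old instead of querying the authoritative store on every page view; the function names and the one-hour window are invented for illustration.

    import time

    CACHE_TTL_SECONDS = 3600   # tolerate counts up to an hour old
    _cache = {}                # isbn -> (count, time of last snapshot)

    def authoritative_count(isbn):
        """Stand-in for a query against the source-of-truth database."""
        return 1

    def displayed_count(isbn):
        """Return a possibly stale count, refreshing the snapshot lazily."""
        now = time.time()
        entry = _cache.get(isbn)
        if entry is None or now - entry[1] > CACHE_TTL_SECONDS:
            _cache[isbn] = (authoritative_count(isbn), now)
        return _cache[isbn][0]

    print(displayed_count("978-0000000000"))  # fast and approximate, not guaranteed current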

It’s harder to develop software in the fault-tolerant BASE world compared to the fastidious ACID world, but Brewer’s CAP theorem says you have no choice if you want to scale up. However, as Brewer points out in this presentation, there is a continuum between ACID and BASE. You can decide how close you want to be to one end of the continuum or the other according to your priorities.

Robust, scalable, and the keyboard works

Glynn Foster from Sun talks about OpenSolaris on FLOSS Weekly episode 75. After explaining how Solaris has always been a robust, scalable operating system, Foster brags that now, on a Toshiba laptop with OpenSolaris pre-installed, “… the volume works, and the keys work …” Then host Jono Bacon laughs, “The keys work?!” The dialog starts at about 23:30 into the podcast.

The other host, Leo Laporte, mumbled “so cool” after Glynn Foster said “the volume works,” and apparently would have let him get away with saying “the keys work.” But Jono Bacon is the community manager for Ubuntu, a Linux distribution that cares more about whether the volume and keyboard work than whether the OS scales.

It was amusing to listen to Glynn Foster and Jono Bacon personify their respective operating systems’ priorities: server performance for Solaris and desktop experience for Ubuntu. Foster says that OpenSolaris used to be a royal pain to install and configure but has gotten much better. I don’t know how well Ubuntu scales (I imagine it’s not nearly as scalable as OpenSolaris), but it was designed from the beginning to be easy to install.

Three rules of thumb

Here are three rules of thumb for back-of-the-envelope estimates:

  1. Duff’s rule: Pi seconds is a nanocentury.
  2. Hopper’s rule: Light travels one foot in a nanosecond.
  3. Rule of 72: An investment at n% interest will double in 72/n years.

How might you use these? How accurate are they?

Duff’s rule comes in handy when converting from times measured in seconds to times measured on calendars. This may not sound useful, but it often happens in software. For example, if a task takes a second to complete, how long would it take to do it a billion times? Well, a billion seconds, obviously. But how long is that in familiar terms? Duff’s rule says a century is about 3.14 billion seconds, so a billion seconds would be something like 30 years.

How accurate is Duff’s rule? A year is 31,536,000 seconds, whereas Duff’s rule would estimate 31,415,927 seconds, so it underestimates the number of seconds in a year by about 0.4%.
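
A quick sanity check of both numbers in Python (using a 365-day year, as above):

    import math

    seconds_per_year = 365 * 24 * 60 * 60       # 31,536,000
    duff_year = math.pi * 1e7                   # ~31,415,927 by Duff's rule

    print(1e9 / seconds_per_year)               # ~31.7 years in a billion seconds
    print(1e9 / duff_year)                      # ~31.8 years by Duff's rule

    # Relative error of Duff's rule for the length of a year
    print((duff_year - seconds_per_year) / seconds_per_year)  # ~ -0.004, i.e. -0.4%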

Hopper’s rule is useful in electrical engineering. For example, you might need to know how long it would take a radio signal to travel between a transmitter and receiver. Hopper’s rule can explain why computer chip clock rates are not increasing. Electrical signals travel at some fraction of the speed of light, and current chip designs are limited by whether a signal can move across the chip during a clock cycle.
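
For a rough worked example (the 3 GHz clock rate here is just an illustrative assumption): at 3 GHz a clock cycle lasts about a third of a nanosecond, so by Hopper’s rule even light covers only about four inches per cycle, and signals in silicon are slower still.

    clock_hz = 3e9                   # hypothetical 3 GHz clock, purely illustrative
    cycle_ns = 1e9 / clock_hz        # about 0.33 ns per clock cycle
    feet_per_cycle = cycle_ns * 1.0  # Hopper's rule: roughly 1 foot per nanosecond
    print(cycle_ns, feet_per_cycle)  # ~0.33 ns and ~0.33 ft (about 4 inches) per cycle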

How accurate is Hopper’s rule? Light travels 299,792,458 meters per second. That corresponds to 0.983 feet per nanosecond, so Hopper’s rule overestimates by about 1.7%.
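
The same kind of check for Hopper’s rule, using the exact value of the speed of light and the exact meter-to-foot conversion:

    c = 299_792_458           # speed of light in meters per second (exact)
    meters_per_foot = 0.3048  # exact by definition

    feet_per_ns = c * 1e-9 / meters_per_foot
    print(feet_per_ns)                          # ~0.983 feet per nanosecond

    # Hopper's rule says 1 ft/ns, an overestimate of roughly
    print((1 - feet_per_ns) / feet_per_ns)      # ~0.017, i.e. about 1.7%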

There’s a video of Grace Hopper explaining Hopper’s rule to David Letterman, but it keeps getting taken down.

The Rule of 72 is obviously useful in financial estimates. For example, $1000 invested at 6% interest will become $2000 in 72/6 = 12 years.

How accurate is the rule of 72? The value of an initial investment P after time t under a continuously compounded interest rate r is P exp(rt). Solving exp(rt) = 2 for t gives t = log 2 / r. If the rate is expressed as a percentage n, so that r = n/100, then t = 100 log 2 / n. This says that for continuously compounded interest, the rule of 72 would be exact if “72” were replaced with 100 log 2 ≈ 69.3. So for continuous compounding, the rule overestimates the doubling time by a factor of 72/69.3, or about 4%. So why use 72 rather than 69.3? There are two reasons. First, 72 is easy to work with mentally since it is divisible by many small integers. Second, interest is often compounded periodically, say annually or monthly, rather than continuously.

The doubling time is longer when interest is compounded periodically rather than continuously. Using 72 rather than 69.3 overestimates the continuous-compounding answer, and that overestimate partially compensates for the longer doubling time under periodic compounding, so 72 can actually be more accurate than 69.3. Exactly how accurate the rule of 72 is for periodically compounded interest depends on the interest rate and the compounding period.
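
Here is a short Python sketch that makes the comparison concrete, tabulating the rule-of-72 estimate against the exact doubling times for continuous and annual compounding at a few rates:

    import math

    def rule_of_72(n):
        return 72 / n

    def exact_continuous(n):
        return math.log(2) / (n / 100)

    def exact_annual(n):
        # interest compounded once a year at n percent
        return math.log(2) / math.log(1 + n / 100)

    for n in (2, 6, 12):
        print(n, round(rule_of_72(n), 2),
              round(exact_continuous(n), 2),
              round(exact_annual(n), 2))
    # 2  36.0  34.66  35.0
    # 6  12.0  11.55  11.9
    # 12  6.0   5.78   6.12

For the rates shown, 72 lands closer to the annually compounded answer than 69.3 would.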

Related post: Bancroft’s rule (rule of thumb for estimating linear regression)