My most popular posts on Hacker News

Here are the most popular posts on my site according to the number of points given on Hacker News.

Variable-speed learning

When I was in college, one of the professors seemed to lecture at a sort of quadratic pace, maybe even an exponential pace.

He would proceed very slowly at the beginning of the semester, so slowly that you didn’t see how he could possibly cover the course material by the end. But his pace would gradually increase to the point that he was going very quickly at the end. And yet the pace increased so smoothly that you were hardly aware of it. By understanding the first material thoroughly, you were able to go through the latter material quickly.

If you’ve got 15 weeks to cover 15 chapters, don’t assume the optimal pace is to cover one chapter every week.

I often read technical books the way the professor mentioned above lectured. The density of completely new ideas typically decreases as a book progresses. If your reading pace is proportional to the density of new ideas, you’ll start slow and speed up.

The preface may be the most important part of the book. Some books I’ve only read the preface and felt like I got a lot out of the book.

The last couple chapters of technical books can often be ignored. It’s common for authors to squeeze in something about their research at the end of a book, even it its out of character with the rest of the book.

Books you’d like to have read

I asked on Twitter today for books that people would like to have read, but don’t want to put in the time and effort to read.

Here are the responses I got, organized by category.


Math, Science, and Software:

History and economics

Religion and philosophy


US flag if California splits into three states

There’s a proposal for California to split into three states. If that happens, what would happen to the US flag?

The US flag has had 13 stripes from the beginning, representing the first 13 states. The number of stars has increased over time as the number of states has increased. Currently there are 50 stars, arranged in alternating rows of 6 and 5 stars.

If California breaks into three states, how would we arrange 52 stars? One way would be alternating rows of 7 and 6. Here’s a quick mock up of a possible 52-star flag I just made:

52 star flag

The difference isn’t that noticeable.

What if California [1] were to secede from the US? We’ve had 49 states before, in the brief period between the beginning of Alaskan statehood and the beginning of Hawaiian statehood. During that time [1], the flag had 7 rows of 7 stars, but the rows were staggered as in the current flag, not lined up in columns as in the 48-star flag before it.


[1] I don’t know how many people seriously propose any state leaving the US. The last time a state tried the result was the bloodiest war in US history. There’s a group in Vermont that wants their state to withdraw from the US.

Texans talk about seceding from the union, and you’ll occasionally see SECEDE bumper stickers, but it’s a joke. Maybe a few people take it seriously, but certainly not many.

[2] There is a tradition dating back to 1818 that the flag only changes on July 4. So the period of the 49-star flag didn’t exactly coincide with the period of 49 states.

Perl as a better grep

I like Perl’s pattern matching features more than Perl as a programming language. I’d like to take advantage of the former without having to go any deeper than necessary into the latter.

The book Minimal Perl is useful in this regard. It has chapters on Perl as a better grep, a better awk, a better sed, and a better find. While Perl is not easy to learn, it might be easier to learn a minimal subset of Perl than to learn each of the separate utilities it could potentially replace. I wrote about this a few years ago and have been thinking about it again recently.

Here I want to zoom in on Perl as a better grep. What’s the minimum Perl you need to know in order to use Perl to search files the way grep would?

By using Perl as your grep, you get to use Perl’s more extensive pattern matching features. Also, you get to use one regex syntax rather than wondering about the specifics of numerous regex dialects supported across various programs.

Let RE stand for a generic regular expression. To search a file foo.txt for lines containing the pattern RE, you could type

    perl -wln -e "/RE/ and print;" foo.txt

The Perl one-liner above requires more typing than using grep would, but you could wrap this code in a shell script if you’d like.

If you’d like to print lines that don’t match a regex, change the and to or:

    perl -wln -e "/RE/ or print;" foo.txt

By learning just a little Perl you can customize your search results. For example, if you’d like to just print the part of the line that matched the regex, not the entire line, you could modify the code above to

    perl -wln -e "/RE/ and print $&;" foo.txt

because $& is a special variable that holds the result of the latest match.


For daily tips on regular expressions, follow @RegexTip on Twitter.

Regex tip icon

Line art

A new video from 3Blue1Brown is about visualizing derivatives as stretching and shrinking factors. Along the way they consider the function f(x) = 1 + 1/x.

Iterations of f converge on the golden ratio, no matter where you start (with one exception). The video creates a graph where they connect values of x on one line to values of f(x) on another line. Curiously, there’s an oval that emerges where no lines cross.

Here’s a little Python I wrote to play with this:

    import matplotlib.pyplot as plt
    from numpy import linspace

    N = 70
    x = linspace(-3, 3, N)
    y = 1 + 1/x

    for i in range(N):
        plt.plot([x[i], y[i]], [0, 1])
    plt.xlim(-5, 5)

And here’s the output:

In the plot above I just used matplotlib’s default color sequence for each line. In the plot below, I used fewer lines (N = 50) and specified the color for each line. Also, I made a couple changes to the plot command, specifying the color for each line and putting the x line above the y line.

        plt.plot([x[i], y[i]], [0, 1], c="#243f6a")

If you play around with the Python code, you probably want to keep N even. This prevents the x array from containing zero.

Update: Here’s a variation that extends the lines connecting (x, 0) and (y, 1). And I make a few other changes while I’m at it.

    N = 200
    x = linspace(-10, 10, N)
    y = 1 + 1/x
    z = 2*y-x

    for i in range(N):
        plt.plot([x[i], z[i]], [0, 2], c="#243f6a")
    plt.xlim(-10, 10)

Off by one character

There was a discussion on Twitter today about a mistake calculus students make:

\frac{d}{dx}e^x = x e^{x-1}

I pointed out that it’s only off by one character:

\frac{d}{de}e^x = x e^{x-1}

The first equation is simply wrong. The second is correct, but a gross violation of convention, using x as a constant and e as a variable.


It’s like this other thing except …

One of my complaints about math writing is that definitions are hardly ever subtractive, even if that’s how people think of them.

For example, a monoid is a group except without inverses. But that’s not how you’ll see it defined. Instead you’ll read that it’s a set with an associative binary operation and an identity element. A module is a vector space, except the scalars come from a ring instead of a field. But the definition from scratch is more than I want to write here. Any time you have a sub-widget or a pre-widget or a semi-widget, it’s probably best to define the widget first.

I understand the logical tidiness of saying what a thing is rather than what it is not. But it makes more pedagogical sense to describe the difference between a new concept and the most similar familiar concept. And the nearest familiar concept may have more structure rather than less.

Suppose you wanted to describe where a smaller city is by giving directions from larger, presumably more well known city, but you could only move east. Then instead of saying Ft. Worth is 30 miles west of Dallas, you’d have to say it’s 1,000 miles east of Phoenix.

Writers don’t have to chose between crisp logic and good pedagogy. They can do both. They can say, for example, that a pre-thingy is a thingy without some property, then say “That is, a pre-thingy satisfies the following axioms: …”

Surface area of an egg

The first post in this series looked at a possible formula for the shape of an egg, how to fit the parameters of the formula, and the curvature of the shape at each end of the egg.

The second post looked at the volume. This post looks at the surface area.

If you rotate the graph of a function f(x) around the x-axis between c and d, the area of the resulting surface is

2 \pi \int_c^d f(x)\sqrt{1 + (f'(x))^2} \, dx

This integral cannot be computed in closed form for the function f describing an egg. (At least I can’t find a closed form, and neither can Mathematica.) But we can do it with numerical integration.

Here’s the Mathematica code.

f[x_, a_, b_, k_] := b Sqrt[(1 - x^2/a^2) / (1 + x k)]
area[a_, b_, k_] := 
    2 Pi* NIntegrate[ 
    f[x, a, b, k] Sqrt[1 + D[f[x, a, b, k], x]^2], 
    {x, -a, a}

As a sanity check, let’s verify that if our egg were spherical we would get back the area of that sphere.

area[3, 3, 0] returns 113.097 and N[36 Pi] also returns 113.097, so that’s a good sign.

Now let’s plot the surface area as a function of the parameter k.

Plot[area[4, 2, k], {k, -0.2, 0.2}]

The y-axis starts at 85.9, so the plot above exaggerates the effect of k. Here’s another plot with the y-axis starting at zero.

Plot[g[4, 2, k], {k, -0.2, 0.2}, PlotRange -> {0, 100}]

As with volume, the difference between an egg and an ellipsoid is approximately a quadratic function of the parameter k.

Volume of an egg

The previous post looked at an equation to fit the shape of an egg. In two dimensions we had

\frac{x^2}{a^2} + \frac{y^2}{b^2}(1 + kx) = 1

In this post, we’ll rotate that curve around the x-axis to find the volume. Then we’ll see how it compares to that of an ellipsoid.

If we rotate the graph of a function f(x) around the x-axis with x ranging from c to d, the volume is given by

\pi \int_c^d \left(f(x) \right)^2 \, dx

It works out well that the function f is squared because when we express y explicitly as a function of x we get a square root. Our volume works out to be

b^2 \pi \int_{-a}^a \left(1 - \frac{x^2}{a^2} \right ) \frac{1}{1 + kx}\, dx = \frac{2b^2\pi \left(\left(a^2 k^2-1\right) \tanh ^{-1}(a k)+a k\right)}{a^2 k^3}

We’d like to see how this compares to the volume of an ellipsoid, and that would be easier if we expanded the inverse hyperbolic tangent using its power series

\tanh^{-1}x = x + \frac{x^3}{3} + \frac{x^5}{5} + \cdots

This says our volume is given by

2b^2\pi\left(\frac{2}{3}a + \frac{2}{15}a^3k^2 + \frac{2}{35} a^5 k^4 + \cdots \right )

Note that if abr and k = 0 this reduces to the volume of a sphere of radius r, i.e. 4πr³/3. If a and b are not necessarily equal but k = 0 we get 4πab²/3, the volume of an ellipse.

To first order, k does not effect the volume. That is, k does not appear in the series above except with exponent 2 and higher. This says to first approximation, the volume of an egg (assuming our formula for the shape) is simply that of an ellipsoid with the same major and minor axes. Also, k only appears to even powers. We should have expected that from the previous post since changing the sign of k just flips the egg over and doesn’t change the volume.

To second order, the volume of an egg relative to that of an ellipse is a quadratic function of the parameter k. To change an ellipse into an egg shape, making one end flatter and the other end more pointy, but keeping the length and width the same, you have to add volume. You gain more volume on the flatter end than you lose on the pointier end.

See the next post for a discussion of the surface area of an egg.

Generating Laplace random variables

Differential privacy adds Laplace-distributed random noise to data to protect individual privacy. (More on that here.) Although it’s simple to generate Laplacian random values, the Laplace distribution is not always one of the built-in options for random number generation libraries.

The Laplace distribution with scale β has density

f(x) = \frac{1}{2\beta} \exp\left(-\frac{|x|}{\beta} \right )

The Laplace distribution is also called the double exponential because it looks like two mirror-image exponential distributions glued together.

Note that the scale β is not the standard deviation. The standard deviation is √2 β.

To generate samples from a Laplace distribution with scale β, generate two independent exponential samples with mean β and return their difference.

If you don’t have an API for generating exponential random values, generate uniform random values and return the negative of the log. That will produce exponential values with mean 1. To make random values with mean β, just multiply the results by β.

If you want to generate Laplace values in Python, you could simply use the laplace function in scipy.stats. But I’ll write a generator from scratch just to show what you might do in another environment where you didn’t have exponential or Laplace generators.

    from math import log
    from random import random

    def exp_sample(mean): 
        return -mean*log(random())

    def laplace(scale):
        e1 = exp_sample(scale)
        e2 = exp_sample(scale)
        return e1 - e2

Related: Stand-alone numerical code, useful when you need a few common mathematical functions but are in an environment that doesn’t provide them, or when you want to avoid adding a library to your project.

Painless project management

This evening, after I got off a phone call discussing a project with a colleague, I thought “Huh, I guess you could call that project management.”

I worked as a project manager earlier in my career, but what I’m doing now feels completely different and much more pleasant. Strip away the bureaucracy and politics, and project management just feels like getting work done.

I have several people working for me now, but project management doesn’t take much time. I still spend most of my time working on my own projects.

Ways to connect

If you visit this blog once in a while, here are a few ways to hear from me more regularly.


You can subscribe to the blog via RSS or email.

I often use SVG images because they look great on a variety of devices, but most email clients won’t display that format. If you subscribe by email, there’s always a link to open the article in a browser and see all the images.

I also have a monthly newsletter. It highlights the most popular posts of the month and usually includes a few words about what I’ve been up to.


I have 17 Twitter accounts that post daily on some topic and one personal account.

Twitter icons

The three most popular are CompSciFact, AlgebraFact, and ProbFact. These accounts are a little broader than the names imply. For example, if I run across something that doesn’t fall neatly into another mathematical category, I’ll post it to AlgebraFact. Miscellaneous applied math tends to end up on AnalysisFact.

You can find a list of all accounts and their descriptions here.

I don’t keep up with replies to my topical accounts, but I usually look at replies to my personal account. If you want to be sure I get your message, please call or send me email.

Occasionally people ask whether there’s a team of people behind my Twitter accounts. Nope. Just me. I delegate other parts of my business, but not Twitter. I schedule most of the technical account tweets in advance, but I write them. My personal account is mostly spontaneous.

Contact info

Here’s my contact info.

contact info

My phone number isn’t on there. It’s 832.422.8646. If you’d like, you can import my contact info as a vCard or use the QR code below.

Integrating polynomials over a sphere or ball

Spheres and balls are examples of common words that take on a technical meaning in math, as I wrote about here. Recall the the unit sphere in n dimensions is the set of points with distance 1 from the origin. The unit ball is the set of points of distance less than or equal to 1 from the origin. The sphere is the surface of the ball.

Integrating a polynomial in several variables over a ball or sphere is easy. For example, take the polynomial xy² + 5x²z² in three variables. The integral of the first term, xy², is zero. If any variable in a term has an odd exponent, then the integral of that term is zero by symmetry. The integral over half of the sphere (ball) will cancel out the integral over the opposite half of the sphere (ball). So we only need to be concerned with terms like 5x²z².

Now in n dimensions, suppose the exponents of x1, x2, …, xn are a1, a2, …, an respectively. If any of the a‘s are odd, the integral over the sphere or ball will be zero, so we assume all the a‘s are even. In that case the integral over the unit sphere is simply

2 B(b_1, b_2, \ldots, b_n)


B(b_1, b_2, \ldots, b_n) = \frac{\Gamma(b_1) \Gamma(b_2) \cdots \Gamma(b_n)}{ \Gamma(b_1 + b_2 + \cdots + b_n)}

is the multivariate beta function and for each i we define bi = (ai + 1)/2. When n = 2 then B is the (ordinary) beta function.

Note that the integral over the unit sphere doesn’t depend on the dimension of the sphere.

The integral over the unit ball is

\frac{2 B(b_1, b_2, \ldots, b_n)}{ a_1 + a_2 + \cdots + a_n + n}

which is proportional to the integral over the sphere, where the proportionality constant depends on the sum of the exponents (the original exponents, the a‘s, not the b‘s) and the dimension n.

Note that if we integrate the constant polynomial 1 over the unit sphere, we get the surface area of the unit sphere, and if we integrate it over the unit ball, we get the volume of the unit ball.

You can find a derivation for the integral results above in [1]. The proof is basically Liouville’s trick for integrating the normal distribution density, but backward. Instead of going from rectangular to polar coordinates, you introduce a normal density and go from polar to rectangular coordinates.

[1] Gerald B. Folland, How to Integrate a Polynomial over a Sphere. The American Mathematical Monthly, Vol. 108, No. 5 (May, 2001), pp. 446-448.