Inverted sense of risk

Watching the news gives you an inverted sense of risk.

We fear bad things that we’ve seen on the news because they make a powerful emotional impression. But the things rare enough to be newsworthy are precisely the things we should not fear. Conversely, the risks we should be concerned about are the ones that happen too frequently to make the news.

Read More

Haskell analog of Sweave and Pweave

Sweave and Pweave are programs that let you embed R and Python code respectively into LaTeX files. You can display the source code, the result of running the code, or both.

lhs2TeX is roughly the Haskell analog of Sweave and Pweave.  This post takes the sample code I wrote for Sweave and Pweave before and gives a lhs2TeX counterpart.

%include polycode.fmt
%options ghci

Invisible code that sets the value of the variable $a$.

a = 3.14

Visible code that sets $b$ and squares it. 

(There doesn't seem to be a way to display the result of a block of code directly. 
Seems you have to save the result and display it explicitly in an eval statement.)

b = 3.15
c = b*b

$b^2$ = \eval{c}

Calling Haskell inline: $\sqrt{2} = \eval{sqrt 2}$

Recalling the variable $a$ set above: $a$ = \eval{a}.


If you save this code to a file foo.lhs, you can run

lhs2TeX -o foo.tex foo.lhs

to create a LaTeX file foo.tex which you could then compile with pdflatex.

One gotcha that I ran into is that your .lhs file must contain at least one code block, though the code block may be empty. You cannot just have code in \eval statements.

Unlike R and Python, the Haskell language itself has a notion of literate programming. Haskell specifies a format for literate comments. lhs2TeX is a popular tool for processing literate Haskell files but not the only one.

Read More

A subway topologist

One of my favorite books when I was growing up was the Mathematics volume in the LIFE Science Library. I didn’t own the book, but my uncle did, and I’d browse through the book whenever I visited him. I was too young at the time to understand much of what I was reading.

One of the pages that stuck in my mind was a photo of Samuel Eilenberg. His name meant nothing to me at the time, but the caption titled “A subway topologist” caught my imagination.

… Polish-born Professor Samuel Eilenberg sprawls contemplatively in his Greenwich Village apartment in New York City. “Sometimes I like to think lying down,” he says, “but mostly I like to think riding on the subway.” Mainly he thinks about algebraic topology — a field so abstruse that even among mathematicians few understand it. …

I loved the image of Eilenberg staring intensely at the ceiling or riding around on a subway thinking about math. Since then I’ve often thought about math while moving around, though usually not on a subway. I’ve only lived for a few months in an area with a subway system.

The idea that a field of math would be unknown to many mathematicians sounded odd. I had no idea at the time that mathematicians specialized.

Algebraic topology doesn’t seem so abstruse now. It’s a routine graduate course and you might get an introduction to it in an undergraduate course. The book was published in 1963, and I suppose algebraic topology would have been more esoteric at the time.


Read More

Making change

How many ways can you make change for a dollar? This post points to two approaches to the problem, one computational and one analytic.

SICP gives a Scheme program to solve the problem:

(define (count-change amount) (cc amount 5))

(define (cc amount kinds-of-coins)
    (cond ((= amount 0) 1)
    ((or (< amount 0) (= kinds-of-coins 0)) 0)
    (else (+ (cc amount
                 (- kinds-of-coins 1))
             (cc (- amount

(define (first-denomination kinds-of-coins)
    (cond ((= kinds-of-coins 1) 1)
          ((= kinds-of-coins 2) 5)
          ((= kinds-of-coins 3) 10)
          ((= kinds-of-coins 4) 25)
          ((= kinds-of-coins 5) 50)))

Concrete Mathematics explains that the number of ways to make change for an amount of n cents is the coefficient of z^n in the power series for the following:

\frac{1}{(1 - z)(1 - z^5)(1 - z^{10})(1 - z^{25})(1 - z^{50})}

Later on the book gives a more explicit but complicated formula for the coefficients.

Both show that there are 292 ways to make change for a dollar.

Read More

A puzzle puzzle

Jigsaw puzzles that say they have 1,000 pieces have approximately 1,000 pieces, but probably not exactly 1,000. Jigsaw puzzle pieces are typically arranged in a grid, so the number of pieces along a side has to be a divisor of the total number of pieces. This means there aren’t very many ways to make a puzzle with exactly 1,000 pieces, and most have awkward aspect ratios.

Since jigsaw pieces are irregularly shaped, it may be surprising to learn that the pieces are actually arranged in a regular grid. At least they usually are. There are exceptions such as circular puzzles or puzzles that throw in a couple small pieces that throw off the grid regularity.

How many aspect ratios can you have with a rectangular grid of 1,000 points? Which ratio comes closest to the golden ratio? More generally, answer the same questions with 10^n points for positive integer n.

More puzzles:

A knight’s random walk
Peculiar property of 3909511
Roman numeral problem
A perspective problem

Read More

Ellipsoid surface area

How much difference does the earth’s equatorial bulge make in its surface area?

To first approximation, the earth is a sphere. The next step in sophistication is to model the earth as an ellipsoid.

The surface area of an ellipsoid with semi-axes abc is

A = 2\pi \left( c^2 + \frac{ab}{\sin\phi} \left( E(\phi, k) \sin^2\phi + F(\phi, k) \cos^2 \phi\right)\right)




m = k^2 = \frac{a^2(b^2 - c^2)}{b^2(a^2 - c^2)}

The functions E and F are incomplete elliptic integrals

 F(\phi, k) = \int_0^\phi \frac{d\theta}{\sqrt{1 - k^2 \sin^2\theta}}


E(\phi, k) = \int_0^\phi \sqrt{1 - k^2 \sin^2\theta}\,d\theta

implemented in SciPy as ellipeinc and ellipkinc. Note that the SciPy functions take m as their second argument rather its square root k.

For the earth, a = b and so m = 1.

The following Python code computes the ratio of earth’s surface area as an ellipsoid to its area as a sphere.

from scipy import pi, sin, cos, arccos
from scipy.special import ellipkinc, ellipeinc

# values in meters based on GRS 80
equatorial_radius = 6378137
polar_radius = 6356752.314140347

a = b = equatorial_radius
c = polar_radius

phi = arccos(c/a)
# in general, m = (a**2 * (b**2 - c**2)) / (b**2 * (a**2 - c**2))
m = 1 

temp = ellipeinc(phi, m)*sin(phi)**2 + ellipkinc(phi, m)*cos(phi)**2
ellipsoid_area = 2*pi*(c**2 + a*b*temp/sin(phi))

# sphere with radius equal to average of polar and equatorial
r = 0.5*(a+c)
sphere_area = 4*pi*r**2


This shows that the ellipsoid model leads to 0.112% more surface area relative to a sphere.

Source: See equation 19.33.2 here.

Update: It was suggested in the comments that it would be better to compare the ellipsoid area to that of a sphere of the same volume. So instead of using the average of the polar and equatorial radii, one would take the geometric mean of the polar radius and two copies of the equatorial radius. Using that radius, the ellipsoid has 0.0002% more area than the sphere.

Read More

Iterative linear solvers as metaphor

Gaussian elimination is systematic way to solve systems of linear equations in a finite number of steps. Iterative methods for solving linear systems require an infinite number of steps in theory, but may find solutions faster in practice.

Gaussian elimination tells you nothing about the final solution until it’s almost done. The first phase, factorization, takes O(n^3) steps, where n is the number of unknowns. This is followed by the back-substitution phase which takes O(n^2) steps. The factorization phase tells you nothing about the solution. The back-substitution phase starts filling in the components of the solution one at a time. In application n is often so large that the time required for back-substitution is negligible compared to factorization.

Iterative methods start by taking a guess at the final solution. In some contexts, this guess may be fairly good. For example, when solving differential equations, the solution from one time step gives a good initial guess at the solution for the next time step. Similarly, in sequential Bayesian analysis the posterior distribution mode doesn’t move much as each observation arrives. Iterative methods can take advantage of a good starting guess while methods like Gaussian elimination cannot.

Iterative methods take an initial guess and refine it to a better approximation to the solution. This sequence of approximations converges to the exact solution. In theory, Gaussian elimination produces an exact answer in a finite number of steps, but iterative methods never produce an exact solution after any finite number of steps. But in actual computation with finite precision arithmetic, no method, iterative or not, ever produces an exact answer. The question is not which method is exact but which method produces an acceptably accurate answer first. Often the iterative method wins.

Successful projects often work like iterative numerical methods. They start with an approximation solution and iteratively refine it. All along the way they provide a useful approximation to the final product. Even if, in theory, there is a more direct approach to a final product, the iterative approach may work better in practice.

Algorithms iterate toward a solution because that approach may reach a sufficiently accurate result sooner. That may apply to people, but more important for people is the psychological benefit of having something to show for yourself along the way. Also, iterative methods, whether for linear systems or human projects, are robust to changes in requirements because they are able to take advantage of progress made toward a slightly different goal.

Related post: Ten surprises from numerical linear algebra

Read More

Multiple zeta

The Riemann zeta function, introduced by Leonard Euler, is defined by

\zeta(k) = \sum n^{-k}

where the sum is over all positive integers n.

Euler also introduced a multivariate generalization of the zeta function

\zeta(k_1, \ldots, k_r) = \sum n_1^{-k_1}\cdots n_r^{-k_r}

where the sum is over all decreasing k-tuples of positive integers. This generalized zeta function satisfies the following beautiful identity:

 \zeta(a)\,\zeta(b) = \zeta(a, b) + \zeta(b, a) + \zeta(a+b)

The multivariate zeta function and identities such as the one above are important in number theory and are the subject of open conjectures.


Read More

Swedish superellipse

Sergels torg, Stockholm, Sweden. Photo via Wikipedia.

The fountain in Sergels torg (English: Sergel’s Square) in Sockholm is in the shape of a “superellipse.” Piet Hein proposed this shape as a practical and aesthetically pleasing compromise between a circle and a rectangle.

What Piet Hein dubbed a superellipse is a curve satisfying

|x/a|2.5 + |y/b|2.5 = 1

In the case of Sergels torg, a/b = 6/5. A superellipse is what an analyst would call an Lp ball with p = 2.5 and rescaled axes.

Incidentally, I used curves of this form in a dose-finding method a few years ago. A clinical trial design would pick a, b, and p to match a physician’s desired trade-off between probability of efficacy and probability of toxicity.

Update: A few minutes after I posted this, I realized that there was a bowl on my desk shaped like a superellipse:

The bowl came from Ikea, so there’s another connection to Sweden.

Related post: Volumes of Lp balls

Read More

Haskell is to math as Perl is to English?

Fortran superficially looks like mathematics. Its name comes from “FORmula TRANslation,” and the language does provide a fairly straight-forward way to turn formulas into code. But the similarity to mathematics ends there at the lowest level of abstraction.

Haskell, on the other hand, is more deeply mathematical. Fortran resembles math in the small, but Haskell resembles math in the large. Haskell has mathematical structures at a higher level of abstraction than just formulas. As a recent article put it

While Fortran provides a comfortable translation of mathematical formulas, Haskell code begins to resemble mathematics itself.

On its surface, Perl is one of the least English-like programming languages. It is often criticized as looking like “line noise” because of its heavy use of operators. By contrast, Python has been called “executable pseudocode” because the source is easier to read, particularly for novice programmers. And yet at a deeper level, Perl is more English-like than other programming languages such as Python.

Larry Wall explains in Natural Language Principles in Perl that he designed Perl to incorporate features of spoken language. For example, Perl has analogs of articles and pronouns. (Larry Wall studied both linguistics and computer science in college.) Opinions differ on how well his experiment with incorporating natural language features into programming languages has worked out, but it was an interesting idea.

Related posts:

Haskell / Category theory decoder ring

Programming languages and magic

Extreme syntax

Three-hour-a-week language

Read More

An algebra and a calculus

One generation develops an area of math and a second generation comes along to tidy up the definitions. Sometimes this leads to odd vocabulary because the new generation wants to retain terms even as it gives them different meanings.

In high school you get used to hearing about “algebra” and “calculus” as subjects, then the first time you hear someone say “an algebra” or “a calculus” it sounds bizarre. Similarly, once you’re used to hearing about logic, statistics, or topology, it sounds exotic to hear someone say “a logic,” “a statistic,” or “a topology,” something like hearing “a speed of light.”

When I signed up for a topology course, visions of “rubber sheet geometry” danced in my head. I was taken aback when the course started out by defining a topology to be a collection of sets closed under finite intersections and arbitrary unions. Where did the rubber sheet go?

Sometimes when a subject takes on an indefinite article, the new definition comes early in the development. For example, “a statistic” is any function of a random sample, a disappointing definition that comes early in a statistics course. But sometimes the definition with the indefinite article comes much later.

You can take a couple courses in algebra in high school and even a college course in algebra (i.e. groups, rings, and fields) without ever hearing the definition of “an algebra.” And you can take several courses in calculus without ever hearing of “a calculus.”

The phrase “an algebra” can have several meanings, and the phrase “a calculus” even more. The core definition of an algebra is a vector space with a bilinear product. But there are more general ideas of an algebra, some so general that they mean “things and operations on things.” (That’s basically universal algebra.)

Often you put an adjective in front of algebra to specify what kind of algebra you’re talking about: Banach algebra, σ-algebra, Boolean algebra, etc. Usually in mathematics, an adjective in front of a noun narrows the definition. For example, an Abelian group is a particular kind of group. But often an algebra with a qualifier, such as a σ-algebra, is not an algebra by what I called the core definition above.

The idea of “a calculus” is even more vague. If it has a formal definition, I’ve never seen it. A calculus is just a subject with a set of useful rules for calculating things. Sometimes there’s a strong analogy to calculus in the sense of differential calculus, as in the calculus of finite differences, a.k.a. umbral calculus. Sometimes the analogy is more of a stretch, as in λ calculus or π calculus.

So what is the point of this post? I don’t really have one. Just throwing out some half-baked musings.

Read More

Electrical hum

If you hear electrical equipment humming, it’s probably at a pitch of about 60 Hz since that’s the frequency of AC power, at least in North America. In Europe and most of Asia it’s a little lower at 50 Hz. Here’s an audio clip in a couple formats: wav, mp3.

The screen shot above comes from a tuner app taken when I was around some electrical equipment. The pitch sometimes registered at A# and sometimes as B, and for good reason. In a previous post I derived the formula for converting frequencies to musical pitches:

h = 12 log(P / C) / log 2.

Here C is the pitch of middle C, 261.626 Hz, P is the frequency of your tone, and h is the number of half steps your tone is above middle C. When we stick P = 60 Hz into this formula, we get h = -25.49, so our electrical hum is half way between 25 and 26 half-steps below middle C. So that’s between a A# and a B two octaves below middle C.

For 50 Hz hum, h = -28.65. That would be between a G and a G#, a little closer to G.

Update: So why would the frequency of the sound match the frequency of the electricity? The magnetic fields generated by the current would push and pull parts, driving mechanical vibrations at the same frequency.

Read More

Distribution of a range

Suppose you’re drawing random samples uniformly from some interval. How likely are you to see a new value outside the range of values you’ve already seen?

The problem is more interesting when the interval is unknown. You may be trying to estimate the end points of the interval by taking the max and min of the samples you’ve drawn. But in fact we might as well assume the interval is [0, 1] because the probability of a new sample falling within the previous sample range does not depend on the interval. The location and scale of the interval cancel out when calculating the probability.

Suppose we’ve taken n samples so far. The range of these samples is the difference between the 1st and the nth order statistics, and for a uniform distribution this difference has a beta(n-1, 2) distribution. Since a beta(a, b) distribution has mean a/(a+b), the expected value of the sample range from n samples is (n-1)/(n+1). This is also the probability that the next sample, or any particular future sample, will lie within the range of the samples seen so far.

If you’re trying to estimate the size of the total interval, this says that after n samples, the probability that the next sample will give you any new information is 2/(n+1). This is because we only learn something when a sample is less than the minimum so far or greater than the maximum so far.



Read More