Approximating a spiral by rings

An Archimedean spiral has the polar equation

r = b θ^(1/n)

This post will look at the case n = 1. I may look at more general values of n in a future post. The case n = 1 is the simplest case, and it’s the case I needed for the client project that motivated this post.

In this case the spacing between points where the spiral crosses an axis is constant. Call this constant h. Then

h = 2πb.

For example, when rolling up a carpet, h corresponds to the thickness of the carpet.

Suppose θ runs from 0 to 2πm, wrapping around the origin m times. We could approximate the spiral by m concentric circles of radius h, 2h, 3h, …, mh. To visualize this, we’re approximating the length of the red spiral on the left with that of the blue circles on the right.

Comparing Archimedes spiral and concentric circles

We could approximate this further by saying we have m circles whose average radius is πmb. This suggests the length of the spiral should be approximately

2π²m²b.

How good is this approximation? What happens to the relative error as θ increases? Intuitively, each wrap around the origin is more like a circle as θ increases, so we’d expect the approximation to improve for large θ.

According to Mathworld, the exact length of the spiral is

πbm √(1 + (2πm)²) + ½ b arcsinh(2πm).

When m is so large that we can ignore the 1 in √(1 + (2πm)²) then the first term is the same as the circle approximation, and all that’s left is the arcsinh term, which is on the order of log m because

arcsinh(x) = log(x + (1 + x²)^(1/2)).

So for large m, the arc length is on the order of m² while the error is on the order of log m. This means the relative error is O( log(m) / m² ). [1]

We’ve assumed m was an integer because that makes it easier to visualize approximating the spiral by circles, but that assumption is not necessary. We could restate the problem in terms of the final value of θ. Say θ runs from 0 to T. Then we could solve

T = 2πm

for m and say that the approximate arc length is

½ bT²

and the exact length is

½ bT(1 + T²)^(1/2) + ½ b arcsinh(T).

The relative approximation error is O( log(T) / T² ).
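Here is a quick numerical check of the formulas above, as a sketch in Python. The function names and the choice b = 1 are illustrative assumptions, not something from the client project. The last column printed should stay bounded if the relative error really is O( log(T) / T² ).

```python
from math import asinh, log, pi, sqrt

def exact_length(b, T):
    """Arc length of the spiral r = b*theta for theta from 0 to T."""
    return 0.5 * b * (T * sqrt(1 + T * T) + asinh(T))

def approx_length(b, T):
    """Ring approximation to the arc length."""
    return 0.5 * b * T * T

def relative_error(b, T):
    """Relative error of the ring approximation."""
    exact = exact_length(b, T)
    return (exact - approx_length(b, T)) / exact

if __name__ == "__main__":
    b = 1.0  # arbitrary choice; the relative error does not depend on b
    for m in (1, 10, 100, 1000):
        T = 2 * pi * m
        err = relative_error(b, T)
        # Columns: number of turns, relative error, error scaled by T^2 / log(T)
        print(m, err, err * T**2 / log(T))
```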


[1] The error in approximating √(1 + (2πm)²) with 2πm is on the order of 1/(4πm) and so is smaller than the logarithmic term.

Hypergeometric function of a large negative argument

It’s occasionally necessary to evaluate a hypergeometric function at a large negative argument. I was working on a project today that involved evaluating F(a, b; c; z) where z is a large negative number.

The hypergeometric function F(a, b; c; z) is defined by a power series in z whose coefficients are functions of a, b, and c. However, this power series has radius of convergence 1. This means you can’t use the series to evaluate F(a, b; c; z) for z < −1.

It’s important to keep in mind the difference between a function and its power series representation. The former may exist where the latter does not. A simple example is the function f(z) = 1/(1 − z). The power series for this function has radius of convergence 1, but the function is defined everywhere except at z = 1.

Although the series defining F(a, b; c; z) is confined to the unit disk, the function itself is not. It can be extended analytically beyond the unit disk, usually with a branch cut along the real axis for z ≥ 1.

It’s good to know that our function can be evaluated for large negative z, but how do we evaluate it?

Linear transformation formulas

Hypergeometric functions satisfy a huge number of identities, the simplest of which are known as the linear transformation formulas even though they are not linear transformations of z. They involve bilinear transformations of z, a.k.a. fractional linear transformations, a.k.a. Möbius transformations. [1]

One such transformation is the following, found in A&S 15.3.4 [2].

F(a, b; c; z) = (1-z)^{-a} F\left(a, c-b; c; \frac{z}{z-1} \right)

If z < 0, then 0 < z/(z − 1) < 1, which is inside the radius of convergence. However, as z goes off to −∞, z/(z − 1) approaches 1, and the convergence of the power series will be slow.

A more complicated, but more efficient, formula is A&S 15.3.7, a linear transformation formula that relates F at z to two other hypergeometric functions evaluated at 1/z. Now when z is large, 1/z is small, and these series will converge quickly.

\begin{align*} F(a, b; c; z) &= \frac{\Gamma(c) \Gamma(b-a)}{\Gamma(b) \Gamma(c-a)} \,(-z)^{-a\phantom{b}} F\left(a, 1-c+a; 1-b+a; \frac{1}{z}\right) \\ &+ \frac{\Gamma(c) \Gamma(a-b)}{\Gamma(a) \Gamma(c-b)} \,(-z)^{-b\phantom{a}} F\left(\,b, 1-c+b; 1-a+b; \,\frac{1}{z}\right) \\ \end{align*}
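To see the difference in convergence speed, here is a minimal Python sketch that sums the defining power series directly and applies each of the two formulas above. The helper names and the test parameters are my own assumptions for illustration. The sketch assumes z is real and negative, and that a − b is not an integer so that the gamma functions in 15.3.7 are finite.

```python
from math import gamma

def hyp2f1_series(a, b, c, x, tol=1e-15, max_terms=100_000):
    """Sum the defining power series for F(a, b; c; x), valid for |x| < 1.

    Returns the partial sum and the number of terms used.
    """
    term = 1.0
    total = 1.0
    for n in range(max_terms):
        # Ratio of consecutive series terms
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * x
        total += term
        if abs(term) < tol * abs(total):
            return total, n + 2
    raise ArithmeticError("series did not converge")

def via_15_3_4(a, b, c, z):
    """Evaluate F(a, b; c; z) for z < 0 using the argument z/(z - 1)."""
    value, terms = hyp2f1_series(a, c - b, c, z / (z - 1))
    return (1 - z) ** (-a) * value, terms

def via_15_3_7(a, b, c, z):
    """Evaluate F(a, b; c; z) for z < -1 using the argument 1/z."""
    f1, n1 = hyp2f1_series(a, 1 - c + a, 1 - b + a, 1 / z)
    f2, n2 = hyp2f1_series(b, 1 - c + b, 1 - a + b, 1 / z)
    t1 = gamma(c) * gamma(b - a) / (gamma(b) * gamma(c - a)) * (-z) ** (-a) * f1
    t2 = gamma(c) * gamma(a - b) / (gamma(a) * gamma(c - b)) * (-z) ** (-b) * f2
    return t1 + t2, n1 + n2

if __name__ == "__main__":
    a, b, c, z = 0.5, 1.2, 2.3, -100.0  # arbitrary illustrative values
    print(via_15_3_4(a, b, c, z))  # value and number of series terms
    print(via_15_3_7(a, b, c, z))
```

The two values should agree to many digits, with the second formula needing far fewer series terms.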


[1] It turns out these transformations are linear, but not as functions of a complex argument. They’re linear as transformations on a projective space. More on that here.

[2] A&S refers to the venerable Handbook of Mathematical Functions by Abramowitz and Stegun.

More is less

When I first started using Unix, I used a program called “more” to read files. The name makes sense because each time you press the space bar, more will show you more of your file, one screen at a time.

Now everyone uses less, and more is all but forgotten.

Daniel Halbert wrote more in 1978. Mark Nudelman wrote a similar program with more functionality in 1984, which he named less. The name was a pun on the aphorism “less is more” [1]. Soon less completely replaced more.

I’m curious why I ever used more, since less had taken over before I touched Unix. One possibility is that someone who was accustomed to more showed me that command. Another possibility is that I learned it from reading The Unix Programming Environment which came out in November 1983. It includes more but not less.

My laptop contains executables for more and less in /usr/bin. The command

    diff less more

returns nothing, indicating that the binaries are identical: less literally is more.

My desktop has distinct binaries for less and more. The more binary is much smaller, and so I assume it is limited to the original functionality of more, more or less.


[1] I don’t know who coined the phrase “less is more,” but it is associated with architect Ludwig Mies van der Rohe (1886–1969) who often quoted it. He did not apply the principle to his own name, however. He was born Ludwig Mies and later appended van der Rohe.

Precise answers to useless questions

I recently ran across a tweet from Allen Downey saying

So much of 20th century statistics was just a waste of time, computing precise answers to useless questions.

He’s right. I taught mathematical statistics at GSBS [1, 2] several times, and each time I taught it I became more aware of how pointless some of the material was.

I do believe mathematical statistics is useful, even some parts whose usefulness isn’t immediately obvious, but there were other parts of the curriculum I couldn’t justify spending time on [3].

Fun and profit

I’ll say this in partial defense of computing precise answers to useless questions: it can be fun and good for your career.

Mathematics problems coming out of statistics can be more interesting, and even more useful, than the statistical applications that motivate them. Several times in my former career a statistical problem of dubious utility led to an interesting mathematical puzzle.

Solving practical research problems in statistics is hard, and can be hard to publish. If research addresses a practical problem that a reviewer isn’t aware of, a journal may reject it. The solution to a problem in mathematical statistics, regardless of its utility, is easier to publish.

Private sector

Outside of academia there is less demand for precise answers to useless questions. A consultant can be sure that a client finds a specific project useful because they’re willing to directly spend money on it. Maybe the client is mistaken, but they’re betting that they’re not.

Academics get grants for various lines of research, but this isn’t the same kind of feedback because the people who approve grants are not spending their own money. Imagine a grant reviewer saying “I think this research is so important, I’m not only going to approve this proposal, I’m going to write the applicant a $5,000 personal check.”

Consulting clients may be giving away someone else’s money too, but they have a closer connection to the source of the funds than a bureaucrat has when dispensing tax money.


[1] When I taught there, GSBS was The University of Texas Graduate School of Biomedical Sciences. I visited their website this morning, and apparently GSBS is now part of, or at least co-branded with, MD Anderson Cancer Center.

There was a connection between GSBS and MDACC at the time. Some of the GSBS instructors, like myself, were MDACC employees who volunteered to teach a class.

[2] Incidentally, there was a connection between GSBS and Allen Downey: one of my students was his former student, and he told me what a good teacher Allen was.

[3] I talk about utility a lot in this post, but I’m not a utilitarian. There are good reasons to learn things that are not obviously useful. But the appeal of statistics is precisely its utility, and so statistics that isn’t useful is particularly distasteful.

Pure math is beautiful (and occasionally useful) and applied math is useful (and occasionally beautiful). But there’s no reason to study fake applied math that is neither beautiful nor useful.

Pairs in poker

An article by Y. L. Cheung [1] gives several reasons why poker is usually played with five cards. Here I’ll look at just one: pairs don’t behave as you might expect when a hand has more than five cards.

In five-card poker, the more pairs the better. Better here means less likely. One pair is better than no pair, and two pairs is better than one pair. But in six-card or seven-card poker, a hand with no pair is less likely than a hand with one pair.

For a five-card hand, the probabilities of 0, 1, or 2 pair are 0.5012, 0.4226, and 0.0475 respectively.

For a six-card hand, the same probabilities are 0.3431, 0.4855, and 0.1214.

For a seven-card hand, the probabilities are 0.2091, 0.4728, and 0.2216.
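The five-card figures can be reproduced by direct counting. The sketch below assumes a standard 52-card deck and assumes that “no pair” excludes straights and flushes, which is the convention that yields 0.5012. I have not tried to reproduce the six- and seven-card figures, since they depend on how the article classifies longer hands.

```python
from math import comb

def five_card_pair_probabilities():
    """Return the probabilities of no pair, one pair, and two pair."""
    total = comb(52, 5)

    # No pair: five distinct ranks that do not form one of the 10 straights,
    # with suits that are not all the same (no flush).
    no_pair = (comb(13, 5) - 10) * (4**5 - 4)

    # One pair: rank of the pair, two of its four suits,
    # then three other distinct ranks in any suits.
    one_pair = 13 * comb(4, 2) * comb(12, 3) * 4**3

    # Two pair: two ranks for the pairs, suits for each pair,
    # then a fifth card of a different rank.
    two_pair = comb(13, 2) * comb(4, 2) ** 2 * 11 * 4

    return no_pair / total, one_pair / total, two_pair / total

if __name__ == "__main__":
    for p in five_card_pair_probabilities():
        print(round(p, 4))
```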


[1] Y. L. Cheung. Why Poker Is Played with Five Cards. The Mathematical Gazette, Vol. 73, No. 466 (Dec. 1989), pp. 313–315.

Solar system means

Yesterday I stumbled on the fact that the size of Jupiter is roughly the geometric mean between the sizes of Earth and the Sun. That’s not too surprising: in some sense (i.e. on a logarithmic scale) Jupiter is the middle-sized object in our solar system.

What I find more surprising is that a systematic search finds mean relationships that are far more accurate. The radius of Jupiter is within 5% of the geometric mean of the radii of the Earth and Sun. By contrast, all but one of the mean relations below have an error of less than 1%.

\begin{eqnarray*} R_\Mercury &=& \mbox{GM}\left(R_\Moon, R_\Mars\right) \\ R_\Mars &=& \mbox{HM}\left(R_\Moon, R_\Jupiter\right) \\ R_\Uranus &=& \mbox{AGM}\left(R_\Earth, R_\Saturn\right) \\ \end{eqnarray*}

The radius of Mercury equals the geometric mean of the radii of the Moon and Mars, within 0.052%.

The radius of Mars equals the harmonic mean of the radii of the Moon and Jupiter, within 0.08%.

The radius of Uranus equals the arithmetic-geometric mean of the radii of Earth and Saturn, within 0.0018%.

See the links below for more on AM, GM, and AGM.
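For reference, the means used in this post can be sketched in a few lines of Python. The function names are my own, and the AGM is computed by the usual iteration of replacing the pair with its arithmetic and geometric means. No planetary data is included here; plug in radii or masses from whatever source you trust.

```python
from math import isclose, sqrt

def arithmetic_mean(a, b):
    return (a + b) / 2

def geometric_mean(a, b):
    return sqrt(a * b)

def harmonic_mean(a, b):
    return 2 * a * b / (a + b)

def agm(a, b):
    """Arithmetic-geometric mean of two positive numbers."""
    for _ in range(100):  # converges quadratically; the cap is a safeguard
        if isclose(a, b, rel_tol=1e-15):
            break
        a, b = (a + b) / 2, sqrt(a * b)
    return (a + b) / 2

def relative_error(actual, estimate):
    """Relative difference between a measured value and a mean-based estimate."""
    return abs(actual - estimate) / actual
```

For any two positive numbers these satisfy HM ≤ GM ≤ AGM ≤ AM, which is a handy sanity check.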

Now let’s look at masses.

\begin{eqnarray*} M_\Earth &=& \mbox{GM}\left(M_\Mercury, M_\Neptune\right) \\ M_\Pluto &=& \mbox{HM}\left(M_\Moon, M_\Mars\right) \\ M_\Uranus &=& \mbox{AGM}\left(M_\Moon, M_\Saturn\right) \\ \end{eqnarray*}

The mass of Earth is the geometric mean of the masses of Mercury and Neptune, within 2.75%. This is the least accurate of the six relations.

The mass of Pluto is the harmonic mean of the masses of the Moon and Mars, within 0.7%.

The mass of Uranus is the arithmetic-geometric mean of the masses of the Moon and Saturn, within 0.54%.

Related posts

Earth : Jupiter :: Jupiter : Sun

The size of Jupiter is approximately the geometric mean of the sizes of Sun and Earth.

In terms of radii,

\frac{R_\Sun}{R_{\text{\Jupiter}}} \approx \frac{R_\Jupiter}{R_\Earth}

The ratio on the left equals 9.95 and the ratio on the right equals 10.98.

The subscripts are the astronomical symbols for the Sun (☉, U+2609), Jupiter (♃, U+2643), and Earth (🜨, U+1F728). I produced them in LaTeX using the mathabx package and the commands \Sun, \Jupiter, and \Earth.

The mathabx symbol for Jupiter is a little unusual. It looks italicized, but that’s not because the symbol is being used in math mode. Notice that the vertical bar in the symbol for Earth is vertical, i.e. not italicized.


Gravity on Jupiter

NASA image of Jupiter

I was listening to the latest episode of the Space Rocket History podcast. The show includes some audio from a documentary on Pioneer 11 that mentioned that a man would weigh 500 pounds on Jupiter.

My immediate thought was “Is that all?! Is this ‘man’ a 100 pound boy?”

The documentary was correct and my intuition was wrong. And the implied mass of the man in the documentary is 190 pounds.

Jupiter has more than 300 times the mass of the earth. Why is its surface gravity only 2.6 times that of the earth?

Although Jupiter is very massive, it is also very large. Gravitational attraction is proportional to mass, but inversely proportional to the square of distance.

A satellite in orbit 100,000 km from the center of Jupiter would feel 300 times as much gravity as one in orbit the same distance from the center of Earth. But the surface of Jupiter is further from its center of mass than the surface of Earth is from its center of mass.

The mass of Jupiter is 318 times that of Earth, and its mean radius is 11 times that of Earth. So the ratio of gravity on the surface of Jupiter to gravity on the Earth’s surface is

318 / 11² = 2.63

Now suppose a planet had the same density as Earth but a radius of r Earth radii. Then its mass would be r³ times greater, but its surface gravity would only be r times greater since gravity follows an inverse square law. So if Jupiter were made of the same stuff as Earth, its surface gravity would be 11 times greater. But Jupiter is a gas giant, so its surface gravity is only 2.6 times greater.
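The arithmetic in this post fits in a few lines of Python. This is only a sketch using the rounded ratios quoted above (318 and 11); the function names are made up for illustration.

```python
def surface_gravity_ratio(mass_ratio, radius_ratio):
    """Surface gravity relative to Earth, given mass and radius in Earth units."""
    return mass_ratio / radius_ratio**2

def same_density_gravity_ratio(radius_ratio):
    """Surface gravity of a planet with Earth's density and the given radius."""
    return surface_gravity_ratio(radius_ratio**3, radius_ratio)

if __name__ == "__main__":
    jupiter = surface_gravity_ratio(318, 11)
    print(round(jupiter, 2))               # gravity on Jupiter relative to Earth
    print(round(500 / jupiter))            # Earth weight of someone weighing 500 lb on Jupiter
    print(same_density_gravity_ratio(11))  # an Earth-density planet of Jupiter's radius
```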


Are guidance documents laws?

Are guidance documents laws? No, but they can have legal significance.

The people who generate regulatory guidance documents are not legislators. Legislators delegate to agencies to make rules, and agencies delegate to other organizations to make guidelines. For example [1],

Even HHS, which has express cybersecurity rulemaking authority under the Health Insurance Portability and Accountability Act (HIPAA), has put a lot of the details of what it considers adequate cybersecurity into non-binding guidelines.

I’m not a lawyer, so nothing I say should be considered legal advice. However, the authors of [1] are lawyers.

The legal status of guidance documents is contested. According to [2], Executive Order 13892 said that agencies

may not treat noncompliance with a standard of conduct announced solely in a guidance document as itself a violation of applicable statutes or regulations.

Makes sense to me, but EO 13992 revoked EO 13892.

Again according to [3],

Under the common law, it used to be that government advisories, guidelines, and other non-binding statements were non-binding hearsay [in private litigation]. However, in 1975, the Fifth Circuit held that advisory materials … are an exception to the hearsay rule … It’s not clear if this is now the majority rule.

In short, it’s fuzzy.


[1] Jim Dempsey and John P. Carlin. Cybersecurity Law Fundamentals, Second Edition, page 245.

[2] Ibid., page 199.

[3] Ibid., page 200.


More Laguerre images

A week or two ago I wrote about Laguerre’s root-finding method and made some associated images. This post gives a couple more examples.

Laguerre’s method is very robust in the sense that it is likely to converge to a root, regardless of the starting point. However, it may be difficult to predict which root the method will end up at. To visualize this, we color points according to which root they converge to.

First, let’s look at the polynomial

(x − 2)(x − 4)(x − 24)

which clearly has roots at 2, 4, and 24. We’ll generate random starting points and color them blue, orange, or green depending on whether they converge to 2, 4, or 24. Here’s the result.

To make this easier to see, let’s split it into each color: blue, orange, and green.

Now let’s change our polynomial by moving the root at 4 to 4i.

(x − 2)(x − 4i)(x − 24)

Here’s the combined result.

And here is each color separately.

As we explained last time, the area taken up by the separate colors seems to exceed the total area. That is because the colors are so intermingled that many of the dots in the images cover some territory that belongs to another color, even though the dots are quite small.
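For readers who want to experiment, here is a minimal sketch of the classification step in Python. It is not the code used to make the images. The helper names, the stopping rule, and the square that starting points are drawn from are all assumptions of mine. It runs the textbook Laguerre iteration from random complex starting points and records which root each one lands nearest.

```python
import cmath
import random

def coeffs_from_roots(roots):
    """Expand the product of (x - r) into coefficients, highest degree first."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial by (x - r)
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

def poly_and_derivs(coeffs, x):
    """Evaluate p(x), p'(x), and p''(x) with Horner's rule."""
    p = dp = ddp = 0
    for c in coeffs:
        ddp = ddp * x + 2 * dp
        dp = dp * x + p
        p = p * x + c
    return p, dp, ddp

def laguerre(coeffs, x, tol=1e-12, max_iter=100):
    """Run Laguerre's iteration from x and return the final iterate."""
    n = len(coeffs) - 1
    for _ in range(max_iter):
        p, dp, ddp = poly_and_derivs(coeffs, x)
        if p == 0:
            return x
        G = dp / p
        H = G * G - ddp / p
        root = cmath.sqrt((n - 1) * (n * H - G * G))
        # Choose the sign that gives the denominator of larger magnitude
        denom = max(G + root, G - root, key=abs)
        if denom == 0:
            x += 1e-6  # nudge off a degenerate point and try again
            continue
        step = n / denom
        x -= step
        if abs(step) <= tol * max(1.0, abs(x)):
            return x
    return x

def root_index(roots, x):
    """Index of the root closest to x."""
    return min(range(len(roots)), key=lambda i: abs(x - roots[i]))

if __name__ == "__main__":
    roots = [2, 4, 24]  # use [2, 4j, 24] for the second polynomial
    coeffs = coeffs_from_roots(roots)
    random.seed(0)
    counts = [0] * len(roots)
    for _ in range(1000):
        z0 = complex(random.uniform(-30, 30), random.uniform(-30, 30))
        counts[root_index(roots, laguerre(coeffs, z0))] += 1
    print(counts)  # how many starting points ended at each root
```

To make a picture, plot each starting point in the color assigned to its root index.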