Posts tagged as:

Math

Leading digits of factorials

by John on October 19, 2011

Suppose you take factorials of a lot of numbers and look at the leading digit of each result. You could argue that there’s no apparent reason that any digit would be more common than any other, so you’d expect each of the digits 1 through 9 would come up 1/9 of the time. Sounds plausible, but it’s wrong.

The leading digits of factorials follow Benford’s law as described in the previous post. In fact, factorials follow Benford’s law even better than physical constants do. Here’s a graph of the leading digits of the factorials of 1 through 500.

In the remainder of this post, I’ll explain why Benford’s law should apply to factorials, make an aside on statistics, and point out an interesting feature of the Python code used to generate the chart above.

Why Benford’s law applies

Here’s a hand-waving explanation. One way to justify Benford’s law is to say that physical constants are uniformly distributed, but on a logarithmic scale. The same is true for factorials, and it’s easier to see why.

The leading digits of the logarithms depend on on their logarithms in base 10. The gamma function extends the factorial function and it is log-convex. The logarithm of the gamma function is fairly flat (see plot here), and so the leading digits of the log-gamma function applied to integers are uniformly distributed on a logarithmic scale.  (I’ve mixed logs base 10 and natural logs here, but that doesn’t matter. All logarithms are the same up to a multiplicative constant. So if a plot is nearly linear on a log10 scale, it’s nearly linear on a natural log scale.)

Update: Graham gives a link in the comments below to a paper proving that factorials satisfy Benford’s law exactly in the limit.

Uniform on what scale?

This example brings up an important principle in statistics. Some say that if you don’t have a reason to assume anything else, use a uniform distribution. For example, some say that a uniform prior is the ideal uninformative prior for Bayesian statistics. But you have to ask “Uniform on what scale?” It turns out that the leading digits of physical constants and factorials are indeed uniformly distributed, but on a logarithmic scale.

Python integers and floating point

I used nearly the same code to produce the chart above as I used in its counterpart in the previous post. However, one thing had to change: I couldn’t compute the leading digits of the factorials the same way. Python has extended precision integers, so I can compute 500! factorial without overflowing. Using floating point numbers, I could only go up to 170!. But when I used my previous code to find the leading digit, it first tried to apply log10 to an integer larger than the largest representable floating point number and failed. Converting numbers such as 500! to floating point numbers will overflow. (See Floating point numbers are a leaky abstraction.)

The solution was to find the leading digit using only integer operations.

def leading_digit_int(n):
    while n > 9:
        n = n/10
    return n

This code works fine for numbers like 500! or even larger.

Related posts:

Benford’s law and SciPy
Physical constants and factorials
Anatomy of a floating point number

{ 13 comments }

Moby Dick and the tautochrone

by John on October 15, 2011

The tautochrone is a curve such that a ball rolling down the curve takes the same amount of time to reach the bottom, no matter where along the curve it starts. (The name comes from the Greek tauto for same and chrono for time.) It doesn’t sound like such a curve should be possible because balls starting further up the curve have longer to travel. However, balls starting higher also have more potential energy, and so they travel further but faster. See the video below for a demonstration.

[The video is entitled "brachistochrone race" rather than "tautochrone race." The brachistochrone problem is to find the curve of fastest descent. But its solution is the same curve as the tautochrone. So different problems, same solution.]

I first heard of the tautochrone as a differential equation problem to find its equation. But someone could run into it in an American literature class.

Clifford Pickover’s new book The Physics Book has a chapter on the tautochrone. (In this book, “chapters” are only two pages: one page of prose and one full-page illustration.) Pickover points out a passage in Moby Dick that discusses a bowl called a try-pot that is shaped like a tautochrone in the radial direction.

[The try-pot] is a place also for profound mathematical meditation. It was in the left hand try-pot of the Pequod, with the soapstone diligently circling round me, that I was first indirectly struck by the remarkable fact, that in geometry all bodies gliding along the cycloid, my soapstone for example, will descend from any point in precisely the same time.

{ 5 comments }

Rational right triangles

by John on October 13, 2011

Suppose you have a right triangle and every side has rational length. What can you say about the area? For example, is it possible that such a triangle could have area 1?

It turns out the answer is no, and here’s a proof. Suppose you had a right triangle with sides of length a, b, and c with c being the length of the hypotenuse. And suppose a, b, and c are all rational numbers.

Start with the equation

(a2b2)2 = (a2 + b2)2 – 4a2b2

Suppose the area of the triangle is 1. Then ab/2 = 1 and so ab = 2. Use this and the Pythagorean theorem to rewrite the right side of the equation above. Now we have

(a2b2)2 = c4 – 16

But this result contradicts a theorem of Fermat: in rational numbers, the difference of two fourth powers cannot be a square.

So a rational right triangle cannot have area 1. Also, it cannot have area r2 for any rational number r. (If it did, you could divide each side by r and have a rational triangle with area 1.)

Now things are about to get a lot deeper.

What values can the area of a rational right triangle take on? Euler called such numbers congruent. Determining easily testable criteria for congruent numbers is an open problem. However, if the Birch and Swinnerton-Dyer conjecture is correct, then an algorithm of Jerrold Tunnell resolves the congruent number problem. (Incidentally, the Clay Mathematics Institute has placed a $1,000,000 bounty on the Birch and Swinnerton-Dyer conjecture.)

What if instead of limiting the problem to rational right triangles we considered all triangles with rational sides? See this paper for such a generalization.

{ 15 comments }

Just an approximation

by John on September 30, 2011

I find it amusing when I hear someone say something is “just an approximation” because their “exact” answer is invariably “just an approximation” from someone else’s perspective. When someone says “mere approximation” they often mean “that’s not the kind of approximation my colleagues and I usually make” or “that’s not an approximation I understand.”

For example, I once audited a class in celestial mechanics. I was surprised when the professor spoke with disdain about some analytical technique as a “mere approximation” since his idea of “exact” only extended to Newtonian physics. I don’t recall the details, but it’s possible that the disreputable approximation introduced no more error than the decision to only consider point masses or to ignore relativity. In any case, the approximation violated the rules of the game.

Statisticians can get awfully uptight about numerical approximations. They’ll wring their hands over a numerical routine that’s only good to five or six significant figures but not even blush when they approximate some quantity by averaging a few hundred random samples. Or they’ll make a dozen gross simplifications in modeling and then squint over whether a p-value is 0.04 or 0.06.

The problem is not accuracy but familiarity. We all like to draw a circle around our approximation of reality and distrust anything outside that circle. After a while we forget that our approximations are even approximations.

This applies to professions as well as individuals. All is well until two professional cultures clash. Then one tribe will be horrified by an approximation another tribe takes for granted. These conflicts can be a great reminder of the difference between trying to understand reality and playing by the rules of a professional game.

Related posts:

Works well versus well understood
Software exoskeletons
The trouble with wizards

{ 10 comments }

Counting magic squares

by John on September 12, 2011

How many k × k magic squares are possible? If you start from a liberal definition of magic square, there’s an elegant result. For the purposes of this post, a magic square is a square arrangement of non-negative numbers such that the rows and columns all sum to the same non-negative number m called the magic constant. Note that this allows the possibility that numbers will be repeated, and this places no restriction on the diagonals.

With this admittedly non-standard definition, the number of k × k magic squares with magic constant m is always a polynomial in m of degree no more than (k – 1)2. For k = 3, the result is

(m + 1)(m + 2)(m2 + 3m + 4)/8

There is no general formula for all k, but there is an algorithm for finding a formula for each value of k.

Source: The Concrete Tetrahedron

Update: I had reported the polynomial degree as k + 1, but looking back at Concrete Tetrahedron, that book lists the order as (k + 1)2. However, the paper cited in the comments lists the exponent as (k – 1)2, which I believe is correct.

Related posts:

Knight’s magic tour
King’s magic tour
Jupiter’s magic square

{ 8 comments }

Normal subgroups are not transitive

by John on September 7, 2011

The property “is a normal subgroup of” is not transitive. [click to continue...]

{ 0 comments }

Anti-calculus proposition of Erdős

by John on September 6, 2011

The “anti-calculus proposition” is a little result by Paul Erdős that contrasts functions of a real variable and functions of a complex variable.

A standard calculus result says the derivative of a function is zero where the function takes on its maximum. The anti-calculus proposition says that for analytic functions, the derivative cannot be zero at the maximum.

To be more precise, a differentiable real-valued function on a closed interval takes on its maximum where the derivative is zero or at one of the ends. It’s possible that the maximum occurs at one of the ends of the interval and the derivative is zero there.

The anti-calculus proposition says that the analogous situation cannot occur for functions of a complex variable. Suppose a function f is analytic on a closed disk and suppose that f is not constant. Then |f| must take on its maximum somewhere on the boundary by the maximum modulus theorem. Erdős’ anti-calculus proposition adds that at the point on the boundary where |f| takes on its maximum, the derivative cannot be zero.

Related posts:

Six degrees of Paul Erdős
Easy to guess, hard to prove
Publish or perish

{ 9 comments }

When asked why he robbed banks, Willie Sutton famously replied “Because that’s where the money is.”

If you read about data analysis in high dimensions, you might hear someone say they’re focused on a thin shell because that’s where the probability is. For a multivariate normal distribution in high dimensions, nearly all the probability mass is concentrated in a thin shell some distance away from the origin.

What does that mean? Why is it true? How thin is the shell and what is its radius?

It seems absurd to say the probability is concentrated in a shell. The multivariate normal density looks has its greatest value at the origin and quickly decays as you move out in any direction. So most of the probability must be near the origin, right? No, because mass equals density times volume. The density decays quickly as you move away from the origin, but volume increase quickly. The product of the two is greatest at some radius away from the origin. That’s the shell.

The volume of a sphere in d dimensions is proportional to rd, so volume increases very quickly if d is large. For example, if d = 100, how much of the volume of a unit sphere is between a distance of 0.99 and 1 from the origin? Since 1100 – 0.99100 = 0.634, this says 63.4% of the volume is in the outer shell of thickness 0.01.

Since volume of a sphere is proportional to rd, the volume of a shell of radius r and thickness Δr is roughly proportional to d rd-1 Δr. When you multiply that volume by the probability density exp( -r2 / 2 ) you get that the probability mass in the shell is proportional to

rd-1 exp( -r2 / 2 ) Δr.

This leads to a χ distribution with d degrees of freedom. (Not the better known χ2 distribution.) This distribution has mode √(d-1) and variance 1. For large d, the distribution is approximately normal. So a multivariate normal in d dimensions with d large has roughly 95% of its probability mass in a shell of radius √d with thickness 4, two standard deviations either side of √d. (I’m approximating anyway, so I approximated √(d-1) as √d to make the conclusion a little simpler.)

The graph below gives the probability density of shells as a function of radius in dimensions 10 and 100.

Related post:

Volumes of Lp unit balls

{ 2 comments }

The power of parity

by John on August 30, 2011

Puzzle: Give an elegant proof that the following matrix is invertible.

\left[ \begin{array}{rrr} 397 & -12 & -98 \\ -278 & 91 & 1000 \\ 314 & 218 & -11\end{array} \right]

[click to continue...]

{ 1 comment }

Few coefficients, few roots

by John on August 2, 2011

Here’s an elegant little theorem I just learned about. Informally,

A polynomial with few non-zero coefficients has few real roots.

More precisely,

If a polynomial has k non-zero coefficients, it can have at most 2k – 1 distinct real roots.

This says, for example, that a polynomial like x100 – 12 x37 + 1 can have at most 5 distinct real roots, even though it has a total of 100 roots when you count complex roots and count with multiplicity.

I won’t repeat the proof here, but it’s not long or advanced.

The upper bound 2k – 1 in the theorem is sharp. To see this, note that

x(x2 – 1)(x2 – 4) … (x2 – (k-1)2)

has k non-zero coefficients and 2k – 1 distinct real roots.

Source: Mathematical Omnibus

Update: Someone asked about the converse. Does having few real roots imply few non-zero coefficients? No. The polynomial (x2 + 1)n has no real roots but it has n+1 non-zero coefficients.

{ 4 comments }

Math superhero in training

by John on July 27, 2011

Steve Yegge has a new project. He’s in training to become a math superhero. Or at least a sidekick. He said that math/stat folks superheros and he wants to join them.

In his presentation at OSCON Data 2011 on Monday, Yegge said that all the hard problems require math and statistics. So he’s quitting his job at Google to study math in hopes that he can solve big problems three to five years from now.

His enthusiasm for math is naive and inspiring. I gather from some of the articles he’s written that he’s an original thinker and a hard worker. It’ll be interesting to see what he does.

Update: As pointed out in the comments, Steve Yegge clarified on his blog that he is not quitting Google, only his project at Google.

{ 11 comments }

Golden Carnival of Mathematics

by John on July 5, 2011

Welcome to the 79th edition of the Carnival of Mathematics. By tradition, each edition begins with a bit of trivia about the number of the carnival.

Gold has atomic number 79, so this is the golden edition. There is an older tradition of calling 25th things silver, 50th things gold, etc. However, I propose switching to atomic numbers as this system is simpler and easier to look up. :)

Not only is 79 a prime number, it belongs to numerous categories of of prime numbers:

  • Cousin
  • Fortunate
  • Lucky
  • Happy
  • Gaussian
  • Higgs
  • Kynea
  • Permutable
  • Pillai
  • Regular
  • Sexy

(Definitions here)

And now on to the posts.

History

Fëanor at JOST A MON presents Cipher, a history of zero through the Middle Ages.

Historical wanderings

Katie Sorene takes us on a stroll through ancient labyrinths on her blog Travel Blog – Tripbase.

Guillermo Bautista strolls through The Seven Bridges of Königsberg at Mathematics and Multimedia.

Wandering

Next we wander around a grid of dominoes. Jim Wilder presents The Domino Effect: An Elementary Approach to the Kruskal Count. Starting from this elementary post, you can wander into an investigation of coupling methods for Markov chains.

Teaching

Peter Rowlett discusses whether there is a generational gap between professors and students on his blog Travels in a Mathematical World.

Alexander Bogomolny from CTK Insights presents An Olympiad Problem for a Kindergarten Investigation. He gives a problem that is simple to describe and that has a simple but sophisticated solution.

Media

Mike Croucher shares a couple videos simulating pendulum waves, one in Maple and one in Mathematica.

Peter Rowlett explains why he supports Relatively Prime, Samuel Hansen’s Kickstarter project. Samuel Hansen has produced two mathematical podcasts and is now raising donations to fund the creation of a series of mathematical documentaries.

Computing

One of the most fundamental questions you can ask about a computer program is whether it stops. This may appear to be an easy task, yet there is a three-line program that no one knows whether it always terminates. Fëanor shared a link to Brian Hayes‘ commentary Don’t try to read this proof! on a recent proposed proof of the Collatz conjecture.

Applications often involve matrices that are too large to store directly. For an introduction to how large matrices are represented in computer memory, see Storing Banded Matrices for Speed from The NAG Blog. The post promises to be the first in a series.

Next

You can submit posts for the next Carnival of Mathematics here. Also, you can keep up with Carnival of Mathematics new and other mathematical tidbits by following CarnivalOfMath on Twitter.

Related upcoming carnivals

{ 6 comments }

Doubly periodic functions

by John on June 29, 2011

Functions like sine and cosine are periodic. For example, sin(x + 2πn) = sin(x) for all x and any integer n, and so the period of sine is 2π. But what happens if you look at sine or cosine as functions of a complex variable? They’re still periodic if you shift left or right, but not if you shift up or down. If you move up or down, i.e. in a pure imaginary direction, sines and cosines become unbounded.

Doubly periodic functions are periodic in two directions. Formally, a function f(z) of complex variable is doubly periodic if there exist two constants ω1 and ω2 such that

f(z) = f(z + ω1) = f(z + ω2)

for all z. The ratio ω1 / ω2 cannot be real; otherwise the two periods point in the same direction. For the rest of this post, I’ll assume ω1 = 1 and ω2 = i. Such functions are periodic in the horizontal (real-axis) and vertical (imaginary-axis) directions. They repeat everywhere their behavior on the unit square.

What do doubly periodic functions look like? It depends on what restrictions we place on the functions. When we’re working with complex functions, we’re typically interested in functions that are analytic, i.e. differentiable as complex functions.

Only constant functions can be doubly periodic and analytic everywhere. Why? Our functions take on over and over the values they take on over the unit square. If a function is analytic over the (closed) unit square then it’s bounded over that square, and since it’s doubly periodic, it’s bounded everywhere. By Liouville’s theorem, the only bounded analytic functions are constants.

This says that to find interesting doubly periodic functions, we’ll have to relax our requirements. Instead of requiring functions to be analytic everywhere, we will require them to be analytic except at isolated singularities. That is, the functions are allowed to blow up at a finite number of points. There’s a rich set of such functions, known as elliptic functions.

There are two well-known families of elliptic functions. One is the Weierstrass ℘ function (TeX symbol \wp, Unicode U+2118) and its derivatives. The other is the Jacobi functions sn, cn, and dn. These functions have names resembling familiar trig functions because the Jacobi functions are in some ways analogous to trig functions.

It turns out that all elliptic functions can be written as combinations either of the Weierstrass function and its derivative or combinations of Jacobi functions. Roughly speaking, Weierstrass functions are easier to work with theoretically and Jacobi functions are easier to work with numerically.

Related posts:

How many trig functions are there?
College math in a single symbol
Diagram of special function relationships

{ 11 comments }

Square root interview question

by John on June 23, 2011

Imagine some of the answers you might get to  “What is the square root of 101?” First, three answers that suggest an interviewee is not strong with math.

  • What’s a square root?
  • You gotta calculator?
  • 101 doesn’t have a square root.

And here are some other answers that might give an idea where an interviewee is coming from.

  • An irrational number. (Pure mathematician)
  • Approximately 10. (Engineer)
  • Somewhere between 10 and 11, closer to 10. (Better engineer)
  • Approximately 10.05, based on a linear Taylor approximation centered at 100 (Applied mathematician)
  • Approximately 10.05, based on one step of Newton’s method (Computer scientist)

(This is just for amusement. I don’t think quiz show-like interviews are a good way to find people you want to work with for years.)

Related posts:

Writes large, correct programs
Dumb and gets things done

{ 51 comments }

Impure math

by John on June 15, 2011

When Samuel Hansen said in his interview “You’re not a pure mathematician” I agreed without thinking, but later the statement bothered me a little. I know what he meant: considering the two categories of pure math and applied math, you’d put yourself in the latter category. Which is true.

But the term “pure” math can be misleading, as if everyone else does impure math. Applied math is not an alternative to theoretical math. Applied mathematicians prove theorems etc. We work on applications in addition to doing what is expected of pure mathematicians. The difference between pure and applied math is motivation, not content. Applied math is motivated by direct application to non-mathematical problems. Pure math seeks to advance math for its own sake. Both are important.

Statistics uses the terms “theoretical” and “applied” rather than “pure” and “applied.” Math doesn’t use “theoretical” as an antithesis to “applied” because applied math is theoretical. But unlike math, being “applied” in statistics does mean you’re often (too often?) excused from proving theorems. The first time I was a coauthor on a statistics paper I was surprised to find out you could publish with just simulation results and no theorems. This happens in applied math as well, but not nearly as often as it does in applied statistics.

On the other hand, when I hear the term “applied statistics” I want to ask “Is there any other kind?” Statistics is applied (and theoretical!) though some statisticians work more directly on applications than others. As Andrew Gelman quips, the difference between theoretical and applied statisticians is that

The theoretical statistician uses x, the applied statistician uses y (because we reserve x for predictors).

I assume that statement wasn’t meant to be taken literally, but I agree with the sentiment that the distinction between theoretical and applied statistics can be exaggerated. I’d say the same applies to pure and applied math.

{ 9 comments }