From the category archives:

Math

A childhood question about heat

by John on March 10, 2010

When I was a little kid, I asked some adults the following question.

If hot things cool, and cool things warm up, could something hot cool down and warm back up?

The people I asked didn’t understand my question and just laughed. I have no idea how old I was, but I wasn’t old enough to articulate what I was thinking.

Here’s what I had in mind. I knew that hot things like a cup of coffee grew cold. And I knew that cold things, say a glass of milk, get warm. Well, could the coffee get so cold that it becomes a cold thing and start to warm back up?

Could the coffee become as cold as the glass of milk? Common sense suggests that can’t happen. When we say coffee grows cold, we mean that it becomes relatively colder, closer to room temperature. And when we say the milk is getting warm, we also mean it is getting closer to room temperature. We’ve never left a hot cup of coffee on a table and come back later to find that it has cooled off so much that it is colder than room temperature. But could there be small fluctuations?

As the coffee and milk head toward room temperature, could they overshoot the target, just by a little bit? Say room temperature is 70 °F, the coffee starts out at 150 °F, and the milk starts out at 40 °F. We don’t expect the coffee to cool down to 40 °F or the milk to warm up to 150 °F. But could the coffee cool down to 69.5 °F and then go back up to 70 °F? Could the milk warm up to 70.5 °F and then cool back down to 70 °F?

I didn’t get a satisfactory answer to my childhood question until I was in college. Then I found out about Newton’s law of cooling. It says that the rate at which a warm body cools is proportional to the difference between its current temperature and the ambient temperature. This law can be written as a differential equation whose solution shows that the temperature of a warm body decreases exponentially to the ambient temperature. The temperature curve always slopes downward. It doesn’t wiggle even a little on its journey to room temperature. Cold bodies warm up the opposite way, exponentially approaching room temperature but never exceeding it.

In case it this seems obvious, think about thermostats. They don’t work this way. Say the temperature in a room is 85 °F and you’d like it to be 72 °F, so you turn on the air conditioning. Will the temperature steadily lower to 72 °F? Not exactly. If you were to plot the temperature in the room over time and look at the graph from far enough away, it would look like it is steadily going down to the desired temperature. But if you look at the graph more closely, you’ll see wiggles. The AC may cool the room to a little below 72 °F, maybe to 70 °F. The AC would cut off and the temperature would rise to 72 °F. Unlike the cup of hot coffee, the AC will often overshoot its target, though not by too much. The temperature may feel constant, but it is not. It oscillates around the desired temperature.

{ 9 comments }

Economizing approximations

by John on February 18, 2010

The most obvious approximation may not be the best. But sometimes a small change to an obvious approximation can make it better approximation. This post gives an example illustrating this point. [click to continue...]

{ 0 comments }

Paul Erdős had this notion that God kept a book of the best proofs. Erdős called God’s book simply “the book.” Springer recently published Proofs from THE BOOK, a collection of elegant proofs that the authors suppose might be in God’s book.

For many mathematicians, the first proof that comes to mind as a candidate for a proof in the book is Euclid’s proof that there are infinitely many primes. The proof is so short and simple that it can almost be written within the 140-character limit of Twitter. I give the proof in my AlgebraFact Twitter account as follows.

There are infinitely many primes. Proof: If not, multiply all primes together and add 1. Now you’ve got a new prime.

Several people pointed out that I was a little too terse. If N is the supposed product of all primes + 1, N itself doesn’t have to be a new prime. It could be that N has a factor that is a new prime. The smallest prime factor of N must be a prime not on the list. Either way “you’ve got a new prime,” but the proof above leaves out some important details.

Here’s an example. Suppose we believed the only primes were 2, 3, 5, 7, 11, and 13. Let N = 2*3*5*7*11*13 + 1 = 30031. Now N cannot be divisible by 2, 3, 5, 7, 11, or 13 because there is a remainder of 1 when dividing N by any of these numbers. Maybe N is a new prime, or maybe N is composite but N has a prime factor not on our list. The latter turns out to be the case: 30031 = 59*509.

If you can find a better summary of Euclid’s proof in 140 characters or less, please leave it in the comments. Please include the character count with your proposal.

Related posts:

In praise of tedious proofs
Proofs of false statements

{ 7 comments }

Carnival of Mathematics #62

by John on February 5, 2010

What is the Carnival of Mathematics? Math bloggers submit articles they have written recently and each month a host writes a post linking to the submitted posts. The sister carnival, Math Teachers at Play, focuses on math education and on math up through high school level. For a more thorough description of the two carnivals and some FAQs, please see Mike Croucher’s article What is a Maths Carnival?

I’m taking a turn hosting this month. Tradition dictates that the host begin with some trivia about the number of the post. As this is the 62nd Carnival of Mathematics, here are a few facts about 62.

  • 62 is the only number whose cube (238328) consists of 3 digits each occurring 2 times.
  • The great rhombicosidodecahedron has 62 sides.
  • Louis Pasteur developed the first rabies vaccination at age 62.

And now onto the posts.

Math and science teacher Cory Poole sends in a video that he created along with his partners and students. The video features a 64-foot Sierpenski triangle about of 12,000 tortilla chips. Read more about the story of the video. Also, here are the bloopers from making the video.

St. Swithun’s day is a sort of British analog of America’s Groundhog Day. If it rains on St. Swithun’s day, it is supposed to rain for the next 40 days. Is there some truth to the legend? See Jon McLoone’s article Mathematica Tests the St. Swithun’s Day Proverb posted at Wolfram Blog.

Edmund Harriss from Maxwell’s Demon presents Spirographs and the third dimension. Interesting visually and mathematically.

toral spirograph

Rachel Thomas presents Beautiful symmetry provides glimpse into quantum world on News from the world of maths. This article reports on a low-termperature experiment that implies that the exceptional Lie group E8 is at work.

Did you know that sine and cosine are equal for all x? Heather (Xi) submitted a pseudo-proof in A=B implies that 1=1, therefore? by her colleague TwoPi at 360. (If there is ever a 360th Carnival of Mathematics, Heather should host it.)

Update: The 360 blog has agreed to host the 360th Carnival of Mathematics, tentatively scheduled for December 1, 2034. (Mike, I hope it’s OK that I scheduled this date without consulting you. ;) )

Marc West presents The curse of the duck, a post about mathematics and cricket. His blog is Mr Science Show: Where Science Meets Pop Culture.

Dan M presents Appolonian Gaskets and Ford Circles on his blog mathrecreation. This post relates number theory and geometry as Ford circles are related to the Farey sequence.

Ford circles

Rick Regan presents Counting Binary and Hexadecimal Palindromes posted at Exploring Binary. Rick counts base 10 palindromes as a warm-up before diving into new territory with binary and hexidecimal numbers.

Gregory Astley wrote a guest post Maxima Tutorial – plotting direction fields for 1st order ODEs for Mike Croucher’s blog Walking Randomly. (Maxima is part of the SAGE mathematics system. Mike has written several posts about SAGE lately.)

Annarita Ruberto from Matem@ticaMente presents the article “How heavy the fish?” The original post was written in Italian, and here is Google’s translation of the page into English.

Matt McDonnell presents Mathematical Recreations: Tweetable Game Of Life, a guest blog for Loren Shure’s blog Loren on the Art of MATLAB.

For some mental arithmetic shortcuts and an explanation of why they work, see Sol Lederman’s post Trachtenberg speed multiplication: exploring why it works on his blog Wild About Math.

Politics impacts sphere of human activity, including math education. See Jason Dyer’s post Anatomy of a Political Math-Ed Reaction on The Number Warrior.

Peter Rowlett from Travels in a Mathematical World presents Substitution ciphers: Ancient – Renaissance, video included below.

You can keep up with Carnival of Mathematics news on Twitter by following @CarnivalOfMath. You may also be interested in daily math facts on Twitter from @ProbFact (probability), @AnalysisFact (real and complex analysis), and @AlgebraFact (algebra and number theory).

{ 10 comments }

A few days ago I wrote a post on finding parameters so that a probability distribution satisfies two percentile conditions. Since then I’ve written Python code to carry out the calculations described in that article and the accompanying technical report.

The article is Finding probability distribution parameters from percentiles posted on CodeProject. The article comes with Python source code and some commentary. The article shows how SciPy and the functools module make it possible for the code to be very succinct.

Related posts:

Probability distribution parameters in SciPy
Numerical computing in IronPython with Ironclad
Getting started with SciPy

{ 0 comments }

How to compute the soft maximum

by John on January 20, 2010

The most obvious way to compute the soft maximum can easily fail due to overflow or underflow.

The soft maximum of x and y is defined by

g(x, y) = log( exp(x) + exp(y) ).

The most obvious way to turn the definition above into C code would be

double SoftMaximum(double x, double y)
{
    return log( exp(x) + exp(y) );
}

This works for some values of x and y, but fails if x or y is large. For example, if we use this to compute the soft maximum of 1000 and 200, the result is numerical infinity. The value of exp(1000) is too big to represent in a floating point number, so it is computed as infinity. exp(200) is finite, but the sum of an infinity and a finite number is infinity. Then the log function applied to infinity returns infinity.

We have the opposite problem if we try to compute the soft maximum of -1000 and -1200. In this computation exp(-1000) and exp(-1200) both underflow to zero, and the log function returns negative infinity for the logarithm of zero.

Fortunately it’s not hard to fix the function SoftMaximum to avoid overflow and underflow. Look what happens when we shift both arguments by a constant.

log( exp(x – k) + exp(y – k) ) = log( exp(x) + exp(y) ) – k.

This says

log( exp(x) + exp(y) ) = log( exp(x -k) + exp(y-k) ) + k

If we pick k to be the maximum of x and y, then one of the calls to exp has argument 0 (and so it returns 1) and the other has a negative argument. This means the follow code cannot overflow.

double SoftMaximum(double x, double y)
{
	double maximum = max(x, y);
	double minimum = min(x, y);
	return maximum + log( 1.0 + exp(minimum - maximum) );
}

The call to exp(minimum - maximum) could possibly underflow to zero, but in that case the code returns maximum. And in that case the return value is very accurate: if maximum is much larger than minimum, then the soft maximum is essentially equal to maximum.

The equation for the soft maximum implemented above has a few advantages in addition to avoiding overflow. It makes it clear that the soft maximum is always greater than the maximum. Also, it shows that the difference between the hard maximum and the soft maximum is controlled by the spread of the arguments. The soft maximum is nearest the hard maximum when the two arguments are very different and furthest from the hard maximum when the two arguments are equal.

Thanks to Andrew Dalke for suggesting the topic of this post by his comment.

Related links:

Soft maximum
Anatomy of a floating point number
Avoiding overflow, underflow, and loss of precision

{ 3 comments }

Ten surprises from numerical linear algebra

by John on January 20, 2010

Here are ten things about numerical linear algebra that you may find surprising if you’re not familiar with the field.

  1. Numerical linear algebra applies very advanced mathematics to solve problems that can be stated with high school mathematics.
  2. Practical applications often require solving enormous systems of equations, millions or even billions of variables.
  3. The heart of Google is an enormous linear algebra problem. PageRank is essentially an eigenvalue problem.
  4. The efficiency of solving very large systems of equations has benefited at least as much from advances in algorithms as from Moore’s law.
  5. Many practical problems — optimization, differential equations, signal processing, etc. — boil down to solving linear systems, even when the original problems are non-linear. Finite element software, for example, spends nearly all its time solving linear equations.
  6. A system of a million equations can sometimes be solved on an ordinary PC in under a millisecond, depending on the structure of the equations.
  7. Iterative methods, methods that in theory require an infinite number of steps to solve a problem, are often faster and more accurate than direct methods, methods that in theory produce an exact answer in a finite number of steps.
  8. There are many theorems bounding the error in solutions produced on real computers. That is, the theorems don’t just bound the error from hypothetical calculations carried out in exact arithmetic but bound the error from arithmetic as carried out in floating point arithmetic on computer hardware.
  9. It is hardly ever necessary to compute the inverse of a matrix.
  10. There is remarkably mature software for numerical linear algebra. Brilliant people have worked on this software for many years.

Related posts:

Don’t invert that matrix
Searching for John Francis
Applying PageRank to biology
Matrix cookbook
What is the cosine of a matrix?

{ 8 comments }

Don’t invert that matrix

by John on January 19, 2010

There is hardly ever a good reason to invert a matrix.

What do you do if you need to solve Ax = b where A is an n x n matrix? Isn’t the solution A-1 b? Yes, theoretically. But that doesn’t mean you need to actually find A-1. Solving the equation Ax = b is faster than finding A-1. Books might write the problem as x = A-1 b, but that doesn’t mean they expect you to calculate it that way.

What if you have to solve Ax = b for a lot of different b’s? Surely then it’s worthwhile to find A-1. No. The first time you solve Ax = b, you factor A and save that factorization. Then when you solve for the next b, the answer comes much faster. (Factorization takes O(n3) operations. But once the matrix is factored, solving Ax = b takes only O(n2) operations. Suppose n = 1,000. This says that once you’ve solved Ax = b for one b, the equation can be solved again for a new b 1,000 times faster than the first one. Buy one get one free.)

What if, against advice, you’ve computed A-1. Now you might as well use it, right? No, you’re still better off solving Ax = b than multiplying by A-1, even if the computation of A-1 came for free. Solving the system is more numerically accurate than the performing the matrix multiplication.

It is common in applications to solve Ax = b even though there’s not enough memory to store A-1. For example, suppose n = 1,000,000 for the matrix A but A has a special sparse structure — say it’s banded — so that all but a few million entries of A are zero.  Then A can easily be stored in memory and Ax = b can be solved very quickly. But in general A-1 would be dense. That is, nearly all of the 1,000,000,000,000 entries of the matrix would be non-zero.  Storing A requires megabytes of memory, but storing A-1 would require terabytes of memory.

Related posts:

Floating point numbers are a leaky abstraction
Ten surprises from numerical linear algebra

{ 37 comments }

Soft maximum

by John on January 13, 2010

In applications you often want to take the maximum of two numbers. But the simple function

f(x, y) = max(x, y)

can be difficult to work with because it has sharp corners. Sometimes you want an alternative that sands down the sharp edges of the maximum function. One such alternative is the soft maximum. Here are some questions this post will answer.

  • What exactly is a soft maximum and why is it called that?
  • How is it related to the maximum function?
  • In what sense does the maximum have “sharp edges”?
  • How does the soft maximum round these edges?
  • What are the advantages to the soft maximum?
  • Can you control the “softness”?

I’ll call the original maximum the “hard” maximum to make it easier to compare to the soft maximum.

The soft maximum of two variables is the function

g(x, y) = log( exp(x) + exp(y) ).

This can be extended to more than two variables by taking

g(x1, x2, …,  xn) = log( exp(x1) + exp(x2) + … + exp(xn) ).

The soft maximum approximates the hard maximum. Why? If x is a little bigger than y, exp(x) will be a lot bigger than exp(y). That is, exponentiation exaggerates the differences between x and y. If x is significantly bigger than y, exp(x) will be so much bigger than exp(y) that exp(x) + exp(y) will essentially equal exp(x) and the soft maximum will be approximately log( exp(x) ) = x, the hard maximum. Here’s an example. Suppose you have three numbers: -2, 3, 8. Obviously the maximum is 8. The soft maximum is 8.007.

The soft maximum approximates the hard maximum but it also rounds off the corners. Let’s look at some graphs that show what these corners are and how the soft maximum softens them.

Here are 3-D plots of the hard maximum f(x, y) and the soft maximum g(x, y). First the hard maximum:

Now the soft maximum:

Next we look at a particular slice through the graph. Here’s the view as we walk along the line y = 1. First the hard maximum:

And now the soft maximum:

Finally, here are the contour plots. First the hard maximum:

And now the soft maximum:

The soft maximum approximates the hard maximum and is a convex function just like the hard maximum. But the soft maximum is smooth. It has no sudden changes in direction and can be differentiated as many times as you like. These properties make it easy for convex optimization algorithms to work with the soft maximum. In fact, the function may have been invented for optimization; that’s where I first heard of it.

Notice that the accuracy of the soft maximum approximation depends on scale. If you multiply x and y by a large constant, the soft maximum will be closer to the hard maximum. For example, g(1, 2) = 2.31, but g(10, 20) = 20.00004. This suggests you could control the “hardness” of the soft maximum by generalizing the soft maximum to depend on a parameter k.

g(x, y; k) = log( exp(kx) + exp(ky) ) / k

You can make the soft maximum as close to the hard maximum as you like by making k large enough. For every value of k the soft maximum is differentiable, but the size of the derivatives increase as k increases. In the limit the derivative becomes infinite as the soft maximum converges to the hard maximum.

Update: See How to compute the soft maximum.

Related posts:

Convex optimization
Free optimization software from Microsoft

{ 19 comments }

Calendars, Connections, and Cats

by John on January 6, 2010

James Burke had a television series Connections in which he would create a connection between two very different things. For example, in one episode he starts with the discovery of the touchstone for testing precious metals and tells a winding tale of how the touchstone led centuries later to the development of nuclear weapons.

I had a Connections-like moment when a calendar led to some physics, which then lead to Andrew Lloyd Webber’s musical Cats.

A few days ago I stumbled on Ron Doerfler’s graphical computing calendar and commented on the calendar here. When I discovered Ron Doerfler’s blog, I bookmarked his article on Oliver Heaviside to read later. (Heaviside was a pioneer in what was later called distribution theory, a way of justifying such mathematical mischief as differentiating non-differentiable functions.) As I was reading the article on Heaviside, I came to this line:

At one time the ionosphere was called the Heaviside layer …

Immediately the lyrics “Up, up, up to the Heaviside layer …” started going through my head. These words come from the song “The Journey to the Heaviside Layer” from Cats. I had never thought about “Heaviside” in that song as being related to Mr. Heaviside. I’ve never seen the lyrics in print, so I thought the words were “heavy side” and didn’t stop to think what they meant.

Andrew Lloyd Webber based Cats on Old Possum’s Book of Practical Cats by T. S. Eliot. The song “The Journey to the Heaviside Layer” in particular is based on the poem Old Deuteronomy from Eliot’s book. Webber used the Heaviside layer as a symbol for heaven, based on an allusion in one of T. S. Eliot’s letters. The symbolism is obvious in the musical, but I hadn’t thought about “Heaviside layer” as meaning “the heavens” (i.e. the upper atmosphere) as well as heaven in the theological sense.

{ 2 comments }

2010 calendar of lost mathematical art

by John on January 3, 2010

Rod Carvalho wrote a post this morning announcing a beautiful 2010 calendar created by Ron Doerfler. Doerfler’s blog is entitled Dead Reckonings: Lost Art in the Mathematical Sciences. The calendar is an example of such lost art. It is illustrated with nomograms, ingenious ways of computing with graphs before electronic calculators were common. The illustrations are pleasant to look at even if you have no idea what they mean.

Image via Ron Doerfler.

Related posts:

Spherical trig is a lost art. Why care about spherical trig?

The Gudeermannian function gd(x) is another interesting relic of an early time. It is closely related to the Mercator projection and shows how to relate ordinary and hyperbolic trig functions without using complex numbers.

The image above shows solutions to the equation u + v + w = uvw. Here’s a post explaining the significance of that equation.

{ 5 comments }

Roots of integers

by John on December 20, 2009

An integer is either a perfect square or its square root is irrational. Said a different way, when you compute the square root of an integer, there are either no figures to the right of the decimal or there are an infinite number of figures to right of the decimal and they don’t repeat. There’s no middle ground. You can’t hope, for example, that the decimal expansion might stop or repeat after a hundred or so terms.

This theorem came up recently when I was talking to one of my kids about her math class, so I decided to look up the proof. It’s easier than I expected, not much harder than the familiar proof that the square root of 2 is irrational. Here goes.

Suppose a/b is a fraction in lowest terms, i.e. a and b are relatively prime, and a/b a solution to xn = c where n > 0 is an integer and c is an integer. Then

(a/b)n = an / bn = c

and so

an / b = c bn-1.

Now the right side of the equation above is an integer, so the left side must be an integer as well. But b is relatively prime to a, and so b is relatively prime to an. The only way an / b could be an integer is for b to equal 1 or -1. And so a/b must be an integer.

This proof shows that what we said about square roots extends to cube roots and in fact to all integer roots. For example, the fifth root of an integer is either an integer or an irrational number.

Update: Note that there’s another proof in the comments, one that I believe is easier to follow.

{ 6 comments }

How many gifts are there in the song Twelve Days of Christmas?

Day 1: 1 gift
Day 2: 1 + 2 = 3 gifts
Day 3: 1 + 2 + 3 = 6 gifts

Day 12: 1 + 2 + 3 + … + 12 = 78 gifts

The number of gifts on day n is the nth triangular number. The total number of gifts up to and including day n is the sum of the first n triangular numbers, known as the nth tetrahedral number. In the image below, the total number of balls is the fifth tetrahedral number. The number of balls in each layer are triangular numbers. (Image credit: Math is Fun.)

tetrahedron of glass balls

I’ll develop a formula for tetrahedral numbers and continuations of the pattern  such as the sum of tetrahedral numbers etc.

First, let T(n, 1) = n.

Next, let T(n, 2) be the nth triangular number. So T(n, 2) is the sum of the first n terms in the sequence T(i, 1).

Next, let T(n, 3) be the nth tetrahedral number. So T(n, 3) is the sum of the first n terms in the sequence T(i, 2).

In general, define T(n, k) to be the sum of the first n terms in the sequence T(i, k-1). You could think of T(n, k) as the nth k-dimensional triangular number. (A tetrahedron is a sort of 3-dimensional triangle. It’s a pyramid whose base is a triangle. T(n,4) would count balls arranged in a sort of 4-dimensional triangle, a simplex in 4 dimensions.)

Theorem: T(n, k) = n(n+1)(n+2) … (n+k-1)/k!

Corollary: There are T(12, 3) = 12*13*14/6 = 364 gifts in the Twelve Days of Christmas.

See these notes for a elementary proof by induction.

Update: Here’s more advanced proof that uses calculus of finite differences.  The more advanced proof requires more background, but it also gives a better idea of how someone might have discovered the formula.

Related posts:

Binomial coefficients
Splitting a convex set through its center
My favorite Christmas carol

{ 3 comments }

Random inequalities IX: new tech report

by John on November 20, 2009

Just posted: Exact calculation of inequality probabilities. This report summarizes previous results for calculating P(X > Y) where X and Y are random variables.

Previous posts on random inequalities:

Introduction
Analytical results
Numerical results
Cauchy distributions
Beta distributions
Gamma distributions
Three or more random variables
Folded normals

{ 6 comments }

Div, grad, and curl videos

by John on November 13, 2009

Open University videos on gradient, divergence, and curl

Grad:

Div:

Curl:

Related link:

Div, Grad, Curl and All That

{ 0 comments }