In geometry, you’d say that if a square has side *x*, then it has area *x*^{2}.

In calculus, you’d say more. First you’d say that if a square has side *near* *x*, then it has area *near* *x*^{2}. That is, area is a continuous function of the length of a side. As the length of the side changes, there’s never an abrupt jump in area. Next you could be more specific and say that a small change Δ*x* to a side of length *x* corresponds to approximately a change of 2*x* Δ*x* in the area.
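To see where the 2*x* Δ*x* comes from, expand the area of the slightly larger square. The word "approximately" is hiding a leftover (Δ*x*)^{2} term, which is negligible when Δ*x* is small compared to *x*.

```latex
\[
(x + \Delta x)^2 - x^2 \;=\; 2x\,\Delta x + (\Delta x)^2 \;\approx\; 2x\,\Delta x .
\]
```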

In probability, you ask what is the area of a square like if you pick the length of its side at random. If you pick the length of the side from a distribution with mean μ, does the distribution of the area have mean μ^{2}? No, but if the probability distribution on side length is tightly concentrated around μ, then the distribution on area will be concentrated near μ^{2}. And you can approximate just how near the area is to μ^{2} using the delta method, analogous to the calculus discussion above.
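Here is a rough sketch of what that looks like numerically. The normal distribution and the values of `mu` and `sigma` are assumptions chosen only for illustration; nothing above depends on them. The first-order delta method says the area has mean roughly μ^{2} and standard deviation roughly 2μσ, the same 2*x* factor as in the calculus discussion.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, only for reproducibility

# Illustrative assumption: side length tightly concentrated around mu
mu, sigma = 3.0, 0.05
side = rng.normal(mu, sigma, size=1_000_000)
area = side**2

# First-order delta method with g(x) = x^2, so g'(mu) = 2*mu
approx_mean = mu**2
approx_sd = 2 * mu * sigma

print("simulated mean:", area.mean(), " first-order approximation:", approx_mean)
print("simulated sd:  ", area.std(), " delta method approximation:", approx_sd)
```

The simulated mean should sit slightly above μ^{2}, by about σ^{2}, which is the small correction the first-order approximation leaves out.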

If the distribution on side lengths is not particularly concentrated, finding the distribution on the area is more interesting. It will depend on the specific distribution on side length, and the mean area might not be particularly close to the square of the mean side length. The function to compute area is trivial, and yet the question of what happens when you stick a random variable into that function is not trivial. **Random variables** behave as you might expect when you stick them into linear functions, but **offer surprises when you stick them into nonlinear functions**.

Suppose you pick the length of the side of a square uniformly from the interval [0, 1]. Then the average side is 1/2, and so you might expect the average area to be 1/4. But the expected area is actually 1/3. You could see this a couple of ways, analytically and empirically.

First an analytical derivation. If *X* has a uniform [0, 1] distribution and *Z* = *X*^{2}, then the CDF of *Z* is

Prob(*Z* ≤ *z*) = Prob(*X* ≤ √*z*) = √*z*

for 0 ≤ *z* ≤ 1, and so the PDF for *Z*, the derivative of the CDF, is 1/(2√*z*) on that interval. From there you can compute the expected value by integrating *z* times the PDF.
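Written out, the integral uses the density 1/(2√*z*) on (0, 1]. As a cross-check, integrating *x*^{2} directly against the uniform density gives the same answer.

```latex
\[
E[Z] \;=\; \int_0^1 z \cdot \frac{1}{2\sqrt{z}}\, dz
     \;=\; \frac{1}{2}\int_0^1 z^{1/2}\, dz
     \;=\; \frac{1}{2}\cdot\frac{2}{3}
     \;=\; \frac{1}{3},
\qquad
E[X^2] \;=\; \int_0^1 x^2\, dx \;=\; \frac{1}{3}.
\]
```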

You could check your calculations by seeing whether simulation gives you similar results. Here’s a little Python code to do that.

    from random import random

    N = 1000000
    print( sum([random()**2 for _ in range(N)] )/N )

When I run this, I get 0.33386, close to 1/3.

Now let’s look at an exponential distribution on side length with mean 1. Then a calculation similar to the one above shows that the expected value of the area is 2. You can also check this with simulation. This time we’ll be a little fancier and let SciPy generate our random values for us.

    from scipy.stats import expon

    print( sum(expon.rvs(size=N)**2)/N )  # N as defined in the previous snippet

When I ran this, I got 1.99934, close to the expected value of 2.
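For reference, the exponential calculation can be sketched the same way as the uniform one: find the CDF of *Z* = *X*^{2}, differentiate, and integrate. The substitution *x* = √*z* turns the last integral into the second moment of *X*.

```latex
\[
\Pr(Z \le z) = \Pr(X \le \sqrt{z}) = 1 - e^{-\sqrt{z}},
\qquad
f_Z(z) = \frac{e^{-\sqrt{z}}}{2\sqrt{z}}, \quad z > 0,
\]
\[
E[Z] \;=\; \int_0^\infty z \cdot \frac{e^{-\sqrt{z}}}{2\sqrt{z}}\, dz
     \;=\; \int_0^\infty x^2 e^{-x}\, dx
     \;=\; \Gamma(3)
     \;=\; 2 .
\]
```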

You’ll notice that in both examples, the expected value of the area is more than the square of the expected value of the side. This is not a coincidence but a consequence of Jensen’s inequality. Squaring is a convex function, so for any random variable the expected value of the square is at least as large as the square of the expected value, and strictly larger unless the random variable is constant.
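For squaring in particular, the gap between the two sides of Jensen’s inequality is exactly the variance, since E[*X*^{2}] − (E[*X*])^{2} = Var(*X*). The short check below confirms this for the two examples, using exact fractions. The moments are entered by hand from the standard formulas for these distributions, not computed by the code.

```python
from fractions import Fraction

# (E[X], E[X^2], Var(X)) for each example, entered from the known formulas
cases = {
    "uniform[0, 1]": (Fraction(1, 2), Fraction(1, 3), Fraction(1, 12)),
    "exponential, mean 1": (Fraction(1), Fraction(2), Fraction(1)),
}

for name, (mean, mean_square, variance) in cases.items():
    gap = mean_square - mean**2   # E[X^2] - (E[X])^2
    assert gap == variance        # the gap for squaring is the variance
    print(f"{name}: E[X^2] - (E[X])^2 = {gap}")
```

Since a variance is never negative, this gives the inequality directly, and the gap is zero only when the side length is not random at all.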

Hi John,

I can’t count the times I tell myself “I have GOT to remember this!” after reading one of your blog posts. Your brief yet lucid explanations put me well on the path from scratching a curiosity toward learning a tool.

Yet the simple fact that I have a mind like a steel sieve often leaves me with vague recall when I’m needing the right math to wrangle sense from a sensor or system or archive. Searching your blog generally fails because I’ve forgotten the nomenclature, leaving me to scan the posts, generally too quickly to recognize what I’m looking for.

Any chance of a book that would organize and connect your posts? The table of contents and index combined would prevent much pulling of hair, gnashing of teeth and rending of clothes. At least on my part.

And by “connect” I don’t mean writing anything at all like a traditional pedagogical math text. I’m thinking more along the lines of the blog posts themselves, fleshed out with footnotes and FMI links, and, where appropriate, links to sample code in GitHub.

OK, my dream would be something like Martin Gardner’s collected columns, but with a more applied focus and making full use of online resources.

If you are at all intrigued, estimate the resources you’d need to pay the bills while writing, then create a KickStarter campaign to see if there’s a reader community consisting of more than just me. I can envision several levels of rewards:

$1 – The blog is plenty for me, but I want to help make the book happen.

$10 – The ebook.

$25 – The paperback + ebook.

$50 – The hardcover + ebook.

And on up for folks wanting multiples for gifts.

Another alternative is to “just do it” on a self-publishing/funding site such as InkShares (where I supported the fun and insightful book “Trekonomics”).

What do you think? Any chance at all?

I suspect the writing process itself would generate a ton of blog and Twitter posts, so there’s that to consider.

In para 3, you write “x^2” when you mean “\mu^2” a couple of times.

Cheers,

MarkL

Thanks, Mark. Updated.

I second Bob. Also “mind like a steel sieve,” funny.

In calculus, you’d say more. First you’d say that if a square has side near *x*, then it has area near *x*^{2}. That is, area is a continuous function of the length of a side. As the length of the side changes, there’s never an abrupt jump in area. Next you could be more specific and say that a small change Δ*x* to a side of length *x* corresponds to approximately a change of 2*x* Δ*x* in the area.

Is this correct? Should there be a squared x term in the last sentence?