The smallest uninteresting number and fuzzy logic

I’ve tried to think of something interesting about the number 2013 and haven’t come up with anything. This reminds me of the interesting number paradox.

Theorem: All positive integers are interesting.

Proof: Let n be the smallest uninteresting positive integer. Then n is interesting by virtue of being the smallest such number.

The interesting number paradox is semi-serious, and so is the resolution I propose below. Both are jokes, but they touch on some serious ideas.

“Interestingness” is not an all-or-nothing property. Some numbers are more interesting than others, so perhaps we should use fuzzy logic to quantify how interesting a number is, say on a scale from 0 to 1.

For a given ε > 0, define as interesting the set of numbers whose interestingness is greater than ε. Suppose the interestingness of numbers trails off after some point. (Otherwise, if the interestingness dropped sharply, the first number after the drop would be interesting.) The largest interesting number then is barely interesting. The number one larger than a barely interesting number is even less interesting. So the proof of the interesting number paradox doesn’t apply in the continuous setting.
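To make the continuous resolution concrete, here is a toy sketch. The decaying interestingness function 1/n is purely an assumption for illustration; nothing in the argument depends on its exact form.

```python
# Toy model: interestingness decays smoothly with n (assumed form 1/n).
def interestingness(n):
    return 1.0 / n

eps = 0.01  # threshold for calling a number "interesting"

# The largest interesting number under this threshold...
largest = max(n for n in range(1, 10_000) if interestingness(n) > eps)

# ...is only barely more interesting than its uninteresting successor,
# so there is no paradoxical jump at the boundary.
gap = interestingness(largest) - interestingness(largest + 1)
print(largest, gap)
```

The gap at the boundary is tiny, which is the point: the first "uninteresting" number is almost exactly as interesting as the last "interesting" one.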

On a more serious note, many paradoxes in mathematics can be resolved by replacing a binary criterion with a continuous one.

For example, the sum of a trillion continuous functions is continuous, but the infinite sum of continuous functions may not be. How can that be? The problem is that we’re viewing continuity as an all-or-nothing property. If you have a series of continuous functions that converges to a discontinuous limit, the degree of continuity must be degrading. The partial sum after some large number of terms is continuous, but not very continuous. The modulus of continuity of each partial sum is finite, but is getting larger, and is infinite in the limit.
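Here is a numerical sketch of that degradation, using the classic example of the series x + (x² − x) + (x³ − x²) + …, whose partial sums are sₙ(x) = xⁿ on [0, 1]. Each partial sum is continuous, but the limit is 0 for x < 1 and 1 at x = 1.

```python
# Each partial sum s_n(x) = x^n on [0, 1] is continuous, but the limit
# is discontinuous at x = 1. Watch the modulus of continuity degrade.

def modulus_of_continuity(f, delta, grid=2000):
    # Approximate w(delta) = sup{|f(x) - f(y)| : |x - y| <= delta} on [0, 1]
    xs = [i / grid for i in range(grid + 1)]
    step = round(delta * grid)
    return max(abs(f(xs[i + k]) - f(xs[i]))
               for i in range(grid)
               for k in range(1, min(step, grid - i) + 1))

delta = 0.01
for n in [1, 10, 100, 1000]:
    print(n, modulus_of_continuity(lambda x, n=n: x**n, delta))
# w(0.01) creeps toward 1 as n grows: the partial sums remain continuous
# but become "less continuous" on the way to the discontinuous limit.
```

For fixed δ the modulus is 1 − (1 − δ)ⁿ, which tends to 1 as n → ∞, quantifying how continuity degrades along the sequence.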

Classical statistics is filled with yes-no concepts that make more sense when replaced with continuous measures. For example, instead of asking whether an estimator is biased, it’s more practical to ask how biased it is.
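As a quick simulation sketch (my own example, not from the post being quoted): the maximum-likelihood variance estimator, which divides by n rather than n − 1, is biased, and we can measure how biased rather than merely note that it is.

```python
import random

random.seed(42)

def mle_var(xs):
    # Variance estimate dividing by n (biased) rather than n - 1.
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

n, trials = 5, 200_000
estimates = [mle_var([random.gauss(0, 1) for _ in range(n)])
             for _ in range(trials)]
bias = sum(estimates) / trials - 1.0  # true variance is 1

# Theory: E[estimate] = (n-1)/n * sigma^2, so the bias is -1/n = -0.2 here.
print(bias)
```

The bias is −σ²/n, a continuous quantity that shrinks as n grows, which is far more informative than the binary verdict "biased."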

Computer science is often concerned with whether something can be computed (i.e. exactly). But sometimes it’s more important to ask how well something can be computed. Many things that cannot be computed in theory can be computed well enough in practice.

Related post: How to solve supposedly intractable problems

Extreme change is easier

This last week I ran across a TED video about a couple who had a house full of stuff and $18,000 in debt. They sold all their stuff except what could fit in a couple bags and went backpacking in Australia.

Good for them for having the courage to make a big change. I am impressed, but I’d be more impressed if they had sold their new home and moved into another one 20 years older and half the size.

It’s easier to get rid of all your stuff than half your stuff. If you get rid of all your stuff, you’re deciding to hire other people to meet your needs. You can get rid of your house if you’re willing to rent your shelter from hotels. You can get rid of your pots and pans if you’re willing to pay restaurants to prepare your food with their pots and pans. You can get rid of your car if you’re willing to pay a cab driver to take you everywhere you need to go. Moving into a smaller home, with fewer pots and pans, and selling one of your two cars may be harder.

I don’t know whether these folks are still living as tourists. But if they haven’t bought another house yet, they probably will some day, though maybe one much smaller than their first house. The sequence

large house -> no house -> small house

may be easier than

large house -> small house.

Extreme change is often easier than moderate change, for better or for worse. Extreme change can be more impressive, so people who sell everything get invited to talk at TED, whereas people who cut their living expenses by 20% and slowly pay off their debts get 30 seconds on the Dave Ramsey Show. People who sacrifice to achieve their goals slowly while maintaining their responsibilities are less impressive at first glance, but more impressive after more thought.

Extreme change can also be temporary. Lottery winners go bankrupt. People on starvation diets end up heavier than ever. One extreme change can lead to another extreme change in the opposite direction.

However, you can also use the ease of extreme change to your advantage. The book Change or Die is all about making extreme changes wisely. (The book grew out of this article.) Radical change requires fewer decisions, and leads to encouraging results sooner. Along those lines, I love the story of Eric Coyle, a mediocre student who suddenly became extremely motivated and took up to 64 credit hours in a semester.


Ideas for blog posts

When George Will began his career as a syndicated columnist, he asked his editor William Buckley how he could ever come up with two columns per week. Buckley replied that at least twice a week something would annoy him [Will], and he just needed to write about it.

I don’t write about what I find annoying, but rather what I find interesting. I’m always running into things I find interesting, and sometimes I write about them.

One strategy for coming up with ideas for blog posts is to fill in details from your reading. I’ve done this several times lately. And many of my programming posts come from my research to fill in gaps or resolve ambiguity in software documentation. You can also subtract detail, i.e. write summaries, but posts that add detail are likely to be more original.

Lucky house prices

Here’s an interesting tidbit on the least significant digits of house prices.

In Nevada, the last non-zero number in the selling price of a house is a lucky seven 37 percent more often than in the rest of the country. 777 is used three times more often than in the rest of the country. … In neighborhoods with a majority of Asian people, the asking price for homes ends in the lucky number eight 20 percent of the time, compared with 4 percent in other neighborhoods.

From “While we’re at it” by David Mills, First Things, January 2013.

Napier’s mnemonic

John Napier (1550–1617) discovered a way to reduce 10 equations in spherical trig down to 2 equations and to make them easier to remember.

Draw a right triangle on a sphere and label the sides a, b, and c where c is the hypotenuse. Let A be the angle opposite side a, B the angle opposite side b, and C the right angle opposite the hypotenuse c.

There are 10 equations relating the sides and angles of the triangle:

sin a = sin A sin c = tan b cot B
sin b = sin B sin c = tan a cot A
cos A = cos a sin B = tan b cot c
cos B = cos b sin A = tan a cot c
cos c = cot A cot B = cos a cos b

Here’s how Napier reduced these equations to a more memorable form. Arrange the parts of the triangle in a circle as below.

Then Napier has two rules:

  1. The sine of a part is equal to the product of the tangents of the two adjacent parts.
  2. The sine of a part is equal to the product of the cosines of the two opposite parts.

For example, if we start with a, the first rule says sin a = cot B tan b. (The tangent of the complementary angle to B is the cotangent of B.) Similarly, the second rule says that sin a = sin c sin A. (The cosine of the complementary angle is just the sine.)

For a more algebraic take on Napier’s rules, write the parts of the triangle as

(p_1, p_2, p_3, p_4, p_5) = (a, b, co-A, co-c, co-B).

Then the equations above can be reduced to

sin p_i = tan p_{i-1} tan p_{i+1} = cos p_{i+2} cos p_{i+3}

where the addition and subtraction in the subscripts are carried out mod 5. This is just using subscripts to describe the adjacent and opposite parts in Napier’s diagram.
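As a sanity check (my own, not from the book), we can verify both of Napier’s rules numerically for a right spherical triangle constructed from the identities above:

```python
import math

# Build a right spherical triangle: pick legs a and b, then use
# cos c = cos a cos b and sin A = sin a / sin c, sin B = sin b / sin c.
a, b = 0.7, 0.5
c = math.acos(math.cos(a) * math.cos(b))
A = math.asin(math.sin(a) / math.sin(c))
B = math.asin(math.sin(b) / math.sin(c))

# Napier's five parts in circular order: (a, b, co-A, co-c, co-B).
p = [a, b, math.pi / 2 - A, math.pi / 2 - c, math.pi / 2 - B]

for i in range(5):
    lhs = math.sin(p[i])
    adjacent = math.tan(p[(i - 1) % 5]) * math.tan(p[(i + 1) % 5])
    opposite = math.cos(p[(i + 2) % 5]) * math.cos(p[(i + 3) % 5])
    assert abs(lhs - adjacent) < 1e-12 and abs(lhs - opposite) < 1e-12
print("all ten identities check out")
```

All ten equations fall out of the two rules, one pair per choice of middle part.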

Source: Heavenly Mathematics


Spotting sensitivity in an equation

The new book Heavenly Mathematics describes in the first chapter how the medieval scholar Abū Rayḥān al-Bīrūnī calculated the earth’s radius. The derivation itself is interesting, but here I want to expand on a parenthetical remark about the calculation.

The earth’s radius r can be found by solving the following equation.

\cos\theta = \frac{r}{r + 305.1m}

The constant in the denominator comes from a mountain which is 305.1 meters tall. The angle θ is known to be 34 minutes, i.e. 34/60 degrees. Here is the remark that caught my eye as someone more interested in numerical analysis than trigonometry:

There is a delicate matter hidden in this solution however: a minute change in the value of θ results in a large change in the value of r.

How can you tell that the solution is sensitive to changes (i.e. measurement errors) in θ? That doesn’t seem obvious.

Think of r as a function of θ and differentiate both sides of the equation with respect to θ. We’ll convert θ to radians because that’s what we do. (Explanation at the bottom of this post.) We get

-\sin\theta = \frac{305.1m}{(r + 305.1m)^2} \frac{dr}{d\theta}

or

\frac{dr}{d\theta} = -\sin\theta \frac{(r + 305.1m)^2}{305.1m}

Now let’s get a feel for the size of the terms in this equation. θ is approximately 0.01 radians, and so sin θ is approximately 0.01 as well. (See explanation here.) The radius of the earth is about 6.4 million meters. So the right side of the equation above is about 1.3 billion meters, i.e. it’s big.

A tiny increase in θ leads to a large decrease in r. For example, if our measurement of θ increased by 1%, from 0.01 to 0.0101, our measurement of the earth’s radius would decrease by about 130,000 meters.
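A quick computation confirms this. The function below simply solves the original equation for r; the specific numbers are the ones from the post.

```python
import math

h = 305.1  # mountain height in meters

def radius(theta):
    # Solve cos(theta) = r / (r + h) for r.
    return h * math.cos(theta) / (1 - math.cos(theta))

theta = (34 / 60) * math.pi / 180  # 34 minutes of arc, in radians

r = radius(theta)                # about 6.2 million meters
drop = r - radius(1.01 * theta)  # a 1% error in theta moves r by over 100 km
print(r, drop)
```

The exact drop is a bit less than the 130,000 meters predicted by the derivative because the derivative is only a linear approximation, but the order of magnitude is the same.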

I’d like to point out a couple things about this analysis. First, it shows how it can be useful to think of constants as variables. After measuring θ we could think that we know its value with certainty and treat it as a constant. But a more sophisticated analysis takes into account that while θ might not change, our measurement of θ has changed from the true value.

Second, we used the radius of the earth to determine how sensitive our estimate of the earth’s radius is to changes in θ. Isn’t that circular reasoning? Not really. We can use a very crude estimate of the earth’s radius to estimate how sensitive a new estimate is to changes in its parameters. You always have some idea how big a value is before you measure it. If you want to measure the distance to the moon, you know not to pick up a yard stick.

Basics of Sweave and Pweave

Sweave is a tool for embedding R code in a LaTeX file. Pweave is an analogous tool for Python. Because your document contains the code itself rather than output pasted in from somewhere else, results are automatically recomputed when inputs change. This is especially useful with graphs: rather than pasting an image into your document, you include the code that creates the image.

To use either Sweave or Pweave, you create a LaTeX file and include source code inside. A code block begins with <<>>= and ends with @ on a line by itself. By default, code blocks appear in the LaTeX output. You can start a code block with <<echo=FALSE>>= to execute code without echoing its source. In Pweave you can also use <% and %> to mark a code block that executes but does not echo. You might want to do this at the top of a file, for example, for import statements.

Sweave echoes code like the R command line, with > for the command prompt. Pweave does not display the Python >>> command line prompt by default, though it will if you use the option term=TRUE in the start of your code block.

In Sweave, you can use \Sexpr to inline a little bit of R code. For example, $x = \Sexpr{sqrt(2)}$ will produce x = 1.414…. You can also use \Sexpr to reference variables defined in previous code blocks. The Pweave analog uses <%= and %>. The previous example would be $x = <%= sqrt(2) %>$.

You can include a figure in Sweave or Pweave by beginning a code block with <<fig=TRUE, echo=FALSE>>= or with echo=TRUE if you want to display the code that produces the figure. With Sweave you don’t need to do anything else with your file. With Pweave you need to add \usepackage{graphicx} at the top.

To process an Sweave file foo.Rnw, run Sweave("foo.Rnw") from the R command prompt. To process a Pweave file foo.Pnw, run Pweave -f tex foo.Pnw from the shell. Either way you get a LaTeX file that you can then compile to a PDF.

Here are sample Sweave and Pweave files. First Sweave:

\documentclass{article}
\begin{document}

Invisible code that sets the value of the variable $a$.

<<echo=FALSE>>=
a <- 3.14
@

Visible code that sets $b$ and squares it.

<<bear, echo=TRUE>>=
b <- 3.15
b*b
@

Calling R inline: $\sqrt{2} = \Sexpr{sqrt(2)}$

Recalling the variable $a$ set above: $a = \Sexpr{a}$.

Here's a figure:

<<fig=TRUE, echo=FALSE>>=
x <- seq(0, 6*pi, length=200)
plot(x, sin(x))
@

\end{document}

And now Pweave:

\documentclass{article}
\usepackage{graphicx}
\begin{document}

<%
import matplotlib.pyplot as plt
from numpy import pi, linspace, sqrt, sin
%>

Invisible code that sets the value of the variable $a$.

<<echo=FALSE>>=
a = 3.14
@

Visible code that sets $b$ and squares it.

<<term=True>>=
b = 3.15
print(b*b)
@

Calling Python inline: $\sqrt{2} = <%= sqrt(2) %>$

Recalling the variable $a$ set above: $a = <%= a %>$.

Here's a figure:

<<fig=True, echo=False>>=
x = linspace(0, 6*pi, 200)
plt.plot(x, sin(x))
plt.show()
@

\end{document}


Beethoven, The Beatles, and Beyoncé: more on the Lindy effect

This post is a set of footnotes to my previous post on the Lindy effect. This effect says that creative artifacts have lifetimes that follow a power law distribution, and hence the things that have been around the longest have the longest expected future.

Works of art

The previous post looked at technologies, but the Lindy effect would apply, for example, to books, music, or movies. This suggests the future will be something like a mirror of the present. People have listened to Beethoven for two centuries, the Beatles for about four decades, and Beyoncé for about a decade. So we might expect Beyoncé to fade into obscurity a decade from now, the Beatles four decades from now, and Beethoven a couple centuries from now.

Disclaimer

Lindy effect estimates are crude, only considering current survival time and no other information. And they’re probability statements. They shouldn’t be taken too seriously, but they’re still interesting.

Programming languages

Yesterday was the 25th birthday of the Perl programming language. The Go language was announced three years ago. The Lindy effect suggests there’s a good chance Perl will be around in 2037 and that Go will not. This goes against your intuition if you compare languages to mechanical or living things. If you look at a 25-year-old car and a 3-year-old car, you expect the latter to be around longer. The same is true for a 25-year-old accountant and a 3-year-old toddler.

Life expectancy

Someone commented on the original post that for a British female, life expectancy is 81 years at birth, 82 years at age 20, and 85 years at age 65. Your life expectancy goes up as you age, but your expected number of additional years goes down. By contrast, imagine a pop song that has a life expectancy of 1 year when it comes out. If it’s still popular a year later, we could expect it to be popular for another couple years. And if people are still listening to it 30 years after it came out, we might expect it to have another 30 years of popularity.

Mathematical details

In my original post I looked at a simplified version of the Pareto density:

f(t) = c / t^(c+1)

starting at t = 1. The more general Pareto density is

f(t) = c a^c / t^(c+1)

and starts at t = a. A consequence is that if a random variable X has a Pareto distribution with exponent c and starting time a, then the conditional distribution of X, given that X is at least b, is another Pareto distribution with the same exponent but starting time b. The expected value of X a priori is ac/(c-1), assuming c > 1, but conditional on having survived to time b, the expected value is bc/(c-1). That is, the expected value goes up in proportion to the ratio of starting times, b/a.
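A quick Monte Carlo check of that last claim. The sampler uses the standard inverse-transform trick: if U is uniform on (0, 1], then a·U^(−1/c) has a Pareto distribution with exponent c starting at a.

```python
import random

random.seed(0)
c, a, b = 3.0, 1.0, 2.0

# Inverse-transform sampling: a * U**(-1/c) is Pareto(c) starting at a.
# (Use 1 - random() so U lies in (0, 1] and we never divide by zero.)
xs = [a * (1.0 - random.random()) ** (-1 / c) for _ in range(1_000_000)]

mean = sum(xs) / len(xs)                     # theory: a*c/(c-1) = 1.5
survivors = [x for x in xs if x >= b]
cond_mean = sum(survivors) / len(survivors)  # theory: b*c/(c-1) = 3.0
print(mean, cond_mean)
```

The conditional mean comes out close to twice the unconditional mean, matching the b/a = 2 ratio of starting times.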