Was Betteridge right?

Betteridge’s law says

Any headline which ends in a question mark can be answered by the word no.

If Betteridge was right, then the answer to my headline question should be no, in which case Betteridge was wrong. But if Betteridge was wrong, then the answer to the question in my headline is yes.

This isn’t quite like Russell’s paradox. He asked whether the set of all sets which do not contain themselves contains itself. If it does, it doesn’t. If it doesn’t, it does. This logical contradiction led to a more rigorous construction of set theory that avoids the paradox.

My observation about Betteridge’s law isn’t a paradox, though it resembles one. If Betteridge was wrong, there’s no contradiction in saying that he was sometimes but not always right.

Betteridge’s law was an aphorism, not a logical absolute, and so was never intended to be a rigorous statement. I’m sure Betteridge was quite aware that there had been exceptions, or at least that one could easily create an exception. He did so himself. But as is often the case with yes/no statements that are not always true, it can be turned into a rigorous statement using probability.

Betteridge could have said that if a headline ends in a question mark, the probability that the answer is no is large. Then my headline, added to the vast collection of headlines, would ever-so-slightly lower the proportion of headlines that ask questions that can be answered negatively, without contradicting Betteridge, if he was right.

We could call Betteridge’s constant the probability that a headline asks a question that could be answered no. But then it probably isn’t a constant. Maybe knowledge of Betteridge’s law influences how people write headlines …

* * *

Thanks to Don Sizemore for pointing out Betteridge’s law.

An incomplete post about sphere volumes

This is an incomplete blog post. Maybe you can help finish it.

One of the formulas I’ve looked up the most is the volume of a ball in n dimensions. I needed it often enough to be aware of it, but not often enough to remember it. Here’s the formula:

\frac{\pi^{\frac{n}{2}}}{\Gamma\left(\frac{n}{2} + 1\right)}r^n

The factor of r^n is no surprise: of course the volume as a function of radius has to be proportional to r^n. So we can make the formula a little simpler by just remembering the formula for the volume of a unit ball.

Next, we can make the formula simpler still by using factorials instead of the gamma function. If n is a non-negative integer, n! = Γ(n+1). We can use that to define factorial for non-integers. Then the volume of a unit ball is

 \frac{\pi^{\frac{n}{2}}}{\frac{n}{2}!}

That’s easier to remember.
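As a quick sanity check, here is a minimal Python sketch of the unit ball formula. The function name unit_ball_volume is made up for illustration; the only library calls are gamma and pi from the standard math module. It should reproduce the familiar values in low dimensions: 2 for the interval [-1, 1], π for the unit disk, and 4π/3 for the unit ball in three dimensions.

```python
from math import gamma, pi

def unit_ball_volume(n):
    """Volume of the unit ball in n dimensions: pi^(n/2) / (n/2)!

    The factorial of n/2 is computed as Gamma(n/2 + 1), so odd n works too.
    (The function name is hypothetical, chosen for this sketch.)
    """
    return pi ** (n / 2) / gamma(n / 2 + 1)

# Print the volumes for the first few dimensions.
for n in range(1, 6):
    print(n, unit_ball_volume(n))
```

Multiplying the result by r**n gives the volume of a ball of radius r.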

It’s also curious. The nth term in the series for e^x is x^n/n!, so the volumes of unit balls look like the series for e^π except compressed, with each index n cut in half. The volumes are not the coefficients in the series for e^x, but could they be the coefficients in the series for another familiar function? To find out, let’s stick back in the factor of r^n and sum.

\sum_{n=0}^\infty \frac{\pi^{\frac{n}{2}}}{\frac{n}{2}!} \,r^n

This is the sum of the volumes of balls of radius r in all dimensions. That doesn’t make sense by itself, but you could also think of this as the generating function for the volumes of unit balls. So can we find a closed-form expression for the generating function? Yes:

\sum_{n=0}^\infty \frac{\pi^{\frac{n}{2}}}{\frac{n}{2}!} \,r^n = \sqrt{\pi} r \exp(\pi r^2) (\mbox{erf}(\sqrt{\pi} r) + 1)

If you work with probability, you probably find Φ more familiar than the error function (see notes relating these) and find exp(x^2/2) more familiar than exp(x^2). So you could rewrite the generating function as f(√(2π)r) where

f(x) = \sqrt{2} x\exp(x^2/2) \Phi(x)

That looks familiar, but I don’t know what to do with it.

I warned you this would be an incomplete post. I feel like there’s an interesting connection to be made, but I’m not quite there. Any suggestions?

Update: I completely forgot about this post and wrote a more complete post on the same topic on May 26, 2019. As Greg Egan points out in the comments below, there’s an error in this post that the newer post corrects.
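The update does not say where the error is, but a numerical check can help locate it. The sketch below compares partial sums of the series with the closed form as printed above. It also compares them with a candidate that drops the leading √π r factor. That candidate is an assumption made here, not something taken from the newer post. It is suggested by the n = 0 term of the series, which equals 1, while the expression as printed vanishes at r = 0. The function names are made up for illustration.

```python
from math import erf, exp, gamma, pi, sqrt

def partial_sum(r, terms=120):
    """Sum of pi^(n/2) r^n / Gamma(n/2 + 1) for n = 0 .. terms - 1."""
    return sum(pi ** (n / 2) * r ** n / gamma(n / 2 + 1) for n in range(terms))

def closed_form_as_printed(r):
    """The closed form exactly as it appears in the post."""
    return sqrt(pi) * r * exp(pi * r ** 2) * (erf(sqrt(pi) * r) + 1)

def closed_form_candidate(r):
    """Assumption: the same expression without the leading sqrt(pi) r factor."""
    return exp(pi * r ** 2) * (erf(sqrt(pi) * r) + 1)

# Compare the three quantities at a few small radii.
for r in (0.0, 0.5, 1.0):
    print(r, partial_sum(r), closed_form_as_printed(r), closed_form_candidate(r))
```

If the candidate is the right correction, the analogous expression in terms of Φ would be 2 exp(x^2/2) Φ(x), since erf(x/√2) + 1 = 2Φ(x).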

RSS readers on Linux

This afternoon I asked on UnixToolTip for suggestions of RSS readers on Linux. Here are the suggestions I got, in order of popularity.

Update:

Some other readers available on Linux:

 

How to double science research

Scientists spend 40% of their time chasing grants according to some estimates. Suppose they spend 20% of their time doing something else, such as teaching. That means they spend no more than 40% of their time doing research.

If universities simply paid their faculty a salary rather than giving them a hunting license for grants, the faculty could spend 80% of their time on research rather than 40%. Of course the numbers wouldn’t actually work out so simply. But it is safe to say that if you remove something that takes 40% of their time, researchers could spend more time doing research. (Researchers working in the private sector are often paid by grants too, so to some extent this applies to them as well.)

Universities depend on grant money to pay faculty. But if the money allocated for research were given to universities instead of individuals, universities could afford to pay their faculty.

Not only that, universities could reduce the enormous bureaucracies created to manage grants. This isn’t purely hypothetical. When Hillsdale College decided to refuse all federal grant money, they found that the loss wasn’t nearly as large as it seemed because so much of the grant money had been going to administering grants.

Suffix primes

MathUpdate tweeted this afternoon that

Any number made by removing the first n digits of 646216567629137 is still prime.

and linked to sequence A012885 in the On-Line Encyclopedia of Integer Sequences (OEIS). The OEIS heading for the sequence is

Suffixes of 357686312646216567629137 (all primes)

which implies you can start with an even larger number, cutting off the first digit each time and producing a sequence of primes.

The following Python code verifies that this is indeed the case.

    from sympy.ntheory import isprime

    x = "357686312646216567629137"

    # Test each suffix for primality, dropping the leading digit each time.
    while x:
        print(isprime(int(x)))
        x = x[1:]

Update: lucio wrote a program to show that the prime given here is the longest one with the suffix property.

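    from sympy.ntheory import isprime

    def extend_prime(n, result):
        # Prepend each digit to n; keep the primes and try to extend them further.
        for i in range(10):
            nn = int(str(i) + str(n))
            if nn == n: continue  # a leading 0 leaves the number unchanged
            if isprime(nn):
                result.append(nn)
                extend_prime(nn, result)
        return result

    print("Max Prefix Prime:", max(extend_prime("", [])))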

One minor suggestion: by using range(1, 10) rather than range(10) above, i.e. eliminating 0, the line if nn == n: continue could be eliminated.

Instead of calling max, you could call len to find that there are 4260 suffix primes.

Here’s a list of all suffix primes, created by running the code above and sorting the output.
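The two suggestions above, prepending only the digits 1 through 9 and calling len as well as max, can be sketched as follows. To keep the example free of dependencies, it replaces sympy’s isprime with a Miller-Rabin test written for this sketch. That is a swap made here, not part of lucio’s program. The helper names is_probable_prime and suffix_primes are made up. Miller-Rabin with the first 13 primes as bases is known to be deterministic below roughly 3.3 × 10^24. The handful of 25-digit candidates above that bound are tested with 20 bases, which is strictly speaking a probabilistic test.

```python
# First 20 primes, used for trial division and as Miller-Rabin bases.
SMALL_PRIMES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29,
                31, 37, 41, 43, 47, 53, 59, 61, 67, 71)

def is_probable_prime(n):
    """Miller-Rabin test; stands in for sympy's isprime (an assumption of this sketch)."""
    if n < 2:
        return False
    for p in SMALL_PRIMES:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in SMALL_PRIMES:
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True

def suffix_primes():
    """Collect every prime whose suffixes are all prime, built right to left."""
    found = []
    frontier = [""]  # digit strings that may still be extended on the left
    while frontier:
        tail = frontier.pop()
        for digit in "123456789":  # skipping 0 removes the need for the nn == n check
            candidate = digit + tail
            if is_probable_prime(int(candidate)):
                found.append(int(candidate))
                frontier.append(candidate)
    return found

primes = suffix_primes()
print(len(primes), max(primes))
```

The count and the maximum printed at the end can be compared with the 4260 suffix primes and the 24-digit prime reported above.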

Other special primes

Use typewriter font for code inside prose

There’s a useful tradition of using a typewriter font, or more generally some monospaced font, for bits of code sprinkled in prose. The practice is analogous to using italic to mark, for example, a French mot dropped into an English paragraph. In HTML, the code tag marks content as software code, which a browser typically will render in a typewriter font.

Here’s a sentence from a new article on Python at Netflix that could benefit from a few code tags.

These features (and more) have led to increasingly pervasive use of Python in everything from small tools using boto to talk to AWS, to storing information with python-memcached and pycassa, managing processes with Envoy, polling restful APIs to large applications with requests, providing web interfaces with CherryPy and Bottle, and crunching data with scipy.

Here’s the same sentence with some code tags.

These features (and more) have led to increasingly pervasive use of Python in everything from small tools using <code>boto</code> to talk to AWS, to storing information with <code>python-memcached</code> and <code>pycassa</code>, managing processes with <code>Envoy</code>, polling restful APIs to large applications with <code>requests</code>, providing web interfaces with <code>CherryPy</code> and <code>Bottle</code>, and crunching data with <code>scipy</code>.

It’s especially helpful to let the reader know that packages like <code>requests</code> are indeed packages. It helps to clarify, for example, whether Wes McKinney has been stress testing pandas or <code>pandas</code>. That way we know whether to inform animal protection authorities or to download a new version of a library.

Outliers and kettlebells

When you reject a data point as an outlier, you’re saying that the point is unlikely to occur again, despite the fact that you’ve already seen it. This puts you in the curious position of believing that some values you have not seen are more likely than one of the values you have in fact seen.

Maybe you believe that you did not actually see the outlier. If you’re looking at a set of human heights, and one of the values is 61 feet, it is more plausible that you’ve seen a transcription error than that you’ve encountered a person an order of magnitude taller than average.

However, if you believe that a data point is real, but unlikely to reoccur, you are placing more weight on subjective belief than on data, which may or may not be appropriate.

Here’s a personal example. This weekend I bought a kettlebell. As I was waiting in line to check out, I struck up a conversation with the man in line behind me. His right leg was in a cast and resting on a scooter. He told me that he broke his foot in two places by dropping a kettlebell on it! My immediate thought was that this was a fluke, an outlier. My second thought was that according to the only data I have, kettlebells are quite dangerous.

Perhaps the rational decision would have been to leave the store immediately, but I bought the kettlebell anyway. Still, the fellow behind me made an impression. I will think of him every time I work out with the kettlebell and be more careful than I would have been otherwise. Kettlebells are probably more dangerous than I’d like to believe, but so is a sedentary life.