Twitter account wordclouds

Here are wordclouds for some of my most popular Twitter accounts. Thanks to Mike Croucher for creating these images. He explains on his blog how to create your own Twitter wordclouds using R.

My most popular account is CompSciFact, which tweets about computer science and related topics.

AlgebraFact is for algebra, number theory, and miscellaneous pure math. (Miscellaneous applied math is more likely to end up on AnalysisFact.)

ProbFact is for probability.

DataSciFact is for data science: statistics, machine learning, visualization, etc.

You can find a full list of my various Twitter accounts here.

Mathematical alchemy and wrestling

David Mumford wrote a blog post a few weeks ago in which he identified four tribes of mathematicians. Here’s a summary of his description of the four tribes.

  • Explorers are people who ask — are there objects with such and such properties and if so, how many? …
  • Alchemists … are those whose greatest excitement comes from finding connections between two areas of math that no one had previously seen as having anything to do with each other.
  • Wrestlers … thrive not on equalities between numbers but on inequalities, what quantity can be estimated or bounded by what other quantity, and on asymptotic estimates of size or rate of growth. This tribe consists chiefly of analysts …
  • Detectives … doggedly pursue the most difficult, deep questions, seeking clues here and there, sure there is a trail somewhere, often searching for years or decades. …

I’m some combination of alchemist and wrestler. I suppose most applied mathematicians are. Applications usually require taking ideas developed in one context and using them in another. They often take complex things, then estimate and bound them by things that are easier to understand.

One of my favorite proofs is Bernstein’s proof of the Weierstrass approximation theorem. It appeals to both alchemists and wrestlers. It takes an inequality from probability and applies it in an entirely different context, one with no randomness in sight, to explicitly construct an approximation satisfying the theorem.
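As a rough illustration of the construction (a sketch, not the proof itself): the degree-n Bernstein polynomial of a function f on [0, 1] is the sum over k = 0, ..., n of f(k/n) C(n, k) x^k (1 - x)^(n - k). This is the expected value of f(K/n) when K is a binomial(n, x) random variable, which is where probability enters. The function name bernstein and the test function below are chosen only for illustration.

```python
from math import comb

def bernstein(f, n, x):
    """Evaluate the degree-n Bernstein polynomial of f at x in [0, 1]."""
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

if __name__ == "__main__":
    f = lambda x: abs(x - 0.5)  # continuous, but not differentiable at 1/2
    grid = [i / 50 for i in range(51)]
    for n in (10, 40, 160):
        # Largest error on the grid; it should shrink as n grows.
        err = max(abs(bernstein(f, n, x) - f(x)) for x in grid)
        print(n, err)
```

The convergence is slow, so this is a tool for proving the theorem rather than a practical approximation method.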

I thought of David Mumford’s tribes when I got an email a couple days ago from someone who wrote to tell me he found in one of my tech reports a function that he’d studied in his own research. My tech report was motivated by a problem in biostatistics, while he was looking at material structural fatigue. The connection between remote fields was a bit of alchemy, while the content of the tech report, an upper bound on an integral, was a bit of wrestling.

You do not want to be an edge case

Hilary Mason made an important observation on Twitter a few days ago:

You do not want to be an edge case in this future we are building.

Systems run by algorithms can be more efficient on average, but make life harder on the edge cases, people who are exceptions to the system developers’ expectations.

Algorithms, whether encoded in software or in rigid bureaucratic processes, can unwittingly discriminate against minorities. The problem isn’t recognized minorities, such as racial minorities or the disabled, but unrecognized minorities, people who were overlooked.

For example, two twins were recently prevented from getting their driver’s licenses because DMV software couldn’t tell their photos apart. Surely the people who wrote the software harbored no malice toward twins. They just didn’t anticipate that two driver’s license applicants could have indistinguishable photos.

I imagine most people reading this have had difficulty with software (or bureaucratic procedures) that didn’t anticipate something about them; everyone is an edge case in some context. Maybe you don’t have a middle name, but a form insists you cannot leave the middle name field blank. Maybe there are more letters in your name or more children in your family than a programmer anticipated. Maybe you choose not to use some technology that “everybody” uses. Maybe you happen to have a social security number that hashes to a value that causes a program to crash.

When software routinely fails, there obviously has to be a human override. But as software improves for most people, there’s less apparent need to make provision for the exceptional cases. So things could get harder for edge cases as they get better for more people.

Bastrop State Park, four years later

Four years ago I wrote about the wildfires in Bastrop, Texas. Here’s a photo from the time by Kerri West, used by permission.

Today I visited Bastrop State Park on the way home from Austin. Some trees, particularly oaks, survived the fires. Pines have come back on their own in parts of the park. A volunteer working in the park told me that some of these new trees are 10 feet tall, though I didn’t see these myself. In other parts volunteers have planted pines. Here’s a photo I took this morning.

Most of the new growth in the forest is underbrush, in some places thicker than in the photo above. The same volunteer mentioned above said that the park is already planning prescribed burning in some areas to clear the underbrush and protect the viable trees.




Interpreting scientific literature about your product

A medical device company approached me with the following problem. Scientists had written academic journal articles about their product, but the sales force couldn’t understand what they said. My task was to read the articles, then tell the people in sales what the articles were saying in layman’s terms.

One of the questions that came up was how to compare two studies with different sample sizes. Of course there are many factors involved, but I said that as a general rule of thumb, a study with four times the sample size will give confidence intervals that are half as wide. They loved that. In the midst of what to them was a sea of statistical mumbo jumbo, here was something they could grab onto. I also pointed out a few things I thought doctors would want to hear and two or three buzzwords the sales people should learn.
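The rule of thumb comes from the fact that, for a fixed standard deviation, the width of a normal-approximation confidence interval for a mean is proportional to 1/√n. Here is a minimal sketch, assuming a known standard deviation sigma and the usual 95% z-value of 1.96. Both are illustrative assumptions, not details of the studies in question.

```python
from math import sqrt

def ci_width(sigma, n, z=1.96):
    """Width of a normal-approximation confidence interval for a mean."""
    return 2 * z * sigma / sqrt(n)

if __name__ == "__main__":
    small = ci_width(sigma=10.0, n=100)
    large = ci_width(sigma=10.0, n=400)
    # Quadrupling n halves the width, whatever sigma and z are.
    print(small, large, large / small)
```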

The scientific literature on their product was favorable, but the company was not able to convey this because the sales reps didn’t have the words to use. I gave them the words by translating scientific jargon to simple language.

If you’d like for me to give your sales team the words they need, please contact me.

New data, not just bigger data

The Insight 2015 conference highlighted some impressive applications of big data: predicting the path of hurricanes more accurately (as we saw with Hurricane Patricia), improving the performance of athletes, making cars safer, etc.

These applications involve large amounts of data. But more importantly they involve new data, not simply greater quantities of data we’ve had before. Cheap sensors make it possible to measure things more directly and in higher resolution than before. We have sources of data, such as social media, that are qualitatively different from what we’ve had in the past.

Simply saying we have more data than before obscures what’s happening. For example, we don’t know more about consumer behavior than a generation ago because we do more phone surveys and have more customer satisfaction post cards to fill out. We know more because we can observe things we couldn’t observe before.

Clever analysis deserves some credit for the successes of big data, but more credit goes to new sources of data and the technologies that make these sources possible.

Insight 2015

A few weeks ago I got a message on Twitter saying that IBM’s Watson had identified me as an “influencer” and invited me to the company’s Insight 2015 conference. So that’s where I am this week.

I had a brief interview last night. Someone took this photo as we were setting up.

Impulse response

You may expect that a burst of input will cause a burst of output. Sometimes that’s the case, but often a burst of input results in a long, smoothly decreasing succession of output. You may not get immediate results, but long-term results. This is true of life in general, but it’s also true in a precise sense of differential equations.

One of the surprises from differential equations is that an infinitely concentrated input usually results in a diffuse output. A fundamental solution to a differential equation is a solution to the equation with a Dirac delta as the forcing function. In a sense, your input is so concentrated that it’s not actually a function. And yet the output may be a nice continuous function, and one that is not particularly concentrated.

The situation is analogous to striking a bell. The input, the hammer blow to the bell, is extremely short, but the response of the bell is long and smooth. Solving a differential equation with a delta function as input is like learning about a bell by listening to how it rings when you strike it. A better analogy would be striking the bell in many places; a fundamental solution actually solves for a delta function with a position argument, not just a single delta function.
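For a concrete case, take the first-order equation y′ + a y = f(t) with y = 0 before the input arrives. Replacing the delta function with a rectangular pulse of width eps and height 1/eps gives a response that can be written in closed form. As eps shrinks, the response approaches exp(−a t) for t > 0, which is smooth and spread out over time even though the input is not. The equation and the function names below were chosen only for illustration.

```python
from math import exp, expm1

def pulse_response(t, a, eps):
    """Solution of y' + a*y = f(t) with y(0) = 0, where f is a pulse of
    height 1/eps on [0, eps] (unit area) and zero elsewhere."""
    if t <= 0:
        return 0.0
    if t < eps:
        # While the pulse is on, y rises toward 1/(a*eps).
        return -expm1(-a * t) / (a * eps)
    # After the pulse ends, y decays exponentially.
    return expm1(a * eps) / (a * eps) * exp(-a * t)

def impulse_response(t, a):
    """Limit of pulse_response as eps -> 0: the response to a delta function."""
    return exp(-a * t) if t > 0 else 0.0

if __name__ == "__main__":
    a, t = 2.0, 1.0
    for eps in (0.1, 0.01, 0.001):
        print(eps, pulse_response(t, a, eps), impulse_response(t, a))
```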

If you’re curious how this informal talk of “infinitely concentrated” input and delta “functions” can be made rigorous, start by reading this post.

Related post: Life lessons from differential equations

PACE: Property Assessed Clean Energy

Energy efficiency improvements can pay for themselves in the long run. Financing can make the improvements immediately cash-flow positive, but only if the loan tenor can match the useful life of the equipment. This enables the payments to be low enough that the projected energy savings exceed the payments.
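To see why the loan term matters, here is a small sketch using the standard level-payment (annuity) formula. The project cost, interest rate, and annual savings are made-up numbers for illustration only and do not describe any actual PACE project.

```python
def annual_payment(principal, rate, years):
    """Level annual payment that repays `principal` over `years` at interest `rate`."""
    return principal * rate / (1 - (1 + rate) ** -years)

if __name__ == "__main__":
    cost = 100_000     # hypothetical cost of an energy upgrade
    rate = 0.06        # hypothetical annual interest rate
    savings = 12_000   # hypothetical annual energy savings
    for years in (5, 10, 20):
        payment = annual_payment(cost, rate, years)
        # True means the project is cash-flow positive from the first year.
        print(years, round(payment, 2), savings > payment)
```

With these hypothetical numbers, only the 20-year term brings the payment below the annual savings.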

PACE, which stands for Property Assessed Clean Energy, is a nation-wide program that makes long-term financing available for energy upgrades, repaid through an annual assessment added to your property tax bill. Though it is a national initiative, each state must create its own PACE program, and so there is some variety in the forms PACE can take. Texas passed legislation in June 2013 authorizing local governments to implement PACE programs. Yesterday the Houston City Council unanimously passed a Resolution of Intent to adopt PACE.

I have been working with PACE Houston, a private company that develops PACE projects by providing strategic advice to property owners. If you’re in the Houston area and are interested in help with PACE financing, you could contact me, or go to the contact page on the PACE Houston web site.

Second languages and selection bias

When I was growing up, I was told that you could never become fluent in a second language, and I believed it. I had no reason not to. I didn’t know anybody who had become fluent in a second language, and I could think of plenty of people who had learned English as a second language but who weren’t fluent.

But how would I know if someone had learned English fluently? If they were fluent, I’d assume they were native speakers. I knew people had learned English as a second language because they weren’t fluent. This is selection bias, where the selection of data you see is influenced by the very thing you’re interested in.

A famous example of selection bias is that of British bombers returning from missions over Germany in World War II. Statistician Abraham Wald advised the RAF to add armor precisely where these bombers were not shot, reasoning that he was only able to inspect bombers that survived their missions. More on this story here.

When I was in college, I had a roommate who had learned Spanish fluently as a second language. I thought “That’s not possible!” though of course it is possible. I immediately began to wonder what else that I thought was impossible was merely difficult.

Number of digits in n!

The other day I ran across the fact that 23! has 23 digits. That made me wonder how often n! has n digits.

There can only be a finite number of cases, because n! grows faster than 10^n for n > 10, and it’s reasonable to guess that 23 might be the largest case. Turns out it’s not, but it’s close. The only cases where n! has n digits are 1, 22, 23, and 24. Once you’ve found these by brute force, it’s not hard to show that they must be the only ones because of the growth rate of n!.
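The brute-force step is easy because Python integers have arbitrary precision. The following sketch lists the values of n up to a limit for which n! has exactly n digits. The function name and the search limit of 200 are arbitrary choices, and the search alone doesn’t rule out larger cases; that takes the growth-rate argument.

```python
from math import factorial

def n_digit_factorials(limit):
    """Return all n in 1..limit such that n! has exactly n decimal digits."""
    return [n for n in range(1, limit + 1)
            if len(str(factorial(n))) == n]

if __name__ == "__main__":
    print(n_digit_factorials(200))
```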

Is there a convenient way to find the number of digits in n! without having to compute n! itself? Sure. For starters, the number of digits in the base 10 representation of a number x is

⌊ log_10 x ⌋ + 1,

where ⌊ z ⌋ is the floor of z, the largest integer less than or equal to z. The log of the factorial function is easier to compute than the factorial itself because it won’t overflow. You’re more likely to find a function to compute the log of the gamma function than the log of factorial, and more likely to find software that uses natural logs than logs base 10. So in Python, for example, you could compute the number of digits with this:

from scipy.special import gammaln
from math import log, floor

def digits_in_factorial(n):
    return floor( gammaln(n+1)/log(10.0) ) + 1

What about a more elementary formula, one that doesn’t use the gamma function? If you use Stirling’s approximation for factorial and take the log of that, you should at least get a good approximation. Here it is again in Python:

from math import log, floor, pi

def stirling(n):
    return floor( ((n+0.5)*log(n) - n + 0.5*log(2*pi))/log(10) ) + 1

The code above is exact for every n > 2 as far as I’ve tested, up to n = 1,000,000. (Note that one million factorial is an extremely large number. It has 5,565,709 digits. And yet we can easily say something about this number, namely how many digits it has!)

The code may break down somewhere because of the error in Stirling’s approximation or the limitations of floating point arithmetic. Stirling’s approximation gets more accurate as n increases, but it’s conceivable that a factorial value could be so close to a power of 10 that the approximation error pushes it from one side of the power of 10 to the other. Maybe that’s not possible and someone could prove that it’s not possible.
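One way to look for a breakdown is to compare the Stirling-based count against an exact count computed with integer arithmetic. This is only a spot check over whatever range you have patience for, not a proof. The sketch restates the formula so that it is self-contained, and the helper names are illustrative.

```python
from math import factorial, floor, log, pi

def stirling_digits(n):
    """Digit count of n! estimated from Stirling's approximation."""
    return floor(((n + 0.5) * log(n) - n + 0.5 * log(2 * pi)) / log(10)) + 1

def first_mismatch(start, stop):
    """Return the first n in start..stop where the estimate is wrong, else None."""
    fact = factorial(start - 1)
    for n in range(start, stop + 1):
        fact *= n  # update n! incrementally instead of recomputing it
        if stirling_digits(n) != len(str(fact)):
            return n
    return None

if __name__ == "__main__":
    print(first_mismatch(3, 1000))
```

For much larger ranges, converting huge integers to strings gets slow, and recent Python versions cap the length of that conversion by default, so a different exact method would be needed.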

You could extend the code above to optionally take another base besides 10.

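from math import log, floor, pi
from scipy.special import gammaln

def digits_in_factorial(n, b=10):
    # gammaln(n+1) is the natural log of n!; dividing by log(b) converts to base b
    return floor( gammaln(n+1)/log(b) ) + 1

def stirling(n, b=10):
    # Stirling's approximation to log n!, converted to base b
    return floor( ((n+0.5)*log(n) - n + 0.5*log(2*pi))/log(b) ) + 1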

The code using Stirling’s approximation still works for all n > 2, even for b as small as 2. This is slightly surprising since the number of bits in a number is more detailed information than the number of decimal digits.

Technical arbitrage

There are huge opportunities to take technology that is well-known and undervalued in one context and apply it in another where it is unknown but valuable. You could call this technical arbitrage, analogous to financial arbitrage, taking advantage of the price difference of something in two markets.

As with financial arbitrage, the hard part is spotting opportunities, not necessarily acting on them. If you want to be a hero with regular expressions, as in the xkcd cartoon below, the key isn’t knowing regular expressions well. The key is knowing about regular expressions in a context where no one else does.

To spot a technical arbitrage opportunity, you need to know what that technology can and cannot (easily) do. You also need to recognize situations where the technology can help. You don’t need to be able to stand up to a quiz show style oral exam.


Related post: You can be a hero with a simple idea

The academic cocoon

In the novel Enchantment, the main character, Ivan, gives a bitter assessment of his choice of an academic career, saying it was for “men who hadn’t yet grown up.”

The life he had chosen was a cocoon. Surrounded by a web of old manuscripts and scholarly papers, he would achieve tenure, publish frequently, teach a group of carefully selected graduate students, be treated like a celebrity by the handful of people who had the faintest idea who he was, and go to his grave deluded into thinking he had achieved greatness when in fact he stayed in school all his life. Where was the plunge into the unknown?

I don’t believe the author meant to imply that an academic career is necessarily so insular. In the context of the quote, Ivan says his father, also a scholar, “hadn’t stayed in the cocoon” but had taken great risks. But there is certainly the danger of living in a tiny world and believing that you’re living in a much larger world. Others may face the same danger, though it seems particularly acute for academics.

It’s interesting that Ivan criticizes scholars for avoiding the unknown. Scholars would say they’re all about pursuing the unknown. But scholars pursue known unknowns, questions that can be precisely formulated within the narrow bounds of their discipline. The “plunge into the unknown” here refers to unknown unknowns, messier situations where not only are the outcomes unknown, even the risks themselves are unknown.

Secret equation

I got a call this afternoon from someone who records audio books for the blind. He wanted to know the name of a symbol he didn’t recognize. He then asked me if the equation was real.

Here’s the equation in context, from the book Michael Vey 4: Hunt for Jade Dragon.

Suddenly math problems she hadn’t understood made sense. Except now they weren’t just numbers and equations, they were patterns and colors. Calculus, geometry, and trigonometry were easy to understand, simple as a game, like shooting balls at a basketball hoop that was a hundred feet wide. Then a specific sequence of numbers, letters, and symbols started running through her mind.

s(t; t_y) = k \frac{Q}{r^2} \hat{r} \int_{R^2} m(x, y) e^{-2\pi i \daleth\left(\frac{G_x xt + \daleth G_y yt_y}{2\pi}\right)} \,dx\,dy

She almost said the equation when a powerful thought came over her not to speak it out loud—that she must not ever divulge it. She knew that what she was receiving was something of great importance, even if she had no idea what it meant.

I believe the symbol in question is the fourth letter of the Hebrew alphabet, ℸ (daleth).

Is this a real equation? The overall form of it looks like an integral transform. However, the two instances of ℸ in the equation look suspicious.

One reason is that I’ve never seen ℸ used in math, though I read somewhere that Cantor used it for the cardinality of some set. Even so, Cantor’s use wouldn’t make sense inside an integral.

Also, the two instances of ℸ are used differently. The first is a function (or else the factors of 2 π could be cancelled out) and the second one apparently is not. Finally, the equation is symmetric in x and y if you remove the two daleths. So I suspect this was a real equation with the daleths added in for extra mystery.