Why don’t you simply use XeTeX?

From an FAQ post I wrote a few years ago:

This may seem like an odd question, but it’s actually one I get very often. On my TeXtip twitter account, I include tips on how to create non-English characters such as using \AA to produce Å. Every time someone will ask “Why not use XeTeX and just enter these characters?”

If you can “just enter” non-English characters, then you don’t need a tip. But a lot of people either don’t know how to do this or don’t have a convenient way to do so. Most English speakers only need to type foreign characters occasionally, and will find it easier, for example, to type \AA or \ss than to learn how to produce Å or ß from a keyboard. If you frequently need to enter Unicode characters, and know how to do so, then XeTeX is great.

One does not simply type Unicode characters.

Related posts:

Team dynamics and encouragement

When you add people to a project, the total productivity of the team as a whole may go up, but the productivity per person usually goes down. Someone suggested that as a rule of thumb, a company needs to triple its number of employees to double its productivity. Fred Brooks summarized this saying

“Many hands make light work” — Often
But many hands make more work — Always

I’ve seen this over and over. But I think I’ve found an exception. When work is overwhelming, a lot of time is absorbed by discouragement and indecision. In that case, new people can make a big improvement. They not only get work done, but they can make others feel more like working.

Flood cleanup is like that, and that’s what motivated this note. Someone new coming by to help energizes everyone else. And with more people, you see progress sooner and make more progress, in a sort of positive feedback loop.

This is all in the context of fairly small teams. There must be a point where adding more people decreases productivity per person or even total productivity. I’ve heard reports of a highly bureaucratic relief organization that makes things worse when they show up to “help.” The ideal team size is somewhere between a couple discouraged individuals and a bloated bureaucracy.

Related post: Optimal team size

Relearning from a new perspective

I had a conversation with someone today who said he’s relearning logic from a categorical perspective. What struck me about this was not the specifics but the pattern:

Relearning _______ from a _______ perspective.

Not relearning something forgotten, but going back over something you already know well, but from a different starting point, a different approach, etc.

Have any experiences along these lines you’d like to share in the comments? Anything you have relearned, attempted to relearn, or would like to relearn from a new angle?

Hurricane Harvey update

As you may know, I live in the darkest region of the rainfall map below.

Hurricane Harvey rainfall map

My family and I are doing fine. Our house has not flooded, and at this point it looks like it will not flood. We’ve only lost electricity for a second or two.

Of course not everyone in Houston is doing so well. Harvey has done tremendous damage. Downtown was hit especially hard, and apparently they are in for more heavy rain. But it looks like the worst may be over for my area.

Update (5:30 AM, August 28): More flooding overnight, some of it near by. We’re still OK. It looks like the heaviest rain is over, but there’s still rain in the forecast and there’s no place for more rain to go.

Houston has two enormous reservoirs west of town that together hold about half a billion cubic meters of water. This morning they started releasing water from the reservoirs to prevent dams from breaking.

Space City Weather has been the best source of information. The site offers “hype-free forecasts for greater Houston.” It’s a shame that a news source should have to describe itself as “hype-free,” but they are indeed hype-free and other sources are not.

Update (August 29): Looks like the heavy rain is over. We’re expecting rain for a few more days, but the water is receding faster than it’s collecting, at least on the northwest side.

Solving problems we wish we had

There’s a great line from Heather McGaw toward the end of the latest episode of 99 Percent Invisible:

Sometimes … we can start to solve problems that we wish were problems because they’re easy to solve.

Reminds me of an excerpt from Richard Weaver’s book Ideas Have Consequences:

Obsession, according to the canons of psychology, occurs when an innocuous idea is substituted for a painful one. The victim simply avoids recognizing the thing which will hurt. We have seen that the most painful confession for the modern egoist to make is that there is a center or responsibility. He has escaped it by taking his direction with reference to the smallest points. … The obsession, however, is a source of great comfort to the obsessed.

Subscribing by email

You can subscribe to my blog by email or RSS. I also have a brief newsletter you could sign up for. There are links to these in the sidebar of the blog:

subscription options

If you subscribe by email, you’ll get an email each morning containing the post(s) from the previous day.

I just noticed a problem with email subscription: it doesn’t show SVG images, at least when reading via Gmail; maybe other email clients display SVG correctly. Here’s what a portion of yesterday’s email looks like in Gmail:

screen shot of missing image

I’ve started using SVG for graphs, equations, and a few other images. The main advantage to SVG is that the images look sharper. Also, you can display the same image file at any resolution; no need to have different versions of the image for display at different sizes. And sometimes SVG files are smaller than their raster counterparts.

There may be a way to have web site visitors see SVG and email subscribers see PNG. If not, email subscribers can click on the link at the top of each post to open it in a browser and see all the images.

By the way, RSS readers handle SVG just fine. At least Digger Reader, the RSS reader I use, works well with SVG. The only problem I see is that centered content is always moved to the left.

* * *

The email newsletter is different from the email blog subscription. I only send out a newsletter once a month. It highlights the most popular posts and says a little about what I’ve been up to. I just sent out a newsletter this morning, so it’ll be another month before the next one comes out.

Color theory questions

Here’s a script I wanted to write: given a color c specified in RGB and an angle θ, rotate c on the color wheel by θ and return the RGB value of the result.

You can’t rotate RGB values per se, but you can rotate hues. So my initial idea was to convert RGB to HSV or HSL, rotate the H component, then convert back to RGB. There are some subtleties with converting between RGB and either HSV or HSL, but I’m willing to ignore those for now.

The problem I ran into was that my idea of a color wheel doesn’t match the HSV or HSL color wheels. For example, I’m thinking of green as the complementary color to red, the color 180° away. On the HSV and HSL color wheels, the complementary color to red is cyan. The color wheel I have in mind is the “artist’s color wheel” based on the RYB color space, not RGB. Subtractive color, not additive.

This brings up several questions.

  • How do you convert back and forth between RYB and RGB?
  • How do you describe the artist’s color wheel mathematically, in RYB or any other system?
  • What is a good reference on color theory? I’d like to understand in detail how the various color systems relate, something that spans the gamut (pun intended) from an artist’s perspective down to physics and physiology.

Student’s future, teacher’s past

“Teachers should prepare the student for the student’s future, not for the teacher’s past.” — Richard Hamming

I ran across the above quote from Hamming this morning. It made me wonder whether I tried to prepare students for my past when I used to teach college students.

How do you prepare a student for the future? Mostly by focusing on skills that will always be useful, even as times change: logic, clear communication, diligence, etc.

Negative forecasting is more reliable here than positive forecasting. It’s hard to predict what’s going to be in demand in the future (besides timeless skills), but it’s easier to predict what’s probably not going to be in demand. The latter aligns with Hamming’s exhortation not to prepare students for your past.

Changing your mind

From Dorothy Sayers’ essay Why Work?

It is always strange and painful to have to change a habit of mind; though, when we have made the effort, we may find a great relief, even a sense of adventure and delight, in getting rid of the false and returning to the true.

Cauchy, Benford, and a problem with NHST


Samples from a Cauchy distribution nearly follow Benford’s law. I’ll demonstrate this below. The more data you see, the more confident you should be of this. But with a typical statistical approach, crudely applied NHST (null hypothesis significance testing), the more data you see, the less convinced you are.

This post assumes you’ve read the previous post that explains what Benford’s law is and looks at how well samples from a Weibull distribution follow that law.

This post has two purposes. First, we show that samples from a Cauchy distribution approximately follow Benford’s law. Second, we look at problems with testing goodness of fit with NHST.

Cauchy data

We can reuse the code from the previous post to test Cauchy samples, with one modification. Cauchy samples can be negative, so we have to modify our leading_digit function to take an absolute value.

      def leading_digit(x):
          y = log10(abs(x)) % 1
          return int(floor(10**y))

We’ll also need to import cauchy from scipy.stats and change where we draw samples to use this distribution.

      samples = cauchy.rvs(0, 1, N)

Here’s how a sample of 1000 Cauchy values compared to the prediction of Benford’s law:

| Leading digit | Observed | Predicted |
|             1 |      313 |       301 |
|             2 |      163 |       176 |
|             3 |      119 |       125 |
|             4 |       90 |        97 |
|             5 |       69 |        79 |
|             6 |       74 |        67 |
|             7 |       63 |        58 |
|             8 |       52 |        51 |
|             9 |       57 |        46 |

Here’s a bar graph of the same data.Bar graph of Cauchy leading digits compared to Benford's law

Problems with NHST

A common way to measure goodness of fit is to use a chi-square test. The null hypothesis would be that the data follow a Benford distribution. We look at the chi-square statistic for the observed data, based on a chi-square distribution with 8 degrees of freedom (one less than the number of categories, which is 9 because of the nine digits). We compute the p-value, the probability of seeing a chi-square statistic this larger or larger, and reject our null hypothesis if this p-value is too small.

Here’s how our chi-square values and p-values vary with sample size.

| Sample size | chi-square | p-value |
|          64 |     13.542 |  0.0945 |
|         128 |     10.438 |  0.2356 |
|         256 |     13.002 |  0.1118 |
|         512 |      8.213 |  0.4129 |
|        1024 |     10.434 |  0.2358 |
|        2048 |      6.652 |  0.5745 |
|        4096 |     15.966 |  0.0429 |
|        8192 |     20.181 |  0.0097 |
|       16384 |     31.855 | 9.9e-05 |
|       32768 |     45.336 | 3.2e-07 |

The p-values eventually get very small, but they don’t decrease monotonically with sample size. This is to be expected. If the data came from a Benford distribution, i.e. if the null hypothesis were true, we’d expect the p-values to be uniformly distributed, i.e. they’d be equally likely to take on any value between 0 and 1. And not until the two largest samples do we see values that don’t look consistent with uniform samples from [0, 1].

In one sense NHST has done its job. Cauchy samples do not exactly follow Benford’s law, and with enough data we can show this. But we’re rejecting a null hypothesis that isn’t that interesting. We’re showing that the data don’t exactly follow Benford’s law rather than showing that they do approximately follow Benford’s law.

Click to learn more about Bayesian statistics consulting

What personality classifications have in common

There are many ways to divide people into four personality types, from the classical—sanguine, choleric, melancholic, and phlegmatic—to contemporary systems such as the DISC profile. The Myers-Briggs system divides people into sixteen personality types. I just recently ran across the “enneagram,” an ancient system for dividing people into nine categories.

There’s one thing advocates of all the aforementioned systems agree on: the number of basic personality types is a perfect square.

Resisting simplicity

As much as we admire simplicity and strive for simplicity, something in us isn’t happy when we achieve it.

Sometimes we’re disappointed with a simple solution because, although we don’t realize it yet, we didn’t properly frame the problem it solves.

I’ve been in numerous conversations where someone says effectively, “I understand that 2+3 = 5, but what if we made it 5.1?” They really want an answer of 5.1, or maybe larger, for reasons they can’t articulate. They formulated a problem whose solution is to add 2 and 3, but that formulation left out something they care about. In this situation, the easy response to say is “No, 2+3 = 5. There’s nothing we can do about that.” The more difficult response is to find out why “5” is an unsatisfactory result.

Sometimes we’re uncomfortable with a simple solution even though it does solve the right problem.

If you work hard and come up with a simple solution, it may look like you didn’t put in much effort. And if someone else comes up with the simple solution, you may look foolish.

Sometimes simplicity is disturbing. Maybe it has implications we have to get used to.

Update: A couple people have replied via Twitter saying that we resist simplicity because it’s boring. I think beneath that is that we’re not ready to move on to a new problem.

When you’re invested in a problem, it can be hard to see it solved. If the solution is complicated, you can keep working for a simpler solution. But once someone finds a really simple solution, it’s hard to justify continuing work in that direction.

A simple solution is not something to dwell on but to build on. We want some things to be boringly simple so we can do exciting things with them. But it’s hard to shift from producer to consumer: Now that I’ve produced this simple solution, and still a little sad that it’s wrapped up, how can I use it to solve something else?

Related posts: