Thou, thee, you, and ye

Ever wonder what the rules were for when to use thou, thee, ye, or you in Shakespeare or the King James Bible?

For example, the inscription on the front of the Main Building at The University of Texas says

Ye shall know the truth and the truth shall make you free.

Why ye at the beginning and you at the end?

The latest episode of The History of English Podcast explains what the rules were and how they came to be. Regarding the UT inscription, ye was the subject form of the second person plural and you was the object form. Eventually you came to be used for both subject and object, singular and plural.

The singular subject form was thou and the singular object form was thee. For example, the opening lines of Shakespeare’s Sonnet 18:

Shall I compare thee to a summer’s day?
Thou art more lovely and more temperate.

Originally the singular forms were intimate and the plural forms were formal. Only later did thee and thou take on an air of reverence or formality.

Notes on HTML, XML, TeX, and Unicode

This week’s resource post: some notes on typesetting, Unicode, etc.

See also blog posts tagged LaTeX, HTML, and Unicode and the Twitter account TeXtip.

Last week: C++ resources

Next week: Special functions

Why assign two characters to the same symbol?

Unicode often counts the same symbol (glyph) as two or more different characters. For example, Ω is U+03A9 when it represents the Greek letter omega and U+2126 when it represents Ohms, the unit of electrical resistance. Similarly, M is U+004D when it’s used as a Latin letter but U+216F when it’s used as the Roman numeral for 1,000.
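Here is a small Python sketch, using only the standard unicodedata module, that prints the code points and official names of the look-alike pairs mentioned above. It also shows that compatibility normalization (NFKC) folds each pair into a single character, which is convenient for search but throws away exactly the semantic distinction the separate code points were meant to record.

```python
import unicodedata

# Look-alike pairs: (Greek omega, Ohm sign) and (Latin M, Roman numeral M).
pairs = [("\u03a9", "\u2126"), ("\u004d", "\u216f")]

for a, b in pairs:
    for ch in (a, b):
        print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
    # Different code points, so the raw strings compare unequal.
    print("  equal as strings:", a == b)
    # NFKC normalization maps both members of a pair to the same character.
    same = unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)
    print("  equal after NFKC:", same)
```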

The purpose of such distinctions is to capture semantic differences. One example of how this could be useful is increased accessibility. A text-to-speech reader should pronounce things the same way people do. When such software sees “a 25 Ω resistor” it should say “a twenty five Ohm resistor” and not “a twenty five uppercase omega resistor,” just as a person would. [1]

Making text more accessible to the blind helps everyone else too. For example, it also makes the text more accessible to search engines. As Elliotte Rusty Harold points out in Refactoring HTML:

Wheelchair ramps are far more commonly used by parents with strollers, students with bicycles, and delivery people with hand trucks than they are by people in wheelchairs. When properly done, increasing accessibility for the disabled increases accessibility for everyone.

However, there are practical limits to how many semantic distinctions Unicode can make without becoming impossibly large, and so the standard is full of compromises. It can be quite difficult to decide when two uses of the same glyph should correspond to separate characters, and no standard could satisfy everyone.


[1] Someone may discover that when I wrote “a 25 Ω resistor” above, I actually used an Omega  (Ω, U+03A9) rather than an Ohm character (Ω, U+2126). That’s because font support for Unicode is disappointing. If I had used the technically correct Ohm character, some people would not be able to see it.  Ironically, this would make the text less accessible.

On my Android phone, I can see Ω (Ohm) but I cannot see Ⅿ (Roman numeral M) because the installed fonts have a glyph for the former but not the latter.


This post first appeared on Symbolism, a blog that I’ve now shut down.

Updating blog posts

I’ve been going through my old blog posts and fixing a few problems. I found a few missing images, code samples that had lost their indentation, etc. Most of the errors have been my fault, but some were due to bugs in plug-ins.

If you see any problems with a post, please let me know. You could send me an email, or leave a comment on the post. (For a while I had comments automatically turn off on older posts, but I’ve disabled that. Now you can comment on any post.)

For the first couple years, this blog didn’t have many readers, and so not many people pointed out my errors. Now that there are more readers, I find out about errors more quickly. But I’ve found some egregious errors in some of the older posts.

Thanks for your contribution to this blog. I’ve been writing here for almost seven years, and I’ve benefited greatly from your input.

R resources

This is the third in my weekly series of posts pointing out resources on this site. This week’s topic is R.

See also posts tagged Rstats.

I started the Twitter account RLangTip and handed it over to the folks at Revolution Analytics.

Last week: Emacs resources

Next week: C++ resources

After a coin comes up heads 10 times

Suppose you’ve seen a coin come up heads 10 times in a row. What do you believe is likely to happen next? Three common responses:

  1. Heads
  2. Tails
  3. Equal probability of heads or tails.

Each is reasonable in its own context. The last answer is correct assuming the flips are independent and heads and tails are equally likely.

But as I argued here, if you see nothing but heads, you have reason to question the assumption that the coin is fair. So there’s some justification for the first answer.
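One way to put a number on the first answer, as an illustration only: assume a uniform prior on the coin's probability of heads and update on the observed flips. The uniform prior and the function name prob_next_heads are my assumptions for this sketch, not something taken from the linked post.

```python
from fractions import Fraction

def prob_next_heads(heads, tails, prior_a=1, prior_b=1):
    """Predictive probability of heads under a Beta(prior_a, prior_b) prior.

    With prior_a = prior_b = 1 (a uniform prior) this is Laplace's
    rule of succession.
    """
    return Fraction(heads + prior_a, heads + tails + prior_a + prior_b)

# After 10 heads and no tails, heads is strongly favored.
print(prob_next_heads(10, 0))  # 11/12
```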

The reasoning behind the second answer is that tails are “due.” This isn’t true if you’re looking at independent flips of a fair coin, but it could be reasonable in other settings, such as sampling without replacement.

Say there are a number of coins on a table, covered by a cloth. A fixed number are on the table heads up, and a fixed number tails up. You reach under the cloth and slide a coin out. Every head you pull out increases the chances that the next coin will be tails. If there were an equal number of heads and tails under the cloth to begin with, then after pulling out 10 heads, tails are indeed more likely next time.
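The coins-under-a-cloth scenario is easy to compute exactly. The sketch below assumes, purely for illustration, 50 heads and 50 tails at the start and that each remaining coin is equally likely to be the one you slide out.

```python
from fractions import Fraction

def prob_next_tails(heads_left, tails_left):
    """Chance the next coin drawn is tails, each remaining coin equally likely."""
    return Fraction(tails_left, heads_left + tails_left)

heads, tails = 50, 50  # hypothetical starting counts
for drawn in (0, 5, 10):
    # 'drawn' heads have been removed; no tails have been removed.
    print(drawn, prob_next_tails(heads - drawn, tails))
```

After 10 heads have been removed there are 40 heads and 50 tails left, so the chance of tails is 50/90 = 5/9.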

Related post: Long runs

First two impressions of statistics

When I was a postdoc I asked a statistician a few questions and he gave me an overview of his subject. (My area was PDEs; I knew nothing about statistics.) I remember two things that he said.

  1. A big part of being a statistician is knowing what to do when your assumptions aren’t met, because they’re never exactly met.
  2. A lot of statisticians think time series analysis is voodoo, and he was inclined to agree with them.

How medieval astronomers made trig tables

How would you create a table of trig functions without calculators or calculus?

It’s not too hard to create a table of sines at multiples of 3°. You can use the sum-angle formula for sines

sin(α+β) = sin α cos β + sin β cos α

to bootstrap your way from known values to other values. Elementary geometry gives you the sines of 45° and 30°, and the sum-angle formula will then give you the sine of 75°. From Euclid’s construction of a 5-pointed star you can find the sine of 72°. Then you can use the sum-angle formula to find the sine of 3° from the sines of 75° and 72°. Ptolemy figured this out in the 2nd century AD.
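Here is a sketch of that bootstrapping in Python. It starts from closed-form values for 30°, 45°, and 72° and applies the sum-angle formula, using it with a negative angle to get 3° = 75° − 72°. The variable names are mine, and the library sine is called only at the end as a check.

```python
from math import sqrt, sin, radians

# Values available from elementary geometry.
sin30, cos30 = 0.5, sqrt(3) / 2
sin45 = cos45 = sqrt(2) / 2

# Values that come from the regular pentagon.
cos72 = (sqrt(5) - 1) / 4
sin72 = sqrt(10 + 2 * sqrt(5)) / 4

# Sum-angle formulas: 75 = 45 + 30.
sin75 = sin45 * cos30 + sin30 * cos45
cos75 = cos45 * cos30 - sin45 * sin30

# Same formula with beta = -72 degrees: 3 = 75 - 72.
sin3 = sin75 * cos72 - sin72 * cos75

# Compare the bootstrapped value with the library value.
print(sin3, sin(radians(3)))
```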

But if you want a table of trig values at every degree, you need to find the sine of 1°. If you had that, you could bootstrap your way to every other integer number of degrees. Ptolemy had an approximate solution to this problem, but it wasn’t very accurate or elegant.
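To illustrate the bootstrapping step, the sketch below builds a table of sines at every degree from 0° to 90° using nothing but the sum-angle formulas and a value for sin 1°. Here the library value stands in for a hand-computed sin 1°, and the function name sine_table is hypothetical.

```python
from math import sin, radians, sqrt

def sine_table(sin1):
    """Sines of 0..90 degrees built from sin 1 degree via sum-angle formulas."""
    cos1 = sqrt(1 - sin1**2)
    s, c = 0.0, 1.0  # sine and cosine of 0 degrees
    table = [s]
    for _ in range(90):
        # Advance one degree: sin(a+1) and cos(a+1) from sin(a) and cos(a).
        s, c = s * cos1 + c * sin1, c * cos1 - s * sin1
        table.append(s)
    return table

table = sine_table(sin(radians(1)))  # stand-in for a hand-computed sin 1 degree
print(table[30], table[45], table[90])
```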

The Persian astronomer Jamshīd al-Kāshī had a remarkably clever solution to the problem of finding the sine of 1°. Using the sum-angle formula you can find that

sin 3θ = 3 sin θ – 4 sin³ θ.

Setting θ = 1° gives you a cubic equation for the unknown value of sin 1° involving the known value of sin 3°. However, the cubic formula wasn’t discovered until over a century after al-Kāshī. Instead, he used a numerical algorithm more widely useful than the cubic formula: finding a fixed point of an iteration!

Define f(x) = (sin 3° + 4x³)/3. Then sin 1° is a fixed point of f. Start with an approximate value for sin 1°, a natural choice being (sin 3°)/3, and iterate. Al-Kāshī used this procedure to compute sin 1° to 16 decimal places.
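A rough sketch of why so few iterations are needed (this is the standard fixed-point argument, not something taken from the source): near a fixed point s, each step multiplies the error by about |f′(s)|. Here s = sin 1° ≈ 0.01745 and e_n = s − x_n is the error after n steps.

```latex
\begin{align*}
f(x) &= \frac{\sin 3^\circ + 4x^3}{3}, \qquad f'(x) = 4x^2, \qquad f'(s) \approx 4\,(0.01745)^2 \approx 0.0012, \\
e_{n+1} &= f(s) - f(x_n) \approx f'(s)\, e_n, \qquad e_0 = s - \frac{\sin 3^\circ}{3} = \frac{4}{3}\, s^3 \approx 7.1 \times 10^{-6}.
\end{align*}
```

So each pass shrinks the error by a factor of roughly 800, which is about three decimal digits per iteration.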

Here’s a little Python code to play with this algorithm.

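from numpy import sin, deg2rad

# sin 3° is the known value; the goal is sin 1°.
sin3deg = sin(deg2rad(3))

def f(x):
    # sin 1° is a fixed point of this map.
    return (sin3deg + 4*x**3)/3

# Start from the natural first guess and iterate.
x = sin3deg/3
for i in range(4):
    x = f(x)
    print(i + 1, x)

# Compare with the library value of sin 1°.
print(sin(deg2rad(1)))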

Printing x after each pass shows that the iterates settle down within four iterations, agreeing with sin 1° to about 16 decimal places. That is close to floating point precision and, coincidentally, the same accuracy as al-Kāshī’s calculation.

Source: Heavenly Mathematics: The Forgotten Art of Spherical Trigonometry

Ergodic

Roughly speaking, an ergodic system is one that mixes well. You get the same result whether you average its values over time or over space.
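As a small numerical illustration of time averages matching space averages, consider rotating a point around a circle by an irrational fraction of a turn at each step. This is a standard example of an ergodic system, but the choice of example, the test function, and the function names below are all mine.

```python
from math import cos, pi, sqrt

def time_average(f, x0, alpha, n):
    """Average f along the orbit x -> (x + alpha) mod 1, starting at x0."""
    total, x = 0.0, x0
    for _ in range(n):
        total += f(x)
        x = (x + alpha) % 1.0
    return total / n

def space_average(f, m):
    """Midpoint-rule approximation to the integral of f over [0, 1)."""
    return sum(f((k + 0.5) / m) for k in range(m)) / m

def f(x):
    return cos(2 * pi * x) ** 2

alpha = sqrt(2) - 1  # irrational rotation amount, an illustrative choice

print(time_average(f, 0.1, alpha, 100_000))
print(space_average(f, 10_000))
```

Both numbers come out close to 1/2, the average of cos² over a full period.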

This morning I ran across the etymology of the word:

In the late 1800s, the physicist Ludwig Boltzmann needed a word to express the idea that if you took an isolated system at constant energy and let it run, any one trajectory, continued long enough, would be representative of the system as a whole. Being a highly-educated nineteenth century German-speaker, Boltzmann knew far too much ancient Greek, so he called this the “ergodic property”, from ergon “energy, work” and hodos “way, path.” The name stuck.

Found here, footnote on page 479.


Miscellaneous math notes

This web site started as static HTML files. Later I added a WordPress blog, but still wrote some things as static HTML pages for various reasons. Now I’ve moved most of those static pages to WordPress pages so that they’ll have the same style as the blog.

There’s not a good way to find these pages except through search. So I plan to categorize them and write a short post each Wednesday for the next few weeks listing some related pages. This post starts the series with math notes that didn’t fall into any other category.

See also posts tagged math.

Next week: Emacs resources

Googol and googolplex

Numericon gives the history of the words googol and googolplex:

… the famous googol, 10^100 (a 1 followed by 100 zeros), defined in 1929 by American mathematician Edward Kasner and named by his nine-year-old nephew, Milton Sirotta. Milton went even further and came up with the googolplex, now defined as 10^googol but initially defined by Milton as a 1, followed by writing zeros until you get tired.

Related post: There isn’t a googol of anything