“Nobody knows what most C++ programmers do.” — Bjarne Stroustrup

The quote above came up in a discussion of C++ by Scott Meyers, Andrei Alexandrescu, and Herb Sutter. They argue that C++ is used in so many diverse applications that if someone starts a sentence with “Most C++ programmers …” he probably doesn’t know what he’s talking about.

As I commented here, I typically try to master the languages I use. But for some languages, like awk and sed, it makes sense to learn just a small, powerful subset. (The larger a language is, the harder it can be to just learn part of it because the features intertwine.) Krumins’ book would be good for someone looking to learn just a little awk rather than wanting to explore every dark corner of the language.

Awk One-Liners Explained is exactly what the title would lead you to expect. It has 70 awk one-liners along with a commentary on each. Some of the one-liners solve common specific problems, such as converting between Windows and Unix line endings. Most of the one-liners are solutions to general types of problems rather than code anyone is likely to run verbatim. For example, one of the one-liners is

Change “scarlet” or “ruby” or “puce” to “red.”

I doubt anybody has ever had to solve that exact problem, but it’s not hard to imagine wanting to do something similar.

Because the book is entirely about one-line programs, it doesn’t cover how to write complex programs in awk. That’s perfect for me. If something takes more than one line of awk, I probably don’t want to use awk. I use awk for quick file filtering. If a task requires writing several lines of code, I’d use Python.

Last month the New York Times ran a story about a sculpture based on cutting open a “Menger sponge,” a shape formed by recursively cutting holes through a cube. All the holes are rectangular, but when you cut the sponge open at an angle, you see six-pointed stars.

Here are some better photos, including both a physical model and a computer animation. Thanks to Mike Croucher for the link.

I’ve written some Python code to take slices of a Menger sponge. Here’s a sample output.

The Menger sponge starts with a unit cube, i.e. all coordinates are between 0 and 1. At the bottom of the code, you specify a plane by giving a point inside the cube and a vector normal to the plane. The picture above is a slice that goes through the center of the cube (0.5, 0.5, 0.5) with a normal vector running from the origin to the opposite corner (1, 1, 1).

from math import floor, sqrt
from numpy import empty, array
from matplotlib.pylab import imshow, cm, show

def outside_unit_cube(triple):
    x, y, z = triple
    if x < 0 or y < 0 or z < 0:
        return 1
    if x > 1 or y > 1 or z > 1:
        return 1
    return 0

def in_sponge( triple, level ):
    """Determine whether a point lies inside the Menger sponge
    after the number of iterations given by 'level.' """
    x, y, z = triple
    if outside_unit_cube(triple):
        return 0
    if x == 1 or y == 1 or z == 1:
        return 1
    for i in range(level):
        x *= 3
        y *= 3
        z *= 3
        # A point is removed if two of its coordinates
        # lie in middle thirds.
        count = 0
        if int(floor(x)) % 3 == 1:
            count += 1
        if int(floor(y)) % 3 == 1:
            count += 1
        if int(floor(z)) % 3 == 1:
            count += 1
        if count >= 2:
            return 0
    return 1

def cross_product(v, w):
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2*w3 - v3*w2, v3*w1 - v1*w3, v1*w2 - v2*w1)

def length(v):
    "Euclidean length"
    x, y, z = v
    return sqrt(x*x + y*y + z*z)

def plot_slice(normal, point, level, n):
    """Plot a slice through the Menger sponge by
    a plane containing the specified point and having
    the specified normal vector. The view is from
    the direction normal to the given plane."""
    # t is an arbitrary vector
    # not parallel to the normal direction.
    nx, ny, nz = normal
    if nx != 0:
        t = (0, 1, 1)
    elif ny != 0:
        t = (1, 0, 1)
    else:
        t = (1, 1, 0)
    # Use cross product to find vector orthogonal to normal
    cross = cross_product(normal, t)
    v = array(cross) / length(cross)
    # Use cross product to find vector orthogonal
    # to both v and the normal vector.
    cross = cross_product(normal, v)
    w = array(cross) / length(cross)
    m = empty( (n, n), dtype=int )
    h = 1.0 / (n - 1)
    k = 2.0*sqrt(3.0)
    for x in range(n):
        for y in range(n):
            pt = point + (h*x - 0.5)*k*v + (h*y - 0.5)*k*w
            m[x, y] = 1 - in_sponge(pt, level)
    imshow(m, cmap=cm.gray)
    show()

# Specify the normal vector of the plane
# cutting through the cube.
normal = (1, 1, 0.5)
# Specify a point on the plane.
point = (0.5, 0.5, 0.5)
level = 3
n = 500
plot_slice(normal, point, level, n)
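One way to sanity-check the membership rule without plotting anything is to count subcubes. Each round of hole-cutting keeps 20 of every 27 subcubes, so after a given number of rounds the count of surviving subcubes should be 20 raised to that number. The sketch below is a compact restatement of the rule used above, not the original function: the name count_kept_cells is made up for illustration, and this version assumes all coordinates lie in [0, 1) rather than checking the cube boundary.

```python
from math import floor

def in_sponge(point, level):
    """Return True if the point survives 'level' rounds of hole-cutting.
    Compact restatement of the rule above; assumes 0 <= coordinates < 1."""
    x, y, z = point
    for _ in range(level):
        x, y, z = 3 * x, 3 * y, 3 * z
        # Count coordinates whose current base-3 digit is 1 (a middle third).
        middle = sum(int(floor(c)) % 3 == 1 for c in (x, y, z))
        if middle >= 2:
            return False
    return True

def count_kept_cells(level):
    """Count the subcubes of side 3**-level whose centers lie in the sponge."""
    n = 3 ** level
    centers = [(i + 0.5) / n for i in range(n)]
    return sum(in_sponge((x, y, z), level)
               for x in centers for y in centers for z in centers)

# Compare the counted subcubes with the expected 20**rounds.
for rounds in (1, 2):
    print(rounds, count_kept_cells(rounds), 20 ** rounds)
```

Testing cell centers keeps every coordinate well away from the boundaries between thirds, so floating point rounding should not affect the count.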

Puzzle: Give an elegant proof that the following matrix is invertible.

Solution: The determinant of the matrix is odd, so the determinant is not zero, so the matrix is invertible.

Why is the determinant odd? The determinant is defined as a sum of products that pick an element from each row and each column. Some of the products are multiplied by -1, but that doesn’t matter for our purposes. Each product of three elements is even except for the product that takes the terms along the diagonal, which are all odd. The sum of an odd number and several even numbers is odd.
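The parity argument can be checked numerically. The sketch below expands a determinant as the signed sum of products described above. The matrix in it is a hypothetical stand-in, not the one from the puzzle: it only assumes the same pattern of odd entries on the diagonal and even entries everywhere else. The function names are likewise chosen for illustration.

```python
from itertools import permutations

def permutation_sign(perm):
    """Return +1 for an even permutation and -1 for an odd one."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def determinant(matrix):
    """Signed sum over all ways to pick one entry from each row and column."""
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        product = 1
        for row, col in enumerate(perm):
            product *= matrix[row][col]
        total += permutation_sign(perm) * product
    return total

# Hypothetical example, not the matrix from the puzzle:
# odd entries on the diagonal, even entries everywhere else.
example = [[3, 2, 4],
           [6, 5, 8],
           [2, 4, 7]]

det = determinant(example)
print(det, det % 2)  # the second value is 1 when the determinant is odd
```

Only the identity permutation avoids every off-diagonal entry, so it contributes the single odd product, and the total is odd no matter which signs the other terms carry.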

Check out The Calculus of Grit by Venkat Rao. This article is somewhat similar to my Jack of all trades post but goes into far more depth. It is about 20 times longer than my article and well worth reading.

Venkat Rao compares discipline boundaries to extrinsic coordinates and one’s career to intrinsic coordinates. You don’t have to understand the mathematical significance of these terms to read The Calculus of Grit, though it helps. Extrinsic coordinates describe a surface as it sits inside a larger space. Intrinsic coordinates describe a surface as it would be experienced by a bug crawling around on it. A line that is straight in one coordinate system will typically not be straight in the other coordinate system.

For some background on the technical use of the term “grit,” see the Psychology Today article The Winning Edge. (The math in the first paragraph is annoying because the superscripts were stripped in the online version of the article. It says, for example, 32 + 42 = 52.)

Thanks to DavidC for pointing out Venkat Rao’s post.

I was struck by this quote from Ralph Waldo Emerson, even though I’m not sure I understand what he meant.

In every work of genius, we recognize our own rejected thoughts: they come back to us with a certain alienated majesty.

Maybe Emerson was referring to that why-didn’t-I-think-of-that feeling when you see that someone else connected one or two more dots than you did. You thought about a challenge, and maybe you were close to resolving it, but you lacked a key insight to pull it all together. You decided your approach wouldn’t work, but someone did make it work.

If that’s what Emerson had in mind, it’s puzzling that he speaks of “every work of genius.” It would be incredibly arrogant to think that you almost came up with every great idea you see. Maybe he means that we recognize genius best when it relates to something we’ve struggled with.

What do you think Emerson meant? When have your rejected ideas come back to you?

I remember, must be 20 or 25 years ago, hearing a talk given by Nick Negroponte of the MIT Media Lab, in which he made a prediction … Everything that today goes through wires will go through the air, and everything that goes through the air today will go through wires.

Last night I shared the article Why we don’t hire .NET programmers by David Barrett on Twitter. Some of the responses I got said the article was

A load of rubbish

Amazingly successful trolling

So narrow minded it hurts

The article contains some provocative criticisms of Microsoft’s tool stack. But it also has high praise for the same tools.

Here’s what I believe the article is saying at its core:

The Microsoft tool stack is not optimized for the kind of development we want to do, so we doubt that people who have chosen to make a career using that tool stack will be a good fit for us.

I’ll let David Barrett decide who is or is not a good fit for his company, but this much seems undeniable: Microsoft’s tools are optimized for a certain market. All tools are optimized for some market, at least tools that are successful. I would take Microsoft’s enormous financial success as evidence that their tools are indeed optimized for some market, and a large market at that. The article says

[.NET is] the most modern platform for application development on the planet. Microsoft has always produced the best tools for building internal business applications, and .NET is their masterpiece. There’s a reason why they own that space; they earned it. That space employs millions of people, and those people are unquestionably the masters at what they do.

That’s quite an endorsement. Microsoft should quote that in their marketing literature.

I assume .NET developers don’t take offense to what Barrett says .NET does well but rather what he thinks it does poorly.

Barrett’s main criticism of .NET is that it makes it easier to solve common problems at the expense of making it harder to solve uncommon problems. And that seems clear. He makes his point in an inflammatory way — implying that Microsoft wants to entrap developers, and that .NET developers are happy to let Microsoft think for them — but I agree that Microsoft has designed its tools for developers working on common problems. They’ve aimed at the profitable heart of the developer market.

I don’t agree with Barrett’s argument that start-ups are necessarily working on unusual problems that are not well served by Microsoft tools. A start-up may have a unique product or service and yet have mainstream software needs. For example, suppose you develop a kit that lets people run their car on oatmeal. A web site for selling your kits might not be very different from a web site selling T-shirts.

Whether a person is a “jack of all trades and a master of none” depends on how you define trades. The same person could be a dilettante or a specialist depending on your mental categories.

Take an expert programmer back in time 100 years. What are his skills? Maybe he’s pretty good at math. He has good general problem solving skills, especially logic. He has dabbled a little in linguistics, physics, psychology, business, and art. He has an interesting assortment of knowledge, but he’s not a master of any recognized trade.

Is a manager a master of one trade or a jack of several trades? Obviously if you recognize management as a profession, then someone who is good at it is a master of that trade. But if you don’t have the mental category of manager, what is a manager good at? She knows a little psychology, a little sociology, a little math, she has good communication skills, etc. But she’s a jack of all trades and master of none unless you have a name for her trade.

Calling someone a jack of all trades could be a way of saying that you don’t have a mental category to hold what they do.

My wife told me about someone on the radio yesterday discussing voluntary water rationing. People in odd-numbered houses are being asked to water their yards only on odd-numbered days. This person said “I suppose they’re talking about the last digit.”

When she told me about this, my first two thoughts were:

Yes, that’s what it means to be odd.

Nearly every house number in suburban Houston starts with 1, so going by first digit would be a bad idea.

Strictly speaking, it’s a theorem that odd numbers are those that end in odd digits. The definition of an odd number is one that leaves a remainder of 1 when divided by 2. And in base 10, a number is odd if and only if it ends in an odd digit.

But what if you were using a base other than 10? If the base is even, then a number is odd if and only if the last digit is odd, just like base 10. But what if you’re using an odd base, say base 7? Then the theorem doesn’t hold. For example the number 122 in base 7 is odd, and the number 33 is even. And it’s not just the opposite of the rule for base 10 because 43 is also odd in base 7.

In an odd base, a number is odd iff it has an odd number of odd digits.

(In case you haven’t seen “iff” before, it’s an abbreviation for “if and only if.”)

So, for example, in base 7, the number 642341 is even because it contains two odd digits. And the number 744017 in base 9 is odd because it has three odd digits.

Why does this rule work? Suppose, for example, you have a 4-digit number pqrs in base b where b is odd. Then pqrs represents

pb^{3} + qb^{2} + rb + s

All the powers of b are odd, so a number like p times a power of b is odd iff p is odd. So every odd digit in the number contributes an odd number to the sum that expands what the number means. Even digits contribute even terms. A sum is odd iff it has an odd number of odd terms, so a number in an odd base is odd iff it has an odd number of odd digits.
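The rule is easy to test against the examples above. Here is a minimal sketch that compares the digit-counting rule with the parity of the actual value; the function name odd_by_digit_rule is made up for illustration, and numerals are passed as strings so that Python's built-in int can convert them from the given base.

```python
def odd_by_digit_rule(numeral, base):
    """Parity test for an odd base: odd iff the count of odd digits is odd."""
    if base % 2 == 0:
        raise ValueError("this rule applies only to odd bases")
    odd_digits = sum(int(ch, base) % 2 for ch in numeral)
    return odd_digits % 2 == 1

# The examples from the text, as (numeral, base) pairs.
examples = [("122", 7), ("33", 7), ("43", 7), ("642341", 7), ("744017", 9)]
for numeral, base in examples:
    by_rule = odd_by_digit_rule(numeral, base)
    by_value = int(numeral, base) % 2 == 1   # parity of the actual value
    print(numeral, base, by_rule, by_value)
```

For each pair the two boolean columns should agree.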

The new C++ standard includes a couple of Python-like features that I ran across recently. There are other Python-like features in the new standard, but here I’ll discuss range-based for-loops and raw strings. In Python you loop over lists rather than incrementing a loop counter variable. For example,

for p in [2, 3, 5, 7, 11]:
    print(p)

Range-based for loops now let you do something similar in C++11:

int primes[5] = {2, 3, 5, 7, 11};
for (int &p : primes)
    cout << p << "\n";

Also, Python has raw strings. If you preface a quoted string with R, the contents of the string are interpreted literally. For example,

print("Hello\nworld")

will produce

Hello
world

but

print(R"Hello\nworld")

will produce

Hello\nworld

because the \n is no longer interpreted as a newline character but instead printed literally as two characters.

Raw strings in C++11 use R as well, but they also require a delimiter inside the quotation marks. For example,

cout << R"(Hello\nworld)";

The C++ raw string syntax is a little harder to read than the Python counterpart since it requires parentheses. The advantage, however, is that such strings can contain double quotes since a double quote alone does not terminate the string. For example,

cout << R"(Hello "world")";

would print

Hello "world"

In Python this is unnecessary since single and double quotes are interchangeable; if you wanted double quotes inside your string, you’d use single quotes on the outside.

Note that raw strings in C++ require a capital R, unlike Python, which allows r or R.
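On the Python side, a quick way to see the difference is to compare string lengths. This is only a small sketch, and the variable names are arbitrary.

```python
plain = "Hello\nworld"   # \n is a single newline character
raw = r"Hello\nworld"    # backslash and n are kept as two characters

print(len(plain), len(raw))     # the raw string is one character longer
print(raw == R"Hello\nworld")   # lowercase and uppercase prefixes agree
print('Hello "world"')          # single quotes outside, double quotes inside
```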

The C++ features mentioned here are supported in gcc 4.6.0. The MinGW version of gcc for Windows is available here. To use C++11 features in gcc, you must add the parameter -std=c++0x to the g++ command line. For example,

g++ -std=c++0x hello.cpp

Visual Studio 2010 supports many of the new C++ features, but not the ones discussed here.

The smaller, the odder, the more out of the way, and the more specialized, the better. That is my philosophy on bookshops. Come to think of it, that is my philosophy on everything else too — it makes for a very interesting life unconstrained by the smothering expectations of the tyranny of fashion or popularity.

Conal Elliott gave a talk at Google a while back in which he points out the tension between usability and composability, between software that is user-friendly and software that is programmer-friendly. Consumers like software that’s easy to use. But programmers like software that’s easy to compose, i.e. to combine in unanticipated ways. Users want applications; programmers want libraries. Users like GUIs; programmers like APIs.

It’s not immediately obvious that usability and composability are in tension. Why can’t you make users and programmers happy? You may be able to make some initial improvements that please both communities, but at some point their interests diverge.

Vivek Haldar picks up this theme this morning in his latest blog post. He uses “operation versus expression” to express the same idea as Elliott’s usability versus composability.

Combining Elliott and Haldar’s presentations, we have these contrasts.

Usability               Composability
Operation               Expression
Visual / GUI            Syntactic / CLI
Bounded                 Unbounded
Externalize knowledge   Internalize knowledge

Neither column is necessarily better. Sometimes you want to be in the left column, sometimes in the right. Sometimes you want a stereo and sometimes you want a guitar.

When I file my taxes, I want the software to be as easy to use as possible right now. There’s no long-term use to consider since I’m not going to use it again for a year, so I’ll have forgotten anything peculiar about the software by the time I open it again. But when I’m writing software, I have a different set of values. I don’t mind internalizing some knowledge of how my tools work in exchange for long-term ease of use.

Michael Lugo pointed out that the telephone number 867-5309 is prime and may be the largest prime number to appear in the title of a popular song. (The song 867-5309/Jenny peaked at #4 on Billboard in 1982.)

David Radcliffe added “The phone number of Jenny’s twin sister is 8675311” because 8675309 and 8675311 are twin primes.

How likely is it for a telephone number to be prime? The Prime Number Theorem says that for large values of x, the probability that a number less than x is prime is approximately 1/log(x), where log is the natural logarithm. Since 1/log(10^{7}) = 0.062, about 6% of phone numbers are prime.

We could try to be more accurate. We could look at the probability that a seven-digit number is prime rather than simply a number less than 10^{7} (i.e. excluding numbers with fewer than seven digits). Or we could use the exact number of primes in a certain range (say using Mathematica’s PrimePi function) rather than using the Prime Number Theorem approximation. But these refinements would still give us estimates of about 6%. Note that not all seven-digit numbers have been assigned as phone numbers, so an exact calculation still gives only an approximate answer.

What about phone numbers with area codes? The Prime Number Theorem suggests about 4.3% of 10-digit numbers are prime. But the US has on the order of 300 area codes, so most 10-digit numbers are not telephone numbers. Also, area codes were not selected randomly. They were selected with a preference for smaller numbers, which means our estimate of 4.3% may be a little low. (We’d expect more prime numbers to start with small area codes.) But I imagine the area code bias has little effect.
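The estimates above are easy to reproduce. The sketch below uses a plain sieve of Eratosthenes in place of Mathematica's PrimePi to get exact counts below 10^7, and compares them with the Prime Number Theorem approximation. The function and variable names are chosen for illustration, and the sieve takes on the order of a second to run.

```python
from math import log

def prime_sieve(limit):
    """Return a bytearray with sieve[k] == 1 iff k is prime, for 0 <= k < limit."""
    sieve = bytearray([1]) * limit
    sieve[0:2] = b"\x00\x00"          # 0 and 1 are not prime
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # Cross off multiples of p, starting at p*p.
            sieve[p*p::p] = bytearray(len(range(p*p, limit, p)))
    return sieve

sieve = prime_sieve(10**7)
primes_below_1e7 = sum(sieve)
primes_below_1e6 = sum(sieve[:10**6])

pnt_7_digits = 1 / log(10**7)       # Prime Number Theorem estimate
pnt_10_digits = 1 / log(10**10)
exact_below_1e7 = primes_below_1e7 / 10**7
exact_seven_digit = (primes_below_1e7 - primes_below_1e6) / (9 * 10**6)

print("PNT estimate, below 10^7:     %.4f" % pnt_7_digits)
print("Exact fraction, below 10^7:   %.4f" % exact_below_1e7)
print("Exact fraction, seven digits: %.4f" % exact_seven_digit)
print("PNT estimate, below 10^10:    %.4f" % pnt_10_digits)
print("8675309 and 8675311 prime:", bool(sieve[8675309]), bool(sieve[8675311]))
```

The exact fractions come out somewhat higher than the Prime Number Theorem estimate but still round to about 6 or 7 percent.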