10 software tool virtues

Diomidis Spinellis gives a list of 10 software tool sins in The Tools at Hand episode of his Tools of the Trade podcast. Here are his points, but turned around. For each sin he lists, I give the opposite as a virtue.

10. Maintain API documentation with the source code.

9. Integrate unit testing in development.

8. Track bugs electronically.

7. Let the compiler do what it can do better than you.

6. Learn how to script your tools to work together.

5. Pay attention to compiler warnings and fix them.

4. Use a version control system.

3. Use tools to find definitions rather than scanning for them.

2. Use a debugger.

1. Use tools that eliminate repetitive manual editing.

I turned the original list around because I believe it’s easier to agree that the things above are good than it is to see that their lack is bad. Some items are opposites, like #5: you either pay attention to warnings or you ignore them. But some are not, like #8. Tracking bugs electronically is a good idea, but I wouldn’t call tracking bugs on paper a “sin.”

Related post: Reducing development friction comments on another podcast from Diomidis Spinellis.

Why can you multiply different units but not add them?

It’s usually an error to add quantities that have different units. If you have an equation that adds meters and kilograms, for example, something is almost certainly wrong. A formula that multiples meters and kilograms might be correct, but not one that adds them.

Why is this? It’s because the laws of nature don’t depend on our choice of units. For example, F = ma whether you use English, metric, or any other system of units.

Changing units introduces a multiplicative factor. You convert inches to centimeters, for example, by multiplying by 2.54.

If you were to add a length to a mass, and say you cut your unit of length in half but left your unit of mass the same, this wouldn’t be a multiplicative change (unless the mass were zero). But when you multiply or divide quantities with different units, the result does scale multiplicatively.

* * *

For daily tidbits about units, follow @UnitFact on Twitter.

Sensitive dependence on initial conditions

The following problem illustrates how the smallest changes to a problem can have large consequences. As explained at the end of the post, this problem is a little artificial, but it illustrates difficulties that come up in realistic problems.

Suppose you have a simple differential equation,

y'' - y = 0

where y(t) gives displacement in meters at a time measured in seconds. Assume initial conditions y(0) = 1 and y'(0) = -1.

Then the unique solution is y(t) = exp(-t). The solution is decreasing and positive. But a numerical solution to the equation might turn around and start increasing, or it might go negative.

Suppose the initial conditions are y(0) = 1 as before but now y'(0) = -1 ± ε. You could think of the ε as a tiny measurement error in the initial velocity, or a limitation in representing the initial velocity as a number in a computer.

The following graph shows the solutions corresponding to ε = 10-6, 0, and -10-6.

plot of 3 solutions, epsilon positive, negative, zero

The exact solution is

 y(t) = \left(1 \mp \frac{\varepsilon}{2}\right)\exp(-t) \pm \frac{\varepsilon}{2}\exp(t)

If the initial velocity is exactly -1, then ε = 0 and we only have the solution exp(-t). But if the initial velocity is a little more than -1, there is an exponentially increasing component of the solution that will eventually overtake the exponentially decreasing component. And if the initial velocity is a little less than -1, there is a negative component increasing exponentially in magnitude. Eventually it will overtake the positive component and cause the solution to become negative.

The solution starts to misbehave (i.e. change direction or go negative) at

t = \frac{1}{2} \log\left( \frac{2}{\varepsilon} \mp 1 \right)

If ε is small, 2/ε is much larger than 1, and so the final 1 above makes little difference. So we could say the solution goes bad at approximately

t = \frac{1}{2} \log\left( \frac{2}{\varepsilon} \right)

whether the error is positive or negative. If ε is ±0.000001, for example, the solution goes bad around t = 7.25 seconds. (The term we dropped only effects the 6th decimal place.)

On a small time scale, less than 7.25 seconds, we would not notice the effect of the (initially) small exponentially growing component. But eventually it matters, a lot. We can delay the point where the solution goes bad by controlling the initial velocity more carefully, but this doesn’t buy us much time. If we make ε ten times smaller, we postpone the time when the solution goes bad to 8.41, only a little over a second later.

[Looking at the plot, the solution does not obviously go wrong at t = 7.25. But it does go qualitatively wrong at that point: a solution that should be positive and decreasing becomes either negative or increasing.]

Trying to extend the quality of the solution by making ε smaller is a fool’s errand. Every order of magnitude decrease in ε only prolongs the inevitable by an extra second or so.

This is a somewhat artificial example. The equation and initial conditions were selected to make the sensitive dependence obvious. The sensitive dependence is itself sensitive: small changes to the problem, such as adding a small first derivative term, make it better behaved.

However, sensitive dependence is a practical phenomena. Numerical methods can, for example, pick up an initially small component of an increasing solution while trying to compute a decreasing solution. Sometimes the numerical sensitivity is a reflection of a sensitive physical system. But if the physical system is stable and the numerical solution is not, the problem may not be numerical. The problem may be a bad model. The numerical difficulties may be trying to tell you something, in which case increasing the accuracy of the numerical method may hide a more basic problem.

More differential equation posts

Elusive statistics

From Controversies in the Foundations of Statistics by Bradley Efron:

Statistics seems to be a difficult subject for mathematicians, perhaps because its elusive and wide-ranging character mitigates against the traditional theorem-proof method of presentation. It may come as some comfort then that statistics is also a difficult subject for statisticians.

Related posts

Number theory determinant and SymPy

Let σ(n) be the sum of the positive divisors of n and let gcd(a, b) be the greatest common divisor of a and b.

Form an n by n matrix M whose (i, j) entry is σ(gcd(i, j)). Then the determinant of M is n!.

The following code shows that the theorem is true for a few values of n and shows how to do some common number theory calculations in SymPy.

from sympy import gcd, divisors, Matrix, factorial

def f(i, j):
    return sum( divisors( gcd(i, j) ) )

def test(n):
    r = range(1, n+1)
    M = Matrix( [ [f(i, j) for j in r] for i in r] )
    return M.det() - factorial(n)

for n in range(1, 11):
    print( test(n) )

As expected, the test function returns zeros.

If we replace the function σ above by τ where τ(n) is the number of positive divisors of n, the corresponding determinant is 1. To test this, replace sum by len in the definition of f and replace factorial(n) by 1.

In case you’re curious, both results are special cases of the following more general theorem. I don’t know whose theorem it is. I found it here.

For any arithmetic function f(m), let g(m) be defined for all positive integers m by

g(m) = \sum_{d \,\mid \,m} \mu(d) f\left(\frac{m}{d}\right)

Let M be the square matrix of order n with ij element f(gcd(i, j)). Then

\det M = \prod_i^n g(j)

Here μ is the Möbius function. The two special cases above correspond to g(m) = m and g(m) = 1.

I before E?

How well does the spelling rule “i before e except after c” hold? I searched the 5,000 most common English words (from here) to see.

70% of the words containing ‘ie” or “ei” follow the rule.

If you weigh the word counts by word frequency, the rule only holds 54% of the time.

There’s a longer version of the rule that adds “or when sounding as ‘a’ as in neighbor or weigh.”  This version holds for 79% of the words in my list. And when weighted by frequency, the rule holds 85% of the time.

Update: Here’s an even more accurate version from Merriam-Webster:

i before e,
except after c,
or when sounded as a,
as in ‘neighbor’ and ‘weigh’,
or when it appears in comparatives and superlatives like ‘fancier’,
or when the c sounds as sh as in ‘glacier’,
or when the vowel sounds like ee as in ‘seize’,
or i as in ‘height’,
or when it shows up in compound words such as ‘albeit’,
or when it shows up in –ing inflections of verbs that end in e, like queueing,
or occasionally in technical words that have a strong etymological link to their parent languages such as ‘cuneiform’ and ‘caffeine’,
and in numerous other random exceptions such as ‘science’, ‘forfeit’, and ‘weird.’

 

“I got the easy ones wrong”

This morning my daughter told me that she did well on a spelling test, but she got the easiest words wrong. Of course that’s not exactly true. The words that are hardest for her to spell are the ones she in fact did not spell correctly. She probably meant that she missed the words she felt should have been easy. Maybe they were short words. Children can be intimidated by long words, even though long words tend to be more regular and thus easier to spell.

Our perceptions of what is easy are often upside-down. We feel that some things should be easy even though our experience tells us otherwise.

Sometimes the trickiest parts of a subject come first, but we think that because they come first they should be easy. For example, force-body diagrams come at the beginning of an introductory physics class, but they can be hard to get right. Newton didn’t always get them right. More advanced physics, say celestial mechanics, is in some ways easier, or at least less error-prone.

“Elementary” and “easy” are not the same. Sometimes they’re opposites. Getting off the ground, so to speak, may be a lot harder than flying.

Mathematics and one-handed pottery

From Michael Atiyah’s essay Trends in Pure Mathematics.

In any given field of mathematics there are always some very fine points which present great technical challenges to the specialist but not are usually of interest to the general mathematician. To make an analogy, if you want to buy some hand-made pottery you are usually not interested in whether the potter used two hands or only one, though the potter would no doubt take great pride in his single-handed effort. In general the mathematical results that have the widest impact are not technically the most difficult.

This essay appears in the first volume of Atiyah’s collected works.

See his comments on this passage in this recent interview.

Hard-working lazy people

“It was a favorite theme of C. S. Lewis that only lazy people work hard. By lazily abdicating the essential work of deciding and directing, establishing values and setting goals, other people do it for us; then we find ourselves frantically, at the last minute, trying to satisfy a half dozen demands on our time, none of which is essential to our vocation, to stave off the disaster of disappointing someone.” — Eugene Peterson

 

Extract audio from video

If I bookmark a video, it may be months before I watch it, if ever, but audio I get around to more quickly. I have more time for listening to audio than watching video, so I’ve started going through my video backlog by downloading the files and extracting the audio to put on my iPod.

I download the video with Video DownloadHelper, a Firefox plug-in, then extract the audio to an MP3 file using Sound Converter. (I assume DownloadHelper works cross-platform, but Sound Converter only runs on Linux.)

DownloadHelper has an option to download and convert in one step, but I haven’t gotten that to work. Also, you can use ffmpeg or avconv to convert video to audio, but they have failed more often than not in my experience. So far Sound Converter has always worked for me.

Update: TubGet will let you enter a video URL and download an MP3 file. I seems to work OK, and is faster than downloading video. There’s also ListenToYouTube but apparently it only works with YouTube while TubGet also works with Vimeo and others.