Decide what to abandon

Sometimes it’s rational to walk away from something you’ve invested a great deal in.

It’s hard to imagine how investors could abandon something as large and expensive as a shopping mall. And yet it must have been a sensible decision. If anyone disagreed, they could buy the abandoned mall on the belief that they could make a profit.

The idea that you should stick to something just because you’ve invested in it goes by many names: sunk cost fallacy, escalation of commitment, gambler’s ruin, etc. If further investment will simply lose more money, there’s no economic reason to continue investing, regardless of how much money you’ve spent. (There may be non-economic reasons. You may have a moral obligation to fulfill a commitment or to clean up a mess you’ve made.)

Most of us have not faced the temptation to keep investing in an unprofitable shopping mall, but everyone is tempted by the sunk cost fallacy in other forms: finishing a novel you don’t enjoy reading, holding on to hopelessly tangled software, trying to earn a living with a skill people no longer wish to pay for, etc.

According to Peter Drucker, “It cannot be said often enough that one should not postpone; one abandons.”

The first step in a growth policy is not to decide where and how to grow. It is to decide what to abandon. In order to grow, a business must have a systematic policy to get rid of the outgrown, the obsolete, the unproductive.

It’s usually more obvious what someone else should abandon than what we should abandon. Smart businesses turn to outside consultants for such advice. Smart individuals turn to trusted friends. An objective observer without our emotional investment can see things more clearly than we can.

* * *

This post started out as a shorter post on Google+.

Photo above by Steve Petrucelli via flickr

Code Project articles

This week’s resource post lists some articles along with source code I’ve posted on CodeProject.

Probability

Pitfalls in Random Number Generation includes several lessons learned the hard way.

Simple Random Number Generation is a random number generator written in C# based on George Marsaglia’s MWC (multiply-with-carry) algorithm.

Finding probability distribution parameters from percentiles

Numerical computing

Avoiding Overflow, Underflow, and Loss of Precision explains why the most obvious method for evaluating mathematical functions may not work. The article includes C++ source code for evaluating some functions that come up in statistics (particularly logistic regression) that could have problems if naïvely implemented.

An introduction to numerical programming in C#

Five tips for floating point programming gives five of the most important things someone needs to know when working with floating point numbers.

Optimizing a function of one variable with Brent’s method.

Fast numerical integration using the double-exponential transform method. Optimally efficient numerical integration for analytic functions over a finite interval.

Three methods for root-finding in C#

Getting started with the SciPy (Scientific Python) library

Numerical integration with Simpson’s rule

Filling in the gaps: simple interpolation discusses linear interpolation and inverse interpolation, and gives some suggestions for what to do next if linear interpolation isn’t adequate.

PowerShell

Automated Extract and Build from Team System using PowerShell explains a PowerShell script to automatically extract and build Visual Studio projects from Visual Studio Team System (VSTS) version control.

PowerShell Script for Reviewing Text Shown to Users is a tool for finding errors in prose displayed to users that might not be exposed during testing.

Monitoring unreliable scheduled tasks describes a simple program for monitoring legacy processes.

C++

Calculating percentiles in memory-bound applications gives an algorithm and C++ code for calculating percentiles of a list too large to fit into memory.

Quick Start for C++ TR1 Regular Expressions answers 10 of the first questions that are likely to come to mind when someone wants to use the new regular expression support in C++.

Resource series

Last week: Miscellaneous math notes

Next week: Stand-alone numerical code

Rare letter combinations and key chords

A bigram is a pair of letters. For various reasons—word games, cryptography, user interface development, etc.—people are interested in knowing which bigrams occur most often, and so such information is easy to find. But sometimes you might want to know which bigrams occur least often, and that’s harder to find. My interest is finding safe key-chord combinations for Emacs.

Peter Norvig calculated frequencies for all pairs of letters based on the corpus Google extracted from millions of books. He gives a table that will show you the frequency of a bigram when you mouse over it. I scraped his HTML page to create a simple CSV version of the data. My file lists bigrams, frequencies to three decimal places, and the raw counts: bigram_frequencies.csv. The file is sorted in decreasing order of frequency.

The Emacs key-chord module lets you bind pairs of letters to Emacs commands. For example, if you map a command to jk, that command will execute whenever you type j and k in quick succession. In that case if you want the literal sequence “jk” to appear in a file, pause between typing the j and the k. This may sound like a bad idea, but I haven’t run into any problems using it. It allows you to execute your most frequently used commands very quickly. Also, there’s no danger of conflict since neither basic Emacs nor any of its common packages use key chords.

The table below gives bigrams whose percentage frequency is zero to three decimal places. See the data file for details.

Since Q is always followed by U in native English words, it’s safe to combine Q with any other letter. (If you need to type Qatar, just pause a little after typing the Q.) It’s also safe to use any consonant after J and most consonants after Z. (It’s rare for a consonant to follow Z, but not quite rare enough to round to zero. ZH and ZL occur with 0.001% frequency, ZY 0.002% and ZZ 0.003%.)

Double letters make especially convenient key chords since they’re easy to type quickly. JJ, KK, QQ, VV, WW, and YY all have frequency rounding to zero. HH and UU have frequency 0.001% and AA, XX, and ZZ have frequency 0.003%.
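The lookup described above can be sketched in a few lines of Python. This is only an illustration, not the script used to build the data file: it assumes each row starts with a bigram followed by its percentage frequency, and it ignores any further columns such as the raw counts. The function name `rare_bigrams` and the sample rows are made up for the example, though the percentages in the sample are ones quoted in this post.

```python
import csv
import io

# Sample rows using only percentages quoted in the post. The real file also
# has a raw-count column, which this sketch would simply ignore.
SAMPLE = """\
ZZ,0.003
ZY,0.002
HH,0.001
JJ,0.000
QQ,0.000
"""

def rare_bigrams(lines, threshold=0.0):
    """Return bigrams whose listed percentage frequency is <= threshold."""
    rare = []
    for row in csv.reader(lines):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        try:
            pct = float(row[1])
        except ValueError:
            continue  # skip a header row, if the file has one
        if pct <= threshold:
            rare.append(row[0].strip().upper())
    return rare

candidates = rare_bigrams(io.StringIO(SAMPLE))
print(candidates)
```

To run it against a downloaded copy of the file, pass an open file object in place of the `io.StringIO` wrapper.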

Note that the discussion above does not distinguish upper and lower case letters in counting frequencies, but Emacs key chords are case-sensitive. You could make a key chord out of any pair of capital letters unless you like to shout in online discussions, use a lot of acronyms, or write old-school FORTRAN.

Update (2 Feb 2015):

This post only considered ordered bigrams. But Emacs key chords are unordered: they are combinations of keys pressed at or near the same time. This means, for example, that qe would not be a good key chord because although QE is a rare bigram, EQ is not (0.057%). The file unordered_bigram_frequencies.csv gives the combined frequency of each bigram and its reverse (except for double letters, in which case it simply gives the frequency).
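Here is a rough sketch of how ordered frequencies could be folded into unordered ones. It is not necessarily how the unordered file was produced. The function name `unordered_frequencies` is hypothetical, and in the sample data only the EQ figure comes from this post; the QE and JJ values are placeholders standing in for "rounds to zero."

```python
from collections import defaultdict

def unordered_frequencies(ordered):
    """Combine each bigram with its reverse.

    `ordered` maps two-letter strings to percentage frequencies. A double
    letter such as JJ is its own reverse, so it appears once in the input
    and is counted once in the output.
    """
    combined = defaultdict(float)
    for bigram, pct in ordered.items():
        key = "".join(sorted(bigram.upper()))  # "QE" and "EQ" both become "EQ"
        combined[key] += pct
    return dict(combined)

# EQ is quoted in the post; QE and JJ are placeholder values.
sample = {"QE": 0.0, "EQ": 0.057, "JJ": 0.0}
result = unordered_frequencies(sample)
print(result)
```

Sorting the two letters gives every unordered pair a single canonical key, which is why qe inherits the frequency of EQ.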

Combinations of J and a consonant are still mostly good key chords except for JB (0.023%), JN (0.011%), and JD (0.005%).

Combinations of Q and a consonant are also good key chords except for QS (0.007%), QN (0.006%), and QC (0.005%). And although O is a vowel, QO still makes a good key chord (0.001%).

Numerical computing resources

This week’s resource post: some numerical computing pages on this site.

See also the Twitter account SciPyTip and numerical programming articles I’ve written for Code Project.

Last week: Regular expressions

Next week: Probability approximations

Blog seventh anniversary

Seven years

This blog is seven years old today. I’ve written 2,273 posts so far, a little less than one per day.

Over the holidays I combed through older posts looking for typos, broken links, etc. I fixed a lot of little things, but I’m sure I missed a few. If you find any problems, please let me know.

Most popular posts

Here are some of the most popular posts for the last three years.

In 2011 I split my most popular post list into three parts:

In 2010 I split the list into two parts:

I didn’t post a list of most popular posts for 2009, but the most popular post that year was Why programmers are not paid in proportion to their productivity.

Finally, my most popular posts for 2008.

Other writing

Here are a few other places you can find things I write:

Probability resources

Each Wednesday I post a list of notes on some topic. This week it’s probability.

See also posts tagged probability and statistics and the Twitter account ProbFact.

Last week: Python resources

Next week: Regular expression resources

Photo review

Here are some of the photos I took on my travels last year.

Bicycles on the Google campus in Mountain View, California:

Sunrise at Isla Vista, California:

View from University of California Santa Barbara:

Reflection of the Space Needle in the EMP museum in Seattle, Washington:

Paradise Falls, Thousand Oaks, California:

Bed and breakfast in Geldermalsen, The Netherlands:

Amsterdam, The Netherlands:

Chinatown in San Francisco, California

Heidelberg, Germany

Bringing back Coltrane

In the novel Chasing Shadows the bad guys have built a time machine named Magog.

“The bottom line is this. And it is hard for me to believe. They are going to use Magog to bring someone back from the past.”

Jack did not blink or move. His heart was beating very quickly now, but he merely said wryly, “Yes? Who are they going to bring back? Tell me it is John Coltrane.”

“You are never really serious are you?”

“I’m very serious about my music. Why are bad guys always bringing back people that everyone was glad to see go the first time? We could use more Coltrane …”

Related post: Nunc dimittis

Thou, thee, you, and ye

Ever wonder what the rules were for when to use thou, thee, ye, or you in Shakespeare or the King James Bible?

For example, the inscription on the front of the Main Building at The University of Texas says

Ye shall know the truth and the truth shall make you free.

Why ye at the beginning and you at the end?

The latest episode of The History of English Podcast explains what the rules were and how they came to be. Regarding the UT inscription, ye was the subject form of the second person plural and you was the object form. Eventually you became used for subject and object, singular and plural.

The singular subject form was thou and the singular object form was thee. For example, the opening lines of Shakespeare’s Sonnet 18:

Shall I compare thee to a summer’s day?
Thou art more lovely and more temperate.

Originally the singular forms were intimate and the plural forms were formal. Only later did thee and thou take on an air of reverence or formality.

Notes on HTML, XML, TeX, and Unicode

This week’s resource post: some notes on typesetting, Unicode, etc.

See also blog posts tagged LaTeX, HTML, and Unicode and the Twitter account TeXtip.

Last week: C++ resources

Next week: Special functions

Why assign two characters to the same symbol?

Unicode often counts the same symbol (glyph) as two or more different characters. For example, Ω is U+03A9 when it represents the Greek letter omega and U+2126 when it represents Ohms, the unit of electrical resistance. Similarly, M is U+004D when it’s used as a Latin letter but U+216F when it’s used as the Roman numeral for 1,000.
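One way to see the distinction is to ask Python's standard `unicodedata` module about these characters. The snippet below is a small illustration that uses only the code points mentioned above. The variable names are arbitrary.

```python
import unicodedata

omega = "\u03a9"      # Greek capital letter omega
ohm = "\u2126"        # ohm sign
latin_m = "\u004d"    # Latin capital letter M
roman_m = "\u216f"    # Roman numeral one thousand

# The glyphs may look alike, but each character has its own name.
for ch in (omega, ohm, latin_m, roman_m):
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# Look-alike characters are still different characters.
print(omega == ohm)
print(latin_m == roman_m)

# Only the Roman numeral carries a numeric value; the letter M has none,
# so the supplied default (None) is returned for it.
print(unicodedata.numeric(roman_m))
print(unicodedata.numeric(latin_m, None))
```

The numeric property at the end is an example of the kind of semantic information that a separate code point can carry.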

The purpose of such distinctions is to capture semantic differences. One example of how this could be useful is increased accessibility. A text-to-speech reader should pronounce things the same way people do. When such software sees “a 25 Ω resistor” it should say “a twenty five Ohm resistor” and not “a twenty five uppercase omega resistor,” just as a person would. [1]

Making text more accessible to the blind helps everyone else as well. For example, it also makes the text more accessible to search engines. As Elliotte Rusty Harold points out in Refactoring HTML:

Wheelchair ramps are far more commonly used by parents with strollers, students with bicycles, and delivery people with hand trucks than they are by people in wheelchairs. When properly done, increasing accessibility for the disabled increases accessibility for everyone.

However, there are practical limits to how many semantic distinctions Unicode can make without becoming impossibly large, and so the standard is full of compromises. It can be quite difficult to decide when two uses of the same glyph should correspond to separate characters, and no standard could satisfy everyone.

* * *

[1] Someone may discover that when I wrote “a 25 Ω resistor” above, I actually used an Omega (Ω, U+03A9) rather than an Ohm character (Ω, U+2126). That’s because font support for Unicode is disappointing. If I had used the technically correct Ohm character, some people would not be able to see it. Ironically, this would make the text less accessible.

On my Android phone, I can see Ω (Ohm) but I cannot see Ⅿ (Roman numeral M) because the installed fonts have a glyph for the former but not the latter.

* * *

This post first appeared on Symbolism, a blog that I’ve now shut down.

Updating blog posts

I’ve been going through my old blog posts and fixing a few problems. I found a few missing images, code samples that had lost their indentation, etc. Most of the errors have been my fault, but some were due to bugs in plug-ins.

If you see any problems with a post, please let me know. You could send me an email, or leave a comment on the post. (For a while I had comments automatically turn off on older posts, but I’ve disabled that. Now you can comment on any post.)

For the first couple of years, this blog didn’t have many readers, and so not many people pointed out my errors. Now that there are more readers, I find out about errors more quickly. But I’ve found some egregious errors in some of the older posts.

Thanks for your contribution to this blog. I’ve been writing here for almost seven years, and I’ve benefited greatly from your input.