# Troubleshooting C++ TR1 problem in Visual Studio 2008

Patrick Getzmann and I have been exchanging email about a problem he had using some sample code I’d written for working with regular expressions in C++. I wasn’t much help, but Patrick figured it out. I wanted to post his solution here in case someone else has the same problem.

His code would compile but not link. The compiler gave the error message “regex.obj : error LNK2019 …” His solution follows.

I have German Visual Studio installed. The Feature Pack is bundled with SP1 in the German Version. My Installation order was:

1. Visual Studio 2008
2. Visual Studio 2008 SP1 (including the Feature Pack)
3. Windows SDK for Windows Server 2008 and .NET Framework 3.5

But it should be (If one needs/wants the Server 2008 SDK):

1. Visual Studio 2008
2. Windows SDK for Windows Server 2008 and .NET Framework 3.5
3. Visual Studio 2008 SP1 (including the Feature Pack)

Otherwise reinstallation of SP1 helps.

The same problem would show up if you were using C++ TR1 random number generators. In a nutshell, try reinstalling SP1.

# Log concave functions

Log concave functions have some very interesting and useful properties. I’ll list some of these shortly after a three definitions.

A function is convex if the line segment joining two points on the graph lies above the graph. In symbols, f(x) is convex if for every t between 0 and 1, and for every x and y, f(tx + (1-t)y) ≤ t f(x) + (1-t) f(y).

A function f(x) is concave if –f(x) is convex. Equivalently, the line connecting two points on the graph lies below the graph. In symbols, reverse the direction of the inequality for convexity.

A function f(x) is log concave if log( f(x) ) is concave.

The basic properties of convex functions are obvious. It’s easy to show that the sum of two convex functions is convex, the maximum of two convex functions is convex, etc. It would seem that log concave functions would be unremarkable because they are so simply related to convex functions. But they have some surprising properties.

• The Laplace transform of a non-negative function is a log convex.
• The running average of a log concave function is also log concave.
• If a random variable has a log concave PDF (probability density function), then the CDF (cumulative distribution function) is also log concave.
• The convolution of two log concave functions is log concave.
• If two independent random variables have log-concave PDFs, then their sum also has a log concave PDF.

A few properties of log concave functions do follow easily from analogous properties of convex and concave functions.

• The product of log concave functions is log concave.
• A function f(x) is log concave if and only if for every t between 0 and 1 and for every x and y, f(tx + (1-t)y) ≥ f(x)t f(y)1-t
• A smooth function f(x) is log concave if and only if the inequality   f2 f ≤ ∇ f ⋅ ∇ f   holds everywhere.

Here are a few specific functions that are log concave.

• The exponential function eax
• The inverse logit function ex/(ex + 1)
• The function ∏ xi / ∑ xi
• The function xa for a ≥ 0
• Determinant is a log concave function on positive definite matrices

For more, see Stephen Boyd’s book Convex Optimization.

# First anniversary post

This blog began one year ago today. Thank you for reading my posts and thank you for all your encouragement, online and offline.

Thanks again for a great year.

# Five principles for object oriented software engineering

In Scott Hanselman’s latest podcast, Robert C. Martin explains the SOLID principles for object oriented software design. SOLID is an acronym for the following:

• Single responsibility principle
• Open-closed principle
• Liskov substitution principle
• Interface segregation principle
• Dependency inversion principle

Related blog post: Why there will always be programmers

Update: Someone has made some funny posters for each of the SOLID principles, a parody of these posters with a photo and some motivational business cliché.

# PNG vs JPEG

Bill the Lizard answered an image compression question on StackOverflow by pointing out the image below that shows the difference between PNG and JPEG compression when applied to line drawings.

The image comes from lbrandy.com. The left side of the image uses PNG compression, a lossless compression format. The right side uses JPEG, a lossy format that computes a Fourier transform and discards the highest frequency components. JPEG can produce smaller files for natural photographic images. But for line drawings, the artifacts of the JPEG compression are noticeable.

# Convex optimization

I’ve enjoyed following Stephen Boyd’s lectures on convex optimization. I stumbled across a draft version of his textbook a few years ago but didn’t realize at first that the author and the lecturer were the same person. I recommend the book, but I especially recommend the lectures. My favorite parts of the lectures are the off-hand remarks about which methods are well known and which are not, what different fields call the same methods, which concerns are merely academic minutia and which are practically important. This kind of information is almost impossible to pick up just from reading books and articles.

A surprisingly large set of problems can be formulated as minimizing a convex function subject to convex constraints, and algorithms for solving these problems are very well developed. It’s roughly true that an optimization problem is tractable if and only if it is convex. Of course that’s not exactly true. Many special non-convex problems can be solved easily, though if you write down an arbitrary non-convex optimization problem, it’s probably hard to solve. On the other hand, enormous convex optimization problems can often be solved quite efficiently.

I became interested in optimization a few years ago in order to solve some problems that came up at work. After going through Boyd’s lectures, I’d like to go back and do a better job on some of these problems. For example, the EffTox dose-finding software solves an optimization problem to determine prior parameters based on input from doctors. That optimization is slow and not very robust. If I can find the time to do it, I think I could improve that software.

# Making screencasts with Jing

After seeing Jing as the top recommendation in a list of teaching tools, I decided to look into it. This two-minute video gives a good introduction to what the tool can do. In a nutshell, Jing makes it very easy to capture a screenshot or a video and post it to the web. I tried it out with this  screencast of the Inequality Calculator software I helped write and use all the time.

At first I thought I had to select the desktop region around the program I wanted to make a video of, but then I found that if I click inside the program window with the cropping tool Jing will automatically select a rectangle around the window.

Posting your videos to the web and getting a link is easy. It took a little more work to figure out how to get the code to embed the video in a web page. This page shows how to do it. You don’t get an “embed” button by default, but the video on that page shows you how to create one.

It seems you have to choose whether you want an ordinary link or embedding code. If you choose the latter, it’s not obvious how to recover the former. If you look in the HTML code Jing generates,  you’ll see something like this.

`<param name="flashVars" value="thumb=http://.....content=http://....swf">`

Select everything following `content=` down to and including `.swf`. There are several other URLs in the embedding code, but only this one works by itself.

# A priest, a Levite, and a Samaritan walk into a bar …

A rabbi, a priest, and a preacher walk into a bar. The bartender looks up and says “Is this some sort of joke?”

Many jokes follow the pattern of three people doing something. The first two establish a pattern as the set up and the third provides the contrast for the punch line. You see the same pattern in folk tales such as The Three Little Pigs and Goldilocks and the Three Bears. Apparently this is an ancient formula.

A few days ago I listed to a sermon about the parable of the Good Samaritan recommended by Guy Kawasaki. In that message John Ortberg says that around the time of Christ, Jews often told stories involving a priest, a Levite, and an ordinary Israelite. The priest and the Levite would be the bad guys and the common Jew would be the hero. Jesus used this same formula but with a twist. His parable began with a priest and a Levite passing by a man who had been beaten by robbers and left for dead. The audience was all set for the contemporary equivalent of Joe the Plumber to be the hero. But instead, Jesus shocked the audience by making the hero a Samaritan.

Here’s the parable, taken from the English Standard Version.

And behold, a lawyer stood up to put him to the test, saying, “Teacher, what shall I do to inherit eternal life?” He said to him, “What is written in the Law? How do you read it?” And he answered,  “You shall love the Lord your God with all your heart and with all your soul and with all your strength  and with all your mind, and your neighbor as yourself.” And he said to him, “You have answered correctly;  do this, and you will live.”

But he, desiring to justify himself, said to Jesus, “And who is my neighbor?” Jesus replied, “A man was going down from Jerusalem to Jericho, and he fell among robbers, who stripped him and beat him and departed, leaving him half dead. Now by chance a priest was going down that road, and when he saw him he passed by on the other side. So likewise a Levite, when he came to the place and saw him, passed by on the other side. But a Samaritan, as he journeyed, came to where he was, and when he saw him, he had compassion. He went to him and bound up his wounds, pouring on oil and wine. Then he set him on his own animal and brought him to an inn and took care of him.  And the next day he took out two denarii and gave them to the innkeeper, saying, ‘Take care of him, and whatever more you spend, I will repay you when I come back.’  Which of these three, do you think, proved to be a neighbor to the man who fell among the robbers?” He said, “The one who showed him mercy.” And Jesus said to him, “You go, and do likewise.”

The lawyer couldn’t bring himself to say “Samaritan” and so he answered Jesus indirectly by saying “The one who showed him mercy.”

# Taking your code out for a walk

I posted two articles on the Reproducible Ideas blog this morning.

Taking your code out for a walk
Just because you haven’t changed your code doesn’t mean it still works.

CiSE special issue on reproducible research
The latest issue of Computing in Science and Engineering has five articles on reproducible research.

[Update: Reproducible Ideas has gone away.]

# Hard disk array failure probabilities

The acronym RAID originally stood for Redundant Array of Inexpensive Disks. The idea is to create reliable disk storage using multiple drives that are not so reliable individually. Data is stored in such a way that if one drive fails, you’re OK as long as you can replace it before another drive fails. RAID is very commonly used.

Now folks often use redundant arrays of expensive disks. The name “RAID” is too established to change, and so now people say the “I” stands for “Independent.” And their in lies a problem.

The naïve theory is that if hard disk 1 has probability of failure 1/1000 and so does disk 2, then the probability of both failing is 1/1,000,000. That assumes failures are statistically independent, but they’re not. You can’t just multiply probabilities like that unless the failures are uncorrelated. Wrongly assuming independence is a common error in applying probability, maybe the most common error.

Joel Spolsky commented on this problem in the latest StackOverflow podcast. When a company builds a RAID, they may grab four or five disks that came off the assembly line together. If one of these disks has a slight flaw that causes it to fail after say 10,000 hours of use, it’s likely they all do. This is not just a theoretical possibility. Companies have observed batches of disks all failing around the same time.

Using disks made by competing companies would significantly decrease the correlation in failure probabilities. The failures will always be somewhat correlated. For one thing, different manufacturers use similar materials and processes. Also, regardless of where the drives come from, they end up in the same box, all subject to the same environmental factors.

# Leaky abstractions and hockey stick learning curves

Michael Lugo wrote a blog post Turtles all the way down that links to a cartoon in which someone explains how a compiler works by diving through layers of abstraction from software to hardware to electrons to quantum physics. You can never write a program if you want to understand every detail from programming language syntax down to quarks. But how far down the abstraction stack do you need to understand things?

For personal satisfaction more than practical necessity, I like to have at least a vague idea of what’s going on all the way down the abstraction stack. But how far down do you need to understand what’s going on? The answers you’re likely to hear to this question depend on the age of the person you talk to. Programmers tend to think everyone should know things as far down the stack as they do. That may mean assembly, C, Java, PHP, etc. depending on when they started programming. A better answer is that you need to go down until your abstractions no longer leak. The rest of this post explains what that means.

Programming is all about layers of abstraction, and in theory you only need to understand the layer you’re currently working in. For example, if you’re writing ASP.NET, you don’t need to know how the CLR works, nor do you need to know HTML, JavaScript, HTTP, TCP/IP, etc. At least that’s the theory. But when something goes wrong, and then you have to understand what’s going on in the layer below where you’re working.  Joel Spolsky coined the phrase leaky abstraction to explain this phenomena. At some point, your abstraction fails and you have to dig a layer deeper, maybe two or three layers until you get to an abstraction that doesn’t leak as far as you are concerned.

This is one of the things that makes programming hard. Maybe your nephew in high school can put up a web site for you that does everything you need. Or maybe not. A rookie programmer can jump into a project and have something to show for himself quickly. But when his abstraction layer springs a leak, he hits a brick wall unless he has a more experienced programmer to help him out. And of course that more experienced programmer doesn’t know everything. He will have to do some research if he has to dig below the abstraction layers he knows. It’s turtles all the way down.

Leaky abstractions lead to the phenomena Joel calls the hockey stick learning curve.

You go along a nice flat learning curve until suddenly the path goes nearly vertical, like a hockey stick on its side.

Leaky abstractions mean that we live with a hockey stick learning curve: you can learn 90% of what you use day by day with a week of learning. But the other 10% might take you a couple of years catching up. That’s where the really experienced programmers will shine over the people who say “whatever you want me to do, I can just pick up the book and learn how to do it.” If you’re building a team, it’s OK to have a lot of less experienced programmers cranking out big blocks of code using the abstract tools, but the team is not going to work if you don’t have some really experienced members to do the really hard stuff.

I’d like to see the terms “leaky abstraction” and “hockey stick learning curve” more commonly used. They go a long way toward explaining the difference between fantasy and reality in software development.