Troubleshooting C++ TR1 problem in Visual Studio 2008

Patrick Getzmann and I have been exchanging email about a problem he had using some sample code I’d written for working with regular expressions in C++. I wasn’t much help, but Patrick figured it out. I wanted to post his solution here in case someone else has the same problem.

His code would compile but not link. The linker gave the error message “regex.obj : error LNK2019 …” His solution follows.

I have German Visual Studio installed. The Feature Pack is bundled with SP1 in the German Version. My Installation order was:

1. Visual Studio 2008
2. Visual Studio 2008 SP1 (including the Feature Pack)
3. Windows SDK for Windows Server 2008 and .NET Framework 3.5

But it should be (If one needs/wants the Server 2008 SDK):

1. Visual Studio 2008
2. Windows SDK for Windows Server 2008 and .NET Framework 3.5
3. Visual Studio 2008 SP1 (including the Feature Pack)

Otherwise reinstallation of SP1 helps.

The same problem would show up if you were using C++ TR1 random number generators. In a nutshell, try reinstalling SP1.
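One way to check whether the installation is healthy again is to build a tiny program that actually uses the regex library, since the failure shows up at link time rather than compile time. The sketch below assumes a C++11 compiler, where the classes live in namespace std. With the Visual Studio 2008 Feature Pack the same classes were in namespace std::tr1. The function name LooksLikeZipCode is hypothetical and only there for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical smoke test: if a program calling this function compiles
// AND links, the regex library is installed correctly.
// Assumption: C++11 names (std::regex). Under the VS2008 Feature Pack
// the equivalent names were std::tr1::regex and std::tr1::regex_match.
bool LooksLikeZipCode(const std::string& s)
{
    // Five digits, optionally followed by a dash and four more digits.
    static const std::regex pattern("[0-9]{5}(-[0-9]{4})?");
    return std::regex_match(s, pattern);
}
```

Calling LooksLikeZipCode("77030") from main is enough to force the linker to pull in the regex code.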

Functional programming in C++ with function objects

Here’s something I do all the time. I have a function of one variable and several parameters. I implement it as a function object in C++ so I can pass it on to code that does something with functions of one variable, such as integration or optimization. I’ll give a trivial example and then show the most recent real problem I’ve worked on.

Say I have a function f(x; a, b, c) = 1000a + 100b + 10c + x. In a sense this is simply a function of four variables. But the connotation of using a semicolon rather than a comma after the x is that I think of x as being a variable and I think of a, b, and c as parameters. So f is a function of one variable that depends on three constants. (A “parameter” is a “constant” that can change!)

I create a C++ function object with two methods. One method is a constructor that takes the function parameters as arguments and saves them to member variables. The other method is an overload of the parenthesis method. That’s what makes the class a function object. By overloading the parenthesis method, I can call an instance of the class as if it were a function. Here’s some code.

class FunctionObject
{
public:
	FunctionObject(double a, double b, double c)
		: m_a(a), m_b(b), m_c(c)
	{
	}

	double operator()(double x) const
	{
		return 1000*m_a + 100*m_b + 10*m_c + x;
	}

private:
	double m_a;
	double m_b;
	double m_c;
};

So maybe I instantiate an instance of this function object and pass it to a function that finds the maximum value over an interval [a, b]. The code might look like this.

FunctionObject f(3, 1, 4);
double maximum = Maximize(f, a, b);

Here’s a more realistic example. A few days ago I needed to solve this problem. Given user input parameters λ, σ, n, and ξ, find b such that the following holds.

\int_0^1 \frac{1}{\sqrt{2}\nu} \Phi\left(\frac{\lambda \sqrt{2\nu n}}{\sqrt{\sigma^2(1 - 2\nu) + bn}}\right) \, d\nu = \xi

The function Φ above is the CDF of a standard normal random variable, defined here.

To solve this problem, I wrote a function object to evaluate the left side of the equation above. It takes λ, σ, and n as constructor arguments and takes b as an argument to operator(). Then I passed the function object to a root-finding method to solve for the value of b that makes the function value equal ξ. But my function is defined in terms of an integral, so I first needed to write another function object that returns the integrand and pass that function object to this numerical integration routine. So I had to write two function objects to solve this problem.
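The two-layer pattern can be sketched on a simpler stand-in problem: find b such that the integral of exp(-b t) over [0, 1] equals a target ξ. Everything here is an assumption made for illustration: the integrand is deliberately not the one from the equation above, and Integrate and FindRoot are hypothetical midpoint-rule and bisection routines, not the library routines I used.

```cpp
#include <cmath>

// Inner function object: the integrand, with b as its parameter.
// Stand-in integrand exp(-b*t), chosen only to keep the sketch short.
class Integrand
{
public:
    explicit Integrand(double b) : m_b(b) {}
    double operator()(double t) const { return std::exp(-m_b * t); }
private:
    double m_b;
};

// Hypothetical integrator: midpoint rule with n panels on [a, b].
template <typename Function>
double Integrate(const Function& f, double a, double b, int n = 2000)
{
    double h = (b - a) / n;
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += f(a + (i + 0.5) * h);
    return sum * h;
}

// Outer function object: (integral as a function of b) minus the target.
// Its root is the value of b we are looking for.
class IntegralMinusTarget
{
public:
    explicit IntegralMinusTarget(double target) : m_target(target) {}
    double operator()(double b) const
    {
        return Integrate(Integrand(b), 0.0, 1.0) - m_target;
    }
private:
    double m_target;
};

// Hypothetical root finder: bisection. Assumes f(lo) and f(hi)
// have opposite signs.
template <typename Function>
double FindRoot(const Function& f, double lo, double hi, double tol = 1e-10)
{
    double flo = f(lo);
    while (hi - lo > tol)
    {
        double mid = 0.5 * (lo + hi);
        double fmid = f(mid);
        if ((fmid < 0) == (flo < 0)) { lo = mid; flo = fmid; }
        else                         { hi = mid; }
    }
    return 0.5 * (lo + hi);
}
```

A call such as FindRoot(IntegralMinusTarget(0.5), 0.1, 10.0) then solves for b. The outer object builds a fresh inner object on every evaluation, which is exactly the nesting described above.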

There are several advantages to function objects over functions. For example, I would typically do parameter validation in the constructor. Quite often I also do some expensive calculations in the constructor and cache the results so that each call to operator() is then more efficient. Maybe I want to keep track of how often the function is called, so I put in some sort of odometer method that increments a counter with each call.
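Here is a small sketch of those three ideas together: validation in the constructor, a cached constant, and a call counter. The class NormalDensity and the method name CallCount are hypothetical, and the cached quantity is cheap here. It only stands in for a calculation that would be expensive in practice.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical example: normal density with mean mu and standard
// deviation sigma, written as a function object.
class NormalDensity
{
public:
    NormalDensity(double mu, double sigma)
        : m_mu(mu), m_sigma(sigma), m_norm(0.0), m_calls(0)
    {
        // Parameter validation happens once, at construction.
        if (!(sigma > 0.0))
            throw std::invalid_argument("sigma must be positive");

        // Cache the normalizing constant 1/(sigma*sqrt(2*pi)) so that
        // operator() does not recompute it on every call.
        const double pi = 3.14159265358979323846;
        m_norm = 1.0 / (sigma * std::sqrt(2.0 * pi));
    }

    double operator()(double x) const
    {
        ++m_calls;  // the "odometer"
        double z = (x - m_mu) / m_sigma;
        return m_norm * std::exp(-0.5 * z * z);
    }

    long CallCount() const { return m_calls; }

private:
    double m_mu;
    double m_sigma;
    double m_norm;
    mutable long m_calls;  // mutable so a const operator() can count
};
```

The counter is declared mutable so that operator() can stay const, which keeps the object usable anywhere a const function object is expected.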

Unfortunately there’s a fair amount of code to write in order to implement even the simplest function. This effort hardly matters in production code; so many other things take more time. But it is annoying when doing some quick exploration. The next post shows how this can be done much more easily in Python. The Python approach is more convenient for small problems, but it doesn’t have the advantages mentioned above, such as caching expensive calculations in a constructor.

Computing the inverse of the normal CDF

Someone asked me this week for C++ code to compute the inverse of the normal (Gaussian) distribution function. The code I usually use isn’t convenient to give away because it’s part of a large library, so I wrote a stand-alone function using an approximation out of Abramowitz and Stegun (A&S). There are a couple things A&S takes for granted, so I decided to write up the code in the spirit of a literate program to explain the details. The code is compact and portable. It isn’t as fast as possible nor as accurate as possible, but it’s good enough for many purposes.

A literate program to compute the inverse of the normal CDF
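For a rough idea of what such a function looks like, here is a sketch based on formula 26.2.23 in A&S, a rational approximation whose stated absolute error is below 4.5e-4. Treat the choice of formula and the function names as assumptions; the linked write-up is the authoritative version. The formula is stated for upper-tail probabilities no greater than 1/2, so the code uses symmetry to cover the rest of the range.

```cpp
#include <cmath>
#include <stdexcept>

// Rational approximation from Abramowitz and Stegun, formula 26.2.23.
// Given t = sqrt(-2 ln q) for an upper-tail probability 0 < q <= 0.5,
// returns x with P(Z > x) approximately q. Absolute error < 4.5e-4.
double RationalApproximation(double t)
{
    const double c[] = {2.515517, 0.802853, 0.010328};
    const double d[] = {1.432788, 0.189269, 0.001308};
    return t - ((c[2] * t + c[1]) * t + c[0]) /
               (((d[2] * t + d[1]) * t + d[0]) * t + 1.0);
}

// Inverse of the standard normal CDF for 0 < p < 1.
double NormalCDFInverse(double p)
{
    if (p <= 0.0 || p >= 1.0)
        throw std::invalid_argument("p must be strictly between 0 and 1");

    if (p < 0.5)
    {
        // Lower tail: by symmetry, F^{-1}(p) = -G^{-1}(p),
        // where G is the upper-tail probability.
        return -RationalApproximation(std::sqrt(-2.0 * std::log(p)));
    }
    // Upper half: F^{-1}(p) = G^{-1}(1 - p).
    return RationalApproximation(std::sqrt(-2.0 * std::log(1.0 - p)));
}
```

For example, NormalCDFInverse(0.975) should land within about 0.0005 of the familiar 1.96.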

Free C# book

Charles Petzold is a highly respected author in Windows programming circles. For years, his book was THE reference for Win32 API programming. I knew he had since written several books on .NET programming but I didn’t realize until I listened to an interview with Petzold that he has a .NET book that he gives away on his website.

.NET Book Zero: What the C or C++ Programmer Needs to Know About C# and the .NET Framework

How to compute standard deviation accurately

The most convenient way to compute sample variance by hand may not work in a program. Sample variance is given by

\sigma^2 = \frac{1}{n(n-1)}\left(n \sum_{i=1}^n x_i^2 - \left(\sum_{i=1}^n x_i\right)^2\right)

If you compute the two summations and then carry out the subtraction above, you might be OK. Or you might have a large loss of precision. You might get a negative result even though in theory the quantity above cannot be negative. If you want the standard deviation rather than the variance, you may be in for an unpleasant surprise when you try to take your square root.

There is a simple but non-obvious way to compute sample variance that has excellent numerical properties. The algorithm was first published back in 1962 but is not as well known as it should be. Here are some notes explaining the algorithm and some C++ code for implementing the algorithm.

Accurately computing running variance

The algorithm has the added advantage that it keeps a running account of the mean and variance as data are entered sequentially.
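As a sketch of the idea, here is a small class that updates the mean and a running sum of squared deviations one value at a time. This is the recurrence usually attributed to Welford. The class and method names are my own for this illustration and may differ from those in the linked notes.

```cpp
#include <cmath>

// Running mean and variance via a one-pass update (Welford-style).
// Names are hypothetical; only the recurrence matters.
class RunningStat
{
public:
    RunningStat() : m_n(0), m_mean(0.0), m_s(0.0) {}

    void Push(double x)
    {
        ++m_n;
        double delta = x - m_mean;     // deviation from the old mean
        m_mean += delta / m_n;         // updated mean
        m_s += delta * (x - m_mean);   // old deviation times new deviation
    }

    long Count() const { return m_n; }
    double Mean() const { return m_mean; }

    // Sample variance; defined as 0 when fewer than two values were seen.
    double Variance() const
    {
        return (m_n > 1) ? m_s / (m_n - 1) : 0.0;
    }

    double StandardDeviation() const { return std::sqrt(Variance()); }

private:
    long m_n;
    double m_mean;
    double m_s;  // running sum of squared deviations from the mean
};
```

Because m_s only ever accumulates products of two deviations of the same sign, it cannot go negative, so the square root is always safe. Data with a large common offset, such as 1000000004, 1000000007, 1000000013, 1000000016, is the kind of input where the sum-of-squares formula loses precision and this update does not.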


NaN, 1.#IND, 1.#INF, and all that

If you’ve ever been surprised by a program that printed some cryptic letter combination when you were expecting a number, you’ve run into an arithmetic exception. This article explains what caused your problem and what you may be able to do to fix it.

IEEE floating-point exceptions

Here’s a teaser. If x is of type float or double, does the expression (x == x) always evaluate to true? Are you certain?
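If you want to try the teaser yourself, a sketch like the following is enough. It assumes IEEE 754 arithmetic and a compiler that is not run with aggressive floating-point options such as -ffast-math, which may optimize the comparison away. The helper names are hypothetical.

```cpp
#include <limits>

// Returns whether x compares equal to itself.
// Under IEEE 754 rules this is false exactly when x is a NaN.
bool EqualsItself(double x)
{
    return x == x;
}

// A classic NaN test built on the same fact.
bool IsNaN(double x)
{
    return x != x;
}

// Convenience: a quiet NaN to experiment with.
double MakeNaN()
{
    return std::numeric_limits<double>::quiet_NaN();
}
```

EqualsItself(1.0) is true, while EqualsItself(MakeNaN()) is false, because a NaN compares unequal to everything, itself included.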

Multiple string types: BSTR, wchar_t, etc.

This morning I listened to a podcast interview with Kate Gregory. She used some terms I hadn’t heard in years: BSTR, OLE strings, etc.

Around a decade ago I was working with COM in C++ and had to deal with the menagerie of string types Kate Gregory mentioned. I wrote an article to get all the various types straight in my head: all the different memory allocation rules, conventions for use, conversions between types, etc. I never published the article. When I started my personal website I thought about posting the article there, but then I thought that by now nobody cared about such things. But the interview I listened to this morning made me think more people might be interested than I’d thought. So I posted my article Unravelling Strings in Visual C++ in case someone finds it useful.

Random number generation in C++ TR1

The C++ Standard Library Technical Report 1 (TR1) includes a specification for random number generation classes.

The Boost library has supported TR1 for a while. Microsoft released a feature pack for Visual Studio 2008 in April that includes support for most of TR1. (They left out support for mathematical special functions.) Dinkumware sells a complete TR1 implementation. And gcc included support for TR1 in version 4.3 released in May. (According to the gcc status page the latest version supports most of TR1 except regular expressions. I’ve been able to get some TR1 features to work using gcc 4.3.1 but have not been able to get random number generation to work yet.)

I’ve posted a set of notes that explain how to use the C++ TR1 random number generation classes in Visual Studio 2008. The notes include sample code and point out a few gotchas. They also explain how to use the C++ TR1 classes to generate from distributions not directly supported by the TR1.
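To give a flavor of generating from a distribution the library does not supply, here is a sketch that draws Cauchy samples by pushing a uniform sample through the inverse CDF. It is written against the C++11 random header, where the names are in namespace std. In TR1 the corresponding pieces were std::tr1::mt19937 and std::tr1::uniform_real. The helper CauchySample is hypothetical and is not taken from the notes.

```cpp
#include <cmath>
#include <random>

// Hypothetical helper: draw one Cauchy(median, scale) sample by
// inverting the CDF: x = median + scale * tan(pi * (u - 1/2)),
// where u is uniform on [0, 1).
// Assumption: C++11 names. The TR1 spelling of the uniform
// distribution was std::tr1::uniform_real.
template <typename Engine>
double CauchySample(Engine& engine, double median, double scale)
{
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    const double pi = 3.14159265358979323846;
    double u = uniform(engine);
    return median + scale * std::tan(pi * (u - 0.5));
}
```

Typical use is to construct one engine, for example std::mt19937 engine(42), and pass it by reference to every call, so that the stream of random numbers is reproducible from the seed.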


C++ templates may reduce memory footprint

One of the complaints about C++ templates is that they can cause code bloat. But Scott Meyers pointed out in an interview that some people are using templates in embedded systems applications because templates result in smaller code.

C++ compilers only generate code for template methods that are actually used in an application, so it’s possible that code using templates may result in a smaller executable than code that uses a more traditional object-oriented approach.
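A small sketch shows the mechanism. In the hypothetical Wrapper class below, the Length method would not compile for int, because int has no size() member. The program still builds, because Wrapper<int>::Length is never called and so is never instantiated. All names here are made up for the example.

```cpp
#include <cstddef>
#include <string>

template <typename T>
class Wrapper
{
public:
    explicit Wrapper(const T& value) : m_value(value) {}

    // Works for any T that supports operator+.
    T Twice() const { return m_value + m_value; }

    // Only valid for types with a size() member. The compiler checks
    // and generates this for a given T only if it is actually called.
    std::size_t Length() const { return m_value.size(); }

private:
    T m_value;
};

// Uses Wrapper<int>::Twice only. Wrapper<int>::Length is never
// instantiated, so no code (and no error) is produced for it.
int TwiceInt()
{
    Wrapper<int> w(21);
    return w.Twice();
}

// Uses both members for std::string, so both are instantiated.
std::size_t LengthOfTwice()
{
    Wrapper<std::string> w("ab");
    return Wrapper<std::string>(w.Twice()).Length();
}
```

Contrast this with an ordinary class, where every member function is compiled whether or not anything calls it, and it is up to the linker to discard what is unused.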