Charles Karney showed me a clever trick for computing log(1 + *x*). It only takes a couple lines of code, but those lines are not obvious. If you understand this trick, you’ll understand a good bit of how floating point arithmetic works.

First of all, what’s the big deal? Why not just add 1 to *x* and take the log? The direct approach will be inaccurate for small *x* and completely inaccurate for very small *x*. For more explanation, see Math library functions that seem unnecessary.

Here’s how Charles Kerney implements the code in C++:

template<typename T> static inline T log1p(T x) throw() { volatile T y = 1 + x, z = y - 1; return z == 0 ? x : x * std::log(y) / z; }

The code comes from GeographicLib, and is based on Theorem 4 of this paper paper. I’ve read that paper several times, but I either didn’t notice the trick above or forgot it.

The code includes comments explaining the trick. I cut out the comments because I’ll explain the code below.

If `y = 1+x`

and `z = y-1`

, doesn’t `z`

just equal `x`

? No, not necessarily! This is exactly the kind of code that a well-meaning but uninformed programmer would go through and “clean up.” If `x`

isn’t too small, no harm done. But if `x`

is very small, it could ruin the code. It’s also the kind of code an optimizing compiler might rearrange, hence the `volatile`

keyword telling optimizers to not be clever.

Imagine `x = 1e-20`

. Then `y = 1+x`

sets `y`

to exactly 1; a standard floating point number does not have enough precision to distinguish 1 and 1 + 10^-20. Then `z = y-1`

sets `z`

to 0. While `z`

will not exactly equal `x`

in general, it *will* be a good approximation for `x`

.

If `z`

is not zero, `log(y)/z`

is a good approximation to log(1 + *x*)/*x* because `y`

is close to 1 + *x* and `z`

is close to *x*. Multiplying `log(y)/z`

by `x`

gives a good approximation to log(1+*x*).

If `z`

is zero, `x`

is so small that `x`

is an excellent approximation to log(1 + *x*) because the error is on the order of *x*^{2}.

Sliderule users knew about this problem; heavy duty sliderules had separate scales for arguments close to 1.

For me the most interesting part of the code is that it uses x * std::log(y) / z instead of just std::log(y) or, what amounts to the same, std::log(1+x). As you say, this is a good approximation to the logairthm of 1+x, but what is amazing is that it’s a better approximation than computing it directly.

I have one comment and one question.

Instead of convoluted computation of z (with volatiles etc) which you test against zero, you could just check “x < std::numeric_limits::epsilon()”. This will tell you whether 1+x is representable as a floating point of type T.

My question is this: For the case where x > epsilon (z != 0), why multiply log(y) by x and then divide by z? Why not just return log(y)? I assume this is somehow compensating for the loss of precision when x is small but not too small, but it is not obvious how.

Stupid comment field dropped my < and >

That should read:

“you could just check x < std::numeric_limits<T>::epsilon()”

I’d also love to see the explanation of why x * log(1+x) / z is a better approximation of log(1+x) than just log(1+x) itself. The comments in the code don’t even claim that it is (?!) — they say “The multiplication x * (log(y)/z) introduces little additional error”, which makes it sound undesirable.

So consider z is a crude estimate for x. x = z + s where s is small (not necessarily positive) (actually z is small. s is really small)

in terms of z and s you want

log(1+z+s) = (z+s)*log(1+z+s)/(z+s) = (z+s) * [log(1+z+s)/(z+s)]

so you have z+s exactly, and you can do multiplication without too much loss of precision. so you want a good estimate for [log(1+z+s)/(z+s)].

but your original problem is you can’t calculate log(1+z+s). You can only calculate log(1+z) = log(y). But note that log(1+z+s)/log(1+z) is close to (z+s)/z. So a good estimate for [log(1+z+s)/(z+s)] is [log(1+z)/z] = [log(y)/z].

Which brings us to the result.

But why not be more direct? Let F(s)=log(1+z+s), and expand at s=0 to get F(s) ~ log(1+z)+ s/(1+z) ~ log(1+z)+s. So can’t we use log(y)+(x-z) instead? Indeed, can’t we just use this last expression without the z==0 test?

IIt might be instructive to work through the algorithm with x = 1.5 * eps

where eps is the machine epsilon (the smallest positive number for which 1

+ eps is computed exactly).

x = 1.5 * eps

y = 1 + x = 1 + 2 * eps (using the round to even rule)

z = y – 1 = 2 * eps

log(y) = 2 * eps

log(y)/z = 1

return x * log(y)/z = 1.5 * eps

Goldberg bounds the error in this method (assuming that the error in

log(x) is bounded).

@SteveBrooklineMA That’s a very interesting suggestion. I can’t see anything wrong with it.

@SteveBrooklineMA @Jitse Niesen yeah, that does sound right. I did a quick simulation and you get about half the error by using (z+s) * [log(1+z+s)/(z+s)]. Though my calculus is failing me at the moment for explaining why.

@SteveBrooklineMA @Jitse Niesen

but log(1+z)+ s/(1+z) is way better.

so in the orignal c

std::log(y) + (x-z)/y

If A is the output from the original code, B =log(y)+(x-z) and C=log(y)+(x-z)/y, I haven’t been able to get either (A-B)/A or (A-C)/A to be more than something on the order of 1e-16. The absolute error between B and C seems even lower. This is just by naively testing with simple code and various x. So if it were up to me, I think I’d go with B.

In the original code, is x * (std::log(y) / z) better, since (x * std::log(y)) / z may overflow for large x?

But why the

`volatile T`

in the second line?The only thing I can think of is trying to

disablecompiler optimization for those variables and trivial calculations (addition, subtraction).Like most such clever formulas, this one originates with Velvel Kahan.

A proof of why it works is given in the solution to problem 1.5 of

Accuracy and Stability of Numerical Algorithms (2nd ed., 2002).

Section 1.14.1 of the book discusses a similar formula for (exp(x)-1)/x.

Thanks Nick, I see you wrote the book on this. Is there motivation for the formula beyond its similarity with (exp(x)-1)/x? You cite this paper, and say it gives an alternate method:

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.5280

and the method there is log(y) + (x-z)/y. Isn’t this latter formula more straightforward?