Here’s a strange detail of IEEE floating point arithmetic: computers have two versions of 0, positive zero and negative zero. Most of the time the distinction between +0 and −0 doesn’t matter, but once in a while signed versions of zero come in handy.
If a positive quantity underflows to zero, it becomes +0. And if a negative quantity underflows to zero, it becomes −0. You could think of +0 (respectively, −0) as the bit pattern for a positive (negative) number too small to represent.
The IEEE floating point standard says 1/+0 should be +infinity and 1/−0 should be -infinity. This makes sense if you interpret ± 0 as the ghost of a number that underflowed leaving behind only its sign. The reciprocal of a positive (negative) number too small to represent is a positive (negative) number too large to represent.
To demonstrate this, run the following C code.
#include <stdio.h>

int main() {
    double x = 1e-200;
    double y = 1e-200 * x;   /* underflows to +0 */
    printf("Reciprocal of +0: %g\n", 1/y);
    y = -1e-200 * x;         /* underflows to -0 */
    printf("Reciprocal of -0: %g\n", 1/y);
    return 0;
}
On Linux with gcc, the output is

Reciprocal of +0: inf
Reciprocal of -0: -inf
Windows with Visual C++ returns the same output, except that Windows prints infinity as 1#INF rather than inf. (See these notes for more on how Windows and Linux handle floating point exceptions.)
There is something, however, about signed zeros and exceptions that doesn’t make sense to me. The aptly named report “What Every Computer Scientist Should Know About Floating Point Arithmetic” has the following to say about signed zeros.
In IEEE arithmetic, it is natural to define log 0 = −∞ and log x to be a NaN when x < 0. Suppose that x represents a small negative number that has underflowed to zero. Thanks to signed zero, x will be negative, so log can return a NaN. However, if there were no signed zero, the log function could not distinguish an underflowed negative number from 0, and would therefore have to return −∞.
This implies log(−0) should be NaN and log(+0) should be −∞. That makes sense, but that’s not what happens in practice. The log function returns −∞ for both +0 and −0.
I ran the following code on Linux and Windows.
#include <stdio.h>
#include <math.h>

int main() {
    double x = 1e-200;
    double y = 1e-200 * x;   /* +0 */
    printf("Log of +0: %g\n", log(y));
    y = -1e-200 * x;         /* -0 */
    printf("Log of -0: %g\n", log(y));
    return 0;
}
On Linux, the code prints

Log of +0: -inf
Log of -0: -inf
The results were the same on Windows, except for the way Windows displays infinities.
I dabbled with infinities in C# and Ruby a while back. Now it seems my unnecessarily bold statement that Ruby is IEEE 754-1985 compliant could be wrong, because in Ruby (1.8.7) both log(y) calls above sadly cause a “Numerical result out of range” error.
C# outputs the same as the C code:
double x = 1e-200;
double y = 1e-200 * x;
System.Console.WriteLine("Log of +0: {0}n", Math.Log(y));
y = -1e-200 * x;
System.Console.WriteLine("Log of -0: {0}n", Math.Log(y));
Log of +0: -INF
Log of -0: -INF
Ironically, both -0. and +0. are equal to 0. and equal to each other.
I.e., y == 0. is true in both cases!
In particular, -0. is not less than 0.:
y < 0. is false.
Regarding log, maybe it makes sense to define log(-0) as -inf from the point of view of the log algorithm, which probably ignores the sign bit after checking that the input is not less than 0.
Maybe interpreting the IEEE -0. as a limit is wrong; to me it is more like a partial NaN.
Thanks for your investigations, Captain. Is it possible that most languages are not IEEE compliant? Did you ever get to grips with William Kahan’s “Why JAVA’s Floating Point Hurts Everyone Everywhere?” enough to know whether the criticisms relate to other languages too?
Um, you should look up one’s complement and two’s complement
http://en.wikipedia.org/wiki/Ones_compliment
http://en.wikipedia.org/wiki/Twos_Compliment
Whether or not you have -0 and +0 is usually dependent on how your hardware represents numbers. I wouldn’t expect -0 or +0 to be valid numbers on the x86 architecture, since I believe it’s two’s complement.
Joe User, he’s talking about floating point representations of numbers, which is what he means by “IEEE floating point standard” and using “double” as the types of his variables. Modern x86 processors have floating-point units (aka “math coprocessors”) that operate on floating-point numbers — processors without floating-point units must implement all the wacky IEEE754 operations in software rather than in hardware.
http://en.wikipedia.org/wiki/IEEE_754-1985
http://en.wikipedia.org/wiki/IEEE_754-2008
You are correct about x86’s using two’s complement for integers, though.
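A small sketch of the distinction (assuming 64-bit IEEE doubles): two’s complement integers have a single zero, while doubles carry a separate sign bit, so -0.0 and +0.0 have different bit patterns yet still compare equal.

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main() {
    int ni = -0, pi = +0;              /* integer -0 is just 0: one bit pattern */
    printf("integer -0 == +0: %d\n", ni == pi);

    double nd = -0.0, pd = +0.0;       /* assumes double is 64-bit IEEE binary64 */
    uint64_t nbits, pbits;
    memcpy(&nbits, &nd, sizeof nbits);
    memcpy(&pbits, &pd, sizeof pbits);
    printf("double -0.0 == +0.0: %d\n", nd == pd);   /* 1: they compare equal */
    printf("bit patterns: %016llx vs %016llx\n",
           (unsigned long long)nbits, (unsigned long long)pbits);
    return 0;
}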
I tested this on my Nokia phone (arm processor). The results are Inf and -Inf
I think Goldberg’s essay goes a little astray on the logarithm thing. The same reasoning would lead to a conclusion that sqrt(-0) should be NaN, which I don’t think anyone would advocate. I’m no expert on IEEE arithmetic (which doesn’t specify log(0) in any case, unless that has changed since I last paid attention), but as I understand it, IEEE has a signed zero as a result of a lot of thought about branch cuts for elementary functions of a complex variable, so thinking about it in terms of functions of a real variable doesn’t work very well.
For nonzero z, write z = r*exp(i*t), where r>0 and i is the imaginary unit. The choice of t is indeterminate, but we will have log(z) = log(r) + i*t for one of the possible values of t. As z goes to zero, log(z) goes to minus infinity no matter what the direction of approach. This suggests that log(-0) and log(0) should both evaluate to minus infinity. They should also raise the divide by zero exception, for the reasons given by Kahan in his notes on IEEE 754 (cf. page 10).
The real story with signed zero and the logarithm has to do with the discontinuity across the negative real axis with the usual principal value, where you get an extra tiny slice of continuity by a definition like log(-1+0i) = iπ and log(-1-0i) = −iπ.
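For illustration, a minimal sketch of that branch-cut behavior (assuming a C11 compiler whose complex.h provides the CMPLX macro): the sign of the zero imaginary part selects which side of the cut clog evaluates on.

#include <stdio.h>
#include <complex.h>

int main() {
    double complex above = CMPLX(-1.0, +0.0);  /* just above the negative real axis */
    double complex below = CMPLX(-1.0, -0.0);  /* just below it */
    printf("clog(-1+0i) = %g + %gi\n", creal(clog(above)), cimag(clog(above)));
    printf("clog(-1-0i) = %g + %gi\n", creal(clog(below)), cimag(clog(below)));
    /* On a conforming implementation the imaginary parts are +pi and -pi. */
    return 0;
}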
Actually sqrt(-0) = -0.
IEEE 754: “Except that squareRoot(–0) shall be –0, every valid squareRoot shall have a positive sign.”
I remember when Fortran 95 added the distinction between +0 and -0 to the Fortran language as an option. I had to change a compiler and libraries. I added a command-line option which allowed +0 and -0 either to be distinguished or not. The only places in the whole compiler/library chain that were affected were the I/O routines, which needed to be able to write +/- 0 and read it back in unchanged, and the SIGN(A,B) intrinsic, which combines B’s sign with A’s mantissa/exponent.
Fortran 77 and 90 mandated that -0 == +0 in all behaviors, so if you wrote -0 out, it would be written the same as +0, and if you read it back in it would be read in as +0, and if you used SIGN(A,-0) it would return ABS(A), just like SIGN(A,+0) would.
Fortran 95 changed this, allowing signed 0 distinctions. However, -0 == +0 still (-0 and +0 compare equal).
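For comparison, C99’s copysign, roughly the analogue of Fortran’s SIGN(A,B), does honor the sign of a negative zero; a quick sketch:

#include <stdio.h>
#include <math.h>

int main() {
    /* copysign(a, b) returns the magnitude of a with the sign of b,
       and the sign of a zero b counts */
    printf("copysign(3.0, +0.0) = %g\n", copysign(3.0, +0.0));   /* prints  3 */
    printf("copysign(3.0, -0.0) = %g\n", copysign(3.0, -0.0));   /* prints -3 */
    return 0;
}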
If you want to know how special functions should work with non-normal IEEE values, see this draft of the C99 standard, Annex F. It was mostly authored by a former colleague of mine, Jim Thomas. It even covers complex numbers.
-0 and +0 exist in mathematics too.
Consider the Laplace Transform when applied to, for example, the Dirac delta function. In this case the lower bound is -0, and it’s well defined in terms of a limit.
Logarithms are (generally, as far as I know) a software operation rather than a hardware one done in the math coprocessor. The result returned by your language can differ from the expected one depending on the software (the math library of the language used).
I have always considered that those special numbers in the IEEE standard smell of sulfur. Wanting to continue computations when the floating-point range has been exceeded seems to me both suicidal and of very limited use.
I mean suicidal because I feel it to be a naive attempt to “emulate” the computation of limits. Limits are interesting corner cases where something happens (most of the time this is where your idealized model of nature is flawed), and they deserve careful scrutiny from a competent mathematician. Leaving this to the silicon neurons of a number-crunching processor is a bit nonsensical.
I mean of very limited use because, in the same vein, indeterminate forms quickly pop up and you just end up with no answer, which is not much of a difference from an earlier out-of-range exception.
In my opinion, it’s just a handy trick for lazy programmers, using the NaNs to detect uninitialized values.
I’d be pleased if someone explained to me what I misunderstood.
When there are so many ways to represent invalid or ambiguous floating point numbers, why don’t we also get the tools to test for them? I’ve tried to write a test for NaN or +/-inf after calls to mathematical library functions in an attempt to catch errors, but I wasn’t able to come up with working code.
This article gave me the idea to use sprintf and analyze the result strings. But it would be so much easier if there were simple test functions to test a floating point number for these special cases. It would also help to have constants with these values.
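For what it’s worth, C99’s math.h does provide classification macros and constants along these lines; a minimal sketch (the NAN macro is only defined when the implementation supports quiet NaNs):

#include <stdio.h>
#include <math.h>

int main() {
    double values[] = { 1.0, 0.0, -0.0, INFINITY, -INFINITY, NAN };
    for (int i = 0; i < 6; i++) {
        double v = values[i];
        printf("%6g: isnan=%d isinf=%d isfinite=%d signbit=%d\n",
               v, isnan(v) != 0, isinf(v) != 0, isfinite(v) != 0, signbit(v) != 0);
    }
    return 0;
}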
For those interested in learning more about the fascinating heritage of Zero, I recommend the book Zero: The Biography of a Dangerous Idea by Charles Seife
I’m having a hard time understanding the point of -0 and +0 – what are these two values used for?
Can you give us an example of when you should care about having a – or + in front of a zero?
One other reason for the distinction is for functions with branch lines. The best example is the arctangent. Languages usually have two functions for this: atan, which takes one argument, and atan2, which takes two. Java’s Math.atan2 has signature
public double atan2(double y, double x)
Now, given a point P in the complex plane, (x,y), atan2(y, x) gives the signed angle from the positive x-axis to the vector OP. The function has a “branch line” along the negative x-axis; as a point approaches the negative x-axis from above, atan2 tends to π. As it approaches it from below, atan2 tends to −π. Given a function that produces negative zeroes, probably because of branch lines too, one can get the correct arctangent.
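A minimal C sketch of that branch-line behavior: the sign of a zero y selects which side of the negative x-axis atan2 reports.

#include <stdio.h>
#include <math.h>

int main() {
    printf("atan2(+0.0, -1.0) = %g\n", atan2(+0.0, -1.0));   /* approximately +pi */
    printf("atan2(-0.0, -1.0) = %g\n", atan2(-0.0, -1.0));   /* approximately -pi */
    return 0;
}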
Same thing in Java; it’s actually documented to do that. https://docs.oracle.com/javase/7/docs/api/java/lang/Math.html#log(double)
It would make more sense to return NaN for negative zeroes, but since everyone returns −∞ for them we should see it as a special case.
I didn’t find the source for the standard Java log method, but in the FastMath library the if statement that returns NaN checks the leftmost (sign) bit AND tests for y != 0.0, so a negative zero y returns false there, of course.
I always assumed there were two zeros because the sign bit is there anyway. The actual uses or non-uses of it are just a convenience.
The definition of log in this case is actually standardized (in IEC 60559, I believe). Sometimes for extreme values the result is undefined and could be anything, but here: “If the argument (of log) is ±0, -∞ is returned and FE_DIVBYZERO is raised.” (Not a quote from the standard, but close enough.)
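A minimal sketch that checks the divideByZero flag which log(±0) is specified to raise (assuming C99 fenv.h support; whether the flag is actually set depends on the compiler and math library):

#include <stdio.h>
#include <math.h>
#include <fenv.h>
#pragma STDC FENV_ACCESS ON   /* some compilers ignore this pragma */

int main() {
    feclearexcept(FE_ALL_EXCEPT);
    double r = log(-0.0);
    printf("log(-0.0) = %g, FE_DIVBYZERO raised: %d\n",
           r, fetestexcept(FE_DIVBYZERO) != 0);
    return 0;
}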
Fun fact, this code doesn’t behave as expected on my Intel machine:
#include <fenv.h>
#include <math.h>
#include <assert.h>
#pragma STDC FENV_ACCESS ON
int main() {
fesetround(FE_DOWNWARD);
assert(lrint(-0) == -1);
/*NOTREACHED*/
}
It’s rounding -0 down to 0!
These two zeroes, as used in the example, appear to NOT indicate zero, but rather indicate “an infinitesimal just greater than zero” and “an infinitesimal just less than zero.”
Mathematically, a “true zero” has no sign – it’s neither positive nor negative. Perhaps an argument could be made for a third zero, a “Really And Truly Zero” zero. The other two appear to have signs (or use the sign bit used for other values) solely as a result of precision underflow, or perhaps even in spite of it. If a calculation causes a floating point number to lose all precision, wouldn’t that indicate it should lose its sign as well?
I must try the example and see what the sum of +0.0 and -0.0 is. No doubt with multiply and divide the signs carry over just as in non-zero numbers, as in the reciprocal example.
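A quick sketch of how signed zeros combine under the default round-to-nearest mode (a sum of opposite-signed zeros comes out as +0, while multiplication follows the usual sign rule):

#include <stdio.h>
#include <math.h>

int main() {
    double s1 = (+0.0) + (-0.0);   /* +0 under round-to-nearest */
    double s2 = (-0.0) + (-0.0);   /* -0 */
    double p  = (-0.0) * (+1.0);   /* -0: multiplication keeps the sign rule */
    printf("(+0)+(-0): signbit=%d\n", signbit(s1) != 0);   /* 0 */
    printf("(-0)+(-0): signbit=%d\n", signbit(s2) != 0);   /* 1 */
    printf("(-0)*(+1): signbit=%d\n", signbit(p)  != 0);   /* 1 */
    return 0;
}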
In practical terms, whenever something hits a limit like this (underflow), I wonder whether I’ve done something wrong, and I try to rearrange the code to avoid it happening. It’s tempting to add tests to prevent it, and give an appropriate message or do something else appropriate, rather than rely on the underlying system trying to do the right thing, as in log(+0.0) vs. log(-0.0).
Floating point is yet another leaky abstraction (YALA?).
To other commenters: It’s important not to confuse floating point with mathematics. In floating point, both zeroes are considered to be exactly equal to zero, to each other. The redundant sign bit is “piggybacked” information which retains a sign across a series of multiplies or divisions, and it only affects downstream results in a few exceptional cases (eg dividing non-zero by zero, yielding inf or -inf). The behaviour has been standardized based on various pragmatic considerations, and will not always be consistent with evaluating infinitesimals in mathematics.
I can see the reasoning that log(-0.0) should give the same exceptional result as log(-1). But bear in mind it can arise from something like log(-(a+b)) where a+b evaluates to 0, whereas ((-a)-b) would be +zero. I.e., the sign of the zero is generally not meaningful when the zero originally arises from a sum or difference. Indeed, in such cases, if you consider a and b to carry “rounding fuzz”, then the proper sign of a+b is indeterminate, and not correlated to the sign bit generated by the add. When zeroes are generated by underflow of mul or div, at least the sign is “correct”.
So I think the rationale is that a “nan vs -inf” outcome decision should not depend on this unreliable info, whereas the inf vs -inf from the division/0 is at least just following the sign behavior for division.
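A small sketch of that algebraic-rearrangement point (with default, strict floating-point settings; aggressive optimization flags like -ffast-math may change the result):

#include <stdio.h>
#include <math.h>

int main() {
    double a = 1.0, b = -1.0;    /* a + b is exactly zero */
    double z1 = -(a + b);        /* the +0 sum is negated, giving -0 */
    double z2 = (-a) - b;        /* algebraically the same, but gives +0 */
    printf("signbit(-(a+b)) = %d\n", signbit(z1) != 0);   /* 1 */
    printf("signbit((-a)-b) = %d\n", signbit(z2) != 0);   /* 0 */
    return 0;
}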
IEEE 754-2008 explicitly says in section 9.2.1 that for the functions log, log2, and log10, “f(±0) is −∞ and signals the divideByZero exception”.
There are other routines where signed zero matters. For example acotan(-0e0) is -1.5707963267948966 while acotan(0e0) is 1.5707963267948966
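acotan is not a standard C function, but a sketch defining it as atan(1/x), which reproduces the values quoted above, shows how the sign of zero flows through the reciprocal:

#include <stdio.h>
#include <math.h>

static double acotan(double x) { return atan(1.0 / x); }

int main() {
    printf("acotan(+0.0) = %.17g\n", acotan(+0.0));   /* 1/+0 = +inf, atan(+inf) =  pi/2 */
    printf("acotan(-0.0) = %.17g\n", acotan(-0.0));   /* 1/-0 = -inf, atan(-inf) = -pi/2 */
    return 0;
}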
Great article. But what happened to the backslashes before the n in the printf statements?
Thanks. Just fixed it.
Years ago there was an odd technical problem that resulted in all the backslashes on my site disappearing. Maybe it sounds like I’m making that up, but really that’s what happened. I keep finding instances like this and fixing them.
The title of the article is confusing. In all algebraic structures such as fields, rings, or groups, the zero is a unique element.
Simply put, in IEEE floating point, zero has two different representations.
I would not say that -0 and +0 are different representations of the same number. They behave differently and so it makes sense to consider them different numbers.