Floating point round trip radix conversion

Suppose you store a floating point number in memory, print it out in human-readable base 10, and read it back in. When can the original number be recovered exactly?

D. W. Matula answered this question more generally in 1968 [1].

Suppose we start with base β with p places of precision and convert to base γ with q places of precision, rounding to nearest, then convert back to the original base β. Matula’s theorem says that if there are no positive integers i and j such that

β^i = γ^j

then a necessary and sufficient condition for the round-trip to be exact (assuming no overflow or underflow) is that

γ^(q−1) > β^p.

For double precision floating point numbers (type double in C) we have β = 2 and p = 53. (See Anatomy of a floating point number.) We’re printing to base γ = 10. No positive power of 10 is also a power of 2, so Matula’s condition on the two bases holds.

If we print out q = 17 significant decimal digits, then

10¹⁶ > 2⁵³

and so round-trip conversion will be exact if both conversions round to nearest. If q is any smaller, some round-trip conversions will not be exact.
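A quick way to see this in practice is to print doubles with 17 and with 16 significant digits and read them back. The sketch below assumes that Python's float formatting and parsing both round to nearest, which is the case for CPython on common platforms; the helper names `round_trips` and `random_double` are made up for this illustration.

```python
import math
import random
import struct

def round_trips(x: float, digits: int) -> bool:
    """Print x with `digits` significant decimal digits, parse it back, compare."""
    text = f"{x:.{digits}g}"
    return float(text) == x

def random_double(rng: random.Random) -> float:
    """Draw a finite double by reinterpreting 64 random bits."""
    while True:
        bits = rng.getrandbits(64)
        (x,) = struct.unpack("<d", struct.pack("<Q", bits))
        if math.isfinite(x):  # skip NaN and infinities
            return x

rng = random.Random(0)
samples = [random_double(rng) for _ in range(100_000)]
all_ok_17 = all(round_trips(x, 17) for x in samples)
print(all_ok_17)

# A familiar value that needs the 17th digit: 0.1 + 0.2 = 0.30000000000000004
x = 0.1 + 0.2
print(round_trips(x, 17), round_trips(x, 16))
```

A random sample is of course not a proof; the theorem is what guarantees the 17-digit case, and the single failing example is enough to show that 16 digits are not always sufficient.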

You can also verify that for a single precision floating point number (p = 24 bits precision) you need q = 9 decimal digits, and for a quad precision number (p = 113 bits precision) you need q = 36 decimal digits [2].
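The verification is a few lines of exact integer arithmetic. This is a minimal sketch; the function name `matula_digits` is an invention for this example, and it does not check the theorem's side condition that no positive power of one base equals a positive power of the other.

```python
def matula_digits(p: int, beta: int = 2, gamma: int = 10) -> int:
    """Smallest q with gamma**(q - 1) > beta**p, using exact integers."""
    q = 1
    while gamma ** (q - 1) <= beta ** p:
        q += 1
    return q

for name, p in [("single", 24), ("double", 53), ("quad", 113)]:
    print(name, p, matula_digits(p))
# single 24 9
# double 53 17
# quad 113 36
```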

Looking back at Matula’s theorem, clearly we need

γ^q ≥ β^p.

Why? Because the right side is the number of base β fractions and the left side is the number of base γ fractions. You can’t have a one-to-one map from a larger space into a smaller space. So the inequality above is necessary, but not sufficient. However, it’s almost sufficient. We just need one more base γ figure, i.e. Matula tells us that

γ^(q−1) > β^p

is sufficient. In terms of base 2 and base 10, we need at least 16 decimals to represent 53 bits. The surprising thing is that one more decimal is enough to guarantee that round-trip conversions are exact. It’s not obvious a priori that any finite number of extra decimals is always enough, but in fact just one more is enough; there’s no “table maker’s dilemma” here.

Here’s an example to show the extra decimal is necessary. Suppose p = 5. There are more 2-digit numbers than 5-bit numbers, but if we only use two digits then round-trip radix conversion will not always be exact. For example, the number 17/16 written in binary is 1.0001_two, and has five significant bits. The decimal equivalent is 1.0625_ten, which rounded to two significant digits is 1.1_ten. But the nearest binary number to 1.1_ten with 5 significant bits is 1.0010_two = 1.125_ten. In short, rounding to nearest gives

1.0001_two -> 1.1_ten -> 1.0010_two

and so we don’t end up back where we started.
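The same example can be replayed with exact rational arithmetic, so that no hardware rounding gets in the way. This is a sketch under simple assumptions: inputs are positive, ties round to even, and the helpers `round_sig` and `round_trip` are hypothetical names, not part of any library.

```python
from fractions import Fraction

def round_sig(x: Fraction, base: int, digits: int) -> Fraction:
    """Round a positive rational to `digits` significant digits in `base`."""
    # Find e with base**e <= x < base**(e + 1).
    e = 0
    while Fraction(base) ** (e + 1) <= x:
        e += 1
    while Fraction(base) ** e > x:
        e -= 1
    ulp = Fraction(base) ** (e - digits + 1)  # spacing at this magnitude
    return round(x / ulp) * ulp  # round() on a Fraction: nearest, ties to even

def round_trip(x: Fraction, p: int, q: int) -> Fraction:
    """Base 2 with p bits -> base 10 with q digits -> base 2 with p bits."""
    return round_sig(round_sig(x, 10, q), 2, p)

x = Fraction(17, 16)        # 1.0001_two
print(round_trip(x, 5, 2))  # 9/8, i.e. 1.0010_two: not where we started
print(round_trip(x, 5, 3))  # 17/16: one more decimal digit fixes it
```

With q = 3 we have 10² > 2⁵, so Matula's condition holds and every 5-bit number survives the trip.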

5 thoughts on “When is round-trip floating point radix conversion exact?”

Severin Pappadeux

16 March 2020 at 19:39

“You can also verify that for a single precision floating point number (p = 24 bits precision) you need q = 9 decimal digits” Really? 2^24 is about 16 million, so q = 8 decimal digits should be enough.

John

16 March 2020 at 20:21

@Severin: It is true that 8 decimal digits is enough to have more than 2^24 possible numbers. That is necessary but not sufficient. With 8 decimals it would be possible to create some one-to-one map back and forth, but that map would not correspond to radix conversion with round to nearest. For exact round-trip radix conversion you need one more decimal.

Eric Farmer

18 March 2020 at 06:50

The necessary condition for a one-to-one map (i.e., that distinct floating-point values are guaranteed to map to distinct decimal representations) can be refined to a stricter condition that characterizes it exactly (i.e., is equivalent to it): γ^(q−1) ≥ β^p − 1. This is in another paper by Matula published later that same year:

Matula, David W., The Base Conversion Theorem, Proceedings of the American Mathematical Society, 19(3) June 1968, p. 716-723

The good news is that in the particular common case where we are converting base 2 to base 10, this condition is equivalent to the inequality in your post, with the sole exception of single-bit conversions (p=q=1).

Severin Pappadeux

18 March 2020 at 11:37

@John, you’re right, I take it back. I tested all 32-bit floats via a round-trip float->char->float conversion. Code is here: https://github.com/Kri-Ol/from_chars-to_chars-conversion-test

Andre Adrian

4 September 2022 at 01:25

You wrote “the IEEE754 single (32bit) format has 24bits precision and therefore 9 decimal digits are needed”. I disagree. IEEE754 uses an implicit leading bit. There are only 2^23 different fraction values for a normalized IEEE754 single format number. 8 decimal digits fulfill the Matula requirement.
