Anatomy of a posit number

This post will introduce posit numbers, explain the interpretation of their bits, and discuss their dynamic range and precision.

Posit numbers are a new way to represent real numbers for computers, an alternative to the standard IEEE floating point formats. The primary advantage of posits is the ability to get more precision or dynamic range out of a given number of bits. If an application can switch from using 64-bit IEEE floats to using 32-bit posits, for example, it can fit twice as many numbers in memory at a time. That can make a big difference in the performance of applications that process large amounts of data.

Let’s back up and say what a posit number is.

Unums and posits

John Gustafson introduced unums (universal numbers) as a different way to represent real numbers using a finite number of bits, an alternative to IEEE floating point. See, for example, his 2015 book The End of Error. Posits are a hardware-friendly version of unums.

A conventional floating point number (IEEE 754) has a sign bit, a set of bits to represent the exponent, and a set of bits called the significand (formerly called the mantissa). For details, see Anatomy of a floating point number. For a given size number, the lengths of the various parts are fixed. A 64-bit floating point number, for example, has 1 sign bit, 11 exponent bits, and 52 bits for the significand.

A posit adds an additional category of bits, known as the regime. A posit has four parts

  1. sign bit
  2. regime
  3. exponent
  4. fraction

while an IEEE floating point number has a sign bit, exponent, and significand, the latter corresponding to the fraction part of a posit. Unlike IEEE numbers, the exponent and fraction parts of a posit do not have fixed length. The sign and regime bits have first priority. Next, the remaining bits, if any, go into the exponent. If there are still bits left after the exponent, the rest go into the fraction.

The main reference for this post is [1].

Bit pattern of a posit

To understand posits in more detail, and why they have certain advantages over conventional floating point numbers, we need to unpack their bit representation. A posit number type is specified by two numbers: the total number of bits n, and the maximum number of bits devoted to the exponent, es. (Yes, it’s a little odd to use a two-letter variable name, but that’s conventional in this context.) Together we say we have a posit<n,es> number.

Sign bit

As with an IEEE floating point number, the first bit of a posit is the sign bit. If the sign bit is 1, representing a negative number, take the two’s complement of the rest of the bits before unpacking the regime, exponent, and fraction bits.

Regime bits

After the sign bit come the regime bits. The number of regime bits is variable. There could be anywhere from 1 to n-1 regime bits. How do you know when the regime bits stop? When a run of identical bits ends, either because you run out of bits or because you run into an opposite bit.

If the first bit after the sign bit is a 0, then the regime bits continue until you run out of bits or encounter a 1. Similarly, if the first bit after the sign bit is a 1, the regime bits continue until you run out of bits or encounter a 0. The bit that indicates the end of a run is not included in the regime; the regime is a string of all 0’s or all 1’s.

Exponent bits

The sign bit and regime bits get first priority. If there are any bits left, the exponent bits are next in line. There may be no exponent bits. The maximum number of exponent bits is specified by the number es. If there are at least es bits after the sign bit, regime bits, and the regime terminating bit, the next es bits belong to the exponent. If there are fewer than es bits left, whatever bits remain belong to the exponent.

Fraction bits

If there are any bits left after the sign bit, regime bits, regime terminating bit, and the exponent bits, they all belong to the fraction.
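The splitting rules above can be sketched in a few lines of Python. This is only an illustration, not a reference implementation: the function name `split_posit` and the choice to pass the bit pattern as a string are assumptions made here, and the sketch expects a non-negative pattern (a negative one would be two’s-complemented first, as described under “Sign bit”).

```python
def split_posit(bits: str, es: int):
    """Split a posit bit string into (sign, regime, exponent, fraction).

    Hypothetical helper for illustration. Assumes the sign bit is 0 and
    that `bits` has at least two characters.
    """
    sign, rest = bits[0], bits[1:]
    # The regime is the run of bits equal to the first bit after the sign.
    first = rest[0]
    m = len(rest) - len(rest.lstrip(first))
    regime = rest[:m]
    # Skip the terminating bit (if any); what follows is exponent, then fraction.
    tail = rest[m + 1:]
    exponent, fraction = tail[:es], tail[es:]
    return sign, regime, exponent, fraction


# A 16-bit pattern with es = 3: all four fields are present.
print(split_posit("0000110111011101", 3))
# A 6-bit pattern with es = 3: only two exponent bits fit, and no fraction.
print(split_posit("011011", 3))
```

In the second call the regime and its terminating bit use up so much of the word that the exponent field is cut short and the fraction is empty, which is exactly the priority order described above.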

Interpreting the components of a posit

Next we look at how the components described above represent a real number.

Let b be the sign bit in a posit. The sign s of the number represented by the bit pattern is positive if this bit is 0 and negative otherwise.

s = (-1)^b

Let m be the number of bits in the regime, i.e. the length of the run of identical bits following the sign bit. Then let k = -m if the regime consists of all 0’s, and let k = m-1 otherwise.

k = \left\{ \begin{array}{ll} -m & \text{ if regime has } m \text{ 0's} \\ m-1 & \text{ if regime has } m \text{ 1's} \end{array} \right.

The useed u of the posit is determined by es, the maximum exponent size.

u = 2^{2^{\text{\small\emph{es}}} }

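The exponent e is the exponent bits interpreted as an unsigned integer. If the regime leaves room for fewer than es exponent bits, the missing low-order bits are treated as zeros, so the bits that are present keep their place value.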

The fraction f is 1 + the fraction bits interpreted as following a binary point. For example, if the fraction bits are 10011, then f = 1.10011 in binary.

Putting it all together, the value of the posit number is the product of the contributions from the sign bit, regime bits, exponent bits (if any), and fraction bits (if any).

x = s\, u^k\, 2^e f = (-1)^b \, f\, 2^{e + k2^{\text{\small\emph{es}}} }
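As a rough check on the formula, here is a minimal Python sketch that decodes a bit string into an exact rational value. The function name `decode_posit` is made up for this example. One assumption deserves emphasis: when fewer than es exponent bits fit, the sketch pads the exponent on the right with zeros, which is the reading supported in the comment thread below and by the worked figure in [1]. When all es exponent bits are present, this agrees with reading the bits as a plain unsigned integer.

```python
from fractions import Fraction


def decode_posit(bits: str, es: int) -> Fraction:
    """Decode a posit<n,es> bit string to an exact Fraction (illustrative sketch)."""
    n = len(bits)
    word = int(bits, 2)
    if word == 0:
        return Fraction(0)
    if word == 1 << (n - 1):
        raise ValueError("bit pattern represents ±∞, not a real number")

    b = word >> (n - 1)                      # sign bit
    if b:
        word = (1 << n) - word               # two's complement for negatives
    rest = format(word & ((1 << (n - 1)) - 1), f"0{n - 1}b")

    first = rest[0]
    m = len(rest) - len(rest.lstrip(first))  # regime run length
    k = -m if first == "0" else m - 1

    tail = rest[m + 1:]                      # bits after the regime terminator
    exp_bits, frac_bits = tail[:es], tail[es:]
    # Assumption: truncated exponent bits are implicitly zero (right-padded).
    e = int(exp_bits.ljust(es, "0"), 2) if es else 0
    f = 1 + Fraction(int(frac_bits, 2), 1 << len(frac_bits)) if frac_bits else Fraction(1)

    u = 2 ** (2 ** es)                       # useed
    return (-1) ** b * Fraction(u) ** k * 2 ** e * f


print(decode_posit("01000000", 0))           # 1
print(decode_posit("11000000", 0))           # -1
print(decode_posit("0000110111011101", 3))   # 477/134217728
```

Using `Fraction` rather than floats keeps the arithmetic exact, so the result can be compared directly against the formula by hand.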

Exceptional posits

There are two exceptional posits, both with all zeros after the sign bit. A string of n 0’s represents the number zero, and a 1 followed by n-1 0’s represents ±∞.

There’s only one zero for posit numbers, unlike IEEE floats that have two kinds of zero, one positive and one negative.

There’s also only one infinite posit number. For that reason you could say that posits represent projective real numbers rather than extended real numbers. IEEE floats have two kinds of infinities, positive and negative, as well as several kinds of non-numbers. Posits have only one entity that does not correspond to a real number, and that is ±∞.

Dynamic range and precision

The dynamic range and precision of a posit number depend on the value of es. The larger es is, the larger the contribution of the regime and exponent bits will be, and so the larger range of values one can represent. So increasing es increases dynamic range. Dynamic range, measured in decades, is the log base 10 of the ratio between the largest and smallest representable positive values.

However, increasing es means decreasing the number of bits available to the fraction, and so decreases precision. One of the benefits of posit numbers is this ability to pick es to adjust the trade-off between dynamic range and precision to meet your needs.

The largest representable finite posit is labeled maxpos. This value occurs when k is as large as possible, i.e. when all the bits after the sign bit are 1’s. In this case k = n-2. So maxpos equals

u^{n-2} = \left( 2^{2^{\text{\small\emph{es}}} } \right)^{n-2}

The smallest representable positive number, minpos, occurs when k is as negative as possible, i.e. when the largest possible number of bits after the sign bit are 0’s. They can’t all be zeros or else we have the representation for the number 0, so there must be a 1 on the end. In this case m = n-2 and k = 2-n.

\mbox{minpos} = u^{2-n} = \left( 2^{2^{\text{\small\emph{es}}} } \right)^{2-n} = 1/\mbox{maxpos}

The dynamic range is given by the log base 10 of the ratio between maxpos and minpos.

\log_{10}\left( 2^{2^{\text{\small\emph{es}}} } \right)^{2n-4} = (2n-4)2^{es}\log_{10}2

For example, a 16-bit posit with es = 1 has a dynamic range of 17 decades, whereas a 16-bit IEEE floating point number has a dynamic range of 12 decades. The former has a fraction of 12 bits for numbers near 1, while the latter has a significand of 10 bits. So a posit<16,1> number has both a greater dynamic range and greater precision (near 1) than its IEEE counterpart.
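The numbers in this example can be reproduced with a short Python sketch. The function names are invented for illustration. The IEEE half-precision figures used for comparison (largest finite value 65504, smallest positive subnormal 2^-24) are standard values that do not come from this post.

```python
import math
from fractions import Fraction


def maxpos(n: int, es: int) -> Fraction:
    """Largest finite posit<n,es>: useed raised to the power n-2."""
    return Fraction(2 ** (2 ** es)) ** (n - 2)


def minpos(n: int, es: int) -> Fraction:
    """Smallest positive posit<n,es>: the reciprocal of maxpos."""
    return 1 / maxpos(n, es)


def dynamic_range_decades(n: int, es: int) -> float:
    """Closed form (2n-4) * 2^es * log10(2) from the text."""
    return (2 * n - 4) * 2 ** es * math.log10(2)


# posit<16,1>: the closed form and the direct ratio agree.
print(dynamic_range_decades(16, 1))
print(math.log10(float(maxpos(16, 1) / minpos(16, 1))))

# IEEE half precision for comparison, counting subnormals.
print(math.log10(65504 / 2 ** -24))
```

The first two printed values come out just under 17 and the third just over 12, matching the rounded figures quoted above.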

[Update: See this post for more on the dynamic range and precision of IEEE floats of various sizes and how posits compare.]

Note that the precision of a posit number depends on its magnitude. This is the sense in which posits have tapered precision. Numbers near 1 have more precision, while extremely big numbers and extremely small numbers have less, because a longer regime leaves fewer bits for the fraction. This is often what you want. Typically the vast majority of numbers in a computation are roughly on the order of 1, while with the largest and smallest numbers, you mostly want them to not overflow or underflow.
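To make the tapering concrete, the following Python sketch counts how many fraction bits remain for each regime value k, using the bit-allocation rules from earlier in the post. The helper name `fraction_bits` is hypothetical.

```python
def fraction_bits(n: int, es: int, k: int) -> int:
    """Number of fraction bits in a posit<n,es> whose regime value is k.

    Illustrative sketch: k must lie between -(n-2) and n-2, the range
    that corresponds to minpos through maxpos.
    """
    if not -(n - 2) <= k <= n - 2:
        raise ValueError("k is outside the range of finite nonzero posits")
    m = -k if k < 0 else k + 1            # regime run length
    regime_field = min(m + 1, n - 1)      # run plus terminating bit, if it fits
    remaining = n - 1 - regime_field      # bits left after sign and regime
    return remaining - min(es, remaining)


# Fraction bits available in posit<16,1> as the regime grows.
for k in (0, 3, 6, 9, 12):
    print(k, fraction_bits(16, 1, k))
```

For posit<16,1> this gives 12 fraction bits at k = 0, matching the figure quoted above for numbers near 1, and the count drops by one for each additional regime bit until none are left.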

Related post: Anatomy of a floating point number

***

[1] John L. Gustafson and Isaac Yonemoto. Beating Floating Point at its Own Game: Posit Arithmetic. DOI: 10.14529/jsfi170206

 

14 thoughts on “Anatomy of a posit number”

  1. While I’m sure this is all technically correct, it doesn’t seem very instructive or expository. Maybe some examples and critiques are forthcoming?

  2. I’d like to write more about posits. My intention was to start with a post that lays out exactly what a posit is.

  3. “Similarly, if the first bit after the sign bit is a 1, the regime bits continue until you run out of bits or encounter a 1.”

    I assume that should be “encounter a 0?”

  4. A less significant — HA! — typo than the one “Egg Syntax” pointed out: in one place you have “sigficand”.

  5. Is there a k missing on the far right of your “Putting it all together” equation? It seems like the final exponent should end with k2^{es} instead of just 2^{es}.

  6. 30+ years ago I started my career replacing analog systems with microprocessor-based digital instruments. Back then, the workhorse embedded processor was a 5 MHz 8085. The resulting systems needed to not merely equal, but in most ways exceed the performance of the analog instruments they were intended to replace.

    Accurate and timely math was key, and given an 8-bit processor lacking an FPU I soon became expert in crafting and using hand-optimized fixed-point libraries. Algorithm optimization was another expertise I developed, the goal being to get as many accurate bits in the result as possible within the memory and timing constraints.

    This was very finicky work, which I gladly left behind as embedded processors got faster, wider, then gained FPUs. (Yes, I’m an ARM M4F fan-boy.)

    Those painful integer beginnings left me with an abhorrence of “wasted bits”, which I see happening so often when users are unaware of the limitations of their FPU. And, worse, when they move computations to GPUs without allowing for the domain differences.

    The ongoing need to keep leaping between numeric format domains is frustrating. I thought we’d be long past it by now.

    It is important to recall that IEEE-754 came about because vendors originally had proprietary floating point formats. Some, such as Cray’s, were very painful to use as a library programmer, but were blisteringly fast to compute. Friendly or not, each representation supported different notions of zero, infinity, NaN, and similar properties. Caveat Emptor!

    While IEEE-754 brought an end to the “Floating Point Format Wars”, it also brought an end to innovation. It was “good enough” for all but us pedants.

    I’ve casually followed the rare whispers of “what’s next” for FPUs, and I, too, was intrigued by Unums. And dismayed when they failed to get prompt traction.

    Posits appear to be truly great news, and I look forward to learning more of their advantages and disadvantages. I’ve found rumors of multiple commercial FPGA implementations, and I can’t wait for a fully-functional Open Source FPGA implementation I can try!

  7. Arseniy Alekseyev

    You write that exponent bits are simply interpreted as an unsigned integer. Don’t you want to pad it with zeroes to the right first (to es bits) so as to avoid a huge gap at the point when the regime changes?

  8. I don’t think I understand your point about a gap, but no, the exponent bits are not padded.

  9. Arseniy Alekseyev

    About the gap…
    Let’s consider 6-bit posits with es=3.
    Specifically the adjacent numbers:
    a = 0 110 11
    b = 0 1110 0
    c = 0 1110 1

    If we don’t pad the exponent with zeroes, those numbers become:

    a = u^1 * 2^3 = 2^11
    b = u^2 * 2^0 = 2^16
    c = u^2 * 2^1 = 2^17

    If we do pad, then

    a = u^1 * 2^6 = 2^14
    b = u^2 * 2^0 = 2^16
    c = u^2 * 2^4 = 2^20

    Note that in the latter case the gap between (logs of) adjacent numbers get larger as numbers get larger as you’d expect, but in the former example that’s not the case.

    Looking at the paper: http://www.johngustafson.net/pdfs/BeatingFloatingPoint.pdf I can’t find them talking about padding, but the example in Fig.4 implies it (notice the interpretation of 01101 is 64 instead of 32 as it would be if we didn’t pad).

  10. Good description, but there is a discrepancy with Gustafson’s paper: https://posithub.org/docs/Posits4.pdf. When extracting the regime, he says (p. 11) to take the 2’s complement of negative posits before counting the run of identical bits. I haven’t tried to work through why, or what (if any) difference it makes, but I assume he said it for a reason.
    Also, there appears to be a minor error on p. 12; where he says “For positive exponents”, I think he means “For positive regimes”.

  11. @Rob: I mention the 2’s complement in the section headed “Sign bit.” It might have been better if I’d mentioned it further down or included an example. I think the reason he takes the 2’s complement is to avoid having a +0 and -0.

  12. @John @Arseniy The posit exponent bits that couldn’t get into the budget of nbits are implicitly zero. Hence, the 2nd set of calculations with padded zeros are correct.

    nbits=6, es=3
    a = 0 110 11
    b = 0 1110 0
    c = 0 1110 1
    a = u^1 * 2^6 = 2^14
    b = u^2 * 2^0 = 2^16
    c = u^2 * 2^4 = 2^20
