Solving for probability given entropy

If a coin comes up heads with probability p and tails with probability 1-p, the entropy in the coin flip is

S = –p log2 p – (1-p) log2 (1-p).

It’s common to start with p and compute entropy, but recently I had to go the other way around: given entropy, solve for p. It’s easy to come up with an approximate solution.

entropy and approximation

Entropy in this case is approximately quadratic

S ≈ 4p(1-p)

and so

p ≈ (1 ± √(1-S))/2.

This is a good approximation if S is near 0 or 1 but mediocre in the middle. You could use solve for p numerically, say with Newton’s method, to get more accuracy if needed.

Update

As Sjoerd Visscher pointed out in the comments, the quadratic approximation for entropy is much better if you raise it to the power 3/4. When I added this new approximation to the graph above, the new approximation agreed with the correct value to within the thickness of the plotting line.

To make the approximation error visible, here’s the log of the absolute value of the error of the two approximations, on a log scale.

approximation error on log scale

The error in the new approximation is about an order of magnitude smaller, sometimes more.

The improved approximation for entropy is

S ≈ (4p(1-p))3/4

and so the new approximation for probability is

p ≈ (1 ± √(1-S4/3))/2.

More information theory posts

4 thoughts on “Solving for probability given entropy

  1. If you want to minimize the maximum error in p, the exponent 1.3516 is slightly better than 4/3. The exponent that minimizes the maximum error in S is something like 1.32928, but the difference with 1.33333 is very small. The function 2^S is also close to quadratic and using that gives an approximation for p that works better near p=1/2, but is worse near p=0 or 1.

  2. If you want to minimize the error in p rather than S, then the exponent 1.3516 is slightly better than 4/3=1.3333. The value that minimizes the error in S is about 1.32928, but the difference with 4/3 is very small.
    The function 2^S is also close to quadratic and using this gives an approximation for p that is very good near p=1/2, but a bit worse near p=0 and 1.

Comments are closed.