Hyperinflation changes everything

Zimbabwe one hundred trillion dollar note

My two latest blog posts have been about compound interest. I gave examples of interest rates of 6% up to 18% per year.

Hyperinflation, with rates of inflation in excess of 50% per month, changes everything. Although many economists accept 50% per month as the threshold of hyperinflation, the world has seen worse. Zimbabwe, for example, faced 98% inflation per day. With hyperinflation lots of implicit assumptions and approximations break down.

One take away from the previous posts is that at moderate interest rates, the frequency of compounding doesn’t make that much difference. A nominal interest rate of 12%, compounded continuously, is effectively a 12.75% interest rate. Compound less than continuously, say monthly or daily, and the effective rate will be somewhere between 12% and 12.75%.
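Here’s a quick check of those figures (a small sketch; rates are written as decimals and the helper name effective is mine):

    from math import exp

    # effective annual rate for nominal rate r compounded n times per year
    effective = lambda r, n: (1 + r/n)**n - 1

    print(effective(0.12, 12))    # monthly:    ~0.12683
    print(effective(0.12, 365))   # daily:      ~0.12747
    print(exp(0.12) - 1)          # continuous: ~0.12750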

But now say interest is 50% per month. Simple interest would be 600% a year, but nobody would accept simple interest. Compounded every month, the effective interest rate would be 12975%. Compounded daily it would be 38433%. And compounded continuously it would be 40343%.

In the numerical post, I said that the difference between continuous interest and interest compounded n times per year was approximately r² / 2n. That works fine when r is, say, 0.12. When r = 10 the approximation is off by orders of magnitude, because the higher-order terms it drops are no longer small. The implicit assumption that r² < r breaks down when r > 1.
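To see the breakdown numerically, here’s a sketch comparing the approximation with a direct calculation. It uses daily compounding (n = 365) so the direct formula is still numerically safe; the floating point trouble discussed in the numerical post only shows up for much larger n.

    from math import exp

    diff   = lambda r, n: exp(r) - (1 + r/n)**n   # difference per unit of principal
    approx = lambda r, n: r**2 / (2*n)

    n = 365
    print(diff(0.12, n), approx(0.12, n))   # ~2.2e-05 vs ~2.0e-05: close
    print(diff(10, n),   approx(10, n))     # ~2.8e+03 vs ~0.14:    wildly off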

Hyperinflation causes unusual behavior, such as paying for a meal before you eat it rather than afterward, because by the end of the meal the price will be appreciably higher.

What I find hardest to understand about hyperinflation is that people continue to use hyperinflated currency far longer than I would imagine. Once you start using a clock rather than a calendar when doing interest calculations, I would think that people would abandon the inflated currency in favor of something harder, like gold or silver, or even cigarettes. And eventually people do, but “eventually” is further out than I would imagine. It’s absurd to haul paper money in a wheelbarrow, and yet people do it.

Numerical problem with an interest calculation

The previous post looked at the difference between continuously compounded interest and interest compounded a large discrete number of times. This difference was calculated using the following Python function.

    from math import exp
    def f(P, n, r): return P*(exp(r) - (1 + r/n)**n)

where the function arguments are principal, number of compoundings, and interest rate.

When I was playing around with this I noticed

    f(1000000, 365*24*60*60, 0.12)

returned a negative value, −0.00066. If this were correct, it would mean that compounding a 12% loan once a second results in more interest than continuous compounding. But this cannot happen. Compounding more often can only increase the amount of interest.

The problem with the Python function is that when n is very large, exp(r) and (1 + r/n)^n are so nearly equal that their subtraction results in a complete loss of precision, a phenomenon known as catastrophic cancellation. The two terms are indistinguishable within the limits of floating point calculation because the former is the limit of the latter as n goes to infinity.
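Incidentally, the cancellation can also be sidestepped by rearranging the calculation rather than approximating it (a sketch, not the approach taken below; the name f_stable is mine). Since (1 + r/n)^n = exp(n log(1 + r/n)), the difference equals exp(r)·(1 − exp(n log(1 + r/n) − r)), and the standard library’s log1p and expm1 handle the delicate part:

    from math import exp, expm1, log1p

    def f_stable(P, n, r):
        # exp(r) - (1 + r/n)**n == exp(r) * (1 - exp(n*log1p(r/n) - r)),
        # and the inner exponent is small, so expm1 avoids the cancellation
        return -P * exp(r) * expm1(n * log1p(r/n) - r)

    print(f_stable(1000000, 365*24*60*60, 0.12))   # ~0.00026, small but positive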

One way to calculate

\exp(r) - \left(1 + \frac{r}{n}\right)^n

for moderately small r (such as typical interest rates) and very large n (such as the number of seconds in a year) would be to write out the power series for exp(r), expand (1 + r/n)^n using the binomial theorem, and subtract.

Then we find that

\exp(r) - \left(1 + \frac{r}{n}\right)^n \approx \frac{r^2}{2n}

is a good approximation.

When n = 365 × 24 × 60 × 60 and r = 0.12, the approximation gives 2.28 × 10^−10 and the correct result is 2.57 × 10^−10. The approximation is only good to one significant figure, but it gets the sign and the order of magnitude correct. You could retain more series terms for more accuracy.
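For instance, keeping the next term of the series multiplies r²/2n by (1 + r), and keeping all of them multiplies it by exp(r). A quick check:

    from math import exp

    n, r = 365*24*60*60, 0.12
    print(r**2 / (2*n))            # 2.28e-10, the one-term approximation
    print((1 + r) * r**2 / (2*n))  # 2.56e-10, one more term
    print(exp(r) * r**2 / (2*n))   # 2.57e-10, matching the correct result above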

Interest compounding with every heartbeat

When I was a child, I heard an advertisement for a bank that compounded the interest on your savings account with every heartbeat. I thought that was an odd thing to say and wondered what it meant. If you have a rapid heart rate, does your money compound more frequently?

I figured there was probably some fine print, such as saying interest was compounded once a second or something like that. Beyond some frequency it doesn’t matter that much how often interest is compounded, and that’s essentially what continuously compounded interest is: interest compounded so often that it doesn’t matter how often it is compounded [1].

So how often do you need to compound interest before the difference between discretely compounded interest and continuously compounded interest doesn’t matter? Well, that depends on what you think matters. The more demanding you are about what matters, the finer the discrete compounding needs to be. It also matters what the interest rate is. The following Python function [2] gives the difference between continuous compounding and compounding n times per year, at a percentage rate r and with principal P.

    from math import exp
    def f(P, n, r): return P*(exp(r) - (1 + r/n)**n)

Let’s first say that the frequency of compounding matters if it makes a difference of more than $1 on a loan of $1,000,000 over a year. The difference between continuous interest and compounding daily at 6% is $5.24. If we increase the frequency of compounding to hourly, the difference is $0.22, which we are saying does not matter.

When the interest rate goes up, the difference between continuous and discrete compounding also goes up. If we triple the interest rate to 18%, now the difference is $2.21, but if we go to compounding every minute, the difference is $0.04.

Now if we’re more demanding, and we want the difference in interest to be less than a cent on a principal of one million dollars, we need to compound even more often. In that case compounding once a second is enough, given an interest rate of 18%, which means that’s frequent enough for any lower interest rate.
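For reference, the dollar figures above come from evaluating the function f defined earlier at the corresponding settings:

    print(f(10**6, 365, 0.06))         # daily at 6%:         ~5.24
    print(f(10**6, 365*24, 0.06))      # hourly at 6%:        ~0.22
    print(f(10**6, 365*24, 0.18))      # hourly at 18%:       ~2.21
    print(f(10**6, 365*24*60, 0.18))   # every minute at 18%: ~0.04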

Related posts

[1] You could make this statement rigorous by saying for every definition of what matters, i.e. for every tolerance ε, there exists an N such that for all n > N the difference between continuous compounding and compounding with n periods is less than ε.

[2] The Python function is correct in theory, and also in practice as long as n isn’t too big. Very large n could lead to a numerical problem, addressed in the next post.

Powers of 3 + √2

Yesterday’s post looked at the distribution of powers of x mod 1. For almost all x > 1 the distribution is uniform in the limit. But there are exceptions, and the post raised the question of whether 3 + √2 is an exception.

A plot made it look like 3 + √2 is an exception, but that turned out to be a numerical problem.

A higher precision calculation showed that the zeros on the right end of the plot were erroneous.

So this raises the question of how to calculate (3 + √2)^n accurately for large n. The way I created the second plot was to use bc to numerically calculate the powers of 3 + √2. In this post, I’ll look at using Mathematica to calculate the powers symbolically.

For all positive integers n,

(3 + \sqrt{2})^n = a_n + b_n\sqrt{2}

where a_n and b_n are positive integers. We want to compute the a and b values.

If you ask Mathematica to compute (3 + √2)^n it will simply echo the expression. But if you use the Expand function it will give you what you want. For example

    Expand[(3 + Sqrt[2])^10]

returns

    1404491 + 993054 √2

We can use the Coefficient function to split a + b √2 into a and b.

    parts[n_] := 
        Module[{x = (3 + Sqrt[2])^n}, 
            {Coefficient[x, Sqrt[2], 0], Coefficient[x, Sqrt[2], 1]}]

Now parts[10] returns the pair {1404491, 993054}.
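As a cross-check (a sketch in Python rather than Mathematica; the name parts_py is mine), the same pair can be computed with exact integer arithmetic, using (a + b√2)(3 + √2) = (3a + 2b) + (a + 3b)√2:

    def parts_py(n):
        # (3 + sqrt(2))**n = a + b*sqrt(2) with integer a and b
        a, b = 1, 0
        for _ in range(n):
            a, b = 3*a + 2*b, a + 3*b
        return a, b

    print(parts_py(10))   # (1404491, 993054), matching the Mathematica result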

Here’s something interesting. If we set

(3 + \sqrt{2})^n = a_n + b_n\sqrt{2}

as above, then the two halves of the expression on the right are asymptotically equal. That is, as n goes to infinity, the ratio

\frac{a_n}{b_n\sqrt{2}}

converges to 1.

We can see this by defining

    ratio[n_] := 
        Module[ {a = Part[ parts[n], 1], b = Part[parts[n], 2]}, 
        N[a / (b Sqrt[2])]]

and evaluating ratio at increasing values of n. ratio[12] returns 1.00001 and ratio[13] returns 1. That doesn’t mean the ratio is exactly 1, only that it is 1 to the precision displayed.

This seems to be true more generally, as we can investigate with the following function.

    ratio2[p_, q_, r_, n_] := 
        Module[{x = (p + q Sqrt[r])^n}, 
            N[Coefficient[x, Sqrt[r], 0]/(Coefficient[x, Sqrt[r], 1] Sqrt[r])]]

When r is a prime and

(p + q\sqrt{r})^n = a_n + b_n\sqrt{r}

then it seems that the ratio a_n / (b_n √r) converges to 1 as n goes to infinity. For example, ratio2[3, 5, 11, 40] returns 1, meaning that the two halves of the expression for (3 + 5√11)^n are asymptotically equal.

I don’t know whether the suggested result is true, or how to prove it if it is true. Feels like a result from algebraic number theory, which is not something I know much about.

Update: An anonymous person on X suggested a clever and simple proof. Observe that

\begin{align*} a_n &= \frac{(3+\sqrt{2})^n + (3-\sqrt{2})^n}{2} \\ b_n\sqrt{2} &= \frac{(3 + \sqrt{2})^n - (3-\sqrt{2})^n}{2} \end{align*}

In this form it’s clear that the ratio a_n / (b_n √2) converges to 1, and the proof can be generalized to cover more cases.

Multiples and powers mod 1

For a positive real number x, the expression x mod 1 means the fractional part of x:

x mod 1 = x − ⌊ x ⌋.

This post will look at the distributions of nx mod 1 and x^n mod 1 for n = 1, 2, 3, ….

The distribution of nx mod 1 is easy to classify. If x is rational then nx mod 1 will cycle through a finite number of values in the interval [0, 1). If x is irrational then nx mod 1 will be uniformly distributed in the interval.
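Here’s a quick empirical illustration of the irrational case (a sketch; the choice x = √2 and the test interval [0, 0.25) are arbitrary):

    from math import sqrt

    x, N = sqrt(2), 100_000
    hits = sum((n*x) % 1 < 0.25 for n in range(1, N+1))
    print(hits / N)   # ~0.25, as expected for a uniformly distributed sequence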

The distribution of x^n mod 1 is more subtle. We have three basic facts.

  1. If 0 ≤ x < 1, then x^n → 0 as n → ∞.
  2. If x = 1 then x^n = 1 for all n.
  3. For almost all x > 1 the sequence x^n mod 1 is uniformly distributed. But for some values of x it is not, and there’s no known classification for these exceptional values of x.

The rest of the post will focus on the third fact.

Suppose x = √2. Then for all even n, x^n mod 1 = 0. The sequence isn’t uniformly distributed in [0, 1] because half the sequence is piled up at 0.

For another example, let’s look at powers of the golden ratio φ = (1 + √5)/2. For even values of n the sequence φ^n mod 1 converges to 1, and for odd values of n it converges to 0. You can find a proof here.

At one time it was not known whether x^n mod 1 is uniformly distributed for x = 3 + √2. (See HAKMEM, item 32.) I don’t know whether this is still unknown. Let’s see what we can tell from a plot.

Apparently sometime around n = 25 the sequence suddenly converges to 0. That looks awfully suspicious. Let’s calculate x^25 using bc.

    $ echo '(3 + sqrt(2))^25' | bc -l
    13223107213151867.43024035874391918714

This says x^25 mod 1 should be around 0.43, not near 0. The reason we’re seeing all zeros is that all the bits in the significand are being used to store the integer part and none are left over for the fractional part. A standard floating point number has 53 bits of significand and

(3 + \sqrt{2})^{25} > 2^{53}.

For more details, see Anatomy of a floating point number.
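Incidentally, bc isn’t the only way to get enough precision; Python’s decimal module works too. A sketch:

    from decimal import Decimal, getcontext

    getcontext().prec = 50
    x = 3 + Decimal(2).sqrt()
    print(x**25 % 1)   # 0.43024035874..., agreeing with the bc calculation above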

Now let’s see what happens when we calculate x^n mod 1 correctly using bc. Here’s the revised plot.

Is this uniformly distributed? Well, it’s not obviously not uniformly distributed. But we can’t tell by looking at a plot because uniform distribution is an asymptotic property.

We didn’t give the definition of a uniformly distributed sequence until now because an intuitive understanding of the term was sufficient. Now we should be more precise.

Given a set E, a sequence ω, and an integer N, define A(E, ω, N) to be the number of terms among the first N terms of the sequence ω that fall inside E. We say ω is uniformly distributed mod 1 if for every 0 ≤ a ≤ b ≤ 1,

\lim_{N \to \infty} \frac{A([a, b), \omega, N)}{N} = b - a.

That is, the relative proportion of points that fall in an interval is equal to the length of the interval.

Now let’s go back to the example x = √2. We know that the powers of this value of x are not uniformly distributed mod 1. But we also said that for almost all x > 1 the powers of x are uniformly distributed. That means every tiny little interval around √2 contains a number y such that the powers of y are uniformly distributed mod 1. Now if y is very close to √2 and we plot the first few values of y^n mod 1 then half of the values will be near 0. The sequence will not look uniformly distributed, though it is uniformly distributed in the limit.

A Continental Divide for Newton’s Method

Newton’s method is a simple and efficient method for finding the roots of equations, provided you start close enough to the root. But determining the set of starting points that converge to a given root, or converge at all, can be very complicated.

In one case it is easy to completely classify where points converge. Suppose you have a quadratic polynomial p(x) with distinct roots a and b in the complex plane. The perpendicular bisector of the segment joining a and b is a sort of continental divide, splitting the plane into two watersheds. If you start Newton’s method at any point on a root’s side of the line, it will converge to that root.

To illustrate this, we will set a = 7 + 14i and b = 20 + 25i, and use Newton’s method to find the roots of

p(z) = (z − a)(z − b).

You can start far from the root and still converge, which isn’t typical but is the case for quadratic polynomials.

And the point you’ll converge to is determined by which side of the continental divide you start on. For example, the following code shows that if you start at 1000 + 2000i you will converge to 20 + 25i.

    a = 7 + 14j
    b = 20 + 25j

    p      = lambda z: (z - a)*(z - b)      # quadratic with roots a and b
    pprime = lambda z: (z - a) + (z - b)    # derivative of p
    newton = lambda z: z - p(z)/pprime(z)   # one Newton step

    z = 1000 + 2000j
    for _ in range(12):
        z = newton(z)
        print(z)                            # iterates approach b = 20 + 25j

If you’re not familiar with Python, note that it uses j for the imaginary unit. Here’s why.

Things are more complicated if you start Newton’s method exactly on the line dividing the two watersheds. The method will not converge. There is a dense set of starting points on the line that will cycle through a finite number of points. And there is a dense set of points that lead to division by zero. (See HAKMEM, item 3.)
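For example, the midpoint of the two roots lies exactly on the dividing line, and Newton’s method fails there on the very first step because p′ vanishes. Continuing the snippet above:

    z = (a + b)/2   # midpoint of the roots, on the dividing line
    newton(z)       # raises ZeroDivisionError, since pprime(z) is exactly zero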

Related posts

Legendre polynomials

The previous post mentioned Legendre polynomials. This post will give a brief introduction to these polynomials and a couple hints of how they are used in applications.

One way to define the Legendre polynomials is as follows.

  • P_0(x) = 1
  • P_k are orthogonal on [−1, 1].
  • P_k(1) = 1 for all k ≥ 0.

The middle bullet point means

\int_{-1}^1 P_m(x) P_n(x) \, dx = 0

if m ≠ n. The requirement that each P_k is orthogonal to each of its predecessors determines P_k up to a constant, and the condition P_k(1) = 1 determines this constant.
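Here’s a quick numerical check of those defining properties (a sketch; SciPy’s eval_legendre(k, x) evaluates P_k at x):

    from scipy.integrate import quad
    from scipy.special import eval_legendre

    # orthogonality of P_2 and P_3 on [-1, 1], and the normalization P_4(1) = 1
    integral, _ = quad(lambda x: eval_legendre(2, x) * eval_legendre(3, x), -1, 1)
    print(integral)               # ~0, up to quadrature error
    print(eval_legendre(4, 1.0))  # 1.0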

Here’s a plot of the first few Legendre polynomials.

There’s an interesting pattern that appears in the white space of a graph like the one above when you plot a large number of Legendre polynomials. See this post.

The Legendre polynomial P_k(x) satisfies Legendre’s differential equation; that’s what motivated them.

(1 - x^2) y^{\prime\prime} -2x y^\prime + k(k+1)y = 0

This differential equation comes up in the context of spherical harmonics.

Next I’ll describe a geometric motivation for the Legendre polynomials. Suppose you have a triangle with one side of unit length and two longer sides of length r and y.

You can find y in terms of r by using the law of cosines, where θ is the angle between the sides of length 1 and r:

y = \sqrt{1 - 2r \cos \theta + r^2}

But suppose you want to find 1/y in terms of a series in 1/r. (This may seem like an arbitrary problem, but it comes up in applications.) Then the Legendre polynomials give you the coefficients of the series.

\frac{1}{y} = \frac{P_0(\cos\theta)}{r} + \frac{P_1(\cos\theta)}{r^2} + \frac{P_2(\cos\theta)}{r^3} + \cdots

Source: Keith Oldham et al. An Atlas of Functions. 2nd edition.
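Here’s a numerical check of the expansion (a sketch; the values r = 3 and θ = π/3 are arbitrary choices for the test):

    from math import sqrt, cos, pi
    from scipy.special import eval_legendre

    r, theta = 3.0, pi/3
    y = sqrt(1 - 2*r*cos(theta) + r**2)
    series = sum(eval_legendre(k, cos(theta)) / r**(k+1) for k in range(30))
    print(1/y, series)   # both ~0.37796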

Related posts

The biggest perturbation of satellite orbits

To first approximation, a satellite orbiting the earth moves in an elliptical orbit. That’s what you would get from solving the two-body problem: two point masses orbiting their common center of mass, subject to no forces other than their gravitational attraction to each other.

But the earth is not a point mass. Neither is a satellite, though that’s much less important. The fact that the earth is not exactly a sphere but rather an oblate spheroid is the root cause of the J2 effect.

The J2 effect is the largest perturbation of a satellite orbit from a simple elliptical orbit, at least for satellites in low earth orbit (LEO) and medium earth orbit (MEO). The J2 effect is significant for satellites in higher orbits, though third body effects are larger.

Legendre showed that the gravitational potential of an axially symmetric planet is given by

V(r, \phi) = \frac{Gm}{r} \left( 1 - \sum_{k=2}^\infty J_k  \left( \frac{r_{eq}}{r}\right)^k P_k(\cos \phi) \right)

Here (r, φ) are spherical coordinates. There’s no θ term because we assume the planet, and hence its gravitational potential, is axially symmetric, i.e. independent of θ. The term r_eq is the equatorial radius of the planet. The P_k are Legendre polynomials.

For a moderately oblate planet, like the one we live on, the J2 coefficient is much larger than the others, and neglecting the rest of the coefficients gives a good approximation [1].

Here are the first few coefficients for Earth [2].

\begin{align*} J_2 &= \phantom{-}0.00108263 \\ J_3 &= -0.00000254 \\ J_4 &= -0.00000161 \end{align*}

Note that J2 is three orders of magnitude smaller than 1, and so the J2 effect is small. And yet it matters a great deal. The longitude of the point at which a satellite crosses the equatorial plane may vary a few degrees per day. The rate of precession is approximately proportional to J2.

The value of J2 for Mars is about twice that of Earth (0.001960454), and Neptune’s is about three times that of Earth (0.003411).

There are many factors left out of the assumptions of the two body problem. The J2 effect doesn’t account for everything that has been left out, but it’s the first refinement.
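As a rough check of how strongly the J2 term dominates, here’s a sketch that evaluates the first few terms of the series above for a satellite at an assumed radius of 7000 km, taking r_eq ≈ 6378 km and φ = 45° (all sample values):

    from math import cos, radians
    from scipy.special import eval_legendre

    J = {2: 0.00108263, 3: -0.00000254, 4: -0.00000161}
    r_eq, r, phi = 6378.0, 7000.0, radians(45)

    for k, Jk in J.items():
        term = Jk * (r_eq / r)**k * eval_legendre(k, cos(phi))
        print(k, term)   # the k = 2 term is several hundred times the others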

More orbital mechanics posts

More Legendre posts

[1] Legendre discovered/invented what we now call the Legendre polynomials in the course of calculating the gravitational potential above. I assume the convention of using J for the coefficients goes back to Legendre.

[2] Richard H. Battin. An Introduction to the Mathematics and Methods of Astrodynamics, Revised Edition, 1999.

Transmission Obstacles and Ellipsoids

Suppose you have a radio transmitter T and a receiver R with a clear line of sight between them. Some portion of the signal received at R will come straight from T. But some portion will have bounced off some obstacle, such as the ground.

The reflected radio waves will take a longer path than the waves that traveled straight from T to R. The worst case for reception is when the waves traveling a longer path arrive half a period later, i.e. 180° out of phase, canceling out part of the signal that was received directly.

We’d like to describe the region of space that needs to be empty in order to eliminate destructive interference, i.e. signals 180° out of phase. Suppose T and R are a distance d apart and the wavelength of your signal is λ. An obstacle at a location P can cause signals to arrive exactly out of phase if the distance from T to P plus the distance from P to R is d + λ/2.

So we’re looking for the set of all points such that the sum of their distances to two fixed points is a constant. This is the nails-and-string description of an ellipse, where the nails are a distance d apart and the string has length d + λ/2.

That would be a description of the region if we were limited to a plane, such as a plane perpendicular to the ground and containing the transmitter and receiver. But signals could reflect off an obstacle that’s outside this plane. So now we need to imagine being able to move the string in three dimensions. We still get all the points we’d get if we were restricted to a plane, but we also get their rotation about the axis running between T and R.

The region we’re describing is an ellipsoid, known as a Fresnel ellipsoid or Fresnel zone.

Suppose we choose our coordinates so that our transmitter T is located at (0, 0, h) and our receiver R is located at (d, 0, h). We imagine a string of length d + λ/2 with endpoints attached to T and R. We stretch the string so it consists of two straight segments. The set of all possible corners in the string traces out the Fresnel ellipsoid.
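The semi-axes of this first ellipsoid follow directly from the nails-and-string description: the string length d + λ/2 is twice the semi-major axis and the foci are d apart. A sketch with assumed sample values (a 10 km path at 2.4 GHz):

    from math import sqrt

    d, lam = 10_000.0, 0.125     # path length in meters, wavelength in meters
    semi_major = (d + lam/2) / 2
    semi_minor = sqrt(semi_major**2 - (d/2)**2)   # maximum radius, at the midpoint
    print(semi_minor)            # ~17.7 m
    print(sqrt(d*lam) / 2)       # nearly the same, since lam is tiny compared to d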

Greater delays

If reflected waves are delayed by exactly one period, they reinforce the portion of the signal that arrived directly. Signals delayed by an even multiple of a half-period cause constructive interference, but signals delayed by odd multiples of a half-period cause destructive interference. The odd multiples matter most because we’re more often looking to avoid destructive interference rather than seeking out opportunities for constructive interference.

If you repeat the exercise above with a string of length d + λ you have another Fresnel ellipsoid. The foci remain the same, i.e. T and R, but this new ellipsoid is bigger since the string is longer. This ellipsoid represents locations where a signal reflected at that point will arrive one period later than a signal traveling straight. Obstacles on the surface of this ellipsoid cause constructive interference.

We can repeat this exercise for a string of length d + nλ/2, where odd values of n correspond to regions of destructive interference. This gives us a set of confocal ellipsoids known as the Fresnel ellipsoids.

Related posts

Most ints are not floats

All integers are real numbers, but most computer representations of integers do not equal computer representations of real numbers.

To make the statement above precise, we have to be more specific about what we mean by computer integers and floating point numbers. I’ll use int32 and int64 to refer to 32-bit and 64-bit signed integers. I’ll use float32 and float64 to refer to IEEE 754 single precision and double precision numbers, what C calls float and double.

Most int32 numbers cannot be represented exactly by a float32. All int32 numbers can be represented approximately as float32 numbers, but usually not exactly. The previous statements remain true if you replace “32” everywhere by “64.”

32 bit details

The int32 data type represents integers −2^31 through 2^31 − 1. The float32 data type represents numbers of the form

\pm\, 1.f \times 2^e

where 1 bit represents the sign, 23 bits represent f, and 8 bits represent e.

The number n = 2^24 can be represented by setting the fractional part f to 0 and setting the exponent e to 24. But the number n + 1 cannot be represented as a float32 because its last bit would fall off the end of f:

2^{24} + 1 = (1 + 2^{-24})\, 2^{24} = 1.000000000000000000000001_{\text{two}} \times 2^{24}

The bits in f fill up representing the 23 zeros after the binary point. There’s no 24th bit to store the final 1.
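A quick check in Python, using NumPy’s float32 since Python’s own float is 64-bit:

    import numpy as np

    print(np.float32(2**24) == 2**24)          # True
    print(np.float32(2**24 + 1) == 2**24 + 1)  # False: rounds to 2**24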

For each value of e, there are 2^23 possible values of f. So for each of e = 23, 24, …, 30 there are 2^23 representable integers in the int32 range, for a total of 8 × 2^23. Every non-negative integer below 2^23 is representable as well, which adds 2^23 more.

So of the 2^31 non-negative integers that can be represented by an int32 data type, only 9 × 2^23 can also be represented exactly as a float32 data type. This means about 3.5% of 32-bit integers can be represented exactly by a 32-bit float.
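An empirical sanity check of that percentage (a sketch, again using NumPy’s float32):

    import random
    import numpy as np

    # fraction of random non-negative int32 values exactly representable as float32
    N = 100_000
    exact = sum(bool(np.float32(k) == k) for k in (random.randrange(2**31) for _ in range(N)))
    print(exact / N)   # ~0.035, i.e. about 3.5%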

64 bit details

The calculations for 64-bit integers and 64-bit floating point numbers are analogous. Numbers represented by float64 data types have the form

\pm\, 1.f \times 2^e

where now f has 52 bits and e has 11.

Of the 2^63 non-negative integers that can be represented by an int64 data type, only 12 × 2^52 can also be represented exactly as a float64 data type: 2^52 for each of e = 52, 53, …, 62, plus the 2^52 integers below 2^52. This means about 0.6% of 64-bit integers can be represented exactly by a 64-bit float.

A note on Python

Python’s integers have unlimited range, while its floating point numbers correspond to float64. So there are two reasons an integer might not be representable as a float: it may be larger than the largest float, or it may require more than 53 bits of precision.
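Both failure modes are easy to demonstrate:

    print(float(2**60 + 1) == 2**60 + 1)   # False: needs 61 bits of precision
    print(float(2**60 + 1) == 2**60)       # True: rounds to the nearest float
    # float(2**11000) raises OverflowError: int too large to convert to float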

Related posts