Logarithmic spiral

I’ve seen an image similar to the following many times, but I don’t recall any source going into detail regarding how the spiral is constructed. This post will do just that.

The previous post constructed iterated golden rectangles. We start with a golden rectangle and imagine chopping of first the blue, then the green, then the orange, and then the red square. We could continue this process on the inner most rectangle, over and over.

At this point, many sources would say “Hey look! We can draw a spiral that goes through two corners of each square” without explicitly stating an equation for the spiral. We’ll find spiral, making everything explicit.

In the previous post we showed how to find the limit of the iterative process at the intersection of the two diagonal lines below.

We showed that the intersection is at

x0 = (2φ + 1)/(φ + 2)

and

y0 = 1/(φ + 2).

Logarithmic spirals

A logarithmic spiral centered at the origin has the polar equation

r(θ) = a exp(kθ)

for some constants a and k. Our spiral will be such a spiral shifted to center at (x0y0).

Imagine a spiral going from the lower left to the top left of the blue square, then to the top left of the green square, then to the top right of the orange square etc., passing through two corners of every square. Every time θ goes through a quarter turn, i.e. π/2 radians, our rectangles get smaller by a factor of φ. We can use this to solve for k because

a exp(k(θ + π/2)) = φ a exp()

and so

exp(kπ/2) = φ

and

k = 2log(φ) / π.

To summarize what we know so far, if we shift our spiral to the origin, it has equation in polar form

r(θ) = a exp(kθ)

where we now know k but don’t yet know a.

Finding θ and a

To get our actual spiral, we have to shift the origin to (x0, y0) which we have calculated above. Converting from polar to rectangular coordinates so we can do the shift, we have

x(θ) = x0 + a exp(kθ) cos(θ)

y(θ) = y0 + a exp(kθ) sin(θ)

We can find the parameter a by knowing that our spiral passes through the origin, the bottom left corner in the plots above. At what value of θ does the curve go through the origin? We have a choice; values of θ that differ by multiples of 2π will give different values of a.

The angle θ is the angle made by the line connecting (0, 0) to (x0, y0). We can find θ via

θ0 = atan2(-y0, –x0).

Here’s Python code to find θ0:

    import numpy as np

    phi = (1 + 5**0.5)/2
    y0 = 1/(2+phi)
    x0 = (2*phi + 1)*y0
    theta0 = np.arctan2(-y0, -x0)

Then we solve for a from the equation x0) = 0.

    k = 2*np.log(phi)/np.pi
    a = -x0 / (np.exp(k*theta0)*np.cos(theta0)))

Plots

Now we have all the parameters we need to define our logarithmic spiral. The following code draws our logarithmic spiral on top of the plot of the rectangles.

    t = np.linspace(-20, theta0, 1000)
    def x(t): return x0 + a*np.exp(k*t)*np.cos(t)
    def y(t): return y0 + a*np.exp(k*t)*np.sin(t)
    plt.plot(x(t), y(t), "k-")

The result is the image at the top of the post.

Find log normal parameters for given mean and variance

Earlier today I needed to solve for log normal parameters that yield a given mean and variance. I’m going to save the calculation here in case I needed in the future or in case a reader needs it. The derivation is simple, but in the heat of the moment I’d rather look it up and keep going with my train of thought.

NB: The parameters μ and σ² of a log normal distribution are not the mean and variance of the distribution; they are the mean and variance of its log.

If m is the mean and v is the variance then

\begin{align*} m &= \exp(\mu + \sigma^2/2) \\ v &= (\exp(\sigma^2) - 1) \exp(2\mu + \sigma^2) \end{align*}

Notice that the square of the m term matches the second part of the v term.

Then

\frac{v}{m^2} = \exp(\sigma^2) -1

and so

\sigma^2 = \log\left(\frac{v}{m^2} + 1 \right)

and once you have σ² you can find μ by

\mu = \log m - \sigma^2/2

Here’s Python code to implement the above.

    from numpy immport log
    def solve_for_log_normal_parameters(mean, variance):
        sigma2 = log(variance/mean**2 + 1)
        mu = log(mean) - sigma2/2
        return (mu, sigma2)

And here’s a little test code for the code above.

    mean = 3.4
    variance = 5.6

    mu, sigma2 = solve_for_log_normal_parameters(mean, variance)

    X = lognorm(scale=exp(mu), s=sigma2**0.5)
    assert(abs(mean - X.mean()) < 1e-10)
    assert(abs(variance - X.var()) < 1e-10)

Related posts

Finding Lagrange points L1 and L2

The James Webb Space Telescope (JWST) is on its way to the Lagrange point L2 of the Sun-Earth system. Objects in this location will orbit the Sun at a fixed distance from Earth.

There are five Lagrange points, L1 through L5. The points L1, L2, and L3 are unstable, and points L4 and L5 are stable. Since L2 is unstable, it will have to adjust its orbit occasionally to stay at L2.

The Lagrange points L1 and L2 are nearly equally distant from Earth, with L1 between the Earth and Sun, and L2 on the opposite side of Earth from the Sun.

The equations for the distance r from L1 and L2 to Earth are very similar and can be combined into one equation:

\frac{M_1}{(R\pm r)^2} \pm \frac{M_2}{r^2}=\left(\frac{M_1}{M_1+M_2}R \pm r\right)\frac{M_1+M_2}{R^3}

The equation for L1 corresponds to taking ± as – and the equation for L2 corresponds to taking ± as +.

In the equation above, R is the distance between the Sun and Earth, and M1 and M2 are the mass of the Sun and Earth respectively. (This is not going to be too precise since the distance between the Sun and Earth is not constant. We’ll use the mean distance for R.)

For both L1 and L2 we have

r \approx R \sqrt[3]{\frac{M_2}{3M_1}}

Let’s use Newton’s method to solve for the distances to L1 and L2, and see how they compare to the approximation above.

    from scipy.optimize import newton
    
    M1 = 1.988e30 # kg
    M2 = 5.972e24 # kg
    R  = 1.471e8  # km

    approx = R*(M2/(3*M1))**(1/3)
    
    def f(r, sign):
        M = M1 + M2
        ret = M1/(R + sign*r)**2 + sign*M2/r**2
        ret -= (R*M1/M + sign*r)*M/R**3
        return ret
    
    L1 = newton(lambda r: f(r, -1), approx)
    L2 = newton(lambda r: f(r,  1), approx)
    
    print("L1       = ", L1)
    print("approx   = ", approx)
    print("L2       = ", L2)
    print("L1 error = ", (approx - L1)/L1)
    print("L2 error = ", (L2 - approx)/L2)    

This prints the following.

    
    L1       = 1467000
    approx   = 1472000
    L2       = 1477000
    L1 error = 0.3357%
    L2 error = 0.3312%

So L2 is slightly further away than L1. The approximate distance under-estimates L2 and over-estimates L1 both by about 0.33% [1].

L1 and L2 are about 4 times further away than the moon.

Related posts

[1] The approximation for r makes an excellent starting point. When I set the relative error target to 1e-5, Newton’s method converged in four iterations.

    full = newton(lambda r: f(r, 1), approx, tol=1e-5, full_output=True)
    print(full)

Numbering the branches of the Lambert W function

The previous post used the Lambert W function to solve an equation that came up in counting partitions of an integer. The first solution found didn’t make sense in context, but another solution, one on a different branch, did. The default branch k = 0 wasn’t what we were after, but the branch k = -1 was.

Branches 0 and -1

The branch corresponding to k = 0 is the principal branch. In mathematical notation, the k is usually a subscript, i.e W0(z) is the principal branch of the Lambert W function. For real z ≥ -1/e it returns a real value. It’s often what you want, and that’s why it’s the default in software implementations such as SciPy and Mathematica. More on that below.

The only other branch that returns real values for real inputs is W-1 which returns real values for -1/ez < 0.

If you’re working strictly with real numbers, you only need to be concerned with branches 0 and -1. If branch 0 doesn’t give you what you expect, try branch -1, if your argument is negative.

SciPy and Mathematica

SciPy implements Wk with lambertw(z). The parameter k defaults to 0.

The Mathematica function ProductLog[z] implements W0(z), and ProductLog[k, z] implements Wk.

Note that Mathematica and Python list their arguments in opposite orders.

Branch numbering

The branches are numbered in order of their imaginary parts. That is, The imaginary part of Wk (z) is an increasing function of k. For example, here is a list of the numerical values of the imaginary parts of Wk (1) for k running from -3 to 3.

    Table[N[Im[ProductLog[k, 1]]], {k, -3, 3}]
    {-17.1135, -10.7763, -4.37519, 0., 4.37519, 10.7763, 17.1135}

Note the zero in the middle because W0(1) is real.

Recovering k

Given z and a value Wk (z) with k unknown, you can determine k with

k = \frac{W_k(z) + \log W_k(z) - \log z}{2\pi i}

with the exception that if the expression above is 0 and -1/ez < 0 then k = -1. See [1].

For example, let z = 1 + 2i and w = W3 (z).

    z = 1 + 2 I
    w = ProductLog[3, z]

Now pretend you don’t know how w was computed. When you compute

    N[(w + Log[w] - Log[z])/(2 Pi I)]

the result is 3.

Partition example

The discussion of the W function here grew out of a problem with partitions. Specifically, we wanted to solve for n such that
f(n) = \frac{1}{4n\sqrt{3}} \exp\left( \pi \sqrt{\frac{2n}{3}} \right)

is approximately a trillion. We found in the previous post that solutions of the equation

a w^b \exp(c w^d) = x

are given by

w = \left(\frac{b}{cd} \,\,W\left(\frac{cd}{b} \left(\frac{x}{a} \right )^{d/b} \right )\right )^{1/d}

Our partition problem corresponds to a = 1/√48, b = -1, c = π √(2/3), and d = 1/2. We found that the solution we were after came from the k = -1 branch.

    N[( (b/(c d)) ProductLog[-1, (x/a)^(d/b) c d /b])^(1/d)]
    183.852

Even though our final value was real, the intermediate values were not. The invocation of W-1 in the middle returns a complex value:

    N[ProductLog[-1, ((x/a)^(d/b) c d /b)^(1/d)]]
    -32.5568 - 3.24081 I

We weren’t careful about specifying ranges and branch cuts, but just sorta bulldozed our way to a solution. But it worked: it’s easy to verify that 183.852 is the solution we were after.

***

[1] Unwinding the branches of the Lambert W function by Jeffrey, Hare, and Corless. The Mathematical Scientist 21, 1&ndash;7 (1996)

Solving equations with Lambert’s W function

In the previous post we wanted to find a value of n such that f(n) = 1012 where

f(n) = \frac{1}{4n\sqrt{3}} \exp\left( \pi \sqrt{\frac{2n}{3}} \right)

and we took a rough guess n = 200. Turns out f(200) ≈ 4 × 1012 and that was good enough for our purposes. But what if we wanted to solve f(n) = x accurately?

We will work our way up to solving f(n) = 1012 exactly using the Lambert W function.

Lambert’s W function

If you can solve one equation involving exponential functions you can bootstrap your solution solve a lot more equations.

The Lambert W function is defined to be the function W(x) that maps each x to a solution of the equation

w exp(w) = x.

This function is implemented Python under scipy.special.lambertw. Let’s see how we could use it solve similar equations to the one that define it.

Basic bootstrapping

Given a constant c we can solve

cw exp(w) = x

simply by solving

w exp(w) = x/c,

which says w = W(x/c).

And we can solve

w exp(cw) = x

by setting y = cw and solving

y exp(y) = cx.

and so cw = W(cx) and w = W(cx)/c.

Combining the two approaches above shows that the solution to

aw exp(bw) = x

is

w = W(bx/a) / b.

We can solve

exp(w) / w = x

by taking the reciprocal of both sides and applying the result above with a = 1 and b = -1.

More generally, we can solve

wc exp(w) = x

by raising both sides to the power 1/c; the reciprocal the special case c = 1.

Similarly, we can solve

w exp(wc) = x

by setting y = wc , changing the problem to

y-1/c exp(y) = x.

Putting it all together

We’ve now found how to deal with constants and exponents on w, inside and outside the exponential function. We now have all the elements to solve our original problem.

We can solve the general equation

a w^b \exp(c w^d) = x

with

w = \left(\frac{b}{cd} \,\,W\left(\frac{cd}{b} \left(\frac{x}{a} \right )^{d/b} \right )\right )^{1/d}

The equation at the top of the post corresponds to a = 1/√48, b = -1, c = π √(2/3), and d = 1/2.

We can code this up in Python as follows.

    from scipy.special import lambertw

    def soln(a, b, c, d, x):
        t = (x/a)**(d/b) *c*d/b
        return (lambertw(t)*b/(c*d))**(1/d) 

We can verify that our solution really is a solution by running it though

    from numpy import exp, pi

    def f(a, b, c, d, w):
        return a*w**b * exp(c*w**d)

to make sure we get our x back.

When we run

    a = 0.25*3**-0.5
    b = -1
    c = pi*(2/3)**0.5
    d = 0.5
    x = 10**12

    w = soln(a, b, c, d, x)
    print(f(a, b, c, d, w))

we do indeed get x back.

    a = 0.25*3**-0.5
    b = -1
    c = pi*(2/3)**0.5
    d = 0.5

    w = soln(a, b, c, d, 10)
    print(f(a, b, c, d, w))

When we take a look at the solution w, we get 1.443377079584187e-13. In other words, we get a number near zero. But our initial guess was w = 200. What went wrong?

Nothing went wrong. We got a solution to our equation. It just wasn’t the solution we expected.

The Lambert W function has multiple branches when viewed as a function on complex numbers. It turns out the solution we were expecting is on the branch SciPy denotes with k = -1. If we add this to the call to lambertw above we get the solution 183.85249773880142 which is more in line with what we expected.

More Lambert W posts

Lambert W strikes again

I was studying a statistics paper the other day in which the author said to solve

t log( 1 + n/t ) = k

for t as part of an algorithm. Assume 0 < k < n.

Is this well posed?

First of all, can this equation be solved for t? Second, if there is a solution, is it unique?

It’s not hard to show that as a function of t, the left side approaches 0 as t approaches 0, and it approaches n as t goes to infinity. So there is a solution for all k between 0 and n. The restriction on k is necessary since the left side cannot exceed n.

With a little more work one can show that the derivative is always positive, so the left side is a monotonically increasing function, and so the solution given each value of k is unique.

Analytic solution

Now if we fix n, we can think of the equation above as defining t as a function of k. Can we solve for t exactly? I suspected that there might be a solution in terms of the Lambert W function because if you divide by t and exponentiate, you get an equation that smells like the equation

z = w exp(w)

defining the function W(z). It turns out this is indeed the case.

If we ask Mathematica

    Solve[t Log[1 + n/t] ==  k, t]

we get

t\to -\frac{k}{ W\left(-\cfrac{k \exp(-k/n)}{n}\right)+\cfrac{k}{n}}

Great! There’s a closed-form solution, if you accept using the W function as being closed form.

Problems with the solution

I found the solution in Mathematica, but I tested it in Python to make sure both platforms define W the same way.

    from numpy import log, exp
    from scipy.special import lambertw

    def f(t, n):
        return t*log(1 + n/t)

    def g(k, n):
        r = k/n
        return -k/(lambertw(-r*exp(-r)) + r)

    n, k = 10, 8
    t = g(k, n)
    print(f(t, n))

This should print k, so it prints 8, right? No, it prints 10.

What’s up with that?

If we look back at the equation for t above, we see that the W function is being evaluated at x exp(x) where x = –k/n, so we should get –k/n back since W(x exp(x)) = x by definition. But that means our denominator is zero, and so the equation doesn’t make sense!

Things are getting worse. At first we had a wrong value, but at least it was finite!

The problem is not a difference between Mathematica and Python.

Resolution

The problem is we’ve glossed over a branch cut of the W function. To make a long story short, we were using the principal branch of the W function, but we should have used a different branch.

Let’s go back to where I asked Mathematica

    Solve[t Log[1 + n/t] == k, t]

I ran the solution through TeXForm to get the TeX code that produced the image for the solution equation. I made a few aesthetic changes to the TeX, but it was essentially Mathematica’s output.

Without the TeXForm, Mathematica’s solution was in terms of ProductLog, not in terms of W; the TeXForm function turned ProductLog into W. If you look up ProductLog, it will tell you

ProductLog[z] gives the principal solution for w in z = wew.

The principal solution. So we should be on the alert for difficulties with branches. There are two real solutions to z = wew for some values of z, and we have to choose the right one. For example, if z = -0.1, the w could be -0.1118 or -3.5772.

Mathematica gave me the wrong branch. But to be fair, it did try to warn me.

Solve::ifun: Inverse functions are being used by Solve, so some solutions may not be found; use Reduce for complete solution information.

The solution is to use the -1 branch. In Mathematica’s notation, the branch comes before the argument. In SciPy, the branch comes second. So to fix the code above, we need to change

    lambertw(-r*exp(-r))

to

   lambertw(-r*exp(-r), -1)

and then the code will be correct.

If x is negative, and we use the -1 branch of W, then

W-1(x exp(x)) ≠ x

and so we’re not dividing by zero in our solution.

More posts with Lambert W

Bessel determinants

The Bessel functions J and Y are analogous to sine and cosine. Bessel functions come up in polar coordinates the way sines and cosines come up in rectangular coordinates. There are J and Y functions of various orders, conventionally written with a subscript ν.

Plotting Bessel functions J_3 and Y_3

I recently ran across a curious set of relations between these functions and their derivatives. Let’s start with the following terms defined by 2 by 2 determinants.

\begin{align*} p_\nu &= \begin{vmatrix} J_\nu(a) & J_\nu(b) \\ Y_\nu(a) & Y_\nu(b) \\ \end{vmatrix} \\ q_\nu &= \begin{vmatrix} J_\nu(a) & J'_\nu(b) \\ Y_\nu(a) & Y'_\nu(b) \\ \end{vmatrix} \\ r_\nu &= \begin{vmatrix} J'_\nu(a) & J_\nu(b) \\ Y'_\nu(a) & Y_\nu(b) \\ \end{vmatrix} \\ s_\nu &= \begin{vmatrix} J'_\nu(a) & J'_\nu(b) \\ Y'_\nu(a) & Y'_\nu(b) \\ \end{vmatrix} \end{align*}

There are a lot of symmetries in the definitions above. First, every term has a subscript ν, so you can ignore those for now.

Second, every determinant has J‘s on the top row and Y‘s on the bottom.

 \begin{vmatrix} J & J \\ Y & Y \\ \end{vmatrix}

Third, every determinant has a‘s in the first column and b‘s in the second.

 \begin{vmatrix} a & b \\ a & b \\ \end{vmatrix}

And finally, the primes, indicating derivatives, have the following pattern.

 \begin{align*} p &= \begin{vmatrix} \phantom{\cdot} & \phantom{\cdot} \\ \phantom{\cdot} & \phantom{\cdot} \\ \end{vmatrix} \\ q &= \begin{vmatrix} \phantom{\cdot} & \phantom{\cdot}' \\ \phantom{\cdot} & \phantom{\cdot}' \\ \end{vmatrix} \\ r &= \begin{vmatrix} \phantom{\cdot}' & \phantom{\cdot} \\ \phantom{\cdot}' & \phantom{\cdot} \\ \end{vmatrix} \\ s &= \begin{vmatrix} \phantom{\cdot}' & \phantom{\cdot}' \\ \phantom{\cdot}' & \phantom{\cdot}' \\ \end{vmatrix} \end{align*}

Now that we have all these definitions in place, there are several identities that the p’s, q’s, r’s, and s‘s satisfy. Some of them depend on the orders, and that’s why we included the ν subscripts. You can find various relations in A&S equation 9.1.32, but I wanted to point out one that’s particularly elegant, A&S equation 9.1.34:

\begin{vmatrix} p_\nu & q_\nu \\ r_\nu & s_\nu \end{vmatrix} = \frac{4}{\pi^2ab}

This equation, a determinant of terms defined by determinants, reduces to a simple form that depends only on where the Bessel functions are evaluated, i.e. a and b, but not on their order ν.

Python example

The Python code below will show how to call Bessel functions and their derivatives and will illustrate the equations above.

    from math import pi
    from scipy.special import jv, yv, jvp, yvp
    
    def example(n, a, b):
    
        ja = jv(n, a)
        jb = jv(n, b)
        ya = yv(n, a)
        yb = yv(n, b)
        
        jap = jvp(n, a)
        jbp = jvp(n, b)
        yap = yvp(n, a)
        ybp = yvp(n, b)
        
        p = ja  * yb  - jb  * ya
        q = ja  * ybp - jbp * ya
        r = jap * yb  - jb  * yap
        s = jap * ybp - jbp * yap
        
        print(p*s - q*r)
        print(4/(pi*pi*a*b))
    
    example(3, 16, 21)

This prints

    0.0012062045671706876
    0.0012062045671706878

The two results differ slightly in the last decimal place due to rounding error.

The order of a Bessel function can be any real number [1], and the arguments can be any complex number. Here’s another example with more general arguments.

    example(3.14, 16-3j, 21+1.2j)    

This prints

    (0.0011738907214344785 + 0.00015140286689885318j)
    (0.0011738907214345800 + 0.00015140286689880630j)

with the real and imaginary parts agreeing to 15 decimal places.

Related posts

[1] You can define Bessel functions with complex order, but SciPy doesn’t support that.

Broadcasting and functors

In my previous post, I looked at the map Δ that takes a column vector to a diagonal matrix. I even drew a commutative diagram, which foreshadows a little category theory.

Suppose you have a function f of a real or complex variable. To an R programmer, if x is a vector, it’s obvious that f(x) means to apply f to every component of a vector. Python (NumPy) works the same way, and calls this broadcasting. To a mathematician, this looks odd. What does the logarithm of a vector, for example, even mean?

As in the previous post, we can use Δ to formalize things. We said that Δ has some nice properties, and in fact we will show it is a functor.

To have a functor, we have to have categories. (Historically, functors came first; categories were defined in order to define functors.) We will define C to be the category of column vectors and M the category of square matrices as before. Or rather, we should say the objects of C are column vectors and the objects of M are square matrices.

Categories need morphisms, functions between objects [1]. We define the morphisms on C to be analytic functions applied componentwise. So, for example, if

z = [1, 2, -3],

then

tan(z) = [tan(1), tan(2), tan(-3)].

The morphisms on M will be analytic functions on square matrices, not applied componentwise but applied by power series. That is, given an analytic function f, we define f of a square matrix X as the result of sticking the matrix X into the power series for f. For an example, see What is the cosine of a matrix?

We said that Δ is a functor. It takes column vectors and turns them into square matrices by putting their contents along the diagonal of a matrix. We gave the example in the previous post that [4, i, π] would be mapped to the matrix with these elements on the diagonal, i.e.

\Delta: \begin{pmatrix} 4 \\ i \\ \pi \end{pmatrix} \mapsto \begin{pmatrix} 4 & 0 & 0\\ 0 & i & 0 \\ 0 & 0 & \pi \end{pmatrix}

That says what Δ does on objects, but what does it do on morphisms? It takes an analytic function that was applied componentwise to column vectors, and turns it into a function that is applied via its power series to square matrices. That is, starting with a function

f(z) = \sum_{n=0}^\infty a_nz^n

we define the morphism f on C by

f : \begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{pmatrix} \mapsto \begin{pmatrix} f(z_1) \\ f(z_2) \\ \vdots \\ f(z_n) \end{pmatrix}

and the morphism Δ f on M by

\Delta f : Z \mapsto \sum_{n=0}^\infty a_n Z^n

where Z is a square matrix.

We can apply f to a column vector, and then apply Δ to turn the resulting vector into a diagonal matrix, or we could apply Δ to turn the vector into a diagonal matrix first, and then apply f (technically,  Δf). That is, the follow diagram commutes:

\begin{diagram}   C & \rTo^{\Delta} & M \\   \uTo^{f} & & \uTo_{\Delta f} \\   C & \rTo^{\Delta} & M  \end{diagram}

Python example

Applying an analytic function to a diagonal matrix gives the same result as simply applying the function to the elements of the diagonal. But for more general square matrices, this is not the case. We will illustrate this with some Python code.

    import numpy as np
    from scipy.linalg import funm

    d = np.array([1, 2])
    D = np.diag(d)
    M = np.array([[1, np.pi], [2, 0]])

Now let’s look at some output.

    >>> np.sin(d)                     
    array([0.84147098, 0.90929743])   

    >>> np.sin(D)                     
    array([[0.84147098, 0.        ],  
           [0.        , 0.90929743]]) 

    >>> funm(D, np.sin)               
    array([[0.84147098, 0.        ],  
           [0.        , 0.90929743]]) 

So if we take the sine of d and turn the result into a matrix, we get the same thing as if we turn d into a matrix D and then take the sine of D, either componentwise or as an analytic function (with funm, function of a matrix).

Now let’s look at a general, non-diagonal matrix.

    >>> np.sin(M)
    array([[0.84147099, 0],
       [0.90929743, 0]])

    >>> funm(D, np.sin)
    array([[0.84147098, 0.        ],
           [0.        , 0.90929743]])

Note that the elements in the bottom row are in opposite positions in the two examples.

[1] OK, morphisms are not necessarily functions, but in practice they usually are.

Python and the Tell-Tale Heart

I was browsing through SciPy documentation this evening and ran across a function in scipy.misc called electrocardiogram. What?!

It’s an actual electrocardiogram, sampled at 360 Hz. Presumably it’s included as convenient example data. Here’s a plot of the first five seconds.

ECG plot

I wrote a little code using it to turn the ECG into an audio file.

from numpy import int16, iinfo
from scipy.io.wavfile import write
from scipy.misc import electrocardiogram

def to_integer(signal):
    # Take samples in [-1, 1] then scale to 16-bit integers
    m = iinfo(int16).max
    M = max(abs(signal))
    return int16(signal*m/M)

ecg = electrocardiogram()
write("heartbeat.wav", 360, to_integer(ecg))

I had to turn the volume way up to hear it, and that made me think of Edgar Allan Poe’s story The Tell-Tale Heart.

I may be doing something wrong. According to the documentation for the write function, I shouldn’t need to convert the signal to integers. I should just be able to leave the signal as floating point and normalize it to [-1, 1] by dividing by the largest absolute value in the signal. But when I do that, the output file will not play.

Related posts

Stable and unstable recurrence relations

The previous post looked at computing recurrence relations. That post ends with a warning that recursive evaluations may nor may not be numerically stable. This post will give examples that illustrate stability and instability.

There are two kinds of Bessel functions, denoted J and Y. These are called Bessel functions of the first and second kinds respectively. These functions carry a subscript n denoting their order. Both kinds of Bessel functions satisfy the same recurrence relation:

fn+1 – (2n/x) fn + fn-1 = 0

where f is J or Y.

If you apply the recurrence relation in the increasing direction, it is unstable for J but stable for Y.

If you apply the recurrence relation in the opposite direction, it is stable for J and unstable for Y.

We will illustrate the above claims using the following Python code. Since both kinds of Bessel function satisfy the same recurrence, we pass the Bessel function in as a function argument. SciPy implements Bessel functions of the first kind as jv and Bessel functions of the second kind as yv. [1]

    from scipy import exp, pi, zeros
    from scipy.special import jv, yv

    def report(k, computed, exact):
        print(k, computed, exact, abs(computed - exact)/exact)

    def bessel_up(x, n, f):
        a, b = f(0, x), f(1, x)
        for k in range(2, n+1):
            a, b = b, 2*(k-1)*b/x - a
            report(k, b, f(k,x))     

    def bessel_down(x, n, f):
        a, b = f(n,x), f(n-1,x)
        for k in range(n-2, -1, -1):
            a, b = b, 2*(k+1)*b/x - a
            report(k, b, f(k,x))

We try this out as follows:

    bessel_up(1, 20, jv)
    bessel_down(1, 20, jv)
    bessel_up(1, 20, yv)
    bessel_down(1, 20, yv)

When we compute Jn(1) using bessel_up, the relative error starts out small and grows to about 1% when n = 9. The relative error increases rapidly from there. When n = 10, the relative error is 356%.

For n = 20, the recurrence gives a value of 316894.36 while the true value is 3.87e-25, i.e. the computed value is 30 orders of magnitude larger than the correct value!

When we use bessel_down, the results are correct to full precision.

Next we compute Yn(1) using bessel_up the results are correct to full precision.

When we compute Yn(1) using bessel_down, the results are about as bad as computing Jn(1) using bessel_up. We compute Y0(1) as 0 5.7e+27 while the correct value is roughly 0.088.

There are functions, such as Legendre polynomials, whose recurrence relations are stable in either direction, at least for some range of inputs. But it would be naive to assume that a recurrence is stable without some exploration.

Miller’s algorithm

There is a trick for using the downward recurrence for Bessel functions known as Miller’s algorithm. It sounds crazy at first: Assume JN(x) = 1 and JN+1(x) = 0 for some large N, and run the recurrence downward.

Since we don’t know JN(x) our results be off by some constant proportion. But there’s a way to find out what that proportionality constant is using the relation described here.

1 = J_0(x) + 2J_2(x) + 2J_4(x) + 2J_6(x) + \cdots

We add up out computed values for the terms on the right side, then divide by the sum to normalize our estimates. Miller’s recurrence algorithm applies more generally to other recurrences where the downward recurrence is stable and there exists a normalization identity analogous to the one for Bessel functions.

The following code lets us experiment with Miller’s algorithm.

    def miller(x, N):
        j = zeros(N) # array to store values
        
        a, b = 0, 1
        for k in range(N-1, -1, -1):
            a, b = b, 2*(k+1)*b/x - a
            j[k] = b
            
        norm = j[0] + sum(2*j[k] for k in range(2,N,2))
        j /= norm
        
        for k in range(N-1, -1, -1):
            expected, computed = j[k], jv(k,x)
            report(k, j[k], jv(k,x))

When we call miller(pi, 20) we see that Miller’s method computes Jn(π) accurately. The error starts out moderately small and decreases until the results are accurate to floating point precision.

    |----+------------|
    |  k | rel. error |
    |----+------------|
    | 19 |   3.91e-07 |
    | 17 |   2.35e-09 |
    | 16 |   2.17e-11 |
    | 15 |   2.23e-13 |
    | 14 |   3.51e-15 |
    |----+------------|

For smaller k the relative error is also around 10-15, i.e. essentially full precision.

[1] Why do the SciPy names end in “v”? The order of a Bessel function does not have to be an integer. It could be any real number, and the customary mathematical notation is to use a Greek letter ν (nu) as a subscript rather than n as a reminder that the subscript might not represent an integer. Since a Greek ν looks similar to an English v, SciPy uses v as a sort of approximation of ν.