Python | John D. Cook

Converting very long strings to integers in Python

John — Fri, 15 Aug 2025 18:45:06 +0000

In the process of writing the previous post, I wanted to confirm that the number in the post

really is prime. This was useful in debugging my manual conversion of the image to text: errors did not result in a prime number. For example, I didn’t see the 9’s in the image at first, and I didn’t get a prime number until I changed four of the 8’s to 9’s.

But there was a problem along the way. Simply converting the string to an integer didn’t work. It produced the following error:

SyntaxError: Exceeds the limit (4300) for integer string conversion: value has 5382 digits; use sys.set_int_max_str_digits() to increase the limit – Consider hexadecimal for huge integer literals to avoid decimal conversion limits.

Note that the limitation is not on the size of a Python integer. The only limitation on the size of an integer is available memory. But there is a limitation on the size of string that can be converted to an integer.

The fix suggested in the error message didn’t work. But storing the number as several strings, i.e. each row of the image, and doing my own radix conversion did work.

from sympy import isprime

flaglines = [
    "888888888888888888888888888888888888888888888888888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888881118888811188888111888881118888811188888111888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888881118888811188888111888881118888811188888111888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888888888111888881118888811188888111888881118888888881111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "888888888111888881118888811188888111888881118888888881111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "888881118888811188888111888881118888811188888111888881111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "888881118888811188888111888881118888811188888111888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888888888111888881118888811188988111888881118888888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888888888111888881118898811188888111888881118888888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888881118888811188888111888881118888811188888111888881111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "888881118888811188888111888881118888811188888111888881111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "888888888111889881118888811188888111888881118888888881111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "888888888111888881118898811188888111888881118888888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888881118888811188888111888881118888811188888111888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888881118888811188888111888881118888811188888111888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888888888111888881118888811188888111888881118888888881111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "888888888111888881118888811188888111888881118888888881111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "888881118888811188888111888881118888811188888111888881111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "888881118888811188888111888881118888811188888111888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888888888888888888888888888888888888888888888888888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "888888888888888888888888888888888888888888888888888883333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111",
    "333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
    "333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333333",
]

m = int(flaglines[0])
for i in range(1, len(flaglines)):
    line = flaglines[i]
    x = int(line)
    m = m * 10**len(line) + x
print(isprime(n))

The post Converting very long strings to integers in Python first appeared on John D. Cook.

Monero’s elliptic curve

John — Sun, 27 Jul 2025 13:57:59 +0000

Digital signatures often use elliptic curves. For example, Bitcoin and Ethereum use the elliptic curve secp256k1 [1]. This post will discuss the elliptic curve Ed25519 [2] using in Monero and in many other applications.

Ed25519 has the equation

y² − x² = 1 + d x² y²

over the finite field F_q where q = 2²⁵⁵ − 19 and d = −121665/121666. The general form of the equation above makes Ed25519 a “twisted Edwards curve.”

The expression for d is not the rational number it appears to be. Think of it as

d = −121665 × 121666⁻¹

where the multiplication and the multiplicative inverse are calculated mod q.

We could calculate d in Python as follows.

>>> q = 2**255 - 19
>>> d = (-121665*pow(121666, -1, q)) % q
>>> d 
37095705934669439343138083508754565189542113879843219016388785533085940283555

The equation above does not look like an elliptic curve if you think of an elliptic curve having the form

y² = x³ + ax + b.

But that form, known as Weierstrass form, is not the most general definition of an elliptic curve. Elliptic curves can be written in that form [3] but they do not have to be. There are computational advantages to writing curves in the form

y² − x² = 1 + d x² y²

when possible rather than in Weierstrass form [4].

Elliptic curve digital signatures require a specified base point. The Monero whitepaper describes the base point simply as

G = (x, −4/5).

That’s leaving out a fair bit of detail. First of all, 4/5 is interpreted as 4 times the multiplicative inverse of 5 mod q, similar to the calculation for d above.

>>> y = (-4*pow(5, -1, q)) % q; y
11579208923731619542357098500868790785326998466564056403945758400791312963989

Now we have two tasks. How do we solve for x, and which solution do we take?

We need to solve

x² = (y² − 1)/(1 + d y²) mod q.

We can do this in Python as follows.

>>> from sympy import sqrt_mod
>>> roots = sqrt_mod((y**2 - 1)*pow(1 + d*y**2, -1, q), q, all_roots=True)
>>> roots 
[15112221349535400772501151409588531511454012693041857206046113283949847762202,
42783823269122696939284341094755422415180979639778424813682678720006717057747]

So which root to we choose? The convention is to use the even solution, the first one above. (The two roots add to 0 mod q; one will be odd and one will be even because q is odd.)

We can verify that (x, y) is on the Ed25519 elliptic curve:

>>> ((y**2 - x**2) - (1 + d*x**2 * y**2)) % q
0

[1] The name secp256k1 was created as follows. The “sec” comes from “Standards for Efficient Cryptography,” referring to the group that specified the curve parameters. The “p” means the curve is over a finite field of prime order. The order of the curve is slightly less than 2²⁵⁶. The “k” indicates that this is a Koblitz curve.

[2] The “Ed” part of the name refers to Harold Edwards, the mathematician who studied the family of elliptic curves now known as Edwards curves. The “25519” part of the name refers to the fact that the curve is over the finite field with 2²⁵⁵ − 19 elements.

[3] Provided the elliptic curve is not over a field of characteristic 2 or 3.

[4] Group operations can be implemented more efficiently and the point at infinity can be handled without exception logic.

The post Monero’s elliptic curve first appeared on John D. Cook.

Converting between quaternions and rotation matrices

John — Wed, 07 May 2025 13:52:27 +0000

In the previous post I wrote about representing rotations with quaternions. This representation has several advantages, such as making it clear how rotations compose. Rotations are often represented as matrices, and so it’s useful to be able to go between the two representations.

A unit-length quaternion (q₀, q₁, q₂, q₃) represents a rotation by an angle θ around an axis in the direction of (q₁, q₂, q₃) where cos(θ/2) = q₀. The corresponding rotation matrix is given below.

(Update: there was a typo in the equation above. See this post for how Claude found the typo and wrote Lean code to verify the fix.)

Going the other way around, inferring a quaternion representation from a rotation matrix, is harder. Here is a mathematically correct but numerically suboptimal method known [1] as the Chiaverini-Siciliano method.

Here sgn is the sign function; sgn(x) equals 1 if x is positive and −1 if x is negative. Note that the components only depend on the diagonal of the rotation matrix, aside from the sign terms. Better numerical algorithms make more use of the off-diagonal elements.

Accounting for degrees of freedom

Something seems a little suspicious here. Quaternions contain four real numbers, and 3 by 3 matrices contain nine. How can four numbers determine nine numbers? And going the other way, out of the nine, we essentially choose three that determine the four components of a quaternion.

Quaternions have four degrees of freedom, but we’re using unit quaternions, so there are basically three degrees of freedom. Likewise orthogonal matrices have three degrees of freedom. An axis of rotation is a point on a sphere, so that has two degrees of freedom, and the degree of rotation is the third degree of freedom.

In topological terms, the unit quaternions and the set of 3 by 3 orthogonal matrices are both three dimensional manifolds, and the former is a double cover of the latter. It is a double cover because a unit quaternion q corresponds to the same rotation as −q.

Python code

Implementing the equations above is straightforward.

import numpy as np

def quaternion_to_rotation_matrix(q):
    q0, q1, q2, q3 = q
    return np.array([
        [2*(q0**2 + q1**2) - 1, 2*(q1*q2 - q0*q3), 2*(q1*q3 + q0*q2)],
        [2*(q1*q2 + q0*q3), 2*(q0**2 + q2**2) - 1, 2*(q2*q3 - q0*q1)],
        [2*(q1*q3 - q0*q2), 2*(q2*q3 + q0*q1), 2*(q0**2 + q3**2) - 1]
    ]) 

def rotation_matrix_to_quaternion(R):
    r11, r12, r13 = R[0, 0], R[0, 1], R[0, 2]
    r21, r22, r23 = R[1, 0], R[1, 1], R[1, 2]
    r31, r32, r33 = R[2, 0], R[2, 1], R[2, 2]
    
    # Calculate quaternion components
    q0 = 0.5 * np.sqrt(1 + r11 + r22 + r33)
    q1 = 0.5 * np.sqrt(1 + r11 - r22 - r33) * np.sign(r32 - r23)
    q2 = 0.5 * np.sqrt(1 - r11 + r22 - r33) * np.sign(r13 - r31)
    q3 = 0.5 * np.sqrt(1 - r11 - r22 + r33) * np.sign(r21 - r12)
    
    return np.array([q0, q1, q2, q3])

Random testing

We’d like to test the code above by generating random quaternions, converting the quaternions to rotation matrices, then back to quaternions to verify that the round trip puts us back essentially where we started. Then we’d like to go the other way around, starting with randomly generated rotation matrices.

To generate a random unit quaternion, we generate a vector of four independent normal random values, then normalize by dividing by its length. (See this recent post.)

To generate a random rotation matrix, we use a generator that is part of SciPy.

Here’s the test code:

def randomq():
    q = norm.rvs(size=4)
    return q/np.linalg.norm(q)

def randomR():
    return special_ortho_group.rvs(dim=3)

np.random.seed(20250507)
N = 10

for _ in range(N):
    q = randomq()
    R = quaternion_to_rotation_matrix(q)
    t = rotation_matrix_to_quaternion(R)
    print(np.linalg.norm(q - t))
    
for _ in range(N):
    R = randomR()
    q = rotation_matrix_to_quaternion(R)
    T = quaternion_to_rotation_matrix(q)
    print(np.linalg.norm(R - T))

The first test utterly fails, returning six 2s, i.e. the round trip vector is as far as possible from the vector we started with. How could that happen? It must be returning the negative of the original vector. Now go back to the discussion above about double covers: q and −q correspond to the same rotation.

If we go back and add the line

    q *= np.sign(q[0])

then we standardize our random vectors to have a positive first component, just like the vectors returned by rotation_matrix_to_quaternion.

Now our tests all return norms on the order of 10⁻¹⁶ to 10⁻¹⁴. There’s a little room to improve the accuracy, but the results are good.

Update: I did some more random testing, and found errors on the order of 10⁻¹⁰. Then I was able to create a test case where rotation_matrix_to_quaternion threw an exception because one of the square roots had a negative argument. In [1] the authors get around this problem by evaluating two theoretically equivalent expressions for each of the square root arguments. The expressions are complementary in the sense that both should not lead to numerical difficulties at the same time.

[1] See “Accurate Computation of Quaternions from Rotation Matrices” by Soheil Sarabandi and Federico Thomas for a better numerical algorithm. See also the article “A Survey on the Computation of Quaternions From Rotation Matrices” by the same authors.

The post Converting between quaternions and rotation matrices first appeared on John D. Cook.

How hard is constraint programming?

Wayne Joubert — Thu, 31 Oct 2024 11:52:13 +0000

I’ve been writing code for the Z3 SMT solver for several months now. Here are my findings.

Python is used here as the base language. Python/Z3 feels like a two-layer programming model—declarative code for Z3, imperative code for Python. In this it seems reminiscent of C++/CUDA programming for NVIDIA GPUs—in that case, mixed CPU and GPU imperative code. Either case is a clever combination of methodologies that is surprisingly fluent and versatile, albeit not a perfect blend of seamless conceptual cohesion.

Other comparisons:

Both have two separate memory spaces (CUDA CPU/GPU memories for one; pure Python variables and Z3 variables for the other).
Both can be tricky to debug. In earlier days, CUDA had no debugger, so one had to fall back to the trusty “printf” statement (for a while it didn’t even have that!). If the code crashed, you might get no output at all. To my knowledge, Z3 has no dedicated debugger. If the problem being solved comes back as satisfiable, you can print out the discovered model variables, but if satisfiability fails, you get very little information. Like some other novel platforms, something of a “black box.”
In both cases, programmer productivity can be well-served by developing custom abstractions. I developed a Python class to manage multidimensional arrays of Z3 variables, this was a huge time saver.

There are differences too, of course.

In Python, “=” is assignment, but in Z3, one only has “==”, logical or numeric equality, not assignment per se. Variables are set once and can’t be changed—sort of a “write-once variables” programming model—as is natural to logic programming.
Code speed optimization is challenging. Code modifications for Z3 constraints/variables can have extreme and unpredictable runtime effects, so it’s hard to optimize. Z3 is solving an NP-complete problem after all, so runtimes can theoretically increase massively. Speedups can be massive also; one round of changes I made gave 2000X speedup on a test problem. Runtime of CUDA code can be unpredictable to a lesser degree, depending on the PTX and SASS code generation phases and the aggressive code optimizations of the CUDA compiler. However, it seems easier to “see through” CUDA code, down to the metal, to understand expected performance, at least for smaller code fragments. The Z3 solver can output statistics of the solve, but these are hard to actionably interpret for a non-expert.
Z3 provides many, many algorithmic tuning parameters (“tactics”), though it’s hard to reason about which ones to pick. Autotuners like FastSMT might help. Also there have been some efforts to develop tools to visualize the solve process, this might be of help.

It would be great to see more modern tooling support and development of community best practices to help support Z3 code developers.

The post How hard is constraint programming? first appeared on John D. Cook.

Factored random numbers

John — Sat, 18 Nov 2023 01:42:42 +0000

A couple days ago Michael Nielsen posted an image of a one-page paper that gives an algorithm for generating factored random numbers, uniformly distributed from 1 to some designated N.

The algorithm does not generate random numbers then factor them. It’s more efficient than that, generating the factorization along with the final result. It does require testing for whether a number is prime, but this is more efficient than factorization.

I thought about trying to code up the algorithm in Python, but then I see that @iconjack beat me to it.

from sympy import isprime
from random import random, randint

def randfacts(N):
    while True:
        n, r, s = N, 1, []
        while n > 1:
            if r > N: break
            if isprime(n := randint(1,n)):
                r *= n
                s.append(n)
        else:
            if random() < r/N:
                return r, s

The post Factored random numbers first appeared on John D. Cook.

Python code for means

John — Wed, 11 Oct 2023 16:05:18 +0000

The last couple article have looked at various kinds of mean. The Python code for four of these means is trivial:

gm  = lambda a, b: (a*b)**0.5
am  = lambda a, b: (a + b)/2
hm  = lambda a, b: 2*a*b/(a+b)
chm = lambda a, b: (a**2 + b**2)/(a + b)

But the arithmetic-geometric mean (AGM) is not trivial:

from numpy import pi
from scipy.special import ellipk

agm = lambda a, b: 0.25*pi*(a + b)/ellipk((a - b)**2/(a + b)**2)

The arithmetic-geometric mean is defined by iterating the arithmetic and geometric means and taking the limit. This iteration converges very quickly, and so writing code that directly implements the definition is efficient.

But the AGM can also be computed via a special function K, the “complete elliptic integral of the first kind,” which makes the code above more compact. This is conceptually nice because we can think of the AGM as a simple function, not an iterative process.

But how is K evaluated? In some sense it doesn’t matter: it’s encapsulated in the SciPy library. But someone has to write SciPy. I haven’t looked at the SciPy source code, but usually K is calculated numerically using the AGM because, as we said above, the AGM converges very quickly.

This fits the pattern of a bell curve meme: the novice and expert approaches are the same, but for different reasons. The novice uses an iterative approach because that directly implements the definition. The expert knows about the elliptic integral, but also knows that the iterative approach suggested by the definition is remarkably efficient and eliminates the need to import a library.

Although it’s easy to implement the AGM with a while loop, the code above does have some advantages. For one thing, it pushes the responsibility for validation and exception handling onto the library. On the other hand, the code is easy to get wrong because there are two conventions on how to parameterize K and you have to be sure to use the same one your library uses.

The post Python code for means first appeared on John D. Cook.

Searching for proper nouns

John — Mon, 17 Jul 2023 21:03:33 +0000

Suppose you want to find all the proper nouns in a document. You could grep for every word that starts with a capital letter with something like

    grep '\b[A-Z]\w+'

but this would return the first word of each sentence in addition to the words you’re after.

You could grep for capitalized words that are not preceded by a period or question mark followed by a space.

grep -P '(?

That’s possibly better, but it misses proper nouns at the beginning of a sentence.

You might be able to accomplish what you’re after by tinkering with regular expressions, but it would be better to use a library that has some idea of what a proper noun is.

NLP with spaCy

The Python natural language processing library spaCy classifies words by part of speech, and so could in particular search for proper nouns.

Here’s an example using the opening lines of Moby Dick.

    import spacy
    nlp = spacy.load("en_core_web_lg")

    doc = nlp("Call me Ishmael. Some years ago—never mind how long precisely—having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. Whenever I find myself growing grim about the mouth; whenever it is a damp, drizzly November in my soul ... I account it high time to get to sea as soon as I can.")

    for tok in doc:
        if tok.pos_ == "PROPN":
            print(tok)

This will print Ishmael and November only. It does not print words at the beginning of a sentence such as Call or Some even though they are capitalized. When spaCy got to the line

Queequeg was George Washington cannibalistically developed.

it detected that Queequeg is a proper noun. Presumably the model can tell this from context, because the word precedes the verb was and not because it knows Queequeg is proper name.

When I changed November to november spaCy was still able to detect that november was a proper noun. When I downcased Ishmael it did not detect that ishmael was a proper noun, presumably because Ishmael is an uncommon name. When I changed the text to “Call me tim” the library did recognize tim as a proper noun.

When I fed spaCy the sentence

I never go as a passenger; nor, though I am something of a salt, do I ever go to sea as a Commodore, or a Captain, or a Cook.

the library thought that Commodore, Captain, and Cook were proper nouns. If I downcase these words, spaCy does not flag them as proper nouns.

When processing the line

For as in this world,head winds are far more prevalent than winds from astern (that is, if you never violate the Pythagorean maxim), so for the most part the Commodore on the quarter-deck gets his atmosphere at second hand from the sailors on the forecastle

spaCy correctly flagged Commodore as a proper noun in this instance. Also, it did not classify Pythagorean as a proper noun; the word is proper but not a noun, i.e. it’s a proper adjective.

TANSTAAFL

My script above has only six lines of code. But it depends on a library that uses a 588 MB language model. [1]

[1] “TANSTAALF” stands for “There ain’t no such thing as a free lunch.” It comes from The Moon is a Harsh Mistress by Heinlein.

Incidentally, when I fed “The term TANSTAAFL comes from The Moon is a Harsh Mistress by Heinlein.” to spaCy, it flagged Harsh and Mistress as proper nouns.

When I fed it “The term TANSTAAFL comes from ‘The moon is a harsh mistress’ by Heinlein.” the library correctly tagged harsh as an adjective and mistress as a (non-proper) noun.

The post Searching for proper nouns first appeared on John D. Cook.

Golden integration

John — Sat, 29 Apr 2023 14:56:37 +0000

Let φ be the golden ratio. The fractional parts of nφ bounce around in the unit interval in a sort of random way. Technically, the sequence is quasi-random.

Quasi-random sequences are like random sequences but better in the sense that they explore a space more efficiently than random sequences. For this reason, Monte Carlo integration (“integration by darts“) can often be made more efficient by replacing random sequences with quasi-random sequence. This post will illustrate this efficiency advantage in one dimension using the fractional parts of nφ.

Here are functions that will generate our integration points.

    from numpy import random, zeros

    def golden_source(n):
        phi = (1 + 5**0.5)/2
        return (phi*n)%1

    def random_source(N):
        return random.random()

We will pass both of these generators as arguments to the following function which saves a running average of function evaluates at the generated points.

    def integrator(f, source, N):
        runs = zeros(N)
        runs[0] = f(source(0))
        for n in range(1, N):
            runs[n] = runs[n-1] + (f(source(n)) - runs[n-1])/n
        return runs

We’ll use as our example integrand f(x) = x² (1 − x)³. The integral of this function over the unit interval is 1/60.

    def f(x):
        return x**2 * (1-x)**3
    exact = 1/60

Now we call our integrator.

    random.seed(20230429)
    N = 1000
    golden_run = integrator(f, golden_source, N)
    random_run = integrator(f, random_source, N)

Now we plot the difference between each run and the exact value of the integral. Both methods start out with wild fluctuations. We leave out the first 10 elements in order to make the error easier to see.

    import matplotlib.pyplot as plt

    k = 10
    x = range(N)
    plt.plot(x[k:], golden_run[k:] - exact)
    plt.plot(x[k:], random_run[k:] - exact)

This produces the following plot.

The integration error using φn − ⌊φn⌋ is so close to zero that it’s hard to observe its progress. So we plot it again, this time taking the absolute value of the integration error and plotting on a log scale.

    plt.plot(x[k:], abs(golden_run[k:] - exact))
    plt.plot(x[k:], abs(random_run[k:] - exact))
    plt.yscale("log")

This produces a more informative plot.

The integration error for the golden sequence is at least an order of magnitude smaller, and often a few orders of magnitude smaller.

The function we’ve integrated has a couple features that make integration using quasi-random sequences (QMC, quasi-Monte Carlo) more efficient. First, it’s smooth. If the integrand is jagged, QMC has no advantage over MC. Second, our integrand could be extended smoothly to a periodic function, i.e. f(0) = f(1) and f′(0) = f′(1). This makes QMC integration even more efficient.

The post Golden integration first appeared on John D. Cook.

Python code to solve Kepler’s equation

John — Wed, 02 Nov 2022 13:00:42 +0000

The previous post looked at solving Kepler’s equation using Newton’s method. The problem with using Newton’s method is that it may not converge when the eccentricity e is large unless you start very close to the solution. As discussed at the end of that post, John Machin came up with a clever way to start close. His starting point is defined as follow.

The variable s is implicitly defined as the root of a cubic polynomial. This could be messy. Maybe there are three real roots and we have to decide which one to use. Fortunately this isn’t the case.

The discriminant of our cubic equation is negative, so there is only one real root. And because our cubic equation for s has no s² term the expression for the root isn’t too complicated.

Here’s Python code to solve Kepler’s equation using Newton’s method with Machin’s starting point.

    from numpy import sqrt, cbrt, pi, sin, cos, arcsin, random
    
    # This will solve the special form of the cubic we need.
    def solve_cubic(a, c, d):
        assert(a > 0 and c > 0)
        p = c/a
        q = d/a
        k = sqrt( q**2/4 + p**3/27 )
        return cbrt(-q/2 - k) + cbrt(-q/2 + k)
    
    # Machin's starting point for Newton's method
    # See johndcook.com/blog/2022/11/01/kepler-newton/
    def machin(e, M):
        n = sqrt(5 + sqrt(16 + 9/e))
        a = n*(e*(n**2 - 1)+1)/6
        c = n*(1-e)
        d = -M
        s = solve_cubic(a, c, d)
        return n*arcsin(s)    
    
    def solve_kepler(e, M):
        "Find E such that M = E - e sin E."
        assert(0 <= e < 1)
        assert(0 <= M <= pi) 
        f = lambda E: E - e*sin(E) - M 
        E = machin(e, M) 
        tolerance = 1e-10 

        # Newton's method 
        while (abs(f(E)) > tolerance):
            E -= f(E)/(1 - e*cos(E))
        return E

To test this code, we’ll generate a million random values of e and M, solve for the corresponding value of E, and verify that the solution satisfies Kepler’s equation.

    random.seed(20221102)
    N = 1_000_000
    e = random.random(N)
    M = random.random(N)*pi
    for i in range(N):
        E = solve_kepler(e[i], M[i])
        k = E - e[i]*sin(E) - M[i]
        assert(abs(k) < 1e-10)
    print("Done")

All tests pass.

Machin’s starting point is very good, and could make an adequate solution on its own if e is not very large and if you don’t need a great deal of accuracy. Let’s illustrate by solving Kepler’s equation for the orbit of Mars with eccentricity e = 0.09341.

Here the maximum error is 0.01675 radians and the average error is 0.002486 radians. The error is especially small for small values of M. When M = 1, the error is only 1.302 × 10⁻⁵ radians.

The post Python code to solve Kepler’s equation first appeared on John D. Cook.

Dump a pickle file to a readable text file

John — Tue, 16 Aug 2022 18:40:17 +0000

I got a data file from a client recently in “pickle” format. I happen to know that pickle is a binary format for serializing Python objects, but trying to open a pickle file could be a puzzle if you didn’t know this.

Be careful

There are a couple problems with using pickle files for data transfer. First of all, it’s a security risk because an attacker could create a malformed pickle file that would cause your system to run arbitrary code. In the Python Cookbook, the authors David Beazley and Brian K. Jones warn

It’s essential that pickle only be used internally with interpreters that have some ability to authenticate one another.

The second problem is that the format could change. Again quoting the Cookbook,

Because of its Python-specific nature and attachment to source code, you probably shouldn’t use pickle as a format for long-term storage. For example, if the source code changes, all of your stored data might break and become unreadable.

Suppose someone gives you a pickle file and you’re willing to take your chances and open it. It’s from a trusted source, and it was created recently enough that the format probably hasn’t changed. How do you open it?

Unpickling

The following code will open the file data.pickle and read it into an object obj.

    import pickle
    obj = pickle.load(open("data.pickle", "rb"))

If the object in the pickle file is very small, you could simply print obj. But if the object is at all large, you probably want to save it to a file rather than dumping it at the command line, and you also want to “pretty” print it than simply printing it.

Pretty printing

The following code will dump a nicely-formatted version of our pickled object to a text file out.txt.

    import pickle
    import pprint

    obj = pickle.load(open("sample_data.pickle", "rb"))

    with open("out.txt", "a") as f:
         pprint.pprint(obj, stream=f)

In my case, the client’s file contained a dictionary of lists of dictionaries. It printed as one incomprehensible line, but it pretty printed as 40,000 readable lines.

Prettier printing

Simon Brunning left a comment suggesting that the json module output is even easier to read.

    import json

    with open("out.txt", "a") as f:
        json.dump(obj, f, indent=2)

And he’s right, at least in my case. The indentation json.dump produces is more what I’d expect, more like what you’d see if you were writing the structure in well-formatted source code.

The post Dump a pickle file to a readable text file first appeared on John D. Cook.

Org-mode as a lightweight notebook

John — Tue, 02 Aug 2022 14:49:24 +0000

You can think of org-mode as simply a kind of markdown, a plain text file that can be exported to fancier formats such as HTML or PDF. It’s a lot more than that, but that’s a reasonable place to start.

Org-mode also integrates with source code. You can embed code in your file and have the code and/or the result of running the code appear when you export the file to another format.

Org-mode as notebook

You can use org-mode as a notebook, something like a Jupyter notebook, but much simpler. An org file is a plain text file, and you can execute embedded code right there in your editor. You don’t need a browser, and there’s no hidden state.

Here’s an example of mixing markup and code:

    The volume of an n-sphere of radius r is 

    $$\frac{\pi^{\frac{n}{2}}}{\Gamma\left(\frac{n}{2} + 1\right)}r^n.$$

    #+begin_src python :session
    from scipy import pi
    from scipy.special import gamma

    def vol(r, n):
        return pi**(n/2)*r**n/gamma(n/2 + 1)

    vol(1, 5)
    #+end_src

If you were to export the file to PDF, the equation for the volume of a sphere would be compiled into a image using LaTeX.

To run the code [1], put your cursor somewhere in the code block and type C-c C-c. When you do, the following lines will appear below your code.

    #+RESULTS:
    : 5.263789013914324

If you think of your org-mode file as primary, and you’re just inserting some code as a kind of scratch area, an advantage of org-mode is that you never leave your editor.

Jupyter notebooks

Now let’s compare that to a Jupyter notebook. Jupyter organizes everything by cells, and a cell can contain markup or code. So you could create a markup cell and enter the exact same introductory text [2].

    The volume of an n-sphere of radius r is 

    $$\frac{\pi^{\frac{n}{2}}}{\Gamma\left(\frac{n}{2} + 1\right)}r^n$$.

When you “run” the cell, the LaTeX is processed and you see the typeset expression rather than its LaTeX source. You can click on the cell to see the LaTeX code again.

Then you would enter the Python code in another cell. When you run the cell you see the result, much as in org-mode. And you could export your notebook to PDF as with org-mode.

File diff

Now suppose we make a couple small changes. We want the n and r in the comment section set in math italic, and we’d like to find the volume of a 5-sphere of radius 2 rather than radius 1. We do this, in Jupyter and in org-mode [3], by putting dollar signs around the “n” and the “r”, and we change vol(1, 5) to vol(2, 5).

Let’s run diff on the two versions of the org-mode file and on the two versions of the Jupyter notebook.

The differences in the org files are easy to spot:

    1c1
    < The volume of an n-sphere of radius r is 
    ---
    > The volume of an \(n\)-sphere of radius \(r\) is 
    12c12
    < vol(1, 5)
    ---
    > vol(2, 5)
    16c16,17
    < : 5.263789013914324
    ---
    > : 168.44124844525837

However, the differences in the Jupyter files are more complicated:

    5c5
    <    "id": "2a1b0bc4",
    ---
    >    "id": "a0a89fcf",
    8c8
    <     "The volume of an n-sphere of radius r is \n",
    ---
    >     "The volume of an $n$-sphere of radius $r$ is \n",
    15,16c15,16
    <    "execution_count": 1,
    <    "id": "888660a2",
    ---
    >    "execution_count": 2,
    >    "id": "1adcd8b1",
    22c22
    <        "5.263789013914324"
    ---
    >        "168.44124844525837"
    25c25
    <      "execution_count": 1,
    ---
    >      "execution_count": 2,
    37c37
    <     "vol(1, 5)"
    ---
    >     "vol(2, 5)"
    43c43
    <    "id": "f8d4d1b0",

There’s a lot of extra stuff in a Jupyter notebook. This is a trivial notebook, and more complex notebooks have more extra stuff. An apparently small change to the notebook can cause a large change in the underlying notebook file. This makes it difficult to track changes in a Jupyter notebook in a version control system.

[1] Before this will work, you have to tell Emacs that Python is one of the languages you want to run inside org-mode. I have the following line in my init file to tell Emacs that I want to be able to run Python, DITAA, R, and Perl.

    (org-babel-do-load-languages 'org-babel-load-languages '((python . t) (ditaa . t) (R . t) (perl . t)))

[2] Org-mode will let you use \[ and \] to bracket LaTeX code for a displayed equation, and it will also let you use $$. Jupyter only supports the latter.

[3] In org-mode, putting dollar signs around variables sometimes works and sometimes doesn’t. And in this example, it works for the “r” but not for the “n”. This is very annoying, but it can be fixed by using $ and $ to enter and leave math mode rather than use a dollar sign for both.

The post Org-mode as a lightweight notebook first appeared on John D. Cook.

Find log normal parameters for given mean and variance

John — Fri, 25 Feb 2022 01:10:49 +0000

Earlier today I needed to solve for log normal parameters that yield a given mean and variance. I’m going to save the calculation here in case I needed in the future or in case a reader needs it. The derivation is simple, but in the heat of the moment I’d rather look it up and keep going with my train of thought.

NB: The parameters μ and σ² of a log normal distribution are not the mean and variance of the distribution; they are the mean and variance of its log.

If m is the mean and v is the variance then

Notice that the square of the m term matches the second part of the v term.

Then

and so

and once you have σ² you can find μ by

Here’s Python code to implement the above.

    from numpy immport log
    def solve_for_log_normal_parameters(mean, variance):
        sigma2 = log(variance/mean**2 + 1)
        mu = log(mean) - sigma2/2
        return (mu, sigma2)

And here’s a little test code for the code above.

    mean = 3.4
    variance = 5.6

    mu, sigma2 = solve_for_log_normal_parameters(mean, variance)

    X = lognorm(scale=exp(mu), s=sigma2**0.5)
    assert(abs(mean - X.mean()) < 1e-10)
    assert(abs(variance - X.var()) < 1e-10)

The post Find log normal parameters for given mean and variance first appeared on John D. Cook.

Spaceship operator in Python

John — Mon, 12 Apr 2021 15:36:20 +0000

Some programming languages, such as Perl, have an infix operator <=> that returns a three-state comparison. The expression

    a <=> b

evaluates to -1, 0, or 1 depending on whether a < b, a = b, or a > b. You could think of <=> as a concatenation of <, =, and >.

The <=> operator is often called the “spaceship operator” because it looks like Darth Vader’s ship in Star Wars.

Python doesn’t have a spaceship operator, but you can get the same effect with numpy.sign(a-b). For example, suppose you wanted to write a program to compare two integers.

You could write

    from numpy import sign
    def compare(x, y):
        cmp = ["equal to", "greater than", "less than"][sign(x-y)]
        print(f"{x} is {cmp} {y}.")

Here we take advantage of the fact that an index of -1 points to the last element of a list.

The sign function will return an integer if its argument is an integer or a float if its argument is a float. The code above will break if you pass in floating point numbers because sign will return −1.0, 0.0, or 1.0. But if you replace sign(x-y) with int(sign(x-y)) it will work for floating point arguments.

Related post: Symbol pronunciation

The post Spaceship operator in Python first appeared on John D. Cook.

Illustrating Gershgorin disks with NumPy

John — Sun, 14 Mar 2021 19:00:21 +0000

Gershgorin’s theorem gives bounds on the locations of eigenvalues for an arbitrary square complex matrix.

The eigenvalues are contained in disks, known as Gershgorin disks, centered on the diagonal elements of the matrix. The radius of the disk centered on the kth diagonal element is the sum of the absolute values of the elements in the kth row, excluding the diagonal element itself.

To illustrate this theorem, we create a 5 by 5 diagonal matrix and add some random noise to it. The diagonal elements are

0, 3 + i, 4 + i, 1 + 5i, 9 + 2i.

The eigenvalues of the diagonal matrix would simply be the diagonal elements. The additional random values pull the eigenvalues away from the diagonal values, but by an amount no more than Gershgorin predicts.

Note that in this example, two of the eigenvalues land in the smallest disk. It’s possible that a disk may not contain any eigenvalues; what Gershgorin guarantees is that the union of all the disks contains the union of all the eigenvalues.

Here’s the Python code that created the graph above.

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(202103014)

N = 5 # dimension of our square matrix

D = np.diag([0, 3 + 1j, 4 + 1j, 1 + 5j, 9 + 2j])
M = np.random.rand(N, N) + D

R = np.zeros(N) # disk radii
for i in range(N):
    R[i] = sum(abs(M[i,:])) - abs(M[i,i])

eigenvalues = np.linalg.eigvals(M)

# Plotting code
fig, ax = plt.subplots()
for k in range(N):
    x, y = M[k,k].real, M[k,k].imag
    ax.add_artist( plt.Circle((x, y), R[k], alpha=0.5) )
    plt.plot(eigenvalues[k].real, eigenvalues[k].imag, 'k+')

ax.axis([-4, 12.5, -4, 9])
ax.set_aspect(1)    
plt.xlabel("$x$")
plt.ylabel("$y$")
plt.title("Gershgorin disks and eigenvalues $x + iy$")

The post Illustrating Gershgorin disks with NumPy first appeared on John D. Cook.

Pareto and Pandas

John — Thu, 11 Mar 2021 16:44:09 +0000

This post muses about what it means to learn a software library. I’ll use Pandas as an example, but the post isn’t just about Pandas.

Suppose you say “I want to learn Pandas.” That implicitly assumes Pandas one thing, and in a sense it is. In another sense Pandas is hundreds of things.

At the top level, the pandas module (version 1.2.0) has 142 things inside.

    >>> import pandas as pd
    >>> len(dir(pd))
    142

The two most important things inside are the Series and DataFrame objects. They each in turn contain hundreds of things.

    >>> len(dir(pd.Series))
    434
    >>> len(dir(pd.DataFrame))
    441

That’s evidence Pandas’ diversity. But here’s evidence of it’s unity: most of the things inside these two objects have the same names.

    >>> s = set(dir(pd.Series))
    >>> d = set(dir(pd.DataFrame))
    >>> len(s.union(d))
    491
    >>> len(s - d)
    50
    >>> len(d - s)
    57

Pandas kinda has a fractal dimension, having both complexity and unity. The best way to think about it is not as one monolithic thing, or as hundreds of isolated things. It’s a coherent, but not perfectly coherent, collection of related things. This is true of all software libraries. Pandas is more coherent than most libraries because it was initially the product of one mind, that of Wes McKinney.

This has a couple implications for what it means to “learn Pandas.” Because Pandas is big, you have to explore it strategically, not exhaustively. And because Pandas is coherent, part of what it means to learn Pandas is to develop a feel for the way Pandas does things.

No one is going to learn Pandas by studying every object, every method on every object, and every argument to every method on every object. It’s too big. That’s also unnecessary.

There’s probably something like a Pareto distribution on the usefulness of features. The most commonly used features are used far, far more often than the most obscure features.

It would be interesting to do some kind of survey to see which features are actually used and how often. But I don’t think that’s practical. The easiest thing to do would be to find some large code base that heavily uses Pandas. But that’s not typical of how Pandas is used. Probably most lines of code using Pandas are scattered over millions of small scripts, much of it not in production code.

A well-designed library makes it possible to make good guesses about functionality you haven’t used. You learn the gestalt of the library. You can always look up API documentation as needed, but you can’t develop an intuition for a library just-in-time.

“Learn Pandas” is a daunting goal, and maybe an impossible goal if by “learn” you mean explore exhaustively. But “learn how to do my common tasks quickly in Pandas” and “develop a feel for how to do things in Pandas” are much smaller tasks.

The post Pareto and Pandas first appeared on John D. Cook.

Broadcasting and functors

John — Wed, 10 Mar 2021 01:48:27 +0000

In my previous post, I looked at the map Δ that takes a column vector to a diagonal matrix. I even drew a commutative diagram, which foreshadows a little category theory.

Suppose you have a function f of a real or complex variable. To an R programmer, if x is a vector, it’s obvious that f(x) means to apply f to every component of a vector. Python (NumPy) works the same way, and calls this broadcasting. To a mathematician, this looks odd. What does the logarithm of a vector, for example, even mean?

As in the previous post, we can use Δ to formalize things. We said that Δ has some nice properties, and in fact we will show it is a functor.

To have a functor, we have to have categories. (Historically, functors came first; categories were defined in order to define functors.) We will define C to be the category of column vectors and M the category of square matrices as before. Or rather, we should say the objects of C are column vectors and the objects of M are square matrices.

Categories need morphisms, functions between objects [1]. We define the morphisms on C to be analytic functions applied componentwise. So, for example, if

z = [1, 2, −3],

then

tan(z) = [tan(1), tan(2), tan(−3)].

The morphisms on M will be analytic functions on square matrices, not applied componentwise but applied by power series. That is, given an analytic function f, we define f of a square matrix X as the result of sticking the matrix X into the power series for f. For an example, see What is the cosine of a matrix?

We said that Δ is a functor. It takes column vectors and turns them into square matrices by putting their contents along the diagonal of a matrix. We gave the example in the previous post that [4, i, π] would be mapped to the matrix with these elements on the diagonal, i.e.

That says what Δ does on objects, but what does it do on morphisms? It takes an analytic function that was applied componentwise to column vectors, and turns it into a function that is applied via its power series to square matrices. That is, starting with a function

we define the morphism f on C by

and the morphism Δ f on M by

where Z is a square matrix.

We can apply f to a column vector, and then apply Δ to turn the resulting vector into a diagonal matrix, or we could apply Δ to turn the vector into a diagonal matrix first, and then apply f (technically, Δf). That is, the follow diagram commutes:

Python example

Applying an analytic function to a diagonal matrix gives the same result as simply applying the function to the elements of the diagonal. But for more general square matrices, this is not the case. We will illustrate this with some Python code.

    import numpy as np
    from scipy.linalg import funm

    d = np.array([1, 2])
    D = np.diag(d)
    M = np.array([[1, np.pi], [2, 0]])

Now let’s look at some output.

    >>> np.sin(d)                     
    array([0.84147098, 0.90929743])   

    >>> np.sin(D)                     
    array([[0.84147098, 0.        ],  
           [0.        , 0.90929743]]) 

    >>> funm(D, np.sin)               
    array([[0.84147098, 0.        ],  
           [0.        , 0.90929743]])

So if we take the sine of d and turn the result into a matrix, we get the same thing as if we turn d into a matrix D and then take the sine of D, either componentwise or as an analytic function (with funm, function of a matrix).

Now let’s look at a general, non-diagonal matrix.

    >>> np.sin(M)
    array([[0.84147099, 0],
       [0.90929743, 0]])

    >>> funm(D, np.sin)
    array([[0.84147098, 0.        ],
           [0.        , 0.90929743]])

Note that the elements in the bottom row are in opposite positions in the two examples.

[1] OK, morphisms are not necessarily functions, but in practice they usually are.

The post Broadcasting and functors first appeared on John D. Cook.

Python triple quote strings and regular expressions

John — Sat, 30 Jan 2021 15:36:02 +0000

There are several ways to quote strings in Python. Triple quotes let strings span multiple lines. Line breaks in your source file become line break characters in your string. A triple-quoted string in Python acts something like “here doc” in other languages.

However, Python’s indentation rules complicate matters because the indentation becomes part of the quoted string. For example, suppose you have the following code outside of a function.

x = """\
abc
def
ghi
"""

Then you move this into a function foo and change its name to y.

def foo():
    y = """\
    abc
    def
    ghi
    """

Now x and y are different strings! The former begins with a and the latter begins with four spaces. (The backslash after the opening triple quote prevents the following newline from being part of the quoted string. Otherwise x and y would begin with a newline.) The string y also has four spaces in front of def and four spaces in front of ghi. You can’t push the string contents to the left margin because that would violate Python’s formatting rules. (Update: Oh yes you can! See Aaron Meurer’s comment below.)

We now give three solutions to this problem.

Solution 1: textwrap.dedent

There is a function in the Python standard library that will strip the unwanted space out of the string y.

import textwrap 

def foo():
    y = """\
    abc
    def
    ghi
    """
    y = textwrap.dedent(y)

This works, but in my opinion a better approach is to use regular expressions [1].

Solution 2: Regular expression with a flag

We want to remove white space, and the regular expression for a white space character is \s. We want to remove one or more white spaces so we add a + on the end. But in general we don’t want to remove all white space, just white space at the beginning of a line, so we stick ^ on the front to say we want to match white space at the beginning of a line.

import re 

def foo():
    y = """\
    abc
    def
    ghi
    """
    y = re.sub("^\s+", "", y)

Unfortunately this doesn’t work. By default ^ only matches the beginning of a string, not the beginning of a line. So it will only remove the white space in front of the first line; there will still be white space in front of the following lines.

One solution is to add the flag re.MULTILINE to the substitution function. This will signal that we want ^ to match the beginning of every line in our multi-line string.

    y = re.sub("^\s+", "", y, re.MULTILINE)

Unfortunately that doesn’t quite work either! The fourth positional argument to re.sub is a count of how many substitutions to make. It defaults to 0, which actually means infinity, i.e. replace all occurrences. You could set count to 1 to replace only the first occurrence, for example. If we’re not going to specify count we have to set flags by name rather than by position, i.e. the line above should be

    y = re.sub("^\s+", "", y, flags=re.MULTILINE)

That works.

You could also abbreviate re.MULTILINE to re.M. The former is more explicit and the latter is more compact. To each his own. There’s more than one way to do it. [2]

Solution 3: Regular expression with a modifier

In my opinion, it is better to modify the regular expression itself than to pass in a flag. The modifier (?m) specifies that in the rest of the regular the ^ character should match the beginning of each line.

    y = re.sub("(?m)^\s+", "", y)

One reason I believe this is better is that moves information from a language-specific implementation of regular expressions into a regular expression syntax that is supported in many programming languages.

For example, the regular expression

    (?m)^\s+

would have the same meaning in Perl and Python. The two languages have the same way of expressing modifiers [3], but different ways of expressing flags. In Perl you paste an m on the end of a match operator to accomplish what Python does with setting flasgs=re.MULTILINE.

One of the most commonly used modifiers is (?i) to indicate that a regular expression should match in a case-insensitive manner. Perl and Python (and other languages) accept (?i) in a regular expression, but each language has its own way of adding modifiers. Perl adds an i after the match operator, and Python uses

    flags=re.IGNORECASE

    flags=re.I

as a function argument.

More on regular expressions

[1] Yes, I’ve heard the quip about two problems. It’s funny, but it’s not a universal law.

[2] “There’s more than one way to do it” is a mantra of Perl and contradicts The Zen of Python. I use the line here as a good-natured jab at Python. Despite its stated ideals, Python has more in common with Perl than it would like to admit and continues to adopt ideas from Perl.

[3] Python’s re module doesn’t support every regular expression modifier that Perl supports. I don’t know about Python’s regex module.

The post Python triple quote strings and regular expressions first appeared on John D. Cook.

Searching Greek and Hebrew with regular expressions

John — Sun, 20 Sep 2020 21:13:38 +0000

According to the Python Cookbook, “Mixing Unicode and regular expressions is often a good way to make your head explode.” It is thus with fear and trembling that I dip my toe into using Unicode with Greek and Hebrew.

I heard recently that there are anomalies in the Hebrew Bible where the final form of a letter is deliberately used in the middle of a word. That made me think about searching for such anomalies with regular expressions. I’ll come back to that shortly, but I’ll start by looking at Greek where things are a little simpler.

Greek

Only one letter in Greek has a variant form at the end of a word. Sigma is written ς at the end of a word and σ everywhere else. The following Python code shows we can search for sigmas and final sigmas in the first sentence from Matthew’s gospel.

    import re

    matthew = "Κατάλογος του γένους του Ιησού Χριστού, γιου του Δαυείδ, γιου του Αβραάμ"
    matches = re.findall(r"\w+ς\b", matthew)
    print("Words ending in sigma:", matches)

    matches = re.findall(r"\w*σ\w+", matthew)
    print("Words with non-final sigmas:", matches)

This prints

    Words ending in sigma: ['Κατάλογος', 'γένους']
    Words with non-final sigmas: ['Ιησού', 'Χριστού']

This shows that we can use non-ASCII characters in regular expressions, at least if they’re Greek letters, and that the metacharacters \w (word character) and \b (word boundary) work as we would hope.

Hebrew

Now for the motivating example. Isaiah 9:6 contains the word “לםרבה” with the second letter (second from right) being the final form of Mem, ם, sometimes called “closed Mem” because the form of the letter ordinarily used in the middle of a word, מ, is not a closed curve [1].

Here’s code to show we could find this anomaly with regular expressions.

    # There are five Hebrew letters with special final forms.
    finalforms = "\u05da\u05dd\u05df\u05e3\u05e5" # ךםןףץ

    lemarbeh = "\u05dc\u05dd\u05e8\u05d1\u05d4" # לםרבה
    m = re.search(r"\w*[" + finalforms + "\w+", lemarbeh)
    if m:
        print("Anomaly found:", m.group(0))

As far as I know, the instance discussed above is the only one where a final letter appears in the middle of a word. And as far as I know, there are no instances of a non-final form being used at the end of a word. For the next example, we will put non-final forms at the end of words so we can find them.

We’ll need a list of non-final forms. Notice that the Unicode value for each non-final form is 1 greater than the corresponding final form. It’s surprising that the final forms come before the non-final forms in numerical (Unicode) order, but that’s how it is. The same is true in Greek: final sigma is U+03c2 and sigma is U+03c3.

    finalforms = "\u05da\u05dd\u05df\u05e3\u05e5" # ךםןףץ
    nonfinal   = "\u05db\u05de\u05e0\u05e4\u05e6" # כמנפצ

We’ll start by taking the first line of Genesis

    genesis = "בראשית ברא אלהים את השמים ואת הארץ"

and introduce errors by replacing each final form with its corresponding non-final form. The code uses a somewhat obscure feature of Python and is analogous to the shell utility tr.

    genesis_wrong = genesis.translate(str.maketrans(finalforms, nonfinal))

(The string method translate does what you would expect, except that it takes a translation table rather than a pair of strings as its argument. The maketrans method creates a translation table from two strings.)

Now we can find our errors.

    anomalies = re.findall(r"\w+[" + nonfinal + r"]\b", genesis_wrong)
    print("Anomalies:", anomalies)

This produced

    Anomalies: ['אלהימ', 'השמימ', 'הארצ']

Note that Python printed the letters left-to-right.

Vowel points

The text above did not contain vowel points. If the text does contain vowel points, these are treated as letters between the consonants.

For example, the regular expression “בר” will not match against “בְּרֵאשִׁית” because the regex engine sees two characters between ב and ר, the dot (“dagesh”) inside ב and the two dots (“sheva”) underneath ב. But the regular expression “ב..ר” does match against “בְּרֵאשִׁית”.

See footnote [1].

Python

I started this post with a warning from the Python Cookbook that mixing regular expressions and Unicode can cause your head to explode. So far my head intact, but I’ve only tested the waters. I’m sure there are dragons in the deeper end.

I should include the rest of the book’s warning. I used the default library re that ships with Python, but the authors recommend if you’re serious about using regular expressions with Unicode,

… you should consider installing the third-party regex library which provides full support for Unicode case folding, as well as a variety of other interesting features, including approximate matching.

[1] When I refer to “בר” as a regular expression, I mean the character sequence ב followed by ר, which your browser will display from right to left because it detects it as characters from a language written from right to left. Written in terms of Unicode code points this would be “\u05d1\u05e8”, the code point for ב followed by the code point for ר.

Similarly, the second regular expression could be written “\u05d1..\u05e8”. The two periods in the middle are just two instances of the regular expression symbol that matches any character. It’s a coincidence that dots (periods) are being used to match dots (the dagesh and the sheva).

The post Searching Greek and Hebrew with regular expressions first appeared on John D. Cook.

Descartes and Toolz

John — Sun, 20 Sep 2020 15:22:47 +0000

I was looking recently at the Python module toolz, a collection of convenience functions. A lot of these functions don’t do that much. They don’t save you much code, but they do make your code more readable by making it more declarative. You may not realize need them until you see them.

For example, there is a function partitionby that breaks up a sequence at the points where a given function’s value changes. I’m pretty sure that function would have improved some code I’ve written recently, making it more declarative than procedural, but I can’t remember what that was.

Although I can’t think of my previous example, I can think of a new one, and that is Descartes’ rule of signs.

Given a polynomial p(x), read the non-zero coefficients in order and keep note of how many times they change sign, either from positive to negative or vice versa. Call that number n. Then the number of positive roots of p(x) either equals n or n minus a positive even number.

For example, suppose

p(x) = 4x⁴ + 3.1x³ − x² − 2x + 6.

The coefficients are 4, 3.1, −1, −2, and 6. The list of coefficients changes signs twice: from positive to negative, and from negative to positive. Here’s a first pass at how you might have Python split the coefficients to look sign changes.

    from toolz import partitionby

    coefficients = [4, 3.1, -1, -2, 6]
    parts = partitionby(lambda x: x > 0, coefficients)
    print([p for p in parts])

This prints

    [(4, 3.1), (-1, -2), (6,)]

The first argument to partitionby an anonymous function that tests whether its argument is positive. When this function changes value, we have a sign alteration. There are three groups of consecutive coefficients that have the same sign, so there are two times the signs change. So our polynomial either has two positive roots or no positive roots. (It turns out there are no positive roots.)

The code above isn’t quite right though, because Descartes said to only look at non-zero coefficients. If we change our anonymous function to

    lambda x: x >= 0

that will work for zeros in the middle of positive coefficients, but it will give a false positive for zeros in the middle of negative coefficients. We can fix the code with a list comprehension. The following example works correctly.

    coefficients = [4, 0, 3.1, -1, 0, -2, 6]
    nonzero = [c for c in coefficients if c != 0]
    parts = partitionby(lambda x: x > 0, nonzero)
    print([p for p in parts])

If our coefficients were in a NumPy array rather than a list, we could remove the zeros more succinctly.

    from numpy import array

    c = array(coefficients)
    parts = partitionby(lambda x: x > 0, c[c != 0])

The function partitionby returns an iterator rather than a list. That’s why we don’t just print parts above. Instead we print [p for p in parts] which makes a list. In applications, it’s often more efficient to have an iterator than a list, generating items if and when they are needed. If you don’t need all the items, you don’t have to generate them all. And even if you do need all the items, you could save memory by not keeping them all in memory at once. I’ll ignore such efficiencies here.

We don’t need the partitions per se, we just need to know how many there are. The example that escapes my mind would have been a better illustration if it needed to do more with each portion than just count it. We could count the number of sign alternations for Descartes rule as follows.

   len([p for p in parts]) - 1

The post Descartes and Toolz first appeared on John D. Cook.

Time spent on the moon

John — Sat, 01 Aug 2020 20:46:48 +0000

This post will illustrate two things: the amount of time astronauts have spent on the moon, and how to process dates and times in Python.

I was curious how long each Apollo mission spent on the lunar surface, so I looked up the timelines for each mission from NASA. Here’s the timeline for Apollo 11, and you can find the timelines for the other missions by making the obvious change to the URL.

Here are the data on when each Apollo lunar module touched down and when it ascended.

    data = [
        ("Apollo 11", "1969-07-20 20:17:39", "1969-07-21 17:54:00"),
        ("Apollo 12", "1969-11-19 06:54:36", "1969-11-20 14:25:47"),
        ("Apollo 14", "1971-02-05 09:18:13", "1971-02-06 18:48:42"),
        ("Apollo 15", "1971-07-30 22:16:31", "1971-08-02 17:11:23"),
        ("Apollo 16", "1972-04-21 02:23:35", "1972-04-24 01:25:47"),
        ("Apollo 17", "1972-12-11 19:54:58", "1972-12-14 22:54:37"),
    ]

Here’s a first pass at a program to parse the dates and times above and report their differences.

    from datetime import datetime, timedelta

    def str_to_datetime(string):
        return datetime.strptime(string, "%Y-%m-%d %H:%M:%S")

    def diff(str1, str2):
        return str_to_datetime(str1) - str_to_datetime(str2)

    for (mission, touchdown, liftoff) in data:
        print(f"{mission} {diff(liftoff, touchdown)}")

This works, but the formatting is unsatisfying.

    Apollo 11 21:36:21
    Apollo 12 1 day, 7:31:11
    Apollo 14 1 day, 9:30:29
    Apollo 15 2 days, 18:54:52
    Apollo 16 2 days, 23:02:12
    Apollo 17 3 days, 2:59:39

It would be easier to scan the output if it were all in hours. So we rewrite our diff function as follows.

    def diff(str1, str2):
        delta = str_to_datetime(str1) - str_to_datetime(str2)
        hours = delta.total_seconds() / 3600
        return round(hours, 2)

Now the output is easier to read.

    Apollo 11 21.61
    Apollo 12 31.52
    Apollo 14 33.51
    Apollo 15 66.91
    Apollo 16 71.04
    Apollo 17 74.99

These durations fall into three clusters, corresponding to the Apollo mission types G, H, and J. Apollo 11 was the only G-type mission. Apollo 12, 13, and 14 were H-type, intended to demonstrate a precise landing and explore the lunar surface. (Apollo 13 had to loop around the moon without landing.) The J-type missions were more extensive scientific missions. These missions included a lunar rover (“moon buggy”) to let the astronauts travel further from the landing site. There were no I-type missions; the objectives of the original I-type missions were merged into the J-type missions.

Incidentally, UNIX systems store times as seconds since 1970-01-01 00:00:00. That means the first two lunar landings were at negative times and the last four were at positive times. More on UNIX time here.

The post Time spent on the moon first appeared on John D. Cook.

A spring, a rubber band, and chaos

John — Sun, 26 Apr 2020 12:39:36 +0000

Suppose you have a mass suspended by the combination of a spring and a rubber band. A spring resists being compressed but a rubber band does not. So the rubber band resists motion as the mass moves down but not as it moves up. In [1] the authors use this situation to motivate the following differential equation:

where

If a = b then we have a linear equation, an ordinary damped, driven harmonic oscillator. But the asymmetry of the behavior of the rubber band causes a and b to be unequal, and that’s what makes the solutions interesting.

For some parameters the system exhibits essentially sinusoidal behavior, but for other parameters the behavior can become chaotic.

Here’s an example of complex behavior.

Here’s the Python code that produced the plot.

    from scipy import linspace, sin
    from scipy.integrate import solve_ivp
    import matplotlib.pyplot as plt

    def pos(x):
        return max(x, 0)

    def neg(x):
        return max(-x, 0)

    a, b, λ, μ = 17, 1, 15.4, 0.75

    def system(t, z):
        y, yp = z # yp = y'
        return [yp, 10 + λ*sin(μ*t) - 0.01*yp - a*pos(y) + b*neg(y)]

    t = linspace(0, 100, 300)

    sol = solve_ivp(system, [0, 100], [1, 0], t_eval=t)
    plt.plot(sol.t, sol.y[0])
    plt.xlabel("$t$")
    plt.ylabel("$y$")

In a recent post I said that I never use non-ASCII characters in programming, so in the code above I did. In particular, it was nice to use λ as a variable; you can’t use lambda as a variable name because it’s a reserved keyword in Python.

Update: Here’s a phase portrait for the same system.

More efficient way to sum a list comprehension

John — Wed, 15 Jan 2020 16:17:16 +0000

List comprehensions in Python let you create a list declaratively, much like the way you would describe the set in English. For example,

    [x**2 for x in range(10)]

creates a list of the squares of the numbers 0 through 9.

If you wanted the sum of these numbers, you could of course say

    sum( [x**2 for x in range(10)] )

but in this case the brackets are unnecessary. The code is easier to read and more efficient if you omit the brackets.

    sum( x**2 for x in range(10) )

With the brackets, Python constructs the list first then sums the elements. Without the brackets, you have a generator expression. Python will sum the values as they’re generated, not saving all the values in memory. This makes no difference for a short list as above, but with a large list comprehension the generator expression could be more efficient.

The post More efficient way to sum a list comprehension first appeared on John D. Cook.

Area of sinc and jinc function lobes

John — Tue, 31 Dec 2019 15:23:05 +0000

Someone left a comment this morning on my blog post on sinc and jinc integrals regarding the area of the lobes.

It would be nice to have the values of integrals of each lobe, i.e. integrals between 0 and multiples of pi. Anyone knows of such a table?

This post will include Python code to address that question. (Update: added asymptotic approximation. See below.)

First, let me back up and explain the context. The sinc function is defined as [1]

sinc(x) = sin(x) / x

and the jinc function is defined analogously as

jinc(x) = J₁(x) / x,

substituting the Bessel function J₁ for the sine function. You could think of Bessel functions as analogs of sines and cosines. Bessel functions often come up when vibrations are described in polar coordinates, just as sines and cosines come up when using rectangular coordinates.

Here’s a plot of the sinc and jinc functions:

The lobes are the regions between crossings of the x-axis. For the sinc function, the lobe in the middle runs from −π to π, and for n > 0 the nth lobe runs from nπ to (n+1)π. The zeros of Bessel functions are not uniformly spaced like the zeros of the sine function, but they come up in application frequently and so it’s easy to find software to compute their locations.

First of all we’ll need some imports.

    from scipy import sin, pi
    from scipy.special import jn, jn_zeros
    from scipy.integrate import quad

The sinc and jinc functions are continuous at zero, but the computer doesn’t know that [2]. To prevent division by zero, we return the limiting value of each function for very small arguments.

    def sinc(x):
        return 1 if abs(x) < 1e-8 else sin(x)/x

    def jinc(x):
        return 0.5 if abs(x) < 1e-8 else jn(1,x)/x

You can show via Taylor series that these functions are exact to the limits of floating point precision for |x| < 10⁻⁸.

Here’s code to compute the area of the sinc lobes.

    def sinc_lobe_area(n):
        n = abs(n)
        integral, info = quad(sinc, n*pi, (n+1)*pi)
        return 2*integral if n == 0 else integral

The corresponding code for the jinc function is a little more complicated because we need to compute the zeros for the Bessel function J₁. Our solution is a little clunky because we have an upper bound N on the lobe number. Ideally we’d work out an asymptotic value for the lobe area and compute zeros up to the point where the asymptotic approximation became sufficiently accurate, and switch over to the asymptotic formula for sufficiently large n.

    def jinc_lobe_area(n):
        n = abs(n)
        assert(n < N)
        integral, info = quad(jinc, jzeros[n-1], jzeros[n])
        return 2*integral if n == 0 else integral

Note that the 0th element of the array returned by jn_zeros is the first positive zero of J₁; it doesn’t include the zero at the origin.

For both sinc and jinc, the even numbered lobes have positive area and the odd numbered lobes have negative area. Here’s a plot of the absolute values of the lobe areas.

Asymptotic results

We can approximate the area of the nth lobe of the sinc function by using a midpoint approximation for 1/x. It works out that the area is asymptotically equal to

We can do a similar calculation for the area of the nth jinc lobe, starting with the asymptotic approximation for jinc given here. We find that the area of the nth lobe of the jinc function is asymptotically equal to

To get an idea of the accuracy of the asymptotic approximations, here are the results for n=100.

    sinc area:      0.00633455
    asymptotic:     0.00633452
    absolute error: 2.97e-8
    relative error: 4.69e-6

    jinc area:      0.000283391
    asymptotic:     0.000283385
    absolute error: 5.66e-9
    relative error: 2.00e-5

Minimizing context switching between shell and Python

John — Mon, 30 Dec 2019 14:05:25 +0000

Sometimes you’re in the flow using the command line and you’d like to briefly switch over to Python without too much interruption. Or it could be the other way around: you’re in the Python REPL and need to issue a quick shell command.

One solution would be to run your shell and Python session in different terminals, but let’s assume that for whatever reason you’re working in just one terminal. For example, maybe you want the output of a shell command to be visually close when you run Python, or vice versa.

Calling Python from shell

You can run a Python one-liner from the shell by calling Python with the -c option. For example,

    $ python -c "print(3*7)"
    21

I hardly ever do this because I want to run more than a one-liner. What I find more useful is to launch Python with the -q option to suppress all the start-up verbiage and simply bring up a prompt.

    $ python -q
    >>>>

Calling shell from Python

If you run Python with the ipython command rather than the default python you get much better shell integration. IPython let’s you type a shell command at any point simply by preceding it with a !. For example, the following command tells us this is the 364th day of the year.

    In [1]: ! date +%j
    364

You can run some of the most common shell commands, such as cd and ls without even a bang prefix. These are “magic” commands that do what you’d expect if you forgot for a moment that you’re in a Python REPL rather than a command shell.

    In [2]: cd ..
    Out[2]: '/mnt/c/Users'

IPython also supports other forms of shell integration such as capturing the output of a shell command as a Python variable, or using a Python variable as an argument to a shell command.

More on context switching and shells

The post Minimizing context switching between shell and Python first appeared on John D. Cook.

Calculating the period of Van der Pol oscillators

John — Thu, 26 Dec 2019 12:35:10 +0000

A few days ago I wrote about how to solve differential equations with SciPy’s ivp_solve function using Van der Pol’s equation as the example. Van der Pol’s equation is

The parameter μ controls the amount of nonlinear damping. For any initial condition, the solution approach a periodic solution. The limiting periodic function does not depend on the initial condition [1] but does depend on μ. Here are the plots for μ = 0, 1, and 2 from the previous post.

A couple questions come to mind. First, how quickly do the solutions become periodic? Second, how does the period depend on μ? To address these questions, we’ll use an optional argument to ivp_solve we didn’t need in the earlier post.

Using events in ivp_solve

For ivp_solve an event is a function of the time t and the solution y whose roots the solver will report. To determine the period, we’ll look at where the solution is zero; our event function is trivial since we want to find the roots of the solution itself.

Recall from the earlier post that we cast our second order ODE as a pair of first order ODEs, and so our solution is a vector, the function x and its derivative. So to find roots of the solution, we look at what the solver sees as the first component of the solver. So here’s our event function:

    def root(t, y): return y[0]

Let’s set μ = 2 and find the zeros of the solution over the interval [0, 40], starting from the initial condition x(0) = 1, x‘(0) = 0.

    mu = 2
    sol = solve_ivp(vdp, [0, 40], [1, 0], events=root)
    zeros = sol.t_events[0]

Here we reuse the vdp function from the previous post about the Van der Pol oscillator.

To estimate the period of the limit cycle we look at the spacing between zeros, and how that spacing is changing.

    spacing = zeros[1:] - zeros[:-1]
    deltas = spacing[1:] - spacing[:-1]

If we plot the deltas we see that the zero spacings quickly approach a constant value. Zero crossings are half periods, so the period of the limit cycle is twice the limiting spacing between zeros.

Theoretical results

If μ = 0 the Van der Pol oscillator reduces to a simple harmonic oscillator and the period is 2π. As μ increases, the period increases.

For relatively small μ we can calculate the period as above, but as μ increases this becomes more difficult numerically [2]. But one can easily show that the period is asymptotically

T ~ (3 − 2 log 2) μ

as μ goes to infinity. A more refined estimate due to Mary Cartwright is

T ~ (3 − 2 log 2) μ + 2π/μ^1/3

for large μ.

Solving Van der Pol equation with ivp_solve

John — Sun, 22 Dec 2019 23:57:16 +0000

Van der Pol’s differential equation is

The equation describes a system with nonlinear damping, the degree of nonlinearity given by μ. If μ = 0 the system is linear and undamped, but as μ increases the strength of the nonlinearity increases. We will plot the phase portrait for the solution to Van der Pol’s equation in Python using SciPy’s new ODE solver ivp_solve.

The function ivp_solve does not solve second-order systems of equations directly. It solves systems of first-order equations, but a second-order differential equation can be recast as a pair of first-order equations by introducing the first derivative as a new variable.

Since y is the derivative of x, the phase portrait is just the plot of (x, y).

If μ = 0, we have a simple harmonic oscillator and the phase portrait is simply a circle. For larger values of μ the solutions enter limiting cycles, but the cycles are more complicated than just circles. These limiting cycles are periodic attractors: every non-trivial solution converges to the limit cycle.

Here’s the Python code that made the plot.

from scipy import linspace
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt

def vdp(t, z):
    x, y = z
    return [y, mu*(1 - x**2)*y - x]

a, b = 0, 10

mus = [0, 1, 2]
styles = ["-", "--", ":"]
t = linspace(a, b, 500)

for mu, style in zip(mus, styles):
    sol = solve_ivp(vdp, [a, b], [1, 0], t_eval=t)
    plt.plot(sol.y[0], sol.y[1], style)
  
# make a little extra horizontal room for legend
plt.xlim([-3,3])    
plt.legend([f"$\mu={m}$" for m in mus])
plt.axes().set_aspect(1)

To plot the solutions as a function of time, rather than plotting phase portraits, change the line

    plt.plot(sol.y[0], sol.y[1], style)

    plt.plot(sol.t, sol.y[0], style)

and comment out the line setting xlim This gives the following plot.

Data Science and Star Science

John — Sun, 01 Dec 2019 04:11:53 +0000

I recently got a review copy of Statistics, Data Mining, and Machine Learning in Astronomy. I’m sure the book is especially useful to astronomers, but those of us who are not astronomers could use it as a survey of data analysis techniques, especially using Python tools, where all the examples happen to come from astronomy. It covers a lot of ground and is pleasant to read.

The post Data Science and Star Science first appeared on John D. Cook.

Generating Python code from SymPy

John — Thu, 31 Oct 2019 15:18:08 +0000

Yesterday I wrote about Householder’s higher-order generalizations of Newton’s root finding method. For n at least 2, define

and iterate H_n to find a root of f(x). When n = 2, this is Newton’s method. In yesterday’s post I used Mathematica to find expressions for H₃ and H₄, then used Mathematica’s FortranForm[] function to export Python code. (Mathematica doesn’t have a function to export Python code per se, but the Fortran syntax was identical in this case.)

Aaron Muerer pointed out that it would have been easier to generate the Python code in Python using SymPy to do the calculus and labdify() to generate the code. I hadn’t heard of lambdify before, so I tried out his suggestion. The resulting code is nice and compact.

    from sympy import diff, symbols, lambdify

    def f(x, a, b):
        return x**5 + a*x + b

    def H(x, a, b, n):
        x_, a_, b_ = x, a, b
        x, a, b = symbols('x a b')
        expr = diff(1/f(x,a,b), x, n-2) / \
               diff(1/f(x,a,b), x, n-1)
        g = lambdify([x,a,b], expr)
        return x_ + (n-1)*g(x_, a_, b_)

This implements all the H_n at once. The previous post implemented three of the H_n separately.

The first couple lines of H require a little explanation. I wanted to use the same names for the numbers that the function H takes and the symbols that SymPy operated on, so I saved the numbers to local variables.

This code is fine for a demo, but in production you’d want to generate the function g once (for each n) and save the result rather than generating it on every call to H.

The post Generating Python code from SymPy first appeared on John D. Cook.

Calling Python from Mathematica

John — Fri, 19 Apr 2019 00:42:21 +0000

The Mathematica function ExternalEvalute lets you call Python from Mathematica. However, there are a few wrinkles.

I first pasted in an example from the Mathematica documentation and it failed.

    ExternalEvaluate[
        "Python", 
        {"def f(x): return x**2", "f(3)"}
    ]

It turns out you (may) have to tell Mathematica where to find Python. I ran the following, tried again, and the example worked.

    RegisterExternalEvaluator[
        "Python", 
        "C:\\bin\\Anaconda3\\python.EXE"
    ]

You can also run Python with NumPy loaded using

    ExternalEvaluate["Python-NumPy", … ]

except that didn’t work the first time either. You have to register a Python-NumPy evaluator separately. So I ran

    RegisterExternalEvaluator[
        "Python-NumPy", 
        "C:\\bin\\Anaconda3\\python.EXE"
    ]

and then tried again calling Python code using NumPy. But then there’s the question of how it imports NumPy. Does it simply run import numpy, or maybe from numpy import *, or maybe import numpy as np? It turns out the first possibility is what happens. So to print pi from NumPy, your code string needs to be numpy.pi.

You don’t need to use Python-NumPy if you just do your own importing. For example, this code returns π².

    ExternalEvaluate[
        "Python", 
        "import numpy; x = numpy.pi; x**2"
    ]

And you can import any library you’d like, not just NumPy.

    ExternalEvaluate[
        "Python", 
        "from sympy import isprime; isprime(7)"
    ]

Everything above applies to Mathematica 11.3 and Mathematica 12.

The post Calling Python from Mathematica first appeared on John D. Cook.

Squircle perimeter and the isoparametric problem

John — Thu, 11 Apr 2019 11:25:40 +0000

If you have a fixed length of rope and you want to enclose the most area inside the rope, make it into a circle. This is the solution to the so-called isoparametric problem.

Dido’s problem is similar. If one side of your bounded area is given by a straight line, make your rope into a semi-circle.

This post looks at a similar problem for a squircle. Peter Panholzer mentioned the problem of finding the squircle exponent that led to the largest area in proportion to its arclength. This sounds like the isoparametric problem, but it’s not.

The isoparametric problem holds perimeter constant and lets the shape enclosed vary, maximizing the area. The question here is to hold the axes constant and maximize the ratio of the area to the perimeter. Panholzer reports the maximum occurs at p = 4.39365.

Computing the perimeter

The volume of a squircle can be found in closed form, and I’ve mentioned the equation a few times, for example here. The perimeter, however, cannot be found in closed form, as far as I know, except for special exponents.

We can solve for y as a function of x and find the arclength using the formula taught in calculus courses. However, the derivative of y has a singularity at x = 1. By switching to polar coordinates, we can find arclength in terms of an integrand with no singularities. We can also simplify things a little by computing the total arclength as 4 times the arclength in the first quadrant; this avoids having to take absolute values.

The following Python code computes the perimeter and the ratio of the area to the perimeter.

    from scipy import sin, cos, pi
    from scipy.integrate import quad
    from scipy.special import gamma
    
    def r(t, p):
        return (cos(t)**p + sin(t)**p)**-(1/p)
    
    def drdt(t, p):
        deriv = (cos(t)**p + sin(t)**p)**(-1-1/p)
        deriv *= cos(t)**(p-1)*sin(t) - sin(t)**(p-1)*cos(t)
        return deriv
    
    def integrand(t, p):
        return (drdt(t,p)**2 + r(t,p)**2)**0.5
    
    def arclength(p):
        integral = quad(lambda t: integrand(t,p), 0, pi/2)[0]
        return 4*integral
    
    def area(p):
        return 4*gamma(1 + 1/p)**2 / gamma(1 + 2/p)
    
    def ratio(p):
        return area(p) / arclength(p)

Basic geometry tells us the ratio is 1/2 when p = 2 and we have a circle. The ratio is also 1/2 when p = ∞, i.e. when we have a square. We can use the code above to find that the ratio when p = 0.528627, so there is at least one local maximum for values of p between 2 and ∞.

Here’s a plot of the ratio of area to perimeter as a function of p.

The plot is quite flat for large values of p, but if we zoom in we can see that there is a local maximum near 4.4.

When I find the maximum of the ratio function using Brent’s method (scipy.optimize.brent) I find p = 4.39365679, which agrees with the value Panholzer stated.

Here’s a plot of the squircle corresponding to this value of p.

Back to the isoparametric problem

Now why doesn’t this contradict the isoparametric problem? Area scales quadratically but perimeter scales linearly. If you don’t hold perimeter constant, you can find a larger ratio of area to volume by making the perimeter larger. And that’s what we’ve done. When p = 2, we have a unit circle, with area π and perimeter 2π. When p = 4.39365679 we have area 3.750961567 and perimeter 7.09566295. If we were to take a circle with the same perimeter, it would have area 4.00660097, larger than the squircle we started with.

Python | John D. Cook

Converting very long strings to integers in Python

Monero’s elliptic curve

Related posts

Converting between quaternions and rotation matrices

Accounting for degrees of freedom

Python code

Random testing

How hard is constraint programming?

Factored random numbers

Python code for means

Searching for proper nouns

NLP with spaCy

TANSTAAFL

Related posts

Golden integration

Related posts

Python code to solve Kepler’s equation

Dump a pickle file to a readable text file

Be careful

Unpickling

Pretty printing

Prettier printing

Related posts

Org-mode as a lightweight notebook

Org-mode as notebook

Jupyter notebooks

File diff

Related posts

Find log normal parameters for given mean and variance

Related posts

Spaceship operator in Python

Illustrating Gershgorin disks with NumPy

Pareto and Pandas

Related posts

Broadcasting and functors

Python example

Python triple quote strings and regular expressions

Solution 1: textwrap.dedent

Solution 2: Regular expression with a flag

Solution 3: Regular expression with a modifier

More on regular expressions

Searching Greek and Hebrew with regular expressions

Greek

Hebrew

Vowel points

Python

Related posts

Descartes and Toolz

Related posts

Time spent on the moon

Related posts

A spring, a rubber band, and chaos

More posts on differential equations

More efficient way to sum a list comprehension

Area of sinc and jinc function lobes

Asymptotic results

More signal processing posts

Minimizing context switching between shell and Python

Calling Python from shell

Calling shell from Python

More on context switching and shells

Calculating the period of Van der Pol oscillators

Using events in ivp_solve

Theoretical results

More VdP posts

Solving Van der Pol equation with ivp_solve

More dynamical system posts

Data Science and Star Science

Generating Python code from SymPy

Calling Python from Mathematica

Squircle perimeter and the isoparametric problem

Computing the perimeter

Back to the isoparametric problem

More squircle posts