How to plot the spiral of Theodorus

You may have seen the spiral of Theodorus. It sticks a sequence of right triangles together to make a sort of spiral. Each triangle has a short side of length 1, and the hypotenuse of each triangle becomes the long leg of the next triangle as shown below.

Spiral of Theodorus

How would you plot this spiral? At each step, you need to draw a segment of length 1, perpendicular to the hypotenuse of the previous triangle. There are two perpendicular directions, and you want to choose the one that moves counterclockwise.

If we step outside the xy plane, we can compute the cross product of the unit vector in the z direction with the vector (x, y). The cross product will be perpendicular to both, and by the right-hand rule, it will point in the counterclockwise direction.

The cross product of (0, 0, 1) and (x, y, 0) is (-y, x, 0), so the direction we want to go in the xy plane is (-y, x). We divide this vector by its length to get a vector of length 1, then add it to our previous point. [1]

Here’s Python code to draw the spiral.

    import matplotlib.pyplot as plt
    def next_vertex(x, y):
        h = (x**2 + y**2)**0.5
        return (x - y/h, y + x/h)
    # base of the first triangle
    plt.plot([0, 1], [0, 0])
    N = 17
    x_old, y_old = 1, 0
    for n in range(1, N):
        x_new, y_new = next_vertex(x_old, y_old)
        # draw short side
        plt.plot([x_old, x_new], [y_old, y_new])
        # draw hypotenuse
        plt.plot([0, x_new], [0, y_new])
        x_old, y_old = x_new, y_new

If you’re not familiar with the plot function above, you might expect the two arguments to plot to be points. But the first argument is a list of x coordinates and the second a list of y coordinates.

Update: See the next post for how the hypotenuse angles fill in a circle.

Related posts

[1] Here’s an alternative derivation using complex numbers.

Label the current vertex in the complex plane zn. Then zn/|zn| has length 1 and points in the same direction from the origin. Multiplying this point by i rotates it a quarter turn counterclockwise, so

zn+1 = zn + i zn/|zn|.

Encryption as secure as factoring

RSA encryption is based on the assumption that factoring large integers is hard. However, it’s possible that breaking RSA is easier than factoring. That is, the ability to factor large integers is sufficient for breaking RSA, but it might not be necessary.

Two years after the publication of RSA, Michael Rabin created an alternative that is provably as hard to break as factoring integers.

Like RSA, Rabin’s method begins by selecting two large primes, p and q, and keeping them secret. Their product n = pq is the public key. Given a message m in the form of a number between 0 and n, the cipher text is simply

c = m² mod n.

Rabin showed that if you can recover m from c, then you can factor n.

However, Rabin’s method does not decrypt uniquely. For every cipher text c, there are four possible clear texts. This may seem like a show stopper, but there are ways to get around it. You could pad messages with some sort of hash before encryption so that is extremely unlikely that more than one of the four decryption possibilities will have the right format. It is also possible to make the method uniquely invertible by placing some restrictions on the primes used.

I don’t know that Rabin’s method has been used in practice. I suspect that the assurance that attacks are as hard as factoring integers isn’t valuable enough to inspire people to put in the effort to harden the method for practical use [1].

There’s growing suspicion that factoring may not be as hard as we’ve thought. And we know that if large scale quantum computing becomes practical, factoring integers will be easy.

My impression is that researchers are more concerned about a breakthrough in factoring than they are about a way to break RSA without factoring [2, 3].

Related posts

[1] One necessary step in practical implementation of Rabin’s method would be to make sure m &gt: √n. Otherwise you could recover m by taking the integer square root rather than having to take a modular square root. The former is easy and the latter is hard.

[2] There are attacks against RSA that do not involve factoring, but they mostly exploit implementation flaws. They are not a frontal assault on the math problem posed by RSA.

[3] By “breaktrhough” I meant a huge improvement in efficiency. But another kind of breaktrough is conceivable. Someone could prove that factoring really is hard (on classical computers) by establishing a lower bound. That seems very unlikely, but it would be interesting. Maybe it would spark renewed interest in Rabin’s method.

Accelerating convergence with Aitken’s method

The previous post looked at Euler’s method for accelerating the convergence of a slowly converging alternating series. Both hypotheses are necessary. The signs must alternate between terms, and applying the method to a series that is already converging quickly can slow down convergence.

Aitken’s method

This post looks at Aitken’s method for speeding up the convergence of a series. It does not matter whether the series is an alternating series or not. Aitken’s method works best when the series is converging approximately like a geometric series, which is often true for power series of analytic functions.

Aitken acceleration estimates the sum of a series by taking the last three partial sums and extrapolating to where the partial sums appear to be going. That is, it estimates the sum as

s_{n+2} - \frac{(s_{n+2} - s_{n+1})^2}{s_{n+2} - 2s_{n+1} + s_n}

We’ll use the exponential function as our example, estimating exp(0.3) first by using six terms of the Taylor series for exp(x) and then by applying Aitken’s method.


Here’s a little Python code to carry out our example.

from math import exp, factorial
from numpy import cumsum

def aitken(s0, s1, s2):
    return s2 - (s2 - s1)**2/(s2 - 2*s1 + s0)

def exp_partial_sums(x, N):
    terms = [x**n/factorial(n) for n in range(N)]
    return cumsum(terms)

x, N = 0.3, 6
partial = exp_partial_sums(x, N)

# estimate taking the last partial sum
p = partial[-1]

# estimate applying Aitken acceleration
a = aitken( partial[-3], partial[-2], partial[-1] )

# error analysis
error_p = exp(x) - p
error_a = exp(x) - a
print(error_p, error_a, error_p/error_a)

This says the error simply using partial sums is 1.0576e-06. The error using Aitken acceleration is -2.3498e-07, which is 4.5 times smaller.

If we go back to the example of the Struve function in the previous post, the approximation error using Aitken acceleration is about 3 times smaller than simply using partial sums. So Aikten acceleration would have been more appropriate than Euler acceleration.

When to use

Why would you use an acceleration method rather than just adding up more terms of the series. The latter might be the thing to do, depending on context. If the terms of the series are expensive to calculate, acceleration will be more economical since it only requires a few arithmetic operations. If the terms of your series involve special functions, the additional cycles needed to apply Aitkin acceleration are negligible.

Sometimes you only have a limited number of terms, such as when you’ve derived the first few terms from a hand calculation. For example, maybe you’ve computed a series solution to a differential equation. In that case, Aitken’s method lets you squeeze a little more accuracy out of the terms you have.

Related posts

Accelerating an alternating series

The most direct way of computing the sum of an alternating series, simply computing the partial sums in the terms get small enough, may not be the most efficient. Euler figured this out in the 18th century.

For our demo we’ll evaluate the Struve function defined by the series

H_\nu(z) = (z/2)^{\nu + 1} \sum_{k=0}^\infty (-1)^k \frac{(z/2)^{2k}}{\Gamma\left(k + 3/2\right ) \, \Gamma\left(k + \nu + 3/2 \right )}

Note that the the terms in the series are relatively expensive to evaluate since each requires evaluating a gamma function. Euler’s acceleration method will be inexpensive relative to computing the terms it takes as input.

Here’s what we get by evaluating the first partial sums for H1.2(3.4):


So if we were to stop here, we’d report 1.11332 as the value of H1.2(3.4).

Next let’s see what we’d get using Euler’s transformation for alternating series. I’ll full precision in my calculations internally but only displaying four digits to save horizontal space that we’ll need shortly.

2.3474 1.5099 
0.6723 0.9340 
1.1957 1.1497 
1.1037 1.1089  
1.1141 1.1137

Now we repeat the process again, taking averages of consecutive terms in each column to produce the next column.

2.3474 1.5099 1.2219 1.1319 1.1087 1.1058
0.6723 0.9340 1.0418 1.0856 1.1029 
1.1957 1.1497 1.1293 1.1203 
1.1037 1.1089 1.1113 
1.1141 1.1137 

The terms across the top are the Euler approximations to the series. The final term 1.1058 is the Euler approximation based on all six terms. And it is worse than what we get from simply taking partial sums!

The exact value is 1.11337… and so the sixth partial sum was accurate to four decimal places, but our clever method was only good to one decimal place.

What went wrong?! Euler’s method is designed to speed up the convergence of slowly converging alternating series. But our series converges pretty quickly because it has two terms in the denominator that grow like factorials. When you apply acceleration when you don’t need to, you can make things worse. Since our series was already converging quickly, we would have done better to use Aitken acceleration, the topic of the next post.

But Euler’s method works well when it’s needed. For example, let’s look at the slowly converging series

π = 4 – 4/3 + 4/5 – 4/7 + 4/9 – …

Then we get the following array.

4.0000 3.3333 3.200o 3.1619 3.1492 3.1445
2.6666 3.0666 3.1238 3.1365 3.1399 
3.4666 3.1809 3.1492 3.1434
2.8952 3.1174 3.1376 
3.3396 3.1578

The sequence of partial sums is along the left side, and the series of Euler terms is across the top row. Neither gives a great approximation to π, but the approximation error using Euler’s acceleration method is 57 times smaller than simply using the partial sums.

Related posts

Data breach trends

Are data breaches becoming more or less common? This post gives a crude, back-of-the-envelope calculation to address the question. We won’t look at number of breaches per se but number of records breached.

There’s a terrific visualization of data breach statistics at Information is Beautiful, and they share their data here. Note that the data only includes breaches of over 30,000 records.

Small piece of data breach visualization from Information is Beautiful

By using their data, we can look at the total number of records breached each year. The figure for 2019 is projected: since the data stop in June of this year, I’ve assumed there will be as many breaches in the second half of 2019 as the first half.

Note that records breached is measured in billions.

Number of records breached per year

Need to respond to a data incident/breach? We can help.

Beating the odds on the Diffie-Hellman decision problem

There are a couple variations on the Diffie-Hellman problem in cryptography: the computation problem (CDH) and the decision problem (DDH). This post will explain both and give an example of where the former is hard and the latter easy.

The Diffie-Hellman problems

The Diffie-Hellman problems are formulated for an Abelian group. The main group we have in mind is the multiplicative group of non-zero integers modulo a large prime p. But we start out more generally because the Diffie-Hellman problems are harder over some groups than others as we will see below.

Let g be the generator of a group. When we think of the group operation written as multiplication, this means that every element of the group is a power of g.

Computational Diffie-Hellman problem

Let a and b be two integers. Given ga and gb, the CDH problem is to compute gab. Note that the exponent is ab and not a+b. The latter would be easy: just multiply ga and gb.

To compute gab we apparently need to know either a or b, which we are not given. Solving for either exponent is the discrete logarithm problem, which is impractical for some groups.

It’s conceivable that there is a way to solve CDH without solving the discrete logarithm problem, but at this time the most efficient attacks on CDH compute discrete logs.

Diffie-Hellman key exchange

Diffie-Hellman key exchange depends on the assumption that CDH is a hard problem.

Suppose Alice and Bob both agree on a generator g, and select private keys a and b respectively. Alice sends Bob ga and Bob sends Alice gb. This doesn’t reveal their private keys because the discrete logarithm problem is (believed to be) hard. Now both compute gab and use it as their shared key. Alice computes gab by raising gb to the power a, and Bob computes gab by raising ga to the power b.

If someone intercepted the exchange between Alice and Bob, they would possess ga and gb and would want to compute gab. This the the CDH.

When working over the integers modulo a large prime p, CDH is hard if p-1 has a large factor, such as when p – 1 = 2q for a prime q. If p-1 has only small factors, i.e. if p-1 is “smooth”, then the discrete logarithm problem is tractable via the Silver-Pohlig-Hellman algorithm [1]. But if p is large and p-1 has a large prime factor, CDH is hard as far as we currently know.

Decision Diffie-Hellman problem

The DDH problem also starts with knowledge of ga and gb. But instead of asking to compute gab it asks whether one can distinguish gab from gc for a randomly chosen c with probability better than 0.5.

Clearly DDH is weaker than CDH because if we can solve CDH we know the answer to the DDH question with certainty. Still, the DDH problem is believed to be hard over some groups, such as certain elliptic curves, but not over other groups, as illustrated below.

DDH for multiplicative group mod p

Suppose we’re working in the multiplicative group of non-zero integers modulo a prime p. Using Legendre symbols, which are practical to compute even for very large numbers, you can determine which whether ga is a square mod p, which happens if and only if a is even. So by computing the Legendre symbol for ga mod p, we know the parity of a. Similarly we can find the parity of b, and so we know the parity of ab. If ab is odd but gc is a square mod p, we know that the answer to the DDH problem is no. Similarly if ab is even but gc is not a square mod p, we also know that the answer to the DDH problem is no.

Since half the positive integers mod p are squares, we can answer the DDH problem with certainty half the time by the parity argument above. If our parity argument is inconclusive we have to guess. So overall we can answer he DDH problem correctly with probability 0.75.

Related posts

[1] As is common when you have a lot of names attached to a theorem, there were multiple discoveries. Silver discovered the algorithm first, but Pohlig and Hellman discovered it independently and published first.

Magic square links and errata

Someone pointed out that what I called a knight’s tour magic square is technically a semi-magic square: the rows and columns add up to the same constant, but the diagonals do not.

It turns out there are no strict magic squares formed by knight’s tours. This was proved in 2003. See a news article here.

Related posts

Quaternion reference in the Vulgate

To contemporary ears “quaternion” refers to a number system discovered in the 19h century, but there were a couple precedents. Both refer to something related to a group of four things, but there is no relation to mathematical quaternions other than that they have four dimensions.

I’ve written before about Milton’s use of the word in Paradise Lost. Milton is alluding to the four elements of antiquity: air, earth, fire, and water.

I recently found out the word quaternion appears in the Latin Vulgate as well. Acts 12:4 records that Herod had Peter guarded by four squads of four soldiers each, “quattuor quaternionibus militum.”

Update: Thanks to Phil for pointing out in the comments that quaternion appears in the King James as well, “four quaternions of soldiers.”

More quaternion posts

Fame, difficulty, and usefulness

Pierre Fermat is best known for two theorems, dubbed his “last” theorem and his “little” theorem. His last theorem is famous, difficult to prove, and useless. His little theorem is relatively arcane, easy to prove, and extremely useful.

There’s little relation between technical difficulty and usefulness.

Fermat’s last theorem

Fermat’s last theorem says there are no positive integer solutions to

an + bn = cn

for integers n > 2. This theorem was proven three and a half centuries after Fermat proposed it. The theorem is well known, even in pop culture. For example, Captain Picard tries to prove it in Star Trek: Next Generation. Fermat’s last theorem was famous for being an open problem that was easy to state. Now that it has been proven, perhaps it will fade from popular consciousness.

The mathematics developed over the years in attempts to prove Fermat’s last theorem has been very useful, but the theorem itself is of no practical importance that I’m aware of.

Fermat’s little theorem

Fermat’s little theorem says that if p is a prime and a is any integer not divisible by p, then ap − 1 − 1 is divisible by p. This theorem is unknown in pop culture but familiar in math circles. It’s proved near the beginning of any introductory number theory class.

The theorem, and its generalization by Euler, comes up constantly in applications of number theory. See three examples here.

Related posts

Twisted elliptic curves

This morning I was sitting at a little bakery thinking about what to do before I check out of my hotel. I saw that the name of the bakery was Twist Bakery & Cafe, and that made me think of writing about twisted elliptic curves when I got back to my room.

Twist of an elliptic curve

Suppose p is an odd prime and let E be the elliptic curve with equation

y² = x³ + ax² + bx + c

where all the variables come from the integers mod p. Now pick some d that is not a square mod p, i.e. there is no x such that x² – d is divisible by p. Then the curve E‘ with equation

dy² = x³ + ax² + bx + c

is a twist of E. More explicit it’s a quadratic twist of E. This is usually what people have in mind when they just say twist with no modifier, but there are other kinds of twists.

Let’s get concrete. We’ll work with integers mod 7. Then the squares are 1, 4, and 2. (If that last one looks strange, note that 3*3 = 2 mod 7.) And so the non-squares are 3, 5, and 6. Suppose our curve E is

y² = x³ + 3x + 2.

Then we can form a twist of E by multiplying the left side by 3, 5, or 6. Let’s use 3. So E‘ given by

3y² = x³ + 3x + 2

is a twist of E.

Twisted curve

There’s something potentially misleading in the term “twisted curve”. The curve E‘ is a twist of E, so E‘ is twisted relative to E, and vice versa. You can’t say E‘ is twisted and E is not.

But you might object that E‘ has the form

dy² = x³ + ax² + bx + c

where d is a non-square, and E does not, so E‘ is the one that’s twisted. But a change of variables can leave the curve the same while changing the form of the equation. Every curve has an equation in which the coefficient of y² is 1 and a form in which the coefficient is a non-square. Specifically,

dy² = x³ + ax² + bx + c


y² = x³ + dax² + d²bx + d³c

specify the same curve.

The two curves E and E‘ are not isomorphic over the field of integers mod p, but they are isomorphic over the quadratic extension of the integers mod p. That is, if you extend the field by adding the square root of d, the two curves are isomorphic over that field.

Partitioning x coordinates

For a given x, f(x) = x³ + ax² + bx + c is either a square or a non-square. If it is a square, then

y² = f(x)

has a solution and

dy² = f(x)

does not. Conversely, if f(x) is not a square, then

dy² = f(x)

has a solution and

y² = f(x)

does not. So a given x coordinate either corresponds to a point on E or a point on E‘, but not both.

Application to cryptography

In elliptic curve cryptography, it’s good if not only is the curve you’re using secure, so are its twists. Suppose you intend to work over a curve E, and someone sends you an x coordinate that’s not on E. If you don’t check whether x corresponds to a point on E, you are effectively working on a twist E‘ rather than E. That can be a security hole if E‘ is not as secure a curve as E, i.e. if the discrete logarithm problem on E‘ can be solved more efficiently than the same problem on E.

Related posts