Sine of a googol

How do you evaluate the sine of a large number in floating point arithmetic? What does the result even mean?

Sine of a trillion

Let’s start by finding the sine of a trillion (1012) using floating point arithmetic. There are a couple ways to think about this. The floating point number t = 1.0e12 can only be represented to 15 or 16 significant decimal figures (to be precise, 53 bits [1]), and so you could think of it as a representative of the interval of numbers that all have the same floating point representation. Any result that is the sine of a number in that interval should be considered correct.

Another way to look at this is to say that t can be represented exactly—its binary representation requires 42 bits and we have 53 bits of significand precision available—and so we expect sin(t) to return the sine of exactly one trillion, correct to full precision.

It turns out that IEEE arithmetic does the latter, computing sin(1e12) correctly to 16 digits.

Here’s the result in Python

    >>> sin(1.0e12)

and verified by Mathematica by computing the value to 20 decimal places

    In:= N[Sin[10^12], 20]
    Out= -0.61123870237688949819

Range reduction

Note that the result above is not what you’d get if you were first to take the remainder when dividing by 2π and then taking the sine.

    >>> sin(1.0e12 % (2*pi))

This makes sense. The result of dividing a trillion by the floating point representation of 2π, 159154943091.89536, is correct to full floating point precision. But most of that precision is in the integer part. The fractional part is only has five digits of precision, and so we should expect the result above to be correct to at most five digits. In fact it’s accurate to four digits.

When your processor computes sin(1e12) it does not naively take the remainder by 2π and then compute the sine. If it did, you’d get four significant figures rather than 16.

We started out by saying that there were two ways of looking at the problem, and according to the first one, returning only four significant figures would be quite reasonable. If we think of the value t as a measured quantity, measured to the precision of floating point arithmetic (though hardly anything can be measured that accurately), then four significant figures would be all we should expect. But the people who designed the sine function implementation on your processor did more than they might be expected to do, finding the sine of a trillion to full precision. They do this using a range reduction algorithm that retains far more precision than naively doing a division. (I worked on this problem a long time ago.)

Sine of a googol?

What if we try to take the sine of a ridiculously large number like a googol (10100)? This won’t work because a googol cannot be represented exactly as a floating point number (i.e. as a IEEE 754 double). It’s not too big; floating point numbers can be as large as around 10308. The problem is that a number that big cannot be represented to full precision. But we can represent 2333 exactly, and a googol is between 2332 and 2333. And amazingly, Python’s sine function (or rather the sine function built into the AMD processor on my computer) returns the correct result to full precision.

    >>> sin(2**333)

How do we know this is right? I verified it in Mathematica:

    In:= Sin[2.0^333]
    Out= 0.9731320373846353

How do we know Mathematica is right? We can do naive range reduction using extended precision, say 200 decimal places.

    In:= p = N[Pi, 200]
    In:= theta = x - IntegerPart[x/ (2 p)] 2 p
    Out= 1.8031286178334699559384196689…

and verify that the sine of the range reduced value is correct.

    In:= Sin[theta]
    Out= 0.97313203738463526…

Interval arithmetic interpretation

Because floating point numbers have 53 bits of precision, all real numbers between 256 – 22 and 256 + 22 are represented as the floating point number 256. This is a range of width 8, wider than 2π, and so the sines of the numbers in this range cover the possible values of sine, i.e. [-1, 1]. So if you think of a floating point number as a sample from the set of all real numbers with the same binary representation, every possible value of sine is a valid return value for numbers bigger than 256. But if you think of a floating point number as representing itself exactly, then it makes sense to ask for the sine of numbers like 2333 and larger, up to the limits of floating point representation [2].

Related posts

[1] An IEEE 754 double has 52 significand bits, but these bits can represent 53 bits of precision because the first bit of the fractional part is always 1, and so it does not need to be represented. See more here.

[2] The largest exponent of an IEEE double is 1023, and the largest significand is 2 – 2-53 (i.e. all 1’s), so the largest possible value of a double is

(253 – 1)21024-53

and in fact the Python expression sin((2**53 - 1)*2**(1024-53)) returns the correct value to 16 significant figures.

Spherical trig, Research Triangle, and Mathematica

This post will look at the triangle behind North Carolina’s Research Triangle using Mathematica’s geographic functions.

Spherical triangles

A spherical triangle is a triangle drawn on the surface of a sphere. It has three vertices, given by points on the sphere, and three sides. The sides of the triangle are portions of great circles running between two vertices. A great circle is a circle of maximum radius, a circle with the same center as the sphere.

An interesting aspect of spherical geometry is that both the sides and angles of a spherical triangle are angles. Because the sides of a spherical triangle are arcs, they have angular measure, the angle formed by connecting each vertex to the center of the sphere. The arc length of a side is its angular measure times the radius of the sphere.

Denote the three vertices by AB, and C. Denote the side opposite A by a, etc. Denote the angles at AB, and C by α, β, and γ respectively.

Research Triangle

Research Triangle is a (spherical!) triangle with vertices formed by Duke University, North Carolina State University, and University of North Carolina at Chapel Hill.

(That was the origin of the name, though it’s now applied more loosely to the general area around these three universities.)

We’ll take as our vertices

  • A = UNC Chapel Hill (35.9046 N, 79.0468 W)
  • B = Duke University in Durham (36.0011 N, 78.9389 W),
  • C = NCSU in Raleigh (35.7872 N, 78.6705 W)

Research Triangle


We’ll illustrate several features of Mathematica using the spherical triangle corresponding to Research Triangle.


The map above was produced with the following Mathematica code.

    ptA = GeoPosition[{35.9046, -79.0468}]
    ptB = GeoPosition[{36.0011, -78.9389}]
    ptC = GeoPosition[{35.7872, -78.6705}]

    GeoGraphics[{Red, PointSize[Large], 
        Point[ptA], Point[ptB], Point[ptC]}, 
        GeoScaleBar -> "Imperial", 
        GeoRange -> 29000]

Note that longitude is in degrees east, so the longitudes above are negative.


To find the distance between two locations, you can use the function GeoDistance. For example,

    GeoDistance[ptA, ptB]

tells me that the distance between UNC and Duke is 8.99185 miles. I assume it displays miles by default based on my location in the US, though the GeoRange option above defaults to meters. You can make the system of units explicit. For example

    GeoDistance[ptA, ptB, UnitSystem -> "Metric"]

returns 14.741 km.

If we want to find the length of the side c in degrees, we can use the Earth’s radius.

    r = PlanetData["Earth", "Radius"]
    c = GeoDistance[ptA, ptB] 360 / (2 Pi r)

This shows that c is 0.13014°. Calculating the other sides similarly shows a = 0.30504° and b = 0.32739°.


Calling GeoDirection[ptA, ptB] returns 42.2432°, which says we need to head at an angle of about 42° to walk from UNC to Duke.

The code

    GeoDirection[ptA, ptB] - GeoDirection[ptA, ptC]

shows that the angle α is 68.6128°. (The code returns the negative of this angle because the angle is clockwise.) Similarly we find β = 87.9808° and γ = 23.4068.

The angles add up to 180° only because our triangle is small compared to the earth’s total area. The actual sum should be slightly more than 180°, but we’ve not retained enough precision to detect the difference. In general the “spherical excess,” i.e. the amount by which the sum of the angles exceed 180°, is proportional to the area of the triangle.

Related posts

Prime denominators and nines complement

Let p be a prime. If the repeating decimal for the fraction a/p has even period, the the second half of the decimals are the 9’s complement of the first half. This is known as Midy’s theorem.

For a small example, take

1/7 = 0.142857142857…

and notice that 142 + 857 = 999. That is, 8, 5, and 7 are the nine’s complements of 1, 4, and 2 respectively.

For a larger example, we can use Mathematica to look at the decimal expansion of 6/47:

    In:  N[6/47, 60]
    Out: 0.127659574468085106382978723404255319148936170212765957446809

and we can confirm

    12765957446808510638297 + 
    87234042553191489361702 =

Let’s do another example with 6/43:

    In:  N[6/43, 50]
    Out: 0.13953488372093023255813953488372093023255813953488

Does the theorem hold here? The the hypothesis of the theorem doesn’t hold because the fraction has period 21, which is not even. Can the conclusion of the theorem hold anyway? No, because we can’t split 21 digits in half! But we can split 21 digits into three parts, and if we do we find

    1395348 + 8372093 + 232558 = 9999999

Let’s finish up with an example in another base. We’ll look at the fraction 3/7 in base 17.

    In:  BaseForm[N[3/7, 24], 17]
    Out: 0.74e9c274e9c274e9c2717

If we split the repeating part of the decimal into two parts we get 74e and 9c2, and these add up to the base 17 analog of all 9’s.

    In:  BaseForm[17^^74e + 17^^9c2, 17]
    Out: ggg17

That is, the base 17 numbers 9, c, and 2, are the g’s complements of 7, 4, and e respectively.

[The Mathematica operator ^^ interprets its right argument as a number whose base is given by the left argument. The BaseForm function takes a number as its first argument and returns its representation in the base given as its second argument.]

Related post: Casting out Z’s

Continued fraction cryptography

Every rational number can be expanded into a continued fraction with positive integer coefficients. And the process can be reversed: given a sequence of positive integers, you can make them the coefficients in a continued fraction and reduce it to a simple fraction.

In 1954, Arthur Porges published a one-page article pointing out that continued fractions could be used to create a cipher. To encrypt your cleartext, convert it to a list of integers, use them as continued fraction coefficients, and report the resulting fraction. To decrypt, expand the fraction into a continued fraction and convert the coefficients back to text.

We can implement this in Mathematica as follows:

encode[text_] := FromContinuedFraction[ ToCharacterCode[ text ]]
decode[frac_] := FromCharacterCode[ ContinuedFraction[ frac ]]

So, for example, suppose we want to encrypt “adobe.” If we convert each letter to its ASCII code we get {97, 100, 111, 98, 101}. When we make these numbers coefficients in a continued fraction we get


which reduces to 10661292093 / 109898899.

This isn’t a secure encryption scheme by any means, but it’s a cute one. It’s more of an encoding scheme, a (non-secret) way to convert a string of numbers into a pair of numbers.

Related posts

[1] Arthur Porges. A Continued Fraction Cipher. The American Mathematical Monthly, Vol. 59, No. 4 (Apr., 1952), p. 236

Clearing up the confusion around Jacobi functions

The Jacobi elliptic functions sn and cn are analogous to the trigonometric functions sine and cosine. The come up in applications such as nonlinear oscillations and conformal mapping. Unfortunately there are multiple conventions for defining these functions. The purpose of this post is to clear up the confusion around these different conventions.

Plot of Jacobi sn

The image above is a plot of the function sn [1].

Modulus, parameter, and modular angle

Jacobi functions take two inputs. We typically think of a Jacobi function as being a function of its first input, the second input being fixed. This second input is a “dial” you can turn that changes their behavior.

There are several ways to specify this dial. I started with the word “dial” rather than “parameter” because in this context parameter takes on a technical meaning, one way of describing the dial. In addition to the “parameter,” you could describe a Jacobi function in terms of its modulus or modular angle. This post will be a Rosetta stone of sorts, showing how each of these ways of describing a Jacobi elliptic function are related.

The parameter m is a real number in [0, 1]. The complementary parameter is m‘ = 1 – m. Abramowitz and Stegun, for example, write the Jacobi functions sn and cn as sn(um) and cn(um). They also use m1 = rather than m‘ to denote the complementary parameter.

The modulus k is the square root of m. It would be easier to remember if m stood for modulus, but that’s not conventional. Instead, m is for parameter and k is for modulus. The complementary modulus k‘ is the square root of the complementary parameter.

The modular angle α is defined by m = sin² α.

Note that as the parameter m goes to zero, so does the modulus k and the modular angle α. As any one of these three goes to zero, the Jacobi functions sn and cn converge to their counterparts sine and cosine. So whether your dial is the parameter, modulus, or modular angle, sn converges to sine and cn converges to cosine as you turn the dial toward zero.

As the parameter m goes to 1, so does the modulus k, but the modular angle α goes to π/2. So if your dial is the parameter or the modulus, it goes to 1. But if you think of your dial as modular angle, it goes to π/2. In either case, as you turn the dial to the right as far as it will go, sn converges to hyperbolic secant, and cn converges to the constant function 1.

Quarter periods

In addition to parameter, modulus, and modular angle, you’ll also see Jacobi function described in terms of K and K‘. These are called the quarter periods for good reason. The functions sn and cn have period 4K as you move along the real axis, or move horizontally anywhere in the complex plane. They also have period 4iK‘. That is, the functions repeat when you move a distance 4K‘ vertically [2].

The quarter periods are a function of the modulus. The quarter period K along the real axis is

K(m) = \int_0^{\pi/2} \frac{d\theta}{\sqrt{1-m\sin^2\theta}}

and the quarter period K‘ along the imaginary axis is given by K‘(m) = K(m‘) = K(1 – m).

The function K(m) is known as “the complete elliptic integral of the first kind.”


So far we’ve focused on the second input to the Jacobi functions, and three conventions for specifying it.

There are two conventions for specifying the first argument, written either as φ or as u. These are related by

u = \int_0^{\varphi} \frac{d\theta}{\sqrt{1-m\sin^2\theta}}

The angle φ is called the amplitude. (Yes, it’s an angle, but it’s called an amplitude.)

When we said above that the Jacobi functions had period 4K, this was in terms of the variable u. Note that when φ = π/2, uK.

Jacobi elliptic functions in Mathematica

Mathematica uses the u convention for the first argument and the parameter convention for the second argument.

The Mathematica function JacobiSN[u, m] computes the function sn with argument u and parameter m. In the notation of A&S, sn(um).

Similarly, JacobiCN[u, m] computes the function cn with argument u and parameter m. In the notation of A&S, cn(um).

We haven’t talked about the Jacobi function dn up to this point, but it is implemented in Mathematica as JacobiDN[u, m].

The function that solves for the amplitude φ as a function of u is JacobiAmplitude[um m].

The function that computes the quarter period K from the parameter m is EllipticK[m].

Jacobi elliptic functions in Python

The SciPy library has one Python function that computes four mathematical functions at once. The function scipy.special.ellipj takes two arguments, u and m, just like Mathematica, and returns sn(um), cn(um), dn(um), and the amplitude φ(um).

The function K(m) is implemented in Python as scipy.special.ellipk.

Related posts

[1] The plot was made using JacobiSN[0.5, z] and the function ComplexPlot described here.

[2] Strictly speaking, 4iK‘ is a period. It’s the smallest vertical period for cn, but 2iK‘ is the smallest vertical period for sn.

Group statistics

I just ran across FiniteGroupData and related functions in Mathematica. That would have made some of my earlier posts easier to write had I used this instead of writing my own code.

Here’s something I find interesting. For each n, look at the groups of order at most n and count how many are Abelian versus non-Abelian. At first there are more Abelian groups, but the non-Abelian groups soon become more numerous. Also, the number of Abelian groups grows smoothly, while the number of non-Abelian groups has big jumps, particularly at powers of 2.

Counting Abelian and non-Abelian groups

Here’s the Mathematica code:

    fgc = FoldList[Plus, 0, Table[FiniteGroupCount[n], {n, 1, 300}]]
    fga = FoldList[Plus, 0, Table[FiniteAbelianGroupCount[n], {n, 1, 300}]]
    ListLogPlot[ {fgc - fga, fga }, 
        PlotLegends -> {"Non-Abelian", "Abelian"}, 
        Joined -> True, 
        AxesLabel -> {"order", "count"}]

I see the plot legend on my screen, but when saving the plot to a file the legend wasn’t included. Don’t know why. (Update: See footnote [1]). The jagged blue curve is the number of non-Abelian groups of size up to n. The smooth gold curve is the corresponding curve for Abelian groups.

Here’s the same plot carried out further to show the jumps at 512 and 1024.

Counting Abelian and non-Abelian groups

Related posts

[1] Someone from Wolfram Research saw this post and sent me a fix:

pl = ListLogPlot[...]
Export["~/Desktop/img.png", pl]

Sine of five degrees

Today’s the first day of a new month, which means the exponential sum of the day will be simpler than usual. The exponential sum of the day plots the partial sums of

\sum_{n=0}^N \exp\left( 2\pi i \left( \frac{n}{m} + \frac{n^2}{d} + \frac{n^3}{y} \right ) \right )

where md, and y are the month, day, and (two-digit) year. The n/d term is simply n, and integer, when d = 1 and so it has no effect because exp(2πn) = 1. Here’s today’s sum, the plot formed by the partial sums above.

exponential sum for August 1, 2018

It’s possible in principle to work out exactly each of the points are on the curve. In order to do so, you’d need to compute

\exp\left(\frac{2\pi i}{72}\right) = \cos\left(\frac{2\pi}{72}\right) + i \sin\left(\frac{2\pi}{72}\right) = \cos(5^\circ) + i \sin(5^\circ)

Since we can use the Pythagorean theorem to compute sine from cosine and vice versa, we only need to know how to compute the sine of five degrees.

OK, so how do we do that? The sine and cosine of 45 degrees and of 30 degrees are well known, and we can use the trig identity for sin(A – B) to find that the sine of 15 degrees is (√3 – 1)/ 2√2. So how do we get from the sine of 15 degrees to the sine of 5 degrees? We can borrow a trick from medieval astronomers. They did something similar to our work above, then solved the trig identity

\sin 3\theta = 3 \sin\theta - 4\sin^3\theta

as a cubic equation in sin θ to find the sine of 1/3 of an angle with known sine.

If we ask Mathematica to solve this equation for us

    Solve[3 x -4 x^3 == Sin[15 Degree], x]

we get three roots, and it’s the middle one that we’re after. (We can evaluate the roots numerically and tell that the middle one has the right magnitude.) So Mathematica says that the sine of 5 degrees is

\frac{\left(1+i \sqrt{3}\right) \sqrt[3]{-\sqrt{2}+\sqrt{6}+i \sqrt{8+4 \sqrt{3}}}}{4\ 2^{2/3}}+\frac{1-i \sqrt{3}}{2 \sqrt[3]{2 \left(-\sqrt{2}+\sqrt{6}+i \sqrt{8+4 \sqrt{3}}\right)}}

Unfortunately, this is the end of the line for Mathematica. We know that the expression above must be real, but Mathematica is unable to find its real part. I tried running FullSimplify on this expression, and aborted evaluation after it ran for 20 minutes or so. If anybody wants to pick up where Mathematica left off, let me know your solution. It may be possible to continue with Mathematica by giving the software some help, or it may be easier to continue with pencil and paper.

Update: Based on the comments below, it seems this is about as far as we can go. This was a surprise to me. We’re solving a cubic equation, less than 5th degree, so the roots are expressible in terms of radicals. But that doesn’t mean we can eliminate the appearance of the i‘s, even though we know the imaginary part is zero.

System of Jacobi elliptic functions

Jacobi’s elliptic functions are sorta like trig functions. His functions sn and cn have names that reminiscent of sine and cosine for good reason. These functions come up in applications such as the nonlinear pendulum (i.e. when θ is too
large to assume θ is a good enough approximation to sin θ) and in conformal mapping.

I ran across an article [1] yesterday that shows how Jacobi’s three elliptic functions—sn, cn, and dn—could be defined by one dynamical system

\begin{eqnarray*} x' &=& yz \\ y' &=& -zx \\ z' &=& -k^2 xy \end{eqnarray*}

with initial conditions x(0) = 0, y(0) = 1, and z(0) = 1.

The parameter k is the modulus. (In Mathematica’s notation below, k² is the parameter. See this post on parameterization conventions.) As k decreases to 0, sn converges to sine, cn to cosine, and dn to 1. As k increases to 1, sn converges tanh, and cn and dn converge to sech. So you could think of k as a knob you turn to go from being more like circular functions (ordinary trig functions) to more like hyperbolic functions.

Since we have a dynamical system, let’s plot the solution, varying the modulus each time.The Jacobi functions are periodic (in fact they’re doubly periodic) and so the plots will be closed loops.

f[t_, m_] = {JacobiSN[t, m], JacobiCN[t, m], JacobiDN[t, m]}
ParametricPlot3D[f[t, 0.707], {t, 0, 10}, AspectRatio -> 1]

phase portrait k = 1/2

ParametricPlot3D[f[t, 0.99], {t, 0, 20}, AspectRatio -> 1]

phase portrait k = 1/2

ParametricPlot3D[f[t, 0.01], {t, 0, 20}, AspectRatio -> 1]

phase portrait k = 1/2

Note that this last plot is nearly flat because the modulus is small and so z is nearly constant. The small modulus also makes the phase portrait nearly circular because x is approximately sine and y is approximately cosine.

[1] Kenneth R. Meyer. Jacobi Elliptic Functions from a Dynamical Systems Point of View. The American Mathematical Monthly, Vol. 108, No. 8 (Oct., 2001), pp. 729-737

Almost prime generators and almost integers

Here are two apparently unrelated things you may have seen before. The first is an observation going back to Euler that the polynomial

n^2 - n + 41

produces a long sequence of primes. Namely, the values are prime for n = 1, 2, 3, …, 40.

The second is that the number

e^{\pi \sqrt{163}}

is extraordinarily close to an integer. This number is known as Ramanujan’s constant. It differs from its nearest integer by 3 parts in 1030. Ramanujan’s constant equals


There is a connection between these two facts: The polynomial

n^2 - n + k

returns primes for n = 1, 2, 3, …, k-1 primes if 4k – 1 is a Heegner number, and

e^{\pi \sqrt{d}}

is almost an integer if d is a (large) Heegner number.

Source: The Book of Numbers by Conway and Guy.

Heegner numbers

So what’s a Heegner number and how many are there? An integer d is a Heegner number if the ring generated by appending √-d to the integers has unique factorization. There are nine such numbers:

1, 2, 3, 7, 11, 19, 43, 67, 163.

There’s deeper stuff going on here than I understand—modular forms, the j-function, etc.—so this post won’t explain everything. There’s something unsatisfying about saying something is “almost” an integer without quantifying. There’s a way to be more precise, but we won’t go there. Instead, we’ll just play with the results.

Mathematica computation

First we look at the claim that n² – n + k produces primes for n = 1 through k – 1 if 4k – 1 is a Heegner number. The Heegner numbers of the form 4k + 1 are 2, 3, 5, 11, and 17. The following code shows that the claim is true for these values of k.

k = {2, 3, 5, 11, 17}
claim[x_] := AllTrue[
  Table[n^2 - n + x, {n, x - 1}], 
AllTrue[k, claim]

This returns True, so the claim is true.

As for exp(π √d) being close to an integer, this apparently only true for the last three Heegner numbers.

h = {1, 2, 3, 7, 11, 19, 43, 67, 163}
For[i = 1, i < 10, i++, 
        Exp[ Pi Sqrt[ h[[i]] ] ], 

(The function AccountingForm suppresses scientific notation, making it easier to see where the decimal portion of the number starts.)

Here are the results:


I manually edited the output to align the decimal points and truncate the decimal places beyond that needed to show that the last number is not an integer.

Partition numbers and Ramanujan’s approximation

The partition function p(n) counts the number of ways n unlabeled things can be partitioned into non-empty sets. (Contrast with Bell numbers that count partitions of labeled things.)

There’s no simple expression for p(n), but Ramanujan discovered a fairly simple asymptotic approximation:

p(n) \sim \frac{1}{4n\sqrt{3}} \exp(\pi \sqrt{2n/3})

How accurate is this approximation? Here’s a little Matheamtica code to see.

p[n_] := PartitionsP[n]
approx[n_] := Exp[ Pi Sqrt[2 n/3]] / (4 n Sqrt[3])
relativeError[n_] := Abs[p[n] - approx[n]]/p[n]
ListLinePlot[Table[relativeError[n], {n, 100}]]

So for values of n around 100, the approximation is off by about 5%.

Since it’s an asymptotic approximation, the relative error decreases (albeit slowly, apparently) as n increases. The relative error for n = 1,000 is about 1.4% and the relative error for n = 1,000,000 is about 0.044%.

Update: After John Baez commented on the oscillation in the relative error I decided to go back and look at it more carefully. Do the oscillations end or do they just become too small to see?

To answer this, let’s plot the difference in consecutive terms.

relerr[a_, b_] := Abs[a - b]/b
t = Table[relerr[p[n], approx[n]], {n, 300}];
ListLinePlot[Table[ t[[n + 1]] - t[[n]], {n, 60}]]

first differences of relative error

The plot crosses back and forth across the zero line, indicating that the relative error alternately increases and decreases, but only up to a point. Past n = 25 the plot stays below the zero line; the sign changes in the first differences stop.

But now we see that the first differences themselves alternate! We can investigate the alternation in first differences by plotting second differences [1].

    Table[ t[[n + 2]] - 2 t[[n + 1]] + t[[n]], 
    {n, 25, 120}]

first differences of relative error

So it appears that the second differences keep crossing the zero line for a lot longer, so far out that it’s hard to see. In fact the second differences become positive and stay positive after n = 120. But the second differences keep alternating, so you could look at third differences …

Related posts: Special numbers


[1] The code does a small algebraic simplification that might make some people scratch their heads. All it does is simplify

(tn+2tn+1) – (tn+1tn).

Computing SVD and pseudoinverse

In a nutshell, given the singular decomposition of a matrix A,

A = U \Sigma V^*

the Moore-Penrose pseudoinverse is given by

A^+ = V \Sigma^+ U^*.

This post will explain what the terms above mean, and how to compute them in Python and in Mathematica.

Singular Value Decomposition (SVD)

The singular value decomposition of a matrix is a sort of change of coordinates that makes the matrix simple, a generalization of diagonalization.

Matrix diagonalization

If a square matrix A is diagonalizable, then there is a matrix P such that

A = P^{-1} D P

where the matrix D is diagonal. You could think of P as a change of coordinates that makes the action of A as simple as possible. The elements on the diagonal of D are the eigenvalues of A and the columns of P are the corresponding eigenvectors.

Unfortunately not all matrices can be diagonalized. Singular value decomposition is a way to do something like diagonalization for any matrix, even non-square matrices.

Generalization to SVD

Singular value decomposition generalizes diagonalization. The matrix Σ in SVD is analogous to D in diagonalization. Σ is diagonal, though it may not be square. The matrices on either side of Σ are analogous to the matrix P in diagonalization, though now there are two different matrices, and they are not necessarily inverses of each other. The matrices U and V are square, but not necessarily of the same dimension.

The elements along the diagonal of Σ are not necessarily eigenvalues but singular values, which are a generalization of eigenvalues. Similarly the columns of U and V are not necessarily eigenvectors but left singular vectors and right singular vectors respectively.

The star superscript indicates conjugate transpose. If a matrix has all real components, then the conjugate transpose is just the transpose. But if the matrix has complex entries, you take the conjugate and transpose each entry.

The matrices U and V are unitary. A matrix M is unitary if its inverse is its conjugate transpose, i.e. M* M = MM* = I.

Pseudoinverse and SVD

The (Moore-Penrose) pseudoinverse of a matrix generalizes the notion of an inverse, somewhat like the way SVD generalized diagonalization. Not every matrix has an inverse, but every matrix has a pseudoinverse, even non-square matrices.

Computing the pseudoinverse from the SVD is simple.


A = U \Sigma V^*


A^+ = V \Sigma^+ U^*

where Σ+ is formed from Σ by taking the reciprocal of all the non-zero elements, leaving all the zeros alone, and making the matrix the right shape: if Σ is an m by n matrix, then Σ+ must be an n by m matrix.

We’ll give examples below in Mathematica and Python.

Computing SVD in Mathematica

Let’s start with the matrix A below.

A = \begin{bmatrix} 2 & -1 & 0 \\ 4 & 3 & -2 \end{bmatrix}

We can find the SVD of A with the following Mathematica commands.

    A = {{2, -1, 0}, {4, 3, -2}}
    {U, S, V} = SingularValueDecomposition[A]

From this we learn that the singular value decomposition of A is

\begin{bmatrix} \frac{1}{\sqrt{26}} & -\frac{5}{\sqrt{26}} \\ \frac{5}{\sqrt{26}} & \frac{1}{\sqrt{26}} \\ \end{bmatrix} \begin{bmatrix} \sqrt{30} & 0 & 0 \\ 0 & 2 & 0 \end{bmatrix} \begin{bmatrix} \frac{11}{\sqrt{195}} & \frac{7}{\sqrt{195}} & -\sqrt{\frac{5}{39}} \\ -\frac{3}{\sqrt{26}} & 2 \sqrt{\frac{2}{13}} & -\frac{1}{\sqrt{26}} \\ \frac{1}{\sqrt{30}} & \sqrt{\frac{2}{15}} & \sqrt{\frac{5}{6}} \end{bmatrix}

Note that the last matrix is not V but the transpose of V. Mathematica returns V itself, not its transpose.

If we multiply the matrices back together we can verify that we get A back.

    U . S. Transpose[V]

This returns

    {{2, -1, 0}, {4, 3, -2}}

as we’d expect.

Computing pseudoinverse in Mathematica

The Mathematica command for computing the pseudoinverse is simply PseudoInverse. (The best thing about Mathematica is it’s consistent, predictable naming.)


This returns

    {{19/60, 1/12}, {-(11/30), 1/6}, {1/12, -(1/12)}}

And we can confirm that computing the pseudoinverse via the SVD

    Sp = {{1/Sqrt[30], 0}, {0, 1/2}, {0, 0}}
    V . Sp . Transpose[U]

gives the same result.

Computing SVD in Python

Next we compute the singular value decomposition in Python (NumPy).

    >>> a = np.matrix([[2, -1, 0],[4,3,-2]])
    >>> u, s, vt = np.linalg.svd(a, full_matrices=True)

Note that np.linalg.svd returns the transpose of V, not the V in the definition of singular value decomposition.

Also, the object s is not the diagonal matrix Σ but a vector containing only the diagonal elements, i.e. just the singular values. This can save a lot of space if the matrix is large. The NumPy method svd has other efficiency-related options that I won’t go into here.

We can verify that the SVD is correct by turning s back into a matrix and multiply the components together.

    >>> ss = np.matrix([[s[0], 0, 0], [0, s[1], 0]])
    >>> u*ss*vt

This returns the matrix A, within floating point accuracy. Since Python is doing floating point computations, not symbolic calculation like Mathematica, the zero in A turns into -3.8e-16.

Note that the singular value decompositions as computed by Mathematica and Python differ in a few signs here and there; the SVD is not unique.

Computing pseudoinverse in Python

The pseudoinverse can be computed in NumPy with np.linalg.pinv.

    >>> np.linalg.pinv(a)
    matrix([[ 0.31666667,  0.08333333],
            [-0.36666667,  0.16666667],
            [ 0.08333333, -0.08333333]])

This returns the same result as Mathematica above, up to floating point precision.

Related posts

Narcissus prime

This morning Futility Closet posted the following.

Repeat the string 1808010808 1560 times, and tack on a 1 the end. The resulting 15601-digit number is prime, and because it’s a palindrome made up of the digits 1, 8, and 0, it remains prime when read backward, upside down, or in a mirror.

I used Mathematica to verify that the number described above is indeed prime.

PrimeQ[ 10*Sum[1808010808*10^(10 i), {i, 0, 1559}] + 1 ]

After a little over two minutes, the function returned True.

Related posts:

For daily tweets on algebra and other math, follow @AlgebraFact on Twitter.

AlgebraFact logo

Words that are primes base 36

This morning on Twitter, Alexander Bogomolny posted a link to his article that gives examples of words that are prime numbers when interpreted as numbers in base 36. Some examples are “Brooklyn”, “paleontologist”, and “deodorant.” (Numbers in base 36 are written using 0, 1, 2, …, 9, A, B, C, …, Z as “digits.” )

Tim Hopper replied with a snippet of Mathematica code that lists all words with up to four letters that correspond to base 36 primes.

Rest[ Flatten[ Union[
    DictionaryLookup /@ IntegerString[
        Table[Prime[n], {n, 1, 300000}], 36]]]]

That made me wonder whether you could estimate how many such words there are without doing an exhaustive search.

The Prime Number Theorem says that the probability of a number less than N being prime is approximately 1/log(N). If we knew how many English words there were of a certain length, then we could guess that 1/log(N) of that those words would be prime when interpreted as base 36 numbers. This assumes that forming an English word and being prime have independent probabilities, which may be approximately true.

How well would our guess have worked on Tim’s example? He prints out all the words corresponding to the first 300,000 primes. The last of these primes is 4,256,233. The exact probability that a number less than that upper limit is prime is then

300,000 / 4,256,233 ≈ 0.07.

There are about 4200 English words with four or fewer letters. (I found this out by running

grep -ciE '^[a-z]{1,4}$'

on the words file on a Linux box. See similar tricks here.) If we estimate that 7% of these are prime, we’d expect 294 words from Tim’s program. His program produces 275 words, so our prediction is pretty good.

If we didn’t know the exact probability of a number in our range being prime, we could have estimated the probability at

1/log(4,256,233) ≈ 0.0655

using the Prime Number Theorem. Using this approximation we’d estimate 4200*0.0655 = 275.1 words; our estimate would be exactly correct! There’s good reason to believe our estimate would be reasonably close, but we got lucky to get this close.

Related posts:

Sonnet primes

The previous post showed how to list all limerick primes. This post shows how to list all sonnet primes. These are primes of the form ababcdcdefefgg, the rhyme scheme of an English (Shakespearean) sonnet, where the letters a through g represent digits and a is not zero.

Here’s the Mathematica code.

number[s_] := 10100000000000 s[[1]] + 1010000000000 s[[2]] +
1010000000 s[[3]] + 101000000 s[[4]] +
101000 s[[5]] + 10100 s[[6]] + 11 s[[7]]

test[n_] := n > 10^13 && PrimeQ[n]

candidates = Permutations[Table[i, {i, 0, 9}], {7}];

list = Select[Map[number, candidates], test];

The function Permutations[list, {n}] creates a list of all permutations of list of length n. In this case we create all permutations the digits 0 through 9 that have length 7. These are the digits a through g.

The function number turns the permutation into a number. This function is applied to each candidate. We then select all 14-digit prime numbers from the list of candidates using test.

If we ask for Length[list] we see there are 16,942 sonnet primes.

As mentioned before, the smallest of these primes is 10102323454577.
The largest is 98987676505033.

Related post: Limerick primes

For daily tweets on algebra and other math, follow @AlgebraFact on Twitter.

AlgebraFact logo