*f*(*z*) = (1-*z*)/(1+*z*).

The most recent examples include applications to **radio antennas** and **mental calculation**. More on these applications below.

A convenient property of our function *f* is that it is its own inverse, i.e. *f*( *f*(*x*) ) = *x*. The technical term for this is that *f* is an involution.

The first examples of involutions you might see are the maps that take *x* to –*x* or 1/*x*, but our function shows that more complex functions can be involutions as well. It may be the simplest involution that isn’t obviously an involution.

By the way, *f* is still an involution if we extend it by defining *f*(-1) = ∞ and *f*(∞) = -1. More on that in the next section.
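The involution property is trivial to verify numerically; here’s a quick sketch (my code, not from the post), using exact rational arithmetic so there’s no floating point fuzz:

```python
from fractions import Fraction

def f(z):
    # f(z) = (1 - z)/(1 + z)
    return (1 - z) / (1 + z)

# f is its own inverse, away from z = -1
z = Fraction(3, 7)
print(f(f(z)))  # 3/7
```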

The function above is an example of a **Möbius transformation**, a function of the form

(*az* + *b*)/(*cz* + *d*).

These functions seem very simple, and on the surface they are, but they have a lot of interesting and useful properties.

If you define the image of the singularity at *z* = –*d*/*c* to be ∞ and define the image of ∞ to be *a*/*c*, then Möbius transformations are one-to-one mappings of the extended complex plane, the complex numbers plus a point at infinity, onto itself. In fancy language, the Möbius transformations are the holomorphic automorphisms of the Riemann sphere.

More on why the extended complex plane is called a sphere here.

One nice property of Möbius transformations is that they map circles and lines to circles and lines. That is, the image of a circle is either a circle or a line, and the image of a line is either a circle or a line. You can simplify this by saying Möbius transformations map circles to circles, with the understanding that a line is a circle with infinite radius.

Back to our particular Möbius transformation, the function at the top of the post.

I’ve been reading about antennas, and in doing so I ran across the Smith chart. It’s essentially a plot of our function *f*. It comes up in the context of antennas, and electrical engineering more generally, because our function *f* maps reflection coefficients to normalized impedance. (I don’t actually understand that sentence yet but I’d like to.)

The Smith chart was introduced as a way of computing normalized impedance graphically. That’s not necessary anymore, but the chart is still used for visualization.

Image Wdwd, CC BY-SA 3.0, via Wikimedia Commons.

It’s easy to see that

*f*(*a*/*b*) = (*b*–*a*)/(*b*+*a*)

and since *f* is an involution,

*a*/*b* = *f*( (*b*–*a*)/(*b*+*a*) ).

Also, a Taylor series argument shows that for small *x*,

*f*(*x*)^{n} ≈ *f*(*nx*).
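The Taylor argument is one line: for small *x*, *f*(*x*) = 1 − 2*x* + O(*x*²), so *f*(*x*)^*n* ≈ (1 − 2*x*)^*n* ≈ 1 − 2*nx* ≈ *f*(*nx*). A quick numerical check (my code):

```python
def f(z):
    # f(z) = (1 - z)/(1 + z)
    return (1 - z) / (1 + z)

# f(x) = 1 - 2x + O(x**2), so f(x)**n ≈ (1 - 2x)**n ≈ 1 - 2nx ≈ f(nx)
x, n = 0.01, 5
print(f(x)**n, f(n * x))  # both ≈ 0.9048
```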

A terse article [1] bootstraps these two properties into a method for calculating roots. My explanation here is much longer than that of the article.

Suppose you want to mentally compute the 4th root of *a*/*b*. Multiply *a*/*b* by some number you can easily take the 4th root of until you get a number near 1. Then *f* of this number is small, and so the approximation above holds.

Bradbury gives the example of finding the 4th root of 15. We know the 4th root of 1/16 is 1/2, so we first try to find the 4th root of 15/16.

(15/16)^{1/4} = *f*(1/31)^{1/4} ≈ *f*(1/124) = 123/125

and so

15^{1/4} ≈ 2 × 123/125

where the factor of 2 comes from the 4th root of 16. The approximation is correct to four decimal places.

Bradbury also gives the example of computing the cube root of 15. The first step is to multiply by the cube of some fraction in order to get a number near 1. He chooses (2/5)^{3} = 8/125, and so we start with

15 × 8/125 = 24/25.

Then our calculation is

(24/25)^{1/3} = *f*(1/49)^{1/3} ≈ *f*(1/147) = 73/74

and so

15^{1/3} ≈ (5/2) × 73/74 = 365/148

which is correct to 5 decimal places.
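Both of Bradbury’s approximations are easy to verify numerically; here’s a quick sketch (my code, not from the article):

```python
def f(z):
    # f(z) = (1 - z)/(1 + z)
    return (1 - z) / (1 + z)

# 4th root of 15: 15/16 = f(1/31), and f(x)**n ≈ f(n*x),
# so (15/16)**(1/4) ≈ f(1/124), giving 15**(1/4) ≈ 2 * 123/125
approx4 = 2 * f(1/124)
print(approx4, 15**0.25)

# cube root of 15: 15 * 8/125 = 24/25 = f(1/49),
# so 15**(1/3) ≈ (5/2) * f(1/147) = (5/2) * 73/74
approx3 = (5/2) * f(1/147)
print(approx3, 15**(1/3))
```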

[1] Extracting roots by mental methods. Robert John Bradbury, The Mathematical Gazette, Vol. 82, No. 493 (Mar., 1998), p. 76.

The post Applications of (1-z)/(1+z) first appeared on John D. Cook.

I stumbled on a recording of a contrabass saxophone last night and wondered just how low it was [1], so I decided to write this post giving the ranges of each of the saxophones.

The four most common saxophones are baritone, tenor, alto, and soprano. These correspond to the instruments in the image above. There are saxophones below the baritone and above the soprano, but they’re rare.

Saxophones have roughly the same range as the human vocal parts with the corresponding names, as shown in the following table.

SPN stands for scientific pitch notation, explained here. Hz stands for Hertz, vibrations per second.

The human ranges are convenient two-octave ranges. Of course different singers have different ranges. (Different saxophone players have different ranges too if you include the altissimo range.)

If you include the rare saxophones, the saxophone family has almost the same range as a piano. The lowest note on a subcontrabass saxophone is a half step lower than the lowest note on a piano, and the highest note on the sopranissimo saxophone is a few notes shy of the highest note on a piano.

My intent when I wrote this post was to add some visualization. One thought was to display the data above on a piano keyboard. That would be a nice illustration, but it would be a lot of work to create. Then it occurred to me that putting things on a piano keyboard is really just a way of displaying the data on a log scale. So I plotted the SATB data on a log scale, which was much easier.

If I were to plot the data in the second table, it would look just like the blue solid lines in the plot above, just more of them. All saxes have a range of two and a half octaves, so all the vertical lines would be the same length. All the B♭ saxophones (tenor, soprano, etc.) are an octave apart, as are all the E♭ saxophones (baritone, alto, etc.), and so the vertical positions of the blue lines would continue the pattern above.

[1] I could figure it out in terms of musical notation—you can see what a regular pattern the various saxes have in the table above—but I think more in terms of frequencies these days, so I wanted to work everything out in terms of Hz. Also, I’d always assumed that tenor saxes and tenor voices have about the same range etc., but I hadn’t actually verified this before.

The post Saxophone ranges first appeared on John D. Cook.

To test whether

*a*/*b* > *c*/*d*,

it is often enough to test whether

*a*+*d* > *b*+*c*.

Said another way *a*/*b* is usually greater than *c*/*d* when *a*+*d* is greater than *b*+*c*.

This sounds imprecise if not crazy. But it is easy to make precise and [1] shows that it is true.

Consider 4/11 vs 3/7. In this case 4 + 7 = 11 < 11 + 3 = 14, which suggests 4/11 < 3/7, which is correct.

But the rule of thumb can lead you astray. For example, suppose you want to compare 3/7 and 2/5. We have 3 + 5 = 8 < 7 + 2 = 9, but 3/7 > 2/5.

The claim isn’t that the rule *always* works; clearly it doesn’t. The claim is that it *usually* works, in a sense made precise in the next section.
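The two examples above can be written as a quick check; `rule_sign` is a hypothetical helper name of mine:

```python
def rule_sign(a, b, c, d):
    # Sign of (a+d) - (b+c), the rule-of-thumb proxy for comparing a/b with c/d
    return (a + d) - (b + c)

# 4/11 vs 3/7: the rule suggests 4/11 < 3/7, and that's correct
print(rule_sign(4, 11, 3, 7), 4*7 - 11*3)  # both negative
# 3/7 vs 2/5: the rule suggests 3/7 < 2/5, but in fact 3/7 > 2/5
print(rule_sign(3, 7, 2, 5), 3*5 - 7*2)    # opposite signs
```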

Let *N* be a large integer, and pick integers *a*, *b*, *c*, and *d* at random uniformly between 1 and *N*. Let

*x* = *a*/*b* – *c*/*d*

and

*y* = (*a*+*d*) – (*b*+*c*).

Then the probability *x* and *y* are both positive, or both negative, or both zero, approaches 11/12 as *N* goes to infinity.

I won’t repeat the proof of the theorem above; see [1] for that. But I’ll give a simulation that illustrates the theorem.

```python
import numpy as np

np.random.seed(20210225)

N = 1_000_000
numreps = 10_000
count = 0
for _ in range(numreps):
    a, b, c, d = np.random.randint(1, N+1, 4, dtype=np.int64)
    x = a*d - b*c          # same sign as a/b - c/d
    y = (a+d) - (b+c)
    if np.sign(x) == np.sign(y):
        count += 1
print(count/numreps)
```

This prints 0.9176, and 11/12 = 0.91666….

The random number generator `randint` defaults to 32-bit integers, but this could lead to overflow since 10^{6} × 10^{6} > 2^{32}. So I used 64-bit integers.

Instead of computing *a*/*b* – *c*/*d*, I multiply by *bd* and compute *ad* – *bc*, because this avoids any possible floating point issues.

In case you’re not familiar with the `sign` function, it returns 1 for positive numbers, 0 for 0, and -1 for negative numbers.

The code suggests a different statement of the theorem: if you generate two pairs of integers, their sums and their products are probably ordered the same way.

This post has assumed that the numbers *a*, *b*, *c*, and *d* are all chosen uniformly at random. But the components of the fractions for which you have any doubt whether *a*/*b* is greater or less than *c*/*d* are not uniformly distributed. For example, consider 17/81 versus 64/38. Clearly the former is less than 1 and the latter is greater than 1.

It would be interesting to try to assess how often the rule of thumb presented here is correct in practice. You might try to come up with a model for the kinds of fractions people can’t compare instantly, such as proper fractions that have similar size.

[1] Kenzi Odani. A Rough-and-Ready Rule for Fractions. The Mathematical Gazette, Vol. 82, No. 493 (Mar., 1998), pp. 107-109

The post Fraction comparison trick first appeared on John D. Cook.

There is exactly one value of *x* for which

Pr(*Z* < *x*) = *x*.

That is, Φ has a unique fixed point where Φ is the CDF of a standard normal.

It’s easy to find the fixed point: start anywhere and iterate Φ.

Here’s a cobweb plot that shows how the iterates converge, starting with -2.

The black curve is a plot of Φ. The blue stair-step is a visualization of the iterates. The stair-step pattern comes from outputs turning into inputs. That is, vertical blue lines connect an input value *x* to its output value *y*, and horizontal blue lines represent sliding a *y* value over to the dotted line *y* = *x* in order to turn it into the next *x* value.

*y*’s become *x*’s. The blue lines get short when we’re approaching the fixed point because now the outputs approximately equal the inputs.

Here’s a list of the first 10 iterates.

0.02275 0.50907 0.69465 0.75636 0.77528 0.78091 0.78257 0.78306 0.78320 0.78324

So it seems that after 10 iterations, we’ve converged to the fixed point in the first four decimal places.
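The iteration takes only a few lines; here’s a sketch (my code, not the author’s), computing Φ from the error function:

```python
from math import erf, sqrt

def Phi(x):
    # CDF of the standard normal distribution
    return 0.5 * (1 + erf(x / sqrt(2)))

x = -2.0
for _ in range(10):
    x = Phi(x)
print(x)  # ≈ 0.7832
```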

The post Normal probability fixed point first appeared on John D. Cook.

At first thought you might think you could become a superstar, like the musician in the movie Yesterday who takes credit for songs by The Beatles. But on second thought, maybe not.

Maybe you’re a physicist and you go back in time before relativity. If you go back far enough, people will not be familiar with the problems that relativity solves, and your paper on relativity might not be accepted for publication. And you can’t just post it on arXiv.

If you know about technology that will be developed, but you don’t know how to get there using what’s available in the past, it might not do you much good. Someone with a better knowledge of the time’s technology might have the advantage.

If you’re a mathematician and you go back in time 50 years, could you scoop Andrew Wiles on the proof of Fermat’s Last Theorem? Almost certainly not. You have the slight advantage of knowing the theorem is true, whereas your colleagues only strongly suspect that it’s true. But what about the proof?

If I were in that position, I might think “There’s something about a Seaborn group? No, that’s the Python library. Selmer group? That sounds right, but maybe I’m confusing it with the musical instrument maker. What is this Selmer group anyway, and how does it let you prove FLT?”

Could Andrew Wiles himself reproduce the proof of FLT without access to anything written after 1971? Maybe, but I’m not sure.

In your area of specialization, you might be able to remember enough details of proven results to have a big advantage over your new peers. But your specialization might be in an area that hasn’t been invented yet, and you might know next to nothing about areas of research that are currently in vogue. Maybe you’re a whiz at homological algebra, but that might not be very useful in a time when everyone is publishing papers on differential equations and special functions.

There are a few areas where a time traveler could make a big splash, areas that could easily have been developed much sooner but weren’t. For example, the idea of taking the output of a function and sticking it back in, over and over, is *really* simple. But nobody looked into it deeply until around 50 years ago.

Then came the Mandelbrot set, Feigenbaum’s constants, period three implies chaos, and all that. A lack of computer hardware would be frustrating, but not insurmountable. Because computers have shown you what phenomena to look for, you could go back and reproduce them by hand.

In most areas, I suspect knowledge of the future wouldn’t be an enormous advantage. It would obviously be *some* advantage. Knowing which way a conjecture was settled tells you whether to try to prove or disprove it. In the language of microprocessors, it gives you good branch prediction. And it would help to have patterns in your head that have turned out to be useful, even if you couldn’t directly appeal to these patterns without blowing your cover. But I suspect it might make you something like 50% more productive, not enough to turn an average researcher into a superstar.

The term “debauch of indices” is pejorative, but I’ve usually heard it used tongue-in-cheek. Although some people can be purists, going to great lengths to avoid index manipulation, pragmatic folk move up and down levels of abstraction as necessary to get their work done.

I searched on the term “debauch of indices” to find out who first said it, and found an answer on Stack Exchange that traces it back to Élie Cartan. Cartan said that although “le Calcul différentiel absolu du Ricci et Levi-Civita” (tensor calculus) is useful, “les débauches d’indices” could hide things that are easier to see geometrically.

After solving my problem using indices, I went back and came up with a more abstract solution. Both approaches were useful. The former cut through a complicated problem formulation and made things more tangible. The latter revealed some implicit pieces of the puzzle that needed to be made explicit.

Paul Graham wrote something similar about fake work. Blatantly non-productive activity doesn’t dissipate your productive energy as unimportant work does.

I have a not-to-do list, though it’s not as rigorous as the “avoid at all costs” list that Warren Buffett is said to have recommended. These are not hard constraints but what optimization theory calls soft constraints: more like stiff springs than brick walls.

One of the things on my not-to-do list is work with students. They don’t have money, and they often want you to do their work for them, e.g. to write the statistical chapter of their dissertation. It’s easier to avoid ethical dilemmas and unpaid invoices by simply turning down such work. I haven’t made exceptions to this one.

My softest constraint is to avoid small projects, unless they’re interesting, likely to lead to larger projects, or wrap up quickly. I’ve made exceptions to this rule, some of which I regret. My definition of “small” has generally increased over time.

I like the variety of working on lots of small projects, but it becomes overwhelming to have too many open projects at the same time. Also, transaction costs and mental overhead are proportionally larger for small projects.

Most of my not-to-do items are not as firm as my prohibition against working with students but more firm than my prohibition against small projects. These are mostly things I have pursued far past the point of diminishing return. I would pick them back up if I had a reason, but I’ve decided not to invest any more time in them just-in-case.

Sometimes things move off my not-to-do list. For example, Perl was on my not-to-do list for a long time. There are many reasons not to use Perl, and I agree with all of them in context. But nothing beats Perl for small text-munging scripts for personal use.

I’m not advocating my personal not-to-do list, only the idea of having a not-to-do list. And I’d recommend seeing it like a storage facility rather than a landfill: some things may stay there a while then come out again.

I’m also not advocating evaluating everything in terms of profit. I do lots of things that don’t make money, but when I am making money, I want to make money. I might take on a small project *pro bono*, for example, that I wouldn’t take on for work. I heard someone say “Work for full rate or for free, but not for cheap” and I think that’s good advice.

***

[1] Some sources say this story *may* be apocryphal. But “apocryphal” means of doubtful origin, so it’s redundant to say something may be apocryphal. Apocryphal does not mean “false.” I’d say a story might be false, but I wouldn’t say it might be apocryphal.

That post was based on the assumption that 26 million Americans had been infected with the virus. I’ve heard other estimates of 50 million or 100 million. The post was also based on the assumption that we’re vaccinating 1.3 million per day. A more recent estimate is 1.8 million per day. So maybe my estimate was pessimistic. On the other hand, the estimate for the number of people with pre-existing immunity that I used may have been optimistic. (Or not. There’s a lot we don’t know.)

Because there is so much we don’t know, and because numbers are frequently being updated, I’ve written a little Python code to make all the assumptions explicit and easy to update. According to this calculation, we’re 45 days from herd immunity.

As I pointed out before, herd immunity is not a magical cutoff with an agreed-upon definition. I’m using a definition that was suggested a year ago. Viruses never [1] completely go away, so any cutoff is arbitrary.

Here’s the code. It’s Python, but it would be trivial to port to any programming language. Just remove the underscores as thousands separators if your language doesn’t support them and change the comment marker if necessary.

```python
US_population = 330_000_000
num_vaccinated = 40_000_000
num_infected = 50_000_000
vaccine_efficacy = 0.9
herd_immunity_portion = 0.70

# Some portion of the population had immunity to SARS-COV-2
# before the pandemic. I've seen estimates from 10% up to 60%.
portion_pre_immune = 0.40
num_pre_immune = portion_pre_immune*US_population

# Adjust for vaccines given to people who are already immune.
portion_at_risk = 1.0 - (num_pre_immune + num_infected)/US_population
num_new_vaccine_immune = num_vaccinated*vaccine_efficacy*portion_at_risk

# Number immune at present
num_immune = num_pre_immune + num_infected + num_new_vaccine_immune

herd_immunity_target = herd_immunity_portion*US_population
num_needed = herd_immunity_target - num_immune

num_vaccines_per_day = 1_800_000
num_new_immune_per_day = num_vaccines_per_day*portion_at_risk*vaccine_efficacy

days_to_herd_immunity = num_needed / num_new_immune_per_day
print(days_to_herd_immunity)
```

[1] One human virus has been eliminated. Smallpox was eradicated two centuries after the first modern vaccine.

The post Herd immunity countdown first appeared on John D. Cook.

The images produced are sensitive to small changes in the starting parameters *x* and *y*, as well as to the parameters *a*, *b*, and *c*.

Here are three examples:

And here’s the Python code that was used to make these plots.

```python
import matplotlib.pyplot as plt
from numpy import sign, empty

def make_plot(x, y, a, b, c, filename, N=20000):
    xs = empty(N)
    ys = empty(N)
    for n in range(N):
        # Iterate the map (x, y) -> (y - sign(x)*sqrt(|bx - c|), a - x)
        x, y = y - sign(x)*abs(b*x - c)**0.5, a - x
        xs[n] = x
        ys[n] = y
    plt.scatter(xs, ys, c='k', marker='.', s=1, alpha=0.5)
    plt.gca().set_aspect(1)
    plt.axis('off')
    plt.savefig(filename)
    plt.close()

make_plot(5, 5, 30.5, 2.5, 2.5, "doiley1.png")
make_plot(5, 6, 30, 3, 3, "doiley2.png")
make_plot(3, 4, 5, 6, 7, "doiley3.png")
```

[1] Desert Island Theorems: My Magnificent Seven by Tony Crilly. The Mathematical Gazette, Mar., 2001, Vol. 85, No. 502 (Mar., 2001), pp. 2-12

The post Martin’s doileys first appeared on John D. Cook.

First example: Is 2759 divisible by 31?

Yes, because

2759 → 275 − 3 × 9 = 248 → 24 − 3 × 8 = 0

and 0 is divisible by 31. (Here *k* = 3; the rule for finding *k* is below.)

Is 75273 divisible by 61? No, because

75273 → 7527 − 6 × 3 = 7509 → 750 − 6 × 9 = 696 → 69 − 6 × 6 = 33

and 33 is not divisible by 61. (Here *k* = 6.)

What in the world is going on?

Let *p* be an odd prime and *n* a number we want to test for divisibility by *p*. Write *n* as 10*a* + *b* where *b* is a single digit. Then there is a number *k*, depending on *p*, such that *n* is divisible by *p* if and only if

*a* − *kb*

is divisible by *p*. (The reason this works: 10(*a* − *kb*) = *n* − (10*k* + 1)*b*, and each choice of *k* below makes 10*k* + 1 a multiple of *p*.)

So how do we find *k*?

- If *p* ends in 1, we can take *k* = ⌊*p*/10⌋.
- If *p* ends in 3, we can take *k* = ⌊7*p*/10⌋.
- If *p* ends in 7, we can take *k* = ⌊3*p*/10⌋.
- If *p* ends in 9, we can take *k* = ⌊9*p*/10⌋.

Here ⌊*x*⌋ means the floor of *x*, the largest integer no greater than *x*. Divisibility by even primes and primes ending in 5 is left as an exercise for the reader. The rule takes more effort to carry out when *k* is larger, but this rule generally takes less time than long division by *p*.

One final example. Suppose we want to test divisibility by 37. Since 37 ends in 7, *k* = ⌊3 × 37/10⌋ = ⌊111/10⌋ = 11.

Let’s test whether 3293 is divisible by 37.

329 – 11×3 = 296

29 – 11×6 = -37

and so yes, 3293 is divisible by 37.
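The whole procedure fits in a few lines of code; here’s a sketch (the function names are mine):

```python
def divisibility_k(p):
    # k such that p divides 10k + 1, per the rules above
    last = p % 10
    if last == 1: return p // 10
    if last == 3: return (7 * p) // 10
    if last == 7: return (3 * p) // 10
    if last == 9: return (9 * p) // 10
    raise ValueError("p must be an odd prime other than 5")

def divisible(n, p):
    # n = 10a + b is divisible by p iff a - k*b is
    k = divisibility_k(p)
    while abs(n) >= 10 * p:
        a, b = divmod(n, 10)
        n = a - k * b
    return n % p == 0

print(divisible(2759, 31), divisible(75273, 61), divisible(3293, 37))
# True False True
```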

[1] R. A. Watson. Tests for Divisibility. The Mathematical Gazette, Vol. 87, No. 510 (Nov., 2003), pp. 493-494

The post Divisibility by any prime first appeared on John D. Cook.

This post illustrates a few features of Perl:

- Arbitrary precision floating point
- Lazy quantifiers in regular expressions
- Returning the positions of matched groups.

Our problem is to look for the digits 3, 1, 4, and 1 in the decimal part of π.

First, we get the first 100 digits of π after the decimal as a string. (It turns out 100 is enough, but if it weren’t we could try again with more digits.)

```perl
use Math::BigFloat "bpi";
$x = substr bpi(101)->bstr(), 2;
```

This loads Perl’s extended precision library `Math::BigFloat`, gets π to 101 significant figures, converts the result to a string, then lops off the first two characters “3.” at the beginning, leaving “141592…”.

Next, we want to search our string for a 3, followed by some number of digits, followed by a 1, followed by some number of digits, followed by a 4, followed by some number of digits, and finally another 1.

A naive way to search the string would be to use the regex `/3.*1.*4.*1/`. But the star operator is greedy: it matches as much as possible. So the `.*` after the 3 would match as many characters as possible before backtracking to look for a 1. But we’d like to find the *first* 1 after a 3 etc.

The solution is simple: add a `?` after each star to make the match lazy rather than greedy. So the regular expression we want is

/3.*?1.*?4.*?1/

This will tell us *whether* our string contains the pattern we’re after, but we’d like to also know *where* the string contains the pattern. So we make each segment a captured group.

/(3.*?)(1.*?)(4.*?)(1)/

Perl automatically populates an array `@-` with the positions of the matches, so it has the information we’re looking for. Element 0 of the array is the position of the entire match, so it is redundant with element 1. The advantage of this bit of redundancy is that the starting position of group `$1` is in the element with index 1, the starting position of `$2` is at index 2, etc.

We use the `shift` operator to remove the redundant first element of the array. Since `shift` modifies its argument, we can’t apply it directly to the constant array `@-`, so we apply it to a copy.

```perl
if ($x =~ /(3.*?)(1.*?)(4.*?)(1)/) {
    @positions = @-;
    shift @positions;
    print "@positions\n";
}
```

This says that our pattern appears at positions 8, 36, 56, and 67. Note that these are array indices, and so they are zero-based. So if you count from 1, the first 3 appears in the 9th digit etc.

To verify that the digits at these indices are 3, 1, 4, and 1 respectively, we make the digits into an array, and slice the array by the positions found above.

```perl
@digits = split(//, $x);
print "@digits[@positions]\n";
```

This prints `3 1 4 1` as expected.
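For comparison, and not part of the original Perl, here’s the same lazy-quantifier search in Python’s `re` module, with the first 100 decimal digits of π pasted in as a literal:

```python
import re

pi_digits = ("1415926535897932384626433832795028841971"
             "6939937510582097494459230781640628620899"
             "86280348253421170679")

# Lazy groups find the first 1 after the first 3, etc.
m = re.search(r"(3.*?)(1.*?)(4.*?)(1)", pi_digits)
positions = [m.start(i) for i in range(1, 5)]
print(positions)  # [8, 36, 56, 67]
```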

Tonight I repeated my experiment with an empty water bottle. But I ran into a difficulty immediately: where would you say the neck ends?

An ideal Helmholtz resonator is a cylinder on top of a larger sphere. My water bottle is basically a cone on top of a cylinder.

So instead of measuring the neck length *L* and seeing what pitch was predicted by the formula from the earlier post, I decided to solve for *L* and see what neck measurement would be consistent with the Helmholtz resonator approximation. The pitch *f* was 172 Hz, the neck of the bottle is one inch wide, and the volume is half a liter. This implies *L* is 10 cm, which is a little less than the height of the conical part of the bottle.
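Here’s the solve-for-*L* calculation in code, a sketch using the measurements above:

```python
from math import pi

# Solving the Helmholtz formula f = (v/2π)√(A/(LV)) for L
v = 343.0                    # speed of sound, m/s
f = 172.0                    # measured pitch, Hz
A = pi * (0.5 * 0.0254)**2   # one-inch-wide neck, radius 1/2 inch, in m²
V = 0.5e-3                   # half a liter, in m³

L = (v / (2 * pi * f))**2 * A / V
print(L)  # ≈ 0.102 m, about 10 cm
```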

Here’s the chart. The C column also stands for languages like Python that follow C’s conventions. More on this below.

The table above is a PNG image. An HTML version of the same table is available here.

The rest of the post discusses details and patterns in the table.

There are six trig functions according to the most common convention: sine, cosine, tangent, secant, cosecant, and cotangent. Your list could be longer or shorter, depending on how you count them. These six functions have inverses (over some range) and so a math library could have 12 trig functions, including inverses.

Except there’s a wrinkle with the inverse tangent. For any given *t* there are infinitely many angles θ whose tangent is *t*. Which one do you want? By convention, we usually want θ between -π/2 and π/2. But it’s handy sometimes to specify a point (*x*, *y*) and ask for the angle made by the line from the origin to the point. In that case we want a value of θ *in the same quadrant* as the point. So if *y*/*x* = *z*, the one-argument form of inverse tangent depends only on *z*, but the two-argument form depends on both *x* and *y*; two points with equal ratios may result in different choices of θ.

So with our two versions of inverse tangent we have a total of 13 functions. This post will explain which of these 13 functions are supported in C, Python, R, Perl, Mathematica, bc, and Common Lisp, and how the supported functions are named.

Of the languages I looked at, only Mathematica implements all 13 functions. The names of the six basic functions are what you’d see in a contemporary calculus textbook except that, like all functions in Mathematica, names begin with a capital letter.

So the six trig functions are `Sin`, `Cos`, `Tan`, `Sec`, `Csc`, and `Cot`. The inverse functions are the same with an `Arc` prefix: `ArcSin`, `ArcCos`, etc.

The two inverse tangent functions are both named `ArcTan`. With one argument, `ArcTan[z]` returns an angle θ between -π/2 and π/2 such that tan θ = *z*. With two arguments, `ArcTan[x, y]` returns an angle θ in the same quadrant as (*x*, *y*) with tan θ = *y*/*x*.

Python and R follow C’s lead, as do other programming languages such as JavaScript.

C does not support sec, csc, and cot, presumably because they’re simply the reciprocals of cos, sin, and tan respectively. It does not support their inverses either.

Inverse trig functions are denoted with an `a` prefix: `asin`, `acos`, `atan`. The two-argument form of inverse tangent is `atan2`. The order of arguments to `atan2` differs from that of Mathematica and is discussed in the section on Common Lisp below.

NumPy supports the same trig functions as base Python, but it uses different names for inverse functions. That is, NumPy uses `arcsin`, `arccos`, `arctan`, and `arctan2` while Python’s standard `math` module uses `asin`, `acos`, `atan`, and `atan2`.

Common Lisp is the same as C except for the inverse tangent function of two arguments. There CL is similar to Mathematica in that the same name is used for the function of one argument and the function of two arguments.

However, the order of the arguments is reversed in CL relative to Mathematica. That is, `(atan y x)` in CL equals `ArcTan[x, y]` in Mathematica.

Even though the two languages use opposite conventions, there are good reasons for both. Mathematica interprets the arguments of `ArcTan[x, y]` as the coordinates of a point, written in the usual order. Common Lisp interprets `(atan y x)` as a function whose second argument defaults to 1.

Mathematica has the advantage that the coordinates of a point are listed in the natural order. Common Lisp has the advantage that the meaning of the first argument does not change if you add a second argument.

C and languages like Python and R that follow C use the same convention as Common Lisp, i.e. the *first* argument to `atan2` is the *second* coordinate of a point.
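A small Python illustration of why the two-argument form matters (Python’s `math` module follows the C convention):

```python
from math import atan, atan2, pi

x, y = -1.0, 1.0   # a point in the second quadrant

print(atan(y / x))   # -0.785..., i.e. -π/4: wrong quadrant
print(atan2(y, x))   #  2.356..., i.e. 3π/4: same quadrant as the point
```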

The base Perl language only supports three trig functions: sine, cosine, and two-argument inverse tangent. The module `Math::Trig` supports everything else, including the reciprocal functions and their inverses.

It’s interesting that Perl made the choices that it did. There are languages, like `bc` discussed below, that support inverse tangent with one argument but not two; Perl is the only language I know of that does the opposite, supporting inverse tangent with two arguments but not one. It makes sense that if you’re only going to support one, you support the more general of the two.

Perl’s `atan2` function uses the same argument convention as C et al., i.e. `atan2(y, x)`, numerator first.

In the `Math::Trig` module, the reciprocal trig functions are named `sec`, `csc`, and `cot`, as is standard now. (You may see things like cosec and ctn in older math books.) Inverse functions have an `a` prefix, as they do in C.

There’s one inexplicable quirk in `Math::Trig`: the functions `sin` and `cos` aren’t there, but `atan2` is. I could see leaving out `sin` and `cos` because they’re redundant with base Perl, but so is `atan2`. It would be more consistent to add `sin` and `cos` (my preference) or take out `atan2`.

The Unix calculator `bc` has a minimal set of trig functions: sine, cosine, and (one-argument) inverse tangent. These are denoted simply `s`, `c`, and `a`. But you can bootstrap your way from these three functions to all the rest.

From Leonardo da Vinci:

The impetus is much quicker than the water, for it often happens that the wave flees the place of its creation, while the water does not; like the waves made in a field of grain by the wind, where we see the waves running across the field while the grain remains in place.

Quoted in Almost All About Waves by John R. Pierce

The post Da Vinci on wave propagation first appeared on John D. Cook.

She had started removing the label, but as you can see she didn’t get very far yet. It’s an Incanto Chardonnay Pinot Grigio from Trader Joe’s.

I blew across the top of the bottle to hear what sound it makes, and it makes a nice deep rumble.

I tried to identify the pitch using a spectrum analyzer app on my phone, and it says 63 Hz.

Next I tried to figure out what pitch I should expect theoretically based on physics. Wine bottles are **Helmholtz resonators**, and there’s a formula for the fundamental frequency of Helmholtz resonators:

*f* = (*v* / 2π) √( *A* / (*LV*) )

The variables in this equation are:

- *f*, frequency in Hz
- *v*, velocity of sound
- *A*, area of the opening
- *L*, length of the neck
- *V*, volume

I measured the opening to be 3/4 of an inch across, and the neck to be about 7 inches. The volume is 1.5 liters. The speed of sound at sea level and room temperature is 343 meters per second. After a few unit conversions [1] I got a result of 56.4 Hz, about 10% lower than what the spectrum analyzer measured.

An ideal Helmholtz resonator has a cylindrical neck attached to a spherical body. This bottle is *far* from spherical. The base is an ellipse with a major axis about twice as long as the minor axis. And from there it tapers off more like a cone than a sphere [2]. And yet the frequency predicted by Helmholtz’ formula comes fairly close to what I measured empirically.

I suspect I got lucky to some extent. I didn’t measure the bottle that accurately; it’s hard to even say when the neck of the bottle stops. But apparently Helmholtz’ formula is robust to changes in shape.

I repeated my experiment with a beer bottle, specifically a Black Venom Imperial Stout.

The opening diameter is about 3/4″, as with the wine bottle above, and the neck is about 3″ long. The volume is 12 fluid ounces. Helmholtz’ formula predicts a pitch of 177 Hz. My spectrum analyzer measured 191 Hz, the G below middle C. So this time theory was about 7% lower than the observed value.
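Both estimates can be reproduced in a few lines of Python, using the measurements above and taking 12 fluid ounces to be about 0.355 liters. The function name `helmholtz` is mine, not from the post; it’s just a sketch of the unit handling.

```python
from math import pi, sqrt

def helmholtz(diameter_in, neck_in, volume_liters, v=343.0):
    """Helmholtz frequency f = (v / 2π) √(A / (V L)), inputs in inches and liters."""
    inch = 0.0254                           # meters per inch
    A = pi * (diameter_in * inch / 2)**2    # opening area in m²
    L = neck_in * inch                      # neck length in m
    V = volume_liters / 1000                # volume in m³
    return v / (2*pi) * sqrt(A / (V*L))

print(round(helmholtz(0.75, 7, 1.5), 1))    # wine bottle → 56.4
print(round(helmholtz(0.75, 3, 0.355)))     # beer bottle → 177
```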

The beer bottle is closer to the shape of a Helmholtz resonator than the wine bottle was. It’s at least radially symmetric, but the body is a cylinder rather than a sphere.

[1] Thanks to a reader who provided this write-up of the calculation:

[2] What we usually call a cone is more specifically a right circular cone. But more generally a cone can have any base, not just a circle, and this bottle is approximately an elliptical cone.

The post Pitch of a big wine bottle first appeared on John D. Cook.

This morning I played around with the code from that earlier post and made some new images.

The following image was based on exp(*x*+7)/10 + 10*tan(*x*/5).

This image was based on the same function over a different range.

And this hummingbird-like image was based on exp(*x*) – *x*.

Here are a few more blog posts that have interesting images.

The post More images from an oddball coordinate system first appeared on John D. Cook.

Here φ is the golden ratio, (1 + √5)/2.

We’ll use this formula as the jumping off point to discuss the implications of how equations are written, complex logarithms, and floating point computing in Python.

Of course every equation of the form *a* = *b* can be rewritten as *b* = *a*. The two forms of the equation have the same denotation but different connotations. Equations have an implied direction of application: when you see an equation written as *a* = *b*, it’s usually applied by substituting *b* for *a*.

For example, take the equation *A* = π*r*². The most natural application of this equation is that you can compute the area of a circle of radius *r* by squaring *r* and multiplying by π. It’s less common to see something like 9π and think “Oh, that’s the area of a circle of radius 3.”

Note also that this is how nearly every programming language works: `a = b` means update the variable `a` to the value of `b`.

In writing Askey’s formula as above, I’m implying that it might be useful to express the *m*th Fibonacci number in terms of hyperbolic sine evaluated at complex arguments. Why in the world would you want to do that? The Fibonacci numbers are elementary and concrete, but logs and hyperbolic functions are not so much, especially with complex arguments. Askey’s formula is interesting, and that would be enough, but it could be useful if some things are easier to prove using the formulation on the right side. See, for example, [1].

If I had written the formula above as

the implication would be that the complicated expression on the left can be reduced to the simple expression on the right. It would be gratifying if some application led naturally to the formulation on the left, but that seems highly unlikely.

I’ll close out this section with two more remarks about the direction of reading equations. First, it is a common pattern in proofs to begin by applying an equation left-to-right and to conclude by applying the same equation right-to-left. Second, it’s often a clever technique to apply an equation in the opposite of the usual order. [2]

What does it mean to take the logarithm of a complex number? Well, the same as it does to take the logarithm of any number: invert the exp function. That is, log(*x*) = *y* means that *y* is a number such that exp(*y*) = *x*. Except that it’s not quite that simple. Notice the indefinite article: “*a* number such that …”. For positive real *x*, there is a unique real number *y* such that exp(*y*) = *x*. But there are infinitely many complex solutions *y*, even if *x* is real: for any integer *n*, exp(*y* + 2π*ni*) = exp(*y*).

When we extend log to complex arguments, we usually want to do so in such a way that we keep familiar logs the same. We want to extend the logarithm function from the positive real axis into more of the complex plane. We can’t extend it continuously to the entire complex plane. We have to exclude some path from the origin out to infinity, and this path is known as a branch cut.

The conventional choice for log is to cut out the negative real axis. That’s what NumPy does.
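A quick NumPy check of this convention; `np.log` returns the principal value.

```python
import numpy as np

# exp is periodic with period 2πi, which is why log is multi-valued
assert np.isclose(np.exp(1 + 2j*np.pi), np.exp(1))

# NumPy returns the principal value, with the branch cut
# along the negative real axis
print(np.log(1j))        # imaginary part π/2
print(np.log(-1 + 0j))   # imaginary part π
```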

Let’s see whether Askey’s formula works when coded up in Python.

```python
from numpy import sinh, log

def f(m):
    phi = (1 + 5**0.5)/2
    return 2*sinh(m*log(phi*1j))/(5**0.5*(1j)**m)
```

Note that Python uses *j* for the imaginary unit rather than *i*. And you can’t just use `j` in the code above; you have to use `1j`. That lets Python use `j` as an ordinary variable when it’s not part of a complex number.

When we evaluate `f(1)`, we expect 1, the first Fibonacci number. Instead, we get

(1-2.7383934913210137e-17j)

Although this is surprising at first glance, the imaginary part is tiny. Floating point numbers in Python have about 16 significant figures (more details here) and so the imaginary part is as close to zero as we can expect for any floating point calculation.

Here are the first five Fibonacci numbers using the code above.

(1-2.7383934913210137e-17j)
(1.0000000000000002-1.6430360947926083e-16j)
(2-3.2860721895852156e-16j)
(3.0000000000000013-7.667501775698841e-16j)
(5.0000000000000036-1.5061164202265582e-15j)

If you took the real part and rounded to the nearest integer, you’d have yet another program to compute Fibonacci numbers, albeit an inefficient one.

Just out of curiosity, let’s see how far we could use this formula before rounding error makes it incorrect.

```python
import functools

@functools.lru_cache()
def fib0(m):
    if m == 1 or m == 2:
        return 1
    else:
        return fib0(m-1) + fib0(m-2)

def fib1(m):
    return round(f(m).real)

for m in range(1, 70):
    a, b = fib0(m), fib1(m)
    if a != b:
        print(f"m: {m}, Exact: {a}, f(m): {f(m)}")
```

The `lru_cache` decorator adds memoization to our recursive Fibonacci generator. It caches computed values behind the scenes so that the code does not evaluate the function over and over again with the same arguments. Without it, the function starts to bog down for values of *m* in the 30s. With it, the time required to execute the code isn’t noticeable.

The code above shows that `fib1` falls down when *m* = 69.

m: 69, Exact: 117669030460994, f(m): (117669030460994.7-0.28813316817427764j)

[1] Thomas Osler and Adam Hilburn. An Unusual Proof That *F*_{m} Divides *F*_{mn} Using Hyperbolic Functions. The Mathematical Gazette, Vol. 91, No. 522 (Nov., 2007), pp. 510-512.

[2] OK, a third comment. You might see equations written in different directions according to the context. For example, in a calculus book you’d see

1/(1-*x*) = 1 + *x* + *x*² + *x*³ + …

but in a book on generating functions you’re more likely to see

1 + *x* + *x*² + *x*³ + … = 1/(1-*x*)

because calculus problems start with functions and compute power series, but generating function applications create power series and then manipulate their sums. For another example, differential equation texts start with a differential equation and compute functions that satisfy the equation. Books on special functions might start with a function and then present a differential equation that the function satisfies because the latter form makes it easier to prove certain things.

The post Fibonacci numbers and hyperbolic sine first appeared on John D. Cook.

According to a recent article, about 26 million Americans have been vaccinated against COVID, about 26 million Americans have been infected, and 1.34 million a day are being vaccinated, all as of February 1, 2021.

Somewhere around half the US population was immune to SARS-CoV-2 before the pandemic began, due to immunity acquired from previous coronavirus exposure. The proportion isn’t known accurately, but has been estimated as somewhere between 40 and 60 percent.

Let’s say that as of February 1, that 184 million Americans had immunity, either through pre-existing immunity, infection, or vaccination. There is some overlap between the three categories, but we’re taking the lowest estimate of pre-existing immunity, so maybe it sorta balances out.

The vaccines are said to be 90% effective. That’s probably optimistic—treatments often don’t perform as well in the wild as they do in clinical trials—but let’s assume 90% anyway. Furthermore, let’s assume that half the people being vaccinated already have immunity, due to pre-existing immunity or infection.

Then the number of people gaining immunity each day is 0.5*0.9*1,340,000, which is about 600,000 per day. This assumes nobody develops immunity through infection from here on out, though of course some will.

There’s no consensus on how much of the population needs to have immunity before you have herd immunity, but I’ve seen numbers like 70% tossed around, so let’s say 70%.

We assumed we had 184 M with immunity on February 1, and we need 231 M (70% of a US population of 330M) to have herd immunity, so we need 47 M more people. If we’re gaining 600,000 per day through vaccination, this would take 78 days from February 1, which would be April 20.
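Here is the arithmetic as a short script you could rerun as the numbers change; every input is one of the assumptions above.

```python
from datetime import date, timedelta

pop = 330_000_000
immune_feb1 = 184_000_000                       # assumed immune on February 1
target = 0.7 * pop                              # assumed herd immunity threshold
newly_immune_per_day = 0.5 * 0.9 * 1_340_000    # about 600,000 per day

days = (target - immune_feb1) / newly_immune_per_day
print(round(days))                                      # → 78
print(date(2021, 2, 1) + timedelta(days=round(days)))   # → 2021-04-20
```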

So, the bottom line of this very crude calculation is that we should have herd immunity by the end of April.

I’ve pointed out several caveats. There are more, but I’ll only mention one, and that is that herd immunity is not an objective state. Viruses never completely go away; only one human virus—smallpox—has ever been eradicated, and that took two centuries after the development of a vaccine.

Every number in this post is arguable, and so the result should be taken with a grain of salt, as I said from the beginning. Certainly you shouldn’t put April 20 on your calendar as the day the pandemic is over. But this calculation does suggest that we should see a substantial drop in infections long before most of the population has been vaccinated.

**Update**: A few things have changed since this was written. For one thing, we’re vaccinating more people per day. See an update post with code you can update (or just carry out by hand) as numbers change.

|*a*²-*b*²|, 2*ab*, *a*²+*b*²

is a Pythagorean triple. The result still gives the sides of a right triangle if the starting points aren’t integers.
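A quick numerical check of this claim. The identity (*a*² – *b*²)² + (2*ab*)² = (*a*² + *b*²)² holds for any real *a* and *b*, so the construction gives a right triangle whether or not the inputs are integers:

```python
def step(a, b):
    """One application of the Pythagorean construction."""
    return abs(a**2 - b**2), 2*a*b, a**2 + b**2

x, y, z = step(2, 1)
print(x, y, z)            # → 3 4 5, the classic triple
assert x**2 + y**2 == z**2

x, y, z = step(1.5, 0.5)  # non-integer starting points
assert abs(x**2 + y**2 - z**2) < 1e-12
```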

In [1], Nick Lord looks at what happens if you iterate this procedure, using the output of one step as the input to the next step, and look at the smaller angle of the right triangle that results.

The numbers grow exponentially, so it helps to divide *a* and *b* by *c* on each iteration. This prevents the series from overflowing but doesn’t change the angles.

Here’s a little Python code to explore the sequence of angles.

```python
import numpy as np

N = 50000
θ = np.empty(N, dtype=float)
a, b = np.random.random(2)
for i in range(1, N-1):
    c = a**2 + b**2
    a, b = abs(a**2 - b**2)/c, 2*a*b/c
    θ[i] = min(np.arctan(a/b), np.arctan(b/a))
```

Here’s what we get if we plot the first 100 angles.

As Lord points out in [1], the series is chaotic.

Here’s a histogram of the angles.

[1] Nick Lord. Maths bite: Pythagoras causes chaos! The Mathematical Gazette, Vol. 92, No. 524 (July 2008), pp. 290-292.

The post Pythagorean chaos first appeared on John D. Cook.

However, Python’s indentation rules complicate matters because the indentation becomes part of the quoted string. For example, suppose you have the following code outside of a function.

```python
x = """\
abc
def
ghi
"""
```

Then you move this into a function `foo` and change its name to `y`.

```python
def foo():
    y = """\
    abc
    def
    ghi
    """
```

Now `x` and `y` are different strings! The former begins with `a` and the latter begins with four spaces. (The backslash after the opening triple quote prevents the following newline from being part of the quoted string. Otherwise `x` and `y` would begin with a newline.) The string `y` also has four spaces in front of `def` and four spaces in front of `ghi`. You can’t push the string contents to the left margin because that would violate Python’s formatting rules.

We now give three solutions to this problem.

There is a function in the Python standard library that will strip the unwanted space out of the string `y`.

```python
import textwrap

def foo():
    y = """\
    abc
    def
    ghi
    """
    y = textwrap.dedent(y)
```

This works, but in my opinion a better approach is to use regular expressions [1].

We want to remove white space, and the regular expression for a white space character is `\s`. We want to remove one or more white spaces so we add a `+` on the end. But in general we don’t want to remove all white space, just white space at the beginning of a line, so we stick `^` on the front to say we want to match white space at the beginning of a line.

```python
import re

def foo():
    y = """\
    abc
    def
    ghi
    """
    y = re.sub(r"^\s+", "", y)
```

Unfortunately this doesn’t work. By default `^` only matches the beginning of a *string*, not the beginning of a line. So it will only remove the white space in front of the first line; there will still be white space in front of the following lines.

One solution is to add the flag `re.MULTILINE` to the substitution function. This will signal that we want `^` to match the beginning of every line in our multi-line string.

```python
y = re.sub(r"^\s+", "", y, re.MULTILINE)
```

Unfortunately that doesn’t quite work either! The fourth positional argument to `re.sub` is a count of how many substitutions to make. It defaults to 0, which actually means infinity, i.e. replace all occurrences. You could set `count` to 1 to replace only the first occurrence, for example. If we’re not going to specify `count` we have to set `flags` by name rather than by position, i.e. the line above should be

```python
y = re.sub(r"^\s+", "", y, flags=re.MULTILINE)
```

That works.
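To see the difference between the two calls concretely, here’s a minimal check using the indented string from above:

```python
import re

y = "    abc\n    def\n    ghi\n"

# flag passed positionally: it is silently treated as a count,
# and ^ still matches only at the start of the whole string
print(repr(re.sub(r"^\s+", "", y, re.MULTILINE)))
# → 'abc\n    def\n    ghi\n'

# flag passed by name: ^ matches at the start of every line
print(repr(re.sub(r"^\s+", "", y, flags=re.MULTILINE)))
# → 'abc\ndef\nghi\n'
```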

You could also abbreviate `re.MULTILINE` to `re.M`. The former is more explicit and the latter is more compact. To each his own. There’s more than one way to do it. [2]

In my opinion, it is better to modify the regular expression itself than to pass in a flag. The modifier `(?m)` specifies that in the rest of the regular expression the `^` character should match the beginning of each line.

```python
y = re.sub(r"(?m)^\s+", "", y)
```

One reason I believe this is better is that it moves information from a language-specific implementation of regular expressions into a regular expression syntax that is supported in many programming languages.

For example, the regular expression

(?m)^\s+

would have the same meaning in Perl and Python. The two languages have the same way of expressing modifiers [3], but different ways of expressing flags. In Perl you paste an `m` on the end of a match operator to accomplish what Python does with setting `flags=re.MULTILINE`.

One of the most commonly used modifiers is `(?i)` to indicate that a regular expression should match in a case-insensitive manner. Perl and Python (and other languages) accept `(?i)` in a regular expression, but each language has its own way of adding modifiers. Perl adds an `i` after the match operator, and Python uses `flags=re.IGNORECASE` or `flags=re.I` as a function argument.
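For illustration, here are both spellings in Python:

```python
import re

# inline modifier: portable regular expression syntax
assert re.search(r"(?i)perl", "I love PERL")

# Python-specific flag argument, two equivalent spellings
assert re.search(r"perl", "I love PERL", flags=re.IGNORECASE)
assert re.search(r"perl", "I love PERL", flags=re.I)
```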

- Regular expressions in Perl and Python
- Regular expressions with Hebrew and Greek
- Why are regular expressions difficult

[1] Yes, I’ve heard the quip about two problems. It’s funny, but it’s not a universal law.

[2] “There’s more than one way to do it” is a mantra of Perl and contradicts The Zen of Python. I use the line here as a good-natured jab at Python. Despite its stated ideals, Python has more in common with Perl than it would like to admit and continues to adopt ideas from Perl.

[3] Python’s `re` module doesn’t support every regular expression modifier that Perl supports. I don’t know about Python’s `regex` module.

This morning I wrote up a similar (and simpler) trick for cube roots as a thread on @AlgebraFact. You can find the Twitter thread starting here, or you could go to this page that unrolls the whole thread in one page.

The post Mentally computing 3rd and 5th roots first appeared on John D. Cook.

First, the Koch snowflake on @AlgebraFact:

Then the logistic bifurcation on @AnalysisFact:

Then cellular automaton “Rule 90” on @CompSciFact:

And finally, the Lorenz system on @Diff_eq:

The post How it started, how it’s going first appeared on John D. Cook.

*f*(*z*) = *z*^{p}

where *p* = *a* + *bi*. They show that if you start Newton’s method at *z* = 1, the *k*th iterate will be

(1 – 1/*p*)^{k}.

This converges to 0 when *a* > 1/2, runs around in circles when *a* = 1/2, and diverges to infinity when *a* < 1/2.
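The closed form is easy to verify numerically: one Newton step for *f*(*z*) = *z*^{p} takes *z* to *z* – *z*^{p}/(*pz*^{p–1}) = *z*(1 – 1/*p*), so starting from 1 the *k*th iterate is (1 – 1/*p*)^{k}. A quick check:

```python
# Newton's method for f(z) = z**p: each step maps z to z*(1 - 1/p)
p = 0.53 + 0.4j
z = 1 + 0j
for k in range(1, 6):
    z = z - z**p / (p * z**(p - 1))     # one Newton step
    assert abs(z - (1 - 1/p)**k) < 1e-12

print(abs(z))   # the modulus shrinks, since Re(p) > 1/2 here
```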

You can get a wide variety of images by plotting the iterates for various values of the exponent *p*. Here are three examples.

Here’s the Python code that produced the plots.

```python
import numpy as np
import matplotlib.pyplot as plt

def make_plot(p, num_pts=40):
    k = np.arange(num_pts)
    z = (1 - 1/p)**k
    plt.plot(z.real, z.imag)
    plt.gca().set_aspect(1)
    plt.grid()
    plt.title(f"$x^{{ {p.real} + {p.imag}i }}$")
    plt.savefig(f"newton_{p.real}_{p.imag}.png")
    plt.close()

make_plot(0.53 + 0.4j)
make_plot(0.50 + 0.3j)
make_plot(0.48 + 0.3j)
```

Note that the code uses f-strings for the title and file name. There’s nothing out of the ordinary in the file name, but the title embeds LaTeX code, and LaTeX needs its own curly braces. The way to produce a literal curly brace in an f-string is to double it.

[1] Joe Latulippe and Jennifer Switkes. Sometimes Newton’s Method Always Cycles. The College Mathematics Journal, Vol. 43, No. 5, pp. 365-370

The post Newton’s method spirals first appeared on John D. Cook.

The sum

1/*m* + 1/(*m* + 1) + … + 1/*n*

with *n* > *m* ≥ 1 is never an integer. This was proved by József Kürschák in 1908.

This means that the harmonic numbers defined by

*H*_{n} = 1 + 1/2 + 1/3 + … + 1/*n*

are never integers for *n* > 1. The harmonic series diverges, so the sequence of harmonic numbers goes off to infinity, but it does so carefully avoiding all integers along the way.

Kürschák’s theorem says that not only are the harmonic numbers never integers, the difference of two distinct harmonic numbers is never an integer. That is, *H*_{n} – *H*_{m} is never an integer when *n* > *m* ≥ 1.
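A brute-force check of small cases, using exact rational arithmetic so there is no floating point doubt (a sanity check, not a proof):

```python
from fractions import Fraction

def H(n):
    # nth harmonic number as an exact rational
    return sum(Fraction(1, k) for k in range(1, n + 1))

# H_2, H_3, ... are never integers
assert all(H(n).denominator > 1 for n in range(2, 40))

# neither is the difference of two distinct harmonic numbers
assert all((H(n) - H(m)).denominator > 1
           for m in range(1, 30) for n in range(m + 1, 30))
```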

The first post included this discussion of the peak locations.

The peaks of sin(*x*)/*x* are approximately at the same positions as sin(*x*), and so we use (2*n* + 1/2)π as our initial guess. In fact, all our peaks will be a little to the left of the corresponding peak in the sine function because dividing by *x* pulls the peak to the left. The larger *x* is, the less it pulls the root over.

This post will refine the observation above. The paragraph above suggests that for large *n*, the *n*th peak is located at approximately (2*n* + 1/2)π. This is a zeroth order asymptotic approximation. Here we will give a first order asymptotic approximation.

For a fixed positive *n*, let

θ = (2*n* + 1/2)π

and let

*x* = θ + ε

be the location of the *n*th peak. We will improve our approximation of the location of *x* by estimating ε.

As described in the first post in this series, setting the derivative of the sinc function to zero says *x* satisfies

*x* cos *x* – sin *x* = 0.

Therefore

(θ + ε) cos(θ + ε) = sin(θ + ε)

Applying the sum-angle identities for sine and cosine shows

(θ + ε) (cos θ cos ε – sin θ sin ε) = sin θ cos ε + cos θ sin ε

Now sin θ = 1 and cos θ = 0, so

-(θ + ε) sin ε = cos ε.

or

tan ε = -1/(θ + ε).

So far our calculations are exact. You could, for example, solve the equation above for ε using a numerical method. But now we’re going to make a couple very simple approximations [1]. On the left, we will approximate tan ε with ε. On the right, we will approximate ε with 0. This gives us

ε ≈ -1/θ = -1/(2*n* + 1/2)π.

This says the *n*th peak is located at approximately

θ – 1/θ

where θ = (2*n* + 1/2)π. This refines the earlier statement “our peaks will be a little to the left of the corresponding peak in the sine function.” As *n* gets larger, the term we subtract off gets smaller. This makes the statement above “The larger *x* is, the less it pulls the root over” more precise.

Now let’s see visually how well our approximation works. The graph below plots the error in approximating the *n*th peak location by θ and by θ – 1/θ.

Note the log scale. The error in approximating the location of the 10th peak by 20.5π is between 0.1 and 0.01, and the error in approximating the location by 20.5π – 1/20.5π is approximately 0.000001.
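Here’s a sketch of the comparison behind the graph, pinning down the *n*th peak with Newton’s method on *x* cos *x* – sin *x* and then measuring both approximation errors:

```python
from math import pi, sin, cos

def peak(n, steps=20):
    # Newton's method on g(x) = x cos x - sin x, with g'(x) = -x sin x
    x = (2*n + 0.5)*pi
    for _ in range(steps):
        x -= (x*cos(x) - sin(x)) / (-x*sin(x))
    return x

n = 10
theta = (2*n + 0.5)*pi
x = peak(n)
print(abs(theta - x))              # zeroth order error, a little over 0.015
print(abs(theta - 1/theta - x))    # first order error, a few millionths
```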

[1] As pointed out in the comments, you could get a slightly better approximation by not being so simple. Instead of approximating 1/(θ + ε) by 1/θ you could use 1/θ – ε/θ² and solve for ε.

The post Peaks of Sinc first appeared on John D. Cook.]]>Briefly stated, the Gell-Mann Amnesia effect is as follows. You open the newspaper to an article on some subject you know well. In Murray [Gell-Mann]’s case, physics. In mine, show business. You read the article and see the journalist has absolutely no understanding of either the facts or the issues. Often, the article is so wrong it actually presents the story backward—reversing cause and effect. I call these the “wet streets cause rain” stories. Paper’s full of them.

In any case, you read with exasperation or amusement the multiple errors in a story, and then turn the page to national or international affairs, and read as if the rest of the newspaper was somehow more accurate about Palestine than the baloney you just read. You turn the page, and forget what you know.

I think about the Gell-Mann Amnesia effect when I read news stories that totally botch science or statistics. Most of the time when I read a news story that touches on something I happen to know about, it’s at best misleading and at worst just plain wrong.

Yesterday I had the opposite experience. I was trying out a new podcast, not one focused on science or statistics, that was mostly correct when it touched on statistical matters that I’ve looked into. They didn’t bat 1000, but they did better than popular news sites. That increased my estimate of how likely the podcast is to be accurate about other matters.

By the way, why is the effect named after the Nobel Prize-winning physicist Murray Gell-Mann? Crichton explained:

I refer to it by this name because I once discussed it with Murray Gell-Mann, and by dropping a famous name I imply greater importance to myself, and to the effect, than it would otherwise have.

The post Gell-Mann amnesia and its opposite first appeared on John D. Cook.

In this post we use this problem to illustrate how two formulations of the same problem can behave very differently with Newton’s method.

The previous post mentioned finding the peaks by solving either

*x* cos *x* – sin *x* = 0

or equivalently

tan *x* – *x* = 0

It turns out that the former is *much* better suited to Newton’s method. Newton’s method applied to the first equation will converge quickly without needing to start particularly close to the root. Newton’s method applied to the second equation will fail to converge at all unless the method begins close to the root, and even then the method may not be accurate.

Here’s why. The rate of convergence in solving

*f*(*x*) = 0

with Newton’s method is determined by the ratio of the second derivative to the first derivative

| *f*″(*x*) / *f*′(*x*) |

near the root.

Think of the second derivative as curvature. Dividing by the first derivative normalizes the scale. So convergence is fast when the curvature relative to the scale is small. Which makes sense intuitively: When a function is fairly straight, Newton’s method zooms down to the root. When a function is more curved, Newton’s method requires more steps.

The following table gives the absolute value of the ratio of the second derivative to the first derivative at the first ten peaks, using both equations. The bound on the error in Newton’s method is proportional to this ratio.

| peak | Equation 1 | Equation 2 |
|------|------------|------------|
| 1 | 0.259 | 15.7 |
| 2 | 0.142 | 28.3 |
| 3 | 0.098 | 40.8 |
| 4 | 0.075 | 53.4 |
| 5 | 0.061 | 66.0 |
| 6 | 0.051 | 78.5 |
| 7 | 0.044 | 91.1 |
| 8 | 0.039 | 103.7 |
| 9 | 0.034 | 116.2 |
| 10 | 0.031 | 128.8 |

The error terms are all small for the first equation, and they get smaller as we look at peaks further from the origin. The error terms for the second equation are all large, and get larger as we look at peaks further out.
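The table is easy to reproduce. Since *f*′(*x*) = –*x* sin *x* and *f*″(*x*) = –sin *x* – *x* cos *x*, at a root of Equation 1 (where *x* cos *x* = sin *x*) the ratio reduces to 2/*x*. For Equation 2, *g*′(*x*) = tan²*x* and *g*″(*x*) = 2 tan *x* (1 + tan²*x*), so at a root (where tan *x* = *x*) the ratio is 2(1 + *x*²)/*x*. A sketch:

```python
from math import pi, sin, cos

def peak(n, steps=20):
    # Newton's method on x cos x - sin x, starting from (2n + 1/2)π
    x = (2*n + 0.5)*pi
    for _ in range(steps):
        x -= (x*cos(x) - sin(x)) / (-x*sin(x))
    return x

for n in range(1, 11):
    x = peak(n)
    r1 = 2/x              # |f''/f'| for Equation 1 at the nth peak
    r2 = 2*(1 + x*x)/x    # |g''/g'| for Equation 2 at the nth peak
    print(f"{n:4d} {r1:10.3f} {r2:10.1f}")
```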

sinc(*x*) = sin(*x*)/*x*.

This function comes up constantly in signal processing. Here’s a plot.

We would like to find the location of the function’s peaks. Let’s focus first on the first positive peak, the one that’s somewhere between 5 and 10. Once we can find that one, the rest will be easy.

If you take the derivative of sinc and set it to zero, you find that the peaks must satisfy

*x* cos *x* – sin *x* = 0

which means that

tan *x* = *x*.

So our task reduces to finding the fixed points of the tangent function. One way to find fixed points is simply to iterate the function. We pick a starting point near the peak we’re after, then take its tangent, then take the tangent of that, etc.

The peak appears to be located around 7.5, so we’ll use that as a starting point. Then iterates of tangent give

2.7060138667726910
-0.4653906051625444
-0.5021806478408769
-0.5491373258198057
-0.6119188887713993
-0.7017789436750164
-0.8453339618848119
-1.1276769374114777
-2.1070512803092996
1.6825094538261074

That didn’t work at all. That’s because tangent has derivative larger than 1, so it’s not a contraction mapping.

The iterates took us away from the root we were after. This brings up an idea: is there some way to iterate a *negative* number of times? Well, sorta. We can run our iteration backward.

Instead of solving

tan *x* = *x*

we could equivalently solve

arctan *x* = *x*.

Since iterating tangent pushes points away, iterating arctan should bring them closer. In fact, the derivative of arctan is less than 1, so it *is* a contraction mapping, and we will get a fixed point.

Let’s start again with 7.5 and iterate arctan. This quickly converges to the peak we’re after.

7.721430101677809
7.725188823982156
7.725250798474231
7.725251819823800
7.725251836655669
7.725251836933059
7.725251836937630
7.725251836937706
7.725251836937707
7.725251836937707

There is a little complication: we have to iterate the *right* inverse tangent function. Since tangent is periodic, there are infinitely many values of *x* that have a particular tangent value. The arctan function in NumPy returns a value between -π/2 and π/2. So if we add 2π to this value, we get values in an interval including the peak we’re after.

Here’s how we can find all the peaks. The peak at 0 is obvious, and by symmetry we only need to find the positive peaks; the *n*th negative peak is just the negative of the *n*th positive peak.

The peaks of sin(*x*)/*x* are approximately at the same positions as sin(*x*), and so we use (2*n* + 1/2)π as our initial guess. In fact, all our peaks will be a little to the left of the corresponding peak in the sine function because dividing by *x* pulls the peak to the left. The larger *x* is, the less it pulls the root over.

The following Python code will find the *n*th positive peak of the sinc function.

```python
import numpy as np

def iterate(n, num_iterates=6):
    x = (2*n + 0.5)*np.pi
    for _ in range(num_iterates):
        x = np.arctan(x) + 2*n*np.pi
    return x
```

My next post will revisit the problem in this post using Newton’s method.

The post Reverse iteration root-finding first appeared on John D. Cook.

`foo.xlsx` and `foo.csv`. Presumably these are redundant; the latter is probably an export of the former. I did a spot check and that seems to be the case.
Then I had a bright idea: use `pandas` to make sure the two files are the same. It’s an elegant solution: import both files as data frames, then use the `compare()` function to verify that they’re the same.

Except it didn’t work. I got a series of mysterious and/or misleading messages as I tried to track down the source of the problem, playing whack-a-mole with the data. There could be any number of reasons why `compare()` might not work on imported data: character encodings, inferred data types, etc.
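For what it’s worth, here’s the shape of the pandas approach, with small made-up frames standing in for the imported files (the real data would come from `pd.read_excel("foo.xlsx")` and `pd.read_csv("foo.csv")`):

```python
import pandas as pd

# stand-ins for the two imported files
df1 = pd.DataFrame({"name": ["Alice", "Bob"], "height": ["5'9\"", "6'0\""]})
df2 = pd.DataFrame({"name": ["Alice", "Bob"], "height": ["5'9\"", "6'1\""]})

diff = df1.compare(df2)   # shows only the cells that differ
print(diff)
print(df1.compare(df1).empty)   # identical frames compare empty
```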

So I used brute force. I exported the Excel file as CSV and compared the text files. This is low-tech, but transparent. It’s easier to compare text files than to plumb the depths of `pandas`.

One of the problems was that the data contained heights, such as `5'9"`. This causes problems with quoting, whether you enclose strings in single or double quotes. A couple quick `sed` one-liners resolved most of the mismatches. (Though not all. Of course it couldn’t be that simple …)

It’s easier to work with data in a high-level environment like `pandas`. But it’s also handy to be able to use low-level tools like `diff` and `sed` for troubleshooting.

I suppose someone could write a book on how to import CSV files. If all goes well, it’s one line of code. Then there are a handful of patterns that handle the majority of remaining cases. Then there’s the long tail of unique pathologies. As Tolstoy would say, happy data sets are all alike, but unhappy data sets are each unhappy in their own way.

The post Low-tech transparency first appeared on John D. Cook.

The Sturm-Hurwitz theorem says that a trigonometric polynomial

*f*(*x*) = Σ_{k=n}^{N} ( *a*_{k} sin *kx* + *b*_{k} cos *kx* )

has at least 2*n* zeros in the interval [0, 2π) if *a*_{n} and *b*_{n} are not both zero. You could take *N* to be infinity if you’d like.

Note that the lowest frequency term

*a*_{n} sin *nx* + *b*_{n} cos *nx*

can be written as

*c* sin(*nx* + φ)

for some amplitude *c* and phase φ as explained here. This function clearly has 2*n* zeros in each period. The remarkable thing about the Sturm-Hurwitz theorem is that adding higher frequency components can increase the number of zeros, but it cannot decrease the number of zeros.

To illustrate this theorem, we’ll look at a couple random trigonometric polynomials with *n* = 5 and *N* = 9 and see how many zeros they have. Theory says they should have at least 10 zeros.

The first has 16 zeros:

And the second has 12 zeros:

(It’s difficult to see just how many zeros there are in the plots above, but if we zoom in by limiting the vertical axis we can see the zeros more easily. For example, we can see that the second plot does not have a zero between 4 and 5; it almost reaches up to the *x*-axis but doesn’t quite make it.)
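One way to count the zeros without eyeballing the plots is to count sign changes on a fine grid. This sketch uses a different random generator than the plotting code, so its count won’t match either plot, but the Sturm-Hurwitz bound still has to hold:

```python
import numpy as np

n, N = 5, 9
rng = np.random.default_rng(20210114)   # a different generator, different draw
a, b = rng.random(N + 1), rng.random(N + 1)

x = np.linspace(0, 2*np.pi, 100_000, endpoint=False)
y = sum(a[k]*np.sin(k*x) + b[k]*np.cos(k*x) for k in range(n, N + 1))

crossings = np.count_nonzero(np.sign(y[1:]) != np.sign(y[:-1]))
print(crossings)   # Sturm-Hurwitz guarantees at least 2n = 10
```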

Here’s the code that made these plots.

```python
import matplotlib.pyplot as plt
import numpy as np

n = 5
N = 9

np.random.seed(20210114)

for p in range(2):
    a = np.random.random(size=N+1)
    b = np.random.random(size=N+1)
    x = np.linspace(0, 2*np.pi, 200)
    y = np.zeros_like(x)
    for k in range(n, N+1):
        y += a[k]*np.sin(k*x) + b[k]*np.cos(k*x)
    plt.plot(x, y)
    plt.grid()
    plt.savefig(f"sturm{p}.png")
    plt.ylim([-0.1, 0.1])
    plt.savefig(f"sturm{p}zoom.png")
    plt.close()
```

The post Zeros of trigonometric polynomials first appeared on John D. Cook.