Scientific papers: innovation … or imitation?

Posted on 5 June 2025 by Wayne Joubert

Sometimes a paper comes out that has the seeds of a great idea that could lead to a whole new line of pioneering research. But, instead, nothing much happens, except imitative works that do not push the core idea forward at all.

For example, the McCulloch-Pitts paper from 1943 showed how neural networks could represent arbitrary logical or Boolean expressions of a certain class. The paper was well-received at the time, brilliantly executed by co-authors with diverse expertise in neuroscience, logic and computing. Had its significance been fully grasped, this paper might have, at least notionally, formed a unifying conceptual bridge between the two nascent schools of connectionism and symbolic AI (one can at least hope). But instead, the heated conflict in viewpoints in the field has persisted, even to this day.

Another example is George Miller’s 7 +/- 2 paper. This famous result showed humans are able to hold only a small number of pieces of information in mind at the same time while reasoning. This paper was important not just for the specific result, but for the breakthrough in methodology using rigorous experimental noninvasive methods to discover how human thinking works—a topic we know so little about, even today. However, the followup papers by others, for the most part, only extended or expanded on the specific finding in very minor ways. [1] Thankfully, Miller’s approach did eventually gain influence in more subtle ways.

Of course it’s natural from the incentive structures of publishing that many papers would be primarily derivative rather than original. It’s not a bad thing that, when a pioneering paper comes out, others very quickly write rejoinder papers containing evaluations or minor tweaks of the original result. Not bad, but sometimes we miss the larger implications of the original result and get lost in the details.

Another challenge is stovepiping—we get stuck in our narrow swim lanes for our specific fields and camps of research. [2] We don’t see the broader implications, such as connections and commonalities across fields that could lead to fruitful new directions.

Thankfully, at least to some extent current research in AI shows some mix of both innovation and imitation. Inspired in part by the accelerationist mindset, many new papers appear every day, some with significant new findings and others that are more modest riffs on previous papers.

Notes

[1] Following this line of research on human thought processes could be worthwhile for various reasons. For example, some papers in linguistics state that Chomsky’s vision for a universal grammar is misguided because the common patterns in human language are entirely explainable by the processing limitations of the human mind. But this claim is made with no justification or methodological rigor of any kind. If I claimed a CPU performs vector addition or atomic operations efficiently because of “the capabilities of the processor,” I would need to provide some supporting evidence, for example, documenting that the CPU has vector processing units or specialized hardware for atomics. The assertion that language structure is shaped by the human mental processing faculty is just an empty truism, unless supported by some amount of scientific rigor and free of the common fallacies of statistical reasoning.

[2] I recently read a paper in linguistics with apparent promise, but the paper totally misconstrued the relationship between Shannon entropy and Kolmogorov complexity. Sadly this paper passed review in a linguistic journal, but if it had had a mathematically inclined reviewer, the problem would have been caught and fixed.

Voyager’s slingshot maneuvers

Posted on 24 March 2025 by John

This post started out as a thread on X. Here I’ve edited it into a blog post. The image below and the fact cited can be found in JPL Publication 89-24.

Voyager 2 velocity relative to the sun over time

In 1960 it didn’t seem that it would be possible to explore the solar system beyond Jupiter without greatly improved propulsion.

Then the gravitational assist (“slingshot”) maneuver was discovered in 1961. With this new discovery, NASA began making plans to take advantage of an alignment of the outer planets in the 1970s. This led to the Voyager missions.

(Fun fact: In a gravitational assist, the speed of a spacecraft with respect to the planet doesn’t change, only its direction, but the speed relative to the sun changes greatly.)

Note that before encountering Jupiter, Voyager was moving well below solar system escape velocity. As a result of gravitational assists at four planets, the spacecraft is traveling at well over solar system escape velocity.

In a gravitational assist, speed (relative to the sun) increases or decreases, depending on which direction you approach the planet. At the end of the tour, Voyager 2 no longer had a reason to increase speed—it wasn’t possible to visit Pluto—but decreasing speed allowed it to visit both Neptune and its moon Triton.

It may seem that a gravitational assist violates conservation laws: where does the additional momentum come from? From the planets. When Voyager 2 passed Jupiter, Saturn, and Uranus, each of these planets lost some momentum, transferring it to the probe. When the spacecraft passed Neptune, the planet gained some momentum and the probe lost momentum. The changes in momentum were infinitesimal relative to the momentum of the gas giants, but large relative to the momentum of Voyager.
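The frame-change argument can be sketched in a few lines of Python. This is a simplified two-dimensional model, not a trajectory calculation: the flyby is treated as a pure rotation of the planet-relative velocity, and the planet velocity, spacecraft velocity, and turn angles are illustrative numbers assumed for the example.

```python
import math

def flyby(v_sun_frame, planet_v, turn_deg):
    """Rotate the planet-relative velocity by turn_deg and return
    the new velocity in the sun's frame (2D, idealized flyby)."""
    # Velocity of the spacecraft as seen from the planet
    wx = v_sun_frame[0] - planet_v[0]
    wy = v_sun_frame[1] - planet_v[1]
    t = math.radians(turn_deg)
    # The encounter only changes the direction of the relative velocity
    rx = wx * math.cos(t) - wy * math.sin(t)
    ry = wx * math.sin(t) + wy * math.cos(t)
    # Transform back to the sun's frame
    return (planet_v[0] + rx, planet_v[1] + ry)

def speed(v):
    return math.hypot(v[0], v[1])

def relative_speed(v, planet_v):
    return math.hypot(v[0] - planet_v[0], v[1] - planet_v[1])

planet_v = (13.0, 0.0)   # assumed planet velocity, km/s
v_in = (20.0, 5.0)       # assumed spacecraft velocity, km/s

v_faster = flyby(v_in, planet_v, -60.0)  # pass on one side of the planet
v_slower = flyby(v_in, planet_v, +60.0)  # pass on the other side

for label, v in [("in", v_in), ("faster", v_faster), ("slower", v_slower)]:
    print(label, round(speed(v), 2), round(relative_speed(v, planet_v), 2))
```

In this model the speed relative to the planet is the same before and after each flyby, while the speed relative to the sun goes up for one turn direction and down for the other.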

A calendar for Mars

Posted on 28 February 2025 by John

I recently started reading The Case for Mars by Robert Zubrin. This post will unpack one line from that book regarding creating a calendar for Mars:

Equipartitioned months don’t work for Mars, because the planet’s orbit is elliptical, which causes the seasons to be of unequal length.

This sentence doesn’t sit well at first for a couple reasons. First, Earth’s orbit is elliptical too, and the seasons here are of roughly equal length. Second, the orbit of Mars, like the orbit of Earth, is nearly circular.

There are three reasons why Zubrin’s statement is correct, despite the objections above. The first has to do with the nature of eccentricity, the second with the reference point from which angles are measured, and the third with variable speed.

Eccentricity

The orbit of Mars is about five and a half times as eccentric as that of Earth. That does not mean that the orbit of Mars is noticeably different from a circle, but it does mean the sun is noticeably not at the center of that (almost) circle.

There’s a kind of paradox interpreting eccentricity e. An ellipse with e = 0 is a circle, and the two foci of the ellipse coincide with the center of the circle. As e increases, the ellipse aspect ratio increases and the foci move apart. But here’s the key: the aspect ratio doesn’t change nearly as fast as the distance between the two foci changes. I’ve written more about this here and here.
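To put numbers on this, here is a small sketch using the standard ellipse relations: for semi-major axis a, the semi-minor axis is b = a√(1 − e²) and each focus sits a distance ae from the center. The function name is made up for illustration; the eccentricities are the ones used later in this post.

```python
import math

def ellipse_shape(e):
    """Return (b/a, c/a): the aspect ratio and the focus offset
    from the center, both as fractions of the semi-major axis."""
    aspect = math.sqrt(1 - e * e)  # changes only at second order in e
    focus_offset = e               # changes at first order in e
    return aspect, focus_offset

for name, e in [("Earth", 0.0167), ("Mars", 0.0934)]:
    aspect, offset = ellipse_shape(e)
    print(f"{name}: b/a = {aspect:.5f}, focus offset = {offset:.2%} of a")
```

For Mars the axes differ by well under 1%, while the sun is displaced from the center by more than 9% of the semi-major axis.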

So while the orbit of Mars is nearly a circle, the sun is substantially far from the center of the orbit. We can visualize this with a couple plots. First, here are the orbits of Earth and Mars, shifted so that both have their center at the origin.

Both are visually indistinguishable from circles.

Now here are the two orbits with their correct placement relative to the sun at the center.

Angle reference

Zubrin writes

In order to predict the seasons, a calendar must divide the planet’s orbit not into equal division of days, but into equal angles of travel around the sun. … a month is really 30 degrees of travel around the Sun.

If we were to divide the orbit of Mars into partitions of 30 degrees relative to the center of the orbit then each month would be about the same length. But Zubrin is dividing the orbit into partitions of 30 degrees relative to the sun.

In the language of orbital mechanics, Zubrin’s months correspond to 30 degrees of true anomaly, not 30 degrees of mean anomaly. I explain the difference between true anomaly and mean anomaly here. That post shows that for Earth, true anomaly and mean anomaly never differ by more than 2 degrees. But for Mars, the two anomalies can differ by up to almost 11 degrees.
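The largest gap between the two anomalies can be checked numerically. The sketch below solves Kepler’s equation M = E − e sin E with Newton’s method, converts the eccentric anomaly E to true anomaly, and scans half an orbit (the other half is symmetric). The function names and the sampling density are choices made for this example.

```python
import math

def true_anomaly_from_mean(M, e, tol=1e-12):
    """Solve Kepler's equation for E by Newton's method,
    then convert eccentric anomaly to true anomaly (radians)."""
    E = M
    for _ in range(50):
        step = (E - e * math.sin(E) - M) / (1 - e * math.cos(E))
        E -= step
        if abs(step) < tol:
            break
    return 2 * math.atan2(math.sqrt(1 + e) * math.sin(E / 2),
                          math.sqrt(1 - e) * math.cos(E / 2))

def max_anomaly_gap_degrees(e, samples=3600):
    """Largest |true anomaly - mean anomaly| over half an orbit."""
    gap = 0.0
    for k in range(samples):
        M = math.pi * k / samples
        gap = max(gap, abs(true_anomaly_from_mean(M, e) - M))
    return math.degrees(gap)

print("Earth:", max_anomaly_gap_degrees(0.0167))
print("Mars: ", max_anomaly_gap_degrees(0.0934))
```

To first order the maximum gap is 2e radians, which is a quick sanity check on the output.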

Variable speed

A planet in an elliptical orbit around the sun travels fastest when it is nearest the sun and slowest when it is furthest from the sun. Because Mars’s orbit is more eccentric than Earth’s, the variation in orbital speed is greater. We can calculate the ratio of the fastest speed to the slowest speed using the vis-viva equation. It works out to be

(1 + e)/(1 − e).

For Earth, with eccentricity 0.0167 this ratio is about 1.03, i.e. orbital speed varies by about 3%.

For Mars, with eccentricity 0.0934 this ratio is about 1.21, i.e. orbital speed varies by about 21%.
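As a check on the formula, the sketch below evaluates the vis-viva equation v² = μ(2/r − 1/a) at perihelion, r = a(1 − e), and at aphelion, r = a(1 + e), and compares the ratio with (1 + e)/(1 − e). Units are chosen so that μ = a = 1, which is harmless because both cancel in the ratio.

```python
import math

def speed_ratio_vis_viva(e, mu=1.0, a=1.0):
    """Ratio of perihelion speed to aphelion speed from vis-viva."""
    def v(r):
        return math.sqrt(mu * (2 / r - 1 / a))
    return v(a * (1 - e)) / v(a * (1 + e))

for name, e in [("Earth", 0.0167), ("Mars", 0.0934)]:
    closed_form = (1 + e) / (1 - e)
    print(name, round(speed_ratio_vis_viva(e), 4), round(closed_form, 4))
```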

Zubrin’s months

The time it takes for Mars to rotate on its axis is commonly called a sol, a Martian day. The Martian year is 669 sols. Zubrin’s proposed months range from 46 to 66 sols, each corresponding to 30 degrees difference in true anomaly.

Standing with Intellectual Giants

Posted on 20 February 2025 by Wayne Joubert

Is it possible to come up with truly innovative ideas when you’re not part of the institutions where the expertise resides?

According to one study, the answer would seem to be “No.” The book, The Sociology of Philosophies by Randall Collins, makes a case for how great ideas through history have always developed, almost without fail, in connection with the expert community.

Organizations, institutions and even loose associations of individuals can possess tacit knowledge giving a competitive moat that is hard for outsiders to cross. This may include explicit trade secrets and technical facts, but also certain thought styles, rules of thumb, recipes, etc., that reside only in the minds of the participants.

Sometimes these thought styles are more important than the bare facts themselves. In a recent interview, Terence Tao commented that sometimes his most appreciated lectures are those in which he makes a mistake and must show his thought process in real time for how he fixes the problem. By learning, not just what the solution is but how to approach the problem, one can be enabled to solve many problems, not just one.

Sometimes such learning occurs when a corporation or other institution imprints its attitudes, thought styles and mental habits onto its members over a period of time.

The book Sociology of Philosophies, however, may have a fatal flaw. In determining what is a great idea, the book, perhaps circularly, relies on the authority of what institutions say is a great idea—thus potentially arriving at the conclusion as a tautology. Diffusion of ideas may be a better tool for looking at the problem, by looking at societal impact rather than just elite opinion.

An opposite idea is the notion of maverick science—knowledge developed outside of the institutions, often ridiculed and sometimes vindicated. Some ideas like open source software were developed from a purposely anti-institutional perspective (thus spawning a new community of its own). Maverick thinking may be more important now than ever, as many institutions have become moribund (for perspectives see here and here).

Opportunities for the maverick may be better now than ever. For one, the Internet, and particularly the prevalence of online talks and lectures, a trend accelerated during Covid, make expert knowledge more accessible than ever. Second, AI chatbots now allow you to ask questions of this content, playing something of a mentoring role. It’s a better time than ever for institutional outsiders to do worthwhile things.

What exactly is a second?

Posted on 29 December 2024 by John

The previous post looked into the common definition of Unix time as “the number of seconds since January 1, 1970 GMT” and why it’s not exactly true. It was true for a couple years before we started inserting leap seconds. Strictly speaking, Unix time is the number of non-leap seconds since January 1, 1970.

This leads down the rabbit hole of how a second is defined. As long as a second is defined as ¹/₈₆₄₀₀ th of a day, and a day is the time it takes for the earth to rotate once on its axis, there’s no cause for confusion. But when you measure the rotation of the earth precisely enough, you can detect that the rotation is slowing down.

Days are getting longer

The rotation of the earth has been slowing down for a long time. A day was about 23½ hours when dinosaurs roamed the earth, and it is believed a day was about 4 hours after the moon formed. For most practical purposes a day is simply 24 hours. But for precision science you can’t have the definition of a second changing as the ball we live on spins slower.

This led to defining the second in terms of something more constant than the rotation of the earth, namely the oscillations of light waves, in 1967. And it led to tinkering with the calendar by adding leap seconds starting in 1972.

Cesium

You’ll hear that the second is defined in terms of vibrations of a cesium atom. But what exactly does that mean? What about the atom is vibrating? The second is not defined in terms of motions inside an atom, but by the frequency of the radiation produced by changes in an atom. Specifically, a second has been defined since 1967 as

the duration of 9,192,631,770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the caesium-133 atom.

Incidentally, “cesium” is the American English spelling of the name of atomic element 55, and “caesium” is the IUPAC spelling.

The definition of a second raises several questions. Why choose cesium? Why choose that number of periods? And what are hyperfine levels and all that? I’ll attempt to answer the first two questions and punt on the third.

OK, so why cesium? Do we use cesium because that’s what atomic clocks use? And if so, why do atomic clocks use cesium?

As I understand it, the first atomic clocks were based on cesium, though now some atomic clocks are based on hydrogen or rubidium. And one reason for using Cs 133 was that it was easy to isolate that particular isotope with high purity.

Backward compatibility

So why 9,192,631,770 periods? Surely if we’d started from this definition we’d go with something like 10,000,000,000 periods. Clearly the number was chosen for backward compatibility with the historical definition of a second, but that only approximately settles the question. The historical definition was fuzzy, which was the point of the new definition, so which variation on the historical definition was used for backward compatibility?

The time chosen for backward compatibility was basically the length of the year 1900. Technically, the number of periods was chosen so that a second would be

the fraction 1/31,556,925.9747 of the tropical year for 1900 January 0 at 12 hours ephemeris time.

Here “tropical year” means the time it takes the sun to return to the same position in the cycle of seasons, for example from one March equinox to the next, rather than the time to complete an orbit relative to the “fixed stars.” The length of a year varies slightly, and that’s why they had to pick a particular one.

The astronomical definition was just motivation; it has been discarded now that 9,192,631,770 periods of a certain radiation is the definition. We would not change the definition of a second if an alien provided us some day a more accurate measurement of the tropical year 1900.

Unix Time and a Modest Proposal

Posted on 27 December 2024 by John

The time it takes earth to orbit the sun is not a simple multiple of the time it takes earth to rotate on its axis. The ratio isn’t even constant. The time it takes earth to circle the sun wobbles a bit, and the rotation of the earth is slowing down slightly.

The ratio is around 365.2422. Calling it 365 is too crude. Julius Caesar said we should call it 365 ¹/₄, and that was good enough for a while. Then Pope Gregory said we really should use 365 ⁹⁷/₄₀₀, and that’s basically good enough, but not quite. More on that here.
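A quick way to compare these approximations is to ask how many years it takes each calendar to drift by one day. The sketch below uses exact fractions and takes 365.2422 as the true ratio, which is itself a rounded figure, so the results are only indicative.

```python
from fractions import Fraction

TROPICAL_YEAR = Fraction(3652422, 10000)  # 365.2422 days, rounded

calendars = {
    "365 days": Fraction(365),
    "Julian": Fraction(365) + Fraction(1, 4),
    "Gregorian": Fraction(365) + Fraction(97, 400),
}

def years_per_day_of_drift(calendar_year):
    """Years needed for the calendar to drift one day from the seasons."""
    return 1 / abs(calendar_year - TROPICAL_YEAR)

for name, length in calendars.items():
    print(f"{name}: one day of drift in about "
          f"{float(years_per_day_of_drift(length)):.0f} years")
```

With these inputs the Julian calendar slips a day in roughly 128 years and the Gregorian calendar in roughly 3,300 years.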

Leap seconds

In 1972 we started adding leap seconds in order to synchronize the day and the year more precisely. Unlike leap days, leap seconds don’t occur on a strict schedule. A leap second is inserted when a committee of astronomers decides one should be inserted, about every two years.

An international standards body has decided to stop adding leap seconds by 2035. They cause so much confusion that it was decided that letting the year drift by a few seconds was preferable.

Unix time

Unix time is the number of seconds since the “epoch,” i.e. January 1, 1970, sorta.

If you were to naively calculate Unix time for this coming New Year’s Day, you’d get the right result.

New Year’s Day 2025

When New Year’s Day 2025 begins in England, the Unix time will be

(55 × 365 + 14) × 24 × 60 × 60 = 1735689600

This is because there are 55 years between 1970 and 2025, 14 of which were leap years.

However, that moment will be 1735689627 seconds after the epoch.

Non-leap seconds

Unix time is the number of non-leap seconds since 1970-01-01 00:00:00 UTC. There have been 27 leap seconds since 1970, and so Unix time is 27 seconds behind elapsed time.
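The arithmetic above can be reproduced with Python’s standard library. The leap second count of 27 is taken from this post and hard-coded; `datetime` itself knows nothing about leap seconds, which is exactly why its timestamp matches the naive calculation.

```python
import calendar
from datetime import datetime, timezone

# Count leap years from 1970 through 2024
leap_years = sum(calendar.isleap(y) for y in range(1970, 2025))

# Naive calculation: every day has exactly 86400 seconds
naive = (55 * 365 + leap_years) * 24 * 60 * 60

# Unix time at the start of 2025 in UTC
unix_time = int(datetime(2025, 1, 1, tzinfo=timezone.utc).timestamp())

LEAP_SECONDS_SINCE_1970 = 27  # figure quoted in the post
elapsed_seconds = unix_time + LEAP_SECONDS_SINCE_1970

print(leap_years, naive, unix_time, elapsed_seconds)
```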

Leap year analogy

You could think of a second in Unix time as being 1/86400 th of a day. Every day has 86400 non-leap seconds, but some days have had 86401 seconds. A leap second could potentially be negative, though this has not happened yet. A day containing a negative leap second would have 86399 seconds.

The analogous situation for days would be to insist that every year have 365 days. February 28 would be the 59th day of the year, and March 1 would be the 60th day, even in a leap year.

International Atomic Time

What if you wanted a time system based on the actual number of elapsed seconds since the epoch? This is almost what International Atomic Time is.

International Atomic Time (abbreviated TAI, from the French temps atomique international) is ahead of UTC [1] by 37 seconds, not 27 seconds as you might expect. Although there have been 27 leap seconds since 1972, TAI dates back to 1958.

So New Year’s Day will start in England at 2025-01-01 00:00:37 TAI.

A Modest Proposal

It seems our choices are to add leap seconds and endure the resulting confusion, or not add leap seconds and allow the year to drift with respect to the day. There is a third way: adjust the position of the earth periodically to keep the solar year equal to an average Gregorian calendar year. I believe this modest proposal [2] deserves consideration.

Kepler’s third law says the square of a planet’s orbital period is proportional to the cube of the semi-major axis of its orbit. This means that increasing earth’s orbital period by 1 second would only require moving earth 3.16 km further from the sun.
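Here is the calculation behind that figure, as a rough sketch. Since T² ∝ a³, a small change in period dT corresponds to da = (2/3)·a·dT/T. The semi-major axis is taken to be one astronomical unit and the year to be 365.2422 days; both are approximations assumed for the example.

```python
AU_KM = 149_597_870.7          # one astronomical unit in km
YEAR_SECONDS = 365.2422 * 86400

def axis_change_km(delta_t_seconds, a_km=AU_KM, period_s=YEAR_SECONDS):
    """First-order change in semi-major axis for a small change in
    orbital period, from T^2 proportional to a^3."""
    return (2.0 / 3.0) * a_km * delta_t_seconds / period_s

print(round(axis_change_km(1.0), 2))  # km per second of added period
```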

***

[1] UTC stands for Coordinated Universal Time. From an earlier post,

The abbreviation UTC is an odd compromise. The French wanted to use the abbreviation TUC (temps universel coordonné) and the English wanted to use CUT (coordinated universal time). The compromise was UTC, which doesn’t actually abbreviate anything.

[2] In case you’re not familiar with the term “modest proposal,” it comes from the title of a satirical essay by Jonathan Swift. A modest proposal has come to mean an absurdly simple solution to a complex problem presented satirically.

Starlink configurations

Posted on 23 December 2024 by John

My nephew recently told me about being on a camping trip and seeing a long line of lights in the sky. The lights turned out to be Starlink satellites. It’s fairly common for people to report seeing lines of these satellites.

Four lights in the sky in a line

Why would the satellites be in a line? Wouldn’t it be much more efficient to spread them out? They do spread out, but they’re launched in groups. Satellites released into orbit at the same time initially orbit in a line close together.

It would seem the optimal strategy would be to spread communication satellites out evenly in a sphere. There are several reasons why that is neither desirable nor possible. It is not desirable because human population is far from evenly distributed. It’s very nice to have some coverage over the least-populated places on earth, such as Antarctica, but there is far more demand for service over the middle latitudes.

It is not possible to evenly distribute more than 20 points on a sphere, and so it would not be possible to spread out thousands of satellites perfectly evenly. However, there are ways to distribute arbitrarily many points somewhat evenly, such as in a Fibonacci lattice.
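For the curious, here is a minimal sketch of one common version of the Fibonacci lattice on a unit sphere. Points are spaced evenly in height z and each one is rotated from the previous by the golden angle. There are several variants of this construction; the function name and the half-step offset in z are choices made for this example, and none of this is meant to describe how Starlink actually places satellites.

```python
import math

def fibonacci_sphere(n):
    """Return n roughly evenly spaced points on the unit sphere."""
    golden_angle = math.pi * (3 - math.sqrt(5))  # about 137.5 degrees
    points = []
    for i in range(n):
        z = 1 - (2 * i + 1) / n        # evenly spaced heights in (-1, 1)
        r = math.sqrt(1 - z * z)       # radius of the circle at height z
        theta = golden_angle * i       # rotate by the golden angle each step
        points.append((r * math.cos(theta), r * math.sin(theta), z))
    return points

points = fibonacci_sphere(1000)
print(points[0], points[-1])
```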

It’s also not possible to distribute satellites in a static configuration. Unless a satellite is in geostationary orbit, it will constantly move relative to the earth. One problem with geostationary orbit is that it is at an altitude of about 35,800 km, roughly 42,000 km from the center of the earth. Starlink satellites are in low earth orbit (LEO) between 300 km and 600 km altitude. It is less expensive to put satellites into LEO and there is less latency bouncing signals off satellites closer to the ground.

Satellites orbit at different altitudes, and altitude and velocity are tightly linked. You want satellites orbiting at different altitudes to avoid collisions, and so they’re orbiting at different velocities. Even if you wanted all satellites to orbit at the same altitude, this would require constant maintenance due to various real-world departures from ideal Keplerian conditions. Satellites are going to move around relative to each other whether you want them to or not.
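To illustrate the link between altitude and velocity, the sketch below computes the speed of an ideal circular orbit, v = √(μ/r), at the two ends of the altitude range mentioned above. It assumes a spherical earth with mean radius 6371 km and ignores drag and every other real-world effect.

```python
import math

MU_EARTH = 398600.4418       # km^3/s^2, standard gravitational parameter
EARTH_MEAN_RADIUS = 6371.0   # km, assumed mean radius

def circular_orbit_speed(altitude_km):
    """Speed in km/s of an ideal circular orbit at the given altitude."""
    r = EARTH_MEAN_RADIUS + altitude_km
    return math.sqrt(MU_EARTH / r)

for altitude in (300, 600):
    print(altitude, "km:", round(circular_orbit_speed(altitude), 2), "km/s")
```

The lower orbit is faster, so satellites at different altitudes steadily drift relative to one another.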

GPS satellite orbits

Posted on 15 November 2024 by John

GPS satellites all orbit at the same altitude. According to the FAA,

GPS satellites fly in circular orbits at an altitude of 10,900 nautical miles (20,200 km) and with a period of 12 hours.

Why were these orbits chosen?

You can determine your position using satellites that are not in circular orbits, but with circular orbits all the satellites are on the surface of a sphere, and this ensures that certain difficulties don’t occur [1]. More on that in the next post.

To maintain a circular orbit, the velocity is determined by the altitude, and this in turn determines the period. The period T is given by

$T = 2\pi \sqrt{\frac{r^3}{\mu}}$

where μ is the “standard gravitational parameter” for Earth, which equals the mass of the earth times the gravitational constant G.

The weakest link in calculating T is r. The FAA site says the altitude is 20,200 km, but has that been rounded? Also, we need the distance to the center of the earth, not the altitude above the surface, so we need to add the radius of the earth. But the radius of the earth varies. Using the average radius of the earth I get T = 43,105 seconds.

Note that 12 hours is 43,200 seconds, so the period I calculated is 95 seconds short of 12 hours. Some of the difference is due to calculation inaccuracy, but most of it is real: the orbital period of GPS satellites is less than 12 hours. According to this source, the orbital period is almost precisely 11 hours 58 minutes.
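The calculation can be reproduced as follows. The values of μ and the mean earth radius are commonly quoted figures assumed here, not taken from the FAA page, and as noted above the result is sensitive to how r is chosen.

```python
import math

MU_EARTH = 398600.4418       # km^3/s^2, standard gravitational parameter
EARTH_MEAN_RADIUS = 6371.0   # km, assumed mean radius
GPS_ALTITUDE = 20200.0       # km, from the FAA quote

def circular_period(r_km):
    """Period in seconds of a circular orbit of radius r_km."""
    return 2 * math.pi * math.sqrt(r_km**3 / MU_EARTH)

period = circular_period(EARTH_MEAN_RADIUS + GPS_ALTITUDE)
shortfall = 12 * 3600 - period

print(f"T = {period:.0f} s, {shortfall:.0f} s short of 12 hours")
```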

The significance of 11 hours and 58 minutes is that it is half a sidereal day, not half a solar day. I wrote about the difference between a sidereal day and a solar day here. That means each GPS satellite returns to almost the same position twice a day, as seen from the perspective of an observer on the earth. GPS satellites are in a 2:1 resonance with the earth’s rotation.

(But doesn’t the earth rotate on its axis every 24 hours? No, every 23 hours 56 minutes. Those missing four minutes come from the fact that the earth has to rotate a bit more than one rotation on its axis to return to the same position relative to the sun. More on that here.)

Update: See the next post on the mathematics of GPS.

[1] Mireille Boutin, Gregor Kemper. Global positioning: The uniqueness question and a new solution method. Advances in Applied Mathematics 160 (2024)

Maybe Copernicus isn’t coming

Posted on 5 November 2024 by John

Before Copernicus promoted the heliocentric model of the solar system, astronomers added epicycle on top of epicycle, creating ever more complex models of the solar system. The term epicycle is often used derisively to mean something ad hoc and unnecessarily complex.

Copernicus’ model was simpler, but it was less accurate. The increasingly complex models before Copernicus were refinements. They were not ad hoc, nor were they unnecessarily complex, if you must center your coordinate system on Earth.

It’s easy to draw the wrong conclusion from Copernicus, and from any number of other scientists who were able to greatly simplify a previous model. One could be led to believe that whenever something is too complicated, there must be a simpler approach. Sometimes there is, and sometimes there isn’t.

If there isn’t a simpler model, the time spent searching for one is wasted. If there is a simpler model, the time searching for one might still be wasted. Pursuing brute force progress might lead to a simpler model faster than pursuing a simpler model directly.

It all depends. Of course it’s wise to spend at least some time looking for a simple solution. But I think we’re fed too many stories in which the hero comes up with a simpler solution by stepping back from the problem.

Most progress comes from the kind of incremental grind that doesn’t make an inspiring story for children. And when there is a drastic simplification, that simplification usually comes after grinding on a problem, not instead of grinding on it.

3Blue1Brown touches on this in this video. The video follows two hypothetical problem solvers, Alice and Bob, who attack the same problem. Alice is the clever thinker and Bob is the calculating drudge. Alice’s solution of the original problem is certainly more elegant, and more likely to be taught in a classroom. But Bob’s approach generalizes in a way that Alice’s approach, as far as we know, does not.

Why does FM sound better than AM?

Posted on 13 October 2024 by John

The original form of radio broadcast was amplitude modulation (AM). With AM radio, changes in the amplitude of the carrier wave carry the signal you want to broadcast.

AM signal

Frequency modulation (FM) came later. With FM radio, changes to the frequency of the carrier wave carry the signal.
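As a rough illustration of the difference, here is a sketch that generates samples of each kind of signal for a single-tone message. The carrier frequency, message frequency, modulation depth, and modulation index are arbitrary values chosen for the example, far lower than anything used in broadcasting.

```python
import math

CARRIER_HZ = 100.0   # assumed carrier frequency
MESSAGE_HZ = 5.0     # assumed single-tone message frequency

def message(t):
    return math.sin(2 * math.pi * MESSAGE_HZ * t)

def am_signal(t, depth=0.5):
    """AM: the message scales the carrier's amplitude."""
    return (1 + depth * message(t)) * math.cos(2 * math.pi * CARRIER_HZ * t)

def fm_signal(t, beta=2.0):
    """FM for a sinusoidal message: the message shifts the carrier's
    phase, and hence its instantaneous frequency; amplitude is constant."""
    return math.cos(2 * math.pi * CARRIER_HZ * t + beta * message(t))

times = [k / 10000 for k in range(10000)]  # one second of samples
am = [am_signal(t) for t in times]
fm = [fm_signal(t) for t in times]
print(max(am), max(fm))
```

The AM envelope swings between 0.5 and 1.5 with these settings, while the FM signal never exceeds amplitude 1.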

I go into the mathematical details of AM radio here and of FM radio here.

Pinter [1] gives a clear explanation of why the inventor of FM radio, Edwin Howard Armstrong, correctly predicted that FM radio transmissions would be less affected by noise.

Armstrong reasoned that the effect of random noise is primarily to amplitude-modulate the carrier without consistently producing frequency deviations.

In other words, noise tends to be an unwanted amplitude modulation, not a frequency modulation.
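A toy model shows why this matters for an FM receiver. In the sketch below the disturbance is modeled as a positive, time-varying gain applied to the FM signal, which is a strong simplification of real additive noise, and the limiter is an idealized hard limiter that keeps only the sign of its input. All frequencies and the gain profile are made up for the example.

```python
import math

def fm_signal(t, carrier_hz=100.0, message_hz=5.0, beta=2.0):
    return math.cos(2 * math.pi * carrier_hz * t
                    + beta * math.sin(2 * math.pi * message_hz * t))

def hard_limiter(x):
    """Idealized limiter: discard amplitude, keep only the sign."""
    return 1.0 if x >= 0 else -1.0

times = [k / 10000 for k in range(10000)]
clean = [fm_signal(t) for t in times]

# Unwanted amplitude modulation: a gain wandering between 0.5 and 1.5
gains = [1 + 0.5 * math.sin(2 * math.pi * 37 * t) for t in times]
disturbed = [g * x for g, x in zip(gains, clean)]

limited_clean = [hard_limiter(x) for x in clean]
limited_disturbed = [hard_limiter(x) for x in disturbed]
print(limited_clean == limited_disturbed)
```

Because the message lives in the timing of the zero crossings rather than in the amplitude, the limiter removes this kind of disturbance without touching the message. The same trick would destroy an AM signal, whose message is the amplitude.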

FM radio was able to achieve levels of noise reduction that people steeped in AM radio thought would be impossible. As J. R. Carson eloquently but incorrectly concluded

… as the essential nature of the problem is more clearly perceived, we are unavoidably forced to the conclusion that static, like the poor, will always be with us.

But as Pinter observes

The substantial reduction of noise in a FM receiver by use of a limiter was indeed a startling discovery, contrary to the behavior of AM systems, because experience with such systems had shown that the noise contribution to the modulation of the carrier could not be eliminated without partial elimination of the message.

[1] Philip F. Pinter. Modulation, Noise, and Spectral Analysis. McGraw-Hill 1965.
