# Generating noble gases

The previous post discussed what the periodic table would look like if it could be extended indefinitely and if certain patterns in the actual table continued to hold. In particular, the last element of each period would have atomic number and so we could call the Zn in the equation above noble numbers, atomic numbers of noble gases.

We could then look at the generating function for the noble numbers by multiplying each Zn by xn and summing. There’s no practical reason I can imagine for computing this other than it provides an excuse to play around with generating functions.

Does the series above converge? For some purposes it doesn’t matter. Generating functions are often useful as formal sums even if if they do not converge as analytic functions.

However, this function does converge. The Zn grow like a polynomial function of n, and for |x| < 1, the xn terms decrease exponentially as a function of n, so the series converges. The series has radius of convergence 1 because it clearly diverges for |x| > 1.

We can do better than saying the series converges: we can find the sum in closed form. We’ll use Mathematica to do the calculation for us.

    Sum[((-1)^n (3 n + 6) + 2 n^3 + 12 n^2 + 25 n - 6 ) x^n,
{n, 1, Infinity}]/12


This returns the following rational function: Note that the generating function has singularities at -1 and 1. This confirms our argument above that the radius of convergence should be 1.

We can use Mathematica to expand the generating function as a series, and we should recover the atomic numbers of the noble gases.

    Series[g[x], {x, 0, 7}]

And indeed we do: # Infinite periodic table

All the chemical elements discovered or created so far follow a regular pattern in how their electrons are arranged: the nth shell contains up to 2n – 1 suborbitals that each contain up to two electrons. For a given atomic number, you can determine how its electrons are distributed into shells and suborbitals using the Aufbau principle.

The Aufbau principle is a good approximation, but not exact. For this post we’ll assume it is exact, and that everything in the preceding paragraph generalizes to an arbitrary number of shells and electrons.

Under those assumptions, what would the periodic table look like if more elements are discovered or created?

D. Weiss worked out the recurrence relations that the periods of the table satisfy and found their solutions.

The number of elements in nth period works out to and the atomic numbers of the elements at the end of the nth period (the noble gases) are We can verify that these formulas give the right values for the actual periodic table as follows.

    >>> def p(n): return ((-1)**n*(2*n+3) + 2*n*n + 6*n +5)/4
>>> def z(n): return ((-1)**n*(3*n+6) + 2*n**3 + 12*n**2 + 25*n - 6)/12
>>> [p(n) for n in range(1, 8)]
[2.0, 8.0, 8.0, 18.0, 18.0, 32.0, 32.0]
>>> [z(n) for n in range(1, 8)]
[2.0, 10.0, 18.0, 36.0, 54.0, 86.0, 118.0]


So, hypothetically, if there were an 8th row to the periodic table, it would contain 50 elements, and the last element of this row would have atomic number 168.

# A footnote to year share

A couple weeks ago I wrote a post about the year share component of calculating the day of the week. To calculate the day of the week, you need to add the day of the week, a constant for the month, and the year share. Calculating year share is not that hard, but it’s the hardest step to carry out in your head.

The year share of a date y is

⌊5y/4⌋ % 7

where y is the last two digits of a year. There’s no need to multiply by 5, since you could compute

(y + ⌊y/4⌋) % 7

but this still takes some effort if y is large.

The earlier post gave nine different ways to compute the year share, and the easiest in my opinion is method #6:

1. Round y down to the most recent leap year and remember how much you had to round down. Call that amount r.
2. Let t be the tens value and u the units value.
3. Return (2tu/2 + r) % 7.

In symbols,

1. Let y‘ = 4 ⌊y/4⌋ and r = yy‘.
2. Let t = ⌊y’/10⌋ and u = y‘ % 10.
3. Return (2tu/2 + r) % 7.

This method has a longer description but is easier to carry out mentally as the following examples illustrate.

## Examples

Let y = 97. Here are the calculations required by the direct method.

1. y/4⌋ = 24
2. 97 + 24 = 121
3. 121 % 7 = 2

Here are the calculations required by the second method.

1. y‘ = 96 and r = 1.
2. t = 9 and u = 6.
3. (2*9 – 3 + 1) % 7 = 2

The former requires adding two 2-digit numbers. The latter requires doubling a single digit number and subtracting another single digit number.

The former requires reducing a 3-digit number mod 7 and the latter requires reducing a 2-digit number mod 7. Furthermore, the 2-digit number is never more than 18.

## Proof

To prove that the two methods are equivalent we have to prove

⌊5y/4⌋ ≡ 2tu/2 + r mod 7

which is kinda messy. The way t, u, and r are defined prevent this from being a simple algebraic calculation.

We can verify that

⌊5y/4⌋ ≡ 2tu/2 + r mod 7

for y = 0, 1, 2, … 19 by brute force then prove the rest by induction. We’ll show that if the congruence holds for y then it holds for y + 20.

Suppose you increase y by 20. Then ⌊5y/4⌋ increases by 25. The value of t increases by 2, and the values of u and r remain unchanged, so the right side increases by 4. Since 25 % 7 = 4, the congruence still holds.

# Inverse tetrahedral numbers

The previous post looked at the tetrahedral numbers:

1, 4, 10, 20, 35, …

We could invert the process of creating tetrahedral numbers and ask for what n is a given number the nth tetrahedral number. So the inverse of 1 is 1, the inverse of 4 is 2, the inverse of 10 is etc.

We can find the nth tetrahedral number by evaluating a cubic polynomial in n and so we could invert this for a given x by solving p(n) = x for n. The result is complicated, but it can be written in closed form: We could do the same for generalized tetrahedral numbers. The nth tetrahedral number of dimension d is given by a polynomial in n of degree d, and so in principle we could look for the roots of these polynomials, though we shouldn’t expect closed forms expressions for the roots in general.

Now if x is a tetrahedral number, i.e. x is an integer such that p(n) = x for some positive integer n, then this value of n is unique as far as positive integers go. The sequence of tetrahedral numbers is increasing, so it has a unique inverse, if it has an inverse at all.

Now suppose x is not a tetrahedral number, and maybe not even an integer. We can still solve the equation p(n) = x. And if x > 0 then there is a real root given by the equation above. What are the other roots like?

We can find out by computing the discriminant of the equation p(n) – x = 0, which works out to be When the discriminant is positive, there are three real roots. When the discriminant is negative, there is one real root and two complex roots.

So, for example, the tetrahedral inverse of 10 is 3, but you could say that -3 ± √11 i are also inverses.

For another example, the real tetrahedral inverse of 7 is 2.57189… and there is also a conjugate pair of complex inverses.

# General tetrahedral numbers

1, 1, 1, 1, 1, …

Taking the partial sums of this sequence gives consecutive numbers. That is, the nth number of the new series is the sum of the first n terms of the previous series.

1, 2, 3, 4, 5, …

If we take partial sums again, we get the triangular numbers.

1, 3, 6, 10, 15, …

They’re called triangle numbers because if we arrange coins in a triangle, with n on a side, the total number of coins is the nth triangular number.

If we repeat the process again, we get the tetrahedral numbers.

1, 4, 10, 20, 35, …

The nth tetrahedral number is the number of cannonballs is a tetrahedral stack, with each layer of the stack being a triangle.

We can produce the examples above in Python using the cumsum (cumulative sum) function on NumPy arrays.

    >>> import numpy as np
>>> x = np.ones(5, int)
>>> x
array([1, 1, 1, 1, 1])
>>> x.cumsum()
array([1, 2, 3, 4, 5])
>>> x.cumsum().cumsum()
array([ 1, 3, 6, 10, 15])
>>> x.cumsum().cumsum().cumsum()
array([ 1, 4, 10, 20, 35])


We could continue this further, taking tetrahedral numbers of degree k. The sequences above could be called the tetrahedral numbers of degree k for k = 0, 1, 2, and 3.

Here’s Python code to find the nth tetrahedral number of a given dimension.

    def tetrahedral(n, dim):
x = np.ones(n, int)
for _ in range(dim):
x = x.cumsum()
return x[-1]


It turns out that and so we could rewrite the code above more simply as

    from math import comb

def tetrahedral(n, dim):
return comb(n + dim - 1, dim)


# Visualizing C operator precedence

Here’s an idea for visualizing C operator precedence. You snake your way through the diagram starting from left to right.

Operators at the same precedence level are on the same horizontal level.

Following the arrows for changing directions, you move from left-to-right through the operators that associate left-to-right and you move right-to-left through the operators that associate right-to-left.

Although this diagram is specifically for C, many languages follow the same precedence with minor exceptions. For example, all operators that Perl shares with C follow the same precedence as C. # Numeric distance vs geographic distance in zip codes

If two zip codes numbers are close, are the regions they represent close? How much can you tell about how far apart two regions are by comparing their zip codes? (Zip codes are US postal codes. The name is an acronym for “Zone Improvement Plan” and was introduced in 1963.)

To investigate this, I looked at some data on zip codes that I purchased last year. I selected the zip codes with latitude and longitude coordinates  and computed the distance between consecutive zip codes, consecutive in the sense of being the next item in the list of assigned zip codes, not necessarily consecutive integers.

The closest consecutive zip codes are 10171 and 10172 in Manhattan. These zip codes are neighboring blocks and so you could say their distance is zero. The distance using the latitude and longitude values in my data is 0.05 miles.

The furthest consecutive zip codes are 96863 in Honolulu, Hawaii and 97001 in Antelope, Oregon. These locations are 2,658 miles apart.

The mean distance between consecutive zip codes is 34.4 miles and the median distance is 24.9 miles. So consecutive zip codes are often fairly close geographically, but not always.

Next I went back and divided the geographic distance between consecutive pair of zip codes by the numerical difference between the codes, taking a sort of “derivative”.

By this measure, the furthest consecutive zip codes are 99685 and 99686, which happen to also be consecutive integers. The both are in Alaska, the form former in Unalaska in the Aleutian Islands and the latter in Valdez on the mainland. The mean and median distance in terms of miles per integer difference is 26.4 and the median is 16.9.

The previous post shows how you could use Hilbert curves to assign zip codes to a grid so that consecutive codes always correspond to neighboring squares.

## Related posts

 A lot of zip codes don’t have coordinates because they serve an administrative function more than strictly representing a geographic region.

# Zip codes, geocodes, and Hilbert curves

You might think that if zip codes are close, then the regions they represent are close. Or that if zip codes are consecutive, then their regions touch. Neither of these are true. I explore how far they are from being true in the next post.

But these statements could have been true . It’s possible to lay a grid on a region and number the squares in the grid in such a way that consecutive numbers correspond to contiguous squares, and nearby numbers correspond to nearby squares. There are geocoding systems that accomplish this using Hilbert curves. For example, Google’s S2 system is based on Hilbert curves.

A Hilbert curve is a curve that winds through a square, coming arbitrarily close to every point in the square. A Hilbert curve is a fractal, defined as the limit of an iterative process. We aren’t concerned with the limit because we only want to carry out a finite number of steps in the construction. But the fact that the limit exists tells us that with enough steps we can get any resolution we want.

To illustrate Hilbert curves and how they could be used to label grids, we will use a Hilbert curve to tour a chessboard. We will number the squares of a chess board so that consecutive numbers correspond to neighboring squares.

Here’s Python code to produce our numbering.

    from hilbertcurve.hilbertcurve import HilbertCurve

p = 3 # 2^3 squares on a side
n = 2 # two dimensional cube, i.e. square

hc = HilbertCurve(p, n)

for row in range(2**p):
for col in range(2**p):
d = hc.distance_from_point([row, col])
print(format(d,"2d"), " ", end="")
print()


This produces the following output.

     0   1  14  15  16  19  20  21
3   2  13  12  17  18  23  22
4   7   8  11  30  29  24  25
5   6   9  10  31  28  27  26
58  57  54  53  32  35  36  37
59  56  55  52  33  34  39  38
60  61  50  51  46  45  40  41
63  62  49  48  47  44  43  42


Here’s Python code to draw lines rather than print numbers.

    N = (2**p)**n
points = hc.points_from_distances(range(N))

for i in range(1, N):
x0, y0 = points[i-1]
x1, y1 = points[i]
plt.plot([x0, x1], [y0, y1])
plt.gca().set_aspect('equal')
plt.show()


This produces the following image. There’s a small problem. Following the square numberings above produces a Hilbert curve, and the plotting code produces and Hilbert curve, but they’re not the same Hilbert curve. The implicit axes created by our sequence of print statements doesn’t match the geometry in the plotting code. To get the curves to match we need to do a sort of change of coordinates. We can do this by changing the call to plt.plot to

    plt.plot([y0, y1], [8-x0, 8-x1])

This produces the following plot, matching the numbering above. ## Related posts

 It could have been true in the sense that it was theoretically possible. It might not have been practically possible. Postal codes have different design constraints than geocodes.

# Conspicuously missing data

I was working on a report for a client this afternoon when I remembered this comic from Spiked Math. I needed to illustrate the point that revealing information about one person or group can reveal information on other people or other groups. If you give your genetic information to a company, for example, you also give that company (and every entity they share your data with) information about your relatives.

This comic doesn’t illustrate the point I had in mind, but it does illustrate a related point. The third logician didn’t reveal the preferences of the first two, though it looks like that at first. Actually, the first two implicitly reported their own preferences.

If the first logician did not want a beer, he or she could have said “No” to the question “Does everyone want a beer?” Answering this question with “I don’t know” is tantamount to answering the question “Do you want a beer?” with “Yes.” What appears to be a non-committal answer is a definite answer on closer exanmination.

One of the Safe Harbor provisions under HIPAA is that data may not contain sparsely populated three-digit zip codes. Sometimes databases will replace sparse zip codes with nulls. But if the same database reports a person’s state, and the state only has one sparse zip code, then the data effectively lists all zip codes. Here the suppressed zip code is conspicuous by its absence. The null value itself didn’t reveal the zip code, nor did the state, but the combination did.

A naive approach to removing sensitive data can be about as effective as bleeping out profanity: it’s not hard to infer what was removed.