Elliptic curve P-384

The various elliptic curves used in elliptic curve cryptography (ECC) have different properties, and we’ve looked at several of them before. For example, Curve25519 is implemented very efficiently, and the parameters were transparently chosen. Curve1174 is interesting because it’s an Edwards curve and has a special addition formula.

This post looks at curve P-384. What’s special about this curve? It’s the elliptic curve that the NSA recommends everyone use until post-quantum methods have been standardized. It provides 192 bits of security, whereas more commonly used curves provide 128 bits.

Does the NSA recommend this method because they know how to get around it? Possibly, but they also need to recommend methods that they believe foreign governments cannot break.

The equation of the P-384 curve is

y² = x³ + ax + b

working over the field of integers modulo a prime p. We will go into each of the specific parameters a, b, and p, and discuss how they were chosen.

Modulus p

Consistent with the naming conventions for elliptic curves used in cryptography, the name “P-384” tells you that the curve is over a prime field where the prime is a 384-bit integer. Specifically, the order of the field is

p = 2³⁸⁴ – 2¹²⁸ – 2⁹⁶ + 2³² – 1

For a given number of bits, in this case 384, you want to pick a prime that’s relatively near the maximum size for that number of bits. In our case, our prime p is a prime near 2³⁸⁴ with a convenient bit pattern. (The special pattern allows implementation tricks that increase efficiency.)
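The bit pattern and one such implementation trick, as an added example (not code from the post or from any particular library): because 2³⁸⁴ ≡ 2¹²⁸ + 2⁹⁶ – 2³² + 1 (mod p), a product can be reduced with shifts and additions instead of division. The function names are illustrative.

```python
import random

# The P-384 prime exactly as written in the post.
P = 2**384 - 2**128 - 2**96 + 2**32 - 1
MASK = (1 << 384) - 1


def words32(n, count=12):
    """Split n into 32-bit words, most significant first, as hex strings."""
    return [format((n >> (32 * i)) & 0xFFFFFFFF, "08x")
            for i in reversed(range(count))]


def is_probable_prime(n, rounds=32, rng=random.Random(0)):
    """Miller-Rabin test; rng is seeded so runs are repeatable."""
    if n < 4:
        return n in (2, 3)
    if n % 2 == 0:
        return False
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for _ in range(rounds):
        x = pow(rng.randrange(2, n - 1), d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # no square root of 1 reached n - 1: composite
    return True


def reduce_p384(x):
    """Reduce 0 <= x < 2**768 modulo P using only shifts, adds and subtracts.

    Since 2**384 = 2**128 + 2**96 - 2**32 + 1 (mod P), the bits above
    position 384 can be folded back into the low 384 bits.
    Illustration only: the loops are data dependent, so this is not
    constant time, unlike a production field implementation.
    """
    while x >> 384:
        hi, lo = x >> 384, x & MASK
        x = lo + (hi << 128) + (hi << 96) - (hi << 32) + hi
    while x >= P:
        x -= P
    return x


def check_random_products(trials=1000, rng=random.Random(1)):
    """Compare the folding reduction with Python's % on random products."""
    for _ in range(trials):
        a, b = rng.randrange(P), rng.randrange(P)
        if reduce_p384(a * b) != (a * b) % P:
            return False
    return True


if __name__ == "__main__":
    print(" ".join(words32(P)))
    print("prime:", is_probable_prime(P), "bits:", P.bit_length())
    print("folding agrees with %:", check_random_products())
```

Production implementations apply the same congruence word by word on a fixed schedule so that timing does not depend on the operands.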

Hasse’s theorem says that the number of points on a curve modulo a large prime is of the same order of magnitude as the prime, so P-384 contains approximately 2³⁸⁴ points. In fact, the number of points n on the curve is


or approximately 2³⁸⁴ – 2¹⁹⁰. The number n is a prime, and so it is the order of P-384 as a group.

Linear coefficient a

According to a footnote in the standard defining P-384, FIPS PUB 186-4,

The selection a ≡ -3 for the coefficient of x was made for reasons of efficiency; see IEEE Std 1363-2000.

All the NIST elliptic curves over prime fields use a = -3 because this makes it possible to use special algorithms for elliptic curve arithmetic.

Constant coefficient b

The curve P-384 has Weierstrass form

y² = x³ – 3x + b

where b is


The parameter b is between 2³⁸³ and 2³⁸⁴ but doesn’t have any particular binary pattern:


The specification says that b was chosen at random. How can you convince someone that you chose a parameter at random?

The standard gives a 160-bit seed s, and a hash-based algorithm that s was run through to create a 384-bit parameter c. Then b is the solution to

b² c = -27 mod p.

The algorithm going from s to c is given in Appendix D.6 and is a sort of key-stretching algorithm. The standard cites ANS X9.62 and IEEE Standard 1363-2000 as the source of the algorithm.
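The check a verifier can run, as an added example (not code from the post or the standard): it re-derives c from the seed with the standard's SHA-1 construction and tests b² c ≡ -27 (mod p). The seed and b hex strings are the P-384 values published in FIPS 186-4, supplied here because this page does not reproduce them; the function names are illustrative.

```python
import hashlib

P = 2**384 - 2**128 - 2**96 + 2**32 - 1

# P-384 constants as published in FIPS 186-4 (not taken from this page).
SEED = bytes.fromhex("a335926aa319a27a1d00896a6773a4827acdac73")
B = int("b3312fa7e23ee7e4988e056be3f82d19181d9c6efe814112"
        "0314088f5013875ac656398d8a2ed19d2a85c8edd3ec2aef", 16)


def sha1_int(data: bytes) -> int:
    """SHA-1 digest of data as a 160-bit integer."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


def derive_c(seed: bytes, p_bits: int = 384) -> int:
    """Stretch a 160-bit seed into a (p_bits - 1)-bit integer c."""
    hash_bits = 160
    v = (p_bits - 1) // hash_bits      # number of extra hash blocks
    w = p_bits - hash_bits * v - 1     # bits kept from the first hash
    c = sha1_int(seed) & ((1 << w) - 1)    # h0: rightmost w bits of SHA-1(s)
    z = int.from_bytes(seed, "big")
    for i in range(1, v + 1):
        s_i = ((z + i) % (1 << hash_bits)).to_bytes(hash_bits // 8, "big")
        c = (c << hash_bits) | sha1_int(s_i)   # append h_i = SHA-1(s + i)
    return c


def seed_matches_b(seed: bytes, b: int, p: int = P) -> bool:
    """True if b is the coefficient the seed commits to, for a = -3."""
    c = derive_c(seed, p.bit_length())
    if c == 0 or (4 * c + 27) % p == 0:
        return False                   # degenerate c is rejected
    return (b * b * c + 27) % p == 0


if __name__ == "__main__":
    print("c =", format(derive_c(SEED), "x"))
    print("seed matches b:", seed_matches_b(SEED, B))
    tampered = bytes([SEED[0] ^ 1]) + SEED[1:]
    print("tampered seed matches b:", seed_matches_b(tampered, B))
```

The check shows only that b is tied to the published seed through the hash; it says nothing about how the seed itself was chosen.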

If b was designed to have a back door, presumably a tremendous amount of computation had to go into reverse engineering the seed s.

Koblitz and Menezes wrote a paper in which they suggest a way that the NSA might have picked seeds that lead to weak elliptic curves, but then argue against it.

It is far-fetched to speculate that the NSA would have deliberately selected weak elliptic curves in 1997 for U.S. government usage … confident that no one else would be able to discover the weakness in these curves in the ensuing decades. Such a risky move by the NSA would have been inconsistent with the Agency’s mission.


Bessel function crossings

The previous post looked at the angles that graphs make when they cross. For example, sin(x) and cos(x) always cross with the same angle. The same holds for sin(kx) and cos(kx) since the k simply rescales the x-axis.

The post ended with wondering about functions analogous to sine and cosine, such as Bessel functions. This post will look at that question in more detail. Specifically we’ll look at the functions Jν and Yν.

Because these two Bessel functions satisfy the same second order linear homogeneous differential equation, the Sturm separation theorem says that their zeros are interlaced: between each pair of consecutive zeros of Jν is exactly one zero of Yν, and between each pair of consecutive zeros of Yν there is exactly one zero of Jν.

Plotting Bessel functions J_3 and Y_3

In the following Python code, we find zeros of Jν, then look in between for places where Jν and Yν cross. Next we find the angle the two curves make at each intersection and plot the angles.

    from scipy.special import jn_zeros, jv, yv
    from scipy.optimize import bisect
    from numpy import empty, arccos
    import matplotlib.pyplot as plt

    n = 3 # bessel function order
    N = 100 # number of zeros

    z = jn_zeros(n, N) # Zeros of J_n
    crossings = empty(N-1)
    f = lambda x: jv(n,x) - yv(n,x)

    # Y_n has one zero between consecutive zeros of J_n,
    # so f changes sign on each interval [z[i], z[i+1]]
    for i in range(N-1):
        crossings[i] = bisect(f, z[i], z[i+1])

    def angle(n, x):
        # Derivatives of J_nu and Y_nu
        dj = 0.5*(jv(n-1,x) - jv(n+1,x))
        dy = 0.5*(yv(n-1,x) - yv(n+1,x))
        # Angle between the tangent vectors (1, dj) and (1, dy)
        top = 1 + dj*dy
        bottom = ((1 + dj**2)*(1 + dy**2))**0.5
        return arccos(top/bottom)

    y = angle(n, crossings)
    plt.plot(y)
    plt.xlabel("Crossing number")
    plt.ylabel("Angle in radians")
    plt.show()

This shows that the angles steadily decrease, apparently quadratically.

Angles of crossing of J_3 and Y_3

This quadratic behavior is what we should expect from the asymptotics of Jν and Yν: For large arguments they act like shifted and rescaled versions of sin(x)/√x. So if we looked at √xJν and √xYν rather than Jν and Yν we’d expect the angles to reach some positive asymptote, and they do, as shown below.

Angles of crossing of √x J_3 and √xY_3


Orthogonal graphs

Colin Wright posted a tweet yesterday that said that the plots of cosine and tangent are orthogonal. Here’s a plot so you can see for yourself.

Plot of cosine and tangent

Jim Simons replied with a proof so short it fits in a tweet: The product of the derivatives is -sin(x)sec²(x) = -tan(x)/cos(x), which is -1 if cos(x)=tan(x).

This made me wonder whether sine and cosine are orthogonal in the sense of graphs intersecting at right angles. They are orthogonal in the sense that their product integrates to zero over the interval [0, 2π], a fact that’s important in Fourier analysis, but are they orthogonal in the sense of their graphs intersecting at right angles? A graph makes this look plausible:

Plot of sine and cosine

But the graph is misleading. I made the plot without specifying the aspect ratio, using the default in Mathematica. This makes the angle between the graphs look smaller. Setting the aspect ratio to 1 shows the true picture. The two curves intersect at π/4 and 5π/4, and they intersect at an angle of 2π/3, not π/2.

The product of the slopes is not -1, but it is negative and constant, so you could multiply each function by some constant to make the product of slopes -1. Said another way, you could make the curves perpendicular by adjusting the aspect ratio.

Can you do this for other functions that are orthogonal in the inner product sense? Not easily. For example, sin(2x) and sin(3x) are orthogonal in the inner product (integral) sense, but the angles of intersection are different at different places where the curves cross.

What about other functions that are like sine and cosine? I looked at the Bessel functions J1 and J2, but the angles at the intersections vary. Ditto for Chebyshev polynomials. I suppose the difference is that sine and cosine are translations of each other, whereas that’s not the case for Bessel or Chebyshev functions. But it is true for wavelets, so you could find wavelets that are orthogonal in the sense of inner products and in the sense of perpendicular graphs.

Update: See more on how Bessel functions cross in the next post.

Area and volume of Menger sponge

The Menger sponge is the fractal you get by starting with a cube, dividing each face into a 3 by 3 grid (like a Rubik’s cube) and removing the middle square of each face and everything behind it. That’s M1, the Menger sponge at the 1st stage of its construction. The next stage repeats this process on all the little cubes that make up what’s left. That’s M2. Repeat the process over and over, and in the limit you get Menger’s sponge, a fractal with zero volume and infinite area!

This business of zero volume and infinite area may sound unreal, but the steps along the way to the limit are very tangible. Here’s a video by Aaron Benzel that lets you fly through M3, and watch what happens if you split M3 apart.

You can compute the volume and area at each stage to show that

\mathrm{volume}(M_n) = \left(\frac{20}{27} \right )^n


\mathrm{area}(M_n) = 2\left(\frac{20}{9} \right )^n + 4 \left(\frac{8}{9} \right )^n

From these equations you can see that you can make the volume as small as you’d like, and the area as large as you like, by taking n big enough. And in case that sounds a little hand-wavey, we can get more concrete. Here’s a little code to find exactly how big a value of n is big enough.

    from math import log, ceil

    def menger_volume(n):
        return (20/27.)**n

    def menger_area(n):
        return 2*(20/9.)**n + 4*(8/9.)**n

    def volume_below(v):
        # Smallest n >= 1 with volume(M_n) <= v
        if v >= 1:
            return 1
        n = log(v)/log(20/27.)
        return int(ceil(n))

    def area_above(a):
        # An n large enough that the 2(20/9)^n term alone reaches a
        if a < 2:
            return 1
        n = (log(a) - log(2))/log(20/9.)
        return int(ceil(n))

    n = volume_below(0.001)
    print( n, menger_volume(n) )
    n = area_above(1000)
    print( n, menger_area(n) )


A misunderstanding of complexity

Iterating simple rules can lead to complex behavior. Many examples of this are photogenic, and so they’re good for popular articles. It’s fun to look at fractals and such. I’ve written several articles like that here, such as the post that included the image below.

But there’s something in popular articles on complexity that bothers me, and it’s the following logical fallacy:

Complex systems can arise from iterating simple rules, therefore many complex systems do arise from iterating simple rules.

This may just be a popular misunderstanding of complexity theory, but I also get the impression that some people working in complexity theory implicitly fall for the same fallacy.

What fractals, cellular automata, and such systems show is that it is possible to start with simple rules and create a complex system. It says nothing about how often this happens. It does not follow that a particular complex system does in fact arise from iterating simple rules.

There’s a variation on the fallacy that says that while complex systems may not exactly come from iterating simple rules, it’s possible to approximate complex systems this way. Even that is dubious, as I discuss in The other butterfly effect.

At best I think these popular models serve as metaphors and cautionary tales. Strange attractors and such show that systems can exhibit unexpectedly complex behavior, and that forecasts can become useless when looking ahead too far. That’s certainly true more broadly.

But I’m skeptical of saying “You have a complex system. I have a complex system. Let’s see if my complex system models your complex system.” It’s often possible to draw loose analogies between complex systems, and these may be useful. But it’s not realistic to expect quantitative guidance to come out of this, such as using the Mandelbrot set to pick stocks.

Improving on the sieve of Eratosthenes

Ancient algorithm

Eratosthenes had a good idea for finding all primes less than an upper bound N over 22 centuries ago.

Make a list of the numbers 2 to N. Circle 2, then scratch out all the larger multiples of 2 up to N. Then move on to 3. Circle it, and scratch out all the larger multiples of 3.

Every time you start over and find a number that hasn’t been scratched out before, that means it’s not divisible by any numbers smaller than itself, so it’s prime. Circle it and repeat. This algorithm is known as the sieve of Eratosthenes.

You could turn this into an algorithm for factoring every number less than N by not just scratching off composite numbers but keeping track of what numbers they were divisible by.

Not bad for an algorithm that predates Hindu-Arabic numerals by nine centuries. But as you might imagine, there have been a few improvements over the last couple millennia.

New algorithm

A paper by Helfgott published last week [1] gives a refined version of the sieve that takes less time and less space. Helfgott is not the first to improve on the sieve of Eratosthenes, but the latest.

His paper shows that it is possible to find all primes less than N in time

O(N log N)

and space

O(N^(1/3) (log N)^(2/3)).

Furthermore, it is possible to factor all integers less than N in time

O(N log N)

and space

O(N^(1/3) (log N)^(5/3)).

He also addresses finding all primes and factoring all integers in an interval [N – Δ, N + Δ] provided

N^(1/3) (log N)^(2/3) ≤ Δ ≤ N.

In the case of such an interval, one can find all the primes in the interval in time O(Δ log N) and space O(Δ). And one can factor all integers in the interval in time and space O(Δ log N).

Helfgott’s paper gives detailed pseudocode for all the algorithms.
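For comparison, the classical segmented sieve for the interval problem, as an added example; this is not Helfgott's algorithm or code from his paper, and the function names are illustrative. It factors every integer in [N – Δ, N + Δ] by crossing off with all primes up to √(N + Δ), so it needs a table of primes of that size in addition to the O(Δ) working array.

```python
from math import isqrt


def small_primes(limit):
    """Plain sieve of Eratosthenes: all primes <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[:2] = b"\x00\x00"
    for i in range(2, isqrt(limit) + 1):
        if sieve[i]:
            # scratch out the multiples of i starting at i*i
            sieve[i * i::i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i, flag in enumerate(sieve) if flag]


def factor_interval(lo, hi):
    """Factor every integer in [lo, hi], lo >= 2.

    Returns {n: [(prime, exponent), ...]} with primes in increasing order.
    """
    remaining = list(range(lo, hi + 1))      # unfactored part of each n
    factors = [[] for _ in remaining]
    for p in small_primes(isqrt(hi)):
        start = -(-lo // p) * p              # first multiple of p >= lo
        for m in range(start, hi + 1, p):
            i = m - lo
            e = 0
            while remaining[i] % p == 0:     # record what divided m
                remaining[i] //= p
                e += 1
            factors[i].append((p, e))
    for i, r in enumerate(remaining):
        if r > 1:
            # what is left has no prime factor <= sqrt(hi), so it is prime
            factors[i].append((r, 1))
    return {lo + i: f for i, f in enumerate(factors)}


def primes_in_interval(n, delta):
    """Primes in [n - delta, n + delta]: numbers whose only factor is itself."""
    table = factor_interval(n - delta, n + delta)
    return [m for m, f in table.items() if f == [(m, 1)]]


if __name__ == "__main__":
    for m, f in factor_interval(999_998, 1_000_003).items():
        print(m, f)
    print(primes_in_interval(10**6, 20))
```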


[1] Harald Andrés Helfgott. An improved sieve of Eratosthenes, Mathematics of Computation, https://doi.org/10.1090/mcom/3438. Published online April 23, 2019.

How category theory is applied

Instead of asking whether an area of mathematics can be applied, it’s more useful to ask how it can be applied.

Differential equations are directly and commonly applied. Ask yourself what laws govern the motion of some system, write down these laws as differential equations, then solve them. Statistical models are similarly direct: propose a model and feed it data.

Linear algebra is extremely useful in application, but it’s not often applied so directly. Rarely would you look at a system and immediately see a linear system. Instead you might describe a system in terms of differential equations or statistical models, then use linear algebra as a key part of the solution process.

Numerical analysis is useful because, for example, it helps you practically solve large linear systems (which help you solve differential equations, which model physical systems).

Many areas of math are useful in application, but some are applied more directly than others. It’s a mistake to say something is not applied just because it is not applied directly.

The following quote from Colin McLarty describes how some of the most abstract math, including cohomology and category theory, is applied.

Cohomology does not itself solve hard problems in topology or algebra. It clears away tangled multitudes of individually trivial problems. It puts the hard problems in clear relief and makes their solution possible. The same holds for category theory in general.

While McLarty speaks of applications to other areas of math, the same applies to applications to other areas such as software engineering.

I suspect many supposed applications of category theory are post hoc glosses, dressing up ideas in categorical language that were discovered in a more tangible setting. At the same time, the applications of category theory may be understated because the theory works behind the scenes. As I discuss here, category theory may lead to insights that are best communicated in the language of the original problem, not in the language of categories.


Easter and exponential sums

For the last couple years, the exponential sum of the day for Easter Sunday has been a cross. This was not planned, since the image each day is determined by the numbers that make up the date, as explained here.

This was the exponential sum for Easter last year, April 1, 2018:

Exponential sum for Easter 2018

and this was the sum for this Easter, today, April 21, 2019:

Exponential sum for Easter 2019

A partial explanation is that Easter usually falls in April, the fourth month, and the sums for April often have four-fold symmetry. That said, many of the images in April look nothing like a cross. And in fact the image for next Easter, April 12, 2020, will be an asymmetric mess.

Exponential sum for Easter 2020

Groups in categories

The first time I saw a reference to a “group in a category” I misread it as something in the category of groups. But that’s not what it means. Due to an unfortunate choice of terminology, “in” is more subtle than just membership in a class.

This is related to another potentially misleading term, algebraic groups, mentioned in the previous post on isogenies. An algebraic group is a “group object” in the category of algebraic varieties. Note the mysterious use of the word “in.”

You may have heard the statement “A monad is just a monoid in the category of endofunctors.” While true, the statement is meant to be a joke because it is abstracted so far from what most people want to know when they ask what a monad is in the context of functional programming. But notice this follows the pattern of an X in the category of Y’s, even though here X stands for monoid rather than group. The meaning of “in the category of” is the same.

If you want to go down this rabbit hole, you could start with the nLab article on group objects. A group object is a lot like a group, but everything is happening “in” a category.

Take a look at the list of examples. The first says that a group object in the category of sets is a group. That’s reassuring. The second example says that a group object in the category of topological spaces is a topological group. At this point you may get the idea that an X in the category of Y’s simply adds an X structure to a Y thing. But further down you’ll see that a group object in the category of groups is an Abelian group, which is an indication that something more subtle is going on.


What is an isogeny?

The previous post said that isogenies between elliptic curves are the basis for a quantum-resistant encryption method, but we didn’t say what an isogeny is.

It’s difficult to look up what an isogeny is. You’ll find several definitions, and they seem incomplete or incompatible.

If you go to Wikipedia, you’ll read “an isogeny is a morphism of algebraic groups that is surjective and has a finite kernel.” The possibly misleading term here is “algebraic group.” It may sound like they’re throwing in the term “algebraic” for clarification as if to say “a group like the kind you’d talk about in math, as opposed to the kind of group you might talk about somewhere else like in sociology.” An algebraic group is indeed a group, but one with additional structure. A “morphism” is a structure-preserving map, and so here “morphism” means a function that preserves the algebraic and topological structure of an algebraic group.

The algebraic groups we care about are elliptic curves, so let’s specialize to elliptic curves. For the rest of this post all definitions and theorems will come from Handbook of Elliptic and Hyperelliptic Curve Cryptography by Cohen and Frey. This book defines isogeny implicitly by defining what it means for two curves to be isogenous.

Two curves E/K and E′/K are isogenous over K if there exists a morphism φ : E → E′ with coefficients in K mapping the neutral element of E to the neutral element of E′.

Here K is the field the curves are defined over. The neutral element is the identity element of the curve as a group, the point at infinity.

Something strange is going on here. The definition talks about two curves being isogenous. That sounds like a symmetric relationship, but the definition is not symmetric. It specifies the existence of a morphism in one direction but not in the other. But the book goes on to explain that if an isogeny exists in one direction, there necessarily exists a unique isogeny in the opposite direction, the dual isogeny, though it’s not obvious why this should be.

Another puzzling thing is that it doesn’t seem to take much for a function to be an isogeny: just map the group identity to the group identity. But there are other requirements implicit in the statement that φ is a morphism in the context of algebraic groups. Cohen and Frey do not require φ to be a homomorphism, as some authors do, but they point out that in fact φ will turn out to be a group homomorphism. They say “for more background on isogenies, we refer to section 4.3.4.” OK, let’s go there.

In that section the authors say that a morphism between Abelian varieties (a special case of algebraic groups which includes elliptic curves) is an isogeny if and only if it is surjective and has a finite kernel. That seems like a much stronger requirement than the definition above. However, in this context simply requiring a morphism to map the group identity to the group identity implies that the morphism will be surjective and have a finite kernel, and vice versa.

An isogeny is not an isomorphism

An isogeny is a group homomorphism, but not an isomorphism, not in the modern usage of the term. Historically, however, the term “isomorphism” was used for what we now call an isogeny.

We said above that the existence of an isogeny in one direction implied the existence of a dual isogeny in the opposite direction. These functions are not inverses of each other: their composition is not the identity. However, their composition does have a simple form: it is multiplication by an integer n, called the degree of the isogeny. (Multiplication here means repeated group addition, i.e. the composition takes an element x to the sum of n copies of x using the group law on the elliptic curve.)


Cohen and Frey give the example

\varphi : (x, y) \mapsto \left( \frac{x^2 + 301x + 527}{x + 301}, \frac{yx^2 + 602 yx + 1942y}{x^2 + 602x + 466} \right )

of an isogeny of degree 2 between the elliptic curves

y² = x³ + 1132x + 278


y² = x³ + 500x + 105

over the field with 2003 elements. The curves are not isomorphic because the group structure of the former is a cyclic group and the group structure of the latter is not.

Incidentally, there is a theorem that says two elliptic curves over a finite field are isogenous if and only if they have the same number of points. So a brute-force approach to showing that the curves above are isogenous would be to show that they both have the same number of points. And indeed, both have 1956 points.
