When is a function of two variables separable?

Posted on 18 January 2024 by John

Given a function f(x, y), how can you tell whether f can be factored into the product of a function g(x) of x alone and a function h(y) of y alone? Depending on how an expression for f is written, it may or may not be obvious whether f(x, y) can be separated into g(x) h(y).

There are several situations in which you might want to know whether a function is separable. For example, the ordinary differential equation

y′ = f(x, y)

can be solved easily when f(x, y) = g(x) h(y).

You might want to do something similar for a partial differential equation, using separation of variables, possibly choosing a coordinate system that allows the separation of variables trick to work.

Aside from applications to differential equations, you might want to know whether a polynomial in two variables can be factored into the product of polynomials in each variable separately.

In [1] David Scott gives a simple necessary condition for f to be separable:

f f_xy = f_x f_y

Here the subscripts indicate partial derivatives.

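It’s easy to see this condition is necessary: if f(x, y) = g(x) h(y), then f_x = g′h, f_y = gh′, and f_xy = g′h′, so both sides of the equation equal g g′ h h′. Scott shows the condition is also sufficient under some mild technical assumptions.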

As an example, determine the value of k such that the differential equation

y′ = 6xy² + 3y² − 4x + k

is separable.

Scott’s equation

f f_xy = f_x f_y

becomes

(6xy² + 3y² − 4x + k)(12y) = (6y² − 4)(12xy + 6y)

which holds if and only if k = −2.
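The calculation above can be checked with a few lines of Python. This is only a sketch: the partial derivatives are entered by hand, and the factors g(x) = 2x + 1 and h(y) = 3y² − 2 are my own factorization of f when k = −2, not something taken from Scott's paper.

```python
# Check Scott's condition f * f_xy == f_x * f_y for
# f(x, y) = 6xy^2 + 3y^2 - 4x + k, with partial derivatives computed by hand.

def f(x, y, k):
    return 6*x*y**2 + 3*y**2 - 4*x + k

def f_x(x, y):
    return 6*y**2 - 4

def f_y(x, y):
    return 12*x*y + 6*y

def f_xy(x, y):
    return 12*y

def scott_residual(x, y, k):
    """Left side minus right side of Scott's equation at (x, y)."""
    return f(x, y, k)*f_xy(x, y) - f_x(x, y)*f_y(x, y)

# Candidate factors when k = -2 (illustrative names, not from the paper).
def g(x):
    return 2*x + 1

def h(y):
    return 3*y**2 - 2

grid = [(x, y) for x in range(-5, 6) for y in range(-5, 6)]

# With k = -2 the residual vanishes on the grid and f equals g(x) h(y).
print(all(scott_residual(x, y, -2) == 0 for x, y in grid))
print(all(f(x, y, -2) == g(x)*h(y) for x, y in grid))

# With a different k, for example k = 0, the residual is nonzero somewhere.
print(any(scott_residual(x, y, 0) != 0 for x, y in grid))
```

Everything here is integer arithmetic, so the comparisons are exact rather than approximate.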

[1] David Scott. When is an Ordinary Differential Equation Separable? The American Mathematical Monthly, Vol. 92, No. 6, pp. 422–423

Applications of Bernoulli differential equations

Posted on 17 January 2024 by John

When a nonlinear first order ordinary differential equation has the form

$\frac{dy}{dx} + P(x)\,y = Q(x)\, y^n$

with n ≠ 1, the change of variables

$u = y^{1-n}$

turns the equation into a linear equation in u. The equation is known as Bernoulli’s equation, though Leibniz came up with the same technique. Apparently the history is complicated [1].
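As a sanity check on the change of variables, here is a small Python sketch. The example is my choice, not one from the post: take P = −1, Q = −1, and n = 2, which gives the logistic equation y′ = y − y². The substitution u = 1/y then gives the linear equation u′ + u = 1, and the code compares its closed-form solution with a numerical solution of the original equation. The Runge-Kutta helper `rk4` is written here for illustration.

```python
import math

def rk4(rhs, x0, y0, x1, steps):
    """Integrate y' = rhs(x, y) from x0 to x1 with the classical Runge-Kutta method."""
    h = (x1 - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = rhs(x, y)
        k2 = rhs(x + h/2, y + h*k1/2)
        k3 = rhs(x + h/2, y + h*k2/2)
        k4 = rhs(x + h, y + h*k3)
        y += h*(k1 + 2*k2 + 2*k3 + k4)/6
        x += h
    return y

# Bernoulli equation y' + P y = Q y^n with P = -1, Q = -1, n = 2,
# that is, y' = y - y^2.
def bernoulli_rhs(x, y):
    return y - y**2

# With u = y^(1-n) = 1/y the equation becomes u' + u = 1,
# whose solution is u(x) = 1 + (u0 - 1) exp(-x).
def y_closed_form(x, y0):
    u0 = 1 / y0
    u = 1 + (u0 - 1)*math.exp(-x)
    return 1 / u

y0, x_end = 0.1, 2.0
numeric = rk4(bernoulli_rhs, 0.0, y0, x_end, 2000)
exact = y_closed_form(x_end, y0)
print(numeric, exact)
```

The two printed values should agree to many decimal places, which is what you would expect if the substitution really does linearize the equation.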

It’s nice that Bernoulli’s equation can be solved in closed form, but is it good for anything? Other than doing homework in a differential equations course, is there any reason you’d want to solve Bernoulli’s equation?

Why yes, yes there is. According to [1], Bernoulli’s equation is a generalization of a class of differential equations that came out of geometric problems.

Someone asked about applications of Bernoulli’s equation on Stack Exchange and got a couple interesting answers.

The first answer said that a Bernoulli equation with n = 3 comes up in modeling frictional forces. See also this post on drag forces.

The second answer links to a paper on Bernoulli memristors.

[1] Adam E. Parker. Who Solved the Bernoulli Differential Equation and How Did They Do It? College Mathematics Journal, vol. 44, no. 2, March 2013.

The IQ Test That AI Can’t Pass

Posted on 16 January 2024 by Wayne Joubert

Large language models have recently achieved remarkable test scores on well-known academic and professional exams (see, e.g., [1], p. 6). On such tests, these models are at times said to reach human-level performance. However, there is one test that humans can pass but every AI method known to have been tried has abysmally failed.

The Abstraction and Reasoning Corpus (ARC) benchmark [2] was developed by François Chollet to measure intelligence in performing tasks never or rarely seen before. We all do tasks something like this every day, like making a complicated phone call to correct a mailing address. The test is composed of image completion problems similar to Raven’s Progressive Matrices but more complex. Given images A, B and C, one must identify the image D such that the relationship “A is to B as C is to D” holds. Sometimes several examples of the A:B relationship are given.

The problem is hard because the relationship patterns between A and B that humans could easily identify (for example, image shrinking, rotating, folding, recoloring, etc.) might be many, many different things, more than can easily be trained for. By construction, every problem is qualitatively, unpredictably different, so the common approach of training on the training set doesn’t work. Instead, bona fide reasoning on a new kind of problem for each case is required.

Several competitions with prize money have encouraged progress on the ARC benchmark [3]. In these, each entrant’s algorithm must be tested against an unseen ARC holdout set. The leaderboard for the ARCathon 2023 challenge completed last month shows a top score of 30 percent [4]; this is excellent progress on a very hard problem, but far from a perfect score or anything else resembling passing.

Ilya Sutskever has famously warned we shouldn’t bet against deep learning, and perhaps a future LLM will do much better on this benchmark. Others feel a new approach is needed, for example, from the burgeoning field of neurosymbolic methods. In any case, these results show at the present moment in this rapidly progressing field, we don’t seem to be anywhere close to strong forms of AGI, artificial general intelligence.

[1] OpenAI, “GPT-4 Technical Report,” https://cdn.openai.com/papers/gpt-4.pdf

[2] François Chollet, “On the Measure of Intelligence,” https://arxiv.org/abs/1911.01547

[3] “Abstraction and Reasoning Challenge,” https://www.kaggle.com/c/abstraction-and-reasoning-challenge

[4] “Winners – Lab42,” https://lab42.global/past-challenges/arcathon-2023/

Means of means bounding the logarithmic mean

Posted on 16 January 2024 by John

The geometric, logarithmic, and arithmetic means of a and b are defined as follows.

$\begin{align*} G &= \sqrt{ab} \\ L &= \frac{b - a}{\log b - \log a} \\ A &= \frac{a + b}{2} \end{align*}$

A few days ago I mentioned that G ≤ L ≤ A. The logarithmic mean slips between the geometric and arithmetic means.

Or to put it another way, the logarithmic mean is bounded by the geometric and arithmetic means. You can bound the logarithmic mean more tightly with a mixture of the geometric and arithmetic means.

In [1] the authors show that

$G^{2/3} A^{1/3} \leq L \leq \tfrac{2}{3}G + \tfrac{1}{3}A$

Note that the leftmost expression is the geometric mean of G, G, and A, and the rightmost expression is the arithmetic mean of G, G, and A. We can write this as

$G(G, G, A) \leq L \leq A(G, G, A)$

where G with no argument is the geometric mean of a and b and G with three arguments is the geometric mean of those arguments, and similarly for A.

The following plot shows how well these means of means bound the logarithmic mean. We let a = 1 and let b vary from 1 to 100.

The upper bound is especially tight for moderate values of b. When I first made the plot I let b run up to 10 and there were apparently only four curves in the plot. I had to pick a larger value of b before the curves for L and (2G + A)/3 could be distinguished.
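The bounds can also be checked numerically. The following Python sketch computes the gaps between L and each bound for a = 1 and a few values of b. The helper names `means` and `bounds` are made up for this illustration.

```python
import math

def means(a, b):
    """Return (G, L, A) for 0 < a < b."""
    G = math.sqrt(a*b)
    L = (b - a) / (math.log(b) - math.log(a))
    A = (a + b) / 2
    return G, L, A

def bounds(a, b):
    """Return (lower, L, upper) with lower = G^(2/3) A^(1/3) and upper = (2G + A)/3."""
    G, L, A = means(a, b)
    lower = G**(2/3) * A**(1/3)
    upper = (2*G + A) / 3
    return lower, L, upper

# Print the gap below and above L for a = 1 and several values of b.
for b in (2, 10, 100):
    lower, L, upper = bounds(1, b)
    print(b, L - lower, upper - L)
```

Both gaps should be non-negative. The code requires b > a because the formula for L is 0/0 when a = b.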

[1] Graham Jameson and Peter R. Mercer. The Logarithmic Mean Revisited. The American Mathematical Monthly, Vol. 126, No. 7, pp. 641-645

When zeros at natural numbers implies zero everywhere

Posted on 11 January 2024 by John

Suppose a function f(z) equals 0 at z = 0, 1, 2, 3, …. Under what circumstances might you be able to conclude that f is zero everywhere?

Clearly you need some hypothesis on f. For example, the function sin(πz) is zero at every integer but certainly not constantly zero.

Carlson’s theorem says that if f is analytic and bounded for z with non-negative real part, and equals zero at non-negative integers, then f is constantly zero.

Carlson’s theorem doesn’t apply to sin(πz) because this function is not bounded in the right half-plane. It is bounded on the real axis, but that’s not enough. The identity

sin(z) = ( exp(iz) – exp(-iz) ) / 2i

shows that the sine function grows exponentially in the vertical direction.

Liouville’s theorem says that if a function is analytic and bounded everywhere then it must be constant. Carlson’s theorem does not require that the function f be bounded everywhere, only in the right half-plane.

In fact, the boundedness requirement can be weakened to requiring f(z) be O( exp(k|z|) ) for some k < π. This, in combination with having zeros at 0, 1, 2, 3, …, is enough to conclude that f is zero.
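Here is a small numerical illustration of why sin(πz) escapes the theorem, using only Python's standard `cmath` module. On the imaginary axis |sin(πiy)| = sinh(πy), which grows like exp(πy)/2. So sin(πz) is O( exp(k|z|) ) with k = π but not with any smaller k, which suggests the condition k < π cannot be relaxed to k ≤ π.

```python
import cmath
import math

def f(z):
    return cmath.sin(math.pi * z)

# f vanishes, up to rounding error, at the non-negative integers.
for n in range(5):
    print(n, abs(f(n)))

# On the imaginary axis |f(iy)| equals sinh(pi y), which is close to exp(pi y)/2.
for y in (1, 2, 4, 8):
    print(y, abs(f(1j*y)), math.sinh(math.pi*y), math.exp(math.pi*y)/2)
```

The values at the integers are not exactly zero in floating point because π is not exactly representable.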

Ky Fan’s inequality

Posted on 8 January 2024 by John

Let

$x = (x_1, x_2, x_3, \ldots, x_n)$

with each component satisfying 0 < x_i ≤ 1/2. Define the complement x′ by taking the complement of each entry.

$x' = (1 - x_1, 1 - x_2, 1 - x_3, \ldots, 1 - x_n)$

Let G and A represent the geometric and arithmetic mean respectively.

Then Ky Fan’s inequality says

$\frac{G(x)}{G(x')} \leq \frac{A(x)}{A(x')}$

Now let H be the harmonic mean. Since in general H ≤ G ≤ A, you might guess that Ky Fan’s inequality could be extended to

$\frac{H(x)}{H(x')} \leq \frac{G(x)}{G(x')} \leq \frac{A(x)}{A(x')}$

and indeed this is correct.
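A quick random test of the extended inequality might look like the following Python sketch. The function names and the random seed are arbitrary choices for illustration. A numerical check like this is evidence, not a proof.

```python
import math
import random

def arithmetic(xs):
    return sum(xs) / len(xs)

def geometric(xs):
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def harmonic(xs):
    return len(xs) / sum(1/x for x in xs)

def ky_fan_ratios(xs):
    """Return (H(x)/H(x'), G(x)/G(x'), A(x)/A(x')) where x' = 1 - x entrywise."""
    comp = [1 - x for x in xs]
    return (harmonic(xs) / harmonic(comp),
            geometric(xs) / geometric(comp),
            arithmetic(xs) / arithmetic(comp))

random.seed(2024)  # arbitrary seed, only for reproducibility
for _ in range(5):
    # Each entry lies in (0, 1/2], as the inequality requires.
    xs = [random.uniform(0.01, 0.5) for _ in range(6)]
    print(ky_fan_ratios(xs))
```

Each printed triple should be in non-decreasing order.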

Source: József Sándor. Theory of Means and Their Inequalities.

Integral representations of means

Posted on 6 January 2024 by John

The average of two numbers, a and b, can be written as the average of x over the interval [a, b]. This is easily verified as follows.

$\frac{1}{b-a}\int_a^b x\, dx = \frac{1}{b-a} \left( \frac{b^2}{2} - \frac{a^2}{2}\right) = \frac{a+b}{2}$

The average is the arithmetic mean. We can represent other means as above if we generalize the pattern to be

$\varphi^{-1}\left(\,\text{average of } \varphi(x) \text{ over } [a, b] \,\right )$

For the arithmetic mean, φ(x) = x.

Logarithmic mean

If we set φ(x) = 1/x we have

$\left(\frac{1}{b-a} \int_a^b x^{-1}\, dx \right)^{-1} = \left(\frac{\log b - \log a}{b - a} \right)^{-1} = \frac{b - a}{\log b - \log a}$

and the last expression is known as the logarithmic mean of a and b.

Geometric mean

If we set φ(x) = 1/x² we have

$\left(\frac{1}{b-a} \int_a^b x^{-2}\, dx \right)^{-1/2} = \left(\frac{1}{b-a}\left(\frac{1}{a} - \frac{1}{b} \right )\right)^{-1/2} = \sqrt{ab}$

which gives the geometric mean of a and b.

Identric mean

In light of the means above, it’s reasonable to ask what happens if we set φ(x) = log x. When we do we get a more arcane mean, known as the identric mean.

The integral representation of the identric mean seems natural, but when we compute the integral we get something that looks arbitrary.

$\begin{align*} \exp\left( \frac{1}{b-a} \int_a^b \log x\, dx \right) &= \exp\left( \left.\frac{1}{b-a} (x \log x - x)\right|_a^b \right) \\ &= \exp\left( \frac{b \log b - a \log a - b + a}{b-a} \right) \\ &= \frac{1}{e} \left( \frac{b^b}{a^a} \right)^{1/(b-a)} \end{align*}$

The initial expression looks like something that might come up in applications. The final expression looks artificial.

Because the latter is more compact, you’re likely to see the identric mean defined by this expression, then later you might see the integral representation. This is unfortunate since the integral representation makes more sense.

Order of means

It is well known that the geometric mean is no greater than the arithmetic mean. The logarithmic and identric means squeeze in between the geometric and arithmetic means.

If we denote the geometric, logarithmic, identric, and arithmetic means of a and b by G, L, I, and A respectively, then

$G \leq L \leq I \leq A$
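The integral representations and the ordering can both be checked numerically. The Python sketch below evaluates φ⁻¹ of the average of φ over [a, b] using Simpson's rule and compares the results with the closed forms above. The endpoints a = 2, b = 5 and the helper names `simpson` and `integral_mean` are arbitrary choices made for this illustration.

```python
import math

def simpson(func, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n even)."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i*h)
    return total * h / 3

def integral_mean(phi, phi_inv, a, b):
    """Apply phi_inv to the average of phi over [a, b]."""
    return phi_inv(simpson(phi, a, b) / (b - a))

a, b = 2.0, 5.0
A = integral_mean(lambda x: x, lambda y: y, a, b)            # phi(x) = x
L = integral_mean(lambda x: 1/x, lambda y: 1/y, a, b)        # phi(x) = 1/x
G = integral_mean(lambda x: x**-2, lambda y: y**-0.5, a, b)  # phi(x) = 1/x^2
I = integral_mean(math.log, math.exp, a, b)                  # phi(x) = log x

# Closed-form expressions from the text, for comparison.
closed = {
    "A": (a + b) / 2,
    "L": (b - a) / (math.log(b) - math.log(a)),
    "G": math.sqrt(a*b),
    "I": (b**b / a**a)**(1/(b - a)) / math.e,
}

for name, value in (("G", G), ("L", L), ("I", I), ("A", A)):
    print(name, value, closed[name])
```

The numerical and closed-form columns should agree closely, and the rows are printed in the order G, L, I, A so the ordering is visible at a glance.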

Sierpiński’s inequality

Posted on 5 January 2024 by John

Let A_n, G_n and H_n be the arithmetic mean, geometric mean, and harmonic mean of a set of n numbers.

When n = 2, the arithmetic mean times the harmonic mean is the geometric mean squared. The proof is simple:

$A_2(x, y) H_2(x, y) = \left(\frac{x + y}{2}\right)\left(\frac{2}{\frac{1}{x} + \frac{1}{y}} \right ) = xy = G_2(x,y)^2$

When n > 2 we no longer have equality. However, W. Sierpiński, perhaps best known for the Sierpiński triangle, proved that an inequality holds for all n. Given

$x = (x_1, x_2, \ldots, x_n)$

we have the inequality

$H_n(x)^{n-1}\, A_n(x) \leq G_n(x)^n \leq A_n(x)^{n-1}\, H_n(x)$
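The inequality is easy to test numerically. In the Python sketch below, the middle term G_n(x)^n is computed directly as the product of the entries. The function name `sierpinski_terms` is invented for this example.

```python
import math
import random

def arithmetic(xs):
    return sum(xs) / len(xs)

def harmonic(xs):
    return len(xs) / sum(1/x for x in xs)

def sierpinski_terms(xs):
    """Return (H^(n-1) A, G^n, A^(n-1) H) for a list of positive numbers."""
    n = len(xs)
    A, H = arithmetic(xs), harmonic(xs)
    # G^n is simply the product of the entries.
    return H**(n - 1) * A, math.prod(xs), A**(n - 1) * H

# For n = 2 the three terms coincide, up to rounding.
print(sierpinski_terms([3.0, 7.0]))

# For n > 2 they should appear in non-decreasing order.
print(sierpinski_terms([1.0, 2.0, 4.0]))

random.seed(1909)  # arbitrary seed, only for reproducibility
xs = [random.uniform(0.1, 10) for _ in range(6)]
print(sierpinski_terms(xs))
```

Note that `math.prod` requires Python 3.8 or later.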

[1] W. Sierpiński. Sur une inégalité pour la moyenne arithmétique, géométrique, et harmonique. Warsch. Sitzungsber., 2 (1909), pp. 354–357.

A curious pattern in January exponential sums

Posted on 4 January 2024 by John

The exponential sum page on this site draws a new image every day based on plugging the month, day, and year into a formula. Some of these images are visually appealing; I’ve had many people ask if they could use the images in publications or on coffee mugs etc.

The images generally look very different from one day to the next. One reason I include the date numbers in the order I do, using the American convention, is that this increases the variety from one day to the next.

If you first became aware of the page on New Year’s Day this year, you might think the page is broken because there was no apparent change between January 1 and January 2. Yesterday’s image was different, but then today, January 4, the image looks just like the images for January 1st and 2nd. They all look like the image below.

The plots produced on each day are distinct, but they are geometrically congruent.

The exponential sum page displays a plot connecting the partial sums of a certain series given here. The axes are turned off and so only the shape of the plot is displayed. If one plot is a translation or dilation of another, the images shown on the page will be the same.

Here’s a plot of the images for January 1, 2, and 4, plotted red, green, and blue respectively.

This shows that the images are not the same, but are apparently translations of each other.

There’s another difference between the images. Connecting consecutive partial sums draws an image clockwise on January 1, but counterclockwise on January 2 and 4. You can see this by clicking on the “animate” link on each page.

Is there an elliptic curve with 2024 points?

Posted on 2 January 2024 by John

On New Year’s Day I posted about groups of order 2024. Are there elliptic curves of order 2024?

The Hasse-Weil theorem relates the number of points on an elliptic curve over a finite field to the number of elements of the field. Namely, an elliptic curve E over a field with q elements must have cardinality

q + 1 − t

where

|t| ≤ 2√q.

So if there is an elliptic curve with 2024 points, the curve must be over a field with roughly 2024 elements.

The condition on t above is necessary for the existence of an elliptic curve of a certain size, but is it sufficient? Sorta.

The order of a finite field must be a prime power, i.e. q = p^d for some prime p. There is a theorem ([1], Theorem 13.30) that there exists a curve of the size indicated in the Hasse-Weil theorem if t ≠ 0 mod p. The theorem also lists a couple more sufficient conditions that are more complicated.

So, for example, we could take q = p = 2027 and t = 4.

Now that we know the search isn’t futile, we can search for an elliptic curve over the integers mod 2027 that has 2024 points. After a brief brute force search I found

y² = x³ + 4x + 28

over the field with 2027 elements is such a curve.
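The post doesn’t show the search code, so the following Python sketch is just one way to count points by brute force, not necessarily the method used. It counts affine solutions of y² = x³ + ax + b mod p and adds one for the point at infinity. The function names are made up for this illustration. The printed count can be compared with the 2024 reported above.

```python
import math

def count_points(a, b, p):
    """Count points on y^2 = x^3 + a x + b over the integers mod a prime p,
    including the point at infinity."""
    # square_roots[r] is the number of y in 0..p-1 with y^2 = r (mod p).
    square_roots = [0] * p
    for y in range(p):
        square_roots[y*y % p] += 1
    affine = sum(square_roots[(x**3 + a*x + b) % p] for x in range(p))
    return affine + 1

def is_nonsingular(a, b, p):
    """The curve is nonsingular when 4a^3 + 27b^2 is nonzero mod p."""
    return (4*a**3 + 27*b**2) % p != 0

p, a, b = 2027, 4, 28
N = count_points(a, b, p)
t = p + 1 - N

print(is_nonsingular(a, b, p))
print(N, t)
# Bound from the Hasse-Weil theorem: |t| <= 2 sqrt(q).
print(abs(t) <= 2*math.sqrt(p))
```

Tabulating how many square roots each residue has keeps the count to roughly 2p operations, rather than testing every (x, y) pair.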

[1] Henri Cohen and Gerhard Frey. Handbook of Elliptic and Hyperelliptic Curve Cryptography. Chapman & Hall/CRC. 2006.
