When is a function of two variables separable?

Posted on 18 January 2024 by John

Given a function f(x, y), how can you tell whether f can be factored into the product of a function g(x) of x alone and a function h(y) of y alone? Depending on how an expression for f is written, it may or may not be obvious whether f(x, y) can be separated into g(x) h(y).

There are several situations in which you might want to know whether a function is separable. For example, the ordinary differential equation

y′ = f(x, y)

can be solved easily when f(x, y) = g(x) h(y).

You might want to do something similar for a partial differential equation, using separation of variables, possibly choosing a coordinate system that allows the separation of variables trick to work.

Aside from applications to differential equations, you might want to know whether a polynomial in two variables can be factored into the product of polynomials in each variable separately.

In [1] David Scott gives a simple necessary condition for f to be separable:

f f_xy = f_x f_y

Here the subscripts indicate partial derivatives.

It’s easy to see this condition is necessary. Scott shows the condition is also sufficient under some mild technical assumptions.

As an example, determine the value of k such that the differential equation

y′ = 6xy² + 3y² −4x + k

is separable.

Scott’s equation

f f_xy = f_x f_y

becomes

(6xy² + 3y² −4x + k)(12y) = (6y² −4)(12xy + 6y)

which holds if and only if k = −2.

[1] David Scott. When is an Ordinary Differential Equation Separable? The American Mathematical Monthly, Vol. 92, No. 6, pp. 422–423

Applications of Bernoulli differential equations

Posted on 17 January 2024 by John

When a nonlinear first order ordinary differential equation has the form

$\frac{dy}{dx} + P(x)\,y = Q(x)\, y^n$

with n ≠ 1, the change of variables

$u = y^{1-n}$

turns the equation into a linear equation in u. The equation is known as Bernoulli’s equation, though Leibniz came up with the same technique. Apparently the history is complicated [1].

It’s nice that Bernoulli’s equation can be solve in closed form, but is it good for anything? Other than doing homework in a differential equations course, is there any reason you’d want to solve Bernoulli’s equation?

Why yes, yes there is. According to [1], Bernoulli’s equation is a generalization of a class of differential equations that came out of geometric problems.

Someone asked about applications of Bernoulli’s equation on Stack Exchange and got a couple interesting answers.

The first answer said that a Bernoulli equation with n = 3 comes up in modeling frictional forces. See also this post on drag forces.

The second answer links to a paper on Bernoulli memristors.

[1] Adam E. Parker. Who Solved the Bernoulli Differential Equation and How Did They Do It? College Mathematics Journal, vol. 44, no. 2, March 2013.

Convergent subsequence

Posted on 1 December 2023 by John

I was reading a theorem giving conditions for a divergent series to have a convergent subseries and had a sort of flashback.

I studied nonlinear PDEs in grad school, which amounted to applied functional analysis. We were constantly proving or using theorems about sequences having convergent subsequences, often subsequences that converged in a very weak sense.

This seemed strange to me at first. If a sequence diverges, why is it of any interest that a subsequence converges? This seemed like blackout poetry, completely changing the meaning of a text by selecting various words. For example, here is the opening paragraph of Pride and Prejudice, blacked out to appear to be a real estate ad.

good neighborhood, surrounding park

Here’s the big picture I was missing. We’re trying to show that a differential equation has a solution, and we’re doing that by some kind of successive approximation. Maybe our series of approximations doesn’t work in general, but that doesn’t matter. We’re just trying to find something that is a solution. Once you come up with a candidate solution, by whatever means, grasping at whatever straws you can grasp, you then prove that the candidate really is a solution, perhaps a solution in a weak sense. Then you show that this solution, potentially one of many, is unique. Then you show that your weak solution is a in fact a solution in a stronger sense.

Period of a nonlinear pendulum

Posted on 20 November 2023 by John

The term “nonlinear pendulum” is analogous to a retronym, a new name for an old thing to distinguish it from a new variation. For example, once upon a time a guitar was just a guitar. Now such a guitar is called an acoustic guitar to distinguish it from an electric guitar. Similarly, analog signal processing is a retronym to distinguish what was once the only kind of signal processing from the new arrival, digital signal processing.

The equation of motion for a pendulum is nonlinear. If the initial angle of displacement is sufficiently small, the linearized form of the equation is adequate for most applications. This linearized approximation is better known than the more accurate original equation, and so the un-linearized equation is known as the nonlinear pendulum equation.

The (nonlinear) equation of motion for a pendulum is the differential equation

$\theta'' + \frac{g}{\ell}\sin \theta = 0$

where g is the acceleration due to gravity and ℓ is the length of the pendulum. For small initial displacement θ₀ the linear approximation

$\theta'' + \frac{g}{\ell} \theta = 0$

works well. The smaller θ₀ is the more accurate the linear approximation is.

Linear and nonlinear period

The period of a pendulum obtained by solving the linearized equation is

$T = 2\pi \sqrt{\frac{\ell}{g}}$

The solution to the nonlinear pendulum equation is also periodic, though the solution is a combination of Jacobi functions rather than a combination of trig functions. The difference between the two solutions is small when θ₀ is small, but becomes more significant as θ₀ increases.

The difference in the periods is more evident than the difference in shape for the two waves. The period of the nonlinear solution is longer than that of the linearized solution. Here’s a plot of the solutions to the linear and nonlinear equations, with ℓ = g and θ₀ = 1.

The period for the nonlinear pendulum is given by

$T = 2\pi \sqrt{\frac{\ell}{g}}\, f(\theta_0)$

where f is an increasing function, equal to 1 at θ = 0.

The exact form of f involves special functions, and so there is naturally a lot of interest in approximations to f. The exact value is given by

$f(\theta) = \frac{2}{\pi}K(\sin(\theta/2)) = \frac{1}{\text{AGM}(1, \cos(\theta/2))}$

where K is the “complete elliptic integral of the first kind” and AGM is the arithmetic-geometric mean.

The AGM of two numbers is found by taking their ordinary (arithmetic) mean and geometric mean, then repeating the process. This process converges very rapidly, and so doing one step of the iteration gives a good approximation. If that’s not good enough, doing two steps gives an even better approximation, and so on. In fact, a common approximation for f(θ) is to do half a step, taking the geometric mean of 1 and cos(θ/2), i.e.

$f(\theta) \approx \frac{1}{\sqrt{\cos(\theta/2)}}$

To see how accurate this approximation is, let’s plot the exact and approximate values of f.

The two curves can hardly be distinguished visually, so let’s look at a plot of their difference.

Here’s the code that produced the two plots above.

from scipy.special import ellipk
from numpy import sin, cos, pi, linspace
import matplotlib.pyplot as plt

def exact(θ):
    return 2*ellipk(sin(θ/2)**2)/pi

def approx(θ):
    return cos(θ/2)**-0.5

t = linspace(0, 1.5)
plt.plot(t, exact(t))
plt.plot(t, approx(t))
plt.xlabel(r"$\theta$")
plt.legend(["exact", "approx"])
plt.savefig("nonlinear_pendulum1.png")
plt.close()

plt.plot(t, exact(t) - approx(t))
plt.xlabel(r"$\theta$")
plt.ylabel("approximation error")
plt.savefig("nonlinear_pendulum2.png")

NB: There are two conventions for defining the complete elliptic integral of the first kind. SciPy uses a convention for K that requires us to square the argument.

Driven oscillations

The differences between the linearized and nonlinear equation become more apparent when there is a forcing function, i.e. when the right-hand side of the differential equations is not zero. Here are the solutions when the forcing function is cos(2t).

Now the solutions not only differ in their period, the shapes of the solutions are substantially different. The linear solutions are well-behaved but the nonlinear solutions can be chaotic with sensitive dependence on initial conditions. This remains true if a damping term is added.

Rational solution to Korteweg–De Vries equation

Posted on 13 November 2023 by John

Students seeing differential equations for the first time expect every equation to have a nice closed-form solution, because up to that point in their education nearly every problem they’ve seen has been contrived to have a nice closed-form solution.

Once you resign yourself to the fact that a differential equation will rarely have a closed form solution, it’s a treat when you run across one that does. This is especially true for nonlinear equations.

The Korteweg–De Vries (KdV) equation is

$u_t - 6 u\, u_x + u_{xxx} = 0$

is such a treat. I wrote a few days ago about the sech² solution to the KdV equation.

$u(x,t) = -\frac{v}{2} \,\text{sech}^2\left(\frac{\sqrt{v}}{2} (x - vt - a)\right )$

There’s also a rational solution:

$u(x, t) = 6 x \frac{ \left(x^3-24 t\right)}{\left(x^3 + 12 t\right)^2}$

We can verify this is a solution to the KdV equation reusing the Mathematica code from the earlier post.

    u[x_, t_] := u[x_, t_] := 6 x (x^3 - 24 t)/(x^3 + 12 t)^2
    Simplify[ D[u[x, t], {t, 1}] 
            - 6 u[x, t] D[u[x, t], {x, 1}] 
            + D[u[x, t], {x, 3}] ]

This simplifies to 0.

Here’s a plot:

The top of the plot looks like a two-lane road on top of a mountain ridge, with a sinkhole in the middle of the road.

The “road” is a artifact of plotting. The solution is singular along the curve x³ + 12t= 0, and Mathematica had to chop the top of the graph off because it can’t plot an infinitely tall function.

Solitons and the KdV equation

Posted on 3 November 2023 by John

Rarely does a nonlinear differential equation, especially a nonlinear partial differential equation, have a closed-form solution. But that is the case for the Korteweg–De Vries equation.

(Technically I should say it’s rare for a naturally-occurring nonlinear differential equation to have a closed-form solution. You can always start with a solution and cook up a contrived differential equation that it satisfies, but that differential equation will not be one that is interesting in applications.)

The Korteweg–De Vries (KdV) equation is

$u_t - 6 u\, u_x + u_{xxx} = 0$

This equation is used to model shallow water waves.

The KdV equation is a third-order PDE, which is unusual but not unheard of. As I wrote about earlier in the context of ODEs, third order linear equations are virtually nonexistent in application, but third order nonlinear equations are not so uncommon.

The function

$u(x,t) = -\frac{v}{2} \,\text{sech}^2\left(\frac{\sqrt{v}}{2} (x - vt - a)\right )$

is a solution to the KdV equation. It is an example of a class of functions known as solitons.

You can increase the amplitude by increasing v, but when you do you also increase the velocity of the wave. You can’t vary amplitude and velocity independently because the KdV equation is nonlinear.

Verifying by hand that this is a solution is tedious, so I’ll show the verification in Mathematica.

    u[x_, t_] := - (v/2) Sech[Sqrt[v] (x - v t - a)/2]^2
    Simplify[  D[u[x, t], {t, 1}]  
               - 6 u[x, t] D[u[x, t], {x, 1}] 
               + D[u[x, t], {x, 3}]  ]

This returns 0. (Without the Simplify[] command it returns a mess, but the mess simplifies to 0.)

When a function cannot be extended

Posted on 9 October 2023 by John

The relation between a function and its power series is subtle. In a calculus class you’ll see equations of the form “series = function” which may need some footnotes. Maybe the series only represents the function over part of its domain: the function extends further than the power series representation.

Starting with the power series, we can ask whether the function it represents extends further than the series representation.

This video does a nice job of explaining why a particular function cannot be extended beyond the disk on which the series converges.

Toward the end, the video explains how its main example is a member of a broader class of functions that have no analytic continuation. The technical term, which the video does not use, is lacunary series [1]. When the gaps in a power series grow faster than linearly, the series cannot be extended beyond its radius of convergence.

Lacunary series make interesting images since the behavior of the function becomes complicated toward the edge of the domain. The video gives some nice examples. The image above comes from this post and the following image comes from this post.

Differential equations

The video mentions Hadamard’s gap theorem. I believe his gap theorem was a spin-off of his work on Laplace’s equation. See this post on Hadamard’s counterexample to the Dirichlet principle for the Laplacian.

The motivation for a LOT of classical math was differential equations. I didn’t realize this as a student. Years later I’d run into something and think “So that is why this person was interested in that problem,” such as why Hadamard would care about whether power series could be extended.

Hadamard wanted to solve a differential equation on a disk with boundary conditions specified on the rim. It’s going to be a problem if the series representation of the solution doesn’t extend to the rim.

- When does a function equal its power series?
- Grand unified theory of 19th century math

[1] Lacuna is the Latin word for a hole or a pit. The word came to be use metaphorically for a gap, such as a gap in a manuscript. Later mathematicians used this term for power series with increasing gaps between non-zero terms.

Test functions

Posted on 3 October 2023 by John

Test functions are how you can make sense of functions that aren’t really functions.

The canonical example is the Dirac delta “function” that is infinite at the origin, zero everywhere else, and integrates to 1. That description is contradictory: a function that is 0 almost everywhere integrates to 0, even if you work in extended real numbers where a function can take on the value ∞.

You can make things like the delta function rigorous by saying they’re not functions of real numbers, but functions that operate on other functions, i.e. test functions. More on that here. These functions acting on test functions are called generalized functions or distributions. (This this post for how this kind of distribution differs from a probability distribution, but is analogous.)

Analogy with test charges

To say rigorously how these generalized functions behave you show how they act on test functions φ. Test functions are analogous to test charges: you can describe an electric field by saying what force it would exert on a test charge.

A test charge is somewhat idealized. It has to be so small that it tests a field without effecting it. This isn’t really possible, but you can think of it as a limit. You’re looking at the limit of the force per unit charge as the charge goes to zero.

Similarly, a test function is ideal in that it very well behaved so the generalized functions that act on it can be badly behaved. Test functions are infinitely differentiable, and they either have compact support or have extremely thin tails, depending on context.

Analogy with category theory

While writing the previous post I thought about an analogy between distribution theory and category theory. I worked with distribution theory a lot in grad school, and I found it natural that the definition of a distribution depended on how it acted in relation to something else, i.e. how it acted on all test functions.

But I found category definitions that involved extraneous objects puzzling. For example, the product of two objects is a third object such that for any fourth object (!) a certain diagram commutes. Why is this superfluous object doing injecting itself into the definition? If I’d thought of it as a test object then I would have found the definition more palatable.

As with distribution theory, you’re defining something by how it relates to all elements of some collection. But in distribution theory, your distributions and your test functions are very distinct things. In category theory, your test objects are peers of the thing you’re testing.

Software and the Allee effect

Posted on 4 August 2023 by John

The Allee effect is named after Warder Clyde Allee who added a term to the famous logistic equation. His added term is highlighted in blue.

$\frac{dN}{dt} = r N {\color{red}\left( \frac{N}{A} - 1 \right)} \left( 1 - \frac{N}{K} \right)$

Here N is the population of a species over time, r is the intrinsic rate of increase, K is the carrying capacity, and A is the critical point.

If you remove Allee’s term, you get an equation saying that the rate of growth of a population is proportional to the current population size, and so growth starts out exponential, and a term (1 – N/K), which says growth slows down as the population approaches its carrying capacity.

Allee’s term (N/A – 1) says that the rate of growth becomes negative when the population falls below some threshold A. When there are too few individuals, survival becomes more difficult.

Software metaphor

I thought of the Allee effect as a metaphor for software technology after writing my previous post. In general, problems become easier to solve over time. Software development may become harder because the problems developers solve are changing, but solving old problems typically gets easier. Algorithms improve and get wrapped up for convenience. There’s something like logistic growth where tasks get easier to solve, but improvement slows down over time.

If a problem is specialized, it can run into something like the Allee effect. It becomes harder over time because fewer people are interested in it. Software isn’t maintained as fast as it degrades. Fewer people have experience with it. It’s harder to be a COBOL programmer, for example, than it used to be. But this can also apply to much more current problems. A problem that was hot five years ago can be harder to solve now than it was then, for reasons the previous post discusses.

Complex differential equations

Posted on 5 July 2023 by John

Differential equations in the complex plane are different from differential equations on the real line.

Suppose you have an nth order linear differential equation as follows.

$y^{(n)} + a_{n-1}y^{(n-1)} + a_{n-2}y^{(n-2)} + \cdots a_0y = 0$

The real case

If the a‘s are continuous, real-valued functions on an open interval of the real line, then there are n linearly independent solutions over that interval. The coefficients do not need to be differentiable for the solutions to be differentiable.

The complex case

If the a‘s are continuous, complex-valued functions on an connected open of the real line, then there may not be n linearly independent solutions over that interval.

If the a‘s are all analytic, then there are indeed n independent solutions. But if any one of the a‘s is merely continuous and not analytic, there will be fewer than n independent solutions. There may be as many as n − 1 solutions, or there may not be any solutions at all.

Source: A. K. Bose. Linear Differential Equations on the Complex Plane. The American Mathematical Monthly, Vol. 89, No. 4 (Apr., 1982), pp 244–246.

Differential equations

When is a function of two variables separable?

Related posts

Applications of Bernoulli differential equations

Related posts

Convergent subsequence

Related posts

Period of a nonlinear pendulum

Linear and nonlinear period

Driven oscillations

Related posts

Rational solution to Korteweg–De Vries equation

Solitons and the KdV equation

Related posts

When a function cannot be extended

Differential equations

Related posts

Test functions

Analogy with test charges

Analogy with category theory

Software and the Allee effect

Software metaphor

Complex differential equations

The real case

The complex case