Higher roots and r-means

The previous post looked at a simple method of finding square roots that amounts to a special case of Newton’s method, though it is much older than Newton’s method.

We can extend Newton’s method to find cube roots and nth roots in general. And when we do, we begin to see a connection to r-means. I’ve written about these means several times, most recently in connection with finding the perimeter of an ellipse and the surface area of an ellipsoid.

To find the nth root of y, we apply Newton’s root-finding method to find where the function

f(x) = x^n - y

is zero. We start with an initial estimate x0 and our updated estimate is

x_1 = \left( \frac{n-1}{n}\right ) x_0 + \left(\frac{1}{n}\right ) \frac{y}{x_0^{n-1}}

When n = 2, this reduces to the method in our previous post: the updated estimate x1 equals the average of our initial estimate x0 and y/x0. That is, our updated estimate is the arithmetic mean of x0 and y/x0, and the geometric mean of the two terms is the square root of y.

For n in general, we have two terms whose geometric mean is the nth root of y, and we take their weighted arithmetic mean. Said another way, our updated estimate is a convex combination of these two terms. The rest of the post will explore this further and point out some connections.

Weighted means and geometric mean

The ellipse and ellipsoid posts mention above make use of the means

\mathfrak{M}_r(x) = \left( \frac{1}{n} \sum_{i=1}^n x_i^r \right)^{1/r}

as defined in Hardy, Littlewood, and Pólya. More generally the authors define

\mathfrak{M}_r(x, p) = \left( \sum_{i=1}^n p_i x_i^r \right)^{1/r}

where the weights pi are positive numbers that sum to 1. The unweighted mean corresponds to the special case where all pi equal 1/n.

The authors also define the geometric mean

\mathfrak{G}(x) = \left( \prod_{i=1}^n x_i \right)^{1/n}

and weighted extensions which we will not need here.

Connections between means

We note two connections between the geometric mean and the r-mean. First, the geometric mean is the exponential of the arithmetic mean of the logarithms.

\mathfrak{G}(x) = \exp\left(\mathfrak{M}_1(\log x) \right )

Second, the geometric mean is the limit of the r-means as r goes to 0, and so you could define the r-mean with r = 0 to be the geometric mean.

\lim_{r\to 0} \,\mathfrak{M}_r(x) = \mathfrak{G}(x) \equiv \mathfrak{M}_0(x)

Newton’s method

Reading Newton’s method for nth roots in terms of r means, we take an initial estimate x0 and an auxiliary estimate x0‘ such that their geometric mean is the nth root we’re after. Then we take the arithmetic mean of x0 and x0‘ with weights (n-1)/n and 1/n.

That is, let

a = (x_0, x_0')

and solve for x_0′ such that

\mathfrak{G} (a) = \sqrt[n]{y}

Then

x_1 = \mathfrak{M}_1(a, p)

where

p = \left( \frac{n-1}{n}, \frac{1}{n} \right)

If we write the geometric mean as the r-mean with r = 0, we could describe Newton’s method entirely in terms of r-means.

One thought on “Higher roots and r-means

  1. Interesting connection.
    The definition of the geometric mean of course uses the very r-th root for which Newton’s method is being applied. This confused me a bit at first. But since the geometric mean is only “used” implicitly it all comes together in the end.

Comments are closed.