Smoothest curve through points: natural cubic spline

How do you fit the smoothest curve through a set of points? Suppose you have a set of increasing x values x₁, x₂, x₃, … , x_n and a corresponding set of y values y₁, y₂, y₃, … , y_n. You want to find the smoothest function f(x) such that f(x_i) = y_i for i = 1, 2, 3, …, n. The smoothest curve depends on how you define smoothness, but we’ll develop a common definition that leads to a nice solution.

Our definition of smoothness starts with the second derivative. If a curve is flat, it has zero second derivative. If the curve has a sharp kink, the second derivative will be high there. We want to add up the smoothness over the entire curve, so we’ll integrate the second derivative. But that won’t quite work. Second derivatives can be positive or negative, and a curve could have several sharp bends and still have average second derivative zero. Sharp bends in one direction could be canceled out by equally sharp bends in the opposite direction. So we look at the square of the second derivative. That gives a positive number whether the curve bends up or down. Also, squaring small numbers makes them smaller and squaring big numbers makes them bigger. So squaring the second derivative will penalize sharp bends and be more forgiving of small bends. In summary, we’ll measure the smoothness of a function f(x) over an interval [a, b] by the following integral.

$\int_a^b \left( f''(x) \right)^2\, dx$

Using the above measure of smoothness, the smoothest function interpolating a set of values is a natural cubic spline. I’ll explain this definition from right to left: first I’ll explain what a spline is, then a cubic spline, then a natural cubic spline. Cubic splines are very commonly used in graphical applications.

A spline is a function made by piecing together other functions. Most splines are piecewise polynomials, though there are other kinds of splines, such as trigonometric splines that are piecewise trig functions. The term “spline” comes from a mechanical device for drawing curves.

A cubic spline is formed by piecing together cubic polynomials. That is, the restriction of f to any interval between two x-values, from x_i to x_i+1 is a cubic polynomial. These piecewise cubic polynomials are formed by making their first and second derivatives match where they join. This makes a cubic spline curve appear very smooth.

Say we have n values of x and y. That means we have n − 1 cubic polynomials to fit together: one over [x₁, x₂], another over [x₂, x₃], etc. Each cubic polynomial has 4 coefficients, and so we have a total of 4(n − 1) coefficients to determine. At each interior node, i.e. the values from x₂ to x_n-1, there are four constraints. At each such x_i we have

The polynomial on the left side must have value y_i.
The polynomial on the right side must have value y_i.
The first derivatives of the left and right polynomials must match.
The second derivatives of the left and right polynomials must match.

This gives us a total of 4(n − 2) interior constraints. The values at the left and right ends are specified too, i.e. f(x₁) = y₁ and f(x_n) = y_n. This adds up to 4n-6 constraints on 4n − 4 coefficients. We need two more constraints to have as many equations as unknowns. There are various ways to add these constraints but a natural cubic spline imposes the requirement that the second derivatives should be zero at the ends, i.e. f”(x₁) = f”(x_n) = 0.

Finding the coefficients for a natural cubic spline with n nodes requires solving 4n − 4 equations in as many unknowns. There is a unique solution to this system of equations, and the system has a special structure that can be solved very quickly.

One thing I find interesting about natural cubic splines is how a fairly ad hoc procedure — patching cubic polynomials — leads to the solution to an elegant minimization problem.

Related: Help with interpolation

13 thoughts on “The smoothest curve through a set of points”

Daniel Lemire

7 February 2009 at 09:05

You might also like Dubuc’s iterative scheme.

1) Assume (for simplicity) that your data points are regularly spaced.

2) First try to find the middle points… that is, if you samples are every ten units, find the samples in between (at 5 units). So, compute the sample at f(i+1/2) like so:

-f(i-1)/16 + 9* f(i)/16+9*f(i+1)/16 – f(i+1)/16

3) Rinse and repeat, iteratively.

The net result is a differentiable curve. It is not twice differentiable though.

This, by the way, is the foundation for modern wavelet theory.

I have written a few papers about it:

Serge Dubuc, Daniel Lemire, and Jean-Louis Merrien, Fourier Analysis of 2-Point Hermite Interpolatory Subdivision Schemes, J. of Fourier An. and Appl., Volume 7, Issue 5, pages 537-552, 2001.
http://www.daniel-lemire.com/fr/abstracts/ETRANSF.html

Daniel Lemire, A Family of 4-Point Dyadic Multistep Subdivision Schemes. Proceedings of Curves and Surfaces 2002, pages 259-268, 2003.
http://www.daniel-lemire.com/fr/abstracts/CS2002.html

Carlos Cobas

8 February 2009 at 05:26

Nice post … You might want to take a look at the ‘Perfect Smoother’ algorithm:http://admin.pubs.acs.org/doi/abs/10.1021/ac034173t

If you cannot access to the online PDF article, let me know, I have a copy of it
Carlos

John Venier

9 February 2009 at 16:53

Proving this was one of the first problems we tackled in my first optimization course. It is elegant indeed. I like cubic splines because besides being easy, fast, and smooth, they are also pretty good at approximating much more complicated curves. As an experiment check out how close the cubic spline interpolation is to the Normal PDF for relatively few data points.

Also, the mental image of the physics of using splines (as in shipbuilding) to get a minimum energy curve through fixed points is to me unusually apt. I always hated the “vertical spring” image for linear regression, even though it matches the math. Because you just wouldn’t do it that way in real life. But splines are how it was done long before cubic spline approximation.

Ibrahim

10 November 2009 at 18:24

Nice article and post, I am interesting in finding the middle points of set of points that represent the boundary of an object in image, could be this solved using some like equations?

Qublai Khan

11 February 2010 at 14:03

I wonder why do we take the second derivative in quantifying the ‘smoothness’ of a curve? Wouldn’t the first derivative capture the same idea? Can you give any example of a situation where the first derivative falls short? Thanks.

John

11 February 2010 at 15:41

Qublai: Think of something like absolute value, |x|. The derivative is -1 for all x < 0 and the derivative is +1 for all x >0. The first derivative is never big. But the function |x| has a pointy corner. In a sense, the second derivative is infinite at 0. Now |x| isn’t differentiable, but you could come up with a smooth approximation and the same argument applies: the first derivative will never be large but the second derivative will be large at the corner.

Dan

27 October 2011 at 10:10

Actually, what you are referring to is not the “smoothest” but sounds a lot like “total curvature.” The “smoothest” (IMO) would be the curve fitting the points with the largest number of continuous derivatives.

Sean

29 December 2011 at 11:45

Dan,

I think you’re right in saying that a higher degree polynomial would be smoother, but the catch seems to be in being able to solve for such. I just don’t think you’d reasonably get the constraints you’d need to solve the system uniquely.

Ian

31 May 2012 at 09:31

The Catmull-Rom spline is another cubic spline that only has continuity of the first derivative. It also has the advantage of “local control” — changing the position of one point doesn’t affect the entire spline.

rdm

8 February 2013 at 10:50

I’m trying to rationalize this idea of smoothness with fitting points on a curve which originally defined using something more complex than a quadratic equation. (Let’s take points from the curve e^x for example.)

(I have not yet had time to work through the numbers though, to see how the axioms introduced here play out in that context.)

George Papazafeiropoulos

5 October 2016 at 11:51

A point would be how the approximation of the various types of splines to the actual curve defined by the points (xi,yi) can be measured. Is the smoothest curve also the most accurate? At a first thought, accuracy can be measured by the integral of the absolute value of the difference between actual curve and natural cubic spline, divided by the number of points at which the two curves coincide. This is a measure of relative accuracy, though, and can be different between the various types of splines.
To state the problem in a different way, which are the boundary conditions (continuity, etc.) that each spline segment must satisfy, in order to approximate a given curve in the best way?

Sean

23 March 2017 at 14:38

Does the “smoothest curve” mean the same as “minimum energy curve” mean the same as “minimum curvature”? I’m looking for a 2d interpolator that will ensure minimum curvature. Many thanks.

John

23 March 2017 at 15:22

In short, no. Natural cubic splines are the smoothest curve only in the sense described above.

Comments are closed.