One of my lightbulb moments in college was when my professor, Jim Vick, explained the Lagrange multiplier theorem. The way I’d seen it stated in a calculus text gave me no feel for why it should be true, but his explanation made sense immediately.

Suppose *f*(*x*) is a function of several variables, i.e. *x* is a vector, and *g*(*x*) = *c* is a constraint. Then the Lagrange multiplier theorem says that at the maximum of *f* subject to the constraint *g*(*x*) = *c*, we have ∇*f* = λ ∇*g* for some scalar λ.
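To make the statement concrete, here is a small worked example. The particular choice of *f* and *g* is mine, picked only because the algebra is short: maximize *x* + *y* on the unit circle.

```latex
\begin{align*}
f(x, y) &= x + y, \qquad g(x, y) = x^2 + y^2 = 1 \\
\nabla f &= (1, 1), \qquad \nabla g = (2x, 2y) \\
\nabla f = \lambda \nabla g
  &\;\Longrightarrow\; 1 = 2\lambda x, \quad 1 = 2\lambda y
  \;\Longrightarrow\; x = y = \frac{1}{2\lambda} \\
x^2 + y^2 = 1
  &\;\Longrightarrow\; \frac{2}{4\lambda^2} = 1
  \;\Longrightarrow\; \lambda = \pm\frac{1}{\sqrt{2}} \\
\lambda = \tfrac{1}{\sqrt{2}}
  &\;\Longrightarrow\; (x, y) = \left(\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right),
  \quad f = \sqrt{2} \quad \text{(the maximum)}
\end{align*}
```

The other root, λ = −1/√2, gives the constrained minimum, which is a reminder that the gradient condition picks out candidate points rather than certifying a maximum.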

Where does this mysterious λ come from? And why should the gradient of your objective function be related to the gradient of a constraint? These seem like two different things that shouldn’t even be comparable.

Here’s the geometric explanation. The set of points satisfying *g*(*x*) = *c* is a surface. And for any *k*, the set of points satisfying *f*(*x*) = *k* is also a surface. Imagine *k* very large, larger than the maximum of *f* on the surface defined by *g*(*x*) = *c*. You could think of the surface *g*(*x*) = *c* being a balloon inside the larger balloon *f*(*x*) = *k*.

Now gradually decrease *k*, like letting the air out of the outer balloon, until the surfaces *g*(*x*) = *c* and *f*(*x*) = *k* first touch. At that point, the two surfaces will be tangent, and so their normal vectors, given by their gradients, point in the same direction. That is, ∇*f* and ∇*g* are parallel, and so ∇*f* is some multiple of ∇*g*. Call that multiple λ.
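The balloon picture can be checked numerically. The sketch below is only an illustration under assumptions of my own: the constraint is the unit circle, *g*(*x*, *y*) = *x*² + *y*² = 1, and the objective is *f*(*x*, *y*) = (*x* − 1)² + (*y* − 1)², whose level sets are circles centered at (1, 1) that shrink as *k* decreases. It finds the constrained maximum by brute force over the circle, then checks that the two gradients are parallel there.

```python
import math

# Illustrative choices, not from the original post:
# f has circular level sets centered at (1, 1); g = 1 is the unit circle.
def f(x, y):
    return (x - 1.0) ** 2 + (y - 1.0) ** 2

def grad_f(x, y):
    return (2.0 * (x - 1.0), 2.0 * (y - 1.0))

def grad_g(x, y):
    # Gradient of g(x, y) = x^2 + y^2
    return (2.0 * x, 2.0 * y)

def constrained_max(n=100_000):
    """Brute-force search for the maximum of f on the unit circle."""
    best = None
    for i in range(n):
        t = 2.0 * math.pi * i / n
        x, y = math.cos(t), math.sin(t)
        val = f(x, y)
        if best is None or val > best[0]:
            best = (val, x, y)
    return best

# k is the level at which the shrinking "outer balloon" f = k first touches g = 1.
k, x, y = constrained_max()

fx, fy = grad_f(x, y)
gx, gy = grad_g(x, y)

# In 2D, two vectors are parallel exactly when this cross term vanishes.
cross = fx * gy - fy * gx

# Least-squares estimate of the multiplier in grad f = lambda * grad g.
lam = (fx * gx + fy * gy) / (gx * gx + gy * gy)

print(f"touching level k = {k:.6f} at ({x:.6f}, {y:.6f})")
print(f"cross term = {cross:.2e}, lambda = {lam:.6f}")
```

Solving ∇*f* = λ ∇*g* by hand for this example gives λ = 1 + √2 at the point (−1/√2, −1/√2), so the cross term should come out near zero and the estimated λ near 2.414, up to the resolution of the grid.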

I don’t know how well that explanation works when written down. But when I heard Jim Vick explain it, moving his hands in the air, it was an eye-opener.

This is not a rigorous proof, and it does not give the most general result possible, but it explains what’s going on. It’s something to keep in mind when reading proofs that are more rigorous or more general. As I comment on here,

Proofs serve two main purposes: to establish that a proposition is true, and to show *why* it is true.

The literally hand-wavy proof scores low on the former criterion and high on the latter.

***

Jim Vick was a great teacher. Some of us affectionately called him The Grinning Demon because he was always smiling, even while he gave devilishly hard homework. He was Dean of Natural Sciences when I took a couple classes from him. He later became Vice President for Student Affairs and kept teaching periodically. He has since retired but still teaches.

After taking his topology course, several of us asked him to teach a differential geometry course. He hesitated because it’s a challenge to put together an undergraduate differential geometry course. The usual development of differential geometry uses so much machinery that it’s typically a graduate-level course.

Vick found a book that starts by looking only at manifolds given by level sets of smooth functions, like the surfaces discussed above. Because these surfaces sit inside a Euclidean space, you can quickly get to some interesting geometric results using elementary methods. We eventually got to the more advanced methods, but by then we had experience in a more tangible setting. As Michael Atiyah said, abstraction should follow experience, not precede it.

Nice explanation.

When I look at a proof, I try, not always successfully, to imagine a thought process the theorem’s discoverer may have used to come up with the result. In this case, the balloon analogy fits the bill.

I took CVX101 from Stanford University, but did not understand this equation. Now it totally makes sense!

Hi John,

Thanks for this — it’s something I kinda-sorta have a feel for, but it’s not under my fingernails (yet). However … please may I trouble you to adjust the wording of your initial statement of the theorem?

“Then the Lagrange multiplier theorem at the maximum of f subject to the constraint g we have ∇f = λ ∇g.”

That sentence no verb. It looks like you were so keen to get on to the proof that you omitted a word or three from the statement … maybe?