Higher order Taylor series | Kronecker product and vec

Most sources that present Taylor’s theorem for functions of several variables stop at second order terms. One reason is that one or two terms are good enough for many applications.

But the bigger reason is that things get more complicated when you include higher order terms. As Lloyd Trefethen put it,

“In principle, the Taylor series of a function of n variables involves an n-vector, an n × n matrix, an n × n × n tensor, and so on. Actual use of orders higher than two, however, is so rare that the manipulation of matrices is a hundred times better supported in our brains and in our software tools than that of tensors.”

The kth order term in Taylor’s theorem is a rank k tensor. You can think of rank 0 tensors as numbers, rank 1 tensors as vectors, and rank 2 tensors as matrices. Then we run out of familiar objects. A rank 3 tensor requires you to start thinking in terms of tensors rather than more elementary objects.

There is a way to express Taylor’s theorem using only vectors and matrices. Maybe not the most elegant approach, depending on one’s taste, but it avoids any handwaving talk of a tensor being a “higher dimensional box of numbers” and such.

There’s a small price to pay. You have to introduce two new but simple ideas: the vec operator and the Kronecker product.

The vec operator takes an m × n matrix A and returns an mn × 1 matrix v, i.e. a column vector, by stacking the columns of A. The first m elements of v are the first column of A, the next m elements of v are the second column of A, etc.
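Here is a minimal sketch of vec in plain Python, assuming a matrix is stored as a list of rows. The function name `vec` and the small test matrix `M` are my own choices for illustration. With NumPy, `A.flatten(order="F")` should give the same column-major ordering.

```python
def vec(A):
    """Stack the columns of A (a list of rows) into one column.

    The result is returned as a flat list standing in for an mn x 1 column vector.
    """
    n_rows, n_cols = len(A), len(A[0])
    # Outer loop over columns, inner loop over rows: column-major order.
    return [A[i][j] for j in range(n_cols) for i in range(n_rows)]


# Hypothetical 2 x 3 example: the columns are (1, 4), (2, 5), (3, 6).
M = [[1, 2, 3],
     [4, 5, 6]]
print(vec(M))  # [1, 4, 2, 5, 3, 6]
```

Note that this is not the same as reading the matrix row by row, which is the default flattening order in many programming languages.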

The Kronecker product of an m × n matrix A and a p × q matrix B is an mp × nq matrix K = A ⊗ B. You can think of K as a block partitioned matrix. The ij block of K is a_ij B. In other words, to form K, take each element of A and replace it with its product with the matrix B.

A couple examples will make this clear.

$A = \left[ \begin{array}{cc} 1 & 2 \\ 3 & 4 \\ 5 & 6 \\ 7 & 8 \end{array} \right]$

$\mathrm{vec} \, A = \left[ \begin{array}{c} 1 \\ 3 \\ 5 \\ 7 \\ 2 \\ 4 \\ 6 \\ 8 \end{array} \right]$

$B = \left[ \begin{array}{ccc} 10 & 0 & 0 \\ 0 & 20 & 0 \\ 0 & 0 & 30 \end{array} \right]$

$A \otimes B = \left[ \begin{array}{cccccc} 10 & 0 & 0 & 20 & 0 & 0 \\ 0 & 20 & 0 & 0 & 40 & 0 \\ 0 & 0 & 30 & 0 & 0 & 60 \\ 30 & 0 & 0 & 40 & 0 & 0 \\ 0 & 60 & 0 & 0 & 80 & 0 \\ 0 & 0 & 90 & 0 & 0 & 120 \\ 50 & 0 & 0 & 60 & 0 & 0 \\ 0 & 100 & 0 & 0 & 120 & 0 \\ 0 & 0 & 150 & 0 & 0 & 180 \\ 70 & 0 & 0 & 80 & 0 & 0 \\ 0 & 140 & 0 & 0 & 160 & 0 \\ 0 & 0 & 210 & 0 & 0 & 240 \end{array} \right]$
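If you'd like to check the Kronecker product above by machine, here is a rough sketch in plain Python. The helper name `kron` and the list-of-rows representation are assumptions made for illustration. NumPy users could call `np.kron(A, B)` instead.

```python
def kron(A, B):
    """Kronecker product of A (m x n) and B (p x q), both lists of rows.

    Row (i*p + r) of the result is row r of B scaled by each entry of
    row i of A in turn, so the ij block of the result is a_ij * B.
    """
    p, q = len(B), len(B[0])
    return [
        [a * B[r][c] for a in row_a for c in range(q)]
        for row_a in A
        for r in range(p)
    ]


A = [[1, 2],
     [3, 4],
     [5, 6],
     [7, 8]]
B = [[10, 0, 0],
     [0, 20, 0],
     [0, 0, 30]]

K = kron(A, B)
print(len(K), len(K[0]))  # 12 6
for row in K:
    print(row)
```

The result has 4 × 3 = 12 rows and 2 × 3 = 6 columns, matching the mp × nq rule.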

Now we write down Taylor’s theorem. Let f be a real-valued function of n variables. Then

$f(x + h) = f(x) + f^{(1)}(x) h + \sum_{k=2}^\infty \frac{1}{k!} \left[\stackrel{k-1}{\otimes}h^T \right] f^{(k)}(x) h$

where f⁽⁰⁾ = f and for k > 0,

$f^{(k)}(x) = \left. \frac{ \partial \, \mathrm{vec}\, f^{(k-1)} }{\partial h^T} \right|_x$

The symbol ⊗ with a number on top means to take the Kronecker product of the argument with itself that many times.
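To see the formula in action, here is a small sketch for n = 2 using a cubic polynomial, so the series stops after the k = 3 term and the expansion should be exact. The function f(x, y) = x²y + y³, the hand-computed derivative arrays, and the helper names are all my own illustrative choices, not something from Magnus and Neudecker. The rows of `f3` follow the column-stacking order of vec applied to the Hessian.

```python
def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    p, q = len(B), len(B[0])
    return [[a * B[r][c] for a in row_a for c in range(q)]
            for row_a in A for r in range(p)]


def matmul(A, B):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def f(x, y):
    # Illustrative cubic; its derivatives of order 4 and higher vanish.
    return x**2 * y + y**3


def taylor_cubic(x, y, a, b):
    """Evaluate the series for f at (x, y) with increment h = (a, b)."""
    h = [[a], [b]]    # n x 1 column vector
    hT = [[a, b]]     # 1 x n row vector
    f1 = [[2 * x * y, x**2 + 3 * y**2]]        # 1 x 2: gradient as a row
    f2 = [[2 * y, 2 * x], [2 * x, 6 * y]]      # 2 x 2: Hessian
    # 4 x 2: rows are the partials of vec(Hessian) = (H11, H21, H12, H22),
    # columns are derivatives with respect to x and y.
    f3 = [[0, 2], [2, 0], [2, 0], [0, 6]]
    term1 = matmul(f1, h)[0][0]
    term2 = matmul(matmul(hT, f2), h)[0][0] / 2             # k = 2
    term3 = matmul(matmul(kron(hT, hT), f3), h)[0][0] / 6   # k = 3
    return f(x, y) + term1 + term2 + term3


x, y, a, b = 1.0, 2.0, 0.5, -0.25
print(taylor_cubic(x, y, a, b))
print(f(x + a, y + b))
```

The two printed values should agree up to floating point rounding. Notice that the k = 3 term only ever multiplies a 1 × 4 row, a 4 × 2 matrix, and a 2 × 1 column. No three-index array appears anywhere.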

Source: Matrix Differential Calculus by Magnus and Neudecker.

Related post: What is a tensor?

One thought on “Higher order Taylor series in several variables”

saigatx

6 July 2020 at 08:19

I think there is a mistake: Writing a multivariate polynomial P in terms of Kronecker powers would be, according to your notation:
P(x) = a_0 · x^0 + a_1 · x^1 + a_2 · x^2 + … + a_p · x^p
where x^0=1, x^1 = x, x^2 = x×x (× is the Kronecker product) and so on and · the usual Euclidean inner product. But taking a closer look at x^2 reveals that:
[x1 x2] × [x1 x2] = [x1² x2x1 x1x2 x2²]
So x2x1 appears twice! Thus, writing the polynomial in terms of Kronecker powers would require multiplication by an elimination matrix to transform [x1² x2x1 x1x2 x2²] into [x1² x2x1 x2²].
