Most sources that present Taylor’s theorem for functions of several variables stop at second order terms. One reason is that one or two terms are good enough for many applications. But the bigger reason is that things get more complicated when you include higher order terms.
The kth order term in Taylor’s theorem is a rank k tensor. You can think of rank 0 tensors as numbers, rank 1 tensors as vectors, and rank 2 tensors as matrices. Then we run out of familiar objects. A rank 3 tensor requires you to start thinking in terms of tensors rather than more elementary terms.
There is a way to express Taylor’s theorem using only vectors and matrices. Maybe not the most elegant approach, depending on one’s taste, but it avoids any handwaving talk of a tensor being a “higher dimensional box of numbers” and such.
There’s a small price to pay. You have to introduce two new but simple ideas: the vec operator and the Kronecker product.
The vec operator takes an m × n matrix A and returns an mn × 1 matrix v, i.e. a column vector, by stacking the columns of A. The first m elements of v are the first column of A, the next m elements of v are the second column of A, etc.
The Kronecker product of an m × n matrix A and a p × q matrix B is an mp × nq matrix K = A ⊗ B. You can think of K as a block partitioned matrix. The ij block of K is a_ij B. In other words, to form K, take each element of A and replace it with its product with the matrix B.
A couple examples will make this clear.
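Here is a minimal sketch of both operations in Python with NumPy. The matrices A and B and the helper name vec are made up for illustration. np.kron is NumPy's Kronecker product, and order="F" tells reshape to read the entries column by column.

```python
import numpy as np

def vec(A):
    """Stack the columns of A into a single column vector (hypothetical helper)."""
    A = np.asarray(A)
    # order="F" walks down each column in turn, which is exactly column stacking
    return A.reshape((-1, 1), order="F")

# Illustrative matrices, not taken from the original post
A = np.array([[1, 2, 3],
              [4, 5, 6]])   # m x n = 2 x 3
B = np.array([[0, 1],
              [1, 0]])      # p x q = 2 x 2

v = vec(A)          # mn x 1 = 6 x 1
K = np.kron(A, B)   # mp x nq = 4 x 6; block (i, j) is A[i, j] * B

print(v.ravel())
print(K)
```

Here v is the column (1, 4, 2, 5, 3, 6)′, and K is a 4 × 6 matrix whose lower right 2 × 2 block is 6B.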
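Now we write down Taylor’s theorem. Let f be a real-valued function of n variables. Then

$$f(x + h) = f(x) + \sum_{k=1}^\infty \frac{1}{k!} \left[ \overset{k-1}{\otimes} h' \right] f^{(k)}(x) \, h$$

where $f^{(0)} = f$ and for k > 0,

$$f^{(k)}(x) = \left. \frac{\partial \operatorname{vec} f^{(k-1)}}{\partial x'} \right|_x$$

Here x and h are column vectors and a prime denotes transpose, so f^(1) is the gradient written as a row, f^(2) is the Hessian, and in general f^(k) is an n^(k-1) × n matrix.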
The symbol ⊗ with a number on top means to take the Kronecker product of the argument with itself that many times.
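As a sanity check, here is a small NumPy sketch that sums the series through third order for a cubic polynomial, where three terms should reproduce the function exactly. It assumes the kth term has the form (1/k!) [h′ ⊗ ... ⊗ h′] f^(k)(x) h with k − 1 copies of h′, and that f^(k) is the Jacobian of vec f^(k−1), so f^(1) is 1 × n, f^(2) is n × n, and f^(3) is n² × n. The test function and the helper names derivatives, kron_power, and taylor are made up for illustration, and the derivatives are worked out by hand.

```python
import numpy as np
from math import factorial

def f(x):
    # Illustrative cubic: f(x1, x2) = x1^2 x2 + x2^3
    return x[0]**2 * x[1] + x[1]**3

def derivatives(x):
    """Hand-computed f^(1), f^(2), f^(3) in the vec/Jacobian layout assumed above."""
    x1, x2 = x
    f1 = np.array([[2*x1*x2, x1**2 + 3*x2**2]])   # 1 x 2, gradient as a row
    f2 = np.array([[2*x2, 2*x1],
                   [2*x1, 6*x2]])                 # 2 x 2, Hessian
    dH_dx1 = np.array([[0.0, 2.0], [2.0, 0.0]])   # derivative of the Hessian in x1
    dH_dx2 = np.array([[2.0, 0.0], [0.0, 6.0]])   # derivative of the Hessian in x2
    # Column l of f^(3) is vec of dH/dx_l, giving a 4 x 2 matrix
    f3 = np.column_stack([dH_dx1.flatten(order="F"),
                          dH_dx2.flatten(order="F")])
    return [f1, f2, f3]

def kron_power(row, m):
    """Kronecker product of `row` with itself m times; the 1 x 1 matrix [1] when m = 0."""
    out = np.ones((1, 1))
    for _ in range(m):
        out = np.kron(out, row)
    return out

def taylor(x, h):
    h = np.asarray(h, dtype=float).reshape((-1, 1))   # column vector
    total = f(x)
    for k, fk in enumerate(derivatives(x), start=1):
        left = kron_power(h.T, k - 1)                 # 1 x n^(k-1)
        total += (left @ fk @ h).item() / factorial(k)
    return total

x = np.array([0.5, -1.0])
h = np.array([0.3, 0.2])
approx = taylor(x, h)
exact = f(x + h)
print(approx, exact)
```

Because f is a cubic, approx and exact should agree up to floating point rounding. For a general smooth function the same loop gives a third order approximation instead.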
Source: Matrix Differential Calculus by Magnus and Neudecker.
Related post: What is a tensor?