Let *u* be a real-valued function of *n* variables, and let *v* be a vector-valued function of *n* variables, i.e. a function taking *n* variables to a vector of size *n*. Then we have the following product rule:

D(*uv*) = *v* D*u* + *u* D*v.*

It looks strange that the first term on the right isn’t D*u* *v*.

The function *uv* is a function from *n* dimensions to *n* dimensions, so its derivative must be an *n* by *n* matrix. So the two terms on the right must be *n* by *n* matrices, and they are. But D*u* *v* is a 1 by 1 matrix, so it would not make sense on the right side.
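A quick numerical sanity check makes the shapes concrete. The functions *u* and *v* below are examples of my own choosing; the identity itself is checked by comparing a finite-difference Jacobian of *uv* against *v* D*u* + *u* D*v*.

```python
import numpy as np

def u(x):                       # scalar-valued: R^3 -> R
    return x[0] * x[1] + x[2] ** 2

def grad_u(x):                  # Du, the gradient, thought of as a 1-by-3 row
    return np.array([x[1], x[0], 2 * x[2]])

def v(x):                       # vector-valued: R^3 -> R^3
    return np.array([np.sin(x[0]), x[1] ** 2, x[0] * x[2]])

def jac_v(x):                   # Dv, the 3-by-3 Jacobian of v
    return np.array([
        [np.cos(x[0]), 0.0,        0.0],
        [0.0,          2 * x[1],   0.0],
        [x[2],         0.0,        x[0]],
    ])

def num_jacobian(f, x, h=1e-6): # central-difference Jacobian, column by column
    n = len(x)
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (f(x + e) - f(x - e)) / (2 * h)
    return J

x = np.array([0.7, -1.2, 0.4])
lhs = num_jacobian(lambda x: u(x) * v(x), x)            # D(uv), n by n
rhs = np.outer(v(x), grad_u(x)) + u(x) * jac_v(x)       # v Du + u Dv
print(np.allclose(lhs, rhs, atol=1e-6))                 # prints True
```

Note that the first term is built with `np.outer` (an *n* by *n* matrix) while the second is a plain scalar multiple of the Jacobian, matching the mixed nature of the formula.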

Here’s why the product rule above looks strange: the multiplication by *u* is a **scalar** product, not a matrix product. Sometimes you can think of real numbers as 1 by 1 matrices and everything works out just fine, but not here. The product *uv* doesn’t make sense if you think of the output of *u* as a 1 by 1 matrix. Neither does the product *u* D*v*.

If you think of *v* as an *n* by 1 matrix and D*u* as a 1 by *n* matrix, everything works. If you think of *v* and D*u* as vectors, then *v* D*u* is the **outer product** of the two vectors. You could think of D*u* as the gradient of *u*, but be sure you think of it horizontally, i.e. as a 1 by *n* matrix. And finally, D(*uv*) and D*v* are **Jacobian matrices**.

**Update**: As Harald points out in the comments, the usual product rule applies if you write the scalar-vector product *uv* as the matrix product *vu*, where now we *are* thinking of *u* as a 1 by 1 matrix! Now the product rule looks right:

D(*vu*) = D*v* *u* + *v* D*u*

but the product *vu* looks wrong because you always write scalars on the left. But here *u* isn’t a scalar!
