Let u be a real-valued function of n variables, and let v be a vector-valued function of n variables, a function from n variables to a vector of size n. Then we have the following product rule:
D(uv) = v Du + u Dv.
It looks strange that the first term on the right isn’t Du v.
The function uv is a function from n dimensions to n dimensions, so it’s derivative must be an n by n matrix. So the two terms on the right must be n by n matrices, and they are. But Du v is a 1 by 1 matrix, so it would not make sense on the right side.
Here’s why the product rule above looks strange: the multiplication by u is a scalar product, not a matrix product. Sometimes you can think of real numbers as 1 by 1 matrices and everything works out just fine, but not here. The product uv doesn’t make sense if you think of the output of u as a 1 by 1 matrix. Neither does the product u Dv.
If you think of v as an n by 1 matrix and Du as a 1 by n matrix, everything works. If you think of v and Du as vectors, then v Du is the outer product of the two vectors. You could think of Du as the gradient of u, but be sure you think of it horizontally, i.e. as a 1 by n matrix. And finally, D(uv) and Dv are Jacobian matrices.
Update: As Harald points out in the comments, the usual product rule applies if you write the scalar-vector product uv as the matrix product vu where now we are thinking of u as a 1 by 1 matrix! Now the product rule looks right
D(vu) = Dv u + v Du
but the product vu looks wrong because you always write scalars on the left. But here u isn’t a scalar!