Least squares solutions to over- or underdetermined systems

If often happens in applications that a linear system of equations Axb either does not have a solution or has infinitely many solutions. Applications often use least squares to create a problem that has a unique solution.

Overdetermined systems

Suppose the matrix A has dimensions m by n and the right hand side vector b has dimension m. Then the solution x, if it exists, has to have dimension n. If mn, i.e. we have more equations than unknowns, the system is overdetermined. The equations will be inconsistent in general. This is typical, for example, in linear regression.

In this case, you can use the least square criterion to determine a solution. Instead of demanding that Axb, we look for an x than makes the difference between Ax and b as small as possible, as measured by the 2-norm (root mean square). That is, we pick x to minimize

|| Ax - b||_2^2

meaning we solve the system as best we can, best as measured by the 2-norm.

Underdetermined systems

If the number of rows in the matrix A, i.e. the number of equations, is less than the number of columns, i.e. the number of unknowns, then the system is underdetermined. In general there were be infinitely many solutions. It’s also possible that there are no solutions because the equations are inconsistent.

In this case, we can use least squares to assure that a solution exists, and to decide which of the many possible solutions to choose. Here we want to find the x that minimizes

|| Ax - b ||_2^2 + ||x||_2^2

If there are values of x that satisfy Axb then this will chose the solution with least norm.

Least squares solutions and SVD

If the singular value decomposition (SVD) of a real matrix A is given by

A = U \Sigma V^T
and A has rank r, then the least squares solution to the system Axb is given explicitly by

x = \sum_{i=1}^r \frac{u_i^T b}{\sigma_i} v_i

where the u‘s are the columns of U and the v‘s are the columns of v.

Note that the denominators are not zero; the fact that A has rank r means that it has r positive singular values.

Furthermore, the least squares residual, i.e. the degree to which Ax differs from b, is given by

||Ax -b ||_2^2 = \sum_{i = r+1}^m (u_i^Tb)^2

Note that if the matrix A has rank m, then the least squares problem can be solved exactly, and the right side above is an empty sum.

See, for example, Golub and Van Loan’s classic book Matrix Computations.

More linear algebra posts

One thought on “Least squares solutions to over- or underdetermined systems

Comments are closed.