It often happens in applications that a linear system of equations *Ax* = *b* either does not have a solution or has infinitely many solutions. Applications often use **least squares** to create a problem that has a unique solution.

## Overdetermined systems

Suppose the matrix *A* has dimensions *m* by *n* and the right-hand side vector *b* has dimension *m*. Then the solution *x*, if it exists, has to have dimension *n*. If *m* > *n*, i.e. we have more equations than unknowns, the system is **overdetermined**. The equations will be inconsistent in general. This is typical, for example, in linear regression.

In this case, you can use the least squares criterion to determine a solution. Instead of demanding that *Ax* = *b*, we look for an *x* that makes the difference between *Ax* and *b* as small as possible, as measured by the 2-norm (root mean square). That is, we pick *x* to minimize

$$ \| Ax - b \|_2, $$

meaning we solve the system as best we can, best as measured by the 2-norm.
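As a quick numerical illustration (my own sketch, not from the post), NumPy's `np.linalg.lstsq` computes exactly this minimizer for an overdetermined system, here a small line-fitting problem with made-up data:

```python
import numpy as np

# Overdetermined system: 4 equations, 2 unknowns (m > n).
# Fit a line y = c0 + c1*t through four points in the least squares sense.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
b = np.array([0.1, 0.9, 2.1, 2.9])

# lstsq picks x to minimize the 2-norm of Ax - b
x, residuals, rank, sigma = np.linalg.lstsq(A, b, rcond=None)
print(x)                            # fitted intercept and slope
print(np.linalg.norm(A @ x - b))    # the minimized residual norm
```

No exact solution exists here, but the returned *x* makes the residual as small as any choice of coefficients can.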

## Underdetermined systems

If the number of rows in the matrix *A*, i.e. the number of equations, is less than the number of columns, i.e. the number of unknowns, then the system is **underdetermined**. In general there will be infinitely many solutions. It’s also possible that there are no solutions because the equations are inconsistent.

In this case, we can use least squares to assure that a solution exists, and to decide which of the many possible solutions to choose. Here we want the *x* of smallest norm among all *x* that minimize

$$ \| Ax - b \|_2. $$

If there are values of *x* that satisfy *Ax* = *b*, then this will choose the solution with least norm.
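A small check (my addition, with an invented 2×3 system): `np.linalg.lstsq` returns this minimum-norm solution, and adding any null-space vector of *A* yields another exact solution with strictly larger norm:

```python
import numpy as np

# Underdetermined system: 2 equations, 3 unknowns (m < n).
A = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
b = np.array([2.0, 3.0])

# lstsq returns the minimum-norm least squares solution
x, *_ = np.linalg.lstsq(A, b, rcond=None)

# [1, -1, 1] is in the null space of A, so x_other also solves Ax = b
x_other = x + np.array([1.0, -1.0, 1.0])
print(np.allclose(A @ x_other, b))                    # still an exact solution
print(np.linalg.norm(x) < np.linalg.norm(x_other))    # but with a larger norm
```

Because the minimum-norm solution is orthogonal to the null space of *A*, any null-space component you add can only increase the norm.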

## Least squares solutions and SVD

If the singular value decomposition (SVD) of a real matrix *A* is given by

$$ A = U \Sigma V^T $$

and *A* has rank *r*, then the least squares solution to the system *Ax* = *b* is given explicitly by

$$ x = \sum_{i=1}^r \frac{u_i^T b}{\sigma_i} v_i $$

where the *u*‘s are the columns of *U*, the *v*‘s are the columns of *V*, and the *σ*‘s are the singular values on the diagonal of Σ.

Note that the denominators are not zero; the fact that *A* has rank *r* means that it has *r* positive singular values.

Furthermore, the least squares residual, i.e. the degree to which *Ax* differs from *b*, is given by

$$ \| Ax - b \|_2^2 = \sum_{i=r+1}^m (u_i^T b)^2. $$

Note that if the matrix *A* has rank *m*, then the least squares problem can be solved exactly, and the right side above is an empty sum.
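The residual identity can also be checked directly (again my own sketch, using a full SVD so the columns *u*<sub>*r*+1</sub>, …, *u*<sub>*m*</sub> are available): the residual vector is the component of *b* outside the range of *A*, which is spanned by exactly those left singular vectors.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 6, 3
A = rng.standard_normal((m, n))   # rank r = 3 almost surely
b = rng.standard_normal(m)

U, s, Vt = np.linalg.svd(A)       # full SVD: U is m x m
r = int(np.sum(s > 1e-12))

x, *_ = np.linalg.lstsq(A, b, rcond=None)

# ||Ax - b||^2 equals the sum of (u_i^T b)^2 over the directions i > r
lhs = np.linalg.norm(A @ x - b) ** 2
rhs = sum((U[:, i] @ b) ** 2 for i in range(r, m))
print(np.isclose(lhs, rhs))  # True
```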

See, for example, Golub and Van Loan’s classic book *Matrix Computations*.
