This post will explain a connection between probability and geometry. Standard deviations for independent random variables add according to the Pythagorean theorem. Standard deviations for correlated random variables add like the law of cosines. This is because correlation is a cosine. Update: Here is a Spanish translation of this post.
Let’s start with two independent random variables X and Y. Their standard deviations add like the sides of a right triangle.
In the diagram above, “sd” stands for standard deviation, the square root of variance. The diagram is correct because the formula
Var(X+Y) = Var(X) + Var(Y)
is analogous to the Pythagorean theorem
c² = a² + b².
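As a quick sanity check, here is a minimal sketch that verifies the independent case exactly. The choice of a fair die for X and a fair coin for Y is only an illustration, and the helper `variance` is a name made up for this example.

```python
from fractions import Fraction
from itertools import product
import math

def variance(outcomes):
    """Exact variance of a list of equally likely outcomes."""
    n = len(outcomes)
    mean = Fraction(sum(outcomes), n)
    return sum((Fraction(v) - mean) ** 2 for v in outcomes) / n

die = [1, 2, 3, 4, 5, 6]   # X: a fair die (illustrative choice)
coin = [0, 1]              # Y: a fair coin, independent of X

# Independence means every (x, y) pair is equally likely.
sums = [x + y for x, y in product(die, coin)]

var_x, var_y, var_sum = variance(die), variance(coin), variance(sums)
assert var_sum == var_x + var_y  # exact, thanks to Fraction arithmetic

# The standard deviations behave like the sides of a right triangle.
sd_x, sd_y, sd_sum = (math.sqrt(v) for v in (var_x, var_y, var_sum))
print(sd_sum, math.hypot(sd_x, sd_y))
```

The two printed numbers should agree up to floating point rounding: sd(X+Y) is the hypotenuse of the triangle with legs sd(X) and sd(Y).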
Next we drop the assumption of independence. If X and Y are correlated, the variance formula is analogous to the law of cosines.
The generalization of the previous variance formula to allow for dependent variables is
Var(X+Y) = Var(X) + Var(Y) + 2 Cov(X, Y).
Here Cov(X,Y) is the covariance of X and Y. The analogous law of cosines is
c² = a² + b² - 2ab cos(θ).
If we let a, b, and c be the standard deviations of X, Y, and X+Y respectively, then cos(θ) = -ρ where ρ is the correlation between X and Y defined by
ρ(X, Y) = Cov(X, Y) / (sd(X) sd(Y)).
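The correspondence can be checked numerically. The sketch below uses a small made-up data set and treats its population variance and covariance as Var and Cov; the helpers `mean` and `cov` are names chosen for this example, not part of any library.

```python
import math

def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    """Population covariance of two equal-length samples."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

# Made-up data, chosen only to be visibly correlated.
x = [1.0, 2.0, 4.0, 7.0, 8.0]
y = [2.0, 1.0, 5.0, 6.0, 9.0]
s = [p + q for p, q in zip(x, y)]  # samples of X + Y

# a, b, c are sd(X), sd(Y), sd(X+Y); cov(v, v) is the variance of v.
a, b, c = (math.sqrt(cov(v, v)) for v in (x, y, s))
rho = cov(x, y) / (a * b)
theta = math.acos(-rho)  # the angle with cos(theta) = -rho

lhs = c ** 2
rhs = a ** 2 + b ** 2 - 2 * a * b * math.cos(theta)
print(rho, theta, lhs, rhs)
```

Here `lhs` is Var(X+Y) and `rhs` is the law of cosines expression, so the two should match up to rounding.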
When θ is π/2 (i.e. 90°) the random variables are uncorrelated, as independent variables always are. When θ is larger, the variables are positively correlated. When θ is smaller, the variables are negatively correlated. Said another way, as θ increases from 0 to π (i.e. 180°), the correlation increases from -1 to 1.
The analogy above is a little awkward, however, because of the minus sign. Let’s rephrase it in terms of the supplementary angle φ = π – θ. Slide the line representing the standard deviation of Y over to the left end of the horizontal line representing the standard deviation of X.
Now cos(φ) = ρ = correlation(X, Y).
When φ is small, the two line segments are pointing in nearly the same direction and the random variables are highly positively correlated. If φ is large, near π, the two line segments are pointing in nearly opposite directions and the random variables are highly negatively correlated.
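To make the two angle conventions concrete, here is a small sketch that converts a correlation into both angles. The function name `angles` is made up for this illustration.

```python
import math

def angles(rho):
    """Return (theta, phi) in radians for a correlation rho in [-1, 1]."""
    phi = math.acos(rho)      # cos(phi) = rho
    theta = math.pi - phi     # supplementary angle, so cos(theta) = -rho
    return theta, phi

for rho in (-1.0, -0.5, 0.0, 0.5, 1.0):
    theta, phi = angles(rho)
    print(f"rho={rho:+.1f}  theta={math.degrees(theta):6.1f}°  phi={math.degrees(phi):6.1f}°")
```

As ρ runs from -1 to 1, θ grows from 0 to π while φ shrinks from π to 0.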
Now let’s see the source of the connection between correlation and the law of cosines. Suppose X and Y have mean 0. Think of X and Y as members of an inner product space where the inner product <X, Y> is E(XY). Then
<X+Y, X+Y> = <X, X> + <Y, Y> + 2<X, Y>.
In an inner product space,
<X, Y> = ||X|| ||Y|| cos φ
where the norm ||X|| of a vector is the square root of the vector’s inner product with itself. The above equation defines the angle φ between two vectors. You could justify this definition by seeing that it agrees with ordinary plane geometry in the plane containing the three vectors X, Y, and X+Y. Since X and Y have mean 0, ||X|| = sd(X), ||Y|| = sd(Y), and <X, Y> = Cov(X, Y), so cos φ is exactly the correlation ρ.
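The inner product view can be sketched with finite samples: center each sample so it has mean 0, and use the average of products as a stand-in for E(XY). The data and the helper names `center`, `inner`, and `norm` are assumptions made for this illustration.

```python
import math

def center(v):
    """Subtract the mean so the sample has mean 0."""
    m = sum(v) / len(v)
    return [a - m for a in v]

def inner(u, v):
    """Sample analogue of E(UV) for mean-zero u and v."""
    return sum(a * b for a, b in zip(u, v)) / len(u)

def norm(u):
    return math.sqrt(inner(u, u))

# Made-up data, centered so the inner product is a covariance.
x = center([1.0, 2.0, 4.0, 7.0, 8.0])
y = center([2.0, 1.0, 5.0, 6.0, 9.0])
s = [a + b for a, b in zip(x, y)]

cos_phi = inner(x, y) / (norm(x) * norm(y))  # cosine of the angle between x and y
expansion = inner(x, x) + inner(y, y) + 2 * inner(x, y)
print(cos_phi, inner(s, s), expansion)
```

The value `cos_phi` is the sample correlation of the two data sets, and `inner(s, s)` should equal `expansion` up to rounding.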