Correlation

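The problem with covariance is that it is unit dependent: rescaling X or Y (say, from metres to centimetres) rescales the covariance too, so its magnitude alone says little about how strongly the two variables are related. What we really want is a way to normalize the covariance into a fixed, unit-free range. The correlation does exactly that.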

correlation

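The correlation coefficient of X and Y, denoted by Corr(X,Y), $\rho_{X,Y}$, or just $\rho$, is defined by:

$$\rho_{X,Y} = E\left[\frac{X-\mu_X}{\sigma_X}\cdot\frac{Y-\mu_Y}{\sigma_Y}\right] = \frac{\mathrm{Cov}(X,Y)}{\sigma_X\sigma_Y}$$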
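As a rough numerical illustration of the definition, the sketch below evaluates both forms (the mean of the product of standardized values, and the covariance divided by the product of standard deviations) on a small data set and confirms they agree. The function name `correlation` and the height/weight numbers are made up for this example, and the sample is treated as the whole population (dividing by n), which is an assumption chosen to mirror the expectation in the definition.

```python
import math

def correlation(xs, ys):
    """Return both forms of the correlation for two equal-length samples.

    Hypothetical helper: treats the samples as the full population (divides by n).
    """
    n = len(xs)
    mu_x = sum(xs) / n
    mu_y = sum(ys) / n
    # Population standard deviations.
    sigma_x = math.sqrt(sum((x - mu_x) ** 2 for x in xs) / n)
    sigma_y = math.sqrt(sum((y - mu_y) ** 2 for y in ys) / n)
    # Form 1: mean of the product of the standardized variables.
    standardized = sum(
        ((x - mu_x) / sigma_x) * ((y - mu_y) / sigma_y) for x, y in zip(xs, ys)
    ) / n
    # Form 2: covariance divided by the product of the standard deviations.
    cov = sum((x - mu_x) * (y - mu_y) for x, y in zip(xs, ys)) / n
    return standardized, cov / (sigma_x * sigma_y)

# Made-up illustrative data.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
weights_kg = [52.0, 58.0, 69.0, 77.0, 90.0]

form1, form2 = correlation(heights_cm, weights_kg)
print(form1, form2)  # the two values agree up to floating-point rounding
```

Both forms are algebraically identical because the constants 1/σX and 1/σY can be pulled out of the expectation.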

Properties of Correlation

Proposition

For any two rvs X,Y:

  1. Corr(X,Y)=Corr(Y,X)
  2. Corr(X,X)=1
  3. (Scale Invariance Property): If a,b,c,d are constants and ac>0:
Corr(aX+b,cY+d)=Corr(X,Y)
  4. $\mathrm{Corr}(X,Y) \in [-1, 1]$
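These properties can be spot-checked numerically. The sketch below uses a hypothetical helper `corr` and invented temperature and sales figures. It converts Celsius to Fahrenheit (a = 1.8, b = 32) and currency units to cents (c = 100, d = 0), so ac > 0 and the correlation should be unchanged. It also shows what happens when ac < 0, a case the proposition excludes: the sign flips.

```python
import math

def corr(xs, ys):
    """Population correlation of two equal-length samples (hypothetical helper)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)

# Invented data for illustration only.
celsius = [12.0, 15.0, 19.0, 23.0, 30.0]
sales = [20.0, 24.0, 33.0, 35.0, 51.0]

# Affine changes of units with ac > 0 (a = 1.8, c = 100).
fahrenheit = [1.8 * c + 32.0 for c in celsius]
sales_cents = [100.0 * s for s in sales]

base = corr(celsius, sales)
rescaled = corr(fahrenheit, sales_cents)      # property 3: same as base
flipped = corr([-c for c in celsius], sales)  # ac < 0: sign is reversed
symmetric = corr(sales, celsius)              # property 1: same as base
self_corr = corr(celsius, celsius)            # property 2: equals 1

print(base, rescaled, flipped, symmetric, self_corr)
```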

Proof

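  1. See Covariance's properties (1).
  2. $\mathrm{Corr}(X,X) = \frac{\mathrm{Cov}(X,X)}{\sigma_X^2} = \frac{\mathrm{Var}(X)}{\sigma_X^2} = \frac{\sigma_X^2}{\sigma_X^2} = 1$
  3. By the covariance properties, $\mathrm{Cov}(aX+b, cY+d) = ac\,\mathrm{Cov}(X,Y)$, while $\sigma_{aX+b} = |a|\sigma_X$ and $\sigma_{cY+d} = |c|\sigma_Y$. Hence $\mathrm{Corr}(aX+b, cY+d) = \frac{ac}{|a||c|}\mathrm{Corr}(X,Y)$, which equals $\mathrm{Corr}(X,Y)$ when $ac > 0$.
  4. By the Cauchy-Schwarz inequality, $|\mathrm{Cov}(X,Y)| \le \sigma_X\sigma_Y$, so $|\rho| \le 1$. The maximum 1 is attained when both variables are the same (this is (2)), and the minimum $-1$ when they are perfectly negatively related, e.g. $Y = -X$. Uncorrelated variables give 0, not $-1$.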

Proposition

  1. If X,Y are independent then ρ=0. However, ρ=0 does not imply independence.
  2. $\rho = 1$ or $\rho = -1$ iff $Y = aX + b$ for some numbers $a, b$ where $a \neq 0$.
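To see the second statement in action, the sketch below builds Y as an exact linear function of X, once with a positive slope and once with a negative slope. The helper `corr` and the data are hypothetical, and the result is only equal to ±1 up to floating-point rounding.

```python
import math

def corr(xs, ys):
    """Population correlation of two equal-length samples (hypothetical helper)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)

xs = [1.0, 2.0, 4.0, 7.0, 11.0]           # arbitrary values
increasing = [3.0 * x + 2.0 for x in xs]   # Y = aX + b with a > 0
decreasing = [-0.5 * x + 10.0 for x in xs] # Y = aX + b with a < 0

rho_up = corr(xs, increasing)      # close to +1
rho_down = corr(xs, decreasing)    # close to -1
print(rho_up, rho_down)
```

The sign of ρ matches the sign of the slope a, while the size of a and the value of b have no effect.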

When ρ=0 we specifically say that X,Y are uncorrelated.

Uncorrelated iff expectation and multiplication can be swapped.

Two rvs X,Y are said to be uncorrelated iff $E(XY) = \mu_X\mu_Y$.

Proof

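Uncorrelated means $\rho = 0$, which (since $\sigma_X, \sigma_Y > 0$) holds exactly when Cov(X,Y)=0. Applying the covariance shortcut $\mathrm{Cov}(X,Y) = E(XY) - \mu_X\mu_Y$: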

$$\rho = 0 \iff \mathrm{Cov}(X,Y) = 0 \iff E(XY) - \mu_X\mu_Y = 0 \iff E(XY) = \mu_X\mu_Y$$
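A standard small example (chosen here for illustration, not taken from the text above) shows this criterion at work and also why ρ = 0 does not imply independence. Take X uniform on {-1, 0, 1} and Y = X². The sketch below computes the expectations exactly with `fractions.Fraction`; the names `pmf` and `expect` are hypothetical.

```python
from fractions import Fraction

# Joint pmf of (X, Y), where X is uniform on {-1, 0, 1} and Y = X**2.
pmf = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

def expect(f):
    """Exact expectation of f(X, Y) under the joint pmf above."""
    return sum(p * f(x, y) for (x, y), p in pmf.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
e_xy = expect(lambda x, y: x * y)

# Covariance shortcut: Cov(X, Y) = E(XY) - mu_X * mu_Y.
cov = e_xy - mu_x * mu_y

# Independence would require P(X=1, Y=1) == P(X=1) * P(Y=1).
p_joint = pmf[(1, 1)]
p_x1 = sum(p for (x, _), p in pmf.items() if x == 1)
p_y1 = sum(p for (_, y), p in pmf.items() if y == 1)

print("E(XY) =", e_xy, " mu_X * mu_Y =", mu_x * mu_y, " Cov =", cov)
print("P(X=1, Y=1) =", p_joint, " P(X=1) P(Y=1) =", p_x1 * p_y1)
```

Here E(XY) equals μXμY, so X and Y are uncorrelated, yet the joint probability differs from the product of the marginals, so they are not independent: Y is completely determined by X.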