15 Measures of Association
Some Statistics that you may need:
Correlation
A correlation exists between two variables when one of them is related to the other in some way.
A scatter plot is a graph in which the paired sample data are plotted with a horizontal x-axis and a vertical y-axis.
Linear Correlation means the points in our plot fall roughly along a straight line.
IMAGE
The Linear Correlation Coefficient, or Pearson Product Moment Correlation Coefficient, uses the variation in our data to come up with $r$, a number between $-1$ and $1$ which tells us how strong the linear correlation is:
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{(n - 1)\, s_x s_y}$$
where the sum ∑ is over all ordered pairs (x,y), n is the number of pairs, sx is the standard deviation of the x values, sy is the standard deviation of the y values, and $\bar{x}$ and $\bar{y}$ are the sample means of x and y respectively.
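As a rough illustration of the coefficient described above, here is a minimal Python sketch that computes r from sample means and sample standard deviations (dividing by n - 1). The function name `pearson_r` and the data values are made up for this example and are not part of the original notes.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Linear correlation coefficient from sample means and standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    if s_x == 0 or s_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    # Sum of products of deviations, scaled by (n - 1) * s_x * s_y.
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return total / ((n - 1) * s_x * s_y)

# Hypothetical paired data, chosen only for illustration.
hours = [1, 2, 3, 4, 5]
scores = [2, 4, 5, 4, 5]
r = pearson_r(hours, scores)
print(round(r, 4))
```

A value of r near 1 or -1 suggests the points lie close to a line, while a value near 0 suggests little linear relationship.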
IMAGE: four example scatter plots, labelled Strong Positive, Weak Positive, Weak Negative, and Strong Negative.

Covariance
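Covariance is another way of measuring how two variables vary together. Unlike the correlation coefficient, it is not divided by the standard deviations, so its size depends on the units of x and y. It is defined by:
$$\operatorname{cov}(x, y) = \frac{\sum (x - \bar{x})(y - \bar{y})}{n - 1}$$
where the sum ∑ is over all ordered pairs (x,y), n is the number of pairs, and $\bar{x}$ and $\bar{y}$ are the sample means of x and y.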
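The sketch below computes a sample covariance, assuming the usual n - 1 denominator, and shows that rescaling one variable rescales the covariance by the same factor. The function name `sample_covariance` and the height and weight values are hypothetical, chosen only for illustration.

```python
def sample_covariance(xs, ys):
    """Sample covariance: sum of (x - x_bar)(y - y_bar), divided by n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data: heights in metres and weights in kilograms.
heights_m = [1.5, 1.6, 1.7, 1.8, 1.9]
weights_kg = [50, 58, 62, 70, 80]

# The same heights expressed in centimetres.
heights_cm = [100 * h for h in heights_m]

cov_m = sample_covariance(heights_m, weights_kg)
cov_cm = sample_covariance(heights_cm, weights_kg)
print(cov_m, cov_cm)  # the second value is about 100 times the first
```

Because the covariance changes with the units, its magnitude alone does not say how strong a relationship is; the correlation coefficient removes this dependence on scale.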
Fun fact! When you pair a set of data with itself, you get the following identities:
$$\operatorname{cov}(x, x) = s_x^2 \qquad \text{and} \qquad r = 1$$
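A quick numerical check, assuming the identities meant here are that the covariance of a data set with itself equals its sample variance and that its correlation with itself equals 1. The helper `sample_covariance` and the data values are hypothetical; `statistics.variance` is the standard-library sample variance.

```python
from math import sqrt
from statistics import variance

def sample_covariance(xs, ys):
    """Sample covariance with an n - 1 denominator."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)

# Hypothetical data, paired with itself.
data = [2, 4, 4, 5, 7]

cov_xx = sample_covariance(data, data)   # covariance of the data with itself
s_x = sqrt(variance(data))               # sample standard deviation
r_xx = cov_xx / (s_x * s_x)              # r written as cov / (s_x * s_y), with y = x

print(cov_xx, variance(data), r_xx)
```

Up to floating-point rounding, the first two printed values agree and the third is 1.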
Correlation vs. Causation

In the following chart, we can see a clear correlation between the number of people who drowned by falling into a swimming pool in the USA and the number of films that Nicolas Cage appeared in that year.
CAUSATION: Do you think Nicolas Cage causes drowning?

Or: Does smoking cause lung cancer? It’s harder than you think to prove.
Remember: CORRELATION does not imply CAUSATION.
Correlation: the term for a relationship between two variables.