Chapter 5: Bivariate Data
5-5 Correlation and Causation
It is important to understand the nature of the relationship between the
independent variable x and the dependent variable y. Listed are some
possibilities that one should consider.
There may be a direct cause-and-effect relationship between the two
variables. For example, x may cause y. To illustrate, lack of water
causes dehydration, intensive exercise causes thirst, heat causes ice
cream to melt, etc.
There may be a reverse cause-and-effect relationship between the two
variables. For example, y causes x. To illustrate, one may believe that
bad grades are caused by absences, but one should also consider that
bad grades may cause absences.
The relationship may be due to chance or coincidence. To illustrate, one
may find a relationship between the number of suicides and the increase
in the sale of bagels. With no plausible mechanism linking these two
variables, any association between them is most reasonably attributed
to chance.
The relationship may be due to confounding. That is, the relationship
may be due to the interrelationships between several variables.
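
The effect of confounding can be seen in a small simulation. The Python
sketch below is only an illustration under stated assumptions: the third
variable z, the coefficients 2 and 3, the noise terms, and the sample size
are hypothetical choices, not data from this chapter. Here z drives both x
and y, while x and y have no direct effect on each other.

```python
import random


def pearson(a, b):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def residuals(v, z):
    """What is left of v after removing a least-squares line fitted on z."""
    n = len(v)
    mean_v = sum(v) / n
    mean_z = sum(z) / n
    slope = (sum((zi - mean_z) * (vi - mean_v) for zi, vi in zip(z, v))
             / sum((zi - mean_z) ** 2 for zi in z))
    return [vi - mean_v - slope * (zi - mean_z) for vi, zi in zip(v, z)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000                # hypothetical sample size

# Hypothetical confounder z; x and y each depend on z plus independent noise.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [2 * zi + rng.gauss(0, 1) for zi in z]
y = [3 * zi + rng.gauss(0, 1) for zi in z]

# Raw correlation between x and y (the population value is 6/sqrt(50), about 0.85).
r_xy = pearson(x, y)

# Correlation between x and y once the linear effect of z is removed from both.
r_xy_given_z = pearson(residuals(x, z), residuals(y, z))

print("correlation of x and y:", round(r_xy, 3))
print("correlation after removing z:", round(r_xy_given_z, 3))
```

Under these assumptions the raw correlation between x and y is high, yet it
falls to nearly zero once the influence of z is removed from both variables.
The association between x and y comes entirely from their shared dependence
on z.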
The next illustration shows the distinction between association and
causation: a large correlation (negative or positive) does not imply
causation. Suppose that a moderately high linear correlation is
observed between the weekly sales of hot chocolate and the number of
weekly skiing accidents during the skiing months in the USA. One can
reasonably conclude that hot chocolate sales could not cause skiers to
have accidents while skiing, and that more skiing accidents could not
cause an increase in sales of hot chocolate. Since neither variable
causes the other, what could explain such a relationship? The apparent
linear relationship between the two variables may be caused by a third
variable. In this case, the values of both variables may be due to the
weather conditions during the skiing season.
That is, the weather conditions may be causing an increase or a decrease in
both the number of weekly skiing accidents and the weekly hot chocolate
sales during the winter months. Thus, although the correlation between the
two variables is moderately high, the correlation is not causing the




