First, we load a few R packages.

Most of the material in this lecture was taken from here and from this wonderful dsbook by Rafael Irizarry.


In this lecture, we will discuss one of the most important aspects of analyzing data: being skeptical of results. We discuss some reasons why skepticism is warranted and give examples.

“Correlation is not causation” is perhaps the most important lesson one learns in a statistics class. So far, we have described tools useful for quantifying associations between variables. However, we must be careful not to overinterpret these associations.

There are many reasons that a variable \(X\) can be correlated with a variable \(Y\) without either being a cause of the other. Here we examine four common ways that data can be misinterpreted.

  1. Spurious correlation
  2. Outliers
  3. Reversing cause and effect
  4. Confounders

Next, we will discuss in detail what each of these is and give an example of each.
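
As a quick preview of why an association need not be causal, here is a minimal sketch that is not part of the original lecture material. It simulates two variables, `x` and `y`, that never influence each other but both depend on a third variable `z`. The variable names, the sample size, and the seed are illustrative assumptions.

```r
# Illustrative simulation (hypothetical data, base R only)
set.seed(1)                     # arbitrary seed, for reproducibility
n <- 1000                       # arbitrary sample size

z <- rnorm(n)                   # shared driver of both variables
x <- z + rnorm(n)               # x depends on z, never on y
y <- z + rnorm(n)               # y depends on z, never on x

# x and y are clearly correlated even though neither causes the other;
# under this setup the population correlation is 0.5
r_xy <- cor(x, y)

# After removing the part of each variable explained by z,
# the remaining association is close to zero
r_adjusted <- cor(resid(lm(x ~ z)), resid(lm(y ~ z)))

print(c(raw = r_xy, adjusted = r_adjusted))
```

The raw correlation alone cannot distinguish this situation from one in which `x` actually drives `y`. Here the association mostly vanishes once `z` is accounted for, but only because we simulated the data and know that `z` exists.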