Another question
Why doesn't correlation imply causation?
It is simple, but I am getting very muddled! I.e., why doesn't the data prove just that?
There's actually a Wikipedia article on this. It's a very famous statistical concept.
To summarise it though, the Wikipedia article presents several counter-examples that disprove the claim that "correlation must imply causation". It also suggests reasons why there may be no causation between the two variables you're modelling, even when they are correlated.
If you can't be bothered to read the entire Wikipedia page, this is perhaps the example that I feel illustrates the point best:
As ice cream sales increase, the rate of drowning deaths increases sharply.
Therefore, ice cream consumption causes drowning.
This example fails to recognize the importance of time of year and temperature to ice cream sales. Ice cream is sold during the hot summer months at a much greater rate than during colder times, and it is during these hot summer months that people are more likely to engage in activities involving water, such as swimming. The increased drowning deaths are simply caused by more exposure to water-based activities, not ice cream. The stated conclusion is false.
In short, the two variables here seemed highly correlated, but not because one caused the other. Rather, there was a hidden third variable (the hot weather) that was causing both to increase.
The Wikipedia article shows that, in general, a hidden third variable is not the only reason why correlation doesn't imply causation. But it seems to be the easiest one to understand, in my opinion.
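If it helps to see the hidden-variable effect in numbers, here is a rough simulation sketch in Python. Everything in it is made up for illustration: the variable names, the coefficients and the noise levels are my own assumptions, not real sales or drowning data. Temperature drives both ice cream sales and drownings, and ice cream never appears in the formula for drownings. The last step, correlating what is left after subtracting a straight-line fit on temperature, is usually called a partial correlation; it is just one simple way of "holding temperature fixed".

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def residuals(ys, xs):
    """What is left of ys after subtracting a least-squares line fitted on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical daily temperatures in degrees Celsius (the hidden third variable).
temperature = [rng.uniform(0, 35) for _ in range(5000)]

# Both quantities depend on temperature plus independent noise.
# Note that drownings are generated without any reference to ice_cream.
ice_cream = [10 * t + rng.gauss(0, 40) for t in temperature]
drownings = [0.3 * t + rng.gauss(0, 2) for t in temperature]

# Raw correlation between the two effects of temperature.
raw_corr = pearson(ice_cream, drownings)

# Correlation once the temperature trend is removed from both series.
partial_corr = pearson(residuals(ice_cream, temperature),
                       residuals(drownings, temperature))

print(f"raw correlation:     {raw_corr:.2f}")
print(f"partial correlation: {partial_corr:.2f}")
```

With these made-up settings the raw correlation should come out strongly positive, while the partial correlation should sit close to zero. So the data alone "shows" a link between ice cream and drowning even though, by construction, neither one affects the other.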