In a time series plot, irregular fluctuations can make possible underlying trends difficult to describe. Recall that irregular fluctuations are essentially non-systematic effects that we cannot attribute to trend, seasonality, or anything else that follows a regular pattern.
You learn in this course how to use smoothing to help minimise these effects. The technique you use involves moving averages. The goal is to average the values on both sides (left and right) of a certain point in time. This 'minimises' the impact of the fluctuations by 'spreading out' their effects across neighbouring values.
Each average that we find belongs to the point in time that sits in the middle of all the points involved. This is why averaging an even number of points creates a problem: the new time will be halfway between two points on the original time series plot. Ideally, when attempting to account for fluctuations, we would like to not change our measures of time! Thus, we introduce the technique of centring.
We look at the case of four-point moving averages here. This is similar to five-point moving averages in that the first two and last two points in time will not have a four-point centred moving average. (Two-point moving averages are similar to three-point moving averages in the same way: the first and last points in time have no centred moving average.)
To show an example of how this works, suppose this is my data:
\begin{align*}
\textbf{Time} |&| \textbf{Value}\\
1|&|20\\
2|&|22\\
3|&|24\\
4|&|25\\
5|&|23\\
6|&|26\\
7|&|28\\
8|&|26\\
9|&|29\\
10|&|27\\
11|&|28\\
12|&|30\\
13|&|27\\
14|&|29\\
15|&|28
\end{align*}
Our (ordinary) four-point moving averages will give us values corresponding to times between our original values. For each gap, we average the two values to its left and the two values to its right. (Or above and below in our case, since our data is listed going downwards.)
Carefully note that an (ordinary) four-point moving average does not exist between times 1 and 2! This is because going to the left (i.e. upwards), we'd need times 1 and 0, and we don't have a time 0 in this example!
- Between times 2 and 3: Use times from 1 to 4. \( \frac{20+22+24+25}{4}=22.75 \)
- Between times 3 and 4: Use times from 2 to 5. \( \frac{22+24+25+23}{4}=23.50 \)
- Between times 4 and 5: Use times from 3 to 6. \( \frac{24+25+23+26}{4}=24.50 \)
- Between times 5 and 6: Use times from 4 to 7. \( \frac{25+23+26+28}{4}=25.50 \)
- Between times 6 and 7: Use times from 5 to 8. \( \frac{23+26+28+26}{4}=25.75 \)
- Between times 7 and 8: Use times from 6 to 9. \( \frac{26+28+26+29}{4}=27.25\)
- Between times 8 and 9: Use times from 7 to 10. \( \frac{28+26+29+27}{4}=27.50 \)
- Between times 9 and 10: Use times from 8 to 11. \( \frac{26+29+27+28}{4} = 27.50 \)
- Between times 10 and 11: Use times from 9 to 12. \( \frac{29+27+28+30}{4} = 28.50 \)
- Between times 11 and 12: Use times from 10 to 13. \( \frac{27+28+30+27}{4}=28.00\)
- Between times 12 and 13: Use times from 11 to 14. \( \frac{28+30+27+29}{4} = 28.50\)
- Between times 13 and 14: Use times from 12 to 15. \( \frac{30+27+29+28}{4} = 28.50\)
In a similar way, there is no four-point moving average between times 14 and 15. Going to the right, we'd need times 15 and 16, and time 16 does not appear here.
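If you'd like to check these by computer, here is a minimal Python sketch of the same calculation. The function name `four_point_moving_averages` and the list `values` are just names chosen for this illustration; they are not part of the course material.

```python
# Values at times 1 to 15, copied from the data in this example.
values = [20, 22, 24, 25, 23, 26, 28, 26, 29, 27, 28, 30, 27, 29, 28]


def four_point_moving_averages(data):
    """Return the ordinary four-point moving averages of `data`.

    Entry i averages data[i], ..., data[i + 3]. With times starting at 1,
    that average sits between times i + 2 and i + 3.
    """
    return [sum(data[i:i + 4]) / 4 for i in range(len(data) - 3)]


averages = four_point_moving_averages(values)
for i, avg in enumerate(averages):
    print(f"Between times {i + 2} and {i + 3}: {avg:.2f}")
```

Notice that 15 data points only give 12 averages: the gaps between times 1 and 2 and between times 14 and 15 are skipped automatically, because there are not four values to average there.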
Then, for our centred moving averages, we only need to average the two (ordinary) moving averages on either side of each point: the one just before it and the one just after it. After we compute those, we are done!
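It can help to see this step in symbols. Writing \( y_t \) for the value at time \( t \) and \( \text{CMA}_t \) for the centred moving average at time \( t \) (this notation is assumed here for illustration, it is not from the course), the centring step works out to:
\begin{align*}
\text{CMA}_t &= \frac{1}{2}\left(\frac{y_{t-2}+y_{t-1}+y_t+y_{t+1}}{4}+\frac{y_{t-1}+y_t+y_{t+1}+y_{t+2}}{4}\right)\\
&= \frac{y_{t-2}+2y_{t-1}+2y_t+2y_{t+1}+y_{t+2}}{8}
\end{align*}
So each centred value is really a weighted average of five neighbouring points, with the two outermost points counting half as much as the others. For time 3 in our data, that is \( \frac{20+2(22)+2(24)+2(25)+23}{8}=23.125 \).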
\begin{align*}
\textbf{Time}|&| \textbf{Centred moving average}\\
3 |&| \frac{22.75+23.50}{2} = 23.125\\
4 |&| \frac{23.50+24.50}{2} = 24.000\\
5 |&| \frac{24.50+25.50}{2} = 25.000\\
6 |&| \frac{25.50+25.75}{2} = 25.625\\
7 |&| \frac{25.75+27.25}{2} = 26.500\\
8 |&| \frac{27.25+27.50}{2} = 27.375\\
9 |&| \frac{27.50+27.50}{2} = 27.500\\
10 |&| \frac{27.50+28.50}{2} = 28.000\\
11 |&| \frac{28.50+28.00}{2} = 28.250\\
12 |&| \frac{28.00+28.50}{2} = 28.250\\
13 |&| \frac{28.50+28.50}{2} = 28.500
\end{align*}
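As a way of checking the whole process, here is a short Python sketch that goes from the raw data straight to the centred values. The function name `centred_four_point_moving_averages` is made up for this illustration, and the code assumes the times are simply 1, 2, 3, and so on.

```python
# Values at times 1 to 15, copied from the data in this example.
values = [20, 22, 24, 25, 23, 26, 28, 26, 29, 27, 28, 30, 27, 29, 28]


def centred_four_point_moving_averages(data):
    """Return a dict mapping each time (starting at 1) to its centred
    four-point moving average. Times without one are left out."""
    # Step 1: ordinary four-point moving averages.
    # ordinary[i] sits between times i + 2 and i + 3.
    ordinary = [sum(data[i:i + 4]) / 4 for i in range(len(data) - 3)]
    # Step 2: ordinary[i] and ordinary[i + 1] lie on either side of
    # time i + 3, so their average is the centred value at that time.
    return {
        i + 3: (ordinary[i] + ordinary[i + 1]) / 2
        for i in range(len(ordinary) - 1)
    }


centred = centred_four_point_moving_averages(values)
for time, cma in centred.items():
    print(f"Time {time}: {cma:.3f}")
```

The result only has entries for times 3 to 13, which matches the earlier remark that the first two and last two points do not get a centred moving average.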
Just to remind you again what the end result looks like, here's a plot of the original data with the 'smoothed' data. It definitely looks 'smoother' to me.