Time Series Modelling: Trends, Seasons, and Stationarity

DataScience Deep Dive
4 min read · Dec 24, 2020


There are two major reasons behind non-stationarity of a time series:

Trend: Varying mean over time

Seasonality: Certain variations at specific time-frames

Weather patterns are one of the simplest examples of time series seasonality. Other examples with weaker seasonal patterns are stock market prices and economic indicators such as GDP.

When selecting forecasting methods for a time series model, I found it very useful to think in terms of systematic and non-systematic components, as described in https://machinelearningmastery.com/decompose-time-series-data-trend-seasonality/ :

  • Systematic: Components of the time series that have consistency or recurrence and can be described and modeled.
  • Non-Systematic: Components of the time series that cannot be directly modeled.

A given time series is thought to consist of three systematic components (level, trend, and seasonality) and one non-systematic component called noise.

These components are defined as follows:

  • Level: The average value in the series.
  • Trend: The increasing or decreasing value in the series.
  • Seasonality: The repeating short-term cycle in the series.
  • Noise: The random variation in the series. This is the residual of the original time series after the seasonal and trend series are removed.

Although the stationarity assumption is required in several time series modeling techniques, few practical time series are stationary.

There are three key ways to eliminate trends:

  • Taking the log transformation
  • Subtracting the rolling mean
  • Differencing

In my recent project, I found differencing the most effective way of getting the data ‘as close as possible’ to stationary when there are strong seasonal trends. Other techniques I have used are subtracting the rolling mean and subtracting the exponentially weighted rolling mean.
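As a rough sketch of the first two techniques (the toy series below stands in for the project data; all names are illustrative):

```python
import numpy as np
import pandas as pd

# Toy monthly series with an upward trend (illustrative data, not the Zillow set)
idx = pd.date_range("2015-01-01", periods=48, freq="MS")
ts = pd.Series(
    np.linspace(100, 200, 48) + np.random.default_rng(0).normal(0, 2, 48),
    index=idx,
)

# 1. Log transformation: dampens a trend whose variance grows with the level
ts_log = np.log(ts)

# 2. Subtracting the rolling mean (a 12-month window for monthly data)
rolling_mean = ts.rolling(window=12).mean()
ts_detrended = ts - rolling_mean  # the first 11 values are NaN
```

Note that the rolling-mean approach leaves `window - 1` leading NaN values, which need to be dropped before any stationarity test.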

In the differencing technique, we take the difference between an observation at a particular time instant and the observation at the previous instant, i.e. periods=1 (denoting a 1-period lag). Details on the pandas .diff() method can be found here.

Here is an example of differencing and how to validate the stationarity of your time series data with the Dickey-Fuller test (from statsmodels.tsa.stattools import adfuller):

Step 1: Difference the time series df:

[Figure: recent project — differencing the Zillow US house price monthly time series]
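A minimal sketch of Step 1, using a toy monthly frame in place of the Zillow data file (df and the value column are illustrative names):

```python
import pandas as pd

# Illustrative monthly price series standing in for the Zillow data
df = pd.DataFrame(
    {"value": [100.0, 102.0, 105.0, 103.0, 108.0, 112.0]},
    index=pd.date_range("2020-01-01", periods=6, freq="MS"),
)

# First-order differencing: subtract the previous observation (1-period lag)
df["diff_1"] = df["value"].diff(periods=1)

# The first entry is NaN since there is no prior value to subtract;
# drop it before running a stationarity test
differenced = df["diff_1"].dropna()
```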
Step 2: Visualise the results of the removal of the rolling mean
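Step 2 might look something like this (a plotting sketch with a toy series; in the project the original notebook's series would be used instead):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Illustrative monthly series with a yearly cycle plus a mild trend
idx = pd.date_range("2015-01-01", periods=48, freq="MS")
ts = pd.Series(
    np.sin(np.arange(48) * 2 * np.pi / 12) + 0.02 * np.arange(48),
    index=idx,
)

rolling_mean = ts.rolling(window=12).mean()
detrended = ts - rolling_mean

fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(ts, label="original")
ax.plot(rolling_mean, label="12-month rolling mean")
ax.plot(detrended, label="original minus rolling mean")
ax.legend()
fig.savefig("rolling_mean_removal.png")
```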
Step 3: Visualise seasonal decomposition

Even though we aim to get ‘as close as possible’ to stationary in order to best fit our model, seasonal patterns themselves are also very useful for predicting entry and exit points, such as when to buy and sell in Zillow’s US house price data:

Step 4: Look for seasonality pattern

To achieve successful decomposition, it is important to choose between the additive and multiplicative models, which requires analyzing the series. For example, does the magnitude of the seasonality increase when the time series increases?

I tried all four trend-removal variations, and since this data has a very strong seasonal pattern, the Dickey-Fuller test still did not confirm stationarity:

Step 5: Run adfuller() test

In general, a p-value below 0.05 means you can reject the null hypothesis that a unit root is present. You can also compare the calculated ADF statistic against the tabulated critical values. Here is further detail on the augmented Dickey-Fuller test: https://www.machinelearningplus.com/time-series/augmented-dickey-fuller-test/

For the full code above, view it directly in my GitHub project, starting from line [231]: https://github.com/Sue-Mir/Module4_Project_Time_Series_Modelling/blob/master/time-series/TimeSeries.ipynb
