Types of Autocorrelation

Last Updated : 17 May, 2021

Autocorrelation:

Autocorrelation measures the degree of similarity between a given time series and a lagged version of itself over successive time periods.

Autocorrelation Function:

Suppose we have a time series {X_t} which has the following mean:

$\mu = E\left [ X_t \right ]$

and the autocovariance function

$\gamma_x\left ( t+k, t \right ) = Cov\left ( X_{t+k}, X_t \right ) = E\left [ \left ( X_{t+k}- \mu \right )\left ( X_t -\mu \right )\right ]$

For a stationary series the autocovariance depends only on the lag k, so setting t = 0 gives

$\gamma_x\left (k, 0\right ) = \gamma_x\left (k\right )$

and the autocorrelation function is defined as:

$\rho_x\left ( k \right ) = \frac{\gamma_x\left (k\right )}{\gamma_x\left (0\right )} = Corr\left ( X_{t+k}, X_t \right )$

The value of autocorrelation ranges from -1 for perfectly negative autocorrelation to 1 for perfectly positive autocorrelation. A value close to 0 indicates little or no autocorrelation.
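The definition above can be turned into a sample estimate by replacing the expectations with averages. The following is a minimal sketch, not a reference implementation: the function name `sample_acf` and the alternating test series are illustrative assumptions, and the autocovariance is divided by n at every lag (one common convention).

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation rho(k) = gamma(k) / gamma(0) for k = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = x - x.mean()                    # deviations from the sample mean
    gamma0 = np.dot(d, d) / n           # sample autocovariance at lag 0
    acf = np.empty(max_lag + 1)
    for k in range(max_lag + 1):
        # sample autocovariance at lag k: pairs (X_{t+k}, X_t)
        gamma_k = np.dot(d[k:], d[:n - k]) / n
        acf[k] = gamma_k / gamma0
    return acf

# Illustrative series that flips sign at every step (hypothetical data):
# neighbouring values always disagree, values two steps apart always agree.
alternating = np.array([1.0, -1.0] * 50)
rho = sample_acf(alternating, max_lag=2)
print(rho)
```

For this series the estimate is 1 at lag 0, close to -1 at lag 1, and close to 1 at lag 2, which matches the interpretation of the range given above.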

Positive Autocorrelation:

Positive autocorrelation occurs when an error of a given sign tends to be followed, k periods later, by an error of the same sign.

$Corr\left ( X_{t+k}, X_t \right ) > 0 \text{ for } k > 0$

Below is the graph of the dataset that represents positive autocorrelation at lag=1:

Negative Autocorrelation:

Negative autocorrelation occurs when an error of a given sign tends to be followed, k periods later, by an error of the opposite sign.

$Corr\left ( X_{t+k}, X_t \right ) < 0 \text{ for } k > 0$
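Both signs of autocorrelation can be reproduced with simulated data. The sketch below assumes a first-order autoregressive process with standard normal noise; the helper `simulate_ar1`, the coefficient values 0.8 and -0.8, and the seed are illustrative choices, not part of the original datasets.

```python
import numpy as np
import pandas as pd

def simulate_ar1(phi, n=500, seed=0):
    """Simulate X_t = phi * X_{t-1} + e_t with standard normal noise e_t."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n)
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + noise[t]
    return pd.Series(x)

# A positive coefficient keeps consecutive values on the same side of the mean;
# a negative coefficient makes them tend to flip sides.
positive_series = simulate_ar1(phi=0.8)
negative_series = simulate_ar1(phi=-0.8)

r_pos = positive_series.autocorr(lag=1)
r_neg = negative_series.autocorr(lag=1)
print(f"lag-1 autocorrelation, phi = 0.8:  {r_pos:.2f}")
print(f"lag-1 autocorrelation, phi = -0.8: {r_neg:.2f}")
```

Passing either series to `pd.plotting.lag_plot(series, lag=1)` should show points clustered along an upward-sloping line in the positive case and a downward-sloping line in the negative case.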

Below is the graph of time series that represents negative autocorrelation at lag=1:

Strong Autocorrelation

We can conclude that the data have strong autocorrelation if the autocorrelation plot looks similar to the following plots:

The autocorrelation plot starts with a very high autocorrelation at lag 1 and declines slowly until it becomes negative, after which the negative autocorrelation grows in magnitude. This type of pattern indicates strong autocorrelation, which can be helpful in predicting future trends.

The next step would be to estimate the parameters for the autoregressive model:

$Y_{i} = A_0 + A_1*Y_{i-1} + E_{i}$

The randomness assumption for least-squares fitting applies to the residuals of the model. That is, even though the original data exhibit non-randomness, the residuals after fitting $Y_i$ against $Y_{i-1}$ should be random.
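The fit and the residual check can be sketched as follows. This is an illustration under stated assumptions: the data are simulated from an autoregressive model with hypothetical parameters `a0_true = 2.0` and `a1_true = 0.9`, the line is fitted with `np.polyfit`, and the helper `lag1_corr` is introduced only for this example.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000
a0_true, a1_true = 2.0, 0.9           # hypothetical parameters for the simulation

# Simulate Y_i = A_0 + A_1 * Y_{i-1} + E_i, starting at the process mean.
y = np.empty(n)
y[0] = a0_true / (1 - a1_true)
for i in range(1, n):
    y[i] = a0_true + a1_true * y[i - 1] + rng.standard_normal()

# Least-squares fit of Y_i against Y_{i-1}; polyfit returns [slope, intercept].
a1_hat, a0_hat = np.polyfit(y[:-1], y[1:], deg=1)
residuals = y[1:] - (a0_hat + a1_hat * y[:-1])

def lag1_corr(x):
    """Correlation between a series and itself shifted by one step."""
    return np.corrcoef(x[:-1], x[1:])[0, 1]

print(f"estimated A_0 = {a0_hat:.2f}, A_1 = {a1_hat:.2f}")
print(f"lag-1 autocorrelation of the data:      {lag1_corr(y):.2f}")
print(f"lag-1 autocorrelation of the residuals: {lag1_corr(residuals):.2f}")
```

The data are strongly autocorrelated, while the residuals of the fitted model should show a lag-1 autocorrelation close to zero, which is what the randomness assumption requires.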

Weak Autocorrelation

We can conclude that the data have weak autocorrelation if the lag plot at lag = 1 looks similar to the following plot:

Lag plot at lag = 1

The above plot shows that there is some autocorrelation at lag = 1: if there were no autocorrelation, the plot would look similar to this lag plot of random values at lag = 1.

The following conclusions can be drawn from the above plot:

The data come from an underlying autoregressive model with moderate positive/negative autocorrelation.
There are very few outliers.
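A numerical check can complement the visual comparison between a weakly autocorrelated series and a random one. The sketch below relies on a common rule of thumb, which is an assumption added here: for purely random data, the sample autocorrelation falls within about ±1.96/√n roughly 95% of the time. The coefficient 0.4, the sample size, and the seed are illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 500

# Random values with no autocorrelation by construction.
random_series = pd.Series(rng.random(n))

# Weakly autocorrelated series: each value keeps 40% of the previous one.
noise = rng.standard_normal(n)
weak = np.zeros(n)
for i in range(1, n):
    weak[i] = 0.4 * weak[i - 1] + noise[i]
weak_series = pd.Series(weak)

# Approximate 95% band for the autocorrelation of random data (rule of thumb).
band = 1.96 / np.sqrt(n)

r_random = random_series.autocorr(lag=1)
r_weak = weak_series.autocorr(lag=1)
print(f"approximate 95% band: +/-{band:.3f}")
print(f"random series, lag 1: {r_random:.3f}")
print(f"weak series, lag 1:   {r_weak:.3f}")
```

A lag-1 value well outside the band, as expected for the weak series, suggests genuine autocorrelation even when the lag plot looks only mildly structured.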

The weakly autocorrelated data above can be described by an autoregressive model of the form

$Y_{i+1} = A_0 + A_1*Y_{i} + E_{i+1}$

where $E_{i+1}$ is the random error. Setting $Y_i = 0$ shows that $A_0$ is the intercept of the fitted line:

$Y_{i+1} = A_0 + E_{i+1}$

Because $Y_{i+1}$ and $Y_i$ are the axes of the lag plot, the parameters can be estimated by fitting a straight line directly to the lag plot.

Implementation

In this implementation, we will look at how to generate autocorrelation plots and lag plots. We will use the flicker dataset and some randomly generated samples for this purpose.

python3

# Necessary imports
from numpy.random import random_sample
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf

# Load the flicker dataset (one value per line) as a Series.
# Avoid sep='\n', which newer pandas versions reject; read a single
# unnamed column and select it instead.
weak_corr_series = pd.read_csv('flicker.csv', header=None).iloc[:, 0]

# Generate autocorrelation plot at different lags
# with a given level of significance.
plot_acf(weak_corr_series, alpha=0.05)

# Generate lag plot for a particular lag value, on its own figure
plt.figure()
pd.plotting.lag_plot(weak_corr_series, lag=1)

# Generate 200 random numbers and plot lag plot and autocorrelation plot for that
random_series = pd.Series(random_sample(200))
plt.figure()
pd.plotting.lag_plot(random_series, lag=1)
plot_acf(random_series, alpha=0.05)

# Display all figures when run as a script
plt.show()