
Stationarity of Time Series Data using R

Last Updated : 27 Mar, 2024

This article discusses stationarity of time series data: its characteristics and types, why it matters, and how to test for it using R.

Stationarity of Time Series Data

Stationarity is an important concept when working with time series data. A stationary time series is one whose statistical properties, such as mean, variance, and autocorrelation, remain constant over time. Stationary data is easier to model and analyze. The R Programming Language offers several ways to check for stationarity, which are covered later in this article.

Characteristics of a Stationary Time Series

  1. Constant Mean: The mean of the time series data should be consistent across time. This means that there is no upward or downward trend.
  2. Constant Variance: The variance (or standard deviation) of the time series data should be consistent throughout time periods. The spread of data points shouldn’t vary.
  3. Constant Autocorrelation Structure: The autocorrelation function (ACF) and partial autocorrelation function (PACF) should not change over time. The relationship between observations should depend only on the lag separating them, not on when they occur.

Types of Stationarity

There are two main types of stationarity: strict stationarity and weak stationarity.

  1. Strict Stationarity:
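    • A time series is said to be strictly stationary if the joint distribution of the observations at any set of time points (t_1, ..., t_k) is the same as the joint distribution at the shifted time points (t_1 + h, ..., t_k + h), for every shift h.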
    • In other words, the entire probability distribution of the data does not change over time.
    • Achieving strict stationarity is often too restrictive for real-world data, and it may be a challenging assumption to meet.
  2. Weak Stationarity (Second-Order Stationarity):
    • Weak stationarity is a more practical and commonly used form of stationarity.
    • A time series is weakly stationary if it satisfies three conditions:
      a. Constant mean: The mean of the time series is constant over time.
      b. Constant variance: The variance of the time series is constant over time.
      c. Constant autocovariance: The covariance between observations at any two points in time depends only on the time lag between them.
    • Mathematically, for a weakly stationary time series {X_t}:
      a. Mean: E(X_t) = μ for all t
      b. Variance: Var(X_t) = σ^2 for all t
      c. Autocovariance: Cov(X_t, X_(t+h)) = γ(h) for all t, where γ(h) is a function that depends only on the time lag h.
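The three conditions can be checked empirically on simulated data. The sketch below is only an illustration, assuming an AR(1) process with coefficient 0.6 (a choice made here, not taken from the article): it compares the sample mean, variance, and lag-1 autocovariance on the two halves of the series. The helper name gamma_hat is hypothetical.

```r
# Simulate a stationary AR(1) process: X_t = phi * X_(t-1) + e_t
# (phi = 0.6 and n = 2000 are assumptions chosen for illustration)
set.seed(1)
phi <- 0.6
x <- arima.sim(model = list(ar = phi), n = 2000)

# Split the sample into two halves
half   <- length(x) / 2
first  <- as.numeric(x[1:half])
second <- as.numeric(x[(half + 1):length(x)])

# Sample autocovariance at lag h (index h + 1 because lag 0 is stored first)
gamma_hat <- function(v, h) {
  acf(v, lag.max = h, type = "covariance", plot = FALSE)$acf[h + 1]
}

# Under weak stationarity the two rows should be close to each other
moments <- rbind(
  first_half  = c(mean = mean(first),  var = var(first),  gamma1 = gamma_hat(first, 1)),
  second_half = c(mean = mean(second), var = var(second), gamma1 = gamma_hat(second, 1))
)
print(round(moments, 3))
```

Small differences between the rows are expected from sampling noise; large, systematic differences would point to non-stationarity.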

Why Stationarity Matters

  1. Simplifies Analysis:
    • Stationary time series are easier to analyze because the statistical properties do not change over time. This simplifies the application of various statistical methods and models.
  2. Modeling:
    • Many time series models, such as ARIMA (AutoRegressive Integrated Moving Average), assume stationarity. Modeling becomes more reliable when this assumption holds.
  3. Statistical Inference:
    • Stationarity is often a prerequisite for statistical inference techniques like hypothesis testing and confidence interval estimation.

Testing for Stationarity

Several methods, both informal and formal, are available to check for stationarity. Common ones include:

  1. Visual Inspection:
    • Plotting the time series data and visually inspecting for trends and seasonality.
  2. Summary Statistics:
    • Comparing the mean and variance of different segments of the time series.
  3. Augmented Dickey-Fuller (ADF) Test:
    • A statistical test that assesses the presence of a unit root in a univariate time series, which is a key indicator of non-stationarity.
  4. Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test:
    • A test that complements the ADF test by reversing the hypotheses: its null hypothesis is that the series is stationary, either around a constant level or around a deterministic trend, so rejecting it indicates non-stationarity.
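As a rough illustration of the summary-statistics approach, the sketch below splits the built-in lh series into three equal segments and compares their means and variances. The choice of three segments and the variable names are assumptions made for this example.

```r
# lh ships with base R's datasets package (48 observations)
x <- as.numeric(lh)

# Label three equal-length segments of 16 points each
segment <- rep(1:3, each = length(x) / 3)

# Mean and variance per segment; similar values are consistent with stationarity
seg_stats <- data.frame(
  segment  = 1:3,
  mean     = as.numeric(tapply(x, segment, mean)),
  variance = as.numeric(tapply(x, segment, var))
)
print(seg_stats)
```

This is an informal check only: with 16 points per segment the estimates are noisy, so it is best used alongside a plot and the formal tests.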

Achieving Stationarity

If your time series is found to be non-stationary, you may need to apply transformations or differencing to make it stationary. Common techniques include:

  • Logarithmic transformation.
  • Differencing: Subtracting the value of the previous time point from the current one.
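Both techniques can be applied in one or two lines of base R. The sketch below uses the built-in AirPassengers series as a stand-in for a trending series; that dataset is an assumption of this example and is not used elsewhere in the article.

```r
# AirPassengers (monthly totals, 1949-1960) ships with base R's datasets package
y <- AirPassengers

# Logarithmic transformation: dampens variance that grows with the level
log_y <- log(y)

# Differencing: subtract the previous value from the current one
diff_log_y <- diff(log_y)

# Differencing shortens the series by one observation
c(original = length(y), differenced = length(diff_log_y))

# Compare the raw and transformed series visually
par(mfrow = c(2, 1), mar = c(4, 4, 2, 1))
plot(y, main = "Original series")
plot(diff_log_y, main = "Differenced log series")
```

Whether the transformed series is actually stationary still needs to be checked, for example with the tests described above.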

Remember that achieving stationarity is not always possible or necessary for every time series, and the appropriate approach depends on the specific characteristics of your data.

Methods for checking stationarity in R include:

  1. Visual Inspection
  2. Summary Statistics
  3. Rolling Statistics
  4. Augmented Dickey-Fuller (ADF) Test
  5. Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test

Stationarity of Time Series Data using R

Load and check the time series data

R




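# Install the packages once if they are missing, then load them
# install.packages(c("tseries", "zoo"))
library(tseries)  # provides adf.test() and kpss.test()
library(zoo)      # provides rolling-window helpers
  
# Load the lh dataset: luteinizing hormone in 48 blood samples
# taken at 10-minute intervals (commonly treated as stationary)
data("lh")
  
# 1. Visual Inspection
plot(lh, main = "Luteinizing Hormone in Blood Samples")
  
# 2. Summary Statistics
summary(lh)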


Output:

[Figure: Stationarity of Time Series Data (line plot of the lh series)]

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.40    2.00    2.30    2.40    2.75    3.50
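Rolling statistics are listed among the methods but are not shown in the code above. A minimal sketch in base R follows, assuming a window of 12 observations (an arbitrary choice for illustration); it uses embed() to build the windows rather than a dedicated rolling-window package.

```r
x <- as.numeric(lh)
k <- 12  # window width (an assumption chosen for illustration)

# embed() lays out every run of k consecutive values as one row
windows <- embed(x, k)

roll_mean <- rowMeans(windows)
roll_sd   <- apply(windows, 1, sd)

# Plot the series with its rolling mean and rolling standard deviation;
# each rolling value is drawn at the last time point of its window
plot(x, type = "l", col = "grey40", ylim = c(0, max(x)),
     xlab = "Sample", ylab = "lh",
     main = "Rolling mean and standard deviation (window = 12)")
lines(k:length(x), roll_mean, col = "steelblue", lwd = 2)
lines(k:length(x), roll_sd, col = "firebrick", lwd = 2, lty = 2)
legend("topleft", legend = c("Series", "Rolling mean", "Rolling SD"),
       col = c("grey40", "steelblue", "firebrick"), lty = c(1, 1, 2), bty = "n")
```

For a stationary series both rolling lines should stay roughly flat; a drifting rolling mean or a widening rolling standard deviation suggests non-stationarity.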

Check Stationarity of Time Series Data

1. Augmented Dickey-Fuller (ADF) Test

R




# 4. Augmented Dickey-Fuller (ADF) Test
adf_test <- adf.test(lh)
print("ADF Test:")
print(adf_test)


Output:

[1] "ADF Test:"

Augmented Dickey-Fuller Test

data: lh
Dickey-Fuller = -3.558, Lag order = 3, p-value = 0.04624
alternative hypothesis: stationary

The Augmented Dickey-Fuller (ADF) test on the lh dataset gives a test statistic of -3.558 with a lag order of 3 and a p-value of 0.04624. The null hypothesis of the ADF test is that the series has a unit root, that is, that it is non-stationary. Because the p-value is below the 0.05 significance level, we reject the null hypothesis, which is evidence that the lh series is stationary. The p-value is only slightly below the threshold, so the result is best read alongside the other checks rather than as conclusive on its own.
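To see how the ADF test behaves on a series that does have a unit root, the sketch below applies adf.test() from tseries to a simulated random walk and to its first difference. The simulated data, seed, and variable names are assumptions made for this example.

```r
library(tseries)

set.seed(42)
random_walk <- cumsum(rnorm(300))  # has a unit root by construction
increments  <- diff(random_walk)   # first difference recovers the white noise

# suppressWarnings() hides the tseries note that a p-value falls
# outside the range of its lookup table
adf_rw   <- suppressWarnings(adf.test(random_walk))
adf_diff <- suppressWarnings(adf.test(increments))

# A large p-value means the unit-root null is not rejected
c(random_walk = adf_rw$p.value, differenced = adf_diff$p.value)
```

The differenced series should give a small p-value, while the random walk will usually, though not always, give a large one.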

2. Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test

R




# 5. Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test
kpss_test <- kpss.test(lh)
print("KPSS Test:")
print(kpss_test)


Output:

[1] "KPSS Test:"

KPSS Test for Level Stationarity

data: lh
KPSS Level = 0.29382, Truncation lag parameter = 3, p-value = 0.1

The KPSS (Kwiatkowski-Phillips-Schmidt-Shin) test on the lh dataset gives a KPSS Level statistic of 0.29382 with a truncation lag parameter of 3 and a reported p-value of 0.1. In kpss.test(), p-values are interpolated from a table that covers 0.01 to 0.1, so a printed value of 0.1 means the p-value is at least 0.1. The null hypothesis here is level stationarity, so a p-value above the significance level (commonly 0.05) means we fail to reject it: the test finds no evidence against level stationarity. The ADF and KPSS tests have opposite null hypotheses (unit root versus stationarity), so when both point the same way, as they do here, confidence in the overall assessment increases.
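Because the two tests have opposite null hypotheses, their results are easiest to read together. The helper below is a hypothetical sketch (interpret_tests is not part of tseries or any other package) that turns the two p-values into a single reading at a chosen significance level.

```r
# Hypothetical helper: combine ADF and KPSS p-values into one reading.
# ADF null: unit root (non-stationary). KPSS null: stationary.
interpret_tests <- function(adf_p, kpss_p, alpha = 0.05) {
  adf_rejects  <- adf_p  < alpha  # evidence against a unit root
  kpss_rejects <- kpss_p < alpha  # evidence against stationarity
  if (adf_rejects && !kpss_rejects) {
    "both tests consistent with stationarity"
  } else if (!adf_rejects && kpss_rejects) {
    "both tests consistent with non-stationarity"
  } else {
    "tests disagree or are inconclusive"
  }
}

# p-values reported above for the lh dataset
interpret_tests(adf_p = 0.04624, kpss_p = 0.1)
```

When the tests disagree, plots, rolling statistics, or a differenced or detrended version of the series can help decide how to proceed.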

Stationary vs Non-Stationary Time Series Data

R




# Generate time vector
t <- 1:300
  
# Generate stationary time series
set.seed(123)
y_stationary <- rnorm(length(t), mean = 0, sd = 1)
y_stationary <- y_stationary / max(y_stationary)
  
# Generate non-stationary time series with a trend
set.seed(456)
y_trend <- cumsum(rnorm(length(t), mean = 0, sd = 4)) + t / 100
y_trend <- y_trend / max(y_trend)
  
# Set up a more attractive layout for the plots
par(mfcol = c(2, 2), mar = c(4, 4, 2, 1))
  
# Plot stationary time series
plot(t, y_stationary, type = 'l', col = 'darkgreen', xlab = "Time (t)", ylab = "Y(t)",
     main = "Stationary Time Series", cex.main = 1.2, cex.lab = 1.1)
  
# ACF for stationary time series
acf_y_stationary <- acf(y_stationary, lag.max = length(y_stationary), plot = FALSE)
plot(acf_y_stationary, main = 'ACF - Stationary Time Series', cex.main = 1.2, 
     cex.lab = 1.1)
  
# Plot non-stationary time series with trend
plot(t, y_trend, type = 'l', col = 'steelblue', xlab = "Time (t)", ylab = "Y(t)",
     main = "Non-Stationary Time Series with Trend", cex.main = 1.2, cex.lab = 1.1)
  
# ACF for non-stationary time series with trend
acf_y_trend <- acf(y_trend, lag.max = length(y_trend), plot = FALSE)
plot(acf_y_trend, main = 'ACF - Non-Stationary Time Series with Trend', cex.main = 1.2, 
     cex.lab = 1.1)


Output:

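[Figure: Stationarity of Time Series Data. Four panels. Left column: "Stationary Time Series" and "ACF - Stationary Time Series". Right column: "Non-Stationary Time Series with Trend" and "ACF - Non-Stationary Time Series with Trend".]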

This code generates and plots two time series: one stationary (y_stationary) and one non-stationary with a trend (y_trend).

  1. Stationary Time Series:
    • The first plot depicts a stationary time series generated with random normal noise.
    • A stationary time series has a constant mean, variance, and autocorrelation structure over time.
  2. ACF (Autocorrelation Function) for Stationary Time Series:
    • The second plot shows the autocorrelation function (ACF) for the stationary time series.
    • In a stationary series, the ACF tends to decay rapidly, indicating a lack of long-term dependencies.
  3. Non-Stationary Time Series with Trend:
    • The third plot illustrates a non-stationary time series with a cumulative sum of random normal noise and a linear trend.
    • Non-stationary time series exhibit changing statistical properties, often with trends or seasonality.
  4. ACF for Non-Stationary Time Series with Trend:
    • The fourth plot displays the autocorrelation function (ACF) for the non-stationary time series with a trend.
    • In a non-stationary series, the ACF may show slower decay, reflecting the persistence of dependencies.

Stationarity is evident in the first pair of plots (the stationary series and its ACF), where the mean, variance, and autocorrelation structure remain roughly constant over time.

The second pair of plots (the non-stationary series and its ACF) shows the contrast: the series wanders away from its starting level and its ACF decays slowly. Comparing the two ACF plots shows how the ACF can serve as a diagnostic tool for telling stationary and non-stationary series apart, because its decay pattern reflects the series’ temporal dependencies.
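The same comparison can be made numerically rather than visually. The sketch below is an illustration on freshly simulated data (the seed, series length, and number of lags are assumptions): it prints the first ten autocorrelations of a white-noise series and of a random walk, together with the approximate 95% band that acf() draws for white noise.

```r
set.seed(123)
n <- 300
white_noise <- rnorm(n)
random_walk <- cumsum(rnorm(n))

# acf() stores lag 0 first, so drop it and keep lags 1 to 10
acf_wn <- acf(white_noise, lag.max = 10, plot = FALSE)$acf[-1]
acf_rw <- acf(random_walk, lag.max = 10, plot = FALSE)$acf[-1]

# Approximate 95% band for white noise: +/- 1.96 / sqrt(n)
band <- 1.96 / sqrt(n)

print(round(rbind(white_noise = acf_wn, random_walk = acf_rw), 2))
print(c(band = round(band, 3)))
```

Autocorrelations that stay well outside the band across many lags, as is typical for a random walk, are the numeric counterpart of the slow decay seen in the plot.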


