Stationarity of Time Series Data using R

Last Updated : 27 Mar, 2024

In this article, we discuss the stationarity of time series data: its characteristics and types, why it matters, and how to test for it using R.

Stationarity of Time Series Data

Stationarity is an important concept when working with time series data. A stationary time series is one whose statistical properties, such as mean, variance, and autocorrelation, remain constant over time. Stationary data is easier to model and analyze. You can check for stationarity in the R Programming Language using several common techniques: visual inspection, summary statistics, rolling statistics, the Augmented Dickey-Fuller (ADF) test, and the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test.

Characteristics of a Stationary Time Series

Constant Mean: The mean of the time series data should be consistent across time. This means that there is no upward or downward trend.
Constant Variance: The variance (or standard deviation) of the time series data should be consistent throughout time periods. The spread of data points shouldn’t vary.
Constant Autocorrelation Structure: The autocorrelation function (ACF) or partial autocorrelation function (PACF) should not change significantly with time. The relationship between data at different lags should be constant.

Types of Stationarity

There are two main types of stationarity: strict stationarity and weak stationarity.

Strict Stationarity:
- A time series is said to be strictly stationary if the joint distribution of the observations at any set of time points (X_t1, ..., X_tk) is the same as that of the shifted set (X_(t1+h), ..., X_(tk+h)) for every shift h.
- In other words, the entire probability distribution of the data does not change over time.
- Achieving strict stationarity is often too restrictive for real-world data, and it may be a challenging assumption to meet.
Weak Stationarity (Second-Order Stationarity):
- Weak stationarity is a more practical and commonly used form of stationarity.
- A time series is weakly stationary if it satisfies three conditions:
  a. Constant mean: The mean of the time series is constant over time.
  b. Constant variance: The variance of the time series is constant over time.
  c. Constant autocovariance: The covariance between observations at any two points in time depends only on the time lag between them.
- Mathematically, for a weakly stationary time series {X_t}:
  a. Mean: E(X_t) = μ for all t
  b. Variance: Var(X_t) = σ^2 for all t
  c. Autocovariance: Cov(X_t, X_(t+h)) = γ(h) for all t, where γ(h) is a function that depends only on the time lag h.
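The three conditions above can be checked on simulated data. The following is a minimal sketch, assuming an AR(1) process X_t = φ X_(t-1) + e_t with unit-variance noise; the coefficient 0.6, the seed, and the helper `moments()` are illustrative choices, not part of the original article. For such a process the theoretical values are μ = 0 and γ(h) = φ^|h| / (1 - φ^2).

```r
# Sketch: an AR(1) process with |phi| < 1 is weakly stationary.
set.seed(42)                       # arbitrary seed, for reproducibility only
phi <- 0.6                         # assumed AR coefficient (illustrative)
n   <- 5000
x   <- arima.sim(model = list(ar = phi), n = n)

# Theoretical moments for unit-variance innovations
mu_theory    <- 0
gamma_theory <- function(h) phi^abs(h) / (1 - phi^2)   # gamma(0) is the variance

# Hypothetical helper: sample mean, variance, and lag-1 autocovariance
moments <- function(v) {
  c(mean = mean(v),
    var  = var(v),
    cov1 = acf(v, lag.max = 1, type = "covariance", plot = FALSE)$acf[2])
}

# Compare the two halves of the series with the theoretical values
first_half  <- x[1:(n / 2)]
second_half <- x[(n / 2 + 1):n]

print(round(rbind(first_half  = moments(first_half),
                  second_half = moments(second_half),
                  theory      = c(mu_theory, gamma_theory(0), gamma_theory(1))), 3))
```

If the series is weakly stationary, the rows for the two halves should be close to each other and to the theoretical row, up to sampling error.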

Why Stationarity Matters

Simplifies Analysis:
- Stationary time series are easier to analyze because the statistical properties do not change over time. This simplifies the application of various statistical methods and models.
Modeling:
- Many time series models, such as ARMA, assume stationarity. ARIMA (AutoRegressive Integrated Moving Average) extends ARMA by differencing the series until it is stationary. Modeling becomes more reliable when this assumption holds.
Statistical Inference:
- Stationarity is often a prerequisite for statistical inference techniques like hypothesis testing and confidence interval estimation.

Testing for Stationarity

Several statistical tests are available to check for stationarity, and some common ones include:

Visual Inspection:
- Plotting the time series data and visually inspecting for trends and seasonality.
Summary Statistics:
- Comparing the mean and variance of different segments of the time series.
Augmented Dickey-Fuller (ADF) Test:
- A statistical test that assesses the presence of a unit root in a univariate time series, which is a key indicator of non-stationarity.
Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test:
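- Another test for stationarity that complements the ADF test. Its null hypothesis is that the series is stationary, either around a constant level or around a deterministic trend, so the roles of the null and alternative hypotheses are reversed relative to the ADF test.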
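The summary-statistics check listed above can be sketched in base R on the built-in lh series. This is an illustrative sketch: the split into two halves and the window width `k = 12` are assumptions chosen for the example, not values from the article. The `zoo` functions `rollmean()` and `rollapply()` offer similar rolling computations.

```r
# Sketch: segment and rolling summaries for the built-in lh series (48 observations).
x <- as.numeric(lh)
n <- length(x)

# Compare mean and variance across the two halves of the series
halves <- split(x, rep(c("first", "second"), each = n / 2))
segment_stats <- sapply(halves, function(v) c(mean = mean(v), var = var(v)))
print(round(segment_stats, 3))

# Rolling mean and standard deviation over a trailing window
k <- 12                               # assumed window width (illustrative)
windows      <- embed(x, k)           # each row holds k consecutive values
rolling_mean <- rowMeans(windows)
rolling_sd   <- apply(windows, 1, sd)

# Plot the series together with its rolling statistics
plot(x, type = "l", col = "grey40", ylim = range(0, x),
     xlab = "Time (t)", ylab = "lh",
     main = "lh with rolling mean and standard deviation")
lines(k:n, rolling_mean, col = "steelblue", lwd = 2)
lines(k:n, rolling_sd, col = "firebrick", lwd = 2, lty = 2)
legend("topleft", legend = c("series", "rolling mean", "rolling sd"),
       col = c("grey40", "steelblue", "firebrick"), lty = c(1, 1, 2), bty = "n")
```

For a stationary series, the segment statistics should be similar and the rolling curves should stay roughly flat.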

Achieving Stationarity

If your time series is found to be non-stationary, you may need to apply transformations or differencing to make it stationary. Common techniques include:

Logarithmic transformation.
Differencing: Subtracting the value of the previous time point from the current one.

Remember that achieving stationarity is not always possible or necessary for every time series, and the appropriate approach depends on the specific characteristics of your data.
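The two techniques above can be sketched as follows. The example assumes the built-in AirPassengers dataset, which is not used elsewhere in this article, as a series with a trend and growing variance, plus a simulated random walk whose seed and length are arbitrary.

```r
# Sketch: log transform and differencing on the built-in AirPassengers series.
ap <- AirPassengers                 # monthly totals with a trend and growing variance

log_ap      <- log(ap)              # log transform stabilises the growing variance
diff_log_ap <- diff(log_ap)         # first difference: x[t] - x[t-1], removes the trend

op <- par(mfrow = c(3, 1), mar = c(3, 4, 2, 1))
plot(ap,          main = "Original series",           ylab = "Passengers")
plot(log_ap,      main = "After log transform",       ylab = "log(Passengers)")
plot(diff_log_ap, main = "After log and differencing", ylab = "diff(log)")
par(op)

# Differencing a random walk recovers its stationary increments
set.seed(1)                         # arbitrary seed
e  <- rnorm(200)
rw <- cumsum(e)                     # non-stationary random walk
print(all.equal(diff(rw), e[-1]))   # increments match the original noise
```

Note that differencing shortens the series by one observation, so `diff_log_ap` has one value fewer than `ap`.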


Stationarity of Time Series Data using R

Load and check the time series data

R

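# Install the required package once if it is missing, then load it
if (!requireNamespace("tseries", quietly = TRUE)) install.packages("tseries")
library(tseries)   # provides adf.test() and kpss.test()

# Load the built-in lh dataset (luteinizing hormone in blood samples,
# 48 readings taken at 10-minute intervals; roughly stationary)
data("lh")

# 1. Visual Inspection
plot(lh, main = "Luteinizing Hormone in Blood Samples")

# 2. Summary Statistics
summary(lh)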

Output:

[Figure: line plot of the lh time series]

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.40    2.00    2.30    2.40    2.75    3.50

Check Stationarity of Time Series Data

1. Augmented Dickey-Fuller (ADF) Test

R

# Augmented Dickey-Fuller (ADF) test (null hypothesis: the series has a unit root)
adf_test <- adf.test(lh) 
print("ADF Test:") 
print(adf_test) 

Output:

[1] "ADF Test:"

    Augmented Dickey-Fuller Test

data:  lh
Dickey-Fuller = -3.558, Lag order = 3, p-value = 0.04624
alternative hypothesis: stationary

The ADF test checks the null hypothesis that the series has a unit root (is non-stationary) against the alternative that it is stationary. Applied to the lh dataset, it yields a test statistic of -3.558 with a lag order of 3 and a p-value of 0.04624. At a significance level of 0.05 the p-value is just below the threshold, so the null hypothesis of a unit root is rejected, which suggests that the lh series is stationary. Because the p-value is close to 0.05, this evidence is moderate rather than conclusive and is best read alongside the KPSS test.
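The "Lag order = 3" in the output comes from the default value of the `k` argument of `adf.test()`, which the tseries documentation gives as `trunc((length(x) - 1)^(1/3))`. The sketch below reproduces that value in base R and, if tseries is installed, shows how the p-value changes with the lag order. The range 1 to 5 is an illustrative assumption.

```r
# Sketch: where the default ADF lag order comes from.
n <- length(lh)                       # 48 observations
default_k <- trunc((n - 1)^(1/3))     # documented default for adf.test()
print(default_k)                      # 3, matching "Lag order = 3" in the output

# Optional sensitivity check (runs only if tseries is available)
if (requireNamespace("tseries", quietly = TRUE)) {
  p_by_lag <- suppressWarnings(
    sapply(1:5, function(k) tseries::adf.test(lh, alternative = "stationary", k = k)$p.value)
  )
  print(round(setNames(p_by_lag, paste0("k=", 1:5)), 4))
}
```

With a borderline result like this one, seeing whether the conclusion survives a different lag order is a reasonable robustness check.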

2. Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test

R

# Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test (null hypothesis: the series is level-stationary)
kpss_test <- kpss.test(lh) 
print("KPSS Test:") 
print(kpss_test) 

Output:

[1] "KPSS Test:"

    KPSS Test for Level Stationarity

data:  lh
KPSS Level = 0.29382, Truncation lag parameter = 3, p-value = 0.1

The KPSS (Kwiatkowski-Phillips-Schmidt-Shin) test reverses the hypotheses of the ADF test: its null hypothesis is that the series is level-stationary, and the alternative is that it has a unit root. On the lh dataset it gives a KPSS Level statistic of 0.29382 with a truncation lag parameter of 3 and a reported p-value of 0.1. Since kpss.test() interpolates p-values from a table that stops at 0.1, the actual p-value may be larger than the printed one. Because the p-value is greater than the significance level (commonly 0.05), we fail to reject the null hypothesis, so the test finds no evidence against level stationarity. Failing to reject is not proof of stationarity, but the ADF test rejects a unit root while the KPSS test does not reject stationarity, so the two results are consistent and together support treating lh as stationary.
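Because the two tests have opposite null hypotheses, their results can be combined into a simple decision rule. The function below is a hypothetical helper written for this illustration; it is not part of tseries, and the labels it returns are a common reading of the four possible outcomes rather than a formal procedure.

```r
# Hypothetical helper (not part of tseries): combine ADF and KPSS p-values.
classify_stationarity <- function(adf_p, kpss_p, alpha = 0.05) {
  adf_rejects_unit_root   <- adf_p  < alpha   # ADF null: unit root
  kpss_rejects_stationary <- kpss_p < alpha   # KPSS null: stationarity

  if (adf_rejects_unit_root && !kpss_rejects_stationary) {
    "both tests point to stationarity"
  } else if (!adf_rejects_unit_root && kpss_rejects_stationary) {
    "both tests point to non-stationarity"
  } else if (adf_rejects_unit_root && kpss_rejects_stationary) {
    "conflicting: both null hypotheses rejected"
  } else {
    "inconclusive: neither null hypothesis rejected"
  }
}

# p-values taken from the ADF and KPSS outputs for lh shown above
print(classify_stationarity(adf_p = 0.04624, kpss_p = 0.1))
```

For the lh results this returns "both tests point to stationarity".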

Stationary vs Non-Stationary Time Series Data

R

# Generate time vector 
t <- 1:300 
  
# Generate a stationary time series (Gaussian white noise), scaled to [-1, 1]
set.seed(123)
y_stationary <- rnorm(length(t), mean = 0, sd = 1)
y_stationary <- y_stationary / max(abs(y_stationary))

# Generate a non-stationary time series: a random walk plus a small linear trend
set.seed(456)
y_trend <- cumsum(rnorm(length(t), mean = 0, sd = 4)) + t / 100
y_trend <- y_trend / max(abs(y_trend))

# 2 x 2 layout filled column by column: each column shows a series above its ACF
par(mfcol = c(2, 2), mar = c(4, 4, 2, 1))
  
# Plot stationary time series 
plot(t, y_stationary, type = 'l', col = 'darkgreen', xlab = "Time (t)", ylab = "Y(t)", 
     main = "Stationary Time Series", cex.main = 1.2, cex.lab = 1.1) 
  
# ACF for stationary time series 
acf_y_stationary <- acf(y_stationary, lag.max = length(y_stationary), plot = FALSE) 
plot(acf_y_stationary, main = 'ACF - Stationary Time Series', cex.main = 1.2,  
     cex.lab = 1.1) 
  
# Plot non-stationary time series with trend 
plot(t, y_trend, type = 'l', col = 'steelblue', xlab = "Time (t)", ylab = "Y(t)", 
     main = "Non-Stationary Time Series with Trend", cex.main = 1.2, cex.lab = 1.1) 
  
# ACF for non-stationary time series with trend 
acf_y_trend <- acf(y_trend, lag.max = length(y_trend), plot = FALSE) 
plot(acf_y_trend, main = 'ACF - Non-Stationary Time Series with Trend', cex.main = 1.2,  
     cex.lab = 1.1) 

Output:

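[Figure: 2 x 2 panel of plots. Left column: stationary time series and its ACF. Right column: non-stationary time series with trend and its ACF.]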

This code generates and plots two time series: one stationary (y_stationary) and one non-stationary with a trend (y_trend).

Stationary Time Series:
- The first plot depicts a stationary time series generated with random normal noise.
- A stationary time series has a constant mean, variance, and autocorrelation structure over time.
ACF (Autocorrelation Function) for Stationary Time Series:
- The second plot shows the autocorrelation function (ACF) for the stationary time series.
- For white noise such as this series, the ACF is close to zero at every non-zero lag. More generally, the ACF of a stationary series decays quickly, indicating a lack of long-term dependencies.
Non-Stationary Time Series with Trend:
- The third plot illustrates a non-stationary time series built as the cumulative sum of random normal noise (a random walk) plus a linear trend.
- Non-stationary time series exhibit changing statistical properties, often with trends or seasonality.
ACF for Non-Stationary Time Series with Trend:
- The fourth plot displays the autocorrelation function (ACF) for the non-stationary time series with a trend.
- In a non-stationary series, the ACF may show slower decay, reflecting the persistence of dependencies.

Stationarity is evident in the first pair of plots (the stationary series and its ACF), where the mean, variance, and autocorrelation structure remain relatively constant over time.

Comparing the two ACF plots shows how the ACF can be a useful diagnostic tool for telling stationary and non-stationary time series apart. The decay pattern in the ACF provides insight into the temporal dependencies of the series.
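The same comparison can be made numerically instead of visually. The sketch below regenerates comparable series (unscaled, since scaling does not change the ACF) and tabulates the ACF at a few lags. The helper `lag_acf()` and the chosen lags are illustrative assumptions, and the differenced random walk is included to show the effect of differencing.

```r
# Sketch: numeric ACF comparison for white noise, a random walk, and its difference.
set.seed(123)
white_noise <- rnorm(300)
set.seed(456)
random_walk <- cumsum(rnorm(300, mean = 0, sd = 4))

# Hypothetical helper: ACF values at selected lags
lag_acf <- function(x, lags = c(1, 5, 10, 20)) {
  a <- acf(x, lag.max = max(lags), plot = FALSE)$acf
  setNames(round(a[lags + 1], 3), paste0("lag", lags))   # index 1 is lag 0
}

print(rbind(white_noise      = lag_acf(white_noise),
            random_walk      = lag_acf(random_walk),
            diff_random_walk = lag_acf(diff(random_walk))))

# Approximate 95% band for the ACF of white noise with 300 observations
conf_band <- qnorm(0.975) / sqrt(300)
print(round(conf_band, 3))
```

Values inside the band are compatible with no autocorrelation. The random walk should stay far outside it for many lags, while its first difference should behave like white noise.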
