Stationarity Issues in Time Series Modeling David A. Dickey North Carolina State University


“Stationarity”: what is it?
• Example: Stocks of Silver in the NY Commodities Exchange
• Two forecasts:
  • Nonstationary in yellow: no mean reversion, unbounded error bands
  • Stationary in green: reverts to mean, bounded error bands

Silver Series

“Stationarity”: what is it?
• Constant mean $\mu$
• Covariance between $Y_t$ and $Y_{t+h}$ is a function of $h$ only: $\gamma(h)$
• [Autocorrelation $\rho(h) = \gamma(h)/\gamma(0)$]

One Lag Model
• $Y_t - \mu = \rho(Y_{t-1} - \mu) + e_t$
• “Shocks” $e_t \sim N(0, \sigma^2)$
• Stationary: $|\rho| < 1$
• $Y_t = \mu(1 - \rho) + \rho Y_{t-1} + e_t$
• Regress $Y_t$ on 1, $Y_{t-1}$
• Estimators approximately normally distributed in large samples
• Use t test for $H_0: \rho = 0$
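
The regression on this slide can be tried on simulated data. The following is a minimal sketch in Python (standard library only) rather than SAS, so it can be run anywhere; the function names, the seed, and the parameter values (mean 100, lag coefficient 0.8, 2000 observations) are illustrative assumptions, not taken from the slides.

```python
import random

def simulate_ar1(n, mu, rho, sigma=1.0, seed=0):
    """Simulate Y_t - mu = rho*(Y_{t-1} - mu) + e_t with e_t ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    y = [mu]  # start at the mean (an arbitrary choice for this sketch)
    for _ in range(n - 1):
        y.append(mu + rho * (y[-1] - mu) + rng.gauss(0.0, sigma))
    return y

def regress_on_lag(y):
    """Ordinary least squares of Y_t on 1 and Y_{t-1}; returns (intercept, slope)."""
    x, z = y[:-1], y[1:]
    n = len(x)
    xbar, zbar = sum(x) / n, sum(z) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    sxz = sum((a - xbar) * (b - zbar) for a, b in zip(x, z))
    slope = sxz / sxx
    return zbar - slope * xbar, slope

y = simulate_ar1(n=2000, mu=100.0, rho=0.8)
intercept, slope = regress_on_lag(y)
mu_hat = intercept / (1.0 - slope)  # because the intercept equals mean * (1 - slope)
print(round(slope, 3), round(mu_hat, 2))
```

In the stationary case the slope estimate should land close to 0.8 and the implied mean close to 100, which is the large-sample behavior the slide relies on.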

One Lag Model with $\rho = 1$
• $Y_t - \mu = 1(Y_{t-1} - \mu) + e_t$
• “Shocks” $e_t \sim N(0, \sigma^2)$
• $Y_t = Y_{t-1} + e_t$
• Best forecast of $Y_t$ is $Y_{t-1}$
• Nonstationary: $\rho = 1$
• Regress $Y_t$ on 1, $Y_{t-1}$
• Estimators NOT normally distributed even in large samples
• CANNOT use t tables to test for $H_0: \rho = 0$
• t test statistic does NOT have a t distribution!

Hypothesis Test
• Model: $Y_t - \mu = \rho(Y_{t-1} - \mu) + e_t$
• Test
  • $H_0: \rho = 1$ “Nonstationary, Unit Root”
  • $H_1: |\rho| < 1$ “Stationary (mean reverting)”
• Compare calculated t to new distribution

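Two Tests
• Model: $Y_t - \mu = \rho(Y_{t-1} - \mu) + e_t$
• $Y_t - \mu - (Y_{t-1} - \mu) = (\rho - 1)(Y_{t-1} - \mu) + e_t$
• $Y_t - Y_{t-1} = \mu(1 - \rho) + (\rho - 1)Y_{t-1} + e_t$
• Regress $Y_t - Y_{t-1}$ on 1, $Y_{t-1}$
• Tests:
  • n × (coefficient of $Y_{t-1}$) → “Rho”
  • calculated t statistic → “Tau”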
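
The two statistics can be computed directly from the regression of the differences on the lagged level. This is a hedged sketch in Python (standard library only), not the SAS implementation: the function name `dickey_fuller_stats`, the seed, and the simulated series are assumptions for illustration, and n is taken here to be the number of observations used in the regression. The statistics must still be compared with the special unit root tables, not the usual t tables.

```python
import math
import random

def dickey_fuller_stats(y):
    """Regress Y_t - Y_{t-1} on 1 and Y_{t-1}; return (Rho, Tau) with no augmenting lags."""
    x = y[:-1]
    d = [b - a for a, b in zip(y[:-1], y[1:])]
    n = len(x)
    xbar, dbar = sum(x) / n, sum(d) / n
    sxx = sum((a - xbar) ** 2 for a in x)
    slope = sum((a - xbar) * (b - dbar) for a, b in zip(x, d)) / sxx
    intercept = dbar - slope * xbar
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, d))
    se_slope = math.sqrt(sse / (n - 2) / sxx)  # usual OLS standard error
    return n * slope, slope / se_slope

rng = random.Random(1)
walk, ar = [0.0], [0.0]
for _ in range(499):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))      # unit root: coefficient 1
    ar.append(0.5 * ar[-1] + rng.gauss(0.0, 1.0))    # stationary: coefficient 0.5

rho_walk, tau_walk = dickey_fuller_stats(walk)
rho_ar, tau_ar = dickey_fuller_stats(ar)
print(round(rho_walk, 2), round(tau_walk, 2))
print(round(rho_ar, 2), round(tau_ar, 2))
```

For the stationary series both statistics come out strongly negative, while for the random walk they stay much closer to zero.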

Some math Above diagonal ->

More math W(t) is Wiener Process on [0,1]
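
For reference, the textbook limits for the simplest case (regression of $Y_t$ on $Y_{t-1}$ with no intercept, true coefficient equal to 1) are sketched below. This is the standard result, given as a hedged reminder; whether it matches the formulas shown on the original slide is an assumption, and $\hat\rho$, $\hat\tau$ are notation introduced here.

```latex
\begin{align*}
n(\hat\rho - 1) &\;\Rightarrow\; \frac{\tfrac{1}{2}\left(W(1)^2 - 1\right)}{\int_0^1 W(t)^2\,dt},
&
\hat\tau &\;\Rightarrow\; \frac{\tfrac{1}{2}\left(W(1)^2 - 1\right)}{\left(\int_0^1 W(t)^2\,dt\right)^{1/2}}
\end{align*}
```

With an intercept or a trend in the regression, $W(t)$ is replaced by its demeaned or detrended version, which is why each case needs its own tables.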

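Two Series
SAS software: PROC ARIMA
  /* overlay plot of both series against time */
  proc gplot; plot (Y Z)*t / overlay;
  /* ACF and augmented Dickey-Fuller tests for each series */
  proc arima;
    i var=Y nlag=10 stationarity=(adf);
    i var=Z nlag=10 stationarity=(adf);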

Symptoms of Nonstationarity
• ACF dies down slowly
  • ACF is Corr($Y_t$, $Y_{t-j}$), plotted vs. j
• Nonconstant level when plotted
• Saw plot, ACFs coming up
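
A sample ACF is easy to compute by hand. The sketch below is in Python (standard library only) rather than SAS; the function name `sample_acf` and the five-point toy series are illustrative assumptions, and the lag covariances are divided by n, the common convention.

```python
def sample_acf(y, nlag):
    """Sample autocorrelations r(0), ..., r(nlag); lag covariances are divided by n."""
    n = len(y)
    ybar = sum(y) / n
    dev = [v - ybar for v in y]
    c0 = sum(d * d for d in dev) / n
    return [sum(dev[t] * dev[t + j] for t in range(n - j)) / n / c0
            for j in range(nlag + 1)]

acf = sample_acf([1.0, 2.0, 3.0, 4.0, 5.0], nlag=2)
print(acf)
```

For this toy series the deviations from the mean are -2, -1, 0, 1, 2, so r(1) = 4/10 = 0.4 and r(2) = -1/10 = -0.1.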

Y series ACF
The ARIMA Procedure
Name of Variable = Y
Mean of Working Series     110.9728
Standard Deviation         5.286108
Number of Observations          250

Autocorrelation
Lag  Correlation  -1 ... 0 ... 1            Std Error
  0      1.00000  |********************|    0
  1      0.97219  . |*******************    0.063246
  2      0.94506  . |*******************    0.107523
  3      0.91741  . |******************     0.136771
  4      0.89025  . |******************     0.159498
  5      0.86479  . |*****************      0.178269
  6      0.84145  . |*****************      0.194326
  7      0.81771  . |****************       0.208391
  8      0.79836  . |****************       0.220853
  9      0.77912  . |****************       0.232110
 10      0.75671  . |***************        0.242346

Z series ACF
The ARIMA Procedure
Name of Variable = Z
Mean of Working Series     100.5022
Standard Deviation         2.402392
Number of Observations          250

Autocorrelations
Lag  Correlation  -1 ... 0 ... 1
  0      1.00000  |********************|
  1      0.90796  . |******************
  2      0.81755  . |****************
  3      0.72228  . |**************
  4      0.63703  . |*************
  5      0.56707  . |***********
  6      0.51964  . |**********
  7      0.47865  . |**********
  8      0.46026  . |*********
  9      0.44466  . |*********
 10      0.42313  . |********
"." marks two standard errors

Tests on Y
The ARIMA Procedure
Augmented Dickey-Fuller Unit Root Tests
Type         Lags      Rho  Pr < Rho    Tau  Pr < Tau     F  Pr > F
Zero Mean       0   0.1014    0.7059   0.71    0.8675
                1   0.0880    0.7027   0.59    0.8422
                2   0.0719    0.6989   0.45    0.8101
Single Mean     0  -6.8507    0.2817  -2.30    0.1724  2.99  0.3095
                1  -6.8539    0.2815  -2.16    0.2211  2.57  0.4147
                2  -7.1478    0.2624  -2.07    0.2564  2.29  0.4861
Trend           0  -7.3468    0.6313  -2.46    0.3502  3.64  0.4500
                1  -7.3273    0.6328  -2.30    0.4295  3.07  0.5636
                2  -7.5909    0.6114  -2.19    0.4905  2.65  0.6489
Conclusion: Nonstationary

Tests on Z
The ARIMA Procedure
Augmented Dickey-Fuller Unit Root Tests
Type         Lags       Rho  Pr < Rho    Tau  Pr < Tau     F  Pr > F
Zero Mean       0   -0.0087    0.6803  -0.05    0.6647
                1   -0.0237    0.6769  -0.15    0.632
                2   -0.0393    0.6733  -0.24    0.5997
Single Mean     0  -22.8511    0.0051  -3.45    0.0104  5.96  0.0136
                1  -24.5443    0.0034  -3.48    0.0095  6.06  0.0114
                2  -28.8542    0.0015  -3.69    0.0050  6.80  0.0010
Trend           0  -24.6119    0.0236  -3.61    0.0312  6.53  0.0449
                1  -26.2971    0.0161  -3.60    0.0319  6.48  0.0461
                2  -30.7682    0.0057  -3.77    0.0196  7.13  0.0283
Conclusion: Stationary

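Higher Order Processes
• $Y_t - \mu = a_1(Y_{t-1} - \mu) + a_2(Y_{t-2} - \mu) + a_3(Y_{t-3} - \mu) + e_t$
• $\Delta Y_t = Y_t - Y_{t-1} = -(1 - a_1 - a_2 - a_3)(Y_{t-1} - \mu) - (a_2 + a_3)\Delta Y_{t-1} - a_3 \Delta Y_{t-2} + e_t$
  • The first term carries the unit root coefficient, $-(1 - a_1 - a_2 - a_3)$; the $\Delta Y_{t-1}$ and $\Delta Y_{t-2}$ terms are the augmenting lags
• ADF stands for Augmented Dickey-Fuller
• Testing for no mean reversion: $H_0: (1 - a_1 - a_2 - a_3) = 0$
• Regress $Y_t - Y_{t-1}$ on 1, $Y_{t-1}$, $Y_{t-1} - Y_{t-2}$, $Y_{t-2} - Y_{t-3}$
• Under $H_0$ the estimated coefficient of $Y_{t-1}$ has a nonstandard distribution; the estimated coefficients of the lagged differences are approximately N(__, __)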
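
The move from the three-lag model to the differenced form is pure algebra. A sketch of the check follows, writing $y_t$ for the series minus its mean (a shorthand introduced here, not on the slide), so that the difference of $y_t$ equals the difference of the original series.

```latex
\begin{align*}
y_t - y_{t-1}
  &= -(1 - a_1 - a_2 - a_3)\,y_{t-1}
     - (a_2 + a_3)(y_{t-1} - y_{t-2})
     - a_3\,(y_{t-2} - y_{t-3}) + e_t \\
  &= \bigl[-(1 - a_1 - a_2 - a_3) - (a_2 + a_3)\bigr] y_{t-1}
     + \bigl[(a_2 + a_3) - a_3\bigr] y_{t-2}
     + a_3\,y_{t-3} + e_t \\
  &= (a_1 - 1)\,y_{t-1} + a_2\,y_{t-2} + a_3\,y_{t-3} + e_t
\end{align*}
```

Adding $y_{t-1}$ to both sides recovers the original three-lag model, so the two forms are equivalent.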

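Higher Order Processes
Q1: How many lags?
• Regress $\Delta Y_t$ on 1, $Y_{t-1}$, $\Delta Y_{t-1}$, $\Delta Y_{t-2}$, . . .
• Estimated coefficients of the lagged differences are approximately N(__, __), so just use the usual t tests and p-values!
Q2: Why “Unit Root” tests?
• Backshift operator: $B(Y_t) = Y_{t-1}$
• $(1 - a_1 B - a_2 B^2 - a_3 B^3)(Y_t - \mu) = e_t$
• A root of $1 - a_1 B - a_2 B^2 - a_3 B^3$ at $B = 1$ means $1 - a_1(1) - a_2(1)^2 - a_3(1)^3 = 0$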
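
The unit root condition can be checked numerically by evaluating the lag polynomial at B = 1. A small sketch in Python follows; the function name `ar_poly_at` and both sets of coefficients are made-up illustrations, not estimates from any series in this talk.

```python
def ar_poly_at(coefs, b):
    """Evaluate 1 - a1*b - a2*b**2 - ... at the point b."""
    return 1.0 - sum(a * b ** (k + 1) for k, a in enumerate(coefs))

# (1 - B)(1 - 0.5B) = 1 - 1.5B + 0.5B^2, so a1 = 1.5 and a2 = -0.5: a unit root.
unit_root = ar_poly_at([1.5, -0.5], 1.0)

# (1 - 0.8B)(1 - 0.5B) = 1 - 1.3B + 0.4B^2, so a1 = 1.3 and a2 = -0.4: stationary.
stationary = ar_poly_at([1.3, -0.4], 1.0)

print(unit_root, stationary)
```

The first polynomial is zero at B = 1, while the second equals (1 - 0.8)(1 - 0.5) = 0.1 up to rounding, so only the first model has a unit root.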

Check Silver Series for Augmenting Lags
  PROC REG; MODEL DEL = LSILVER DEL1 DEL2 DEL3 DEL4;
  TEST DEL2=0, DEL3=0, DEL4=0;

                     Mean
Source        DF     Square       F Value  Pr > F
Numerator      3     4589.63459      1.31  0.2753
Denominator  133     3515.48242

Unit Root test in PROC REG
  PROC REG; MODEL DEL = LSILVER DEL1;

                 Parameter
Variable    DF   Estimate   t Value  Pr > |t|
Intercept    1   75.58073      2.76    0.0082
LSILVER      1   -0.11703     -2.78    0.0079
DEL1         1    0.67115      6.21    <.0001

Unit Root test in PROC ARIMA
  PROC ARIMA DATA=SILVER;
  I VAR=SILVER STATIONARITY=(ADF=(1));

Augmented Dickey-Fuller Unit Root Tests
Type         Lags    Tau  Pr < Tau
Zero Mean       1  -0.28    0.5800
Single Mean     1  -2.78    0.0689
Trend           1  -2.63    0.2697

And now. . .the rest of the story

Type         Lags    Tau  Pr < Tau
Zero Mean    ????? (A)
Single Mean     1  -2.78    0.0689
Trend        ????? (B)

(A) Assumes mean is 0 (or known and subtracted off). Has a different (pair of) distributions!
(B) Allows for TREND under H1. Has a third (pair of) distributions!

Silver - Need 2nd Difference?
$D_t = \Delta Y_t = Y_t - Y_{t-1}$
Q: Does D (also) have a unit root?

Regress $\Delta D_t$ on $D_{t-1}$ using /NOINT (why?)
No augmenting lags (why?)
  I VAR=Y(1) STATIONARITY = . . .

Type         Lags    Tau  Pr < Tau
Zero Mean       0  -3.42    0.0010
Single Mean     0  -3.39    0.0158
Trend           0  -3.62    0.0383

Autocorrelations
Lag  Covariance  Correlation  -1 ... 0 ... 1
  0     7612550      1.00000  |********************|
  1     7604217      0.99891  .|********************|
  2     7595529      0.99776  .|********************|
  3     7586855      0.99662  . |********************|
  4     7578152      0.99548  . |********************|
  5     7569481      0.99434  . |********************|
  6     7560553      0.99317  . |********************|
  7     7551925      0.99204  . |********************|
  8     7543869      0.99098  . |********************|
  9     7535957      0.98994  . |********************|
 10     7528240      0.98892  . |********************|
 11     7519890      0.98783  . |********************|
 12     7511672      0.98675  . |********************|
"." marks two standard errors

Output from SAS PROC ARIMA
Augmented Dickey-Fuller Unit Root Tests
Type         Lags      Rho  Pr < Rho
Zero Mean       0   1.3567    0.9565
                1   1.3481    0.9557
Single Mean     0   0.4065    0.9744
                1   0.3500    0.9725
Trend           0  -6.3073    0.7203
                1  -6.5833    0.6981

Differences

Autocorrelations
Lag  Covariance  Correlation  -1 ... 0 ... 1
  0    4003.285      1.00000  |********************|
  1     102.471      0.02560  .|*
  2    -117.368      -.02932  *|.
  3    -235.578      -.05885  *|.
  4  -26.946567      -.00673  .|.
  5  -46.750761      -.01168  .|.
  6  -77.100469      -.01926  .|.
  7    -224.055      -.05597  *|.
  8  -27.874814      -.00696  .|.
  9     132.415      0.03308  .|*
 10     316.534      0.07907  .|**
 11    -254.117      -.06348  *|.
 12     200.979      0.05020  .|*
"." marks two standard errors

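Inverse Autocorrelation
• Ming Chang thesis
• Dual model: the AR(1) model $(1 - aB)Y_t = e_t$ has as its dual the MA(1) model $Y_t = (1 - aB)e_t$
• The inverse autocorrelation function (IACF) of a model is the ACF of its dual
• Chang shows the IACF dies off slowly if you overdifference.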

Differenced DJIA IACF
Inverse Autocorrelations
Lag  Correlation  -1 ... 0 ... 1
  1     -0.51119  **********|.
  2      0.01380  .|.
  3     -0.00533  .|.
  4      0.01061  .|.
  5     -0.02324  .|.
  6      0.00722  .|.
  7      0.02122  .|.
  8     -0.01617  .|.
  9      0.02831  .|*
 10     -0.04860  *|.
 11      0.02759  .|*
 12     -0.00422  .|.

2nd Differenced DJIA IACF
Just for illustration, here is the inverse autocorrelation you would get if you differenced these differences once more, that is, if you took the second difference of the original series. Note the roughly triangular appearance, suggesting that you should have stopped after the first difference.

Inverse Autocorrelations
Lag  Correlation  -1 ... 0 ... 1
  1      0.89720  .|******************
  2      0.80302  .|****************
  3      0.70785  .|**************
  4      0.60466  .|************
  5      0.50498  .|**********
  6      0.41173  .|********
  7      0.32523  .|*******
  8      0.23836  .|*****
  9      0.15871  .|***
 10      0.09447  .|**
 11      0.05758  .|*
 12      0.01735  .|.

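Rho and F
• $Y_t - \mu = a_1(Y_{t-1} - \mu) + a_2(Y_{t-2} - \mu) + e_t$
• Factor: $(1 - a_1 B - a_2 B^2) = (1 - \rho B)(1 - \gamma B)$
• $\Delta Y_t = -(1 - \rho)(1 - \gamma)(Y_{t-1} - \mu) + \rho\gamma\,\Delta Y_{t-1} + e_t$
Rho
(1) Estimate $\rho\gamma = -a_2$ (which equals $\gamma$ under $H_0$) by regression
(2) Divide n × [estimate of $(\rho - 1)(1 - \gamma)$, the coefficient of $Y_{t-1}$] by (1 - $\gamma$ estimate) to get n × ($\rho$ estimate - 1)
F
• Regress $\Delta Y_t$ on 1, t, $Y_{t-1}$, $\Delta Y_{t-1}$
• Test underlined items with F (3 numerator df)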

Trend is not Unit Root
• $Y_t = a + bt + Z_t$ with $Z_t$ stationary
• $Y_{t-1} = a + b(t-1) + Z_{t-1}$
• $\Delta Y_t = b + \Delta Z_t$ with $\Delta Z_t$ an overdifferenced series!
Example:
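
The overdifferencing effect is easy to see by simulation. The sketch below is in Python (standard library only) rather than SAS and makes a strong simplifying assumption: the stationary part is taken to be pure white noise, and the intercept, slope, seed, and series length are hypothetical values. In that case the differences are an MA(1) with a unit moving average root, whose lag 1 autocorrelation is -0.5.

```python
import random

def lag1_autocorr(x):
    """Lag 1 sample autocorrelation of x."""
    n = len(x)
    xbar = sum(x) / n
    dev = [v - xbar for v in x]
    return sum(p * q for p, q in zip(dev[:-1], dev[1:])) / sum(d * d for d in dev)

rng = random.Random(2)
a, b = 10.0, 0.05  # hypothetical intercept and slope of the linear trend
y = [a + b * t + rng.gauss(0.0, 1.0) for t in range(5000)]  # trend plus white noise
dy = [q - p for p, q in zip(y[:-1], y[1:])]                 # first differences

mean_dy = sum(dy) / len(dy)
r1 = lag1_autocorr(dy)
print(round(mean_dy, 3), round(r1, 3))
```

The mean of the differences estimates the slope b, and the lag 1 autocorrelation near -0.5 is the signature of a series that was differenced when it only needed detrending.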

Amazon.com Example (volume)

  PROC REG; MODEL DV = DATE LAGV DV1-DV4;
  TEST DV3=0, DV4=0;

                 Parameter
Variable    DF    Estimate   t Value  Pr > |t|  Type I SS
Intercept    1   -17.49220     -5.26    <.0001    0.00848
date         1     0.00147      5.41    <.0001    0.01395
LAGV         1    -0.21914     -5.80    <.0001   26.67803
DV1          1    -0.15446     -3.08    0.0022    0.94211
DV2          1    -0.18447     -3.72    0.0002    3.52898
DV3          1    -0.04433     -0.94    0.3477    0.07997
DV4          1    -0.05774     -1.31    0.1923    0.48763

Test 1 Results for Dependent Variable DV
                     Mean
Source        DF     Square   F Value  Pr > F
Numerator      2    0.28380      0.99  0.3715
Denominator  497    0.28602

ACF Levels:
Lag  Covariance  Correlation  -1 ... 0 ... 1
  0    2.503910      1.00000  |********************|
  1    2.327538      0.92956  . |*******************
  2    2.225324      0.88874  . |******************
  3    2.193509      0.87603  . |******************
  4    2.155492      0.86085  . |*****************
  5    2.127643      0.84973  . |*****************
  6    2.099292      0.83841  . |*****************
  7    2.069929      0.82668  . |*****************
  8    2.062194      0.82359  . |****************
  9    2.051450      0.81930  . |****************
 10    2.011864      0.80349  . |****************
 11    2.006564      0.80137  . |****************
 12    1.996735      0.79745  . |****************
 13    1.960231      0.78287  . |****************
 14    1.951272      0.77929  . |****************
 15    1.940939      0.77516  . |****************
 16    1.919167      0.76647  . |***************
 17    1.906896      0.76157  . |***************
 18    1.905406      0.76097  . |***************
 19    1.892168      0.75569  . |***************
 20    1.857199      0.74172  . |***************
 21    1.846038      0.73726  . |***************
 22    1.826167      0.72933  . |***************
 23    1.816151      0.72533  . |***************
 24    1.821228      0.72735  . |***************
"." marks two standard errors

IACF - Differences
Lag  Correlation  -1 ... 0 ... 1
  1      0.48216  . |**********
  2      0.44816  . |*********
  3      0.34266  . |*******
  4      0.30682  . |******
  5      0.25213  . |*****
  6      0.24854  . |*****
  7      0.23624  . |*****
  8      0.18675  . |****
  9      0.14088  . |***
 10      0.20330  . |****
 11      0.13295  . |***
 12      0.11437  . |**
 13      0.15524  . |***
 14      0.11829  . |**
 15      0.09978  . |**
 16      0.10919  . |**
 17      0.09049  . |**
 18      0.06653  . |*.
 19      0.02886  . |*.
 20      0.09515  . |**
 21      0.05504  . |*.
 22      0.07104  . |*.
 23      0.06065  . |*.
 24      0.02284  . | .

The ARIMA Procedure
Augmented Dickey-Fuller Unit Root Tests
Type         Lags       Rho  Pr < Rho    Tau  Pr < Tau      F  Pr > F
Zero Mean       2    0.0144    0.6861   0.02    0.6909
Single Mean     2  -14.2100    0.0474  -2.60    0.0944   3.42  0.1920
Trend           2  -85.7758    0.0007  -6.35    <.0001  20.18  0.0010

Do the test: Fit AR(3) plus trend. Diagnostics:

Autocorrelation Check of Residuals
To Lag  Chi-Square  DF  Pr > ChiSq  Autocorrelations
     6        1.59   3      0.6615  -0.015 . . . -0.000
    12       10.89   9      0.2835  -0.025 . . .  0.072
    18       12.43  15      0.6460  -0.036 . . .  0.031
    24       18.97  21      0.5872
    30       23.75  27      0.6439
    36       30.32  33      0.6014
    42       37.56  39      0.5358
    48       39.37  45      0.7087

Extensions
• S. E. Said shows that models with lagged $e_t$ terms can still be tested by ADF tests.
• Nobel Prize “cointegration” idea: two or more unit root processes have a stationary linear combination.
• Compute, e.g., $Y_t = \ln(S_t/L_t)$ and test for stationarity.
• http://www4.stat.ncsu.edu/~dickey
  • Click: SAS Code from Presentations

Thanks ! Questions ?
