IT Applications in Business Analytics
Presentation Transcript
IT Applications in Business Analytics Lecture 09 – Time Series Regression Thomas Zeutschler Business Analytics (M.Sc.) IT in Business Analytics IT Applications in Business Analytics - 09. Time Series Regression
Let's get started…
"Prediction is very difficult, especially when it's about the future." (Niels Bohr)
Regression Analysis
• Regression analysis is a class of statistical methods for describing the relation between one dependent variable and one or more independent variables.
• Many economic time series are robustly related, for example:
  • Oil price > fuel price
  • US 12-month average fuel price > engine displacement of US cars
  • Average US income > engine displacement of US cars
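As a minimal sketch of the idea, the following R snippet fits a simple linear regression of a fuel price on an oil price. Both series are simulated and all numbers (intercept 0.5, slope 0.02, noise levels) are assumptions chosen for illustration, not real market data.

```r
# Hypothetical example: simulated monthly oil and fuel prices (not real data)
set.seed(1)
n <- 120

# Oil price as a random walk starting near 60 (assumed units)
oil <- 60 + cumsum(rnorm(n, sd = 2))

# Fuel price assumed to depend linearly on the oil price, plus noise
fuel <- 0.5 + 0.02 * oil + rnorm(n, sd = 0.05)

# Regress the dependent variable (fuel) on the independent variable (oil)
fit <- lm(fuel ~ oil)

coef(fit)               # estimated intercept and slope
summary(fit)$r.squared  # share of variance in fuel explained by oil
```

The estimated slope should land close to the assumed value of 0.02, which is how one would check that the regression recovers a known relation.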
Time Series Dependencies
Nonlinear Time Series
• A nonlinear time series (process) is any stochastic process that is not linear.
• Nonlinear time series are generated by nonlinear dynamic equations. They display features that cannot be modelled by linear processes:
  • time-changing variance,
  • asymmetric cycles,
  • higher-moment structures,
  • thresholds and breaks.
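To illustrate one of these features, time-changing variance, here is a small sketch that simulates an ARCH(1) process in base R. The model choice and the parameters (omega = 0.5, alpha = 0.5) are assumptions for illustration and are not taken from the lecture.

```r
# Hypothetical example: ARCH(1) process with time-changing conditional variance
set.seed(42)
n     <- 5000
omega <- 0.5   # assumed baseline variance
alpha <- 0.5   # assumed weight of the previous squared value

x      <- numeric(n)
sigma2 <- numeric(n)
sigma2[1] <- omega / (1 - alpha)          # unconditional variance as start value
x[1]      <- sqrt(sigma2[1]) * rnorm(1)

for (t in 2:n) {
  # The variance at time t depends nonlinearly on the past value
  sigma2[t] <- omega + alpha * x[t - 1]^2
  x[t]      <- sqrt(sigma2[t]) * rnorm(1)
}

# Lag-1 autocorrelation of the series and of its squares
acf_x  <- acf(x,   lag.max = 1, plot = FALSE)$acf[2]
acf_x2 <- acf(x^2, lag.max = 1, plot = FALSE)$acf[2]
c(series = acf_x, squares = acf_x2)
```

The series itself is nearly uncorrelated, while its squares are clearly autocorrelated. A linear model fitted to x alone would miss this structure.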
Time Series in R
• astsa R package
  • A collection of time series analysis methods
  • Includes several sample data sets
  • By David Stoffer, "Data sets and scripts for Time Series Analysis and Its Applications: With R Examples", http://www.stat.pitt.edu/stoffer/tsa3/
Time Series – Use Case
• El Niño and the Fish
• The Southern Oscillation Index (SOI) gives an indication of the development and intensity of El Niño or La Niña events in the Pacific Ocean.
• The SOI is calculated using the pressure differences between Tahiti and Darwin:
  $\mathrm{SOI} = 10 \times \dfrac{\mathrm{Pdiff} - \mathrm{Pdiffav}}{\mathrm{SD}(\mathrm{Pdiff})}$
• where
  • Pdiff = (average Tahiti MSLP for the month) - (average Darwin MSLP for the month), with MSLP being the mean sea level pressure,
  • Pdiffav = long-term average of Pdiff for the month in question,
  • SD(Pdiff) = long-term standard deviation of Pdiff for the month in question.
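The SOI formula can be sketched as a small R function. The function name soi_index and all pressure values below are made up for illustration; they are not real observations.

```r
# Hypothetical helper implementing SOI = 10 * (Pdiff - Pdiffav) / SD(Pdiff)
soi_index <- function(tahiti_mslp, darwin_mslp, pdiff_avg, pdiff_sd) {
  pdiff <- tahiti_mslp - darwin_mslp   # monthly pressure difference
  10 * (pdiff - pdiff_avg) / pdiff_sd  # standardised and scaled by 10
}

# Made-up values: Tahiti 1012.0 hPa, Darwin 1010.5 hPa,
# long-term mean difference 0.5 hPa, long-term standard deviation 2.0 hPa
soi_index(1012.0, 1010.5, pdiff_avg = 0.5, pdiff_sd = 2.0)
# Pdiff = 1.5, so SOI = 10 * (1.5 - 0.5) / 2.0 = 5
```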
Time Series – Use Case
Time Series – Use Case
• El Niño and the Fish
• Fish recruitment: a measure of the fish population in the southern hemisphere.

    library(astsa)  # R package with data sets and scripts for time series analysis

    # Southern Oscillation Index (SOI) for a period of 453 months ranging
    # over the years 1950-1987.
    soi = scan("soi.dat")
    soi = ts(soi)

    # Fish recruitment (number of new fish) for a period of 453 months ranging
    # over the years 1950-1987.
    rec = scan("recruit.dat")
    rec = ts(rec)
Time Series – Use Case
• El Niño and the Fish
• Let's take a look at autocorrelation and cross-correlation…
• What do they tell us?

    # Estimate the autocorrelation function (ACF) of REC
    acf(rec)

    # Estimate the partial autocorrelation function (PACF) of REC
    pacf(rec)

    # Estimate the cross-correlation function (CCF) of SOI and REC
    ccf(soi, rec)
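To show how a cross-correlation plot is read, the sketch below builds two simulated series in which y follows x with a delay of six steps and then locates the strongest cross-correlation. The series, the delay of 6 and the coefficient -1.5 are assumptions for illustration, not the SOI and recruitment data.

```r
# Hypothetical example: y depends on x six time steps earlier
set.seed(7)
n <- 400

# Simulate an autocorrelated driver series with 6 extra leading values
x_full <- as.numeric(arima.sim(list(ar = 0.6), n = n + 6))
x      <- x_full[7:(n + 6)]   # x[t]
x_lag6 <- x_full[1:n]         # x[t-6]

# y reacts negatively to x with a delay of 6 steps, plus noise
y <- -1.5 * x_lag6 + rnorm(n, sd = 0.5)

# ccf(x, y) at lag k estimates the correlation between x[t+k] and y[t]
cc <- ccf(x, y, lag.max = 12, plot = FALSE)

# Lag with the largest absolute cross-correlation
peak_lag <- cc$lag[which.max(abs(cc$acf))]
peak_lag
```

A peak at a negative lag means that x leads y. This is the kind of pattern that motivates using past values of one series as predictors of the other.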
Time Series – Use Case
• El Niño and the Fish
• Let's do a visual analysis of SOI and REC

    # Visual correlation analysis: scatterplots of REC against SOI at lags 0 to 10
    lag2.plot(soi, rec, 10)
Time Series – Use Case
• El Niño and the Fish
• Data preparation for setting up a prediction model…

    # Create a table with shifted time series.
    # Keep only the periods for which every series has a value, using 'ts.intersect()'.
    alldata = ts.intersect(rec,
                           reclag1  = lag(rec, -1),
                           reclag2  = lag(rec, -2),
                           soilag5  = lag(soi, -5),
                           soilag6  = lag(soi, -6),
                           soilag7  = lag(soi, -7),
                           soilag8  = lag(soi, -8),
                           soilag9  = lag(soi, -9),
                           soilag10 = lag(soi, -10))

    # Show the table
    alldata
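How lag() and ts.intersect() line up shifted series is easier to see on a tiny example. The five-value series below is made up. The sketch calls stats::lag explicitly, on the assumption that another loaded package (for example dplyr) might otherwise mask lag().

```r
# Hypothetical toy series with time index 1..5
x <- ts(c(10, 20, 30, 40, 50))

# lag(x, -k) shifts the series k steps forward in time,
# so at time t the shifted series holds the value x[t-k]
aligned <- ts.intersect(x,
                        xlag1 = stats::lag(x, -1),
                        xlag2 = stats::lag(x, -2))

# Only time points 3..5 have values in all three series
aligned
```

Each row now holds the current value together with its one-step and two-step predecessors, which is the layout that lm() needs for a regression on lagged values.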
Time Series – Use Case
• El Niño and the Fish
• Build a linear model based on SOI at lags 5 to 10 (into the past)

    # Build a linear model (using the 'lm()' function).
    # 1st try: a multiple regression in which the REC variable is a linear function
    # of (past) lags 5, 6, 7, 8, 9, and 10 of the SOI variable.
    # info: lm(formula, data) -> format for formula := [response] ~ [terms]
    tryit1 = lm(formula = rec ~ soilag5 + soilag6 + soilag7 + soilag8 + soilag9 + soilag10,
                data = alldata)
    summary(tryit1)

    # Diagnostic plots for the fitted model
    plot(tryit1)
Time Series – Use Case
• El Niño and the Fish
• Let's take a look at the model's residuals

    # Plot and print the ACF (autocorrelation function) and PACF (partial ACF)
    # of REC and of the model residuals.
    # info: residuals() is a generic function which extracts model residuals
    # from objects returned by modeling functions.
    acf2(rec)
    acf2(residuals(tryit1))
Time Series – Use Case
• El Niño and the Fish
• PACF: high values at t-1 and t-2 indicate autocorrelation
• Adjust the model and introduce REC at t-1 and t-2…

    # 2nd try: a multiple regression in which the REC variable is a linear function
    # of (past) lags 5, 6, 7, 8, 9, and 10 of the SOI variable + 2 past values of REC.
    tryit2 = lm(formula = rec ~ reclag1 + reclag2 + soilag5 + soilag6 + soilag7 +
                                soilag8 + soilag9 + soilag10,
                data = alldata)
    summary(tryit2)
    acf2(residuals(tryit2))
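The effect of adding lagged values of the response can be sketched on simulated data. In the example below, y depends on its own previous value (assumed coefficient 0.8) and on a predictor x. All names and numbers are assumptions for illustration. The Ljung-Box test from base R (Box.test) is used as a numeric complement to inspecting the ACF plot.

```r
# Hypothetical example: response with autoregressive structure
set.seed(11)
n <- 300
x <- rnorm(n)
y <- numeric(n)
for (t in 2:n) {
  y[t] <- 0.8 * y[t - 1] + 1.0 * x[t] + rnorm(1, sd = 0.5)
}

# Align y[t] with y[t-1] and x[t]
d <- data.frame(y = y[-1], ylag1 = y[-n], x = x[-1])

m1 <- lm(y ~ x, data = d)          # ignores the past of y
m2 <- lm(y ~ ylag1 + x, data = d)  # includes y at t-1

# Lag-1 autocorrelation of the residuals of both models
r1 <- acf(residuals(m1), lag.max = 1, plot = FALSE)$acf[2]
r2 <- acf(residuals(m2), lag.max = 1, plot = FALSE)$acf[2]
c(without_lag = r1, with_lag = r2)

# Ljung-Box test: a small p-value points to autocorrelated residuals
Box.test(residuals(m1), lag = 10, type = "Ljung-Box")
Box.test(residuals(m2), lag = 10, type = "Ljung-Box")
```

The residuals of the first model stay strongly autocorrelated, while those of the second model are close to white noise. This is the pattern one hopes to see after adding the lagged response terms.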
Time Series – Use Case
• El Niño and the Fish
• Can we optimize or simplify the model?
• Remove the variables without significance: SOI at t-7, t-8, t-9 and t-10

    # 3rd try: a multiple regression in which the REC variable is a linear function
    # of only 2 (past) lags, 5 and 6, of the SOI variable + 2 past values of REC.
    tryit3 = lm(formula = rec ~ reclag1 + reclag2 + soilag5 + soilag6,
                data = alldata)
    summary(tryit3)
    acf2(residuals(tryit3))
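Besides reading the p-values in summary(), a reduced model can be compared with the full one by an F-test on the nested models (anova()) or by an information criterion (AIC()). The sketch below uses simulated data in which only lags 5 and 6 of the predictor matter. The variable names and coefficients are assumptions for illustration.

```r
# Hypothetical example: only lags 5 and 6 of x influence y
set.seed(3)
n <- 300
x <- ts(rnorm(n))

# Table of lagged predictors, restricted to periods where all lags exist
lags <- ts.intersect(x5 = stats::lag(x, -5),
                     x6 = stats::lag(x, -6),
                     x7 = stats::lag(x, -7),
                     x8 = stats::lag(x, -8))
d <- as.data.frame(lags)

# Response built from lags 5 and 6 only, plus noise
d$y <- 2 * d$x5 - 1 * d$x6 + rnorm(nrow(d), sd = 0.5)

full  <- lm(y ~ x5 + x6 + x7 + x8, data = d)
small <- lm(y ~ x5 + x6, data = d)

# F-test: do the extra terms in 'full' improve the fit significantly?
anova(small, full)

# Lower AIC indicates the preferred trade-off between fit and complexity
AIC(small, full)
```

If the F-test shows no significant improvement and the AIC of the smaller model is not worse, dropping the extra lags is a defensible simplification.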
Time Series – Use Case
• El Niño and the Fish
• Congratulations!
• We have built a reliable model of the supposed dependency between El Niño and fish recruitment.
Lecture Summary & Homework
Homework
• Take the course "Applied Time Series Analysis" by Pennsylvania State University: https://onlinecourses.science.psu.edu/stat510/node/33
Literature
• Take a look at "Nonlinear time series modelling. An Introduction": https://www.newyorkfed.org/medialibrary/media/research/staff_reports/sr87.pdf
• Take a look at "Nonlinear Time Series, Theory, Methods and Application with R Examples": http://www.stat.pitt.edu/stoffer/nltsa/chs3_9_10.pdf
• Books worth buying…
  • "Time Series Analysis: Forecasting and Control", Box, Jenkins, 5th Ed. 2015: http://www.amazon.com/Time-Analysis-Forecasting-Probability-Statistics/dp/1118675029
  • "New Introduction to Multiple Time Series Analysis": http://www.amazon.com/New-Introduction-Multiple-Time-Analysis/dp/3540262393
Any questions?