Time Series Regression (part 1)
LECTURE 7 | TIME SERIES FORECASTING METHOD
Review
Smoothing methods for non-seasonal time series data:
• Moving Average: SMA, DMA
• Exponential Smoothing: SES, DES
Smoothing methods for seasonal time series data:
• Additive Holt-Winters
• Multiplicative Holt-Winters
Outline
• Review of the regression model
• The independence assumption and the consequences of its violation
• Regression model for time series data sets
Linear Regression??
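Linear Regression
$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$
where $\mathbf{y}$ is the dependent variable, $\mathbf{X}$ holds the independent variable(s), $\boldsymbol{\beta}$ is the vector of coefficients, and $\boldsymbol{\varepsilon}$ is the error term.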
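For a single independent variable, the least-squares estimates can be sketched in a few lines of plain Python. This is only an illustration: the function name `fit_simple_ols` and the five data points are made up and do not come from the lecture.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross-products and squares of deviations from the means
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up data scattered around the line y = 1 + 2x
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.1, 2.9, 5.2, 6.8, 9.0]

b0, b1 = fit_simple_ols(x, y)
# Residuals are the estimated errors: observed minus fitted values
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print(f"Y = {b0:.3f} + {b1:.3f} X")
```

The residuals computed at the end are what the later diagnostics work with, since the true errors are never observed.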
Linear Regression
[Figure: Fitted Line Plot of Y versus X. Fitted line: Y = 2.803 + 1.511 X; S = 0.911075, R-Sq = 95.9%, R-Sq(adj) = 95.9%]
Assumptions on Linear Regression Model
• The relationship between X and Y is linear
• $\varepsilon \sim \text{i.i.d. } N(0, \sigma^2)$
• No multicollinearity
Diagnostics
Serially Correlated Error
$\operatorname{cov}(e_t, e_{t-k}) \neq 0$
where
$e_t$ = error at time $t$
$e_{t-k}$ = error at time $(t-k)$, $k = 1, 2, \dots$
Problems in Linear Regression: Serial Correlation
Positive serial correlation of residuals
The residuals change sign in a gradual oscillation.
Problems in Linear Regression: Serial Correlation
Negative serial correlation of residuals
The residuals bounce between positive and negative, but not randomly.
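The sign of the serial correlation can be read from the sample autocorrelation of the residuals at lag k, which estimates the covariance between the error at time t and the error at time t-k after scaling by the variance. The sketch below is a plain-Python illustration; the function names and both residual series are made up for the example.

```python
def autocovariance(e, k):
    """Sample autocovariance of the series e at lag k (divides by n)."""
    n = len(e)
    mean = sum(e) / n
    return sum((e[t] - mean) * (e[t - k] - mean) for t in range(k, n)) / n


def autocorrelation(e, k):
    """Sample autocorrelation at lag k: autocovariance scaled by the variance."""
    return autocovariance(e, k) / autocovariance(e, 0)


# Made-up residuals that drift gradually from negative to positive and back
gradual = [-2.0, -1.5, -1.0, 0.5, 1.0, 2.0, 1.5, 0.5, -0.5, -0.5]
# Made-up residuals that flip sign every period
bouncing = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

r1_gradual = autocorrelation(gradual, 1)    # positive
r1_bouncing = autocorrelation(bouncing, 1)  # negative
print(r1_gradual, r1_bouncing)
```

A lag-1 value well above zero matches the gradual pattern of positive serial correlation, and a value well below zero matches the bouncing pattern of negative serial correlation.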
Possible Causes of Serially Correlated Errors
1) omitted variables
2) ignoring nonlinearities
3) measurement errors
Consequences of Serially Correlated Errors
1. The OLS estimators are still unbiased and consistent.
2. In large samples, the estimators are still approximately normally distributed.
3. The estimators are no longer efficient, i.e., no longer BLUE (best linear unbiased estimators).
4. The estimated standard errors may be underestimated.
5. Tests based on the t and F distributions may no longer be appropriate.
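Points 1 and 4 can be illustrated with a small simulation. The sketch below makes several assumptions that are not part of the lecture: the errors follow an AR(1) process with coefficient 0.8, the regressor is a simple time trend, and the true line is y = 2 + 0.5x. It compares the standard error that OLS reports with the actual spread of the slope estimates across repeated samples.

```python
import random
import statistics


def ols_slope_and_se(x, y):
    """Slope of the least-squares line and its usual (i.i.d.-based) standard error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = (sse / (n - 2) / sxx) ** 0.5
    return slope, se


def simulate(rho, n=50, reps=500, seed=1):
    """Repeatedly fit OLS to data whose errors follow e_t = rho * e_(t-1) + noise."""
    rng = random.Random(seed)
    x = [float(t) for t in range(n)]  # assumed regressor: a time trend
    slopes, reported = [], []
    for _ in range(reps):
        e = rng.gauss(0.0, 1.0)
        y = []
        for t in range(n):
            e = rho * e + rng.gauss(0.0, 1.0)
            y.append(2.0 + 0.5 * x[t] + e)
        b1, se = ols_slope_and_se(x, y)
        slopes.append(b1)
        reported.append(se)
    return statistics.mean(slopes), statistics.stdev(slopes), statistics.mean(reported)


mean_slope, actual_sd, mean_reported_se = simulate(rho=0.8)
print(f"average slope estimate: {mean_slope:.3f} (true value 0.5)")
print(f"actual spread of the slope estimates: {actual_sd:.4f}")
print(f"average standard error reported by OLS: {mean_reported_se:.4f}")
```

Under these assumptions the slope estimates centre on the true value, while the reported standard error comes out smaller than the actual spread, which is why t and F tests become unreliable.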
Identification of Serially Correlated Errors
• Residual plot
• Durbin-Watson test
• Runs test
• Breusch-Godfrey test
• Etc.
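Two of these checks are simple enough to sketch by hand. The Durbin-Watson statistic is the sum of squared successive differences of the residuals divided by the sum of squared residuals; it lies between 0 and 4, and values near 2 suggest no first-order autocorrelation. The runs test is built on the number of runs, i.e., stretches of consecutive residuals with the same sign. The function names and the two toy series below are made up, and the sketch stops at the statistics without computing critical values or p-values.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: values well below 2 point to positive
    first-order autocorrelation, values well above 2 to negative."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def count_runs(residuals):
    """Number of runs: maximal stretches of residuals sharing the same sign."""
    signs = [e > 0 for e in residuals if e != 0]
    if not signs:
        return 0
    return 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)


# Made-up residuals: one series keeps its sign for long stretches,
# the other flips sign every period
sticky = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]
flipping = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

dw_sticky, dw_flipping = durbin_watson(sticky), durbin_watson(flipping)
runs_sticky, runs_flipping = count_runs(sticky), count_runs(flipping)
print(dw_sticky, runs_sticky)
print(dw_flipping, runs_flipping)
```

Too few runs and a small Durbin-Watson value go together with positive serial correlation; too many runs and a large value go together with negative serial correlation.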
Possible Solutions for the Autocorrelation Problem
• Cochrane-Orcutt
• Hildreth-Lu
• Distributed lag
• Etc.
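As an illustration of the first remedy, the Cochrane-Orcutt procedure can be sketched for one regressor: fit OLS, estimate rho by regressing each residual on the previous one, quasi-difference both variables with that rho, refit, and repeat. Everything concrete below is an assumption made for the example: the helper names, the simulated data (true line y = 1 + 2x with AR(1) errors, rho = 0.7), and the fixed number of iterations.

```python
import random


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def cochrane_orcutt(x, y, iterations=5):
    """Iterative Cochrane-Orcutt estimation for y = b0 + b1 * x with AR(1) errors."""
    b0, b1 = fit_simple_ols(x, y)
    rho = 0.0
    for _ in range(iterations):
        # Residuals of the current fit on the original data
        e = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
        # Regress e_t on e_(t-1) without an intercept to estimate rho
        rho = (sum(e[t] * e[t - 1] for t in range(1, len(e)))
               / sum(v ** 2 for v in e[:-1]))
        # Quasi-difference both variables; the first observation is lost
        x_star = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
        y_star = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
        a0, b1 = fit_simple_ols(x_star, y_star)
        # The transformed intercept equals b0 * (1 - rho)
        b0 = a0 / (1.0 - rho)
    return b0, b1, rho


# Simulated data (assumption): y = 1 + 2x + e, with e_t = 0.7 * e_(t-1) + noise
rng = random.Random(0)
n = 400
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
e, y = 0.0, []
for t in range(n):
    e = 0.7 * e + rng.gauss(0.0, 0.5)
    y.append(1.0 + 2.0 * x[t] + e)

b0, b1, rho = cochrane_orcutt(x, y)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, rho = {rho:.3f}")
```

Hildreth-Lu uses the same quasi-differencing but searches over a grid of rho values instead of estimating rho from the residuals.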
Illustration
Consider the following data set of the number of labour hours and sales (in dollars):

Year  Quarter  Number of labour hours  Sales in dollars
2011  1        126754                  15349829
2011  2        129839                  15629384
2011  3        106872                  15720934
2011  4        123787                  16230984
2012  1        137678                  16809312
2012  2        138279                  16923347
2012  3        109873                  16978434
2012  4        137368                  17203948
2013  1        139823                  17830230
2013  2        138346                  17937463
2013  3        112837                  18074652
2013  4        149870                  18347655
2014  1        147263                  18438749
2014  2        147868                  18604334
2014  3        113897                  18740234
2014  4        149879                  18943340
2015  1        149376                  19276345
2015  2        156982                  19173645
2015  3        123783                  19147234
2015  4        159734                  19842667
2016  1        159734                  20783274
2016  2        169283                  20348753
2016  3        128647                  20873488
2016  4        163467                  20475644

Source: kaggle.com
Illustration
The data set is available at:
https://github.com/raoy/Time-Series-Analysis
Illustration
[Figure: Scatterplot of sales in dollars vs Number of labour hours]
Correlations
Pearson correlation 0.615
P-value 0.001
Illustration
Regression Equation
sales in dollars = 10373187 + 56.8 Number of labour hours

Coefficients
Term                    Coef      SE Coef  T-Value  P-Value  VIF
Constant                10373187  2167627  4.79     0
Number of labour hours  56.8      15.5     3.66     0.001    1

Model Summary
S        R-sq    R-sq(adj)  R-sq(pred)
1317579  37.79%  34.97%     27.52%

Analysis of Variance
Source      DF  Adj SS    Adj MS    F-Value  P-Value
Regression  1   2.32E+13  2.32E+13  13.37    0.001
Error       22  3.82E+13  1.74E+12
Total       23  6.14E+13
Illustration
The residuals are NOT RANDOM!
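The regression of sales on labour hours and the behaviour of its residuals can be rechecked in plain Python. The data are retyped from the illustration table; the variable names are chosen for this sketch. If the table was transcribed faithfully, the slope and correlation should come out close to the reported 56.8 and 0.615.

```python
# Quarterly data, 2011 Q1 to 2016 Q4, retyped from the illustration table
hours = [126754, 129839, 106872, 123787, 137678, 138279, 109873, 137368,
         139823, 138346, 112837, 149870, 147263, 147868, 113897, 149879,
         149376, 156982, 123783, 159734, 159734, 169283, 128647, 163467]
sales = [15349829, 15629384, 15720934, 16230984, 16809312, 16923347,
         16978434, 17203948, 17830230, 17937463, 18074652, 18347655,
         18438749, 18604334, 18740234, 18943340, 19276345, 19173645,
         19147234, 19842667, 20783274, 20348753, 20873488, 20475644]

n = len(hours)
x_bar = sum(hours) / n
y_bar = sum(sales) / n
sxx = sum((x - x_bar) ** 2 for x in hours)
syy = sum((y - y_bar) ** 2 for y in sales)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, sales))

# Least-squares line, Pearson correlation and R-squared
slope = sxy / sxx
intercept = y_bar - slope * x_bar
r = sxy / (sxx * syy) ** 0.5
r_squared = r ** 2

# Residuals in time order and their Durbin-Watson statistic
residuals = [y - (intercept + slope * x) for x, y in zip(hours, sales)]
dw = (sum((residuals[t] - residuals[t - 1]) ** 2 for t in range(1, n))
      / sum(e ** 2 for e in residuals))

print(f"sales = {intercept:.0f} + {slope:.1f} * hours")
print(f"r = {r:.3f}, R-sq = {100 * r_squared:.2f}%")
print(f"Durbin-Watson = {dw:.2f}")
```

The residuals start out negative in the early quarters and end up positive in the later ones, and the Durbin-Watson statistic falls below 2, which is consistent with positive serial correlation.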
Illustration
[Figure: Scatterplot of sales in dollars vs sales in dollars (t-1)]
Sales is HIGHLY CORRELATED with its value at period (t-1)
Illustration
Add the lag of SALES as an independent variable
Regression Equation
sales in dollars = 1111051 + 0.9296 sales in dollars (t-1) + 2.80 Number of labour hours

Coefficients
Term                    Coef     SE Coef  T-Value  P-Value  VIF
Constant                1111051  796218   1.4      0.178
sales in dollars (t-1)  0.9296   0.0546   17.01    0        1.58
Number of labour hours  2.8      4.88     0.57     0.573    1.58

Model Summary
S       R-sq    R-sq(adj)  R-sq(pred)
326161  95.96%  95.56%     94.30%

Analysis of Variance
Source      DF  Adj SS    Adj MS    F-Value  P-Value
Regression  2   5.06E+13  2.53E+13  237.69   0
Error       20  2.13E+12  1.06E+11
Total       22  5.27E+13
Illustration
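The model with lagged sales can be sketched the same way. With two regressors, the least-squares coefficients solve a 2x2 system of normal equations in the centred variables, which is written out explicitly below. The data are retyped from the illustration table and the helper names are made up for this sketch; the first quarter is dropped because it has no previous sales value, leaving 23 observations.

```python
hours = [126754, 129839, 106872, 123787, 137678, 138279, 109873, 137368,
         139823, 138346, 112837, 149870, 147263, 147868, 113897, 149879,
         149376, 156982, 123783, 159734, 159734, 169283, 128647, 163467]
sales = [15349829, 15629384, 15720934, 16230984, 16809312, 16923347,
         16978434, 17203948, 17830230, 17937463, 18074652, 18347655,
         18438749, 18604334, 18740234, 18943340, 19276345, 19173645,
         19147234, 19842667, 20783274, 20348753, 20873488, 20475644]

y = sales[1:]     # sales at time t
lag = sales[:-1]  # sales at time t-1
hrs = hours[1:]   # labour hours at time t
n = len(y)


def mean(v):
    return sum(v) / len(v)


def cross(a, b):
    """Sum of cross-products of deviations from the means."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))


s11, s22, s12 = cross(lag, lag), cross(hrs, hrs), cross(lag, hrs)
s1y, s2y, syy = cross(lag, y), cross(hrs, y), cross(y, y)

# Solve the 2x2 normal equations for the two slope coefficients
det = s11 * s22 - s12 ** 2
b_lag = (s22 * s1y - s12 * s2y) / det
b_hours = (s11 * s2y - s12 * s1y) / det
b0 = mean(y) - b_lag * mean(lag) - b_hours * mean(hrs)

fitted = [b0 + b_lag * l + b_hours * h for l, h in zip(lag, hrs)]
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
r_squared = 1 - sse / syy

print(f"sales = {b0:.0f} + {b_lag:.4f} * sales(t-1) + {b_hours:.2f} * hours")
print(f"R-sq = {100 * r_squared:.2f}%")
```

Once lagged sales enters the model it carries most of the explanatory power, which matches the large rise in R-sq and the insignificant coefficient on labour hours in the output above.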
Chapter Summary
• Assumptions on classical regression modeling
• Consequences of autocorrelated residuals
• Regression modeling for time series data
Another Example
See Section 4.8 of Hyndman and Athanasopoulos (2013): https://www.otexts.org/fpp/4/8
Exercise 1
Suppose a market share data set for a toothpaste product covers 20 periods:

Period  Market share  Price
1       3.63          0.97
2       4.20          0.95
3       3.33          0.99
4       4.54          0.91
5       2.89          0.98
6       4.87          0.90
7       4.90          0.89
8       5.29          0.86
9       6.18          0.85
10      7.20          0.82
11      7.25          0.79
12      6.09          0.83
13      6.80          0.81
14      8.65          0.77
15      8.43          0.76
16      8.29          0.80
17      7.18          0.83
18      7.90          0.79
19      8.45          0.76
20      8.23          0.78

Fit a regression model of market share (Y) on price (X), then investigate the autocorrelation of the residuals.
Exercise 2
Conduct appropriate regression modeling using the following data set, and investigate autocorrelation of the residuals.

Year Sales Advertising Year Sales Advertising
1975 11.7 9.4 1995 18.0 15.9
1976 12.0 9.6 1996 17.9 16.0
1977 12.3 10 1997 18.0 16.3
1978 12.8 10.4 1998 18.2 16.2
1979 13.1 10.8 1999 18.2 16.8
1980 13.6 10.9 2000 18.3 17.3
1981 13.9 11.7 2001 18.6 17.6
1982 14.4 12.2 2002 19.2 18.1
1983 14.7 12.5 2003 19.3 18.3
1984 15.3 12.9 2004 19.5 18.5
1985 15.5 13.0 2005 19.2 18.7
1986 15.8 13.2 2006 19.3 18.9
1987 16.1 13.8 2007 19.5 19.2
1988 16.6 14.2 2008 20.0 20.0
1989 16.9 14.6 2009 20.0 20.0
1990 16.7 14.4 2010 19.9 20.3
1991 16.9 15.0 2011 19.8 20.4
1992 17.4 15.4 2012 19.9 21.0
1993 17.6 15.7 2013 20.2 21.5
1994 17.9 15.9 2014 21.0 22.1
Next Topic…
Regression for Time Series Data Set (part 2)
References
Gujarati, D., McMillan, P. 2011. Econometrics by Example. London: Palgrave Macmillan.
Hyndman, R.J. and Athanasopoulos, G. 2013. Forecasting: Principles and Practice. https://www.otexts.org/fpp/6/2/ [March 21st, 2018]
Paulson, D.S. 2007. Handbook of Regression and Modeling: Applications for the Clinical and Pharmaceutical Industries. Boca Raton: Chapman & Hall.
The handouts are available on the following site:
stat.ipb.ac.id/en
PREPARE
YOUR MID-EXAM