Decision 411: Class 4 - Fuqua School of Businessrnau/Decision411... · Linear Exponential Smoothing (LES) model • It’s also sometimes called “double exponential smoothing, because


Decision 411: Class 4

• Non-seasonal averaging & smoothing models
  – Simple moving average (SMA) model
  – Simple exponential smoothing (SES) model
  – Linear exponential smoothing (LES) model

• Combining seasonal adjustment with non-seasonal smoothing

• Winters’ seasonal smoothing model

Guidelines for future HW writeups
• Presentation should stand on its own (SG files are mainly just for audit trail)
• What’s the bottom line? (forecast, trend, key drivers?)
• Clearly define the variables (units, dates, transformations, etc.) used in the analysis
• Use bullet points for key observations & findings
• Use tables to present key numbers (forecasts & CI’s)
• Embed the most important chart(s), with annotations
• Show where the numbers came from
• Explain your model’s assumptions in layman’s terms

Averaging & smoothing models: today’s topics

Later: ARIMA models

We’ll meet ARIMA later in the course, but briefly, an “ARIMA(p,d,q)” model is like a regression model in which the dependent variable is a d-order difference of the input variable, and the independent variables are p lagged values of the dependent variable (AR terms) and/or q lagged values of the forecast errors (MA terms), plus an optional constant term. Many of the averaging & smoothing models are special cases, e.g., an ARIMA(0,1,1) model is an SES model.

• p = # AR terms (lags of dependent variable)
• d = order of differencing of input variable
• q = # MA terms (lags of errors)

Averaging & smoothing models
• The problem: sometimes nonseasonal (or seasonally adjusted) data appears to be “locally stationary” with a time-varying mean
• The mean (constant) model doesn’t track changes in the mean and has positively autocorrelated errors
• The random walk model may not perform well either in this situation: it “oversteers”, picks up too much “noise” in the data, and yields negatively autocorrelated errors

Example: series “X”
• Mean (constant) model yields positively autocorrelated errors: it doesn’t react to changes in the local mean. RMSE = 121

[Figure: time sequence plot of X with constant-mean forecasts (constant mean = 463.136). Annotation: no reaction to local changes in data]
[Figure: residual autocorrelations for X, constant mean = 463.136, lags 0 to 25. Annotation: strong positive autocorrelation at lag 1]

Example, continued
• Random walk model for series X yields negatively autocorrelated errors: it overreacts to changes. RMSE = 122 … not any better!

[Figure: time sequence plot of X with random walk forecasts. Annotation: over-reaction to local changes in data (always 1 period too late)]
[Figure: residual autocorrelations for X, random walk, lags 0 to 25. Annotation: strong negative autocorrelation at lag 1]

A solution:

• Use a model that averages or “smooths” the recent data to filter out some of the noise and estimate the local mean, such as the Simple Moving Average (SMA) model:

$$\hat{Y}_t = \frac{Y_{t-1} + Y_{t-2} + \cdots + Y_{t-m}}{m}$$

…i.e., just average the last m observed values.
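The SMA forecast above can be sketched in a few lines of Python. This is only an illustration: the function names `sma_forecasts` and `rmse` and the sample numbers are made up for the example and are not part of the course materials or of Statgraphics.

```python
def sma_forecasts(y, m):
    """One-step-ahead SMA forecasts.

    forecasts[t] is the average of the m observations before y[t];
    the first m entries are None because there is not enough history.
    """
    forecasts = [None] * len(y)
    for t in range(m, len(y)):
        forecasts[t] = sum(y[t - m:t]) / m
    return forecasts


def rmse(y, forecasts):
    """Root mean squared error over the periods that have a forecast."""
    errors = [obs - f for obs, f in zip(y, forecasts) if f is not None]
    return (sum(e * e for e in errors) / len(errors)) ** 0.5


series = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0]  # made-up data
fc = sma_forecasts(series, 3)
print(fc[3:])            # [11.0, 12.0, 13.0]
print(rmse(series, fc))  # sqrt(14/3), about 2.16
```

Trying several values of m and comparing the RMSE is the "trial and error" that the discrete parameter m requires.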


Properties of SMA model
• Average age of the data in the forecast is (m+1)/2, which is midway between 1 period old and m periods old:
  – m=3 ⇒ avg. age = 2
  – m=5 ⇒ avg. age = 3
  – m=9 ⇒ avg. age = 5, etc.
• …hence it lags behind turning points by (m+1)/2 periods

$$\hat{Y}_t = \frac{Y_{t-1} + Y_{t-2} + \cdots + Y_{t-m}}{m}$$

Properties of SMA, continued
• Long-term forecasts = horizontal straight line (= simple average of last few values)

• Confidence limits??? No theory!!

• Works well on highly irregular data: no data point receives more weight than others, so it’s relatively robust against “outliers”

• Can also be “tapered” for even greater robustness

Example, continued
• SMA with m=3 (average age=2) yields RMSE=104 (significantly better!) and less negative autocorrelation
• 50% confidence limits are shown in the plot, but don’t trust them: they are based on the assumption of the mean remaining fixed at the latest value

[Figure: time sequence plot of X, simple moving average of 3 terms, with 50% confidence limits (?). Annotation: forecasts lag behind turning point by about 2 periods]
[Figure: residual autocorrelations for X, simple moving average of 3 terms, lags 0 to 25. Annotation: no autocorrelation at lag 1]


Example, continued
• SMA with m=5 (average age=3) yields RMSE=102 (very slightly better), “smoother” forecasts, slight positive autocorrelation in errors

[Figure: time sequence plot of X with SMA (m=5) forecasts, 50% confidence limits shown. Annotation: forecasts lag behind turning point by about 3 periods]
[Figure: residual autocorrelations, lags 0 to 25. Annotation: slight positive autocorrelation at lag 1]


Example, continued
• SMA with m=9 (average age=5) yields RMSE=104 (slightly worse), more positive autocorrelation in errors

[Figure: time sequence plot of X with SMA (m=9) forecasts]
[Figure: residual autocorrelations, lags 0 to 25. Annotation: more positive autocorrelation at lag 1]


Example, continued
• SMA with m=19 (average age=10) yields RMSE=118 (significantly worse), very smooth forecasts, much more positive autocorrelation

[Figure: time sequence plot of X with SMA (m=19) forecasts]
[Figure: residual autocorrelations for X, simple moving average of 19 terms, lags 0 to 25. Annotation: strong positive autocorrelation at lag 1]

Smoothness vs. responsiveness

• Note that the more we smooth the data, the more clearly we see the “signal” stand out.

• But...greater clarity comes at the expense of getting the news later.

• If we want our forecasting model to respond quickly to changes, it will also pick up “false alarms” due to noise in the data.

Conclusions

• For a time series with a randomly varying local mean, the SMA model may outperform both the mean model and the random walk model

• It allows us to “strike a balance” between averaging over too much past data or too little past data.

• However...

Shortcomings of SMA model
• It’s hard to optimize the number of terms (m), because it is a discrete parameter... you must use trial and error.
• Intuitively, you should not equally weight the last m observations when computing the average... it would be better to “discount” the older data in a gradual fashion.
• These observations motivate....

Brown’s Simple Exponential Smoothing

• Let:
  – α = “smoothing constant”
  – S_t = smoothed series at period t

• Recursive smoothing formula:

$$S_t = \alpha Y_t + (1-\alpha) S_{t-1}$$

• Forecast for next period = current smoothed value:

$$\hat{Y}_{t+1} = S_t$$
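A minimal Python sketch of the SES recursion follows, written both as the smoothing recursion and as the "previous forecast plus α times previous error" update. The function names are hypothetical, and the start-up choice (first forecast equals the first observation, so the first error is zero) is an assumption made here for illustration.

```python
def ses_forecasts(y, alpha):
    """SES via the smoothing recursion S_t = alpha*Y_t + (1-alpha)*S_{t-1}.

    forecasts[t] is the forecast of y[t]; the final entry is the forecast
    for the first period after the data ends.
    Assumed start-up: the first forecast equals the first observation.
    """
    forecasts = [y[0]]
    for obs in y:
        forecasts.append(alpha * obs + (1 - alpha) * forecasts[-1])
    return forecasts


def ses_forecasts_error_form(y, alpha):
    """Same model written as: new forecast = old forecast + alpha * error."""
    forecasts = [y[0]]
    for obs in y:
        error = obs - forecasts[-1]
        forecasts.append(forecasts[-1] + alpha * error)
    return forecasts


data = [10.0, 20.0, 20.0]            # made-up data
print(ses_forecasts(data, 0.5))      # [10.0, 10.0, 15.0, 17.5]
print(ses_forecasts(data, 1.0))      # random walk: [10.0, 10.0, 20.0, 20.0]
print(ses_forecasts(data, 0.0))      # constant:    [10.0, 10.0, 10.0, 10.0]
```

The two functions produce the same forecasts, and the α = 1 and α = 0 runs reproduce the random walk and constant models.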

Mathematically equivalent formulas for SES forecasts

• Forecast = interpolation between previous forecast and previous observation:

$$\hat{Y}_{t+1} = \alpha Y_t + (1-\alpha)\hat{Y}_t$$

• Forecast = previous forecast plus fraction α of previous error:

$$\hat{Y}_{t+1} = \hat{Y}_t + \alpha e_t, \qquad e_t = Y_t - \hat{Y}_t$$

• Forecast = previous observation minus fraction 1−α of previous error:

$$\hat{Y}_{t+1} = Y_t - (1-\alpha) e_t$$

Mathematically equivalent formulas for SES forecasts, continued

Last but not least:

• Forecast = exponentially weighted moving average of all past observations:

$$\hat{Y}_{t+1} = \alpha\left[Y_t + (1-\alpha)Y_{t-1} + (1-\alpha)^2 Y_{t-2} + (1-\alpha)^3 Y_{t-3} + \cdots\right]$$

…or in other words, a discounted moving average with a discount factor of 1−α per period

Properties of SES model
• SES uses a smoothing parameter (α) which is continuously variable, so it is easily optimized by least squares
• If α = 1, SES → random walk model
• If α = 0, SES → constant model
• Average age of data in SES forecast is 1/α. Examples:
  – α = 0.5 ⇒ avg. age = 2
  – α = 0.2 ⇒ avg. age = 5
  – α = 0.1 ⇒ avg. age = 10, etc.

Properties of SES, continued
• For a given average age, SES is somewhat superior to SMA because it places relatively more weight on the most recent observation
• Hence it is slightly more “responsive” to changes occurring in the recent past.
• Caveat: it is also more sensitive to recent “outliers” than the SMA model, so it is not so good for messy data.

SMA (m=9) vs. SES (α=0.2)

[Figure: bar chart of SMA weight and SES weight (0 to 0.25) against lag (1 to 14)]

• SMA weights are 1/9 on first 9 lags of Y, zero afterward
• SES weights are larger than SMA weights at first few lags, then gradually decline to zero
• Average age = 5 for both models
• Average age is the center of mass (“balancing point”) of the weight distribution
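The weight comparison and the "center of mass" interpretation of average age can be checked numerically. The sketch below uses hypothetical helper names; the SES weights are truncated at a large number of lags, which is an approximation of the infinite sum.

```python
def sma_weights(m, n_lags):
    """Weight on lag 1..n_lags for an m-term simple moving average."""
    return [1.0 / m if lag <= m else 0.0 for lag in range(1, n_lags + 1)]


def ses_weights(alpha, n_lags):
    """Weight on lag 1..n_lags for SES: alpha * (1 - alpha)**(lag - 1)."""
    return [alpha * (1 - alpha) ** (lag - 1) for lag in range(1, n_lags + 1)]


def average_age(weights):
    """Center of mass of the weight distribution over lags 1, 2, 3, ..."""
    total = sum(weights)
    return sum(lag * w for lag, w in enumerate(weights, start=1)) / total


print(average_age(sma_weights(9, 20)))     # (m + 1) / 2 = 5
print(average_age(ses_weights(0.2, 500)))  # approximately 1 / alpha = 5
print(ses_weights(0.2, 3), 1 / 9)          # SES weights exceed 1/9 at the first lags
```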

Properties of SES, continued
• Long-term forecasts from the basic SES model are a horizontal straight line (no trend, as in random walk and SMA)
• SES = ARIMA(0,1,1), i.e., random walk model (without drift) plus MA=1, which adds a multiple of lag-1 forecast error:

$$\hat{Y}_{t+1} = \underbrace{Y_t}_{\text{random walk}} - \underbrace{(1-\alpha)\,e_t}_{\text{lag-1 error}}$$

Properties of SES, continued

• Exact k-step ahead forecast standard error can be computed using ARIMA theory:

$$SE_{fcst}(k) = SE_{fcst}(1)\,\sqrt{1 + (k-1)\alpha^2}$$

• Note that it increases with k more slowly than for the random walk model, which is the special case α=1:

$$SE_{fcst}(k) = SE_{fcst}(1)\,\sqrt{k}$$

• Hence the SES model assumes the series is “more predictable” than a random walk
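As a quick numerical illustration of how the k-step-ahead standard error grows, here is a small sketch. The function name is hypothetical, and the one-step standard error of 99 is used only as an example input (it echoes the RMSE quoted for series X, which is not exactly the same quantity).

```python
import math


def ses_forecast_se(se_1, k, alpha):
    """k-step-ahead forecast standard error for SES, given the 1-step error se_1."""
    return se_1 * math.sqrt(1 + (k - 1) * alpha ** 2)


# SES with alpha = 0.3 versus the random walk (alpha = 1).
for k in (1, 2, 5, 10):
    print(k, round(ses_forecast_se(99.0, k, 0.3), 1), round(ses_forecast_se(99.0, k, 1.0), 1))
```

With α = 1 the formula reduces to the random walk's square-root-of-k growth, so the SES intervals are never wider than the random walk's for the same one-step error.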

Example, continued
• SES with optimal α=0.3 (average age=3.3) yields RMSE = 99 (best yet, by a small margin), no significant residual autocorrelations
• Don’t worry about an isolated spike at an oddball lag like lag 9: it is probably just due to a pair of large errors separated by 9 periods

[Figure: time sequence plot of X, simple exponential smoothing with alpha = 0.2961]
[Figure: residual autocorrelations for X, simple exponential smoothing with alpha = 0.2961, lags 0 to 25]

SES with constant trend

• A constant linear trend can be added to an SES model by fitting it as an ARIMA(0,1,1) model with constant

• Alas, the ARIMA implementation of SES models can’t be combined with seasonal adjustment in the Forecasting procedure in Statgraphics (although you could seasonally adjust and then fit an ARIMA model in two steps)

SES with constant trend, continued

• A constant exponential trend can be added to SES by using the inflation adjustment option in Statgraphics

• The average percentage growth per period can be estimated from the slope coefficient of a linear trend model or ARIMA(0,1,0)+c model fitted with a natural log transformation

• See video clip #10 for examples

What if the series has a time-varying trend, as well as a time-varying mean?

• Evidently what is needed is an estimate of the local trend as well as the local mean
• This is the motivating idea behind Brown’s Linear Exponential Smoothing (LES) model
• It’s also sometimes called “double exponential smoothing”, because it involves a double application of exponential smoothing

How LES works
• Apply SES once to get a singly-smoothed series S′_t that lags behind the current value by 1/α − 1 periods.*
• Smooth the smoothed series (using same α) to get an even smoother series S″_t that lags behind by 2(1/α − 1) periods
• To forecast the future, extrapolate a line between the two points (t − (1/α − 1), S′_t) and (t − 2(1/α − 1), S″_t)

*Average age relative to next value is 1/α, so age relative to current value is 1/α − 1

LES forecasts from t = 90, α=0.1*

[Figure: time sequence plot of X with the singly-smoothed series S′ and the doubly-smoothed series S″, periods 0 to 100, vertical axis 0 to 800]

1. Draw a horizontal line extending 9 periods back in time from the current value of the singly-smoothed series
2. Draw a horizontal line extending 18 periods back in time from the current value of the doubly-smoothed series
3. Extrapolate a line into the future through the left endpoints of the two horizontal lines

*1/α = 10, so 1/α − 1 = 9

How LES works
• There are two equivalent sets of mathematical formulas for implementing the logic of the LES model

• One set of formulas (I) explicitly computes the current estimates of level and trend in each period

• The other set of formulas (II) merely computes the next forecast from the observed data and forecast errors in the last two periods

LES formulas: I
1. Compute singly smoothed series at period t:
   $$S'_t = \alpha Y_t + (1-\alpha) S'_{t-1}$$
2. Compute doubly smoothed series:
   $$S''_t = \alpha S'_t + (1-\alpha) S''_{t-1}$$
3. Compute the estimated level at period t:
   $$L_t = 2S'_t - S''_t$$
4. Compute the estimated trend at period t:
   $$T_t = \frac{\alpha}{1-\alpha}\,(S'_t - S''_t)$$
5. Finally, the k-step ahead forecast is given by:
   $$\hat{Y}_{t+k} = L_t + kT_t$$

Startup: $S'_1 = S''_1 = Y_1$

LES formulas: II

• Mathematically equivalent formula (requires fewer columns on a spreadsheet):

$$\hat{Y}_{t+1} = 2Y_t - Y_{t-1} - 2(1-\alpha)e_t + (1-\alpha)^2 e_{t-1}$$

• Very important start-up values:

$$\hat{Y}_2 = \hat{Y}_1 = Y_1, \quad \text{hence } e_1 = 0,\; e_2 = Y_2 - Y_1$$

(If you don’t use these start-up values, the early forecasts will gyrate wildly!)
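The equivalence of the two sets of LES formulas can be checked with a short Python sketch. The function names and the sample data are made up for illustration; both functions use the start-up values stated above (S′ and S″ start at the first observation, and the first two forecasts equal the first observation).

```python
def les_forecasts_level_trend(y, alpha):
    """Formulas I: forecast = level + trend from the two smoothed series.

    forecasts[t] is the one-step-ahead forecast of y[t] (0-based); the last
    entry is the forecast for the first period after the data ends.
    """
    s1 = s2 = y[0]                      # start-up: S'_1 = S''_1 = Y_1
    forecasts = [y[0], y[0]]            # forecasts of periods 1 and 2
    for t in range(1, len(y)):
        s1 = alpha * y[t] + (1 - alpha) * s1
        s2 = alpha * s1 + (1 - alpha) * s2
        level = 2 * s1 - s2
        trend = alpha / (1 - alpha) * (s1 - s2)
        forecasts.append(level + trend)
    return forecasts


def les_forecasts_error_form(y, alpha):
    """Formulas II: forecast from the last two observations and errors."""
    forecasts = [y[0], y[0]]            # start-up: first error is zero
    for t in range(1, len(y)):
        e_t = y[t] - forecasts[t]
        e_prev = y[t - 1] - forecasts[t - 1]
        forecasts.append(2 * y[t] - y[t - 1]
                         - 2 * (1 - alpha) * e_t
                         + (1 - alpha) ** 2 * e_prev)
    return forecasts


data = [10.0, 12.0, 15.0, 14.0, 18.0, 21.0]   # made-up data
f1 = les_forecasts_level_trend(data, 0.3)
f2 = les_forecasts_error_form(data, 0.3)
print(max(abs(a - b) for a, b in zip(f1, f2)))  # differs only by rounding error
```

Form II needs only the data, forecast, and error columns, which is why it suits a spreadsheet; form I exposes the level and trend estimates explicitly.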

Example, continued
• LES model is optimized at α=0.16, yielding RMSE=102 (about the same as SES)
• …but the forecast plot shows a decreasing trend due to the local downward trend at end of series
• Confidence intervals also widen more rapidly due to the assumption that the trend may be varying

[Figure: time sequence plot of X, Brown's linear exp. smoothing with alpha = 0.1608]
[Figure: residual autocorrelations for X, Brown's linear exp. smoothing with alpha = 0.1608, lags 0 to 25]

LES vs. SES
• SES assumes only a time-varying level (i.e., a local mean), while LES assumes a time-varying level and trend.
• SES assumes that the series is more predictable than a random walk, while LES assumes it is less predictable.

• LES model is relatively unstable, hence it may be dangerous to extrapolate the local trend very far.

• There are fancier versions of LES that include a “trend-dampening” factor.

LES vs. SES, continued
• In both SES and LES, the smaller the value of α, the more smoothing (i.e., less response to the most recent observation)

• Remember that the “average age” is 1/α in SES model (amount of lag behind turning points).

• In LES model, forecast is based on what was happening between 1/α and 2/α periods ago.

• When fitted to the same series, LES usually has a smaller optimal α than SES.

LES vs. SES, continued
• SES is the most widely used non-seasonal forecasting model.

• It has a sounder underlying theory than the SMA model, and it is computationally convenient to use on hundreds or thousands of parallel time series (e.g., for SKU-level forecasting).

• Its assumption of no trend is often unrealistic, but it is surprisingly robust in practice for short-term forecasts--often better than LES even for series that have trends.

• You can add an exponential trend via the inflation adjustment option.

• You can add a linear trend to an SES model by fitting it as an ARIMA(0,1,1) model with constant--but you can’t combine ARIMA with seasonal adjustment in the Forecasting procedure.

Estimation issues
• Optimization of α is performed by nonlinear least squares (like Excel’s nonlinear solver).
• Nonlinear estimation requires a “search” process whose solution is inexact and may depend on the starting value.
• In Statgraphics, you may notice that the optimal α varies slightly when the model is revisited, because it restarts the estimation from the previous optimum.
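To make the "search" idea concrete, here is a minimal sketch that picks α for SES by brute-force grid search over RMSE. This is not how Statgraphics or Excel's Solver work internally; the grid search, the function names, the zero-first-error start-up, and the 0.0001 grid spacing are all assumptions chosen for illustration.

```python
def ses_rmse(y, alpha):
    """RMSE of one-step-ahead SES forecasts (first forecast = first observation)."""
    forecast = y[0]
    sse = 0.0
    for obs in y:
        error = obs - forecast
        sse += error * error
        forecast += alpha * error      # new forecast = old forecast + alpha * error
    return (sse / len(y)) ** 0.5


def best_alpha(y):
    """Grid search for alpha on 0.0001, 0.0002, ..., 0.9999."""
    grid = [i / 10000 for i in range(1, 10000)]
    return min(grid, key=lambda a: ses_rmse(y, a))


trending = [float(i) for i in range(1, 21)]   # made-up series with a steady trend
print(best_alpha(trending))                   # lands on the upper end of the grid
```

On a steadily trending series the search runs into the upper end of the grid, because every increase in α shrinks the lag of the forecasts behind the trend.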

Estimation issues, continued
• α is constrained to lie between 0.0001 and 0.9999 for SES and LES models.

• If the best SES model is actually a random walk model (α=1), then the estimation algorithm will converge to 0.9999. This will often happen if the series has a significant trend.

• Once α hits its upper bound (0.9999), the estimation may get “stuck” there. Try manually changing the initial value to (say) 0.5 before re-fitting the model if the data sample is changed.

Estimation issues, continued
• Because LES and SES use “recursive” formulas in which each forecast depends on prior errors, their estimation also depends on how they are initialized (i.e., on the “prior errors” that are assumed at the very beginning).

• The usual approach is to just assume that the first error is zero.

• A more sophisticated approach, available as an estimation option in Statgraphics, is to use “backforecasting”* to start up the model.

*We’ll discuss this in more detail later in the course.

Holt’s linear exponential smoothing

• Holt’s model improves on LES by introducing separate smoothing constants for level and trend (“alpha” and “beta”)

• In theory, this allows it to perform more stable trend estimation while adapting to sudden jumps in level

Holt’s model formulas

1. Updated level L_t is an interpolation between the most recent data point and the previous forecast of the level:

$$L_t = \alpha Y_t + (1-\alpha)(L_{t-1} + T_{t-1})$$

where Y_t is the most recent data point and L_{t−1} + T_{t−1} is the forecast of L_t made at period t−1.

2. Updated trend T_t is an interpolation between the change in the estimated level and the previous estimate of the trend:

$$T_t = \beta(L_t - L_{t-1}) + (1-\beta)T_{t-1}$$

where L_t − L_{t−1} is the just-observed change in the level and T_{t−1} is the previous trend estimate.

3. k-step ahead forecast from period t (extrapolation of level and trend from period t):

$$\hat{Y}_{t+k} = L_t + kT_t$$
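Holt's three formulas translate directly into a short loop. The sketch below is illustrative only: the function names are hypothetical, and the start-up values (initial level equal to the first observation, initial trend zero) are an assumption, since the slides do not specify an initialization for Holt's model.

```python
def holt_fit(y, alpha, beta):
    """Run Holt's level and trend updates over y; return the final (level, trend).

    Assumed start-up: level = first observation, trend = 0.
    """
    level, trend = y[0], 0.0
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level, trend


def holt_forecast(level, trend, k):
    """k-step-ahead forecast: extrapolate the current level and trend."""
    return level + k * trend


level, trend = holt_fit([10.0, 12.0], 0.5, 0.5)   # made-up data
print(level, trend)                      # 11.0 0.5
print(holt_forecast(level, trend, 2))    # 12.0
```

A small β, such as the 0.007 estimated for series X, makes the trend update almost ignore each new change in level, which is what "heavy smoothing of trend" means.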

Example, continued
• Holt’s model is optimized at α=0.306, β=0.007, yielding RMSE = 100 (essentially same as SES & LES)
• …but forecast plot shows a slightly increasing local trend at end of series, due to relatively heavy smoothing of trend!

[Figure: time sequence plot of X, Holt's linear exp. smoothing with alpha = 0.3061 and beta = 0.0069]
[Figure: residual autocorrelations for X, Holt's linear exp. smoothing with alpha = 0.3061 and beta = 0.0069, lags 0 to 25]

Model comparisons
• Models B-C-D-E hardly differ on error measures.
• Model choice should also depend on “theoretical” considerations, such as the reasonableness of the trend assumptions

A cautionary word about trend extrapolation

• If you are forecasting more than one period ahead, it is especially important to estimate the trend correctly

• In general, trend assumptions and estimation should be based on everything you know about a time series, not just error statistics of one-period-ahead forecasts or t-stats of slope coefficients


A cautionary word about trend extrapolation

• Extrapolation of time-varying trends estimated by “double smoothing” can be dangerous

• Hence SES (perhaps with fixed trend) often works better in practice

• A trend dampening factor is often used in conjunction with LES or Holt’s:

$$\hat{Y}_{t+k} = L_t + (\phi + \phi^2 + \cdots + \phi^k)\,T_t, \qquad 0 < \phi < 1$$
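The effect of the dampening factor is easy to see numerically. In this sketch the function name and the example level, trend, and φ values are made up for illustration.

```python
def damped_forecast(level, trend, k, phi):
    """k-step-ahead forecast with the trend damped by phi + phi**2 + ... + phi**k."""
    damping = sum(phi ** j for j in range(1, k + 1))
    return level + damping * trend


# With level 100, trend 10, and phi = 0.5, the forecasts flatten out near
# level + trend * phi / (1 - phi) = 110 instead of growing without bound.
for k in (1, 2, 3, 10, 50):
    print(k, damped_forecast(100.0, 10.0, k, 0.5))
```

Because the geometric sum converges, the long-horizon forecast approaches a finite ceiling, which is what makes the damped version safer for extrapolation than an undamped local trend.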

Combining seasonal adjustment with a non-seasonal smoothing model

• Often a seasonally adjusted series looks like a good candidate for fitting with a smoothing or averaging model.

• Hence, you can forecast a seasonal series by a combination of seasonal adjustment and non-seasonal smoothing (or other non-seasonal model).

• This “hybrid” approach allows you to model the seasonal pattern explicitly, but it does not have a solid underlying statistical theory--confidence limits may be dubious.

• There is also some danger of overfitting the seasonal pattern if you don’t have enough seasons of data.


Example of LES + seasonal adjustment on a spreadsheet

The single-equation form of the LES model is easily implemented on a spreadsheet, and Solver can be used to find the value of α that minimizes RMSE.

LES out-of-sample forecasts

The LES model, like any other one-step-ahead forecasting model, can extrapolate its forecasts into the future by “bootstrapping” itself, i.e., by substituting the one-step-ahead forecast for the next data point and then forecasting the next period from there, and so on.
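The bootstrapping procedure can be sketched with the single-equation form of LES. The function name and data are hypothetical, and the start-up (first two forecasts equal to the first observation) follows the LES start-up values given earlier. Past the end of the data, each forecast is fed back in as if it had been observed, so its error is zero.

```python
def les_bootstrap(y, alpha, horizon):
    """Return `horizon` out-of-sample LES forecasts by bootstrapping."""
    values = list(y)                 # observed data, later extended with forecasts
    forecasts = [y[0], y[0]]         # start-up: first two forecasts equal Y_1
    n = len(y)
    for t in range(1, n + horizon - 1):
        if t >= n:
            values.append(forecasts[t])   # substitute the forecast for the missing data point
        e_t = values[t] - forecasts[t]
        e_prev = values[t - 1] - forecasts[t - 1]
        forecasts.append(2 * values[t] - values[t - 1]
                         - 2 * (1 - alpha) * e_t
                         + (1 - alpha) ** 2 * e_prev)
    return forecasts[n:]


future = les_bootstrap([10.0, 12.0, 15.0, 14.0, 18.0, 21.0], 0.3, 4)  # made-up data
steps = [b - a for a, b in zip(future, future[1:])]
print(future)
print(steps)   # equal steps: the bootstrapped forecasts lie on a straight line
```

The equal steps show that bootstrapping reproduces the straight-line extrapolation of the last level and trend estimates.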

[Figure: LES forecasts for seasonally adjusted data, Dec-83 to Dec-94, vertical axis 0 to 500. Series shown: seasonally adjusted data and LES forecast]

• Note that LES lags behind turning points, like all smoothing models…
• …but it tracks the data pretty well during stretches where the trend is consistent…
• …and its out-of-sample forecasts extrapolate the most recent trend

[Figure: re-seasonalized LES forecasts, Dec-83 to Jun-95, vertical axis 0 to 600. Series shown: original series and reseasonalized forecast]

Example: housing starts

• New residential construction since 1983
• Original data (not seasonally adjusted)
• Series displays strong seasonality as well as cyclicality
• Note the last observation…

[Figure: time series plot for HousesNSA, 1/83 to 1/03, vertical axis 39 to 139]

Seasonally adjusted data

• After seasonal adjustment, variations in level and trend are clearer
• In seasonally adjusted terms, the last observation is abnormally large! (This abnormality was not so apparent on the unadjusted graph!)
• How will different models react to it?

[Figure: time series plot for SADJUSTED, 1/83 to 1/03, vertical axis 54 to 134]

Nonseasonal forecasting model fitted to adjusted data: RW+drift

• Depending on the kind of long-term trend assumptions we feel are appropriate, we could fit the seasonally adjusted series with a non-seasonal model such as a random walk with drift...
• This model extrapolates the long-term trend from the most recent (higher) level

[Figure: time sequence plot for SADJUSTED, random walk with drift = 0.139171, 1/83 to 1/08, showing actual, forecast, and 50.0% limits]

Nonseasonal forecasting model fitted to adjusted data: SES

• …or a simple exponential smoothing model...
• This model extrapolates a flat trend from an exponentially-weighted average of recent levels

[Figure: time sequence plot for SADJUSTED, simple exponential smoothing with alpha = 0.4682, 1/83 to 1/08]

Nonseasonal forecasting model fitted to adjusted data: Brown’s LES

• …or Brown’s linear exponential smoothing model...
• This model tries to extrapolate the recent trend, which is jerked upward by the last observation

[Figure: time sequence plot for SADJUSTED, Brown's linear exp. smoothing with alpha = 0.2352, 1/83 to 1/08]

Nonseasonal forecasting model fitted to adjusted data: Holt’s LES

• …or Holt’s linear exponential smoothing model...
• This model also tries to extrapolate the recent trend, but the trend estimate is more conservative due to small “beta” (heavy smoothing)

[Figure: time sequence plot for SADJUSTED, Holt's linear exp. smoothing with alpha = 0.4765 and beta = 0.015, 1/83 to 1/08]

Hybrid seasonal models in SG
• You can fit hybrid models in the Forecasting procedure in Statgraphics by selecting “multiplicative seasonal adjustment” in conjunction with a RW or SES or LES model type.

• The forecasts are automatically “reseasonalized” in the plots and model comparison statistics

• Be on guard against overfitting: seasonal adjustment adds many parameters to the model, and estimation period statistics may not be fully adjusted to correct for additional parameters.

Hybrid seasonal models

RW + seasonal adjustment

• Here’s the result of fitting the RW-with-drift model with multiplicative seasonal adjustment
• Note sharply raised forecasts, driven by unusual seasonally adjusted value of last data point

[Figure: time sequence plot for HousesNSA, random walk with drift = 0.142988, 1/83 to 1/08, vertical axis 50 to 150]

SES + seasonal adjustment

• Here’s the result of fitting the SES model with multiplicative seasonal adjustment
• More conservative (though still raised) forecasts, tighter confidence limits

[Figure: time sequence plot for HousesNSA, simple exponential smoothing with alpha = 0.4617, 1/83 to 1/08, vertical axis 50 to 150]

Brown’s LES + seasonal adjustment

• Here’s the result of fitting the LES model with multiplicative seasonal adjustment
• Forecasts march steeply upward, confidence limits are rather wide

[Figure: time sequence plot for HousesNSA, Brown's linear exp. smoothing with alpha = 0.2365, 1/83 to 1/08, vertical axis 50 to 150]

Holt’s LES + seasonal adjustment

• Here’s the result of fitting Holt’s model with multiplicative seasonal adjustment
• Forecasts start from a higher level with a flatter trend than LES, but confidence limits are rather optimistic

[Figure: time sequence plot for HousesNSA, Holt's linear exp. smoothing with alpha = 0.4667 and beta = 0.0144, 1/83 to 1/08, vertical axis 50 to 150]

Linear trend + seasonal adjustment (?)

• Just for fun, here’s a linear trend model with multiplicative seasonal adjustment
• Obviously not appropriate!

[Figure: time sequence plot for HousesNSA, linear trend = 76.7875 + 0.0262053 t, 1/83 to 1/08, vertical axis 50 to 150]

• Model comparison report shows that SES and Holt’s do the best in estimation period, although RW model is slightly “luckier” in validation period (last 4 years of data were held out)
• Residual plots for SES model show stable variance, no significant autocorrelation… model appears “OK”

[Figure: residual plot for adjusted HousesNSA, simple exponential smoothing with alpha = 0.4594, 1/83 to 1/03, residuals from -18 to 22]
[Figure: residual autocorrelations for adjusted HousesNSA, simple exponential smoothing with alpha = 0.4594, lags 0 to 25]

• Even the (vertical) probability plot looks good.* This is a “pane option” behind the “residual plots”.

[Figure: normal probability plot of residuals for adjusted HousesNSA, simple exponential smoothing with alpha = 0.4594, residuals from -18 to 22, proportion from 0.1 to 99.9]

*This result validates the use of normal distribution theory to compute the confidence intervals from the forecast standard errors.

What’s the best forecast?
• The main issue here is what to infer from the recent jump in seasonally adjusted housing starts.

• Our modeling results do not really answer this question for us—they merely show the consequences of different assumptions we may wish to make.

• Ideally, “domain knowledge” should shed additional light on the appropriateness of the assumptions.

• The SES model is clearly the most “conservative” choice, because its forecasts are less radically affected by one recent observation.

Winters’ Seasonal Smoothing
• The logic of Holt’s model can be extended to recursively estimate time-varying seasonal indices as well as level and trend.

• Let Lt, Tt, and St denote the estimated level, trend, and seasonal index at period t.

• Let s denote the number of periods in a season.

• Let α, β, and γ denote separate smoothing constants* for level, trend, and seasonality

*numbers between 0 and 1: smaller values → more smoothing

Winters’ model formulas

1. Updated level L_t is an interpolation between the seasonally adjusted value of the most recent data point and the previous forecast of the level:

$$L_t = \alpha\,\frac{Y_t}{S_{t-s}} + (1-\alpha)(L_{t-1} + T_{t-1})$$

where Y_t / S_{t−s} is the seasonally adjusted value of Y_t and L_{t−1} + T_{t−1} is the forecast of L_t made at period t−1.

2. Updated trend T_t is an interpolation between the change in the estimated level and the previous estimate of the trend:

$$T_t = \beta(L_t - L_{t-1}) + (1-\beta)T_{t-1}$$

where L_t − L_{t−1} is the just-observed change in the level and T_{t−1} is the previous trend estimate.

3. Updated seasonal index S_t is an interpolation between the ratio of the data point to the estimated level and the previous estimate of the seasonal index:

$$S_t = \gamma\,\frac{Y_t}{L_t} + (1-\gamma)S_{t-s}$$

where Y_t / L_t is the “ratio to moving average” of the current data point and S_{t−s} is the last estimate of the seasonal index in the same season.

4. k-step ahead forecast from period t:

$$\hat{Y}_{t+k} = (L_t + kT_t)\,S_{t-s+k}$$

where L_t + kT_t is the extrapolation of level and trend from period t and S_{t−s+k} is the most recent estimate of the seasonal index for the kth period in the future.
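The four Winters formulas can be sketched as a single update loop. This is a minimal illustration, not the Statgraphics implementation: the function names are hypothetical, and the start-up is the naive scheme (initial level = first data point, trend = 0, seasonal indices = 1.0), which is an assumption chosen for simplicity.

```python
def winters_fit(y, s, alpha, beta, gamma):
    """Run the Winters updates over y with s periods per season.

    Returns the final level, trend, and the latest seasonal index for each
    position in the season (indexed by t % s, with t counted from 0).
    Assumed naive start-up: level = y[0], trend = 0, all indices = 1.0.
    """
    level, trend = y[0], 0.0
    seasonal = [1.0] * s
    for t in range(1, len(y)):
        old_index = seasonal[t % s]          # S_{t-s}: last index for this season
        prev_level = level
        level = alpha * (y[t] / old_index) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % s] = gamma * (y[t] / level) + (1 - gamma) * old_index
    return level, trend, seasonal


def winters_forecast(level, trend, seasonal, n, k):
    """k-step-ahead forecast after fitting n observations."""
    return (level + k * trend) * seasonal[(n - 1 + k) % len(seasonal)]


data = [10.0, 20.0]                               # made-up data, s = 2
lv, tr, idx = winters_fit(data, 2, 0.5, 0.5, 0.5)
print(lv, tr, idx)                                # 15.0 2.5 [1.0, 1.1666...]
print(winters_forecast(lv, tr, idx, len(data), 1))  # 17.5
```

With such a crude start-up the early seasonal indices are poor, which illustrates why initialization of this model takes care.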

Estimation issues

• Estimation of Winters’ model is tricky, and not all software does it well: sometimes you get crazy results.

• There are three separate smoothing constants to be jointly estimated by nonlinear least squares (α, β, γ).

• Initialization is also tricky, especially for the seasonal indices.

Estimation issues
• Some common initialization schemes:

– Naïve approach: set initial level = 1st data point, trend = 0, seasonal indices = 1.0

– More sophisticated: perform a seasonal decomposition to obtain initial seasonal indices & fit trend line to obtain initial trend

– Even more sophisticated: use backforecasting

• Calculation of confidence intervals is also complicated & not always done correctly.

Results of fitting Winters’ model

• In this case, the Winters forecasts & confidence intervals look similar to those of the Holt’s model with seasonal adjustment (alpha and beta are very similar, as should be expected)

[Figure: Winters’ model fitted to housing starts. Time sequence plot for HousesNSA, Winters’ exp. smoothing with alpha = 0.4454, beta = 0.0146, gamma = 0.2843, 1/83 to 1/08, vertical axis 50 to 150]

• Model comparison report shows that Winters’ fits a little less well than SES or Holt’s model, but is otherwise “OK”

Winters’ model in practice
• The Winters model is popular in “automatic forecasting” software, because it has a little of everything (level, trend, seasonality).

• Sometimes it works well, but difficulties in initialization & estimation can lead to strange results in other cases.

• In principle it is similar to linear exponential smoothing and can produce similarly unstable long-term trend projections.

What really happened in last 5 years?

• All models overpredicted housing starts for the rest of 2002 and 2003, over-responding to the Feb. ’02 jump, but later values were in the middle range of predictions until the recent plunge

[Figure: forecasts from the RW+drift, SES, LES, Holt, and Winters models plotted against actual values by date, 2002 to 2007, vertical axis 70 to 220]

Class 4 recap
• Averaging and smoothing models enable you to estimate time-varying levels and trends.

• SMA, SES, and LES models can be combined with seasonal adjustment to forecast seasonal data (...but beware of changing seasonal patterns and possibility of overfitting)

• Winters’ estimates time-varying seasonal indices.

• You need to exercise judgment in model selection in order to make appropriate assumptions about changing levels and trends & unusual events.
