Visualizing functional data Forecasting functional data Forecasting seasonal univariate time series Conclusion
Visualizing and forecasting functional time series
Han Lin Shang
Department of Econometrics and Business Statistics
Outline

1 Visualizing functional time series.
2 Modeling and forecasting functional time series.
3 Modeling and forecasting seasonal univariate time series via a functional approach.
4 Empirical analyses of estimation, modeling and forecasting techniques, with no theoretical proofs.
Aim of the first paper
Introduce three visualization methods
1 rainbow plot
2 functional bagplot
3 functional highest density region (HDR) boxplot
Functional bagplot and functional HDR boxplot can detect outliers.
Overview of functional data

1 A collection of functions, represented by curves, surfaces, shapes or images.
2 Some applications include:
  - Age-specific mortality and fertility rates (Hyndman and Ullah, 2007)
  - Term-structured yield curves (Kargin and Onatski, 2008)
  - Spectrometry data (Reiss and Ogden, 2007)
  - El Niño data (Ferraty and Vieu, 2006)
Visualizing functional data

- Helps discover characteristics that might not be apparent from mathematical models and summary statistics.
- Yet visualization has so far played only a minor role in functional data analysis.
Some visualization methods
1 Phase-plane plot
2 Rug-plot
3 Singular value decomposition plot
Rainbow plot

1 A simple plot of all the data, with the added feature of a rainbow color palette based on an ordering of the functional data.
2 Functional data can be ordered by depth or by density.
Example of rainbow plot

Annual age-specific mortality curves for French males between 1899 and 2005.
Multivariate principal component analysis

1 PC1 is calculated by maximizing the variance of φ_1X′, that is,
  φ̂_1 = argmax_{‖φ_1‖=1} var(φ_1X′) = argmax_{‖φ_1‖=1} φ_1X′Xφ_1′.
2 Successive PCs are obtained iteratively by subtracting the first k PCs from X:
  X_k = X_{k−1} − X_{k−1}φ_k′φ_k.
3 Treat X_k as the new data matrix and find φ_{k+1} by maximizing the variance of φ_{k+1}X_k′, subject to ‖φ_{k+1}‖ = (Σ_{j=1}^p φ_{k+1,j}²)^{1/2} = 1 and φ_{k+1} ⊥ φ_j, j = 1, …, k.
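The deflation procedure above can be sketched numerically. A minimal NumPy sketch (the function name `pca_by_deflation` is illustrative, not from the paper): each φ_k is the leading right singular vector of the current residual matrix, and the fitted rank-one part is subtracted before the next component is extracted.

```python
import numpy as np

def pca_by_deflation(X, K):
    """Sequentially extract K principal components by deflation.

    X is an (n x p) mean-centred data matrix. Each component phi_k
    (a unit p-vector) maximises the variance of the scores X @ phi_k;
    after extraction the fitted rank-one part is subtracted from X.
    """
    Xk = X.copy()
    components = []
    for _ in range(K):
        # The variance maximiser is the leading right singular vector of Xk.
        _, _, Vt = np.linalg.svd(Xk, full_matrices=False)
        phi = Vt[0]
        components.append(phi)
        # Deflate: X_k = X_{k-1} - X_{k-1} phi' phi
        Xk = Xk - np.outer(Xk @ phi, phi)
    return np.array(components)

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 6))
X -= X.mean(axis=0)
Phi = pca_by_deflation(X, 3)
print(np.round(Phi @ Phi.T, 6))   # components come out orthonormal
```

With distinct singular values, deflation recovers the same components as a single SVD, which makes the sketch easy to check.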
Properties of functional principal component analysis

                 PCA                                              FPCA
Variables        X = [x_1, …, x_p],                               f(x) = [f_1(x), …, f_n(x)],
                 x_i = [x_{1i}, …, x_{ni}]′, i = 1, …, p          x ∈ [x_1, x_p]
Data             Vectors ∈ R^p                                    Curves ∈ L²[x_1, x_p]
Covariance       Matrix V = Cov(X) ∈ R^{p×p}                      Bounded operator T: L²[x_1, x_p] → L²[x_1, x_p]
Eigenstructure   Vector ξ_k ∈ R^p, Vξ_k = λ_kξ_k,                 Function ξ_k(x) ∈ L²[x_1, x_p],
                 for 1 ≤ k < min(n, p)                            ∫_{x_1}^{x_p} Tξ_k(x)dx = λ_kξ_k(x), for 1 ≤ k < n
Components       Random variables in R^p                          Random variables in L²[x_1, x_p]
Bivariate and functional bagplots

1 Apply robust functional principal component analysis (FPCA) to {y_t(x)} and obtain the first two PC scores.
2 The bivariate PC scores are then ordered by Tukey's halfspace location depth and plotted in a bivariate bagplot.
3 Map the features of the bivariate bagplot back into the functional space.
Bivariate and functional HDR boxplots
1 Compute a bivariate kernel density estimate on the first two robust PC scores.
2 Apply the bivariate HDR boxplot.
3 Map the features of the HDR boxplot back into the functional space.
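Steps 1 and 2 can be sketched with SciPy's `gaussian_kde`; the helper name `hdr_outliers` and the toy scores are illustrative, not from the paper. A point lies outside the α highest-density region when its estimated density falls below the corresponding density quantile.

```python
import numpy as np
from scipy.stats import gaussian_kde

def hdr_outliers(scores, alpha=0.99):
    """Indices of points outside the alpha highest-density region.

    scores: (n x 2) array of the first two PC scores. A point is outside
    the alpha-HDR when its estimated density is below the (1 - alpha)
    quantile of the densities evaluated at all data points.
    """
    kde = gaussian_kde(scores.T)       # bivariate kernel density estimate
    dens = kde(scores.T)
    cutoff = np.quantile(dens, 1 - alpha)
    return np.where(dens < cutoff)[0]

rng = np.random.default_rng(1)
scores = rng.standard_normal((200, 2))
scores[0] = [9.0, 9.0]                 # one artificial outlying year
print(hdr_outliers(scores))
```

The isolated point receives the lowest estimated density and is flagged, which is the mechanism the HDR boxplot uses to highlight outlying years.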
Example of El Nino data
Average monthly sea surface temperatures (in °C) from January 1951 to December 2007.

[Figure: rainbow plot of sea surface temperature against month.]
Rainbow plots ordered by depth and density
[Figure: rainbow plots of the sea surface temperature curves ordered by depth (left) and by density (right); month on the horizontal axis, sea surface temperature on the vertical axis.]
Outlier detection by bagplots
[Figure: bivariate bagplots of the first two PC scores together with the corresponding functional bagplots. For the French male log mortality rates (log mortality rate against age), the highlighted outlying years are 1914–1919, 1940 and 1943–1944; for the El Niño sea surface temperatures (temperature against month), the highlighted outlying years are 1982–1983 and 1997–1998.]
Outlier detection by HDR boxplots
[Figure: bivariate HDR boxplots of the first two PC scores together with the corresponding functional HDR boxplots. The same outlying years are highlighted: 1914–1919, 1940 and 1943–1944 for the French male log mortality rates, and 1982–1983 and 1997–1998 for the El Niño sea surface temperatures.]
Other outlier detection methods
1 Uses the notion of functional depth and calculates a likelihood ratio test statistic for each curve.
2 A curve is an outlier if the maximum of the test statistics exceeds a given critical value.
3 After removing the outlier, the remaining data are tested again.
Integrated squared error
1 Utilizes robust FPCA. The integrated squared error for each curve is
  ∫_{x_1}^{x_p} e_t²(x)dx = ∫_{x_1}^{x_p} [y_t(x) − μ(x) − Σ_{k=1}^K β_{t,k}φ_k(x)]² dx.
2 A high integrated squared error indicates a high likelihood that the curve is an outlier.
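On a discretised grid the integral can be computed directly. A sketch under stated assumptions: curves, mean, components and scores are given as arrays, the grid is uniform, and the integral is approximated by the trapezoid rule (the function name is illustrative).

```python
import numpy as np

def integrated_squared_errors(Y, mu, Phi, B, x):
    """ISE_t = integral of [y_t(x) - mu(x) - sum_k beta_{t,k} phi_k(x)]^2 dx.

    Y: (n x p) discretised curves, mu: (p,) mean curve,
    Phi: (K x p) components, B: (n x K) scores, x: (p,) uniform grid.
    """
    resid = Y - mu - B @ Phi                  # e_t(x_i) for every curve
    sq = resid ** 2
    dx = x[1] - x[0]
    # trapezoid rule: full weight inside, half weight at the endpoints
    return (sq.sum(axis=1) - 0.5 * (sq[:, 0] + sq[:, -1])) * dx

x = np.linspace(0.0, 1.0, 101)
mu = np.sin(2 * np.pi * x)
phi = np.sqrt(2) * np.cos(2 * np.pi * x)      # one component
rng = np.random.default_rng(2)
beta = rng.standard_normal((20, 1))
Y = mu + beta @ phi[None, :] + 0.01 * rng.standard_normal((20, 101))
Y[7] += 0.5                                   # shift one curve upwards
ise = integrated_squared_errors(Y, mu, phi[None, :], beta, x)
print(int(np.argmax(ise)))                    # the shifted curve stands out
```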
Robust Mahalanobis distance method
1 Discretize functional data on an equally spaced dense grid.
2 The squared robust Mahalanobis distance is defined by
  r_t = [y_t(x_i) − μ(x_i)]′Σ⁻¹[y_t(x_i) − μ(x_i)], i = 1, …, p, t = 1, …, n.
3 Outliers have squared robust Mahalanobis distances greater than χ²_{0.99,p}.
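A sketch of the distance rule, with one simplifying assumption: the plain sample mean and covariance stand in for the robust estimators (e.g. minimum covariance determinant) that the method actually calls for, and the function name is illustrative.

```python
import numpy as np
from scipy.stats import chi2

def mahalanobis_outliers(Y, q=0.99):
    """Squared Mahalanobis distances of discretised curves, chi-square cutoff.

    For illustration the plain sample mean and covariance stand in for
    robust estimators; Y is an (n x p) matrix of discretised curves.
    """
    n, p = Y.shape
    mu = Y.mean(axis=0)
    Sigma = np.cov(Y, rowvar=False)
    Sinv = np.linalg.pinv(Sigma)
    d = Y - mu
    # quadratic form r_t = d_t' Sigma^{-1} d_t for every curve at once
    r = np.einsum('ij,jk,ik->i', d, Sinv, d)
    return r, np.where(r > chi2.ppf(q, df=p))[0]

rng = np.random.default_rng(3)
Y = rng.standard_normal((200, 4))
Y[5] += 10.0                      # one grossly outlying curve
r, flagged = mahalanobis_outliers(Y)
print(flagged)
```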
Outlier detection comparison of mortality data
Method                        Outliers detected
Functional depth              None
Integrated squared error      1914–1918, 1940, 1943–1945
Functional bagplot            1914–1919, 1940, 1943–1944
Functional HDR boxplot        1914–1919, 1940, 1943–1944
Robust Mahalanobis distance   1914–1918, 1940, 1944

Table: The outliers are 1914–1919, 1940, 1943–1944.
Outlier detection comparison of El Nino data
Method                        Outliers detected
Functional depth              1983, 1997
Integrated squared error      1973, 1982–1983, 1997–1998
Functional bagplot            1982–1983, 1997–1998
Functional HDR boxplot        1982–1983, 1997–1998
Robust Mahalanobis distance   1982–1983, 1997–1998

Table: The outliers are 1982–1983, 1997–1998.
Conclusion of the first paper
1 Three graphical methods to visualize functional data.
2 Functional bagplots and HDR boxplots can detect outliers.
3 One limitation is that only the first two principal component scores are considered.
4 The probability level used to flag outliers needs to be pre-chosen.
Possible extension
1 FPCA can be replaced by other dimension reduction techniques.
2 Other ways of ordering functional data, or of determining the functional median or mode.
3 Tukey's location depth can be replaced by other depth measures.
4 Extend from two-dimensional curves to three-dimensional images.
Aim of the second paper
1 A new functional data analytic tool for forecasting age-specific mortality and fertility rates.
2 Mortality rate forecasting is vital for planning insurance and pension policies.
3 Fertility rate forecasting is important for planning child care policy.
Australian fertility data set
Annual Australian fertility rates (1921–2006) for ages 15 to 49. These are defined as the number of live births during the calendar year, according to the age of the mother, per 1000 of the female resident population of the same age at 30 June.
French female mortality data set
Annual French female mortality rates (1899–2005) for single years of age. These are simply the ratio of death counts to population exposure in the relevant interval of age and time.
Modeling step
1 Smooth the data for each year using a nonparametric smoothing method to estimate f_t(x) for x ∈ [x_1, x_p] from {x_i, y_t(x_i)}, i = 1, 2, …, p.
2 Decompose the realized curves via FPCA:
  y_t(x) = μ(x) + Σ_{k=1}^K β_{t,k}φ_k(x) + e_t(x) + σ_t(x)η_t,   (1)
  where
  - μ(x) is the mean function;
  - {φ_1(x), …, φ_K(x)} are the functional principal components, which are assumed to be fixed;
  - {β_{t,1}, …, β_{t,K}} are the uncorrelated principal component scores satisfying Σ_{k=1}^K β_{t,k}² < ∞;
  - e_t(x) is the estimated model residual function;
  - σ_t(x)η_t takes into account heterogeneity, with η_t ∼ N(0, 1);
  - K is the number of functional principal components.
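On a discretised grid, the empirical version of decomposition (1) (without the heterogeneity term) can be computed with a singular value decomposition. A sketch with illustrative names, using unsmoothed toy curves:

```python
import numpy as np

def fpca(Y, K):
    """Empirical FPCA of discretised curves via the SVD.

    Returns the mean mu, components Phi (K x p, orthonormal on the grid)
    and scores B (n x K) so that Y is approximately mu + B @ Phi.
    """
    mu = Y.mean(axis=0)
    U, s, Vt = np.linalg.svd(Y - mu, full_matrices=False)
    Phi = Vt[:K]
    B = U[:, :K] * s[:K]     # scores; columns are mutually orthogonal
    return mu, Phi, B

rng = np.random.default_rng(4)
t = np.linspace(0, 1, 50)
Y = (np.sin(2 * np.pi * t)
     + rng.standard_normal((30, 1)) * np.cos(2 * np.pi * t)
     + 0.05 * rng.standard_normal((30, 50)))
mu, Phi, B = fpca(Y, K=2)
recon = mu + B @ Phi
print(np.abs(Y - recon).max())   # small: two components capture the structure
```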
Forecasting step
1 Model and forecast the coefficients {β_{1,k}, …, β_{n,k}}, k = 1, …, K, via univariate time series models.
2 Use the forecast coefficients with (1) to obtain forecasts of f_{n+h}(x), where h is the forecast horizon.
3 The estimated variances of the error terms in (1) are used to compute prediction intervals.
Weighted mean function
1 The mean function μ(x) is estimated by the weighted average
  μ*(x) = Σ_{t=1}^n w_t f_t(x),
  where f_t(x) is the smoothed curve estimated from y_t(x), and w_t = κ(1 − κ)^{n−t} is a geometrically decreasing weight with 0 < κ < 1.
2 f_t*(x) = f_t(x) − μ*(x) are the de-centralized functional curves; let G = Wf*(x), where W = diag(w_1, …, w_n) is a diagonal weight matrix.
3 Apply the singular value decomposition G = UDV′, where φ_k(x_i*) is the (i, k)th element of V.
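The weights and the weighted decomposition can be sketched as follows. Function names are illustrative; following the slide, the weights sum to 1 − (1 − κ)^n (approximately 1) rather than being renormalised.

```python
import numpy as np

def geometric_weights(n, kappa):
    """w_t = kappa (1 - kappa)^(n - t), t = 1..n: recent curves weigh most."""
    t = np.arange(1, n + 1)
    return kappa * (1 - kappa) ** (n - t)

def weighted_mean_and_components(F, kappa, K):
    """Weighted mean curve and first K weighted principal components.

    F: (n x p) matrix of smoothed curves f_t(x). G = W f*(x) with
    W = diag(w_1, ..., w_n); the components come from the SVD G = U D V'.
    """
    w = geometric_weights(F.shape[0], kappa)
    mu_star = w @ F                    # weights sum to 1 - (1 - kappa)^n
    G = np.diag(w) @ (F - mu_star)
    _, _, Vt = np.linalg.svd(G, full_matrices=False)
    return mu_star, Vt[:K]

w = geometric_weights(50, kappa=0.3)
print(w[-1], w.sum())                  # most recent weight is kappa; total near 1
```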
Weighted functional principal components
1 The weighted functional principal component decomposition is
  y_t(x) = μ*(x) + Σ_{k=1}^K β_{t,k}φ_k*(x) + e_t(x) + σ_t(x)η_t.
2 Since the scores {β_{t,1}, …, β_{t,K}} are uncorrelated, they can be forecast using univariate time series models.
3 Conditioning on the observations I and the set of fixed weighted functional principal components Φ* = {φ_1*(x), …, φ_K*(x)}, the h-step-ahead forecast of y_{n+h}(x) is
  ŷ_{n+h|n}(x) = E[y_{n+h}(x)|I, Φ*] = μ*(x) + Σ_{k=1}^K β̂_{n+h|n,k}φ_k*(x),
  where β̂_{n+h|n,k} denotes the h-step-ahead forecast of β_{n+h,k}.
Selection of weight parameter
κ can be determined by minimizing the mean integrated squared forecast error (MISFE)
  MISFE(h) = ∫_{x_1}^{x_p} [y_{n+h}(x) − ŷ_{n+h|n}(x)]² dx,
over a set of grid points of κ.
Selection of number of components
The optimal number of components is determined by minimizing the MISFE.
Australian fertility rates
K    FPCA      FPCAw     RW
1    99.0611   16.7304
2    56.3095    3.3019
3    24.9330    3.2580
4    15.6845    3.1995
5     4.4495    3.2132
6     3.4310    3.2123   4.9800

Table: MSE: Australian fertility rates.
French female mortality rates
K    FPCA     FPCAw    RW
1    0.5956   0.0293
2    0.0537   0.0310
3    0.0316   0.0310
4    0.0296   0.0311
5    0.0287   0.0311
6    0.0425   0.0311   0.0437

Table: MSE (×1000): French female log mortality rates.
Conclusion of the second paper
1 Proposed a weighted FPCA to forecast age-specific fertility and mortality rates.
2 Compared point forecast accuracy between the unweighted and weighted FPCA.
3 The weighting idea can be extended to other dimension reduction techniques, such as functional partial least squares regression.
Aim of the third paper
1 Sea surface temperature (SST) is rising.
2 Rising sea surface temperatures increase the intensity of natural disasters, such as hurricanes and storms.
3 Provide a better, multivariate and nonparametric way of modeling and predicting sea surface temperature.
El Nino data set
1 Average monthly sea surface temperature from January 1950 to December 2008, available online at www.cpc.noaa.gov/data/indices/sstoi.indices.
2 Sea surface temperatures are measured by moored buoys in the "Niño region" defined by the coordinates 0–10° South and 90–80° West.
Univariate graphical display
[Figure: monthly sea surface temperature plotted as a single univariate time series, 1950–2010.]
Functional graphical display
[Figure: the same data plotted as a functional time series, one curve per year; month on the horizontal axis, sea surface temperature on the vertical axis.]
Functional time series analysis
- Let {Z_w, w ∈ [1, N]} be a seasonal time series observed at N equispaced times.
- For unequally spaced data, smoothing methods may be applied first.
- The observed time series {Z_1, …, Z_708} is divided into 59 successive paths of length 12:
  y_t(x) = {Z_w, w ∈ (p(t − 1), pt]}, t = 1, …, 59,
  where p = 12 is the seasonal period.
- The aim is to forecast the future curves y_{n+h}(x), h > 0, from the observed data.
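The slicing of the univariate series into yearly curves amounts to a reshape. A sketch with a stand-in series (the real data are the 708 monthly SST values):

```python
import numpy as np

def to_functional(Z, p=12):
    """Slice a seasonal univariate series into n curves of length p.

    y_t(x_i) = Z_{p(t-1)+i}; for the El Nino series p = 12 months and
    708 observations give n = 59 yearly curves.
    """
    Z = np.asarray(Z)
    n = len(Z) // p
    return Z[: n * p].reshape(n, p)

Z = np.arange(1, 709)            # stand-in for the 708 monthly observations
Y = to_functional(Z)
print(Y.shape)                   # (59, 12)
```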
FPCA
1 Decompose the complete (12 × 59) data matrix y(x) = [y_1(x), …, y_n(x)]′ into a number of functional principal components and their uncorrelated scores.
2 The FPCA decomposition can be written as
  y_t(x) = μ(x) + Σ_{k=1}^K β_{t,k}φ_k(x) + ε_t(x).   (2)
Functional principal component regression
Conditioning on the historical curves I and the fixed functional principal components Φ = {φ_1(x), …, φ_K(x)}, the forecast curves are
  ŷ_{n+h|n}^{TS}(x) = E[y_{n+h}(x)|I, Φ] = μ(x) + Σ_{k=1}^K β̂_{n+h|n,k}φ_k(x),   (3)
where β̂_{n+h|n,k} denotes the h-step-ahead forecast of β_{n+h,k}.

Hereafter, we refer to this method as the time series (TS) method.
Problem statement
1 As we observe the most recent data points, consisting of the first m_0 time periods of y_{n+1}(x) and denoted by y_{n+1}(x_e) = [y_{n+1}(x_1), …, y_{n+1}(x_{m_0})]′, we want to update the forecasts for the remaining time periods of year n + 1, denoted by y_{n+1}(x_l) = [y_{n+1}(x_{m_0+1}), …, y_{n+1}(x_{12})]′.
2 Using (3), the TS forecast of y_{n+1}(x_l) is given by
  ŷ_{n+1|n}^{TS}(x_l) = E[y_{n+1}(x_l)|I_l, Φ_l] = μ(x_l) + Σ_{k=1}^K β̂_{n+1|n,k}^{TS}φ_k(x_l).
3 The TS method does not use any of the new observations.
4 We introduce four dynamic updating methods and compare their point forecast performance.
Block moving (BM)
1 The BM method treats the most recent data as the last observations in a complete data matrix.
2 Because time is a continuous variable, a complete data matrix can be formed at any given time interval.
3 The TS method can then be applied, at the cost of sacrificing a number of data points in the first year.
Ordinary least squares (OLS) regression
1 Denote F_e as an m_0 × K matrix whose (j, k)th entry is φ_{j,k} for 1 ≤ j ≤ m_0, 1 ≤ k ≤ K.
2 Let β_{n+1} = [β_{n+1,1}, …, β_{n+1,K}]′ be a K × 1 vector, and ε_{n+1}(x_e) = [ε_{n+1}(x_1), …, ε_{n+1}(x_{m_0})]′ be an m_0 × 1 vector.
3 As the mean-adjusted y*_{n+1}(x_e) = y_{n+1}(x_e) − μ(x_e) becomes available, the OLS regression is
  y*_{n+1}(x_e) = F_eβ_{n+1} + ε_{n+1}(x_e).
4 Via OLS, β̂_{n+1}^{OLS} = (F_e′F_e)⁻¹F_e′y*_{n+1}(x_e).
5 The OLS forecast of y_{n+1}(x_l) is given by
  ŷ_{n+1|n}^{OLS}(x_l) = E[y_{n+1}(x_l)|I_l, Φ_l] = μ(x_l) + Σ_{k=1}^K β̂_{n+1,k}^{OLS}φ_k(x_l).
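The OLS update is a least-squares fit of the observed part of the new curve on the components. A sketch with illustrative names, where `np.linalg.lstsq` stands in for the explicit normal-equations formula:

```python
import numpy as np

def ols_update(y_new_partial, mu_e, Fe):
    """Regress the observed part of the new curve on the components.

    y_new_partial: first m0 values of y_{n+1}; mu_e: mean on those points;
    Fe: (m0 x K) matrix of component values on the observed points.
    Returns the least-squares solution of y* = Fe beta.
    """
    y_star = y_new_partial - mu_e
    beta, *_ = np.linalg.lstsq(Fe, y_star, rcond=None)
    return beta

# Toy check: if the new curve is exactly mu + Fe @ beta, OLS recovers beta.
rng = np.random.default_rng(5)
m0, K = 8, 3
Fe = rng.standard_normal((m0, K))
mu_e = rng.standard_normal(m0)
beta_true = np.array([1.0, -2.0, 0.5])
y_obs = mu_e + Fe @ beta_true
print(np.round(ols_update(y_obs, mu_e, Fe), 6))
```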
Ridge regression (RR)
1 RR penalizes OLS coefficients that deviate from 0. The RR coefficients minimize the penalized residual sum of squares
  argmin_{β_{n+1}} {(y*_{n+1}(x_e) − F_eβ_{n+1})′(y*_{n+1}(x_e) − F_eβ_{n+1}) + λβ_{n+1}′β_{n+1}}.
2 Taking the derivative with respect to β_{n+1} and setting it to zero,
  β̂_{n+1}^{RR} = (F_e′F_e + λI)⁻¹F_e′y*_{n+1}(x_e).
3 The RR forecast of y_{n+1}(x_l) is
  ŷ_{n+1}^{RR}(x_l) = E[y_{n+1}(x_l)|I, Φ_l] = μ(x_l) + Σ_{k=1}^K β̂_{n+1,k}^{RR}φ_k(x_l).
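The RR update has the familiar ridge closed form. A sketch (names illustrative) showing that λ = 0 reproduces OLS while a large λ shrinks the scores towards zero:

```python
import numpy as np

def ridge_update(y_star, Fe, lam):
    """beta_RR = (Fe'Fe + lam I)^(-1) Fe' y*: OLS scores shrunk towards 0."""
    K = Fe.shape[1]
    return np.linalg.solve(Fe.T @ Fe + lam * np.eye(K), Fe.T @ y_star)

rng = np.random.default_rng(8)
Fe = rng.standard_normal((8, 3))
y_star = rng.standard_normal(8)
b_ols, *_ = np.linalg.lstsq(Fe, y_star, rcond=None)
print(np.allclose(ridge_update(y_star, Fe, 0.0), b_ols))   # True
print(np.linalg.norm(ridge_update(y_star, Fe, 1e8)))       # near 0
```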
Penalized least square (PLS) regression
1 The OLS method needs a sufficient number of observations (≥ K) for β̂_{n+1}^{OLS} to be numerically stable.
2 The β_{n+1} obtained from the PLS method minimizes
  (y*_{n+1}(x_e) − F_eβ_{n+1})′(y*_{n+1}(x_e) − F_eβ_{n+1}) + λ(β_{n+1} − β̂_{n+1|n}^{TS})′(β_{n+1} − β̂_{n+1|n}^{TS}).
3 Taking the first derivative with respect to β_{n+1} and setting it to zero,
  β̂_{n+1}^{PLS} = (F_e′F_e + λI)⁻¹(F_e′y*_{n+1}(x_e) + λβ̂_{n+1|n}^{TS}).   (4)
4 The PLS forecast is a weighted average of the TS and OLS forecasts, controlled by the penalty parameter λ.
5 The PLS forecast of y_{n+1}(x_l) is given by
  ŷ_{n+1}^{PLS}(x_l) = E[y_{n+1}(x_l)|I_l, Φ_l] = μ(x_l) + Σ_{k=1}^K β̂_{n+1,k}^{PLS}φ_k(x_l).
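The PLS update can be sketched the same way; as λ → 0 it recovers the OLS update and as λ → ∞ it returns the TS forecast, which is the weighted-average behaviour described above. Names are illustrative.

```python
import numpy as np

def pls_update(y_star, Fe, beta_ts, lam):
    """beta_PLS = (Fe'Fe + lam I)^(-1) (Fe' y* + lam beta_TS)."""
    K = Fe.shape[1]
    return np.linalg.solve(Fe.T @ Fe + lam * np.eye(K),
                           Fe.T @ y_star + lam * beta_ts)

rng = np.random.default_rng(9)
Fe = rng.standard_normal((8, 3))
y_star = rng.standard_normal(8)
beta_ts = np.array([0.5, -0.5, 1.0])
b_ols, *_ = np.linalg.lstsq(Fe, y_star, rcond=None)
print(np.allclose(pls_update(y_star, Fe, beta_ts, 1e-12), b_ols))   # True
print(np.allclose(pls_update(y_star, Fe, beta_ts, 1e12), beta_ts))  # True
```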
Penalty parameter selection
Split the data into a training set, consisting of
1 a training sample (SST from 1950 to 1970) and
2 a validation sample (SST from 1971 to 1992),
and a testing set (SST from 1993 to 2007).

The optimal penalty parameters λ for the different updating periods are determined by minimizing the mean absolute error (MAE)
  MAE = (1/(hp)) Σ_{j=1}^h Σ_{i=1}^p |y_{n+j}(x_i) − ŷ_{n+j}(x_i)|,
over a grid of candidate values (from 10⁻⁶ to 10⁶ in steps of 0.0001).
Component selection
- Using the data in the training set, select the number of components by minimizing the MAE within the validation set.
- The optimal number of components is K = 5.
Some benchmark forecasting methods
1 The mean predictor (MP) method predicts the values at year n + 1 by the empirical mean of the first year to the nth year.
2 The random walk (RW) method predicts the values at year n + 1 by the observations at year n.
3 The seasonal autoregressive integrated moving average (SARIMA) is a benchmark method for forecasting seasonal univariate time series. It requires specifying the orders of the seasonal and non-seasonal components of an ARIMA model; we implement the automatic algorithm of Hyndman and Khandakar (2008) to select the optimal orders.
Point forecast comparison
MP, RW, SARIMA and TS are non-dynamic; OLS, Block, PLS and RR are dynamic updating methods.

Update     MP     RW    SARIMA   TS     OLS    Block   PLS    RR
Mar–Dec    0.72   0.86   0.96    0.73   0.72   0.70    0.67   0.76
Apr–Dec    0.73   0.87   0.98    0.74   0.69   0.73    0.68   0.65
May–Dec    0.71   0.86   0.88    0.71   0.94   0.71    0.68   0.62
Jun–Dec    0.71   0.84   0.86    0.71   1.07   0.70    0.66   0.58
Jul–Dec    0.72   0.87   0.86    0.73   0.94   0.68    0.60   0.57
Aug–Dec    0.71   0.91   0.84    0.74   0.94   0.69    0.63   0.62
Sep–Dec    0.71   0.93   0.84    0.74   1.03   0.70    0.65   0.64
Oct–Dec    0.72   0.96   0.57    0.78   0.69   0.74    0.71   0.64
Nov–Dec    0.72   0.92   0.52    0.79   0.25   0.75    0.58   0.24
Dec        0.64   0.83   0.21    0.71   0.29   0.59    0.23   0.29
Mean       0.71   0.88   0.75    0.74   0.76   0.70    0.61   0.56

Table: MAE of the point forecasts using different methods.
Parametric prediction intervals
1 Based on orthogonality and linear additivity, the total forecast variance is approximated by the sum of the individual variances:
  ξ_{n+h|n} = Var[y_{n+h}|I, Φ] ≈ Σ_{k=1}^K η_{n+h|n,k}φ_k²(x) + v_{n+h},
  where
  - η_{n+h|n,k} = Var(β_{n+h,k}|β_{1,k}, …, β_{n,k}) is obtained from a time series model;
  - v_{n+h} is estimated by averaging the squared residuals ε̂²(x) from (2) for each x.
2 Under normality, the (1 − α) prediction intervals for y_{n+h}(x) are
  ŷ_{n+h|n}(x) ± z_α(ξ_{n+h|n})^{1/2},
  where z_α is the (1 − α/2) standard normal quantile.
Nonparametric prediction intervals
1 The h-step-ahead forecast errors of the principal component scores are π_{t,h,k} = β̂_{t,k} − β̂_{t|t−h,k}, for t = h + 1, …, n, where h < n − 1.
2 By sampling with replacement, obtain bootstrap samples of β_{n+h,k}:
  β̂_{n+h|n,k}^{b,TS} = β̂_{n+h|n,k}^{TS} + π_{h,k}^{b*}, for b = 1, …, B.
3 Since the residuals {ε_1(x), …, ε_n(x)} are uncorrelated with the principal components, bootstrap the model residual term ε̂_{n+h|n}^b(x) by iid sampling.
4 Based on orthogonality and linear additivity, obtain B forecast variants of y_{n+h|n}(x):
  ŷ_{n+h|n}^b(x) = μ(x) + Σ_{k=1}^K β̂_{n+h|n,k}^{b,TS}φ_k(x) + ε̂_{n+h|n}^b(x).
5 The (1 − α) prediction intervals are given by the quantiles of ŷ_{n+h|n}^b(x).
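The bootstrap above can be sketched end to end. All names and the toy inputs are illustrative, and simple uniform resampling with replacement stands in for the paper's exact resampling scheme.

```python
import numpy as np

def bootstrap_interval(mu, Phi, beta_fc, score_errors, resid_curves,
                       alpha=0.2, B=1000, rng=None):
    """Bootstrap (1 - alpha) prediction intervals for the next curve.

    beta_fc: (K,) TS forecasts of the scores; score_errors: (n_err x K)
    historical h-step forecast errors pi; resid_curves: (n x p) model
    residual curves epsilon_t(x). Both are resampled with replacement.
    """
    rng = np.random.default_rng(rng)
    K, p = Phi.shape
    n_err = score_errors.shape[0]
    n_res = resid_curves.shape[0]
    sims = np.empty((B, p))
    for b in range(B):
        # resample one historical error per score, plus one residual curve
        beta_b = beta_fc + score_errors[rng.integers(n_err, size=K),
                                        np.arange(K)]
        eps_b = resid_curves[rng.integers(n_res)]
        sims[b] = mu + beta_b @ Phi + eps_b
    return np.quantile(sims, [alpha / 2, 1 - alpha / 2], axis=0)

rng = np.random.default_rng(6)
K, p = 2, 12
Phi = rng.standard_normal((K, p))
lo, hi = bootstrap_interval(np.zeros(p), Phi, np.array([1.0, -1.0]),
                            rng.standard_normal((50, K)) * 0.3,
                            rng.standard_normal((40, p)) * 0.1, rng=7)
print((lo <= hi).all())
```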
Distributional forecast updating
1 By sampling with replacement, obtain bootstrap samples of \beta_{n+1,k}
  for year n + 1:

    \beta^{b,\mathrm{TS}}_{n+1|n,k} = \beta^{\mathrm{TS}}_{n+1|n,k} + \pi^{b*}_{1,k}, for b = 1, \ldots, B.

2 The bootstrapped samples \beta^{b,\mathrm{TS}}_{n+1|n,k} lead to bootstrapped
  samples \beta^{b,\mathrm{PLS}}_{n+1} by (4).

3 From \beta^{b,\mathrm{PLS}}_{n+1}, obtain B replications of

    y^{b,\mathrm{PLS}}_{n+1}(x_l) = \mu(x_l) + \sum_{k=1}^{K} \beta^{b,\mathrm{PLS}}_{n+1,k}\, \phi_k(x_l) + \varepsilon_{n+1}(x_l).

4 The (1 - \alpha) prediction intervals are quantiles of y^{b,\mathrm{PLS}}_{n+1}(x_l).
Distributional forecast measure
1 The empirical conditional coverage probability was calculated as the ratio
  between the number of 'future' observations falling into the calculated
  prediction intervals and the number of testing observations:

    \mathrm{coverage} = \frac{1}{hp} \sum_{i=1}^{p} \sum_{j=1}^{h}
      I\big( y^{\mathrm{lb}}_{n+j|n}(x_i) < y_{n+j}(x_i) < y^{\mathrm{ub}}_{n+j|n}(x_i) \big).

  Mean coverage probability deviance (MCD) = average(empirical coverage -
  nominal coverage).

2 To assess which approach gives narrower prediction intervals, calculate
  the width of the prediction intervals:

    \mathrm{Width} = \frac{1}{hp} \sum_{i=1}^{p} \sum_{j=1}^{h}
      \big| y^{\mathrm{ub}}_{n+j|n}(x_i) - y^{\mathrm{lb}}_{n+j|n}(x_i) \big|.
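Both scores average over the h horizons and p grid points, so they reduce to simple means over an (h, p) array. A minimal sketch, with a toy example in place of real forecasts:

```python
import numpy as np

def interval_scores(y_true, lower, upper):
    """Empirical coverage and mean interval width over an (h, p) grid
    of forecast horizons and x points, per the formulas above."""
    inside = (lower < y_true) & (y_true < upper)
    coverage = inside.mean()                   # (1/hp) * sum of indicators
    width = np.abs(upper - lower).mean()       # (1/hp) * sum of |ub - lb|
    return coverage, width

# Toy check: intervals (-1, 1) always contain the true value 0.
h, p = 4, 10
y = np.zeros((h, p))
cov, w = interval_scores(y, np.full((h, p), -1.0), np.full((h, p), 1.0))
# cov = 1.0, w = 2.0
```

At a 95% nominal level, a well-calibrated method yields cov near 0.95 (small MCD) with the smallest attainable w.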
Distributional forecast comparison
              Parametric          Nonparametric
Period        TS       BM        TS       BM       PLS
Mar-Dec       97%      98%       97%      97%      95%
Apr-Dec       97%      98%       97%      97%      95%
May-Dec       96%      96%       96%      96%      96%
Jun-Dec       96%      96%       96%      95%      95%
Jul-Dec       95%      96%       95%      94%      94%
Aug-Dec       94%      94%       94%      94%      93%
Sep-Dec       93%      95%       93%      95%      93%
Oct-Dec       93%      93%       93%      93%      90%
Nov-Dec       93%      96%       93%      93%      93%
Dec           93%      100%      93%      93%      93%
MCD           1.58%    1.88%     1.58%    1.40%    1.49%

Table: Empirical coverage at the 95% nominal level; the smaller the mean coverage probability deviance (MCD), the better the method.
Distributional forecast comparison
              Parametric          Nonparametric
Period        TS       BM        TS       BM       PLS
Mar-Dec       3.65     3.64      3.55     3.51     3.15
Apr-Dec       3.73     3.73      3.62     3.66     3.21
May-Dec       3.69     3.69      3.57     3.61     3.21
Jun-Dec       3.58     3.58      3.47     3.50     3.05
Jul-Dec       3.47     3.46      3.38     3.41     2.90
Aug-Dec       3.34     3.33      3.26     3.37     2.61
Sep-Dec       3.26     3.26      3.19     3.25     2.82
Oct-Dec       3.27     3.28      3.20     3.23     2.78
Nov-Dec       3.23     3.24      3.16     3.26     2.69
Dec           3.19     3.18      3.12     3.30     2.48
Mean width    3.44     3.44      3.35     3.41     2.89

Table: Prediction interval width comparison at the 95% nominal level.
Conclusion of the third paper
1 Presented a nonparametric method to forecast univariate seasonal time series.

2 Showed the importance of dynamic updating for improving point forecast accuracy.

3 Among all dynamic updating methods, RR turns out to be the best.

4 It is possible to examine other penalty functions in both the PLS and RR methods.
Summary of the three papers

1 Proposed three graphical tools for visualizing functional data and identifying functional outliers.

2 Proposed a weighted functional principal component analysis to model and forecast mortality and fertility.

3 Applied the functional data analytic approach to model and forecast seasonal univariate time series.
References of three papers
Hyndman, R. J. and Shang, H. L. (2010) Rainbow plots, bagplots and boxplots for functional data, Journal of Computational and Graphical Statistics, 19(1), 29-45.

Hyndman, R. J. and Shang, H. L. (2009) Forecasting functional time series (with discussion), Journal of the Korean Statistical Society, 38(3), 199-221.

Shang, H. L. and Hyndman, R. J. (2011) Nonparametric time series forecasting with dynamic updating, Mathematics and Computers in Simulation, 81(7), 1310-1324.
References of three R packages
Shang, H. L. and Hyndman, R. J. (2011) rainbow: Rainbow plots, bagplots and boxplots for functional data, R package version 2.3.4, http://CRAN.R-project.org/package=rainbow.

Shang, H. L. and Hyndman, R. J. (2011) fds: Functional data sets, R package version 1.6, http://CRAN.R-project.org/package=fds.

Hyndman, R. J. and Shang, H. L. (2011) ftsa: Functional time series analysis, R package version 2.6, http://CRAN.R-project.org/package=ftsa.
Contact details

Thank you for your attention.

Keep in contact: [email protected]