Visualizing functional data Forecasting functional data Forecasting seasonal univariate time series Conclusion
Visualizing and forecasting functional time series
Han Lin Shang
Department of Econometrics and Business Statistics
Outline

1 Visualizing functional time series.
2 Modeling and forecasting functional time series.
3 Modeling and forecasting seasonal univariate time series via a functional approach.
4 Empirical analyses of estimation, modeling and forecasting techniques, with no theoretical proofs.
Aim of the first paper
Introduce three visualization methods
1 rainbow plot
2 functional bagplot
3 functional highest density region (HDR) boxplot
Functional bagplot and functional HDR boxplot can detect outliers.
Overview of functional data

1 A collection of functions, represented by curves, surfaces, shapes or images.
2 Some applications include:
  - Age-specific mortality and fertility rates (Hyndman and Ullah, 2007)
  - Term-structured yield curves (Kargin and Onatski, 2008)
  - Spectrometry data (Reiss and Ogden, 2007)
  - El Niño data (Ferraty and Vieu, 2006)
Visualizing functional data

- Helps discover characteristics that might not be apparent from mathematical models and summary statistics.
- Yet visualization has so far played only a minor role in functional data analysis.
Some visualization methods
1 Phase-plane plot
2 Rug-plot
3 Singular value decomposition plot
Rainbow plot

1 A simple plot of all the data, with the added feature of a rainbow color palette based on an ordering of the functional data.
2 Functional data can be ordered by depth or by density.
Example of rainbow plot

Annual age-specific mortality curves for French males between 1899 and 2005.
Multivariate principal component analysis

1 PC1 is calculated by maximizing the variance of φ_1X′, that is,
  φ̂_1 = argmax_{‖φ_1‖=1} var(φ_1X′) = argmax_{‖φ_1‖=1} φ_1X′Xφ_1′.
2 Successive PCs are obtained iteratively by subtracting the first k PCs from X:
  X_k = X_{k−1} − X_{k−1}φ_k′φ_k.
3 Treat X_k as the new data matrix and find φ_{k+1} by maximizing the variance of φ_{k+1}X_k′, subject to ‖φ_{k+1}‖ = (Σ_{j=1}^p φ_{k+1,j}²)^{1/2} = 1 and φ_{k+1} ⊥ φ_j, j = 1, …, k.
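The deflation procedure above can be sketched numerically. A minimal NumPy sketch (the function name `pca_by_deflation` is illustrative, not from the paper): each φ_k is the leading right singular vector of the current residual matrix, and the fitted rank-one part is subtracted before the next component is extracted.

```python
import numpy as np

def pca_by_deflation(X, K):
    """Sequentially extract K principal components by deflation.

    X is an (n x p) mean-centred data matrix. Each component phi_k
    (a unit p-vector) maximises the variance of the scores X @ phi_k;
    after extraction the fitted rank-one part is subtracted from X.
    """
    Xk = X.copy()
    components = []
    for _ in range(K):
        # The variance maximiser is the leading right singular vector of Xk.
        _, _, Vt = np.linalg.svd(Xk, full_matrices=False)
        phi = Vt[0]
        components.append(phi)
        # Deflate: X_k = X_{k-1} - X_{k-1} phi' phi
        Xk = Xk - np.outer(Xk @ phi, phi)
    return np.array(components)

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 6))
X -= X.mean(axis=0)
Phi = pca_by_deflation(X, 3)
print(np.round(Phi @ Phi.T, 6))   # components come out orthonormal
```

With distinct singular values, deflation recovers the same components as a single SVD, which makes the sketch easy to check.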
Properties of functional principal component analysis

                 PCA                                              FPCA
Variables        X = [x_1, …, x_p],                               f(x) = [f_1(x), …, f_n(x)],
                 x_i = [x_{1i}, …, x_{ni}]′, i = 1, …, p          x ∈ [x_1, x_p]
Data             Vectors ∈ R^p                                    Curves ∈ L²[x_1, x_p]
Covariance       Matrix V = Cov(X) ∈ R^{p×p}                      Bounded operator T: L²[x_1, x_p] → L²[x_1, x_p]
Eigenstructure   Vector ξ_k ∈ R^p, Vξ_k = λ_kξ_k,                 Function ξ_k(x) ∈ L²[x_1, x_p],
                 for 1 ≤ k < min(n, p)                            ∫_{x_1}^{x_p} Tξ_k(x)dx = λ_kξ_k(x), for 1 ≤ k < n
Components       Random variables in R^p                          Random variables in L²[x_1, x_p]
Bivariate and functional bagplots

1 Apply robust functional principal component analysis (FPCA) to {y_t(x)} and obtain the first two PC scores.
2 The bivariate PC scores are then ordered by Tukey's halfspace location depth and plotted in a bivariate bagplot.
3 Map the features of the bivariate bagplot back into the functional space.
Bivariate and functional HDR boxplots
1 Compute a bivariate kernel density estimate on the first two robust PC scores.
2 Apply the bivariate HDR boxplot.
3 Map the features of the HDR boxplot back into the functional space.
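Steps 1 and 2 can be sketched with SciPy's `gaussian_kde`; the helper name `hdr_outliers` and the toy scores are illustrative, not from the paper. A point lies outside the α highest-density region when its estimated density falls below the corresponding density quantile.

```python
import numpy as np
from scipy.stats import gaussian_kde

def hdr_outliers(scores, alpha=0.99):
    """Indices of points outside the alpha highest-density region.

    scores: (n x 2) array of the first two PC scores. A point is outside
    the alpha-HDR when its estimated density is below the (1 - alpha)
    quantile of the densities evaluated at all data points.
    """
    kde = gaussian_kde(scores.T)       # bivariate kernel density estimate
    dens = kde(scores.T)
    cutoff = np.quantile(dens, 1 - alpha)
    return np.where(dens < cutoff)[0]

rng = np.random.default_rng(1)
scores = rng.standard_normal((200, 2))
scores[0] = [9.0, 9.0]                 # one artificial outlying year
print(hdr_outliers(scores))
```

The isolated point receives the lowest estimated density and is flagged, which is the mechanism the HDR boxplot uses to highlight outlying years.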
Example of El Nino data
Average monthly sea surface temperatures (in °C) from January 1951 to December 2007.

[Figure: rainbow plot of sea surface temperature against month.]
Rainbow plots ordered by depth and density
[Figure: rainbow plots of the sea surface temperature curves ordered by depth (left) and by density (right); month on the horizontal axis, sea surface temperature on the vertical axis.]
Outlier detection by bagplots
[Figure: bivariate bagplots of the first two PC scores together with the corresponding functional bagplots. For the French male log mortality rates (log mortality rate against age), the highlighted outlying years are 1914–1919, 1940 and 1943–1944; for the El Niño sea surface temperatures (temperature against month), the highlighted outlying years are 1982–1983 and 1997–1998.]
Outlier detection by HDR boxplots
[Figure: bivariate HDR boxplots of the first two PC scores together with the corresponding functional HDR boxplots. The same outlying years are highlighted: 1914–1919, 1940 and 1943–1944 for the French male log mortality rates, and 1982–1983 and 1997–1998 for the El Niño sea surface temperatures.]
Other outlier detection methods
1 Uses the notion of functional depth and calculates a likelihood ratio test statistic for each curve.
2 A curve is an outlier if the maximum of the test statistics exceeds a given critical value.
3 After removing the outlier, the remaining data are tested again.
Integrated squared error
1 Utilizes robust FPCA. The integrated squared error for each curve is
  ∫_{x_1}^{x_p} e_t²(x)dx = ∫_{x_1}^{x_p} [y_t(x) − μ(x) − Σ_{k=1}^K β_{t,k}φ_k(x)]² dx.
2 A high integrated squared error indicates a high likelihood that the curve is an outlier.
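On a discretised grid the integral can be computed directly. A sketch under stated assumptions: curves, mean, components and scores are given as arrays, the grid is uniform, and the integral is approximated by the trapezoid rule (the function name is illustrative).

```python
import numpy as np

def integrated_squared_errors(Y, mu, Phi, B, x):
    """ISE_t = integral of [y_t(x) - mu(x) - sum_k beta_{t,k} phi_k(x)]^2 dx.

    Y: (n x p) discretised curves, mu: (p,) mean curve,
    Phi: (K x p) components, B: (n x K) scores, x: (p,) uniform grid.
    """
    resid = Y - mu - B @ Phi                  # e_t(x_i) for every curve
    sq = resid ** 2
    dx = x[1] - x[0]
    # trapezoid rule: full weight inside, half weight at the endpoints
    return (sq.sum(axis=1) - 0.5 * (sq[:, 0] + sq[:, -1])) * dx

x = np.linspace(0.0, 1.0, 101)
mu = np.sin(2 * np.pi * x)
phi = np.sqrt(2) * np.cos(2 * np.pi * x)      # one component
rng = np.random.default_rng(2)
beta = rng.standard_normal((20, 1))
Y = mu + beta @ phi[None, :] + 0.01 * rng.standard_normal((20, 101))
Y[7] += 0.5                                   # shift one curve upwards
ise = integrated_squared_errors(Y, mu, phi[None, :], beta, x)
print(int(np.argmax(ise)))                    # the shifted curve stands out
```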
Robust Mahalanobis distance method
1 Discretize functional data on an equally spaced dense grid.
2 The squared robust Mahalanobis distance is defined by
  r_t = [y_t(x_i) − μ(x_i)]′Σ⁻¹[y_t(x_i) − μ(x_i)], i = 1, …, p, t = 1, …, n.
3 Outliers have squared robust Mahalanobis distances greater than χ²_{0.99,p}.
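A sketch of the distance rule, with one simplifying assumption: the plain sample mean and covariance stand in for the robust estimators (e.g. minimum covariance determinant) that the method actually calls for, and the function name is illustrative.

```python
import numpy as np
from scipy.stats import chi2

def mahalanobis_outliers(Y, q=0.99):
    """Squared Mahalanobis distances of discretised curves, chi-square cutoff.

    For illustration the plain sample mean and covariance stand in for
    robust estimators; Y is an (n x p) matrix of discretised curves.
    """
    n, p = Y.shape
    mu = Y.mean(axis=0)
    Sigma = np.cov(Y, rowvar=False)
    Sinv = np.linalg.pinv(Sigma)
    d = Y - mu
    # quadratic form r_t = d_t' Sigma^{-1} d_t for every curve at once
    r = np.einsum('ij,jk,ik->i', d, Sinv, d)
    return r, np.where(r > chi2.ppf(q, df=p))[0]

rng = np.random.default_rng(3)
Y = rng.standard_normal((200, 4))
Y[5] += 10.0                      # one grossly outlying curve
r, flagged = mahalanobis_outliers(Y)
print(flagged)
```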
Outlier detection comparison of mortality data
Method                        Outliers detected
Functional depth              None
Integrated squared error      1914–1918, 1940, 1943–1945
Functional bagplot            1914–1919, 1940, 1943–1944
Functional HDR boxplot        1914–1919, 1940, 1943–1944
Robust Mahalanobis distance   1914–1918, 1940, 1944

Table: The outliers are 1914–1919, 1940, 1943–1944.
Outlier detection comparison of El Nino data
Method                        Outliers detected
Functional depth              1983, 1997
Integrated squared error      1973, 1982–1983, 1997–1998
Functional bagplot            1982–1983, 1997–1998
Functional HDR boxplot        1982–1983, 1997–1998
Robust Mahalanobis distance   1982–1983, 1997–1998

Table: The outliers are 1982–1983, 1997–1998.
Conclusion of the first paper
1 Three graphical methods to visualize functional data.
2 Functional bagplots and HDR boxplots can detect outliers.
3 One limitation is that only the first two principal component scores are considered.
4 The probability level used to flag outliers needs to be pre-chosen.
Possible extension
1 FPCA can be replaced by other dimension reduction techniques.
2 Other ways of ordering functional data, or of determining the functional median or mode.
3 Tukey's location depth can be replaced by other depth measures.
4 Extend from two-dimensional curves to three-dimensional images.
Aim of the second paper
1 A new functional data analytic tool for forecasting age-specific mortality and fertility rates.
2 Mortality rate forecasting is vital for planning insurance and pension policies.
3 Fertility rate forecasting is important for planning child care policy.
Australian fertility data set
Annual Australian fertility rates (1921–2006) for ages 15 to 49. These are defined as the number of live births during the calendar year, according to the age of the mother, per 1000 of the female resident population of the same age at 30 June.
French female mortality data set
Annual French female mortality rates (1899–2005) for single years of age. These are simply the ratio of death counts to population exposure in the relevant interval of age and time.
Modeling step
1 Smooth the data for each year using a nonparametric smoothing method to estimate f_t(x) for x ∈ [x_1, x_p] from {x_i, y_t(x_i)}, i = 1, 2, …, p.
2 Decompose the realized curves via FPCA:
  y_t(x) = μ(x) + Σ_{k=1}^K β_{t,k}φ_k(x) + e_t(x) + σ_t(x)η_t,   (1)
  where
  - μ(x) is the mean function;
  - {φ_1(x), …, φ_K(x)} are the functional principal components, which are assumed to be fixed;
  - {β_{t,1}, …, β_{t,K}} are the uncorrelated principal component scores satisfying Σ_{k=1}^K β_{t,k}² < ∞;
  - e_t(x) is the estimated model residual function;
  - σ_t(x)η_t takes into account heterogeneity, with η_t ∼ N(0, 1);
  - K is the number of functional principal components.
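On a discretised grid, the empirical version of decomposition (1) (without the heterogeneity term) can be computed with a singular value decomposition. A sketch with illustrative names, using unsmoothed toy curves:

```python
import numpy as np

def fpca(Y, K):
    """Empirical FPCA of discretised curves via the SVD.

    Returns the mean mu, components Phi (K x p, orthonormal on the grid)
    and scores B (n x K) so that Y is approximately mu + B @ Phi.
    """
    mu = Y.mean(axis=0)
    U, s, Vt = np.linalg.svd(Y - mu, full_matrices=False)
    Phi = Vt[:K]
    B = U[:, :K] * s[:K]     # scores; columns are mutually orthogonal
    return mu, Phi, B

rng = np.random.default_rng(4)
t = np.linspace(0, 1, 50)
Y = (np.sin(2 * np.pi * t)
     + rng.standard_normal((30, 1)) * np.cos(2 * np.pi * t)
     + 0.05 * rng.standard_normal((30, 50)))
mu, Phi, B = fpca(Y, K=2)
recon = mu + B @ Phi
print(np.abs(Y - recon).max())   # small: two components capture the structure
```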
Forecasting step
1 Model and forecast the coefficients {β_{1,k}, …, β_{n,k}}, k = 1, …, K, via univariate time series models.
2 Use the forecast coefficients with (1) to obtain forecasts of f_{n+h}(x), where h is the forecast horizon.
3 The estimated variances of the error terms in (1) are used to compute prediction intervals.
Weighted mean function
1 The mean function μ(x) is estimated by the weighted average
  μ*(x) = Σ_{t=1}^n w_t f_t(x),
  where f_t(x) is the smoothed curve estimated from y_t(x), and w_t = κ(1 − κ)^{n−t} is a geometrically decreasing weight with 0 < κ < 1.
2 f_t*(x) = f_t(x) − μ*(x) are the de-centralized functional curves; let G = Wf*(x), where W = diag(w_1, …, w_n) is a diagonal weight matrix.
3 Apply the singular value decomposition G = UDV′, where φ_k(x_i*) is the (i, k)th element of V.
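The weights and the weighted decomposition can be sketched as follows. Function names are illustrative; following the slide, the weights sum to 1 − (1 − κ)^n (approximately 1) rather than being renormalised.

```python
import numpy as np

def geometric_weights(n, kappa):
    """w_t = kappa (1 - kappa)^(n - t), t = 1..n: recent curves weigh most."""
    t = np.arange(1, n + 1)
    return kappa * (1 - kappa) ** (n - t)

def weighted_mean_and_components(F, kappa, K):
    """Weighted mean curve and first K weighted principal components.

    F: (n x p) matrix of smoothed curves f_t(x). G = W f*(x) with
    W = diag(w_1, ..., w_n); the components come from the SVD G = U D V'.
    """
    w = geometric_weights(F.shape[0], kappa)
    mu_star = w @ F                    # weights sum to 1 - (1 - kappa)^n
    G = np.diag(w) @ (F - mu_star)
    _, _, Vt = np.linalg.svd(G, full_matrices=False)
    return mu_star, Vt[:K]

w = geometric_weights(50, kappa=0.3)
print(w[-1], w.sum())                  # most recent weight is kappa; total near 1
```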
Weighted functional principal components
1 The weighted functional principal component decomposition is
  y_t(x) = μ*(x) + Σ_{k=1}^K β_{t,k}φ_k*(x) + e_t(x) + σ_t(x)η_t.
2 Since the scores {β_{t,1}, …, β_{t,K}} are uncorrelated, they can be forecast using univariate time series models.
3 Conditioning on the observations I and the set of fixed weighted functional principal components Φ* = {φ_1*(x), …, φ_K*(x)}, the h-step-ahead forecast of y_{n+h}(x) is
  ŷ_{n+h|n}(x) = E[y_{n+h}(x)|I, Φ*] = μ*(x) + Σ_{k=1}^K β̂_{n+h|n,k}φ_k*(x),
  where β̂_{n+h|n,k} denotes the h-step-ahead forecast of β_{n+h,k}.
Selection of weight parameter
κ can be determined by minimizing the mean integrated squared forecast error (MISFE)
  MISFE(h) = ∫_{x_1}^{x_p} [y_{n+h}(x) − ŷ_{n+h|n}(x)]² dx,
over a set of grid points of κ.
Selection of number of components
The optimal number of components is determined by minimizing the MISFE.
Australian fertility rates
K    FPCA      FPCAw     RW
1    99.0611   16.7304
2    56.3095    3.3019
3    24.9330    3.2580
4    15.6845    3.1995
5     4.4495    3.2132
6     3.4310    3.2123   4.9800

Table: MSE: Australian fertility rates.
French female mortality rates
K    FPCA     FPCAw    RW
1    0.5956   0.0293
2    0.0537   0.0310
3    0.0316   0.0310
4    0.0296   0.0311
5    0.0287   0.0311
6    0.0425   0.0311   0.0437

Table: MSE (×1000): French female log mortality rates.
Conclusion of the second paper
1 Proposed a weighted FPCA to forecast age-specific fertility and mortality rates.
2 Compared point forecast accuracy between the unweighted and weighted FPCA.
3 The weighting idea can be extended to other dimension reduction techniques, such as functional partial least squares regression.
Aim of the third paper
1 Sea surface temperature (SST) is rising.
2 Rising sea surface temperatures increase the intensity of natural disasters, such as hurricanes and storms.
3 Provide a better, multivariate and nonparametric way of modeling and predicting sea surface temperature.
El Nino data set
1 Average monthly sea surface temperature from January 1950 to December 2008, available online at www.cpc.noaa.gov/data/indices/sstoi.indices.
2 Sea surface temperatures are measured by moored buoys in the "Niño region" defined by the coordinates 0–10° South and 90–80° West.
Univariate graphical display
[Figure: monthly sea surface temperature plotted as a single univariate time series, 1950–2010.]
Functional graphical display
[Figure: the same data plotted as a functional time series, one curve per year; month on the horizontal axis, sea surface temperature on the vertical axis.]
Functional time series analysis
- Let {Z_w, w ∈ [1, N]} be a seasonal time series observed at N equispaced times.
- For unequally spaced data, smoothing methods may be applied first.
- The observed time series {Z_1, …, Z_708} is divided into 59 successive paths of length 12:
  y_t(x) = {Z_w, w ∈ (p(t − 1), pt]}, t = 1, …, 59,
  where p = 12 is the seasonal period.
- The aim is to forecast the future curves y_{n+h}(x), h > 0, from the observed data.
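The slicing of the univariate series into yearly curves amounts to a reshape. A sketch with a stand-in series (the real data are the 708 monthly SST values):

```python
import numpy as np

def to_functional(Z, p=12):
    """Slice a seasonal univariate series into n curves of length p.

    y_t(x_i) = Z_{p(t-1)+i}; for the El Nino series p = 12 months and
    708 observations give n = 59 yearly curves.
    """
    Z = np.asarray(Z)
    n = len(Z) // p
    return Z[: n * p].reshape(n, p)

Z = np.arange(1, 709)            # stand-in for the 708 monthly observations
Y = to_functional(Z)
print(Y.shape)                   # (59, 12)
```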
FPCA
1 Decompose the complete (12 × 59) data matrix y(x) = [y_1(x), …, y_n(x)]′ into a number of functional principal components and their uncorrelated scores.
2 The FPCA decomposition can be written as
  y_t(x) = μ(x) + Σ_{k=1}^K β_{t,k}φ_k(x) + ε_t(x).   (2)
Functional principal component regression
Conditioning on the historical curves I and the fixed functional principal components Φ = {φ_1(x), …, φ_K(x)}, the forecast curves are
  ŷ_{n+h|n}^{TS}(x) = E[y_{n+h}(x)|I, Φ] = μ(x) + Σ_{k=1}^K β̂_{n+h|n,k}φ_k(x),   (3)
where β̂_{n+h|n,k} denotes the h-step-ahead forecast of β_{n+h,k}.

Hereafter, we refer to this method as the time series (TS) method.
Problem statement
1 As we observe the most recent data points, consisting of the first m_0 time periods of y_{n+1}(x) and denoted by y_{n+1}(x_e) = [y_{n+1}(x_1), …, y_{n+1}(x_{m_0})]′, we want to update the forecasts for the remaining time periods of year n + 1, denoted by y_{n+1}(x_l) = [y_{n+1}(x_{m_0+1}), …, y_{n+1}(x_{12})]′.
2 Using (3), the TS forecast of y_{n+1}(x_l) is given by
  ŷ_{n+1|n}^{TS}(x_l) = E[y_{n+1}(x_l)|I_l, Φ_l] = μ(x_l) + Σ_{k=1}^K β̂_{n+1|n,k}^{TS}φ_k(x_l).
3 The TS method does not use any of the new observations.
4 We introduce four dynamic updating methods and compare their point forecast performance.
Block moving (BM)
1 The BM method treats the most recent data as the last observations in a complete data matrix.
2 Because time is a continuous variable, a complete data matrix can be formed at any given time interval.
3 The TS method can then be applied, at the cost of sacrificing a number of data points in the first year.
Ordinary least squares (OLS) regression
1 Denote F_e as an m_0 × K matrix whose (j, k)th entry is φ_{j,k} for 1 ≤ j ≤ m_0, 1 ≤ k ≤ K.
2 Let β_{n+1} = [β_{n+1,1}, …, β_{n+1,K}]′ be a K × 1 vector, and ε_{n+1}(x_e) = [ε_{n+1}(x_1), …, ε_{n+1}(x_{m_0})]′ be an m_0 × 1 vector.
3 As the mean-adjusted y*_{n+1}(x_e) = y_{n+1}(x_e) − μ(x_e) becomes available, the OLS regression is
  y*_{n+1}(x_e) = F_eβ_{n+1} + ε_{n+1}(x_e).
4 Via OLS, β̂_{n+1}^{OLS} = (F_e′F_e)⁻¹F_e′y*_{n+1}(x_e).
5 The OLS forecast of y_{n+1}(x_l) is given by
  ŷ_{n+1|n}^{OLS}(x_l) = E[y_{n+1}(x_l)|I_l, Φ_l] = μ(x_l) + Σ_{k=1}^K β̂_{n+1,k}^{OLS}φ_k(x_l).
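The OLS update is a least-squares fit of the observed part of the new curve on the components. A sketch with illustrative names, where `np.linalg.lstsq` stands in for the explicit normal-equations formula:

```python
import numpy as np

def ols_update(y_new_partial, mu_e, Fe):
    """Regress the observed part of the new curve on the components.

    y_new_partial: first m0 values of y_{n+1}; mu_e: mean on those points;
    Fe: (m0 x K) matrix of component values on the observed points.
    Returns the least-squares solution of y* = Fe beta.
    """
    y_star = y_new_partial - mu_e
    beta, *_ = np.linalg.lstsq(Fe, y_star, rcond=None)
    return beta

# Toy check: if the new curve is exactly mu + Fe @ beta, OLS recovers beta.
rng = np.random.default_rng(5)
m0, K = 8, 3
Fe = rng.standard_normal((m0, K))
mu_e = rng.standard_normal(m0)
beta_true = np.array([1.0, -2.0, 0.5])
y_obs = mu_e + Fe @ beta_true
print(np.round(ols_update(y_obs, mu_e, Fe), 6))
```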
Ridge regression (RR)
1 RR penalizes OLS coefficients that deviate from 0. The RR coefficients minimize the penalized residual sum of squares
  argmin_{β_{n+1}} {(y*_{n+1}(x_e) − F_eβ_{n+1})′(y*_{n+1}(x_e) − F_eβ_{n+1}) + λβ_{n+1}′β_{n+1}}.
2 Taking the derivative with respect to β_{n+1} and setting it to zero,
  β̂_{n+1}^{RR} = (F_e′F_e + λI)⁻¹F_e′y*_{n+1}(x_e).
3 The RR forecast of y_{n+1}(x_l) is
  ŷ_{n+1}^{RR}(x_l) = E[y_{n+1}(x_l)|I, Φ_l] = μ(x_l) + Σ_{k=1}^K β̂_{n+1,k}^{RR}φ_k(x_l).
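The RR update has the familiar ridge closed form. A sketch (names illustrative) showing that λ = 0 reproduces OLS while a large λ shrinks the scores towards zero:

```python
import numpy as np

def ridge_update(y_star, Fe, lam):
    """beta_RR = (Fe'Fe + lam I)^(-1) Fe' y*: OLS scores shrunk towards 0."""
    K = Fe.shape[1]
    return np.linalg.solve(Fe.T @ Fe + lam * np.eye(K), Fe.T @ y_star)

rng = np.random.default_rng(8)
Fe = rng.standard_normal((8, 3))
y_star = rng.standard_normal(8)
b_ols, *_ = np.linalg.lstsq(Fe, y_star, rcond=None)
print(np.allclose(ridge_update(y_star, Fe, 0.0), b_ols))   # True
print(np.linalg.norm(ridge_update(y_star, Fe, 1e8)))       # near 0
```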
Penalized least square (PLS) regression
1 The OLS method needs a sufficient number of observations (≥ K) for β̂_{n+1}^{OLS} to be numerically stable.
2 The β_{n+1} obtained from the PLS method minimizes
  (y*_{n+1}(x_e) − F_eβ_{n+1})′(y*_{n+1}(x_e) − F_eβ_{n+1}) + λ(β_{n+1} − β̂_{n+1|n}^{TS})′(β_{n+1} − β̂_{n+1|n}^{TS}).
3 Taking the first derivative with respect to β_{n+1} and setting it to zero,
  β̂_{n+1}^{PLS} = (F_e′F_e + λI)⁻¹(F_e′y*_{n+1}(x_e) + λβ̂_{n+1|n}^{TS}).   (4)
4 The PLS forecast is a weighted average of the TS and OLS forecasts, controlled by the penalty parameter λ.
5 The PLS forecast of y_{n+1}(x_l) is given by
  ŷ_{n+1}^{PLS}(x_l) = E[y_{n+1}(x_l)|I_l, Φ_l] = μ(x_l) + Σ_{k=1}^K β̂_{n+1,k}^{PLS}φ_k(x_l).
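The PLS update can be sketched the same way; as λ → 0 it recovers the OLS update and as λ → ∞ it returns the TS forecast, which is the weighted-average behaviour described above. Names are illustrative.

```python
import numpy as np

def pls_update(y_star, Fe, beta_ts, lam):
    """beta_PLS = (Fe'Fe + lam I)^(-1) (Fe' y* + lam beta_TS)."""
    K = Fe.shape[1]
    return np.linalg.solve(Fe.T @ Fe + lam * np.eye(K),
                           Fe.T @ y_star + lam * beta_ts)

rng = np.random.default_rng(9)
Fe = rng.standard_normal((8, 3))
y_star = rng.standard_normal(8)
beta_ts = np.array([0.5, -0.5, 1.0])
b_ols, *_ = np.linalg.lstsq(Fe, y_star, rcond=None)
print(np.allclose(pls_update(y_star, Fe, beta_ts, 1e-12), b_ols))   # True
print(np.allclose(pls_update(y_star, Fe, beta_ts, 1e12), beta_ts))  # True
```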
Penalty parameter selection
Split the data into a training set, consisting of
1 a training sample (SST from 1950 to 1970) and
2 a validation sample (SST from 1971 to 1992),
and a testing set (SST from 1993 to 2007).

The optimal penalty parameters λ for the different updating periods are determined by minimizing the mean absolute error (MAE)
  MAE = (1/(hp)) Σ_{j=1}^h Σ_{i=1}^p |y_{n+j}(x_i) − ŷ_{n+j}(x_i)|,
over a grid of candidate values (from 10⁻⁶ to 10⁶ in steps of 0.0001).
Component selection
- Using the data in the training set, select the number of components by minimizing the MAE within the validation set.
- The optimal number of components is K = 5.
Some benchmark forecasting methods
1 The mean predictor (MP) method predicts the values at year n + 1 by the empirical mean of the first year to the nth year.
2 The random walk (RW) method predicts the values at year n + 1 by the observations at year n.
3 The seasonal autoregressive integrated moving average (SARIMA) is a benchmark method for forecasting seasonal univariate time series. It requires specifying the orders of the seasonal and non-seasonal components of an ARIMA model; we implement the automatic algorithm of Hyndman and Khandakar (2008) to select the optimal orders.
Point forecast comparison
MP, RW, SARIMA and TS are non-dynamic; OLS, Block, PLS and RR are dynamic updating methods.

Update     MP     RW    SARIMA   TS     OLS    Block   PLS    RR
Mar–Dec    0.72   0.86   0.96    0.73   0.72   0.70    0.67   0.76
Apr–Dec    0.73   0.87   0.98    0.74   0.69   0.73    0.68   0.65
May–Dec    0.71   0.86   0.88    0.71   0.94   0.71    0.68   0.62
Jun–Dec    0.71   0.84   0.86    0.71   1.07   0.70    0.66   0.58
Jul–Dec    0.72   0.87   0.86    0.73   0.94   0.68    0.60   0.57
Aug–Dec    0.71   0.91   0.84    0.74   0.94   0.69    0.63   0.62
Sep–Dec    0.71   0.93   0.84    0.74   1.03   0.70    0.65   0.64
Oct–Dec    0.72   0.96   0.57    0.78   0.69   0.74    0.71   0.64
Nov–Dec    0.72   0.92   0.52    0.79   0.25   0.75    0.58   0.24
Dec        0.64   0.83   0.21    0.71   0.29   0.59    0.23   0.29
Mean       0.71   0.88   0.75    0.74   0.76   0.70    0.61   0.56

Table: MAE of the point forecasts using different methods.
Parametric prediction intervals
1 Based on orthogonality and linear additivity, the total forecast variance is approximated by the sum of the individual variances:
  ξ_{n+h|n} = Var[y_{n+h}|I, Φ] ≈ Σ_{k=1}^K η_{n+h|n,k}φ_k²(x) + v_{n+h},
  where
  - η_{n+h|n,k} = Var(β_{n+h,k}|β_{1,k}, …, β_{n,k}) is obtained from a time series model;
  - v_{n+h} is estimated by averaging the squared residuals ε̂²(x) from (2) for each x.
2 Under normality, the (1 − α) prediction intervals for y_{n+h}(x) are
  ŷ_{n+h|n}(x) ± z_α(ξ_{n+h|n})^{1/2},
  where z_α is the (1 − α/2) standard normal quantile.
Nonparametric prediction intervals
1 The h-step-ahead forecast errors of the principal component scores are π_{t,h,k} = β̂_{t,k} − β̂_{t|t−h,k}, for t = h + 1, …, n, where h < n − 1.
2 By sampling with replacement, obtain bootstrap samples of β_{n+h,k}:
  β̂_{n+h|n,k}^{b,TS} = β̂_{n+h|n,k}^{TS} + π_{h,k}^{b*}, for b = 1, …, B.
3 Since the residuals {ε_1(x), …, ε_n(x)} are uncorrelated with the principal components, bootstrap the model residual term ε̂_{n+h|n}^b(x) by iid sampling.
4 Based on orthogonality and linear additivity, obtain B forecast variants of y_{n+h|n}(x):
  ŷ_{n+h|n}^b(x) = μ(x) + Σ_{k=1}^K β̂_{n+h|n,k}^{b,TS}φ_k(x) + ε̂_{n+h|n}^b(x).
5 The (1 − α) prediction intervals are given by the quantiles of ŷ_{n+h|n}^b(x).
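The bootstrap above can be sketched end to end. All names and the toy inputs are illustrative, and simple uniform resampling with replacement stands in for the paper's exact resampling scheme.

```python
import numpy as np

def bootstrap_interval(mu, Phi, beta_fc, score_errors, resid_curves,
                       alpha=0.2, B=1000, rng=None):
    """Bootstrap (1 - alpha) prediction intervals for the next curve.

    beta_fc: (K,) TS forecasts of the scores; score_errors: (n_err x K)
    historical h-step forecast errors pi; resid_curves: (n x p) model
    residual curves epsilon_t(x). Both are resampled with replacement.
    """
    rng = np.random.default_rng(rng)
    K, p = Phi.shape
    n_err = score_errors.shape[0]
    n_res = resid_curves.shape[0]
    sims = np.empty((B, p))
    for b in range(B):
        # resample one historical error per score, plus one residual curve
        beta_b = beta_fc + score_errors[rng.integers(n_err, size=K),
                                        np.arange(K)]
        eps_b = resid_curves[rng.integers(n_res)]
        sims[b] = mu + beta_b @ Phi + eps_b
    return np.quantile(sims, [alpha / 2, 1 - alpha / 2], axis=0)

rng = np.random.default_rng(6)
K, p = 2, 12
Phi = rng.standard_normal((K, p))
lo, hi = bootstrap_interval(np.zeros(p), Phi, np.array([1.0, -1.0]),
                            rng.standard_normal((50, K)) * 0.3,
                            rng.standard_normal((40, p)) * 0.1, rng=7)
print((lo <= hi).all())
```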
Distributional forecast updating
1 By sampling with replacement, obtain bootstrap samples of \beta_{n+1,k}
  for year n + 1:

    \beta^{b,\mathrm{TS}}_{n+1|n,k} = \beta^{\mathrm{TS}}_{n+1|n,k} + \pi^{b*}_{1,k}, for b = 1, \ldots, B.

2 The bootstrapped samples \beta^{b,\mathrm{TS}}_{n+1|n,k} lead to bootstrapped
  samples \beta^{b,\mathrm{PLS}}_{n+1} by (4).

3 From \beta^{b,\mathrm{PLS}}_{n+1}, obtain B replications of

    y^{b,\mathrm{PLS}}_{n+1}(x_l) = \mu(x_l) + \sum_{k=1}^{K} \beta^{b,\mathrm{PLS}}_{n+1,k}\, \phi_k(x_l) + \varepsilon_{n+1}(x_l).

4 The (1 - \alpha) prediction intervals are quantiles of y^{b,\mathrm{PLS}}_{n+1}(x_l).
Distributional forecast measure
1 The empirical conditional coverage probability was calculated as the ratio
  between the number of 'future' observations falling into the calculated
  prediction intervals and the number of testing observations:

    \mathrm{coverage} = \frac{1}{hp} \sum_{i=1}^{p} \sum_{j=1}^{h}
      I\big( y^{\mathrm{lb}}_{n+j|n}(x_i) < y_{n+j}(x_i) < y^{\mathrm{ub}}_{n+j|n}(x_i) \big).

  Mean coverage probability deviance (MCD) = average(empirical coverage -
  nominal coverage).

2 To assess which approach gives narrower prediction intervals, calculate
  the width of the prediction intervals:

    \mathrm{Width} = \frac{1}{hp} \sum_{i=1}^{p} \sum_{j=1}^{h}
      \big| y^{\mathrm{ub}}_{n+j|n}(x_i) - y^{\mathrm{lb}}_{n+j|n}(x_i) \big|.
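Both scores average over the h horizons and p grid points, so they reduce to simple means over an (h, p) array. A minimal sketch, with a toy example in place of real forecasts:

```python
import numpy as np

def interval_scores(y_true, lower, upper):
    """Empirical coverage and mean interval width over an (h, p) grid
    of forecast horizons and x points, per the formulas above."""
    inside = (lower < y_true) & (y_true < upper)
    coverage = inside.mean()                   # (1/hp) * sum of indicators
    width = np.abs(upper - lower).mean()       # (1/hp) * sum of |ub - lb|
    return coverage, width

# Toy check: intervals (-1, 1) always contain the true value 0.
h, p = 4, 10
y = np.zeros((h, p))
cov, w = interval_scores(y, np.full((h, p), -1.0), np.full((h, p), 1.0))
# cov = 1.0, w = 2.0
```

At a 95% nominal level, a well-calibrated method yields cov near 0.95 (small MCD) with the smallest attainable w.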
Distributional forecast comparison
              Parametric          Nonparametric
Period        TS       BM        TS       BM       PLS
Mar-Dec       97%      98%       97%      97%      95%
Apr-Dec       97%      98%       97%      97%      95%
May-Dec       96%      96%       96%      96%      96%
Jun-Dec       96%      96%       96%      95%      95%
Jul-Dec       95%      96%       95%      94%      94%
Aug-Dec       94%      94%       94%      94%      93%
Sep-Dec       93%      95%       93%      95%      93%
Oct-Dec       93%      93%       93%      93%      90%
Nov-Dec       93%      96%       93%      93%      93%
Dec           93%      100%      93%      93%      93%
MCD           1.58%    1.88%     1.58%    1.40%    1.49%

Table: Empirical coverage at the 95% nominal level; the smaller the mean coverage probability deviance (MCD), the better the method.
Distributional forecast comparison
              Parametric          Nonparametric
Period        TS       BM        TS       BM       PLS
Mar-Dec       3.65     3.64      3.55     3.51     3.15
Apr-Dec       3.73     3.73      3.62     3.66     3.21
May-Dec       3.69     3.69      3.57     3.61     3.21
Jun-Dec       3.58     3.58      3.47     3.50     3.05
Jul-Dec       3.47     3.46      3.38     3.41     2.90
Aug-Dec       3.34     3.33      3.26     3.37     2.61
Sep-Dec       3.26     3.26      3.19     3.25     2.82
Oct-Dec       3.27     3.28      3.20     3.23     2.78
Nov-Dec       3.23     3.24      3.16     3.26     2.69
Dec           3.19     3.18      3.12     3.30     2.48
Mean width    3.44     3.44      3.35     3.41     2.89

Table: Prediction interval width comparison at the 95% nominal level.
Conclusion of the third paper
1 Presented a nonparametric method to forecast univariate seasonal time series.

2 Showed the importance of dynamic updating for improving point forecast accuracy.

3 Among all dynamic updating methods, RR turns out to be the best.

4 It is possible to examine other penalty functions in both the PLS and RR methods.
Summary of the three papers

1 Proposed three graphical tools for visualizing functional data and identifying functional outliers.

2 Proposed a weighted functional principal component analysis to model and forecast mortality and fertility.

3 Applied the functional data analytic approach to model and forecast seasonal univariate time series.
References of three papers
Hyndman, R. J. and Shang, H. L. (2010) Rainbow plots, bagplots and boxplots for functional data, Journal of Computational and Graphical Statistics, 19(1), 29-45.

Hyndman, R. J. and Shang, H. L. (2009) Forecasting functional time series (with discussion), Journal of the Korean Statistical Society, 38(3), 199-221.

Shang, H. L. and Hyndman, R. J. (2011) Nonparametric time series forecasting with dynamic updating, Mathematics and Computers in Simulation, 81(7), 1310-1324.
References of three R packages
Shang, H. L. and Hyndman, R. J. (2011) rainbow: Rainbow plots, bagplots and boxplots for functional data, R package version 2.3.4, http://CRAN.R-project.org/package=rainbow.

Shang, H. L. and Hyndman, R. J. (2011) fds: Functional data sets, R package version 1.6, http://CRAN.R-project.org/package=fds.

Hyndman, R. J. and Shang, H. L. (2011) ftsa: Functional time series analysis, R package version 2.6, http://CRAN.R-project.org/package=ftsa.
Contact details

Thank you for your attention.

Keep in contact: [email protected]