STATISTICAL UNDERSTANDING OF
DEPENDENT DATA
Sofia C. Olhede and Adam M. Sykulski
Department of Statistical Science, UCL
Outline
• Time Series
• Different types
• Modelling
• Estimation techniques
• Random Fields
• Networks
Time Series Motivating examples
• Financial data (S&P 500)
• Neuroscientific data (fMRI scans)
• Global climate data
• Regional tidal data
• Ecological data (predator-prey)
• Data from the oceans
Time Series Analysis Introduction
• A time series is a sequence of data points sampled over time: $\{X_t\} = \{X_{t_1}, X_{t_2}, \ldots, X_{t_n}\}$
• Can be regularly spaced or irregularly spaced
• Can be univariate, complex-valued or multivariate
• Can be continuous or discrete valued
• Can be analysed in the time or frequency domain
• Can be used for summarising data or forecasting future values
• Can be analysed as a single time series or with multiple time series
• Can be analysed online or offline
• Can be modelled parametrically, non-parametrically, or semi-parametrically
Time Series Analysis Stationary v non-stationary
• A time series is stationary if its statistical properties do not change with time
• $\{X_t\}$ is said to be completely stationary if, for any set of times $t_1, t_2, \ldots, t_n$ and any integer $\tau$, the joint probability distribution of $\{X_{t_1}, X_{t_2}, \ldots, X_{t_n}\}$ is identical to that of $\{X_{t_1+\tau}, X_{t_2+\tau}, \ldots, X_{t_n+\tau}\}$
• $\{X_t\}$ is weakly stationary (or stationary up to order 2) if only the joint moments up to order 2 of the above distributions are required to exist and be identical. In this case we have:
$$E[X_t] = \mu, \qquad E[X_t^2] = k, \qquad \operatorname{var}(X_t) = E[X_t^2] - (E[X_t])^2 = k - \mu^2$$
• Weak stationarity implies that the mean and variance do not change over time
Time Series Analysis (Auto)covariance, (auto)correlation
• The covariance describes the dependence between two points in a time
series:
$$\operatorname{cov}\{X_t, X_{t+\tau}\} = E\big[(X_t - E[X_t])(X_{t+\tau} - E[X_{t+\tau}])\big]$$
• For stationary series, this simplifies to:
$$\operatorname{cov}\{X_t, X_{t+\tau}\} = E[(X_t - \mu)(X_{t+\tau} - \mu)] = E[X_t X_{t+\tau}] - \mu^2 = s(\tau)$$
• $s(\tau)$ is known as the autocovariance sequence, and the autocorrelation sequence is then:
$$\rho(\tau) = \frac{s(\tau)}{s(0)}$$
• Most time series are non-stationary,
but have stationary components
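As a quick illustration, here is a minimal sketch of estimating these quantities from data, assuming a stationary series in a NumPy array and the biased divide-by-$n$ convention for the autocovariance estimator:

```python
# Sample autocovariance and autocorrelation sequences (biased, divide-by-n).
import numpy as np

def sample_acvs(x, max_lag):
    """Sample autocovariance s_hat(tau) for tau = 0, ..., max_lag."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()                      # centre at the sample mean
    return np.array([np.sum(xc[: n - tau] * xc[tau:]) / n
                     for tau in range(max_lag + 1)])

def sample_acs(x, max_lag):
    """Sample autocorrelation rho_hat(tau) = s_hat(tau) / s_hat(0)."""
    s = sample_acvs(x, max_lag)
    return s / s[0]

# White noise should have correlations near zero at all non-zero lags
rng = np.random.default_rng(1)
print(sample_acs(rng.standard_normal(1000), max_lag=5))
```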
Time Series Analysis Gaussian Processes
• $\{X_t\}$ is called a Gaussian process if, for all $t_1, t_2, \ldots, t_n$, $\{X_{t_1}, X_{t_2}, \ldots, X_{t_n}\}$ has a multivariate normal distribution
• The multivariate normal distribution is completely specified by its
mean and covariance matrix
• In this case, if {𝑋𝑡} is stationary to order 2 (weakly stationary), then it
must also be completely stationary
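Because the multivariate normal is fully specified by its mean and covariance, a realisation of a Gaussian process on a grid of times can be drawn directly. A minimal sketch, assuming an illustrative squared-exponential covariance (not a model from the slides):

```python
# Draw one sample path of a Gaussian process on a grid of times.
import numpy as np

t = np.arange(100)
mean = np.zeros(len(t))
ell = 5.0  # illustrative length-scale
# Covariance matrix C[i, j] = exp(-(t_i - t_j)^2 / (2 * ell^2))
C = np.exp(-(t[:, None] - t[None, :]) ** 2 / (2 * ell ** 2))
C += 1e-9 * np.eye(len(t))             # small jitter for numerical stability

rng = np.random.default_rng(0)
x = rng.multivariate_normal(mean, C)   # one realisation of the process
print(x[:5])
```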
Time Series Analysis Some examples
• Gaussian White noise (stationary)
[Figure: simulated sample path, t = 0, …, 1000]
Time Series Analysis Some examples
• Brownian Motion (random walk, non-stationary)
[Figure: simulated sample path, t = 0, …, 1000]
Time Series Analysis Some examples
• Sinusoid + Gaussian White noise (cyclostationary)
[Figure: simulated sample path, t = 0, …, 1000]
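A minimal sketch of simulating all three examples, assuming unit-variance Gaussian noise and an arbitrary illustrative amplitude and period for the sinusoid:

```python
# Simulate the three example series: white noise, random walk, sinusoid + noise.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
t = np.arange(n)

eps = rng.standard_normal(n)                    # Gaussian white noise (stationary)
brownian = np.cumsum(rng.standard_normal(n))    # random walk (non-stationary)
cyclo = 10 * np.sin(2 * np.pi * t / 100) + eps  # sinusoid + noise (cyclostationary)
```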
Time Series Analysis Trends and Cycles
• Some non-stationary components of a time series can be modelled
• Linear (or polynomial) trends, cycles, seasonal effects
• Removing them can be useful as we can discover an underlying
stationary process
• Approach taken in unobserved components model (Harvey, 1989)
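A minimal sketch of this idea for a linear trend, assuming the series is trend plus stationary noise; the residual after least-squares detrending is the candidate stationary component:

```python
# Fit and remove a linear trend by least squares.
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(500, dtype=float)
x = 0.02 * t + rng.standard_normal(500)      # linear trend plus white noise

slope, intercept = np.polyfit(t, x, deg=1)   # fit a degree-1 polynomial trend
detrended = x - (slope * t + intercept)      # residual should look stationary
print(slope, intercept)
```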
Time Series Analysis Stochastic Processes
• Used to represent the evolution of a series over time
• The probabilistic counterpart of a deterministic process (modelled, for example, using ordinary differential equations)
• With stochastic processes there is indeterminacy: even if a value at
a given time is known, there are several (often infinite) values at
other times
• Common examples include (fractional) Brownian motion, auto-
regressive or moving average processes and Poisson processes
• Some can be represented as a stochastic differential equation (SDE), e.g.:
$$dX_t = -\alpha X_t\,dt + \sigma\,dW_t$$
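A rough sketch of simulating this SDE (an Ornstein-Uhlenbeck process) with the Euler-Maruyama scheme; the values of alpha, sigma, and the step size dt are illustrative assumptions:

```python
# Euler-Maruyama simulation of dX_t = -alpha * X_t dt + sigma dW_t.
import numpy as np

alpha, sigma, dt, n = 0.5, 1.0, 0.01, 10_000
rng = np.random.default_rng(0)

x = np.empty(n)
x[0] = 0.0
for k in range(n - 1):
    dW = np.sqrt(dt) * rng.standard_normal()        # Brownian increment
    x[k + 1] = x[k] - alpha * x[k] * dt + sigma * dW
```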
Time Series Analysis Continuous v Discrete
• Stochastic processes can be either continuous or discrete in both time and state space
• Discrete time and discrete state space
• Markov chains
• Continuous time and continuous state space
• Brownian Motion
• Discrete time and continuous state space
• Autoregressive and moving average processes
• Continuous time and discrete state space
• Poisson process
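As one concrete case from this taxonomy, a minimal sketch of simulating a Poisson process (continuous time, discrete state space) via its exponential inter-arrival times; the rate and horizon are arbitrary illustrative choices:

```python
# Simulate a Poisson process via exponential waiting times between events.
import numpy as np

lam, t_max = 2.0, 10.0                 # rate and time horizon (illustrative)
rng = np.random.default_rng(0)

arrivals = []
t = rng.exponential(1.0 / lam)         # waiting time to the first event
while t < t_max:
    arrivals.append(t)
    t += rng.exponential(1.0 / lam)

# The count of events up to time t is the discrete-valued state N_t
print(len(arrivals), "events by time", t_max)
```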
Time Series Analysis Mean-reversion
• Is a process guaranteed to return to the mean? How long will it take?
• Stationary processes are always mean-reverting
• Non-stationary processes may or may not be mean-reverting
• For Brownian motion, the mean is zero and the process is guaranteed to return to zero, but the expected waiting time is infinite!
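A rough empirical illustration of the last point, using a simple random walk as a discrete stand-in for Brownian motion: every path eventually returns to zero, yet the sample mean of the (truncated) return times keeps growing as the horizon is extended, consistent with an infinite expected waiting time:

```python
# Empirical first-return times of a simple random walk.
import numpy as np

rng = np.random.default_rng(0)

def first_return_time(horizon):
    """Steps until the walk first returns to 0, truncated at the horizon."""
    walk = np.cumsum(rng.choice([-1, 1], size=horizon))
    hits = np.flatnonzero(walk == 0)
    return hits[0] + 1 if hits.size else horizon

for horizon in (1_000, 10_000, 100_000):
    times = [first_return_time(horizon) for _ in range(200)]
    print(horizon, np.mean(times))   # sample mean grows with the horizon
```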
Time Series Analysis Parametric v non-parametric v semi-parametric
• A parametric model means we describe the time series using a family of probability distributions indexed by the vector-valued parameter $\theta$, all possible values of which lie in a finite-dimensional space
• With a non-parametric model, the set of all possible values for $\theta$ is infinite-dimensional
• With a semi-parametric model, $\theta$ has both a finite-dimensional and an infinite-dimensional component
• We are interested in estimating the finite-dimensional component of $\theta$
• The infinite-dimensional component of $\theta$ is treated as a nuisance parameter
Time Series Analysis Method of Moments estimation
• Estimating $\theta$ by equating sample moments with the theoretical moments of the model
• Example: white noise from a Gamma distribution, with theoretical moments
$$E[X_t] = \alpha\beta, \qquad E[X_t^2] = \beta^2\alpha(\alpha + 1)$$
• The sample moments from an observed time series are given by:
$$m_1 = \frac{X_{t_1} + \cdots + X_{t_n}}{n}, \qquad m_2 = \frac{X_{t_1}^2 + \cdots + X_{t_n}^2}{n}$$
• We then equate the moments and solve simultaneously:
$$\hat{\alpha} = \frac{m_1^2}{m_2 - m_1^2}, \qquad \hat{\beta} = \frac{m_2 - m_1^2}{m_1}$$
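A minimal sketch of this worked example: simulate Gamma white noise, form $m_1$ and $m_2$, and invert the moment equations; the true shape and scale below are illustrative choices:

```python
# Method-of-moments estimation for Gamma white noise.
import numpy as np

rng = np.random.default_rng(0)
alpha_true, beta_true = 3.0, 2.0                 # shape and scale (illustrative)
x = rng.gamma(shape=alpha_true, scale=beta_true, size=10_000)

m1 = np.mean(x)          # first sample moment
m2 = np.mean(x ** 2)     # second sample moment

alpha_hat = m1 ** 2 / (m2 - m1 ** 2)
beta_hat = (m2 - m1 ** 2) / m1
print(alpha_hat, beta_hat)   # should be close to (3.0, 2.0)
```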
Time Series Analysis Maximum likelihood estimation
• Estimating $\theta$ by maximising the likelihood function:
$$\mathcal{L}(\theta \mid x_{t_1}, \ldots, x_{t_n})$$
• Maximises the 'agreement' between data and model, i.e. the probability of the observed data under the selected model
• For time series we can estimate $\theta$ by specifying a covariance matrix and then maximising the log-likelihood (up to an additive constant):
$$\hat{\theta} = \arg\max_\theta\, \ell(\theta \mid x_{t_1}, \ldots, x_{t_n}) = \arg\max_\theta \left\{ -\tfrac{1}{2}\log|C(\theta)| - \tfrac{1}{2} X^T C^{-1}(\theta) X \right\}$$
• The variance of the parameter estimates can be assessed from the asymptotic result:
$$\sqrt{n}\,(\hat{\theta} - \theta_0) \xrightarrow{d} N(0, I^{-1})$$
where $I$ is the Fisher information matrix
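A rough sketch of Gaussian maximum likelihood for a short series, assuming (purely for illustration) a covariance of AR(1) form, $C_{ij}(\theta) = \sigma^2 \phi^{|i-j|}$, and maximising the log-likelihood numerically:

```python
# Gaussian ML estimation of (sigma^2, phi) with an AR(1)-form covariance.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, sigma2_true, phi_true = 200, 2.0, 0.7
idx = np.arange(n)
lags = np.abs(idx[:, None] - idx[None, :])
C_true = sigma2_true * phi_true ** lags
x = rng.multivariate_normal(np.zeros(n), C_true)

def neg_loglik(theta):
    """Negative Gaussian log-likelihood, up to an additive constant."""
    sigma2, phi = theta
    if sigma2 <= 0 or not (0 < phi < 1):
        return np.inf                     # stay inside the parameter space
    C = sigma2 * phi ** lags
    _, logdet = np.linalg.slogdet(C)
    return 0.5 * logdet + 0.5 * x @ np.linalg.solve(C, x)

res = minimize(neg_loglik, x0=[1.0, 0.5], method="Nelder-Mead")
print(res.x)   # estimates of (sigma^2, phi), close to (2.0, 0.7)
```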
Time Series Analysis Comparing method of moments v maximum likelihood
• Method of moments is quicker to compute; maximum likelihood usually requires numerical optimisation
• Maximum likelihood has a higher probability of yielding estimates close to the true value (it is asymptotically efficient)
• Method of moments can lead to parameter estimates outside the parameter space
• Method of moments can be used when the probability distribution is not fully known
• Method-of-moments estimators are not necessarily sufficient statistics
Time Series Analysis Kalman Filter
• Recursive estimation of the hidden state of a noisily observed process, which in turn provides the likelihood needed for parameter estimation
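A minimal sketch of the Kalman filter recursions for a scalar local-level model, $x_t = x_{t-1} + w_t$, $y_t = x_t + v_t$, where the noise variances $q$ and $r$ are assumed known purely for illustration:

```python
# Scalar Kalman filter for a local-level (random-walk plus noise) model.
import numpy as np

rng = np.random.default_rng(0)
n, q, r = 200, 0.1, 1.0       # steps, state noise variance, observation noise variance

# Simulate the hidden state and the noisy observations
x = np.cumsum(np.sqrt(q) * rng.standard_normal(n))   # random-walk state
y = x + np.sqrt(r) * rng.standard_normal(n)          # noisy measurements

# Kalman filter recursions: predict, then update
m, P = 0.0, 1.0               # prior mean and variance of the state
est = np.empty(n)
for t in range(n):
    P = P + q                 # predict: variance grows by the state noise
    K = P / (P + r)           # Kalman gain
    m = m + K * (y[t] - m)    # update with the innovation y_t - m
    P = (1 - K) * P
    est[t] = m                # filtered state estimate at time t
```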