STATISTICAL UNDERSTANDING OF
DEPENDENT DATA
Sofia C. Olhede and Adam M. Sykulski
Department of Statistical Science, UCL
Outline
• Time Series
• Different types
• Modelling
• Estimation techniques
• Random Fields
• Networks
Time Series Motivating examples
• Financial data (S&P 500)
• Neuroscientific data (fMRI scans)
• Global climate data
• Regional tidal data
• Ecological data (predator-prey)
• Data from the oceans
Time Series Analysis Introduction
• A time series is a sequence of data points sampled over time: $\{X_t\} = \{X_{t_1}, X_{t_2}, \ldots, X_{t_n}\}$
• Can be regularly spaced or irregularly spaced
• Can be univariate, complex-valued or multivariate
• Can be continuous or discrete valued
• Can be analysed in the time or frequency domain
• Can be used for summarising data or forecasting future values
• Can be analysed as a single time series or with multiple time series
• Can be analysed online or offline
• Can be modelled parametrically, non-parametrically, or semi-parametrically
Time Series Analysis Stationary v non-stationary
• A time series is stationary if its statistical properties do not change with time
• $\{X_t\}$ is said to be completely stationary if, for any set of times $t_1, t_2, \ldots, t_n$ and any integer $\tau$, the joint probability distribution of $\{X_{t_1}, X_{t_2}, \ldots, X_{t_n}\}$ is identical to that of $\{X_{t_1+\tau}, X_{t_2+\tau}, \ldots, X_{t_n+\tau}\}$
• $\{X_t\}$ is weakly stationary (or stationary up to order 2) if only the joint moments up to order 2 of the above distributions are required to exist and be identical. In this case we have:
$$E[X_t] = \mu, \qquad E[X_t^2] = k, \qquad \operatorname{var}(X_t) = E[X_t^2] - (E[X_t])^2 = k - \mu^2$$
• Weak stationarity implies that the mean and variance do not change over time
Time Series Analysis (Auto)covariance, (auto)correlation
• The covariance describes the dependence between two points in a time
series:
$$\operatorname{cov}\{X_t, X_{t+\tau}\} = E\big[(X_t - E[X_t])(X_{t+\tau} - E[X_{t+\tau}])\big]$$
• For stationary series, this simplifies to:
$$\operatorname{cov}\{X_t, X_{t+\tau}\} = E[(X_t - \mu)(X_{t+\tau} - \mu)] = E[X_t X_{t+\tau}] - \mu^2 = s(\tau)$$
• $s(\tau)$ is known as the autocovariance sequence, and the autocorrelation sequence is then:
$$\rho(\tau) = \frac{s(\tau)}{s(0)}$$
• Most time series are non-stationary,
but have stationary components
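As a quick illustration, here is a minimal sketch of estimating these quantities from data, assuming a stationary series in a NumPy array and the biased divide-by-$n$ convention for the autocovariance estimator:

```python
# Sample autocovariance and autocorrelation sequences (biased, divide-by-n).
import numpy as np

def sample_acvs(x, max_lag):
    """Sample autocovariance s_hat(tau) for tau = 0, ..., max_lag."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()                      # centre at the sample mean
    return np.array([np.sum(xc[: n - tau] * xc[tau:]) / n
                     for tau in range(max_lag + 1)])

def sample_acs(x, max_lag):
    """Sample autocorrelation rho_hat(tau) = s_hat(tau) / s_hat(0)."""
    s = sample_acvs(x, max_lag)
    return s / s[0]

# White noise should have correlations near zero at all non-zero lags
rng = np.random.default_rng(1)
print(sample_acs(rng.standard_normal(1000), max_lag=5))
```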
Time Series Analysis Gaussian Processes
• $\{X_t\}$ is called a Gaussian process if, for all $t_1, t_2, \ldots, t_n$, $\{X_{t_1}, X_{t_2}, \ldots, X_{t_n}\}$ has a multivariate normal distribution
• The multivariate normal distribution is completely specified by its
mean and covariance matrix
• In this case, if {𝑋𝑡} is stationary to order 2 (weakly stationary), then it
must also be completely stationary
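Because the multivariate normal is fully specified by its mean and covariance, a realisation of a Gaussian process on a grid of times can be drawn directly. A minimal sketch, assuming an illustrative squared-exponential covariance (not a model from the slides):

```python
# Draw one sample path of a Gaussian process on a grid of times.
import numpy as np

t = np.arange(100)
mean = np.zeros(len(t))
ell = 5.0  # illustrative length-scale
# Covariance matrix C[i, j] = exp(-(t_i - t_j)^2 / (2 * ell^2))
C = np.exp(-(t[:, None] - t[None, :]) ** 2 / (2 * ell ** 2))
C += 1e-9 * np.eye(len(t))             # small jitter for numerical stability

rng = np.random.default_rng(0)
x = rng.multivariate_normal(mean, C)   # one realisation of the process
print(x[:5])
```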
Time Series Analysis Some examples
• Gaussian White noise (stationary)
[Figure: simulated sample path, t = 0, …, 1000]
Time Series Analysis Some examples
• Brownian Motion (random walk, non-stationary)
[Figure: simulated sample path, t = 0, …, 1000]
Time Series Analysis Some examples
• Sinusoid + Gaussian White noise (cyclostationary)
[Figure: simulated sample path, t = 0, …, 1000]
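A minimal sketch of simulating all three examples, assuming unit-variance Gaussian noise and an arbitrary illustrative amplitude and period for the sinusoid:

```python
# Simulate the three example series: white noise, random walk, sinusoid + noise.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
t = np.arange(n)

eps = rng.standard_normal(n)                    # Gaussian white noise (stationary)
brownian = np.cumsum(rng.standard_normal(n))    # random walk (non-stationary)
cyclo = 10 * np.sin(2 * np.pi * t / 100) + eps  # sinusoid + noise (cyclostationary)
```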
Time Series Analysis Trends and Cycles
• Some non-stationary components of a time series can be modelled
• Linear (or polynomial) trends, cycles, seasonal effects
• Removing them can be useful as we can discover an underlying
stationary process
• Approach taken in unobserved components model (Harvey, 1989)
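A minimal sketch of this idea for a linear trend, assuming the series is trend plus stationary noise; the residual after least-squares detrending is the candidate stationary component:

```python
# Fit and remove a linear trend by least squares.
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(500, dtype=float)
x = 0.02 * t + rng.standard_normal(500)      # linear trend plus white noise

slope, intercept = np.polyfit(t, x, deg=1)   # fit a degree-1 polynomial trend
detrended = x - (slope * t + intercept)      # residual should look stationary
print(slope, intercept)
```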
Time Series Analysis Stochastic Processes
• Used to represent the evolution of a series over time
• The probabilistic counterpart of a deterministic process (modelled, for example, using ordinary differential equations)
• With stochastic processes there is indeterminacy: even if a value at
a given time is known, there are several (often infinite) values at
other times
• Common examples include (fractional) Brownian motion, auto-
regressive or moving average processes and Poisson processes
• Some can be represented as a stochastic differential equation (SDE), e.g.:
$$dX_t = -\alpha X_t\,dt + \sigma\,dW_t$$
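A rough sketch of simulating this SDE (an Ornstein-Uhlenbeck process) with the Euler-Maruyama scheme; the values of alpha, sigma, and the step size dt are illustrative assumptions:

```python
# Euler-Maruyama simulation of dX_t = -alpha * X_t dt + sigma dW_t.
import numpy as np

alpha, sigma, dt, n = 0.5, 1.0, 0.01, 10_000
rng = np.random.default_rng(0)

x = np.empty(n)
x[0] = 0.0
for k in range(n - 1):
    dW = np.sqrt(dt) * rng.standard_normal()        # Brownian increment
    x[k + 1] = x[k] - alpha * x[k] * dt + sigma * dW
```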
Time Series Analysis Continuous v Discrete
• Stochastic processes can be either continuous or discrete in both time and state space
• Discrete time and discrete state space
• Markov chains
• Continuous time and continuous state space
• Brownian Motion
• Discrete time and continuous state space
• Autoregressive and moving average processes
• Continuous time and discrete state space
• Poisson process
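As one concrete case from this taxonomy, a minimal sketch of simulating a Poisson process (continuous time, discrete state space) via its exponential inter-arrival times; the rate and horizon are arbitrary illustrative choices:

```python
# Simulate a Poisson process via exponential waiting times between events.
import numpy as np

lam, t_max = 2.0, 10.0                 # rate and time horizon (illustrative)
rng = np.random.default_rng(0)

arrivals = []
t = rng.exponential(1.0 / lam)         # waiting time to the first event
while t < t_max:
    arrivals.append(t)
    t += rng.exponential(1.0 / lam)

# The count of events up to time t is the discrete-valued state N_t
print(len(arrivals), "events by time", t_max)
```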
Time Series Analysis Mean-reversion
• Is a process guaranteed to return to the mean? How long will it take?
• Stationary processes are always mean-reverting
• Non-stationary processes may or may not be mean-reverting
• For Brownian motion, the mean is zero and the process is guaranteed to return to zero, but the expected waiting time is infinite!
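A rough empirical illustration of the last point, using a simple random walk as a discrete stand-in for Brownian motion: every path eventually returns to zero, yet the sample mean of the (truncated) return times keeps growing as the horizon is extended, consistent with an infinite expected waiting time:

```python
# Empirical first-return times of a simple random walk.
import numpy as np

rng = np.random.default_rng(0)

def first_return_time(horizon):
    """Steps until the walk first returns to 0, truncated at the horizon."""
    walk = np.cumsum(rng.choice([-1, 1], size=horizon))
    hits = np.flatnonzero(walk == 0)
    return hits[0] + 1 if hits.size else horizon

for horizon in (1_000, 10_000, 100_000):
    times = [first_return_time(horizon) for _ in range(200)]
    print(horizon, np.mean(times))   # sample mean grows with the horizon
```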
Time Series Analysis Parametric v non-parametric v semi-parametric
• A parametric model means we describe the time series using a family of probability distributions indexed by the vector-valued parameter $\theta$, all possible values of which lie in a finite-dimensional space
• With a non-parametric model, the set of all possible values for $\theta$ is infinite-dimensional
• With a semi-parametric model, $\theta$ has both a finite-dimensional and an infinite-dimensional component
• We are interested in estimating the finite-dimensional component of $\theta$
• The infinite-dimensional component of $\theta$ is treated as a nuisance parameter
Time Series Analysis Method of Moments estimation
• Estimating $\theta$ by equating sample moments with the theoretical moments of the model
• Example: white noise from a Gamma distribution, with theoretical moments
$$E[X_t] = \alpha\beta, \qquad E[X_t^2] = \beta^2\alpha(\alpha + 1)$$
• The sample moments from an observed time series are given by:
$$m_1 = \frac{X_{t_1} + \cdots + X_{t_n}}{n}, \qquad m_2 = \frac{X_{t_1}^2 + \cdots + X_{t_n}^2}{n}$$
• We then equate the moments and solve simultaneously:
$$\hat{\alpha} = \frac{m_1^2}{m_2 - m_1^2}, \qquad \hat{\beta} = \frac{m_2 - m_1^2}{m_1}$$
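A minimal sketch of this worked example: simulate Gamma white noise, form $m_1$ and $m_2$, and invert the moment equations; the true shape and scale below are illustrative choices:

```python
# Method-of-moments estimation for Gamma white noise.
import numpy as np

rng = np.random.default_rng(0)
alpha_true, beta_true = 3.0, 2.0                 # shape and scale (illustrative)
x = rng.gamma(shape=alpha_true, scale=beta_true, size=10_000)

m1 = np.mean(x)          # first sample moment
m2 = np.mean(x ** 2)     # second sample moment

alpha_hat = m1 ** 2 / (m2 - m1 ** 2)
beta_hat = (m2 - m1 ** 2) / m1
print(alpha_hat, beta_hat)   # should be close to (3.0, 2.0)
```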
Time Series Analysis Maximum likelihood estimation
• Estimating $\theta$ by maximising the likelihood function:
$$\mathcal{L}(\theta \mid x_{t_1}, \ldots, x_{t_n})$$
• Maximises the 'agreement' between data and model, i.e. the probability of the observed data under the selected model
• For time series we can estimate $\theta$ by specifying a covariance matrix and then maximising the log-likelihood (up to an additive constant):
$$\hat{\theta} = \arg\max_\theta\, \ell(\theta \mid x_{t_1}, \ldots, x_{t_n}) = \arg\max_\theta \left\{ -\tfrac{1}{2}\log|C(\theta)| - \tfrac{1}{2} X^T C^{-1}(\theta) X \right\}$$
• The variance of the parameter estimates can be assessed from the asymptotic result:
$$\sqrt{n}\,(\hat{\theta} - \theta_0) \xrightarrow{d} N(0, I^{-1})$$
where $I$ is the Fisher information matrix
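A rough sketch of Gaussian maximum likelihood for a short series, assuming (purely for illustration) a covariance of AR(1) form, $C_{ij}(\theta) = \sigma^2 \phi^{|i-j|}$, and maximising the log-likelihood numerically:

```python
# Gaussian ML estimation of (sigma^2, phi) with an AR(1)-form covariance.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, sigma2_true, phi_true = 200, 2.0, 0.7
idx = np.arange(n)
lags = np.abs(idx[:, None] - idx[None, :])
C_true = sigma2_true * phi_true ** lags
x = rng.multivariate_normal(np.zeros(n), C_true)

def neg_loglik(theta):
    """Negative Gaussian log-likelihood, up to an additive constant."""
    sigma2, phi = theta
    if sigma2 <= 0 or not (0 < phi < 1):
        return np.inf                     # stay inside the parameter space
    C = sigma2 * phi ** lags
    _, logdet = np.linalg.slogdet(C)
    return 0.5 * logdet + 0.5 * x @ np.linalg.solve(C, x)

res = minimize(neg_loglik, x0=[1.0, 0.5], method="Nelder-Mead")
print(res.x)   # estimates of (sigma^2, phi), close to (2.0, 0.7)
```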
Time Series Analysis Comparing method of moments v maximum likelihood
• Method of moments is quicker to compute; maximum likelihood usually requires numerical optimisation
• Maximum likelihood has a higher probability of yielding estimates close to the true value (it is asymptotically efficient)
• Method of moments can lead to parameter estimates outside the parameter space
• Method of moments can be used when the probability distribution is not fully known
• Method-of-moments estimators are not necessarily sufficient statistics
Time Series Analysis Kalman Filter
• Recursive estimation of the hidden state of a noisily observed process, which in turn provides the likelihood needed for parameter estimation
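A minimal sketch of the Kalman filter recursions for a scalar local-level model, $x_t = x_{t-1} + w_t$, $y_t = x_t + v_t$, where the noise variances $q$ and $r$ are assumed known purely for illustration:

```python
# Scalar Kalman filter for a local-level (random-walk plus noise) model.
import numpy as np

rng = np.random.default_rng(0)
n, q, r = 200, 0.1, 1.0       # steps, state noise variance, observation noise variance

# Simulate the hidden state and the noisy observations
x = np.cumsum(np.sqrt(q) * rng.standard_normal(n))   # random-walk state
y = x + np.sqrt(r) * rng.standard_normal(n)          # noisy measurements

# Kalman filter recursions: predict, then update
m, P = 0.0, 1.0               # prior mean and variance of the state
est = np.empty(n)
for t in range(n):
    P = P + q                 # predict: variance grows by the state noise
    K = P / (P + r)           # Kalman gain
    m = m + K * (y[t] - m)    # update with the innovation y_t - m
    P = (1 - K) * P
    est[t] = m                # filtered state estimate at time t
```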