Analysis of time series
Riccardo Bellazzi
Dipartimento di Informatica e Sistemistica, Università di Pavia
[Figure: time course of blood flux; x-axis: dialysis sessions (0 to 70), y-axis: blood flux (150 to 220)]
Time series
• Time series: a collection of observations made sequentially in time
• Many application fields:
  – Economic time series
  – Physical time series
  – Marketing time series
  – Process control
• Characteristics:
  – Successive observations are NOT independent
  – The order of observation is crucial
Outline
• Dynamic systems basics
  – Basic concepts
  – Linear and non linear dynamic systems
• Structural and black box models of dynamic systems
  – Time series analysis
• AI approaches for the analysis of time series
  – Knowledge-based Temporal Abstractions
  – Knowledge-discovery through clustering of time series
Dynamical systems
• System: a (physical) entity that can be manipulated with actions, called inputs (u), and that, as a consequence of those actions, gives a measurable reaction, called output (y)
• Dynamic: the system changes over time; in general, the output does not only depend on the input, but also on the current “state” of the system (x), i.e. on the system history
[Figure: block diagram, input u → system with state x → output y]
A dynamical system (example)
• A simple circuit with two lamps and one switch with values 0 (u1) or 1 (u2). The output can be y={y1 (lamp1 on), y2 (lamp2 on), y3 (off)}. The system is configured to have four states, x1, x2, x3, x4
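[Figure: state transition diagram with the four states x1, x2, x3, x4, the outputs y1, y2, y3, and transitions labeled with the inputs u1 and u2]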
Dynamical system definition
• A dynamical system is a process in which a function's value changes over time according to a rule that is defined in terms of the function's current value and the current time.
Modeling a dynamical system
• Two ingredients:
  – A state transition function: $x(t) = f(t, t_0, x_0, u(\cdot))$
  – An output transformation: $y(t) = h(t, x(t))$
Main classes of dynamical systems
• Continuous / discrete
• Linear / nonlinear
• Time invariant / variant systems
• Single / Multiple Input / Outputs
• Deterministic / stochastic
Discrete and continuous systems
• Discrete: the time set is the set of integer numbers (t = 1, 2, …, k, …). The system is typically modeled with difference equations:
  $x(k+1) = f(x(k), u(k)), \quad y(k) = h(x(k))$
• Continuous: the time set is the set of non-negative real numbers. The system is typically modeled with differential equations:
  $\dfrac{dx}{dt} = f(x(t), u(t)), \quad x(t_0) = x_0, \quad y(t) = h(x(t))$
Equilibrium
The pair $(\bar{x}, \bar{u})$ defines an equilibrium if and only if
$0 = f(\bar{x}, \bar{u})$
The output at the equilibrium is given by
$\bar{y} = g(\bar{x}, \bar{u})$
Compartmental models
x1 = drug concentration in the gastrointestinal compartment (mg/cc)
x2 = drug concentration in the hematic compartment (mg/cc)
k1 = transfer coefficient for the gastrointestinal compartment (h^-1)
k2 = transfer coefficient for metabolic and excretory systems (h^-1)
$\dfrac{dx_1}{dt} = -k_1 x_1 + b_1 u_1$
$\dfrac{dx_2}{dt} = k_1 x_1 - k_2 x_2 + b_2 u_2$
[Figure: two-compartment diagram; ingestion u1 enters the gastrointestinal compartment, which transfers to the hematic compartment with coefficient k1; injection u2 enters the hematic compartment, which is cleared by elimination with coefficient k2]
States and inputs: x1, x2, u1, u2
Equilibrium
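Given constant inputs $\bar{u}_1$ and $\bar{u}_2$, the model
$\dfrac{dx_1}{dt} = -k_1 x_1 + b_1 u_1$
$\dfrac{dx_2}{dt} = k_1 x_1 - k_2 x_2 + b_2 u_2$
is at equilibrium when
$0 = -k_1 \bar{x}_1 + b_1 \bar{u}_1$
$0 = k_1 \bar{x}_1 - k_2 \bar{x}_2 + b_2 \bar{u}_2$
that is,
$\bar{x}_1 = \dfrac{b_1 \bar{u}_1}{k_1}, \quad \bar{x}_2 = \dfrac{k_1 \bar{x}_1 + b_2 \bar{u}_2}{k_2}$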
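The equilibrium formulas can be checked numerically. The sketch below is only an illustration: the parameter and input values are hypothetical (they are not taken from the slides), and the function names are introduced here for the example.

```python
def derivatives(x1, x2, u1, u2, k1, k2, b1, b2):
    """Right-hand side of the two-compartment model."""
    dx1 = -k1 * x1 + b1 * u1
    dx2 = k1 * x1 - k2 * x2 + b2 * u2
    return dx1, dx2


def equilibrium(u1, u2, k1, k2, b1, b2):
    """Equilibrium state for constant inputs u1, u2."""
    x1 = b1 * u1 / k1
    x2 = (k1 * x1 + b2 * u2) / k2
    return x1, x2


# Hypothetical parameter values (h^-1) and constant inputs, chosen for illustration only.
k1, k2, b1, b2 = 0.5, 0.2, 1.0, 1.0
u1, u2 = 2.0, 1.0

x1_eq, x2_eq = equilibrium(u1, u2, k1, k2, b1, b2)
dx1_eq, dx2_eq = derivatives(x1_eq, x2_eq, u1, u2, k1, k2, b1, b2)
print(x1_eq, x2_eq, dx1_eq, dx2_eq)
```

At the computed state both derivatives vanish (up to rounding), which is exactly the equilibrium condition 0 = f(x, u).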
Stability of equilibria
$\dot{x}(t) = f(x(t)), \quad x(0) = x_0$
An equilibrium x = a is asymptotically stable if all the solutions starting in a neighbourhood of a move towards it.
[Figure: plot of f(x) against x]
Phase portrait
$\dot{x}_1 = f_1(x_1, x_2)$
$\dot{x}_2 = f_2(x_1, x_2)$
The locus in the x1-x2 plane of the solution x(t) for all t > 0 is a curve that passes through the point x0. The x1-x2 plane is usually called the state plane or phase plane.
For easy visualization, we represent f(x)=(f1(x),f2(x)), x= (x1,x2 ), as a vector, that is, we assign to x the directed line segment from x to x + f(x).
The family of all trajectories or solution curves is called the phase portrait.
A Phase portrait of a pendulum
$x' = y, \quad y' = -\sin(x) - y$ (with M = g = l = 1)
[Figure: phase portrait in the x-y plane (x from -2 to 4, y from -4 to 2), marking an unstable equilibrium and an asymptotically stable equilibrium]
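The behaviour near the two kinds of equilibria can be explored with a short simulation. This is a minimal sketch, assuming a plain explicit Euler scheme with a small step (the integration method, step size and initial conditions are choices made here, not part of the slides). The equations x' = y, y' = -sin(x) - y have equilibria where y = 0 and sin(x) = 0, for instance (0, 0) and (π, 0).

```python
import math


def simulate_pendulum(x0, y0, t_end=40.0, dt=0.001):
    """Integrate x' = y, y' = -sin(x) - y with the explicit Euler method."""
    x, y = x0, y0
    for _ in range(int(round(t_end / dt))):
        # Both updates use the state from the previous step.
        x, y = x + dt * y, y + dt * (-math.sin(x) - y)
    return x, y


# Start in a neighbourhood of (0, 0): the trajectory moves towards it.
near_stable = simulate_pendulum(1.0, 0.0)

# Start very close to (pi, 0): the trajectory leaves that neighbourhood.
near_unstable = simulate_pendulum(math.pi - 0.01, 0.0)

print(near_stable, near_unstable)
```

Both trajectories end close to (0, 0): the first because it started near the asymptotically stable equilibrium, the second because it moved away from the unstable one.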
The phase portraits
• Fixed or equilibrium points
• Periodic orbits or limit cycles
• Quasi-periodic attractors
• Chaotic or strange attractors
Non linear dynamic systems theory studies the properties of the system in the phase plane
Linear systems
• Linear systems: f and h are linear in x and u
  $\dfrac{dx}{dt} = f(x(t), u(t)), \quad x(t_0) = x_0, \quad y(t) = h(x(t))$
• Linear Time Invariant (LTI) Systems
  $\dot{x}(t) = A x(t) + B u(t)$
  $y(t) = C x(t) + D u(t)$
Theorem: An equilibrium point of an LTI system is stable, asymptotically stable or unstable if and only if every equilibrium point of the system is stable, asymptotically stable or unstable, respectively
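Linear systems: input/output representation
• A linear system can be represented in the frequency domain
$y(t) = \int_0^t g(t - \tau)\, u(\tau)\, d\tau$
$Y(s) = \mathcal{L}[y(t)] = G(s)\, U(s)$
[Figure: block diagrams u(t) → g(t) → y(t) in the time domain and U(s) → G(s) → Y(s) in the frequency domain]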
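The time-domain side of this representation can be illustrated with a numerical convolution. The sketch below assumes a specific example that does not appear in the slides: the impulse response g(t) = e^{-t} (i.e. G(s) = 1/(s+1)) driven by a unit step, for which the exact output is y(t) = 1 - e^{-t}. The rectangle rule and the step size are also choices made here.

```python
import math


def convolve_response(g, u, t_end, dt=0.001):
    """Approximate y(t_end) = integral from 0 to t_end of g(t_end - tau) * u(tau) d tau
    with the rectangle rule."""
    n = int(round(t_end / dt))
    return sum(g(t_end - j * dt) * u(j * dt) for j in range(n)) * dt


def impulse_response(t):
    # Assumed example system: g(t) = exp(-t), i.e. G(s) = 1 / (s + 1).
    return math.exp(-t)


def unit_step(t):
    return 1.0


y_2 = convolve_response(impulse_response, unit_step, 2.0)
exact = 1.0 - math.exp(-2.0)
print(y_2, exact)
```

The numerical value agrees with the closed-form output obtained from Y(s) = G(s) U(s) = 1/(s(s+1)), up to the discretization error of the rectangle rule.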
Reachability
Definition: A state $\tilde{x}$ is reachable if there exist a finite time instant $\tilde{t} > 0$ and an input $\tilde{u}$, defined from 0 to $\tilde{t}$, such that $x_f(\tilde{t}) = \tilde{x}$, where $x_f$ is the forced state response.
A system such that all its states are reachable is called completely reachable.
Observability
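Definition: A state $\tilde{x} \neq 0$ is called unobservable if, for any finite $\tilde{t} > 0$, $y_l(t) = 0$ for $0 \le t \le \tilde{t}$, where $y_l$ is the free output response starting from $\tilde{x}$.
A system without unobservable states is called completely observable.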
Decomposition
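[Figure: decomposition of the system, from the input u through the output transformation to the output y, into four parts ($\hat{x}_a$, $\hat{x}_b$, $\hat{x}_c$, $\hat{x}_d$) labeled: reachable and unobservable; reachable and observable; unreachable and unobservable; unreachable and observable]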
Data Models
• Input/output or black box
• Description of the system only by knowing measurable data
• Typically based on minimal assumptions on the system
• No information on the internal structure of the system
Modeling with black-box
[Figure: system decomposition (input u, parts $\hat{x}_a$ to $\hat{x}_d$, output transformation, output y) with the labels: reachable and unobservable; reachable and observable; unreachable and observable; unreachable and unobservable]
Data Models
• Time series
• Impulse response
• Transfer functions (linear models)
• Convolution / deconvolution (linear models)
Data models (Input-output): Example
[Figure: concentration versus time (0 to 160); block diagram u → SYSTEM → y]
$y(t, p) = \sum_{i=1}^{2} A_i e^{-\alpha_i t}$
Unknown parameters: $p = [A_1, A_2, \alpha_1, \alpha_2]^T$
System Models
• White or grey box
• Description of the internal structure of the system based on physical principles and on explicit hypotheses on causal relationships
• After comparison with experimental data, they are aimed at understanding the principles of the system
Modeling
[Figure: modeling scheme relating SYSTEM, DATA, a priori knowledge, assumptions and purpose to the MODEL, through the choice of the STRUCTURE and the PARAMETER ESTIMATE]
System Models
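System models (structural): compartmental models

Two-compartment model, unknown parameters $p = [k_{01}, k_{12}, k_{21}, V_1]^T$:
$\dot{x}_1(t) = -(k_{01} + k_{21})\, x_1(t) + k_{12}\, x_2(t) + u(t), \quad x_1(0) = 0$
$\dot{x}_2(t) = k_{21}\, x_1(t) - k_{12}\, x_2(t), \quad x_2(0) = 0$
$y_1(t) = x_1(t) / V_1$
[Figure: input u into compartment x1 (volume V1), exchange with compartment x2 through k21 and k12, elimination k01, output y1 = x1/V1]

One-compartment model, unknown parameters $p = [k_{01}, V_1]^T$:
$\dot{x}_1(t) = -k_{01}\, x_1(t) + u(t), \quad x_1(0) = 0$
$y_1(t) = x_1(t) / V_1$
[Figure: input u into compartment x1 (volume V1), elimination k01, output y1 = x1/V1]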
Structural models
[Figure: system decomposition (input u, parts $\hat{x}_a$ to $\hat{x}_d$, output transformation, output y) with the labels: reachable and unobservable; reachable and observable; unreachable and observable; unreachable and unobservable; two parts are annotated "Guesses/Prior kb"]
Modeling time series
• Time series: data are correlated; data are realizations of stochastic processes
• Stochastic linear discrete input-output models
• Two approaches:
  – Model the data as a function of time (a regression over time)
  – Model the data as a function of its past values: ARMA models
• Often, assumption of stationarity (the mean and variance of the process generating the data do not change over time)
Autoregressive (AR) models
• AR(h) is a regression model that regresses each point on the previous h time points. An example is AR(1):
  $y_k = a_1 y_{k-1} + e_k$
• Each value is affected by random noise $e_k$ with zero mean and variance $\sigma^2$
• Can be learned with a linear estimation algorithm
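As an illustration of the linear estimation step, the sketch below simulates an AR(1) process and recovers a_1 by ordinary least squares (regressing y_k on y_{k-1}). The true coefficient, the noise level, the Gaussian noise and the sample size are hypothetical choices made for this example.

```python
import random


def simulate_ar1(a1, sigma, n, seed=0):
    """Generate n samples of y_k = a1 * y_{k-1} + e_k with Gaussian noise e_k."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(a1 * y[-1] + rng.gauss(0.0, sigma))
    return y


def fit_ar1(y):
    """Least-squares estimate of a1: sum(y_k * y_{k-1}) / sum(y_{k-1}^2)."""
    num = sum(y[k] * y[k - 1] for k in range(1, len(y)))
    den = sum(y[k - 1] ** 2 for k in range(1, len(y)))
    return num / den


series = simulate_ar1(a1=0.7, sigma=1.0, n=5000)
a1_hat = fit_ar1(series)
print(a1_hat)
```

Because the model is linear in a_1, the estimate has a closed form and needs no iteration, in contrast with the MA case.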
Moving Average (MA)
• A different kind of model is the Moving Average model (MA(h))
• It propagates over time the effect of the random fluctuations
• The autocorrelation function may help in choosing proper models
• An iterative estimation process is needed
MA(1): $y_k = b_1 e_{k-1} + e_k$
ARMA
• It can be used to obtain a more parsimonious model, with “difficult” autocorrelation functions
ARMA(1,1): $y_k = a_1 y_{k-1} + b_1 e_{k-1} + e_k$
Exogenous inputs
• The system can be driven not only by noise but also by eXogenous inputs
$y_k = a_1 y_{k-1} + b_1 e_{k-1} + c_1 u_{k-1} + e_k$
This is the general ARMAX model
Non linear models
• Non-linear stochastic models have also been proposed in the literature
• Examples are NARX models
• NARX models can be easily learned from data with Neural Nets
kkkkk eydy )( 1
From black-box to structural stochastic models
[Figure: two-time-slice networks over the state variables X1, X2 and the outputs Y1, Y2]
Examples:
- Kalman filters
- Dynamic Bayesian networks (Dynamic BNs)
- Hidden Markov Models
Observable and partially observable models
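[Figure: two-time-slice networks (time k and k+1) over the state variables X1, X2 and the outputs Y1, Y2. Left: fully observable model, with both outputs Y1 and Y2. Right: partially observable model, with output Y2 only]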
Delay coordinate embedding
• How to reconstruct a state-space representation from a uni-dimensional time series y
• Sampled data
• Idea: add n state variables using the values of y with a delay of tau
Example (embedding delay = 0.05)

From 1 dimension:
Time     Y1
0        0
0.0100   0.0092
0.0200   0.0171
0.0300   0.0238
0.0400   0.0295
0.0500   0.0343
0.0600   0.0383
0.0700   0.0415
0.0800   0.0441
0.0900   0.0462

To 2 dimensions:
Time     X1       X2
0        0        0.0343
0.0100   0.0092   0.0383
0.0200   0.0171   0.0415
0.0300   0.0238   0.0441
0.0400   0.0295   0.0462
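The example can be reproduced with a few lines of code. This is a minimal sketch: the function name `delay_embed` and its parameters are introduced here, and the delay is expressed in samples (the data are sampled every 0.01, so a delay of 0.05 corresponds to a lag of 5 samples).

```python
def delay_embed(y, dim, lag):
    """Build delay vectors (y[i], y[i + lag], ..., y[i + (dim - 1) * lag])."""
    n_vectors = len(y) - (dim - 1) * lag
    return [tuple(y[i + j * lag] for j in range(dim)) for i in range(n_vectors)]


# Y1 values from the example, sampled every 0.01 time units.
y1 = [0.0, 0.0092, 0.0171, 0.0238, 0.0295,
      0.0343, 0.0383, 0.0415, 0.0441, 0.0462]

# Two state variables, delay of 5 samples (0.05 time units).
embedded = delay_embed(y1, dim=2, lag=5)
for x1, x2 in embedded:
    print(x1, x2)
```

Each row pairs Y1 at time t (X1) with Y1 at time t + 0.05 (X2), so the ten scalar samples yield five two-dimensional points.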
Challenges
• Finding the embedding parameters
  – Estimate the number of state variables
  – Estimate the delay
• Algorithms proposed in the literature
  – Autocorrelation
  – Pineda-Sommerer
  – False nearest neighbours
Clustering of time series
Several methodologies available
• Similarity-based clustering
• Model-based clustering
• Template-based clustering
Zhong, S., Ghosh, J., Journal of Machine Learning Research, 2003
Similarity-Based Clustering
Key point: to define a distance measure (similarity function) between time series.
Strategy: temporal profiles which verify the same similarity condition are grouped together.
Different classes of algorithms: hierarchical clustering, partitioning methods, self-organizing maps.
Eisen et al., 1998; Tamayo et al., 1999
Similarity-Based Clustering: how to choose a distance
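Minkowski metric. Given the time series S = s_1, …, s_n and T = t_1, …, t_n:
$D(S,T) = \left( \sum_{i=1}^{n} |s_i - t_i|^p \right)^{1/p}$
p = 1: Manhattan
p = 2: Euclidean
p = ∞: Sup
[Figure: two time series S and T and the distance D(S,T) between them]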
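A direct implementation of the metric is short. The sketch below is illustrative; the function name and the toy series are assumptions, and the p = ∞ case is handled separately as the maximum absolute difference.

```python
import math


def minkowski(s, t, p):
    """Minkowski distance between two equally long series."""
    if len(s) != len(t):
        raise ValueError("series must have the same length")
    diffs = [abs(a - b) for a, b in zip(s, t)]
    if math.isinf(p):
        return max(diffs)  # sup distance
    return sum(d ** p for d in diffs) ** (1.0 / p)


S = [0.0, 0.0, 0.0]
T = [3.0, 4.0, 0.0]
manhattan = minkowski(S, T, 1)
euclidean = minkowski(S, T, 2)
sup = minkowski(S, T, math.inf)
print(manhattan, euclidean, sup)
```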
Euclidean distance: limits
Problem              Solution
Offset translation   S = S - mean(S), T = T - mean(T)
Amplitude scaling    S = (S - mean(S)) / std(S), T = (T - mean(T)) / std(T)
Noise                Smoothing
[Figure: example plots of pairs of series for each problem]
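The first two remedies amount to standardizing each series before computing the distance. The sketch below is a minimal illustration; it uses the population standard deviation (`statistics.pstdev`), which is a choice made here since the slides do not specify which estimator is meant, and the toy series are hypothetical.

```python
import math
import statistics


def z_normalize(series):
    """Remove the offset (mean) and the amplitude scale (standard deviation)."""
    mean = statistics.mean(series)
    std = statistics.pstdev(series)
    if std == 0:
        raise ValueError("constant series cannot be rescaled")
    return [(v - mean) / std for v in series]


def euclidean(s, t):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, t)))


# Same shape, different offset and amplitude.
S = [1.0, 2.0, 3.0, 4.0]
T = [10.0, 20.0, 30.0, 40.0]

raw_distance = euclidean(S, T)
normalized_distance = euclidean(z_normalize(S), z_normalize(T))
print(raw_distance, normalized_distance)
```

After normalization the two series coincide, so the Euclidean distance reflects shape only.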
Other distances (1)
Correlation coefficient:
$r(S,T) = \dfrac{\sum_{i=1}^{n} (s_i - \bar{s})(t_i - \bar{t})}{\sqrt{\sum_{i=1}^{n} (s_i - \bar{s})^2 \; \sum_{i=1}^{n} (t_i - \bar{t})^2}}$
- Useful for temporal models.
- Looks for similarities of the shapes of profiles.
- Disadvantage: not robust to temporal dislocations.
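The sketch below computes the correlation coefficient and shows both properties on toy data (the series are hypothetical): a rescaled copy of a profile has correlation 1, while the same peak shifted by one time step gives a negative value.

```python
import math


def correlation(s, t):
    """Pearson correlation coefficient between two equally long series."""
    mean_s = sum(s) / len(s)
    mean_t = sum(t) / len(t)
    num = sum((a - mean_s) * (b - mean_t) for a, b in zip(s, t))
    den = math.sqrt(sum((a - mean_s) ** 2 for a in s)
                    * sum((b - mean_t) ** 2 for b in t))
    return num / den


# Same shape at a different scale: maximal correlation.
r_scaled = correlation([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0])

# Same peak, dislocated by one time step: the similarity is lost.
r_shifted = correlation([0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0])

print(r_scaled, r_shifted)
```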
Other distances (2)
Dynamic Time Warping:
[Figure: alignment of two sequences on a fixed time axis and on a warped time axis]
Idea: ‘extend’ each sequence by repeating some of its elements, then calculate the Euclidean distance between the extended sequences.
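A common way to find the best pair of extended sequences is dynamic programming. The sketch below is one standard formulation, not necessarily the variant the slides have in mind: it accumulates squared differences along the cheapest warping path and takes the square root at the end, with no window constraint.

```python
import math


def dtw_distance(s, t):
    """Dynamic Time Warping distance with squared-difference local cost."""
    n, m = len(s), len(t)
    inf = float("inf")
    # cost[i][j] = cheapest accumulated cost aligning s[:i] with t[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (s[i - 1] - t[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j],      # repeat t[j-1]
                                 cost[i][j - 1],      # repeat s[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return math.sqrt(cost[n][m])


S = [0.0, 1.0, 0.0, 0.0]
T = [0.0, 0.0, 1.0, 0.0]
plain_euclidean = math.sqrt(sum((a - b) ** 2 for a, b in zip(S, T)))
warped = dtw_distance(S, T)
print(plain_euclidean, warped)
```

For these two series, which differ only by a one-step shift of the peak, the warped distance is zero while the plain Euclidean distance is not.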
Functional genomics: Hierarchical Clustering with correlation coefficients
Time series of 13 samples of 517 genes of human fibroblasts stimulated with serum.
Dendrograms are related to the heat-maps of gene expression over time.
Eisen et al., PNAS, 1998; Iyer et al., Science, 1999
Model-based Clustering (1)
Key point: assume that the data are sampled from a population composed of sub-populations characterized by different stochastic processes;
clusters + processes = model
Strategy: the temporal profiles generated by the same stochastic process are grouped in the same cluster. The clustering problem becomes a problem of model selection.
Cheeseman and Stutz, 1996; Fraley and Raftery, 2002; Yeung et al., 2001
Model-based Clustering (2)
Given:
• Y: the data
• M: a set of stochastic dynamic models and a cluster division
• Θ: the model parameters
A suitable approach:
- Bayesian approach: select the model that maximizes the posterior probability of the model M given the data Y, P(M|Y)
Ramoni and Sebastiani, 1999; Baldi and Brunak, 1998; Kay, 1993
The Bayesian Solution (Ramoni et al., PNAS 2002)
Analysis of gene expression time series: CAGED system (Cluster Analysis of Gene Expression Dynamics)
Assumption: time series generated by an unknown number of autoregressive stochastic processes (AR)
From Bayes theorem: P(M|Y) proportional to f(Y|M) (marginal likelihood)
Assumption + hypothesis on the distribution of the model parameters → calculation of f(Y|M) for each possible model in closed form
Model selection: agglomerative process + heuristic strategy
Cluster number: automatically selected maximizing the marginal likelihood
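The proportionality used above can be written out explicitly. The following is a generic statement of Bayes' theorem for model selection, with Θ the model parameters as defined earlier; the step to P(M|Y) ∝ f(Y|M) assumes that all candidate models have the same prior probability P(M), an assumption stated here to make the reasoning explicit.

```latex
\begin{align*}
P(M \mid Y) &= \frac{P(M)\, f(Y \mid M)}{f(Y)} \;\propto\; P(M)\, f(Y \mid M), \\
f(Y \mid M) &= \int f(Y \mid \Theta, M)\, f(\Theta \mid M)\, d\Theta, \\
P(M \mid Y) &\propto f(Y \mid M) \quad \text{if } P(M) \text{ is the same for all models.}
\end{align*}
```

Under that assumption, maximizing the posterior probability of the model is equivalent to maximizing the marginal likelihood.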
Template-Based Clustering (1)
Idea: group the time series on the basis of the similarity with a set of qualitative prototypes (templates)
Template-Based Clustering (2)
Data representation: from quantitative to qualitative
Templates may capture the relevant characteristics of an expression profile while eliminating the spurious effects caused by noise.
They may simplify the process of capturing the variety of behaviors that characterize the gene expression profiles.
Current limit: templates and clusters have to be identified a priori.
Template-Based Clustering: an example (Hvidsten et al., 2003)
Template-based clustering is used to predict gene function on the basis of the knowledge of known genes.
• Input: time series of gene expression
• Define all possible intervals on the time series
• Templates: increasing, decreasing, steady
• Cluster together the genes that match the same template on the same subinterval
Template-Based Clustering: an example
Example: all sets of time series with 4 points
• Templates: Increasing, Decreasing, Steady
• Possible time intervals: 3 + 2 + 1 = 6
• Possible clusters: 3 × 6 = 18
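The counts in this example can be verified by enumeration. In the sketch below, an interval is represented as a pair (start index, end index) with start < end, and a candidate cluster as a (template, interval) pair; this representation is an assumption made for the illustration.

```python
from itertools import combinations


def subintervals(n_points):
    """All intervals (start, end) with start < end over n_points time points."""
    return list(combinations(range(n_points), 2))


templates = ["Increasing", "Decreasing", "Steady"]
intervals = subintervals(4)
clusters = [(template, interval) for template in templates for interval in intervals]

print(len(intervals), len(clusters))
```

With 4 points there are 3 intervals spanning one step, 2 spanning two steps and 1 spanning three steps, which gives the 6 intervals and 18 candidate clusters.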
Template-based clustering with temporal abstractions
QUALITATIVE representation of expression profiles
TEMPORAL ABSTRACTIONS
Shahar, 1997
Temporal Entities
• Events (<time-point, value>)
• Episodes (<interval, pattern>)
Pattern: specific data course (decreasing, normal, stationary, …)
Time Series: sequence of events
[Figure: blood glucose level, BGL (U/ml, 0 to 300), versus time (days 1 to 14)]
Data Abstraction Methods
• Qualitative Abstraction: quantitative data are abstracted into qualitative ones (a BGL of 110 U/ml is abstracted into a normal value)
• Temporal Abstraction (TA): time-stamped data are aggregated into intervals associated with specific patterns.
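The two steps can be sketched in a few lines. The thresholds (70 and 140), the label names, the data and the episode representation (start time, end time, label) are hypothetical choices for illustration; only the mapping of a BGL of 110 to "normal" comes from the text.

```python
def qualitative_label(bgl, low=70.0, high=140.0):
    """Qualitative abstraction with hypothetical thresholds."""
    if bgl < low:
        return "low"
    if bgl > high:
        return "high"
    return "normal"


def state_abstraction(times, values):
    """Aggregate consecutive events with the same label into episodes."""
    episodes = []
    for t, v in zip(times, values):
        label = qualitative_label(v)
        if episodes and episodes[-1][2] == label:
            start = episodes[-1][0]
            episodes[-1] = (start, t, label)  # extend the current episode
        else:
            episodes.append((t, t, label))    # open a new episode
    return episodes


days = [1, 2, 3, 4, 5, 6]
bgl = [110.0, 120.0, 200.0, 210.0, 100.0, 60.0]
episodes = state_abstraction(days, bgl)
print(episodes)
```

Each episode is an <interval, pattern> pair in the sense defined above: here the pattern is a qualitative state that holds over the interval.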
Temporal Abstractions
• Methods used to generate an abstract description of temporal data represented by a sequence of episodes.
State Temporal Abstractions
[Figure: BGL (U/ml, 0 to 350) versus time (1 to 14), annotated with the state abstraction "Low-Normal BGL values"]
Trend Temporal Abstractions
[Figure: BGL (U/ml, 0 to 350) versus time (days 1 to 14), annotated with the trend abstraction "BGL decreasing trend"]
Stationary Temporal Abstractions
[Figure: BGL (U/ml, 0 to 300) versus time (0 to 20), annotated with the stationary abstraction "BGL Stationary"]
Complex Abstractions
[Figure: two episodes, Series1 and Series2, on a time axis (1 to 16), and the complex abstraction "Series1 OVERLAPS Series2"]
Complex Abstractions: example
Hyperglycemia at breakfast OVERLAPS absence of glycosuria
Somogyi Effect: response to hypoglycemia while asleep with counter-regulatory hormones causing morning hyperglycemia
Relationships between intervals: Allen algebra
• Before / After
• Meets / Is met by
• Overlaps / Overlapped-by
• Starts / Started-by
• During / Contains
• Finishes / Finished-by
• Equals
[Figure: for each relation, a diagram of the relative position of two intervals A and C]
Allen, J.F.: Towards a general theory of action and time. Artificial Intelligence (1984)
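The thirteen relations can be decided by comparing interval endpoints. The sketch below is an illustration under the assumption that each interval is a pair (start, end) with start < end; the function name and the lowercase relation labels are introduced here.

```python
def allen_relation(a, b):
    """Return the Allen relation of interval a with respect to interval b."""
    a_start, a_end = a
    b_start, b_end = b
    if a_end < b_start:
        return "before"
    if a_end == b_start:
        return "meets"
    if b_end < a_start:
        return "after"
    if b_end == a_start:
        return "is met by"
    if a_start == b_start and a_end == b_end:
        return "equals"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    # Remaining case: partial overlap with all four endpoints distinct.
    return "overlaps" if a_start < b_start else "overlapped-by"


print(allen_relation((1, 5), (3, 8)))
print(allen_relation((2, 4), (1, 6)))
```

A complex abstraction such as "hyperglycemia at breakfast OVERLAPS absence of glycosuria" then reduces to checking that two episodes stand in the required relation.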
Clustering with dynamic template generation
• Idea: apply Temporal Abstractions
• Generate TAs for each temporal profile
• Cluster together “similar” TAs
[Figure: TA generation from an expression profile (expression versus time)]
Piecewise linear approximation (J.A. Horst, I. Beichl, 1997)
• Original time series
• Dominant points detection (threshold needed)
• Linear regression
• Trend TAs extracted from local slopes
[Figure: example profiles with segments labeled D (Decreasing) and I (Increasing), e.g. D D D I I I I I, and a profile labeled I S I]
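The last step, turning local slopes into trend labels, can be sketched as follows. This is a deliberately simplified stand-in and not the Horst and Beichl algorithm: it skips dominant point detection and regression, computes the slope between consecutive samples, labels it with a hypothetical threshold, and merges adjacent segments that share a label.

```python
def trend_abstraction(times, values, threshold=0.5):
    """Label each segment I, D or S from its slope and merge equal neighbours."""
    episodes = []
    for k in range(1, len(values)):
        slope = (values[k] - values[k - 1]) / (times[k] - times[k - 1])
        if slope > threshold:
            label = "I"   # Increasing
        elif slope < -threshold:
            label = "D"   # Decreasing
        else:
            label = "S"   # Steady
        if episodes and episodes[-1][2] == label:
            episodes[-1] = (episodes[-1][0], times[k], label)  # extend episode
        else:
            episodes.append((times[k - 1], times[k], label))   # new episode
    return episodes


times = [0, 1, 2, 3, 4, 5]
values = [0.0, 2.0, 4.0, 4.1, 4.0, 1.0]
trend_episodes = trend_abstraction(times, values)
print(trend_episodes)
print("".join(label for _, _, label in trend_episodes))
```

The threshold plays the same role as the one mentioned above: it decides how large a slope must be before a segment stops counting as steady.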
Labeling at different abstraction level (1)
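[Figure: a profile labeled with S (Steady) and I (Increasing) at progressively coarser abstraction levels, from a fine-grained label sequence down to the single label I]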
Labeling at different abstraction level (2)
[Figure: label sequences over S (Steady) and I (Increasing), such as IISSIIIS, ISSS, SIII, IISI, ISI, IS, SI and I, organized into three abstraction levels L1, L2 and L3]
Building clusters
Each time series to be clustered is given labels at the levels L1, L2, L3.
[Figure: the labels of a time series are compared, level by level (L1, L2, L3), to decide its cluster]
Results: Taxonomy
Saccharomyces cerevisiae gene expression
[Figure: taxonomy of clusters at abstraction levels L2 and L3]
(S. Chu et al. The Transcriptional Program of Sporulation in Budding Yeast. Science, 1998.)
Template: [Increasing Decreasing]
Results (1)
GO Process
(B.J. Breitkreutz et al. Osprey: a network visualization system. Genome Biology, 2003)