Deepayan Chakrabarti

CIKM 2002 1

F4: Large Scale Automated Forecasting Using Fractals

Deepayan Chakrabarti, Christos Faloutsos

Outline
• Introduction/Motivation
• Survey and Lag Plots
• Exact Problem Formulation
• Proposed Method
  • Fractal Dimensions Background
  • Our method
• Results
• Conclusions

General Problem Definition

Given a time series {x_t}, predict its future course, that is, x_{t+1}, x_{t+2}, ...

[Figure: Value against Time, with the future values marked "?"]

Motivation

Traditional fields
• Financial data analysis
• Physiological data, elderly care
• Weather, environmental studies

Sensor Networks (MEMS, “SmartDust”)
• Long / “infinite” series
• No human intervention: “black box”

How to forecast?
• ARIMA, but linearity assumption
• Neural Networks, but large number of parameters and long training times [Wan/1993, Mozer/1993]
• Hidden Markov Models, but O(N^2) in number of nodes N; also fixing N is a problem [Ge+/2000]
• Lag Plots

Lag Plots

[Figure: lag plot of x_t against x_{t-1}; the 4 nearest neighbors (4-NN) of the new point are interpolated to get the final prediction]

Q0: Interpolation Method

Q1: Lag = ?

Q2: K = ?
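
The lag-plot idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the neighbor search is brute force, plain averaging stands in for the interpolation step (Q0), and the names `delay_vectors` and `knn_forecast` are hypothetical, not taken from the slides.

```python
import math

def delay_vectors(series, lag):
    """Return (vectors, targets): each vector holds the `lag` values preceding its target."""
    vectors, targets = [], []
    for t in range(lag, len(series)):
        vectors.append(tuple(series[t - lag:t]))
        targets.append(series[t])
    return vectors, targets

def knn_forecast(series, lag, k):
    """Predict the next value by averaging the targets of the k nearest delay vectors."""
    vectors, targets = delay_vectors(series, lag)
    query = tuple(series[-lag:])  # newest delay vector; its successor is unknown
    # Brute-force search over all stored vectors, ordered by Euclidean distance.
    order = sorted(range(len(vectors)), key=lambda i: math.dist(vectors[i], query))
    nearest = order[:k]
    # Assumption: a plain average replaces the interpolation method (Q0).
    return sum(targets[i] for i in nearest) / len(nearest)
```

Here `lag` and `k` are exactly the two parameters left open by Q1 and Q2; the rest of the talk is about choosing them automatically.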

Q0: Interpolation

Using SVD (state of the art) [Sauer/1993]

[Figure: lag plot of x_t against x_{t-1}]
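
The slide names SVD-based interpolation without giving details, so the following is only a rough sketch of one way to interpolate: a local linear least-squares fit over the nearest neighbors. It assumes NumPy is available and uses `numpy.linalg.lstsq`, whose solver is SVD-based; it is not claimed to be the exact procedure of [Sauer/1993], and `local_linear_forecast` is a hypothetical name.

```python
import numpy as np

def local_linear_forecast(neighbors, targets, query):
    """Fit x_t ~ w . v + b on the nearest delay vectors, then evaluate the fit at `query`."""
    A = np.asarray(neighbors, dtype=float)
    if A.ndim == 1:
        A = A[:, None]  # treat a flat list as one-dimensional delay vectors
    y = np.asarray(targets, dtype=float)
    design = np.hstack([A, np.ones((A.shape[0], 1))])  # extra column for the intercept b
    # Least-squares solve; lstsq returns the minimum-norm solution if rank-deficient.
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    q = np.append(np.atleast_1d(np.asarray(query, dtype=float)), 1.0)
    return float(q @ coeffs)
```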

Why Lag Plots?

Based on Takens’ Theorem [Takens/1981], which says that delay vectors can be used for predictive purposes

Inside Theory

Example: Lotka-Volterra equations

ΔH/Δt = rH - aH*P
ΔP/Δt = bH*P - mP

H is density of prey
P is density of predators

Suppose only H(t) is observed. Internal state is (H,P).
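
To make the point concrete, here is a small Python sketch under explicit assumptions: the equations are stepped with a fixed Δt (Euler steps), and all parameter values in the usage are made up for illustration. In that discretized model the unobserved predator density can be recovered exactly from two consecutive prey readings, which is why a delay vector of H alone can stand in for the internal state (H,P). The function names are hypothetical.

```python
def simulate(h0, p0, r, a, b, m, dt, steps):
    """Euler steps of dH/dt = rH - aHP and dP/dt = bHP - mP."""
    hs, ps = [h0], [p0]
    for _ in range(steps):
        h, p = hs[-1], ps[-1]
        hs.append(h + dt * (r * h - a * h * p))
        ps.append(p + dt * (b * h * p - m * p))
    return hs, ps

def predators_from_prey(h_now, h_next, r, a, dt):
    """Invert the prey update: recover P(t) from the delay pair (H(t), H(t+1))."""
    return (r - (h_next - h_now) / (dt * h_now)) / a
```

For example, with `hs, ps = simulate(40.0, 9.0, 1.0, 0.1, 0.02, 0.5, 0.01, 500)`, calling `predators_from_prey(hs[t], hs[t + 1], 1.0, 0.1, 0.01)` reproduces `ps[t]` up to rounding error.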

Extra

Problem at hand

Given {x_1, x_2, …, x_N}, automatically set the parameters
- L(opt) (from Q1)
- k(opt) (from Q2)
in time linear in N, to minimise the Normalized Mean Squared Error (NMSE) of forecasting
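
As a reference point for the error measure, here is a minimal NMSE sketch in Python. The Datasets slide gives the formula as squared error scaled by σ^2; this sketch assumes the common convention of averaging the squared errors over the forecasts and taking σ^2 to be the variance of the true values. With that convention, always predicting the mean scores 1.0 and a perfect forecast scores 0.0.

```python
def nmse(predicted, true):
    """Mean squared error divided by the variance of the true values."""
    n = len(true)
    mean = sum(true) / n
    variance = sum((x - mean) ** 2 for x in true) / n  # sigma^2 of the true series
    mse = sum((p - x) ** 2 for p, x in zip(predicted, true)) / n
    return mse / variance
```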

Previous work/Alternatives

• Manual setting: BUT infeasible [Sauer/1992]
• Cross-validation: BUT slow; leave-one-out cross-validation ~ O(N^2 log N) or more
• “False Nearest Neighbors”: BUT unstable [Abarbanel/1996]

Intuition

The Logistic Parabola: x_t = a x_{t-1} (1 - x_{t-1}) + noise

[Figure: x(t) against time, and the lag plot of x(t) against x(t-1)]

Intrinsic Dimensionality ≈ Degrees of Freedom ≈ Information about x_t given x_{t-1}

Intuition

[Figure: lag plot panels with axes x(t-2), x(t-1) and x(t)]

Intuition

To find L(opt):
• Go further back in time (i.e., consider x_{t-2}, x_{t-3} and so on)
• Till there is no more information gained about x_t

Fractal Dimensions

FD = intrinsic dimensionality

“Embedding” dimensionality = 3

Intrinsic dimensionality = 1

Fractal Dimensions

FD = intrinsic dimensionality [Belussi/1995]

[Figure: log(# pairs) plotted against log(r)]

Points to note:

• FD can be a non-integer

• There are fast methods to compute it
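
One way to read the log-log plot is as a pair-count estimate: count the point pairs within distance r for several radii and take the slope of log(# pairs) against log(r) as the FD. The Python sketch below does this by brute force in O(N^2), so it illustrates the definition rather than the fast methods the slide refers to. The choice of radii and the function names are assumptions.

```python
import math

def pair_counts(points, radii):
    """For each radius r, count the point pairs at distance <= r (brute force)."""
    counts = [0] * len(radii)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            for idx, r in enumerate(radii):
                if d <= r:
                    counts[idx] += 1
    return counts

def fractal_dimension(points, radii):
    """Least-squares slope of log(# pairs) against log(r), skipping empty radii."""
    counts = pair_counts(points, radii)
    xs = [math.log(r) for r, c in zip(radii, counts) if c > 0]
    ys = [math.log(c) for c in counts if c > 0]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
```

For points spread along a straight line in three dimensions, the estimate comes out close to 1 even though the embedding dimensionality is 3, and the value is generally not an integer.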

Q1: Finding L(opt)

Use Fractal Dimensions to find the optimal lag length L(opt)

[Figure: Fractal Dimension against Lag (L), with epsilon, L(opt) and f marked]
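
The stopping rule can be sketched as follows, with several assumptions stated up front: the FD is measured on the lag plot of (x_{t-L}, …, x_{t-1}, x_t), L(opt) is taken as the first lag after which the FD grows by less than epsilon, and f is the FD at that lag. The estimator is passed in as `fd_fn`, a hypothetical parameter standing for any fractal dimension routine; `max_lag` is an assumed safety bound.

```python
def lag_points(series, lag):
    """Points (x_{t-lag}, ..., x_{t-1}, x_t) of the lag plot for this lag."""
    return [tuple(series[t - lag:t + 1]) for t in range(lag, len(series))]

def find_lag(series, fd_fn, epsilon, max_lag):
    """Return (L_opt, f): the first lag after which the FD gains less than epsilon."""
    prev = fd_fn(lag_points(series, 1))
    for lag in range(2, max_lag + 1):
        cur = fd_fn(lag_points(series, lag))
        if cur - prev < epsilon:
            return lag - 1, prev  # the curve has flattened: keep the previous lag
        prev = cur
    return max_lag, prev  # never flattened within the bound
```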

Q2: Finding k(opt)

To find k(opt)

• Conjecture: k(opt) ~ O(f)

We choose k(opt) = 2*f + 1
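
Since f can be a non-integer, the formula needs a rounding step before it can be used as a neighbor count. The slide does not say how to round; the small Python sketch below assumes rounding up.

```python
import math

def k_opt(f):
    """k(opt) = 2*f + 1, rounded up (assumption) because f may be fractional."""
    return math.ceil(2 * f + 1)
```

Under that assumption, f = 1.0 gives k(opt) = 3 and f = 2.06 gives k(opt) = 6.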

Datasets

• Logistic Parabola: x_t = a x_{t-1} (1 - x_{t-1}) + noise. Models population of flies [R. May/1976]
• LORENZ: Models convection currents in the air
• LASER: fluctuations in a Laser over time (from the Santa Fe Time Series Competition, 1992)

Error: NMSE = ∑(predicted - true)^2 / σ^2

[Figure: Value against Time]
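
For readers who want to reproduce a series like the first dataset, here is a hedged Python sketch of the Logistic Parabola recurrence. The slides do not give the value of a, the starting point, or the noise distribution, so `a=3.8`, `x0=0.3`, Gaussian noise, and the clipping to [0, 1] are all assumptions made for illustration.

```python
import random

def logistic_parabola(n, a=3.8, x0=0.3, noise_std=0.0, seed=0):
    """Generate n values of x_t = a * x_{t-1} * (1 - x_{t-1}) + noise."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    xs = [x0]
    for _ in range(n - 1):
        nxt = a * xs[-1] * (1 - xs[-1]) + rng.gauss(0.0, noise_std)
        # Assumption: clip to [0, 1] so that noise cannot push the map out of range.
        xs.append(min(max(nxt, 0.0), 1.0))
    return xs
```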

Logistic Parabola

• FD vs L plot flattens out

• L(opt) = 1

[Figures: Value against Timesteps; FD against Lag]

Logistic Parabola

[Figure: Value against Timesteps, marked "Our Prediction from here"]

Logistic Parabola

[Figure: Value against Timesteps; comparison of prediction to correct values]

Logistic Parabola

Our L(opt) = 1, which exactly minimizes NMSE

[Figure: NMSE and FD against Lag]

LORENZ

• L(opt) = 5

[Figures: Value against Timesteps; FD against Lag]

LORENZ

[Figure: Value against Timesteps, marked "Our Prediction from here"]

LORENZ

[Figure: Value against Timesteps; comparison of prediction to correct values]

LORENZ

L(opt) = 5

Also NMSE is optimal at Lag = 5

[Figure: NMSE and FD against Lag]

Laser

• L(opt) = 7

[Figures: Value against Timesteps; FD against Lag]

Laser

[Figure: Value against Timesteps, marked "Our Prediction starts here"]

Laser

[Figure: Value against Timesteps; comparison of prediction to correct values]

Laser

L(opt) = 7

Corresponding NMSE is close to optimal

[Figure: NMSE and FD against Lag]

Speed and Scalability
• Preprocessing is linear in N
• Proportional to time taken to calculate FD

Conclusions

Our Method:
• Automatically sets the parameters
  • L(opt) (answers Q1)
  • k(opt) (answers Q2)
• In linear time on N

Conclusions
• Black-box non-linear time series forecasting
• Fractal Dimensions give a fast, automated method to set all parameters
• So, given any time series, we can automatically build a prediction system
• Useful in a sensor network setting

Snapshot: http://snapdragon.cald.cs.cmu.edu/TSP

Extra

Future Work

• Feature Selection
• Multi-sequence prediction

Extra

Discussion – Some other problems

Given:
• x_1, x_2, …, x_N
• L(opt)
• k(opt)

How to forecast?
How to find the k(opt) nearest neighbors quickly?

Extra

Motivation

Forecasting also allows us to

• Find outliers: anything that doesn’t match our prediction!

• Find patterns: if different circumstances lead to similar predictions, they may be related.

Extra

Motivation (Examples)

Traditional
• EEGs: Patterns of electromagnetic impulses in the brain
• Intensity variations of white dwarf stars
• Highway usage over time

Sensors
• “Active Disks” for forecasting / prefetching / buffering
• “Smart House” sensors monitor situation in a house
• Volcano monitoring

Extra

General Method

• Store all the delay vectors {x_{t-1}, …, x_{t-L(opt)}} and corresponding prediction x_t
• Find the latest delay vector (L(opt) = ?)
• Find nearest neighbors (K(opt) = ?)
• Interpolate

[Figure: lag plot of x_t against x_{t-1}]

Extra

Intuition

• The FD vs L plot does flatten out

• L(opt) = 1

[Figure: Fractal dimension against Lag]

Extra

Inside Theory

• Internal state may be unobserved
• But the delay vector space is a faithful reconstruction of the internal system state
• So prediction in delay vector space is as good as prediction in state space

Extra

Fractal Dimensions

Many real-world datasets have fractional intrinsic dimension

There exist fast (O(N)) methods to calculate the fractal dimension of a cloud of points [Belussi/1995]

Extra

Speed and Scalability
• Preprocessing varies as L(opt)^2

Extra
