Exact Indexing of Dynamic Time Warping
Presented by: Ankit Hirdesh, Piyush Goswami
Eamonn Keogh, Computer Science & Engineering Department, University of California - Riverside, Riverside, CA 92521, [email protected]



INTRODUCTION

• Time Series

– A collection of observations made sequentially in time

– Occurs in medical, business, and scientific domains

– Finding similarities between two time series is required in many time series data mining applications


CHALLENGES

• How do we define similarity?

• We need a method that allows elastic shifting of the time axis, to accommodate sequences that are similar but out of phase

• Large amounts of data

• How do we search quickly?


SOLUTIONS

• Euclidean distance

– Points are aligned one to one

– Cannot find similarity between out-of-phase signals

• Dynamic Time Warping

– Points can be non-linearly aligned


WHAT IS TIME WARPING?

[Figure: sequences C and Q, with the warping path between them]


DYNAMIC TIME WARPING

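• Cumulative distance: $\gamma(i,j) = d(q_i, c_j) + \min\{\gamma(i-1,j-1),\ \gamma(i-1,j),\ \gamma(i,j-1)\}$

• Three Basic Constraints of Time Warping

– Boundary: the path must include the beginning and the end of both sequences

– Continuity: the path must not have any jumps

– Monotonicity: the path cannot go back in time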
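The recurrence above can be sketched as a small dynamic program. This is a minimal illustration, not the authors' implementation: it assumes the point cost d is the squared difference and that the final distance is the square root of the cumulative cost; the function name `dtw` is chosen here for illustration.

```python
import math


def dtw(q, c):
    """DTW distance between sequences q and c using the cumulative-distance recurrence.

    Assumption: d(q_i, c_j) is the squared difference, and the result is the
    square root of the cumulative cost at the end of the warping path.
    """
    n, m = len(q), len(c)
    inf = math.inf
    # gamma[i][j] holds the cheapest cumulative cost of a path ending at (i, j), 1-based.
    gamma = [[inf] * (m + 1) for _ in range(n + 1)]
    gamma[0][0] = 0.0  # boundary constraint: every path starts at the first points
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (q[i - 1] - c[j - 1]) ** 2
            # continuity and monotonicity: only the three adjacent predecessors are allowed
            gamma[i][j] = d + min(gamma[i - 1][j - 1], gamma[i - 1][j], gamma[i][j - 1])
    return math.sqrt(gamma[n][m])
```

For two copies of the same shape shifted by one step, such as `[0, 0, 1, 2, 1, 0]` and `[0, 1, 2, 1, 0, 0]`, this returns 0, whereas the one-to-one Euclidean alignment does not.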


Global Constraints for Speedy Calculations

• Limit the warping path $w_k = (i,j)_k$ to stay close to the diagonal, i.e. $j - r \le i \le j + r$, where r is the "reach"

• Speeds up the calculation from $O(n^2)$ to $O(nr)$, which is linear in n for a fixed reach r

• Prevents pathological warpings

[Figure: warping window]
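A banded variant of the dynamic program shows where the saving comes from. This is a sketch under assumptions: both sequences have the same length, the reach r is constant (a Sakoe-Chiba style band), the point cost is the squared difference, and the name `dtw_band` is illustrative.

```python
import math


def dtw_band(q, c, r):
    """DTW restricted to cells with |i - j| <= r (constant reach r)."""
    n = len(q)
    if len(c) != n:
        raise ValueError("this sketch assumes sequences of equal length")
    inf = math.inf
    # Only cells inside the band are stored, so time and memory are O(n * r).
    gamma = {(0, 0): 0.0}
    for i in range(1, n + 1):
        for j in range(max(1, i - r), min(n, i + r) + 1):
            d = (q[i - 1] - c[j - 1]) ** 2
            best_prev = min(
                gamma.get((i - 1, j - 1), inf),
                gamma.get((i - 1, j), inf),
                gamma.get((i, j - 1), inf),
            )
            gamma[(i, j)] = d + best_prev
    return math.sqrt(gamma[(n, n)])
```

With r = 0 the band collapses to the diagonal and the result equals the Euclidean distance; larger values of r allow more warping.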


Lower Bounding

• Similarity search under both Euclidean distance and DTW is highly demanding in terms of CPU and I/O time

• A lower bounding function can speed up the similarity search by pruning sequences that could not possibly be the best match

• Must be fast to compute

• Must be tight
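The pruning idea can be sketched as a sequential scan that only pays for the expensive distance when the cheap lower bound cannot rule a candidate out. The function and parameter names below are illustrative; `lower_bound` and `distance` are assumed to be callables supplied by the caller, with `lower_bound(q, c) <= distance(q, c)` for every pair.

```python
def lower_bounding_scan(query, candidates, lower_bound, distance):
    """Return (index, distance) of the best match and how many full distances were computed."""
    best_dist = float("inf")
    best_index = None
    full_computations = 0
    for idx, cand in enumerate(candidates):
        # If even the lower bound is no better than the best so far, skip the candidate.
        if lower_bound(query, cand) < best_dist:
            full_computations += 1
            d = distance(query, cand)
            if d < best_dist:
                best_dist, best_index = d, idx
    return best_index, best_dist, full_computations
```

The faster the bound, the cheaper each skipped candidate; the tighter the bound, the more candidates are skipped.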


Existing Lower Bounding Techniques

• Lower bounding measure by Yi et al.: the sum of the squared lengths of the gray lines is returned as the lower bounding measure.

• Lower bounding measure by Kim et al.: the maximum squared difference between the two sequences' first (A), last (D), minimum (B) and maximum (C) points is returned as the lower bound.
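Both measures are simple to sketch. The code below is a reading of the descriptions above rather than the original authors' code: for the Yi et al. measure it assumes the gray lines are the parts of C that lie above the maximum or below the minimum of Q. The names `lb_kim` and `lb_yi` are illustrative, and both functions return squared values, so they bound the sum of squared differences along the warping path (take the square root to compare with a rooted DTW distance).

```python
def lb_kim(q, c):
    """Maximum squared difference over four features: first, last, minimum, maximum."""
    features = [
        (q[0] - c[0]) ** 2,        # first points (A)
        (q[-1] - c[-1]) ** 2,      # last points (D)
        (min(q) - min(c)) ** 2,    # minimum points (B)
        (max(q) - max(c)) ** 2,    # maximum points (C)
    ]
    return max(features)


def lb_yi(q, c):
    """Sum of squared distances from points of c lying outside [min(q), max(q)]."""
    hi, lo = max(q), min(q)
    total = 0.0
    for x in c:
        if x > hi:
            total += (x - hi) ** 2
        elif x < lo:
            total += (x - lo) ** 2
    return total
```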


Proposed Lower Bounding Method

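• Let us define two sequences, $U_i = \max(q_{i-r} : q_{i+r})$ and $L_i = \min(q_{i-r} : q_{i+r})$, where r is the reach and U and L stand for Upper and Lower respectively.

• Also: $\forall i,\ U_i \ge q_i \ge L_i$

A: Bounding envelope for the Sakoe-Chiba band

B: Bounding envelope for the Itakura parallelogram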
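A minimal sketch of the envelope for a constant reach r (the Sakoe-Chiba case) follows. The name `envelope` is illustrative, and windows are simply truncated at the ends of the sequence, which is an assumption of this sketch.

```python
def envelope(q, r):
    """Return (U, L): the running max and min of q over the window [i - r, i + r]."""
    n = len(q)
    upper, lower = [], []
    for i in range(n):
        window = q[max(0, i - r): min(n, i + r + 1)]  # truncated at the boundaries
        upper.append(max(window))
        lower.append(min(window))
    return upper, lower
```

Because every window contains q_i itself, the envelope always encloses the query.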


Proposed Lower Bounding Method – LB_KEOGH

$LB\_Keogh(Q,C) \le DTW(Q,C)$

• The query sequence Q is enclosed in the bounding envelope of U and L.

• The sum of the squared distances from every part of the candidate sequence C that falls outside the bounding envelope to the nearest orthogonal edge of the envelope is returned as the lower bound.

• A and B have the same meaning as on the previous slide.
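The description above translates into a few lines of code. This sketch takes a precomputed envelope as input; the name `lb_keogh` and the argument order are illustrative. It takes the square root of the sum so that the value is comparable with a DTW distance that is itself the square root of the path cost, which is an assumption about the DTW definition in use.

```python
import math


def lb_keogh(c, upper, lower):
    """Lower bound from candidate c against the query envelope (upper, lower)."""
    total = 0.0
    for x, u, l in zip(c, upper, lower):
        if x > u:            # point pokes out above the envelope
            total += (x - u) ** 2
        elif x < l:          # point falls below the envelope
            total += (x - l) ** 2
        # points inside the envelope contribute nothing
    return math.sqrt(total)
```

A candidate that lies entirely inside the envelope gets a bound of 0 and can never be pruned by this test alone.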


[Figure: four panels comparing LB_Keogh (Sakoe-Chiba), LB_Keogh (Itakura), LB_Yi and LB_Kim]

The tightness of the lower bound for each technique is proportional to the length of the gray lines.


How to index Dynamic Time Warping

• Piecewise Aggregate Approximation (PAA)

– Represents a time series as a sequence of box basis functions

– Reduces dimensionality from n to N, since a time series may include a large number of items, degrading indexing performance

– Data is divided into N equal-sized frames

– Extremely fast to calculate
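PAA can be sketched in a few lines: each of the N frames is replaced by its mean value. The name `paa` is illustrative, and this sketch assumes n is a multiple of N.

```python
def paa(series, n_frames):
    """Reduce a series of length n to n_frames values, one mean per equal-sized frame."""
    n = len(series)
    if n % n_frames != 0:
        raise ValueError("this sketch assumes n is a multiple of N")
    size = n // n_frames
    return [sum(series[i * size:(i + 1) * size]) / size for i in range(n_frames)]
```

For example, `paa([1, 3, 2, 4, 6, 8, 0, 2], 4)` reduces eight points to the four frame means.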


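PAA continued

• PAA of U and L, denoted by $\hat{U}$ and $\hat{L}$:

$\hat{U}_i = \max\left(U_{\frac{n}{N}(i-1)+1}, \ldots, U_{\frac{n}{N}i}\right)$

$\hat{L}_i = \min\left(L_{\frac{n}{N}(i-1)+1}, \ldots, L_{\frac{n}{N}i}\right)$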
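The reduced envelope differs from ordinary PAA in that it keeps the frame maximum of U and the frame minimum of L rather than the mean, so the reduced envelope still encloses the original one. A minimal sketch, assuming n is a multiple of N and using the illustrative name `paa_envelope`:

```python
def paa_envelope(upper, lower, n_frames):
    """Return (U_hat, L_hat): per-frame max of upper and per-frame min of lower."""
    n = len(upper)
    if len(lower) != n or n % n_frames != 0:
        raise ValueError("this sketch assumes equal lengths and n a multiple of N")
    size = n // n_frames
    u_hat = [max(upper[i * size:(i + 1) * size]) for i in range(n_frames)]
    l_hat = [min(lower[i * size:(i + 1) * size]) for i in range(n_frames)]
    return u_hat, l_hat
```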


Indexing Dynamic Time Warping

• There are two time series (Q and C) of length n; both are reduced to N dimensions.

• C is a candidate sequence.

• Q is a query sequence.

• Approximate the minimum bounding rectangle (R) in each dimension of the candidate sequence C.

• MBR R = (L, H), where L = {l1, l2, …, lN} and H = {h1, h2, …, hN} are the low and high corners of the rectangle

• MINDIST(Q,R):

$MINDIST(Q,R) = \sqrt{\sum_{i=1}^{N} \frac{n}{N} \begin{cases} (l_i - \hat{U}_i)^2 & \text{if } l_i > \hat{U}_i \\ (h_i - \hat{L}_i)^2 & \text{if } h_i < \hat{L}_i \\ 0 & \text{otherwise} \end{cases}}$
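The MINDIST formula can be sketched directly. The function name and argument names are illustrative: `u_hat` and `l_hat` are the reduced query envelope, `mbr_low` and `mbr_high` are the corners L and H of the rectangle R, and `n` is the original sequence length.

```python
import math


def mindist(u_hat, l_hat, mbr_low, mbr_high, n):
    """Distance between the reduced query envelope and an MBR, scaled by n / N."""
    n_frames = len(u_hat)
    total = 0.0
    for u, l, lo, hi in zip(u_hat, l_hat, mbr_low, mbr_high):
        if lo > u:        # rectangle lies entirely above the envelope in this dimension
            total += (lo - u) ** 2
        elif hi < l:      # rectangle lies entirely below the envelope in this dimension
            total += (hi - l) ** 2
        # overlapping intervals contribute 0
    return math.sqrt(n / n_frames * total)
```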


K-Nearest neighbor search algorithm

• Given a query sequence Q and a desired number K of nearest neighbors from a set C of time series

• A priority queue stores index entries in increasing order of distance from Q

• Push the root node of the index into the queue

• At each step, pop the item at the top of the queue

• If the popped item is a PAA point C, compute the exact DTW(Q,C) and insert it into a temporary list 'temp'

• If it is an index node, compute the distance of each child from Q and push the children into the queue

• Move C from temp to the result only when we are sure that it is one of the K nearest neighbors of Q
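The steps above can be sketched with a binary heap. This is a simplified illustration, not the authors' code: the `Node` class is a hypothetical stand-in for an index node, and `lower_bound` and `exact_distance` are caller-supplied callables (for example a MINDIST-style bound and DTW) that are assumed to satisfy `lower_bound(item) <= exact_distance(leaf)` for every leaf under `item`.

```python
import heapq
from itertools import count


class Node:
    """Hypothetical index node: its children are Nodes or leaf entries."""

    def __init__(self, children):
        self.children = children


def knn_search(root, k, lower_bound, exact_distance):
    """Best-first K-NN search; returns a list of (exact distance, leaf entry)."""
    tie = count()                       # tie-breaker so heap items never compare directly
    queue = [(0.0, next(tie), root)]    # ordered by lower-bound distance
    temp = []                           # candidates ordered by exact distance
    result = []
    while queue and len(result) < k:
        bound, _, item = heapq.heappop(queue)
        # Nothing left in the queue can be closer than `bound`, so any candidate
        # whose exact distance is <= bound is certainly a true neighbor.
        while temp and len(result) < k and temp[0][0] <= bound:
            dist, _, cand = heapq.heappop(temp)
            result.append((dist, cand))
        if len(result) >= k:
            break
        if isinstance(item, Node):
            for child in item.children:
                heapq.heappush(queue, (lower_bound(child), next(tie), child))
        else:
            heapq.heappush(temp, (exact_distance(item), next(tie), item))
    # Queue exhausted: the remaining candidates are final, in order of exact distance.
    while temp and len(result) < k:
        dist, _, cand = heapq.heappop(temp)
        result.append((dist, cand))
    return result
```

The exact distance is computed only for leaf entries that reach the top of the queue, which is where the saving over a linear scan comes from.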


Experimental Evaluation

• Presented as the most comprehensive and detailed set of time series indexing experiments conducted up to that time

• A Sakoe-Chiba band with 10% width was used

• 32 datasets from various sources were taken; 50 sequences of length 256 were randomly extracted

• Tightness of the lower bound functions was compared by taking one sequence at a time and comparing it with the 49 others
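The all-pairs comparison can be sketched as follows. The slide does not define tightness; this sketch assumes it is the ratio of the lower bound to the true distance, averaged over all ordered pairs, and the name `mean_tightness` is illustrative.

```python
def mean_tightness(sequences, lower_bound, distance):
    """Average of lower_bound / distance over all ordered pairs of distinct sequences."""
    ratios = []
    for i, q in enumerate(sequences):
        for j, c in enumerate(sequences):
            if i == j:
                continue
            true_dist = distance(q, c)
            if true_dist > 0:  # skip identical pairs to avoid dividing by zero
                ratios.append(lower_bound(q, c) / true_dist)
    return sum(ratios) / len(ratios)
```

Under this definition a value of 1 means the bound equals the true distance, and a value near 0 means the bound prunes almost nothing.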


Experimental Evaluation (contd.)

• Pruning power of the lower bounding functions was also compared similarly

• LB_Keogh was also evaluated against Linear Scan on the basis of Normalized CPU Cost


Conclusion

• This paper provides a way to speed up DTW by indexing

• DTW allows similarity matching between sequences that are out of phase; Euclidean distance does not

• A new lower bounding function, LB_Keogh, was proposed; it is superior to the ones seen previously

• A method to index time series using the proposed lower bounding function was shown


References

• Eamonn J. Keogh: Exact Indexing of Dynamic Time Warping. VLDB 2002: 406-417

• Slides for the above paper by same author (All colored pictures in the presentation are from the author’s slides)

• Slides from the following class web page: www.csis.hku.hk/~nikos/courses/CSIS7101/multimedia.ppt


QUESTIONS?