Upload
michi
View
44
Download
2
Embed Size (px)
DESCRIPTION
Online Discovery and Maintenance of Time Series Motifs. Abdullah Mueen Eamonn Keogh University of California, Riverside. Time Series Motifs. Repeated pattern in a time series Utility: Activity/Event discovery Summarization Classification Application - PowerPoint PPT Presentation
Citation preview
Abdullah Mueen Eamonn KeoghUniversity of California, Riverside
Online Discovery and Maintenance
of Time Series Motifs
Time Series Motifs
• Repeated pattern in a time series
• Utility:– Activity/Event discovery– Summarization– Classification
• Application– Data Center Chiller Management (Patnaik, et al. KDD 09)– Designing Smart Cane for Elders (Vahdatpour, et al. IJCAI 09)
0 2000 4000 6000 80000
10
20
30
Online Time Series Motifs
• Streaming time series• Sliding window of the recent history
– What minute long trace repeated in the last hour?
Problem FormulationDiscovery
100 200 300 400 500 600 700 800 900 1000
-8000
-7500
-7000
. . .
12345678...
873
The most similar pair of non-overlappingsubsequences
Maintenance
12345678...
873874
time:100112345678...
873874875
time:100212345678...
873874875876
time:100312345678...
873874875876877
time:100412345678...
873874875876877878
time:1005
Update motif pair afterevery time tick
time
Challenge
• A subsequence is a high dimensional point– The dynamic closest pair of points problem
• Closest pair may change upon every update• Naïve approach: Do quadratic comparisons.
4
5
2
7
3
6
89
4
5
2
7
3
6
8
1
4
5
2
7
3
6
8
1
Deletion Insertion
Our Approach
• Goal: Algorithm with Linear update time• Previous method for dynamic closest pair
(Eppstein,00) – A matrix of all-pair distances is maintained
• O(w2) space required
– Quad-tree is used to update the matrix• Maintain a set of neighbors and reverse
neighbors for all points• We do it in O( ) spaceww
Outline
• Methods• Evaluation• Extensions• Case Studies• Conclusion
Maintaining Motif• Smallest nearest neighbor → Closest pair• Upon insertion
– Find the nearest neighbor; Needs O(w) comparisons.
• Upon deletion– Find the next NN of all the reverse NN
4
5
2
7
3
6
8
1
4
5
2
7
3
6
8
1
Insertion
9
4
5
2
7
3
6
8
1
Deletion
9
Data Structure
1 2 3 4 5 6 7 8FIFO Sliding window
N-listsNeighbors in
order of distances
8 7 21 8 1 24
5 1 17 3 3 67
7 6 75 2 4 55
1 4 83 7 5 72
4 8 42 5 2 13
6 1 38 1 8 38
3 5 66 4 6 46
R-listsPoints that should be updated
Total number of nodes in R-lists is w
5 1 3 24
8 67
4
5
2
7
3
6
8
1
4
7
Still O(w2) space
(ptr, distance)
ObservationsWhile inserting
Updating NN of old points is not necessary A point can be removed from the neighbor list if it
violates the temporal order
Averagelength O( )w
Example4
5
2
7
3
6
8
1
1 1 21 3 1 2
52
83
2 43 5 3 6
4 7
5
6
4
7
6
1 2 3 4 5 6 7 8
2 3 32 3 2 6
3 7 9
5
54 4 6 8
5 7
6
68
4
2 3 4 5 6 7 8 91
4
5
2
7
3
6
8
1
9
4
5
2
7
3
6
89
10
3 34 3 6 6
57 9
54 4 7 8
5
6
4
6 8
5
7
8
9
3 4 5 6 7 8 9 1021
10
Evaluation
0
0.02
0.04
0.06
0.08
0.1
0.12
0.140.16
0.18
Fast Pair Insect
EEGEOGRW
Ave
rage
Upd
ate
Tim
e (S
ec)
0 0.5 1 1.5 2 2.5 3 3.5 4x 104
Window Size (w)0 200 400 600 800 1000
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
Insect
EEG
EOGRW
Fast Pair
Ave
rage
Upd
ate
Tim
e (S
ec)
Motif Length (m)
8x
0 0.5 1 1.5 2 2.5 3 3.5 4x 104
102030405060708090100110
Insect
EEG
EOG
RW
Ave
rage
Len
gth
of N
-lis
t
Window Size (w)
•Up to 8x speedup from general dynamic closest pair• Stable space cost per point with increasing window size
0 200 400 600 800 10000
50
100
150
200
250
300
350
InsectEEG
EOG
RW
Motif Length (m)
Ave
rage
Len
gth
of N
-lis
t
Outline
• Methods• Evaluation• Extensions• Case Studies• Conclusion
Extension
1 1 21 3 1 2
52
83
2 43 5 3 6
4 5
5 7
6
4
7
6
1 2 3 4 5 6 7 8
1 1 21 3 1 2
52
83
2 43 5 3 6
4 5
5 7
6
4
7
6
1 2 3 4 5 6 7 8
1 1 21 3 1 2
52
83
2 43 5 3 6
4 5
5 7
6
4
7
6
1 2 3 4 5 6 7 8
• Multidimensional Motif• Maintain motif in Multiple Synchronous time series• Points in one series can have neighbors in others
• Arbitrary Data Rate• Load shedding: Skip points if can’t handle
0.40.50.60.70.80.91.0
0.4
0.5
0.6
0.7
0.8
0.9
1.0
Fraction of Data Used
Fraction of M
otifs Discovered
EOGRandom Walk
EEGInsect FIFO for each streams
Case Study 1: Online Compression
• Replace all the occurrences of a motif by a pointer to that motif.
• For signals with regular repetitions– Higher compression rate with less error.
0 500 1000 1500 2000
Raw
PLA
MR
A
BA AB B B B B
Dictionary
Case Study 2: Closing The Loop
• Check if a robot closed a loop• Convert the stream of video frames to stream
of color histograms.• Most similar color histograms are good
candidates for loop detection.
0 400 800 1200 16000481216
0 400 800 1200 16000481216
Start Loop Closure Detected
Conclusion
• First attempt to maintain time series motif online– Maintains minutes long repetition in hours long
sliding window• Linear update time with less space cost than
quad-tree based method– O( ) Vs O(w2)
• Faster than general dynamic closest pair solution
ww
Thank You
Code and Data: http://www.cs.ucr.edu/~mueen/OnlineMotif