Analytics on Time Series Data
Jure Leskovec
Stanford University, Chan Zuckerberg Biohub, Pinterest
Sensors are Everywhere
§ Sequences of time-stamped observations
§ In many applications, we generate large sequences of time-stamped observations
§ “Sensors” have a broad definition
Sensor Data: Time Series
§ Sensors generate lots of time-series data
Network Inference from Time Series Data
§ Convert a sequence of timestamped sensor observations into a time-varying network
Sensors → Time Series
§ But such time series data are:
§ High-dimensional
§ Unlabeled
§ High-velocity
§ Dynamic
§ Heterogeneous
Sensors: Much Data, No Insights
§ It is hard to obtain insights:
§ Sensor readings are interdependent and correlated
§ As the environment changes, dependencies might change
§ Sensors might fail
§ Readings might be asynchronous
[Pipeline: Raw Data → Structured Data → Learning Algorithm → Model → downstream prediction task; feature engineering vs. automatically learning the features]
Sensors: Much Data, No Insights
§ Sensor data is hard to work with
§ The algorithms must be:
§ Scalable: large amounts of raw data over long time series
§ Robust: must apply to lots of different applications
Challenge: ML & Insights
§ (Supervised) Machine Learning Lifecycle: this feature, that feature. Every single time!
How do we describe the structure of the time series so we can obtain insights and make predictions?
Discovering Structure
§ Value in “breaking down” the time series into a sequence of states:
§ Allows us to draw interpretable conclusions from the data
Segmentation and Clustering
§ In general, the “states” are not predefined
§ We do not know what they are, nor what they refer to
§ Instead, we need to discover these states in an unsupervised way, while simultaneously segmenting the time series into the states!
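A minimal illustration of the idea (not the paper's method): cluster short sliding windows of a synthetic two-sensor series, so that consecutive points fall into discovered "states". The data, window width, and number of clusters are all made up for the demo.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 2-sensor series with two alternating regimes (hypothetical data).
rng = np.random.default_rng(0)
a = rng.normal([0.0, 0.0], 1.0, size=(200, 2))   # regime 1: centered at 0
b = rng.normal([5.0, 5.0], 1.0, size=(200, 2))   # regime 2: shifted mean
ts = np.vstack([a, b, a])                        # shape (T, n_sensors)

# Stack each point with its w-1 successors so clusters see short-term context.
w = 5
windows = np.hstack([ts[i : len(ts) - w + 1 + i] for i in range(w)])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(windows)
# Consecutive windows mostly share a label, segmenting the series into states.
```

Because the windows overlap, each label describes a short stretch of the series rather than a single reading, which is what makes the resulting segmentation interpretable.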
How to Describe States
§ Networks are great to encode structure and dependencies in data
[Figure: sensors and the dependencies between them]
Network Inference via the Time-Varying Graphical Lasso. D. Hallac, Y. Park, S. Boyd, J. Leskovec. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2017.
Our Approach
§ Given a set of sensor data, we learn temporal dependency networks between different sensors
§ The network is not static but can change over time
§ Allows us to gain insights into the process and detect anomalies
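A sketch of estimating a dependency network from sensor readings with scikit-learn's graphical lasso, a static version of the time-varying estimator in the cited TVGL paper. The sensor layout and data below are invented for illustration.

```python
import numpy as np
from sklearn.covariance import GraphicalLassoCV

# Hypothetical readings from 4 sensors; sensors 0 and 1 are driven by a
# common latent factor, so they should end up connected in the network.
rng = np.random.default_rng(1)
latent = rng.normal(size=1000)
X = np.column_stack([
    latent + 0.3 * rng.normal(size=1000),   # sensor 0
    latent + 0.3 * rng.normal(size=1000),   # sensor 1
    rng.normal(size=1000),                  # sensor 2 (independent)
    rng.normal(size=1000),                  # sensor 3 (independent)
])

model = GraphicalLassoCV().fit(X)
# Nonzero off-diagonal entries of the estimated precision (inverse covariance)
# matrix are the edges of the dependency network.
edges = np.abs(model.precision_) > 1e-4
np.fill_diagonal(edges, False)
```

Sparsity in the precision matrix is what turns raw correlations into an interpretable network: only direct dependencies survive, not ones mediated by other sensors.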
States and Dependencies
[Figure: each state is summarized by a sensor dependency network (e.g. states A and B)]
TICC Problem Setup
§ Formal definition:

argmin over Θ = {Θ_1, …, Θ_K} and partition P = {P_1, …, P_K} of
Σ_{i=1..K} [ ‖λ ∘ Θ_i‖_1 + Σ_{X_t ∈ P_i} ( −ℓℓ(X_t, Θ_i) + β · 1{X_{t−1} ∉ P_i} ) ]

where each Θ_i is a block-Toeplitz inverse covariance matrix defining state i, ‖λ ∘ Θ_i‖_1 enforces sparsity, ℓℓ(X_t, Θ_i) is the Gaussian log-likelihood of observation X_t under state i, and β penalizes switching states between consecutive time points
Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data. D. Hallac, S. Vare, S. Boyd, J. Leskovec. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2017
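A much-simplified sketch of the alternation TICC performs, assuming a plain graphical lasso per state (the paper additionally enforces the block-Toeplitz structure): a model step fits a sparse inverse covariance for each state, and an assignment step runs Viterbi-style dynamic programming with the switching penalty β. All data and hyperparameters are illustrative.

```python
import numpy as np
from numpy.linalg import slogdet
from sklearn.cluster import KMeans
from sklearn.covariance import GraphicalLasso

def fit_states(X, K, beta=2.0, n_iter=3):
    """Alternate between fitting per-state sparse Gaussians and
    reassigning points with a temporal switching penalty."""
    T, n = X.shape
    labels = KMeans(n_clusters=K, n_init=10, random_state=0).fit_predict(X)
    for _ in range(n_iter):
        # Model step: sparse inverse covariance per state (assumes no state
        # ever loses all of its points, which holds for separated regimes).
        nll = np.empty((T, K))
        for k in range(K):
            m = GraphicalLasso(alpha=0.1).fit(X[labels == k])
            d = X - m.location_
            maha = np.einsum("ti,ij,tj->t", d, m.precision_, d)
            nll[:, k] = 0.5 * (maha - slogdet(m.precision_)[1]
                               + n * np.log(2 * np.pi))
        # Assignment step: Viterbi DP; beta implements the
        # beta * 1{X_{t-1} not in P_i} term of the objective.
        cost = nll[0].copy()
        back = np.zeros((T, K), dtype=int)
        for t in range(1, T):
            trans = cost[None, :] + beta * (1 - np.eye(K))
            back[t] = trans.argmin(axis=1)
            cost = nll[t] + trans.min(axis=1)
        labels[-1] = cost.argmin()
        for t in range(T - 1, 0, -1):
            labels[t - 1] = back[t, labels[t]]
    return labels

# Demo on synthetic data with two well-separated regimes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 3)),
               rng.normal(4, 1, (200, 3)),
               rng.normal(0, 1, (200, 3))])
seq = fit_states(X, K=2)
```

The switching penalty is what distinguishes this from per-point clustering: it makes label changes expensive, so the output is a small number of contiguous segments.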
TICC: Scalability
§ Can scale to problems with tens of millions of observations!
Scalability
§ ADMM can solve for millions of variables in minutes!
§ A centralized solver (CVXPY) explodes computationally

Number of Unknowns | TVGL | Interior-Point
100 | 0.9 | 37.9
360 | 0.9 | 5362.7
200,000 | 48.3 | -
5 million | 706.4 | -
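The ADMM splitting behind these numbers can be illustrated on a much smaller problem. This is a generic ADMM for the lasso, not the TVGL solver itself; the problem size and penalty are arbitrary.

```python
import numpy as np

def admm_lasso(A, b, lam, rho=1.0, n_iter=300):
    """ADMM for minimize 0.5*||Ax - b||^2 + lam*||x||_1."""
    m, n = A.shape
    x = np.zeros(n)
    z = np.zeros(n)
    u = np.zeros(n)
    AtA = A.T @ A + rho * np.eye(n)   # in practice, factor this once
    Atb = A.T @ b
    for _ in range(n_iter):
        x = np.linalg.solve(AtA, Atb + rho * (z - u))   # quadratic x-update
        v = x + u
        z = np.sign(v) * np.maximum(np.abs(v) - lam / rho, 0.0)  # soft-threshold
        u = u + x - z                                   # dual update
    return z

# Demo: recover a sparse vector from noisy linear measurements.
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 20))
x_true = np.zeros(20)
x_true[:3] = [2.0, -3.0, 1.5]
b = A @ x_true + 0.01 * rng.normal(size=100)
x_hat = admm_lasso(A, b, lam=0.5)
```

Each iteration only solves a linear system and applies an elementwise soft-threshold, which is why the approach scales to problems where a general interior-point solver does not.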
[Plot: runtime comparison, CVXPY vs. SnapVX]
SnapVX: A Network-Based Convex Optimization Solver. D. Hallac, C. Wong, S. Diamond, A. Sharang, R. Sosič, S. Boyd, J. Leskovec. Journal of Machine Learning Research (JMLR), 18(4):1−5, 2017.
Case Study: Cars
§ Car driving sessions: 36,000 samples @ 10 Hz
§ We observed 7 sensors:
§ Brake pedal position
§ Forward (X-)acceleration
§ Lateral (Y-)acceleration
§ Steering wheel angle
§ Vehicle velocity
§ Engine RPM
§ Gas pedal position
Cars – “Turning” State
Cars – “Stopping” State
TICC Results: Car Data
§ We run TICC with K=5 states (selected via BIC)
§ The betweenness centrality score of each node in each state/network:
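Betweenness centrality is straightforward to compute once a state's network is known; the edges below are hypothetical, not the networks actually learned from the car data.

```python
import networkx as nx

# Hypothetical dependency network for one state (e.g. "turning").
edges = [
    ("steering angle", "lateral accel"),
    ("lateral accel", "velocity"),
    ("velocity", "engine RPM"),
    ("velocity", "gas pedal"),
    ("brake pedal", "velocity"),
]
G = nx.Graph(edges)

# Fraction of shortest paths passing through each node.
centrality = nx.betweenness_centrality(G)
```

In this toy network "velocity" sits on most shortest paths and so gets the highest score, which is how a per-state centrality profile characterizes which sensor dominates each state.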
Resulting States
§ Green = straight, White = slowing down, Red = turning, Blue = speeding up
§ Results are very consistent across the data!
Predicting the Future (but without feature engineering)
Problem Setup
§ Dataset: Automobile data containing 1,400 sensors recording at 10 Hz
§ Goal: Predict driver actions 1 sec before they occur:
§ Left/right blinker
§ Accelerate (gas pedal > threshold)
§ Hard braking (brake pedal > threshold)
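Framing "predict 1 s ahead" as supervised learning amounts to shifting the event indicator by 10 samples at 10 Hz. The signal and threshold here are hypothetical.

```python
import numpy as np

def make_labels(signal, threshold=0.5, horizon=10):
    """Label at time t: does the thresholded event occur at t + horizon?"""
    event = signal > threshold        # 1 when the action is happening
    return event[horizon:]            # shift the indicator into the past

# Toy brake-pedal trace: the driver presses the pedal at samples 50-59.
pedal = np.zeros(100)
pedal[50:60] = 1.0
y = make_labels(pedal)
```

The matching features are the sensor readings up to each time step (e.g. `X[:-horizon]`), so the model never sees the future it is asked to predict.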
Driver Identification Using Automobile Sensor Data from a Single Turn. D. Hallac, A. Sharang, R. Stahlmann, A. Lamprecht, M. Huber, M. Roehder, R. Sosic, J. Leskovec. IEEE International Conference on Intelligent Transportation Systems (ITSC), 2016.
Neural Networks
§ Multi-layer RNN architecture: the output of the first LSTM is passed as input to a second LSTM
§ 3,000 h of driving, 1,400 sensors, at 10 Hz ≈ 150 billion datapoints
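A sketch of such a stacked architecture in PyTorch; the layer widths and output head are illustrative, not the paper's actual hyperparameters.

```python
import torch
import torch.nn as nn

class DriverActionNet(nn.Module):
    """Two-layer (stacked) LSTM over raw sensor windows."""
    def __init__(self, n_sensors=1400, hidden=128, n_actions=1):
        super().__init__()
        # num_layers=2: the first LSTM's output sequence feeds the second.
        self.lstm = nn.LSTM(n_sensors, hidden, num_layers=2, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, x):                     # x: (batch, time, sensors)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])          # predict from the last timestep

model = DriverActionNet(n_sensors=8)          # toy width for the demo
logits = model(torch.randn(4, 20, 8))         # 4 windows of 20 timesteps
```

Feeding raw sensor windows directly into the stacked LSTM is what removes the hand-built features from the lifecycle described above.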
Time-Series Analysis
[Pipeline diagram: data stages (sparse TSV, SNAP binary format, asynchronous signal array, task-specific data) and steps (space-efficient data conversion, data filtering, analysis, process results)]
Results – Brake Pedal
§ AUC of predicting whether the driver will press the brake pedal:
§ 0.934 test-set AUC (0.935 training AUC)
Predicting Driver Actions
Failure Prediction
§ Dataset: Boiler data containing 100 sensors at 1 Hz for 6-12 months
§ Goal: Predict component failures 7 days before they occur
Preliminary Results
§ Very promising results across various deep learning models!
Conclusion
§ Complex engineered systems
§ High-dimensional unlabeled time series data collected in real time
§ We need tools to understand these data as well as to make accurate predictions
Thanks!
§ Joint work with D. Hallac, Y. Park, S. Boyd, R. Sosic, S. Vare, A. Sharang, C. Wong, S. Diamond.
§ Papers:
§ Network Inference via the Time-Varying Graphical Lasso. D. Hallac, Y. Park, S. Boyd, J. Leskovec. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2017.
§ Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data. D. Hallac, S. Vare, S. Boyd, J. Leskovec. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2017.
§ Learning the Network Structure of Heterogeneous Data via Pairwise Exponential Markov Random Fields. Y. Park, D. Hallac, S. Boyd, J. Leskovec. Artificial Intelligence and Statistics Conference (AISTATS), 2017.
§ SnapVX: A Network-Based Convex Optimization Solver. D. Hallac, C. Wong, S. Diamond, A. Sharang, R. Sosič, S. Boyd, J. Leskovec. Journal of Machine Learning Research (JMLR), 18(4):1−5, 2017.
§ Driver Identification Using Automobile Sensor Data from a Single Turn. D. Hallac, A. Sharang, R. Stahlmann, A. Lamprecht, M. Huber, M. Roehder, R. Sosic, J. Leskovec. IEEE International Conference on Intelligent Transportation Systems (ITSC), 2016.
§ Network Lasso: Clustering and Optimization in Large Graphs. D. Hallac, J. Leskovec, S. Boyd. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2015.
http://snap.stanford.edu
@jure

THANKS!