Saving Human Lives with the IoT
Cloud Foundry Summit Europe 2016
Dat Tran @datitran
September 27th, 2016
echo $(whoami)
Senior Data Scientist
@datitran
Nearly 1.2 million people die in road crashes each year
(WHO - 2015)
The worst road traffic conditions, such as glazed frost
Most traffic accidents are
predictable and preventable
We believe transforming how the world builds software will make the
world a safer place
Technology
“Cloud” Architecture
Cloud Native Architecture
[Architecture diagram (λ-Architecture). Labels shown:
o Spring XD: FTP Source, Shell Processor, Tap, S3 Sink for persistent storage, Redis Sink
o Storage and infrastructure: Redis for Pivotal CF, S3, EC2, Model persistence
o Batch Layer: Feature Engineering, Deep Learning, Model Pipeline Orchestration
o Real-Time Layer: Enricher, Models, Predictive API, Dashboard
o Inputs: Measured data, Weather data
o Data formats: CSV, JSON]
Spring XD
Unified, distributed, and extensible open-source system for data ingestion, real-time analytics, batch processing, and data export
o Data Ingestion and Pipeline Processing
o Real-Time Analytics
o Rapid Dashboarding
o Batch Workflow Orchestration + ETL
[Diagram labels: Spring XD, FTP Source, Shell Processor, Tap, S3, S3 Sink for persistent storage, Redis Sink, Redis for Pivotal CF]
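The stream on this slide (an FTP source, a shell processor, a Redis sink, and a tap feeding an S3 sink) follows a source | processor | sink pattern. The sketch below illustrates that pattern in plain Python. It is not Spring XD code: every name in it (`ftp_source`, `shell_processor`, `run_stream`, the in-memory sinks) is a hypothetical stand-in, and attaching the tap to the raw records is an assumption, since the slides do not show where the tap sits.

```python
from typing import Callable, Iterable, Iterator, List


def ftp_source(lines: Iterable[str]) -> Iterator[str]:
    """Stand-in for the FTP source: yields raw CSV lines one by one."""
    for line in lines:
        yield line.strip()


def shell_processor(line: str) -> str:
    """Stand-in for the shell processor: here it only upper-cases the station id."""
    station, value = line.split(",")
    return f"{station.upper()},{value}"


def run_stream(
    source: Iterator[str],
    processor: Callable[[str], str],
    sink: List[str],
    tap_sink: List[str],
) -> None:
    """Push every record through the processor into the sink.

    The tap receives an unmodified copy of each record, so the main
    stream is unaffected by whatever the tap does with it.
    """
    for record in source:
        tap_sink.append(record)         # tap: raw copy for persistent storage
        sink.append(processor(record))  # main stream: processed record


redis_sink: List[str] = []  # hypothetical in-memory stand-in for the Redis sink
s3_sink: List[str] = []     # hypothetical in-memory stand-in for the S3 sink
run_stream(ftp_source(["a1,0.5\n", "b2,1.5\n"]), shell_processor, redis_sink, s3_sink)
```

After the run, `redis_sink` holds the processed records and `s3_sink` holds the raw copies, which mirrors the split between fast serving storage and persistent storage.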
Batch Layer
[Diagram labels: S3, Feature Engineering, Deep Learning, Model Pipeline Orchestration; data format: CSV]
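As a rough illustration of the feature engineering step in the batch layer, the sketch below turns raw CSV rows into numeric features using only the standard library. The column names (`timestamp`, `road_temp_c`, `air_temp_c`) and the derived features are assumptions made for the example; the slides do not list the actual schema or features.

```python
import csv
import io
from typing import Dict, List

# Hypothetical sample of measured data; the real columns are not given in the slides.
RAW_CSV = """timestamp,road_temp_c,air_temp_c
2016-01-10T06:00,-1.5,0.5
2016-01-10T07:00,2.0,3.0
"""


def engineer_features(csv_text: str) -> List[Dict[str, float]]:
    """Turn raw CSV rows into numeric feature dictionaries."""
    rows = csv.DictReader(io.StringIO(csv_text))
    features = []
    for row in rows:
        road = float(row["road_temp_c"])
        air = float(row["air_temp_c"])
        features.append({
            "road_temp_c": road,
            "temp_gap_c": air - road,                         # air minus road temperature
            "road_below_freezing": 1.0 if road < 0 else 0.0,  # simple frost indicator
        })
    return features


batch_features = engineer_features(RAW_CSV)
```

In a real batch job the CSV text would come from S3 rather than a string literal, and the resulting features would feed model training.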
Real-Time Layer
[Diagram labels: Measured data, Weather data, Enricher, Models, Predictive API, Redis for Pivotal CF, Dashboard; data formats: JSON, CSV]
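One plausible reading of the Enricher in the real-time layer is a step that joins each measurement with the matching weather record before it reaches the models. The sketch below shows that idea with JSON in and JSON out. The field names (`station`, `road_temp_c`, `air_temp_c`, `precipitation_mm`) and the join key are assumptions for illustration, not details from the talk.

```python
import json
from typing import Dict


def enrich(measurement_json: str, weather_by_station: Dict[str, dict]) -> str:
    """Attach the weather record for the same station to a measurement."""
    measurement = json.loads(measurement_json)
    # Fall back to an empty record when no weather data is known for the station.
    weather = weather_by_station.get(measurement["station"], {})
    enriched = {**measurement, "weather": weather}
    return json.dumps(enriched, sort_keys=True)


# Hypothetical records; field names are illustrative only.
weather_lookup = {"a1": {"air_temp_c": 0.5, "precipitation_mm": 1.2}}
enriched_json = enrich('{"station": "a1", "road_temp_c": -1.5}', weather_lookup)
```

Keeping the enricher a pure function of its inputs makes it easy to test in isolation, which fits the test-driven workflow described later in the deck.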
Pivotal Cloud Foundry PaaS
more…
http://
Push App> cf
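To make the "push an app" idea concrete, here is a minimal sketch of a predictive API that could be deployed to a platform such as Cloud Foundry, written with the Python standard library only. The `predict` rule is a placeholder, not the model from the talk, and the names `predict`, `handle_body`, `PredictHandler`, and `main` are hypothetical. The one platform-specific detail is that Cloud Foundry passes the port to listen on through the `PORT` environment variable.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def predict(features: dict) -> dict:
    """Placeholder model: flags risk when the road is below freezing."""
    risky = features.get("road_temp_c", 0.0) < 0.0
    return {"risk": "high" if risky else "low"}


def handle_body(body: bytes) -> Tuple[int, bytes]:
    """Parse a JSON request body and return (HTTP status, JSON response)."""
    try:
        features = json.loads(body)
    except ValueError:
        return 400, b'{"error": "invalid JSON"}'
    if not isinstance(features, dict):
        return 400, b'{"error": "expected a JSON object"}'
    return 200, json.dumps(predict(features)).encode("utf-8")


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_body(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)


def main() -> None:
    """Start the server; call this from the app's entry point."""
    # Cloud Foundry tells the app which port to bind to via the PORT variable.
    port = int(os.environ.get("PORT", "8080"))
    HTTPServer(("0.0.0.0", port), PredictHandler).serve_forever()
```

Separating `handle_body` from the HTTP handler keeps the request logic testable without starting a server.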
Modelling
Short Introduction to Deep Learning
[Diagram: neural network with an Input Layer, Hidden Layer 1, Hidden Layer 2, and an Output Layer with three classes: Positive, Neutral, Negative]
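The network pictured on this slide (input layer, two hidden layers, three output classes) can be sketched as a plain forward pass. The example below is a minimal illustration in pure Python with random, untrained weights. The layer sizes, the ReLU activation, and the softmax output are assumptions chosen for the sketch; the slide does not specify them.

```python
import math
import random
from typing import List, Tuple

LABELS = ["Positive", "Neutral", "Negative"]
Layer = Tuple[List[List[float]], List[float]]  # (weights, biases)


def dense(inputs: List[float], layer: Layer) -> List[float]:
    """Affine transform: one weighted sum plus bias per output unit."""
    weights, biases = layer
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values: List[float]) -> List[float]:
    return [max(0.0, v) for v in values]


def softmax(values: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(rng: random.Random, n_in: int, n_out: int) -> Layer:
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(inputs: List[float], layers: List[Layer]) -> List[float]:
    """Input -> hidden layers (ReLU) -> output layer (softmax)."""
    activations = inputs
    for layer in layers[:-1]:
        activations = relu(dense(activations, layer))
    return softmax(dense(activations, layers[-1]))


rng = random.Random(0)  # fixed seed so the run is repeatable
sizes = [4, 5, 5, 3]    # input, hidden 1, hidden 2, output (arbitrary sizes)
network = [random_layer(rng, n_in, n_out) for n_in, n_out in zip(sizes, sizes[1:])]

probabilities = forward([0.2, -0.1, 0.7, 0.0], network)
predicted_label = LABELS[probabilities.index(max(probabilities))]
```

Because the weights are untrained, the predicted label is meaningless here; the point is only the shape of the computation from inputs to class probabilities.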
Types of Neural Network
Recurrent Neural Network
o Models dynamic temporal behaviors
o Many variants: LSTM, GRU, Bi-directional RNNs etc.
o Applications: Handwriting and speech recognition and many more
Feed-Forward Neural Network
o Ideal for functional mapping problems
o Architectures: Multi-layer perceptron, CNNs etc.
o Many applications in supervised learning
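What sets a recurrent network apart from a feed-forward one is the hidden state carried from one time step to the next. The sketch below implements a single step of a simple (Elman) RNN, h_t = tanh(W_x x_t + W_h h_{t-1} + b), in pure Python. The weights are arbitrary illustrative values, and the names `rnn_step` and `run_rnn` are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def rnn_step(x: List[float], h_prev: List[float],
             w_x: Matrix, w_h: Matrix, b: List[float]) -> List[float]:
    """One step of a simple RNN: h_t = tanh(W_x x_t + W_h h_{t-1} + b)."""
    return [
        math.tanh(
            sum(wx * xi for wx, xi in zip(w_x[i], x))
            + sum(wh * hj for wh, hj in zip(w_h[i], h_prev))
            + b[i]
        )
        for i in range(len(b))
    ]


def run_rnn(sequence: List[List[float]],
            w_x: Matrix, w_h: Matrix, b: List[float]) -> List[float]:
    """Feed a sequence through the RNN and return the final hidden state."""
    h = [0.0] * len(b)  # initial hidden state
    for x in sequence:
        h = rnn_step(x, h, w_x, w_h, b)
    return h


# Arbitrary illustrative weights: 1 input feature, 2 hidden units.
W_X = [[0.5], [-0.3]]
W_H = [[0.1, 0.0], [0.0, 0.1]]
B = [0.0, 0.0]

final_state = run_rnn([[1.0], [0.0], [-1.0]], W_X, W_H, B)
reversed_state = run_rnn([[-1.0], [0.0], [1.0]], W_X, W_H, B)
```

Feeding the same values in reverse order yields a different final state, which is the sense in which the network models temporal behavior.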
Recurrent Neural Network
Image classification
Image captioning
Sentiment analysis
Machine translation
Video classification
Source: The Unreasonable Effectiveness of Recurrent Neural Networks (Andrej Karpathy)
Key Learnings
o Overbalanced problem
o Use simple RNNs over LSTM
o Use GPUs!
o Time-consuming to find the optimal network
o A lot of data is needed
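Assuming the first learning refers to class imbalance (accidents being far rarer than non-accidents, which the slides do not state explicitly), one common mitigation is to weight rare classes more heavily during training. The sketch below computes such weights with the widely used heuristic n_samples / (n_classes * class_count). The labels are made up for the example, and nothing here is claimed to be the approach used in the project.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence


def balanced_class_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical labels: 1 = accident, 0 = no accident.
labels = [0] * 95 + [1] * 5
weights = balanced_class_weights(labels)
```

With 95 negatives and 5 positives, the rare class receives a much larger weight, so each of its examples contributes more to the training loss.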
Processes
Radically Agile Data Science @ Pivotal Labs
Pair Programming
Retros
Test Driven Development
Continuous Integration / API First
Tracker
Standups
API First
Source: What is hardcore data science—in practice? (Mikio Braun)
Additional Links
o API First for Data Science http://engineering.pivotal.io/post/api-first-for-data-science/
o What is hardcore data science—in practice? https://www.oreilly.com/ideas/what-is-hardcore-data-science-in-practice
Continuous Delivery
o Release once every 6 months: More bugs in production
o Release early and often: Higher quality of code
Pair Programming
o Not my problem: I pay so you deliver
o Shared responsibility: We build it together
Microservices
o Tightly coupled components: Slow deployment cycles waiting on integrated test teams
o Loosely coupled components: Automated deploy without waiting on individual components
Key Takeaways
o API First: Bringing the models into production as fast as possible helps to minimize risk
o Clients can test it and give early/regular feedback
o Fast ROI
o Cloud Foundry enables us to reliably expose models as scalable predictive APIs
o Cloud Native Data Science is crucial for Smart Apps
Questions?
@datitran