Dato Confidential
Hello, my name is…
Shawn [email protected] of Product (Physicist, Cleantech Geek, Data Scientist, Urban Farmer)
I
Intelligent Applications
Who is Dato?
45+ and growing fast!
Dato’s mission is to accelerate the creation of intelligent applications by making sophisticated machine learning as easy as “Hello world!”
Business must be intelligent
Machine learning applications
• Recommenders
• Fraud detection
• Ad targeting
• Financial models
• Personalized medicine
• Churn prediction
• Smart UX (video & text)
• Personal assistants
• IoT
• Social networks
• …
Last decade: Data management
Last 5 years: Traditional analytics
Now: Intelligent apps
Example Intelligent Applications
Challenge today: Path from inspiration to production
Prototyping → Scale
People: Data scientist
Systems: Elastic, scalable
Data engineering is painful
• Limited by system memory
• Data munging & feature eng.
• Manipulate complex data types
Data intelligence is hard
• Models don’t scale
• No task-oriented ML
• Algos trapped in papers
Production is fragile
• Build custom services & API
• Write new code to scale
• Model management
Inspiration → Data Engineering → Data Intelligence → Production
Our customers
We make small teams extremely productive.
A developer (a former DBA) built and deployed their first recommender to increase community engagement (and therefore ad revenue).
A small team of developers built and deployed a recommender in 1/5 the time of previous efforts, with higher performance, to increase sales.
A small team of data scientists iterates on models more rapidly to improve a state-of-the-art music experience for users.
A small team iterates quickly to improve personalization (and increase revenue) in their daily deals.
A two-person team iterated on and deployed better job search ranking using text, to increase clicks and therefore revenue.
Demo: Recommender
• Out-of-core computation
• Tools for feature engineering
• Rich data type support
• Models built for scale
• App-oriented toolkits
• Advanced ML & Extensible
• Deploy models as low-latency REST services
• Same code for distributed computation
• Elastically scale up or out with one command
• Job monitoring & model management
• Deploy existing Python code & models
• Run on AWS EC2 or Hadoop YARN
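The "out-of-core computation" bullet in the list above can be illustrated with a small sketch: aggregate over rows as they stream from a file, so that only the running totals are held in memory rather than the whole dataset. This uses only the Python standard library and is not how SFrame is implemented; the column names `movie` and `rating` and the sample data are assumptions made for the example.

```python
import csv
import io
from collections import defaultdict


def mean_rating_per_item(rows):
    """Compute the mean rating per movie from an iterable of row dicts.

    Rows are consumed one at a time, so memory use grows with the number
    of distinct movies, not with the number of rows.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row["movie"]] += float(row["rating"])
        counts[row["movie"]] += 1
    return {item: totals[item] / counts[item] for item in totals}


# Hypothetical sample data; in practice this would be an open file handle.
sample = io.StringIO(
    "user,movie,rating\n"
    "ann,Alien,5\n"
    "bob,Alien,3\n"
    "bob,Contact,4\n"
)
means = mean_rating_per_item(csv.DictReader(sample))
print(means)
```

Passing `csv.DictReader(open(path))` instead of the in-memory sample would process a file of any size the same way, since the reader yields one row at a time.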
The Dato Machine Learning Platform
[Diagram: three products and their components]
GraphLab Create: Create Engine, SFrame, SGraph, Canvas, Machine Learning Toolkits, SDK
Dato Distributed: Distributed Engine, Direct, Job Client, Job Mgmt
Dato Predictive Services: Predictive Engine, REST Client, Direct, Model Mgmt
Sophisticated ML made easy - Toolkits
• Recommender
• Image search
• Sentiment analysis
• Data matching
• Auto tagging
• Churn predictor
• Object detector
• Product sentiment
• Click prediction
• Fraud detection
• User segmentation
• Data completion
• Anomaly detection
• Document clustering
• Forecasting
• Search ranking
• Summarization
• …
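import graphlab as gl

# Load the ratings data into an SFrame
data = gl.SFrame.read_csv('my_data.csv')

# Train a recommender from the user, item and rating columns
model = gl.recommender.create(data,
                              user_id='user',
                              item_id='movie',
                              target='rating')

# Top 5 recommendations for each user
recommendations = model.recommend(k=5)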
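To give a feel for what a recommender toolkit hides behind a call like `recommend(k=5)`, here is a minimal item co-occurrence sketch in plain Python. It is an illustration only and not the algorithm GraphLab Create uses: it ignores the rating values, and the function names and sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(pairs):
    """Count, for every pair of items, how many users interacted with both."""
    items_by_user = defaultdict(set)
    for user, item in pairs:
        items_by_user[user].add(item)
    counts = defaultdict(lambda: defaultdict(int))
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return items_by_user, counts


def recommend(user, items_by_user, counts, k=5):
    """Rank unseen items by how often they co-occur with the user's items."""
    seen = items_by_user.get(user, set())
    scores = defaultdict(int)
    for item in seen:
        for other, n in counts[item].items():
            if other not in seen:
                scores[other] += n
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


# Hypothetical (user, movie) interactions.
ratings = [
    ("ann", "Alien"), ("ann", "Blade Runner"),
    ("bob", "Alien"), ("bob", "Blade Runner"), ("bob", "Contact"),
    ("cy", "Alien"), ("cy", "Contact"),
    ("dee", "Alien"),
]
items_by_user, counts = build_cooccurrence(ratings)
print(recommend("ann", items_by_user, counts))
```

A production toolkit would add rating-aware similarity, scaling to large data and evaluation, which is the work the one-line `create` call is meant to take off the developer's hands.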
Principles:
• Get started fast
• Rapidly iterate
• Combine for new apps
Sophisticated ML made easy - Transfer learning
• Train a model on one task, use it for another task
• Examples
- Learn to walk, use that knowledge to run
- Train image tagger to recognize cars, use that knowledge to recognize trucks
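The idea of training on one task and reusing the result on another can be sketched with a toy example. Here a representation is learned on a source task (class prototypes for cars and bicycles) and then reused as the feature space for a target task (trucks versus motorcycles) that has only one labelled example per class. This is a simplified stand-in for reusing the layers of a trained image model; the two-number feature vectors (length in metres, wheel count), the labels and the function names are all made up for illustration.

```python
import math


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_prototypes(examples):
    """Learn one prototype (centroid) per label from (vector, label) pairs."""
    by_label = {}
    for vec, label in examples:
        by_label.setdefault(label, []).append(vec)
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def embed(vec, source_prototypes):
    """Represent a vector by its distances to the source-task prototypes."""
    return [distance(vec, source_prototypes[k]) for k in sorted(source_prototypes)]


def predict(vec, source_prototypes, target_prototypes):
    """Classify in the transferred feature space by nearest target prototype."""
    e = embed(vec, source_prototypes)
    return min(target_prototypes, key=lambda lbl: distance(e, target_prototypes[lbl]))


# Source task: plenty of labelled cars and bicycles (hypothetical data).
source = [([4.5, 4], "car"), ([4.2, 4], "car"),
          ([1.8, 2], "bicycle"), ([1.7, 2], "bicycle")]
source_protos = fit_prototypes(source)

# Target task: one labelled example per class, embedded with the
# representation learned on the source task.
target = [([8.0, 6], "truck"), ([2.2, 2], "motorcycle")]
target_protos = fit_prototypes([(embed(v, source_protos), l) for v, l in target])

print(predict([7.5, 6], source_protos, target_protos))
```

Only the small target-side classifier is fitted on the new data; the source prototypes stay fixed, which mirrors the common practice of keeping a pretrained feature extractor frozen and training a new classifier on top.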
Create an intelligent world!
Data Engineering
• Fast & scalable
• Rich data types
• Built for ML
Sophisticated ML
• App-oriented ML
• Supporting utils
• Extensibility
Deployment
• Batch & always-on
• RESTful interface
• Elastic & robust
Get the software: dato.com/download
Start learning: dato.com/learn
Bug me: [email protected]