Learnings from Building Data Products
Resources and Guidelines
Bob Raman, Engineering Manager - Data
Today’s agenda
Data Product Team
1. Customer Satisfaction (CSAT)
2. Answer Bot
3. Team structure, Delivery, Scalability
4. Our next evolution in building Data Products
5. Takeaways
Customer engagement
Relationship management
Zendesk
Data Product Team
● Build machine learning based products to solve business problems
● Team of 20+ people
  ○ Data Eng - full stack
The Team, circa 6/2017
Satisfaction Prediction (CSAT)
Identify end users at risk of having a bad outcome before it happens
CSAT in Action
CSAT Thresholding
Background - Classical Machine Learning Model
Background - Model Build Stack
Map-Reduce - Cascalog (Clojure)
Background - Model Serving Stack (Change Data Capture)
Serving machine learning models on AWS
How we worked?
Data Scientists and Data Engineers
● Separate DS and DE teams
  ○ 3 Data Scientists (DS)
  ○ 5 Data Engineers (DE)
  ○ PM, QA, Shared UX and Ops
● PM worked with DS and DE teams
Delivery - Streams
Delivery - Scalability
● Model serving
  ○ “Dark Launch” approach
● Model building
  ○ Use Map-reduce - 3000 models/week
    ■ Shell out in “reducer”
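The “Dark Launch” approach to model serving can be sketched as follows. This is a minimal illustration, not the team's implementation: the function name `serve_with_dark_launch`, the stand-in models, and the `wait_hours` feature are all hypothetical. The idea shown is that a candidate model receives real traffic while only the live model's result is returned to the caller.

```python
import logging
from typing import Callable, Dict, List, Tuple

logger = logging.getLogger("dark_launch")

Features = Dict[str, float]
Model = Callable[[Features], float]


def serve_with_dark_launch(
    features: Features,
    live_model: Model,
    candidate_model: Model,
    shadow_log: List[Tuple[float, float]],
) -> float:
    """Return the live model's score and run the candidate silently alongside."""
    live_score = live_model(features)
    try:
        # The candidate sees real traffic, but its output is only recorded.
        shadow_log.append((live_score, candidate_model(features)))
    except Exception:
        # A failing candidate must never affect the response to the caller.
        logger.exception("candidate model failed during dark launch")
    return live_score


# Hypothetical stand-in models, for illustration only.
def live(features: Features) -> float:
    return 0.5


def candidate(features: Features) -> float:
    return features["wait_hours"] / 10.0


shadow: List[Tuple[float, float]] = []
result = serve_with_dark_launch({"wait_hours": 8.0}, live, candidate, shadow)
# Missing feature: the candidate raises KeyError, the caller still gets 0.5.
broken = serve_with_dark_launch({}, live, candidate, shadow)
```

Comparing the pairs collected in `shadow` offline gives a view of candidate behaviour and load before any customer sees its output.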
Learnings #1 - Technology
● Data Scientists deploy Python to prod
  ○ Python is fast enough
  ○ Virtual environment - Virtualenv
  ○ Scaling done by Data Engineers
● De-normalised intermediate datasets
Learnings #2 - UX Issues
● Exposing CSAT score
  ○ Fragile customer business rules
    ■ Limits improvement to algorithm
  ○ Gradated scale may have been better
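One way to picture the “gradated scale” idea: expose a coarse band instead of the raw score, so customer business rules do not depend on exact values that shift when the algorithm changes. The sketch below is an assumption about what such a scale could look like; the band edges, labels, and the function name `satisfaction_band` are hypothetical and do not come from the talk.

```python
from bisect import bisect_right
from typing import Sequence

# Hypothetical band edges and labels; real values would need product research.
BAND_EDGES: Sequence[float] = (0.25, 0.5, 0.75)
BAND_LABELS: Sequence[str] = ("high risk", "at risk", "neutral", "likely satisfied")


def satisfaction_band(score: float) -> str:
    """Map a predicted satisfaction probability in [0, 1] to a coarse label."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    # bisect_right counts how many edges are <= score, which is the band index.
    return BAND_LABELS[bisect_right(BAND_EDGES, score)]


low = satisfaction_band(0.1)    # "high risk"
edge = satisfaction_band(0.25)  # "at risk" (an edge belongs to the upper band)
top = satisfaction_band(1.0)    # "likely satisfied"
```

With bands, the model behind the score can be retrained or recalibrated and only the edges need re-tuning; rules written against labels keep working.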
Model-UX conundrum
Learnings #4 - Project Delivery
● Need roadmap alignment across teams
● Infra compliance increased scope
Answer BotAutomatically answer end-user problems
Answer Bot - Customer emails in a question
Answer Bot - Suggests most relevant articles
How we worked?
● Lean startup
  ○ “Only way to win is learn faster than anyone else.” [Eric Ries]
● +1 DS, +2 DE, Dedicated UX
MVP Hypothesis
Examples
● End users will interact with links in email
● Wrong answer does not negatively affect customer experience
● ...
MVP Measurements
Examples
● Surveys
  ○ We have a large number of customers
● Cohort metrics - “Activate and left on”
  ○ New
  ○ Oldie
  ○ No-hoper
Background - Deep Learning Model
● Deep learning neural network model
  ○ Measure similarity between articles and tickets
  ○ Google TensorFlow (TF)
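To make “measure similarity between articles and tickets” concrete, here is a small sketch that ranks articles against a ticket once both are represented as vectors. It assumes cosine similarity over embeddings, which the slides do not state; the actual TensorFlow model and its similarity function are not described in the deck. The vectors and article ids are made up for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_articles(
    ticket_vec: Sequence[float],
    article_vecs: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the top_k (article_id, similarity) pairs, most similar first."""
    scored = [
        (article_id, cosine_similarity(ticket_vec, vec))
        for article_id, vec in article_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical 2-d embeddings; a real model would produce far larger vectors.
articles = {
    "reset-password": [1.0, 0.0],
    "billing": [0.0, 1.0],
    "login-help": [1.0, 1.0],
}
suggestions = rank_articles([1.0, 0.0], articles, top_k=2)
```

Here `suggestions` lists "reset-password" first and "login-help" second, since those vectors point closest to the ticket vector.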
Background - Model Serving Stack (Change Data Capture)
Serving and updating machine learning model in real time on AWS
Delivery - Streams
Delivery - Key Items
● Single DL model for all customers
● Real time update of models
● Self-service
● Roadmap alignment across teams
  ○ UX in AU
  ○ 7 dev teams from EU, US, AU
Delivery - Scalability
● Real-time update of models
  ○ Capacity planning - articles/min
  ○ TensorFlow Model
    ■ CPUs - prediction
    ■ GPUs - updating model
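Capacity planning by articles per minute comes down to simple arithmetic, sketched below. The formula is a generic sizing rule, not the team's method, and every number here (600 articles/min, 0.5 s per article, 70% target utilisation) is a hypothetical placeholder rather than a figure from the talk.

```python
import math


def workers_needed(
    items_per_min: float,
    seconds_per_item: float,
    target_utilisation: float = 0.7,
) -> int:
    """Workers required so that each stays at or below the target utilisation."""
    if items_per_min < 0 or seconds_per_item <= 0:
        raise ValueError("rate must be >= 0 and per-item time must be > 0")
    if not 0.0 < target_utilisation <= 1.0:
        raise ValueError("target_utilisation must be in (0, 1]")
    busy_seconds_per_min = items_per_min * seconds_per_item
    usable_seconds_per_worker = 60.0 * target_utilisation
    return math.ceil(busy_seconds_per_min / usable_seconds_per_worker)


# Hypothetical load: 600 article updates/min at 0.5 s each is 300 busy
# seconds per minute; at 42 usable seconds per worker that needs 8 workers.
update_workers = workers_needed(items_per_min=600, seconds_per_item=0.5)
```

The same calculation can be run separately for the prediction path and the model-update path, since the slide assigns them to different hardware.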
Learnings #1 - MVPs
● Dogfooding Zendesk App
  ○ Fast to build: 2-4 weeks
● Do not build MVPs that negatively affect primary customer workflow
Learnings #2 - Moving faster with ML model development
● “Gold” dataset for validation
● A/B testing of models
● Separate model and UX tests
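A “gold” dataset lets a candidate model be scored offline, independently of any UX change, which is one reading of why it speeds up iteration. The sketch below assumes a top-k hit rate as the metric; the metric choice, the tiny dataset, and both stand-in models are hypothetical and are not taken from the talk.

```python
from typing import Callable, Dict, Sequence

# A model here is any function from ticket text to a ranked list of article ids.
Ranker = Callable[[str], Sequence[str]]


def top_k_hit_rate(model: Ranker, gold: Dict[str, str], k: int = 3) -> float:
    """Fraction of gold tickets whose labelled article is in the model's top k."""
    if not gold:
        raise ValueError("gold dataset is empty")
    hits = sum(
        1 for ticket, expected in gold.items() if expected in list(model(ticket))[:k]
    )
    return hits / len(gold)


# Hypothetical hand-labelled pairs: ticket text -> article that resolves it.
GOLD = {
    "I forgot my password": "reset-password",
    "Where is my invoice?": "billing-faq",
    "Cannot log in": "reset-password",
    "How do I cancel?": "cancel-plan",
}


def baseline(ticket: str) -> Sequence[str]:
    # Always suggests the most common article.
    return ["reset-password"]


def candidate(ticket: str) -> Sequence[str]:
    # Toy keyword rules standing in for a trained model.
    text = ticket.lower()
    if "invoice" in text:
        return ["billing-faq"]
    if "cancel" in text:
        return ["cancel-plan"]
    return ["reset-password"]


baseline_score = top_k_hit_rate(baseline, GOLD)    # 2 of 4 tickets
candidate_score = top_k_hit_rate(candidate, GOLD)  # 4 of 4 tickets
```

Because the gold set is fixed, two models can be compared on identical inputs before either one goes into an A/B test with real users.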
Challenges - Model-UX conundrum
Learnings #3 - Lean Startup
● Build metrics into the product
● Use “Goal” driven iterations vs points
Example: Metrics picked up problems with UX
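Building metrics into the product can be as light as counting key events and deriving a ratio from them. The class below is an in-memory stand-in for a real metrics client; it does not use the Datadog API, and the metric names `answer_bot.suggestion_sent` and `answer_bot.link_clicked` are hypothetical.

```python
from collections import Counter


class ProductMetrics:
    """In-memory stand-in for a metrics client, for illustration only."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def increment(self, name: str, value: int = 1) -> None:
        """Add value to the named counter."""
        self._counts[name] += value

    def ratio(self, numerator: str, denominator: str) -> float:
        """Return numerator / denominator counts, or 0.0 when nothing was counted."""
        total = self._counts[denominator]
        return self._counts[numerator] / total if total else 0.0


metrics = ProductMetrics()
for _ in range(4):
    metrics.increment("answer_bot.suggestion_sent")
metrics.increment("answer_bot.link_clicked")

click_through = metrics.ratio("answer_bot.link_clicked", "answer_bot.suggestion_sent")
```

A click-through ratio that stays low while offline model quality is good is the kind of signal that points at the UX rather than the model.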
Metrics - Datadog
Monthly Recurring Revenue Targets Q3 2017
Can we do better with teams?
● No Silos
● Blur line between DS and DE
● DS and DE solve problems together
● Build kickass team by building trust
Product based teams
● 2-pizza-size cross-functional squads
  ○ Embedded Scientists, Ops, UX
  ○ PM per squad
● End-to-end ownership
  ○ From Research to Delivery
● Single backlog
● Co-located
Team setup
● Teams in Production
  ○ 3:1 DE/DS ratio
  ○ Delivery lead - Engineering
● Teams in research
  ○ 2:1 DE/DS ratio
  ○ Delivery lead - Data Science
Next Platform - in research
• GCP
• DS
  • Datalab
  • Dataprep
• DE
  • Dataflow, BigQuery, Bigtable
  • GKE - K8s
  • TPUs (we have invested in TF)
Takeaways
Key Takeaways
● Be clear with your hypothesis
  ○ Are you validating the model?
  ○ Or are you testing UX?
● Build a culture of data driven decision making
Key Takeaways
● “Gold” dataset can help you go faster
● DS deliver code to production
● “Goal” driven iterations in research
Acknowledgements
Thanks to the awesome Data Product team
● Ai-Lien Trong-Cong
● Arvind Kunday
● Anh Dinh
● Arwen Griffioen
● Balaji Sekar
● Chris Hausler
● Chris Holman
● Christine Lin
● Damen Turnbull
● Derrick Cheng
● Ed Savage
● Eric Pak
● Jason Maynard
● Jeff Theobald
● Jeremy Kirkwood
● Mike Mortimer
● Paul Gradie
● Patsy Lai
● Pei Shi Yong
● Sean Caffery
● Shiyu Zhu
● Soon-ee Cheah
● Tan Le
● Wai Chee Yau
TM and © 2017 Zendesk Inc. All rights reserved.
Questions?