Rapid Data Analytics @ Netflix
Jason Flittner, Senior BI Engineer
Chris Stephens, Senior Data Engineer
Monisha Kanoth, Senior Data Architect
What We Do
DEA @ Netflix
Content Analytics
Global Expansion & Content Spend
Freedom & Responsibility
Highly Aligned, Loosely Coupled
Context, not Control
Culture + Technology
Courage, Judgement, Honesty, Communication, Curiosity, Passion, Innovation, Impact, Selflessness
[Diagram: Storage, Compute, Tools, and BI layers; labels include AWS S3, Parquet FF, and Hadoop clusters]
Deploy Fast, Fix Faster
● Improve & Iterate vs Perfect
● Have a Rollback Plan Ready
Develop Business Logic not ETL
● Think in Patterns
The Path of Least Resistance is the Right Path
● Make Smart Engineering Tradeoffs
The Clock starts Ticking when you Deploy
● Every Data Pipeline comes with an Expiration Date
● Deprecate and Prune
No Man’s Land is Expensive
● Ownership
Be a Noob
● User Groups
What You Could Do in your Data Warehouse
Let everyone drop tables in production
Cost / Benefit
Conscientious people make mistakes, but not very often
Data warehouse is not an operational system
What happens if a table is accidentally dropped?
● Do you have backups?
● How quickly can you restore a table?
Is the benefit worth the tax on every data / analytical product your team produces?
We have some protection
In Hive, all tables are external tables pointing to S3 locations.
ETL writes a new “batch” of data then updates the metastore.
s3://[bucket]/hive/schema.db/table/batchid=1459364911
ALTER TABLE table SET LOCATION [path to new batch ID];
DROP TABLE does not delete any data.
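The batch pattern above can be sketched roughly as follows. This is a minimal illustration, not the team's actual tooling: the function names, the `my-bucket` / `dw` / `plays` identifiers, and the use of epoch seconds as the batch ID are all assumptions. It only builds the S3 location and the metastore statement; it does not talk to Hive or S3.

```python
import time

def batch_location(bucket, schema, table, batch_id):
    """Build the S3 prefix for one immutable batch of a table."""
    return f"s3://{bucket}/hive/{schema}.db/{table}/batchid={batch_id}"

def set_location_sql(schema, table, location):
    """Render the metastore update that makes a batch the live one."""
    return f"ALTER TABLE {schema}.{table} SET LOCATION '{location}';"

def publish(bucket, schema, table, batch_id=None):
    """Return (location, sql) for publishing a batch.

    batch_id defaults to the current epoch seconds (an assumption).
    """
    if batch_id is None:
        batch_id = int(time.time())
    location = batch_location(bucket, schema, table, batch_id)
    return location, set_location_sql(schema, table, location)

# Hypothetical bucket, schema, and table names.
new_loc, new_sql = publish("my-bucket", "dw", "plays", batch_id=1459364911)
# Rolling back is the same statement aimed at the previous batch.
_, rollback_sql = publish("my-bucket", "dw", "plays", batch_id=1459278511)
print(new_sql)
print(rollback_sql)
```

Because each batch lands under its own prefix and old batches are left in place, a bad load can be undone by repointing the table, which fits the "Have a Rollback Plan Ready" advice.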
In our MPP databases, we have a procedure for upgrading and downgrading our privileges.
CALL admin.UpgradePrivileges('me')
Lasts for several hours. Usage is logged.
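One way to picture a time-limited, logged privilege upgrade is the in-memory model below. It is a hypothetical sketch only: the `PrivilegeLedger` class, its methods, and the 4-hour lifetime are assumptions, and it says nothing about how `admin.UpgradePrivileges` is actually implemented in the database.

```python
from datetime import datetime, timedelta

class PrivilegeLedger:
    """In-memory model of time-limited privilege upgrades (hypothetical)."""

    def __init__(self, ttl=timedelta(hours=4)):
        self.ttl = ttl       # "several hours"; 4 is an assumed value
        self.expires = {}    # user -> expiry time
        self.log = []        # (time, user, action) audit trail

    def upgrade(self, user, now):
        """Grant elevated privileges until now + ttl and log the request."""
        self.expires[user] = now + self.ttl
        self.log.append((now, user, "upgrade"))

    def downgrade(self, user, now):
        """Drop elevated privileges early and log the request."""
        self.expires.pop(user, None)
        self.log.append((now, user, "downgrade"))

    def is_elevated(self, user, now):
        """True while the user's upgrade has not yet expired."""
        expiry = self.expires.get(user)
        return expiry is not None and now < expiry

ledger = PrivilegeLedger()
start = datetime(2016, 3, 30, 9, 0)
ledger.upgrade("me", start)
print(ledger.is_elevated("me", start + timedelta(hours=1)))
print(ledger.is_elevated("me", start + timedelta(hours=5)))
```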
Accidents? Restore from backups. Or reload from Hive.
When other teams are ready to move to production ...
We’re done. And moving on to the next thing.
You can trust your people to work the same way.
Don’t have an “on call” (Use a “first responder” instead)
Everyone on the team takes a shift: both BI and data engineers (even managers every once in a while!)
First Responder = the first one to respond
● handles most common failures (restarting jobs)
● reaches out directly to ETL owner if escalation is required
● handles communication surrounding ETL delays
Goal is to protect the team’s time and focus
How we do this
● visually define what needs attention and what doesn’t
○ “above the line” vs “below the line”
● email alerts for “above the line” jobs that take longer than normal
● playbook for fixing common stuff
○ the more complete your entries are, the less you get called!
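A rough sketch of the "longer than normal" alert check: flag only "above the line" jobs whose current runtime exceeds the mean of past runs plus a few standard deviations. The threshold rule, the job dictionaries, and the job names are assumptions for illustration; the slides do not say how "normal" is defined, and sending the email is left out.

```python
from statistics import mean, stdev

def is_running_long(history_minutes, current_minutes, sigmas=3.0):
    """True when the current run exceeds mean + sigmas * stdev of past runs."""
    if len(history_minutes) < 2:
        return False  # not enough history to define "normal"
    threshold = mean(history_minutes) + sigmas * stdev(history_minutes)
    return current_minutes > threshold

def jobs_to_alert(jobs):
    """Return names of "above the line" jobs that are running long."""
    return [
        job["name"]
        for job in jobs
        if job["above_the_line"]
        and is_running_long(job["history"], job["current"])
    ]

# Hypothetical jobs; runtimes are in minutes.
jobs = [
    {"name": "daily_facts", "above_the_line": True,
     "history": [30, 32, 29, 31, 30], "current": 55},
    {"name": "adhoc_extract", "above_the_line": False,
     "history": [10, 11, 9, 10, 10], "current": 40},
    {"name": "dim_refresh", "above_the_line": True,
     "history": [12, 13, 12, 14, 13], "current": 13},
]
print(jobs_to_alert(jobs))
```

Here only `daily_facts` is flagged: `adhoc_extract` is slow but sits below the line, and `dim_refresh` is within its usual range.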
Have a very clear sense of what is urgent, and what isn’t
Treating every failure like it’s urgent bleeds your team of the time they need to do work
Build your processes so they can be ignored for 3 days
● don’t load data if it’s incomplete
● reprocess fact data for several days instead of picking up the latest
Gives you the freedom to judge whether a failure is worth an interruption
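The two rules above might look like this in a job's date logic. It is a sketch under assumptions: the 3-day window, the row-count completeness check, and the function names are illustrative, not taken from the talk.

```python
from datetime import date, timedelta

def reprocess_window(run_date, days=3):
    """Dates to rebuild: the `days` days ending the day before run_date."""
    return [run_date - timedelta(days=n) for n in range(days, 0, -1)]

def loadable_dates(window, row_counts, expected_min_rows):
    """Keep only dates whose source data looks complete."""
    return [d for d in window if row_counts.get(d, 0) >= expected_min_rows]

run_date = date(2016, 4, 4)
window = reprocess_window(run_date)
# Hypothetical source row counts; April 3 is still arriving.
counts = {
    date(2016, 4, 1): 1000,
    date(2016, 4, 2): 990,
    date(2016, 4, 3): 120,
}
print(loadable_dates(window, counts, expected_min_rows=900))
```

Since every run rebuilds a trailing window, a job that fails and is left alone for a few days catches up on its next successful run, and the incomplete day is simply picked up later.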
Everybody owns ETL (when they need to)
BI engineer needs data structured a certain way for a report
Many environments:
● Ask a data engineer to build them a table
Our environment:
● Let them schedule a Hive script and adjust as necessary
We focus on centers of excellence, not role boundaries
More Examples:
● our BI engineers use Python to automate tasks
● our data engineers have Tableau licenses, and use them for quick visualizations and report deployments
For small tasks, this helps us avoid the overhead of interruption and knowledge transfer
What You Could Do on the Front-end
[Diagram: Storage, Compute, Data Interface, and Data Access, Analytics and Visualization layers; labels include AWS S3, Parquet FF, and Hadoop clusters]
Do Not Limit Yourself to Conventional Tools
○ Tableau - Data Visualization and Dashboards
○ MicroStrategy - Dynamic SQL and Metadata
○ Python or Custom Reporting - Emails
Give your BI Engineers Superpowers (like this guy)
○ Provide a data platform
○ BI + Data Engineering
○ Context not Requirements
○ Be early adopters
Simple is Often Best
Dismantle your Data Warehouse Team
○ Integrate with the business
○ Data Engineering and Data Science teams
○ Open and honest communication
Fast is better than perfect
○ Build, iterate… repeat
○ How to handle adhocs
○ Freedom - make the right call
○ Responsibility - Ownership
Encourage Hacking
Questions?
Want to chill with us!? jobs.netflix.com