dataFundamentals
Hadoop Automation in 15 Minutes
Or how to get to the fun stuff before your boss pulls the plug.
ETL is not the Fun Stuff in Big Data
❖ Analytics
❖ Machine Learning
❖ Spark
❖ [even just Building APIs]
But you can’t do the fun stuff until your corporate data is in place to work against. It’s a chicken-and-egg problem.
Quick! Before your boss turns off the spigot!
❖ Automate your ETL processes.
❖ Automate your server instances.
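As one illustration of what automating a small ETL step might look like, here is a minimal sketch that derives an Avro-style record schema from the header row of a delimited file and turns each row into a record. It is not code from the repositories linked in this deck. The function names `schema_from_delimited` and `rows_as_records`, the record name, the sample data, and the choice to type every column as a nullable string are all assumptions made for illustration.

```python
import csv
import io
import json


def schema_from_delimited(text, record_name="Row", delimiter=","):
    """Build an Avro-style record schema (as a dict) from a header row.

    Hypothetical helper: every column is typed as a nullable string,
    which is a simplification, not real type inference.
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    fields = [
        {"name": col.strip(), "type": ["null", "string"], "default": None}
        for col in header
    ]
    return {"type": "record", "name": record_name, "fields": fields}


def rows_as_records(text, delimiter=","):
    """Yield each data row as a dict keyed by the header columns."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    for row in reader:
        # Skip the None key that DictReader uses for surplus columns.
        yield {k.strip(): v for k, v in row.items() if k is not None}


# Made-up sample data, standing in for a real corporate extract.
sample = "id,name,city\n1,Ada,Austin\n2,Grace,Dallas\n"
schema = schema_from_delimited(sample, record_name="Customer")
records = list(rows_as_records(sample))

print(json.dumps(schema, indent=2))
print(records)
```

Generating the schema from the data itself, instead of writing it by hand for each feed, is the kind of repetitive step that is cheap to automate early.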
What kind of code to automate?
❖ Clean code. Super clean.
❖ Well designed code.
Other pitfalls?
❖ NIH, Not Invented Here
How to get the fun tasks?
❖ 2-week P.O.C.
❖ Your sample data
Code, Content, Contacts
❖ This Slide Deck: http://www.slideshare.net/petecarapetyan/cloud-austin-hadoop-automationlightingtalk141118
❖ or just remember slideshare.net/datafundamentals
❖ YouTube - 11-minute slide-less version of code demo - https://www.youtube.com/playlist?list=PLO_T9AjxEaYeByfqBqHVCmg4GbLFkYCJe
❖ Dev Code
❖ Carrie (ruby UI and generator) https://github.com/datafundamentals/df_ui_carrie
❖ Avro from delimited https://bitbucket.org/datafundamentals/avro_from_delimited
❖ Camel-Avro https://bitbucket.org/datafundamentals/camel-avro-etl
❖ Ops Code - cookbook recipes
❖ https://github.com/datafundamentals
❖ Contact
❖ Email: [email protected], [email protected]
❖ Twitter: Jeff @devopsjeff, Pete @appwritercom
❖ Site: datafundamentals.com
Be careful! It’s a competitive world out there!