Real-Time BI in Hadoop
Bradford Stephens
Lead Engineer, Visible Technologies
Principal Consultant, Drawn to Scale Consulting
Scalability in BI
•Scalability matters now
•Social Media: Catalyst
•All data is important
•Data doesn’t scale with business size any more
Doing it Cheap
•100 TB, Structured and Unstructured
•Oracle - $100,000,000
•“NewSQL” - $4,000,000
•Hadoop + Katta - $250,000
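
To make the gap concrete, the figures above can be turned into a cost per terabyte. This is only a rough sketch: it assumes each quoted price covers the same 100 TB, and the variable names are illustrative.

```python
# Cost figures are the ones quoted on the slide; the per-terabyte
# breakdown assumes each price covers the same 100 TB (an assumption).
DATA_TB = 100

costs_usd = {
    "Oracle": 100_000_000,
    "NewSQL": 4_000_000,
    "Hadoop + Katta": 250_000,
}

# Divide each total by the data volume to get dollars per terabyte.
cost_per_tb = {name: total / DATA_TB for name, total in costs_usd.items()}

for name, per_tb in cost_per_tb.items():
    print(f"{name}: ${per_tb:,.0f} per TB")
```

Under that assumption, the Hadoop + Katta option comes out 400 times cheaper per terabyte than the Oracle figure.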
Why We Need Hadoop
•Need to process high-latency data to get the “small stuff” fast
•Robust Ecosystem
•Need more than SQL. An RDBMS is not a Swiss Army knife
Aggregation is Real-Time
•Distributed Search w/ Katta + Facets = Aggregation-Based BI
•Sum, Count, Filter, Avg, Group
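
The idea of treating facets as aggregations can be sketched in plain Python. This is not Katta's API; the sample documents, field names, and the `facet_aggregate` helper are all hypothetical stand-ins for hits returned by a distributed search index.

```python
from collections import defaultdict

# Hypothetical sample documents, standing in for hits from a search index.
DOCS = [
    {"source": "twitter", "brand": "acme", "sentiment": 0.8},
    {"source": "twitter", "brand": "acme", "sentiment": 0.2},
    {"source": "blog", "brand": "acme", "sentiment": 0.5},
    {"source": "blog", "brand": "globex", "sentiment": 0.9},
]


def facet_aggregate(docs, predicate, group_field, value_field):
    """Filter docs, group them by a facet field, and return count/sum/avg per group."""
    buckets = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for doc in docs:
        if not predicate(doc):  # Filter
            continue
        bucket = buckets[doc[group_field]]  # Group
        bucket["count"] += 1  # Count
        bucket["sum"] += doc[value_field]  # Sum
    return {
        key: {"count": b["count"], "sum": b["sum"], "avg": b["sum"] / b["count"]}  # Avg
        for key, b in buckets.items()
    }


# Average sentiment per source, restricted to one brand.
result = facet_aggregate(DOCS, lambda d: d["brand"] == "acme", "source", "sentiment")
```

In a real deployment each search node would presumably compute these buckets for its own shard and a front end would merge them; the merge step is omitted here.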
Protips: Review
•Understand High vs. Low Latency data
•Hadoop makes it cheap
•Pre-aggregate w/ Hadoop, Explore w/ Katta + Faceted Search
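
The last point can be illustrated with a small two-stage sketch: a batch step rolls raw events up into a compact table (the role Hadoop plays in the talk), and an interactive step answers questions from that table alone. The event data and the function names `map_event`, `reduce_counts`, `pre_aggregate`, and `explore` are hypothetical, and the map/reduce functions only imitate the shape of a Hadoop job in a single process.

```python
from collections import Counter

# Hypothetical raw events as (day, source) pairs; a real data set would be
# far too large to scan on every query.
RAW_EVENTS = [
    ("2010-06-01", "twitter"),
    ("2010-06-01", "twitter"),
    ("2010-06-01", "blog"),
    ("2010-06-02", "twitter"),
]


def map_event(event):
    """Map step: emit one ((day, source), 1) pair per raw event."""
    day, source = event
    yield (day, source), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values for each key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def pre_aggregate(events):
    """Batch stage: run the map and reduce steps over all raw events."""
    pairs = (pair for event in events for pair in map_event(event))
    return reduce_counts(pairs)


def explore(rollup, day=None, source=None):
    """Interactive stage: answer a query from the small rollup only."""
    return sum(
        count
        for (d, s), count in rollup.items()
        if (day is None or d == day) and (source is None or s == source)
    )


rollup = pre_aggregate(RAW_EVENTS)
twitter_total = explore(rollup, source="twitter")
```

The slow, high-latency work happens once in `pre_aggregate`; each `explore` call then touches only the rollup, which is what keeps the interactive side fast.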