Why Classic Data Warehousing Architectures miss the mark with the New Analytics
New Analytical Architectures
March 21, 2013
Casey Kiernan
Blog • www.the-data-platform.com
Why Classic Data Warehousing Approaches Miss the Mark with Big Data
Doug Cutting: “Hadoop is the kernel of a new Distributed Data OS”
“The Future is Data”
Data has Changed > Analytics has Changed
Transactional > Trailing Indicators
Communities > Reach/Influence
Personal > Interactive
Can the Data Warehouse Architecture adapt?
The World as I See it
“Data” is the Platform
New DataClutch Analytics
Wink Eller
My Mountain Bike
Guidance
Performance: Rate of Climb, Calories Burned, Miles Obtained, Total Climbed, Elapsed Time
Current, Average, Max Values
Data Collection: Speed / Trip Miles
Data Collection: Cadence / RPM
Data Collection: Heart Rate
Data Collection: Altitude, Temperature, Time
Data Architecture - on a Local Wireless Network (ANT+ Protocol)
as a Data Platform
“Personal” Ride Analytics
…is this a Data Warehouse?
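The "Current, Average, Max Values" display on the bike computer can be sketched as a small running aggregate per sensor. This is a minimal illustration, not the device's actual firmware: the RunningStat class, the summarize helper, the sensor names, and the sample readings are all hypothetical, and no ANT+ library is involved.

```python
from dataclasses import dataclass


@dataclass
class RunningStat:
    """Tracks the current, average, and max value of one sensor stream."""
    current: float = 0.0
    total: float = 0.0
    count: int = 0
    maximum: float = float("-inf")

    def update(self, value: float) -> None:
        self.current = value
        self.total += value
        self.count += 1
        self.maximum = max(self.maximum, value)

    @property
    def average(self) -> float:
        return self.total / self.count if self.count else 0.0


def summarize(readings):
    """Fold (sensor_name, value) pairs into one RunningStat per sensor."""
    stats = {}
    for sensor, value in readings:
        stats.setdefault(sensor, RunningStat()).update(value)
    return stats


# Hypothetical readings, invented for illustration only.
readings = [
    ("heart_rate", 120), ("cadence", 80),
    ("heart_rate", 150), ("cadence", 90),
    ("heart_rate", 135),
]
stats = summarize(readings)
hr = stats["heart_rate"]
print(hr.current, hr.average, hr.maximum)
```

Each reading is folded in as it arrives, so nothing needs to be staged or batch-loaded before the rider sees a value.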
Behaviors
Content
Progression of Behaviors
New Data Behaviors (individual actions) > Content > Time
Time Variance
Guidance
Data
Meaningful
Massive
New Data More is Better…
BUSINESS INTELLIGENCE
OLAP / DATA WAREHOUSE
OLTP / TRANSACTIONS
DATA.
“Business” Analytics - Classic “DW”
Answers the question: What are our most profitable Products?
What will Happen? / What did Happen?
Strategic / Tactical Trending / Operational Reporting
Months Weeks Weeks Months Years
Classic “Business” Analytics: Good for Reporting, Forecasting
Descriptive/Trending Analytics
New “Personal” Analytics
Answers the question: Show me a good movie to watch!
DATA.
SELF-SERVICE
GUIDANCE
BEHAVIOURS
Strategic / Tactical Trending / Operational Reporting
What will Happen? / What did Happen?
Months Weeks Weeks Months Years
What is Happening RIGHT NOW!
“Personal” Analytics
“Right Now” is a very important time-frame!
Predictive/Prescriptive Analytics
Ordering App
Data Warehouse: OLTP to OLAP Mapping
OLAP / Reports
Facts/Dimensions
Financial App
Master Data
Business Analyst
What are our most Profitable Products?
Staging
“Business” Analytical Architecture
Classic “DW” Data Flow - Uni-Directional, Latent,…
Business Metrics, KPI, YTD Reporting
Facts & Dimensions
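To make the facts-and-dimensions idea concrete, here is a minimal sketch of how a classic DW answers "What are our most profitable products?". It uses Python's built-in sqlite3 module as a stand-in for a warehouse; the table names (dim_product, fact_sales), the columns, and the sample rows are assumptions made up for this illustration.

```python
import sqlite3

# In-memory database standing in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (product_id INTEGER, revenue REAL, cost REAL);
INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO fact_sales VALUES
    (1, 100.0, 60.0),
    (1, 50.0, 30.0),
    (2, 200.0, 190.0);
""")

# Join the fact table to its dimension and aggregate profit per product.
most_profitable = conn.execute("""
SELECT p.name, SUM(f.revenue - f.cost) AS profit
FROM fact_sales AS f
JOIN dim_product AS p USING (product_id)
GROUP BY p.name
ORDER BY profit DESC
""").fetchall()

for name, profit in most_profitable:
    print(name, profit)
```

The query looks backward over already-loaded facts, which is why this style suits reporting and trending rather than "right now" guidance.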
Application / UX
Analytics
Data
“Personal” Analytical Architecture
Data Analysts
Analytical Capabilities: Scoring/Ranking, Recommendations, Natural Language Processing, Relevancy, Classification, Optimization, Collaborative Filtering, Personalization, Digital Attribution,…
“New” Data Flow - Iterative, Specialized, Extensible, plug & play Analytics, near real-time [Some components are open-source]
What movie should I watch tonight?
Published Analytics: “Read” Performance
App Persistence: “State” Persistence
Persistence/Analytics
Mass Data Storage: Behaviors / “Write” Performance
Personalized Recommendations
Personalization, Preferences, State
End-User Experience: Browser, Tablet, Mobile,…
Self-Service Application
“Personal Analytics” Data Architecture
Analytics Engines: Pluggable
Data Scientists
“New” Data Flow – Detailed View of Components
Social Signals: RSS/Facebook/…
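The component view above (behaviors written to mass storage, pluggable analytics engines, published analytics read back by the application) can be sketched in a few lines. This is only an illustration under stated assumptions: the in-memory list and dict stand in for the write-optimized and read-optimized stores, and the names register, record, run_engines, and both example engines are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List, Tuple

Behavior = Tuple[str, str, str]  # (user, action, item)
Engine = Callable[[List[Behavior]], Dict[str, object]]

behavior_log: List[Behavior] = []             # stand-in for mass data storage ("write" side)
published: Dict[str, Dict[str, object]] = {}  # stand-in for published analytics ("read" side)
engines: Dict[str, Engine] = {}               # pluggable analytics engines


def register(name: str):
    """Decorator that plugs an engine in under a name."""
    def wrap(fn: Engine) -> Engine:
        engines[name] = fn
        return fn
    return wrap


def record(user: str, action: str, item: str) -> None:
    """Append one behavior (an individual action) to the log."""
    behavior_log.append((user, action, item))


@register("popular_items")
def popular_items(log):
    counts = Counter(item for _, action, item in log if action == "watched")
    return {"top": [item for item, _ in counts.most_common(2)]}


@register("per_user_history")
def per_user_history(log):
    history = defaultdict(list)
    for user, action, item in log:
        if action == "watched":
            history[user].append(item)
    return dict(history)


def run_engines() -> None:
    """Run every registered engine over the shared log and publish its output."""
    for name, engine in engines.items():
        published[name] = engine(list(behavior_log))


record("sally", "watched", "movie_a")
record("mary", "watched", "movie_a")
record("mary", "watched", "movie_b")
record("sally", "browsed", "movie_c")
run_engines()
print(published["popular_items"])
```

Adding a new capability means registering another function; the behavior log and the application that reads the published results do not change.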
SALLY LIKES TACOS
HOW DO WE MODEL THIS DATA?
Let’s get personal…
Classic “DW” Data Model
SUBJECT PREDICATE (Score) OBJECT
SALLY LIKES (143) TACOS
MARY LIKES (200) TACOS
THE_TACO_SHOP MENU_ITEM TACOS
SALLY LIKES (125) THE_TACO_SHOP
SALLY CITY VENICE BEACH
THE_TACO_SHOP CITY VENICE BEACH
SALLY FRIEND (187) MARY
“Triples” - Directed (Weighted) Acyclic Graph
Modeling Social Data
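The triples listed on this slide map directly onto a small data structure. The sketch below stores each row as a tuple whose first field is the thing the statement is about (SALLY in "SALLY LIKES TACOS") and answers simple pattern queries. The Triple class and the match helper are hypothetical names for this illustration; the data rows are the ones from the slide.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str
    score: Optional[int] = None  # edge weight; None where the slide gives no score


TRIPLES = [
    Triple("SALLY", "LIKES", "TACOS", 143),
    Triple("MARY", "LIKES", "TACOS", 200),
    Triple("THE_TACO_SHOP", "MENU_ITEM", "TACOS"),
    Triple("SALLY", "LIKES", "THE_TACO_SHOP", 125),
    Triple("SALLY", "CITY", "VENICE BEACH"),
    Triple("THE_TACO_SHOP", "CITY", "VENICE BEACH"),
    Triple("SALLY", "FRIEND", "MARY", 187),
]


def match(triples: List[Triple], subject=None, predicate=None, obj=None) -> List[Triple]:
    """Return triples matching every field that is not None (None is a wildcard)."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# Who likes tacos, strongest preference first?
who_likes_tacos = sorted(
    match(TRIPLES, predicate="LIKES", obj="TACOS"),
    key=lambda t: t.score,
    reverse=True,
)
print([t.subject for t in who_likes_tacos])

# Everything known about Sally.
sally_facts = match(TRIPLES, subject="SALLY")
print(len(sally_facts))
```

New kinds of relationships are just new predicate values, so the model extends without a schema change.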
Reach and Influence
Collaborative Filtering
Analyzing Relationships: Reach and Influence
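As a rough illustration of collaborative filtering over "likes" data, the sketch below recommends items liked by people who share at least one like with the target user. It is a deliberately crude user-based variant, not a production algorithm: similarity is just the count of shared items, and the recommend function plus the BURRITOS, JOE, and SUSHI rows are invented for this example.

```python
from collections import defaultdict

# (user, item, score) likes; the BURRITOS, JOE, and SUSHI rows are hypothetical.
LIKES = [
    ("SALLY", "TACOS", 143),
    ("SALLY", "THE_TACO_SHOP", 125),
    ("MARY", "TACOS", 200),
    ("MARY", "BURRITOS", 90),
    ("JOE", "SUSHI", 80),
]


def recommend(user, likes):
    """Rank items the user has not liked yet, using overlap with other users."""
    by_user = defaultdict(dict)
    for u, item, score in likes:
        by_user[u][item] = score

    mine = by_user[user]
    scores = defaultdict(float)
    for other, theirs in list(by_user.items()):
        if other == user:
            continue
        shared = set(mine) & set(theirs)
        if not shared:
            continue  # no common taste, so this user contributes nothing
        weight = len(shared)  # crude similarity: number of shared liked items
        for item, score in theirs.items():
            if item not in mine:
                scores[item] += weight * score
    return sorted(scores, key=scores.get, reverse=True)


print(recommend("SALLY", LIKES))
```

Sally and Mary both like tacos, so Mary's other likes become candidates for Sally, while Joe shares nothing with her and is ignored.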
How important is Social?
Install ghostery.com: it shows you who is actively watching you surf the web! Lots of people!!!
Signals – The Core of New Data
Social
Personal
Content
Time
Mixture of Proprietary and Public Data
Published Analytics: HBase
App Persistence: Cassandra, Riak,…
Persistence/Analytics
Data-Center or Cloud
Mass Data Storage: Hadoop
Personalized Recommendations
Personalization, Preferences, State
End-User Experience: Browser, Tablet, Mobile,…
Self-Service Application
Specialization of Data Technologies
Analytics: R, Mahout, Pig
The New “Analytical Application” Architecture
“New” Data Flow – Specialized Technology Choices
Published Analytics: HBase
Persistence: Riak
Mass Data Storage: Behaviors / “Write” Performance
Hadoop / AWS
Self-Service Application A
Analytics Engine: Pluggable
Data Scientists
Analytics Engine: Pluggable
Analytics Engine: Pluggable
Published Analytics: MySQL
Persistence: Cassandra
Self-Service Application B
Servicing Multiple Analytical Systems
Using Shared Analytical Mass Storage
Integrating the Architectures
Data Warehouse: OLTP to OLAP Mapping
OLAP / Reports
Business Analyst
Staging
App
Only Financial Events ($$$) cross the threshold (and are recorded into) the Data Warehouse
App
App
“Local” Events stay Local (they are analyzed locally)
“Personal” Analytics Stack + Classic “DW” Stack
Not all DATA Belongs in the Data Warehouse!
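The routing rule on this slide can be sketched as a small event router. The sketch assumes that every event is kept in its application's local store and that a copy of each financial event is also forwarded to warehouse staging; the slide does not say whether financial events are retained locally as well. The Event class, the FINANCIAL_KINDS set, the route function, and the sample events are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Event:
    app: str
    kind: str
    amount: float = 0.0  # only meaningful for financial events


# Assumed classification of which event kinds count as financial.
FINANCIAL_KINDS = {"order", "refund"}

warehouse_staging: List[Event] = []      # feeds the classic DW stack
local_stores: Dict[str, List[Event]] = {}  # one local store per application


def route(event: Event) -> None:
    """Keep every event local; forward only financial events to the warehouse."""
    local_stores.setdefault(event.app, []).append(event)
    if event.kind in FINANCIAL_KINDS:
        warehouse_staging.append(event)


route(Event("ordering", "order", 25.0))
route(Event("ordering", "page_view"))
route(Event("movies", "rating"))

print(len(warehouse_staging), {app: len(evts) for app, evts in local_stores.items()})
```

The warehouse sees one event out of three here, while each application still has its full behavioral history available for local analytics.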
Aspect | Classic DW | New Analytics
Scope | Enterprise | Application
Analytics | Trailing: OLAP | Predictive: Machine Learning; Sentiment Analysis, Recommendations, Personalization, Natural Language Processing, Classification, Clustering, Optimization, Collaborative Filtering, Digital Attribution,…
Actionable? | Loosely Coupled | Tightly Coupled, Analytics Embedded in Application
Data Structures | Facts/Dimensions (Requires a DW) | Semantic Data, Graph / Triples, Observations, Direct Signals
Knowledge Expert | Business Analyst | Data Scientist
Technology Stack | Vendor Driven ($$$) | Open-Source
Architecture | Scale-Up | Scale-Out (or in the Cloud)
Classic DW Vs. the New Analytics
The Shift from “Business” Analytics to “Personal” Analytics
New Signals + New Analytics = New Scenarios
Data Signals: Social, Location, Personal, Behaviors, Transactions, Content, Time
New Analytics: Recommendations, Natural Language Processing, Relevancy, Classification, Optimization, Collaborative Filtering, Digital Attribution,…
New Scenarios: Customer Engagement, Customer Loyalty / Attrition / Retention, Fraud, Risk Analysis, Intent, Customer Personalization