Utilizing Big Data Analytics
with Hadoop
Fern Halper @fhalper
TDWI Research Director for Advanced Analytics
April 17, 2014
Sponsor
Speakers
Fern Halper, Research Director for Advanced Analytics, TDWI
Tapan Patel, Product Marketing Manager, SAS
Agenda
The evolving big data ecosystem
Status of big data, analytics, and Hadoop
Considerations for getting started
New TDWI Checklist
Free to download
http://tdwi.org/research/list/tdwi-checklist-reports.aspx
An evolving ecosystem
Hadoop
Big data
Advanced Analytics
in-memory
Examining the pieces: Big Data
Social
M2M/IoT
Text
Mobile/Location
Volume
Formats
70% of those respondents
using or currently using predictive
analytics are utilizing big data
(source: TDWI Predictive Analytics Best Practices Report, 2014)
Examining the pieces: Analytics
The Analytics Spectrum
Excel
Dashboards and Reports
Other BI
Visualization
Advanced Analytics
Advanced Analytics
Advanced analytics provides algorithms for
complex analysis of either structured or unstructured
data. It includes sophisticated statistical models,
machine learning, text analytics, advanced
visualization, and other advanced
data mining techniques.
Examining the pieces: Hadoop
HDFS/MapReduce
Schema on read
Ecosystem of tools
Commercial distributions
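As a rough illustration of "schema on read" (not taken from the webinar), the sketch below stores raw delimited text untouched and applies column names and types only when the data is read. The sample records, the SCHEMA list, and the read_with_schema helper are all hypothetical names chosen for this example.

```python
import csv
import io

# Hypothetical raw records, stored as-is: nothing is validated at write time.
RAW = "2014-04-17,sensor-a,21.5\n2014-04-17,sensor-b,not-a-number\n"

# The schema is declared by the reader, not by the storage layer.
SCHEMA = [("day", str), ("device", str), ("reading", float)]


def read_with_schema(raw_text, schema):
    """Parse raw delimited text, applying names and types only at read time."""
    rows = []
    for fields in csv.reader(io.StringIO(raw_text)):
        row = {}
        for (name, cast), value in zip(schema, fields):
            try:
                row[name] = cast(value)
            except ValueError:
                # The stored value does not fit the schema chosen for this read.
                row[name] = None
        rows.append(row)
    return rows


rows = read_with_schema(RAW, SCHEMA)
```

A different reader could apply a different schema to the same raw text, which is the flexibility the slide contrasts with fixing the schema before loading.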
In-memory analytics
Performance
Interactivity
Status: Evolving architectures
Source: (TDWI Evolving Data Warehouse Architectures In the Age of Big Data, 2014) n=1688 responses
What technical issues or practices are driving change in your DW architecture?
Select all that apply.
Status: Big data pieces
Status: Analytics pieces
Considerations
Defining the problem
Data preparation
Analyzing the data
Making it work (i.e., the team)
Governance
Data preparation
ETL vs. ELT
Data quality
Metadata
Data exploration
Query
Visualization
Descriptive statistics
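To make the "descriptive statistics" step of data exploration concrete, here is a minimal sketch using only the Python standard library. The sample values are made up for illustration; in practice they would come from a query against the data being explored.

```python
import statistics

# Hypothetical sample values standing in for a query result.
values = [12.0, 15.0, 15.0, 18.0, 40.0]

# A small summary of the kind typically reviewed before any modeling.
summary = {
    "count": len(values),
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "stdev": statistics.stdev(values),  # sample standard deviation
    "min": min(values),
    "max": max(values),
}
```

Comparing the mean with the median is a quick way to spot skew: here the single large value pulls the mean above the median.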
Analysis
Data mining
Supervised
Unsupervised
Other analytics
Operationalize
Business process
In-database scoring
Skills
Computing
Analytic modeling
Creative thinker
Communicator
Big Data:
The Big Data Maturity Model
Poll Question
Are you making use of Hadoop for advanced analytics?
Yes
No, but we're thinking about it
No, and no plans to do so
Don't know
Copyright 2014, SAS Institute Inc. All rights reserved.
UTILIZING BIG DATA ANALYTICS
WITH HADOOP
TAPAN PATEL, PRODUCT MARKETING MANAGER, SAS
DATA TO DECISION LIFECYCLE
COMPETITIVE ADVANTAGE
PREPARE DATA
EXPLORE DATA
DEVELOP MODELS
DEPLOY & MONITOR
ACCESS TO HADOOP
HADOOP
Hive QL
SAS SERVER
1. Push some of SAS processing to Hadoop
Key Offerings: SAS/Access to Hadoop
SAS/Access to Cloudera Impala
EMBEDDED PROCESS FRAMEWORK
HADOOP
SAS Data Step & DS2
SAS SERVER
2. Push SAS processing to Hadoop with MapReduce
Key Offerings: SAS Scoring Accelerator for Hadoop
SAS Data Quality Accelerator for Hadoop
SAS Code Accelerator for Hadoop
SAS Data Management
SAS IN-MEMORY ANALYTICS AND HADOOP
3. In-memory processing; use Hadoop for storage persistence and commodity computing
SAS LASR ANALYTIC SERVER
SAS IN-MEMORY
HADOOP
WEB CLIENTS
APPLICATIONS
ERP
SCM
CRM
Images
Audio and Video
Machine Logs
Text
Web and Social
Data Discovery and Visualization
Statistics and Predictive Analytics
Data Management
Text Analytics
SAS VISUAL STATISTICS
INTERACTIVE PREDICTIVE ANALYTICS
EXPLORE AND DISCOVER
PREDICT AND REFINE
DEPLOY AND MONITOR
SAS IN-MEMORY STATISTICS FOR HADOOP
WHAT IS IT
Provides a single interactive programming environment
for Hadoop to perform:
analytical data manipulation
variable transformations
exploratory analysis
statistical modeling and machine learning
integrated modeling comparison and scoring
Takes advantage of distributed in-memory computing
optimized for analytical workloads
MANIPULATE DATA
EXPLORE DATA
DEVELOP MODELS
SCORE
SAS IN-MEMORY STATISTICS FOR HADOOP
PRODUCT DEMONSTRATION
Questions?
Download a free
copy of the report
Download the report as a PDF file at:
http://tdwi.org/research/2014/03/checklist-utilizing-big-data-analytics-with-hadoop
Feel free to distribute the PDF file
of any TDWI Checklist Report
Contact Information
If you have further questions or comments:
Fern Halper, TDWI [email protected]
Tapan Patel, SAS [email protected]