Grab some coffee and enjoy the pre-show banter before the top of the hour!

What Is Hadoop and Where Is It Going?

DESCRIPTION

The Briefing Room with Dr. Robin Bloor and TechWise Live Webcast on April 23, 2014. Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=fa24dc305208c34cd98fb63f62797323

Few innovations over the past 40 years compare to the hype or significance of Hadoop. But despite the vast array of software vendors touting their Hadoop strategy, very few businesses have fully wrapped their heads around what this development means, or where it's going. Register for this inaugural episode of TechWise to hear veteran IT analyst Dr. Robin Bloor explain how Hadoop is vastly different from Linux or SOA, and why its future remains largely unwritten. He will be joined by Constellation Research founder Ray Wang, who will provide his perspective on the Hadoop landscape. They will then take questions from the audience about any facet of this transformative trend. Visit InsideAnalysis.com for more information.

Page 1: What Is Hadoop and Where Is It Going?

Page 2: What Is Hadoop and Where Is It Going?

The Briefing Room

Hand in Hand—Optimizing the Data Warehouse for Big Data

Page 3: What Is Hadoop and Where Is It Going?

Twitter Tag: #briefr

The Briefing Room

Welcome

Host: Eric Kavanagh

[email protected] @eric_kavanagh

Page 4: What Is Hadoop and Where Is It Going?

Mission

•  Reveal the essential characteristics of enterprise software, good and bad

•  Provide a forum for detailed analysis of today’s innovative technologies

•  Give vendors a chance to explain their product to savvy analysts

•  Allow audience members to pose serious questions... and get answers!

Page 5: What Is Hadoop and Where Is It Going?

Topics

This Month: BIG DATA

May: DATABASE

June: ANALYTICS & MACHINE LEARNING

2014 Editorial Calendar at www.insideanalysis.com/webcasts/the-briefing-room

Page 6: What Is Hadoop and Where Is It Going?
Page 7: What Is Hadoop and Where Is It Going?

Analyst: Claudia Imhoff

Claudia Imhoff is President & Founder of Intelligent Solutions, Inc.

Page 8: What Is Hadoop and Where Is It Going?

Pentaho

•  Pentaho offers a suite of open source business intelligence products called Pentaho Business Analytics

•  Pentaho’s big data solution provides access to any data source, and includes data integration, discovery, analysis and visualization

•  Pentaho’s solutions are available in community or enterprise editions

Page 9: What Is Hadoop and Where Is It Going?

Guest: Chuck Yarbrough

Chuck is the Director of Big Data Product Marketing at Pentaho, a leading big data analytics company that helps organizations engineer big data connections, blend data, and report on and visualize all of their data. Much of Chuck's focus at Pentaho is on educating organizations about how the proper use of big data can help win, serve and retain customers, lower costs and grow revenue. A life-long participant in the data game, Chuck has held leadership roles at Deloitte Consulting, SAP Business Objects, Hyperion and National Semiconductor.

Page 10: What Is Hadoop and Where Is It Going?

© 2013, Pentaho. All Rights Reserved. pentaho.com. Worldwide +1 (866) 660-7555 10

Data Warehouse Optimization Blueprint

Chuck Yarbrough

Director, Big Data Product Marketing

@cyarbrough

April 29, 2014

Page 11: What Is Hadoop and Where Is It Going?

OUR VISION

The New Reality: Powerful yet simplified analytics for all users

ANY Analytics
•  Reports
•  Dashboards
•  Visualizations
•  Discovery
•  Predictive
•  Any role

ANY Environment
•  Data warehouses
•  Data marts
•  Stack vendors
•  Cloud
•  Embedded

ANY Data
•  Relational
•  Operational
•  Big Data
•  Data sources not yet anticipated

[Diagram labels: Analytics; Existing & New Data Infrastructure & Processes; Billing; Location; Social Media; Customer; Web; Network]

Page 12: What Is Hadoop and Where Is It Going?

Emerging big data use cases demand blending multiple data sources

Improve operational effectiveness
•  Machines/sensors: predict failures, network attacks
•  Financial risk management: reduce fraud, increase security

Reduce data warehouse cost
•  Integrate new data sources without increased database cost
•  Provide online access to ‘dark data’

Drive incremental revenue
•  Predict customer behavior across all channels
•  Understand and monetize customer behavior
•  Begin to monetize data as a service

Page 13: What Is Hadoop and Where Is It Going?

A Spectrum of Big Data Use Cases

What the Market is Deploying Today and Planning for Tomorrow

[Chart. Axes: Use Case Complexity (Entry to Advanced) and Business Impact (Optimize to Transform). Use cases plotted: Data Warehouse Optimization; Streamlined Data Refinery; Big Data Exploration; Customer 360 Degree View; Harnessing Machine & Sensor Data; Next Generation Applications; Internal Big Data as a Service; On-Demand Big Data Blending; Big Data Predictive Analytics; Monetize My Data. Callout: Data Warehouse Optimization]

Page 14: What Is Hadoop and Where Is It Going?

Page 15: What Is Hadoop and Where Is It Going?

Data Warehouse Optimization Remove the clutter and connect to Big Data

Cut Downtime and Focus on Product Creation

Remove Costly Legacy Systems

Simplicity Empowers Business Users

“Using Pentaho in our data warehouse, it now takes about 20 minutes to break down a metric and do specific analysis to identify performance issues. In the past, similar queries would take all night.” Greg Allen, Business Analyst, Kiva

“Pentaho Data Integration not only simplifies the data delivery process but also enables us to gather the high-quality data. Ultimately Pentaho has enabled us to reach our goal of making the Swiss real estate market more transparent.” Prof. Dr. Peter Ilg, Managing Director, Swiss Real Estate Datapool

“We needed fully functional reporting and data integration tools but wanted to cut the cost burden experienced with Oracle. After looking at what was out there, Pentaho had the complete tool set, and after further testing, our users noticed no difference in the features they need.” Uwe Geercken, IT Manager, Swissport

Page 16: What Is Hadoop and Where Is It Going?

Data Warehouse Optimization Shrink Data Costs & Boost Analytics Performance for Business Users

Key Considerations

•  Normally leverages Hadoop

•  Relevant across industries

•  May require new coding skillsets that are hard to find

Why Do It?

•  Save data capacity & management costs

•  Empower IT and business users to meet goals on time

What is it?

•  Existing DW infrastructure can’t support data explosion, & adding DW capacity is costly

•  So offload low priority data to Big Data store to extend capacity
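
As a rough illustration of the offloading idea above, the sketch below splits warehouse rows into a set to keep and a set to offload, based on how recently each row was accessed. The row layout, the `split_by_age` helper and the 365-day cutoff are assumptions made for illustration only; the slides do not define what counts as low priority data, and this is not a description of how PDI does it.

```python
from datetime import date, timedelta

def split_by_age(rows, today, max_age_days=365):
    """Partition rows into (keep, offload) by the age of 'last_accessed'.

    Hypothetical rule: rows not touched within max_age_days are treated as
    low priority and become candidates for the big data store.
    """
    cutoff = today - timedelta(days=max_age_days)
    keep, offload = [], []
    for row in rows:
        (keep if row["last_accessed"] >= cutoff else offload).append(row)
    return keep, offload

# Made-up sample rows, not data from the presentation
rows = [
    {"id": 1, "last_accessed": date(2014, 4, 1)},
    {"id": 2, "last_accessed": date(2011, 6, 30)},
    {"id": 3, "last_accessed": date(2013, 12, 15)},
]
keep, offload = split_by_age(rows, today=date(2014, 4, 29))
print([r["id"] for r in keep])     # rows that stay in the warehouse
print([r["id"] for r in offload])  # rows that would move to the big data store
```

In practice the priority rule would more likely come from query logs or business policy than from a single date column.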

Page 17: What Is Hadoop and Where Is It Going?

Data Warehouse Optimization Shrink Data Costs & Boost Analytics Performance for Business Users

[Diagram components: CRM & ERP Systems; PDI; Data Warehouse]

Page 18: What Is Hadoop and Where Is It Going?

Data Warehouse Optimization Shrink Data Costs & Boost Analytics Performance for Business Users

[Diagram components: CRM & ERP Systems; Data Warehouse; Hadoop Cluster; PDI (×2)]

Page 19: What Is Hadoop and Where Is It Going?

Page 20: What Is Hadoop and Where Is It Going?

Data Warehouse Optimization Shrink Data Costs & Boost Analytics Performance for Business Users

[Diagram components: CRM & ERP Systems; Other Data Sources; Data Warehouse; Hadoop Cluster; PDI (×3)]

Page 21: What Is Hadoop and Where Is It Going?

Data Warehouse Optimization Shrink Data Costs & Boost Analytics Performance for Business Users

[Diagram components: CRM & ERP Systems; Other Data Sources; Data Warehouse; Hadoop Cluster; Analytic Data Mart; PDI (×4)]

Page 22: What Is Hadoop and Where Is It Going?

Data Warehouse Optimization Shrink Data Costs & Boost Analytics Performance for Business Users

[Diagram components: CRM & ERP Systems; Other Data Sources; Data Warehouse; Hadoop Cluster; Analytic Data Mart; Relational Layer; PDI (×4)]

Page 23: What Is Hadoop and Where Is It Going?

Data Warehouse Optimization Cost effective, fast processing

A Major Financial Institution

Business Challenge
•  Gain competitive advantage through intraday balance reporting for commercial customers
•  Use Hadoop and relational data stores to process huge volumes

Pentaho Benefits
•  Graphical orchestration for Hadoop, HBase & DB2 data integration workloads
•  15x faster to develop, 10x faster to execute

Callouts: 15x faster to develop; 10x faster to execute; No coding; Integrate with existing; Easy to find resources

Page 24: What Is Hadoop and Where Is It Going?

A Major Financial Institution

Optimize data infrastructure to connect hundreds of interdependent banking applications

[Diagram labels: Hundreds of Enterprise Data Sources (Customer & Account Master Data; Payments Data; Cash Processing Data; Other Financial Apps); PDI (×2); Scalable Enterprise Data Hub; Hadoop Cluster; Historical Data Mart; Data Marts; Internal User Reporting & Data Mining; Clients Statements, Balance, Transaction Reporting & Analytics]

Page 25: What Is Hadoop and Where Is It Going?

Thank You

JOIN THE CONVERSATION. YOU CAN FIND US ON:

blog.pentaho.com

@Pentaho

Facebook.com/Pentaho

Pentaho Business Analytics

Page 26: What Is Hadoop and Where Is It Going?

Streamlined Data Refinery Drive a Sustainable Analytics Strategy with Big Data ETL at Scale

Vertical Fit

High Tech, Telecom, Media, Financial Services, etc

Technology Fit

Primarily Hadoop, but also NoSQL

Benefits

•  Establish usable analytics on diverse sources at high volume (terabytes+)

•  Speed queries substantially with rapid ingestion & powerful processing

•  Reduce costs of ETL processing

Challenges

•  Expansive integration project

•  May require new coding skillsets that are hard to find

•  May call for swapping from a data warehouse to a higher performing Analytic database, depending on requirements

Why Do It?

•  Give business users insight into all data

•  Scale ETL and data management cost savings

•  Next step after DW optimization

What is It?

In the face of exploding volumes of transaction, customer, and other data, traditional ETL systems slow down, making analytics unworkable. One solution is to route most data through a scalable Big Data processing hub that pushes refined data to a data warehouse or analytical database, enabling low-latency self-service analytics across a diverse base of data.
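
The refinery flow described above (raw data in, refined data out to an analytic store) can be sketched in a few lines. This is a toy, in-memory, MapReduce-style aggregation, not a Hadoop job or a PDI transformation; the event fields, the `map_event` and `refine` names, and the per-customer total are all hypothetical stand-ins for whatever a real refinery would compute.

```python
from collections import defaultdict

def map_event(event):
    """Map step: emit (key, value) pairs from one raw event (hypothetical layout)."""
    yield event["customer"], event["amount"]

def refine(events):
    """Reduce step: total amount per customer, ready to load downstream."""
    totals = defaultdict(float)
    for event in events:
        for key, value in map_event(event):
            totals[key] += value
    return dict(totals)

# Made-up raw events, not data from the presentation
raw_events = [
    {"customer": "a", "amount": 10.0},
    {"customer": "b", "amount": 5.0},
    {"customer": "a", "amount": 2.5},
]
refined = refine(raw_events)
print(refined)  # one summary row per customer
```

The point of the split is that the map step can run in parallel over raw data in the hub, while only the much smaller refined result is pushed to the warehouse or analytic database.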

Page 27: What Is Hadoop and Where Is It Going?

Streamlined Data Refinery Drive a Sustainable Analytics Strategy with Big Data ETL at Scale

Why

•  Offers a full platform for this use case, including broad data integration (incl. leading Hadoop distros and analytic DBs) and a powerful array of easy to use front-end analytics

•  Visual MapReduce mitigates need for additional developers, and makes Big Data accessible to existing IT staff

•  Pentaho MapReduce runs much faster in the cluster vs. other scripting tools

[Diagram labels: Transactions (Batch & Real-time); Enrollments & Redemptions; Location, Email, Other Data; PDI; Hadoop Cluster; PDI; Analytic Database; Analyzer; Reports]

Page 28: What Is Hadoop and Where Is It Going?

Pentaho Big Data Analytics Platform Simplified data preparation and analytics for all users

Simplified Analytics Experience

Enterprise Big Data Integration

Blended Big Data

Page 29: What Is Hadoop and Where Is It Going?

Copyright © 2014, Intelligent Solutions, Inc., All Rights Reserved

Claudia Imhoff

President and Founder, Intelligent Solutions, Inc.

A thought leader, visionary, and practitioner, Claudia Imhoff, Ph.D., is an internationally recognized expert on analytics, business intelligence, and the architectures to support these initiatives. Dr. Imhoff has co-authored five books on these subjects and writes articles (totaling more than 150) for technical and business magazines. She is also the Founder of the Boulder BI Brain Trust, a consortium of internationally-recognized independent analysts and experts. You can follow them on Twitter at #BBBT or become a subscriber at www.bbbt.us.

Email: [email protected] Phone: 303-444-6650 Twitter: Claudia_Imhoff

Page 30: What Is Hadoop and Where Is It Going?

Topics

§  An extended data warehouse architecture for a modern BI environment

§  Questions

Page 31: What Is Hadoop and Where Is It Going?

Data Warehouse Technology Drivers

§  Do more with less
§  Data compression
§  Schemas on read
§  Open source components
§  In-Memory capabilities
§  Simpler environments
§  Cloud deployments
§  Easier data management
§  Mobile and Self-service BI
§  Built-in analytic functions

Page 32: What Is Hadoop and Where Is It Going?

Extended Data Warehouse Architecture

[Diagram components: Traditional EDW environment; Investigative computing platform; Data refinery; Data integration platform; Analytic tools & applications; Operational real-time environment; RT analysis engine; BI services; Operational systems; Real-time streaming data; Other internal & external structured & multi-structured data]

Slide created by Colin White – BI Research, Inc.

Page 33: What Is Hadoop and Where Is It Going?

Data Integration Use Case: Data Refinery

Ingests raw detailed data in batch and/or real-time into a managed data store

Distills the data into useful business information and distributes the results to downstream systems

May also directly analyze certain types of data

Employs low-cost hardware and software to enable large amounts of detailed data to be managed cost effectively

Requires (flexible) governance policies to manage data security, privacy, quality, archiving and destruction

[Diagram components: Traditional EDW environment; Investigative computing platform; Data refinery; Data integration platform]

Page 34: What Is Hadoop and Where Is It Going?

Traditional EDW Use Cases

Most BI environments today
§  New technologies can be incorporated into the EDW environment to improve performance, efficiency and reduce costs

Use cases
§  Production reporting
§  Historical comparisons
§  Customer analysis (next best offer, segmentation, life-time value scores, churn analysis, etc.)
§  KPI calculations
§  Profitability analysis
§  Forecasting

[Diagram components: Traditional EDW environment; Data refinery; Data integration platform; Analytic tools & applications; Operational real-time environment; RT analysis engine; Operational systems; BI services]

Page 35: What Is Hadoop and Where Is It Going?

Investigative Computing Use Cases

New technologies used here include:
o  Hadoop, in-memory computing, columnar storage, data compression, appliances, etc.

Use cases
o  Data mining and predictive modeling for EDW and real-time environments
o  Cause and effect analysis
o  Data exploration (“Did this ever happen?” “How often?”)
o  Pattern analysis
o  General, unplanned investigations of data

[Diagram components: Investigative computing platform; Data refinery; Data integration platform; Analytic tools & applications; Operational real-time environment; RT analysis engine; Operational systems; BI services]

Page 36: What Is Hadoop and Where Is It Going?

Operational RT Environment Use Cases

Embedded or callable BI services:
o  Real-time fraud detection
o  Real-time loan risk assessment
o  Optimizing online promotions
o  Location-based offers
o  Contact center optimization
o  Supply chain optimization

Real-time analysis engine:
§  Traffic flow optimization
§  Web event analysis
§  Natural resource exploration analysis
§  Stock trading analysis
§  Risk analysis
§  Correlation of unrelated data streams (e.g., weather effects on product sales)

[Diagram components: Operational real-time environment; RT analysis engine; BI services; Operational systems; Real-time streaming data; Other internal & external structured & multi-structured data]

Page 37: What Is Hadoop and Where Is It Going?

BUT – All Components Must Work Together!

[Diagram. Components: New sources of data; Enterprise DW; Analytic tools; Investigative computing platform; Data refinery; Operational systems; RT analysis engine. Flow labels: existing customer data; 3rd party data; location data; social data; analytic models; analyses; next best customer offer; feedback; call center dashboard or web event stream]

Slide created by Colin White – BI Research, Inc.

Page 38: What Is Hadoop and Where Is It Going?

Topics

§  Extending the data warehouse architecture for a modern analytics environment

§  Questions

Page 39: What Is Hadoop and Where Is It Going?

Many Organizations Not Ready for Big Data?

§  Many companies are struggling to get a traditional data warehouse in place and produce basic BI
§  Business users not analytically savvy
§  Minimal governance
§  Chaotic architectures
§  What do you say to these organizations?

Page 40: What Is Hadoop and Where Is It Going?

Existing Data Warehouse

§  Do organizations have to rip and replace their existing DW to solve big data problems?
§  When do I use a traditional DW versus the Hadoop environment?
§  Does the data hub replace the data warehouse?

Page 41: What Is Hadoop and Where Is It Going?

Data Integration

§  Where is ETL used and not used?
§  How do enterprises control data blending and virtualization (do they need to)?
§  Is data governance still important?
§  How does it change in this new environment?

Page 42: What Is Hadoop and Where Is It Going?

New IT Skills

§  To achieve DW optimization…
§  Does IT have to rip and replace their employees?
§  Should they rely on consultants?
§  To what extent?
§  What is needed to move from basic DW to a big data architecture?

Page 43: What Is Hadoop and Where Is It Going?

Evolving to Advanced Analytics

§  Is it mandatory to hire data scientists?
§  Is training on new technology enough?
§  What else is needed to make the company more analytically-driven?

Page 44: What Is Hadoop and Where Is It Going?

Page 45: What Is Hadoop and Where Is It Going?

Upcoming Topics

www.insideanalysis.com

2014 Editorial Calendar at www.insideanalysis.com/webcasts/the-briefing-room

This Month: BIG DATA

May: DATABASE

June: ANALYTICS & MACHINE LEARNING

Page 46: What Is Hadoop and Where Is It Going?

THANK YOU for your ATTENTION!