Page 1 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Hortonworks We Do Hadoop. We Do Retail. September 22, 2014

Hortonworks - What's Possible with a Modern Data Architecture?

DESCRIPTION

This is Mark Ledbetter's presentation from the September 22, 2014 Hortonworks webinar “What’s Possible with a Modern Data Architecture?” Mark is vice president for industry solutions at Hortonworks. He has more than twenty-five years of experience in the software industry, with a focus on retail and supply chain.

Page 1: Hortonworks - What's Possible with a Modern Data Architecture?

Hortonworks We Do Hadoop. We Do Retail.

September 22, 2014

Page 2: Hortonworks - What's Possible with a Modern Data Architecture?

Our Mission: Power your Modern Data Architecture with HDP and Enterprise Apache Hadoop

Who we are
•  June 2011: Original 24 architects, developers, operators of Hadoop from Yahoo!
•  June 2014: An enterprise software company with 500+ Employees

Key Partners

Our model
Innovate and deliver Apache Hadoop as a complete enterprise data platform completely in the open, backed by a world class support organization

Page 3: Hortonworks - What's Possible with a Modern Data Architecture?

Fastest growing Fortune 1000 customer base

Customer Momentum
•  300+ customers in seven quarters, growing at 75+/quarter
•  Two thirds of customers come from F1000

Experience at Scale
•  80,000 nodes under contract
•  Largest Cluster in North America: 32,000 Nodes
•  Largest Cluster in Europe: 1,000 Nodes
•  Largest Known Cluster in APAC: 400 Nodes

30+ customers migrated from other distributions

Some notable migrations include many of the early adopters of Hadoop:

Page 4: Hortonworks - What's Possible with a Modern Data Architecture?

Enabling a Modern Data Architecture with HDP and Apache Hadoop

Spring 2014 Version 1.4

We Do Hadoop. We Do Retail.

Page 5: Hortonworks - What's Possible with a Modern Data Architecture?

Traditional systems under pressure
•  Silos of Data
•  Costly to Scale
•  Constrained Schemas

…and difficult to manage new data

[Diagram]
APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
DATA SYSTEM: RDBMS, EDW, MPP
SOURCES:
•  Existing Sources (CRM, ERP,…)
•  New Data Types: Clickstream; Geolocation; Sentiment, Web Data; Sensor, Machine Data; Unstructured docs, emails; Server logs

Page 6: Hortonworks - What's Possible with a Modern Data Architecture?

Why a Modern Data Architecture?

LIMITATIONS
•  Silos & Expensive
•  Single Purpose

MDA: Key Drivers
1.  Leverage new types of data
2.  IT optimization
3.  Enable a data lake

GOALS
•  Extend new data sets across existing data platforms
•  Common data platform, multiple processing engines
•  Batch, interactive and real time on a single data platform

[Diagram]
APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
DATA SYSTEM: RDBMS, EDW, MPP
SOURCES: Existing Systems, Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs, Unstructured

Page 7: Hortonworks - What's Possible with a Modern Data Architecture?

HDP2 and YARN enable the Modern Data Architecture

Hortonworks architected and led development of YARN

Common data set, multiple applications
•  Optionally land all data in a single cluster
•  Batch, interactive & real-time use cases
•  Support multi-tenant access, processing & segmentation of data

YARN: Architectural center of Hadoop
•  Consistent security, governance & operations
•  Ecosystem applications certified by Hortonworks to run natively in Hadoop

[Diagram]
APPLICATIONS: Business Analytics, Custom Applications, Packaged Applications
DATA SYSTEM: RDBMS, EDW, MPP, alongside YARN: Data Operating System (Batch, Interactive, Real-Time) on HDFS (Hadoop Distributed File System), nodes 1 to N
SOURCES: Existing Systems, Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs, Unstructured

Page 8: Hortonworks - What's Possible with a Modern Data Architecture?

HDP delivers a comprehensive data management platform

YARN is the architectural center of HDP
•  Enables batch, interactive and real-time workloads
•  Single SQL engine for both batch and interactive
•  Enable existing ISV apps to plug directly into Hadoop via YARN

Provides comprehensive enterprise capabilities
•  Governance
•  Security
•  Operations

The widest range of deployment options
•  Linux & Windows
•  On premise & cloud

[Diagram: HDP 2.1 Hortonworks Data Platform]

GOVERNANCE & INTEGRATION
•  Data Workflow, Lifecycle & Governance: Falcon, Sqoop, Flume, NFS, WebHDFS

BATCH, INTERACTIVE & REAL-TIME DATA ACCESS
•  Script: Pig
•  SQL: Hive, HCatalog
•  NoSQL: HBase, Accumulo
•  Stream: Storm
•  Search: Solr
•  In-Memory: Spark
•  Others: ISV Engines
•  Tez

DATA MANAGEMENT
•  YARN: Data Operating System
•  HDFS (Hadoop Distributed File System), nodes 1 to N

SECURITY
•  Authentication, Authorization, Accounting, Data Protection
•  Storage: HDFS
•  Resources: YARN
•  Access: Hive, …
•  Pipeline: Falcon
•  Cluster: Knox

OPERATIONS
•  Provision, Manage & Monitor: Ambari, Zookeeper
•  Scheduling: Oozie

Deployment Choice
•  Linux, Windows, On-Premise, Cloud
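
The platform diagram lists WebHDFS among the integration components. As a rough illustration of what that interface looks like to a client, the sketch below builds WebHDFS REST URLs in Python. The `/webhdfs/v1` prefix and the `op` and `user.name` query parameters are part of the WebHDFS REST API; the helper name `webhdfs_url`, the host, the path, the user, and the default port 50070 (the usual NameNode HTTP port in Hadoop 2.x) are assumptions made for this example and do not come from the presentation.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, path, op, port=50070, user=None):
    """Build a WebHDFS REST URL for an absolute HDFS path and an operation.

    Hypothetical helper for illustration; it only formats the URL and
    does not contact a cluster.
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    params = {"op": op}
    if user is not None:
        # Simple (non-Kerberos) authentication passes the user as a query parameter.
        params["user.name"] = user
    return "http://{}:{}/webhdfs/v1{}?{}".format(
        host, port, quote(path), urlencode(params)
    )


# Hypothetical host, path and user, chosen only for the example.
listing_url = webhdfs_url(
    "namenode.example.com", "/data/clickstream", "LISTSTATUS", user="analyst"
)
print(listing_url)
```

Issuing an HTTP GET against a URL of this shape on a real cluster would return a JSON directory listing; a secured cluster would typically sit behind Knox and require different authentication.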

Page 9: Hortonworks - What's Possible with a Modern Data Architecture?

Our Approach

Page 10: Hortonworks - What's Possible with a Modern Data Architecture?

Hortonworks Approach

1.  Innovate the Core
Architect and build innovation at the core of Hadoop
•  YARN: Data Operating System
•  HDFS as the storage layer
•  Key processing engines

[Diagram: YARN: Data Operating System on HDFS (Hadoop Distributed File System), with processing engines Script (Pig), Search (Solr), SQL (Hive/Tez, HCatalog), NoSQL (HBase, Accumulo), Stream (Storm) and Batch (MapReduce)]

Page 11: Hortonworks - What's Possible with a Modern Data Architecture?

Hortonworks Approach

1.  Innovate the Core
Architect and build innovation at the core of Hadoop
•  YARN: Data Operating System
•  HDFS as the storage layer
•  Key processing engines

2.  Extend Hadoop as an Enterprise Data Platform
•  Extend Hadoop with enterprise capabilities for governance, security & operations
•  Apply enterprise software rigor to the open source development process

[Diagram: YARN: Data Operating System on HDFS (Hadoop Distributed File System), with processing engines Script (Pig), Search (Solr), SQL (Hive/Tez, HCatalog), NoSQL (HBase, Accumulo), Stream (Storm) and Batch (MapReduce)]

HDP 2.1: Governance & Integration, Security, Operations, Data Access, Data Management, YARN

Page 12: Hortonworks - What's Possible with a Modern Data Architecture?

Hortonworks Approach

1.  Innovate the Core
Architect and build innovation at the core of Hadoop
•  YARN: Data Operating System
•  HDFS as the storage layer
•  Key processing engines

2.  Extend Hadoop as an Enterprise Data Platform
•  Extend Hadoop with enterprise capabilities for governance, security & operations
•  Apply enterprise software rigor to the open source development process

3.  Enable the Ecosystem
Enable the leaders in the data center to easily adopt & extend their platforms
•  Establish Hadoop as standard component of a modern data architecture
•  Joint engineering

[Diagram: YARN: Data Operating System on HDFS (Hadoop Distributed File System), with processing engines Script (Pig), Search (Solr), SQL (Hive/Tez, HCatalog), NoSQL (HBase, Accumulo), Stream (Storm) and Batch (MapReduce)]

HDP 2.1: Governance & Integration, Security, Operations, Data Access, Data Management, YARN

Page 13: Hortonworks - What's Possible with a Modern Data Architecture?

4.  …all done completely in Open Source

Contributes more to the Apache Hadoop ecosystem in the ASF than any other vendor

Hadoop is a platform decision
•  Open Source: fastest path to innovation for a platform technology
•  Eliminate vendor lock in, no proprietary software
•  Data center leaders have committed to the open source approach

[Diagram: YARN: Data Operating System on HDFS (Hadoop Distributed File System), with processing engines Script (Pig), Search (Solr), SQL (Hive/Tez, HCatalog), NoSQL (HBase, Accumulo), Stream (Storm) and Batch (MapReduce)]

Apache Project    Committers    PMC Members
Hadoop            26            20
Tez               15            13
Hive              15            5
HBase             7             3
Pig               5             5
Accumulo          2             2
Flume             1             0
Storm             2             2
Sqoop             1             0
Ambari            32            28
Oozie             3             2
Zookeeper         2             1
Knox              6             6
Falcon            3             3
TOTAL             120           90
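
As a quick sanity check on the table above, the per-project figures can be summed and compared with the TOTAL row. This is a minimal sketch in Python; the numbers are copied from the slide, while the variable names are chosen for this example.

```python
# (committers, PMC members) per Apache project, copied from the slide's table.
contributions = {
    "Hadoop": (26, 20),
    "Tez": (15, 13),
    "Hive": (15, 5),
    "HBase": (7, 3),
    "Pig": (5, 5),
    "Accumulo": (2, 2),
    "Flume": (1, 0),
    "Storm": (2, 2),
    "Sqoop": (1, 0),
    "Ambari": (32, 28),
    "Oozie": (3, 2),
    "Zookeeper": (2, 1),
    "Knox": (6, 6),
    "Falcon": (3, 3),
}

# Sum each column across all projects.
total_committers = sum(committers for committers, _ in contributions.values())
total_pmc_members = sum(pmc for _, pmc in contributions.values())

print(total_committers, total_pmc_members)  # 120 90, matching the TOTAL row
```

Both column sums agree with the totals stated on the slide.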

HDP 2.1: Governance & Integration, Security, Operations, Data Access, Data Management, YARN

Page 14: Hortonworks - What's Possible with a Modern Data Architecture?

The Modern Data Architecture w/ HDP

Page 15: Hortonworks - What's Possible with a Modern Data Architecture?

Hadoop Juices Sales in Retail

FUNCTION: USE CASE

Marketing
•  360° View of Customer: Customer Lifetime Value, Targeted Marketing Campaigns
•  Segmentation
•  Pricing
•  Brand Sentiment Analysis

eCommerce & Customer Service
•  Product Recommendation Engine
•  Web Path Optimization
•  Call Center Productivity

Forecasting, Allocation & Merchandizing
•  Product Placement
•  Store-Level Optimization of Assortment, Prices and Spaces

Procurement & Supply Chain
•  Inventory Management
•  Real-time Delivery Management
•  Improved Order Picking
•  Vendor Management
•  Strategic Sourcing

Page 16: Hortonworks - What's Possible with a Modern Data Architecture?

Case Study: 12 month Hadoop evolution at TrueCar

12 months execution plan (vertical axis: Data Platform Capabilities)
•  June 2013: Begin Hadoop Execution
•  July 2013: Hortonworks Partnership
•  Aug 2013: Training & Dev. Begins
•  Nov 2013: Production Cluster, 60 Nodes, 2 PB
•  Dec 2013: Three Production Apps (3 total)
•  Jan 2014: 40% Dev. Staff Proficient
•  Feb 2014: Three More Production Apps (6 total)
•  May ‘14: IPO

12 Month Results at TrueCar
•  Six Production Hadoop Applications
•  Sixty nodes/2PB data
•  Storage Costs/Compute Costs from $19/GB to $0.23/GB
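
To put the cost figures above in perspective, the arithmetic can be sketched in Python. The inputs ($19/GB, $0.23/GB, 60 nodes, 2 PB) come from the slide; the per-node figure assumes decimal units (1 PB = 1,000 TB) and an even spread of data across nodes, and the slide does not say whether 2 PB is raw or replicated capacity.

```python
# Figures taken from the slide.
cost_before_per_gb = 19.00  # dollars per GB before the move to Hadoop
cost_after_per_gb = 0.23    # dollars per GB after
nodes = 60
capacity_pb = 2

# How many times cheaper each GB became, and the relative saving.
reduction_factor = cost_before_per_gb / cost_after_per_gb
percent_saved = 100 * (cost_before_per_gb - cost_after_per_gb) / cost_before_per_gb

# Assumption: 1 PB = 1,000 TB, data evenly distributed across nodes.
tb_per_node = capacity_pb * 1000 / nodes

print(f"{reduction_factor:.1f}x cheaper per GB")    # 82.6x
print(f"{percent_saved:.1f}% lower cost per GB")    # 98.8%
print(f"{tb_per_node:.1f} TB per node on average")  # 33.3 TB
```

Under these assumptions the stated change is roughly an 83-fold reduction in cost per gigabyte, with each node holding about 33 TB on average.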

“We addressed our data platform capabilities strategically as a pre-cursor to IPO.”

Page 17: Hortonworks - What's Possible with a Modern Data Architecture?

Hortonworks Support We Do Hadoop. We Do Retail.

Page 18: Hortonworks - What's Possible with a Modern Data Architecture?

End to end support to ensure your Hadoop success

Lifecycle: Architect & Design, Development, Implementation, Production

Only Hortonworks provides unlimited support across architecture, development, implementation & production

Mission Critical Hadoop Support

Hortonworks Support
Backed by the architects, builders and operators of Hadoop, Hortonworks offers the most effective and complete Hadoop support available

Support Provided
•  Application Development Support
•  Diagnose Install, Config & Cluster Mgmt Issues
•  Access to Upgrades, Updates and Patches
•  Diagnose Performance Issues
•  Remote Troubleshooting
•  Diagnose Loading, Processing & Query Issues
•  Customer Support Portal
•  Advanced Knowledgebase

Page 19: Hortonworks - What's Possible with a Modern Data Architecture?

End to end support to ensure your Hadoop success

Services

Hortonworks Services
Our services team ensures your Hadoop project will be delivered successfully

Services Provided
•  Architecture
•  Implementation
•  Cluster Tuning
•  Migration
•  Best Practices

Page 20: Hortonworks - What's Possible with a Modern Data Architecture?

End to end support to ensure your Hadoop success

Training

Hortonworks University
We offer a wide range of training options backed by experts and designed to evolve your team's Hadoop proficiency

Custom Coursework
•  On-site training for your team
•  Customized for your requirements

Public Courses
•  Offered in all geographies
•  Hadoop Architect
•  Hadoop Developer
•  Hadoop Analyst
•  Hadoop Operations
•  Data Science

Page 21: Hortonworks - What's Possible with a Modern Data Architecture?

Hadoop is a Platform Decision

Open Leadership
Drive innovation in the open via the Apache community-driven open source process

Enterprise Rigor
Engineer, test and certify Apache Hadoop with the enterprise in mind

Ecosystem Endorsement
Focus on deep integration with existing data center technologies and skills

Fastest Growing Customer and Partner Base
•  Largest and most experienced Hadoop adopters have standardized on Hortonworks
•  The data center leaders have standardized on Hortonworks

Page 22: Hortonworks - What's Possible with a Modern Data Architecture?

Questions? We Do Hadoop. We Do Retail.

September 22, 2014