Azure Café Marketplace: Hortonworks Data Platform. Learn about Hortonworks Data Platform (HDP), which is architected, developed, and built completely in the open: an enterprise-ready data platform for adopting a Modern Data Architecture.

Azure Cafe Marketplace with Hortonworks March 31 2016



Azure Café Marketplace Series

Explore what you can build with Microsoft Azure Marketplace Solutions.

Agenda

• Introductions

• Big Data Market Trends

• Hortonworks Marketplace Solutions

• Demo with Q&A

• Next Steps

Microsoft Azure Marketplace

An online store for highly optimized and integrated applications and services ready to deploy on Microsoft Azure

• Growing ecosystem of 3,000+ apps or components
• Reduced sales cycle with pre-configured, ready-to-run apps and services
• Streamlined configuration, deployment, and management
• Integrated platform experience
• Top scenarios include: Big data, security, networking, DevOps & automation, business continuity & backup, management apps


Solving Big Data Scenarios using Hortonworks

© Hortonworks Inc. 2011 – 2015. All Rights Reserved

Hortonworks Company Profile

• Only 100% open source Apache Hadoop data platform
• Founded in 2011
• 1st Hadoop provider to go public: IPO 4Q14 (NASDAQ: HDP)
• 800+ employees across 17 countries
• 1,350 technology partners
• Fastest company to reach $100 M in revenue


Hortonworks Delivers Two Connected Data Platforms: HDP and HDF

• HDF for data in motion: perishable insights
• HDP for data at rest: historical insights
• Together: actionable intelligence for modern data apps, fed by the Internet of Anything

HDP + HDF Create Modern Data Apps

Real-Time Cyber Security

protects systems with superior threat detection

Smart Manufacturing

dramatically improves yields by managing more variables in greater detail

Connected, Autonomous Cars

drive themselves and improve road safety

Future Farming

optimizes soil, seeds and equipment to measured conditions on each square foot

Automatic Recommendation Engines

match products to preferences in milliseconds

HDF Manages Bidirectional Dataflow

[Diagram: HDF moves data in motion in both directions between sources, regional infrastructure, and core infrastructure, where data at rest yields actionable intelligence for modern data apps.]

• At the sources: constrained, high-latency, localized context
• At the core: hybrid (cloud/on-premise), low-latency, global context

Hortonworks Data Flow

Visual User Interface

Drag and drop for efficient, agile operations

Immediate Feedback

Start, stop, tune, replay dataflows in real-time

Adaptive to Volume and Bandwidth

Any data, big or small

Event Level Data Provenance

Governance, compliance & data evaluation

Secure Data Acquisition & Transport

Fine grained encryption for controlled data sharing and selective data democratization

Powered by Apache NiFi

Hortonworks Data Platform Processes Data at Rest

• Governance: manage and audit data according to policy
• Operations: manage, monitor and maintain cluster operations
• Data access: batch, interactive, machine learning, search, and real time, running on YARN: Data Operating System (Cluster Resource Management) over HDFS (Hadoop Distributed File System), nodes 1 … N
• Security: authentication, authorization & encryption for data at rest or in motion
• Deployment


Hortonworks Influences the Apache Community

We Employ the Committers

One third of all committers to the Apache® Hadoop™ project, and a majority in Apache NiFi and other important projects

Our Committers Innovate

and improve Connected Data Platforms

We Influence the Hadoop Roadmap

by communicating important requirements to the community through our leaders

[Chart: Apache Hadoop committers]

HDP and HDF – Flexible Deployment Options

On-Premises

HDP on Your Hardware
• Linux or Windows

Virtualized (your deployment of Hadoop)
• VMware
• Docker
• OpenStack

HDP on Appliance (turnkey Hadoop appliances)
• Teradata
• Microsoft
• PSSC Labs

Cloud

Infrastructure as a Service (IaaS)
• Amazon EC2
• Microsoft Azure
• Rackspace

Hadoop as a Service (HaaS), managed Hadoop service
• Microsoft HDInsight

HDP on Azure

HDP Sandbox on Marketplace
• Single node HDP Cluster on Marketplace
• Fully functional – all HDP components are preinstalled and running
• CentOS 7.1 VM
• Great for getting started

HDP Azure IaaS on Marketplace
• Multi node HDP on Azure IaaS
• Users specify number of nodes, type of VM, HA/non-HA
• Processes HDFS data on VHD disks attached to VMs
• Can connect to WASB
• Great for maximally performing non-elastic clusters

HDInsight on Marketplace
• Managed PaaS offering by Microsoft
• Great for elastic clusters – compute scaling independent of storage
• Can spin up more nodes on demand automatically
• Processes data in WASB (ADLS coming soon)

Cloudbreak on launch.hortonworks…
• Autoscaling HDP Clusters on Azure
• Runs HDP in Docker containers
• Processes WASB data
• Great for elastic clusters – scale compute layer independently from storage
• Periscope scales clusters depending on SLA requirements
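
The four HDP on Azure options differ mainly in cluster size, elasticity, and who manages the cluster. As a rough sketch only, that choice can be written as a small decision function. The `ClusterNeeds` fields, the function name, and the order of the checks are assumptions made for illustration; they are not part of any Hortonworks or Azure tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterNeeds:
    """Hypothetical summary of what a team needs from HDP on Azure."""
    multi_node: bool       # more than a single-node trial cluster
    elastic: bool          # compute must scale independently of storage
    managed_service: bool  # prefer a PaaS offering managed by Microsoft


def suggest_hdp_option(needs: ClusterNeeds) -> str:
    """Map needs to one of the four options listed on the slide."""
    if not needs.multi_node:
        return "HDP Sandbox"          # single node, great for getting started
    if not needs.elastic:
        return "HDP on Azure IaaS"    # HDFS on attached VHD disks, non-elastic
    if needs.managed_service:
        return "HDInsight"            # managed PaaS, data in WASB
    return "Cloudbreak"               # self-run, autoscaling, Docker-based


if __name__ == "__main__":
    needs = ClusterNeeds(multi_node=True, elastic=True, managed_service=False)
    print(suggest_hdp_option(needs))
```

A real decision would also weigh cost, performance targets, and existing skills, which this sketch leaves out.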

Use Cases

Classic Hadoop Driver: Cost optimization

Archive Data off EDW

Move rarely used data to Hadoop as active archive, store more data longer

Onboard costly ETL process

Free your EDW to perform high-value functions like analytics & operations, not ETL

Enrich the value of your EDW

Use Hadoop to refine new data sources, such as web and machine data for new analytical context

HDP helps you reduce costs and optimize the value associated with your EDW

[Diagram, three layers:
• Sources: existing systems (ERP, CRM, SCM), clickstream, web & social, geolocation, sensor & machine, server logs, unstructured
• Data systems: Enterprise Data Warehouse (hot, MPP, in-memory) and HDP 2.3 on nodes 1 … N (ELT, cold data, deeper archive & new sources)
• Analytics: data marts, business analytics, visualization & dashboards]

Case Study: 12 month Hadoop evolution at TrueCar

12 months execution plan (chart of data platform capabilities over time)

• June 2013: Begin Hadoop Execution
• July 2013: Hortonworks Partnership
• Aug 2013: Training & Dev Begins
• Nov 2013: Production Cluster, 60 Nodes, 2 PB
• Dec 2013: Three Production Apps (3 total)
• Jan 2014: 40% Dev Staff Perficient
• Feb 2014: Three More Production Apps (6 total)
• May 2014: IPO

12 Month Results at TrueCar

• Six Production Hadoop Applications
• Sixty nodes/2 PB data
• Storage Costs/Compute Costs from $19/GB to $0.12/GB

“We addressed our data platform capabilities strategically as a pre-cursor to IPO.”
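
To put the quoted cost change in proportion, the short calculation below works out the ratio and the percentage saved per GB. It uses only the two per-GB figures from the slide; the function names are illustrative.

```python
COST_BEFORE_PER_GB = 19.00  # USD per GB, figure from the slide
COST_AFTER_PER_GB = 0.12    # USD per GB, figure from the slide


def reduction_factor(before: float, after: float) -> float:
    """How many times cheaper the new cost is than the old one."""
    return before / after


def percent_saved(before: float, after: float) -> float:
    """Share of the original per-GB cost that was eliminated, in percent."""
    return (before - after) / before * 100.0


if __name__ == "__main__":
    factor = reduction_factor(COST_BEFORE_PER_GB, COST_AFTER_PER_GB)
    saved = percent_saved(COST_BEFORE_PER_GB, COST_AFTER_PER_GB)
    print(f"about {factor:.0f}x cheaper per GB, {saved:.1f}% saved")
```

Going from $19/GB to $0.12/GB is roughly a 158-fold reduction, or about 99.4% saved per GB.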

Common Apache NiFi Use Cases

Predictive Analytics

Ensure the highest value data is captured and available for analysis

Compliance

Gain full transparency into provenance and flow of data

IoT Optimization

Secure, Prioritize, Enrich and Trace data at the edge

Fraud Detection

Move sales transaction data in real time to analyze on demand

Big Data Ingest

Easily and efficiently ingest data into Hadoop

Value Resources

Gain visibility into how data sources are used to determine value

Hortonworks Data Platform

[Timeline diagram; dates shown: 2006, 2009, October 23, 2013]

Hadoop w/ MapReduce
• MapReduce over HDFS (Hadoop Distributed File System), nodes 1 … N
• Largely batch processing
• Silo’d clusters
• Largely batch system
• Difficult to integrate

MR-279: YARN

Hadoop 2 & YARN based Architecture
• YARN: Data Operating System over HDFS (Hadoop Distributed File System), nodes 1 … N
• Batch, interactive, and real-time workloads
• Architected & led development of YARN to enable the Modern Data Architecture

YARN: A Data Operating System

Enables Multi-Tenancy

Better Utilization of existing clusters

• 60% – 150% improvement on node utilization

Enable next-generation Vendor Integration

• YARN is an application framework (e.g., SAS, R, SAP)

Run Next-Generation Workloads

• Interactive SQL + Streaming + ML +…

YARN in Production

• Yahoo: ~40,000 nodes in multiple clusters running YARN, across more than 365 PB of data

• Spotify, Progressive, Kohls, UHG, Sprint, JPMC, Target, AIG, Samsung
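
The 60% – 150% utilization figure above is easiest to interpret with a worked example. The sketch below assumes the figure is a relative improvement over the prior utilization (the slide does not say), and the 30% baseline is purely hypothetical.

```python
def improved_utilization(baseline: float, relative_gain: float) -> float:
    """Return node utilization after a relative gain, capped at 1.0 (100%)."""
    if not 0.0 <= baseline <= 1.0:
        raise ValueError("baseline must be a fraction between 0 and 1")
    return min(1.0, baseline * (1.0 + relative_gain))


if __name__ == "__main__":
    baseline = 0.30  # hypothetical: nodes 30% utilized before YARN
    low = improved_utilization(baseline, 0.60)   # 60% relative improvement
    high = improved_utilization(baseline, 1.50)  # 150% relative improvement
    print(f"{baseline:.0%} baseline -> {low:.0%} to {high:.0%}")
```

Under these assumptions, a cluster running at 30% utilization would land between 48% and 75%.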

BATCH, INTERACTIVE & REAL-TIME DATA ACCESS

• Script: Pig
• SQL: Hive
• Java, Scala: Cascading
• Stream: Storm
• Search: Solr
• NoSQL: HBase, Accumulo
• In-Memory: Spark
• Others: ISV Engines

Execution layers shown in the diagram: Tez, Slider

All engines run on YARN: Data Operating System (Cluster Resource Management) over HDFS (Hadoop Distributed File System), nodes 1 … N

How do you Operate a Hadoop Cluster?

Apache Ambari is a platform to provision, manage and monitor Hadoop clusters
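
Besides its web UI, Ambari exposes a REST API, so monitoring can be scripted. The sketch below builds a request that lists the clusters an Ambari server manages. The host name and credentials are placeholders, and the port (8080), the `/api/v1/clusters` path, and the response shape reflect common Ambari setups rather than anything stated in this deck; check them against your Ambari version.

```python
import base64
import json
import urllib.request


def build_clusters_request(host: str, user: str, password: str,
                           port: int = 8080) -> urllib.request.Request:
    """Build (but do not send) a GET request for Ambari's cluster list."""
    url = f"http://{host}:{port}/api/v1/clusters"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    # HTTP basic authentication header
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def cluster_names(payload: dict) -> list:
    """Extract cluster names from a parsed response (assumed shape)."""
    return [item["Clusters"]["cluster_name"] for item in payload.get("items", [])]


def fetch_cluster_names(request: urllib.request.Request) -> list:
    """Send the request and return the cluster names (needs network access)."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return cluster_names(json.load(response))
```

For example, `fetch_cluster_names(build_clusters_request("ambari.example.com", "admin", "secret"))` would return the names of the clusters on that (hypothetical) server.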

Detailed Reference Architecture for IoT Applications

[Diagram; recoverable labels:
• Source data: server logs, application logs, firewall logs, CRM/ERP, sensor
• High speed ingest: HDF, Flume, Kafka, SQOOP (stream to HDF, forward to Storm, sink to HDFS)
• Real-time: Storm/Spark Streaming, event enrichment, transform, alerts (JMS), bolt to HDFS, real time storage in HBase/Phoenix
• Batch and interactive: HDFS, Hive, Pig, HiveServer, Spark-Thrift
• Machine learning: Spark, Spark-ML, iterative ML, models
• Consumption: interactive UI framework, dashboard (Silk), reporting, BI tools]

Demo Hortonworks Sandbox

Thank You!

Want to learn more about HDP?

http://hortonworks.com/training/

Azure Café Next Steps

For more information regarding the Azure Marketplace and Hortonworks solutions, contact:

• Marti Stephens-Hartka – Microsoft ISV Leader East Region

• Saptak Sen – Hortonworks Group Manager Partner Solutions

Additional Resources:

• Azure HDInsight: https://azure.microsoft.com/en-us/services/hdinsight/

• Hortonworks Sandbox on Azure Marketplace: https://azure.microsoft.com/en-us/marketplace/partners/hortonworks/hortonworks-sandbox/

• Hortonworks Data Platform on Azure Marketplace: https://azure.microsoft.com/en-us/marketplace/partners/hortonworks/hortonworks/

• Hortonworks Customer Stories: http://hortonworks.com/customers/

• Hortonworks Blog: http://hortonworks.com/blog/

• Microsoft Cortana Analytics Suite: http://www.microsoft.com/en-us/server-cloud/cortana-analytics-suite/overview.aspx

• Azure Data Lake Analytics: https://azure.microsoft.com/en-us/solutions/data-lake/

• Hortonworks and Microsoft on YouTube: https://www.youtube.com/watch?v=zWVlOMlzZgw&feature=youtu.be

https://tryazuremarketplace.com/

Complete Labs for top AMP ISVs: Hortonworks, Barracuda, Chef, Docker, Kemp and Coming Soon (Hanu and more)

• Most popular Azure Marketplace solutions in 4 tracks
• 3 week intervals between same track ISV
• Onboarding assistance with lab set up
• Need access to Azure Subscription

Holistic, Programmatic, Targeted

Dev Ops              Security                 Big Data                Management
Chef, April 27th     Barracuda, April 20th    Hortonworks, May 5th    Cloud Cruiser, May 18th
Docker, May 11th     Kemp, May 18th           DataStax, May 25th      Hanu, June 8th

Also listed: Core OS (June 1st) and Nasuni (June 15th)

*Registration links not yet available

Barracuda Bus Tour Brief

Activity Name: Barracuda + Microsoft North America Bus/MTC Tour

Approximate Length: 15 cities, 02/09/16 – 04/21/16

General Overview: The Barracuda Bus Tour is an annual event. This is Barracuda’s fifth annual tour, but the first with a partner. The goal of this year’s tour with Microsoft is to:
- Drive awareness of the Barracuda/Microsoft solutions – Office 365 and Azure
- Target engagement across three focus areas: Customers, Partners and Microsoft Sellers
- Drive pipeline/revenue – Azure consumption and Office 365 active usage

Event Track:
- 10:30am – 12:45pm Customer Seminar – Migration and Security for Azure and Office 365
- 1:30pm – 3:30pm Partner & Seller Seminar – Migration and Security for Azure and Office 365

Goals/Metrics:
- Solution Awareness
- Education
- Leads/Revenue
- Drive Azure consumption
- Drive Office 365 active usage

Registration Link: https://www.barracuda.com/programs/expedition

Date(s) Location/ State

Tues, 2/16 Mountain View, CA

Tues, 2/23 Irvine, CA

Wed, 2/24 Los Angeles, CA

Wed, 3/16 Portland, OR

Thurs, 3/17 Seattle, WA

Wed, 3/30 Dallas, TX

Thurs, 3/31 Houston, TX

Wed, 4/6 Minneapolis, MN

Thurs, 4/7 Chicago, IL

Mon, 4/11 Detroit, MI

Wed, 4/13 Boston, MA

Thurs, 4/14 New York, NY

Mon, 4/18 Philadelphia, PA

Tues, 4/19 Reston, VA

Wed, 4/20 Charlotte, NC

Thurs, 4/21 Atlanta, GA


Keeping it simple and making it fun (KIS+MIF)