© Cloudera, Inc. All rights reserved.
Hadoop in the Cloud
Ryan Lippert, Cloudera Product Marketing
@lippertryan
Drive Customer Insights
Improve Product & Services Efficiency
Lower Business Risk
The world’s largest taxi company owns ZERO vehicles.
The world’s largest accommodation provider owns ZERO real estate.
The world’s most popular media owner creates ZERO content.
The world’s leading music platform owns no music.
Our relationship with data is changing
What’s Driving Hadoop to the Cloud?
Enterprise customers using cloud for big data analytics
Hadoop deployments in cloud are accelerating:
● Executive mandate: minimize on-prem datacenter footprint
● Perceived lower overall TCO
● Increased agility: end-user self-service
● Elasticity: optimize infrastructure usage
Common workloads in the cloud

ETL/Modeling (Data Engineering): Reduce Operating Costs
Only pay for what you need, when you need it
▪ Transient clusters
▪ Elastic workload
▪ Object storage centric
▪ Cloud-native deployment

BI/Analytics (Analytic Database): New Insights, New Revenue
Explore and analyze all data, wherever it lives
▪ Transient or persistent clusters
▪ Sized to demand
▪ HDFS or object storage
▪ Lift-and-shift or cloud-native deployment

App Delivery (Operational Database): Run Without Risk
Enterprise-grade to protect your business, no matter what
▪ Fixed clusters
▪ Periodic sync
▪ All HDFS storage
▪ Lift-and-shift deployment
Crunching 1,000+ Business Metrics per Customer with Sub-Second Responses
• Enables granular targeting of customers
• 50% reduction in marketing execution cost at one bank by focusing on high-potential customers
• Stores and processes thousands of critical events at scale at a low cost
• Provides the flexibility and agility to support customer needs with Cloudera on Amazon Web Services and on premises
CUSTOMER 360
FINANCIAL SERVICES » BEHAVIORAL ANALYTICS » PREDICTIVE ANALYTICS » SCALABLE PROCESSING
Measure user interaction across the ecosystem to help direct R&D and development spend
• Virtuous cycle: identify features that facilitate sharing of content and so drive new customers
• Real-time streaming and batch data from product logs, web analytics, channel data and ERP
• Impala connects to third-party data wrangling and BI tools for fast reporting
DATA-DRIVEN PRODUCTS
MANUFACTURING » CUSTOMER 360 » DATA DRIVEN PRODUCTS » DATA DRIVEN SERVICES
REDUCE RISK
COMPLIANCE » DATA INGEST » STREAM PROCESSING » EVENT MONITORING
Process 30 billion market events per day to build a holistic picture of US market activity
• Market event graph database built on Cloudera on AWS
• Rapid, interactive access to 2+ years’ data
• Operational efficiencies resulting from the platform’s scalability result in net annual savings of $20 million
Key Requirements of Big Data in the Cloud
Elastic: Size compute and storage independently, grow and shrink clusters dynamically, and pay only for what you use on ad-hoc, transient workloads
Hybrid/Multi-Cloud: Preserve business flexibility and data portability, and minimize cloud lock-in, by running in any one of the three major public cloud providers or in private cloud
Enterprise Grade: Reduce risk with the comprehensive manageability, availability, security, and governance required for production big data workloads
How do you do Hadoop in the cloud?
Patterns of Cloud-Native Applications
Flexibility, Self-Service Models, and New Cost Dynamics
Embrace Transience for Lower Costs
Decoupled Storage and Compute for Elastic Scale
Compartmentalize for Greater Isolation
[Diagram labels: Object Store; Compute; 1 hr; Spin Up; Spin Down]
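The cost argument behind "Embrace Transience for Lower Costs" can be illustrated with simple arithmetic. The sketch below is a rough illustration only: the node count, the hourly price, and the job schedule are hypothetical numbers chosen for the example, not figures from this deck, and it ignores object storage and data transfer charges.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # approximate average number of hours in a month


@dataclass(frozen=True)
class ClusterPlan:
    nodes: int
    price_per_node_hour: float  # hypothetical on-demand price, in dollars


def persistent_monthly_cost(plan: ClusterPlan) -> float:
    """Compute cost of keeping the cluster running for the whole month."""
    return plan.nodes * plan.price_per_node_hour * HOURS_PER_MONTH


def transient_monthly_cost(plan: ClusterPlan, job_hours: float, runs_per_month: int) -> float:
    """Compute cost when the cluster is spun up per job and terminated afterwards."""
    return plan.nodes * plan.price_per_node_hour * job_hours * runs_per_month


if __name__ == "__main__":
    plan = ClusterPlan(nodes=10, price_per_node_hour=0.50)  # assumed values
    print("persistent:", persistent_monthly_cost(plan))
    print("transient: ", transient_monthly_cost(plan, job_hours=1, runs_per_month=30))
```

With these assumed inputs, a one-hour daily job is billed for 30 cluster-hours rather than 730. That only works if the data lives in the object store and survives cluster termination.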
Persistent application
[Diagram labels: Streaming Architecture; Data Sources; Kafka/Flume; Spark Streaming; Real-Time Serving; HBase or Impala/Kudu (beta); Kafka; Application; S3; Hive/Spark/HoS; Batch Data Transformations; Impala; Analytics]
Transient application
[Diagram labels: Data Sources; Kafka/Flume; Spark Streaming; Real-Time Serving; HBase or Impala/Kudu (beta); Kafka; Application; S3; Hive/Spark/HoS; S3; Batch Data Transformations; Batch Analytics; Impala; BI & Analytics]
Combining the two: lambda architecture
[Diagram labels: Data Sources; Kafka/Flume; Spark Streaming; Real-Time Serving; HBase or Impala/Kudu (beta); Kafka; Application; S3; Hive/Spark/HoS; S3; Batch Data Transformations; Impala; BI & Analytics]
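A lambda architecture serves queries by combining a batch view, recomputed periodically over all historical data, with a speed view built from events that arrived since the last batch run. The toy sketch below shows only that merge step in plain Python. The event shape and function names are assumptions made for illustration, and in-memory counters stand in for the S3/Hive/Spark batch path and the Kafka/Spark Streaming path.

```python
from collections import Counter
from typing import Iterable, Tuple

Event = Tuple[str, int]  # (metric name, count); a hypothetical event shape


def build_view(events: Iterable[Event]) -> Counter:
    """Aggregate events into per-metric totals."""
    view: Counter = Counter()
    for key, count in events:
        view[key] += count
    return view


def query(batch_view: Counter, speed_view: Counter, key: str) -> int:
    """Answer a query by adding the recent increment to the batch total."""
    return batch_view[key] + speed_view[key]


if __name__ == "__main__":
    # Batch layer: totals computed over historical data by a transient job.
    batch_view = build_view([("clicks", 100), ("views", 40)])
    # Speed layer: events seen since the last batch run, kept by a persistent service.
    speed_view = build_view([("clicks", 3)])
    print(query(batch_view, speed_view, "clicks"))
```

When the next batch run finishes, its output replaces `batch_view` and the speed view is reset to cover only events newer than that run.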
Director Provisioning: Cluster Lifecycle Management
Spin up, grow & shrink, terminate CDH clusters that read/write to object store

Easy Administration
• Dynamic cluster lifecycle management
• Single pane of glass: multi-cluster view

Flexible Deployments
• Multi-cloud: AWS, Azure, GCP
• Fast cluster deployments
• Scaling of CDH clusters
• Spot instance support

Enterprise-grade
• Integration across Cloudera Enterprise
• Management of CDH deployments at scale
Cloudera Director
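The lifecycle named on this slide (spin up, grow, shrink, terminate) can be pictured as a small state model. The class below is a hypothetical toy for illustration only. It is not Cloudera Director's API and it provisions nothing.

```python
class TransientCluster:
    """Toy model of a cluster lifecycle: spin up, grow, shrink, terminate."""

    def __init__(self, name: str, workers: int) -> None:
        if workers < 1:
            raise ValueError("a cluster needs at least one worker")
        self.name = name
        self.workers = workers
        self.running = True  # constructing the object models "spin up"

    def _require_running(self) -> None:
        if not self.running:
            raise RuntimeError(f"cluster {self.name} is terminated")

    def grow(self, count: int) -> None:
        """Add workers to a running cluster."""
        self._require_running()
        self.workers += count

    def shrink(self, count: int) -> None:
        """Remove workers, keeping at least one while the cluster runs."""
        self._require_running()
        if self.workers - count < 1:
            raise ValueError("cannot shrink below one worker")
        self.workers -= count

    def terminate(self) -> None:
        """Release all workers; data is assumed to live in the object store."""
        self.running = False
        self.workers = 0


if __name__ == "__main__":
    cluster = TransientCluster("nightly-etl", workers=3)
    cluster.grow(2)
    cluster.shrink(4)
    cluster.terminate()
    print(cluster.running, cluster.workers)
```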
Get started with Cloudera Enterprise in the cloud
Cloudera Director: Deploy and manage Cloudera Enterprise in the cloud environment of your choice
AWS Quickstart: Deploy an enterprise data hub on AWS
Azure Marketplace: Provision and deploy Cloudera Enterprise on the Azure Marketplace
Thank You