
Cloudera Manager – API’s & Extensibility

Patrick Angeles, Director Field Technical Services
December 2013

CONFIDENTIAL - RESTRICTED


Cloudera Manager
End-to-End Administration for CDH

1. Manage: Easily deploy, configure & optimize clusters
2. Monitor: Maintain a central view of all activity
3. Diagnose: Easily identify and resolve issues
4. Integrate: Use Cloudera Manager with existing tools

©2013 Cloudera, Inc. All Rights Reserved.


Integrating with your IT Mgmt tools


[Diagram: Cloudera Manager (Hadoop Operations) alongside Datacenter Operations tools]
• Installation, deployment tools, e.g. Chef, Puppet etc.
• Monitoring tools, e.g. Orion, Tivoli, BMC etc.
• Alerting tools, e.g. Nagios, SNMP etc.

Various options for integrating Cloudera Manager into your existing Datacenter Operations/Tools
• Cloudera Manager API
  • Introduced in CM4 (June 2012)
  • Installation & deployment
  • Monitoring
• SNMP Alerts
  • Introduced in CM4.5 (Feb 2013)
• And more…
  • Monitoring ‘tsquery’ (Feb 2013)
  • User-defined triggers/alarms (new for C5!)
  • Service extensibility (new for C5!)


Cloudera Manager (CM) API

• API access was a new feature introduced in Cloudera Manager 4.0, providing programmatic access to cluster operations (such as configuration and restart) and monitoring information (such as health and metrics).

• The CM API is an HTTP REST API, using JSON serialization. The API is served on the same host and port as the CM web UI, and does not require an extra process or extra configuration. API users have the same privileges as they do in the web UI.


• Docs & Examples
  • http://cloudera.github.io/cm_api/
  • https://github.com/cloudera/cm_api

• Java/Python clients
  • http://blog.cloudera.com/blog/2013/05/how-to-automate-your-hadoop-cluster-from-java/
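Because the API is plain HTTP with JSON, it can be reached without any client library. The following is a minimal sketch, not the official client: the host name and credentials are hypothetical, and the port (7180), the /api/v1/clusters path, HTTP basic authentication and the {"items": [...]} response shape are assumptions to check against the API documentation linked above.

```python
import base64
import json
import urllib.request


def build_request(host, path, user, password, port=7180, version="v1"):
    """Build (but do not send) an authenticated GET request for the CM API.

    Assumes the API lives under /api/<version>/ on the web UI port and
    accepts HTTP basic authentication.
    """
    url = f"http://{host}:{port}/api/{version}/{path.lstrip('/')}"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def cluster_names(response_body):
    """Extract cluster names from a JSON list response of the assumed shape."""
    return [item["name"] for item in json.loads(response_body)["items"]]


# Hypothetical host and credentials, for illustration only.
req = build_request("cm-host.example.com", "clusters", "admin", "admin")
print(req.full_url)

# Canned response standing in for a live server.
sample = '{"items": [{"name": "Cluster 1 - CDH4", "version": "CDH4"}]}'
print(cluster_names(sample))
```

Against a live server, the body would come from urllib.request.urlopen(req).read() instead of the canned sample.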


Examples of integration with CM API

• Installation & Deployment
  • Chef
  • Puppet
  • Dell Crowbar
    • http://blog.cloudera.com/blog/2013/08/how-to-deploy-hadoop-clusters-automatically-with-dell-crowbar-and-cloudera-manager/
  • StackIQ
    • http://web.stackiq.com/blog/bid/312064/StackIQ-Cluster-Manager-now-integrated-with-Cloudera
  • WANdisco: non-stop NN setup
  • Several other customers/partners leveraging the APIs as part of their install & deployment process
• Monitoring & Alerting
  • Oracle Enterprise Manager (via Big Data Appliance)
  • Nagios
    • https://github.com/cloudera/cm_api/tree/master/nagios
    • https://github.com/harisekhon/nagios-plugins/blob/master/check_hadoop_cloudera_manager_metrics.pl
  • SNMP alerts integration with IBM Netcool
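A monitoring integration of the Nagios kind listed above essentially translates a health value reported by Cloudera Manager into a plugin exit code. The sketch below is not taken from the linked plugins. It assumes a service object carrying "name" and "healthSummary" fields with values such as GOOD, CONCERNING and BAD, and uses the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
# Standard Nagios plugin exit codes.
NAGIOS_OK, NAGIOS_WARNING, NAGIOS_CRITICAL, NAGIOS_UNKNOWN = 0, 1, 2, 3
LABELS = ["OK", "WARNING", "CRITICAL", "UNKNOWN"]

# Assumed mapping from CM health summary values to Nagios states.
HEALTH_TO_NAGIOS = {
    "GOOD": NAGIOS_OK,
    "CONCERNING": NAGIOS_WARNING,
    "BAD": NAGIOS_CRITICAL,
}


def nagios_status(service):
    """Return (exit_code, message) for a service dict from the API.

    Any health value outside the mapping (or a missing one) is reported
    as UNKNOWN rather than guessed.
    """
    health = service.get("healthSummary", "NOT_AVAILABLE")
    code = HEALTH_TO_NAGIOS.get(health, NAGIOS_UNKNOWN)
    name = service.get("name", "?")
    return code, f"{LABELS[code]} - {name} health is {health}"


# Hypothetical service record, for illustration only.
code, message = nagios_status({"name": "hdfs1", "healthSummary": "CONCERNING"})
print(code, message)
```

A real plugin would print the message and call sys.exit(code) so that Nagios picks up the state.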


Develop & contribute your plug-ins using the Cloudera Manager API


Cloudera Manager – Monitoring via ‘tsquery’


• Introduced as part of CM4.5 release (Feb 2013)

• Great way to add interesting charts (above & beyond what is provided by default) and monitor metrics that are relevant to your clusters

• The tsquery language is used to specify statements for retrieving time-series data from the Cloudera Manager time-series data store

• Example: How do I compare all disk IO for all the DataNodes that belong to a specific HDFS service?
  select bytes_read, bytes_written where roleType=DATANODE and serviceName=hdfs1

• Retrieved time-series data can be plotted via various options – line, bar, scatter, heat maps, table list etc.

• Extending this concept to create user-defined triggers/alarms (new for C5!).

• More details
  • http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM5/latest/Cloudera-Manager-Diagnostics-Guide/cm5dg_chart_time_series_data.html
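A tsquery statement is just a string, so the DataNode disk IO example can be assembled programmatically and handed to the API. This is a minimal sketch: the tsquery() and timeseries_url() helpers are hypothetical, and the /api/v4/timeseries endpoint with a "query" parameter is an assumption to verify against the API documentation for your CM version.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def tsquery(metrics, **filters):
    """Compose a 'select ... where ...' statement from metrics and equality filters."""
    select = ", ".join(metrics)
    where = " and ".join(f"{key}={value}" for key, value in filters.items())
    return f"select {select} where {where}" if where else f"select {select}"


def timeseries_url(api_base, query):
    """URL-encode the statement into the assumed time-series endpoint."""
    return f"{api_base}/timeseries?{urlencode({'query': query})}"


q = tsquery(["bytes_read", "bytes_written"], roleType="DATANODE", serviceName="hdfs1")
print(q)

# Hypothetical host, for illustration only.
url = timeseries_url("http://cm-host.example.com:7180/api/v4", q)
print(url)
```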


Examples of Cloudera Manager ‘tsquery’


Example 1: How do I track the aggregate Cluster Disk IO?
select dt0(read_bytes_disk_sum), dt0(write_bytes_disk_sum) where category = CLUSTER and clusterId = $CLUSTERID

Example 2: How do I compare CPU usage across hosts?
select dt0(total_cpu_user) / getHostFact(numCores, 1) * 100,
       dt0(total_cpu_system) / getHostFact(numCores, 1) * 100,
       dt0(total_cpu_nice) / getHostFact(numCores, 1) * 100,
       dt0(total_cpu_iowait) / getHostFact(numCores, 1) * 100,
       dt0(total_cpu_irq) / getHostFact(numCores, 1) * 100,
       dt0(total_cpu_soft_irq) / getHostFact(numCores, 1) * 100

Create & contribute your ‘tsqueries’!
https://github.com/cloudera/cm_charting_scrapbook
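The arithmetic in Example 2 may be easier to follow on concrete numbers. The sketch below is an illustration only, under two assumptions that the slides do not state: that dt0 is a per-second derivative with negative values replaced by zero, and that total_cpu_user is a cumulative counter of CPU-seconds. The sample data and the 4-core host are made up.

```python
def dt0(samples):
    """Per-second rate between consecutive (timestamp_seconds, value) samples.

    Negative rates (for example after a counter reset) are clamped to zero,
    which is the assumed meaning of the trailing "0" in dt0.
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        rate = (v1 - v0) / (t1 - t0)
        rates.append((t1, max(rate, 0.0)))
    return rates


def cpu_percent(samples, num_cores):
    """Mirror of: dt0(metric) / getHostFact(numCores, 1) * 100."""
    return [(t, rate / num_cores * 100) for t, rate in dt0(samples)]


# Hypothetical counter sampled every 60 s; the drop at t=120 models a reset.
user_cpu = [(0, 100.0), (60, 220.0), (120, 10.0), (180, 70.0)]
print(cpu_percent(user_cpu, 4))
# → [(60, 50.0), (120, 0.0), (180, 25.0)]
```

Dividing by the core count normalises the rate so that hosts with different numbers of cores can be compared on the same 0 to 100 scale.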


Cloudera Manager – Service Extensibility

• Introduced in C5

• Still in Beta!

• Some aspects (especially Parcel mgmt) available in CM4.x

• Example: Collaboration with Syncsort to deploy DMX-h libraries

• Single management console for CDH, non-CDH services and ISV applications

• Similar look and feel as existing services

• Easy to write (Java-free!)

• Flexible

• Independent release cycle


Analogy from Operating Systems (OS) world


[Diagram: ISV’s view of OS]
• Core OS kernel
• Package Mgmt
• Process/Resource Mgmt
• Security Mgmt
• Data Access Mgmt
• Systems Management


Bringing ISV Apps to CDH


[Diagram: ISV’s view of Hadoop]
• Core Hadoop/CDH kernel
• Parcels
• Resource Mgmt
• Security Mgmt
• CDK API’s
• Cloudera Manager


Integrating into the Cloudera Product Portfolio


Cloudera Manager

Features, with description and examples:

• Package Mgmt
  • Ability to easily package and distribute binaries/jars via “Parcels”
  • Examples: Informatica, Syncsort
• Resource Mgmt
  • Ability to deploy applications as stand-alone processes or via YARN* on the Hadoop grid
  • Resource isolation of cluster resources
  • Examples: SAS, 0xData, Accumulo
• Security Mgmt
  • Support for Kerberos Mgmt
  • Role-based access control for Tables/Views in Hive/Impala via Sentry
• Data Access Mgmt
  • HDFS and HBase API abstraction and simplification
• Systems Mgmt
  • Manage: deploy and upgrade (rolling) services and pkgs; manage configurations
  • Monitor: proactive health checks; track resource utilization; custom metrics charts
  • Diagnose: distributed log collection and searching; tag and track key events
  • Integrate: access operational tools via API; surface overall cluster metrics to ISV dashboard
  • Examples: Non-CDH apps (Accumulo, Spark, Giraph etc.), ISV’s

* Support for YARN planned as part of CM5.x in FY14


So... how does it work?

• A JSON file that describes your service
• Set of control scripts
• Packaged as a JAR file
• As promised, Java-free
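Since a JAR is a ZIP archive, the three ingredients above can be bundled with nothing more than a ZIP library. The sketch below is illustrative only. The archive layout (descriptor/service.sdl plus a scripts/ directory) and the build_extension_jar() helper are assumptions, not something these slides specify, so check the SDK documentation for the real packaging rules.

```python
import io
import json
import zipfile


def build_extension_jar(descriptor, scripts):
    """Bundle a JSON service descriptor and control scripts into JAR bytes.

    descriptor: dict serialised as the JSON service description.
    scripts: mapping of script file name to script text.
    The entry names used here are an assumed layout.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as jar:
        jar.writestr("descriptor/service.sdl", json.dumps(descriptor, indent=2))
        for name, body in scripts.items():
            info = zipfile.ZipInfo(f"scripts/{name}")
            info.external_attr = 0o755 << 16  # mark the script as executable
            jar.writestr(info, body)
    return buf.getvalue()


# Hypothetical minimal inputs, for illustration only.
jar_bytes = build_extension_jar({"name": "spark"}, {"control.sh": "#!/bin/bash\n"})
with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
    print(jar.namelist())
```

No Java toolchain is involved at any point, which is the sense in which the approach is "Java-free".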


Example: Cloudera Manager Extensions - Spark


Cloudera Manager Extensions


Cloudera Manager Extensions: Spark


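The Code

Control script (scripts/control.sh):

#!/bin/bash

# Cloudera Manager passes the command name as the first argument.
CMD=$1

# Assumption: params.properties holds one key=value pair per line.
MASTER_PORT=$(grep '^master_port=' ./params.properties | cut -d= -f2)

timestamp=$(date)

case $CMD in
  (start_master)
    exec "$SPARK_HOME/scripts/spark-start.sh" master
    ;;
  (*)
    echo "$timestamp Don't understand [$CMD]"
    ;;
esac

Service descriptor (JSON):

{
  "name" : "spark",
  "roles" : [{
    "name" : "master",
    "startRunner" : {
      "program" : "scripts/control.sh",
      "args" : [ "start_master", "./params.properties" ]
    },
    "parameters" : [{
      "name" : "master_port",
      "type" : "port",
      "default" : 7077
    }],
    "configWriter" : {
      "generators" : [{
        "filename" : "params.properties"
      }]
    }
  }]
}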
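To make the relationship between the descriptor and the control script concrete, the sketch below walks the Spark descriptor and derives the command line and the generated parameters file. It is a hypothetical model, not Cloudera Manager's actual logic: the key=value format of params.properties and both helper functions are assumptions.

```python
# The Spark descriptor from the slide, written as a Python dict.
descriptor = {
    "name": "spark",
    "roles": [{
        "name": "master",
        "startRunner": {
            "program": "scripts/control.sh",
            "args": ["start_master", "./params.properties"],
        },
        "parameters": [{"name": "master_port", "type": "port", "default": 7077}],
        "configWriter": {"generators": [{"filename": "params.properties"}]},
    }],
}


def start_command(role):
    """Command line implied by the role's startRunner section."""
    runner = role["startRunner"]
    return [runner["program"], *runner["args"]]


def render_properties(role, overrides=None):
    """Render parameters as key=value lines (assumed file format).

    Defaults come from the descriptor; overrides model values that an
    administrator has changed.
    """
    values = {p["name"]: p.get("default") for p in role.get("parameters", [])}
    values.update(overrides or {})
    return "".join(f"{key}={value}\n" for key, value in values.items())


master = descriptor["roles"][0]
print(start_command(master))
print(render_properties(master), end="")
print(render_properties(master, {"master_port": 7078}), end="")
```

The design point is that the script never hard-codes configuration: it is told which action to run and where to find the values that the management layer wrote for it.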


Next Steps

• Documentation & SDK as part of C5 Beta2 or later (definitely before GA!)

• Working with select ISVs (SAS, Syncsort, 0xData etc.) as part of the Beta to further fine-tune this feature


Develop & contribute your Cloudera Manager service extensibility plug-ins!


Vision of CM Extensibility

[Diagram: CM and CDH at the center, surrounded by extension points]
• Labels: Horizontal Extension, Vertical Extension, Service Extensibility, API, SNMP, Ops, Apps
• Boxes: Syncsort, Informatica, Security ISV’s, 0xData, Capacity Mgr, SLA Mgr, Cost Optimizer, SAS, Revolution, Spark, Giraph, Accumulo, Oracle OEM, Dell, Nagios, Chef/Puppet


Q&A
