Accumulo Summit 2015: Ambari and Accumulo: HDP 2.3 Upcoming Features [Sponsored]


© Hortonworks Inc. 2015

Accumulo Management, Monitoring, and Authentication

HDP 2.3 Upcoming Features

Billie Rinaldi @billierinaldi (@hortonworks)

Josh Elser @josh_elser (@hortonworks)


Outline

•Ambari support for Accumulo
  • Configuration, metrics, and alerting
•Kerberos user authentication
  • Specify “root” principal on init
  • All users with Kerberos principals may connect
  • Permissions still must be granted by the root principal
•Hadoop Metrics2 support
  • Accumulo metrics can be collected with Metrics2
  • Ganglia, Graphite, and Ambari Metrics System support

© Hortonworks Inc. 2015

Apache Ambari

Apache Ambari is a software project focused on provisioning, managing and monitoring Apache Hadoop clusters.[1]

Ambari alleviates the need to manually install RPMs/DEBs, create XML configuration files and configure custom monitoring for processes.

As Hadoop is often just a prerequisite for other software like Pig, Hive, or Accumulo, Ambari is extensible to support the provisioning, managing, and monitoring of other software projects in the Hadoop ecosystem.

Ambari also has great support for configuring Hadoop and other applications to run with Kerberos.

Ambari 2.1.0 will include initial support for Accumulo.

1. http://ambari.apache.org

Ambari Install

•Select HDP 2.3 Stack

[Screenshots of the Ambari install wizard walkthrough omitted]

Kerberos (in 5 seconds)

Kerberos is a protocol designed to allow authentication of trusted nodes over a (potentially) untrusted network.

Implemented throughout applications in the Hadoop ecosystem. Feels like SSO, no more pesky passwords!

Obtain a “ticket” once via Keytab file or password from the centralized Key Distribution Center (KDC).

Users (people) typically identify themselves with a password, services (computers) identify themselves with a keytab file. Protect keytabs like passwords.

Kerberos Principal: accumulo/[email protected]


Kerberos (in 5 seconds)

Accumulo has supported running on Kerberos-enabled HDFS since 1.5.0. ACCUMULO-2518 introduced support for clients to authenticate with Accumulo via Kerberos.

Apache Thrift does the majority of the work via SASL.

A cached Kerberos ticket is all that’s needed to connect:

$ kinit
Password for user@REALM:

# or ...

$ kinit -kt ~/.keytabs/user.keytab

# Launch the shell
$ accumulo shell
user@REALM@accumulo>


Kerberos (in 5 seconds)

Running Accumulo with Kerberos enabled will feel mostly the same as it does without. A few notable differences:

● Accumulo username is the full Kerberos principal○ [email protected]

● No “root” user. accumulo init requires a principal to be provided; that user receives the permissions the “root” user would normally have had.

● Running createuser is not required before a user can authenticate with Accumulo; a valid Kerberos identity is enough.○ New users still have no permissions out of the box
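As a sketch of how this looks with the 1.7 client API, the admin principal chosen at init time can connect with a KerberosToken and grant permissions to another Kerberos identity. The instance name, ZooKeeper quorum, and principals below are hypothetical placeholders:

```java
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.KerberosToken;
import org.apache.accumulo.core.security.SystemPermission;

public class GrantExample {
  public static void main(String[] args) throws Exception {
    // Assumes `kinit` was already run as the admin principal from `accumulo init`.
    // Instance name and ZooKeeper quorum are placeholders.
    Connector conn = new ZooKeeperInstance("accumulo", "zk1.example.com:2181")
        .getConnector("admin@EXAMPLE.COM", new KerberosToken());

    // A new Kerberos user can authenticate but starts with no permissions;
    // the admin grants them explicitly.
    conn.securityOperations()
        .grantSystemPermission("user@EXAMPLE.COM", SystemPermission.CREATE_TABLE);
  }
}
```

The same grant can be made interactively from the shell with grant System.CREATE_TABLE -s -u user@EXAMPLE.COM.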


Kerberos (in 5 seconds)

MapReduce is a special case: a locally cached ticket is insufficient because the job runs on other nodes.

Hadoop introduced the DelegationToken to work around the problem of safeguarding secret key material and to keep NodeManagers (previously TaskTrackers) from DDoS’ing the KDC.

Clients request a time-limited shared-secret from HDFS/YARN and pass it along with a Job. Tasks can then use this shared secret.

Accumulo has a similarly architected feature which clients can use to run MapReduce with Kerberos enabled.
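A minimal sketch of that client-side flow with the 1.7 API follows; the principal, instance name, and ZooKeeper quorum are hypothetical, and the remaining job setup is elided. The client, already authenticated via Kerberos, requests a delegation token and hands it to the job configuration:

```java
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.admin.DelegationTokenConfig;
import org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat;
import org.apache.accumulo.core.client.security.tokens.AuthenticationToken;
import org.apache.accumulo.core.client.security.tokens.KerberosToken;
import org.apache.hadoop.mapreduce.Job;

public class DelegationTokenExample {
  public static void main(String[] args) throws Exception {
    String principal = "user@EXAMPLE.COM"; // placeholder
    Connector conn = new ZooKeeperInstance("accumulo", "zk1.example.com:2181")
        .getConnector(principal, new KerberosToken());

    // Ask Accumulo for a time-limited shared secret instead of shipping
    // Kerberos credentials out to the tasks.
    AuthenticationToken token =
        conn.securityOperations().getDelegationToken(new DelegationTokenConfig());

    Job job = Job.getInstance();
    // Tasks authenticate with the delegation token, not the KDC.
    AccumuloInputFormat.setConnectorInfo(job, principal, token);
    // ... input format ranges and the rest of the job configuration elided ...
  }
}
```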


Metrics

One of the major goals of Ambari is monitoring.

Hadoop Metrics2 is a relatively old metrics framework provided by Hadoop designed around Sources and Sinks.

Unlike previous implementations, it can push metrics directly to systems like Ganglia or Graphite without the need for something like jmxtrans.

Complementary to the distributed tracing APIs that Accumulo (and HTrace) provide.


Metrics

MBeans exposed by the TabletServer, observed with JVisualVM. Metrics2 publishes these metrics for us automatically.


Metrics

With Accumulo 1.7.0, MBeans will be published automatically via Metrics2.

An example hadoop-metrics2-accumulo.properties is included with Accumulo’s example configuration files and serves as a template for enabling the FileSink, GangliaSink, and/or GraphiteSink.

The presence of this file on the Accumulo server process classpath will trigger the metrics sinks.
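As an illustration, a minimal hadoop-metrics2-accumulo.properties enabling a FileSink might look like the following. The sink name, filename, and Graphite host/port are placeholders; the class names are the standard Hadoop Metrics2 sinks:

```
# Poll metrics sources every 30 seconds
*.period=30

# Write all Accumulo metrics to a local file
accumulo.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink
accumulo.sink.file.filename=accumulo-metrics.out

# Or push to Graphite instead (host/port are placeholders)
# accumulo.sink.graphite.class=org.apache.hadoop.metrics2.sink.GraphiteSink
# accumulo.sink.graphite.server_host=graphite.example.com
# accumulo.sink.graphite.server_port=2003
```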

Metrics2 serves as a replacement for the existing metrics support within Accumulo; the old implementation can still be enabled via configuration if desired.

See the Accumulo 1.7.0 User Manual’s Administration section for more details on how to configure Metrics2.



Thank You!

[email protected]
[email protected]