Building Applications on YARN, Chris Riccomini, 10/11/2012

Building Applications on YARN

DESCRIPTION

On Friday, I presented, "Building Applications on YARN," at the Apache YARN meet-up at Hortonworks. This post contains my slides.

Building Applications on YARN

Chris Riccomini, 10/11/2012

Staff Software Engineer at LinkedIn

http://riccomini.name

@criccomini

What I Want to Talk About

Anatomy of a YARN Application

Things to consider when building your application

Architecture

Operations

Anatomy of a YARN App

Client

Application Master

Container Code

Resource Manager

Node Manager

[Diagram, simplified: Client, RM, NMs, AM, and containers (C)]

A lot to consider

Deployment

Metrics

Configuration

Security

Language

Logging

Fault Tolerance

Isolation

Dashboard

State

Deployment

HDFS

HTTP

File (NFS)

DDoS’ing your servers

What we do: Tarball over HTTP. Life is easier with HDFS, but operational overhead is too high.

Metrics

Application-level metrics

YARN-level metrics

metrics2

Containers are transient

What we do: Both app-level and framework-level metrics use the same metrics framework. Pipe to in-house metrics dashboard. We don’t use metrics2 since we don’t want a dependency on Hadoop in our core jar.
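A minimal sketch of what "one metrics framework for both levels" could look like, with no Hadoop dependency. The `MetricsRegistry` class, its method names, and the `framework`/`app` group labels are hypothetical and are not taken from the talk. Because containers are transient, the snapshot carries the container ID so an external dashboard can still attribute the numbers after the container is gone.

```python
import threading
from collections import defaultdict


class MetricsRegistry:
    """Tiny in-process registry shared by framework code and application code.

    Hypothetical illustration: the slide does not describe the real framework.
    """

    def __init__(self, container_id):
        self.container_id = container_id
        self._counters = defaultdict(int)
        self._lock = threading.Lock()

    def inc(self, group, name, amount=1):
        # group is e.g. "framework" or "app"; both share the same registry.
        with self._lock:
            self._counters[(group, name)] += amount

    def snapshot(self):
        # Return a plain dict that a reporter thread could ship to a dashboard.
        with self._lock:
            counters = {
                f"{group}.{name}": value
                for (group, name), value in sorted(self._counters.items())
            }
        return {"container_id": self.container_id, "counters": counters}
```

A container would call `registry.inc("framework", "heartbeats")` from framework code and `registry.inc("app", "messages", 10)` from task code, and a reporter would periodically send `registry.snapshot()` off the machine before the container exits.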

Configuration

YARN config (yarn-site.xml, core-site.xml, etc.)

Application Configuration

Transporting Configuration

What we do: Config is fully resolved at client execution time. No admin-override/locked config protection yet. Config is passed from client to AM to containers via environment variables.
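The client-to-AM-to-container hand-off can be sketched roughly as follows. This is an illustration only: the environment variable name `APP_CONFIG_JSON`, the JSON encoding, and all function names are assumptions, since the slide does not say how the config is serialized.

```python
import json
import os

# Hypothetical variable name; the talk does not name the real one.
CONFIG_ENV_VAR = "APP_CONFIG_JSON"


def resolve_config(defaults, overrides):
    """Fully resolve the config on the client: overrides win over defaults."""
    resolved = dict(defaults)
    resolved.update(overrides)
    return resolved


def config_to_env(config, base_env=None):
    """Return an environment dict carrying the resolved config as one JSON value."""
    env = dict(base_env or {})
    env[CONFIG_ENV_VAR] = json.dumps(config, sort_keys=True)
    return env


def config_from_env(env=None):
    """Read the config back inside the AM or a container."""
    env = os.environ if env is None else env
    return json.loads(env[CONFIG_ENV_VAR])
```

The client would build the AM's launch environment with `config_to_env(resolve_config(defaults, overrides))`, and the AM would forward the same value unchanged when it launches each container, so nothing is re-resolved downstream.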

Security

Kerberos?

Firewalls are your friend

Gateway machine

Dashboard

What we do: Firewall all YARN machines so they can only talk to each other. All users go through an LDAP-controlled dashboard.

Language

Favor complexity in the Application Master, and make container logic thin

Talk to RM via REST

Potential to talk to RM via Protobuf RPC

What we do: The AM is Java. The task side of the application has Python and Java implementations.

Logging

Local storage (application is running)

HDFS storage (application has stopped for a while)

Be careful with STDOUT/STDERR (rollover)

What we do: No HDFS. Logs sit for 7 days, then disappear. Not ideal.
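One reading of the STDOUT/STDERR warning is that redirected output files are not rolled over and can grow without bound in a long-running container. Under that assumption, a container process can write to a size-capped rotating file instead. The sketch below uses Python's standard `logging.handlers.RotatingFileHandler`; the function name, default sizes, and file naming are hypothetical.

```python
import logging
import logging.handlers
import os


def build_container_logger(log_dir, name="container",
                           max_bytes=10 * 1024 * 1024, backups=5):
    """Return a logger that writes to a size-capped, rotating file."""
    os.makedirs(log_dir, exist_ok=True)
    handler = logging.handlers.RotatingFileHandler(
        os.path.join(log_dir, name + ".log"),
        maxBytes=max_bytes,    # roll over once the current file would exceed this
        backupCount=backups,   # keep at most this many rolled files
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False   # keep records out of the root logger and stderr
    return logger
```

With these settings the disk used by one container's log is bounded by roughly `max_bytes * (backups + 1)`, independent of how long the container runs.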

Fault Tolerance

Failure matrix

HA RM/NM

Orphaned processes

Pay attention to process trees

What we do: No HA. Manual failover when RM dies. Orphaned process monitor (proc start time < RM start time).
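The orphan check on the slide (proc start time < RM start time) can be sketched as a pure function. The reasoning assumed here is that, without HA, a restarted RM knows nothing about containers launched before it came up, so any container process older than the RM is a candidate orphan. `ProcInfo`, `find_orphans`, and the `is_container_process` predicate are hypothetical names; how process start times and the RM start time are collected is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcInfo:
    pid: int
    start_time: float  # seconds since the epoch
    command: str


def find_orphans(processes, rm_start_time, is_container_process):
    """Return container processes that started before the current RM did."""
    return [
        p for p in processes
        if is_container_process(p) and p.start_time < rm_start_time
    ]
```

A monitor on each node could run this periodically and alert on, or kill, whatever it returns. Killing should target the whole process tree, per the process-tree warning on the slide.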

Isolation

Memory

Disk

CPU

Network

What we do: Nothing, right now. Hoping YARN will solve this before we need it (cgroups?).

Dashboard

Application-specific information

Integrate with YARN

Application Master or Standalone?

What we do: Dashboard enforces security, talks to RM/AM via HTTP/JSON to get information about jobs.
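As a rough illustration of the HTTP/JSON path, the sketch below reads the Resource Manager's cluster applications REST resource (`/ws/v1/cluster/apps`) and reduces it to the fields a dashboard might show. The function names and the choice of fields are assumptions, and the dashboard described in the talk may query the RM and AM differently.

```python
import json
from urllib.request import urlopen


def fetch_apps(rm_base_url, timeout=5):
    """GET the RM's application list, e.g. rm_base_url='http://rm-host:8088'."""
    url = rm_base_url.rstrip("/") + "/ws/v1/cluster/apps"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def summarize_apps(payload):
    """Reduce the RM response to a short list of per-application summaries."""
    # The RM returns {"apps": null} when no applications are known.
    apps = (payload.get("apps") or {}).get("app") or []
    return [
        {
            "id": app.get("id"),
            "name": app.get("name"),
            "state": app.get("state"),
            "user": app.get("user"),
        }
        for app in apps
    ]
```

Calling `summarize_apps(fetch_apps("http://rm-host:8088"))` (the host name is a placeholder) would yield one small dict per job; the dashboard would apply its own access checks before showing them, since in this setup the dashboard is where security is enforced.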

State

HDFS

Deployed with Application

Remote data store

What we do: Nothing, right now.

Takeaways

There’s a lot more than just the YARN API

Look for examples (Spark, Storm, Map-Reduce)

Decide your level of Hadoop integration

Metrics2

HDFS

Config

Kerberos and doAs

Questions?