Setting up Hadoop YARN Clustering

Page 1: Setting up Hadoop YARN Clustering

Danairat T., 2013, [email protected]: Big Data Hadoop – Hands On Workshop

Setting up Hadoop Clustering: Hands-On Workshop

Danairat T.

Line ID: Danairat

FB: Danairat Thanabodithammachari

+668-1559-1446, [email protected], Certified Java Programmer

Page 2: Setting up Hadoop YARN Clustering

Big Data Introduction

Volume: GB, TB, PB, EB, ZB

Variety: DB Table; Delimited Text; XML, HTML; Free Form Text; Image, Music, Video, Binary

Velocity: Batch, Near real time, Real time

Page 3: Setting up Hadoop YARN Clustering

Big Data Architecture

Big Data Applications: Fraud Analysis, Cyber Security, Talent Search, Next Best Action, BI/Report

Descriptive Analytics, Predictive Analytics, Prescriptive Analytics

Big Data Platform: Distributed Data Processing, Distributed Data Store and DWH, Integration and Metadata Framework, Monitoring and Management Framework, Security Framework

Big Data Infrastructure: Hardware, Storage, Network

Page 4: Setting up Hadoop YARN Clustering

Hadoop Timeline

Page 5: Setting up Hadoop YARN Clustering

Apache Hadoop Core Technology

j2eedev.org/ecosystem-hadoop

Page 6: Setting up Hadoop YARN Clustering

Apache Hadoop Ecosystem

j2eedev.org/ecosystem-hadoop

Page 7: Setting up Hadoop YARN Clustering

Big Data Platform & Big Data Analytics: Hadoop Technology

Page 8: Setting up Hadoop YARN Clustering

HDFS: Hadoop Distributed File System

Block Size = 64 MB

Replication Factor = 3

Cost/GB is a few ¢/month vs $/month

apache.org/hadoop/
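
To make the two settings above concrete, here is a small arithmetic sketch. It only assumes the slide's values (64 MB blocks, replication factor 3); the file size is a hypothetical example, and the function names are made up for illustration.

```shell
# Sketch: how a file maps onto HDFS blocks with the slide's settings.
BLOCK_MB=64        # block size from the slide
REPLICATION=3      # replication factor from the slide

# Number of blocks for a file of the given size in MB (rounded up).
blocks_needed() {
  echo $(( ($1 + BLOCK_MB - 1) / BLOCK_MB ))
}

# Raw cluster storage in MB consumed by the file across all replicas.
raw_storage_mb() {
  echo $(( $1 * REPLICATION ))
}

FILE_MB=1000       # hypothetical file size
echo "blocks: $(blocks_needed "$FILE_MB")"
echo "raw MB: $(raw_storage_mb "$FILE_MB")"
```

The last block of a file is usually only partly filled; HDFS stores just the actual bytes, so raw usage is the file size times the replication factor, not the block count times 64 MB.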

Page 9: Setting up Hadoop YARN Clustering

YARN: Yet Another Resource Negotiator

hadoop.apache.org

MRv2 maintains API compatibility with the previous stable release (hadoop-1.x). This means that all MapReduce jobs should still run unchanged on top of MRv2 with just a recompile.

Page 10: Setting up Hadoop YARN Clustering

Hadoop 1.0 vs Hadoop 2.0

Hortonworks.com

Page 11: Setting up Hadoop YARN Clustering

Hadoop 1.0 vs Hadoop 2.0

Hortonworks.com

Page 12: Setting up Hadoop YARN Clustering

Hadoop 2

Hortonworks.com

Page 13: Setting up Hadoop YARN Clustering

Hadoop Symbols and Reasons Behind

Page 14: Setting up Hadoop YARN Clustering

Clone the Hadoop master to slave1 and slave2

master

slave1

slave2

Page 15: Setting up Hadoop YARN Clustering

At master node: edit the hosts file
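
The screenshot for this step is not reproduced here, so the following is only a sketch of what the entries might look like. The slave addresses are taken from the scp and ssh slides, assuming slave1 is ip-172-31-1-8 and slave2 is ip-172-31-15-16; the master address and hostname are hypothetical placeholders that must be replaced with the real values.

```shell
# Hypothetical master address; the slides do not show it. Replace with the real one.
MASTER_IP="172.31.0.10"
MASTER_HOST="ip-172-31-0-10"

# Print the hosts-file lines for the three nodes.
hosts_entries() {
  printf '%s %s\n' "$MASTER_IP" "$MASTER_HOST"
  printf '%s %s\n' "172.31.1.8" "ip-172-31-1-8"       # assumed to be slave1
  printf '%s %s\n' "172.31.15.16" "ip-172-31-15-16"   # assumed to be slave2
}

hosts_entries
# To apply on Ubuntu (needs root): hosts_entries | sudo tee -a /etc/hosts
```

Printing the lines first and appending them in a separate step makes it easy to review the entries before touching the system file.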

Page 16: Setting up Hadoop YARN Clustering

At master node: copy the public key file to slave1 and slave2

scp /home/ubuntu/.ssh/id_dsa.pub ip-172-31-1-8:/home/ubuntu/.ssh/master.pub

scp /home/ubuntu/.ssh/id_dsa.pub 172.31.15.16:/home/ubuntu/.ssh/master.pub

Page 17: Setting up Hadoop YARN Clustering

After this slide, we will use 3 cascaded windows to represent the master node, slave1 node and slave2 node

master node

slave1 node

slave2 node

Page 18: Setting up Hadoop YARN Clustering

At slave1 and slave2: append the master's public key to authorized_keys

cat /home/ubuntu/.ssh/master.pub >> /home/ubuntu/.ssh/authorized_keys
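
Running the append twice adds the key twice, and sshd may ignore authorized_keys if its permissions are too open. A possible variant that guards against both is sketched below; the function name is made up, and the default directory is the one used on the slide.

```shell
# Append master.pub to authorized_keys only if it is not already present,
# then apply the restrictive permissions sshd normally expects.
install_master_key() {
  ssh_dir="${1:-/home/ubuntu/.ssh}"
  key_file="$ssh_dir/master.pub"
  auth_file="$ssh_dir/authorized_keys"

  touch "$auth_file"
  # -F fixed strings, -x whole-line match, -f read patterns from the key file
  if ! grep -qxFf "$key_file" "$auth_file"; then
    cat "$key_file" >> "$auth_file"
  fi
  chmod 700 "$ssh_dir"
  chmod 600 "$auth_file"
}
# Usage on each slave: install_master_key
```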

Page 19: Setting up Hadoop YARN Clustering

At master: test ssh to slave1 and slave2

$ ssh ip-172-31-1-8

$ exit

$ ssh ip-172-31-15-16

$ exit
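
The interactive test above can also be scripted. The sketch below uses OpenSSH's BatchMode option so that ssh fails instead of prompting when key-based login is not working; the function name is made up, and the hostnames in the usage line are the ones from the slide.

```shell
# Return success only if every listed host accepts key-based login without a prompt.
check_passwordless_ssh() {
  for host in "$@"; do
    # BatchMode=yes makes ssh fail instead of asking for a password.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
      echo "ok: $host"
    else
      echo "FAILED: $host" >&2
      return 1
    fi
  done
}
# Usage: check_passwordless_ssh ip-172-31-1-8 ip-172-31-15-16
```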

Page 20: Setting up Hadoop YARN Clustering

At master: add slave1 and slave2 to the Hadoop slaves file

Page 21: Setting up Hadoop YARN Clustering

At master: add slave1 and slave2 to the Hadoop slaves file
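
The slaves file lists one worker hostname per line. A minimal sketch of writing it follows, assuming the Hadoop 2.x layout where the file lives in $HADOOP_HOME/etc/hadoop; the function name is made up, and whether the master should also be listed as a worker depends on the workshop setup, which the slides do not show.

```shell
# Write one worker hostname per line to the slaves file in the given config dir.
# Note: this overwrites any existing slaves file.
write_slaves_file() {
  conf_dir="$1"
  shift
  printf '%s\n' "$@" > "$conf_dir/slaves"
}
# Usage, assuming the Hadoop 2.x config directory:
#   write_slaves_file "$HADOOP_HOME/etc/hadoop" ip-172-31-1-8 ip-172-31-15-16
```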

Page 22: Setting up Hadoop YARN Clustering

At master: edit hdfs-site.xml

Page 23: Setting up Hadoop YARN Clustering

At master: edit hdfs-site.xml to set the replication factor to 2
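
With two DataNodes, the relevant property is dfs.replication. The sketch below writes only that property to a scratch file so it can be merged by hand; it deliberately does not touch the real hdfs-site.xml, which in the workshop presumably also carries the NameNode and DataNode directory settings. The function name and the scratch path are made up for illustration.

```shell
# Write a minimal hdfs-site.xml fragment with dfs.replication=2 to the given file.
write_replication_fragment() {
  cat > "$1" <<'EOF'
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
EOF
}
# Usage: write_replication_fragment /tmp/hdfs-site.fragment.xml
# then copy the <property> element into $HADOOP_HOME/etc/hadoop/hdfs-site.xml
```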

Page 24: Setting up Hadoop YARN Clustering

At all nodes: remove the NameNode and DataNode directories
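
Because the slaves were cloned from the master, they carry the old storage directories; clearing them before formatting avoids DataNodes refusing to start over a mismatched cluster ID. The directory names in this sketch are assumptions, since the slides do not show them; use the values of dfs.namenode.name.dir and dfs.datanode.data.dir from your own hdfs-site.xml.

```shell
# Remove the NameNode and DataNode storage directories under a base directory.
# "namenode" and "datanode" are assumed names, not taken from the slides.
clear_hdfs_dirs() {
  base_dir="$1"
  # Refuse an empty argument so the rm below can never expand to /namenode.
  [ -n "$base_dir" ] || { echo "usage: clear_hdfs_dirs BASE_DIR" >&2; return 1; }
  rm -rf "$base_dir/namenode" "$base_dir/datanode"
}
# Usage (hypothetical base path): clear_hdfs_dirs /home/ubuntu/hadoop_data
```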

Page 25: Setting up Hadoop YARN Clustering

At master: format the NameNode

Page 26: Setting up Hadoop YARN Clustering

At master: format the NameNode
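
In Hadoop 2.x the format command is `hdfs namenode -format`. Since it wipes existing HDFS metadata, one option is to wrap it in a small guard, sketched below; the wrapper name and the "yes" argument are made up for illustration.

```shell
# Format the NameNode only when called with an explicit "yes".
format_namenode() {
  # Formatting erases existing HDFS metadata, so require confirmation.
  if [ "${1:-}" != "yes" ]; then
    echo "refusing to format; call: format_namenode yes" >&2
    return 1
  fi
  hdfs namenode -format
}
# Usage on the master only: format_namenode yes
```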

Page 27: Setting up Hadoop YARN Clustering

At master: Execute start-dfs.sh

Page 28: Setting up Hadoop YARN Clustering

At slave1: check the jps output; DataNode should be running

Page 29: Setting up Hadoop YARN Clustering

At slave2: check the jps output; DataNode should be running
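
Instead of reading the jps listing by eye, the check can be scripted. The sketch below relies on jps printing one "pid class-name" pair per line and matches the class name exactly, so NameNode does not match SecondaryNameNode; the function name is made up.

```shell
# Succeeds if the named daemon appears in jps output supplied on stdin.
daemon_listed() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}
# Usage on a node:            jps | daemon_listed DataNode
# Usage from master via ssh:  ssh ip-172-31-1-8 jps | daemon_listed DataNode
#   (the ssh form assumes jps is on the remote PATH)
```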

Page 30: Setting up Hadoop YARN Clustering

At master: Execute start-yarn.sh

Page 31: Setting up Hadoop YARN Clustering

At slave1: check the jps output; NodeManager should be running

Page 32: Setting up Hadoop YARN Clustering

At slave2: check the jps output; NodeManager should be running
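
As a complement to checking each slave, the master can report what has registered with it. The sketch below wraps two standard Hadoop 2.x commands; the wrapper name is made up, and with this workshop's setup one would expect two DataNodes and two NodeManagers to appear.

```shell
# Print a cluster-wide view from the master node.
cluster_summary() {
  hdfs dfsadmin -report   # live DataNodes and their capacity
  yarn node -list         # NodeManagers registered with the ResourceManager
}
# Usage on the master, after start-dfs.sh and start-yarn.sh: cluster_summary
```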

Page 33: Setting up Hadoop YARN Clustering

Importing data into HDFS Cluster

Page 34: Setting up Hadoop YARN Clustering

At master: import data into HDFS
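
The exact command on the slide is not reproduced here. A typical import with the standard `hdfs dfs` subcommands might look like the sketch below; the function name and both paths in the usage line are hypothetical.

```shell
# Copy a local file into an HDFS directory, creating the directory if needed.
import_to_hdfs() {
  [ "$#" -eq 2 ] || { echo "usage: import_to_hdfs LOCAL_FILE HDFS_DIR" >&2; return 2; }
  local_file="$1"
  hdfs_dir="$2"
  hdfs dfs -mkdir -p "$hdfs_dir"
  hdfs dfs -put "$local_file" "$hdfs_dir/"
  hdfs dfs -ls "$hdfs_dir"      # confirm the file arrived
}
# Usage (hypothetical paths): import_to_hdfs ./input.txt /inputs
```

Once the file is in HDFS, the same `hdfs dfs -ls` on either slave shows it, because all nodes talk to the same NameNode.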

Page 35: Setting up Hadoop YARN Clustering

At slave1: review the imported data in HDFS

Page 36: Setting up Hadoop YARN Clustering

At slave2: review the imported data in HDFS

Page 37: Setting up Hadoop YARN Clustering

Running MapReduce in Cluster Mode

Page 38: Setting up Hadoop YARN Clustering

At master: execute the YARN MapReduce program
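
The slides do not reproduce the command, but the output directory used later (/outputs/wordcount_output_dir01) suggests a word count job. The sketch below assumes the stock examples jar that ships with Hadoop 2.x and that HADOOP_HOME is set; the workshop may have used its own jar instead. The function name and the input path in the usage line are hypothetical.

```shell
# Submit the bundled wordcount example to YARN.
run_wordcount() {
  [ "$#" -ge 1 ] || { echo "usage: run_wordcount INPUT_DIR [OUTPUT_DIR]" >&2; return 2; }
  input_dir="$1"
  output_dir="${2:-/outputs/wordcount_output_dir01}"
  # The glob avoids hard-coding the Hadoop version in the jar name.
  hadoop jar "$HADOOP_HOME"/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar \
    wordcount "$input_dir" "$output_dir"
}
# Usage (hypothetical input path): run_wordcount /inputs
```

The output directory must not exist before the job starts; MapReduce refuses to overwrite it.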

Page 39: Setting up Hadoop YARN Clustering

At slave1, slave2: you will see the ApplicationMaster and YarnChild container processes

Page 40: Setting up Hadoop YARN Clustering

At master: review the output file in HDFS

Page 41: Setting up Hadoop YARN Clustering

At master: review the output file in HDFS

Page 42: Setting up Hadoop YARN Clustering

At slave1, slave2: review the output file in HDFS by using the command:

hdfs dfs -cat /outputs/wordcount_output_dir01/part-r-00000
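
If the job was a word count, each line of part-r-00000 is a word, a tab, and its count. For a small input, a local pipeline can produce the same shape for a spot check. This is a sketch that assumes whitespace tokenization and byte-order sorting, which is how the stock wordcount example behaves; the function name is made up.

```shell
# Count whitespace-separated words on stdin; print "word<TAB>count" sorted by word.
local_wordcount() {
  tr -s '[:space:]' '\n' | sed '/^$/d' | LC_ALL=C sort | uniq -c |
    awk '{ printf "%s\t%s\n", $2, $1 }'
}
# Usage (hypothetical local file): local_wordcount < ./input.txt
```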

Page 43: Setting up Hadoop YARN Clustering

At master: review the output data in the web console

Page 44: Setting up Hadoop YARN Clustering

At master: review the output data in the web console

Page 45: Setting up Hadoop YARN Clustering

At master: review the output data in the web console

Page 46: Setting up Hadoop YARN Clustering

At master: review the output data in the web console
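
The slides do not say which console is shown. In Hadoop 2.x the NameNode web UI listens on port 50070 by default and the ResourceManager UI on 8088, so the addresses are probably of the form printed by this sketch; the function name is made up, and the ports differ if the defaults were overridden.

```shell
# Print the default Hadoop 2.x web console URLs for a given master hostname.
console_urls() {
  master="$1"
  echo "http://$master:50070/"   # NameNode web UI: browse HDFS files
  echo "http://$master:8088/"    # ResourceManager web UI: applications and nodes
}
# Usage: console_urls <master hostname or public address>
```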

Page 47: Setting up Hadoop YARN Clustering

Stopping Hadoop Cluster

Page 48: Setting up Hadoop YARN Clustering

At master: execute stop-yarn.sh

Page 49: Setting up Hadoop YARN Clustering

At slave1: use jps to confirm that NodeManager has stopped

Page 50: Setting up Hadoop YARN Clustering

At slave2: use jps to confirm that NodeManager has stopped

Page 51: Setting up Hadoop YARN Clustering

At master: execute stop-dfs.sh

Page 52: Setting up Hadoop YARN Clustering

At slave1: use jps to confirm that DataNode has stopped

Page 53: Setting up Hadoop YARN Clustering

At slave2: use jps to confirm that DataNode has stopped
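
The shutdown steps can be collected into one sketch: stop YARN first, then HDFS, in the order used in this workshop, and then confirm that no worker daemons remain in the jps listing. Both function names are made up; the second reads jps output from stdin so it can be used locally or through ssh.

```shell
# Stop the cluster from the master in the order used in this workshop.
stop_cluster() {
  stop-yarn.sh   # stops ResourceManager and NodeManagers
  stop-dfs.sh    # stops NameNode, SecondaryNameNode and DataNodes
}

# Succeeds if none of the given daemon names appear in jps output read from stdin.
none_running() {
  awk -v names="$*" '
    BEGIN { n = split(names, a, " "); for (i = 1; i <= n; i++) want[a[i]] = 1 }
    ($2 in want) { bad = 1 }
    END { exit (bad ? 1 : 0) }'
}
# Usage on a slave: jps | none_running DataNode NodeManager
```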

Page 54: Setting up Hadoop YARN Clustering

Thank you very much
