Topics covered in the training
1. Module – 1 & Session - 1
a. Understanding Big Data Basics
b. Big Data Use Cases
c. Introduction to Hadoop
d. Understanding Hadoop Ecosystem
e. Introduction to HDFS
i. Introduction to Namenode
ii. Introduction to Datanode
iii. Introduction to Secondary Namenode
f. Introduction to MapReduce
i. Introduction to JobTracker
ii. Introduction to TaskTracker
g. Summarizing Hadoop Architecture
h. Roles and Responsibilities of a Hadoop Administrator
2. Module – 2 & Session – 2 & 3
a. Linux internals
i. Commands that are required
ii. Linux basics
b. Hadoop Cluster Installation Pre-requisites
i. Pre-requisites of Hadoop Installation
1. Software Download
2. Preparing your VM
3. Enabling VM with VMware
4. Understanding mandatory changes in the operating system
c. Installation and Configuration
i. Understanding Hadoop cluster installation modes
ii. Understanding Hadoop version 1 installation and configuration
iii. Passwordless SSH setup
3. Session - 4
a. Hands-On Practice for creating a Hadoop cluster
i. Helping individuals practice Hadoop cluster installation
4. Module – 3 & Session - 5
a. Hadoop Cluster Planning
i. Recommended Hadoop cluster configuration
1. Hardware/Software/Network
2. Recommended configuration for Master and Slave Nodes
3. Sample Base configuration
4. Different Hadoop Distributions in the market
b. Hadoop performance tuning
i. Important Hadoop tuning parameters to understand
ii. Hadoop Cluster Benchmarking Jobs – How to run the jobs
5. Module – 4 & Session – 6 & 7
a. Job Schedulers
i. FIFO Scheduler
ii. Fair Scheduler
b. Backup and Recovery
i. Data backup
ii. Meta-data backup
[email protected] www.collaberatact.com
iii. Hadoop Quotas
iv. Safemode
v. Hadoop Ports
c. DistCP
d. Security
i. How to secure your cluster using Kerberos
e. Upgrades
i. Upgrading Hadoop cluster from Hadoop 1 to Hadoop 2
6. Module – 5 & Session – 8
a. Hadoop 2.0 new features
b. YARN
i. Understanding Resource Manager
ii. Understanding Application Master
iii. Understanding Node Manager
iv. Understanding Hadoop 2 Job Execution Framework
c. Hadoop 2 Multi-node cluster creation
i. Pre-requisites of Hadoop Installation
ii. Software Download
iii. Preparing your VM
iv. Enabling VM with VMware
v. Understanding mandatory changes in the operating system
vi. Installation and Configuration
vii. Understanding Hadoop version 2 installation and configuration
viii. Passwordless SSH setup
7. Session - 9
a. Practice Hadoop 2 multi-node Cluster Creation
i. Helping individuals practice Hadoop 2 cluster installation
b. Sample Yarn Job execution
8. Module – 6 & Session – 10 & 11
a. Understanding Issues of Hadoop 1
b. Understanding improvements in Hadoop 2
c. Namenode Federation
i. Enable segregation of HDFS using multiple namenodes
d. Namenode – High Availability
i. Achieving Namenode High-Availability using Quorum Journal Manager
ii. Achieving Namenode High-Availability using Network File System
9. Session - 12
a. Implementation of NN High Availability
i. Helping individuals achieve Namenode High Availability
10. Module – 7 & Session – 13 & 14
a. Hadoop Ecosystem Introduction
i. Understanding the integration of Hadoop ecosystem
b. Touch base with Hive
i. What is Hive?
ii. Architecture of Hive
iii. Understanding Hive metastore concepts
c. HBase
i. Understanding HBase Basics
ii. Understanding HBase storage Model
iii. Understanding HBase Architecture
iv. Cluster Installation and Configuration
d. Pig
i. What is Pig?
ii. How does Pig integrate with a Hadoop cluster?
iii. Demo of Pig Jobs using MapReduce
e. Sqoop
i. What is Sqoop?
ii. How to import and export data between an RDBMS and Hadoop using Sqoop?
iii. Example of Sqoop jobs using MySQL
f. Flume
i. What is Flume?
ii. Sample Flume jobs
11. Module – 8 & Session - 15
a. Understanding the internals of Cloudera Manager
b. Understanding the automation of Hadoop installation using Cloudera Manager
c. Understanding Cloudera Hadoop Distribution and Cloudera Manager
d. Understanding the underlying directory structure of Cloudera Hadoop
e. Cloudera Hadoop Cluster Installation – CDH