www.VirtualNuggets.com [email protected]
India +91-8885560202 ; +91-40-64540202 USA +1-707-666-8949

Online Training, Corporate Training, Web Development, Software Development, SEO Services

Best Online Training Institute on Hadoop


Course Overview:

The course presents the material as small building blocks, with thorough coverage of each component in the Hadoop stack. We begin by looking at Hadoop's architecture and its underlying parts, with top-down identification of component interactions within the Hadoop ecosystem. The course then provides in-depth coverage of the Hadoop Distributed File System (HDFS), HBase, MapReduce, Oozie, Pig and Hive. To reinforce concepts, each section is followed by a set of hands-on exercises. The exercises vary in complexity to accommodate developers with different levels of expertise.

Course Contents:

1.) Introduction to Big Data and Hadoop
    What is Big Data?
    What are the challenges for processing big data?
    What technologies support big data?
    What is Hadoop?
    Why Hadoop?
    History of Hadoop
    Use Cases of Hadoop
    Hadoop Ecosystem
    HDFS
    MapReduce
    Statistics

2.) Understanding the Cluster
    Typical workflow
    Writing files to HDFS
    Reading files from HDFS
    Rack Awareness
    5 daemons

3.) Developing the MapReduce Application
    Configuring development environment - Eclipse
    Writing Unit Tests
    Running locally
    Running on Cluster
    MapReduce workflows

4.) How MapReduce Works
    Anatomy of a MapReduce job run
    Failures
    Job Scheduling
    Shuffle and Sort
    Task Execution

5.) MapReduce Types and Formats
    MapReduce Types
    Input Formats - input splits & records, text input, binary input, multiple inputs & database input

    Output Formats - text output, binary output, multiple outputs, lazy output & database output

6.) MapReduce Features
    Counters
    Sorting
    Joins - Map Side and Reduce Side
    Side Data Distribution
    MapReduce Combiner
    MapReduce Partitioner
    MapReduce Distributed Cache

7.) Hive and Pig
    Fundamentals
    When to Use Pig and Hive
    Concepts

8.) HBase
    CAP Theorem
    HBase Architecture and Concepts
    Programming