Hadoop Installation Guide

This tutorial covers installing Hadoop and setting up a single-node cluster on Ubuntu Linux. It is a short, easy tutorial compared with the hefty ones already on the web. Anyway, here we go.

What we want to do

1. Install the Sun Java JDK

2. Install hadoop-0.20

3. Install Hue (a file browser and job designer for HDFS)

Prerequisites

The following process was tested on Ubuntu 10.04 LTS and also on 11.04, but for stability I would personally prefer 10.04 LTS.

Download 10.04 or 11.04 from this link: http://www.ubuntu.com/download/ubuntu/download

Once Ubuntu is installed, Sun Java must be installed. Let's get into it.

Open a terminal (Applications > Terminal). Copy the commands in the following steps one at a time, paste each into the terminal, and press Enter after every step.

Step 1: Add the Canonical partner repository. To do so, paste the command below into the terminal.

sudo add-apt-repository "deb http://archive.canonical.com/ lucid partner"

Step 2: Update the package lists

sudo apt-get update

Step 3: Install sun-java6-jdk

sudo apt-get install sun-java6-jdk

Step 4: Quickly check that Sun's JDK is set up correctly

java -version
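
The check in Step 4 can also be scripted. The following is a minimal sketch, assuming the first line of the output looks like java version "1.6.0_26"; the helper names parse_java_version and is_java6 are made up for this illustration.

```shell
#!/bin/sh
# Extract the quoted version string from the first line of `java -version` output.
# Hypothetical helper; expects text such as: java version "1.6.0_26"
parse_java_version() {
    printf '%s\n' "$1" | sed -n '1s/.*version "\([^"]*\)".*/\1/p'
}

# Succeed only if the version string starts with 1.6 (Java 6).
is_java6() {
    case "$1" in
        1.6.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Only query the JVM if a java binary is actually on the PATH.
# `java -version` writes to stderr, hence the 2>&1 redirect.
if command -v java >/dev/null 2>&1; then
    version=$(parse_java_version "$(java -version 2>&1)")
    if is_java6 "$version"; then
        echo "Java 6 detected: $version"
    else
        echo "Found Java version '$version', expected 1.6.x"
    fi
fi
```

If the script reports anything other than a 1.6.x version, Step 3 probably did not complete.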

Now we move on to installing Hadoop.

Step 1: Ubuntu 10.04 is the Lucid release. Open the link below and save the file in the Downloads folder:

http://archive.cloudera.com/one-click-install/lucid/cdh3-repository_1.0_all.deb

Step 2: Install the downloaded package with the following command (run it from your home directory, since the path is relative)

sudo dpkg -i Downloads/cdh3-repository_1.0_all.deb
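
If you prefer to stay in the terminal, Steps 1 and 2 can be combined. This is only a sketch, assuming wget is installed; the function names are hypothetical, and the URL is the one given in Step 1.

```shell
#!/bin/sh
# URL of the CDH3 repository package, as given in Step 1.
CDH3_DEB_URL="http://archive.cloudera.com/one-click-install/lucid/cdh3-repository_1.0_all.deb"

# Print the file name part of a URL (everything after the last slash).
deb_name_from_url() {
    printf '%s\n' "${1##*/}"
}

# Download the package into ~/Downloads and install it with dpkg.
# Hypothetical helper; nothing runs until the function is called.
fetch_and_install_cdh3_repo() {
    dest="$HOME/Downloads/$(deb_name_from_url "$CDH3_DEB_URL")"
    mkdir -p "$HOME/Downloads"
    wget -O "$dest" "$CDH3_DEB_URL" && sudo dpkg -i "$dest"
}
```

Calling fetch_and_install_cdh3_repo performs the download and the install in one go; dpkg only runs if the download succeeded.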

Step 3: Update the package lists so that the newly added Cloudera repository is picked up

sudo apt-get update

Step 4: Now comes the final Hadoop installation

sudo apt-get install hadoop-0.20-conf-pseudo

We are done with the installation of Hadoop. Now we start the daemons and verify that all the components are working, using the command below:

for service in /etc/init.d/hadoop-0.20-*; do sudo $service start; done
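
Starting the services does not by itself show that they stayed up. One way to check is the jps tool that ships with the JDK, which lists running Java processes as "<pid> <class name>". The sketch below assumes the five daemons of a pseudo-distributed Hadoop 0.20 node; missing_daemons is a made-up helper name.

```shell
#!/bin/sh
# The five daemons a pseudo-distributed Hadoop 0.20 node is expected to run.
EXPECTED_DAEMONS="NameNode SecondaryNameNode DataNode JobTracker TaskTracker"

# Read jps output on stdin ("<pid> <class name>" per line) and print
# every expected daemon that does not appear in it.
missing_daemons() {
    running=$(awk '{print $2}')
    for daemon in $EXPECTED_DAEMONS; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode.
        printf '%s\n' "$running" | grep -qx "$daemon" || echo "$daemon"
    done
}
```

A typical use would be sudo jps | missing_daemons, where no output means all five daemons were found. Running jps through sudo is an assumption here: the daemons usually run under their own user accounts, and jps may need to be on root's PATH.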

Hadoop is now set up. If you are comfortable with the Hadoop commands, you can start working with Hadoop right away. If you want to say adios to the command line, or if you are new to Hadoop, I would recommend using Hue, a file browser and job designer for HDFS.

Now we move on to installing Hue (Hadoop User Experience).

Step 1: Install Cloudera Hue

sudo apt-get install hue

Once Hue is installed, continue with the next step.

Step 2: Open a new terminal and type the command

sudo gedit

Step 3: Go to the file location Places > Computer > File System > etc > hive > conf and find the hive-site.xml file (to edit it, drag the file into the gedit window). Make the change shown below, then save the file.

Change the location of the tables by changing the value of the javax.jdo.option.ConnectionURL property:

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby:;databaseName=/usr/share/hue/metastore_db;create=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
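
After saving, you can confirm from the terminal that the value was changed. This is a rough sketch, assuming the <name> and <value> elements sit on separate lines as in the snippet above; property_value is a hypothetical helper, not part of Hive or Hue.

```shell
#!/bin/sh
# Print the <value> that follows a given <name> in a Hadoop-style XML config.
# $1 = config file, $2 = property name
property_value() {
    awk -v name="$2" '
        # Remember that we have seen the wanted <name> line.
        index($0, "<name>" name "</name>") { found = 1; next }
        # The next line containing <value> belongs to that property.
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}
```

For example, property_value /etc/hive/conf/hive-site.xml javax.jdo.option.ConnectionURL should print the JDBC string shown above. This is plain text matching rather than real XML parsing, so it only suits simple files like this one.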

Step 4: Open gedit once again to provide a secret key

sudo gedit

Step 5: Open the /etc/hue/hue.ini configuration file (go to its location and drag the file into the gedit window). In the [desktop] section, set secret_key to a long series of random characters (30 to 60 characters is recommended), as in the example below, and save the file.

[desktop]

secret_key=jFE93j;2[290-eiw.KEiwN2s3['d;/.q[eIW^y#e=+Iei*@Mn<qW5o
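
Rather than typing random characters by hand, you can let the system generate them. The sketch below assumes /dev/urandom is available (it is on Ubuntu); generate_secret_key is a made-up helper name, and it sticks to letters and digits, which is a simplification compared with the punctuation-heavy example above.

```shell
#!/bin/sh
# Print a random alphanumeric string of the requested length (default 50).
generate_secret_key() {
    length="${1:-50}"
    # Keep only letters and digits from the random byte stream,
    # then cut the stream off after the requested number of characters.
    LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "$length"
    echo
}
```

Running generate_secret_key 50 prints a 50-character string that can be pasted after secret_key= in hue.ini.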

Step 6: Start Hue by running

sudo /etc/init.d/hue start

Step 7: To use Hue, open a web browser and go to: http://localhost:8088/
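
Before opening the browser, you can check from the terminal whether Hue is answering. This is a minimal sketch, assuming curl is installed; http_status and is_reachable_status are hypothetical helper names, and treating any 2xx or 3xx code as "up" is a choice made for this example.

```shell
#!/bin/sh
# Address of the Hue web interface, as given in Step 7.
HUE_URL="http://localhost:8088/"

# Print the HTTP status code returned for a URL.
# curl prints 000 when it cannot connect at all.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Treat any 2xx or 3xx status as "the server is answering".
is_reachable_status() {
    case "$1" in
        2??|3??) return 0 ;;
        *)       return 1 ;;
    esac
}
```

For example: is_reachable_status "$(http_status "$HUE_URL")" && echo "Hue is up". A redirect (3xx) is accepted because a web interface may redirect to a login page.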

If an error occurs, stop Hue first and then the Hadoop daemons. Stop Hue with:

sudo /etc/init.d/hue stop

Now stop the Hadoop daemons (an error may appear here, but the misconfiguration will be fixed in a while):

for service in /etc/init.d/hadoop-0.20-*; do sudo $service stop; done

Now restart the Hadoop daemons:

for service in /etc/init.d/hadoop-0.20-*; do sudo $service start; done

Restart Hue:

sudo /etc/init.d/hue start
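
The stop and restart sequence above can be wrapped in one function so the order is always the same. This is only a sketch; restart_hue_stack and run_cmd are made-up names, and the DRY_RUN switch is an addition for this example that prints the commands instead of running them.

```shell
#!/bin/sh
# Run a command through sudo, or just print it when DRY_RUN=1.
run_cmd() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "sudo $*"
    else
        sudo "$@"
    fi
}

# Stop Hue, stop Hadoop, start Hadoop, start Hue, in that order.
# $1 = init script directory (defaults to /etc/init.d)
restart_hue_stack() {
    initd="${1:-/etc/init.d}"
    run_cmd "$initd/hue" stop
    for service in "$initd"/hadoop-0.20-*; do
        [ -e "$service" ] || continue   # skip if the pattern matched nothing
        run_cmd "$service" stop
    done
    for service in "$initd"/hadoop-0.20-*; do
        [ -e "$service" ] || continue
        run_cmd "$service" start
    done
    run_cmd "$initd/hue" start
}
```

Try DRY_RUN=1 restart_hue_stack first to see what would be executed, then call restart_hue_stack without it.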

Now the following packages must be installed to avoid misconfiguration in the Hue browser.

Step 8: Install Flume

sudo apt-get install flume

Step 9: Install HBase

sudo apt-get install hadoop-hbase

Step 10: Install Hadoop Pig

sudo apt-get install hadoop-pig

Now we cross-check that the entire configuration is working. Open Hue in a browser at http://localhost:8088/

To check that all the daemons are working, open the File Browser and the Job Browser from the dock provided. If they work, we are done.

----THE END---- -Sriram
