11. Local (Standalone) Mode (1/7)
Download Hadoop from the Apache Hadoop site (http://hadoop.apache.org/). Hadoop 2.6.0 and Hadoop 2.7.0 are available; this walkthrough uses Hadoop 2.6.0. Fetch hadoop-2.6.0.tar.gz with wget (an rpm package is an alternative), then unpack it:

    ~# wget http://apache.cs.pu.edu.tw//hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
    ~# tar zxvf hadoop-2.6.0.tar.gz

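The download step can be parameterized by version so the same commands work for another release. The following is a minimal sketch, not part of the original slides: the function name hadoop_tarball_url is hypothetical, and the default mirror is the one shown on the slide (written with a single slash), which may no longer be reachable.

```shell
# Build the download URL for a Hadoop release tarball.
# The default mirror is taken from the slide and is an assumption;
# pass a different base URL as the second argument if it is unavailable.
hadoop_tarball_url() {
  local version="$1"
  local mirror="${2:-http://apache.cs.pu.edu.tw/hadoop/common}"
  # ${mirror%/} drops one trailing slash so the path has no doubled slash.
  printf '%s/hadoop-%s/hadoop-%s.tar.gz\n' "${mirror%/}" "$version" "$version"
}

# Usage (not executed here):
#   wget "$(hadoop_tarball_url 2.6.0)"
#   tar zxvf hadoop-2.6.0.tar.gz
```

Keeping the version in one place avoids mismatches between the file that is downloaded and the file that is unpacked.
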
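12. Local (Standalone) Mode (2/7)
Move the extracted hadoop-0.20.2 directory to /opt and rename it hadoop:

    ~# mv hadoop-0.20.2 /opt/hadoop

Hadoop needs to know where Java is installed. Change into the hadoop directory and open conf/hadoop-env.sh with vi:

    ~# cd /opt/hadoop/
    /hadoop# vi conf/hadoop-env.sh
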
13. Local (Standalone) Mode (3/7)
In hadoop-env.sh, set the JAVA_HOME path (export JAVA_HOME=/usr/jdk1.7.0_45). To avoid IPv6-related problems, also add export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true to hadoop-env.sh so that Hadoop prefers IPv4.

    # Command specific options appended to HADOOP_OPTS when specified
    ... ... ...
    export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"
    export JAVA_HOME=/usr/jdk1.7.0_45                      # JAVA_HOME path
    export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true     # prefer IPv4

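The two exports above can also be appended without opening an editor. This is a minimal sketch under stated assumptions: set_export is a hypothetical helper, not a Hadoop tool, and it only checks for an uncommented "export VAR=" line, so a commented-out template line in hadoop-env.sh is ignored.

```shell
# Append "export VAR=VALUE" to a file unless VAR is already exported there.
# Running it twice leaves a single export line per variable.
set_export() {
  local file="$1" var="$2" value="$3"
  if grep -q "^export ${var}=" "$file" 2>/dev/null; then
    echo "skip: ${var} already exported in ${file}"
  else
    printf 'export %s=%s\n' "$var" "$value" >> "$file"
    echo "added: ${var}"
  fi
}

# Usage (not executed here), with the values from the slide:
#   set_export conf/hadoop-env.sh JAVA_HOME /usr/jdk1.7.0_45
#   set_export conf/hadoop-env.sh HADOOP_OPTS -Djava.net.preferIPv4Stack=true
```

The JDK path is the one used on the slide; adjust it to wherever Java is installed on the host.
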
14. Local (Standalone) Mode (4/7)
Hadoop is now set up in Local (Standalone) Mode. Run bin/hadoop to confirm that it starts; if the usage message below does not appear, recheck the JAVA_HOME setting in conf/hadoop-env.sh.

    /hadoop# bin/hadoop
    Usage: hadoop [--config confdir] COMMAND
    where COMMAND is one of:
      namenode -format     format the DFS filesystem
      ... ... ...
     or
      CLASSNAME            run the class named CLASSNAME
    Most commands print help when invoked w/o parameters.

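20. Pseudo-Distributed Mode (3/9)
Edit hdfs-site.xml:

    /hadoop# vi conf/hdfs-site.xml

Inside the configuration element of hdfs-site.xml, set dfs.replication to 1:

    <property>
      <name>dfs.replication</name>
      <value>1</value>
    </property>
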
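21. Pseudo-Distributed Mode (4/9)
Edit mapred-site.xml:

    /hadoop# vi conf/mapred-site.xml

Inside the configuration element of mapred-site.xml, set mapred.job.tracker to localhost:9001:

    <property>
      <name>mapred.job.tracker</name>
      <value>localhost:9001</value>
    </property>
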
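Both site files follow the same pattern: a configuration element that wraps name/value properties. As a rough sketch, the helper below prints such a file for a single property; hadoop_site_xml is a hypothetical name, and redirecting its output overwrites the target file, so it only suits files that hold nothing else.

```shell
# Print a Hadoop configuration file that contains exactly one property.
hadoop_site_xml() {
  local name="$1" value="$2"
  cat <<EOF
<?xml version="1.0"?>
<configuration>
  <property>
    <name>${name}</name>
    <value>${value}</value>
  </property>
</configuration>
EOF
}

# Usage (not executed here), with the values from the slides:
#   hadoop_site_xml dfs.replication 1 > conf/hdfs-site.xml
#   hadoop_site_xml mapred.job.tracker localhost:9001 > conf/mapred-site.xml
```

A replication factor of 1 matches a single-node setup, where there is only one DataNode to hold each block.
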
22. Pseudo-Distributed Mode (5/9)
Set up SSH. Hadoop uses ssh to start its daemons, so first test an ssh connection to localhost (type yes and press Enter at the prompt):

    ~# ssh localhost
    The authenticity of host 'localhost (127.0.0.1)' can't be established.
    RSA key fingerprint is
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
    root@localhost's password:

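23. Pseudo-Distributed Mode (6/9)
Press Ctrl + C to cancel the password prompt. Generate an RSA key pair with an empty passphrase and add the public key to authorized_keys:

    ~# ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ""
    ~# cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys

Connect again; the login should now succeed without a password. Type exit to leave the session:

    ~# ssh localhost
    Last login: Mon May 16 10:04:39 2011 from localhost
    ~# exit
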
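The cp command above replaces any existing authorized_keys file. Where other keys are already authorized, appending is the safer choice. The sketch below is one way to do that; authorize_key is a hypothetical helper, and the chmod reflects the common sshd behaviour of refusing key files with loose permissions, which is assumed here rather than taken from the slides.

```shell
# Append a public key to an authorized_keys file unless it is already listed,
# then restrict the file to its owner.
authorize_key() {
  local pubkey_file="$1" auth_file="$2"
  local key
  key="$(cat "$pubkey_file")"
  mkdir -p "$(dirname "$auth_file")"
  touch "$auth_file"
  # -x matches whole lines, -F treats the key as a literal string.
  if ! grep -qxF -e "$key" "$auth_file"; then
    printf '%s\n' "$key" >> "$auth_file"
  fi
  chmod 600 "$auth_file"
}

# Usage (not executed here), after running ssh-keygen as on the slide:
#   authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```

Because the helper checks for the key first, running it repeatedly does not add duplicate entries.
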
24. Pseudo-Distributed Mode (7/9)
Before starting Hadoop for the first time, format HDFS with bin/hadoop namenode -format:

    /hadoop# bin/hadoop namenode -format
    11/05/16 10:20:27 INFO namenode.NameNode: STARTUP_MSG:
    /************************************************************
    STARTUP_MSG: Starting NameNode
    ... ... ...
    11/05/16 10:20:28 INFO namenode.NameNode: SHUTDOWN_MSG:
    /************************************************************
    SHUTDOWN_MSG: Shutting down NameNode at localhost/127.0.0.1
    ************************************************************/

25. Pseudo-Distributed Mode (8/9)
Run bin/start-all.sh to start all Hadoop daemons, including the JobTracker and TaskTracker:

    /hadoop# bin/start-all.sh
    starting namenode, logging to /opt/hadoop/bin/../logs/hadoop-root-namenode-Host01.out
    localhost: starting datanode, logging to /opt/hadoop/bin/../logs/hadoop-root-datanode-Host01.out
    localhost: starting secondarynamenode, logging to /opt/hadoop/bin/../logs/hadoop-root-secondarynamenode-Host01.out
    starting jobtracker, logging to /opt/hadoop/bin/../logs/hadoop-root-jobtracker-Host01.out
    localhost: starting tasktracker, logging to /opt/hadoop/bin/../logs/hadoop-root-tasktracker-Host01.out

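The startup messages only say that each daemon was launched, not that it stayed up. One common check is the JDK's jps tool, which lists running Java processes by class name. The sketch below reads jps output and reports any of the five daemons that are absent; missing_daemons is a hypothetical helper, not part of Hadoop.

```shell
# Read `jps` output ("PID ClassName" per line) on stdin and print the name of
# each expected Hadoop daemon that does not appear in it.
missing_daemons() {
  local running daemon
  running="$(cat)"
  for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    # -x requires a whole-line match, so NameNode does not match SecondaryNameNode.
    if ! printf '%s\n' "$running" | awk '{print $2}' | grep -qx "$daemon"; then
      echo "$daemon"
    fi
  done
}

# Usage (not executed here):
#   jps | missing_daemons
```

No output means all five daemons were found; for any name that is printed, the matching log file under logs/ is the place to look.
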
49. HBase (8/9)
Start HBase:

    /hbase# bin/start-hbase.sh
    Host02: starting zookeeper, logging to /tmp/hadoop/hbase-logs/hbase-root-zookeeper-Host02.out
    Host01: starting zookeeper, logging to /tmp/hadoop/hbase-logs/hbase-root-zookeeper-Host01.out
    starting master, logging to /tmp/hadoop/hbase-logs/hbase-root-master-Host01.out
    Host02: starting regionserver, logging to /tmp/hadoop/hbase-logs/hbase-root-regionserver-Host02.out

50. HBase (9/9)
Run bin/hbase shell to enter the HBase shell, then use list to show the tables in HBase (type list and press Enter):

    /hbase# bin/hbase shell
    HBase Shell; enter 'help' for list of supported commands.
    Type "exit" to leave the HBase Shell
    Version 0.90.2, r1085860, Sun Mar 27 13:52:43 PDT 2011

    hbase(main):001:0> list
    TABLE
    0 row(s) in 0.3950 seconds

    hbase(main):002:0>

53. Hadoop (2/7)
HDFS is operated with bin/hadoop fs. Run it without arguments to list the available options:

    /hadoop# bin/hadoop fs
    Usage: java FsShell
               [-ls <path>]
               [-lsr <path>]
               [-du <path>]
    ... ... ...
    -files     specify comma separated files to be copied to the map reduce cluster
    -libjars   specify comma separated jar files to include in the classpath.
    -archives  specify comma separated archives to be unarchived on the compute machines.

    The general command line syntax is
    bin/hadoop command [genericOptions] [commandOptions]

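54. Hadoop (3/7)
To run a MapReduce job on Hadoop, package it as a jar and submit it with:

    bin/hadoop jar [MapReduce Job jar] [Job name] [Job arguments]

Hadoop ships with hadoop-0.20.2-examples.jar, which contains example jobs such as grep, wordcount, and pi. Running the jar without a job name lists them:

    /hadoop# bin/hadoop jar hadoop-0.20.2-examples.jar
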
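As an illustration of the submit syntax, the sketch below copies a local directory into HDFS, runs the bundled wordcount job on it, and prints the result. It is a sketch under stated assumptions: run_wordcount, HADOOP_BIN, and EXAMPLES_JAR are hypothetical names introduced here (they are not Hadoop settings), and the HDFS directory names input and output are arbitrary.

```shell
# Run the bundled wordcount example end to end.
# HADOOP_BIN and EXAMPLES_JAR are hypothetical overrides; setting
# HADOOP_BIN=echo prints the commands instead of running them.
run_wordcount() {
  local hadoop="${HADOOP_BIN:-bin/hadoop}"
  local jar="${EXAMPLES_JAR:-hadoop-0.20.2-examples.jar}"
  local local_input="$1"
  # 1. Copy the local input into the HDFS directory "input".
  # 2. Submit wordcount with <in> = input and <out> = output.
  # 3. Print the result files; the glob is quoted so that HDFS expands it.
  "$hadoop" fs -put "$local_input" input &&
  "$hadoop" jar "$jar" wordcount input output &&
  "$hadoop" fs -cat 'output/part-*'
}

# Usage (not executed here), from /opt/hadoop:
#   run_wordcount conf
```

The job fails if the output directory already exists in HDFS, so remove it or choose another name before running the job a second time.
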
56. Hadoop (5/7)
Use bin/hadoop job to inspect jobs. The -list all option lists every submitted job; the State column uses the numeric codes from the legend in the output:

    /hadoop# bin/hadoop job -list all
    5 jobs submitted
    States are:
            Running : 1     Succeded : 2    Failed : 3      Prep : 4
    JobId                   State   StartTime       UserName  Priority  SchedulingInfo
    job_201105162211_0001   2       1305555169692   root      NORMAL    NA
    job_201105162211_0002   2       1305555869142   root      NORMAL    NA
    job_201105162211_0003   2       1305555912626   root      NORMAL    NA
    job_201105162211_0004   2       1305633307809   root      NORMAL    NA
    job_201105162211_0005   2       1305633347357   root      NORMAL    NA

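The numeric State column is easier to read once it is mapped back to the names in the legend. The following sketch does that with awk; decode_job_states is a hypothetical helper, and the mapping is taken directly from the "States are" line in the output above.

```shell
# Read `bin/hadoop job -list all` output on stdin and print "JobId STATE" for
# each job line, using the legend 1 Running, 2 Succeeded, 3 Failed, 4 Prep.
decode_job_states() {
  awk '
    BEGIN { name[1] = "RUNNING"; name[2] = "SUCCEEDED"; name[3] = "FAILED"; name[4] = "PREP" }
    # Only lines whose first field is a job ID are data rows.
    $1 ~ /^job_/ { print $1, (($2 in name) ? name[$2] : "UNKNOWN(" $2 ")") }
  '
}

# Usage (not executed here):
#   bin/hadoop job -list all | decode_job_states
```

Header and legend lines are skipped because their first field does not start with job_.
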
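58. Hadoop (7/7)
For other job operations, run bin/hadoop job without arguments to see the available options:

    /hadoop# bin/hadoop job
    Usage: JobClient <command> <args>
            [-submit <job-file>]
            [-status <job-id>]
            [-counter <job-id> <group-name> <counter-name>]
            [-kill <job-id>]
            [-set-priority <job-id> <priority>]. Valid values for priorities are: VERY_HIGH HIGH NORMAL LOW VERY_LOW
    ... ... ...
    The general command line syntax is
    bin/hadoop command [genericOptions] [commandOptions]
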
59. Hadoop and HBase

60. HBase (1/10)
This example stores the following table in HBase:

    name   student ID   course : math   course : history
    John   1            80              85
    Adam   2            75              90

Run bin/hbase shell to enter the HBase shell:

    /hbase# bin/hbase shell
    HBase Shell; enter 'help' for list of supported commands.
    Type "exit" to leave the HBase Shell
    Version 0.90.2, r1085860, Sun Mar 27 13:52:43 PDT 2011

    hbase(main):001:0>

61. HBase (2/10)
Create a table named scores with studentid and course as its columns (in HBase terms, column families).

    > create [table], [column1], [column2], ...

    hbase(main):001:0> create 'scores', 'studentid', 'course'
    0 row(s) in 1.8970 seconds

Use list to show the tables in HBase:

    hbase(main):002:0> list
    TABLE
    scores
    1 row(s) in 0.0170 seconds

62. HBase (3/10)
Show the structure of a table.

    > describe [table]

    hbase(main):003:0> describe 'scores'
    DESCRIPTION                                          ENABLED
    BLOCKCACHE => 'true'}]}
    1 row(s) in 0.0260 seconds

Insert a row John into scores with the studentid column set to 1.

    > put [table], [row], [column], [value]

    hbase(main):004:0> put 'scores', 'John', 'studentid:', '1'
    0 row(s) in 0.0600 seconds

63. HBase (4/10)
Set John's course:math column to 80:

    hbase(main):005:0> put 'scores', 'John', 'course:math', '80'
    0 row(s) in 0.0100 seconds

Set John's course:history column to 85:

    hbase(main):006:0> put 'scores', 'John', 'course:history', '85'
    0 row(s) in 0.0080 seconds

64. HBase (5/10)
Add Adam in the same way: studentid is 2, course:math is 75, and course:history is 90.

    hbase(main):007:0> put 'scores', 'Adam', 'studentid:', '2'
    0 row(s) in 0.0130 seconds

    hbase(main):008:0> put 'scores', 'Adam', 'course:math', '75'
    0 row(s) in 0.0100 seconds

    hbase(main):009:0> put 'scores', 'Adam', 'course:history', '90'
    0 row(s) in 0.0080 seconds

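Typing three put commands per student does not scale beyond a few rows. As a rough sketch, the put commands can be generated from whitespace-separated rows; scores_to_puts is a hypothetical helper, and it assumes that names and values contain no spaces or single quotes.

```shell
# Turn "name studentid math history" rows on stdin into HBase shell put
# commands for the scores table, three commands per row.
scores_to_puts() {
  local name id math history
  while read -r name id math history; do
    printf "put 'scores', '%s', 'studentid:', '%s'\n" "$name" "$id"
    printf "put 'scores', '%s', 'course:math', '%s'\n" "$name" "$math"
    printf "put 'scores', '%s', 'course:history', '%s'\n" "$name" "$history"
  done
}

# Usage (not executed here):
#   printf 'John 1 80 85\nAdam 2 75 90\n' | scores_to_puts
```

The generated lines can be pasted into the HBase shell. Piping them into bin/hbase shell on standard input is a common approach, but whether the shell accepts piped input in this HBase version is an assumption that should be checked first.
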
65. HBase (6/10)
Show all data in the scores table.

    > scan [table]

    hbase(main):011:0> scan 'scores'
    ROW     COLUMN+CELL
     Adam   column=course:history, timestamp=1305704304053, value=90
     Adam   column=course:math, timestamp=1305704282591, value=75
     Adam   column=studentid:, timestamp=1305704186916, value=2
     John   column=course:history, timestamp=1305704046378, value=85
     John   column=course:math, timestamp=1305703949662, value=80
     John   column=studentid:, timestamp=1305703742527, value=1
    2 row(s) in 0.0420 seconds

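The scan output lists one cell per line. To compare it with the original table, it can help to reduce each line to a row, column, and value triple. The sketch below does this with awk; scan_to_triples is a hypothetical helper, and it assumes the line layout shown above with values that contain no spaces.

```shell
# Read HBase shell scan output on stdin and print "row column value" for each
# cell line; header and summary lines are skipped.
scan_to_triples() {
  awk '$2 ~ /^column=/ {
    col = $2; sub(/^column=/, "", col); sub(/,$/, "", col)   # strip prefix and trailing comma
    val = $4; sub(/^value=/, "", val)                        # strip "value=" prefix
    print $1, col, val
  }'
}

# Usage (not executed here): save the scan output to a file, then
#   scan_to_triples < scan-output.txt
```

Timestamps are dropped here because the example table keeps a single version of each cell.
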
66. HBase (7/10)
Show the data for the row John in the scores table.

    > get [table], [row]

    hbase(main):010:0> get 'scores', 'John'
    COLUMN            CELL
     course:history   timestamp=1305704046378, value=85
     course:math      timestamp=1305703949662, value=80
     studentid:       timestamp=1305703742527, value=1
    3 row(s) in 0.0440 seconds

67. HBase (8/10)
Show only the course column family of the scores table.

    > scan [table], {COLUMNS => [column family]}

    hbase(main):011:0> scan 'scores', {COLUMNS => 'course:'}
    ROW     COLUMN+CELL
     Adam   column=course:history, timestamp=1305704304053, value=90
     Adam   column=course:math, timestamp=1305704282591, value=75
     John   column=course:history, timestamp=1305704046378, value=85
     John   column=course:math, timestamp=1305703949662, value=80
    2 row(s) in 0.0250 seconds

68. HBase (9/10)
Show several columns of the scores table at once.

    > scan [table], {COLUMNS => [[column1], [column2], ...]}

    hbase(main):012:0> scan 'scores', {COLUMNS => ['studentid','course:']}
    ROW     COLUMN+CELL
     Adam   column=course:history, timestamp=1305704304053, value=90
     Adam   column=course:math, timestamp=1305704282591, value=75
     Adam   column=studentid:, timestamp=1305704186916, value=2
     John   column=course:history, timestamp=1305704046378, value=85
     John   column=course:math, timestamp=1305703949662, value=80
     John   column=studentid:, timestamp=1305703742527, value=1
    2 row(s) in 0.0290 seconds

69. HBase (10/10)
To delete a table, disable it first and then drop it:

    hbase(main):003:0> disable 'scores'
    0 row(s) in 2.1510 seconds

    hbase(main):004:0> drop 'scores'
    0 row(s) in 1.7780 seconds