Hadoop MapReduce: A Brief Analysis of the Process

MapReduce


Author: 刘堃 | Email: [email protected] | Blog: www.topdigger.tk


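File: the input files to be processed

Files are stored in HDFS. Each file is split into fixed-size Blocks (64 MB by default), and each Block is replicated (3 replicas by default) across multiple nodes (DataNodes).

Sample file content: This is a mapreduce ppt\nThis is a mapreduce ppt\n……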

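InputFormat:

Defines the data format, e.g. records are separated by "\n", and the target words inside a record are separated by " " (a space).

"This is a mapreduce ppt\n" is one record.

"This", "is", etc. are each one target word.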
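The record and word boundaries described above can be illustrated with a minimal sketch. It is not Hadoop's InputFormat API; the function name `parse_records` is made up for this example, and it works on an in-memory string rather than on HDFS data.

```python
def parse_records(text: str) -> list[list[str]]:
    # Records are separated by a newline; the empty piece after the
    # final newline is dropped.
    records = [line for line in text.split("\n") if line]
    # Inside each record, target words are separated by a single space.
    return [record.split(" ") for record in records]


sample = "This is a mapreduce ppt\nThis is a mapreduce ppt\n"
for words in parse_records(sample):
    print(words)
```

Each printed list corresponds to one record, and each element to one target word.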

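Split:

An InputSplit is the input to the map function.

It is a logical concept. By default an InputSplit corresponds 1-to-1 to a Block, but one split can also cover several Blocks (what about many-to-1?).

In practice, each split also covers the data at the beginning of the next Block, which solves the problem of records that cross a Block boundary.

For example, if the record "This is a mapreduce ppt\n" is stored across two Blocks, the record belongs to the split that corresponds to the first Block.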
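One way to get the "a record belongs to the split where it starts" behaviour is sketched below, loosely modeled on how line-based readers usually handle it: every split except the first skips its first (possibly partial) line, and every split keeps reading past its end until the record it started is complete. The function `read_split`, the byte-string input, and the tiny 30-byte Block size are assumptions of this sketch, not Hadoop code.

```python
def read_split(data: bytes, start: int, end: int):
    """Yield the records owned by the split covering bytes [start, end)."""
    pos = start
    if start != 0:
        # Not the first split: the line we land in belongs to the previous
        # split, so skip ahead to just after the next newline.
        nl = data.find(b"\n", start)
        pos = len(data) if nl == -1 else nl + 1
    # A record that starts at or before `end` is read in full, even if it
    # runs into the next Block.
    while pos <= end and pos < len(data):
        nl = data.find(b"\n", pos)
        stop = len(data) if nl == -1 else nl
        yield data[pos:stop].decode()
        pos = stop + 1


data = b"This is a mapreduce ppt\n" * 3   # three 24-byte records
block_size = 30                            # hypothetical tiny Block size
splits = [(s, min(s + block_size, len(data)))
          for s in range(0, len(data), block_size)]
for start, end in splits:
    print((start, end), list(read_split(data, start, end)))
```

With these numbers the second record crosses the first Block boundary and is read by the first split; every record is read exactly once.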


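RecordReader: calls the map function once for every record it reads.

For example, it reads the record "This is a mapreduce ppt", passes it as the argument v, and calls map(v).

It then repeats this process, reading the next record, until it reaches the end of the split.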


Map function:

function map(record v):
    for word in v:
        collect(word, 1)  // {"word": 1}

One call, map("This is a mapreduce ppt"), adds the following data in memory:

{"this": 1}
{"is": 1}
{"a": 1}
{"mapreduce": 1}
{"ppt": 1}
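The pseudocode above can be written as a small runnable sketch. The names `map_record` and `run_map` are made up for this example; `run_map` stands in for the RecordReader loop that calls map once per record. Lower-casing the words is an assumption added here so that the output matches the {"this": 1} shown on the slide.

```python
from typing import Iterable, Iterator


def map_record(record: str) -> Iterator[tuple[str, int]]:
    # Emit (word, 1) for every space-separated word in the record.
    for word in record.split():
        yield (word.lower(), 1)  # lower-casing is an assumption of this sketch


def run_map(records: Iterable[str]) -> list[tuple[str, int]]:
    # Plays the RecordReader role: one map call per record.
    output = []
    for record in records:
        output.extend(map_record(record))
    return output


pairs = run_map(["This is a mapreduce ppt"])
print(pairs)
# [('this', 1), ('is', 1), ('a', 1), ('mapreduce', 1), ('ppt', 1)]
```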

Shuffle:

Partition, Sort, Spill, Merge, Combiner, Copy, Memory, Disk……

This is where the magic happens.

It is also where there is a lot of room for performance tuning.


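Partitioner: decides which Reducer processes each piece of data, and thereby partitions the data.

For example, with hash partitioning and n Reducers, the hash of the key "This" of the data {"This": 1} is taken modulo n, which returns m, and the data is then processed by the m-th Reducer.

This produces {partition, key, value}:

{"This": 1} → {"m", "This": 1}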
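A minimal sketch of hash partitioning follows. The function `partition` is made up for this example, and `zlib.crc32` is used only as a stable stand-in for the key's hash code (Python's built-in `hash()` for strings changes between runs), so the partition numbers it produces are not the ones Hadoop would compute.

```python
import zlib


def partition(key: str, num_reducers: int) -> int:
    # Hash the key, then take it modulo the number of Reducers.
    # crc32 is a stand-in hash chosen for this sketch.
    return zlib.crc32(key.encode("utf-8")) % num_reducers


n = 3  # number of Reducers, chosen arbitrarily for the example
for key, value in [("This", 1), ("is", 1), ("This", 1)]:
    m = partition(key, n)
    print((m, key, value))  # the (partition, key, value) triple
```

The important property is that the same key always yields the same m, so all values of one key end up at the same Reducer.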

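MemoryBuffer:

An in-memory buffer. The output of each map call, together with its partition result, is kept in this buffer as key/value data.

Buffer size: 100 MB by default.

Spill threshold: 100 MB × 0.8 = 80 MB.

Data in the buffer: (partition, key, value) triples

{"1", "this": 1}
{"2", "is": 1}
……
{"1", "this": 1}
{"1", "otherword": 1}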
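The buffer and its threshold can be sketched as follows. This is a toy model, not Hadoop's buffer: the class name `MapOutputBuffer`, the per-record size estimate, and the in-memory list that stands in for spill files are all assumptions of this example.

```python
class MapOutputBuffer:
    """Toy buffer that spills once usage reaches capacity * spill_ratio."""

    def __init__(self, capacity_bytes: int = 100 * 1024 * 1024,
                 spill_ratio: float = 0.8):
        self.threshold = int(capacity_bytes * spill_ratio)  # 80 MB by default
        self.records = []   # (partition, key, value) triples
        self.used = 0       # estimated bytes in use
        self.spills = []    # each entry stands in for one spill file

    def collect(self, partition: int, key: str, value: int) -> None:
        self.records.append((partition, key, value))
        # Rough size estimate (key bytes + 8 bytes of overhead), an assumption.
        self.used += len(key.encode("utf-8")) + 8
        if self.used >= self.threshold:
            self.spill()

    def spill(self) -> None:
        # Hand the buffered triples over as one "spill file", then free the buffer.
        self.spills.append(self.records)
        self.records, self.used = [], 0


buf = MapOutputBuffer(capacity_bytes=50)  # tiny capacity so the demo spills
for triple in [(1, "this", 1), (2, "is", 1), (1, "this", 1),
               (1, "otherword", 1), (2, "is", 1)]:
    buf.collect(*triple)
print(len(buf.spills), "spill(s);", len(buf.records), "record(s) still buffered")
```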

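Spill:

When the in-memory buffer reaches the threshold, the spill thread locks that 80 MB of the buffer, starts writing its data out to the local disk, and then frees the memory.

Each spill produces one data file.

Before the data is spilled to disk, it is sorted by key (Sort) and merged (Combiner).

Key data destined for the same Reducer is concatenated together, which reduces the number of partition index entries.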

Sort:

The data in the buffer is sorted by partition and, within each partition, by key.

Before sorting:
{"1", "this": 1}
{"2", "is": 1}
……
{"1", "this": 1}

After sorting:
{"1", "this": 1}
{"1", "this": 1}
……
{"2", "is": 1}
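A minimal sketch of this sort, assuming the buffered data is a list of (partition, key, value) tuples; the variable names are made up for the example.

```python
from operator import itemgetter

buffer = [(1, "this", 1), (2, "is", 1), (1, "otherword", 1), (1, "this", 1)]

# Sort by partition first, then by key, so that each partition's data is
# contiguous and equal keys sit next to each other.
sorted_buffer = sorted(buffer, key=itemgetter(0, 1))
print(sorted_buffer)
# [(1, 'otherword', 1), (1, 'this', 1), (1, 'this', 1), (2, 'is', 1)]
```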

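Combiner:

Merges data: the values of records with the same key are merged, which reduces the amount of data.

The combine function is in effect the reduce function. When applying combine does not change the final reduce result (sum, max, etc.), it can greatly improve performance.

Before combining:
{"1", "this": 1}
{"1", "this": 1}

After combining:
{"1", "this": 2}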
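The combine step can be sketched as below, assuming the input is already sorted by (partition, key) so that equal keys are adjacent. The function `combine` and its `fn` parameter are made up for this example; `fn` defaults to `sum` for word count, and `max` would work the same way.

```python
from itertools import groupby
from operator import itemgetter


def combine(sorted_records, fn=sum):
    """Merge the values of adjacent records that share (partition, key)."""
    out = []
    for (partition, key), group in groupby(sorted_records, key=itemgetter(0, 1)):
        out.append((partition, key, fn(value for _, _, value in group)))
    return out


records = [(1, "this", 1), (1, "this", 1), (2, "is", 1)]
print(combine(records))
# [(1, 'this', 2), (2, 'is', 1)]
```

A function such as an average would not be safe here, because averaging partial averages changes the final result.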



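Spill to disk:

Each time the data in the memory buffer is spilled to the local disk, one spill file is created.

Within a spill file, data of the same partition is stored contiguously as one block. In the figure there are three such blocks, which means there are three corresponding reducers.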

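Merge on disk:

When the map task finishes, the files produced by the multiple spills and the data still in memory (data that had not yet needed spilling at completion) must be merged into a single map output file. Data of the same partition is stored contiguously.

Data is merged into groups:

Spill file 1: {"this": 2}
Spill file 2: {"this": 3}
Merged file: {"this": [2, 3]}

A group, {key: [value1……]}, is the input data format of the reduce function.

If a combiner is set, the values of the same key are merged:

{"this": [5]}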
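The merge into groups can be sketched with a k-way merge of sorted runs. This assumes each spill is a list of (key, value) pairs already sorted by key and ignores partitions for brevity; `merge_spills` is a name made up for this example, and the "is" entries are added sample data.

```python
import heapq
from itertools import groupby


def merge_spills(*spills):
    """Merge key-sorted spills and group the values of each key."""
    merged = heapq.merge(*spills)  # lazy k-way merge of the sorted inputs
    return [(key, [value for _, value in group])
            for key, group in groupby(merged, key=lambda kv: kv[0])]


spill_1 = [("is", 1), ("this", 2)]
spill_2 = [("is", 2), ("this", 3)]

groups = merge_spills(spill_1, spill_2)
print(groups)    # [('is', [1, 2]), ('this', [2, 3])]

# With a combiner (sum, for word count) each group collapses to one value.
combined = [(key, [sum(values)]) for key, values in groups]
print(combined)  # [('is', [3]), ('this', [5])]
```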

Map task result:

The output of a map task is a file of group data sorted by partition and key.

A MapReduce job generally has multiple map tasks, and each task has one output file:

{key1: [value1……]}
{key2: [value1……]}
……

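Copy: copying data to the reduce side

The reduce side reads the map-side output files: from every map node it copies the part of the data that belongs to its own partition into memory.

For example, if the keys "this" and "other" both correspond to reducer1, the memory of reducer1 will hold data such as:

{"this", [2,2,3,1]}
{"other", [10,2,3,4]}
……
{"this", [3,2,4]}

This data comes from multiple map tasks, and all of its keys belong to the same partition.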
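The copy step can be sketched as follows. It is an in-memory model only: each map output is represented as a dict from partition number to that partition's groups, `fetch_partition` is a name made up for this example, and the partition 2 entries are invented sample data. The real copy goes over the network.

```python
def fetch_partition(map_outputs, partition):
    """Collect one partition's groups from every map task's output."""
    fetched = []
    for output in map_outputs:
        # A map output may hold nothing for this partition.
        fetched.extend(output.get(partition, []))
    return fetched


map_outputs = [
    {1: [("this", [2, 2, 3, 1]), ("other", [10, 2, 3, 4])], 2: [("is", [4])]},
    {1: [("this", [3, 2, 4])], 2: [("is", [1, 1])]},
]

reducer1_input = fetch_partition(map_outputs, 1)
print(reducer1_input)
# [('this', [2, 2, 3, 1]), ('other', [10, 2, 3, 4]), ('this', [3, 2, 4])]
```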

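Spill:

As on the map side, when the memory buffer is full, the data goes through sort and combiner and is spilled to a file on disk.

The buffer settings on the reduce side are more flexible: the reduce function is not running yet at this point, so the buffer can take up a larger share of memory.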

Sort:

Before sorting:
{"this", [2,2,3,1]}
{"other", [10,2,3,4]}
……
{"this", [3,2,4]}

After sorting:
{"this", [2,2,3,1]}
{"this", [3,2,4]}
……
{"other", [10,2,3,4]}

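Combiner:

If a combiner does not affect the reduce result, setting one lets the data be merged before it is written to the file on disk.

Before combining:
{"this", [2,2,3,1]}
{"this", [3,2,4]}
……
{"other", [10,2,3,4]}

After combining:
{"this", [8]}
{"this", [9]}
……
{"other", [19]}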

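Merge:

Reads the data of the multiple spill files and merges it into one file, which serves as the input of reduce({key, [v……]}).

Spill file 1:
{"this", [8]}
{"this", [9]}
……
{"other", [19]}

Spill file 2:
{"this", [3]}
{"this", [10]}
……
{"task", [6]}

Merged file:
{"this", [8,9,3,10]}
……
{"task", [6]}
{"other", [19]}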

Reduce function:

function reduce({key, v[]}):
    for value in v:
        collectAdd(key, value)

Input:
{"this", [8,9,3,10]}
……
{"task", [6,2,5]}
{"other", [19,3]}

Output:
{"this", [30]}
……
{"task", [13]}
{"other", [22]}
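A runnable version of the reduce pseudocode for word count, using the input shown on the slide. The name `reduce_group` is made up for this example, and the accumulation that collectAdd implies is written out as a running total.

```python
def reduce_group(key: str, values: list[int]) -> tuple[str, int]:
    # Add up every value that was collected for this key.
    total = 0
    for value in values:
        total += value
    return (key, total)


groups = [("this", [8, 9, 3, 10]), ("task", [6, 2, 5]), ("other", [19, 3])]
results = [reduce_group(key, values) for key, values in groups]
print(results)
# [('this', 30), ('task', 13), ('other', 22)]
```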


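Reduce result:

The data output by the different reduce tasks belongs to different partitions, so keys do not repeat across the result files.

Combining the output files of the reduce tasks gives the final wordcount result.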


Langyu’s Blog：http://langyu.iteye.com/blog/992916








Thanks!
