MapReduce Programming
Waue Chen
Why?
Moore's law? CPU clock speed doubled every 18 months; this stopped holding around 2005.
The era of multi-core and parallel computing has arrived.
What is Hadoop
Hadoop is an open-source framework for parallel, distributed programs running on large-scale clusters.
It provides a distributed file system, HDFS, to store data across the nodes.
Highly fault-tolerant: failed nodes are handled automatically. It implements Google's MapReduce algorithm.
What is MapReduce
MapReduce splits an application into many small work units; each unit can be executed or computed on any node.
MapReduce: Example
MapReduce in Parallel: Example
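The figures for the two example slides above are not reproduced here. As a hypothetical stand-in (not from the slides), the same parallel word-count flow can be sketched in plain Java with parallel streams, where each line is an independent map-side work unit and the grouping step plays the role of the reduce phase:

```java
import java.util.*;
import java.util.stream.*;

public class ParallelWordCount {
    // Each input line is processed as an independent work unit ("map"),
    // then identical words are grouped and counted ("reduce").
    public static Map<String, Long> count(List<String> lines) {
        return lines.parallelStream()
                .flatMap(line -> Arrays.stream(line.split("\\s+")))
                .filter(w -> !w.isEmpty())
                .collect(Collectors.groupingBy(w -> w, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("one fish", "two fish");
        System.out.println(count(lines));
    }
}
```

This is only a single-machine analogy: Hadoop additionally distributes the units across cluster nodes and shuffles intermediate pairs between them.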
Thinking in Hadoop: MapReduce
HDFS Map Class Reduce Class Overall Configuration
Program prototype

class MR {
    class Map ... { }
    class Reduce ... { }
    public static void main(String[] args) {
        JobConf conf = new JobConf(MR.class);
        conf.setInputPath(new Path("the_path_of_HDFS"));
        conf.setMapperClass(Map.class);
        conf.setReducerClass(Reduce.class);
        JobClient.runJob(conf);
    }
}
Word Count Sample

class WordCount {
    public static void main(String[] args) throws IOException {
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");
        // set paths
        conf.setInputPath(new Path("/user/waue/input"));
        conf.setOutputPath(new Path("counts"));
        FileSystem.get(conf).delete(new Path("counts")); // clear old output
        // set map reduce
        conf.setOutputKeyClass(Text.class);          // set every word as key
        conf.setOutputValueClass(IntWritable.class); // set 1 as value
        conf.setMapperClass(MapClass.class);
        conf.setReducerClass(ReduceClass.class);
        conf.setNumMapTasks(1);
        conf.setNumReduceTasks(1);
        // run
        JobClient.runJob(conf);
    }
}
Word Count Sample

class MapClass extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        String line = value.toString();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            output.collect(word, one);
        }
    }
}
Word Count Sample

class ReduceClass extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {
    IntWritable SumValue = new IntWritable();

    public void reduce(Text key, Iterator<IntWritable> values,
            OutputCollector<Text, IntWritable> output, Reporter reporter)
            throws IOException {
        int sum = 0;
        while (values.hasNext())
            sum += values.next().get();
        SumValue.set(sum);
        output.collect(key, SumValue);
    }
}
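Before deploying to a cluster, the tokenize-then-sum logic of MapClass and ReduceClass above can be checked locally. Below is a hypothetical plain-Java simulation with no Hadoop dependency; the names LocalWordCount, map, and reduce are illustrative, not Hadoop API:

```java
import java.util.*;

public class LocalWordCount {
    // "map" phase: emit one (word, 1) pair per token, as MapClass does
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer itr = new StringTokenizer(line);
        while (itr.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<>(itr.nextToken(), 1));
        }
        return pairs;
    }

    // "reduce" phase: sum the values grouped by key, as ReduceClass does
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            counts.merge(p.getKey(), p.getValue(), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : new String[] { "hello hadoop", "hello world" }) {
            pairs.addAll(map(line));
        }
        System.out.println(reduce(pairs)); // {hadoop=1, hello=2, world=1}
    }
}
```

The real framework inserts a shuffle/sort between the two phases so that all pairs with the same key reach the same reducer; here the TreeMap grouping plays that role.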
Result
MapReduce with HBase prototype

class MR_HBase {
    class Map extends ... { }
    class Reduce extends ... { }
    public static void main(String[] args) {
        JobConf conf = new ...;
        conf.setInputPath(...);
        conf.setMapperClass(...);
        conf.setReducerClass(...);
        JobClient.runJob(conf);
    }
}
HBase API:
TableMap
TableReduce
WordCountIntoHbase Sample

class WordCountIntoHbase {
    public static void main(String[] args) throws IOException {
        BuildHTable build_table = new BuildHTable(Table_Name, ColumnF);
        if (!build_table.checkTableExist(Table_Name)) {
            if (!build_table.createTable())
                System.err.println("create table error !");
        } else
            System.out.println("Table existed !");
        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");
        conf.setInputPath(new Path("/user/waue/input"));
        // conf.setOutputPath(new Path("counts"));
        // FileSystem.get(conf).delete(new Path(wc.outputPath));
        // conf.setOutputKeyClass(Text.class);          // set every word as key
        // conf.setOutputValueClass(IntWritable.class); // set 1 as value
        // conf.setMapperClass(MapClass.class);
        conf.setReducerClass(ReduceClass.class);
        conf.setNumMapTasks(0);
        conf.setNumReduceTasks(1);
        JobClient.runJob(conf);
    }
}
class ReduceClass extends TableReduce<LongWritable, Text> {
    Text col = new Text("word:text");
    private MapWritable map = new MapWritable();

    public void reduce(LongWritable key, Iterator<Text> values,
            OutputCollector<Text, MapWritable> output, Reporter reporter)
            throws IOException {
        ImmutableBytesWritable bytes
            = new ImmutableBytesWritable(values.next().getBytes());
        map.clear();
        map.put(col, bytes);
        output.collect(new Text(key.toString()), map);
    }
}
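To visualize what this TableReduce emits, here is a hypothetical in-memory stand-in for the target HBase table (FakeHTable is illustrative, not an HBase class): each row key maps to columns named "family:qualifier", mirroring the "word:text" column above.

```java
import java.util.*;

public class FakeHTable {
    // row key -> (column "family:qualifier" -> cell value)
    private final Map<String, Map<String, String>> rows = new TreeMap<>();

    public void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    public String get(String rowKey, String column) {
        Map<String, String> row = rows.get(rowKey);
        return row == null ? null : row.get(column);
    }

    public static void main(String[] args) {
        FakeHTable table = new FakeHTable();
        // mirrors output.collect(new Text(key.toString()), map)
        // where the row key is the input offset and the column is "word:text"
        table.put("0", "word:text", "hello hadoop");
        table.put("13", "word:text", "hello hbase");
        System.out.println(table.get("0", "word:text")); // hello hadoop
    }
}
```

The real table persists these cells across the cluster with versioning; this sketch only shows the row key / column family / qualifier data model.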
WordCountIntoHbase Sample
Result
WordCountFromHbase
Counts the words stored in HBase after WordCountIntoHbase has run.
In Trac: http://trac.nchc.org.tw/cloud/browser/sample/hadoop-0.16/tw/org/nchc/code/WordCountFromHBase.java
What's HBaseRecordPro
It parses your record, creates the HBase table, uses the first line as the column qualifiers, and stores the data in HBase, automatically and locally.
http://trac.nchc.org.tw/cloud/wiki/HBaseRecordPro
HBaseRecordPro
name:locate:years
waue:taiwan:1981
rock:taiwan:1981
aso:taiwan:1981
jazz:taiwan:1982
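Assuming the behavior described above (the first line supplies the column qualifiers, every later line is one row), a hypothetical sketch of the parsing step might look like the following; RecordParser is illustrative, not part of HBaseRecordPro:

```java
import java.util.*;

public class RecordParser {
    // First line: colon-separated column qualifiers (e.g. name, locate, years).
    // Remaining lines: colon-separated field values, one row each.
    public static List<Map<String, String>> parse(List<String> lines) {
        String[] columns = lines.get(0).split(":");
        List<Map<String, String>> rows = new ArrayList<>();
        for (String line : lines.subList(1, lines.size())) {
            String[] fields = line.split(":");
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < columns.length; i++) {
                row.put(columns[i], fields[i]);
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "name:locate:years",
            "waue:taiwan:1981",
            "rock:taiwan:1981");
        System.out.println(parse(input));
        // [{name=waue, locate=taiwan, years=1981}, {name=rock, locate=taiwan, years=1981}]
    }
}
```

In the real tool each parsed row would then be written into HBase under the corresponding column qualifiers.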
Run HBaseRecordPro.java
hql> Select * from Table;
Detailed Code Explanation
Apache log parser: http://trac.nchc.org.tw/cloud/wiki/LogParser
More .. ? Enjoy http://trac.nchc.org.tw/cloud/
How to code Hadoop in Eclipse
http://trac.nchc.org.tw/cloud/browser/hadoop-eclipse.pdf
Map Reduce in Hadoop/HBase Manual
http://trac.nchc.org.tw/cloud/wiki/MR_manual
My Code sources
http://trac.nchc.org.tw/cloud/browser/sample/hadoop-0.16/tw/org/nchc/code
Then..? An Intrusion-Detection-System log parser
Count => the last; Format => 6 lines / 1 cell
Apache Pig
Pig is a platform for analyzing large data sets that consists of a high-level language.
[**] [1:2189:3] BAD-TRAFFIC IP Proto 103 PIM [**]
[Classification: Detection of a non-standard protocol or event] [Priority: 2]
07/08-14:57:56.500718 140.110.138.253 -> 224.0.0.13
PIM TTL:1 TOS:0xC0 ID:11078 IpLen:20 DgmLen:54
[Xref => http://cve.mitre.org/cgi-bin/cvename.cgi?name=2003-0567]
[Xref => http://www.securityfocus.com/bid/8211]
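A hypothetical first step for such a parser (SnortAlertParser and its regexes are illustrative assumptions, not from the slides): pull the priority and the source/destination addresses out of one raw alert string like the sample above.

```java
import java.util.regex.*;

public class SnortAlertParser {
    // matches e.g. "[Priority: 2]"
    static final Pattern PRIORITY = Pattern.compile("\\[Priority: (\\d+)\\]");
    // matches e.g. "140.110.138.253 -> 224.0.0.13"
    static final Pattern ADDRS = Pattern.compile(
        "(\\d{1,3}(?:\\.\\d{1,3}){3}) -> (\\d{1,3}(?:\\.\\d{1,3}){3})");

    // returns {priority, sourceIp, destinationIp}, or null if not found
    public static String[] parse(String alert) {
        Matcher p = PRIORITY.matcher(alert);
        Matcher a = ADDRS.matcher(alert);
        if (p.find() && a.find()) {
            return new String[] { p.group(1), a.group(1), a.group(2) };
        }
        return null;
    }

    public static void main(String[] args) {
        String alert = "[**] [1:2189:3] BAD-TRAFFIC IP Proto 103 PIM [**] "
            + "[Classification: Detection of a non-standard protocol or event] "
            + "[Priority: 2] 07/08-14:57:56.500718 140.110.138.253 -> 224.0.0.13 "
            + "PIM TTL:1 TOS:0xC0 ID:11078 IpLen:20 DgmLen:54";
        String[] r = parse(alert);
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
        // prints: 2 140.110.138.253 224.0.0.13
    }
}
```

In a MapReduce setting, this extraction would sit in the map phase, emitting (source IP, 1) or similar pairs so the reduce phase can count alerts per host.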
References
API:
http://hadoop.apache.org/hbase/docs/current/api/index.html
http://hadoop.apache.org/core/docs/r0.16.4/api/index.html
Distributed parallel programming with Hadoop (用 Hadoop 進行分佈式並行編程):
http://www.ibm.com/developerworks/cn/opensource/os-cn-hadoop1/index.html
Recommended