32
Distributed Computing at Web Scale The MAP REDUCE model Serge Abiteboul Ioana Manolescu Philippe Rigaux Marie-Christine Rousset Pierre Senellart Web Data Management and Distribution http://webdam.inria.fr/textbook March 17, 2011 WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE model March 17, 2011 1 / 32

Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Distributed Computing at Web ScaleThe MAPREDUCE model

Serge Abiteboul Ioana Manolescu Philippe RigauxMarie-Christine Rousset Pierre Senellart

Web Data Management and Distributionhttp://webdam.inria.fr/textbook

March 17, 2011

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE model March 17, 2011 1 / 32

Page 2: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Introduction

Outline

1 Introduction

2 The MAPREDUCE computing model

3 Toward easier programming interfaces: PIG

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE model March 17, 2011 2 / 32

Page 3: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Introduction

History and development of MAPREDUCE

Published by Google Labs in 2004 at OSDI [DG04]. Implemented inHadoop, widely used (Yahoo!, Amazon EC2).

A programming model (inspired by standard functional programmingoperators) to facilitate the development and execution of distributedtasks.

Main idea: “push the program near the data”. The programmer definestwo functions; MAPREDUCE takes in charge distribution aspects.

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE model March 17, 2011 3 / 32

Page 4: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Introduction

Centralized computing with distributed data storageRun the program at the Client, get data from the distributed system.

Downsides: important data flows, no use ot the cluster computingresources.

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE model March 17, 2011 4 / 32

Page 5: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

The MAPREDUCE computing model

Outline

1 Introduction

2 The MAPREDUCE computing model

3 Toward easier programming interfaces: PIG

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE model March 17, 2011 5 / 32

Page 6: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

The MAPREDUCE computing model

The programming model of MAPREDUCEUsed to process data flows of (key ,value) pairs.

1 map() takes as input a list of pairs (k ,v) ∈ K1 × V1 and produces(for each pair), another list of pairs (k 0,v 0) ∈ K2 × V2, calledintermediate pairs.Example: take a pair (uri ,document), produce list of pairs(term,count)

2 (shuffle) the MAPREDUCE execution environment groupsintermediate pairs on the key, and produces grouped instances oftype (K2, list(V2))Example: intermediate pairs for term ’job’ are grouped as(0 job0,< 1,4,2,8 >)

3 reduce() operates on grouped instances of intermediate pairs(k 0

1,< v 01, � � � ,v0p, � � � ,v0q , � � � >); Each instance processed by the

procedure outputs a result, usually a single value v 00.Example: take a grouped instance (0 job0,< 1,4,2,8 >) and outputthe sum: 15

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE model March 17, 2011 6 / 32

Page 7: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

The MAPREDUCE computing model

Job workflow in MAPREDUCE

Important: each pair, at each phase, is processed independently fromthe other pairs.

Network and distribution are transparently managed by theMAPREDUCE environment.

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE model March 17, 2011 7 / 32

Page 8: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

The MAPREDUCE computing model

Example: Counting terms occurrences in documents

The map() function:

mapCW(String key, String value):// key: document name// value: document contentsfor each term t in value:

return (t, 1);

Note: we do not even need to count the occurrences of t in value.

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE model March 17, 2011 8 / 32

Page 9: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

The MAPREDUCE computing model

Counting terms occurrences in documents (cont’)

The reduce() function. It takes as input a grouped instance on a key .The nested list can be scanned with an iterator.

reduceCW(String key, Iterator values):// key: a term// values: a list of countsint result = 0;

// Loop on the values list; cumulate in resultfor each v in values:

result += v;

// Send the resultreturn result;

And, finally, the Job Driver program which submits both functions.

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE model March 17, 2011 9 / 32

Page 10: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

The MAPREDUCE computing model

// A specification object for MapReduce executionMapReduceSpecification spec;

// Define input filesMapReduceInput* input = spec.add_input();input->set_filepattern("/movies/*.xml");input->set_mapper_class("MapWC");

// Specify the output files:MapReduceOutput* out = spec.output();out->set_filebase("/gfs/freq");out->set_num_tasks(100);out->set_reducer_class("ReduceWC");

// Now run itMapReduceResult result;if (!MapReduce(spec, &result)) abort();

// Done: ’result’ structure contains result inforeturn 0;

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 10 / 32

Page 11: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

The MAPREDUCE computing model

Processing map() and reduce() as a MAPREDUCE job.

A MAPREDUCE job takes care of the distribution, synchronization andfailure handling. Specifically:

the input is split in M groups; each group is assigned to a mapper(assignment is based on the data locality principle);

each mapper processes a group and stores the intermediate pairslocally;

grouped instances are assigned to reducers thanks to a hashfunction.

(Shuffle) intermediate pairs are sorted on their key by the reducer;

one obtains grouped instances, submitted to the reduce() function;

NB: the data locality does no longer hold for the REDUCE phase, sinceit reads from the mappers.

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 11 / 32

Page 12: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

The MAPREDUCE computing model

Distributed execution of a MAPREDUCE job.

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 12 / 32

Page 13: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

The MAPREDUCE computing model

Processing the WordCount() example (1)

Let the input consists of documents, say, one million 100-termsdocuments of approximately 1 KB each.

The split operation distributes these documents in groups of 64 MBs:each group consist of 64,000 documents. ThereforeM : d 1,000,000/64,000e ≈ 16,000 groups.

We assume a global count of 1,000 distinct terms in the chunk; the(local) Map phase produces 6,400,000 pairs (t ,c).

Let hash(t) = t mod 1,000. Each intermediate group i ,0 ≤ i < 1000contains 6,400 pairs, each with 6-7 distinct terms t such thathash(t) = i .

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 13 / 32

Page 14: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

The MAPREDUCE computing model

Processing the WordCount() example (2)

Assume that hash(’call’) = hash(’mine’) = hash(’blog’) = i = 100. Wefocus on three Mappers Mp, Mq and M r :

1 Gpi =(<. . . , (’mine’, 1), . . . , (’call’,1), . . . , (’mine’,1), . . . , (’blog’, 1)

. . . >2 Gq

i =(< . . . , (’call’,1), . . . , (’blog’,1), . . . >3 Gr

i =(<. . . , (’blog’, 1), . . . , (’mine’,1), . . . , (’blog’,1), . . . >

Ri reads Gpi , Gp

i and Gpi from the three Mappers, sorts their unioned

content, and groups the pairs with a common key:

. . . , (’blog’, <1, 1, 1, 1>), . . . , (’call’, <1, 1>), . . . , (’mine’, <1, 1, 1>)

Our reduceWC() function is then applied by Ri to each element of thislist. The output is (’blog’, 4), (’call’, 2) and (’mine’, 3).

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 14 / 32

Page 15: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

The MAPREDUCE computing model

Failure management

In a distributed setting, the specific job handled by a machine is only aminor part of the overall computing task.

Moreover, because the task is distributed on hundreds or thousands ofmachines, the chances that a problems occurs somewhere are muchlarger; starting the job from the beginning is not a valid option.

The Master periodically checks the availability and reacheability of the“Workers”

1 if a reducer fails, its task may be reassigned;2 if a mapper fails, its task must be started from scratch, even in the

REDUCE phase (it holds intermediate groups).3 if the Master fails? Then the whole job should be re-initiated.

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 15 / 32

Page 16: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Toward easier programming interfaces: PIG

Outline

1 Introduction

2 The MAPREDUCE computing model

3 Toward easier programming interfaces: PIG

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 16 / 32

Page 17: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Toward easier programming interfaces: PIG

Pig Latin

Motivation: define high-level languages that use MAPREDUCE as anunderlying data processor.

A Pig Latin statement is an operator that takes a relation as input andproduces another relation as output.

Pig Latin statements are generally organized in the following manner:1 A LOAD statement reads data from the file system as a relation

(list of tuples).2 A series of "transformation" statements process the data.3 A STORE statement writes output to the file system; or, a DUMP

statement displays output to the screen.

Statements are executed as composition of MAPREDUCE jobs.

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 17 / 32

Page 18: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Toward easier programming interfaces: PIG

An example

A flat file extracted from DBLP.

2005 VLDB J. Model-based approximate querying in sensor1997 VLDB J. Dictionary-Based Order-Preserving String Compression.2003 SIGMOD Record Time management for new faculty.2001 VLDB J. E-Services - Guest editorial.2003 SIGMOD Record Exposing undergraduate students to database1998 VLDB J. Integrating Reliable Memory in Databases.1996 VLDB J. Query Processing and Optimization in Oracle1996 VLDB J. A Complete Temporal Relational Algebra.1994 SIGMOD Record Data Modelling in the Large.2002 SIGMOD Record Data Mining: Concepts and Techniques - Book

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 18 / 32

Page 19: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Toward easier programming interfaces: PIG

An example

Here is the complete Pig Latin program that computes the averagenumber of publication per year.

-- Load records from the filearticles = load ’../../data/dblp/journal.txt’

as (year: chararray, journal:chararray,title: chararray) ;

sr_articles = filter articlesBY journal==’SIGMOD Record’;

year_groups = group sr_articles by year;

avg_nb = foreach year_groupsgenerate group, COUNT(sr_articles.title);

dump avg_nb;

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 19 / 32

Page 20: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Toward easier programming interfaces: PIG

The data model

The model allows nesting of bags and tuples. Example: theyear_group temporary bag.

group: 1990sr_articles:{(1990, SIGMOD Record, SQL For Networks of Relations.),(1990, SIGMOD Record, New Hope on Data Models and Types.)

}

Unlimited nesting, but no references, no constraint of any kind (forparallelization purposes).

Pig programs are either run locally (for testing purposes) or asMAPREDUCE jobs (possibly pipelined if several jobs are necessary).

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 20 / 32

Page 21: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Toward easier programming interfaces: PIG

Flexible representation

Pig allows the representation of heterogeneous data, in the spirit ofsemi-structured dat models (e.g., XML).

The following is a bag with heterogeneous tuples.

{(2005, {’SIGMOD Record’, ’VLDB J.’},

{’article1’, article2’})(2003, ’SIGMOD Record’, {’article1’, article2’},

{’author1’, ’author2’})}

Explains why the computing model is popular in key-value stores, anddocument-oriented systems.

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 21 / 32

Page 22: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Toward easier programming interfaces: PIG

The Pig operators

Operator Descriptionforeach Apply one or several expression(s) to each of the input tu-

ples.filter Filter the input tuples with some criteria.cogroup Associate two related groups from distinct inputs.cross Cross product of two inputs.join Join of two inputs.union Union of two inputs (note: no need to agree on a same

schema, as in SQL).order Order an input.distinct Remove duplicates from an input.

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 22 / 32

Page 23: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Toward easier programming interfaces: PIG

Our dataset

A simple flat file with tab-separated fields.

1995 Foundations of Databases Abiteboul1995 Foundations of Databases Hull1995 Foundations of Databases Vianu2010 Web Data Management Abiteboul2010 Web Data Management Manolescu2010 Web Data Management Rigaux2010 Web Data Management Rousset2010 Web Data Management Senellart

NB: Pig accepts inputs from user-defined function – allows to extractdata from any source.

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 23 / 32

Page 24: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Toward easier programming interfaces: PIG

First example: group

The “program”:

-- Load records from the webdam-books.txt file (tab separatedbooks = load ’../../data/dblp/webdam-books.txt’

as (year: int, title: chararray, author: chararray) ;group_auth = group books by title;authors = foreach group_auth generate group, books.author;dump authors;

and the result:

(Foundations of Databases,{(Abiteboul),(Hull),(Vianu)})

(Web Data Management,{(Abiteboul),(Manolescu),(Rigaux),(Rousset),(Senellart)})

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 24 / 32

Page 25: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Toward easier programming interfaces: PIG

Unnesting with flatten

Flatten serves to unest a nested field.

-- Take the ’authors’ bag and flatten the nested setflattened = foreach authors generate group, flatten(author);

Applied to the previous authors bags, one obtains:

(Foundations of Databases,Abiteboul)(Foundations of Databases,Hull)(Foundations of Databases,Vianu)(Web Data Management,Abiteboul)(Web Data Management,Manolescu)(Web Data Management,Rigaux)(Web Data Management,Rousset)(Web Data Management,Senellart)

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 25 / 32

Page 26: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Toward easier programming interfaces: PIG

The cogroup operator

Allows to gather in nested fields two data sources.

Example: a file with publishers:

Fundations of Databases Addison-Wesley USAFundations of Databases Vuibert FranceWeb Data Management Cambridge University Press USA

The program:

--- Load records from the webdam-publishers.txt filepublishers = load ’../../data/dblp/webdam-publishers.txt’

as (title: chararray, publisher: chararray) ;cogrouped = cogroup flattened by group, publishers by title

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 26 / 32

Page 27: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Toward easier programming interfaces: PIG

The result

For each grouped field value, two nested sets, coming from bothsources.

(Foundations of Databases,{ (Foundations of Databases,Abiteboul),

(Foundations of Databases,Hull),(Foundations of Databases,Vianu)

},{(Foundations of Databases,Addison-Wesley),(Foundations of Databases,Vuibert)

})

A kind of join? Yes, at least a preliminary step.

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 27 / 32

Page 28: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Toward easier programming interfaces: PIG

Joins

Same as before, but produces a flat output (cross product of the innernested bags).

-- Take the ’flattened’ bag, join with ’publishers’joined = join flattened by group, publishers by title;

(Foundations of Databases,Abiteboul,Fundations of Databases,Addison-Wesley)

(Foundations of Databases,Abiteboul,Fundations of Databases,Vuibert)

(Foundations of Databases,Hull,Fundations of Databases,Addison-Wesley)

(Foundations of Databases,Hull,...

Important: the nested model is more elegant and easier to deal with.

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 28 / 32

Page 29: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Toward easier programming interfaces: PIG

What you should remember on distr. computing

MAPREDUCE is a simple model for batch processing of very largecollections.⇒ good for data analytics; not good for point queries (high latency).

The systems brings robustness against failure of a component andtransparent distribution.⇒ more expressive languages required (Pig); could MAPREDUCE beused as an infrastructure for very large scale database distribution(e.g., HadoopDB)?

Parallel databases exist for a long time (TeraData); they usedistribution for scalability.

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 29 / 32

Page 30: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Toward easier programming interfaces: PIG

Overview of existing systems

Google File System (GFS). Distr. storage for very large,unstructured files. Open-source: HDFS (Hadoop)

Dynamo. Internal data store of Amazon. Open source: Voldemortproject.

Bigtable, sorted distributed storage. Open source: HTable(Hadoop).

Amazon proposes a Cloud environment, EC2, with: S3 andHadoop/MAPREDUCE.

and Yahoo! with PNuts, Microsoft, with Scope, etc.

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 30 / 32

Page 31: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Toward easier programming interfaces: PIG

The end!

For references, slides, and some public chapters of the book:

http://webdam.inria.fr/textbook

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 31 / 32

Page 32: Distributed Computing at Web Scale The MapReduce modeldeutchd/teaching/sldistcomp.pdf · two functions; MAPREDUCE takes in charge distribution aspects. WebDam (INRIA) Distributed

Toward easier programming interfaces: PIG

References

Jeffrey Dean and Sanjay Ghemawat.

MapReduce: Simplified Data Processing on Large Clusters.In Intl. Symp. on Operating System Design and Implementation (OSDI), pages 137–150, 2004.

WebDam (INRIA) Distributed Computing at Web Scale The MAPREDUCE modelMarch 17, 2011 32 / 32