A Generate-Test-Aggregate Parallel Programming Library on Spark

A Generate-Test-Aggregate Parallel Programming Library

Yu Liu (1), Kento Emoto (2), Zhenjiang Hu (3)

(1) The Graduate University for Advanced Studies, (2) The University of Tokyo, (3) National Institute of Informatics

PPoPP PMAM 2013

Systematic Parallel Programming for MapReduce

Outline

Introduction to GTA

The GTA library

Implementation strategy

Programming interface

Automatic parallelization and optimization

Applications and evaluations

Conclusions

The GTA Programming Methodology

Simple programming pattern

1. Generate all possible solution candidates;

2. Test and filter candidates;

3. Aggregate the valid candidates.

Expressive and code efficient

Covers a large class of problems

Automatic optimization and parallelization

Kento Emoto et al. [ESOP’12]

An Example: The Knapsack Problem

Writing a parallel (MapReduce) program for the knapsack problem is not easy.

input: [ (1$, 2 Kg), (2$, 6 Kg), (3$, 10 Kg) ]

weight limit = 15 Kg

generate:

[ [ ], [ (1$, 2 Kg) ], [ (2$, 6 Kg) ], [ (3$, 10 Kg) ], [ (1$, 2 Kg), (2$, 6 Kg) ], [ (1$, 2 Kg), (3$, 10 Kg) ], [ (2$, 6 Kg), (3$, 10 Kg) ], [ (1$, 2 Kg), (2$, 6 Kg), (3$, 10 Kg) ] ]

test: [true, true, true, true, true, true, false, false]

filter: [ [ ], [ (1$, 2 Kg) ], [ (2$, 6 Kg) ], [ (3$, 10 Kg) ], [ (1$, 2 Kg), (2$, 6 Kg) ], [ (1$, 2 Kg), (3$, 10 Kg) ] ]

aggregate: max of 0$, 1$, 2$, 3$, 3$, 4$ = 4$
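The three steps of this example can be written down directly. The following is a minimal Python sketch, not code from the GTA library; the function names `generate`, `within_limit`, and `naive_knapsack` are chosen here for illustration only.

```python
from itertools import combinations

def generate(items):
    """Generate every sublist of items, smallest sublists first."""
    return [list(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def within_limit(candidate, limit):
    """Test: the total weight of the candidate does not exceed the limit."""
    return sum(weight for _value, weight in candidate) <= limit

def naive_knapsack(items, limit):
    candidates = generate(items)                                # generate
    valid = [c for c in candidates if within_limit(c, limit)]   # test and filter
    return max(sum(value for value, _w in c) for c in valid)    # aggregate

items = [(1, 2), (2, 6), (3, 10)]  # (value in $, weight in Kg)
print(naive_knapsack(items, 15))   # prints 4
```

The list built by `generate` has 2^n entries for n items, so this direct form is only usable for very small inputs.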

Naively implementing Knapsack is inefficient (O(2^n)).

Performance of the naïve Knapsack program

Input (length)   Time (ms)
8                30
12               86
16               97
20               2829
24               java.lang.OutOfMemoryError: Java heap space

The GTA fusion theorem is introduced to resolve this efficiency problem.

GTA Fusion

[Figure: GTA fusion turns the generator, predicates, and aggregator into a mapReduceable, which runs on MapReduce as map ( mapReduceable.f ) . reduce ( mapReduceable.combine )]

Definitions of G,T,A

Class Name    Algebraic Structure
Generator     polymorphic semiring generator
Predicate     almost list homomorphism
Aggregator    semiring homomorphism

Ref: K. Emoto et al. [ESOP’12]
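To make "polymorphic semiring generator" concrete, here is a small Python sketch under stated assumptions. The `Semiring` record and the `sublists` function are hypothetical names for illustration, not the library's classes. The same generator is evaluated in two semirings: in a semiring of bags of lists it enumerates candidates, and in the max-plus semiring it yields the best total without building any candidate.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Any, Callable

@dataclass(frozen=True)
class Semiring:
    plus: Callable[[Any, Any], Any]   # combines alternatives
    times: Callable[[Any, Any], Any]  # combines parts of one candidate
    zero: Any                         # identity of plus (kept for completeness)
    one: Any                          # identity of times

def sublists(semiring, embed, xs):
    """Polymorphic generator: each element is either skipped (one) or taken (embed(x))."""
    choices = [semiring.plus(semiring.one, embed(x)) for x in xs]
    return reduce(semiring.times, choices, semiring.one)

# Bags of lists: plus is bag union, times is pairwise concatenation.
bag_of_lists = Semiring(
    plus=lambda a, b: a + b,
    times=lambda a, b: [x + y for x in a for y in b],
    zero=[],
    one=[[]],
)

# Max-plus: plus is max, times is addition.
max_plus = Semiring(plus=max, times=lambda a, b: a + b,
                    zero=float("-inf"), one=0)

values = [1, 2, 3]
all_candidates = sublists(bag_of_lists, lambda v: [[v]], values)  # 8 sublists
best_total = sublists(max_plus, lambda v: v, values)              # best sum
```

Taking the maximum of the sums over `all_candidates` gives the same number as `best_total`; that agreement is what lets the generator and aggregator be fused.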

Main Contributions

The implementation of a GTA library

A simple and statically typed GTA-DSL is implemented

Algebraic structures and computations/transformations of them are implemented

Evaluation of GTA methodology

Object-oriented Functional Style

We defined the basic algebraic structures.

Relations/transformations of the algebras are well typed

Examples
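As one illustration of a typed algebraic structure, the sketch below models a predicate as an almost list homomorphism: a homomorphism into a monoid followed by a final projection to a Boolean. It is written in Python with type hints rather than Scala, and the names `Monoid`, `Predicate`, and `weight_limit` are assumptions made for this sketch, not the library's definitions. Weights are assumed to be non-negative.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Generic, List, Tuple, TypeVar

A = TypeVar("A")  # element type of the input list
M = TypeVar("M")  # carrier of the monoid

@dataclass(frozen=True)
class Monoid(Generic[M]):
    combine: Callable[[M, M], M]  # associative operator
    identity: M

@dataclass(frozen=True)
class Predicate(Generic[A, M]):
    monoid: Monoid[M]
    embed: Callable[[A], M]       # image of a one-element list
    accept: Callable[[M], bool]   # final projection to a Boolean

    def hom(self, xs: List[A]) -> M:
        """The homomorphism part: fold the embedded elements with the monoid."""
        return reduce(self.monoid.combine,
                      (self.embed(x) for x in xs),
                      self.monoid.identity)

    def __call__(self, xs: List[A]) -> bool:
        return self.accept(self.hom(xs))

def weight_limit(limit: int) -> Predicate[Tuple[int, int], int]:
    cap = limit + 1  # every overweight total collapses into this one value
    capped_add = Monoid(combine=lambda a, b: min(a + b, cap), identity=0)
    return Predicate(capped_add,
                     embed=lambda item: min(item[1], cap),
                     accept=lambda w: w <= limit)

p = weight_limit(15)
print(p([(1, 2), (3, 10)]))   # prints True  (12 Kg)
print(p([(2, 6), (3, 10)]))   # prints False (16 Kg)
```

Because the monoid has only `limit + 2` distinct values, a predicate of this shape can be tracked with a small table instead of being applied to each candidate.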

G‧T‧A Programming DSL

Users write GTA expressions of the form:

generate(g: GEN) filter(t: Predicate)* aggregate(a: Aggregator)

GEN, Aggregator, and Predicate are Scala traits defined in the GTA library.
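The shape of such an expression (one generator, any number of filters, one aggregator) can be mimicked in a few lines. This is a hypothetical Python sketch of the chaining style only; `GTAExpr` and its methods are invented here and are not the library's Scala traits. It evaluates naively, whereas the library fuses the expression first.

```python
from itertools import combinations

class GTAExpr:
    """A recorded GTA expression: one generator plus zero or more predicates."""

    def __init__(self, generator, predicates=()):
        self.generator = generator
        self.predicates = tuple(predicates)

    def filter(self, predicate):
        # Return a new expression so partial expressions can be reused.
        return GTAExpr(self.generator, self.predicates + (predicate,))

    def aggregate(self, aggregator):
        # Naive evaluation: enumerate, keep candidates passing every test, aggregate.
        candidates = self.generator()
        valid = [c for c in candidates if all(p(c) for p in self.predicates)]
        return aggregator(valid)

def generate(generator):
    return GTAExpr(generator)

items = [(1, 2), (2, 6), (3, 10)]  # (value in $, weight in Kg)

def sublists():
    return [list(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def max_value(candidates):
    return max(sum(v for v, _w in c) for c in candidates)

best = (generate(sublists)
        .filter(lambda c: sum(w for _v, w in c) <= 15)  # weight limit
        .filter(lambda c: len(c) <= 2)                  # at most two items
        .aggregate(max_value))
print(best)  # prints 4
```

Keeping the generator and predicates as data until `aggregate` is called is what would give a library room to transform the expression before running it.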

GTA-fusion

G + T + A → MapReduceable[f, ⊕]

Input: x1, x2, x3, …, xn

MAP: f is applied to each xi, giving table1, table2, …, tablen

REDUCE: table1 ⊕ table2 ⊕ … ⊕ tablen

[EuroPar’11]
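For the knapsack example, a fused program of this map-and-reduce shape can be sketched as follows. This is a hand-written Python illustration, not output of the library; the names `make_fused_knapsack`, `f`, `combine`, and `result` are assumptions. Each table maps a capped total weight to the best value reachable with that weight, and weights are assumed to be non-negative integers.

```python
from functools import reduce

def make_fused_knapsack(limit):
    cap = limit + 1  # every overweight total is stored under this single key

    def f(item):
        """MAP: the table for a single item (either skip it or take it)."""
        value, weight = item
        table = {0: 0}                                       # skip the item
        w = min(weight, cap)
        table[w] = max(table.get(w, float("-inf")), value)   # take the item
        return table

    def combine(left, right):
        """REDUCE: the operator on tables; it is associative."""
        out = {}
        for w1, v1 in left.items():
            for w2, v2 in right.items():
                w = min(w1 + w2, cap)
                out[w] = max(out.get(w, float("-inf")), v1 + v2)
        return out

    def result(table):
        """Read the answer: best value among totals within the limit."""
        return max(v for w, v in table.items() if w <= limit)

    return f, combine, result

def fused_knapsack(items, limit):
    f, combine, result = make_fused_knapsack(limit)
    return result(reduce(combine, map(f, items), {0: 0}))

items = [(1, 2), (2, 6), (3, 10)]   # (value in $, weight in Kg)
print(fused_knapsack(items, 15))    # prints 4
```

Each table has at most `limit + 2` entries, so no list of 2^n candidates is ever built, and because `combine` is associative the reduction can be split across workers in any grouping.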

Implementation of GTA Fusion/Optimization

The main difficulties:

How to define a polymorphic generator

How to define a predicate for test

How to define intermediate data structures and other algebraic structures

More Examples

More examples in the paper and source package:

Extended Knapsack problems

The maximum segment sum problem

Finding the most probable sequence (Viterbi algorithm)

More information: https://bitbucket.org/inii/gtalib

G‧T‧A Building Blocks

Our library provides commonly used G·T·A building blocks, and users can also implement their own generators, tests, and aggregators.

Performance Evaluations

Evaluations on EdubaseCluster (Cloud)

– Up to 32 VM nodes, each with 3 GB RAM and one single-core CPU

– Executed on Spark, an in-memory MapReduce cluster

Execution Time (Knapsack)

[Figure: execution time in seconds versus number of VM nodes, for 1.00E+07 and 1.00E+08 items]

Number of VM nodes   1.00E+07 items (s)   1.00E+08 items (s)
4                    203.63               1727.973
8                    92.83                679.305
12                   64.64                637.33
16                   47.76                471.2
20                   37.06                362.36
24                   29.78                287.08
28                   25.17                234.25
32                   23.25                223.44

Linear Speedup

[Figure: speedup versus number of VM nodes (4 to 32) for Knapsack, ViterbiAlg, and MSS]

Conclusions

We show GTA can be efficiently implemented

GTA-DSL can simplify parallel programming

Simple programming model

Good code efficiency

GTA-DSL is architecture independent

Future Work

Enrich the library with more building blocks in terms of G, T, A

GTA-DSL can be extended to process more complex data structures such as trees and graphs

Q&A

Thank you very much!
