Example: Sorting on Distributed Computing Environment
Apr 20, 2009



About this presentation

An example to help you start preparing for the "Enshu" (exercise) part of this class
  In that part, you give a talk about the problem assigned to you, according to the theme of the day.
This is not a complete presentation. It only shows the points to be studied, explained, and solved.

Sorting a Large Amount of Data

Data size > memory size of a single computer
  Ex. 1~100 trillion integers
Distributed parallel sort: distribute the data across multiple computers on a network and sort it.
  ⇒ Uses the computational power of multiple computers
  ⇒ Requires communication among computers

Computational Infrastructures

Case 1: PCs in a computer room
  Use all of the PCs on holidays or at midnight
  ~100 PCs (200~400 GB of memory in total)
Case 2: Supercomputers in Japan
  Enable "Ultra Large Scale Computation" by using supercomputers all over Japan
  10~20 supercomputers
  Speed: 10~100 TFLOPS each
  Memory: 10~100 TB each
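The rough arithmetic below compares the size of the target data with the total memory of each case. It is only a sketch: the 8 bytes per integer is an assumption (the slides do not state the integer width), and the function and variable names are illustrative.

```python
# Rough check: does the data set fit in the total memory of each infrastructure?
BYTES_PER_INT = 8  # assumption: 64-bit integers (not stated in the slides)

def data_size_bytes(num_integers, bytes_per_int=BYTES_PER_INT):
    """Size of the raw data in bytes."""
    return num_integers * bytes_per_int

GB = 10**9
TB = 10**12

# Total memory (lower end, upper end) derived from the figures on the slide.
case1_memory = (200 * GB, 400 * GB)            # ~100 PCs in total
case2_memory = (10 * 10 * TB, 20 * 100 * TB)   # 10~20 supercomputers, 10~100 TB each

for n in (10**12, 100 * 10**12):               # 1 and 100 trillion integers
    size = data_size_bytes(n)
    print(f"{n:.0e} integers -> {size / TB:.0f} TB")
    print("  fits in Case 1 (upper end):", size <= case1_memory[1])
    print("  fits in Case 2 (upper end):", size <= case2_memory[1])
```

Under this assumption, 1 trillion integers already occupy 8 TB, so the comparison with the available memory should be part of the analysis for each case.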

Network Infrastructures

Case 1: Ethernet switch
  Bandwidth: 100 Mbps ~ 1 Gbps
  Latency: 0.05~0.1 msec
Case 2: SINET3 (academic network in Japan)
  Backbone bandwidth: 10~40 Gbps
  Bandwidth per computer: ~10 Gbps
  Latency: 10~100 msec (depends on the length of the physical network)

Bandwidth? Latency?

Bandwidth: available speed of data transfer (bit/sec) on the network
Latency: minimum time required for each data transfer
Estimation of the cost (time T) of a data transfer: T = L + S / B
  L: latency, B: bandwidth, S: data size (bit)
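The estimate T = L + S / B can be tried out with a few lines of Python. This is a minimal sketch: the function name and the two parameter sets are illustrative choices, picked from the ranges given for Case 1 and Case 2 rather than measured.

```python
def transfer_time(size_bits, latency_s, bandwidth_bps):
    """Estimated time T = L + S / B for one data transfer, in seconds."""
    return latency_s + size_bits / bandwidth_bps

# Illustrative parameters picked from the ranges on the network slide.
ETHERNET = {"latency_s": 0.1e-3, "bandwidth_bps": 1e9}   # Case 1: 0.1 msec, 1 Gbps
SINET3 = {"latency_s": 100e-3, "bandwidth_bps": 10e9}    # Case 2: 100 msec, 10 Gbps

for label, size_bits in (("1 KB message", 8e3), ("1 GB message", 8e9)):
    t1 = transfer_time(size_bits, **ETHERNET)
    t2 = transfer_time(size_bits, **SINET3)
    print(f"{label}: Case 1 = {t1:.4f} s, Case 2 = {t2:.4f} s")
```

With these values, the small message is dominated by latency (Case 1 is faster) and the large message by bandwidth (Case 2 is faster), which is why the message size matters when choosing how to distribute the data.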

System Infrastructure

Case 1:
  The network environment can be considered "reliable", since no other user is using the system.
  Implementing the program becomes easier by installing MPI (Message Passing Interface).
Case 2:
  The network environment may be "unreliable", since many users share the network routes.
  Using MPI is difficult, since the environment is "heterogeneous" (different architectures and OSs).

Implementation on the Internet

Everything can be built on the "Application Layer"
Choose an Internet protocol: TCP or UDP
  Case 1: UDP (or MPI over UDP)
  Case 2: TCP
Choose a parallel sorting algorithm
  Parallel algorithm: an algorithm that solves a problem by dividing it into multiple tasks and running them concurrently
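The definition of a parallel algorithm can be illustrated with a small split, sort, and merge sketch. It is not the algorithm prescribed by this class: the function name `parallel_sort` is hypothetical, and threads on one machine stand in for the computers on the network, so the code shows the structure (divide into tasks, run them concurrently, combine) rather than a real speedup.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def parallel_sort(data, num_tasks=4):
    """Sort `data` by sorting chunks concurrently and merging the results."""
    if not data:
        return []
    # Divide the problem into roughly equal chunks, one per task.
    chunk = -(-len(data) // num_tasks)  # ceiling division
    chunks = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    # Run the per-chunk sorts concurrently.
    with ThreadPoolExecutor(max_workers=num_tasks) as pool:
        sorted_chunks = list(pool.map(sorted, chunks))
    # Combine the sorted chunks with a k-way merge.
    return list(heapq.merge(*sorted_chunks))

print(parallel_sort([5, 3, 9, 1, 7, 2, 8, 6, 4, 0], num_tasks=3))
# prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

In a distributed setting, the chunks would live on different computers and the merge step would require sending data over the network.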

"Layers" of networks

OSI model: divides the functions of network devices into 7 layers
  Application Layer
  Presentation Layer
  Session Layer
  Transport Layer
  Network Layer
  Data Link Layer
  Physical Layer

TCP or UDP

TCP (Transmission Control Protocol)
  Guarantees the completion of data transfer. Slow but reliable.
UDP (User Datagram Protocol)
  No guarantee about data transfer. Fast but unreliable.
Sorting requires all data to be transferred correctly. ⇒ TCP is preferred.
  On reliable networks such as Case 1, UDP can be used.
  MPI is an interface over UDP (or TCP). It guarantees data transfer even over UDP.
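A minimal sketch of sending integers over TCP, built entirely at the application layer with Python's standard `socket` module. The helper names (`recv_exact`, `send_ints`, `recv_ints`, `demo`) and the wire format (a count followed by 64-bit big-endian integers) are assumptions made for illustration; the sender and receiver run on the same machine over the loopback address instead of on separate computers.

```python
import socket
import struct
import threading

def recv_exact(conn, num_bytes):
    """Read exactly num_bytes from a TCP connection (recv may return less)."""
    buf = bytearray()
    while len(buf) < num_bytes:
        part = conn.recv(num_bytes - len(buf))
        if not part:
            raise ConnectionError("connection closed early")
        buf.extend(part)
    return bytes(buf)

def send_ints(conn, values):
    """Send a count header followed by the values as 64-bit big-endian integers."""
    conn.sendall(struct.pack("!Q", len(values)))
    conn.sendall(struct.pack(f"!{len(values)}q", *values))

def recv_ints(conn):
    """Receive a list of integers sent by send_ints."""
    (count,) = struct.unpack("!Q", recv_exact(conn, 8))
    return list(struct.unpack(f"!{count}q", recv_exact(conn, 8 * count)))

def demo(values):
    """Send values to a receiver thread over loopback TCP and return what arrived."""
    received = []
    server = socket.create_server(("127.0.0.1", 0))  # port 0: the OS picks a free port
    port = server.getsockname()[1]

    def receiver():
        conn, _ = server.accept()
        with conn:
            received.extend(recv_ints(conn))

    thread = threading.Thread(target=receiver)
    thread.start()
    with socket.create_connection(("127.0.0.1", port)) as client:
        send_ints(client, values)
    thread.join()
    server.close()
    return received

print(demo([42, -7, 10**12]))
```

TCP delivers the bytes completely and in order, so the application only has to frame its messages (here with the count header). With UDP, the application or a layer such as MPI would also have to detect and resend lost datagrams.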

Detailed Implementation of Sorting Program

Implementation of the parallel algorithm:
  Cost of computation?
  Cost of communication?
  Memory requirements?
The policies for distributing computation and data affect the performance.

Characteristics of each case

Case 1:
  Low latency and narrow bandwidth
  Total amount of computational power and memory is small
  No need for load balancing
Case 2:
  High latency and wide bandwidth
  (Possibly) large amount of computational power and memory
  Requires load balancing according to the computational power of each machine

Implementation for Case 1

Distribute the same amount of computation and data to each computer
Consider the number of PCs to be used:
  Communication cost increases with the number of PCs
  If the target data is large enough, parallelization will achieve sufficient speedup even with 100 PCs.
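The trade-off between the number of PCs and the communication cost can be explored with a very rough cost model. Everything in it is an assumption made for illustration, not a result from this presentation: each PC sorts n/p integers locally, then exchanges its data with the other PCs in p - 1 messages, each costing T = L + S / B; the constants (64-bit integers, 1e-8 seconds per comparison, 0.1 msec latency, 1 Gbps) are placeholders, and contention on the switch is ignored.

```python
import math

def estimated_time(n, p, latency_s=0.1e-3, bandwidth_bps=1e9,
                   bits_per_int=64, sec_per_compare=1e-8):
    """Very rough time estimate (seconds) for sorting n integers on p PCs."""
    per_pc = n / p
    # Local sort on each PC: about (n/p) * log2(n/p) comparisons.
    compute = sec_per_compare * per_pc * math.log2(max(per_pc, 2))
    if p == 1:
        return compute
    # All-to-all exchange: each PC sends (p - 1) messages of n / p^2 integers,
    # each costing T = L + S / B.
    message_bits = per_pc / p * bits_per_int
    communicate = (p - 1) * (latency_s + message_bits / bandwidth_bps)
    return compute + communicate

for n in (10**4, 10**9):
    t1 = estimated_time(n, 1)
    for p in (10, 100):
        print(f"n={n:.0e}, p={p}: estimated speedup {t1 / estimated_time(n, p):.1f}x")
```

In this model, small inputs become slower with 100 PCs because the latency term grows with p, while large inputs still show a clear speedup.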

Implementation for Case 2

The amount of computation and data assigned to each computer depends on its relative performance.
  Accurate analysis of the performance of each machine and of the network is important.
It is likely to be difficult to obtain a sufficient parallelization effect with a large number of nodes.
  Performance degrades because of load imbalance and communication cost.
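One simple distribution policy for this case is to give each machine a share of the data proportional to its relative performance. The sketch below assumes that a single performance number per machine is available; the function name `proportional_split` and the example speeds are hypothetical, and network cost is not taken into account.

```python
def proportional_split(data, performance):
    """Split `data` into consecutive parts sized in proportion to `performance`."""
    total = sum(performance)
    parts, start, cumulative = [], 0, 0
    for perf in performance:
        cumulative += perf
        # Cumulative rounding keeps the part sizes summing to len(data).
        end = round(len(data) * cumulative / total)
        parts.append(data[start:end])
        start = end
    return parts

# Hypothetical relative speeds of three machines (e.g. 10, 30 and 60 TFLOPS).
parts = proportional_split(list(range(100)), [10, 30, 60])
print([len(part) for part in parts])  # prints [10, 30, 60]
```

A more accurate policy would also weigh the bandwidth and latency between machines, which is where the performance analysis mentioned above comes in.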

To complete the presentation of your solution

Detailed information about the infrastructure.
Detailed information about the implementation:
  Parallel algorithm?
  Policy for distributing data and computation?
    Estimate computation and communication time, and find the optimal distribution.
  How to distribute the target data, and how to gather the results?
Management of multiple jobs.
Standardization of the solution
Relationship with future networks

Exercise

Find existing parallel algorithms
  For example: Sorting
  Note: Look for algorithms for a "distributed memory parallel" computing environment
    = Each computer has its own memory
    ⇒ Requires explicit communication
Consider how to implement them on computers connected via the Internet.
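As one example of an existing algorithm for distributed-memory environments, sample sort can be simulated on a single machine as below. This is only a sketch to show where explicit communication would occur: each inner list stands for the private memory of one computer, the "exchange" is a plain loop instead of network messages, and the function name and the `samples_per_node` parameter are assumptions.

```python
import bisect
import random

def sample_sort(node_data, samples_per_node=8, seed=0):
    """Simulate sample sort; node_data[i] is the list held in node i's own memory."""
    rng = random.Random(seed)
    p = len(node_data)
    # 1. Each node sorts its local data (no communication).
    local = [sorted(chunk) for chunk in node_data]
    # 2. Each node sends a few samples to one node, which picks p - 1 splitters.
    samples = sorted(x for chunk in local
                     for x in rng.sample(chunk, min(samples_per_node, len(chunk))))
    splitters = [samples[len(samples) * i // p] for i in range(1, p)] if samples else []
    # 3. All-to-all exchange: every element goes to the node owning its key range.
    inbox = [[] for _ in range(p)]
    for chunk in local:
        for x in chunk:
            inbox[bisect.bisect_right(splitters, x)].append(x)
    # 4. Each node sorts what it received; reading the nodes in order gives sorted data.
    return [sorted(box) for box in inbox]

print(sample_sort([[9, 4, 7], [1, 8, 3], [6, 2, 5]]))
# prints [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

Steps 2 and 3 are the parts that would become messages between computers, so their cost depends on the latency and bandwidth of the network in each case.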