Parallel Programming with PVM Prof. Sivarama Dandamudi School of Computer Science Carleton University


Page 1

Parallel Programming with PVM

Prof. Sivarama Dandamudi
School of Computer Science

Carleton University

Page 2

Carleton University © S. Dandamudi

Parallel Algorithm Models

Five basic models:
- Data parallel model
- Task graph model
- Work pool model
- Master-slave model
- Pipeline model
Hybrid models combine these.

Page 3

Parallel Algorithm Models (cont’d)

Data parallel model
- One of the simplest of all the models
- Tasks are statically mapped onto processors
- Each task performs a similar operation on different data
- Also called the data parallelism model
- Work may be done in phases; operations in different phases may differ
- Ex: Matrix multiplication

Page 4

Parallel Algorithm Models (cont’d)

Data parallel model

    [A11 A12]   [B11 B12]   [C11 C12]
    [A21 A22] . [B21 B22] = [C21 C22]

A decomposition into four tasks:
Task 1: C11 = A11 B11 + A12 B21
Task 2: C12 = A11 B12 + A12 B22
Task 3: C21 = A21 B11 + A22 B21
Task 4: C22 = A21 B12 + A22 B22

Page 5

Parallel Algorithm Models (cont’d)

Task graph model
- The parallel algorithm is viewed as a task-dependency graph
- Called the task parallelism model
- Typically used for tasks that have a large amount of data
- Static mapping is used to optimize data movement cost
- Locality-based mapping is important
- Ex: Divide-and-conquer algorithms, parallel quicksort

Page 6

Parallel Algorithm Models (cont’d)

Task parallelism [Figure: task-dependency graph]

Page 7

Parallel Algorithm Models (cont’d)

Work pool model
- Dynamic mapping of tasks onto processors
- Important for load balancing
- Used on message passing systems when the data associated with a task is relatively small
- Granularity of tasks:
  - Too small: overhead in accessing tasks can increase
  - Too big: load imbalance
- Ex: Parallelization of loops by chunk scheduling

Page 8

Parallel Algorithm Models (cont’d)

Master-slave model
- One or more master processes generate work and allocate it to worker processes
- Also called the manager-worker model
- Suitable for both shared-memory and message passing systems
- The master can potentially become a bottleneck
- Granularity of tasks is important

Page 9

Parallel Algorithm Models (cont’d)

Pipeline model
- A stream of data passes through a series of processes
- Each process performs some task on the data
- Also called the stream parallelism model
- Uses a producer-consumer relationship
- Overlapped execution
- Useful in applications such as database query processing
- Potential problem: one process can delay the whole pipeline

Page 10

Parallel Algorithm Models (cont’d)

Pipeline model

[Figure: a query pipeline through relations R1, R2, R3, R4, R5]

Pipelined processing can avoid writing temporary results on disk and reading them back.

Page 11

Parallel Algorithm Models (cont’d)

Hybrid models
- Possible to use multiple models
- Hierarchically: different models at different levels
- Sequentially: different models in different phases
- Ex: The major computation may use the task graph model, while each node of the graph uses the data parallel or pipeline model

Page 12

PVM: Parallel Virtual Machine
- A collaborative effort of Oak Ridge National Lab, University of Tennessee, Emory University, and Carnegie Mellon University
- Began in 1989
- Version 1.0 was used internally
- Version 2.0 released in March 1991
- Version 3.0 released in February 1993

Page 13

PVM (cont’d)

Parallel virtual machine
- Targeted for heterogeneous network computing
- Handles different architectures, data formats, computational speeds, machine loads, and network loads

Page 14

PVM Calls

Process control

int tid = pvm_mytid(void)
- Returns the tid of the calling process
- Can be called multiple times

int info = pvm_exit(void)
- Does not kill the process
- Tells the local pvmd that this process is leaving PVM
- info < 0 indicates an error (e.g., pvmd not responding)

Page 15

PVM Calls (cont’d)

Process control

int numt = pvm_spawn(char *task, char **argv, int flag, char *where, int ntask, int *tids)
- Starts ntask copies of the executable file task
- argv: arguments to task (NULL terminated)
- where: a specific host or a specific architecture (PVM_ARCH), depending on flag

Page 16

PVM Calls (cont’d)

flag specifies options:

Value  Option          Meaning
0      PvmTaskDefault  PVM chooses where to spawn
1      PvmTaskHost     where specifies a host
2      PvmTaskArch     where specifies an architecture
4      PvmTaskDebug    starts tasks under a debugger

Page 17

PVM Calls (cont’d)

Process control

int info = pvm_kill(int tid)
- Kills the PVM task identified by tid
- Does not kill the calling task
- To kill the calling task: first call pvm_exit(), then exit()
- Writes to the file /tmp/pvml.<uid>

Page 18

PVM Calls (cont’d)

Information

int tid = pvm_parent(void)
- Returns the tid of the process that spawned the calling task
- Returns PvmNoParent if the task was not created by pvm_spawn()

Page 19

PVM Calls (cont’d)

Information

int info = pvm_config(int *nhost, int *narch, struct pvmhostinfo **hostp)
- Returns nhost = number of hosts
- Returns narch = number of different data formats

Page 20

PVM Calls (cont’d)

Message sending involves three steps:
1. Initialize the send buffer: pvm_initsend()
2. Pack the message: pvm_pk*() (several pack routines are available)
3. Send the message: pvm_send()

Page 21

PVM Calls (cont’d)

Message sending

int bufid = pvm_initsend(int encoding)
- Called before packing a new message into the buffer
- Clears the send buffer and creates a new one for packing a new message
- bufid = new buffer id

Page 22

PVM Calls (cont’d)

encoding can have three options:
- PvmDataDefault: XDR encoding is used by default; useful for heterogeneous architectures
- PvmDataRaw: no encoding is done; messages are sent in their original form
- PvmDataInPlace: no buffer copying; the buffer should not be modified until sent

Page 23

PVM Calls (cont’d)

Packing data
- Several routines are available (one for each data type)
- Each takes three arguments

int info = pvm_pkbyte(char *cp, int nitem, int stride)
- nitem = number of items to be packed
- stride = stride in elements

Page 24

PVM Calls (cont’d)

Packing data: pvm_pkint, pvm_pklong, pvm_pkfloat, pvm_pkdouble, pvm_pkshort

The pack string routine requires only the NULL-terminated string pointer:

pvm_pkstr(char *cp)

Page 25

PVM Calls (cont’d)

Sending data

int info = pvm_send(int tid, int msgtag)
- Sends the message in the packed buffer to task tid
- The message is tagged with msgtag
- Message tags are useful to distinguish different types of messages

Page 26

PVM Calls (cont’d)

Sending data (multicast)

int info = pvm_mcast(int *tids, int ntask, int msgtag)
- Sends the message in the packed buffer to all tasks in the tids array (except itself)
- The tids array length is given by ntask

Page 27

PVM Calls (cont’d)

Receiving data involves two steps:
1. Receive the data
2. Unpack it

Two versions:
- Blocking: waits until the message arrives
- Non-blocking: does not wait

Page 28

PVM Calls (cont’d)

Receiving data: blocking receive

int info = pvm_recv(int tid, int msgtag)
- Waits until a message with msgtag has arrived from task tid
- The wildcard value (-1) is allowed for both msgtag and tid

Page 29

PVM Calls (cont’d)

Receiving data: non-blocking receive

int info = pvm_nrecv(int tid, int msgtag)
- If no message with msgtag has arrived from task tid, returns bufid = 0
- Otherwise, behaves like the blocking receive

Page 30

PVM Calls (cont’d)

Receiving data: probing for a message

int info = pvm_probe(int tid, int msgtag)
- If no message with msgtag has arrived from task tid, returns bufid = 0
- Otherwise, returns a bufid for the message
- Does not receive the message

Page 31

PVM Calls (cont’d)

Unpacking data (similar to the packing routines): pvm_upkint, pvm_upklong, pvm_upkfloat, pvm_upkdouble, pvm_upkshort, pvm_upkbyte

The unpack string routine requires only the NULL-terminated string pointer:

pvm_upkstr(char *cp)

Page 32

PVM Calls (cont’d)

Buffer information
- Useful to find the size of the received message

int info = pvm_bufinfo(int bufid, int *bytes, int *msgtag, int *tid)
- Returns msgtag, source tid, and size in bytes

Page 33

Example

Finds the sum of the elements of a given vector
- Vector size is given as input
- The program can be run on a PVM with up to 10 nodes (can be modified by changing a constant)
- The vector size is assumed to be evenly divisible by the number of nodes in the PVM (easy to remove this restriction)
- Master (vecsum.c) and slave (vecsum_slave.c) programs

Page 34

Example (cont’d)

vecsum.c

#include <stdio.h>
#include <sys/time.h>
#include "pvm3.h"

#define MAX_SIZE 250000  /* max. vector size */
#define NPROCS   10      /* max. number of PVM nodes */

Page 35

Example (cont’d)

main()
{
  int cc, tid[NPROCS];
  long vector[MAX_SIZE];
  double sum = 0,
         partial_sum;   /* partial sum received from slaves */
  long i, vector_size;

Page 36

Example (cont’d)

  int nhost,   /* actual # of hosts in PVM */
      size;    /* size of vector to be distributed */
  struct timeval start_time, finish_time;
  long sum_time;

Page 37

Example (cont’d)

  printf("Vector size = ");
  scanf("%ld", &vector_size);

  for (i = 0; i < vector_size; i++)   /* initialize vector */
      vector[i] = i;

  gettimeofday(&start_time, (struct timezone *)0);   /* start time */

Page 38

Example (cont’d)

  tid[0] = pvm_mytid();   /* establish my tid */

  /* get # of hosts using pvm_config() */
  pvm_config(&nhost, (int *)0, (struct pvmhostinfo **)0);

  size = vector_size/nhost;   /* size of vector to send to slaves */

Page 39

Example (cont’d)

  if (nhost > 1)
      pvm_spawn("vecsum_slave", (char **)0, 0, "", nhost-1, &tid[1]);

  for (i = 1; i < nhost; i++) {
      /* distribute data to slaves */
      pvm_initsend(PvmDataDefault);
      pvm_pklong(&vector[i*size], size, 1);
      pvm_send(tid[i], 1);
  }

Page 40

Example (cont’d)

  for (i = 0; i < size; i++)   /* perform local sum */
      sum += vector[i];

  for (i = 1; i < nhost; i++) {
      /* collect partial sums from slaves */
      pvm_recv(-1, 2);
      pvm_upkdouble(&partial_sum, 1, 1);
      sum += partial_sum;
  }

Page 41

Example (cont’d)

  gettimeofday(&finish_time, (struct timezone *)0);   /* finish time */

  sum_time = (finish_time.tv_sec - start_time.tv_sec) * 1000000
           + finish_time.tv_usec - start_time.tv_usec;
  /* elapsed time in microseconds; converted to seconds when printed */

Page 42

Example (cont’d)

  printf("Sum = %lf\n", sum);
  printf("Sum time on %d hosts = %lf sec\n",
         nhost, (double)sum_time/1000000);
  pvm_exit();
}

Page 43

Example (cont’d)

vecsum_slave.c

#include "pvm3.h"

#define MAX_SIZE 250000

main()
{
  int ptid, bufid, vector_bytes;
  long vector[MAX_SIZE];
  double sum = 0;
  int i;

Page 44

Example (cont’d)

  ptid = pvm_parent();   /* find parent tid */
  bufid = pvm_recv(ptid, 1);   /* receive data from master */

  /* use pvm_bufinfo() to find the number of bytes received */
  pvm_bufinfo(bufid, &vector_bytes, (int *)0, (int *)0);

Page 45

Example (cont’d)

  pvm_upklong(vector, vector_bytes/sizeof(long), 1);   /* unpack */

  for (i = 0; i < vector_bytes/sizeof(long); i++)   /* local summation */
      sum += vector[i];

Page 46

Example (cont’d)

  pvm_initsend(PvmDataDefault);   /* send sum to master */
  pvm_pkdouble(&sum, 1, 1);
  pvm_send(ptid, 2);   /* use msg type 2 for partial sum */

  pvm_exit();
}