Parallel Programming with MPI
Prof. Sivarama Dandamudi, School of Computer Science, Carleton University


  • Introduction
    Problem: lack of a standard for message-passing routines
    Big issue: portability problems
    MPI defines a core set of library routines (API) for message passing (more than 125 functions in total!)
    Several commercial and public-domain implementations:
      Cray, IBM, Intel
      MPI-CH from Argonne National Laboratory
      LAM from Ohio Supercomputer Center/Indiana University

  • Introduction (cont'd)
    Some additional goals [Snir et al. 1996]:
      Allows efficient communication
        Avoids memory-to-memory copying
      Allows computation and communication overlap
        Non-blocking communication
      Allows implementation in a heterogeneous environment
      Provides a reliable communication interface
        Users don't have to worry about communication failures

  • MPI
    MPI is large but not complex: 125 functions
    But... only 6 functions are needed to write a simple MPI program:
      MPI_Init
      MPI_Finalize
      MPI_Comm_size
      MPI_Comm_rank
      MPI_Send
      MPI_Recv

  • MPI (cont'd)
    Before any other MPI function is called, we must initialize:
      MPI_Init(&argc, &argv)
    To indicate the end of MPI calls:
      MPI_Finalize()
        Cleans up the MPI state
        Should be the last MPI function call

  • MPI (cont'd)
    A typical program structure:

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);
          . . .               /* main program */
          . . .
          MPI_Finalize();
      }

  • MPI (cont'd)
    MPI uses communicators to group processes that communicate with each other
    Predefined communicator: MPI_COMM_WORLD
      Consists of all processes running when the program begins execution
      Sufficient for simple programs

  • MPI (cont'd)
    Process rank
      Similar to mytid in PVM
    MPI_Comm_rank(MPI_Comm comm, int *rank)
      First argument: communicator
      Second argument: returns the process rank

  • MPI (cont'd)
    Number of processes
    MPI_Comm_size(MPI_Comm comm, int *size)
      First argument: communicator
      Second argument: returns the number of processes
    Example: MPI_Comm_size(MPI_COMM_WORLD, &nprocs)
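
    The following is a minimal sketch (not part of the original slides) showing how the six core calls fit together in one complete program:

      #include <stdio.h>
      #include <mpi.h>

      int main(int argc, char **argv)
      {
          int rank, nprocs;

          MPI_Init(&argc, &argv);                  /* must precede all other MPI calls */
          MPI_Comm_size(MPI_COMM_WORLD, &nprocs);  /* total number of processes        */
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's rank: 0..nprocs-1 */

          printf("Process %d of %d\n", rank, nprocs);

          MPI_Finalize();                          /* should be the last MPI call      */
          return 0;
      }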

  • MPI (cont'd)
    Sending a message (blocking version)
    MPI_Send(void* buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
      Buffer description: buf, count, datatype
      Destination specification: dest, tag, comm
    Data types include:
      MPI_CHAR, MPI_INT, MPI_LONG
      MPI_FLOAT, MPI_DOUBLE

  • MPI (cont'd)
    Receiving a message (blocking version)
    MPI_Recv(void* buf, int count, MPI_Datatype datatype,
             int source, int tag, MPI_Comm comm, MPI_Status *status)
    Wildcard specification allowed:
      MPI_ANY_SOURCE
      MPI_ANY_TAG
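
    A sketch (not from the original slides) of the blocking pair above: rank 0 sends ten integers to rank 1; the tag value 99 and the two-process layout are arbitrary choices for illustration.

      #include <stdio.h>
      #include <mpi.h>

      int main(int argc, char **argv)
      {
          int rank, data[10];
          MPI_Status status;

          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          if (rank == 0) {
              for (int i = 0; i < 10; i++)
                  data[i] = i;                                     /* fill the send buffer */
              MPI_Send(data, 10, MPI_INT, 1, 99, MPI_COMM_WORLD);  /* dest = 1, tag = 99   */
          } else if (rank == 1) {
              MPI_Recv(data, 10, MPI_INT, 0, 99, MPI_COMM_WORLD, &status);
              printf("Rank 1 received %d ... %d\n", data[0], data[9]);
          }

          MPI_Finalize();
          return 0;
      }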

  • MPI (cont'd)
    Receiving a message: status of the received message
    Status gives two pieces of information directly (useful when wildcards are used):
      status.MPI_SOURCE: gives the identity of the source
      status.MPI_TAG: gives the tag information

  • MPI (cont'd)
    Receiving a message: status also gives message size information indirectly
    MPI_Get_count(MPI_Status *status, MPI_Datatype datatype, int *count)
      Takes the status and the datatype as inputs and returns the number of received elements via count
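
    A sketch of using the status fields and MPI_Get_count together with the wildcards from the previous slides; the buffer size of 100 (and therefore fewer than 100 processes) is an arbitrary assumption.

      #include <stdio.h>
      #include <mpi.h>

      int main(int argc, char **argv)
      {
          int rank, nprocs, count, buf[100] = {0};
          MPI_Status status;

          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

          if (rank != 0) {
              /* every non-root process sends 'rank' integers, tagged with its rank */
              MPI_Send(buf, rank, MPI_INT, 0, rank, MPI_COMM_WORLD);
          } else {
              for (int i = 1; i < nprocs; i++) {
                  MPI_Recv(buf, 100, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                           MPI_COMM_WORLD, &status);
                  MPI_Get_count(&status, MPI_INT, &count);
                  printf("Got %d ints from rank %d (tag %d)\n",
                         count, status.MPI_SOURCE, status.MPI_TAG);
              }
          }

          MPI_Finalize();
          return 0;
      }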

  • MPI (cont'd)
    Non-blocking communication
      Prefix send and recv with I (for "immediate"):
        MPI_Isend
        MPI_Irecv
      Need completion operations to see if the operation has completed:
        MPI_Wait
        MPI_Test

  • MPI (cont'd)
    Sending a message (non-blocking version)
    MPI_Isend(void* buf, int count, MPI_Datatype datatype,
              int dest, int tag, MPI_Comm comm, MPI_Request *request)
      Returns the request handle via request

  • MPI (cont'd)
    Receiving a message (non-blocking version)
    MPI_Irecv(void* buf, int count, MPI_Datatype datatype,
              int source, int tag, MPI_Comm comm, MPI_Request *request)
      Same arguments as MPI_Isend, except that source replaces dest
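
    A sketch of the two calls used together, assuming exactly two processes; each rank posts a non-blocking receive, then a non-blocking send, and completes both requests with MPI_Wait (introduced on the following slides).

      #include <stdio.h>
      #include <mpi.h>

      int main(int argc, char **argv)
      {
          int rank, other, sendval, recvval;
          MPI_Request reqs[2];
          MPI_Status  stats[2];

          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          other   = 1 - rank;              /* partner rank; assumes exactly 2 processes */
          sendval = rank;

          MPI_Irecv(&recvval, 1, MPI_INT, other, 0, MPI_COMM_WORLD, &reqs[0]);
          MPI_Isend(&sendval, 1, MPI_INT, other, 0, MPI_COMM_WORLD, &reqs[1]);

          /* ... computation could overlap with communication here ... */

          MPI_Wait(&reqs[0], &stats[0]);   /* receive complete: recvval is valid    */
          MPI_Wait(&reqs[1], &stats[1]);   /* send complete: sendval may be reused  */
          printf("Rank %d received %d\n", rank, recvval);

          MPI_Finalize();
          return 0;
      }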

  • MPI (cont'd)
    How do we know when a non-blocking operation is done?
      Use MPI_Test or MPI_Wait
    Completion of a send indicates:
      The sender can access the send buffer
    Completion of a receive indicates:
      The receive buffer contains the message

  • MPI (cont'd)
    MPI_Test returns the status; it does not wait for the operation to complete
    MPI_Test(MPI_Request *request, int *flag, MPI_Status *status)
      request: request handle
      flag: operation status, true if completed
      status: valid only if flag is true
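
    A sketch of polling with MPI_Test so that useful computation can continue while a receive is pending; exactly two processes are assumed and do_some_work() is a hypothetical placeholder.

      #include <stdio.h>
      #include <mpi.h>

      static void do_some_work(void) { /* stand-in for useful local computation */ }

      int main(int argc, char **argv)
      {
          int rank, value = 42, flag = 0;
          MPI_Request request;
          MPI_Status  status;

          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          if (rank == 0) {
              MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
          } else if (rank == 1) {
              MPI_Irecv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &request);
              while (!flag) {
                  MPI_Test(&request, &flag, &status);  /* returns immediately */
                  if (!flag)
                      do_some_work();                  /* overlap computation */
              }
              printf("Rank 1 eventually received %d\n", value);
          }

          MPI_Finalize();
          return 0;
      }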

  • MPI (cont'd)
    MPI_Wait waits until the operation is completed
    MPI_Wait(MPI_Request *request, MPI_Status *status)
      request: request handle
      status: gives the status of the completed operation

  • MPI Collective Communication
    Several functions are provided to support collective communication; some examples:
      MPI_Barrier: barrier synchronization
      MPI_Bcast: broadcast
      MPI_Scatter
      MPI_Gather
      MPI_Reduce: global reduction

  • (Figure from Snir et al. 1996)

  • MPI Collective Communication (cont'd)
    MPI_Barrier blocks the caller until all group members have called it
    MPI_Barrier(MPI_Comm comm)
      The call returns at any process only after all group members have entered the call
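
    A minimal sketch of a barrier in use: no process passes the barrier until every process in the communicator has reached it.

      #include <stdio.h>
      #include <mpi.h>

      int main(int argc, char **argv)
      {
          int rank;

          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          printf("Rank %d: before the barrier\n", rank);
          MPI_Barrier(MPI_COMM_WORLD);   /* returns only after all processes arrive */
          printf("Rank %d: after the barrier\n", rank);

          MPI_Finalize();
          return 0;
      }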

  • MPI Collective Communication (cont'd)
    MPI_Bcast broadcasts a message from root to all processes of the group
    MPI_Bcast(void* buf, int count, MPI_Datatype datatype,
              int root, MPI_Comm comm)
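
    A sketch of a broadcast: only the root (rank 0 here, an arbitrary choice) holds the value beforehand, and every process in the communicator holds it afterwards.

      #include <stdio.h>
      #include <mpi.h>

      int main(int argc, char **argv)
      {
          int rank, n = 0;

          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          if (rank == 0)
              n = 100;                   /* only the root has the value initially */

          /* every process makes the same call with the same root */
          MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

          printf("Rank %d now has n = %d\n", rank, n);

          MPI_Finalize();
          return 0;
      }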

  • MPI Collective Communication (cont'd)
    MPI_Scatter distributes data from the root process to all the others in the group
    MPI_Scatter(void* send_buf, int send_count, MPI_Datatype send_type,
                void* recv_buf, int recv_count, MPI_Datatype recv_type,
                int root, MPI_Comm comm)

  • MPI Collective Communication (cont'd)
    MPI_Gather is the inverse of the scatter operation (gathers data and stores it in rank order)
    MPI_Gather(void* send_buf, int send_count, MPI_Datatype send_type,
               void* recv_buf, int recv_count, MPI_Datatype recv_type,
               int root, MPI_Comm comm)
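
    A sketch combining the two operations: the root scatters one integer to each process, each process works on its piece, and the root gathers the results back in rank order.

      #include <stdio.h>
      #include <stdlib.h>
      #include <mpi.h>

      int main(int argc, char **argv)
      {
          int rank, nprocs, mine;
          int *all = NULL;

          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

          if (rank == 0) {                           /* only the root needs the full array */
              all = malloc(nprocs * sizeof(int));
              for (int i = 0; i < nprocs; i++)
                  all[i] = i * i;
          }

          /* each process receives one element of the root's array */
          MPI_Scatter(all, 1, MPI_INT, &mine, 1, MPI_INT, 0, MPI_COMM_WORLD);
          mine += 1;                                 /* some local work on the piece */

          /* collect the modified pieces back at the root, in rank order */
          MPI_Gather(&mine, 1, MPI_INT, all, 1, MPI_INT, 0, MPI_COMM_WORLD);

          if (rank == 0) {
              for (int i = 0; i < nprocs; i++)
                  printf("%d ", all[i]);
              printf("\n");
              free(all);
          }

          MPI_Finalize();
          return 0;
      }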

  • MPI Collective Communication (cont'd)
    MPI_Reduce performs global reduction operations such as sum, max, min, AND, etc.
    MPI_Reduce(void* send_buf, void* recv_buf, int count,
               MPI_Datatype datatype, MPI_Op operation,
               int root, MPI_Comm comm)

  • MPI Collective Communication (cont'd)
    Predefined reduce operations include:
      MPI_MAX    maximum
      MPI_MIN    minimum
      MPI_SUM    sum
      MPI_PROD   product
      MPI_LAND   logical AND
      MPI_BAND   bitwise AND
      MPI_LOR    logical OR
      MPI_BOR    bitwise OR
      MPI_LXOR   logical XOR
      MPI_BXOR   bitwise XOR
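
    A sketch of a global sum: each process contributes one value and the root receives the reduction computed with MPI_SUM.

      #include <stdio.h>
      #include <mpi.h>

      int main(int argc, char **argv)
      {
          int rank, nprocs, local, sum = 0;

          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

          local = rank + 1;              /* each process contributes rank + 1 */

          /* global sum of the local values, delivered to the root (rank 0) */
          MPI_Reduce(&local, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

          if (rank == 0)
              printf("Sum of 1..%d = %d\n", nprocs, sum);

          MPI_Finalize();
          return 0;
      }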

  • (Figure from Snir et al. 1996)