Scheduling Considerations for building Dynamic Verification Tools for MPI
Sarvani Vakkalanka, Michael DeLisi, Ganesh Gopalakrishnan, Robert M. Kirby
School of Computing, University of Utah, Salt Lake City
Supported by Microsoft HPC Institutes,
NSF CNS-0509379
http://www.cs.utah.edu/formal_verification
Background
(BlueGene/L - Image courtesy of IBM / LLNL)
(Image courtesy of Steve Parker, CSAFE, Utah)
The scientific community is increasingly employing expensive supercomputers built around distributed programming libraries…
…to program large-scale simulations in all walks of science, engineering, math, economics, etc.
Current Programming Realities
Code written using mature libraries (MPI, OpenMP, PThreads, …)
API calls made from real programming languages (C, Fortran, C++)
Runtime semantics determined by realistic compilers and runtimes
How best to verify codes that will run on actual platforms?
Classical Model Checking
Finite-state model of the concurrent program → check properties
Extraction of Finite State Models for realistic programs is difficult.
Dynamic Verification
Actual concurrent program → check properties
Avoid model extraction, which can be tedious and imprecise
Program serves as its own model
Reduce Complexity through Reduction of interleavings (and other methods)
Dynamic Verification
Actual concurrent program → check properties
One specific test harness
Need a test harness in order to run the code.
Will explore ONLY THE RELEVANT INTERLEAVINGS (all Mazurkiewicz traces) for the given test harness
Conventional testing tools cannot do this!
E.g., 5 threads with 5 instructions each ≈ 10^10 interleavings!
Dynamic Verification
Actual concurrent program → check properties
One Specific Test Harness
Need to consider all test harnesses
FOR MANY PROGRAMS, this number seems small (e.g. Hypergraph Partitioner)
Related Work
• Dynamic verification tools:
  – CHESS
  – VeriSoft (POPL ’97)
  – DPOR (POPL ’05)
  – JPF
• ISP is similar to CHESS and DPOR
Dynamic Partial Order Reduction (DPOR)
[Diagram: three processes P0, P1, P2, each executing lock(x); … ; unlock(x). With L_i/U_i denoting P_i's lock and unlock steps, DPOR explores one interleaving (e.g., L0, U0, L1, U1, L2, U2) and then backtracks to explore only schedules that reorder dependent operations, such as L0, U0, L2, U2, L1, U1, rather than all interleavings.]
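To make the example concrete, here is a minimal C/Pthreads rendering of the three-process lock program from the diagram (the printf in the critical section is an illustrative assumption, not part of the slides). DPOR only needs to explore the orders in which the threads acquire the lock, not every instruction-level interleaving.

```c
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t x = PTHREAD_MUTEX_INITIALIZER;

/* Each thread performs the L_i ... U_i section from the diagram. */
static void *worker(void *arg)
{
    long id = (long)arg;
    pthread_mutex_lock(&x);      /* L_i */
    printf("thread %ld in critical section\n", id);
    pthread_mutex_unlock(&x);    /* U_i */
    return NULL;
}

int main(void)
{
    pthread_t t[3];
    for (long i = 0; i < 3; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i < 3; i++)
        pthread_join(t[i], NULL);
    return 0;
}
```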
ISP
[Architecture diagram: the MPI program is instrumented with a profiler and compiled into an executable whose processes Proc1 … Procn run on the MPI runtime; the ISP scheduler drives the runs. ISP manifests only/all relevant interleavings (DPOR), and manifests ALL relevant interleavings of the MPI progress engine by DYNAMIC REWRITING of WILDCARD receives.]
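The slides do not give code for the wildcard rewriting, but the mechanism can be sketched as a profiling-layer wrapper that replaces MPI_ANY_SOURCE with a specific sender chosen by the scheduler before the receive ever reaches the MPI runtime. The scheduler_pick_source() helper below is a hypothetical stand-in for ISP's scheduler protocol; the MPI/PMPI calls are standard.

```c
#include <mpi.h>

/* Hypothetical query to the ISP scheduler: which sender should this
 * wildcard receive be matched against in the current interleaving? */
extern int scheduler_pick_source(int tag, MPI_Comm comm);

/* Profiling-layer wrapper: shadows MPI_Irecv via the PMPI interface. */
int MPI_Irecv(void *buf, int count, MPI_Datatype type, int source,
              int tag, MPI_Comm comm, MPI_Request *req)
{
    if (source == MPI_ANY_SOURCE) {
        /* Dynamic rewriting: the wildcard is replaced by a concrete rank,
         * so the runtime can only form the match the scheduler intends. */
        source = scheduler_pick_source(tag, comm);
    }
    return PMPI_Irecv(buf, count, type, source, tag, comm, req);
}
```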
Using PMPI
[Diagram: in P0's call stack, the user function calls MPI_Send, which is intercepted by ISP's profiling (PMPI) layer. The wrapper sends the call's envelope ("P0: MPI_Send") to the scheduler over a TCP socket; when the scheduler permits the call, the wrapper issues PMPI_Send into the MPI runtime.]
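A minimal sketch of this PMPI-based interception; report_to_scheduler() and wait_for_go() are assumed placeholders for ISP's TCP-socket protocol, not ISP's actual API.

```c
#include <mpi.h>

/* Hypothetical helpers for the scheduler protocol over a TCP socket. */
extern void report_to_scheduler(const char *op, int dest, int tag);
extern void wait_for_go(void);

/* Linked ahead of the MPI library, this wrapper shadows MPI_Send;
 * the real implementation remains reachable as PMPI_Send. */
int MPI_Send(const void *buf, int count, MPI_Datatype type,
             int dest, int tag, MPI_Comm comm)
{
    report_to_scheduler("MPI_Send", dest, tag);  /* send the envelope     */
    wait_for_go();                               /* block until scheduled */
    return PMPI_Send(buf, count, type, dest, tag, comm);  /* into runtime */
}
```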
DPOR and MPI
Implemented an implicit deadlock detection technique from a single program trace.
Issues with the MPI progress engine for wildcard receives could not be resolved.
More details can be found in our CAV 2008 paper: “Dynamic Verification of MPI Programs with Reductions in Presence of Split Operations and Relaxed Orderings”.
POE
[Animation, frame 1: three processes. P0 issues Isend(1, req), Barrier, Wait(req); P1 issues Irecv(*, req), Barrier, Recv(2), Wait(req); P2 issues Barrier, Isend(1, req), Wait(req). P0's Isend(1) envelope reaches the scheduler, which replies sendNext; P0 then reports its Barrier.]
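For reference, a minimal C version of this example, assuming it is run with exactly three processes (buffer contents and tags are illustrative):

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, buf = 0, tmp = 0;
    MPI_Request req;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {                       /* P0 */
        MPI_Isend(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
        MPI_Barrier(MPI_COMM_WORLD);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    } else if (rank == 1) {                /* P1 */
        MPI_Irecv(&buf, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &req);
        MPI_Barrier(MPI_COMM_WORLD);
        MPI_Recv(&tmp, 1, MPI_INT, 2, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    } else if (rank == 2) {                /* P2 */
        MPI_Barrier(MPI_COMM_WORLD);
        MPI_Isend(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}
```

If P1's wildcard Irecv happens to be matched by P2's send, the later blocking Recv from rank 2 has no sender left and the run deadlocks; POE ensures both possible matches are explored, as the following animation frames illustrate.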
[Animation, frame 2: P1's Irecv(*) and Barrier are likewise intercepted and collected by the scheduler, with a sendNext reply after each call.]
[Animation, frame 3: once Barrier calls have been collected from all three processes, the scheduler issues the Barrier into the MPI runtime for every process, letting them proceed past it.]
[Animation, frame 4: after the barrier, the scheduler collects the processes' Wait(req) calls, P1's Recv(2), and P2's Isend(1). When the wildcard receive is dynamically rewritten to Irecv(2) and matched with P2's Isend, the remaining Recv(2) has no match-set: Deadlock!]
MPI_Waitany + POE
[Diagram: P0 issues Isend(1, req[0]), Isend(2, req[1]), Waitany(2, req), Barrier; P1 and P2 each issue Recv(0) followed by Barrier. The scheduler collects the two Isends (with a sendNext handshake after each), the Waitany, the Recvs, and the Barriers.]
[Diagram, continued: the scheduler replays the collected operations into the runtime. After Waitany completes one of P0's requests, the scheduler must track which request handles remain valid: here req[0] is marked Valid while req[1] is Invalid (MPI_REQUEST_NULL), and the error "Error! req[1] invalid" is reported.]
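A minimal C sketch of the MPI_Waitany pattern from this example (payloads and tags are illustrative assumptions, and the trailing MPI_Wait on the other request is added to keep the sketch well-formed). Per the MPI standard, the request completed by MPI_Waitany is set to MPI_REQUEST_NULL, which is exactly the book-keeping the scheduler has to mirror.

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, data = 42, recvbuf;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {                       /* P0 */
        MPI_Request req[2];
        int done;                          /* index of the completed request */
        MPI_Isend(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req[0]);
        MPI_Isend(&data, 1, MPI_INT, 2, 0, MPI_COMM_WORLD, &req[1]);
        MPI_Waitany(2, req, &done, MPI_STATUS_IGNORE);
        /* req[done] is now MPI_REQUEST_NULL; the other request is still
         * pending and must eventually be completed as well. */
        MPI_Wait(&req[1 - done], MPI_STATUS_IGNORE);
        MPI_Barrier(MPI_COMM_WORLD);
    } else {                               /* P1 and P2 */
        MPI_Recv(&recvbuf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Barrier(MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
```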
MPI Progress Engine Issues
[Diagram: P0 issues Irecv(1, req), Barrier, Wait(req); P1 issues Isend(0, req), Barrier, Wait(req). If the scheduler forwards P0's receive and its Wait into the runtime (PMPI_Irecv followed by PMPI_Wait) before the matching Isend has been issued into the runtime, PMPI_Wait does not return and the scheduler hangs.]
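The underlying MPI semantics make the hang easy to see: MPI_Wait (and thus PMPI_Wait) blocks until its request completes, and a receive request can only complete once a matching send has entered the runtime. A fragment-level sketch, assuming the scheduler drives P0's profiling layer in this order:

```c
#include <mpi.h>

/* Inside P0's profiling layer, as driven by the scheduler. */
void p0_replay_fragment(int *buf)
{
    MPI_Request req;

    /* The scheduler allows P0's receive into the runtime ... */
    PMPI_Irecv(buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);

    /* ... but if it forwards the Wait before P1's PMPI_Isend has been
     * issued, this call blocks inside the MPI progress engine: the
     * receive cannot complete, PMPI_Wait never returns, and the
     * scheduler, waiting for P0's next envelope, hangs with it. */
    PMPI_Wait(&req, MPI_STATUS_IGNORE);
}
```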
Experiments
ISP was run on 69 examples of the Umpire test suite.
Detected deadlocks in these examples that tools like Marmot cannot detect.
Produced a far smaller number of interleavings than exploration without reduction.
ISP run on Game of Life (~500 lines of code).
ISP run on Parmetis (~14K lines of code), widely used for parallel partitioning of large hypergraphs.
ISP run on MADRE (memory-aware data redistribution engine by Siegel and Siegel, EuroPVM/MPI ’08): found a previously KNOWN deadlock, but AUTOMATICALLY, within one second!
Results available at: http://www.cs.utah.edu/formal_verification/ISP_Tests
Concluding Remarks
Tool available (download and try).
Future work:
– Distributed ISP scheduler
– Handle MPI + threads
– Do a large-scale bug hunt now that ISP can execute large-scale codes
Implicit Deadlock Detection
[Diagram: P0 issues Irecv(*, req), Recv(2), Wait(req); P1 and P2 each issue Isend(0, req), Wait(req). The scheduler's collected trace is P0: Irecv(*); P1: Isend(P0); P2: Isend(P0); P0: Recv(P2); P1: Wait(req); P2: Wait(req); P0: Wait(req). From this single trace the scheduler detects that if the wildcard Irecv(*) were matched by P2's Isend instead, the subsequent Recv(2) would have no matching send: Deadlock!]
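A minimal C version of this backup-slide example (payloads and tags are illustrative assumptions, and a three-process run is assumed). A given run may complete, yet the alternative wildcard match deadlocks, which is what the implicit detection flags from the single observed trace.

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, a = 0, b = 0;
    MPI_Request req;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {                       /* P0 */
        MPI_Irecv(&a, 1, MPI_INT, MPI_ANY_SOURCE, 0, MPI_COMM_WORLD, &req);
        MPI_Recv(&b, 1, MPI_INT, 2, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    } else {                               /* P1 and P2 */
        int v = rank;
        MPI_Isend(&v, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    }
    /* If the wildcard Irecv is matched by P2's send, the blocking
     * Recv(2) that follows has no sender left: deadlock. */

    MPI_Finalize();
    return 0;
}
```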