
Computing the Probability Distribution of Project Duration in a PERT Network

Jane N. Hagstrom
Department of Information and Decision Sciences, University of Illinois, Chicago, Illinois 60680

The algorithm presented here makes exact computations of characteristics of the probability distribution of project duration for a stochastic project network. Given discrete independent probability distributions for task durations, the algorithm will compute either moments of the project duration distribution or values of the cumulative distribution function of project duration. The algorithm can be used for sensitivity analysis on probability weights since several task duration distributions having the same range can be handled simultaneously with no great increase in computation time requirements.

1. INTRODUCTION

This section introduces the problem under consideration and establishes notation and terminology. A general background in project scheduling can be obtained from Elmaghraby [6]. Graph theoretic terms used here are consistent with Even [7], except that we use the term arc instead of edge.

We will consider a project as represented by an activity-on-arc network. Figure 1 represents a project in which tasks have deterministic durations. Each arc of the network corresponds to a task. All tasks directed into a vertex must be completed before any task directed out of it may be started. The vertices in Figure 1 have been labeled with a precedence numbering, so that if there is an arc from i to j, then i < j. In this illustration, each arc is labeled with the time the task requires. The project duration is defined by the length of the longest path. In Figure 1, this path, called the critical path, is shown with broken lines, and the project duration is 27.

FIG. 1. Deterministic project network.

This is Working Paper No. 86-01, College of Business Administration, University of Illinois at Chicago. This research was supported by the National Science Foundation under Grants ECS-8205230 and ECS-8407332 to the University of Illinois at Chicago.

NETWORKS, Vol. 20 (1990) 231-244 © 1990 John Wiley & Sons, Inc.

A PERT network is a stochastic version of this model. The time requirement for a task is no longer fixed but is now a random variable. Thus, the identity of the critical path is no longer fixed, and the project duration is a random variable. In this paper, we require that the task times be independent with discrete, finite range. Given the precedence relations and the probability distributions for the task times, the algorithm presented in the next section computes moments of the project duration. The ability to adjust the algorithm to compute points of the cumulative distribution function and to allow sensitivity analysis on the probability weights will be discussed in the same section.
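To make the definition concrete, the following sketch computes the exact distribution of project duration for a small network by enumerating every joint realization of the task durations. The network, the dictionary layout, and the function names are assumptions made for illustration and are not taken from the paper. This brute-force enumeration is not the algorithm of Section 2; it only serves as a reference point for it.

```python
from collections import defaultdict
from itertools import product

# Hypothetical 4-vertex activity-on-arc network (not one of the paper's figures).
# Each task (i, j) with i < j maps to a discrete distribution {duration: probability}.
TASKS = {
    (1, 2): {2: 0.5, 4: 0.5},
    (1, 3): {3: 0.5, 5: 0.5},
    (2, 3): {1: 0.5, 2: 0.5},
    (2, 4): {4: 0.5, 6: 0.5},
    (3, 4): {1: 0.5, 3: 0.5},
}


def project_duration(durations, last_vertex):
    """Length of the longest path from vertex 1 to last_vertex for fixed durations."""
    dist = {1: 0}
    for v in range(2, last_vertex + 1):
        # Precedence numbering guarantees every tail i < v is already settled.
        dist[v] = max(dist[i] + d for (i, j), d in durations.items() if j == v)
    return dist[last_vertex]


def duration_distribution(tasks):
    """Enumerate all joint realizations of the independent task durations."""
    last_vertex = max(j for _, j in tasks)
    keys = list(tasks)
    result = defaultdict(float)
    for combo in product(*(tasks[k].items() for k in keys)):
        durations = {k: d for k, (d, _) in zip(keys, combo)}
        prob = 1.0
        for _, p in combo:
            prob *= p
        result[project_duration(durations, last_vertex)] += prob
    return dict(result)


if __name__ == "__main__":
    for value, prob in sorted(duration_distribution(TASKS).items()):
        print(value, prob)
```

The enumeration visits one realization per combination of task states, so its cost is the product of the numbers of states over all tasks. That growth is what the conditioning scheme of the next section is meant to reduce.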

Martin [10] gives an algorithm that can be adapted to apply to the problems we analyze here. His algorithm handles series-parallel networks by performing convolutions and comparisons appropriately. For a general network, one must identify a set of tasks whose replication will allow the production of a series-parallel network. The distribution of project duration is obtained by conditioning on these tasks’ durations. When this set of tasks is small, his algorithm is likely to require less time than the one presented here. However, the present algorithm has better worst-case performance.

Section 2 presents the algorithm. Section 3 describes computational experience with a Pascal implementation. Section 4 analyzes the computation requirements of the algorithm. Section 5 considers two alternative enumerative schemes for computing PERT quantities and argues that these will not improve on the efficiency of the algorithm presented here. Section 6 makes some final remarks on applying the algorithm.


2. THE ALGORITHM

The algorithm we will use is based on ordered recursive conditioning on the task durations. This is a common technique in reliability computations, where it has been referred to as pivoting [4], factoring [12], or backtracking [3]. To simplify the concept of the conditioning, we introduce a representation of the problem that is used by the algorithm. The concept of this representation is similar to the emergency equivalent network defined by Mirchandani [11] for stochastic shortest-path problems. Each task has several states, each one corresponding to a realization of the length of time the task requires. We represent this in our project network by assigning each task as many arcs as it has states. If two tasks share the same pair of vertices as endpoints, we replace the two tasks by a single task with an appropriate probability distribution and then represent the new task by a set of arcs. Vertices are given a precedence numbering.

Figure 2 illustrates such a representation. The project consists of 11 tasks, each taking two possible states. Task (4,5) cannot be started until tasks (1,4) and (2,4) have been completed. Task (4,5) requires either 1 or 2 units of time. The project will be completed when tasks (3,7) and (6,7) have been completed.

Conditioning algorithms enumerate substructures of the system under consideration. The goal of ordering the conditioning is to minimize the size of the enumeration. The structures enumerated here are candidates for longest-path arborescence of the project network. The recursive conditioning is based on choosing which state of which task will determine the longest path into a particular vertex. The version stated below computes moments of the probability distribution of project duration.

FIG. 2. PERT network.

The algorithm is stated in terms of a recursive function PIVOTS. At a call of depth k to PIVOTS, the lengths of the longest paths to vertices 1 through k have been fixed. PIVOTS then conditions on which of the arcs directed into vertex k + 1 determines the longest path into that vertex. It then calls itself for each of these cases and proceeds recursively until it reaches a depth corresponding to the number of vertices in the network; at this level, it returns the selected powers of the length of the critical path. When a level k call is finished, PIVOTS returns the moments of the critical path length conditioned on the distances to vertices 1 through k.

The statement of the algorithm follows.

Global data for PIVOTS:
    Network described by updated in-directed adjacency lists. Each vertex lies on some directed path from vertex 1 to the last vertex. Vertices are identified by a precedence numbering.
    For each arc E, LENGTH[E], giving the task time corresponding to E, and PROB[E], the conditional probability that the task takes that long given it takes no longer.
    For each vertex I, DIST[I], the current longest distance to I (fixed by the conditioning).
    For each arc E directed from vertex J into vertex I, TEMPDIST[E], the distance to vertex I computed by adding DIST[J] and LENGTH[E].

Input parameter for PIVOTS:
    K, the identity of the largest numbered vertex whose distance is fixed when PIVOTS is entered.

Local data for PIVOTS:
    E, the arc directed into vertex K + 1 that is currently being considered as determining the distance to vertex K + 1.
    MOMENTS, a vector of conditional moments of the project duration given the current longest distances to vertices 1 through K and given that no arc with a larger value of TEMPDIST than TEMPDIST[E] determines the longest distance to vertex K + 1 (more accurately stated in Theorem 1).

Output returned from PIVOTS:
    The final contents of MOMENTS, conditional moments of the project duration given the current longest distances to vertices 1 through K.

Function PIVOTS(K) returns MOMENTS;
Begin
    if K is final vertex then
        compute MOMENTS by taking appropriate powers of DIST[K]
    else
    Begin
        Compute TEMPDIST for arcs directed into vertex K + 1 and arrange the in-directed adjacency list for vertex K + 1 from smallest to largest value of TEMPDIST;
        {initialize computation of MOMENTS:}
        MOMENTS ← 0;
        with each arc E in the ordered adjacency list of vertex K + 1 do
        Begin {accumulate conditional moments in MOMENTS}
            DIST[K + 1] ← TEMPDIST[E];
            {condition on whether or not E determines the longest path to vertex K + 1:}
            MOMENTS ← PROB[E] · PIVOTS(K + 1) + (1 − PROB[E]) · MOMENTS
        End
    End
    {return}.
End
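The recursion stated above can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the Pascal implementation described in Section 3: the task dictionary layout, the names `build_arcs` and `pivots_moments`, and the example network are all hypothetical, and neither of the two call-reducing refinements of the coded version is included.

```python
# Hypothetical 3-vertex network: task (tail, head) -> {duration: probability}.
EXAMPLE_TASKS = {
    (1, 2): {1: 0.5, 3: 0.5},
    (2, 3): {1: 1.0},
    (1, 3): {3: 0.5, 5: 0.5},
}


def build_arcs(tasks):
    """Group arcs by head vertex as (tail, LENGTH, PROB) triples.

    PROB is P(duration = LENGTH | duration <= LENGTH), as in the global data.
    """
    arcs = {}
    for (tail, head), dist in tasks.items():
        cumulative = 0.0
        for length in sorted(dist):
            cumulative += dist[length]
            arcs.setdefault(head, []).append((tail, length, dist[length] / cumulative))
    return arcs


def pivots_moments(tasks, powers=(1, 2)):
    """Return E[T**r] for each r in powers, T being the project duration."""
    n = max(head for _, head in tasks)
    arcs = build_arcs(tasks)
    dist = {1: 0.0}  # DIST: longest distances fixed by the conditioning

    def pivots(k):
        if k == n:
            return [dist[n] ** r for r in powers]
        # TEMPDIST for every arc into vertex k + 1, smallest first.
        candidates = sorted(
            (dist[tail] + length, prob) for tail, length, prob in arcs[k + 1]
        )
        moments = [0.0] * len(powers)
        for tempdist, prob in candidates:
            dist[k + 1] = tempdist
            sub = pivots(k + 1)
            # Condition on whether this arc determines the longest path.
            moments = [prob * s + (1.0 - prob) * m for s, m in zip(sub, moments)]
        return moments

    return pivots(1)


if __name__ == "__main__":
    print(pivots_moments(EXAMPLE_TASKS))
```

For the example network the duration is the larger of task (1, 3) and the path through vertex 2, so the four equally likely outcomes are 3, 5, 4 and 5. Because the smallest state of each task has PROB equal to 1, the update discards whatever was accumulated from shorter, unrealizable distances.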

Theorem 1. PIVOTS(1) correctly computes the moments of the project duration.

Proof: The proof is by induction on the correctness of PIVOTS(K). Clearly PIVOTS(N), where N is the last vertex, correctly computes the moments conditional on the longest distances to all vertices being fixed.

Suppose PIVOTS(K + 1) correctly computes the moments conditional on any consistent set of fixed longest distances to vertices 1 through K + 1. We must show that PIVOTS(K) correctly computes the moments conditional on any consistent set of fixed longest distances to vertices 1 through K . To reduce notation, we assume that MOMENTS contains only the first moment of the project duration. It will be clear that higher moments will also be correctly computed.

We must show that the loop through the ordered adjacency list operates correctly. Before the loop starts, the arcs have been sorted so that if arc F precedes arc E, then, using the fixed longest distances to vertices 1 through K, the distance from the first vertex to vertex K + 1 via arc F is no longer than the distance via arc E. The correctness of the loop depends on the correctness of the statement that calls PIVOTS(K + 1). The form of this statement is justified by the following conditioning argument: Let $T_N$ be a random variable representing the longest distance to vertex N. Let $z_I$ be a fixed longest distance to vertex I. Let $t(E)$ be the index of the tail of arc E. Let $A(E)$ be the random variable representing the length of the task for which E is an arc. Let $L_I(E)$ = {LENGTH[H] | H is an arc belonging to task (I, K + 1) and H precedes or equals E in the sorted list}. Each expectation below should also be conditional on the fixed longest distances to all vertices 1 through K. We omit this for notational brevity. We claim that the left-hand side of the following equation is the quantity stored in MOMENTS during the iteration of the loop in which arc E is processed.

$$
\begin{aligned}
& E\{T_N \mid \text{each task } (I, K+1) \text{ has a length in } L_I(E)\} \\
&\quad = E\{T_N \mid \text{each task } (I, K+1) \text{ has a length in } L_I(E) \text{ and } A(E) = \mathrm{LENGTH}[E]\} \\
&\qquad \times \operatorname{Prob}\{A(E) = \mathrm{LENGTH}[E] \mid A(E) \le \mathrm{LENGTH}[E]\} \\
&\quad + E\{T_N \mid \text{each task } (I, K+1) \text{ has a length in } L_I(E) \text{ and } A(E) < \mathrm{LENGTH}[E]\} \\
&\qquad \times \operatorname{Prob}\{A(E) < \mathrm{LENGTH}[E] \mid A(E) \le \mathrm{LENGTH}[E]\}.
\end{aligned}
$$

This statement is correct since $A(E)$ is independent of the states of other tasks. Now

$$
\begin{aligned}
& E\{T_N \mid \text{each task } (I, K+1) \text{ has a length in } L_I(E) \text{ and } A(E) = \mathrm{LENGTH}[E]\} \\
&\quad = E\{T_N \mid \text{longest distance to } K+1 \text{ is fixed at } z_{t(E)} + \mathrm{LENGTH}[E]\}.
\end{aligned}
$$

By the induction hypothesis, PIVOTS(K + 1) computes this correctly. For the other expectation,

$$
\begin{aligned}
& E\{T_N \mid \text{each task } (I, K+1) \text{ has a length in } L_I(E) \text{ and } A(E) < \mathrm{LENGTH}[E]\} \\
&\quad = E\{T_N \mid \text{each task } (I, K+1) \text{ has a length in } L_I(F)\},
\end{aligned}
$$

where F is the arc immediately preceding E in the sorted list. This is precisely the contents of MOMENTS generated by the previous iteration of the loop. Since PROB[E] = $\operatorname{Prob}\{A(E) = \mathrm{LENGTH}[E] \mid A(E) \le \mathrm{LENGTH}[E]\}$, the computation of MOMENTS inside the loop is correct.

When the loop is finished, MOMENTS contains

$$
E\{T_N \mid \text{each task } (I, K+1) \text{ has a length in } L_I(E)\} = E\{T_N\},
$$

where E is the last arc in the ordered list of task states incident into K + 1. Remembering that all such expectations have been conditioned on fixed longest distances to vertices 1 through K, we recognize that this is the correct output for PIVOTS(K) and the induction step is completed.

The most time-consuming step within a single call to PIVOTS is the sort required in order to arrange the in-directed adjacency list for vertex K + 1. Compared to the number of times PIVOTS is called, this does not contribute greatly to the computation requirements of the algorithm. The Pascal implementation uses a bubblesort that keeps track of the most recent item to reach its permanent position. Section 4 discusses the computation time requirements in greater detail.

The coded version of PIVOTS has the following two improvements, which reduce the number of times PIVOTS is called:

1. PIVOTS recognizes certain values of TEMPDIST as unrealizable as DIST and does not condition on these values. Values of TEMPDIST associated with task (I, K + 1) that are smaller than the smallest possible value of TEMPDIST associated with another task (J, K + 1) will never be realized. A search through the sorted adjacency list determines where to start conditioning.

2. PIVOTS recognizes when several arcs into vertex K + 1 have the same value of TEMPDIST, computes the appropriate probability that some one of them determines the longest path into vertex K + 1, and conditions on any one of them determining the longest path to continue the recursion.

Another version of the algorithm computes values of the cumulative distribution function of the project duration. Instead of accumulating expectations in MOMENTS, the alternate version accumulates conditional probabilities. Both versions can be rewritten to process several distributions for the task durations at once. As long as the distributions have the same ranges, the order of processing is not changed and the computation requirements are not greatly increased.
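The cumulative distribution function variant mentioned in this section can be sketched by changing only what is returned at the final vertex: an indicator of the event that the fixed project duration does not exceed a threshold, in place of powers of that duration. The code below is an illustrative Python sketch under that reading. The function name `pivots_cdf`, the task dictionary layout, and the thresholds are assumptions, not details of the paper's Pascal code.

```python
def pivots_cdf(tasks, thresholds):
    """Return P(project duration <= t) for each t in thresholds.

    tasks maps (tail, head) -> {duration: probability}; vertices are assumed
    to carry a precedence numbering starting at 1.
    """
    n = max(head for _, head in tasks)

    # Arcs grouped by head vertex, each with its conditional probability
    # P(duration = length | duration <= length).
    arcs = {}
    for (tail, head), dist_ in tasks.items():
        cumulative = 0.0
        for length in sorted(dist_):
            cumulative += dist_[length]
            arcs.setdefault(head, []).append((tail, length, dist_[length] / cumulative))

    dist = {1: 0.0}

    def pivots(k):
        if k == n:
            # Conditional probabilities are 0 or 1 once every distance is fixed.
            return [1.0 if dist[n] <= t else 0.0 for t in thresholds]
        candidates = sorted(
            (dist[tail] + length, prob) for tail, length, prob in arcs[k + 1]
        )
        probs = [0.0] * len(thresholds)
        for tempdist, prob in candidates:
            dist[k + 1] = tempdist
            sub = pivots(k + 1)
            probs = [prob * s + (1.0 - prob) * p for s, p in zip(sub, probs)]
        return probs

    return pivots(1)


if __name__ == "__main__":
    series = {(1, 2): {1: 0.5, 3: 0.5}, (2, 3): {2: 0.25, 4: 0.75}}
    print(pivots_cdf(series, thresholds=(3, 5, 7)))
```

In the two-task series example the duration takes the values 3, 5, 5 and 7 with probabilities 0.125, 0.375, 0.125 and 0.375, which gives the three cumulative values directly.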


3. COMPUTATIONAL EXPERIENCE

The algorithm was coded in Pascal, compiled under the optimizing option of Pascal/VS, and run on an IBM 3081 operating under CMS. Cases from the literature as well as randomly generated cases were run. Characteristics of the cases as well as computation time are shown in Table I. Input, preprocessing, and output times are not included. Except in the smallest problems, these require an insignificant fraction of computation time. We note further that no analysis is made of space requirements. As indicated in the description of the algorithm, only one copy of the network is kept and the amount of data in the recursion stack is minimal. Computation time requirements are much more likely to be limiting than are space requirements, even for a less space-efficient implementation.

TEST5A through TEST12A were randomly generated cases where an attempt was made to keep the average number of tasks directed into each vertex at 2. TEST5A and COMPLET6 through COMPLE10 were complete acyclic digraphs with randomly generated task-time distributions. SERIES10 is a network in which the only tasks are of the form (i, i + 1). For all the above cases, the probability distribution for task time was generated by randomly selecting a mean time between 0 and 10, randomly selecting a distance from the mean between 0 and the mean, and then constructing a three-point distribution with probability 0.6 of being at the mean, probability 0.2 for being at the selected distance above the mean, and probability 0.2 for being at the selected distance below the mean.
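The construction of the three-point task-time distributions can be illustrated as follows. The paper does not say whether the mean and the distance were drawn as integers or as real numbers, nor which generator was used, so the uniform real-valued draws, the function name, and the seed below are assumptions made only for this sketch.

```python
import random


def three_point_distribution(rng):
    """Build {duration: probability} with weights 0.2, 0.6, 0.2 around a random mean."""
    mean = rng.uniform(0.0, 10.0)      # assumed: uniform real-valued mean time
    spread = rng.uniform(0.0, mean)    # assumed: uniform distance from the mean
    dist = {}
    for value, prob in ((mean - spread, 0.2), (mean, 0.6), (mean + spread, 0.2)):
        # Summing guards against the three points coinciding when spread is 0.
        dist[value] = dist.get(value, 0.0) + prob
    return dist


if __name__ == "__main__":
    rng = random.Random(1990)  # arbitrary seed, for repeatability only
    for duration, prob in sorted(three_point_distribution(rng).items()):
        print(round(duration, 3), prob)
```

Because the distance is at most the mean, the smallest point is never negative, and the distribution is symmetric about the selected mean.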

The remaining cases come from the literature. VANSLYKE is a discrete version of figure 10 from Van Slyke [19]. For this case, 0.1 probability was assigned to both the optimistic and pessimistic task completion times and 0.8 probability was assigned to the most likely time. ELM4FIG4 is figure 4.4 from Elmaghraby [6]. SHOGAN1 is example 1 from Shogan [18]. The data for PRITSKER were taken from table 6.1 in Pritsker and Kiviat [13]. STRIPDWN is Kleindorfer’s example [9] with each task restricted to taking either its smallest or largest possible length.

Table I specifies the number of vertices for each case and tabulates the effective in-degree (see Section 4) for each vertex from 2 to the last. To fill in the null sections of the table, ones were used. Confirmation of the bound developed in Section 4 seems to require more large examples than we were able to run.

Bala [2] performed some computational experiments on the effect of using a quicksort instead of a bubblesort in the program (see [1]). This did not seem to improve the performance, presumably because the adjacency lists tend to be fairly well sorted. This paper also contains some discussion of ways of reducing the number of calls to PIVOTS. The points listed at the end of Section 2 were two of these possibilities that were implemented.

4. ANALYSIS OF COMPUTATION REQUIREMENTS

Hagstrom [8] demonstrates that, unless P = NP, we cannot hope to obtain a polynomial algorithm for this problem. We are not surprised then that the algorithm in Section 2 is not a polynomial algorithm. This section analyzes how well the algorithm does, in fact, perform.

TABLE I. Computation time requirements for selected cases.

               No.  Effective in-degrees (by vertex)                                 CPU
Case      vertices  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 microseconds
TEST5A           5  3  5  7  9  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1        12960
TEST6A           6  3  5  7  5  7  1  1  1  1  1  1  1  1  1  1  1  1  1  1        35364
TEST7A           7  3  3  5  9  3  9  1  1  1  1  1  1  1  1  1  1  1  1  1        81381
TEST8A           8  3  5  5  7  3  7 13  1  1  1  1  1  1  1  1  1  1  1  1       741210
TEST9A           9  3  5  3  9  5 11  5 11  1  1  1  1  1  1  1  1  1  1  1      2284369
TEST10A         10  3  3  5  3  7 11  9  9 15  1  1  1  1  1  1  1  1  1  1      5215133
TEST11A         11  3  3  3  5  5  5  7  9 15  9  1  1  1  1  1  1  1  1  1     17879858
TEST12A         12  3  3  3  3  5  7  7  5  7  9 11  1  1  1  1  1  1  1  1     75729868
COMPLET6         6  3  5  7  9 11  1  1  1  1  1  1  1  1  1  1  1  1  1  1        76053
COMPLET7         7  3  5  7  9 11 13  1  1  1  1  1  1  1  1  1  1  1  1  1       153603
COMPLET8         8  3  5  7  9 11 13 15  1  1  1  1  1  1  1  1  1  1  1  1       576241
COMPLET9         9  3  5  7  9 11 13 15 17  1  1  1  1  1  1  1  1  1  1  1      1378182
COMPLE10        10  3  5  7  9 11 13 15 17 19  1  1  1  1  1  1  1  1  1  1     12075520
SERIES10        10  3  3  3  3  3  3  3  3  3  1  1  1  1  1  1  1  1  1  1       910608
ELM4FIG4         4  3  3  7  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1         3533
SHOGAN1          6  5  5  9  9  9  1  1  1  1  1  1  1  1  1  1  1  1  1  1       186719
VANSLYKE         9  3  3  5  3  3  7  1  7  1  1  1  1  1  1  1  1  1  1  1       339900
PRITSKER         9  3  5  3  3  7  3  5  5  1  1  1  1  1  1  1  1  1  1  1       846504
STRIPDWN        20  2  2  2  3  2  3  3  2  2  4  4  4  3  3  3  3  4  4  4    281713828

FIG. 3. Series PERT network.

The computation time requirements of the algorithm can be treated in some detail by analyzing the recursion tree associated with applying the algorithm to a particular instance of a PERT network. This tree is defined as follows: It has a node corresponding to each call to PIVOTS; a child node is linked to a parent node if the call made to PIVOTS that is associated with the child was made during the execution of PIVOTS that is associated with the parent.

Figure 3 illustrates one of the simplest possible situations: The set of tasks associated with the project must be executed in series and each task has two states. The recursion tree generated by PIVOTS is illustrated in Figure 4. The leftmost node at level 3 corresponds to a call to PIVOTS with argument 3 where the distance to vertex 2 has been fixed at a and the distance to vertex 3 has been fixed at a + c.

For a more complex network, level k of the recursion tree will correspond to fixing the longest distance to vertices 1 through k. The children of a node at level k are generated by considering all possible distances to vertex k + 1 with the distances to 1 through k fixed. Let us assume that we have made the adjustment (1) of Section 2, so that we do not generate unrealizable longest distances to any vertex. If all tasks directed into vertex k + 1 have deterministic durations, then each node at level k has exactly one child. In general, the number of children of a node at level k is no more than the effective in-degree of vertex k + 1, where we define this to be the quantity computed as the in-degree of vertex k + 1 minus the number of tasks incident with k + 1 plus 1.
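The effective in-degree is simple to compute from the task data, since the in-degree in the arc representation is the total number of states of the tasks entering the vertex. The sketch below assumes a hypothetical dictionary layout mapping each task (tail, head) to its discrete duration distribution; the function name is likewise an assumption.

```python
def effective_in_degrees(tasks):
    """Map each head vertex to (number of arcs in) - (number of tasks in) + 1.

    tasks: {(tail, head): {duration: probability}}; each state of a task
    counts as one arc in the representation of Section 2.
    """
    counts = {}
    for (_, head), dist in tasks.items():
        arcs, task_count = counts.get(head, (0, 0))
        counts[head] = (arcs + len(dist), task_count + 1)
    return {head: arcs - task_count + 1
            for head, (arcs, task_count) in sorted(counts.items())}


if __name__ == "__main__":
    # Complete acyclic digraph on 6 vertices, three states per task (durations
    # are placeholders; only the number of states matters here).
    complete = {(i, j): {1: 0.2, 2: 0.6, 3: 0.2}
                for j in range(2, 7) for i in range(1, j)}
    print(effective_in_degrees(complete))
```

A vertex whose incoming tasks are all deterministic gets effective in-degree 1, matching the remark that such a level of the recursion tree has exactly one child per node.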

FIG. 4. Recursion tree for series PERT network.

The bulk of the work done at a node at level k of the tree is the sorting of the arcs directed into vertex k + 1. Depending on the sort, this work is bounded by at worst the square of the largest in-degree of any vertex.

Theorem 2. Let H be an upper bound on the work required to sort the arcs in the in-directed adjacency list of any vertex in a project network. Let u be the number of vertices for which all in-directed tasks have deterministic durations. Let Π be the product of the effective in-degrees. Then an upper bound for the work required in applying PIVOTS is a constant times H(u + 2)Π.

Proof. The number of nodes at the bottom level of the tree is no more than Π. Since there are u levels of the tree in which nodes have only a single child, the total number of nodes in the recursion tree is no more than (u + 2)Π. Each node requires at most a constant times work H.

In a complete project network on n vertices, with k states per task, the computation requirements would then be $O(H \cdot k^n \cdot n!)$. Computation requirements for Martin's algorithm [10] would be $O(Q \cdot k^{n^2})$, where Q represents the time required to compute the project duration distribution in a series-parallel network. H and Q are both polynomial functions of n and k. Comparing $k^n \cdot n!$ to $k^{n^2}$ indicates that the algorithm here has a better worst-case performance than does Martin's.

Section 2 indicated that the algorithm presented here can handle several distributions for task durations at once. As long as the range of the task durations is the same for each distribution, several distributions may be handled at once during one generation of a recursion tree with no more sorting than for one distribution. Multiplying the number of distributions for project duration by m multiplies the number of multiplications and additions by m. Let us consider H to be the number of comparisons performed by the sort at a single node of the recursion tree. In the case of a single distribution, we can associate two multiplications and one addition (belonging to the statement that generates the node) with the node. In the case of several distributions, we will have to associate 2m multiplications and m additions with the node. Taking additions, multiplications, and comparisons to require equivalent amounts of time, then, if the number of distributions is less than H, the computation time will be no more than quadruple the time required by a single distribution.
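Returning to the worst-case comparison for complete networks made earlier in this section, the quantities involved can be tabulated for small cases. The sketch assumes that the two growth terms are k^n · n! for PIVOTS and k^(n^2) for Martin's algorithm, and it uses the fact that in a complete acyclic digraph vertex j has j − 1 incoming tasks, so its effective in-degree is (k − 1)(j − 1) + 1. The function names are hypothetical.

```python
from math import factorial, prod


def effective_product_complete(n, k):
    """Product of effective in-degrees for a complete acyclic digraph.

    Vertex j (2 <= j <= n) has j - 1 incoming tasks with k states each,
    hence k*(j - 1) arcs and effective in-degree (k - 1)*(j - 1) + 1.
    """
    return prod((k - 1) * (j - 1) + 1 for j in range(2, n + 1))


def compare_growth(n, k):
    """Return (product of effective in-degrees, k**n * n!, k**(n*n))."""
    return effective_product_complete(n, k), k ** n * factorial(n), k ** (n * n)


if __name__ == "__main__":
    for n in range(4, 11, 2):
        print(n, compare_growth(n, 3))
```

Each effective in-degree is at most k(j − 1), so the product never exceeds k^(n−1) · (n − 1)!, which is how the first of the two terms arises.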

5. IMPROVING ON THE ALGORITHM

Sections 3 and 4 indicate that the rapid growth of computation time with the size of the network will limit the use of this algorithm to relatively small networks. We can then ask how we might improve on the algorithm in order to increase the size of network that can be handled. If we restrict ourselves to staying within the framework of the algorithm presented here, the possibility for improvement seems to be limited to fine-tuning the generation of the recursion tree, so that a few more branches can be removed, or improving the method of sorting. These may make the algorithm more competitive with Martin's [10] for sparse networks. Neither of these will significantly reduce the size of the tree, and the size of the tree is the major determinant of the computation time. In the rest of this section, we consider going out of the framework of this algorithm to see if other approaches might yield a significant reduction in the computation requirements. We find that the two most obvious approaches do not seem promising.

Algorithms for #P-complete problems seem to require an exhaustive enumeration of some combinatorial substructure that appears in the problem structure. The amount of work grows polynomially in the number of these substructures, but, unfortunately, the number of these substructures can grow exponentially with the size of the original problem structure. For example, in the network reliability context, Provan and Ball [14] give an algorithm for computing the probability that the source vertex in a directed network can communicate with the sink vertex when links are subject to failure. Their algorithm enumerates all minimal source-sink cuts of the network and also enumerates all minimal cuts of certain subnetworks. The total enumeration is polynomial in the number of source-sink cuts of the network, and other associated work is small. In general, the number of minimal cuts can grow exponentially in the number of edges or vertices, so the size of network that can be analyzed is limited. A number of other types of enumerations have been applied to this problem, for instance, enumerating arborescences (Ball and Van Slyke [3] adapted for directed graphs) or enumerating acyclic subgraphs [17], but at least for large, dense networks the Provan and Ball method requires a smaller enumeration.

Since the PERT problem is #P-complete, we expect that we will have to enumerate some structure whose number grows exponentially in the size of the network. We will consider the network to be as defined in Section 2 and illustrated in Figure 2. PIVOTS enumerates almost all the spanning arborescences of the network. The question arises as to whether there is something better to enumerate. The two most obvious candidates for a smaller enumeration are minimal source-sink paths and minimal source-sink cuts. We will discuss these two possibilities, but will abbreviate source-sink paths to paths and source-sink cuts to cuts in the rest of this section.

Complexity analysis indicates that we will not succeed by enumerating minimal paths unless P = NP. Hagstrom [8] showed that the PERT problems described in this paper are #P-hard by showing that algorithms for these problems could be used to solve a #P-complete reliability problem of Provan and Ball [15]. In the PERT problem context, the network that must be solved is one such as that in Figure 5, where there can be arbitrary numbers of vertices of type A or of type B. In this network, the number of minimal paths grows polynomially in the number of arcs. The paths can be enumerated in constant time per path. However, due to the work of Provan and Ball, we know that this problem cannot be solved in time polynomial in the number of arcs unless P = NP. The Provan and Ball example indicates that we cannot always compute PERT problems in time polynomial in the number of paths. Thus we conclude that an algorithm that enumerates minimal paths and performs a polynomial amount of work in the number of paths will not successfully compute PERT quantities for all networks. We use here essentially the same argument Provan and Ball [14] used to show that this approach will not work for computing source-sink reliability.

FIG. 5. Provan and Ball example.

FIG. 6. Ladder.

Another possibility is to enumerate minimal cuts. We do not have as strong evidence that this will not work. However, Figure 6 gives an example of a class of problems that would seem to be difficult to solve just by enumerating minimal cuts. Figure 6 illustrates a ladder with 8 vertices. A ladder with 10 vertices can be constructed by adding the tasks (8, 9), (8, 10), and (9, 10). By repeating the addition of three tasks in this way, a ladder with any even number of vertices can be constructed. The number of minimal cuts of the ladder grows as the square of the number of vertices and the minimal cuts are easy to enumerate. The task states in Figure 6 have been chosen so that the number of possible values for the longest path to vertex 2i is 1 more than 4 times the number of possible values for the longest path to vertex 2i - 2. It is easy to continue this pattern. Thus, although the number of minimal cuts is growing polynomially in the size of the network, the number of states of the project is growing exponentially. It seems unlikely that we can devise an algorithm that computes characteristics of the distribution of project duration without enumerating all possible states of the project, which we certainly cannot do in time polynomial per cut. Thus enumerating cuts does not appear to be a promising approach.
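The exponential growth claimed for the ladder follows from the stated pattern by a short calculation. Writing $s_i$ for the number of possible values of the longest path to vertex 2i (a symbol introduced here for illustration, not used in the paper), and leaving the starting count $s_1 \ge 1$ unspecified, the pattern is a linear recurrence with an exponential solution:

```latex
\begin{aligned}
s_i &= 4\,s_{i-1} + 1, \\
s_i + \tfrac{1}{3} &= 4\left(s_{i-1} + \tfrac{1}{3}\right), \\
s_i &= 4^{\,i-1}\left(s_1 + \tfrac{1}{3}\right) - \tfrac{1}{3} \;\ge\; 4^{\,i-1}.
\end{aligned}
```

The number of minimal cuts, by contrast, grows only as the square of the number of vertices 2i, so any method doing polynomial work per cut falls behind the number of project states as the ladder lengthens.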

We conclude that if we wish to look for algorithms that significantly improve on PIVOTS, we should think of enumerating some combinatorial structure in the network other than minimal cuts, minimal paths, or spanning arborescences. We must expect that in some cases the number of instances of this structure will be exponentially more than the number of minimal paths. Similarly, it seems likely that in other cases the number of instances of this structure will be exponentially more than the number of minimal cuts. In order to improve on PIVOTS, the number of instances of this structure should be less than the number of spanning arborescences.

6. FINAL REMARKS

The algorithm presented here takes as input discrete, finite, independent distributions for task durations and computes characteristics of the probability distribution of project duration. For sparse networks, Martin’s method [10] is likely to be faster. The algorithm seems to be the best presently available for exact computation on dense PERT networks, and the discussion of Section 5 implies that it may be difficult to find an algorithm with better worst-case performance.

If the computation budget is limited, the alternatives would seem to be approximative methods. One class of methods contains bounding algorithms such as those of Shogan [18] and Dodin [5]. (These papers give references to other bounding algorithms.) These methods seem to work well in practice, but we cannot predict ahead of time how tight the bounds are going to be. The second class contains Monte Carlo methods, which are surveyed in Elmaghraby [6]. These too seem to work well in practice, but again we have difficulty in predicting the behavior of an experiment ahead of time. This difficulty may be intrinsic to the problem. The parallel of PERT problems with reliability problems and the results of Rosenthal [16] indicate that obtaining a PERT characteristic to a specified accuracy may be NP-hard.
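To make the Monte Carlo alternative concrete, the following is a minimal sketch of crude sampling for one value of the cumulative distribution function of project duration. It is not taken from any of the cited papers. The five-task network, its two-point duration distributions, and the names `TASKS`, `project_duration`, and `estimate_cdf` are all assumptions chosen for illustration.

```python
import random

# Hypothetical activity-on-arc network with precedence numbering (i < j).
# Each task has a discrete distribution given as (duration, probability) pairs.
TASKS = {
    (1, 2): [(1, 0.5), (3, 0.5)],
    (1, 3): [(2, 0.5), (4, 0.5)],
    (2, 3): [(0, 0.5), (2, 0.5)],
    (2, 4): [(3, 0.5), (5, 0.5)],
    (3, 4): [(1, 0.5), (2, 0.5)],
}


def project_duration(durations):
    """Longest path from vertex 1 to the highest-numbered vertex."""
    earliest = {1: 0}
    # Sorting arcs by tail is a valid processing order because i < j on every arc.
    for (i, j) in sorted(durations):
        earliest[j] = max(earliest.get(j, 0), earliest[i] + durations[(i, j)])
    return earliest[max(earliest)]


def sample_durations(rng):
    """Draw one independent realization of all task durations."""
    return {
        arc: rng.choices([d for d, _ in dist], weights=[p for _, p in dist])[0]
        for arc, dist in TASKS.items()
    }


def estimate_cdf(t, n_samples=20000, seed=0):
    """Crude Monte Carlo estimate of P(project duration <= t)."""
    rng = random.Random(seed)
    hits = sum(
        project_duration(sample_durations(rng)) <= t for _ in range(n_samples)
    )
    return hits / n_samples


if __name__ == "__main__":
    for t in range(3, 9):
        print(t, estimate_cdf(t))
```

The estimate carries sampling error that depends on the sample size and on the unknown true probability, which is one way of seeing why the behavior of such an experiment is hard to predict in advance.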

Since the input data for task durations is likely to be inaccurate, we might ask what advantage there is to exact computation of output when it is based on inaccurate input. If the network is small enough to be handled using the algorithm presented here, the capability for sensitivity analysis can be exploited to obtain bounds on the effects of lack of precise data on task durations. Two distributions may be used as input: one a lower bound on task duration distributions, the other an upper bound on task duration distributions. The output will then give characteristics of bounds on the project duration distribution. As pointed out in Section 4, this will not greatly increase computation time over that required for a single distribution.
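The bounding idea can be illustrated on a toy network. The sketch below uses brute-force enumeration of all task states, not the algorithm of this paper, and it evaluates the two input distributions in separate passes rather than simultaneously. The network, the two-point ranges in `RANGES`, and the probability weights in `P_SHORT_LOW` and `P_SHORT_HIGH` are hypothetical. Both inputs share the same range and differ only in the probability weights.

```python
from itertools import product

# Hypothetical two-point duration ranges (short, long) for each task (i, j), i < j.
RANGES = {
    (1, 2): (1, 3),
    (1, 3): (2, 4),
    (2, 3): (0, 2),
    (2, 4): (3, 5),
    (3, 4): (1, 2),
}

# Assumed probability of the short duration under each bounding input.
# More weight on short durations gives the stochastically smaller input.
P_SHORT_LOW = {arc: 0.7 for arc in RANGES}   # lower bound on task durations
P_SHORT_HIGH = {arc: 0.4 for arc in RANGES}  # upper bound on task durations


def longest_path(durations):
    """Project duration: longest path from vertex 1 to the last vertex."""
    earliest = {1: 0}
    for (i, j) in sorted(durations):  # valid order because i < j on every arc
        earliest[j] = max(earliest.get(j, 0), earliest[i] + durations[(i, j)])
    return earliest[max(earliest)]


def duration_cdf(p_short, t):
    """Exact P(project duration <= t) by enumerating all 2^5 task states."""
    arcs = list(RANGES)
    total = 0.0
    for state in product((0, 1), repeat=len(arcs)):
        prob = 1.0
        durations = {}
        for arc, s in zip(arcs, state):
            durations[arc] = RANGES[arc][s]
            prob *= p_short[arc] if s == 0 else 1.0 - p_short[arc]
        if longest_path(durations) <= t:
            total += prob
    return total


if __name__ == "__main__":
    for t in range(4, 9):
        print(t, duration_cdf(P_SHORT_HIGH, t), duration_cdf(P_SHORT_LOW, t))
```

Because project duration is a nondecreasing function of independent task durations, the cumulative distribution function computed from the stochastically smaller input lies on or above the one computed from the larger input at every `t`. Any true distribution whose weights fall between the two therefore yields a curve between them.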

In this paper, we have presented an algorithm for general PERT networks that may be close to as efficient as possible for exact computation. The largest-size network that will be reasonable to compute this way will depend on the computer budget available. If the computer budget is available, the effects of inaccuracy in the input data can be analyzed using the algorithm’s sensitivity analysis capability.


References

[1] S. Baase, Computer Algorithms: Introduction to Design and Analysis. Addison-Wesley, Reading, MA (1978).

[2] R. Bala, Variations of PIVOTS: Improving the performance of a PERT algorithm. Technical report, Department of Information and Decision Sciences, University of Illinois, Box 4348, Chicago 60680 (1985).

[3] M. O. Ball and R. M. Van Slyke, Backtracking algorithms for network reliability analysis, Ann. Discrete Math. 1 (1977) 49-64.

[4] R. E. Barlow and F. Proschan, Statistical Theory of Reliability and Life Testing. Holt, Rinehart, and Winston, New York (1975).

[5] B. Dodin, Bounding the project completion time distribution in PERT networks. Operations Res. 33 (1985) 862-881.

[6] S. E. Elmaghraby, Activity Networks. Wiley, New York (1977).

[7] S. Even, Graph Algorithms. Computer Science Press, Potomac, MD (1979).

[8] J. N. Hagstrom, Computational complexity of PERT problems. Networks 18 (1988) 139-147.

[9] G. B. Kleindorfer, Bounding distributions for a stochastic acyclic network. Operations Res. 19 (1975) 1586-1601.

[10] J. J. Martin, Distribution of the time through a directed acyclic network. Operations Res. 13 (1965) 46-66.

[11] P. B. Mirchandani, Shortest distance and reliability of probabilistic networks. Comput. Operations Res. 3 (1976) 347-355.

[12] F. Moskowitz, The analysis of redundancy networks. AIIE Trans. Part I: Commun. Electronics 77 (1958) 627-632.

[13] A. A. B. Pritsker and P. J. Kiviat, Simulation with GASP II. Prentice-Hall, Englewood Cliffs, NJ (1969).

[14] J. S. Provan and M. O. Ball, Computing network reliability in time polynomial in the number of cuts. Operations Res. 32 (1984) 516-526.

[15] J. S. Provan and M. O. Ball, The complexity of counting cuts and of computing the probability that a graph is connected. SIAM J. Computing 12 (1983) 777-778.

[16] A. Rosenthal, A computer scientist looks at reliability computations. In Reliability and Fault Tree Analysis (R. E. Barlow, J. B. Fussell, and N. D. Singpurwalla, Eds.), SIAM, Philadelphia (1975) 133-152.

[17] A. Satyanarayana and A. Prabhakar, New topological formula and rapid algorithm for reliability analysis of complex networks. IEEE Trans. Reliability R-27 (1978) 82-100.

[18] A. W. Shogan, Bounding distributions for a stochastic PERT network. Networks 7 (1977) 359-381.

[19] R. M. Van Slyke, Monte Carlo methods and the PERT problem. Operations Res. 11 (1963) 839-860.

Received April 1987 Revised July 1989