Numerical Analysis of Large Markov

Reward Models

Miklós Telek

Department of Telecommunications,

Technical University of Budapest, Hungary, [email protected]

Sándor Rácz

Department of Telecommunications and Telematics,

Technical University of Budapest, Hungary, [email protected]

Abstract

First analysis of Markov Reward Models (MRM) resulted in a double transform expression, whose numerical solution is based on inverse transformations in both the time and the reward variable domain. Better numerical methods were proposed based on the time domain properties of these models, such as the set of partial differential equations that describes the evolution of the process in time.

This paper introduces an effective numerical method for the analysis of MRMs based on the transform domain description of the system, which allows the evaluation of models with large state spaces (∼ 10^6 states). The proposed method provides the moments of reward measures at the same computational cost and memory requirement as the transient analysis of the underlying Continuous Time Markov Chain, and benefits from the advantages of the Randomization method, which avoids numerical instabilities and provides a global error bound in advance of the computation. Implementation notes and numerical examples demonstrating the numerical properties of the proposed method are also provided.

Key words: Markov Reward Models, Performability, Completion Time, Randomization.

1 M. Telek was partially supported by OTKA F-23971. S. Rácz thanks the support of HSNLab. The authors thank the help of Gergely Mátéfi in the implementation of the proposed method.

Preprint submitted to Elsevier Preprint 19 November 2004

1 Introduction

Stochastic reward processes have been studied for a long time [13,9], because the possibility of associating a reward variable with each system state increases the descriptive power and the modeling flexibility. However, only recently have stochastic reward models (SRM) received attention as a modeling tool in performance evaluation of computer and communication systems. Common assignments of the reward rates are: execution rates of tasks in computing systems (the computational capacity) [1,20], number of active processors (or processing power) [3,8], throughput [14], available bandwidth [2], or average response time [12].

Two main points of view have been assumed in the literature when dealing with SRMs [11]. In the system oriented point of view the most significant measure is the total amount of work done by the system in a finite interval. This measure is often referred to as performability [14]. In the user oriented (or task oriented) point of view the system is regarded as a server, and the emphasis of the analysis is on the ability of the system to accomplish an assigned task in due time. Consequently, the most characterizing measure becomes the probability of accomplishing an assigned service in a given time.

A unified formulation of the system oriented and the user oriented points of view was provided in [11,16], together with the double Laplace transform expression of the completion time for the case when the underlying stochastic process Z(t) is a Continuous Time Markov Chain (CTMC). This case is referred to as Markov Reward Model (MRM).

Various numerical techniques were proposed for the evaluation of the system and the user oriented measures of MRMs. Some of these methods calculate the distribution of reward measures. The distribution, in double transform domain, can be obtained by a symbolic matrix inversion. If the size of the state space allows the symbolic matrix inversion to be carried out, then multi-dimensional numerical inverse transform methods [22] can provide the time domain results; but, due to the computational complexity of the symbolic inversion of matrices, this approach is not applicable for models with more than 20 states.

In the time domain, reward measures can be described either by a set of equations with convolution integrals, or by a set of partial differential equations, but the numerical methods that compute the distribution in the time domain are usually based on the evaluation of a double summation, where both of the summation parameters increase to infinity. The discrete summations are obtained by adopting the randomization technique [19]. The randomization technique usually provides nice numerical properties and an overall error bound. The numerical

methods based on this approach [6,5,15] differ in the complexity and memory requirement of one iteration step. The methods in [5,15] are of polynomial complexity with respect to the size of the state space.

MRMs with special features allow special, effective numerical approaches. In the case when the underlying CTMC has an absorbing state, in which no useful work is performed, it is easy to evaluate the limiting distribution of performability [1]. The numerical method in [7] makes use of a special structure of the underlying CTMC.

The numerical analysis of the distribution of reward measures is, in general, more complex than the computation of the moments of those measures. The mean of performability can be obtained by the transient analysis of the underlying CTMC. A numerical convolution approach is proposed in [10] to evaluate the (n + 1)-th moment of performability based on its n-th moment. A similar approach is followed in [21] to calculate the moments of the user oriented measures, but the high computational complexity of the numerical convolution does not allow this approach to be applied to MRMs with large (> 100) state spaces. Other direct methods make use of a spectral or partial fraction decomposition, which is relatively easy for acyclic CTMCs, since the eigenvalues of the generator matrix are available in its diagonal [18]. The subclass of MRMs where the user has an associated Phase-type distributed random work requirement was studied in [4]. In this case the completion time is Phase type distributed, i.e., an "extended" CTMC can be defined which characterizes the distribution of the completion time.

There are very few general numerical methods applicable for the reward analysis of MRMs with more than 10^5 states, while there are effective numerical methods to compute the steady state, the transient and the cumulative transient measures of large CTMCs [19,17]. It seems that only those reward measures of large MRMs can be evaluated which are associated (with simple computation) with the steady state, the transient or the cumulative transient measures of a CTMC of the same size.

In this paper, we provide a method based on the transform domain description of MRMs which allows the reward analysis of large models. Indeed, the proposed method evaluates each required moment of reward measures at the same computational cost as the transient analysis of the underlying CTMC; hence, it outperforms all the above mentioned general methods, at least regarding the size of the models for which the numerical analysis is feasible.

The paper is organized as follows. Section 2 provides a summary of results about MRMs. Section 3 presents the analysis of the accumulated reward, while Section 4 presents the completion time analysis of MRMs. Section 5 gives some implementation issues of the proposed computational approach. In Section 6 two numerical examples are investigated, and the paper is concluded in Section 7.

2 Markov Reward Models

In this section we provide the definitions and the well known results about MRMs, but following a different (perhaps simpler) way of reasoning than the one in the original papers.

Let {Z(t), t ≥ 0} be a CTMC over the finite state space S = {1, 2, . . . , M} with generator Q = [q_{ij}] and initial distribution P = [p_i]. A non-negative real constant r_i, i ∈ S, is associated with each state of the process, representing the reward rate (the performance index) in state i. Let R be the diagonal matrix of the reward rates (i.e., R = diag(r_1, r_2, . . . , r_M)).

Let ℓ(t) = [ℓ_i(t)] denote the transient state probability vector (ℓ_i(t) = Pr{Z(t) = i}) and L(t) = [L_i(t)] denote the cumulative state probability vector (L_i(t) = \int_0^t ℓ_i(τ) dτ). It is known that ℓ(t) = P e^{Qt} and L(t) = P \int_0^t e^{Qτ} dτ.
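As a point of reference for the randomization-based computations used later in the paper, the following is a minimal sketch (not part of the original text; the function name and the stopping tolerance are illustrative assumptions) of how ℓ(t) and L(t) can be computed by uniformization for a small CTMC.

# Minimal sketch: transient and cumulative state probabilities by uniformization.
# Assumptions: dense NumPy matrices and a moderate qt; illustrative only.
import numpy as np

def transient_and_cumulative(Q, P0, t, eps=1e-12):
    """Return (ell(t), L(t)) for a CTMC with generator Q and initial vector P0."""
    q = np.max(np.abs(np.diag(Q)))            # uniformization rate
    A = np.eye(Q.shape[0]) + Q / q            # stochastic matrix
    v = np.asarray(P0, dtype=float)           # v holds P0 * A^i
    ell = np.zeros_like(v)
    L = np.zeros_like(v)
    pois = np.exp(-q * t)                     # Poisson(i; qt), starting with i = 0
    cum = pois                                # sum of Poisson weights up to i
    i = 0
    while True:
        ell += pois * v
        L += (1.0 - cum) / q * v              # weight of A^i in the integral of ell
        if 1.0 - cum <= eps:
            break
        i += 1
        v = v @ A
        pois *= q * t / i
        cum += pois
    return ell, L

The randomization step of Section 5 applies the same A = Q/q + I construction to the reward analysis.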

Definition 1 The accumulated reward B(t) is the random variable which represents the accumulation of reward in time:

B(t) = \int_0^t r_{Z(\tau)}\, d\tau     (1)

and

B_i(t) = \int_0^t r_{Z(\tau)}\, d\tau, \quad \text{if } Z(0) = i.     (2)

By this definition, B(t) is a stochastic process that depends on Z(u) for 0 ≤ u ≤ t, and B(0) = 0. According to Definition 1, this paper restricts its attention to the class of models in which no state transition entails a loss of the accumulated reward. This kind of process is called a preemptive resume model. The distribution of the accumulated reward is defined by

B(t, w) = \Pr\{B(t) \le w\}     (3)

and

B_i(t, w) = \Pr\{B_i(t) \le w\}.     (4)

Note that

B(t, w) = \sum_{i \in S} p_i\, B_i(t, w),     (5)

hence, in the rest of this paper, we use the initial state dependent measures; the global measures can always be evaluated by means of this relation.

Definition 2 The completion time, C_i, is the random variable representing the time to accumulate the random amount of reward W:

C_i = \min[t \ge 0 : B_i(t) = W].     (6)

The distribution of Ci is

Ci(t) = Pr{Ci ≤ t} . (7)

Let C_i(w) be the random variable representing the time to accumulate a fixed amount w of reward, and C_i(t, w) its distribution, i.e.,

Ci(w) = min[t ≥ 0 : Bi(t) = w] , (8)

Ci(t, w) = Pr{Ci(w) ≤ t} . (9)

Let G(w) be the distribution of W with support on [0, ∞). By Definition 2,

C_i(t) = \int_0^\infty C_i(t, w)\, dG(w).     (10)

The distribution of the completion time is closely related to the distribution of the accumulated reward by means of the following relation (see Figure 1):

Bi(t, w) = Pr{Bi(t) ≤ w} = Pr{Ci(w) ≥ t} = 1 − Ci(t, w) . (11)

Theorem 1 The column vector of the distribution of the accumulated reward (B(t, w) = [B_i(t, w)]) is given as follows:

B^{\sim}(t, v) = e^{(Q - vR)t} \cdot h     (12)

where ∼ denotes the Laplace-Stieltjes transform with respect to w (→ v), and h is the column vector with all entries equal to 1.

[Figure: sample paths of Z(t) (visiting states i, j, k) and of B(t), which grows with slopes r_i, r_j, r_k; the completion time C(w) is the instant at which B(t) reaches the work requirement w.]

Fig. 1. A sample path of Z(t) and B(t).

Proof: Consider an exponentially distributed work requirement W with parameter m. On the one hand, the completion time is characterized by the following distribution function

C_i(t) = \int_0^\infty C_i(t, w)\, dG(w) = \int_0^\infty \big(1 - B_i(t, w)\big)\, dG(w)     (13)
       = m \int_0^\infty \big(1 - B_i(t, x)\big)\, e^{-mx}\, dx = 1 - B_i^{\sim}(t, v)\big|_{v=m}

which, in vector form, is

C(t) = h - B^{\sim}(t, v)\big|_{v=m}.     (14)

On the other hand, C_i(t) is phase type distributed and its distribution can be obtained from the representation of the phase type distribution (the original CTMC plus an absorbing state to which the transition from state i ∈ S occurs at rate m r_i) [4]:

C(t) = h - e^{(Q - mR)t} \cdot h.     (15)

Since (12) is analytic for ℜ(v) ≥ 0, the theorem follows. ✷

A further Laplace-Stieltjes transform of (12) with respect to t results in:

B^{\sim\sim}(s, v) = s\,(sI + vR - Q)^{-1} \cdot h     (16)

In order to simplify the transform domain expressions, in the rest of the paper we apply the most convenient version of them, using the F^{\sim}(a) = a F^{*}(a) rule (e.g., B^{*\sim}(s, v) = (sI + vR - Q)^{-1} \cdot h and B^{\sim *}(s, v) = (s/v)(sI + vR - Q)^{-1} \cdot h). Detailed derivations in [10] resulted in the same expression for the distribution of the accumulated reward, based on different approaches. From (11) and (16), using Q · h = 0, we have:

C^{\sim\sim}(s, v) = h - B^{\sim\sim}(s, v)
                   = [I - s(sI + vR - Q)^{-1}] \cdot h
                   = [(sI + vR - Q)^{-1}(sI + vR - Q) - s(sI + vR - Q)^{-1}] \cdot h
                   = (sI + vR - Q)^{-1}(vR - Q) \cdot h
                   = v\,(sI + vR - Q)^{-1} R \cdot h     (17)

which was obtained by a different way of reasoning in [11]. Suppose R^{-1} exists, i.e., r_i > 0 for all i ∈ S; then (17) can be inverse transformed with respect to the reward variable as follows:

C^{\sim *}(s, v) = (sI + vR - Q)^{-1} \cdot (R^{-1})^{-1} \cdot h = (sR^{-1} + vI - R^{-1}Q)^{-1} \cdot h,     (18)

from which

C^{\sim}(s, w) = e^{(R^{-1}Q - sR^{-1})w} \cdot h.     (19)

A kind of duality can be observed comparing (12) and (19). Assume that {Z′(w), w ≥ 0} is a CTMC over S with generator Q′ = R^{-1} · Q (which is a proper generator matrix). The mean reward accumulated up to time t (w) by Z(t) (Z′(w)) with reward rate matrix R (R′ = R^{-1}) can be evaluated by multiplying the cumulative state probabilities with the associated reward rates:

E\{B(t)\} = \int_0^t e^{Q\tau}\, d\tau \cdot R \cdot h \qquad \text{and} \qquad E\{B'(w)\} = \int_0^w e^{Q'\tau}\, d\tau \cdot R' \cdot h.     (20)

Now, by (12) and (19), one can see that the mean time to accumulate w units of reward by Z(t) equals the mean reward accumulated by Z′(w), and vice versa, i.e.,

E\{C(w)\} = E\{B'(w)\} \qquad \text{and} \qquad E\{C'(t)\} = E\{B(t)\}.     (21)

Note that we did not restrict the class of MRMs up to (18); hence the results are valid for any reducible or irreducible underlying CTMC and any non-negative reward rates. In (18)–(21), the only restriction is that R must be invertible, i.e., only strictly positive reward rates are allowed.

3 Moments of the accumulated reward

Let m_i^{(n)}(t) = E\{B_i(t)^n\} be the n-th moment of the reward accumulated in [0, t). The column vector m^{(n)}(t) = [m_i^{(n)}(t)] can be evaluated based on B^{\sim}(t, v) as

m^{(n)}(t) = (-1)^n \left. \frac{\partial^n B^{\sim}(t, v)}{\partial v^n} \right|_{v=0}.     (22)

The following theorem provides a computationally effective, recursive method for the numerical analysis of the moments of the accumulated reward.

Theorem 2 The n-th moment (n ≥ 1) of the accumulated reward is

m^{(n)}(t) = (-1)^n \sum_{i=0}^{\infty} \frac{t^i}{i!}\, N^{(n)}(i) \cdot h     (23)

where N^{(n)}(i) is defined as

N^{(n)}(i) = \begin{cases}
I, & \text{if } i = n = 0, \\
0, & \text{if } i = 0,\ n \ge 1, \\
Q^i, & \text{if } i \ge 1,\ n = 0, \\
Q \cdot N^{(n)}(i-1) - n\, R \cdot N^{(n-1)}(i-1), & \text{if } i \ge 1,\ n \ge 1.
\end{cases}     (24)

To prove the theorem we need the following results.

Lemma 1 If F(t) and G(t) are real-valued, n times differentiable matrix functions and F''(t) = 0, then

(F(t) \cdot G(t))^{(n)} = F(t) \cdot G^{(n)}(t) + n\, F'(t) \cdot G^{(n-1)}(t), \qquad n \ge 1.     (25)

Proof of Lemma 1

1. For n = 1,

(F(t) \cdot G(t))' = F(t) \cdot G'(t) + F'(t) \cdot G(t)     (26)

holds.

2. Assuming (25) holds for n = k, it follows that

(F(t) \cdot G(t))^{(k+1)} = \sum_{l=0}^{k+1} \binom{k+1}{l} F^{(l)}(t) \cdot G^{(k+1-l)}(t) = F(t) \cdot G^{(k+1)}(t) + (k+1)\, F'(t) \cdot G^{(k)}(t)     (27)

where the assumption for n = k and F''(t) = 0 is used. ✷

Lemma 2 If i, n ≥ 1 then

\left. \frac{\partial^n}{\partial v^n}(Q - vR)^i \right|_{v=0} = Q \cdot \left. \frac{\partial^n}{\partial v^n}(Q - vR)^{i-1} \right|_{v=0} - n\, R \cdot \left. \frac{\partial^{n-1}}{\partial v^{n-1}}(Q - vR)^{i-1} \right|_{v=0}     (28)

Proof of Lemma 2 Let F(v) = (Q - vR) and G(v) = (Q - vR)^{i-1}. From Lemma 1,

\frac{\partial^n}{\partial v^n}(Q - vR)^i = (Q - vR) \cdot \frac{\partial^n}{\partial v^n}(Q - vR)^{i-1} - n\, R \cdot \frac{\partial^{n-1}}{\partial v^{n-1}}(Q - vR)^{i-1}     (29)

which implies the lemma. ✷

Proof of Theorem 2 From (22) and (12),

m^{(n)}(t) = (-1)^n \left. \frac{\partial^n e^{(Q - vR)t}}{\partial v^n} \right|_{v=0} \cdot h
           = (-1)^n \left. \frac{\partial^n}{\partial v^n} \sum_{i=0}^{\infty} \frac{t^i}{i!}(Q - vR)^i \right|_{v=0} \cdot h
           = (-1)^n \sum_{i=0}^{\infty} \frac{t^i}{i!} \left. \frac{\partial^n}{\partial v^n}(Q - vR)^i \right|_{v=0} \cdot h.     (30)

Let

N^{(n)}(i) = \left. \frac{\partial^n}{\partial v^n}(Q - vR)^i \right|_{v=0}, \qquad n, i \ge 1.     (31)

From Lemma 2 it follows that

N^{(n)}(i) = Q \cdot N^{(n)}(i-1) - n\, R \cdot N^{(n-1)}(i-1),     (32)

with the initial conditions N^{(0)}(0) = I, N^{(0)}(i) = Q^i and N^{(n)}(0) = 0. By this recursion N^{(n)}(i) = 0 if i < n. This completes the proof of Theorem 2. ✷

The iterative procedure to evaluate N^{(n)}(i) has the following properties:

• it is not possible to evaluate the n-th moment by itself: to obtain the n-th moment all the previous moments (or at least the associated N^{(n)}(i) terms) must be computed;
• matrix-matrix multiplications are computed in each iteration step;
• numerical problems are possible due to the repeated multiplication with Q, which contains both positive and negative elements; hence Theorem 2 is not directly applicable for numerical analysis (a small sketch below illustrates the recursion on a toy model only).
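To make the recursion of Theorem 2 concrete, the following is a minimal sketch (not from the paper; the function name, the fixed truncation depth and the dense matrix representation are illustrative assumptions). It relies on repeated matrix-matrix products with Q and is therefore suitable only for very small state spaces; the randomized variant of Section 5 should be used otherwise.

# Minimal sketch of eq. (23)-(24): n-th moment of the accumulated reward,
# computed with dense matrix-matrix products (toy models only).
import numpy as np

def moments_theorem2(Q, R, t, n, imax=200):
    """Return the vector m^(n)(t), one entry per initial state (R is the diagonal reward matrix)."""
    M = Q.shape[0]
    h = np.ones(M)
    N = [np.zeros((M, M)) for _ in range(n + 1)]   # N[k] holds N^(k)(i) for the current i
    N[0] = np.eye(M)                               # N^(0)(0) = I, N^(k)(0) = 0 for k >= 1
    m = N[n] @ h                                   # i = 0 term of (23), weight t^0/0! = 1
    coeff = 1.0                                    # t^i / i!
    for i in range(1, imax + 1):
        for k in range(n, 0, -1):                  # decreasing k, so N[k-1] is still at step i-1
            N[k] = Q @ N[k] - k * (R @ N[k - 1])
        N[0] = Q @ N[0]                            # N^(0)(i) = Q^i
        coeff *= t / i
        m = m + coeff * (N[n] @ h)
    return (-1) ** n * m

The fixed truncation depth imax plays the role of the error-bounded truncation point G derived in Section 5.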

4 Moments of the completion time

Let s_i^{(n)}(w) = E\{C_i(w)^n\} be the n-th moment of the time to accumulate w amount of reward. The column vector s^{(n)}(w) = [s_i^{(n)}(w)] can be evaluated based on C^{\sim}(s, w) as

s^{(n)}(w) = (-1)^n \left. \frac{\partial^n C^{\sim}(s, w)}{\partial s^n} \right|_{s=0}.     (33)

Theorem 3 The n-th moment of the completion time, s^{(n)}(w), satisfies the following equation

s^{(n)}(w) = (-1)^n \sum_{i=n}^{\infty} \frac{w^i}{i!}\, M^{(n)}(i) \cdot h     (34)

where M^{(n)}(i) is defined as

M^{(n)}(i) = \begin{cases}
I, & i = n = 0, \\
0, & i = 0,\ n \ge 1, \\
(R^{-1} \cdot Q)^i, & i \ge 1,\ n = 0, \\
R^{-1}\big(Q \cdot M^{(n)}(i-1) - n\, M^{(n-1)}(i-1)\big), & i, n \ge 1.
\end{cases}     (35)

Proof of Theorem 3 Using

s^{(n)}(w) = (-1)^n \left. \frac{\partial^n}{\partial s^n}\, e^{(R^{-1} \cdot Q - sR^{-1})w} \right|_{s=0} \cdot h     (36)

the proof follows the same pattern as the proof of Theorem 2. ✷

The numerical method based on Theorem 3 has the same properties as the one based on Theorem 2. In contrast with Theorem 2, the application of Theorem 3 is restricted to MRMs with strictly positive reward rates, while, as in Theorem 2, there is no restriction on the underlying CTMC.

4.1 System with zero reward rates

When some of the reward rates are zero, Theorem 3 cannot be applied for computing the moments of the completion time. In this section we give a method which can handle this case.

Let us partition the state space S into two disjoint sets S+ and S0. S+ (S0) contains the states with associated positive (zero) reward rate, i.e., r_i > 0, ∀i ∈ S+ and r_i = 0, ∀i ∈ S0. The accumulated reward does not increase during a sojourn in S0. If S0 has an absorbing subset then the distribution of the completion time is defective, i.e., there is a positive probability that C_i(w) = ∞. In the subsequent analysis we do not allow this case.

Without loss of generality, we number the states in S such that i < j, ∀i ∈ S+, ∀j ∈ S0. By this partitioning of the state space the reward rate matrix and

the generator matrix have the following sub-block structure:

R = \begin{pmatrix} R_1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad Q = \begin{pmatrix} Q_1 & Q_2 \\ Q_3 & Q_4 \end{pmatrix}.     (37)

Note that Q_4 is invertible as a consequence of the requirement that S0 has no absorbing subset. The partitioned forms of the performance vectors are:

C^{\sim\sim}(s, v) = \begin{pmatrix} C_1^{\sim\sim}(s, v) \\ C_2^{\sim\sim}(s, v) \end{pmatrix}, \qquad s^{(n)}(w) = \begin{pmatrix} s_1^{(n)}(w) \\ s_2^{(n)}(w) \end{pmatrix}.     (38)

Theorem 4 The n-th moment of the completion time, s^{(n)}(w), can be computed as follows:

s_1^{(n)}(w) = (-1)^n \sum_{i=0}^{\infty} \frac{w^i}{i!}\, L^{(n)}(i) \cdot h     (39)

s_2^{(n)}(w) = (-1)^n \sum_{i=0}^{\infty} \frac{w^i}{i!}\, H^{(n)}(i) \cdot h     (40)

where

L^{(n)}(i) = \begin{cases}
0, & i = 0,\ n > 0, \\
(R_1^{-1} Q_1 - R_1^{-1} Q_2 Q_4^{-1} Q_3)^i, & i \ge 0,\ n = 0, \\
-R_1^{-1} Q_2 Q_4^{-2} Q_3 - R_1^{-1}, & i = 1,\ n = 1, \\
(-1)^{n+1}\, n!\ R_1^{-1} Q_2 Q_4^{-n-1} Q_3, & i = 1,\ n \ge 2, \\
\sum_{\ell=0}^{n} \binom{n}{\ell} L^{(\ell)}(1) \cdot L^{(n-\ell)}(i-1), & i \ge 2,\ n \ge 1,
\end{cases}     (41)

H^{(n)}(i) = (-1)^n \sum_{\ell=0}^{n} \binom{n}{\ell}\, \ell!\ Q_4^{-(\ell+1)} Q_3 \cdot L^{(n-\ell)}(i), \qquad i \ge 0,\ n \ge 0.     (42)

Proof of Theorem 4 Substituting the vectors and matrices in (17) with their partitioned forms, and using the following form of the matrix inverse

\begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1} = \begin{pmatrix} (A - BD^{-1}C)^{-1} & -(A - BD^{-1}C)^{-1} B D^{-1} \\ -D^{-1} C (A - BD^{-1}C)^{-1} & D^{-1} + D^{-1} C (A - BD^{-1}C)^{-1} B D^{-1} \end{pmatrix}

where

A = sI_1 + vR_1 - Q_1, \quad B = -Q_2, \quad C = -Q_3, \quad D = sI_4 - Q_4,

for C_1^{\sim\sim}(s, v) we have:

C_1^{\sim\sim}(s, v) = v \big[ sI_1 + vR_1 - Q_1 - Q_2 \cdot (sI_4 - Q_4)^{-1} \cdot Q_3 \big]^{-1} \cdot R_1 \cdot h.     (43)

Since R_1^{-1} exists by its definition, the inverse Laplace transform of (43) with respect to v → w gives

C_1^{\sim}(s, w) = e^{\alpha(s) w} \cdot h = \sum_{i=0}^{\infty} \frac{\alpha(s)^i}{i!}\, w^i \cdot h     (44)

where

\alpha(s) = R_1^{-1} Q_1 + R_1^{-1} Q_2 (sI_4 - Q_4)^{-1} Q_3 - s R_1^{-1}.     (45)

The n-th moment of the completion time is

s_1^{(n)}(w) = (-1)^n \left. \frac{\partial^n}{\partial s^n} C_1^{\sim}(s, w) \right|_{s=0} = (-1)^n \sum_{i=0}^{\infty} \frac{w^i}{i!} \left. \frac{\partial^n}{\partial s^n} \alpha(s)^i \right|_{s=0} \cdot h     (46)

where the n-th derivative of \alpha(s)^i can be evaluated using the Leibniz rule

\big( \alpha(s) \cdot \alpha(s)^{i-1} \big)^{(n)} = \sum_{l=0}^{n} \binom{n}{l}\, \alpha(s)^{(l)} \cdot \big( \alpha(s)^{i-1} \big)^{(n-l)}.     (47)

Now L^{(n)}(i) = \left. \frac{\partial^n}{\partial s^n} \alpha(s)^i \right|_{s=0} completes the proof for s_1^{(n)}(w).

The same partitioning of (17) gives

C_2^{\sim}(s, w) = (sI_4 + Q_4)^{-1} \cdot Q_3 \cdot C_1^{\sim}(s, w) = \sum_{i=0}^{\infty} \frac{w^i}{i!}\, (sI_4 + Q_4)^{-1} \cdot Q_3 \cdot \alpha(s)^i \cdot h     (48)

and applying the Leibniz rule as before,

s_2^{(n)}(w) = (-1)^n \left. \frac{\partial^n}{\partial s^n} C_2^{\sim}(s, w) \right|_{s=0} = (-1)^n \sum_{i=0}^{\infty} \frac{w^i}{i!}\, H^{(n)}(i) \cdot h     (49)

gives the theorem. ✷

5 Numerical methods based on randomization

In the previous sections iterative procedures were provided to compute the moments of reward measures, but due to the properties of digital computers using floating point numbers, a direct application of those methods would result in numerical problems such as instabilities, "ringing" (negative probabilities), etc. The main reason for these problems is that matrices with both positive and negative elements (like Q) are multiplied several times. To avoid these problems a modified procedure is proposed. Let

A = \frac{Q}{q} + I, \qquad S = \frac{R}{q d}     (50)

where q = \max_{i,j \in S}(|q_{ij}|) and d = \max_{i \in S}(r_i)/q. By this definition A is a stochastic matrix (0 ≤ a_{i,j} ≤ 1, ∀i, j ∈ S and \sum_{j \in S} a_{i,j} = 1, ∀i ∈ S) and S is a diagonal matrix such that 0 ≤ s_{i,i} ≤ 1, ∀i ∈ S. The dimension of d is unit of reward; d can be considered as a scaling factor of the accumulated reward. Using these matrices,

B^{\sim}(t, v) = e^{(Q - vR)t} \cdot h = e^{(A - vdS)qt} \cdot h\, e^{-qt}.     (51)

Theorem 5 The moments of the accumulated reward can be computed using only matrix-vector multiplications, saving only vectors of size #S in each step of the iteration, as

m^{(n)}(t) = n!\, d^n \sum_{i=0}^{\infty} U^{(n)}(i)\, \frac{(qt)^i}{i!}\, e^{-qt}     (52)

where

U^{(n)}(i) = \begin{cases}
0, & \text{if } i = 0,\ n \ge 1, \\
h, & \text{if } i \ge 0,\ n = 0, \\
A \cdot U^{(n)}(i-1) + S \cdot U^{(n-1)}(i-1), & \text{if } i \ge 1,\ n \ge 1.
\end{cases}     (53)

Proof of Theorem 5 Starting from (51), the proof of Theorem 5 follows the same pattern as the proof of Theorem 2. ✷

To demonstrate the iterative procedure of computing U^{(n)}(i), the first elements of U^{(n)}(i) evaluated based on (53) are provided in Table 1.

U^{(n)}(i)   i = 0   i = 1   i = 2        i = 3
n = 0        h       h       h            h
n = 1        0       Sh      ASh + Sh     AASh + ASh + Sh
n = 2        0       0       SSh          ASSh + SASh + SSh
n = 3        0       0       0            SSSh

Table 1.

Suppose one is interested in the first 3 moments of the accumulated reward. To perform the computation, 3 vectors of size #S are needed to store U^{(n)}(i), n = 1, 2, 3. In each iteration step i = 1, 2, 3, . . . matrix-vector multiplications and vector summations have to be performed according to (53), using the vectors of the previous iteration step and the constant matrices A and S. Figure 2 shows the dependency structure of the computation. One can recognize that only the (i − 1)-th column (iteration) of U is used for calculating the i-th column of U. Note that S is a diagonal matrix and A is as sparse as Q. Further, 3 vectors of the same size are needed to store the "actual value" of m^{(n)}(t), n = 1, 2, 3, according to (52).

[Figure: the array of vectors U^{(n)}(i) for n = 0, . . . , 3 and i = 0, . . . , 4; the row n = 0 contains h in every column, and each U^{(n)}(i) is obtained from U^{(n)}(i−1) by multiplying with A and from U^{(n−1)}(i−1) by multiplying with S.]

Fig. 2. The dependency structure of the iteration steps

The following theorem provides a global error bound of the procedure.

Theorem 6 The n-th moment of the accumulated reward can be calculated as a finite sum plus an error term, where the maximum allowed error is ε:

m^{(n)}(t) = n!\, d^n \sum_{i=0}^{G-1} U^{(n)}(i)\, \frac{(qt)^i}{i!}\, e^{-qt} + \xi(G)     (54)

where

G = \min_{g \in \mathbb{N}} \left\{ g : (qt)\, n!\, d^n \sum_{i=g-1}^{\infty} \frac{(qt)^i}{i!}\, e^{-qt} \le \varepsilon \right\}     (55)

and the 0 ≤ ξ(G) ≤ h ε inequality holds for all the elements of the vectors.

Proof of Theorem 6 By the definition of S and A,

0 \le S \cdot h \le h \qquad \text{and} \qquad 0 \le A \cdot S \cdot h \le h     (56)

hold element-wise (as do all the subsequent vector inequalities); hence U^{(n)}(i) is bounded by

0 \le U^{(n)}(i) \le i\, h.     (57)

The error ξ(g) incurred when truncating the tail of the infinite sum is also bounded:

\xi(g) = n!\, d^n \sum_{i=g}^{\infty} U^{(n)}(i)\, \frac{(qt)^i}{i!}\, e^{-qt} \le n!\, d^n \sum_{i=g}^{\infty} h\, i\, \frac{(qt)^i}{i!}\, e^{-qt} \le (qt)\, n!\, d^n \sum_{i=g-1}^{\infty} h\, \frac{(qt)^i}{i!}\, e^{-qt}     (58)

which gives the theorem. ✷

The error bound provided by the theorem is the tail of a Poisson distribution with mean qt, multiplied by the constant (qt) n! d^n. A Poisson distribution has a low squared coefficient of variation, (qt)^{-1}, which decreases as qt increases, and its tail has an exponential decay. Hence, when qt is large (> 100), G is mainly determined by qt and it has only a logarithmic dependence on the constant (qt) n! d^n and the precision requirement ε. In general, if qt > 100 then G and qt are of the same order of magnitude (G > qt). A high level description of the proposed method can be found in the appendix.
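As a concrete illustration of Theorems 5 and 6, the following sketch computes the moment vectors of the accumulated reward with sparse matrix-vector products and a truncation point chosen from the bound (55). It is not the authors' implementation: the function name, the SciPy sparse representation and the direct evaluation of the Poisson weights are assumptions, and for very large qt the weights should be computed in a numerically safer (e.g., logarithmic or scaled) way, as in standard randomization codes.

# Sketch of Theorems 5 and 6: moments of B(t), one vector of size #S per moment order.
import math
import numpy as np
import scipy.sparse as sp

def accumulated_reward_moments(Q, r, t, n, eps=1e-10, max_iter=10**7):
    """Return [m^(0)(t), ..., m^(n)(t)] as vectors indexed by initial state (m^(0) = h)."""
    Q = sp.csr_matrix(Q)
    M = Q.shape[0]
    r = np.asarray(r, dtype=float)
    q = np.max(np.abs(Q.diagonal()))               # randomization rate of eq. (50)
    d = np.max(r) / q                              # reward scaling factor
    A = sp.identity(M, format="csr") + Q / q       # stochastic matrix A
    s = r / (q * d)                                # diagonal of S, 0 <= s_ii <= 1

    # Truncation point G of eq. (55): smallest g with (qt) n! d^n * tail(g-1) <= eps.
    c = q * t * math.factorial(n) * d ** n
    pois = math.exp(-q * t)                        # Poisson(0; qt); underflows for very large qt
    cum, k = pois, 0
    while cum < 1.0 - eps / c and k < max_iter:
        k += 1
        pois *= q * t / k
        cum += pois
    G = k + 2

    # Iteration of eq. (53), keeping one vector per moment order.
    h = np.ones(M)
    U = [h] + [np.zeros(M) for _ in range(n)]      # U[k] holds U^(k)(i); U[0] stays h
    msum = [np.zeros(M) for _ in range(n + 1)]     # running sums of eq. (52)
    pois = math.exp(-q * t)
    for k in range(n + 1):
        msum[k] += pois * U[k]                     # i = 0 term
    for i in range(1, G):
        for k in range(n, 0, -1):                  # decreasing k: U[k-1] is still at step i-1
            U[k] = A @ U[k] + s * U[k - 1]
        pois *= q * t / i
        for k in range(n + 1):
            msum[k] += pois * U[k]
    return [math.factorial(k) * d ** k * msum[k] for k in range(n + 1)]

The returned list contains m^{(0)}(t) = h as a consistency check and m^{(k)}(t) for k = 1, . . . , n; the memory use is dominated by A and the 2(n + 1) vectors, in line with the discussion above.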

The same approach can be applied for the analysis of the completion time when all the reward rates are positive, i.e., R^{-1} exists. Let

B = \frac{R^{-1} \cdot Q}{z} + I, \qquad T = \frac{R^{-1}}{z f}     (59)

where z = \max_{i,j \in S}(|q_{ij}/r_i|) and f = \max_{i \in S}(1/r_i)/z. By this definition B is a stochastic matrix (0 ≤ b_{i,j} ≤ 1, ∀i, j ∈ S and \sum_{j \in S} b_{i,j} = 1, ∀i ∈ S) and T is a diagonal matrix such that 0 ≤ t_{i,i} ≤ 1, ∀i ∈ S; f is a dimensionless number.

C^{\sim}(s, w) = e^{(R^{-1}Q - sR^{-1})w} \cdot h = e^{(B - sfT)zw} \cdot h\, e^{-zw}.     (60)

Theorem 7 The moments of the completion time can be computed using only matrix-vector multiplications, saving only vectors of size #S, as follows:

s^{(n)}(w) = n!\, f^n \sum_{i=0}^{\infty} V^{(n)}(i)\, \frac{(zw)^i}{i!}\, e^{-zw}     (61)

where

V^{(n)}(i) = \begin{cases}
0, & \text{if } i = 0,\ n \ge 1, \\
h, & \text{if } i \ge 0,\ n = 0, \\
B \cdot V^{(n)}(i-1) + T \cdot V^{(n-1)}(i-1), & \text{if } i \ge 1,\ n \ge 1.
\end{cases}     (62)

Proof of Theorem 7 Theorem 7 follows from (60). ✷

Theorem 8 The n-th moment of the completion time can be calculated as a finite sum plus an error term, where the maximum allowed error is ε:

s^{(n)}(w) = n!\, f^n \sum_{i=0}^{G-1} V^{(n)}(i)\, \frac{(zw)^i}{i!}\, e^{-zw} + \xi(G)     (63)

where

G = \min_{g \in \mathbb{N}} \left\{ g : (zw)\, n!\, f^n \sum_{i=g-1}^{\infty} \frac{(zw)^i}{i!}\, e^{-zw} \le \varepsilon \right\}     (64)

and

0 \le \xi(G) \le h\, \varepsilon.     (65)

Proof of Theorem 8 The proof of Theorem 8 follows the same pattern as the proof of Theorem 6. ✷
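Under the same strictly positive reward assumption, Theorems 7 and 8 can be implemented by reusing the accumulated reward routine on the dual model Q′ = R^{-1}Q, R′ = R^{-1} of Section 2, because (59)–(62) are exactly (50)–(53) written for that model. A hedged sketch (assuming the accumulated_reward_moments function sketched above):

# Sketch of Theorems 7 and 8 via the dual MRM: moments of the completion time C(w).
import numpy as np
import scipy.sparse as sp

def completion_time_moments(Q, r, w, n, eps=1e-10):
    r = np.asarray(r, dtype=float)                 # requires r_i > 0 for every state
    Rinv = sp.diags(1.0 / r)
    Q_dual = sp.csr_matrix(Rinv @ sp.csr_matrix(Q))   # Q' = R^{-1} Q
    return accumulated_reward_moments(Q_dual, 1.0 / r, w, n, eps=eps)

Because the row sums of a generator vanish, the maximal entry of Q′ in absolute value lies on its diagonal; hence the randomization rate computed inside the routine equals z of (59) and the scaling factor equals f, so the returned vectors are the s^{(k)}(w) of (61).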

The numerical analysis of the completion time of large models is more complicated when states with zero reward rate are present in the system. A numerical procedure similar to the one in Theorem 8 can be obtained for this case as well, but on the one hand it is very complicated, and on the other hand its applicability is strongly limited by the cardinality of S0, since the matrix Q_4, of cardinality #S0, has to be inverted. In general, inverting a matrix of cardinality 10^4 has higher computational complexity and memory requirement than the proposed numerical method applied to a model with 10^6 states.

6 Numerical examples

Example 1

Consider a CTMC with n = 1,000,000 states.

Mean value        t = 0.02s     t = 0.1s     t = 0.2s     t = 1s       t = 2s
Z(0) = 750,000    8.06·10^-12   9.81·10^-8   5.11·10^-6   0.022        0.33
Z(0) = 790,000    0.00047       0.010        0.037        0.58         1.54
Z(0) = 800,000    0.019         0.093        0.18         0.94         1.94

Table 2.

Variance          t = 0.02s     t = 0.1s     t = 0.2s     t = 1s       t = 2s
Z(0) = 750,000    4.61·10^-14   2.73·10^-9   5.03·10^-7   7.73·10^-3   0.17
Z(0) = 790,000    6.07·10^-6    5.85·10^-4   3.62·10^-3   0.096        0.16
Z(0) = 800,000    5.79·10^-6    4.13·10^-4   1.92·10^-3   0.018        0.022

Table 3.

Let the non-zero state transition rates be the following:

q_{ij} = \begin{cases}
5, & \text{if } j = i + 1, \\
2.5, & \text{if } j = i + 10{,}000, \\
2.5, & \text{if } j = i - 1.
\end{cases}     (66)

The reward rate matrix R has the following structure:

r_{i,i} = \begin{cases}
0, & \text{if } i < 800{,}000, \\
1, & \text{if } i \ge 800{,}000.
\end{cases}     (67)

Figure 3 shows the structure of the underlying CTMC, where u = 10,000.

[Figure: a chain over states 1, 2, 3, . . . , u+1, u+2, . . . , n with rate 5 to state i+1, rate 2.5 to state i−1 and rate 2.5 to state i+u.]

Fig. 3. The underlying CTMC of Example 1.

Tables 2 and 3 contain the mean and the variance of the accumulated reward for different initial states. The accumulated reward represents the time the system spends in states 800,000, . . . , 1,000,000.
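For illustration, the model of Example 1 might be assembled as follows (a sketch only: the boundary behaviour of the chain near states 1 and n is not fully specified above, so dropping out-of-range transitions is an assumption, as are the function names).

# Sketch: sparse generator and reward vector for Example 1
# (states are 1-based in the text, 0-based here).
import numpy as np
import scipy.sparse as sp

def example1_model(n=1_000_000, u=10_000):
    rows, cols, vals = [], [], []
    def add(i, j, rate):
        if 0 <= j < n:                      # assumption: transitions leaving 1..n are dropped
            rows.append(i); cols.append(j); vals.append(rate)
    for i in range(n):
        add(i, i + 1, 5.0)
        add(i, i + u, 2.5)
        add(i, i - 1, 2.5)
    Q = sp.csr_matrix((vals, (rows, cols)), shape=(n, n))
    Q = Q - sp.diags(np.asarray(Q.sum(axis=1)).ravel())   # diagonal so that rows sum to zero
    r = np.where(np.arange(n) >= 800_000 - 1, 1.0, 0.0)   # reward 1 from state 800,000 upward
    return Q, r

# Usage with the sketch of Section 5 (hypothetical names):
# Q, r = example1_model()
# m = accumulated_reward_moments(Q, r, t=1.0, n=2)
# mean, second = m[1], m[2]                  # per initial state
# variance = second - mean ** 2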

Example 2

In the second example, the performance parameters of a Carnegie-Mellon multiprocessor system are evaluated by the proposed method. The system is similar to the one presented in [18]. The system consists of N processors, M

memories, and an interconnection network (composed of switches) that allows any processor to access any memory (Figure 4). The failure rates per hour for the system are set to be 0.1, 0.05, 0.01 and 0.003 for the processors, memories, switches, and general failure, respectively.

Viewing the interconnecting network as S switches and modeling the system at the processor-memory-switch level, the system performance depends on the minimum of the number of operating processors, memories, and switches. Each state is thus specified by a triple (i, j, k) indicating the number of operating processors, memories, and switches, respectively. We augment the states with the nonoperational state F. Events that decrease the number of operational units are the failures, and events that increase the number of operational elements are the repairs. We assume that failures do not occur when the system is not operational. When a component fails, a recovery action must be taken (e.g., shutting down a failed processor, etc.), or the whole system will fail and enter state F.

[Figure: processors 1 . . . N and memories 1 . . . M connected through an interconnection network.]

Fig. 4. Example system structure

Two kinds of repair actions are possible: global repair, which restores the system to state (N, M, S) with rate µ = 0.01 per hour from state F, and local repair, which can be thought of as a repair person beginning to fix a component of the system as soon as a component failure occurs. We assume that there is only one repair person for each component type. Let the local repair rates be 2.0, 2.0 and 0.1 for the processors, memories and the switches, respectively.

The system starts from the perfect state (N, M, S). The studied system has 32 processors, 64 memories, and 16 switches; thus the state space consists of 36,466 states (247,634 transitions). The performance of the system is proportional to the number of cooperating processors and memories, whose cooperation is provided by one switch. The reward rate is defined as the minimum of the number of operational processors, memories, and switches. The minimal operational configuration is supposed to have one processor, one memory and one interconnection switch.
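A small sketch (illustration only; whether states with a zero component are kept separate from F in the actual model is left open above, so the reward-zero treatment below is an assumption) of the state space and reward assignment just described:

# Sketch: state space (i, j, k) plus F, and reward rate min(i, j, k), for Example 2.
n_proc, n_mem, n_sw = 32, 64, 16
states = [(i, j, k)
          for i in range(n_proc + 1)
          for j in range(n_mem + 1)
          for k in range(n_sw + 1)]
states.append("F")                                 # the non-operational state
assert len(states) == 36_466                        # matches the state-space size quoted above
reward = {s: (min(s) if s != "F" else 0) for s in states}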

t     E(B(t))   E(B(t)^2)   E(B(t)^3)   E(B(t)^4)    E(B(t)^5)    E(B(t)^6)
1     15.89     253.0       4030        6.41·10^4    1.02·10^6    1.63·10^7
2     31.60     1001        3.14·10^4   1.00·10^6    3.19·10^7    1.01·10^9
5     77.70     6072        4.75·10^5   3.72·10^7    2.92·10^9    2.30·10^11
10    151.5     2.32·10^4   3.57·10^6   5.51·10^8    8.52·10^10   1.31·10^13
20    289.5     8.57·10^4   2.55·10^7   7.67·10^9    2.30·10^12   6.96·10^14
50    648.0     4.42·10^5   3.08·10^8   2.16·10^11   1.53·10^14   1.09·10^17

Table 4.

t     E(B(t))   E(B(t)^2)   E(B(t)^3)   E(B(t)^4)    E(B(t)^5)    E(B(t)^6)
1     15.89     253.0       4030        6.42·10^4    1.02·10^6    1.63·10^7
2     31.60     1001        3.14·10^4   1.00·10^6    3.19·10^7    1.01·10^9
5     77.70     6073        4.75·10^5   3.72·10^7    2.92·10^9    2.30·10^11
10    151.6     2.32·10^4   3.57·10^6   5.51·10^8    8.52·10^10   1.31·10^13
20    290.1     8.59·10^4   2.56·10^7   7.68·10^9    2.31·10^12   6.97·10^14
50    655.6     4.48·10^5   3.11·10^8   2.19·10^11   1.55·10^14   1.10·10^17

Table 5.

The first 6 moments of the accumulated reward were calculated using Theorem 5 in two different cases. In the first case global repair was not possible, hence F was an absorbing state of the system. In the second case global repair was allowed at rate 0.01. Tables 4 and 5 contain the results obtained at time t = 1, 2, 5, 10, 20, 50 for the cases without and with global repair, respectively.

The mean and the variance of the accumulated reward of the two cases are compared in Figures 5 and 6, respectively. The dashed lines refer to the case when global repair is not possible. As expected, the mean accumulated reward of the case without global repair is lower. The variance curves are misleading at first sight: the second moment of the case without global repair is still lower, but the variance depends on the difference of the first two moments, and that is why the variance of the case without global repair is higher.
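As an illustrative check using the t = 50 rows of Tables 4 and 5: Var{B(50)} = E{B(50)^2} − (E{B(50)})^2 ≈ 4.42·10^5 − 648.0^2 ≈ 2.2·10^4 without global repair, and ≈ 4.48·10^5 − 655.6^2 ≈ 1.8·10^4 with global repair, so the second moment is indeed smaller without global repair while the variance is larger, in agreement with the discussion above.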

7 Conclusion

An iterative numerical method is introduced which can evaluate the moments of the accumulated reward and the completion time of MRMs with large state spaces (∼ 10^6 states).

[Figure: E[B(t)] as a function of time, t = 0 . . . 50, for the cases with and without global repair.]

Fig. 5. Mean accumulated reward

[Figure: Variance of B(t) as a function of time, t = 0 . . . 50, for the cases with and without global repair.]

Fig. 6. Variance of the accumulated reward

The proposed methods make use of the randomization technique, hence they are numerically stable and allow the implementation of a global error bound.

A possible future extension of the proposed method is automatic steady state detection. The computational complexity increases linearly with the time (in case of accumulated reward analysis) or with the work requirement w (in case of completion time analysis), but after the underlying CTMC has reached its steady state the reward measures can be computed in a simpler way.

References

[1] M.D. Beaudry. Performance-related reliability measures for computing systems. IEEE Transactions on Computers, C-27:540–547, 1978.

[2] S. Blaabjerg, G. Fodor, A.T. Andersen, and M. Telek. A partially blocking queueing system with CBR/VBR and ABR/UBR arrival streams. In 5th Int. Conf. on Telecommunication Systems, pages 411–424, Nashville, TN, USA, March 1997.

[3] A. Bobbio. The effect of an imperfect coverage on the optimum degree of redundancy of a degradable multiprocessor system. In Proceedings RELIABILITY'87, Paper 5B/3, Birmingham, 1987.

[4] A. Bobbio and K.S. Trivedi. Computation of the distribution of the completion time when the work requirement is a PH random variable. Stochastic Models, 6:133–149, 1990.

[5] L. Donatiello and V. Grassi. On evaluating the cumulative performance distribution of fault-tolerant computer systems. IEEE Transactions on Computers, 1991.

[6] E. De Souza e Silva and H.R. Gail. Calculating availability and performability measures of repairable computer systems using randomization. Journal of the ACM, 36:171–193, 1989.

[7] A. Goyal and A.N. Tantawi. Evaluation of performability for degradable computer systems. IEEE Transactions on Computers, C-36:738–744, 1987.

[8] V. Grassi, L. Donatiello, and G. Iazeolla. Performability evaluation of multicomponent fault-tolerant systems. IEEE Transactions on Reliability, R-37:216–222, 1988.

[9] R.A. Howard. Dynamic Probabilistic Systems, Volume II: Semi-Markov and Decision Processes. John Wiley and Sons, New York, 1971.

[10] B.R. Iyer, L. Donatiello, and P. Heidelberger. Analysis of performability for stochastic models of fault-tolerant systems. IEEE Transactions on Computers, C-35:902–907, 1986.

[11] V.G. Kulkarni, V.F. Nicola, and K. Trivedi. On modeling the performance and reliability of multi-mode computer systems. The Journal of Systems and Software, 6:175–183, 1986.

[12] Y. Levy and P.E. Wirth. A unifying approach to performance and reliability objectives. In Proceedings 12th International Teletraffic Congress, ITC-12, pages 4.2B2.1–4.2B2.7, Torino, 1988.

[13] R.A. McLean and M.F. Neuts. The integral of a step function defined on a Semi-Markov process. SIAM Journal on Applied Mathematics, 15:726–737, 1967.

[14] J.F. Meyer. Closed form solution of performability. IEEE Transactions on Computers, C-31:648–657, 1982.

[15] H. Nabli and B. Sericola. Performability analysis: A new algorithm. IEEE Transactions on Computers, C-45(4):491–494, 1996.

[16] V.F. Nicola, V.G. Kulkarni, and K. Trivedi. Queueing analysis of fault-tolerant computer systems. IEEE Transactions on Software Engineering, SE-13:363–375, 1987.

[17] A. Reibman, R. Smith, and K.S. Trivedi. Markov and Markov reward model transient analysis: an overview of numerical approaches. European Journal of Operational Research, 40:257–267, 1989.

[18] R. Smith, K. Trivedi, and A.V. Ramesh. Performability analysis: Measures, an algorithm and a case study. IEEE Transactions on Computers, C-37:406–417, 1988.

[19] W.J. Stewart. Introduction to the Numerical Solution of Markov Chains. Princeton University Press, Princeton, New Jersey, ISBN 0-691-03699-3, 1994.

[20] U. Sumita, J.G. Shanthikumar, and Y. Masuda. Analysis of fault tolerant computer systems. Microelectronics and Reliability, 27:65–78, 1987.

[21] M. Telek, A. Pfenning, and G. Fodor. An effective numerical method to compute the moments of the completion time of Markov reward models. Computers and Mathematics with Applications, 36(8):59–65, 1998.

[22] G.L. Choudhury, D.M. Lucantoni, and W. Whitt. Multi-dimensional transform inversion with applications to the transient M/G/1 queue. Annals of Applied Probability, 4:719–740, 1994.

A Implementation of the numerical method

A formal description of the program that calculates the moments of the accumulated reward according to Theorems 5 and 6 is provided. The memory requirement and the number of required operations are calculated in advance.

Input:
  M    cardinality of the state space
  Q    generator matrix of the underlying CTMC
  R    diagonal matrix of the reward rates
  P    initial probability vector
  t    time of accumulation
  n    order of the moment
  G    number of iterations
  z    number of non-zero elements in Q

Output:
  m    the n-th moment of the accumulated reward
  mem  memory requirement
  mul  required floating point multiplications
  add  required floating point additions

1  mem_A = z · Size(double)                     (storing the elements of A)
   mem_A = mem_A + (z + M) · Size(int)
   mem_S = M · Size(double)                     (storing S)
   mem_P = M · Size(double)                     (storing P)
   mem_N = M · (n + 1) · Size(double)           (temporary vectors)
   mem = mem_A + mem_S + mem_P + mem_N

2  add = o(G · (2 · n · z + (n + 1) · M))       (numerical complexity)
   mul = o(G · (2 · n · z + M))

3  U(0) := h;  U(j) := 0, j = 1 . . . n;  m := 0   (compute the n-th moment)
   For i := 1 To G Do
   Begin
     For j := n DownTo 1 Do
       U(j) := S · U(j−1) + A · U(j);
     m := m + U(n) · Poisson(i; qt);
   End;
   m := m · n! · d^n

where Poisson(i; qt) = (qt)^i e^{−qt} / i! denotes the Poisson weight of (52).
