14

A finite characterization of weak lumpable Markov processes. Part I: The discrete time case

  • Upload
    inria

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

A Finite Characterization of

Weak Lumpable Markov Processes�

Part I� The Discrete Time Case�

Gerardo Rubino and Bruno Sericola

IRISA � INRIA

Campus de Beaulieu� ����� Rennes Cedex� FRANCE

Abstract

We consider an irreducible and homogeneous Markov chain �discrete time� with �nite

state space� Given a partition of the state space� it is of interest to know if the aggregatedprocess constructed from the �rst one with respect to the partition is also Markov homo�

geneous� We give a characterization of this situation by means of a �nite algorithm� This

algorithm computes the set of all initial probability distributions of the starting homogeneous

Markov chain leading to an aggregated homogeneous Markov chain�

Markov Chains� Aggregation� Weak Lumpability

� Introduction

This paper is the natural continuation of the work performed in G� Rubino and B� Sericola �������

To remain compatible with this previous paper� we conserve the same notation� Let us �rst recall

the problem of weak lumping in Markov chains�

Let X � �Xn�n�� be a homogeneous irreducible Markov chain evolving in discrete time� The

state space is assumed to be �nite and denoted by E � f�� � ���� Ng� The stationary distribution

of X is denoted by �� Let us denote by B � fB���� B��� ���� B�M�g a partition of the state space

and by n�m� the cardinal of B�m�� We assume the states of E ordered such that

B��� � f�� � � � � n���g����

B�m� � fn��� � � � n�m� �� �� � � � � n��� � � � n�m�g����

B�M� � fn��� � � � n�M � �� �� � � � � Ng�

With the given process X we associate the aggregated stochastic process Y with values in the set

F � f�� � ����Mg� de�ned by

Yn � mdef�� Xn � B�m�� for all n � �

It is easily checked from this de�nition and the irreducibility of X that the process Y obtained is

also irreducible in the following sense� for all m � F � for all l � F such that IP�Y� � l� � �� there

exists n � � such that IP�Yn � m�Y� � l� � ��

The Markov property of X means that given the state in which X is at the present time�

the future and the past of the process are independent� Clearly� this is no longer true for the

aggregated process Y in the general case �see the example below�� This paper deals with the

conditions under which the process Y is also a homogeneous Markov chain�

The homogeneous Markov chain X is given by its transition probability matrix P and its

initial distribution � we shall denote it by ��� P � when necessary� The �i� j� entry of matrix P

is denoted by P �i� j�� We shall denote by agg��� P�B� the aggregated process constructed from

��� P � over the partition B� Let us denote by A the set of all probability vectors with N entries�

Of course� it is possible to have the situation in which agg��� P�B� is a Markov homogeneous

chain for any � � A� This happens� for instance� in the following example where E � f�� � �g�

B��� � f�g� B�� � f� �g and

P �

�B� �� p p �q � �� qq �� q �

�CAwith � � p � � and � � q � �� In this case� X is said to be strongly lumpable with repect to the

partition B and this is a well known property with a simple characterization� For every i � E and

m � F � let us denote by P �i� B�m�� the transition probability of passing in one step from state

i to the subset B�m� of E� that is P �i� B�m��def�P

j�B�m� P �i� j�� Then� X is strongly lumpable

with respect to the partition B if and only if for every pair of elements l� m � F � the probability

P �i� B�m�� has the same value for every i � B�l�� This common value is the transition probability

from state l to state m for the aggregated homogeneous Markov chain Y �

The opposite situation in which for any initial distribution �� the process agg��� P�B� is

not Markov is illustrated by the following example� Consider again a three state chain with

B��� � f�g� B�� � f� �g and

P �

�B� �� p p �� � �� � �

�CA �

For any value of p with � � p � � and for any initial probability distribution� we have

IP�Xn�� � B��� j Xn � B��� Xn�� � B���� � �

and IP�Xn�� � B��� j Xn � B��� Xn�� � B��� � ��

In ��� the authors show that a third situation is possible� in which there exists a proper subset of A

denoted here by AM such that for any � � AM� the process agg��� P�B� is Markov homogeneous

while for any � �� AM the aggregated process is not Markov� In this more general situation� X

is said to be weakly lumpable with respect to the given partition B �see �� for some examples��

Moreover� in the same reference a su�cient condition to weak lumpability is given�

In ��� the authors analyse the problem of the characterization of weak lumpability� An inter�

esting technique is introduced but we have shown in G� Rubino and B� Sericola ������ that their

characterization is wrong�

The work performed here is an extension of G� Rubino and B� Sericola ������� We obtain

a �nite characterization of weak lumpability by means of an algorithm which computes the set

AM� In G� Rubino and B� Sericola ������� AM is given as an in�nite intersection of sets� The

main result obtained in this paper is the reduction of this in�nite intersection to a �nite one�

The paper is organized as follows� The next section gives the background material and notation

and recalls the main results obtained in G� Rubino and B� Sericola ������� In Section �� we show

how the searched set AM can be obtained by means of a �nite algorithm� Section � presents an

example in which the set AM is computed and Section � concludes the paper�

� Notation and main results of G� Rubino and B� Seri�

cola ������

By convention� the vectors will be row vectors� Column vectors will be indicated by means of the

transposition operator � � �T � A vector with all its entries equal to � will be denoted simply by �

its dimension will be de�ned by the context�

For each l � F and � � A we denote by Tl�� the �restriction� of � to the subset B�l�� That is�

Tl�� is a vector with n�l� entries and its ith entry is �Tl����i� � � �n��� � � � n�l � �� i� � i �

�� � � � � � n�l�� When Tl�� �� �� we denote by �B�l� the vector ofA de�ned by �B�l��i� � ��i��kTl��k

if i belongs to B�l�� � otherwise� where k�k denotes the sum of the components of the non negative

real vector �� Each time when in the sequel we shall write a vector of the form �B�m� with � � A�

we implicitely mean that this vector is de�ned� i�e� Tm�� �� ��

To clarify this notation� let us give some examples� Suppose that N � � and B � fB���� B��g

with B��� � f�� � �g and B�� � f�� �g� Then�

for � � ������ ����� ���� ��� ���� �

�B��� � ����� ���� ��� �� ��� �B��� � ��� �� �� ��� �����

T��� � ������ ����� ����� T��� � ���� �����

T���B��� � ����� ���� ���� T���

B��� � ���� ����

for � � ���� ��� �� �� �� �

�B��� � ���� ��� �� �� �� and �B��� is not de�ned since T��� � ��� ���

Recall the de�nition of the main object of this analysis�

AMdef� f� � A�Y � agg��� P�B� is a homogeneous Markov chaing

For l � F � we shall denote by ePl the n�l��M matrix de�ned by

ePl�j�m�def� P � n��� � � � n�l � �� j� B�m� � � � � m � n�l�� m � F�

In other words� ePl�j�m� is the probability of passing in one step from the jth state in B�l� to

the set B�m��

When the set AM is not empty� the transition probability matrix of the Markov chain Y �

denoted by bP � is the same for every � � AM� In this case� its lth row bPl can be computed by

bPl � �Tl��B�l�� ePl �see G� Rubino and B� Sericola ��������

To illustrate these de�nitions� consider the following matrix

P �

�BBB���� ��� ��� ����� ��� ��� ������ ��� ��� ������ ��� ��� ���

�CCCAwith B � fB���� B��g� B��� � f����g and B�� � f�g�

We have

eP� �

�B� �� ����� ������ ���

�CA � eP� ����� ���

��

� � ������ ����� ����� �����

and bP �

��� ������ ���

��

Let � be an element of A� k a positive integer and i�� i�� � � � � ik a sequence of k elements of F �

We de�ne recursively the vector f���B�i��� � � � � B�ik�� � A by

f���B�i��� � �B�i��

f���B�i��� � � � � B�ih�� � �f���B�i��� � � � � B�ih���P �B�ih� � � h � k�

whenever these operations make sense� The reader can check the following probabilistic interpre�

tation of f � If � � f���B�i��� � � � � B�in�� then

��j� � IP�Xn � j�Xn � B�in�� � � � � X� � B�i����

That is� � is the probability distribution of Xn given �X� � B�i��� � � � � Xn � B�in���

Using this de�nition� we de�ne the basic sequence of sets

A� def� f � � A � �Tl��

B�l�� ePl � bPl for every l � F g

and for j � �

Aj def� f� � A � � � f��� B�i��� � � � � B�ik�� with k � j� we have � � A�g�

As an example� the reader can check that for the previous ��dimensional transition probability

matrix� the set A� can be written as

A� � f� � A�� � ��

�� t�

�� t� ��� ���� �� �� ��� � � t �

�� � � � �g�

We have then the basic result �proved in G� Rubino and B� Sericola �������

AM ��j��

Aj�

In the same paper� some properties of AM are given� Let us say that a subset U of A is stable

by right product by P i� for all x � U the vector xP � U � The principal properties of the set AM

that we need here are

��� � � AM �� �B�l� � AM� l � F

��� � � AM �� n � �� �n

nXk��

�P k � AM

��� AM �� �� � � AM

��� AM �� �� AM is a convex closed set

��� A� is stable by right product by P �� AM � A�

��� j � �� Aj�� � Aj

��� If �j � � such that Aj�� � Aj then AM � Aj�k� k � �

�� AM ��j��

Aj

See G� Rubino and B� Sericola ������ for the proofs of these properties� In the next section� we

show that the sequence of sets �Aj�j�� is a stationary sequence whose limit is equal to AN � That

is� AM � AN � where N denotes the cardinal of the state space E of the process X� Moreover� a

�nite algorithm to compute this set is given�

� An algorithm to compute AM

To obtain the announced result� we need some preliminary lemmas� The �rst one gives a way to

compute the set Aj�� as a function of Aj�

Lemma �� j � ��

Aj�� � f� � Aj��B�l�P � Aj for every l � Fg�

Proof The proof is a direct consequence of the de�nition of f � It only uses the property

f���B�i��� B�i��� � � � � B�ik�� � f��B�i��P�B�i��� � � � � B�ik��

which is immediate� �

For every l � F � we denote by Pl the n�l��N sub�matrix of P corresponding to the transition

probabilities of X from the states of B�l� to the states of E� Let us de�ne for every l � F � the

n�l� �M matrix Hl as Hl � ePl � �T bPl and recall that a polytope of IRN is a bounded convex

polyhedral subset of IRN �����

Lemma �� j � �� Aj is a polytope of IRN �

Proof Verify �rst that A� can be written as

A� � f� � A��H � �g

where H is the N �M� block diagonal matrix whose lth block is Hl� When necessary� we shall

also write H � Diag�Hl�� So� A� is a polytope �see ����

Now� for every j � �� we de�ne the N �M j�� matrix H �j� as

H ��� � H�

H �j��� � Diag�PlH�j�� for j � ��

The previous lemma allows us to write

Aj�� � f� � Aj��H �j��� � �g�

The result follows by de�nition �see ���� �

For � and � � IRN � we denote by D��� �� the line containing the two points � and �� That

is D��� �� � f� ��� ��� � IRg� Recall that the dimension of a convex set C of IRN is the

dimension of the smallest a�n subset containing C we shall denote it by dim�C��

Lemma �� j � � such that Aj�� �� � we have

hdim�Aj��� � dim�Aj� ��Aj�� � Aj

i�

Proof We have seen that for all j � �� Aj�� � Aj� If the dimension is equal to �� the result

follows immediately� Suppose now that the common dimension is at least �� Let � and � be two

points belonging to Aj�� and let � be a point of D��� �� Aj� Such a point exists since the two

sets have the same dimension� Now� there exists � IR such that � � � �� � ��� That is�

using the previous lemma� �H �j��� � �H �j��� ��� ��H �j��� � �� This proves that � � Aj��

and then that Aj�� � Aj� �

We are now ready to prove the main result of this paper� Recall that N denotes the cardinal

of the state space of the starting homogeneous Markov chain X�

Theorem ��

AM � AN

Proof Consider the sequence A�� � � � � AN � From property ��� of the previous section� this

sequence is decreasing� If two consecutive elements of A�� � � � � AN are equal� property ��� of the

previous section leads to AM � AN � If all its elements are di�erent� the previous lemma concludes

that the sequence of their dimensions is strictly decreasing and so� since dim�A� � N � �� we

have dim�Aj� � N � j for j � �� � � � � � N � �� and then AN � � Property �� of the previous

section leads to the result� �

Before giving the algorithm which computes the set AM� we need another interesting property

given in the following lemma and which generalizes property ��� of the previous section�

Lemma �� Aj is stable by right product by P �� AM � Aj

Proof AM being stable by right product by P �property ��� of the previous section�� the

implication from the right to the left is immediate� Conversely� suppose that Aj is stable by right

product by P � Verify then that

j � ��h� � Aj �� �B�l� � Aj for every l � F

i�

Then Lemma ��� leads to Aj�� � Aj and property ��� implies AM � Aj� �

To determine the set AM� we can proceed as follows� Assume the chain X not strongly

lumpable with respect to the partition B� The �rst step consists of calculating the set A� and

determining if it is stable by right product by P � If A� is stable by right product by P � Lemma ���

says that AM � A�� If A� is not stable by right product by P � we �rst verify whether for all l � F

the vector �B�l�P is in A�� If there exists l � F such that �B�l�P �� A�� then � �� A� and AM �

�consequence of properties ��� and ��� of the previous section�� If � � A� then we calculate this

set and we repeat the previous operations�

Denote by � the following procedure� where P�A� is the set of all subsets of A�

� � P�A� � P�A�

U ��n� � U��B�l�P � U for every l � F

oWith this notation� Aj�� � ��Aj� �Lemma ����� The algorithm has then the following form�

if X is strongly lumpable then AM � A stop

else U �� A�

loop

if U is stable by P break byfound endif

if �l � F � �B�l�P �� U break byempty endif

U �� ��U�

endloop

exits

��

exit found � AM �� U

exit empty � AM ��

endexits

endif

Remark that all the sets considered in this algorithm are polytopes and so they can be determined

in a unique way by giving their vertices�

Example

To illustrate our algorithm� let us take as an example the process X � � � � P � on the state space

E � f�� � �� �� �� �g with

P �

�BBBBBBBB�

���� ���� ���� ���� ���� ������� ��� ��� ��� ��� ������ ��� ��� ��� ��� ������ ��� ��� ��� ��� ������ ��� ��� ��� ���� ������� ��� ��� ��� ���� ����

�CCCCCCCCAand consider the partition B � fB���� B��g where B��� � f�� � �� �g and B�� � f�� �g� We

have N � �� M � � n��� � �� n�� � � We see that X is not strongly lumpable with respect to

the partition B� The stationary distribution � of X is

� ������� ����� ����� ����� ����� �����

and we have

�B��� ����� ��� ��� ��� � �

�� �B��� �

�� � � � �� ��

��

��

The matrices eP� and eP� are given by

eP� �

�BBB���� ������ ������� ������� ���

�CCCA � eP� �

���� ������ ���

��

This leads to

bP� ������ ���

�� bP� �

���� ���

��

The polytope A� can be determined by the matrix H ��� �see the proof of Lemma ��� given by

H ��� �

�BBBBBBBB�

���� ��� � ����� ��� � ���� ���� � ���� ���� � �� � � �� � � �

�CCCCCCCCA�

Therefore� the vertices of the polytope A� are

�� � � ��� � � �� � � � ��� � � ��� � �� � � � � �� � � � ��� � ��� � � � �� � � � ��� ��� � � � � ��� � � � � � � � � � ��� � � � � � � � � � �

The set A� is not stable by right product by P since ��P �� A� ���PH��� �� ��� Furthermore�

�B���P � A� and �B���P � A� since �B���PH ��� � � and �B���PH ��� � �� This means that the set

A� �� �� � A��� The polytope A� can be determined by the matrix H ��� given by

H ��� �

�BBBBBBBBBBBBB�

� � � � � � � ������ ������ � � � � � ������� ����� � � � � � �

� � � � � � � �� � � � � � � �� � � � � � � �� � � � � � � �� � � � � � � �

�CCCCCCCCCCCCCA�

The vertices of the polytope A� are ��� ��� �� and � where

� ��� ��� ��� ��� � �

��

The set A� is not stable by right product by P since ��P �� A� ���PH��� �� ��� Furthermore�

�B���P � A� and �B���P � A� since �B���PH ��� � � and �B���PH ��� � �� This means that A ��

�� � A�� The polytope A can be determined by the matrix H �� given by

H �� �

�BBBBBBBB�

� � � � � � � � � � � � � � � �� � � � � � � � � � � � � � � �

������ ������� � � � � � � � � � � � � � �������� ������ � � � � � � � � � � � � � �

� � � � � � � � � � � � � � � �� � � � � � � � � � � � � � � �

�CCCCCCCCA�

The vertices of the polytope A are ��� �� and �B���� Finally� the polytope A is stable by

right product by P since ��PH�� � �� ��PH

�� � � and �B���PH �� � �� So� we have AM � A�

that is

AM ����� �� �� �� �� �� ��� �� �� �� �� �� ��

���

���

���

�� �� ��� � � � � ��� ��� � � �

Conclusions

In this paper� we have analyzed the set of all initial probability distributions of an irreducible

and homogeneous Markov chain which lead to a homogeneous aggregated Markov chain given the

transition probability matrix and a partition of the state space� We have obtained a constructive

characterization of this set by means of a �nite algorithm� In a subsequent paper we analyse the

continuous time case� Basically� it is shown that it is always possible to come down to the discrete

time case using the uniformization technique� The case of homogeneous Markov processes with

absorbing states seems to be the �rst possible direction to extend these results�

��

References

�� A�M� Abdel�Moneim and F�W� Leysie�er� Weak lumpability in �nite Markov chains� J� Appl�

Prob� �� ����� ��������

�� J�G� Kemeny and J�L Snell� Finite Markov chains� Springer�Verlag� New York Heidelberg

Berlin� �����

�� R� T� Rockafellar� Convex Analysis� Princeton University Press� Princeton� New�Jersey� �����

�� G� Rubino and B� Sericola� On weak lumpability in Markov chains� J� Appl� Prob� � ������

��������

��