Upload
inria
View
0
Download
0
Embed Size (px)
Citation preview
A Finite Characterization of
Weak Lumpable Markov Processes�
Part I� The Discrete Time Case�
Gerardo Rubino and Bruno Sericola
IRISA � INRIA
Campus de Beaulieu� ����� Rennes Cedex� FRANCE
Abstract
We consider an irreducible and homogeneous Markov chain �discrete time� with �nite
state space� Given a partition of the state space� it is of interest to know if the aggregatedprocess constructed from the �rst one with respect to the partition is also Markov homo�
geneous� We give a characterization of this situation by means of a �nite algorithm� This
algorithm computes the set of all initial probability distributions of the starting homogeneous
Markov chain leading to an aggregated homogeneous Markov chain�
Markov Chains� Aggregation� Weak Lumpability
� Introduction
This paper is the natural continuation of the work performed in G� Rubino and B� Sericola �������
To remain compatible with this previous paper� we conserve the same notation� Let us �rst recall
the problem of weak lumping in Markov chains�
Let X � �Xn�n�� be a homogeneous irreducible Markov chain evolving in discrete time� The
state space is assumed to be �nite and denoted by E � f�� � ���� Ng� The stationary distribution
of X is denoted by �� Let us denote by B � fB���� B��� ���� B�M�g a partition of the state space
and by n�m� the cardinal of B�m�� We assume the states of E ordered such that
B��� � f�� � � � � n���g����
B�m� � fn��� � � � n�m� �� �� � � � � n��� � � � n�m�g����
B�M� � fn��� � � � n�M � �� �� � � � � Ng�
�
With the given process X we associate the aggregated stochastic process Y with values in the set
F � f�� � ����Mg� de�ned by
Yn � mdef�� Xn � B�m�� for all n � �
It is easily checked from this de�nition and the irreducibility of X that the process Y obtained is
also irreducible in the following sense� for all m � F � for all l � F such that IP�Y� � l� � �� there
exists n � � such that IP�Yn � m�Y� � l� � ��
The Markov property of X means that given the state in which X is at the present time�
the future and the past of the process are independent� Clearly� this is no longer true for the
aggregated process Y in the general case �see the example below�� This paper deals with the
conditions under which the process Y is also a homogeneous Markov chain�
The homogeneous Markov chain X is given by its transition probability matrix P and its
initial distribution � we shall denote it by ��� P � when necessary� The �i� j� entry of matrix P
is denoted by P �i� j�� We shall denote by agg��� P�B� the aggregated process constructed from
��� P � over the partition B� Let us denote by A the set of all probability vectors with N entries�
Of course� it is possible to have the situation in which agg��� P�B� is a Markov homogeneous
chain for any � � A� This happens� for instance� in the following example where E � f�� � �g�
B��� � f�g� B�� � f� �g and
P �
�B� �� p p �q � �� qq �� q �
�CAwith � � p � � and � � q � �� In this case� X is said to be strongly lumpable with repect to the
partition B and this is a well known property with a simple characterization� For every i � E and
m � F � let us denote by P �i� B�m�� the transition probability of passing in one step from state
i to the subset B�m� of E� that is P �i� B�m��def�P
j�B�m� P �i� j�� Then� X is strongly lumpable
with respect to the partition B if and only if for every pair of elements l� m � F � the probability
P �i� B�m�� has the same value for every i � B�l�� This common value is the transition probability
from state l to state m for the aggregated homogeneous Markov chain Y �
The opposite situation in which for any initial distribution �� the process agg��� P�B� is
not Markov is illustrated by the following example� Consider again a three state chain with
B��� � f�g� B�� � f� �g and
P �
�B� �� p p �� � �� � �
�CA �
For any value of p with � � p � � and for any initial probability distribution� we have
IP�Xn�� � B��� j Xn � B��� Xn�� � B���� � �
and IP�Xn�� � B��� j Xn � B��� Xn�� � B��� � ��
In ��� the authors show that a third situation is possible� in which there exists a proper subset of A
denoted here by AM such that for any � � AM� the process agg��� P�B� is Markov homogeneous
while for any � �� AM the aggregated process is not Markov� In this more general situation� X
is said to be weakly lumpable with respect to the given partition B �see �� for some examples��
Moreover� in the same reference a su�cient condition to weak lumpability is given�
In ��� the authors analyse the problem of the characterization of weak lumpability� An inter�
esting technique is introduced but we have shown in G� Rubino and B� Sericola ������ that their
characterization is wrong�
The work performed here is an extension of G� Rubino and B� Sericola ������� We obtain
a �nite characterization of weak lumpability by means of an algorithm which computes the set
AM� In G� Rubino and B� Sericola ������� AM is given as an in�nite intersection of sets� The
main result obtained in this paper is the reduction of this in�nite intersection to a �nite one�
�
The paper is organized as follows� The next section gives the background material and notation
and recalls the main results obtained in G� Rubino and B� Sericola ������� In Section �� we show
how the searched set AM can be obtained by means of a �nite algorithm� Section � presents an
example in which the set AM is computed and Section � concludes the paper�
� Notation and main results of G� Rubino and B� Seri�
cola ������
By convention� the vectors will be row vectors� Column vectors will be indicated by means of the
transposition operator � � �T � A vector with all its entries equal to � will be denoted simply by �
its dimension will be de�ned by the context�
For each l � F and � � A we denote by Tl�� the �restriction� of � to the subset B�l�� That is�
Tl�� is a vector with n�l� entries and its ith entry is �Tl����i� � � �n��� � � � n�l � �� i� � i �
�� � � � � � n�l�� When Tl�� �� �� we denote by �B�l� the vector ofA de�ned by �B�l��i� � ��i��kTl��k
if i belongs to B�l�� � otherwise� where k�k denotes the sum of the components of the non negative
real vector �� Each time when in the sequel we shall write a vector of the form �B�m� with � � A�
we implicitely mean that this vector is de�ned� i�e� Tm�� �� ��
To clarify this notation� let us give some examples� Suppose that N � � and B � fB���� B��g
with B��� � f�� � �g and B�� � f�� �g� Then�
for � � ������ ����� ���� ��� ���� �
�B��� � ����� ���� ��� �� ��� �B��� � ��� �� �� ��� �����
T��� � ������ ����� ����� T��� � ���� �����
T���B��� � ����� ���� ���� T���
B��� � ���� ����
for � � ���� ��� �� �� �� �
�B��� � ���� ��� �� �� �� and �B��� is not de�ned since T��� � ��� ���
�
Recall the de�nition of the main object of this analysis�
AMdef� f� � A�Y � agg��� P�B� is a homogeneous Markov chaing
For l � F � we shall denote by ePl the n�l��M matrix de�ned by
ePl�j�m�def� P � n��� � � � n�l � �� j� B�m� � � � � m � n�l�� m � F�
In other words� ePl�j�m� is the probability of passing in one step from the jth state in B�l� to
the set B�m��
When the set AM is not empty� the transition probability matrix of the Markov chain Y �
denoted by bP � is the same for every � � AM� In this case� its lth row bPl can be computed by
bPl � �Tl��B�l�� ePl �see G� Rubino and B� Sericola ��������
To illustrate these de�nitions� consider the following matrix
P �
�BBB���� ��� ��� ����� ��� ��� ������ ��� ��� ������ ��� ��� ���
�CCCAwith B � fB���� B��g� B��� � f����g and B�� � f�g�
We have
eP� �
�B� �� ����� ������ ���
�CA � eP� ����� ���
��
� � ������ ����� ����� �����
and bP �
��� ������ ���
��
Let � be an element of A� k a positive integer and i�� i�� � � � � ik a sequence of k elements of F �
We de�ne recursively the vector f���B�i��� � � � � B�ik�� � A by
f���B�i��� � �B�i��
f���B�i��� � � � � B�ih�� � �f���B�i��� � � � � B�ih���P �B�ih� � � h � k�
�
whenever these operations make sense� The reader can check the following probabilistic interpre�
tation of f � If � � f���B�i��� � � � � B�in�� then
��j� � IP�Xn � j�Xn � B�in�� � � � � X� � B�i����
That is� � is the probability distribution of Xn given �X� � B�i��� � � � � Xn � B�in���
Using this de�nition� we de�ne the basic sequence of sets
A� def� f � � A � �Tl��
B�l�� ePl � bPl for every l � F g
and for j � �
Aj def� f� � A � � � f��� B�i��� � � � � B�ik�� with k � j� we have � � A�g�
As an example� the reader can check that for the previous ��dimensional transition probability
matrix� the set A� can be written as
A� � f� � A�� � ��
�� t�
�� t� ��� ���� �� �� ��� � � t �
�� � � � �g�
We have then the basic result �proved in G� Rubino and B� Sericola �������
AM ��j��
Aj�
In the same paper� some properties of AM are given� Let us say that a subset U of A is stable
by right product by P i� for all x � U the vector xP � U � The principal properties of the set AM
that we need here are
��� � � AM �� �B�l� � AM� l � F
��� � � AM �� n � �� �n
nXk��
�P k � AM
�
��� AM �� �� � � AM
��� AM �� �� AM is a convex closed set
��� A� is stable by right product by P �� AM � A�
��� j � �� Aj�� � Aj
��� If �j � � such that Aj�� � Aj then AM � Aj�k� k � �
�� AM ��j��
Aj
See G� Rubino and B� Sericola ������ for the proofs of these properties� In the next section� we
show that the sequence of sets �Aj�j�� is a stationary sequence whose limit is equal to AN � That
is� AM � AN � where N denotes the cardinal of the state space E of the process X� Moreover� a
�nite algorithm to compute this set is given�
� An algorithm to compute AM
To obtain the announced result� we need some preliminary lemmas� The �rst one gives a way to
compute the set Aj�� as a function of Aj�
Lemma �� j � ��
Aj�� � f� � Aj��B�l�P � Aj for every l � Fg�
Proof The proof is a direct consequence of the de�nition of f � It only uses the property
f���B�i��� B�i��� � � � � B�ik�� � f��B�i��P�B�i��� � � � � B�ik��
which is immediate� �
�
For every l � F � we denote by Pl the n�l��N sub�matrix of P corresponding to the transition
probabilities of X from the states of B�l� to the states of E� Let us de�ne for every l � F � the
n�l� �M matrix Hl as Hl � ePl � �T bPl and recall that a polytope of IRN is a bounded convex
polyhedral subset of IRN �����
Lemma �� j � �� Aj is a polytope of IRN �
Proof Verify �rst that A� can be written as
A� � f� � A��H � �g
where H is the N �M� block diagonal matrix whose lth block is Hl� When necessary� we shall
also write H � Diag�Hl�� So� A� is a polytope �see ����
Now� for every j � �� we de�ne the N �M j�� matrix H �j� as
H ��� � H�
H �j��� � Diag�PlH�j�� for j � ��
The previous lemma allows us to write
Aj�� � f� � Aj��H �j��� � �g�
The result follows by de�nition �see ���� �
For � and � � IRN � we denote by D��� �� the line containing the two points � and �� That
is D��� �� � f� ��� ��� � IRg� Recall that the dimension of a convex set C of IRN is the
dimension of the smallest a�n subset containing C we shall denote it by dim�C��
Lemma �� j � � such that Aj�� �� � we have
hdim�Aj��� � dim�Aj� ��Aj�� � Aj
i�
�
Proof We have seen that for all j � �� Aj�� � Aj� If the dimension is equal to �� the result
follows immediately� Suppose now that the common dimension is at least �� Let � and � be two
points belonging to Aj�� and let � be a point of D��� �� Aj� Such a point exists since the two
sets have the same dimension� Now� there exists � IR such that � � � �� � ��� That is�
using the previous lemma� �H �j��� � �H �j��� ��� ��H �j��� � �� This proves that � � Aj��
and then that Aj�� � Aj� �
We are now ready to prove the main result of this paper� Recall that N denotes the cardinal
of the state space of the starting homogeneous Markov chain X�
Theorem ��
AM � AN
Proof Consider the sequence A�� � � � � AN � From property ��� of the previous section� this
sequence is decreasing� If two consecutive elements of A�� � � � � AN are equal� property ��� of the
previous section leads to AM � AN � If all its elements are di�erent� the previous lemma concludes
that the sequence of their dimensions is strictly decreasing and so� since dim�A� � N � �� we
have dim�Aj� � N � j for j � �� � � � � � N � �� and then AN � � Property �� of the previous
section leads to the result� �
Before giving the algorithm which computes the set AM� we need another interesting property
given in the following lemma and which generalizes property ��� of the previous section�
Lemma �� Aj is stable by right product by P �� AM � Aj
Proof AM being stable by right product by P �property ��� of the previous section�� the
implication from the right to the left is immediate� Conversely� suppose that Aj is stable by right
�
product by P � Verify then that
j � ��h� � Aj �� �B�l� � Aj for every l � F
i�
Then Lemma ��� leads to Aj�� � Aj and property ��� implies AM � Aj� �
To determine the set AM� we can proceed as follows� Assume the chain X not strongly
lumpable with respect to the partition B� The �rst step consists of calculating the set A� and
determining if it is stable by right product by P � If A� is stable by right product by P � Lemma ���
says that AM � A�� If A� is not stable by right product by P � we �rst verify whether for all l � F
the vector �B�l�P is in A�� If there exists l � F such that �B�l�P �� A�� then � �� A� and AM �
�consequence of properties ��� and ��� of the previous section�� If � � A� then we calculate this
set and we repeat the previous operations�
Denote by � the following procedure� where P�A� is the set of all subsets of A�
� � P�A� � P�A�
U ��n� � U��B�l�P � U for every l � F
oWith this notation� Aj�� � ��Aj� �Lemma ����� The algorithm has then the following form�
if X is strongly lumpable then AM � A stop
else U �� A�
loop
if U is stable by P break byfound endif
if �l � F � �B�l�P �� U break byempty endif
U �� ��U�
endloop
exits
��
exit found � AM �� U
exit empty � AM ��
endexits
endif
Remark that all the sets considered in this algorithm are polytopes and so they can be determined
in a unique way by giving their vertices�
Example
To illustrate our algorithm� let us take as an example the process X � � � � P � on the state space
E � f�� � �� �� �� �g with
P �
�BBBBBBBB�
���� ���� ���� ���� ���� ������� ��� ��� ��� ��� ������ ��� ��� ��� ��� ������ ��� ��� ��� ��� ������ ��� ��� ��� ���� ������� ��� ��� ��� ���� ����
�CCCCCCCCAand consider the partition B � fB���� B��g where B��� � f�� � �� �g and B�� � f�� �g� We
have N � �� M � � n��� � �� n�� � � We see that X is not strongly lumpable with respect to
the partition B� The stationary distribution � of X is
� ������� ����� ����� ����� ����� �����
�
and we have
�B��� ����� ��� ��� ��� � �
�� �B��� �
�� � � � �� ��
��
��
The matrices eP� and eP� are given by
eP� �
�BBB���� ������ ������� ������� ���
�CCCA � eP� �
���� ������ ���
��
This leads to
bP� ������ ���
�� bP� �
���� ���
��
The polytope A� can be determined by the matrix H ��� �see the proof of Lemma ��� given by
H ��� �
�BBBBBBBB�
���� ��� � ����� ��� � ���� ���� � ���� ���� � �� � � �� � � �
�CCCCCCCCA�
Therefore� the vertices of the polytope A� are
�� � � ��� � � �� � � � ��� � � ��� � �� � � � � �� � � � ��� � ��� � � � �� � � � ��� ��� � � � � ��� � � � � � � � � � ��� � � � � � � � � � �
The set A� is not stable by right product by P since ��P �� A� ���PH��� �� ��� Furthermore�
�B���P � A� and �B���P � A� since �B���PH ��� � � and �B���PH ��� � �� This means that the set
A� �� �� � A��� The polytope A� can be determined by the matrix H ��� given by
H ��� �
�BBBBBBBBBBBBB�
� � � � � � � ������ ������ � � � � � ������� ����� � � � � � �
� � � � � � � �� � � � � � � �� � � � � � � �� � � � � � � �� � � � � � � �
�CCCCCCCCCCCCCA�
The vertices of the polytope A� are ��� ��� �� and � where
� ��� ��� ��� ��� � �
��
�
The set A� is not stable by right product by P since ��P �� A� ���PH��� �� ��� Furthermore�
�B���P � A� and �B���P � A� since �B���PH ��� � � and �B���PH ��� � �� This means that A ��
�� � A�� The polytope A can be determined by the matrix H �� given by
H �� �
�BBBBBBBB�
� � � � � � � � � � � � � � � �� � � � � � � � � � � � � � � �
������ ������� � � � � � � � � � � � � � �������� ������ � � � � � � � � � � � � � �
� � � � � � � � � � � � � � � �� � � � � � � � � � � � � � � �
�CCCCCCCCA�
The vertices of the polytope A are ��� �� and �B���� Finally� the polytope A is stable by
right product by P since ��PH�� � �� ��PH
�� � � and �B���PH �� � �� So� we have AM � A�
that is
AM ����� �� �� �� �� �� ��� �� �� �� �� �� ��
�
���
���
���
�� �� ��� � � � � ��� ��� � � �
�
Conclusions
In this paper� we have analyzed the set of all initial probability distributions of an irreducible
and homogeneous Markov chain which lead to a homogeneous aggregated Markov chain given the
transition probability matrix and a partition of the state space� We have obtained a constructive
characterization of this set by means of a �nite algorithm� In a subsequent paper we analyse the
continuous time case� Basically� it is shown that it is always possible to come down to the discrete
time case using the uniformization technique� The case of homogeneous Markov processes with
absorbing states seems to be the �rst possible direction to extend these results�
��
References
�� A�M� Abdel�Moneim and F�W� Leysie�er� Weak lumpability in �nite Markov chains� J� Appl�
Prob� �� ����� ��������
�� J�G� Kemeny and J�L Snell� Finite Markov chains� Springer�Verlag� New York Heidelberg
Berlin� �����
�� R� T� Rockafellar� Convex Analysis� Princeton University Press� Princeton� New�Jersey� �����
�� G� Rubino and B� Sericola� On weak lumpability in Markov chains� J� Appl� Prob� � ������
��������
��