
Scalable approach for effective control of gene regulatory networks

Mehmet Tan a,b, Reda Alhajj b,c,*, Faruk Polat a

a Department of Computer Engineering, Middle East Technical University, Ankara, Turkey
b BIDEALS Group, Department of Computer Science, University of Calgary, 2500 University Drive NW, Calgary, Alberta, Canada
c Department of Computer Science, Global University, Beirut, Lebanon

Artificial Intelligence in Medicine 48 (2010) 51–59

ARTICLE INFO

Article history:

Received 21 April 2008

Received in revised form 25 September 2009

Accepted 3 October 2009

Keywords:
Gene regulatory networks

Control policy

Scalability technique

Feature reduction

Probabilistic Boolean networks

Markov decision problems

ABSTRACT

Objective: Interactions between genes are realized as gene regulatory networks (GRNs). The control of such networks is essential for investigating issues like different diseases. Control is the process of studying the states and behavior of a given system under different conditions. The system considered in this study is a gene regulatory network (GRN), and one of the most important aspects in the control of GRNs is scalability. Consequently, the objective of this study is to develop a scalable technique that facilitates the control of GRNs.

Method: As the approach described in this paper concentrates on the control of GRNs, we argue that it is possible to improve scalability by reducing the number of genes to be considered by the control policy. Consequently, we propose a novel method that considers gene relevancy to estimate genes that are less important for control. This way, it is possible to get a reduced model after identifying genes that can be ignored in model-building. The latter genes are located based on a threshold value which is expected to be provided by a domain expert. Some guidelines are listed to help the domain expert in setting an appropriate threshold value.

Results: We ran experiments using both synthetic and real data, including metastatic melanoma and budding yeast (Saccharomyces cerevisiae). The reported test results identified genes that could be eliminated from each of the investigated GRNs. For instance, test results on budding yeast identified the two genes SWI5 and MCM1 as candidates to be eliminated. This considerably reduces the computation cost and hence demonstrates the applicability and effectiveness of the proposed approach.

Conclusion: Employing the proposed reduction strategy results in close-to-optimal solutions to the control of GRNs, which are otherwise intractable due to the huge state space implied by the large number of genes.

© 2009 Elsevier B.V. All rights reserved.


1. Introduction

Protein synthesis is a key process for living organisms. All proteins are encoded by messenger RNA (mRNA), which is extracted from a gene in DNA; proteins are produced in a process that involves two stages, namely transcription and translation. In transcription, a sequence of the gene is used to produce mRNA, which is then used to create a protein during translation. For a gene to be transcribed into mRNA, it is often necessary for a specific protein called a transcription factor to bind to the DNA at a specific location. A transcription factor can have a positive or negative regulatory effect on the binding site. So, the transcription level (or the expression level) of the gene can change based on the binding of the transcription factor. Since the transcription factor is also a protein, which is decoded from a gene, it is possible to describe and discuss a set of interactions among genes; these interactions constitute a GRN.

As described in the literature, there are various methods to represent and model a GRN [1]. These include (dynamic) Bayesian networks, (probabilistic) Boolean networks (BNs), neural networks, Petri-net models and differential equation-based models. Modeling may provide an opportunity to estimate the future state of a cell based on the current state and the conditions affecting the cell. To justify the need for control, consider a cell which is estimated to be in an undesirable state (e.g., a cancerous state) in the near future; this brings the necessity to intervene in the current state of the network in order to avoid reaching undesirable state(s). But it is important to intervene as efficiently and effectively as possible because of the urgency of the situation and the cost of the intervention. This motivates the need to control GRNs; the problem may be stated as follows: find an efficient policy to interact (by interventions) with the network in order to change its behavior in a way that satisfies some prespecified objective(s). On the other hand, the size of the state space is the most crucial issue in GRN control; this consideration is common to all control problems. It is also worth mentioning that the term control in the context of GRNs differs slightly from control theory because of the limited intervention means available for GRNs.

For a discrete GRN (where the expression levels of genes are discretized), the size of the state space is proportional to the number of genes and the number of discretization levels for each gene. Even if the expression levels of the genes are discretized to binary levels, the size of the state space is 2^N for an N-gene network; this makes the problem hard to cope with even for small values of N. So, to find an efficient policy for the GRN control problem, appropriate methods must be introduced to reduce the state space to a reasonable size, whenever possible and desired.

The relevancy of a given gene in terms of control depends on the objective to be satisfied, because genes in a GRN have varying effects on each other; this means that a gene might have minimal or negligible effect on the solution of a GRN control problem. In this paper, we utilize the relevancy measure to propose a kind of feature reduction method capable of identifying genes which are less relevant for control. Such genes are candidates to be eliminated in building a model so that an approximate control policy can be reached faster. The feature reduction process may identify more than one gene as a candidate to be eliminated. But even when one gene is eliminated, the state space shrinks significantly. Obviously, this positively reflects on the scalability of the GRN control problem to be investigated.

Generally, the control of GRNs has been studied on Markovian models, e.g., [2–10]. For instance, Shmulevich et al. [10] considered control in a Markovian model by exploiting Markov chain theory [11]. It has been shown how to select the gene to intervene on in order to minimize the time required to reach some set of desirable states, given the current state. Structural intervention has also been considered for reaching desired states [9]. On the other hand, Datta et al. [5] formulated the interventions in terms of altering transition probabilities by using some external control variables. They used dynamic programming to formulate and solve a finite horizon controlled Markov chain, where a horizon is the duration of applying external actions and the Markov chain is defined similarly to a Markov decision process [12]. An optimal infinite-horizon control extension of this work is described in [8].

Almost all the above mentioned studies use probabilistic Boolean networks (PBNs) [13] as the Markovian model. A slightly different model in the context of control is investigated in [7], where the switching between the BNs forming a probabilistic Boolean network (PBN) is not performed in every step, but probabilistically. Attractors in a PBN are the states to be reached after a finite number of steps; this stable situation will not change in the absence of perturbations. Control in the absence of knowledge of the switching probabilities between BNs that have common attractors has been studied by Choudhary et al. [4].

A previous study by our group [2,3] is based on the following argument: if the control process is considered as a treatment, then observing the patient after the treatment can also be taken into account while solving the control problem. As a result, the solution described in [2] was developed; it considers a monitoring horizon after the control horizon. The solution is given for various settings depending on the control and monitoring horizons being finite or infinite. The problem was also formulated as a multi-objective problem where the objectives are the state cost and the state-action cost defined by domain experts [3].

To sum up, the above mentioned works focus on solving the control problem in GRNs for different settings using dynamic programming. A gene is assumed to be relevant if it is chosen for modeling, i.e., it exists in the same GRN with others. But we observed that relevancy also depends on the objective(s); consequently, we argue that the component of the GRN we should focus on may change according to the given set of objectives. Based on this argument, we propose a feature reduction method that successfully maintains scalability in the control of GRNs [14,15]. By feature reduction, we provide the choice to reduce the number of genes to be considered in the control process and hence maintain scalability. Neglecting scalability turns control into an unmanageable process, though control is essential to study and understand the behavior of any given system. To the best of our knowledge, this is a major contribution as the first attempt at applying feature reduction in the context of GRN control; our initial results encouraged us to expand the work as described in this paper. The results reported in this paper demonstrate the applicability and effectiveness of the proposed approach. Although GRN control studies are not yet directly applicable to clinical practice, the promising results demonstrate the potential to be used in real applications. We report test results using both synthetic and real gene expression data.

The rest of this paper is organized as follows. Section 2 includes the necessary background information. Section 3 covers the details of the proposed reduction based approach. Section 4 reports experimental results on synthetic and real gene expression data. Section 5 presents conclusions and future research directions.

2. Background

In this section, we cover the background necessary for the scope of the work described in this paper. In particular, we present an overview of Markov decision problems (MDPs) and discuss the control problem in the context of GRNs.

2.1. Markov decision problems

A Markov decision process is formally defined as a quadruple (S, A, T, R), where S is the set of states, A is the set of actions, T gives the transition probabilities such that T(s, a, s') denotes the probability of the next state being s' when action a is applied in state s, and R is the real-valued reward function defined on the set S × A. R is sometimes defined on the set S × A × S as R(s, a, s'), where the next state also affects the reward function. This definition can easily be transformed into the former as $R(s,a) = \sum_{s'} T(s,a,s')\,R(s,a,s')$. A Markov decision process together with a performance criterion is called a Markov decision problem (MDP). A solution to an MDP is a mapping $\pi : S \to A$, which is called a policy. The aim here is to find a policy that maximizes the total expected future reward, which is defined differently based on the performance criterion. Unless otherwise stated, we focus in this paper on the total discounted infinite-horizon reward, defined as $\sum_t \beta^t R_t(s,a)$, where $R_t(s,a)$ is the immediate reward on performing action a in state s at time step t, and the discount factor $\beta$ is in (0, 1).

The value function for state s, denoted V(s), gives the degree of desirability of state s for the current control process. Every policy $\pi$ defines a value function $V^{\pi}$ on the state space S. The value of a policy $\pi$ in state s is the total reward of choosing an action in state s according to $\pi$, and following $\pi$ thereafter. The Bellman equation [16] defines the relationship between $V^{\pi}(s)$ and the values of other states:

$$V^{\pi}(s) = R(s, \pi(s)) + \beta \sum_{s'} T(s, \pi(s), s')\, V^{\pi}(s') \qquad (1)$$

There exists a unique optimal value function [12], denoted $V^{*}$, which is the value function of the optimal policy $\pi^{*}$. The optimal value function is the one that achieves the maximum value in all states:

$$V^{*}(s) = \max_a \Big[ R(s,a) + \beta \sum_{s'} T(s,a,s')\, V^{*}(s') \Big] \qquad (2)$$

If all the elements of an MDP are known (i.e., the model is known), then the solution can be found using dynamic programming [12]. One of the best-known dynamic programming algorithms for solving MDPs is the value iteration algorithm used in this paper; it uses the fact that when Eq. (2) is applied iteratively, it converges to the optimal value function, as shown in Eq. (3). So, with arbitrary initial assignments to $V_0$, $V_k$ converges to $V^{*}$ as $k \to \infty$ [12]:

$$V_{k+1}(s) = \max_a \Big[ R(s,a) + \beta \sum_{s'} T(s,a,s')\, V_k(s') \Big] \qquad (3)$$

Given $V^{*}$, a corresponding optimal policy is computed as:

$$\pi^{*}(s) = \arg\max_a \Big[ R(s,a) + \beta \sum_{s'} T(s,a,s')\, V^{*}(s') \Big] \qquad (4)$$
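To make the iteration concrete, the following is a minimal sketch of tabular value iteration. It is our own illustration rather than code from the paper; the array layout and the convergence tolerance tol are assumptions of the sketch.

```python
import numpy as np

def value_iteration(T, R, beta=0.95, tol=1e-6):
    """Tabular value iteration following Eqs. (3) and (4).

    T: transition probabilities, shape (|S|, |A|, |S|)
    R: expected immediate rewards, shape (|S|, |A|)
    Returns the (approximately) optimal value function and a greedy policy.
    """
    n_states, n_actions, _ = T.shape
    V = np.zeros(n_states)                    # arbitrary initial V_0
    while True:
        # Q(s,a) = R(s,a) + beta * sum_s' T(s,a,s') V(s')
        Q = R + beta * (T @ V)                # shape (|S|, |A|)
        V_new = Q.max(axis=1)                 # Eq. (3)
        if np.max(np.abs(V_new - V)) < tol:   # V_k has (numerically) converged
            V = V_new
            break
        V = V_new
    policy = Q.argmax(axis=1)                 # Eq. (4): greedy policy
    return V, policy
```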

2.2. Control in GRNs

Markov decision processes have been used in conjunction with the control of GRNs. All components of an MDP have to be defined in order to find a control policy for a given GRN. Of the four components of an MDP, S is defined by the model; and after we define the actions, T can also be computed with some additional processing. The two components A and R should be defined according to the specifications of the problem.

Controlling a GRN model means externally applying some control actions to the model in order to achieve one or more of the objectives. These actions are usually expressed in the form of interventions [2,5–8,10], which are realized by externally changing the expression level of one or more genes. If g is the control gene for action a, the expression level of gene g is instantaneously changed to a different level when action a is applied. Then the network evolves. We assume in this paper that the control gene is given; it can be determined using the already developed methods for choosing the control gene, as discussed in [7,13]. The method utilized in this paper depends on computing the influence of one gene on another based on the model constructed from the gene expression data. The gene that has a high influence on the gene to be controlled is chosen as the control gene.

The studied control objectives are generally in the form of desirable and undesirable states, which are assigned high and low rewards, respectively. Besides, actions are considered to have a cost associated with them (as in real life). In this paper, we consider the rewards and costs of actions to be additive, as in [5,7,8]. A solution that treats action costs and rewards as non-additive has been developed by our group; it is described in [2,3]. Although it is more realistic to consider action costs and rewards as non-additive because of their different measurement units, we assume that the costs of actions are not uniform and that ignoring them is unrealistic; a multi-objective solution is out of the scope of this paper because here we want to highlight the importance of feature reduction techniques in improving the scalability of control for GRNs; after demonstrating the effectiveness of reducing the number of genes, it will be possible to adapt the multi-objective solution that has been developed by our group. Therefore, for the study conducted in this paper, we assume the action and state costs are given by domain experts and can be combined by summation.

3. Scalable control by feature reduction

Feature reduction is the process of finding, and excluding from further consideration, features that are expected to have reasonably negligible or minimal effect on the output quality. In general, feature reduction or feature selection is performed to improve the performance of some predictor [17]. The features in the case of gene expression data are the genes, the samples, or both. In this paper, we consider feature reduction as decreasing the number of genes. What we consider as output is the value function in Eq. (2). This is reasonable as the value function represents the reward (or cost) associated with a given state by applying the control policy. In real life, this can give an indication of the cost of treatment (policy). The speed-up gained by reducing the state space of the MDP is considered as the performance improvement.

In the GRN control domain, we observed that some of the genes can be ignored in the process of finding a control policy for given data. So, it is essential to estimate the redundancy in the data before starting the modeling and MDP-solving parts. This way, we can effectively deal with GRNs that have a larger number of genes by applying reduction to obtain a smaller set of genes instead. This is depicted in Fig. 1; path (a) is the ordinary path of solving the control problem, and path (b) is the reduction based method proposed to solve the problem. The function F that labels the last link along path (b) in Fig. 1 maps the policy π' found for MDP M' to the policy π of the larger MDP M. Assume the model on path (a) has two states s_i and s_j which differ only in the value of one gene, say the third gene. For example, for the binary case let these states be s_i = 1001010 and s_j = 1001110, for a network of seven genes. If, in the feature reduction step of path (b), we decide that the third gene is irrelevant, then s_i and s_j will be aggregated in Model', forming a state s_n = 100110. To get policy π, after solving M', we have to remap the action defined for s_n in π' to s_i and s_j. Since we know that the third gene is redundant, s_i and s_j are in fact equivalent states. Therefore, F simply takes π' and produces π by setting π(s_i) = π(s_j) = π'(s_n), for all s_i, s_j and s_n.
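To illustrate the mapping F, here is a minimal sketch (our own, with hypothetical function and variable names) that lifts a policy computed on the reduced state space back to the full state space by dropping the bits of the removed genes from each full state:

```python
def lift_policy(reduced_policy, n_genes, removed):
    """Implement the mapping F: pi(s_i) = pi(s_j) = pi'(s_n).

    reduced_policy: dict from reduced-state bit strings to actions
    n_genes: number of genes in the full network
    removed: positions of the removed genes (0-based, left to right;
             an indexing convention assumed for this sketch)
    """
    full_policy = {}
    for s in range(2 ** n_genes):
        bits = format(s, f"0{n_genes}b")     # full state, e.g. '1001010'
        reduced = "".join(b for i, b in enumerate(bits) if i not in removed)
        full_policy[bits] = reduced_policy[reduced]
    return full_policy
```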

The proposed solution is based on the assumption that the objective is defined in terms of the expression values of some genes: the genes that we want to control. We require that the reward function is defined in the form R(s, a, s') and depends only on the action and the next state. This does not need to be the case in real life, but it also does not overly restrict the problem because the objective is generally defined in terms of desirable or undesirable states. Finally, model minimization can also be performed after building the model. However, if we think of the process as finding a policy for some given data, then feature (or gene) reduction before modeling saves time in the modeling stage because the model building process is a time-consuming task as well.

Figure 1. Finding control policy for the given data.

Figure 2. Aggregation of stochastically bisimilar states s1 and s2.

3.1. Selecting the genes to remove

The selection of the genes to remove is based on the following observation: since the objective is defined in terms of the reward and control genes, all other genes are candidates for removal. From the set of candidate genes, a subset will be selected based on their estimated relevance for deriving a control policy.

Recall that genes to be removed should have the lowest effect on the value function. As explained above, removing a gene, say g, from consideration is equivalent to aggregating the states that differ only in the value of g. Assume that s_0, s_1, s_2 and s_0' are related in the MDP as shown in Fig. 2. Both s_1 and s_2 have to be stochastically bisimilar [18] in order to be aggregated so that the resulting MDP has the same solution as the original MDP.

Definition 1 ([18]). Any two states s_i and s_j in an MDP are said to be stochastically bisimilar if the following two conditions hold:

I. $\forall a:\; R(s_i, a) = R(s_j, a)$.

II. $\forall a, s':\; T(s_i, a, s') = T(s_j, a, s')$.
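Read operationally, Definition 1 amounts to comparing one row of the reward table and one slice of the transition tensor. A minimal sketch (our own illustration, not from the paper):

```python
import numpy as np

def stochastically_bisimilar(T, R, si, sj, atol=1e-9):
    """Check Definition 1 for two states of a tabular MDP.

    T: transitions, shape (|S|, |A|, |S|); R: rewards, shape (|S|, |A|).
    """
    same_rewards = np.allclose(R[si], R[sj], atol=atol)   # case I
    same_dynamics = np.allclose(T[si], T[sj], atol=atol)  # case II
    return same_rewards and same_dynamics
```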

Stochastic bisimilarity for the states of an MDP is an equivalence relation (see Theorem 4 in [18] for more details). Two stochastically bisimilar states have the same value in the solution of an MDP. Under this equivalence relation, stochastically bisimilar states are said to be equivalent in an MDP; by using this information, the MDP can be reduced to another MDP with a smaller state space.

Theorem 1 ([18]). Two stochastically bisimilar states in an MDP are equivalent and can be aggregated.

Proof. Follows from Theorem 7 in [18].

The aggregation procedure applied to the two stochastically bisimilar states s_1 and s_2 from Fig. 2 leads to the new state s_ag shown in Fig. 2; it has the same reward function as s_1 and s_2, i.e., $\forall a, s':\; R(s_{ag}, a, s') = R(s_1, a, s') = R(s_2, a, s')$, and the transition probabilities are as shown in the figure.¹

Consider states s_i and s_j that differ only in the value of g in M (see Fig. 1). According to Theorem 1, if the states s_i and s_j are stochastically bisimilar and there is in M' a state s_ag that is the aggregation of s_i and s_j, then M' is the minimized version of M, and hence has the same solution as M. This may be interpreted as follows: we can find the control policy faster by locating and removing from the data every gene g for which cases I and II in Definition 1 hold for the states that differ only in the value of g.

¹ Note that Givan et al. [18] call s_ag a block (set of states) rather than a new state, but there is conceptually no difference for our case.

For case I, we will use the assumption in the definition of R(s, a, s') that it does not depend on the current state s. Assume we have one reward gene g_r that we want to control. The reward function R(s, a) by definition satisfies:

$$R(s,a) = \sum_{s'} T(s,a,s')\, R(s,a,s') \qquad (5)$$

and by using our assumption on the reward function, it can be rewritten as:

$$R(s,a) = \sum_{i \in Val(g_r)} \; \sum_{s' \in S_{g_r = i}} T(s,a,s')\, R(s,a,s') \qquad (6)$$

where Val(g_r) denotes the discrete values that g_r can take, and S_{g_r = i} denotes the set of all states that satisfy g_r = i. Notice that R(s, a, s') is constant for all s' (where g_r = i) and a given action a (recall the assumption about R(s, a, s')). This means that case I in Definition 1 holds for two different states s_i and s_j if

$$\forall i: \; \sum_{s' \in S_{g_r = i}} T(s_i, a, s') = \sum_{s' \in S_{g_r = i}} T(s_j, a, s') \qquad (7)$$

Eq. (7) may be interpreted as follows: being in state s_i or state s_j makes no difference to the value of g_r in s'. If s_i and s_j differ only in the value of a gene, say g, then Eq. (7) holds if the probability of g_r taking value i in s' is independent of the value of g, i.e., Pr(g_r(t+1) = k | g(t)) = Pr(g_r(t+1) = k), where g(t) denotes the value of g at time step t. In other words, case I holds if g has no influence on the next-state value of g_r. Using a similar argument, for case II to hold for states s_i and s_j that differ only in the value of g, gene g should have a low effect on determining the next state of any gene. So, we have to check similar conditions for cases I and II. If we approximate the influence of a gene on a set of genes as the average influence, both the influence of g on g_r and the average influence of g on all other genes are important.

3.2. Influence score

Given two genes g_i and g_j, the influence of g_i on g_j can be estimated by checking to what degree the equation Pr(g_j(t+1) = k | g_i(t)) = Pr(g_j(t+1) = k) is satisfied. We define the following function to estimate the influence of g_i on g_j:

$$Inf(g_i, g_j) = \sum_{k \in Val(g_j)} \big| \Pr(g_j(t+1) = k \mid g_i(t)) - \Pr(g_j(t+1) = k) \big| \qquad (8)$$


and we define the average influence of g on a set of genes G as:

$$Af(g, G) = \frac{1}{|G|} \sum_{g_G \in G} Inf(g, g_G) \qquad (9)$$

where |G| is the number of genes in G. The counts for the different values of the pairs (g_i, g_j) in the data constitute sufficient statistics for Inf(g_i, g_j).

Note that the function Inf(g_i, g_j) that gives the influence of gene g_i on g_j is similar in nature to the influence concept introduced by Shmulevich et al. [13]. But Shmulevich et al. [13] compute this value based on the model (PBN), while we compute the value of Inf(g_i, g_j) directly from the data without building a model.

To select a subset of the genes in the data, we assign to each gene what we call an influence score (IS), which is based on two sub-scores inspired by the cases in Definition 1. The sub-score for case I is:

$$S_I(g) = Inf(g, g_r) \qquad (10)$$

where g_r is the reward gene. The sub-score for case II is:

$$S_{II}(g) = Af(g, G) \qquad (11)$$

where G includes all genes in the data except gene g. As a result,

$$IS(g) = S_I(g) + S_{II}(g) \qquad (12)$$
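For concreteness, here is a minimal sketch of how these scores could be computed from a discretized expression matrix. It is our own illustration, not the authors' code: it assumes consecutive columns are consecutive time steps, and it averages the conditional in Eq. (8) over the observed values of g_i(t), which is one plausible reading of the equation.

```python
import numpy as np

def influence(data, gi, gj):
    """Estimate Inf(g_i, g_j) (Eq. 8) from a discrete (n_genes, n_samples)
    time-series matrix; columns are assumed to be consecutive time steps."""
    x, y = data[gi, :-1], data[gj, 1:]              # g_i(t) and g_j(t+1)
    total = 0.0
    for k in np.unique(data[gj]):
        p_k = np.mean(y == k)                       # Pr(g_j(t+1) = k)
        for v in np.unique(data[gi]):
            mask = x == v
            if mask.any():                          # Pr(g_j(t+1)=k | g_i(t)=v)
                total += np.mean(mask) * abs(np.mean(y[mask] == k) - p_k)
    return total

def influence_score(data, g, reward_gene):
    """IS(g) = S_I(g) + S_II(g), per Eqs. (9)-(12)."""
    others = [j for j in range(data.shape[0]) if j != g]
    af = np.mean([influence(data, g, j) for j in others])   # Af(g, G), Eq. (9)
    return influence(data, g, reward_gene) + af
```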

Combining all the concepts introduced so far, the final reduction method, which we call FRGC (Feature Reduction for GRN Control), is given in Algorithm 1.

Algorithm 1. Feature reduction for GRN control

Input: m × n discrete gene expression data D and threshold Th
Output: (m − k) × n reduced gene expression data D'

  number_of_genes × number_of_samples = size(D)
  Genes = {1, ..., number_of_genes}
  Irrelevant = {}
  for all g ∈ Genes do
    compute IS(g)
    if IS(g) < Th then
      Irrelevant = {g} ∪ Irrelevant
    end if
  end for
  D' = remove the genes in Irrelevant from D
  return D'
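Algorithm 1 can be transcribed almost directly; the sketch below reuses influence_score from the sketch in Section 3.2 and is illustrative rather than the authors' implementation.

```python
def frgc(data, reward_gene, threshold):
    """Algorithm 1 (FRGC): drop genes whose influence score falls below Th.

    The reward gene itself is kept, since objective genes are not
    candidates for removal (Section 3.1) -- an explicit guard we add here.
    """
    n_genes = data.shape[0]
    irrelevant = [g for g in range(n_genes)
                  if g != reward_gene
                  and influence_score(data, g, reward_gene) < threshold]
    kept = [g for g in range(n_genes) if g not in irrelevant]
    return data[kept, :], irrelevant
```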

Algorithm 1 identifies and removes some of the lowest scored genes. One point to consider in the process is the number of lowest scored genes to remove. We use a threshold score Th as the stopping criterion of the removal; Th obviously depends on the analyzed expression data. In this paper, we rely on a domain expert to specify the value of Th.

Deciding on a value for the threshold is a subjective process which depends on several issues, like the usage of the results, the degree of accuracy, the simplicity of the policy, the computational resources and the objective. A large (small) threshold implies less (more) accurate results and requires less (more) computational resources. The expected complexity of the resulting policy can also be important because eliminating some of the genes would generally produce a simpler policy, which is more attractive as far as applicability is concerned. The need to take immediate action for time-critical, sensitive cases may tolerate lower accuracy for a simpler policy. All these factors are good indicators to guide the choice of a threshold value. The process is subjective; it is like a multi-objective optimization issue because most of the factors and objectives described above do conflict. So, it is the duty of the domain expert to decide which factors or objectives should be considered more important for the specific problem being investigated and hence set the threshold value accordingly. For instance, a lower threshold value is preferred if sensitivity is the issue, while a higher threshold value is expected if simplicity is the major concern; in most cases it is somewhere in between. Finally, while deciding on the threshold value it is possible to employ simple procedures such as investigating the sum of the IS scores of all genes and eliminating the genes with the smallest scores up to a certain percentage of the sum; however, a decision on the percentage is necessary and this is again subjective. The problem can also be considered as determining the number of minimum-scored genes to eliminate; this depends on the set of genes being investigated by the experimenter. Keeping all these issues in mind and knowing how subjective the process is, we are working on a method that minimizes as much as possible the involvement of the domain expert by automatically determining the threshold value; once put into action, such a method will add much to the value of the reduction approach proposed in this paper, turning it into a more adaptable process.

As described in the experiments section, errors are computed as the percentage difference between the resulting value function and the approximate value function. How much error is tolerable highly depends on the problem specification. For instance, if the reward function is defined as the financial cost of a treatment, then the tolerable error bound can be large compared to the case when the reward function is defined in terms of life expectancy or probability of survival. For instance, a 10% error may be acceptable where financial cost is concerned, i.e., it may be tolerable for some cases. But if the survival of a patient is the concern, then a 10% error should not be acceptable unless it is the best available alternative. Therefore, deciding whether an error bound is tolerable or not is problem dependent.

4. Experimental results

We have conducted experiments to demonstrate the applicability and effectiveness of the proposed reduction approach. We used PBNs [13,10] as the modeling technique. The basic idea in PBNs, as distinct from BNs, is to use more than one Boolean function for each target gene. So, a PBN is a more general, probabilistic modification of a Boolean network. We used the PBN Toolbox software [13] to derive a PBN from given data. The algorithm for deriving a PBN from the data depends on a concept called the Coefficient Of Determination (COD) [19], which simply computes the decrease in error when the future value of a gene is determined by using the observations related to other genes instead of using no observations. So, given k sets of genes $X_1, \ldots, X_k$ and a set of functions $f_1, \ldots, f_k$ such that $f_1(X_1), \ldots, f_k(X_k)$ are optimal predictors of the target gene g according to some probabilistic error measure $\varepsilon$, the COD is computed as follows [13]:

$$\theta^{g}_{k} = \frac{\varepsilon_g - \varepsilon(g, f_k(X_k))}{\varepsilon_g} \qquad (13)$$

where $\varepsilon_g$ is the error in the best estimation of g when we do not have any information other than the value of g.

The PBN derivation algorithm uses three parameters:

1. The number of regulators chosen for each gene. Biologically, genes are thought to be regulated by a small number of genes [5,20]. So, among one-, two- and three-gene regulator sets, we select the genes with the highest COD values, where the error measure is the best-fit extension error [21].

2. The number of functions used to model each gene. It is set to 3 based on some initial test runs that check the model's ability to predict the next state of the network given its current state.

3. The probabilities assigned to the functions chosen to model a gene. The probability $c_{ij}$ of choosing the jth function for gene i is calculated as follows [13]:

$$c_{ij} = \frac{\theta^{i}_{j}}{\sum_j \theta^{i}_{j}} \qquad (14)$$

So, functions with high COD values are chosen with high probability in the model.

Figure 3. Synthetic networks: (a) Network 1 and (b) Network 2.

Table 1
Influence scores of genes in network 1.

Gene   3      4      5      6      7      8
Score  1.004  1.181  0.626  0.923  1.162  1.282

Table 2
Gene subsets with error less than 10% for network 1.

Subset  Error   Subset   Error   Subset       Error
3       2.582   5 8      2.073   3 5 8        2.245
5       2.073   6 7      0.203   3 5 7        2.245
6       0.134   6 8      0.000   3 5 6        0.173
7       0.184   7 8      0.758   5 6 7 8      0.173
8       0.026   6 7 8    0.173   3 6 7 8      0.173
3 5     2.245   5 7 8    2.245   3 5 7 8      2.245
3 6     0.173   5 6 8    0.000   3 5 6 8      0.173
3 7     2.245   5 6 7    0.173   3 5 6 7      0.173
3 8     3.012   3 7 8    2.245   3 5 6 7 8    0.173
5 6     0.000   3 6 8    0.173
5 7     2.245   3 6 7    0.173

The perturbation probability [10], say p, is the probability of randomly changing the expression level of genes in the model. This way, all the states in the model become reachable, and the underlying Markov chain that corresponds to the PBN becomes ergodic [10]. Ergodicity means the steady-state probability of a Markov chain can be estimated empirically. More details about this parameter of the PBN can be found in [10]. In our settings for deriving the PBN, the perturbation probability p is set to 0.01.

The error in all the conducted experiments is calculated as the percentage difference between the optimal value function of the original model and the value function of the policy found after feature reduction. In other words, the error is the percentage difference between the value functions of the policies found by following paths (a) and (b) in Fig. 1. Also, for discretization, we used interval discretization with two bins whenever necessary.

4.1. Synthetic data

We first evaluated our algorithm on some synthetic data sets generated using the algorithm proposed in [22], which is based on a regulation matrix A. Matrix A is set such that each entry $a_{ij}$ of A gives the degree of regulation of gene j on gene i, and the diagonal of A is 1, i.e., for all i, $a_{ii} = 1$. If $Y_t$ denotes the system state at time t, the next state is generated as follows:

$$Y_{t+1} = A(Y_t - N) + e \qquad (15)$$

where N is the threshold that a gene has to be above (or below) in order to affect other genes, and e is noise uniformly distributed in a specified range. In the experiments, N is set to 50 and e is randomly set in the range (−10, 10).

To generate the data sets, we used the same parameters that were used by Yu et al. [22]. By setting $Y_0$ to random values, we generated 500 samples from each network, where one sample is taken every 5 steps of the simulation. Each $a_{ij}$ is set to 0.1 or −0.1, representing positive or negative regulation,² respectively. For example, for the network shown in Fig. 3, $a_{53} = -0.1$ and $a_{14} = 0.1$.

² Notice that positive or negative regulation does not mean to always increase or decrease the expression level of the gene. The net effect depends on the value of the regulator's expression level and N.

In the figures, arrows denote positive regulation and lines with a bar denote negative regulation. Expression levels are assumed to be in the range [−100, 100]; so in the data generation process, if a value goes above (below) these limits, it is set to 100 (−100), and the noise parameter e is uniformly distributed in the range [−10, 10]. Finally, the data generated by the above simulation is discretized into binary levels.
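As a sketch of this data generation process (our own transcription of Eq. (15) together with the clipping and sampling rules above; the function and parameter names are ours):

```python
import numpy as np

def simulate(A, n_samples, every=5, N=50.0, lo=-100.0, hi=100.0, rng=None):
    """Generate synthetic expression samples via Y_{t+1} = A(Y_t - N) + e.

    One sample is kept every `every` simulation steps; values are clipped
    to [lo, hi] and e is uniform noise in (-10, 10), as described above.
    """
    rng = np.random.default_rng() if rng is None else rng
    Y = rng.uniform(lo, hi, size=A.shape[0])   # random initial state Y_0
    samples = []
    for step in range(1, n_samples * every + 1):
        e = rng.uniform(-10, 10, size=A.shape[0])
        Y = np.clip(A @ (Y - N) + e, lo, hi)   # Eq. (15) with clipping
        if step % every == 0:
            samples.append(Y.copy())
    return np.array(samples)
```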

In all of the experiments involving synthetic data, the objective is intervening on the first gene and down-regulating the second gene. For this objective, we assigned a negative reward of −5 to the states in which the expression level of the second gene is 1, and a reward of 0 otherwise. There are two actions, where one is the intervention on the first gene, whose cost is 1, and the other is the costless monitoring action.

The first set of data is generated from the network shown in Fig. 3(a), which represents matrix A in Eq. (15). As can be seen in Fig. 3, there are two components in the network connected via gene 3. So, we can expect that a subset of genes, namely {4, 5, 6, 7}, can be ignored in finding a control policy because they seem to be less related to the control and reward genes, namely genes 1 and 2.

The IS values for all genes are shown in Table 1; these values show that the lowest scored genes are 5 and 6. If Th is given as 1, the set that will be chosen is {5, 6}. Table 2 shows the errors associated with the different gene subsets; only subsets that have an error of less than 10% are listed. From the results in Table 2, it can easily be seen that {5, 6} is one of the three best subsets.

The second set of synthetic data is generated from the network shown in Fig. 3(b). In this network, the expression level of the second gene is to be controlled indirectly. The first gene is connected to the reward gene through genes 4 and 8. This time, a subset of the genes, namely {3, 5, 6, 7}, is expected to be the candidates for removal. The results are shown in Tables 3 and 4. Genes 5 and 7 have the lowest scores, so they can be the candidates for removal. And if we check Table 4, we see that the best subset is {5}, followed by {7} and {5, 7}. This means that if Th is set to 2 for the data derived from this network, the subset {5, 7} will be chosen for removal; it is the third best subset out of 64 subsets.

Table 3
Influence scores of genes in network 2.

Gene   3      4      5      6      7      8
Score  2.837  2.502  1.944  2.623  1.534  2.875

Table 4
Gene subsets with error less than 10% for network 2.

Subset  Error   Subset   Error   Subset     Error
3       3.177   3 7      1.246   3 6 7      2.004
5       0.085   5 6      4.812   3 5 7      0.858
6       3.990   5 7      0.429   3 5 6      4.235
7       0.287   6 7      2.004   3 5 6 7    2.004
3 5     2.295   5 6 7    2.004
3 6     4.985   4 5 7    9.517

Table 5
Influence scores of genes for melanoma data.

Gene   3      4      5      6      7
Score  0.775  0.911  0.333  0.526  1.333

Table 7
Influence scores of genes for yeast data.

Gene   FKH2   MBP1   MCM1   NDD1   SKN7   STB1   FKH1   SWI5   SWI6
Score  1.605  0.836  0.610  1.128  1.312  0.642  1.489  0.471  1.226

Table 8
Subset errors of genes for yeast data.

Subset  Error   Subset           Error
FKH2    9.684   FKH1             5.049
MBP1    0.123   SWI5             1.506
MCM1    2.399   SWI6             1.821
NDD1    5.119   SWI5 MCM1        5.409
SKN7    2.062   SWI5 MCM1 FKH1   9.319
STB1    1.372

4.2. Gene expression data

4.2.1. Metastatic melanoma

In this section, we report the results of applying the gene selection algorithm proposed in this paper to gene expression data produced in a study of metastatic melanoma [23]. The data was also used by Pal et al. [24] for deriving a PBN model; seven genes were chosen from the whole data set based on their ability to predict the state of each other; these genes are pirin, WNT5A, S100P, RET1, MART1, HADHB and STC2. The objective here is specified as down-regulating WNT5A, and pirin is used as the control gene, the same as in [7]. The reward function is set in the same way as in the synthetic experiments. The data is relatively small compared to the synthetic data sets; it has 31 samples. This can be a disadvantage for the gene selection algorithm because the information that the data contains is small compared to the synthetic data sets. Since Pal et al. [24] were also working on binary data, the samples were discretized to two levels.

The results are given in Tables 5 and 6. Although the error rates are high in this case, there is still one subset, {5}, with minimum IS, that we can remove with an error of less than 2% given an appropriate Th.

Table 6
Gene subset errors of genes for melanoma data.

Subset  Error    Subset   Error    Subset     Error
3       20.766   4 5      12.907   3 5 7      30.179
4       14.477   3 7      27.822   3 5 6      31.739
5       1.856    3 6      34.248   3 4 7      16.549
6       19.223   3 5      21.643   3 4 6      38.641
7       39.257   3 4      29.564   3 4 5      34.974
6 7     42.276   5 6 7    43.796   3 4 5 6    27.966
5 7     38.417   4 6 7    32.570   3 4 5 7    16.549
5 6     19.980   4 5 7    31.461   3 4 6 7    16.549
4 7     31.238   4 5 6    27.966   3 5 6 7    35.619
4 6     19.935   3 6 7    44.455   4 5 6 7    31.274

The high error rates can be due to the high degree of connectivity among the selected genes. This is consistent with the information stated above that the genes were selected based on their ability to predict each other's state. Another reason can be the possibly high effects of most of the genes on the reward gene WNT5A.

4.2.2. Yeast cell cycle

In this section, we report the results of applying the proposed reduction method to a set of well-known transcription factors of budding yeast (Saccharomyces cerevisiae). These 11 transcription factors were previously identified as the important regulators of the yeast cell cycle [25]: ACE2, FKH1, FKH2, MBP1, MCM1, NDD1, SKN7, STB1, SWI4, SWI5 and SWI6. We used in this experiment the microarray data with 77 time steps developed by Spellman et al. [26]. Missing values in the data were imputed using the KNNImpute software [27]. Again, before applying our method, we first discretized the data set into binary levels by applying interval discretization.

The reward gene is set as SWI4, which is one of the important transcription factors (part of the SBF complex) that play a role in the G1 phase. The control gene is set as ACE2, which has been chosen because the PBN model derived from the data has ACE2 as one of the regulators of SWI4.³ The objective is set as down-regulating SWI4, and the reward function is set in the same way as previously described.

The IS scores of the genes are given in Table 7. SWI5 is the lowest scored gene with a score of 0.471. If this gene is eliminated, an error of 1.5% occurs (see Table 8 for the errors of some of the subsets). This error is very low and demonstrates the applicability of the proposed reduction method in case the threshold is chosen as 0.5. Eliminating SWI5 and MCM1 (with a threshold of 0.62, for instance) gives an error of 5.4%, which can be considered acceptable for some cases. The computational gain due to accepting this error rate, however, is huge; it takes 3.32 minutes to produce a solution when SWI5 and MCM1 are eliminated and 12.10 minutes when only SWI5 is eliminated, while 52.25 minutes are required without eliminating any gene. We discuss this issue further in Section 4.4.

4.3. Comparison to other methods

As mentioned in Section 1, the GRN control problem has been studied previously in the literature; however, scalability and feature reduction issues have not yet been considered for this problem. As mentioned before, feature reduction can also be performed after the modeling phase. But due to the computational gain of reduction before modeling, in this paper we eliminate irrelevant genes prior to the modeling phase (see Fig. 1).

³ Note that, to the best of our knowledge, ACE2 and SWI4 have not been identified as regulating each other. But verification of the model derived by the modeling algorithm we use here is out of the scope of this paper.

Table 9
Comparison to Shmulevich et al. [13].

                      FRGC                  PS
                      Sim. error   Time     Sim. error   Time
Network 1             48.35        4.76     50.81        48.27
Network 2             32.30        2.67     36.53        40.60
Metastatic melanoma   28.05        0.77     21.43        14.15
Yeast cell cycle      9.46         305.20   9.12         5772.83

Table 10
Elapsed time (s) for the experiments.

           Network 1   Network 2   Metastatic melanoma   Yeast cell cycle
Path (a)   19.141      19.157      4.329                 3135.258
Path (b)   1.125       1.110       1.015                 199.362

Control genes can be determined by using the influence concept [7,13], which is the underlying notion of the IS introduced in this study. The genes can also be eliminated after the modeling phase along path (a) in Fig. 1 by using the influence concept. This feature reduction method will eliminate the genes with the lowest scores, where the score is computed as in Eq. (12) except that this time the influence value from [13] is used instead of Inf(g_i, g_j). Notice that this method is different from ours in the way it is applied: it is a method that can be applied given the model, i.e., it is applied after the modeling phase. We performed a number of experiments to demonstrate how this type of elimination compares to ours. This will lead to better insight into the process and will show the effect of elimination before the modeling step.

A structure learning (or modeling) algorithm may output more than one model for a given dataset, where the produced models are equally likely. When this is the case, most modeling algorithms choose one of these models as their output. The number of equally likely models gets smaller as the number of samples in the dataset gets larger. The PBN learning algorithm that we have used in this study outputs one of the equally likely models by breaking ties randomly during construction. To eliminate the effect of this and ensure a fair comparison, we repeated the process of following the paths 20 times for each data set and report the average results; also, we sampled data sets of 1000 steps instead of the previously used 500 for the synthetic networks.

A control policy can also be evaluated by simulation: starting from a random initial state, apply the policy and count the number of undesired state visits. Although comparison of the value functions is more accurate, this type of evaluation provides a kind of weighted difference between the value functions by eliminating the effect of states that are hardly visited. So, if the steady-state probability of a state in a model is small, then the effect of the value function difference (if any) for that state will also be small. This type of evaluation has the advantage of being faster compared to the value determination process; value determination for a given policy requires a very long time to execute. So, we have chosen the number of undesired state visits in the simulation as the evaluation metric for this experiment with 20 iterations. Each starting from a random initial state, we performed five simulations of 1000 steps and averaged the number of undesired state visits. The results are given as the average percentage of undesired state visits in 1000 steps.
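As a sketch of this evaluation protocol (our own illustration; the state encoding and names are assumptions), the per-run statistic can be computed as:

```python
import numpy as np

def undesired_visit_rate(T, policy, undesired, n_steps=1000, rng=None):
    """Simulate a policy and return the percentage of undesired state visits.

    T: transitions, shape (|S|, |A|, |S|); policy: array state -> action;
    undesired: set of undesired state indices.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_states = T.shape[0]
    s = int(rng.integers(n_states))            # random initial state
    visits = 0
    for _ in range(n_steps):
        s = int(rng.choice(n_states, p=T[s, policy[s]]))
        visits += s in undesired
    return 100.0 * visits / n_steps
```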

For this experiment, we used the same four data sets. The control problems are defined in exactly the same way as before. To be able to make a fair comparison, we chose the threshold values so that two genes are eliminated for each of the methods. We call the method that is based on choosing the control gene as in Pal et al. [7] and Shmulevich et al. [13] PS (combining the first letters of the first authors' names). Note that this time, path (a) also has a feature reduction step applied after model generation (specifically after Model(M) is obtained in Fig. 1).

The results are given in Table 9, where the Sim. error column gives the simulation error defined above. Although PS has the advantage of directly using the model, the results demonstrate that FRGC produces results comparable to PS, sometimes even better. This shows that focusing on the important parts of the model by eliminating irrelevant genes provides a reliable model reduction method for control. In addition, as previously emphasized, value function errors are more accurate, but may correspond to very low errors in practice, as seen in the yeast cell cycle results in Table 9. Only the metastatic melanoma results can be considered significantly different, but notice that we have forced two-gene elimination here for the sake of comparison, instead of the one-gene elimination in Section 4.2.1. The table also includes the total time of following path (a) with PS and path (b) with FRGC in Fig. 1. The execution time results demonstrate the computational gain of FRGC compared to PS.

4.4. Time complexity

The complexity of deriving a BN under the best-fit extension paradigm is given in [21] as $O\big(\binom{n}{k} \cdot n \cdot m \cdot \mathrm{poly}(k)\big)$, where n is the number of genes, k is the number of predictors (regulators) for each gene, m is the number of samples in the data and poly(k) is a polynomial function of k, which is in most cases equal to k. Deriving a PBN adds an additional cost of $O\big(\binom{n}{k} \cdot n_f \cdot n\big)$, where $n_f$ is the number of functions for each target gene, because for each gene we are choosing $n_f$ functions out of $\binom{n}{k}$. The last two steps in path (a) of Fig. 1 (the construction of the MDP and value iteration) have equal complexity of $O(a \cdot 4^n)$ for the binary case, where a is the number of actions. So, the dominating term in the total complexity of path (a) is $O(4^n)$ for k < n (which is generally the case for GRNs). The complexity of computing IS(g) is $O(n \cdot m)$, since it depends on all genes other than g and the sufficient statistics for Inf(g, g_i) are collected from the data in one pass. Since we compute IS for all genes and remove the l selected genes in the algorithm, the total complexity of feature reduction is $O(n^2 \cdot m + l \cdot n \cdot m)$, assuming no clever data structures for shifting the columns of a multi-dimensional array. So, the total complexity of both paths in Fig. 1 is dominated by the $O(4^n)$ term for the binary case. The feature reduction algorithm does not dominate the overall complexity even if structured representations that may have lower average-case complexity in terms of n are used in solving the MDP (provided that $k \geq 2$ in PBN modeling, which is usually the case).

The main purpose of performing feature reduction is to achieve a speed-up in reaching the policy with a tolerable error rate. So, Table 10 contains the elapsed time for finding the policy for each of the data sets used in the experiments. From the results reported in Table 10, it can easily be seen that there is a significant decrease in time. These results are also in agreement with the complexity analysis.

5. Conclusions and future research directions

In this paper, we proposed a feature reduction based method to handle the problem of finding approximate solutions to the control of GRNs. For each gene, a score is computed to estimate whether the gene is relevant in solving the resulting MDP. The score is based on MDP minimization theory and an estimation of the degree to which genes determine the next state of each other. The results are promising in the sense that, given a threshold value, the score can be used to remove some genes with minimum relevance. As a result, the study conducted in this paper showed the possibility of identifying some genes that do not much influence the control of the GRN; this could lead to a less costly and more manageable control process because of the reduced state space. In the end, the whole process depends on the value of the threshold used to decide on the genes to eliminate; therefore it is important to have domain knowledge for better specification of the threshold.

The successful results reported in this paper have motivated us to extend this work in different directions. Although the algorithm is good at finding some less important genes, the order relationship among genes in the error rates cannot in general be captured by the score function. To give exact solutions, or to be able to give an error bound, the score must always be directly proportional to the error. Also, a score function that gives the score of a set of genes instead of a single gene may improve the results, because the summation of the scores of the genes in a set may not always be proportional to the error of that set. We are also working on an automated method to determine the threshold value. Solving the constructed MDP over a finite horizon is another extension we are considering; investigating the effect of the horizon on the quality of the solution can bring new insights to the problem. Finally, incorporating other biological information (pathway information, for instance) while determining the genes to eliminate is also among our plans.

Acknowledgments

The research of Mehmet Tan is partially supported by The Scientific and Technological Research Council of Turkey. The research of Reda Alhajj is partially supported by NSERC, Canada.

References

[1] de Jong H. Modeling and simulation of genetic regulatory systems: a literature review. Journal of Computational Biology 2002;9(1):67–103.

[2] Abul O, Alhajj R, Polat F. Markov decision processes based optimal control policies for probabilistic Boolean networks. In: Sheu PCY, Zhang D, Karypis G, Chen P, Hwang M-J, Tang C-Y, Wang CS, Horng J-T, Hsu F-R, Brause RW, editors. Proceedings of the IEEE symposium on bioinformatics and bioengineering. Taichung, Taiwan: IEEE Computer Society; March 2004. p. 337–44.

[3] Abul O, Alhajj R, Polat F. An optimal multi-objective control method for discrete genetic regulatory networks. In: Bourbakis NG, Raymer M, Karypis G, editors. Proceedings of the IEEE symposium on bioinformatics and bioengineering. Arlington, Virginia: IEEE Computer Society; October 2006. p. 81–4.

[4] Choudhary A, Datta A, Bittner ML, Dougherty ER. Intervention in a family of Boolean networks. Bioinformatics 2006;22(2):226–32.

[5] Datta A, Choudhary A, Bittner ML, Dougherty ER. External control in Markovian genetic regulatory networks. Machine Learning 2003;52(1–2):169–91.

[6] Datta A, Choudhary A, Bittner ML, Dougherty ER. External control in Markovian genetic regulatory networks: the imperfect information case. Bioinformatics 2004;20(6):924–30.

[7] Pal R, Datta A, Bittner ML, Dougherty ER. Intervention in context-sensitive probabilistic Boolean networks. Bioinformatics 2005;21(7):1211–8.

[8] Pal R, Datta A, Dougherty ER. Optimal infinite-horizon control for probabilistic Boolean networks. IEEE Transactions on Signal Processing 2006;54(6):2375–87.

[9] Shmulevich I, Dougherty ER, Kim S, Zhang W. Control of stationary behaviour in probabilistic Boolean networks by means of structural intervention. Journal of Biological Systems 2002;10(4):431–46.

[10] Shmulevich I, Dougherty ER, Zhang W. Gene perturbation and intervention in probabilistic Boolean networks. Bioinformatics 2002;18(10):1319–31.

[11] Kulkarni VG. Modeling and analysis of stochastic systems. Boca Raton, FL, USA: Chapman & Hall/CRC Texts in Statistical Science; 1996.

[12] Sutton RS, Barto AG. Reinforcement learning. Cambridge, MA, USA: The MIT Press; 1998.

[13] Shmulevich I, Dougherty ER, Kim S, Zhang W. Probabilistic Boolean networks: a rule-based uncertainty model for gene regulatory networks. Bioinformatics 2002;18(2):261–74.

[14] Tan M, Polat F, Alhajj R. Feature reduction for gene regulatory network control. In: Yang JY, Yang MQ, Zhu MM, Zhang Y, Arabnia HR, Deng Y, Bourbakis N, editors. Proceedings of the IEEE seventh international symposium on bioinformatics and bioengineering, Harvard Medical School. Boston, MA: IEEE Press; October 2007. p. 1260–4.

[15] Tan M, Alhajj R, Polat F. Large-scale approximate intervention strategies for probabilistic Boolean networks as models of gene regulation. In: Nikita KS, Roa L, Fotiadis D, Alterovitz G, editors. Proceedings of the IEEE symposium on bioinformatics and bioengineering. Athens, Greece: IEEE Press; October 2008. p. 1–6.

[16] Bellman RE. Dynamic programming. Princeton, New Jersey: Princeton University Press; 1957.

[17] Guyon I, Elisseeff A. An introduction to variable and feature selection. Journal of Machine Learning Research 2003;3:1157–82.

[18] Givan R, Dean T, Greig M. Equivalence notions and model minimization in Markov decision processes. Artificial Intelligence 2003;147(1–2):163–223.

[19] Dougherty ER, Kim S, Chen Y. Coefficient of determination in nonlinear signal processing. Signal Processing 2000;80(10):2219–35.

[20] Kim S, Li H, Dougherty ER, Cao N, Chen Y, Bittner M, Suh EB. Can Markov chain models mimic biological regulation? Journal of Biological Systems 2002;10(4):337–57.

[21] Lahdesmaki H, Shmulevich I, Yli-Harja O. On learning gene regulatory networks under the Boolean network model. Machine Learning 2003;52(1–2):147–67.

[22] Yu J, Smith V, Wang P, Hartemink A, Jarvis E. Advances to Bayesian network inference for generating causal networks from observational biological data. Bioinformatics 2004;20(18):3594–603.

[23] Bittner M, Meltzer P, Chen Y, Jiang Y, Seftor E, Hendrix M, Radmacher M, Simon R, Yakhini Z, Ben-Dor A, Sampas N, Dougherty E, Wang E, Marincola F, Gooden C, Lueders J, Glatfelter A, Pollock P, Carpten J, Gillanders E, Leja D, Dietrich K, Beaudry C, Berens M, Alberts D, Sondak V, Hayward N, Trent J. Molecular classification of cutaneous malignant melanoma by gene expression profiling. Nature 2000;406:536–40.

[24] Pal R, Ivanov I, Datta A, Bittner ML, Dougherty ER. Generating Boolean networks with a prescribed attractor structure. Bioinformatics 2005;21(21):4021–5.

[25] Yang Y-L, Suen J, Brynildsen M, Galbraith S, Liao J. Inferring yeast cell cycle regulators and interactions using transcription factor activities. BMC Genomics 2005;6(1):90–104.

[26] Spellman P, Sherlock G, Zhang M, Iyer V, Anders K, Eisen M, Brown P, Botstein D, Futcher B. Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular Biology of the Cell 1998;9(12):3273–97.

[27] Troyanskaya O, Cantor M, Sherlock G, Brown P, Hastie T, Tibshirani R, Botstein D, Altman RB. Missing value estimation methods for DNA microarrays. Bioinformatics 2001;17(6):520–5.