Putting genetic interactions in context through a global modular decomposition Jamal


Motivation

• Genetic interactions provide a powerful perspective on how genes function.

• The specific mechanisms that give rise to these interactions are not well understood.

• Understanding them requires a thorough study of the structure of genetic interaction networks.

This study

• This study uses a data-mining approach to explore all block structures within this network.

Characteristics

• Genetic interaction: “Multiple genetic perturbations whose combination result in a phenotype that is unexpected given the phenotypes of the individual perturbations”

• The redundancies and dependencies within a genetic network can provide a powerful means for functional characterization.

• Unlike the PPI network, there is no obvious functional interpretation of a single genetic interaction, either negative or positive.

• A genetic interaction between two genes does not imply that they interact physically; it simply suggests that they share some kind of functional relationship.

Modular hypothesis

• Genes belong to different types of functional modules.

• For example: protein complexes, pathways, etc.

Negative between-pathway model

• Negative interactions are thought to arise between functionally redundant pathways: deleting any pair of genes that spans the two pathways results in a significant reduction of fitness.

Positive within-pathway model

• Positive interactions are thought to arise within a single pathway: once one deletion has compromised the pathway, a second deletion in that same pathway does not result in any additional fitness defect.

Biclusters as block patterns in the network

• A bicluster consists of two sets of genes, which can be overlapping or disjoint.

• Every gene in one set is connected to every gene in the other set.

• Pu et al. (2008) designed an algorithm that starts from a random initial bicluster and rediscovers the prominent biclusters many times.

• In this study, the authors used an approach based on an algorithm from the field of association rule mining to find all biclusters of sufficient size.
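The bicluster definition above can be made concrete with a small check. This is only an illustrative sketch: the function name `is_bicluster`, the edge-list representation, and the gene labels are assumptions made for the example, not part of the study.

```python
from itertools import product


def is_bicluster(edges, set_a, set_b):
    """Return True if every gene in set_a interacts with every gene in set_b.

    edges is an iterable of (gene, gene) pairs, treated as undirected.
    Pairs of a gene with itself are skipped, so the two sets may overlap.
    """
    undirected = {frozenset(edge) for edge in edges}
    return all(
        frozenset((a, b)) in undirected
        for a, b in product(set_a, set_b)
        if a != b
    )


# Hypothetical toy network: g1 and g2 each interact with g3 and g4.
toy_edges = [("g1", "g3"), ("g1", "g4"), ("g2", "g3"), ("g2", "g4")]
print(is_bicluster(toy_edges, {"g1", "g2"}, {"g3", "g4"}))
```

A check like this only verifies one candidate pair of gene sets; finding all such pairs is the hard part, which is what the mining step addresses.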

Approach Summary: Bicluster Discovery

• Recent data from Costanzo et al. (2010) was used in this study. The approach uses the Apriori algorithm from the field of association rule mining to discover all biclusters.

• Biclusters that can be explained by the degree distribution alone were filtered out using a non-parametric statistical assessment.

XMOD

• The approach, XMOD (eXhaustive Modular Discovery), is guaranteed to find all bipartite graphs in which one part of the bipartite graph acts as a functional unit.

Presence of degree-distribution-based biclusters

• Even after the edges were randomized, bipartite graphs were still obtained, suggesting that biologically meaningless bipartite graphs can exist.

• Each bicluster is assigned a score; biologically meaningful biclusters have lower scores.

• Score: “the product of probabilities of each edge occurring independently conditioned on the degree of two interacting genes”

• Filtered biclusters: a cutoff on the independence score is applied to select the biclusters with lower scores.

• Condensed biclusters: the set that remains after removing biclusters with more than 40% overlap.
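As a rough illustration of the independence score, the sketch below multiplies per-edge probabilities in log space. The slides do not state how the per-edge probability is computed, so the form used here, p(i, j) = min(1, d_i * d_j / 2m) with d the gene degree and m the number of edges, is an assumption borrowed from the standard configuration-model approximation. The function name and toy data are also hypothetical.

```python
import math
from collections import Counter


def log_independence_score(edges, set_a, set_b):
    """Sum of log edge probabilities over all gene pairs in a bicluster.

    ASSUMPTION: an edge between genes i and j occurs with probability
    min(1, d_i * d_j / (2 * m)); the source does not give this formula.
    Genes in the bicluster are expected to have degree >= 1.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    two_m = 2 * len(edges)

    # Each unordered pair is counted once, even if the two sets overlap.
    pairs = {frozenset((a, b)) for a in set_a for b in set_b if a != b}
    score = 0.0
    for pair in pairs:
        u, v = tuple(pair)
        score += math.log(min(1.0, degree[u] * degree[v] / two_m))
    return score


# Hypothetical toy network: a 2 x 2 block, every gene has degree 2.
toy_edges = [("g1", "g3"), ("g1", "g4"), ("g2", "g3"), ("g2", "g4")]
toy_score = log_independence_score(toy_edges, {"g1", "g2"}, {"g3", "g4"})
```

Working with the logarithm avoids numerical underflow for large biclusters and preserves the ordering: a lower value still means the block is harder to explain by gene degrees alone.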

Comparison with other techniques

Dataset

• The dataset from Costanzo et al. (2010) was used.

• 85,714 negative interactions and 35,858 positive interactions were used.

Association rule Mining

• The Apriori algorithm (Agrawal, 1993) was used.

• A standard implementation available online was used.

• Apriori was run on the binary set of positive interactions and, separately, on the binary set of negative interactions.
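To show how frequent-itemset mining can yield biclusters, here is a minimal level-wise Apriori sketch. It is not the implementation used in the study. The mapping is an assumption made for illustration: each gene's neighbor set is treated as a transaction, a frequent itemset is a set of genes shared by at least `min_rows` transactions, and the supporting genes form the other side of the block. All names and the toy data are hypothetical.

```python
from itertools import combinations


def apriori_biclusters(neighbors, min_rows, min_cols):
    """Enumerate (rows, cols) blocks where every row gene links to every col gene.

    neighbors maps each gene to the set of genes it interacts with.
    """
    def support(itemset):
        # Genes whose neighbor set contains the whole itemset.
        return frozenset(g for g, nbrs in neighbors.items() if itemset <= nbrs)

    items = sorted({x for nbrs in neighbors.values() for x in nbrs})
    level = {}
    for x in items:
        rows = support(frozenset([x]))
        if len(rows) >= min_rows:
            level[frozenset([x])] = rows

    found = []
    k = 1
    while level:
        if k >= min_cols:
            found.extend((rows, cols) for cols, rows in level.items())
        next_level = {}
        for a, b in combinations(list(level), 2):
            cand = a | b
            if len(cand) != k + 1 or cand in next_level:
                continue
            # Apriori pruning: every k-subset of a candidate must be frequent.
            if any(cand - {x} not in level for x in cand):
                continue
            rows = level[a] & level[b]  # support of a union is the intersection
            if len(rows) >= min_rows:
                next_level[cand] = rows
        level = next_level
        k += 1
    return found


# Hypothetical toy network: g1, g2 fully connected to g3, g4; g5 links to g3 only.
toy_neighbors = {
    "g1": {"g3", "g4"},
    "g2": {"g3", "g4"},
    "g3": {"g1", "g2", "g5"},
    "g4": {"g1", "g2"},
    "g5": {"g3"},
}
toy_blocks = apriori_biclusters(toy_neighbors, min_rows=2, min_cols=2)
```

Because the network is symmetric, each block appears twice, once from each side. The sketch also reports every frequent itemset rather than only maximal ones, so a real pipeline would need a deduplication or overlap-removal step afterwards.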

Randomizing the Genetic Interaction network

• The number of edges for each gene was preserved but the targets were randomized.

• A gene cannot have an edge with itself
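One common way to randomize a network under these two constraints is repeated double-edge swapping. The slides do not say which procedure the authors used, so the sketch below is a stand-in chosen for illustration; the function names, the `n_swaps` and `seed` parameters, and the toy edges are all assumptions.

```python
import random
from collections import Counter


def degree_counts(edges):
    """Number of edges incident to each gene."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return counts


def randomize_preserving_degree(edges, n_swaps, seed=0):
    """Shuffle edge targets with double-edge swaps.

    Each accepted swap turns (a, b), (c, d) into (a, d), (c, b), which keeps
    every gene's degree unchanged. Swaps that would create a self-edge or a
    duplicate edge are rejected.
    """
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if a == d or c == b:
            continue  # would create a self-edge
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 == new2 or new1 in present or new2 in present:
            continue  # would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


# Hypothetical toy network.
toy_edges = [("g1", "g2"), ("g3", "g4"), ("g1", "g3"), ("g2", "g4"), ("g5", "g6")]
shuffled = randomize_preserving_degree(toy_edges, n_swaps=100, seed=1)
```

Whatever the exact procedure, the useful property to verify is the invariant: `degree_counts(shuffled)` should equal `degree_counts(toy_edges)`, and no edge should connect a gene to itself.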

Filtering Random bi-clusters

• We found that 50% of the real negative biclusters and 6% of real positive biclusters have scores below the 0.01 percentile of biclusters of the same size from the random networks. This resulted in 256,502 negative biclusters and 2194 positive biclusters.
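The filtering step compares each real bicluster's score with scores of same-size biclusters from the randomized networks. A minimal sketch of such an empirical cutoff follows. The function name, the quantile parameter `q` (a fraction between 0 and 1), and the sample scores are hypothetical, and the exact percentile convention used in the study is not specified here.

```python
def passes_filter(real_score, random_scores, q):
    """Return True if real_score lies below the empirical q-quantile.

    random_scores should hold scores of random-network biclusters of the
    same size as the real bicluster. q is a fraction, e.g. 0.1 for 10%.
    """
    ranked = sorted(random_scores)
    # Index of the cutoff score, clamped to the last valid position.
    k = min(int(q * len(ranked)), len(ranked) - 1)
    return real_score < ranked[k]


# Hypothetical scores from random networks: 0, 1, ..., 99.
toy_random_scores = list(range(100))
print(passes_filter(5, toy_random_scores, q=0.1))
```

With a very small `q` and few random samples, the cutoff collapses to the minimum random score, so the number of random biclusters per size class limits how strict the filter can meaningfully be.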

Removing overlap from Biclusters

• We first arranged the biclusters in descending order by area.

• Then, beginning with the first bicluster A, we removed all biclusters whose area overlap with A was greater than 0.4, where overlap between biclusters A and B was calculated using the following formula:
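The greedy procedure can be sketched as follows. The overlap measure used in the code is a hypothetical stand-in (shared cells divided by the smaller of the two areas) and should not be read as the authors' formula. A bicluster is represented as a (rows, cols) pair of gene sets, and its area is assumed to be the number of rows times the number of columns.

```python
def area(bicluster):
    """Number of gene pairs covered by a (rows, cols) bicluster."""
    rows, cols = bicluster
    return len(rows) * len(cols)


def cells(bicluster):
    """All (row gene, column gene) pairs covered by the bicluster."""
    rows, cols = bicluster
    return {(r, c) for r in rows for c in cols}


def overlap(a, b):
    # ASSUMPTION: shared cells divided by the smaller area; the source
    # formula is not available, so this measure is illustrative only.
    return len(cells(a) & cells(b)) / min(area(a), area(b))


def condense(biclusters, max_overlap=0.4):
    """Keep the largest bicluster, drop those overlapping it too much, repeat."""
    remaining = sorted(biclusters, key=area, reverse=True)
    kept = []
    while remaining:
        head = remaining.pop(0)
        kept.append(head)
        remaining = [b for b in remaining if overlap(head, b) <= max_overlap]
    return kept


# Hypothetical toy biclusters.
big = ({"g1", "g2", "g3"}, {"g4", "g5", "g6"})
nested = ({"g1", "g2"}, {"g4", "g5"})
separate = ({"g7", "g8"}, {"g9"})
condensed = condense([nested, separate, big])
```

In this toy case the nested block lies entirely inside the largest one and is removed, while the separate block shares no cells with it and is kept.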

Evaluation of Functional Coherence

• The MEFIT network is based on coexpression data and does not use genetic interaction datasets.

• Improvements?