Inferring gene regulatory networks from multiple microarray datasets (Wang 2006) Tiffany Ko ELE571 Spring 2009


Page 1: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

Tiffany Ko

ELE571

Spring 2009

Page 2: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

OUTLINE

Introduction: Gene Regulatory Networks; DNA Microarrays; Objectives

Methods: Approach (SVD); GNR Algorithm; Confidence Evaluation

Results: Simulated Data; Experimental Data

Discussion: Limitations; Conclusions

Page 3: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

INTRO

Page 4: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

GENE REGULATORY NETWORKS

http://upload.wikimedia.org/wikipedia/commons/0/07/Gene.png

http://upload.wikimedia.org/wikipedia/commons/thumb/a/a7/Gene2-plain.svg/708px-Gene2-plain.svg.png

Page 5: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

GENE REGULATORY NETWORKS

http://upload.wikimedia.org/wikipedia/commons/thumb/d/df/Gene_Regulatory_Network_2.jpg/800px-Gene_Regulatory_Network_2.jpg

Page 6: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

GENE REGULATORY NETWORKS

http://www.pnas.org/content/104/31/12890/F2.large.jpg

Page 7: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

DNA MICROARRAYS

Y-direction: genes

X-direction: data points

M x N matrix S: M genes, N experiments

Expression (color magnitude) is representative of the number of probes that have bound to complementary DNA templates present in the sample.

High number of genes, low number of samples/data points.

Page 8: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

OBJECTIVES

Purpose: Construct a novel method of gene network reconstruction (GNR) that can process multiple microarray datasets from different experiments to infer the most consistent gene network (GN), while taking the sparsity of connections into consideration.

Motivation:

Multiple datasets: address data scarcity and the “dimensionality problem”

Improve the reliability of the inferred gene network

Derive gene networks with more biologically plausible sparsity

Page 9: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

METHODS

Page 10: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

APPROACH

1. Express Gene Networks (GN) as differential equations.

2. Derive a solution for a single time-course dataset using singular value decomposition (SVD).

3. Find the most consistent network structure with respect to all datasets.

Optimal solution has minimal connections (edges).

Page 11: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

APPROACH

1. Express Gene Networks (GN) as differential equations.

Gene regulation dynamics are typically nonlinear; however, linear equations capture the main features of the network.
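For concreteness, a linear model of this kind is commonly written as below. The symbols are assumptions chosen for illustration: x(t) is the vector of expression levels of the n genes at time t, J is the n x n connectivity matrix (J_ij is the net effect of gene j on gene i), b(t) collects external stimuli, and m is the number of time points in one dataset.

```latex
\frac{dx(t)}{dt} = J\,x(t) + b(t), \qquad t = t_1, \dots, t_m
\quad\Longrightarrow\quad
\dot{X} = J X + B, \qquad X = \bigl(x(t_1), \dots, x(t_m)\bigr) \in \mathbb{R}^{n \times m}
```

With far fewer time points than genes (m < n), this linear system has infinitely many solutions for J, which is why a further selection criterion is needed.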

Page 12: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

APPROACH

2. Derive a solution for a single time-course dataset using singular value decomposition (SVD).

Page 13: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

APPROACH

2. Derive a solution for a single time-course dataset using singular value decomposition (SVD).

SVD:

Nonzero elements ek are listed last, s.t. e1 = … = el = 0 and el+1, … , en ≠ 0.

This yields the particular solution Ĵ, the connectivity matrix with the smallest L2 norm.
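A minimal NumPy sketch of this step, assuming the linear system J X = X' - B with genes as rows and time points as columns. The function name `min_norm_connectivity`, the tolerance, and the random test data are hypothetical choices made for illustration, not part of the original slides.

```python
import numpy as np

def min_norm_connectivity(X, Xdot, B=None, tol=1e-10):
    """Solve J @ X = Xdot - B for the minimum-norm J via SVD of X.T.

    X, Xdot, B: arrays of shape (n_genes, n_timepoints).
    Returns (J_hat, V_null): the particular solution and an orthonormal
    basis (as columns) of the directions the data leave undetermined.
    """
    if B is None:
        B = np.zeros_like(Xdot)
    # X.T = U @ diag(e) @ Vt. NumPy sorts singular values in descending
    # order, so zeros come last (the slide lists the nonzero ones last).
    U, e, Vt = np.linalg.svd(X.T, full_matrices=True)
    rank = int(np.sum(e > tol * e.max()))
    # Invert only the nonzero singular values; zero ones are left out.
    J_hat = (Xdot - B) @ U[:, :rank] @ np.diag(1.0 / e[:rank]) @ Vt[:rank, :]
    V_null = Vt[rank:, :].T
    return J_hat, V_null


rng = np.random.default_rng(0)
n_genes, n_points = 5, 4                         # more genes than time points
J_true = rng.normal(size=(n_genes, n_genes))
X = rng.normal(size=(n_genes, n_points))
Xdot = J_true @ X                                # noise-free derivatives, B = 0

J_hat, V_null = min_norm_connectivity(X, Xdot)
Y = rng.normal(size=(n_genes, V_null.shape[1]))  # arbitrary coefficients
J_other = J_hat + Y @ V_null.T                   # another exact solution

print(np.allclose(J_hat @ X, Xdot), np.allclose(J_other @ X, Xdot))
print(np.linalg.norm(J_hat) <= np.linalg.norm(J_other))
```

Every matrix of the form `J_hat + Y @ V_null.T` fits the data equally well; the freedom in `Y` is what the later optimization over multiple datasets exploits.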

Page 14: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

APPROACH

2. Derive a solution for a single time-course dataset using singular value decomposition (SVD).

Page 15: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

APPROACH

3. Find the most consistent network structure with respect to all datasets.

Multiple (N) microarray datasets exist for one organism; each dataset k corresponds to its own general solution, Jk.

Jk is already normalized in time due to the definition of X’.

LP problem posed:

Page 16: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

APPROACH

3. Find the most consistent network structure with respect to all datasets.

Matching Term: Matches the most consistent solution with dataset k’s solution. Weighted by reliability.

Sparsity Term: Forces sparsity by minimizing the L1 norm. Relative importance balanced by λ.
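One way to write an objective with these two terms is sketched below. The notation is an assumption for illustration: w_k is the reliability weight of dataset k, λ is the sparsity trade-off parameter, Ĵ^k is the particular (minimum-norm) solution for dataset k, and Y^k holds the free coefficients multiplying the undetermined directions V^k of that dataset. The exact placement of constants may differ from the original formulation.

```latex
\min_{J,\;Y^{1},\dots,Y^{N}}\quad
\underbrace{\sum_{k=1}^{N} w_k \sum_{i,j} \bigl| J_{ij} - J^{k}_{ij} \bigr|}_{\text{matching term}}
\;+\;
\underbrace{\lambda \sum_{i,j} \bigl| J_{ij} \bigr|}_{\text{sparsity term}},
\qquad
J^{k} = \hat{J}^{k} + Y^{k} (V^{k})^{\mathsf{T}}
```

Because both terms are sums of absolute values of affine expressions, the problem can be rewritten as a linear program by introducing one auxiliary variable per absolute value.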

Page 17: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

GNR ALGORITHM

When J is fixed, the problem can be divided into N independent subproblems, one per dataset.

Through iteration, J will then be updated based on the results for Y.

STEP 0: Initialize; set iteration index q = 1.

STEP 1: Fix J(q-1); solve the N subproblems for the per-dataset solutions.

STEP 2: Fix the per-dataset solutions; solve for J(q).

STEP 3: Check for convergence; else return to STEP 1.

Page 18: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

ALGORITHM: STEP 0

Initialize:

Using SVD, solve for the particular solution of each dataset.

Set initial values:

Ensure given parameters are positive.

q = iteration index; set q = 1.

Page 19: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

ALGORITHM: STEP 1

Update Jk: At iteration q, with J(q-1) fixed, solve an LP for each dataset k:

Page 20: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

ALGORITHM: STEP 2 & 3

STEP 2: Having solved for each Jk(q), fix all of them and solve for J(q):

STEP 3: Check for convergence.

Is the convergence criterion satisfied?

Yes: Terminate computation. No: Return to STEP 1.
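STEP 2 can be illustrated without an LP solver under one assumption: that the consensus objective has the form sum over k of w_k |J_ij - Jk_ij| plus lam |J_ij|, summed over all elements. That form separates element by element, and each scalar problem is a weighted median in which lam acts as the weight of one extra observation at 0. The names `weighted_median`, `update_consensus`, `objective`, `weights`, and `lam` are hypothetical, and this closed form is a stand-in for the LP on the slide, not a transcription of it.

```python
import numpy as np

def weighted_median(values, weights):
    """Return a minimizer of sum_i weights[i] * |x - values[i]|."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    cum = np.cumsum(weights[order])
    # First sorted value whose cumulative weight reaches half the total.
    idx = np.searchsorted(cum, 0.5 * cum[-1])
    return values[order][idx]

def update_consensus(J_list, weights, lam):
    """STEP 2 sketch: consensus J for fixed per-dataset solutions J_list."""
    J_stack = np.stack(J_list)                    # shape (N, n, n)
    n_rows, n_cols = J_stack.shape[1:]
    w = np.append(np.asarray(weights, dtype=float), lam)
    J = np.empty((n_rows, n_cols))
    for i in range(n_rows):
        for j in range(n_cols):
            # The sparsity term behaves like an observation at 0 with weight lam.
            candidates = np.append(J_stack[:, i, j], 0.0)
            J[i, j] = weighted_median(candidates, w)
    return J

def objective(J, J_list, weights, lam):
    """Assumed objective: weighted L1 matching term plus L1 sparsity term."""
    matching = sum(w * np.abs(J - Jk).sum() for w, Jk in zip(weights, J_list))
    return matching + lam * np.abs(J).sum()
```

Raising `lam` pulls more elements of the consensus to exactly zero, while `lam = 0` reduces each element to the plain weighted median of the per-dataset values.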

Page 21: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

GNR ALGORITHM OVERVIEW

Page 22: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

CONFIDENCE EVALUATION

Given the optimal solution J, we can compute for each element Jij:

Variance

Deviation

Overall average deviation:
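The formulas for these quantities are not preserved here. As an illustration only, the sketch below assumes that the variance and deviation of each element are taken across the N per-dataset solutions around the consensus value, and that the overall figure is the mean deviation over all elements. The function name `edge_confidence` and its arguments are hypothetical.

```python
import numpy as np

def edge_confidence(J, J_list):
    """Spread of per-dataset solutions around the consensus J.

    Assumed definitions (not taken from the slides):
      variance[i, j]  = mean over k of (Jk[i, j] - J[i, j]) ** 2
      deviation[i, j] = mean over k of |Jk[i, j] - J[i, j]|
      overall         = mean of deviation over all i, j
    """
    diff = np.stack(J_list) - J            # shape (N, n, n)
    variance = np.mean(diff ** 2, axis=0)
    deviation = np.mean(np.abs(diff), axis=0)
    return variance, deviation, float(deviation.mean())
```

Under these assumed definitions, elements with small variance are the ones on which the datasets agree, so they are the edges inferred with the most confidence.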

Page 23: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

RESULTS

[Figure: reconstructed networks compared with the true network. Panels: λ = 0, 1 dataset; λ = 0, 2 datasets; λ = 0, 3 datasets; λ = 0.3, 3 datasets; True Network.]

Page 24: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

SIMULATED DATA

Constructed a small simulated network with five genes and a noise function of time t:

Randomly chose 3 initial starting conditions.

Produced 3 datasets with 4, 4, and 3 time points, respectively.
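A setup of this shape can be imitated as follows. Everything specific in the sketch is an assumption: the random sparse 5-gene matrix, the step size, forward-Euler integration, and the Gaussian noise scale are hypothetical choices, not the network or noise model used on the slide.

```python
import numpy as np

def simulate_dataset(J, x0, n_points, dt=0.1, noise_sd=0.0, rng=None):
    """Forward-Euler time course of dx/dt = J x + noise.

    Returns an array of shape (n_genes, n_points).
    """
    rng = np.random.default_rng() if rng is None else rng
    n_genes = J.shape[0]
    X = np.empty((n_genes, n_points))
    X[:, 0] = x0
    for t in range(1, n_points):
        noise = noise_sd * rng.normal(size=n_genes)
        X[:, t] = X[:, t - 1] + dt * (J @ X[:, t - 1] + noise)
    return X

def finite_difference(X, dt=0.1):
    """Approximate derivatives between consecutive time points."""
    return np.diff(X, axis=1) / dt


rng = np.random.default_rng(42)
n_genes = 5
# Hypothetical sparse network: self-decay plus a few random interactions.
mask = rng.random((n_genes, n_genes)) < 0.2
J = -np.eye(n_genes) + rng.normal(size=(n_genes, n_genes)) * mask

# Three random initial conditions, with 4, 4, and 3 time points.
datasets = [
    simulate_dataset(J, rng.normal(size=n_genes), n, noise_sd=0.05, rng=rng)
    for n in (4, 4, 3)
]
print([X.shape for X in datasets])
```

Each dataset alone has fewer time points than genes, so none of them determines the network by itself; that is the under-determined setting the method is meant to handle.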

Page 25: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

SIMULATED DATA

Assessed network recovery ability (Yeung 2002 criterion):

Assessed accuracy of GNR

Page 26: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

SIMULATED DATA

No sparsity or noise factor. Variant: # of datasets.

[Figure panels: True Network; λ = 0, noise = 0, 1 dataset; λ = 0, noise = 0, 2 datasets; λ = 0, noise = 0, 3 datasets.]

Page 27: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

SIMULATED DATA

Gaussian noise distribution. Variant: # of datasets, λ.

[Figure panels: λ = 0, 1 dataset; λ = 0, 2 datasets; λ = 0, 3 datasets; λ = 0.3, 3 datasets; True Network.]

Page 28: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

SIMULATED DATA

Adding datasets improves the accuracy of network reconstruction.

GNR must balance topology reconstruction accuracy against interaction strength accuracy.

λ controls the trade-off between E0 and E1 (or E2).

Adding datasets improves the confidence of network reconstruction.

Page 29: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

SIMULATED DATA

GNR is able to accurately infer the GN solution to a highly under-determined problem given datasets with few time points and differing initial conditions.

Network topology may still be correctly inferred in the presence of high noise by including a sparsity constraint, at the expense of interaction strength accuracy.

Larger simulated network structures were tested with similarly effective results.

Page 30: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

EXPERIMENTAL DATA

Heat-Shock Response Data for Yeast

10 transcription factors

4 microarray datasets (Stanford Microarray Database) with 7, 5, 5, and 4 time points

Correctly inferred 4 edges with documented, known regulation, and 1 edge with documented potential regulation.

Page 31: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

EXPERIMENTAL DATA

Cell-cycle Data for Yeast

140 differentially expressed genes

4 datasets with differing experimental conditions

Constructed sub-GN involving several genes with proven function within cell wall organization.

(Circles in same color indicate same biological function.)

Page 32: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

EXPERIMENTAL DATA

Stress Response Data for Arabidopsis

Root experiments: 226 genes; Shoot experiments: 246 genes

9 datasets with 6+ time points for each of root and shoot (www.arabidopsis.org)

Page 33: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

DISCUSSION

Page 34: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

LIMITATIONS

Assumes the regulatory network remains stationary regardless of differing environmental conditions.

Requires high-resolution, high-quality time-course datasets.

Noise in gene expression data, intrinsic to microarray technologies, is a major source of error.

Hidden regulatory factors may lead to implicit description errors.

Inferred GN models predict, indiscriminately, both direct and indirect regulations due to hidden variables.

Model edges correspond to the net effect.

A predicted regulatory relationship does not inherently correspond to regulation by a transcription factor.

Page 35: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

CONCLUSIONS

Created a novel method to derive GN substructure using multiple microarray datasets instead of aligning multiple inferred networks.

Model can capture regulatory mechanisms at the protein and metabolite levels which cannot be physically measured.

Capable of deriving a more global structure with dense connections, as well as more local substructures with sparse connections, by modifying the trade-off parameter, λ.

Model is used most effectively in tandem with other information sources.

FUTURE WORK: Extend GNR to identify conserved network patterns or motifs from the datasets of differing species.

Page 36: Inferring gene regulatory networks from multiple microarray datasets (Wang 2006)

THE END

Thank you for listening!