Gene Discovery from Microarray Images

陳朝欽、高成炎、張春梵
ARCNTU, NTU-Hospital

Project#: 93-EC-17-A-19-S1-0016


Motivation and Data Acquisition

• Part of our current work aims to discover a subset of genes related to specific diseases, such as hepatoma and gastric cancer, through microarray experiments. To that end, we collect spot signal intensities from cDNA microarray images obtained through a sequence of biological experiments.


A Paradigm for Microarray Image Data Analysis


Outline

• Microarray Image Data Acquisition

• Gridding for Image Segmentation

• Normalization from MA-Plot

• Finding Differentially Expressed Genes

• Finding Discriminative Genes

• Performance Evaluation by Dendrogram and K-means Algorithms


A Look at a Microarray Slide


Examples of Microarray Images


Gridding for Spot Segmentation


Gridding for a Block of 30*9 Spots


Spot Feature Computation

• Cy3 (for Column 1) 639 54879 5980 1984 324 910 2153 236

• Cy5 (for Column 6) 104 52858 567 189 36 1489 5083 407


M-A plot and Piecewise Normalization


Normalized Ratio from MA-Plot


Pre-Processing / Normalization

• Because of measurement error and other unavoidable factors, raw data collected directly from experiments may contain noise, may be on different scales, or may have missing items. A pre-processing step that filters out inappropriate data, or a normalization step, may therefore be applied.


Spot Features for Gene Discovery

Cy3      Cy5
201      67
520      153
28276    21747
4072     6324
14807    690
1058     1451
572      524

M = log2(Cy3) − log2(Cy5)

A = (log2(Cy3) + log2(Cy5)) / 2

The program compustt.c computes spot features, pieceline.c does normalization, and maplot.c does the M-A plot.
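
The M and A definitions on this slide can be sketched in a few lines of Python. This is only an illustration applied to the Cy3/Cy5 pairs listed above; it is not the authors' compustt.c or maplot.c, and the names `SPOTS` and `ma_values` are hypothetical.

```python
import math

# Cy3/Cy5 spot intensities copied from the slide.
SPOTS = [(201, 67), (520, 153), (28276, 21747), (4072, 6324),
         (14807, 690), (1058, 1451), (572, 524)]


def ma_values(cy3, cy5):
    """Return (M, A) for one spot, using the definitions on the slide."""
    log_cy3 = math.log2(cy3)
    log_cy5 = math.log2(cy5)
    m = log_cy3 - log_cy5           # M: log2 ratio of the two channels
    a = (log_cy3 + log_cy5) / 2.0   # A: mean log2 intensity
    return m, a


if __name__ == "__main__":
    for cy3, cy5 in SPOTS:
        m, a = ma_values(cy3, cy5)
        print(f"Cy3={cy3:6d}  Cy5={cy5:6d}  M={m:+.3f}  A={a:.3f}")
```

Plotting M against A for every spot gives the M-A plot; spots with Cy3 > Cy5 fall above the M = 0 line and spots with Cy3 < Cy5 fall below it.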


Microarray Pattern Analysis

• Each microarray covers 13574 effected genes out of the 18564 on a chip, with tumor tissue dyed in Cy3 and normal tissue dyed in Cy5

• Patients: 12 HCV, 27 HBV, 1 HCV+HBV, and 4 with neither HCV nor HBV

• A gene is defined as differentially expressed if log2 of its Lowess-normalized Cy3/Cy5 ratio is greater than T (↑) or less than −T (↓)
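
The thresholding criterion above can be sketched as follows. The input is assumed to be a log2 ratio that has already been Lowess-normalized; the slide does not give a value for T, so the T = 1.0 used in the example (a two-fold change) is purely an assumption, and the function name `classify` is hypothetical.

```python
def classify(log_ratio, t):
    """Label one gene from its normalized log2(Cy3/Cy5) ratio.

    Returns "up" if the ratio exceeds t, "down" if it is below -t,
    and "unchanged" otherwise.
    """
    if log_ratio > t:
        return "up"
    if log_ratio < -t:
        return "down"
    return "unchanged"


if __name__ == "__main__":
    T = 1.0  # assumed threshold, not taken from the slides
    # Hypothetical normalized log2 ratios for three genes.
    for value in (1.58, -0.12, -4.42):
        print(value, classify(value, T))
```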


Feature Selection/Extraction (1)

• Given a set of N patterns from K categories (K = 2 is a dichotomy problem), with Ni patterns belonging to category i, 1 ≤ i ≤ K, each pattern consists of M features, many of them redundant. For example, a microarray can be represented as a pattern of 13574 features corresponding to 13574 effected genes. The goal of selection is to choose a small subset of features for recognition.


Feature Selection/Extraction (2)

• Given a set of N patterns from K categories (K = 2 is a dichotomy problem), with Ni patterns belonging to category i, 1 ≤ i ≤ K, the goal of extraction is to transform an M-dimensional pattern into an m-dimensional pattern with m << M for classification. A selected feature preserves its original meaning, whereas an extracted feature usually does not.
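
The difference between selection and extraction can be illustrated with a toy example. Selection keeps some of the original coordinates, so each output is still a specific gene; extraction (shown here as a linear transform, which is only one possible choice) mixes coordinates, so the outputs no longer correspond to single genes. The function names, the pattern, and the weights below are all hypothetical.

```python
def select_features(pattern, indices):
    """Selection: keep the original values at the chosen gene indices."""
    return [pattern[i] for i in indices]


def extract_features(pattern, weights):
    """Extraction: each output is a weighted sum over all M inputs."""
    return [sum(w * x for w, x in zip(row, pattern)) for row in weights]


if __name__ == "__main__":
    pattern = [0.5, -1.2, 2.0, 0.1]          # M = 4 hypothetical gene values
    print(select_features(pattern, [0, 2]))  # still genes 0 and 2
    weights = [[0.5, 0.5, 0.0, 0.0],         # m = 2 hypothetical directions
               [0.0, 0.0, 1.0, -1.0]]
    print(extract_features(pattern, weights))
```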


16 Most Discriminative Genes to distinguish HCV from HBV [YCT39]

Index    Accession#
13796    U35376
7197     BG259957
2918     BI520001
8495     AJ012159
11189    AB008549
11087    BC006496
9443     CAC51145
9546     X52125
16144    AK024601
16496    Y00083
17213    BC007437
14579    BC011568
587      AF386492
113      Y16961
17215    AF195766
16760    AI022747


Next 16 Most Discriminative Genes to distinguish HCV from HBV

Index    Accession#
5947     BG207354
4885     AK021818
11291    AF155110
1262     BI861005
8055     AJ224741
10965    AAF36120
4164     NM_000423
8088     BC000187
7353     AF070641
5434     AB050785
12727    AB062987
14993    AA974308
4182     AI970531
5341     X65882
10052    AB011542
8140     AK026068


32 Discriminative Genes by Fisher’s Ratios for a Dendrogram
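
The slides do not define Fisher's ratio. A minimal sketch, assuming the common two-class form F = (mean1 − mean2)² / (var1 + var2) computed per gene with sample variances, is given below; the function names and the choice of sample variance are assumptions, and each class needs at least two patterns.

```python
from statistics import mean, variance


def fisher_ratio(a, b):
    """Assumed form: squared mean difference over summed sample variances."""
    num = (mean(a) - mean(b)) ** 2
    den = variance(a) + variance(b)
    if den == 0:
        # Constant within both classes: perfectly separable or useless.
        return float("inf") if num > 0 else 0.0
    return num / den


def rank_genes(class_a, class_b, top):
    """Return the indices of the `top` genes with the largest ratio.

    class_a and class_b are lists of patterns; every pattern is a list
    holding one value per gene.
    """
    n_genes = len(class_a[0])
    scores = []
    for g in range(n_genes):
        a = [p[g] for p in class_a]
        b = [p[g] for p in class_b]
        scores.append((fisher_ratio(a, b), g))
    scores.sort(key=lambda s: s[0], reverse=True)
    return [g for _, g in scores[:top]]
```

With `top=32`, `rank_genes` would return a gene subset of the size used for the dendrogram on this slide, under the stated assumptions.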


32 Discriminative Genes by Chuang+Kao’s for a Dendrogram


Dendrogram from Chen’s 32 Most Discriminative Genes [CC39]


Dendrogram from Genasia’s 32 Most Discriminative Genes


K-means Clustering Results by using 32 Best Discriminative Genes

• G45 from Genasia: distortion 341.26
  1222221222 2211111111 111111111111111111

• X47 from C. Chen: distortion 302.33
  1222221222 2211111111 112111111111111111

• Y48 by Fisher’s Ratio on YCT39: distortion 307.49
  1222221222 2211111111 112111111111111111

• PY50 by Chuang+Kao’s on YCT39: distortion 290.06
  2222222222 2211211111 112111111111111111

Leave-one-out errors by 1-nn: 4, 3, 2, 1 (/39)

Leave-one-out errors by Fisher: 15, 7, 8, 9 (/39)
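
A leave-one-out error count for a 1-nn classifier can be sketched as follows: each pattern is held out in turn, labeled by its nearest remaining neighbor, and counted as an error if that label differs from its own. The slides do not state the distance measure, so squared Euclidean distance is an assumption here, and the function names are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length patterns."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def loo_1nn_errors(patterns, labels):
    """Count leave-one-out misclassifications of a 1-nn classifier."""
    errors = 0
    for i, p in enumerate(patterns):
        # Nearest neighbor among all patterns except the held-out one.
        nearest = min(
            (j for j in range(len(patterns)) if j != i),
            key=lambda j: squared_distance(p, patterns[j]),
        )
        if labels[nearest] != labels[i]:
            errors += 1
    return errors
```

Applied to 39 patterns restricted to one of the 32-gene subsets, the return value would be the numerator of an error figure such as those listed above (out of 39).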


Up (Down) Regulated Genes for Gastric Cancers

• 5 advanced-stage and 5 early-stage patients with gastric cancer

• We find the following genes, which completely discriminate advanced-stage patients from early-stage patients as determined by clinical diagnosis


Dendrogram for Gastric Patients


Top 16 Discriminative Genes for Advanced and Early Stages

Index    Accession#
15843    AF316855
12994    BF868865
18370    BC002996
2070     AK021788
1118     BC000249
9661     AP000350
2017     U53530
1128     AF035281
8728     AL591713
494      AB014526
10990    L77570
342      BC007848
10425    BG745129
6052     AF073362
170      AK000278
1016     BF526386


Thank You

• http://www.bioinfo.ntu.edu.tw

• http://www.cs.nthu.edu.tw/~cchen

• Tel: (02) 2312 3456 ~ 5917

• Tel: (02) 2362 5336 ~ 418

• Tel: (03) 573 1078