MGED IV Meeting Considerations in the Design of Local Research Microarray Databases Jason Gonçalves Slides available at



Public Repositories vs. Local Microarray Databases
Aims - Public Microarray Repositories:
A. Exchange of published (complete) microarray datasets
B. Facilitate meta-analysis of previously published microarray datasets
Aims - Local Microarray Databases:
A. Storage of all microarray data (images, etc.)
B. Facilitate analysis of microarray data
C. Enable validation and debugging of local microarrays, reagents and equipment

MIAME Components: Array Design
MIAME++

MIAME Components: Measurement Data
MIAME++

MIAME Components: Sample Annotation
MIAME++
Detailed description of replication: At what level is this a replicate?

Evolving Microarray Analysis
1. Develop statistical methods for microarray data analysis
2. Study and understand the multiple sources of variability
3. Develop methods to reduce variability and develop experimental designs with the sources of variability in mind

Data Analysis and Variability
[Table: columns are Spot ID, Gene ID, T1 R1, T1 R2, T1 R3, T2 R1, T2 R2, T2 R3, Mean (T2 - T1), Mean Fold Change, P-value. Surviving row labels: 1p, Jun, Jun, BMP, GSK. Numeric values were not recoverable.]
Replication does not ensure duplication of results. Of course, this is not obvious when replication is not used.

Quality Views To Find Variability Sources
[Figure views: Time Points, Time Course, Hyb Dates, Print Batch ID, Slide Batch ID]
[Figure panels: Ratio View - Simulated, Quantified Microarray Data, Original Image Data, Background - Simulated]

Dissecting Spatial Effects

Using Replicate Statistics within a Hybridization Group to Identify Variable Genes
- Z-score values identify genes that are highly variable relative to the other genes in the group, while individual variance values report the absolute levels
- Identify genes with extreme standard deviations
- z-score > 3
Note: SD of the Hybridization Group Standard Deviations

Clustering SD Z-Scores to Identify Random and Systematic Sources of Variability
- Calculated Z-score for all genes and hybridization groups
- Filtered data set to include genes with at least one observed z-score > 3
Random Noise:
- Dust spots
- Incorrect patch placement
- Ghost spots
Systematic Variability:
- Using poor gene annotation (e.g. using Unigene vs. Clone ID)
- Microarray production problems: PCR failure, double bands, end of print plate errors

Dust Spot Examples
[Figure label: Gene with Z-Score > 3]

Systematic Production Problem Example
[Figure axes: Unique Hybridizations, Unique PCR Amplifications]
[Figure clusters: Lambda Ig light chain (62 clones), Kappa Ig light chain (74 clones), Other Genes (49 clones), Other Genes (12 clones)]
Interesting Genes: p53, x-actin, ras, etc.
Validate with RT-PCR
Multiple Myeloma Cell Lines
All interesting genes are artifacts

Iobion - GeneTraffic

Acknowledgements
University Health Network, University of Toronto:
- Mark Takahashi
- Neil Winegarden
- James R. Woodgett
- All UHN Microarray Center Staff
Iobion Informatics LLC.:
- Daniel Iordan
- Harry Liu
- WL Marks
- William Roboly
- Faye Barron
- Bogdan Georgescu
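
The z-score filter described in the replicate-statistics slides can be sketched as follows. This is a minimal illustration, not the implementation used in the talk or in GeneTraffic. It assumes that each gene has two or more replicate measurements (for example log ratios) within a hybridization group, that the sample standard deviation is used throughout, and that a gene's z-score is its replicate SD minus the group's mean SD, divided by the SD of the group's SDs (as the slide note suggests). The function names `sd_z_scores` and `variable_genes` are hypothetical.

```python
from statistics import mean, stdev


def sd_z_scores(replicates_by_gene):
    """Z-score of each gene's replicate SD within one hybridization group.

    replicates_by_gene maps a gene ID to a list of two or more replicate
    values (assumed here to be log ratios).
    """
    # Per-gene standard deviation across its replicates.
    sds = {gene: stdev(values) for gene, values in replicates_by_gene.items()}

    # Mean and SD of those per-gene SDs across the whole group.
    sd_values = list(sds.values())
    group_mean = mean(sd_values)
    group_sd = stdev(sd_values)

    if group_sd == 0:
        # Every gene is equally variable, so none stands out.
        return {gene: 0.0 for gene in sds}
    return {gene: (sd - group_mean) / group_sd for gene, sd in sds.items()}


def variable_genes(groups, threshold=3.0):
    """Genes with at least one z-score above the threshold in any group.

    groups maps a hybridization group ID to a replicates_by_gene dict.
    """
    flagged = set()
    for replicates_by_gene in groups.values():
        for gene, z in sd_z_scores(replicates_by_gene).items():
            if z > threshold:
                flagged.add(gene)
    return flagged
```

With sample statistics, the largest possible z-score in a group of n genes is (n - 1) / sqrt(n), so a threshold of 3 can only be exceeded when a group holds at least 11 genes. Flagged genes are candidates for inspection (dust spots, production problems), not confirmed artifacts.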