
Page 1

Plate Effects in cDNA Microarray Data

Henrik Bengtsson, hb@maths.lth.se
Mathematical Statistics, Centre for Mathematical Sciences
Lund University

Page 2

Outline

• Intensity dependent effects
• A new way of plotting microarray data
• Plate effects
• Plate normalization
• Measure of Fitness
• Results
• Discussion

Page 3

Data

• Matt Callow’s ApoAI experiment (2000):

– (8 ApoAI-KO mice vs. pool of 8 control mice), 8 control mice vs. pool of 8 control mice.

– 5357 ESTs/genes (6 triplicates, 175 duplicates, 4989 single spotted) & 840 blanks => 6384 spots in all.

– Labeled using Cy3-dUTP and Cy5-dUTP.

– Signals extracted from images by Spot.

Page 4

Intensity dependent effects

The log-ratio, M, depends on the intensity of the spot, A.
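As a minimal sketch of the two quantities on this slide, assuming the conventional definitions M = log2(R/G) and A = ½·log2(R·G) for the red and green channel signals R and G (the slides do not spell these out; the function name `ma_transform` and the example signal values are hypothetical):

```python
import math


def ma_transform(red, green):
    """Return (M, A) for one spot.

    Assumed definitions (not stated on the slides):
      M = log2(R / G)        log-ratio
      A = 0.5 * log2(R * G)  average log-intensity
    Both signals must be positive.
    """
    m = math.log2(red / green)
    a = 0.5 * math.log2(red * green)
    return m, a


# Hypothetical spot: red signal four times the green signal.
m, a = ma_transform(1024.0, 256.0)
print(m, a)  # M = 2.0, A = 9.0
```

Plotting M against A for all spots on a slide is what reveals the dependence described above: without an intensity effect, the cloud of points would be centred on M = 0 for every A.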

Page 5

Print-tip effects

The log-ratio (and its variance) depends on the print-tip group.

How are the spots printed…?

Page 6

Print order plot

The spots are ordered according to when they were spotted/dipped onto the glass slide(s).

Page 7

Plate effects

The log-ratios depend on the plate the spotted clone comes from.

(384-well plates from 6 different labs were used)

Page 8

Plate Normalization

Assumption: The genes from one plate are on average non-differentially expressed.

Correctness? Are clones on the plates selected randomly? Spots on plates are less random than, for instance, spots in print-tip groups.

The ApoAI mouse experiment is a comparison between 8 control mice and the pool of them. Even if clones on plates were from different tissues, e.g. plates 9-12 from brain, in this setup it should not affect the ratios, just the strength of the signals.

Page 9

Removing plate biases
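The slides do not state how the plate bias is estimated. A minimal sketch, assuming the bias of a plate is taken to be the median log-ratio of all spots printed from that plate (which is zero under the assumption that the genes on a plate are on average non-differentially expressed); the function name `remove_plate_bias` and the example values are hypothetical:

```python
from collections import defaultdict
from statistics import median


def remove_plate_bias(log_ratios, plates):
    """Subtract from each log-ratio M the median M of its plate.

    log_ratios: M values, one per spot.
    plates:     plate label for each spot, in the same order.
    The median is an assumed estimator, not one confirmed by the slides.
    """
    by_plate = defaultdict(list)
    for m, plate in zip(log_ratios, plates):
        by_plate[plate].append(m)
    plate_bias = {plate: median(values) for plate, values in by_plate.items()}
    return [m - plate_bias[plate] for m, plate in zip(log_ratios, plates)]


# Hypothetical data: plate 1 is shifted upwards, plate 2 downwards.
m_values = [0.5, 0.7, 0.6, -0.2, -0.4, -0.3]
plate_ids = [1, 1, 1, 2, 2, 2]
print(remove_plate_bias(m_values, plate_ids))
```

After this step the median log-ratio within every plate is zero, so whatever structure remains in M is no longer explained by a plate-level shift.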

Page 10

Intensity normalization

• Intensities (A) also have plate effects.

• Intensity normalization => plate biases again!

Should we normalize A for plate? Probably not!

Blanks and "brain" spots have lower intensities, whereas the "liver" spots have higher...

Page 11

Sources of Artifacts

[Figure: flow diagram of a cDNA microarray experiment. Production: cDNA clones, PCR product amplification, purification, printing. Samples: test sample (RNA, cDNA) and reference sample (RNA, cDNA), followed by hybridization. Scanning: excitation by red laser and green laser, emission, overlay images, giving data (R, G, ...). Annotated artifact sources: plate effects(?), intensity effects (labelling efficiency), intensity effects (quenching).]

Page 12

Several possible approaches ;(

Decisions to make:

• Background correction?
• Plate normalization?
• Intensity (slide, print-tip or scaled print-tip) normalization?
• Platewise-intensity normalization?

If both plate and intensity normalization, in what order? Maybe plate-intensity-plate-intensity-plate-... and so on?

Need a way to compare different approaches...

Page 13

Measure of Fitness

Median absolute deviation (MAD) for gene $i$:

$d_i = 1.4826 \cdot \operatorname{median}_j |r_{ij}|$

where $r_{ij} = M_{ij} - \operatorname{median}_j M_{ij}$ is residual $j$ for gene $i$.

The measure of fitness is defined as the mean of the genewise MADs:

$\text{m.o.f.} = \frac{1}{N} \sum_{i=1}^{N} d_i$

where $N$ is the number of genes. (...or look at the density of the $d_i$'s)

Important: Compare on the same scale!
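The definition above can be sketched in a few lines. This is an illustration only: it assumes the log-ratios are organised as one list of replicate measurements per gene, and the names `gene_mad` and `measure_of_fitness` and the example values are hypothetical.

```python
from statistics import mean, median

MAD_SCALE = 1.4826  # the constant from the slide


def gene_mad(log_ratios):
    """Scaled MAD of one gene's log-ratios: 1.4826 * median_j |M_ij - median_j M_ij|."""
    center = median(log_ratios)
    residuals = [abs(m - center) for m in log_ratios]
    return MAD_SCALE * median(residuals)


def measure_of_fitness(log_ratios_by_gene):
    """Mean of the genewise MADs over all N genes."""
    return mean([gene_mad(ms) for ms in log_ratios_by_gene])


# Hypothetical data: two genes, three replicate log-ratios each.
example = [
    [0.0, 1.0, 2.0],  # residuals 1, 0, 1 -> MAD = 1.4826
    [1.0, 1.0, 1.0],  # no spread       -> MAD = 0
]
print(measure_of_fitness(example))  # approximately 0.7413
```

A lower value means the replicate log-ratios of each gene agree more closely, which is why the measure can be used to rank normalization approaches, provided the compared log-ratios are on the same scale.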

Page 14

Visual comparison between the "best"

Slidewise intensity normalization: (m.o.f. = 0.228)

Plate + print-tip intensity + plate normalization: (m.o.f. = 0.188)
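As a worked illustration of what these two values imply, the relative reduction in the measure of fitness between the two approaches shown on this slide can be computed directly (the symbol names are introduced here for illustration only):

```latex
\[
\frac{\text{m.o.f.}_{\text{slide}} - \text{m.o.f.}_{\text{plate}}}{\text{m.o.f.}_{\text{slide}}}
  = \frac{0.228 - 0.188}{0.228}
  = \frac{0.040}{0.228}
  \approx 0.175
\]
```

That is, roughly 17.5% lower genewise variability for the approach that removes plate biases, compared with slidewise intensity normalization alone.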

Page 15

Results

[Figure: m.o.f. for each combination of steps. Legend: bg – background corrected, P – plate biases removed, S – slide-intensity normalized, B – print-tip-intensity normalized, sB – scaled print-tip-intensity normalized.]

• Removing plate biases first significantly lowers the gene variabilities (15-20% lower than with intensity normalization only).

• It is critical not to do background correction.

• Using the measure of fitness is helpful in deciding what to do.

Page 16

Discussion

• What are the reasons for plate effects and where do they actually occur? i) On the plates, ii) during printing or iii) at hybridization?

• How should one best standardize the measure of fitness? i) Based on all spots, ii) on a subset (blanks?), or iii) ?

Page 17

Acknowledgements

Statistics Dept, UC Berkeley:
* Sandrine Dudoit
* Terry Speed
* Yee Hwa Yang

Lawrence Berkeley National Laboratory:
* Matt Callow

Ernest Gallo Research Center, UCSF:
* Karen Berger

Mathematical Statistics, Lund University:
* Ola Hössjer

com.braju.sma - object oriented extension to sma (free): http://www.braju.com/R/

[R] Software (free): http://www.r-project.org/

The Statistical Microarray Analysis (sma) library (free): http://www.stat.berkeley.edu/users/terry/zarray/Software/smacode.html