Genetical Genomics in the Mouse: Finding Genes with Microarray Expression Data

Page 1: Genetical Genomics in the Mouse

Genetical Genomics in the Mouse

Finding Genes with Microarray Expression Data

Page 2: Genetical Genomics in the Mouse

Genetical Genomics

Jansen, R.C. and J.P. Nap (2001). Genetical genomics: the added value from segregation. Trends Genet 17(7): 388-91.

Page 3: Genetical Genomics in the Mouse

Mouse Genetical Genomics

• BXD recombinant inbred lines
• 21 strains + parents and F1
  – genotypes
    • 508 markers
  – traits
    • forebrain RNA assayed by Affymetrix U74Av2
  – PM probe sequences
  – MM probe sequences
• 1 to 4 microarrays per RI line (average 2.5)

Page 4: Genetical Genomics in the Mouse

QTL mapping by regression

• Trait vs genotype association
  – Genetically determined difference
    • in expressed RNA level
    • in hybridization of probe sequence
    • in competing hybridization
  – Measured by LRS (likelihood ratio statistic)
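As a rough illustration of how a trait-vs-genotype association can be scored, the sketch below regresses a trait on one marker and returns a likelihood ratio statistic. It assumes genotypes are coded numerically (for example -1 for the B allele and +1 for the D allele) and that LRS = n ln(RSS_null / RSS_full); the slides do not state the coding or the exact formula, and the function name is hypothetical.

```python
import math

def lrs_single_marker(trait, genotype):
    """LRS for a simple linear regression of trait on one marker.

    Assumption (not from the slides): LRS = n * ln(RSS_null / RSS_full),
    with genotypes coded as numbers such as -1 and +1.
    """
    n = len(trait)
    mean_y = sum(trait) / n
    mean_x = sum(genotype) / n
    sxx = sum((x - mean_x) ** 2 for x in genotype)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(genotype, trait))
    rss_null = sum((y - mean_y) ** 2 for y in trait)
    if sxx == 0 or rss_null == 0:
        return 0.0  # monomorphic marker or constant trait: no evidence
    rss_full = rss_null - sxy ** 2 / sxx  # residual sum of squares after fitting the marker
    if rss_full <= 0:
        return float("inf")  # perfect fit
    return n * math.log(rss_null / rss_full)
```

A genome scan under this sketch would call the function once per marker and keep the largest value.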

Page 5: Genetical Genomics in the Mouse

BXD Marker Distribution

[Figure: marker location (y-axis, 0.0 to 1.2) plotted against marker number (x-axis, 0 to 500), with chromosome labels from 1 to 19.]

Page 6: Genetical Genomics in the Mouse

Trait Data Preparation

• 12,422 probesets (traits)
  – 16 PM & 16 MM probes (oligonucleotides)
  – average PM-MM difference
    • log2-transform average difference
    • normalize data of each microarray to common mean and standard deviation
    • average replicate microarrays
• 400,000 PM & MM probes (cells)
  – log2-transform cell intensity
  – normalize and average replicate arrays
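The preparation steps above (log2-transform, per-array normalization, replicate averaging) might look roughly like the following sketch. The target mean of 8 and standard deviation of 2, the floor applied before taking logs, and the function names are all illustrative assumptions; the slides do not give these values.

```python
import math
import statistics

def standardize_array(log_values, target_mean=8.0, target_sd=2.0):
    """Rescale one array's log2 values to a common mean and standard deviation.

    target_mean and target_sd are placeholder values, not taken from the slides.
    """
    m = statistics.mean(log_values)
    s = statistics.pstdev(log_values)
    return [target_mean + target_sd * (v - m) / s for v in log_values]

def prepare_line(replicate_arrays, floor=1.0):
    """log2-transform each replicate array, normalize it, then average replicates.

    replicate_arrays: one list of raw values per microarray for a single RI line.
    floor: assumed lower bound so that non-positive values can be log-transformed.
    """
    normalized = []
    for raw in replicate_arrays:
        logged = [math.log2(max(v, floor)) for v in raw]
        normalized.append(standardize_array(logged))
    # Average the same probeset (or cell) across the replicate arrays.
    return [statistics.mean(vals) for vals in zip(*normalized)]
```

Because every array is rescaled before averaging, a replicate that is uniformly brighter or dimmer contributes the same normalized profile as the others.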

Page 7: Genetical Genomics in the Mouse

Multiple testing problem

• Two levels of multiple testing
  – Each trait or probe vs 508 loci
  – 12,422 traits or 400,000 probes
• Strategy
  – Empirical p-value for multiple loci
    • measures significance of single best association
  – Benjamini-Hochberg procedure for multiple traits or probes
    • may declare many significant associations
    • assumes at least one significant association

Page 8: Genetical Genomics in the Mouse

Empirical p-value

• Measures genome-wide significance
  – converts multiple test into single test
  – significance of best association among all loci
• Permutation test for distribution under null
  – up to 10^6 scans with permuted trait values
  – record largest LRS for each permutation
• Find p-value of original regression from its rank in the null distribution
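A minimal sketch of the permutation procedure described above: shuffle the trait values, rescan all markers, record the largest LRS, and rank the original maximum within that null distribution. The LRS formula, the numeric genotype coding, and the (exceed + 1) / (n_perm + 1) convention are assumptions made for this example, and the function names are hypothetical.

```python
import math
import random

def lrs(trait, genotype):
    """Assumed LRS = n * ln(RSS_null / RSS_full) for one marker."""
    n = len(trait)
    my = sum(trait) / n
    mx = sum(genotype) / n
    sxx = sum((x - mx) ** 2 for x in genotype)
    sxy = sum((x - mx) * (y - my) for x, y in zip(genotype, trait))
    rss0 = sum((y - my) ** 2 for y in trait)
    if sxx == 0 or rss0 == 0:
        return 0.0
    rss1 = rss0 - sxy ** 2 / sxx
    return float("inf") if rss1 <= 0 else n * math.log(rss0 / rss1)

def max_lrs(trait, markers):
    """Largest LRS over all markers: the single best association in a scan."""
    return max(lrs(trait, g) for g in markers)

def empirical_p(trait, markers, n_perm=1000, seed=0):
    """Genome-wide p-value from the rank of the observed maximum LRS."""
    rng = random.Random(seed)
    observed = max_lrs(trait, markers)
    shuffled = list(trait)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the trait-genotype link
        if max_lrs(shuffled, markers) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

Taking the maximum over markers inside each permutation is what turns the 508 per-locus tests into a single genome-wide test.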

Page 9: Genetical Genomics in the Mouse

Outliers

• Examine permutation test distribution for bimodality
  – Compare 37th and 95th percentile values
• Find outlier and assign next most extreme value
• Redo permutation test and regression

[Figure: Distribution of significance thresholds. Significant threshold (y-axis, 10 to 22) vs suggestive threshold (x-axis, 7 to 11).]
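The replacement step ("assign next most extreme value") could be sketched as below. The slides do not say how the outlier is identified, so this example simply assumes it is the value farthest from the median, and replaces it with the next most extreme value on the same side; the function name is hypothetical.

```python
import statistics

def replace_most_extreme(values):
    """Return a copy in which the value farthest from the median is set to
    the next most extreme value on the same side of the median.

    Assumption: the outlier is whichever value lies farthest from the median.
    """
    med = statistics.median(values)
    idx = max(range(len(values)), key=lambda i: abs(values[i] - med))
    side = values[idx] - med
    same_side = [v for i, v in enumerate(values)
                 if i != idx and (v - med) * side >= 0]
    out = list(values)
    if same_side:
        out[idx] = max(same_side, key=lambda v: abs(v - med))
    return out
```

After this adjustment, the permutation test and regression would be rerun on the modified trait values, as the slide indicates.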

Page 10: Genetical Genomics in the Mouse

Benjamini-Hochberg test

• Test of 100 uniformly distributed p-values (p-values from non-significant results)
• P-values as blue dots
• Significance threshold for FDR = 0.2 as red line

[Figure: P-value (y-axis, 0 to 0.2) vs rank (x-axis, 0 to 20).]

Page 11: Genetical Genomics in the Mouse

Benjamini-Hochberg test

• Test of 10 low p-values (significant results) mixed with 90 p-values from non-significant results
• P-values as blue dots
• Significance threshold for FDR = 0.2 as red line
• Eleven cases declared significant

[Figure: P-value (y-axis, 0 to 0.2) vs rank (x-axis, 0 to 20); the points below the red line are marked "Declare significant".]
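The red threshold line in these plots corresponds to the standard Benjamini-Hochberg step-up rule: with m p-values sorted in ascending order, find the largest rank k whose p-value is at most FDR × k / m, and declare ranks 1 through k significant. A minimal sketch follows (the function name is illustrative):

```python
def benjamini_hochberg(p_values, fdr=0.2):
    """Return the indices of p-values declared significant at the given FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        # The threshold line rises linearly with rank.
        if p_values[i] <= fdr * rank / m:
            cutoff_rank = rank
    # Everything ranked at or below the last crossing is declared significant.
    return sorted(order[:cutoff_rank])
```

Because the rule keeps every rank up to the last crossing, a p-value from a non-significant result can be carried along with the true positives, which is consistent with eleven declarations from ten low p-values at FDR = 0.2.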

Page 12: Genetical Genomics in the Mouse

Empirical P-value Calculation

[Flow chart: Marker regression mapping → Maximum genome-wide LRS → 500x permutation test → p-value. Decision points marked "?" lead to successively larger permutation tests (5000x, 50000x, 1000000x), each yielding a p-value.]
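One way to read the staged design is that a larger permutation test is run only when the smaller one cannot resolve the p-value. The slide does not state the rule applied at each "?", so the sketch below assumes a hypothetical one: stop as soon as at least `min_exceed` permuted maxima reach the observed LRS. The callback `draw_null_max` stands in for one permuted genome scan and is also an assumption.

```python
import random

def staged_empirical_p(observed, draw_null_max,
                       stages=(500, 5000, 50000, 1000000),
                       min_exceed=10, seed=0):
    """Escalate the number of permutations until the p-value is resolved.

    draw_null_max(rng) must return the maximum LRS of one permuted scan.
    min_exceed is a hypothetical stopping rule, not taken from the slides.
    Returns (p_value, permutations_used).
    """
    rng = random.Random(seed)
    done = 0
    exceed = 0
    for total in stages:
        while done < total:
            if draw_null_max(rng) >= observed:
                exceed += 1
            done += 1
        if exceed >= min_exceed:
            break  # enough exceedances to estimate the p-value at this stage
    return (exceed + 1) / (done + 1), done
```

Under this rule most traits finish at 500 permutations, and only the strongest associations consume the larger permutation budgets.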

Page 13: Genetical Genomics in the Mouse

Trait-locus associations

• Ranked P-values as blue dots (90 smallest from 12,422)
• Significance threshold as red line
• Cases below red line are significant for FDR = 0.2
• 75 significant trait-locus associations

[Figure: P-value (y-axis, 0 to 0.0025) vs rank (x-axis, 0 to 80).]

Page 14: Genetical Genomics in the Mouse

Probe-locus associations

• Ranked P-values as blue dots (600 smallest from ~400,000)
• Significance threshold as red line
• Cases below red line are significant for FDR = 0.2
• 576 significant probe-locus associations

[Figure: P-value (y-axis, 0 to 4.0E-04) vs rank (x-axis, 0 to 600).]

Page 15: Genetical Genomics in the Mouse

QTLs from MM probes

• 576 QTLs defined by single microarray probes
  – 454 (79%) by PM probes
  – 122 (21%) by MM probes
• Proportion of PM-probe QTLs declines as p-value increases

[Figure: Fraction of PM probes among QTLs. Average for window of 50 (y-axis, 0.00 to 1.00) vs rank (x-axis, 0 to 600); regions labeled A, B, C.]
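The plotted quantity, the fraction of PM probes averaged over a window of 50 ranks, could be computed along these lines. Whether the original windows overlapped is not stated; this sketch assumes a sliding window that advances one rank at a time, and the function name is illustrative.

```python
def windowed_pm_fraction(is_pm_by_rank, window=50):
    """Fraction of PM probes in each window of consecutive ranks.

    is_pm_by_rank: 1 (PM) or 0 (MM) for each significant probe, ordered by
    ascending p-value. Assumes a sliding window with step 1.
    """
    n = len(is_pm_by_rank)
    if n < window:
        return []
    fractions = []
    running = sum(is_pm_by_rank[:window])
    for start in range(n - window + 1):
        if start > 0:
            # Slide by one rank: add the entering probe, drop the leaving one.
            running += is_pm_by_rank[start + window - 1] - is_pm_by_rank[start - 1]
        fractions.append(running / window)
    return fractions
```

As a check on the headline figures, 454 / 576 ≈ 0.79 and 122 / 576 ≈ 0.21, matching the percentages on the slide.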

Page 16: Genetical Genomics in the Mouse

QTLs from cell-level mapping

• 576 cell-marker associations (QTLs)
  – 339 traits (probesets) represented
  – most probesets represented by a single probe
  – rarely, two or more significant probes from same probeset
  – all probes from one probeset identify same locus
  – 79% of probes are PM

Page 17: Genetical Genomics in the Mouse

QTLs from PM cells only

• 454 PM cells defining QTLs
  – 288 traits (probesets) represented
    • 184 controlled by location on the same chr
    • 88 controlled by location on different chr
    • 16 unknown location for probeset
  – 147 locations (marker loci) with nearby QTLs, distributed on all chromosomes

Page 18: Genetical Genomics in the Mouse

Probe-locus associations among traits

• 339 traits (probesets) with probes identifying significant QTLs
• 186 traits represented by a single probe
• 2 traits represented by 10 probes

[Figure: Distribution among probesets of PM probes detecting a QTL. Frequency (y-axis, 0 to 225) vs probes per probeset (x-axis, 1 to 10).]

Page 19: Genetical Genomics in the Mouse

QTL distribution among marker loci

• 147 loci identified by at least one significant probe-locus association
• multiple associations to one locus
  – multiple probes from one probeset
  – multiple QTL near locus

[Figure: Distribution among marker loci of QTLs detected by PM probes. Frequency (y-axis, 0 to 70) vs probes per marker (x-axis, 1 to 17).]

Page 20: Genetical Genomics in the Mouse

Profiles of probe sensitivity

Li, C & Wong, WH (2001) Model-based analysis of oligonucleotide arrays: Expression index computation and outlier detection. PNAS 98: 31-36

Page 21: Genetical Genomics in the Mouse

Probe profiles (best)

• LRS vs probe number
• Probesets with highest significance in probeset-level mapping

[Figure: four panels (probesets 93269_at, 96156_at, 94426_at, 94244_at), each plotting LRS against probe number; legend: PM, MM.]

Page 22: Genetical Genomics in the Mouse

Probe profiles (worst)

• LRS vs probe number
• Probesets with lowest significant association in probeset-level mapping

[Figure: four panels (probesets 93583_s_at, 102321_at, 102776_at, 93730_at), each plotting LRS against probe number (1 to 15); legend: PM, MM.]

Page 23: Genetical Genomics in the Mouse

Distribution of controlled loci

[Figure: Controlled traits by chromosome. Relative frequency (y-axis, 0 to 5) vs probeset chromosome (x-axis, 1 to 19); series: Syn, Nonsyn.]

Page 24: Genetical Genomics in the Mouse

Distribution of controlling loci

[Figure: Controlling loci by chromosome. Relative frequency (y-axis, 0 to 5) vs QTL chromosome (x-axis, 1 to 19); series: Syn, Nonsyn.]

Page 25: Genetical Genomics in the Mouse

Chr 9 QTLs

• Unusual number of chr 9 QTLs (22) controlling sequences on other chrs
• Normalized frequency 3-fold greater than average chr
• Many of these QTLs cluster near 2 loci on chr 9

[Figure: Chr 9 QTLs controlling nonsyntenic sequences. Number of sequences controlled (y-axis, 0 to 8) vs position of QTLs (x-axis, 0.00 to 0.60); labeled loci: D9Mit253, D9Mit18.]

Page 26: Genetical Genomics in the Mouse

Acknowledgments

• Robert W Williams
  – Lu Lu
  – S Shou
  – Yanhua Qu
  – Elissa Chesler
• John D Mountz
  – Hui Chen Hsu
• David Threadgill
• Gene Hwang
• Dan Nettleton
• Jintao Wang
• Ram Varma
• Jianxin Wang
• Mark Brady
• Gene Sobel

U Tennessee, Memphis

U Alabama, Birmingham

U North Carolina

Iowa State U

Cornell U

GOG

Gene Expression Core

Bioinformatics