eXtreme Array Mapping and Haplotype analysis Using Arrays. Justin Borevitz, Salk Institute, naturalvariation.org


Page 1

eXtreme Array Mapping and Haplotype analysis Using Arrays

Justin Borevitz, Salk Institute, naturalvariation.org

Page 2

Talk Outline

• Bulk segregant mapping of
  – Mendelian mutations

• eXtreme Array Mapping of QTL
  – Kas x Col RILs and simulations

• Haplotype analysis
  – Patterns, global variation, selection

Page 3

Potential Deletions

Page 4

False Discovery and Sensitivity

Permuted data

real data

5% FDR

PM only SAM threshold

5% FDR

GeneChip               SFPs    nonSFPs   Cereon marker accuracy
                       3806    89118     100%

Sequence (817)         SFPs    nonSFPs   Sensitivity
                       121     696
Polymorphic (340)      117     223       34%
Non-polymorphic (477)  4       473

False discovery rate: 3%

Test for independence of all factors: Chisq = 177.34, df = 1, p-value = 1.845e-40
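The summary figures on this slide can be recomputed from the four sequenced-feature counts. The sketch below is only an illustration: it assumes the reported statistic is a Pearson chi-square on the 2x2 table of SFP call against sequence result, without continuity correction. The variable and function names are made up for this example.

```python
import math

# Sequenced features, split by SFP call and by what the sequence showed.
sfp_polymorphic = 117        # called SFP, sequence is polymorphic
nonsfp_polymorphic = 223     # not called, sequence is polymorphic
sfp_nonpolymorphic = 4       # called SFP, sequence is not polymorphic
nonsfp_nonpolymorphic = 473  # not called, sequence is not polymorphic


def pearson_chisq_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Sensitivity: fraction of truly polymorphic features that were called SFPs.
sensitivity = sfp_polymorphic / (sfp_polymorphic + nonsfp_polymorphic)

# False discovery rate: fraction of called SFPs that are not polymorphic.
false_discovery_rate = sfp_nonpolymorphic / (sfp_polymorphic + sfp_nonpolymorphic)

chisq = pearson_chisq_2x2(sfp_polymorphic, nonsfp_polymorphic,
                          sfp_nonpolymorphic, nonsfp_nonpolymorphic)

# Upper tail of a chi-square distribution with 1 degree of freedom.
p_value = math.erfc(math.sqrt(chisq / 2))

print(f"sensitivity = {sensitivity:.0%}, FDR = {false_discovery_rate:.0%}")
print(f"Chisq = {chisq:.2f}, p-value = {p_value:.3e}")
```

Under that assumption the counts reproduce the 34% sensitivity, the 3% false discovery rate, and the chi-square value quoted on the slide.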

Observed t statistics vs. null (permuted) t statistics

Page 5

Chip genotyping of a Recombinant Inbred Line

29kb interval
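Reading a recombination breakpoint off SFP genotype calls amounts to finding adjacent markers whose parental call differs. The sketch below uses hypothetical marker positions and generic parent labels "A" and "B"; none of these values come from the slide, and the 29 kb spacing is chosen only to echo the interval size shown.

```python
def breakpoint_intervals(positions, calls):
    """Return (start, end) intervals between adjacent markers whose calls differ."""
    return [(positions[i], positions[i + 1])
            for i in range(len(calls) - 1)
            if calls[i] != calls[i + 1]]


# Hypothetical SFP positions (bp) and parental calls along one RIL chromosome.
positions = [10_000, 52_000, 81_000, 120_000, 163_000]
calls = ["A", "A", "B", "B", "B"]

intervals = breakpoint_intervals(positions, calls)
widths = [end - start for start, end in intervals]
print(intervals, widths)
```

The resolution of each breakpoint is set by the distance between the two flanking SFPs, so denser markers give narrower intervals.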

Page 6

Potential Deletions

111 potential deletions, 45 confirmed by Ler sequence

23 (of 114) transposons

Disease Resistance (R) gene clusters

Single R gene deletions

Genes involved in Secondary metabolism

Unknown genes

Page 7

Potential Deletions Suggest Candidate Genes

deletion of MAF1

FLOWERING1 QTL

Chr1 (bp)

Flowering Time QTL caused by a natural deletion in MAF1

MAF1

Page 8

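[Figure: Venn diagram of deletions in three sets labeled "EST Deletions all", "KASC ALL DEL", and "LER del all", out of 19838 genes in total; region counts shown: 211, 172, 135, 46, 54, 70, 53]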

Deletions between Accessions

Page 9

Fast Neutron deletions

FKF1: 80 kb deletion, Chr1

cry2: 10 kb deletion, Chr1

Het

Page 10

Map bibb

100 bibb mutant plants

100 wt mutant plants

Page 11

bibb mapping

ChipMap, AS1

Bulk segregant mapping using chip hybridization

bibb maps to Chromosome 2 near ASYMMETRIC LEAVES1
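One simple way to localize a Mendelian mutation from pooled hybridizations is to smooth the per-SFP difference between the mutant pool and the wild-type pool and take the window with the largest mean. This is a minimal sketch of that idea, not the ChipMap method itself: the signal values, the scaling (0 = unlinked, 1 = fully linked), and the window size are all hypothetical.

```python
def sliding_mean(values, window):
    """Mean of each run of `window` consecutive values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


# Hypothetical per-SFP signal along one chromosome: mutant pool minus wild-type
# pool, scaled so that 0 means unlinked and 1 means fully linked to the mutation.
pool_difference = [0.1, -0.1, 0.0, 0.2, 0.6, 0.9, 1.0, 0.8, 0.5, 0.1, 0.0, -0.1]
WINDOW = 3

smoothed = sliding_mean(pool_difference, WINDOW)

# Index of the window with the strongest linkage signal.
peak_start = max(range(len(smoothed)), key=lambda i: smoothed[i])
peak_features = list(range(peak_start, peak_start + WINDOW))
print(peak_features)
```

Averaging over neighbouring SFPs trades resolution for robustness to single noisy features, which matters when each feature is measured on only a few arrays.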

Page 12

BIBB = ASYMMETRIC LEAVES1

Sequenced the AS1 coding region from bib-1 and found a g -> a change that would introduce a stop codon in the MYB domain

bibb, as1-101

MYB

bib-1: W49*

as1-101: Q107*

as1, bibb

AS1 (ASYMMETRIC LEAVES1) = MYB closely related to PHANTASTICA, located at 64 cM
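A g -> a change producing W49* is consistent with the standard genetic code: tryptophan has a single codon, TGG, and replacing either G with A gives a stop codon. The slide does not say which base changed, so the sketch below simply enumerates both possibilities, assuming the change is read on the coding strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code
TRP_CODON = "TGG"                    # the only codon for tryptophan (W)


def g_to_a_variants(codon):
    """All codons reachable from `codon` by a single G -> A substitution."""
    return [codon[:i] + "A" + codon[i + 1:]
            for i, base in enumerate(codon)
            if base == "G"]


variants = g_to_a_variants(TRP_CODON)
nonsense = [v for v in variants if v in STOP_CODONS]
print(variants, nonsense)
```

Both single-base variants of TGG are stop codons, so any g -> a change within a tryptophan codon truncates the protein at that position.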

Page 13

Other Mendelian mutations

aar21: arrhythmic

ein6: ethylene insensitive (no een?)

Also aar90, aar60 and stamenstay

Page 14

Short pool – Tall pool

Kas x Col RILs, all features

RED2 QTL

Page 15

[Figure: log likelihood ratio (LOD) vs. position along Chromosomes 1 to 5]

eXtreme Array Mapping

Red light QTL RED2 from 100 Kas/Col RILs

QTL likelihood model using bulk segregant analysis with SFP genotyping

[Figure: Composite Interval Mapping of the RED2 QTL on Chromosome 2; LOD vs. position (cM)]

15 tallest RILs pooled vs. 15 shortest RILs pooled

Page 16

Simulation Genotypes

[Figure: simulated genotype (-1.0 to 1.0) vs. position (cM) for Chromosomes 1 to 5]

15 eXtreme RILs of 100; 2 QTL: chr2 37% var, chr5 13% var
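The pooling idea behind these simulations can be sketched in a few lines: simulate inbred lines, rank them by phenotype, and compare the mean genotype of the extreme pools at every marker. The model below is a deliberately simplified assumption, not the simulation used for the slides: one chromosome, one QTL, genotypes coded -1/+1, a fixed switch probability between adjacent markers, and a hypothetical effect size chosen so the QTL explains roughly 37% of the variance.

```python
import random


def simulate_ril(n_markers, switch_prob, rng):
    """One inbred line: genotype -1 or +1 at each marker along a chromosome."""
    genotype = [rng.choice((-1, 1))]
    for _ in range(n_markers - 1):
        flip = rng.random() < switch_prob
        genotype.append(-genotype[-1] if flip else genotype[-1])
    return genotype


def extreme_pool_difference(lines, phenotypes, n_extreme):
    """Per-marker mean genotype of the high pool minus the low pool, halved."""
    order = sorted(range(len(lines)), key=lambda i: phenotypes[i])
    low, high = order[:n_extreme], order[-n_extreme:]

    def pool_mean(pool, marker):
        return sum(lines[i][marker] for i in pool) / len(pool)

    return [(pool_mean(high, m) - pool_mean(low, m)) / 2
            for m in range(len(lines[0]))]


# All parameters below are hypothetical illustration values.
N_LINES, N_MARKERS, QTL_MARKER, N_EXTREME = 100, 50, 20, 15
QTL_EFFECT, NOISE_SD = 0.77, 1.0  # effect^2 / (effect^2 + noise^2) is about 0.37

rng = random.Random(1)
lines = [simulate_ril(N_MARKERS, 0.05, rng) for _ in range(N_LINES)]
phenotypes = [QTL_EFFECT * g[QTL_MARKER] + rng.gauss(0, NOISE_SD) for g in lines]

profile = extreme_pool_difference(lines, phenotypes, N_EXTREME)
peak = max(range(N_MARKERS), key=lambda m: abs(profile[m]))
print(peak, round(profile[peak], 2))
```

The profile stays near 0 at unlinked markers and approaches 1 near the QTL, which is the shape the genotype panels on these slides display.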

Page 17

Simulation Genotypes

100 eXtreme RILs of 700; 2 QTL

[Figure: simulated genotype (-1.0 to 1.0) vs. position (cM) for Chromosomes 1 to 5]

Page 18

Simulation Genotypes

50 eXtreme F2s of 500; 2 QTL

[Figure: simulated genotype (-1.0 to 1.0) vs. position (cM) for Chromosomes 1 to 5]

Page 19

Simulation Chip Noise

50 eXtreme F2s of 500; 2 QTL

[Figure: simulated genotype with chip noise (-1.0 to 1.0) vs. position (cM) for Chromosomes 1 to 5]

Page 20

Simulation Likelihood

50 eXtreme F2s of 500; 2 QTL

[Figure: log likelihood ratio vs. position (cM) for Chromosomes 1 to 5]

Page 21

Array Haplotyping

• Hybridize 48 arrays with 15 accessions

• ~300 ng DNeasy MiniPrep, leaf tissue

• Overnight BioPrime Klenow labeling, 25 °C

• "col", "lz", "ler", "bay", "shah", "cvi", "kas", "c24", "est", "kendl", "mt", "nd", "sorbo", "van", "ws2"

Page 22

Linkage Disequilibrium explained

1 SNP, 2 haplotypes

Mutation: 2 SNPs, 3 haplotypes

Recombination: 2 SNPs, 4 haplotypes
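The progression from 3 to 4 haplotypes can be made concrete with the normalized disequilibrium coefficient D'. In the sketch below the allele letters and the equal haplotype frequencies are hypothetical; the point is only that with three of the four possible haplotypes present D' is 1, and that once recombination creates the fourth at these frequencies D' drops to 0.

```python
def d_prime(haplotypes):
    """Normalized LD coefficient D' for two biallelic SNPs.

    `haplotypes` is a list of 2-character strings, one per chromosome.
    Both SNPs must be polymorphic in the sample, otherwise D' is undefined.
    """
    n = len(haplotypes)
    a, b = haplotypes[0][0], haplotypes[0][1]  # reference allele at each SNP
    p_a = sum(h[0] == a for h in haplotypes) / n
    p_b = sum(h[1] == b for h in haplotypes) / n
    p_ab = sum(h == a + b for h in haplotypes) / n
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return abs(d) / d_max


# Second mutation (G -> C) arose on the T background: three haplotypes.
after_mutation = ["AG", "AG", "TG", "TC"]
# Recombination creates the missing AC haplotype: four haplotypes.
after_recombination = ["AG", "AC", "TG", "TC"]

print(len(set(after_mutation)), d_prime(after_mutation))
print(len(set(after_recombination)), d_prime(after_recombination))
```

Seeing all four haplotypes at a pair of SNPs is therefore evidence of historical recombination between them (or of recurrent mutation).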

Page 23

Sequence Variation at a Candidate Locus, LIGHT2

PHYB locus (6.5 kb)

Association testing:

I143 / L1072: mean = 8.4 ± 0.8 mm

L143 / V1072: mean = 10.2 ± 0.4 mm

p < 0.01 (permutation testing)

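[Figure: Ler PHYB protein (1172 aa) with the residues at positions 143 (I or L) and 1072 (L or V) shown for the accessions Col, Ler, Uk-4, Sorbo, Tsu-1, Wei-0, Van-0, Ema-1, Cvi-0, Ts-1, Sf-0, Se-0, grouped as TALLER and SHORTER]

These polymorphisms are in complete LD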
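A permutation test for a difference between two haplotype groups can be run exactly when the sample is this small. The sketch below is an illustration only: the individual hypocotyl lengths are invented (chosen so the group means match the 8.4 mm and 10.2 mm on the slide), and the group sizes of 3 and 9 are an assumption read off the figure labels. It enumerates every way of assigning 3 of the 12 values to the first group.

```python
from itertools import combinations


def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    hits = count = 0
    for idx in combinations(range(len(pooled)), n_a):
        s_a = sum(pooled[i] for i in idx)
        diff = abs(s_a / n_a - (total - s_a) / n_b)
        count += 1
        if diff >= observed - 1e-9:  # tolerance for floating-point rounding
            hits += 1
    return hits / count


# Hypothetical hypocotyl lengths (mm); only the group means follow the slide.
i143_l1072 = [7.6, 8.4, 9.2]
l143_v1072 = [9.6, 9.8, 10.0, 10.1, 10.2, 10.3, 10.4, 10.6, 10.8]

p_value = exact_permutation_p(i143_l1072, l143_v1072)
print(f"p = {p_value:.4f}")
```

With 3 against 9 there are 220 possible assignments, so the smallest attainable p-value is 1/220, which is below the 0.01 level quoted on the slide.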

Page 24

[Figure: pairwise correlation matrix of the hybridized arrays; rows and columns are labeled by accession]

Pairwise Correlation between and within replicates

Page 25

Feature Density chr1

Page 26

Diversity measure

Page 27

LIGHT1 tstat and raw data

Page 28

LIGHT1 tstat and raw data

Page 29

Array Haplotyping

Inbred lines

Low effective recombination due to partial selfing

Extensive LD blocks

Col Ler Cvi Kas Bay Shah Lz Nd

Chromosome 1, ~500 kb

Page 30

Quantitative Trait Loci

Page 31

Page 32

Feature level model

FLC controls flowering time

Difference detected in 3-day-old seedlings

Gene Expression index that accounts for feature effect and polymorphisms
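One way to see why an expression index needs to account for both feature effects and polymorphisms: taking per-feature differences between two accessions cancels each feature's own hybridization efficiency, and dropping features that carry an SFP avoids mistaking a sequence difference for an expression difference. The sketch below is a minimal illustration of that reasoning, not the feature-level model from the talk; the intensities, the SFP flags, and the accession names used as variable names are hypothetical.

```python
def expression_difference(reference, other, is_sfp):
    """Mean per-feature log-intensity difference, skipping SFP features."""
    diffs = [o - r for r, o, sfp in zip(reference, other, is_sfp) if not sfp]
    return sum(diffs) / len(diffs)


# Hypothetical log2 intensities for the four features of one gene.
col = [8.0, 10.0, 9.0, 11.0]
cvi = [7.0, 9.0, 5.0, 10.0]
is_sfp = [False, False, True, False]  # third feature carries a polymorphism

naive = expression_difference(col, cvi, [False] * len(col))
masked = expression_difference(col, cvi, is_sfp)
print(naive, masked)
```

In this toy case the polymorphic feature drags the naive estimate from -1.0 down to -1.75, exaggerating the apparent down-regulation.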

Page 33

PAG1 down regulated in Cvi

PALE GREEN1 knockout has a long hypocotyl in red light

Page 34

Review

• Single Feature Polymorphisms (SFPs) can be used to identify recombination breakpoints and potential deletions

• Bulk segregant mapping, and
  – eXtreme Array Mapping of QTL

• Haplotyping and diversity scans

Page 35

NaturalVariation.org

Salk: Jon Werner, Sam Hazen, Sarah Liljegren, Ramlah Nehring, Joanne Chory, Joseph Ecker

UC San Diego: Charles Berry

Scripps: Elizabeth Winzeler

Syngenta: Hur-Song Chang, Tong Zhu
