Genetic Mapping of Mutations
1. Identifying candidate genes
2. Initiating positional cloning projects
Genetic Maps Have Two Properties:
• Distance between markers
• Marker order
Genetic distance is determined by recombination frequency
Resolution of map is determined by number of meioses scored
Classical Genetic Mapping
Haploid Mapping Cross
Mutants will tend to have the a allele; B and b will be 1:1
WT will tend to have the A allele; B and b will be 1:1
Markers used in initial mapping:
• SSLP Markers (Simple sequence length polymorphisms)
Also called: CA-repeats, SSRs (simple sequence repeats), microsatellites
Length of CA tract differs in different strains
[Figure: Forward and Reverse primers flanking a CA repeat; a longer CA tract in one strain and a shorter one in another give PCR products of 206 bp and 200 bp]
• Co-dominant: both alleles can be detected in a heterozygote
• Informative in most crosses: 50-90% polymorphic among strains
• Robust markers: easily scored, reproducible banding patterns, information transfers easily between crosses and labs
Single nucleotide polymorphisms
• Snip-SNPs (restriction enzyme polymorphisms)
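[Figure: Forward and Reverse primers amplify a 200 bp product spanning a SNP (A vs. G); one allele stays uncut at 200 bp, the other is cut into 60 bp and 140 bp fragments]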
• Co-dominant: both alleles can be detected in a heterozygote
• Abundant polymorphisms: ~1/100 bp
• Sequence information needed for assay design
• With the advent of deep sequencing, millions of SNPs can be scored in a single experiment
SSLP markers scored on haploid individuals
[Gel figure: two markers, each scored on wild-type and mutant haploid individuals]
3 recombinants among 20 meioses = 15 cM
0 recombinants among 20 meioses = 0 cM
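The arithmetic behind these two numbers can be sketched as follows. This is a minimal illustration that takes map distance as the percentage of recombinant meioses, with no mapping-function correction; the function name is made up for this example.

```python
def map_distance_cm(recombinants: int, meioses: int) -> float:
    """Genetic distance in cM, taken as the percentage of recombinant meioses."""
    if meioses <= 0:
        raise ValueError("meioses must be positive")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must be between 0 and meioses")
    return 100.0 * recombinants / meioses


print(map_distance_cm(3, 20))  # 15.0
print(map_distance_cm(0, 20))  # 0.0
```

Note that 0 cM here only means "no recombinants seen in 20 meioses"; the marker may still be some distance away.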
8 different SSLP markers scored on pools of haploid WT and mutants
Three classes of markers:
• non-polymorphic (uninformative)
• polymorphic unlinked
• polymorphic linked
For gel pictures from a previous year, see
http://people.fas.harvard.edu/~ianwoods/woods_hole_2010/
Testing pools with 48 primers arranged in a 96-well format
[Figure: 96-well plate layout. Primer pairs 1-48 are arrayed 12 per row; each row of primers appears twice, once with the mutant pool and once with the WT pool]
Where did those primers come from?
Original link, now not maintained: http://zebrafish.mgh.harvard.edu/mapping/ssr_map_index.html
All data on SSLPs, available at zfin.org
~ 240 primer pairs = 5 96-well plates
http://people.fas.harvard.edu/~ianwoods/woods_hole_2010/
Original site: MGH zebrafish server
Primers used in our lab
Woods Hole Zebrafish Course 2007
Mapping a mutation
1. Establish a polymorphic mapping cross
2. Prepare genomic DNA from wild-type and mutant siblings
3. Analyze SSLPs on WT and mutant pools (3a optional: retest some markers on the pools)
4. Analyze putatively linked SSLPs on a moderate number of individuals (50-100)
5. Analyze more individuals for high-resolution mapping
Diploid mapping cross
[Figure: crossing scheme with progeny genotypes]
Outcross ID’d carrier (+/-) to a divergent strain (+/+)
Identify carriers (+/-) and intercross
Collect embryos (+/+, +/-, and -/-)
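The genotype classes expected from the carrier intercross can be enumerated with a small Punnett-square sketch. It assumes a single recessive locus with simple Mendelian segregation, so that -/- embryos are the mutants and the wild-type siblings are a mix of +/+ and +/-; the function name is illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def intercross_ratios(parent1=("+", "-"), parent2=("+", "-")):
    """Expected genotype fractions among progeny of a cross (one locus)."""
    counts = Counter()
    for a, b in product(parent1, parent2):
        # Sort the two alleles so that "+/-" and "-/+" count as one genotype.
        counts["/".join(sorted((a, b)))] += 1
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


ratios = intercross_ratios()
print(ratios)  # fractions for +/+, +/-, and -/-

# Among phenotypically wild-type siblings (assumption: recessive mutation),
# what fraction are carriers?
wild_type = {g: f for g, f in ratios.items() if g != "-/-"}
carrier_fraction = wild_type["+/-"] / sum(wild_type.values())
print(carrier_fraction)  # 2/3
```

This is why a wild-type pool from a diploid cross still contains the mutant-linked allele, while a mutant pool should contain only that allele at a tightly linked marker.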
Bulk Segregant Analysis on diploids
[Figure: embryos sorted into mutants (-/-) and wild types (+/+ and +/-), then pooled and run in paired WT and mut lanes]
(Carefully) sort embryos
Prepare gDNA from embryos
Pool DNA samples (20 each)
PCR with arrayed primers
Expected results from BSA on diploids
Marker Q: polymorphic, unlinked
Marker X: not polymorphic, unlinked
Marker Y: polymorphic, linked
Marker Z: not polymorphic, linked
[Figure: paired mut and WT pool lanes for each marker, with chromosome diagrams showing marker alleles (Q/q, x/x, Y/y, z/z) relative to the mutant (m) and wild-type (+) alleles]
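The scoring logic for pooled diploid samples can be sketched as below. This is a simplified illustration, not a scoring protocol: it assumes bands are called as present or absent, whereas on a real gel a linked marker usually shows as a difference in band intensity between pools. The function name and band sizes are hypothetical.

```python
def classify_marker(wt_pool_bands, mut_pool_bands):
    """Classify an SSLP marker from band sizes (bp) seen in diploid WT and mutant pools."""
    wt, mut = set(wt_pool_bands), set(mut_pool_bands)
    if not wt or not mut:
        return "no product: repeat PCR"
    if len(wt | mut) < 2:
        # Only one band size overall: the parental strains share this allele.
        return "non-polymorphic (uninformative)"
    if wt == mut:
        # Both alleles present in both pools: segregates independently.
        return "polymorphic, unlinked"
    if mut < wt:
        # Mutant pool has lost an allele that the WT pool retains.
        return "polymorphic, linked (candidate)"
    return "unclear: retest on pools or individuals"


print(classify_marker([200, 206], [200, 206]))  # polymorphic, unlinked
print(classify_marker([200], [200]))            # non-polymorphic (uninformative)
print(classify_marker([200, 206], [206]))       # polymorphic, linked (candidate)
```

A non-polymorphic marker is reported as uninformative whether or not it is linked, which matches the point of markers X and Z: linkage cannot be seen without a polymorphism.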
Mapping of mutant X
• Incompletely characterized somite phenotype, ventrally curved body axis
• Diploid mapping cross
• PCR: 48 markers on linkage groups 4-7 (WT pool, mut pool)
• Load and run gel (96 samples)
• Analyze BSA gels for linkage
• Analyze some markers on individuals
• Discuss follow-up experiments
[Gel images: Group 1, Group 2, Group 3, Group 4]
Sample gel
[Lanes: marker, wild-type pool, mutant pool]
What’s next?
Test promising markers on panels of individual mutants and wild-types
Not all markers in the following slides are “promising,” but were chosen to show different kinds of possible results
First, the ones that were wrong . . .
Z22467
[Gel: sample gel with marker, wild-type pool, and mutant pool lanes; mutant individuals 1-8 and wild-type individuals]
Not polymorphic
Uninformative marker
Test others in region, perhaps run gel longer
Z11988
[Gel: sample gel with marker, wild-type pool, and mutant pool lanes; mutant individuals 1-8 and wild-type individuals]
Polymorphic, not linked
Mutation likely not located near this marker
Z8693
[Gel: sample gel with marker, wild-type pool, and mutant pool lanes; mutant individuals 1-8 and wild-type individuals]
Polymorphic & linked?
Marker segregating as an intercross
No recombinants in 16 meioses
Run more mutant individuals: confirm position and distance from mutation
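Why run more individuals when there are already no recombinants? Zero recombinants in a small panel only loosely bounds the distance. A rough sketch of that bound follows, assuming each meiosis is an independent trial with the same recombination probability (a simple binomial model that is not part of the original slides).

```python
def upper_bound_cm(meioses: int, confidence: float = 0.95) -> float:
    """Upper bound (cM) on marker-to-mutation distance after 0 recombinants."""
    if meioses <= 0:
        raise ValueError("meioses must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    # P(0 recombinants | r) = (1 - r) ** meioses.
    # Solve (1 - r) ** meioses = 1 - confidence for r.
    r = 1.0 - (1.0 - confidence) ** (1.0 / meioses)
    return 100.0 * r


for n in (16, 100, 1000):
    print(n, round(upper_bound_cm(n), 2))
```

Under this model, 16 meioses with no recombinants remain compatible with a marker well over 10 cM away, which is why the panel is expanded before trusting the position.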
Z11119
[Gel: sample gel with marker, wild-type pool, and mutant pool lanes; mutant individuals 1-8 and wild-type individuals]
Polymorphic & linked?
Marker segregating as an intercross
No recombinants in 16 meioses
Run more mutant individuals: confirm position and distance from mutation
Z3057
[Gel: sample gel with marker, wild-type pool, and mutant pool lanes; mutant individuals 1-8 and wild-type individuals]
Polymorphic & linked?
Marker segregating as an intercross
Potential recombinant in mutant #3 = 1 out of 16 meioses
Run more mutant individuals to confirm
Z4999
[Gel: sample gel with marker, wild-type pool, and mutant pool lanes; mutant individuals 1-8 and wild-type individuals]
Polymorphic & linked?
Marker segregating as an intercross
Potential recombinant in mutant #3 = 1 out of 16 meioses
Run more mutant individuals to confirm
Z7109
[Gel: sample gel with marker, wild-type pool, and mutant pool lanes; mutant individuals 1-8 and wild-type individuals]
Polymorphic & linked?
Marker segregating as an intercross
Potential recombinant in mutant #3 = 1 out of 16 meioses
Run more mutant individuals to confirm
Z13936
[Gel: sample gel with marker, wild-type pool, and mutant pool lanes; mutant individuals 1-8 and wild-type individuals]
Polymorphic & linked?
Marker segregating as an intercross
Potential recombinants in mutants #4, 5, 7, 8 = 4 out of 16 meioses
Run more mutant individuals to confirm
Linked markers:
Marker   LG   Data
Z13936   7    4 recs: #4, 5, 7, 8 (4/16)
Z8693    7    0 recs (0/16)
Z11119   7    0 recs (0/16)
Z3057    7    1 rec: #3 (1/16)
Z4999    7    1 rec: #3 (1/16)
Z7109    7    1 rec: #3 (1/16)
Model of mutation location
[Figure: LG 7 map showing the positions of Z3057/Z4999/Z7109, Z11119/Z8693, and Z13936 relative to the mutation]
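The recombinant counts in the linked-marker table can be turned into rough distances and ranked, closest first. This is only a sketch: with 16 meioses the estimates are coarse, and the percentages are uncorrected recombination fractions.

```python
# Recombinant counts from the linked-marker table: marker -> (recombinants, meioses)
linked_markers = {
    "Z13936": (4, 16),
    "Z8693": (0, 16),
    "Z11119": (0, 16),
    "Z3057": (1, 16),
    "Z4999": (1, 16),
    "Z7109": (1, 16),
}


def percent_recombination(recombinants, meioses):
    """Uncorrected distance estimate in cM."""
    return 100.0 * recombinants / meioses


# Sort markers from fewest to most recombinants (ties keep table order).
ranked = sorted(linked_markers.items(), key=lambda kv: kv[1][0] / kv[1][1])
for marker, (recs, n) in ranked:
    print(f"{marker:8s} {recs}/{n}  ~{percent_recombination(recs, n):.2f} cM")
```

Ranking by distance does not say which side of the mutation a marker lies on; that needs the identity of the recombinant individuals, not just their number.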
Haplotypes of mutant individuals
[Figure, built up over several slides alongside the Z11119, Z3057, and Z13936 gels: marker alleles scored across mutant individuals 1-8, with X marking the inferred crossovers]
Mutant Individuals   1 2 3 4 5 6 7 8
Mutant locus         - - - - - - - -
Z11119               m m m m m m m m
Z3057                m m M m m m m m
Z13936               m m m M M m M M
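One way to read the haplotype diagram: if two markers lie on the same side of the mutation, a single crossover that separates the nearer marker from the mutation also separates the farther one, so the nearer marker's recombinants should be a subset of the farther marker's. If the recombinant individuals do not overlap at all, the markers most likely flank the mutation. A minimal sketch of that reasoning, assuming single crossovers and one scored allele per individual as drawn:

```python
# Alleles in mutant individuals 1-8, as in the haplotype diagram:
# "m" = allele linked to the mutation, "M" = recombinant allele.
haplotypes = {
    "Z11119": "mmmmmmmm",
    "Z3057": "mmMmmmmm",
    "Z13936": "mmmMMmMM",
}


def recombinant_individuals(alleles):
    """1-based numbers of individuals carrying a recombinant allele."""
    return {i for i, allele in enumerate(alleles, start=1) if allele == "M"}


def relative_position(marker_a, marker_b):
    a = recombinant_individuals(haplotypes[marker_a])
    b = recombinant_individuals(haplotypes[marker_b])
    if not a or not b:
        return "undetermined (no recombinants at one marker)"
    if a <= b or b <= a:
        return "consistent with the same side of the mutation"
    if a.isdisjoint(b):
        return "consistent with opposite sides of the mutation"
    return "ambiguous (double crossover or scoring error?)"


print(relative_position("Z3057", "Z13936"))
print(relative_position("Z11119", "Z3057"))
```

Here Z3057 is recombinant only in individual 3 and Z13936 only in individuals 4, 5, 7, and 8, so the two markers appear to bracket the mutant locus.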
Overview of zebrafish genome and genomic resources
Genome size: 1.4 x 10^9 bp (for comparison: C. elegans 10^8, Drosophila 1.7 x 10^8, pufferfish 0.4 x 10^9, mammals 3.3 x 10^9)
25 chromosomes
Maps and other genomic resources
pre-1994
• No two genes or markers were known to be linked
1994
• First genetic map (Postlethwait et al.), ~400 RAPDs, several genes and mutations
• First mutation cloned by the candidate gene approach (Schulte-Merker et al.)
1996
• Centromeres mapped, more RAPDs added, remaining gaps closed (Johnson et al.)
• First SSLP map (Knapik et al.), ~100 markers
• Large-insert genomic libraries become available (Amemiya, Zon, Fishman et al.)
• Insertional mutagenesis used to clone mutated genes (Hopkins and colleagues)
1998
• ~140 genes mapped (Postlethwait et al.)
• SSLP map expanded to ~700 markers (Knapik et al.)
• Large-scale generation of ESTs begins (Washington Univ. Sequencing Group)
• First gene identified by positional cloning (Zhang et al.)
1999
• SSLP map expanded to 2000 markers (Shimoda et al.)
• ~250 genes and ESTs genetically mapped (Gates et al.)
• Radiation hybrid maps become available (Geisler et al., Hukriede et al.)
• ~200 additional genes mapped in RH panels (Geisler et al., Hukriede et al.)
2000
• More than 50,000 ESTs generated (WashU)
• ~2000 genes and ESTs genetically mapped (Woods et al.; Stanford, Oregon)
• ~4000 genes and ESTs mapped in RH panel (WashU, Children’s, NIH, Tübingen)
• ~50 mutations identified by candidate approach (Many groups)
• 5-10 mutations identified by positional cloning (Many groups)
• Dozens of mutations cloned by insertional mutagenesis (Hopkins et al.)
2005-07
• >1,400,000 ESTs defining ~20,000 genes
• ~3400 genes and ESTs genetically mapped (Stanford, Oregon)
• >5000 genes mapped in RH panels (Hukriede et al. 2001, Children’s, NIH, Tübingen)
• Hundreds of mutated genes cloned by insertional mutagenesis (Hopkins/MIT)
• >200 mutated genes cloned by candidate approach and positional cloning
• 7,823 BAC clones sequenced, totaling 1.02 Gb of sequence (June 2007)
2010
• (Almost) entire 1.4 Gb genome in high-quality assembly
• >11,500 genes sequenced as full-length cDNAs (NIH/ZGC)
• ~150,000 SNPs mapped at high-resolution, used for assembly framework (Sanger)
Genome Resource Overview, continued
Deep sequencing will change everything!
Strategy: sequence wild-type and mutant pools of genomic DNA
[Figure: genotypes at a linked SNP. Mutants are mut/mut (G/G); wild-type siblings are mut/+ (G/A) or +/+ (A/A)]
Wild type pool
ATCGGCATGCATCGTAGCACGA
ATCGGCATGCATCGTAGCACGA
ATCGGCATGCATCGTAGCACGA
ATCGGCATGCGTCGTAGCACGA
ATCGGCATGCATCGTAGCACGA
ATCGGCATGCGTCGTAGCACGA
ATCGGCATGCGTCGTAGCACGA
ATCGGCATGCGTCGTAGCACGA
Mutant pool
ATCGGCATGCGTCGTAGCACGA
ATCGGCATGCGTCGTAGCACGA
ATCGGCATGCGTCGTAGCACGA
ATCGGCATGCGTCGTAGCACGA
ATCGGCATGCGTCGTAGCACGA
ATCGGCATGCGTCGTAGCACGA
ATCGGCATGCGTCGTAGCACGA
ATCGGCATGCGTCGTAGCACGA
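The idea behind the pooled reads can be sketched in a few lines: look for positions where the mutant pool carries a single base while the wild-type pool is mixed. This toy version reuses the 22-base reads from the slide and assumes they are already aligned and error-free; a real analysis would need read alignment, depth thresholds, and error handling, none of which is shown.

```python
from collections import Counter

A_READ = "ATCGGCATGCATCGTAGCACGA"  # read carrying A at the variable position
G_READ = "ATCGGCATGCGTCGTAGCACGA"  # read carrying G at the variable position

# Pools as drawn on the slide: 4 A-reads and 4 G-reads in the WT pool,
# 8 G-reads in the mutant pool.
wt_pool = [A_READ] * 3 + [G_READ] + [A_READ] + [G_READ] * 3
mut_pool = [G_READ] * 8


def allele_counts(reads):
    """Per-position base counts for equal-length, pre-aligned reads."""
    return [Counter(column) for column in zip(*reads)]


def candidate_positions(wt_reads, mut_reads):
    """0-based positions where the mutant pool is fixed but the WT pool is mixed."""
    hits = []
    pairs = zip(allele_counts(wt_reads), allele_counts(mut_reads))
    for pos, (wt, mut) in enumerate(pairs):
        if len(mut) == 1 and len(wt) > 1:
            hits.append((pos, dict(wt), dict(mut)))
    return hits


hits = candidate_positions(wt_pool, mut_pool)
print(hits)
```

Positions like this mark SNPs linked to the mutation; the same comparison applied genome-wide is what makes sequencing a source of additional map markers.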
Positional cloning: the rest of the story
http://faculty.ithaca.edu/iwoods/docs/wh
Today: So you have a map location … now what?
Mapped Mutant → Cloned Gene
From mutant map position to cloned gene
• Refining the map location with high-resolution mapping
• Trolling for candidate genes
• Testing candidates
Refining the map position
Two basic strategies:
• Establish boundaries: Test other markers
  - SSLPs – easy, do first
  - SNPs – require sequence data
• Improve resolution: Test more meioses
  - generate more mutants
One advanced strategy:
• Deep sequencing of WT and mutants => SNPs = more markers; may even map the actual mutation, esp. combined with hybrid capture . . .
Data so far:
Mutant with defects in slow muscle specification
Initial Mapping:
Out of 16 meioses:
1 recombinant: Z3057, Z4999, Z7109
0 recombinants: Z8693, Z11119
4 recombinants: Z13936
What SSLPs are in the region?
http://www.zfin.org
Click on ‘Genetic maps’
ZFIN map query
MGH = microsatellite / SSLP map
ZFIN map view
Start close and move out both ways
ZFIN marker view
GenBank Marker View
Obtaining FASTA sequence
Designing PCR primers
http://frodo.wi.mit.edu/primer3/
Testing for informative SSLPs
Informative = polymorphic
Different lengths on WT and mut chromosome
Identifying polymorphic markers
Informative = polymorphic … some will have same SSLP allele, not good for mapping
Refining the map
More fish (i.e. embryos / larvae)
= more recombinants = higher resolving power
Narrowing the critical interval
More fish = more better
5/1156
7/1156
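To put numbers on "more fish = higher resolving power", the sketch below converts the two counts shown into distances and adds them. It assumes, without confirmation from the slide, that 5/1156 and 7/1156 are recombinants at the closest flanking markers on either side of the mutation; the function names are illustrative.

```python
import math


def cm(recombinants, meioses):
    """Uncorrected distance in cM (percent recombinant meioses)."""
    return 100.0 * recombinants / meioses


def critical_interval_cm(left, right):
    """Interval size as the sum of distances to the flanking marker on each side."""
    return cm(*left) + cm(*right)


def meioses_for_resolution(target_cm):
    """Meioses needed so that one recombinant corresponds to `target_cm`."""
    return math.ceil(100.0 / target_cm)


print(round(cm(5, 1156), 3), round(cm(7, 1156), 3))
print(round(critical_interval_cm((5, 1156), (7, 1156)), 3))
print(meioses_for_resolution(0.5))  # 200
```

With 16 meioses a single recombinant corresponds to 6.25 cM; with 1156 it corresponds to less than 0.1 cM, which is the practical meaning of scoring more embryos.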
Now what?
• Identify more markers and do more high-res mapping
  Key point = continually refine boundaries by recombination
• Look in genome for potential candidates
What’s nearby in genome? (as a MODEL of reality)
No luck in genome sequence? (rare these days) misassembly, gaps
• conserved synteny with other fish
• Physical map: BAC clones
• genetic or RH maps
What’s nearby in the genome?
http://www.ensembl.org/Danio_rerio/
Ensembl marker search
No luck! Let’s try a sequence-based search: BLAST/BLAT!
Ensembl BLAT search
Scroll down and hit “Run”
Genome assembly
Good candidate?
External references
calca at ZFIN
calca expression
motor neuron expression
muscle phenotype
what if . . . signal from motor neurons to developing muscle?
What’s known about calca?
http://www.ncbi.nlm.nih.gov/gene
Cool new biology: it’s a secreted peptide with a novel role in directing slow muscle specification! Alert Cell, Science, and Nature!
How to test if this is the right gene?
Is calca the right gene?
• High resolution mapping
  - no recombinants between mutation and gene in lots of meioses
• Phenocopy with MO injection or noncomplementation with another allele
• Rescue with mRNA injection
• Find mutation in coding sequence
Picking the right strategy is often determined by the balance of . . .
- Available Resources
- Number of Candidates
These are often determined by size of candidate interval
Now what?
Test potential candidates
• Turn the candidate into a new map marker
  - could it be the right gene?
  - if not, can it narrow your interval?
How to turn it into a map marker?
What’s a good candidate?
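Single nucleotide polymorphisms
[Figure: Forward and Reverse primers amplify a 200 bp product spanning a SNP (A vs. G); one allele stays uncut at 200 bp, the other is cut into 60 bp and 140 bp fragments]
SNPs = ~1 / 250 bp in genome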
Generating map markers from ESTs/Genes/other sequences
• Find or design primers for PCR (from gDNA)
• Sequence PCR product on WT and mut
• Find RE polymorphism
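The last step, finding a restriction enzyme polymorphism, amounts to asking which enzymes recognize a site in one allele but not the other. A minimal sketch follows, using a handful of common enzymes and two short illustrative sequences that differ at one base (they are not real amplicons); a real search would use a full enzyme list.

```python
# A few common enzymes and their recognition sites. All are palindromic,
# so searching one strand is enough.
ENZYMES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "PstI": "CTGCAG",
    "NsiI": "ATGCAT",
    "SphI": "GCATGC",
}


def site_positions(sequence, site):
    """0-based start positions of every occurrence of a recognition site."""
    sequence = sequence.upper()
    width = len(site)
    return [i for i in range(len(sequence) - width + 1) if sequence[i:i + width] == site]


def diagnostic_enzymes(wt_seq, mut_seq):
    """Enzymes whose recognition sites differ between the two alleles."""
    result = {}
    for name, site in ENZYMES.items():
        wt_sites = site_positions(wt_seq, site)
        mut_sites = site_positions(mut_seq, site)
        if wt_sites != mut_sites:
            result[name] = {"wt": wt_sites, "mut": mut_sites}
    return result


# Illustrative alleles differing at a single base (A vs. G).
wt_allele = "ATCGGCATGCATCGTAGCACGA"
mut_allele = "ATCGGCATGCGTCGTAGCACGA"
print(diagnostic_enzymes(wt_allele, mut_allele))
```

In this toy case SphI has a site in both alleles and is therefore not diagnostic, while NsiI has a site only in the A allele, so digesting the PCR product with it would distinguish the two alleles on a gel.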
Obtaining gDNA from cDNA sequence: exporting from genome
http://genome.ucsc.edu/
Beware of shotgun assembly (i.e. sequence not based on BACs or other large clones)
Here there be Monsters
Safe Sailing (mostly)
Designing PCR primers
http://frodo.wi.mit.edu/primer3/
PCR primers
Amplify from WT and mut, sequence . . .
Locating a SNP to map
. . . run on your mapping panel
- still a candidate? (0 recombinants)
- narrow the candidate interval?
Identifying a restriction enzyme to map your SNP
http://helix.wustl.edu/dcaps/dcaps.html
dCAPS results
Striking the right balance in positional cloning
Mapping:
lots of fish, lots of PCR, lots of gels
should always give you an unambiguous answer
Functional:
Sequencing => often done concomitantly with mapping
mRNA cloning/rescue, Morpholinos => time, money
Ambiguous, easy to make up lots of stories
Mapping vs. Functional follow-up
What if ZF genome turns out to be a dead end?
• Check other fish genomes
- more candidate genes?
- fix a gap in the ZF data
• Check genetic/RH maps on ZFIN
• Start a chromosome walk
- iterative BAC screening
What if ZF genome turns out to be a dead end?
• Check other fish genomes
Pufferfish (Tetraodon, Fugu)
- smaller, more compact genome
- good for getting enhancer regions
Tetraodon calca region
More Candidates to test: find and map zebrafish orthologs
Today: So you have a map location … now what?
Mapped Mutant → Cloned Gene
Tomorrow’s bioinformatics practical:
1) Positional cloning in 2 (mostly) easy steps
2) Morpholinos (ATG, Splice) and rescue
3) Zebrafish orthologs of your favorite human genes
   Identification of enhancer elements
   Transgenic Lines
4) Doing cool things in big batches