Introduction to Bioinformatics for Medical Research
Gideon Greenspan, gdg@cs.technion.ac.il
Lecture 14: Genetic Mapping



Genetic Mapping

• Background
• Linkage Analysis
  – Pedigrees
  – Probability model
• Association Analysis
  – Single markers
  – Haplotype blocks
• Haplotype Resolution


Why map genes?

• Many diseases are partially genetic
  – Also: environmental factors, randomness
• We want to identify these genes
  – Early diagnosis for abortion or regular checks
  – First step towards developing treatment
• Individual sequencing is too costly (today)
  – Sequence a small number of markers
  – Analyze statistically via biological principles


Meiotic Recombination


Linkage Analysis

• Identify families with disease
  – Many families, many members is best
• Construct pedigree of family members
  – Collect disease status for each individual
• Test alleles at multiple markers
  – Problems: dispersed families, death
• Create models for possible disease locations
  – Find model which best explains data


Pedigrees

[Figure: pedigree diagram. Legend: Male, Female; Healthy, Diseased, Carrier]


Linkage Analysis Example

position   LOD_score     information
 0.00       -1.254417    0.224384
 1.52        2.836135    0.226379
...[other data skipped]...
18.58       13.688599    0.384088
19.92       14.238474    0.401992
21.26       14.718037    0.426818
22.60       15.159389    0.462284
22.92       15.056713    0.462510
23.24       14.928614    0.463208
23.56       14.754848    0.464387
...[other data skipped]...
81.84        1.939215    0.059748
90.60      -11.930449    0.087869

• position: putative distance of disease gene from first marker, in centimorgans
• LOD_score: log likelihood of placing the disease gene at that distance, relative to it being unlinked
• Maximum log likelihood score: 15.159389
• Most 'likely' position: 22.60
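
As a small illustration of how the output above is read, the following Python sketch scans the rows shown on this slide and reports the position with the highest LOD score. The rows are copied from the slide; the skipped rows are not reconstructed, and the function name `peak_lod` is only an illustrative choice.

```python
# Rows (position in cM, LOD score, information) copied from the slide.
# Rows the slide marks as skipped are left out, not guessed.
rows = [
    (0.00, -1.254417, 0.224384),
    (1.52, 2.836135, 0.226379),
    (18.58, 13.688599, 0.384088),
    (19.92, 14.238474, 0.401992),
    (21.26, 14.718037, 0.426818),
    (22.60, 15.159389, 0.462284),
    (22.92, 15.056713, 0.462510),
    (23.24, 14.928614, 0.463208),
    (23.56, 14.754848, 0.464387),
    (81.84, 1.939215, 0.059748),
    (90.60, -11.930449, 0.087869),
]


def peak_lod(rows):
    """Return the (position, LOD score) pair with the highest LOD score."""
    position, lod, _information = max(rows, key=lambda row: row[1])
    return position, lod


best_position, best_lod = peak_lod(rows)
print(f"peak LOD {best_lod:.2f} at {best_position:.2f} cM")
```

A negative score, as at 0.00 and 90.60, means the data fit the unlinked model better than a disease gene at that position.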


Founder Probabilities

• Founders assumed to be unrelated
• Each founder marker assigned probability based on population allele frequencies
  – Linkage equilibrium
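
A minimal sketch of this founder model in Python, assuming each founder allele is drawn independently from the population frequencies (random mating within a marker, linkage equilibrium between markers). The function name and the example frequencies are hypothetical, not taken from the lecture.

```python
from math import prod


def founder_genotype_probability(genotype, allele_freqs):
    """Probability of one founder's ordered genotype across all markers.

    genotype: list of (maternal_allele, paternal_allele), one per marker.
    allele_freqs: list of dicts mapping allele -> population frequency.
    Assumption: alleles are independent within a marker and markers are
    in linkage equilibrium, so all the frequencies simply multiply.
    """
    return prod(
        freqs[maternal] * freqs[paternal]
        for (maternal, paternal), freqs in zip(genotype, allele_freqs)
    )


# Hypothetical allele frequencies at two markers.
frequencies = [{"A": 0.7, "a": 0.3}, {"B": 0.5, "b": 0.5}]
founder = [("A", "a"), ("B", "B")]
print(founder_genotype_probability(founder, frequencies))
```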


Inheritance Probabilities

• Markov chain of selector distributions for each marker
  – No interference
• Probability that allele source is different from previous due to recombination = RF (recombination fraction)
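
The Markov chain can be sketched as follows, assuming one binary selector per marker for a single meiosis (0 or 1 for which of the parent's two copies was transmitted) and a prior of 0.5 at the first marker. The function name and the example recombination fractions are illustrative assumptions.

```python
def selector_path_probability(selectors, recombination_fractions):
    """Probability of one meiosis' selector sequence along the chromosome.

    selectors: list of 0/1 values, one per marker, saying which of the
        parent's two copies the allele came from.
    recombination_fractions: RF between consecutive markers, so it has
        one entry fewer than selectors.
    Assumption: no interference, so each step depends only on the
    previous marker.
    """
    if len(recombination_fractions) != len(selectors) - 1:
        raise ValueError("need one recombination fraction per marker interval")
    probability = 0.5  # at the first marker both sources are equally likely
    steps = zip(selectors, selectors[1:], recombination_fractions)
    for previous, current, rf in steps:
        # The source switches with probability RF and stays with 1 - RF.
        probability *= rf if current != previous else 1.0 - rf
    return probability


# Three markers, one switch of source between the second and third.
print(selector_path_probability([0, 0, 1], [0.1, 0.2]))
```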


Disease Probabilities

• Disease dependent on allele at unseen locus
  – Many positions tested
• Probability of disease depends on dominance model, liability class and penetrance
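
One way to picture this is a penetrance table: for each number of disease alleles carried at the unseen locus, the probability of being affected. The sketch below assumes a single liability class, and the penetrance values are hypothetical numbers chosen only to contrast a dominant and a recessive model.

```python
# Hypothetical penetrances, P(affected | copies of disease allele),
# for a single liability class. None of these values come from the lecture.
DOMINANT = {0: 0.01, 1: 0.90, 2: 0.90}
RECESSIVE = {0: 0.01, 1: 0.01, 2: 0.90}


def disease_probability(disease_allele_copies, affected, penetrance):
    """P(observed disease status | genotype at the disease locus)."""
    p_affected = penetrance[disease_allele_copies]
    return p_affected if affected else 1.0 - p_affected


# A heterozygous carrier is likely affected only under the dominant model.
print(disease_probability(1, True, DOMINANT))
print(disease_probability(1, True, RECESSIVE))
```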


Association Analysis

• Also: Linkage Disequilibrium (LD) analysis
• Collect disease cases and healthy controls
  – Small, inbred populations are best (e.g. us)
• Measure alleles at densely spaced markers
  – Single Nucleotide Polymorphisms are ideal
• Test marker–disease correlations
  – How well does each SNP predict disease?
  – P value under assumption of independence
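
A single-marker test of this kind can be sketched with a Pearson chi-square test on a 2x2 table of allele counts in cases and controls. The slide does not name a specific test, so the choice of chi-square is an assumption, as are the function name and the example counts.

```python
from math import erfc, sqrt


def allele_association_p_value(case_counts, control_counts):
    """Pearson chi-square test on a 2x2 allele-count table.

    case_counts, control_counts: (count of allele 1, count of allele 2).
    Returns (chi_square, p_value) with 1 degree of freedom.
    Assumption: every row and column total is non-zero.
    """
    table = [case_counts, control_counts]
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    total = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if allele and disease status were independent.
            expected = row_totals[i] * column_totals[j] / total
            chi_square += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > x) equals erfc(sqrt(x / 2)).
    return chi_square, erfc(sqrt(chi_square / 2.0))


# Hypothetical allele counts at one SNP.
chi_square, p_value = allele_association_p_value((60, 40), (40, 60))
print(chi_square, p_value)
```

A small P value says the observed counts would be unlikely if the SNP and the disease were independent; it does not by itself show that the SNP is causal.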


Association Analysis Example

[Figure: association analysis example. Surviving numeric labels: 3 1 10 1 2 / 1 3 11 1 1]


False Associations

• Population structure
  – Migration and admixture
  – Preferential mating
• Phenotypic interactions
  – Epistasis between distant sites
  – Selective sweeps
• Many individual tests
  – Tighten the significance threshold to account for multiple tests
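
One common way to handle many individual tests is the Bonferroni correction, which divides the overall significance level by the number of tests. The slide does not name a method, so this is an assumed example; the marker names and P values are hypothetical.

```python
def bonferroni_threshold(alpha, number_of_tests):
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    return alpha / number_of_tests


def significant_markers(p_values, alpha=0.05):
    """Return the markers whose P value passes the corrected threshold.

    p_values: dict mapping marker name -> P value from a single-marker test.
    """
    if not p_values:
        return []
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [marker for marker, p in p_values.items() if p < threshold]


# Hypothetical single-marker results: only snp2 survives the correction,
# although snp1 would pass an uncorrected 0.05 threshold.
results = {"snp1": 0.04, "snp2": 0.0004, "snp3": 0.2}
print(significant_markers(results))
```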


Recombination Hotspots


Bottleneck Effects

[Figure labels: 10^6 years, 10^5 years]


Genetic Drift

[Figure: plot of Allele Frequency (axis 0 to 1000) against Generation (axis 0 to 1000)]
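
A plot like this one can be produced by a simple resampling simulation. The sketch below assumes a Wright-Fisher style model, which the slide does not name: a fixed pool of 1000 allele copies, where each copy in the next generation carries the allele with probability equal to its current frequency. The starting count of 500 and the seed are arbitrary choices.

```python
import random


def simulate_drift(copies=1000, start_count=500, generations=1000, seed=0):
    """Return the allele count in every generation, starting at generation 0.

    Assumption: each of the `copies` allele slots in the next generation
    is filled independently, carrying the allele with probability equal
    to its frequency in the current generation.
    """
    rng = random.Random(seed)
    counts = [start_count]
    for _ in range(generations):
        frequency = counts[-1] / copies
        next_count = sum(rng.random() < frequency for _ in range(copies))
        counts.append(next_count)
    return counts


trajectory = simulate_drift()
print(trajectory[0], trajectory[-1])
```

Once the count reaches 0 or the full pool size it stays there, which is how drift alone can remove or fix an allele.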


Haplotype Blocks

  1   GAACTGC   ATTCGACTG   CCAGTAGC
  2   ACGTACA   GATGAGCTG   CCAGTAGC
  …
 99   ACGTACA   AACCGAGGT   TGTACTAA
100   GAACTGC   GATGAGCTG   TGTGCTAA

Figure annotations:
• Recombination hotspot separates blocks
• Few block variants due to bottlenecks, drift
• Mutation hotspot


HaploBlock Analysis

• Treat haplotype as single unified marker
  – Increased information, fewer tests
• Consider recombination events
  – Consider haplotypes in each block separately
• Consider mutation events
  – Cluster haplotypes into similarity clades
• Test disease correlation with each block
  – P values or prediction ability
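
The first three steps can be sketched on the four haplotypes shown on the Haplotype Blocks slide: split each haplotype into its blocks, count the variants seen in each block, and measure how far apart two variants are. The helper names are illustrative, and Hamming distance is only one assumed choice of similarity measure for building clades.

```python
from collections import Counter

# The four haplotypes shown on the Haplotype Blocks slide, keyed by
# individual number and already split into their three blocks.
haplotypes = {
    1: ("GAACTGC", "ATTCGACTG", "CCAGTAGC"),
    2: ("ACGTACA", "GATGAGCTG", "CCAGTAGC"),
    99: ("ACGTACA", "AACCGAGGT", "TGTACTAA"),
    100: ("GAACTGC", "GATGAGCTG", "TGTGCTAA"),
}


def block_variants(haplotypes):
    """Count how often each variant occurs, one Counter per block."""
    return [Counter(block) for block in zip(*haplotypes.values())]


def hamming(first, second):
    """Number of positions at which two equal-length variants differ."""
    return sum(a != b for a, b in zip(first, second))


for index, variants in enumerate(block_variants(haplotypes), start=1):
    print(f"block {index}: {dict(variants)}")

# Two third-block variants differ at a single site, so a similarity
# clustering would plausibly place them in the same clade.
print(hamming("TGTACTAA", "TGTGCTAA"))
```

Each block's variant (or clade) can then be tested against disease status in place of a single SNP.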


Haplotype Resolution

Variable Loci          1     2     3     4     5
Maternal Chromosome    A     G     T     C     T
Paternal Chromosome    T     C     T     A     G
Observed Genotypes     A/T   C/G   T/T   A/C   G/T

The two chromosome rows are the hidden haplotypes; only the genotypes are observed.
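
To see why resolution is a problem at all, the sketch below enumerates every pair of haplotypes that is consistent with the observed genotypes on this slide. It is a brute-force illustration rather than a resolution method, and the function name is an assumption.

```python
from itertools import product


def compatible_haplotype_pairs(genotypes):
    """Return every unordered haplotype pair consistent with the genotypes.

    genotypes: list of (allele, allele) pairs, one per locus, unordered.
    """
    per_locus_options = []
    for first, second in genotypes:
        if first == second:
            # Homozygous locus: only one way to assign the alleles.
            per_locus_options.append([(first, second)])
        else:
            # Heterozygous locus: either chromosome may carry either allele.
            per_locus_options.append([(first, second), (second, first)])
    pairs = set()
    for assignment in product(*per_locus_options):
        haplotype_a = "".join(alleles[0] for alleles in assignment)
        haplotype_b = "".join(alleles[1] for alleles in assignment)
        pairs.add(tuple(sorted((haplotype_a, haplotype_b))))
    return pairs


# Observed genotypes from the slide: A/T C/G T/T A/C G/T.
observed = [("A", "T"), ("C", "G"), ("T", "T"), ("A", "C"), ("G", "T")]
pairs = compatible_haplotype_pairs(observed)
print(len(pairs))
print(("AGTCT", "TCTAG") in pairs)
```

With four heterozygous loci there are 2^(4-1) = 8 candidate pairs, and the pair shown on the slide is only one of them, so the genotypes alone cannot identify it.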


Linkage vs Association

Linkage analysis          Association analysis
Family pedigrees          Unrelated individuals
Observed recombination    Historical recombination
Mendelian diseases        Complex diseases
Widely spaced markers     Closely spaced SNPs
10^6 bp accuracy          10^4 bp accuracy
Many successes            Many false positives


Stages of Mapping a Gene

• Demonstrate disease is hereditary
  – Show it runs in families
• Linkage analysis to identify region
  – Widely-spaced markers, e.g. RFLPs
• Association analysis to narrow region
  – Closely-spaced markers, usually SNPs
• Clone the gene within found region
  – Investigate its metabolic relevance


Resources

• Genetic Analysis Software
  – http://linkage.rockefeller.edu/soft/list.html
• Introduction to Genetic Analysis
  – http://www2.qimr.edu.au/davidD/Course/
• Genetic Analysis Resources
  – http://ihome.cuhk.edu.hk/~b400559/complexdismap.html
• NCBI Human Genes and Disease
  – http://www.ncbi.nlm.nih.gov/disease/
• International Haplotype Map Project
  – http://www.genome.gov/10001688