[Figure: chart with a vertical axis from 0 to 200 and a horizontal axis of years from 1945 to 2001; title and data series not recoverable]
The human genome
3,000,000,000 bases
30,000 genes
10,000,000 SNPs
500,000 proteins
DNA
2 copies per cell
  - Well defined dynamic range
Stable
  - e.g. mtDNA used for identification of human remains
Robust typing methods
  - PCR
Not good drug targets
DNA resource
Extraction from blood lymphocytes by lysis of cells, treatment with proteinase K and precipitation
Immortalization by EBV transformation
  - cell culture
Whole genome amplification
DNA marker systems
Microsatellite genotyping
  - Linkage studies
SNP genotyping
  - Association studies
    Candidate gene
    Whole genome
Microsatellites
…..TGACCGGGATGTAA(CA)NCGTAGCTAGCGAT…..
N > 25
>100 bases
One every 1 cM
Can expand from generation to generation
>2 alleles
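A minimal sketch of how a (CA)N repeat of the kind shown above could be located in a sequence. The function name `find_ca_repeats`, the `min_repeats` parameter and the example allele with N = 27 are assumptions made for illustration; they are not taken from the slides.

```python
import re


def find_ca_repeats(sequence, min_repeats=3):
    """Return (start_index, repeat_count) for each run of CA dinucleotides."""
    pattern = re.compile(r"(?:CA){%d,}" % min_repeats)
    return [(m.start(), len(m.group()) // 2) for m in pattern.finditer(sequence)]


# Hypothetical allele: the flanking sequence from the slide with N = 27 CA units.
flank_left = "TGACCGGGATGTAA"
flank_right = "CGTAGCTAGCGAT"
allele = flank_left + "CA" * 27 + flank_right

repeats = find_ca_repeats(allele)
print(repeats)
```

Different alleles of the same microsatellite differ in the repeat count, which is why the marker can have more than two alleles.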
Analysis of microsatellites
PCR with fluorescently labelled primers
Pooling
Separation by capillary gel electrophoresis
Monogenic disorders
Monogenic - one gene "damaged"
  - Cystic fibrosis
  - Huntington's disease
  - Thalassemia
  - Sickle-cell anemia
  - SCID
  - …
Monogenic disorders
Linkage analysis
  - Mendelian inheritance
[Figure: pedigree with parents M and F and children C1, C2/C3 and C4]
Microsatellite genotyping
Genome scans
  - 400 microsatellite markers
  - 1 marker every 10 cM
Problem - 100 genes between two microsatellite markers
Advantage - each microsatellite has many alleles
And now?
Fine-map
  - Genotype more microsatellites in the locus
  - Reduce interval - 10 genes per peak
Educated guess based on biology
  - Candidate gene
Sequence the most promising candidates
Use known polymorphism (SNP) for genotyping
Drawback of SNPs - biallelic
Single Nucleotide Polymorphisms (SNPs)
Single base change
10 million known in the human genome
Stable through evolution
TGCATATGCAAGTAACCGTAACGTATACGTTCATTGGCAT
TGCATATGCAAATAACCGTAACGTATACGTTTATTGGCAT
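The two sequences above differ by single base changes. A short sketch that lists those positions, assuming the two sequences are already aligned and of equal length (the function name `find_snps` is hypothetical):

```python
def find_snps(seq_a, seq_b):
    """Return (1-based position, base in seq_a, base in seq_b) for each mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


seq_1 = "TGCATATGCAAGTAACCGTAACGTATACGTTCATTGGCAT"
seq_2 = "TGCATATGCAAATAACCGTAACGTATACGTTTATTGGCAT"

snps = find_snps(seq_1, seq_2)
print(snps)  # [(12, 'G', 'A'), (32, 'C', 'T')]
```

Each difference has exactly two observed bases, which illustrates the biallelic nature of SNPs.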
Primer extension
Oligonucleotide ligation
Hybridisation
Nuclease cleavage
Gel separation
Plate reader
DNA array
Mass spectrometry
CNG SNP genotyping platforms
Sequencing
GOOD assay - MALDI MS
TaqMan
Amplifluor
Illumina
SNPlex
Affymetrix
Pyrosequencing
[Figure: genotyping platforms positioned by number of different SNPs against number of individuals. Platforms shown: SNPlex, mass spectrometry, Illumina, pyrosequencing, TaqMan, Amplifluor, sequencing, Affymetrix]
SNP genotyping methods with choice
(the headers of the two numeric columns are not recoverable from the slide)

Method          Supplier      (?)    (?)   Optimisation
GOOD assay      -             3      5     Self
Pyrosequencing  Biotage       10     4     Self with support
Amplifluor      Chemicon      1      1     Self with support
TaqMan          Applied Bio   1      1     Optimisation manufacturer
SNPlex          Applied Bio   48     9     Optimisation manufacturer
GoldenGate      Illumina      1536   9     Optimisation manufacturer
Common disorders
Cardiovascular disease
Diabetes
Asthma
Cancer
Common and often strong environmental component
Candidate gene selection
Functional candidate genes:
  - «glucose metabolism and toxicity»
  - «renal hemodynamic and hypertension»
Positional candidate genes:
  - «chromosome 3q project»
Candidate genes to confirm:
  - «literature» (ACE, PON2, CCR5…)
Candidate genes from animal models
  - «mice genes»
  - «GK rat genes»
EURAGEDIC
SNP selection - haplotypes
SLC2A2

All     DK      FIN     FR      CAUC    FR+cauc  Haplo  12567  24067  25887  32993  12623_1  24895  16160  21445  3897  16459
0.6327  0.7083  0.6644  0.6563  0.5731  0.5906   1      0      0      0      0      0        0      0      0      0     0
0.0379  0.0000  0.0312  0.0000  0.0753  0.0443   4      1      0      0      0      0        0      0      0      0     0
0.0925  0.1458  0.0441  0.0500  0.0658  0.0606   2      0      1      0      1      0        0      1      1      0     0
0.0332  0.0000  0.0415  0.0500  0.0500  0.0514   5      0      1      0      0      0        0      1      1      0     0
0.0480  0.0000  0.0500  0.1000  0.0434  0.0681   3      0      0      0      1      0        0      1      1      1     1
0.0189  0.0625  0.0000  0.0000  0.0000  0.0000   6      0      0      1      0      0        0      0      0      0     0
…       …       …       …       …       …        …
0.0104  0.0000  0.0000  0.0438  0.0000  0.0329   7      0      0      0      0      0        0      0      0      0     0
0.0074  0.0000  0.0000  0.0000  0.0167  0.0100   8      0      1      0      1      0        0      0      0      0     0
…       …       …       …       …       …        …
0.0000  0.0000  0.0000  0.0000  0.0023  0.0000   71     0      0      0      0      0        0      0      1      0     0
0.0000  0.0000  0.0000  0.0000  0.0023  0.0000   72     0      1      0      0      0        0      0      1      0     0
0.0000  0.0000  0.0000  0.0000  0.0023  0.0000   73     0      0      0      0      0        0      0      1      0     0

SLC2A2: 28 SNPs, 4 SNPs selected
73 haplotypes, 6 haplotypes >5%
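A rough sketch of the ">5%" filter applied to the haplotype rows visible in the table. The rule used here, keeping a haplotype when its frequency exceeds 5% in at least one of the six population columns, is an assumption; the slide does not state the exact criterion, although this reading does give six haplotypes for the rows shown. The names `haplotype_freqs` and `common_haplotypes` are hypothetical.

```python
# Frequencies per haplotype in the column order All, DK, FIN, FR, CAUC, FR+cauc.
# Only the rows visible on the slide are included; elided rows are left out.
haplotype_freqs = {
    1: [0.6327, 0.7083, 0.6644, 0.6563, 0.5731, 0.5906],
    4: [0.0379, 0.0000, 0.0312, 0.0000, 0.0753, 0.0443],
    2: [0.0925, 0.1458, 0.0441, 0.0500, 0.0658, 0.0606],
    5: [0.0332, 0.0000, 0.0415, 0.0500, 0.0500, 0.0514],
    3: [0.0480, 0.0000, 0.0500, 0.1000, 0.0434, 0.0681],
    6: [0.0189, 0.0625, 0.0000, 0.0000, 0.0000, 0.0000],
    7: [0.0104, 0.0000, 0.0000, 0.0438, 0.0000, 0.0329],
    8: [0.0074, 0.0000, 0.0000, 0.0000, 0.0167, 0.0100],
    71: [0.0000, 0.0000, 0.0000, 0.0000, 0.0023, 0.0000],
    72: [0.0000, 0.0000, 0.0000, 0.0000, 0.0023, 0.0000],
    73: [0.0000, 0.0000, 0.0000, 0.0000, 0.0023, 0.0000],
}


def common_haplotypes(freqs, threshold=0.05):
    """Return sorted haplotype IDs whose frequency exceeds threshold in any column."""
    return sorted(h for h, values in freqs.items() if max(values) > threshold)


common = common_haplotypes(haplotype_freqs)
print(common)  # [1, 2, 3, 4, 5, 6]
```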
Human HapMap
Common samples (288 samples from several populations)
Genotype ~1 million SNPs, 5% frequency
Select haplotype tag SNPs
Has helped advance technology and genome knowledge
Benefit to complex disease genetics?
...mapping ancestral haplotype blocks across the genome
Whole genome association
Illumina - Infinium
Affymetrix - 100/500 kSNPs
SNPs and linkage analysis
Illumina (4,600 SNPs - now 6,000 SNPs)
Affymetrix (10,000 SNPs)
Compared to microsatellites
[Figure: Chromosome 1 - Information with parents genotyped. Series: Affymetrix 10K, Illumina 4 OPA, microsatellite 400 markers]
[Figure: Chromosome 1 - Information without parents genotyped. Series: Affymetrix 10K, Illumina 4 OPA, microsatellite 400 markers]
MADO HLA typing
Matching of tissue donors/recipients
HLA genes
  - Highly polymorphic
  - Best match of sequence - best chance of success
Time consuming - expensive
GGGTGAAGGAGCGCAGAGGCCGATTCT  A*0231
GGGTGAAGGACCGCAGAGGCCGATTGT  A*0230
GGGTGAAGGACCGCAGAGGCCGATTCT  A*0029
GGGTGAAGGACCGCAGAGGCCGATTCT  A*0227
GGGTGAAGGACCGCAGAGGCCGATTCT  A*0226
GGGTGAAGGACCGCAGAGGCCGATTCT  A*0225
GGGTGAAGGACCGCAGAGGCCGATTCT  A*0224
GGGTGAAGGACCGCAGAGGCCGATTCT  A*0222
GGGTGAAGGACCGCAGAGGCCGATTCT  A*02202
GGGTGAAGGACCGCAGAGGCCGATTCT  A*02201
GGGTGAAGGACCGCAGAGGCCGATTCT  A*0219
GGGTGAAGGACCGCAGAGGCCGATTCT  A*0218
GGGTGAAGGACCGCAGAGGCCGATTCT  A*02172
GGGTGAAGGACCGCAGAGGCCGATTCT  A*02171
GGGTGAAGGACCGCAGAGGCCGATTCT  A*0216
GGGTGAAGGACCGCAGAGGCCGATTCT  A*0213
GGGTGAAGGACCGCAGAGGCCGATTCT  A*0212
GGGTGAAGGACCGCAGAGGCCGATTCT  A*0211
GGGTGAAGGACCGCAGAGGCCGATTCT  A*0209
GGGTGAAGGACCGCAGAGGCCGATTCT  A*0207
GGGTGAAGGACCGCAGAGGCCGATTCT  A*0204
GGGTGAAGGACCGCAGAGGCCGATTCT  A*0203
GGGTGAAGGGCCGCAGAGGCCGATTCT  A*0202
GGGTGAAGGACCGCAGAGGCCGATTCT  A*02016
GGGTGAAGGACCGCAGAGGCCGATTCT  A*02015
GGGTGAAGGACCGCAGAGGCCGATTCT  A*02014
GGGTGAAGGACCGCAGAGGCCGATTCT  A*02013
GGGTGAAGGACCGCAGAGGCCGATTCC  A*02012
GGGTGAAGGACCGCAGAGGCCGATTCT  A*02011
ACGGGAAGAACCGAAGCGGCCGATTCC  A*0109
ACGGGAAGAACCGCAGCGGCCGATTCC  A*0108
ACTGAAAGAACCGCAGCGGCCGATTCC  A*0107
ACGGGAAGAACCGCAGCGGCCGATTCC  A*0106
ACGGGAAGAACCGCAGCGGCCGATTCC  A*0103
SNP position (27 column labels, in the order extracted): 257, 256, 243, 240, 238, 233, 228, 219, 203, 200, 194, 180, 176, 171, 163, 160, 144, 142, 126, 123, 121, 106, 102, 98, 97, 81, 78
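Each row of the allele listing pairs a 27-base SNP pattern with an HLA-A allele name, and many alleles share the same pattern. A hedged sketch of how alleles that cannot be told apart at these positions could be grouped, using a handful of rows copied from the listing (the function name `group_by_pattern` is an assumption):

```python
from collections import defaultdict

# A few (pattern, allele) rows copied from the listing; not the full table.
rows = [
    ("GGGTGAAGGAGCGCAGAGGCCGATTCT", "A*0231"),
    ("GGGTGAAGGACCGCAGAGGCCGATTGT", "A*0230"),
    ("GGGTGAAGGACCGCAGAGGCCGATTCT", "A*0227"),
    ("GGGTGAAGGACCGCAGAGGCCGATTCT", "A*0226"),
    ("GGGTGAAGGGCCGCAGAGGCCGATTCT", "A*0202"),
    ("GGGTGAAGGACCGCAGAGGCCGATTCC", "A*02012"),
    ("ACGGGAAGAACCGCAGCGGCCGATTCC", "A*0108"),
    ("ACGGGAAGAACCGCAGCGGCCGATTCC", "A*0106"),
]


def group_by_pattern(pairs):
    """Map each SNP pattern to the list of alleles that carry it."""
    groups = defaultdict(list)
    for pattern, allele in pairs:
        groups[pattern].append(allele)
    return dict(groups)


groups = group_by_pattern(rows)
# Alleles sharing a pattern are indistinguishable with this SNP set alone.
ambiguous = [alleles for alleles in groups.values() if len(alleles) > 1]
print(ambiguous)
```

Alleles that end up in the same group would need additional SNP positions to be resolved.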
TAGGTACAGTYAGRTACTAGGAGTCA
TAGGTACAGTCAGATACTAGGAGTCA
TAGGTACAGTCAGGTACTAGGAGTCA
TAGGTACAGTTAGATACTAGGAGTCA
TAGGTACAGTTAGGTACTAGGAGTCA
TAGGTACAGTC/TAGA/GTACTAGGAGTCA
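The sequence with Y and R uses IUPAC ambiguity codes (Y = C or T, R = A or G), so two ambiguous positions give four candidate haplotypes. A minimal sketch of that expansion; the function name `expand_ambiguity` is hypothetical and only the codes needed here are included in the lookup table:

```python
from itertools import product

# Subset of the IUPAC nucleotide codes: plain bases plus R (purine) and Y (pyrimidine).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}


def expand_ambiguity(sequence):
    """Enumerate every concrete sequence consistent with the ambiguity codes."""
    choices = [IUPAC[base] for base in sequence]
    return ["".join(combo) for combo in product(*choices)]


haplotypes = expand_ambiguity("TAGGTACAGTYAGRTACTAGGAGTCA")
for h in haplotypes:
    print(h)
```

A genotype call of Y and R at two sites does not reveal which of these haplotypes are actually present, since the phase between the two sites is unknown.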
µ-haplotyping
Example for HLA-DRB1

                                                    Masses
Name            Sequence                   Primer   A        C   G        T
DRB1_1971_1r20  CGTCGCTGTCGAAGCGCAspG^spG  1178.1   1505.4   -   -        1496.3
DRB1_1972_1r20  CGTCGCTGTCGTAGCGCGspC^spG  1154.1   -        -   -        1472.3
DRB1_1973_1r20  CGTCGCTGTCGAAGCGCAspA^spG  1162.1   -        -   -        1480.3
DRB1_1974_1r20  CGTCGCTGTCGAAGYGCAspC^spG  1110.1   1437.4   -   1453.4   1428.3
DRB1_1975_1r20  CGTCGCTGTCGAASCGCAspC^spG  1110.1   1437.4   -   1453.4   1428.3
MADO Frequent Alleles

A     B     DRB1
0101  1501  0101
0201  4001  0301
0301  4403  0401
2902  5701  0701
3001  0702  1101
2402  0801  1104
2301  3501  1302
3002  3503  1501
      4402
      5101
      1302
      1801
8     12    8
28 alleles
"HLA families"

Individual: 1333 14

Allele 1         Allele 2
HLA-DRB1*0801    HLA-DRB1*1001
HLA-DRB1*0804    HLA-DRB1*1001
HLA-DRB1*0802    HLA-DRB1*1001
HLA-DRB1*0806    HLA-DRB1*1001
HLA-DRB1*0807    HLA-DRB1*1001
HLA-DRB1*0826    HLA-DRB1*1001
HLA-DRB1*0811    HLA-DRB1*1001
HLA-DRB1*0805    ?
HLA-DRB1*0813    ?
HLA-DRB1*0824    ?
Frequencies of HLA alleles

Allele     Frequency   Allele     Frequency   Allele     Frequency
DRB1*0701  15.72       DRB1*0103  1.26        DRB1*0810  0.06
DRB1*1501  12.32       DRB1*0407  1.05        DRB1*0410  0.04
DRB1*0301  10.99       DRB1*1001  1.02        DRB1*0416  0.04
DRB1*0101  6.59        DRB1*1103  1.01        DRB1*1503  0.04
DRB1*1101  6.56        DRB1*1502  0.94        DRB1*1406  0.04
DRB1*1301  5.69        DRB1*0901  0.79        DRB1*1402  0.03
DRB1*0401  5.18        DRB1*1305  0.38        DRB1*1116  0.02
DRB1*1104  4.73        DRB1*0408  0.38        DRB1*1306  0.02
DRB1*1302  3.71        DRB1*0803  0.29        DRB1*1310  0.02
DRB1*0404  2.87        DRB1*0804  0.29        DRB1*0106  0.02
DRB1*1401  2.72        DRB1*1602  0.23        DRB1*0414  0.02
DRB1*0102  2.47        DRB1*0406  0.22        DRB1*1407  0.02
DRB1*0801  1.86        DRB1*0304  0.17        DRB1*0411  0.02
DRB1*1601  1.69        DRB1*0305  0.14        DRB1*0417  0.02
DRB1*0403  1.58        DRB1*0802  0.13        DRB1*1417  0.02
DRB1*1303  1.55        DRB1*1404  0.12        DRB1*1423  0.02
DRB1*1201  1.40        DRB1*0806  0.10        DRB1*1433  0.02
DRB1*0402  1.30        DRB1*1202  0.09        DRB1*1109  0.01
DRB1*0405  1.28        DRB1*0302  0.09        DRB1*1408  0.01
DRB1*1102  1.27        DRB1*0805  0.06        DRB1*1403  0.01
www.allelefrequencies.net
Weighting of results

Allele 1   Frequency  Allele 2   Frequency  Product of frequencies  Likelihood
DRB1*0801  1.86       DRB1*1001  1.02       1.8972                  0.781502735
DRB1*0804  0.29       DRB1*1001  1.02       0.2958                  0.121847201
DRB1*0802  0.13       DRB1*1001  1.02       0.1326                  0.054621159
DRB1*0806  0.1        DRB1*1001  1.02       0.102                   0.042016276
DRB1*0807  0.00001    DRB1*1001  1.02       0.0000102               4.20163E-06
DRB1*0826  0.00001    DRB1*1001  1.02       0.0000102               4.20163E-06
DRB1*0811  0.00001    DRB1*1001  1.02       0.0000102               4.20163E-06
DRB1*0805  0.06       ?          0.000001   0.00000006              2.47155E-08
DRB1*0813  0.00001    ?          0.000001   1E-11                   4.11924E-12
DRB1*0824  0.00001    ?          0.000001   1E-11                   4.11924E-12

Σ = 2.42763066
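The likelihood column appears to be each product of frequencies divided by the sum Σ of all products; that reading reproduces the values in the table. A short sketch under that assumption, using the frequencies shown on the slide (the names `candidates` and `weight_pairs` are hypothetical):

```python
# (allele 1, frequency 1, allele 2, frequency 2), values as given on the slide.
# 0.00001 and 0.000001 are the small placeholder frequencies used there for
# alleles missing from the frequency table and for the unknown "?" allele.
candidates = [
    ("DRB1*0801", 1.86, "DRB1*1001", 1.02),
    ("DRB1*0804", 0.29, "DRB1*1001", 1.02),
    ("DRB1*0802", 0.13, "DRB1*1001", 1.02),
    ("DRB1*0806", 0.1, "DRB1*1001", 1.02),
    ("DRB1*0807", 0.00001, "DRB1*1001", 1.02),
    ("DRB1*0826", 0.00001, "DRB1*1001", 1.02),
    ("DRB1*0811", 0.00001, "DRB1*1001", 1.02),
    ("DRB1*0805", 0.06, "?", 0.000001),
    ("DRB1*0813", 0.00001, "?", 0.000001),
    ("DRB1*0824", 0.00001, "?", 0.000001),
]


def weight_pairs(pairs):
    """Return (likelihoods, total): each frequency product normalised by the sum."""
    products = [f1 * f2 for _, f1, _, f2 in pairs]
    total = sum(products)
    return [p / total for p in products], total


likelihoods, total = weight_pairs(candidates)
for (a1, _, a2, _), lik in zip(candidates, likelihoods):
    print(f"{a1} / {a2}: {lik:.6g}")
print(f"sum of products = {total:.8f}")
```

Normalising by the sum makes the likelihoods of all candidate allele pairs add up to 1, so the most frequent combination stands out directly.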
Who did the work and who paid for it:
Centre National de Génotypage, Paris
  - Doris Lechner, Ramón Kucharzak, Christine Plançon, Francis Boussicault, Jörg Tost, Nelly Papin, Celine Besse, Steven McGinn, Florence Mauger, Jeanne-Antide Perrier, David Derbala, Dominique Brunel, Aurelie Bérard, Heather McKhann, Sandra Giancola, Jean-Guillaume Garnier, Laetitia Sobre, Tim Frayling, Diane Lebeau, Olivier Jaunay, Katia Darii, Valerie Dumaz, Susanne Schwonbeck, Gwendoline Thiery, Nicholas Dorvault, David Arnauld, Hafida El Abdalaoui, Florence Busato, Inga Müller
  - Philipp Schatz, Ole Brandt, Bernard China, Pierre Lindenbaum, Karine Moreau, Pierre Libeau, Pierre-Antoine Gourraud, Noah Christian, Laetitia Dorel, Sascha Sauer, Anett Smyra, Sami Zerzeri, Valerie Fenelon, Alexandrine Garrigue, Aurelie Lecomte, Magalie Caucimon, Dorothee Duva, Valerie Sourice, Stephanie Durand, Raphael Demarty, Jean-Michel Dupont, Naomi Barak, Christine Camilleri, Olivier Jaunay
  - Fumi Matsuda, Nino Margetic, Simon Heath, Mark Lathrop
Supported by the French Government and the European Commission