GGAW - Oct, 2001 M-W LIN
Study Design for Linkage, Association
and TDT Studies
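Ming-Wei Lin (林明薇), PhD
Department of Family Medicine, Faculty of Medicine, National Yang-Ming University
Department of Medical Research and Education, Taipei Veterans General Hospital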
Collins FS. (1992) Nature genetics 1:3-6
Linkage Mapping for Disease Genes
• Linkage analysis (lod score method)
• Allele-sharing methods
Gregor Mendel
• The principle of segregation of alleles.
• The principle of independent assortment.
Linkage
Linkage describes the phenomenon whereby alleles at neighbouring loci that lie close to one another on the same chromosome are transmitted together more frequently than expected by chance.
Linkage Family
Linkage Analysis Family
[Figure: pedigree with marker genotypes BD, BC, AC, BC; BE, AF; AB, CD; AD, AD]
Recombinant Gametes
Crossing over between two neighbouring loci will produce recombinant gametes.
Recombination Fraction
Recombination fraction:
$$\theta = \frac{\text{number of recombinant gametes}}{\text{total number of gametes}}$$
Estimation of Recombination Fraction
• Direct method: count recombinants.
• Maximum likelihood method: unknown phase, incomplete penetrance, heterogeneity.
Recombination Fraction
• Recombination fraction is a measure of genetic distance.
• 1 cM = 1% chance of recombination between two loci.
Likelihood Odds
$$\text{Likelihood odds} = \frac{\text{likelihood of data if loci linked at } \theta}{\text{likelihood of data if loci unlinked}} = \frac{L(\theta < 0.5)}{L(\theta = 0.5)}$$
Lod Score
$$\text{Lod score}(\theta) = \log_{10} \frac{L(\theta < 0.5)}{L(\theta = 0.5)}$$
Linkage Analysis Methods
• Direct counting of recombinants and non-recombinants
• Maximum likelihood estimation
Phase Known Family
[Figure: pedigree with marker genotypes BD, BC, AC, BC; BE, AF; AB, CD; AD, AD]
Phase Known
$$L(\theta) = \left(\frac{\theta}{2}\right)^{r} \left(\frac{1-\theta}{2}\right)^{n-r}$$
r: number of recombinants
n: total number of meioses
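Lod Score - Phase Known
$$\text{LOD} = \log_{10} \frac{L(\theta)}{L(\theta = 0.5)} = \log_{10} \left\{ \frac{(\theta/2)^{r} \, [(1-\theta)/2]^{n-r}}{(0.25)^{n}} \right\} = \log_{10} \left[ 2^{n} \, \theta^{r} (1-\theta)^{n-r} \right]$$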
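The phase-known lod score can be evaluated numerically. The Python sketch below implements the closed form log10[2^n θ^r (1-θ)^(n-r)] from this slide. The function name `lod_phase_known` and the example counts (10 meioses, 1 recombinant) are illustrative assumptions, not values from the lecture.

```python
import math


def lod_phase_known(theta: float, r: int, n: int) -> float:
    """Lod score log10[2^n * theta^r * (1 - theta)^(n - r)] for phase-known meioses.

    r is the number of recombinants and n the total number of meioses.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= r <= n:
        raise ValueError("r must lie in [0, n]")
    if theta == 0.0 and r > 0:
        # An observed recombinant is impossible when theta = 0.
        return float("-inf")
    # Work with sums of logarithms to avoid underflow for large n.
    lod = n * math.log10(2.0) + (n - r) * math.log10(1.0 - theta)
    if r > 0:
        lod += r * math.log10(theta)
    return lod


if __name__ == "__main__":
    # Hypothetical family: 10 informative meioses, 1 recombinant.
    for theta in (0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5):
        print(f"theta={theta:<4}  lod={lod_phase_known(theta, r=1, n=10):.3f}")
```

At θ = 0.5 the function returns 0, as expected for a likelihood ratio of 1.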
Phase Unknown Family
[Figure: pedigree with marker genotypes BD, BC, AC, BC; AB, CD; AD, AD]
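Phase Unknown
$$L(\theta) = \frac{1}{2} \left(\frac{\theta}{2}\right)^{r} \left(\frac{1-\theta}{2}\right)^{n-r} + \frac{1}{2} \left(\frac{\theta}{2}\right)^{n-r} \left(\frac{1-\theta}{2}\right)^{r}$$
r: number of recombinants
n: total number of meioses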
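Lod Score - Phase Unknown
$$\text{LOD} = \log_{10} \frac{L(\theta)}{L(\theta = 0.5)} = \log_{10} \left\{ \frac{\frac{1}{2} \left[ (\theta/2)^{r} [(1-\theta)/2]^{n-r} + (\theta/2)^{n-r} [(1-\theta)/2]^{r} \right]}{(0.25)^{n}} \right\} = \log_{10} \left\{ 2^{n-1} \left[ \theta^{r} (1-\theta)^{n-r} + \theta^{n-r} (1-\theta)^{r} \right] \right\}$$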
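The phase-unknown formula averages over the two equally likely phases. A minimal Python sketch of the closed form log10{2^(n-1)[θ^r(1-θ)^(n-r) + θ^(n-r)(1-θ)^r]} follows. The function name `lod_phase_unknown` and the example counts are assumptions chosen for illustration.

```python
import math


def lod_phase_unknown(theta: float, r: int, n: int) -> float:
    """Lod score for n meioses with r apparent recombinants and unknown phase."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= r <= n:
        raise ValueError("r must lie in [0, n]")
    # Under one phase there are r recombinants, under the other n - r.
    phase_1 = theta ** r * (1.0 - theta) ** (n - r)
    phase_2 = theta ** (n - r) * (1.0 - theta) ** r
    total = phase_1 + phase_2
    if total == 0.0:
        # Happens only at theta = 0 with 0 < r < n: the data are impossible.
        return float("-inf")
    return (n - 1) * math.log10(2.0) + math.log10(total)


if __name__ == "__main__":
    # Hypothetical family: 10 meioses, 1 apparent recombinant.
    for theta in (0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5):
        print(f"theta={theta:<4}  lod={lod_phase_unknown(theta, r=1, n=10):.3f}")
```

Because the two phases are interchangeable, the score is the same for r and n - r apparent recombinants.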
Lod Score - Maximum Likelihood Estimate (Z)
• Z can be calculated at any value of θ between 0 and 0.5, but is conventionally reported at θ = 0, 0.01, 0.05, 0.1, 0.2, 0.3, and 0.4.
• Zmax is the maximum lod score; the value of θ at which it is attained is the maximum likelihood estimate (MLE) of θ.
• A lod score can be converted to a chi-square statistic by multiplying by 2 ln 10 ≈ 4.6.
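As a rough illustration of these three points, the Python sketch below tabulates Z at the conventional θ values, finds Zmax by a simple grid search, and converts it to a chi-square statistic. It uses the phase-known lod formula from the earlier slide. The grid search, the function names, and the example counts are assumptions made for illustration; linkage software typically maximises the likelihood more carefully.

```python
import math

CONVENTIONAL_THETAS = (0.0, 0.01, 0.05, 0.1, 0.2, 0.3, 0.4)


def lod(theta: float, r: int, n: int) -> float:
    """Phase-known lod score, log10[2^n * theta^r * (1 - theta)^(n - r)]."""
    if theta == 0.0 and r > 0:
        return float("-inf")
    value = n * math.log10(2.0) + (n - r) * math.log10(1.0 - theta)
    if r > 0:
        value += r * math.log10(theta)
    return value


def max_lod(r: int, n: int, steps: int = 500):
    """Grid search over theta in [0, 0.5]; returns (theta_hat, Zmax)."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    theta_hat = max(grid, key=lambda t: lod(t, r, n))
    return theta_hat, lod(theta_hat, r, n)


def lod_to_chi_square(z: float) -> float:
    """Multiply a lod score by 2 * ln(10), approximately 4.6."""
    return 2.0 * math.log(10.0) * z


if __name__ == "__main__":
    r, n = 1, 10  # hypothetical counts
    for theta in CONVENTIONAL_THETAS:
        print(f"theta={theta:<4}  Z={lod(theta, r, n):.3f}")
    theta_hat, z_max = max_lod(r, n)
    print(f"theta_hat={theta_hat:.3f}  Zmax={z_max:.3f}  "
          f"chi-square={lod_to_chi_square(z_max):.2f}")
```

For phase-known data the grid maximum coincides with the direct-counting estimate r/n whenever r/n does not exceed 0.5.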
Total Lod Score
Lod scores obtained from individual families can be added together, at the same value of θ, to give the total lod score.
Statistical Significance of the Lod Score
lod score > 3: evidence of linkage
2 < lod score < 3: suggestive evidence of linkage
-2 < lod score < 2: uninformative for linkage
lod score < -2: exclusion of linkage
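Adding family lod scores and reading the total against these thresholds can be sketched in a few lines of Python. The slide does not say how scores exactly equal to 3, 2, or -2 are treated, so the boundary handling below is an assumption, as are the function names and the example family scores.

```python
def total_lod(family_lods):
    """Sum per-family lod scores that were all evaluated at the same theta."""
    return sum(family_lods)


def interpret_lod(z: float) -> str:
    """Map a lod score to the categories on the slide.

    Assumption: boundary values (3, 2, -2) are assigned to the stronger
    category, because the slide leaves them unspecified.
    """
    if z >= 3.0:
        return "evidence of linkage"
    if z >= 2.0:
        return "suggestive evidence of linkage"
    if z > -2.0:
        return "uninformative for linkage"
    return "exclusion of linkage"


if __name__ == "__main__":
    # Hypothetical lod scores from three families at one value of theta.
    scores = [1.2, 0.9, 1.1]
    z = total_lod(scores)
    print(f"total lod = {z:.2f}: {interpret_lod(z)}")
```

No single family in this made-up example reaches a lod score of 2, yet the pooled score exceeds 3, which is why summing across families matters.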
Is a Pedigree Useful for Linkage Analysis?
• Are critical individuals in the pedigree doubly heterozygous at the loci? (Informative)
• Can the offspring be scored as recombinants or non-recombinants? (Phase)
Parameters Assumed in Lod Score Analysis
• Transmission mode of disease
• Recombination fraction
• Trait allele frequencies
• Penetrance values for each possible disease phenotype
• Marker allele frequencies.
Advantages of Lod Score Analysis
• Statistically, it is a more powerful approach than any nonparametric method.
• Utilizes every family member’s phenotypic and genotypic information.
• Provides an estimate of the recombination fraction.
• Provides a statistical test for linkage and for genetic (locus) heterogeneity.
Limitations of Lod Score Method
•assumes single locus inheritance
•requires specification of disease gene frequency and penetrance
•has reduced power when disease model is grossly misspecified
Complex Diseases
• No clear pattern of Mendelian inheritance
• A mix of genetic and environmental factors
• Incomplete penetrance
• Phenocopies
• Oligogenic or polygenic
• Heterogeneity
• High frequency of disease-causing allele
Recurrence Risk (λ)
$$\lambda_{r} = \frac{\text{frequency in relatives of affected person}}{\text{population frequency}}$$
where r denotes the degree of relationship.
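The ratio is simple to compute once both frequencies are known. In the Python sketch below, the function name and the input frequencies are hypothetical and chosen only to show the arithmetic.

```python
def recurrence_risk_ratio(risk_in_relatives: float, population_risk: float) -> float:
    """lambda_r = frequency in relatives of an affected person / population frequency."""
    if population_risk <= 0.0:
        raise ValueError("population_risk must be positive")
    return risk_in_relatives / population_risk


if __name__ == "__main__":
    # Hypothetical inputs: 6% risk in siblings, 0.4% population frequency.
    lambda_s = recurrence_risk_ratio(0.06, 0.004)
    print(f"lambda_s = {lambda_s:.1f}")
```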
Recurrence Risk
Genetic mapping is much easier for traits with high λs (λs > 10) than for those with low λs (λs < 2).
Recurrence Risk of Different Diseases
Disease             λs
Cystic fibrosis     500
Type I diabetes     15
Schizophrenia       8.6
Type II diabetes    3.5
Allele-sharing Methods
• Identical by state (IBS): two alleles of the same form.
• Identical by descent (IBD): two alleles descended from the same ancestral allele.
Allele-sharing Methods
Testing whether affected relatives inherited a region IBD (or IBS) more often than expected under random Mendelian segregation.
[Figure: pedigrees with marker genotypes illustrating relatives who share 2, 1, and 0 alleles identical by descent (IBD = 2, IBD = 1, IBD = 0)]
[Figure: pairs of genotypes illustrating 2, 1, and 0 alleles identical by state: IBS = 2 (BC, BC), IBS = 1 (AC, AB), IBS = 0 (AD, BC)]
Affected Sib-pair Methods
A sib pair may share 0, 1, or 2 alleles identical by descent (IBD) at any marker locus, with probabilities 0.25, 0.5, and 0.25, respectively, under random Mendelian segregation.
[Figure: sib pairs with marker genotypes sharing IBD = 2 (25%), IBD = 1 (50%), and IBD = 0 (25%)]
Affected Sib-pair Methods
If the marker locus is independent of the trait locus, the probabilities that an affected sib pair shares 0, 1, or 2 alleles IBD remain 0.25, 0.50, and 0.25.
Affected Sib-pair Methods
If the marker locus is linked to the trait locus, an excess of affected sib pairs sharing two alleles IBD is expected.
Allele-sharing Methods
•Affected Sib-pairs
•Affected Pedigree Member
Pearson χ² statistics
Comparing observed numbers of sib-pairs sharing 0, 1, 2 alleles IBD with their expectations under the null hypothesis.
Pearson χ² statistics
IBD sharing                          0      1      2
Observed (alternative hypothesis)    n0     n1     n2
Expected (null hypothesis)           N/4    N/2    N/4
where N = n0 + n1 + n2
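The comparison of observed and expected IBD counts can be written as a goodness-of-fit test. The Python sketch below computes the Pearson statistic against the expectations N/4, N/2, N/4 and a p-value on 2 degrees of freedom (three categories with fixed expected proportions). The function name and the example counts are assumptions for illustration.

```python
import math


def ibd_chi_square(n0: int, n1: int, n2: int):
    """Pearson chi-square for sib pairs sharing 0, 1, 2 alleles IBD.

    Returns (statistic, p_value), with the p-value taken on 2 degrees of freedom.
    """
    total = n0 + n1 + n2
    if total <= 0:
        raise ValueError("at least one sib pair is required")
    observed = (n0, n1, n2)
    expected = (total / 4.0, total / 2.0, total / 4.0)
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 2 degrees of freedom the chi-square upper tail is exactly exp(-x / 2).
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical counts for 100 affected sib pairs.
    stat, p = ibd_chi_square(10, 50, 40)
    print(f"chi-square = {stat:.2f}, p = {p:.2e}")
```

This statistic does not distinguish excess sharing from deficient sharing, so the direction of the deviation still has to be checked against the counts.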
Comments on Allele-Sharing Method
There is no need to specify any genetic parameters of the transmission model.
It is less powerful for detecting linkage than the lod score method if the genetic transmission model can be specified correctly.
It is poor at providing a precise location of the disease gene.
Thresholds for Mapping Complex Traits
Mapping method        Suggestive linkage         Significant linkage
                      Lod score    P value       Lod score    P value
Lod score             1.9          0.0017        3.3          0.000049
Sibs and half-sibs    2.2          0.00074       3.6          0.000022
Uncle-nephew          2.3          0.00056       3.7          0.000018
First cousin          2.3          0.00052       3.7          0.000016
Lander and Kruglyak (1995) Nature Genetics, 11, 241-247
Association Study
• Case-control study
• Transmission disequilibrium test (TDT)
Case-Control Study
[Figure: one group of individuals drawn with open symbols and one group drawn with filled symbols, each individual labelled with a marker genotype]
Linkage Disequilibrium
Linkage disequilibrium is the non-random association in a population of alleles at closely linked loci.
Linkage Disequilibrium
[Figure: the haplotype shared around locus X becomes shorter over N generations]
A2---B1-----C2---X----D3-----E4----F2
A2---B1-----C2---X----D3-----E4
A2---B1-----C2---X----D3
B1-----C2---X----D3
C2---X----D3
C2---X
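The shrinking haplotype can be described quantitatively. The slides give no formula, so the Python sketch below relies on two standard population-genetic definitions that are assumptions here: the disequilibrium coefficient D = P(AB) - P(A)P(B), and its decay under random mating by a factor (1 - θ) per generation. Function names and example values are hypothetical.

```python
def ld_coefficient(p_ab: float, p_a: float, p_b: float) -> float:
    """D = P(AB) - P(A) * P(B): haplotype frequency minus its expectation
    if the alleles were independent."""
    return p_ab - p_a * p_b


def ld_after_generations(d0: float, theta: float, generations: int) -> float:
    """Assumed decay model under random mating: D_t = D_0 * (1 - theta)^t."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    return d0 * (1.0 - theta) ** generations


if __name__ == "__main__":
    d0 = ld_coefficient(p_ab=0.5, p_a=0.5, p_b=0.5)  # hypothetical frequencies
    for theta in (0.001, 0.01, 0.1):
        remaining = ld_after_generations(d0, theta, generations=100)
        print(f"theta={theta:<5}  D after 100 generations = {remaining:.4f}")
```

Under this model, disequilibrium persists for many generations only between very tightly linked loci, which matches the shrinking shared segment around X.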
TDT Study
The TDT examines the transmission of a particular allele at a locus from heterozygous parents to their affected offspring.
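"Trios" for TDT study
[Figure: four parent-offspring trios, each with two parents and one affected child, labelled with marker genotypes. The parental allele transmitted to the affected child acts as the "case"; the non-transmitted allele acts as the "control".]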
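The comparison of transmitted and non-transmitted alleles in the trios can be turned into a test statistic. The slides do not state one, so the Python sketch below assumes the McNemar-type statistic commonly used for the TDT, (b - c)² / (b + c) on 1 degree of freedom, where b and c count how often heterozygous parents transmit or do not transmit the allele of interest. The function name and the counts are hypothetical.

```python
import math


def tdt_statistic(transmitted: int, not_transmitted: int):
    """McNemar-type TDT statistic (assumed form, not given on the slides).

    transmitted / not_transmitted: counts over heterozygous parents of how
    often the allele of interest was or was not passed to the affected child.
    Returns (chi_square, p_value) with 1 degree of freedom.
    """
    b, c = transmitted, not_transmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi_square = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


if __name__ == "__main__":
    # Hypothetical data: 100 heterozygous parents, allele transmitted 60 times.
    chi2, p = tdt_statistic(60, 40)
    print(f"TDT chi-square = {chi2:.2f}, p = {p:.4f}")
```

Only heterozygous parents enter the counts, because a homozygous parent transmits the same allele regardless of the child's disease status.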
What does a positive association imply?
• Direct causal effect
• Linkage disequilibrium
• Population stratification
When to Use Association Study
• Candidate gene
• Positive evidence of linkage
• Candidate region allelic associations
Suitable Sample for Linkage Disequilibrium Mapping
•Genetically isolated populations
•Younger populations
Successful Examples of Mapping Genes by Association Studies
• Autoimmune diseases associated with HLA: IDDM, multiple sclerosis, ankylosing spondylitis, rheumatoid arthritis
• Angiotensin-converting enzyme and heart disease
• Low-density lipoprotein receptor and heart disease
• Insulin locus and IDDM
Sample Size Required
Linkage for Monogenic Traits
• One large family
• at least 40 informative meioses
• 20 cM marker density
• Expected lod score > 3
Sample Size Required
Allele-Sharing
• λs = 2
• at least 600 affected sib pairs
• narrow down the region to 1 cM
Sample Size Required
Linkage for Complex Traits
Sham, Lin et al (2000) Am J Human Genetics 66, 1661-1668.
Genetic Markers
A completely informative marker locus at a recombination fraction of 0 from the disease locus.
Genetic Models
Model                    f0       f1       f2      Kp      q
Common Recessive (CR)    0.005    0.005    0.50    0.01    0.100
Common Dominant (CD)     0.005    0.500    0.50    0.01    0.005
Minor gene 1 (MG1)       0.050    0.200    0.80    0.10    0.130
Minor gene 2 (MG2)       0.050    0.150    0.45    0.10    0.207
Kp: population risk; q: disease allele frequency
f0: penetrance for the genotype AA; f1: penetrance for the genotype Aa; f2: penetrance for the genotype aa
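The population risk column can be checked against the penetrances and allele frequency. The Python sketch below assumes Hardy-Weinberg genotype frequencies, an assumption not stated on the slide, so that Kp = f0(1-q)² + 2 f1 q(1-q) + f2 q², with a as the disease allele. The function and dictionary names are illustrative.

```python
def population_risk(f0: float, f1: float, f2: float, q: float) -> float:
    """Kp under assumed Hardy-Weinberg proportions.

    f0, f1, f2: penetrances for genotypes AA, Aa, aa; q: frequency of allele a.
    """
    p = 1.0 - q
    return f0 * p * p + f1 * 2.0 * p * q + f2 * q * q


# (f0, f1, f2, q) for each model, copied from the table on this slide.
MODELS = {
    "CR": (0.005, 0.005, 0.50, 0.100),
    "CD": (0.005, 0.500, 0.50, 0.005),
    "MG1": (0.050, 0.200, 0.80, 0.130),
    "MG2": (0.050, 0.150, 0.45, 0.207),
}

if __name__ == "__main__":
    for name, params in MODELS.items():
        print(f"{name}: Kp = {population_risk(*params):.4f}")
```

Under this assumption the computed values come out close to the tabulated 0.01 for the two common models and close to 0.10 for the two minor-gene models.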
Pedigree Types
Number of Pedigrees Required
α = 0.0001, Power = 90%, Homogeneity
Pedigree type    Common Recessive    Common Dominant    Minor Gene 1    Minor Gene 2
Type 1           18                  50                 285             626
Type 2           10                  16                 80              199
Type 3           16                  44                 245             582
Type 4           13                  7                  124             300
Number of Pedigrees Required
α = 0.0001, Power = 90%, Heterogeneity (heterogeneity parameter = 0.5)
Pedigree type    Common Recessive    Common Dominant    Minor Gene 1    Minor Gene 2
Type 1           74                  227                1139            2503
Type 2           35                  79                 305             771
Type 3           65                  191                965             2317
Type 4           44                  26                 468             1159
Sample Size Required
Case-Control Study (α = 0.05, Power = 90%)
Relative Risk (RR)    Allele Frequency (p)
                      0.05      0.1       0.2
1.1                   41813     19747     8714
1.2                   10921     5142      2252
1.3                   5060      2375      1032
1.4                   2962      1385      597
1.5                   1969      918       392
2                     582       266       109
3                     188       82        30
4                     101       42        13
5                     65        6         -
10                    19        -         -
Sample Size Required
Case-Control Study (α = 0.05, Power = 90%)
Gene Effect Size    Allele Frequency (p)
                    0.1       0.2      0.3      0.4
10 %                19747     8714     5037     3198
30 %                2375      1032     585      361
50 %                918       392      217      130
Sample Size Required
TDT Study (α = 0.001, Power = 80%)
Genotypic Risk Ratio    Frequency of allele A
                        0.01             0.10            0.50           0.80
4.0                     1098 (0.048)     150 (0.346)     103 (0.500)    222 (0.235)
2.0                     5823 (0.029)     695 (0.245)     340 (0.500)    640 (0.267)
1.5                     19320 (0.025)    2218 (0.197)    949 (0.500)    1663 (0.286)
Risch & Merikangas (1996) Science, 273, 1516-1517.
Define phenotype
Identify evidence of genetic component
Define study design: extended families, sib pairs, or single affected member
Family, clinical information and DNA collection
Genotyping
Data analysis
Identify regions of interest
Physical mapping / gene identification
Successful Examples
• Cystic fibrosis
• Huntington disease
• Early onset breast cancer (BRCA1, BRCA2)
• Alzheimer disease (chr14, chr1)
• Maturity-onset diabetes of the young (MODY) (chr12)
• ...