Introduction to Genetic Epidemiology. HGEN619, 2006. Hermine H. Maes. Genetic Epidemiology: establishing / quantifying the role of genes and environment in variation in disease and complex traits ~ answering questions about the importance of nature and nurture on individual differences.
Introduction to Genetic Epidemiology
HGEN619, 2006
Hermine H. Maes
Genetic Epidemiology
Establishing / quantifying the role of genes and environment in variation in disease and complex traits ~ answering questions about the importance of nature and nurture on individual differences
Finding those genes and environmental factors
Genes & Environment
How much of the variation in a trait is accounted for by genetic factors?
Do shared environmental factors contribute significantly to the trait variation?
The first of these questions addresses heritability, defined as the proportion of the total variance explained by genetic factors.
Nature-nurture question
Sir Francis Galton: comparing the similarity of identical and fraternal twins yields information about the relative importance of heredity vs environment on individual differences
Gregor Mendel: classical experiments demonstrated that the inheritance of model traits in carefully bred material agreed with a simple theory of particulate inheritance
Ronald Fisher: first coherent account of how the ‘correlations between relatives’ could be explained ‘on the supposition of Mendelian inheritance’
People and Ideas
[Figure: timeline of people and ideas, running to 2000, with streams labeled Molecular Genetics and Population Genetics]
Galton (1865-ish): Correlation, Family Resemblance, Twins, Ancestral Heredity
Mendel (1865): Particulate Inheritance; Genes: single in gamete, double in zygote; Segregation ratios
Darwin (1858, 1871): Natural Selection, Sexual Selection, Evolution
Fisher (1918): Correlation & Mendel, Maximum Likelihood, ANOVA: partition of variance
Spearman (1904): Common Factor Analysis
Wright (1921): Path Analysis
Thurstone (1930's): Multiple Factor Analysis
Mather (1949) & Jinks (1971): Biometrical Genetics, Model Fitting (plants)
Joreskog (1960): Covariance Structure Analysis, LISREL
Morton (1974): Path Analysis & Family Resemblance
Watson & Crick (1953)
Jinks & Fulker (1970): Model Fitting applied to humans
Martin & Eaves (1977): Genetic Analysis of Covariance Structure
Elston etc (19..): Segregation, Linkage
Rao, Rice, Reich, Cloninger (1970's): Assortment, Cultural Inheritance
Neale (1990): Mx
Biometrical Model
[Figure: genotypic effects on a scale centered on the midpoint m, with aa at -d, Aa at h, and AA at +d]
To make the simple two-allele model concrete, let us imagine that we are talking about genes that influence adult stature. Let us assume that the normal range of height for males is from 4 feet 10 inches to 6 feet 8 inches; that is, about 22 inches. And let us assume that each somatic chromosome has one gene of roughly equivalent effect. Then, roughly speaking, we are thinking in terms of loci for which the homozygotes contribute +- 1/2 inch (from the midpoint), depending on whether they are AA, the increasing homozygote, or aa, the decreasing homozygote. In reality, although some loci may contribute greater effects than this, others will almost certainly contribute less; thus we are talking about the kind of model in which any particular polygene is having an effect that would be difficult to detect by the methods of classical genetics.
(from the Biometrical Genetics chapter in Methodology for Genetic Studies of Twins and Families)
Polygenic Traits
[Figure: four histograms of phenotype frequencies, one per panel]
1 Gene: 3 Genotypes, 3 Phenotypes
2 Genes: 9 Genotypes, 5 Phenotypes
3 Genes: 27 Genotypes, 7 Phenotypes
4 Genes: 81 Genotypes, 9 Phenotypes
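The genotype and phenotype counts on this slide can be reproduced by brute-force enumeration. The sketch below is an illustration rather than part of the original slides. It assumes biallelic loci with purely additive, equal effects, so a genotype's phenotype is simply its number of increasing alleles; the function name `phenotype_classes` is made up for this example.

```python
from collections import Counter
from itertools import product


def phenotype_classes(n_genes):
    """Enumerate every multi-locus genotype and tally phenotype scores.

    Assumption (not from the slides): each locus carries 0, 1 or 2
    increasing alleles and the phenotype is their sum across loci.
    """
    genotypes = product((0, 1, 2), repeat=n_genes)
    # Counter maps phenotype score -> number of genotypes with that score
    return Counter(sum(genotype) for genotype in genotypes)


for n_genes in range(1, 5):
    tally = phenotype_classes(n_genes)
    n_genotypes = sum(tally.values())  # equals 3 ** n_genes
    n_phenotypes = len(tally)          # equals 2 * n_genes + 1
    print(n_genes, "genes:", n_genotypes, "genotypes,", n_phenotypes, "phenotypes")
```

The tally counts genotype classes only; it does not weight them by allele frequency, so it shows the number of distinct phenotypes rather than the population distribution.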
Stature in adolescent twins
[Figure: histogram of Stature in Women, x-axis from 145.0 to 190.0, y-axis (count) from 0 to 700; Mean = 169.1, Std. Dev = 6.40, N = 1785]
Individual differences
Physical attributes (height, eye color)
Disease susceptibility (asthma, anxiety)
Behavior (intelligence, personality)
Life outcomes (income, children)
Polygenic Model
Polygenic model: variation for a trait is caused by a large number of individual genes, each inherited in strict conformity to Mendel’s laws
Multifactorial model: many genes and many environmental factors, also of small and equal effect
Effects of many small factors combined > normal (Gaussian) distribution of trait values, according to the central limit theorem
The normal distribution is to be expected whenever variation is produced by the addition of a large number of effects, none of which is predominant
This holds quite often
Quantitative traits
Central Limit Theorem
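A small simulation can make the central limit argument tangible. This is a minimal sketch under stated assumptions, not something taken from the slides: 50 hypothetical loci, allele frequency 0.5 at every locus, equal additive effects, and no environmental contribution. The function name `simulate_trait` is invented for this example.

```python
import random
import statistics


def simulate_trait(n_loci, n_people, seed=0):
    """Return one trait score per person.

    Each person carries 2 * n_loci alleles; each allele is 'increasing'
    with probability 0.5 (an assumption), and the trait score is the
    count of increasing alleles.
    """
    rng = random.Random(seed)
    return [
        sum(rng.random() < 0.5 for _ in range(2 * n_loci))
        for _ in range(n_people)
    ]


scores = simulate_trait(n_loci=50, n_people=2000)
mean = statistics.fmean(scores)
sd = statistics.pstdev(scores)

# Theory for a sum of 100 fair coin flips: mean 50, standard deviation 5
print("mean:", round(mean, 2), "sd:", round(sd, 2))

# Crude text histogram: the bell shape appears without being built in
tally = {}
for score in scores:
    tally[score] = tally.get(score, 0) + 1
for score in sorted(tally):
    print(score, "#" * (tally[score] // 10))
```

No single locus dominates the sum, which is the condition the slide names for the normal distribution to emerge.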
Continuous or Categorical?
Body Mass Index vs “obesity”
Blood pressure vs “hypertensive”
Bone Mineral Density vs “fracture”
Bronchial reactivity vs “asthma”
Neuroticism vs “anxious/depressed”
Reading ability vs “dyslexic”
Aggressive behavior vs “delinquent”
Multifactorial Threshold Model of Disease
[Figure: two distributions of disease liability. Single threshold: unaffected / affected. Multiple thresholds: normal / mild / mod / severe]
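Under the threshold model, the observed category frequencies fix where the thresholds sit on the liability scale. The sketch below assumes liability is standard normal, which is the usual convention but is not stated on the slide, and uses made-up category proportions. `liability_thresholds` is a hypothetical helper name.

```python
from statistics import NormalDist


def liability_thresholds(category_proportions):
    """Return the z-score thresholds separating ordered categories.

    category_proportions lists the population proportion in each category,
    from lowest to highest liability, and should sum to 1.
    Assumption: liability follows a standard normal distribution.
    """
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    thresholds = []
    cumulative = 0.0
    for proportion in category_proportions[:-1]:
        cumulative += proportion
        thresholds.append(standard_normal.inv_cdf(cumulative))
    return thresholds


# Hypothetical single threshold: 90% unaffected, 10% affected
single = liability_thresholds([0.90, 0.10])

# Hypothetical multiple thresholds: normal, mild, mod, severe
multiple = liability_thresholds([0.70, 0.15, 0.10, 0.05])

print([round(t, 3) for t in single])
print([round(t, 3) for t in multiple])
```

A rarer disease pushes the threshold further into the upper tail, so the same liability distribution can describe both common and rare conditions.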
Genetically Complex Diseases
Imprecise phenotype
Phenocopies / sporadic cases
Low penetrance
Locus heterogeneity / polygenic effects
Complex Trait Model
[Figure: diagram relating the Disease Phenotype to Common environment, Individual environment, Polygenic background, Gene1, Gene2, Gene3 and a Marker; connections labeled Linkage, Linkage disequilibrium, Mode of inheritance and Association]
Causes of Variation
pre-1990: estimation of ‘anonymous’ genetic and environmental components of phenotypic variation (genetic epidemiologic studies)
post-1990: identification of QTL’s, quantitative trait loci contributing to genetic variation of complex (quantitative) traits (linkage and association studies)
Stages of Genetic Mapping
Are there genes influencing this trait? Genetic epidemiological studies
Where are those genes? Linkage analysis
What are those genes? Association analysis
Partitioning Variation
Phenotypic variance (VP) partitioned into genetic (VG) and environmental (VE): VP = VG + VE
Assumptions: additivity & independence of genetic and environmental effects
Heritability (h^2): proportion of variance due to genetic influences (h^2 = VG / VP)
Property of a group (not an individual), thus specific to a group in place & time
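The arithmetic behind h^2 = VG / VP is simple, but a worked case helps show why heritability is a property of a group. The variance values below are hypothetical and chosen only for illustration; the function name is likewise an assumption.

```python
def heritability(v_g, v_e):
    """Proportion of phenotypic variance due to genetic influences.

    Uses VP = VG + VE, i.e. assumes additive and independent genetic
    and environmental effects, as on the slide.
    """
    v_p = v_g + v_e
    return v_g / v_p


# Hypothetical group 1: modest environmental variance
h2_group1 = heritability(v_g=0.6, v_e=0.4)

# Hypothetical group 2: same genetic variance, more variable environment
h2_group2 = heritability(v_g=0.6, v_e=1.4)

print(round(h2_group1, 2), round(h2_group2, 2))
```

The genetic variance is identical in both groups, yet heritability differs because the environmental variance differs. That is the sense in which the estimate is tied to a group in place and time.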
Sources of Variance
Genetic factors: Additive (A), Dominance (D)
Environmental factors: Common / Shared (C), Specific / Unique (E)
Measurement Error, confounded with E
Genetic Factors
Additive genetic factors (A): sum of all the effects of individual loci
Non-additive genetic factors: result of interactions between alleles at the same locus (dominance, D) or between alleles at different loci (epistasis)
Environmental Factors
Shared [common or between-family] environmental factors (C): aspects of the environment shared by members of the same family or people who live together, which contribute to similarity between relatives
Non-shared [specific, unique or within-family] environmental factors (E): unique to an individual, contributing to variation among family members but not to their covariation
Estimating Components
Estimate phenotypic variance components from data on covariances of related individuals
Different types of relative pairs share different amounts of phenotypic variance
Biometrical genetics theory: specify amounts in terms of genetic and environmental variances
Three major types of study: family, adoption and twin
Designs to disentangle G+E
Resemblance between relatives caused by:
Shared Genes (G = A + D)
Environment Common to family members (C)
Differences between relatives caused by:
Non-shared Genes
Unique environment (E)
Informative Designs
Family studies: G + C confounded
MZ twins alone: G + C confounded
MZ twins reared apart: rare, atypical, selective placement?
Adoption studies: increasingly rare, atypical, selective placement?
MZ and DZ twins reared together
Extended twin design
Classical Twin Study
MZ and DZ twins reared together
MZ twins genetically identical
DZ twins share on average half their genes
Equal Environments Assumption: MZ and DZ twins share relevant environmental influences to the same extent
Zygosity
MZ and DZ twins: determining zygosity using ABI Profiler™ genotyping (9 STR markers + sex)
[Figure: marker profiles for three twin pairs, labeled MZ, DZ, DZ]
Identity at marker loci, except for rare mutation
MZ & DZ Correlations
rMZ > rDZ: evidence for G (heritability)
C: increases both rMZ & rDZ
Relative magnitude of the MZ and DZ correlations indicates the contribution of additive genetic (G) and shared environmental (C) factors
1 - rMZ: importance of specific environmental (E) factors
Twin Correlations
[Figure: four panels of MZ and DZ twin correlations with a legend for A, C and E; values shown per panel: .5 and 1.0; .6 and .8; .7 and .8; .4 and .8]
Example
Thus if VP = VA + VC + VE = 2.0, CovMZ = VA + VC = 1.6 and CovDZ = 1/2 VA + VC = 1.2,
then, by algebra, VA = 0.8, VC = 0.8, VE = 0.4.
But it isn’t always so simple: consider VP = 1.0, CovMZ = 0.6, CovDZ = 0.65;
then VA = -0.1, VC = 0.7, VE = 0.4, which includes a nonsensical negative variance component.
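The "by algebra" step can be spelled out. Subtracting CovDZ from CovMZ leaves 1/2 VA, so VA = 2(CovMZ - CovDZ); then VC = 2 CovDZ - CovMZ and VE = VP - CovMZ. The sketch below applies those three rearrangements to the two sets of numbers from the example; the function name `solve_ace` is an invention for this illustration.

```python
def solve_ace(v_p, cov_mz, cov_dz):
    """Solve VP = VA + VC + VE, CovMZ = VA + VC, CovDZ = VA/2 + VC."""
    v_a = 2 * (cov_mz - cov_dz)   # CovMZ - CovDZ equals VA / 2
    v_c = 2 * cov_dz - cov_mz     # remove VA from twice the DZ covariance
    v_e = v_p - cov_mz            # whatever MZ twins do not share
    return v_a, v_c, v_e


first = solve_ace(v_p=2.0, cov_mz=1.6, cov_dz=1.2)
second = solve_ace(v_p=1.0, cov_mz=0.6, cov_dz=0.65)

for label, components in (("first", first), ("second", second)):
    v_a, v_c, v_e = (round(value, 2) for value in components)
    flag = " (negative variance component)" if min(v_a, v_c, v_e) < 0 else ""
    print(label, "VA =", v_a, "VC =", v_c, "VE =", v_e, flag)
```

Direct algebra has no way to keep estimates inside their admissible range, which is one reason the second case produces a negative VA.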
Observed Statistics
Trait variance & MZ and DZ covariance as unique observed statistics
Estimate the contributions of additive genes (A), shared (C) and specific (E) environmental factors, according to the genetic model
Path analysis is a useful tool to generate the expectations for the variances and covariances under a model
Path Analysis
Allows us to diagrammatically represent linear models for the relationships between variables
Easy to derive expectations for the variances and covariances of variables in terms of the parameters of the proposed linear model
Permits translation into matrix formulation
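Variance Components
[Figure: path diagram in which the latent factors E (Unique Environment), A (Additive Genetic), C (Shared Environment) and D (Dominance Genetic) point to Phenotype through paths e, a, c and d]
P = eE + aA + cC + dD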
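ACE Model Path Diagram for MZ & DZ Twins
[Figure: path diagram in which latent factors A, C and E point to the phenotypes of twin 1 (PT1) and twin 2 (PT2) through paths a, c and e; the A factors of the two twins correlate MZ = 1.0 / DZ = 0.5 and the C factors correlate 1]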
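Tracing the paths in this diagram gives the expected twin covariance matrix: each twin's variance is a^2 + c^2 + e^2, and the cross-twin covariance is r_A * a^2 + c^2, where r_A is 1.0 for MZ and 0.5 for DZ pairs. The sketch below assumes unit-variance latent factors, which is the standard path-analysis convention but is not spelled out on the slide. The path values are hypothetical, picked so the result matches the variances in the Example slide.

```python
import math


def ace_expected_covariance(a, c, e, genetic_correlation):
    """Expected 2x2 covariance matrix for a twin pair under the ACE model.

    a, c, e are path coefficients from A, C and E to the phenotype.
    genetic_correlation is 1.0 for MZ pairs and 0.5 for DZ pairs.
    Assumption: all latent factors have unit variance.
    """
    variance = a ** 2 + c ** 2 + e ** 2
    covariance = genetic_correlation * a ** 2 + c ** 2
    return [[variance, covariance], [covariance, variance]]


# Hypothetical paths giving VA = 0.8, VC = 0.8, VE = 0.4
a, c, e = math.sqrt(0.8), math.sqrt(0.8), math.sqrt(0.4)

expected_mz = ace_expected_covariance(a, c, e, genetic_correlation=1.0)
expected_dz = ace_expected_covariance(a, c, e, genetic_correlation=0.5)

print([[round(x, 2) for x in row] for row in expected_mz])
print([[round(x, 2) for x in row] for row in expected_dz])
```

E contributes to the variances only, never to the cross-twin covariance, which is why it captures differences within pairs.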
Model Fitting
Evaluate significance of variance components: effect size & sample size
Evaluate goodness-of-fit of model: closeness of observed & expected values
Compare fit under alternative models
Obtain maximum likelihood estimates
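One common way to quantify "closeness of observed & expected values" for covariance matrices is the maximum likelihood discrepancy ln|Σ| + tr(SΣ⁻¹) - ln|S| - p, which is zero when the expected matrix Σ equals the observed matrix S. The sketch below hand-codes it for 2x2 matrices. It is an illustration only and is not claimed to be the exact fit function used by Mx. The observed DZ matrix uses the numbers from the Example slide, and the AE alternative (VA = 1.6, VE = 0.4) is a hypothetical model chosen to match CovMZ but not CovDZ.

```python
import math


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    """Inverse of a 2x2 matrix."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


def trace_of_product(s, t):
    """Trace of the matrix product s @ t for 2x2 matrices."""
    return sum(s[i][k] * t[k][i] for i in range(2) for k in range(2))


def ml_discrepancy(observed, expected):
    """ln|expected| + tr(observed * expected^-1) - ln|observed| - p, with p = 2."""
    return (
        math.log(det2(expected))
        + trace_of_product(observed, inv2(expected))
        - math.log(det2(observed))
        - 2
    )


observed_dz = [[2.0, 1.2], [1.2, 2.0]]
expected_dz_ace = [[2.0, 1.2], [1.2, 2.0]]  # VA = 0.8, VC = 0.8, VE = 0.4
expected_dz_ae = [[2.0, 0.8], [0.8, 2.0]]   # hypothetical AE: VA = 1.6, VE = 0.4

fit_ace = ml_discrepancy(observed_dz, expected_dz_ace)
fit_ae = ml_discrepancy(observed_dz, expected_dz_ae)
print(round(fit_ace, 4), round(fit_ae, 4))
```

The ACE expectation reproduces the observed DZ matrix, so its discrepancy is zero up to rounding, while dropping C leaves a positive misfit. Comparing such values across nested models is the basis of the model comparison step listed on the slide.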
Mx
Structural equation modeling package
Software: www.vcu.edu/mx
Manual: Neale et al. 2006
Free
Structural equation modeling
Both continuous and categorical variables
Systematic approach to hypothesis testing
Tests of significance
Can be extended to: more complex questions, multiple variables, other relatives
SEM: more complex questions I
Are the same genes acting in males and females? (sex limitation)
Role of age on (a) mean (b) variance (c) variance components
Are G & E equally important in age, country cohorts? (heterogeneity)
Are G & E the same in other strata (e.g. married/unmarried)? (G x E interaction)
SEM: more complex questions II
Do the same genes account for variation in multiple phenotypes? (multivariate analysis)
Do the same genes account for variation in phenotypes measured at different ages? (longitudinal analysis)
Do specific genes account for variation/covariation in phenotypes? (linkage/association)
Linkage & Association Analysis
Stages of Genetic Mapping
Are there genes influencing this trait? Epidemiological studies
Where are those genes? Linkage analysis
What are those genes? Association analysis
Linkage Analysis
Sharing between relatives
Identifies large regions: include several candidates
Complex disease: scans on sets of small families popular; no strong assumptions about disease alleles; low power; limited resolution
[Figure: Linkage Scan]
Stages of Genetic Mapping
Are there genes influencing this trait? Epidemiological studies
Where are those genes? Linkage analysis
What are those genes? Association analysis
Association Analysis
Sharing between unrelated individuals
Trait alleles originate in common ancestor
High resolution: recombination since common ancestor; large number of independent tests
Powerful if assumptions are met: same disease haplotype shared by many patients
Sensitive to population structure
[Figure: Association Scan]
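The "large number of independent tests" has a practical consequence: the per-test significance threshold must shrink as the number of tests grows. A Bonferroni correction is one simple way to do this. It is named here as an illustrative choice and is not prescribed by the slides; the marker count below is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha.

    Assumes the simple Bonferroni rule: divide alpha by the number of tests.
    """
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Hypothetical scan of 500,000 markers at a family-wise alpha of 0.05
per_test_alpha = bonferroni_threshold(alpha=0.05, n_tests=500_000)
print(per_test_alpha)
```

A linkage scan covers the genome with far fewer effectively independent tests, which is part of the trade-off between its limited resolution and the high resolution of association analysis.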
Proof of Concept: Genes/Regions
Genome Scan: Gene 1, Gene 2, Gene 3, Gene 4
Breast cancer: DLC-1, Chr 8q, Chr 13q
Lung cancer: CD44, Chr 22q
Melanoma: B-RAF
Type 2 diabetes: PPAR, PPP1R3A, FOXA2, Chr 1q
HDL-C plasma level: CETP, LPL
Osteoarthritis: AGC1
Schizophrenia: DDC
First (unequivocal) positional cloning of a complex disease QTL !
R Korstanje & B Paigen. From QTL to gene: the harvest begins. Nature Genetics 31, 235-236 (2002)
Number of genes identified from QTL by year