Reminder - Means, Variances and Covariances. Covariance Algebra


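Reminder - Means, Variances and Covariances

$\mu_X = E(X) = \sum_i x_i f(x_i)$

$\mathrm{Var}(X) = E[(X - \mu_X)^2] = \sum_i (x_i - \mu_X)^2 f(x_i)$

$\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = \sum_i (x_i - \mu_X)(y_i - \mu_Y) f(x_i, y_i)$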

Covariance Algebra

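$\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$

$\mathrm{Cov}(X + Y, Z) = \mathrm{Cov}(X, Z) + \mathrm{Cov}(Y, Z)$

$\mathrm{Cov}(aX, Y) = a\,\mathrm{Cov}(X, Y)$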

Covariance and Correlation

$r_{X,Y} = \mathrm{Cov}(X, Y) / \sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}$

Correlation is covariance scaled to range [-1,1].

For two traits with the same variance:

Cov(X1,X2) = r12 Var(X)
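The definitions and rules above can be checked numerically. The following is a minimal sketch in Python, assuming a small made-up joint distribution in which each (x, y) pair is equally likely; the data and the helper names `mean` and `cov` are illustrative only.

```python
from math import sqrt

# Hypothetical joint distribution: four equally likely (x, y) pairs.
pairs = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
f = 1.0 / len(pairs)  # probability f(x_i, y_i) of each pair

def mean(values):
    """E(X) = sum of x_i * f(x_i)."""
    return sum(v * f for v in values)

def cov(us, vs):
    """Cov(U, V) = sum of (u_i - mu_U)(v_i - mu_V) * f; cov(us, us) is Var(U)."""
    mu_u, mu_v = mean(us), mean(vs)
    return sum((u - mu_u) * (v - mu_v) * f for u, v in zip(us, vs))

xs = [x for x, _ in pairs]
ys = [y for _, y in pairs]
sums = [x + y for x, y in pairs]

var_x, var_y, cov_xy = cov(xs, xs), cov(ys, ys), cov(xs, ys)
var_sum = cov(sums, sums)            # Var(X + Y) computed directly
r_xy = cov_xy / sqrt(var_x * var_y)  # correlation = scaled covariance

print(var_sum, var_x + var_y + 2 * cov_xy, r_xy)
```

The two printed variances agree, which is the rule Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y), and the correlation falls inside [-1, 1].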

Phenotypes, Genotypes and environment

A phenotype (P) is composed of genotypic values (G) and environmental deviations (E):

P = G + E

Whether we focus on the mean, variance, or covariance, inference always comes from the measurement of the phenotype

A distinction will be made:

V will be used to indicate inferred components of variance

$\sigma^2$ will be used to indicate observational components of variance

The mean genotypic value is equal to the mean phenotypic value: $E(G_j) = E(P_j) = \bar{P}_j$

Genotypic values are expressed as deviations from the mid-homozygote point

Genotypic values

Consider two alleles, A1 and A2, at a single locus.

The two homozygous classes, A1A1 and A2A2, are assigned genotypic values +a and -a, respectively.

Assume that the A1 allele increases the value of a phenotype while the A2 allele decreases the value.

The heterozygous class, A1A2, is assigned a genotypic value of d

Zero is midpoint between the two genotypic values of A1A1 and A2A2; d is measured as a deviation from this midpoint

[Figure: genotypic value scale, with genotype A2A2 at -a, the midpoint at 0, A1A2 at d, and A1A1 at +a]

Properties of the Genotypic Values and Environmental Deviations

The mean of environmental deviations is zero: $E_j = P_j - G_j$, with $\bar{E} = 0$

The correlation between genotypic values and environmental deviations for a population of subjects is zero ($r_{GE} = 0$)

Elements of a population mean

Genotype   Frequency   Value   Freq x Value
A1A1       $p^2$       +a      $p^2 a$
A1A2       $2pq$       d       $2pqd$
A2A2       $q^2$       -a      $-q^2 a$

Sum = a(p - q) + 2dpq

Population mean

$\mu_P = \mu_G = \sum_k G_k p_k$

Multiply frequency by genotypic value and sum

Recall that $p^2 - q^2 = (p + q)(p - q) = p - q$, because $p + q = 1$

$\mu_P = a(p - q) + 2dpq$
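As a quick check of the population mean, the frequency-weighted sum over the three genotypes can be compared with the closed form a(p - q) + 2dpq. This is a minimal Python sketch assuming Hardy-Weinberg genotype frequencies; the function names and the example values of p, a, and d are illustrative.

```python
def population_mean_by_sum(p, a, d):
    """Sum of frequency x genotypic value over A1A1, A1A2, A2A2."""
    q = 1.0 - p
    frequencies = {"A1A1": p * p, "A1A2": 2 * p * q, "A2A2": q * q}
    values = {"A1A1": a, "A1A2": d, "A2A2": -a}
    return sum(frequencies[g] * values[g] for g in frequencies)

def population_mean_closed_form(p, a, d):
    """Closed form: a(p - q) + 2dpq."""
    q = 1.0 - p
    return a * (p - q) + 2 * d * p * q

# Illustrative values only.
print(population_mean_by_sum(0.9, 4.0, 2.0))
print(population_mean_closed_form(0.9, 4.0, 2.0))
```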

Additive model

Assume $a_A$ and $a_B$ correspond to A1A1 and B1B1

$G_{A1A1B1B1} = a_A + a_B$

So that, summing over loci,

$\mu_P = \sum a(p - q) + 2 \sum dpq$

Average effect for an allele ($\alpha$)

Population properties vis-a-vis family structure

Transmission from parent to offspring: parents pass on genes and not genotypes

The average effect of a particular gene (allele) is the mean deviation from the population mean of individuals that received that gene from one parent, assuming the gene transmitted from the other parent came at random from the population

Average effect for an allele ($\alpha$)

Gamete   Frequency of genotype produced       Mean genotypic value   Minus population mean   Average effect
         A1A1 (+a)   A1A2 (d)   A2A2 (-a)
A1       p           q          0             pa + qd                -[a(p - q) + 2dpq]      q[a + d(q - p)]
A2       0           p          q             -qa + pd               -[a(p - q) + 2dpq]      -p[a + d(q - p)]

Average effect for an allele ($\alpha$)

Thus, the average effect of each allele can be calculated for A1 and A2 in the following manner (Falconer, 1989):

$\alpha_1 = pa + qd - [a(p - q) + 2dpq] = q[a + d(q - p)]$ and

$\alpha_2 = -qa + pd - [a(p - q) + 2dpq] = -p[a + d(q - p)]$

Average effect of a gene substitution

Assume two alleles at a locus

Select A2 genes at random from the population: a proportion p are found in A1A2 genotypes and q in A2A2 genotypes

Changing A1A2 to A1A1 corresponds to a change in value from d to +a, i.e., (a - d); changing A2A2 to A1A2 corresponds to a change from -a to d, i.e., (d + a)

On average, the change is p(a - d) + q(d + a), or

$\alpha = a + d(q - p)$

When the gene frequency q is greater, $\alpha$ is greater

                                 q = 0.10   q = 0.40
$\alpha_1$                       +0.24      +1.44
$\alpha_2$                       -2.16      -2.16
$\alpha = \alpha_1 - \alpha_2$   2.40       3.60
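The tabulated average effects can be reproduced from the formulas for the average effects and for the gene substitution. The slide does not state a and d; the sketch below assumes a = 4 and d = 2, values that are consistent with every entry in the table but are an inference, not something the source gives. The function name is illustrative.

```python
def average_effects(q, a, d):
    """Return (alpha_1, alpha_2, alpha) for frequency q of allele A2."""
    p = 1.0 - q
    alpha = a + d * (q - p)  # average effect of the gene substitution
    alpha_1 = q * alpha      # average effect of A1
    alpha_2 = -p * alpha     # average effect of A2
    return alpha_1, alpha_2, alpha

# Assumed values a = 4, d = 2 (inferred from the table, not stated in the source).
for q in (0.10, 0.40):
    alpha_1, alpha_2, alpha = average_effects(q, a=4.0, d=2.0)
    print(f"q = {q:.2f}: alpha_1 = {alpha_1:+.2f}, "
          f"alpha_2 = {alpha_2:+.2f}, alpha = {alpha:.2f}")
```

Under these assumed values the product of p and alpha happens to be the same at both frequencies, which is why the average effect of A2 is -2.16 in both columns.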

Breeding Value (A)

The average effects of the parents' genes determine the mean genotypic value of their progeny

The average effect cannot be measured directly (it refers to a gene substitution), while the breeding value can

Breeding value: the value of an individual, judged by the mean value of its progeny

If an individual is mated with a number of random partners, its breeding value equals twice the mean deviation of the progeny from the population mean (twice, because the parent provides only half the genes)

Breeding value is interpretable only when we know the population in which the individual is to be mated

Breeding Value

Genotype   Breeding value
A1A1       $2\alpha_1 = 2q\alpha$
A1A2       $\alpha_1 + \alpha_2 = (q - p)\alpha$
A2A2       $2\alpha_2 = -2p\alpha$

The mean breeding value under Hardy-Weinberg equilibrium is zero:

$2p^2 q\alpha + 2pq(q - p)\alpha - 2q^2 p\alpha = 2pq\alpha(p + q - p - q) = 0$
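The zero mean of the breeding values can be confirmed numerically. This is a minimal Python sketch assuming Hardy-Weinberg genotype frequencies; the function names and example values are illustrative.

```python
def breeding_values(p, a, d):
    """Breeding values 2q*alpha, (q - p)*alpha, -2p*alpha for the three genotypes."""
    q = 1.0 - p
    alpha = a + d * (q - p)
    return {"A1A1": 2 * q * alpha, "A1A2": (q - p) * alpha, "A2A2": -2 * p * alpha}

def mean_breeding_value(p, a, d):
    """Frequency-weighted mean of the breeding values under Hardy-Weinberg."""
    q = 1.0 - p
    frequencies = {"A1A1": p * p, "A1A2": 2 * p * q, "A2A2": q * q}
    values = breeding_values(p, a, d)
    return sum(frequencies[g] * values[g] for g in frequencies)

# Illustrative values only; the mean is zero (up to rounding) for any p, a, d.
print(breeding_values(0.9, 4.0, 2.0))
print(mean_breeding_value(0.9, 4.0, 2.0))
```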

Dominance deviation

Breeding values are referred to as the "additive genotype"; variation in breeding values is due to the additive effects of genes

A symbolizes the breeding value of an individual

The proportion of $\sigma^2_P$ attributable to $\sigma^2_A$ is called heritability ($h^2$)

G = A + D, where D is the dominance deviation

Statistically speaking, dominance is a within-locus interaction: a non-additive, within-locus effect

A parent cannot individually transmit dominance effects; they require the gametic contribution of both parents

Genotypic values, breeding values, and dominance deviation

[Figure: genotypic values (-a, d, +a) and breeding values ($-2p\alpha$, $(q - p)\alpha$, $2q\alpha$) plotted for genotypes A2A2, A1A2, and A1A1, which have frequencies $q^2$, $2pq$, and $p^2$; the origin (0) is marked on both scales]

The regression of genotypic value on gene dosage yields the genotypic values predicted by gene dosage: the average effect of an allele, that which "breeds true"

If there is dominance, this prediction of genotypic values from gene dosage will be slightly off; the dominance deviation is the deviation from the regression line

Epistasis - Separate analysis

Locus A shows an association with the trait; locus B appears unrelated

[Figure: trait means by genotype, plotted separately for Locus A (AA, Aa, aa) and Locus B (BB, Bb, bb)]

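Epistasis - Joint analysis

Locus B modifies the effects of locus A

[Figure: trait means for the locus A genotypes (AA, Aa, aa), plotted separately for each locus B genotype (BB, Bb, bb)]

Genotypic Means

                      Locus A
Locus B    AA             Aa             aa             Marginal
BB         $\mu_{AABB}$   $\mu_{AaBB}$   $\mu_{aaBB}$   $\mu_{BB}$
Bb         $\mu_{AABb}$   $\mu_{AaBb}$   $\mu_{aaBb}$   $\mu_{Bb}$
bb         $\mu_{AAbb}$   $\mu_{Aabb}$   $\mu_{aabb}$   $\mu_{bb}$
Marginal   $\mu_{AA}$     $\mu_{Aa}$     $\mu_{aa}$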

Partitioning of effects

[Figure: maternal (M) and paternal (P) alleles at Locus A and Locus B, with the effects and interactions among these four alleles highlighted in turn]

4 main effects: additive effects

6 two-way interactions, including additive-additive epistasis

4 three-way interactions: additive-dominance epistasis

1 four-way interaction: dominance-dominance epistasis

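Two loci

Genotypic values of the nine two-locus genotypes:

       AA                             Aa                             aa
BB     $m + a_A + a_B + aa$           $m + d_A + a_B + da$           $m - a_A + a_B - aa$
Bb     $m + a_A + d_B + ad$           $m + d_A + d_B + dd$           $m - a_A + d_B - ad$
bb     $m + a_A - a_B - aa$           $m + d_A - a_B - da$           $m - a_A - a_B + aa$

In the cell entries, aa, ad, da, and dd denote the additive-additive, additive-dominance, dominance-additive, and dominance-dominance epistatic effects (locus A first, locus B second).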
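The nine genotypic values of the two-locus model can be generated from a single expression. The sketch below is one possible coding, assumed here for illustration: x is +1, 0, or -1 for the upper-case homozygote, the heterozygote, and the lower-case homozygote, and z is 1 for heterozygotes and 0 otherwise. The function names, parameter names, and numeric effect sizes are all hypothetical.

```python
def dosage_codes(genotype):
    """Return (x, z): x = +1/0/-1 for e.g. AA/Aa/aa, z = 1 only for the heterozygote."""
    upper = sum(1 for allele in genotype if allele.isupper())
    x = upper - 1
    z = 1 if upper == 1 else 0
    return x, z

def two_locus_value(geno_a, geno_b, m, a_A, d_A, a_B, d_B, aa, ad, da, dd):
    """Genotypic value: mean + main effects + four epistatic terms."""
    x_a, z_a = dosage_codes(geno_a)
    x_b, z_b = dosage_codes(geno_b)
    return (m + x_a * a_A + z_a * d_A + x_b * a_B + z_b * d_B
            + x_a * x_b * aa   # additive x additive
            + x_a * z_b * ad   # additive (A) x dominance (B)
            + z_a * x_b * da   # dominance (A) x additive (B)
            + z_a * z_b * dd)  # dominance x dominance

# Hypothetical effect sizes, for illustration only.
effects = dict(m=10.0, a_A=2.0, d_A=1.0, a_B=3.0, d_B=0.5,
               aa=0.4, ad=0.3, da=0.2, dd=0.1)
for geno_b in ("BB", "Bb", "bb"):
    row = [two_locus_value(geno_a, geno_b, **effects) for geno_a in ("AA", "Aa", "aa")]
    print(geno_b, [round(v, 2) for v in row])
```

Setting aa, ad, da, and dd to zero gives the purely additive-plus-dominance model with no epistasis.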

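Covariance matrix

         Sib 1                                                 Sib 2
Sib 1    $\sigma^2_A + \sigma^2_D + \sigma^2_S + \sigma^2_N$   $\pi\sigma^2_A + z\sigma^2_D + \sigma^2_S$
Sib 2    $\pi\sigma^2_A + z\sigma^2_D + \sigma^2_S$            $\sigma^2_A + \sigma^2_D + \sigma^2_S + \sigma^2_N$

         Sib 1                                                                 Sib 2
Sib 1    $\sigma^2_A + \sigma^2_D + \sigma^2_S + \sigma^2_N$                   $\tfrac{1}{2}\sigma^2_A + \tfrac{1}{4}\sigma^2_D + \sigma^2_S$
Sib 2    $\tfrac{1}{2}\sigma^2_A + \tfrac{1}{4}\sigma^2_D + \sigma^2_S$        $\sigma^2_A + \sigma^2_D + \sigma^2_S + \sigma^2_N$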

Detecting epistasis

The test for epistasis is based on the difference in fit between

- a model with single locus effects and epistatic effects and

- a model with only single locus effects

This enables us to investigate the power of the variance components method to detect epistasis

[Figure: path diagrams. True Model: A and B both influence Y, with path coefficients a and b. Assumed Model: only A influences Y, with path coefficient a*]

a* is the apparent coefficient

a* will deviate from a to the extent that A and B are correlated
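The dependence of a* on the correlation between A and B can be illustrated with a small calculation. The sketch below assumes a noise-free linear true model Y = aA + bB and takes a* to be the simple regression slope of Y on A, which then equals a + b Cov(A, B) / Var(A). The data, the helper names, and the noise-free simplification are assumptions made for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def cov(us, vs):
    """Population covariance; cov(us, us) is the variance."""
    mu_u, mu_v = mean(us), mean(vs)
    return sum((u - mu_u) * (v - mu_v) for u, v in zip(us, vs)) / len(us)

def apparent_coefficient(A, B, a, b):
    """Slope of Y on A alone when the true model is Y = a*A + b*B."""
    Y = [a * x + b * z for x, z in zip(A, B)]
    return cov(A, Y) / cov(A, A)

# Hypothetical scores for six individuals.
A = [0, 0, 1, 1, 2, 2]
B_uncorrelated = [0, 1, 0, 1, 0, 1]  # Cov(A, B) = 0
B_correlated = [0, 0, 0, 1, 1, 1]    # Cov(A, B) > 0

print(apparent_coefficient(A, B_uncorrelated, a=1.0, b=2.0))  # equals a
print(apparent_coefficient(A, B_correlated, a=1.0, b=2.0))    # exceeds a
```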

Phenotypic variance

Again, assume

P = G + E

Thus differences in phenotypes, measured as variance and symbolized as VP, can be decomposed into genetic and environmental variation, VG and VE, respectively.

VP = VG + VE

VG comprises three distinct kinds of variance: additive (VA), dominance (VD), and epistatic (VI).

VP = (VA + VD + VI) + VE

Analysis of variance

Variance       Symbol   Value whose variance is measured
Phenotypic     VP       Phenotypic value
Genotypic      VG       Genotypic value
Additive       VA       Breeding value
Dominance      VD       Dominance deviation
Epistasis      VI       Epistatic deviation
Environment    VE       Environmental deviation

Additive (VA) and dominance variance (VD)

The covariance between breeding values and dominance deviations equals zero so that

VG = VA + VD + VI

$V_A = 2pq[a + d(q - p)]^2$

$V_D = d^2(4q^4p^2 + 8p^3q^3 + 4p^4q^2) = (2pqd)^2$

Additive and dominance variance

If d = 0, then $V_A = 2pqa^2$

If d = a, then $V_A = 8pq^3a^2$, where q is the frequency of the recessive allele

If p = q = .50 (e.g., a cross of two inbred strains)

$V_A = \tfrac{1}{2}a^2$

$V_D = \tfrac{1}{4}d^2$

In general, genes at intermediate frequencies contribute more variance than genes at high or low frequencies
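For a single locus with no epistasis, the additive and dominance variances should add up to the total variance of the genotypic values. The following Python sketch checks this, and the special cases listed above, under Hardy-Weinberg frequencies; the function names and numeric inputs are illustrative.

```python
def variance_components(p, a, d):
    """Return (VA, VD) = (2pq[a + d(q - p)]^2, (2pqd)^2)."""
    q = 1.0 - p
    alpha = a + d * (q - p)
    return 2 * p * q * alpha ** 2, (2 * p * q * d) ** 2

def genotypic_variance(p, a, d):
    """Variance of the genotypic values +a, d, -a computed directly."""
    q = 1.0 - p
    frequencies = [p * p, 2 * p * q, q * q]
    values = [a, d, -a]
    m = sum(f * v for f, v in zip(frequencies, values))
    return sum(f * (v - m) ** 2 for f, v in zip(frequencies, values))

# Illustrative values only.
va, vd = variance_components(0.3, 4.0, 2.0)
print(va, vd, va + vd, genotypic_variance(0.3, 4.0, 2.0))

# Special case p = q = 0.5: VA = a^2 / 2 and VD = d^2 / 4.
print(variance_components(0.5, 4.0, 2.0))
```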

Epistatic variance (VI)

Interactions involving three or more loci do not contribute substantially to the total variance

Three types of two-factor interaction (involving breeding values and dominance deviations):

additive x additive (VAA)

additive x dominance (VAD)

dominance x dominance (VDD)

Environmental variance

Special environmental variance (VEs)

within-individual component; temporary or localized circumstances

General environmental variance (VEg)

between-individual component; permanent or non-localized circumstances

The ratio of the between-individual component to the total phenotypic variance is an intraclass correlation (r)

Summary of variance partitioning

Data needed                     Partition made                      Ratio estimated
Resemblance between relatives   (VA):(VNA+VEg+VEs)                  Heritability, VA/VP
Genetically uniform group       (VA+VNA):(VEg+VEs) = (VG):(VE)      Degree of genetic determination, VG/VP
Multiple measurements           (VG+VEg):(VEs)                      Repeatability, (VG+VEg)/VP
All three                       VA:VNA:VEg:VEs

Components of variance - Summary

[Figure: tree diagram built up over three slides. Phenotypic variance splits into environmental, genetic, and GxE interaction and correlation; genetic variance splits into additive, dominance, and epistasis; the final slide adds quantitative trait loci]

Resemblance of relatives

The degree of resemblance between relatives is a function of additive variance, i.e., of breeding values

The proportionate amount of additive variance is an estimate of heritability (VA / VP)

Intraclass correlation coefficient

$t = \sigma^2_B / (\sigma^2_B + \sigma^2_W)$

where B and W denote the between-group and within-group components, for example between and within full-sibships

Resemblance of relatives

$b_{OP} = \mathrm{Cov}_{OP} / \sigma^2_P$

A new property of the population is the covariance of related individuals

Cross-Products of Deviations for Pairs of Relatives

       AA                Aa                aa
AA     $(a-m)^2$
Aa     $(a-m)(d-m)$      $(d-m)^2$
aa     $(a-m)(-a-m)$     $(-a-m)(d-m)$     $(-a-m)^2$

The covariance between relatives of a certain class is the weighted average of these cross-products, where each cross-product is weighted by its frequency in that class.

Offspring and one parent

The covariance is between individuals' genotypic values and the mean genotypic values of their offspring produced by random mating

When values are expressed as deviations from the population mean, the mean value of the offspring is 1/2 the breeding value of the parent

The covariance is therefore that of an individual's genotypic value (G) with 1/2 its breeding value (A)

Covariance for Parent-offspring (P-O)

       AA        Aa        aa
AA     $p^3$
Aa     $p^2q$    $pq$
aa     0         $pq^2$    $q^3$

Covariance $= (a-m)^2p^3 + (d-m)^2pq + (-a-m)^2q^3 + 2(a-m)(d-m)p^2q + 2(-a-m)(d-m)pq^2$

$= pq[a + (q-p)d]^2$

$= V_A / 2$
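The parent-offspring result can be verified by summing the cross-products of deviations over the full table of pair frequencies. In this minimal Python sketch the table is written out symmetrically, so each off-diagonal frequency appears twice; the function name and the numeric inputs are illustrative.

```python
def parent_offspring_covariance(p, a, d):
    """Weighted sum of cross-products of deviations for parent-offspring pairs."""
    q = 1.0 - p
    m = a * (p - q) + 2 * d * p * q  # population mean
    dev = {"AA": a - m, "Aa": d - m, "aa": -a - m}
    joint = {  # frequency of each (parent, offspring) genotype pair
        ("AA", "AA"): p ** 3,    ("AA", "Aa"): p * p * q, ("AA", "aa"): 0.0,
        ("Aa", "AA"): p * p * q, ("Aa", "Aa"): p * q,     ("Aa", "aa"): p * q * q,
        ("aa", "AA"): 0.0,       ("aa", "Aa"): p * q * q, ("aa", "aa"): q ** 3,
    }
    return sum(freq * dev[g1] * dev[g2] for (g1, g2), freq in joint.items())

def additive_variance(p, a, d):
    q = 1.0 - p
    return 2 * p * q * (a + d * (q - p)) ** 2

# Illustrative values only: the covariance equals half the additive variance.
print(parent_offspring_covariance(0.3, 4.0, 2.0))
print(additive_variance(0.3, 4.0, 2.0) / 2)
```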

Offspring and one parent

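G = A + D, so the covariance is between (A + D) and 1/2 A; the sum of cross-products equals

$\sum \tfrac{1}{2}A(A + D) = \tfrac{1}{2}\sum A^2 + \tfrac{1}{2}\sum AD$

$\mathrm{Cov}_{OP} = (\tfrac{1}{2}\sum A^2 + \tfrac{1}{2}\sum AD) / (\text{number of parents})$

Recall that $\mathrm{Cov}_{AD} = 0$

$\mathrm{Cov}_{OP} = \tfrac{1}{2}V_A$ (i.e., 1/2 the variance of breeding values)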

Offspring and one parent: Effects of a single locus

Parents                                                Offspring
Genotype   Frequency   Genotypic value                 Mean genotypic value
A1A1       $p^2$       $2q(\alpha - qd)$               $q\alpha$
A1A2       $2pq$       $(q - p)\alpha + 2pqd$          $\tfrac{1}{2}(q - p)\alpha$
A2A2       $q^2$       $-2p(\alpha + pd)$              $-p\alpha$

Offspring and one parent

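Mean genotypic values of the offspring are 1/2 the breeding values (A) of the parents

The mean cross-product equals frequency x genotypic value of the parent x mean genotypic value of the offspring, summed over genotypes

$\mathrm{Cov}_{OP} = pq\alpha^2(p^2 + 2pq + q^2) + 2p^2q^2 d\alpha(-q + q - p + p) = pq\alpha^2 = \tfrac{1}{2}V_A$

(Note: $V_A = 2pq\alpha^2$)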

Covariance of MZ Twins

       AA       Aa       aa
AA     $p^2$
Aa     0        $2pq$
aa     0        0        $q^2$

Covariance $= (a-m)^2p^2 + (d-m)^2\,2pq + (-a-m)^2q^2$

$= 2pq[a + (q-p)d]^2 + (2pqd)^2$

$= V_A + V_D$

Twins

Dizygotic twins are full sibs, and their genetic covariance is that of full sibs

Monozygotic twins have identical genotypes, i.e., there is no genetic variance within pairs, so that

Cov(MZ) = VG

Covariance for Unrelated Pairs (U)

       AA          Aa           aa
AA     $p^4$
Aa     $2p^3q$     $4p^2q^2$
aa     $p^2q^2$    $2pq^3$      $q^4$

Covariance $= (a-m)^2p^4 + (d-m)^2\,4p^2q^2 + (-a-m)^2q^4 + (a-m)(d-m)\,4p^3q + (-a-m)(d-m)\,4pq^3 + (a-m)(-a-m)\,2p^2q^2$

$= 0$
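The MZ-twin and unrelated-pair results follow from the same weighted sum of cross-products, with only the table of pair frequencies changing. This is a minimal Python sketch assuming Hardy-Weinberg frequencies; the helper names and numeric inputs are illustrative.

```python
def pair_covariance(joint, dev):
    """Weighted average of cross-products of deviations over genotype pairs."""
    return sum(freq * dev[g1] * dev[g2] for (g1, g2), freq in joint.items())

def single_locus_setup(p, a, d):
    """Return deviations from the population mean and Hardy-Weinberg frequencies."""
    q = 1.0 - p
    m = a * (p - q) + 2 * d * p * q
    dev = {"AA": a - m, "Aa": d - m, "aa": -a - m}
    hw = {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}
    return dev, hw

p, a, d = 0.3, 4.0, 2.0  # illustrative values only
dev, hw = single_locus_setup(p, a, d)

# MZ twins always share a genotype: only the diagonal has non-zero frequency.
mz_pairs = {(g, g): f for g, f in hw.items()}
# Unrelated individuals: genotypes are independent, so frequencies multiply.
unrelated_pairs = {(g1, g2): f1 * f2 for g1, f1 in hw.items() for g2, f2 in hw.items()}

cov_mz = pair_covariance(mz_pairs, dev)
cov_unrelated = pair_covariance(unrelated_pairs, dev)
print(cov_mz, cov_unrelated)
```

With these inputs `cov_mz` matches VA + VD and `cov_unrelated` is zero up to floating-point rounding.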

Resemblance in general

Let r be the fraction of VA and u the fraction of VD that contribute to the covariance, so that

$\mathrm{Cov} = rV_A + uV_D$

Let P and Q be two related individuals with parents A, B and C, D, respectively, and let f denote coancestry

$r = 2f_{PQ}$ and $u = f_{AC}f_{BD} + f_{AD}f_{BC}$

For inbred relatives, $r = 2f_{PQ} / [(1 + F_P)(1 + F_Q)]^{1/2}$

Resemblance in general

Coefficient r of the additive variance is sometimes called the coefficient of relationship (the correlation between the breeding values A)

Coefficient u represents the probability of the relatives having the same genotype through identity by descent

It is zero unless the related individuals have paths of coancestry through both of their respective parents, (e.g., full sibs and double first cousins)
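The general rule Cov = rVA + uVD can be applied mechanically once r and u are known for a relationship. The sketch below uses the coefficients that appear in this deck (offspring-parent, half sibs, full sibs, MZ twins); the dictionary, the function name, and the variance values in the example are illustrative assumptions.

```python
# (r, u) per relationship: r multiplies VA, u multiplies VD.
COEFFICIENTS = {
    "offspring-parent": (0.5, 0.0),
    "half sibs": (0.25, 0.0),
    "full sibs": (0.5, 0.25),
    "MZ twins": (1.0, 1.0),
}

def genetic_covariance(relationship, va, vd):
    """Genetic covariance r*VA + u*VD for the named relationship."""
    r, u = COEFFICIENTS[relationship]
    return r * va + u * vd

# Hypothetical variance components, for illustration only.
for name in COEFFICIENTS:
    print(name, genetic_covariance(name, va=2.0, vd=4.0))
```

Only full sibs and MZ twins have a non-zero u here, in line with the requirement that the relatives be connected through both of their respective parents.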

Environmental covariance

VE = VEc + VEw

VEc: common environmental variance, which contributes to the variance between the means of families but not to the variance within families (i.e., to the covariance among related individuals)

VEw: within-family environmental variance, which arises from causes independent of the coefficient of relationship

Maternal effects and competition

Phenotypic resemblance between relatives

Relatives                  Covariance                Regression (b) or intraclass correlation (t)
Offspring and one parent   1/2 VA                    b = 1/2 (VA/VP)
Offspring and mid-parent   1/2 VA                    b = VA/VP
Half sibs                  1/4 VA                    t = 1/4 (VA/VP)
Full sibs                  1/2 VA + 1/4 VD + VEc     t = (1/2 VA + 1/4 VD + VEc)/VP

Heritability

Regression of breeding value on phenotypic value

Index of response to genetic selection

Estimated with offspring-parent regression, sib analysis, intra-sire regression of offspring on dam, or combined estimates plus other methods (Markel et al., 1995, 1999)

Heritability: Ratio of additive genetic variance to phenotypic variance

$h^2 = V_A / V_P$

Regression of breeding value on phenotypic value

$h^2 = b_{AP}$

$r_{AP} = b_{AP}\,\sigma_P / \sigma_A = h^2(1/h) = h$
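The regressions and correlations tabulated for relatives can be turned into heritability estimates by inverting the coefficients, and the correlation between breeding value and phenotype is the square root of the result. This is a minimal Python sketch; the function names and the input estimates are hypothetical, and the half-sib conversion assumes the intraclass correlation contains only 1/4 VA/VP.

```python
from math import sqrt

def h2_from_offspring_one_parent(b):
    """b = 1/2 (VA/VP), so h^2 = 2b."""
    return 2.0 * b

def h2_from_offspring_midparent(b):
    """b = VA/VP, so h^2 = b."""
    return b

def h2_from_half_sibs(t):
    """t = 1/4 (VA/VP), so h^2 = 4t."""
    return 4.0 * t

def breeding_value_phenotype_correlation(h2):
    """r_AP = h, the square root of the heritability."""
    return sqrt(h2)

# Hypothetical estimates, for illustration only.
h2 = h2_from_offspring_one_parent(0.125)
print(h2, breeding_value_phenotype_correlation(h2))
print(h2_from_half_sibs(0.0625))
```

Full sibs are left out on purpose: their intraclass correlation also contains 1/4 VD and VEc, so doubling it does not isolate VA/VP.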

Heritability: Twins and human data

                  Between pairs, $\sigma^2_b$     Within pairs, $\sigma^2_w$
Identical (MZ)    VA + VD + VEc                   VEw
Dizygotic (DZ)    1/2 VA + 1/4 VD + VEc           1/2 VA + 3/4 VD + VEw
Difference        1/2 VA + 3/4 VD                 1/2 VA + 3/4 VD

Heritability: Twins and human data

Correlations

Trait                   Monozygotic   Dizygotic
Height                  0.93          0.48
Intelligence            0.86          0.62
Personality             0.50          0.30
Alcohol consumption     0.64          0.27
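One way to read these correlations uses the between-pair components for MZ and DZ twins: if each correlation is the between-pair component divided by VP and the common environment VEc is the same for both twin types, then the MZ correlation minus the DZ correlation equals (1/2 VA + 3/4 VD)/VP, and twice that difference equals (VA + 1.5 VD)/VP. The sketch below computes this quantity for the tabulated traits. It is an illustration under those assumptions, not an estimate reported in the source, and it equals the heritability only when VD = 0.

```python
# Twin correlations from the table above: trait -> (MZ, DZ).
twin_correlations = {
    "Height": (0.93, 0.48),
    "Intelligence": (0.86, 0.62),
    "Personality": (0.50, 0.30),
    "Alcohol consumption": (0.64, 0.27),
}

def twice_mz_dz_difference(r_mz, r_dz):
    """2(r_MZ - r_DZ) = (VA + 1.5 VD)/VP, assuming equal VEc for MZ and DZ pairs."""
    return 2.0 * (r_mz - r_dz)

estimates = {trait: twice_mz_dz_difference(r_mz, r_dz)
             for trait, (r_mz, r_dz) in twin_correlations.items()}
for trait, value in estimates.items():
    print(f"{trait}: {value:.2f}")
```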

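[Figure: path diagram with coefficients B1, B2, and B3 linking the P variables to C; equation as extracted: C = B1P + B2 + B3P]

[Figure: twin path model. Latent factors A1, C1, E1, and D1 influence the phenotype P1 of Twin 1, and A2, C2, E2, and D2 influence the phenotype P2 of Twin 2, through paths a, c, e, and d. Correlations between the twins' factors: A1 with A2 is 1.0 (MZ) or .5 (DZ); D1 with D2 is 1.0 (MZ) or .25 (DZ); C1 with C2 is 1.0 (MZT, DZT) or 0.0 (MZA, DZA)]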