ON THE MATHEMATICS OF RANDOM MATING IN CASE OF DIFFERENT RECOMBINATION VALUES FOR MALES AND FEMALES

HILDA GEIRINGER
Wheaton College, Norton, Massachusetts

Received June 7, 1948

In a colloquium discussion PROFESSOR SEWALL WRIGHT remarked that cases where the recombination values (r.v.'s) are different for the two sexes appear so common that the mathematical theory should be completely worked out. He added that in case of two and three Mendelian characters the arithmetic mean $(c+d)/2$ of the two r.v.'s plays the role of the single r.v. appearing in the usual situation where one and the same recombination value, $c$, holds for males and for females.

In the following the mathematics of the problem is completely given. The result is that Professor WRIGHT'S statement is true for any number of characters. More specifically: In case of $m = 2$ or of $m = 3$ characters we have an unrestricted analogy to the "ordinary" case. If $m \geq 4$ the situation changes. The general character of the problem will, however, still be determined by the mean of the r.v.'s. This is, in particular, true for the limit behaviour of the distributions as $n$, the number of distinct generations, tends toward infinity.

As in previous papers (GEIRINGER 1944, 1945) the author's approach is based on the consideration of three basic probability distributions: the distribution of genotypes (d.ge.), the distribution of gametes (d.ga.), and the linkage distribution (l.d.). If the first is given for an "initial" generation for both sexes, if we know or assume a l.d., and if random mating is considered, we may derive the d.ga. for the initial generation, and both the d.ge. and the d.ga. for any subsequent generation ($n = 0, 1, 2, \ldots$). Mathematically, the problem appears as a probability problem. To the biologist who is not mathematically minded this may seem an unusual approach. If, however, the usual way of forming abstract concepts in science is followed, it will be realized that these "proportions" or "frequencies" of certain types have to be regarded as probabilities and probability distributions. The consideration of such distributions affords the best way to deal with more involved heredity situations since it enables us to use the highly developed methods and concepts of probability calculus.

In sections 1 and 2 of this paper the cases of $m = 2$ and $m = 3$ characters are completely investigated, with details which seem not obvious. In each case we first derive the "recurrence formulae" [(16) in sec. 1 and (31) in sec. 2] which enable us to compute the characteristic distributions from generation to generation. The result is that, indeed, from the first filial generation on, the recurrence relations are exactly the same as in the "ordinary" case, with the arithmetic mean of the corresponding r.v.'s and the arithmetic mean of the corresponding gametic probabilities taking the place of the r.v.'s and of the gametic distribution of the "ordinary" case. Next, we solve the recurrence equations (mathematically they are "difference" equations), that means we derive expressions which give the distribution in the $n$th generation directly in terms of the initial distribution and of the given r.v.'s [(18) and (33)]. Finally, by means of these last formulae the limit distribution as $n \to \infty$ is easily found [(20), (37), and (38)].

* Part of the cost of this publication is paid by the GALTON AND MENDEL MEMORIAL FUND.

GENETICS 33: 548 November 1948

In section 3 about the same is done for general $m \geq 4$. Here a general l.d., or rather two such distributions in our case, take the place of the three pairs of r.v.'s which appear if $m = 3$. The recurrence relations are not quite of the aforementioned type but the limit theorem is still the expected analogue of the simpler case. It is derived directly from the recurrence formula.

There would be no worthwhile simplification if we restricted ourselves to two alleles (dominant and recessive; A and a). Hence an arbitrary number of alleles, $r$, is assumed throughout.

1. The problem in case of m = 2 loci.

Call $m$ the number of Mendelian characters, $r$ the number of alleles under consideration, and begin with the case of $m = 2$ and $r = 2$. There are 16 possible genotypes. It has been found an essential advantage to denote, quite generally, the genotype of an individual in a way which shows clearly the maternal and the paternal heritage. If the two possibilities are denoted by A, a and B, b respectively, the 16 types follow, the letters before the semicolon denoting the maternal heritage:

(AB; AB) (AB; Ab) (AB; aB) (AB; ab)

(Ab; AB) (Ab; Ab) (Ab; aB) (Ab; ab)

(aB; AB) (aB; Ab) (aB; aB) (aB; ab)

(ab; AB) (ab; Ab) (ab; aB) (ab; ab)

If we denote briefly the whole maternal heritage by $x$ and the paternal one by $y$ we assume, as usual, that the genotypes $(x; y)$ and $(y; x)$ are the same:

$$(x; y) = (y; x) \tag{1}$$

(that means, for example, that (AB; Ab) = (Ab; AB)). Hence in the above scheme the types symmetrical to the "main diagonal" are the same, which reduces the maximum number of different types from 16 to 10.

If we consider $r$ alleles and $m$ characters the number $N = r^{2m}$ obviously takes the place of $N = 2^4 = 16$ while $r^m(r^m+1)/2 = N_1$ takes the place of $2^2(2^2+1)/2 = 10$.
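As a quick check of these counts, the following Python sketch evaluates $N = r^{2m}$ and $N_1 = r^m(r^m+1)/2$. The function name `genotype_counts` is an illustrative choice and not part of the paper.

```python
def genotype_counts(r, m):
    """Return (N, N1) for r alleles per character and m characters."""
    gametes = r ** m                            # distinct heritages x (or y)
    ordered = gametes ** 2                      # N = r^(2m): (x; y) and (y; x) counted separately
    unordered = gametes * (gametes + 1) // 2    # N1: (x; y) identified with (y; x)
    return ordered, unordered


for r, m in [(2, 2), (2, 3), (3, 2)]:
    print(r, m, genotype_counts(r, m))
```

For $r = m = 2$ this reproduces the 16 ordered and 10 distinct types of the scheme above.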

Returning to our particular case, each of the 16 types will initially occur in certain proportions. These proportions will, in general, change from generation to generation. We thus introduce the probability distribution of genotypes, $w^{(n)}(x; y)$, in the $n$th generation, where $x$ and $y$ stand for the total maternal and paternal heritage respectively. In accordance with (1) we have to assume that

$$w^{(n)}(x; y) = w^{(n)}(y; x) \qquad (n = 0, 1, \ldots). \tag{2}$$

Since $w^{(n)}(x; y)$ is a probability distribution we have

$$\sum_x \sum_y w^{(n)}(x; y) = 1. \tag{3}$$

In our particular case, $m = r = 2$, we have

$$w^{(n)}(AB; AB) + [w^{(n)}(AB; Ab) + w^{(n)}(Ab; AB)] + \cdots + w^{(n)}(ab; ab) = 1.$$

We may now assume that the initial distributions of genotypes are different for males and females. (This is not an essential assumption since after one generation of random mating the d.ge. will be the same for both sexes. See (7).) Let $\omega^{(0)}(x; y)$ be the initial distribution for the females, $w^{(0)}(x; y)$ that for the males. Then we have, besides (2) and (3),

$$\omega^{(0)}(x; y) = \omega^{(0)}(y; x), \qquad \sum_x \sum_y \omega^{(0)}(x; y) = 1.$$

If $m = r = 2$ four different gametes are possible: (AB), (Ab), (aB), and (ab). They appear in proportions given by the probability distribution of gametes, which in our problem will be a different one for the two sexes. It will be denoted by $p^{(n)}(AB), \ldots$ and by $q^{(n)}(AB), \ldots$ and we have

$$p^{(n)}(AB) + p^{(n)}(Ab) + p^{(n)}(aB) + p^{(n)}(ab) = 1 \tag{4}$$

$$q^{(n)}(AB) + q^{(n)}(Ab) + q^{(n)}(aB) + q^{(n)}(ab) = 1. \tag{4'}$$

In case of random mating the d.ga. is derived from the d.ge. by means of the segregation distribution. The s.d. specifies the kinds of gametes an organism can produce. This s.d., or, as we shall call it, this linkage distribution (l.d.), is trivial in case of $m = 1$ character, and fairly simple in case of $m = 2$, but not so simple for general $m$. If $m = 1$ and $r = 2$ the three different genotypes are (A; A), (A; a), (a; a). (The semicolon is unnecessary if $m = 1$.) We may denote the d.ge. by $w^{(n)}(AA) = p_n$, $w^{(n)}(Aa) = 2q_n$, $w^{(n)}(aa) = r_n$ and the d.ga. by $p^{(n)}(A) = u_n$, $p^{(n)}(a) = v_n$. We then have obviously: $u_n = p_n \cdot 1 + 2q_n \cdot \tfrac{1}{2}$, $v_n = 2q_n \cdot \tfrac{1}{2} + r_n \cdot 1$. We see that the l.d. is here simply given by the factors 1 and $\tfrac{1}{2}$ respectively. In fact, an individual which is of type (AA) transmits A with a probability of 1, while the individual of type (Aa) transmits A with a probability of $\tfrac{1}{2}$, and so on.

If $m = 2$, the l.d. in the "ordinary" case is completely characterized by one parameter, the recombination value (r.v.). With a view to later considerations for general $m$ we may describe the mathematical situation as follows: The individual may transmit a gamete consisting of: a) both "maternal" genes, b) the "maternal" gene with respect to the first character, and with respect to the second character the "paternal" gene, c) with respect to the first character the "paternal," and with regard to the second the "maternal" gene, d) both "paternal" genes. (This "schematic" probability explanation is independent of "linear theory" or "chiasma theory," and it need not be modified if "chromatid segregation" rather than "chromosome segregation" is taken into account.) The probability that either b) or c) will happen is denoted by $c$, and that for a) or d) accordingly by $(1 - c)$. More specifically $c/2$ is the probability of the "event" b) and likewise of the event c), and so on. This obviously applies to any number of alleles since we have merely spoken of "paternal" and "maternal" genes with respect to a character no matter by what gene this character is represented.

In our problem we now assume that this parameter, the r.v., is not the same for the two sexes; we shall call it $c$ for the females and $d$ for the males.

We shall now derive the initial d.ga. from the initial d.ge. by means of our s.d. (or l.d.). We begin with the probability $p^{(0)}(AB)$ of a female gamete and obtain, as will be explained:

$$\begin{aligned} p^{(0)}(AB) = {} & \omega^{(0)}(AB; AB) \cdot 1 + [\omega^{(0)}(AB; Ab) \cdot \tfrac{1}{2} + \omega^{(0)}(Ab; AB) \cdot \tfrac{1}{2}] + [\omega^{(0)}(AB; aB) \cdot \tfrac{1}{2} + \omega^{(0)}(aB; AB) \cdot \tfrac{1}{2}] \\ & + \left[\omega^{(0)}(AB; ab) \cdot \frac{1-c}{2} + \omega^{(0)}(ab; AB) \cdot \frac{1-c}{2}\right] + \left[\omega^{(0)}(Ab; aB) \cdot \frac{c}{2} + \omega^{(0)}(aB; Ab) \cdot \frac{c}{2}\right]. \end{aligned} \tag{5}$$

To understand this let us first remember that a gamete (AB) can be formed only by a parent which 1) possesses A and B and 2) also transmits these genes. Consider, then, for example, the term $\omega^{(0)}(AB; Ab) \cdot \tfrac{1}{2}$. This is the probability that an individual be of type (AB; Ab) times the probability of transmitting AB; this last probability equals $\tfrac{1}{2}$, since A will be transmitted anyway and B (rather than b) with probability $\tfrac{1}{2}$. Likewise the term $\omega^{(0)}(Ab; aB) \cdot c/2$ is the probability that the parent be of type (Ab; aB) and transmits the "mixed" gamete AB which consists of the "maternal" A and the "paternal" B, and this "recombination" happens with probability $c/2$.

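A completely analogous formula holds for males:

$$\begin{aligned} q^{(0)}(AB) = {} & w^{(0)}(AB; AB) \cdot 1 + [w^{(0)}(AB; Ab) \cdot \tfrac{1}{2} + w^{(0)}(Ab; AB) \cdot \tfrac{1}{2}] + [w^{(0)}(AB; aB) \cdot \tfrac{1}{2} + w^{(0)}(aB; AB) \cdot \tfrac{1}{2}] \\ & + \left[w^{(0)}(AB; ab) \cdot \frac{1-d}{2} + w^{(0)}(ab; AB) \cdot \frac{1-d}{2}\right] + \left[w^{(0)}(Ab; aB) \cdot \frac{d}{2} + w^{(0)}(aB; Ab) \cdot \frac{d}{2}\right]. \end{aligned} \tag{6}$$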
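The passage from a genotype distribution to a gametic distribution in (5) and (6) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `gametes_from_genotypes`, the dictionary representation and the sample population are hypothetical and do not come from the paper.

```python
from itertools import product

GAMETES = ["".join(g) for g in product("Aa", "Bb")]   # AB, Ab, aB, ab


def gametes_from_genotypes(w, rec):
    """Gametic distribution produced by parents with genotype distribution w.

    w   : dict mapping (maternal, paternal) pairs such as ("AB", "ab") to probabilities
    rec : recombination value of the sex considered (c for females, d for males)
    """
    out = {g: 0.0 for g in GAMETES}
    for (x, y), prob in w.items():
        out[x] += prob * (1 - rec) / 2          # a) both maternal genes
        out[y] += prob * (1 - rec) / 2          # d) both paternal genes
        out[x[0] + y[1]] += prob * rec / 2      # b) maternal first, paternal second gene
        out[y[0] + x[1]] += prob * rec / 2      # c) paternal first, maternal second gene
    return out


# Hypothetical initial population: every individual is the double heterozygote (AB; ab).
w0 = {("AB", "ab"): 1.0}
p0 = gametes_from_genotypes(w0, rec=0.2)   # females, c = 0.2
q0 = gametes_from_genotypes(w0, rec=0.4)   # males,   d = 0.4
print(p0)
print(q0)
```

Each genotype contributes through the four events a) to d), which collapse to the factors 1, 1/2, (1 - c)/2 and c/2 of formula (5) whenever some of the resulting gametes coincide.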

Next, we remember that a new organism is formed by the fusion of two gametes. If everything happens at random this amounts to the principle:

$$w^{(n+1)}(x; y) = p^{(n)}(x)\, q^{(n)}(y) \qquad (n = 0, 1, \ldots). \tag{7}$$

In fact a zygote whose maternal and paternal heritages are $x$ and $y$ respectively is formed by the fusion of the egg which contributes $x$ and the sperm of type $y$, and the probability of such a zygote is consequently given by (7), where $p^{(n)}(x)$ and $q^{(n)}(y)$ are the respective female and male gametic probabilities. For instance, $w^{(n+1)}(AB; Ab) = p^{(n)}(AB)\, q^{(n)}(Ab)$ ($n = 0, 1, \ldots$). A formula analogous to (5) and (6) holds for $n = 1, 2, \ldots$. In order to obtain a general recurrence relation we write

$$\begin{aligned} p^{(n+1)}(AB) = {} & w^{(n+1)}(AB; AB) \cdot 1 + [w^{(n+1)}(AB; Ab) \cdot \tfrac{1}{2} + w^{(n+1)}(Ab; AB) \cdot \tfrac{1}{2}] \\ & + \cdots + \left[w^{(n+1)}(Ab; aB) \cdot \frac{c}{2} + w^{(n+1)}(aB; Ab) \cdot \frac{c}{2}\right] \end{aligned} \tag{5'}$$

and there is a formula (6') for $q^{(n+1)}(AB)$ which is like (5') with the only difference that the r.v. is now $d$ rather than $c$. Substituting (7) into (5') we obtain

$$\begin{aligned} p^{(n+1)}(AB) = {} & p^{(n)}(AB)\, q^{(n)}(AB) \\ & + [p^{(n)}(AB)\, q^{(n)}(Ab) + p^{(n)}(Ab)\, q^{(n)}(AB)] \cdot \tfrac{1}{2} + [p^{(n)}(AB)\, q^{(n)}(aB) + p^{(n)}(aB)\, q^{(n)}(AB)] \cdot \tfrac{1}{2} \\ & + [p^{(n)}(AB)\, q^{(n)}(ab) + p^{(n)}(ab)\, q^{(n)}(AB)] \cdot \frac{1-c}{2} \\ & + [p^{(n)}(Ab)\, q^{(n)}(aB) + p^{(n)}(aB)\, q^{(n)}(Ab)] \cdot \frac{c}{2} \end{aligned} \tag{8}$$

($n = 0, 1, 2, \ldots$) and an analogous formula for $q^{(n+1)}(AB)$ in terms of $p^{(n)}$, $q^{(n)}$, and $d$. There are also formulae like (8) for $p^{(n+1)}(Ab)$, for $p^{(n+1)}(aB)$, for $p^{(n+1)}(ab)$ and the like for the $q^{(n+1)}$. Such a formula already constitutes a recurrence formula since it shows how to find $p^{(n+1)}$ from the distributions for the $n$th generation. All it needs is to be transformed so as to exhibit a clear structure.
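One generation of random mating, that is the combination of (7) with the segregation events a) to d), can be sketched by brute force in Python. The function name `next_generation` and the starting distributions are hypothetical and chosen only for illustration.

```python
from itertools import product

GAMETES = ["AB", "Ab", "aB", "ab"]


def next_generation(p, q, c, d):
    """Return (p_next, q_next) after one generation of random mating.

    p, q : female and male gametic distributions, dicts over GAMETES
    c, d : female and male recombination values
    """
    p_next = {g: 0.0 for g in GAMETES}
    q_next = {g: 0.0 for g in GAMETES}
    for x, y in product(GAMETES, repeat=2):
        w = p[x] * q[y]                          # formula (7): zygote (x; y)
        for rec, out in ((c, p_next), (d, q_next)):
            out[x] += w * (1 - rec) / 2          # gamete with both maternal genes
            out[y] += w * (1 - rec) / 2          # gamete with both paternal genes
            out[x[0] + y[1]] += w * rec / 2      # recombinant gametes
            out[y[0] + x[1]] += w * rec / 2
    return p_next, q_next


# Hypothetical starting distributions, chosen only for illustration.
p = {"AB": 0.4, "Ab": 0.1, "aB": 0.2, "ab": 0.3}
q = {"AB": 0.1, "Ab": 0.5, "aB": 0.3, "ab": 0.1}
p1, q1 = next_generation(p, q, c=0.1, d=0.3)
print(p1)
print(q1)
```

Daughters and sons share the same genotype distribution $p^{(n)}(x)\,q^{(n)}(y)$; they differ only in the recombination value used when they form their own gametes, which is why the loop applies `c` and `d` to the same zygote weight.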

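For this purpose we introduce the gene probabilities (in probability calculus called "marginal distributions"), namely:

$$\begin{aligned} p_1^{(n)}(A) &= p^{(n)}(AB) + p^{(n)}(Ab), & p_1^{(n)}(a) &= p^{(n)}(aB) + p^{(n)}(ab), \\ p_2^{(n)}(B) &= p^{(n)}(AB) + p^{(n)}(aB), & p_2^{(n)}(b) &= p^{(n)}(Ab) + p^{(n)}(ab) \end{aligned} \tag{9}$$

for $n = 0, 1, \ldots$. Here $p_1^{(n)}(A)$ is the probability of the gene A, and so forth, and we have

$$p_1^{(n)}(A) + p_1^{(n)}(a) = 1, \qquad p_2^{(n)}(B) + p_2^{(n)}(b) = 1 \qquad (n = 0, 1, \ldots). \tag{9'}$$

We introduce in the same way for the males: $q_1^{(n)}(A), \ldots, q_2^{(n)}(b)$. Using these definitions as well as (4) and (4') we find by an elementary computation that the right side of (8) equals

$$\frac{1-c}{2}\,[p^{(n)}(AB) + q^{(n)}(AB)] + \frac{c}{2}\,[p_1^{(n)}(A)\, q_2^{(n)}(B) + q_1^{(n)}(A)\, p_2^{(n)}(B)]$$

and we get the two formulae

$$\begin{aligned} p^{(n+1)}(AB) &= \frac{1-c}{2}\,[p^{(n)}(AB) + q^{(n)}(AB)] + \frac{c}{2}\,[p_1^{(n)}(A)\, q_2^{(n)}(B) + q_1^{(n)}(A)\, p_2^{(n)}(B)], \\ q^{(n+1)}(AB) &= \frac{1-d}{2}\,[p^{(n)}(AB) + q^{(n)}(AB)] + \frac{d}{2}\,[p_1^{(n)}(A)\, q_2^{(n)}(B) + q_1^{(n)}(A)\, p_2^{(n)}(B)] \end{aligned} \qquad (n = 0, 1, \ldots). \tag{10}$$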

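In the same way we obtain formulae for $p^{(n+1)}(Ab)$, $p^{(n+1)}(aB)$, $p^{(n+1)}(ab)$, $\ldots$, $q^{(n+1)}(ab)$.

Next we add the formula for $p^{(n+1)}(AB)$ and that for $p^{(n+1)}(Ab)$. Using (9') we find

$$p_1^{(n+1)}(A) = \tfrac{1}{2}\,[p_1^{(n)}(A) + q_1^{(n)}(A)] \qquad (n = 0, 1, \ldots) \tag{11}$$

and in the same way

$$q_1^{(n+1)}(A) = \tfrac{1}{2}\,[p_1^{(n)}(A) + q_1^{(n)}(A)] \qquad (n = 0, 1, \ldots). \tag{11'}$$

Comparing (11) and (11') we see the interesting fact that $p_1^{(n+1)}(A) = q_1^{(n+1)}(A)$ for $n = 0, 1, \ldots$ or, since the same holds for the other genes:

$$q_1^{(n)}(A) = p_1^{(n)}(A), \quad q_1^{(n)}(a) = p_1^{(n)}(a), \quad \ldots, \quad q_2^{(n)}(b) = p_2^{(n)}(b) \qquad (n = 1, 2, \ldots)$$

while the $q^{(n)}(AB)$ are not equal to the $p^{(n)}(AB)$, and so forth. Substituting this in (11) we obtain

$$p_1^{(n+1)}(A) = \tfrac{1}{2}\,[p_1^{(n)}(A) + p_1^{(n)}(A)] = p_1^{(n)}(A) \qquad (n = 1, 2, \ldots).$$

Thus we have for the gene probabilities, for $n = 1, 2, \ldots$:

$$p_1^{(n)}(A) = p_1^{(1)}(A) = \tfrac{1}{2}\,[p_1^{(0)}(A) + q_1^{(0)}(A)] = q_1^{(1)}(A) = q_1^{(n)}(A) \tag{12}$$

$$p_2^{(n)}(B) = p_2^{(1)}(B) = \tfrac{1}{2}\,[p_2^{(0)}(B) + q_2^{(0)}(B)] = q_2^{(1)}(B) = q_2^{(n)}(B) \tag{12'}$$

and analogous relations for the $p_1^{(n)}(a)$ and $p_2^{(n)}(b)$. If we introduce these results into (10) these formulae simplify. In fact, for $n = 1, 2, \ldots$

$$p_1^{(n)}(A)\, q_2^{(n)}(B) + q_1^{(n)}(A)\, p_2^{(n)}(B) = p_1^{(1)}(A)\, q_2^{(1)}(B) + q_1^{(1)}(A)\, p_2^{(1)}(B) = 2\, p_1^{(1)}(A)\, p_2^{(1)}(B)$$

and we obtain a first result:

$$\begin{aligned} p^{(n+1)}(AB) &= \frac{1-c}{2}\,[p^{(n)}(AB) + q^{(n)}(AB)] + c\, p_1^{(1)}(A)\, p_2^{(1)}(B), \\ q^{(n+1)}(AB) &= \frac{1-d}{2}\,[p^{(n)}(AB) + q^{(n)}(AB)] + d\, p_1^{(1)}(A)\, p_2^{(1)}(B) \end{aligned} \qquad (n = 1, 2, \ldots). \tag{10'}$$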
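The equalisation of the gene probabilities after one generation can be checked numerically. The sketch below assumes that the two formulae (10) have the form $\tfrac{1-c}{2}[p + q] + \tfrac{c}{2}[p_1 q_2 + q_1 p_2]$ for the females and the same expression with $d$ for the males; the names `marginals` and `step` and the starting values are illustrative only.

```python
def marginals(dist):
    """Gene probabilities as in (9): returns (first-character, second-character) dicts."""
    first, second = {}, {}
    for gamete, prob in dist.items():
        first[gamete[0]] = first.get(gamete[0], 0.0) + prob
        second[gamete[1]] = second.get(gamete[1], 0.0) + prob
    return first, second


def step(p, q, c, d):
    """One generation according to the two formulae (10), in the form assumed above."""
    p1, p2 = marginals(p)
    q1, q2 = marginals(q)
    p_next, q_next = {}, {}
    for g in p:
        mean_term = p[g] + q[g]
        cross = p1[g[0]] * q2[g[1]] + q1[g[0]] * p2[g[1]]
        p_next[g] = (1 - c) / 2 * mean_term + c / 2 * cross
        q_next[g] = (1 - d) / 2 * mean_term + d / 2 * cross
    return p_next, q_next


# Hypothetical starting distributions with different gene probabilities in the two sexes.
p = {"AB": 0.4, "Ab": 0.1, "aB": 0.2, "ab": 0.3}
q = {"AB": 0.1, "Ab": 0.5, "aB": 0.3, "ab": 0.1}
history = [(p, q)]
for _ in range(5):
    history.append(step(*history[-1], c=0.1, d=0.3))

for n, (pn, qn) in enumerate(history):
    print(n, round(marginals(pn)[0]["A"], 6), round(marginals(qn)[0]["A"], 6))
```

From $n = 1$ on the two printed columns coincide and stay constant, while the joint distributions $p^{(n)}$ and $q^{(n)}$ themselves remain different.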

Before continuing we wish to state that all this applies in the same way to any number, $r$, of alleles. In fact, in this case, firstly, the formula (5) will contain some more terms; for instance, if $r = 4$ it would start like this:

$$p^{(0)}(a_1b_1) = \omega^{(0)}(a_1b_1; a_1b_1) \cdot 1 + \tfrac{1}{2}\,[\omega^{(0)}(a_1b_1; a_1b_2) + \omega^{(0)}(a_1b_1; a_1b_3) + \omega^{(0)}(a_1b_1; a_1b_4)] + \cdots$$

and the same remark holds for (5') and (6). Formula (7) is general. Formula (8) will again contain more terms since it is derived from (5), but the decisive transformation of (8) will lead to exactly the same result as in case $r = 2$ since we now have, instead of (9) and (9'), with an obvious notation, for $i = 1, 2, \ldots, r$ and $n = 0, 1, \ldots$:

$$p_1^{(n)}(a_i) = \sum_{j=1}^{r} p^{(n)}(a_ib_j), \qquad p_2^{(n)}(b_i) = \sum_{j=1}^{r} p^{(n)}(a_jb_i),$$

$$q_1^{(n)}(a_i) = \sum_{j=1}^{r} q^{(n)}(a_ib_j), \qquad q_2^{(n)}(b_i) = \sum_{j=1}^{r} q^{(n)}(a_jb_i).$$

It is however simpler to use a slightly different notation, namely to introduce a variable, $x$, which takes on the values $a_1, \ldots, a_r$, and a variable $y$, which takes on the values $b_1, \ldots, b_r$. Then we simply have

$$p_1^{(n)}(x) = \sum_y p^{(n)}(xy), \quad p_2^{(n)}(y) = \sum_x p^{(n)}(xy), \quad q_1^{(n)}(x) = \sum_y q^{(n)}(xy), \quad q_2^{(n)}(y) = \sum_x q^{(n)}(xy) \tag{9''}$$

with

$$\sum_x p_1^{(n)}(x) = \sum_y p_2^{(n)}(y) = \sum_x q_1^{(n)}(x) = \sum_y q_2^{(n)}(y) = 1. \tag{9'''}$$

Also instead of (12) we now write

$$p_1^{(n)}(x) = p_1^{(1)}(x) = \tfrac{1}{2}\,[p_1^{(0)}(x) + q_1^{(0)}(x)] = q_1^{(1)}(x) = q_1^{(n)}(x), \tag{12''}$$

$$p_2^{(n)}(y) = p_2^{(1)}(y) = \tfrac{1}{2}\,[p_2^{(0)}(y) + q_2^{(0)}(y)] = q_2^{(1)}(y) = q_2^{(n)}(y) \tag{12'''}$$

($x = a_1, a_2, \ldots, a_r$; $y = b_1, b_2, \ldots, b_r$; $n = 1, 2, \ldots$).

The generalized formulae corresponding to (10') will now be:

$$\begin{aligned} p^{(n+1)}(xy) &= \frac{1-c}{2}\,[p^{(n)}(xy) + q^{(n)}(xy)] + c\, p_1^{(1)}(x)\, p_2^{(1)}(y), \\ q^{(n+1)}(xy) &= \frac{1-d}{2}\,[p^{(n)}(xy) + q^{(n)}(xy)] + d\, p_1^{(1)}(x)\, p_2^{(1)}(y) \end{aligned} \tag{13}$$

($x = a_1, \ldots, a_r$; $y = b_1, \ldots, b_r$; $n = 1, 2, \ldots$).

In order to prove the statement in the introduction we introduce the distribution

$$r^{(n)}(xy) = \tfrac{1}{2}\,[p^{(n)}(xy) + q^{(n)}(xy)] \qquad (n = 0, 1, \ldots) \tag{14}$$

with

$$\sum_x \sum_y r^{(n)}(xy) = 1. \tag{14'}$$

We then obtain, step by step, using (12), (12''), (12'''):

$$\begin{aligned} \sum_y r^{(n)}(xy) = r_1^{(n)}(x) &= \tfrac{1}{2}\,[p_1^{(n)}(x) + q_1^{(n)}(x)] \\ &= \tfrac{1}{2}\,[p_1^{(1)}(x) + q_1^{(1)}(x)] = p_1^{(1)}(x) = q_1^{(1)}(x) \\ &= \tfrac{1}{2}\,[p_1^{(0)}(x) + q_1^{(0)}(x)] = r_1^{(0)}(x) \end{aligned} \qquad (n = 0, 1, \ldots).$$

Thus, for $n = 0, 1, \ldots$

$$\begin{aligned} r_1^{(n)}(x) &= r_1^{(0)}(x) = p_1^{(1)}(x) = q_1^{(1)}(x) = \tfrac{1}{2}\,[p_1^{(0)}(x) + q_1^{(0)}(x)], \\ r_2^{(n)}(y) &= r_2^{(0)}(y) = p_2^{(1)}(y) = q_2^{(1)}(y) = \tfrac{1}{2}\,[p_2^{(0)}(y) + q_2^{(0)}(y)]. \end{aligned} \tag{15}$$

If we now add the two formulae (13) and divide by two, we get our final elegant result, valid for $n = 1, 2, \ldots$

$$r^{(n+1)}(xy) = \left(1 - \frac{c+d}{2}\right) r^{(n)}(xy) + \frac{c+d}{2}\, r_1^{(0)}(x)\, r_2^{(0)}(y). \tag{16}$$

This is exactly the same formula as the usual formula for $m = 2$ (see papers by JENNINGS (1917, 1923), ROBBINS (1918), and the author, l.c.), the only difference being: 1) that $r^{(n)}(xy) = \tfrac{1}{2}\,[p^{(n)}(xy) + q^{(n)}(xy)]$ takes the place of $p^{(n)}(xy)$. 2) $\tfrac{1}{2}(c+d)$ takes the place of $c$. 3) It holds for $n = 1, 2, \ldots$ only, not for $n = 0, 1, \ldots$.

Since (16) holds from $n = 1$ on only, it has to be completed by the direct computation of the first step in order to find $r^{(1)}(xy)$. For this purpose we must return to (10) and find, for $n = 0$, upon addition of the two formulae (10) and dividing by 2:

$$r^{(1)}(xy) = \left(1 - \frac{c+d}{2}\right) r^{(0)}(xy) + \frac{c+d}{4}\,[p_1^{(0)}(x)\, q_2^{(0)}(y) + q_1^{(0)}(x)\, p_2^{(0)}(y)] \tag{16'}$$

where

$$r^{(0)}(xy) = \tfrac{1}{2}\,[p^{(0)}(xy) + q^{(0)}(xy)]$$

as in (14).

The formulae (16) and (16') constitute our main result. Having obtained them we may easily "solve" the recurrence relation (16). This is, mathematically, exactly the same problem as in the "ordinary" case since the recurrence equation has exactly the same form. We thus find with

$$\frac{c+d}{2} = k \tag{17}$$

the result

$$r^{(n)}(xy) = (1-k)^{n-1}\, r^{(1)}(xy) + [1 - (1-k)^{n-1}]\, r_1^{(0)}(x)\, r_2^{(0)}(y) \tag{18}$$

where $r^{(1)}(xy)$ must be computed from (16'). Or, in the form:

$$r^{(n)}(xy) = r_1^{(0)}(x)\, r_2^{(0)}(y) + (1-k)^{n-1}\,[r^{(1)}(xy) - r_1^{(0)}(x)\, r_2^{(0)}(y)] \tag{18'}$$

with (from (16')):

$$r^{(1)}(xy) = (1-k)\, r^{(0)}(xy) + \frac{k}{2}\,[p_1^{(0)}(x)\, q_2^{(0)}(y) + q_1^{(0)}(x)\, p_2^{(0)}(y)].$$
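That the closed form (18) agrees with the step-by-step recurrence (16) can be confirmed with a few lines of Python for a single fixed pair $x$, $y$. All numerical values and the function names below are hypothetical illustrations.

```python
def iterate_r(r1_value, k, target, n):
    """Apply recurrence (16) starting from r^(1) until generation n (n >= 1)."""
    value = r1_value
    for _ in range(n - 1):
        value = (1 - k) * value + k * target
    return value


def closed_form_r(r1_value, k, target, n):
    """Formula (18) for generation n >= 1."""
    decay = (1 - k) ** (n - 1)
    return decay * r1_value + (1 - decay) * target


c, d = 0.1, 0.3           # hypothetical recombination values
k = (c + d) / 2           # formula (17)
r1_value = 0.31           # hypothetical r^(1)(xy) for one fixed pair x, y
target = 0.25             # hypothetical product r_1^(0)(x) * r_2^(0)(y)

for n in (1, 2, 5, 20):
    print(n, iterate_r(r1_value, k, target, n), closed_form_r(r1_value, k, target, n))
```

The deviation from `target` shrinks by the factor $1 - k$ in every generation, which is the content of (18').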

Next let us investigate what happens as $n \to \infty$. We may assume that $c$ is actually different from $d$. Otherwise we have the "ordinary" case where $p^{(n)}(xy) = q^{(n)}(xy)$, and so on. Therefore the trivial case $c = d = 0$, that is complete linkage for both sexes, may be excluded. (The result in this case is, of course: $p^{(n)}(xy) = q^{(n)}(xy) = p^{(0)}(xy)$ for all $n$.)

Thus we assume that $k > 0$ and (18) gives the result:

$$\lim_{n \to \infty} r^{(n)}(xy) = r_1^{(0)}(x)\, r_2^{(0)}(y). \tag{19}$$

Next consider (10) and use (15) and (19). We find:

$$\lim_{n \to \infty} p^{(n)}(xy) = \frac{1-c}{2} \cdot 2 \lim_{n \to \infty} r^{(n)}(xy) + c\, r_1^{(0)}(x)\, r_2^{(0)}(y) = r_1^{(0)}(x)\, r_2^{(0)}(y).$$

Hence the final simple and useful result:

$$\lim_{n \to \infty} p^{(n)}(xy) = \lim_{n \to \infty} q^{(n)}(xy) = r_1^{(0)}(x)\, r_2^{(0)}(y) \tag{20}$$

where

$$r_1^{(0)}(x) = \tfrac{1}{2}\,[p_1^{(0)}(x) + q_1^{(0)}(x)], \qquad r_2^{(0)}(y) = \tfrac{1}{2}\,[p_2^{(0)}(y) + q_2^{(0)}(y)].$$

As $n$, the number of distinct generations, increases, the two gametic distributions $p^{(n)}(xy)$ and $q^{(n)}(xy)$ will tend towards the same distribution, which equals the product $r_1^{(0)}(x)\, r_2^{(0)}(y)$, where each of these two factors denotes the arithmetic mean of the respective gene probabilities in the initial generation. This holds in case of random mating if $k = \tfrac{1}{2}(c+d)$, the arithmetic mean of the two recombination values for females and for males, is different from zero.

2. The case of m = 3 characters.

In this case the distribution of genotypes may be denoted by

$$w^{(n)}(x_1x_2x_3;\ y_1y_2y_3) \tag{21}$$

where now, with a view to the general case of any $m$, $x_1, x_2, x_3$ is used rather than $x, y, z$. (This same notation was anticipated in (7), where we called $x$ the total maternal, $y$ the total paternal, heritage.) Here $x_1$ as well as $y_1$ stands for any of the $r$ genes $a_1, \ldots, a_r$, while $x_2$ or $y_2$ stands for any of the $r$ genes $b_1, \ldots, b_r$ and $x_3$ and $y_3$ for $c_1, \ldots, c_r$, and $x_1, x_2, x_3$ denotes the maternal, $y_1, y_2, y_3$ the paternal heritage. A distribution of genotypes contains, as stated before, $r^{2m}$ values of which not more than $r^m(r^m+1)/2$ correspond to different types. This last gives if $m = 3$: $r^3(r^3+1)/2$, and if $r = 2$ the two numbers are 64 and 36 respectively; we have, for instance, the types (ABC; ABC), (ABC; ABc), $\ldots$, (abc; abc). The sum of all the $r^6$ values in (21) equals one.

In our problem we assume at the beginning two different distributions, $\omega^{(0)}(x_1x_2x_3;\ y_1y_2y_3)$ for the females and $w^{(0)}(x_1x_2x_3;\ y_1y_2y_3)$ for the males. The corresponding distributions of gametes are $p^{(n)}(x_1x_2x_3)$ and $q^{(n)}(x_1x_2x_3)$ with

$$\sum_{x_1}\sum_{x_2}\sum_{x_3} p^{(n)}(x_1x_2x_3) = 1, \qquad \sum_{x_1}\sum_{x_2}\sum_{x_3} q^{(n)}(x_1x_2x_3) = 1 \qquad (n = 0, 1, \ldots). \tag{22}$$

The linkage distribution contains four values with sum one. $v_0 = 2l_0$ is the probability that the produced gamete has only maternal genes or only paternal genes, $v_i = 2l_i$ is the probability that the $i$th gene be maternal, the two others paternal, or vice versa ($i = 1, 2, 3$). Hence

$$v_0 + v_1 + v_2 + v_3 = 2l_0 + 2l_1 + 2l_2 + 2l_3 = 1. \tag{23}$$

The relation between these values and the three r.v.'s, $c_{12}$, $c_{13}$, $c_{23}$ is the following (GEIRINGER 1944, p. 34)

$$\begin{aligned} c_{ij} &= v_i + v_j & &(i = 1, 2, 3;\ j = 1, 2, 3;\ i \neq j), \\ v_i &= \tfrac{1}{2}(c_{ij} + c_{ik} - c_{jk}) & &(i, j, k = 1, 2, 3;\ i \neq j,\ j \neq k,\ i \neq k), \\ v_0 &= \tfrac{1}{2}(2 - c_{12} - c_{13} - c_{23}). \end{aligned} \tag{24}$$
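The conversion (24) between the three recombination values and the four linkage values can be written out directly. The function names `v_from_c` and `c_from_v` and the sample values are illustrative assumptions, not notation from the paper.

```python
def v_from_c(c12, c13, c23):
    """Linkage values (v0, v1, v2, v3) from the three recombination values, per (24)."""
    v1 = (c12 + c13 - c23) / 2
    v2 = (c12 + c23 - c13) / 2
    v3 = (c13 + c23 - c12) / 2
    v0 = (2 - c12 - c13 - c23) / 2
    return v0, v1, v2, v3


def c_from_v(v0, v1, v2, v3):
    """Recombination values (c12, c13, c23) from the linkage values: c_ij = v_i + v_j."""
    return v1 + v2, v1 + v3, v2 + v3


v = v_from_c(0.1, 0.25, 0.2)      # hypothetical recombination values
print(v, sum(v), c_from_v(*v))
```

The four values returned by `v_from_c` add up to one, in agreement with (23), and `c_from_v` recovers the recombination values that were put in.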

In order to derive our recurrence formula, we need again the "marginal" probabilities of gametes, that is, the probability of a gamete which contains, e.g., with respect to the first and the second character the genes $x_1$, $x_2$, or the probability of a gamete which contains with respect to the first character the gene $x_1$:

$$\begin{aligned} p_{12}^{(n)}(x_1x_2) &= \sum_{x_3} p^{(n)}(x_1x_2x_3), \qquad p_{13}^{(n)}(x_1x_3) = \sum_{x_2} p^{(n)}(x_1x_2x_3), \ \ldots \\ p_1^{(n)}(x_1) &= \sum_{x_2}\sum_{x_3} p^{(n)}(x_1x_2x_3) = \sum_{x_2} p_{12}^{(n)}(x_1x_2) = \sum_{x_3} p_{13}^{(n)}(x_1x_3), \ \ldots \end{aligned} \tag{25}$$

with

$$\sum_{x_i}\sum_{x_j} p_{ij}^{(n)}(x_ix_j) = 1, \qquad \sum_{x_i} p_i^{(n)}(x_i) = 1. \tag{25'}$$

Analogous definitions hold for the $q^{(n)}(x_1x_2x_3)$.

The l.d. for the females may be given by $c_{ij}$ or by $v_i = 2l_i$, that for the males by $\bar{c}_{ij}$ or by $\bar{v}_i = 2\bar{l}_i$. We then derive first the formulae which correspond to (10), using either direct explicit computation as in section 1, or the more general conclusions used in the author's previous papers. A first result is

consequently


Hence

$$r_i^{(n)}(x_i) = p_i^{(1)}(x_i) = q_i^{(1)}(x_i) = r_i^{(1)}(x_i) = r_i^{(0)}(x_i). \tag{30'}$$

Addition of (29) and (29') and multiplication by 1/2 gives the final form of the recurrence formula

Since this holds from $n = 1$ on only it has to be complemented by the formula following from (26) and (26') for $n = 0$:

Therefore a statement completely analogous to that at the end of section 1 holds true: The formula (31) is exactly the same as the "ordinary" formula except that 1) $r^{(n)}(x_1x_2x_3) = \tfrac{1}{2}\,[p^{(n)}(x_1x_2x_3) + q^{(n)}(x_1x_2x_3)]$ takes the place of $p^{(n)}(x_1x_2x_3)$. 2) $l_i + \bar{l}_i = \tfrac{1}{2}(v_i + \bar{v}_i)$ takes the place of $v_i$ ($i = 0, \ldots, 3$), or: $k_{ij} = \tfrac{1}{2}(c_{ij} + \bar{c}_{ij})$ takes the place of $c_{ij}$. 3) The recurrence holds from $n = 1$ on only. Consequently (31') has to be considered too.

We may now "solve" (31') in the same way as in the "ordinary" case. Put

(37)

Next we consider (26) and find, using results for $m = 2$:

Hence, as expected: If all $k_{ij} > 0$:

$$\lim_{n \to \infty} p_{123}^{(n)} = \lim_{n \to \infty} q_{123}^{(n)} = \lim_{n \to \infty} r_{123}^{(n)} = r_1^{(0)}\, r_2^{(0)}\, r_3^{(0)}. \tag{38}$$

If one $k_{ij}$ vanishes, for example $k_{12} = 0$, then the corresponding $m_{12} = 1 - k_{12} = 1$, and $s_1 = s_2 = 0$. It then follows that

$$p_{12}^{(n)} = p_{12}^{(0)}, \qquad q_{12}^{(n)} = q_{12}^{(0)}, \qquad r_{12}^{(n)} = r_{12}^{(0)}$$

and

$$\lim_{n \to \infty} p_{123}^{(n)} = \lim_{n \to \infty} q_{123}^{(n)} = \lim_{n \to \infty} r_{123}^{(n)} = r_{12}^{(0)}\, r_3^{(0)}. \tag{38'}$$

The case where all $k_{ij} = 0$ is the ordinary case of complete linkage with $p_{123}^{(n)} = p_{123}^{(0)}$. If two $k_{ij}$ vanish it follows that the third must vanish.

It can be seen easily that these very elegant results holding if $m = 2$ or $m = 3$ can not hold in the same way if $m \geq 4$.

3. The general case of m ≥ 4 characters.

If $m \geq 4$, linkage cannot be completely described in terms of the $m(m-1)/2$ r.v.'s. We need a linkage distribution which may be introduced as follows (GEIRINGER 1944, p. 32). Denote by $S$ the $m$ numbers $1, 2, \ldots, m$, by $A$ any "subset" of $S$ (including $S$), by $A'$ the complementary set $A' = S - A$ (for example $A = 1, 3, 4$, $A' = 2, 5, 6, \ldots, m$). Then $l_A$ is the probability that a gamete transmitted by a parent contains maternal genes with respect to the characters whose numbers are in $A$ and paternal ones with respect to the other characters, and $l_A = l_{A'}$. If we want to specify we write, for example, for $m = 6$, $l_{135} = l_{246}$ and $l_{135} + l_{246} = v_{135}$, which is the probability that the gamete receives with respect to the first, third, and fifth character the maternal, and with respect to the second, fourth and sixth the paternal, gene, or vice versa. There are obviously $2^m$ such $l$-values which are equal to each other in pairs and have the sum one, hence there are $M = 2^{m-1} - 1$ independent values. Thus, for $m \geq 4$, $M > m(m-1)/2$ and we see that the complete linkage situation can not be described in terms of the r.v.'s alone (at least not without an additional hypothesis).
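The comparison between $M = 2^{m-1} - 1$ and $m(m-1)/2$ is easy to tabulate. The function name `linkage_parameter_counts` is an illustrative choice.

```python
def linkage_parameter_counts(m):
    """Return (M, pairs): independent l-values and number of recombination values."""
    independent_l_values = 2 ** (m - 1) - 1      # M = 2^(m-1) - 1
    recombination_values = m * (m - 1) // 2      # one r.v. per pair of characters
    return independent_l_values, recombination_values


for m in range(2, 8):
    M, pairs = linkage_parameter_counts(m)
    print(m, M, pairs, "r.v.'s suffice" if M <= pairs else "r.v.'s do not suffice")
```

For $m = 2$ and $m = 3$ the two counts agree (1 and 3), and from $m = 4$ on the number of independent $l$-values exceeds the number of recombination values.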

It is a noteworthy result that, in the ordinary case, the recurrence relations have still a very simple and clear structure, if we introduce in analogy to (25) the "marginal" distributions for $\mu$ genes where $1 \leq \mu \leq m - 1$. The general recurrence formula as derived by GEIRINGER (1944, page 39) may be written in a very condensed form as follows

$$p_S^{(n+1)} = \sum_{(A)} p_A^{(n)}\, p_{A'}^{(n)} \cdot l_A. \tag{39}$$

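This gives, for example for $m = 4$, if we use $l_A + l_{A'} = v_A = v_{A'}$,

$$\begin{aligned} p_{1234}^{(n+1)} = {} & v_0\, p_{1234}^{(n)} + [v_1\, p_1^{(0)} p_{234}^{(n)} + v_2\, p_2^{(0)} p_{134}^{(n)} + v_3\, p_3^{(0)} p_{124}^{(n)} + v_4\, p_4^{(0)} p_{123}^{(n)}] \\ & + [v_{12}\, p_{12}^{(n)} p_{34}^{(n)} + v_{13}\, p_{13}^{(n)} p_{24}^{(n)} + v_{14}\, p_{14}^{(n)} p_{23}^{(n)}]. \end{aligned} \tag{40}$$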

In writing (40) we have already used that $p_i^{(n)} = p_i^{(0)}$.

Now assume again the general situation where $l_A$ and $\bar{l}_A$ denote the respective l.d.'s for females and males, while $p^{(n)}(x) = p^{(n)}(x_1x_2 \ldots x_m) = p_{12 \ldots m}^{(n)}$ and $q^{(n)}(x) = q^{(n)}(x_1x_2 \ldots x_m) = q_{12 \ldots m}^{(n)}$ are the gametic distributions for females and males respectively. In a way analogous to that before we derive for $m = 4$

If we now add the two formulae (41) and introduce $r^{(n)}$ we clearly see that the terms with $(l_i + \bar{l}_i)$ ($i = 0, \ldots, 4$) will, as before, depend on the r-distribution alone while the last three terms with factors $(l_{ij} + \bar{l}_{ij})$ cannot be expressed in terms of the r-distribution alone. We obtain, using (42) and (43)

$$r_{1234}^{(n+1)} = (l_0 + \bar{l}_0)\, r_{1234}^{(n)} + \{(l_1 + \bar{l}_1)\, r_1^{(0)} r_{234}^{(n)} + \cdots\} + \{(l_{12} + \bar{l}_{12}) \cdot \tfrac{1}{2}\,[p_{12}^{(n)} q_{34}^{(n)} + q_{12}^{(n)} p_{34}^{(n)}] + \cdots\}. \tag{46}$$

If we consider a greater value of $m$ there are, if $m = 2\mu$ or $2\mu + 1$ respectively, $(\mu + 1)$ groups of terms, and of these $\mu + 1$ groups the first two groups only can be expressed in terms of the r-distribution while all other $(\mu + 1) - 2 = \mu - 1$ groups will contain the $p_A^{(n)}$ and $q_A^{(n)}$. Hence the elegant results found for $m = 2$ and $m = 3$ no longer hold in full simplicity. The mean of the two l.d.'s is still the l.d. in the recurrence relations but the gametic distributions appear both and not their mean only.

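Formula (46) is typical. We can easily get the analogous result for any $m$, for instance for $m = 7$ (of course with other $l$-values):

$$\begin{aligned} r_{12 \ldots 7}^{(n+1)} = {} & (l_0 + \bar{l}_0)\, r_{12 \ldots 7}^{(n)} + \{(l_1 + \bar{l}_1)\, r_1^{(0)} r_{23 \ldots 7}^{(n)} + \cdots \ (7 \text{ terms})\} \\ & + \{(l_{12} + \bar{l}_{12}) \cdot \tfrac{1}{2}\,[p_{12}^{(n)} q_{34 \ldots 7}^{(n)} + q_{12}^{(n)} p_{34 \ldots 7}^{(n)}] + \cdots \ (21 \text{ terms})\} \\ & + \{(l_{123} + \bar{l}_{123}) \cdot \tfrac{1}{2}\,[p_{123}^{(n)} q_{4567}^{(n)} + q_{123}^{(n)} p_{4567}^{(n)}] + \cdots \ (35 \text{ terms})\}. \end{aligned} \tag{47}$$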

Now it is interesting to see that notwithstanding this less simple structure of the recurrence relations the limit theorem is still the same. We shall derive it directly from the recurrence formula without using an explicit "solution." We shall use induction. Consider (46) and assume that all six mean r.v.'s, $k_{ij} = \tfrac{1}{2}(c_{ij} + \bar{c}_{ij})$, are greater than zero, that means, for no $(i, j)$ both $c_{ij}$ and $\bar{c}_{ij}$ vanish. Write

$$r_{1234}^{(n+1)} - (l_0 + \bar{l}_0)\, r_{1234}^{(n)} = \{(l_1 + \bar{l}_1)\, r_1^{(0)} r_{234}^{(n)} + \cdots\} + \{(l_{12} + \bar{l}_{12}) \cdot \tfrac{1}{2}\,[p_{12}^{(n)} q_{34}^{(n)} + q_{12}^{(n)} p_{34}^{(n)}] + \cdots\}. \tag{46'}$$

The right side of this equation contains merely distributions of orders 1, 2, and 3. For such distributions we studied the limit behavior before. Consider for example $r_{234}^{(n)}$. Since $k_{23}$, $k_{24}$, $k_{34}$ are $\neq 0$ we know from section 2 that $r_{234}^{(n)} \to r_2^{(0)} r_3^{(0)} r_4^{(0)}$, hence $r_1^{(0)} r_{234}^{(n)} \to r_1^{(0)} r_2^{(0)} r_3^{(0)} r_4^{(0)}$, and the same holds for $r_2^{(0)} r_{134}^{(n)} \to r_1^{(0)} r_2^{(0)} r_3^{(0)} r_4^{(0)}$. Next consider $p_{12}^{(n)}$; because of $k_{12} \neq 0$ and our result for $m = 2$, $p_{12}^{(n)} \to r_1^{(0)} r_2^{(0)}$ and $q_{34}^{(n)} \to r_3^{(0)} r_4^{(0)}$ since $k_{34} \neq 0$. Furthermore

$$(l_1 + \cdots + l_4) + (l_{12} + l_{13} + l_{14}) + (\bar{l}_1 + \cdots + \bar{l}_4) + (\bar{l}_{12} + \bar{l}_{13} + \bar{l}_{14}) = \tfrac{1}{2} - l_0 + \tfrac{1}{2} - \bar{l}_0 = 1 - (l_0 + \bar{l}_0).$$

Hence the right side of (46') converges towards

$$[1 - (l_0 + \bar{l}_0)]\, r_1^{(0)} r_2^{(0)} r_3^{(0)} r_4^{(0)}.$$

The left side of (46') is of the form $x_{n+1} - \alpha x_n$ and the whole equation is of the form $x_{n+1} - \alpha x_n = y_n$ where $\lim_{n \to \infty} y_n = y = [1 - (l_0 + \bar{l}_0)]\, r_1^{(0)} \cdots r_4^{(0)}$. It is an easily proved theorem of analysis that for an equation of this form, if $|\alpha| < 1$ and $\lim_{n \to \infty} y_n = y$, then $\lim_{n \to \infty} x_n = y/(1 - \alpha)$. Now here $\alpha = l_0 + \bar{l}_0$ and this is less than one unless all r.v.'s are zero. Hence $\lim_{n \to \infty} x_n = y/(1 - l_0 - \bar{l}_0)$ and consequently

$$\lim_{n \to \infty} r_{1234}^{(n)} = r_1^{(0)} r_2^{(0)} r_3^{(0)} r_4^{(0)}. \tag{48}$$
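The theorem of analysis used here, that $x_{n+1} - \alpha x_n = y_n$ with $|\alpha| < 1$ and $y_n \to y$ forces $x_n \to y/(1 - \alpha)$, can be illustrated numerically. All numbers below are hypothetical, and the geometric transient in `ys` is just one convenient example of a convergent right-hand side.

```python
def iterate(alpha, y_sequence, x0):
    """Iterate x_{n+1} = alpha * x_n + y_n and return the final x."""
    x = x0
    for y_n in y_sequence:
        x = alpha * x + y_n
    return x


alpha = 0.5625                     # hypothetical value of l0 + l0-bar, below one
y_limit = 0.02                     # hypothetical limit y of the right-hand sides
# A right-hand side that converges to y_limit (illustrative transient only).
ys = [y_limit + 0.3 * 0.75 ** n for n in range(400)]

x_final = iterate(alpha, ys, x0=0.9)
print(x_final, y_limit / (1 - alpha))
```

The starting value `x0` has no influence on the limit, since its contribution is multiplied by `alpha` in every step.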

Next consider the first formula (45). We find, using (48) and our results for m = 2 and m = 3:

$$
\begin{aligned}
\lim_{n\to\infty} p_{1234}^{(n)} &= v_0\, r_1^{(0)} r_2^{(0)} r_3^{(0)} r_4^{(0)} + v_1\, r_1^{(0)} \cdots r_4^{(0)} + \cdots + v_{12}\, r_1^{(0)} \cdots r_4^{(0)} + \cdots \\
&= r_1^{(0)} \cdots r_4^{(0)} \cdot [v_0 + v_1 + \cdots + v_4 + v_{12} + v_{13} + v_{14}] = r_1^{(0)} r_2^{(0)} r_3^{(0)} r_4^{(0)}.
\end{aligned}
$$

Hence, just as before: If all $k_{ij} \neq 0$,

$$
\lim_{n\to\infty} p_{1234}^{(n)} = \lim_{n\to\infty} q_{1234}^{(n)} = r_1^{(0)} r_2^{(0)} r_3^{(0)} r_4^{(0)}. \tag{49}
$$

In the same way, using (48), (49), and the results for m < 4, we prove for m = 5 the result which corresponds to (49), and so we go on. Our final result is that, for any m, if $k_{ij} = \tfrac12(c_{ij}+\bar c_{ij}) > 0$:

$$
\lim_{n\to\infty} p_{12\cdots m}^{(n)} = \lim_{n\to\infty} q_{12\cdots m}^{(n)} = r_1^{(0)} r_2^{(0)} \cdots r_m^{(0)} \tag{50}
$$

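where, as before,

$$
r_i^{(0)} = \tfrac12 \bigl( p_i^{(0)} + q_i^{(0)} \bigr), \qquad (i = 1, 2, \dots, m) \tag{50'}
$$

with

$$
p_i^{(0)} \equiv p_i^{(0)}(x_i) = \sum_{x_1} \cdots \sum_{x_{i-1}} \sum_{x_{i+1}} \cdots \sum_{x_m} p^{(0)}(x_1, \dots, x_m).
$$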

In words: Consider with respect to m genotypes an arbitrary distribution for males and another one for females, assume random mating and not necessarily identical l.d.'s for the two sexes, but such that the $m(m-1)/2$ "mean recombination values" $k_{ij} = \tfrac12(c_{ij}+\bar c_{ij})$ are all different from zero. Denote by $p^{(n)}(x_1 \cdots x_m)$ and $q^{(n)}(x_1 \cdots x_m)$ the distribution of gametes for females and males respectively in the nth generation and by

$$
r^{(0)}(x_1 \cdots x_m) = \tfrac12 \bigl[ p^{(0)}(x_1 \cdots x_m) + q^{(0)}(x_1 \cdots x_m) \bigr]
$$

the mean distribution of gametes for the initial generation, with the corresponding mean gene distribution $r_i^{(0)}$ of the ith character. Then, as n increases, both $p^{(n)}(x)$ and $q^{(n)}(x)$ will tend towards the same limit distribution, where, according to (50), the m genes are independently distributed. Moreover: The gene distribution for each single character remains the same for males and for females and for all n from the first filial generation on:

$$
p_i^{(n)}(x_i) = q_i^{(n)}(x_i) = r_i^{(0)}(x_i), \qquad (i = 1, \dots, m).
$$

Also because of (50) we have for the distribution of genotypes

$$
\lim_{n\to\infty} u^{(n)}(x_1 \cdots x_m;\, y_1 \cdots y_m) = r_1^{(0)}(x_1) \cdots r_m^{(0)}(x_m)\, r_1^{(0)}(y_1) \cdots r_m^{(0)}(y_m). \tag{52}
$$


If the $k_{ij}$ are not all different from zero we have, within the considered linkage group of size m, smaller groups of completely linked genes and results analogous to (38'). If for instance m = 6 and the values $k_{12}$, $k_{13}$, and (consequently) $k_{23}$ vanish while the twelve other values $k_{ij}$ are different from zero, the result is

$$
\lim_{n\to\infty} p^{(n)}_{12\cdots6} = \lim_{n\to\infty} q^{(n)}_{12\cdots6} = r_{123}^{(0)} r_4^{(0)} r_5^{(0)} r_6^{(0)}
$$

where, corresponding to the complete linkage of the first three characters,

$$
p_{123}^{(n)} = q_{123}^{(n)} = p_{123}^{(1)} = q_{123}^{(1)} = r_{123}^{(0)}, \qquad (n = 1, 2, \dots).
$$

On the whole we see that the main results concerning random mating of m linked characters are not changed essentially if the linkage values are different for the two sexes. The “mean linkage distribution” $\tfrac12(l_A + \bar l_A)$ takes the place of the ordinary l.d. The analogy to the “ordinary” case is complete for $m \le 3$ characters. If $m \ge 4$ the limit theorem is still of the usual type, while the recurrence relations present a “mixed” structure.

SUMMARY

On the basis of Mendel’s Theory of Heredity the mathematics of random breeding for autosomal factors is worked out under the following assumptions: 1) Any number, finite or infinite, of distinct, nonoverlapping generations is considered; 2) the number of alleles and the number of Mendelian factors are arbitrary; 3) the “crossover distributions” for males and females are not necessarily equal.

LITERATURE CITED

GEIRINGER, HILDA, 1944 On the probability theory of linkage in Mendelian heredity. Ann. Math. Stat. 15: 25-57.
1945 Further remarks on linkage theory in Mendelian heredity. Ann. Math. Stat. 16: 390-393.

JENNINGS, H. S., 1917 The numerical results of diverse systems of breeding, with respect to two pairs of characters, linked or independent, with special relation to the effect of linkage. Genetics 2: 97-154.
1923 The numerical relations in the crossing over of the genes, with a critical examination of the theory that the genes are arranged in a linear series. Genetics 8: 393-457.

ROBBINS, R. B., 1918a Applications of mathematics to breeding problems. II. Genetics 3: 73-92.
1918b Some applications of mathematics to breeding problems. III. Genetics 3: 375-389.