MATHEMATIK-ARBEITSPAPIERE


MATHEMATIK-ARBEITSPAPIERE

A: MATHEMATISCHE FORSCHUNGSPAPIERE

FACHBEREICH MATHEMATIK UND INFORMATIK

UNIVERSITÄT BREMEN

Bibliothekstraße, D-28359 Bremen

Germany

A Formal Derivation Of The Conditional Likelihood For Matched Case-Control Studies

Gerhard Osius

1 Introduction and Notation

2 Odds Ratios

3 Odds Ratio Models

4 Conditional Sampling Distribution and Likelihood

5 Case-Control Studies With 1:M Matching

6 Case-Control Studies With 1:1 Matching

A Multivariate Noncentral Hypergeometric Distributions

B Rank and Order Statistics

Abstract

Likelihood analysis for logistic regression models in matched case-control studies is based on a conditional likelihood. We provide a formal derivation of this conditional likelihood in a general setting by conditioning the sampling distribution of the risk factors (in each matching group) on its observed order statistic with respect to lexicographical ordering. The required joint distribution of the rank and order statistics for vectors is derived in a more general context. The conditional likelihood depends only upon the odds ratios of interest, i.e. for different values of the risk factor and the same value of the matching variable. Hence the conditional analysis may be used in parametric (e.g. logistic regression) as well as in nonparametric models for the odds ratios. In the presence of only two disease categories (case and control) and 1:M matching the conditional likelihood is a product of multinomial probabilities (reducing to binomial probabilities for M = 1). In this case the conditional analysis can be done within the framework of generalized linear models. In the presence of more than two disease categories the conditional distribution of the covariate in each matching group is a multivariate noncentral hypergeometric distribution (whose basic properties are also given) and the conditional analysis requires a higher computational effort.

1 This work was supported by the German Research Foundation (DFG), grant Os-14411, and is a revised version of the German report (Oct. 1997) for the DFG.

2 Institut für Statistik, Fachbereich 3, Universität Bremen, Postfach 330440, 28334 Bremen, Germany. E-mail: [email protected]

Conditional likelihood for matched case-control studies

1. Introduction and Notation

Let the status of a specific disease (e.g. lung cancer) be given by a discrete variable $Y \in \{0, 1, \dots, K\}$, where 0 represents no disease and $1, \dots, K$ are the different disease categories of interest. In the case $K = 1$ the variable $Y$ is binary and hence an indicator. In epidemiology one investigates the dependence of the disease status $Y$ upon a vector $X \in \mathbb{R}^R$ of (suspected) risk factors (e.g. consumption of tobacco, exposure to asbestos), taking into account an additional vector $Z \in \mathbb{R}^T$ of confounders (e.g. age and gender) which are not of primary interest. An important (often retrospective) sampling design for this situation is a matched case-control study, in which the risk variable is sampled from conditional distributions given the disease status $Y = k$ and the confounder $Z = z$, which is also referred to as the matching variable.

The association between $X$ and $Y$ conditional upon $Z = z$ is completely determined through the family of odds ratios (cf. Osius 2000)

(1) $\mathrm{OR}_k(u, v \mid z) = \dfrac{\pi_k(u,z)/\pi_0(u,z)}{\pi_k(v,z)/\pi_0(v,z)}$, where $\pi_k(x,z) = P(Y{=}k \mid X{=}x, Z{=}z)$,

for all disease categories $k = 1, \dots, K$ and all pairs $u, v$ of values of the risk variable $X$. In the logistic regression model these odds ratios depend on unknown parameter vectors $\beta_1, \dots, \beta_K \in \mathbb{R}^R$ through

(2) $\log \mathrm{OR}_k(u, v \mid z) = (u - v)^T \beta_k$.

Although the likelihood of a matched case-control study involves the above odds ratios of interest (and hence the parameters $\beta_k$ of the logistic model), it also depends on additional nuisance parameters. The number of nuisance parameters typically increases with the number of matching groups. This may result in severely biased and inconsistent maximum likelihood estimates $\hat\beta_k$ if the matching variable $Z$ ranges over infinitely many values (e.g. if one component of $Z$ is continuous), cf. Breslow and Day (1980), Sec. 7.1. Therefore a conditional likelihood has been proposed (Liddell et al. 1977, Breslow and Day 1980), which depends only on the parameters $\beta_k$ of interest and leads to consistent estimates.

Our aim is to provide a formal derivation of this conditional likelihood in a general context (not restricted to parametric logistic models) by conditioning the sample of the matched case-control study upon a suitable family of random variables, namely the order statistics of the risk factors in each matching group. Besides a clarification of the term conditional likelihood, this derivation also allows a Bayesian approach by considering sampling from conditional distributions as a (pseudo) experiment (cf. van der Linde and Osius 2001). Our derivation of the conditional sampling distribution is not limited to the logistic regression model (2) but allows a general type of models (including nonparametric ones) for the odds ratios (1). For binary $Y$ (i.e. $K = 1$) we have a closer look at 1:M matching, where the conditional likelihood is a product of multinomial likelihoods (and hence a product of binomials for $M = 1$). This allows a conditional likelihood analysis using standard techniques and software.

The formal derivations require some assumptions and notations which will be given prior to the more substantial considerations. Although the discrete variable $Y$ was introduced as a disease status this is not essential, and more generally $Y$ may represent any kind of status whose value 0 is considered a normal or reference status. The distributions of the risk factor $X \in \mathbb{R}^R$ and the matching variable $Z \in \mathbb{R}^T$ will not be restricted, and in particular the components of each vector may be discrete, continuous or mixed. We only assume that the joint distribution $\mathcal{L}(Y, X, Z)$ of $(Y, X, Z)$ has a density with respect to some product measure $\mu = \mu_Y \times \mu_X \times \mu_Z$ on the product space $\Omega_Y \times \mathbb{R}^R \times \mathbb{R}^T$. Here $\mu_Y$ is the counting measure on $\Omega_Y$, and typically (but not necessarily)

$\mu_X = \mu_{X1} \times \dots \times \mu_{XR}$ and $\mu_Z = \mu_{Z1} \times \dots \times \mu_{ZT}$

are product measures too, with each factor being Lebesgue measure resp. the counting measure on $\mathbb{R}$, provided the corresponding component of $X$ or $Z$ is continuously resp. discretely distributed.

$Y$ has support $\Omega_Y = \{0, 1, \dots, K\}$, and let $\Omega_X \subset \mathbb{R}^R$ resp. $\Omega_Z \subset \mathbb{R}^T$ denote the supports of $X$ resp. $Z$, i.e. the smallest closed sets carrying the distributions of $X$ resp. $Z$.

In analogy to the generic notation $P$ for the probability we will use $p$ for the density, e.g. $p(Y{=}y, X{=}x, Z{=}z)$ denotes the density of $\mathcal{L}(Y, X, Z)$ in $(y, x, z)$ and $p(X{=}x, Z{=}z \mid Y{=}y)$ the (conditional) density of $\mathcal{L}(X, Z \mid Y{=}y)$ in $(x, z)$. Furthermore we assume that the joint density of $(Y, X, Z)$ is positive on its support $\Omega_Y \times \Omega_X \times \Omega_Z$:

(4) $p(Y{=}y, X{=}x, Z{=}z) > 0$ for all $(y, x, z) \in \Omega_Y \times \Omega_X \times \Omega_Z$.

The lexicographical ordering on $\Omega_X \subset \mathbb{R}^R$ needed later may be defined recursively over the length $R$ of the vectors by

(5) $(u, u_R) \le (v, v_R) \iff u < v \;\text{ or }\; (u = v \text{ and } u_R \le v_R)$

for all $u, v \in \mathbb{R}^{R-1}$ and $u_R, v_R \in \mathbb{R}$. Any two vectors $u, v \in \mathbb{R}^R$ are lexicographically comparable, i.e. $u \le v$ or $v \le u$ holds, and hence the lexicographic minimum $\min(u,v)$ and maximum $\max(u,v)$ of $(u,v)$ are defined. As usual, $u < v$ abbreviates ($u \le v$ and $u \ne v$).
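The recursive definition amounts to comparing components from left to right, with the first differing component deciding; a minimal sketch (the function names are ours, for illustration only):

```python
def lex_le(u, v):
    """Lexicographic u <= v for equal-length numeric vectors: compare
    components left to right; the first differing component decides."""
    for a, b in zip(u, v):
        if a != b:
            return a < b
    return True  # u == v

# minimum and maximum with respect to the lexicographic ordering
lex_min = lambda u, v: u if lex_le(u, v) else v
lex_max = lambda u, v: v if lex_le(u, v) else u

print(lex_le((1, 5), (2, 0)))   # True: the first components decide
print(lex_min((1, 5), (1, 2)))  # (1, 2)
```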

2. Odds Ratios

A statistical analysis typically focuses on certain parameters of interest and not on the whole distribution of $(Y, X, Z)$ itself. Of primary interest in applications is the conditional distribution of the status $Y$ given the values $x \in \Omega_X$ and $z \in \Omega_Z$ of the risk factor $X$ and the matching variable $Z$, i.e. the conditional probability

(1) $\pi_k(x, z) = P(Y{=}k \mid X{=}x, Z{=}z)$

for status $k \in \Omega_Y$. Instead of the conditional probabilities, the odds with respect to the reference status 0 may be used, i.e.

(2) $\mathrm{Odds}_k\, \pi(x, z) = \pi_k(x, z) / \pi_0(x, z)$, $\quad k = 1, \dots, K$.

The probability vector $\pi(x, z) \in (0,1)^{1+K}$ is uniquely determined by the $K$-dimensional vector $\mathrm{Odds}(\pi(x, z)) \in (0, \infty)^K$.

The odds ratio of a status $k = 1, \dots, K$ for two values $u, v \in \Omega_X$ of the risk factor and a fixed value $z \in \Omega_Z$ of the matching variable is given by

(3) $\mathrm{OR}_k(u, v \mid z) = \dfrac{\mathrm{Odds}_k\, \pi(u, z)}{\mathrm{Odds}_k\, \pi(v, z)}$.

From $\mathrm{OR}_k(u,v \mid z) = \mathrm{OR}_k(v,u \mid z)^{-1}$ and $\mathrm{OR}_k(u,u \mid z) = 1$ we conclude that the family of all odds ratios (3), which characterizes the association of $(X,Y)$ for a given $Z = z$ (cf. Osius 2000), is already determined by its subfamily with $u < v$.


The logarithm of the odds ratio will be denoted by

(4) $\psi_k(u,v \mid z) = \log \mathrm{OR}_k(u,v \mid z) = \log \mathrm{Odds}_k\, \pi(u, z) - \log \mathrm{Odds}_k\, \pi(v, z) = \mathrm{logit}_k(\pi(u, z)) - \mathrm{logit}_k(\pi(v, z))$,

where the multivariate logit transform of a probability vector $\pi \in (0,1)^{1+K}$ is defined as the $K$-dimensional vector $\mathrm{logit}(\pi)$ with components

(5) $\mathrm{logit}_k(\pi) = \log \pi_k - \log \pi_0$, $\quad k = 1, \dots, K$.
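The quantities (2)-(5) are elementary to compute; a small numerical sketch with hypothetical probability vectors ($K = 2$), illustrating that the log odds-ratio equals the difference of logits as in (4):

```python
import numpy as np

def multivariate_logit(pi):
    """logit_k(pi) = log pi_k - log pi_0 for k = 1..K, cf. (5)."""
    pi = np.asarray(pi, dtype=float)
    return np.log(pi[1:]) - np.log(pi[0])

def odds_ratio(pi_u, pi_v):
    """OR_k(u, v | z): ratio of the odds of status k at two values
    u, v of the risk factor (same z), cf. (2) and (3)."""
    pi_u, pi_v = np.asarray(pi_u, float), np.asarray(pi_v, float)
    return (pi_u[1:] / pi_u[0]) / (pi_v[1:] / pi_v[0])

# hypothetical probability vectors (pi_0, pi_1, pi_2) at X = u and X = v
pi_u, pi_v = [0.5, 0.3, 0.2], [0.7, 0.2, 0.1]
print(odds_ratio(pi_u, pi_v))                               # approx [2.1, 2.8]
# log OR equals the difference of the logit transforms, as in (4)
print(multivariate_logit(pi_u) - multivariate_logit(pi_v))
```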

To arrive at further representations of the odds ratios we consider other conditional distributions, starting by conditioning $(X,Z)$ upon $Y$. The density of the conditional distribution $\mathcal{L}(X, Z \mid Y{=}k)$ for $k \in \Omega_Y$ is given by

(6) $p(X{=}x, Z{=}z \mid Y{=}k) = p(Y{=}k, X{=}x, Z{=}z) / P(Y{=}k)$

with the marginal probability

(7) $P(Y{=}k) = \int p(Y{=}k, X{=}x, Z{=}z)\, \mu_X(dx)\, \mu_Z(dz)$.

The density ratio for two values $u, v \in \Omega_X$ is

(8) $\mathrm{DR}(u,v \mid z,k) = p(X{=}u, Z{=}z \mid Y{=}k) / p(X{=}v, Z{=}z \mid Y{=}k) = p(Y{=}k, X{=}u, Z{=}z) / p(Y{=}k, X{=}v, Z{=}z)$.

Hence the odds ratio also appears as a ratio of density ratios

(9) $\mathrm{OR}_k(u,v \mid z) = \mathrm{DR}(u,v \mid z,k) / \mathrm{DR}(u,v \mid z,0)$.

Consider now the conditional distribution of $X$ given $(Y,Z)$. The density of $\mathcal{L}(X \mid Y{=}k, Z{=}z)$ for any $k \in \Omega_Y$ and $z \in \Omega_Z$ is given by

(10) $p(X{=}x \mid Y{=}k, Z{=}z) = p(Y{=}k, X{=}x, Z{=}z) / p(Y{=}k, Z{=}z)$

with

(11) $p(Y{=}k, Z{=}z) = \int p(Y{=}k, X{=}x, Z{=}z)\, \mu_X(dx)$.

The density ratio (8) may also be written as a ratio for the density (10):

(12) $\mathrm{DR}(u,v \mid z,k) = p(X{=}u \mid Y{=}k, Z{=}z) / p(X{=}v \mid Y{=}k, Z{=}z)$.


Hence the odds ratio (9) is also a parameter of the conditional distributions $\mathcal{L}(X \mid Y{=}k, Z{=}z)$:

(13) $\mathrm{OR}_k(u,v \mid z) = \dfrac{p(X{=}u \mid Y{=}k, Z{=}z) \cdot p(X{=}v \mid Y{=}0, Z{=}z)}{p(X{=}v \mid Y{=}k, Z{=}z) \cdot p(X{=}u \mid Y{=}0, Z{=}z)}$.

From the representations (3), (9) and (13) we conclude that the odds ratios $\mathrm{OR}_k(u,v \mid z)$ are common parameters of the three conditional distributions $\mathcal{L}(Y \mid X, Z)$, $\mathcal{L}(X, Z \mid Y)$ and $\mathcal{L}(X \mid Y, Z)$. This is important since the major sampling schemes in epidemiology are obtained by drawing samples from these conditional distributions: cohort studies, case-control studies and matched case-control studies.

3. Odds Ratio Models

The logistic regression model specifies the conditional probability $\pi_k(x,z)$ for status $k = 1, \dots, K$ through

(1) $\pi_k(x, z) = \exp\{\alpha_k(z) + x^T \beta_k\} \Big/ \sum_{l=0}^{K} \exp\{\alpha_l(z) + x^T \beta_l\}$, resp. $\mathrm{logit}_k\, \pi(x, z) = \alpha_k(z) + x^T \beta_k$,

with unknown parameter vectors $\beta_k \in \mathbb{R}^R$ and unknown functions $\alpha_k: \Omega_Z \to \mathbb{R}$ (where $\alpha_0 \equiv 0$ and $\beta_0 = 0$). Since the model (1) has no interaction between the risk factor $X$ and the matching variable $Z$, the corresponding log-odds ratio

(2) $\psi_k(u, v \mid z) = (u - v)^T \beta_k$

does not depend upon the value $z$ of the matching variable. Replacing the linear function $x^T \beta_k$ in (1) by an arbitrary (e.g. sufficiently smooth) function $h_k$ leads to the model

(3) $\mathrm{logit}_k\, \pi(x, z) = \alpha_k(z) + h_k(x)$

with log-odds ratios

(4) $\psi_k(u, v \mid z) = h_k(u) - h_k(v)$

independent of $z$. Our model approach will be based on (4), which may alternatively be specified by two conditions. The first is a matching condition


(MC) All odds ratio functions $\mathrm{OR}_k(\cdot\,, \cdot \mid z): \Omega_X \times \Omega_X \to (0, \infty)$ for $k = 1, \dots, K$ do not depend upon $z \in \Omega_Z$, i.e. there are functions $\mathrm{OR}_k(\cdot\,, \cdot)$ such that $\mathrm{OR}_k(u, v \mid z) = \mathrm{OR}_k(u, v)$ resp. $\psi_k(u, v \mid z) = \psi_k(u, v) := \log \mathrm{OR}_k(u, v)$ for all $k, u, v, z$.

And the second condition specifies the actual odds ratio model by restricting the structure of the functions $\mathrm{OR}_k$ resp. $\psi_k = \log \mathrm{OR}_k$ in (MC) through

(ORM) $\psi_k(u, v) = h_k(u) - h_k(v)$ for all $k = 1, \dots, K$ and $u, v \in \Omega_X$,

with arbitrary functions $h_k: \Omega_X \to \mathbb{R}$. Both conditions (MC) and (ORM) together are equivalent to (4).

The log-linear odds ratio model is given by linear functions

(LLM) $h_k(u) = u^T \beta_k$ (log-linear OR model),

with unknown parameters $\beta_1, \dots, \beta_K \in \mathbb{R}^R$, and in this case the model (ORM) reduces to the logistic regression model (2).
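As a small numerical illustration (with hypothetical intercept functions $\alpha_k$ and slopes $\beta_k$, chosen only for this sketch) that under the logistic model the log-odds ratio $\psi_k(u, v \mid z) = (u - v)^T \beta_k$ is free of $z$, as required by (MC):

```python
import numpy as np

def logit_pi(x, z, alpha, betas):
    """logit_k pi(x, z) = alpha_k(z) + x^T beta_k for k = 1..K."""
    return np.array([alpha(z, k) + x @ b for k, b in enumerate(betas, 1)])

# hypothetical intercept function alpha_k(z) and slopes beta_1, beta_2
alpha = lambda z, k: k * np.sin(z)
betas = [np.array([0.5, -1.0]), np.array([2.0, 0.3])]

u, v = np.array([1.0, 2.0]), np.array([0.0, 1.0])
for z in (0.0, 0.7, 3.0):
    psi = logit_pi(u, z, alpha, betas) - logit_pi(v, z, alpha, betas)
    # psi_k(u, v | z) = (u - v)^T beta_k, the same for every z
    assert np.allclose(psi, [(u - v) @ b for b in betas])
print("log-odds ratios do not depend on z")
```

The intercepts $\alpha_k(z)$ cancel in the difference of logits, which is exactly why the conditional analysis below can ignore them.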

4. Conditional Sampling Distribution and Likelihood

Before turning to sampling with matching in its general form, let us look at case-control studies with binary $Y$ and 1:M matching. Here sampling is typically divided into two steps. First, the cases are sampled, i.e. for all $i = 1, \dots, I$ the pairs $(X_{i11}, Z_i)$ of the risk factor and matching variable are independently drawn from the conditional distribution $\mathcal{L}(X, Z \mid Y{=}1)$. Second, for each case $i = 1, \dots, I$ a fixed number $M$ of controls is collected, each having the same value $z_i$ for the matching variable as the case, i.e. the risk factors $X_{i0m}$ are drawn independently from the conditional distribution $\mathcal{L}(X \mid Y{=}0, Z{=}z_i)$ for $m = 1, \dots, M$. Since the distribution $\mathcal{L}(Z \mid Y{=}1)$ of the matching variable among the cases contains no information about the odds ratios of interest, we may equally well pass among the cases to the conditional distribution given the observed matching values $z_i$, i.e. we may assume that the risk factors $X_{i11}$ among the cases are sampled from the conditional distribution $\mathcal{L}(X \mid Y{=}1, Z{=}z_i)$ with given values $z_i$ of the matching variable.

For general $Y$ with arbitrary $K \ge 1$ and $M_0 : M_1 : \dots : M_K$ matching we now assume that the risk factor is sampled conditional upon the status $Y$ and the matching variable $Z$. More precisely, for each matching group $i = 1, \dots, I$, specified by a fixed value $z_i \in \Omega_Z$, and each status $k = 0, \dots, K$ an independent sample $X_{ikm}$ for $m = 1, \dots, M_k$ is drawn from the conditional distribution $\mathcal{L}(X \mid Y{=}k, Z{=}z_i)$, i.e. from the corresponding subpopulation $\{Y{=}k, Z{=}z_i\}$:

(1) $\mathcal{L}(X_{ikm}) = \mathcal{L}(X \mid Y{=}k, Z{=}z_i)$ for all $i, k, m$.

Since all $X_{ikm}$ are independent, the joint distribution of the sample is given by the product probability measure

(2) $\mathcal{L}((X_{ikm})_{ikm}) = \bigotimes_{i,k,m} \mathcal{L}(X \mid Y{=}k, Z{=}z_i)$.

From 2 (13) we conclude that the odds ratios of interest

(3) $\mathrm{OR}_k(u, v \mid z_i)$, $\quad u, v \in \Omega_X$,

for $k = 1, \dots, K$ are parameters of the sampling distribution (2).

Now we want to pass to a conditional distribution of the sample which depends only on the odds ratios of interest. To begin with we look at a single matching group $(X_{km}, z)$ (the index $i$ is dropped for convenience) satisfying

(4) $\mathcal{L}(X_{km}) = \mathcal{L}(X \mid Y{=}k, Z{=}z)$ for $k = 0, \dots, K$ and $m = 1, \dots, M_k$.

The basic idea is to condition on the observed distribution of the risk factor $X$, resp. on the empirical distribution of the observed values $(x_{km})$, viewed as a vector with $M := M_0 + \dots + M_K$ components in $\Omega_X$. The empirical distribution is determined by the order statistic $(u_{km}) = \mathrm{ord}(x_{km}) \in \Omega_X^M$ of $(x_{km})$ (cf. appendix B1; the $M$ components are indexed here by a pair $km$ of indices). More precisely, if

(5) $u_{(1)} < u_{(2)} < \dots < u_{(J)}$

represent the $J \ge 1$ distinct components of $(u_{km})$, and hence of $(x_{km})$, then the empirical distribution of $(x_{km})$ is given by the frequencies

(6) $n_j(u) = \#\{(k,m) \mid x_{km} = u_{(j)}\}$

of $u_{(j)}$ among $(u_{km})$ resp. $(x_{km})$ for $j = 1, \dots, J$. Hence conditioning on the observed distribution of $X$ is the same as conditioning upon the observed order statistic.

Now consider the random $J \times M$ table $A = (A_{jkm})$ given by the indicators (cf. appendix B1)

(7) $A_{jkm} = I\{X_{km} = u_{(j)}\}$


and let $a = (a_{jkm})$ be the corresponding observed table (Table 1). Then by appendix B3 the conditional distribution of $A$ given $\mathrm{ord}(X_{km}) = u$ is hypergeometric

(8) $\mathcal{L}(A \mid \mathrm{ord}(X_{km}) = u) = H_{J \times M}(p(u) \mid n(u), 1)$

with row sums $n(u) = (n_j(u))$, constant column sums $1 = (1)$ and noncentrality parameters

(9) $p_{jkm}(u) = p(X{=}u_{(j)} \mid Y{=}k, Z{=}z)$ for all $j, k, m$.

The corresponding family $\theta(u) = \mathrm{OR}(p(u))$ of odds ratios is given by

(10) $\theta_{jkm}(u) = \dfrac{p(X{=}u_{(j)} \mid Y{=}k, Z{=}z) \cdot p(X{=}u_{(1)} \mid Y{=}0, Z{=}z)}{p(X{=}u_{(1)} \mid Y{=}k, Z{=}z) \cdot p(X{=}u_{(j)} \mid Y{=}0, Z{=}z)} = \mathrm{OR}_k(u_{(j)}, u_{(1)} \mid z)$, cf. 2 (13),

and depends only on the odds ratios (3) of interest. Since the hypergeometric distribution depends on $p(u)$ only through $\theta(u)$, the conditional distribution $\mathcal{L}((X_{km}) \mid \mathrm{ord}(X_{km}) = (u_{km}))$ is obtained as (cf. appendix B3 (8))

(11) $P((X_{km}) = (x_{km}) \mid \mathrm{ord}(X_{km}) = (u_{km})) = P(A = a \mid \mathrm{ord}(X_{km}) = (u_{km})) = h(a \mid \theta(u), n, 1)$

for $(x_{km}) \in \mathrm{ord}^{-1}\{(u_{km})\}$.

Table 1: The $J \times M$ table $a = (a_{jkm})$, using double indices $km$ for the columns ($k = 0, \dots, K$; $m = 1, \dots, M_k$). The rows correspond to the distinct values $u_{(1)}, \dots, u_{(J)}$; every column sum equals 1, the row sums are $n_1, \dots, n_J$, and the total is $M = n_+$.


For all $m = 1, \dots, M_k$ the variables $X_{km}$ are independent and identically distributed according to $\mathcal{L}(X \mid Y{=}k, Z{=}z)$, and hence (by appendix B4) the hypergeometric probabilities depend on the table $a$ only through the collapsed table $a^+ \in \mathbb{N}_0^{J \times (1+K)}$ (Table 2) with

(12) $a^+_{jk} = a_{jk+} = \#\{m \mid x_{km} = u_{(j)}\}$ for all $j$ and $k$.

Table 2: The collapsed $J \times (1+K)$ table $a^+ = (a^+_{jk})$.
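Collapsing the observed values of one matching group into the table $a^+$ of (12), together with the distinct values $u_{(j)}$ and the row sums $n_j$, can be sketched as follows (function names and data are ours, for illustration only):

```python
from collections import Counter

def collapse(x_by_status):
    """Collapsed table a+_jk = #{m | x_km = u_(j)} of (12), where
    x_by_status[k] lists the M_k observed risk-factor values for
    status k = 0, ..., K. Also returns the distinct values u_(j)
    and the row sums n_j."""
    u = sorted({v for xs in x_by_status for v in xs})   # u_(1) < ... < u_(J)
    counts = [Counter(xs) for xs in x_by_status]
    a_plus = [[c[uj] for c in counts] for uj in u]      # J x (1+K) table
    n = [sum(row) for row in a_plus]                    # row sums n_j
    return u, a_plus, n

# hypothetical group, K = 1: M_0 = 3 control values, M_1 = 2 case values
u, a_plus, n = collapse([[1.0, 2.0, 1.0], [2.0, 3.0]])
print(u)       # [1.0, 2.0, 3.0]
print(a_plus)  # [[2, 0], [1, 1], [0, 1]]
print(n)       # [2, 2, 1]
```

The column sums of `a_plus` are $(M_0, M_1) = (3, 2)$, as required for the hypergeometric conditioning in (13) below.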

By appendix B4 (6), conditioning the collapsed table $A^+$ on $\mathrm{ord}(X_{km}) = (u_{km})$ yields a hypergeometric distribution

(13) $\mathcal{L}(A^+ \mid \mathrm{ord}(X_{km}) = (u_{km})) = H_{J \times (1+K)}(\rho(u) \mid n, M)$

with column sums $M = (M_k)$ and noncentrality parameters

(14) $\rho_{jk}(u) = p(X{=}u_{(j)} \mid Y{=}k, Z{=}z)$ for all $j$ and $k$.

Again, this hypergeometric distribution depends on $\rho(u)$ only through the odds ratios $\theta(u) = \mathrm{OR}(\rho(u))$ given by

(15) $\theta_{jk}(u) = \mathrm{OR}_k(u_{(j)}, u_{(1)} \mid z)$ for all $j$ and $k$.

In the case $J = 1$ the observed values $u_{km} \in \Omega_X$ coincide for all $k$ and all $m$, and hence the conditional distribution of $(X_{km})$ resp. $A$ given $\mathrm{ord}(X_{km}) = (u_{km})$ has a unit mass and thus contains no information about the odds ratios (3), since $\theta_{jkm}(u) = 1$.


After the principal investigations for a single matching group we now return to all $I$ matching groups. All terms defined for a single group now receive an additional index $i = 1, \dots, I$ referring to the corresponding matching group, i.e. $(u_{ikm}) = \mathrm{ord}(x_{ikm}) \in \Omega_X^M$ denotes the observed order statistic of $(x_{ikm})_{km} \in \Omega_X^M$ with $J_i$ distinct components

(16) $u_{i(1)} < u_{i(2)} < \dots < u_{i(J_i)}$.

Furthermore $A_i = (A_{ijkm})$ resp. $A_i^+ = (A^+_{ijk})$ are $J_i \times M$ resp. $J_i \times (1+K)$ tables

(17) $A_{ijkm} = I\{X_{ikm} = u_{i(j)}\}$ resp.

(18) $A^+_{ijk} = A_{ijk+} = \#\{m \mid X_{ikm} = u_{i(j)}\}$,

with observed tables $a_i = (a_{ijkm})$ resp. $a_i^+ = (a^+_{ijk})$ having row sums

(19) $n_{ij} = \#\{(k,m) \mid x_{ikm} = u_{i(j)}\}$ for $j = 1, \dots, J_i$.

The likelihood for the observed sample $(x_{ikm})$ is given by

(20) $L = \prod_{i} p_i(u_i)^{a_i^+} = \prod_{i} \prod_{j,k} p_{ijk}(u_i)^{a^+_{ijk}}$,

using the power notation $b^a$ from appendix A.1 (7) and

(21) $p_{ijk}(u_i) = p(X{=}u_{i(j)} \mid Y{=}k, Z{=}z_i)$ for all $i, j, k$.

Since the likelihood depends on the sample only through the collapsed tables $a_i^+$, the family of tables $(A_i^+)_i$ is a sufficient statistic. Conditioning all tables $A_i^+$ on the observed order statistic of $(X_{ikm})_{km}$ yields the corresponding conditional likelihood, which (by (13)) turns out as a product of hypergeometric probabilities

(22) $L_c = \prod_{i} h(a_i^+ \mid \theta_i(u_i), n_i, M)$

with noncentrality parameters

(23) $\theta_{ijk}(u_i) = \mathrm{OR}_k(u_{i(j)}, u_{i(1)} \mid z_i)$ for all $i, j$ and $k$.

Denoting the set of all $J_i \times (1+K)$ tables with row sums $n_i = (n_{ij})$ and column sums $M = (M_k)$ by $\mathcal{T}_i = \mathcal{T}(n_i, M)$, the hypergeometric probabilities may be written as (cf. appendix A1)

(24) $h(a_i^+ \mid \theta_i, n_i, M) = \dfrac{\theta_i^{\,a_i^+}\, c(a_i^+)}{\sum_{y \in \mathcal{T}_i} \theta_i^{\,y}\, c(y)}$.


As already mentioned, a matching group with constant values for the risk factor, i.e. $J_i = 1$, does not contribute to the conditional likelihood because $A_i^+$ is concentrated on a single value. For this reason we exclude these matching groups when discussing the conditional likelihood, which amounts to the assumption $J_i > 1$ for all $i$.

Since the conditional likelihood becomes more complex if either the number $K$ of categories or the column sums $M_k$ increase, we only look at the binary case $K = 1$ in some detail.

Following usual practice we have assumed that the sample sizes $M_k$ are the same in all matching groups. However, the same arguments apply for group-specific sizes $M_{ik}$ and lead to the conditional likelihood (22) with $M_i = (M_{ik})$ instead of $M$.

5. Case-Control Studies With 1:M Matching

Let us now assume that $Y \in \{0,1\}$ is an indicator (e.g. for a disease) and look at case-control studies with 1:M matching, which are typically used in epidemiology if the disease is rare. The index $k = 1$ for the status will now be omitted in notations like $\pi_k(x, z)$, $\mathrm{OR}_k(u,v \mid z)$ etc.

For each matching group $i = 1, \dots, I$ with given value $z_i$ of the matching variable we now have only one risk factor $X_{i11}$ for the corresponding case and $M$ risk factors $X_{i01}, \dots, X_{i0M}$ drawn as controls. Using the notation of section 4 we have $M_1 = 1$, $M_0 = M$, and the risk variables $X_{i11}, X_{i0m}$ are independently distributed for all $m = 1, \dots, M$ and all $i = 1, \dots, I$ such that

(1) $\mathcal{L}(X_{ikm}) = \mathcal{L}(X \mid Y{=}k, Z{=}z_i)$ for all $i, k, m$.

The observed values $(x_{ikm}) = (x_{i01}, \dots, x_{i0M}, x_{i11})$ for matching group $i$ can equivalently be described in terms of their order statistic $(u_{ikm}) = \mathrm{ord}(x_{ikm})$ with $J_i$ distinct components


(2) $u_{i(1)} < u_{i(2)} < \dots < u_{i(J_i)}$

and the collapsed $J_i \times 2$ contingency table $a_i^+ = (a^+_{ijk})$ (cf. Table 3) with entries

(3) $a^+_{ijk} = \#\{m \mid x_{ikm} = u_{i(j)}\}$.

The $j$th row sum $n_{ij}$ in this table is the frequency of $u_{i(j)}$ in $(x_{ikm})$:

(4) $a^+_{ij+} = n_{ij} = \#\{(k,m) \mid x_{ikm} = u_{i(j)}\}$.

Table 3: The $J_i \times 2$ table $a_i^+ = (a^+_{ijk})$ for a matching group $i$, with rows $u_{i(1)}, \dots, u_{i(J_i)}$, columns for control ($Y{=}0$) and case ($Y{=}1$), row sums $n_{i1}, \dots, n_{iJ_i}$, column sums $(M, 1)$ and total $M + 1 = n_{i+}$.

Given the row and column sums, the table $a_i^+$ is uniquely determined by its second column, which will now be abbreviated by $r_i = (r_{ij}) \in \{0,1\}^{J_i}$, i.e.

(5) $r_{ij} = a^+_{ij1} = I\{x_{i11} = u_{i(j)}\}$ for $j = 1, \dots, J_i$,

which indicates which component of $(u_{i(j)})$ belongs to the case $x_{i11}$. If $R_i = (R_{ij}) \in \{0,1\}^{J_i}$ denotes the corresponding random vector with components

(6) $R_{ij} = A^+_{ij1} = I\{X_{i11} = u_{i(j)}\}$ for $j = 1, \dots, J_i$,

then by 4 (13) and appendix A3 the conditional distribution of $R_i$ given $\mathrm{ord}(X_{ikm}) = (u_{ikm})$ is multinomial

(7) $\mathcal{L}(R_i \mid \mathrm{ord}(X_{ikm}) = (u_{ikm})) = M(1, \pi(u_i \mid z_i))$.

The probability vector $\pi(u_i \mid z_i)$ is given by its logit transform, cf. 4 (15):


(8) $\mathrm{logit}_j\, \pi(u_i \mid z_i) = \log \pi_j(u_i \mid z_i) - \log \pi_1(u_i \mid z_i) = \log \mathrm{OR}(u_{i(j)}, u_{i(1)} \mid z_i) = \psi(u_{i(j)}, u_{i(1)} \mid z_i)$.

The expectation of $R_i$ is $\pi(u_i \mid z_i)$ and may be written as

(9) $\log E(R_{ij}) = \log \pi_j(u_i \mid z_i) = \alpha_i + \psi(u_{i(j)}, u_{i(1)} \mid z_i)$

with

(10) $\alpha_i = \log E(R_{i1}) = \log \pi_1(u_i \mid z_i)$.

The conditional likelihood 4 (22) simplifies in the present situation to a product of multinomial probabilities

(11) $L_c = \prod_{i=1}^{I} \prod_{j=1}^{J_i} \pi_j(u_i \mid z_i)^{r_{ij}}$.

Note that the numbers $J_i$ of classes for the multinomials may vary with $i = 1, \dots, I$. If however at least one component of the risk factor is continuously distributed, then all $1+M$ components of $(X_{ikm})$ are distinct (almost surely) and hence $J_i = 1+M$ for all $i$. Then $R = (R_{ij})$ is an $I \times J$ table with $J = 1+M$ whose expected table is given by (9).

Using (8) and (9) the log-linear odds ratio model

(12) $\psi(u, v \mid z) = (u - v)^T \beta$

reduces to a multivariate logistic regression model

(13) $\mathrm{logit}_j\, \pi(u_i \mid z_i) = (u_{i(j)} - u_{i(1)})^T \beta$

or equivalently to a log-linear model for the expected table $E(R)$:

(14) $\log E(R_{ij}) = \alpha_i + (u_{i(j)} - u_{i(1)})^T \beta$.
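In the common situation where all $1+M$ values in each group are distinct ($J_i = 1+M$, $n_{ij} = 1$), the conditional likelihood under model (13) takes the familiar conditional-logistic form: the probability that the case carries its observed covariate value is $\exp(x^T\beta)$ divided by the sum of $\exp(\cdot)$ over the whole group. A sketch with hypothetical data (not from the paper); the function name is ours:

```python
import numpy as np

def cond_log_lik(beta, groups):
    """Conditional log-likelihood for 1:M matched data under (13),
    assuming all 1+M values in each group are distinct (J_i = 1+M).
    Each group is (x_case, X_controls), X_controls of shape (M, R)."""
    ll = 0.0
    for x_case, x_controls in groups:
        eta = np.vstack([x_case, x_controls]) @ beta  # 1+M linear predictors
        m = eta.max()                                 # stabilized log-sum-exp
        # P(case carries x_case | group) = e^{eta_0} / sum_l e^{eta_l}
        ll += eta[0] - m - np.log(np.exp(eta - m).sum())
    return ll

# hypothetical data: I = 2 groups, M = 2 controls, R = 1 risk factor
groups = [(np.array([1.0]), np.array([[0.0], [0.5]])),
          (np.array([0.2]), np.array([[0.1], [0.9]]))]
print(cond_log_lik(np.array([0.8]), groups))
```

Note that the nuisance intercepts $\alpha_i$ have cancelled, which is the point of the conditional analysis; maximizing this function over $\beta$ yields the conditional maximum likelihood estimate discussed next.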

The (conditional) maximum likelihood estimate for $\beta$ can be obtained with standard software by maximizing the conditional likelihood $L_c$. The computation of the estimate $\hat\beta$ may be performed as if all entries $R_{ij}$ were independent and Poisson-distributed (cf. e.g. Haberman 1974) with expectations given by the log-linear model (14). In this case the additional nuisance parameters $\alpha_i$ (for each matching group) need to be estimated too. The estimated (asymptotic) covariance matrix $\hat\Sigma_\beta$ of the estimate $\hat\beta$ may also be obtained assuming a Poisson distribution, at least if the supports $\Omega_X$ and $\Omega_Z$ of the risk factor and matching variable are both finite (cf. Haberman 1974).


6. Case-Control Studies with 1:1 Matching

For case-control studies with 1:1 matching, i.e. $M = 1$, further simplifications of the results in section 5 are available. Now each matching group consists of a pair $(X_{i0}, X_{i1}) := (X_{i01}, X_{i11})$ of independent random vectors

(1) $\mathcal{L}(X_{ik}) = \mathcal{L}(X \mid Y{=}k, Z{=}z_i)$ for $k = 0, 1$.

The order statistic of the observation $(x_{i0}, x_{i1})$ is given by

(2) $u_{i1} = \min(x_{i0}, x_{i1})$, $\quad u_{i2} = \max(x_{i0}, x_{i1})$.

If all "uninformative pairs" with $x_{i0} = x_{i1}$ are ignored, we have

(3) $u_{i(1)} = u_{i1} < u_{i2} = u_{i(2)}$

and hence $J_i = 2$ and $n_{i1} = n_{i2} = 1$. The pair $r_i = (r_{i1}, r_{i2})$ is determined by the rank indicator

(4) $r_i := I\{x_{i0} \le x_{i1}\}$

and may be represented as a $2 \times 2$ table (cf. Table 4) in which all row and column totals equal 1.

The conditional distribution of the rank indicator

(5) $R_i := I\{X_{i0} < X_{i1}\}$

is binomial

(6) $\mathcal{L}(R_i \mid \mathrm{ord}(X_{i0}, X_{i1}) = (u_{i1}, u_{i2})) = B(1, \pi(u_i \mid z_i))$

with probability given by

(7) $\mathrm{logit}\, \pi(u_i \mid z_i) = \psi(u_{i2}, u_{i1} \mid z_i)$.

The two possible values of $r_i$ correspond to the following $2 \times 2$ tables.

Table 4: The $2 \times 2$ table for a matching pair $(x_{i0}, x_{i1})$, with rows $u_{i1} = \min(x_{i0}, x_{i1})$ and $u_{i2} = \max(x_{i0}, x_{i1})$ of the risk factor and columns control ($Y{=}0$) and case ($Y{=}1$); the entries are $r_i,\ 1-r_i$ in the first row and $1-r_i,\ r_i$ in the second, and all row and column totals equal 1.


The log-linear odds ratio model 5 (12) here reduces to the (univariate) logistic regression model

(8) $\mathrm{logit}\, \pi(u_i \mid z_i) = (u_{i2} - u_{i1})^T \beta$.

Hence standard software for logistic regression is available for estimating and testing the parameter vector $\beta \in \mathbb{R}^R$.

A detailed analysis of real and simulated 1:1 matched case-control data, using nonparametric as well as logistic regression models, may be found in van der Linde and Osius (2001).
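The 1:1 conditional likelihood built from (6)-(8) is a product of independent Bernoulli terms with logit $(u_{i2} - u_{i1})^T \beta$; a sketch with hypothetical pairs (function name and data are ours):

```python
import numpy as np

def pair_log_lik(beta, pairs):
    """1:1 conditional log-likelihood: independent Bernoulli terms with
    logit pi(u_i | z_i) = (u_i2 - u_i1)^T beta, cf. (7) and (8).
    Each pair is (u1, u2, r) with u1 < u2 and rank indicator r."""
    ll = 0.0
    for u1, u2, r in pairs:
        eta = (u2 - u1) @ beta
        ll += r * eta - np.log1p(np.exp(eta))  # log B(1, expit(eta)) term
    return ll

# hypothetical discordant pairs, R = 1 risk factor
pairs = [(np.array([0.0]), np.array([1.0]), 1),
         (np.array([0.5]), np.array([2.0]), 0)]
print(pair_log_lik(np.array([0.4]), pairs))
```

This is exactly the likelihood of a logistic regression without intercept on the covariate differences, which is why the standard software mentioned above applies.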


Appendix A Multivariate Noncentral Hypergeometric Distributions

This section summarizes the relevant features (needed in appendix B) of the multivariate noncentral hypergeometric distribution, which will be defined as a conditional (multivariate) Poisson distribution.

1. Definition

Let $Y = (Y_{jk})$ be a $J \times K$ contingency table where all components $Y_{jk}$ are independent and Poisson-distributed with positive expectations $\mu_{jk} = E(Y_{jk})$, so that $Y$ has a (multivariate) product Poisson distribution

(1) $\mathcal{L}(Y) = \bigotimes_{j,k} \mathrm{Poisson}(\mu_{jk})$.

For a given vector $n = (n_j) \in \mathbb{N}^J$ of row sums we denote the set of all $J \times K$ tables having these row sums by

(2) $\mathcal{R}(n) = \{\, y = (y_{jk}) \in \mathbb{N}_0^{J \times K} \mid y_{j+} = n_j \text{ for all } j = 1, \dots, J \,\}$,

where $\mathbb{N}_0 = \{0\} \cup \mathbb{N}$ is the set of nonnegative integers and the index $+$ indicates summation over the corresponding index. Similarly, for given column sums $m = (m_k) \in \mathbb{N}^K$ let

(3) $\mathcal{C}(m) = \{\, y = (y_{jk}) \in \mathbb{N}_0^{J \times K} \mid y_{+k} = m_k \text{ for all } k = 1, \dots, K \,\}$

be the set of all $J \times K$ tables with these column sums. Assuming $n_+ = m_+$, the intersection

(4) $\mathcal{T} = \mathcal{T}(n,m) = \mathcal{R}(n) \cap \mathcal{C}(m)$

contains all $J \times K$ tables with these margin totals (Table A.1).

The conditional distribution of $Y$ under the condition $Y \in \mathcal{T}(n,m)$ is a multivariate noncentral hypergeometric distribution and will be denoted by

(5) $H_{J \times K}(\mu \mid n, m) = \mathcal{L}(Y \mid Y \in \mathcal{T}(n,m))$.

The hypergeometric probabilities are

(6) $h(y \mid \mu, n, m) = P(Y = y) \big/ P(Y \in \mathcal{T}(n,m))$ for $y \in \mathcal{T}$.


Table A.1: The general $J \times K$ table $y = (y_{jk}) \in \mathcal{T}(n,m)$ with row sums $n = (n_j)$ and column sums $m = (m_k)$.

Using the notation

(7) $\mu^y = \prod_{j,k} \mu_{jk}^{\,y_{jk}}$

we get

(8) $P(Y = y) = \prod_{j,k} \mu_{jk}^{\,y_{jk}}\, e^{-\mu_{jk}} / y_{jk}! = \mu^y\, e^{-\mu_{++}}\, c(y)$ with $c(y) = \prod_{j,k} 1/y_{jk}!$,

and hence

(9) $h(y \mid \mu, n, m) = \dfrac{\mu^y\, c(y)}{\sum_{z \in \mathcal{T}} \mu^z\, c(z)}$ for $y \in \mathcal{T}$.

Note that for constant $n = (1)$ or $m = (1)$ we get $y_{jk} \in \{0, 1\}$ and hence $c(y) = 1$ for all $y \in \mathcal{T}$.

The hypergeometric probabilities depend on the vector of expectations $\mu = E(Y)$ only through its family of odds ratios $\theta = \mathrm{OR}(\mu)$, defined by

(10) $\theta_{jk} = \dfrac{\mu_{11} \cdot \mu_{jk}}{\mu_{1k} \cdot \mu_{j1}}$ for all $j, k$.

Indeed, from

$\mu_{jk} = a_j\, b_k\, \theta_{jk}$ with $a_j = \mu_{j1}$, $b_k = \mu_{1k}/\mu_{11}$,


we conclude for $y \in \mathcal{X}$

$\mu^y = \prod_{j,k} (a_j b_k \theta_{jk})^{y_{jk}} = \theta^y \cdot \prod_j a_j^{y_{j+}} \cdot \prod_k b_k^{y_{+k}} = \theta^y \cdot \prod_j a_j^{n_j} \cdot \prod_k b_k^{m_k}$

and thus

(11) $h(y \mid \mu, n, m) = \theta^y\, c(y) \Big/ \sum_{z \in \mathcal{X}} \theta^z\, c(z) = h(y \mid \theta, n, m)$ for $y \in \mathcal{X}$.

Hence the hypergeometric distribution depends on $\mu$ only through $\theta$:

(12) $H_{J\times K}(\mu \mid n,m) = H_{J\times K}(\theta \mid n,m)$,

and with no loss of generality we may assume that $\theta = \mu$ holds, i.e. $\mu_{1k} = \mu_{j1} = 1$ for all $j$ and $k$. The parameter $\theta$ is called the noncentrality of $H_{J\times K}(\theta \mid n,m)$ and is already determined by its $(J-1)\times(K-1)$ subtable $(\theta_{jk})_{j,k>1}$. If $\theta_{jk} = 1$ holds for all $j$ and $k$ we get a central hypergeometric distribution.
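As a numerical self-check (our own sketch, not part of the paper), the probabilities and their invariance under the odds-ratio reduction (10)-(12) can be verified by brute-force enumeration of all tables with given margins; all function names below (`tables`, `weight`, `h`, `odds_ratios`) are ours:

```python
from itertools import product
from math import factorial

def compositions(total, parts):
    """All tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def tables(n, m):
    """The set X(n, m): all JxK tables with row sums n and column sums m."""
    for rows in product(*(compositions(nj, len(m)) for nj in n)):
        if all(sum(r[k] for r in rows) == m[k] for k in range(len(m))):
            yield rows

def weight(y, mu):
    """mu^y * c(y) = prod_{j,k} mu_jk^y_jk / y_jk!, the numerator of (9)."""
    w = 1.0
    for row, mrow in zip(y, mu):
        for yjk, mjk in zip(row, mrow):
            w *= mjk ** yjk / factorial(yjk)
    return w

def h(y, mu, n, m):
    """Noncentral hypergeometric probability h(y | mu, n, m) of (9)."""
    return weight(y, mu) / sum(weight(z, mu) for z in tables(n, m))

def odds_ratios(mu):
    """The odds ratio family theta = OR(mu) of (10)."""
    return [[mu[0][0] * muj[k] / (mu[0][k] * muj[0]) for k in range(len(muj))]
            for muj in mu]

n, m = (3, 2), (2, 3)
mu = [[1.0, 2.0], [3.0, 4.0]]
theta = odds_ratios(mu)
# (11)-(12): the probabilities depend on mu only through theta
assert all(abs(h(y, mu, n, m) - h(y, theta, n, m)) < 1e-12 for y in tables(n, m))
# central case theta = 1: ordinary hypergeometric, P(y_11 = 2) = C(3,2)C(2,0)/C(5,2)
assert abs(h(((2, 1), (0, 2)), [[1, 1], [1, 1]], n, m) - 0.3) < 1e-12
```

The enumeration is exponential in the table size and is only meant to illustrate the formulas on tiny examples.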

For the transpose $Y^T$ of $Y$ we get

(13) $\mathcal{L}(Y^T \mid Y^T \in \mathcal{X}(m,n)) = H_{K\times J}(\theta^T \mid m,n)$.

Hence any result on hypergeometric distributions entails a "dual" result for the transposed table.

The hypergeometric distribution may also be derived as a conditional product multinomial distribution if the conditioning for $Y$ is broken up into two steps: first on the column sums and then on the row sums. Under the condition $Y \in \mathcal{Z}(m)$ the columns of $Y$ are independent with multinomial distributions, i.e.

(14) $\mathcal{L}(Y \mid Y \in \mathcal{Z}(m)) = \bigotimes_{k=1}^{K} M_J(m_k, \pi_{\cdot k})$

with probability vectors $\pi_{\cdot k} = (\pi_{jk}) \in (0,1)^J$ given by

(15) $\pi_{jk} = \mu_{jk} / \mu_{+k}$ for all $j$, $k$.

Suppose $Z$ is a $J\times K$ contingency table with product multinomial distribution

(16) $\mathcal{L}(Z) = \bigotimes_{k=1}^{K} M_J(m_k, \pi_{\cdot k})$;

then the distribution of $Z$ under the condition $Z \in \mathcal{Y}(n)$ is hypergeometric:

(17) $\mathcal{L}(Z \mid Z \in \mathcal{Y}(n)) = H_{J\times K}(\theta \mid n,m)$, since $\theta = OR(\mu) = OR(\pi)$.


Using (13) we get the corresponding result if we condition $Y$ on the row sums first.

Further properties of hypergeometric distributions can be found in Haberman (1974, Chapter 1) within the more general framework of conditional Poisson distributions.

2. Collapsing Over Groups of Columns

For a $J\times L$ contingency table $Y$ with hypergeometric distribution

(1) $\mathcal{L}(Y) = H_{J\times L}(\mu \mid n,m)$,

summation of columns with the same noncentrality again yields a hypergeometrically distributed table. More formally, we consider a decomposition of the $L$ columns into $K \ge 1$ groups. Let each group $k = 1, \ldots, K$ contain (say) $L_k \ge 1$ columns, hence $L = L_+$. For notational convenience we replace the column index $l = 1, \ldots, L$ by a pair $(k, l)$ with $k = 1, \ldots, K$ and $l = 1, \ldots, L_k$, thus writing the $J\times L$ table as $Y = (Y_{jkl})$.

We assume that the columns of the noncentrality table $\mu = (\mu_{jkl})$ are constant within each group $k$ and put

(2) $\bar{\mu}_{jk} = \mu_{jkl}$ for all $j$, $k$, $l$.

Collapsing over groups, i.e. summing within all groups, may be described by the matrix operator

(3) $\mathbb{R}^{J\times L} \to \mathbb{R}^{J\times K}$, $\quad y = (y_{jkl}) \mapsto y^+ = (y_{jk+})$,

and $y^+$ will be called the collapsed table.

We now show that the collapsed $J\times K$ contingency table $Y^+$ has a hypergeometric distribution

(4) $\mathcal{L}(Y^+) = H_{J\times K}(\bar{\mu} \mid n, m^+)$,

where $m^+$ is the collapsed vector of column sums

(5) $m^+ = (m_{k+})$.

To prove (4) we view the distribution of $Y$ according to 1 (17) as a conditional product multinomial distribution

(6) $\mathcal{L}(Y) = H_{J\times L}(\pi \mid n,m) = \mathcal{L}(Z \mid Z \in \mathcal{Y})$ with

(7) $\mathcal{L}(Z) = \bigotimes_{k,l} M_J(m_{kl}, \pi_{\cdot kl})$

and

(8) $\pi_{jkl} = \mu_{jkl} / \mu_{+kl}$ for all $j$, $k$, $l$.

To establish (4) it suffices to show

(9) $\mathcal{L}(Z^+ \mid Z \in \mathcal{Y}) = H_{J\times K}(\bar{\pi} \mid n, m^+)$.

By (2) the probability vectors $\pi_{\cdot kl}$ coincide within each group $k$:

(10) $\bar{\pi}_{jk} := \pi_{jkl} = \bar{\mu}_{jk} / \bar{\mu}_{+k}$ for all $j$ and $l$.

Hence the $k$th column $Z^+_{\cdot k} = (Z_{1k+}, \ldots, Z_{Jk+})$ of the collapsed table $Z^+$ has a multinomial distribution

(11) $\mathcal{L}(Z^+_{\cdot k}) = M_J(m_{k+}, \bar{\pi}_{\cdot k})$.

Thus the collapsed table $Z^+$ has a product multinomial distribution

(12) $\mathcal{L}(Z^+) = \bigotimes_{k=1}^{K} M_J(m_{k+}, \bar{\pi}_{\cdot k})$,

and using 1 (17) yields

(13) $\mathcal{L}(Z^+ \mid Z^+ \in \mathcal{Y}) = H_{J\times K}(\bar{\pi} \mid n, m^+)$

(14) $\phantom{\mathcal{L}(Z^+ \mid Z^+ \in \mathcal{Y})} = H_{J\times K}(\bar{\mu} \mid n, m^+)$, since $OR(\bar{\pi}) = OR(\bar{\mu})$.

Because $Z$ and $Z^+$ have the same row sums we get

(15) $Z^+ \in \mathcal{Y} \iff Z \in \mathcal{Y}$,

and hence (9) follows from (13).
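The collapsing result (4) can be checked numerically on a tiny example (our own sketch; the helper names and the chosen noncentralities are ours): enumerate a $2\times 3$ table whose noncentrality is constant within the column groups $\{1,2\}$ and $\{3\}$, collapse, and compare with the hypergeometric distribution on the collapsed margins.

```python
from itertools import product
from math import factorial

def compositions(total, parts):
    """All tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def tables(n, m):
    """All tables with row sums n and column sums m."""
    for rows in product(*(compositions(nj, len(m)) for nj in n)):
        if all(sum(r[k] for r in rows) == m[k] for k in range(len(m))):
            yield rows

def weight(y, mu):
    w = 1.0
    for row, mrow in zip(y, mu):
        for yjk, mjk in zip(row, mrow):
            w *= mjk ** yjk / factorial(yjk)
    return w

def h(y, mu, n, m):
    """Hypergeometric probability, cf. A 1 (9)."""
    return weight(y, mu) / sum(weight(z, mu) for z in tables(n, m))

def collapse(y, groups):
    """The operator (3): sum each row over the column groups."""
    return tuple(tuple(sum(row[l] for l in g) for g in groups) for row in y)

groups = [(0, 1), (2,)]                    # L_1 = 2 and L_2 = 1 columns
mu = [[1.0, 1.0, 3.0], [2.0, 2.0, 1.0]]    # constant within each group, cf. (2)
mu_bar = [[1.0, 3.0], [2.0, 1.0]]          # the collapsed noncentrality
n, m, m_plus = (2, 2), (1, 1, 2), (2, 2)

dist = {}                                  # law of the collapsed table Y^+
for y in tables(n, m):
    yp = collapse(y, groups)
    dist[yp] = dist.get(yp, 0.0) + h(y, mu, n, m)

# (4): Y^+ is hypergeometric with noncentrality mu_bar and margins (n, m^+)
assert all(abs(pr - h(yp, mu_bar, n, m_plus)) < 1e-12 for yp, pr in dist.items())
```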

3. Special Case: The Multinomial Distribution

In the case of only $K = 2$ columns, a $J\times 2$ contingency table $Y = (Y_{jk})$ (Table A.2) with a hypergeometric distribution

(1) $\mathcal{L}(Y) = H_{J\times 2}(\theta \mid n,m)$

is already determined by its second column $Y_{\cdot 2} = (Y_{j2})$, because $Y \in \mathcal{Y}(n)$ implies

(2) $Y_{j1} = n_j - Y_{j2}$ for all $j$.


Table A.2: The general $J\times 2$ table $y = (y_{jk}) \in \mathcal{X}(n,m)$ with row sums $n = (n_j)$ and column sums $m = (m_1, m_2)$.

Hence the hypergeometric distribution is determined by the (marginal) distribution $\mathcal{L}(Y_{\cdot 2})$ of the second column. Let us additionally assume that all row totals are not smaller than the second column total,

(3) $n_j \ge m_2$ for all $j$,

which always holds, for example, in the case $m_2 = 1$. Then the second column has a multinomial distribution

(4) $\mathcal{L}(Y_{\cdot 2}) = M_J(m_2, \pi_{\cdot 2})$ provided (3) holds,

with probability vector $\pi_{\cdot 2} = (\pi_{j2}) \in (0,1)^J$ given by

(5) $\pi_{j2} = \theta_{j2} / \theta_{+2}$ for all $j$.

With no loss of generality we may assume $\theta = OR(\theta)$, i.e. $\theta_{1k} = \theta_{j1} = 1$ for all $j$ and $k$. Using the representation 1 (17),

(6) $\mathcal{L}(Z \mid Z \in \mathcal{Y}) = H_{J\times 2}(\theta \mid n,m)$ with

(7) $\mathcal{L}(Z) = \bigotimes_{k=1}^{2} M_J(m_k, \pi_{\cdot k})$,

the result (4) will follow from

(8) $\mathcal{L}(Z_{\cdot 2} \mid Z \in \mathcal{Y}) = M_J(m_2, \pi_{\cdot 2})$.

In view of (7) it suffices to prove

(9) $P\{Z_{\cdot 2} = a \mid Z \in \mathcal{Y}\} = P\{Z_{\cdot 2} = a\}$

for any $a \in \mathbb{N}_0^J$ with $a_+ = m_2$. First, for any $z \in \mathcal{Y}$ we get

(10) $P\{Z_{\cdot 2} = z_{\cdot 2} \mid Z \in \mathcal{Y}\}$
$\quad = P\{Z_{\cdot 2} = z_{\cdot 2} \mid Z_{\cdot 1} = n - Z_{\cdot 2}\}$
$\quad = P\{Z_{\cdot 2} = z_{\cdot 2} \mid Z_{\cdot 1} = n - z_{\cdot 2}\}$
$\quad = P\{Z_{\cdot 2} = z_{\cdot 2} \mid Z_{\cdot 1} = z_{\cdot 1}\}$, since $z \in \mathcal{Y}$
$\quad = P\{Z_{\cdot 2} = z_{\cdot 2}\}$,

where the last step exploits the independence of the columns $Z_{\cdot 1}$ and $Z_{\cdot 2}$.

To establish (9), define for any $a \in \mathbb{N}_0^J$ with $a_+ = m_2$ a $J\times 2$ table $z$ having the columns $z_{\cdot 2} = a$ and $z_{\cdot 1} = n - a$. From $a_j \le m_2$ and (3) we get

$z_{j1} = n_j - a_j \ge n_j - m_2 \ge 0$ for all $j$,

and hence $z \in \mathcal{Y}$. Thus (9) follows from (10).

The probability vector $\pi_{\cdot 2}$ is uniquely determined by its $J - 1$ logits

(11) $\mathrm{logit}_j(\pi_{\cdot 2}) = \log \pi_{j2} - \log \pi_{12}$ for $j = 2, \ldots, J$,

and hence we get the representation

(12) $\mathrm{logit}(\pi_{\cdot 2}) = \log \theta_{\cdot 2}$, i.e. $\mathrm{logit}_j(\pi_{\cdot 2}) = \log \theta_{j2}$ for $j = 2, \ldots, J$.

Thus $\pi_{\cdot 2}$ is uniquely determined by $\theta$ and vice versa.
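In the simplest case $m_2 = 1$ with unit row totals $n_j = 1$ (so that $c(y) = 1$, cf. 1 above), the representation (4)-(5) can be checked directly by enumeration. The sketch below is ours, with noncentralities made up for illustration:

```python
def unit_row_tables(J):
    """All Jx2 tables with row sums 1 and column sums (J-1, 1): the single
    unit of column 2 sits in row i."""
    return [tuple((0, 1) if j == i else (1, 0) for j in range(J))
            for i in range(J)]

def theta_power(y, theta):
    """theta^y; here c(y) = 1 since all entries are 0 or 1, cf. A 1 (9)."""
    w = 1.0
    for row, trow in zip(y, theta):
        for yjk, tjk in zip(row, trow):
            w *= tjk ** yjk
    return w

theta = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]  # normalized: theta_j1 = theta_1k = 1
ys = unit_row_tables(3)
total = sum(theta_power(y, theta) for y in ys)
# (4)-(5): the second column is one multinomial draw with
# probabilities pi_j2 = theta_j2 / theta_+2 = (1, 2, 3) / 6
for i, y in enumerate(ys):
    assert abs(theta_power(y, theta) / total - theta[i][1] / 6.0) < 1e-12
```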


Appendix B Rank and Order Statistics

The purpose of this section is to derive the joint distribution of the rank and the order statistic for a sample of random vectors with respect to the lexicographical ordering $\le$ in $\mathbb{R}^S$. For simplicity we put $R = \mathbb{R}^S$, let $\mathcal{A}$ denote the $\sigma$-algebra of Borel sets in $\mathbb{R}^S$ and "forget" the special euclidean nature of the totally ordered set $R$, except that all intervals with respect to the ordering should be measurable (i.e. members of $\mathcal{A}$). In fact, the following considerations hold for any such totally ordered structure $(R, \mathcal{A}, \le)$.

For fixed $K \in \mathbb{N}$ we start with the definition of the rank and order statistic for $K$-dimensional samples $x \in R^K$, which allows for tied observations. Then we derive the joint (and conditional) distribution of the rank and order statistic of a random vector $X$ having an arbitrary distribution (dominated by a product measure). Furthermore we look at typical situations with independent and partially identically distributed components of $X$.

1. Definitions

The rank statistic of a vector $x = (x_1, \ldots, x_K) \in R^K$ is a permutation $\mathrm{rk}(x) = \rho_x$ on $\{1, \ldots, K\}$ given by

(1) $\rho_x(k) = \#\{\, i \mid x_i < x_k \,\} + \#\{\, i \mid i \le k,\ x_i = x_k \,\}$ for $k = 1, \ldots, K$.

As usual $a < b$ stands for ($a \le b$ and $a \ne b$), which for a total ordering is equivalent to: not $b \le a$. The fundamental properties are

(2) $x_k < x_l \implies \rho_x(k) < \rho_x(l)$,

(3) $x_k = x_l \implies (\rho_x(k) < \rho_x(l) \iff k < l)$ for all $k$, $l$.

Let $\mathcal{S}_K$ denote the set of all permutations on $\{1, \ldots, K\}$. For any $\sigma \in \mathcal{S}_K$ the index permutation is a map $\Pi_\sigma : R^K \to R^K$ defined by

(4) $\Pi_\sigma(x) = \Pi_\sigma(x_1, \ldots, x_K) = (x_{\sigma(k)})_{k = 1, \ldots, K}$.

The order statistic of $x$ is defined as

(5) $\mathrm{ord}(x) = \Pi_{\sigma_x}(x)$,

where $\sigma_x = \rho_x^{-1}$ is the inverse rank statistic of $x$. The components of $\mathrm{ord}(x)$ are ordered:

(6) $x_{\sigma_x(1)} \le x_{\sigma_x(2)} \le \cdots \le x_{\sigma_x(K)}$, resp. $x_{[1]} \le x_{[2]} \le \cdots \le x_{[K]}$,


using the common notation $x_{[k]} := x_{\sigma_x(k)}$. If $x$ has no ties (i.e. all components of $x$ are distinct), then (5) uniquely defines the (inverse) rank statistic and may serve as a definition of the rank statistic. However, in the presence of ties the rank statistic $\rho_x$ is no longer determined by (5). Furthermore, by (3), the indices of tied components in $x$ have the same ordering in $\mathrm{ord}(x)$ as in $x$ itself.

Example: For $x = (2, 3, 1, 2)$ we get $\mathrm{ord}(x) = (1, 2, 2, 3)$ and $\rho_x(1) = 2$, $\rho_x(2) = 4$, $\rho_x(3) = 1$, $\rho_x(4) = 3$. Here we have $x_1 = x_4 = 2$, and $\rho_x$ preserves the ordering of the indices 1 and 4, i.e. $1 < 4$ implies $\rho_x(1) < \rho_x(4)$.
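The definitions (1)-(5) translate directly into code; the sketch below (our own illustration, with function names of our choosing) computes rank and order statistics in the presence of ties and reproduces the example above:

```python
def rk(x):
    """Rank statistic rho_x of (1); tied values get ranks in index order, cf. (3)."""
    return tuple(
        sum(xi < xk for xi in x) + sum(xi == xk for xi in x[:k + 1])
        for k, xk in enumerate(x)
    )

def ord_stat(x):
    """Order statistic ord(x) = Pi_{sigma_x}(x) with sigma_x = rk(x)^{-1}, cf. (5)."""
    sigma = [0] * len(x)
    for k, r in enumerate(rk(x)):
        sigma[r - 1] = k                  # sigma(r) = k, ranks are 1-based
    return tuple(x[k] for k in sigma)

def reconstruct(rho, u):
    """Pi_rho(u): recovers x from its rank and order statistic."""
    return tuple(u[r - 1] for r in rho)

x = (2, 3, 1, 2)
assert rk(x) == (2, 4, 1, 3)
assert ord_stat(x) == (1, 2, 2, 3)
assert reconstruct(rk(x), ord_stat(x)) == x
```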

Any $x \in R^K$ is uniquely determined by its rank statistic $\rho_x \in \mathcal{S}_K$ and its order statistic

(7) $\mathrm{ord}(x) \in R^K_{\le} := \{\, x \in R^K \mid x_1 \le x_2 \le \cdots \le x_K \,\}$,

since

(8) $x = \Pi_{\rho_x}(\mathrm{ord}(x))$.

Hence the mapping $(\mathrm{rk}, \mathrm{ord}) : R^K \to \mathcal{S}_K \times R^K_{\le}$ is injective (one-to-one) but not surjective (onto), because its range

(9) $\mathcal{B} := \{\, (\mathrm{rk}(x), \mathrm{ord}(x)) \mid x \in R^K \,\} = \{\, (\rho, u) \in \mathcal{S}_K \times R^K_{\le} \mid \text{for all } k, l \text{ with } u_k = u_l:\ k < l \iff \rho^{-1}(k) < \rho^{-1}(l) \,\}$

is not the whole space $\mathcal{S}_K \times R^K_{\le}$. In particular, if all components of $u$ coincide, we have $(\rho, u) \in \mathcal{B}$ if and only if $\rho$ is the identity. But if $u$ has no ties, then $(\rho, u) \in \mathcal{B}$ holds for any permutation $\rho$.

The inverse mapping $H : \mathcal{B} \to R^K$ of $(\mathrm{rk}, \mathrm{ord})$ is given by

(10) $H(\rho, u) = \Pi_\rho(u)$ for $(\rho, u) \in \mathcal{B}$.

For a fixed ordered vector $u = (u_k) \in R^K_{\le}$, any vector $x \in R^K$ with $\mathrm{ord}(x) = u$ will now be characterized using a binary contingency table. First let

(11) $u_{(1)} < u_{(2)} < \cdots < u_{(J)}$

denote the $J \ge 1$ distinct components of $u$, and

(12) $n_j = n_j(u) = \#\{\, k \mid u_k = u_{(j)} \,\}$ for $j = 1, \ldots, J$

the frequency of $u_{(j)}$ in $u$. For any $x \in R^K$ with $\mathrm{ord}(x) = u$ we define the $J\times K$ table $a(x) = (a_{jk}(x))$ of indicators

(13) $a_{jk}(x) = 1\{\, x_k = u_{(j)} \,\}$.


Now $\mathrm{ord}(x) = u$ implies for the row resp. column sums of $a(x)$

(14) $a_{j+}(x) = n_j$, $\quad a_{+k}(x) = 1$ for all $j$, $k$,

and hence $a(x)$ lies in the following set of contingency tables:

(15) $\mathcal{X} = \{\, (a_{jk}) \in \{0,1\}^{J\times K} \mid a_{j+} = n_j \text{ for all } j,\ a_{+k} = 1 \text{ for all } k \,\}$.

Thus the assignment $x \mapsto a(x)$ defines a mapping

$\mathrm{ord}^{-1}\{u\} = \{\, x \in R^K \mid \mathrm{ord}(x) = u \,\} \to \mathcal{X}$,

which is injective. Conversely, for any table $a \in \mathcal{X}$ a vector $x(a) = (x_k(a)) \in R^K$ is defined as follows. Since $a_{+k} = 1$, there is for each $k$ a unique $j$ such that $a_{jk} = 1$ (the other entries in the $k$th column are 0), and we set

(16) $x_k(a) = u_{(j)}$ for the unique $j$ with $a_{jk} = 1$,

so that

(17) $x_k(a) = u_{(j)} \iff a_{jk} = 1$.

The frequency of $u_{(j)}$ in $x(a)$ is $n_j$, because $a_{j+} = n_j$, and hence

(18) $\mathrm{ord}(x(a)) = u$.

By (17) the assignment $a \mapsto x(a)$ is inverse to $x \mapsto a(x)$, and thus both mappings are bijections. Hence any vector $x \in \mathrm{ord}^{-1}\{u\}$ may also be viewed as a $J\times K$ contingency table $a(x) \in \mathcal{X}$.
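The bijection between vectors with a given order statistic and binary tables, cf. (17), is easy to make concrete; the following sketch (our own, with hypothetical data) builds the indicator table and inverts it:

```python
def indicator_table(x, u):
    """The JxK indicator table a(x): a_jk = 1 iff x_k equals u_(j)."""
    distinct = sorted(set(u))                      # u_(1) < ... < u_(J)
    return tuple(tuple(int(xk == uj) for xk in x) for uj in distinct)

def vector_from_table(a, u):
    """The inverse x(a) of (17): x_k is the unique u_(j) with a_jk = 1."""
    distinct = sorted(set(u))
    return tuple(
        next(uj for uj, row in zip(distinct, a) if row[k] == 1)
        for k in range(len(u))
    )

u = (1, 2, 2, 3)                       # an ordered vector with one tie
x = (2, 3, 1, 2)                       # a vector with ord(x) = u
a = indicator_table(x, u)
assert [sum(row) for row in a] == [1, 2, 1]                   # row sums n_j(u)
assert all(sum(row[k] for row in a) == 1 for k in range(4))   # column sums 1
assert vector_from_table(a, u) == x                           # the bijection
```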

2. Joint and Conditional Distribution

A random vector $X = (X_1, \ldots, X_K)$ taking values in $R^K$ is uniquely determined by its rank and order statistic $\mathrm{rk}(X)$ and $\mathrm{ord}(X)$. To derive the joint distribution of the rank and order statistic we assume that $X$ has a density $p(x) = p(X = x)$ with respect to some product measure $\nu^K$ arising from a $\sigma$-finite measure $\nu$ on $\mathcal{A}$. We will provide a joint density of $(\mathrm{rk}(X), \mathrm{ord}(X))$ on its support $\mathcal{B} \subset \mathcal{S}_K \times R^K_{\le}$ with respect to the product measure $\nu_{\#} \times \nu^K$, $\nu_{\#}$ being the counting measure on $\mathcal{S}_K$.

We first observe that for any permutation $\rho \in \mathcal{S}_K$ the measure induced by $\nu^K$ under the mapping $\Pi_\rho : R^K \to R^K$ coincides with $\nu^K$:

(1) $\nu^K \circ \Pi_\rho^{-1} = \nu^K$.


Indeed, for arbitrary $A_1, \ldots, A_K \in \mathcal{A}$ we have

(2) $\Pi_\rho^{-1}\Big[\prod_k A_k\Big] = \prod_k A_{\rho^{-1}(k)}$,

which proves (1) in view of

$\nu^K\Big(\prod_k A_{\rho^{-1}(k)}\Big) = \prod_k \nu(A_{\rho^{-1}(k)}) = \prod_k \nu(A_k) = \nu^K\Big(\prod_k A_k\Big)$.

The joint distribution of $\mathrm{rk}(X)$ and $\mathrm{ord}(X)$ is determined by the following probabilities, for any $\rho \in \mathcal{S}_K$ and $A \in \mathcal{A}^K$ with $\{\rho\} \times A \subset \mathcal{B}$:

$P\{\mathrm{rk}(X) = \rho,\ \mathrm{ord}(X) \in A\} = P\{X \in \Pi_\rho[A]\} = \int_{\Pi_\rho[A]} p(X = x)\, \nu^K(dx)$
$\quad = \int_A p(X = \Pi_\rho(x))\, \big(\nu^K \circ \Pi_\rho^{-1}\big)(dx) = \int_A p(X = \Pi_\rho(x))\, \nu^K(dx)$ by (1).

Let $C \subset \mathcal{S}_K \times R^K_{\le}$ be any (measurable) set such that $C(\rho) = \{\, u \mid (\rho, u) \in C \,\} \in \mathcal{A}^K$ for all $\rho \in \mathcal{S}_K$. Then

(3) $P\{(\mathrm{rk}(X), \mathrm{ord}(X)) \in C\} = \sum_{\rho \in \mathcal{S}_K} \int_{C(\rho)} 1_{\mathcal{B}}(\rho, u)\, p(X = \Pi_\rho(u))\, \nu^K(du)$.

This implies that the distribution of $(\mathrm{rk}(X), \mathrm{ord}(X))$ with support $\mathcal{B}$ has the following density with respect to $\nu_{\#} \times \nu^K$:

(4) $p(\mathrm{rk}(X) = \rho,\ \mathrm{ord}(X) = u) = \begin{cases} p(X = \Pi_\rho(u)) & \text{for } (\rho, u) \in \mathcal{B}, \\ 0 & \text{for } (\rho, u) \notin \mathcal{B}. \end{cases}$

The conditional distribution of the rank $\mathrm{rk}(X)$ given $\mathrm{ord}(X) = u \in R^K_{\le}$ has support

(5) $T(u) = \{\, \rho \in \mathcal{S}_K \mid (\rho, u) \in \mathcal{B} \,\}$

and is given by

(6) $P\{\mathrm{rk}(X) = \rho \mid \mathrm{ord}(X) = u\} = \dfrac{p(X = \Pi_\rho(u))}{\sum_{\tau \in T(u)} p(X = \Pi_\tau(u))}$ for $\rho \in T(u)$.

Given the order statistic $\mathrm{ord}(X)$, the random vector $X$ is uniquely determined by its rank $\mathrm{rk}(X)$. Hence the conditional distribution of $X$ given $\mathrm{ord}(X) = u$ has (finite) support

(7) $\mathrm{ord}^{-1}\{u\} = \{\, x \in R^K \mid \mathrm{ord}(x) = u \,\}$,

and (6) implies

(8) $P\{X = x \mid \mathrm{ord}(X) = u\} = \dfrac{p(X = \Pi_{\mathrm{rk}(x)}(u))}{\sum_{v \in \mathrm{ord}^{-1}\{u\}} p(X = \Pi_{\mathrm{rk}(v)}(u))}$ for $\mathrm{ord}(x) = u$.

Using the bijection $a(\cdot) : \mathrm{ord}^{-1}\{u\} \to \mathcal{X}$ we finally obtain from (8) the conditional distribution of the table $a(X)$:

(9) $P\{a(X) = a(x) \mid \mathrm{ord}(X) = \mathrm{ord}(x)\} = P\{X = x \mid \mathrm{ord}(X) = \mathrm{ord}(x)\}$.

3. Independent Components

We now look at the special case where all components $X_1, \ldots, X_K$ of $X$ are independent. If $p(X_k = x)$ denotes the density (with respect to $\nu$) of $X_k$, the joint density of $X$ is

(1) $p(X = x) = \prod_{k=1}^{K} p(X_k = x_k)$ for $x = (x_k) \in R^K$.

For fixed $u \in R^K_{\le}$ and arbitrary $\rho \in T(u)$ we get

(2) $p(X = \Pi_\rho(u)) = \prod_{k=1}^{K} p(X_k = u_{\rho(k)})$.

For $x \in \mathrm{ord}^{-1}\{u\}$ with $\mathrm{rk}(x) = \rho$ we consider the table $a(x) = (a_{jk}(x)) \in \mathcal{X}$ from 1 (13) and the table $p(u) = (p_{jk}(u))$ given by

(3) $p_{jk}(u) = p(X_k = u_{(j)})$,

where the $u_{(j)}$ are the distinct components of $u$. Then (2) may be expressed as

(4) $p(X = \Pi_\rho(u)) = \prod_{k=1}^{K} \prod_{j=1}^{J} p(X_k = u_{(j)})^{a_{jk}(x)} = p(u)^{a(x)}$, cf. A 1 (7),

and the conditional probabilities 2 (8) and 2 (9) reduce to

(5) $P\{X = x \mid \mathrm{ord}(X) = u\} = \dfrac{p(u)^{a(x)}}{\sum_{v \in \mathrm{ord}^{-1}\{u\}} p(u)^{a(v)}}$,

(6) $P\{a(X) = a(x) \mid \mathrm{ord}(X) = u\} = \dfrac{p(u)^{a(x)}}{\sum_{a \in \mathcal{X}} p(u)^{a}}$.

Using appendix A 1 we conclude that the conditional distribution of the contingency table $a(X)$ given $\mathrm{ord}(X) = u$ is hypergeometric:

(7) $\mathcal{L}(a(X) \mid \mathrm{ord}(X) = u) = H_{J\times K}(p(u) \mid n(u), 1)$.

The row sums $n(u) = (n_j(u))$ are the frequencies from 1 (12), and the column sums are constant $= 1$. Thus the conditional distribution $\mathcal{L}(X \mid \mathrm{ord}(X) = u)$ corresponds (up to a bijection of its support) to the hypergeometric distribution (7), i.e.

(8) $P\{X = x \mid \mathrm{ord}(X) = u\} = h(a(x) \mid p(u), n(u), 1)$ for $x \in \mathrm{ord}^{-1}\{u\}$.
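The identity between the direct conditional probabilities of 2 (8) and the table weights of (4) can be illustrated numerically (a sketch of ours; the discrete densities below are hypothetical):

```python
from itertools import permutations

# hypothetical discrete densities p_k of three independent components
p = [{1: 0.5, 2: 0.5}, {1: 0.2, 2: 0.8}, {1: 0.3, 2: 0.7}]
u = (1, 2, 2)                          # the observed order statistic
distinct = sorted(set(u))              # u_(1) < u_(2)

# the table p_jk(u) = p(X_k = u_(j)) of (3)
P = [[pk[uj] for pk in p] for uj in distinct]

def table_weight(x):
    """p(u)^{a(x)}, the weight of (4) (here c(y) = 1: all column sums are 1)."""
    w = 1.0
    for k, xk in enumerate(x):
        for j, uj in enumerate(distinct):
            if xk == uj:
                w *= P[j][k]
    return w

def joint(x):
    """Direct joint density (1)."""
    w = 1.0
    for pk, xk in zip(p, x):
        w *= pk[xk]
    return w

support = sorted({tuple(v) for v in permutations(u)})   # ord^{-1}{u}
total_w = sum(table_weight(v) for v in support)
total_j = sum(joint(v) for v in support)
for x in support:          # the two computations of the conditional law agree
    assert abs(table_weight(x) / total_w - joint(x) / total_j) < 1e-12
```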

4. Identical Replications

Suppose now that we have $k = 1, \ldots, K$ independent samples, each consisting of $M_k$ random elements $X_{km} \in R$ which are independent and identically distributed replications of a random element, say $X_k$, so that

(1) $\mathcal{L}(X_{km}) = \mathcal{L}(X_k)$ for all $m = 1, \ldots, M_k$,

with all $X_{km}$ being independent. Viewing $X = (X_{km})$ as a random vector of length $M = M_1 + \cdots + M_K$ with independent components, the results from section 3 apply.

particular the JxM tables a(x) =(a. (x)) and p(u) are defined by 3 km

(1) a . 3 km ( X ) = I { X ~ ~ = U , ) , pj km(u) = dXkm = u(j) )

Using the matrix operator from appendix A2 (4) we get

with p(u) = (jjk(u)) and the collapsed JxM table a+(x) = ( a t (x)) given by 3 k


(3) $\bar{p}_{jk}(u) = p(X_k = u_{(j)})$,

(4) $a^+_{jk}(x) = a_{jk+}(x) = \#\{\, m \mid x_{km} = u_{(j)} \,\}$.

Hence the hypergeometric probabilities 3 (6) depend on $a(x)$ only through the collapsed table $a^+(x)$. Instead of the conditional distribution of $a(X)$ given $\mathrm{ord}(X) = u$,

(5) $\mathcal{L}(a(X) \mid \mathrm{ord}(X) = u) = H_{J\times M}(p(u) \mid n(u), 1)$,

we might equally well consider the conditional distribution of $a^+(X)$, which is hypergeometric by appendix A 2:

(6) $\mathcal{L}(a^+(X) \mid \mathrm{ord}(X) = u) = H_{J\times K}(\bar{p}(u) \mid n(u), M)$ with $M = (M_1, \ldots, M_K)$.

The connection between the hypergeometric distributions (5) and (6) is given by

(7) $P\{a^+(X) = a^+(x) \mid \mathrm{ord}(X) = u\} = d(x) \cdot P\{a(X) = a(x) \mid \mathrm{ord}(X) = u\}$

for any $x \in \mathrm{ord}^{-1}\{u\}$, where the number

(8) $d(x) = \#\{\, z \mid \mathrm{ord}(z) = u,\ a^+(z) = a^+(x) \,\}$

does not depend on the noncentralities.
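Relation (7) and the multiplicity (8) can be checked on a tiny example with two groups ($M_1 = 2$, $M_2 = 1$); the sketch and the densities below are ours, chosen only for illustration:

```python
from itertools import permutations

# two hypothetical groups: M_1 = 2 iid copies of X_1 and M_2 = 1 copy of X_2
p = [{1: 0.4, 2: 0.6}, {1: 0.1, 2: 0.9}]   # densities of X_1 and X_2
groups = (0, 0, 1)                          # component index -> group k
u = (1, 2, 2)                               # observed order statistic
distinct = sorted(set(u))                   # u_(1) < u_(2)

def joint(x):
    w = 1.0
    for g, xk in zip(groups, x):
        w *= p[g][xk]
    return w

def collapsed(x):
    """a^+_jk(x) = #{m : x_km = u_(j)}, cf. (4)."""
    return tuple(
        tuple(sum(1 for g, xk in zip(groups, x) if g == k and xk == uj)
              for k in range(2))
        for uj in distinct
    )

support = sorted({tuple(v) for v in permutations(u)})   # ord^{-1}{u}
total = sum(joint(v) for v in support)

for x in support:       # relation (7): collapsing multiplies by the count d(x)
    t = collapsed(x)
    d = sum(1 for z in support if collapsed(z) == t)    # the number d(x) of (8)
    lhs = sum(joint(z) for z in support if collapsed(z) == t) / total
    assert abs(lhs - d * joint(x) / total) < 1e-12
```

The assertion holds because, within a group, the iid replications make the joint density depend on $x$ only through the collapsed counts, so all $d(x)$ vectors mapping to the same collapsed table carry equal probability.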

References

Breslow, N.E. and Day, N.E. (1980). Statistical Methods in Cancer Research, Volume I: The Analysis of Case-Control Studies. International Agency for Research on Cancer, Lyon.

Haberman, S.J. (1974). The analysis of frequency data. The University of Chicago Press, Chicago and London.

Liddell, F.D.K., McDonald, J.C. and Thomas, D.C. (1977). Methods of Cohort Analysis: Appraisal by Application to Asbestos Mining. J. R. Statist. Soc. A, 140, 469-491.

Osius, G. (2000). The association between two random elements: A complete characterization in terms of odds ratios. Mathematik-Arbeitspapiere No. 53, Universität Bremen (http://www.math.uni-bremen.de/~osius/download).

van der Linde, A. and Osius, G. (2001). Estimation of nonparametric multivariate risk functions in matched case-control studies: with application to the assessment of interactions of risk factors in the study of cancer. To appear in Statistics in Medicine.

Date: 29-Jan-2001 (printed edition) 7-Feb-2000 (PDF-File, with minor corrections)


Universität Bremen, Mathematik-Arbeitspapiere (as of January 2001), ISSN 0173-685X

Issues 4, 14, 23, 26 are distributed by the Universitätsbuchhandlung, Bibliothekstr. 3, D-28359 Bremen. The remaining issues (unless out of print) are distributed by the authors or by FB 3 Mathematik/Informatik, Universität Bremen, Postfach 330440, D-28334 Bremen.

1. Ulrich Krause (1976): Strukturen in unendlichdimensionalen konvexen Mengen, 74 S.

2. Fritz Colonius, Diederich Hinrichsen (1976): Optimal control of hereditary differential systems. Part I, 66 S.

3. Günter Matthiessen (1976): Theorie der heterogenen Algebren, 88 S.

4. H. Wolfgang Fischer, Jens Gamst, Klaus Horneffer (1976): Skript zur Analysis, Band 1 (11. Auflage 2000), 286 S.

5. Wolfgang Schröder (1977): Operator-algebraische Ergodentheorie für Quantensysteme, 59 S.

6. Rolf Röhrig, Michael Unterstein (1977): Analyse multivariabler Systeme mit Hilfe komplexer Matrixfunktionen, 216 S.

7. Horst Herrlich, Hans-Eberhard Porst, Rudolf-Eberhard Hoffmann, Manfred Bernd Wischnewsky (1976): Nordwestdeutsches Kategorienseminar, 193 S.

8. Fritz Colonius, Diederich Hinrichsen (1977): Optimal Control of Hereditary Differential Systems. Part II: Differential State Space Description, 36 S.

9. Ludwig Arnold (1977): Differentialgleichungen und Regelungstheorie, 185 S.

10. Rudolf Lorenz (1977): Iterative Verfahren zur Lösung großer, dünnbesetzter symmetrischer Eigenwertprobleme, 104 S.

11. Konrad Behnen, Hans-Peter Kinder, Gerhard Osius, Rüdiger Schäfer, Jürgen Timm (1977): Dose-Response-Analysis, 206 S.

12. Hans-Friedrich Münzner, Dieter Prätzel-Wolters (1978): Minimalbasen polynomialer Moduln, Strukturindizes und BRUNOVSKY-Transformationen, 53 S.

13. Konrad Behnen (1978): Vorzeichen-Rangtests mit Nullen und Bindungen, 53 S.

14. H. Wolfgang Fischer, Jens Gamst, Klaus Horneffer, Eberhard Oeljeklaus (1978): Skript zur Linearen Algebra, Band 1 (13. Auflage 2000), 249 S.

15. Günter Ludyk (1978): Abtastregelung zeitvarianter Einfach- und Mehrfachsysteme, 54 S.

16. Momme Johs Thomsen (1977): Zur Theorie der Fastalgebren, 146 S.

17. Klaus Horneffer, Horst Diehl (1978): Modellrechnungen zur anaeroben Reduktionskinetik des Cytochroms P-450, 34 S.

18. Horst Herrlich, Rudolf-Eberhard Hoffmann, Hans-Eberhard Porst, Manfred Bernd Wischnewsky (1979): Structure of Topological Categories, 252 S.

19. Hans-Friedrich Münzner, Dieter Prätzel-Wolters (1979): Geometric and moduletheoretic approach to linear systems. Part I: Basic categories and functors, 28 S.

20. Hans-Friedrich Münzner, Dieter Prätzel-Wolters (1979): Geometric and moduletheoretic approach to linear systems. Part II: Moduletheoretic characterization and reachability, 28 S.

21. Eckart Beutler, Hans Kaiser, Günter Matthiessen, Jürgen Timm (1979): Biduale Algebren, 165 S.

22. Horst Diehl, Detlef Harbach, Jürgen Timm (1980): Planung und Auswertung von Atomabsorptions-Spektrometrie-Untersuchungen mit der Additionsmethode, 44 S.

23. H. Wolfgang Fischer, Jens Gamst, Klaus Horneffer (1981): Skript zur Analysis, Band 2 (7. Auflage 2001), 299 S.

24. Horst Herrlich (1981): Categorical Topology 1971-1981, 105 S.

25. Horst Herrlich, Rudolf-Eberhard Hoffmann, Hans-Eberhard Porst, Manfred Bernd Wischnewsky (1981): Special Topics in Topology and Category Theory, 108 S.


26. H. Wolfgang Fischer, Jens Gamst, Klaus Horneffer (1984): Skript zur Linearen Algebra, Band 2 (7. Auflage 1999), 257 S.

27. Rudolf-Eberhard Hoffmann (1982): Continuous Lattices and Related Topics, 314 S.

28. Horst Herrlich, Rudolf-Eberhard Hoffmann, Hans-Eberhard Porst (1987): Workshop on Category Theory, 169 S.

29. Harald Boehme (1987): Zur Berufspraxis des Diplommathematikers, 16 S.

30. Jürgen Timm (1986): Mathematische Modelle der Dosis-Wirkungsanalyse bei den experimentellen Untersuchungen der Arbeitsgruppe zur karzinogenen Belastung des Menschen durch Luftverunreinigung, 65 S.

31. Dieter Denneberg (1988): Mathematik für Wirtschaftswissenschaftler. I. Lineare Algebra, 97 S.

32. Peter E. Crouch, Diederich Hinrichsen, Anthony J. Pritchard, Dietmar Salamon (1988, previous edition University of Warwick 1981): Introduction to Mathematical Systems Theory, 244 S.

33. Gerhard Osius (1989): Some Results on Convergence of Moments and Convergence in Distribution with Applications in Statistics, 27 S.

34. Dieter Denneberg (1989): Verzerrte Wahrscheinlichkeiten in der Versicherungsmathematik, Quantilsabhängige Prämienprinzipien, 24 S.

35. Eberhard Oeljeklaus (1989): Birational splitting of homogeneous Albanese bundles, 30 S.

36. Gerhard Osius, Dieter Rojek (1989): Normal Goodness-of-Fit Tests for Parametric Multinomial Models with Large Degrees of Freedom, 38 S.

37. Dieter Denneberg (1990): Mathematik zur Wirtschaftswissenschaft. II. Analysis, 59 S.

38. Ulrich Krause, Cornelia Zahlten (1990): Arithmetik in Krull monoids and the cross number of divisor class groups, 29 S.

39. Dieter Denneberg (1990): Subadditive Measure and Integral, 39 S.

40. Ulrich Krause, Peter Ranft (1991): A limit set trichotomy for monotone nonlinear dynamical systems, 31 S.

41. Angelika van der Linde (1992): Statistical analyses with splines: are they well defined? 22 S.

42. Dieter Denneberg (1992): Lectures on non-additive measure and integral (new edition: Non-additive measure and integral. TDLB 27, Kluwer Academic, Dordrecht (1994)), 114 S.

43. Gerhard Osius (1993): Separating Agreement from Association in Log-linear Models for Square Contingency Tables With Applications, 23 S.

44. Hans-Peter Kinder, Friedrich Liese (1995): Bremen-Rostock Statistik Seminar, 5.-7. März 1992, 110 S.

45. Dieter Denneberg (1995): Extension of a measurable space and linear representation of the Choquet Integral, 30 S.

46. Dieter Denneberg, Michael Grabisch (1996): Shapley value and interaction index, 20 S.

47. Angelika Bunse-Gerstner, Heike Faßbender (1996): A Jacobi-like method for solving algebraic Riccati equations on parallel computers, 24 S.

48. Hans-Eberhard Porst, editor (1997): Categorical methods in algebra and topology - a collection of papers in honour of Horst Herrlich, 498 S.

49. Angelika van der Linde, Gerhard Osius (1997): Estimation of nonparametric risk functions in matched case-control studies, 28 S.

50. Angelika van der Linde (1997): Estimating the smoothing parameter in generalized spline-based regression, 46 S.

51. Ursula Müller, Gerhard Osius (1998): Asymptotic normality of goodness-of-fit statistics for sparse Poisson data, 15 S.

52. Ursula Müller (1999): Nonparametric regression for threshold data, 18 S.

53. Gerhard Osius (2000): The association between two random elements – A complete characterization in terms of odds ratios, 32 S.


54. Horst Herrlich, Hans-E. Porst (2000): CatMAT 2000, Proceedings of the Conference: Categorical Methods in Algebra and Topology, 490 S.

55. Gerhard Osius (2001): A formal derivation of the conditional likelihood for matched case-control studies, 30 S.