5
Proc. Natl. Acad. Sci. USA Vol. 92, pp. 11741-11745, December 1995 Medical Sciences Inferring identity from DNA profile evidence (forensic science/statistical inference) DAVID J. BALDING* AND PETER DONNELLYt *School of Mathematical Sciences, Queen Mary and Westfield College, Mile End Road, London, El 4NS, United Kingdom; and tDepartments of Statistics, and Ecology and Evolution, University of Chicago, 5734 University Avenue, Chicago, IL 60637 Communicated by Newton E. Morton, University of Southampton Princess Anne Hospital, Southampton, United Kingdom, July 24, 1995 ABSTRACT The controversy over the interpretation of DNA profile evidence in forensic identification can be attrib- uted in part to confusion over the mode(s) of statistical inference appropriate to this setting. Although there has been substantial discussion in the literature of, for example, the role of population genetics issues, few authors have made explicit the inferential framework which underpins their arguments. This lack of clarity has led both to unnecessary debates over ill-posed or inappropriate questions and to the neglect of some issues which can have important conse- quences. We argue that the mode of statistical inference which seems to underlie the arguments of some authors, based on a hypothesis testing framework, is not appropriate for forensic identification. We propose instead a logically coherent frame- work in which, for example, the roles both of the population genetics issues and of the nonscientific evidence in a case are incorporated. Our analysis highlights several widely held misconceptions in the DNA profiling debate. For example, the profile frequency is not directly relevant to forensic inference. Further, very small match probabilities may in some settings be consistent with acquittal. Although DNA evidence is typically very strong, our analysis of the coherent approach highlights situations which can arise in practice where alternative methods for assessing DNA evidence may be misleading. 1. Introduction: Inference in Forensic Identification In criminal cases involving DNA profiles, the jury may be presented with evidence that (i) the culprit has a certain genetic type, (ii) the defendant has the same genetic type (to within measurement error), and (iii) the shared genetic type is rare. The jury has the task of using this evidence, together with any other evidence presented to it, to decide whether or not the defendant is the culprit. The jury's task can formally be viewed as a statistical inference problem: it is required to accept or reject a hypothesis (that the defendant is the culprit) on the basis of data (evidence), part of which is statistical in nature (that measuring the rarity of the genetic type). Most current practice involves an expert witness reporting to the jury an estimate of the profile frequency, which is the relative frequency in some population of the DNA profiles which, due to measurement error, cannot be distinguished from the culprit's DNA profile. The profile frequency is often interpreted in terms of the probability that a randomly selected individual would have the profile. Surprisingly, reasons for using the profile frequency to quantify the DNA evidence are rarely discussed. Moreover, there appears to have been little consideration of the crucial question of the logical connection between it and the proba- bility which is of direct interest to the jury-namely, the probability that the defendant is the culprit. It seems intuitively reasonable that the smaller the profile frequency, the more The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact. surprising is the evidence if the defendant is not the culprit, and hence the more the jury ought to be inclined to believe that culprit and defendant are the same person. It is difficult, however, to make this intuition precise. The lack of theoretical underpinning of the profile fre- quency as a measure of evidential weight is the source of much confusion that surrounds questions such as the following. (i) Of what relevance is the fact that profile frequencies vary among human populations? In which population should the profile frequency be estimated? (ii) How is the DNA evidence, summarized by the profile frequency, to be incorporated with the other evidence? (iii) How should inference be affected if one or more of the defendant's close relatives are not excluded by the other evidence from being among the possible culprits? (iv) How should inference be affected if it is known that the defendant is one of a group of possible culprits whose DNA profiles were investigated? (v) What can be inferred if the "match" between the defendant's profile and the culprit's profile is equivo- cal-for example, because of some missing bands from a degraded sample? Without a sound theoretical basis for inference, the resolution of these questions is extremely difficult and, in fact, the current extensive debate has not resolved them. Although it may be difficult to answer such questions exactly in particular cases, it is nevertheless important to have a theoretical framework in which they can be answered in principle. In this paper we present such a framework. It has been accepted by both prosecution and defence in some U.K. cases, including appeal cases. In Section 2, we discuss inferential frameworks based on classical hypothesis testing and argue that these are unsatis- factory in the criminal trial setting. Next, in Section 3, we describe a coherent framework for forensic inference, in which it is conditional match probabilities, and not profile frequen- cies, which are directly relevant to inference. The effects of both relatedness and population genetics issues enter the discussion through conditioning on the defendant's profile, as discussed in Section 4. Further consequences of the coherent approach, which can be important in practical casework, are discussed in Section 5. Collins and Morton (1) have previously discussed methods of statistical inference appropriate to forensic identification using DNA profiles. In common with the analysis presented here, those authors advocate a central role for likelihood ratios and, consequently, conditional match probabilities rather than profile frequencies. The two approaches differ in their meth- ods for combining the different likelihood ratios correspond- ing to the various levels of relatedness between defendant and possible culprit. Collins and Morton propose a sequence of hypothesis tests, whereas the present approach employs a weighted sum of the likelihood ratios, with the weights being the probabilities, based on the non-DNA evidence, of each possible level of relatedness. The DNA evidence at trial relates directly to the identity of the source of crime samples. To simplify terminology, we equate "is guilty" with "is the source of the crime sample." 11741

Inferring identity from DNA profile evidence - Proceedings of the

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Proc. Natl. Acad. Sci. USAVol. 92, pp. 11741-11745, December 1995Medical Sciences

Inferring identity from DNA profile evidence(forensic science/statistical inference)

DAVID J. BALDING* AND PETER DONNELLYt*School of Mathematical Sciences, Queen Mary and Westfield College, Mile End Road, London, El 4NS, United Kingdom; and tDepartments of Statistics, andEcology and Evolution, University of Chicago, 5734 University Avenue, Chicago, IL 60637

Communicated by Newton E. Morton, University of Southampton Princess Anne Hospital, Southampton, United Kingdom, July 24, 1995

ABSTRACT The controversy over the interpretation ofDNA profile evidence in forensic identification can be attrib-uted in part to confusion over the mode(s) of statisticalinference appropriate to this setting. Although there has beensubstantial discussion in the literature of, for example, therole of population genetics issues, few authors have madeexplicit the inferential framework which underpins theirarguments. This lack of clarity has led both to unnecessarydebates over ill-posed or inappropriate questions and to theneglect of some issues which can have important conse-quences. We argue that the mode of statistical inference whichseems to underlie the arguments of some authors, based on ahypothesis testing framework, is not appropriate for forensicidentification. We propose instead a logically coherent frame-work in which, for example, the roles both of the populationgenetics issues and of the nonscientific evidence in a case areincorporated. Our analysis highlights several widely heldmisconceptions in the DNA profiling debate. For example, theprofile frequency is not directly relevant to forensic inference.Further, very small match probabilities may in some settingsbe consistent with acquittal. Although DNA evidence is typicallyvery strong, our analysis of the coherent approach highlightssituations which can arise in practice where alternative methodsfor assessing DNA evidence may be misleading.

1. Introduction: Inference in Forensic Identification

In criminal cases involving DNA profiles, the jury may bepresented with evidence that (i) the culprit has a certaingenetic type, (ii) the defendant has the same genetic type (towithin measurement error), and (iii) the shared genetic type israre. The jury has the task of using this evidence, together withany other evidence presented to it, to decide whether or not thedefendant is the culprit. The jury's task can formally be viewedas a statistical inference problem: it is required to accept orreject a hypothesis (that the defendant is the culprit) on thebasis of data (evidence), part of which is statistical in nature(that measuring the rarity of the genetic type).Most current practice involves an expert witness reporting to

the jury an estimate of the profile frequency, which is therelative frequency in some population of the DNA profileswhich, due to measurement error, cannot be distinguishedfrom the culprit's DNA profile. The profile frequency is ofteninterpreted in terms of the probability that a randomly selectedindividual would have the profile.

Surprisingly, reasons for using the profile frequency toquantify the DNA evidence are rarely discussed. Moreover,there appears to have been little consideration of the crucialquestion of the logical connection between it and the proba-bility which is of direct interest to the jury-namely, theprobability that the defendant is the culprit. It seems intuitivelyreasonable that the smaller the profile frequency, the more

The publication costs of this article were defrayed in part by page chargepayment. This article must therefore be hereby marked "advertisement" inaccordance with 18 U.S.C. §1734 solely to indicate this fact.

surprising is the evidence if the defendant is not the culprit, andhence the more the jury ought to be inclined to believe thatculprit and defendant are the same person. It is difficult,however, to make this intuition precise.The lack of theoretical underpinning of the profile fre-

quency as a measure of evidential weight is the source of muchconfusion that surrounds questions such as the following.

(i) Of what relevance is the fact that profile frequenciesvary among human populations? In which populationshould the profile frequency be estimated?

(ii) How is the DNA evidence, summarized by the profilefrequency, to be incorporated with the other evidence?

(iii) How should inference be affected if one or more of thedefendant's close relatives are not excluded by the otherevidence from being among the possible culprits?

(iv) How should inference be affected if it is known that thedefendant is one of a group of possible culprits whoseDNA profiles were investigated?

(v) What can be inferred if the "match" between thedefendant's profile and the culprit's profile is equivo-cal-for example, because of some missing bands froma degraded sample?

Without a sound theoretical basis for inference, the resolutionof these questions is extremely difficult and, in fact, the currentextensive debate has not resolved them. Although it may bedifficult to answer such questions exactly in particular cases, it isnevertheless important to have a theoretical framework in whichthey can be answered in principle. In this paper we present sucha framework. It has been accepted by both prosecution anddefence in some U.K. cases, including appeal cases.

In Section 2, we discuss inferential frameworks based onclassical hypothesis testing and argue that these are unsatis-factory in the criminal trial setting. Next, in Section 3, wedescribe a coherent framework for forensic inference, in whichit is conditional match probabilities, and not profile frequen-cies, which are directly relevant to inference. The effects ofboth relatedness and population genetics issues enter thediscussion through conditioning on the defendant's profile, asdiscussed in Section 4. Further consequences of the coherentapproach, which can be important in practical casework, arediscussed in Section 5.

Collins and Morton (1) have previously discussed methodsof statistical inference appropriate to forensic identificationusing DNA profiles. In common with the analysis presentedhere, those authors advocate a central role for likelihood ratiosand, consequently, conditional match probabilities rather thanprofile frequencies. The two approaches differ in their meth-ods for combining the different likelihood ratios correspond-ing to the various levels of relatedness between defendant andpossible culprit. Collins and Morton propose a sequence ofhypothesis tests, whereas the present approach employs aweighted sum of the likelihood ratios, with the weights beingthe probabilities, based on the non-DNA evidence, of eachpossible level of relatedness.The DNA evidence at trial relates directly to the identity of

the source of crime samples. To simplify terminology, weequate "is guilty" with "is the source of the crime sample."

11741

11742 Medical Sciences: Balding and Donnelly

2. Hypothesis Testing

One possible rationale for the use of the profile frequency toquantify the identification evidence, a rationale which seems tobe assumed implicitly by many authors, is in terms of thesignificance level associated with a particular hypothesis test.Suppose that the jury considers only the following two hy-potheses.

R: The defendant has been chosen at random from apopulation of innocent individuals;

G: The defendant is the culprit.The legal maxim "innocent until proven guilty" would con-strain the jury not to accept hypothesis G unless it is persuadedby evidence to do so. Hypothesis R would thus play the role ofthe null hypothesis, in statistical terminology.

After observing the culprit's profile, a critical region can beidentified that consists of profiles which, because of measure-ment error, cannot be distinguished from it. If the defendant'sprofile falls in the critical region, then the null hypothesis, R,is rejected and its rival, G, is accepted. The conventionalmeasure of "confidence" associated with this decision is thesignificance level of the test-that is, the probability of ac-cepting G if R is true, which in this context is the profilefrequency in the population specified by R.An immediate criticism of this hypothesis testing rationale

for use of the profile frequency is the lack of justification forconsidering the "random selection" hypothesis, R, as the rivalto the hypothesis of guilt, G. The mechanism of randomselection does not provide a reasonable model for the processby which innocent defendants are actually obtained. Becauseof the arbitrary nature of R, the problem of specifying in whichpopulation the profile frequency should be measured cannotbe resolved.

In criminal trials, the jurors must base their verdict on all ofthe evidence presented in court. Hypothesis testing approachesaccept or reject the hypothesis of innocence solely on the basisof the DNA evidence. A fundamental weakness of suchmethods is that there are no reasonable principles for incor-porating the non-DNA evidence into the decision process. Itmight be argued that the significance level of the hypothesistest (based on the DNA evidence) could be varied accordingto the strength of the other evidence. There seems to be nobasis for quantifying this suggestion and its implementationcould not be other than ad hoc.Another drawback of hypothesis testing approaches is that

the outcome of such a test can depend crucially on the methodby which the defendant was identified. For example, supposethat suspects are tested sequentially until one is found to matchthe culprit's profile. In principle, a sufficiently thorough searchis guaranteed to result in a matching profile, and thus thesignificance level associated with the evidence would be 1. Inother words, for an ideal search the hypothesis testing ap-proach assigns no weight to the DNA evidence. In practice, itmay not be possible to examine every possible culprit, so thatthe search could have terminated without a match. Althoughthe significance level associated with the search would then be< 1, it may often be substantial and will in any case be difficultto assess.

Similar difficulties arise in the large number of realistic casesin which a hypothesis such as R is inappropriate-for example,when a database of DNA profiles has been searched andprecisely one profile in the database has been found to match(2). Information about the method used to identify the defen-dant is sometimes not admissible as evidence, in which case itwould seem impossible to employ a hypothesis-testing infer-ential approach.More generally, measures of "precision" in the hypothesis

testing paradigm, in terms of the proportion of incorrectrejections of the null hypothesis in an indefinite sequence ofindependent repetitions of the same procedure, seem antithet-

ical to established principles of criminal justice. Although suchlong-run success rates may be of some interest, interpretingthem in the context of a specific case is, at best, problematic.

3. A Coherent Framework for Inference

In criminal trials, the important issue is how an assessment asto the guilt or innocence of the defendant should be made inthe light of all the evidence presented to the jury. There existsa substantial literature on methods for assessing and revisingbeliefs in the presence of uncertainty. Loosely speaking, acollection of beliefs is said to be coherent if it avoids internallogical inconsistencies. Coherent, quantitative belief systemsmust behave as if they are governed by the usual laws ofprobability (for example, see ref. 3 and references therein).When additional evidence becomes available, beliefs (quanti-fied as probabilities) must be updated according to Bayestheorem. Because of this prominent role of Bayes theorem, thecoherent approach is sometimes labelled "Bayesian."Arguments for the use of probabilities in assessing evidence

presented at trials are not new. See, for example, the collectionof papers in ref. 4 and, for discussions of the coherent approachto scientific identification evidence, refs. 5 and 6. Althoughthese arguments have not always been accepted in generalsettings, in the particular case of DNA evidence, presented incourt in numerical terms, the problem of its combination withother evidence cannot be avoided. The arguments for thecoherent approach in this context are compelling.Note that we regard the general discussion of the assessment

of evidence and the particular method described below inconnection with DNA profiling evidence to be normativerather than descriptive. That is, it relates to an "in principle"solution of the problem rather than a description of the way inwhich jurors actually arrive at verdicts. Knowledge of the"correct" method for assessing DNA evidence is valuable inhighlighting errors and misunderstandings which may followfrom other approaches (7, 8) and in assessing possible approx-imations or simplifications. In particular, the framework pre-sented here allows the resolution, in principle, of all thequestions listed in Section 1.

Within the coherent framework, the problem of incorpo-rating the DNA evidence thus becomes one of using it toupdate the probability, based on the other evidence, that thedefendant is the culprit. It is convenient to frame the discus-sion in terms of O(G), the odds against the hypothesis of guilt(after taking the identification evidence into account). Theseodds can be converted into the probability of guilt P(G) by theformula

1P(G) = 1 + O(G) [1]

Under certain plausible assumptions (9), O(G) can be ob-tained as a weighted sum of conditional probabilities of profilepossession. More precisely, writing s and C for the name of,respectively, the defendant and the culprit,

O(G) = PP(Cij9s)P(C= i)3(iVis P(C = S)' [2]

where we introduce I;S for the (measured) DNA profile of thedefendant and ti (i 0 s) for the event that the profile ofindividual i matches the defendant's profile. Although notexplicit in the notation, each probability in Eq. 2 is conditionalon all the other evidence. For example, P(C = s) denotes theprobability that the defendant is the culprit based on all theevidence other than the defendant's DNA profile.The summation in Eq. 2 is over every individual, other than

s, not excluded by the other evidence from being a possible

Proc. Natl. Acad. Sci. USA 92 (1995)

Proc. Natl. Acad. Sci. USA 92 (1995) 11743

culprit. Alternatively, one could think of the summation asbeing over every person on earth, for many of whom thecorresponding value ofP(C = i)/P(C = s) will be negligible orzero. In practice it is not necessary to contemplate explicitlyevery possible culprit. Instead, evaluation can proceed bygrouping together individuals who have a similar level ofrelatedness to the defendant s and who consequently have asimilar value for P(%iJ5;;). For each such group, one then needonly assess the probability, based on the non-DNA evidence,that the culprit is a member of the group, rather than considerseparately each member of the group. Broadly speaking, thiswill typically require consideration of close relatives of thedefendant, persons not directly related to the defendant butsharing ancestry on an evolutionary time scale, and otherpersons unrelated to the defendant. Some examples of theevaluation of P(G) under simplifying assumptions are given inref. 10.An important feature of Eq. 2 is that there is a distinct term

for each possible culprit. One need not consider any hypo-thetical "random" individual, a concept that has led to con-siderable confusion. There has been considerable debate overthe choice of the appropriate "reference population" (11), theconceptual population from which, in the hypothesis testingframework, an innocent suspect has been chosen randomly.The analysis described above shows that this question is notrelevant.

Eq. 2 involves conditioning on the DNA profile of thedefendant. There is an equivalent equation which conditionsinstead on the profile of the criminal.

4. Match Probabilities

The analysis of the previous section establishes that thestrength of the DNA evidence does not depend on profilefrequencies but on the conditional probabilities P(6iJi;). Thisobservation, first emphasized by Morton (12), follows from theprinciple that weight of evidence should be assessed vialikelihood ratios and does not rely on the coherent approachfor its justification. We will refer to the conditional probabil-ities P(%iJ9;;) as match probabilities. Others have used this termto refer to profile frequencies, which may be misleadingbecause a match involves two profiles that are the same orsimilar, whereas a profile frequency involves only one profile.The probabilities P(%iJ9;) will in general vary with i, de-

pending on the degree of relationship (in the everyday senseand in the sense of sharing ancestors on an evolutionary timescale) between individual i and the defendant s. In mostforensic cases, there is thus more than one relevant matchprobability (equivalently, more than one likelihood ratio).Approximations which involve a small number of match prob-abilities will be discussed in Section 5.1.While population databases may be helpful in assessing

profile frequencies, they do not allow direct assessment ofmatch probabilities. These require, in addition to suitable data,modeling of the relevant correlations on the basis of statisticaland population genetics theory. In general there will becorrelations that arise because of our uncertainty over profilefrequencies, which may be due to a number of factors (9). Weconcentrate here on genetic correlations.The role of population genetics issues in forensic DNA

analysis has been widely discussed. Much of the debate hasfocussed on the appropriateness or otherwise of the "productrule" (13) for estimating profile frequencies. The coherentanalysis of Section 3 shows this debate to be misconceived, sincethe appropriate focus is not a profile frequency but the condi-tional probabilities P(%iJ9;;) for different possible culprits i.The effect of any population stratification is exactly incor-

porated in the coherent analysis by the conditioning on thedefendant's profile and the (weighted) averaging over possiblealternative culprits. The positive correlations in profile pos-

session induced by shared ancestry will imply that P(ZJI9S) >P(6i) for individuals i with an ethnic background similar to thatof the defendant. The strengths of these correlations dependon the genealogy of the population and the details of themutation mechanism at the loci involved. In simple populationgenetic models, the effect of this correlation depends on asingle parameter, often called FST. In such models, formulasfor match probabilities are available, in terms of the allelefrequencies and FST (14, 15). However, care is needed inapplying these formulas because the assumptions of the un-derlying models may not be appropriate for forensic loci.Further, only limited data for assessing FST values are yetavailable on the relevant demographic scales. Published esti-mates ofFST at forensic loci are usually small (16, 17), typicallymuch smaller than the values for traditional loci (18), althoughrecent analyses of forensic data using a different method ofestimation (D.J.B. and R. A. Nichols, unpublished work) sug-gest appreciable values of FST which cannot reasonably beignored in forensic settings.Aside from empirical issues, a match probability is quite

distinct, as a matter of logic, from a profile frequency. Thisdistinction has a number of important consequences for fo-rensic inference. For example, it has been argued (19, 20) thatgenotype frequency estimates based on the product rule are"conservative" because they overestimate the population gen-otype frequency in simple models of population substructure.These arguments are misdirected, in focusing on profile fre-quencies rather than (conditional) match probabilities, and aremisleading, since the effect of substructure in these models isto increase, relative to the panmictic case, all relevant matchprobabilities for individuals in the same subpopulation as thedefendant (14).

5. Practical Consequences of Coherence

5.1. Approximations. To simplify the presentation of DNAevidence, it may be desirable to introduce approximations intoEq. 2 which require only one or two match probabilities. Eq.2 may be rewritten as

0(G) = P(C s) aye, [3]

in which

Pave = PP(WiI;;)P(C = i)ios

denotes the appropriate "average" match probability. Unfor-tunately, evaluation OfPave depends on the jury's assessment ofthe other evidence. It cannot be undertaken by the forensicscientist. While Eq. 3 may help to clarify the intuition behindEqs. 1 and 2, it is not useful in the sense of providing a singlematch probability which the forensic scientist can report tocourt.

Ignoring for the moment close relatives of the defendant, aconservative choice for a single match probability would be toreport the largest value of P(ZJI9i) over possible alternativeculprits i. This will typically be the match probability forindividuals i in the same subpopulation as the defendant. In acase in which this applies, and for which other evidenceexcludes close relatives, an upper bound for Eq. 2 is

1 - P(Glnon-DNA evidence)P(Glnon-DNA evidence)

P(esub Is)P(Glnon-DNA evidence)'

Medical Sciences: Balding and Donnelly

11744 Medical Sciences: Balding and Donnelly

where P(fsubjIY) is the conditional probability that a particular(unrelated) individual from the defendant's subpopulation willmatch. Use of Eq. 5 as a conservative bound for Eq. 2 hasobvious practical advantages (when close relatives are exclud-ed).

Further, whatever its exact value, the match probability forunrelated individuals in the defendant's subpopulation is likelyto be small for a four-or-more-locus profile, so that if the effectof the non-DNA evidence is to cast reasonable suspicion on thedefendant, then the (overall) case against him/her will be verystrong. Note, however, that this conclusion depends on anassessment of the other evidence in a case, which is not in thedomain of an expert witness. Further, it does not obviate theneed for care and accuracy in the statistical evaluation andpresentation of DNA evidence.

5.2. Close Relatives. The report of the U.S. National Re-search Council (ref. 13, p. 87) advocated that "whenever thereis a possibility that a suspect is not the perpetrator but is relatedto the perpetrator, this issue should be pointed out to thecourt." In fact, the substantially higher match probabilities forrelatives can have an important effect even in cases in whichthere is no direct evidence suggesting that the perpetrator maybe a relative of the defendant.

Other than identical twins, the most important effectsinvolve siblings of the defendant whose DNA profiles are notavailable. A direct consequence of Eq. 2 is that the relevantissue is not whether or not there is reason to suspect thedefendant's siblings, but the strength of the evidence againstthem relative to the strength of the non-DNA evidence againstthe defendant. It may in some cases be plausible that thenon-DNA evidence has approximately the same weight for thedefendant as for some or all of the siblings. In this case itfollows from Eqs. 1 and 2 that P(G), the probability of thedefendant's guilt, is at most 1/(1 + nmx), in whichx denotes thenumber of such siblings and m denotes the sibling matchprobability. For four-locus DNA profiles, m is typically about1%. Thus there may be reasonable doubt about the defendant'sguilt due to the defendant's siblings alone, even in cases inwhich there is no specific reason to suspect them. A practicalconsequence is that satisfactory prosecutions will requireadditional evidence tending either to inculpate the defendantor to exculpate the siblings. One possible source of such furtherevidence is to profile the siblings, if this is feasible, althoughnon-DNA evidence may also suffice.

5.3. Limited, or Exculpatory, Non-DNA Evidence. The per-ceived strength of DNA evidence has led to cases in whichthere is little or no evidence against the defendant other thanthe DNA evidence. Even if close relatives are unequivocallyexcluded, there may be many unrelated individuals who, if notfor the DNA evidence, would be in much the same situationas the defendant. In other words, there may be many i such thatP(C = i)/P(C = s) is close to unity.

Similar issues arise when there is substantial exculpatoryevidence, such as a convincing alibi, which conflicts withinculpatory DNA evidence. Such cases can arise when anindividual not directly under suspicion is noticed, for reasonseither unrelated to the crime or which are inadmissible asevidence, to have a matching DNA profile.

In such cases, even very small match probabilities do notnecessarily allow a satisfactory prosecution: although eachP(ffjjJ) contributing to Eq. 2 is small, the sum of many suchterms may be nontrivial. In some cases it might be appropriateto consider, for example, the number of males aged between,say, 16 and 50 who could have been at the scene of the crime.(Note that we are not advocating the general use of priorprobabilities given by the reciprocal of census figures.) In largecities, this number could be of the order of 106, in which caseeven a match probability as small as 10-7 may not be convinc-ing in establishing guilt. A less idealized example arose in arecent U.K. rape trial in which (by the admission of the

prosecution) the only incriminatory evidence was the DNAprofile match between defendant and criminal. In contrast, thevictim, who maintained that she had a good view, and memory,of her attacker's face, testified without challenge that thedefendant did not resemble the man who attacked her. Inaddition an apparently credible witness gave the defendant analibi at the time of the offense. Our point is that there will be(a minority of) cases in which extremely small values (say 10-5or smaller) for P(C = s) are plausible.A further consequence of such scenarios is that the exact

value of the match probability can be important, in contrast tothe widespread view that debates over errors of 1 or 2 ordersof magnitude are "purely academic" (21) when the matchprobability is very small. In examples like those outlined in theprevious paragraph, if by ignoring sources of uncertainty aforensic scientist reported a match probability of 10-8 when amore careful analysis might lead to 10-6 as an appropriatevalue, the outcome of the trial could be affected. Since it is notthe role of a forensic scientist to assess the nonscientificevidence, it is not in general possible for her/him to knowwhether or not this may occur in a particular case.

5.4. The Effect of Searches. As discussed in Section 2,hypothesis testing approaches have particular difficulty indealing with the method by which the defendant was identified,which may have involved a search of possible culprits or a"trawl" through a DNA profile database. These situations arereadily accommodated in the coherent framework. In a widerange of settings, the DNA evidence is slightly stronger whenit is obtained after a search, compared with the case of only onesuspect, and hence it slightly favors the defendant to ignore thesearch (9, 23). Although this result conflicts with conclusionsdrawn in a hypothesis testing framework (1), the intuition isthat the search excludes other individuals from being possibleculprits. That the hypothesis testing approach is misleading canbe illustrated by considering the situation in which everyone onearth has been profiled and only one found to match. From thehypothesis testing perspective, this evidence is of no value sincethe search was guaranteed to find a match. The coherentapproach gives the correct answer: the evidence amounts toproof of guilt (assuming here no false exclusions).Although the DNA evidence may be slightly stronger in the

case of a search, the overall case against the defendant mayoften be weaker because there may be little or no otherevidence and consequently a large pool of possible culprits.(For a thorough discussion, see ref. 2.)

5.5. Incorporating Laboratory Errors. Our discussion ofEq. 2 involved some simplification. In place of P(WiJ5s) in thatequation, one should use P(WcJIs, C = i). If Wi and is wereinterpreted in terms of actual (multilocus) genotypes, thenP(fcl9s, C = i) would usually be the same as PffiJ5;). That is,conditional on the defendant's genotype (but not on thecriminal's genotype) knowledge that a particular individual isthe criminal will not change his/her probabilities of genotypepossession. (See ref. 9 for a fuller discussion.)

In fact, Wc and i* are defined not in terms of actualgenotypes but in terms of the profiles measured and reportedby the forensic laboratory. If an error occurs, the criminal anddefendant profiles could be reported to match even though theindividuals have nonmatching genotypes. This possibility isincorporated in the analysis exactly by using P(fcJI9, C = i) inplace of P(fil5-) in Eq. 2. The probability P(WScI5;, C = i) isthus the sum of the probability that individuals i and s havematching genotypes and no false exclusion error occurs and theprobability that the genotypes do not match and a falseinclusion error occurs. This latter probability could be sub-stantially larger than the former (7, 22), in which case ignoringthe possibility of such errors would be seriously prejudicial tothe defendant. Note that what is relevant is an assessment ofthe probability of an error resulting in an inappropriate matchdeclaration in the particular case at hand. These probabilities

Proc. Natl. Acad. Sci. USA 92 (1995)

Proc. Natl. Acad. Sci. USA 92 (1995) 11745

will differ from case to case, depending, for example, on thedetails of sample collection and custody, possible replication ofthe analysis, and the times at which the crime and defendantsample were profiled. Nonetheless, the results of external,blind proficiency tests would appear extremely helpful informing a basis for appropriate assessments.

6. Discussion

We do not propose that juries should be required to assessDNA identification evidence in terms of Eqs. 1 and 2. How-ever, a minimum requirement of a method for presenting DNAevidence to be used in court is that concerns which arise as aconsequence of the logical analysis are adequately addressed.Much current practice, based on reporting to. the court an

estimated profile frequency, does not satisfy this minimumrequirement. The profile frequency is not directly relevant and,if viewed as an estimate of the match probability, its implicitassumption that the relevant correlations are zero is anticon-servative, sometimes substantially so. Further, a juror told onlythat the defendant has a profile which occurs in approximatelyone in a million members of a population may immediatelyconclude that this is overwhelming evidence of the defendant'sguilt, irrespective of the other evidence even though, asillustrated in Section 5.3, this information may in some cir-cumstances be consistent with acquittal. More generally, wehave argued that the hypothesis testing framework is notappropriate for DNA evidence.

In many cases a jury may feel that the effect of the non-DNAevidence is at least to focus reasonable suspicion on thedefendant and away from her/his close relatives. In such casesthe additional effect of a four-locus match should be to makethe overall case against the defendant extremely strong. Nev-ertheless, in our experience there exists a substantial minorityof cases in which the effects of the concerns raised here maybe important. Examples include cases in which there is little orno evidence other than the DNA evidence, cases in which theother evidence is exculpatory, and cases in which a partial ormixed profile is available or some experimental anomaly arises.

Irrespective of any assessment of practical importance, it isimportant as a matter of principle to ensure that presentationsof DNA evidence are fair to the defendant.

P.D. was supported in part by a Block grant from the University ofChicago and National Science Foundation Grant DMS-9505129.

1. Collins, A. & Morton, N. E. (1994) Proc. Natl. Acad. Sci. USA 91,6007-6011.

2. Balding, D. J. & Donnelly, P. (1996) J. Forensic Sci., in press.3. Bernardo, J. M. & Smith, A. F. M. (1994) Bayesian Theory

(Wiley, Chichester, U.K.).4. Tillers, P. & Green, E., eds. (1988) Probability and Inference in the

Law of Evidence: The Limits and Uses of Bayesianism (Kluwer,Dordrecht, The Netherlands).

5. Aitken, C. G. G. (1995) Statistics and the Evaluation ofEvidencefor Forensic Scientists (Wiley, Chichester, U.K.).

6. Evett, I. W. (1983) J. Forensic Sci. Soc. 23, 35-39.7. Koehler, J. J. (1993) Jurimetrics 34, 21-39.8. Balding, D. J. & Donnelly, P. (October 1994) Crim. Law Rev., pp.

711-721.9. Balding, D. J. & Donnelly, P. (1995) J. R. Stat. Soc. A 158, 21-53.

10. Evett, I. W. (1992) J. Forensic Sci. Soc. 32, 5-14.11. Roeder, K. (1994) Stat. Sci. 9, 222-278.12. Morton, N. E. (1992) Proc. Natl. Acad. Sci. USA 89, 2556-2560.13. National Research Council (1992) DNA Technology in Forensic

Science (Natl. Acad. Press, Washington, DC).14. Balding, D. J. & Nichols, R. A. (1995) Genetica 96, 3-12.15. Balding, D. J. & Nichols, R. A. (1994) Forensic Sci. Int. 64,

125-140.16. Morton, N. E., Collins, A. & Balazs, I. (1993) Proc. Natl. Acad.

Sci. USA 90, 1892-1896.17. Weir, B. S. (1994) Annu. Rev. Genet. 28, 597-621.18. Cavalli-Sforza, L. L. & Piazza, A. (1993) Eur. J. Hum. Genet. 1,

3-18.19. Chakraborty; R., Srinivisan, M., Jin, L. & De Andrade, M. (1992)

Proceedings of the Third International Symposium on HumanIdentification (Promega, Madison, WI), pp. 205-222.

20. Li, C. C. & Chakravarti, A. (1994) Hum. Hered. 44, 100-109.21. Lander, E. S. & Budowle, B, (1994) Nature (London) 371,

735-738.22. Thompson, W. C. (1994) Stat. Sci. 9, 263-266.23. Dawid, A. P. & Mortera, J. (1996) J. R. Stat. Soc. B 58, in press.

Medical Sciences: Balding and Donnelly