The Relationship between Parsimony and Maximum-Likelihood Analyses: Tree Scores and Confidence Estimates for Three Real Data Sets

Ronald W. DeBry* and Lawrence G. Abele
Department of Biological Science, Florida State University

Under a restricted set of conditions and with infinitely long sequences, maximum parsimony is guaranteed to give the same estimate of the phylogenetic tree topology as a maximum-likelihood estimator. However, it is not known how frequently the most parsimonious topology will be the same as the maximum-likelihood topology with real data sets. Three 18S nucleotide sequence data sets are examined, each consisting of seven crustacean taxa. For each data set, under both parsimony and likelihood, scores are determined for all 945 topologies, complete confidence sets are estimated by methods that account for variance in the phylogenetic estimate, and bootstrap resampling is performed. For each data set, the maximum-likelihood tree is also a most parsimonious tree, and there is a strong correlation between tree length and maximum likelihood. For two of the data sets, confidence set size and bootstrap results are very similar under both methods. For the third data set, the confidence set estimated under parsimony is much smaller than that estimated under maximum likelihood, and bootstrap values were substantially different between parsimony and maximum likelihood at several nodes.

Introduction

Phylogenies are an important tool for testing hypotheses about historical events and processes, including adaptation and character evolution (for a recent review, see Harvey and Pagel 1991). Frequently the phylogeny is assumed to be correct; however, the phylogeny itself is a statistical inference, and a valid test must account for phylogenetic uncertainty (DeBry 1992a). The idea that the study of phylogeny is a problem of statistical inference is not new; it dates back at least to Edwards and Cavalli-Sforza (1964). Since that time only a few methods have been developed that are specifically designed to be used as a basis for statistical inference. Most notable is the method of maximum likelihood, which was originally explored by Edwards and Cavalli-Sforza (1964) and extended primarily by Felsenstein (e.g., 1973, 1981; see also Barry and Hartigan 1987 for a slightly different approach). Maximum likelihood is computationally quite demanding and is limited to types of data for which a reasonable probability model of character state change can be developed, so it is only just beginning to come into widespread use. Parsimony (Edwards and Cavalli-Sforza 1964; Eck and Dayhoff 1966, p. 164; Fitch 1971) is used to infer phylogenies much more often than is maximum likelihood, but relatively little is known about parsimony’s properties as a statistical estimator beyond that it is not generally a maximum-likelihood method (Felsenstein 1978).

* Present address: Department of Medicine, Duke University

Key words: parsimony, maximum likelihood, phylogenetic inference, confidence sets, nucleotide sequence data.

Address for correspondence and reprints: Ronald W. DeBry, Department of Medicine, Box 3380, Duke University Medical Center, Durham, North Carolina 27710.

Mol. Biol. Evol. 12(2):291-297. 1995. © 1995 by The University of Chicago. All rights reserved. 0737-4038/95/1202-0011$02.00

It has been suggested (DeBry 1992a) that statistical tests based on phylogeny might combine the speed of parsimony methods with the statistical foundation of maximum-likelihood methods. In this approach, parsimony is initially used to search for trees that are of minimal and near-minimal length. Parsimony is also used to search for trees that are minimal and near minimal under a topological constraint determined by the hypothesis to be tested or by the null hypothesis. These trees are submitted to maximum-likelihood analysis to test the hypothesis that the best of the constrained topologies is not significantly worse than the best topology overall. This approach will be valid only if there is a reasonable correlation between parsimony scores (tree length) and maximum-likelihood scores over tree topologies. Here we examine this correlation by exhaustive search of three nucleotide sequence data sets with both parsimony and maximum likelihood. If there is a correlation between scores, then a hypothesis test based on parsimony might give very similar results to the likelihood-based test. Therefore, we also examine the full confidence sets around the best tree(s) as determined under maximum likelihood by using Kishino and Hasegawa’s (1989) test, and as determined under parsimony by using a version of Templeton’s (1983) test. We also examine the relationship between these confidence sets and confidence as assessed by bootstrap resampling (Felsenstein 1985).

These analyses are also relevant to the general question of the relationship of parsimony to maximum-likelihood methods. As mentioned above, Felsenstein (1978) first demonstrated that parsimony is not generally a maximum-likelihood estimator and that parsimony can sometimes have undesirable statistical properties. The most studied statistical property of the parsimony method is consistency. Parsimony will converge on an incorrect topology under some sets of branch lengths (Felsenstein 1978; Hendy and Penny 1989; DeBry 1992b), but whether parsimony will be a consistent estimator of phylogeny for all, some, or only a few real data sets remains unresolved.

It is important to note the conceptual differences between parsimony and maximum likelihood. Maximum-likelihood estimation incorporates an explicit model of nucleotide sequence evolution. This model accounts for multiple changes at a single position, so maximum likelihood simultaneously estimates both branch lengths and phylogenetic tree topologies. Parsimony is strictly a method for choosing among topologies, although Steel et al. (1993) have recently shown that correction for multiple substitutions as an intermediate step in parsimony analysis will result in parsimony being generally a consistent estimator of tree topology. In this study, we consider only the performance of parsimony without such a correction procedure.
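To make the parsimony score concrete, the following is a minimal sketch of Fitch-style step counting for unordered characters, with no correction for multiple substitutions. It is an illustration only and not the PAUP or DNAPARS implementation: the function names, the nested-tuple tree encoding, and the toy alignment are assumptions introduced here, and unknown states (gaps) are not handled.

```python
# Illustrative sketch only; names and data layout are assumptions.
# A tree is a nested 2-tuple of leaf names, e.g. (("A", "B"), ("C", "D")).
# For unordered characters the length does not depend on where the
# unrooted tree is rooted, so any rooting may be used.

def fitch_site(tree, states):
    """Return (state_set, steps) for one site on a rooted binary tree."""
    if isinstance(tree, str):                  # leaf: its observed state
        return {states[tree]}, 0
    left, right = tree
    left_set, left_steps = fitch_site(left, states)
    right_set, right_steps = fitch_site(right, states)
    common = left_set & right_set
    if common:                                 # children can agree: no new step
        return common, left_steps + right_steps
    # children cannot agree: one additional change is required
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, alignment):
    """Sum the per-site step counts over all columns of an alignment."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, column)[1]
    return total


toy = {"A": "AAG", "B": "AAG", "C": "CAT", "D": "CAT"}
short_tree = (("A", "B"), ("C", "D"))
long_tree = (("A", "C"), ("B", "D"))
lengths = (tree_length(short_tree, toy), tree_length(long_tree, toy))
```

On the toy alignment the first topology needs two steps and the second needs four, so parsimony would prefer the first.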

Material and Methods

We examine three different crustacean 18S ribosomal RNA data sets (a “decapod” data set, a “barnacle” data set, and a “maxillopod” data set), with seven species each. The classification of the Crustacea is unsettled, so we follow the summary classification used elsewhere (Bowman and Abele 1982). The three data sets span a range of levels of phylogenetic affinity, with the decapods being the most closely related to each other and the maxillopods being the most heterogeneous group.

The decapod data set includes only members of the order Decapoda and is derived from earlier work (Kim and Abele 1990; Abele 1991). Based on these two works, the species involved are thought to be made up of relatively closely related pairs, consisting of Penaeus and Sicyonia, Palaemonetes and Procaris, and Procambarus and Callinectes.

The barnacle data set is considerably more heterogeneous than the decapod data set, although except for the outgroup Chlamydotheca (an ostracod), all of the taxa are generally placed in the same taxonomic subunit, the Cirripedia. These species include representatives of two classes of the Cirripedia, the Rhizocephala (Loxothylacus) and the Thoracica. Within the Thoracica, Lepas and Octolasmis are classified in the suborder Lepadomorpha, while Chelonibia, Tetraclita, and Chthamalus are members of the suborder Balanomorpha. The sequence data for the barnacle groups are from Spears et al. (1994).

The maxillopod data set, like the class (Bowman and Abele 1982), subclass (Abele et al. 1989), or superclass (Abele et al. 1992) Maxillopoda, is a heterogeneous mixture of groups of debatable taxonomic affiliation. This data set consists of one member of the class Branchiopoda (Branchinecta, designated the outgroup), one member of the class Ostracoda, order Podocopida (Chlamydotheca), one pentastome (Porocephalus, once considered a member of a separate phylum, but placed within the Crustacea by Abele et al. [1989]), one member of the maxillopod subclass Branchiura (Argulus), and two members of the maxillopod subclass Cirripedia (order Acrothoracica [Trypetesa] and order Thoracica [Balanus]).

The decapod and maxillopod data sets are based on direct sequencing of ribosomal RNA, and each contains regions in which the sequence remains unknown. The barnacle data set is based on DNA sequencing data following PCR amplification of an approximately 2,000 base pair region of the 18S ribosomal gene. These DNA sequences are much more complete than are the RNA-derived sequences. All gaps, whether due to missing information or inserted to improve the alignment, were considered as unknown nucleotide states.

All 945 unrooted tree topologies were evaluated by parsimony using PAUP (version 3.0s; Swofford 1992), and the lengths of all trees were saved. A computer-readable description of each topology was also saved for input into the DNAML and DNAPARS programs in PHYLIP (version 3.4; Felsenstein 1991) as user-defined trees, effectively creating an exhaustive search mode for the maximum-likelihood analysis.
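The figure of 945 follows from the standard count of fully resolved unrooted trees, (2n - 5)!! for n taxa. A short sketch of that count is given here; the function name is an assumption used only for illustration.

```python
def num_unrooted_topologies(n_taxa):
    """Number of fully resolved unrooted trees on n_taxa labeled tips.

    Uses the product 3 * 5 * ... * (2n - 5), which equals 1 for n = 3.
    """
    if n_taxa < 3:
        raise ValueError("need at least three taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):   # odd factors up to 2n - 5
        count *= k
    return count


seven_taxa = num_unrooted_topologies(7)     # 3 * 5 * 7 * 9
```

With seven taxa the product is 3 × 5 × 7 × 9 = 945, which is small enough for exhaustive evaluation. The same formula gives 105 topologies for six taxa and 15 for five.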

We performed two kinds of statistical tests for both likelihood and parsimony: bootstrap resampling (Felsenstein 1985) and tests based on site-by-site pairwise comparison between trees (Templeton 1983; Kishino and Hasegawa 1989). When DNAML is provided with multiple user-defined trees, it performs the statistical test developed by Kishino and Hasegawa (1989). The log likelihood of each tree is compared to the best tree in a pairwise test. However, the mean difference in log likelihood between the two trees is not a sufficient basis for comparison (Hasegawa and Kishino 1989). It is also necessary to estimate the variance of log-likelihood differences based on the variability of likelihoods at individual sites in the nucleotide sequence (Kishino and Hasegawa 1989, eq. [12]). One estimate of the confidence set is the collection of trees for which the difference in mean likelihood compared to the best tree is less than 1.96 SE. This test has been shown to be closely related to the bootstrap probability for a particular alternative tree (Kishino et al. 1990; Hasegawa and Kishino 1994). Bootstrap analysis can be closely approximated by repeated sampling from the multivariate normal distribution, without recalculating the maximum likelihood for each bootstrap sample (Kishino et al. 1990). In this case, however, we were able to perform bootstrap resampling directly with the program fastDNAML (provided by Dr. G. Olsen, University of Illinois, Champaign).
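The pairwise comparison can be sketched as follows, assuming that per-site log likelihoods are available for the best tree and for one alternative tree. This is a simplified illustration and not DNAML's code: the function name and inputs are hypothetical, and the variance estimator n/(n - 1) × Σ(d_i - mean)², taken over the per-site differences d_i, is our reading of the cited equation.

```python
import math


def kh_test(site_lnl_best, site_lnl_alt, z_crit=1.96):
    """Compare two trees from their per-site log likelihoods.

    Returns (delta, se, in_confidence_set), where delta is the total
    log-likelihood difference (best minus alternative), se is its
    estimated standard error, and in_confidence_set is True when
    delta is smaller than z_crit standard errors.
    """
    n = len(site_lnl_best)
    if n != len(site_lnl_alt) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 sites")
    diffs = [b - a for b, a in zip(site_lnl_best, site_lnl_alt)]
    delta = sum(diffs)
    mean = delta / n
    # variance of the summed difference, from the site-to-site variability
    variance = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    se = math.sqrt(variance)
    return delta, se, delta < z_crit * se


# Made-up per-site values for illustration only.
best = [-1.0, -2.0, -1.5, -1.0]
alt = [-1.1, -2.0, -1.4, -1.2]
delta, se, keep = kh_test(best, alt)
```

In the made-up example the alternative tree is worse by 0.2 log-likelihood units, which is well under 1.96 SE, so it would stay in the confidence set.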

Templeton (1983) developed a nonparametric test for comparing parsimony trees that is very similar to the Kishino-Hasegawa likelihood test, and Felsenstein (1991) has implemented a version of this test in the DNAPARS program in PHYLIP. For both the Kishino-Hasegawa and Templeton tests, note that we are not placing too much emphasis on whether a particular topology is rejected; in this study we are interested primarily in comparing the sizes of confidence intervals estimated by the pairwise tests to estimates of statistical reliability derived from bootstrap resampling. For parsimony, bootstrap analysis was done in PAUP.
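As an illustration of the nonparametric comparison, the sketch below applies a Wilcoxon signed-rank statistic with a normal approximation to per-site step counts on two trees. It is a hedged sketch only: the version implemented in DNAPARS may differ, the function name and inputs are assumptions, and the tie correction to the variance is omitted.

```python
import math


def signed_rank_z(steps_tree1, steps_tree2):
    """Normal-approximation z score for a Wilcoxon signed-rank comparison.

    Inputs are per-site step counts for the same sites on two trees.
    Sites with equal step counts are dropped, as is usual for this test.
    """
    diffs = [a - b for a, b in zip(steps_tree1, steps_tree2) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0
    # rank the absolute differences, giving tied values their average rank
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1       # mean of 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    expected = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - expected) / sd


# Made-up step counts for five sites on two trees.
z = signed_rank_z([1, 2, 1, 1, 3], [1, 1, 1, 2, 1])
```

A positive z indicates that the first tree tends to require more steps. With only a handful of differing sites, as in the toy input, exact tables would be more appropriate than the normal approximation.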

Our main focus is on comparing parsimony to maximum likelihood, because both of those methods can be used to provide a score for every topology. For comparison purposes, we have also analyzed each of the three data sets by neighbor-joining analysis (Saitou and Nei 1987) using the NEIGHBOR program in the PHYLIP package and by the method of Fitch and Margoliash (1967) using the FITCH program in PHYLIP.

Results

The decapod and maxillopod data sets each have two equally most parsimonious trees, while the barnacle data set has a single most parsimonious tree (fig. 1). Consistency indexes (counting only phylogenetically informative characters) for the most parsimonious trees are 0.714 for the decapod data set, 0.861 for the barnacle data set, and 0.631 for the maxillopod data set. For the decapod data set, the maximum-likelihood tree is identical to one of the two most parsimonious trees, and the other most parsimonious tree has the second highest likelihood. For the barnacle data set, the maximum-likelihood tree is the same as the most parsimonious tree. For the maxillopod data set, the maximum-likelihood tree is identical to one of the two most parsimonious trees, but the other most parsimonious tree ranks fourteenth in likelihood.

FIG. 1. Parsimony and bootstrap analyses of three 18S ribosomal RNA data sets. Partitions appearing in greater than 50% of 1,000 parsimony replicates (using a branch and bound search algorithm) or 300 maximum-likelihood replicates (using global branch swapping and random sequence addition orders) are indicated. A, Decapod data set; B, barnacle data set; C, maxillopod data set. In A and B, parsimony values are shown above the lines, maximum-likelihood values are shown below the lines; an asterisk indicates a partition found in less than 50% of the replicates by that method; in C, parsimony results are shown on the left, maximum-likelihood results are shown on the right.

We also converted the nucleotide sequence data to distances using both the Kimura (1980) two-parameter model and the maximum-likelihood method as implemented in the PHYLIP program DNADIST. For the decapod and barnacle data sets the neighbor-joining (Saitou and Nei 1987) and Fitch-Margoliash (Fitch and Margoliash 1967) trees are identical to the maximum-likelihood tree (using the programs NEIGHBOR and FITCH in PHYLIP; results not shown). For the maxillopod data set, neither the neighbor-joining nor the Fitch-Margoliash tree is the same as the maximum-likelihood or parsimony trees. The neighbor-joining tree has a log likelihood of -7247 (a difference in log likelihood of 26 compared to the maximum-likelihood tree) and a length of 995 steps (12 steps longer than the maximum-parsimony tree), while the Fitch-Margoliash tree has a log likelihood of -7245 (a difference in log likelihood of 24) and a length of 998 steps (15 steps longer).
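For readers unfamiliar with the two-parameter correction, the sketch below computes the closed-form Kimura (1980) distance, d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions. It illustrates the formula only; DNADIST may estimate the distance differently, and the function name and the rule of skipping sites with unknown states are assumptions made here.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq1, seq2):
    """Closed-form Kimura two-parameter distance for two aligned sequences."""
    s1 = seq1.upper().replace("U", "T")
    s2 = seq2.upper().replace("U", "T")
    transitions = transversions = compared = 0
    for a, b in zip(s1, s2):
        if a not in "ACGT" or b not in "ACGT":
            continue                          # skip gaps and unknown states
        compared += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):  # purine-purine or pyrimidine-pyrimidine
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


d_transition = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
d_transversion = k2p_distance("AAAAAAAAAA", "CAAAAAAAAA")
```

A single transition in ten sites gives a distance slightly above the observed proportion of 0.1, reflecting the correction for unseen multiple changes.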

For all three data sets there is a highly significant correlation between parsimony scores and likelihoods (fig. 2). For both the decapod and maxillopod data sets, the parsimony and likelihood scores form a single cloud of points, with the decapod data set appearing to show a stronger relationship between parsimony and likelihood ranks for the more well-supported trees. For the barnacle data set, the correlation between parsimony and likelihood scores is highly significant, but the scores are separated into three distinct clouds of points.
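A rank correlation of this kind can be sketched as a Pearson correlation computed on average ranks. The code below is an illustration with made-up scores, not the analysis actually run; the function names are assumptions. Note that the sign depends on orientation: correlating tree length with raw log likelihood gives a negative value when longer trees have lower likelihoods, whereas ranking both scores from best to worst gives a positive value.

```python
import math


def average_ranks(values):
    """Rank values from smallest to largest, averaging ranks over ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Made-up scores for five topologies.
tree_lengths = [510, 512, 512, 520, 530]
log_likelihoods = [-5230.0, -5235.0, -5240.0, -5260.0, -5300.0]
r_raw = spearman(tree_lengths, log_likelihoods)
r_oriented = spearman(tree_lengths, [-v for v in log_likelihoods])
```

Negating the log likelihoods orients both scores so that smaller is better, which flips the sign of the coefficient without changing its magnitude.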

If parsimony is to be used to find trees meeting a particular topological constraint prior to examining those trees using likelihood, then it is important that there be a correlation between the parsimony and likelihood scores among subsets of constrained trees. For each data set, two species that represent different lineages on the maximum-parsimony tree were chosen (decapods, Callinectes and Penaeus; barnacles, Octolasmis and Chelonibia; maxillopods, Argulus and Balanus). Each of the 105 topologies that meet the constraint that the two taxa be sisters was found using PAUP, and the rest of the procedure was the same as the overall comparisons. Not unexpectedly, given the strong overall correlations, parsimony and likelihood scores for the restricted set of trees are also highly correlated (by Spearman rank correlation, P < 0.0001 in all three cases; results not shown).

Confidence sets were estimated using the parsimony-based Templeton test and the likelihood-based Kishino-Hasegawa test (fig. 2, table 1). For the decapod and maxillopod data sets, the confidence sets estimated under parsimony are similar in both size and composition to the confidence sets estimated under maximum likelihood, with the parsimony-based confidence set being slightly smaller than the likelihood-based set for the decapod data and slightly larger than the likelihood-based set for the maxillopod data. The confidence sets for the maxillopod data are much larger than those for the decapod data, and in fact they include more than 10% of all possible trees. As noted above, the neighbor-joining and Fitch-Margoliash trees for the maxillopod data are not the same as the maximum-likelihood or maximum-parsimony trees. However, both the neighbor-joining and Fitch-Margoliash trees fall within the confidence set according to both the Kishino-Hasegawa and Templeton tests.

For the barnacle data, there is a considerable difference between the confidence sets estimated under parsimony and likelihood. The parsimony-based confidence set contains only 18 trees, while the likelihood-based confidence set contains 104 trees. This difference is related to the separation of the barnacle trees into three groups (fig. 2B). Taken separately, parsimony and likelihood each give a bimodal distribution of tree scores, but the score distributions are different between the two methods. For parsimony, there is a distinct group of 21 trees which all have scores of 527 or fewer steps. For likelihood, there are no trees in an interval around a log likelihood of about -5270, and there are 189 trees in the group with higher log likelihoods.

Table 1
Number of Trees Included in the Confidence Sets

                 ANALYSIS
DATA SET         Parsimony (a)    Likelihood (b)
Decapods         10               18
Maxillopods      138              101
Barnacles        18               104

a Estimated by the method of Templeton (1983).
b Estimated by the method of Kishino and Hasegawa (1989).

[Figure 2, panels A-C: log likelihood plotted against Tree Length (Parsimony).]

FIG. 2. Relationship between tree length according to parsimony and maximum likelihood for all 945 tree topologies for three 18S nucleotide sequence data sets. Filled circles, trees included within the confidence set according to both the Templeton test and the Kishino-Hasegawa test; open squares, trees included only in the confidence set according to the likelihood-based Kishino-Hasegawa test; open diamonds, trees included only in the confidence set according to the parsimony-based Templeton test; open circles, trees excluded from both confidence sets. All three data sets show a highly significant correlation between tree length and likelihood by a Spearman rank test. A, Decapod data set; r_s = 0.902, P < 0.0001. B, Barnacle data set; r_s = 0.823, P < 0.0001. C, Maxillopod data set; r_s = 0.842, P < 0.0001.


A majority-rule consensus of all 189 trees in the high-likelihood group for the barnacle data set shows only that a Lepas + Octolasmis group is found on 56% of the trees (not shown). However, closer examination reveals a pattern that is obscured by looking at the consensus tree (fig. 3). Among the 945 total trees, 189 can be bipartitioned such that Lepas and Octolasmis are in one partition and Chthamalus, Tetraclita, and Chelonibia are in the other (with Loxothylacus and Chlamydotheca allowed to occupy all possible positions). Those are the same 189 trees that are included in the high-likelihood group, and 104 of that 189 are included in the likelihood confidence set. The group of 21 trees with low parsimony scores is formed by constraining Loxothylacus and Chlamydotheca to form a monophyletic group, and then allowing that group to join any edge of the tree in figure 3. Of the 21 low-parsimony trees, 18 are included in the parsimony confidence set according to Templeton’s test.
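One way to reproduce the counts of 189 and 21 is by stepwise addition, using the fact that an unrooted binary tree with n tips has 2n - 3 edges. The sketch below rests on an assumption that is ours rather than stated in the text: that the five-taxon backbone can be resolved in three ways once Lepas and Octolasmis are held together. The variable and function names are illustrative.

```python
def edges_in_unrooted_tree(n_tips):
    """Number of edges in a fully resolved unrooted tree with n_tips tips."""
    return 2 * n_tips - 3


# Assumption: three resolved five-taxon backbones keep Lepas + Octolasmis
# apart from Chthamalus, Tetraclita, and Chelonibia.
backbones = 3

# Add Loxothylacus to any edge of a 5-tip tree, then Chlamydotheca to any
# edge of the resulting 6-tip tree.
high_likelihood_trees = (
    backbones * edges_in_unrooted_tree(5) * edges_in_unrooted_tree(6)
)

# Treat Loxothylacus + Chlamydotheca as a single unit joined to any edge
# of a 5-tip backbone.
low_parsimony_trees = backbones * edges_in_unrooted_tree(5)
```

Under that assumption the arithmetic is 3 × 7 × 9 = 189 and 3 × 7 = 21, matching the group sizes reported above.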

Results of bootstrap analyses of the three data sets span a wide range (fig. 1). The decapod data set gives very high bootstrap values; three bipartitions appear in at least 93% of the resampled replicates, and nearly identical values are obtained under both parsimony and likelihood. The maxillopod data set gives much lower bootstrap values; the highest values obtained are 76% under parsimony and 81% under likelihood, and only two bipartitions appear in over 50% of the parsimony replicates. Not surprisingly, the barnacle data give quite different results depending on whether parsimony or likelihood is used. Under parsimony, two bipartitions have high values (86% and 98%), but both of those show much lower values under maximum likelihood (65% and 52%, respectively).

The bootstrap results agree with the confidence sets in the sense that the data set-analysis combinations with the smaller confidence sets tend to have higher support for bipartitions under the bootstrap analysis. To more closely examine the relationship between the confidence sets and bootstrap results, it is interesting to compare the decapod topology suggested by the very high bootstrap values with the trees in the confidence set suggested by the Kishino-Hasegawa test. According to the likelihood bootstrap results, three pairs of taxa are supported in over 93% of the replicates. The only taxon that is not in a highly supported partition is Stenopus; but even Stenopus is confined to one of two positions, either as the sister to Procambarus plus Callinectes or as the sister to Palaemonetes plus Procaris. Thus, the bootstrap results could be interpreted as suggesting a confidence set of only two trees, while the Kishino-Hasegawa confidence set includes 18 trees.

FIG. 3. Unrooted tree showing the relationships in common to all of the 189 high-likelihood trees for the barnacle data set.

The bootstrap proportions represent multiple a posteriori tests, so the confidence levels might be overestimated. Felsenstein and Kishino (1993) have pointed out that the best interpretation of P (the proportion of bootstrap replicates that include a particular partition) is that 1 - P is a conservative estimate of the probability of making a Type I error regarding that partition. So for the likelihood analysis of the decapod data, the probability that the support seen for one or more of these three groups is due to chance could be estimated as 1 - (0.99 × 0.96 × 0.93) = 0.12 (assuming that these events are independent). By this logic, the Kishino-Hasegawa confidence set still appears to be more conservative than the overall bootstrap estimate of confidence.
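The arithmetic behind the 0.12 figure can be checked directly. The helper name below is an assumption for illustration; the calculation simply multiplies the three bootstrap proportions under the stated independence assumption and subtracts the product from one.

```python
def joint_error_estimate(proportions):
    """Return 1 minus the product of the bootstrap proportions.

    Assumes the partitions are supported independently of one another.
    """
    joint_support = 1.0
    for p in proportions:
        joint_support *= p
    return 1.0 - joint_support


# Likelihood bootstrap proportions for the three decapod pairs.
estimate = joint_error_estimate([0.99, 0.96, 0.93])
```

The product of the three proportions is 0.883872, so the estimate is 0.116128, which rounds to the 0.12 quoted in the text.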

If we relax our interpretation of the bootstrap results to allow Stenopus to occupy any possible position on the tree, there are still only nine topologies that are compatible with the three highly supported two-taxon partitions. Five of those nine trees are included in the Kishino-Hasegawa confidence set, leaving 13 trees that disagree with one or more of the relationships inferred from the bootstrap results. Among those 13 trees, 10 do not include the Sicyonia-Penaeus clade, the clade that also has the lowest bootstrap support. There are only three trees without a Procambarus-Callinectes clade and four trees without a Palaemonetes-Procaris clade in the Kishino-Hasegawa confidence set. It appears that the bootstrap and Kishino-Hasegawa estimates agree on the relative ranking of support for the three clades, if not on the absolute value of the support.
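The count of nine compatible topologies can be seen by listing the branches to which Stenopus could attach once the three pairs are fixed: six terminal branches plus the three branches that subtend the pairs. The short sketch below enumerates them; the variable names are illustrative only.

```python
pairs = [
    ("Penaeus", "Sicyonia"),
    ("Palaemonetes", "Procaris"),
    ("Procambarus", "Callinectes"),
]

# Terminal branches: one per taxon in the three fixed pairs.
terminal_branches = [taxon for pair in pairs for taxon in pair]

# Internal branches: one subtending each fixed pair.
internal_branches = [" + ".join(pair) for pair in pairs]

# Each branch is one possible placement of Stenopus, hence one topology.
stenopus_placements = terminal_branches + internal_branches

# Trees in the Kishino-Hasegawa set that conflict with the bootstrap pairs.
conflicting_trees = 18 - 5
```

This agrees with the general rule that a six-tip unrooted tree has 2 × 6 - 3 = 9 edges, and it leaves 13 of the 18 trees in the Kishino-Hasegawa confidence set outside the nine.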

Discussion

For all three data sets, there is very good agreement between parsimony and maximum likelihood on which tree is the best overall. The two methods are also substantially in agreement on the ranking of near-optimal trees. If these results prove to be representative of many real data sets, then they suggest that an economical approach to statistical analysis of large phylogenies may involve two phases: identification of interesting candidate phylogenies by parsimony, followed by examination of a restricted set of trees using maximum likelihood. There is no guarantee that other data sets will give similar results, but in the present study no effort was made to select data sets showing a high correlation.

The barnacle data demonstrate that differences between parsimony and likelihood can be important in some cases. Despite a highly significant overall correlation between parsimony and likelihood scores, both the confidence sets and bootstrap values are substantially different between parsimony and likelihood. Many of the trees within the likelihood confidence set fall well outside the parsimony confidence set, with lengths approximately at the midpoint of the distribution of all tree lengths. However, the two methods do at least agree on the best topology overall.

In any particular study it would be safest to verify the relationship between parsimony scores and likelihoods. In most cases it will not be possible to examine all possible topologies, but it should be sufficient to examine the correlation between scores for a subset of trees. Two strategies could be taken. One would be to examine randomly selected topologies, to verify the overall correlation. The other, probably more important approach, would be to examine as many nearly most parsimonious trees as possible, both overall and under the selected constraints. This will be important because correct estimation of the confidence set depends on correctly identifying the maximum-likelihood tree. Even if there is a significant overall correlation between parsimony and likelihood scores, the maximum-likelihood tree may not be the most parsimonious tree. With large data sets, it will not be possible to know if the most parsimonious tree has been found, so even greater caution must be used in interpreting likelihood results that are based on initial searches using parsimony.

We find a general correspondence between the size of the confidence set estimated by the Kishino-Hasegawa and Templeton tests and the presence of highly supported partitions in the bootstrap analysis. For example, in both analyses of the decapod data set, three partitions were found in over 90% of the bootstrap replicates (fig. 1A), and the confidence sets include only 10 and 18 trees (table 1), while in both analyses of the maxillopod data set the bootstrap values are all much below 90% (fig. 1C), and the confidence sets include over 100 trees (table 1). Likewise, parsimony analysis of the barnacle data gives both high bootstrap values and a small confidence set, while likelihood analysis of the same data gives very low bootstrap values and a much larger confidence set. This relationship is consistent with the observation by Kishino et al. (1990), who found that the Kishino-Hasegawa test implemented in DNAML gave very similar estimates of the variance in likelihood difference compared to the bootstrap.
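For readers unfamiliar with the variance estimate referred to here, the following sketch shows the usual form of the Kishino-Hasegawa calculation from per-site log-likelihoods. It is a minimal illustration, not the DNAML implementation: the function name, the 1.96 cutoff (a two-sided 5% normal approximation), and the per-site values are all assumptions made for the example.

```python
from math import sqrt

def kh_difference(site_lnl_best, site_lnl_alt):
    """Return (delta, se): the total log-likelihood difference between two
    trees and its estimated standard error, from per-site log-likelihoods."""
    n = len(site_lnl_best)
    diffs = [a - b for a, b in zip(site_lnl_best, site_lnl_alt)]
    delta = sum(diffs)
    mean = delta / n
    # The variance of the sum is estimated from the spread of per-site differences.
    variance = n / (n - 1) * sum((d - mean) ** 2 for d in diffs)
    return delta, sqrt(variance)

# Hypothetical per-site log-likelihoods for four sites (illustration only).
best_tree = [-1.0, -2.0, -1.5, -3.0]
alt_tree = [-1.2, -2.1, -1.4, -3.5]

delta, se = kh_difference(best_tree, alt_tree)
# Assumed decision rule: the alternative tree is excluded from the confidence
# set only if the difference exceeds 1.96 standard errors.
rejected = delta > 1.96 * se
```

Under this rule a tree remains in the confidence set whenever its likelihood is not significantly lower than that of the best tree, which is why noisy per-site differences lead to large confidence sets.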

However, when we compare the particular trees found in the Kishino-Hasegawa confidence set for the decapod data to the trees that are compatible with the likelihood bootstrap results, we find the Kishino-Hasegawa test to be somewhat more conservative than the bootstrap. The Kishino-Hasegawa confidence set includes 13 trees that are excluded by the bootstrap test, if we accept bipartitions found in more than 90% of the bootstrap replicates as being highly supported.
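The comparison described above amounts to asking which trees in a confidence set conflict with at least one highly supported bipartition. A minimal sketch follows, assuming each unrooted tree is represented by the set of its internal bipartitions and each bipartition by the taxa on one side. The taxon labels, tree names, and supported partition are hypothetical, and this is not the software used in the study.

```python
def splits_compatible(a, b, taxa):
    """Two bipartitions of the same taxon set are compatible if at least one
    of the four pairwise intersections of their sides is empty."""
    a_c, b_c = taxa - a, taxa - b
    return any(not s for s in (a & b, a & b_c, a_c & b, a_c & b_c))

def tree_compatible(tree_splits, supported_splits, taxa):
    """A tree is compatible with the bootstrap result if none of its
    bipartitions conflicts with any highly supported bipartition."""
    return all(
        splits_compatible(t, s, taxa)
        for s in supported_splits
        for t in tree_splits
    )

# Hypothetical five-taxon example (labels are placeholders, not real taxa).
taxa = frozenset("ABCDE")

# Bipartitions assumed to occur in more than 90% of bootstrap replicates.
supported = [frozenset("AB")]

# A hypothetical confidence set: each tree is listed by its internal bipartitions.
confidence_set = {
    "tree1": [frozenset("AB"), frozenset("DE")],  # ((A,B),C,(D,E))
    "tree2": [frozenset("AC"), frozenset("DE")],  # ((A,C),B,(D,E))
}

excluded = [
    name
    for name, splits in confidence_set.items()
    if not tree_compatible(splits, supported, taxa)
]
```

For fully resolved trees, being compatible with a bipartition is the same as containing it; the pairwise test is used here so that the sketch also handles trees with polytomies.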

The results for these three data sets show that parsimony and maximum-likelihood analyses agree to a considerable degree when applied to real data. The two methods also provide similar measures of the support for nonoptimal tree topologies. We do not know the true phylogeny for any of these groups, so a conclusion that parsimony is likely to be a consistent estimator of the phylogeny for these data sets is only valid if the assumptions of maximum likelihood are met to a sufficient degree. There is some cause for concern on this point, because the implementation of maximum likelihood used in the PHYLIP package (version 3.4) shares with parsimony a number of important assumptions about the evolutionary process. For example, both methods require an assumption that the evolution of individual nucleotides is independent of the states at other positions (an assumption of particularly dubious validity in the case of ribosomal RNA genes, because of constraints imposed by both functional and secondary structure requirements). Both methods are also critically dependent on the sequence alignment for determination of which nucleotide positions are considered to be homologous. The same alignment was used for both methods in all three analyses presented here, so any bias due to misalignment will be shared by both methods. It is not known whether there are any significant differences between likelihood and parsimony with respect to how robust the methods are to violations of these assumptions about the evolutionary process.

Advances in computing power and improved models of nucleotide sequence evolution should lead to the development of more sophisticated maximum-likelihood methods, which should allow relaxation of many of the assumptions required by the implementation that we used here. This may reduce the similarity between parsimony and likelihood results. For example, Thorne and Kishino (1992) have recently described an approach in which the likelihoods of all possible alignments are incorporated into the phylogenetic estimate, and newer versions of DNAML (i.e., version 3.5) allow estimation using different rates at different positions without specifying the rate for any particular position in advance. Some of the assumptions of parsimony analysis can also be relaxed. For example, nonindependent characters can be treated under parsimony by altering the weight accorded to those characters. However, under parsimony the correlated characters must be identified a priori, while advanced maximum-likelihood methods might allow estimation of nonphylogenetic correlations from the data. Also, Steel et al. (1993) have recently suggested that any method of tree selection, including parsimony, can be a consistent estimator of phylogeny if an appropriate correction is applied for unobserved multiple changes at a given character.

Acknowledgments

R.W.D. is supported by a postdoctoral fellowship from the Alfred P. Sloan Foundation. This work was partially supported by National Science Foundation (NSF) grant DEB92-07522. We thank David Penny and one anonymous referee for comments that substantially improved the presentation of this work. This work is supported in part by the Supercomputer Computations Research Institute, which is partially funded by the U.S. Department of Energy contract No. DE-FC05-85ER25000.

LITERATURE CITED

ABELE, L. G. 1991. Comparative molecular and morphological phylogeny of the Decapoda. Memoirs of the Queensland Museum 31:101-108.

ABELE, L. G., W. KIM, and B. E. FELGENHAUER. 1989. Molecular evidence for the inclusion of the phylum Pentastomida in the Crustacea. Mol. Biol. Evol. 6:685-691.

ABELE, L. G., T. SPEARS, W. KIM, and M. APPLEGATE. 1992. Phylogeny of selected maxillopodan and other crustacean taxa based on 18S ribosomal nucleotide sequences. Acta Zool. 73:373-382.

BARRY, D., and J. A. HARTIGAN. 1987. Statistical analysis of hominoid molecular evolution. Stat. Sci. 2:191-210.

BOWMAN, T. E., and L. G. ABELE. 1982. Classification of the recent Crustacea. Pp. 1-27 in L. G. ABELE, ed. The biology of Crustacea. Vol. 1. Academic Press, New York.

DEBRY, R. W. 1992a. Biogeography of new world taiga-dwelling Microtus (Mammalia: Arvicolidae): a hypothesis test that accounts for phylogenetic uncertainty. Evolution 46:1347-1357.

---. 1992b. The consistency of several phylogeny-inference methods under varying evolutionary rates. Mol. Biol. Evol. 9:537-551.

ECK, R. V., and M. O. DAYHOFF. 1966. Atlas of protein sequence and structure 1966. National Biomedical Research Foundation, Silver Spring, Md.

EDWARDS, A. W. F., and L. L. CAVALLI-SFORZA. 1964. The reconstruction of evolutionary trees. Pp. 67-76 in V. H. HEYWOOD and J. MCNEILL, eds. Phenetic and phylogenetic classification. Systematics Association, London.

FELSENSTEIN, J. 1973. Maximum likelihood and minimum steps methods for estimating evolutionary trees from data on discrete characters. Syst. Zool. 22:240-249.

---. 1978. Cases in which parsimony or compatibility methods will be positively misleading. Syst. Zool. 27:401-410.

---. 1981. Evolutionary trees from DNA sequences: a maximum likelihood approach. J. Mol. Evol. 17:368-376.

---. 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783-791.

---. 1991. PHYLIP, version 3.4. University of Washington, Seattle.

FELSENSTEIN, J., and H. KISHINO. 1993. Is there something wrong with the bootstrap on phylogenies? A reply to Hillis and Bull. Syst. Zool. 42:193-200.

FITCH, W. M. 1971. Toward defining the course of evolution: minimum change for a specific tree topology. Syst. Zool. 20:406-416.

FITCH, W. M., and E. MARGOLIASH. 1967. Construction of phylogenetic trees. Science 155:279-284.

HARVEY, P. H., and M. D. PAGEL. 1991. The comparative method in evolutionary biology. Oxford University Press, New York.

HASEGAWA, M., and H. KISHINO. 1989. Confidence limits on the maximum-likelihood estimate of the Hominoid tree from mitochondrial-DNA sequences. Evolution 43:672-677.

HENDY, M. D., and D. PENNY. 1989. A framework for the quantitative study of evolutionary trees. Syst. Zool. 38:297-309.

KIM, W., and L. G. ABELE. 1990. Molecular phylogeny of selected decapod crustaceans based on 18S rRNA nucleotide sequences. J. Crustacean Biol. 10:1-13.

KIMURA, M. 1980. A simple model for estimating evolutionary rates of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16:111-120.

KISHINO, H., and M. HASEGAWA. 1989. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J. Mol. Evol. 29:170-179.

KISHINO, H., T. MIYATA, and M. HASEGAWA. 1990. Maximum likelihood inference of protein phylogeny and the origin of chloroplasts. J. Mol. Evol. 31:151-160.

SAITOU, N., and M. NEI. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4:406-425.

SPEARS, T., L. G. ABELE, and M. A. APPLEGATE. 1994. Phylogenetic study of cirripedes and selected relatives (Thecostraca) based on 18S rDNA sequence analysis. J. Crustacean Biol. 14:641-656.

STEEL, M. A., M. D. HENDY, and D. PENNY. 1993. Parsimony can be consistent! Syst. Biol. 42:581-587.

SWOFFORD, D. L. 1992. PAUP: phylogenetic analysis using parsimony, version 3.0s. Illinois Natural History Survey, Champaign.

TEMPLETON, A. R. 1983. Phylogenetic inference from restriction endonuclease cleavage site maps with particular reference to the evolution of humans and apes. Evolution 37:221-244.

THORNE, J. L., and H. KISHINO. 1992. Freeing phylogenies from artifacts of alignment. Mol. Biol. Evol. 9:1148-1162.

TAKASHI GOJOBORI, reviewing editor

Received August 9, 1994 Accepted September 30, 1994