8/15/2019 Application of a Discrete-Character Parsimony Phylogeny-Inference Algorithm to Classical Text Stemmata
Note regarding 2007 republication: Source and binary code referenced in note 34 are no longer at that
URL. The same assets are now hosted at http://www.selfmummy.com/mss2dna
Adam Breindel
Department of Classics, Brown University
May 1998
The Application of a Discrete-Character Parsimony
Phylogeny-Inference Algorithm to Classical Text Stemmata
The purpose of this paper is to present two interdisciplinary observations, a new
technique for stemmatic analysis, and preliminary results from an application of this
technique.
The first interdisciplinary observation is that the methods and purpose of
stemmatics overlap substantially with the methods and purpose of the biological
sub-discipline of cladistic analysis. While this fact is rarely emphasized or exploited, it is not
a new discovery, and its history will be discussed. The second interdisciplinary
observation is that computer software which has been developed for biologists in order
to solve problems in cladistic analysis now offers us the possibility of advances in the
construction of textual stemmata, through a non-traditional use of traditional
manuscript collations.
The analytic technique contained herein, which does not appear ever to have
been attempted heretofore, is the application of an existing cladistic analysis software
package to the stemmatic analysis of a manuscript collation. The use of this technique to
analyze part of the Sallustian corpus is thoroughly documented in this study.
Preliminary results indicate that the technique produces a stemma nearly identical to
that published by L.D. Reynolds, the editor of the Oxford text. Hence, this method
appears to offer an effective new approach to evaluating the relationships among extant
versions of a text.
The interplay between the disciplines of biological systematics, genetics, and
textual criticism, which makes this paper possible, has a somewhat Byzantine history
spanning the last thirty years. I ask the reader to consider with charity my exposition of
this history. For it seems that the relative uniqueness of this paper demands an
unusually large amount of background information.1
Background
In 1968, John G. Griffith published a paper entitled "A Taxonomic Study of the
Manuscript Tradition of Juvenal."2 In this study, Griffith applied methods of numerical
taxonomy to the classification of Juvenal manuscripts. The taxonomic methods, as he
explains in a similar article the following year,3 he had in turn learned from biologist
Robert Sokal's 1966 Scientific American article on that topic.
Griffith describes the biological advances which he exploits in analyzing the texts
of Juvenal:
1 I have found it necessary in the course of this paper to refer to some technical aspects of systematics and
genetics. I have attempted to restrict to an elementary level the familiarity required with these disciplines,
in order to make this work accessible to a broad audience. Nonetheless, readers seeking an introductory
exposition may find appropriate sections of the following textbooks useful:
Gamblin, Linda and Gail Vines, eds. (1991) The Evolution of Life, Oxford, chapter 3.
Maxson, Linda R. and Charles H. Daugherty (1992) Genetics: A Human Perspective,
Dubuque, Iowa, chapters 8, 10.
Minkoff, Eli C. (1983) Evolutionary Biology, Reading, Mass., chapter 22.
2 Griffith, John G. (1968) "A Taxonomic Study of the Manuscript Tradition of Juvenal," Museum Helveticum
25:101-38.
3 Griffith, John G. (1969) "Numerical Taxonomy and Some Primary Manuscripts of the Gospels," Journal of
Theological Studies 20:389-406.
Scientists have long been aware of the limitations of the traditional
methods of classifying specimens; biologists in particular have laboured
under this handicap. Within the last 10-15 years considerable advances
have been made, largely because techniques developed for computer use
have enabled specialists in this activity, who style themselves numerical
taxonomists, to sift with speed and precision large masses of
unpromisingly heterogeneous material, and thereby to isolate groups or
taxa of related specimens, on the basis of which further inquiry may be
conducted. ...4
Thus, Griffith identifies a requirement which textual criticism has in common with
taxonomy: in both disciplines, objects must be grouped based on small numbers of
distinctions among vast amounts of similarity. The numerical taxonomy methods appear,
he says, to offer new quantitative approaches applicable to both problems. He then
expresses the hope that we might find associations between specimens by evaluating
large amounts of data with machine assistance. In light of the existing resources,
though, he remarks that for a textual critic operating with only a few thousand lines of
text it is simply not worth the trouble of programming the data for machine-processing...5
The limitations to Griffith's pioneering approach were unfortunately several. His
procedure was, first, extraordinarily laborious: for the fourteen Gospel manuscripts
analyzed in his article of 1969, up to fifty-six manual recording acts were required for
every variant among one or more of the manuscripts. Thus he was constrained to look
at only small samples of the data. Moreover, even if he had had access to more data, he
would likely have lacked access to the technology to evaluate it.
4 Griffith, op. cit. 1968, pp. 113-14.
5 ibid.
Griffith's procedure (and, in all fairness, the biological methods with which he worked)
had a more troublesome limitation in that it resulted only in associations of objects.
Griffith could assert the distribution of manuscripts into various sub-groups with
statistically-argued accuracy, but the mere grouping of the manuscripts does not seem
to have accomplished much. His methods said nothing about the genealogical
relationships of the manuscripts. For example, if manuscripts A, B, and C are found to
be in a single taxon, we have only formalized their external similarity. As useful as such
formalization might be, little is indicated about the genealogical relationships likely to
inhere between the manuscripts.
Thus, Griffith succeeded in bringing numerical taxonomy into the arena of
textual criticism, but the biological approach upon which he depended was not
ambitious enough to describe the relationships among the specimens, and so his textual
techniques appear to have fallen into desuetude.
In 1973, Martin West published a short work on textual criticism, Textual
Criticism and Editorial Technique.6 In this work, West explains that computers might
theoretically hold some promise for stemma construction, because, under the best
possible circumstances, building a stemma demands only simple logic. Such a stemma
would naturally be an advance over Griffith's taxonomic manuscript associations. West
is, however, skeptical about the idea and holds out some theoretical reservations:
If provided with suitably prepared transcriptions of the manuscripts,
purged of coincidental errors, a computer could draw up a clumsy and
unselective critical apparatus; and it could in principle – where there was
no contamination! – work out an unoriented stemma. That means ... that
it could work out a scheme simply by comparing the variants, without
6 West, Martin L. (1973) Textual Criticism and Editorial Technique, Stuttgart.
regard to whether they were right or wrong; but this scheme would be
capable of suspension from any point [i.e., the scheme could not
distinguish the subarchetypes] ... The correct orientation could only be
determined by evaluating the quality of the variants, which no machine is
capable of doing.7
West's objections will be considered in detail later, as they are important to the
present investigation. But it is worth noting for now that even if West had wanted to
test a computerized construction of a stemma, there would have been obstacles to his
progress.
First, there would not have been readily available technology for his purpose.
But more importantly, outside of theoretical computer science or mathematical graph
theory, there had not been practical research on automating the construction of
stemmata when the data for the specimens is inconsistent or underdetermined. That is, if the
variants in a set of manuscripts were completely compatible with a unique stemma, we
would need only make the right inferences to generate it. In reality, though, there is
usually no stemma which is consistent with every locus in the manuscripts;
conversely, if a degree of latitude is allowed so as to overcome such strict
inconsistencies, we find a multitude of possible stemmata. These stemmata we must
distinguish on the basis of some criterion capable of evaluating the likelihood that each
would give rise to the manuscripts as they exist.
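The idea of variants being compatible with a single stemma can be made concrete for the simplest case, a locus with exactly two readings. The following is a minimal sketch and not part of the study's procedure: the sigla and the three loci are hypothetical, and the test applies only to two-reading loci on an unrooted tree. Two such loci can be accommodated by one tree only if at least one of the four pairings of their reading-groups shares no manuscript.

```python
def compatible(group1, group2, all_mss):
    """Return True if two two-reading loci can sit on the same unrooted tree.

    group1 and group2 are the sets of manuscripts carrying one of the two
    readings at each locus; the other reading is carried by the remainder.
    """
    a, b = set(group1), set(group2)
    a_rest, b_rest = all_mss - a, all_mss - b
    # Compatible if some pairing of reading-groups has no manuscript in common.
    return any(not (x & y) for x in (a, a_rest) for y in (b, b_rest))


# Hypothetical sigla and loci, for illustration only.
MSS = set("ABCDE")
locus_1 = set("AB")    # A and B agree against C, D, E
locus_2 = set("ABC")   # A, B, C agree against D, E
locus_3 = set("BC")    # B and C agree against A, D, E

print(compatible(locus_1, locus_2, MSS))  # nested groupings
print(compatible(locus_1, locus_3, MSS))  # overlapping groupings
```

The first pair is nested and so raises no conflict; the second pair overlaps in a way that no single tree can explain without at least one extra change, which is the kind of inconsistency the paragraph above describes.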
Thus, a variety of difficult problems, theoretical and computational, inhere in the
task of mechanically constructing a stemma, and they are not problems which
classicists were likely to attack on their own. Fortuitously, however, development had
simultaneously been taking place within the biological disciplines of taxonomy and
7 West, op. cit., pp. 71-2.
systematics so as to motivate biologists to attempt these same problems. For derivation
of the evolutionary relationships of a group of extant specimens was a key part of the
emergent study now called cladistics.
Biologist Willi Hennig had begun to develop and advocate a strictly phylogenetic
approach to arranging organisms.8 Hennig's view, that the evolutionary relationships of
organisms formed the best foundation for classifying and systematizing them, was and
remains the object of debate.9 Parts of his theory, however, seem to have been adopted
or adapted by increasing numbers of systematists throughout the 1970s.
The cladistic approach seems intuitively obvious, and G.D.C. Griffiths (along
with many defenders) insisted that it alone had the advantage of relying on objective
fact about the organisms in question (rather than deploying the organisms into classes
invented by humans). Griffiths writes, "[Hennig's method] provides the only
theoretically sound basis for achieving an objective equivalence between the taxa
assigned to particular categories in a phylogenetic system."10 Unfortunately, what seems
intuitively obvious can also be deceptively fallacious, and cladistics does have a
disingenuous side. It is worth pointing out two objections to the system here, largely so
that the reader may see that they do not apply to a textual application of the theory.
8 Hennig might be called the father of modern cladistics; his work was developed and debated in various
publications including (1950) Grundzüge einer Theorie der Phylogenetischen Systematik, Berlin.
(1966) Phylogenetic Systematics, Urbana, Illinois.
(1971) "Zur Situation der biologischen Systematik," Erlanger Forschungen, R. Siewing ed.,
Erlangen.
9 For views on the early intellectual positions in the debates, see Ernst Mayr (1976) Evolution and the
Diversity of Life: Selected Essays, Cambridge, Mass., pp. 435-41.
10 Griffiths, G.D.C. (1972) The Phylogenetic Classification of Diptera Cyclorrhapha with Special
Reference to the Structure of the Male Postabdomen, W. Junk, N.V., The Hague.
First, even if we are granted a thorough knowledge of the evolutionary
interrelationships of the specimens in question, no method is thereby presented for
determining the level of descent at which class divisions should be made. We are only
shown that, having made a choice, we are bound to include and exclude certain
specimens.
Second, given three organisms, A, B, and C, suppose that A and B are similar in
form, while C differs greatly from both A and B. Suppose further that A and C are
closer evolutionarily to one another than either is to B. In this situation, which is not
uncommon in nature, we would be forced under Hennig's system either to class A, B
and C all together, or else to class A and C together against B (Figure 1).
Figure 1
Neither of these options appeals to our intuition the way that the system at first did. For
A and B appear to form a group as against C, and yet this is precisely the classification
which we are prohibited from making.
These two objections, while having much practical import for the classifying of
organisms, will clearly be irrelevant when we come to apply this method to
manuscripts. First, we needn't classify manuscripts by name (and if we do, we accept
that classification as our own production); second, we have no sympathy for similarity
of appearance between manuscripts if we have hard evidence that they are unrelated in
origin (since it is the origin that is the object of the textual quest).
These cladistic methods of analysis and classification, even if controversial,
prompted research into the creation and evaluation of stemmata (or cladograms) from
incomplete and incompatible data. The cladistic approach depends for a starting point
on determining the evolutionary relationships of the specimens, and these
relationships must be assembled from lists of variations among the specimens. Hence,
in a sense, biologists set to work on the problems which had stood in front of Martin
West.
But debate about the philosophical underpinnings of the cladistic methodology
did not subside. In 1977, the methodology attracted a defender in University of
Michigan classicist and zoologist H. Don Cameron, due to cladistics' evident similarity
to established techniques in traditional (i.e., not mechanical) textual criticism.11 Cameron
and Norman I. Platnick describe the debate, and situate themselves in it, thus:
Recent years have seen an increasing awareness and use among zoological
systematists of the theory and methods of phylogenetic analysis
(cladistics) developed by Hennig. These methods have been well
defended by [E.O.] Wiley from the point of view of Popperian
hypothetico-deductive science. Critics, both of the methods themselves
and of their application to classification, have not been silent... The
purpose of this paper is to point out a fact overlooked during the
controversy, namely, that methods analogous to those of Hennig are
accepted as the standard tools of analysis in two other fields that resemble
phylogenetic systematics in being primarily concerned with constructing
and testing hypotheses about the interrelationships of taxa connected by
ancestor-descendant sequences.
11 Platnick, Norman I. and H. Don Cameron (1977) "Cladistic Methods in Textual, Linguistic, and
Phylogenetic Analysis," Systematic Zoology 26:380-85.
The fields referred to are ... textual criticism ... and ... linguistic
reconstruction.12
Cameron and Platnick, writing for an audience of biologists, next summarize the
techniques of textual criticism put forth by Paul Maas.13 Differences of technique
between biological and textual stemmatics, which Cameron and Platnick view as
subordinate to an overarching similarity, are described in moderate detail.14 The paper
is intended to provide a critique of a situation within a discipline of biology, but it
serves also to indicate that these scholars can recognize and make precise the
correspondence between stemma construction and cladistic analysis.
In a conference concluded in 1983, Cameron again presented his view of textual
criticism. The conference had been organized to investigate the biological and cladistic
metaphor in other intellectual fields.15 Cameron treated stemmatics, but he did not
discuss stemmata as a metaphor from biology, since, as he points out, the stemmatic
methods as used in both fields were developed by classical scholars systematically in
the nineteenth century and ... the origins of the method can be found as early as the
sixteenth century...16 Beyond merely recounting the techniques of Maas, Cameron
explores the distinction, as far as it impacts his cladistics-stemmatics comparison,
between vertical or uncontaminated traditions and horizontal transmissions, those
12 Platnick, op. cit., p. 380.
13 Maas, P. (1958) Textual Criticism, Oxford; Platnick, op. cit., pp. 381-3.
14 Platnick, op. cit., p. 384.
15 "Biological Metaphor Outside Biology" (1982) and "Interdisciplinary Round-Table on Cladistics and Other
Graph Theoretical Representations" (1983) symposia at the University of Pennsylvania. Proceedings in
Hoenigswald, Henry M. and Linda F. Wiener, eds. (1987) Biological Metaphor and Cladistic Classification,
Philadelphia.
16 Cameron, H.D. (1987) "The Upside-Down Cladogram: Problems in Manuscript Affiliation," in
Hoenigswald, op. cit.
full of Byzantine, and even ancient, editing and conjecture.17 In the latter cases,
cladistic methods give little aid. But in the former, he concludes:
[V]ertical transmission and uncontaminated text tradition make the
mechanical application of cladistic methods to reconstruct a single
archetype a workable and successful method, with a claim to being
scientific...18
Thus, Cameron argues that, at least in a vertical textual tradition, we ought to be able to
use methods from cladistics to derive a stemma and even an archetype.
At this point, the next move for a textual critic might have appeared obvious:
mate West's insight about mechanical production of stemmata with Cameron's insight
that cladistics provides the theoretical and algorithmic underpinning for West's
operation. That is, use cladistic techniques to attack thorny problems of textual
transmission. It is unclear why this approach was not exploited in the 1980s. We might,
however, hypothesize a paucity of tools to support such research.
In the 1980s, three further developments came about which made the project
presented herein more practicable.19 One breakthrough was improved DNA
sequencing:20 it became possible to put genetic material from various species into an
automated process and receive, as output, essentially a collation showing every genetic
difference between the samples.21 More abundant data was now available with which
cladistic analysis could work.
17 Cameron, op. cit., p. 238.
18 ibid.
19 It is important to note that none of these three developments sprang fully formed from the head of Zeus
in the 1980s. It is convenient to describe them here, as their confluence seems to change the research
environment at the time, but research on DNA sequencing, parsimony algorithms, and of course
computers had a long prior history.
20 In particular the development of polymerase chain reaction (PCR) duplication of DNA segments.
21 That is, in the sequenced strands of DNA.
The second development of this time period was the availability of computers
sophisticated enough to compare and evaluate the thousands or tens of thousands of
possible cladograms (stemmata) which might result from comparing large numbers of
species. That is, computers allowed biologists to overcome that challenge which Maas
had identified for textual critics, when he observed that a large number of specimens or
witnesses would produce an astronomical number of possible stemmata.22
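The growth Maas describes can be illustrated with a standard count from phylogenetics. The sketch below is an illustration only: it counts unrooted, strictly bifurcating trees on n labelled specimens, which number (2n-5)!! (the product of the odd numbers up to 2n-5). This is a narrower class than Maas's "types of stemma," which also admit direct descent of one witness from another, so the figures differ from his while showing the same explosive growth.

```python
def unrooted_binary_trees(n):
    """Number of distinct unrooted, fully bifurcating trees on n labelled tips."""
    if n < 3:
        return 1
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5.
    for k in range(3, 2 * n - 4, 2):
        count *= k
    return count


for n in (4, 5, 9, 11):
    print(n, unrooted_binary_trees(n))
```

For the nine and eleven manuscripts considered in this study the counts are 135,135 and 34,459,425 respectively, which suggests why exhaustive comparison by hand is impractical.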
The last prerequisite development was software systems to put large quantities
of data (whether from DNA or elsewhere) together with the computers. Software to
compute likely stemmata involves, at its core, algorithms which have been topics in
computer science and mathematics for a half-century or more. Hence, strictly speaking,
appropriate software had probably been in development in research universities and
corporate labs for some time. But the early 1980s saw the release of packages designed
specifically for cladistics, tailored to the needs of practicing biologists, and ready to run
on existing microcomputers.
The present experimental study, described below, is an attempt to establish a
stemma for the textual tradition of Sallust's De Coniuratione Catilinae using one such
software package, the freely-distributable Phylogeny Inference Package (henceforth
PHYLIP).23
22 Maas, op. cit., p. 47: "If we have four witnesses, the number of possible types of stemma amounts to 250,
if we have five, to approximately 4,000, and so on in quasi-geometrical progression."
23 Felsenstein, J. (1993) PHYLIP (Phylogeny Inference Package) version 3.5c, distributed by the author, Dept.
of Genetics, Univ. of Washington, Seattle. See http://evolution.genetics.washington.edu
Before proceeding to describe the method and outcome of the experiment, it is
appropriate to consider two technical objections which textual critics have put forward
concerning stemma construction.
The first objection is that of M.L. West, quoted above. West correctly pointed out
that any stemma derived by algorithm would be an unoriented stemma (or, as the
cladists say, an unrooted cladogram).24 That is, the algorithm could determine the
branchings of the stemma but could not ascertain which branching belongs at the
top (in practice, this amounts to identifying the nodes representing the subarchetypes).
An unrooted cladogram (Figure 2) can represent several distinct rooted versions (Figure
3). Each rooted cladogram can, in turn, represent several distinct possible phylogenies
(Figure 4).25
Figure 2. Unrooted cladogram. This cladogram shows the relationships of the specimens relative to one another, but does not indicate their relationship to ancestors from which they descend.
Figure 3. Rooted cladograms. Each of these five rooted cladograms is consistent with the unrooted cladogram above (Figure 2). By postulating the first branching in the descent, the known relationships specify the remainder of the tree. Note, however, that the lengths of branches, and the specimens which might lie on the nodes of the tree, are not indicated.
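The relationship between an unrooted cladogram and its rooted versions can be sketched in a few lines. The example is a hedged illustration rather than a reconstruction of the figures: the four tip names and the internal node names x and y are hypothetical. It places the root on each branch of an unrooted four-specimen tree in turn; since such a tree has five branches, five rooted cladograms result.

```python
from collections import defaultdict

# Hypothetical unrooted tree: tips A, B, C, D; internal nodes x and y.
EDGES = [("A", "x"), ("B", "x"), ("x", "y"), ("C", "y"), ("D", "y")]


def rootings(edges):
    """Return one rooted tree (as nested tuples) per branch of an unrooted tree."""
    neighbours = defaultdict(list)
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)

    def subtree(node, parent):
        # Everything reachable from `node` without going back through `parent`.
        children = [n for n in neighbours[node] if n != parent]
        if not children:
            return node  # a tip
        return tuple(sorted((subtree(c, node) for c in children), key=str))

    # Placing the root on branch (u, v) splits the tree into two subtrees.
    return [tuple(sorted((subtree(u, v), subtree(v, u)), key=str))
            for u, v in edges]


for tree in rootings(EDGES):
    print(tree)
```

Each printed tuple is one rooted arrangement; choosing among them requires information external to the variants, such as the dates of the manuscripts.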
24 West, op. cit., pp. 71-2.
25 Humphries, C.J. and P.H. Williams (1994) "Cladograms and Trees in Biodiversity," Models in Phylogeny
Reconstruction, Robert W. Scotland, Darrell J. Siebert, and David M. Williams, eds., Oxford, pp. 336-7.
Figure 4. Phylogenetic Trees. All four of these phylogenetic trees are compatible with a single cladogram above (Figure 3.ii). Note that schemata involving direct descent are included.
West's objection is legitimate. It should not, though, prevent us from pursuing
automated stemma construction, for several reasons. First, the unrooted cladogram is, if
accurate, a great advance over no stemma and an even greater advance over an
incorrect stemma. Second, it may in many cases be tolerably easy to root the
cladogram properly, thus producing a traditional stemma, based on our knowledge of the dates
and locales of origin for the various manuscripts. Third, computer methods are
particularly useful in the frequent circumstance that the collation is not uniquely
compatible with any single proposed stemma. In such cases, we shall be happy to have
an analysis of the entire collation, a most-likely stemma, and a mathematical
justification for excluding many other stemmata.
The second objection is one advanced by Roger David Dawe in studies of the
traditions of Aeschylus and Sophocles.26 Dawe's contention is that there is so much
horizontal transmission in the traditions for these authors, as indicated by numerous
true readings appearing in dependent manuscripts though absent in other manuscripts,
as to invalidate the stemmatic approach.27 Dawe confronts the methodology of Pasquali
26 Dawe, R.D. (1964) The Collation and Investigation of Manuscripts of Aeschylus, Cambridge
and (1973) Studies on the Text of Sophocles, 2 vols., Leiden.
27 Cameron, op. cit., p. 237.
and consequently confronts my method, which derives partly through Pasquali, Maas,
and West, at least in the case of individual authors such as Aeschylus. He writes:
We believe that the fact of unique preservation has been demonstrated [in
the Aeschylean case]; consequently the fault must lie with the theory of
descent, and we conclude that the ... stemma does not after all represent,
even in the simplest form, the true character of the tradition. ...
It seems clear that the picture presented by the manuscripts is one of a
recension so entangled that it is utterly impossible for us to unravel the
threads.28
Cameron summarizes the problems which Dawe's assertion poses to any method such
as the one employed in the present study:
Dawe denies radically that archetypes can be reconstructed, but he
necessarily pays a theoretical price for his conclusion...
If there are no archetypes or stemmata, and if true readings are uniquely
preserved in any manuscript regardless of its stemmatic position, we are
then thrown back to a procedure of evaluating readings which is unaided
by considerations of outgroup comparison, reconstruction of an
archetype, or to push the concept to its logical conclusion, without the
consideration of manuscript authority of any kind.29
In order that we may avoid an imbroglio in Aeschylean Textkritik, we might concede
Dawe's assertion to hold true in certain specific textual traditions. But we need not
suppose that any particular number of such traditions invalidates the deductive
stemmatic method in general. Hence, in the absence of any argument against stemmatic
representation of the Sallustian tradition, we can proceed to analyze it via the cladistic
approach.
Experimental Procedure
28 Dawe, op. cit. 1964, pp. 157-8.
29 Cameron, op. cit., pp. 237-8.
In this study, the manuscripts containing the De Coniuratione Catilinae and the De
Bello Iugurthino were examined, as these two works are found together in one set of
manuscripts. Absent access to a complete collation, an adapted collation was formed by
the following method. Eleven manuscripts were selected from those included in L.D.
Reynolds' Oxford text of 1991 (Table 1).
Siglum  Manuscript
A       Parisinus 16025
B       Basileensis
C       Parisinus 6085
D       Parisinus 10195
F       Hauniensis Fabricianus
H       Berolinensis Phillippsianus 1902
K       Vaticanus Palatinus 887
N       Vaticanus Palatinus 889
P       Parisinus 16024
Q       Parisinus 5748
V       Vaticanus 3864 (Florilegium Vaticanum)
Table 1
Beginning at Catilina 1.1, the first 300 loci were selected which contain variants in
one or more of the above eleven manuscripts.30 The adapted collation was then formed
by listing, for each locus, the groups of manuscripts which exhibited the same reading.
The collation then consisted of a sequence of rows such as appear in Table 2.
Locus   Group 1   Group 2   Group 3   Group 4
[rows 1-11]
12      ABCDFNP   HK        V
13      C         N         A         BDFHKPV
[rows 14-300]
Table 2
30 To be more precise, in keeping with the biological metaphor, only the latest markings in the
manuscripts were collated. Thus, as corrected markings were ignored, loci containing variants in earlier
hands are not included in the 300. The selected loci do, however, include every variant in the last hand (at
each locus) of the appropriate manuscript from Catilina 1.1 to 52.35.
To analyze the collation, the DNAPARS component of the PHYLIP package was
employed, because it is the only component of PHYLIP which can process multi-state
discrete characters (albeit by marking the states with DNA labels).31 DNAPARS is a
program which compares DNA base sequences for a set of specimens and evaluates
various possible cladograms on the basis of a parsimony criterion.
A parsimony criterion favors arrangements of the specimens which require the
fewest character state changes in the course of the specimens' evolution. For example, a
phylogeny which requires a specimen possessing a DNA sequence of AAA to give rise
to one possessing ACT and, thereafter, requires the specimen possessing ACT to give
rise to one possessing the sequence AAA again would not be favored. This proposed
phylogeny requires two bases to change state (AA to CT) and later to change again
(back to AA), involving four base changes overall. Instead, a parsimony criterion might
favor an arrangement where one specimen featuring the AAA sequence gives rise to the
other with the AAA sequence, and the latter gives rise to that possessing the ACT
sequence.32 This latter phylogeny requires only a single change of two bases, or two
character state changes overall, and is thus more parsimonious than the former.
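The arithmetic of this example can be checked with a short sketch. It is an illustration of the counting only, under the simplifying assumption that each specimen descends directly from the one before it in a chain; DNAPARS itself evaluates branching trees with inferred ancestors, which this sketch does not attempt. The function names are chosen here for illustration.

```python
def changes(parent, child):
    """Count the positions at which two equal-length sequences differ."""
    if len(parent) != len(child):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(parent, child))


def chain_cost(chain):
    """Total character-state changes along a single line of descent."""
    return sum(changes(p, c) for p, c in zip(chain, chain[1:]))


disfavored = ["AAA", "ACT", "AAA"]  # AAA gives rise to ACT, which reverts to AAA
favored = ["AAA", "AAA", "ACT"]     # AAA gives rise to AAA, which gives rise to ACT

print(chain_cost(disfavored))  # 4
print(chain_cost(favored))     # 2
```

The two totals, four and two, match the counts given in the text for the disfavored and favored arrangements.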
Further assumptions involved in the parsimony method, and differing views
about them, are listed (or references provided) by Felsenstein.33
In order to evaluate the collation using DNAPARS, the collation data had to be
converted from the form illustrated in Table 2 to a form wherein manuscripts grouped
31 See "Frequently Asked Questions," Felsenstein, op. cit.
32 This phylogeny might (rather than must) be favored because one can observe other possible phylogenies with only two
character state changes. Such phylogenies would be equally parsimonious with the one given, and hence
would be judged equally desirable by a parsimony criterion.
33 "DNAPARS -- DNA Parsimony Program" (documentation) in Felsenstein, op. cit.
by a shared reading were each assigned a particular DNA base abbreviation (A, C, G, T,
or -, which indicates a fifth state to DNAPARS). The DNA base label assigned to a
manuscript at a particular locus would correspond to the group in which that
manuscript resided at that locus.
Each row of the collation would yield one DNA base label for each manuscript;
thus the 300 loci in the collation would produce a 300-base DNA strand for each of
the eleven manuscripts. The creation and data entry of these 3,300 base labels was
beyond what could easily be accomplished manually. To perform the task, a custom
application program was written (MSS2DNA) which allows the entry of the collation in
table form, performs the translation to sequences of DNA base labels for the various
manuscripts, and places the results on the Microsoft Windows clipboard (Figure 5).34
From the clipboard , the DNA d ata for the various manu scripts was assembled
with a text editor into the file format required by DNAPARS, as documente d by
Felsenstein.35
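The assembly step was done by hand in a text editor, but it could equally be scripted. The sketch below assumes the sequential layout visible in Appendix A: a first line giving the number of manuscripts and the number of loci, followed by one line per manuscript consisting of a ten-character name and then its strand. The function name is hypothetical; Felsenstein's documentation remains the authority on the format.

```python
def format_infile(strands: dict[str, str]) -> str:
    """Render strands as a count line followed by one line per manuscript."""
    lengths = {len(s) for s in strands.values()}
    if len(lengths) != 1:
        raise ValueError("all strands must cover the same number of loci")
    lines = [f"{len(strands)} {lengths.pop()}"]
    for name, strand in strands.items():
        if len(name) > 10:
            raise ValueError(f"name {name!r} is longer than ten characters")
        # Names are padded to exactly ten characters; the strand follows directly.
        lines.append(name.ljust(10) + strand)
    return "\n".join(lines) + "\n"


# Two shortened, invented strands for illustration.
toy = {"P.Par16024": "AAAC", "A.Par16025": "CCAC"}
print(format_infile(toy), end="")
```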
In order to facilitate comparison to Reynolds' stemmatic work on the
Sallust manuscripts, and because they represent only parts of the text, data for
manuscripts V (a florilegium) and Q were removed from the data file, leaving the nine
manuscripts for which Reynolds had published a stemma. In removing V and Q, some
27 (i.e., 9%) of the loci were rendered irrelevant, although they remain in the set.36
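The effect of removing manuscripts can be illustrated with a short sketch. It assumes the strands are held in a dictionary keyed by siglum, and it reports the loci at which all remaining manuscripts agree, since such loci require no state change under any arrangement. The data below are invented and the function names are hypothetical.

```python
def drop_manuscripts(strands: dict[str, str], excluded: set[str]) -> dict[str, str]:
    """Return a copy of the data without the excluded manuscripts."""
    return {name: s for name, s in strands.items() if name not in excluded}


def constant_loci(strands: dict[str, str]) -> list[int]:
    """Return the 1-based loci at which every manuscript carries the same label."""
    columns = zip(*strands.values())
    return [i for i, column in enumerate(columns, start=1) if len(set(column)) == 1]


# Invented data for illustration only (not the Sallust collation).
toy = {"P": "AAC", "A": "ACC", "V": "CAC", "Q": "AAA"}
reduced = drop_manuscripts(toy, {"V", "Q"})
print(constant_loci(toy))      # loci constant before removal
print(constant_loci(reduced))  # loci constant after removal
```

In the toy data, loci 1 and 3 vary only because of V or Q, so they become constant once those two manuscripts are removed.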
34 This program, while not elegant, is publicly available (with source code) so that others may
independently conduct investigations or repeat and verify the present investigation. The program,
MSS2DNA, runs on 32-bit Microsoft Windows platforms (Windows 95, Windows 98, Windows NT) and
may be downloaded in archived (ZIP) form at http://homer.bus.miami.edu/~adbreind/mss2dna.zip
35 "Molecular Sequence Programs" in Felsenstein, op. cit.
36 These data points represent loci at which only Q and/or V differed from the consensus of remaining
manuscripts. These sites can be identified from Appendix B, in the table marked "steps in each site," as
sites where the table shows 0 steps. That is, the remaining manuscripts show consensus at the site, so no
character state changes are required for any phylogenetic arrangement of the manuscripts.
The completed DNAPARS file appears in this report as Appendix A: Infile.
The DNAPARS program was then run, using this file as its data source.37
Figure 5. MSS2DNA. The columns collect the manuscripts which share a reading at each locus. The column headings indicate the DNA base labels which will be attached to the manuscript groups.
DNAPARS produced the output file which appears in this report as Appendix B:
Outfile, and which includes the preliminary phylogenetic tree (Figure 6). DNAPARS
was then run on the input data several more times in order that other possible most
parsimonious trees might be discovered. No other most parsimonious trees were found.
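To make concrete what it means for one tree to be more parsimonious than another, the following sketch counts the minimum number of state changes a fixed tree requires, using Fitch's counting procedure for unordered characters. It is offered as an illustration of the criterion and is not claimed to reproduce the internal workings of DNAPARS. Trees are written as nested pairs of manuscript names, and the data are invented.

```python
def fitch_site(tree, states: dict[str, str]) -> tuple[set[str], int]:
    """Return (candidate states at this node, minimum changes below it) for one site."""
    if isinstance(tree, str):  # a tip: a single manuscript
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, states)
    right_set, right_cost = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # The two subtrees can agree on a state, so no extra change is needed.
        return common, left_cost + right_cost
    # No shared state: one change must occur on a branch below this node.
    return left_set | right_set, left_cost + right_cost + 1


def tree_length(tree, strands: dict[str, str]) -> int:
    """Sum the per-site minimum changes over all sites."""
    n_sites = len(next(iter(strands.values())))
    total = 0
    for i in range(n_sites):
        site = {name: strand[i] for name, strand in strands.items()}
        total += fitch_site(tree, site)[1]
    return total


# Invented strands for four hypothetical manuscripts W, X, Y, Z.
toy = {"W": "AAC", "X": "AAC", "Y": "CTC", "Z": "CTA"}
tree_one = (("W", "X"), ("Y", "Z"))
tree_two = (("W", "Y"), ("X", "Z"))
print(tree_length(tree_one, toy))  # 3
print(tree_length(tree_two, toy))  # 5
```

Under this count the first arrangement would be preferred. A program such as DNAPARS must in addition search among the possible trees, which is why the order of the input rows can matter and why repeated runs were made.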
37 The 386-Windows precompiled PHYLIP executables were used throughout. The program options
selected for DNAPARS were all defaults with the following exceptions: "Randomize order" was selected,
with a seed of 69 (=4*17+1) and 100 permutations of the input rows; "terminal type" was set to "(none)"; "input
sequences interleaved" was set to "No"; and all printing options for the output were selected.
One most parsimonious tree found:

                         +--F.Hauniens
                      +--8
                   +--7  +--D.Par10195
                   !  !
                +--6  +-----H.Beroline
                !  !
       +--------5  +--------K.VatP_887
       !        !
       !        +-----------N.VatP_889
    +--4
    !  !                 +--C.Par_6085
    !  !              +--3
  --1  +--------------2  +--B.Basileen
    !                 !
    !                 +-----A.Par16025
    !
    +-----------------------P.Par16024

remember: this is an unrooted tree!

Figure 6
In order that the output from this program might be compared to Reynolds'
published stemma for Sallust, and in recognition of Reynolds' judgments about the
quality of the textual variants, the tree was re-oriented using PHYLIP's RETREE
program. Since manuscripts F, D, H, K, and N formed a monophyletic group and
because they had been collected in Reynolds' presentation of the Sallust stemma, the
node representing their common ancestor was selected for the outgroup (or
subarchetype). Note that although the tree was re-oriented, no changes were made to
the genealogical relationships inferred between the manuscripts by DNAPARS.38 The
transcript of the RETREE session appears in this report as Appendix C: RETREE
38 Re-orientation in effect asserts likely positions for the subarchetypes. As described above, West had
indicated that such a step would be required, and that it should be conducted using a critic's evaluation
of the variants.
Session.39 The session also produced as output a new tree file. This tree file was used as
input to PHYLIP's DRAWGRAM program, which constructed a graphical
representation of the stemma (Figure 7).
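The claim that re-orientation leaves the inferred relationships untouched can be checked mechanically. The sketch below assumes the branching printed in Figure 6, with internal nodes numbered as in the DNAPARS output and manuscripts abbreviated to their sigla. For a given choice of root it collects, for every branch, the division of the manuscripts into two sets that the branch induces; the same collection should result whichever node is treated as the root. The function names are hypothetical.

```python
# Branches of the tree in Figure 6 (internal nodes "1" to "8", tips by siglum).
EDGES = [
    ("1", "4"), ("1", "P"), ("4", "5"), ("4", "2"),
    ("5", "6"), ("5", "N"), ("6", "7"), ("6", "K"),
    ("7", "8"), ("7", "H"), ("8", "F"), ("8", "D"),
    ("2", "3"), ("2", "A"), ("3", "C"), ("3", "B"),
]


def adjacency(edges):
    """Build an undirected neighbor map from a list of branches."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def leaves_below(adj, node, parent):
    """Tips reachable from node without passing back through parent."""
    children = [n for n in adj[node] if n != parent]
    if not children:
        return frozenset([node])
    found = frozenset()
    for child in children:
        found |= leaves_below(adj, child, node)
    return found


def splits(adj, root):
    """Collect the two-way division of tips induced by every branch, walking from root."""
    all_leaves = frozenset(n for n, nb in adj.items() if len(nb) == 1)
    found = set()

    def walk(node, parent):
        for child in adj[node]:
            if child != parent:
                side = leaves_below(adj, child, node)
                found.add(frozenset([side, all_leaves - side]))
                walk(child, node)

    walk(root, None)
    return found


adj = adjacency(EDGES)
# Root at DNAPARS node 1, then at node 5 (the common ancestor of F, D, H, K, N).
print(splits(adj, "1") == splits(adj, "5"))  # True
```

Only the direction in which the tree is read changes with the root; the groupings, such as F, D, H, K, and N against the remaining four manuscripts, are the same under either orientation.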
Figure 7
For the sake of comparison, Reynolds' stemma is reproduced (Figure 8).40
Figure 8
As can be observed from the computer-generated tree and Reynolds' tree
(Figures 7 and 8), they are nearly identical modulo inversion. There are, however, two
39 The program options selected for RETREE were all defaults with the following exception: "no graphics"
was selected.
40 Reynolds, L.D., ed. (1991) C. Sallusti Crispi: Catilina, Iugurtha, Historiarum Fragmenta Selecta, Appendix
Sallustiana, Oxford, p. xi.
differences. First, Reynolds associates N and K more closely with each other than with
H, D, or F, while DNAPARS detected no such difference in proximity. Second,
Reynolds associates A more closely with P than with B or C, while DNAPARS indicated
no such closer affiliation. This latter distinction can in fact be attributed to differences in
the text being collated, rather than to differences between the analyses of Reynolds and
DNAPARS (see below).
Analysis
Since several hundred rearrangements of the order of the DNA strands
produced no further most parsimonious trees, it seems reasonable to suppose that the
manuscript collation data specify a unique most parsimonious tree.41 The existence of a
unique most parsimonious tree is itself an indication that the present method may be
productive, as it obviates the need for a human to insert prejudices into the analysis, by
selecting one cladogram from a list of many. The similarity of the results derived
through Reynolds' analysis to those derived through the parsimony analysis can, in
light of the novelty of the approach, only be called stunning.
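For a sense of the scale against which several hundred rearrangements should be judged, two counts can be computed directly: the number of possible orderings of nine input rows, and the number of distinct unrooted, fully bifurcating trees on nine labelled tips. The second uses the standard double-factorial formula (2n-5)!!. The sketch makes no claim about how thoroughly any particular search covers either space.

```python
from math import factorial


def input_orderings(n: int) -> int:
    """Number of ways to order n input rows."""
    return factorial(n)


def unrooted_binary_trees(n: int) -> int:
    """Number of unrooted, fully bifurcating trees on n labelled tips: (2n-5)!!"""
    if n < 3:
        raise ValueError("at least three tips are required")
    count = 1
    for k in range(3, 2 * n - 4, 2):  # odd factors 3, 5, ..., 2n-5
        count *= k
    return count


print(input_orderings(9))        # 362880
print(unrooted_binary_trees(9))  # 135135
```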
This similarity is further strengthened when we account for one of the two
indicated differences between the stemmata. As described above (see n. 30), in keeping
with the metaphor of biological evolution, only the latest extant markings (corrections,
not including deletions) on each manuscript were collated. Thus, where the first and
41 This supposition is based on Felsenstein's implicit assumption that a relatively small number of
rearrangements of the input data ought to yield multiple most parsimonious trees if they exist. Such an
assertion seems mathematically suspect, considering the large number of possible permutations of, say,
nine manuscripts (over 360,000). On this matter, however, I defer to Felsenstein's knowledge as a
specialist.
second hands of A differed, the second hand was read for the collation instead of the
first. Reynolds naturally constructs his stemma indicating the position of the original A
text. But he notes that "Secunda manus (A2) librum lectionibus instruxit ex aliquo stirpis
[= B, C] codice petitis." That is, where readings exist in A2, they come from the B-C
branch, which fact DNAPARS appears to have recognized, in asserting the A-A2
manuscript to descend both from an ancestor of P and also from a closer ancestor of B
and C. To test this hypothesis, we would merely need to modify the collation to reflect
only A-A1 readings, and then see where DNAPARS places the manuscript.
Having taken the discrepancies into account, it seems that both the human and
the machine-assisted analysis derive results from the same underlying pattern among
the manuscript readings. This study, then, preliminarily suggests that the parsimony
analysis technique could substantively advance knowledge of textual transmission.
Furthermore, the parsimony analysis can indicate the readings likely to appear in
the archetype and subarchetypes, in order that they most efficiently give rise to the
extant manuscripts. A detailed examination of such archetype reconstruction is beyond
the scope of this study. But ambitious readers should note that Appendix B to this
paper (i.e., the DNAPARS output) provides the readings likely to appear at various
nodes in the cladogram for every locus studied. On Reynolds' view of the transmission,
the archetype (his ) ought to bear the readings given for node 4.
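Readers consulting the node states in Appendix B will meet letters other than A, C, G, and T, such as M, V, and R. The sketch below assumes that these follow the standard IUPAC nucleotide ambiguity codes, under which a single letter stands for a set of possible bases; in the present application each such set corresponds to a set of candidate reading groups at that locus. Symbols outside the table (including ".", "?", and "-") are passed through unchanged rather than interpreted, and the function name is hypothetical.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}


def decode(states: str) -> list[str]:
    """Expand each symbol to the base labels it may stand for.

    Whitespace is skipped; symbols not in the table are kept as they are.
    """
    return [IUPAC.get(symbol, symbol) for symbol in states if not symbol.isspace()]


# The second ten-locus group printed for node 1 in Appendix B.
print(decode("MATMCCCCCC"))
```

For the group shown, the first and fourth loci are each reported as either A or C, meaning that either of two reading groups would serve equally well at that node.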
Future Research
The future presents a number of immediate challenges and possibilities for the
cladistic analysis of texts using parsimony techniques. The obvious methods through
which the procedure may be tested include examining a variety of texts, as well as
using full collations, in place of collations built from apparatus critici, so as to avoid
dependence on one editor's opinion of what may be viable manuscript readings.42
If positive results are indicated, parsimony analysis might be deployed to assist
the textual critic in determining the relationships of texts, and in reconstructing
archetypes, for new publications. Perspectives may also be presented for re-evaluating
existing dogma about traditions which have not been recently examined.43 In the
classroom, the use of graphical interactive parsimony programs, which allow one to
manipulate stemmata on-screen and immediately to observe the consistencies or
inconsistencies thus fostered, may facilitate integration of stemmatics into the standard
classics curriculum.44 Lastly, literary theorists may wish to ponder the existence of
deeper metaphors connecting the enzymes and mutations of DNA replication with the
corresponding verbal agents and scribal errors giving rise to many of our textual
variants.
42 "Readings which must quite certainly be eliminated have no place under the text," writes Maas (p. 23),
thus giving editors license to omit even from the app. crit. those readings deemed eliminanda.
43 We may suppose that parsimony analysis will be effective in evaluating relationships between
manuscripts of texts in modern, as well as ancient, languages.
44 MacClade (distributed by Sinauer Associates) is one such program. Many candidates which might be
useful for heavy-duty analysis as well as pedagogy are described by Felsenstein at
http://evolution.genetics.washington.edu/phylip/software.html
Appendix A: Infile
9 300
P.Par16024AAACCCCCCCAATACCCCCCAACGCCCACACCCAACCACCACCCCCACGACGGAAACCCCCCGCCCCCCACCACACCACACCCCCCACCCAACCCATACCACCAAACACACGCCCCCGCCACACGACGGCACACACCCCACACAACAC-CTCACCCACAACCCCCCCCACCACCCAACAAACCGCCCAAACCCACACCCCCACCACCCAACCCCCACCCCACACCCCCACAAACCCCCCACCTACCCCCCCCCACCAGCCCCACCAACCAAACCAACCCCCGAACCACCCCCACCCCCCA
A.Par16025CCACACCCCACAGCCCCCCCACCGCACCCCCCACCCCCCCCCCCCCCCATCGAACCCGCCACGCCCCCCGCAACCCCCCCCCCCCCCACACTCCCCACCACCCACACCCCCGACCCAAACCAAAAACGGCCCCCCCCCCACACCCCCCCCAC-CCACCCAAAACACCCACCCCCCCACACCCAACCAACAAAACCACCCCCCCCAACACCCCCCACCACCCCCCCACACCCCCCAAACAACCCACCCCCACCCCCCCGCCCCACCCACCACCCCACACCCCGACCCCACCCCGACACCGC
B.BasileenAACCCCCCCAAATCACCCCCAGCGCCCACACAACCCCCACACCCCCGCCCCGGAACCCCCCCCCACCACCCCCCCCCCCCCCCCCCCACCATACCCAACGAACAAACCCCCGAACCACACACAACCCGGACACCGCCCCACACCCCCCACCACCCACCCCCAACCCCCACACCCCACCAGACAACCACCCCCCCCACCCCCCCCAACCCCCCCCCCCCAACCCCCCCCCCCCCCCCGCCCACCCCCCCCACCACCCCGCACCACCCACCGCCCCACCCCCCGACCCCCCCCCCACACCGA
C.Par_6085AAACCCCCCAAAACACCCCCAGCGCACCCCCAACCCCCCCCCCCCCGCACCGGACCCCCCACACACCACTCAACCCCACACCCCCCCACACTCCCCTACGCCCACACCCCCGCACCACACAAAACCCGGCCACCCCCCCACACCCCCCCCCACCCACCCCCAACCCCCCCCCCCCCCCAGCCACCCACCCCCAACACCCCCCCCACCACCCCCCACCCCAACACACCCCCCCCCCCGCCCCCCCCCCCCACCCCCCCGCCCCACCCACCACCCCACCCCCCGACCCCACCCCGACACCGC
N.VatP_889CAACCCAACCCACACCACAAACCGCCCCACCCCCCAAAACACCCCCAAGGCAGCCCCACACAAACCCCCACCACCCCCCCCAACCCCACCCGCCCCACAAAGAACCCACCCGCCCCCGCCCCCAGCAGGCGGCCGACACCACCCCCCAGCGCCCCCCCCCCCAACCCCCCCCCCCCCCAGACAGCACACACCCAACACCCCCCCAACACCCCCACACCCCACCCCCCCCCCCAACCACAACAACCACCCCCCCCCCAGCCCCCCCAACCACCCCCCCCCCCCCCCACCCCCCGCAGCCGC
K.VatP_887ACCCCCCACCCCTCCAACCAAGCGCCCCCCCCCCCAAAACACCACAAAGGAGGCCCCCCCACGACCCCCGCCGCCCCCCCCCCCCACCCCGGACCCCACAAGACCCACCCCACACACGCCACCCGCAGAGGGCCCACACCACCCACCATCCCTCCACCCCCCACCACACACCCCCAACAGCCAGCCCCCCCCCAACCCCACCACAACCCACAAACCCCCCCCCCCCCCCCCCCACCACCCAACACACCCCCACCCAAGCCCCACCACCCCCACCCCCCCCCGCCCACCACCCACCACCGC
H.BerolineAACCCACCACACTCCCCCCACGACCCCCACCCCCCCACACGCCCCAAAGCCGGACCCCACCGGCCCCACCACCCCCCCCCCCCCCACACCCTACCCGCCTAGCCCAACCCACCCACCGCCCCCAGCCGGACTCCCCAACCCCCCGCCATCCC-CCACCCCCAACCACACCCCACCACAAGACAGACCCCCCCCCCACCCCCCAAGCCCAAAAACAGACCAACCCCCACCCCCCCCCCCCCAC-GCCCCCCCACCCCCACACCCCCACCAGCCCCACCCACCGCCCCCCCCCAACCTCCGC
D.Par10195CACACACCACCATCCACACAATCACCACCCACCCCCAAACGACAAAACGGACCCCCCCCCCAGACAAACACCCACCCCCCACAACACCCCTCAAACGCCAAGCCCCACACAACCAACGCCCCCCGCACCGGGCCACAAACCCCCCCCAGCCCGCCCCACCCACCCCAACAACAACGAAAGCAAGCCCCCCCCCACCCAACCCAACCCAAAAAACCGACCAACCACCACCCCCCCCCCCCCACGCCCACCCCACCACCCCACACCCACCAGCAAACACCCAAGCCCCCCCACCCCCACCAC
F.HauniensCACACACCCACATCCAAACCAGCAACACCCACCCCCAACAAAACAAACGGACCCCCATCCCACACAAACACACACACCCCACAACCCACCTGAACCGGCAAGCCCCCCACCACAAACGCACCCCGCCAGTGTCCGCACACCCCACCCAGCCCGCCCCACCCACACAAACACCAAAGAAAGCAAGCCCCACCCCACGCAACACAACCCCAAAACCCCAACACCCACCACCCCCCCCCCACCACGCCCACCACAACACCCCACCCCCACCAGCCAAGACCAAAGCCACCCCACCCCC-ACAC
Appendix B: Outfile
DNA parsimony algorithm, version 3.572c
Name         Sequences
----         ---------

P.Par16024   AAACCCCCCC AATACCCCCC AACGCCCACA CCCAACCACC ACCCCCACGA CGGAAACCCC
A.Par16025   CC..A....A C.GC...... .C...A.C.C ..ACC..C.. C.....C.AT ..A.CC.G..
B.Basileen   ..C......A ...CA..... .G........ .AACC..CA. ......G.CC .....C....
C.Par_6085   .........A ..ACA..... .G...A.C.C .AACC..C.. C.....G.AC ....CC....
N.VatP_889   C.....AA.. C.C...A.AA .C.....CAC ...CCAA.A. .......A.G .A.CCC.A.A
K.VatP_887   .CC....A.. CC.C.AA..A .G.....C.C ...CCAA.A. ...A.A.A.G A..CCC....
H.Beroline   ..C..A..A. .C.C.....A CGAC...CAC ...CC.ACA. G....A.A.C ....CC..A.
D.Par10195   C.CA.A..A. C..C.A.A.A .T.A..AC.C A..CC.A.A. GA.AAA...G ACCCCC....
F.Hauniens   C.CA.A...A C..C.AAA.. .G.AA.AC.C A..CC.A..A .AA.AA...G ACCCCCAT..

P.Par16024   CCGCCCCCCA CCACACCACA CCCCCCACCC AACCCATACC ACCAAACACA CGCCCCCGCC
A.Par16025   A........G .A..C..C.C ......CA.A CT...CAC.A C...C..C.C ..A...AAA.
B.Basileen   ..C.A..A.C ..C.C..C.C ......CA.. .TA..CA..G .A.....C.C ..AA..ACA.
C.Par_6085   A.A.A..A.T .A..C..... ......CA.A CT...C...G C...C..C.C ...A..ACA.
N.VatP_889   .AAA...... ....C..C.C .AA...CA.. CG...CACAA .GA.CC...C ..........
K.VatP_887   A..A.....G ..G.C..C.C .....AC... GGA..CC..A .GACCCAC.C .A.A.A....
H.Beroline   .G.....A.C A.C.C..C.C .....ACA.. CTA..CGC.T .G.CC.AC.C AC..A.....
D.Par10195   .A.A.AAA.. ..CAC..C.C A.AA.AC... TCAAACGC.A .G.CCCACAC AA..AA....
F.Hauniens   .ACA.AAA.. .ACACA.C.C A.AA..CA.. TGAA.CGG.A .G.CCC.CAC .A.AAA...A

P.Par16024   ACACGACGGC ACACACCCCA CACAACAC-C TCACCCACAA CCCCCCCCAC CACCCAACAA
A.Par16025   CA.AA..... C.C.C..... ...CC.C.C. A.-..AC.C. AAA.A..... .C...C...C
B.Basileen   ...ACC...A CAC.G..... ...CC.C.A. CAC..AC.CC .AA....... AC....C..G
C.Par_6085   .A.ACC.... CAC.C..... ...CC.C.C. CAC..AC.CC .AA.....C. .C...CC..G
N.VatP_889   C.CA.CA... GGC.GA.A.C AC.CC.CAG. G.C...C.CC ..AA....C. .C...CC..G
K.VatP_887   ..C..CA.AG GGC.CA.A.C AC.C..CAT. C.T..AC.CC ..A..A.ACA .C.......G
H.Beroline   C.CA.C...A CTC.C.AA.C .C.CG.CAT. C.-..AC.CC .AA..A.AC. .CA...CA.G
D.Par10195   C.C..CACCG GGC...AAAC .C.CC.CAG. C.G...CACC .A....AACA ACAA.G.A.G
F.Hauniens   C.C..C.A.T GTC.G.A.AC .C..C.CAG. C.G...CACC .A.A.AAACA .CAAAG.A.G

P.Par16024   ACCGCCCAAA CCCACACCCC CACCACCCAA CCCCCACCCC ACACCCCCAC AAACCCCCCA
A.Par16025   C.AA..A.C. AAAC...... .C...A.ACC ....AC.A.. C.C..A.AC. CCC.AAA.A.
B.Basileen   ..AA..ACCC ...C...... .C...A..CC .....C..AA C.C.....C. CCC...G..C
C.Par_6085   C.AC..ACCC ..A....... .C.....ACC ....AC...A ....A...C. CCC...G..C
N.VatP_889   ..A..A..C. ....ACA... .C...A.ACC ...A...... ..C.....C. CC.A..A.A.
K.VatP_887   C.A....CCC ....AC...A .CA..A..C. .AAA.C.... C.C.....C. CCCA..A..C
H.Beroline   ..A.A..CCC ...C...... .CAAG..... AAA.AGA..A ..C...A.C. CCC......C
D.Par10195   CAA....CCC .....C.AA. .CAAC..A.. AAA..GA..A ..CA..A.C. CCC......C
F.Hauniens   CAA....C.C .....G.AA. ACAAC..... AA...CAA.A C.CA..A.C. CCC....A.C

P.Par16024   CCTACCCCCC CCCACCAGCC CCACCAACCA AACCAACCCC CGAACCACCC CCACCCCCCA
A.Par16025   ..C......A ...C..C... .....C.... CC...CA... ...C..CA.. ..GA.A..GC
B.Basileen   A.CC.....A ..AC..C..A .....C...G CC...C.... ...C..C... ..CA.A..G.
C.Par_6085   ..CC.....A ...C..C... .....C.... CC...C.... ...C..CA.. ..GA.A..GC
N.VatP_889   .AAC.A.... ...C...... ..C....... CC..CC.... .CCC.AC... ..G.AG..GC
K.VatP_887   AAC..A.... .A.C.A.... ......C..C C...CC.... ..CC.AC.A. .....A..GC
H.Beroline   A.-G...... .A.C..CA.A ..C...C.AG CC...C..A. ..CC..C... .A...T..GC
D.Par10195   A.GC..A... .A.CA.CC.A .AC...C.AG C.AAC....A A.CC..C..A ..C..A..AC
F.Hauniens   A.GC..A..A .AACA.CC.A ..C...C.AG CCAAG...AA A.CCA.C..A ..C..-A.AC
One most parsimonious tree found:

                         +--F.Hauniens
                      +--8
                   +--7  +--D.Par10195
                   !  !
                +--6  +-----H.Beroline
                !  !
       +--------5  +--------K.VatP_887
       !        !
       !        +-----------N.VatP_889
    +--4
    !  !                 +--C.Par_6085
    !  !              +--3
  --1  +--------------2  +--B.Basileen
    !                 !
    !                 +-----A.Par16025
    !
    +-----------------------P.Par16024

remember: this is an unrooted tree!
requires a total of 500.000
steps in each site:
         0   1   2   3   4   5   6   7   8   9
     *-----------------------------------------
    0!       3   2   2   1   1   1   1   2   2
   10!   2   3   2   3   2   1   2   3   1   1
   20!   2   1   4   1   2   1   2   1   2   2
   30!   2   1   1   1   1   1   2   1   2   3
   40!   1   4   1   1   2   1   1   2   2   2
   50!   4   2   2   2   2   2   1   1   3   1
   60!   1   3   3   4   2   1   1   1   2   0
   70!   5   1   3   3   1   1   1   0   2   0
   80!   2   1   1   2   1   0   2   1   3   0
   90!   2   4   4   2   1   1   1   4   4   1
  100!   3   2   2   2   1   2   2   2   2   1
  110!   1   2   2   2   3   1   2   1   2   1
  120!   1   3   2   1   3   2   2   3   2   2
  130!   4   3   4   1   0   5   2   1   2   1
  140!   1   2   1   0   2   3   0   1   1   5
  150!   0   3   1   5   0   0   3   1   1   1
  160!   2   1   2   2   2   1   2   1   1   2
  170!   2   2   1   1   1   1   4   3   1   0
  180!   2   4   1   1   2   1   1   1   2   2
  190!   2   1   1   2   3   2   3   1   1   1
  200!   1   1   1   1   1   2   3   0   4   2
  210!   2   1   1   2   2   3   4   1   2   1
  220!   2   4   0   2   1   1   1   1   1   1
  230!   0   1   1   2   2   1   1   3   1   2
  240!   2   2   2   4   4   0   2   1   0   0
  250!   2   0   1   2   1   1   1   2   2   0
  260!   2   0   1   2   0   0   1   1   0   1
  270!   3   1   3   1   1   3   2   1   0   2
  280!   1   1   1   1   1   1   2   1   2   1
  290!   1   0   1   4   1   1   4   1   0   2
  300!   2
From  To          Steps?   State at upper node
                         ( . means same as in the node below it on tree)

      1                  AAACCCCCCC MATMCCCCCC AVCGCCCMCM CCCMMCCACC
   1  4           maybe  .......... .......... .S.....C.C ...CC.....
   4  5           yes    .......M.. C.....M..A .......... .....MA.A.
   5  6           yes    ..C....... .M.C.M.... .G........ ..........
   6  7           yes    .....A.CM. .......... ...V...... .....C....
   7  8           yes    C..A...... .A...A.A.. ...A..A... A.........
   8  F.Hauniens  yes    ........CA ......A..C ....A..... ........CA
   8  D.Par10195  yes    ........A. ......C... .T........ ..........
   7  H.Beroline  yes    ........A. AC...CC... C.AC....A. .......C..
   6  K.VatP_887  yes    .C.....A.. .C...AA... .......... .....A....
   5  N.VatP_889  yes    C.....AA.. ..CA..A.A. .C......A. .....A....
   4  2           yes    .........A ...C...... .....M.... ..A....C..
   2  3           yes    .......... A...A..... .G........ .A........
   3  C.Par_6085  yes    .......... ..A....... .....A.... ..........
   3  B.Basileen  yes    ..C....... .......... .....C.A.A ........A.
   2  A.Par16025  yes    CC..A..... C.G....... .C...A.... ..........
   1  P.Par16024  maybe  .......... A..A...... .A.....A.A ...AA.....

      1                  ACCCCCACGN CGGAMMCCCC CCGCCCCCCA CCACMCCMCM
   1  4           maybe  .......... ....CC.... .......... ....C..C.C
   4  5           yes    .......A.G ...C...... .M.A...... ..........
   5  6           yes    .....A.... M......... .......... ..V.......
   6  7           yes    R......... .......... .V.....A.. ..C.......
   7  8           yes    .A..A..C.. ACC....... .A...AA... ...A......
   8  F.Hauniens  yes    A.A....... ......AT.. ..C....... .A...A....
   8  D.Par10195  yes    G..A...... .......... .......... ..........
   7  H.Beroline  yes    G........C C..A....A. .G.C.....C A.........
   6  K.VatP_887  yes    ...A...... A......... AC.......G ..G.......
   5  N.VatP_889  yes    .......... .A.....A.A .AA....... ..........
   4  2           yes    M.....V.A. .......... M........N .M........
   2  3           yes    ......G..C .......... ..V.A..A.. ..........
   3  C.Par_6085  yes    C......... .......... A.A......T .A.....A.A
   3  B.Basileen  yes    A.......C. ....A..... C.C......C .CC.......
   2  A.Par16025  yes    C.....C..T ..A....G.. A........G .A........
   1  P.Par16024  maybe  .........A ....AA.... .......... ....A..A.A

      1                  CCCCCCMMCC MDCCCMWMCM ACCAMACACM CGCCCCCGCC
   1  4           maybe  ......CA.. C....CA..A ....C..M.C ..........
   4  5           yes    .......... .G........ .GM..C.... ..........
   5  6           yes    .....A.... ..A...V... ...C..AC.. .A...M....
   6  7           yes    .......... ......GC.. ..C....... M...A.....
   7  8           yes    A.AA...... T..A...... ........A. .....A....
   8  F.Hauniens  yes    .....C.... .......G.. ......C... C..A.....A
   8  D.Par10195  yes    .......C.. .C..A..... .......... A.........
   7  H.Beroline  yes    .......... .T.......T .....A.... AC...C....
   6  K.VatP_887  yes    .......C.. G.....CA.. ..A....... ...A.A....
   5  N.VatP_889  yes    .AA....... .......CA. ..A....A.. ..........
   4  2           yes    .........M .T........ M......C.. ..M...AVA.
   2  3           yes    .......... .......A.G .......... ...A...C..
   3  C.Par_6085  yes    .........A ......T... C......... ..C.......
   3  B.Basileen  yes    .........C A.A....... AA..A..... ..A.......
   2  A.Par16025  maybe  .........A .......C.. C......... ..A....A..
   1  P.Par16024  maybe  ......AC.. AA...ATA.C ....A....A ..........

      1                  MCAMGMCGGC VCMCMCCCCA CACMMCMC?C YC?CCMMCMM
   1  4           maybe  .......... ..C.C..... ...CC.C... C.?...C.C.
   4  5           yes    ..C..CM... GG...M.A.C MC.....A.. .........C
   5  6           yes    .........G .......... ........K. ..?.......
   6  7           yes    C......... .K...CA... C......... ..........
   7  8           yes    ...C...V.. ....V...A. ........G. ..G..C.A..
   8  F.Hauniens  yes    ......CA.T .T..G..C.. ...A...... ..........
   8  D.Par10195  yes    ......ACC. .G..A..... .......... ..........
   7  H.Beroline  yes    ...A..C..A CT........ ....G...T. ..-..A....
   6  K.VatP_887  yes    A..C..A.A. .....A.... A...A...T. ..T..A....
   5  N.VatP_889  yes    C..A..A... ....GA.... A.......G. G.C..C....
   4  2           maybe  .M.AV..... C......... ........C. .....A....
   2  3           yes    A...CC.... .A........ .......... .AC......C
   3  C.Par_6085  maybe  .A........ .......... .......... ..........
   3  B.Basileen  yes    .C.......A ....G..... ........A. ..........
   2  A.Par16025  yes    CA..AA.... .......... .......... A.-......A
   1  P.Par16024  maybe  A..C.A.... A.A.A..... ...AA.A.-. T.A..CA.AA

      1                  CCMCCCCCAC CMCCCMACAR MCMGCCCAMA CCCACACCCC
   1  4           maybe  ..A....... .C.......G ..A.....C. ..........
   4  5           yes    ........C. .......... .......... ....MC....
   5  6           yes    .....A.A.M .....A.... .......C.C ..........
   6  7           yes    .A........ ..A....A.. .......... ....C.....
   7  8           yes    ..C...A..A ...A.G.... CA........ .......AA.
   8  F.Hauniens  yes    ...A...... ....A..... ........A. .....G....
   8  D.Par10195  yes    .....C.... A......... .......... ..........
   7  H.Beroline  yes    .........C ......C... A...A..... ...C.A....
   6  K.VatP_887  yes    .........A .......... C......... ....A....A
   5  N.VatP_889  yes    ...A...... .....CC... A....A.... ....A.A...
   4  2           yes    .A........ .......... ...A..A... ..MM......
   2  3           yes    .......... ......C... .......C.C ..........
   3  C.Par_6085  yes    ........C. .....C.... C..C...... ..AA......
   3  B.Basileen  yes    .......... A....A.... A......... ..CC......
   2  A.Par16025  yes    A...A..... .....C...C C......... AAAC......
   1  P.Par16024  maybe  ..C....... .A...A...A A.C.....A. ..........

      1                  CMCCAMCMMM CCCCCMCCCC ACMCCCCCMC MMACCCMCCA
   1  4           maybe  .C...A..C. .......... ..C.....C. CCM...A...
   4  5           maybe  .......... ...M...... .......... ...M......
   5  6           yes    ..A....C.A .AA..V.... .......... ..C......C
   6  7           yes    ...AVC..A. A..C.SA..A ......A... ...C..C...
   7  8           yes    ....C..... .......... ...A...... ..........
   8  F.Hauniens  yes    A......... ..C..C.A.. C......... .......A..
   8  D.Par10195  yes    .......A.. .....G.... .......... ..........
   7  H.Beroline  yes    ....G..... ....AG.... .......... ..........
   6  K.VatP_887  yes    .......... ...A.C.... C......... ...A......
   5  N.VatP_889  yes    .......A.C ...A.A.... .......... ..AA....A.
   4  2           maybe  .........C ....MC.... M......... ..C.......
   2  3           yes    .......... .........A .......... ......G..C
   3  C.Par_6085  yes    .....C.A.. ....A..... A.A.A..... ..........
   3  B.Basileen  yes    .......C.. ....C...A. C......... ..........
   2  A.Par16025  yes    .......A.. ....A..A.. C....A.A.. ....AA..A.
   1  P.Par16024  maybe  .A...C.CAA .....A.... ..A.....A. AA....C...

      1                  CCYMCCCCCC CCCMCCAGCC CCACCAACCA MMCCAMCCCC
   1  4           maybe  ..C....... ...C...... .......... CC...C....
   4  5           yes    .M...M.... .......... ..M....... ....C.....
   5  6           yes    A......... .A........ ......C..V ..........
   6  7           yes    .C?V.C.... ......CV.A ..C.....AG ........M.
   7  8           yes    ..GC..A... ....A..C.. .......... ..AA.A...A
   8  F.Hauniens  yes    .........A ..A....... .......... ....G...A.
   8  D.Par10195  yes    .......... .......... .A........ .A......C.
   7  H.Beroline  yes    ..-G...... .......A.. .......... ....A...A.
   6  K.VatP_887  yes    .A.A.A.... .....A.... ..A......C .A........
   5  N.VatP_889  yes    .AAC.A.... .......... ..C....... ..........
   4  2           yes    .........A ......C... .....C.... ..........
   2  3           maybe  ...C...... .......... .......... ..........
   3  C.Par_6085  no     .......... .......... .......... ..........
   3  B.Basileen  yes    A......... ..A......A .........G ..........
   2  A.Par16025  yes    ...A...... .......... .......... ......A...
   1  P.Par16024  maybe  ..TA...... ...A...... .......... AA...A....

      1                  CGAMCCMCCC CCRCCMCCSM
   1  4           maybe  ...C..C... .....A..GC
   4  5           yes    ..C..M.... ..........
   5  6           maybe  .......... ..A.......
   6  7           maybe  .....C.... ..........
   7  8           yes    A........A ..C.....A.
   8  F.Hauniens  yes    ....A..... .....-A...
   8  D.Par10195  no     .......... ..........
   7  H.Beroline  yes    .......... .A...T....
   6  K.VatP_887  yes    .....A..A. ..........
   5  N.VatP_889  yes    .C...A.... ..G.AG....
   4  2           yes    .......M.. ..GA......
   2  3           no     .......... ..........
   3  C.Par_6085  maybe  .......A.. ..........
   3  B.Basileen  yes    .......C.. ..C......A
   2  A.Par16025  maybe  .......A.. ..........
   1  P.Par16024  maybe  ...A..A... ..A..C..CA
Appendix C: RETREE Session
Tree Rearrangement, version 3.572c

Settings for this run:
  U  Initial tree (arbitrary, user, specify)?  User tree from tree file
  N  Use the Nexus format to write out trees?  No
  0  Graphics type (IBM PC, VT52, ANSI)?  ANSI
  W  Width of terminal screen, of plotting area?  80, 80
  L  Number of lines on screen?  24

Are these settings correct? (type Y or the letter for one to change)
0

Tree Rearrangement, version 3.572c

Settings for this run:
  U  Initial tree (arbitrary, user, specify)?  User tree from tree file
  N  Use the Nexus format to write out trees?  No
  0  Graphics type (IBM PC, VT52, ANSI)?  (none)
  W  Width of terminal screen, of plotting area?  80, 80
  L  Number of lines on screen?  24

Are these settings correct? (type Y or the letter for one to change)
y

Reading tree file ...

retree: can't read intree
Please enter a new filename>treefile

                         ,>>1:F.Hauniens
                      ,>15
                   ,>14  `>>2:D.Par10195
                   !  !
                ,>13  `>>>>>3:H.Beroline
                !  !
       ,>>>>>>>12  `>>>>>>>>4:K.VatP 887
       !        !
       !        `>>>>>>>>>>>5:N.VatP 889
    ,>11
    !  !                 ,>>6:C.Par 6085
    !  !              ,>17
  -10  `>>>>>>>>>>>>>16  `>>7:B.Basileen
    !                 !
    !                 `>>>>>8:A.Par16025
    !
    `>>>>>>>>>>>>>>>>>>>>>>>9:P.Par16024

NEXT? (Options: R . U W O T F B N H J K L C + ? X Q) (? for Help) o
Which node should be the new outgroup? 12

                         ,>>1:F.Hauniens
                      ,>15
                   ,>14  `>>2:D.Par10195
                   !  !
                ,>13  `>>>>>3:H.Beroline
                !  !
    ,>>>>>>>>>>12  `>>>>>>>>4:K.VatP 887
    !           !
    !           `>>>>>>>>>>>5:N.VatP 889
    !
  -10                    ,>>6:C.Par 6085
    !                 ,>17
    !              ,>16  `>>7:B.Basileen
    !              !  !
    `>>>>>>>>>>>>>11  `>>>>>8:A.Par16025
                   !
                   `>>>>>>>>9:P.Par16024

NEXT? (Options: R . U W O T F B N H J K L C + ? X Q) (? for Help) w
Enter R if the tree is to be rooted
OR enter U if the tree is to be unrooted: r

Tree written to file