Marlowe or Shakespeare? Determining the Authorship of a Mysterious Play. Chapter 9, Exercise 4. Bill Camarinos, Andy Gibbons

Marlowe or Shakespeare? Determining the Authorship of a Mysterious Play

Chapter 9, Exercise 4

Bill Camarinos

Andy Gibbons

Background

• Virtually every year for the past one hundred years, a play or other work of literature ostensibly written by William Shakespeare or Christopher Marlowe has been found somewhere in the United Kingdom.

• Specialists in Elizabethan literature typically conclude that these “finds” are frauds.

Shakespeare and Marlowe

• Both were born in 1564.

• Shakespeare died in 1616.

• Marlowe was supposedly killed in a tavern brawl in 1593, but many suspect that his death was staged. There is enough doubt that Marlowe’s window in Westminster Abbey’s Poets’ Corner gives the dates of his life as “1564-1593?”

Shakespeare Authorship Controversy

• Some maintain that someone other than Shakespeare was the true author of the Shakespearean canon.

• Among the candidates:

– Edward de Vere, 17th Earl of Oxford.

– Francis Bacon.

– Queen Elizabeth I

– Christopher Marlowe

• Every year there is a court-type competition in Washington among leading attorneys to prove who is the author. Supreme Court Justices sit as judges.

• Separately, there is a prize, the Hoffman Prize, that will be given to whoever can convince the world that Christopher Marlowe wrote the works attributed to Shakespeare.

Our Assumptions

• Shakespeare, not Marlowe or anyone else, wrote the Shakespearean canon.

• The mystery play which has been found was definitely written by either Marlowe or Shakespeare. It is not another of the frauds that keep turning up.

What we have to work with

• An electronic version of a play of unknown authorship

• Electronic versions of all known works of William Shakespeare

• Electronic versions of all known works of Christopher Marlowe

How we propose to proceed

• Investigate how quantitative techniques and computers have been used to solve authorship attribution problems in the past.

• Determine which techniques have the greatest probabilities of success.

• Design a process for applying the selected techniques using what we have at our disposal.

• Determine the true authorship.

Early and Simple Quantitative Approaches

• Compare word lengths: the frequency distribution of word lengths in works by the authors in question.

• Average number of syllables per word.

• Sentence length

• Percentages of different parts of speech
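Two of the simple measures above, the word-length distribution and the sentence length, can be sketched in a few lines of Python. This is a minimal illustration and not the procedure used in any of the studies surveyed: the function names are hypothetical, and the tokenizing rules (letters plus straight apostrophes, sentences ending at ".", "!" or "?") are simplifying assumptions.

```python
import re
from collections import Counter


def words(text):
    """Lowercased alphabetic tokens; apostrophes inside a word are kept."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def word_length_distribution(text):
    """Map each word length to its share of all words in the text."""
    lengths = Counter(len(w) for w in words(text))
    total = sum(lengths.values())
    return {n: count / total for n, count in sorted(lengths.items())}


def mean_sentence_length(text):
    """Average number of words per sentence (assumed to end at . ! or ?)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if words(s)]
    return sum(len(words(s)) for s in sentences) / len(sentences)
```

Running both functions over each author's known works and over the mystery play would give profiles that can be compared side by side.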

What is the result of applying the simple tests?

• Many are better at identifying types of writing (e.g. narrative vs. drama) than they are at distinguishing one author from another.

• The word-length test was actually applied to Shakespeare and Marlowe and the result was “Christopher Marlowe agrees with Shakespeare about as well as Shakespeare agrees with himself.”

Other Methods

• Function-word approach. Focus on the frequency with which different articles, conjunctions and prepositions (“context-free words”) are used. Frequencies often vary significantly from one author to another.

• Measure “pace” - the rate of introduction of new vocabulary into the texts.

• Focus on words used only once or twice.
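The three measures above can be sketched as follows. Everything here is an assumption made for illustration: the short `FUNCTION_WORDS` list is hypothetical (a real study would choose its own), "pace" is approximated as the share of tokens that introduce a word not seen earlier in the text, and the function names are not taken from any published method.

```python
import re
from collections import Counter

# Hypothetical, deliberately short list of "context-free" words.
FUNCTION_WORDS = ("the", "and", "of", "to", "in", "but", "with", "by")


def tokens(text):
    """Lowercased alphabetic tokens; apostrophes inside a word are kept."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def function_word_rates(text, vocabulary=FUNCTION_WORDS):
    """Occurrences of each function word per 1,000 tokens."""
    toks = tokens(text)
    counts = Counter(toks)
    return {w: 1000 * counts[w] / len(toks) for w in vocabulary}


def pace(text):
    """Share of tokens that are the first use of a word in the text."""
    toks = tokens(text)
    return len(set(toks)) / len(toks)


def rare_words(text, max_count=2):
    """Words used at most max_count times (once or twice by default)."""
    counts = Counter(tokens(text))
    return sorted(w for w, c in counts.items() if c <= max_count)
```

Because pace falls as a text gets longer, it is only comparable between samples of similar length.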

Other Methods (Continued)

• Cumulative Sum Charts (cusums or qsums): compare two features using a chart

– one of which is sentence length

– the other of which is something like the number of two or three letter words in each sentence

– similar chart patterns suggest uniform authorship

– chart patterns for a different author will diverge
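The two series behind a cusum chart can be computed as below. This is a sketch under stated assumptions: each point is the running sum of a feature's deviation from its mean over the whole sample, sentences are assumed to end at ".", "!" or "?", and the function names are hypothetical. Plotting and any scaling of the two series are left out.

```python
import re


def sentences(text):
    """Split on . ! ? and return each sentence as a list of words."""
    parts = re.split(r"[.!?]+", text)
    found = [re.findall(r"[A-Za-z']+", p) for p in parts]
    return [s for s in found if s]


def cusum(values):
    """Running sum of each value's deviation from the mean of all values."""
    mean = sum(values) / len(values)
    running, series = 0.0, []
    for v in values:
        running += v - mean
        series.append(running)
    return series


def cusum_pair(text):
    """Cusum series for sentence length and for 2-3 letter words per sentence."""
    sents = sentences(text)
    lengths = [len(s) for s in sents]
    short = [sum(1 for w in s if 2 <= len(w) <= 3) for s in sents]
    return cusum(lengths), cusum(short)
```

Overlaying the two returned series on one chart gives the comparison described above: the closer the two lines track each other, the stronger the suggestion of a single author.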

Other Methods (Continued)

• Use of Neural Networks

– Neural Networks have powerful pattern-recognition capabilities

– Network is “trained” or calibrated using data from a known author (such as the known works of Shakespeare or Marlowe)

– The network can then classify doubtful text (such as the mystery play) based on what it has “learned.”

– Two researchers reported success using neural networks to compare Shakespeare and Marlowe.
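The train-then-classify idea can be illustrated with a single sigmoid neuron, the smallest possible network. This is a toy sketch and makes no claim about the architecture or features used in the studies mentioned above. The function names are hypothetical, and each input is assumed to be a list of numeric style features (for example, function-word rates) with label 1 for one author and 0 for the other.

```python
import math


def predict(weights, bias, features):
    """Output of one sigmoid neuron: a score between 0 and 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(samples, labels, epochs=500, rate=0.5):
    """Fit the neuron by stochastic gradient descent on the log loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            error = predict(weights, bias, features) - label
            weights = [w - rate * error * x for w, x in zip(weights, features)]
            bias -= rate * error
    return weights, bias
```

After training on feature vectors from the known works, `predict` applied to a sample from the mystery play returns a score; values above 0.5 point to the author labelled 1, values below to the author labelled 0.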

What Previous Authorship Attribution Studies Have Shown

• The simplest tests (e.g. word length analysis) don’t work.

• Some only slightly more complex tests (e.g. function-word analysis) have had some success.

• Combinations of tests, even if some are quite simple, have a high probability of success.

• Success in attribution is much more likely when only two candidate authors are present.

• Success becomes even more likely if there is a large body of known material available (and we have all the known works of both Shakespeare and Marlowe).

• With leading-edge techniques that you really don’t understand: don’t try this at home.

Methods We Considered

• Even though they show a lot of promise we ruled out neural networks since we have no experience at all in using them.

• We also considered data mining.

– Data are stored in a data warehouse.

– Query and reporting tools, multidimensional analysis tools, and intelligent agents are used to analyze the data.

– For example, intelligent agents could be fitted with an algorithm designed to find patterns.

– We decided that data mining was overkill for the problem at hand.

Method We Selected

• Use a readily available relational data base, Oracle, as our analysis and reporting tool.

• Relational data bases organize data into tables which are related to one another using key fields.

– Some of the tables we would create:

• Words used by Shakespeare.

• Troublesome words used in the plays

• Weird words used by Shakespeare

• Examples of each author’s use of verse and meter.

Method We Selected (Continued)

• Structured Query Language (SQL) or the associated Query by Example (QBE) would be used to query the data.

• We would define how many points of similarity in use of language, verse, etc. would be needed to establish authorship. For example, samples of Shakespeare’s and Marlowe’s use of iambic pentameter in their known works would be compared to that in the mystery play.

• Oracle’s Report Generator would be used to create a report showing how the mystery play compares with the known texts based on the criteria we established.
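The relational approach can be sketched with two tables joined on a key field and one SQL query. This sketch substitutes Python's built-in `sqlite3` module for Oracle so that it is self-contained. The schema (`work`, `word_use`), the function names, and the per-1,000-words rate are all hypothetical choices made for illustration, not the tables listed on the earlier slide.

```python
import re
import sqlite3
from collections import Counter

SCHEMA = """
    CREATE TABLE work (
        work_id INTEGER PRIMARY KEY,
        title   TEXT NOT NULL,
        author  TEXT NOT NULL   -- e.g. 'Shakespeare', 'Marlowe', 'Unknown'
    );
    CREATE TABLE word_use (
        work_id INTEGER NOT NULL REFERENCES work(work_id),  -- key field
        word    TEXT NOT NULL,
        uses    INTEGER NOT NULL,
        PRIMARY KEY (work_id, word)
    );
"""

RATE_QUERY = """
    SELECT w.author,
           1000.0 * SUM(CASE WHEN u.word = ? THEN u.uses ELSE 0 END)
                  / SUM(u.uses)
    FROM work AS w JOIN word_use AS u ON u.work_id = w.work_id
    GROUP BY w.author
"""


def open_database():
    """Create an in-memory database holding the two tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def load_work(conn, title, author, text):
    """Store one work and its per-word counts."""
    cur = conn.execute(
        "INSERT INTO work (title, author) VALUES (?, ?)", (title, author)
    )
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    conn.executemany(
        "INSERT INTO word_use (work_id, word, uses) VALUES (?, ?, ?)",
        [(cur.lastrowid, word, n) for word, n in counts.items()],
    )


def rate_by_author(conn, word):
    """Uses of `word` per 1,000 words, for each author in the database."""
    return dict(conn.execute(RATE_QUERY, (word,)).fetchall())
```

Loading the mystery play under the author name 'Unknown' lets a single call such as `rate_by_author(conn, "the")` return the three rates side by side, which is the kind of comparison a report would tabulate.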

Conclusion

• Our task has been a fascinating, and fun, one.

• Our survey of the work previously done showed that computers and linguistics have come a long way and that computers can be used to help solve the type of authorship attribution questions that scholars have debated for years.

• We believe that using a powerful relational data base to perform the kinds of tests that have proven most successful in previous studies would convince the quantitatively oriented community of the authorship of the mystery play.

• We would seek validation of our results from an Elizabethan scholar who specializes in the works of Shakespeare and Marlowe. This would give credence to our results among those who are dubious of quantitative approaches.