

Natural Language Processing: Part II
Overview of Natural Language Processing (L90): ACS
Lecture 10: Discourse

Simone Teufel (Materials by Ann Copestake)
Computer Laboratory, University of Cambridge
October 2018

  • Outline of today's lecture

    Putting sentences together (in text):

    - Coherence
    - Anaphora (pronouns etc)
    - Algorithms for anaphora resolution

  • Document structure and discourse structure

    - Most types of document are highly structured, implicitly or explicitly:
      - Scientific papers: conventional structure (differences between disciplines).
      - News stories: first sentence is a summary.
      - Blogs, etc.
    - Topics within documents.
    - Relationships between sentences.

  • Rhetorical relations

    Max fell. John pushed him.

    can be interpreted as:

    1. Max fell because John pushed him.  (EXPLANATION)
    2. Max fell and then John pushed him.  (NARRATION)

    Implicit relationship: discourse relation or rhetorical relation.
    'because' and 'and then' are examples of cue phrases.


  • Coherence

    Discourses have to have connectivity to be coherent:

    Kim got into her car. Sandy likes apples.

    Can be OK in context:

    Kim got into her car. Sandy likes apples, so Kim thought she'd go to the farm shop and see if she could get some.


  • Coherence in generation

    Language generation needs to maintain coherence.

    In trading yesterday: Dell was up 4.2%, Safeway was down 3.2%, HP was up 3.1%.

    Better:

    Computer manufacturers gained in trading yesterday: Dell was up 4.2% and HP was up 3.1%. But retail stocks suffered: Safeway was down 3.2%.

    More about generation in the next lecture.

  • Coherence in interpretation

    Discourse coherence assumptions can affect interpretation:

    Kim's bike got a puncture. She phoned the AA.

    The assumption of coherence (and knowledge about the AA) leads to bike being interpreted as motorbike rather than pedal cycle.

    John likes Bill. He gave him an expensive Christmas present.

    If EXPLANATION, 'he' is probably Bill. If JUSTIFICATION (supplying evidence for the first sentence), 'he' is John.

  • Factors influencing discourse interpretation

    1. Cue phrases.
    2. Punctuation (also prosody) and text structure:
       Max fell (John pushed him) and Kim laughed.
       Max fell, John pushed him and Kim laughed.
    3. Real world content:
       Max fell. John pushed him as he lay on the ground.
    4. Tense and aspect:
       Max fell. John had pushed him.
       Max was falling. John pushed him.

    Hard problem, but 'surfacy techniques' (punctuation and cue phrases) work to some extent.
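To make the 'surfacy' idea concrete, here is a minimal sketch that labels the relation between two clauses from cue phrases alone; the cue-to-relation table is invented for illustration, not a real inventory.

```python
# A toy cue-phrase labeller; the mapping is illustrative only.
CUE_RELATIONS = {
    "because": "EXPLANATION",
    "and then": "NARRATION",
    "so": "RESULT",
    "but": "CONTRAST",
}

def label_relation(text: str) -> str:
    """Guess the discourse relation linking two clauses from cue phrases."""
    lowered = f" {text.lower()} "
    # Try longer cues first so that "and then" is not shadowed by shorter cues.
    for cue in sorted(CUE_RELATIONS, key=len, reverse=True):
        if f" {cue} " in lowered:
            return CUE_RELATIONS[cue]
    return "UNKNOWN"   # no overt cue: would need deeper interpretation

print(label_relation("Max fell because John pushed him."))   # EXPLANATION
print(label_relation("Max fell and then John pushed him."))  # NARRATION
```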

  • Rhetorical relations and summarization

    Analysis of text with rhetorical relations generally gives a binary branching structure:

    - nucleus and satellite: e.g., EXPLANATION, JUSTIFICATION
    - equal weight: e.g., NARRATION

    Max fell because John pushed him.


  • Summarisation by satellite removal

    If we consider a discourse relation as a relationship between two phrases, we get a binary branching tree structure for the discourse. In many relationships, such as EXPLANATION, one phrase depends on the other: e.g., the phrase being explained is the main one (the nucleus) and the other is subsidiary (the satellite). In fact we can get rid of the subsidiary phrases and still have a reasonably coherent discourse.
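A minimal sketch of this pruning, assuming the discourse has already been parsed into a binary tree; the Span structure and the example tree are illustrative, not the output of a real discourse parser.

```python
# Toy binary discourse tree plus satellite pruning.
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Span:
    text: str = ""                      # leaf text; empty for internal nodes
    relation: str = ""                  # e.g. EXPLANATION, NARRATION
    daughters: Tuple["Span", ...] = ()
    satellite: Optional[int] = None     # index of the satellite daughter;
                                        # None for equal-weight relations

def summarise(span: Span) -> str:
    """Drop satellites, keep nuclei; equal-weight daughters are all kept."""
    if not span.daughters:
        return span.text
    kept = (d for i, d in enumerate(span.daughters) if i != span.satellite)
    return " ".join(summarise(d) for d in kept)

tree = Span(relation="EXPLANATION",
            daughters=(Span(text="Max fell."), Span(text="John pushed him.")),
            satellite=1)                # the explanation is subsidiary
print(summarise(tree))                  # -> "Max fell."
```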


  • Referring expressions

    Niall Ferguson is prolific, well-paid and a snappy dresser. Stephen Moss hated him — at least until he spent an hour being charmed in the historian's Oxford study.

    referent: a real world entity that some piece of text (or speech) refers to (the actual Prof. Ferguson).
    referring expressions: bits of language used to perform reference by a speaker ('Niall Ferguson', 'he', 'him').
    antecedent: the text initially evoking a referent ('Niall Ferguson').
    anaphora: the phenomenon of referring to an antecedent.

  • Niall Ferguson and Stephen Moss...

    Niall Ferguson is a British historian and conservative political commentator. He is a senior research fellow at Jesus College, Oxford. He is the bestselling author of several books, including The Ascent of Money.

    Stephen Moss is a feature writer at the Guardian.


  • Pronoun resolution

    Pronouns: a type of anaphor.
    Pronoun resolution: generally only consider cases which refer to antecedent noun phrases.

    Niall Ferguson is prolific, well-paid and a snappy dresser. Stephen Moss hated him — at least until he spent an hour being charmed in the historian's Oxford study.


  • Hard constraints: Pronoun agreement

    Pronouns must agree with their antecedents in number and gender. BUT:

    - A little girl is at the door — see what she wants, please?
    - My dog has hurt his foot — he is in a lot of pain.
    - * My dog has hurt his foot — it is in a lot of pain.

    Complications:

    - The team played really well, but now they are all very tired.
    - Kim and Sandy are asleep: they are very tired.
    - Kim is snoring and Sandy can't keep her eyes open: they are both exhausted.

  • Hard constraints: Reflexives

    - John_i cut himself_i shaving. (himself = John; subscript notation used to indicate this)
    - # John_i cut him_j shaving. (i ≠ j — a very odd sentence)

    Reflexive pronouns must be coreferential with a preceding argument of the same verb; non-reflexive pronouns cannot be.
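These hard constraints (agreement plus the reflexive condition) can be applied as a filter before any soft preferences are scored. A sketch, using 'argument of the same verb' as a crude proxy for the binding condition; the mention attributes are assumed to come from earlier processing.

```python
# Hard-constraint filter for pronoun-antecedent pairs (a simplification).
def passes_hard_constraints(pron: dict, ante: dict) -> bool:
    if pron["number"] != ante["number"]:                   # number agreement
        return False
    if ante["gender"] not in (pron["gender"], "unknown"):  # gender agreement
        return False
    co_arguments = pron["verb"] == ante["verb"]
    if pron["reflexive"]:
        return co_arguments       # himself must be bound by a co-argument
    return not co_arguments       # him must NOT be bound by a co-argument

john = {"number": "sg", "gender": "m", "verb": "cut"}
himself = {"number": "sg", "gender": "m", "verb": "cut", "reflexive": True}
him = {"number": "sg", "gender": "m", "verb": "cut", "reflexive": False}
print(passes_hard_constraints(himself, john))   # True:  John cut himself
print(passes_hard_constraints(him, john))       # False: * John_i cut him_i
```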

  • Hard constraints: Pleonastic pronouns

    Pleonastic pronouns are semantically empty, and don't refer:

    - It is snowing.
    - It is not easy to think of good examples.
    - It is obvious that Kim snores.
    - It bothers Sandy that Kim snores.
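A rough pattern-based detector for pleonastic 'it', in the spirit of classic heuristics; the word lists and patterns below are illustrative, not a complete inventory.

```python
import re

# Weather verbs and adjectives that commonly licence pleonastic 'it'.
WEATHER = r"(snowing|raining|hailing)"
ADJ = r"(easy|hard|difficult|obvious|clear|likely)"
PATTERNS = [
    re.compile(rf"\bit\s+is\s+{WEATHER}\b", re.I),        # It is snowing
    re.compile(rf"\bit\s+is\s+(not\s+)?{ADJ}\b", re.I),   # It is (not) easy ...
    re.compile(r"\bit\s+\w+s\s+\w+\s+that\b", re.I),      # It bothers Sandy that ...
]

def is_pleonastic(sentence: str) -> bool:
    """True if the sentence matches a known pleonastic-'it' pattern."""
    return any(p.search(sentence) for p in PATTERNS)

for s in ["It is snowing.", "It is not easy to think of good examples.",
          "It bothers Sandy that Kim snores.", "Kim likes it."]:
    print(s, "->", is_pleonastic(s))
```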

  • Soft preferences: Salience

    Recency: Kim has a big car. Sandy has a smaller one. Lee likes to drive it.
    Grammatical role: Subjects > objects > everything else: Fred went to the Grafton Centre with Bill. He bought a hat.
    Repeated mention: Entities that have been mentioned more frequently are preferred.
    Parallelism: Entities which share the same role as the pronoun in the same sort of sentence are preferred: Bill went with Fred to the Grafton Centre. Kim went with him to Lion Yard. (him = Fred)
    Coherence effects (mentioned above).
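These soft preferences can be combined into a single salience score. A minimal sketch; the weights are invented for illustration, where a real system would tune them on data.

```python
# Combining soft salience preferences into one score (illustrative weights).
ROLE_WEIGHT = {"subject": 3.0, "object": 2.0, "other": 1.0}

def salience(candidate: dict, pronoun_role: str) -> float:
    score = ROLE_WEIGHT.get(candidate["role"], 1.0)        # grammatical role
    score += 2.0 / (1 + candidate["sentence_distance"])    # recency
    score += 0.5 * candidate["mention_count"]              # repeated mention
    if candidate["role"] == pronoun_role:                  # parallelism
        score += 1.0
    return score

# Fred went to the Grafton Centre with Bill. He bought a hat.
candidates = [
    {"name": "Fred", "role": "subject", "sentence_distance": 1, "mention_count": 1},
    {"name": "Bill", "role": "other",   "sentence_distance": 1, "mention_count": 1},
]
print(max(candidates, key=lambda c: salience(c, "subject"))["name"])  # Fred
```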

  • World knowledge

    Sometimes inference will override soft preferences:

    Andrew Strauss again blamed the batting after England lost to Australia last night. They now lead the series three-nil.

    they is Australia.

    But a discourse can be odd if strong salience effects are violated:

    The England football team won last night. Scotland lost. ? They have qualified for the World Cup with a 100% record.



  • Anaphora resolution as supervised classification

    - Classification: training data labelled with class and features; derive the class for test data based on features.
    - For potential pronoun/antecedent pairings, the class is TRUE/FALSE.
    - Assume candidate antecedents are all NPs in the current sentence and the preceding 5 sentences (excluding pleonastic pronouns).

  • Example

    Niall Ferguson is prolific, well-paid and a snappy dresser. Stephen Moss hated him — at least until he spent an hour being charmed in the historian's Oxford study.

    Issues: detecting pleonastic pronouns and predicative NPs, deciding on treatment of possessives (the historian and the historian's Oxford study), named entities (e.g., Stephen Moss, not Stephen and Moss), allowing for cataphora, ...


  • Features

    Cataphoric: binary; t if pronoun before antecedent.
    Number agreement: binary; t if pronoun compatible with antecedent.
    Gender agreement: binary; t if gender agreement.
    Same verb: binary; t if the pronoun and the candidate antecedent are arguments of the same verb.
    Sentence distance: discrete; { 0, 1, 2, ... }
    Grammatical role: discrete; { subject, object, other }; the role of the potential antecedent.
    Parallel: binary; t if the potential antecedent and the pronoun share the same grammatical role.
    Linguistic form: discrete; { proper, definite, indefinite, pronoun }
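A sketch of computing this feature set for one pronoun-antecedent pair, assuming the mention attributes (position, number, gender, governing verb, grammatical role, form) have been supplied by earlier processing; treating an 'unknown' gender as compatible is an assumption of this sketch.

```python
# Feature extraction for one pronoun-antecedent pair.
def features(pron: dict, ante: dict) -> dict:
    return {
        "cataphoric": pron["position"] < ante["position"],
        "num_agree": pron["number"] == ante["number"],
        "gen_agree": ante["gender"] in (pron["gender"], "unknown"),
        "same_verb": pron["verb"] == ante["verb"],
        "sent_dist": pron["sent"] - ante["sent"],
        "role": ante["role"],                   # role of the antecedent
        "parallel": pron["role"] == ante["role"],
        "form": ante["form"],
    }

him = {"position": 10, "number": "sg", "gender": "m", "verb": "hate",
       "sent": 1, "role": "object", "form": "pronoun"}
niall = {"position": 0, "number": "sg", "gender": "m", "verb": "be",
         "sent": 0, "role": "subject", "form": "proper"}
print(features(him, niall))   # corresponds to the first row of the table below
```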

  • Feature vectors

    pron  ante      cat  num  gen  same  dist  role  par  form
    him   Niall F.  f    t    t    f     1     subj  f    prop
    him   Ste. M.   f    t    t    t     0     subj  f    prop
    him   he        t    t    t    f     0     subj  f    pron
    he    Niall F.  f    t    t    f     1     subj  t    prop
    he    Ste. M.   f    t    t    f     0     subj  t    prop
    he    him       f    t    t    f     0     obj   f    pron

  • Training data, from human annotation

    class  cata  num  gen  same  dist  role  par  form
    TRUE   f     t    t    f     1     subj  f    prop
    FALSE  f     t    t    t     0     subj  f    prop
    FALSE  t     t    t    f     0     subj  f    pron
    FALSE  f     t    t    f     1     subj  t    prop
    TRUE   f     t    t    f     0     subj  t    prop
    FALSE  f     t    t    f     0     obj   f    pron

  • Naive Bayes Classifier

    Choose the most probable class given a feature vector $\vec{f}$:

    $$\hat{c} = \operatorname*{argmax}_{c \in C} P(c \mid \vec{f})$$

    Apply Bayes' Theorem:

    $$P(c \mid \vec{f}) = \frac{P(\vec{f} \mid c)\,P(c)}{P(\vec{f})}$$

    The denominator is constant, so:

    $$\hat{c} = \operatorname*{argmax}_{c \in C} P(\vec{f} \mid c)\,P(c)$$

    Independent feature assumption ('naive'):

    $$\hat{c} = \operatorname*{argmax}_{c \in C} P(c) \prod_{i=1}^{n} P(f_i \mid c)$$
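A minimal implementation of this classifier over discrete features. Add-one smoothing is an addition to the formula above, used here to avoid zero probabilities on small training sets; the toy data mirrors the shape of the training table (with shortened feature names).

```python
from collections import Counter, defaultdict
import math

def train(rows):
    """rows: list of (feature_dict, label). Returns model parameters."""
    class_counts = Counter(label for _, label in rows)
    feat_counts = defaultdict(Counter)      # (label, feature) -> value counts
    values = defaultdict(set)               # feature -> observed values
    for feats, label in rows:
        for f, v in feats.items():
            feat_counts[(label, f)][v] += 1
            values[f].add(v)
    return class_counts, feat_counts, values

def classify(model, feats):
    class_counts, feat_counts, values = model
    total = sum(class_counts.values())
    def log_posterior(c):
        lp = math.log(class_counts[c] / total)        # log P(c)
        for f, v in feats.items():                    # + sum_i log P(f_i | c)
            counts = feat_counts[(c, f)]
            lp += math.log((counts[v] + 1) /          # add-one smoothing
                           (sum(counts.values()) + len(values[f])))
        return lp
    return max(class_counts, key=log_posterior)

data = [({"num": True, "same": False, "role": "subj"}, "TRUE"),
        ({"num": True, "same": True,  "role": "subj"}, "FALSE"),
        ({"num": True, "same": False, "role": "obj"},  "FALSE"),
        ({"num": True, "same": False, "role": "subj"}, "TRUE")]
model = train(data)
print(classify(model, {"num": True, "same": False, "role": "subj"}))  # TRUE
```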

  • Problems with simple classification model

    - Cannot implement the 'repeated mention' effect.
    - Cannot use information from previous links:

      Sturt think they can perform better in Twenty20 cricket. It requires additional skills compared with older forms of the limited over game.

      it should refer to Twenty20 cricket, but looked at in isolation could get resolved to Sturt. If there is a linkage between they and Sturt, then number agreement is plural.

    Not really pairwise: really need a discourse model with real world entities corresponding to clusters of referring expressions.
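One way to move from pairwise links towards such a discourse model is to cluster referring expressions into entities, e.g. with union-find, so that later decisions can consult the whole chain (as with they/Sturt above). A sketch; the mention and link lists are illustrative.

```python
# Group referring expressions into entity clusters from accepted links.
def cluster(mentions, links):
    parent = {m: m for m in mentions}
    def find(m):
        while parent[m] != m:
            parent[m] = parent[parent[m]]   # path compression
            m = parent[m]
        return m
    for a, b in links:                      # each accepted pairwise link
        parent[find(a)] = find(b)
    chains = {}
    for m in mentions:
        chains.setdefault(find(m), []).append(m)
    return list(chains.values())

mentions = ["Sturt", "they", "Twenty20 cricket", "It"]
links = [("they", "Sturt"), ("It", "Twenty20 cricket")]
print(cluster(mentions, links))
# [['Sturt', 'they'], ['Twenty20 cricket', 'It']]
```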

  • Evaluation

    The simple approach is link accuracy: assume the data is previously marked-up with pronouns and possible antecedents, each pronoun is linked to an antecedent, and measure the percentage correct. But:

    - Identification of non-pleonastic pronouns and antecedent NPs should be part of the evaluation.
    - Binary linkages don't allow for chains:

      Sally met Andrew in town and took him to the new restaurant. He was impressed.

    Multiple evaluation metrics exist because of such problems.
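A sketch of the simple link-accuracy measure, showing the chain problem above: if the system links He to him (which is Andrew) while the gold annotation links He to Andrew, strict link accuracy counts it wrong, which is one reason multiple metrics exist.

```python
# Link accuracy over pronoun -> antecedent choices.
def link_accuracy(gold: dict, system: dict) -> float:
    correct = sum(1 for pron, ante in gold.items() if system.get(pron) == ante)
    return correct / len(gold)

# Sally met Andrew in town and took him to the new restaurant. He was impressed.
gold = {"him": "Andrew", "He": "Andrew"}
system = {"him": "Andrew", "He": "him"}     # linked into the chain instead
print(link_accuracy(gold, system))          # 0.5, despite being 'right'
```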

  • Classification in NLP

    - Also sentiment classification, word sense disambiguation and many others; POS tagging (sequences).
    - Feature sets vary in complexity and in the processing needed to obtain the features. A statistical classifier allows some robustness to imperfect feature determination.
    - Acquiring training data is expensive.
    - Few hard rules for selecting a classifier: e.g., Naive Bayes often works even when the independence assumption is clearly wrong (as with pronouns). Experimentation, e.g., with the WEKA toolkit.
