Lexical Semantics COMP-550 Oct 17, 2017

Source: cs.mcgill.cajcheung/teaching/fall-2017/comp550/lectures/lecture13.pdf



Outline
Semantics
Lexical semantics
Lexical semantic relations
WordNet
Word Sense Disambiguation
• Lesk algorithm
• Yarowsky’s algorithm

Semantics
What is “Semantics”? The study of meaning in language

“When I use a word”, Humpty Dumpty said in rather a scornful tone, “it means just what I choose it to mean – neither more nor less.”

Lewis Carroll, Through the Looking-Glass

What does meaning mean?
• Relationship of linguistic expression to the real world
• Relationship of linguistic expressions to each other

This Lecture
We’ll start by focusing on the meaning of words: lexical semantics.
Later on:
• meaning of phrases and sentences
• how to construct that from meanings of words

From Language to the World
What does telephone mean?
• Picks out all of the objects in the world that are telephones (its referents)

Its extensional definition

[Figure: objects in the world, divided into "telephones" and "not telephones"]

Relationship of Linguistic Expressions
How would you define telephone? e.g., to a three-year-old, or to a friendly Martian.

Dictionary Definition
http://dictionary.reference.com/browse/telephone

Its intensional definition
• The necessary and sufficient conditions to be a telephone

This presupposes you know what “apparatus”, “sound”, “speech”, etc. mean.

Lexical Semantics Jargon
Lexeme: Pairing of a particular form (orthographic or phonological) with its meaning.

For example, the lexeme BANK (noun) consists of bank and banks, but not banker. BANKER is a lexeme of its own!

Lexicon: Finite list of lexemes
Lemma: The grammatical form that is used to represent a lexeme.

The lemma for sing, sang, sung is sing. The specific form (e.g., sang) is called a word form.

Lemmatization: The process of mapping a word form to a lemma.
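To make the lexeme/lemma/word form distinction concrete, here is a minimal sketch of lemmatization as a table lookup. The tiny lexicon is a hand-written assumption built from the examples on this slide (plus a few extra inflected forms added for illustration); a real lemmatizer would also use part of speech and morphological rules.

```python
# Toy lexicon (hypothetical entries): each lemma maps to the word forms of its lexeme.
LEXEMES = {
    "sing": ["sing", "sings", "sang", "sung", "singing"],
    "bank": ["bank", "banks"],
    "banker": ["banker", "bankers"],  # BANKER is a lexeme of its own
}

# Invert the lexicon so that every word form points back to its lemma.
FORM_TO_LEMMA = {form: lemma for lemma, forms in LEXEMES.items() for form in forms}


def lemmatize(word_form):
    """Return the lemma of a known word form; unknown forms come back lowercased."""
    key = word_form.lower()
    return FORM_TO_LEMMA.get(key, key)


print(lemmatize("sang"))    # sing
print(lemmatize("banks"))   # bank
print(lemmatize("banker"))  # banker
```

Note that this sketch ignores part of speech, so it cannot tell the noun bank from the verb bank.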

Sense and Reference (Frege, 1892)
Frege was one of the first to distinguish between the sense of a term, and its reference.

Same referent, different senses:
• Venus
• the morning star
• the evening star

Word Senses
The meaning of a lemma can vary enormously given the context:
• A bank can hold investments in a custodial account in the client’s name.
• As agriculture burgeons on the east bank, the river will shrink even more.

A word sense (or simply sense) is a discrete representation of one aspect of the meaning of a word.
Next: Relations between different senses (and generally words)
Later: How to disambiguate between varying senses?

Lexical Semantic Relations
How specifically do terms relate to each other? Here are some ways:

Hypernymy/hyponymy
Synonymy
Antonymy
Homonymy
Polysemy
Metonymy
Synecdoche
Holonymy/meronymy

Hypernymy/Hyponymy
ISA relationship

Hyponym     Hypernym
monkey      mammal
Montreal    city
red wine    beverage
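Because ISA is transitive, hypernymy can be modelled as links that are followed upward. The sketch below is only an illustration under simplifying assumptions: each term has at most one direct hypernym, and the `mammal -> animal` link is an added hypothetical entry that does not appear in the table above.

```python
# Toy ISA graph: each term maps to its direct hypernym.
# "monkey", "Montreal" and "red wine" come from the table; "mammal -> animal"
# is an illustrative assumption added to show transitivity.
DIRECT_HYPERNYM = {
    "monkey": "mammal",
    "mammal": "animal",
    "Montreal": "city",
    "red wine": "beverage",
}


def hypernym_chain(term):
    """Follow direct-hypernym links upward and return the list of ancestors."""
    chain = []
    seen = {term}
    while term in DIRECT_HYPERNYM:
        term = DIRECT_HYPERNYM[term]
        if term in seen:  # guard against accidental cycles in the toy data
            break
        seen.add(term)
        chain.append(term)
    return chain


def is_a(hyponym, hypernym):
    """True if `hypernym` is reachable from `hyponym` through ISA links."""
    return hypernym in hypernym_chain(hyponym)


print(hypernym_chain("monkey"))   # ['mammal', 'animal']
print(is_a("monkey", "animal"))   # True
print(is_a("mammal", "monkey"))   # False: the relation is not symmetric
```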

Synonymy and Antonymy
Synonymy
(Roughly) same meaning
• offspring, descendant, spawn
• happy, joyful, merry

Antonymy
(Roughly) opposite meaning
• synonym, antonym
• happy, sad
• descendant, ancestor

Homonymy
Same form, different (and unrelated) meaning
Homophone – same sound
• e.g., son vs. sun

Homograph – same written form
• e.g., lead (noun) vs. lead (verb)

Polysemy
Multiple related meanings

S: (n) newspaper, paper (a daily or weekly publication on folded sheets; contains news and articles and advertisements) "he read his newspaper at breakfast"
S: (n) newspaper, paper, newspaper publisher (a business firm that publishes newspapers) "Murdoch owns many newspapers"
S: (n) newspaper, paper (the physical object that is the product of a newspaper publisher) "when it began to rain he covered his head with a newspaper"
S: (n) newspaper, newsprint (cheap paper made from wood pulp and used for printing newspapers) "they used bales of newspaper every day"

Homonymy vs Polysemy
Homonymy: unrelated meaning
Polysemy: related meaning

S: (n) position, place (the particular portion of space occupied by something) "he put the lamp back in its place"
S: (n) military position, position (a point occupied by troops for tactical reasons)
S: (n) position, view, perspective (a way of regarding situations or topics etc.) "consider what follows from the positivist view"
S: (n) position, posture, attitude (the arrangement of the body and its limbs) "he assumed an attitude of surrender"
S: (n) status, position (the relative position or standing of things or especially persons in a society) "he had the status of a minor"; "the novel attained the status of a classic"; "atheists do not enjoy a favorable position in American life"
S: (n) position, post, berth, office, spot, billet, place, situation (a job in an organization) "he occupied a post in the treasury"

Metonymy
Substitution of one entity for another related one
• We ordered many delicious dishes at the restaurant.
• I worked for the local paper for five years.
• Quebec City is cutting our budget again.
• The loonie is at an 11-year low.

Synecdoche – a specific kind of metonymy involving whole-part relations
• All hands on deck!
• Don’t be a <censored body part>

Holonymy/meronymy
Some kind of whole/part relationship

Subtypes               Holonym    Meronym
groups and members     class      student
whole and part         car        windshield
whole and substance    chair      wood

Quiz
Classify the following examples in terms of what lexical semantic relation they exhibit

cold              freezing
they’re           their
hair              head
enemy             friend
cut (hair)        cut (bread)
George Clooney    actor

WordNet (Miller et al., 1990)
WordNet is a lexical resource organized by synsets
• Nodes: synsets
• Edges: lexical semantic relation between two synsets

Separate hierarchy for different parts of speech
• Nouns, verbs, adjectives, adverbs

A Synset Entry
S: (n) hand, manus, mitt, paw (the (prehensile) extremity of the superior limb) "he had the hands of a surgeon"; "he extended his mitt"

direct hyponym / full hyponym
  S: (n) fist, clenched fist (a hand with the fingers clenched in the palm (as for hitting))
  S: (n) hooks, meathooks, maulers (large strong hand (as of a fighter)) "wait till I get my hooks on him"
  S: (n) right, right hand (the hand that is on the right side of the body) "he writes with his right hand but pitches with his left"; "hit him with quick rights to the body"
  S: (n) left, left hand (the hand that is on the left side of the body) "jab with your left"

part meronym
direct hypernym / inherited hypernym / sister term
part holonym
  S: (n) arm (a human limb; technically the part of the superior limb between the shoulder and the elbow but commonly used to refer to the whole superior limb)
  S: (n) homo, man, human being, human (any living or extinct member of the family Hominidae characterized by superior intelligence, articulate speech, and erect carriage)

derivationally related form

http://wordnetweb.princeton.edu/perl/webwn?o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&s=hand&i=8&h=1100000000000000000000000#c

WordNet Has an NLTK Interface
>>> from nltk.corpus import wordnet

Some useful functions:
>>> wordnet.synsets(<query_term>)
>>> wordnet.synset(<synset_name>)

Remember you can use dir and help to get a list of functions in Python.

Word Sense Disambiguation
Figuring out which word sense is expressed in context

His hands were tired from hours of typing. → hand.n.01

Due to her superior education, her hand was flowing and graceful. → hand.n.03

General idea: use words in the context to disambiguate. Which words above would help with this?

Possible Computational Approaches
A heuristic algorithm
• Lesk’s algorithm

Supervised machine learning
• Possible, but requires a lot of work to annotate word sense information that we want to avoid

Unsupervised, or minimally supervised machine learning
• Yarowsky’s algorithm

Lesk’s Algorithm (1986)
More like a family of algorithms which, in essence, choose the sense whose dictionary definition shares the most words with the target word’s neighborhood.

Steps to disambiguate word $w$:
1. Construct a bag of words representation of the context, $B$
2. For each candidate sense $s_i$ of word $w$:
   • Calculate a signature of the sense by taking all of the words in the dictionary definition of $s_i$
   • Compute Overlap($B$, signature($s_i$))
3. Select the sense with the highest overlap score
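The three steps can be sketched in a few lines of Python. This is a simplified illustration, not Lesk's original implementation: the sense labels, glosses, and stop word list below are hand-written assumptions rather than entries from WordNet or any real dictionary, and overlap is counted on exact lowercase tokens.

```python
import re

# Hypothetical mini-dictionary: sense labels and glosses are invented for
# illustration and are not taken from WordNet.
SENSES = {
    "bank": {
        "bank.financial": "an institution that accepts deposits of money and makes loans",
        "bank.river": "sloping land beside a river or other body of water",
    }
}

# Small hand-picked stop word list (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "and", "at", "i", "in", "of", "or", "that", "the", "to"}


def bag_of_words(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return {tok for tok in re.findall(r"[a-z]+", text.lower()) if tok not in STOP_WORDS}


def simplified_lesk(word, context):
    """Return (best_sense, scores), where scores maps each sense to its overlap count."""
    context_bag = bag_of_words(context) - {word}      # step 1: build B
    scores = {}
    for sense, gloss in SENSES[word].items():         # step 2: each candidate sense
        signature = bag_of_words(gloss)
        scores[sense] = len(context_bag & signature)  # Overlap(B, signature(s_i))
    best = max(scores, key=scores.get)                # step 3 (ties go to the first sense)
    return best, scores


print(simplified_lesk("bank", "The bank overflowed and water flooded the town."))
print(simplified_lesk("bank", "I need money, so I will ask the bank about loans."))
```

With exact-token matching, a context word such as deposit would not match deposits in a gloss, so the choice of tokenization and normalization has a direct effect on the overlap scores.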

Financial Bank or Riverbank?
Construct from definitions of all senses of context words

Model Variations
Which dictionary to use? NLTK?
Use only dictionary definitions? Or include example sentences?
Ignore uninformative stop words (e.g., the, a, of)?
Lemmatize when considering matches (tomatoes matches tomato)?

Exercise
Run the Lesk algorithm using NLTK/WordNet. Ignore stop words, include examples, count lemma overlap. Consider only the top two senses of bank.

1. I’ll deposit the cheque at the bank.
2. The bank overflowed and water flooded the town.

Yarowsky’s Algorithm (1995)
A method based on bootstrapping
Goal: Learn a classifier for a target word
Steps:
1. Gather a data set with target word to be disambiguated
2. Automatically label a small seed set of examples
3. Repeat the following for a while:
   • Train a supervised learning algorithm from the seed set
   • Apply the supervised model to the entire data set
   • Keep the highly confident classification outputs to be the new seed set
4. Use the last model as the final model
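The bootstrapping loop can be sketched as follows. Everything concrete here is an assumption made for illustration: the six toy sentences, the seed words, the 0.1 smoothing constant, and the confidence threshold are invented, and the classifier is a stripped-down decision list that scores each context word by a smoothed log-ratio of its counts in the two senses. It is not a reproduction of Yarowsky's full procedure.

```python
import math
from collections import Counter

# Hypothetical toy contexts for the target word "plant" (invented for illustration).
DATA = [
    "plant life in the rainforest grows quickly",
    "animal and plant life near the river grows",
    "the rainforest plant species grows near water",
    "manufacturing plant workers assemble cars",
    "the car plant workers went on strike",
    "the assembly plant will assemble trucks",
]
SEED_RULES = {"life": "A", "manufacturing": "B"}  # step 2: seed collocations (assumed)


def features(sentence):
    """Context words of a sentence, without the target word and 'the'."""
    return set(sentence.split()) - {"plant", "the"}


def train(labeled):
    """Score each word by a smoothed log-ratio of its counts in sense A vs. sense B."""
    counts = {"A": Counter(), "B": Counter()}
    for sentence, sense in labeled.items():
        counts[sense].update(features(sentence))
    vocab = set(counts["A"]) | set(counts["B"])
    return {w: math.log((counts["A"][w] + 0.1) / (counts["B"][w] + 0.1)) for w in vocab}


def classify(model, sentence):
    """Apply the single strongest matching rule; return (sense, confidence)."""
    scores = [model[w] for w in features(sentence) if w in model]
    if not scores:
        return None, 0.0
    best = max(scores, key=abs)
    return ("A" if best > 0 else "B"), abs(best)


def bootstrap(data, seed_rules, threshold=1.0, max_rounds=10):
    # Step 2: label only the sentences that contain a seed word.
    labeled = {s: sense for s in data for w, sense in seed_rules.items() if w in features(s)}
    model = train(labeled)
    for _ in range(max_rounds):                    # step 3
        new_labeled = {}
        for s in data:                             # apply the model to the entire data set
            sense, conf = classify(model, s)
            if sense is not None and conf >= threshold:
                new_labeled[s] = sense             # keep only confident outputs
        if new_labeled == labeled:                 # nothing changed: stop early
            break
        labeled = new_labeled
        model = train(labeled)
    return model, labeled                          # step 4: last model is the final model


model, labeled = bootstrap(DATA, SEED_RULES)
for sentence in DATA:
    print(labeled.get(sentence), "|", sentence)
```

On this toy data the seeds label three of the six sentences directly; the remaining three are picked up in the first round through words that co-occur with the seeds, such as grows and workers.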

Yarowsky’s Example
Step 1: Disambiguating plant

Step 2: Initial Seed Set
Sense A:
• plant as in a life form

Other data

Sense B:
• plant as in a factory

Step 3: Train a Classifier
He went with a decision-list classifier (we didn’t cover this one in class)

Note how new collocations are found for each sense
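As a rough guide to how a decision list of this kind orders its rules: each collocation $c_i$ is scored by how strongly it favours one sense over the other, and the rules are sorted by that score. The notation below is a paraphrase, with the smoothing of zero counts left out:

```latex
\[
\mathrm{score}(c_i) \;=\; \left| \log \frac{P(\text{Sense}_A \mid c_i)}{P(\text{Sense}_B \mid c_i)} \right|
\]
```

A new example is labelled by the highest-scoring collocation that matches its context, and the score doubles as a confidence value.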

Step 3: Change Seed Set
Use only the cases where the classifier is highly confident

Results
96% accuracy on binary word sense distinctions
Same result as with supervised methods, but with minimal amounts of annotation effort!

Notes on Yarowsky’s Algorithm
The key to any bootstrapping approach lies in its ability to create a larger training set from a small set of seeds:
• Need an accurate initial set of seeds
• Need a good confidence metric for picking good new examples to add to the training set