Outline
Semantics
Lexical semantics
Lexical semantic relations
WordNet
Word Sense Disambiguation
• Lesk algorithm
• Yarowsky’s algorithm

Semantics
What is “Semantics”? The study of meaning in language

“When I use a word”, Humpty Dumpty said in rather a scornful tone, “it means just what I choose it to mean – neither more nor less.”
Lewis Carroll, Through the Looking-Glass

What does meaning mean?
• Relationship of linguistic expression to the real world
• Relationship of linguistic expressions to each other

This Lecture
We’ll start by focusing on the meaning of words: lexical semantics.
Later on:
• meaning of phrases and sentences
• how to construct that from meanings of words

From Language to the World
What does telephone mean?
• Picks out all of the objects in the world that are telephones (its referents)

Its extensional definition
[Figure: objects divided into "telephones" and "not telephones"]

Relationship of Linguistic Expressions
How would you define telephone? E.g., to a three-year-old, or to a friendly Martian.

Dictionary Definition
http://dictionary.reference.com/browse/telephone

Its intensional definition
• The necessary and sufficient conditions to be a telephone

This presupposes you know what “apparatus”, “sound”, “speech”, etc. mean.
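
One way to picture the contrast between the two kinds of definition is as a set versus a predicate. The sketch below is only an illustration: the toy world, the property names (`is_apparatus`, `transmits_speech`) and the conditions themselves are assumptions made up for this example, not a real definition of a telephone.

```python
# Toy world: every object is a dict of properties (all hypothetical).
world = [
    {"name": "rotary phone", "is_apparatus": True, "transmits_speech": True},
    {"name": "smartphone", "is_apparatus": True, "transmits_speech": True},
    {"name": "toaster", "is_apparatus": True, "transmits_speech": False},
    {"name": "banana", "is_apparatus": False, "transmits_speech": False},
]

# Extensional definition: simply list the referents.
telephone_extension = {"rotary phone", "smartphone"}

# Intensional definition: necessary and sufficient conditions, as a predicate.
def is_telephone(obj):
    return obj["is_apparatus"] and obj["transmits_speech"]

# Applying the predicate to the world recovers the extension.
derived_extension = {obj["name"] for obj in world if is_telephone(obj)}
print(derived_extension == telephone_extension)
```

Note that the predicate only works if the properties it mentions are already understood, which mirrors the point about presupposed words.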
Lexical Semantics Jargon
Lexeme: Pairing of a particular form (orthographic or phonological) with its meaning.
For example, the lexeme BANK (noun) consists of bank and banks, but not banker. BANKER is a lexeme of its own!

Lexicon: Finite list of lexemes

Lemma: The grammatical form that is used to represent a lexeme.
The lemma for sing, sang, sung is sing. The specific form (e.g. sang) is called a word form.

Lemmatization: The process of mapping a word form to a lemma.
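
The jargon above can be made concrete with a small lookup table. This is a minimal sketch, assuming a hand-built table (`WORDFORM_TO_LEMMA` and both helper functions are hypothetical names for this example); a real lemmatizer relies on a full lexicon and morphological rules.

```python
# Hypothetical hand-built table mapping word forms to lemmas.
WORDFORM_TO_LEMMA = {
    "sing": "sing", "sang": "sing", "sung": "sing",
    "bank": "bank", "banks": "bank",
    "banker": "banker", "bankers": "banker",  # BANKER is its own lexeme
}

def lemmatize(wordform):
    """Map a word form to its lemma; unknown forms are returned lowercased."""
    key = wordform.lower()
    return WORDFORM_TO_LEMMA.get(key, key)

def wordforms_of(lemma):
    """Collect every listed word form that belongs to the lexeme."""
    return sorted(w for w, l in WORDFORM_TO_LEMMA.items() if l == lemma)

print(lemmatize("sang"))      # lemma of the word form "sang"
print(wordforms_of("bank"))   # word forms of the lexeme BANK
```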
Sense and Reference (Frege, 1892)
Frege was one of the first to distinguish between the sense of a term, and its reference.

Same referent, different senses:
Venus
the morning star
the evening star

Word Senses
The meaning of a lemma can vary enormously given the context:
• A bank can hold investments in a custodial account in the client’s name.
• As agriculture burgeons on the east bank, the river shrinks even more.

A word sense (or simply sense) is a discrete representation of one aspect of the meaning of a word.
Next: Relations between different senses (and generally words)
Later: How to disambiguate between varying senses?

Lexical Semantic Relations
How specifically do terms relate to each other? Here are some ways:

Hypernymy/hyponymy
Synonymy
Antonymy
Homonymy
Polysemy
Metonymy
Synecdoche
Holonymy/meronymy

Synonymy and Antonymy
Synonymy
(Roughly) same meaning
offspring, descendant, spawn
happy, joyful, merry

Antonymy
(Roughly) opposite meaning
synonym / antonym
happy / sad
descendant / ancestor

Homonymy
Same form, different (and unrelated) meaning
Homophone – same sound
• e.g., son vs. sun

Homograph – same written form
• e.g., lead (noun) vs. lead (verb)

Polysemy
Multiple related meanings

S: (n) newspaper, paper (a daily or weekly publication on folded sheets; contains news and articles and advertisements) "he read his newspaper at breakfast"
S: (n) newspaper, paper, newspaper publisher (a business firm that publishes newspapers) "Murdoch owns many newspapers"
S: (n) newspaper, paper (the physical object that is the product of a newspaper publisher) "when it began to rain he covered his head with a newspaper"
S: (n) newspaper, newsprint (cheap paper made from wood pulp and used for printing newspapers) "they used bales of newspaper every day"

Homonymy vs Polysemy
Homonymy: unrelated meaning
Polysemy: related meaning

S: (n) position, place (the particular portion of space occupied by something) "he put the lamp back in its place"
S: (n) military position, position (a point occupied by troops for tactical reasons)
S: (n) position, view, perspective (a way of regarding situations or topics etc.) "consider what follows from the positivist view"
S: (n) position, posture, attitude (the arrangement of the body and its limbs) "he assumed an attitude of surrender"
S: (n) status, position (the relative position or standing of things or especially persons in a society) "he had the status of a minor"; "the novel attained the status of a classic"; "atheists do not enjoy a favorable position in American life"
S: (n) position, post, berth, office, spot, billet, place, situation (a job in an organization) "he occupied a post in the treasury"

Metonymy
Substitution of one entity for another related one

We ordered many delicious dishes at the restaurant.
I worked for the local paper for five years.
Quebec City is cutting our budget again.
The loonie is at an 11-year low.

Synecdoche – a specific kind of metonymy involving whole-part relations

All hands on deck!
Don’t be a <censored body part>

Holonymy/meronymy
Some kind of whole/part relationship

Subtypes               Holonym   Meronym
groups and members     class     student
whole and part         car       windshield
whole and substance    chair     wood

Quiz
Classify the following examples in terms of what lexical semantic relation they exhibit

cold              freezing
they’re           their
hair              head
enemy             friend
cut (hair)        cut (bread)
George Clooney    actor

WordNet (Miller et al., 1990)
WordNet is a lexical resource organized by synsets
• Nodes: synsets
• Edges: lexical semantic relation between two synsets

Separate hierarchy for different parts of speech
• Nouns, verbs, adjectives, adverbs
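
The nodes-and-edges view can be sketched with plain dictionaries. This is a hypothetical miniature, not WordNet's actual data: the node labels and the hypernym chain above "hand" are simplified assumptions, and only one edge type (hypernymy) is modelled.

```python
# Nodes: synsets, each a set of synonymous lemmas (illustrative subset).
SYNSETS = {
    "hand": ["hand", "manus", "mitt", "paw"],
    "fist": ["fist", "clenched fist"],
    "extremity": ["extremity"],
    "body part": ["body part"],
}

# Edges: each synset points to its direct hypernym (None at the root).
# This chain is a simplified assumption, not copied from WordNet.
HYPERNYM = {
    "fist": "hand",
    "hand": "extremity",
    "extremity": "body part",
    "body part": None,
}

def hypernym_path(synset):
    """Follow hypernym edges from a synset up to the root."""
    path = [synset]
    while HYPERNYM.get(path[-1]) is not None:
        path.append(HYPERNYM[path[-1]])
    return path

def is_hyponym_of(synset, ancestor):
    """True if `ancestor` is reachable from `synset` via hypernym edges."""
    return ancestor in hypernym_path(synset)[1:]

print(hypernym_path("fist"))
print(is_hyponym_of("fist", "body part"))
```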
A Synset Entry
S: (n) hand, manus, mitt, paw (the (prehensile) extremity of the superior limb) "he had the hands of a surgeon"; "he extended his mitt"

direct hyponym / full hyponym
  S: (n) fist, clenched fist (a hand with the fingers clenched in the palm (as for hitting))
  S: (n) hooks, meat hooks, maulers (large strong hand (as of a fighter)) "wait till I get my hooks on him"
  S: (n) right, right hand (the hand that is on the right side of the body) "he writes with his right hand but pitches with his left"; "hit him with quick rights to the body"
  S: (n) left, left hand (the hand that is on the left side of the body) "jab with your left"
part meronym
direct hypernym / inherited hypernym / sister term
part holonym
  S: (n) arm (a human limb; technically the part of the superior limb between the shoulder and the elbow but commonly used to refer to the whole superior limb)
  S: (n) homo, man, human being, human (any living or extinct member of the family Hominidae characterized by superior intelligence, articulate speech, and erect carriage)
derivationally related form

http://wordnetweb.princeton.edu/perl/webwn?o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&s=hand&i=8&h=1100000000000000000000000#c

WordNet Has an NLTK Interface
>>> from nltk.corpus import wordnet

Some useful functions:
>>> wordnet.synsets(<query_term>)
>>> wordnet.synset(<synset_name>)

Remember you can use dir and help to get a list of functions in Python.

Word Sense Disambiguation
Figuring out which word sense is expressed in context

His hands were tired from hours of typing. → hand.n.01

Due to her superior education, her hand was flowing and graceful. → hand.n.03

General idea: use words in the context to disambiguate. Which words above would help with this?

Possible Computational Approaches
A heuristic algorithm
• Lesk’s algorithm

Supervised machine learning
• Possible, but requires a lot of work to annotate word sense information, which we want to avoid

Unsupervised, or minimally supervised machine learning
• Yarowsky’s algorithm

Lesk’s Algorithm (1986)
More like a family of algorithms which, in essence, choose the sense whose dictionary definition shares the most words with the target word’s neighborhood.

Steps to disambiguate word w:
1. Construct a bag of words representation of the context, B
2. For each candidate sense s_i of word w:
   • Calculate a signature of the sense by taking all of the words in the dictionary definition of s_i
   • Compute Overlap(B, signature(s_i))
3. Select the sense with the highest overlap score
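
The three steps can be sketched in a few lines of Python. This is a minimal sketch of the simplest variant, assuming a hypothetical two-sense mini-dictionary whose glosses are written for illustration (they are not WordNet's), and assuming that Overlap simply counts shared word types; the sense labels and function names are made up for this example.

```python
# Hypothetical mini-dictionary; the glosses are illustrative, not WordNet's.
SENSES = {
    "bank/finance": "an institution that accepts deposits of money and makes loans",
    "bank/river": "the sloping land beside a river or other body of water",
}

def bag_of_words(text):
    """Lowercase, strip surrounding punctuation, return the set of tokens."""
    tokens = (t.strip(".,;:!?\"'()") for t in text.lower().split())
    return {t for t in tokens if t}

def overlap(context_bag, signature):
    """Overlap score: number of word types shared by context and signature."""
    return len(context_bag & signature)

def simplified_lesk(sentence, senses):
    """Return (best sense, all scores); ties go to the first sense listed."""
    context = bag_of_words(sentence)                       # step 1: B
    scores = {name: overlap(context, bag_of_words(gloss))  # step 2: signatures
              for name, gloss in senses.items()}
    best = max(scores, key=scores.get)                     # step 3: argmax
    return best, scores

best, scores = simplified_lesk(
    "She walked along the bank of the river to the water", SENSES)
print(best, scores)
```

In this sketch function words such as "the" and "of" count toward the overlap, so part of the score is uninformative.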
Model Variations
Which dictionary to use? NLTK?
Use only dictionary definitions? Or include example sentences?
Ignore uninformative stopwords (e.g., the, a, of)?
Lemmatize when considering matches (tomatoes matches tomato)?

Exercise
Run the Lesk algorithm using NLTK/WordNet. Ignore stopwords, include examples, count lemma overlap. Consider only the top two senses of bank.

1. I’ll deposit the cheque at the bank.
2. The bank overflowed and water flooded the town.
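
To see how the variations change the scores, the sketch below exposes each one as a switch. It is not a solution to the exercise: the two sense entries are hypothetical stand-ins written for this example (not WordNet's glosses or examples), the stopword list is a small assumption, and `crude_lemma` is a deliberately rough substitute for a real lemmatizer. With WordNet's actual text the numbers will differ.

```python
STOPWORDS = {"the", "a", "an", "of", "at", "and", "i'll", "that", "he"}

# Hypothetical entries for two senses of "bank": (definition, example sentences).
SENSES = {
    "bank/river": ("sloping land beside a body of water",
                   ["the river overflowed its banks"]),
    "bank/finance": ("a financial institution that accepts deposits",
                     ["he cashed a cheque at the bank"]),
}

def crude_lemma(token):
    """Very rough stand-in for a lemmatizer: strip a final plural 's'."""
    return token[:-1] if token.endswith("s") and len(token) > 3 else token

def bag(text, remove_stopwords, lemmatize):
    """Tokenize on whitespace and apply the selected normalizations."""
    tokens = [t.strip(".,;:!?\"()") for t in text.lower().split()]
    tokens = [t for t in tokens if t]
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    if lemmatize:
        tokens = [crude_lemma(t) for t in tokens]
    return set(tokens)

def lesk_scores(sentence, target, remove_stopwords=True,
                use_examples=True, lemmatize=True):
    """Overlap score per sense; the target word itself is never counted."""
    context = bag(sentence, remove_stopwords, lemmatize) - {target}
    scores = {}
    for sense, (definition, examples) in SENSES.items():
        text = definition + " " + " ".join(examples if use_examples else [])
        signature = bag(text, remove_stopwords, lemmatize) - {target}
        scores[sense] = len(context & signature)
    return scores

s1 = "I'll deposit the cheque at the bank."
s2 = "The bank overflowed and water flooded the town."
print(lesk_scores(s1, "bank"))
print(lesk_scores(s2, "bank"))
print(lesk_scores(s1, "bank", lemmatize=False))     # "deposits" no longer matches
print(lesk_scores(s2, "bank", use_examples=False))  # only the definition is used
```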
Yarowsky’s Algorithm (1995)
A method based on bootstrapping
Goal: Learn a classifier for a target word
Steps:
1. Gather a data set with the target word to be disambiguated
2. Automatically label a small seed set of examples
3. Repeat the following for a while:
   • Train a supervised learning algorithm from the seed set
   • Apply the supervised model to the entire data set
   • Keep the highly confident classification outputs to be the new seed set
4. Use the last model as the final model
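
The bootstrapping loop can be sketched as follows. Everything concrete here is an assumption for illustration: the eight toy contexts, the seed words "money" and "river", the confidence threshold, and above all the classifier, which is a simple word-vote counter standing in for whatever supervised learner is used in step 3.

```python
from collections import Counter, defaultdict

# Step 1: toy contexts of the target word "bank" (hypothetical data).
DATA = [
    {"deposit", "money", "account"},
    {"money", "loan", "interest"},
    {"loan", "account", "teller"},
    {"teller", "interest", "branch"},
    {"river", "water", "mud"},
    {"river", "fishing", "shore"},
    {"shore", "mud", "grass"},
    {"grass", "fishing", "boat"},
]
SEED_RULES = {"money": "finance", "river": "river"}  # assumed seed collocations

def seed_labels(data):
    """Step 2: label only the contexts that match exactly one seed sense."""
    labels = {}
    for i, ctx in enumerate(data):
        hits = {SEED_RULES[w] for w in ctx if w in SEED_RULES}
        if len(hits) == 1:
            labels[i] = hits.pop()
    return labels

def train(data, labels):
    """Stand-in learner: count how often each word occurs with each sense."""
    counts = defaultdict(Counter)
    for i, sense in labels.items():
        for w in data[i]:
            counts[w][sense] += 1
    return counts

def classify(counts, ctx):
    """Sum word votes; confidence is the winning sense's share of the votes."""
    votes = Counter()
    for w in ctx:
        votes.update(counts.get(w, {}))
    if not votes:
        return None, 0.0
    sense, top = votes.most_common(1)[0]
    return sense, top / sum(votes.values())

def bootstrap(data, threshold=0.8, max_rounds=10):
    labels = seed_labels(data)
    for _ in range(max_rounds):                      # step 3
        model = train(data, labels)
        new_labels = {}
        for i, ctx in enumerate(data):
            sense, conf = classify(model, ctx)
            if sense is not None and conf >= threshold:
                new_labels[i] = sense                # keep confident outputs
        if new_labels == labels:                     # nothing changed: stop
            break
        labels = new_labels
    return train(data, labels), labels               # step 4

final_model, final_labels = bootstrap(DATA)
print(len(seed_labels(DATA)), "seed examples ->", len(final_labels), "labeled")
```

Only four contexts match a seed word, yet the labels spread to the rest through shared words such as "loan" and "shore".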
Step 3: Train a Classifier
Yarowsky went with a decision-list classifier (we didn’t cover this one in class)

Note how new collocations are found for each sense
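
Since decision lists were not covered in class, here is a rough sketch of the idea: each context word becomes a rule pointing to one sense, rules are ranked by how strongly they favour that sense (here a smoothed log-likelihood ratio), and only the strongest matching rule decides. The labeled contexts, the two sense names and the smoothing constant `alpha` are assumptions for illustration, not values from the paper.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled contexts for "bank": (context words, sense).
LABELED = [
    ({"money", "deposit"}, "finance"),
    ({"money", "loan"}, "finance"),
    ({"loan", "water"}, "finance"),
    ({"river", "water"}, "river"),
    ({"river", "mud"}, "river"),
    ({"water", "mud"}, "river"),
]

def build_decision_list(labeled, senses=("finance", "river"), alpha=0.1):
    """Rank (word -> sense) rules by smoothed absolute log-likelihood ratio."""
    a, b = senses
    counts = defaultdict(Counter)
    for ctx, sense in labeled:
        for w in ctx:
            counts[w][sense] += 1
    rules = []
    for w, c in counts.items():
        llr = math.log((c[a] + alpha) / (c[b] + alpha))
        rules.append((abs(llr), w, a if llr > 0 else b))
    return sorted(rules, reverse=True)  # strongest evidence first

def classify(rules, ctx, default=None):
    """Apply only the single strongest rule whose word occurs in the context."""
    for strength, w, sense in rules:
        if w in ctx:
            return sense
    return default

rules = build_decision_list(LABELED)
for strength, w, sense in rules:
    print(f"{strength:5.2f}  {w:8s} -> {sense}")
print(classify(rules, {"water", "loan"}))
```

In the toy data "water" appears with both senses, so its rule ranks last and is overridden whenever a stronger collocation such as "loan" is present.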
Results
96% accuracy on binary word sense distinctions
Same result as with supervised methods, but with minimal amounts of annotation effort!