CS460/626: Natural Language Processing/Speech, NLP and the Web (Lecture 37: Semantics; Universal Networking Language), Pushpak Bhattacharyya, CSE Dept., IIT Bombay, 12th April, 2011


Page 1: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

CS460/626 : Natural Language Processing/Speech, NLP and the Web

(Lecture 37– Semantics; Universal Networking Language)

Pushpak Bhattacharyya, CSE Dept., IIT Bombay

12th April, 2011

Page 2: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

Semantics: wikipedia

• Semantics (from Greek sēmantiká, neuter plural of sēmantikós) is the study of meaning.

• It typically focuses on the relation between signifiers, such as words, phrases, signs and symbols, and what they stand for, their denotata.

Page 3: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

Computational Semantics: wikipedia

• Computational semantics is the study of how to automate the process of constructing and reasoning with meaning representations of natural language expressions.

• Some traditional topics of interest are: construction of meaning representations, semantic underspecification, anaphora resolution, presupposition projection, and quantifier scope resolution.

• Methods employed usually draw from formal semantics or statistical semantics.

• Computational semantics has points of contact with the areas of lexical semantics (word sense disambiguation and semantic role labeling), discourse semantics, knowledge representation and automated reasoning (in particular, automated theorem proving).

• Since 1999 there has been an ACL special interest group on computational semantics, SIGSEM.

Page 4: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

A hurdle: signifier-denotata dichotomy

• Divide between a word and what it stands for
• “red” is NOT red in colour
• “red wine”, “red rose”, “he is in the red” denote very different senses of the word
• Translation into another language reveals this difference

Page 5: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

A Perspective

Morphology

Lexicon

Syntax

Semantics

Pragmatics

Discourse

Page 6: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

Our tryst with semantics:

Universal Networking Language (UNL)

Page 7: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

Motivation

• Extraction of semantics, i.e., deep meaning, is important for many applications: Machine Translation, Meaning-based IR, CLIR
• Robust, scalable & efficient methods of knowledge extraction are required
• Machine Translation and Cross Lingual IR: a need of the hour for crossing the language barrier

Page 8: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

Interlingua: a vehicle for machine translation

[Figure: Interlingua (UNL) at the centre, connected to English, French, Hindi and Chinese by analysis and generation arrows]

Page 9: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

UNL: a United Nations project

• Started in 1996
• 10 year program
• 15 research groups across continents
• First goal: generators
• Next goal: analysers (needs solving various ambiguity problems)
• Current active language groups:
    • UNL_French (GETA-CLIPS, IMAG)
    • UNL_English+Hindi
    • UNL_Italian (Univ. of Pisa)
    • UNL_Portuguese (Univ. of Sao Paulo, Brazil)
    • UNL_Russian (Institute of Linguistics, Moscow)
    • UNL_Spanish (UPM, Madrid)

Page 10: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

World-wide Universal Networking Language (UNL) Project

[Figure: UNL, a language independent meaning representation, at the centre, connected to English, Russian, Japanese, Hindi, Spanish, Marathi and others]

Page 11: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011


The UNL MT System: an Overview

Page 12: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

NLP@IITB

Page 13: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

Foundations and Applications

• UNL Foundations
    • Semantic Relations
    • Universal Words
    • Attributes
    • How to write UNL expressions
• UNL Applications
    • Machine Translation: Rule based and Statistical
    • Search
    • Text Entailment
    • Sentiment Analysis

Page 14: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

Language Processing & Understanding

• Information Extraction: Part of Speech tagging, Named Entity Recognition, Shallow Parsing, Summarization
• Machine Learning: Semantic Role Labeling, Sentiment Analysis, Text Entailment (web 2.0 applications), using graphical models, support vector machines, neural networks
• IR: Cross Lingual Search, Crawling, Indexing, Multilingual Relevance Feedback
• Machine Translation: Statistical, Interlingua Based, English-Indian languages, Indian languages-Indian languages, Indowordnet

Resources: http://www.cfilt.iitb.ac.in
Publications: http://www.cse.iitb.ac.in/~pb

Linguistics is the eye and computation the body

Page 15: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

UNL represents knowledge: John eats rice with a spoon

[Figure: UNL graph of the sentence, with callouts marking its semantic relations, attributes and Universal Words]

Repository of 42 semantic relations and 84 attribute labels

Page 16: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

Sentence embeddings

Deepa claimed that she had composed a poem.

[UNL]
agt(claim.@entry.@past, Deepa)
obj(claim.@entry.@past, :01)
agt:01(compose.@past.@entry.@complete, she)
obj:01(compose.@past.@entry.@complete, poem.@indef)
[\UNL]
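The relation lines above follow a regular shape: a relation label, an optional scope id such as :01, and two arguments. A minimal Python sketch of reading them into tuples is given below. The `Relation` type, the `parse_relation` function and the assumption that labels are three lowercase letters are illustrative choices, not part of any UNL tool.

```python
import re
from typing import NamedTuple, Optional


class Relation(NamedTuple):
    label: str            # relation name, e.g. "agt"
    scope: Optional[str]  # scope id such as "01", or None at the top level
    head: str             # first argument, attributes included
    tail: str             # second argument, attributes included


# Assumed shape: three-letter label, optional ":NN" scope, two arguments.
_REL = re.compile(
    r"^\s*([a-z]{3})(?::(\d{2}))?\s*\(\s*(.+?)\s*,\s*(.+?)\s*\)\s*$"
)


def parse_relation(line: str) -> Relation:
    """Parse one UNL relation line into a Relation tuple."""
    m = _REL.match(line)
    if m is None:
        raise ValueError(f"not a UNL relation: {line!r}")
    return Relation(*m.groups())


EXAMPLE = """\
agt(claim.@entry.@past, Deepa)
obj(claim.@entry.@past, :01)
agt:01(compose.@past.@entry.@complete, she)
obj:01(compose.@past.@entry.@complete, poem.@indef)
"""

relations = [parse_relation(line) for line in EXAMPLE.splitlines()]
for rel in relations:
    print(rel)
```

The two relations tagged with scope 01 make up the embedded clause, which the top-level obj relation refers to as :01.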

Page 17: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

Constituents of Universal Networking Language

• Universal Words (UWs)
• Relations
• Attributes
• Knowledge Base

Page 18: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

UNL Graph

He forwarded the mail to the minister.

[Figure: UNL graph with nodes forward(icl>send) (@entry, @past), he(icl>person), mail(icl>collection) (@def) and minister(icl>person) (@def), joined by agt, obj and gol arcs]

Page 19: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

UNL Expression

agt(forward(icl>send).@entry.@past, he(icl>person))
obj(forward(icl>send).@entry.@past, mail(icl>collection).@def)
gol(forward(icl>send).@entry.@past, minister(icl>person).@def)
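A UNL expression is a labelled directed graph written out as a list of binary relations. The Python sketch below stores the graph for "He forwarded the mail to the minister." and renders it back as relation strings. It assumes that obj points at the thing sent (the mail) and gol at the recipient (the minister); all names (`TRIPLES`, `ATTRIBUTES`, `build_graph`, `entry_node`, `to_expression`) are hypothetical.

```python
from collections import defaultdict

# (relation, head UW, tail UW) triples.
# Assumption: obj is the mail (thing sent), gol is the minister (recipient).
TRIPLES = [
    ("agt", "forward(icl>send)", "he(icl>person)"),
    ("obj", "forward(icl>send)", "mail(icl>collection)"),
    ("gol", "forward(icl>send)", "minister(icl>person)"),
]

# Attributes attached to each node of the graph.
ATTRIBUTES = {
    "forward(icl>send)": ["@entry", "@past"],
    "mail(icl>collection)": ["@def"],
    "minister(icl>person)": ["@def"],
}


def build_graph(triples):
    """Adjacency list: head UW -> list of (relation, tail UW)."""
    graph = defaultdict(list)
    for label, head, tail in triples:
        graph[head].append((label, tail))
    return dict(graph)


def entry_node(attributes):
    """Return the single node that carries @entry."""
    entries = [uw for uw, attrs in attributes.items() if "@entry" in attrs]
    if len(entries) != 1:
        raise ValueError("expected exactly one @entry node")
    return entries[0]


def render(uw, attributes):
    """Append a node's attributes in dotted form, e.g. 'x.@entry.@past'."""
    return uw + "".join("." + attr for attr in attributes.get(uw, []))


def to_expression(triples, attributes):
    """Write the graph out as UNL relation strings."""
    return [
        f"{label}({render(head, attributes)}, {render(tail, attributes)})"
        for label, head, tail in triples
    ]


graph = build_graph(TRIPLES)
for line in to_expression(TRIPLES, ATTRIBUTES):
    print(line)
```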

Page 20: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

What is a Universal Word (UW)?

• Words of UNL
• Constitute the UNL vocabulary, the syntactic-semantic units to form UNL expressions
• A UW represents a concept
    • Basic UW (an English word/compound word/phrase with no restrictions or Constraint List)
    • Restricted UW (with a Constraint List)
• Examples: “crane(icl>device)”, “crane(icl>bird)”

Page 21: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

The Lexicon

Format of the dictionary entry:

[headword] {} “Universal word” (Attribute list);

e.g., [minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN);
      (head word: minister; universal word: minister(icl>person); attributes: N,ANIMT,PHSCL,PRSN)

Attributes:
• Morphological - Pl (plural), V_ed (past tense form)
• Syntactic - V (verb), VOA (verb of action)
• Semantic - ANIMT (animate), PLACE, TIME

Page 22: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

The Lexicon (contd.)

He forwarded the mail to the minister.

Content words (headword, Universal Word, attributes):

[forward] {} “forward(icl>send)” (V,VOA) <E,0,0>;
[mail] {} “mail(icl>message)” (N,PHSCL,INANI) <E,0,0>;
[minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN) <E,0,0>;
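Dictionary entries of this shape can be read with a single regular expression. The sketch below is a minimal Python illustration, not the parser of any UNL tool: `parse_entry` and its field names are hypothetical, and the trailing `<E,0,0>` group is kept as an uninterpreted list because the slides do not explain its fields.

```python
import re

# [headword] {} "universal word" (attribute list) <optional flags>;
# Both straight and curly double quotes are accepted around the UW.
ENTRY = re.compile(
    r'\[(?P<headword>[^\]]+)\]\s*\{\}\s*'
    r'["“](?P<uw>[^"”]+)["”]\s*'
    r'\((?P<attrs>[^)]*)\)\s*'
    r'(?:<(?P<flags>[^>]*)>)?\s*;'
)


def _split(field):
    """Split a comma-separated field, tolerating None and stray spaces."""
    if not field:
        return []
    return [item.strip() for item in field.split(",") if item.strip()]


def parse_entry(line):
    """Parse one dictionary line into headword, UW, attributes and flags."""
    m = ENTRY.fullmatch(line.strip())
    if m is None:
        raise ValueError(f"not a dictionary entry: {line!r}")
    return {
        "headword": m["headword"],
        "uw": m["uw"],
        "attributes": _split(m["attrs"]),
        "flags": _split(m["flags"]),  # meaning not specified in the slides
    }


entry = parse_entry(
    '[minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN) <E,0,0>;'
)
print(entry)
```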

Page 23: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

The Lexicon (contd.)

He forwarded the mail to the minister.

Function words (headword, Universal Word, attributes):

[he] {} “he” (PRON,SUB,SING,3RD) <E,0,0>;
[the] {} “the” (ART,THE) <E,0,0>;
[to] {} “to” (PRE,#TO) <E,0,0>;

Page 24: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

Hindi example: noun example 1/2

Universal word: farmer(icl>creator)

Headwords and attributes:
E (English): farmer; N,ANIMT,FAUNA,MML,PRSN
M (Marathi): शेतकरी; N,M,ANIMT,FAUNA,MML,PRSN
H (Hindi): किसान; N,M,ANIMT,FAUNA,MML,PRSN,Na

Page 25: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

The Features of a UW

• Every concept existing in any language must correspond to a UW
• The constraint list should be as small as necessary to disambiguate the headword
• Every UW should be defined in the UNL Knowledge-Base

Page 26: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

Restricted UWs

• Examples
    • He will hold office until the spring of next year.
    • The spring was broken.
• Restricted UWs, which are headwords with a constraint list, for example:
    • “spring(icl>season)”
    • “spring(icl>device)”
    • “spring(icl>jump)”
    • “spring(icl>fountain)”
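A restricted UW is a headword followed by a constraint list in parentheses, so it can be split mechanically. The Python sketch below does this and groups the senses of one headword; `parse_uw` and `senses_by_headword` are hypothetical helper names, and comma-separated constraints of the form relation>target are an assumed syntax.

```python
import re
from collections import defaultdict

# headword, optionally followed by "(constraint list)"
UW = re.compile(r"^(?P<headword>[^()]+?)(?:\((?P<constraints>.*)\))?$")


def parse_uw(uw):
    """Split a UW into its headword and a list of (relation, target) pairs."""
    m = UW.match(uw.strip())
    if m is None:
        raise ValueError(f"not a universal word: {uw!r}")
    pairs = []
    if m.group("constraints"):
        for item in m.group("constraints").split(","):
            relation, _, target = item.strip().partition(">")
            pairs.append((relation, target))
    return m.group("headword"), pairs


def senses_by_headword(uws):
    """Group constraint lists by headword to expose the ambiguity."""
    index = defaultdict(list)
    for uw in uws:
        headword, constraints = parse_uw(uw)
        index[headword].append(constraints)
    return dict(index)


SPRING = [
    "spring(icl>season)",
    "spring(icl>device)",
    "spring(icl>jump)",
    "spring(icl>fountain)",
]

senses = senses_by_headword(SPRING)
print(senses["spring"])
```

A basic UW such as `crane` parses to the headword with an empty constraint list.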

Page 27: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

How to create UWs?

• Pick up a concept
    • the concept of “crane” as “a device for lifting heavy loads”
    • or as “a long-legged bird that wades in water in search of food”
• Choose an English word for the concept
    • In the case of “crane”, since it is a word of English, the corresponding word should be ‘crane’
• Choose a constraint list for the word
    • [ ] ‘crane(icl>device)’
    • [ ] ‘crane(icl>bird)’

Page 28: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

How to create UNL expressions

Page 29: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

English sentences: basic structure

• A <verb> B
    John eats bread
    agt(eat.@entry, John)
    obj(eat.@entry, bread)
• A <verb>
    John sleeps
    aoj(sleep.@entry, John)
• A <be> B
    John is good
    aoj(good.@entry, John)

[Figure: graph templates for the patterns: a verb node linked to A and B by relations R1 and R2, and a verb node linked to A by aoj]

Page 30: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

Hindi sentences: basic structure

• A B <verb>
    John roti khaataa hai
    agt(eat.@entry, John)
    obj(eat.@entry, bread)
• A <verb>
    John sotaa hai
    aoj(sleep.@entry, John)
• A <be> B
    John acchaa hai
    aoj(good.@entry, John)

[Figure: graph templates for the patterns: a verb node linked to A and B by relations R1 and R2, and a verb node linked to A by aoj]
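The English and Hindi slides arrive at identical relations because word order is resolved during analysis. Once A, the predicate and B are identified, the UNL output is the same. The toy Python function below only encodes the three patterns shown on the slides; `basic_unl` is a hypothetical name, and identifying A, the predicate and B in a real sentence is assumed to be done elsewhere.

```python
def basic_unl(a, predicate, b=None):
    """Return UNL relations for the basic sentence patterns on the slides.

    a         -- the UW for A (e.g. "John")
    predicate -- the UW for the verb, or for B in "A <be> B" (e.g. "good")
    b         -- the UW for B in "A <verb> B", or None otherwise
    """
    head = f"{predicate}.@entry"
    if b is None:
        # "A <verb>" and "A <be> B" both yield a single aoj relation.
        return [f"aoj({head}, {a})"]
    # "A <verb> B" (English) or "A B <verb>" (Hindi): agent plus object.
    return [f"agt({head}, {a})", f"obj({head}, {b})"]


print(basic_unl("John", "eat", "bread"))  # John eats bread / John roti khaataa hai
print(basic_unl("John", "sleep"))         # John sleeps / John sotaa hai
print(basic_unl("John", "good"))          # John is good / John acchaa hai
```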

Page 31: Pushpak Bhattacharyya CSE Dept.,  IIT  Bombay   12 th  April, 2011

Complex English sentences: Use recursion on the basic structure

• A <verb> B
    John who is a good boy eats bread which is toasted

agt(eat.@entry, :01)
obj(eat.@entry, :02)
aoj:01(boy, John.@entry)
mod:01(boy, good)
obj:02(toast, bread.@entry.@focus)

[Figure: eat is linked by agt to scope :01 (boy, with aoj to John and mod to good) and by obj to scope :02 (toast, with obj to bread). Red arrows indicate entry nodes.]
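The recursion can be made concrete by replacing each scope reference with the entry node of that scope, which recovers the basic A <verb> B skeleton. The Python sketch below does this for the example sentence. It assumes the toast relation belongs to scope :02, as the :02 node in the figure suggests, and the function names are hypothetical.

```python
# (relation, scope, head, tail); scope is None at the top level.
# Assumption: the toast relation lives in scope "02".
RELATIONS = [
    ("agt", None, "eat.@entry", ":01"),
    ("obj", None, "eat.@entry", ":02"),
    ("aoj", "01", "boy", "John.@entry"),
    ("mod", "01", "boy", "good"),
    ("obj", "02", "toast", "bread.@entry.@focus"),
]


def entry_of(scope, relations):
    """Return the bare UW that carries @entry inside the given scope."""
    for _, rel_scope, head, tail in relations:
        if rel_scope != scope:
            continue
        for arg in (head, tail):
            uw, *attrs = arg.split(".@")
            if "entry" in attrs:
                return uw
    raise ValueError(f"no @entry node in scope {scope!r}")


def resolve(arg, relations):
    """Replace a scope reference like ':01' by that scope's entry UW."""
    if arg.startswith(":"):
        return entry_of(arg[1:], relations)
    return arg.split(".@")[0]  # strip attributes from an ordinary UW


def flatten_top_level(relations):
    """Top-level relations with scope references resolved to entry nodes."""
    return [
        (label, resolve(head, relations), resolve(tail, relations))
        for label, scope, head, tail in relations
        if scope is None
    ]


skeleton = flatten_top_level(RELATIONS)
print(skeleton)
```

With the scopes resolved, the top level reduces to the same agt and obj pair as "John eats bread".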