Semantic Inference at the Lexical-Syntactic Level
Roy Bar-Haim and Ido Dagan, Computer Science Department, Bar-Ilan University
Iddo Greental, Linguistics Department, Tel Aviv University
Eyal Shnarch, Computer Science Department, Bar-Ilan University
AAAI 2007
Outline
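• Abstract
• Introduction
• Inference Framework
• Rule for Generic Linguistic Structures
• Lexical-Syntactic Rules
• Evaluation
• Result
• Conclusion
Abstract
• Semantic inference is an important part of natural language understanding
• The authors propose a framework that performs semantic inference directly on the syntactic dependency tree of a single sentence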
Introduction
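• In recent years, the English-language PASCAL Recognizing Textual Entailment (RTE) challenge has been a major benchmark
• The task is to recognize whether the information in a hypothesis (h) can be inferred from a text (t); when it can, we say:
  – t entails h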
Introduction
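• A possible practical application is a QA system:
  – Question: Who killed Kennedy?
  – Converted into a hypothesis h: X killed Kennedy.
  – Search the corpus for a sentence that proves the hypothesis:
  – The assassination of Kennedy by Oswald shook the nation
  – Conclusion: X is Oswald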
Inference Framework
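• The system's semantic inference framework consists of propositions and inference (entailment) rules
  – Propositions: t (the assumption) -> h (the goal). The predicate is first located in the dependency tree, and t -> h is then established through a sequence of proof steps, each applying an entailment rule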
Inference Framework
• Figure: syntactic dependency tree (applying the passive rule)
Inference Framework
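• Inference (entailment) rules:
  – A rule pairs two templates, L and R, each of which is a dependency subtree
  – L matching: there is a function f from the nodes of L to the nodes of s (the source tree) such that
    • for every node u in L, f(u) has the same feature values as u
    • for every edge u -> v in L, f(u) -> f(v) is an edge in s with the same dependency relation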
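The two L-matching conditions above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes that a node's only feature is its lemma, that single upper-case letters (X, Y) are template variables that match any node, that template labels are unique, and that relation names such as `subj` and `obj` are merely illustrative. It also does not enforce that f is one-to-one.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """A dependency-tree node: a lemma plus outgoing (relation, child) edges."""
    lemma: str
    children: List[Tuple[str, "Node"]] = field(default_factory=list)


def is_variable(lemma: str) -> bool:
    # Assumption: single upper-case letters are template variables.
    return len(lemma) == 1 and lemma.isupper()


def match_at(l_node: Node, s_node: Node) -> Optional[Dict[str, Node]]:
    """Return the mapping f (keyed by L label) if L matches at s_node, else None."""
    # Node condition: a non-variable L node must agree with its image.
    if not is_variable(l_node.lemma) and l_node.lemma != s_node.lemma:
        return None
    f = {l_node.lemma: s_node}
    # Edge condition: each edge u -rel-> v in L needs f(u) -rel-> f(v) in s.
    for rel, l_child in l_node.children:
        for s_rel, s_child in s_node.children:
            sub = match_at(l_child, s_child) if s_rel == rel else None
            if sub is not None:
                f.update(sub)
                break
        else:
            return None  # no edge of s satisfies this edge of L
    return f


def find_match(l_root: Node, s_root: Node) -> Optional[Dict[str, Node]]:
    """Try every node of the source tree as the image of L's root."""
    stack = [s_root]
    while stack:
        node = stack.pop()
        f = match_at(l_root, node)
        if f is not None:
            return f
        stack.extend(child for _, child in node.children)
    return None


L = Node("buy", [("subj", Node("X")), ("obj", Node("Y"))])
s = Node("buy", [("subj", Node("John")), ("obj", Node("book")),
                 ("mod", Node("yesterday"))])
print({label: node.lemma for label, node in find_match(L, s).items()})
```

Extra dependents of the source tree (here the `mod` edge) do not block the match, since L only constrains the edges it mentions.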
Inference Framework
  – R instantiation: after L matching, the R subtree copies the variables and the root matched in L; the variables other than the root are moved to their new positions
Inference Framework
  – Alignment copying:
    • L matching and R instantiation only cover the predicate together with its subject and object; the predicate's remaining direct child nodes are added back at the end
  – Derived tree generation by rule type:
    • Two operations: substitution and introduction
  – Substitution example: by a lexical rule, buy -> purchase
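The substitution example can be illustrated with a small Python sketch. It is a simplification under stated assumptions: trees are plain dicts with hypothetical `lemma` and `children` keys, and the rule relabels every matching node at once, whereas a single rule application in the framework rewrites one matched subtree. The point it shows is that substitution produces a derived copy and leaves the source tree unchanged.

```python
import copy


def substitute_lexical(tree: dict, lhs: str, rhs: str) -> dict:
    """Return a derived copy of `tree` in which lemma `lhs` is replaced by `rhs`."""
    derived = copy.deepcopy(tree)  # the source tree stays untouched
    stack = [derived]
    while stack:
        node = stack.pop()
        if node["lemma"] == lhs:
            node["lemma"] = rhs
        stack.extend(child for _, child in node["children"])
    return derived


source = {
    "lemma": "buy",
    "children": [
        ("subj", {"lemma": "John", "children": []}),
        ("obj", {"lemma": "book", "children": []}),
    ],
}
derived = substitute_lexical(source, "buy", "purchase")
print(source["lemma"], "->", derived["lemma"])
```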
Inference Framework
  – Introduction: some unnecessary nodes (such as the predicate's parent) are dropped (as shown in the figure)
  – Annotation Rules:
    • Applied before any other rule; they set two features on any node:
      – Negation and modality
Rule for Generic Linguistic Structures
• Syntactic-Based Rules:
  – (1) Simplify the source tree:
    • Passive:
      • Original: We have been approached by the investment banker.
      • Rewritten: The investment banker approached us.
    • Genitive modifier:
      • Malaysia’s crude palm oil output is estimated to have risen by up to six percent.
      • The crude palm oil output of Malaysia is estimated to have risen by up to six percent.
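As a rough illustration of the passive rule above, the following Python sketch rewrites a toy dependency tree from passive to active. The relation labels (`be`, `obj`, `by`, `subj`) are hypothetical simplifications rather than the parser's actual labels, and tense auxiliaries are omitted. The rewrite keeps the root, moves the two variables to new positions, and then copies over the predicate's remaining dependents.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    lemma: str
    children: List[Tuple[str, "Node"]] = field(default_factory=list)


def child(node: Node, rel: str) -> Optional[Node]:
    """Return the first dependent attached with relation `rel`, if any."""
    for r, c in node.children:
        if r == rel:
            return c
    return None


def passive_to_active(verb: Node) -> Optional[Node]:
    """Return an active-voice tree for `verb`, or None if the pattern is absent."""
    aux, patient, agent = child(verb, "be"), child(verb, "obj"), child(verb, "by")
    if aux is None or patient is None or agent is None:
        return None  # the left-hand side does not match
    # Same root; the agent becomes the subject, the patient stays the object.
    active = Node(verb.lemma, [("subj", agent), ("obj", patient)])
    # Copy the dependents that the rule itself did not mention.
    matched = {"be", "obj", "by"}
    active.children += [(r, c) for r, c in verb.children if r not in matched]
    return active


banker = Node("banker", [("mod", Node("investment"))])
passive = Node("approach", [("be", Node("be")), ("obj", Node("we")), ("by", banker)])
active = passive_to_active(passive)
print([(rel, dep.lemma) for rel, dep in active.children])
```

Subtrees are shared between the two trees rather than copied, which is enough for a read-only illustration.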
Rule for Generic Linguistic Structures
  – (2) Extract only part of the information as propositions:
    • Conjunctions:
      • Helena’s very experienced and has played a long time on the tour.
      • Helena has played a long time on the tour.
    • Clausal modifiers:
      • But celebrations were muted as many Iranians observed a Shi’ite mourning month.
      • Many Iranians observed a Shi’ite mourning month.
Rule for Generic Linguistic Structures
    • Relative clauses:
      • The assailants fired six bullets at the car, which carried Vladimir Skobtsov.
      • The car carried Vladimir Skobtsov.
  – (3)
    • Appositives:
      • Frank Robinson, a one-time manager of the Indians, has the distinction for the NL.
      • Frank Robinson is a one-time manager of the Indians.
Rule for Generic Linguistic Structures
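• Polarity-Based Rules:
  – John knows that Mary is here -> Mary is here.
  – John believes that Mary is here does not imply Mary is here.
  – (Nairn, Condoravdi, & Karttunen 2006) analyze the context in which a verb appears (positive, negative, unknown). The authors collected the verbs that carry polarity into a verb list and added two verbs that are common in news text (say, announce), whose reported content is usually reliable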
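A polarity verb list of this kind can be sketched as a simple lookup table. The four entries below come from the examples on this slide; the table layout, the label strings, and the function name are assumptions made for illustration, and a real list would be far larger and would also hold negative-polarity verbs.

```python
from typing import Optional

# Polarity a verb assigns to its clausal complement.
# "positive": the complement is entailed; "unknown": nothing can be inferred.
VERB_POLARITY = {
    "know": "positive",
    "believe": "unknown",
    "say": "positive",       # news-domain verb added by the authors
    "announce": "positive",  # news-domain verb added by the authors
}


def entailed_complement(verb_lemma: str, complement: str) -> Optional[str]:
    """Return the complement if the verb licenses it, otherwise None."""
    if VERB_POLARITY.get(verb_lemma, "unknown") == "positive":
        return complement
    return None


print(entailed_complement("know", "Mary is here"))
print(entailed_complement("believe", "Mary is here"))
```

Verbs missing from the table default to "unknown", so the sketch never licenses an inference it has no entry for.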
Rule for Generic Linguistic Structures
  – Examples:
    • Polarity:
      • Yadav was forced to resign.
      • Yadav resigned.
• Negation and Modality Annotation Rules:
  – Modal verbs: such as should, can, might…
  – Negation:
    • What we’ve never seen is actual costs come down.
    • What we’ve never seen is actual costs come down.
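The annotation step can be pictured as setting two boolean features on a predicate from the words attached to it. In the sketch below, the modal verbs are the ones listed above, while the negation word list, the flat list of dependents, and the output format are assumptions chosen for brevity.

```python
from typing import Dict, List, Union

MODAL_VERBS = {"should", "can", "might"}
NEGATION_WORDS = {"not", "never"}  # assumption: a minimal illustrative list


def annotate(predicate: str, dependents: List[str]) -> Dict[str, Union[str, bool]]:
    """Attach negation and modality features to a predicate."""
    words = {w.lower() for w in dependents}
    return {
        "predicate": predicate,
        "negation": bool(words & NEGATION_WORDS),
        "modality": bool(words & MODAL_VERBS),
    }


print(annotate("see", ["we", "have", "never"]))
print(annotate("rise", ["output", "might"]))
```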
Rule for Generic Linguistic Structures
• Others:
  – Determiners:
    • The plaintiffs filed their lawsuit last year in U.S. District Court in Miami.
    • The plaintiffs filed a lawsuit last year in U.S. District Court in Miami.
• Generic Default Rules:
  – Remove modifiers (mod)
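The modifier-removal default rule can be sketched on a toy tree as follows. This is a simplification: trees are `(lemma, [(relation, subtree), ...])` tuples, `mod` is taken to be the only modifier relation, and all modifiers are dropped in one pass, whereas a single rule application would remove one modifier at a time.

```python
from typing import List, Tuple

Tree = Tuple[str, List[Tuple[str, "Tree"]]]


def drop_modifiers(tree: Tree) -> Tree:
    """Return a copy of `tree` without any subtree attached by a `mod` edge."""
    lemma, children = tree
    kept = [(rel, drop_modifiers(sub)) for rel, sub in children if rel != "mod"]
    return (lemma, kept)


# Toy tree for "The plaintiffs filed a lawsuit last year"
sentence = ("file", [
    ("subj", ("plaintiff", [])),
    ("obj", ("lawsuit", [])),
    ("mod", ("year", [("mod", ("last", []))])),
])
print(drop_modifiers(sentence))
```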
Lexical-Syntactic Rules
• Nominalization Rules:
  – These rules were derived automatically (Ron 2006) from Nomlex (NOMinalization Lexicon)
  – Example: X’s acquisition of Y yields X acquired Y
• Automatically Learned Rules:
  – DIRT (Lin & Pantel 2001) and TEASE (Szpektor et al. 2004) are two state-of-the-art unsupervised algorithms that learn lexical-syntactic inference rules.
  – Example: X file lawsuit against Y yields X accuse Y
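To make the template notation concrete, the sketch below applies the two example rules at the string level with regular expressions. This is only a stand-in: the rules in the talk operate on dependency subtrees, not on surface strings. The function name, the plain ASCII apostrophe, and the company names Acme and Globex are made up for illustration, and each variable is assumed to occur once per template.

```python
import re
from typing import Optional

VAR = re.compile(r"\b([XY])\b")  # template variables


def apply_rule(lhs: str, rhs: str, text: str) -> Optional[str]:
    """Rewrite `text` with the rule lhs -> rhs, or return None if lhs does not match."""
    # Turn the left-hand template into a regex with one named group per variable.
    parts = VAR.split(lhs)
    pattern = "".join(
        f"(?P<{p}>.+?)" if p in ("X", "Y") else re.escape(p) for p in parts
    )
    m = re.fullmatch(pattern, text)
    if m is None:
        return None
    # Fill the right-hand template with the bound variables in a single pass.
    return VAR.sub(lambda v: m.group(v.group(1)), rhs)


print(apply_rule("X's acquisition of Y", "X acquired Y",
                 "Acme's acquisition of Globex"))
print(apply_rule("X file lawsuit against Y", "X accuse Y",
                 "Acme file lawsuit against Globex"))
```

Filling the right-hand side in one pass avoids a subtle bug where a bound value that itself contains "X" or "Y" would be rewritten a second time.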
Evaluation
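• The PASCAL RTE datasets are not used, because they are relatively small and already contain many similar sentence pairs
• Instead, relation extraction (RE) style templates are used, e.g. X buy Y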
Evaluation process
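• Experimental templates have to be generated, so 9 verbs are first selected from the verb resource produced by TEASE:
  – approach, approve, consult, lead, observe, play, seek, sign, strike. These serve as the RE predicates
• Next, DIRT/TEASE are used with these 9 predicates to learn more experimental templates
  – Example: X approve Y yields X confirm Y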
Evaluation process
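• Next, the Reuters RCV1 corpus is searched for sentences that match a template's predicate, and a series of generic linguistic rules is applied to find the subject and object that best fit the template
• The authors use Minipar (Lin 1998) for parsing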
Result
Conclusion
• Future work: extend the approach to multi-sentence inference and add more lexical rules, such as dog -> animal