CS 562: STATISTICAL NATURAL LANGUAGE PROCESSING
August 2010
Instructors: Liang Huang and Kevin Knight
TA: Jason Riesa
CS 562 - Intro
Doesn’t Google know everything?
What animal does a cat eat?
Even Key Word Queries
• Paris Hilton -- not easy to book! (vs. Boston Hilton)
Ambiguity
Where can I spot a snow leopard?
More about Ambiguities
• to middle school kids: what does this sentence mean?
Aravind Joshi
I saw her duck.
lexical ambiguity (word-sense)
I eat sushi with tuna.
structural ambiguity (PP-attachment)
lexical ambiguity (word-sense)
Everybody loves somebody.
???
structural ambiguity (quantifier scope)
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo
Dogs dogs dog dog dogs.
Police police police police police
http://www.cse.buffalo.edu/~rapaport/BuffaloBuffalo/buffalobuffalo.html
Ambiguities in Translation
自 助 终 端
zi zhu zhong duan
self help terminal device
If you are stolen...
or even...
clear evidence that NLP is used in real life!
Grammar
(SBARQ (WHNP What animal)
       (SINV (VBZ does)
             (NP a cat)
             (VP (VB eat)
                 (NP t))))
DP for incremental parsing
PP Attachment Ambiguity
One morning in Africa, I shot an elephant in my pajamas;
how he got into my pajamas I’ll never know.
Ambiguity Explosion
I saw her duck.
• how about...
• I saw her duck with a telescope.
• I saw her duck with a telescope in the garden...
• exponential explosion of the search space
• Q1: how to represent ambiguities (compactly)?
• Q2: how to search over this space (efficiently)?
• Q3: how to rank different hypotheses?
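To get a feel for the explosion, here is a minimal sketch that counts binary bracketings of an n-word sentence. It assumes every binary tree over the words is a possible parse, which is an upper bound for illustration rather than a claim about English; the function name `count_bracketings` is made up for this example. The lengths 4, 7, and 10 match the three "I saw her duck..." sentences above.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_bracketings(n: int) -> int:
    """Number of binary trees over n leaves (a Catalan number)."""
    if n == 1:
        return 1
    # Split the span into a left part of k words and a right part of n - k words,
    # then combine every left bracketing with every right bracketing.
    return sum(count_bracketings(k) * count_bracketings(n - k) for k in range(1, n))


# "I saw her duck" / "... with a telescope" / "... in the garden"
for n in (4, 7, 10):
    print(n, count_bracketings(n))
```

The counts grow from 5 to 132 to 4862 as two short prepositional phrases are added, which is why enumerating parses one by one does not scale. The memoization in `lru_cache` is itself a small instance of dynamic programming: each span length is solved once and reused.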
[figure: parse tree (S, NP, VP, PP) over "I/PRP saw/VBD her/PRP$ duck/NN with/IN a/DT telescope/NN"]
Answers...
• Q1: how to represent ambiguities?
• context-free grammar (unit 2)
• finite-state automata (unit 1)
• Q2: how to search in this space?
• dynamic programming (units 1&2)
• Q3: how to rank these hypotheses?
• weighted grammar (units 1-3)
• weights learned from data
• (saw, with, telescope) seen more often in texts
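The three answers fit together in one small sketch: a context-free grammar represents the ambiguity, dynamic programming (here a CKY-style chart) searches it, and rule weights rank the hypotheses. Everything specific below is an assumption for illustration: the toy grammar, the invented probabilities, and the names `BINARY`, `LEXICAL`, and `viterbi_cky`. In practice the weights would be estimated from data, not written by hand.

```python
# Toy weighted grammar in Chomsky normal form.
# All probabilities are invented for illustration, not learned from data.
BINARY = {  # (left child, right child) -> [(parent, probability)]
    ("NP", "VP"): [("S", 1.0)],
    ("VBD", "NP"): [("VP", 0.7)],
    ("VP", "PP"): [("VP", 0.3)],   # PP attaches to the verb phrase
    ("NP", "PP"): [("NP", 0.2)],   # PP attaches to the noun phrase
    ("PRP$", "NN"): [("NP", 0.3)],
    ("DT", "NN"): [("NP", 0.3)],
    ("IN", "NP"): [("PP", 1.0)],
}
LEXICAL = {  # word -> [(tag, probability)]
    "I": [("NP", 0.2)],
    "saw": [("VBD", 1.0)],
    "her": [("PRP$", 1.0)],
    "duck": [("NN", 0.5)],
    "with": [("IN", 1.0)],
    "a": [("DT", 1.0)],
    "telescope": [("NN", 0.5)],
}


def viterbi_cky(words):
    """Return (probability, tree) of the best S over all words, or None."""
    n = len(words)
    # chart[i][j] maps a label to (best probability, best tree) for words[i:j].
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        for tag, p in LEXICAL.get(word, []):
            chart[i][i + 1][tag] = (p, (tag, word))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = chart[i][j]
            for k in range(i + 1, j):  # split point between the two children
                for left, (p_left, t_left) in chart[i][k].items():
                    for right, (p_right, t_right) in chart[k][j].items():
                        for parent, p_rule in BINARY.get((left, right), []):
                            score = p_rule * p_left * p_right
                            # Keep only the highest-scoring tree per label.
                            if score > cell.get(parent, (0.0, None))[0]:
                                cell[parent] = (score, (parent, t_left, t_right))
    return chart[0][n].get("S")


prob, tree = viterbi_cky("I saw her duck with a telescope".split())
print(prob)
print(tree)
```

With these made-up weights the best tree attaches "with a telescope" to the verb phrase (seeing with a telescope) rather than to "her duck". Changing the two attachment probabilities flips the preference, which is the sense in which ranking depends on the weights.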
Why Learning?
• learning is better than hand-written rules, because:
• less work; easily adapts to new languages/domains
• Powerset (now bing.com): 15 years for English grammar!
• now they are writing their Chinese grammar...
• and languages constantly change!
• learning can work, and often works better!
• machine translation: used to be dominated by rule-based
• now statistical methods are better: Google vs. Systran
• Google learns from the web, and translates 40+ languages
[also CS 567, Machine Learning, Fall 2010]
Example - Rosetta Stone
• the most famous (tri-)parallel text
• machines can do the same job! (if given parallel text)
• UN/EU/Canadian proceedings, news, tech manuals, ...
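As a rough illustration of how a machine might decipher word correspondences from parallel text alone, here is a minimal sketch of expectation-maximization in the style of IBM Model 1. The slides do not name a method, so this choice is an assumption, as are the three-sentence English-German corpus and the names `train_model1` and `best_translation`. The sketch omits the NULL word and any alignment or fertility modeling.

```python
from collections import defaultdict

# Invented toy parallel corpus (English, German), for illustration only.
corpus = [
    ("the house".split(), "das Haus".split()),
    ("the book".split(), "das Buch".split()),
    ("a book".split(), "ein Buch".split()),
]


def train_model1(corpus, iterations=10):
    """Estimate t[(f, e)], the probability that English e translates to f."""
    f_vocab = {f for _, f_sent in corpus for f in f_sent}
    # Start from a uniform distribution: no dictionary is given.
    t = defaultdict(lambda: 1.0 / len(f_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (f, e) pairings
        total = defaultdict(float)  # expected count of e being paired at all
        for e_sent, f_sent in corpus:
            for f in f_sent:
                # E-step: share one unit of credit for f among the English words.
                norm = sum(t[(f, e)] for e in e_sent)
                for e in e_sent:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalize the expected counts into probabilities.
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t


def best_translation(t, e, f_vocab):
    """Most probable foreign word for the English word e."""
    return max(f_vocab, key=lambda f: t[(f, e)])


t = train_model1(corpus)
f_vocab = {f for _, f_sent in corpus for f in f_sent}
for e in ("the", "house", "book", "a"):
    print(e, "->", best_translation(t, e, f_vocab))
```

Nothing tells the program that "house" goes with "Haus"; the pairing emerges because "das" is better explained by "the", which co-occurs with it in two sentences. That is the same kind of reasoning used on the Rosetta Stone, scaled down to three sentence pairs.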
Take Home Message
• languages are beyond just bags of words!
• ambiguity is everywhere, and NLP is all about that
• we’ll teach machines how to read and translate...
• and how to learn to read and translate from data
• have fun in this class! :)