
Page 1: Introduction to  Natural Language Processing


Introduction to

Natural Language Processing

ChengXiang Zhai

Department of Computer Science

University of Illinois, Urbana-Champaign

Page 2: Introduction to  Natural Language Processing


Lecture Plan

• What is NLP?

• A brief history of NLP

• The current state of the art

• NLP and text information systems

Page 3: Introduction to  Natural Language Processing


What is NLP?

Thai: … เรา เล่น ฟุตบอล … ("We play football")

How can a computer make sense out of this string?

- What are the basic units of meaning (words)? What is the meaning of each word? (Morphology)
- How are words related with each other? (Syntax)
- What is the "combined meaning" of words? (Semantics)
- What is the "meta-meaning"? (speech act) (Pragmatics)
- Handling a large chunk of text (Discourse)
- Making sense of everything (Inference)

Page 4: Introduction to  Natural Language Processing


An Example of NLP

A dog is chasing a boy on the playground

• Lexical analysis (part-of-speech tagging): A/Det dog/Noun is/Aux chasing/Verb a/Det boy/Noun on/Prep the/Det playground/Noun

• Syntactic analysis (parsing): noun phrases ("a dog", "a boy", "the playground") and a prepositional phrase ("on the playground") combine with the verb into verb phrases and, finally, a sentence

• Semantic analysis: Dog(d1). Boy(b1). Playground(p1). Chasing(d1,b1,p1).

• Inference: Scared(x) if Chasing(_,x,_). + Chasing(d1,b1,p1) => Scared(b1)

• Pragmatic analysis (speech act): a person saying this may be reminding another person to get the dog back…
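To make the first two steps concrete, here is a minimal sketch using NLTK; the chunk grammar below and the assumption that the tokenizer and tagger resources are already downloaded are illustrative choices, not part of the lecture.

```python
import nltk

# Assumes nltk.download('punkt') and nltk.download('averaged_perceptron_tagger')
# have been run once beforehand.
sentence = "A dog is chasing a boy on the playground"

tokens = nltk.word_tokenize(sentence)   # lexical analysis: split into words
tagged = nltk.pos_tag(tokens)           # part-of-speech tagging
print(tagged)                           # e.g. [('A', 'DT'), ('dog', 'NN'), ...]

# Shallow syntactic analysis: chunk noun phrases with a tiny, illustrative grammar.
chunker = nltk.RegexpParser("NP: {<DT>?<JJ>*<NN.*>}")
print(chunker.parse(tagged))            # tree with NP chunks like (NP A/DT dog/NN)
```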

Page 5: Introduction to  Natural Language Processing


If we can do this for all the sentences, then …

BAD NEWS:

Unfortunately, we can’t.

General NLP = “AI-Complete”

Page 6: Introduction to  Natural Language Processing


NLP is Difficult!!

• Natural language is designed to make human communication efficient. As a result,

– we omit a lot of “common sense” knowledge, which we assume the hearer/reader possesses

– we keep a lot of ambiguities, which we assume the hearer/reader knows how to resolve

• This makes EVERY step in NLP hard

– Ambiguity is a “killer”!

– Common sense reasoning is a prerequisite

Page 7: Introduction to  Natural Language Processing


Examples of Challenges

• Word-level ambiguity: E.g.,

– “design” can be a noun or a verb (Ambiguous POS)

– “root” has multiple meanings (Ambiguous sense)

• Syntactic ambiguity: E.g.,

– “natural language processing” (Modification)

– “A man saw a boy with a telescope.” (PP Attachment)

• Anaphora resolution: “John persuaded Bill to buy a TV for himself.” (himself = John or Bill?)

• Presupposition: “He has quit smoking.” implies that he smoked before.

Page 8: Introduction to  Natural Language Processing


Despite all the challenges, research in NLP has made a lot of progress…

Page 9: Introduction to  Natural Language Processing


High-level History of NLP

• Early enthusiasm (1950's): Machine Translation

– Too ambitious

– Bar-Hillel report (1960) concluded that fully-automatic high-quality translation could not be accomplished without knowledge (Dictionary + Encyclopedia)

• Less ambitious applications (late 1960’s & early 1970’s): Limited success, failed to scale up

– Speech recognition

– Dialogue (Eliza)

– Inference and domain knowledge (SHRDLU=“block world”)

• Real world evaluation (late 1970's – now)

– Story understanding (late 1970's & early 1980's)

– Large scale evaluation of speech recognition, text retrieval, information extraction (1980 – now)

– Statistical approaches enjoy more success (first in speech recognition & retrieval, later others)

• Current trend:

– Heavy use of machine learning techniques

– Boundary between statistical and symbolic approaches is disappearing.

– We need to use all the available knowledge

– Application-driven NLP research (bioinformatics, Web, Question answering…)

[Slide diagram: a shift from knowledge representation and deep understanding in limited domains toward shallow understanding built on statistical language models, robust component techniques, learning-based NLP, and applications.]

Page 10: Introduction to  Natural Language Processing


The State of the Art

For the example sentence "A dog is chasing a boy on the playground" (annotated as on Page 4), roughly:

- POS tagging: 97%
- Parsing: partial, >90% (?)
- Semantics: only some aspects (entity/relation extraction, word sense disambiguation, anaphora resolution)
- Speech act analysis: ???
- Inference: ???

Page 11: Introduction to  Natural Language Processing


Technique Showcase: POS Tagging

Training data (annotated text): "This sentence serves as an example of annotated text…" tagged as Det N V1 P Det N P … V2 N.

A POS tagger trained on such data tags a new sentence, e.g. "This is a new sentence". Candidate tag sequences include:

Det Aux Det Adj N
Det Det Det Det Det
… …
V2 V2 V2 V2 V2

Consider all possibilities, and pick the one with the highest probability.

Method 1 (independent assignment; each word gets its most common tag):

p(w1, …, wk, t1, …, tk) = p(t1|w1) … p(tk|wk) p(w1) … p(wk)

Method 2 (partial dependency on the previous tag):

p(w1, …, wk, t1, …, tk) = Π_i p(wi|ti) p(ti|ti-1)

where w1 = "this", w2 = "is", …, and t1 = Det, t2 = Det, … is one candidate tag assignment.
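A rough sketch of Method 2's scoring in Python, with made-up training sentences and a crude add-one smoothing choice (both are illustrative assumptions, not the lecture's setup):

```python
from collections import defaultdict
import math

# Toy annotated training data: lists of (word, tag) pairs (made up for illustration).
training = [
    [("this", "Det"), ("sentence", "N"), ("serves", "V1"), ("as", "P"),
     ("an", "Det"), ("example", "N"), ("of", "P"), ("annotated", "V2"), ("text", "N")],
    [("this", "Det"), ("is", "Aux"), ("a", "Det"), ("new", "Adj"), ("sentence", "N")],
]

emit = defaultdict(lambda: defaultdict(int))    # emit[tag][word] counts
trans = defaultdict(lambda: defaultdict(int))   # trans[prev_tag][tag] counts
for sent in training:
    prev = "<s>"
    for word, tag in sent:
        emit[tag][word] += 1
        trans[prev][tag] += 1
        prev = tag

def prob(counts, key, vocab=20):
    # Add-one smoothing over an assumed (toy) vocabulary size.
    return (counts[key] + 1) / (sum(counts.values()) + vocab)

def score(words, tags):
    """Method 2: p(w1..wk, t1..tk) = prod_i p(wi|ti) * p(ti|t(i-1))."""
    logp, prev = 0.0, "<s>"
    for w, t in zip(words, tags):
        logp += math.log(prob(emit[t], w)) + math.log(prob(trans[prev], t))
        prev = t
    return logp

words = ["this", "is", "a", "new", "sentence"]
print(score(words, ["Det", "Aux", "Det", "Adj", "N"]))    # plausible tagging: higher score
print(score(words, ["Det", "Det", "Det", "Det", "Det"]))  # all-Det tagging: lower score
```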

Page 12: Introduction to  Natural Language Processing


Technique Showcase: Parsing

Grammar:
S → NP VP
NP → Det BNP
NP → BNP
NP → NP PP
BNP → N
VP → V
VP → Aux V NP
VP → VP PP
PP → P NP

Lexicon:
V → chasing
Aux → is
N → dog
N → boy
N → playground
Det → the
Det → a
P → on

Each rule carries a probability (the values shown on the slide include 1.0, 0.3, 0.4, 0.3, 1.0, 0.01, and 0.003). The grammar generates (at least) two parse trees for "A dog is chasing a boy on the playground", differing in whether the prepositional phrase "on the playground" attaches to the verb phrase or to the noun phrase "a boy". The probability of a tree is the product of the probabilities of the rules it uses (e.g., 0.000015 for one of the trees). Choose the tree with the highest probability…

Can also be treated as a classification/decision problem…
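A minimal sketch of probabilistic parsing with NLTK's PCFG and Viterbi parser; the rule probabilities below are illustrative guesses, not the values from the slide:

```python
from nltk import PCFG
from nltk.parse import ViterbiParser

# The slide's grammar and lexicon with made-up rule probabilities;
# probabilities for each left-hand side must sum to 1.
grammar = PCFG.fromstring("""
S -> NP VP          [1.0]
NP -> Det BNP       [0.3]
NP -> BNP           [0.4]
NP -> NP PP         [0.3]
BNP -> N            [1.0]
VP -> V             [0.2]
VP -> Aux V NP      [0.4]
VP -> VP PP         [0.4]
PP -> P NP          [1.0]
Det -> 'a' [0.5] | 'the' [0.5]
N -> 'dog' [0.4] | 'boy' [0.4] | 'playground' [0.2]
V -> 'chasing' [1.0]
Aux -> 'is' [1.0]
P -> 'on' [1.0]
""")

tokens = "a dog is chasing a boy on the playground".split()
parser = ViterbiParser(grammar)
for tree in parser.parse(tokens):        # yields the highest-probability parse
    print(tree)
    print("probability:", tree.prob())
```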

Page 13: Introduction to  Natural Language Processing


Semantic Analysis Techniques

• Only successful for VERY limited domains or for SOME aspects of semantics

• E.g.,

– Entity extraction (e.g., recognizing a person’s name): Use rules and/or machine learning

– Word sense disambiguation: addressed as a classification problem with supervised learning

– Sentiment tagging

– Anaphora resolution …

In general, exploiting machine learning and statistical language models…
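For example, word sense disambiguation can be set up as supervised classification over the ambiguous word's context. A minimal sketch with scikit-learn, using made-up training contexts and sense labels for "root":

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny made-up training set: contexts of the ambiguous word "root", labeled by sense.
contexts = [
    "the root of the tree absorbs water from the soil",
    "the plant grew a deep root into the ground",
    "compute the square root of the number",
    "the root of the equation is a negative value",
]
senses = ["plant", "plant", "math", "math"]

# Bag-of-words features + Naive Bayes classifier (one simple choice among many).
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(contexts, senses)

print(clf.predict(["water the root of this plant",
                   "find the root of x squared minus four"]))  # likely ['plant' 'math']
```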

Page 14: Introduction to  Natural Language Processing


What We Can’t Do

• 100% POS tagging

– "He turned off the highway." vs. "He turned off the fan."

• General complete parsing

– "A man saw a boy with a telescope."

• Precise deep semantic analysis

– Will we ever be able to precisely define the meaning of "own" in "John owns a restaurant."?

Robust & general NLP tends to be "shallow", while "deep" understanding doesn't scale up.

Page 15: Introduction to  Natural Language Processing


Major NLP Applications

• Speech recognition: e.g., Auto telephone call routing

• Text management and analysis – Text retrieval/filtering

– Text classification

– Text summarization

– Text mining

– Query answering

• Language tutoring

– Spelling/grammar correction

• Machine translation

– Cross-language retrieval

– Restricted natural language

• Natural language user interface

(Our focus: text management and analysis)

Page 16: Introduction to  Natural Language Processing


NLP & Text Information Systems

Better NLP => Better Text Information Systems

Bad NLP => Bad Text Information Systems?

Robust shallow NLP tends to be more useful than fragile deep NLP.

Errors in NLP can hurt a text information system

Page 17: Introduction to  Natural Language Processing


How Much NLP is Really Needed?

Tasks, in order of increasing dependency on NLP (tasks near the top are "easier" and have more "workarounds"):

• Classification/Retrieval
• Summarization/Extraction/Topic Mining
• Translation/Dialogue
• Question Answering

Page 18: Introduction to  Natural Language Processing


Workaround Example I: ELIZA

Person: Men are all alike.
ELIZA: In what way?
Person: They're always bugging us about something or other.
ELIZA: Can you think of a specific example?
Person: Well, my boyfriend made me come here.
ELIZA: Your boyfriend made you come here?
Person: He says I'm depressed much of the time.
ELIZA: I am sorry to hear that you are depressed.
Person: It's true. I am unhappy.
ELIZA: Do you think coming here will help you not to be unhappy?
Person: I need some help, that much seems certain.
ELIZA: What would it mean to you if you got some help?
Person: Perhaps I could learn to get along with my mother.
ELIZA: Tell me more about your family

ELIZA plays the role of a therapist. How does it work?

Heuristic rules based on Keyword matching! E.g.,

I remember X => Do you often think of X?

always => Can you think of a specific example?

It keeps the dialogue going on, but NEVER makes any claim!

A breakdown example:

Person: Failure is the mother of success.
ELIZA: Tell me more about your family

No NLP, but useful. Perhaps we should call this NLP?

Statistical NLP often has a similar flavor, with "SOFT" rules LEARNED from data.
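A minimal ELIZA-style sketch in Python. The rules below are illustrative, keyword-triggered templates in the spirit of the example above, not the original ELIZA script:

```python
import re

# Keyword-triggered response templates (illustrative, not ELIZA's actual rule set).
RULES = [
    (re.compile(r"\bI remember (.+)", re.I), "Do you often think of {0}?"),
    (re.compile(r"\balways\b", re.I), "Can you think of a specific example?"),
    (re.compile(r"\b(mother|father|family)\b", re.I), "Tell me more about your family."),
    (re.compile(r"\bI am (\w+)", re.I), "I am sorry to hear that you are {0}."),
]

def respond(utterance):
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(*match.groups())
    return "Please go on."  # default when no keyword matches

print(respond("They're always bugging us about something or other."))
print(respond("It's true. I am unhappy."))
print(respond("Failure is the mother of success."))  # the breakdown example: keyword "mother"
```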

Page 19: Introduction to  Natural Language Processing


Workaround Example II: Statistical Translation

Learn how to translate Chinese to English from many example translations

Intuitions:

- If we have seen all possible translations, then we simply look them up
- If we have seen a similar translation, then we can adapt it
- If we haven't seen any example that's similar, we try to generalize from what we've seen

Noisy-channel view: an English speaker produces English words E with probability P(E); a noisy channel (the translation process) turns them into the observed Chinese words C with probability P(C|E); the translator recovers the English translation by finding the E that maximizes P(E|C), which is proportional to P(E) P(C|E).

All these intuitions are captured through a probabilistic model
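A toy, word-level sketch of the noisy-channel idea with made-up probability tables (real systems model whole sentences and word alignments, so this only shows the argmax of P(E) P(C|E)):

```python
# Toy language model P(e) and translation model P(c|e); all numbers are made up.
p_e = {"football": 0.4, "soccer": 0.3, "basketball": 0.3}
p_c_given_e = {
    ("足球", "football"): 0.5,
    ("足球", "soccer"): 0.4,
    ("篮球", "basketball"): 0.9,
}

def decode(c):
    # Bayes rule: P(e|c) is proportional to P(e) * P(c|e); pick the best English word.
    return max(p_e, key=lambda e: p_e[e] * p_c_given_e.get((c, e), 0.0))

print(decode("足球"))  # -> football (0.4*0.5 beats 0.3*0.4)
print(decode("篮球"))  # -> basketball
```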

Page 20: Introduction to  Natural Language Processing


So, what NLP techniques are most useful for text information systems?

Statistical NLP in general, and statistical language models in particular.

The need for high robustness and efficiency implies the dominant use of simple models (i.e., unigram models).
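As a hint of why such simple models suffice for retrieval, here is a minimal unigram language model sketch (the toy document, assumed vocabulary size, and add-one smoothing are all illustrative choices):

```python
from collections import Counter
import math

# Estimate a unigram language model from one toy "document".
doc = "a dog is chasing a boy on the playground the boy is scared".split()
counts = Counter(doc)
vocab_size = 10000  # assumed vocabulary size used for add-one smoothing (illustrative)

def p(word):
    return (counts[word] + 1) / (len(doc) + vocab_size)

def log_likelihood(query):
    # Unigram assumption: query words are generated independently from the document model.
    return sum(math.log(p(w)) for w in query.lower().split())

print(log_likelihood("dog chasing boy"))        # higher: query matches the document
print(log_likelihood("quantum field theory"))   # lower: off-topic query
```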

Page 21: Introduction to  Natural Language Processing


What You Should Know

• NLP is the foundation of text information systems

– Better NLP enables better text management

– Better NLP is necessary for sophisticated tasks

• But

– Bad NLP doesn't mean bad text information systems

– There are often “workarounds” for a task

– Inaccurate NLP can hurt the performance of a task

• The most effective NLP techniques are often statistical, aided by linguistic knowledge

• The challenge is to bridge the gap between imperfect NLP and useful application functions