Ling 138/238: Introduction to Computational Linguistics. Martin Kay, Stanford University.
Martin Kay CL Introduction 1
Martin Kay
Stanford University
Ling 138/238
30 Introduction
Oct 1 Complexity; String search
6 Knuth-Morris-Pratt; Boyer-Moore
8 Suffix Trees
13 Tagging; Alignment
15
20 Chomsky Hierarchy; Regular Expressions
22
27 Finite-state automata
29
Nov 3 Morphology
5
10 Context-free grammar
12
17 Unification, HPSG, LFG
19
24 Machine Translation
26
Dec 1 Summary; Wrap-up
3
Martin Kay
740 3043
Margaret Jacks 124
Office hours: TuTh 4.15-5.45 p.m.
Linguistics 138/238
Prerequisites and Expectations
• No prerequisites
• Classroom participation
• Occasional readings
• Learn Prolog
• Laboratory sessions
• Homework problems
• Project
Project
• Learn something new about language
• Significant programming
• Group work
• Modifying or amplifying existing code
An HMM-based tagger
A searcher for tagged text
Implementation of suffix trees
Morphological analysis
Named-entity recognition
Intellectual Relations
Relation to
—Linguistics
—Psychology
—Artificial Intelligence
—Computer Science
Computational Linguistics as Science
Ideas from Computing
Search
Divide and Conquer
Guides and Oracles
Nondeterminism
Dynamic Programming
Scheduling, agendas
Compilation
Unification
Automata Theory
Co-routining and parallelism
Top-down vs. bottom-up
Complexity
Ideas from Computing
Search
Nondeterminism
Dynamic Programming
A Maze
Keep your right hand on the wall
A Maze
Backup!
Backup!
Backup!
Out!
Nondeterminism
• A process is nondeterministic if there are points in it when a choice must be made, but the information necessary to make the choice is not available.
• Solution: Pick one of the alternatives. If it does not work out, come back and pick another one.
• Note: the information required to make the choice was available after all!
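The "pick one, come back if it fails" strategy above can be sketched as a depth-first search with backtracking over a small grid maze. This is only an illustrative sketch: the grid encoding (`.` for open cells, `#` for walls), the function name `find_path`, and the fixed order in which neighbors are tried are all assumptions made for the example, not part of the course material.

```python
def find_path(maze, start, goal):
    """Depth-first search with backtracking over a grid maze.

    maze  -- list of equal-length strings; '.' is open, '#' is a wall
    start -- (row, col) of the entrance
    goal  -- (row, col) of the exit
    Returns the list of cells on a path from start to goal, or None.
    """
    rows, cols = len(maze), len(maze[0])
    visited = {start}   # cells already tried, so we never loop
    path = [start]      # the current sequence of choices

    def explore(cell):
        if cell == goal:
            return True
        r, c = cell
        # Choice point: we do not know which direction leads out,
        # so we try each alternative in turn.
        for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
            nxt = (r + dr, c + dc)
            nr, nc = nxt
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and maze[nr][nc] == "." and nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                if explore(nxt):
                    return True
                path.pop()  # "Backup!": undo the choice, try the next one
        return False

    return path if explore(start) else None


# Hypothetical example maze.
EXAMPLE_MAZE = [
    "..#",
    ".##",
    "...",
]
print(find_path(EXAMPLE_MAZE, (0, 0), (2, 2)))
```

The search first steps right into a dead end, backs up, and then finds the route down the left column. The failed attempt is what supplies the information that was missing at the choice point.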
Dynamic Programming
    p  o  u  r
f   1  2  3  4
o   2  1  2  3
r   3  2  2  2
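The table above is consistent with unit-cost edit distances between prefixes of "for" (rows) and "pour" (columns). Assuming that reading, a minimal sketch of the dynamic-programming computation follows. The function name `edit_distance_table` and the unit costs for insertion, deletion and substitution are assumptions for illustration.

```python
def edit_distance_table(source, target):
    """Fill the edit-distance table between all prefixes of two strings.

    table[i][j] is the minimum number of single-character insertions,
    deletions and substitutions needed to turn source[:i] into target[:j].
    """
    rows, cols = len(source) + 1, len(target) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        table[i][0] = i          # delete all i characters of the prefix
    for j in range(cols):
        table[0][j] = j          # insert all j characters of the prefix
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = 0 if source[i - 1] == target[j - 1] else 1
            # Each cell reuses three already-solved subproblems.
            table[i][j] = min(
                table[i - 1][j] + 1,                 # deletion
                table[i][j - 1] + 1,                 # insertion
                table[i - 1][j - 1] + substitution,  # match or substitution
            )
    return table


table = edit_distance_table("for", "pour")
# Drop the empty-prefix row and column to compare with the slide.
for letter, row in zip("for", table[1:]):
    print(letter, row[1:])
```

With the empty-prefix row and column removed, the printed rows match the three rows of the table. The bottom-right cell, 2, is the distance between the full strings.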
[Figure: road map connecting Paris, Dijon, Mulhouse, Strasbourg, Chalons and Metz; distance labels 266, 192, 161, 344, 276, 115, 234, 288, 458, 620, 619]
The CKY Chart
[Chart over the words "people like the French drink"; cell positions were lost in extraction. Surviving category labels, by row:]
people   np np np / s s s
like     prep pp pp / v vp vp
the      det np np
French   adj n / n n
drink    n / vp
Context free: All phrases with the same
— Coverage, and
— Category
enter into larger phrases as a single item
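The point about coverage and category can be illustrated with a small CKY-style chart builder. It is a sketch under stated assumptions: the toy lexicon and the binary rules below are invented for the example and are not the grammar behind the slide's chart. Each chart cell is a set, so a category derived in several ways for the same span is stored only once.

```python
from collections import defaultdict

# Hypothetical toy grammar: binary rules map a pair of daughter
# categories to the set of mother categories they can form.
TOY_RULES = {
    ("np", "vp"): {"s"},
    ("v", "np"): {"vp"},
    ("prep", "np"): {"pp"},
    ("det", "n"): {"np"},
    ("adj", "n"): {"n"},
    ("np", "pp"): {"np"},
}

# Hypothetical toy lexicon: word -> possible categories.
TOY_LEXICON = {
    "people": {"np"},
    "like": {"v", "prep"},
    "the": {"det"},
    "French": {"adj", "n"},
    "drink": {"n", "vp"},
}


def cky_chart(words, lexicon, rules):
    """Build a CKY chart: (start, end) -> set of categories for that span."""
    n = len(words)
    chart = defaultdict(set)
    for i, word in enumerate(words):
        chart[i, i + 1] = set(lexicon.get(word, ()))
    for width in range(2, n + 1):
        for start in range(0, n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for left in chart[start, mid]:
                    for right in chart[mid, end]:
                        # Set union: same coverage and category
                        # is recorded as a single item.
                        chart[start, end] |= rules.get((left, right), set())
    return chart


words = "people like the French drink".split()
chart = cky_chart(words, TOY_LEXICON, TOY_RULES)
print(sorted(chart[0, len(words)]))
```

Under this toy grammar the whole string receives the category `s` by more than one route, yet the top cell holds a single `s` entry. That sharing is what keeps the number of chart entries polynomial.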
Ideas from Computing
Unification
Unification
Attribute     Report 1          Report 2          Combined Report
eyes          blue              blue              blue
hair          black or brown    brown or red      brown
accent        Italian (one report only)           Italian
wife          see below         see below         see below
children      Ahmed & Angela    Rebecca & Angela  Ahmed, Angela & Rebecca
age           middle            48                middle
Wife
eyes          brown (one report only)             brown
weight        247 lbs           112 Kg            247 lbs
disposition   surly (one report only)             surly
Unification
Attribute     Report 1          Report 2          Combined Report
eyes          blue              blue              blue
hair          black or brown    brown or red      brown
accent        Italian (one report only)           Italian
wife          see below         see below         see below
children      Ahmed & Angela    Rebecca & Angela  Ahmed, Angela & Rebecca
age           middle            48                middle
Wife
eyes          brown             grey              FAIL
weight        247 lbs           112 Kg            247 lbs
disposition   surly             surly
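The way the two reports combine, and the way a single clash makes the whole combination fail, can be sketched as unification over nested attribute-value structures. The representation is an assumption made for this example: atomic values are modelled as sets of alternatives (so "black or brown" is a two-element set), nested records as dictionaries, and the function name `unify` is hypothetical. Which report supplies a value that appears only once is also assumed. Rows that need world knowledge (age, weight units) or set accumulation (children) are left out.

```python
def unify(a, b):
    """Unify two feature structures; return None if they are incompatible.

    A structure is either a dict (attribute -> structure) or a frozenset
    of alternative atomic values.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, b_value in b.items():
            if key in result:
                merged = unify(result[key], b_value)
                if merged is None:
                    return None      # one clash fails the whole structure
                result[key] = merged
            else:
                result[key] = b_value  # information only one side has
        return result
    if isinstance(a, frozenset) and isinstance(b, frozenset):
        common = a & b               # keep only mutually compatible values
        return common or None
    return None                      # a record cannot unify with an atom


def alternatives(*values):
    """Helper (hypothetical): build a set of alternative atomic values."""
    return frozenset(values)


REPORT_1 = {
    "eyes": alternatives("blue"),
    "hair": alternatives("black", "brown"),
    "accent": alternatives("Italian"),
    "wife": {"eyes": alternatives("brown")},
}
REPORT_2 = {
    "eyes": alternatives("blue"),
    "hair": alternatives("brown", "red"),
    "wife": {"disposition": alternatives("surly")},
}
REPORT_2_CLASH = {
    "eyes": alternatives("blue"),
    "wife": {"eyes": alternatives("grey")},
}

print(unify(REPORT_1, REPORT_2))
print(unify(REPORT_1, REPORT_2_CLASH))
```

The first call merges the reports, narrowing `hair` to `brown` and pooling the information about the wife. The second returns `None`, because the wife's eye colour is incompatible, which corresponds to the FAIL entry.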
English Agreement
The dog sleeps
The dogs sleep
The dog slept
The dogs slept
The sheep sleeps
The sheep sleep
The sheep slept
The sheep that was in the barn slept
The sheep that were in the barn slept
German Case
Der Junge sah den Lehrer
Den Lehrer sah der Junge
Das Mädchen sah der Junge
Der Junge sah das Mädchen
Die Lehrerin sah den Lehrer
Die Lehrerin sah das Mädchen
Ideas from Computing
Finite-State Methods
Finite-State Methods in Language Processing
The application of a branch of mathematics
—The regular branch of automata theory
to a branch of computational linguistics in which what is crucial is (or can be reduced to)
—Properties of string sets and string relations with
—A notion of bounded dependency
Applications
• Finite Languages
— Dictionaries
— Compression
• Phenomena involving bounded dependency
— Morphology
    • Spelling
    • Hyphenation
    • Tokenization
    • Morphological Analysis
— Phonology
• Approximations to phenomena involving mostly bounded dependency
— Syntax
• Phenomena that can be translated into the realm of strings with bounded dependency
— Syntax
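The first application, a dictionary treated as a finite language, can be sketched as a deterministic finite-state acceptor built from a word list. This is a minimal illustration, assuming a trie-shaped automaton with no state minimization. The names `build_acceptor` and `accepts` and the example word list are hypothetical.

```python
def build_acceptor(words):
    """Build a deterministic finite-state acceptor for a finite word list.

    Returns (transitions, finals), where transitions maps
    (state, symbol) -> state and state 0 is the start state.
    """
    transitions = {}
    finals = set()
    next_state = 1
    for word in words:
        state = 0
        for symbol in word:
            key = (state, symbol)
            if key not in transitions:
                transitions[key] = next_state  # new state for a new prefix
                next_state += 1
            state = transitions[key]
        finals.add(state)                      # end of a dictionary word
    return transitions, finals


def accepts(acceptor, string):
    """Run the acceptor over the string, one symbol at a time."""
    transitions, finals = acceptor
    state = 0
    for symbol in string:
        state = transitions.get((state, symbol))
        if state is None:
            return False                       # no transition: reject
    return state in finals


# Hypothetical word list.
ACCEPTOR = build_acceptor(["sleep", "sleeps", "slept"])
for candidate in ("sleeps", "slept", "sle", "sleeping"):
    print(candidate, accepts(ACCEPTOR, candidate))
```

Words that share a prefix share states, which hints at why finite-state dictionaries also serve as a compression device. Merging shared suffixes as well would need a minimization step that this sketch omits.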
Ideas from Computing
Complexity
The Chomsky Hierarchy
Grammar                                    Language                      Automaton
Type 0                                     Recursively Enumerable Sets   Turing Machines
Context-sensitive                          Context-sensitive             Nondeterministic linear-space-bounded Turing Machines
Context-free                               Context-free                  Nondeterministic push-down automata
LR(k)                                      Deterministic Context-free    Deterministic push-down automata
Regular Expressions, Left (Right) Linear   Regular Sets                  Finite-state automata
Computation and Psychology
Sentence Processing
Computational Linguistics as Engineering
Tools for Linguists
• TLF, OED
• Corpus Linguistics
• Field Notes
• Grammar Testing
Translation
• MT, Translator's Tools
• Alignment, Dictionaries, Term Banks
• Normalization and Tuning
Other Applications
• Writer's Tools
—Spelling
—Dictionary, Thesaurus
—Grammar
• Natural Language Interfaces
• Information Storage and Retrieval
CL & AI
Text Interpretation
Meaning
Linguistics ???
• Text, Meaning, and Interpretation