Hsin-Hsi Chen Grammar-1 CHAPTER 2 Grammars


  • CHAPTER 2 Grammars

  • Background

  • Communities

    artificial intelligence, computational linguistics: limited domain only

    engineering: acoustical speech recognition system (merry vs. very, pan vs. ban)

  • Please hand me the ???

    The choice you made was ??? good.

    Extract the most likely words from the signal, then decide their probabilities using information from the surrounding words.

    NLU community: parsing, interpretation, ...

    SR community: statistical techniques, ...

  • Morphology & knowledge of words

    morphology: structure of words

    syntax: structure of sentences

    semantics: meaning of individual sentences

    pragmatics: how sentences relate to each other

    Analysis of written language: morphological analyzer, parser, semantic interpreter

  • Morphological Analyzer

    word (lexical item)

    dictionary (lexicon)

    morphological analyzer: applies morphological rules to find the roots of words, e.g., going → go, cats → cat
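    The root-finding step can be sketched as suffix stripping checked against a lexicon. This is only a minimal illustration: the lexicon entries and the three suffix rules below are assumptions made for the example, not the analyzer the slides have in mind.

```python
# Toy lexicon of known roots (hypothetical; a real analyzer uses a full dictionary).
LEXICON = {"go", "cat", "dog", "walk"}

# Hypothetical suffix-stripping rules: (suffix to remove, text to put back).
RULES = [("ing", ""), ("s", ""), ("ed", "")]


def find_root(word):
    """Return the root of `word` if the word or a stripped form is in the lexicon."""
    word = word.lower()
    if word in LEXICON:
        return word
    for suffix, replacement in RULES:
        if word.endswith(suffix):
            candidate = word[: len(word) - len(suffix)] + replacement
            if candidate in LEXICON:
                return candidate
    return None  # no rule produces a known root


print(find_root("going"))  # go
print(find_root("cats"))   # cat
```

    Checking each candidate against the lexicon keeps a rule like "strip s" from turning an unknown word into a bogus root.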

  • Part of speech tagger

    part-of-speech (lexical category): a set of words having similar syntactic properties

    Part of speech (symbol): Example

    noun (noun): dog, equation, concerts

    pronoun (pro): I, you, it, they, them

    possessive (pos): my, your

    verb (verb): is, touch, went, remitted

    adjective (adj): red, large, remiss

    determiner (det): the, a, some

    proper noun (prop): Alice, Romulus

    conjunction (conj): and, but, since

    preposition (prep): in, to, into

    auxiliary verb (aux): be, have

    modal verb (modal): will, can, must, should

    adverb (adv): closely, quickly

    wh-word (wh): who, what, where

    final punctuation (fpunc): . ? !

  • Tag Set (Academia Sinica Balanced Corpus), Tag mapping

  • Words: Syntactic Functions

    Nouns refer to entities in the world like people, animals and things.

    Determiners describe the particular reference of a noun.

    Adjectives describe the properties of nouns.

    Verbs are used to describe actions, activities and states.

    Adverbs modify a verb in the same way as adjectives modify nouns.

  • Words: Syntactic Functions

    Prepositions are typically small words that express spatial or time relationships.

    Prepositions can also be used as particles to create phrasal verbs (e.g., add up).

    Conjunctions and complementizers link two words, phrases or clauses.

    A complementizer is a conjunction which marks a complement clause: I know that he is here.

  • Features: number, person, ...

    Use of features:

    grammatical checking (subject-verb agreement, ...): The dogs eat. The dog eats.

    semantic checking (PP-attachment): The hole that had been drilled by the woman in the wrench turned out to be very useful.

    Person: Singular / Plural

    1st person: I, me, mine / we, us, ours

    2nd person: you, yours / you, yours

    3rd person: she, him, its / they, them, theirs
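    The grammatical-checking use of features (subject-verb agreement) can be sketched as a comparison of number features stored in a lexicon. The dictionary layout and the `agrees` helper are assumptions for illustration; treating "eat" as simply plural ignores its first- and second-person singular uses.

```python
# Hypothetical lexicon: each word carries a category and a number feature.
LEXICON = {
    "dog":  {"cat": "noun", "number": "singular"},
    "dogs": {"cat": "noun", "number": "plural"},
    "eats": {"cat": "verb", "number": "singular"},
    "eat":  {"cat": "verb", "number": "plural"},
}


def agrees(subject, verb):
    """Subject and verb agree when their number features are identical."""
    return LEXICON[subject]["number"] == LEXICON[verb]["number"]


print(agrees("dogs", "eat"))   # True:  The dogs eat.
print(agrees("dog", "eats"))   # True:  The dog eats.
print(agrees("dogs", "eats"))  # False: *The dogs eats.
```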

  • Syntax and Context-Free Grammars: syntactic structure

    In the hotel the fake property was sold to visitors.

    prep det noun det adj noun aux verb prep noun fpunc

    [s-maj [s [pp [prep in] [np [det the] [noun hotel]]] [np [det the] [adj fake] [noun property]] [vp [aux was] [verb sold] [pp [prep to] [np [noun visitors]]]]] [fpunc .]]

    s-maj: major sentence

  • Context-Free Grammars (CFGs)

    a set of terminal symbols (words & punctuation)

    a set of non-terminal symbols (pos & syntactic category, e.g., np, vp)

    start symbol

    rewrite rules: non-terminal → (terminal | non-terminal)*

    Grammar: specification of permitted structures in a language

  • Part (a):

    s-maj → s fpunc (major sentence)

    s → np vp

    vp → verb

    np → det noun

    det → the

    noun → dog

    verb → ate

    fpunc → .

    Part (b), ambiguous grammar:

    vp → verb np

    vp → verb np np

    np → det noun noun

    np → noun

    noun → salespeople

    verb → sold

    noun → biscuits

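  • Ambiguous: Salespeople sold the dog biscuits.

    Transitive verb: [s-maj [s [np [noun Salespeople]] [vp [verb sold] [np [det the] [noun dog] [noun biscuits]]]] [fpunc .]]

    Ditransitive verb: [s-maj [s [np [noun Salespeople]] [vp [verb sold] [np [det the] [noun dog]] [np [noun biscuits]]]] [fpunc .]]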
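    The ambiguity can be checked mechanically by enumerating every parse. The sketch below uses the rules as they can be read off the slides (the arrows are restored by assumption) with a simple top-down, backtracking parser; the function names are hypothetical, and the approach only works because this grammar has no left recursion.

```python
# Rules as read from the slides, with arrows assumed; lexical rules kept separately.
RULES = {
    "s-maj": [["s", "fpunc"]],
    "s": [["np", "vp"]],
    "vp": [["verb"], ["verb", "np"], ["verb", "np", "np"]],
    "np": [["det", "noun"], ["det", "noun", "noun"], ["noun"]],
}
LEXICON = {
    "det": {"the"},
    "noun": {"salespeople", "dog", "biscuits"},
    "verb": {"sold", "ate"},
    "fpunc": {"."},
}


def parse(symbol, tokens, i):
    """Yield (tree, next_position) for every way `symbol` can start at position i."""
    if symbol in LEXICON:
        if i < len(tokens) and tokens[i] in LEXICON[symbol]:
            yield [symbol, tokens[i]], i + 1
        return
    for rhs in RULES[symbol]:
        yield from expand(symbol, rhs, tokens, i, [])


def expand(symbol, rhs, tokens, i, children):
    """Match the remaining right-hand-side symbols in `rhs` starting at position i."""
    if not rhs:
        yield [symbol] + children, i
        return
    for child, j in parse(rhs[0], tokens, i):
        yield from expand(symbol, rhs[1:], tokens, j, children + [child])


def all_parses(sentence):
    """Return every complete s-maj tree covering the whole sentence."""
    tokens = sentence.lower().replace(".", " .").split()
    return [tree for tree, end in parse("s-maj", tokens, 0) if end == len(tokens)]


trees = all_parses("Salespeople sold the dog biscuits.")
for tree in trees:
    print(tree)
print(len(trees))  # 2: one transitive reading, one ditransitive reading
```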

  • Chomsky normal form

    non-terminal → terminal

    non-terminal → non-terminal non-terminal

    X-bar theory (Jackendoff)

    X'' → specifier X'

    X' → X' modifier

    X' → X argument

    X: v, n, ...

  • Multiplying out features (cf. unification-based method)

    s → np vp, with number: singular or plural

    Agreement:

    s → np_singular vp_singular

    s → np_plural vp_plural
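    Multiplying out a feature means replacing one feature-bearing rule with one plain context-free rule per feature value. A minimal sketch, assuming a hypothetical `multiply_out` helper and an underscore naming scheme for the expanded categories:

```python
NUMBER_VALUES = ["singular", "plural"]


def multiply_out(lhs, rhs, agreeing, values=NUMBER_VALUES):
    """Expand one rule into one plain CFG rule per feature value.

    Symbols listed in `agreeing` must share the value, so each copy of the
    rule appends the same value to all of them.
    """
    expanded = []
    for value in values:
        new_rhs = [f"{sym}_{value}" if sym in agreeing else sym for sym in rhs]
        expanded.append((lhs, new_rhs))
    return expanded


for lhs, rhs in multiply_out("s", ["np", "vp"], agreeing={"np", "vp"}):
    print(lhs, "->", " ".join(rhs))
# s -> np_singular vp_singular
# s -> np_plural vp_plural
```

    The number of rules grows with the product of the feature value counts, which is the practical argument for the unification-based alternative.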

  • Local and Non-Local Dependencies

    A local dependency is a dependency between two words expressed within the same syntactic rule.

    A non-local dependency is an instance in which two words can be syntactically dependent even though they occur far apart in a sentence (e.g., subject-verb agreement; long-distance dependencies such as wh-extraction).

    Non-local phenomena are a challenge for certain statistical NLP approaches (e.g., n-grams) that model local dependencies.

  • Semantic Roles

    A semantic role is the underlying relationship that a participant has with the main verb in a clause.

    Also called: semantic case, thematic role, theta role (generative grammar), and deep case (case grammar).

    Most commonly, noun phrases are arguments of verbs. These arguments have semantic roles: the agent of an action, the patient, and other roles such as the instrument or the goal.

    John (agent) hit Bill (patient). vs. Bill (patient) was hit by John (agent).

    In English, these semantic roles correspond to the notions of subject and object, but things are complicated by the notions of direct and indirect object, and active and passive voice.

  • Agent: a person or thing who is the doer of an event. The boy ran down the street.

    Patient: the surface object of the verb in a sentence. He opened the door.

    Instrument: an inanimate thing that an agent uses to implement an event. The cook cut the cake with a knife.

    Goal: the thing toward which an action is directed. He threw the book at me.

    Beneficiary: a referent which is advantaged or disadvantaged by an event. John sold the car for a friend.

    More roles and examples: Semantic Role Labeling

  • Subcategorization

    Different verbs can relate different numbers of entities: transitive versus intransitive verbs.

    Tightly related verb arguments are called complements, but less tightly related ones are called adjuncts. Prototypical examples of adjuncts tell us the time, place, or manner of the action or state described by the verb.

    Verbs are classified according to the type of complements they permit. This is called subcategorization.

    Subcategorizations allow us to capture syntactic as well as semantic regularities.

    Academia Sinica Dictionary (Format, Sample)

  • Chinese PropBank: [arg0] [argm-adv] [rel] [arg1] [arg2] [arg1] [arg0] [rel] [argm-ext] [arg1] [arg0] [rel] [arg2] [arg0] [argm-tmp] [rel] [arg1]

    More examples 1-4937

  • Semantics

    Semantics is the study of the meaning of words, constructions, and utterances.

    Semantics can be divided into two parts: lexical semantics and combination semantics.

    Lexical semantics: hypernymy vs. hyponymy; antonymy vs. synonymy; meronymy/holonymy; homonymy vs. polysemy

    Compositionality: the meaning of the whole often differs from the meaning of the parts.

    Idioms correspond to cases where the compound phrase means something completely different from its parts.

  • hypernym: The generic term used to designate a whole class of specific instances. Y is a hypernym of X if X is a (kind of) Y.

    hyponym: The specific term used to designate a member of a class. X is a hyponym of Y if X is a (kind of) Y.

    holonym: The name of the whole of which the meronym names a part. Y is a holonym of X if X is a part of Y.
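    These definitions translate directly into lookups over "is a (kind of)" and "is a part of" links. The links below are made-up examples rather than entries from any real lexical database, and following the is-a chain transitively is a design choice of this sketch.

```python
# Hypothetical is-a links (hyponym -> hypernym); not taken from the slides.
IS_A = {"poodle": "dog", "dog": "animal", "cat": "animal"}
# Hypothetical part-of links (meronym -> holonym).
PART_OF = {"wheel": "car", "engine": "car"}


def is_hypernym(y, x):
    """True if y is a hypernym of x, i.e. x is a (kind of) y, following is-a links upward."""
    while x in IS_A:
        x = IS_A[x]
        if x == y:
            return True
    return False


def is_hyponym(x, y):
    """True if x is a hyponym of y; the inverse of the hypernym relation."""
    return is_hypernym(y, x)


def is_holonym(y, x):
    """True if y is a holonym of x, i.e. x is a part of y."""
    return PART_OF.get(x) == y


print(is_hypernym("animal", "poodle"))  # True
print(is_hyponym("dog", "animal"))      # True
print(is_holonym("car", "wheel"))       # True
```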

  • Pragmatics

    Pragmatics is the area of study that goes beyond the meaning of a sentence and tries to explain what the speaker really is expressing.

    Understand the scope of quantifiers, speech acts, discourse analysis, anaphoric relations.

    The resolution of anaphoric relations is crucial to the task of information extraction.

  • Scope of Quantifiers

  • Speech Act Examples

    Columns: Utterance; Speech Act; Chunked Semantic Concepts

    U: (I want to register outpatient service of ophthalmology department); Elaborate-fact; (DEPARTMENT_TYPE) (SERVICE_TYPE) (PERSON) (WANT) (REGISTER)

    S: (Would you like to go to main hospital or Gong-Gwen branch?); Request-ref; (BRANCH_NAME1) (BRANCH_NAME2) (REQUEST) (WANT) (AT) (SEE_DOC) (QWORD) (SEE_DOC)

    U: (The main hospital); Answer-ref; (BRANCH_NAME)

    S: (Is this your first time consulting a doctor at NTU Hospital?); Request-if; (GENERAL_NUM) (REQUEST) (YES) (COME_TIME) (COME) (HERE) (SEE_DOC) (GROUND)

    U: (No.); Answer-if; (NO)

    S: (You have registered Dr. Shue-Ching on March 29.); Elaborate-fact; (DEPARTMENT_TYPE) (DOCTOR_NAME) (DATE_VALUE) (PERSON) (REGISTER) (YES) (DOCTOR)

    U: (OK.); Accept; (YES)

  • Anaphoric Relations

    Anaphora vs. Co-Reference

    Anaphora: Type/Instance, Function/Value, ...

  • Discourse Analysis

  • Grammars

  • Grammar as knowledge representation: a way to represent certain aspects of what we know about a language

    General criteria of grammar formalism: linguistic naturalness, mathematical power, and computational effectiveness.

  • NL employs the following knowledge:

    A representation for syntactic categories or parts of speech.

    A data type for words (and hence a lexicon, dictionary or word list).

    A data type for syntactic rules.

    A data type for syntactic structures.

  • Words, rules and structures

    Lexicon: associates each word with its properties, e.g., syntactic category (verb, noun, ...), subcategory (transitive, intransitive, ...), morphological information, semantic information.

  • PATR notation (words)

    (Ex 1) Word paid: <cat> = V. (syntactic category)

    (Ex 2) Word paid: <cat> = V (syntactic category), <tense> = past (past tense), <arg1> = NP. (transitive)

    A feature description is always partial.

  • Phrase structure rule: a particular syntactic category (LHS) is composed of (RHS).

    S → NP VP (a sentence consists of a noun phrase followed by a verb phrase)

    VP → V NP (a verb phrase consists of a verb followed by a noun phrase)

  • Phrase structure rules introduce lexical items: V → paid ===> Word paid: <cat> = V

    Example. Grammar 1

    Rule {simple sentence formation} S → NP VP.

    Rule {transitive verb} VP → V NP.

    Rule {intransitive verb} VP → V.

    Word Dr Chan: <cat> = NP.

    Word nurses: <cat> = NP.

    Word MediCenter: <cat> = NP.

    Word Patients: <cat> = NP.

    Word died: <cat> = V.

    Word employed: <cat> = V.

  • Context-free phrase structure grammars (CF-PSGs)

    LHS: a category, regardless of context

    RHS: category or symbol

    Functions of a grammar:

    (1) Define the sets of grammatical words in a language.

    (2) Associate one or more structures with each grammatical string.

  • Representation of parsing trees

    [s, [np, MediCenter], [vp, [v, employed], [np, nurses]]]

    [s [np MediCenter] [vp [v employed] [np nurses]]]

    s(np(MediCenter), vp(v(employed), np(nurses)))
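    The textual representations describe the same tree, so one can be converted into another. A small sketch, assuming the tree is held as a nested Python list whose first element is the node label; the two function names are hypothetical:

```python
# The MediCenter example as a nested list: [label, child, child, ...].
tree = ["s", ["np", "MediCenter"], ["vp", ["v", "employed"], ["np", "nurses"]]]


def to_brackets(node):
    """Render a tree as a labelled bracketing, e.g. [np MediCenter]."""
    if isinstance(node, str):
        return node
    return "[" + " ".join(to_brackets(child) for child in node) + "]"


def to_term(node):
    """Render a tree as a nested term, e.g. np(MediCenter)."""
    if isinstance(node, str):
        return node
    label, *children = node
    return label + "(" + ", ".join(to_term(child) for child in children) + ")"


print(to_brackets(tree))  # [s [np MediCenter] [vp [v employed] [np nurses]]]
print(to_term(tree))      # s(np(MediCenter), vp(v(employed), np(nurses)))
```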

  • How to evaluate a particular grammar for a (fragment of a) language?

    (1) Does it undergenerate?

    (2) Does it overgenerate?

    (3) Does it assign appropriate structures to the strings that it generates?

  • undergenerate: There are some syntactically well-formed expressions to which the grammar assigns no structure. (Undergeneration is not necessarily a problem if the goal is to produce a language.)

    Figure: NL and grammar.

  • overgenerate: The grammar legitimates strings that cannot be construed as grammatical expressions of the language in question. (Overgeneration is not necessarily a problem if the goal is to recognize or understand well-formed language.)

    Figure: NL and grammar.

  • Subcategorization and the use of features

    *Dr. Chan died patients.

    *MediCenter employed.

    died: intransitive

    employed: transitive

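  • Example. Grammar 2

    Rule {simple sentence formation} S → NP VP.

    Rule {intransitive verb} VP → V: <V arg1> = 0.

    Rule {single complement verbs} VP → V X: <V arg1> = <X cat>.

    Rule {coordination of identical categories} X0 → X1 C X2: <X0 cat> = <X1 cat>, <X0 cat> = <X2 cat>, <X0 arg1> = <X1 arg1>, <X0 arg1> = <X2 arg1>.

    Word died: <cat> = V, <arg1> = 0.

    Word recovered: <cat> = V, <arg1> = 0.

    Word slept: <cat> = V, <arg1> = 0.

    Word employed: <cat> = V, <arg1> = NP.

    Word paid: <cat> = V, <arg1> = NP.

    Word nursed: <cat> = V, <arg1> = NP.

    Word and: <cat> = C.

    Word or: <cat> = C.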

  • Examples

    Nurses died.

    *Nurses died patients.

    *MediCenter employed.

    MediCenter employed nurses.
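    The judgments in these examples follow from matching each verb against the complement it expects. The sketch below is a deliberately small recognizer covering only "NP V" and "NP V X" strings; the dictionary layout, the `complement` key, and `is_sentence` are assumptions for illustration, not the PATR implementation.

```python
# Hypothetical lexicon: a category for every word, plus the complement a verb expects
# (None marks an intransitive verb).
LEXICON = {
    "nurses": {"cat": "NP"},
    "patients": {"cat": "NP"},
    "medicenter": {"cat": "NP"},
    "died": {"cat": "V", "complement": None},
    "employed": {"cat": "V", "complement": "NP"},
}


def is_sentence(text):
    """Accept 'NP V' when V takes no complement and 'NP V X' when X matches V's complement."""
    words = text.lower().rstrip(".").split()
    entries = [LEXICON.get(word) for word in words]
    if None in entries or len(entries) not in (2, 3):
        return False
    subject, verb, *rest = entries
    if subject["cat"] != "NP" or verb["cat"] != "V":
        return False
    if not rest:
        # Corresponds to the intransitive rule: VP -> V.
        return verb["complement"] is None
    # Corresponds to the single-complement rule: VP -> V X.
    return verb["complement"] == rest[0]["cat"]


for example in ["Nurses died.", "Nurses died patients.",
                "MediCenter employed.", "MediCenter employed nurses."]:
    print(example, is_sentence(example))
```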

  • Phrase structure tree for sentence with conjunction

  • Grammar 2 admits all of the following examples:

    Nurses died.

    Nurses died and patients recovered.

    Nurses died and patients recovered and Dr Chan slept and MediCenter employed nurses.

    Nurses died and patients recovered and Dr Chan slept and MediCenter employed nurses and ...

    Grammar 3 = Grammar 2 + Rule {prepositional phrases} PP → P X: <P arg1> = <X cat>.

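  • Example: Lexicon 3

    Word approved: <cat> = V, <arg1> = PP.

    Word disapproved: <cat> = V, <arg1> = PP.

    Word appeared: <cat> = V, <arg1> = AP.

    Word competent: <cat> = AP.

    Word well-qualified: <cat> = AP.

    Word had: <cat> = V, <arg1> = VP.

    Word believed: <cat> = V, <arg1> = PP.

    Word thought: <cat> = V, <arg1> = S.

    Word of: <cat> = P, <arg1> = NP.

    Word fit: <cat> = AP.

    Word seemed: <cat> = V, <arg1> = AP.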

  • Grammar 3 admits additional examples like:

    Nurses thought Dr Chan seemed competent.

    Dr Chan appeared well-qualified and disapproved of MediCenter.

    Patients had believed nurses thought Dr Chan had slept.

    (sentential or verb phrase complements of a verb)

  • Long-distance dependencies: wh-question, relative clause, ...

    Whom did Freg give the ball to e? (dependency between Whom and e)

    Whom does Alice believe Freg wants to give the ball to e? (dependency between Whom and e)

  • Unbounded dependencies: there is no limit on the amount of intervening material that can occur between the displaced constituent and the empty constituent. cf. ATN HOLD list (1971).

  • Slash categories: s/np, vp/np, ...

    s → wh s/np

    s/np → vp

    s/np → np vp/np

    vp/np → verb np pp/np

  • Example: Rules for Grammar 4

    x/y: a category x0 whose cat is x and whose slash is y: <x0 cat> = x, <x0 slash> = y.

    Interpretation: an expression of category x/y is an expression of category x from which an expression of category y is missing.

    e.g. S/NP: a sentence (or clause) that has got a noun phrase missing.

    VP/PP: a verb phrase that is missing a prepositional phrase.

  • Example. Phrase structure tree with slash categories

    Transfer of information about a missing category from mother to daughter.

    Dr. Chan, nurses thought MediCenter had employed

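  • Rule {simple sentence formation} S → NP VP: <S slash> = <VP slash>, <NP slash> = 0.

    Rule {intransitive verb} VP → V: <V arg1> = 0, <V slash> = 0, <VP slash> = 0.

    Rule {single complement verbs} VP → V X: <V arg1> = <X cat>, <V slash> = 0, <VP slash> = <X slash>.

    Rule {prepositional phrases} PP → P X: <P arg1> = <X cat>, <P slash> = 0, <PP slash> = <X slash>.

    Rule X0 → X1 X2: <X0 cat> = S, <X1 cat> = NP, <X2 cat> = VP, <X0 slash> = <X2 slash>, <X1 slash> = 0. That is, S/Y → NP VP/Y:

    Rule S/S → NP VP/S.

    Rule S/NP → NP VP/NP.

    Rule S/VP → NP VP/VP.

    Rule S/AP → NP VP/AP.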

  • Rule {coordination} X0 → X1 C X2: = = =0 = = = = .

    Rule {topicalization} X0 → X1 X2: =S =no =S = =no =. That is, S → X S/X.

    A sentence can consist of some category followed by a sentence which is missing an expression of that category.

    MediCenter, nurses disapproved of _.

    Of MediCenter, nurses disapproved _.

    Well-qualified, Dr Chan had seemed _.

    Rule {slash elimination} X0 → : = =yes. That is, X/X → (empty).

  • Classes of grammars and languages

    Chomsky hierarchy of languages:

    Recursively enumerable sets (type 0)

    Context-sensitive languages (type 1)

    Indexed languages

    Context-free languages (type 2)

    Finite-state languages (type 3)

  • Context-free grammars and languages

    CF-PSGs: high-efficiency algorithms exist for recognizing and parsing CFLs.

    factors / types: CF-PSGs, RTNs

    power: equivalent

    semantics: declarative (CF-PSGs), procedural (RTNs)

    iteration

    recursion: yes

  • Indexed grammars and languages

    a^n b^n c^n problem

    <cat> = S, <stack top> = a, <stack stack top> = a, <stack stack stack top> = z.

    [cat: s, stack: [top: a, stack: [top: a, stack: [top: z, stack: ?]]]]

  • Grammar

    Rule {push endmarker on to stack} S → a A: =z =.

    Rule {push a on to stack} A0 → a A1: =A =A =a =.

    Rule {copy stack from A to B} A → B: =.

    Rule {pop a off stack} B0 → b B1 c: =B =B =a =.

    Rule {pop endmarker off stack} B → b c: =z.

    For every n, there is a category x s.t. =z
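    The effect of the five rules can be sketched procedurally: the first a pushes the endmarker z, each further a pushes an a, and every stack symbol then licenses one b ... c pair. This is an illustration of the stack idea only, written as an ordinary recognizer rather than as the feature equations of the grammar; the function names are hypothetical.

```python
def match_b(rest, stack):
    """Category B with the given stack: is `rest` derivable from it?"""
    if not stack:
        return False
    if stack[-1] == "z":
        # Pop endmarker off stack: B -> b c.
        return rest == "bc"
    # Pop a off stack: B0 -> b B1 c, with B1 carrying the rest of the stack.
    return (len(rest) >= 2 and rest[0] == "b" and rest[-1] == "c"
            and match_b(rest[1:-1], stack[:-1]))


def recognize(string):
    """Recognize a^n b^n c^n for n >= 1."""
    stack = []
    i = 0
    # S -> a A pushes the endmarker z; A0 -> a A1 pushes one a per further a.
    while i < len(string) and string[i] == "a":
        stack.append("z" if not stack else "a")
        i += 1
    # A -> B copies the stack; B then pops it while consuming b ... c pairs.
    return match_b(string[i:], stack)


for s in ["abc", "aabbcc", "aaabbbccc", "aabbc", "abcc", ""]:
    print(repr(s), recognize(s))
```

    The stack is what lets the number of b's and c's be tied to the number of a's, which a plain context-free grammar cannot do for all three symbols at once.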