5

Click here to load reader

Padova

  • Upload
    kienna

  • View
    4

  • Download
    0

Embed Size (px)

DESCRIPTION

Padova

Citation preview

  • Dialogues From Texts: How to Generate Answers from a DiscourseModel

    Rodolfo Delmonte, Dario Bianchi*Ca' Garzoni-Moro, San Marco 3417

    Universit "Ca Foscari"30124 - VENEZIA

    Tel. 39-41-2578464/52/19 - Fax. 39-41-5287683E-mail: [email protected] - website: byron.cgm.unive.it

    *Dipartimento di Ingegneria dell'InformazioneUniversita' degli Studi di Parma

    43100 Parma - ItalyTel. 39-521-905734- Fax 39-521-905723

    E-mail [email protected] - website:www.ce.unipr.it/people/bianchi

    There are two main problems we are faced withwhen thinking of answers/descriptions generator:choice of actual words, and order in which theyshould occur. In order to select the appropriatelexical predicate, a number of crossing abstractrepresentation should conspire to produce themost adequate result. In particular, we assumethat in a plan there are different levels ofabstractions involved: the higher level isrepresented by relations very similar to RSTrhetorical relations. These in turn specialize intotuples of semantic relations which aresubsequently used to recover predicates from thelexicon. These tuples may be represented by asemantic class and an associated aspectual class,as for instance in:

    1. IntroductionIn this paper, we shall deal with a system

    for text generation and understanding calledGETA_RUN (GEneration of Text and Analysisfor Reference Understanding). Dialogues in oursystem may be regarded as a sequence ofquestion-answering actions with the aim toretrieve as much information as possible from atext. In a certain sense the system acts as a veryintelligent information retrieval system whichallows to recover the contents of a text on thebasis of what is actually being linguisticallyexpressed. No implicit information is recoverablefrom the system, however.

    The system works by encoding thelinguistic contents of a text fully analyzed in theDiscourse Model where all entities, propertiesand relations are recorded. Further semanticproperties and relations are then transferred into aKnowledge Representation Module runningunder KL-One to be used for consistency checkwhenever the dialogue is started up.

    narrative(movement, activity).background(existence, state).In addition, we need some criteria to establishthe order with which events may take place inthe world: this is not to be intended in the senseof domain discourse plans for task orienteddialogues above. The idea we have in mind isbased on conceptual classes onto which linkingrules may be established so as to disallowunwanted sequences, as for instance in,

    The system is capable of building adiscourse plan in order to keep track of theongoing conversation and then to specifysemantic structure sufficient for sentence levelgeneration.

    LR1: *(GO(TOx ==>GO(TOx)LR2: *(STAY(ATx ==> STAY(ATx)LR3: *(BE(ATx ==> BE(ATx)What linguistic information constitutes an

    adequate input to the tactical component ? Inorder to answer this first question we need todecide how much work has to be assigned to theplanner. From the subdivision of labour betweenthe two components, we shall be able toascertain what is left to the grammar itself andthe lexicon.

    LR4: *(BE(ATx ==> GO(TOx)These rules are axioms made up of two sides: theleft is a part of a conceptual representation and isthe consequent and the right side is the premise;they can be applied at sequences of relations oneof which must be the unrealized or yet to berealized relation, represented by the lefttemplate.The right side template can be liked toany of the relations already present in the plan.The variable x is linked to the object, locationor other semantic type for an argument. In

    2. Plan Creation from Rhetorical Relationsand Conceptual Representations

  • particular, in the case of LR4, if some entity hasgot to be AT(x), he should GO(TOx) first.

    In the knowledge representation we establish asemantic relation that holds between a sentenceand an interval in the spirit of interval semantics.We specify what property of an interval isentailed by the input sentence and thencompositionally we construct a representation ofthe event from the intervals and their associatedproperties. We already presented in a previouspaper the interaction between KL-One and thesemantic representation of a story contained in aDiscourse Model (see Bianchi, Delmonte, 1997).We shall now concentrate on the Realization orTactical Component.

    Conceptual Representations(CR) have beenintroduced by Jackendoff and others, however werefer to Dorr(1993) who introduced a number ofaugmentation to the original set which we alsoendorse. Delmonte(1990, 1996) considered CRthe link from the semantics to the knowledge ofthe world needed to represent meaning in ageneral and uniform manner.Since we endorse LFG as our theoreticalframework (Bresnan, 1987), and our lexical formsencompass semantic information related tosemantic roles, we assume that the correctmapping from lexical forms and CR is achievedby means of semantic roles and aspectual class.In this way, CRs map onto and outwards LexicalForms. C-structure and f-structure representationwould be completely lost in our framework oncethe Discourse Model is being built. TheDiscourse Model only contains reference tosemantic roles and other semantic relations likePoss, which have a correspondence in the CR.

    3. Extracting-expressing relevant facts fromthe DMIf we want to characterize the input to theTactical Component we can paraphrase itscontribution as follows:Select the most relevant sequence of relationsfor the most important entity of the discoursemodel and order them temporally; thenestablish coordinate or subordinatedependancies among adjacent relations.Finally recover arguments for each relation.

    As to Aspectual Classes we use them to definelexical classes rather than sentence level aspectualclass, which as we said above, is the result of theinteraction of an extended number of linguisticelements. Lexical aspect is used to individuatethe appropriate internal constituency of the event(see also Delmonte, 1997 and above), and also todrive the semantics, which together with theinformation coming from arguments and adjunctswill be able to trigger the adequate knowledgerepresentation. In particular, as shown inPassonneau 1988, we need to process reference toentities and events in the discourse model, inorder to know what predicates are asserted tohold over what entities and when. We use thefollowing lexical aspectual classes:

    However there are two main problems thatrequire particular attention, and they are, choiceof cue words and choice of syntactic structurewhich in turn is cast into another important task,concept aggregation(Hovy, 1994:366). In otherwords, we have a set of discourse structure thatwe need to express more concisely: we need toknow how (which syntactic structure to use) andwhat cue word to use. In order to do this, weneed to aggregate concepts which aresemantically related.The Focus mechanism allows the TacticalComponent to inspect a set of discoursestructures to see whether they contain the samefocus value; in addition there is a number ofadditional issues to be taken into account, suchas the complexity of the remainder of thediscourse substructure, the overall style of thetext and its rhythm. In more detail, Hovysuggests a number of heuristics to governsentence formation, including:

    a. achievement; b. achievement irreversible; c.achievement iterable; d. accomplishment; e.accomplishment ingressive; f. activity; g. state;h. state_resultMeaning associated to each semantic class areexpanded into conceptual classes by means ofaspectual information. For instance, thefollowing class11 = exten (GO(TO[end] - (GO(TO[exist] finire,creare

    - Embedding of background information can berealized as an adjective, appositive NP, PP orrelative clause in this order of preference;is split into the following two meanings:

    Funct(exten, achievement irreversible)CAUSE(GO(TO[end] finire to end

    - Sentences should contain at most one level ofembedding which should occur before focustransformation, and in the leftmost nuclear clausewith the same focus value;

    Funct(exten, accomplishment)CAUSE(GO(TO[exist] creare to createwhere Funct may assume only those rhetorical ordiscourse relation labels that constitute aconceptually admissible link. Elaboration orDescription are not allowed by Linking Rules.Narration and Egression would be allowed.

    - Coordination occurs only between clausesheaded by the adequate rhetorical relation such asJOIN, SEQUENCE, CONTRAST;- Sentences should contain no more than threeclauses.

  • We are now able to pick up the relevant fact andits semantic structure from the Discourse Model.We extract the relation and its arguments fromthe list of facts associated to the current firstentity. In particular we are interested inhighlighting the semantic role of the main entityin the current fact. This will be used to decidewhether the Voice to be associated to thegenerating phase has to be set in the Active or inthe Passive mode.

    argument has in canonical predicate argumentstructure. Syntactic non-canonical realizations,like for instance passive construction, expletivesubject insertion, left-dislocation and any otherpossible grammatically relevant stucturaldecision is left to the phrase structure rulecomponent of the grammar. Consider the need torealize one argument as clitic pronoun, as isrequired in Romance languages: the semanticstructure would carry the information that thesecond argument of the predicate belongs to TOPtype, as for instance in the followingrepresentation for Mario, which is realized as theclitic pronoun lo/him independently by thegrammar. The fact that Mario has been assignedthe TOP type in the slot reserved for Definitinessdoes not depend on syntactic but merely onpragmatic and semantic information. Features forthe choice of the adequate pronominal form arepartially extracted from the lexical entryassociated to Mario, which are Person=3,Gender=Masculine, Animacy= Human,

    Other important semantic relations and propertiesto be kept under control are:- lexical level semantic properties of predicateslike: checking for synonyms and antonyms- lexical level semantic relations betweenproperties, events and entities which can be thelexicalization of the same meaning or beambiguous. Other important semantic relationsare semantic inclusion and overlapping inmeaning which may turn the use of a givenlexical item redundant;- utterance level semantic relations: may have thesame properties above and in addition a certainutterance may express meaning which strictlyimplicates and/or presupposes the meaning of apreviously expressed utterance, thus making itscontent redundant;

    [top, nil, sing, mario] --> loIn addition, Number is set to singular, and Caseis equal to Accusative owing to the fact that theargument is the second. The additionalinformation that lo should be anteposed to theverbal predicate is not encoded in the semanticstructure but is independently imposed by thephrase structure rules associated to thetransitive verb syntactic class, and the presenceof a TOP argument. On the contrary, byinterleaving focus rules with the realizationgrammar, have the undesirable side-effect ofhaving to check where the Focus argument hasbeen assigned in the case frame slots of thesentence level predicate before entering the correctvp rule. In our grammar, we capture passivestructures very simply by means of the featurePASSIVE in slot assigned to VOICE in theinput semantic structure. The grammar will lookfor second argument or third argument accordingto argument structure and execute a LexicalRedundancy Rule, according to LFG: theargument selected will be set to Subject of thecurrent structural realization and realized first.Then second argument will be passed to VPstructure as Adjunct Oblique with the semanticrole of Agent. Semantic role will trigger theadequate preposition by to be instantiated infront of the NP. Choice of focussed constituent isagain present in the linear disposition ofarguments: in case Recipient/Beneficiary/Goalshould be fronted, it would have been positionedas second argument, for ditransitive verbs only,however. In other words, we perform dative shiftin the pragmatic/semantic component beforeentering the realization phase.

    - an utterance may be contradictory and/or denythe meaning of a previous utterance;contradiction may be detected at the level oftemporal reasoning when the order of occurrenceof two adjacent event has to be that of precedencerather than consequence.- finally the way in which entities are expressedmay vary from proper noun to emptypronominal, dependending on the previous text;also properties may be expressed in apropositional way, as an adverbial phrase or asan adjectival phrase.

    4. Tactical ComponentIt is generally agreed that a suitable input to therealization component must be constituted bysome form of semantic representation which mayinclude the actual lexical choice or some abstractconceptual representation of each lexical item forthe final realization.However, there are many differences that can befound between the approaches documented in theliterature and ours. In our system, input to therealization has a general argument structure and anumber of functional features associated that areused by the grammar to generate the mostadeqauate structural configuration. Top-downsemantic, rhetoric and pragmatic decisions arepaired with bottom-up lexical requirementsimposed by each predicate on the fly, whilerealizing each lexical item. In particular,argument specification only reflects the order each

  • 4.1 Text Generation in Italian transitive verbs. Also notice the apostrophewhich deletes the final vowel of clitic pronouns:in this case no information is available on thegender, only case can be inferred.

    Generating text in Italian is intrinsically boundto the peculiarities of its surface grammar.Summarizing is a task that requires full discoursestructure information which in our case is madeavailable from Geta_run and feeds directly theplanning component. In turn, grammar andlexicon needed for the tactical component isreadily available from the DCG parser.

    Example 5 is a case of obligatory postverbalSubject: Cosa i tuoi amici hanno detto isungrammatical. Notice the use of the article infront of a possessive pronoun which is againobligatory and can be dispensed with in case ofnames denoting family relations like brother,sister, mother etc.

    Italian is a language which allows and in somecases requires the Subject to be generated inpostverbal position. Subject inversion is a freeprocess, i.e. it does not obey such constraints asthe D(efinitiness) E(effect), and requires noexpletive, as is the case with other languages likeEnglish, German or French. In fact, Italian isregarded as a language with empty expletives.Choice for auxiliaries is determined on the basisof syntactic category: unaccusatives require be,while the other categories require have.However, passive and impersonal constructionsalso require be as auxiliary. In addition,Object NP can be expressed as clitic and be thusobligatorily positioned in front of tensed verb.

    Finally, example 6 shows two important featuresof Italian and other Romance languages: first,adjuncts can be freely interspersed betweenSubject and verb or verb and Object, in otherwords there is no adjacency constraint applicable.Second, Italian has compound prepositions, i.e.a preposition with article which in turn canundergo epenthesis by the use of the apostrophe.In the latter case, generation of the compoundpreposition requires gender and numberinformation to be made available beforehand, orelse it should be generated afterwards.Input to our Tactical Component is as follows:

    Here are some examples, where we include verbalfeatures in brackets, a literal translation andfinally an approximate English rendering:

    Voice: active/passiveTense: any tenseMood: any mood including imperative,interrogative etc.1. Arriva [Arriva_3/pers/sing_pres/ind]

    lit. Arrives |He is arriving| Modality any modality2. Arriva Gino [Arriva_3/pers/sing_pres/indGino] lit. Arrives John |John is arriving|

    Main Relation: the main clause relationMain Relation:Modification

    3. E arrivata Maria Adverbial Phrase ; Subordinate Clause ;Coordinate Clause ; Prepositional Phrase ;Predicative Adjunct

    [E_3/pers/sing_pres/ind arrivata_3/pers/ sing/fem_past/partic Maria] lit. Is arrived Mary |Mary has arrived| List of Arguments:4. Gino lha conosciuta ieri 1st Argument: Subject argument - Sentential

    subject; 2nd Argument: Object, Oblique,Sentential Object; 3rd Argument:IndirectObject or Oblique

    [Gino l_acc ha_3/pers/sing_pres/indconosciuta_3/pers/sing/fem_past/partic ieri] lit. John her has known yesterday |John mether yesterday| Argument specifications 1.5. Cosa hanno detto i tuoi amici? Semantic Type: [Cosa hanno_3/pers/plur_pres/ind detto_3/pers/sing/mas_past/partic i_mas/plur tuoi_2/pers/ plur/mas amici_plur/mas]

    a. prop (proper name), b. def (definite commonnoun), c. ndef (indefinite common noun),d. foc (focussed noun to be fronted by syntacticstructures like left dislocation, it- cleft,topicalization, etc.),

    lit. What have said the your friends |Whatdid your friends say|6. Gino ha visto ieri Maria sullautobus e. top (topic noun - to be pronominalized), f. rel

    (relative pronoun argument), [Gino ha_3/pers/sing_pres/ind visto_3/pers/sing/mas_past/partic ieri Maria sullautobus_mas]

    g.trace(controllee of syntactic or lexicalcontroller), i. pro(empty or lexically unexpressednoun), lit. John has seen yesterday Mary on the bus

    |Yesterday John saw Mary on the bus | Cardinality : : a number/nil; Number: :sing(ular)/pl(ural) ; Head: : lexical head

    As can be noticed from the features associated tothe verbal morphemes, agreement on the pastparticiple should include Gender as well asNumber which in case of unaccusatives shouldagree with the Subject. However, as example 4shows, agreement goes with the Object with

    Argument specifications 2. ModificationAdjectival Phrase, Prepositional Phrase,Predicative AdjunctsThe following constitutes the input to theTactical Component for the examples 1 and 2:

  • Ex.1: Ieri Mario corse a casa / Yesterday Marioran home

    di storie, Atti Apprendimento Automatico eLinguaggio Naturale, Torino, pp.95-98.

    Voice=act, Delmonte R.(1990), Semantic Parsing with anLFG-based Lexicon and ConceptualRepresentations, Computers & the Humanities,5-6, 461-488.

    Tense=past,Mood=indic,Modality=assert ,Main_relation=correre, Delmonte R.(1995a), Lexical Representations:

    Syntax-Semantics interface and WorldKnowledge, in Notiziario AIIA (AssociazioneItaliana di Intelligenza Artificiale), Roma,pp.11-16.

    Main_relation_modifier=[dtemp,ieri],List_of_arguments=[First_argument=[prop, nil, sing, mario], Second_argument=[meta, casa] ] Delmonte R.(1997), Lexical Representations,

    Event Structure and Quantification, QuaderniPatavini di Linguistica, 39-94.

    Ex.2: Maria che ieri lo cercava lo insult / Mariawho yesterday was looking for him, insulted himVoice=act, Delmonte R., D.Bianchi, E.Pianta(1992),

    GETA_RUN - A General Text Analyzer withReference Understanding, in Proc. 3rdConference on Applied Natural LanguageProcessing, Systems Demonstrations, Trento,ACL-92, 9-10.

    Tense=past,Mood=indic,Modality=assert ,Main_relation=insultare,List_of_arguments=[ First_argument=[prop, nil, sing, [maria, Delmonte R., D.Bianchi(1994), Computing

    Discourse Anaphora from GrammaticalRepresentation, in D.Ross & D.Brink(eds.),Research in Humanities Computing 3,Clarendon Press, Oxford, 179-199.

    First_argument_modifier=[ Voice=act, Tense=imperf, Mood=indic, Main_relation=cercare, Delmonte R., Dibattista D.(1995d), Switching

    from Narrative to Legal Genre, Working Papersin Linguistics, C.L.I., University of Venice.

    Main_relation_modifier=[dtemp,ieri], List_of_arguments=[First_argument=[rel, nil, sing, maria], Dorr B.,(1993), Machine Translation, MIT

    Press, Cambridge Mass.Second_argument=[top, nil, sing, mario] ] Mann, W.C. & Thompson, S.A.(1987),

    Rhetorical Structure Theory: A theory of textorganization. Technical Report ISI/RS-87-190,ISI.

    Second_argument=[top, nil, sing, mario] ]

    4. REFERENCES Palmer M.S., Passoneau R.J., C.Weir,T.Finin(1994), The KERNEL textunderstanding system, in Pereira & Grosz(eds),Natural Language Processing, MITPress, 17-68.

    Bianchi D., R.Delmonte, E.Pianta(1993),Understanding Stories in Different Languageswith GETA_RUN, Proc.EC of the ACL,Utrecht, 464.Bianchi D., Delmonte R.(1997),Rappresentazioni concettuali nella comprensione