NLG STEC Workshop
April 20-21, 2007
Arlington, VA
Nancy Green
Univ. of North Carolina Greensboro, USA
NLG Pipeline Model & STEC
STEC
Pro-STEC Assumptions:
• (All/most/worth-funding) NLG can be decomposed into well-defined independent STEC-modules such that improving each one will advance NLG
• Input/output representation for STEC is non-controversial
Discourse KR&R
Domain Communication KR&R
User Model KR&R
Media/Presentation-related KR&R
NLG ‘Pipeline’ = Tip of Iceberg
Who will pay for NLG research outside the classical pipeline? It is essential empirical research and a major cost, but I am afraid it would fall outside the STEC funding model.
Example NLG System KR&R
GenIE: generates letters to genetics clinic patients; goal to justify medical experts’ conclusions such that all arguments are comprehensible to a lay person
• Discourse: argumentation
• Domain Communication: conceptual causal model underlying expert-lay communication (not domain model)
• User Model: model of appraisal
• Media/Presentation: how presentation affects argument comprehension
Lesson from GenIE
• NLG Pipeline = global control + sentence planning/realization
• can use existing surface realizers, standard domain ontology, and lexical resources
• Main cost has been KR&R modules; mainly empirical work:
  • Goal: find non-domain-specific principles/guidelines to optimize lay audience’s comprehension of arguments
  • Corpus studies: very useful but not sufficient
  • Controlled studies: necessary, and cannot afford to wait for other disciplines (HCI, learning sciences, etc.) to do them for us
GenIE Corpus Studies
• Intercoder reliability of content annotation scheme: used to justify domain communication model
• Argumentation schemes (non-domain-specific, both normative and affective)
• Stylistic (lexical/syntactic) features of author perspective
• Argument presentation features (order, cue words, explicitness)
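The slides do not say which agreement statistic was used to assess intercoder reliability. As a minimal sketch, assuming two coders and categorical labels, Cohen's kappa is one common choice: it corrects the observed agreement for the agreement expected by chance. The function name, the label names, and the six sample annotations below are hypothetical and serve only as illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label sequences")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both coders.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: computed from each coder's own label distribution.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical annotations of six text segments (invented for illustration).
coder_1 = ["Claim", "Data", "Data", "Claim", "Data", "Claim"]
coder_2 = ["Claim", "Data", "Claim", "Claim", "Data", "Data"]
kappa = cohen_kappa(coder_1, coder_2)
print(f"kappa = {kappa:.3f}")
```

Here the coders agree on 4 of 6 items, and chance agreement is 0.5, so kappa lands well below the raw agreement rate. This is why raw percent agreement alone is a weak justification for an annotation scheme.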
GenIE Controlled Studies
• How multimedia layout, cross-media cue words affect comprehension
• How argument presentation (explicit vs. implied claim, cue words) affects recognition of argument components (Claim vs. Data) & dependence of final claim on intermediate claims
STEC Input/Output Problem
Different input representations needed for different types of output; e.g. compare requirements for:
• Fixed-format text (original scope of NLG)
• Task-appropriate, user-friendly text format (e.g. line length, paragraphing, headings, font)
• Text and (reported or quoted) dialogue in story
• Dialogue spoken by animated emoting conversational agent
• Integrated text and images or data graphics
• Text referring to physical or visual properties of presentation (‘The red line in Fig. 2 shows sales in 2002.’)
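The input/output problem above can be made concrete with a small sketch. All class and field names here are hypothetical and do not come from any proposed STEC specification. The point is only that each richer output type needs input fields that a fixed-format text representation has no place for.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List


@dataclass
class FixedFormatInput:
    """Hypothetical input for fixed-format text: content only."""
    propositions: List[str]


@dataclass
class FormattedTextInput(FixedFormatInput):
    """Adds layout decisions (line length, headings) to the content."""
    line_length: int = 72
    headings: List[str] = field(default_factory=list)


@dataclass
class MultimodalInput(FixedFormatInput):
    """Adds links from content items to visual properties of a graphic."""
    figure_refs: Dict[str, str] = field(default_factory=dict)


def extra_requirements(base, richer):
    """Names of the fields that `richer` needs and `base` cannot express."""
    base_names = {f.name for f in fields(base)}
    return sorted(f.name for f in fields(richer) if f.name not in base_names)


print(extra_requirements(FixedFormatInput, FormattedTextInput))
print(extra_requirements(FixedFormatInput, MultimodalInput))
```

A shared task that fixes `FixedFormatInput` as its input would, under these assumptions, leave no way to evaluate a system that must generate a sentence such as ‘The red line in Fig. 2 shows sales in 2002.’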
Big Challenges
Empirical research to test computation-oriented, general theories, principles, guidelines to answer:
• What makes a “text” (i.e. including spoken dialogue, MMPs, etc.)
  • Coherent? In story dialogue, believable?
  • User-friendly? Task-appropriate?
  • Comprehensible? Pedagogically effective?
  • Entertaining (suspenseful, funny, etc.)?
Ex. Challenges (cont.)
• How does channel change answer?
  • E.g. HCI research: cannot assume findings for paper apply to computer screen
• How does length change answer?
  • E.g. learning sciences: 300-word summary vs. 3-page science argument for middle school
• How do individual differences matter?
  • E.g. cognitive impairments, affect
Conclusions
• Need some NLG research with massively interdisciplinary view: cognitive science, communication studies, etc.
• Need some NLG research motivated by search for answers to general questions such as above
• Will STEC approach effectively kill the above kind of NLG research?