Papers Presented Today
• Automating the Generation of Coordinated Multimedia Explanations (S. K. Feiner and K. R. McKeown)
  – COMET: content selection, media selection, media generation, coordinated layout.
• Plan-based Integration of Natural Language and Graphics Generation (W. Wahlster et al.)
  – WIP: generalization of text-linguistic notions and relations to multimedia presentation.
• Automated Generation of Intent-Based 3D Illustrations (D. D. Seligmann and S. Feiner)
  – IBIS: generation of 3-D illustrations.
• Presentation Design Using an Integrated Knowledge Base (Y. Arens et al.)
  – Integrated Interfaces: dynamic construction of multimedia displays using rules.
• The Knowledge Underlying Multimedia Presentations (Y. Arens et al.)
  – Allocation of information to particular media.
Ariffin Yahaya 3
Introduction
• Typical steps in multimedia presentations:
  – Determination of communicative intent.
  – Selection of content from a base of knowledge.
  – Grouping/structuring and ordering.
  – Allocation to particular media.
  – Layout.
Automating the Generation of Coordinated Multimedia Explanations
• Pictures and language complement each other to enable highly effective communication.
  – First-generation authoring facilities
• BUT how does one go about putting together a system that does this dynamically?
  – COMET (COordinated Multimedia Explanation Testbed) was created to overcome these problems.
First generation authoring facilities
• Basic facilities
  – Create presentations
    • Text, graphics, animation, video
• Problems
  – Requires skills
    • Medium conventions (i.e., what people expect)
    • Coherent mix of media.
  – Must be authored in advance
    • Limits the presentation to a known audience set.
Figure 1 First Generation Authoring
COMET
• Goal:
  – Coordinated, interactive generation of explanations that combine text and graphics, all generated by the system on the fly.
• Example:
  – How to repair a radio receiver/transmitter
    • Select symptoms from a menu.
    • System consults a rule database.
    • System can request the user to perform actions.
    • System explains actions step by step using graphics and text.
COMET is used in radio repair
Step 1: Remove the holding battery cover plate, highlighted in the right picture: Loosen the captive screws and pull the holding battery cover plate off the radio.
Remove the old holding Battery. Step 1 of 2.
Figure 2 Sample of COMET output.
COMET Overview
• Basic components
  – Knowledge source
    • Contains databases for information used in all the components.
  – Content planner
    • “Answers” the user’s request
  – Media coordinator
    • Associates “answers” with the best method of presentation.
  – Presentation generator
    • Text generator.
    • Graphics generator.
    • Media layout
    • Render & typeset
Figure 3: COMET Architecture & State Diagram (states: idle; user request; expert system determines the explanation’s content, i.e. “what” to say; coordinate explanation to medium, i.e. “how” to say it; presentation generated. Components: Content Planner, Media Coordinator, Presentation Generator, Knowledge Source. Knowledge sources: previous discourse, user model, diagnostic rule base, static objects, geometric knowledge, domain knowledge.)
Knowledge Source
• Static representation of the objects and actions.
  – Loom language.
    • Declarative knowledge in Loom consists of definitions, rules, facts, and default rules.
    • A deductive engine called a classifier uses forward chaining, semantic unification, and object-oriented truth maintenance to compile the declarative knowledge into a network designed to efficiently support online deductive query processing.
• Diagnostic rule base
  – Rules pertaining to the application.
• Detailed geometric knowledge base for graphics generation
Content Planner
• Produces the full content for the explanations.
  – Represented as a hierarchy of logical forms.
    • Logical forms (LFs) are required as input to the next stage, which uses a functional unification grammar (FUF)*.
  – Text plans or schemas
    • “Blackboard” capability.
      – Intermediate results can be stored so that other results can be inspected before committing.
  – Previous discourse
* FUF is not covered in this presentation.
Media Coordinator
• Fine-grained analysis of an input LF to decide whether each portion should be realized in images, text, or both.
• Uses a FUF grammar that maps 6 different types of information to types of media.
• Passes the output to the presentation generators.
  – Graphics generator.
  – Text generator.
6 Different Types of Information
• Location attributes.
  – Graphics alone
• Physical attributes.
  – Graphics alone
• Abstract actions.
  – Text alone
• Expressive connectives that indicate relationships among actions.
  – Text alone
• Simple actions.
  – Both text and graphics.
• Compound actions.
  – Both text and graphics.
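The allocation of the six information types to media can be sketched as a simple lookup. This is only an illustration: COMET expresses the mapping in a FUF grammar, whereas the sketch below swaps in a plain Python dictionary, and the key strings, the LF layout, and the `annotate` helper are all hypothetical names chosen for the example.

```python
from enum import Enum


class Medium(Enum):
    TEXT = "text"
    GRAPHICS = "graphics"


# Allocation table copied from the slide. The key strings are hypothetical
# labels, not identifiers from COMET itself.
MEDIA_FOR_INFO_TYPE = {
    "location_attribute": {Medium.GRAPHICS},
    "physical_attribute": {Medium.GRAPHICS},
    "abstract_action": {Medium.TEXT},
    "connective": {Medium.TEXT},
    "simple_action": {Medium.TEXT, Medium.GRAPHICS},
    "compound_action": {Medium.TEXT, Medium.GRAPHICS},
}


def annotate(logical_form):
    """Return a copy of each LF portion with the media that should realize it."""
    annotated = []
    for portion in logical_form:
        media = MEDIA_FOR_INFO_TYPE[portion["info_type"]]
        annotated.append({**portion, "media": sorted(m.value for m in media)})
    return annotated


# Hypothetical LF for one step of the radio repair example.
lf = [
    {"info_type": "simple_action", "content": "loosen the captive screws"},
    {"info_type": "location_attribute", "content": "holding battery cover plate"},
    {"info_type": "connective", "content": "and then"},
]
annotated_lf = annotate(lf)
```

Each portion of the LF leaves the coordinator carrying its own media annotation, which is what lets the text and graphics generators pick out the parts they are responsible for.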
Text Generator
• Determines the type and number of sentences needed.
• Lexical chooser.
  – LF actions → verbs.
  – LF objects → nouns.
  – Chooses words based on multiple constraints
    • Wider variety and more appropriate output. Examples:
      – Previous discourse: install → reinstall
      – Use words the user knows: technical term → explain procedure
• Sentence generator.
  – Constructs the syntactic structure.
  – Linearizes the resulting tree as a sentence.
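A rough sketch of how a lexical chooser might apply the two constraints named on the slide. The function names, the discourse representation, and the rule "an install that follows a removal of the same object becomes a reinstall" are assumptions made for illustration, not COMET's actual rules; the paraphrase string is likewise invented.

```python
def choose_verb(action, obj, previous_discourse):
    """Pick a verb for an LF action, taking the previous discourse into account.

    previous_discourse is a list of (action, object) pairs already explained.
    Assumed rule: installing something that was removed earlier is a "reinstall".
    """
    if action == "install" and ("remove", obj) in previous_discourse:
        return "reinstall"
    return action


def choose_noun(term, known_terms, paraphrases):
    """Use the technical term only if the (hypothetical) user model knows it."""
    if term in known_terms:
        return term
    # Fall back to a paraphrase, or to the term itself if none is available.
    return paraphrases.get(term, term)


history = [("remove", "holding battery")]
verb = choose_verb("install", "holding battery", history)
noun = choose_noun(
    "holding battery",
    known_terms=set(),
    paraphrases={"holding battery": "small backup battery"},  # invented paraphrase
)
```

The same LF therefore yields different wording depending on what was said before and on what the user is assumed to know.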
Graphics Generator
• Uses IBIS (Intent-Based Illustration System) from the paper “Automated Generation of Intent-Based 3D Illustrations”.
• That paper is also covered in this presentation, so we will revisit IBIS later.
• Suffice it to say that the graphics generator takes in annotated LFs and outputs graphics.
Media Coordination (1)
• Common content description language.
  – Text and graphics influence each other.
    • Generators display “cohesive” presentations.
  – Communicative goal separated from resources.
    • LFs only specify goals and what is needed to achieve them; the generators define the resources.
  – Provides a mechanism for the text and graphics generators to communicate.
    • The content description (blackboard) is used to coordinate internal text structures with pictures.
Media Coordination (2)
• Bidirectional interaction
  – Certain types of coordination between media can only be provided by incorporating interacting constraints between text and graphics.
• Coordinating sentence breaks with picture breaks.
  – Both the text and graphics generators annotate their current progress and can refer to each other's progress and decisions so that each can compensate.
• Cross-referencing text and graphics.
  – The text generator queries an IBIS database that is indexed by LF so that it can cross-reference specific objects and refer to their graphical locations within the generated text.
COMET
Figure 4 COMET architecture from the Paper.
Automated Generation of Intent-Based 3D Illustrations
• Technology lets us present a message exactly, but many people viewing the same presentation will not necessarily arrive at the same interpretation.
• The intention of the author and the viewing context (i.e., who is viewing) must be taken into consideration.
• SO, how do we make sure that the presentation is appropriate?
  – IBIS (Intent-Based Illustration System)
IBIS
• Goals:
  – Automate the creation of illustrations based on a specific communicative intent.
    • Used by COMET
• Method
  – Formalize the intent and create an illustration that fulfills the goals set by the intent.
• Example:
  – Communicative intent: Show a dice
    • Generated illustration: Whole dice is shown
  – Communicative intent: Show the weight in loaded dice
    • Generated illustration: Transparent dice with weight visible.
Communicative Intent Makes a Difference!
• Intent: Show the dice.
• Intent: Show the location of weights in loaded dice.
Figure 5 Sample of IBIS output (the illustration carries a “weight” label)
Generate and Test
• Every time IBIS makes a stylistic choice, it also associates with it a set of criteria against which the results are compared.
  – System of ratings (criteria).
  – Thresholds (minimum degree of success).
• Stylistic choices are associated with methods, and as each method is “tried”, the results are tested against the criteria.
• If the criteria are not fulfilled, a new stylistic choice with a new method is requested.
Figure 6 IBIS Generate and Test Cycles
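The generate-and-test cycle can be sketched as a loop over candidate methods, each rated against a threshold. This is a minimal illustration rather than IBIS code (IBIS itself is written in C++ and CLIPS); the function names, the dictionary used as a stand-in for an illustration, and the 0.5 threshold are all assumptions made for the example.

```python
from typing import Callable, Optional, Sequence


def generate_and_test(
    methods: Sequence[Callable[[], dict]],
    evaluate: Callable[[dict], float],
    threshold: float,
) -> Optional[dict]:
    """Try each method in order; return the first result that meets the threshold."""
    for method in methods:
        candidate = method()                   # generate
        if evaluate(candidate) >= threshold:   # test against the criterion
            return candidate
    return None  # every stylistic choice was rated below the threshold


# Hypothetical methods for the intent "show the weight in the loaded dice".
def draw_opaque() -> dict:
    return {"style": "opaque", "visible": {"dice"}}


def draw_transparent() -> dict:
    return {"style": "transparent", "visible": {"dice", "weight"}}


def weight_visibility(illustration: dict) -> float:
    """Hypothetical evaluator: rating is 1.0 when the weight can be seen."""
    return 1.0 if "weight" in illustration["visible"] else 0.0


result = generate_and_test(
    [draw_opaque, draw_transparent], weight_visibility, threshold=0.5
)
```

With these stand-ins the opaque rendering is tried first and rejected because the weight is hidden, so the loop moves on to the transparent rendering, mirroring the loaded-dice example.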
IBIS Overview
• Basic components
  – Communicative goals
    • Tightly coupled with COMET.
  – Generate & test cycle
    • Method used by IBIS to evaluate appropriateness.
  – Illustrator
    • Maps intent to stylistic choice with “Design Rules”.
  – Knowledgebase
    • Superset of graphics info.
  – Drafter
    • Maps stylistic choice to visual effect with “Style Rules”.
Figure 7: IBIS Architecture & State Diagram (labels: communicative goals; idle; Illustrator, which matches goals to design rules; style strategies; Drafter, which matches design rules to style rules; illustration; “ineffective” transitions forming the generate & test cycle. Knowledgebase: abstract properties, physical properties, features, material information, geometric information.)
Communicative Goals
• Location
  – Show the location of an object in a context.
• Relative location
  – Show the relative location of 2 or more objects in terms of a specified/derived context.
• Property
  – Show an object's physical properties: material, color, size, or shape.
• State*
  – Show an object’s state.
• Change*
  – Show the difference between a set of states.
* State and change may be further qualified by concepts that refer to how the object is manipulated or has changed.
Illustrator
• Designs illustrations.
  – Maps communicative goals to style strategies with design methods.
  – Evaluates the success of communicative goals with design evaluators.
• An illustrator may split jobs according to need and assign them to subordinate illustrators.
• Illustrators share a database of design rules.
Design Rules
• Describe at a high level how illustrations should be put together.
  – Communicative goal
  – Set of style strategies.
    • Visual effect
    • Style rules.
• Design methods specify how to accomplish communicative goals.
  – Specify which style strategies must be achieved.
• Design evaluators determine how well communicative goals have been accomplished.
  – Achievement ratings of a collection of style strategies.
Knowledgebase
• Concerned with the physical objects to be illustrated.
• Superset of typical graphics databases
  – Geometric information
  – Material information
• Also includes
  – Object’s features
    • What are the object’s capabilities?
  – Physical properties
    • How does an object move (e.g., a hinge)?
  – Abstract properties
    • How things fit together (?)
Drafter
• Knows nothing about communicative intent.
• Translates the illustrator's plans into reality.
• Tightly coupled with the hardware it uses.
• Shares a database of style rules.
• Reports back to the illustrators with the achievement ratings of the various style strategies it implements.
• Renders the illustrations.
Style Rules
• 2 types of style rules, which specify either
  – Style methods
    • Accomplish visual effects specified by style strategies.
    • Illustration methods.
  – Style evaluators
    • Determine the success of style strategies in a given illustration.
    • Illustration evaluators.
Illustrative Style
• Represented by an ordering of the rules such that the preferred methods are always attempted first.
• Illustrators and drafters can be specified with different illustrative styles.
• Illustrations can combine different illustrative styles.
Other IBIS Features
• Interactive illustrations
  – The user can change view specifications, with IBIS continuously monitoring to make sure that the communicative goal is maintained.
• Written in C++ and CLIPS.
Intent based approach to Authoring
Figure 8 Typical intent based authoring architecture.
Plan-based integration of natural language and graphics generation.
• Multimodal presentations should be generated from a common representation of what is to be conveyed.
• BUT
  – How do we decompose the communicative goal into subgoals?
  – How do we integrate multiple AI components to create the presentation?
• The WIP multimodal presentation system was created as a prototype to solve these problems.
  – Computer as a “desktop publisher”.
WIP Design Goals
• Generate coordinated multimodal presentations from a common representation.
  – What should be in text / graphics?
  – Which kinds of links (verbal / non-verbal) are necessary?
• Adaptation of these presentations to the intended audience and situation.
  – All presentation decisions are postponed until runtime.
• Incrementality of all processes constituting the design and realization of the multimodal output.
  – Computations for an object are performed not long before the object is output.
WIP
• Goal:
  – Allows the generation of alternative presentations of the same content, taking into account contextual factors such as the user’s degree of expertise and preferences for a particular output medium or mode.
  – Specify information once, but view in infinite ways.
• Example:
  – How to use an espresso machine.
  – How to assemble a lawnmower.
  – How to install a modem.
• Inputs
  – Stereotypes, target language, layout format, and output modes.
Interleaved Content Planning
• Processing is done non-linearly.
• Cascades are used, driven by task/result queues.
• Cascade:
  – Presentation planner and layout manager
  – Design module
  – Realization module
• Purpose:
  – Leave presentation decisions to the last possible moment to refine the presentation.
WIP Overview
• Basic components
  – Presentation planner
    • Decides on content and mode combination
  – Layout manager
    • Screen/output manager
    • Last step before rendering the presentation
  – Text & graphics cascade
    • Micro-planner.
  – Application knowledge
    • Application-specific data.
  – Knowledgebase
    • Used internally.
Figure 9: WIP Architecture & State Diagram (states: idle; determine contents and mode; determine layout and generate. Inputs: presentation goals, generation parameters. Components: Presentation Planner and Layout Manager, connected by a bus / queue to the incremental design cascade of Text Design, Text Realization, Graphics Design, and Graphics Realization. Knowledgebase: presentation strategies, revision strategies, selection rules, user model, basic ontology, TAG, Graphics D. Strategies, application knowledge.)
Presentation Planner
• Tries to find a presentation strategy whose effect (or header) matches the presentation goal.
• Keeps revising the plan until some basic elements of the presentation are formed.
• Elements are sent to the task queue.
• Design modules take tasks from the queue and begin processing.
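The planner's behaviour can be sketched as goal expansion that pushes elementary acts onto a task queue. Everything named here is hypothetical: the `Strategy` class, the strategy headers, and the "mode:content" encoding are invented for the example and are not WIP's representation. The sketch also runs planning to completion before draining the queue, whereas WIP's design goal is for planning and design to overlap incrementally.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    header: str   # the effect the strategy achieves
    body: tuple   # subgoals, or elementary acts written as "mode:content"


# Hypothetical strategies for the espresso machine example.
STRATEGIES = [
    Strategy("explain-fill-water", ("show-lid", "instruct-pour")),
    Strategy("show-lid", ("graphics:depict lid",)),
    Strategy("instruct-pour", ("text:Pour in cold water.",)),
]


def plan(goal, strategies, task_queue):
    """Expand a goal depth-first, queueing elementary acts as they are reached."""
    strategy = next((s for s in strategies if s.header == goal), None)
    if strategy is None:
        raise LookupError(f"no strategy matches goal {goal!r}")
    for step in strategy.body:
        if ":" in step:
            mode, content = step.split(":", 1)
            task_queue.append((mode, content))  # elementary act: hand to design
        else:
            plan(step, strategies, task_queue)  # subgoal: keep refining


def run_design_modules(task_queue):
    """Stand-in for the design modules: take tasks off the queue in order."""
    outputs = []
    while task_queue:
        mode, content = task_queue.popleft()
        outputs.append(f"[{mode}] {content}")
    return outputs


queue = deque()
plan("explain-fill-water", STRATEGIES, queue)
results = run_design_modules(queue)
```

Matching on `header` corresponds to finding a strategy whose effect matches the presentation goal; the queue decouples the planner from the text and graphics design modules that consume its output.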
Layout Manager
• In charge of the screen real estate.
• Positions design components.
• Interacts with the realization module.
Text & Graphics Cascade
• Design module
  – Elementary speech/pictorial acts.
  – What to say (intent) micro-planner
    • Which view of the espresso machine.
    • What is the micro-message (instruction).
• Realization module
  – How to say it micro-planner
    • Natural language
    • Geometric shapes
Application Knowledge
• Externally coded in RAT
• Main source of knowledge
  – Domain terminology
• Used in
  – Presentation planner
  – Generation of text
  – Generation of graphics
Knowledgebase
• Application knowledge
  – Unique domain knowledge
• Strategies
  – Used to design/revise
    • Presentation
    • Graphics
• User model
  – Matches the generation parameters.
Comparison to COMET
• WIP
  – Operator-based approach to planning.
  – Supports incrementality.
  – Bidirectional communication between presentation planner and layout manager.
• COMET
  – Schema-based content planner.
  – No incrementality.
  – Layout component combines text and graphics fragments during the final steps.
Coming up next: Anders with the second half of this talk.
Figure 10 Ariffin is smiling ‘cos he’s done!