Upload
lamnhu
View
226
Download
2
Embed Size (px)
Citation preview
Artificial Intelligence Adv., April 8th, 2015
Copyright © 2015, Toyoaki Nishida, Atsushi Nakazawa, Yoshimasa Ohmoto, Yasser Mohammad, At ,Inc. All Rights Reserved.
1. Introduction‐‐ Communicative Intelligence ‐‐
Toyoaki NishidaKyoto University
Year AI ICT
1940~ 1936: Turing Machine, 1947: von Neumann Computer, 1948: Information Theory, by C. Shannon and W. Weaver, 1948: Cybernetics by Wiener
1950~ 1952‐62: Checker program by A.Samuel1956: Dartmouth Conference 1957: FORTRAN by J.Backus
1960~ 1961: Symbolic Integration program SAINT by J.Slagle1962: Perceptron by F.Rosenblatt1966: The ALPAC report against Machine Translation by R. Pierce1967: Formula Manipulation System Macsyma by J.Moses1967: Dendral for Mass Spectrum Analysis by E.Feigenbaum
1961: Mathematical theory of Packet Networks by L. Kleinrock1963: Interactive Computer Graphics by I.Sutherland
1968: Mouse and Bitmap display for oN Line System (NLS) by D.C.Engelbart1969: ARPA‐net
1970~ 1971: Natural Language Dialogue System SHRDLU, by T.Winograd1973: Combinatorial Explosion problem pointed out in The Lighthill report1974: MYCIN by T.ShortliffeMid 1970’s: Prial Sketch and Visual Perceptron by D.Marr1976: Automated Mathematician (AM) by D.Lenat1979: Autonomous Vehicle Stanford Cart by H.Moravec
1970: ALOHAnet1970: Relational Database Theory by E.F.Codd1972: Theory of NP‐completeness by S.Cook and R.KarpMid 1970’s: Alto Machine by A.Kay and A.Goldberg1976: Ethernet1979: Spreadsheet Program Visicalc by D.Bricklin
1980~ 1982: Fifth Generation Computer Project1984: The CYC Project by D.LenatMid 1980’s: Back‐propagation algorithm was widely used1985: the Cybernetic Artist Aaron by H.Cohen1986: Subsumption Architecture by R.Brooks1989: An Autonomous Vehicle ALVINN by D.Pomerleau
1982:TCP/IP Protocol by B.Kahn and V.CerfMid 1980’s: First Wireless Tag Products1987: UUNET started the Commercial UUCP Network Connection Service1988: Internet worm (Morris Worm)1989: World Wide Web by T.Berners‐Lee1989: The number of hosts on the Internet has exceeded 100,000.
1990~ 1990: Genetic Programming by J.R.KozaEarly1990’s: TD‐Gammon by G.TesauroMid 1990’s: Data Mining Technology1997: DeepBlue defeated the World Chess Champion G.Kasparov1997: The First Robocup by H.Kitano1999: Robot pets became commercially available
1992: The number of hosts on the Internet has exceeded 1,000,000.1994: Shopping malls on the Internet1994: W3C was founded by T. Berners‐Lee1997: Google Search1998: XML1.0(eXtensible Markup Language) by W3C1998: PayPal
2000~ 2000: Honda Asimo
2004: The Mars Exploration Rovers (Spirit & Opportunity)
2001: Wikipedia.2003: Skype / iTunes store2004: Facebook2005: YouTube / Google Earth2006: Twitter2007: Google Street View
2010~ 2010: Google Driverless Car / Kinect2011: IBM Watson Jeopardy defeated two of the greatest champions2012: Siri
History of AI research in contrast with ICT
Year AI ICT
1940~ 1936: Turing Machine, 1947: von Neumann Computer, 1948: Information Theory, by C. Shannon and W. Weaver, 1948: Cybernetics by Wiener
1950~ 1952‐62: Checker program by A.Samuel1956: Dartmouth Conference 1957: FORTRAN by J.Backus
1960~ 1961: Symbolic Integration program SAINT by J.Slagle1962: Perceptron by F.Rosenblatt1966: The ALPAC report against Machine Translation by R. Pierce1967: Formula Manipulation System Macsyma by J.Moses1967: Dendral for Mass Spectrum Analysis by E.Feigenbaum
1961: Mathematical theory of Packet Networks by L. Kleinrock1963: Interactive Computer Graphics by I.Sutherland
1968: Mouse and Bitmap display for oN Line System (NLS) by D.C.Engelbart1969: ARPA‐net
1970~ 1971: Natural Language Dialogue System SHRDLU, by T.Winograd1973: Combinatorial Explosion problem pointed out in The Lighthill report1974: MYCIN by T.ShortliffeMid 1970’s: Prial Sketch and Visual Perceptron by D.Marr1976: Automated Mathematician (AM) by D.Lenat1979: Autonomous Vehicle Stanford Cart by H.Moravec
1970: ALOHAnet1970: Relational Database Theory by E.F.Codd1972: Theory of NP‐completeness by S.Cook and R.KarpMid 1970’s: Alto Machine by A.Kay and A.Goldberg1976: Ethernet1979: Spreadsheet Program Visicalc by D.Bricklin
1980~ 1982: Fifth Generation Computer Project1984: The CYC Project by D.LenatMid 1980’s: Back‐propagation algorithm was widely used1985: the Cybernetic Artist Aaron by H.Cohen1986: Subsumption Architecture by R.Brooks1989: An Autonomous Vehicle ALVINN by D.Pomerleau
1982:TCP/IP Protocol by B.Kahn and V.CerfMid 1980’s: First Wireless Tag Products1987: UUNET started the Commercial UUCP Network Connection Service1988: Internet worm (Morris Worm)1989: World Wide Web by T.Berners‐Lee1989: The number of hosts on the Internet has exceeded 100,000.
1990~ 1990: Genetic Programming by J.R.KozaEarly1990’s: TD‐Gammon by G.TesauroMid 1990’s: Data Mining Technology1997: DeepBlue defeated the World Chess Champion G.Kasparov1997: The First Robocup by H.Kitano1999: Robot pets became commercially available
1992: The number of hosts on the Internet has exceeded 1,000,000.1994: Shopping malls on the Internet1994: W3C was founded by T. Berners‐Lee1997: Google Search1998: XML1.0(eXtensible Markup Language) by W3C1998: PayPal
2000~ 2000: Honda Asimo
2004: The Mars Exploration Rovers (Spirit & Opportunity)
2001: Wikipedia.2003: Skype / iTunes store2004: Facebook2005: YouTube / Google Earth2006: Twitter2007: Google Street View
2010~ 2010: Google Driverless Car / Kinect2011: IBM Watson Jeopardy defeated two of the greatest champions2012: Siri
History of AI research in contrast with ICT
Exponential Growth‐Moore’s Law‐ Expanding Internet World
Universal Turing Machine‐ from Hardware to Software
Generic Computing Engines‐ from Software to Data
Service Copying‐Machine Learning / Data Mining
Symbolic Computing‐ Knowledge‐based Systems
Heuristics‐ Intelligence as Clever Computing
Embodied & Network Intelligence‐ Like human and society
Communicative Intelligence for Bridging People and SI
[Nishida‐Nakazawa‐Ohmoto‐Mohammad 2014]
Communicative Intelligence
Super Intelligence People
Challenge: A robot that can participate in conversation
Long-term goal
Eye gaze
Hand gesture PostureFacial expression
AskingNegotiating
Proposing
ConvivialitySocial networksTrust
Conversation is a complex business
Application
Platform Evaluation
Content production Model building
Analysis
Theory
Measurement
Conversational interactions
Conversational Informatics
[Nishida‐Nakazawa‐Ohmoto‐Mohammad 2014]
Applied Intelligence Information Processing GroupCognitive DesignWe aim at designing the expressions, functions, control, and interactions of artifacts based on a physio‐cognitive approach.
Interaction Measurement, Analysis and Modeling We aim at uncovering principles of verbal and nonverbal interactions that people engage everyday as a part of intellectual activities and developing a suite of technologies to help people benefit from conversational interactions.
Augmented Conversation EnvironmentWe aim at building a smart environment that integrates conversation in physical and virtual spaces.
Conversation in the physical environment Conversation in the virtual environment
Learning by imitation
Designing high‐level cognitive interaction
Reconstructing 3D conversation scene
ICIE: Immersive Collaborative Interaction Environment
Multiparty conversation recorder
Eye tracker
Eye tracker (wearable)
Polygraph Optical motion capture systems
Cluster computer
IMADE: Interaction Measurement, Analysis, and Design Environment
SmartInFill
Annotation tool
Conversation corpus Virtualization of physical space
ICIE for the shared virtual world
[Lala 2013]https://www.youtube.com/watch?v=V‐9SKpcMrzk
[Nishida‐Nakazawa‐Ohmoto‐Mohammad 2014]
ICIE – immersive collaborative interaction environment
Immersive WOZ environment
Feedback generationMotion mappingUser motion sensingHead recognitionGesture recognitionFace modelHuman body model
WOZ operating environment
WOZ operatorTele-operated robot
The conversation place
[Ohmoto 2013a]
Projecting the real world into the virtual world
[Ohmoto 2011]http://www.youtube.com/watch?v=68UrJv65HvY
IMADE: Interaction Measurement, Analysis, and Design Environment
[Nishida‐Nakazawa‐Ohmoto‐Mohammad 2014]
iCorpusStudio
Collaborative annotation system
[Nishida‐Nakazawa‐Ohmoto‐Mohammad 2014]
[Yano 2012]
3D conversation capture – over the shoulder
Corneal Imaging CameraScene cameraEye camera
• Lightweight and versatile system • Appl.: Google Glass like HMD, unconstrained setups
Corneal Image Feature MatchingProblem:Local feature correspondence + RANSAC does not work due to large noise in eye images
Approach:1. Formulate problem as registration of 3D
spherical light maps of eye and scene image2. Single point algorithm for robust alignment
Non‐intrusive Eye Gaze Tracking (EGT) by corneal imaging
Eye images with GRP
Scene images with PoG
↑ Aligned results (from eye images)
Peripheral vision map overlaid to scene imagePeripheral vision map in eye image
Gaze Reflection Point Mapping
Application 1: Non‐intrusive and uncalibrated PoG estimation
Application 2: Peripheral vision estimation
Gaze trajectory in static scene image
Mouse(67.37pixel/mm)
66.3×37.3cm2560×1600pixel
Head mount eye camera
Eyeball display distance50cm
IR filter
InfraredLED
140 Lux at eyes
Display
Quantitative analysis of pupil dilations against the difficulty of the task
Estimate how Pupil dilations is related to the difficulty of the irritating maze game.
[Tange, Nakazawa 2015]
Pupil dilations in the irritating maze game
[Tange, Nakazawa 2015]
110
115
120
125
130
135
140
145
150
0 10000 20000 30000 40000 50000 60000 70000 80000 90000IN OUT
5pixel 10pixel10pixel
Expand before the narrow conduit
Delay before contraction
Elapse time(ms)
Pupil size
(pixel)
Pupil dilations in the irritating maze game
[Tange, Nakazawa 2015]
Tacit social signals
Intentions are dynamic and tacit
[Nishida‐Nakazawa‐Ohmoto‐Mohammad 2014]
Facilitative agent that can track dynamic and tacit intentions
The first half
The second half
Agent Control
Estimation of emphasizing points
Switch or not between divergent and convergent
Next proposal
Agent behavior
ResultsEvaluating estimation process
Results
Select next proposal
Next proposal
Agent behavior
Results
Select next proposal
Facilitative agent Estimation agent
Divergent
Social signals
Social signals
Estimation of emphasizing points
[Ohmoto 2013b]
Results of questionnaires
• Facilitative agent is significantly more preferred in all aspects we have investigated.
[Ohmoto 2013b]
Inducing intentional stance toward agent players
Q: How can we induce an intentional stance toward NPCs?
H: Presenting goal‐oriented trial‐and‐error process with multimodal behaviors.
[Ohmoto 2015]
Inducing intentional stance toward agent players
[Ohmoto 2015]
Video analysis
Questionnaire analysis
Trial‐and‐error agent:Presents goal‐oriented trial‐and‐error process by multimodal behavior (e.g., hand gesture, body orientation, etc that only ambiguously display mental states).
Text display agent:Presents agent’s behavioral intention by unambiguous text.
# of comm. acts
score
Intentional stance
Design stance
Inducing intentional stance toward agent players
Q: How can we induce an intentional stance toward NPCs?H: Demonstrating strategy change.
[Suyama, Ohmoto 2015]
Player #1 in ICIE #1
Red hat
Player #2 in ICIE #2
Green hat
Player #3 (agent)
Blue hat
PurposeEvaluating the physiological sensing method for estimating user’s concentration and appropriate advice giving during exercise.
MethodVR exercise game as the task.Using SCR and LF/HF.Controlling the condition of advice giving.
Findings
Estimating user’s concentration during exercise
Deliberating but not reacting Deliberating and reacting
SCR
LF/HF
+
+
‐
‐
Hide from the chaser
Not concentrated
Hide in the place the chaser checked previously.
Simply moving aroundadvice
No advice
Learning by imitation
[Mohammad 2009]
Constrained Motif Discovery: • Given a time series X(t)
find recurring patterns of length between L1 and L2 using distance function D subject to the constraint P(t), where P(t) is an estimation of the probability that a motif occurrence exists near time step t.
[Mohammad 2009]
P(t)
unlikely
likely
Change point discovery
Learning by imitation
Future
Change angle
GH
Past Futuret
;...; 1H t seq t n seq t 1 ;...;G t seq t seq t n
1
1
1
ˆ
f
f
Ti i i
l
i ii
l
ii
s t t t
csx
c
t
TtVtStUtH )()()()( Find optimal lP
ggT uutGtG )()(Find optimal lF
11and,)( jjjFg
ii liut
fTll
Tll
i litUUtUUt ,)()()(
)()()()()(ˆ)(~ tttttxtx PFPF
Learning by imitation
Robust Singular Spectrum Transform
[Mohammad 2009]
Learning by imitation
[Mohammad 2009]
Fluid Learning
[Mohammad 2015]
Synthetic Evidential Study
Synthetic evidential study (SES) combines dramatic role play and group discussion to help people spin stories by bringing together partial thoughts and evidence.
Componentize
Reuse
SES session Interpretation archive
Structured collection of {story, background, critique}Agent Play
Dramatic role play
Group discussions
At the beginning of the 18th century, a feudal lord named Asano Takumi-no-kami Naganori was in charge of a reception for envoys from the Imperial Court in Kyoto. Another feudal lord, Kira Kozuke-no-suke Yoshinaka, was appointed to instruct Asano in the ceremonies. On the day of the reception, while Kira was talking with Yoriteru Kajikawa, a lesser official, at “Matsu no Roka” (“Hallway of Pine Trees”) in Edo Castle, Asano came up to them screaming “This is for revenge!!” and slashed Kira twice with a short sword. Soon after the incident, Kajikawa restrained Asano, who was then imprisoned. The reason for the attack was not known, though it was widely believed that Kira had somehow humiliated Asano. Ultimately Asano was sentenced to commit seppuku, a ritual suicide, but Kira went without punishment.
Hallway of Pine Trees (from Chushingura)
Kira Kozuke-no-suke Yoshinaka
Asano Takumi-no-kami Naganori
Yoriteru Kajikawa
Why was it possible?
How did it happen?
What did each think?
Dramatic Role Play
Group play capture
Agent play
Discussion phase
Asano
Kira
Kajikawa
Third person view First person view
Discussions
History of conversational systems development
t1990 2000 201019801970
Natural language dialogue systems
Speech dialogue systems
Multi‐modal dialogue systems
Embodied Conversational Agents / Intelligent Virtual Human
Story Understanding systems
Conversational Systems
Transactional systems
Interactional systems
Affective Computing
Cognitive systems
Natural language question answering systems
The Knowledge Navigator
Early Natural Language Dialogue Systems
[Green 1961]
Spec[ification] list:
“Where did the Red Sox play on July 7?”‐> Place = ?
Team = Red SoxMonth = JulyDay = 7
“What teams won 10 games in July?”‐> Team(winning) = ?
Game(number of) = 10Month = July
Dictionary:“team” ‐> meaning: Team=(blank)“Red Sox” ‐> meaning: Team=Red Sox“who” ‐> meaning: Team=?“winning” ‐> meaning: subroutine
Data:Month = July
Place= BostonDay = 7Game Serial No. = 96(Team = Red Sox, Score= 5)(Team = Yankees, Score = 3)
Baseball
• One of the earliest natural language question answering system.
• Answers such questions about baseball games.
Early Natural Language Dialogue Systems
[Weizenbaum 1966]
The syntax routine determines whether the verb is active or passive and locates its subject and object.
The syntax analysis checks to see if any of the words is marked as a question word. If not, a signal is set to indicate that the question requires a yes/no answer
4. Content AnalysisThe content analysis uses the dictionary meanings and the results of the syntactic analysis to set up a specification list for the processing program. E.g.,
“each team”: Team=(blank) ‐> Team=each“what team” :Team=(blank)‐> Team=?“Who beat the Yankees on July 4”: Team=(blank)‐> Team=?, Team(winning)=? Team(losing)=Yankees“six games” : Game=(blank)‐>Game(number of)=6“how many games”: Game=(blank)‐>Game(number of)=?“Who was the winning team…”: Team=? and Team(winning)=(blank) ‐> Team(winning)=?
5. ProcessingThe specification list indicates to the processor what part of the stored data is relevant for answering the input question. The processor extracts the matching information from the data and procedures, for the responder, the answer to the question in the form of a list structure.
Modules of Baseball
1. Question Read‐in2. Dictionary Look‐up3. Syntax(1) scan for ambiguities in part of speech: in some
cases, resolved by looking at adjoining words; in other cases, resolved by inspecting the entire question.
(2) locates and brackets the noun phrases, [], and the prepositional and adverbial phrases, (). The verb is left unbracketed. E.g.,
“How many games did the Yankees play in July?”‐> [How many games] did [the Yankees] play (in [July])?
Any unbracketed preposition is attached to the first noun phrase in the sentence, and prepositional brackets added. E.g.,
“Who did the Red Sox lose to on July 5?”‐> (To [who]) did [the Red Sox] lose (on [July 5])?
Early Natural Language Dialogue Systems
[Woods 1973]
LUNAR
Prototype natural language question answering system that helps lunar geologists access chemical analysis data on lunar rock and soil composition.
(i)Syntactic analysis using heuristic/semantic information to choose the most likely parsing (Augmented Transition Network Grammar was used)
(ii) Semantic interpretation: produce a formal representation for queries
(iii) Execution of this formal expression in the retrieval
Early Natural Language Dialogue Systems
[Woods 1973]
Augmented Transition Network Grammar
S:NP V NP
AUX V PP
NP:
DET NOUN
ADJ
PRON
PREP NPPP:
Early Natural Language Dialogue Systems
Winograd, Terry, Understanding Natural Language, New York: Academic Press, 1972. http://hci.stanford.edu/~winograd/shrdlu/
SHRDLU
• Natural language understanding system working on the “Blocks” world.
• Based on the belief that “a computer cannot deal with language unless it can understand the subject it is discussing”
• Answers questions, executes commands, and accepts information in normal English dialog.
• Uses semantic information and context to understand discourse and disambiguate sentences.
• Procedural knowledge representation <‐> declarative KR
Early Natural Language Dialogue Systems
Winograd, Terry, Understanding Natural Language, New York: Academic Press, 1972. http://hci.stanford.edu/~winograd/shrdlu/
USER: “Pick up a big red block”SHRDLU: “OK”
simulate robot arm
Syntactic analysis
Semantic analysis
Planning
Graphics
Questioning answering
SHRDLU
Early Natural Language Dialogue Systems
[Weizenbaum 1966]
Example:
“You are very helpful”‐> “I are very helpful”
by a simple substitution rule‐> “What makes you think I am very helpful”
by a decomposition template: (0 I are 0)by a reassembly rule:
(What makes you think I am 4)
After Ri,j was applied, an index in the transformation list is inserted to prevent the same reassembly rule from being applied in a row.
(How = (What)) allows the transformation rule for “how” to be equally applied to “what”
(MOTHER DLIST (/NOUN FAMILY))allows the keyword “MOTHER” to be identified as a noun and as a member of the class “family”. As a result, “MOTHER” will match “(/FAMILY)” in a decomposition rule.
Transformation rules: collection of key lists.
where, K: keyword, Di: decomposition template, Ri,j: Reassembly rule.
Example of a decomposition template and a reassembly rule:D: (0 YOU 0 ME) R: (WHAT MAKES YOU THINK I 3 YOU)
“It seems that you hate me”‐> “What makes you think I hate you”
Substitution rule. E.g.,
(YOURSELF = MYSELF)(MY YOUR 5 (transformation rules))
)))()()()((
))()()()((
))()()()(((
,2,1,
,22,21,22
,12,11,11
2
1
nmnnnn
m
m
RRRD
RRRD
RRRDK
ELIZA: Natural language dialogue system without explicit knowledge about the discourse domain
Early Speech Dialogue Systems
[Erman 1980]
The Hearsay‐II Speech‐Understanding System
Integrates multiple levels of information processing by knowledge sources coordinated by the blackboard model: • Parameter• Segment• Syllable• Word• Word‐sequence• Phrase• Data base interface
Combining top‐down (hypothesis driven) and bottom‐up (data‐driven) processing.
Selective attention to allocate limited computing resources.
[Erman 1980]
The blackboard architecture
| ARE | ANY | BY | FEIGENBAUM | AND | FELDMAN |
(g) Phrases
(f) Word sequences
(e) Words ‐ 1
(d) Words – 2
(c) Syllable classes
(b) Segments
(a) ParametersSEG
POM
MOW
WORD‐CTL
VERIFY
PARSE
SEMANT
WORD‐SEQ
VERIFY
WORD‐SEQ‐CTL
PREDICT STOP
CONTACT
RPOL
[Erman 1980]
The blackboard architecture
KS1
Level2
Levelk
Level1
Blackboard
Schedulingqueues
KSn
Focus‐of‐controlDatabase
BlackboardMonitor
Scheduler
…
…
… …Program modules Databases Data flow Control flow
Early Speech Dialogue Systems
PUT‐THAT‐THERE
The conjoint use of voice‐input and gesture recognition to command events on a large format raster‐scan graphic display in a “Media Room”
Commands
“Create a blue square there.”“Move the blue triangle to the right of the green square”“Move that to the right of the green square.” (with pointing)“Put that there” (indicated by gesture)“Make that smaller” (with pointing gesture)“Make that (indicating some item) like that (indicating some other item)”“Delete that” (pointing to some item)“Call that … the calendar” (with pointing)
[Bolt 1980]
“Phil”: concrete image of a personal assistant.
The Knowledge Navigator
[Apple Computer 1987]
Phil, the agent
CASA: Computers Are Social Actors
Social responses to computers
- not: conscious beliefs that computers are human or human‐like.
- not: from user’s ignorance or psychological or social dysfunctions
- not from: a belief that subjects are interacting with programmers
- rather: human‐computer relationship is fundamentally social.
Nass 1994]
- Support interactive give and takeAssistants will need not only respond to questions but also ask questions
- Recognize the costs of interaction and delayIt is inappropriate to require the user’s confirmation of every decision made while carrying out a task.
- Manage interruptions effectivelyWhen it is necessary to initiate an interaction with the user, the assistant needs to do so carefully, recognizing the likelihood that the user is already occupied to some degree.
- Acknowledge the social and emotional aspects of interactionTo become a comfortable working partner, a computer assistant will need to vary its behavior depending on such the task, the time of day, and the boss’s mood.
Requirements
=> The PERSONA project @ Microsoft Research[Ball 1997]
From Tool-based Computing to Assistive Interface
[Ball 1997]
The architecture of Peedy
Spoken Language Interpretation
Character Animation
Sound Generation
Background and Object Animation
Dialogue Management
MicrophoneWhisper Speech Recognition
NamesProper Name Substitution
NLPLanguage Analysis
SemanticTemplate Matching & Object Descriptions
DialogueContext & Conversation State
ApplicationCD Changes
Speech Controller
Player/ReActorAnimation Engine
Character
Sound
CD Player
[Ball 1997]
Names Database Action Templates Database
Object Database (CDs)
Dialogue Rules Database
Speech & Animation Database
The architecture of Peedy
stGotCDevOK / do
stSuggested
Peedy: “I have The Bonnie Raitt Collection, would you like to hear something from that?”
User: “Sure”
pePickTrack; genTracks; doSelect; Say <!PDSays(how About)>
pePickTrack: causes Peedy to look down at the CD (note) that he’s holding as if considering a choice.genTracks: expands the description of the current CD into a list of the songs it contains.doSelect: select one or two tracks, based on the parameters given in the interaction. Say: verbally offer the selected song with the appropriate beak‐sync.
A state transition machine model with 5 conversational states and 17 input events is used.
[Ball 1997]
Dialogue management
[Bates 1994]
Believable agents
provides the illusion of life, and permits the audience’s suspension of disbelief. … critical in theater, film, animation, radio drama, …
Questions
i. What makes characters believable? What makes agents believable? Are agents different from characters?
ii. What is the nature of current integrated architectures and situated agents? How well do they support believability? Are there clear areas demanding study?
iii. How much breadth of capability is necessary to produce a believable agent? How much depth (competence) is necessary?
iv. What are the respective role of movement and language in achieving believability?
v. What is the role of context in establishing expectations in the user and thus is simplifying the task? For instance, how can the setting of an interface agent or the theme of a story help limit the technical requirements on agents?
vi. How can we measure believability and progress toward believability? Why don’t artists use “scientifically valid” techniques to evaluate the believability of their characters?
Believability
[Mateas 1997]
Drama = Character + Story + Presentation
Interactive Drama:Interactive drama concerns itself with building dramatically interesting virtual worlds inhabited by computer‐controlled characters, within which the user (the player) experiences a story from a first person perspective. … (Bates 1992)
Oz project
[Hayes‐Roth 1998]
Jennifer James
Text
Multi-modal dialogue engine
Graphics
Sound
Cloud
[Mateas 1997]
Oz project
PersonalityRich personality should infuse everything that a character does. What makes characters interesting are their unique ways doing things.
EmotionCharacters exhibit their own emotions and respond to the emotions of others in personality‐specific ways.
Self‐motivationCharacters have their own internal drives and desires which they pursue whether or not others are interacting with them.
Change Characters grow and change with time, in a manner consistent with their personality.
Social relationshipsCharacters engage in interactions with others in a manner consistent with their relationship. In turn, these relationships change as a result of the interaction.
Illusion of lifePursuing multiple, simultaneous goals and actions, having broad capabilities, and reacting quickly to stimuli in the environment.
Rea
[Cassell 1999]
Implements the social, linguistic, and psychological conventions of conversation.
‐ Has a human‐like body. Uses eye gaze, body posture, hand gestures, and facial displays to organize and regulate the conversation.
‐ The conversational model relies on the function of non‐verbal behaviors as well as speech.
‐ A full symmetry between input and output modalities: not only respond to visual, audio and speech cues (such as speech, shifts in gaze, gesture, and non‐speech audio but also generate these cues.
Rea
[Cassell 1999]
Discourse Model
Knowledge Base Word timing
Language Tagging
Behavior Scheduling
Text Input
Animation
Behavior Suggestion
Behavior Selection
Generator Set Filter Set
Behavior Generation
Translator
BEAT (Behavior Expression Animation Toolkit)
[Cassell 2004]
"Mary socked John.""Mary punched John.""Mary hit John with her fist.""John was socked by Mary.""Marie a donne un coup de poing a Jean.""Maria pego a Juan."
Story understanding and generation
Primitive Meaning
ATRANS Transfer of an abstract relationship (i.e. give)
PTRANS Transfer of the physical location of an object (e.g., go)
PROPEL Application of physical force to an object (e.g. push)
MOVE Movement of a body part by its owner (e.g. kick)
GRASP Grasping of an object by an action (e.g. throw)
INGEST Ingesting of an object by an animal (e.g. eat)
EXPEL Expulsion of something from the body of an animal (e.g. cry)
MTRANS Transfer of mental information (e.g. tell)
MBUILD Building new information out of old (e.g decide)
SPEAK Producing of sounds (e.g. say)
ATTEND Focusing of a sense organ toward a stimulus (e.g. listen)
[Schank 1975]
Action: PROPELActor: MaryObject: FistFrom: MaryTo: John
Margie
Story understanding and generation
SAM (Script Applier Mechanism)[Cullingford 1981] Cullingford, Richard E: SAM and Micro SAM. In Roger C. Schank, & Christopher K. Riesbeck (Eds.), Inside computer understanding. Hillsdale, NJ: Erlbaum, 1981
FRUMP [Dejong 1979] DeJong, Gerald F.: Skimming stories in real time: An experiment in integrated understanding (Technical Report YALE/DCS/tr158). New Haven, CT: Computer Science Department, Yale University, 1979
Script‐based understanding
Story understanding and generation
Plan‐based understandingPAM[Wilensky 1978] Wilensky, Robert: Understanding goal‐based stories (Technical Report YALE/DCS/tr140). New Haven, CT: Computer Science Department, Yale University, 1978.
POLITICS[Carbonell 1978] Carbonell, Jaime: Subjective understanding: Computer models of belief systems (Technical Report YALE/DCS/tr150). New Haven, CT: Computer Science Department, Yale University, 1978.
Dynamic MemoryIPP [Lebowitz 1980] Lebowitz, Michael : ¥Generalization and memory in an integrated understanding system (Technical Report YALE/DCS/tr186). New Haven, CT: Computer Science Department, Yale University, 1980.
BORIS[Lehnert 1983] Lehnert, Wendy G., Dyer, Michael G., Johnson, Peter N., Yang, C. J., Harley, Steve: BORIS ‐‐ An experiment in in‐depth understanding of narratives. Artificial Intelligence, 20(1), 15‐62., 1983.
CYRUS[Kolodner 1984] Kolodner, Janet L.: Retrieval and organizational strategies in conceptual memory: A computer model. Hillsdale, NJ: Erlbaum, 1984.
Story tellingTALE‐SPIN[Meehan 1981] Meehan, James: TALE‐SPIN and Micro TALE‐SPIN. In Roger C. Schank, & Christopher K. Riesbeck (Eds.), Inside computer understanding. Hillsdale, NJ: Erlbaum, 1981.
Summary
1. Conversational systems have a long history of development.2. Conversational systems evolved from natural language dialogue
systems. Multiple modality and embodiment have been incorporated, then.
3. The Knowledge Navigator gave a significant impact on the concept of intelligent virtual agents.
4. Personality, social interactions and lifelikeness are key features of intelligent virtual agents.
5. In order to create natural conversational behaviors, conversational artifacts need to coordinate verbal and nonverbal behaviors.
6. Story understanding and generation techniques are necessary for processing content.
References
[Apple Computer 1987] http://homepage.mac.com/ericestrada/Movies/iMovieTheater53.html[Ball 1997] Gene Ball, Dan Ling, David Kurlander, John Miller, David Pugh, Tim Skelly, Andy Stankosky, David Thiel, Maarten Van Dantzich, and Trace
Wax. Lifelike Computer Characters: The Persona Project at Microsoft Research. Software Agents. Jeffrey M. Bradshaw (ed.). AAAI/MIT Press. Menlo Park, CA. 1997.
[Bates 1994] Joseph Bates. The role of emotion in believable agents. Communications of ACM, Vol. 37, No. 7, pp. 122‐125, 1994.[Blumberg 1994] Bruce Blumberg: Action‐Selection in Hamsterdam: Lessons from Ethology, In Proceedings of the 3rd international conference on the
simulation of Adaptive behaviour, pp. 107‐116, 1994. [Bolt 1980] Richard A. Bolt: Put‐that‐there": Voice and gesture at the graphics interface. In Proceedings of the 7th annual conference on Computer
graphics and interactive techniques, Vol. 14, No. 3. (July 1980), pp. 262‐270.[Cassell 1999] Cassell, J., Bickmore, T., Billinghurst, M., Campbell, L., Chang, K., Vilhjálmsson, H. and Yan, H. (1999). "Embodiment in Conversational
Interfaces: Rea." Proceedings of the ACM SIGCHI Conference on Human Factors in Computing Systems (CHI), pp. 520‐527. Pittsburgh, PA.[Cassell 2001] Cassell, J., Vilhjálmsson, H., Bickmore, T. BEAT: the Behavior Expression Animation Toolkit. Proceedings of SIGGRAPH '01, pp. 477‐486.
August 12‐17, Los Angeles, CA, 2001[Cassell 2004] Justine Cassell, Hannes Hogni Villhjalmsson and Timothy Bickmore. BEAT: the Behavior Expression Animation Toolkit, in: Helmut
Prendinger and Mitsuru Ishizuka (eds.) (2004) Life‐Like Characters: Tools, Affective Functions, and Applications. Springer. pp. 163‐185.[Erman 1980] Lee D. Erman, Frederick Hayes‐Roth, Victor R. Lesser, and D. Raj Reddy. The Hearsay‐II Speech‐Understanding System: Integrating
Knowledge to Resolve Uncertainty, ACM Computing Surveys, Volume 12, Issue 2, Pages: 213 ‐ 253, 1980.[Green 1961] Bert F. Green, Jr., Alice K. Wolf, Carol Chomsky, and Kenneth Laughery: Baseball: An Automatic Question‐Answerer Proceedings of the
IRE‐AIEE‐ACM '61 (Western); western joint IRE‐AIEE‐ACM computer conference, 1961.[Hayes‐Roth 1998] Barbara Hayes‐Roth: Jennifer James, celebrity auto spokesperson, International Conference on Computer Graphics and Interactive
Techniques archive ACM SIGGRAPH 98 Conference abstracts and applications table of contents, P. 136, 1998.[Mateas 1997] Michael Mateas: An Oz−Centric Review of Interac ve Drama and Believable Agents, CMU−CS−97−156, School of Computer Science,
Carnegie Mellon University, 1997.[Schank 1975] Schank, Roger C., Goldman, Neil, Rieger, Chuck, and Riesbeck, Christopher K.: Inference and paraphrase by computer. Journal of the
ACM, 22(3), 309‐328, 1975.[Weizenbaum 1966] Joseph Weizenbaum. ELIZA ‐‐ A Computer Program for the study of natural language communication between man and machine,
Communications of the ACM, Vol. 9, No. 1, pp. 36‐45, 1966.[Winograd 1972] Terry Winograd: Understanding Natural Language, Academic Press, 1972.[Woods 1970] W. A. Woods: Transition network grammars for natural language analysis, Communications of the ACM 13(10): 591‐606, 1970.[Woods 1973] W. A. Woods: Progress in natural language understanding: an application to lunar geology, In Proceeding AFIPS '73 Proceedings of the
June 4‐8, 1973, national computer conference and exposition.
For more details …
http://www.springer.com/engineering/signals/book/978‐4‐431‐55039‐6