CommandTalk: A Spoken Language Interface for Battlefield Simulations
Presenter: Pranith Reddy Kadaru
74.793 Natural Language and Speech Processing
Instructor: Dr. Christel Kemke
Outline
Introduction
Principal Agents of CommandTalk
Speech Recognition
Natural Language
Contextual Interpretation
Push to Talk
ModSAF
Start-It
Introduction
“CommandTalk is a spoken-language interface to synthetic forces in entity-based battlefield simulations” [1]. The CommandTalk system allows the user to use ordinary English commands to:
Create forces
Allocate missions to forces
Alter missions during execution
Manage ModSAF system functions
An Example
“Create an M1 platoon designated Charlie 4 5.
Put Checkpoint 1 at 937 965.
Create a point called Checkpoint 2 at 930 960.
Objective Alpha is 92 96.
Charlie 4 5, at my command, advance in a column to Checkpoint 1.
Next, proceed to Checkpoint 2.
Then assault Objective Alpha.
Charlie 4 5, move out.”
From [1]
Principal Agents of CommandTalk
The principal agents in the CommandTalk system are:
Speech Recognition
Natural Language
Contextual Interpretation
Push to Talk
ModSAF
Start-It
Speech Recognition
Based on the Nuance speech recognition system (wrapped in a thin agent layer)
Listens on the audio port of the computer
Generates its best hypothesis as to what string of words has been spoken by the user
Makes use of wide-band acoustic models
Speech Recognition
Accepts messages that notify it to start, stop, and change grammar
Produces messages consisting of the hypothesized word string
Makes use of the grammar specification of the Natural Language agent
The grammar specification is extracted by an algorithm
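The message interface described above can be pictured with a minimal sketch. The message names (`start`, `stop`, `change_grammar`, `word_string`), the class, and the `on_audio` stand-in for the recognizer are all hypothetical; the slides do not give the actual message format used by CommandTalk.

```python
from dataclasses import dataclass, field


@dataclass
class SpeechRecognitionAgent:
    """Hypothetical thin agent layer around a recognizer (illustration only)."""
    grammar: str = "default"
    listening: bool = False
    outbox: list = field(default_factory=list)

    def handle(self, message: dict) -> None:
        # Incoming control messages: start, stop, or change grammar.
        kind = message["type"]
        if kind == "start":
            self.listening = True
        elif kind == "stop":
            self.listening = False
        elif kind == "change_grammar":
            self.grammar = message["grammar"]
        else:
            raise ValueError(f"unknown message type: {kind}")

    def on_audio(self, hypothesized_words: list) -> None:
        # Stand-in for the recognizer: emit the best hypothesis as a
        # word-string message, but only while the agent is listening.
        if self.listening:
            self.outbox.append(
                {"type": "word_string", "words": " ".join(hypothesized_words)}
            )
```

In this sketch, audio that arrives while the agent is not listening is simply ignored, which is one possible design choice and not necessarily what the real system does.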
Natural Language
Based on SRI’s Gemini, a natural-language understanding system based on unification grammar
Receives the word string to be parsed and interpreted as input
Applies an application-specific grammar to the word string and performs bottom-up parsing
Generates a logical form, a structured representation of the meaning of the string
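To make "bottom-up parsing to a logical form" concrete, here is a toy sketch. It is not Gemini and does not use unification: it is a small chart parser over atomic categories, with a hypothetical three-word lexicon, hypothetical rules, and a hypothetical dictionary-shaped logical form.

```python
# Toy lexicon: word -> (category, meaning). Hypothetical, for illustration only.
LEXICON = {
    "advance": ("V", "advance"),
    "to": ("P", None),
    "checkpoint": ("N", "checkpoint"),
}

# Binary rules: (left category, right category) -> (parent category, builder).
RULES = {
    ("N", "NUM"): ("NP", lambda noun, number: (noun, number)),
    ("P", "NP"): ("PP", lambda _prep, np: np),
    ("V", "PP"): ("S", lambda verb, pp: {"action": verb, "destination": pp}),
}


def lexical_entries(word):
    """Return the (category, meaning) pairs for a single word."""
    if word.isdigit():
        return [("NUM", int(word))]
    entry = LEXICON.get(word.lower())
    return [entry] if entry else []


def parse(words):
    """Build constituents bottom-up, from single words to the whole string."""
    n = len(words)
    if n == 0:
        return None
    chart = {(i, i + 1): lexical_entries(w) for i, w in enumerate(words)}
    for span in range(2, n + 1):
        for start in range(n - span + 1):
            end = start + span
            cell = []
            for mid in range(start + 1, end):
                for left_cat, left_sem in chart[(start, mid)]:
                    for right_cat, right_sem in chart[(mid, end)]:
                        rule = RULES.get((left_cat, right_cat))
                        if rule:
                            parent, build = rule
                            cell.append((parent, build(left_sem, right_sem)))
            chart[(start, end)] = cell
    # The logical form is the meaning of a sentence spanning all the words.
    for cat, sem in chart[(0, n)]:
        if cat == "S":
            return sem
    return None
```

For example, `parse("advance to checkpoint 2".split())` combines "checkpoint 2" first, then "to checkpoint 2", then the whole command, and returns a dictionary with the action and its destination.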
Contextual Interpretation
Receives the logical form as input from the Natural Language agent
Makes use of contextual information to generate a complete interpretation
Sources of this contextual information include linguistic context and situational context
Interaction between the Contextual Interpretation agent and ModSAF is bi-directional, since ModSAF is itself a source of situational context
Contextual Interpretation
The Contextual Interpretation agent must solve the following problems:
Noun phrase resolution:
  A noun phrase must be resolved to the unique object that it represents
  For example, "tank platoon", "M4 platoon", or "Charlie 5 6" may represent the same object
  Whenever an object is created, modified, or destroyed, the ModSAF agent notifies the Contextual Interpretation agent. This helps to keep track of references to the object
Temporal resolution:
  Determine when a command is to be executed
  This is determined from the context and from phrasing or explicit indicators
Contextual Interpretation
Predicate resolution:
  The ModSAF task corresponding to a verb is different for each type of object
  For example, "Advance in column to checkpoint 5." Tanks do not have a column formation, so this command must be addressed to a unit such as a battalion
Vagueness resolution:
  A command may not contain all the information needed to perform the task
  The missing information is filled in by using a combination of linguistic and situational context
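The noun phrase resolution step, together with the ModSAF notifications that keep it up to date, might be sketched as follows. The class name, event names, and numeric identifiers are assumptions made for illustration; the slides do not describe the real data structures.

```python
class EntityRegistry:
    """Hypothetical table of simulation objects and the phrases that name them."""

    def __init__(self):
        self._entities = {}  # object id -> set of lower-cased descriptions

    def notify(self, event, entity_id, descriptions=()):
        # Called whenever the simulator reports a change to an object.
        if event in ("created", "modified"):
            self._entities[entity_id] = {d.lower() for d in descriptions}
        elif event == "destroyed":
            self._entities.pop(entity_id, None)
        else:
            raise ValueError(f"unknown event: {event}")

    def resolve(self, noun_phrase):
        # A noun phrase resolves only if exactly one object matches it.
        phrase = noun_phrase.lower()
        matches = [eid for eid, descs in self._entities.items() if phrase in descs]
        return matches[0] if len(matches) == 1 else None
```

In this sketch an ambiguous phrase (one that matches several objects) is left unresolved rather than guessed, so that a later step could ask the user or consult more context.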
Situational Context
Used by the Contextual Interpretation agent to:
  Resolve references such as pronouns, proper names, plural descriptions, quantified descriptions, and conjunctions
  Interpret commands
  Resolve parameters of underspecified commands
An Example of Interpreting a Command
"A22 advance to checkpoint 2"
Depending on the situation, the system may:
  Add the task for A22 if A22 is not taking part in any mission
  Carry out the task if the task is pending
  Override the current mission with the task if A22 is taking part in a mission, leaving the current mission to be resumed later
  Resume the task if it has been suspended
  Replace the current mission with the task if the current mission is not central
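One way to read the alternatives above is as a function from the unit's current state to the set of applicable interpretations. The sketch below is only that reading: the `UnitState` fields, the option labels, and the way the conditions combine are assumptions, not the rules used by CommandTalk.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UnitState:
    """Hypothetical summary of what a unit such as A22 is currently doing."""
    current_mission: Optional[str] = None
    mission_is_central: bool = True
    pending: set = field(default_factory=set)
    suspended: set = field(default_factory=set)


def candidate_interpretations(state: UnitState, task: str) -> list:
    """List the interpretations of `task` that fit the unit's situation."""
    options = []
    if task in state.pending:
        options.append("carry out")
    if task in state.suspended:
        options.append("resume")
    if state.current_mission is None:
        options.append("add")
    else:
        # The current mission can be set aside and resumed later ...
        options.append("override")
        # ... or dropped entirely when it is not a central mission.
        if not state.mission_is_central:
            options.append("replace")
    return options
```

The function returns every option that applies rather than picking one, since the slides do not say how the system chooses among them.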
An Example of Resolving Arguments of an Underspecified Command
“U 1 Establish a base of fire at Checkpoint 2 facing Objective Alpha.
U 2 Move to Checkpoint 2 and attack the enemy with direct fire.
U 3 Engage the enemy to the north
U 4 Action right” [3]
The "Attack by fire" command requires two arguments:
  1) the position to attack from
  2) the position to fire at
(1) provides both arguments
(2) does not provide the second argument, which must be resolved from context
(3) and (4) provide neither argument; both must be resolved from context
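Filling in the missing arguments of "Attack by fire" can be sketched as a lookup that prefers what the utterance says and falls back to context. The argument names and the shape of the context dictionary are hypothetical; how CommandTalk actually derives these values from linguistic and situational context is not specified in the slides.

```python
REQUIRED_ARGUMENTS = ("from_position", "target_position")


def resolve_attack_by_fire(command: dict, context: dict) -> dict:
    """Complete an underspecified command using contextual defaults."""
    resolved = {}
    for name in REQUIRED_ARGUMENTS:
        if command.get(name) is not None:
            # The utterance supplied this argument explicitly.
            resolved[name] = command[name]
        elif context.get(name) is not None:
            # Otherwise fall back to what the context suggests.
            resolved[name] = context[name]
        else:
            raise LookupError(f"cannot resolve argument: {name}")
    return resolved


# Hypothetical context: where the unit is and which target was last discussed.
context = {"from_position": "Checkpoint 2", "target_position": "Objective Alpha"}

# Utterance (2) names the position to attack from but not the target.
utterance_2 = {"from_position": "Checkpoint 2"}
completed = resolve_attack_by_fire(utterance_2, context)
```

Utterances (3) and (4) would correspond to an empty `command` dictionary in this sketch, with both values taken from `context`.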
Linguistic Context
Used to resolve references based on linguistic descriptions
An example:
“U 5 A11 advance to Objective B.
S 6 There is no Objective B. Which point should 100A11 proceed to?
U 7 Create it at 635 545.
S 8 Should 100A11 proceed to Objective B?
U 9 Yes.” [3]
In the above example, the system resolves "it" to "Objective B"
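Resolving "it" in the dialogue above needs more than picking the last thing mentioned, because the unit 100A11 is mentioned after Objective B. A minimal sketch, assuming the pronoun is matched against the most recent mention of a compatible type (a point, since the verb is "create ... at"); the class, the type labels, and the matching rule are illustrative assumptions.

```python
class LinguisticContext:
    """Hypothetical record of what has been mentioned so far in the dialogue."""

    def __init__(self):
        self._mentions = []  # (name, kind) pairs, oldest first

    def mention(self, name, kind):
        self._mentions.append((name, kind))

    def resolve_pronoun(self, required_kind):
        # Walk backwards through the dialogue and return the most recent
        # mention whose type fits what the command requires.
        for name, kind in reversed(self._mentions):
            if kind == required_kind:
                return name
        return None


ctx = LinguisticContext()
ctx.mention("A11", "unit")           # U 5
ctx.mention("Objective B", "point")  # U 5 and S 6
ctx.mention("100A11", "unit")        # S 6
referent = ctx.resolve_pronoun("point")  # "it" in U 7
```

Here `referent` is "Objective B", matching the behaviour shown in the example dialogue.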
Push to Talk
Handles the interactions with the user
Long narrow window running along the top of the screen
Contains a microphone icon that indicates whether the CommandTalk system is ready, listening, or busy
Also contains an area for the recognized string and for system messages
Push to Talk
Two ways to initiate a spoken command:
  Push-and-hold-to-talk
  Click-to-talk
Sends messages to the Speech Recognition agent to start and stop listening
Receives messages from the Speech Recognition agent consisting of the recognized words and confirmation or error messages
ModSAF
Sends messages informing the Contextual Interpretation agent of the current state of the simulation
Executes commands received from the Contextual Interpretation agent
Provides functions that the GUI does not provide, such as zooming and centering on a point that is not displayed
Start-It
A graphical process-spawning agent that aids in controlling the large number of processes of the CommandTalk system
Mouse-and-menu interface to start and configure the processes in the CommandTalk system
Allocates processes to machines distributed over a network
Reports the status of a process
Robustness Techniques Used in CommandTalk
The One-Grammar Approach
  Same grammar for recognition, understanding, and generation
  Any change made to the understanding grammar is reflected in recognition and generation
Utterance-Level Robustness
  Allow “close-to-grammar” utterances
  For example, “zoom way out” is interpreted as “zoom out” (unknown words are skipped)
  Delete words that do not contribute to the meaning of the utterance
Dialogue-Level Robustness
  The system must be able to deal with situations where an utterance cannot be interpreted within the current system state or dialogue context
  Faulty interpretations must be easily corrected by the user
From [4]
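The word-skipping idea behind utterance-level robustness can be illustrated with a very small sketch. The vocabulary and command set are hypothetical, and the real system works with a full grammar, not a fixed list of phrases.

```python
# Hypothetical in-grammar vocabulary and commands, for illustration only.
VOCABULARY = {"zoom", "in", "out"}
COMMANDS = {("zoom", "in"), ("zoom", "out")}


def robust_interpret(utterance: str):
    """Accept an utterance if it is in the grammar once unknown words are skipped."""
    words = tuple(utterance.lower().split())
    if words in COMMANDS:
        return " ".join(words)
    # Close-to-grammar case: drop words the grammar does not know and retry.
    kept = tuple(w for w in words if w in VOCABULARY)
    if kept in COMMANDS:
        return " ".join(kept)
    return None
```

With these definitions, `robust_interpret("zoom way out")` skips the unknown word "way" and yields the in-grammar command "zoom out", while an utterance with no recoverable command yields `None`.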
Conclusion
“CommandTalk is a spoken-language interface to the ModSAF battlefield simulator that allows simulation operators to generate and execute military exercises by creating forces and control measures, assigning missions to forces, and controlling the display” [3].
CommandTalk has four versions, for the Marine Corps, Army, Navy, and Air Force
References
[1] "CommandTalk: A Spoken-Language Interface for Battlefield Simulations", 1997, by Robert Moore, John Dowding, Harry Bratt, J. Mark Gawron, Yonael Gorfu and Adam Cheyer, in "Proceedings of the Fifth Conference on Applied Natural Language Processing", Washington, DC, pp. 1-7, Association for Computational Linguistics
[2] "The CommandTalk Spoken Dialogue System", 1999, by Amanda Stent, John Dowding, Jean Mark Gawron, Elizabeth Owen Bratt and Robert Moore, in "Proceedings of the Thirty-Seventh Annual Meeting of the ACL", pp. 183-190, University of Maryland, College Park, MD, Association for Computational Linguistics
[3] "Interpreting Language in Context in CommandTalk", 1999, by John Dowding, Elizabeth Owen Bratt and Sharon Goldwater, in "Communicative Agents: The Use of Natural Language in Embodied Systems", pp. 63-67, Association for Computing Machinery (ACM) Special Interest Group on Artificial Intelligence (SIGART), Seattle, WA
[4] "Building a Robust Dialogue System with Limited Data", 2000, by Sharon J. Goldwater, Elizabeth Owen Bratt, Jean Mark Gawron and John Dowding, presented at the Workshop on Conversational Systems at the 1st Meeting of the North American Chapter of the Association for Computational Linguistics, Seattle, WA
[5] http://www.ai.sri.com/natural-language/projects/arpa-sls/commandtalk.html
[6] http://www.ai.sri.com/~oaa/
[7] http://www.ai.sri.com/natural-language/projects/arpa-sls/nat-lang.html