
CommandTalk: A Spoken Language Interface for Battlefield Simulations

Presenter: Pranith Reddy Kadaru

74.793 Natural Language and Speech Processing

Instructor: Dr. Christel Kemke

Outline

• Introduction
• Principal Agents of CommandTalk
  • Speech Recognition
  • Natural Language
  • Contextual Interpretation
  • Push to Talk
  • ModSAF
  • Start-It

Introduction

“CommandTalk is a spoken-language interface to synthetic forces in entity-based battlefield simulations” [1]. The CommandTalk system allows the user to use ordinary English commands to:

• Create forces
• Assign missions to forces
• Alter missions during execution
• Manage ModSAF system functions

An Example

“Create an M1 platoon designated Charlie 4 5.
Put Checkpoint 1 at 937 965.
Create a point called Checkpoint 2 at 930 960.
Objective Alpha is 92 96.
Charlie 4 5, at my command, advance in a column to Checkpoint 1.
Next, proceed to Checkpoint 2.
Then assault Objective Alpha.
Charlie 4 5, move out.”

from [1]

Principal Agents of CommandTalk

The principal agents in the CommandTalk system are:

• Speech Recognition
• Natural Language
• Contextual Interpretation
• Push to Talk
• ModSAF
• Start-It

Speech Recognition

• Based on the Nuance speech recognition system (wrapped in a thin agent layer)
• Listens on the audio port of the computer
• Generates its best hypothesis as to what string of words has been spoken by the user
• Makes use of wide-band acoustic models

Speech Recognition

• Accepts messages that tell it to start listening, stop listening, or change grammar
• Produces messages consisting of the hypothesized word string
• Makes use of the grammar specification of the Natural Language agent
• The recognition grammar is extracted from that specification by an algorithm

Natural Language

• Based on SRI’s Gemini, a natural-language understanding system based on unification grammar
• Receives as input a word string to be parsed and interpreted
• Applies an application-specific grammar to the word string and performs bottom-up parsing
• Generates a logical form, a structured representation of the meaning of the string
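To make the idea of a logical form concrete, here is a minimal sketch that maps a word string to a structured representation. It is not Gemini and does no unification-grammar or bottom-up parsing; it uses plain regular expressions, and the `LogicalForm` class, the predicate names and the two patterns are all hypothetical, chosen only to cover two commands from the example slide.

```python
import re
from dataclasses import dataclass, field


@dataclass
class LogicalForm:
    """A structured representation of what an utterance means (hypothetical)."""
    predicate: str
    args: dict = field(default_factory=dict)


# Hypothetical patterns; a real grammar would be far richer than this.
PATTERNS = [
    (re.compile(r"put (?P<name>checkpoint \d+) at (?P<x>\d+) (?P<y>\d+)"),
     "create_point"),
    (re.compile(r"create an? (?P<vehicle>\w+) platoon designated (?P<callsign>.+)"),
     "create_unit"),
]


def interpret(words):
    """Return a LogicalForm for a recognized word string, or None if no pattern fits."""
    text = words.lower().strip().rstrip(".")
    for pattern, predicate in PATTERNS:
        match = pattern.fullmatch(text)
        if match:
            return LogicalForm(predicate, match.groupdict())
    return None
```

For instance, `interpret("Put Checkpoint 1 at 937 965.")` yields a `create_point` form whose arguments hold the point name and the two coordinates as strings.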

Contextual Interpretation

• Receives the logical form as input from the Natural Language agent
• Makes use of contextual information to generate a complete interpretation
• Sources of this contextual information include linguistic context and situational context
• Interaction between the Contextual Interpretation agent and ModSAF is bi-directional, as ModSAF is itself a source of situational context

Contextual Interpretation

The Contextual Interpretation agent must solve the following problems:

• Noun phrase resolution:
  • A noun phrase must be resolved to the unique object that it represents
  • For example, “tank platoon”, “M4 platoon”, and “Charlie 5 6” may all represent the same object
  • Whenever an object is created, modified, or destroyed, the ModSAF agent notifies the Contextual Interpretation agent, which lets it keep track of the objects a reference can denote
• Temporal resolution:
  • Determines when a command is to be executed
  • The time is determined from the context and phrasing, or from explicit indicators
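The notification-driven bookkeeping behind noun phrase resolution can be sketched as follows. This is an illustration only: the `ObjectRegistry` class, the event names and the object ids are assumptions, and exact string matching stands in for whatever matching the real agent performs.

```python
class ObjectRegistry:
    """Tracks simulation objects so that descriptions can be resolved (hypothetical)."""

    def __init__(self):
        self._objects = {}  # object id -> set of lower-cased descriptions

    def notify(self, event, object_id, descriptions=()):
        """Handle an assumed created/modified/destroyed notification."""
        if event == "destroyed":
            self._objects.pop(object_id, None)
        elif event in ("created", "modified"):
            self._objects[object_id] = {d.lower() for d in descriptions}
        else:
            raise ValueError(f"unknown event: {event}")

    def resolve(self, noun_phrase):
        """Return the unique matching object id, or None if none or ambiguous."""
        phrase = noun_phrase.lower()
        matches = [oid for oid, descs in self._objects.items() if phrase in descs]
        return matches[0] if len(matches) == 1 else None
```

If one unit is registered under “tank platoon”, “M4 platoon” and “Charlie 5 6”, all three phrases resolve to it. Once a second tank platoon is created, “tank platoon” becomes ambiguous and only the call sign still resolves uniquely.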

Contextual Interpretation

• Predicate resolution:
  • The ModSAF task corresponding to a verb differs depending on the object it is applied to
  • For example: “Advance in column to checkpoint 5.” Tanks do not have a column formation, so this command is understood as addressed to a battalion
• Vagueness resolution:
  • A command may not contain all the information needed to perform the task
  • The missing information is filled in using a combination of linguistic and situational context

Situational Context

• Used by the Contextual Interpretation agent
• Used to resolve references such as pronouns, proper names, plural descriptions, quantified descriptions, and conjunctions
• Used to interpret commands
• Used to resolve parameters of underspecified commands

An Example of Interpreting a Command

• “A22 advance to checkpoint 2” can be interpreted in several ways:
  • Add the task for A22 if A22 is not taking part in any mission
  • Carry out the task if the task is pending
  • Override the current mission with the task if A22 is taking part in a mission, leaving the current mission to be resumed later
  • Resume the task if it has been suspended
  • Replace the current mission with the task if the current mission is not central
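The choice among these interpretations can be pictured as a decision over the unit's current state. The sketch below is an assumption throughout: the slide does not say in which order the cases are checked, so the priority used here (pending, suspended, no mission, non-central mission, otherwise override) is only one plausible reading, and `UnitState` and `choose_action` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UnitState:
    """Assumed view of what the simulation knows about a unit."""
    current_mission: Optional[str] = None
    current_is_central: bool = True
    pending_tasks: frozenset = frozenset()
    suspended_tasks: frozenset = frozenset()


def choose_action(task, state):
    """Pick one interpretation of a task command (check order is an assumption)."""
    if task in state.pending_tasks:
        return "carry_out"   # the task was already given and is waiting
    if task in state.suspended_tasks:
        return "resume"      # the task was interrupted earlier
    if state.current_mission is None:
        return "add"         # the unit is idle, so just add the task
    if not state.current_is_central:
        return "replace"     # the current mission can be dropped
    return "override"        # keep the current mission to resume later
```

With an idle A22, `choose_action("advance to checkpoint 2", UnitState())` selects `"add"`; with A22 engaged in a central mission it selects `"override"`.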

An Example of Resolving Arguments of an Underspecified Command

“U 1 Establish a base of fire at Checkpoint 2 facing Objective Alpha.
U 2 Move to Checkpoint 2 and attack the enemy with direct fire.
U 3 Engage the enemy to the north
U 4 Action right” [3]

The “Attack by fire” command requires two arguments:
1) the position to attack from
2) the position to fire at

• (1) provides both arguments
• (2) does not provide the second argument, which must be resolved from context
• (3) and (4) do not provide both arguments; the missing ones must be resolved from context
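Filling the two “Attack by fire” arguments can be sketched as a slot-filling step that prefers what the utterance says and falls back to context. The slot names `from_position` and `target_position`, and the idea that context arrives as a simple dictionary, are assumptions made for this illustration.

```python
# Hypothetical slot names for the two arguments of "attack by fire".
REQUIRED_ARGS = ("from_position", "target_position")


def fill_arguments(explicit, context):
    """Complete the command frame, preferring explicitly spoken arguments."""
    frame = {}
    for name in REQUIRED_ARGS:
        if name in explicit:
            frame[name] = explicit[name]      # stated in the utterance
        elif name in context:
            frame[name] = context[name]       # recovered from context
        else:
            raise LookupError(f"cannot resolve argument: {name}")
    return frame
```

For utterance (2), the explicit part would supply only `from_position` ("Checkpoint 2"), and the target would come from context, for example a previously mentioned objective.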

Linguistic Context

• Used to resolve references based on linguistic descriptions
• An example:

“U 5 A11 advance to Objective B.
S 6 There is no Objective B. Which point should 100A11 proceed to?
U 7 Create it at 635 545.
S 8 Should 100A11 proceed to Objective B?
U 9 Yes.” [3]

• In the above example the system resolves “it” to Objective B
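One simple way to picture this kind of pronoun resolution is a history of mentioned entities searched from most recent to oldest, with a constraint on the kind of thing the verb can apply to. This is a sketch under assumptions, not the method described in [3]: the `DiscourseHistory` class and the "unit"/"point" kinds are hypothetical.

```python
class DiscourseHistory:
    """Records mentioned entities so that pronouns can be resolved (hypothetical)."""

    def __init__(self):
        self._mentions = []  # (name, kind) pairs, oldest first

    def mention(self, name, kind):
        """Record that an entity of the given kind was just mentioned."""
        self._mentions.append((name, kind))

    def resolve_pronoun(self, required_kind):
        """Return the most recently mentioned entity of the required kind."""
        for name, kind in reversed(self._mentions):
            if kind == required_kind:
                return name
        return None
```

In the dialogue above, 100A11 is the most recent mention before “Create it”, but only a point can be created at a coordinate, so asking for the most recent "point" returns Objective B.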

Push to Talk

• Handles the interactions with the user
• A long, narrow window running along the top of the screen
• Contains a microphone icon that indicates whether the CommandTalk system is ready, listening, or busy
• Also contains an area for the recognized string and for system messages

Push to Talk

• Two ways to initiate a spoken command:
  • Push-and-hold-to-talk
  • Click-to-talk
• Sends messages to the Speech Recognition agent telling it to start and stop listening
• Receives messages consisting of the recognized words, and confirmation or error messages, from the Speech Recognition agent
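The ready, listening and busy states and the start/stop messages can be sketched as a small state machine for the push-and-hold mode. Everything beyond those three state names is an assumption: the message strings, the `send` callback, and the choice to return to "ready" as soon as the recognized words arrive are illustrative only.

```python
class PushToTalk:
    """Minimal push-and-hold-to-talk state machine (hypothetical sketch)."""

    def __init__(self, send):
        self._send = send        # callable that delivers a message to the recognizer
        self.state = "ready"
        self.recognized = ""

    def press(self):
        """User pushes the button: start listening if the system is ready."""
        if self.state == "ready":
            self.state = "listening"
            self._send("start_listening")

    def release(self):
        """User releases the button: stop listening and wait for a result."""
        if self.state == "listening":
            self.state = "busy"
            self._send("stop_listening")

    def on_recognized(self, words):
        """Recognizer reports its word string: display it and become ready again."""
        self.recognized = words
        self.state = "ready"
```

Passing `sent.append` for a list `sent` as the `send` callback makes it easy to check which messages a press and release produce.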

ModSAF

• Sends messages reporting the current state of the simulation to the Contextual Interpretation agent
• Executes commands received from the Contextual Interpretation agent
• Provides functions that are not provided by the GUI, such as zooming and centering on a point that is not displayed

Start-It

• A graphical process-spawning agent that helps control the large number of processes in the CommandTalk system
• Provides a mouse-and-menu interface to start and configure the processes in the CommandTalk system
• Allocates processes to machines distributed over a network
• Reports the status of each process

Robustness Techniques used in CommandTalk

• The One-Grammar Approach
  • The same grammar is used for recognition, understanding, and generation
  • Any change made to the understanding grammar is reflected in recognition and generation
• Utterance-Level Robustness
  • Allow “close-to-grammar” utterances
  • For example, “zoom way out” is interpreted as “zoom out” (unknown words are skipped)
  • Delete words that do not contribute to the meaning of the utterance
• Dialogue-Level Robustness
  • The system must be able to deal with utterances that cannot be interpreted within the current system state or dialogue context
  • Faulty interpretations must be easy for the user to correct

From [4]
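The “zoom way out” case of utterance-level robustness can be illustrated with a small word-skipping matcher. This is only a sketch of the idea, not the mechanism from [4]: the three-sentence `GRAMMAR` list is hypothetical, and an utterance is accepted here if some in-grammar sentence can be obtained from it by deleting words.

```python
# A hypothetical, tiny set of in-grammar sentences.
GRAMMAR = [
    ("zoom", "out"),
    ("zoom", "in"),
    ("center", "on", "checkpoint", "1"),
]


def is_subsequence(short, long):
    """True if all words of `short` appear in `long` in the same order."""
    remaining = iter(long)
    return all(word in remaining for word in short)


def robust_match(utterance):
    """Return the in-grammar sentence closest to the utterance, or None."""
    words = tuple(utterance.lower().split())
    if words in GRAMMAR:
        return words
    # Keep sentences reachable by skipping words; prefer the one skipping fewest.
    candidates = [s for s in GRAMMAR if is_subsequence(s, words)]
    return max(candidates, key=len) if candidates else None
```

Preferring the longest candidate means the fewest words are skipped, which keeps the interpretation as close as possible to what was actually said.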

Conclusion

• “CommandTalk is a spoken-language interface to the ModSAF battlefield simulator that allows simulation operators to generate and execute military exercises by creating forces and control measures, assigning missions to forces, and controlling the display” [3].
• CommandTalk has four versions: for the Marine Corps, the Army, the Navy, and the Air Force

DEMO

here it comes!

References

[1] "CommandTalk: A Spoken-Language Interface for Battlefield Simulations", 1997, by Robert Moore, John Dowding, Harry Bratt, J. Mark Gawron, Yonael Gorfu and Adam Cheyer, in "Proceedings of the Fifth Conference on Applied Natural Language Processing", Washington, DC, pp. 1-7, Association for Computational Linguistics

[2] "The CommandTalk Spoken Dialogue System", 1999, by Amanda Stent, John Dowding, Jean Mark Gawron, Elizabeth Owen Bratt and Robert Moore, in "Proceedings of the Thirty-Seventh Annual Meeting of the ACL", pp. 183-190, University of Maryland, College Park, MD, Association for Computational Linguistics

[3] "Interpreting Language in Context in CommandTalk", 1999, by John Dowding, Elizabeth Owen Bratt and Sharon Goldwater, in "Communicative Agents: The Use of Natural Language in Embodied Systems", pp. 63-67, Association for Computing Machinery (ACM) Special Interest Group on Artificial Intelligence (SIGART), Seattle, WA

[4] "Building a Robust Dialogue System with Limited Data", 2000, by Sharon J. Goldwater, Elizabeth Owen Bratt, Jean Mark Gawron and John Dowding, presented at the Workshop on Conversational Systems at the 1st Meeting of the North American Chapter of the Association for Computational Linguistics, Seattle, WA

[5] http://www.ai.sri.com/natural-language/projects/arpa-sls/commandtalk.html

[6] http://www.ai.sri.com/~oaa/

[7] http://www.ai.sri.com/natural-language/projects/arpa-sls/nat-lang.html