
The Assistive Kitchen — A Demonstration Scenario for Cognitive Technical Systems

Michael Beetz, Freek Stulp, Bernd Radig, Jan Bandouch, Nico Blodow, Mihai Dolha, Andreas Fedrizzi, Dominik Jain, Uli Klank, Ingo Kresse, Alexis Maldonado, Zoltan Marton, Lorenz Mösenlechner, Federico Ruiz, Radu Bogdan Rusu, Moritz Tenorth

Intelligent Autonomous Systems, Technische Universität München, Department of Informatics
Boltzmannstr. 3, D-85748 Garching bei München, Germany

Abstract— This paper introduces the Assistive Kitchen as a comprehensive demonstration and challenge scenario for technical cognitive systems. We describe its hardware and software infrastructure. Within the Assistive Kitchen application, we select particular domain activities as research subjects and identify the cognitive capabilities needed for perceiving, interpreting, analyzing, and executing these activities as research foci. We conclude by outlining open research issues that need to be solved to realize the scenarios successfully.

I. INTRODUCTION

Cognitive technical systems are systems that are equipped with artificial sensors and actuators, integrated into physical systems, and act in a physical world. They differ from other technical systems in that they have cognitive capabilities, including perception, reasoning, learning, and planning, that turn them into systems that “know what they are doing” [7]. These cognitive capabilities result in systems of higher reliability, flexibility, adaptivity, and performance, and in systems that are easier to interact and cooperate with.

The cluster of excellence COTESYS (Cognition for Technical Systems) [8] considers the assistance of elderly people to be a key application where technical cognitive systems could profoundly impact the well-being of our society. Therefore, COTESYS investigates the realization of an assistive kitchen (Fig. 1), a ubiquitous computing, sensing, and actuation environment with a robotic assistant, as one of its primary demonstration scenarios. The Assistive Kitchen aims at

• supporting and assisting people in their household chores through physical action;

• enhancing the cognitive capabilities of people doing household work by reminding them; and

• monitoring the health and safety of the people.

To achieve these objectives, the Assistive Kitchen is to

• perceive, interpret, learn, and analyze models of household chores and activities of daily life (ADLs); and

• represent the acquired models such that the Assistive Kitchen can use them for activity and safety monitoring, health assessment, and for adapting itself to the needs and preferences of the people.

The research reported in this paper is partly funded by the German cluster of excellence COTESYS (Cognition for Technical Systems). More information, including videos and publications about the Assistive Kitchen, can be found at http://ias.cs.tum.edu/assistivekitchen. Due to space limitations, this paper does not give a comprehensive discussion of related work.

The Assistive Kitchen includes an autonomous robotic agent that is to learn and perform complex household chores. The robot must perform housework together with people, or at least assist them in their activities. This requires safe operation in the presence of humans and behaving according to the preferences of the people it serves.

Clearly, assistive kitchens of this sort are important for several reasons. First, they are of societal importance because they can enable persons with minor disabilities, including sensory, cognitive, and motor ones, to live independently and to perform their household work. This will increase the quality of life as well as reduce the cost of home care.

Assistive kitchens and living environments also raise challenging research problems. One of these problems is that performing household chores is a form of everyday activity that requires extensive commonsense knowledge and reasoning [3]. Another challenge is the low frequency of daily activities, which requires embedded systems and robotic agents to learn from very scarce experience. Besides, household chores include a large variety of manipulation actions and composed activities that pose hard research questions for current robot manipulation research. Daily activities also demand a style of activity management that is very different from that commonly assumed by AI planning systems.

II. ASSISTIVE KITCHEN INFRASTRUCTURE

We start with the hardware and software infrastructure of the kitchen — the implementational basis that defines the possibilities and restrictions of the demonstration scenarios.

A. The Hardware Infrastructure

The hardware infrastructure consists of a mobile robot and networked sensing and actuation devices that are physically embedded into the environment.

1) The Autonomous Mobile Robot: Currently, an autonomous mobile robot with two arms with grippers acts as a robotic assistant in the Assistive Kitchen (see Fig. 1). The robot is an RWI B21 robot equipped with a stereo CCD system and laser rangefinders as its primary sensors. One laser range sensor is integrated into the robot base to allow for estimating the robot's position within the environment. Small laser range sensors are mounted onto the robot's grippers to provide sensory feedback for reaching and grasping actions. The grippers are also equipped with RFID tag readers that support object detection and identification. Cameras are used for longer-range object recognition and to allow for vision-based interaction with people. This robot will be complemented by the Justin robot for sophisticated manipulation tasks.1

Fig. 1. The Assistive Kitchen containing a robot and a variety of sensors.

2) Room Infrastructure: The sensor-equipped kitchen environment is depicted in Fig. 1. In this kitchen, a set of static off-the-shelf cameras is positioned to cover critical working areas with high resolution. With these cameras, human activity and robots can be tracked from different locations to allow for more accurate positioning and pose estimation (see Section IV-B). In addition, laser range sensors are mounted at the walls, covering large parts of the kitchen. They provide accurate and valuable position data for the people present in the environment and their movements within the kitchen.

The pieces of furniture in the Assistive Kitchen are also equipped with various kinds of sensors. RFID tags and magnetic and capacitive sensors provide information about the objects in a cupboard or on the table, and about whether a cupboard is open or closed.

Furthermore, small ubiquitous devices offer the possibility to instrument people acting in the environment with additional sensors. We have equipped a glove with an RFID tag reader that enables us to identify the objects that are manipulated by the person wearing it. In addition, the person is equipped with tiny inertial measurement units that provide us with detailed information about the person's limb motions. Another body-worn sensory device to be used in the Assistive Kitchen demonstration scenario is the gaze-aligned head-mounted camera, which allows the estimation of the attentional state of people while performing their kitchen work.2 These last two sensors will be presented in Section IV-B.

1The Justin robot is currently under development at the DLR robotics institute: http://www.robotic.dlr.de

2The gaze-aligned head-mounted camera is currently being developed by the Neurologische Klinik und Poliklinik, Ludwig-Maximilians-Universität München, which participates in the COTESYS excellence cluster (http://www.forbias.de/).

Web-enabled kitchen appliances, such as the refrigerator, the oven, the microwave, and the faucet, allow for remote and wireless monitoring and control.

B. Software Infrastructure

A critical factor for the successful implementation of the Assistive Kitchen is the software infrastructure. It has to provide a simple, reliable, uniform, and flexible interface for communicating with and controlling different physically distributed sensors and actuators.

We use and extend the open-source Player/Stage/Gazebo (P/S/G) software library to satisfy these requirements for the sensor-equipped environment as well as the robotic agent. Player provides a simple and flexible interface for robot control by making available powerful classes of interface abstractions for interacting with robot hardware, in particular sensors and effectors. These abstractions enable the programmer to use devices with similar functionality through identical software interfaces, thus increasing code transferability. An enhanced client/server model, featuring auto-discovery mechanisms and permitting servers and clients to communicate with each other in a heterogeneous network, enables programmers to code their clients in different programming languages. Programmers can also implement sophisticated algorithms and provide them as Player drivers. By incorporating well-understood algorithms into our infrastructure, we eliminate the need for users to individually re-implement them [10].

Using the P/S/G infrastructure, a robot can enter a sensor-equipped environment, auto-discover the sensors and the services they provide, and use these sensors in the same way as it uses its own sensors.
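To illustrate the device abstraction, a minimal Player client might subscribe to a wall-mounted laser scanner exactly as it would to an on-board one. The following sketch uses the Python bindings of libplayerc; the host name, port, and device index are assumptions for illustration, and exact binding names can differ between Player versions.

# A minimal sketch of a Player client, assuming a Player server for the
# sensor-equipped kitchen runs on host "kitchen-server", port 6665, and
# exposes a wall-mounted laser scanner as laser device index 0 (all of
# these are illustrative assumptions). The playerc Python bindings mirror
# the libplayerc C API; exact names can differ between Player versions.
from playerc import playerc_client, playerc_laser, PLAYERC_OPEN_MODE

client = playerc_client(None, 'kitchen-server', 6665)
if client.connect() != 0:
    raise RuntimeError('could not connect to the kitchen Player server')

# The laser interface abstraction is identical for environment-mounted and
# on-board range sensors, which is what makes client code transferable.
laser = playerc_laser(client, 0)
if laser.subscribe(PLAYERC_OPEN_MODE) != 0:
    raise RuntimeError('could not subscribe to the laser device')

for _ in range(10):
    client.read()                     # blocks until a new scan arrives
    print('received scan with %d range readings' % laser.scan_count)

laser.unsubscribe()
client.disconnect()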



Figure 2 depicts the computational structure of the controller of the household robot. The control system is composed of two layers: the functional and the task-oriented layer. At the functional layer, the system is decomposed into distributed modules, where each module provides a particular function. A function could be the control over a particular actuator or effector, access to the data provided by a particular sensor subsystem, or a specific function of the robot's operation, such as guarded motion. The functional layer is decomposed into the perception and the action subsystem. The modules of the perception subsystem perceive states of the environment, and the modules of the action subsystem control the actuators of the robot. All these modules are provided by our P/S/G infrastructure.

Fig. 2. The computational structure of the household robot control system. The figure depicts the two layers of the system. We consider the functional layer as being decomposed into a perception subsystem (image processing system, probabilistic state estimators) and an action subsystem (navigation system), which exchange sensor data and control signals with the task-oriented layer. The task-oriented layer is constituted by the Structured Reactive Controller (SRC), implemented on top of the Robot Abstract Machine. The perception subsystem computes the information for setting the fluents of the Structured Reactive Controller. The process modules that are activated and deactivated by the Structured Reactive Plan (SRP) control the modules of the action subsystem.

The task-oriented layer is constituted by the Structured Reactive Controller [2], [3], which specifies how the functions provided by the functional layer are to be coordinated in order to perform specific jobs, such as setting the table. The perception subsystem computes the information for setting the fluents of the Structured Reactive Controller. The process modules that are activated and deactivated by the Structured Reactive Plan control the modules of the action subsystem.

Structured reactive controllers work as follows. When given a set of requests, the structured reactive controller retrieves routine plans for individual requests and executes the plans concurrently. These routine plans are general and flexible — they work for standard situations and when executed concurrently with other routine plans. Routine plans can cope well with partly unknown and changing environments, run concurrently, handle interrupts, and control robots without assistance over extended periods. For standard situations, the execution of these routine plans causes the robot to exhibit an appropriate behavior in achieving their purpose. While it executes routine plans, the robot controller also tries to determine whether its routines might interfere with each other and watches out for non-standard situations. If it encounters a non-standard situation, it will try to anticipate and forestall behavior flaws by predicting how its routine plans might work in the non-standard situation and, if necessary, revising its routines to make them robust for this kind of situation. Finally, it integrates the proposed revisions smoothly into its ongoing course of actions.
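To make the fluent and process-module vocabulary concrete, here is a purely illustrative Python sketch; the actual Structured Reactive Controller is written in the RPL plan language, and all names below are hypothetical.

# A purely illustrative rendering of the fluent / process-module idiom of
# structured reactive controllers. The actual SRC is written in the RPL
# plan language; every name in this sketch is hypothetical.
import threading

class Fluent:
    """A registered variable whose value is set by the perception subsystem."""
    def __init__(self, value=None):
        self._value = value
        self._changed = threading.Condition()

    def set(self, value):             # called by a perception module
        with self._changed:
            self._value = value
            self._changed.notify_all()

    def wait_for(self, predicate):    # called by a routine plan
        with self._changed:
            while not predicate(self._value):
                self._changed.wait()
            return self._value

class ProcessModule:
    """Wraps a functional-layer module; activated/deactivated by the plan."""
    def __init__(self, name):
        self.name = name
    def activate(self, **parameters):
        print('activating %s with %s' % (self.name, parameters))
    def deactivate(self):
        print('deactivating %s' % self.name)

# A routine plan blocks on fluents and switches process modules on and off;
# a perception thread would call object_in_gripper.set(True) upon detection.
object_in_gripper = Fluent(False)
navigation = ProcessModule('navigation')

def fetch_cup_routine():
    navigation.activate(goal='cupboard')
    object_in_gripper.wait_for(lambda held: held)
    navigation.deactivate()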

C. Simulation and Visualization

We have developed simulation tools for our kitchen and for the robot acting as robotic assistant, using the Gazebo toolbox for 3D, physics-based robot simulation. The simulator is realistic along many dimensions. In particular, the simulator will use sensing and actuation models that are learned from the real robots. Models of rooms and their furnishing can also be acquired automatically using the methods for the acquisition of environment models described in Section IV-A.

Fig. 3. The Assistive Kitchen and its simulation in Gazebo.

The use of these simulation tools promotes research in assistive kitchen technology in various ways. First, the simulator supports generalization: we can model all kinds of robots in our simulation framework. We will also have different kitchen setups, which requires us to develop control programs that can specialize themselves for different robots and environments. The simulator allows us to run experiments fast, with little effort, and under controllable context settings. This supports the performance of extensive empirical studies.

III. DEMONSTRATION SCENARIOS

The demonstration scenarios are organized along two dimensions (see Table I). The first dimension is the domain tasks and activities under investigation. The second one is the research themes related to the cognitive aspects of perception, interpretation, learning, planning, and execution of these activities.

We start by looking at three scenario tasks in the context of household chores: setting the table, cooking, and performing household chores for an extended period. These tasks challenge cognitive systems along different dimensions, which we will discuss in Section IV.

A. Scenario: Table setting

Table setting refers to the arrangement of tableware, cutlery, and glasses on the table for eating. Table setting is a complex transportation task that can be performed as a single thread of activity.


research theme     | table setting | cooking | household chore
-------------------+---------------+---------+----------------
perception         |       *       |    *    |        *
activity model     |       *       |    *    |        *
env. model         |       *       |         |        *
motion primitives  |       *       |    *    |
macro actions      |       *       |    *    |
everyday activity  |               |         |        *

TABLE I
DIMENSIONS OF DEMONSTRATION SCENARIOS.


There are various aspects of table setting that make it a suitable challenge for cognitive systems: (1) the commonsense knowledge and reasoning needed to perform the task competently; (2) the task itself is typically specified incompletely and in a fuzzy manner; and (3) the task can be carried out and optimized in many ways.

To account for incomplete task specifications, agents must use their experience or make an inquiry to determine who sits where and whether people prefer particular plates, cups, etc. Other missing information, such as which kinds of plates and cutlery are needed and where they should be placed, can be found on the world wide web.

Aspects to be considered for the optimization of the activity include the position where the robot should stand to place the objects on the table, the decision whether objects should be stacked or carried individually, etc. Also, the task requires the execution of macro actions, such as approaching the cupboard in order to open it and get out the plates. Agents also have to consider subtle issues in action selection, such as whether to interrupt the carrying action in order to close the cupboard door immediately, or to close it later. Finally, actions require substantial dexterity. Putting objects on a table where people are seated requires the robot to perform socially acceptable reaching actions.

B. Scenario: Cooking

Cooking is the activity of preparing food to eat. Unlike setting the table, cooking requires the selection, measurement, and combination of ingredients in an ordered procedure. The performance of a cooking activity is measured in terms of the quality and taste of the meal and its timely provision.

Cooking involves the transformation of food, the control of physical and chemical processes, the concurrent execution of different activities, the use of tools and ingredients, and monitoring actions. Concurrent activities have to be timed such that all parts of the meal are done at the desired time. Cooking comprises many different methods, tools, and combinations of ingredients — making it a difficult skill to acquire.

C. Scenario: Prolonged Housekeeping

Prolonged housekeeping and household chores are specific activities related to or used in the running of a household. The activities include cooking, setting the table, washing the dishes, cleaning, and many other tasks.

Performing household chores illustrates well the flexibility and reliability of human everyday activity. An important reason is the flexible management of multiple, diverse jobs. Humans, as well as intelligent agents, usually have several tasks and activities to perform simultaneously. They clean the living room while the food is on the stove. At any time, interrupts such as telephone calls, new errands, and revisions of old ones might come up. In addition, they have regular daily and weekly errands, like cleaning the windows.

Since doing the household chores is a daily activity, done over and over again, agents are confronted with the same kinds of situations many times; they learn and use reliable and efficient routines for carrying out their daily jobs. Performing their daily activities usually does not require a lot of thought because routines are general. This generality is necessary since jobs come in many variations and require agents to cope with whatever situations they encounter. The regularity of everyday activity also requires agents to handle interruptions that occur while carrying out their routines. Routines, like cleaning the kitchen, are often interrupted if a more urgent task is to be performed, and continued later on. Therefore, agents have to remember where they put things in order to find them again later.

A main factor that contributes to the efficiency of everyday activity is the flexibility with which different routines can be interleaved. Agents often work on more than one job at a time. Or, they do jobs together — operating the washing machine in, and getting stored food from, the basement.

People can accomplish their household chores well — even in situations they have not encountered yet — for two reasons: they have general routines that work well in standard situations, and they are, at the same time, able to recognize non-standard situations and adapt their routines to the specific situations they encounter. Making the appropriate adaptations often requires people to predict how their routines would work in non-standard situations and why they might fail.

IV. RESEARCH THEMES

The scenarios described in the previous section challenge cognitive systems along different dimensions. In this section, we discuss the research themes related to the cognitive capabilities needed to perform the tasks in the scenarios.

A. Acquisition and Use of Environment Models

Maps or environment models are resources that the Assistive Kitchen uses in order to accomplish its tasks more reliably and efficiently. Acquiring a model of an assistive kitchen is very different from environment mapping as commonly done by autonomous robots. For mapping the Assistive Kitchen, the robot can make use of the sensors of the environment. Semantic information is also associated with RFID tags in the environment, and the sensors are services providing information about themselves and their use. In contrast to many other robot mapping tasks, where the purpose of mapping is to support navigation, the kitchen map is a resource for understanding and carrying out household chores. To this end, the map needs to have a richer structure, explicitly reference task-relevant objects such as appliances, and know about the concept of containers and doors, such as cabinets and drawers.

Here, we consider mapping to be the following task: Given (1) a sensor-equipped kitchen, where appliances and other pieces of furniture might or might not be tagged with RFID tags that have information associated with them (such as their size), and (2) a stream of observations of activities in this kitchen acquired by the various sensors, acquire a semantic 3D object map of the kitchen.

We have developed a system for the acquisition of object maps (see Fig. 4) based on a set of comprehensive geometrical reasoning techniques that extract semantic object information from 3D point cloud data [11]. The object map comprises the static parts of the environment, such as the kitchen furniture, appliances, tables, and walls. It is saved in OWL-DL format in our knowledge base, which is shared between different applications and robots and can be queried to retrieve higher-level information. To build accurate and robust robot plans, we make use of the object map by automatically importing it into our 3D simulation environment.

Fig. 4. An Object Map of the Assistive Kitchen.
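As an illustration of how such an OWL-DL map can be queried, the sketch below uses the rdflib Python library; the file name, namespace IRI, and class and property names are hypothetical stand-ins for the actual ontology.

# A hypothetical sketch of querying an OWL-DL object map with the rdflib
# library. The file name, namespace IRI, and the class and property names
# (Cupboard, widthOfObject) are illustrative stand-ins, not the actual
# ontology used by the system.
import rdflib

graph = rdflib.Graph()
graph.parse('kitchen_object_map.owl')   # the acquired semantic 3D object map

query = """
PREFIX kitchen: <http://example.org/kitchen#>
SELECT ?obj ?width WHERE {
    ?obj a kitchen:Cupboard .
    ?obj kitchen:widthOfObject ?width .
}
"""
for obj, width in graph.query(query):
    print('cupboard %s is %s m wide' % (obj, width))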

The maps also have to contain the sensors of the environment and their locations in order to semantically interpret the sensor data. For example, the mapper has to estimate the position and orientation of a laser range sensor using its own motions and their effects on the sensor data, in order to infer whether and how the sensor can be used for determining the robot's position. Or, the robot has to locate an RFID tag reader, for example using the estimated position of its gripper and observing which of the sensors reports the RFID identifier of its gripper. Knowing that its gripper is inside a cabinet, it can infer that the respective sensor can be used to sense what is inside this cabinet.

Further information can be acquired by observing activities in the kitchen. By describing objects and places as action-related concepts, i.e. by their roles in actions, the system can infer their function and refer to them based not on their appearance or position but on the purpose they serve. Figure 5 shows an example of a place that has been learned as the one where the robot stands when picking up cups from the cupboard. Action-related concepts are in many cases a very natural representation, since most objects in a household are designed for a certain purpose.

Fig. 5. Positions learned as “pick-up cup place” from observations of table-setting actions.

Finally, the geometry of the environment and recognized activities are used to fine-tune the sensors for particular activity recognition tasks.

B. Acquisition and Use of Activity Models

This research theme is concerned with the observation of human activity in the kitchen in order to acquire abstract models of the activity. The models are then to be used to

• answer queries about the activity,
• monitor activity and assess its execution,
• generate visual summaries of a cooking activity including symbolic descriptions, and
• detect exceptional situations and the need for help.

In order to do so, the system is required to answer questions including the following ones: What are meaningful sub-activities of “setting the table” and why? Which eating utensils and plates have to be used for spaghetti and how should they be arranged? Does John prefer a particular cup? Where do people prepare meals? How do adults set the table as opposed to kids? And why? Why does one put down the plates before bringing them to the table? (To close the door of the cupboard.) What happens after the table is set? Where are the forks kept? What is the fork used for?

The models needed for answering these questions are to be acquired (semi-)automatically from the world wide web, from observing people, and by asking for advice.

Fig. 6. The model used for human motion capture (left), and the views of the four cameras in the kitchen (right).

One of the technologies we use to observe and interpret human activity is the markerless human motion capture system we have developed [1]. This system enables us to retrieve accurate 3D human postures of people performing manipulation tasks in the kitchen, as depicted in Fig. 6. The algorithm uses an anthropometric human model to track persons based on an initialization and observation data from the four ceiling-mounted CCD cameras. The model can be adapted to accurately capture the appearance of a broad range of humans. As output, we get the articulated joint angles of the human skeleton, which can further be used for action classification or the interpretation of human intentions. We are also investigating ways to incorporate complementary sensor data to ease this computationally expensive task, so that we can improve the motion capture algorithm towards real-time processing.
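As a toy illustration of how joint-angle output can feed action classification, the sketch below labels each captured frame with the nearest of a few posture templates; the joint subset, the template values, and the nearest-template method are hypothetical and not the system's actual classifier.

# Toy sketch: frame-wise action labeling from articulated joint angles by
# nearest-template matching. The joint subset, the template values, and
# the method itself are hypothetical illustrations only.
import numpy as np

JOINTS = ['shoulder_pitch', 'elbow_flex', 'torso_bend']   # hypothetical subset

TEMPLATES = {                 # mean joint angles (radians) per action label
    'reaching': np.array([1.2, 0.4, 0.1]),
    'placing':  np.array([0.6, 1.1, 0.3]),
    'standing': np.array([0.1, 0.2, 0.0]),
}

def classify_frame(angles):
    """Return the label whose template has minimal Euclidean distance."""
    return min(TEMPLATES, key=lambda a: np.linalg.norm(angles - TEMPLATES[a]))

# A captured sequence of joint-angle vectors, one row per camera frame.
sequence = np.array([[1.1, 0.5, 0.1],
                     [0.7, 1.0, 0.2],
                     [0.2, 0.2, 0.0]])
print([classify_frame(frame) for frame in sequence])
# -> ['reaching', 'placing', 'standing']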

Furthermore, we use the gaze-aligned head-mounted camera and small inertial measurement units, depicted in Fig. 7, to record the person's gaze and arm posture in everyday manipulation tasks. Fig. 8 depicts an image from the head-mounted camera, and some of the intermediate results in extracting a 3D hand and object model from it.

Fig. 7. The gaze-aligned head-mounted camera (left) and inertial measurement units (right).

C. Action and Motion Primitives

People doing their household activities perform very complex movements smoothly and effortlessly, and improve such movements with repetition and experience. Activities like putting the dishes in the cupboard or setting the table are simple for people but present challenges for robots.

Consider picking up a cup from the table and placing it in the cupboard. Any person will turn and walk towards the cup, while simultaneously stretching the arm and opening the hand. Then the hand will grasp the cup firmly and lift it. It will apply just the right amount of force because, through experience, the person has learned how much a cup should weigh. If the cup were heavier than expected, this would be detected and corrected immediately. Then, holding the cup close to the body for better stability, the person will walk to the cupboard, open the door with the other hand, and place the cup inside.

Several noteworthy things take place. First, the motions used to get the cup are similar to the ones used for reaching and picking up other objects. Such movements can be formed by combining one or more basic movements, called motion primitives. These primitives can be learned by observation and experimentation. Each of these primitives can be parameterized to generate different movements. Second, it is known that noise and lag are present in the nervous system, affecting both sensing and motor control. Yet despite these limitations, people manage to have elegant and precise control of their limbs. A robot with a traditional controller would have great difficulties carrying out simple movements under such conditions. Humans deal with this problem by building forward and inverse models for motor control [12]. Third, during a movement, any abnormal situation is quickly detected and a corrective action is taken.

Fig. 8. Processing steps in fitting object and hand models to the images from the head-mounted camera: 1) matched object model; 2) matched 2D hand model; 3) matched 3D hand model.

This research theme is concerned with endowing a robot with similar capabilities. The robot will observe the activities of people, obtain motion primitives from them, and improve these primitives through its own experimentation. Models of activity are also learned for reaching and grasping movements, and include information about the effects of the control commands. This allows the robot to make predictions and to detect abnormal situations when the sensory data differ from the expected values.
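The following sketch illustrates the last point under simplifying assumptions: a forward model of a grasp command is fit by least squares, and an observation that deviates too far from the model's prediction is flagged as abnormal. The data, the model class, and the threshold are hypothetical.

# A minimal sketch of forward-model-based detection of abnormal situations.
# The linear model, the data, and the threshold are hypothetical; the point
# is only that the robot compares predicted with observed effects.
import numpy as np

rng = np.random.default_rng(0)

# Training data: gripper force command u -> measured lift acceleration y.
u_train = rng.uniform(0.0, 5.0, size=50)
y_train = 0.8 * u_train + rng.normal(0.0, 0.05, size=50)

# Learn a linear forward model y = a*u + b by least squares.
a, b = np.polyfit(u_train, y_train, deg=1)
residual_std = np.std(y_train - (a * u_train + b))

def abnormal(u, y_observed, k=4.0):
    """Flag observations deviating more than k sigma from the prediction."""
    return abs(y_observed - (a * u + b)) > k * residual_std

print(abnormal(2.0, 1.65))   # close to the predicted value: False
print(abnormal(2.0, 0.40))   # the cup is heavier than expected: True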

D. Planning and Learning Macro Actions

The next higher level of activities in the kitchen are macro actions. We consider macro actions to be the synchronized execution of a set of action primitives that, taken together, perform a frequent macro activity in the task domain. These activities are so frequent that agents learn high-performance skills from experience for their reliable and flexible execution. Examples of such macro activities are opening a tetrapak and filling a cup with milk, opening a cabinet to take a glass, or buttering a bagel.

Here the challenge is to compose macro actions from coordinated action primitives such that the resulting behavior is skillful, flexible, reliable, and fluent, without noticeable transitions between the subactions [14]. Another challenge is that such macro actions must be learned from very little experience — compared with other robot learning tasks [6].

The learning of macro actions is an application of action-aware control [13], where the agents learn performance and predictive models of actions and use these models for planning, learning, and executing macro actions. The composition of such macro actions requires the application of transformational learning and planning methods and the combination of symbolic and motion planning with learned dynamic models.

E. Self-adapting High-level Controller

Robotic agents cannot be fully programmed for every application. Thus, in this research theme we realize robot control programs that specialize themselves to their respective robot platform, work space, and tasks (see Fig. 9).


Fig. 9. Self-adaptation of different robots in different kitchens.

We realize a high-level control program for the specific task of setting the table. The program learns from experience where to stand when taking a glass out of the cupboard, how to best grasp particular kitchen utensils, where to look for particular cutlery, etc. This requires the control system to know the parameters of control routines and to have models of how the parameters change the behavior [4]. Also, the robots perform their tasks over extended periods of time, which requires very robust control [5].

F. Learning to Carry out Abstract Instructions

The research theme discussed in this section is the acquisition of new high-level skills. Let us consider cooking pasta as an illustrative example. Upon receiving “cook pasta”, the robot retrieves instructions from webpages such as ehow.com. These instructions are typically sequences of steps to be executed in order to carry out the activities successfully. The challenges of this execution research theme are to (1) translate the abstract instructions into an executable robot control program, (2) disambiguate the often incomplete task descriptions given in natural language and supplement missing information through observations of kitchen activities, and (3) transform the action sequence into an activity structure that can be carried out more reliably, efficiently, and flexibly. Instructions typically abstract away from these aspects of activity specification.

The procedure for transforming natural language task instructions into executable robot plans comprises the following steps (Fig. 10): A syntax parser analyzes the sentence structure and labels the parts of speech. The resulting syntax tree is then transformed into instructions, including the action to be performed, the object(s) to be manipulated, possible pre- and postconditions, as well as parameters like a desired location. WordNet and Cyc are used for determining the meaning of the words. An intermediate plan representation is created in the knowledge base, which can be exported as an executable robot plan.
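As a small illustration of the lexical step, the WordNet interface of the Python nltk library can retrieve the verb senses and hypernyms of “cook”; the synset-to-concept mapping below is a hypothetical stand-in for the actual WordNet-Cyc alignment.

# A small sketch of the lexical step: look up the verb senses of "cook" in
# WordNet via nltk and map a sense to an ontological concept. The
# synset-to-concept dictionary is a hypothetical stand-in for the actual
# WordNet-Cyc alignment used by the system.
from nltk.corpus import wordnet as wn   # requires nltk.download('wordnet')

SYNSET_TO_CONCEPT = {'cook.v.03': 'CookingFood'}   # illustrative mapping

for synset in wn.synsets('cook', pos=wn.VERB):
    hypernyms = [h.name() for h in synset.hypernyms()]
    print(synset.name(), '-', synset.definition(), '| hypernyms:', hypernyms)
    concept = SYNSET_TO_CONCEPT.get(synset.name())
    if concept:
        print('  mapped to ontology concept:', concept)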

[Fig. 10 traces the example “How to cook pasta”: the syntax parser tags how/ADV to/TO cook/VERB pasta/NOUN; the WordNet lexical database resolves “cook” to the verb sense “prepare a hot meal” (domains: cooking, preparation; hypernym: create from raw stuff; synset ID v01617913); the Cyc upper ontology maps this synset to the concept CookingFood (genls: HeatingProcess, PreparingFoodOrDrink; specs: BoilingFood, Microwaving; usesDeviceType: HeatingDevice); the resulting plan representation contains (isa ?PLAN CookingFood) and (objectActedOn ?PLAN Pasta).]

Fig. 10. Import procedure of natural language task descriptions into executable robot plans.

We have successfully imported (almost) working plans from ehow.com descriptions using the described procedure, but several hard challenges remain in this scenario. A general, robust solution for translating abstract instructions into working robot control programs requires answers to the following research questions.

(1) How can ambiguities in the descriptions be resolved, and which missing parameters have to be inferred? For example, a description for setting a table does not explicitly state that the objects have to be placed on the table but only describes the positions relative to each other.

(2) How can the plan libraries of autonomous household robots be specified so generally, reliably, transparently, and modularly that a robot can compose working plans from abstract instructions? In order for newly composed sequences of plan steps to work, it helps if the individual plan steps are specified as “universal plans”, that is, they achieve, if necessary, all preconditions needed for producing the desired effects.
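A toy sketch of such a “universal” plan step, with hypothetical predicates and primitive actions: the step checks each of its preconditions and achieves any that do not hold before producing the desired effect.

# A toy sketch of a "universal" plan step: pick_up achieves its own
# preconditions (robot co-located with the object, gripper empty) whenever
# they do not already hold. All predicates and actions are hypothetical.
world = {'robot_at': 'table', 'gripper': 'empty', 'cup_at': 'cupboard'}

def go_to(place):
    world['robot_at'] = place

def put_down_held_object():
    world[world['gripper'] + '_at'] = world['robot_at']
    world['gripper'] = 'empty'

def pick_up(obj):
    if world['robot_at'] != world[obj + '_at']:    # precondition 1
        go_to(world[obj + '_at'])
    if world['gripper'] != 'empty':                # precondition 2
        put_down_held_object()
    world['gripper'] = obj                         # desired effect
    world[obj + '_at'] = 'gripper'

pick_up('cup')
print(world)   # the robot walked to the cupboard on its own before grasping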

(3) Debugging newly created plans from instructions requires the robot to predict what will happen if it executes the new plan, to identify the flaws of the plan with respect to its desired behavior, and to revise the plan in order to avoid the predicted flaws.

(4) Optimizing tasks like table setting also requires the technical cognitive system to observe people setting the table, to infer the structure of the activity, and to reason about why people perform the task the way they do instead of following the abstract instructions. This way, the robot would learn that people stack plates when carrying them in order to minimize the distance they have to walk. The robot would then transform its plan analogously and test whether this change of activity structure would result in improved performance.

[Fig. 11 shows a transformation tree rooted at a default plan Pdefault, with transformations such as “stack cups on plates” (yielding Pstack-1), “stack plates and one cup” (Pstack-2), “reorder plan steps” (Preorder), “stack plates” (Pstack-plates), and “use both arms” (Puse-both-arms), which compose into plans such as Pstack-plates-use-both-arms.]

Fig. 11. Plan transformations performed by the TRANER planning system in order to optimize a table-setting activity.

Figure 11 sketches the search for optimized table-setting strategies performed by our planning system. The planning system, called TRANER, transforms a default plan for table setting (in which the robot is asked to carry objects to the table one by one) into an optimized plan (in which the robot stacks plates, takes cups in both hands, leaves cupboard doors open while setting the table, and positions itself such that it can reach multiple objects and target positions from the same location). The unique feature of TRANER [9] is that it applies very general plan transformations, such as improving the efficiency of transport tasks by stacking objects or using both hands, to concurrent reactive plans.
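The following toy sketch conveys the flavor of one such transformation; TRANER itself operates on concurrent reactive plans in RPL, so the list encoding and the rewrite rule below are merely illustrative.

# A toy sketch of a transformational-planning rule (illustrative only;
# TRANER itself rewrites concurrent reactive plans in RPL, not this list
# encoding): merge carries of stackable objects with a common destination
# into a single stacked carry.
from itertools import groupby

default_plan = [
    ('carry', 'plate-1', 'table'),
    ('carry', 'plate-2', 'table'),
    ('carry', 'cup-1', 'table'),
]

STACKABLE = {'plate-1', 'plate-2'}   # hypothetical domain knowledge

def stack_transport_steps(plan):
    """Rewrite consecutive carries to one destination: stackable objects
    travel as a single stack, the rest keep their individual carry steps."""
    transformed = []
    for (action, dest), steps in groupby(plan, key=lambda s: (s[0], s[2])):
        objects = [s[1] for s in steps]
        stack = [o for o in objects if o in STACKABLE]
        rest = [o for o in objects if o not in STACKABLE]
        if action == 'carry' and len(stack) > 1:
            transformed.append(('carry-stacked', tuple(stack), dest))
            transformed += [('carry', o, dest) for o in rest]
        else:
            transformed += [(action, o, dest) for o in objects]
    return transformed

print(stack_transport_steps(default_plan))
# -> [('carry-stacked', ('plate-1', 'plate-2'), 'table'),
#     ('carry', 'cup-1', 'table')]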

V. CONCLUDING REMARKS

This paper has presented assistive kitchens as demonstration platforms for cognitive technical systems that include various research challenges for cognitive systems. In particular, we expect the investigation of cognitive capabilities in the context of human everyday activity, which has received surprisingly little attention in previous research efforts, to substantially promote the state of the art of cognition for technical systems.

REFERENCES

[1] J. Bandouch, F. Engstler, and M. Beetz. Accurate human motion capture using an ergonomics-based anthropometric human model. In Proceedings of the Fifth International Conference on Articulated Motion and Deformable Objects (AMDO), 2008. Accepted for publication.

[2] M. Beetz. Concurrent Reactive Plans: Anticipating and Forestalling Execution Failures, volume LNAI 1772 of Lecture Notes in Artificial Intelligence. Springer Publishers, 2000.

[3] M. Beetz. Plan-based Control of Robotic Agents, volume LNAI 2554 of Lecture Notes in Artificial Intelligence. Springer Publishers, 2002.

[4] M. Beetz. Plan representation for robotic agents. In Proceedings of the Sixth International Conference on AI Planning and Scheduling, pages 223–232, Menlo Park, CA, 2002. AAAI Press.

[5] M. Beetz and H. Grosskreutz. Probabilistic hybrid action models for predicting concurrent percept-driven robot behavior. Journal of Artificial Intelligence Research, 24:799–849, 2005.

[6] M. Beetz, A. Kirsch, and A. Müller. RPL-LEARN: Extending an autonomous robot control language to perform experience-based learning. In 3rd International Joint Conference on Autonomous Agents & Multi Agent Systems (AAMAS), 2004.

[7] R. Brachman. Systems that know what they're doing. IEEE Intelligent Systems, pages 67–71, November/December 2002.

[8] M. Buss, M. Beetz, and D. Wollherr. CoTeSys — cognition for technical systems. In Proceedings of the 4th COE Workshop on Human Adaptive Mechatronics (HAM), 2007.

[9] A. Müller, A. Kirsch, and M. Beetz. Transformational planning for everyday activity. In Proceedings of the 17th International Conference on Automated Planning and Scheduling (ICAPS'07), pages 248–255, Providence, USA, September 2007.

[10] R. B. Rusu, B. Gerkey, and M. Beetz. Robots in the kitchen: Exploiting ubiquitous sensing and actuation. Robotics and Autonomous Systems Journal (Special Issue on Network Robot Systems), 2008.

[11] R. B. Rusu, Z. C. Marton, N. Blodow, M. E. Dolha, and M. Beetz. Functional object mapping of kitchen environments. In Proceedings of the 21st IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Nice, France, September 22–26, 2008. Accepted for publication.

[12] S. Schaal and N. Schweighofer. Computational motor control in humans and robots. Current Opinion in Neurobiology, 15:675–682, 2005.

[13] F. Stulp and M. Beetz. Action awareness – enabling agents to optimize, transform, and coordinate plans. In Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2006.

[14] F. Stulp and M. Beetz. Refining the execution of abstract actions with learned action models. Journal of Artificial Intelligence Research (JAIR), 32, June 2008.