
Coupled dynamical system based arm-hand grasping model for … · first-mm.informatik.uni-freiburg.de/files/ashwini12ras.pdf · 2012. 7. 18.



Coupled dynamical system based arm-hand grasping model for

learning fast adaptation strategies

Ashwini Shukla∗, Aude Billard∗

Learning Algorithms and Systems Laboratory (LASA), Ecole Polytechnique Federale de Lausanne - EPFL, Switzerland

Abstract

Performing manipulation tasks interactively in real environments requires a high degree of accuracy and stability. At the same time, when one cannot assume a fully deterministic and static environment, one must endow the robot with the ability to react rapidly to sudden changes in the environment. These considerations make the task of reach and grasp difficult to deal with. We follow a programming by demonstration (PbD) approach to the problem and take inspiration from the way humans adapt their reach and grasp motions when perturbed. This is in sharp contrast to previous work in PbD that uses unperturbed motions for training the system and then applies perturbation solely during the testing phase. In this work, we record the kinematics of arm and fingers of human subjects during unperturbed and perturbed reach and grasp motions. In the perturbed demonstrations, the target's location is changed suddenly after the onset of the motion. Data show a strong coupling between the hand transport and finger motions. We hypothesize that this coupling enables the subject to seamlessly and rapidly adapt the finger motion in coordination with the hand posture. To endow our robot with this competence, we develop a Coupled Dynamical System based controller, whereby two dynamical systems driving the hand and finger motions are coupled. This offers a compact encoding for reach-to-grasp motions that ensures fast adaptation with zero latency for re-planning. We show in simulation and on the real iCub robot that this coupling ensures smooth and "human-like" motions. We demonstrate the performance of our model under spatial, temporal and grasp type perturbations, which show that reaching the target with coordinated hand-arm motion is necessary for the success of the task.

Keywords: Grasping, Hand Arm Coordination, Fast Perturbations, Manipulation Planning, Programming by Demonstration

1. Introduction

Planning and control of constrained grasping motions has often been studied as two separate problems, in which one first generates the arm motion [1, 2] and then shapes the

∗Corresponding author. Tel.: +41 21 693 6947. Email address: [email protected] (Ashwini Shukla)

Preprint submitted to Elsevier, September 6, 2011


[Figure 1 panels: (a) Experimental Setup; (b) Hand Closeup. Panel (b) plots Finger Joint Angle (rad) against Time (s).]

Figure 1: (a) Experimental setup to record human behavior under perturbations. An on-screen target selector is used to create a sudden change in the target location for reaching. (b) Motion of the fingers as seen from a high-speed camera at 100 fps. Note the decrease in the joint angle values (re-opening of the fingers) starting at the onset of the perturbation.

hand to stably grasp the targeted object [3, 4]. The sheer complexity of each of these two problems when controlling high dimensional arm-hand systems has discouraged the use of a single coherent framework for carrying out both tasks simultaneously. In this work, we advocate the use of a single framework to control reach and grasp motions when the task requires very fast adaptation of the motion. We consider the problem of re-planning reach and grasp motions on the fly, to enable adaptation to changes in the position, size or type of the object to be grasped. This requires the ability to re-plan quickly and flexibly.

One widely desired property when designing a robot controller is robustness, i.e. the ability to recover from perturbations. In reach-to-grasp tasks, perturbations may be of the following types: a) displacement of the robot end-effector and/or the target (spatial perturbations), b) delays in the task execution due to random factors such as friction in the gears or delays in the underlying controller of the robot (temporal perturbations), c) a change in the target object forcing a change in the type of grasp required. In the context of controlling for reach and grasp tasks, this problem has been addressed primarily by designing a stable controller (ensured to stop at the target) for both the reach and grasp components of the motion. Many different approaches have been offered for designing task-specific controllers with minimum uncertainty and deviation from the intended trajectory. One drawback of such approaches is that they assume that the trajectory to track is known beforehand. It is, however, not always desirable to return to the original desired trajectory, as the path to get there may be unfeasible, especially when a perturbation sends the system far from the originally planned trajectory. More recent approaches have advocated the use of dynamical systems as a natural means of embedding sets of feasible trajectories. This offers great robustness in the face of perturbations, as a new desired trajectory can be recomputed on the fly with no need to re-plan [5–7]. We follow this trend and extend our previous work [8, 9] on learning a motor control law using time-invariant dynamical systems. Such a control law generates trajectories that are asymptotically stable at a single attractor. In this paper, we extend this work to enable coupling across two such dynamical systems, for controlling reach and grasp motions in synchrony. Controlling with such coupled dynamical systems entails more complexity than controlling with two independent control laws, as one must ensure satisfaction of convergence constraints and of the correlations between the two processes [10–13].

We follow a Programming by Demonstration (PbD) approach [14] and investigate how we can take inspiration from the way humans react when perturbed, and learn motor control laws from such examples. This departs from the usual approaches in PbD, which typically use demonstrations of unperturbed motions.

A number of studies of the way humans, and other animals, control reach-and-grasp tasks [10, 15, 16] have established that the dynamics of arm and finger movements follow a particular pattern of coordination, whereby the fingers start opening (preshaping) for the final posture at about half of the reaching cycle. Humans and other animals adapt both the timing of hand transport and the size of the finger aperture to the object's size and location. When perturbed, humans adapt these two variables seamlessly and in synchrony [17, 18]. In [19], we showed a strong coupling between the dynamics of the finger aperture and the hand velocity. The finger aperture follows a biphasic course: a short and wide opening after the hand's peak velocity is followed by a slow closure phase. In this paper, we revisit these observations to derive precise measurements of the correlation between hand transport and finger preshape, which we then use to determine specific parameters of our model of coupled dynamical systems across these two motor programs.

This paper is organized as follows. Section 2 reviews the literature related to the presented work: imitation learning, manipulation planning and biological evidence of reach-grasp coupling. Section 3 starts with a short recap of the background on Dynamical Systems (DS), their estimation using GMMs, and regression. We give a formal definition of the Coupled Dynamical System (CDS) model, explain the model construction process and give the algorithm for regression. In Section 4, we present the experimental setup used to learn from perturbed human demonstrations. We validate our approach by presenting a series of experiments on the iCub simulator as well as on the real robot. We show that the reach-grasp behavior is reproduced while respecting the correlations and couplings learned during the demonstrations, and that this is critical for the success of the overall task. We also show that the post-perturbation re-planning is quick and enables a very fast response from the robot.


2. Related Work

The presented model relates to several fields of work. It draws inspiration from neurophysiological studies of human reach-to-grasp motion and exploits current techniques from imitation learning to make a novel contribution to manipulation planning and control. In this section, we review the relevant literature in each of these fields.

2.1. Manipulation Planning

The classical approach in robotics for reaching to grasp objects has been to divide the overall problem into two sub-problems, where one first reaches for and then grasps the object [1, 20, 21]. Although both the issue of reaching a pre-grasp pose and that of forming a grasp around arbitrary objects have been intensively studied, very few works [22–24] have looked into combining the two so as to obtain a unified reach-grasp system.

Most manipulation planners plan paths in the configuration space of the robot using graph based techniques. Very powerful methods, such as those based on probabilistic roadmaps and their variants [25, 26], use a C-space description of the environment and graph based methods for search. Another approach to the same problem has been to use various control schemes in conjunction with offline grasp planners [21] or visual tracking systems [27]. A synergistic combination of grasp planning, visual tracking and arm trajectory generation is presented in [28]. LaValle and Kuffner [29] proposed RRTs as a faster alternative for manipulation planning problems, provided an efficient inverse kinematics (IK) solver exists. RRT based methods [1] are currently the fastest online planners due to their efficient searching ability. The reported planning times are of the order of 100 ms for single arm reaching tasks in the absence of obstacles [20]. While this is certainly very quick, graph based methods fail to take into account the dynamic constraints of the task.

It remains a challenge to design planning algorithms for dynamic tasks under quick perturbations. Moreover, in such cases, re-planning upon perturbation must not take more than a few milliseconds. These are the types of problems we address here. We take inspiration from human studies to understand how humans embed correlations between arm and finger motion to ensure a robust response to such fast perturbations. We show that retaining these correlations between the reach and grasp motions is critical to the success of the task.

2.2. Biological Evidence

The concept of coupling between the reach and grasp motions is inspired by extensive evidence in neurophysiological studies [10, 13, 30–35]. The most frequently reported mechanism suggests a parallel, but time-coupled, evolution of the reach and grasp motions with synchronized termination. Attempts at quantifying this process in a way that may be usable for robot control are few. Gülke et al. [36] and Bae and Armstrong [37] showed that the finger motion during reach-to-grasp tasks can be described by a simple polynomial function of time, while Ulloa and Bullock [38] modeled the covariation of the arm and finger motion. Interestingly, these authors also report an involuntary reopening of the fingers upon perturbation of the target location; an observation which we revisit in this paper.


A number of models have been developed to simulate the finger-hand coupling, so as to account for the known coordinated pattern of hand-arm motions. The Hoff-Arbib model [39] generates a heuristic estimate of the transport time based on the reaching distance and the object size, and uses it to compute the opening and closing times of the hand. Since the control scheme presented in their approach is time dependent and the temporal coupling parameters are decided prior to the onset of movement, this model does not guarantee handling of temporal or spatial perturbations. Oztop and Arbib [40] argued in their hand state hypothesis that, during human reach-grasp motion control, the most appropriate feedback is a 7 dimensional vector including the pose of the hand w.r.t. the target, the hand aperture and the thumb adduction/abduction. In the Haggard-Wing model [41], both the transport and aperture control processes have access to each other's spatial state. The variance of the hand and finger joint angles is used to set the corresponding control gains, whereas the correlation computed across the two is used to implement the spatial coupling. The time independence of this model proved to be an elegant way to handle temporal perturbations. A neural network based model presented by Ulloa and Bullock [42] ensured continuous coupling and efficient handling of perturbations. They assumed the vector integration to end point (VITE) model [43] as a basis for the task dynamics.

The model we propose here specifically exploits the principle of spatial coupling between the palm and finger motion. It ensures that the motion reproduced by the robot exhibits hand-arm coupling and respects termination constraints similar to those found in natural human motion.

2.3. Imitation Learning

Learning how to perform a task by observing demonstrations from an experienced agent has been explored extensively under different frameworks. Classical means of encoding the task information are based on spline or polynomial decomposition and averaging [44, 45]. These have been shown to be very fast trajectory generators, useful for tasks like catching moving objects. A different body of work advocates non-linear stochastic regression techniques to represent the tasks and regenerate motion in a generalized setting [46]. These methods allow a systematic treatment of uncertainty by assuming noisy data, and hence estimate the trajectories as a set of random variables. The regression assumes a model for the underlying process and learns its parameters via machine learning techniques. Subsequently, multiple works under the PbD framework [2, 8, 47] have shown that this problem can be handled elegantly using the dynamical systems (DS) approach. Using a DS to represent motion removes the explicit time dependency from the model. As a result, transitions between states during the execution of a task depend solely on the current state of the robot and the environment¹. However, the removal of the time dependency comes at the cost of non-trivial stability of the models. The states of a process evolving autonomously under the influence of a DS may diverge away from the goal if initialized outside the basin of attraction of the equilibrium point. Eppner et al. [48] presented a dynamic Bayesian network based approach to learn generalized relations between the world and the robot/demonstrator. While generalizing the task reproduction over different spatial setups, this framework also allows the inclusion of constraints not captured in the demonstrations, such as obstacle avoidance. Ijspeert et al. [2], in their dynamic movement primitives (DMP) formulation, augment the dynamics learned from motion data with stable linear dynamics which take precedence as the state approaches the goal. In our previous work [9], it was shown that formulating the problem of fitting data to a Gaussian Mixture Model as a non-linear optimization problem under stability constraints ensures global asymptotic stability of the DS.

¹Even in a DS formulation, time dependency is present, but only implicitly, in the form of the time derivatives of the state variables.

Although a large amount of work has been done on learning and improving skills by observing good examples of successful behavior, very little work has looked into the information that can be extracted from non-canonical demonstrations. In Grollman and Billard [49], we proposed a way to learn from failed demonstrations. Here, we follow a complementary road and investigate how one can learn from observing how humans adapt their motion so as to avoid failure. To the best of our knowledge, this is the first work in PbD that studies human motion recovery under perturbation. We empirically show: a) the difference in dynamics employed during perturbed and unperturbed demonstrations, and b) the coupling that exists between the reach and grasp components, which ensures successful task completion. We present a coupled dynamical systems based approach to achieve coordination between hand transport and pre-shape. We show that the DS based formulation enables our model to react to very fast on-the-fly perturbations without any latency for re-planning. We validate the model by implementing our method on the iCub simulator as well as on the real robot.

3. Methodology

In this section, we start with a short description of motion encoding using autonomous dynamical systems (DS) and explain how Gaussian Mixture Models (GMMs) can be used to estimate them. We then present an extension of this GMM estimate that allows coupling across different DS, which we refer to as the Coupled Dynamical System (CDS). A formal discussion of the CDS model follows, describing the modeling process and the regression algorithm used to reproduce the task; a simple 2D example is included to build an intuitive understanding of how the CDS model works.

3.1. DS control of reaching

We briefly present here our previous work on modeling reaching motions through autonomous dynamical systems with a single attractor at the target. For clarity, we reiterate the encoding presented in [8].

Let ξ denote the end-effector position and ξ̇ its velocity. We assume that the state of the system evolves in time according to a first order autonomous Ordinary Differential Equation (ODE):

ξ̇ = f(ξ) (1)


f : ℝᵈ ↦ ℝᵈ is a continuous and continuously differentiable function with a single equilibrium point at the attractor, denoted ξ∗, and we have:

lim_{t→∞} ξ(t) = ξ∗    (2)
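As a concrete illustration of such a time-invariant DS, the ODE in Eq. (1) can be rolled out by simple Euler integration. The sketch below uses a hand-picked stable linear system as a stand-in for the learned f; the names `rollout`, `xi_star` and the matrix `A` are illustrative and not from the paper:

```python
import numpy as np

def rollout(f, xi0, dt=0.01, steps=2000):
    """Euler-integrate the autonomous DS xi_dot = f(xi) from xi0."""
    xi = np.array(xi0, dtype=float)
    for _ in range(steps):
        xi = xi + dt * f(xi)
    return xi

# Hypothetical stand-in for the learned f: a stable linear DS whose
# eigenvalues have negative real part, with a single attractor xi_star.
xi_star = np.array([0.5, -0.2])
A = np.array([[-2.0, 0.0], [0.0, -3.0]])
f = lambda xi: A @ (xi - xi_star)

xi_end = rollout(f, xi0=[1.0, 1.0])
# xi_end approaches xi_star regardless of the starting point
```

Because the system is time-invariant, restarting the integration from any perturbed state simply produces a new trajectory toward the same attractor, with no re-planning step.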

We do not know f, but we are provided with a set of N demonstrations of the task in which the state vector and its velocity are recorded at particular time intervals, yielding the data set {ξ_n^t, ξ̇_n^t}, ∀t ∈ [0, T_n], n ∈ [1, N], where T_n denotes the number of data points in demonstration n. We assume that this data was generated by our function f subjected to a white Gaussian noise ǫ, and hence we have:

ξ̇ = f(ξ; θ) + ǫ (3)

Notice that f is now parameterized by the vector θ, which represents the parameters of the model we will use to estimate f.

We build a model-free estimate f̂ of the function in two steps. We first build a probability density model of the data by modeling it through a mixture of K Gaussian functions. The core assumption when representing a task as a Gaussian Mixture Model (GMM) is that each recorded point (ξ, ξ̇) from the demonstrations is a sample drawn from the joint distribution:

P(ξ, ξ̇ | θ) = Σ_{k=1}^{K} π^k N(ξ, ξ̇; θ^k)    (4)

with N(ξ; θ^k) = (1 / √((2π)^{2d} |Σ^k|)) exp(−(1/2)(ξ − µ^k)^T (Σ^k)^{−1} (ξ − µ^k)), where π^k, µ^k and Σ^k are the component weights, means and covariances of the k-th Gaussian.

Taking then the posterior mean estimate of P(ξ̇ | ξ) yields a noise-free estimate of our underlying function:

ξ̇ = Σ_{k=1}^{K} h^k(ξ) (A^k ξ + b^k)    (5)

where

A^k = Σ^k_{ξ̇ξ} (Σ^k_ξ)^{−1},   b^k = µ^k_{ξ̇} − A^k µ^k_ξ,   h^k(ξ) = π^k N(ξ; θ^k) / Σ_{i=1}^{K} π^i N(ξ; θ^i).    (6)

To ensure that the resulting function is asymptotically stable at the target, we use the Stable Estimator of Dynamical Systems (SEDS) approach; see [9] for a complete description. In short, SEDS determines the set of parameters θ that maximizes the likelihood that the demonstrations were generated by the model, under strict constraints of global asymptotic stability. Next, we explain how this basic model is exploited to ensure that the hand and fingers reach the target even when perturbed. We further show how it is extended to build an explicit coupling between hand and finger motion dynamics, to ensure robust and coordinated reach and grasp.
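The regression in Eqs. (5)-(6) is standard Gaussian Mixture Regression. A minimal sketch, with hypothetical hand-set mixture parameters in place of SEDS-optimized ones (`gmr_velocity` and all variable names are illustrative):

```python
import numpy as np

def gmr_velocity(xi, priors, means, covs):
    """Posterior mean E[xi_dot | xi] of a joint GMM over (xi, xi_dot).

    priors: (K,) component weights pi^k
    means:  (K, 2d) stacked means [mu_xi, mu_xidot]
    covs:   (K, 2d, 2d) stacked covariances over (xi, xi_dot)
    """
    d = len(xi)
    K = len(priors)
    h = np.zeros(K)
    comps = np.zeros((K, d))
    for k in range(K):
        mu_x, mu_v = means[k, :d], means[k, d:]
        S_xx = covs[k, :d, :d]          # Sigma^k_xi
        S_vx = covs[k, d:, :d]          # Sigma^k_{xidot,xi}
        # A^k and b^k as in Eq. (6)
        A = S_vx @ np.linalg.inv(S_xx)
        b = mu_v - A @ mu_x
        comps[k] = A @ xi + b
        # unnormalized responsibility h^k(xi), Gaussian marginal over xi
        diff = xi - mu_x
        h[k] = priors[k] * np.exp(-0.5 * diff @ np.linalg.inv(S_xx) @ diff) \
               / np.sqrt((2 * np.pi) ** d * np.linalg.det(S_xx))
    h /= h.sum()
    return h @ comps  # Eq. (5)

# Single-component example (hypothetical parameters): GMR reduces to A @ xi + b
priors = np.array([1.0])
means = np.zeros((1, 4))
covs = np.array([np.block([[np.eye(2), 0.5 * np.eye(2)],
                           [0.5 * np.eye(2), np.eye(2)]])])
v = gmr_velocity(np.array([1.0, 2.0]), priors, means, covs)
```

For a single component the mixture weights play no role and the regression is exactly the linear map A¹ξ + b¹; SEDS additionally constrains the π^k, µ^k, Σ^k so that the resulting field is globally asymptotically stable, which this sketch does not attempt.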


3.2. Why Coupling Reach and Grasp?

We now face the problem of extending the reaching model presented in the previous section to achieve successful hand-arm coordination when performing reach and grasp. The scheme presented in our previous work, as explained in the previous sub-section, assumes that the motion is point-to-point in a high dimensional space. As a result, the state vector ξ converges uniformly and asymptotically to the target. To perform reach-grasp tasks, such a scheme could be exploited in two ways. One could either:

1. Learn two separate and independent DS, with the end-effector pose and the finger configurations as state vectors.

2. Or learn one DS with an extended state vector consisting of the degrees of freedom of the end-effector pose as well as the finger configurations.

Learning two independent DS is not desirable, since the two sub-systems (transport and pre-shape) would then evolve independently under their respective learned dynamics. Any perturbation in hand transport would leave the two sub-systems temporally out of synchronization. This may lead to failure of the overall reach-grasp task even when both individual DS have converged to their respective goal states.

At first glance, the second option is more appealing, as one could hope to learn the correlation between hand and finger dynamics, which would ensure that the temporal constraints between the convergence of the transport and hand pre-shape motions are retained during reproduction. In practice, good modeling of such an implicit coupling in a high-dimensional system is hard to ensure. The model is only as good as the demonstrations. If one is provided with relatively few demonstrations (in PbD one targets fewer than ten demonstrations so that training remains bearable for the trainer), chances are that the correlations will be poorly rendered, especially when querying the system far away from the demonstrations. Hence, if the state of the robot is perturbed away from the demonstrated region of the state space, one cannot ensure that the two systems will remain properly synchronized. We establish this by means of a simulation experiment in Section 4.

We therefore take an intermediary approach in which two separate DS are first learned and then coupled explicitly. In the context of reach-and-grasp tasks, the two DS correspond to the hand transport (dynamics of the end-effector motion) and the hand pre-shape (dynamics of the finger joint motion). We assume that the transport process evolves independently of the fingers' motion, while the instantaneous dynamics followed by the fingers depends on the state of the hand. This results in the desired behavior, namely that the fingers reopen when the object is moved away from the target. Note that the finger-hand coupling is parameterized. We will show in the experiments that this coupling can be tuned, by changing the model parameters, to favor either "human-like" motion or fast adaptive motion to recover from quick perturbations.

3.3. Coupled Dynamical System

In the following subsections, we present the formalism behind the Coupled Dynamical System (CDS), describing how we learn the model and how we then query it during task execution. To facilitate understanding of task reproduction using the CDS model, we illustrate it with a 2D example that offers a simplified representation of the high-dimensional implementation presented in the results section.

3.3.1. CDS model

Let ξ_x ∈ ℝ³ denote the Cartesian position of the hand and ξ_f ∈ ℝ^{d_f} the joint angles of the fingers, where d_f denotes the total number of degrees of freedom of the fingers. The hand and the fingers follow separate autonomous DS with associated attractors. For convenience, we place the attractors at the origin of the frames of reference of both the hand motion and the finger motion, and hence we have ξ∗_x = 0 and ξ∗_f = 0. In other words, the hand motion is expressed in a coordinate frame attached to the object to be grasped, while the zero of the finger joint angles is placed at the joint configuration adopted by the fingers when the object is in the grasp. We assume that there is a single grasp configuration for a given object. Since the reach and grasp dynamics may vary depending on the object to be grasped, we build a separate CDS model for each object considered here. We denote by G the set of all objects for which grasping behaviors are demonstrated.

The following three joint distributions, learned as separate GMMs, combine to form the CDS model:

1. P(ξ_x, ξ̇_x | θ_x^g): encoding the dynamics of the hand transport

2. P(Ψ(ξ_x), ξ_f | θ_inf^g): encoding the joint probability distribution of the inferred state of the fingers and the current hand position

3. P(ξ_f, ξ̇_f | θ_f^g): encoding the dynamics of the finger motion

∀g ∈ G. Here, Ψ : ℝ³ ↦ ℝ denotes the coupling function, a monotonic function of ξ_x satisfying:

lim_{ξ_x→0} Ψ(ξ_x) = 0.    (7)

θ_x^g, θ_f^g and θ_inf^g denote the parameter vectors of the GMMs encoding the hand transport dynamics, the finger motion dynamics and the inference model, respectively.
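The text constrains Ψ only by monotonicity and the limit in Eq. (7); a hypothetical choice satisfying both is the Euclidean distance of the hand to the target (the name `coupling` is illustrative):

```python
import numpy as np

def coupling(xi_x):
    """Hypothetical coupling function Psi: R^3 -> R.

    Monotonic in the hand's distance to the target and vanishing at
    the attractor xi_x = 0, as required by Eq. (7).
    """
    return float(np.linalg.norm(xi_x))
```

Any such Ψ makes the fingers' inferred target depend only on how far the hand still is from the object, which is what lets a displaced target automatically re-open the fingers.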

The distributions P(ξ_x, ξ̇_x | θ_x^g) and P(ξ_f, ξ̇_f | θ_f^g), which represent estimates of the dynamics of the hand and finger motion respectively, are learned using the same procedure as described in Section 3.1. To recall, each density is modeled through a mixture of Gaussian functions. As explained in Section 3.1, in order to ensure that the resulting mixture is asymptotically stable at the attractor (here the origin of each system), we use the SEDS learning algorithm of Khansari-Zadeh and Billard [9]. Note that SEDS can only learn models where the input and output variables have the same dimensions. Since the variables of the distribution P(Ψ(ξ_x), ξ_f | θ_inf^g) do not have the same dimension, we learn this distribution through a variant of SEDS where we maximize the likelihood of the model under the constraint:

lim_{ξ_x→0} E[ξ_f | ξ_x] = 0.    (8)


[Figure 2 block diagram: the three learned GMMs P(ξ_x, ξ̇_x | θ_x^g), P(Ψ(ξ_x), ξ̃_f | θ_inf^g) and P(ξ_f, ξ̇_f | θ_f^g) drive the loop ξ̇_x = E[P(ξ̇_x | ξ_x)]; ξ_x ← ξ_x + ξ̇_x∆t; ξ̃_f = E[P(ξ_f | Ψ(ξ_x))]; ξ̇_f = E[P(ξ̇_f | (ξ_f − ξ̃_f))]; ξ_f ← ξ_f + ξ̇_f∆t, linking the ARM and HAND sub-systems.]

Figure 2: Task execution using the CDS model. The blue region shows the three Gaussian Mixture Models which form the full CDS model. The green region shows the sub-system controlling the dynamics of hand transport. The magenta region shows the sub-system controlling finger motion, which is influenced by the state of the hand transport sub-system. Coupling is ensured by passing selective state information in the form of Ψ(ξ_x), as shown in red.

3.3.2. Reproduction

While reproducing the task, the model works in three phases: update hand position → infer finger joints → increment finger joints. The palm position is updated independently at every time step, and its current value is used to modulate the dynamics of the finger motion through the coupling mechanism. Figure 2 shows this flow of information across the sub-systems and the robot. Such a scheme is desirable since it ensures that any perturbation is reflected appropriately in both sub-systems.

The process starts by generating a velocity command for the hand transport sub-system and incrementing its state by one time step. Ψ(ξ_x) transforms the current state, which is fed to the inference model; the inference model calculates the desired state of the finger joint angles by conditioning the learned joint distribution. The velocity that drives the finger joints from their current state to the inferred (desired) state is generated by Gaussian Mixture Regression (GMR), conditioned on the error between the two. The fingers reach a new state, and the cycle is repeated until convergence. Algorithm 1 gives the complete reproduction process in pseudo-code. Note that the coupling function Ψ(ξ_x) also acts as a phase variable which updates itself at each time step and, in the event of a perturbation, commands the fingers to re-adjust so as to maintain the same correlations between the sub-system states as learned from the demonstrations. Two other parameters governing the coupled behavior are the scalars α, β > 0. Qualitatively speaking, they respectively control the speed and amplitude of the

10

Page 11: Coupleddynamicalsystembasedarm-handgraspingmodelfor …first-mm.informatik.uni-freiburg.de/files/ashwini12ras.pdf · 2012. 7. 18. · Emailaddress: ashwini.shukla@epfl.ch(Ashwini

Algorithm 1 Task Execution using CDS

Input: ξx(0); ξf(0); θgx; θ

ginf ; θ

gf ; α; β; ∆t; ǫ

Set t = 0repeat:

if perturbation then

update g ∈ Gend if

Update Hand Position: ξ̇x(t) ∼ P(

ξ̇x|ξx; θgx

)

ξx(t+ 1) = ξx(t) + ξ̇x(t)∆t

Infer Finger Joints: ξ̃f(t) ∼ P(

ξf |Ψ (ξx) ; θginf

)

Update Finger Joints: ξ̇f(t) ∼ P(

ξ̇f |β(

ξf − ξ̃f

)

; θgf

)

ξf(t + 1) = ξf(t) + αξ̇f(t)∆t

t← t+ 1

until: Convergence(

‖ξ̇f(t)‖ < ǫ and ‖ξ̇x(t)‖ < ǫ)

robot’s reaction under perturbations.As described in Section 3.1, learning using SEDS ensures that the model for reaching

and the model for grasping are both stable at their respective attractors. We now, however, need to verify that when one combines the two models using CDS, the resulting model is stable at the same attractors.
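As a concrete illustration, the reproduction loop of Algorithm 1 can be sketched in code. The sketch below is a minimal, hypothetical instantiation: each of the three learned densities is reduced to a single-component Gaussian with hand-picked correlations (the real models are multi-component GMMs learned from demonstrations), the state is one-dimensional, and Ψ is taken as the distance to a target placed at the origin.

```python
import numpy as np

def gaussian_pdf(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

def gmr(x, priors, means, covs):
    """Gaussian Mixture Regression: E[y | x] for a GMM over pairs (x, y)."""
    h = np.array([p * gaussian_pdf(x, m[0], c[0, 0])
                  for p, m, c in zip(priors, means, covs)])
    h /= h.sum()
    cond = [m[1] + c[1, 0] / c[0, 0] * (x - m[0]) for m, c in zip(means, covs)]
    return float(np.dot(h, cond))

# Hypothetical single-component stand-ins for the three learned joint densities.
model_x   = ([1.0], [np.zeros(2)], [np.array([[1.0, -0.9], [-0.9, 1.0]])])  # P(xi_x, dxi_x)
model_inf = ([1.0], [np.zeros(2)], [np.array([[1.0, -0.9], [-0.9, 1.0]])])  # P(Psi(xi_x), xi_f)
model_f   = ([1.0], [np.zeros(2)], [np.array([[1.0, -0.9], [-0.9, 1.0]])])  # P(err, dxi_f)

def cds_reproduce(xi_x, xi_f, alpha=1.0, beta=1.0, dt=0.01, eps=1e-4, max_steps=5000):
    psi = abs  # coupling function: distance of the hand to the target (at the origin)
    for _ in range(max_steps):
        dxi_x = gmr(xi_x, *model_x)                      # update hand position
        xi_x += dxi_x * dt
        xi_f_des = gmr(psi(xi_x), *model_inf)            # infer finger joints
        dxi_f = gmr(beta * (xi_f - xi_f_des), *model_f)  # increment finger joints
        xi_f += alpha * dxi_f * dt
        if abs(dxi_x) < eps and abs(dxi_f) < eps:        # convergence check
            break
    return xi_x, xi_f

x_end, f_end = cds_reproduce(-0.15, -0.2)
```

Both states converge to their attractors at the origin; a perturbation (e.g. shifting ξx away from the target) automatically drives ξf back towards the value the inference model associates with the new hand-target distance.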

Definition. A CDS model is globally asymptotically stable at the attractors ξx*, ξf* if, starting from any given initial conditions ξx(0), ξf(0) and coupling parameters α, β ∈ R, the following conditions hold:

lim_{t→∞} ξx(t) = ξx*   (9a)

lim_{t→∞} ξf(t) = ξf*   (9b)

Such a property is fundamental to ensure that the CDS model will result in a reach and grasp motion terminating at the desired target. Most importantly, showing that the attractors for hand and fingers are also globally asymptotically stable will ensure that this model benefits from the same robustness to perturbation as described for the simple reaching model in Section 3.1. See Appendix A for the proof of stability.

3.3.3. Minimal Example

To establish an intuitive understanding, we instantiate the CDS model as a 2D representative example of actual high-dimensional reach-grasp tasks. We consider the 1-D Cartesian position ξx of the end-effector and 1 finger joint angle ξf, both expressed with respect to their respective goal states so that they converge to the origin. In this way, the full-fledged grasping task is just a higher dimensional version of this case, obtained by considering the 3-dimensional

Figure 3: GMMs which combine to form the CDS model for the 2D example. (a) shows the human demonstrations; the large number of datapoints around the end of the trajectories depicts very small velocities. (b) shows the GMM encoding the velocity distribution conditioned on the position of the reaching motion, P(ξ̇x | ξx). (c) shows the GMM encoding the desired value of ξf (i.e. ξ̃f) given the current value of ξx as seen during the demonstrations, P(ξf | ξx). (d) shows the GMM encoding the dynamic model for the finger pre-shape, P(ξ̇f | ξf).

Cartesian position instead of ξx and all joint angles (or eigen-grasps) of the fingers instead of ξf.

Under the given setting, typical demonstrations of the reach-grasp task are as shown in Figure 3(a), where the reaching motion converges slightly faster than the finger curl. We extract the velocity information at each recorded point by finite differencing and build the following models from the resulting data: P(ξx, ξ̇x | θx), P(ξx, ξf | θinf) and P(ξf, ξ̇f | θf).
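This model-building step can be sketched as follows. As a simplification, each joint density is fitted here with a single Gaussian (mean and covariance) rather than a full mixture, and the demonstration trajectories are synthetic exponentials standing in for recorded data.

```python
import numpy as np

# Synthetic stand-in demonstration: hand and finger converge exponentially to the goal.
dt = 1.0 / 199
t = np.linspace(0.0, 1.0, 200)
xi_x = -0.15 * np.exp(-3.0 * t)          # hand position relative to the target
xi_f = -0.20 * np.exp(-4.0 * t)          # finger joint angle relative to the goal

# Velocities obtained by finite differencing of the recorded points.
dxi_x = np.gradient(xi_x, dt)
dxi_f = np.gradient(xi_f, dt)

def fit_joint_gaussian(a, b):
    """Fit a single Gaussian over the joint samples (a, b)."""
    data = np.vstack([a, b])
    return data.mean(axis=1), np.cov(data)

mean_x,   cov_x   = fit_joint_gaussian(xi_x, dxi_x)   # models P(xi_x, dxi_x)
mean_inf, cov_inf = fit_joint_gaussian(xi_x, xi_f)    # models P(xi_x, xi_f)
mean_f,   cov_f   = fit_joint_gaussian(xi_f, dxi_f)   # models P(xi_f, dxi_f)

# The regression slope of dxi_x on xi_x recovers the demonstrated dynamics.
slope = cov_x[1, 0] / cov_x[0, 0]
```

Here `slope` comes out close to -3, the decay rate used to generate the synthetic demonstration; with real data one would fit a multi-component GMM (e.g. by EM) instead of a single Gaussian.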

The resulting mixtures for each of the models are shown in Figure 3. For reproducing the task, instead of using the earlier approach of [8], where the system evolves under the velocities computed as E[(ξ̇x; ξ̇f) | (ξx; ξf)], we proceed as in Algorithm 1. Figure 4 shows the reproduction of the task in the (ξx, ξf) space overlaid on the demonstrations. It clearly shows that a

Figure 4: Reproducing the task under the CDS model. The reproduction (dashed) is overlaid on the demonstrations for reference. The model is run for different α values and the flow of the state values in time is depicted by arrows. It is evident that the model tries to track the desired ξf values (blue, with 3σ variance) at the current ξx by reversing the velocity in the ξf direction. The tracking is more stringent for larger α.

Figure 5: Variation of the obtained trajectories with α and β. The vertical red line shows the instant of perturbation, when the target is suddenly pushed away along the positive ξx direction. Negative velocities are generated in ξf in order to track ξ̃f. The speed of retraction is proportional to α (left) and the amplitude is proportional to β (right).

perturbation in ξx creates an effect in ξf, i.e., it generates a negative velocity, the magnitude of which is tunable via the α parameter. This change arises from the need to track the inferred ξf values, i.e. ξ̃f, at all ξx. ξ̃f represents the expected value of ξf given ξx as seen during the demonstrations. The variation of the trajectories of ξf with α and β is shown in Figure 5. α modulates the speed with which the reaction to perturbation occurs. On the other hand, a high value of β increases the amplitude of reopening. Figure 6 shows the streamlines of this system for two different α values in order to visualize the global behavior of trajectories evolving under the CDS model.
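This two-phase reaction can be reproduced with a deliberately simplified, linear stand-in for the learned models (all gains below are hypothetical, not values from the paper): the hand relaxes towards the target, the inferred finger value ξ̃f is a linear function of the hand-target distance, and the fingers track ξ̃f.

```python
k_x, c, alpha, beta, dt = 3.0, 0.9, 1.0, 1.0, 0.01   # hypothetical gains
xi_x, xi_f = -0.15, -0.14                            # states relative to the goal
log = []
for t in range(1200):
    if t == 400:                  # perturbation: target pushed away along xi_x
        xi_x -= 0.10
    xi_x += -k_x * xi_x * dt      # hand transport relaxes towards the target
    xi_f_des = -c * abs(xi_x)     # inferred finger value at the current distance
    dxi_f = -beta * (xi_f - xi_f_des)
    xi_f += alpha * dxi_f * dt    # fingers track the inferred value
    log.append((xi_f, dxi_f))
```

Right after the perturbation the finger velocity turns negative (the reopening phase); as the hand closes the new distance, ξ̃f rises back to zero and the fingers close again. Larger α speeds up this reaction and larger β amplifies it.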

At this point, it is important to distinguish our approach from the single GMM approach

Figure 6: Change in α affecting the nature of the streamlines, shown for α = 1 and α = 5 together with the goal and the demonstration envelope. A larger α tends to bring the system more quickly towards the (ξx, ξf) locations seen during the demonstrations.

Figure 7: Task reproduction with explicit and implicit coupling shown in (a) state space and (b) time variation. Dotted lines show the implicitly coupled task execution. Note the difference in the directions from which convergence occurs in the two cases. In the explicitly coupled execution, convergence is faster in ξx than in ξf.

of [8] mentioned in Section 3.2. Figure 7 compares the CDS trajectories with those obtained using the single GMM approach, where the coupling is only implicit. It shows the behavior when a perturbation is introduced only on the abscissa. Clearly, in the implicitly coupled case, the perturbation is not appropriately transferred to the unperturbed dimension ξf, and the motion in that space remains unchanged. This behavior can differ significantly depending on the state of the two sub-systems just after the perturbation.

To investigate this, we initialize the single GMM model as well as the CDS model at different points in state space and follow the two trajectories. Figure 8 shows this experiment. Notice the sharp difference in the trajectories: the CDS trajectories try to maintain the correlation between the state space variables and always converge from within the demonstration envelope. On the other hand, the trajectories of the single GMM approach have no

Figure 8: Comparison between the motions obtained by the single GMM (GMR, implicit coupling) and CDS (explicit coupling) approaches, shown against the demonstration envelope. Note how the order of convergence can be remarkably different when starting at different positions in state space.

definite convergence constraint². This difference is particularly important in the context of reach-grasp tasks. If the trajectories converge from the top of the envelope, the variable ξf (fingers) converges faster than ξx (hand position). This translates to premature finger closure as compared to what was seen during the demonstrations. If they converge from below, the fingers close later than what was seen during the demonstrations. While the former is undesirable in any reach-grasp task, the latter is undesirable only in the case of moving/falling objects.

4. Experiments and Results

A core assumption of our approach is that human motor control exploits an explicit coupling between hand and finger motions. In this section, we first validate this hypothesis by reporting on a simple motion study conducted with five subjects performing a reach and grasp task under perturbations. Data from human motion are used in three capacities: a) to confirm that the CDS model captures well the coupling across hand and fingers found in human data; b) as demonstration data to build the probability density functions of the CDS model; c) to identify relationships across the variables of the system and to use these to instantiate the two free parameters (α and β) of the CDS model.

In the second part of this section, we perform various experiments in simulation and on the real iCub robot to validate the CDS model as a means of ensuring robust control of reach and grasp in robots. In particular, we show that the CDS model is indeed well suited to handle fast perturbations, which typically require re-planning and are difficult to handle online. Videos for all robot experiments and simulations are cross-linked to the corresponding figures.

²It is also worth mentioning that the two trajectories are fairly similar when initialized close to the demonstration envelope.


4.1. Instantiation of CDS variables

In all the experiments presented here, including those on human data, the state of our system is composed of the Cartesian position and orientation of the end-effector (human/robot wrist) and of the following 6 finger joint angles:

• 1 for curl of the thumb.

• 2 for index finger proximal and distal joints.

• 2 for middle finger proximal and distal joints.

• 1 for combined curl of ring and little finger.

We use the 2-norm for the coupling function, i.e. Ψ(·) = ‖·‖, in the CDS implementation both for modeling human data and for robot control. As a result, the fingers reopen and close as a function of the distance of the hand to the target.

Since CDS controls the hand displacement, we use the Moore-Penrose inverse kinematics function to convert the end-effector pose to joint angles of the arm. In simulation and on the real iCub robot, we control the 7 degrees of freedom (DOFs) of the arm and 6 finger joints at an update rate of 20 ms.
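A Moore-Penrose (pseudoinverse) based inverse kinematics step can be sketched as below. The 2-link planar arm, its Jacobian and the gain are illustrative stand-ins, not the iCub's kinematics; the pattern, iterating q ← q + J⁺(q)·Δx, is the generic one.

```python
import numpy as np

def fk(q):
    """Forward kinematics of a toy 2-link planar arm with unit link lengths."""
    return np.array([np.cos(q[0]) + np.cos(q[0] + q[1]),
                     np.sin(q[0]) + np.sin(q[0] + q[1])])

def jacobian(q):
    s1, s12 = np.sin(q[0]), np.sin(q[0] + q[1])
    c1, c12 = np.cos(q[0]), np.cos(q[0] + q[1])
    return np.array([[-s1 - s12, -s12],
                     [ c1 + c12,  c12]])

def ik(q, target, gain=0.2, steps=500):
    """Differential IK: map the task-space error through the pseudoinverse."""
    for _ in range(steps):
        err = target - fk(q)
        q = q + np.linalg.pinv(jacobian(q)) @ (gain * err)
    return q

target = np.array([1.0, 1.0])
q = ik(np.array([0.3, 0.5]), target)
```

On the robot this step runs once per 20 ms control cycle, with the desired end-effector displacement produced by the CDS hand transport sub-system.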

4.2. Validation against human data

As discussed in Section 2.2, many physiological studies have reported a natural coordination between arm and fingers when humans reach for objects. In order to assess these observations quantitatively and provide data in support of our model of a coupling between the two processes of hand transport and finger motion, we performed experiments with human subjects performing reach-to-grasp tasks under fast and random perturbations.

4.2.1. Experimental Procedure

The experimental setup is shown in Figure 1(a). The subject stands in front of two stationary targets, a green and a red ball. An on-screen target selector prompts the subject to reach for and grasp one of the two balls, depending on the color shown on the screen. To start the experiment, one of the balls is switched on and the subject starts to reach towards the corresponding object. As the subject is moving his hand and preshaping his fingers to reach for the target ball, a perturbation is created by abruptly switching off the target ball and lighting up the second ball. The switch across targets occurs only once during each trial, about 1 to 1.5 sec after the onset of the motion. By then, the subject's hand has usually traveled more than half the distance separating it from the target. The trial stops once the subject has successfully grasped the second target.

To ensure that we observe a natural response to such perturbations, subjects were instructed to proceed at their own pace and no timing for the overall motion was enforced. As a result, the time it took to complete the motion varied across subjects and across trials. Since encoding in the CDS model is time-invariant, modeling is not affected by these changes in task duration.

Figure 9: Hand-finger coordination during perturbed and unperturbed demonstrations, plotting the index finger proximal joint (rad) against the hand distance from the target (m).

We recorded the kinematics of the hand, fingers and arm motions of 5 subjects across 20 trials. 10 of the trials were unperturbed, i.e. the target was not switched during the motion. Subjects did the 20 trials in a single session, with unperturbed and perturbed trials presented in random order for each subject. The arm and hand motion was recorded using three XSens™ IMU motion sensors attached to the upper arm, forearm and wrist of the subject at a frame rate of 20 ms³. The fingers' motion was recorded using a 5DT™ data glove. Angular displacements of the arm and finger joints were reconstructed and mapped to the iCub's arm and finger joint angles. To verify visually that the correspondence between human motion and robot motion is correct, the iCub simulator ran simultaneously while the human performed the trials.

Data from the 10 unperturbed trials of the 5 subjects are used to train the 3 GMMs which serve as the basis for the CDS model. The next section discusses how well the CDS model renders human behavior under perturbations.

4.2.2. Qualitative analysis of human motion

Visual inspection of the human data confirms a steady coupling between hand transport and finger closing in the unperturbed situation, whereby the fingers close faster as the hand approaches the target faster, and conversely. This coupling persists across trials and for all subjects. Most interesting is the observation that, during perturbed trials, just after the target is switched, the fingers first reopen and then close again synchronously with the hand as it moves toward the new target, see Figure 1(b). Note that, in all trials, the fingers re-opened irrespective of whether the aperture of the fingers at the time of perturbation was large enough to accommodate the object. This suggests that this reaction

³To compensate for drifts in the IMU and data glove measurements, subjects were instructed to perform a brief calibration procedure after each trial. This procedure lasted no more than 5 sec.

Figure 10: (a) Data recorded from the perturbed human demonstrations, plotting ξf(1) against Ψ(ξx) together with the 2σ variance envelope of the unperturbed data. The adaptation behavior under perturbation follows the same correlations between hand position and fingers as the unperturbed behavior. The region where the perturbation is handled is indicated in red and zoomed in (b), where 3 different demonstrations (red, blue and magenta) from the same subject are shown.

to perturbation is not driven by the need to accommodate the object within the grasp, but may be the result of some inherent property of finger-hand motor control. It appears as if the fingers first "reset" to the location that corresponds to the expected finger posture given the new hand-target distance. Once reset, fingers and hand resume their usual coupled hand-finger dynamics. Figure 9 shows this typical two-phase motion after perturbation, plotting the displacement of the proximal joint of the index finger against the distance of the hand to the target ball.

The CDS model, using the distance of the hand to the target for the coupling function Ψ(ξx), gives a very good account of this two-phase response and is shown overlaid on the data, see Figure 10(a). Observe further that the trajectories followed by the fingers after perturbations remain within the covariance envelope of the model. This envelope represents the variability of finger motion observed during the unperturbed trials. This hence confirms the hypothesis that the fingers resume their unperturbed motion model shortly after responding to the perturbation. This is particularly visible in Figure 10(b), zoomed in on the part of the trajectories during and just after the perturbation. Three different demonstrations are shown. It can be seen that, irrespective of the state of the fingers ξf at the time of perturbation, the finger trajectories tend to follow the mean of the regressive model (which is representative of the mean of the trajectories followed by the human fingers during the unperturbed trials) before the perturbation occurs. Just after the perturbation, the fingers re-open (trajectory goes down) and then close again (trajectory goes up).

4.2.3. Modeling human motion

The previous discussion established that the CDS model gives a good account of the qualitative behavior of the fingers' motion after perturbation. We here discuss how, by tuning the two open parameters of the CDS model, namely α and β (see Table 1), we can

Figure 11: Correlations deduced from the experiments. (a) shows the linear correlation found between the mean hand velocity prior to perturbation and the amplitude of finger reopening. (b) shows the same between the speed and amplitude of finger reopening (Subjects 1-3). (c) compares the trajectories obtained using the inferred parameters with the actual human demonstration. The finger motion obtained using the optimal parameter values is also shown in black (dashed).

better reproduce individual trajectories of the fingers for a particular trial and subject. As illustrated in Section 3.3, these two parameters control, respectively, the speed and amplitude of the reopening motion of the fingers after perturbation. Although these parameters can be set arbitrarily in our model, a closer analysis of human data during the perturbed trials shows that one can estimate them by observing the evolution of the hand motion prior to perturbation. When plotting the average velocity of the hand prior to perturbation against the amplitude of finger reopening, we see that the two quantities are linearly correlated, see Figure 11(a). Similarly, when plotting the velocity at which the fingers reopen against the amplitude of reopening, we see that these two quantities are also linearly correlated. In other words, the faster the hand moves towards the target, the less the fingers reopen upon perturbation. Further, the faster the fingers reopen, the larger the amplitude of the reopening. Note that while there is a correlation, it is subject dependent. To reproduce human data for a particular trial with CDS, we can hence use the above two observations combined with the fact that α and β control the speed and amplitude of the fingers' motion.
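Estimating such a subject-specific linear relation is a one-line least-squares fit. The numbers below are synthetic stand-ins for one subject's perturbed trials (slope, intercept and noise level are hypothetical); only the procedure matters.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical per-trial measurements for one subject:
v = rng.uniform(0.002, 0.005, 30)                    # mean hand velocity before perturbation
amp = -60.0 * v + 0.5 + rng.normal(0.0, 0.01, 30)    # amplitude of finger reopening (rad)

slope, intercept = np.polyfit(v, amp, 1)             # subject-specific linear correlation

def predict_reopening_amplitude(v_obs):
    """Predict the reopening amplitude from the hand velocity seen before perturbation."""
    return slope * v_obs + intercept
```

Combined with the second correlation (reopening velocity vs. amplitude), such fits provide the observations from which α and β can be instantiated before the perturbation response has to be produced.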

Figure 12: Effect of changing the parameters α and β on the error (rad) between the model run and the mean human demonstration: (a) variation with α (at β = 1), (b) variation with β (at α = 5), (c) contour plot of the error over (α, β).

Figure 11(c) shows that this results in a good qualitative fit of the motion after perturbation. 3 perturbed trials chosen from subject 1 are shown; similar plots for the other subjects can be found in Appendix B. We analyze the quality of the fit by comparing it to the motion obtained from the model with optimal parameter values. We find the optimal values of α and β for a particular demonstration by performing a grid search and optimizing the fit between the model-generated and demonstrated motion⁴. The fit is evaluated using the absolute error between the joint angle values (radians), summed over a time window from the instant of perturbation until the end of the demonstration. Figure 12 shows the variation of this error term with α and β. It can be seen that the error first decreases and then increases with progressively increasing α and β. The contours of the error function under variation of α and β are shown in Figure 12(c).
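The grid search can be sketched as follows. `simulate` stands in for a run of the CDS model with given (α, β) and is here a toy exponential so that the snippet is self-contained; the error measure is the summed absolute joint-angle error described above.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 50)

def simulate(alpha, beta):
    """Toy stand-in for a CDS model run after the perturbation."""
    return -beta * np.exp(-alpha * t)

def fit_error(trial, demo):
    """Absolute joint-angle error summed from the perturbation to the end."""
    return float(np.abs(np.asarray(trial) - np.asarray(demo)).sum())

def grid_search(demo, alphas, betas):
    best = min(((fit_error(simulate(a, b), demo), a, b)
                for a in alphas for b in betas))
    return best[1], best[2]

demo = simulate(4.0, 1.5)   # pretend this is a recorded demonstration
alpha_opt, beta_opt = grid_search(demo, [2.0, 3.0, 4.0, 5.0, 6.0], [1.0, 1.5, 2.0])
```

As the footnote notes, the estimate is only accurate up to the grid spacing; a finer grid (or a local refinement around the best cell) trades computation for accuracy.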

Note that the aim of this analysis is not to find the optimal parameters, but to give the reader an idea of how good the fit obtained from a brute-force grid search is as compared

⁴This estimate of the optimal α and β is only accurate up to the width of the grid chosen.

Figure 13: Reach-grasp task executions without explicit coupling, (a) {ξ̇x; ξ̇f} = f({ξx; ξf}), and with explicit coupling, (b) ξ̇x = f(ξx), ξ̇f = g(ξf, Ψ(ξx)). The explicitly coupled execution (b) prevents premature finger closure, ensuring that under any amount of perturbation the formation of the grasp is delayed until it is safe to complete. In the implicitly coupled execution (a), the fingers close early and the grasp fails. (c) shows a closeup of the hand motion after the perturbation with implicit (left) and explicit (right) coupling.

to what we can infer prior to the perturbation. Further, the discrepancy between the model run with inferred parameters and the actual data is only due to the noise in the linear correlation.

It is important to emphasize that the CDS model is built using data from the unperturbed trials, while the parameters α and β are inferred from the perturbed trials. The model represents a generic pattern of coupled finger-hand dynamics that is present across subjects and trials but is not subject specific. Further, the estimation of the parameters α and β is based on an observation of a coupling across variables in a single subject, and is not fitted for a particular trial. The CDS model hence encapsulates general patterns of finger-hand motions inherent to human motor control. We discuss next how such human-like dynamics of motion can be used for robust control of hand-finger motion for successful grasping under perturbations.

4.3. Validation of the model for robot control

We here test the performance of CDS for robust control of reach and grasp motions on the iCub robot. We first show, using the iCub simulator, that the approach presented in this work is decidedly better than our previous approach of learning the task dynamics using only one dynamical system: it ensures successful task completion under spatial perturbations of the target where the previous approach fails. We also investigate the ability of the CDS model to react quickly to counter fast perturbations (even when not demonstrated


a priori). Finally, we conduct experiments on the iCub robot to validate the ability of the model to adapt reach and grasp motions on-the-fly under various forms of perturbations.

4.3.1. Comparison with single DS approach

In Section 3.3 we discussed the fact that the "naive" approach, in which one would learn the hand-finger coupling by using a single GMM (the state vector in this case comprises the hand position and finger joints), would likely fail at encapsulating explicitly the correlation between the two dynamics. We had then advocated the use of an explicit coupling function to couple the dynamics of finger and hand motion, each learned through a separate GMM, leading to the CDS model. We here illustrate this in simulation by reproducing the human experiment. The iCub robot first reaches out for the green ball. Midway through the motion, the target is switched and the robot must instead reach for the red ball.

Figure 13(a) shows that using a single GMM for the hand and finger dynamics fails at properly embedding the correlations between the reach and grasp sub-systems and does not adapt the fingers' motion well to grasp the new ball target. The lack of an explicit coupling leads to poor coordination between finger and hand motion. As a result, the fingers close too early, causing the ball to fall. Figure 13(b) shows the same task when performed using the CDS model. The fingers first reopen following the perturbation, hence delaying the grasp formation, and then close according to the correlations learned during the demonstrations, leading to a successful grasp. Figure 13(c) shows the hand from a top view, where the re-opening of the fingers can be seen clearly in the explicitly coupled task.

4.3.2. Adaptability to fast perturbations

An important aspect of encoding motion using autonomous dynamical systems is that it offers great resilience to perturbations. We here show that this provides robust control in the face of very rapid perturbations.

In Section 4.2, we showed that the two free parameters α and β of the CDS model can be inferred from human data, and we emphasized their role in controlling the speed and amplitude of finger reopening; to recall, the larger α, the faster the motion. Executing the task with values for α that differ from those set from human data may be interesting for robot control for two reasons: a) as robots can move much faster than humans, using larger values for α can exploit the robot's faster reaction times while retaining the coupling between finger and hand motion found in human data; b) using values of α that depart from those inferred from human demonstration may allow better responses to perturbations that send the system to areas of the state space not seen during demonstrations.

Figure 14(a) illustrates the role that α plays in controlling the reaction time. As expected, the time it takes for the fingers to adapt to the perturbation, ∆Tadapt, decreases with increasing α. ∆Tadapt corresponds to the time elapsed between the onset of the perturbation and the time when the finger position rejoins the original desired position, i.e. the position the fingers would have had in the unperturbed case. Figure 14(b) plots α against ∆Tadapt. Recovery times, i.e. the time it takes for the robot to completely recover from the perturbation and successfully reach the target position, can be significantly reduced by increasing α.
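The inverse relation between α and ∆Tadapt can be illustrated on a simplified finger dynamics (a toy first-order tracker; the gains and threshold are hypothetical): after a perturbation leaves the finger off its desired value, the time needed to rejoin it shrinks as α grows.

```python
def adaptation_time(alpha, beta=1.0, dt=0.01, max_steps=4000, tol=1e-3):
    """Time for the finger state to rejoin its desired value after a perturbation."""
    xi_f, xi_f_des = -0.3, 0.0      # the perturbation left the finger off its track
    for t in range(max_steps):
        xi_f += -alpha * beta * (xi_f - xi_f_des) * dt   # finger velocity scaled by alpha
        if abs(xi_f - xi_f_des) < tol:
            return t * dt
    return max_steps * dt

t_slow, t_fast = adaptation_time(1.0), adaptation_time(5.0)
```

Running this over a grid of α values reproduces the qualitative shape of Figure 14(b): ∆Tadapt decays roughly as 1/α.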

Figure 14: (a) The speed of reaction varies with α: under a perturbation from pinch to power grasp, the adaptation is qualitatively the same but faster as α increases. (b) Variation of the recovery time ∆Tadapt with α.

It is important to emphasize once more that this trajectory "re-planning" is performed at run-time, i.e. within the 20 ms closed-loop control of the robot. There is no explicit replanning: adaptation to perturbation results from providing the CDS model with the current position of the fingers, hand and target. Importantly, this provides a smooth response that enables the robot to change its trajectory without stopping to re-plan.

We illustrate this capacity to adapt to rapid perturbations in an experiment in which the robot must not only adapt the trajectory of its hand but also switch across grasp types, see Figure 15. Due to hardware constraints on the real platform, we perform this particularly high-speed perturbation experiment in the iCub simulator. As the robot moves towards the target located on its left, the target ball suddenly disappears and reappears on the right of the robot. In contrast to our previous experiment, the ball is no longer supported against gravity and hence starts falling. To reach and grasp the object before it hits the floor, the robot has to act very quickly. This requires a fast adaptation from a palm-down to a palm-up grasp, as well as of the fingers, while the target keeps moving. To perform this task, we first trained two separate CDS models encoding two different dynamics for the power grasp, in the palm-up and palm-down configurations respectively. Learning was done using five (unperturbed) human demonstrations of this task. During reproduction, the robot initially starts moving towards the target using the CDS model for the palm-down configuration grasp. After the perturbation, the robot switches to the CDS model for the palm-up power grasp.
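Switching between the two learned models at run-time can be sketched as below. Each CDS model is abbreviated to a pair of hypothetical gains (hand relaxation rate, hand-to-finger coupling slope) so the snippet stays self-contained; the perturbation both relocates the target and changes the active model.

```python
# Two hypothetical grasp models: (hand gain, hand-to-finger coupling slope).
MODELS = {"palm_down": (3.0, 0.9), "palm_up": (3.0, 1.3)}

def run(switch_step=None, steps=1500, dt=0.01):
    """Run the toy coupled dynamics, optionally switching grasp model mid-motion."""
    active = "palm_down"
    xi_x, xi_f = -0.20, -0.15          # states relative to the (current) goal
    for t in range(steps):
        if t == switch_step:           # perturbation: new target and new grasp type
            active = "palm_up"
            xi_x -= 0.15               # target jumps away
        k, c = MODELS[active]
        xi_x += -k * xi_x * dt                       # hand transport
        xi_f += -(xi_f - (-c * abs(xi_x))) * dt      # fingers track the coupled value
    return xi_x, xi_f

end_direct = run()
end_switched = run(switch_step=400)
```

Both the unperturbed run and the run with a mid-motion switch converge to the goal; the switch merely changes which correlation the fingers maintain with the hand for the remainder of the motion.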

To ensure that the robot intercepts the falling object within its workspace, we use the approach presented in our previous work on catching flying objects by Kim et al. [50] (see footnote 5). This allows us to determine the catching point as well as the time it will take the object to reach this point (assuming here a simple free fall for the dynamics of the object). This determines the maximal value for ∆Tadapt, which we then use to set the α required to intercept the object in time.
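Under the stated free-fall assumption, this timing computation reduces to a few lines. The sketch below is illustrative: the inverse-proportional mapping from α to ∆Tadapt is an assumption anchored at the ≈ 2.8 s value from Table 1, not the curve of Figure 14(b) actually used in the paper.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def time_to_fall(drop_height_m):
    """Time for the ball to free-fall (from rest) to the predicted catch height."""
    return math.sqrt(2.0 * drop_height_m / G)

def required_alpha(t_available_s, alpha_ref=1.0, t_adapt_ref=2.8):
    """Pick alpha so the recovery fits in the available window.

    Hypothetically assumes recovery time scales as t_adapt_ref / alpha,
    anchored at a measured pair (alpha_ref, t_adapt_ref) such as Table 1's.
    """
    return alpha_ref * t_adapt_ref / t_available_s

t_catch = time_to_fall(0.6)     # e.g. the ball drops 0.6 m to the catch point
alpha = required_alpha(t_catch)  # boost factor needed to intercept in time
```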

5 In Kim et al. [50] we used a single GMM to control both the arm and hand motion. This would not allow quick adaptation to a different grasp on the fly, as shown in Section 4.3.1.


Figure 15: Fast adaptation under perturbation from palm-down to palm-up power grasp.

Tp (s)   ∆Tadapt (s)
1.2      2.82
1.5      2.77
1.8      2.76
2.1      2.80

Table 1: Variation of the time taken to recover from a perturbation with the instant of perturbation. Note that the total task duration is of the order of 4 s. The values were taken at constant α and hence do not change with the instant of perturbation. This shows the robustness of the proposed method in adapting against perturbations.

As shown in Figure 15, switching to the second CDS model ensures that replanning of the finger motion is done in coordination with the hand motion (now redirected to the falling object). Precisely, the orientation of the hand and the finger curl are changed synchronously, so that the hand closes its grasp on the falling object at the right time. Notice that as the distance-to-target suddenly increases, the CDS model forces the fingers to reopen. Subsequently, the fingers close proportionally as the distance between the falling ball and the robot hand decreases, hence maintaining the correlations seen during the demonstrations. Note that to generate this task, we also control the torso (adding two more variables to the inverse kinematics so as to increase the workspace of the robot).

4.3.3. Switching between different grasp types

Here, we perform another experiment showing the ability of our system to adapt the fingers' configuration (in addition to adapting the fingers' dynamics of motion) so as to switch between pinch and power grasps. We learn two separate CDS models, for a pinch grasp of a thin object (a screwdriver) and a power grasp of a spherical object, respectively, from five demonstrations of each task during unperturbed trials.

Figure 16 illustrates the experiment. While the robot reaches for the thin object, pre-shaping its fingers for the learned pinch grasp, we suddenly present the spherical object in the robot's field of view. The robot then redirects its hand to reach for the spherical object in place of the thin one. Since this experiment does not require a very rapid reaction time, it could be conducted on the real iCub robot. In this experiment, the two objects are color-tracked using the iCub's on-board cameras. The change of target is hardcoded: as soon as the green object is detected in the cameras, the target location is switched from the red object to the green one, and the robot's CDS model is switched accordingly.

Figure 17 shows the motion of the robot's index finger proximal joint as it adapts to the induced perturbation. During the first phase of the motion, the finger closes rapidly so as


Figure 16: Validating the model on the real iCub platform. The robot adapts between pinch and power grasps at different spatial positions in real time without any delays for re-planning. (a) and (b) show the same task from the front and top views to better visualize the motion of the fingers.

Figure 17: Motion of one finger joint angle under grasp-type perturbation, as recorded from the simulation. The inferred joint positions predicted by both grasp models (power and pinch) are shown dotted. The adaptation is smooth and robust with respect to the instant at which the perturbation was applied; the time taken to recover from the perturbation (∆Tadapt) remains constant.

to yield a pinch grasp. After the perturbation, the fingers reopen to yield the power grasp that better accommodates the spherical object. The robot smoothly switches from following the pinch-grasp model, which requires a smaller hand aperture (i.e., a larger joint value), to the power-grasp model, which requires a larger aperture (smaller joint value), by reopening the fingers and subsequently closing them on the target, thereby completing the task successfully.

While the previous section discussed the advantage of adapting the parameter α when the task requires reaction times faster than what humans can achieve, here we show that the α inferred from human data is sufficient when the required reaction times are slower. We emphasize once more the model's ability to achieve very robust behavior in the face of various perturbations. To this end, we perform two variants of the task described above in which we introduce perturbations.

First, we run the same switching task but present the spherical object at different instants after the onset of motion. The perturbation instants vary from the middle of the task duration to almost the completion of the task. Table 1 gives the recovery time and the instant of perturbation; Tp denotes the instant at which the perturbation was introduced. Since we do not change the value of the parameter α across runs, the time taken by the robot to recover from the perturbation remains the same. Figure 17 shows the resulting trajectories for the index finger. Even when the perturbation occurs shortly before completion of the pinch grasp, the model re-adapts the grasp smoothly, yielding a correct grasp of the second object.

Second, to highlight the ability of CDS to adapt continuously and to control the coordinated motion of hand and fingers on the fly on the real iCub robot, we introduce perturbations during the first part of the previous task, in which the iCub reaches with a pinch grasp (here the robot reaches for a glass of wine; see footnote 6), see Figure 18. To introduce perturbations on the fly during execution of the pinch grasp, we implement a reflex behavior using the iCub's skin touch sensors on the forearm, such that when the robot detects a touch on its forearm, it immediately moves its arm away from the point where it was touched. This reflex overlays the CDS controller. When the CDS controller takes over again, it uses the new position of the arm to predict the new finger and hand motion.

Figure 18 shows the displacement over time of the proximal and distal joints of the index finger, the hand aperture and the distance-to-target, while the robot reaches with a pinch grasp and is perturbed once on its way toward the object. The hand aperture is computed as the distance between the tips of the thumb and index fingers. As expected, as the hand is moved away from the target, the fingers reopen in agreement with the correlations learned in the CDS model and then close into the pinch grasp on the object. The finger motion robustly adapts to the perturbation, changing the hand aperture in coordination with the perturbed hand position, and finally reaches the target state for the pinch grasp with a hand aperture of ≈ 1 cm. Figure 19 shows one cycle of tactile perturbation on the real robot.

5. Conclusion and Discussion

In this paper we presented a model for encoding and reproducing different reach-to-grasp motions that can handle fast perturbations in real time. We showed that this capacity to adapt without re-planning can be used to switch across grasp types smoothly. The model was strongly inspired by the way humans adapt reach and grasp motions under perturbation. Human data were used to determine a generic coupling between the control of

6 The location of the grasping point on the glass of wine is indicated by a red patch that is detected through two external cameras running at 100 Hz.


Figure 18: Coordinated hand-arm motion while the robot was perturbed multiple times in different directions. Finger joint angles, and hence the hand aperture, change according to the learned correlations. Vertical red lines mark the instants at which the robot was perturbed.

Figure 19: Spatio-temporal perturbations created using the tactile interface of the iCub. The top row shows the motion of the robot; the bottom row shows a closeup of the hand.

hand transport and finger aperture. It was also used to determine quantitative values for this coupling.

We first showed that the model gives a good qualitative account of human reach and grasp motions during perturbation. We then showed, through both simulation and real robot experiments, that the CDS model provides a robust controller for a variety of reach and grasp motions in robots.

Importantly, we showed that, while human behavior is a good source of inspiration, one can depart from it by tuning the parameters of the model to achieve better robot performance when needed, and for tasks that humans struggle to perform. This follows a trend in programming by demonstration emphasizing that what is good for the human is not necessarily good for the robot (biological inspiration should be taken with a grain of salt).

In this work, biological inspiration pertains to our use of coupled dynamical systems to control hand motion and finger motion. We showed that introducing such an explicit coupling was advantageous over a more implicit coupling that could be learned with other density-based methods for estimating the correlations across all the variables. The two free parameters of the CDS model offer a variety of ways in which the model can be adapted to realize motions that are optimal for a particular robot platform or a particular task.

5.1. Shortcomings and Future Work

In this work, we have shown that the model parameters can be tuned either to generate “human-like” motion or to induce “un-human-like”, very fast reactions. Determining which parameters to use is task- and platform-dependent. One could, as we did here, learn how reaction times vary as a function of the open parameter α and choose the optimal α taking into consideration task- and platform-related constraints, such as minimizing jerk or choosing a time window large enough to ensure successful task completion. The advantage of such a simple and explicit parametrization of the model is that it allows the same model to be reused while adapting the speed to different tasks: e.g., a power grasp may be performed very rapidly when one must hurriedly grasp a support to prevent a fall, or very slowly when grasping a raw egg.

In this work we have assumed that there exists only one way in which the object can be grasped and that it corresponds to the way demonstrated by the human. This assumption makes it difficult to complete the grasp if the object can be grasped in multiple ways and is presented in a pose different from the one seen during the demonstrations, or when the robot does not have the same dexterity as the human. By construction, the CDS model allows a single attractor point and hence is constrained to yield a single grasp pose. We showed that one can switch across different CDS models, and one could hence consider learning a different CDS model for each possible grasp pose. A secondary mechanism, e.g. based on measuring the closest distance between the current posture and the various grasp poses, could determine which CDS model to activate and when. Our current work investigates how the CDS model could be extended to learn not just a single attractor point but an attractor surface. Indeed, most objects can be grasped at different points along their graspable surface.

This work ignored the notion of obstacles. In this paper, perturbations related either to a displacement of the target or to a displacement of the end-effector during task completion. Obstacle avoidance is certainly a major source of disturbances during reach-and-grasp tasks, and current planning techniques offer advanced and robust solutions to trajectory planning that can deal with very cluttered environments. Embedding obstacle avoidance while retaining the time-invariance property of autonomous dynamical systems, which we exploit here, would provide an interesting extension to the present model and offer an interesting alternative to planning techniques. Even the quickest planning techniques still require planning times of the order of 0.5 s [20] when considering obstacles. This is too slow when the required reaction times are of the order of only a few milliseconds (such as when reaching for a falling object, as shown in this paper). Current work in our group investigates the possibility of embedding obstacle avoidance in the scheme proposed by the CDS model.

Throughout the work presented here, we have assumed access to an efficient inverse kinematics (IK) solver, thanks to which we were able to learn the models of the task and control the robot in the operational space. This is, however, too restrictive and creates a major drawback in practice. While we advocated that the CDS model needs no replanning, we were still running a 20 ms closed loop, most of which was devoted to computing the IK; computing the output of the regressive models of CDS requires less than a couple of milliseconds. Another obvious drawback of controlling in task space is that the motion inferred by the model may not always yield a feasible solution in joint space, especially when transferring human data to robots and when a perturbation sends the motion very far from the demonstrated motion. In the experiments presented here, we did not encounter this problem, as the task was always chosen so as to remain within the center of the robot's workspace.

Lastly, the coupling assumed in our model is unidirectional: a perturbation in the reaching motion is reflected in the finger motion, but not the other way round. A typical case where the reverse coupling would be useful is when the dynamic controller for the fingers malfunctions and is too slow to follow the desired trajectory; the arm should then slow down and synchronize with the finger motion. Such a behavior, even if not biologically inspired, is desirable in the context of robotics, where controller noise is ubiquitous. Although we did not explicitly include this in our experiments, the CDS model is, in its current formulation, capable of handling bidirectional coupling. This is because the learned GMM is a joint distribution over distance-to-target and finger configuration: one can condition on either variable (in the presented scheme, we condition on distance-to-target). However, the effect of this bidirectional dependency on the stability of the overall system remains to be studied.
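For a single Gaussian component, conditioning the joint distribution in either direction is a closed-form operation. The sketch below uses made-up numbers purely to illustrate that both directions are available from the same joint model; it is not the paper's multi-component GMR.

```python
import numpy as np

# Hypothetical joint Gaussian over (d, f): d = distance-to-target (m),
# f = a finger joint angle (rad). Mean and covariance are illustrative only.
mu = np.array([0.2, 0.8])
Sigma = np.array([[0.04, 0.03],
                  [0.03, 0.09]])

def condition(mu, Sigma, known_idx, value):
    """Conditional mean of the other variable of a 2-D Gaussian,
    given that variable known_idx takes the observed value."""
    i, j = known_idx, 1 - known_idx
    return mu[j] + Sigma[j, i] / Sigma[i, i] * (value - mu[i])

f_given_d = condition(mu, Sigma, 0, 0.1)  # forward coupling: fingers from distance
d_given_f = condition(mu, Sigma, 1, 0.5)  # reverse coupling: distance from fingers
```

The forward direction is the scheme used in the paper; the reverse direction is what a bidirectional extension would exploit, with the open stability question noted above.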

Acknowledgments

This work was supported by EU Project First-MM (FP7/2007-2013) under grant agreement number 248258.

The authors thank Florent D'Halluin and Jean-Baptiste Keller for help in setting up the experiments on the iCub robot.

Appendix A. Stability of CDS model

To prove that the CDS model indeed satisfies conditions 9, we use the properties of its individual components. For simplicity, we shift all the data into the goal reference frame so that $\xi_x^* = \xi_f^* = 0$. Condition 9a holds true due to the global stability of SEDS. To investigate the stability of the coupling, we consider

$$
\lim_{t \to \infty} E\!\left[\tilde{\xi}_f \,\middle|\, \Psi(\xi_x)\right]
= E\!\left[\tilde{\xi}_f \,\middle|\, \Psi\!\left(\lim_{t \to \infty} \xi_x\right)\right]
= E\!\left[\tilde{\xi}_f \,\middle|\, \lim_{\xi_x \to 0} \Psi(\xi_x)\right] \quad \text{(by 9a)}
$$
$$
= E\!\left[\tilde{\xi}_f \,\middle|\, 0\right] \quad \text{(by 7)}
$$
$$
= 0 \quad \text{(by 8)} \tag{A.1}
$$


The model which governs the evolution of the coupled variable $\xi_f$ is given by

$$\dot{\xi}_f = E\!\left[\dot{\xi}_f \,\middle|\, \beta\left(\xi_f - \tilde{\xi}_f\right)\right].$$

Taking the limiting values and using (A.1), we get

$$\lim_{t \to \infty} \dot{\xi}_f = E\!\left[\dot{\xi}_f \,\middle|\, \beta\,\xi_f\right],$$

which is again globally asymptotically stable due to SEDS. However, as seen from Algorithm 1, the multiplier $\alpha$ boosts the velocity before incrementing the state. It is trivial to see that this does not affect the global asymptotic behavior of the model, since $\frac{A + A^T}{2}$ negative definite implies that $\alpha\,\frac{A + A^T}{2}$ is also negative definite for $\alpha > 0$. For details on why such a condition is required for global stability, the reader is referred to [9].

Appendix B. Model performance with inferred parameter values

Figure B.20: Comparison of the model run with inferred parameter values, the model run with optimal parameter values, and the actual human demonstrations under perturbation, for (a) Subject 2 and (b) Subject 3.

[1] D. Berenson, S. Srinivasa, D. Ferguson, A. Collet, J. Kuffner, Manipulation planning with workspace goal regions, in: Robotics and Automation (ICRA), IEEE International Conference on, IEEE, 2009, pp. 618–624.

[2] A. Ijspeert, J. Nakanishi, S. Schaal, Movement imitation with nonlinear dynamical systems in humanoid robots, in: Robotics and Automation (ICRA), IEEE International Conference on, volume 2, IEEE, 2002, pp. 1398–1403.

[3] A. Miller, S. Knoop, H. Christensen, P. Allen, Automatic grasp planning using shape primitives, in: Robotics and Automation (ICRA), IEEE International Conference on, volume 2, IEEE, 2003, pp. 1824–1829.

[4] G. Bone, A. Lambert, M. Edwards, Automated modeling and robotic grasping of unknown three-dimensional objects, in: Robotics and Automation (ICRA), IEEE International Conference on, IEEE, 2008, pp. 292–298.

[5] P. Pastor, H. Hoffmann, S. Schaal, Movement generation by learning from demonstration and generalization to new targets, in: Adaptive Motion of Animals and Machines (AMAM).


[6] L. Dan, E. Todorov, Hierarchical optimal control of a 7-dof arm model, in: Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), IEEE Symposium on, IEEE, 2009, pp. 50–57.

[7] D. Grimes, D. Rashid, R. Rao, Learning nonparametric models for probabilistic imitation, Advances in Neural Information Processing Systems 19 (2007) 521.

[8] E. Gribovskaya, A. Billard, Learning nonlinear multi-variate motion dynamics for real-time position and orientation control of robotic manipulators, in: Proceedings of the IEEE-RAS International Conference on Humanoid Robots, IEEE, 2009, pp. 472–477.

[9] S. Khansari-Zadeh, A. Billard, Imitation learning of globally stable non-linear point-to-point robot motions using nonlinear programming, in: Intelligent Robots and Systems (IROS), IEEE/RSJ International Conference on, IEEE, 2010, pp. 2676–2683.

[10] U. Castiello, K. Bennett, H. Chambers, Reach to grasp: the response to a simultaneous perturbation of object position and size, Experimental Brain Research 120 (1998) 31–40.

[11] M. Gentilucci, S. Chieffi, M. Scarpa, U. Castiello, Temporal coupling between transport and grasp components during prehension movements: effects of visual perturbation, Behavioural Brain Research 47 (1992) 71–82.

[12] L. Jakobson, M. Goodale, Factors affecting higher-order movement planning: a kinematic analysis of human prehension, Experimental Brain Research 86 (1991) 199–208.

[13] M. Jeannerod, The timing of natural prehension movements, Journal of Motor Behavior 16 (1984) 235.

[14] A. Billard, S. Calinon, R. Dillmann, S. Schaal, Survey: Robot programming by demonstration, Handbook of Robotics, Chapter 59, 2008.

[15] Y. Paulignan, M. Jeannerod, C. MacKenzie, R. Marteniuk, Selective perturbation of visual input during prehension movements, Experimental Brain Research 87 (1991) 407–420.

[16] U. Castiello, K. Bennett, G. Stelmach, Reach to grasp: the natural response to perturbation of object size, Experimental Brain Research 94 (1993) 163–178.

[17] M. Kawato, Internal models for motor control and trajectory planning, Current Opinion in Neurobiology 9 (1999) 718–727.

[18] D. Engstrom, J. Kelso, Coordination dynamics of the complementary nature, Gestalt Theory 30 (2008) 121.

[19] M. Christel, A. Billard, How monkeys' morphology constrains natural prehension kinematics: unconstrained conditions and simulation, Behavioural Brain Research 131 (2001) 169–184.

[20] N. Vahrenkamp, D. Berenson, T. Asfour, J. Kuffner, R. Dillmann, Humanoid motion planning for dual-arm manipulation and re-grasping tasks, in: Intelligent Robots and Systems (IROS), IEEE/RSJ International Conference on, IEEE, 2009, pp. 2464–2470.

[21] K. Harada, K. Kaneko, F. Kanehiro, Fast grasp planning for hand/arm systems based on convex model, in: Robotics and Automation (ICRA), IEEE International Conference on, 2008.

[22] J.-H. Bae, S. Arimoto, Y. Yamamoto, H. Hashiguchi, M. Sekimoto, Reaching to grasp and preshaping of multi-dofs robotic hand-arm systems using approximate configuration of objects, in: Intelligent Robots and Systems, IEEE/RSJ International Conference on, volume 14, IEEE, 2006, pp. 1605–1610.

[23] M. Gienger, M. Toussaint, C. Goerick, Task maps in humanoid robot manipulation, in: Intelligent Robots and Systems (IROS), IEEE/RSJ International Conference on, IEEE, 2008, pp. 2758–2764.

[24] K. Hsiao, P. Nangeroni, M. Huber, A. Saxena, A. Y. Ng, Reactive grasping using optical proximity sensors, in: Robotics and Automation (ICRA), IEEE International Conference on, IEEE, 2009, pp. 2098–2105.

[25] J.-P. Saut, A. Sahbani, S. El-Khoury, V. Perdereau, Dexterous manipulation planning using probabilistic roadmaps in continuous grasp subspaces, in: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, IEEE, 2007, pp. 2907–2912.

[26] A. Gasparri, G. Oliva, S. Panzieri, Path planning using a lazy spatial network PRM, in: Control and Automation (MED), 17th Mediterranean Conference on, IEEE, 2009, pp. 940–945.

[27] A. Morales, T. Asfour, P. Azad, S. Knoop, R. Dillmann, Integrated grasp planning and visual object localization for a humanoid robot with five-fingered hands, in: Intelligent Robots and Systems, IEEE/RSJ International Conference on, IEEE, 2007, pp. 5663–5668.

[28] D. Kragic, A. Miller, P. Allen, Real-time tracking meets online grasp planning, in: Robotics and Automation (ICRA), IEEE International Conference on, volume 3, IEEE, 2001, pp. 2460–2465.

[29] S. LaValle, J. Kuffner, Rapidly-exploring random trees: Progress and prospects, in: Algorithmic and Computational Robotics: New Directions: the Fourth Workshop on the Algorithmic Foundations of Robotics, AK Peters, 2001, p. 293.

[30] J. Vilaplana, J. Batlle, J. Coronado, A neural model of hand grip formation during reach to grasp, in: Systems, Man and Cybernetics, IEEE International Conference on, volume 1, IEEE, 2005, pp. 542–546.

[31] M. Gentilucci, U. Castiello, M. Corradini, M. Scarpa, C. Umilta, G. Rizzolatti, Influence of different types of grasping on the transport component of prehension movements, Neuropsychologia 29 (1991) 361–378.

[32] M. Saling, S. Mescheriakov, E. Molokanova, G. Stelmach, M. Berger, Grip reorganization during wrist transport: the influence of an altered aperture, Experimental Brain Research 108 (1996) 493–500.

[33] A. Mitz, M. Godschalk, S. Wise, Learning-dependent neuronal activity in the premotor cortex: activity during the acquisition of conditional motor associations, Journal of Neuroscience 11 (1991) 1855.

[34] R. Meulenbroek, D. Rosenbaum, C. Jansen, J. Vaughan, S. Vogt, Multijoint grasping movements, Experimental Brain Research 138 (2001) 219–234.

[35] Y. Paulignan, C. MacKenzie, R. Marteniuk, M. Jeannerod, The coupling of arm and finger movements during prehension, Experimental Brain Research 79 (1994) 431–435.

[36] J. Gülke, N. J. Wachter, T. Geyer, H. Schöll, G. Apic, M. Mentzel, Motion coordination patterns during cylinder grip analyzed with a sensor glove, The Journal of Hand Surgery 35 (2010) 797–806.

[37] S. Bae, T. J. Armstrong, A finger motion model for reach and grasp, International Journal of Industrial Ergonomics 41 (2011) 79–89.

[38] A. Ulloa, D. Bullock, A neural network simulating human reach-grasp coordination by continuous updating of vector positioning commands, Neural Networks 16 (2003) 1141–1160.

[39] B. Hoff, M. Arbib, Models of trajectory formation and temporal interaction of reach and grasp, Journal of Motor Behavior 25 (1993) 175–192.

[40] E. Oztop, M. Arbib, Schema design and implementation of the grasp-related mirror neuron system, Biological Cybernetics 87 (2002) 116–140.

[41] P. Haggard, A. Wing, Coordinated responses following mechanical perturbation of the arm during prehension, Experimental Brain Research 102 (1995) 483–494.

[42] A. Ulloa, D. Bullock, A neural circuit for coordinating reaching with grasping: autocompensating variable initial apertures, perturbations to target size, and perturbations to target orientation, in: Neural Networks (IJCNN), International Joint Conference on, volume 2, IEEE, 2002, pp. 1047–1052.

[43] D. Bullock, S. Grossberg, The VITE model: A neural command circuit for generating arm and articulator trajectories, Dynamic Patterns in Complex Systems (1988) 305–326.

[44] J. Hwang, R. Arkin, D. Kwon, Mobile robots at your fingertip: Bezier curve on-line trajectory generation for supervisory control, in: Intelligent Robots and Systems (IROS), IEEE/RSJ International Conference on, volume 2, IEEE, 2003, pp. 1444–1449.

[45] M. Keshmiri, M. Keshmiri, A. Mohebbi, Augmented online point to point trajectory planning, a new approach in catching a moving object by a manipulator, in: Control and Automation (ICCA), 8th IEEE International Conference on, IEEE, 2010, pp. 1349–1354.

[46] S. Schaal, C. Atkeson, Constructive incremental learning from only local information, Neural Computation 10 (1998) 2047–2084.

[47] S. Schaal, S. Kotosaka, D. Sternad, Nonlinear dynamical systems as movement primitives, in: Humanoids, First IEEE-RAS International Conference on Humanoid Robots, CD-Proceedings, 2000.

[48] C. Eppner, J. Sturm, M. Bennewitz, C. Stachniss, W. Burgard, Imitation learning with generalized task descriptions, in: ICRA'09, IEEE, 2009, pp. 3968–3974.


[49] D. H. Grollman, A. Billard, Donut as I do: Learning from failed demonstrations, in: International Conference on Robotics and Automation, IEEE, Shanghai, 2011.

[50] S. Kim, E. Gribovskaya, A. Billard, Learning motion dynamics to catch a moving object, in: 10th IEEE-RAS International Conference on Humanoid Robots, IEEE, 2010.

Ashwini Shukla — Ashwini Shukla is a PhD student in the Learning Algorithms and Systems Laboratory (LASA) at the Swiss Federal Institute of Technology in Lausanne (EPFL). He received his Bachelor's and Master's degrees in Mechanical Engineering (2009) from the Indian Institute of Technology Kanpur (IITK), India. His research interests focus on programming by demonstration and machine learning techniques for robot control and fast grasp planning.

Aude Billard — Aude G. Billard is Associate Professor and head of the LASA Laboratory at the School of Engineering at the Swiss Federal Institute of Technology in Lausanne (EPFL). Prior to this, she was Research Assistant Professor at the Department of Computer Sciences at the University of Southern California, where she retains an adjunct faculty position to this day. Aude Billard received a B.Sc. (1994) and an M.Sc. (1995) in Physics from EPFL, with a specialization in Particle Physics at the European Center for Nuclear Research (CERN), an M.Sc. in Knowledge-based Systems (1996) and a Ph.D. in Artificial Intelligence (1998) from the Department of Artificial Intelligence at the University of Edinburgh. Her research interests focus on machine learning tools to support robot learning through human guidance. This extends to research on complementary topics, including machine vision and its use in human-machine interaction, and computational neuroscience to develop models of learning in humans.
