Goal-oriented human locomotion: the inverse optimal control approach

with Y. Chitour, P. Mason (LSS), and F. Chittaro (Univ. Toulon)

Frederic Jean

ENSTA ParisTech, Paris (and Team GECO, INRIA Saclay)

SADCO Internal Meeting, Funchal, 24-25 January 2013

F. Jean (ENSTA ParisTech) Inverse Optimal Control and Locomotion Funchal, 2013 1 / 32

Outline

1 Inverse optimal control

2 Modelling goal oriented human locomotion

3 Analysis of Pk(L)

   General results

   Asymptotic analysis

   Stability results

4 Locomotion depends only on $\dot\theta$


Inverse optimal control

Analysis/modelling of human motor control → looking for optimality principles

Subjects under study:

Arm pointing motions (with J-P. Gauthier, B. Berret)

Saccadic motion of the eyes

Goal-oriented human locomotion

Mathematical formulation: inverse optimal control


Inverse optimal control problem

Given a dynamics $\dot X = \varphi(X,u)$ and a set $\Gamma$ of trajectories, find a cost $C(X_u)$ such that every $\gamma \in \Gamma$ is a solution of

$$\inf\{\, C(X_u) : X_u \text{ trajectory s.t. } X_u(0) = \gamma(0),\ X_u(T) = \gamma(T) \,\}.$$

(Direct) Optimal control problem

Given a dynamics $\dot X = \varphi(X,u)$, a cost $C(X_u)$ and a pair of points $X_0, X_1$, find a trajectory $X_{u^*}$ that is a solution of

$$\inf\{\, C(X_u) : X_u \text{ trajectory s.t. } X_u(0) = X_0,\ X_u(T) = X_1 \,\}.$$



Difficulties:

Γ = experimental data (noise, feedback, etc.)

Dynamical model not always known (← hierarchical optimal control)

Biological nature of both dynamical models and costs

⇒ necessity of stability (genericity) of the criterion

Ill-posed inverse problem

No general method

Validation method: a program in three steps

1 Modelling step: propose a class of optimal control problems

2 Analysis step: bring out qualitative properties of the optimal synthesis → reduce the class of problems

(using geometric control theory)

3 Comparison step: numerical methods → choice of the best-fitting L (identification)

[Mombaur-Laumond, Ajami-Gauthier-Maillot-Serres]





Modelling goal oriented human locomotion

Trajectories of the Human Locomotion

(Figure: initial configuration (x0, y0, θ0) and final configuration (x1, y1, θ1))

Goal-oriented human locomotion (Berthoz, Laumond et al.)

Initial point (x0, y0, θ0) → Final point (x1, y1, θ1)

(x, y: position; θ: orientation of the body)

QUESTIONS:

Which trajectory is experimentally the most likely?

What criterion is used to choose this trajectory?



Examples of recorded trajectories (data due to G. Arechavaleta and J-P. Laumond)



Trajectories of the Human Locomotion

HYPOTHESIS: the chosen trajectory is a solution of a minimization problem

$$\min \int L(x, y, \theta, \dot x, \dot y, \dot\theta, \dots)\, dt$$

among all “possible” trajectories joining the initial point to the final one.

→ TWO QUESTIONS:

What are the possible trajectories? Dynamical constraints?

How to choose the criterion? (inverse optimal control problem)



Dynamical Constraints

[Arechavaleta-Laumond-Hicheur-Berthoz, 2006] (= [ALHB])

↓

1st experimental observation: if the target is far enough, the velocity is perpendicular to the body

→ $\dot x \sin\theta - \dot y \cos\theta = 0$: nonholonomic constraint!

(Dubins) ⟶ $\dot x = v\cos\theta, \qquad \dot y = v\sin\theta$

$v$ = tangential velocity.



Dynamical Model

2nd experimental observation in [ALHB]:

the velocity has a positive lower bound, v ≥ a > 0,

and is a function (almost constant) of the curvature

⇒ The trajectories may be parameterized by arc-length

and the problem is geometric (the curves do not depend on $ds/dt$)

→ $v \equiv 1$.

The whole problem is invariant by rototranslations

→ L independent of (x, y, θ)

+ the initial point can be chosen as (x, y, θ)(0) = 0



Dynamical Model

Previous observations: $L = L(\dot\theta, \ddot\theta, \dots)$

Trajectories obtained through a minimization procedure

⇒ trajectories in a complete functional space, $\theta \in W^{k,p}$,

and

$L = L(\dot\theta, \dots, \theta^{(k)})$ convex w.r.t. $\theta^{(k)}$

→ the trajectories are solutions of the optimal control problem

$$\min C_L(u) = \int_0^T L(\dot\theta, \dots, \theta^{(k)})\, dt$$

among all trajectories of

$$\dot x = \cos\theta, \qquad \dot y = \sin\theta, \qquad \theta^{(k)} = u, \qquad u \in L^p,$$

s.t. $(x, y, \theta)(0) = 0$ and $(x, y, \theta)(T) = (x_1, y_1, \theta_1) := X_1$.
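The control system and cost above can be explored numerically. The following is a minimal sketch for k = 1 (so the control u is the curvature), integrating the dynamics with a fixed-step RK4 scheme and accumulating the cost along the way. The cost L(u) = 1 + u², the function names and the step count are illustrative assumptions, not choices made in the talk.

```python
import math

def running_cost(u):
    """Illustrative cost L(u) = 1 + u^2 (an assumed example, not an identified cost)."""
    return 1.0 + u * u

def rhs(state, u):
    """Right-hand side for k = 1; state = (x, y, theta, accumulated cost)."""
    _, _, theta, _ = state
    return (math.cos(theta), math.sin(theta), u, running_cost(u))

def simulate(control, T, n_steps=1000):
    """Integrate from (x, y, theta)(0) = 0 with classical RK4; returns (x, y, theta, cost)."""
    h = T / n_steps
    s = (0.0, 0.0, 0.0, 0.0)
    for i in range(n_steps):
        t = i * h
        k1 = rhs(s, control(t))
        k2 = rhs(tuple(a + 0.5 * h * b for a, b in zip(s, k1)), control(t + 0.5 * h))
        k3 = rhs(tuple(a + 0.5 * h * b for a, b in zip(s, k2)), control(t + 0.5 * h))
        k4 = rhs(tuple(a + h * b for a, b in zip(s, k3)), control(t + h))
        s = tuple(a + h / 6.0 * (p + 2 * q + 2 * r + w)
                  for a, p, q, r, w in zip(s, k1, k2, k3, k4))
    return s

# Constant curvature u = 1 over T = pi/2: a quarter of the unit circle,
# ending at (x, y, theta) = (1, 1, pi/2) with cost 2 * T = pi.
x, y, theta, cost = simulate(lambda t: 1.0, math.pi / 2)
print(x, y, theta, cost)
```

Any admissible control can be passed as `control`; comparing the returned cost over controls that reach the same target X1 gives a crude feel for the minimization in the problem above.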



The class of admissible costs

Definition

$\mathcal{L}_k$ = set of functions $L = L(\dot\theta, \dots, \theta^{(k)})$ such that:

0 is the unique minimum of $L$ (normalization $L(0) = 1$)

$L$ is smooth (at least $C^2$)

$L$ is strictly convex w.r.t. $\theta^{(k)}$

$L(\dot\theta, \dots, \theta^{(k-1)}, u) \ge C|u|^p$ for $|u| > R$

Physiologically, $k \ge 3$ is not reasonable ⇒ $k = 1$ or $k = 2$

→ The class of admissible costs is $\mathcal{L} = \mathcal{L}_1 \cup \mathcal{L}_2$



Inverse Optimal Control Problem

Given experimental data, infer a cost function $L \in \mathcal{L}_k$, $k = 1$ or $2$, such that the recorded trajectories are optimal solutions of

$$P_k(L): \qquad \min C_L(u) = \int_0^T L(\dot\theta, \dots, \theta^{(k)})\, dt$$

subject to

$$\dot x = \cos\theta, \qquad \dot y = \sin\theta, \qquad \theta^{(k)} = u,$$

with $(x, y, \theta)(0) = 0$ and $(x, y, \theta)(T) = X_1$, $T$ not fixed.

MAIN QUESTIONS

Stability of the direct problem

Asymptotic analysis (far away target)

What is the value of k?


Analysis of Pk(L)


Analysis of Pk(L) – General Results

Proposition

For every target X1, there exists an optimal trajectory of Pk(L).

Every optimal trajectory satisfies the Pontryagin Maximum Principle.

REMARKS:

The control system is controllable

The proof of existence uses standard arguments (cf. Lee & Markus)

The optimal control does not belong a priori to $L^\infty([0, T])$. But:

Lemma

For (x1, y1) far away from 0, the optimal control is uniformly bounded.

→ not necessary to put an a priori bound on the control



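CASE $L = L(\dot\theta) \in \mathcal{L}_1$

$$H = p_1\cos\theta + p_2\sin\theta + p_3\dot\theta - \nu L(\dot\theta) \equiv 0$$

No abnormal extremals → optimal trajectories are $C^\infty$ ($\nu = 1$)

Adjoint equation: $(p_1, p_2)$ are constant and, using $\partial H/\partial u = 0$, the adjoint equation can be written as an ODE:

$$\ddot\theta = G_L\big(\theta, \dot\theta; (p_1, p_2)\big), \qquad \theta(0) = 0$$

CASE $L = L(\dot\theta, \ddot\theta) \in \mathcal{L}_2$

$$H = p_1\cos\theta + p_2\sin\theta + p_3\dot\theta + p_4\ddot\theta - \nu L(\dot\theta, \ddot\theta) \equiv 0$$

No abnormal extremals → optimal trajectories are $C^\infty$ ($\nu = 1$)

Adjoint equation: $(p_1, p_2)$ are constant and

$$\theta^{(4)} = F_L\big(\theta, \dot\theta, \ddot\theta, \theta^{(3)}; (p_1, p_2)\big),$$

with initial data: $(\theta, \theta^{(3)})(0) = (0, 0)$ [transversality condition]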
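For the case k = 1 the ODE can be made explicit. The derivation below is a sketch under the usual sign convention of the Maximum Principle (adjoint equation $\dot p = -\partial H/\partial X$, control $u = \dot\theta$, $\nu = 1$). It also assumes $L''(u) > 0$, which is slightly stronger than the strict convexity required of admissible costs.

```latex
\begin{align*}
H &= p_1\cos\theta + p_2\sin\theta + p_3 u - L(u), \\
\dot p_1 &= -\partial_x H = 0, \qquad \dot p_2 = -\partial_y H = 0, \\
\dot p_3 &= -\partial_\theta H = p_1\sin\theta - p_2\cos\theta, \\
0 &= \partial_u H = p_3 - L'(u) \;\Longrightarrow\; p_3 = L'(\dot\theta), \\
\frac{d}{dt} L'(\dot\theta) &= L''(\dot\theta)\,\ddot\theta = p_1\sin\theta - p_2\cos\theta
\;\Longrightarrow\;
\ddot\theta = \frac{p_1\sin\theta - p_2\cos\theta}{L''(\dot\theta)} .
\end{align*}
```

For instance, with the illustrative cost $L(u) = 1 + u^2$ this reads $\ddot\theta = (p_1\sin\theta - p_2\cos\theta)/2$, a pendulum-type equation parameterized by $(p_1, p_2)$.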


Analysis of Pk(L) Asymptotic analysis


Analysis of P2(L)

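We present only the analysis for k = 2; the case k = 1 is similar.

Simplifying assumption: $L(\dot\theta, \ddot\theta) = 1 + \psi(\dot\theta) + \varphi(\ddot\theta)$

RECALL:

Adjoint equation ⇒ fourth-order ODE on θ parameterized by $(p_1, p_2)$:

$$\theta^{(4)} = F_L\big(\theta, \dot\theta, \ddot\theta, \theta^{(3)}; (p_1, p_2)\big),$$

with initial data: $(\theta, \theta^{(3)})(0) = (\pi/2, 0)$

$$H = p_1\cos\theta + p_2\sin\theta + p_3\dot\theta + p_4\ddot\theta - L(\dot\theta, \ddot\theta) \equiv 0$$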



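Notations: $(x_1, y_1) = \rho(\cos\alpha, \sin\alpha)$, $\Theta = (\theta, \dot\theta, \ddot\theta, \theta^{(3)})$.

(Figure: the target $(x_1, y_1)$ in the $(x, y)$ plane, with polar angle $\alpha$)

Proposition

Given $\nu > 0$, $|\Theta(t) - (\alpha, 0, 0, 0)| < \nu$ for $t \in [\tau_\nu, T - \tau_\nu]$.

Proposition

As $|(x_1, y_1)| \to \infty$, $(p_1, p_2) \sim \dfrac{(x_1, y_1)}{|(x_1, y_1)|} = (\cos\alpha, \sin\alpha)$.

$(\alpha, 0, 0, 0)$ is an equilibrium of the “limit” equation $\theta^{(4)} = F_L\Big(\Theta; \dfrac{(x_1, y_1)}{|(x_1, y_1)|}\Big)$.

Consequence: an optimal trajectory has a stable behaviour near the equilibrium $Y_{eq} = (\alpha, 0, 0, 0)$.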



Asymptotic Trajectories

The equilibrium $Y_{eq} = (\alpha, 0, 0, 0)$ is not stable: its stable manifold $W^s$ is 2-dimensional.

(Figure: phase portrait near $Y_{eq}$, with stable (s) and unstable (u) directions and the point $Z(0)$)

→ the limit optimal trajectory must be “contained” in $W^s$.

→ Close to $Y_{eq}$, the trajectory is tangent to the stable space of the linearized ODE
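The dimension of the stable space can be checked by hand on the example cost $L = 1 + \dot\theta^2 + \ddot\theta^2$. As a sketch, under the sign convention $\dot p = -\partial H/\partial X$ and with $(p_1, p_2) = (\cos\alpha, \sin\alpha)$, the condition $\partial H/\partial u = 0$ and the adjoint equations give the limit equation $\theta^{(4)} = \ddot\theta - \tfrac12\sin(\theta - \alpha)$. This explicit form is derived here for illustration and does not appear on the slides. Linearizing at $\theta = \alpha$ gives $\delta^{(4)} = \ddot\delta - \tfrac12\delta$, whose characteristic polynomial is $\lambda^4 - \lambda^2 + \tfrac12$.

```python
import cmath

# Characteristic polynomial of the linearized limit ODE (derived for the
# example cost L = 1 + theta'^2 + theta''^2, an illustrative assumption):
#     lambda^4 - lambda^2 + 1/2 = 0.
# Substituting mu = lambda^2 gives the quadratic mu^2 - mu + 1/2 = 0.
disc = cmath.sqrt(1 - 4 * 0.5)
mus = [(1 + disc) / 2, (1 - disc) / 2]

# Each mu has two square roots, +sqrt(mu) and -sqrt(mu).
eigenvalues = [sign * cmath.sqrt(mu) for mu in mus for sign in (1, -1)]

n_stable = sum(1 for lam in eigenvalues if lam.real < 0)    # attracted to Y_eq
n_unstable = sum(1 for lam in eigenvalues if lam.real > 0)  # repelled from Y_eq

for lam in eigenvalues:
    print(f"{lam.real:+.4f} {lam.imag:+.4f}i")
print("stable directions:", n_stable, "unstable directions:", n_unstable)
```

In this example the four eigenvalues form a complex quadruple with two roots on each side of the imaginary axis, which is consistent with a 2-dimensional stable manifold.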



Asymptotic Trajectories

$W^s$ contains a 1-parameter family of trajectories: we compute them numerically and select the one starting with $\theta(0) = \pi/2$ and $\theta^{(3)}(0) = 0$.

(Figure: plot with both axes ranging from −8 to 8)

We also obtain the limit value of $p(0)$ → initialization of a shooting algorithm.

We can recover $dL$ from the phase portrait of $\Theta$

(Figure: $L = 1 + \dot\theta^2 + \ddot\theta^2$)


Analysis of Pk(L) Stability results


Stability results

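We say that $L_\varepsilon \in \mathcal{L}$ converges to $L_0 \in \mathcal{L}$ if:

$|L_\varepsilon(\dot\theta, \ddot\theta) - L_0(\dot\theta, \ddot\theta)|$ or $|L_\varepsilon(\dot\theta, \ddot\theta) - L_0(\dot\theta)| \le C\varepsilon|\ddot\theta|^p$ when $L_\varepsilon \in \mathcal{L}_2$,

$|L_\varepsilon(\dot\theta) - L_0(\dot\theta)| \le C\varepsilon|\dot\theta|^p$ when $L_0$ and $L_\varepsilon \in \mathcal{L}_1$.

Notation: $\mathcal{T}(L, X_1)$ = the set of trajectories $(x, y, \theta, \dot\theta)$ s.t.

$(x, y, \theta)$ is optimal for $P_1(L)$ with final point $X_1$ if $L \in \mathcal{L}_1$;

$(x, y, \theta, \dot\theta)$ is optimal for $P_2(L)$ with final point $X_1$ if $L \in \mathcal{L}_2$.

Theorem

If $X_1^\varepsilon \to X_1$, $L_\varepsilon$ converges to $L_0$, and $(x_\varepsilon, y_\varepsilon, \theta_\varepsilon, \dot\theta_\varepsilon) \in \mathcal{T}(L_\varepsilon, X_1^\varepsilon)$, then

$$d_{\mathrm{unif}}\big((x_\varepsilon, y_\varepsilon, \theta_\varepsilon, \dot\theta_\varepsilon),\ \mathcal{T}(L_0, X_1)\big) \to 0$$

(+ uniform convergence of $\ddot\theta_\varepsilon$ and $\theta^{(3)}_\varepsilon$ if $L_\varepsilon \in \mathcal{L}_2$, under additional hypotheses).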



→ Optimal trajectories + adjoint vectors are stable under perturbations of the cost and of the target.

Theorem

The optimal synthesis of a problem $P_k(L)$ is stable under perturbations of the cost in $\mathcal{L}_1 \cup \mathcal{L}_2$ (robustness).

Consequences

Stability of the direct problem

Our modelling is compatible with the physiology

A solution “up to perturbations” is sufficient

Question

Can we choose, up to perturbations, a cost in $\mathcal{L}_1$?

(i.e. a cost that depends only on the curvature)


Locomotion depends only on $\dot\theta$


Remark on the problem P1(L):

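Up to a canonical rotation + translation Φ, the minimizers of $P_1(L)$ form a one-parameter class of curves.

→ Set Φ: a curve $(x, y, \theta)(\cdot)$ s.t. $\dot\theta = 0$ once ↦ a curve $(\bar x, \bar y, \bar\theta)(\cdot)$,

$$\begin{cases} (\bar x(\bar t), \bar y(\bar t)) = \mathrm{Rot}(-\theta(t_0)) \cdot \big[(x(t), y(t)) - (x(t_0), y(t_0))\big] \\ \bar\theta(\bar t) = \theta(t) - \theta(t_0) \end{cases} \qquad \text{where } \dot\theta(t_0) = 0 \text{ and } t = \bar t + t_0,$$

and set $\Phi_t(x, y, \theta) = (\bar x(t), \bar y(t), \bar\theta(t))$.

→ For $L \in \mathcal{L}_k$, $M_L$ = {minimizers $(x, y, \theta)$ of $P_k(L)$ s.t. $\dot\theta = 0$ once}

Proposition

If $L \in \mathcal{L}_1$, then for every fixed $t$ the set $\Phi_t(M_L)$ is a curve in $\mathbb{R}^3$.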
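The normalization Φ can be applied to sampled curves. The sketch below assumes a curve given as lists of times, positions and orientations, estimates the curvature by finite differences, locates the time t0 where it changes sign, and then rotates and translates the curve accordingly. The helper names, the linear interpolation and the synthetic test curve are assumptions made for illustration; the talk does not specify how the recorded data were processed.

```python
import math

def interp(ts, vals, t):
    """Piecewise-linear interpolation of sampled values at time t (ts increasing)."""
    for i in range(len(ts) - 1):
        if ts[i] <= t <= ts[i + 1]:
            w = (t - ts[i]) / (ts[i + 1] - ts[i])
            return (1 - w) * vals[i] + w * vals[i + 1]
    raise ValueError("t outside the sampled interval")

def inflection_time(ts, thetas):
    """Time t0 at which the finite-difference curvature d(theta)/dt changes sign."""
    mids = [0.5 * (ts[i] + ts[i + 1]) for i in range(len(ts) - 1)]
    kappas = [(thetas[i + 1] - thetas[i]) / (ts[i + 1] - ts[i])
              for i in range(len(ts) - 1)]
    for i in range(len(kappas) - 1):
        if kappas[i] == 0.0:
            return mids[i]
        if kappas[i] * kappas[i + 1] < 0.0:
            w = kappas[i] / (kappas[i] - kappas[i + 1])
            return mids[i] + w * (mids[i + 1] - mids[i])
    raise ValueError("curvature does not vanish on this curve")

def phi_t(ts, xs, ys, thetas, t_bar):
    """Normalized point (x_bar, y_bar, theta_bar) at time t_bar after the inflection."""
    t0 = inflection_time(ts, thetas)
    x0, y0, th0 = (interp(ts, v, t0) for v in (xs, ys, thetas))
    dx = interp(ts, xs, t0 + t_bar) - x0
    dy = interp(ts, ys, t0 + t_bar) - y0
    c, s = math.cos(-th0), math.sin(-th0)  # rotation by -theta(t0)
    return (c * dx - s * dy, s * dx + c * dy, interp(ts, thetas, t0 + t_bar) - th0)

# Synthetic curve whose curvature t - 1 vanishes once, at t = 1.
h = 0.01
ts = [i * h for i in range(201)]
thetas = [0.3 + 0.5 * (t - 1.0) ** 2 for t in ts]
xs, ys = [0.0], [0.0]
for i in range(200):  # midpoint rule for x' = cos(theta), y' = sin(theta)
    th_mid = 0.3 + 0.5 * (ts[i] + 0.5 * h - 1.0) ** 2
    xs.append(xs[-1] + h * math.cos(th_mid))
    ys.append(ys[-1] + h * math.sin(th_mid))

# The same curve after a rotation by beta and a translation by (a, b).
beta, a, b = 0.7, 2.0, -1.0
xs2 = [a + math.cos(beta) * x - math.sin(beta) * y for x, y in zip(xs, ys)]
ys2 = [b + math.sin(beta) * x + math.cos(beta) * y for x, y in zip(xs, ys)]
thetas2 = [th + beta for th in thetas]

p = phi_t(ts, xs, ys, thetas, 0.5)
q = phi_t(ts, xs2, ys2, thetas2, 0.5)  # should coincide with p up to rounding
print(p, q)
```

The two printed points agree because Φ removes the rotation and translation, which is what makes curves recorded for different targets comparable.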



Numerical test

Numerical test: apply the transformation $\Phi_t$ to the recorded curves. Does it give a curve?

YES!

(Figure: 3D plot with axes x, y, angle, for T = 0.25, 0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 2.00)

($\Phi_t$ applied at different times to ∼ 200 recorded trajectories)



Validity of the test

Theorem

Let $L \in \mathcal{L}_2$.

$M_L$ is a nonempty open subset of the set of minimizers of $P_2(L)$

For every fixed $t$ the set $\Phi_t(M_L)$ contains at least a surface in $\mathbb{R}^3$.

Idea of the proof:

Parameterization of minimizers by the initial adjoint vector

Continuity of the adjoint vector w.r.t. the goal

Implicit function theorem

Numerical test of $\Phi_t(M)$



Conclusion

Models with k = 1 should be sufficient to describe human locomotion

that is,

the cost is a function of the curvature of the planar trajectory (x, y)

