Goal-oriented human locomotion: the inverse optimal control approach
with Y. Chitour, P. Mason (LSS), and F. Chittaro (Univ. Toulon)
Frederic Jean
ENSTA ParisTech, Paris (and Team GECO, INRIA Saclay)
SADCO Internal Meeting, Funchal, 24-25 January 2013
F. Jean (ENSTA ParisTech) Inverse Optimal Control and Locomotion Funchal, 2013 1 / 32
Outline
1 Inverse optimal control
2 Modelling goal oriented human locomotion
3 Analysis of $P_k(L)$: General results, Asymptotic analysis, Stability results
4 Locomotion depends only on $\dot\theta$
Inverse optimal control
Analysis/modelling of human motor control → looking for optimality principles
Subjects under study:
- Arm pointing motions (with J-P. Gauthier, B. Berret)
- Saccadic motion of the eyes
- Goal oriented human locomotion
Mathematical formulation: inverse optimal control
Inverse optimal control problem
Given $\dot X = \phi(X, u)$ and a set $\Gamma$ of trajectories, find a cost $C(X_u)$ such that every $\gamma \in \Gamma$ is a solution of
$$\inf\{C(X_u) : X_u \text{ traj. s.t. } X_u(0) = \gamma(0),\ X_u(T) = \gamma(T)\}.$$
(Direct) Optimal control problem
Given a dynamics $\dot X = \phi(X, u)$, a cost $C(X_u)$ and a pair of points $X_0, X_1$, find a trajectory $X_{u^*}$ solution of
$$\inf\{C(X_u) : X_u \text{ traj. s.t. } X_u(0) = X_0,\ X_u(T) = X_1\}.$$
Difficulties:
- Γ = experimental data (noise, feedbacks, etc.)
- Dynamical model not always known (← hierarchical optimal control)
- Biological nature of both dynamical models and costs
  ⇒ necessity of stability (genericity) of the criterion
- Ill-posed inverse problem
- No general method
Validation method: a program in three steps
1 Modelling step: propose a class of optimal control problems
2 Analysis step: exhibit qualitative properties of the optimal synthesis → reduce the class of problems (using geometric control theory)
3 Comparison step: numerical methods → choice of the best fitting L (identification) [Mombaur-Laumond, Ajami-Gauthier-Maillot-Serres]
Modelling goal oriented human locomotion
Trajectories of the Human Locomotion
(Figure: a path from the initial configuration (x0, y0, θ0) to the final configuration (x1, y1, θ1).)
Goal-oriented human locomotion (Berthoz, Laumond et al.)
Initial point (x0, y0, θ0) → Final point (x1, y1, θ1)
(x, y: position; θ: orientation of the body)
QUESTIONS :
Which trajectory is experimentally the most likely?
What criterion is used to choose this trajectory?
Examples of recorded trajectories (data due to G. Arechavaleta and J-P. Laumond)
Trajectories of the Human Locomotion
HYPOTHESIS: the chosen trajectory is a solution of a minimization problem
$$\min \int L(x, y, \theta, \dot x, \dot y, \dot\theta, \dots)\, dt$$
among all “possible” trajectories joining the initial point to the final one.
→ TWO QUESTIONS:
What are the possible trajectories? Dynamical constraints?
How to choose the criterion? (inverse optimal control problem)
Dynamical Constraints
[Arechavaleta-Laumond-Hicheur-Berthoz, 2006] (= [ALHB])
↓
1st experimental observation: if the target is far enough, the velocity is perpendicular to the body
→ $\dot x \sin\theta - \dot y \cos\theta = 0$: nonholonomic constraint!
(Dubins) →
$$\dot x = v\cos\theta, \qquad \dot y = v\sin\theta,$$
where v = tangential velocity.
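As a quick consistency check (a worked step added here, not taken from the slides), substituting the Dubins-type model into the constraint shows that it holds for any tangential velocity v:

```latex
\dot x \sin\theta - \dot y \cos\theta
  = v\cos\theta\,\sin\theta - v\sin\theta\,\cos\theta
  = 0 .
```

Conversely, the constraint says that $(\dot x, \dot y)$ is parallel to $(\cos\theta, \sin\theta)$, so a scalar v with $(\dot x, \dot y) = v(\cos\theta, \sin\theta)$ always exists.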
Dynamical Model
2nd experimental observation in [ALHB]:
the velocity has a positive lower bound, v ≥ a > 0,
and is a function (almost constant) of the curvature
⇒ The trajectories may be parameterized by arc-length and the problem is geometric (the curves do not depend on $ds/dt$)
→ v ≡ 1.
The whole problem is invariant under rototranslations
→ L independent of (x, y, θ)
+ the initial point can be chosen as (x, y, θ)(0) = 0
Dynamical Model
Previous observations: $L = L(\dot\theta, \ddot\theta, \dots)$
Trajectories obtained through a minimization procedure
⇒ trajectories in a complete functional space, $\theta \in W^{k,p}$, and
$L = L(\dot\theta, \dots, \theta^{(k)})$ convex w.r.t. $\theta^{(k)}$
→ the trajectories are solutions of the optimal control problem
$$\min C_L(u) = \int_0^T L(\dot\theta, \dots, \theta^{(k)})\, dt$$
among all trajectories of
$$\dot x = \cos\theta, \qquad \dot y = \sin\theta, \qquad \theta^{(k)} = u, \qquad u \in L^p,$$
s.t. $(x, y, \theta)(0) = 0$ and $(x, y, \theta)(T) = (x_1, y_1, \theta_1) := X_1$.
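To make the control system concrete, here is a minimal Python sketch for the case k = 1. It integrates the dynamics from the origin for a given control u(t) and accumulates the cost. The function name `simulate`, the RK4 scheme, the step count and the example cost L(u) = 1 + u² are illustrative assumptions, not part of the talk.

```python
import math

def simulate(u, T, n=2000, L=lambda w: 1.0 + w * w):
    """Integrate x' = cos(theta), y' = sin(theta), theta' = u(t) from (0, 0, 0)
    with a classical RK4 scheme; return the final state and the cost integral.
    The default cost L(u) = 1 + u^2 is only an example (assumption)."""
    h = T / n
    state = [0.0, 0.0, 0.0, 0.0]  # x, y, theta, accumulated cost

    def rhs(t, s):
        w = u(t)
        return [math.cos(s[2]), math.sin(s[2]), w, L(w)]

    for i in range(n):
        t = i * h
        k1 = rhs(t, state)
        k2 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(state, k1)])
        k3 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(state, k2)])
        k4 = rhs(t + h, [a + h * b for a, b in zip(state, k3)])
        state = [a + h / 6 * (b + 2 * c + 2 * d + e)
                 for a, b, c, d, e in zip(state, k1, k2, k3, k4)]
    x, y, theta, cost = state
    return (x, y, theta), cost

# Straight line of length 2 (u = 0) and a quarter of the unit circle (u = 1).
straight, cost_straight = simulate(lambda t: 0.0, 2.0)
quarter, cost_quarter = simulate(lambda t: 1.0, math.pi / 2)
```

With u ≡ 0 the path is a straight segment whose cost equals its length; with u ≡ 1 it is an arc of the unit circle, since the curvature of an arc-length parameterized curve is the derivative of θ.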
The class of admissible costs
Definition
$\mathcal{L}_k$ = set of functions $L = L(\dot\theta, \dots, \theta^{(k)})$ such that:
- 0 is the unique minimum of L (normalization L(0) = 1)
- L is smooth (at least $C^2$)
- L is strictly convex w.r.t. $\theta^{(k)}$
- $L(\dot\theta, \dots, \theta^{(k-1)}, u) \ge C|u|^p$ for $|u| > R$
Physiologically, k ≥ 3 is not reasonable ⇒ k = 1 or k = 2
→ The class of admissible costs is $\mathcal{L} = \mathcal{L}_1 \cup \mathcal{L}_2$
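The conditions in the definition can be screened numerically for a candidate cost with k = 1. The sketch below is only a grid-based sanity check under assumed values of p, C, R and of the grid; it cannot prove membership in the class, and smoothness is not tested. The name `looks_admissible` and all its parameters are hypothetical.

```python
def looks_admissible(L, p=2.0, C=1.0, R=1.0, umax=10.0, n=400, tol=1e-9):
    """Heuristic check, on a uniform grid of [-umax, umax], of the conditions
    defining the class of costs for k = 1 (scalar argument u = theta')."""
    h = 2.0 * umax / n
    if abs(L(0.0) - 1.0) > tol:              # normalization L(0) = 1
        return False
    for i in range(n + 1):
        u = -umax + i * h
        if abs(u) > tol and L(u) <= 1.0:     # 0 must be the unique minimum
            return False
        if L(u - h) - 2.0 * L(u) + L(u + h) <= tol:   # strict convexity
            return False
        if abs(u) > R and L(u) < C * abs(u) ** p:     # growth L >= C|u|^p
            return False
    return True

quadratic_ok = looks_admissible(lambda u: 1.0 + u * u)          # passes
abs_ok = looks_admissible(lambda u: 1.0 + abs(u), p=1.0)        # not strictly convex
growth_ok = looks_admissible(lambda u: 1.0 + u * u, p=4.0)      # growth too slow for p = 4
```

The second-difference test is a discrete stand-in for strict convexity, so a cost with very small but positive curvature could be rejected by this screen.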
Inverse Optimal Control Problem
Given experimental data, infer a cost function $L \in \mathcal{L}_k$, k = 1 or 2, such that the recorded trajectories are optimal solutions of
$P_k(L)$:
$$\min C_L(u) = \int_0^T L(\dot\theta, \dots, \theta^{(k)})\, dt$$
subject to
$$\dot x = \cos\theta, \qquad \dot y = \sin\theta, \qquad \theta^{(k)} = u,$$
with $(x, y, \theta)(0) = 0$ and $(x, y, \theta)(T) = X_1$, T not fixed.
MAIN QUESTIONS
Stability of the direct problem
Asymptotic analysis (far away target)
What is the value of k?
Analysis of Pk(L) – General Results
Proposition
For every target X1, there exists an optimal trajectory of Pk(L).
Every optimal trajectory satisfies the Pontryagin Maximum Principle.
REMARKS :
The control system is controllable
The proof of existence uses standard arguments (cf. Lee & Markus)
The optimal control does not belong a priori to L∞([0, T ]). But:
Lemma
For (x1, y1) far away from 0, the optimal control is uniformly bounded.
→ not necessary to put an a priori bound on the control
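CASE $L = L(\dot\theta) \in \mathcal{L}_1$
$$H = p_1\cos\theta + p_2\sin\theta + p_3\dot\theta - \nu L(\dot\theta) \equiv 0$$
No abnormal extremals → optimal traj. are $C^\infty$ (ν = 1)
Adjoint Equation: $(p_1, p_2)$ are constant and, using $\partial H/\partial u = 0$, the adjoint equation can be written as an ODE:
$$\ddot\theta = G_L(\theta, \dot\theta; (p_1, p_2)), \qquad \theta(0) = 0$$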
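As an illustration of the case k = 1, take the example cost L(u) = 1 + u² (an assumption made here, not a cost singled out in the talk). With ν = 1 and the usual convention that each adjoint derivative is minus the partial derivative of H with respect to the corresponding state, the condition ∂H/∂u = 0 gives p3 = 2θ', and the adjoint equation becomes θ'' = (p1 sin θ − p2 cos θ)/2. The condition H = 0 at t = 0 fixes θ'(0) = ±sqrt(1 − p1). The sketch below integrates this extremal and checks that H stays numerically zero; the values of (p1, p2), the sign choice and the names `rk4_step`, `extremal_k1`, `hamiltonian` are hypothetical.

```python
import math

def rk4_step(f, s, h):
    """One classical RK4 step for the autonomous system s' = f(s)."""
    k1 = f(s)
    k2 = f([a + h / 2 * b for a, b in zip(s, k1)])
    k3 = f([a + h / 2 * b for a, b in zip(s, k2)])
    k4 = f([a + h * b for a, b in zip(s, k3)])
    return [a + h / 6 * (b + 2 * c + 2 * d + e)
            for a, b, c, d, e in zip(s, k1, k2, k3, k4)]

def hamiltonian(p1, p2, theta, omega):
    # H = p1 cos(theta) + p2 sin(theta) + p3*omega - L(omega)
    # with p3 = 2*omega and L(omega) = 1 + omega^2 (example cost, assumption).
    return p1 * math.cos(theta) + p2 * math.sin(theta) + omega * omega - 1.0

def extremal_k1(p1, p2, T, n=4000, sign=1.0):
    """Integrate the extremal for L(u) = 1 + u^2 from (x, y, theta)(0) = 0.
    theta'(0) = sign * sqrt(1 - p1) enforces H(0) = 0 (requires p1 <= 1)."""
    def f(s):
        x, y, theta, omega = s
        return [math.cos(theta), math.sin(theta), omega,
                0.5 * (p1 * math.sin(theta) - p2 * math.cos(theta))]
    h = T / n
    s = [0.0, 0.0, 0.0, sign * math.sqrt(1.0 - p1)]
    path = [tuple(s)]
    for _ in range(n):
        s = rk4_step(f, s, h)
        path.append(tuple(s))
    return path

path = extremal_k1(0.5, 0.3, 5.0)
drift = max(abs(hamiltonian(0.5, 0.3, th, om)) for _, _, th, om in path)
```

A shooting method would then adjust (p1, p2) and T until the endpoint of `path` matches the target X1; that outer loop is not shown here.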
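CASE $L = L(\dot\theta, \ddot\theta) \in \mathcal{L}_2$
$$H = p_1\cos\theta + p_2\sin\theta + p_3\dot\theta + p_4\ddot\theta - \nu L(\dot\theta, \ddot\theta) \equiv 0$$
No abnormal extremals → optimal traj. are $C^\infty$ (ν = 1)
Adjoint Equation: $(p_1, p_2)$ are constant and
$$\theta^{(4)} = F_L(\theta, \dot\theta, \ddot\theta, \theta^{(3)}; (p_1, p_2)),$$
with initial data: $(\theta, \theta^{(3)})(0) = (0, 0)$ [transversality condition]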
Analysis of P2(L)
We present only the analysis for k = 2; same thing for k = 1.
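Simplifying assumption: $L(\dot\theta, \ddot\theta) = 1 + \psi(\dot\theta) + \phi(\ddot\theta)$
REMINDER:
Adjoint equation ⇒ fourth-order ODE on θ parameterized by $(p_1, p_2)$:
$$\theta^{(4)} = F_L(\theta, \dot\theta, \ddot\theta, \theta^{(3)}; (p_1, p_2)),$$
with initial data: $(\theta, \theta^{(3)})(0) = (\pi/2, 0)$
$$H = p_1\cos\theta + p_2\sin\theta + p_3\dot\theta + p_4\ddot\theta - L(\dot\theta, \ddot\theta) \equiv 0$$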
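For a separable cost of this form, the right-hand side $F_L$ can be written out explicitly. The derivation below is a sketch under the usual Pontryagin conventions (ν = 1, each adjoint derivative equal to minus the partial derivative of H with respect to the corresponding state, state $(x, y, \theta, \dot\theta)$, control $u = \ddot\theta$), and it additionally assumes that φ is $C^3$ with $\phi'' > 0$. These conventions and regularity assumptions are not spelled out on the slides.

```latex
\begin{align*}
\frac{\partial H}{\partial u} = 0 \;&\Rightarrow\; p_4 = \phi'(\ddot\theta), \\
\dot p_4 = -\frac{\partial H}{\partial \dot\theta} \;&\Rightarrow\; \dot p_4 = -p_3 + \psi'(\dot\theta), \\
\dot p_3 = -\frac{\partial H}{\partial \theta} \;&\Rightarrow\; \dot p_3 = p_1\sin\theta - p_2\cos\theta, \\
\frac{d^2}{dt^2}\,\phi'(\ddot\theta) \;&=\; \psi''(\dot\theta)\,\ddot\theta - p_1\sin\theta + p_2\cos\theta, \\
\theta^{(4)} \;&=\; \frac{\psi''(\dot\theta)\,\ddot\theta - \phi'''(\ddot\theta)\,\bigl(\theta^{(3)}\bigr)^2
      - p_1\sin\theta + p_2\cos\theta}{\phi''(\ddot\theta)} .
\end{align*}
```

For instance, with $\psi(s) = \phi(s) = s^2$ this reduces to $\theta^{(4)} = \ddot\theta - \tfrac12\,(p_1\sin\theta - p_2\cos\theta)$.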
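Notations: $(x_1, y_1) = \rho(\cos\alpha, \sin\alpha)$, $\Theta = (\theta, \dot\theta, \ddot\theta, \theta^{(3)})$.
(Figure: the target $(x_1, y_1)$ seen from the origin at polar angle α in the (x, y) plane.)
Proposition
Given ν > 0, $|\Theta(t) - (\alpha, 0, 0, 0)| < \nu$ for $t \in [\tau_\nu, T - \tau_\nu]$.
Proposition
As $|(x_1, y_1)| \to \infty$, $(p_1, p_2) \sim \frac{(x_1, y_1)}{|(x_1, y_1)|} = (\cos\alpha, \sin\alpha)$.
$(\alpha, 0, 0, 0)$ is an equilibrium of the “limit” equation $\theta^{(4)} = F_L\big(\Theta; \frac{(x_1, y_1)}{|(x_1, y_1)|}\big)$.
Consequence: an optimal trajectory has a stable behaviour near the equilibrium $Y_{eq} = (\alpha, 0, 0, 0)$.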
Asymptotic Trajectories
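The equilibrium $Y_{eq} = (\alpha, 0, 0, 0)$ is not stable: its stable manifold $W^s$ is 2-dimensional.
(Figure: phase portrait near $Y_{eq}$, showing the stable (s) and unstable (u) directions and a trajectory starting at Z(0).)
→ the limit optimal trajectory must be “contained” in $W^s$.
→ Close to $Y_{eq}$, the trajectory is tangent to the stable space of the linearized ODE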
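The dimension count can be checked by hand on one example. Assume the cost L = 1 + θ'² + θ''² and the limit adjoint vector (p1, p2) = (cos α, sin α). A Pontryagin computation with ν = 1 (an assumption of this sketch, not a formula quoted from the slides) gives the limit equation θ'''' = θ'' − ½ sin(θ − α). Linearizing at θ = α gives δ'''' = δ'' − ½ δ, whose characteristic polynomial is λ⁴ − λ² + ½. The following Python sketch computes its roots and counts those with negative real part.

```python
import cmath

def characteristic_roots():
    """Roots of lambda^4 - lambda^2 + 1/2 = 0, solved as a quadratic in
    mu = lambda^2 followed by complex square roots."""
    disc = cmath.sqrt(1.0 - 4.0 * 0.5)        # discriminant of mu^2 - mu + 1/2
    roots = []
    for mu in ((1.0 + disc) / 2.0, (1.0 - disc) / 2.0):
        lam = cmath.sqrt(mu)
        roots.extend([lam, -lam])
    return roots

roots = characteristic_roots()
stable = [r for r in roots if r.real < 0]      # directions spanning the stable space
unstable = [r for r in roots if r.real > 0]
```

For this example cost, two roots have negative real part and two have positive real part, so the linearized ODE has a 2-dimensional stable space and a 2-dimensional unstable space, in agreement with the statement above.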
Asymptotic Trajectories
$W^s$ contains a 1-parameter family of trajectories: we compute them numerically and select the one starting with $\theta(0) = \pi/2$ and $\theta^{(3)}(0) = 0$.
(Figure: $L = 1 + \dot\theta^2 + \ddot\theta^2$; both axes range from -8 to 8.)
We also obtain the limit value of p(0) → initialization of a shooting algorithm.
We can recover dL from the phase portrait of Θ
Stability results
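We say that $L_\varepsilon \in \mathcal{L}$ converges to $L_0 \in \mathcal{L}$ if:
- $|L_\varepsilon(\dot\theta, \ddot\theta) - L_0(\dot\theta, \ddot\theta)|$ or $|L_\varepsilon(\dot\theta, \ddot\theta) - L_0(\dot\theta)| \le C\varepsilon|\ddot\theta|^p$ when $L_\varepsilon \in \mathcal{L}_2$,
- $|L_\varepsilon(\dot\theta) - L_0(\dot\theta)| \le C\varepsilon|\dot\theta|^p$ when $L_0$ and $L_\varepsilon \in \mathcal{L}_1$.
Notation: $\mathcal{T}(L, X_1)$ = the set of trajectories $(x, y, \theta, \dot\theta)$ s.t.
- $(x, y, \theta)$ is optimal for $P_1(L)$ with final point $X_1$ if $L \in \mathcal{L}_1$;
- $(x, y, \theta, \dot\theta)$ is optimal for $P_2(L)$ with final point $X_1$ if $L \in \mathcal{L}_2$.
Theorem
If $X_1^\varepsilon \to X_1$, $L_\varepsilon$ converges to $L_0$, and $(x_\varepsilon, y_\varepsilon, \theta_\varepsilon, \dot\theta_\varepsilon) \in \mathcal{T}(L_\varepsilon, X_1^\varepsilon)$, then
$$d_{\mathrm{unif}}\big((x_\varepsilon, y_\varepsilon, \theta_\varepsilon, \dot\theta_\varepsilon),\ \mathcal{T}(L_0, X_1)\big) \to 0$$
(+ uniform convergence of $\ddot\theta_\varepsilon$ and $\theta^{(3)}_\varepsilon$ if $L_\varepsilon \in \mathcal{L}_2$ under add. hypothesis).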
→ Optimal trajectories + adjoint vectors are stable under perturbations of the cost and of the target.
Theorem
The optimal synthesis of a problem $P_k(L)$ is stable under perturbations of the cost in $\mathcal{L}_1 \cup \mathcal{L}_2$ (robustness).
Consequences
Stability of the direct problem
Our modelling is compatible with the physiology
A solution “up to perturbations” is sufficient
Question
Can we choose, up to perturbations, a cost in L1?
(i.e. a cost that depends only on the curvature)
Locomotion depends only on $\dot\theta$
Remark on the problem P1(L):
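Up to a canonical rotation + translation Φ, the minimizers of $P_1(L)$ form a one-parameter class of curves.
→ Set Φ: a curve $(x, y, \theta)(\cdot)$ s.t. $\dot\theta = 0$ once ↦ a curve $(\bar x, \bar y, \bar\theta)(\cdot)$,
$$(\bar x(\bar t), \bar y(\bar t)) = \mathrm{Rot}(-\theta(t_0)) \cdot \big[(x(t), y(t)) - (x(t_0), y(t_0))\big], \qquad \bar\theta(\bar t) = \theta(t) - \theta(t_0),$$
where $\dot\theta(t_0) = 0$ and $t = \bar t + t_0$,
and set $\Phi_t(x, y, \theta) = (\bar x(t), \bar y(t), \bar\theta(t))$.
→ For $L \in \mathcal{L}_k$, $M_L$ = {minimizers $(x, y, \theta)$ of $P_k(L)$ s.t. $\dot\theta = 0$ once}
Proposition
If $L \in \mathcal{L}_1$, then for every fixed t the set $\Phi_t(M_L)$ is a curve in $\mathbb{R}^3$.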
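The normalization Φ can be sketched on sampled data as follows. The code locates the sample where the discrete derivative of θ changes sign, takes it as t0, then translates, rotates and re-times the curve accordingly. The function names, the sign-change detection and the synthetic test curve are assumptions for illustration; the talk does not describe how the recorded trajectories were processed.

```python
import math

def inflection_index(thetas):
    """Index of the first sample where the finite difference of theta
    changes sign, used as a discrete stand-in for theta'(t0) = 0."""
    d = [b - a for a, b in zip(thetas, thetas[1:])]
    for i in range(1, len(d)):
        if d[i - 1] * d[i] <= 0.0:
            return i
    raise ValueError("theta is monotone: no sign change of its derivative")

def phi(ts, xs, ys, thetas):
    """Translate to (x, y)(t0), rotate by -theta(t0), shift time by t0.
    Returns a list of (t_bar, x_bar, y_bar, theta_bar) samples."""
    i0 = inflection_index(thetas)
    t0, x0, y0, th0 = ts[i0], xs[i0], ys[i0], thetas[i0]
    c, s = math.cos(th0), math.sin(th0)
    out = []
    for t, x, y, th in zip(ts, xs, ys, thetas):
        dx, dy = x - x0, y - y0
        out.append((t - t0, c * dx + s * dy, -s * dx + c * dy, th - th0))
    return out

# Synthetic curve with theta(t) = (t - 1)^2, so theta' vanishes once, at t = 1.
n, h = 200, 0.01
ts = [i * h for i in range(n + 1)]
thetas = [(t - 1.0) ** 2 for t in ts]
xs, ys = [0.0], [0.0]
for th in thetas[:-1]:                      # explicit Euler for x' = cos, y' = sin
    xs.append(xs[-1] + h * math.cos(th))
    ys.append(ys[-1] + h * math.sin(th))

# The same curve after an arbitrary rotation (beta) and translation (bx, by).
beta, bx, by = 0.7, 3.0, -2.0
xs2 = [math.cos(beta) * x - math.sin(beta) * y + bx for x, y in zip(xs, ys)]
ys2 = [math.sin(beta) * x + math.cos(beta) * y + by for x, y in zip(xs, ys)]
thetas2 = [th + beta for th in thetas]

a = phi(ts, xs, ys, thetas)
b = phi(ts, xs2, ys2, thetas2)
gap = max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))
```

Both versions of the curve are mapped to the same normalized curve, which passes through (0, 0) with zero angle at normalized time 0. This invariance is what allows trajectories recorded with different start configurations to be compared.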
Numerical test
Numerical test: apply the transformation $\Phi_t$ to the recorded curves. Does it give a curve?
YES!
(Figure: $\Phi_t$ applied at different times, T = 0.25, 0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 2.00, to ∼ 200 recorded trajectories; axes x, y and angle.)
Validity of the test
Theorem
Let L ∈ L2.
- $M_L$ is a nonempty open subset of the set of minimizers of $P_2(L)$
- For every fixed t the set $\Phi_t(M_L)$ contains at least a surface in $\mathbb{R}^3$.
Idea of the proof:
- Parameterization of minimizers by initial adjoint vector
- Continuity of the adjoint vector w.r.t. the goal
- Implicit function theorem
Numerical test of $\Phi_t(M)$
Conclusion
Models with k = 1 should be sufficient to describe human locomotion
that is,
the cost is a function of the curvature of the planar trajectory (x, y)