Hegselmann-Krause Dynamics†
&
Constrained Consensus∗
Angelia Nedic
ISE Department and Coordinated Science Laboratory
University of Illinois at Urbana-Champaign
joint work
with †B. Touri (former PhD student), and ∗P.A. Parrilo and A. Ozdaglar
IMA, University of Minnesota June 1–6, 2014
Bio-Inspired (State-Dependent) Dynamics
• Bio-inspired models are gaining popularity across a broad range of research areas
• Dynamics and Control: flight formation, collision avoidance, coordination of
network of robots
• Signal Processing: distributed in-network estimation and tracking
• Optimization and Learning: distributed multi-agent optimization over network,
distributed resource allocation, distributed learning, data-mining
• Social/Economic Behavior: social networks, spread of influence, opinion formation
and dynamics
• At the core of all these is a bio-inspired dynamical model for
information spread over a network through local agent interactions
Multi-dimensional Hegselmann-Krause Model
• Discrete-time model used often for opinion dynamics and robotic networks
• The model consists of the following
• The set [m] = {1,2, . . . ,m} of m agents
• The agents interact at discrete time instances, indexed by t = 1,2,3, ...
• Each agent i has an opinion modeled by a vector xi(t) ∈ Rn at time t
• Starting with some initial opinion xi(0) ∈ Rn, each agent updates his opinion upon
interaction with neighbors
• Neighbors are defined in terms of a confidence interval, given by a scalar ε > 0
Ni(t) = {j ∈ [m] | ‖xj(t)− xi(t)‖ ≤ ε}
• Each agent updates his opinion by averaging the opinions of all his neighbors
xi(t+1) = (1/|Ni(t)|) ∑_{j∈Ni(t)} xj(t) for t = 0,1,2,...
Proposed by Hegselmann and Krause (2002)∗ for the scalar case.
∗R. Hegselmann and U. Krause, "Opinion dynamics and bounded confidence: models, analysis, and simulation," Journal of Artificial Societies and Social Simulation, vol. 5, 2002.
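The averaging update above is straightforward to implement. A minimal NumPy sketch (function and variable names are my own, not from the talk):

```python
import numpy as np

def hk_step(X, eps):
    """One synchronous Hegselmann-Krause update.

    X   : (m, n) array; row i is agent i's opinion x_i(t)
    eps : confidence bound; j is in N_i(t) iff ||x_j(t) - x_i(t)|| <= eps
    Returns the (m, n) array of opinions x_i(t+1).
    """
    X_next = np.empty_like(X)
    for i in range(X.shape[0]):
        # neighbor set N_i(t); note that i is always its own neighbor
        nbrs = np.linalg.norm(X - X[i], axis=1) <= eps
        X_next[i] = X[nbrs].mean(axis=0)   # average over N_i(t)
    return X_next
```

For example, with scalar opinions 0, 1, and 10 and ε = 2, the first two agents average to 0.5 while the third is isolated and stays at 10.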
What is known about the Hegselmann-Krause Model

xi(t+1) = (1/|Ni(t)|) ∑_{j∈Ni(t)} xj(t) for t = 0,1,2,...
Ni(t) = {j ∈ [m] | ‖xj(t)− xi(t)‖ ≤ ε} for t = 0,1,...
SCALAR CASE
• Given ε > 0 and the initial profile {xi(0), i ∈ [m]}, the behavior of the opinions is completely determined
• For any initial profile and any confidence bound ε, the agents' opinions xi(t) converge to some limiting opinion x∗i in finite time, i.e., for each agent i there exists a time Ti ≥ 0 such that xi(t) = x∗i for all t ≥ Ti
• Fragmentation of opinions may occur, i.e., we may have agents i and j such that x∗i ≠ x∗j
• Hegselmann and Krause 2002, Blondel et al. 2009†
• Convergence time of the order O(m⁴) - Behrouz Touri, Ph.D. Thesis
• Improved to O(m³) - S. Mohajer and B. Touri, "On Convergence Rate of Scalar Hegselmann-Krause Dynamics," ACC 2013
†V. Blondel, J. Hendrickx, and J. Tsitsiklis, "On Krause's multiagent consensus model with state-dependent connectivity," IEEE Transactions on Automatic Control, vol. 54, no. 11, pp. 2586-2597, Nov. 2009.
Some Illustrations of the Opinion Dynamics
m = 100 and ε = 5
• 9 groups in the limit, convergence in 16 time steps
• The fragmentations of opinions occur, BUT the fragmentations are permanent
• The analysis of the scalar case depends heavily on the fact that the ordering of the agents (by opinion value) does not change over time
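An experiment in the spirit of this illustration can be run in a few lines. This is a hedged sketch: the cluster count and termination step depend on the random initial profile, so no particular values are claimed, and the tolerance-based stopping rule is a numerical stand-in for the exact finite termination guaranteed by the theory.

```python
import numpy as np

def hk_scalar(x, eps, max_steps=1000, tol=1e-9):
    """Iterate scalar HK until the update is (numerically) a fixed point.

    Returns (number of steps taken, final opinion vector).
    """
    for t in range(max_steps):
        x_next = np.array([x[np.abs(x - xi) <= eps].mean() for xi in x])
        if np.max(np.abs(x_next - x)) < tol:
            return t, x_next
        x = x_next
    return max_steps, x

rng = np.random.default_rng(0)
x0 = rng.uniform(0.0, 100.0, size=100)       # m = 100 opinions on [0, 100]
T, x_final = hk_scalar(x0, eps=5.0)
clusters = len(np.unique(np.round(x_final, 3)))
print(f"converged after {T} steps into {clusters} opinion clusters")
```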
Illustration of the Opinion Dynamics in 2-D
• The opinions can merge!!!
• The fragmentations of opinions may occur, BUT the fragmentations are NOT permanent
• The analysis cannot make use of the ordering of the agents' values
What is known about HK Model
• Convergence in a finite time - Lorenz 2007 (PhD Thesis)
• Not knowing this result at the time, we showed that the dynamics converges in finite time for any initial profile and any confidence bound ε > 0
• Lorenz's proof considers matrix products; a convergence rate is hard to extract from that analysis
• Our proof is based on a dynamical-systems point of view and makes use of two basic ingredients
• An adjoint dynamics - existence
• Selection of a Lyapunov function - using adjoint variables
• Open question - ???
Multidimensional Hegselmann-Krause (HK) Model: Agent Types

M(t) = {i ∈ [m] | max_{j∈Ni(t)} ‖xi(t)− xj(t)‖ > ε/2},   (1)
G(t) = {i ∈ [m] | max_{j∈Ni(t)} ‖xi(t)− xj(t)‖ ≤ ε/2}.

If an agent and all his neighbors have a small local-opinion spread, then the agent and his neighbors will reach an agreement in the next time step.
Left: An agent and all his neighbors have a small local-opinion spread, which leads to their agreement at the next time step.
Right: An agent with a small local-opinion spread, with a neighbor whose local-opinion spread is large.
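The classification (1) is easy to compute. A small sketch (helper names are my own):

```python
import numpy as np

def classify_agents(X, eps):
    """Split [m] into M(t) and G(t) as in (1): agent i is in G(t) iff all of
    its neighbors lie within eps/2 of x_i(t), and in M(t) otherwise."""
    M, G = [], []
    for i in range(X.shape[0]):
        d = np.linalg.norm(X - X[i], axis=1)
        spread = d[d <= eps].max()          # largest distance to a neighbor
        (M if spread > eps / 2 else G).append(i)
    return M, G
```

For instance, with scalar opinions 0, 3, 10 and ε = 4, the first two agents are mutual neighbors at distance 3 > ε/2 = 2, so both land in M(t), while the isolated third agent lands in G(t).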
Lemma 1 There holds for all t ≥ 0,
(a) Ni(t) ⊆ Nj(t) for all i ∈ G(t) and all j ∈ Ni(t).
(b) Nj(t) = Ni(t) for all i ∈ G(t) and all j ∈ Ni(t) ∩G(t).
(c) For every i, ` ∈ G(t), either Ni(t) = N`(t) or Ni(t) ∩N`(t) = ∅.
As a consequence of Lemma 1(b), for i ∈ G(t) with Ni(t) ⊆ G(t), we have
xi(t+ 1) = xj(t+ 1) for all j ∈ Ni(t).
Left: an agent in G(t) with a neighbor not in G(t); Right: an agent in G(t) with all neighbors in G(t)
Important Relation
Lemma 2 Let i ∈ G(t) be such that Ni(t) ⊆ G(t). Then, at time t+1, one of the following two cases occurs:
Ni(t+1) = Ni(t)   or   Ni(t+1) ⊃ Ni(t).
Furthermore, in the second case, if there is an agent ℓ ∈ G(t) such that ℓ ∈ Ni(t+1) \ Ni(t), then ℓ ∈ Nj(t+1) for all j ∈ Ni(t) and
‖xj(t+1)− xℓ(t+1)‖ > ε/2 for all j ∈ Ni(t).
Termination Criteria
• The termination time T of the HK dynamics is defined by
T = inf_{t≥0} {t | xi(k+1) = xi(k) for all k ≥ t and all i ∈ [m]}.
Proposition 1 Suppose that the time t is such that G(t) = [m] and G(t + 1) = [m].
Then, we have
Ni(t+ 1) = Ni(t) for all i ∈ [m],
and the termination time T of the Hegselmann-Krause dynamics satisfies T ≤ t+ 1.
Since G(t) = [m], by Lemma 2, it follows that either Ni(t+1) = Ni(t) for all i ∈ [m], or Ni(t+1) ⊃ Ni(t) for some i ∈ [m]. We show that the latter case cannot occur. Specifically, every ℓ ∈ Ni(t+1) \ Ni(t) must belong to the set G(t) since M(t) = ∅. Therefore, by Lemma 2 it follows that ‖xj(t+1)− xℓ(t+1)‖ > ε/2 for all j ∈ Ni(t). In particular, this implies i, ℓ ∈ M(t+1), which is a contradiction since M(t+1) = ∅. Therefore, we must have Ni(t+1) = Ni(t) for all i, which means that each agent group has reached a local group-agreement, and this agreement persists at time t+1 and onward.
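Proposition 1 suggests a practical stopping rule: once G(t) = [m] at two consecutive times, neighborhoods freeze and the dynamics has terminated by t+1. A sketch of this rule (helper names are my own):

```python
import numpy as np

def run_until_termination(X, eps, max_steps=10_000):
    """Run the HK dynamics and stop via Proposition 1: once G(t) = [m] holds
    at two consecutive times, neighborhoods freeze and T <= t + 1."""
    def step(X):
        return np.stack([X[np.linalg.norm(X - x, axis=1) <= eps].mean(axis=0)
                         for x in X])
    def M_empty(X):                     # True iff M(t) = empty, i.e. G(t) = [m]
        for x in X:
            d = np.linalg.norm(X - x, axis=1)
            if d[d <= eps].max() > eps / 2:
                return False
        return True
    prev = M_empty(X)
    for t in range(max_steps):
        X = step(X)
        cur = M_empty(X)
        if prev and cur:
            return t + 1, X             # termination time T <= t + 1
        prev = cur
    return max_steps, X
```

For two agents at 0 and 1 with ε = 3, both G(0) = [m] and G(1) = [m] hold, so the rule fires with T ≤ 1 and both agents sit at 0.5.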
Adjoint Dynamics
We next show that the Hegselmann-Krause dynamics has an adjoint dynamics. To do so,
we first compactly represent the dynamics by defining the neighbor-interaction matrix B(t)
with the entries given as follows:
Bij(t) = 1/|Ni(t)| if j ∈ Ni(t), and Bij(t) = 0 otherwise.
The HK dynamics can now be written as:
X(t+ 1) = B(t)X(t) for all t ≥ 0, (2)
where X(t) is the m×n matrix with the rows given by the opinion vectors x1(t), . . . , xm(t).
• We say that the Hegselmann-Krause dynamics has an adjoint dynamics if there exists
a sequence of probability vectors {π(t)} ⊂ Rm such that
π′(t) = π′(t+ 1)B(t) for all t ≥ 0.
• We say that an adjoint dynamics is uniformly bounded away from zero if there exists
a scalar p∗ ∈ (0,1) such that πi(t) ≥ p∗ for all i ∈ [m] and t ≥ 0.
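Building B(t) and checking that it is row-stochastic takes a few lines (names are my own illustration):

```python
import numpy as np

def interaction_matrix(X, eps):
    """Neighbor-interaction matrix B(t): B_ij = 1/|N_i(t)| if j in N_i(t), else 0."""
    m = X.shape[0]
    B = np.zeros((m, m))
    for i in range(m):
        nbrs = np.linalg.norm(X - X[i], axis=1) <= eps
        B[i, nbrs] = 1.0 / nbrs.sum()
    return B

X = np.array([[0.0], [1.0], [10.0]])
B = interaction_matrix(X, eps=2.0)
assert np.allclose(B.sum(axis=1), 1.0)   # each B(t) is row-stochastic
X_next = B @ X                           # the compact update X(t+1) = B(t) X(t)
```

The matrix product reproduces the per-agent averaging step exactly.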
Existence of Adjoint Dynamics for HK model
To show the existence of the adjoint dynamics, we use a variant of Theorem 4.8 in Behrouz Touri's thesis, as stated below.
Theorem 2 Let {A(t)} be a sequence of stochastic matrices such that the following two
properties hold:
(i) There exists a scalar γ ∈ (0,1) such that Aii(t) ≥ γ for all i ∈ [m] and t ≥ 0;
(ii) There exists a scalar α ∈ (0,1] such that for every nonempty S ⊂ [m] and its complement S̄ = [m] \ S, there holds
∑_{i∈S, j∈S̄} Aij(t) ≥ α ∑_{i∈S̄, j∈S} Aij(t) for all t ≥ 0.
Then, the dynamics z(t+1) = A(t)z(t), t ≥ 0, has an adjoint dynamics {π(t)} which is uniformly bounded away from zero.
• HK dynamics is driven by stochastic matrices B(t) to which the Theorem applies
• HK opinion dynamics has an adjoint dynamics which is uniformly bounded away
from zero.
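For small m, the two conditions of Theorem 2 can be verified numerically for the HK matrices B(t) by brute force over all cuts. This is only a sanity check (the enumeration is exponential in m), and γ = 1/m, α = 1/m are my assumed bounds, justified because every agent is its own neighbor so Bii(t) = 1/|Ni(t)| ≥ 1/m:

```python
import itertools
import numpy as np

def interaction_matrix(X, eps):
    """HK neighbor-interaction matrix B(t) for the profile X."""
    m = X.shape[0]
    B = np.zeros((m, m))
    for i in range(m):
        nbrs = np.linalg.norm(X - X[i], axis=1) <= eps
        B[i, nbrs] = 1.0 / nbrs.sum()
    return B

def cut_balance_alpha(B):
    """Smallest ratio of forward to backward cut weights over all nonempty
    proper subsets S, i.e. the best alpha in condition (ii)."""
    m = B.shape[0]
    alpha = np.inf
    for r in range(1, m):
        for S in itertools.combinations(range(m), r):
            S, Sc = list(S), [j for j in range(m) if j not in S]
            fwd = B[np.ix_(S, Sc)].sum()    # sum over i in S, j in S-bar
            bwd = B[np.ix_(Sc, S)].sum()    # sum over i in S-bar, j in S
            if bwd > 0:
                alpha = min(alpha, fwd / bwd)
    return alpha

X = np.array([[0.0], [1.0], [10.0]])
B = interaction_matrix(X, eps=2.0)
m = B.shape[0]
assert B.diagonal().min() >= 1.0 / m      # condition (i) with gamma = 1/m
assert cut_balance_alpha(B) >= 1.0 / m    # condition (ii) with alpha = 1/m
```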
Lyapunov Comparison Function
We construct a Lyapunov comparison function by using the adjoint HK dynamics. The comparison function for the dynamics is a function V(t), defined for the m×n opinion matrix X(t) and t ≥ 0 by
V(t) = ∑_{i=1}^m πi(t) ‖xi(t)− π′(t)X(t)‖²,   (3)
where the i-th row of X(t) is given by the opinion vector xi(t) of agent i and {π(t)} is the adjoint dynamics. For this function, we have the following essential relation.
Lemma 3 For any t ≥ 0, we have
V(t+1) = V(t)− D(t),
where
D(t) = (1/2) ∑_{ℓ=1}^m (πℓ(t+1)/|Nℓ(t)|²) ∑_{i∈Nℓ(t)} ∑_{j∈Nℓ(t)} ‖xi(t)− xj(t)‖².
• Proof of this result is in Behrouz Touri’s thesis‡
‡B. Touri ”Product of Random Stochastic Matrices and Distributed Averaging” Springer Series in Control Engineering, 2012.
Finite Termination Time
Proposition 3 The Hegselmann-Krause opinion dynamics (2) reaches its steady state
in a finite time.
By summing the relation of the essential Lemma (Lemma 3) for t = 0,1,...,τ−1 for some τ ≥ 1, and by rearranging the terms, we obtain

V(τ) + (1/2) ∑_{t=0}^{τ−1} ∑_{ℓ=1}^m (πℓ(t+1)/|Nℓ(t)|²) ∑_{i∈Nℓ(t)} ∑_{j∈Nℓ(t)} ‖xi(t)− xj(t)‖² = V(0).

Letting τ → ∞, since V(τ) ≥ 0 for any τ, we conclude that

lim_{t→∞} ∑_{ℓ=1}^m (πℓ(t+1)/|Nℓ(t)|²) ∑_{i∈Nℓ(t)} ∑_{j∈Nℓ(t)} ‖xi(t)− xj(t)‖² = 0.

By Theorem 2 (existence of the adjoint dynamics), the adjoint dynamics {π(t)} is uniformly bounded away from zero, i.e., πℓ(t) ≥ p∗ for some p∗ > 0 and for all ℓ ∈ [m] and t ≥ 0. Furthermore, |Nℓ(t)| ≤ m for all ℓ ∈ [m] and t ≥ 0. Therefore, it follows that for every ℓ ∈ [m],

lim_{t→∞} ∑_{i∈Nℓ(t)} ∑_{j∈Nℓ(t)} ‖xi(t)− xj(t)‖² = 0.
We further have
∑_{i∈Nℓ(t)} ∑_{j∈Nℓ(t)} ‖xi(t)− xj(t)‖² ≥ ∑_{i∈Nℓ(t)} ‖xi(t)− xℓ(t)‖² ≥ max_{i∈Nℓ(t)} ‖xi(t)− xℓ(t)‖²,
where the first inequality follows from the fact that ℓ ∈ Nℓ(t) for all ℓ ∈ [m] and t ≥ 0. Consequently, we obtain for every ℓ ∈ [m],
lim_{t→∞} max_{i∈Nℓ(t)} ‖xi(t)− xℓ(t)‖² = 0.
Hence, for every ℓ ∈ [m], there exists a time tℓ such that
max_{i∈Nℓ(t)} ‖xi(t)− xℓ(t)‖ ≤ ε/2 for all t ≥ tℓ.
In view of the definition of the set M(t) in (1), the preceding relation implies that
M(t) = ∅ for all t ≥ max_{ℓ∈[m]} tℓ.
By Proposition 1 (termination criteria), it follows that the termination time T satisfies
T ≤ max_{ℓ∈[m]} tℓ + 1.
• Multidimensional HK dynamics converges in a finite time
• Sadly, the result is not new: Jan Lorenz 2007§
§J. Lorenz, "Repeated averaging and bounded confidence: modeling, analysis and simulation of continuous opinion dynamics," Ph.D. Dissertation, Universität Bremen, 2007.
Some open directions/questions
• Convergence time?
• Can we predict the number of groups in the limit?
• What about convergence of such a dynamics when each agent uses its own confidence bound εi?
Partial answers in
• B. Touri and C. Langbort, On Indigenous Random Consensus and Averaging
Dynamics, IEEE Conference on Decision and Control, Florence, Italy, Dec 2013
Constrained Consensus
• Intro to Feasibility Problems
• Constrained Consensus viewed as Feasibility Problem
• Distributed Algorithm for Constrained Consensus
Constrained Consensus Problem
• Given a sequence {Gt} of directed and strongly connected graphs, Gt = ([m], Et)
• Each agent i (a node in the graph) has a constraint set Xi ⊆ Rn that is nonempty, closed, and convex
• Construct a process (algorithm) that uses local agent interactions, so that the agents
reach an agreement on some vector x ∈ ∩mi=1Xi
• The sets Xi are assumed to be simple to project onto
• The intersection set X = ∩mi=1Xi is assumed to be nonempty
Characterization of a convex set C
• Given a closed convex set C, consider the distance function associated with C:
dist(x,C) = ‖x−ΠC[x]‖ for all x ∈ Rn
where the norm is the Euclidean norm
• Note that
x ∈ C ⇐⇒ ‖x−ΠC[x]‖ = 0
• Hence
x ∈ C ⇐⇒ dist(x,C) = 0
• The distance function is not differentiable, but the squared distance function is:
g(x) = dist²(x, C),  ∇g(x) = 2 (x− ΠC[x])
• Thus, we can use the description of C in terms of the squared distance function of C:
x ∈ C ⇐⇒ (1/2) dist²(x, C) = 0
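This characterization is easy to check numerically for a simple set. A sketch for C a Euclidean ball (function names and the ball example are my own):

```python
import numpy as np

def proj_ball(x, c, r):
    """Euclidean projection Pi_C[x] onto the ball C = {z : ||z - c|| <= r}."""
    d = x - c
    nd = np.linalg.norm(d)
    return x.copy() if nd <= r else c + (r / nd) * d

def sq_dist(x, c, r):
    """g(x) = dist^2(x, C); equals 0 exactly when x is in C."""
    return float(np.linalg.norm(x - proj_ball(x, c, r)) ** 2)

c, r = np.zeros(2), 1.0
x_out = np.array([3.0, 0.0])
assert np.allclose(proj_ball(x_out, c, r), [1.0, 0.0])
assert sq_dist(x_out, c, r) == 4.0
assert sq_dist(np.array([0.3, 0.4]), c, r) == 0.0    # membership test
# gradient of g: 2 (x - Pi_C[x]); here 2 * ([3,0] - [1,0]) = [4, 0]
assert np.allclose(2 * (x_out - proj_ball(x_out, c, r)), [4.0, 0.0])
```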
Consensus Problem Reformulation
• The set Xi can be described in terms of the distance function¶ x ↦ dist(x, Xi):
Xi = {x ∈ Rn | (1/2) dist²(x, Xi) = 0}
• Consider the problem
minimize ∑_{i=1}^m (1/2) dist²(x, Xi) subject to x ∈ Rn
• We see that finding a point x ∈ X is equivalent to solving the preceding problem
• Letting fi(x) = (1/2) dist²(x, Xi), we are back to a problem of the form
minimize ∑_{i=1}^m fi(x) subject to x ∈ Rn
¶AN, "Random Projection Algorithms for Convex Set Intersection Problems," CDC 2010
• Each function fi(x) = (1/2) dist²(x, Xi) is convex (partial minimization criterion)
• Each function is differentiable:
∇fi(x) = x− ΠXi[x]
• Furthermore, the gradients ∇fi are Lipschitz continuous:
‖∇fi(x)−∇fi(y)‖ = ‖x− ΠXi[x]− (y− ΠXi[y])‖ ≤ ‖x− y‖+ ‖ΠXi[x]− ΠXi[y]‖ ≤ 2 ‖x− y‖,
where the last inequality follows from the nonexpansiveness property of the projection:
‖ΠXi[x]− ΠXi[y]‖ ≤ ‖x− y‖
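The 2-Lipschitz bound can be spot-checked numerically. A sketch with Xi the unit ball (my own assumed example set):

```python
import numpy as np

def proj(x):
    """Projection onto the closed unit ball (an assumed example set X_i)."""
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n

rng = np.random.default_rng(1)
for _ in range(1000):
    x, y = rng.normal(size=3), rng.normal(size=3)
    gx, gy = x - proj(x), y - proj(y)    # grad f_i(x) = x - Pi_{X_i}[x]
    assert np.linalg.norm(gx - gy) <= 2.0 * np.linalg.norm(x - y) + 1e-12
```

No sampled pair violates the bound, consistent with the triangle-inequality argument above.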
Constrained Consensus Problem: New Formulation
• Given a sequence {Gt} of directed and strongly connected graphs, Gt = ([m], Et)
• Solve the problem
minimize ∑_{i=1}^m fi(x) subject to x ∈ Rn
with fi(x) = (1/2) dist²(x, Xi)
• Each agent has access to fi only
• USEFUL properties
• Each fi is a convex function with a Lipschitz-continuous gradient, with Lipschitz constant Li = 2
• When X := ∩_{i=1}^m Xi ≠ ∅, the set of solutions of the problem is the set X
• When X ≠ ∅, all functions fi have a set of common global minima, which is the set X
• Apply distributed algorithm
Distributed Algorithm
• Let Ni(t) = {j ∈ [m] | (j, i) ∈ Et}
• Let each agent have an initial vector xi(0) ∈ Xi
• Agent updates:
• Consensus-like step to mix their own estimate with those received from neighbors
wi(t+1) = ∑_{j=1}^m aij(t) xj(t)   (aij(t) = 0 when j ∉ Ni(t))
• Followed by a local gradient-based step
xi(t+ 1) = wi(t+ 1)− α(t)∇fi(wi(t+ 1))
where α(t) > 0 is a stepsize
• Algorithm in this case assumes a special form:
• Using ∇fi(x) = x− ΠXi[x]:
xi(t+1) = wi(t+1)− α(t) (wi(t+1)− ΠXi[wi(t+1)])
xi(t+1) = (1− α(t)) wi(t+1) + α(t) ΠXi[wi(t+1)]
Distributed Algorithm: Stepsize
• Agents initialize with xi(0) ∈ Xi for i ∈ [m]
• At each update time t, every agent performs two steps:
wi(t+1) = ∑_{j=1}^m aij(t) xj(t)
xi(t+1) = (1− α(t)) wi(t+1) + α(t) ΠXi[wi(t+1)]
where aij(t) = 0 when j /∈ Ni(t) and α(t) > 0
• The stepsizes here can take advantage of the gradient Lipschitz continuity: the stepsize can be constant, α(t) = α for all t
• The algorithm is convergent as long as α ≤ min_i 2/Li, where Li is the Lipschitz constant for the function fi
• Recall that Li = 2 for all fi in this case
• So we can use
α(t) = α for all t, with 0 < α ≤ 1.
• We consider the version with α = 1
• Done in: AN, A. Ozdaglar and A.P. Parrilo ”Constrained Consensus and Optimization
in Multi-Agent Networks” IEEE TAC, 2010
Distributed Algorithm for Constrained Consensus
• Agents initialize with xi(0) ∈ Xi for i ∈ [m]
• At each update time t, every agent i updates as follows:
wi(t+1) = ∑_{j=1}^m aij(t) xj(t),   xi(t+1) = ΠXi[wi(t+1)]
• Compactly, the agent updates have the form:
xi(t+1) = ΠXi[ ∑_{j=1}^m aij(t) xj(t) ] for all i ∈ [m] and all t
Proposition (Convergence)‖ Let each of the graphs Gt be strongly connected and let each A(t) be a doubly stochastic matrix such that
aii(t) ≥ η for all i and t,
aij(t) ≥ η for all j ∈ Ni(t) and all i and t,
where η > 0. Assume also that X = ∩_{i=1}^m Xi ≠ ∅. Then, the iterate sequences {xi(t)}, i ∈ [m], converge to a common point x∗ ∈ X, i.e.,
lim_{t→∞} xi(t) = x∗ for some x∗ ∈ X and all i ∈ [m]
‖Prop. 2 in AN, A. Ozdaglar and A.P. Parrilo, 2010
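A compact simulation of this α = 1 projected-consensus iteration. The sets, weights, and starting points below are my own illustration, not from the paper: three agents carry interval constraints in R whose intersection is [1.5, 2], with complete-graph uniform (hence doubly stochastic) weights.

```python
import numpy as np

def constrained_consensus(projections, A, x0, iters=500):
    """Projected consensus (alpha = 1): x_i(t+1) = Pi_{X_i}( sum_j a_ij x_j(t) ).

    projections : list of m projection maps Pi_{X_i}
    A           : fixed m x m doubly stochastic weight matrix
    x0          : (m, n) initial points with x0[i] in X_i
    """
    X = np.array(x0, dtype=float)
    for _ in range(iters):
        W = A @ X                                    # consensus mixing step
        X = np.stack([P(w) for P, w in zip(projections, W)])
    return X

# Interval constraints X_1 = [0, 2], X_2 = [1, 3], X_3 = [1.5, 5]
projections = [lambda w: np.clip(w, 0.0, 2.0),
               lambda w: np.clip(w, 1.0, 3.0),
               lambda w: np.clip(w, 1.5, 5.0)]
A = np.full((3, 3), 1.0 / 3.0)                       # uniform doubly stochastic weights
x0 = [[0.0], [3.0], [5.0]]                           # each x_i(0) lies in X_i
Xf = constrained_consensus(projections, A, x0)
```

For this instance the iterates contract geometrically and settle at the common point 2, which lies in the intersection [1.5, 2].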
Convergence - open items
Proposition (Convergence)∗∗ Let each of the graphs Gt be strongly connected and let each A(t) be a doubly stochastic matrix such that
aii(t) ≥ η for all i and t,
aij(t) ≥ η for all j ∈ Ni(t) and all i and t,
where η > 0. Assume also that X = ∩_{i=1}^m Xi ≠ ∅. Then, the iterate sequences {xi(t)}, i ∈ [m], converge to a common point x∗ ∈ X, i.e.,
lim_{t→∞} xi(t) = x∗ for some x∗ ∈ X and all i ∈ [m]
• B-connectivity works (we have done that)
• Since all functions fi have a set of common minima when X ≠ ∅, the assumption of
doubly stochastic matrices is NOT NEEDED
• It is sufficient to have row-stochastic matrices A(t)
• Convergence for this case is not done yet
∗∗Prop. 2 in AN, A. Ozdaglar and A.P. Parrilo, 2010
Convergence Rate: Wide Open Question
Proposition (Convergence Rate)†† Let X = ∩_{i=1}^m Xi ≠ ∅ have an interior point, i.e., there is a vector x ∈ X and a scalar δ > 0 such that
{z ∈ Rn | ‖z − x‖ ≤ δ} ⊆ X.
Assuming that each graph Gt is complete and the weights aij(t) are uniform, i.e., A(t) = (1/m) 11′, with 1 being the vector of 1's, we have
∑_{i=1}^m ‖xi(t)− x∗‖² ≤ q^t ∑_{i=1}^m ‖xi(0)− x∗‖² for all t,
where
q = 1 − 1/(4R²),   R = (1/δ) ∑_{i=1}^m ‖xi(0)− x∗‖.
• TOO STRONG assumptions
• The rate should be derived under the assumptions needed for convergence (no doubly stochastic matrices) and some assumption on set regularity
• Set regularity: the interior point assumption is fine, but it is not the only option
††Prop. 3 in AN, A. Ozdaglar and A.P. Parrilo, 2010