Chance-constrained optimization with tight confidence bounds
Mark Cannon, University of Oxford
25 April 2017
1
Outline
1. Problem definition and motivation
2. Confidence bounds for randomised sample discarding
3. Posterior confidence bounds using empirical violation estimates
4. Randomized algorithm with prior and posterior confidence bounds
5. Conclusions and future directions
2
Chance-constrained optimization
Chance constraints limit the probability of constraint violation in optimization problems with stochastic parameters

Applications are numerous and diverse:
. . . building control, chemical process design, finance, portfolio management, production planning, supply chain management, sustainable development, telecommunications networks . . .

Long history of stochastic programming methods
e.g. Charnes & Cooper, Management Sci., 1959
Prekopa, SIAM J. Control, 1966
but exact handling of probabilistic constraints is generally intractable

This talk: approximate methods based on random samples
3
Motivation: Energy management in PHEVs
Battery: charged before trip, empty at end of trip

Electric motor: shifts engine load point to improve fuel efficiency or emissions

Control problem: optimize power flows to maximize performance (efficiency or emissions) over the driving cycle
4
Motivation: Energy management in PHEVs
[Diagram: parallel hybrid powertrain: fuel power P_f → engine (P_eng); battery (P_s) → circuit losses (P_b) → electric motor (P_em); engine and motor combine through the clutch and gearbox to drive the vehicle (P_drv)]

Parallel configuration with common engine and electric motor speed:
P_drv = P_eng + P_em,  ω_eng = ω_em = ω  (if ω ≥ ω_stall)

• Engine fuel consumption: P_f = P_f(P_eng, ω)
5
Motivation: Energy management in PHEVs
[Diagram: parallel hybrid powertrain, as above]

• Electric motor: electrical power P_b ↔ mechanical power P_em:
P_b = P_b(P_em, ω)

• Battery: stored energy flow rate P_s ↔ P_b:
P_s = P_s(P_b)

stored energy E ↔ P_s:
Ė = P_s
5
Motivation: Energy management in PHEVs
Nominal optimization, assuming known P_drv(t), ω(t) for t ∈ [0, T]:

minimize_{P_eng(τ), P_em(τ), τ ∈ [t,T]}  ∫ₜᵀ P_f(P_eng(τ), ω(τ)) dτ
subject to  Ė(τ) = P_s(τ),  τ ∈ [t, T]
            E(t) = E_now
            E(T) ≥ E_end

with  P_drv = P_eng + P_em
      P_b = P_b(P_em, ω)
      P_s = P_s(P_b)
6
Motivation: Energy management in PHEVs
P_drv(t), ω(t) depend on uncertain traffic flow and driver behaviour

[Plot: battery state of energy (between 0.5 and 0.6) vs time over an 1800 s driving cycle, for an ensemble of realisations]
Introduce feedback by using a receding horizon implementation
but how likely is it that we can meet constraints?
7
Motivation: Energy management in PHEVs
P_drv(t), ω(t) depend on uncertain traffic flow and driver behaviour

[Image: real-time traffic flow data]
7
Motivation: Energy management in PHEVs
Reformulate as a chance-constrained optimization, assuming known distributions for P_drv(t), ω(t):

minimize_{P_eng(τ), P_em(τ), τ ∈ [t,T]}  ∫ₜᵀ P_f(P_eng(τ), ω(τ)) dτ
subject to  Ė(τ) = P_s(τ),  τ ∈ [t, T]
            E(t) = E_now
            P{ E(T) < E_end } ≤ ε

for a specified probability ε
8
Chance-constrained optimization
Define the chance-constrained problem

CCP:  minimize_{x ∈ X}  cᵀx
      subject to  P{ f(x, δ) > 0 } ≤ ε

δ ∈ Δ ⊆ Rᵈ: stochastic parameter;  ε ∈ [0, 1]: maximum allowable violation probability

Assume
– X ⊆ Rⁿ compact and convex
– f(·, δ) : Rⁿ → R convex, lower-semicontinuous for any δ ∈ Δ
– CCP is feasible, i.e. P{ f(x, δ) > 0 } ≤ ε for some x ∈ X.
9
Chance-constrained optimization
Computational difficulties with chance constraints:

1. Prohibitive computation to determine feasibility of a given x:
P{ f(x, δ) > 0 } = ∫_Δ 1_{f(x,δ)>0}(δ) F(dδ)

– Closed-form expressions are only available in special cases
e.g. if δ ∼ N(0, I) (Gaussian) and f(x, δ) = bᵀδ − 1 + (d + Dᵀδ)ᵀx (affine), then
P{ f(x, δ) > 0 } ≤ ε  ⟺  Φ_N⁻¹(1 − ε) ‖b + Dx‖ ≤ 1 − dᵀx
(Φ_N = standard normal c.d.f.)

– In general, multidimensional integration is needed (e.g. Monte Carlo methods)
10
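The affine Gaussian case gives a closed form against which a sample-based estimate of V(x) = P{f(x, δ) > 0} can be checked. Below is a minimal Monte Carlo sketch, assuming numpy/scipy; the data b, d, D and the point x are arbitrary illustrative values, not from the talk.

```python
# Monte Carlo estimate of the violation probability for the affine-Gaussian
# constraint f(x, delta) = b'delta - 1 + (d + D'delta)'x, compared with the
# closed form P{f > 0} = 1 - Phi_N((1 - d'x) / ||b + Dx||).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
n, dim = 3, 4                                   # decision / parameter dimensions
b, d = rng.normal(size=dim), 0.1 * rng.normal(size=n)
D = 0.1 * rng.normal(size=(dim, n))
x = 0.1 * rng.normal(size=n)

# f(x, delta) = (b + D x)' delta - (1 - d' x)
deltas = rng.normal(size=(200_000, dim))        # i.i.d. samples, delta ~ N(0, I)
f = deltas @ (b + D @ x) - (1.0 - d @ x)
V_mc = np.mean(f > 0)                           # empirical violation probability

V_exact = 1.0 - norm.cdf((1.0 - d @ x) / np.linalg.norm(b + D @ x))
print(V_mc, V_exact)                            # should agree to roughly 1e-3
```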
Chance-constrained optimization
Computational difficulties with chance constraints:

2. The feasible set may be nonconvex even if f(·, δ) is convex for all δ ∈ Δ
e.g. Φ_N⁻¹(1 − ε) ‖b + Dx‖ ≤ 1 − dᵀx defines a feasible set that is
convex if ε ≤ 0.5, nonconvex if ε > 0.5

[Sketches: feasible sets of x for ε < 0.5 (convex) and ε > 0.5 (nonconvex)]
10
Sample approximation
Define the multisample ω_m = {δ⁽¹⁾, …, δ⁽ᵐ⁾} for i.i.d. δ⁽ⁱ⁾ ∈ Δ

Define the sample approximation of CCP as

SP:  minimize_{x ∈ X, ω ⊆ ω_m}  cᵀx
     subject to  f(x, δ) ≤ 0 for all δ ∈ ω
                 |ω| ≥ q

(m − q discarded constraints, m ≥ q ≥ n)

x*(ω*): optimal solution for x;  ω*: optimal solution for ω
11
Sample approximation
• Solutions of SP converge to the solution of CCP as m → ∞ if q = ⌈m(1 − ε)⌉

• Approximation accuracy is characterized via bounds on:
– the probability that the solution of SP is feasible for CCP
– the probability that the optimal objective of SP exceeds that of CCP

Luedtke & Ahmed, SIAM J. Optim., 2008
Calafiore, SIAM J. Optim., 2010
Campi & Garatti, J. Optim. Theory Appl., 2011

• SP can be formulated as a mixed integer program with a convex continuous relaxation
12
Example: minimal bounding circle
Smallest circle containing a given probability mass:

CCP:  minimize_{c,R}  R  subject to  P{ ‖c − δ‖ > R } ≤ ε

SP:  minimize_{c,R,ω⊆ω_m}  R  subject to  ‖c − δ‖ ≤ R for all δ ∈ ω,  |ω| ≥ q

e.g. ε = 0.5, m = 1000, q = 500, with SP solved for one realisation ω₁₀₀₀ of δ ∼ N(0, I):

       R      c
CCP    1.386  (0, 0)
SP     1.134  (0.003, 0.002)
13
Example: minimal bounding circle
Bounds on the confidence that the SP solution x*(ω*) with m = 1000, q = 500 is feasible for CCP(ε), from Calafiore, 2010:

[Plot: confidence vs violation probability ε ∈ [0.4, 0.6]; the 0.05 and 0.95 confidence levels are attained at values of ε separated by ε₀.₉₅ − ε₀.₀₅ = 0.11]

⟹ with 90% probability: the violation probability of x*(ω*) lies between 0.47 and 0.58
14
Example: minimal bounding circle
Smallest circle containing a given probability mass – observations:

– SP computation grows rapidly with m
– Bounds on the confidence of feasibility of x*(ω*) for CCP are not tight (unless q = m or the probability distribution is a specific worst case), because the SP solution has poor generalization properties
  e.g. m = 20, q = 10: [sketch]
– Randomized sample discarding improves generalization
15
Outline
1. Problem definition and motivation
2. Confidence bounds for randomised sample discarding
3. Posterior confidence bounds using empirical violation estimates
4. Randomized algorithm with prior and posterior confidence bounds
5. Conclusions and future directions
16
Level-q subsets of a multisample
Define the sampled problem for given ω ⊆ ω_m as

SP(ω):  minimize_{x ∈ X}  cᵀx  subject to  f(x, δ) ≤ 0 for all δ ∈ ω

x*(ω): optimal solution of SP(ω)

Assumptions:
(i) SP(ω) is feasible for any ω ⊆ ω_m with probability (w.p.) 1
(ii) f(x*(ω), δ) ≠ 0 for all δ ∈ ω_m \ ω w.p. 1
17
Level-q subsets of a multisample
Define the set of level-q subsets of ω_m, for q ≤ m, as

Ω_q(ω_m) = { ω ⊆ ω_m : |ω| = q and f(x*(ω), δ) > 0 for all δ ∈ ω_m \ ω }

x*(ω) for ω ∈ Ω_q(ω_m) is called a level-q solution

e.g. minimal bounding circle problem, m = 20, q = 10:
[sketches of candidate subsets ω, several with ω ∈ Ω₁₀(ω₂₀) and a final one with ω ∉ Ω₁₀(ω₂₀)]
18
Essential constraint set
Es(ω) = {δ^(i₁), …, δ^(i_k)} ⊆ ω is an essential set of SP(ω) if
(i) x*({δ^(i₁), …, δ^(i_k)}) = x*(ω),
(ii) x*(ω \ δ) ≠ x*(ω) for all δ ∈ {δ^(i₁), …, δ^(i_k)}.

e.g. minimal bounding circle problem:
[sketch, with the essential constraints (the samples on the circle boundary) marked]
19
Support dimension
Define:
maximum support dimension:  ζ̄ = ess sup_{ω ∈ Δᵐ, m ≥ 1} |Es(ω)|
minimum support dimension:  ζ = ess inf_{ω ∈ Δᵐ, m ≥ ζ̄} |Es(ω)|

then ζ̄ ≤ dim(x) = n and ζ ≥ 1

e.g. minimal bounding circle problem:
[sketches with |Es(ω)| = 2 = ζ (two diametrically opposite support points) and |Es(ω)| = 3 = ζ̄]

20
Confidence bounds for randomly selected level-q subsets
Theorem 1:
For any ε ∈ [0, 1] and q ∈ [ζ̄, m],
  P^m{ V(x*(ω)) ≤ ε | ω ∈ Ω_q(ω_m) } ≥ Φ(q − ζ̄; m, 1 − ε).
For any ε ∈ [0, 1] and q ∈ [ζ, m],
  P^m{ V(x*(ω)) ≤ ε | ω ∈ Ω_q(ω_m) } ≤ Φ(q − ζ; m, 1 − ε).

where
violation probability:  V(x) = P{ f(x, δ) > 0 }
binomial c.d.f.:  Φ(n; N, p) = Σ_{i=0}^{n} C(N, i) pⁱ (1 − p)^{N−i}
(C(N, i) = binomial coefficient)
21
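Theorem 1's bounds are just binomial c.d.f. evaluations, so they are cheap to compute. A small sketch, assuming scipy, with ζ = 2, ζ̄ = 3 as in the bounding-circle example:

```python
# Evaluating the Theorem 1 confidence bounds.
# Phi(n; N, p) is the binomial c.d.f., i.e. scipy.stats.binom.cdf(n, N, p).
from scipy.stats import binom

def lower_bound(eps, m, q, zeta_bar):
    # P^m{ V(x*(w)) <= eps | w in Omega_q(w_m) } >= Phi(q - zeta_bar; m, 1 - eps)
    return binom.cdf(q - zeta_bar, m, 1.0 - eps)

def upper_bound(eps, m, q, zeta):
    # P^m{ V(x*(w)) <= eps | w in Omega_q(w_m) } <= Phi(q - zeta; m, 1 - eps)
    return binom.cdf(q - zeta, m, 1.0 - eps)

m, q, zeta, zeta_bar = 1000, 500, 2, 3   # bounding-circle example values
for eps in (0.48, 0.50, 0.52):
    print(eps, lower_bound(eps, m, q, zeta_bar), upper_bound(eps, m, q, zeta))
```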
Confidence bounds for randomly selected level-q subsets
Proof of Thm. 1 relies on . . .

Lemma 1: For any v ∈ [0, 1] and k ∈ [ζ, ζ̄],
  P^m{ V(x*(ω_k)) ≤ v | |Es(ω_k)| = k } = F(v) = vᵏ
(e.g. Calafiore, 2010)

Proof (sketch): for all q ∈ [k, m],
P^m{ ω_k = Es(ω_q) ∩ V(x*(ω_k)) = v | |Es(ω_k)| = k } = (1 − v)^{q−k} dF(v)
⟹ P^m{ ω_k = Es(ω_q) | |Es(ω_k)| = k } = ∫₀¹ (1 − v)^{q−k} dF(v)

e.g. k = 3, q = 10: [sketch]

hence ∫₀¹ (1 − v)^{q−k} dF(v) = C(q, k)⁻¹ (a Hausdorff moment problem)
22
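The moment identity at the end of the proof sketch can be checked symbolically for particular (q, k); a one-off verification with sympy (illustrative values q = 10, k = 3):

```python
# Check: integral_0^1 (1 - v)^(q-k) dF(v), with F(v) = v^k, equals 1 / C(q, k).
import sympy as sp

v = sp.symbols('v', positive=True)
q, k = 10, 3
lhs = sp.integrate((1 - v)**(q - k) * sp.diff(v**k, v), (v, 0, 1))
print(lhs, sp.Rational(1, sp.binomial(q, k)))   # both print 1/120
```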
Confidence bounds for randomly selected level-q subsets
. . . and

Lemma 2: For any k ∈ [ζ, ζ̄] and q ∈ [k, m],
P^m{ ω_k = Es(ω) for some ω ∈ Ω_q(ω_m) | |Es(ω_k)| = k } = k C(m−k, q−k) B(m−q+k, q−k+1)
(B(·, ·) = beta function)

Note:
– ω_k = {δ⁽¹⁾, …, δ⁽ᵏ⁾} is statistically equivalent to a randomly selected subset of ω_m
– { ω_k = Es(ω) for some ω ∈ Ω_q(ω_m) } given |Es(ω_k)| = k occurs iff:
q − k of the samples in ω_m \ ω_k satisfy f(x*(ω_k), δ) ≤ 0 and the remaining m − q samples satisfy f(x*(ω_k), δ) > 0

hence Lem. 2 follows from Lem. 1 and the law of total probability
23
Confidence bounds for randomly selected level-q subsets
Proof of Thm. 1 (sketch):

• P^m{ ω_k = Es(ω) for some ω ∈ Ω_q(ω_m) ∩ V(x*(ω_k)) ≤ ε | |Es(ω_k)| = k }
  = k C(m−k, q−k) B(ε; m−q+k, q−k+1)
(B(·, ·, ·) = incomplete beta function)

• P^m{ V(x*(ω_k)) ≤ ε | ω_k = Es(ω) for some ω ∈ Ω_q(ω_m) }
  = P^m{ ω_k = Es(ω) for some ω ∈ Ω_q(ω_m) ∩ V(x*(ω_k)) ≤ ε | |Es(ω_k)| = k }
    / P^m{ ω_k = Es(ω) for some ω ∈ Ω_q(ω_m) | |Es(ω_k)| = k }
  = Φ(q − k; m, 1 − ε)   (Bayes' law)

• Hence
P^m{ V(x*(ω)) ≤ ε | ω ∈ Ω_q(ω_m) ∩ |Es(ω)| = k } = Φ(q − k; m, 1 − ε)

Thm. 1 follows from the law of total probability, since, for all k ∈ [ζ, ζ̄],
Φ(q − ζ̄; m, 1 − ε) ≤ Φ(q − k; m, 1 − ε) ≤ Φ(q − ζ; m, 1 − ε)

24
Confidence bounds for randomly selected level-q subsets
Consider the relationship between optimal costs: J°(ε) of CCP, J*(ω) of SP(ω)

Prop. 1:  P^m{ J*(ω) ≤ J°(ε) | ω ∈ Ω_q(ω_m) } ≥ Φ(q − ζ̄; m, 1 − ε)

Proof: J*(ω) ≤ J°(ε) if the solution x*(ω) of SP(ω) is feasible for CCP, so
P^m{ V(x*(ω)) ≤ ε | ω ∈ Ω_q(ω_m) } ≤ P^m{ J*(ω) ≤ J°(ε) | ω ∈ Ω_q(ω_m) }

Prop. 2:  P^m{ J*(ω) > J°(ε) | |ω| = q } ≤ Φ(q − 1; q, 1 − ε)

Proof: J*(ω) ≤ J°(ε) if the solution x°(ε) of CCP is feasible for SP(ω), i.e. if f(x°(ε), δ) ≤ 0 for all δ ∈ ω, so
P^m{ J*(ω) ≤ J°(ε) } ≥ P{ f(x°(ε), δ) ≤ 0 }^q ≥ (1 − ε)^q = 1 − Φ(q − 1; q, 1 − ε)
25
Comparison with deterministic discarding strategies
For optimal sample discarding (i.e. SP) or a greedy heuristic we have
P^m{ V(x*(ω*)) ≤ ε } ≥ Ψ(q, ζ̄; m, ε),
Ψ(q, ζ̄; m, ε) = 1 − C(m−q+ζ̄−1, m−q) Φ(m−q+ζ̄−1; m, ε)
Calafiore (2010), Campi & Garatti (2011)

This bound is looser than the lower bound for random sample discarding (i.e. SP(ω), ω ∈ Ω_q(ω_m)), since
1 − Ψ(q, ζ; m, ε) = C(m−q+ζ−1, m−q) [1 − Φ(q − ζ; m, 1 − ε)]
implies
• Ψ(q, ζ; m, ε) ≤ Φ(q − ζ; m, 1 − ε)
• Ψ(q, ζ; m, ε) = Φ(q − ζ; m, 1 − ε) only if ζ = 1 or q = m (in which case |Ω_q(ω_m)| = 1)

26
Comparison with deterministic discarding strategies
The lower confidence bound for optimal sample discarding corresponds to the worst-case assumption:
P^m{ V(x*(ω)) > ε } = 0 for all ω ∈ Ω_q(ω_m) such that ω ≠ ω*

since

⋆ Thm. 1 implies
(1 / E^m{ |Ω_q(ω_m)| }) E^m{ Σ_{ω ∈ Ω_q(ω_m)} 1{ V(x*(ω)) > ε } } ≥ 1 − Φ(q − ζ; m, 1 − ε)

⋆ Lem. 2 implies
E^m{ |Ω_q(ω_m)| } ≤ C(m − q + ζ − 1, m − q)
27
Comparison with deterministic discarding strategies
Confidence bounds for m = 500 with q = ⌈0.75m⌉ = 375, ζ = 1, ζ̄ = 10

[Plot: confidence vs violation probability ε ∈ [0.15, 0.5], showing the upper bound Φ(q − ζ; m, 1 − ε), the lower bound Φ(q − ζ̄; m, 1 − ε), and the deterministic-discarding bound Ψ(q, ζ̄; m, ε)]
28
Comparison with deterministic discarding strategies
Values of ε lying on the 5% and 95% confidence bounds for varying m, with q = ⌈0.75m⌉, ζ = 1, ζ̄ = 10

[Plots: ε₉₅, ε₅ and ε′₉₅ vs sample size m ∈ [10², 10⁸] (left); the gaps ε₉₅ − ε₅ and ε′₉₅ − ε₅ vs m, log scale (right)]

The probabilities ε₅, ε₉₅ and ε′₉₅ are defined by
0.05 = Φ(q − ζ; m, 1 − ε₅),  0.95 = Φ(q − ζ̄; m, 1 − ε₉₅),  0.95 = Ψ(q, ζ̄; m, ε′₉₅)
29
Example: minimal bounding circle
Empirical distribution of the violation probability for the smallest bounding circle, with δ ∼ N(0, I), q = 80, m = 100, and 500 realisations of ω_m

[Plot: probability of V(x) ≤ ε vs ε, comparing the empirical distributions for x*(ω), ω ∈ Ω_q(ω_m) and for x = x*(ω*) with the bounds Φ(78; 100, 1 − ε), Φ(77; 100, 1 − ε) and Ψ(80, 3; 100, ε)]
30
Example: minimal bounding circle
Empirical cost distributions for the smallest bounding circle, with δ ∼ N(0, I), q = 80, m = 100, and 500 realisations of ω_m

[Plot: probability of J*(ω) ≥ J°(ε) vs ε, comparing the empirical distributions for ω ∈ Ω_q(ω_m) and ω = ω* with the bounds Φ(99; 100, 1 − ε), Φ(77; 100, 1 − ε) and Ψ(80, 3; 100, ε)]
31
Outline
1. Problem definition and motivation
2. Confidence bounds for randomised sample discarding
3. Posterior confidence bounds using empirical violation estimates
4. Randomized algorithm with prior and posterior confidence bounds
5. Conclusions and future directions
32
Random constraint selection strategy
How can we select ω ∈ Ω_q(ω_m) at random?

Summary of approach:
For given ω_m, let ω_r = {δ⁽¹⁾, …, δ⁽ʳ⁾}, and
– solve SP(ω_r) for x*(ω_r)
– count q = |{ δ ∈ ω_m : f(x*(ω_r), δ) ≤ 0 }|

then { δ ∈ ω_m : f(x*(ω_r), δ) ≤ 0 } ∈ Ω_q(ω_m)

Choose r to maximize the probability that q lies in the desired range

e.g. r = 10, m = 50, q = 13 + 10 = 23
[sketch: bounding circle of the first r samples, plus the additional samples it happens to satisfy]
33
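A runnable sketch of this selection step for the bounding-circle example (illustrative code, not from the talk): SP(ω_r) is the smallest enclosing circle of the first r samples, found here by brute-force enumeration of 2- and 3-point candidate circles, and q is the count of satisfied constraints.

```python
import itertools
import numpy as np

def circle_2pts(a, b):
    c = (a + b) / 2.0
    return c, np.linalg.norm(a - c)

def circle_3pts(a, b, c):
    # circumcircle centre from the perpendicular-bisector equations
    A = 2.0 * np.array([b - a, c - a])
    rhs = np.array([b @ b - a @ a, c @ c - a @ a])
    if abs(np.linalg.det(A)) < 1e-12:
        return None                      # collinear points: no circumcircle
    ctr = np.linalg.solve(A, rhs)
    return ctr, np.linalg.norm(a - ctr)

def min_enclosing_circle(pts):
    # the smallest enclosing circle is determined by 2 or 3 points, so enumerate
    best = None
    for k in (2, 3):
        for sub in itertools.combinations(pts, k):
            res = circle_2pts(*sub) if k == 2 else circle_3pts(*sub)
            if res is None:
                continue
            ctr, R = res
            if np.all(np.linalg.norm(pts - ctr, axis=1) <= R * (1 + 1e-9)):
                if best is None or R < best[1]:
                    best = (ctr, R)
    return best

rng = np.random.default_rng(0)
m, r = 50, 10
w_m = rng.standard_normal((m, 2))        # multisample, delta ~ N(0, I)
ctr, R = min_enclosing_circle(w_m[:r])   # x*(w_r) from the first r samples
q = int(np.sum(np.linalg.norm(w_m - ctr, axis=1) <= R))
print(q)   # the q satisfied samples form a level-q subset of w_m
```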
Posterior confidence bounds
Define  Θ_r(ω_m) = |{ δ ∈ ω_m : f(x*(ω_r), δ) ≤ 0 }|

Theorem 2:
For any ε ∈ [0, 1] and ζ̄ ≤ r ≤ q ≤ m,
  P^m{ V(x*(ω_r)) ≤ ε | Θ_r(ω_m) = q } ≥ Φ(q − ζ̄; m, 1 − ε).
For any ε ∈ [0, 1] and ζ ≤ r ≤ q ≤ m,
  P^m{ V(x*(ω_r)) ≤ ε | Θ_r(ω_m) = q } ≤ Φ(q − ζ; m, 1 − ε).

⋆ Θ_r(ω_m) provides an empirical estimate of the violation probability
⋆ Thm. 2 gives a posteriori confidence bounds (i.e. for given Θ_r(ω_m))
⋆ These bounds are tight for general probability distributions, i.e.
– they cannot be improved if |Es(ω)| can take any value in [ζ, ζ̄]
– they are exact for problems with ζ = ζ̄
34
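Theorem 2 turns an observed count q into a posterior confidence interval on V(x*(ω_r)) by inverting the binomial c.d.f. in ε. A bisection sketch, assuming scipy, with the bounding-circle values m = 100, q = 80, ζ = 2, ζ̄ = 3 used for illustration:

```python
# Posterior confidence interval from the Theorem 2 bounds (reconstructed).
from scipy.stats import binom

def eps_at(target, m, q, z, lo=0.0, hi=1.0, iters=60):
    # smallest eps with Phi(q - z; m, 1 - eps) >= target
    # (the c.d.f. is nondecreasing in eps, so bisection applies)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if binom.cdf(q - z, m, 1.0 - mid) >= target:
            hi = mid
        else:
            lo = mid
    return hi

m, q, zeta, zeta_bar = 100, 80, 2, 3
# with ~90% probability V(x*(w_r)) lies between these two values
print(eps_at(0.05, m, q, zeta), eps_at(0.95, m, q, zeta_bar))
```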
Probability of selecting a level-q subset of ω_m

Selection of r is based on

Lemma 3: For any ζ̄ ≤ r ≤ q ≤ m,
P^m{ Θ_r(ω_m) = q } ≥ C(m−r, q−r) min_{k∈[ζ, ζ̄]} B(m−q+k, q−k+1) / B(k, r−k+1)
P^m{ Θ_r(ω_m) = q } ≤ C(m−r, q−r) max_{k∈[ζ, ζ̄]} B(m−q+k, q−k+1) / B(k, r−k+1)

Proof (sketch):
1. From Lem. 1 & Lem. 2, for k ∈ [ζ, ζ̄] we have
P^m{ V(x*(ω_r)) = v | |Es(ω_r)| = k } = (1 − v)^{r−k} v^{k−1} / B(k, r−k+1) dv
2. Θ_r(ω_m) = q iff
q − r of the m − r samples in ω_m \ ω_r satisfy f(x*(ω_r), δ) ≤ 0 and the remaining m − q samples satisfy f(x*(ω_r), δ) > 0

Hence Lem. 3 follows from 1 and the law of total probability

35
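One internal consistency check on the (reconstructed) Lemma 3 expression: when ζ = ζ̄ = k the two bounds coincide and P^m{Θ_r(ω_m) = q} should sum to one over q = r, …, m. A quick numerical check, assuming scipy:

```python
# Lemma 3 sanity check with zeta = zeta_bar = k: probabilities sum to 1.
import numpy as np
from scipy.special import betaln, gammaln

def log_comb(n, k):
    return gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)

m, r, k = 200, 20, 3
qs = np.arange(r, m + 1)
logp = log_comb(m - r, qs - r) + betaln(m - qs + k, qs - k + 1) - betaln(k, r - k + 1)
print(np.exp(logp).sum())   # ~1.0
```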
Posterior confidence bounds (contd.)
Proof of Thm. 2 (sketch):

• P^m{ Θ_r(ω_m) = q ∩ V(x*(ω_r)) ≤ ε | |Es(ω_r)| = k }
  = C(m−r, q−r) B(ε; m−q+k, q−k+1) / B(k, r−k+1)

• Hence from Lem. 3 and Bayes' law
P^m{ V(x*(ω_r)) ≤ ε | Θ_r(ω_m) = q ∩ |Es(ω_r)| = k } = Φ(q − k; m, 1 − ε)

so the law of total probability implies
P^m{ V(x*(ω_r)) ≤ ε | Θ_r(ω_m) = q } ≥ Φ(q − ζ̄; m, 1 − ε) Σ_{k=ζ}^{ζ̄} P^m{ |Es(ω_r)| = k }
P^m{ V(x*(ω_r)) ≤ ε | Θ_r(ω_m) = q } ≤ Φ(q − ζ; m, 1 − ε) Σ_{k=ζ}^{min{r, ζ̄}} P^m{ |Es(ω_r)| = k }

36
Outline
1. Problem definition and motivation
2. Confidence bounds for randomised sample discarding
3. Posterior confidence bounds using empirical violation estimates
4. Randomized algorithm with prior and posterior confidence bounds
5. Conclusions and future directions
37
Parameters and target confidence bounds
Consider the design of an algorithm to generate
x̂: an approximate CCP solution
q̂: the number of constraints satisfied out of m sampled constraints

Impose prior confidence bounds:
V(x̂) ∈ (ε̲, ε̄] with probability ≥ p_prior

Impose posterior confidence bounds, given the value of q̂:
|V(x̂) − q̂/m| ≤ β with probability ≥ p_post
38
Parameters and target confidence bounds
q̂ cannot be specified directly, but, for given m and q̲, q̄:

• Lem. 3 gives the probability, p_trial, of Θ_r(ω_m) ∈ [q̲, q̄] for given r
• hence SP(ω_r) must be solved N_trial times in order to ensure that q̂ ∈ [q̲, q̄] with a given prior probability
• choose r = r*, the maximizer of p_trial (possibly subject to r ≤ r_max), in order to minimize N_trial

Selection of m and q̲, q̄:

• choose m so that the posterior confidence bounds are met:
Φ(m(1 − ε) − ζ̄; m, 1 − ε − β/2) ≥ ½(1 + p_post)
Φ(m(1 − ε) − ζ; m, 1 − ε + β/2) ≤ ½(1 − p_post)
• Thm. 2 allows q̲, q̄ to be chosen so that the prior confidence bounds hold
39
Randomized algorithm with tight confidence bounds
Algorithm:
(i) Given bounds ε̲, ε̄, probabilities p_prior < p_post, and sample size m, determine

q̲ = min q subject to Φ(q − ζ̄; m, 1 − ε̄) ≥ ½(1 + p_post)
q̄ = max q subject to Φ(q − ζ; m, 1 − ε̲) ≤ ½(1 − p_post)
r* = arg max_r Σ_{q=q̲}^{q̄} C(m−r, q−r) min_{k∈[ζ, ζ̄]} B(m−q+k, q−k+1) / B(k, r−k+1)
p_trial = Σ_{q=q̲}^{q̄} C(m−r*, q−r*) min_{k∈[ζ, ζ̄]} B(m−q+k, q−k+1) / B(k, r*−k+1)
N_trial = ⌈ ln(1 − p_prior/p_post) / ln(1 − p_trial) ⌉
40
Randomized algorithm with tight confidence bounds
Algorithm (cont.):
(ii) Draw N_trial multisamples ω_m⁽ⁱ⁾, and for each i = 1, …, N_trial:
– compute x*(ω_r*⁽ⁱ⁾)
– count Θ_r*(ω_m⁽ⁱ⁾) = |{ δ ∈ ω_m⁽ⁱ⁾ : f(x*(ω_r*⁽ⁱ⁾), δ) ≤ 0 }|

(iii) Determine i* ∈ {1, …, N_trial}:
i* = arg min_{i ∈ {1,…,N_trial}} | ½(q̲ + q̄) − Θ_r*(ω_m⁽ⁱ⁾) |,
and return
the solution estimate, x̂ = x*(ω_r*⁽ⁱ*⁾)
the number of satisfied constraints, q̂ = Θ_r*(ω_m⁽ⁱ*⁾)
40
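A sketch of step (i) in Python (scipy assumed). The formulas follow the reconstruction above, so the outputs should be treated as indicative; with the Example I data quoted on the later slides (m = 10⁵, (ε̲, ε̄] = (0.19, 0.21], ζ = 2, ζ̄ = 5, p_prior = 0.9, p_post = 0.95), the resulting q̲, q̄ should land at or near the quoted 79257 and 80759, while r* and N_trial may differ slightly from the quoted 15 and 84.

```python
import numpy as np
from scipy.special import betaln, gammaln
from scipy.stats import binom

def log_comb(n, k):
    return gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)

def p_q(m, r, q, zeta, zeta_bar):
    # lower bound on P{ Theta_r(w_m) = q } from Lemma 3
    ks = np.arange(zeta, zeta_bar + 1)
    logs = log_comb(m - r, q - r) + betaln(m - q + ks, q - ks + 1) - betaln(ks, r - ks + 1)
    return np.exp(logs).min()

def design(m, eps_lo, eps_hi, zeta, zeta_bar, p_prior, p_post, r_grid):
    qs = np.arange(zeta_bar, m + 1)
    lo_ok = binom.cdf(qs - zeta_bar, m, 1 - eps_hi) >= 0.5 * (1 + p_post)
    hi_ok = binom.cdf(qs - zeta, m, 1 - eps_lo) <= 0.5 * (1 - p_post)
    q_lo, q_hi = int(qs[lo_ok][0]), int(qs[hi_ok][-1])
    def p_trial(r):
        return sum(p_q(m, r, q, zeta, zeta_bar) for q in range(q_lo, q_hi + 1))
    r_star = max(r_grid, key=p_trial)          # r restricted to r <= q_lo here
    pt = p_trial(r_star)
    n_trial = int(np.ceil(np.log(1 - p_prior / p_post) / np.log(1 - pt)))
    return q_lo, q_hi, r_star, pt, n_trial

# Example I parameters; r searched over a small illustrative grid
print(design(10**5, 0.19, 0.21, 2, 5, 0.9, 0.95, range(5, 31)))
```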
Randomized algorithm with tight confidence bounds
Proposition 3: x̂, q̂ satisfy the prior confidence bounds
P^{N_trial·m}{ V(x̂) ∈ (ε̲, ε̄] } ≥ p_prior
and the posterior confidence bounds
Φ(q − ζ̄; m, 1 − ε) ≤ P^{N_trial·m}{ V(x̂) ≤ ε | q̂ = q } ≤ Φ(q − ζ; m, 1 − ε).

The posterior bounds follow directly from Thm. 2.

The choice of q̲, q̄ gives P^m{ V(x̂) ∈ (ε̲, ε̄] | q̂ = q } ≥ p_post for all q ∈ [q̲, q̄]
⟹ P^{N_trial·m}{ V(x̂) ∈ (ε̲, ε̄] } ≥ p_post · P^{N_trial·m}{ q̂ ∈ [q̲, q̄] }
but the definition of i* implies P^{N_trial·m}{ q̂ ∈ [q̲, q̄] } ≥ 1 − (1 − p_trial)^{N_trial}
⟹ P^{N_trial·m}{ V(x̂) ∈ (ε̲, ε̄] } ≥ p_post (1 − (1 − p_trial)^{N_trial})
so the choice of N_trial ensures the prior bound
41
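A one-line numeric check of the last step, using the trial parameters quoted for Example I on the later slides (p_post = 0.95, p_trial = 0.0365, N_trial = 84):

```python
# p_post * (1 - (1 - p_trial)^N_trial) must be at least p_prior = 0.9
p_post, p_trial, n_trial = 0.95, 0.0365, 84
print(p_post * (1 - (1 - p_trial) ** n_trial))   # ~0.908 >= 0.9
```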
Randomized algorithm with tight confidence bounds
• Large m is computationally cheap (hence small β and p_post ≈ 1)
• p_trial is a tight bound on the probability of q̂ ∈ [q̲, q̄], and is exact if ζ = ζ̄
• The posterior confidence bounds are tight for all distributions, and are exact if ζ = ζ̄
• Repeated trials and empirical violation tests have been used before, e.g. Calafiore (2017), Chamanbaz et al. (2016)
• The lower posterior confidence bound derived here coincides with Calafiore (2017) for fully supported problems (ζ = ζ̄ = n), but our approach:
– shows that this bound is exact for fully supported problems
– provides tight bounds in all other cases
42
Randomized algorithm with tight confidence bounds
Variation of r* and N_trial with ζ, ζ̄, for m = 10⁵, ε̲ = 0.19, ε̄ = 0.21, β = 0.005, and p_post = 0.5(1 + p_prior):

(ζ, ζ̄)            (2, 5)          (7, 10)        (17, 20)      (47, 50)     (97, 100)
r*                 15              40             91            241          492
N_trial (p₁…p₄)    84 109 176 291  37 48 77 128   22 29 46 76   13 16 26 43  8 11 17 29

(ζ, ζ̄)            (1, 2)          (1, 5)            (1, 10)
r*                 5               12                22
N_trial (p₁…p₄)    96 125 200 331  189 246 396 655   1022 1329 2116 3465

where p_prior takes the values p₁ = 0.9, p₂ = 0.95, p₃ = 0.99, p₄ = 0.999
43
Example I: minimal bounding hypersphere
Find the smallest hypersphere in R⁴ containing δ ∼ N(0, I) with probability 0.8

Support dimension: ζ = 2, ζ̄ = 5

Prior bounds: V(x̂) ∈ (0.19, 0.21] with probability p_prior = 0.9
Posterior bounds: |V(x̂) − q̂/m| ≤ 0.005 with probability p_post = 0.95

Parameters: m = 10⁵, q̲ = 79257, q̄ = 80759, r* = 15, p_trial = 0.0365, N_trial = 84

Compare SP solved approximately using greedy discarding (assuming one constraint discarded per iteration), with m = 420, q = 336:
V(x*(ω*)) ∈ (0.173, 0.332] with probability 0.9
44
Example I: minimal bounding hypersphere
Posterior confidence bounds given q̂ ∈ [q̲, q̄]:
P{ V(x̂) ≤ ε | q̂ = q̲ } (blue),  P{ V(x̂) ≤ ε | q̂ = q̄ } (red)

[Plot: the bounds Φ(q − ζ̄; m, 1 − ε) and Φ(q − ζ; m, 1 − ε) vs ε ∈ [0.185, 0.215], for q = q̲ and q = q̄]
45
Example I: minimal bounding hypersphere
[Plot: probability vs proportion of sampled constraints satisfied, q̂/m ∈ [0.785, 0.815]: observed frequency (blue) against the worst-case bound (red)]

– Empirical probability distribution of q̂ (blue) for 1000 repetitions
– Worst-case bound F (red): P{ |q̂/m − 0.8| ≤ t } ≥ F(0.8 + t) − F(0.8 − t)
46
Example II: finite horizon robust control design
Control problem (from Calafiore 2017):

Determine a control sequence to drive the system state to a target over a finite horizon with high probability, despite model uncertainty

Discrete-time uncertain linear system:
z(t+1) = A(δ)z(t) + Bu(t),  A(δ) = A₀ + Σ_{i,j=1}^{n_z} δ_{i,j} e_i e_jᵀ
δ_{i,j} ∈ [−ρ, ρ] uniformly distributed
47
Example II: finite horizon robust control design
Control objective:

CCP:  minimize_{γ, u}  γ  subject to  P{ ‖z_T − z(T, δ)‖² + ‖u‖² > γ } ≤ ε

with  u = (u(0), …, u(T−1)): predicted control sequence
      z(T, δ) = A(δ)^T z(0) + [A(δ)^{T−1}B ⋯ B] u
      z_T: target state

[Plot: a computed control input sequence u(t), t = 0, …, 10]
48
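A minimal simulation sketch of this uncertain model, assuming numpy; A₀, B, z(0), z_T and the candidate input sequence u are illustrative stand-ins rather than the data of Calafiore (2017):

```python
# Sample A(delta), roll the dynamics forward, and evaluate the cost
# ||z_T - z(T, delta)||^2 + ||u||^2 whose gamma-level set defines f <= 0.
import numpy as np

rng = np.random.default_rng(2)
nz, T, rho = 6, 10, 1e-3
A0 = np.diag(0.9 * np.ones(nz)) + np.diag(0.1 * np.ones(nz - 1), 1)
B = np.eye(nz, 1)                      # single input entering the first state
z0 = np.ones(nz)
z_target = np.zeros(nz)
u = 0.1 * rng.standard_normal(T)       # some fixed candidate input sequence

def cost(u, delta):
    A = A0 + delta                     # delta is the nz-by-nz perturbation
    z = z0.copy()
    for t in range(T):
        z = A @ z + (B * u[t]).ravel()
    return np.linalg.norm(z_target - z) ** 2 + np.linalg.norm(u) ** 2

# Monte Carlo: pick a trial gamma with empirical violation ~0.005
deltas = rng.uniform(-rho, rho, size=(5000, nz, nz))
vals = np.array([cost(u, d) for d in deltas])
gamma = np.quantile(vals, 0.995)
print(gamma, np.mean(vals > gamma))
```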
Example II: finite horizon robust control design
Problem dimensions: n_z = 6, T = 10, n = 11, ζ = 1, ζ̄ = 3, with parameter ρ = 10⁻³

Single-sided prior probability bounds with ε̄ = 0.005:
P{ V(x̂) ∈ (0, ε̄] } ≥ p_prior = 0.99

Posterior uncertainty β = 0.003 and probability p_post = 1 − 10⁻¹²  ⟹  m = 63 × 10³

Choose the number of sampled constraints in each trial to maximize p_trial:
r* = 2922, p_trial = 0.482, N_trial = 9
(if r* is limited to 2000, we get p_trial = 0.414, N_trial = 9)
49
Example II: finite horizon robust control design
Compare the number of trials needed for q̂ ∈ [q̲, q̄]:
expected using the bounds of Calafiore (2017): 9.6
expected using the proposed approach (1/p_trial): 2.4
observed: 1.4

Residual conservatism is due to the use of the bounds ζ = 1, ζ̄ = 3 rather than the actual distribution of the support dimension:

support dimension:    1    2    3
observed frequency:   5%   59%  36%
50
Future directions
• Alternative methods for randomly selecting level-q solutions

• Methods of checking feasibility of suboptimal points

• Estimation of violation probability over a set of parameters, with application to computing invariant sets for constrained dynamics
[Figure: a set V computed by optimization, samples drawn from its distribution, and 100 convex hulls of random selections of N = 11 samples, each supporting confidence statements at level 0.999]
• Application to chance-constrained energy management in PHEVs
51
Conclusions
• Solutions of sampled convex optimization problems with randomised sample selection strategies have tighter confidence bounds than optimal sample discarding strategies

• Randomised sample selection can be performed by solving a robust sampled optimization and counting violations of additional sampled constraints

• We propose a randomised algorithm for chance-constrained convex programming with tight a priori and a posteriori confidence bounds
52
Thank you for your attention!
Acknowledgements: James Fleming, Matthias Lorenzen, Johannes Buerger, Rainer Manuel Schaich . . .
53
Research directions
• Optimal control of uncertain systems
– chance-constrained optimization
– stochastic predictive control

• Robust control design
– invariant sets and uncertainty parameterization

• Applications:
– energy management in hybrid electric vehicles (BMW)
– ecosystem-based fisheries management (BAS)
54