Approximate Optimality with Bounded Regret in Dynamic …busic/slides/DynamicMatching... · 2016....

Approximate Optimality with Bounded Regret in

Dynamic Matching Models

Ana Busic

Inria ParisCS Department of Ecole normale superieure

Joint work withSean Meyn

University of Florida

andIvo Adan, Varun Gupta, Jean Mairesse, and Gideon Weiss

Real-Time Decision MakingSimons Institute, Berkeley, Jun. 27 - Jul. 1, 2016

1 / 19

Outline

1 Background

2 Bipartite matching model

3 OptimizationAverage Cost CriterionWorkloadWorkload RelaxationAsymptotic optimality

4 Final remarks

2 / 19

Background

Bipartite Matching

(D,S,E ) bipartite graph

D(s) = {d ∈ D : (d , s) ∈ E}S(d) = {s ∈ S : (d , s) ∈ E}

2’ 1’3’

xi number of elements of type i ∈ D ∪ S

Perfect matching: m ∈ NE such that:

xd =∑

s∈S(d)

mds , ∀d ∈ D, xs =∑

d∈D(s)

mds , ∀s ∈ S

Hall’s marriage theorem (1935)

∃ perfect matching if and only if:∑d∈U xc ≤

∑s∈S(U) xs , ∀U ⊂ D∑

s∈V xs ≤∑

d∈D(V ) xd , ∀V ⊂ S

3 / 19

Background

Matching in Health-care

Kidney paired donation

Who can join this program?For recipients: If you are eligible

for a kidney transplant and are

receiving care at a transplant

center in the United States, you

can join ... You must have a living

donor who is willing and medically

able to donate his or her kidney ...

For donors: You must also be

willing to take part ...

U N I T E D N E T W O R K F O R O R G A N S H A R I N G

TA L K I N G A B O U T T R A N S P L A N TAT I O N

Need for Dynamic Matching Models

4 / 19

Background

Matching in Health-care

Kidney paired donation

Who can join this program?For recipients: If you are eligible

for a kidney transplant and are

receiving care at a transplant

center in the United States, you

can join ... You must have a living

donor who is willing and medically

able to donate his or her kidney ...

For donors: You must also be

willing to take part ...

U N I T E D N E T W O R K F O R O R G A N S H A R I N G

TA L K I N G A B O U T T R A N S P L A N TAT I O N

Need for Dynamic Matching Models

4 / 19

Background

Dynamic Matching Model: FCFS

Another Example

Boston area public housing (some 25 years ago):

Two independent infinite sequences of items.Demand / supply i.i.d.

FCFS matching policy admits product-form invariant distribution

Caldentey, Kaplan, Weiss 2009Adan, Weiss 2012Adan, B., Mairesse, Weiss 2015

5 / 19

Bipartite matching model

Dynamic Bipartite Matching Model

Multiclass queueing model – Supply/Demand play symmetric roles

Discrete time queueing model with twotypes of arrival: “supply” and “demand”.

Discrete time: at each time step there isone customer and one server that arriveinto the system, independently of the past.

Instantaneous matchings according to abipartite matching graph.Unmatched supply/demand stored in abuffer.

Given by: a matching graph, a joint probability measure µ for arrivals ofdemand/supply and a matching policy.

6 / 19

Dynamic Matching Model: Stability

For the dynamic model with i.i.d. arrivals, when is the Markovian model stable?(positive recurrent)

Necessary condition: generalization of Hall’s marriage theorem

Under this condition, certain policies are stabilizing, such as MaxWeight

Under this condition, other policies are not stabilizing

B., Gupta, Mairesse 2013.

7 / 19

Dynamic Matching Model: Approximate Optimality

Subject of this talk:

How to define ‘heavy traffic’? This requires a formulation of ‘network load’

What is the structure of an optimal policy for the model in heavy traffic?

How do we use this structure for policy design?

B., Meyn 2016

8 / 19

Necessary stability conditions

Assumption: matching graph (D,S,E ) is connected.

Necessary conditions: If the model is stable then the marginals of µ satisfy

NCond :

{µD(U) < µS(S(U)), ∀U ( DµS(V ) < µD(D(V )), ∀V ( S

Sufficient conditions: If NCond holds, then there exists a policy that is stabilizing

Prop. Given [(D,S,E ), µ], there exists an algorithm of time complexityO((|D|+ |S|)3) to decide if NCond is satisfied.

9 / 19

NCond :

9 / 19

NCond :

9 / 19

Optimization Average Cost Criterion

Optimization

Cost function c on buffer levels.

Average-cost: η = lim supN→∞

N−1∑t=0

E[c(Q(t))

Queue dynamics: Q(t + 1) = Q(t)− U(t) + A(t) , t ≥ 0Input process U represents the sequence of matching activities. Input space:

U� ={∑e∈E

neue : ne ∈ Z+

}with ue = 1i + 1j for e = (i , j) ∈ E .

X (t) = Q(t) + A(t) the state process of the MDP model,

X (t + 1) = X (t)− U(t) + A(t + 1)

The state space X� = {x ∈ Z`+ : ξ0 · x = 0} with ξ0 = (1, . . . , 1,−1, . . . ,−1).

10 / 19

Optimization

N−1∑t=0

E[c(Q(t))

]Queue dynamics: Q(t + 1) = Q(t)− U(t) + A(t) , t ≥ 0

Input process U represents the sequence of matching activities. Input space:

U� ={∑e∈E

neue : ne ∈ Z+

X (t + 1) = X (t)− U(t) + A(t + 1)

10 / 19

Optimization

N−1∑t=0

E[c(Q(t))

]Queue dynamics: Q(t + 1) = Q(t)− U(t) + A(t) , t ≥ 0Input process U represents the sequence of matching activities. Input space:

U� ={∑e∈E

neue : ne ∈ Z+

X (t + 1) = X (t)− U(t) + A(t + 1)

10 / 19

Optimization

N−1∑t=0

E[c(Q(t))

]Queue dynamics: Q(t + 1) = Q(t)− U(t) + A(t) , t ≥ 0Input process U represents the sequence of matching activities. Input space:

U� ={∑e∈E

neue : ne ∈ Z+

X (t + 1) = X (t)− U(t) + A(t + 1)

10 / 19

Optimization Workload

Workload

For any D ⊂ D, corresponding workload vector ξD defined so that

ξD · x =∑i∈D

xDi −

∑j∈S(D)

Necessary and sufficient condition for a stabilizing policy:

NCond: δD :=−ξD · α > 0 for each Dα = E[A(t)] arrival rate vector.

Why is this workload? Consistent with routing/scheduling models:

Fluid model,d

dtx(t) = −u(t) + α

The minimal time to reach the origin from x(0) = x : T ∗(x) = maxDξD ·xδD

Heavy-traffic: δD ∼ 0 for one or more D

11 / 19

Workload

ξD · x =∑i∈D

xDi −

∑j∈S(D)

Why is this workload?

Consistent with routing/scheduling models:

Fluid model,d

dtx(t) = −u(t) + α

11 / 19

Workload

ξD · x =∑i∈D

xDi −

∑j∈S(D)

Fluid model,d

dtx(t) = −u(t) + α

11 / 19

Workload

ξD · x =∑i∈D

xDi −

∑j∈S(D)

Fluid model,d

dtx(t) = −u(t) + α

11 / 19

Workload Dynamics

Fix one workload vector ξD ; denote (ξ, δ) for (ξD , δD).

Workload W (t) = ξ · X (t)

can be positive or negative.Dynamics as in other queueing models,

E[W (t + 1)−W (t) | X (t), U(t)] ≥ −δ

Achieved ⇐⇒ S(D) matches with D only.

Workload relaxation: take this as the model for control.

12 / 19

Workload Dynamics

Workload W (t) = ξ · X (t) can be positive or negative.

Dynamics as in other queueing models,

E[W (t + 1)−W (t) | X (t), U(t)] ≥ −δ

12 / 19

Workload Dynamics

Workload W (t) = ξ · X (t) can be positive or negative.Dynamics as in other queueing models,

E[W (t + 1)−W (t) | X (t), U(t)] ≥ −δ

12 / 19

Workload Dynamics

E[W (t + 1)−W (t) | X (t), U(t)] ≥ −δ

12 / 19

Workload Dynamics

E[W (t + 1)−W (t) | X (t), U(t)] ≥ −δ

12 / 19

Optimization Workload Relaxation

Relaxations

A workload relaxation takes this as the model for control:One Dimensional Workload relaxation,

W (t + 1) = W (t)− δ + I (t)︸︷︷︸Idleness ≥ 0

+ ∆(t + 1)︸︷︷︸Zero mean

Effective cost c : R→ R+: Given a cost function c for Q,

c(w) = min{c(x) : ξ · x = w}

Piecewise linear if c is linear: c(w) = max(c+w ,−c−w).

Conclusions

Control of the relaxation = inventory model of Clark & Scarf (1960)

Hedging policy, with threshold τ∗: Idling is not permitted unless W (t) < −τ∗

Heavy-traffic: For average-cost optimal control, τ∗ ∼ 12

σ2∆

δlog(1 + c+/c−)

13 / 19

Relaxations

+ ∆(t + 1)︸︷︷︸Zero mean

c(w) = min{c(x) : ξ · x = w}

Conclusions

σ2∆

δlog(1 + c+/c−)

13 / 19

Relaxations

+ ∆(t + 1)︸︷︷︸Zero mean

c(w) = min{c(x) : ξ · x = w}

Conclusions

σ2∆

δlog(1 + c+/c−)

13 / 19

Relaxations

+ ∆(t + 1)︸︷︷︸Zero mean

c(w) = min{c(x) : ξ · x = w}

Conclusions

σ2∆

δlog(1 + c+/c−)

13 / 19

Relaxations

+ ∆(t + 1)︸︷︷︸Zero mean

c(w) = min{c(x) : ξ · x = w}

Conclusions

σ2∆

δlog(1 + c+/c−)

13 / 19

Optimization Asymptotic optimality

Asymptotic optimality

Family of arrival processes {Aδ(t)}, parameterized by δ ∈ [0, δ•], δ• ∈ (0, 1).Additional assumptions:

(A1) For one set D ( D we have ξD · αδ = −δ, where αδ denotes the mean ofAδ(t).Moreover, there is a fixed constant δ > 0 such that ξD

′ · αδ ≤ −δ for anyD ′ ( D, D ′ 6= D, and δ ∈ [0, δ•].

(A2) The distributions are continuous at δ = 0, with linear rate: For someconstant b,

E[‖Aδ(t)− A0(t)‖] ≤ bδ.

(A3) Graph structure for arrivals and for feasible matches independent of δ ≥ 0=⇒ The matching graph is connected even for δ = 0.Moreover, there exists i0 ∈ S(D), j0 ∈ Dc , and pI > 0 such that

P{Aδi0 (t) ≥ 1 and Aδj0 (t) ≥ 1} ≥ pI , 0 ≤ δ ≤ δ•.

14 / 19

h-MWT (h-MaxWeight with threshold) policy:For a differentiable function h : R` → R+, and a threshold τ ≥ 0,

φ(x) = argmax u · ∇h (x)

subject to u feasible

and I (t) ≤ max(−W (t)− τ, 0), when X (t) = x and U(t) = u.

Thm (Asymptotic Optimality With Bounded Regret) [B., Meyn ’16]

There is an h-MWT policy with finite average cost η, satisfying

η∗ ≤ η∗ ≤ η ≤ η∗ + O(1)

where η∗ is the optimal average cost for the MDP model, η∗ is the optimalaverage cost for the workload relaxation, and the term O(1) does not dependupon δ.

15 / 19

h-MWT (h-MaxWeight with threshold) policy:For a differentiable function h : R` → R+, and a threshold τ ≥ 0,

φ(x) = argmax u · ∇h (x)

subject to u feasible

and I (t) ≤ max(−W (t)− τ, 0), when X (t) = x and U(t) = u.

Thm (Asymptotic Optimality With Bounded Regret) [B., Meyn ’16]

There is an h-MWT policy with finite average cost η, satisfying

η∗ ≤ η∗ ≤ η ≤ η∗ + O(1)

where η∗ is the optimal average cost for the MDP model, η∗ is the optimalaverage cost for the workload relaxation, and the term O(1) does not dependupon δ.

15 / 19

The average cost for the relaxation satisfies the uniform bound,

η∗ = η∗∗ + O(1)

where η∗∗ is the optimal cost for the diffusion approx. for the relaxation:

η∗∗ = τ∗ c− = 12

δc− log

h(x) = h(ξ · x) + hc(x)

- hc is introduced to penalize deviations between c(x) and c(ξ · x).- The first term h is a function of workload. For w ≥ −τ∗, it solves the

second-order differential equation,

−δh′ (w) + 12σ2

∆h′′ (w) = −c(w) + η∗∗, (1)

There is a solution that is convex and increasing on [−τ∗,∞), withh′(−τ∗) = h′′(−τ∗) = 0. Then extended to get a convex C 2 function on R.

16 / 19

The average cost for the relaxation satisfies the uniform bound,

η∗ = η∗∗ + O(1)

where η∗∗ is the optimal cost for the diffusion approx. for the relaxation:

η∗∗ = τ∗ c− = 12

δc− log

)h(x) = h(ξ · x) + hc(x)

- hc is introduced to penalize deviations between c(x) and c(ξ · x).- The first term h is a function of workload. For w ≥ −τ∗, it solves the

second-order differential equation,

−δh′ (w) + 12σ2

∆h′′ (w) = −c(w) + η∗∗, (1)

There is a solution that is convex and increasing on [−τ∗,∞), withh′(−τ∗) = h′′(−τ∗) = 0. Then extended to get a convex C 2 function on R.

16 / 19

Examples

Example

Cost: c(x) = xD1 + 2xD

2 + 3xD3 + 3xS

1 + 2xS2 + xS

=⇒ Effective Cost: c(w) = 4|w |

e1 e2 e3 e4 e5

xD1 xD

0 5 10 15 20 25 30 r

r∗ = 14.9

Average Cost Estimated in Simulation:Average Cost Comparisons:

Priority

MaxWeight

Threshold (15)

1 2 3 4 5x 106

W (t) = QD3 (t)− QS

Matching of Supply 1 and Demand 2allowed only if W (t) < −τ∗

Workload Relaxation:

QS1 (t) = QS

2 (t) = 0 if W (t) > 0

QD2 (t) = QD

3 (t) = 0 if W (t) < 0

Simulation with τ = 14.9

17 / 19

Examples

Example

2 + 3xD3 + 3xS

1 + 2xS2 + xS

e1 e2 e3 e4 e5

xD1 xD

0 5 10 15 20 25 30 r

r∗ = 14.9

Average Cost Estimated in Simulation:

Average Cost Comparisons:

Priority

MaxWeight

Threshold (15)

1 2 3 4 5x 106

W (t) = QD3 (t)− QS

QS1 (t) = QS

2 (t) = 0 if W (t) > 0

QD2 (t) = QD

3 (t) = 0 if W (t) < 0

17 / 19

Examples

Example

2 + 3xD3 + 3xS

1 + 2xS2 + xS

e1 e2 e3 e4 e5

xD1 xD

30 5 10 15 20 25 30 r

r∗ = 14.9

Average Cost Estimated in Simulation:

Average Cost Comparisons:

Priority

MaxWeight

Threshold (15)

1 2 3 4 5x 106

W (t) = QD3 (t)− QS

QS1 (t) = QS

2 (t) = 0 if W (t) > 0

QD2 (t) = QD

3 (t) = 0 if W (t) < 0

17 / 19

Final remarks

Performance bounds?

Approximate optimal control for relaxations in higher dimensions?

More general arrival assumptions. Admission control? Abandonnements?

Optimization for non-bipartite matching?

Applications?

18 / 19

References

Dynamic bipartite matching models

Caldentey, Kaplan, Weiss, FCFS infinite bipartite matching of servers andcustomers. Adv. Appl. Probab. 2009.

Adan & Weiss, Exact FCFS matching rates for two infinite multi-type sequences.Operations Research, 2012.

Busic, Gupta, Mairesse, Stability of the bipartite matching model. Adv. Appl.Probab. 2013.

Mairesse, Moyal, Stability of the stochastic matching model. ArXiv. 2014.

Adan, Busic, Mairesse, Weiss, Reversibility and further properties of FCFS infinitebipartite matching. ArXiv. 2015.

Busic, Meyn, Approximate optimality with bounded regret in dynamic matchingmodels. ArXiv. 2016.

Workload relaxations

Meyn, Control Techniques for Complex Networks. Cambridge Uni. Press, 2007.

Meyn, Stability and asymptotic optimality of generalized MaxWeight policies.SIAM J. Control Optim., 2009.

Gurvich, Ward, On the dynamic control of matching queues, Stoch. Systems, 2014.

19 / 19

Approximate Optimality with Bounded Regret in Dynamic …busic/slides/DynamicMatching... · 2016....

Documents

Mairesse d'un jour 2016 - ville de Chapais

Rapport de la mairesse sur la situation financière - 24 octobre 2016

By Jacques Mairesse † with Yanyun Zhao ‡ and Feng Zhen §

MeyerMonitor - Strategic Dialogue at Meyn...ME Y N Global Vision Conversation @ MEYN Mexico, Toluca, 14 December 2011 NOWÐ e SS M ore S Inches 2015 WORLD Indiscu±c8e The WORK has

Mairesse d'un jour 2013

MEYN D50 THIGH DEBONER · Meyn D50 thigh deboner The Meyn D50 thigh deboner is regarded as the industry benchmark for highly efficient production of deboned thigh meat. The system

Emile Mairesse Bénévole d’accompagnement

Vanderbilt Research Shared Resources and Core Facilities Susan Meyn Director – Research Resources VUMC Office of Research

Meyn, Persson, Leander SWIM20 Vulnerability Assessment of Groundwater Aquifers during the Construction of the Citytunnel in Malmö, Sweden Alina Meyn, Kenneth

George Meyn Brochure - wycokck.org · George Meyn Building Wyandotte County Park 126th & State Ave. Kansas City, KS Building Occupancy 350 Fees: ... Sunday of the current year, those

Cristian Mairesse Cavalheiro

SEQUENCINGANDROUTINGINMULTICLASSQUEUEING NETWORKSPARTII:WORKLOADRELAXATIONS∗ SEAN P. MEYN† SIAMJ.CONTROL OPTIM. c 2003SocietyforIndustrialandAppliedMathematics ...Cited by: 93Publish

BULLETIN MUNICIPAL - Municipalité de Colombier · Rapport de la mairesse RAPPORT ANNUEL DE LA MAIRESSE SUR LA SITUATION FINANCIÈRE 2017 Conformément à l’article 955 du Code

DAVID A. BUSIC PERFECTLY IMPERFECT€¦ · Perfectly Imperfect is about people whose true-to-life stories are ... Busic has published numerous articles, served as coeditor of Preacher’s

Chapter IV BASIC CONCEPTUALISATIONS AND ...shodhganga.inflibnet.ac.in/bitstream/10603/6198/9/09...Busic Cottrcptutrliscr!io~zs & Struc!rrrcs itr Music: FVestertr & I~rdirtrt 156 hums

droidcon 2012: What's the Hack is NFC .., Hauke Meyn, NXP

KM C364e-20150803094214 · Cellulaire de la mairesse 14. Cartes de crédit. 2015-06-092 Acceptation du procès-verbal du 5 mai 2015 ... Que madame la Mairesse contribuera à l'achat

MEYN MAMVRO | 30 | Spring-Summer 1996 … · 2020. 11. 9. · MEYN MAMVRO | 30 | Spring-Summer 1996 meynmamvro.co.uk/archive . MEYN MAMVRO | 30 | Spring-Summer 1996 meynmamvro.co.uk/archive

STÉPHANIE LACOSTE ÉLUE 2 MAIRESSE DE L’HISTOIRE DE LA VILLE

НАЗАРЯНСКИОСНОВИ · Еугенио Дуарте (Eugénio Duarte) Дейвид У. Грейвс (David W. Graves) Дейвид Бусич (David A. Busic) Густаво