Algorithmic Game Theory - IRIFzielonka/Jeux/TRANSPARENTS/AlgoGame.pdfAlgorithmic Game Theory...

Preview:

Citation preview

Algorithmic Game Theory

Nguyen Kim Thang

LIAFA, 18/2/09

Game Theory + Algorithms

Game Theory + Algorithms Entities in society, each with its own information and

interests, behave in rational manners.

Game Theory + Algorithms Entities in society, each with its own information and

interests, behave in rational manners.

Game theory is a deep theory studying such interactions (in economics, political science, ... etc).

Game Theory + Algorithms Entities in society, each with its own information and

interests, behave in rational manners.

Theoretical computer science studies optimization problems, seeks to optimum, efficient computing, impossibility results, ... etc

Game theory is a deep theory studying such interactions (in economics, political science, ... etc).

Algorithmic Game Theory

Research field on the interface of game theory and theoretical computer science (mostly algorithms)

The field has phenomenally exploded with many branches: computing Nash equilibrium, mechanism design, inefficiency of equilibria, ... etc

Formulating novel goals and problems, fresh looks on different issues (inspired by Internet, ...).

Outline

Existence and inefficiency of pure Nash equilibrium

Online Algorithmic Mechanism Design

Scheduling Games in the Dark

Online Auction with single-minded customers

Nash Equilibrium

Nash Equilibrium Equilibrium: strategy profile that is resilient to

deviation of individual player.

Nash Equilibrium Equilibrium: strategy profile that is resilient to

deviation of individual player.

Mixed equilibrium Pure equilibrium

Nash Equilibrium Equilibrium: strategy profile that is resilient to

deviation of individual player.

Mixed equilibrium Pure equilibriumchoose a distribution

over strategies

Nash Equilibrium Equilibrium: strategy profile that is resilient to

deviation of individual player.

Mixed equilibrium Pure equilibriumchoose a distribution

over strategiesdeterministically choose a strategy

Nash Equilibrium Equilibrium: strategy profile that is resilient to

deviation of individual player.

Mixed equilibrium Pure equilibrium

always exists (by Nash)

choose a distribution over strategies

deterministically choose a strategy

Nash Equilibrium Equilibrium: strategy profile that is resilient to

deviation of individual player.

Mixed equilibrium Pure equilibrium

always exists (by Nash)

choose a distribution over strategies

deterministically choose a strategy

does not necessarily exist

Nash Equilibrium Equilibrium: strategy profile that is resilient to

deviation of individual player.

Mixed equilibrium Pure equilibrium

always exists (by Nash)

choose a distribution over strategies

deterministically choose a strategy

does not necessarily exist

Potential games: admit a function such that if a player change her strategy to get a better utility then the function strictly decreases.

Inefficiency of equilibria

the worst Nashequilibrium price of anarchy

(PoA) = worst NE/OPT

price of stability (PoS) = best NE/OPT

the best Nashequilibrium

OPT

Good equilibria ?social objective function

Scheduling Game

Scheduling Game jobs (players) and machines: a job chooses a

machine to execute. The processing time of job on machine is

n mi

j pij

Scheduling Game jobs (players) and machines: a job chooses a

machine to execute. The processing time of job on machine is

n m

The cost of a job is its completion time.

ij pij

ci i

Scheduling Game jobs (players) and machines: a job chooses a

machine to execute. The processing time of job on machine is

n m

The cost of a job is its completion time.

ij pij

The social cost is the makespan, i.e.

ci i

maxi

ci

Scheduling Game jobs (players) and machines: a job chooses a

machine to execute. The processing time of job on machine is

Each machine specifies a policy how jobs assigned to the machine are to be scheduled.

n m

The cost of a job is its completion time.

ij pij

The social cost is the makespan, i.e.

ci i

maxi

ci

Scheduling Game jobs (players) and machines: a job chooses a

machine to execute. The processing time of job on machine is

Each machine specifies a policy how jobs assigned to the machine are to be scheduled.

Eg: Shortest Processing Time First (SPT)

n m

The cost of a job is its completion time.

ij pij

The social cost is the makespan, i.e.

ci i

machine 1machine 2machine 3

maxi

ci

Scheduling Game jobs (players) and machines: a job chooses a

machine to execute. The processing time of job on machine is

Each machine specifies a policy how jobs assigned to the machine are to be scheduled.

Eg: Shortest Processing Time First (SPT)

n m

The cost of a job is its completion time.

ij pij

The social cost is the makespan, i.e.

ci i

machine 1machine 2machine 3

maxi

ci

Non-clairvoyant policies Typically, a policy depends on the processing time of jobs

assigned to the machine.

Non-clairvoyant policies Typically, a policy depends on the processing time of jobs

assigned to the machine.

What about policies that do not require this knowledge? Incomplete information games Private information of jobs Jobs cannot influence on their completion time

by misreporting their processing time

Non-clairvoyant policies Typically, a policy depends on the processing time of jobs

assigned to the machine.

What about policies that do not require this knowledge? Incomplete information games Private information of jobs Jobs cannot influence on their completion time

by misreporting their processing time

Non-clairvoyant policies

existenceNash equilibrium

small PoA

Natural policies RANDOM: schedules jobs in a random order.

EQUI: schedules jobs in parallel, assigning each job an equal fraction of the processor.

ci = pij +12

!

i!:!(i!)=j,i! !=i

pi!j

In the strategy profile , is assigned to : ! i j

Natural policies RANDOM: schedules jobs in a random order.

EQUI: schedules jobs in parallel, assigning each job an equal fraction of the processor.

ci = pij +12

!

i!:!(i!)=j,i! !=i

pi!j

In the strategy profile , is assigned to : ! i j

AB

C

C

D

pA = 1pB = 1

pC = 2pD = 3

0 4 6 7

Natural policies RANDOM: schedules jobs in a random order.

EQUI: schedules jobs in parallel, assigning each job an equal fraction of the processor.

ci = pij +12

!

i!:!(i!)=j,i! !=i

pi!j

In the strategy profile , is assigned to : ! i j

ci = p1j + . . . + pi!1,j + (k ! i + 1)pij

If there are jobs on machine s.t: p1j ! . . . ! pkjk j

Models

Def of models:

Def: A job is balanced if i max pij/ min pij ! 2

Identical machines: for some length

Uniform machines: for some speed

Unrelated machines: arbitrary

pij = pi !j

pij = pi/sj

pij

sj

pi

Existence of equilibrium

The game under EQUI policy is a potential game.

Theorem:

The game under RANDOM policy is a potential game for 2 unrelated machines but it is not for more than 3 machines. For uniform machines, balanced jobs, there always exists equilibrium.

RANDOM, uniform machines Def: A job is unhappy if it can decrease its cost by

changing the strategy (other players’ strategies are fixed)

RANDOM, uniform machines Def: A job is unhappy if it can decrease its cost by

changing the strategy (other players’ strategies are fixed)

Def: a best move of a job is the strategy which minimizes the cost of the job (while other players’ strategies are fixed)

RANDOM, uniform machines Def: A job is unhappy if it can decrease its cost by

changing the strategy (other players’ strategies are fixed)

Def: a best move of a job is the strategy which minimizes the cost of the job (while other players’ strategies are fixed)

Jobs have length Machines have speed

p1 ! p2 ! . . . ! pn

s1 ! s2 ! . . . ! sm

pij = pi/sj

RANDOM, uniform machines

Lemma: Consider a job making a best move from to . If there is a new unhappy job with index greater than , then

Def: A job is unhappy if it can decrease its cost by changing the strategy (other players’ strategies are fixed)

Def: a best move of a job is the strategy which minimizes the cost of the job (while other players’ strategies are fixed)

Jobs have length Machines have speed

p1 ! p2 ! . . . ! pn

s1 ! s2 ! . . . ! sm

pij = pi/sj

sa > sb

i

i

ab

Potential function Dynamic: among all unhappy jobs, let the one with the

greatest index make a best move.

Potential function Dynamic: among all unhappy jobs, let the one with the

greatest index make a best move.

For any strategy profile , let be the unhappy job with greatest index.

! t

f!(i) =

!1 if 1 ! i ! t,

0 otherwise.1 0t

Potential function Dynamic: among all unhappy jobs, let the one with the

greatest index make a best move.

For any strategy profile , let be the unhappy job with greatest index.

! t

f!(i) =

!1 if 1 ! i ! t,

0 otherwise.

lex. decreases

1 0t

!(!) = (f!(1), s!(1), . . . , f!(n), s!(n))

Potential function Dynamic: among all unhappy jobs, let the one with the

greatest index make a best move.

For any strategy profile , let be the unhappy job with greatest index.

! t

f!(i) =

!1 if 1 ! i ! t,

0 otherwise.

If t! < t lex. decreases

!(!) = (1, s!(1), . . . , 1, s!(t!), 1, s!(t!+1), . . . , 1, s!(t), . . .)!(!!) = (1, s!(1), . . . , 1, s!(t!), 0, s!(t!+1), . . . , 0, s!!(t), . . .)

1 0t

!(!) = (f!(1), s!(1), . . . , f!(n), s!(n))

Potential function Dynamic: among all unhappy jobs, let the one with the

greatest index make a best move.

For any strategy profile , let be the unhappy job with greatest index.

! t

f!(i) =

!1 if 1 ! i ! t,

0 otherwise.

If lex. decreases

t! > t

!(!) = (1, s!(1), . . . , 1, s!(t), . . . , 0, s!(t!), . . .)!(!!) = (1, s!(1), . . . , 1, s!!(t), . . . , 0, s!(t!), . . .)

1 0t

!(!) = (f!(1), s!(1), . . . , f!(n), s!(n))

Potential function Dynamic: among all unhappy jobs, let the one with the

greatest index make a best move.

For any strategy profile , let be the unhappy job with greatest index.

! t

f!(i) =

!1 if 1 ! i ! t,

0 otherwise.

If lex. decreases

t! > t

!(!) = (1, s!(1), . . . , 1, s!(t), . . . , 0, s!(t!), . . .)!(!!) = (1, s!(1), . . . , 1, s!!(t), . . . , 0, s!(t!), . . .)

by Lemma: s!(t) > s!!(t)

1 0t

!(!) = (f!(1), s!(1), . . . , f!(n), s!(n))

Inefficiency

Theorem: For unrelated machines, the PoA of policy EQUI is at most 2m – interestingly, that matches the best clairvoyant policy.

PoA is not increased when processing times are unknown to the machines.

the worstNE

PoA

OPT

Mechanism Design

Goal: self-interested behavior yields desired outcomes.

Define the game

Online Auction A company produces one perishable item per time unit

(items have to be immediately delivered to customers, e.g. electricity, ice-cream, ...)

Opt. prob: maximize the welfare over all satisfied customers.

Single-minded customers arrive online: a customer arrives at , pays if he receives items before deadline , otherwise he pays nothing. !

i

wi

wi ki

di

ri

Online Auction A company produces one perishable item per time unit

(items have to be immediately delivered to customers, e.g. electricity, ice-cream, ...)

Opt. prob: maximize the welfare over all satisfied customers.

Single-minded customers arrive online: a customer arrives at , pays if he receives items before deadline , otherwise he pays nothing. !

i

wi

Mechanism design: are private Customers may misreport

their value. They bid

wi ki

di

ri

wi

bi

Mechanism Design (MD)Mechanism:

receives all bids

Mechanism Design (MD)

allocation algorithm: determine the set of satisfied customers

Mechanism: receives all bids

Mechanism Design (MD)

allocation algorithm: determine the set of satisfied customers

payment algorithm: determine how much a customer has to pay

Mechanism: receives all bids

Mechanism Design (MD)

allocation algorithm: determine the set of satisfied customers

payment algorithm: determine how much a customer has to pay

Mechanism: receives all bids

ui =

!wi ! pi if satisfied,

0 otherwise.

Mechanism Design (MD)

allocation algorithm: determine the set of satisfied customers

payment algorithm: determine how much a customer has to pay

Mechanism: receives all bids

ui =

!wi ! pi if satisfied,

0 otherwise.

Goal: self-interested behavior yields truthfulness, bi = wi

Mechanism Design (MD)

allocation algorithm: determine the set of satisfied customers

payment algorithm: determine how much a customer has to pay

Mechanism: receives all bids

Mechanism Design (MD)

allocation algorithm: determine the set of satisfied customers

payment algorithm: determine how much a customer has to pay

monotone: a winner still win if he

raises his bid

Mechanism: receives all bids

Mechanism Design (MD)

allocation algorithm: determine the set of satisfied customers

payment algorithm: determine how much a customer has to pay

monotone: a winner still win if he

raises his bid

Mechanism: receives all bids

critical payment:the smallest bid that a winner needs to bid in

order to win.

Truthful MD

Theorem: for single-parameter domain, a mechanism is truthful iff its allocation algo is monotone and it uses the critical payment scheme.

Our problem: design a monotone algorithm

verify whether the critical payment scheme can be computed efficiently.

Online Algorithm

Maximizing the welfare is hard.

Def: an online algorithm is -competitive if for any instance , the outcome

!

i

wi

ALGI c · ALG(I) ! OPT (I)

c

Online Algorithm

Maximizing the welfare is hard.

Def: an online algorithm is -competitive if for any instance , the outcome

!

i

wi

ALGI c · ALG(I) ! OPT (I)

Theorem: if all then there exists a 7-competitive truthful mechanism.

ki = k

c

Algorithm The CONSERVATIVE algo:

if there is no currently running job, serve the pending one with highest value

still schedule the current customer except there is a new one with value at least 2 that of the current customer

Algorithm The CONSERVATIVE algo:

Proof:

if there is no currently running job, serve the pending one with highest value

still schedule the current customer except there is a new one with value at least 2 that of the current customer

the algorithm is monotone 7-competitive by a charging scheme

i 7wi

OPTCONSERwj

j

Proof (sketch)

CONSER

OPT

Proof (sketch)

(j, k) (j, b) (j, 1)

CONSER

OPT

wj/k wj/k wj/k

? ??

Proof (sketch)

(j, k) (j, b) (j, 1)

CONSER

OPT

Proof (sketch)

(j, k) (j, b) (j, 1)

CONSER

OPT

(j, 1)

type 1: if is completed by CONSERj

Proof (sketch)

(j, k) (j, b) (j, 1)

CONSER

OPT

type 1: if is completed by CONSERj

Proof (sketch)

(j, k) (j, b) (j, 1)

CONSER

OPT

type 1: if is completed by CONSERj

type 2: if 2wi > wj

(i, a) (i0, 1)

Proof (sketch)

(j, k) (j, b) (j, 1)

CONSER

OPT

type 1: if is completed by CONSERj

type 2: if 2wi > wj

Proof (sketch)

(j, k) (j, b) (j, 1)

CONSER

OPT

type 1: if is completed by CONSERj

type 2: if 2wi > wj

(i!, a!) (i0, 1)

type 3: otherwise , is not pending.

last timestill pending

j

2wi! > wj 2wi0/k > wj/k

2wi ! wj j

then

(i, a)

Proof (sketch)

Observation: receives at most charges of type 3.

(j, b)

CONSER

OPT

(i!, a!) (i, a)(i0, 1)

last timestill pending

j

(i0, 1) k

! k

Summing up all charges, we get 7-competitive.

General case

Theorem: if all then there exists a -competitive truthful mechanism. This mechanism is optimal.

O(k/ log k)ki ! k

Proof: more elaborated but the idea is similar.

Conclusion

Motivation through two problems.

Inspired by Game Theory, using technique of Computer Science

Inspired by Computer Science, using technique of Game Theory

theoretically beautiful

real problems, practical importance.

Recommended