History-Dependent Graphical Multiagent Models
Quang Duong, Michael P. Wellman, Satinder Singh
Computer Science and Engineering, University of Michigan, USA

Yevgeniy Vorobeychik
Computer and Information Sciences, University of Pennsylvania, USA
Modeling Dynamic Multiagent Behavior
• Design a representation that:
– expresses a joint probability distribution over agent actions over time
– supports inference (e.g., prediction)
– exploits locality of interaction
• Our solution: history-dependent graphical multiagent models (hGMMs)
Example
Consensus Voting [Kearns et al. ’09], shown from agent 1’s perspective

[Figure: network of six agents (1–6) voting over time; snapshot at t = 10s]

Agent   Blue consensus   Red consensus   Neither
1       1.0              0.5             0
2       0.5              1.0             0
Graphical Representations
• Exploit locality in agent interactions
– MAIDs [Koller & Milch ’01], NIDs [Gal & Pfeffer ’08], action-graph games [Jiang et al. ’08]
– Graphical games [Kearns et al. ’01] and Markov random fields for graphical games [Daskalakis & Papadimitriou ’06]
Graphical Multiagent Models (GMMs)
• [Duong, Wellman, and Singh UAI-08]
– Nodes: agents
– Edges: dependencies between agents
– Neighborhood N_i includes agent i and its neighbors
• Accommodates multiple sources of belief about agent behavior in static (one-shot) scenarios
[Figure: example graph over agents 1–6]

Pr(a) = (1/Z) Π_i π(a_{N_i})

– Pr(a): joint probability distribution of the system’s actions
– π(a_{N_i}): potential of neighborhood N_i’s joint action
– Z: normalization constant
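To make the factorization concrete, here is a minimal Python sketch (the function names and the toy agreement-favoring potential are hypothetical illustrations, not from the paper) that builds the normalized joint distribution over binary actions from per-neighborhood potentials:

```python
# Hypothetical sketch of a static GMM: the joint distribution over actions
# factors into per-neighborhood potentials, normalized over all joint actions.
from itertools import product

def gmm_joint(n_agents, neighborhoods, potential):
    """Pr(a) = (1/Z) * prod_i potential(i, a restricted to N_i)."""
    weights = {}
    for a in product([0, 1], repeat=n_agents):  # all binary joint actions
        w = 1.0
        for i, Ni in neighborhoods.items():
            w *= potential(i, tuple(a[j] for j in Ni))
        weights[a] = w
    Z = sum(weights.values())  # normalization constant
    return {a: w / Z for a, w in weights.items()}

# Toy example: 3 agents on a line (0-1-2); the potential favors agreement.
neigh = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
pot = lambda i, a_Ni: 2.0 if len(set(a_Ni)) == 1 else 1.0
dist = gmm_joint(3, neigh, pot)
```

Unanimous joint actions such as (0, 0, 0) receive higher probability than mixed ones, illustrating how local potentials shape the global distribution.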
Contribution
Extend the static GMM to model dynamic joint behaviors by conditioning on local history
History-dependent GMM (hGMM)
• Extends the static GMM: conditions joint agent behavior on an abstracted history of actions
• Directly captures joint behavior using limited action history

Pr(a^t | H^t) = (1/Z_{H^t}) Π_i π(a^t_{N_i} | H^t_{N_i})

– Pr(a^t | H^t): joint probability distribution of the system’s actions at time t, given abstracted history H^t
– π(a^t_{N_i} | H^t_{N_i}): potential of neighborhood N_i’s joint action at t, given the neighborhood-relevant abstracted history H^t_{N_i}
– Z_{H^t}: normalization constant
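A hedged sketch of the history dependence, assuming a simple frequency-based abstraction over the last h joint actions (the abstraction and the toy potential below are illustrative choices, not the paper’s exact ones):

```python
# Hypothetical hGMM sketch: neighborhood potentials are conditioned on an
# abstracted history (here: each neighbor's action-1 frequency over horizon h).
from itertools import product

def abstract_history(history, Ni, h):
    """Frequency of action 1 for each agent in N_i over the last h steps."""
    window = history[-h:] if h else []
    return tuple(sum(a[j] for a in window) / max(len(window), 1) for j in Ni)

def hgmm_step(n_agents, neighborhoods, potential, history, h):
    """Pr(a^t | H^t) = (1/Z_H) * prod_i pi(a_{N_i}^t | H_{N_i}^t)."""
    weights = {}
    for a in product([0, 1], repeat=n_agents):
        w = 1.0
        for i, Ni in neighborhoods.items():
            freqs = abstract_history(history, Ni, h)
            w *= potential(i, tuple(a[j] for j in Ni), freqs)
        weights[a] = w
    Z = sum(weights.values())  # history-dependent normalization
    return {a: w / Z for a, w in weights.items()}

# Toy potential: favor joint actions that match neighbors' past frequencies.
pot = lambda i, a_Ni, f: 1.0 + sum(1 - abs(x - y) for x, y in zip(a_Ni, f))
neigh = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
hist = [(1, 1, 0), (1, 1, 1)]  # two past joint actions
dist = hgmm_step(3, neigh, pot, hist, h=2)
```

Because past play leaned toward action 1, the conditioned distribution places more mass on all-1 joint actions than on all-0 ones.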
Joint vs. Individual Behavior Models

• Individual behavior models (IBMM): autonomous agents’ behaviors are conditionally independent given the complete history. Agent i’s actions depend on past observations, as specified by a strategy function σ_i(H^t):

Pr(a^t | H^t) = Π_i σ_i(H^t)

• Joint behavior models (hGMM): history is often abstracted/summarized (limited horizon h, frequency function f, etc.), which induces correlations in observed behavior, so no independence assumption is made.
[Figure: three agents choosing actions independently via individual strategies σ_1(H^t_1), σ_2(H^t_2), σ_3(H^t_3)]
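The IBMM factorization can be sketched in a few lines of Python (the persistence strategies below are hypothetical, chosen only to illustrate the product form):

```python
# Hypothetical IBMM sketch: given the full history, agents act independently,
# so the joint probability is a product of individual strategies sigma_i.
def ibmm_joint_prob(joint_action, history, strategies):
    """Pr(a^t | H^t) = prod_i sigma_i(a_i | H^t)."""
    p = 1.0
    for i, a_i in enumerate(joint_action):
        p *= strategies[i](a_i, history)
    return p

# Toy strategies: each agent repeats its own last action with probability 0.9.
def sigma(i):
    return lambda a_i, H: 0.9 if a_i == H[-1][i] else 0.1

hist = [(1, 0, 1)]
p = ibmm_joint_prob((1, 0, 1), hist, [sigma(i) for i in range(3)])
# p = 0.9 * 0.9 * 0.9 = 0.729
```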
Voting Consensus Simulation
• Simulation (treated as the true model): smooth fictitious play [Camerer & Ho ’99]
– agents respond probabilistically in proportion to expected rewards (given the reward function and beliefs about others’ behavior)
• Note:
– This generative model is an individual behavior model
– Given abstracted history, joint behavior models may better capture behavior even when it is generated by an individual behavior model
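A rough Python sketch of such a simulator, assuming a logit (smoothed) response to expected consensus rewards; the belief update, smoothing parameter, and reward encoding are simplifying assumptions rather than the exact Camerer & Ho model:

```python
# Hypothetical smooth fictitious play sketch for the consensus game: each
# agent tracks empirical frequencies of neighbors' actions and responds
# probabilistically (logit) in proportion to expected reward.
import math, random

def smooth_fp_consensus(neighborhoods, rewards, T=100, lam=5.0, seed=0):
    rng = random.Random(seed)
    n = len(neighborhoods)
    counts = [[1, 1] for _ in range(n)]       # per-agent counts of actions seen
    actions = [rng.randint(0, 1) for _ in range(n)]
    for _ in range(T):
        new = []
        for i in range(n):
            total = counts[i][0] + counts[i][1]
            belief = [counts[i][a] / total for a in (0, 1)]   # beliefs about others
            er = [rewards[i][a] * belief[a] for a in (0, 1)]  # expected consensus reward
            w = [math.exp(lam * x) for x in er]               # smoothed (logit) response
            new.append(0 if rng.random() < w[0] / (w[0] + w[1]) else 1)
        for i, Ni in neighborhoods.items():   # update beliefs from observed actions
            for j in Ni:
                if j != i:
                    counts[i][new[j]] += 1
        actions = new
        if len(set(actions)) == 1:            # stop once the vote converges
            break
    return actions

neigh = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
rew = {i: (0.5, 1.0) for i in range(3)}  # (blue, red) consensus payoffs
final = smooth_fp_consensus(neigh, rew)
```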
Voting Consensus Models

Individual Behavior Multiagent Model (IBMM), components:
– reward for action a_i, regardless of neighbors’ actions
– frequency with which action a_i was previously chosen by each of i’s neighbors
– normalization

Joint Behavior Multiagent Model (hGMM), components:
– expected reward for a_{N_i}, discounted by the number of dissenting neighbors
– frequency with which a_{N_i} was previously chosen by neighborhood N_i
Model Learning and Evaluation
• Given a sequence of joint actions over m time periods, X = {a^0, …, a^m}, the log likelihood induced by model M is L_M(X; θ), where θ denotes the model’s parameters
• Potential function learning:
– assumes a known graphical structure
– employs gradient descent
• Evaluation: computes L_M(X; θ) on held-out data to evaluate M
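A minimal sketch of the learning loop, using a tiny one-parameter model and numerical gradients for brevity (the model, parameterization, and learning rate are all illustrative assumptions; the paper fits neighborhood potentials, not this toy):

```python
# Hypothetical sketch of maximum-likelihood learning by gradient ascent on
# L_M(X; theta), using central-difference numerical gradients for brevity.
import math

def log_likelihood(X, theta, model_prob):
    """L_M(X; theta) = sum_t log Pr_M(a^t | a^0..a^{t-1}; theta)."""
    return sum(math.log(model_prob(X[t], X[:t], theta)) for t in range(1, len(X)))

def fit(X, theta, model_prob, lr=0.05, steps=200, eps=1e-5):
    for _ in range(steps):
        grad = []
        for k in range(len(theta)):
            tp = list(theta); tp[k] += eps
            tm = list(theta); tm[k] -= eps
            grad.append((log_likelihood(X, tp, model_prob)
                         - log_likelihood(X, tm, model_prob)) / (2 * eps))
        theta = [th + lr * g for th, g in zip(theta, grad)]  # ascend the likelihood
    return theta

# Toy model over binary actions: repeat the previous action with
# probability sigmoid(theta[0]), otherwise switch.
def model_prob(a, hist, theta):
    p = 1 / (1 + math.exp(-theta[0]))
    return p if a == hist[-1] else 1 - p

X = [0, 0, 0, 1, 0, 0]          # 3 repeats, 2 switches among 5 transitions
theta = fit(X, [0.0], model_prob)
```

The fitted parameter recovers the empirical repeat rate of 3/5, i.e. theta[0] ≈ log(0.6/0.4).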
Experiments
• 10 agents
• i.i.d. payoffs, drawn between 0 and 1, for red and blue consensus outcomes; payoff 0 if no consensus is reached
• maximum node degree d
• each run lasts T = 100 periods, or ends earlier once the vote converges
• 20 smooth fictitious play game runs generated per game configuration (10 for training, 10 for testing)
Results
1. hGMMs outperform IBMMs in predicting outcomes for shorter history lengths.
2. Shorter history horizon → more abstraction of history → more induced behavior correlation → hGMM > IBMM.
3. hGMMs outperform IBMMs in predicting outcomes across different values of d.

Evaluation metric: (log likelihood for hGMM) / (log likelihood for IBMM)

[Table: ratios for maximum degree d ∈ {3, 6} and history horizon h ∈ {1, …, 8}; green cells mark hGMM > IBMM, yellow cells mark hGMM < IBMM]
Asynchronous Belief Updates
• hGMMs outperform IBMMs by a wider margin for longer summarization intervals v (which induce more behavior correlation)
Direct Sampling

• Compute the joint distribution of actions as the empirical distribution of the training data
• Evaluation metric: (log likelihood for hGMM) / (log likelihood for direct sampling)
• Direct sampling is computationally more expensive yet less powerful
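A short sketch of this baseline (the smoothing constant is an added assumption, to keep unseen joint actions from receiving zero probability):

```python
# Hypothetical direct-sampling baseline: the joint action distribution is
# just the empirical frequency of joint actions in the training runs.
from collections import Counter

def direct_sampling(training_runs, smoothing=1e-3):
    counts = Counter(a for run in training_runs for a in run)
    total = sum(counts.values())
    def prob(a):
        # add-constant smoothing so unseen joint actions keep nonzero mass
        return (counts.get(a, 0) + smoothing) / (total + smoothing * (len(counts) + 1))
    return prob

runs = [[(0, 0), (0, 1), (0, 0)], [(0, 0), (1, 1)]]
prob = direct_sampling(runs)  # prob((0, 0)) is highest: seen 3 of 5 times
```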
Conclusions

• hGMMs support efficient and effective inference about system dynamics, using abstracted history, for scenarios exhibiting locality of interaction
• hGMMs provide better predictions of dynamic behavior than IBMMs and direct fictitious play sampling
• History abstraction does not degrade predictive performance
• Future work:
– more domain applications: real voting experiment data, other scenarios
– a (fully) dynamic GMM that supports reasoning about unobserved past states