Presenter: Robert Holte. 2 Helping the world understand … and make informed decisions. * * Potential beneficiaries: commercial games companies, and their

Interactive Entertainment
Presenter: Robert Holte

Vision Statement

Helping the world understand … and make informed decisions.

** Potential beneficiaries:
• commercial games companies, and
• their customers.

games and the people who play them

**

Motivation

• Multi-billion dollar industry, with considerable Canadian activity
• U. of A. has one of the best AI & Games research groups in the world
• Games are good testbeds for A.I. research
• Machine learning has a key role to play:
  • Opponent/user modelling
  • Massive datasets (e.g. play logs)
• Challenging problems for machine learning:
  • Opponent modelling: very short time frame, weak data
  • Massive datasets: large number of low-level features
  • Active learning opportunities
  • Human element in the overall system

Projects and Status

1. Gameplay Analysis (ongoing)
2. Poker (ongoing, poster)
3. Counter-strike Log Analysis (new, poster)
4. Go (ongoing, poster)
5. General Game Playing (new)
6. Threat Modelling (complete, poster)

AICML personnel (cumulative)

• AICML PI’s: M. Bowling, R. Holte, J. Schaeffer
• 8 Software developers
• 3 Postdoctoral Fellows
• 14 Grad students

Partners/Collaborators

• Electronic Arts
• BioWare
• BioTools
• 3 UofA CS profs

Resources

Grants
• $490K over 3 years, NSERC strategic grant
• $10K/year BioWare gift
• Portion of Jonathan Schaeffer’s iCORE chair

In-kind
• Neverwinter Nights source code (BioWare)
• FIFA’2004 source code (EA) with our gameplay analysis hooks installed at their expense
• BioTools support of competitions we organize

Highlights

• IJCAI’03 best paper award
• Winner of AAAI’06 poker-bot competitions, competitive with top human players
• World’s first man-versus-machine poker match
• Currently world’s best 9x9 Go program, competitive with very good humans (Scientific American article)
• Electronic Arts interest in gameplay analysis
• GDC paper
• HQP to EA, BioWare, BioTools, Invidi, Google, Yahoo!

Poker

Technical Details

The Challenges

• Large game tree (10^18)
• Stochastic element
• Variable number of players (2–10)
• Imperfect information (during play, and after)
• Aim is to maximize winnings, not just win

The last two make it essential to discover and exploit the opponent’s weaknesses

Many Approaches over 12 years

• Rule-based (“expert system”) – Loki
• Search-based – Poki
• Game-theoretic – PsOpti and others
• Opponent modelling
  • Vexbot
  • PDF cutting
  • Parameter Estimation (Bayesian)
  • Strategy Value estimation (“experts”)

PsOpti (Sparbot)

• Nash Equilibrium of an abstract poker game
• Bluffing, slow play, etc. fall out from the mathematics.
• Best paper award at IJCAI’03
• Won the AAAI’06 poker-bot competitions
• Has held its own against 2 world-class humans
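As a hedged illustration of how bluffing can fall out of the mathematics, the sketch below solves a toy one-card betting game in closed form. The game, its payoffs and the helper name `solve_2x2_zero_sum` are assumptions made for this example; it is not the abstract game that PsOpti solves, which is far larger.

```python
from fractions import Fraction as F

# Hypothetical toy game (not PsOpti's abstraction): both players ante 1.
# Player 1 holds the winning hand with probability 1/2 and always bets 1 with it.
# With the losing hand, Player 1 either bluffs (bets 1) or checks.
# Facing a bet, Player 2 either calls or folds.
HALF = F(1, 2)
payoff = {  # expected winnings of Player 1 for each pair of pure strategies
    ("bluff", "call"): HALF * 2 + HALF * (-2),
    ("bluff", "fold"): HALF * 1 + HALF * 1,
    ("check", "call"): HALF * 2 + HALF * (-1),
    ("check", "fold"): HALF * 1 + HALF * (-1),
}

def solve_2x2_zero_sum(a, b, c, d):
    """Mixed equilibrium of the zero-sum game [[a, b], [c, d]] (row player's payoffs).

    Assumes there is no pure-strategy saddle point, so both players mix.
    """
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    assert maximin < minimax, "saddle point exists; closed form does not apply"
    denom = a - b - c + d
    p_row0 = (d - c) / denom          # probability the row player picks row 0
    q_col0 = (d - b) / denom          # probability the column player picks column 0
    value = (a * d - b * c) / denom   # expected payoff to the row player
    return p_row0, q_col0, value

bluff_freq, call_freq, game_value = solve_2x2_zero_sum(
    payoff[("bluff", "call")], payoff[("bluff", "fold")],
    payoff[("check", "call")], payoff[("check", "fold")],
)
print(bluff_freq, call_freq, game_value)  # 1/3 2/3 1/3
```

Under these assumed payoffs, Player 1 bluffs one third of the time with the losing hand and earns 1/3 per hand, whereas either pure strategy (always bluff or never bluff) guarantees only 0.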

PsOpti2 vs. “theCount”

DIVAT: an unbiased, low variance estimator of winnings
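The slide does not say how DIVAT is computed, so the following is only a generic control-variate sketch of why an estimator can be unbiased yet low-variance. It is not DIVAT itself. The idea shown: if the part of each hand's result that is due to the luck of the deal has a known mean of zero, subtracting it leaves the expected winnings unchanged while removing most of the noise. All quantities (`skill_edge`, `luck_sd`, the Gaussian model) are assumptions for illustration.

```python
import random
import statistics

def simulate_winnings(n_hands, skill_edge=0.05, luck_sd=6.0, seed=0):
    """Return (raw, corrected) per-hand winnings for a hypothetical player."""
    rng = random.Random(seed)
    raw, corrected = [], []
    for _ in range(n_hands):
        luck = rng.gauss(0.0, luck_sd)     # zero-mean effect of the deal (assumed known afterwards)
        other = rng.gauss(0.0, 1.0)        # remaining zero-mean noise
        winnings = skill_edge + luck + other
        raw.append(winnings)
        corrected.append(winnings - luck)  # same expectation, smaller variance
    return raw, corrected

raw, corrected = simulate_winnings(2000)
raw_var = statistics.pvariance(raw)
corrected_var = statistics.pvariance(corrected)
print(f"raw variance {raw_var:.2f}, corrected variance {corrected_var:.2f}")
```

With a smaller variance per hand, far fewer hands are needed before the sample mean says anything reliable about the player's true edge.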

Weaknesses of the PsOpti’s

• The equilibrium strategy for the highly abstract game is far from perfect.
• No opponent modelling.
• Nash equilibrium not the best strategy:
  • Non-adaptive
  • Defensive
• Even the best humans have weaknesses that should be exploited

Why is Opponent Modelling Hard?

• Short time to learn and exploit model (< 200 hands).
• Want to simultaneously:
  • Collect information about the opponent
  • Use the information to get higher payoff
  • Not “pay” too much for the information
  • Not be exploitable ourselves
• Imperfect information, even after hand finishes
• High variance:
  • chance in the game (the shuffled deck)
  • stochastic opponent strategies
• Properties of the opponent… (next slide)
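A minimal sketch of the collect-then-exploit loop, assuming a Beta-Bernoulli model of a single opponent parameter (how often the opponent's bets are bluffs). The class `BluffEstimate`, the function `should_call` and the break-even threshold of 1/3 are hypothetical names and values chosen for illustration, not the models used in the project.

```python
from fractions import Fraction as F

class BluffEstimate:
    """Beta(alpha, beta) belief about how often the opponent's bets are bluffs."""

    def __init__(self, alpha=1, beta=1):
        self.alpha = alpha   # prior pseudo-count of bluffs
        self.beta = beta     # prior pseudo-count of genuine bets

    def observe_showdown(self, was_bluff):
        # Only hands that reach a showdown reveal the opponent's cards;
        # hands that end in a fold give no direct evidence and are not recorded.
        if was_bluff:
            self.alpha += 1
        else:
            self.beta += 1

    def mean(self):
        """Posterior mean of the bluff frequency."""
        return F(self.alpha, self.alpha + self.beta)

def should_call(estimate, break_even=F(1, 3)):
    """Call when the estimated bluff frequency exceeds an assumed break-even point."""
    return estimate.mean() > break_even

estimate = BluffEstimate()
for was_bluff in [True, False, True, True]:
    estimate.observe_showdown(was_bluff)
print(estimate.mean(), should_call(estimate))  # 2/3 True
```

The sketch also shows where the cost of information comes from: an observation arrives only when the hand is played to a showdown, which usually means paying to call.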

Difficult Opponents

• We assume a “smart” opponent – it has exploitable weaknesses but does not make outright errors
  • plays a non-equilibrium strategy
  • does not play a dominated strategy
• Opponent’s strategy is non-stationary
  • changes during the game
  • may be modelling me to exploit my weaknesses

Conclusions

In Kuhn poker against exploitable, stationary opponents …

• Convergence to best-response is slow.
• Opponent modelling is superior to a static Nash equilibrium strategy.
  • often produces positive expected value
  • robust to game length (50-400) and opponent type
• Bad initial estimates of P2’s parameters overcome in 25-50 hands.
• “Aggressive” exploration strategies slightly superior to “safe” exploration strategies.
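To make the gap between a static equilibrium strategy and a best response concrete, here is a sketch that evaluates standard Kuhn poker (three cards, ante 1, bet 1) exactly. The particular opponent below, which bluffs with the Jack and calls with the Queen 3/4 of the time, is a hypothetical example and not one of the opponents from the experiments. The sketch computes only the payoff a perfect model would unlock; it does not learn the model.

```python
from fractions import Fraction as F
from itertools import permutations, product

JACK, QUEEN, KING = 0, 1, 2

def ev_player1(p1_bet, p1_call, p2_bet, p2_call):
    """Expected winnings per hand for Player 1 in Kuhn poker (ante 1, bet 1).

    Each argument maps a card (0=J, 1=Q, 2=K) to a probability:
      p1_bet  - Player 1 bets as the first action
      p1_call - Player 1 calls after checking and then facing a bet
      p2_bet  - Player 2 bets after Player 1 checks
      p2_call - Player 2 calls after Player 1 bets
    """
    total = F(0)
    for c1, c2 in permutations((JACK, QUEEN, KING), 2):
        sign = 1 if c1 > c2 else -1  # +1 if Player 1 wins a showdown
        bet_line = p2_call[c2] * 2 * sign + (1 - p2_call[c2]) * 1
        facing_bet = p1_call[c1] * 2 * sign + (1 - p1_call[c1]) * (-1)
        check_line = p2_bet[c2] * facing_bet + (1 - p2_bet[c2]) * sign
        total += p1_bet[c1] * bet_line + (1 - p1_bet[c1]) * check_line
    return total / 6  # six equally likely deals

# One equilibrium strategy for Player 1: never bet first,
# call with the Queen 1/3 of the time, always call with the King.
nash_bet = (0, 0, 0)
nash_call = (0, F(1, 3), 1)

# Hypothetical exploitable, stationary Player 2 (equilibrium would use 1/3 for both).
p2_bet = (F(3, 4), 0, 1)    # bluffs with the Jack too often
p2_call = (0, F(3, 4), 1)   # calls with the Queen too often

nash_ev = ev_player1(nash_bet, nash_call, p2_bet, p2_call)
# A best response can always be found among the 64 pure strategies of Player 1.
best_ev = max(
    ev_player1(s[:3], s[3:], p2_bet, p2_call)
    for s in product((0, 1), repeat=6)
)
print(nash_ev, best_ev)  # -1/18 1/12
```

Against this opponent the equilibrium strategy still earns only the game value of -1/18 per hand, while the best response earns +1/12, which is the kind of positive expected value that opponent modelling aims to capture.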

Poker – Future Work

• Improved Algorithms for Information-Gathering and Modelling
• Scaling up
• Non-stationary Opponents
• Other poker variants: no-limit, multi-player

Semi-Automated Gameplay Analysis

Introduction

Software Behaviour Analysis

How to test if game software behaves as intended by the designer?

Example: FIFA’99 Corner Kick

Visualization of Behaviour

Corner kicks to the coloured areas score. This was discovered by our SAGA-ML system.

SAGA-ML

[Diagram; recoverable labels: Machine Learning, rules, behaviour, control, Sampling]
