m6-game-pub

8/9/2019 m6-game-pub

1/26

3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1

Adversarial Search

Chapter 6Sections 1 4


2/26


Outline Optimal decisions

- pruning

Imperfect, real-time decisions


3/26


Games vs. search problems Unpredictable opponent specifying a

move for every possible opponent reply

Time limits unlikely to find goal, mustapproximate

Hmm: Is hex a game or a search problem bythis definition?


4/26


Lets play! Two players:

Max Min

F o r m a l D e s c r i p t i o n :

A n i n i t i a l s t a t e

S u c c e s s o r f u n c t i o n

T e r m i n a l T e s t

U t i l i t y F u n c t i o n


5/26


Game tree (2-player, deterministic, turns)

K e y p r o p e r t y : w e h a v e a z e r o -

s u m g a m e .

L o o s e l y , i t m e a n s t h a t t h e r e s a

l o s e r f o r e v e r y w i n n e r .

T o t a l u t i l i t y s c o r e o v e r a l l a g e n t s

s u m t o z e r o .

M a k e s t h e g a m e a d v e r s a r i a l .

T o t h i n k a b o u t : n o t z e r o s u m ?


6/26


Example : Game of NIMS e v e r a l p i l e s o f s t i c k s a r e g i v e n . W e r e p r e s e n t t h e c o n f i g u r a t i o n o f t h e p i l e s b y

a m o n o t o n e s e q u e n c e o f i n t e g e r s , s u c h a s ( 1 , 3 , 5 ) . A p l a y e r m a y r e m o v e , i n o n e

t u r n , a n y n u m b e r o f s t i c k s f r o m o n e p i l e . T h u s , ( 1 , 3 , 5 ) w o u l d b e c o m e ( 1 , 1 , 3 ) i f

t h e p l a y e r w e r e t o r e m o v e 4 s t i c k s f r o m t h e l a s t p i l e . T h e p l a y e r w h o t a k e s t h e

l a s t s t i c k l o s e s .

Represent the NIM game (1, 2, 2) as a game

tree.


7/26


Minimax P e r f e c t p l a y f o r d e t e r m i n i s t i c g a m e s

I d e a : c h o o s e m o v e t o p o s i t i o n w i t h h i g h e s t m i n i m a x v a l u e

= b e s t a c h i e v a b l e p a y o f f a g a i n s t b e s t p l a y

E . g . , 2 - p l y g a m e :


8/26


Minimax algorithm


9/26


Properties of minimax Complete?Yes (if tree is finite)

Optimal?Yes (against an optimal opponent)

Time complexity? O(bm )

Space complexity? O(bm) (depth-first exploration)

For chess, b 35, m 100 for reasonable games

exact solution completely infeasible What can we do?

Pruning!


10/26

3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 0

- pruning example

3


11/26


- pruning example


12/26


- pruning example


13/26


- pruning example


14/26


- pruning example


15/26


Properties of - Pruning does not affect final result

Good move ordering improves effectiveness of pruning

With perfect ordering, time complexity = O(bm / 2

) d o u b l e s d e p t h o f s e a r c h

W h a t s t h e w o r s e a n d a v e r a g e c a s e t i m e c o m p l e x i t y ?

D o e s i t m a k e s e n s e t h e n t o h a v e g o o d h e u r i s t i c s f o r w h i c h n o d e s t o

e x p a n d f i r s t ?

A simple example of the value of reasoning about whichcomputations are relevant (a form ofmetareasoning)


16/26


Why is it called -? is the value of the best

(i.e., highest-value) choice

found so far at any choicepoint along the path formax

Ifvis worse than , maxwill avoid it

p r u n e t h a t b r a n c h

Define similarly for min


17/26


The - algorithm


18/26


The - algorithm


19/26


Resource limitsThe big problem is that the search space in typical games is

very large.

Suppose we have 100 secs, explore 104

nodes/sec 10 6 nodes per move

Standard approach:

cutoff test:e . g . , d e p t h l i m i t ( p e r h a p s a d d q u i e s c e n c e s e a r c h )

evaluation function= e s t i m a t e d d e s i r a b i l i t y o f p o s i t i o n


20/26


Evaluation functions For chess, typically linear weighted sum offeatures

Eval(s)= w1

f1

(s) + w2

f2

(s) + + wn

fn

(s)

e.g., w1

= 9 with

f1

(s) = (number of white queens) (number of blackqueens), etc.

Caveat: assumes independence of the features

B i s h o p s i n c h e s s b e t t e r a t e n d g a m e

U n m o v e d k i n g a n d r o o k n e e d e d f o r c a s t l i n g

Should model the expected utility valuestates with thesame feature values lead to.


21/26


Expected utility value1 2

1

1 2 1 2

1 2 1 2

1

1

1 2

- 1

- 1

- 1

2 0

1

2 0 2 0

1 - 1

2 0

1

2 0 2 0

1 - 1

A utility value may map to many states, each ofwhich may lead to different terminal states

Want utility values to model likelihood of betterutility states.


22/26


Cutting off searchM i n i m a x C u t o f f i s i d e n t i c a l t o M i n i m a x V a l u e e x c e p t

1 . T e r m i n a l ? i s r e p l a c e d b y C u t o f f ?

2 . U t i l i t y i s r e p l a c e d b y E v a l

D o e s i t w o r k i n p r a c t i c e ?

b

m

= 1 0

6

, b = 3 5 m = 4

4 - p l y l o o k a h e a d i s a h o p e l e s s c h e s s p l a y e r !

4 - p l y

h u m a n n o v i c e

8 - p l y

t y p i c a l P C , h u m a n m a s t e r

1 2 - p l y

D e e p B l u e , K a s p a r o v


23/26


Deterministic games in practice C h e c k e r s : C h i n o o k e n d e d 4 0 - y e a r - r e i g n o f h u m a n w o r l d c h a m p i o n

M a r i o n T i n s l e y i n 1 9 9 4 . U s e d a p r e c o m p u t e d e n d g a m e d a t a b a s e

d e f i n i n g p e r f e c t p l a y f o r a l l p o s i t i o n s i n v o l v i n g 8 o r f e w e r p i e c e s o n t h e

b o a r d , a t o t a l o f 4 4 4 b i l l i o n p o s i t i o n s .

C h e s s : D e e p B l u e d e f e a t e d h u m a n w o r l d c h a m p i o n G a r r y K a s p a r o v i n a

s i x - g a m e m a t c h i n 1 9 9 7 . D e e p B l u e s e a r c h e s 2 0 0 m i l l i o n p o s i t i o n s p e r

s e c o n d , u s e s v e r y s o p h i s t i c a t e d e v a l u a t i o n , a n d u n d i s c l o s e d m e t h o d s

f o r e x t e n d i n g s o m e l i n e s o f s e a r c h u p t o 4 0 p l y .

O t h e l l o : h u m a n c h a m p i o n s r e f u s e t o c o m p e t e a g a i n s t c o m p u t e r s , w h o

a r e t o o g o o d .

G o : h u m a n c h a m p i o n s r e f u s e t o c o m p e t e a g a i n s t c o m p u t e r s , w h o a r e

t o o b a d . I n g o , b > 3 0 0 , s o m o s t p r o g r a m s u s e p a t t e r n k n o w l e d g e

b a s e s t o s u g g e s t p l a u s i b l e m o v e s .


24/26


Summary Games are fun to work on!

They illustrate several important pointsabout AI

Perfection is unattainable must

approximate Good idea to think about what to think about


25/26


Min, go over Hex! What does the web page say?

http://www.comp.nus.edu.sg/~cs3243/


26/26


What do you need to do Implement Minimax

Implement Pruning (optional) Implement an evaluation function

Input: board, selected grid location Output: continuous value

(really optional) use state

Documents

m6-game-pub