m6-game-pub

Embed Size (px)

Citation preview

  • 8/9/2019 m6-game-pub

    1/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1

    Adversarial Search

    Chapter 6Sections 1 4

  • 8/9/2019 m6-game-pub

    2/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 2

    Outline Optimal decisions

    - pruning

    Imperfect, real-time decisions

  • 8/9/2019 m6-game-pub

    3/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 3

    Games vs. search problems Unpredictable opponent specifying a

    move for every possible opponent reply

    Time limits unlikely to find goal, mustapproximate

    Hmm: Is hex a game or a search problem bythis definition?

  • 8/9/2019 m6-game-pub

    4/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 4

    Lets play! Two players:

    Max Min

    F o r m a l D e s c r i p t i o n :

    A n i n i t i a l s t a t e

    S u c c e s s o r f u n c t i o n

    T e r m i n a l T e s t

    U t i l i t y F u n c t i o n

  • 8/9/2019 m6-game-pub

    5/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 5

    Game tree (2-player, deterministic, turns)

    K e y p r o p e r t y : w e h a v e a z e r o -

    s u m g a m e .

    L o o s e l y , i t m e a n s t h a t t h e r e s a

    l o s e r f o r e v e r y w i n n e r .

    T o t a l u t i l i t y s c o r e o v e r a l l a g e n t s

    s u m t o z e r o .

    M a k e s t h e g a m e a d v e r s a r i a l .

    T o t h i n k a b o u t : n o t z e r o s u m ?

  • 8/9/2019 m6-game-pub

    6/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 6

    Example : Game of NIMS e v e r a l p i l e s o f s t i c k s a r e g i v e n . W e r e p r e s e n t t h e c o n f i g u r a t i o n o f t h e p i l e s b y

    a m o n o t o n e s e q u e n c e o f i n t e g e r s , s u c h a s ( 1 , 3 , 5 ) . A p l a y e r m a y r e m o v e , i n o n e

    t u r n , a n y n u m b e r o f s t i c k s f r o m o n e p i l e . T h u s , ( 1 , 3 , 5 ) w o u l d b e c o m e ( 1 , 1 , 3 ) i f

    t h e p l a y e r w e r e t o r e m o v e 4 s t i c k s f r o m t h e l a s t p i l e . T h e p l a y e r w h o t a k e s t h e

    l a s t s t i c k l o s e s .

    Represent the NIM game (1, 2, 2) as a game

    tree.

  • 8/9/2019 m6-game-pub

    7/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 7

    Minimax P e r f e c t p l a y f o r d e t e r m i n i s t i c g a m e s

    I d e a : c h o o s e m o v e t o p o s i t i o n w i t h h i g h e s t m i n i m a x v a l u e

    = b e s t a c h i e v a b l e p a y o f f a g a i n s t b e s t p l a y

    E . g . , 2 - p l y g a m e :

  • 8/9/2019 m6-game-pub

    8/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 8

    Minimax algorithm

  • 8/9/2019 m6-game-pub

    9/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 9

    Properties of minimax Complete?Yes (if tree is finite)

    Optimal?Yes (against an optimal opponent)

    Time complexity? O(bm )

    Space complexity? O(bm) (depth-first exploration)

    For chess, b 35, m 100 for reasonable games

    exact solution completely infeasible What can we do?

    Pruning!

  • 8/9/2019 m6-game-pub

    10/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 0

    - pruning example

    3

  • 8/9/2019 m6-game-pub

    11/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 1

    - pruning example

  • 8/9/2019 m6-game-pub

    12/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 2

    - pruning example

  • 8/9/2019 m6-game-pub

    13/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 3

    - pruning example

  • 8/9/2019 m6-game-pub

    14/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 4

    - pruning example

  • 8/9/2019 m6-game-pub

    15/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 5

    Properties of - Pruning does not affect final result

    Good move ordering improves effectiveness of pruning

    With perfect ordering, time complexity = O(bm / 2

    ) d o u b l e s d e p t h o f s e a r c h

    W h a t s t h e w o r s e a n d a v e r a g e c a s e t i m e c o m p l e x i t y ?

    D o e s i t m a k e s e n s e t h e n t o h a v e g o o d h e u r i s t i c s f o r w h i c h n o d e s t o

    e x p a n d f i r s t ?

    A simple example of the value of reasoning about whichcomputations are relevant (a form ofmetareasoning)

  • 8/9/2019 m6-game-pub

    16/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 6

    Why is it called -? is the value of the best

    (i.e., highest-value) choice

    found so far at any choicepoint along the path formax

    Ifvis worse than , maxwill avoid it

    p r u n e t h a t b r a n c h

    Define similarly for min

  • 8/9/2019 m6-game-pub

    17/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 7

    The - algorithm

  • 8/9/2019 m6-game-pub

    18/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 8

    The - algorithm

  • 8/9/2019 m6-game-pub

    19/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 9

    Resource limitsThe big problem is that the search space in typical games is

    very large.

    Suppose we have 100 secs, explore 104

    nodes/sec 10 6 nodes per move

    Standard approach:

    cutoff test:e . g . , d e p t h l i m i t ( p e r h a p s a d d q u i e s c e n c e s e a r c h )

    evaluation function= e s t i m a t e d d e s i r a b i l i t y o f p o s i t i o n

  • 8/9/2019 m6-game-pub

    20/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 2 0

    Evaluation functions For chess, typically linear weighted sum offeatures

    Eval(s)= w1

    f1

    (s) + w2

    f2

    (s) + + wn

    fn

    (s)

    e.g., w1

    = 9 with

    f1

    (s) = (number of white queens) (number of blackqueens), etc.

    Caveat: assumes independence of the features

    B i s h o p s i n c h e s s b e t t e r a t e n d g a m e

    U n m o v e d k i n g a n d r o o k n e e d e d f o r c a s t l i n g

    Should model the expected utility valuestates with thesame feature values lead to.

  • 8/9/2019 m6-game-pub

    21/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 2 1

    Expected utility value1 2

    1

    1 2 1 2

    1 2 1 2

    1

    1

    1 2

    - 1

    - 1

    - 1

    2 0

    1

    2 0 2 0

    1 - 1

    2 0

    1

    2 0 2 0

    1 - 1

    A utility value may map to many states, each ofwhich may lead to different terminal states

    Want utility values to model likelihood of betterutility states.

  • 8/9/2019 m6-game-pub

    22/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 2 2

    Cutting off searchM i n i m a x C u t o f f i s i d e n t i c a l t o M i n i m a x V a l u e e x c e p t

    1 . T e r m i n a l ? i s r e p l a c e d b y C u t o f f ?

    2 . U t i l i t y i s r e p l a c e d b y E v a l

    D o e s i t w o r k i n p r a c t i c e ?

    b

    m

    = 1 0

    6

    , b = 3 5 m = 4

    4 - p l y l o o k a h e a d i s a h o p e l e s s c h e s s p l a y e r !

    4 - p l y

    h u m a n n o v i c e

    8 - p l y

    t y p i c a l P C , h u m a n m a s t e r

    1 2 - p l y

    D e e p B l u e , K a s p a r o v

  • 8/9/2019 m6-game-pub

    23/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 2 3

    Deterministic games in practice C h e c k e r s : C h i n o o k e n d e d 4 0 - y e a r - r e i g n o f h u m a n w o r l d c h a m p i o n

    M a r i o n T i n s l e y i n 1 9 9 4 . U s e d a p r e c o m p u t e d e n d g a m e d a t a b a s e

    d e f i n i n g p e r f e c t p l a y f o r a l l p o s i t i o n s i n v o l v i n g 8 o r f e w e r p i e c e s o n t h e

    b o a r d , a t o t a l o f 4 4 4 b i l l i o n p o s i t i o n s .

    C h e s s : D e e p B l u e d e f e a t e d h u m a n w o r l d c h a m p i o n G a r r y K a s p a r o v i n a

    s i x - g a m e m a t c h i n 1 9 9 7 . D e e p B l u e s e a r c h e s 2 0 0 m i l l i o n p o s i t i o n s p e r

    s e c o n d , u s e s v e r y s o p h i s t i c a t e d e v a l u a t i o n , a n d u n d i s c l o s e d m e t h o d s

    f o r e x t e n d i n g s o m e l i n e s o f s e a r c h u p t o 4 0 p l y .

    O t h e l l o : h u m a n c h a m p i o n s r e f u s e t o c o m p e t e a g a i n s t c o m p u t e r s , w h o

    a r e t o o g o o d .

    G o : h u m a n c h a m p i o n s r e f u s e t o c o m p e t e a g a i n s t c o m p u t e r s , w h o a r e

    t o o b a d . I n g o , b > 3 0 0 , s o m o s t p r o g r a m s u s e p a t t e r n k n o w l e d g e

    b a s e s t o s u g g e s t p l a u s i b l e m o v e s .

  • 8/9/2019 m6-game-pub

    24/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 2 4

    Summary Games are fun to work on!

    They illustrate several important pointsabout AI

    Perfection is unattainable must

    approximate Good idea to think about what to think about

  • 8/9/2019 m6-game-pub

    25/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 2 5

    Min, go over Hex! What does the web page say?

    http://www.comp.nus.edu.sg/~cs3243/

  • 8/9/2019 m6-game-pub

    26/26

    3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 2 6

    What do you need to do Implement Minimax

    Implement Pruning (optional) Implement an evaluation function

    Input: board, selected grid location Output: continuous value

    (really optional) use state