Upload
chakravarthyashok
View
214
Download
0
Embed Size (px)
Citation preview
8/9/2019 m6-game-pub
1/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1
Adversarial Search
Chapter 6Sections 1 4
8/9/2019 m6-game-pub
2/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 2
Outline Optimal decisions
- pruning
Imperfect, real-time decisions
8/9/2019 m6-game-pub
3/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 3
Games vs. search problems Unpredictable opponent specifying a
move for every possible opponent reply
Time limits unlikely to find goal, mustapproximate
Hmm: Is hex a game or a search problem bythis definition?
8/9/2019 m6-game-pub
4/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 4
Lets play! Two players:
Max Min
F o r m a l D e s c r i p t i o n :
A n i n i t i a l s t a t e
S u c c e s s o r f u n c t i o n
T e r m i n a l T e s t
U t i l i t y F u n c t i o n
8/9/2019 m6-game-pub
5/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 5
Game tree (2-player, deterministic, turns)
K e y p r o p e r t y : w e h a v e a z e r o -
s u m g a m e .
L o o s e l y , i t m e a n s t h a t t h e r e s a
l o s e r f o r e v e r y w i n n e r .
T o t a l u t i l i t y s c o r e o v e r a l l a g e n t s
s u m t o z e r o .
M a k e s t h e g a m e a d v e r s a r i a l .
T o t h i n k a b o u t : n o t z e r o s u m ?
8/9/2019 m6-game-pub
6/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 6
Example : Game of NIMS e v e r a l p i l e s o f s t i c k s a r e g i v e n . W e r e p r e s e n t t h e c o n f i g u r a t i o n o f t h e p i l e s b y
a m o n o t o n e s e q u e n c e o f i n t e g e r s , s u c h a s ( 1 , 3 , 5 ) . A p l a y e r m a y r e m o v e , i n o n e
t u r n , a n y n u m b e r o f s t i c k s f r o m o n e p i l e . T h u s , ( 1 , 3 , 5 ) w o u l d b e c o m e ( 1 , 1 , 3 ) i f
t h e p l a y e r w e r e t o r e m o v e 4 s t i c k s f r o m t h e l a s t p i l e . T h e p l a y e r w h o t a k e s t h e
l a s t s t i c k l o s e s .
Represent the NIM game (1, 2, 2) as a game
tree.
8/9/2019 m6-game-pub
7/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 7
Minimax P e r f e c t p l a y f o r d e t e r m i n i s t i c g a m e s
I d e a : c h o o s e m o v e t o p o s i t i o n w i t h h i g h e s t m i n i m a x v a l u e
= b e s t a c h i e v a b l e p a y o f f a g a i n s t b e s t p l a y
E . g . , 2 - p l y g a m e :
8/9/2019 m6-game-pub
8/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 8
Minimax algorithm
8/9/2019 m6-game-pub
9/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 9
Properties of minimax Complete?Yes (if tree is finite)
Optimal?Yes (against an optimal opponent)
Time complexity? O(bm )
Space complexity? O(bm) (depth-first exploration)
For chess, b 35, m 100 for reasonable games
exact solution completely infeasible What can we do?
Pruning!
8/9/2019 m6-game-pub
10/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 0
- pruning example
3
8/9/2019 m6-game-pub
11/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 1
- pruning example
8/9/2019 m6-game-pub
12/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 2
- pruning example
8/9/2019 m6-game-pub
13/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 3
- pruning example
8/9/2019 m6-game-pub
14/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 4
- pruning example
8/9/2019 m6-game-pub
15/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 5
Properties of - Pruning does not affect final result
Good move ordering improves effectiveness of pruning
With perfect ordering, time complexity = O(bm / 2
) d o u b l e s d e p t h o f s e a r c h
W h a t s t h e w o r s e a n d a v e r a g e c a s e t i m e c o m p l e x i t y ?
D o e s i t m a k e s e n s e t h e n t o h a v e g o o d h e u r i s t i c s f o r w h i c h n o d e s t o
e x p a n d f i r s t ?
A simple example of the value of reasoning about whichcomputations are relevant (a form ofmetareasoning)
8/9/2019 m6-game-pub
16/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 6
Why is it called -? is the value of the best
(i.e., highest-value) choice
found so far at any choicepoint along the path formax
Ifvis worse than , maxwill avoid it
p r u n e t h a t b r a n c h
Define similarly for min
8/9/2019 m6-game-pub
17/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 7
The - algorithm
8/9/2019 m6-game-pub
18/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 8
The - algorithm
8/9/2019 m6-game-pub
19/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 1 9
Resource limitsThe big problem is that the search space in typical games is
very large.
Suppose we have 100 secs, explore 104
nodes/sec 10 6 nodes per move
Standard approach:
cutoff test:e . g . , d e p t h l i m i t ( p e r h a p s a d d q u i e s c e n c e s e a r c h )
evaluation function= e s t i m a t e d d e s i r a b i l i t y o f p o s i t i o n
8/9/2019 m6-game-pub
20/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 2 0
Evaluation functions For chess, typically linear weighted sum offeatures
Eval(s)= w1
f1
(s) + w2
f2
(s) + + wn
fn
(s)
e.g., w1
= 9 with
f1
(s) = (number of white queens) (number of blackqueens), etc.
Caveat: assumes independence of the features
B i s h o p s i n c h e s s b e t t e r a t e n d g a m e
U n m o v e d k i n g a n d r o o k n e e d e d f o r c a s t l i n g
Should model the expected utility valuestates with thesame feature values lead to.
8/9/2019 m6-game-pub
21/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 2 1
Expected utility value1 2
1
1 2 1 2
1 2 1 2
1
1
1 2
- 1
- 1
- 1
2 0
1
2 0 2 0
1 - 1
2 0
1
2 0 2 0
1 - 1
A utility value may map to many states, each ofwhich may lead to different terminal states
Want utility values to model likelihood of betterutility states.
8/9/2019 m6-game-pub
22/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 2 2
Cutting off searchM i n i m a x C u t o f f i s i d e n t i c a l t o M i n i m a x V a l u e e x c e p t
1 . T e r m i n a l ? i s r e p l a c e d b y C u t o f f ?
2 . U t i l i t y i s r e p l a c e d b y E v a l
D o e s i t w o r k i n p r a c t i c e ?
b
m
= 1 0
6
, b = 3 5 m = 4
4 - p l y l o o k a h e a d i s a h o p e l e s s c h e s s p l a y e r !
4 - p l y
h u m a n n o v i c e
8 - p l y
t y p i c a l P C , h u m a n m a s t e r
1 2 - p l y
D e e p B l u e , K a s p a r o v
8/9/2019 m6-game-pub
23/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 2 3
Deterministic games in practice C h e c k e r s : C h i n o o k e n d e d 4 0 - y e a r - r e i g n o f h u m a n w o r l d c h a m p i o n
M a r i o n T i n s l e y i n 1 9 9 4 . U s e d a p r e c o m p u t e d e n d g a m e d a t a b a s e
d e f i n i n g p e r f e c t p l a y f o r a l l p o s i t i o n s i n v o l v i n g 8 o r f e w e r p i e c e s o n t h e
b o a r d , a t o t a l o f 4 4 4 b i l l i o n p o s i t i o n s .
C h e s s : D e e p B l u e d e f e a t e d h u m a n w o r l d c h a m p i o n G a r r y K a s p a r o v i n a
s i x - g a m e m a t c h i n 1 9 9 7 . D e e p B l u e s e a r c h e s 2 0 0 m i l l i o n p o s i t i o n s p e r
s e c o n d , u s e s v e r y s o p h i s t i c a t e d e v a l u a t i o n , a n d u n d i s c l o s e d m e t h o d s
f o r e x t e n d i n g s o m e l i n e s o f s e a r c h u p t o 4 0 p l y .
O t h e l l o : h u m a n c h a m p i o n s r e f u s e t o c o m p e t e a g a i n s t c o m p u t e r s , w h o
a r e t o o g o o d .
G o : h u m a n c h a m p i o n s r e f u s e t o c o m p e t e a g a i n s t c o m p u t e r s , w h o a r e
t o o b a d . I n g o , b > 3 0 0 , s o m o s t p r o g r a m s u s e p a t t e r n k n o w l e d g e
b a s e s t o s u g g e s t p l a u s i b l e m o v e s .
8/9/2019 m6-game-pub
24/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 2 4
Summary Games are fun to work on!
They illustrate several important pointsabout AI
Perfection is unattainable must
approximate Good idea to think about what to think about
8/9/2019 m6-game-pub
25/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 2 5
Min, go over Hex! What does the web page say?
http://www.comp.nus.edu.sg/~cs3243/
8/9/2019 m6-game-pub
26/26
3 F e b 2 0 0 5 C S 3 2 4 3 - G a m e p l a y i n g 2 6
What do you need to do Implement Minimax
Implement Pruning (optional) Implement an evaluation function
Input: board, selected grid location Output: continuous value
(really optional) use state