SE 420 1 This time: Outline Game playing The minimax algorithm Resource limitations alpha-beta pruning Elements of chance


SE 420 1

This time: Outline

• Game playing
• The minimax algorithm
• Resource limitations
• alpha-beta pruning
• Elements of chance


What kind of games?

• Abstraction: to describe a game we must capture every relevant aspect of the game. Examples:
• Chess
• Tic-tac-toe
• …

• Accessible environments: Such games are characterized by perfect information

• Search: game-playing then consists of a search through possible game positions

• Unpredictable opponent: introduces uncertainty, so game-playing must deal with contingency problems


Searching for the next move

• Complexity: many games have a huge search space
• Chess: b = 35, m = 100, so nodes = 35^100 ≈ 10^154

if each node takes about 1 ns to explore, then each move will take on the order of 10^135 millennia to calculate.
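The order of magnitude can be checked with a few lines of Python. This is only a rough sketch: the length of a millennium in seconds is an assumed constant, and logarithms are used so the huge numbers never have to be materialized.

```python
import math

b, m = 35, 100                       # branching factor and depth from the slide
nodes_log10 = m * math.log10(b)      # log10 of b**m

seconds_log10 = nodes_log10 - 9      # 1 ns = 1e-9 s per node
# Assumption: a millennium is 1000 Julian years of 365.25 days.
SECONDS_PER_MILLENNIUM = 1000 * 365.25 * 24 * 3600
millennia_log10 = seconds_log10 - math.log10(SECONDS_PER_MILLENNIUM)

print(f"nodes     ~ 10^{nodes_log10:.1f}")
print(f"millennia ~ 10^{millennia_log10:.1f}")
```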

• Resource (e.g., time, memory) limit: optimal solution not feasible/possible, thus must approximate

1. Pruning: makes the search more efficient by discarding portions of the search tree that cannot improve the quality of the result.

2. Evaluation functions: heuristics to evaluate utility of a state without exhaustive search.


Two-player games

• A game formulated as a search problem:

• Initial state: ?
• Operators: ?
• Terminal state: ?
• Utility function: ?


Two-player games

• A game formulated as a search problem:

• Initial state: board position and turn
• Operators: definition of legal moves
• Terminal state: conditions for when game is over
• Utility function: a numeric value that describes the outcome of the game. E.g., -1, 0, 1 for loss, draw, win. (AKA payoff function)
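As a hedged illustration, the four components can be sketched for tic-tac-toe. The function names (`initial_state`, `legal_moves`, `apply_move`, `is_terminal`, `utility`) and the board encoding are assumptions made for this example, not part of the lecture.

```python
from typing import List, Optional, Tuple

Board = Tuple[str, ...]      # 9 cells, each "x", "o", or " "
State = Tuple[Board, str]    # (board position, player to move)

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def initial_state() -> State:
    """Initial state: empty board, x to move."""
    return (" ",) * 9, "x"

def legal_moves(state: State) -> List[int]:
    """Operators: any empty cell may be played."""
    board, _ = state
    return [i for i, cell in enumerate(board) if cell == " "]

def apply_move(state: State, move: int) -> State:
    board, player = state
    new_board = board[:move] + (player,) + board[move + 1:]
    return new_board, ("o" if player == "x" else "x")

def winner(board: Board) -> Optional[str]:
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def is_terminal(state: State) -> bool:
    """Terminal state: someone has three in a row, or the board is full."""
    board, _ = state
    return winner(board) is not None or " " not in board

def utility(state: State, player: str = "x") -> int:
    """Utility: -1, 0, 1 for loss, draw, win from `player`'s point of view."""
    w = winner(state[0])
    if w is None:
        return 0
    return 1 if w == player else -1
```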


Game vs. search problem


Example: Tic-Tac-Toe


Type of games


Type of games


The minimax algorithm

• Perfect play for deterministic environments with perfect information

• Basic idea: choose the move with the highest minimax value = best achievable payoff against best play

• Algorithm:

1. Generate game tree completely

2. Determine utility of each terminal state

3. Propagate the utility values upward in the tree by applying MIN and MAX operators on the nodes in the current level

4. At the root node use minimax decision to select the move with the max (of the min) utility value

• Steps 2 and 3 in the algorithm assume that the opponent will play perfectly.
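The steps above can be sketched on an explicit tree. This is a minimal illustration, assuming the tree is given as nested Python lists with integer utilities at the leaves; the leaf values are chosen for illustration and the function names are hypothetical.

```python
from typing import List, Union

Tree = Union[int, List["Tree"]]   # a leaf utility or a list of subtrees

def minimax_value(node: Tree, maximizing: bool) -> int:
    """Back up utilities: MAX nodes take the max, MIN nodes the min."""
    if isinstance(node, int):      # terminal state: return its utility
        return node
    values = [minimax_value(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def minimax_decision(root: List[Tree]) -> int:
    """Index of the root move with the highest backed-up MIN value."""
    values = [minimax_value(child, maximizing=False) for child in root]
    return max(range(len(values)), key=values.__getitem__)

# MAX root, three MIN nodes, three leaves each.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax_value(tree, maximizing=True))  # 3
print(minimax_decision(tree))                # 0 (the leftmost move)
```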


Generate Game Tree


Generate Game Tree

[Figure: tic-tac-toe game tree, showing X's opening moves]


Generate Game Tree

[Figure: tic-tac-toe game tree, showing X's opening moves and O's replies]


Generate Game Tree

[Figure: tic-tac-toe game tree, showing X's opening moves and O's replies, annotated with the labels "1 ply" and "1 move"]


A subtree

[Figure: a subtree of the tic-tac-toe game tree; outcomes are marked win, lose, or draw]


What is a good move?

[Figure: the same subtree of the tic-tac-toe game tree; outcomes are marked win, lose, or draw]


Minimax

[Figure: two-ply game tree with leaf values 3, 12, 8 | 2, 4, 6 | 14, 5, 2]

• Minimize opponent’s chance
• Maximize your chance


Minimax

[Figure: two-ply game tree with leaf values 3, 12, 8 | 2, 4, 6 | 14, 5, 2; the MIN nodes take the values 3, 2, 2]

• Minimize opponent’s chance
• Maximize your chance


Minimax

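[Figure: two-ply game tree with leaf values 3, 12, 8 | 2, 4, 6 | 14, 5, 2; the MIN nodes take the values 3, 2, 2 and the MAX root takes the value 3]

• Minimize opponent’s chance
• Maximize your chance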



minimax = maximum of the minimum

1st ply

2nd ply


Minimax: Recursive implementation

Complete: ?
Optimal: ?

Time complexity: ?
Space complexity: ?


Minimax: Recursive implementation

Complete: Yes, for finite state-space
Optimal: Yes

Time complexity: O(b^m)
Space complexity: O(bm) (= DFS; does not keep all nodes in memory)


1. Move evaluation without complete search

• Complete search is too complex and impractical

• Evaluation function: evaluates value of state using heuristics and cuts off search

• New MINIMAX:
• CUTOFF-TEST: cutoff test to replace the termination condition (e.g., deadline, depth-limit, etc.)
• EVAL: evaluation function to replace utility function (e.g., number of chess pieces taken)
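A minimal sketch of minimax with CUTOFF-TEST and EVAL follows. The signature, the callback names (`successors`, `cutoff_test`, `eval_fn`), and the toy number game used to exercise it are all assumptions for illustration.

```python
from typing import Callable, Iterable

def minimax_cutoff(state, depth: int, maximizing: bool,
                   successors: Callable[[object], Iterable],
                   cutoff_test: Callable[[object, int], bool],
                   eval_fn: Callable[[object], float]) -> float:
    """Minimax where CUTOFF-TEST replaces the terminal test and
    EVAL replaces the utility function."""
    children = list(successors(state))
    if cutoff_test(state, depth) or not children:
        return eval_fn(state)
    values = [minimax_cutoff(c, depth + 1, not maximizing,
                             successors, cutoff_test, eval_fn)
              for c in children]
    return max(values) if maximizing else min(values)

# Toy game (hypothetical): a state is an integer n with successors 2n and 2n+1.
value = minimax_cutoff(
    1, 0, True,
    successors=lambda n: [2 * n, 2 * n + 1],
    cutoff_test=lambda n, depth: depth >= 2,   # depth limit of 2 plies
    eval_fn=lambda n: n % 5,                   # arbitrary heuristic
)
print(value)  # 1
```

Here the depth limit plays the role of the cutoff; a deadline-based cutoff would only change the `cutoff_test` callback.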


Evaluation functions

• Weighted linear evaluation function: to combine n heuristics

f = w_1 f_1 + w_2 f_2 + … + w_n f_n

E.g., the w’s could be the values of pieces (1 for pawn, 3 for bishop, etc.) and the f’s could be the number of each type of piece on the board
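A weighted linear evaluation function is a dot product of weights and features. In the sketch below the weights for knight, rook, and queen are conventional values assumed for illustration (the slide only gives pawn and bishop), and the features are taken to be material differences (own count minus opponent count), which is also an assumption.

```python
from typing import Sequence

def linear_eval(weights: Sequence[float], features: Sequence[float]) -> float:
    """f = w_1*f_1 + w_2*f_2 + ... + w_n*f_n"""
    return sum(w * f for w, f in zip(weights, features))

# Order: pawn, knight, bishop, rook, queen (assumed piece values).
weights = [1, 3, 3, 5, 9]
# Hypothetical position: one pawn up, one bishop down, otherwise level.
features = [1, 0, -1, 0, 0]

print(linear_eval(weights, features))  # -2
```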


Note: exact values do not matter


Minimax with cutoff: viable algorithm?

Assume we have 100 seconds and can evaluate 10^4 nodes/s; we can then evaluate 10^6 nodes/move
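To see how far that budget reaches, one can solve b^m = nodes for m. The sketch below reads the figures as 10^4 nodes/s and 10^6 nodes per move, and assumes the chess branching factor b = 35 used earlier in the lecture.

```python
import math

time_budget_s = 100
nodes_per_s = 10**4
b = 35                                   # assumed branching factor (chess)

nodes_per_move = time_budget_s * nodes_per_s
depth = math.log(nodes_per_move, b)      # solve b**depth == nodes_per_move

print(nodes_per_move)     # 1000000
print(round(depth, 1))    # about 4 plies of look-ahead
```

Under these assumptions plain minimax with cutoff looks only about four plies ahead, which motivates the pruning discussed next.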


2. α-β pruning: search cutoff

• Pruning: eliminating a branch of the search tree from consideration without exhaustive examination of each node

• α-β pruning: the basic idea is to prune portions of the search tree that cannot improve the utility value of the MAX or MIN node, by considering only the values of nodes seen so far.

• Does it work? Yes: with good move ordering, it roughly cuts the effective branching factor from b to √b, allowing about twice the look-ahead depth of pure minimax in the same time


α-β pruning: example

[Figure: MAX root marked 6; the first MIN node has leaves 6, 12, 8 and value 6]


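α-β pruning: example

[Figure: MAX root marked 6; the first MIN node has leaves 6, 12, 8 and value 6; the second MIN node has seen leaf 2 and is marked 2]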


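α-β pruning: example

[Figure: MAX root marked 6; the first MIN node has leaves 6, 12, 8 and value 6; the second MIN node is marked 2 after leaf 2; the third MIN node is marked 5 after leaf 5]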


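α-β pruning: example

[Figure: MAX root marked 6; MIN nodes marked 6 (leaves 6, 12, 8), 2 (after leaf 2), and 5 (after leaf 5); one root move is labeled "Selected move"]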


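α-β pruning: general principle

[Figure: alternating Player and Opponent levels; m is a node higher in the tree, n is a node deeper on the current path, with value v]

If α > v, then MAX will choose m, so prune the tree under n

Similarly for β for MIN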


Properties of α-β


The α-β algorithm:


More on the α-β algorithm

• Same basic idea as minimax, but prune (cut away) branches of the tree that we know will not contain the solution.


More on the α-β algorithm: start from Minimax


Remember: Minimax: Recursive implementation

Complete: Yes, for finite state-space
Optimal: Yes

Time complexity: O(b^m)
Space complexity: O(bm) (= DFS; does not keep all nodes in memory)


More on the α-β algorithm

• Same basic idea as minimax, but prune (cut away) branches of the tree that we know will not contain the solution.

• Because minimax is depth-first, let’s consider nodes along a given path in the tree. Then, as we go along this path, we keep track of:
α: best choice so far for MAX
β: best choice so far for MIN


More on the α-β algorithm: start from Minimax

Note: α and β are both local variables. At the start of the algorithm, we initialize them to α = -∞ and β = +∞
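A minimal sketch of α-β as a pair of mutually recursive functions is given below. The nested-list tree encoding, the `visited` list used to record which leaves are examined, and the convention of returning the bound itself on a cutoff are assumptions of this sketch rather than the lecture's exact pseudocode.

```python
import math
from typing import List, Union

Tree = Union[float, List["Tree"]]   # a leaf value or a list of subtrees

def max_value(node: Tree, alpha: float, beta: float, visited: list) -> float:
    if not isinstance(node, list):          # leaf: record and return its value
        visited.append(node)
        return node
    for child in node:
        alpha = max(alpha, min_value(child, alpha, beta, visited))
        if alpha >= beta:                   # MIN above will never allow this
            return beta
    return alpha

def min_value(node: Tree, alpha: float, beta: float, visited: list) -> float:
    if not isinstance(node, list):
        visited.append(node)
        return node
    for child in node:
        beta = min(beta, max_value(child, alpha, beta, visited))
        if beta <= alpha:                   # MAX above already has something better
            return alpha
    return beta

# MAX root over two MIN nodes with three leaves each (illustrative values).
tree = [[5, 10, 6], [2, 8, 7]]
visited: list = []
value = max_value(tree, -math.inf, math.inf, visited)
print(value)    # 5
print(visited)  # [5, 10, 6, 2]; leaves 8 and 7 are pruned
```

α and β are passed by value, so each call works on its own local copies, as the note above says.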


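More on the α-β algorithm

[Figure: MAX root over two MIN nodes, with leaves 5, 10, 6 and 2, 8, 7]

Root (MAX): α = -∞, β = +∞

In Min-Value (Min-Value loops over the leaves 5, 10, 6): α = -∞, β = 5 at each of the three leaves

Max-Value loops over the MIN nodes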


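More on the α-β algorithm

[Figure: MAX root over two MIN nodes, with leaves 5, 10, 6 and 2, 8, 7]

Root (MAX), initially: α = -∞, β = +∞

First MIN node, at leaves 5, 10, 6: α = -∞, β = 5

In Max-Value (Max-Value loops over the MIN nodes): α = 5, β = +∞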


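More on the α-β algorithm

[Figure: MAX root over two MIN nodes, with leaves 5, 10, 6 and 2, 8, 7]

Root (MAX), initially: α = -∞, β = +∞

First MIN node, at leaves 5, 10, 6: α = -∞, β = 5

In Min-Value (Min-Value loops over the leaves 2, 8, 7): α = 5, β = +∞ on entry; at leaf 2: α = 5, β = 2, so end loop and return 5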


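More on the α-β algorithm

[Figure: MAX root over two MIN nodes, with leaves 5, 10, 6 and 2, 8, 7]

Root (MAX), initially: α = -∞, β = +∞

First MIN node, at leaves 5, 10, 6: α = -∞, β = 5

Second MIN node: α = 5, β = +∞ on entry; at leaf 2: α = 5, β = 2, so end loop and return 5

In Max-Value (Max-Value loops over the MIN nodes): α = 5, β = +∞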


Example


α-β algorithm:


Solution

NODE  TYPE  ALPHA  BETA  SCORE
A     Max   -I     +I
B     Min   -I     +I
C     Max   -I     +I
D     Min   -I     +I
E     Max   10     10    10
D     Min   -I     10
F     Max   11     11    11
D     Min   -I     10    10
C     Max   10     +I
G     Min   10     +I
H     Max   9      9     9
G     Min   10     9     9
C     Max   10     +I    10
B     Min   -I     10
J     Max   -I     10
K     Min   -I     10
L     Max   14     14    14
K     Min   -I     10    10
…

NODE  TYPE  ALPHA  BETA  SCORE
…
J     Max   10     10    10
B     Min   -I     10    10
A     Max   10     +I
Q     Min   10     +I
R     Max   10     +I
S     Min   10     +I
T     Max   5      5     5
S     Min   10     5     5
R     Max   10     +I
V     Min   10     +I
W     Max   4      4     4
V     Min   10     4     4
R     Max   10     +I    10
Q     Min   10     10    10
A     Max   10     10    10


State-of-the-art for deterministic games


Nondeterministic games


Algorithm for nondeterministic games


Remember: Minimax algorithm


Nondeterministic games: the element of chance

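[Figure: game tree with a CHANCE level; each chance outcome has probability 0.5, and the values still to be backed up are marked "?"]

expectimax and expectimin: expected values over all possible outcomes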


Nondeterministic games: the element of chance

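[Figure: game tree with a CHANCE level; a CHANCE node has two outcomes with values 3 and 5, each with probability 0.5]

CHANCE node value: 4 = 0.5*3 + 0.5*5

Labels: Expectimax, Expectimin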
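The backed-up value of a chance node is the probability-weighted average of its children. The sketch below is a hedged illustration: the tuple encoding of nodes and the function name `expectiminimax` are assumptions of this example.

```python
from typing import Union

# A node is a leaf value, ("max", [children]), ("min", [children]),
# or ("chance", [(probability, child), ...]).
Node = Union[int, float, tuple]

def expectiminimax(node: Node) -> float:
    if isinstance(node, (int, float)):
        return float(node)
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    if kind == "chance":
        # expected value over all possible outcomes
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node kind: {kind!r}")

chance = ("chance", [(0.5, 3), (0.5, 5)])
print(expectiminimax(chance))  # 4.0 = 0.5*3 + 0.5*5
```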


Evaluation functions: Exact values DO matter

Order-preserving transformations do not necessarily behave the same!
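A small numeric illustration, with made-up leaf values and probabilities: rescaling the leaves monotonically keeps their order but changes which move has the higher expected value.

```python
def expected(outcomes):
    """Expected value of a chance node given (probability, value) pairs."""
    return sum(p * v for p, v in outcomes)

def best_move(moves):
    """Name of the move whose chance node has the highest expected value."""
    return max(moves, key=lambda name: expected(moves[name]))

# Hypothetical two-move position; each move leads to a chance node.
moves = {
    "left":  [(0.9, 2), (0.1, 3)],   # expected value 2.1
    "right": [(0.9, 1), (0.1, 4)],   # expected value 1.3
}

# An order-preserving (monotonic) rescaling of the leaf values.
rescale = {1: 1, 2: 20, 3: 30, 4: 400}
transformed = {name: [(p, rescale[v]) for p, v in outs]
               for name, outs in moves.items()}

print(best_move(moves))        # left
print(best_move(transformed))  # right (21 versus 40.9)
```

With pure MIN and MAX nodes the same rescaling would leave the chosen move unchanged, since only the ordering of values matters there.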


State-of-the-art for nondeterministic games


Summary


Exercise: Game Playing

Consider the following game tree in which the evaluation function values are shown below each leaf node. Assume that the root node corresponds to the maximizing player. Assume the search always visits children left-to-right.

[Figure: four-level game tree with the nodes listed below]

Max: A
Min: B C D
Max: E F G H I J K
Min: L M N O P Q R S T U V W YX
Leaf values: 2 3 8 5 7 6 0 1 5 2 8 4 210

(a) Compute the backed-up values computed by the minimax algorithm. Show your answer by writing values at the appropriate nodes in the above tree.

(b) Compute the backed-up values computed by the alpha-beta algorithm. What nodes will not be examined by the alpha-beta pruning algorithm?

(c) What move should Max choose once the values have been backed-up all the way?