G51IAI Introduction to AI Minmax and Alpha Beta Pruning Garry Kasparov and Deep Blue. © 1997, GM Gabriel Schwartzman's Chess Camera, courtesy IBM.



Game Playing - Minimax

• Game Playing

• An opponent tries to thwart your every move

• 1944 - John von Neumann outlined a search method (Minimax) that maximised your position whilst minimising your opponent's


Example Game Tic Tac Toe


Game Playing – Example

• Nim (a simple game)

• Start with a single pile of tokens

• At each move the player must select a pile and divide the tokens into two non-empty, non-equal piles
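
The move rule above can be sketched in Python. This is a minimal sketch, not code from the lecture: the function name `moves` and the choice to store a state as a tuple of pile sizes in descending order are assumptions made here for illustration.

```python
def moves(state):
    """Return every state reachable in one move.

    A state is a tuple of pile sizes in descending order (an assumed
    representation), so piles listed in a different order count as the
    same game state.
    """
    successors = set()
    for i, pile in enumerate(state):
        rest = state[:i] + state[i + 1:]
        # Split `pile` into (pile - k, k) with k < pile - k, so both
        # parts are non-empty and unequal.
        for k in range(1, (pile - 1) // 2 + 1):
            successors.add(tuple(sorted(rest + (pile - k, k), reverse=True)))
    return sorted(successors, reverse=True)


print(moves((7,)))        # [(6, 1), (5, 2), (4, 3)]
print(moves((2, 2, 2, 1)))  # [] : no pile can be split, the game is over
```

Piles of size 1 or 2 can never be split, because the only way to divide 2 is into two equal piles.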


Game Playing - Minimax

• Starting with 7 tokens, the game is small enough that we can draw the entire game tree

• The “game tree” to describe all possible games follows:


7

6-1 5-2 4-3

5-1-1 4-2-1 3-2-2 3-3-1

4-1-1-1 3-2-1-1 2-2-2-1

3-1-1-1-1 2-2-1-1-1

2-1-1-1-1-1

Game Playing – Nim Game Tree

• NOTE: We converted the tree of possible games to a graph by merging nodes that have the same “game state” – this just saves repetition of work

• But what do we do with the “game tree”?

• How can we use it to help decide how to play?

• Use “Minimax Method”


Game Playing - Minimax

• In order to implement minimax we need a method of measuring how good a position is.

• Often called a utility function

– a.k.a. score, evaluation function, utility value, …

• Initially this will be a value that describes our position exactly


Game Playing - Minimax

• Conventionally, in discussion of minimax, we have two players “MAX” and “MIN”

• The utility function is taken to be the utility for MAX

• Larger values are better for MAX


Game Playing – Nim

• Remember that larger values are taken to be better for MAX

• Assume that we use a utility function of

– 1 = a win for MAX

– 0 = a win for MIN

• We only compare values, “larger or smaller”, so the actual sizes do not matter

– in other games we might use {+1, 0, -1} for {win, draw, lose}.

Game Playing – Minimax

• Basic idea of minimax:

• Player MAX is going to take the best move available

• Will select the next state to be the one with the highest utility

• Hence, value of a MAX node is the MAXIMUM of the values of the next possible states

– i.e. the maximum of its children in the search tree

Game Playing – Minimax

• Player MIN is going to take the best move available for MIN

– i.e. the worst available for MAX

• Will select the next state to be the one with the lowest utility

– recall, higher utility values are better for MAX and so worse for MIN

• Hence, value of a MIN node is the MINIMUM of the values of the next possible states

– i.e. the minimum of its children in the search tree


Game Playing – Minimax Summary

• A “MAX” move takes the best move for MAX – so takes the MAX utility of the children

• A “MIN” move takes the best move for MIN – hence the worst for MAX – so takes the MIN utility of the children

• Games alternate in play between MIN and MAX


Game Playing – Minimax for NIM

• Assuming MIN plays first, complete the MIN/MAX tree

• Assume that we use a utility function of

– 1 = a win for MAX

– 0 = a win for MIN

Game states by level, with the player to move and the backed-up value in parentheses:

MIN: 7 (1)

MAX: 6-1 (1), 5-2 (1), 4-3 (1)

MIN: 5-1-1 (0), 4-2-1 (1), 3-2-2 (0), 3-3-1 (1)

MAX: 4-1-1-1 (0), 3-2-1-1 (1), 2-2-2-1 (0)

MIN: 3-1-1-1-1 (0), 2-2-1-1-1 (1)

MAX: 2-1-1-1-1-1 (0, loss for MAX)


Game Playing – Use of Minimax

• The root MIN node has value +1

• All moves by MIN lead to a state of value +1 for MAX

• MIN cannot avoid losing

• From the values on the tree one can read off the best moves for each player

– make sure you know how to extract these best moves (“perfect lines of play”)
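
The values on the Nim tree, and one perfect line of play, can be reproduced with a short recursive sketch. Everything named here is an assumption for illustration rather than code from the lecture: the helper names `moves`, `value` and `best_line`, the tuple representation of a state, and the rule that a player who cannot move loses (utility 0 when MAX is stuck, 1 when MIN is stuck, matching the "0 (loss for MAX)" leaf on the tree). The cache plays the role of merging identical game states into a graph.

```python
from functools import lru_cache


def moves(state):
    """All states reachable by splitting one pile into two unequal piles."""
    successors = set()
    for i, pile in enumerate(state):
        rest = state[:i] + state[i + 1:]
        for k in range(1, (pile - 1) // 2 + 1):
            successors.add(tuple(sorted(rest + (pile - k, k), reverse=True)))
    return sorted(successors, reverse=True)


@lru_cache(maxsize=None)
def value(state, max_to_move):
    """Minimax value of `state` from MAX's point of view (1 = MAX wins)."""
    successors = moves(state)
    if not successors:
        # The player to move is stuck and loses.
        return 0 if max_to_move else 1
    child_values = [value(s, not max_to_move) for s in successors]
    return max(child_values) if max_to_move else min(child_values)


def best_line(state, max_to_move):
    """Follow a best move for each player until the game ends."""
    line = [state]
    while moves(state):
        pick = max if max_to_move else min
        max_to_move = not max_to_move  # the other player moves next
        state = pick(moves(state), key=lambda s: value(s, max_to_move))
        line.append(state)
    return line


print(value((7,), False))      # 1 : MIN moves first and cannot avoid losing
print(best_line((7,), False))
```

When several moves have the same value, `best_line` simply takes the first one, so it returns one perfect line among possibly many.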


Game Playing – Bounded Minimax

• For real games, search trees are much bigger and deeper than Nim

• Cannot possibly evaluate the entire tree

• Have to put a bound on the depth of the search

Game Playing – Bounded Minimax

• The terminal states are no longer a definite win/loss

– actually they are really a definite win/draw/loss, but with reasonable computer resources we cannot determine which

• Have to heuristically/approximately evaluate the quality of the positions of the states

• Evaluation of the utility function is expensive if it is not a clear win or loss


Game Playing – Bounded Minimax

Next Slide:

• Artificial example of bounded minimax

• Evaluate “terminal position” after all possible moves by MAX

• (The numbers are invented, and just to illustrate the working of minimax)

MAX (agent): A = 1

MIN (opponent): B = 1, C = -3 (terminal positions)

Utility values of “terminal” positions obtained by an evaluation function

Game Playing – Bounded Minimax

• Example of minimax with bounded depth

• Evaluate “terminal position” after all possible moves in the order:

1. MAX (aka “agent”)

2. MIN (aka “opponent”)

3. MAX

• (The numbers are invented, and just to illustrate the working of minimax)

• Assuming MAX plays first, complete the MIN/MAX tree

MAX (agent): A = 1

MIN (opponent): B = 1, C = -3

MAX (agent): D = 4, E = 1, F = 2, G = -3

Terminal positions (two per node D to G, left to right): 4 -5 | -5 1 | -7 2 | -3 -8


Game Playing – Bounded Minimax

• If both players play their best moves, then which “line” does the play follow?

Line of perfect play: A → B → E → terminal position with utility 1


Game Playing – Perfect Play

• Note that the line of perfect play leads to a terminal node with the same value as the root node

• All intermediate nodes also have that same value

• Essentially, this is the meaning of the value at the root node

• Caveat: This only applies if the tree is not expanded further after a move because then the terminals will change and so values can change
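
The points about perfect play can be checked on the bounded example (nodes A to G) with a small sketch. The dictionary layout and the names `TREE`, `value` and `perfect_line` are assumptions made here, as is the pairing of terminal values with D, E, F and G (two per node, left to right), which is inferred from the backed-up values on the slide.

```python
# Children of each named node; integers are terminal evaluations.
TREE = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F", "G"],
    "D": [4, -5],
    "E": [-5, 1],
    "F": [-7, 2],
    "G": [-3, -8],
}


def value(node, max_to_move):
    """Backed-up minimax value of a node."""
    if isinstance(node, int):
        return node
    values = [value(child, not max_to_move) for child in TREE[node]]
    return max(values) if max_to_move else min(values)


def perfect_line(node, max_to_move):
    """Nodes visited when both players always choose their best child."""
    line = [node]
    while not isinstance(node, int):
        pick = max if max_to_move else min
        max_to_move = not max_to_move  # the other player moves next
        node = pick(TREE[node], key=lambda child: value(child, max_to_move))
        line.append(node)
    return line


print(perfect_line("A", True))  # ['A', 'B', 'E', 1]
print(value("A", True), value("B", False), value("E", True))  # 1 1 1
```

Every node on the line has the root's value, 1, which is what the second bullet above states.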

Game Playing – Summary So Far

• Game tree

– describes the possible sequences of play

– might be drawn as a graph if we merge together identical states

• Minimax

– Utility values assigned to the leaves

• Values “backed up” the tree by

– MAX node takes max value of children

– MIN node takes min value of children

– Can read off best lines of play and results

• Depth Bound – utility of terminal states estimated using an “evaluation function”


Minimax algorithm


Minimax

max: 10

min: 10 2

max: 10 14 2 24

min (leaves): 10 9 14 13 2 1 3 24
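
The backing-up of values in this example can be written as a few lines of Python. This is a minimal sketch under an assumed representation: the tree is a nested list whose innermost numbers are the leaf utilities, and the function name `minimax` is chosen here for illustration.

```python
def minimax(node, max_to_move=True):
    """Return the minimax value of a nested-list game tree."""
    if not isinstance(node, list):
        return node  # a leaf: its utility is its value
    values = [minimax(child, not max_to_move) for child in node]
    return max(values) if max_to_move else min(values)


# Leaves 10 9 14 13 2 1 3 24, grouped in pairs under a max / min / max tree.
tree = [[[10, 9], [14, 13]], [[2, 1], [3, 24]]]
print(minimax(tree))  # 10
```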


A MINMAX GAME

Properties of minimax

• Complete? Yes (if tree is finite)

• Optimal? Yes (against an optimal opponent)

• Time complexity? O(b^m)

• Space complexity? O(bm) (depth-first exploration)

• For chess, b ≈ 35, m ≈ 100 for "reasonable" games ⇒ exact solution completely infeasible
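
To see why the exact solution is infeasible, the size of b^m for the chess figures above can be computed directly. This is only an arithmetic sketch; b is the branching factor and m the maximum depth, and the variable names are chosen here.

```python
import math

b, m = 35, 100                  # figures quoted above for chess
exponent = m * math.log10(b)    # b^m = 10^exponent
digits = len(str(b ** m))       # exact number of decimal digits of 35^100

print(f"35^100 is about 10^{exponent:.0f} ({digits} digits)")
```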


α-β pruning example


Alpha and beta

• The ALPHA value of a MAX node is set equal to the current LARGEST final backed-up value of its successors.

• The BETA value of a MIN node is set equal to the current SMALLEST final backed-up value of its successors.


ALPHA-BETA PRUNING

Properties of α-β

• Pruning does not affect final result

• Good move ordering improves effectiveness of pruning

• With "perfect ordering," time complexity = O(b^(m/2)) ⇒ doubles depth of search

• A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)

Why is it called α-β?

• α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for max

• If v is worse than α, max will avoid it ⇒ prune that branch

• Define β similarly for min
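
The role of α and β can be illustrated with a minimal sketch of the pruning. This is not the algorithm as printed on the lecture slides (their content is not reproduced here); it is a common textbook formulation, applied to an assumed nested-list tree with an assumed `visited` list added only to record which leaves are actually evaluated.

```python
import math


def alphabeta(node, max_to_move=True, alpha=-math.inf, beta=math.inf, visited=None):
    """Minimax value of a nested-list tree, skipping branches that cannot matter."""
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)  # record every leaf that gets evaluated
        return node
    if max_to_move:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, best)  # best choice so far for MAX on this path
            if alpha >= beta:
                break  # MIN already has a better option elsewhere: prune
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, best)  # best choice so far for MIN on this path
        if alpha >= beta:
            break  # MAX already has a better option elsewhere: prune
    return best


tree = [[[10, 9], [14, 13]], [[2, 1], [3, 24]]]
seen = []
print(alphabeta(tree, visited=seen))  # 10, the same value as plain minimax
print(seen)                           # [10, 9, 14, 2, 1] : 5 of the 8 leaves
```

On this tree the leaves 13, 3 and 24 are never evaluated, yet the root value is unchanged.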


The α-β algorithm


Resource limits

Suppose we have 100 secs, explore 10^4 nodes/sec ⇒ 10^6 nodes per move

Standard approach:

• cutoff test: e.g., depth limit (perhaps add quiescence search)

• evaluation function = estimated desirability of position

Evaluation functions

• For chess, typically linear weighted sum of features

Eval(s) = w1 f1(s) + w2 f2(s) + … + wn fn(s)

• e.g., w1 = 9 with

f1(s) = (number of white queens) – (number of black queens), etc.
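
A linear weighted sum of this kind can be sketched as follows. Only the queen weight of 9 comes from the text above; the other weights are conventional material values assumed here, and the position encoding (a string of piece letters, upper case for white and lower case for black) and the name `evaluate` are hypothetical.

```python
# Weight w_i for each piece type. Only the queen's 9 is given in the text;
# the remaining values are assumed for illustration.
WEIGHTS = {"q": 9, "r": 5, "b": 3, "n": 3, "p": 1}


def evaluate(pieces):
    """Eval(s) = sum of w_i * f_i(s), with f_i = (white count) - (black count)."""
    return sum(
        weight * (pieces.count(kind.upper()) - pieces.count(kind))
        for kind, weight in WEIGHTS.items()
    )


# White: queen, two rooks, three pawns. Black: queen, rook, four pawns.
print(evaluate("QRRPPPqrpppp"))  # 4 : white is a rook up and a pawn down
```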

Cutting off search

MinimaxCutoff is identical to MinimaxValue except

1. Terminal? is replaced by Cutoff?

2. Utility is replaced by Eval
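
The two substitutions can be sketched as a depth-limited minimax that takes the cutoff test and the evaluation function as parameters. The function signature and the toy game below are assumptions made for illustration, not the lecture's MinimaxCutoff; in a real game the cutoff test must also return true at genuinely terminal positions.

```python
def minimax_cutoff(state, depth, max_to_move, successors, cutoff, evaluate):
    """Depth-limited minimax: a cutoff test replaces Terminal?, Eval replaces Utility."""
    if cutoff(state, depth):
        return evaluate(state)
    values = [
        minimax_cutoff(s, depth + 1, not max_to_move, successors, cutoff, evaluate)
        for s in successors(state)
    ]
    return max(values) if max_to_move else min(values)


# Hypothetical toy game: every position n has the two successors 2n and 2n + 1.
def successors(n):
    return [2 * n, 2 * n + 1]


def cutoff(n, depth):
    return depth >= 2  # a pure depth limit of 2 plies


def evaluate(n):
    return n % 7 - 3  # invented heuristic score, for illustration only


result = minimax_cutoff(1, 0, True, successors, cutoff, evaluate)
print(result)  # 1
```

Here the positions 4, 5, 6 and 7 are evaluated as 1, 2, 3 and -3, MIN backs up 1 and -3, and MAX takes 1.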

Does it work in practice?

b^m = 10^6, b = 35 ⇒ m = 4

4-ply lookahead is a hopeless chess player!

– 4-ply ≈ human novice

– 8-ply ≈ typical PC, human master

– 12-ply ≈ Deep Blue, Kasparov
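
The depth figures follow from the node budget by simple arithmetic, sketched below. The variable names are chosen here; the alpha-beta line assumes the perfect-ordering bound O(b^(m/2)) quoted earlier, which real move ordering only approximates.

```python
import math

seconds, nodes_per_second, b = 100, 10**4, 35

budget = seconds * nodes_per_second   # 10^6 nodes per move
plies_minimax = math.log(budget, b)   # solve b^m = budget for m
plies_alphabeta = 2 * plies_minimax   # solve b^(m/2) = budget for m

print(budget)                    # 1000000
print(f"{plies_minimax:.1f}")    # 3.9 : roughly 4-ply with plain minimax
print(f"{plies_alphabeta:.1f}")  # 7.8 : roughly 8-ply with perfect ordering
```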

Deterministic games in practice

• Checkers: Chinook ended 40-year-reign of human world champion Marion Tinsley in 1994. Used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions.

• Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply.

• Othello: human champions refuse to compete against computers, who are too good.

• Go: human champions refuse to compete against computers, who are too bad. In go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves.


Summary

• Games are fun to work on!

• They illustrate several important points about AI

• perfection is unattainable ⇒ must approximate

• good idea to think about what to think about