
CSE 150, Fall 2012. Adapted by Kriegman from slides adapted from Russell's.

Game Playing

Introduction to Artificial Intelligence

CSE 150

Lecture on Chapter 5

CSP Summary

• CSPs are a special kind of search problem:

– States defined by values of a fixed set of variables
– Goal test defined by constraints on variable values

• Backtracking = depth-first search with:
  1) a fixed variable order
  2) only legal successors
  3) stopping a search path when constraints are violated
  (a minimal sketch follows this summary)

• Forward checking prevents assignments that guarantee a later failure

• Heuristics:
  – Variable selection: most constrained variable
  – Value selection: least constraining value

• Iterative min-conflicts is usually effective in practice
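As a recap, a minimal Python sketch of backtracking search with a smallest-domain (most-constrained-variable) heuristic. The variables/domains/ok interface and the 3-coloring example are illustrative assumptions, and forward checking is omitted for brevity.

# Minimal backtracking search for a binary CSP (illustrative sketch).
# `variables` is a list, `domains` maps variable -> list of values, and
# `ok(x, vx, y, vy)` returns True if assigning x=vx and y=vy is consistent.

def backtrack(variables, domains, ok, assignment=None):
    assignment = assignment or {}
    if len(assignment) == len(variables):
        return assignment                      # all variables assigned: solution
    unassigned = [v for v in variables if v not in assignment]
    # Pick the variable with the smallest domain (a static proxy for "most constrained").
    var = min(unassigned, key=lambda v: len(domains[v]))
    for value in domains[var]:
        # Only legal successors: check constraints against the current assignment.
        if all(ok(var, value, w, assignment[w]) for w in assignment):
            result = backtrack(variables, domains, ok, {**assignment, var: value})
            if result is not None:
                return result
    return None                                # dead end: backtrack

# Example: 3-coloring a triangle graph.
variables = ["A", "B", "C"]
domains = {v: ["red", "green", "blue"] for v in variables}
neighbors = {("A", "B"), ("B", "A"), ("A", "C"), ("C", "A"), ("B", "C"), ("C", "B")}
ok = lambda x, vx, y, vy: (x, y) not in neighbors or vx != vy
print(backtrack(variables, domains, ok))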

READING

• Read Chapter 6! (Next stop, Chapter 13!).

Reading Clicker Quiz

What type of adversarial search is best for chess?

A) Minimax

B) Expectimax

C) Expectiminimax


Reading Clicker Quiz

What type of adversarial search is best for backgammon?

A) Minimax

B) Expectimax

C) Expectiminimax


Reading Clicker Quiz

An evaluation function gives an estimate of the expected utility of the game for a given, non-terminal game position.

A) True

B) False


This lecture

• Introduction to Games

• Perfect play (Min-Max)

– Tic-Tac-Toe

– Checkers

– Chess

– Go

• Resource limits - pruning

• Games with chance

– Backgammon

Types of Games

                         Deterministic                Chance
Perfect information      Chess, Checkers,             Backgammon,
                         Go, Othello                  Monopoly
Imperfect information    Battleship                   Bridge, Poker, Scrabble,
                                                      Nuclear war

Games vs. search problems

Similar to state-space search, but:

1. There’s an Opponent whose moves must be considered

2. Because of the combinatorial explosion, we can’t search the entire tree:

Have to commit to a move based on an estimate of its goodness.

Heuristic mechanisms we used to decide which node to open next will be used to decide which MOVE to make next

Games vs. search problems

We will follow some of the history:

– algorithm for perfect play [Von Neumann, 1944]

– finite horizon, approximate evaluation [Zuse, 1945; Shannon, 1950; Samuel, 1952-57]

– pruning to reduce costs [McCarthy, 1956]

Deterministic Two-person Games

• Two players move in turn

• Each Player knows what the other has done and can do

• Either one of the players wins (and the other loses) or there is a draw.

Games as Search: Grundy's Game [Nilsson, Chapter 3]

• Initially a stack of pennies stands between two players.
• Each player divides one of the current stacks into two unequal stacks.
• The game ends when every stack contains one or two pennies.
• The first player who cannot play loses.

(Players: MAX and MINNIE)

Grundy's Game: States

[Figure: the game tree for Grundy's game starting from a single stack of 7 pennies, with levels alternating between Max's turn and Minnie's turn: 7 → {6,1  5,2  4,3} → {5,1,1  4,2,1  3,2,2  3,3,1} → {4,1,1,1  3,2,1,1  2,2,2,1} → {3,1,1,1,1  2,2,1,1,1} → 2,1,1,1,1,1, with terminal positions marked "MAX loses" or "MINNIE loses".]

Can Minnie formulate a strategy to always win?
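The slide's question can be answered by a short game search. Below is a minimal Python sketch (my own, not from the slides): a position is a sorted tuple of stack sizes, a move splits one stack into two unequal stacks, and a player with no legal move loses.

from functools import lru_cache

# Grundy's game: on each turn a player splits one stack into two unequal
# stacks; a player with no legal move loses.  player_to_move_wins(stacks)
# asks whether the player whose turn it is has a forced win.

@lru_cache(maxsize=None)
def player_to_move_wins(stacks):
    successors = []
    for i, n in enumerate(stacks):
        for a in range(1, (n + 1) // 2):        # splits (a, n - a) with a < n - a
            rest = stacks[:i] + stacks[i + 1:]
            successors.append(tuple(sorted(rest + (a, n - a))))
    if not successors:
        return False                            # no move available: current player loses
    # The current player wins iff some move leaves the opponent in a losing position.
    return any(not player_to_move_wins(s) for s in successors)

print(player_to_move_wins((7,)))   # does Max, who moves first from a stack of 7, have a forced win?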

When is there a winning strategy for Max?

• When it is MIN’s turn to play, a win must be obtainable for MAX from all the results of the move (sometimes called an AND node)

• When it is MAX’s turn, a win must be obtainable from at least one of the results of the move (sometimes called an OR node).

Minimax

• Initial state: board position and whose turn it is.
• Operators: legal moves that the player can make.
• Terminal test: a test of the state to determine if the game is over; the set of states satisfying the terminal test are the terminal states.
• Utility function: numeric value for the outcome of the game (leaf nodes).
  – E.g. +1 win, -1 loss, 0 draw.
  – E.g. number of points scored, if tallying points as in backgammon.
• Assumption: the opponent will always choose the operator (move) that maximizes its own utility.


Minimax

Perfect play for deterministic, perfect-information games.

Max's strategy: choose the action with the highest minimax value, i.e., the best achievable payoff against best play by Min.

Consider a 2-ply (two-step) game:

Max wants the largest outcome; Minnie wants the smallest.

Minimax Example

Game tree for a fictitious 2-ply game:

[Figure: a two-ply game tree. MAX chooses among moves A1, A2, A3, and MIN replies at the next level. Leaf nodes hold the value of the utility function. The three MIN nodes back up the values 3 = min(3, 12, 8), 2, and 2, and the MAX root backs up 3 = max(3, 2, 2).]

Minimax Algorithm

Function MINIMAX-DECISION(game) returns an operator

For each op in OPERATORS[game] do

Value[op] ← MINIMAX-VALUE(Apply(op, game), game)

End

Return the op with the highest Value[op]

Function MINIMAX-VALUE (state, game) returns a utility value

If TERMINAL-TEST[game](state) then

return UTILITY[game](state)

else if MAX is to move in state then

return the highest MINIMAX-VALUE of Successors[state]

else return the lowest MINIMAX-VALUE of Successors[state]
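A minimal Python sketch of the pseudocode above, under a simplified representation: the game is a nested list whose internal nodes are lists of successor trees and whose leaves are utilities to MAX (the OPERATORS/TERMINAL-TEST/UTILITY interface is abstracted away, and the specific leaf values below are an assumption matching the backed-up values 3, 2, 2 shown in the example figure).

# MINIMAX-VALUE / MINIMAX-DECISION over a game given as a nested list.

def minimax_value(node, max_to_move):
    if not isinstance(node, list):                 # terminal test: a leaf holds the utility
        return node
    values = [minimax_value(child, not max_to_move) for child in node]
    return max(values) if max_to_move else min(values)

def minimax_decision(root):
    # MAX is to move at the root: pick the child (operator) with the highest minimax value.
    values = [minimax_value(child, max_to_move=False) for child in root]
    best = max(range(len(root)), key=lambda i: values[i])
    return best, values[best]

# The 2-ply example tree from the previous slide (assumed leaf values):
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax_decision(tree))    # -> (0, 3): the first move, A1, has the highest backed-up value, 3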

Properties of Minimax

• Complete: Yes, if the tree is finite.
• Optimal: Yes, against an optimal opponent. (Otherwise??)
• Time complexity: O(b^m)
• Space complexity: O(bm)

Note: For chess, b ≈ 35 and m ≈ 100 for a "reasonable game." An exact solution is completely infeasible.

In practice we actually see only roughly 10^40 board positions, not 35^100: checking for previously visited states becomes important! ("transposition table")

Clicker Question

A minimax search tree is shown to the left. What value should we give to the root node?

A) 8
B) 10
C) 9
D) 100



Resource limits

Suppose we have 100 seconds and can explore 10^4 nodes/second

⇒ 10^6 nodes per move

Standard approach:

• cutoff test

e.g., depth limit

• evaluation function = estimated desirability of position

(an approximation to the utility used in minimax).

This is similar to the heuristic evaluation functions we used earlier, but its units are arbitrary; it is not the distance to a goal.

Evaluation Function

For chess, typically linear weighted sum of features

Eval(s) = w1 f1(s) + w2 f2(s) + ... + wn fn(s)

e.g., w1 = 9 with

f1(s) = (number of white queens) - (number of black queens)

w2 = 5 with

f2(s) = (number of white rooks) - (number of black rooks)
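A small Python sketch of such a weighted linear evaluation. The slide gives only the queen (9) and rook (5) weights; the remaining piece values and the state.count(color, piece) interface are illustrative assumptions.

# Eval(s) = w1*f1(s) + w2*f2(s) + ... + wn*fn(s), with material-difference features.

WEIGHTS = {"Q": 9, "R": 5, "B": 3, "N": 3, "P": 1}   # conventional piece values (assumed)

def material_feature(piece):
    # f_i(s) = (# of white pieces of this type) - (# of black pieces of this type)
    return lambda state: state.count("white", piece) - state.count("black", piece)

FEATURES = [(WEIGHTS[p], material_feature(p)) for p in WEIGHTS]

def evaluate(state):
    return sum(w * f(state) for w, f in FEATURES)

# Tiny demo with a stand-in state object:
class DummyState:
    def __init__(self, counts): self.counts = counts
    def count(self, color, piece): return self.counts.get((color, piece), 0)

print(evaluate(DummyState({("white", "Q"): 1, ("black", "R"): 2})))   # 9*1 + 5*(-2) = -1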

Clicker Question

Assume we have an evaluation function f1 which returns a value between 0 and 100.

Now assume we define another evaluation function f2, which is defined as follows:

f2(s) = f1(s)^2, for all states s.

Which of the following is/are true?

A) Minimax search using f2 is guaranteed to return the same action as minimax using f1

B) Expectimax search using f2 is guaranteed to return the same action as expectimax using f1

C) A and B

D) Neither A nor B

Digression: Exact Values Don't Matter

• Behavior is preserved under any monotonic transformation of Eval

• Only the order matters:

payoff in deterministic games acts as an ordinal utility function

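A small worked check of this point, with values I chose for illustration (it also bears on the earlier clicker question): squaring a nonnegative evaluation is a monotonic, order-preserving transform, so minimax's choice is unchanged, but expectimax averages values, and averaging does not commute with squaring, so its choice can flip.

# Two candidate moves, each leading to two equally likely outcomes under f1 in [0, 100].
f1 = {"A": [0, 100],   # move A: a 50/50 chance node over these outcomes
      "B": [60, 60]}   # move B: a sure outcome of 60

def minimax_choice(values):
    # adversarial opponent: it picks the worst outcome for us
    return max(values, key=lambda m: min(values[m]))

def expectimax_choice(values):
    return max(values, key=lambda m: sum(values[m]) / len(values[m]))

f2 = {m: [v ** 2 for v in vs] for m, vs in f1.items()}   # f2 = f1 squared

print(minimax_choice(f1), minimax_choice(f2))        # B B  (choice unchanged)
print(expectimax_choice(f1), expectimax_choice(f2))  # B A  (choice flips: 60 > 50 but 3600 < 5000)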

Cutting off search

• Replace MinimaxValue with MinimaxCutoff.
• MinimaxCutoff is identical to MinimaxValue except:

  1. Terminal? is replaced by Cutoff?
  2. Utility is replaced by Eval

Does it work in practice?

b^m = 10^6, b = 35 ⇒ m ≈ 4

No: 4-ply lookahead is a hopeless chess player!
  4-ply ≈ human novice
  8-ply ≈ typical PC, human master
  12-ply ≈ Deep Blue, Kasparov
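A minimal sketch of MinimaxCutoff in Python, together with the depth arithmetic from this slide. The game interface (is_terminal, utility, successors, eval) is a hypothetical stand-in for the slide's Terminal?/Utility/Cutoff?/Eval.

import math

# MinimaxCutoff: identical to MinimaxValue except Cutoff? replaces Terminal?
# and Eval replaces Utility.

def minimax_cutoff(state, game, depth, limit, max_to_move):
    if game.is_terminal(state):
        return game.utility(state)
    if depth >= limit:                         # Cutoff?: depth limit reached
        return game.eval(state)                # Eval: heuristic estimate, not the true utility
    values = [minimax_cutoff(s, game, depth + 1, limit, not max_to_move)
              for s in game.successors(state)]
    return max(values) if max_to_move else min(values)

# Depth budget: with 10^6 nodes per move and b = 35, b^m = 10^6 gives m ≈ 4 plies.
print(math.log(1e6) / math.log(35))            # ≈ 3.9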

Reading Quiz

A variable in a CSP is arc-consistent if every value in its domain satisfies the variable’s binary constraints.

A) True

B) False


Reading Quiz

If a CSP problem is arc consistent then it is also path consistent.

A) True

B) False


Reading Quiz

A sudoku puzzle is an example of a constraint satisfaction problem.

A) True

B) False


Reading Quiz

Local search can be applied to CSPs.

A) True

B) False


Reading Quiz

The AC3 algorithm (for arc-consistency) is an example of backtracking search.

A) True

B) False


Another idea

• α-β pruning
• Essential idea: stop searching down a branch of the tree when you can determine that it's a dead end.

• Guaranteed to give the same answer as minimax!

α-β Pruning Example

[Figure: α-β pruning on the 2-ply example tree. After MAX sees the first MIN node's value of 3 (from leaves 3, 12, 8), the root is known to be ≥ 3. At the second MIN node the first leaf is 2, so that node is ≤ 2 and its remaining leaves (marked X X) are pruned. The third MIN node's leaves 14, 5, 2 tighten its bound from ≤ 14 to ≤ 5 to = 2, and the root value is = 3.]

The α-β Algorithm

Basically MINIMAX, but keep track of α and β, and prune.

[Figure: pseudocode for the α-β version of MAX-VALUE, annotated:]
• Initialize v for the search under this node.
• The recursive call returns the best value for MIN.
• If there is a better value for MIN available elsewhere, then PRUNE.
• Otherwise, update α.
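The slide's pseudocode appears only as a figure; here is a Python sketch over the same nested-list trees used in the minimax sketch earlier, with pruning tests that match the annotations above.

# Alpha-beta search: alpha is the best value MAX can guarantee so far, beta the
# best MIN can guarantee; a branch is pruned as soon as its value cannot affect
# the decision higher up.

def alphabeta(node, max_to_move, alpha=float("-inf"), beta=float("inf")):
    if not isinstance(node, list):            # leaf: utility value
        return node
    if max_to_move:
        v = float("-inf")
        for child in node:
            v = max(v, alphabeta(child, False, alpha, beta))
            if v >= beta:                     # MIN above will never allow this: prune
                return v
            alpha = max(alpha, v)
        return v
    else:
        v = float("inf")
        for child in node:
            v = min(v, alphabeta(child, True, alpha, beta))
            if v <= alpha:                    # MAX above already has something better: prune
                return v
            beta = min(beta, v)
        return v

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]    # the example tree (assumed leaf values)
print(alphabeta(tree, max_to_move=True))      # 3, the same value plain minimax returns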

How is the search tree pruned?

• α is the best value (to MAX) found so far.

• If V ≤ α, then the subtree containing V will have utility no greater than V

⇒ Prune that branch

• β can be defined similarly from MIN’s perspective

α-β Pruning

• For the entire search tree, keep a parameter α:

– A LOWER BOUND on MAX's backed-up value from MINIMAX search.
– I.e., if we did a complete MINIMAX search, MAX's final backed-up value would be no lower than α.

• Why a LOWER BOUND?
– It is based on what MINNIE can hold him to on some line of play.
– MAX will always be able to CHOOSE that line of play, and will choose a BETTER line of play if he can.

α-β Pruning

• For each MIN node, keep a parameter β:

– An UPPER BOUND on how well MINNIE can do: if we did a complete MINIMAX search, MINNIE's value would be no higher than β.

• Why?
– MINNIE is looking for LOW values of the static evaluation function.
– This is based on what MAX can hold her to. She will choose a BETTER line of play (a LOWER value) if she can.

α-β Pruning

• While searching the tree:
– The α value is set to the HIGHEST backed-up value found so far.
– The β value is set to the LOWEST backed-up value found so far.

• STOP SEARCHING WHEN:
– The value of a MIN node is less than the α value found so far.
– The value of a MAX node is greater than the β value found so far.

Properties of α-β pruning

Pruning does not affect final result.

Good move ordering improves effectiveness of pruning.

With "perfect ordering," time complexity = O(b^(m/2))
– doubles the depth of search
– can easily reach depth 8 and play good chess

A simple example of the value of reasoning about which computations are relevant (a form of metareasoning).

A reminder

• "Standard" minimax and alpha-beta assume play to the end of the game, which is often not possible!

• We use a cut-off test instead of a terminal test.
• We use a heuristic evaluation function instead of the actual utility value.
• The results of cutting off search with alpha-beta will depend on the quality of your evaluation function.

Deeper searches

Problem with a depth-limited cut-off of search: the value of a position can change radically in the next move or two.

One fix is quiescence search:

If a piece is about to be taken, the board is not “quiet” - keep searching beyond that point. Deep Blue used this.
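One way to realize this, sketched under the same hypothetical game interface as the cutoff search above: treat the depth limit as soft in noisy positions, with game.is_quiet(state) standing in for a "no capture pending" test.

# Quiescence-aware cutoff for the minimax_cutoff sketch above: if the depth
# limit is reached but the position is not "quiet" (e.g. a capture is pending),
# keep searching, up to a capped extension.

def cutoff(state, game, depth, limit, max_extension=4):
    if depth >= limit + max_extension:
        return True                            # hard cap on how far we extend
    return depth >= limit and game.is_quiet(state)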

Deeper searches

The horizon problem:

Sometimes disaster (or Shangri-la) is just beyond the depth bound:

In the example at the right, in three moves (6 ply), white can capture the bishop at a7.

But black can stall this by moving its pawns aggressively against the white king.

If this pushes the taking of the bishop over the horizon, black will "think" it has saved the day, when all it has done is waste its pawns.

Deeper searches

One way to mitigate the horizon problem (there is no “cure”):

The singular extension heuristic:

if one move from this position is clearly better than the others, and is stored in memory, extend the search using that move.

This requires that the situation has been encountered earlier in the search.

Deeper searches

Another useful heuristic:

the null move heuristic: let the opponent take two moves in a row, and then search under the node.

Stored knowledge

• The transposition table: store the results of searches in a hash table; if you come to a node again, you don't need to search under it (a minimal sketch follows this list).
• There is a big fan-out at the beginning of the game:
– Deep Blue stored thousands of opening moves

• And thousands of solved end games (all games with five pieces or less)

• And 700,000 grandmaster games

• Nowadays, a PC with good heuristics is as good as the best player.
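A transposition table is essentially memoization keyed on the position. A minimal Python sketch under the same hypothetical game interface as before; key(state) is a hypothetical hashable encoding (real engines use Zobrist hashing and also store search depth and value bounds).

# Minimax with a transposition table: if an equivalent state is reached again
# along a different move order, reuse its value instead of searching under it.

transposition_table = {}

def minimax_tt(state, game, max_to_move, key):
    k = (key(state), max_to_move)
    if k in transposition_table:
        return transposition_table[k]          # already searched: reuse the value
    if game.is_terminal(state):
        v = game.utility(state)
    else:
        values = [minimax_tt(s, game, not max_to_move, key)
                  for s in game.successors(state)]
        v = max(values) if max_to_move else min(values)
    transposition_table[k] = v
    return v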

Deterministic games in practice

Checkers: In 1994, Chinook ended the 40-year reign of human world champion Marion Tinsley. It used an endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 443,748,401,247 positions. Checkers is now solved. The answer is boring: the game can always be played to a draw.

Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply.

Othello: Human champions refuse to compete against computers, who are too good.

Go: The program Zen played against Takemiya Masaki (ranked at 9p, high professional) on March 17, 2012. Zen beat Takemiya with a 5-stone handicap by eleven points, followed by a stunning twenty-point win at a 4-stone handicap.

Takemiya remarked "I had no idea that computer go had come this far.”

Nondeterministic games

• E.g., in backgammon, the dice rolls determine the legal moves.

• Consider simpler example with coin-flipping instead of dice-rolling:

Nondeterministic games

1. For a min node, compute the min of its children.
2. For a chance node, compute the probability-weighted average of its children.
3. For a max node, compute the max of its children.

Algorithm for Nondeterministic Games

Expectiminimax maximizes the expected value (average) of the utility against a perfect opponent.

Pseudocode is just like Minimax, except we must also handle chance nodes:

... if state is a chance node then
        return the average of EXPECTIMINIMAX-VALUE of Successors(state) ...

A version of pruning is possible but only if the leaf values are bounded.


Function MINIMAX-VALUE (state, game) returns a utility value

If TERMINAL-TEST[game](state) then

return UTILITY[game](state)

else if MAX is to move in state then

return the highest MINIMAX-VALUE of Successors[state]

else return the lowest MINIMAX-VALUE of Successors[state]

Function EXPECTIMINIMAX-VALUE (state, game) returns a utility value

If TERMINAL-TEST[game](state) then

return UTILITY[game](state)

else if state is a chance node then

return the average of EXPECTIMINIMAX-VALUE of Successors[state]

else if MAX is to move in state then

return the highest EXPECTIMINIMAX-VALUE of Successors[state]

else return the lowest EXPECTIMINIMAX-VALUE of Successors[state]
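A Python sketch of expectiminimax over the nested trees used earlier, extended with chance nodes. The representation is my assumption for this sketch: a MAX/MIN node is a list of subtrees, a chance node is ("chance", [(p1, t1), (p2, t2), ...]), and a leaf is its utility.

def expectiminimax(node, max_to_move):
    if not isinstance(node, (list, tuple)):            # leaf: utility value
        return node
    if isinstance(node, tuple) and node[0] == "chance":
        # chance node: probability-weighted average of the children
        return sum(p * expectiminimax(t, max_to_move) for p, t in node[1])
    values = [expectiminimax(child, not max_to_move) for child in node]
    return max(values) if max_to_move else min(values)

# A coin-flip game: MAX moves, then a fair coin decides which MIN node is reached.
tree = [("chance", [(0.5, [3, 5]), (0.5, [1, 9])]),     # move 1: 0.5*min(3,5) + 0.5*min(1,9) = 2
        ("chance", [(0.5, [2, 2]), (0.5, [4, 10])])]    # move 2: 0.5*2 + 0.5*4 = 3
print(expectiminimax(tree, max_to_move=True))           # 3.0: MAX prefers move 2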

Nondeterministic Games in Practice

• Dice rolls increase the branching factor b:

– 21 possible rolls with 2 dice
– Backgammon: ≈ 20 legal moves per roll

• depth 4: 20 × (21 × 20)^3 ≈ 1.2 × 10^9

• As depth increases, the probability of reaching a given node shrinks ⇒ the value of lookahead is diminished.

• Pruning is much less effective.

• TD-Gammon uses depth-2 search + a very good Eval. It plays at world-champion level.

Clicker Questions

An expectiminimax search tree is shown to the left. Edge weights represent probability. What value should we give to the root node?

A) -1
B) 1
C) 3
D) 4
E) 5.5

Digression: Exact Values DO Matter

• Behavior is preserved only by positive linear transformation of Eval.

• Hence Eval should be proportional to the expected payoff.

Games Summary

• Games are fun to work on!

• They illustrate several important points about AI

– perfection is unattainable ⇒ must approximate

– good idea to think about what to think about

– uncertainty constrains the assignment of values to states

• “Games are to AI as grand prix racing is to automobile design”