24
Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe Minimax Alpha-beta cutoff Extensions

Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Embed Size (px)

DESCRIPTION

Games, page 3 CSI 4106, Winter 2005 Definitions (2) A (pure) strategy: a complete set of advance instructions that specifies a definite choice for every conceivable situation in which the player may be required to act. In a two-player game, a strategy allows the player to have a response to every move of the opponent. Game-playing programs implement a strategy as a software mechanism that supplies the right move on request.

Citation preview

Page 1: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 1CSI 4106, Winter 2005

GamesPointsGames and strategiesA logic-based approach to games

AND-OR trees/graphsStatic evaluation functions

Tic-Tac-ToeMinimaxAlpha-beta cutoffExtensions

Page 2: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 2CSI 4106, Winter 2005

Definitions

We consider non-random or semi-random games• with full information,• zero-sum,• two-person (dual),• with rational players.

That's mainly board games such as chess, chequers, go -- sometimes called strategic games; some paper-and-pencil games are of this kind too.

Page 3: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 3CSI 4106, Winter 2005

Definitions (2)

A (pure) strategy:a complete set of advance instructions that specifies a definite choice for every conceivable situation in which the player may be required to act.In a two-player game, a strategy allows the player to have a response to every move of the opponent.Game-playing programs implement a strategy as a software mechanism that supplies the right move on request.

Page 4: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 4CSI 4106, Winter 2005

A logic-based approach to games

Find a winning strategy by proving that the game can be won -- use backward chaining.A very simple game: nim. initially, there is one stack of chips; a move: select a stack and divide it in two

unequal non-empty stacks; a player who cannot move loses the game.

(The player who moves first can win.)

Page 5: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 5CSI 4106, Winter 2005

... Games in logic (2)

Page 6: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 6CSI 4106, Winter 2005

... Games in logic (3)

A_wins([6], A)

A_wins([5], B) A_wins([4], B)

A_wins([4], A) A_wins([3], A)

A_wins([3], B) A_wins([], B)

A_wins([], A)

The players are A and B. A_wins( P, X ) means "player X moves in position P and there is a winning continuation for A". A position is represented as a list of sizes of stacks with 3 or more chips (only those can be still divided).

This is an AND/OR tree (actually a directed, acyclic AND/OR graph).

Player B loses.

Page 7: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 7CSI 4106, Winter 2005

... Games in logic (4)

A_wins([6], B)

A_wins([5], A) A_wins([4], A)

A_wins([4], B) A_wins([3], B)

A_wins([3], A) A_wins([], A)

A_wins([], B)

A winning strategy would always lead to a win. Here, such a strategy is described by a subgraph with one OR edge selected from each OR node. All leaves in the subgraph must represent wins for player A.

Now we cannot find a winning strategy: why?

Page 8: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 8CSI 4106, Winter 2005

... Games in logic (5)

This kind of analysis only works for very small game trees:

A_wins([8], A)

A_wins([7], B)

A_wins([6], A) A_wins([5], A)A_wins([4, 3], A)

A_wins([4], B)

Page 9: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 9CSI 4106, Winter 2005

The basic loop

The basic loop in a game program:build as much of the complete tree as seems reasonable (for example, within a given time limit);evaluate the incomplete tree;prune unpromising or bad moves;make a move;get the opponent's move.

Regularities:moves of player A sprout from OR nodes,moves of player B sprout from AND nodes.Seen from A's perspective, this means that A chooses one of the moves (the best move, if possible), and is ready to react to all of B's moves.

Page 10: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 10CSI 4106, Winter 2005

Static evaluation

A static evaluation function returns the value of a move without trying to play (which would mean simulating the rest of the game but not playing it).Usually a static evaluation function returns positive values for positions advantageous to A, negative values for positions advantageous to B.If player A is rational, he will choose the maximal value of a leaf.Player B will choose the minimal value.

Page 11: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 11CSI 4106, Winter 2005

Static evaluation (2)

If we can have (guess or calculate) the value of an internal node N, we can treat it as if it were a leaf. This is the basis of the minimax procedure.No tree would be necessary if we could evaluate the initial position statically. Normally we need a tree, and we need look-ahead into it. Further positions can be evaluated more precisely, because there is more information, and a more focussed search.Minimax works best for large trees, but it can be useful even in mini-games such as tic-tac-toe.

Page 12: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 12CSI 4106, Winter 2005

Tic-Tac-ToeLet player A be x and let open(x), open(o) mean the number of lines open to x and o. There are 8 lines. An evaluation function for position P:f(P) = - if o winsf(P) = + if x winsf(P) = open(x) - open(o) otherwiseExample:open(x) - open(o) = 6 - 4

xo

Assumptions:only one of symmetrical positions is generated;we build 2 levels of the game tree (one move -- one response) to have 2-ply lookahead.

Page 13: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 13CSI 4106, Winter 2005

Tic-Tac-Toe (2)

Player B chooses the minimal backed-up value among level 1 nodes.Player A chooses the maximal value, and makes the move.Player B, as a rational agent, selects the optimal response.

xx

x

x x x x x

x x x x x

o oo

oo

o oxx

oo

oo

o

6-5 5-5 6-5 5-5 4-5 5-4 6-4

5-6 5-5 5-6 6-6 4-6

Page 14: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 14CSI 4106, Winter 2005

Tic-Tac-Toe (3)

B's first three moves are blocking moves. Other moves lead to + for A: the only finite value is the minimum.For A this is a three-way tie in the evaluation; the chance to get more information is to consider more plies.

x xxx x

o

x x

o o

x

oo

x

o o

3-3 3-2 4-3 4-2 3-2 3-2

o o

xxo x

xo

xxo

xx

ox

x

ox

x

oo

o

Page 15: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 15CSI 4106, Winter 2005

Tic-Tac-Toe (4)

Now, what happens if B chooses a weaker move?

The procedure finds a winning continuation: the best position ensures a win by forced moves.

x xx

xx

o

x x

oo

x

o

o

x

o o

2-2 3-2 4-2 4-3 4-3 3-3

o o

xox

xxo

xx

oxx

oxx

oxx

oo o

Page 16: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 16CSI 4106, Winter 2005

Tic-Tac-Toe (5)

Building complete plies is usually not necessary. If we evaluate a position when it is generated, we may save a lot.Assume that we are at a minimizing level. If the evaluation function returns -, we do not need to consider other positions:- will be the minimum.The same applies to + at a maximizing level.

2-1 3-12-1 3-1 -

xx

o o

xx

o o

xx

o o

xx

o o

xx

o o

xx

o oxx

x xx

xx

o o

xx

o o

xx

o o

xx

o ox

x xx

- --

o o o o

Page 17: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 17CSI 4106, Winter 2005

Tic-Tac-Toe (6)

This is possible because of the special properties of the infinite values, but we can achieve a similar effect for finite values.

x x

x x x x xo oo

oo

o x

6-5 5-5 6-5 5-5 4-5 5-6

Page 18: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 18CSI 4106, Winter 2005

Tic-Tac-Toe (7)

The backed-up value of the first node at level 1 is -1, so the value of the (maximizing) root must be ≥ -1.When we see 5 - 6 = -1, we know that the value of the (minimizing) node • must be ≤ -1. The whole subtree sprouting from • cannot contribute anything and should not even be built.

x x

x x x x xo oo

oo

o x

6-5 5-5 6-5 5-5 4-5 5-6

Page 19: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 19CSI 4106, Winter 2005

Minimax with cut-offIn general, we keep a provisional value in every node. This value can only increase in an OR (maximizing) node, and decrease in an AND (minimizing) node.If an AND node with the provisional value V has a child C with a value less than V, we abandon C.If an OR node with the provisional value V has a child C with a value greater that V, we abandon C.Provisional values are established as soon as we "know something", and are propagated up the tree, from the leaves to the root.

Page 20: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 20CSI 4106, Winter 2005

- cut-off (2)

i. We stop searching in and below a minimizing node N with a provisional value PVN that is less than or equal to the provisional values of its maximizing ancestors.The final value for N is PVN.

ii. We stop searching in and below a maximizing node N with a provisional value PVN that is greater than or equal to the provisional values of its minimizing ancestors.The final value for N is PVN.

Page 21: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 21CSI 4106, Winter 2005

- cut-off (3)

A provisional value of an OR node is called its alpha-value.A provisional value of an AND node is called its beta-value.During search, the alpha-value of a node is set to the currently largest of the final values for descendants;the beta-value of a node is set to the currently smallest of the final values for descendants.(i) is a shallow alpha-cutoff,(ii) is a shallow beta-cutoff.

Page 22: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 22CSI 4106, Winter 2005

- cut-off (4)

0 5 -3 3 3 -3 0 2 -2 3 5 2 5 -5 0 1 5 1 -3 0 -5 5 -3 3 2

Page 23: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 23CSI 4106, Winter 2005

Extensions, modifications"Waiting for quiescence" -- when we reach the depth limit in the middle of a dynamic exchange (large amplitude of values).Secondary search ("feedover") -- "double check" down a path that seems the best.Book moves -- "canned" continuations (openings, endgames), and forced moves.Disadvantages of minimax:relying on the optimality of the opponent's play,no spectacular sacrifices are possible (winning back beyond the search limit),the horizon effect.

Queenlost

Queenlost

Pawnlost

Search limit

Page 24: Games, page 1 CSI 4106, Winter 2005 Games Points Games and strategies A logic-based approach to games AND-OR trees/graphs Static evaluation functions Tic-Tac-Toe

Games, page 24CSI 4106, Winter 2005

What next?

This is a technology for one kind of games, and quite probably not the most popular (-:). Adventure games require a very different kind of Artificial Intelligence. Visit

http://www.gameai.com/ai.htmlfor a nearly professional perspective. And, in general, google it.