Games Henry Kautz



Page 30: Games

ExpectiMiniMax: Alpha-Beta Pruning

• Cutoffs at Max and Min nodes work just as before

• If the range of values is bounded, cutoffs can also be added at Chance nodes

• Assume that all branches not yet searched have the worst-case result

• L = lowest value achievable (-10)

• U = highest value achievable (10)
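The bound behind these chance-node cutoffs can be written out as follows. This is a sketch in notation that does not appear on the slide: \(p_i\) and \(v_i\) are the probabilities and values of the children already searched, and \(R\) is the total probability of the children not yet searched.

\[
\sum_{i \in \text{seen}} p_i v_i + L \cdot R
\;\le\; \text{value of the chance node} \;\le\;
\sum_{i \in \text{seen}} p_i v_i + U \cdot R,
\qquad R = 1 - \sum_{i \in \text{seen}} p_i
\]

For example, with L = -10 and U = 10, after searching one child with probability 1/2 and value 4, the node's value must lie between 2 - 5 = -3 and 2 + 5 = 7.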

Page 31: Games

ExpectiMiniMax: Cutoffs

• Beta cutoff:

• Alpha cutoff:

[Formula labels: values seen, current value, values to come]
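The two cutoff tests at a chance node might be sketched as below. This is an illustration under stated assumptions, not the slide's own formulas (which did not survive): the function name `chance_value`, the representation of children as `(probability, evaluate)` pairs, and the choice to return the bound itself on a cutoff are all hypothetical.

```python
from typing import Callable, List, Tuple

L, U = -10.0, 10.0  # lowest and highest achievable values, as on the slide


def chance_value(
    children: List[Tuple[float, Callable[[], float]]],
    alpha: float,
    beta: float,
    lo: float = L,
    hi: float = U,
) -> float:
    """Expected value of a chance node, stopping early when a bound allows it.

    children: (probability, evaluate) pairs; evaluate() searches that child.
    Assumes the probabilities sum to 1 and every child value is in [lo, hi].
    """
    seen = 0.0        # probability-weighted sum of the values seen so far
    remaining = 1.0   # probability mass of the values still to come
    for prob, evaluate in children:
        seen += prob * evaluate()
        remaining -= prob
        upper = seen + remaining * hi  # every unseen child as good as possible
        lower = seen + remaining * lo  # every unseen child as bad as possible
        if upper <= alpha:   # alpha cutoff: Max already has something at least as good
            return upper
        if lower >= beta:    # beta cutoff: Min already has something at least as good
            return lower
    return seen
```

With alpha = 5, a first child of probability 1/2 and value -10 caps the node at -5 + 5 = 0, so the second child is never searched.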

Page 48: Games

Probabilistic STRIPS Planning

domain: Hungry Monkey

shake:
    if (ontable)
        Prob(2/3) -> +1 banana
        Prob(1/3) -> no change
    else
        Prob(1/6) -> +1 banana
        Prob(5/6) -> no change

jump:
    if (~ontable)
        Prob(2/3) -> ontable
        Prob(1/3) -> ~ontable
    else
        ontable

Page 49: Games

What is the expected reward?

[1] shake

[2] jump; shake

[3] jump; shake; shake

[4] jump; if (~ontable) { jump; shake } else { shake; shake }
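One way to work these out, assuming the monkey starts off the table with no bananas and the reward is the number of bananas collected (an assumption, though it agrees with the root values 1/2 and 1/3 in the ExpectiMax slides):

\[
\begin{aligned}
E[1] &= \tfrac{1}{6} \\
E[2] &= \tfrac{2}{3}\cdot\tfrac{2}{3} + \tfrac{1}{3}\cdot\tfrac{1}{6} = \tfrac{1}{2} \\
E[3] &= \tfrac{2}{3}\left(\tfrac{2}{3}+\tfrac{2}{3}\right) + \tfrac{1}{3}\left(\tfrac{1}{6}+\tfrac{1}{6}\right) = 1 \\
E[4] &= \tfrac{2}{3}\left(\tfrac{2}{3}+\tfrac{2}{3}\right) + \tfrac{1}{3}\cdot E[2] = \tfrac{8}{9} + \tfrac{1}{6} = \tfrac{19}{18}
\end{aligned}
\]

In each line the first term covers a successful jump (probability 2/3, after which each shake yields 2/3 of a banana on average) and the second a failed jump.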

Page 50: Games

ExpectiMax

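\[
\mathrm{ExpectiMax}(n) =
\begin{cases}
U(n) & \text{if } n \text{ is a terminal node} \\
\max\{\mathrm{ExpectiMax}(s) \mid s \in \mathrm{children}(n)\} & \text{if } n \text{ is a max node} \\
\sum_{s \in \mathrm{children}(n)} P(s)\,\mathrm{ExpectiMax}(s) & \text{if } n \text{ is a chance node}
\end{cases}
\]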
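The recursion can be sketched directly in Python. The tree encoding here is hypothetical: a bare number is a terminal node (its utility), `("max", [children])` is a max node, and `("chance", [(P, child), ...])` is a chance node. The example tree is a hand-built 2-ply Hungry Monkey tree, assuming the monkey starts off the table and that a jump while on the table is a single certain outcome.

```python
from fractions import Fraction as F


def expectimax(node):
    """Evaluate a node following the three cases of the ExpectiMax definition."""
    if not isinstance(node, tuple):          # terminal node: return its utility
        return F(node)
    kind, children = node
    if kind == "max":                        # max node: best child value
        return max(expectimax(child) for child in children)
    if kind == "chance":                     # chance node: probability-weighted sum
        return sum(p * expectimax(child) for p, child in children)
    raise ValueError(f"unknown node kind: {kind!r}")


# Each max node lists its two actions in the order [jump, shake].
on_table_0 = ("max", [("chance", [(F(1), 0)]),
                      ("chance", [(F(2, 3), 1), (F(1, 3), 0)])])
off_table_0 = ("max", [("chance", [(F(2, 3), 0), (F(1, 3), 0)]),
                       ("chance", [(F(1, 6), 1), (F(5, 6), 0)])])
off_table_1 = ("max", [("chance", [(F(2, 3), 1), (F(1, 3), 1)]),
                       ("chance", [(F(1, 6), 2), (F(5, 6), 1)])])
root = ("max", [("chance", [(F(2, 3), on_table_0), (F(1, 3), off_table_0)]),
                ("chance", [(F(1, 6), off_table_1), (F(5, 6), off_table_0)])])

print(expectimax(root))  # 1/2
```

Exact fractions are used instead of floats so the results can be compared directly with the values on the slides.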

Page 51: Games

Hungry Monkey: 2-Ply Game Tree

[Figure: 2-ply game tree. At each max node the actions are jump and shake; chance branches are labelled 2/3 and 1/3, or 1/6 and 5/6. Leaf values, left to right: 0 0 1 0, 0 0 1 0, 1 1 2 1, 0 0 1 0]

Page 52: Games

ExpectiMax 1 – Chance Nodes

[Figure: the same tree with the bottom-level chance nodes evaluated. Values (jump, shake) under each max node, left to right: 0 2/3, 0 1/6, 1 7/6, 0 1/6]
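The non-trivial chance values at this level can be reproduced by hand. This is a reading of the tree that assumes the leftmost max node is the on-table state with no bananas and the third is the off-table state with one banana:

\[
\begin{aligned}
\text{shake, on table, 0 bananas:} \quad & \tfrac{2}{3}\cdot 1 + \tfrac{1}{3}\cdot 0 = \tfrac{2}{3} \\
\text{shake, off table, 0 bananas:} \quad & \tfrac{1}{6}\cdot 1 + \tfrac{5}{6}\cdot 0 = \tfrac{1}{6} \\
\text{shake, off table, 1 banana:} \quad & \tfrac{1}{6}\cdot 2 + \tfrac{5}{6}\cdot 1 = \tfrac{7}{6}
\end{aligned}
\]

A final jump never adds a banana, so each jump node is worth the bananas already held (0 or 1).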

Page 53: Games

ExpectiMax 2 – Max Nodes

[Figure: the same tree with the lower max nodes evaluated. Values, left to right: 2/3, 1/6, 7/6, 1/6]

Page 54: Games

ExpectiMax 3 – Chance Nodes

[Figure: the same tree with the top-level chance nodes evaluated: 1/2 (jump) and 1/3 (shake)]
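Each top-level chance value is the probability-weighted sum of the max-node values beneath it (2/3, 1/6, 7/6, 1/6). As a worked check:

\[
\begin{aligned}
\text{jump:} \quad & \tfrac{2}{3}\cdot\tfrac{2}{3} + \tfrac{1}{3}\cdot\tfrac{1}{6} = \tfrac{4}{9} + \tfrac{1}{18} = \tfrac{1}{2} \\
\text{shake:} \quad & \tfrac{1}{6}\cdot\tfrac{7}{6} + \tfrac{5}{6}\cdot\tfrac{1}{6} = \tfrac{7}{36} + \tfrac{5}{36} = \tfrac{1}{3}
\end{aligned}
\]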

Page 55: Games

ExpectiMax 4 – Max Node

[Figure: the same tree with the root max node evaluated: 1/2]

Page 56: Games

Policies

The result of the ExpectiMax analysis is a conditional plan (also called a policy):

Optimal plan for 2 steps: jump; shake

Optimal plan for 3 steps: jump; if (ontable) { shake; shake } else { jump; shake }

Probabilistic planning can be generalized in many ways, including action costs and hidden state

The general problem is that of solving a Markov Decision Process (MDP)
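As an illustration, the optimal plans above can be recovered by running ExpectiMax over the Hungry Monkey domain with a fixed number of steps. This is a minimal sketch, not a general MDP solver: the names `outcomes` and `best` are hypothetical, the reward is assumed to be one unit per banana, and the state is reduced to whether the monkey is on the table.

```python
from fractions import Fraction as F
from functools import lru_cache

ACTIONS = ("jump", "shake")


def outcomes(on_table, action):
    """Return [(probability, next_on_table, bananas_gained)] for one action."""
    if action == "shake":
        p = F(2, 3) if on_table else F(1, 6)
        return [(p, on_table, 1), (1 - p, on_table, 0)]
    if on_table:                      # jumping while on the table: stay there
        return [(F(1), True, 0)]
    return [(F(2, 3), True, 0), (F(1, 3), False, 0)]


@lru_cache(maxsize=None)
def best(on_table, steps):
    """Return (expected bananas, best next action) with `steps` actions left."""
    if steps == 0:
        return F(0), None
    scored = []
    for action in ACTIONS:
        # Chance node: weight reward plus future value by outcome probability.
        value = sum(p * (gain + best(nxt, steps - 1)[0])
                    for p, nxt, gain in outcomes(on_table, action))
        scored.append((value, action))
    # Max node: keep the action with the highest expected value.
    return max(scored, key=lambda pair: pair[0])


print(best(False, 2))  # (Fraction(1, 2), 'jump')
print(best(False, 3))  # (Fraction(19, 18), 'jump')
```

Calling `best` on the state reached after each action gives the branch of the conditional plan to follow; for instance `best(True, 2)` recommends shake and `best(False, 2)` recommends jump, which matches the 3-step plan.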

Page 57: Games

Gambler’s Paradox

• How much would you pay to play the following game?

• Flip a coin. If heads, you win $2.

• Otherwise: flip again. If heads, you win $4.

• Otherwise: flip again. If heads, you win $8.

• Otherwise: flip again. If heads, you win $16. And so on, doubling the prize each time.

Page 58: Games

Expected Value

• Expected value is INFINITE!
  (1/2)*2 + (1/4)*4 + (1/8)*8 + …

• “Rationally” you should be willing to pay ANY fixed amount.

• In real life, people will pay about $20.
  – This is consistent with logarithmic utility of money
  – (1/2)*log(2) + (1/4)*log(4) + (1/8)*log(8) + …
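The difference between the two series can be seen numerically. The sketch below truncates the game after a fixed number of flips; the function names are hypothetical, and log base 2 is an assumption, since the slide does not state a base.

```python
import math


def partial_expected_value(n_flips):
    """Sum of (1/2)^k * 2^k for k = 1..n_flips; every term equals 1."""
    return sum((0.5 ** k) * (2 ** k) for k in range(1, n_flips + 1))


def partial_expected_log_utility(n_flips):
    """Sum of (1/2)^k * log2(2^k) for k = 1..n_flips, i.e. the sum of k / 2^k."""
    return sum((0.5 ** k) * math.log2(2 ** k) for k in range(1, n_flips + 1))


for n in (10, 20, 40):
    print(n, partial_expected_value(n), partial_expected_log_utility(n))
```

The expected payoff grows by 1 with every extra flip allowed, so it has no finite limit, while the expected log utility levels off (at 2 when logs are taken base 2).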