CS 380: ARTIFICIAL INTELLIGENCE PROBLEM SOLVING: INFORMED ...santi/teaching/2017/CS380/CS380-W4... · CS 380: ARTIFICIAL INTELLIGENCE PROBLEM SOLVING: INFORMED SEARCH, A* Santiago

CS 380: ARTIFICIAL INTELLIGENCE

PROBLEM SOLVING: INFORMED SEARCH, A*

Santiago Ontañón

A Note on Graph Search

• Repeated-state checking:
  • When the search space is a graph, strategies like DFS can get stuck in loops.
  • Algorithms need to keep a list (CLOSED) of already visited nodes.
• In DFS:
  • If we want to avoid repeating states completely, we need to keep ALL the visited states in memory (in the CLOSED list).
  • If we just want to avoid loops, we only need to remember the current branch (memory linear in “m”, the maximum depth of the search).
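The two DFS options above can be sketched as follows. This is a minimal illustration, not code from the course: the function names and the small `GRAPH` are hypothetical, and the graph is assumed to be a dictionary mapping each state to its children.

```python
from typing import Dict, Hashable, List, Optional, Set

Graph = Dict[Hashable, List[Hashable]]


def dfs_closed(graph: Graph, start: Hashable, goal: Hashable) -> Optional[List[Hashable]]:
    """DFS that never repeats a state: CLOSED holds every visited state."""
    closed: Set[Hashable] = set()
    stack = [(start, [start])]  # (state, path from start to state)
    while stack:
        state, path = stack.pop()
        if state == goal:
            return path
        if state in closed:
            continue
        closed.add(state)
        for child in graph.get(state, []):
            if child not in closed:
                stack.append((child, path + [child]))
    return None


def dfs_branch_only(graph, start, goal, path=None):
    """DFS that only avoids loops: it remembers just the current branch."""
    path = [start] if path is None else path
    if start == goal:
        return path
    for child in graph.get(start, []):
        if child not in path:  # skip states already on this branch
            found = dfs_branch_only(graph, child, goal, path + [child])
            if found is not None:
                return found
    return None


# Hypothetical graph with a loop A -> B -> A.
GRAPH = {"A": ["B"], "B": ["A", "C"], "C": []}
```

The second version can revisit a state reached through a different branch, which is the price of keeping only the current branch in memory.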

Evaluation Functions

• Idea: represent the information we have about the domain as an “evaluation function” h.
• Evaluation function (heuristic):
  • Given a state s,
  • h(s) estimates how close or how far s is from the goal.
• Example:
  • In a maze solving problem: Euclidean distance to the goal.
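As a small sketch of what such an evaluation function might look like for a maze, assuming states are (row, column) cells. The `Cell` type and function names are illustrative only; Manhattan distance is included because it is the heuristic used in the A* example.

```python
import math
from typing import Tuple

Cell = Tuple[int, int]  # (row, column) position in the maze (assumed encoding)


def euclidean(cell: Cell, goal: Cell) -> float:
    """Straight-line distance from cell to goal."""
    return math.hypot(cell[0] - goal[0], cell[1] - goal[1])


def manhattan(cell: Cell, goal: Cell) -> int:
    """Distance when only horizontal and vertical moves are allowed."""
    return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])
```

For example, from cell (0, 0) to goal (3, 4) the Euclidean estimate is 5.0 and the Manhattan estimate is 7.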

A*

• At each cycle, A* expands the node with the lowest f(n):
  • f(n) = g(n) + h(n), where g(n) is the real cost from Start to n and h(n) is the heuristic estimate from n to the goal.
• A* implementations assume repeated-state checking (i.e. they assume the search space is a graph):
  • OPEN: list of nodes that need to be expanded
  • CLOSED: list of nodes that have already been expanded
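One common way to get "the node with the lowest f" cheaply is to keep OPEN in a binary heap. The sketch below assumes that choice (the slides only say "list"); the helper names are hypothetical, and the g and h values are those of nodes A, B and C in the example that follows.

```python
import heapq
import itertools

open_heap = []                    # OPEN: nodes waiting to be expanded
closed = set()                    # CLOSED: nodes already expanded
tie_breaker = itertools.count()   # keeps insertion order among equal f


def push(name, g, h):
    """Add a node to OPEN, ordered by f = g + h."""
    heapq.heappush(open_heap, (g + h, next(tie_breaker), name))


def pop_lowest_f():
    """Remove the node with the lowest f from OPEN and move it to CLOSED."""
    f, _, name = heapq.heappop(open_heap)
    closed.add(name)
    return name, f


push("B", g=1, h=4)
push("A", g=1, h=2)
push("C", g=1, h=4)
first = pop_lowest_f()   # A is removed first because f(A) = 3 is the lowest
```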

Example: A*

[Figure: map with Start and Goal, and the search tree containing only Start]

Heuristic used: Manhattan Distance

OPEN = [Start]
CLOSED = []

Example: A*

[Figure: map with Start and Goal, and the search tree labeled with f values]

Nodes (g, h, f): Start (0, 3, 3)

Assigns an “estimated cost”, f, to each node:

f(n) = g(n) + h(n)

• g(n): real cost from Start to n
• h(n): heuristic

Heuristic used: Manhattan Distance

OPEN = [Start]
CLOSED = []

Example: A*

[Figure: map with Start and Goal, and the search tree labeled with f values]

Nodes (g, h, f): Start (0, 3, 3), A (1, 2, 3), B (1, 4, 5), C (1, 4, 5)

Expands the node with the lowest estimated cost first.

OPEN = [A, B, C]
CLOSED = [Start]

Example: A*

[Figure: map with Start and Goal, and the search tree labeled with f values]

Nodes (g, h, f): Start (0, 3, 3), A (1, 2, 3), B (1, 4, 5), C (1, 4, 5), D (2, 3, 5)

OPEN = [B, C, D]
CLOSED = [Start, A]

Example: A*

[Figure: map with Start and Goal, and the search tree labeled with f values]

Nodes (g, h, f): Start (0, 3, 3), A (1, 2, 3), B (1, 4, 5), C (1, 4, 5), D (2, 3, 5), E (2, 5, 7)

OPEN = [C, D, E]
CLOSED = [Start, A, B]

Example: A*

[Figure: map with Start and Goal, and the search tree labeled with f values]

Nodes (g, h, f): Start (0, 3, 3), A (1, 2, 3), B (1, 4, 5), C (1, 4, 5), D (2, 3, 5), E (2, 5, 7), F (2, 3, 5), G (2, 5, 7)

OPEN = [D, E, F, G]
CLOSED = [Start, A, B, C]

Example: A*

[Figure: map with Start and Goal, and the search tree labeled with f values]

Nodes (g, h, f): Start (0, 3, 3), A (1, 2, 3), B (1, 4, 5), C (1, 4, 5), D (2, 3, 5), E (2, 5, 7), F (2, 3, 5), G (2, 5, 7)

OPEN = [E, F, G]
CLOSED = [Start, A, B, C, D]

Example: A*

[Figure: map with Start and Goal, and the search tree labeled with f values]

Nodes (g, h, f): Start (0, 3, 3), A (1, 2, 3), B (1, 4, 5), C (1, 4, 5), D (2, 3, 5), E (2, 5, 7), F (2, 3, 5), G (2, 5, 7), I (3, 4, 7), J (3, 2, 5)

OPEN = [E, G, I, J]
CLOSED = [Start, A, B, C, D, F]

Example: A*

[Figure: map with Start and Goal, and the search tree labeled with f values]

Nodes (g, h, f): Start (0, 3, 3), A (1, 2, 3), B (1, 4, 5), C (1, 4, 5), D (2, 3, 5), E (2, 5, 7), F (2, 3, 5), G (2, 5, 7), I (3, 4, 7), J (3, 2, 5), K (4, 3, 7), L (4, 1, 5)

OPEN = [E, G, I, K, L]
CLOSED = [Start, A, B, C, D, F, J]

Example: A*

[Figure: map with Start and Goal, and the search tree labeled with f values]

Nodes (g, h, f): Start (0, 3, 3), A (1, 2, 3), B (1, 4, 5), C (1, 4, 5), D (2, 3, 5), E (2, 5, 7), F (2, 3, 5), G (2, 5, 7), I (3, 4, 7), J (3, 2, 5), K (4, 3, 7), L (4, 1, 5), Goal (5, 0, 5)

OPEN = [E, G, I, K, Goal]
CLOSED = [Start, A, B, C, D, F, J, L]

Example: A*

[Figure: map with Start and Goal, and the search tree labeled with f values]

Nodes (g, h, f): Start (0, 3, 3), A (1, 2, 3), B (1, 4, 5), C (1, 4, 5), D (2, 3, 5), E (2, 5, 7), F (2, 3, 5), G (2, 5, 7), I (3, 4, 7), J (3, 2, 5), K (4, 3, 7), L (4, 1, 5), Goal (5, 0, 5)

OPEN = [E, G, I, K]
CLOSED = [Start, A, B, C, D, F, J, L, Goal]


A*

Start.g = 0; Start.h = heuristic(Start)
OPEN = [Start]; CLOSED = []
WHILE OPEN is not empty
    N = OPEN.removeLowestF()
    IF goal(N) RETURN path to N
    CLOSED.add(N)
    FOR all children M of N not in CLOSED:
        M.parent = N
        M.g = N.g + 1; M.h = heuristic(M)
        OPEN.add(M)
    ENDFOR
ENDWHILE
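A possible Python rendering of this pseudocode, as a sketch under several assumptions not in the slides: the search space is a 4-connected grid given as strings, `#` marks a wall, every move costs 1, and OPEN is a heap. Unlike the pseudocode, it keeps only the cheapest known route to each cell and skips stale heap entries. The maze below is hypothetical, not the map from the example.

```python
import heapq
import itertools
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]


def manhattan(a: Cell, b: Cell) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def astar(grid: List[str], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """A* on a 4-connected grid; '#' marks a wall, every move costs 1."""
    counter = itertools.count()  # tie-breaker among entries with equal f
    open_heap = [(manhattan(start, goal), next(counter), start)]
    g: Dict[Cell, int] = {start: 0}
    parent: Dict[Cell, Optional[Cell]] = {start: None}
    closed = set()

    while open_heap:
        _, _, n = heapq.heappop(open_heap)      # N = OPEN.removeLowestF()
        if n == goal:                           # IF goal(N) RETURN path to N
            path = []
            while n is not None:
                path.append(n)
                n = parent[n]
            return path[::-1]
        if n in closed:                         # stale duplicate entry
            continue
        closed.add(n)                           # CLOSED.add(N)
        r, c = n
        for m in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= m[0] < len(grid) and 0 <= m[1] < len(grid[m[0]])
            if not inside or grid[m[0]][m[1]] == "#" or m in closed:
                continue
            new_g = g[n] + 1                    # M.g = N.g + 1
            if new_g < g.get(m, float("inf")):  # keep only the cheapest route to M
                g[m] = new_g
                parent[m] = n                   # M.parent = N
                heapq.heappush(open_heap, (new_g + manhattan(m, goal), next(counter), m))
    return None


# Hypothetical 3x4 maze (not the map from the slides).
MAZE = ["S..#",
        ".#..",
        "...G"]
path = astar(MAZE, (0, 0), (2, 3))
```

Here `path` is a list of cells from (0, 0) to (2, 3); a goal that cannot be reached yields `None`.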

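A*: Differences wrt Breadth First Search

• Breadth First Search removes the oldest node from OPEN; A* removes the node with the lowest f (OPEN.removeLowestF()).
• A* therefore stores g and h for every node (M.g = N.g + 1; M.h = heuristic(M)).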

A* (step-by-step trace of the pseudocode)

[Figure: animation of the pseudocode on the example map. Nodes in red are in CLOSED; nodes in grey are in OPEN.]

• Start.g = 0; Start.h = 3
• N = Start; children(N) = A, B, C
• M = A: g = 1, h = 2
• M = B: g = 1, h = 4
• M = C: g = 1, h = 4
• The loop then repeats with a new N removed from OPEN.

Implementation Notes

• Remember the distinction between state and node.
• State:
  • The configuration of the problem (e.g. coordinates of a robot, positions of the pieces in the 8-puzzle, etc.)
• Node:
  • A state plus: current cost (g), current heuristic (h), parent node, and the action that got us here from the parent node.
  • It is important to remember which node was the parent, and which action was taken, so that once the solution is found we can reconstruct the path.
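The state/node distinction might be captured like this. It is a minimal sketch: the `Node` class, its field names and the three sample nodes are illustrative assumptions, not part of the course material.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Node:
    state: Any                      # problem configuration, e.g. robot coordinates
    g: float = 0.0                  # cost from the start to this node
    h: float = 0.0                  # heuristic estimate to the goal
    parent: Optional["Node"] = None
    action: Optional[str] = None    # action that led here from the parent

    @property
    def f(self) -> float:
        return self.g + self.h


def reconstruct_actions(node: Node) -> List[str]:
    """Walk parent links back to the start and return the actions in order."""
    actions = []
    while node.parent is not None:
        actions.append(node.action)
        node = node.parent
    return actions[::-1]


start = Node(state=(0, 0), h=2)
mid = Node(state=(0, 1), g=1, h=1, parent=start, action="right")
goal = Node(state=(1, 1), g=2, h=0, parent=mid, action="down")
```

Calling `reconstruct_actions(goal)` follows the parent links goal, mid, start and returns the actions in forward order.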

A* Intuition

• The heuristic biases the search of the algorithm towards the goal:

[Figure: area explored between Start and Goal by each algorithm]
  • Breadth First Search: no bias
  • A*: biased towards the goal

Admissible Heuristics

• To ensure optimality, A* requires the heuristic to be admissible:

  h(n) ≤ h*(n), where h*(n) is the actual cost from n to the goal

• In other words: the heuristic never overestimates the actual remaining cost to the goal.
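On a small graph, admissibility can be checked directly by computing h*(n) for every node. The sketch below assumes an undirected weighted graph stored as adjacency lists and uses Dijkstra's algorithm from the goal; the graph, the function names and the heuristic tables are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, List[Tuple[str, float]]]  # node -> [(neighbor, step cost)]


def true_costs_to_goal(graph: Graph, goal: str) -> Dict[str, float]:
    """h*(n) for every n, via Dijkstra from the goal (graph assumed undirected)."""
    dist = {goal: 0.0}
    heap = [(0.0, goal)]
    while heap:
        d, n = heapq.heappop(heap)
        if d > dist[n]:
            continue  # outdated heap entry
        for m, cost in graph[n]:
            if d + cost < dist.get(m, float("inf")):
                dist[m] = d + cost
                heapq.heappush(heap, (d + cost, m))
    return dist


def is_admissible(h: Dict[str, float], graph: Graph, goal: str) -> bool:
    """True if h(n) <= h*(n) for every node that can reach the goal."""
    h_star = true_costs_to_goal(graph, goal)
    return all(h[n] <= h_star[n] for n in h_star)


# Hypothetical undirected graph: S -1- A -1- G, plus a direct edge S -3- G.
GRAPH = {"S": [("A", 1), ("G", 3)], "A": [("S", 1), ("G", 1)], "G": [("A", 1), ("S", 3)]}
```

In this graph h*(S) = 2, so a table with h(S) = 2 is admissible while one with h(S) = 3 is not.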

A* Optimality Proof

Optimality of A∗ (standard proof)

Suppose some suboptimal goal G2 has been generated and is in the queue. Let n be an unexpanded node on a shortest path to an optimal goal G1.

[Figure: search tree rooted at Start, with n on the path to the optimal goal (labeled G) and G2 on another branch]

f(G2) = g(G2)   since h(G2) = 0
      > g(G1)   since G2 is suboptimal
      ≥ f(n)    since h is admissible

Since f(G2) > f(n), A∗ will never select G2 for expansion.
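The last step of the chain uses the fact that n lies on a shortest path to G1, so the optimal cost splits into the cost to reach n plus the true remaining cost h*(n). Written out in full (a restatement of the proof above, with h* denoting the actual cost to the goal as in the admissibility definition):

```latex
\begin{align*}
f(G_2) &= g(G_2) + h(G_2) = g(G_2) && \text{since } h(G_2) = 0 \\
       &> g(G_1) && \text{since } G_2 \text{ is suboptimal} \\
       &= g(n) + h^*(n) && \text{since } n \text{ lies on a shortest path to } G_1 \\
       &\ge g(n) + h(n) = f(n) && \text{since } h \text{ is admissible}
\end{align*}
```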


Heuristic Dominance

If h2(n) ≥ h1(n) for all n (both admissible), then h2 dominates h1 and is better for search.

Typical search costs (d is the solution depth; IDS is iterative deepening search):

d = 14   IDS = 3,473,941 nodes         A∗(h1) = 539 nodes      A∗(h2) = 113 nodes
d = 24   IDS ≈ 54,000,000,000 nodes    A∗(h1) = 39,135 nodes   A∗(h2) = 1,641 nodes

Given any admissible heuristics ha, hb,

h(n) = max(ha(n), hb(n))

is also admissible and dominates ha, hb.
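Combining heuristics with max can be sketched as a small helper. The name `max_heuristic` and the two lookup-table heuristics are hypothetical; the true costs in the comment are assumed purely for illustration.

```python
from typing import Any, Callable

Heuristic = Callable[[Any], float]


def max_heuristic(*heuristics: Heuristic) -> Heuristic:
    """Combine heuristics pointwise: h(n) = max(ha(n), hb(n), ...)."""
    def h(state):
        return max(hi(state) for hi in heuristics)
    return h


# Hypothetical lookup-table heuristics on three states; the true costs to the
# goal are assumed to be S: 4, A: 2, G: 0, so both tables are admissible.
ha = {"S": 3, "A": 1, "G": 0}.get
hb = {"S": 2, "A": 2, "G": 0}.get
h = max_heuristic(ha, hb)
```

Neither table dominates the other (ha is larger on S, hb on A), but the combined h is at least as large as both on every state and still never exceeds the assumed true costs.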


Consistent Heuristics

Proof of lemma: Consistency

A heuristic is consistent if

h(n) ≤ c(n, a, n′) + h(n′)

[Figure: triangle formed by n, its successor n′ and the goal G, with sides c(n, a, n′), h(n′) and h(n)]

If h is consistent, we have

f(n′) = g(n′) + h(n′)
      = g(n) + c(n, a, n′) + h(n′)
      ≥ g(n) + h(n)
      = f(n)

I.e., f(n) is nondecreasing along any path.

Consistent heuristics ensure A* will not need to expand a node more than once (i.e., they make it more efficient). But regardless of whether the heuristic is consistent or not, A* is still guaranteed to find the optimal path if the heuristic is admissible, provided that a node is expanded again whenever a cheaper path to it is found.
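The consistency condition and its effect on f can be checked edge by edge. This is a sketch on a hypothetical two-step path with unit costs; the function names and both heuristic tables are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Edges = List[Tuple[str, str, float]]  # (n, n', c(n, a, n')) for each action


def is_consistent(h: Dict[str, float], edges: Edges) -> bool:
    """True if h(n) <= c(n, a, n') + h(n') holds on every edge."""
    return all(h[n] <= cost + h[n2] for n, n2, cost in edges)


def f_along_path(h: Dict[str, float], path_edges: Edges) -> List[float]:
    """f = g + h at each node of a path given as consecutive edges."""
    g = 0.0
    fs = [g + h[path_edges[0][0]]]
    for _, n2, cost in path_edges:
        g += cost
        fs.append(g + h[n2])
    return fs


# Hypothetical path S -> A -> G with unit step costs.
EDGES = [("S", "A", 1.0), ("A", "G", 1.0)]
consistent_h = {"S": 2, "A": 1, "G": 0}
inconsistent_h = {"S": 2, "A": 0, "G": 0}   # admissible, but h(S) > 1 + h(A)
```

With `consistent_h`, f stays at 2 along the whole path; with `inconsistent_h`, f drops from 2 to 1 at A before rising again, which is exactly what the lemma rules out for consistent heuristics.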

Constructing Heuristics by Relaxation

Relaxed problems

Admissible heuristics can be derived from the exact solution cost of a relaxed version of the problem.

For the 8-puzzle:
h1(n) = number of tiles out of place.
h2(n) = sum of the Manhattan distances of the tiles to their proper locations.

If the rules of the 8-puzzle are relaxed so that a tile can move anywhere, then h1(n) gives the shortest solution.

If the rules are relaxed so that a tile can move to any adjacent square, then h2(n) gives the shortest solution.

Key point: the optimal solution cost of a relaxed problem is no greater than the optimal solution cost of the real problem.
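The two 8-puzzle heuristics can be sketched as follows, assuming a state is a tuple of 9 numbers in row-major order with 0 for the blank. The encoding, the goal layout and the example state are assumptions made for illustration.

```python
from typing import Tuple

State = Tuple[int, ...]  # 9 entries in row-major order, 0 is the blank

GOAL: State = (0, 1, 2, 3, 4, 5, 6, 7, 8)  # assumed goal layout


def h1(state: State, goal: State = GOAL) -> int:
    """Number of tiles out of place (the blank is not counted)."""
    return sum(1 for s, t in zip(state, goal) if s != 0 and s != t)


def h2(state: State, goal: State = GOAL) -> int:
    """Sum of Manhattan distances of the tiles to their goal squares."""
    total = 0
    for index, tile in enumerate(state):
        if tile == 0:
            continue
        target = goal.index(tile)
        total += abs(index // 3 - target // 3) + abs(index % 3 - target % 3)
    return total


# Example state chosen for illustration:
#   7 2 4
#   5 _ 6
#   8 3 1
EXAMPLE: State = (7, 2, 4, 5, 0, 6, 8, 3, 1)
```

For this state h1 is 8 (every tile is misplaced) and h2 is 18, so h2 gives the larger estimate here, in line with h2 dominating h1.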

A* everywhere…

• Even for videogame playing:
  • http://www.youtube.com/watch?v=DlkMs4ZHHr8

Variations of A*

• SMA*: A* with bounded memory usage
• TBA*: A* for real-time domains where we have a bounded time before producing an action
• LRTA*: another real-time version of A* (very simple, and the basis of a whole family of algorithms)
• D*: A* for dynamic domains (problem configuration can change)
• etc.
