
Clustering (Un-supervised Learning)

Partition-based clustering: k-means

• Goal: minimize the sum of squared distances
o Between each point and the center of its cluster
o Between each pair of points in the cluster

• Algorithm:
o Initialize K cluster centers (random K separated points)
o Repeat until stabilization:
  • Assign each point to the closest cluster center
  • Generate new cluster centers
  • Adjust clusters by merging or splitting

[Figure: cluster centers]
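The assign/update loop above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the sample points, the fixed random seed and the choice to initialise from K data points are assumptions, and the optional merge/split adjustment is left out.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, max_iterations=100, seed=0):
    """Return (centers, labels) after repeating assign/update until stable."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: start from K data points
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its closest center.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:  # stabilised: no point changed cluster
            break
        labels = new_labels
        # Update step: each center becomes the mean of its assigned points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return centers, labels


# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers, labels = k_means(points, k=2)
print(labels)
```

On data this cleanly separated the two groups end up in different clusters; on harder data the result depends on the initial centers.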

Distance functions

Consider two records x = (x_1, …, x_d) and y = (y_1, …, y_d):

$$d(x, y) = \left( |x_1 - y_1|^p + |x_2 - y_2|^p + \dots + |x_d - y_d|^p \right)^{1/p}$$

Special cases:

• p=1: Manhattan distance

$$d(x, y) = |x_1 - y_1| + |x_2 - y_2| + \dots + |x_d - y_d|$$

• p=2: Euclidean distance

$$d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \dots + (x_d - y_d)^2}$$
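As a quick check of the two special cases, the general distance can be written as one small function. The function name and the sample records are assumptions made for illustration.

```python
def minkowski_distance(x, y, p):
    """Distance of order p between two equal-length records x and y."""
    if len(x) != len(y):
        raise ValueError("records must have the same number of attributes")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


# Hypothetical records with d = 3 attributes.
x = (1.0, 2.0, 3.0)
y = (4.0, 6.0, 3.0)

manhattan = minkowski_distance(x, y, 1)  # |1-4| + |2-6| + |3-3|
euclidean = minkowski_distance(x, y, 2)  # sqrt(3^2 + 4^2 + 0^2)
print(manhattan, euclidean)
```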

Problems, Problem Spaces and Search

Foundations of Artificial Intelligence


• Defining Problems as Search Spaces

• Weak Search Techniques

• Strong Search Techniques

Things to achieve :

• Understand the difference between blind and heuristic search.

• Know what a ‘blind’ and ‘heuristic’ method is.

• Use some of the search techniques on real problems.

Many problems exhibit no detectable regular structure to exploit; they appear “chaotic” and do not yield to efficient algorithms.

Exhaustive search of large state spaces appears to be the only viable approach. We introduce techniques for exhaustive (blind) search and present some examples of intelligent, ‘heuristic’ search.

Defining a Problem as a Search Space


The concept of search plays an ambivalent role in science and engineering: in one sense, any problem whatsoever can be seen as a search for “the right answer”.

Often we can't simply write down and solve the equations for a problem. This is not to dismiss mathematical approaches to problem solving, especially the mechanistic, deterministic variety that is the central concern of science and engineering. These approaches can be augmented by other kinds of problem-solving approaches that will make engineering ‘better’.

Formulation and Representation of Problems

To solve problems that are of interest to scientists and engineers we need to apply a common vocabulary.

State space, or search space

Goal, or search criterion

Search algorithm

Data structures

Nodes

Search Trees

Decision Trees

Search Graphs

Search Space

Parents and Ancestors

Children and Descendants

[Figure: Search Tree]

Combinatorial Explosion

The travelling salesman problem: n! possible orderings of n cities

n=10: 10! = 3,628,800

Claude Shannon delivered a paper in 1949 at a New York conference on how a computer could play chess.

Chess has 10^120 unique games (with an average of 40 moves, the average length of a master game).

Working at 200 million positions per second, Deep Blue would require 10^100 years to evaluate all possible games.

To put this in some sort of perspective, the universe is only about 10^10 years old, and 10^120 is larger than the number of atoms in the universe.
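To get a feel for how quickly n! grows, the count for n = 10 can be reproduced and extended with the standard library. The helper name and the extra values of n are chosen only for illustration.

```python
import math


def tour_orderings(n):
    """Number of orderings of n cities, counted as n! as on the slide."""
    return math.factorial(n)


for n in (5, 10, 15, 20):
    print(n, tour_orderings(n))
```

Going from 10 to 20 cities multiplies the count by a factor of several hundred billion, which is why enumerating every ordering stops being practical almost immediately.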

Representing The Problem

A chessboard layout can be represented as a matrix. For example, if “2” represents the white queen, then “2d1” can represent the fact that the white queen is at d1. To play the game, a computer needs to search for the winning position (state) in a huge tree of game states.

A game tree

Foxes and Chickens

The Problem

Three foxes and three chickens seek to cross a river. A boat is available which can hold two animals and which can be navigated by any combination of foxes and chickens involving one or two animals.

The chickens insist on never being left in a minority on either riverbank, for fear of being eaten by a majority of foxes.

Find a schedule of crossings that will permit all the foxes and chickens to cross the river safely.

[Figure: start state, F F F C C C B (three foxes, three chickens and the boat)]

Many questions arise

If all the generated nodes are expanded we generate multiple copies of many nodes.

Also many nodes which are generated are unacceptable.
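Both issues, duplicate nodes and unacceptable nodes, can be handled mechanically. The sketch below is one possible formulation, not the one used on the slides: the state encoding (foxes on the starting bank, chickens on the starting bank, boat on the starting bank), the function names and the use of breadth-first search are all assumptions.

```python
from collections import deque

TOTAL = 3  # three foxes and three chickens
BOAT_LOADS = [(1, 0), (2, 0), (0, 1), (0, 2), (1, 1)]  # (foxes, chickens)


def is_safe(foxes_left, chickens_left):
    """Chickens are never outnumbered on a bank where any chicken stands."""
    foxes_right = TOTAL - foxes_left
    chickens_right = TOTAL - chickens_left
    left_ok = chickens_left == 0 or chickens_left >= foxes_left
    right_ok = chickens_right == 0 or chickens_right >= foxes_right
    return left_ok and right_ok


def successors(state):
    """Yield every tenable state reachable with one crossing."""
    foxes_left, chickens_left, boat_left = state
    direction = -1 if boat_left else 1  # the boat carries animals away or back
    for foxes, chickens in BOAT_LOADS:
        new_foxes = foxes_left + direction * foxes
        new_chickens = chickens_left + direction * chickens
        if 0 <= new_foxes <= TOTAL and 0 <= new_chickens <= TOTAL:
            if is_safe(new_foxes, new_chickens):  # drop untenable states
                yield (new_foxes, new_chickens, not boat_left)


def solve(start=(TOTAL, TOTAL, True), goal=(0, 0, False)):
    """Breadth-first search that never expands the same state twice."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:  # skip copies of already generated states
                parent[nxt] = state
                queue.append(nxt)
    return None


schedule = solve()
for step in schedule:
    print(step)
```

The `parent` dictionary doubles as the record of generated states, so copies of the start state (or of any other state) are never expanded again.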

[Figures: search tree for the Foxes and Chickens problem, laid out on a grid (START, rows 1 to 5, columns 1 to 9), marking untenable states and copies of the start state, and tracing the solution]

Define the Problem as a Search Space

Questions come to mind.

For example, is there more than one order in which the nodes can be expanded?

Does our search method actually find a solution?

Is it a good solution?

Does it find the optimal solution?

Which method should be used?


Two categories of search methods

Blind (exhaustive) search

depth/breadth first search

Heuristic search

A* search


We’ll evaluate all the later search techniques with respect to the following four criteria:

1. Completeness: Is the strategy guaranteed to find a solution if one exists?

2. Time Complexity: How long does it take to find a solution?

3. Space Complexity: How much memory does it take to perform the search?

4. Optimality: Does the strategy find the optimal solution when there are several solutions?

Blind Search Strategies

Since search forms the core of many intelligent processes, it is useful to structure AI programs in a way that facilitates describing and performing the search process. We need to study the question of how to decide which strategy to apply and even what the strategies are.

The algorithms and strategies for exhaustive search - that is those methods for straightforwardly expanding every single node in a search tree - are sometimes called the blind search methods because although they are very general they lack the power of knowledge-guided search.

Thus, their very generality implies a certain weakness.

Blind searches can usually be broken down into two forms of search, depth-first search and breadth-first search.

Blind Search

• Blind searches have no preference as to which state (node) is expanded next
• The different types of blind searches are characterised by the order in which they expand the nodes
• This can have a dramatic effect on how well the search performs when measured against the four criteria we defined earlier

Blind Search

Breadth first search

Depth first search

Uniform cost search

Depth limited search

Breadth First Search

• Expand the root node first
• Expand all nodes at level 1 before expanding level 2
• In general, expand all nodes at level d before expanding nodes at level d+1

[Figure: search tree with root A, second level B and C, third level D, E, F, G]

Whereas depth-first search is a policy for quickly penetrating as deeply as possible, its cautious partner breadth-first search can be likened to a wave propagating through the search space at equal speed in all directions.

The memory cost of maintaining the wave front is significant, since all states in the front must be stored in their entirety.
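The wave front corresponds to a first-in, first-out queue of partial paths. The sketch below is a minimal illustration on a small tree with root A; which leaves hang under B and which under C is an assumption, as are the function and variable names.

```python
from collections import deque

# Assumed shape of the example tree: A at the root, B and C below it,
# then the leaves D, E (under B) and F, G (under C).
TREE = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F", "G"],
    "D": [], "E": [], "F": [], "G": [],
}


def breadth_first_search(tree, root, goal):
    """Return (path to goal, order in which nodes were expanded)."""
    frontier = deque([[root]])  # FIFO queue of paths: the wave front
    expanded = []
    while frontier:
        path = frontier.popleft()  # oldest, hence shallowest, path first
        node = path[-1]
        expanded.append(node)
        if node == goal:
            return path, expanded
        for child in tree[node]:
            frontier.append(path + [child])
    return None, expanded


bfs_path, bfs_order = breadth_first_search(TREE, "A", "F")
print(bfs_path)   # path from the root to F
print(bfs_order)  # level-by-level expansion order
```

Every path on the frontier is stored in full, which is exactly the memory cost described above.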

• Very systematic
• If there is a solution, breadth first search is guaranteed to find it
• If there are several solutions, breadth first search will always find the shallowest goal state first; if the cost of a solution is a non-decreasing function of the depth, it will always find the cheapest solution

Evaluating against four criteria

• Optimal: yes, provided the cost of a solution is a non-decreasing function of its depth
• Complete: yes
• Time complexity: b + b^2 + b^3 + ... + b^d, i.e. O(b^d)
• Space complexity: b + b^2 + b^3 + ... + b^d, i.e. O(b^d)

b is the branching factor and d is the depth of the search tree.

Note: the space/time complexity could be less, as the solution may be found somewhere before the dth level (depends on the problem).

Exponential growth quickly makes complete state space searches unrealistic. If the branching factor were 10, by level 5 we would need to search 100,000 nodes (i.e. 10^5).

Space is more of a factor for breadth first search than time, but time is still an issue: who has 35 years to wait for an answer to a level 12 problem (or even 128 days for a level 10 problem)?

It could be argued that as technology gets faster, exponential growth will not be a problem. But even if technology were 100 times faster, we would still have to wait 35 years for a level 14 problem, and what if we hit a level 15 problem!

Is BFS (or DFS later) a better option for
- Maze
- TSP
- n-Queen
- 8-puzzle
- …

Blind Search


Depth First Search

Depth-first search (DFS) is the prime candidate. Its logic is simple:

“keep going as long as you see anything new, and when that is not possible, back up as far as necessary and proceed in a new direction”.

• Expand the root node first
• Explore one branch of the tree before exploring another branch

Evaluating DFS by four criteria

Space complexity
• Only needs to store the path from the root to the leaf node, as well as the unexpanded nodes
• For a state space with a branching factor of b and a maximum depth of m, DFS requires storage of b × m nodes

Time complexity
• b^m in the worst case

If DFS goes down an infinite branch, it will not terminate if it does not find a goal state. If it does find a solution, there may be a better solution closer to the root of the tree. Therefore, depth first search is neither complete nor optimal.
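Swapping the queue for a last-in, first-out stack turns the same loop into depth-first search. The following is a minimal sketch on a small finite tree; the tree shape and all names are assumptions made for illustration, and on an infinite branch this loop would indeed never return.

```python
# Assumed example tree: root A, children B and C, leaves D, E, F, G.
TREE = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F", "G"],
    "D": [], "E": [], "F": [], "G": [],
}


def depth_first_search(tree, root, goal):
    """Return (path to goal, order in which nodes were expanded)."""
    stack = [[root]]  # LIFO stack of paths
    expanded = []
    while stack:
        path = stack.pop()  # most recently generated path first
        node = path[-1]
        expanded.append(node)
        if node == goal:
            return path, expanded
        # Push children in reverse so the leftmost child is explored first.
        for child in reversed(tree[node]):
            stack.append(path + [child])
    return None, expanded


dfs_path, dfs_order = depth_first_search(TREE, "A", "F")
print(dfs_path)   # path from the root to F
print(dfs_order)  # the whole branch under B is explored before C
```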

BFS will find the optimal (shallowest) solution as long as the cost is a function of the depth. Suppose that we have a tree in which all the weights of branches are one: the weight of a path from the root to a node N is then just the depth of node N.

Blind Search

Uniform Cost Search (vs. BFS)

Uniform Cost Search can be used when this is not the case, that is, when branches have different costs.

It will find the cheapest solution provided that the cost of a path never decreases as we proceed along the path.

Uniform Cost Search works by always expanding the lowest cost node on the fringe (the leaf nodes generated so far).

The cost of a node n is the total cost of the path from the root to n.

“Search all nodes of cost c before those of cost c+1”

• In BFS, deeper nodes always arrive after shallower nodes
• In UCS, the costs of new nodes do not have such a nice pattern

In UCS we need to
1. explicitly store the cost g of a node
2. explicitly use such costs in deciding the ordering in the queue

Always remove the smallest cost node first:
• sort the queue in increasing order, or
• alternatively, search the queue and remove the smallest cost

Nodes are removed by cost, not by order of arrival.

Blind Search


• BFS will find the path SAG, with a cost of 11, but SBG is cheaper with a cost of 10

• UCS will find the cheaper solution (SBG). It also generates SAG, but because that path has a higher cost it stays on the queue and is never selected

[Figure: graph with start S, goal G and intermediate nodes A, B and C; the edge costs shown are 1, 10, 5, 5, 15 and 5]

Completeness: If there is a path to a goal then UCS will find it If there is no path, then UCS will eventually report that the goal is unreachable

Optimality: UCS will report a minimum cost path (there might be many)
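A priority queue ordered by path cost gives a compact sketch of UCS. The graph below is a reconstruction and therefore an assumption: the stated totals (SAG = 11, SBG = 10) are consistent with edges S-A = 1, A-G = 10, S-B = 5, B-G = 5, S-C = 15 and C-G = 5, but the figure itself is not available. Function and variable names are also illustrative.

```python
import heapq

# Assumed edge costs, chosen to match the path totals quoted above.
GRAPH = {
    "S": {"A": 1, "B": 5, "C": 15},
    "A": {"G": 10},
    "B": {"G": 5},
    "C": {"G": 5},
    "G": {},
}


def uniform_cost_search(graph, start, goal):
    """Return (cost, path) of a cheapest path, or None if unreachable."""
    frontier = [(0, [start])]  # priority queue ordered by path cost g
    expanded = set()
    while frontier:
        cost, path = heapq.heappop(frontier)  # smallest cost first
        node = path[-1]
        if node == goal:
            return cost, path
        if node in expanded:
            continue
        expanded.add(node)
        for neighbour, step in graph[node].items():
            heapq.heappush(frontier, (cost + step, path + [neighbour]))
    return None


ucs_result = uniform_cost_search(GRAPH, "S", "G")
print(ucs_result)
```

The path through A is pushed onto the queue first, but the path through B is popped first because nodes are removed by cost, not by order of arrival.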

Breadth First Search
• Optimal: only if the branch cost is the same

Uniform Cost Search
• Optimal: even if the branch cost is different

Complete: systematic search throughout the whole tree

Time and space complexity: O(b^d) (bounded by b^d)

UCS is usually better than BFS

UCS = BFS
• When all solutions, rather than just one solution, are needed
• When all branches have the same cost

Blind Search


Depth Limited Search

• DFS may never terminate, as it could follow a path that has no solution on it
• DLS solves this by imposing a depth limit, at which point the search terminates at that particular branch
• Can be implemented by the general search algorithm using operators which keep track of the depth
• Choice of depth parameter is important
o Too deep is wasteful of time and space
o Too shallow and we may never reach a goal state

Completeness
• If the depth parameter, l, is set deep enough, then we are guaranteed to find a solution if one exists
• Therefore it is complete if l >= d (d = depth of solution)

Space requirements
• O(b × l)

Time requirements
• O(b^l)

DLS is not optimal
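The effect of the depth parameter l is easy to see in a short recursive sketch. The example tree and the names below are assumptions; the point is only that a limit below the goal's depth returns nothing, while a limit of at least d finds the goal.

```python
# Assumed example tree: root A, children B and C, leaves D, E, F, G.
TREE = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F", "G"],
    "D": [], "E": [], "F": [], "G": [],
}


def depth_limited_search(tree, node, goal, limit):
    """Return a path from node to goal using at most `limit` steps, else None."""
    if node == goal:
        return [node]
    if limit == 0:  # depth limit reached: give up on this branch
        return None
    for child in tree[node]:
        result = depth_limited_search(tree, child, goal, limit - 1)
        if result is not None:
            return [node] + result
    return None


too_shallow = depth_limited_search(TREE, "A", "F", limit=1)
deep_enough = depth_limited_search(TREE, "A", "F", limit=2)
print(too_shallow, deep_enough)
```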

Blind Search


G51IAI – Blind Searches

[Figure: Map of Romania, showing the towns Arad, Bucharest, Craiova, Dobreta, Eforie, Fagaras, Giurgiu, Hirsova, Iasi, Lugoj, Mehadia, Neamt, Oradea, Pitesti, Rimnicu Vilcea, Sibiu, Timisoara, Urziceni, Vaslui and Zerind]

On the Romania map there are 20 towns, so any town is reachable in 19 steps.

In fact, any town is reachable in 10 steps.

Heuristic Search Techniques

Do you drive? Have you thought about how the route plan is created for you in your TomTom? How does your GPS create the shortest (or the quickest) route between A and B for you?

How do you find the sequence of moves in 8-puzzle with the minimum number of steps?

A* search


The general search methods discussed do not make use of domain knowledge and are considered weak methods simply because they do not exploit such knowledge. In order to solve many problems efficiently, it is often necessary to construct a control structure that is no longer guaranteed to find the best answer, but will almost always find a very good answer. Thus we introduce the idea of a heuristic.

Heuristic : A technique which improves the efficiency of a search process, possibly by sacrificing claims of completeness.

Heuristic Methods

Heuristic is a key term in many branches of AI. A heuristic is best defined as a 'rule of thumb' or piece of advice that is usually based on prior experience and is not guaranteed to work.

Heuristics

A moment's reflection will show that we constantly use heuristics in the course of our everyday lives.

• If the sky is grey, we conclude that it would be better to put on a coat before going out.

• We book our holidays in August because that is when the weather is best.

Heuristic Search

A search algorithm to find the shortest path through a search space to a goal state using a heuristic.

f = g + h

• f: function that gives an evaluation of the state
• g: the cost of getting from the initial state to the current state
• h: the estimated cost of getting from the current state to a goal state

Heuristic searches vs. Uniform Cost Search

Uniform Cost Search
• Expands the path with the lowest path cost
• Chooses the lowest cost node thus far
• Path cost function g(n): the cost of the path thus far

Heuristic search
• Estimates how close the current node is to the goal, not how cheap the path is thus far
• Evaluation function h(n): how close the current node is to the solution

The A* Search Heuristic

f = g + h

• Combines the cost so far and the estimated cost to the goal
• That is, f(n) = g(n) + h(n)
• This gives us an estimated cost of the cheapest solution through n

We need to have a proper way to estimate h.

[Figure: outline of a graph with a start node, a goal node and two nodes A and B, annotated with the path costs gA, gB and the estimates hA, hB]

• h = 0: A* becomes UCS, which is complete and optimal, but the search pattern is undirected
• h too large: if h is large enough to dominate g, A* becomes like Greedy search and loses optimality

ANIMATION OF A*.

[Figure: animation of A* on the map of Romania, starting at Sibiu with goal Bucharest. Fringe in RED, visited in BLUE. Annotations read “g+h=f”.]

Reading the annotations against the map:

Sibiu: 0+253=253
Fagaras: 99+178=277
Rimnicu: 80+193=273
Arad: 140+366=506
Pitesti: 177+98=275
Craiova: 226+160=386 (R)
Craiova: 315+160=475 (R, P)
Bucharest: 310+0=310 (F)
Bucharest: 278+0=278 (R, P)

Why not 211?

Optimal route is (80+97+101) = 278 miles

A*

Nodes Expanded:

1. Sibiu; 2. Rimnicu; 3. Pitesti; 4. Fagaras; 5. Bucharest (278) GOAL!!
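The expansion order can be reproduced with a short A* sketch. The road lengths and h values below are read off the slide annotations (for example 80+193 for Rimnicu and 177+98 for Pitesti); restricting the map to the towns that appear in those annotations, and all function and variable names, are assumptions made for illustration.

```python
import heapq

# Sub-map reconstructed from the slide annotations (an assumption).
ROADS = {
    "Sibiu": {"Fagaras": 99, "Rimnicu": 80, "Arad": 140},
    "Rimnicu": {"Sibiu": 80, "Pitesti": 97, "Craiova": 146},
    "Fagaras": {"Sibiu": 99, "Bucharest": 211},
    "Pitesti": {"Rimnicu": 97, "Craiova": 138, "Bucharest": 101},
    "Craiova": {"Rimnicu": 146, "Pitesti": 138},
    "Arad": {"Sibiu": 140},
    "Bucharest": {"Fagaras": 211, "Pitesti": 101},
}
# Estimated distance to Bucharest, as used in the annotations.
H = {"Sibiu": 253, "Fagaras": 178, "Rimnicu": 193, "Arad": 366,
     "Pitesti": 98, "Craiova": 160, "Bucharest": 0}


def a_star(roads, h, start, goal):
    """Return (cost, path, expansion order), or None if no route exists."""
    frontier = [(h[start], 0, [start])]  # entries are (f, g, path)
    expanded = []
    while frontier:
        f, g, path = heapq.heappop(frontier)  # lowest f = g + h first
        node = path[-1]
        if node in expanded:
            continue
        expanded.append(node)
        if node == goal:
            return g, path, expanded
        for neighbour, step in roads[node].items():
            if neighbour not in expanded:
                new_g = g + step
                heapq.heappush(
                    frontier, (new_g + h[neighbour], new_g, path + [neighbour])
                )
    return None


cost, route, order = a_star(ROADS, H, "Sibiu", "Bucharest")
print(cost, route)
print(order)
```

Bucharest first enters the queue with f = 278 via Pitesti and later with f = 310 via Fagaras; the cheaper entry is popped first, so the route through Fagaras is never returned.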

UCS

[Figure: the same map of Romania, searched with Uniform Cost Search from Sibiu]

Nodes expanded:

1. Sibiu; 2. Rimnicu; 3. Fagaras; 4. Arad; 5. Pitesti; 6. Zerind; 7. Craiova; 8. Timisoara; 9. Bucharest (278)



• Clearly the expansion of the fringe is much more directed towards the goal
• The number of expansions is significantly reduced

A* is optimal and complete, but it is not all good news:
• It can be shown that the number of nodes that are searched is still exponential in the size of most problems
• This has implications not only for the time taken to perform the search but also for the space required
• Of these two problems, the space complexity is more serious

If you examine the animation on the previous slide you will notice an interesting phenomenon:
• Along any path from the root, the f-cost never decreases
• This is no accident
• It holds for every consistent (monotonic) heuristic; admissibility alone is not quite enough to guarantee it


8 puzzle problem

Online demo of A* algorithm for 8 puzzle: http://www.permadi.com/java/puzzle8/

Noyes Chapman’s 15 puzzle

Initial State    Goal State

1 3 4            1 2 3
8 6 2            8 _ 4
7 _ 5            7 6 5

(_ marks the empty square)














Possible Heuristics in A* Algorithm

H1 = the number of tiles that are in the wrong position (=4 for the initial state)

H2 = the sum of the distances of the tiles from their goal positions using the Manhattan Distance (=5 for the initial state)

We need admissible heuristics (ones that never overestimate). Both are admissible, but which one is better?
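Both values can be checked with a few lines of code. The board encoding (tuples of rows, with 0 standing for the empty square) and the function names are assumptions; the empty square is placed where the quoted values H1 = 4 and H2 = 5 require it to be.

```python
# 0 marks the empty square (an encoding chosen for this sketch).
INITIAL = ((1, 3, 4),
           (8, 6, 2),
           (7, 0, 5))
GOAL = ((1, 2, 3),
        (8, 0, 4),
        (7, 6, 5))


def positions(board):
    """Map each tile to its (row, column) position on the board."""
    return {tile: (r, c)
            for r, row in enumerate(board)
            for c, tile in enumerate(row)}


def misplaced_tiles(board, goal):
    """H1: number of tiles (not counting the empty square) out of place."""
    goal_pos = positions(goal)
    return sum(1 for tile, pos in positions(board).items()
               if tile != 0 and pos != goal_pos[tile])


def manhattan_distance(board, goal):
    """H2: sum over tiles of row distance plus column distance to the goal."""
    goal_pos = positions(goal)
    total = 0
    for tile, (r, c) in positions(board).items():
        if tile != 0:
            goal_r, goal_c = goal_pos[tile]
            total += abs(r - goal_r) + abs(c - goal_c)
    return total


h1 = misplaced_tiles(INITIAL, GOAL)
h2 = manhattan_distance(INITIAL, GOAL)
print(h1, h2)
```

Every misplaced tile contributes at least 1 to the Manhattan sum, so H2 is never smaller than H1 on any board.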


G51IAI - Heuristic


[Figures: A* search trees for the 8 puzzle, expanded from the initial state, one guided by H1 (the number of tiles in the wrong position) and one by H2 (the sum of Manhattan distances); each board is annotated with its heuristic value]



Distance Between Cities

[Figure: road graph over the cities A, B, C, D, E, F and G, with edge lengths 1, 2, 2, 4, 2, 3, 5, 4, 2, 3 and 3]

Start Point: A    Goal Point: G

Distance to destination

City    Distance
A       8
B       9
C       6
D       4
E       5
F       2
G       0

Travel from City A to City C:
• Distance travelled (g) = 2 miles
• Distance still to go (h) = 6 miles
• Value of current state (f) = g + h = 8

The Game of Nim

In this game there are initially 9 tokens, and two players take it in turns to remove 1, 2 or 3 tokens at a time.

The player who has to remove the last token is the loser.

Minmax Algorithm – Generate and Test

Remember that the algorithm aims for the computer to win the game. The computer needs to assess each possible move before actually making a move.

This can be done by giving each node a value: a big value to any node that is good for the computer and a small value to any node that is bad for it.

Once the values for the leaf nodes (those at the bottom of the tree) are known, we know the computer will try to move to the nodes with big values, and the opponent naturally will try to move to the nodes with small values.

Following this logic we can back the values of the leaf nodes up according to whose turn it is to move: if it is the computer’s move, the maximum value is backed up, and if it is the opponent’s move, the minimum value is backed up.

Once all the nodes have been assigned a value, the computer player is able to play the perfect game.

At each node it just moves to the next node that has the highest value.
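The backing-up rule can be sketched for this version of Nim as follows. The values +1 (computer wins) and -1 (computer loses), the function names and the use of memoisation are assumptions made for this illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def minimax(tokens, computer_to_move):
    """Backed-up value of a position: +1 if the computer wins, -1 if it loses."""
    if tokens == 0:
        # Whoever moved last took the final token and therefore lost.
        return 1 if computer_to_move else -1
    values = [minimax(tokens - take, not computer_to_move)
              for take in (1, 2, 3) if take <= tokens]
    # The computer backs up the maximum, the opponent the minimum.
    return max(values) if computer_to_move else min(values)


def best_move(tokens):
    """Number of tokens the computer should take from this position."""
    legal = [take for take in (1, 2, 3) if take <= tokens]
    return max(legal, key=lambda take: minimax(tokens - take, False))


root_value = minimax(9, False)  # the human player moves first from 9 tokens
print(root_value)
print(best_move(8))
```

With 8 tokens left on its turn, the computer takes 3 and leaves 5, after which every reply by the opponent can be answered by again leaving the opponent in a losing position.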

[Figure: game tree for the Game of Nim starting from 9 tokens. Each node shows the number of tokens remaining, the levels alternate between MIN (Player to move) and MAX (Computer to move), and each node carries a backed-up value of 1 or -1.]

Concluding

Search space (state space)

Search tree

Search methods

Depth-, breadth-first search

Depth limited search

Uniform cost search

A* algorithm

Minmax

Combinatorial explosion

Heuristics

Untenable states
