Graphs and Finding your way in the wilderness Chapter 14 in DS&PS Chapter 9 in DS&AA


Page 1: Graphs and Finding your way in the wilderness

Graphs and Finding your way in the wilderness

Chapter 14 in DS&PS

Chapter 9 in DS&AA

Page 2: Graphs and Finding your way in the wilderness

General Problems

• What is the shortest path from A to B?

• What is the shortest path from A to all nodes?

• What is the shortest/cheapest path between any two nodes?

• Search for Goal node, i.e. a node with specific properties, like a win in chess.

• What is shortest tour? (visit all vertices)

– no known polynomial algorithm in number of edges

• What is the longest path from A to B?

Page 3: Graphs and Finding your way in the wilderness

Some Applications

• Route Finding

– metacrawler, on net, claims to find shortest path between two points

• Game-Playing

– chess/checkers end-game play improved greatly once it was recognized as graph search, not tree search

• Critical Path Analysis

– multiperson/task job schedule analysis

– answers what are the key tasks that can’t slip

• Travel arrangement

– cheapest cost to meet constraints

Page 4: Graphs and Finding your way in the wilderness

Definitions

• Graph: set of edges E and vertices V.

• Edge is a pair of vertices (v,w)

– edge may be directed or undirected

– edge may have a cost

– v and w are said to be adjacent

• Digraph or directed graph: directed edges

• Path: sequence of vertices v1,…,vn where each <vi,vi+1> is an edge.

• Cycle: path where v1=vn

• Tour: cycle that contains every vertex

• DAG: directed acyclic graph

– graph with no cycles

– Tree algorithms usually work with DAGs just fine

Page 5: Graphs and Finding your way in the wilderness

Adjacency Matrix Representation

• Matrix A = new Boolean(|V|,|V|).

– O(|V|^2) memory costs: acceptable only if dense

• If <vi,vj> is an edge, set A[i][j] = true, else false

• Special matrix multiply operator:

– row[i] @ col[j] = (row[i][1] & col[1][j]) or (row[i][2] & col[2][j]) or …

• In A@A, if entry [i][j] is true, what does that mean?

– There is some k such that vi->vk and vk->vj, i.e. there is a length 2 path from vi to vj.

• Similarly, A^k indicates which pairs of vertices are connected by a length k path.

• Cost: O(k*n^3).
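The boolean product described on this slide can be sketched in a few lines. This is a minimal illustration, not the course's code: the function name `bool_product` and the three-vertex example graph are assumptions, and vertices are numbered from 0 rather than 1.

```python
def bool_product(a, b):
    """Boolean matrix product: AND replaces multiply, OR replaces add."""
    n = len(a)
    return [[any(a[i][k] and b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical directed graph: 0 -> 1 -> 2
A = [[False, True, False],
     [False, False, True],
     [False, False, False]]

A2 = bool_product(A, A)
print(A2[0][2])  # True: 0 -> 1 -> 2 is a path of length 2
print(A2[0][1])  # False: the only path from 0 to 1 has length 1
```

Each call costs O(n^3), which is where the O(k*n^3) bound for A^k comes from.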

Page 6: Graphs and Finding your way in the wilderness

Matrix Review

• If A has n rows and k columns and B has k rows and m columns, then A*B = C

• where C has n rows and m columns and (standard)

– C[i][j] = A[i][1]*B[1][j]+…+A[i][k]*B[k][j]

• i.e. the i-j entry of C is the dot product of row i of A and column j of B. Total time cost is O(n*k*m).

• Example: a 3 by 3 matrix represents a linear transformation from R^3 into R^3.

• Points in R^3 represented as a 3 by 1 vector (column)

• Essentially matrices stretch/shrink or rotate points.

– Determinant defines amount of stretch/shrink.

• Theorem: Locally every differentiable function can be approximated by a matrix (linear transformation).

Page 7: Graphs and Finding your way in the wilderness

Cost Matrix Representation

• Now A[i][j] = cost of edge from vi to vj.

– If no edge, either set cost to Infinity or add Boolean attribute to indicate no edge.

• New multiplication operation

– row[i]@col[j] = min { row[i][1]+col[1][j], row[i][2]+col[2][j], …, row[i][n]+col[n][j] }

• Now A^2 contains the minimum cost of a path of length 2 between any 2 vertices.

• A^n has complexity O(n^4). Not good.

• A^k records the minimum cost over paths of length exactly k. If we also set A[i][i] = 0, then A^k records the minimum cost over all paths of length <= k.
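The min/+ product on this slide can be written directly. The sketch below is illustrative only: `min_plus`, `INF` and the example cost matrix are assumed names and data, with 0 on the diagonal so that the result covers paths of at most 2 edges.

```python
INF = float("inf")  # stands for "no edge"

def min_plus(a, b):
    """Matrix 'product' in which + replaces multiply and min replaces add."""
    n = len(a)
    return [[min(a[i][k] + b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical cost matrix with A[i][i] = 0
A = [[0,   4,   INF],
     [INF, 0,   1],
     [2,   INF, 0]]

A2 = min_plus(A, A)
print(A2[0][2])  # 5: cheapest route 0 -> 1 -> 2
print(A2[2][1])  # 6: cheapest route 2 -> 0 -> 1
```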

Page 8: Graphs and Finding your way in the wilderness

Adjacency List Representation

• Here each vertex is the head of linked list which stores all the adjacent vertices. The cost to a node, if appropriate is also stored.

• The linked lists are usually stored in an array, since you probably know how many vertices there are.

• Memory cost = O(|V| + |E|), so if |E| << |V|^2, use list representation.

• Graph is sparse if |E| is O(|V|).

• To simplify the discussion, we will assume that each vertex has a method *sons* which returns all adjacent vertices.

• Now we will redo the same problems with the list representation.
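One possible way to hold an adjacency list in code is a dictionary from each vertex to its list of (neighbour, cost) pairs. The helper names `build_adjacency` and `sons`, and the edge data, are assumptions made for this sketch; a dictionary stands in for the array of linked lists.

```python
def build_adjacency(edges, directed=True):
    """Build {vertex: [(neighbour, cost), ...]} from (v, w, cost) triples."""
    adj = {}
    for v, w, cost in edges:
        adj.setdefault(v, []).append((w, cost))
        adj.setdefault(w, [])          # make sure w is present even with no out-edges
        if not directed:
            adj[w].append((v, cost))
    return adj

def sons(adj, v):
    """All vertices adjacent to v, in the sense of the *sons* method above."""
    return [w for w, _ in adj[v]]

adj = build_adjacency([("a", "b", 3), ("a", "c", 1), ("b", "c", 7)])
print(sons(adj, "a"))  # ['b', 'c']
print(sons(adj, "c"))  # []
```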

Page 9: Graphs and Finding your way in the wilderness

General Graph Search Algorithm

Looking for a node with a property

Set Store to contain some start node

while (Store is non-empty) do

    choose (and remove) a node n from Store

    if n is solution, stop

    else add SOME sons of n to Store

• Decisions:

– What should Store be?

– How do we choose a node?

– What does add mean?

– How do we pick which sons to store?

• Cycles are a problem
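A minimal sketch of the general search loop, under some assumptions: the graph is a dictionary mapping each node to a list of its sons, the Store is a double-ended queue, and a `seen` set handles cycles. The name `graph_search` and the `depth_first` flag are illustrative choices, not part of the slides.

```python
from collections import deque

def graph_search(adj, start, is_goal, depth_first=False):
    """Return the first node satisfying is_goal, or None if there is none."""
    store = deque([start])
    seen = {start}                      # guards against revisiting nodes on a cycle
    while store:
        # "choose": take from the back (stack) or the front (queue)
        n = store.pop() if depth_first else store.popleft()
        if is_goal(n):
            return n
        for son in adj[n]:              # "add": only sons not seen before
            if son not in seen:
                seen.add(son)
                store.append(son)
    return None

adj = {1: [2, 3], 2: [1, 4], 3: [4], 4: [1]}   # contains the cycle 1 -> 2 -> 1
print(graph_search(adj, 1, lambda n: n == 4))  # 4
print(graph_search(adj, 1, lambda n: n == 9))  # None
```

Swapping the way the Store is read is the only difference between the depth-first and breadth-first variants in this sketch.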

Page 10: Graphs and Finding your way in the wilderness

Problem: Is Graph connected? (matrix rep)

Note: A is an n by n boolean matrix where n is the number of vertices.

Set A[i][i] to true

Set A[i][j] to true if there is an edge between i and j.

Let B = A^2, using boolean arithmetic

Note B[i][j] is true iff there is a k such that A[i][k] is true and A[k][j] is true, i.e. if there is a path of at most 2 edges from i to j (shorter paths count because A[i][i] is true).

A^k represents whether a k-path exists between any vertices.

Let C = boolean sum of A^i where i= 1…n-1. (why?)

Graph connected if C is all ones.

Time complexity: O(N^4)!

How about directed graphs? Basically the same algorithm.

For directed graphs, strongly connected means directed path between any two vertices.

Page 11: Graphs and Finding your way in the wilderness

Is Undirected Graph G Connected? (adjacency list representation)

• Suppose G is an undirected graph with N vertices.

Let S be any node

Do a (depth/breadth) first search of G, counting the number of nodes. Be careful not to double count.

If number of nodes does not equal N, disconnected.

• Searching a Graph is like searching a tree, except that nodes may be revisited.

• Need to keep track of revisits, else infinite loop.

Page 12: Graphs and Finding your way in the wilderness

Depth First Search Pseudo-Code

• Store = Stack

• Choose = pop

• Initial node: any node

• Add = push only new sons (unvisited ones)

– keep a boolean field visited, initialized to false.

– When a node is “popped”, mark it as visited.

• Graph connected if all nodes visited.

• Properties

– Memory cost: number of nodes

– Guaranteed to find a solution, if one exists (not necessarily the shortest one). How could we guarantee the shortest?

– Time: number of nodes (exponential in the depth for k-ary trees)
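The connectivity test via depth-first search might look like the sketch below. It assumes an undirected graph stored as a dictionary from each vertex to a list of neighbours, with every vertex present as a key; `is_connected` is a hypothetical name.

```python
def is_connected(adj):
    """True if every vertex of the undirected graph is reachable from one start vertex."""
    if not adj:
        return True
    start = next(iter(adj))             # initial node: any node
    visited = set()
    stack = [start]                     # Store = Stack
    while stack:
        v = stack.pop()                 # Choose = pop
        if v in visited:
            continue                    # already counted; avoid double counting
        visited.add(v)                  # mark when popped
        stack.extend(w for w in adj[v] if w not in visited)
    return len(visited) == len(adj)

print(is_connected({0: [1], 1: [0, 2], 2: [1]}))  # True
print(is_connected({0: [1], 1: [0], 2: []}))      # False
```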

Page 13: Graphs and Finding your way in the wilderness

Breadth First Search

• G is an undirected Graph

• As before each node has a boolean visited field.

• Initial node is arbitrary

• Store = Queue

• Choose = dequeue and mark as visited

• Add = enqueue only those sons that have not been visited

• Properties:

– Time: Number of nodes to solution

– Space: Number of nodes

– Guaranteed to find shortest solution
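To illustrate the "shortest solution" property, here is a sketch of breadth-first search that also recovers the path. The `parent` dictionary doubles as the visited marker; `bfs_path` and the example graph are assumptions for illustration.

```python
from collections import deque

def bfs_path(adj, start, goal):
    """Return a path with the fewest edges from start to goal, or None."""
    parent = {start: None}              # also records which nodes were seen
    queue = deque([start])              # Store = Queue
    while queue:
        v = queue.popleft()             # Choose = dequeue
        if v == goal:
            path = []
            while v is not None:        # walk parent links back to start
                path.append(v)
                v = parent[v]
            return path[::-1]
        for w in adj[v]:
            if w not in parent:         # enqueue only unseen sons
                parent[w] = v
                queue.append(w)
    return None

adj = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"],
       "d": ["b", "c", "e"], "e": ["d"]}
print(bfs_path(adj, "a", "e"))  # ['a', 'b', 'd', 'e']
```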

Page 14: Graphs and Finding your way in the wilderness

Is Directed Graph Acyclic? (array representation)

• Let A[i][j] be true if there is a directed edge from i to j.

• Similar to previous case, if B= A^2 with boolean multiplication, then B[i][j] is true iff there is a directed 2-path from i to j.

• Algorithm:

For i = 1 to n (why? a simple cycle can have up to n edges)

    Compute A^i.

    If some diagonal element is true, exit: the graph has a cycle

end for

Exit: the graph is acyclic.
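A sketch of the diagonal test, assuming a boolean adjacency matrix given as a list of lists. It checks the powers A^1 through A^n, because a simple cycle through all n vertices only shows up on the diagonal of A^n. The name `has_cycle` is hypothetical.

```python
def has_cycle(a):
    """True if the directed graph with boolean adjacency matrix a contains a cycle."""
    n = len(a)
    power = a                                   # currently A^1
    for _ in range(n):
        if any(power[i][i] for i in range(n)):  # a path from i back to i
            return True
        # boolean product: power <- power @ a
        power = [[any(power[i][k] and a[k][j] for k in range(n))
                  for j in range(n)] for i in range(n)]
    return False

print(has_cycle([[False, True], [True, False]]))   # True: 0 -> 1 -> 0
print(has_cycle([[False, True], [False, False]]))  # False
```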

Page 15: Graphs and Finding your way in the wilderness

Is Directed Graph Acyclic? (adjacency list rep)

• With care, breadth first search works.

• Define the indegree of a node v as the number of edges of the form (u,v), with u arbitrary.

• Define the outdegree of a node v as the number of edges of the form (v, u) with u arbitrary.

• A node with indegree 0 is like the root of a tree.

• A node with outdegree 0 is a terminal node.

Page 16: Graphs and Finding your way in the wilderness

Breadth First Search Pseudo-code

• Algorithm Idea (has numerous variations/implementations)

• Store = Queue

Compute indegree of all nodes

Enqueue all nodes of indegree 0

While Queue is not empty

    Dequeue node n and decrement the indegree of each node v that has an edge (n,v)

    Enqueue any node whose indegree becomes 0.

If any node still has positive indegree, then cyclic.

Why does algorithm terminate?

• Properties:

– Time & Space: Number of nodes

– find node closest to “roots”.

Page 17: Graphs and Finding your way in the wilderness

Best-First Search

• Goal: find least cost solution

• Here edges have a cost (positive)

• Store = priority queue

• Add = enqueue(), which puts in right order

• Choose = dequeue(), chooses element of least cost

• Properties:

– Find cheapest solution

– Time and Memory: exponential in the depth of the tree.

Page 18: Graphs and Finding your way in the wilderness

Best First Pseudo-Code

Set distance from S to S to 0

Priority Queue PQ <- S

while (PQ is not empty)

    vertex <- PQ.dequeue()

    sons <- vertex.sons()

    for each son in sons

        PQ.enqueue(son, cost to son)

Page 19: Graphs and Finding your way in the wilderness

Topological Sort

• Given: a directed acyclic graph

• Produce: a linear ordering of the vertices such that if a path exists from v1 to v2, then v1 is before v2.

• If v1 is before v2, is there a path from v1 to v2?

• Not necessarily

• Note: there may be multiple correct topological sorts

• Algorithm Idea:

– any vertex with indegree 0 can be first

– Output and delete that vertex

– update indegrees of its sons

– Repeat until empty

• So we need to compute and keep track of indegrees

Page 20: Graphs and Finding your way in the wilderness

Algorithm Implementation

• HashTable of (vertex, indegree, sons)

• Queue of vertices

• Step 1: read each edge (v,w) and add 1 to indegree of w

– linear

• Step 2: Add all vertices with indegree 0 to queue Q.

• Step 3: Process Q by:

– dequeue vertex

– update indegrees of its sons (constant by hashing)

– enqueue any son whose indegree becomes 0.

• Time complexity: linear

• Space: linear

• Proof: Does everything get enqueued?
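The three steps can be sketched as follows, assuming the graph is a dictionary in which every vertex appears as a key mapped to a list of its sons. A Python dictionary plays the role of the hash table; `topological_sort` and the example data are illustrative.

```python
from collections import deque

def topological_sort(adj):
    """Return the vertices in a topological order; raise ValueError on a cycle."""
    # Step 1: count incoming edges
    indegree = {v: 0 for v in adj}
    for v in adj:
        for w in adj[v]:
            indegree[w] += 1
    # Step 2: start with all vertices of indegree 0
    queue = deque(v for v in adj if indegree[v] == 0)
    order = []
    # Step 3: output a vertex, then update the indegrees of its sons
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                queue.append(w)
    if len(order) != len(adj):          # some vertex was never enqueued
        raise ValueError("graph has a cycle")
    return order

adj = {"shirt": ["tie"], "tie": ["jacket"],
       "trousers": ["shoes", "jacket"], "shoes": [], "jacket": []}
order = topological_sort(adj)
print(order)
```

Every edge is examined once and every vertex is enqueued at most once, which matches the linear time bound stated above.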

Page 21: Graphs and Finding your way in the wilderness

Unweighted Single-Source Shortest Path Algorithm

• Input: Directed graph and start node S

• Output: the minimum cost, in terms of number of edges traversed, from start node to all other nodes.

• Idea: do a level order search (breadth-first search)

• We’ll use a hashtable to mark elements as “seen”,

– i.e. we’ll track vertices that we’ve visited

– use hashtable to hold this information by “marking” vertices that have been visited

• We’ll use a queue to store the vertices to be “opened”

– to open a node means to consider its sons

• Each vertex will have a field for distance to start node.

Page 22: Graphs and Finding your way in the wilderness

Shortest Path Algorithm

Set distance from S to S to 0

mark S as seen (enter in hashtable)

queue <- S

while (queue is not empty)

    vertex <- queue.dequeue()

    sons <- vertex.sons()

    newSons <- sons that are unmarked

    mark newSons as seen and fill in their distance (distance to vertex + 1)

    queue.enqueue(newSons)

Essentially, breadth-first search
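A compact sketch of this algorithm, assuming a dictionary from each vertex to a list of its sons. The `dist` dictionary serves as both the hash table of seen vertices and the distance field; `unweighted_distances` is a hypothetical name.

```python
from collections import deque

def unweighted_distances(adj, source):
    """Fewest-edge distance from source to every reachable vertex."""
    dist = {source: 0}                  # presence in dist means "seen"
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:           # only new sons
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

adj = {"S": ["A", "B"], "A": ["B", "C"], "B": ["C"], "C": ["S"], "D": ["S"]}
print(unweighted_distances(adj, "S"))  # {'S': 0, 'A': 1, 'B': 1, 'C': 2}
```

Vertex D does not appear in the result because it cannot be reached from S.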

Page 23: Graphs and Finding your way in the wilderness

Discussion

• Will this terminate?

– Will we ever revisit a node?

• Computational cost?

– O(|V| + |E|)

• What are the memory requirements?

– Suppose we don’t count graph (virtual graphs)

– Can we bound queue?

– Only by O(|E|), and for an m-ary tree this is exponential in the depth.

• Did we need the hashtable?

– This avoids a linear search of the constructed graph

– remove a factor of O(|G|).

Page 24: Graphs and Finding your way in the wilderness

Positive-Weighted Single-source Cheapest Path

• Suppose we have positive costs associated with every edge in a directed graph.

• Problem: Find the cheapest path (total cost) from a given vertex S to every vertex.

• Solution: Dijkstra’s algorithm

• BFS idea still works, with slight modifications

– Replace Queue by Priority queue.

– Replace newSons by betterSons.

– Replace add entry by update entry (which may be an add)

– Note: may reopen an old son (if it is reached again by a better path)

• Dense graphs: O(|V|^2), sparse graphs: O(|E|log|V|)

Page 25: Graphs and Finding your way in the wilderness

Weighted Shortest Path Pseudo-Code

Set distance from S to S to 0 (on node)

Priority Queue PQ <- S

while (PQ is not empty)

    vertex <- PQ.dequeue() … remove min

    mark vertex as visited (enter in hashtable)

    record cost to vertex

    sons <- vertex.sons()

    goodSons <- new sons OR old sons with better cost estimates

    PQ.enqueue(goodSons) … enqueue puts in proper order.
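A sketch of this pseudo-code using Python's `heapq` module as the priority queue. It assumes a dictionary from each vertex to a list of (son, cost) pairs with positive costs. Rather than updating an entry in place, this version pushes a second entry and skips stale ones when they are popped, which is a common substitute for an update operation.

```python
import heapq

def dijkstra(adj, source):
    """Cheapest total cost from source to every reachable vertex (positive costs)."""
    dist = {source: 0}
    done = set()                        # vertices whose cost is final
    pq = [(0, source)]
    while pq:
        d, v = heapq.heappop(pq)        # remove min
        if v in done:
            continue                    # stale entry for an already finished vertex
        done.add(v)
        for w, cost in adj[v]:
            nd = d + cost
            if w not in dist or nd < dist[w]:   # new son, or old son with better cost
                dist[w] = nd
                heapq.heappush(pq, (nd, w))
    return dist

adj = {"S": [("A", 4), ("B", 1)], "B": [("A", 2), ("C", 5)],
       "A": [("C", 1)], "C": []}
print(dijkstra(adj, "S"))  # {'S': 0, 'A': 3, 'B': 1, 'C': 4}
```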

Page 26: Graphs and Finding your way in the wilderness

Graphs with negative edge costs

• Dijkstra doesn’t work: a vertex that has already been removed from the priority queue may later be reached at lower cost through a negative edge, and negative cost cycles can lower the cost without bound.

• Input: Directed graph with arbitrary edge costs and start vertex S.

• Output: Minimum cost from S to every vertex OR a report that the graph has a negative cost cycle.

• Note: If no negative cost cycles, then a vertex can be visited (expanded) at most |V| times.

• Algorithm: Add a counter to each vertex; each time the vertex is visited with a lower cost, the counter goes up. If a counter exceeds |V|, the graph has a negative cost cycle and we exit. Otherwise the queue eventually becomes empty and the recorded costs are the minimum costs.
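The counter idea can be sketched with an ordinary queue. The code assumes a dictionary from each vertex to a list of (son, cost) pairs with every vertex present as a key; it counts how often each vertex is expanded and returns None when a counter exceeds |V|. The name `shortest_paths_negative` and the `queued` set are illustrative details.

```python
from collections import deque

def shortest_paths_negative(adj, source):
    """Minimum costs from source, or None if a negative cost cycle is reachable."""
    n = len(adj)
    dist = {source: 0}
    count = {v: 0 for v in adj}         # times each vertex has been expanded
    queue = deque([source])
    queued = {source}                   # avoid holding a vertex in the queue twice
    while queue:
        v = queue.popleft()
        queued.discard(v)
        count[v] += 1
        if count[v] > n:
            return None                 # negative cost cycle
        for w, cost in adj[v]:
            nd = dist[v] + cost
            if w not in dist or nd < dist[w]:   # reached with a lower cost
                dist[w] = nd
                if w not in queued:
                    queue.append(w)
                    queued.add(w)
    return dist

ok = {"S": [("A", 2), ("B", 5)], "A": [("B", -4)], "B": [("C", 1)], "C": []}
bad = {"S": [("A", 1)], "A": [("B", -2)], "B": [("A", 1)]}
print(shortest_paths_negative(ok, "S"))   # {'S': 0, 'A': 2, 'B': -2, 'C': -1}
print(shortest_paths_negative(bad, "S"))  # None
```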

Page 27: Graphs and Finding your way in the wilderness

Weighted Single-Source shortest-path problems for Acyclic graphs

• Easy since no cycles

• Edge costs may be positive or negative

• Best-first search works

• Node may be reentrant

• A reentrant node may require updating its cost.

• Or apply topological sorting algorithm. (text)


Page 28: Graphs and Finding your way in the wilderness

Algorithm Display

• Idea:

– Iterative use of breadth first search

– addition of edges to effect other choices

• See Diagrams provided

• Analysis: (requires augmented path be cheapest)

– Runs in linear time

Page 29: Graphs and Finding your way in the wilderness

Minimum Spanning Tree

• Given: an undirected connected graph with edge costs

• Output: a subgraph that is a tree such that

– contains all vertices

– sum of costs of edges is minimum

• If costs not given, assume 1. What then?

– Note all spanning trees have same number of edges

• Application:

– Is undirected Graph with n vertices connected?

• IFF minimal spanning tree has n-1 edges.

Page 30: Graphs and Finding your way in the wilderness

Prim’s Algorithm

• Let G be given as (V,E) where V has n vertices

• Let T = empty

• Algorithm Idea: grow cheapest tree

• Choose a random v to start and add to T

• Repeat (until T has n vertices)

– select edge of minimum length that does not form a cycle and that attaches to current tree (how to check?)

– add edge to T

• The proof is more difficult than the code.

• Complexity depends on G and code

• O(V^2) for dense graphs

• O(E*log(V)) for sparse graphs (use binary heap)
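A sketch of the binary heap version, assuming a connected undirected graph stored as a dictionary from each vertex to a list of (neighbour, cost) pairs, with each edge listed in both directions. The cycle check is simply whether the far end of the edge is already in the tree. `prim` and the example graph are assumptions for illustration.

```python
import heapq

def prim(adj, start):
    """Return the edges (v, w, cost) of a minimum spanning tree grown from start."""
    in_tree = {start}
    tree_edges = []
    frontier = [(cost, start, w) for w, cost in adj[start]]
    heapq.heapify(frontier)             # candidate edges leaving the tree
    while frontier and len(in_tree) < len(adj):
        cost, v, w = heapq.heappop(frontier)    # cheapest candidate edge
        if w in in_tree:
            continue                    # both ends in the tree: would form a cycle
        in_tree.add(w)
        tree_edges.append((v, w, cost))
        for x, c in adj[w]:
            if x not in in_tree:
                heapq.heappush(frontier, (c, w, x))
    return tree_edges

adj = {"a": [("b", 1), ("c", 3)], "b": [("a", 1), ("c", 2)],
       "c": [("a", 3), ("b", 2), ("d", 4)], "d": [("c", 4)]}
print(prim(adj, "a"))  # [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 4)]
```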

Page 31: Graphs and Finding your way in the wilderness

Kruskal’s Algorithm

• Given graph G = (V,E)

• Sort edges on the basis of cost.

• Add least cost edge to Forest, as long as no cycle is formed.

• Cost of cycle checking is?

– If implemented with an adjacency list, O(E^2)

– If implemented with a hash table, O(1)

• Proof more difficult.

• Time complexity: O(E log E)
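A sketch of Kruskal's algorithm. Note the substitution: for the cycle check it uses a union-find (disjoint set) structure kept in a dictionary, a standard choice that is not named on the slide. Edges are assumed to be (cost, v, w) triples so that sorting orders them by cost.

```python
def kruskal(vertices, edges):
    """Return minimum spanning forest edges (v, w, cost) from (cost, v, w) triples."""
    parent = {v: v for v in vertices}   # union-find: each vertex starts as its own root

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving keeps trees shallow
            v = parent[v]
        return v

    forest = []
    for cost, v, w in sorted(edges):    # least cost edge first
        rv, rw = find(v), find(w)
        if rv != rw:                    # different components: no cycle is formed
            parent[rv] = rw
            forest.append((v, w, cost))
    return forest

edges = [(3, "a", "c"), (1, "a", "b"), (4, "c", "d"), (2, "b", "c")]
print(kruskal(["a", "b", "c", "d"], edges))
# [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 4)]
```

The sort dominates the running time, which is consistent with the O(E log E) bound above.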

Page 32: Graphs and Finding your way in the wilderness

Finding the least cost between pairs of points

• Idea: Dynamic programming

• Let c[i][j] be the edge cost between vi and vj.

• Define C[i][j] as minimum cost for going from vi to vj.

• Finding the subproblems

• Suppose P is the path from vi to vj which realizes the minimum cost and vk is an intermediary node.

• Then the subpaths from i to k and from k to j must be optimal, otherwise P would not be optimal.

• Now define D[i][k][j] as the minimum cost for going from vi to vj using any of v1,v2..,vk as an intermediary.

• Define D[i][j] as the minimum cost for going from i to j.

• D[i][j] = min over k of D[i][k][j] and c[i][j].

Page 33: Graphs and Finding your way in the wilderness

All-Pairs (Floyd’s)Pseudo-Code

• Initialization:

– D[i][0][j] = cost(i,j) for all vertices i, j … O(|V|^2)

• Recurrence:

– D[i][k+1][j] = min(D[i][k][j], D[i][k][k+1] + D[k+1][k][j])

– This is true since the shortest path from vi to vj using {v1,…,vk+1} either doesn’t use vk+1, or divides into a path from vi to vk+1 and one from vk+1 to vj.

• The cost of this is O(|V|^3) - i.e. single loop over all vertices with |V|^2 per loop.
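The recurrence can be sketched with a single two-dimensional table that is updated in place, a common space-saving form of the three-index definition. The code assumes a cost matrix with 0 on the diagonal, `INF` for missing edges, and no negative cost cycles; `floyd` and the example matrix are illustrative.

```python
INF = float("inf")  # stands for "no edge"

def floyd(cost):
    """All-pairs minimum costs from a cost matrix."""
    n = len(cost)
    d = [row[:] for row in cost]        # k = 0: direct edges only
    for k in range(n):                  # now also allow vertex k as an intermediary
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

cost = [[0,   3,   INF],
        [INF, 0,   1],
        [2,   INF, 0]]
print(floyd(cost))  # [[0, 3, 4], [3, 0, 1], [2, 5, 0]]
```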

Page 34: Graphs and Finding your way in the wilderness

NP vs P

• Multiple ways to define

• Define new computational model (Imaginary)

– add to programming language

• choose S1, S2,….Sn; where Si are statements

– Semantics: algorithm always chooses best Si to execute.

• This is the Non-Deterministic model

• If a problem can be solved in polynomial time on the non-deterministic model, it is in the class NP.

• If a problem can be solved in polynomial time on a standard (deterministic) computer, it is in the class P.

• Unsolved (and possibly unsolvable): does NP = P?

Page 35: Graphs and Finding your way in the wilderness

NP-Completeness

• A problem is in NP if it can be solved in polynomial time on a non-deterministic machine.

• A problem p* is NP-hard if every problem in NP can be polynomially reduced to p*; it is NP-complete if, in addition, p* is itself in NP.

• A problem P1 can be polynomially reduced to P2 if P1 can be solved in polynomial time assuming that P2 can be solved in polynomial time.

• Alternatively: P1 can be transformed into P2 and solutions of P2 mapped back to P1, where all the transformations take polynomial time.

• This is a way of forming a taxonomy of difficulty of various problems

Page 36: Graphs and Finding your way in the wilderness

NP-Complete problems

• Boolean Satisfiability

• Traveling Salesman

• Bin Packing: given packages of size a[1]…a[n] and bins of size k, what is the fewest number of bins needed to store all the packages?

• Scheduling: Given tasks that take times t[1]…t[n] and k processors, what is the minimum completion time?

• Graph: Given a graph find the clique of maximum size.

– A clique is a completely connected subgraph.

• Subset-sum: Given a finite set S of n numbers and a target number t, does some subset of S sum to t.

• Vertex Cover: A vertex cover is a subset of vertices which hits every edge. The problem is to find a cover with the fewest number of vertices.