Data structures and complexity

Page 1: Data structures and complexity

Data structures and complexity

Page 2: Data structures and complexity

Complexity
• Computational complexity refers to how much computing is required to solve different problems.
• Spatial complexity refers to how much memory is required to solve different problems.
• Choose the right algorithm and the right data structure and your code could run in seconds. Choose the wrong algorithm or the wrong data structure and your code could run for days.

Page 3: Data structures and complexity

Search: I’m thinking of a word...
• Given a finite list of words, how do you find out which one I’m thinking of?
• Ground rules
  • The word has to be in a dictionary, e.g. a dictionary with 60,000 words.
  • You can only ask me questions with YES/NO answers.
• Sequential search
  • Inspect every element and check to see if it’s the one you are looking for. The amount of “effort” is proportional to the length of the list, e.g. worst case: 60,000 questions; average case: 30,000 questions.
• Binary search
  • Ask questions that cut the number of possibilities in half. The amount of effort is proportional to the logarithm (base 2) of the length of the list, i.e. about 16 questions for 60,000 words.
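The halving strategy can be sketched in Perl as follows. This is a minimal sketch, not code from the course: the subroutine name find_word and the stand-in dictionary of 60,000 generated strings are assumptions made for illustration, and the list must already be sorted. Each pass through the loop corresponds to one YES/NO question.

```perl
use strict;
use warnings;

# Locate $target in a sorted list by repeatedly halving the candidate range.
# Returns the index of the word and the number of YES/NO questions asked.
# Assumes (as in the ground rules) that the word is in the list.
sub find_word {
    my ($words, $target) = @_;            # $words: reference to a sorted array
    my ($lo, $hi) = (0, $#{$words});
    my $questions = 0;
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        $questions++;                     # "Is your word in the first half?"
        if ($target le $words->[$mid]) { $hi = $mid }       # yes
        else                           { $lo = $mid + 1 }   # no
    }
    return ($lo, $questions);
}

# Hypothetical stand-in for a 60,000-word dictionary (already in sorted order).
my @dictionary = map { sprintf 'word%05d', $_ } 0 .. 59_999;

my ($index, $questions) = find_word(\@dictionary, 'word41234');
print "found at index $index after $questions questions\n";
```

Because every question halves the range, the count never exceeds 16 for 60,000 entries, whichever word is chosen.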

Page 4: Data structures and complexity

Binary search

[Figure: decision tree. The set of 60,000 words is split into two halves of 30,000 ("In 1st ½ of set?" yes/no); each half is split into subsets of 15,000 ("In 1st ½ of subset?" yes/no), and so on.]

~16 questions needed to get down to a subset with 1 word

Page 5: Data structures and complexity

Queues
• A queue stores items in FIFO (first-in first-out) order.
• It returns them in the same order that they are entered, like a line of people at a cashier.
• Useful for letting one chunk of code collect (or generate) items to be processed, while a separate chunk of code does the actual processing.
  • mouse clicks
  • internet TCP/IP packets
• Terminology
  • Enqueue -- get in line
  • Dequeue -- get out of line (reach the cashier)
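In Perl an ordinary array can act as a queue: push enqueues at the back and shift dequeues from the front. The sketch below is illustrative only; the event strings are made up.

```perl
use strict;
use warnings;

my @queue;

# Producer: enqueue events at the back of the line as they arrive.
push @queue, $_ for ('click A', 'click B', 'click C');

# Consumer: dequeue from the front, so items leave in arrival order.
my @processed;
while (defined(my $event = shift @queue)) {
    push @processed, $event;
}

print join(', ', @processed), "\n";   # prints click A, click B, click C
```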

Page 6: Data structures and complexity

Stacks
• A stack stores items in LIFO (last-in first-out) order.
• It returns them in the reverse of the order in which they were entered, like plates taken from a stack in a cafeteria.
• Useful when operations need to be broken down into sub-operations that are executed in sequence (especially recursive operations).
  • file search in file system
  • parsing
• Terminology
  • push -- put a plate on the stack
  • pop -- remove a plate from the stack
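A Perl array also works as a stack, using the built-in push and pop. The sketch below first shows the LIFO order, then uses an explicit stack to search a nested structure depth-first, in the spirit of the file-search example. The find_path subroutine and the in-memory "file system" hash are hypothetical, invented for this illustration.

```perl
use strict;
use warnings;

# LIFO order: the last plate pushed is the first one popped.
my @stack;
push @stack, $_ for ('plate 1', 'plate 2', 'plate 3');
my @removed;
push @removed, pop @stack while @stack;
print join(', ', @removed), "\n";   # prints plate 3, plate 2, plate 1

# Depth-first search with an explicit stack of [node, path] pairs.
sub find_path {
    my ($root, $wanted) = @_;
    my @todo = ([ $root, $root->{name} ]);
    while (@todo) {
        my ($node, $path) = @{ pop @todo };
        return $path if $node->{name} eq $wanted;
        # Sub-operations: every child is pushed and examined later.
        push @todo, map { [ $_, "$path/$_->{name}" ] } @{ $node->{children} || [] };
    }
    return;   # not found
}

# Hypothetical in-memory directory tree.
my $fs = {
    name     => 'root',
    children => [
        { name => 'a', children => [ { name => 'target.txt' } ] },
        { name => 'b' },
    ],
};
print find_path($fs, 'target.txt'), "\n";   # prints root/a/target.txt
```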

Page 7: Data structures and complexity

Stacks: example
• Infix notation: ((1+2)*4)+3
• Postfix notation: 1, 2, +, 4, *, 3, +
• Evaluate postfix expressions with a stack
  • 1. if operand, push onto stack
  • 2. if operator, pop, pop, evaluate, push result

Input   Operations             Stack
1       Push                   (1)
2       Push                   (2,1)
+       Pop, Pop, Add, Push    (3)
4       Push                   (4,3)
*       Pop, Pop, Mul, Push    (12)
3       Push                   (3,12)
+       Pop, Pop, Add, Push    (15)
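The two evaluation rules translate almost directly into Perl. This is a minimal sketch: the name eval_postfix and the choice of four supported operators are assumptions, and the tokens are passed in as a list that has already been split.

```perl
use strict;
use warnings;

# Binary operators supported by this sketch.
my %op = (
    '+' => sub { $_[0] + $_[1] },
    '-' => sub { $_[0] - $_[1] },
    '*' => sub { $_[0] * $_[1] },
    '/' => sub { $_[0] / $_[1] },
);

sub eval_postfix {
    my @stack;
    for my $token (@_) {
        if ($token =~ /^-?\d+(?:\.\d+)?$/) {
            push @stack, $token;                  # rule 1: operand, push
        }
        else {
            my $fn = $op{$token} or die "unknown operator '$token'\n";
            my $right = pop @stack;               # rule 2: pop, pop,
            my $left  = pop @stack;
            die "malformed expression\n" unless defined $left;
            push @stack, $fn->($left, $right);    # evaluate, push result
        }
    }
    die "malformed expression\n" unless @stack == 1;
    return $stack[0];
}

print eval_postfix(qw(1 2 + 4 * 3 +)), "\n";   # prints 15
```

Note that the first value popped is the right-hand operand, which matters for subtraction and division.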

Page 8: Data structures and complexity

More complex data structures

Page 9: Data structures and complexity

Recursive data structures
• Example: Binary trees

[Figure: a binary tree of seven nodes, each holding a value plus left and right links. The top node is the root, the intermediate nodes are branches, and the bottom nodes are leaves.]

Page 10: Data structures and complexity

Creating a new node
• Represent each node by a hash with three keys: ‘LEFT’, ‘RIGHT’, and ‘VALUE’.
  • The ‘VALUE’ will contain the content of the node.
  • The values of ‘LEFT’ and ‘RIGHT’ are references to the child nodes (i.e. more hashes).
• Here is a subroutine that returns a reference to a node data structure. The argument of the subroutine is the value.

[Figure: a single node with a value and left/right links]

sub newNode {
    return {
        'VALUE' => shift,    # content of the node
        'LEFT'  => undef,    # no children yet
        'RIGHT' => undef
    };
}

Page 11: Data structures and complexity

Attaching a node

$root_ref->{LEFT} = $someNode_ref;

[Figure: the node referenced by $root_ref, with its left link pointing to the node referenced by $someNode_ref]
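Putting node creation and attachment together gives a two-node tree. This sketch repeats the newNode subroutine from the previous slide so that it runs on its own; the values 'M' and 'C' are arbitrary.

```perl
use strict;
use warnings;

sub newNode {
    return { VALUE => shift, LEFT => undef, RIGHT => undef };
}

my $root_ref     = newNode('M');
my $someNode_ref = newNode('C');

# Attach: the parent's LEFT slot now holds a reference to the child hash.
$root_ref->{LEFT} = $someNode_ref;

print $root_ref->{LEFT}{VALUE}, "\n";   # prints C
print defined $root_ref->{RIGHT} ? "has" : "no", " right child\n";   # prints no right child
```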

Page 12: Data structures and complexity

Trees: in-order traversal

traverse($theTree);

sub traverse {
    my($tree) = @_;
    if(!defined($tree)) { return; }        # if no node
    traverse($tree->{LEFT});
    processTheNode($tree->{VALUE});        # e.g. print value
    traverse($tree->{RIGHT});
}

Page 13: Data structures and complexity

Trees: insertion

sub insert {    # -- recursively builds the tree
    my($tree, $val) = @_;
    if(!$tree) {    # no node exists so create one
        $_[0] = newNode($val);    # $_[0] aliases the caller's variable or hash slot
        return;
    }
    else {          # a node exists, so insert
        if($tree->{VALUE} > $val)
            { insert($tree->{LEFT}, $val) }
        elsif($tree->{VALUE} < $val)
            { insert($tree->{RIGHT}, $val) }
        else
            { warn "dup insert of $val\n" if 0 }    # duplicate: warning disabled
    }
}
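A short end-to-end check shows why in-order traversal is useful: inserting values in any order and then traversing yields them sorted. The sketch restates newNode, insert, and traverse so it runs on its own. One change is an assumption of this sketch rather than the slides' code: traverse takes a callback instead of calling processTheNode.

```perl
use strict;
use warnings;

sub newNode {
    return { VALUE => shift, LEFT => undef, RIGHT => undef };
}

sub insert {
    my ($tree, $val) = @_;
    if (!$tree) {                 # empty slot: create the node in place
        $_[0] = newNode($val);    # $_[0] aliases the caller's variable or hash slot
        return;
    }
    if    ($val < $tree->{VALUE}) { insert($tree->{LEFT},  $val) }
    elsif ($val > $tree->{VALUE}) { insert($tree->{RIGHT}, $val) }
    # equal values are duplicates and are silently ignored
}

sub traverse {
    my ($tree, $visit) = @_;
    return unless defined $tree;
    traverse($tree->{LEFT}, $visit);     # everything smaller first
    $visit->($tree->{VALUE});            # then the node itself
    traverse($tree->{RIGHT}, $visit);    # then everything larger
}

my $root;
insert($root, $_) for (50, 30, 70, 20, 40, 60, 80, 30);

my @seen;
traverse($root, sub { push @seen, $_[0] });
print "@seen\n";   # prints 20 30 40 50 60 70 80
```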

Page 14: Data structures and complexity

Reminders
• Code examples in Readonly directory on Pinedalab
• Project coming up...

Page 15: Data structures and complexity

Complexity

Page 16: Data structures and complexity

Example: Search
• Given a list of ordered values, how do we find one? e.g.
  • Numbers in a list
  • Words in a dictionary
• The complexity depends on the data structure used to represent the set of objects and on the algorithm used to process the data structure.

Page 17: Data structures and complexity

Big-O notation
• Big-O notation is a way to express the asymptotic time-complexity of a computer algorithm.
  • O(1) constant
  • O(log(n)) logarithmic
  • O(n) linear
  • O(n^2) quadratic
  • O(n^c) polynomial
  • O(c^n) exponential
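To get a feel for how differently these classes grow, the following sketch tabulates a few of them for small n. The subroutine name growth is made up, and the constant c is assumed to be 2 for both the polynomial and the exponential case.

```perl
use strict;
use warnings;

# Value of each growth function at a given n (c = 2 assumed).
sub growth {
    my ($n) = @_;
    return (
        constant    => 1,
        logarithmic => log($n) / log(2),   # log base 2
        linear      => $n,
        quadratic   => $n ** 2,
        exponential => 2 ** $n,
    );
}

for my $n (10, 20, 40) {
    my %g = growth($n);
    printf "n=%-3d log2(n)=%.1f  n^2=%d  2^n=%.0f\n",
        $n, @g{qw(logarithmic quadratic exponential)};
}
```

Doubling n adds one to the logarithm, quadruples the quadratic term, and squares the exponential term.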

Page 18: Data structures and complexity

Linear search

• Represent the set of objects as a list and then sequentially search the list.
• Space complexity is proportional to the number of objects.
• Time complexity is proportional to the number of objects, i.e. O(n).
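A sequential search is a single loop over the list. In this minimal sketch the name linear_search and the sample words are illustrative; in the worst case the loop inspects all n elements.

```perl
use strict;
use warnings;

# Return the index of the first element equal to $target, or undef if absent.
sub linear_search {
    my ($items, $target) = @_;
    for my $i (0 .. $#{$items}) {
        return $i if $items->[$i] eq $target;   # one comparison per element
    }
    return;   # inspected every element without a match
}

my @words = qw(ant bee cat dog);
print linear_search(\@words, 'cat'), "\n";   # prints 2
```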

Page 19: Data structures and complexity

Binary Search

• Represent the set of objects as a binary tree and search the tree.
• Space complexity is proportional to the number of objects.
• Time complexity?

Page 20: Data structures and complexity

Binary search
• Represent the set of objects as a binary tree and search the tree

sub lookup {
    my($tree, $value) = @_;
    if(!$tree) { return; }
    elsif($tree->{VALUE} == $value) { return $tree; }
    elsif($value < $tree->{VALUE}) { return lookup($tree->{LEFT}, $value) }
    else { return lookup($tree->{RIGHT}, $value) }
}

Page 21: Data structures and complexity

Binary search
• The search time depends on how deeply in the tree you have to go to find the object.
• The depth of the tree depends on how it was constructed.
  • Worst case: input was presorted. depth = n, complexity: O(n)
  • Best case: tree is balanced. depth = log(n), complexity: O(log(n))
  • Average case: if the input is random then it can be shown that the expected depth is proportional to log(n), complexity: O(log(n))
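The effect of insertion order on depth can be checked directly. This sketch builds two trees from the same 63 values, once in sorted order and once in an order that always inserts the middle value first. The helper names depth and middle_first are assumptions of this sketch; newNode and insert are restated so that it runs on its own.

```perl
use strict;
use warnings;

sub newNode {
    return { VALUE => shift, LEFT => undef, RIGHT => undef };
}

sub insert {
    my ($tree, $val) = @_;
    if (!$tree) { $_[0] = newNode($val); return; }
    if    ($val < $tree->{VALUE}) { insert($tree->{LEFT},  $val) }
    elsif ($val > $tree->{VALUE}) { insert($tree->{RIGHT}, $val) }
}

# Number of nodes on the longest path from this node down to a leaf.
sub depth {
    my ($tree) = @_;
    return 0 unless $tree;
    my ($l, $r) = (depth($tree->{LEFT}), depth($tree->{RIGHT}));
    return 1 + ($l > $r ? $l : $r);
}

# Reorder sorted values so that the middle one always comes first.
sub middle_first {
    my @v = @_;
    return () unless @v;
    my $mid = int(@v / 2);
    return ($v[$mid],
            middle_first(@v[0 .. $mid - 1]),
            middle_first(@v[$mid + 1 .. $#v]));
}

my ($sorted, $balanced);
insert($sorted,   $_) for 1 .. 63;
insert($balanced, $_) for middle_first(1 .. 63);

printf "presorted input:    depth %d\n", depth($sorted);     # depth 63
printf "middle-first input: depth %d\n", depth($balanced);   # depth 6
```

With 63 values the presorted tree degenerates into a chain, while the balanced tree needs only 6 levels, since 2^6 - 1 = 63.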

Page 22: Data structures and complexity

Hash table search
• Calculate a number from the key
  • Performed by a hash function
• Use the number to index into an array
• If more than one key hashes to the same index (a collision)
  • Maintain a list of keys that resolve to the same hash value
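Perl's built-in hashes already do this internally, but the mechanism can be sketched by hand. Everything below is a toy built on assumptions: the table size of 8, the character-sum hash function, and the names hash_index, ht_store, and ht_fetch are invented for illustration and are not how Perl itself hashes keys.

```perl
use strict;
use warnings;

my $NUM_BUCKETS = 8;   # assumed table size

# Toy hash function: sum of character codes, reduced to an array index.
sub hash_index {
    my ($key) = @_;
    my $sum = 0;
    $sum += ord($_) for split //, $key;
    return $sum % $NUM_BUCKETS;
}

# Store a key/value pair; each array slot holds a list of [key, value] pairs.
sub ht_store {
    my ($table, $key, $value) = @_;
    my $bucket = $table->[ hash_index($key) ] ||= [];
    for my $pair (@$bucket) {
        if ($pair->[0] eq $key) { $pair->[1] = $value; return; }   # overwrite
    }
    push @$bucket, [ $key, $value ];
}

# Fetch a value: index into the array, then scan the short collision list.
sub ht_fetch {
    my ($table, $key) = @_;
    my $bucket = $table->[ hash_index($key) ] or return;
    for my $pair (@$bucket) {
        return $pair->[1] if $pair->[0] eq $key;
    }
    return;
}

my @table;
ht_store(\@table, 'tab', 1);
ht_store(\@table, 'bat', 2);   # same letters, same sum: a collision with 'tab'
print ht_fetch(\@table, 'tab'), ' ', ht_fetch(\@table, 'bat'), "\n";   # prints 1 2
```

As long as the collision lists stay short, a lookup costs one hash computation plus a few comparisons, regardless of how many keys are stored.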

Page 23: Data structures and complexity

NP-Completeness
• A problem is tractable if some algorithm exists that always solves the problem in a time that is proportional to some power of the length of the input. Such problems are said to be solvable in polynomial time. (Of course if the power is 50, then the problem is practically intractable.)
• A problem with no polynomial-time algorithm is said to be intractable.
• In the 1970s the class of NP-complete problems was defined. NP stands for nondeterministic polynomial time. This class of problems has no known polynomial-time algorithms.

Page 24: Data structures and complexity

Salient properties of NP-complete problems

• No NP-complete problem has been proven to be solvable in polynomial time.
• No NP-complete problem has been proven to be unsolvable in polynomial time.
• All NP-complete problems are computationally equivalent in the following sense:
  • If any polynomial-time algorithm can be found to solve any NP-complete problem, then every NP-complete problem can be solved by some polynomial-time algorithm.
• Since so many computer scientists and mathematicians have tried unsuccessfully to solve so many NP-complete problems, it is widely believed that no polynomial-time algorithms exist for these problems -- but this hasn’t been proven either.

Page 25: Data structures and complexity

The harsh realities of life:
• Most problems of interest in bioinformatics and computational biology are NP-complete