Introduction Pizzanu Kanongchaiyos, Ph.D. pizzanu@cp.eng.chula.ac.th

Preview:

Citation preview

What is Algorithm?• A detailed sequence of actions to perform to

accomplish some task. Named after an Iranian mathematician, Al-Khawarizmi.

• algorithm[N] ชุ�ดของคำ�สั่��งที่��สั่ร้�งไว้�ตมข��นตอน• algorithm[N] ลำ�ด�บข��นตอนที่��แน�นอนซึ่��งใชุ�ในกร้แก�ปั ญห• algorithm(แอลำ' กะร้$ธธ�ม) n. ร้ะบบกฏเกณฑ์*ในกร้แก�ปั ญหของ

จำ�นว้นที่��แน�นอนที่งคำณ$ตศสั่ตร้* เชุ�นในกร้หคำ�ของต�ว้หร้ร้�ว้มที่��สั่�ด., • S. algorism, -algorithmic adj. ข��นตอนว้$ธ�อ�ลำกอร้$ที่�มหมยถึ�ง

กร้ว้$เคำร้ะห*แยกแยะว้$ธ�กร้ที่�งนให�เปั/นข��นเปั/นตอนโดยก�หนดให� เร้�ยง ก�นไปัตมลำ�ด�บ กร้เข�ยนโปัร้แกร้มในย�คำแร้ก ๆ น��น ผู้3�เข�ยนโปัร้แกร้มจำะ

ต�องมองเห4นข��นตอนในกร้แก�ปั ญหอย�งแจำ�มชุ�ด เสั่�ยก�อน จำ�งจำะเข�ยนโปัร้แกร้มได�

So you want to be a computer scientist?

Is your goal to be a mundane programmer?

Or a great leader and thinker?

Original Thinking

Boss assigns task:

– Given today’s prices of pork, grain, sawdust, …– Given constraints on what constitutes a hotdog.– Make the cheapest hotdog.

Everyday industry asks these questions.

• Um? Tell me what to code.

With more suffocated software engineering systems,the demand for mundane programmers will diminish.

Your answer:

Your answer:• I learned this great algorithm that

will work.

Soon all known algorithms will be available in libraries.

Your answer:

• I can develop a new algorithm for you.

Great thinkers will always be needed.

– Content: An up to date grasp of fundamental problems and solutions

– Method: Principles and techniques to solve the vast array of unfamiliar problems that arise in a rapidly changing field

The future belongs to the computer scientist who

has

Rudich www.discretemath.com

Course Content

• A list of algoirthms. – Learn their code.– Trace them until you are convenced

that they work.– Impliment them.

class InsertionSortAlgorithm extends SortAlgorithm {

void sort(int a[]) throws Exception {

for (int i = 1; i < a.length; i++) {

int j = i;

int B = a[i];

while ((j > 0) && (a[j-1] > B)) {

a[j] = a[j-1];

j--; }

a[j] = B;

}}

Course Content• A survey of algorithmic design techniques.• Abstract thinking.• How to develop new algorithms for any

problem that may arise.

Study:

• Many experienced programmers were asked to code up binary search.

Study:

• Many experienced programmers were asked to code up binary search.

80% got it wrong

Good thing is was not for a nuclear power plant.

What did they lack?

What did they lack?

• Formal proof methods?

What did they lack?

• Formal proof methods?

Yes, likely

Industry is starting to realize that formal methods

are important.

But even without formal methods …. ?

What did they lack?• Fundamental understanding of the

algorithmic design techniques.• Abstract thinking.

Course Content

Notations, analogies, and abstractions

for developing, thinking about,

and describing algorithms so correctness is transparent

A survey of fundamental ideas

and algorithmic design techniques

For example . . .

Start With Some Math

Input Size

Tim

eClassifying Functions

f(i) = n(n)

Recurrence Relations

T(n) = a T(n/b) + f(n)

Adding Made Easy∑i=1 f(i).

Time Complexityt(n) = (n2)

Iterative Algorithms Loop Invariants

i-1 i

ii0

T+1<preCond> codeA loop <loop-invariant> exit when <exit Cond> codeBcodeC<postCond>

9 km

5 km

Code Relay RaceOne step at a time

Recursive Algorithms

?

?

Graph Search Algorithms

Network Flows

Greedy Algorithms

Recursive Back Tracking

Dynamic Programing

Reduction

=

Rudich www.discretemath.com

Useful Learning Techniques

Read Ahead

You are expected to read the lecture notes before the lecture.

This will facilitate more productive discussion during class.

Explaining

• We are going to test you on your ability to explain the material.

• Hence, the best way of studying is to explain the material over and over again out loud to yourself, to each other, and to your stuffed bear.

Day Dream

Mathematics is not all linear thinking.

Allow the essence of the material to seep

into your subconscious

Pursue ideas that percolate up and

flashes of inspiration that appear.

While going along with your day

Be Creative

•Ask questions. • Why is it done this way and not that way?

Guesses and Counter Examples

• Guess at potential algorithms for solving a problem.

• Look for input instances for which your algorithm gives the wrong answer.

Refinement:The best solution comes from a process of repeatedly refining

and inventing alternative solutions

Grade School Revisited:How To Multiply Two

Numbers2 X 2 =

5

A Few Example A Few Example AlgorithmsAlgorithms

Complex Numbers•Remember how to multiply 2 complex numbers?

•(a+bi)(c+di) = [ac –bd] + [ad + bc] i

•Input: a,b,c,d Output: ac-bd, ad+bc

•If a real multiplication costs $1 and an addition cost a penny. What is the cheapest way to obtain the output from the input?

•Can you do better than $4.02?

Gauss’ $3.05 Method:Input: a,b,c,d Output: ac-bd,

ad+bc

• m1 = ac

• m2 = bd

• A1 = m1 – m2 = ac-bd

• m3 = (a+b)(c+d) = ac + ad + bc + bd

• A2 = m3 – m1 – m2 = ad+bc

Question:•The Gauss “hack” saves one multiplication out of four. It requires 25% less work.

•Could there be a context where performing 3 multiplications for every 4 provides a more dramatic savings?

Odette Bonzo

How to add 2 n-bit numbers.

**

**

**

**

**

**

**

**

**

**

**

+

How to add 2 n-bit numbers.

**

*

**

**

**

**

**

* **

**

**

**

**

+

How to add 2 n-bit numbers.

**

*

**

**

**

**

* **

* **

*

**

**

**

**

+

How to add 2 n-bit numbers.

**

*

**

**

**

* **

* **

*

* **

*

**

**

**

**

+

How to add 2 n-bit numbers.

**

*

**

**

* **

* **

*

* **

*

* **

*

**

**

**

**

+

How to add 2 n-bit numbers.

**

*

* **

*

* **

*

* **

*

* **

*

* **

*

* **

*

* **

*

* **

*

* **

*

***

*

+*

*

Time complexity of grade school addition

**

*

*

**

*

*

**

*

*

**

*

*

**

*

*

**

*

*

**

*

*

**

*

*

**

*

*

**

*

+*

*

*

**

*

T(n) = The amount of time grade school addition uses to add two n-bit numbers

= θ(n) = linear time.

On any reasonable computer adding 3 bits can be done in constant time.

# of bits in numbers

time

f = θ(n) means that f can be sandwiched between two lines

Please feel free to ask questions!

Rudich www.discretemath.com

Is there a faster way to add?

• QUESTION: Is there an algorithm to add two n-bit numbers whose time grows sub-linearly in n?

Any algorithm for addition must read all of the input

bits– Suppose there is a mystery algorithm that

does not examine each bit – Give the algorithm a pair of numbers. There

must be some unexamined bit position i in one of the numbers

– If the algorithm is not correct on the numbers, we found a bug

– If the algorithm is correct, flip the bit at position i and give the algorithm the new pair of numbers. It give the same answer as before so it must be wrong since the sum has changed

So any algorithm for addition must use time at least linear in

the size of the numbers.

Grade school addition is essentially as good as it can be.

How to multiply 2 n-bit numbers.

X* * * * * * * *

* * * * * * * *

* * * * * * * * * * * * * * *

* * * * * * * * * * * * * * * *

* * * * * * * * * * * * * * * *

* * * * * * * * * * * * * * * *

* * * * * * * * * * * * * * * * *

n2

How to multiply 2 nHow to multiply 2 n--bit numbers.bit numbers.

X* * * * * * * * * * * * * * * *

* * * * * * * ** * * * * * * *

* * * * * * * ** * * * * * * *

* * * * * * * ** * * * * * * *

* * * * * * * ** * * * * * * *

* * * * * * * * * * * * * * * *

n2

I get it! The total time is bounded by

cn2.

Grade School Addition: Linear timeGrade School Multiplication: Quadratic

time

• No matter how dramatic the difference in the constants the quadratic curve will eventually

dominate the linear curve

# of bits in numbers

time

Neat! We have demonstrated that as things scale multiplication is a

harder problem than addition.

Mathematical confirmation of our common sense.

Don’t jump to conclusions!We have argued that grade

school multiplication uses more time than grade school addition.

This is a comparison of the complexity of two algorithms.

To argue that multiplication is an inherently harder problem than addition we would have to show that no possible multiplication algorithm runs in linear time.

Grade School Addition: θ(n) timeGrade School Multiplication: θ(n2)

time

Is there a clever algorithm to multiply two numbers

in linear time?

Despite years of research, no one knows! If you resolve this question,

Chula will give you a PhD!

Is there a faster way to multiply two numbers

than the way you learned in grade school?

Divide And Conquer(an approach to faster

algorithms)

•DIVIDE a problem into smaller subproblems

•CONQUER them recursively

•GLUE the answers together so as to obtain the answer to the larger problem

Multiplication of 2 n-bit numbers• X =

• Y =

• X = a 2n/2 + b Y = c 2n/2 + d

• XY = ac 2n + (ad+bc) 2n/2 + bd

a b

c d

Multiplication of 2 n-bit numbers

• X = • Y = • XY = ac 2n + (ad+bc) 2n/2 + bd

a b

c d

MULT(X,Y):

If |X| = |Y| = 1 then RETURN XY

Break X into a;b and Y into c;d

RETURN

MULT(a,c) 2n + (MULT(a,d) + MULT(b,c)) 2n/2 + MULT(b,d)

Time required by MULT

• T(n) = time taken by MULT on two n-bit numbers

• What is T(n)?

What is its growth rate? Is it θ(n2)?

Recurrence Relation•T(1) = k for some constant k

•T(n) = 4 T(n/2) + k’ n + k’’ for some constants k’ and k’’

MULT(X,Y):

If |X| = |Y| = 1 then RETURN XY

Break X into a;b and Y into c;d

RETURN

MULT(a,c) 2n + (MULT(a,d) + MULT(b,c)) 2n/2 + MULT(b,d)

Let’s be concrete

•T(1) = 1•T(n) = 4 T(n/2) + n

•How do we unravel T(n) so that we can determine its growth rate?

Technique 1Guess and Verify

•Recurrence Relation: T(1) = 1 & T(n) = 4T(n/2) + n

•Guess: G(n) = 2n2 – n•Verify:Left Hand Side Right Hand Side

T(1) = 2(1)2 – 1

T(n)

= 2n2 – n

1

4T(n/2) + n

= 4 [2(n/2)2 – (n/2)] + n

= 2n2 – n

Technique 2: Decorate The Tree

• T(n) = n + 4 T(n/2)

n

T(n/2) T(n/2) T(n/2) T(n/2)

T(n) = n

T(n/2) T(n/2) T(n/2) T(n/2)

T(n) =

T(1)

•T(1) = 1

1=

n

T(n/2) T(n/2) T(n/2) T(n/2)

T(n) =

n

T(n/2) T(n/2) T(n/2)

T(n) =

n/2

T(n/4)T(n/4)T(n/4)T(n/4)

nT(n) =

n/2

T(n/4)T(n/4)T(n/4)T(n/4)

n/2

T(n/4)T(n/4)T(n/4)T(n/4)

n/2

T(n/4)T(n/4)T(n/4)T(n/4)

n/2

T(n/4)T(n/4)T(n/4)T(n/4)

nT(n) =

n/2 n/2 n/2n/2

11111111111111111111111111111111 . . . . . . 111111111111111111111111111111111

n/4 n/4 n/4n/4n/4n/4n/4n/4n/4n/4n/4n/4n/4n/4n/4 n/4

n

n/2 + n/2 + n/2 + n/2

Level i is the sum of 4i copies of n/2i

. . . . . . . . . . . . . . . . . . . . . . . . . .

1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1

n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4

0

1

2

i

log n2

Level i is the sum of 4 i copies of n/ 2 i

1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1

. . . . . . . . . . . . . . . . . . . . . . . . . .

n/ 2 + n/ 2 + n/ 2 + n/ 2

n

n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4

0

1

2

i

lo g n2

=1n

= 4n/2

= 16n/4

= 4i n/2i

Total: θ(nlog4) = θ(n2)

= 4lognn/2logn

=nlog41

Divide and Conquer MULT: θ(n2) time

Grade School Multiplication: θ(n2) time

All that work for nothing!

MULT revisited

• MULT calls itself 4 times. Can you see a way to reduce the number of calls?

MULT(X,Y):

If |X| = |Y| = 1 then RETURN XY

Break X into a;b and Y into c;d

RETURN

MULT(a,c) 2n + (MULT(a,d) + MULT(b,c)) 2n/2 + MULT(b,d)

Gauss’ Hack:Input: a,b,c,d Output: ac,

ad+bc, bd

• A1 = ac

• A3 = bd

• m3 = (a+b)(c+d) = ac + ad + bc + bd

• A2 = m3 – A1- A3 = ad + bc

Gaussified MULT(Karatsuba 1962)

•T(n) = 3 T(n/2) + n

•Actually: T(n) = 2 T(n/2) + T(n/2 + 1) + kn

MULT(X,Y):

If |X| = |Y| = 1 then RETURN XY

Break X into a;b and Y into c;d

e = MULT(a,c) and f =MULT(b,d)

RETURN e2n + (MULT(a+b, c+d) – e - f) 2n/2 + f

nT(n) =

n/2

T(n/4)T(n/4)T(n/4)T(n/4)

n/2

T(n/4)T(n/4)T(n/4)T(n/4)

n/2

T(n/4)T(n/4)T(n/4)T(n/4)

n/2

T(n/4)T(n/4)T(n/4)T(n/4)

n

T(n/2) T(n/2) T(n/2)

T(n) =

n

T(n/2) T(n/2)

T(n) =

n/2

T(n/4)T(n/4)T(n/4)

nT(n) =

n/2

T(n/4)T(n/4)T(n/4)

n/2

T(n/4)T(n/4)T(n/4)

n/2

T(n/4)T(n/4)T(n/4)

n

n/2 + n/2 + n/2

Level i is the sum of 3i copies of n/2i

. . . . . . . . . . . . . . . . . . . . . . . . . .

1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1

n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4 + n/4

0

1

2

i

log n2

=1n

= 3n/2

= 9n/4

= 3i n/2i

Total: θ(nlog3) = θ(n1.58..)

Level i is the sum of 3 i copies of n/ 2 i

1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1

. . . . . . . . . . . . . . . . . . . . . . . . . .

n/ 2 + n/ 2 + n/ 2

n

n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4 + n/ 4

0

1

2

i

lo g n2 = 3lognn/2logn

=nlog31

Dramatic improvement for large n

Not just a 25% savings!

θ(n2) vs θ(n1.58..)

Multiplication Algorithms

Kindergarten n2n

Grade School n2

Karatsuba n1.58…

Fastest Known n logn loglogn

Brute Force

• Solving a problem by methodically plowing through all the possibilities. For example, solving the EightQueensProblem by generating all possible board positions. Does not scale well.

EightQueensProblem

• Place eight queens on a chess board so that no queen attacks any other. • To solve the eight queens problem, you must place 8 queens (black squares) onto the chess board below. No queen can be attacking any other queen. (A queen can attack all cells on its vertical, horizontal, and diagonal lines of sight)

• Using recursion, the board can be solved from the top down, trying all possibilities and backtracking (known as a depth first search).

• http://spaz.ca/aaron/SCS/queens/index.html

Check String

• Check Pattern “gcagagag” in string “gcatcgcagagagtatacagtacg”

Brute Force algorithm

Main features• no preprocessing phase; • constant extra space needed; • always shifts the window by exactly 1

position to the right; • comparisons can be done in any order; • searching phase in O(mn) time

complexity; • 2n expected text characters comparisons

C

void BF(char *x, int m, char *y, int n) { int i, j; /* Searching */ for (j = 0; j <= n - m; ++j) { for (i = 0; i < m && x[i] == y[i + j]; ++i); if (i >= m) OUTPUT(j); } }

Brute Force algorithm• The brute force algorithm consists in checking, at all positions

in the text between 0 and n-m, whether an occurrence of the pattern starts there or not. Then, after each attempt, it shifts the pattern by exactly one position to the right.

• The brute force algorithm requires no preprocessing phase, and a constant extra space in addition to the pattern and the text. During the searching phase the text character comparisons can be done in any order. The time complexity of this searching phase is O(mn) (when searching for am-1b in an for instance). The expected number of text character comparisons is 2n.

To sum up ! A straightforward approach to solving a problem, usually directly based on the problem’s statement and definitions of the concepts involved.

• Examples: selection sort, sequential search • Important special case: exhaustive search

Transform and Conquer

Solve the problem by transforming it to: • a simpler instance of the same problem

– presorting – Gaussian elimination

• another representation of the same problem. – Horner’s rule – FFT

• another problem altogether – path counting by computing powers of adjacency

Matrix – math modeling such as linear programming

The Game of Fif

• The board consists of a row of nine squares numbered 1 through 9. Players take turns selecting squares. The goal of the game is to select squares such that among those selected by a single player there will be a triplet that sums up to 15 1 2 3 4 5 6 7 8 9

Theory of the Game

• The game is equivalent to playing a regular TicTacToe game on the magic square board. (The square is magic in that the sums of three numbers on any straight line (vertical, horizontal or diagonal) is always 15.) There are eight such lines: •Horizontal

•4 + 9 + 2 = 15 •3 + 5 + 7 = 15 •8 + 1 + 6 = 15

•Vertical •4 + 3 + 8 = 15 •9 + 5 + 1 = 15 •2 + 7 + 6 = 15

•Diagonal •4 + 5 + 6 = 15 •2 + 5 + 8 = 15

           

Theory of the Game

• These are all possible representations of 15 with three smaller positive integers.

• As is well known, a TicTacToe game ends in a stalemate unless one of the players makes a mistake. In the Fif game, computer has the advantage of knowing the origin of the games and having in front of its Mind's eye the magic square board. To make the game more appealing computer errs with the probability of 1/20.

Reduction

• The process of transforming an expression according to certain reduction rules.

Find median

• The statistical median is middle number of a group of numbers that have been arrangedin order by size. If there is an even number of terms, the median is the mean of the two middle numbers

Median of medians

• Line up elements in groups of five (this number 5 is not important, it could be e.g. 7 without changing the algorithm much). Call each group S[i], with i ranging from 1 to n/5.

• Find the median of each group. (Call this x[i]). This takes 6 comparisons per group, so 6n/5 total (it is linear time because we are taking medians of very small subsets).

• Find the median of the x[i], using a recursive call to the algorithm. If we write a recurrence in which T(n) is the time to run the algorithm on a list of n items, this step takes time T(n/5). Let M be this median of medians.

• Use M to partition the input and call the algorithm recursively on one of the partitions, just like in quickselect.

Divide-and-Conquer

• Given an instance of the problem to be solved, split this into several, smaller, sub-instances (of the same problem) independently solve each of the sub-instances and then combine the sub-instance solutions so as to yield a solution for the original instance. This description raises the question:By what methods are the sub-instances to be independently solved?

Divide-and-Conquer

• Consider the following: We have an algorithm, alpha say, which is known to solve all problem instances of size n in at most c n^2 steps (where c is some constant). We then discover an algorithm, beta say, which solves the same problem by:

• Dividing an instance into 3 sub-instances of size n/2.

• Solves these 3 sub-instances. • Combines the three sub-solutions taking d n

steps to do this.

Divide-and-Conquer

• Suppose our original algorithm, alpha, is used to carry out the `solves these sub-instances' step 2. Let T(alpha)( n ) = Running time of alpha

• T(beta)( n ) = Running time of beta • Then, • T(alpha)( n ) = c n^2 (by definition of

alpha) • But

Divide-and-Conquer

• T(beta)( n ) = 3 T(alpha)( n/2 ) + d n

= (3/4)(cn^2) + dn

So if dn < (cn^2)/4 (i.e. d < cn/4) then beta is faster than alpha

Divide-and-Conquer

• In particular for all large enough n, (n > 4d/c = Constant), beta is faster than alpha.

• This realization of beta improves upon alpha by just a constant factor. But if the problem size, n, is large enough then

• n > 4d/c n/2 > 4d/c ... n/2^i > 4d/c which suggests that using beta instead of alpha for the `solves these' stage repeatedly until the sub-sub-sub..sub-instances are of size n0 < = (4d/c) will yield a still faster algorithm.

Divide-and-Conquer

• So consider the following new algorithm for instances of size n

procedure gamma (n : problem size ) is begin if n <= n^-0 then Solve problem using Algorithm alpha; else Split into 3 sub-instances of size n/2; Use gamma to solve each sub-instance; Combine the 3 sub-solutions; end if; end gamma;

Divide-and-Conquer

• Let T(gamma)(n) denote the running time of this algorithm.

cn^2 if n < = n0 • T(gamma)(n) = 3T(gamma)( n/2 )+dn

otherwise

• We shall show how relations of this form can be estimated later in the course. With these methods it can be shown that

• T(gamma)( n ) = O( n^{log3} ) (=O(n^{1.59..})

• This is an asymptotic improvement upon algorithms alpha and beta.

Divide-and-Conquer

• The improvement that results from applying algorithm gamma is due to the fact that it maximizes the savings achieved beta.

• The (relatively) inefficient method alpha is applied only to "small" problem sizes.

• The precise form of a divide-and-conquer algorithm is characterized by:

• The threshold input size, n0, below which the problem size is not sub-divided.

• The size of sub-instances into which an instance is split.

• The number of such sub-instances. • The algorithm used to combine sub-solutions.

Divide-and-Conquer • it is more usual to consider the ratio of initial problem size

to sub-instance size. In our example this was 2. The threshold is sometimes called the (recursive) base value. In summary, the generic form of a divide-and-conquer algorithm is:

procedure D-and-C (n : input size) is begin if n < = n0 then Solve problem without further sub-division; else Split into r sub-instances each of size n/k; for each of the r sub-instances do D-and-C (n/k); Combine the r resulting sub-solutions to produce the solution to the

original problem; end if; end D-and-C;

Such algorithms are naturally and easily realized as: Recursive Procedures

การบ้�าน

หว้$ธ�เร้�ยงลำ�ด�บก�อนหลำ�ง ข��นตอนกร้ที่�งนต�ง ๆ ที่��แสั่ดงด�งต�อไปัน��

ต6�นนอน

อบน��

แปัร้งฟั น

สั่ร้ะผู้ม

แต�งต�ว้

ก$นข�ว้

อ�นหน�งสั่6อพิ$มพิ*

ออกก�ลำ�งกย

ที่�กร้บ�น

ไปัโร้งเร้�ยน

End

Restart IntroductionRelevant Mathematics

Recommended