Data Structures and DBMS (Data Base Management System) Wen-Nung Tsai tsaiwn@csie.nctu.edu.tw CSIE Department, NCTU


Page 1: Data Structures and DBMS (Data Base Management System)

Wen-Nung Tsai

tsaiwn@csie.nctu.edu.tw

CSIE Department, NCTU

Page 2: Data Representation

All data is made of 1s and 0s.
How are text characters represented?
  Recall the yes/no number-guessing game.
How are integers represented?
  Negative numbers? Sign and magnitude? One's complement? Two's complement?
How are real numbers represented?
  Normalize first, then split into an exponent and a fraction.
  IEEE 754/854
Double-precision real numbers?
  IEEE 754/854
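As a rough illustration of the representations listed above, the following Python sketch prints a two's-complement bit pattern and splits an IEEE 754 double into its fields. The helper names `twos_complement` and `double_fields` and the 8-bit default width are choices made for this example; they do not come from the slides.

```python
import struct


def twos_complement(value: int, bits: int = 8) -> str:
    """Return the two's-complement bit pattern of `value` using `bits` bits."""
    if not -(1 << (bits - 1)) <= value < (1 << (bits - 1)):
        raise ValueError("value does not fit in the requested width")
    # Masking a negative Python int yields its two's-complement pattern.
    return format(value & ((1 << bits) - 1), f"0{bits}b")


def double_fields(x: float) -> tuple:
    """Split an IEEE 754 double into (sign, biased exponent, fraction)."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = raw >> 63                      # 1 bit
    exponent = (raw >> 52) & 0x7FF        # 11 bits, biased by 1023
    fraction = raw & ((1 << 52) - 1)      # 52 bits, implicit leading 1
    return sign, exponent, fraction


print(twos_complement(5))    # 00000101
print(twos_complement(-5))   # 11111011
print(double_fields(1.0))    # (0, 1023, 0)
```

The double is "normalized" in the sense of the slide: 1.0 is stored as 1.0 × 2^0, so the fraction field is zero and the exponent field holds the bias, 1023.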

Page 3: Data Structures

A higher-level view
How do we represent a stack? A queue? A list?
How do we represent a family tree? The organization chart of a company or school?
Tree?
  Binary tree
  General tree
  AVL balanced tree
  B-Tree
  …

Page 4: Sorting

Take a set of items, order unknown
  Set: linked list, array, file on disk, …
Return an ordered set of the items
For instance:
  Sorting names alphabetically
  Sorting by height
  Sorting by weight

Page 5: Sorting Algorithms

Issues of interest:
  Running time in the worst case and in other cases
  Space requirements
    In-place algorithms: require only constant extra space
  The importance of empirical testing

Often critical to optimize sorting

Page 6: Short Example: Bubble Sort

Key: "large unsorted elements bubble up"
  Make several sequential passes over the set
  In every pass, fix local pairs that are not in order

Considered inefficient, but useful as a first example

Page 7: (Naïve) BubbleSort(array A, length n)

for i ← n downto 2             // note: going down
    for j ← 2 to i             // loop does swaps in [1..i]
        if A[j-1] > A[j]
            swap(A[j-1], A[j])
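The pseudocode above uses 1-based indices. A minimal Python sketch of the same loops with 0-based indices might look like this; the function name `bubble_sort_naive` and the choice to return a sorted copy are assumptions of this example.

```python
def bubble_sort_naive(items: list) -> list:
    """Return a sorted copy of `items`, always making n-1 passes."""
    a = list(items)                       # work on a copy
    n = len(a)
    for i in range(n - 1, 0, -1):         # i: last index of the unsorted part
        for j in range(1, i + 1):         # compare neighbours inside a[0..i]
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
    return a


print(bubble_sort_naive([25, 57, 48, 37, 12, 92, 86, 33]))
# [12, 25, 33, 37, 48, 57, 86, 92]
```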

Page 8: Bubble Sort example (1/3)

Pass 1: 25 57 48 37 12 92 86 33

25 48 57 37 12 92 86 33

25 48 37 57 12 92 86 33

25 48 37 12 57 92 86 33

25 48 37 12 57 86 92 33

25 48 37 12 57 86 33 92

Page 9: Bubble Sort example (2/3)

Pass 2: 25 48 37 12 57 86 33 92

25 37 48 12 57 86 33 92

25 37 12 48 57 86 33 92

25 37 12 48 57 33 86 92

Pass 3: 25 37 12 48 57 33 86 92

25 12 37 48 57 33 86 92

25 12 37 48 33 57 86 92

Page 10: Bubble Sort example (3/3)

Pass 4: 25 12 37 48 33 57 86 92

12 25 37 48 33 57 86 92

12 25 37 33 48 57 86 92

Pass 5: 12 25 37 33 48 57 86 92

12 25 33 37 48 57 86 92

Pass 6: 12 25 33 37 48 57 86 92

Pass 7: 12 25 33 37 48 57 86 92

Page 11: Bubble Sort Features

Worst case: input in reverse order
  Passes: n-1
  Comparisons in pass k: (n-k)
  Total number of comparisons:
    (n-1)+(n-2)+(n-3)+…+1 = n^2/2 - n/2 = O(n^2)
In-place: no auxiliary storage
Best case: already sorted
  Still O(n^2): many redundant passes with no swaps

Page 12: Big O, Big Ω, Big Θ

Big O
  Describes an upper bound on complexity
Big Ω
  Describes a lower bound on complexity
Big Θ?
  Consult a data structures or algorithms textbook
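For reference, the usual textbook definitions of the three notations can be written as follows. The constants $c$, $c_1$, $c_2$ and the threshold $n_0$ are the standard symbols, not ones used on the slides.

```latex
\begin{align*}
f(n) = O(g(n))      &\iff \exists\, c > 0,\ n_0 :\ 0 \le f(n) \le c\, g(n) \quad \forall n \ge n_0 \\
f(n) = \Omega(g(n)) &\iff \exists\, c > 0,\ n_0 :\ 0 \le c\, g(n) \le f(n) \quad \forall n \ge n_0 \\
f(n) = \Theta(g(n)) &\iff f(n) = O(g(n)) \ \text{and}\ f(n) = \Omega(g(n))
\end{align*}
% Example: the bubble sort comparison count satisfies
% n^2/2 - n/2 = \Theta(n^2), e.g. with c_1 = 1/4, c_2 = 1/2, n_0 = 2.
```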

Page 13: Improved bubble sort: BubbleSort(array A, length n)

i ← n
quit ← false
while (i > 1 AND NOT quit)         // note: going down
    quit ← true
    for j ← 2 to i                 // loop does swaps in [1..i]
        if A[j-1] > A[j]
            swap(A[j-1], A[j])     // put max in i
            quit ← false
    i ← i-1
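A possible Python rendering of the improved version is sketched below. It also counts passes so the effect of the early exit is visible; the name `bubble_sort_early_exit` and the returned pass count are additions for this example.

```python
def bubble_sort_early_exit(items: list) -> tuple:
    """Return (sorted copy, number of passes made)."""
    a = list(items)
    i = len(a) - 1                # last index of the unsorted part (0-based)
    passes = 0
    quit_early = False
    while i > 0 and not quit_early:
        quit_early = True         # assume sorted until a swap proves otherwise
        passes += 1
        for j in range(1, i + 1):
            if a[j - 1] > a[j]:
                a[j - 1], a[j] = a[j], a[j - 1]
                quit_early = False
        i -= 1
    return a, passes


print(bubble_sort_early_exit([1, 2, 3, 4, 5]))                    # ([1, 2, 3, 4, 5], 1)
print(bubble_sort_early_exit([25, 57, 48, 37, 12, 92, 86, 33]))   # 6 passes instead of 7
```

On the 8-element example from the earlier slides the array is already sorted after pass 5, so pass 6 makes no swaps and the loop stops there.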

Page 14: Bubble Sort Features

Best case: already sorted
  O(n) – one pass over the set, verifying that it is sorted
Total number of exchanges
  Best case: none
  Worst case: O(n^2), proportional to n squared
Lots of exchanges:
  A problem with large items

Page 15: Selection Sort

Observation: bubble sort uses lots of exchanges
  These always float the largest unsorted element up
We can save exchanges:
  Move the largest item up only after it has been identified
More passes, but fewer total operations
  Same number of comparisons
  Many fewer exchanges

Page 16: SelectSort(array A, length n)

for i ← n downto 2                     // note we are going down
    largest ← A[1]
    largest_index ← 1
    for j ← 1 to i                     // loop finds max in [1..i]
        if A[j] > A[largest_index]
            largest_index ← j
    swap(A[i], A[largest_index])       // put max in i
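The same procedure can be sketched in Python with 0-based indices. The function name `selection_sort` and the exchange counter are assumptions added here to make the "few exchanges" claim checkable.

```python
def selection_sort(items: list) -> tuple:
    """Return (sorted copy, number of exchanges performed)."""
    a = list(items)
    exchanges = 0
    for i in range(len(a) - 1, 0, -1):     # position that receives the max
        largest_index = 0
        for j in range(1, i + 1):          # find the max in a[0..i]
            if a[j] > a[largest_index]:
                largest_index = j
        # Like the pseudocode, swap unconditionally (even if the indices match).
        a[i], a[largest_index] = a[largest_index], a[i]
        exchanges += 1
    return a, exchanges


print(selection_sort([25, 57, 48, 37, 12, 92, 86, 33]))
# ([12, 25, 33, 37, 48, 57, 86, 92], 7)
```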

Page 17: Selection Sort Example

Initial: 25 57 48 37 12 92 86 33

Pass 1: 25 57 48 37 12 33 86 | 92

Pass 2: 25 57 48 37 12 33 | 86 92

Pass 3: 25 33 48 37 12 | 57 86 92

Pass 4: 25 33 12 37 | 48 57 86 92

Pass 5: 25 33 12 | 37 48 57 86 92

Pass 6: 25 12 | 33 37 48 57 86 92

Pass 7: 12 | 25 33 37 48 57 86 92

Page 18: Selection Sort Summary

Best case: already sorted
  Passes: n-1
  Comparisons in pass k: (n-k)
  Number of comparisons: (n-1)+(n-2)+…+1 = O(n^2)
Worst case: same
In-place: no external storage
Very few exchanges:
  Always n-1 (better than bubble sort)

Page 19: Selection Sort vs. Bubble Sort

Selection sort:
  More comparisons than bubble sort in the best case: O(n^2)
  But fewer exchanges: O(n)
  Good for small sets/cheap comparisons, large items
Bubble sort:
  Many exchanges: O(n^2) in the worst case
  O(n) on sorted input

Page 20: Insertion Sort

Improve on the number of comparisons
Key idea: keep part of the array always sorted
  As in selection sort, put items in their final place
  As in bubble sort, "bubble" them into place

Page 21: InsertSort(array A, length n)

for i ← 2 to n                     // A[1] is sorted
    y ← A[i]
    j ← i-1
    while (j > 0 AND y < A[j])
        A[j+1] ← A[j]              // shift things up
        j ← j-1
    A[j+1] ← y                     // put A[i] in right place
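A 0-based Python sketch of the same insertion procedure follows. The name `insertion_sort` and the copy-then-return convention are choices made for this example.

```python
def insertion_sort(items: list) -> list:
    """Return a sorted copy of `items` using insertion sort."""
    a = list(items)
    for i in range(1, len(a)):        # a[0..i-1] is already sorted
        y = a[i]                      # element to insert
        j = i - 1
        while j >= 0 and y < a[j]:
            a[j + 1] = a[j]           # shift larger elements one slot up
            j -= 1
        a[j + 1] = y                  # drop y into the gap
    return a


print(insertion_sort([25, 57, 48, 37, 12, 92, 86, 33]))
# [12, 25, 33, 37, 48, 57, 86, 92]
```

Because the comparison is a strict `y < a[j]`, equal keys are never moved past each other, so this version is stable.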

Page 22: InsertSort example (1/4)

Initial: 25 57 48 37 12 92 86 33

Pass 1: 25 | 57 48 37 12 92 86 33

Pass 2: 25 57 | 48 37 12 92 86 33

25 48 | 57 37 12 92 86 33

Pass 3: 25 48 57 | 37 12 92 86 33

25 48 57 | 57 12 92 86 33

25 48 48 | 57 12 92 86 33

25 37 48 | 57 12 92 86 33

Page 23: InsertSort example (2/4)

Pass 4: 25 37 48 57 | 12 92 86 33

25 37 48 57 | 57 92 86 33

25 37 48 48 | 57 92 86 33

25 37 37 48 | 57 92 86 33

25 25 37 48 | 57 92 86 33

12 25 37 48 | 57 92 86 33

Page 24: InsertSort example (3/4)

Pass 5: 12 25 37 48 57 | 92 86 33

Pass 6: 12 25 37 48 57 92 | 86 33

12 25 37 48 57 86 | 92 33

Pass 7: 12 25 37 48 57 86 92 | 33

12 25 37 48 57 86 92 | 92

12 25 37 48 57 86 86 | 92

12 25 37 48 57 57 86 | 92

12 25 37 48 48 57 86 | 92

Page 25: InsertSort example (4/4)

Pass 7: 12 25 37 48 48 57 86 | 92

12 25 37 37 48 57 86 | 92

12 25 33 37 48 57 86 | 92

Page 26: Insertion Sort Summary

Best case: already sorted, O(n)
Worst case: O(n^2) comparisons
  Number of exchanges: O(n^2)
In-place: no external storage
In practice, best for small sets (<30 items)
  BubbleSort does more comparisons!
Very efficient on nearly-sorted inputs

Page 27: Niklaus (Nicholas) Wirth

Invented the Pascal language
Wrote a book:
  Algorithms + Data Structures = Programs

Page 28: Divide-and-Conquer Algorithm

An algorithm design technique:
  Divide a problem of size N into sub-problems
  Solve all sub-problems
  Merge/combine the sub-solutions
This can result in VERY substantial improvements


Page 30: Small Example: f(n)

if (n == 0 OR n == 1)
    return 1;
else
    return f(n-1)*n;

What is this function?

Factorial! It computes n factorial.

Tell me (n-1) factorial, and I will tell you n factorial.

Page 31: Recursion

Page 32: Divide-and-Conquer in Sorting

Mergesort
  O(n log n) always, but O(n) storage
Quick sort
  O(n log n) average, O(n^2) worst case in time
  Good in practice when n > 30, O(log n) storage
  But quick sort is not a stable sort
    Items with equal keys may end up in a different relative order after sorting than they had before

Page 33: Selection Sort, Insertion Sort, Bubble Sort

Selection sort does more comparisons! (key)
Selection sort does fewer exchanges! (data)
Worst case for sorting n data items:
  All three are O(n^2), Big-O of n squared
Quick sort (average case): O(n * log n) = O(n * log2 n)

Page 34: Just how fast is Quick Sort?

Consider the number of comparisons
Selection sort: (n-1)+(n-2)+…+2+1 = n(n-1)/2
With n items, n-1 comparisons put one item into its final position
  At the very front (leftmost)? At the very back (rightmost)?
  Could it land in the middle? Nearly the middle: quick sort

Suppose there are 100 items
Selection sort: 100*99/2 = 50*99 comparisons
If we first make one pass in the style of quick sort: 99 comparisons fix one item in place
Then the data is cut into two halves; with luck they hold 49 and 50 items
Count both as 50 and use selection sort on each: 2 * 50 * 49 / 2 = 50*49
Total: 99 + 50 * 49 = about 50 * 51 comparisons
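The arithmetic on this slide can be checked with a few lines of Python. The function name `selection_comparisons` is made up for this sketch; the figures are the ones used on the slide.

```python
def selection_comparisons(n: int) -> int:
    """Comparisons made by selection sort on n items: n(n-1)/2."""
    return n * (n - 1) // 2


n = 100
whole = selection_comparisons(n)                        # sort all 100 at once
# One partitioning pass (n-1 comparisons), then selection sort on each half.
slide_estimate = (n - 1) + 2 * selection_comparisons(50)    # both halves counted as 50
exact_split = (n - 1) + selection_comparisons(49) + selection_comparisons(50)

print(whole)            # 4950  (= 50 * 99)
print(slide_estimate)   # 2549  (about 50 * 51)
print(exact_split)      # 2500
```

A single lucky split already roughly halves the work, which is the intuition behind applying the split recursively.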

Page 35: Quicksort Algorithm

Given an array of n elements:
  If the array contains only one element, return
  Else
    Pick one element to use as the pivot.
    Partition the elements into two sub-arrays:
      Elements less than or equal to (<=) the pivot
      Elements greater than (>) the pivot
    Quicksort the two sub-arrays
    Return results

Page 36: Quick Sort Example

We are given an array of n integers to sort:

38 18 10 80 60 50 7 30 98

Page 37: Quick Sort Example: Pick Pivot Element

There are a number of ways to pick the pivot element. In this example, we will use the first element in the array:

38 18 10 80 60 50 7 30 98

Goal: after n-1 comparisons, the smaller elements are in the left half and the larger ones in the right half.

Page 38: Quick Sort Example: Partitioning Array

Given a pivot, partition the elements of the array such that the resulting array consists of:
  1. One sub-array that contains elements <= pivot
  2. Another sub-array that contains elements > pivot
The sub-arrays are stored in the original data array.
Partitioning loops through, swapping elements below/above pivot.

Pages 39-63: Quick Sort Example (partition walkthrough)

Goal: after n-1 comparisons, the smaller elements are in the left half and the larger ones in the right half.

1. While data[too_big_index] <= data[pivot_index]
       ++too_big_index
2. While data[too_small_index] > data[pivot_index]
       --too_small_index
3. If too_big_index < too_small_index
       swap data[too_big_index] and data[too_small_index]
4. While too_big_index < too_small_index, go to 1.
5. Swap data[too_small_index] and data[pivot_index]

[0] [1] [2] [3] [4] [5] [6] [7] [8]

38 18 10 80 60 50 7 30 98    pivot_index = 0
  Steps 1-2: too_big_index stops at [3] (80), too_small_index stops at [7] (30).

38 18 10 30 60 50 7 80 98    after step 3 swaps [3] and [7]
  Steps 1-2 again: too_big_index stops at [4] (60), too_small_index stops at [6] (7).

38 18 10 30 7 50 60 80 98    after step 3 swaps [4] and [6]
  Steps 1-2 again: too_big_index stops at [5] (50), too_small_index stops at [4] (7). The indices have crossed, so the loop ends.

7 18 10 30 38 50 60 80 98    after step 5 swaps [4] and [0]; pivot_index = 4

Page 64: Quick Sort Example: Partition Result

7 18 10 30 38 50 60 80 98

[0] [1] [2] [3] [4] [5] [6] [7] [8]

<= pivot: [0..3]    > pivot: [5..8]

Currently pivot = data[too_small_index]

Recursion: Quicksort Sub-arrays
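The five partition steps and the recursion on the two sub-arrays could be sketched in Python as below. The variable names follow the slides; the function names `partition` and `quick_sort`, the `low`/`high` parameters, and the bounds check in step 1 (needed when every element is <= the pivot) are assumptions added for this example.

```python
def partition(data: list, low: int, high: int) -> int:
    """Partition data[low..high] around data[low]; return the pivot's final index."""
    pivot_index = low
    too_big_index = low + 1
    too_small_index = high
    while True:
        # Step 1: skip elements that belong on the left (bounds check added).
        while too_big_index <= high and data[too_big_index] <= data[pivot_index]:
            too_big_index += 1
        # Step 2: skip elements that belong on the right (stops at the pivot at the latest).
        while data[too_small_index] > data[pivot_index]:
            too_small_index -= 1
        # Steps 3-4: swap a misplaced pair, or stop once the indices have crossed.
        if too_big_index < too_small_index:
            data[too_big_index], data[too_small_index] = data[too_small_index], data[too_big_index]
        else:
            break
    # Step 5: put the pivot between the two sub-arrays.
    data[too_small_index], data[pivot_index] = data[pivot_index], data[too_small_index]
    return too_small_index


def quick_sort(data: list, low: int = 0, high: int = None) -> None:
    """Sort data[low..high] in place."""
    if high is None:
        high = len(data) - 1
    if low < high:
        p = partition(data, low, high)
        quick_sort(data, low, p - 1)      # elements <= pivot
        quick_sort(data, p + 1, high)     # elements > pivot


numbers = [38, 18, 10, 80, 60, 50, 7, 30, 98]
print(partition(numbers, 0, len(numbers) - 1), numbers)
# 4 [7, 18, 10, 30, 38, 50, 60, 80, 98]
quick_sort(numbers)
print(numbers)
# [7, 10, 18, 30, 38, 50, 60, 80, 98]
```

Picking the first element as the pivot, as the slides do, gives the O(n^2) worst case on already-sorted input; other pivot choices trade a little extra work for better behavior there.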

Page 65: Recursion: Quicksort Sub-arrays

7 18 10 30 38 50 60 80 98

[0] [1] [2] [3] [4] [5] [6] [7] [8]

<= data[pivot] > data[pivot]

Page 66: Algorithm types

Algorithm types we will consider include:
  Simple recursive algorithms
  Backtracking algorithms
  Greedy algorithms
  Divide and Conquer algorithms
  Dynamic programming algorithms
  Branch and bound algorithms
  Brute force algorithms
  Randomized algorithms

Page 67: Coin Changing problem

Problem: a dollar amount to reach and a collection of coin amounts to use to get there.
Configuration: a dollar amount yet to return to a customer, plus the coins already returned.
Objective function: minimize the number of coins returned.
Greedy solution: always return the largest coin you can.
Example 1: coins are valued $.32, $.08, $.01
  Has the greedy-choice property, since no amount over $.32 can be made with a minimum number of coins by omitting a $.32 coin (similarly for amounts over $.08, but under $.32).

Coins in USA: 1¢ 5¢ 10¢ 25¢ 50¢

Page 68: Question?

Suppose there are unlimited quantities of coins of each denomination.

What property should the denominations c1, c2, …, ck have so that the greedy algorithm always yields an optimal solution?

Consider this example:
Example 2: coins are valued $.30, $.20, $.05, $.01
  Does not have the greedy-choice property, since $.40 is best made with two $.20's, but the greedy solution will pick three coins (which ones?)

The greedy method cannot always find an optimal solution!
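To make the two coin examples concrete, here is a small Python sketch that compares the greedy choice with an exhaustive dynamic-programming answer. Amounts are in cents; the function names `greedy_change` and `optimal_change` are invented for this illustration, and the dynamic-programming routine is only one possible way to obtain a true minimum.

```python
def greedy_change(amount: int, coins: list) -> list:
    """Always hand back the largest coin that still fits."""
    result = []
    for coin in sorted(coins, reverse=True):
        while amount >= coin:
            amount -= coin
            result.append(coin)
    if amount:
        raise ValueError("amount cannot be made with these coins")
    return result


def optimal_change(amount: int, coins: list) -> list:
    """Fewest coins for `amount`, by dynamic programming over all smaller amounts."""
    best = [None] * (amount + 1)      # best[v] = shortest coin list summing to v
    best[0] = []
    for value in range(1, amount + 1):
        for coin in coins:
            if coin <= value and best[value - coin] is not None:
                candidate = best[value - coin] + [coin]
                if best[value] is None or len(candidate) < len(best[value]):
                    best[value] = candidate
    return best[amount]


print(greedy_change(40, [30, 20, 5, 1]))    # [30, 5, 5]  (three coins)
print(optimal_change(40, [30, 20, 5, 1]))   # [20, 20]    (two coins)
print(greedy_change(40, [32, 8, 1]))        # [32, 8]
```

With the denominations of Example 1 (32, 8, 1) the two functions agree on the number of coins for every amount tried, while Example 2 shows the greedy answer falling short at 40 cents.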

Page 69: Now look at RSA, the algorithm that made big money

RSA and Diffie-Hellman
RSA: Ron Rivest, Adi Shamir and Len Adleman at MIT, in 1977.
  RSA is a block cipher
  The most widely implemented public-key cryptographic algorithm
Diffie-Hellman, in 1976
  Exchanges a secret key securely
  Security rests on the difficulty of computing discrete logarithms

Page 70: The RSA Algorithm – Key Generation

1. Select p, q                 p and q both prime
2. Calculate n = p × q
3. Calculate $\phi(n) = (p-1)(q-1)$
4. Select integer e            $\gcd(\phi(n), e) = 1;\ 1 < e < \phi(n)$
5. Calculate d                 $d \equiv e^{-1} \pmod{\phi(n)}$
6. Public key                  KU = {e, n}
7. Private key                 KR = {d, n}

Page 71: Example of RSA Algorithm

1. Select p, q                 p = 7, q = 17
2. Calculate n = p × q = 7 × 17 = 119
3. Calculate $\phi(n) = (p-1)(q-1) = 96$
4. Select integer e = 5        $\gcd(\phi(n), e) = 1;\ 1 < e < \phi(n)$
5. Calculate d = 77            $d \equiv e^{-1} \pmod{\phi(n)}$, because 77 × 5 = 385 = 4 × 96 + 1
6. Public key KU = {e, n} = {5, 119}
7. Private key KR = {d, n} = {77, 119}
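The key-generation numbers above can be reproduced with Python's built-in modular arithmetic. This is a toy sketch only: the message value 19 and the function names `encrypt` and `decrypt` are assumptions for illustration, and such small primes offer no security. `pow(e, -1, phi)` needs Python 3.8 or later.

```python
from math import gcd

p, q = 7, 17
n = p * q                      # 119
phi = (p - 1) * (q - 1)        # 96
e = 5
assert gcd(phi, e) == 1 and 1 < e < phi

d = pow(e, -1, phi)            # modular inverse of e modulo phi(n)
print(d)                       # 77


def encrypt(message: int) -> int:
    """C = M^e mod n, using the public key KU = {e, n}."""
    return pow(message, e, n)


def decrypt(cipher: int) -> int:
    """M = C^d mod n, using the private key KR = {d, n}."""
    return pow(cipher, d, n)


cipher = encrypt(19)           # 19 is an arbitrary message smaller than n
print(cipher)                  # 66
print(decrypt(cipher))         # 19
```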

Page 72: Example of RSA Algorithm (cont.)

Page 73: Diffie-Hellman Key Exchange

α and q are agreed on by both parties in advance, or sent by one party to the other (A sends them to B).

The K computed by each side will be the same.
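A minimal Python sketch of the exchange, under stated assumptions: the prime q = 353 and base alpha = 3 are small illustrative values chosen for this example, and the helper names `make_keypair` and `shared_key` are invented here. Real deployments use far larger parameters.

```python
import secrets

q = 353        # public prime modulus (illustrative value only)
alpha = 3      # public base (assumed for this sketch)


def make_keypair() -> tuple:
    """Pick a random private exponent and derive the public value alpha^x mod q."""
    private = secrets.randbelow(q - 2) + 1       # 1 <= private <= q-2
    public = pow(alpha, private, q)
    return private, public


def shared_key(their_public: int, my_private: int) -> int:
    """K = (their public value)^(my private exponent) mod q."""
    return pow(their_public, my_private, q)


a_private, a_public = make_keypair()     # party A
b_private, b_public = make_keypair()     # party B

# Only the public values cross the wire; each side combines them with its own secret.
k_a = shared_key(b_public, a_private)
k_b = shared_key(a_public, b_private)
print(k_a == k_b)                        # True
```

Both sides arrive at alpha^(xA·xB) mod q, which is why the two computed values of K match.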

Page 74: The most widespread application: DBMS + Web

MySQL + PHP web applications
MS SQL + ASP web applications

Plain web pages alone are worth little
  They must be combined with a database to show their power
A database needs a management system (DBMS)
The most common kind of DBMS is the relational database management system (Relational Data Base Management System)

Page 75: 3-tier architecture

The learning platform (e2.NCTU) uses a three-tier (3-tier) system architecture.

[Figure: three tiers labeled Browsers, Web Server, and DBMS; other labels: users and administrators, program logic, Material, Knowledge, XML]

Page 76: Introduction to databases and DBMS

Data
Files
Databases
Database management systems (DBMS)
  Data is suitably organized
  Data is stored and retrieved systematically
  Most often paired with a 4GL (4th Generation Language)
    Structured Query Language (SQL)
      Select ALL where credits >= 9 and failed_credits * 2 >= total_credits

Page 77: Common database terms

DBMS (Data Base Management System)
Relational DBMS (RDBMS)
SQL (Structured Query Language)
DDL (Data Definition Language)
DML (Data Manipulation Language)
DCL (Data Control Language)
Normalization
ER Model

Page 78: Common database systems

dBase III (the best-known RDBMS of the DOS era)
Lotus and Excel can also be used to store data
  Excel (the data format closest to everyday life)
MS Access (a small but elegant database management system)
MS SQL (a tool for working with large databases on the PC)
MySQL, usually paired with PHP = DBMS + Web
PostgreSQL (grew out of the Ingres project at Berkeley)
RDBMS: Oracle / Informix / Sybase
IBM DB2
  On April 24, 2001, IBM announced that it would acquire the well-known database vendor Informix for US$1 billion.

Page 79: 1 Data Structures and DBMS ( Data Base Management System ) Wen-Nung Tsai tsaiwn@csie.nctu.edu.tw CSIE Department, NCTU

79

Why is the prophet in The Matrix called the Oracle?

Oracle Taiwan (美商甲骨文): http://www.oracle.com/global/tw/
Oracle is the world's second-largest software company, after Microsoft, and also the world's largest supplier of database management systems (RDBMS)

Sybase Taiwan: http://www.sybase.com.tw/
Informix Taiwan:

The Taiwan Informix User Group (TWIUG) was formally established on September 21, 2005, with IBM's support.

http://www.iiug.org/twiug/


80

How to Access a Database

Use a spreadsheet: VisiCalc / Lotus / Excel
Write programs directly (traditional style): dBase II, dBase III / Clipper / FoxPro
Write programs directly (visual programming): VB / Delphi / Access / PowerBuilder / Developer 2000
Use a DBMS through middleware:
Various ODBC / JDBC drivers
Using OLE DB / SQL statements


81

Basic RDBMS Operations

Table = Relation ≈ file
Three basic operations: Select, Project, Join

Structured Query Language (SQL)
Select ALL where credits >= 9 and failed credits * 2 >= total credits
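The three operations can be sketched in plain Java over in-memory lists, which may help before looking at the relation examples. Everything here is hypothetical: the `Student` and `Advisor` relations, their fields, and the sample rows are made up, and the slide's "credits" and "total credits" are treated as the same column. A real RDBMS runs these operations from SQL, not from hand-written loops.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical relations used only to illustrate Select, Project and Join.
class RelationalOpsSketch {
    static class Student {
        final int id;
        final String name;
        final int totalCredits;
        final int failedCredits;
        Student(int id, String name, int totalCredits, int failedCredits) {
            this.id = id;
            this.name = name;
            this.totalCredits = totalCredits;
            this.failedCredits = failedCredits;
        }
    }

    static class Advisor {
        final int studentId;
        final String teacher;
        Advisor(int studentId, String teacher) {
            this.studentId = studentId;
            this.teacher = teacher;
        }
    }

    // SELECT: keep only the rows (tuples) that satisfy the condition.
    static List<Student> selectAtRisk(List<Student> students) {
        return students.stream()
                .filter(s -> s.totalCredits >= 9 && s.failedCredits * 2 >= s.totalCredits)
                .collect(Collectors.toList());
    }

    // PROJECT: keep only some columns (here, just the name).
    static List<String> projectNames(List<Student> students) {
        return students.stream().map(s -> s.name).collect(Collectors.toList());
    }

    // JOIN: pair rows of two relations whose key values match.
    static List<String> join(List<Student> students, List<Advisor> advisors) {
        List<String> result = new ArrayList<>();
        for (Student s : students) {
            for (Advisor a : advisors) {
                if (s.id == a.studentId) {
                    result.add(s.name + " / " + a.teacher);
                }
            }
        }
        return result;
    }
}
```

Select cuts the table horizontally (fewer rows), Project cuts it vertically (fewer columns), and Join glues two tables together on a shared key.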


82

Example: An employee database consisting of three relations


83

Another example of the JOIN operation in a relational database


84

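Introduction to Database Table Normalization (1/2)

First normal form

A relational database requires every table to satisfy first normal form (1NF), as follows:

First normal form: each field may hold only one value.
A relational database requires every table to have a primary key, which identifies each tuple.
The primary key may be a single field of the table, or it may be made up of several fields.
A relational database places one more requirement on primary-key fields, namely entity integrity: no field that is part of the primary key may be Null.

Second normal form:

The table must satisfy first normal form, and every non-key field must be fully functionally dependent on the primary key.

Functional dependence: the value of attribute X determines a unique value of attribute Y, written X→Y.

Full functional dependence: if attribute Y is functionally dependent on attribute X, but is not functionally dependent on any proper subset of attribute X, then attribute Y is fully functionally dependent on attribute X.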
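A small check can make functional dependence concrete. The sketch below tests whether X→Y is consistent with a set of sample rows; the table layout (student, course, student name, grade) and the rows are hypothetical. Note the limitation: sample data can only refute a dependency, it cannot prove one, because a functional dependency is a rule about all data the table may ever hold.

```java
import java.util.HashMap;
import java.util.Map;

class FunctionalDependencySketch {
    // Returns true if, in the given rows, equal values in the columns xs
    // always come with equal values in column y (X -> Y is not violated).
    static boolean holds(String[][] rows, int[] xs, int y) {
        Map<String, String> seen = new HashMap<>();
        for (String[] row : rows) {
            StringBuilder key = new StringBuilder();
            for (int x : xs) {
                key.append(row[x]).append('|');   // '|' assumed absent from the data
            }
            String previous = seen.putIfAbsent(key.toString(), row[y]);
            if (previous != null && !previous.equals(row[y])) {
                return false;                     // same X, different Y
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Columns: 0 = student id, 1 = course id, 2 = student name, 3 = grade.
        // The primary key is assumed to be (student id, course id).
        String[][] rows = {
            {"s1", "c1", "Ann", "A"},
            {"s1", "c2", "Ann", "B"},
            {"s2", "c1", "Bob", "C"},
        };
        System.out.println(holds(rows, new int[] {0, 1}, 3)); // grade depends on the whole key
        System.out.println(holds(rows, new int[] {0}, 3));    // but not on student id alone
        System.out.println(holds(rows, new int[] {0}, 2));    // name depends on part of the key
    }
}
```

In this sample the grade depends on the whole key, so it is fully functionally dependent, while the student name depends on the student id alone. That partial dependence is what second normal form forbids; the usual fix is to move the name into a separate student table.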


85

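Introduction to Database Table Normalization (2/2)

When X→Y and Y→Z, the relationship is called transitive functional dependence: Z is transitively functionally dependent on X.

Third normal form (3NF)

The table must satisfy second normal form, and no non-key field may be transitively functionally dependent on the primary key.

Problems solved by third normal form:
Problems when inserting data
Problems when deleting data
Problems when updating data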


86

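Data Mining Applications

Customer credit data
Reduce the loss rate from loan risk
Predict customers who are likely to leave
Analyze a retail store's historical sales records and location profiles to decide the best location
Collect questionnaires on music / movie preferences
Analyze the best locations for installing ATMs
Analyze the effectiveness of sales promotions (e.g. coupons)
Analyze customer behavior to support decisions (e.g. CRM systems)
Predict the effect of corrosive substances on skin, reducing the cost and time of product development (drugs or toxic substances) and the need for animal experiments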


87

Some Data Structure examples

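Linked List: insert/delete a node

Data Structures in Java

Java implements the data structures discussed in typical data-structures textbooks as a library, and provides a consistent access interface
See Java's online Reference Manual
(available from the Sun website or the Java website, or also here: http://www.csie.nctu.edu.tw/~tsaiwn/course/java/ )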


88

Linked List

Flexible structure, providing

• Insertion and removal at any position in O(1) once a reference to the neighboring node is held, compared to O(n) for an array-based list

• Sequential access

• Random access at O(n), compared to O(1) for array-based list


89

Connecting Nodes

creating the nodes

connecting


90

Inserting Nodes

p.link = r

r.link = q

q can be accessed by p.link.link
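The two assignments above can be wrapped into a small method. This is a minimal sketch: the field name `link` follows the slide, while the class name `LinkNode`, the `String` payload, and the method name `insertAfter` are assumptions.

```java
// Minimal singly linked node; only the field name `link` comes from the slide.
class LinkNode {
    String data;
    LinkNode link;   // reference to the next node, or null at the end of the list

    LinkNode(String data) {
        this.data = data;
    }

    // Insert node r directly after node p. No other node is touched, so this is O(1).
    static void insertAfter(LinkNode p, LinkNode r) {
        LinkNode q = p.link;   // remember p's old successor before overwriting it
        p.link = r;
        r.link = q;
    }

    public static void main(String[] args) {
        LinkNode p = new LinkNode("p");   // creating the nodes
        LinkNode q = new LinkNode("q");
        p.link = q;                       // connecting them: p -> q

        LinkNode r = new LinkNode("r");
        insertAfter(p, r);                // now p -> r -> q
        System.out.println(p.link.link == q);
    }
}
```

The old successor q must be saved before `p.link` is overwritten; otherwise the rest of the list becomes unreachable.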



91

Removing Nodes

p q
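Only the labels p and q survive from this slide's figure, so the sketch below assumes the usual setup: q is the node right after p, and q is the node being removed. The class and method names are hypothetical.

```java
// Removal sketch for a singly linked list; names are assumptions.
class RemovalNode {
    String data;
    RemovalNode link;   // next node, or null at the end of the list

    RemovalNode(String data) {
        this.data = data;
    }

    // Unlink the node that follows p and return it (null if p is the last node).
    static RemovalNode removeAfter(RemovalNode p) {
        RemovalNode q = p.link;
        if (q == null) {
            return null;       // nothing to remove
        }
        p.link = q.link;       // bypass q
        q.link = null;         // detach q from the list
        return q;
    }
}
```

Like insertion, removal only rewires one reference, which is why it is O(1) once p is known. Finding p in the first place still takes a traversal.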


92

Traversing a List

(null)
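Traversal follows the links from the first node until the (null) reference that marks the end. A minimal sketch, with hypothetical class and method names:

```java
// Traversal sketch for a singly linked list; names are assumptions.
class TraversalNode {
    String data;
    TraversalNode link;   // next node, or null at the end of the list

    TraversalNode(String data, TraversalNode link) {
        this.data = data;
        this.link = link;
    }

    // Visit every node from head to the end and join the data values.
    static String describe(TraversalNode head) {
        StringBuilder text = new StringBuilder();
        for (TraversalNode current = head; current != null; current = current.link) {
            if (text.length() > 0) {
                text.append(" -> ");
            }
            text.append(current.data);
        }
        return text.toString();
    }

    // Count the nodes; this takes O(n) steps because every link is followed.
    static int length(TraversalNode head) {
        int count = 0;
        for (TraversalNode current = head; current != null; current = current.link) {
            count++;
        }
        return count;
    }
}
```

The same loop shape is the reason random access costs O(n): reaching the i-th node means following i links.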


93

Double Linked Lists

Single linked list

Double linked list

[Figure: three nodes, each with data, successor, and predecessor fields; the links at both ends of the list are (null)]
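A node with the three fields from the figure might look like the sketch below. The field names `data`, `successor`, and `predecessor` follow the figure labels; the class name `DNode` and the method `insertAfter` are assumptions.

```java
// Doubly linked node sketch; field names follow the figure labels.
class DNode {
    String data;
    DNode successor;     // next node, or null at the end
    DNode predecessor;   // previous node, or null at the start

    DNode(String data) {
        this.data = data;
    }

    // Insert node r directly after node p, keeping both link directions consistent.
    static void insertAfter(DNode p, DNode r) {
        DNode q = p.successor;   // old successor of p, may be null
        r.predecessor = p;
        r.successor = q;
        p.successor = r;
        if (q != null) {
            q.predecessor = r;
        }
    }
}
```

Compared with the single linked list, every insertion updates up to four references instead of two, and in exchange the list can be walked in both directions.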


94

Data Structures in JAVA

Let's see what JAVA has to offer:


95

The Collection Hierarchy

Collection: top interface, specifying requirements for all collections
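One practical consequence of a common top interface is that code written against `Collection` works with any implementation. A minimal sketch, where the class name `CollectionSketch` and the method `countLongerThan` are hypothetical; `ArrayList`, `LinkedList`, and `HashSet` are standard `java.util` classes that implement `Collection`.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedList;

class CollectionSketch {
    // Accepts any Collection, whatever the concrete data structure behind it.
    static int countLongerThan(Collection<String> items, int minLength) {
        int count = 0;
        for (String item : items) {
            if (item.length() > minLength) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Collection<String> asArray = new ArrayList<>();
        Collection<String> asList = new LinkedList<>();
        Collection<String> asSet = new HashSet<>();
        for (String name : new String[] {"tree", "stack", "queue"}) {
            asArray.add(name);   // add() is part of the Collection interface
            asList.add(name);
            asSet.add(name);
        }
        System.out.println(countLongerThan(asArray, 4));
        System.out.println(countLongerThan(asList, 4));
        System.out.println(countLongerThan(asSet, 4));
    }
}
```

All three calls give the same count, even though the elements are stored in an array, a linked list, and a hash table respectively.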


96

Collection Interface (1/2)


97

Collection Interface (2/2)



98

Iterator Interface

Purpose:

Sequential access to collection elements

Note: the technique used so far, accessing elements one after another by index, is not reasonable in general (why?)
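One possible answer to the "why?" is sketched below. Indexing only exists for lists, and on a `LinkedList` each `get(i)` has to walk the links, so an index loop costs O(n²) overall. An iterator works on every collection and visits each element once. The class and method names are hypothetical; `iterator()`, `hasNext()`, `next()`, `get()`, and `size()` are standard `java.util` methods.

```java
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

class IteratorSketch {
    // Works for any Collection, including sets, which have no index at all.
    static int sum(Collection<Integer> numbers) {
        int total = 0;
        Iterator<Integer> it = numbers.iterator();
        while (it.hasNext()) {
            total += it.next();
        }
        return total;
    }

    // Index-based version: restricted to List, and slow on linked lists
    // because every get(i) walks the chain of nodes again.
    static int sumByIndex(List<Integer> numbers) {
        int total = 0;
        for (int i = 0; i < numbers.size(); i++) {
            total += numbers.get(i);
        }
        return total;
    }
}
```

The enhanced `for (Integer n : numbers)` loop is shorthand for the iterator version, so it has the same advantages.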

Methods: