Sorting

Sorting. DCS – SWC 2 Sorting Searching in sorted data is much faster (O(log(n)), than searching in unsorted data (O(n)). Being able to sort data efficiently


DCS – SWC 2

Sorting

• Searching in sorted data is much faster (O(log(n))) than searching in unsorted data (O(n)).

• Being able to sort data efficiently is thus quite an important ability

• But how fast can we sort data…?
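
To make the O(log(n)) claim concrete, here is a minimal binary search sketch over a sorted int array. The class and method names (BinarySearchSketch, indexOf) are illustrative and do not come from the slides.

```java
class BinarySearchSketch {
    // Returns the index of key in the sorted array a, or -1 if key is absent.
    static int indexOf(int[] a, int key) {
        int low = 0;
        int high = a.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;   // avoids int overflow of (low + high)
            if (a[mid] == key) return mid;
            if (a[mid] < key) low = mid + 1;    // key can only be in the upper half
            else high = mid - 1;                // key can only be in the lower half
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] sorted = {4, 10, 18, 26, 34, 40, 56, 60, 76, 82};
        System.out.println(indexOf(sorted, 34));
        System.out.println(indexOf(sorted, 35));
    }
}
```

Each iteration halves the remaining range, which is where the logarithmic running time comes from. In unsorted data there is no way to discard half of the elements, so every element may have to be examined.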


DCS – SWC 3

Selection sort

• A very simple algorithm for sorting an array of n integers works like this:
    – Search the array from element 0 to element (n-1), to find the smallest element
    – If the smallest element is element i, then swap element 0 and element i
    – Now repeat the process from element 1 to element (n-1)
    – …and so on…
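
The steps above can be sketched in Java roughly as follows. This is a minimal illustration, not code from the slides; the class name SelectionSortSketch and the sample values are assumptions.

```java
import java.util.Arrays;

class SelectionSortSketch {
    // Sorts the array a in place, in ascending order.
    static void sort(int[] a) {
        for (int start = 0; start < a.length - 1; start++) {
            // Find the smallest element in a[start] .. a[n-1]
            int smallest = start;
            for (int i = start + 1; i < a.length; i++) {
                if (a[i] < a[smallest]) smallest = i;
            }
            // Swap it into position start
            int tmp = a[start];
            a[start] = a[smallest];
            a[smallest] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] data = {34, 10, 56, 26, 4, 82, 76, 18, 60, 40};
        sort(data);
        System.out.println(Arrays.toString(data));
    }
}
```

The outer loop stops at element (n-2), since the last remaining element is necessarily the largest.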

DCS – SWC 4

Selection sort

[Animation, slides 4 to 11: selection sort applied to an array of ten integers]

Initial array: 34 10 56 26 4 82 76 18 60 40

After the first scan and swap: 4 10 56 26 34 82 76 18 60 40

Sorted array: 4 10 18 26 34 40 56 60 76 82

DCS – SWC 12

Selection sort

• How fast is selection sort?

• We scan for the smallest element n times
    – In scan 1, we examine n elements
    – In scan 2, we examine (n-1) elements
    – …and so on

• A total of n + (n-1) + (n-2) + … + 2 + 1 examinations

• The sum is n(n + 1)/2
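
As a quick sanity check of the sum, the following sketch adds up the examinations scan by scan and prints the result next to the closed form. The class and method names are illustrative only.

```java
class ExaminationCount {
    // Adds up the number of elements examined in scans 1..n.
    static long examinations(int n) {
        long total = 0;
        for (int scan = 1; scan <= n; scan++) {
            total += n - scan + 1;   // scan k examines n - k + 1 elements
        }
        return total;
    }

    public static void main(String[] args) {
        int[] sizes = {2, 5, 20, 50, 200};
        for (int n : sizes) {
            long closedForm = (long) n * (n + 1) / 2;
            System.out.println(n + ": " + examinations(n) + " = " + closedForm);
        }
    }
}
```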


DCS – SWC 13

Selection sort

• The total number of examinations is equal to n(n + 1)/2 = (n^2 + n)/2

• The run-time complexity of selection sort is therefore O(n^2)

• O(n^2) grows fairly fast…


DCS – SWC 14

Selection sort

n      n^2
2      4
5      25
20     400
50     2500
200    40000


DCS – SWC 15

Merge sort

• Selection sort is conceptually very simple, but not very efficient…

• A different algorithm for sorting is merge sort

• Merge sort is an example of a divide-and-conquer algorithm

• It is also a recursive algorithm


DCS – SWC 16

Merge sort

• The principle in merge sort is to merge two already sorted arrays:

Sorted array 1: 4 10 26 34 56

Sorted array 2: 18 40 60 76 82

DCS – SWC 18

Merge sort

• Merging two sorted arrays is pretty simple, but how did the arrays get sorted…?

• Recursion to the rescue!

• Sort the two arrays simply by applying merge sort to them…

• If the array has length 1 (or 0), it is sorted


DCS – SWC 19

Merge sort

public void sort()                                    // Sort the array a
{
    if (a.length <= 1) return;                        // Base case

    int[] a1 = new int[a.length / 2];                 // Create two new
    int[] a2 = new int[a.length - a1.length];         // arrays to sort

    System.arraycopy(a, 0, a1, 0, a1.length);         // Copy data to
    System.arraycopy(a, a1.length, a2, 0, a2.length); // the new arrays

    MergeSorter ms1 = new MergeSorter(a1);            // Create two new
    MergeSorter ms2 = new MergeSorter(a2);            // sorter objects

    ms1.sort();                                       // Sort the two
    ms2.sort();                                       // new arrays

    merge(a1, a2);                                    // Merge the sorted arrays into a
}


DCS – SWC 20

Merge sort

• All that is left is the method for merging two arrays

• A little tedious, but straightforward…

• The time needed to merge two arrays is proportional to the total length of the arrays, i.e. to n

• We can now analyse the run-time complexity of merge sort
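
The slides do not show the merge method itself. The sketch below is one way it might look, assuming (as the sort() code suggests) that MergeSorter keeps the array in a field named a and that merge writes its result back into that field. The constructor, the use of Arrays.copyOfRange and the body of merge are assumptions, not the original implementation.

```java
import java.util.Arrays;

class MergeSorter {
    private final int[] a;   // the array to sort; field name taken from sort()

    MergeSorter(int[] anArray) {
        a = anArray;
    }

    void sort() {
        if (a.length <= 1) return;                              // base case
        int[] a1 = Arrays.copyOfRange(a, 0, a.length / 2);      // first half
        int[] a2 = Arrays.copyOfRange(a, a.length / 2, a.length); // second half
        new MergeSorter(a1).sort();
        new MergeSorter(a2).sort();
        merge(a1, a2);
    }

    // Assumed implementation: merges the sorted arrays a1 and a2 into a.
    private void merge(int[] a1, int[] a2) {
        int i1 = 0;   // next unused element of a1
        int i2 = 0;   // next unused element of a2
        int j = 0;    // next free position in a
        while (i1 < a1.length && i2 < a2.length) {
            if (a1[i1] <= a2[i2]) a[j++] = a1[i1++];
            else a[j++] = a2[i2++];
        }
        while (i1 < a1.length) a[j++] = a1[i1++];   // copy what is left of a1
        while (i2 < a2.length) a[j++] = a2[i2++];   // copy what is left of a2
    }

    public static void main(String[] args) {
        int[] data = {34, 10, 56, 26, 4, 82, 76, 18, 60, 40};
        new MergeSorter(data).sort();
        System.out.println(Arrays.toString(data));
    }
}
```

Every element is copied exactly once per merge, which is why the merge step takes time proportional to n.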


DCS – SWC 21

Merge sort

• Merge sort of an array of length n requires
    – Two merge sorts of arrays of length n/2
    – Merging two arrays of length n/2

• The running time T(n) then becomes:

T(n) = 2×T(n/2) + n


DCS – SWC 22

Merge sort

• If we apply the expression for T(n) m times in a row, we get

T(n) = 2^m×T(n/2^m) + mn

• If we choose m such that n = 2^m, i.e. m = log2(n), and take T(1) = 1, we get

T(n) = n×T(1) + mn = n + n×log2(n)
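
The closed form can be checked numerically. The sketch below evaluates the recurrence directly and compares it with n + n×log2(n) for powers of two, assuming T(1) = 1. The class and method names are illustrative.

```java
class MergeSortRecurrence {
    // Evaluates T(n) = 2*T(n/2) + n with T(1) = 1; meaningful for n a power of two.
    static long t(long n) {
        if (n <= 1) return 1;
        return 2 * t(n / 2) + n;
    }

    // Closed form n + n*log2(n), for n a power of two.
    static long closedForm(long n) {
        int m = 63 - Long.numberOfLeadingZeros(n);   // m = log2(n) when n = 2^m
        return n + n * m;
    }

    public static void main(String[] args) {
        for (long n = 1; n <= 1024; n *= 2) {
            System.out.println(n + ": " + t(n) + " vs " + closedForm(n));
        }
    }
}
```

For n = 8, for example, the recurrence gives T(2) = 4, T(4) = 12 and T(8) = 32, which matches 8 + 8×3.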


DCS – SWC 23

Merge sort

• The run-time complexity of merge sort is therefore O(n log(n))

• Many other sorting algorithms have this run-time complexity

• This is the fastest a sorting algorithm based on comparing elements can be; doing better is only possible under very special circumstances

• Much better than O(n^2)…


DCS – SWC 24

Merge sort

n      n log2(n)   n^2
2      2           4
5      12          25
20     86          400
50     282         2500
200    1529        40000


DCS – SWC 25

Sorting in practice

• It does matter which sorting algorithm you use…

• …but do I have to code sorting algorithms myself?

• No! You can – and should – use sorting algorithms found in the Java library


DCS – SWC 26

Sorting in practice

• Sorting an array:

import java.util.Arrays;

Car[] cars = new Car[n];
// ... fill the array with Car objects before sorting ...
Arrays.sort(cars);


DCS – SWC 27

Sorting in practice

• Sorting an arraylist:

import java.util.ArrayList;
import java.util.Collections;

ArrayList<Car> cars = new ArrayList<Car>();
Collections.sort(cars);


DCS – SWC 28

Sorting in practice

• Why not code my own sorting algorithms?

• Sorting algorithms in the Java library are better than anything you can produce…
    – Carefully debugged
    – Highly optimised
    – Used by thousands

• You cannot beat them


DCS – SWC 29

Sorting in practice

• In order to sort an array of data, we need to be able to compare the elements

• "Larger than" should make sense for the elements in the array

• Easy for numeric types (>)

• What about types we define ourselves…?


DCS – SWC 30

Sorting in practice

• If a class T implements the Comparable interface, objects of type T can be compared:

public interface Comparable<T>
{
    int compareTo(T other);
}


DCS – SWC 31

Sorting in practice

• In the interface definition, T is a type parameter

• It is used the same way as the type parameter of an arraylist

• ArrayList<Car> : an arraylist holding elements of type Car

• Comparable<Car> : a type whose objects can be compared to objects of type Car


DCS – SWC 32

Sorting in practice

• In order for the sorting algorithms to work properly, an implementation of the interface must obey these rules:

• The call a.compareTo(b) must return:
    – A negative number if a < b
    – Zero if a = b
    – A positive number if a > b


DCS – SWC 33

Sorting in practice

• The implementation of compareTo must define a so-called total ordering:
    – Antisymmetric: If a.compareTo(b) ≤ 0, then b.compareTo(a) ≥ 0
    – Reflexive: a.compareTo(a) = 0
    – Transitive: If a.compareTo(b) ≤ 0 and b.compareTo(c) ≤ 0, then a.compareTo(c) ≤ 0


DCS – SWC 34

Sorting in practice

public class Car implements Comparable<Car>
{
    ...

    // Here using weight as ordering criterion
    public int compareTo(Car other)
    {
        if (getWeight() < other.getWeight()) return -1;
        if (getWeight() == other.getWeight()) return 0;
        return 1;
    }

    ...
}
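
To show how the pieces fit together, here is a small self-contained sketch that sorts Car objects by weight with both Arrays.sort and Collections.sort. The slides elide the rest of the Car class, so the fields, the constructor, the int weight type and the wrapper class CarSortDemo are all assumptions made for illustration. Integer.compare is used as a compact equivalent of the three-way comparison above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CarSortDemo {
    // Hypothetical minimal Car; only the compareTo idea comes from the slides.
    static class Car implements Comparable<Car> {
        private final String name;
        private final int weight;   // assumed to be a whole number of kg

        Car(String name, int weight) {
            this.name = name;
            this.weight = weight;
        }

        String getName() { return name; }
        int getWeight() { return weight; }

        // Orders cars by weight: negative, zero or positive as required.
        @Override
        public int compareTo(Car other) {
            return Integer.compare(getWeight(), other.getWeight());
        }
    }

    public static void main(String[] args) {
        Car[] carArray = {
            new Car("van", 2100), new Car("mini", 1150), new Car("sedan", 1500)
        };
        Arrays.sort(carArray);                       // sorting an array
        for (Car c : carArray) System.out.println(c.getName());

        List<Car> carList = new ArrayList<Car>(Arrays.asList(carArray));
        Collections.reverse(carList);                // scramble the order again
        Collections.sort(carList);                   // sorting an arraylist
        for (Car c : carList) System.out.println(c.getName());
    }
}
```

Neither call needs to be told how to compare cars; both rely on Car implementing Comparable<Car>.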