Data Structures and Collections

FEN 2014, UCN Teknologi/act2learn


Page 1

Data Structures and Collections

• Principles
• .NET:
  – Two libraries:
    • System.Collections (deprecated)
    • System.Collections.Generic (the new one)

Page 2

Data Structures and Collections

[Diagram: an application, the interface it uses, and the classes that implement the interface]

• Application:

    class Appl {
        ...
        IDictionary dic;
        ...
        dic = new XXX();
        ...
    }

• Interface (e.g. IDictionary): the specification (ADT)
  – Choose and use an ADT, e.g. IDictionary
  – Read and write (use) specifications
• Class (Dictionary, SortedDictionary, ...): data structure and algorithms
  – Choose and use a data structure, e.g. Dictionary
  – Know about
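The split between specification and implementation shown on this slide can be sketched in C# roughly as follows. This is only an illustration: the method CountWords and the sample words are made up for the example, and the generic IDictionary<string, int> is assumed in place of the unparameterized IDictionary on the slide.

```csharp
using System;
using System.Collections.Generic;

public static class Appl
{
    // Depends only on the ADT (IDictionary), not on a concrete data structure.
    // CountWords is a hypothetical helper, not part of the original slides.
    public static int CountWords(IDictionary<string, int> dic, string[] words)
    {
        foreach (string w in words)
        {
            int n;
            dic.TryGetValue(w, out n);   // n stays 0 when the key is absent
            dic[w] = n + 1;
        }
        return dic.Count;                // number of distinct words
    }

    public static void Main()
    {
        string[] words = { "b", "a", "b" };

        // The only place where a data structure is chosen ("new XXX()").
        IDictionary<string, int> hashed = new Dictionary<string, int>();
        IDictionary<string, int> sorted = new SortedDictionary<string, int>();

        Console.WriteLine(CountWords(hashed, words));
        Console.WriteLine(CountWords(sorted, words));
    }
}
```

Swapping Dictionary for SortedDictionary changes the running time and the enumeration order, but not the code that uses the interface.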

Page 3

Overview

• Abstract data types:
  – lists/sequences
  – stack
  – queue
  – set
  – table/map/dictionary
• .NET-specific:
  – System.Collections.Generic
  – IList<>
  – ISet<>
  – IDictionary<>
• Data structures:
  – static/dynamic
  – array
  – linked list
  – trees:
    • Search trees
      – balanced
  – hashing
• Algorithms:
  – search
  – sweep
  – sorting
  – divide & conquer
  – recursion

Page 4

.NET 2: System.Collections.Generic

ICollection<T>
  – IList<T> (indexable)
    • List<T> (array-based)
  – LinkedList<T>
  – IDictionary<TKey, TValue> ((key, value) pairs)
    • Dictionary<TKey, TValue> (hash table)
    • SortedDictionary<TKey, TValue> (balanced search tree)

Page 5

Demos

• Lists
• Dictionaries
• LinkedList in C#

Page 6

How do they work?

• Array-based list
• Linked list

[Figure labels: used, Count, Free (waste)]

Page 7

Dynamic vs. Static Data Structures

• Array-Based Lists:
  – Fixed (static) size (waste of memory).
  – May be able to grow and shrink (ArrayList), but each resize copies the elements, which is very expensive in running time (O(n)).
  – Provide direct access to elements by index (O(1)).
• Linked List Implementations:
  – Use only the necessary space (grow and shrink as needed).
  – Overhead for references and memory allocation.
  – Only sequential access: access by index requires searching (expensive: O(n)).
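To make the cost of growing an array-based list concrete, here is a minimal sketch. SimpleArrayList is a hypothetical teaching class, not the real ArrayList; the start capacity of 4 and the doubling strategy are assumptions chosen for the example.

```csharp
using System;

// Hypothetical, simplified array-based list of ints (not the real ArrayList).
public class SimpleArrayList
{
    private int[] items = new int[4];   // underlying fixed-size array
    public int Count { get; private set; }
    public int Capacity { get { return items.Length; } }

    // O(1) while there is free space; O(n) when the array must grow.
    public void Add(int value)
    {
        if (Count == items.Length)
        {
            int[] bigger = new int[items.Length * 2];
            Array.Copy(items, bigger, Count);   // copies every element: O(n)
            items = bigger;
        }
        items[Count++] = value;
    }

    // Direct access by index: O(1).
    public int Get(int index)
    {
        if (index < 0 || index >= Count) throw new ArgumentOutOfRangeException("index");
        return items[index];
    }
}

public static class Program
{
    public static void Main()
    {
        SimpleArrayList list = new SimpleArrayList();
        for (int i = 0; i < 5; i++) list.Add(i * 10);   // the fifth Add triggers one resize
        Console.WriteLine(list.Get(4));
        Console.WriteLine(list.Capacity);
    }
}
```

The slots between Count and Capacity are the "free (waste)" part of the array; a linked list avoids that waste but gives up the O(1) Get.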

Page 8

Hashing

• Keys are converted to indices in an array.
• A hash function, h, maps a key to an integer, the hash code.
• The hash code is divided by the array size, and the remainder is used as the index.
• If two or more keys give the same index, we have a collision.

Page 9

Chaining

• The array doesn't hold the element itself, but a reference to a collection (a linked list, for instance) of all colliding elements.
• On search, that list must be traversed.
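Hashing with chaining can be sketched as below. ChainedTable and its members are hypothetical names made up for illustration; this is not how Dictionary<TKey, TValue> is implemented. The sign-bit mask is an added assumption, needed because GetHashCode may return a negative number.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical teaching class: string keys, int values, fixed table size.
public class ChainedTable
{
    private readonly LinkedList<KeyValuePair<string, int>>[] buckets;

    public ChainedTable(int size)
    {
        buckets = new LinkedList<KeyValuePair<string, int>>[size];
    }

    // Hash code -> index: the remainder after division by the array size.
    public int IndexOf(string key)
    {
        return (key.GetHashCode() & 0x7FFFFFFF) % buckets.Length;
    }

    public void Put(string key, int value)
    {
        int i = IndexOf(key);
        if (buckets[i] == null) buckets[i] = new LinkedList<KeyValuePair<string, int>>();
        // Traverse the chain: replace the entry if the key is already present.
        for (var node = buckets[i].First; node != null; node = node.Next)
        {
            if (node.Value.Key == key)
            {
                node.Value = new KeyValuePair<string, int>(key, value);
                return;
            }
        }
        buckets[i].AddLast(new KeyValuePair<string, int>(key, value));
    }

    public bool TryGet(string key, out int value)
    {
        var chain = buckets[IndexOf(key)];
        if (chain != null)
        {
            foreach (var pair in chain)   // colliding elements share this chain
            {
                if (pair.Key == key) { value = pair.Value; return true; }
            }
        }
        value = 0;
        return false;
    }
}

public static class Program
{
    public static void Main()
    {
        ChainedTable ages = new ChainedTable(2);   // 3 keys in 2 slots: at least one collision
        ages.Put("Ann", 27);
        ages.Put("Bob", 32);
        ages.Put("Tom", 15);
        ages.Put("Ann", 28);                       // update

        int a;
        if (ages.TryGet("Ann", out a)) Console.WriteLine(a);
    }
}
```

With only two slots, at least two of the three keys must land in the same chain, so the lookup has to traverse a list exactly as the slide describes.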

Page 10

Efficiency of Hashing

• Worst case (maximum collisions):
  – retrieve, insert, delete all O(n)
• Average number of collisions depends on the load factor, λ, not on table size:
    λ = (number of used entries)/(table size)
  – But not on n.
• Typically (linear probing):
    numberOfCollisions_avg = 1/(1 - λ)
• Example: 75% of the table entries in use:
  – λ = 0.75: 1/(1 - 0.75) = 4 collisions on average (independent of the table size).

Page 11

When Hashing Is Inefficient

• Traversing in key order.
• Finding the smallest/largest key.
• Range search (find all keys between low and high).
• Searching on something other than the designated primary key.

Page 12

(Binary) Search Trees

• Value-based container:
  – The search tree property:
    • For any internal node: the value is greater than the values in the left subtree
    • For any internal node: the value is less than the values in the right subtree
  – Note the recursive nature of this definition:
    • It implies that all subtrees themselves are search trees
    • Every operation must ensure that the search tree property is maintained

Page 13

Example: A Binary Search Tree Holding Names

Page 14

InOrder: Traversal Visits Nodes in Sorted Order
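A minimal sketch of a search tree holding names, with insertion and in-order traversal, might look like this. NameTree is a hypothetical, unbalanced teaching class; the names and the use of ordinal string comparison are assumptions for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal, unbalanced binary search tree of strings.
public class NameTree
{
    private class Node
    {
        public string Value;
        public Node Left, Right;
        public Node(string value) { Value = value; }
    }

    private Node root;

    public void Insert(string value) { root = Insert(root, value); }

    // Smaller values go left, larger values go right; duplicates are ignored.
    // This keeps the search tree property intact after every insertion.
    private static Node Insert(Node node, string value)
    {
        if (node == null) return new Node(value);
        int cmp = string.CompareOrdinal(value, node.Value);
        if (cmp < 0) node.Left = Insert(node.Left, value);
        else if (cmp > 0) node.Right = Insert(node.Right, value);
        return node;
    }

    // In-order: left subtree, then the node itself, then the right subtree.
    public List<string> InOrder()
    {
        List<string> result = new List<string>();
        InOrder(root, result);
        return result;
    }

    private static void InOrder(Node node, List<string> result)
    {
        if (node == null) return;
        InOrder(node.Left, result);
        result.Add(node.Value);
        InOrder(node.Right, result);
    }
}

public static class Program
{
    public static void Main()
    {
        NameTree tree = new NameTree();
        foreach (string name in new[] { "Karen", "Bob", "Tom", "Ann", "Lisa" })
            tree.Insert(name);
        Console.WriteLine(string.Join(", ", tree.InOrder()));
    }
}
```

Because every left subtree is visited before its node and every right subtree after it, the names come out in sorted order regardless of the insertion order.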

Page 15

Efficiency

• insert
• retrieve
• delete
  – All operations depend on the depth of the tree
  – If balanced: O(log n)
• Most libraries use a balanced version, for instance Red-Black Trees, which guarantee O(log n) search, insert and delete.
• Easy to traverse in key order.

demos\Collections
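As a small illustration of key-order traversal with a library search tree, the sketch below uses SortedDictionary<TKey, TValue>. The helper KeysInOrder and the sample data are made up for the example, and StringComparer.Ordinal is an assumption chosen so the ordering does not depend on the current culture.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    // Returns the keys in the order the dictionary enumerates them.
    // KeysInOrder is a hypothetical helper for this example.
    public static List<string> KeysInOrder(IDictionary<string, int> dic)
    {
        return dic.Keys.ToList();
    }

    public static void Main()
    {
        var ages = new SortedDictionary<string, int>(StringComparer.Ordinal);
        ages["Tom"] = 15;
        ages["Ann"] = 27;
        ages["Bob"] = 32;

        // SortedDictionary enumerates its entries in key order.
        Console.WriteLine(string.Join(", ", KeysInOrder(ages)));

        // Smallest and largest key are the first and last keys in that order.
        // (Last() walks the whole sequence here; it is used only for brevity.)
        Console.WriteLine(ages.Keys.First() + " .. " + ages.Keys.Last());
    }
}
```

A hash-based Dictionary<string, int> would accept the same calls, but its enumeration order is not specified, which is why key-order traversal is listed among the things hashing does poorly.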

Page 16

Collections Library: System.Collections

• Data structures in .NET are normally called Collections
• Are found in namespace System.Collections
• Compiled into mscorlib.dll assembly
• Uses object and polymorphism for generic containers.
• Deprecated!
• Classes:
  – Array
  – ArrayList
  – Hashtable
  – Stack
  – Queue

WARNING: Deprecated!

Page 17

Collection Interfaces

• The classes in System.Collections implement a range of different interfaces in order to provide standard usage of different containers
  – Classes that implement the same interface provide the same services
  – Makes it easier to learn and to use the library
  – Makes it possible to write generic code towards the interface
• Interfaces:
  – ICollection
  – IEnumerable
  – IEnumerator
  – IList
  – IComparer
  – IComparable

Page 18

ArrayList

• ArrayList stores sequences of elements.
  – duplicate values are ok
  – position- (index-) based
  – Elements are stored in a resizable array.
  – Implements the IList interface

    public class ArrayList : IList, IEnumerable, ...
    {
        // IList services
        ...

        // additional services

        // control of memory in underlying array
        int Capacity { get... set... }
        void TrimToSize()

        // searching
        int BinarySearch(object value)
        int IndexOf(object value, int startIndex)
        int LastIndexOf(object value, int startIndex)
        ...
    }
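A short usage sketch of the additional ArrayList services listed above. The helper BuildSorted and the sample values are made up for illustration; the single-argument IndexOf overload is used instead of the two-argument one shown on the slide.

```csharp
using System;
using System.Collections;

public static class Program
{
    // BuildSorted is a hypothetical helper for this example.
    public static ArrayList BuildSorted()
    {
        ArrayList list = new ArrayList();
        list.Add(30);         // ints are boxed to object
        list.Add(10);
        list.Add(20);
        list.Sort();          // BinarySearch requires a sorted list
        list.TrimToSize();    // shrink the underlying array to Count
        return list;
    }

    public static void Main()
    {
        ArrayList list = BuildSorted();
        Console.WriteLine(list.Capacity);           // size of the underlying array
        Console.WriteLine(list.BinarySearch(20));   // index found by binary search
        Console.WriteLine(list.IndexOf(30));        // index found by linear search
    }
}
```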

Page 19

IList Interface

• IList defines sequences of elements
  – Access through index

    public interface IList : ICollection
    {
        // add new elements
        int Add(object value);
        void Insert(int index, object value);

        // remove
        void Remove(object value);
        void RemoveAt(int index);
        void Clear();

        // containment testing
        bool Contains(object value);
        int IndexOf(object value);

        // read/write existing element (see comment)
        object this[int index] { get; set; }

        // structural properties
        bool IsReadOnly { get; }
        bool IsFixedSize { get; }
    }

Page 20

Hashtable

• Hashtable supports collections of key/value pairs
  – keys must be unique; values hold any data
  – stores object references for key and value
  – the GetHashCode method on the key determines the position in the table.

    // create
    Hashtable ages = new Hashtable();

    // add
    ages["Ann"] = 27;
    ages["Bob"] = 32;
    ages.Add("Tom", 15);

    // update
    ages["Ann"] = 28;

    // retrieve
    int a = (int) ages["Ann"];

Page 21

Hashtable Traversal

• Traversal of Hashtable
  – each element is of type DictionaryEntry (struct)
  – data is accessed using the Key and Value properties

    Hashtable ages = new Hashtable();

    ages["Ann"] = 27;
    ages["Bob"] = 32;
    ages["Tom"] = 15;

    // enumerate entries
    foreach (DictionaryEntry entry in ages)
    {
        // get key and value
        string name = (string) entry.Key;
        int age = (int) entry.Value;
        ...
    }
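For comparison, the same traversal with the generic Dictionary<TKey, TValue> could look roughly like this. SumOfAges is a hypothetical helper added so the example has something to compute; the data is taken from the slide.

```csharp
using System;
using System.Collections.Generic;

public static class Program
{
    // SumOfAges is a hypothetical helper for this example.
    public static int SumOfAges(Dictionary<string, int> ages)
    {
        int sum = 0;
        // Each element is a KeyValuePair<string, int>; no casts are needed.
        foreach (KeyValuePair<string, int> entry in ages)
        {
            string name = entry.Key;
            int age = entry.Value;
            Console.WriteLine(name + ": " + age);
            sum += age;
        }
        return sum;
    }

    public static void Main()
    {
        Dictionary<string, int> ages = new Dictionary<string, int>();
        ages["Ann"] = 27;
        ages["Bob"] = 32;
        ages["Tom"] = 15;
        Console.WriteLine(SumOfAges(ages));
    }
}
```

Key and Value are already typed as string and int here, so the casts from the Hashtable version disappear and the int values are not boxed.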

Page 22

"Generic" Programming in C#/Java
(as it was until Summer 2005, and you still see it, also in other languages)

• All classes inherit from Object
• So we can apply polymorphism and use Object as static type for elements in containers
• For instance: Object[] data
  – this array may take any object as element
  – This approach is well known from standard collections such as ArrayList, Hashtable etc.

Page 23

Pros and Cons

• Pros
  – heterogeneous collections (Is this really an advantage?)
  – ...
• Cons
  – many type casts
  – not type safe
    • type checking is done at runtime when casting
  – int and other native (value) types must be wrapped (boxing, which costs runtime overhead)

Page 24

The Idea: Types as Parameters

C# before 2005:

    ArrayList al = new ArrayList();
    Customer c = (Customer) al[i]; // cast

Instead we want something like:

    List<Customer> al = new List<Customer>(); // Customer is the type parameter
    Customer c = al[i];

– The compiler is able to check that only objects with static type Customer are placed in al
– So the compiler knows that everything that may come out of al has static type Customer
– So static type checking instead of dynamic type checking is possible
– Dynamic casting can be avoided (but is not in all implementations)

Page 25

In C#: EmpSeqAppl

    Employee a1 = new Employee("Joe", "Programmer", 10000);
    Employee a = new Employee("Curt", "Senior Programmer", 20000);
    Employee b = new Employee("Carl", "Programmer", 10000);
    Employee c = new Employee("Karen", "System Programmer", 13000);
    Employee d = new Employee("Lisa", "Programmer", 11000);
    Employee e = new Employee("John", "System Engineer", 9000);
    string s = "HELLOOOO!";

    ArrayList emps = new ArrayList();
    //IList<Employee> emps = new List<Employee>();

    emps.Add(a1);
    emps.Add(a);
    emps.Add(b);
    emps.Add(c);
    emps.Add(d);
    emps.Add(e);
    emps.Add(s); //no errors
    //emps.Add(s); //COMPILER ERROR!!!!
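The difference between the two variants shows up when the elements are read back. The sketch below is a self-contained approximation: the Employee class here is a hypothetical one-field stand-in for the class used on the slide, and CastFails is a helper invented for the example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-in for the Employee class used on the slide.
public class Employee
{
    public string Name;
    public Employee(string name) { Name = name; }
}

public static class Program
{
    // Returns true if reading the list back as Employee objects fails at runtime.
    public static bool CastFails(ArrayList emps)
    {
        try
        {
            foreach (object o in emps)
            {
                Employee e = (Employee) o;   // the dynamic type check happens here
                Console.WriteLine(e.Name);
            }
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        ArrayList emps = new ArrayList();
        emps.Add(new Employee("Joe"));
        emps.Add("HELLOOOO!");               // compiles: any object is accepted
        Console.WriteLine(CastFails(emps));

        List<Employee> typed = new List<Employee>();
        typed.Add(new Employee("Joe"));
        // typed.Add("HELLOOOO!");           // would not compile
        Employee first = typed[0];           // no cast needed
        Console.WriteLine(first.Name);
    }
}
```

With ArrayList the mistake surfaces only when the string is cast to Employee at runtime; with List<Employee> the compiler rejects the Add call.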