
Linked Lists

CS 1037 Fundamentals of Computer Science II

2

You Will Learn...

A. Singly Linked List data structure

B. Iterators (and inner classes in C++)

C. Doubly Linked Lists

linkedlist<double> x;
x.push_front(3.1416);
x.push_front(2.7183);

for (linkedlist<double>::iterator i = x.begin(); i != x.end(); ++i) { ... }

x.erase(x.begin()+1);
x.push_back(3.1416);

3

Part A. Singly Linked Lists

[Figure: a chain of 'data' nodes, each linked to the next]

4

Is dynarray All We Need?

• dynarray vs built-in C++ arrays...
  – easy to manipulate via insert/erase/resize
  – remembers its own size
  – automatically deletes its items when destructed
  – does range checking to catch bugs

• Limitations...
  1. insert/erase is only fast at the back of the array; on an array of size n, erase(i) can take up to cn time
  2. push_back is only fast on average; a single push_back can take cn time!

• Unacceptable in many real-world situations
  – e.g. need to insert/erase in the middle of a huge list

5

Linked Lists

• Fundamental data structure

• Does not keep items at consecutive memory addresses
  – dynarray can calculate the address of any item i from the address of item 0
  – a linked list node explicitly stores the address of the next item!

[Figure: a list 'node' holds an item and a 'next' pointer to the following node]

6

Linked Lists

• Advantages:
  1. erase(node*) always takes c units of time: just 'unlink' the node and then delete it!
  2. insert(node*) also takes just c time: just create the node and 'link' it into the list!
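The 'link' and 'unlink' steps can be sketched for a NULL-terminated singly linked list as below. The function names insert_after and erase_after are hypothetical (they are not from the slides); both work relative to the node before the affected position, because a singly linked node cannot reach its predecessor.

```cpp
#include <cassert>

struct node {
    double item;
    node*  next;
};

// Create a node holding 'item' and link it in right after 'pos'.
// Only two pointers change, so the cost does not depend on list length.
node* insert_after(node* pos, double item) {
    node* n = new node;
    n->item = item;
    n->next = pos->next;
    pos->next = n;
    return n;
}

// Unlink the node that follows 'pos' and delete it; also constant time.
void erase_after(node* pos) {
    node* victim = pos->next;
    assert(victim != nullptr);   // there must be a node to erase
    pos->next = victim->next;
    delete victim;
}
```

Neither function loops, which is where the "c units of time" claim comes from.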

7

Linked Lists

• Disadvantages:
  1. No random access; to get the value of item i, need to traverse items 0..i-1
     (you want a pointer to the 3rd item... but you only have a pointer to the 1st item)
  2. Uses extra memory per item; each item needs to remember the address of the next item
     – pointer overhead per node: >= 4 bytes (x86) or 8 bytes (x64), on top of the useful data

Q: how many bytes of overhead for a linked list of char on a 32-bit architecture?
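Both disadvantages can be made concrete with a small sketch. The names char_node, nth and overhead_per_node are hypothetical; the overhead figure is left to the compiler because alignment padding, not just the pointer size, decides it.

```cpp
#include <cassert>
#include <cstddef>

struct char_node {
    char       item;
    char_node* next;
};

// No random access: reaching item 'index' means following 'index' links.
// Returns nullptr if the list has fewer than index+1 nodes.
char_node* nth(char_node* front, std::size_t index) {
    char_node* n = front;
    while (n != nullptr && index > 0) {
        n = n->next;
        --index;
    }
    return n;
}

// Bytes in each node that are not the one-byte payload
// (the pointer plus whatever padding the compiler inserts).
std::size_t overhead_per_node() {
    return sizeof(char_node) - sizeof(char);
}
```

Printing overhead_per_node() on the target machine is one way to check an answer to the question above.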

8

Exercise in Visual Studio

See Snippet #1

9

C-style Linked List

• Example of NULL-terminated linked list

struct node {
    double item;
    node*  next;
};

node* front = 0;   // empty list

int main() {
    // code to prepend an item
    node* n = new node;
    n->item = 3.1416;
    n->next = front;
    front = n;
}

[Figure: before, front holds 0; after, front points to a node with item 3.1416 and next 0]

10

Many Design Questions

1. Singly linked or doubly linked?

2. Circularly linked or NULL terminated?

[Figure: a doubly linked node (item, next, prev) beside a singly linked node (item, next), each list reached through front]

3. Remember size or calculate size?

   Remember (size++ when an item is added to the list):
       node* front = 0;
       int size = 0;

   Calculate:
       int size = 0;
       for (node* n = front; n; n = n->next)
           size++;

11

Singly Linked or Doubly Linked?

• Advantages of doubly linked:
  1. can traverse list forward or backward
  2. insert/erase more natural

• Disadvantages of doubly linked:
  1. speed/memory overhead
  2. slightly harder to implement

void erase_next(node* n);
versus
void erase(node* n);

[Figure: two lists with a node n marked; one node is labelled "we skip this!"]

12

What is “Circularly Linked”?

• End signified by arriving back at front
  – Q: how can the last node's next point at front when front is not a node?
  – A: in C/C++, "pointer casting" trick!

struct node {
    node*  next;   // *must* be first member!
    double item;
};

node* front = (node*)&front;   // empty list: front points at itself (not actually a node!)

if (front == (node*)&front)
    cout << "list is empty";

[Figure: front -> 1st -> 2nd -> 3rd -> back to front]

No actual data is stored in front! Accessing front->item could crash for this state!!

13

Circularly Linked or NULL Terminated?

• Advantages of circular
  – insert/erase code is shorter and more straightforward
  – insert/erase thereby more efficient
  – consistent with design of STL iterators (soon)
  – you learn how STL linked lists actually work, yay!

• Disadvantages of circular
  – not as natural for beginners
  – need to know address of terminator (fake node)

NULL terminated:     while (n != 0) n = n->next;
Circularly linked:   while (n != (node*)&front) n = n->next;

14

Remember Size or Calculate Size?

• Advantage of remembering size:
  – querying size takes c units of time instead of cn

• Disadvantages:
  – memory to store size in each list (4 bytes)
  – adds c units of time to each insert/erase

• In our data structures we calculate size
  – before C++11, library vendors were allowed to implement std::list::size() either way; since C++11 it must take constant time
  – empty() takes constant time either way, so prefer it when testing for an empty list

if (list.size() == 0) cout << "empty list";
vs
if (list.empty()) cout << "empty list";

15

Circular Singly Linked List in C++

• How do we want to use a linked list? ...

int main() {
    linkedlist_of_double numbers;
    numbers.push_front(3.1416);
    numbers.push_front(2.7183);
    numbers.push_front(1.6180);
}

• How might this look in memory?

[Figure: on the call stack, main's variable 'numbers' holds m_front; on the heap, the nodes 1.6180 -> 2.7183 -> 3.1416 are chained through next]

16

linkedlist Interface Version 1 (for double)

struct node {
    node*  next;
    double item;
};

class linkedlist_of_double {
public:
    linkedlist_of_double();            // sets up empty list
    linkedlist_of_double(int size);    // sets up non-empty list

    int  size();                       // return current size
    bool empty();                      // test if size==0
    void clear();                      // reset to empty list

    void push_front(double item);      // prepend a copy of 'item'
    void pop_front();                  // erase first item

    node* begin();                     // ptr to first node
    node* end();                       // ptr to 'terminator' node
    ...

17

linkedlist Private Members (for double)

• Only one data member!
  – empty list takes only 4 bytes (if 32-bit pointers)
  – terminator() is just shorthand for &m_front

    ...
private:
    node* terminator();   // ptr to 'terminator' node
    node* m_front;        // ptr to 1st node in list
                          // (if empty, 1st node is terminator)
};

node* linkedlist_of_double::terminator()
{
    node* n = (node*)&m_front;   // cast node** as a node*
    return n;                    // pretend '&m_front' points to a node!
}

18

linkedlist Constructors (for double)

linkedlist_of_double::linkedlist_of_double()
{
    m_front = terminator();   // terminate immediately
}

linkedlist_of_double::linkedlist_of_double(int size)
{
    m_front = terminator();
    for (int i = 0; i < size; ++i)
        push_front(0.0);
}

[Figure: empty list, where m_front points at the terminator; pre-sized list, where m_front leads through three nodes holding 0.0]

19

linkedlist Operations (for double)

• We only define push/pop_front (we wait until doubly linked list for insert/erase)

void linkedlist_of_double::push_front(double item)
{
    node* n = new node;
    n->item = item;
    n->next = m_front;   // link node into
    m_front = n;         // front of list
}

void linkedlist_of_double::pop_front()
{
    assert(!empty());
    node* n = m_front;
    m_front = m_front->next;   // unlink front
    delete n;
}

[Figure: m_front and the chain of data nodes it leads to]

20

linkedlist Traversal (for double)

• Need begin/end to traverse circular list

• This is how standard (STL) lists work in C++

node* linkedlist_of_double::begin() { return m_front; }
node* linkedlist_of_double::end()   { return terminator(); }

for (node* n = list.begin(); n != list.end(); n = n->next)
    n->item = 0.0;

[Figure: three snapshots of the loop. Before the first iteration the items are 1.62, 2.72, 3.14. Before the last iteration they are 0.0, 0.0, 3.14 with n at the last node. Afterwards all items are 0.0 and n == list.end(), so the loop stops.]

21

linkedlist Queries (for double)

• In CS1037 our linked list calculates size...

int linkedlist_of_double::size()
{
    int size = 0;
    for (node* n = begin(); n != end(); n = n->next)
        size++;   // count the nodes
    return size;
}

bool linkedlist_of_double::empty()
{
    return m_front == terminator();   // faster than size() == 0
}
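Pulling the Version 1 pieces together, a minimal compilable sketch might look like this. The member bodies follow the slides; the destructor and the disabled copy operations are assumptions added here (the slides do not show them), included so the sketch neither leaks nodes nor copies a terminator address that belongs to another list. The sized constructor is left out for brevity.

```cpp
#include <cassert>

struct node {
    node*  next;   // must stay the first member so the terminator cast lines up
    double item;
};

class linkedlist_of_double {
public:
    linkedlist_of_double() { m_front = terminator(); }
    ~linkedlist_of_double() { clear(); }                                     // assumption
    linkedlist_of_double(const linkedlist_of_double&) = delete;             // assumption
    linkedlist_of_double& operator=(const linkedlist_of_double&) = delete;  // assumption

    bool empty() { return m_front == terminator(); }

    int size() {
        int count = 0;
        for (node* n = begin(); n != end(); n = n->next)
            count++;                       // count the nodes
        return count;
    }

    void clear() { while (!empty()) pop_front(); }

    void push_front(double item) {
        node* n = new node;
        n->item = item;
        n->next = m_front;                 // link node into front of list
        m_front = n;
    }

    void pop_front() {
        assert(!empty());
        node* n = m_front;
        m_front = m_front->next;           // unlink front
        delete n;
    }

    node* begin() { return m_front; }
    node* end()   { return terminator(); }

private:
    // The terminator is only ever compared against, never dereferenced.
    node* terminator() { return (node*)&m_front; }
    node* m_front;
};
```

In this singly linked version the fake node's address is used purely as an end marker, which is why the cast is harmless here.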

22

Exercise in Visual Studio

See Snippet #2

23

Problem #1 with Version 1 Interface

1. Each type of list needs its own node type, so we need node_double, node_int, etc?

struct node_double {
    node_double* next;
    double       item;
};

struct node_int {
    node_int* next;
    int       item;
};

Solution: make node an inner class

class linkedlist_of_double {
    ...
    struct node {        // defines 'node' type
        node*  next;     // linkedlist_of_double::node
        double item;
    };
    ...

24

Inner Classes in C++

• In C++, types can be defined inside any scope, incl. namespaces / class definitions

Separate types:

struct A {
    int a;
};

struct B {
    int b;
};

int main() {
    A x;   // 4 bytes
    B y;   // 4 bytes
}

Nested type:

struct A {
    int a;

    struct B {
        int b;
    };
};

int main() {
    A    x;   // 4 bytes
    A::B y;   // 4 bytes
}

Both programs generate identical machine code!

25

Inner Classes in C++

• Same type name used in different contexts

struct A {
    int a;
    struct B { int foofoo; };
};

struct C {
    double c;
    struct B { char blah; };
};

int main() {
    A    x;   // 4 bytes
    A::B y;   // 4 bytes
    C    z;   // 8 bytes
    C::B w;   // 1 byte
}

Q: Does this compile in C++?
A: Yes! A::B and C::B are totally independent types

e.g.
linkedlist_of_int::node    n1;   // 8 bytes
linkedlist_of_double::node n2;   // 16 bytes (why?)

26

Inner Classes in C++

• If B is an inner class of A, then code defining A can just say "B" instead of saying "A::B"

struct A {
    struct B {
        int value;
    };

    B b;   // compiler assumes B must mean A::B
    void print() { cout << b.value; }
};

int main() {
    A a;
    a.print();   // prints a.b.value
}
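The inner-class rules from the last three slides can be checked by the compiler itself. The sketch below assumes a C++11 compiler for static_assert and std::is_same; the member function twice() is hypothetical and only exists to show the unqualified use of B inside A.

```cpp
#include <cassert>
#include <type_traits>

struct A {
    struct B { int value; };
    B b;                                        // inside A, 'B' means A::B
    int twice() const { return 2 * b.value; }
};

struct C {
    struct B { char blah; };                    // unrelated to A::B
};

// Same spelling, different scopes: the compiler treats these as distinct types.
static_assert(!std::is_same<A::B, C::B>::value,
              "A::B and C::B are independent types");

// Declaring a type inside A adds no data to A; only the member 'b' takes space.
static_assert(sizeof(A) == sizeof(A::B),
              "A stores exactly one A::B");
```

Outside of A, the nested type has to be written with its full name, for example A::B y;.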

27

Problem #2 with Version 1 Interface

2. Different code to traverse linkedlist than to traverse dynarray

for (node* n = list.begin(); n != list.end(); n = n->next)   // careful not to put ++n !!
    n->item = 0;

for (int i = 0; i < array.size(); ++i)
    array[i] = 0;

• Can’t switch data structure for an object without rewriting all code that touches it
  – suppose we regret choosing dynarray somewhere!

• Can’t write generic code (won’t compile!)

28

Problem #2 with Version 1 Interface

Solution: Use iterator abstraction!

• Unified way to iterate items in containers
• Can then write generic code too!

for (linkedlist<int>::iterator i = list.begin(); i != list.end(); ++i)
    *i = 0;

for (dynarray<int>::iterator i = array.begin(); i != array.end(); ++i)
    *i = 0;

template <typename T>
void set_to_zero(T& values) {
    for (typename T::iterator i = values.begin(); i != values.end(); ++i)
        *i = 0;
}   // T can be dynarray<int>, linkedlist<int>,...
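To try the generic function without the course containers, the standard ones can stand in: std::vector for dynarray and std::list for linkedlist (an assumption made only for this sketch). Note the typename keyword, which the compiler needs in order to treat T::iterator as a type inside a template.

```cpp
#include <cassert>
#include <list>
#include <vector>

// Works for any container that offers begin(), end() and a nested 'iterator'
// type whose items can be assigned 0.
template <typename T>
void set_to_zero(T& values) {
    for (typename T::iterator i = values.begin(); i != values.end(); ++i)
        *i = 0;
}
```

The same function body compiles for an array-backed container and a node-backed one because it never mentions indices or next pointers.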

29

Part B. Iterators

for (dynarray<int>::iterator i = array.begin(); i != array.end(); ++i) *i = 0;

30

Iterators in C++

• To “iterate” means to repeat
• An iterator object remembers where you are along a sequence of items/values.
• You can ask an iterator...
  – for access to current item
  – to advance to next item (or previous item)

• Code to “advance to next item” is totally different for different data structures...

...hence the need for abstraction!

31

Iterators in C++

• Lab 6 asks you to modify simple iterator

• What is output of this program?

class simple_iterator {
public:
    simple_iterator(int start) { m_value = start; }
    void operator++() { m_value += 2; }
    operator int() { return m_value; }
private:
    int m_value;
};

int main() {
    for (simple_iterator i(0); i <= 6; ++i)
        cout << i << endl;
}

32

linkedlist Iterator (for double)

• We know linked list iterator i should...
  – keep internal pointer to a node
  – when ++i happens, should advance to next node
  – when *i happens, should return reference to item

class iterator {
public:
    double& operator*()         { return m_node->item; }
    void operator++()           { m_node = m_node->next; }
    bool operator==(iterator j) { return m_node == j.m_node; }
    bool operator!=(iterator j) { return m_node != j.m_node; }
private:
    iterator(node* n) { m_node = n; }
    node* m_node;
};

33

linkedlist Interface Version 2

• Used information hiding to hide the internal node type behind iterator abstraction!

class linkedlist_of_double {
    ...
    class iterator {
        ...   // previous slide
    };

    // now begin/end should return an iterator
    iterator begin() { return m_front; }
    iterator end()   { return terminator(); }

private:
    struct node {   // node is now private!
        node*  next;
        double item;
    };

34

Real Life™ Example of Linked List

• Example from Chrome source code

struct IOItem {
    IOHandler* handler;
    IOContext* context;
    DWORD bytes_transfered;
    DWORD error;
};

// This list will be empty almost always. It stores IO
// completions that have not been delivered yet.
std::list<IOItem> completed_io_;

std::list is the C++ standard version of a doubly linked list

35

Iterator Abstraction

• Standard C++ containers (STL) have them

vector<int>::iterator     iter1;   // for dynamic array
list<double>::iterator    iter2;   // for doubly linked list
map<string,int>::iterator iter3;   // for associative array
...

• Note: Programmers complained about loops being too verbose, so the C++11 standard repurposed the auto keyword to deduce a variable's type from its initializer...

for (auto i = list.begin(); i != list.end(); ++i)
    *i = 0;
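For comparison, here is the same loop written three ways over a std::list<double>. The function names are hypothetical; the second and third forms assume a C++11 compiler. The range-based for loop is not mentioned on the slide, but it is the usual next step once auto is available.

```cpp
#include <cassert>
#include <list>

// Explicit iterator type, as on the earlier slides.
void zero_verbose(std::list<double>& values) {
    for (std::list<double>::iterator i = values.begin(); i != values.end(); ++i)
        *i = 0;
}

// Same loop; 'auto' lets the compiler deduce the iterator type.
void zero_auto(std::list<double>& values) {
    for (auto i = values.begin(); i != values.end(); ++i)
        *i = 0;
}

// Range-based for calls begin()/end() behind the scenes.
void zero_range(std::list<double>& values) {
    for (double& item : values)
        item = 0;
}
```

All three rely on the same begin()/end() iterator interface.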

36

Exercise in Visual Studio

See Snippet #3

37

Real Life™ Example of Iterators

• Example from Chrome source code

typedef std::vector<AutocompleteMatch> ACMatches;

// All matches from all providers for a particular query.
// This also tracks what the default match should be
// if the user doesn't manually select another match.
class AutocompleteResult {
public:
    typedef ACMatches::const_iterator const_iterator;
    typedef ACMatches::iterator iterator;

    void AddMatch(const AutocompleteMatch& match);

std::vector has iterators too!

38

Dynamic Array Iterators

• dynarray can have iterators too...

for (dynarray<int>::iterator i = array.begin(); i != array.end(); ++i)
    *i = 0;

• Hint: remember this from Review?

for (int i = 0; i < 5; ++i)       // clear :)
    v[i] = 0;

for (int i = 0; i < 5; ++i)       // less clear :\
    *(v+i) = 0;

for (int* p = v; p < v+5; ++p)    // hard to read :(
    *p = 0;                       // ...but this works just like an iterator!

39

dynarray Interface Version 5

• Raw pointer works just fine as iterator!

template <typename T>
class dynarray {
public:
    // pointer to item works just fine as an 'iterator'!
    typedef T* iterator;

    iterator begin() { return &m_items[0]; }
    iterator end()   { return &m_items[m_size]; }
    ...

for (dynarray<int>::iterator i = array.begin(); i != array.end(); ++i)
    *i = 0;

• Pointer would not work as iterator for linked lists or most other structures.
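A pared-down stand-in shows why a raw pointer is enough for an array. The class tiny_array and the helper fill_with are hypothetical and far simpler than the course's dynarray (fixed size, no copying); only the iterator typedef and begin/end mirror the slide.

```cpp
#include <cassert>
#include <cstddef>

template <typename T>
class tiny_array {
public:
    typedef T* iterator;   // a pointer already supports *, ++, == and !=

    explicit tiny_array(std::size_t size)
        : m_items(new T[size]()), m_size(size) {}   // items are value-initialized
    ~tiny_array() { delete[] m_items; }
    tiny_array(const tiny_array&) = delete;
    tiny_array& operator=(const tiny_array&) = delete;

    T& operator[](std::size_t i) { assert(i < m_size); return m_items[i]; }
    std::size_t size() const { return m_size; }

    iterator begin() { return m_items; }            // address of first item
    iterator end()   { return m_items + m_size; }   // one past the last item

private:
    T*          m_items;
    std::size_t m_size;
};

// Generic loop written purely against the iterator interface.
template <typename C>
void fill_with(C& values, int x) {
    for (typename C::iterator i = values.begin(); i != values.end(); ++i)
        *i = x;
}
```

Because the items sit at consecutive addresses, ++ on the pointer lands on the next item, which is exactly what fails for a linked list.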

40

linkedlist Interface Version 3

• Template class similar to version 2

template <typename T>
class linkedlist {
public:
    void push_front(T item);
    void pop_front();

    class iterator {
    public:
        void operator++();
        T& operator*();
        ...

linkedlist<string> list;
list.push_front("Christine O'Donnell");

for (linkedlist<string>::iterator i = list.begin(); i != list.end(); ++i)
    cout << *i << " is a witch";
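One way the Version 3 template could be filled in is sketched below. The friend declaration, destructor and deleted copy operations are assumptions (the slides elide those parts), and join() is a hypothetical helper that shows generic code touching the list only through iterators.

```cpp
#include <cassert>
#include <string>

template <typename T>
class linkedlist {
    struct node {
        node* next;   // first member, so the terminator cast lines up
        T     item;
    };

public:
    class iterator {
    public:
        T&   operator*()  { return m_node->item; }
        void operator++() { m_node = m_node->next; }
        bool operator==(iterator j) const { return m_node == j.m_node; }
        bool operator!=(iterator j) const { return m_node != j.m_node; }
    private:
        friend class linkedlist;           // assumption: lets the list build iterators
        iterator(node* n) : m_node(n) {}
        node* m_node;
    };

    linkedlist() { m_front = terminator(); }
    ~linkedlist() { while (!empty()) pop_front(); }
    linkedlist(const linkedlist&) = delete;
    linkedlist& operator=(const linkedlist&) = delete;

    bool empty() { return m_front == terminator(); }

    void push_front(T item) {
        node* n = new node;
        n->item = item;
        n->next = m_front;
        m_front = n;
    }

    void pop_front() {
        assert(!empty());
        node* n = m_front;
        m_front = m_front->next;
        delete n;
    }

    iterator begin() { return iterator(m_front); }
    iterator end()   { return iterator(terminator()); }

private:
    node* terminator() { return (node*)&m_front; }   // compared against, never dereferenced
    node* m_front;
};

// Concatenate all items, visiting them only through the iterator interface.
template <typename C>
std::string join(C& values) {
    std::string out;
    for (typename C::iterator i = values.begin(); i != values.end(); ++i)
        out += *i;
    return out;
}
```

Since node is private, user code such as join() has no way to depend on how the list is laid out.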

41

Beware “Invalid” Iterators!

• An invalidated iterator is one that points to an item that no longer exists in the container

typedef linkedlist<string> name_list;

int main() {
    name_list names;
    names.push_front("Khalid");
    names.push_front("Sorin");
    name_list::iterator sorin = names.begin();
    name_list::iterator khalid = names.begin();
    ++khalid;
    names.pop_front();

    cout << *khalid << endl;   // prints "Khalid"
    cout << *sorin << endl;    // CRASH! (undefined behaviour: sorin's node was deleted)
}

42

Beware “Invalid” Iterators!

• Linked lists invalidate only the iterators that point to an item that was erased

• Dynamic arrays invalidate all iterators to all items if there was any change to the array!

• Why is dynarray so fragile?
  1. insert/erase move location of items (after index)
  2. insert/resize can move location of all items

• Summary: be careful modifying lists!
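The same rules show up in the standard containers, which the sketch below uses in place of the course classes (an assumption for illustration). The function names are hypothetical. The first function relies on std::list keeping other iterators valid; the second shows the usual std::vector idiom of continuing from the iterator that erase() returns rather than reusing an old one.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Erasing the front of a std::list leaves iterators to the other items valid.
std::string survivor_after_pop() {
    std::list<std::string> names;
    names.push_front("Khalid");
    names.push_front("Sorin");
    std::list<std::string>::iterator khalid = names.begin();
    ++khalid;
    names.pop_front();            // only iterators to "Sorin" become invalid
    return *khalid;
}

// With a std::vector, erase() shifts later items, so keep the iterator it returns.
std::vector<int> erase_odds(std::vector<int> values) {
    for (std::vector<int>::iterator i = values.begin(); i != values.end(); ) {
        if (*i % 2 != 0)
            i = values.erase(i);  // valid iterator to the item after the erased one
        else
            ++i;
    }
    return values;
}
```

Writing ++i after values.erase(i) without the assignment would be the vector version of the *sorin bug.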

43

Part C. Doubly Linked Lists

[Figure: a chain of 'data' nodes linked in both directions]

44

(Circular) Doubly Linked List

• Each node knows its successor (next) and its predecessor (prev) at all times

struct node {
    node* next;
    node* prev;
    T     item;
};

template <typename T>
class dlinkedlist {
    ...
    node* m_front;
    node* m_back;
};

[Figure: an empty list, showing only front and back; and a 2-item list, where front, back and the next/prev pointers connect the two data nodes]

Note: prev actually points to the first byte of the previous node, not to the middle of the node (as drawn).


46

dlinkedlist Interface (Final)

template <typename T>
class dlinkedlist {
    ...
    void insert(iterator pos, T item);   // insert at pos
    void erase(iterator pos);            // erase at pos

    void push_back(T item);              // append item
    void pop_back();                     // erase last item

    void push_front(T item);             // prepend copy of item
    void pop_front();                    // erase first item

    T& front();                          // first item
    T& back();                           // last item

    iterator begin();                    // iterator to first item
    iterator end();                      // iterator *after* last item

47

dlinkedlist Iterator

• Can now go backwards with --i

class iterator {
public:
    T& operator*()              { return m_node->item; }
    void operator++()           { m_node = m_node->next; }
    void operator--()           { m_node = m_node->prev; }   // new
    bool operator==(iterator j) { return m_node == j.m_node; }
    bool operator!=(iterator j) { return m_node != j.m_node; }
private:
    iterator(node* n) { m_node = n; }
    node* m_node;
};

48

dlinkedlist Constructors

template <typename T>
dlinkedlist<T>::dlinkedlist()
{
    m_front = m_back = terminator();   // terminate immediately
}

template <typename T>
dlinkedlist<T>::dlinkedlist(int size)
{
    m_front = m_back = terminator();
    for (int i = 0; i < size; ++i)
        push_front(T());   // insert default-constructed T
}

[Figure: empty list, showing front and back]

49

dlinkedlist Operations

template <typename T>
void dlinkedlist<T>::push_front(T item) {
    insert(begin(), item);   // insert at first position
}

template <typename T>
void dlinkedlist<T>::pop_front() {
    erase(begin());          // erase at first position
}

template <typename T>
void dlinkedlist<T>::push_back(T item) {
    insert(end(), item);     // insert *after* last position
}

template <typename T>
void dlinkedlist<T>::pop_back() {
    erase(iterator(m_back));
}

50

dlinkedlist Operations

template <typename T>
void dlinkedlist<T>::insert(iterator pos, T item)
{
    node* b = pos.m_node;     // b points to same node as 'pos'
    node* a = b->prev;        // a points to node before b
    node* n = new node;
    n->item = item;
    n->next = b;              // n points to b
    n->prev = a;              // n points to a
    a->next = b->prev = n;    // a and b point to n
}

template <typename T>
void dlinkedlist<T>::erase(iterator pos)
{
    assert(!empty());
    node* n = pos.m_node;
    n->next->prev = n->prev;   // unlink 'n' from the list
    n->prev->next = n->next;
    delete n;                  // delete node
}

(EXERCISE ON BOARD)
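For experimenting outside the course code, here is a compact circular doubly linked list. It deliberately swaps the slides' terminator() pointer cast for a different technique: an explicit sentinel member (m_term) of a small base struct, so insert and erase may touch the sentinel's next/prev without any cast. The names dlist_sketch, link and m_term are hypothetical; the insert/erase pointer updates are the ones from the slide.

```cpp
#include <cassert>

template <typename T>
class dlist_sketch {
    struct link { link* next; link* prev; };   // just the two pointers
    struct node : link { T item; };            // a real node adds the item

public:
    class iterator {
    public:
        T&   operator*()  { return static_cast<node*>(m_link)->item; }
        void operator++() { m_link = m_link->next; }
        void operator--() { m_link = m_link->prev; }
        bool operator==(iterator j) const { return m_link == j.m_link; }
        bool operator!=(iterator j) const { return m_link != j.m_link; }
    private:
        friend class dlist_sketch;
        explicit iterator(link* l) : m_link(l) {}
        link* m_link;
    };

    dlist_sketch() { m_term.next = m_term.prev = &m_term; }   // empty ring
    ~dlist_sketch() { while (!empty()) pop_front(); }
    dlist_sketch(const dlist_sketch&) = delete;
    dlist_sketch& operator=(const dlist_sketch&) = delete;

    bool empty() const { return m_term.next == &m_term; }
    iterator begin() { return iterator(m_term.next); }
    iterator end()   { return iterator(&m_term); }

    void insert(iterator pos, T item) {
        link* b = pos.m_link;      // node at 'pos'
        link* a = b->prev;         // node before it
        node* n = new node;
        n->item = item;
        n->next = b;
        n->prev = a;
        a->next = b->prev = n;     // a and b now point to n
    }

    void erase(iterator pos) {
        assert(pos != end());      // the sentinel itself must never be erased
        link* n = pos.m_link;
        n->next->prev = n->prev;   // unlink n from the ring
        n->prev->next = n->next;
        delete static_cast<node*>(n);
    }

    void push_front(T item) { insert(begin(), item); }
    void push_back(T item)  { insert(end(), item); }
    void pop_front() { erase(begin()); }
    void pop_back()  { erase(iterator(m_term.prev)); }
    T& front() { return *begin(); }
    T& back()  { return static_cast<node*>(m_term.prev)->item; }

private:
    link m_term;   // sentinel: next is the front node, prev is the back node
};
```

Because the ring always contains the sentinel, insert and erase need no special cases for the first or last item, which is the advantage of circular linking claimed earlier.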

51

Exercise in Visual Studio

See Snippet #4
