Linked Lists
CS 1037 Fundamentals of Computer Science II
2
You Will Learn...
A. Singly Linked List data structure
B. Iterators (and inner classes in C++)
C. Doubly Linked Lists
Preview of the code for each part:
A. linkedlist<double> x; x.push_front(3.1416); x.push_front(2.7183);
B. for (linkedlist<double>::iterator i = x.begin(); i != x.end(); ++i) { ... }
C. x.erase(x.begin()+1); x.push_back(3.1416);
3
Part A. Singly Linked Lists
[Figure: a chain of linked data nodes]
4
Is dynarray All We Need?
• dynarray vs built-in C++ arrays...
  – easy to manipulate via insert/erase/resize
  – remembers its own size
  – automatically deletes its items when destructed
  – does range checking to catch bugs
• Limitations...
  1. insert/erase is only fast at the back of the array; on an array of size n, erase(i) can take up to cn time
  2. push_back is only fast on average; a single push_back can take cn time!
• Unacceptable in many real-world situations
  – e.g. need to insert/erase in the middle of a huge list
5
Linked Lists
• Fundamental data structure
• Does not keep items at consecutive memory addresses
  – dynarray can calculate address of any item i from address of item 0
  – linked list node explicitly stores address of next item!
[Figure: a chain of list 'nodes', each holding an item and a next pointer]
6
Linked Lists
• Advantages:
  1. erase(node*) always takes c units of time: just 'unlink' the node and then delete it!
  2. insert(node*) also takes just c time: just create the node and 'link' it into the list!
7
Linked Lists
• Disadvantages:
  1. No random access; to get value of item i, need to traverse items 0..i-1
     (you want a pointer to the 3rd item, but you only have a pointer to the 1st item)
  2. Uses extra memory per item; each item needs to remember address of next item
     (data = useful stuff; pointer = overhead: >= 4 bytes (x86) or 8 bytes (x64))
Q: how many bytes of overhead for a linked list of char on a 32-bit architecture?
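To make the "no random access" cost concrete, here is a minimal sketch of fetching item i from a linked list. It assumes a NULL-terminated list of double; the helper name nth is hypothetical and not part of the course code.

```cpp
struct node {
    double item;
    node*  next;
};

// Return a pointer to node number i (0-based), or 0 if the list is too short.
// The loop must step through nodes 0..i-1 first, so the cost grows with i.
node* nth(node* front, int i) {
    node* n = front;
    while (n != 0 && i > 0) {
        n = n->next;
        --i;
    }
    return n;
}
```

A dynarray would answer the same query with a single address calculation, which is the trade-off this slide describes.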
8
Exercise in Visual Studio
See Snippet #1
9
C-style Linked List
• Example of NULL-terminated linked list

struct node {
    double item;
    node*  next;
};

node* front = 0; // empty list

int main() {
    // code to prepend an item
    node* n = new node;
    n->item = 3.1416;
    n->next = front;
    front = n;
}

[Figure: before, front = 0; after, front points to a node with item 3.1416 and next = 0]
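The prepend code above can be extended into a small sketch that also traverses and frees a NULL-terminated list. The function names prepend, count, sum and destroy are made up for illustration; only the node struct comes from the slide.

```cpp
struct node {
    double item;
    node*  next;
};

// Prepend a new node holding 'value'; returns the new front of the list.
node* prepend(node* front, double value) {
    node* n = new node;
    n->item = value;
    n->next = front;
    return n;
}

// Walk from front until the NULL terminator, counting nodes.
int count(const node* front) {
    int size = 0;
    for (const node* n = front; n != 0; n = n->next)
        ++size;
    return size;
}

// Walk from front until the NULL terminator, adding up the items.
double sum(const node* front) {
    double total = 0.0;
    for (const node* n = front; n != 0; n = n->next)
        total += n->item;
    return total;
}

// Delete every node; returns 0 so the caller can reset its front pointer.
node* destroy(node* front) {
    while (front != 0) {
        node* next = front->next;  // read 'next' before the node is deleted
        delete front;
        front = next;
    }
    return 0;
}
```

Typical use would be front = prepend(front, 3.1416); and, when finished, front = destroy(front); so that no node is leaked.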
10
Many Design Questions
1. Singly linked or doubly linked?
   (node with item, next versus node with item, next, prev)
2. Circularly linked or NULL terminated?
3. Remember size or calculate size?
   Remember:  node* front = 0; int size = 0;  (then size++ when an item is added to the list)
   Calculate: int size = 0; for (node* n = front; n; n = n->next) size++;
11
Singly Linked or Doubly Linked?
• Advantages of doubly linked:
  1. can traverse list forward or backward
  2. insert/erase more natural: void erase(node* n); versus void erase_next(node* n);
• Disadvantages of doubly linked:
  1. speed/memory overhead
  2. slightly harder to implement
[Figure: erase(n) removes node n itself; erase_next(n) removes the node after n ("we skip this!")]
12
What is “Circularly Linked”?
• End signified by arriving back at front
  – Q: how can last node’s next point at front when front is not a node?
  – A: in C/C++, “pointer casting” trick!

struct node {
    node*  next;  // *must* be first member!
    double item;
};

node* front = (node*)&front;  // empty list; front is not actually a node!

if (front == (node*)&front)
    cout << "list is empty";

[Figure: front -> 1st -> 2nd -> 3rd -> back to front]
Note: no actual data is stored at front! Accessing front->item could crash for this state!!
13
Circularly Linked or NULL Terminated?
• Advantages of circular
  – insert/erase code is shorter and more straightforward
  – insert/erase thereby more efficient
  – consistent with design of STL iterators (soon)
  – you learn how STL linked lists actually work, yay!
• Disadvantages of circular
  – not as natural for beginners
  – need to know address of terminator (fake node)

NULL terminated:  while (n != 0) n = n->next;
Circular:         while (n != (node*)&front) n = n->next;
14
Remember Size or Calculate Size?
• Advantage of remembering size:
  – querying size takes c units of time instead of cn
• Disadvantages:
  – memory to store size in each list (4 bytes)
  – adds c units of time to each insert/erase
• In our data structures we calculate size
  – before C++11, compiler vendors were allowed to do it either way for std::list (C++11 and later require constant-time size()), so don’t assume anything on older compilers!

if (list.size() == 0) cout << "empty list";
vs
if (list.empty()) cout << "empty list";
15
Circular Singly Linked List in C++
• How do we want to use a linked list? ...

int main() {
    linkedlist_of_double numbers;
    numbers.push_front(3.1416);
    numbers.push_front(2.7183);
    numbers.push_front(1.6180);
}

• How might this look in memory?
[Figure: on the call stack, main's variable 'numbers' holds m_front; on the heap, m_front -> 1.6180 -> 2.7183 -> 3.1416 -> back to m_front]
16
linkedlist Interface Version 1

struct node {
    node*  next;
    double item;
};

class linkedlist_of_double {
public:
    linkedlist_of_double();          // sets up empty list
    linkedlist_of_double(int size);  // sets up non-empty list

    int  size();   // return current size
    bool empty();  // test if size==0
    void clear();  // reset to empty list

    void push_front(double item);  // prepend a copy of 'item'
    void pop_front();              // erase first item

    node* begin();  // ptr to first node
    node* end();    // ptr to 'terminator' node
    ...
(for double)
17
linkedlist Private Members
• Only one data member!
  – empty list takes only 4 bytes (if 32-bit pointers)
  – terminator() is just shorthand for &m_front

    ...
private:
    node* terminator();  // ptr to 'terminator' node
    node* m_front;       // ptr to 1st node in list
                         // (if empty, 1st node is terminator)
};

node* linkedlist_of_double::terminator()
{
    node* n = (node*)&m_front;  // cast node** as a node*
    return n;                   // pretend '&m_front' points to a node!
}
(for double)
18
linkedlist Constructors
linkedlist_of_double::linkedlist_of_double()
{
    m_front = terminator();  // terminate immediately
}

linkedlist_of_double::linkedlist_of_double(int size)
{
    m_front = terminator();
    for (int i = 0; i < size; ++i)
        push_front(0.0);
}

(for double)
[Figure: empty list, where m_front points back at itself; pre-sized list, where m_front -> 0.0 -> 0.0 -> 0.0 -> back to m_front]
19
linkedlist Operations
• We only define push/pop_front (we wait until doubly linked list for insert/erase)

void linkedlist_of_double::push_front(double item)
{
    node* n = new node;
    n->item = item;
    n->next = m_front;  // link node into
    m_front = n;        // front of list
}

void linkedlist_of_double::pop_front()
{
    assert(!empty());
    node* n = m_front;
    m_front = m_front->next;  // unlink front
    delete n;
}
(for double)
20
linkedlist Traversal (for double)
• Need begin/end to traverse circular list
• This is how standard (STL) lists work in C++

node* linkedlist_of_double::begin() { return m_front; }
node* linkedlist_of_double::end()   { return terminator(); }

for (node* n = list.begin(); n != list.end(); n = n->next)
    n->item = 0.0;

[Figure: before first iteration, the list holds 1.62, 2.72, 3.14; before last iteration, it holds 0.0, 0.0, 3.14 and n points at 3.14; afterwards it holds 0.0, 0.0, 0.0 and n == list.end() .. STOP]
21
linkedlist Queries
• In CS1037 our linked list calculates size...

int linkedlist_of_double::size()
{
    int size = 0;
    for (node* n = begin(); n != end(); n = n->next)
        size++;  // count the nodes
    return size;
}

bool linkedlist_of_double::empty()
{
    return m_front == terminator();  // faster than size() == 0
}
(for double)
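The pieces of Version 1 can be assembled into one compilable sketch. Note the swapped-in technique: instead of the slides' cast of &m_front, this variant stores a small node_base object as the terminator, so no pointer cast from node** to node* is needed. The names node_base, m_term and front() are assumptions for illustration, and clear() is one plausible definition since the slides only declare it.

```cpp
#include <cassert>

// Links only; the terminator is one of these and holds no item.
struct node_base {
    node_base* next;
};

// A real list node: links plus the stored item.
struct node : node_base {
    double item;
};

class linkedlist_of_double {
public:
    linkedlist_of_double() { m_term.next = &m_term; }  // empty: terminator points to itself
    ~linkedlist_of_double() { clear(); }

    bool empty() const { return m_term.next == &m_term; }

    // Calculated size: walks every node, so it takes time proportional to n.
    int size() const {
        int count = 0;
        for (const node_base* n = m_term.next; n != &m_term; n = n->next)
            ++count;
        return count;
    }

    void push_front(double item) {
        node* n = new node;
        n->item = item;
        n->next = m_term.next;  // link node into
        m_term.next = n;        // front of list
    }

    void pop_front() {
        assert(!empty());
        node* n = static_cast<node*>(m_term.next);  // safe: list is not empty
        m_term.next = n->next;                      // unlink front
        delete n;
    }

    double& front() {
        assert(!empty());
        return static_cast<node*>(m_term.next)->item;
    }

    void clear() {
        while (!empty())
            pop_front();
    }

private:
    // Copying is disabled in this sketch (declared, never defined).
    linkedlist_of_double(const linkedlist_of_double&);
    linkedlist_of_double& operator=(const linkedlist_of_double&);

    node_base m_term;  // plays the role of m_front and the terminator
};
```

The trade-off is the same as on the slides: an empty list still costs one pointer, and empty() is a single comparison while size() walks the list.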
22
Exercise in Visual Studio
See Snippet #2
23
Problem #1 with Version 1 Interface
1. Each type of list needs its own node type, so we need node_double, node_int, etc?

struct node_double {
    node_double* next;
    double       item;
};

struct node_int {
    node_int* next;
    int       item;
};

Solution: make node an inner class

class linkedlist_of_double {
    ...
    struct node {      // defines 'node' type
        node*  next;   // linkedlist_of_double::node
        double item;
    };
    ...
24
Inner Classes in C++
• In C++, types can be defined inside any scope, incl. namespaces / class definitions

Program 1 (B defined outside A):

struct A { int a; };
struct B { int b; };

int main() {
    A x;  // 4 bytes
    B y;  // 4 bytes
}

Program 2 (B defined inside A):

struct A {
    int a;
    struct B { int b; };
};

int main() {
    A    x;  // 4 bytes
    A::B y;  // 4 bytes
}

Both programs generate identical machine code!
25
Inner Classes in C++
• Same type name used in different contexts

struct A {
    int a;
    struct B { int foofoo; };
};

struct C {
    double c;
    struct B { char blah; };
};

int main() {
    A    x;  // 4 bytes
    A::B y;  // 4 bytes
    C    z;  // 8 bytes
    C::B w;  // 1 byte
}

Q: Does this compile in C++?
A: Yes! A::B and C::B are totally independent types

e.g.
linkedlist_of_int::node    n1;  // 8 bytes
linkedlist_of_double::node n2;  // 16 bytes (why?)
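The independence of A::B and C::B can be checked by the compiler itself. The sketch below reuses the structs from this slide; same_type and the three small functions are hypothetical helpers written for this illustration (the helper avoids C++11 features so it also builds on older compilers).

```cpp
#include <cstddef>

struct A {
    int a;
    struct B { int foofoo; };
};

struct C {
    double c;
    struct B { char blah; };
};

// Minimal type-equality check: 'value' is true only when both arguments
// are the very same type.
template <typename T, typename U>
struct same_type { static const bool value = false; };

template <typename T>
struct same_type<T, T> { static const bool value = true; };

// A::B and C::B share a name but are unrelated types.
bool inner_types_are_distinct() { return !same_type<A::B, C::B>::value; }

// Sizes depend on the members of each inner struct, not on the outer struct.
std::size_t size_of_A_B() { return sizeof(A::B); }
std::size_t size_of_C_B() { return sizeof(C::B); }
```

The exact sizes of int and double vary by platform, so only the comparison between types is guaranteed everywhere.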
26
Inner Classes in C++
• If B is an inner class of A, then code defining A can just say “B” instead of saying “A::B”

struct A {
    struct B { int value; };

    B b;  // compiler assumes B must mean A::B
    void print() { cout << b.value; }
};

int main() {
    A a;
    a.print();  // prints a.b.value
}
27
Problem #2 with Version 1 Interface
2. Different code to traverse linkedlist than to traverse dynarray

for (node* n = list.begin(); n != list.end(); n = n->next)  // careful not to put ++n !!
    n->item = 0;

for (int i = 0; i < array.size(); ++i)
    array[i] = 0;

• Can’t switch data structure for an object without rewriting all code that touches it
  – suppose we regret choosing dynarray somewhere!
• Can’t write generic code (won’t compile!)
28
Problem #2 with Version 1 Interface
Solution: Use iterator abstraction!
• Unified way to iterate items in containers
• Can then write generic code too!

for (linkedlist<int>::iterator i = list.begin(); i != list.end(); ++i)
    *i = 0;

for (dynarray<int>::iterator i = array.begin(); i != array.end(); ++i)
    *i = 0;

template <typename T>
void set_to_zero(T& values) {
    // 'typename' is required because T::iterator depends on the template parameter
    for (typename T::iterator i = values.begin(); i != values.end(); ++i)
        *i = 0;
}  // T can be dynarray<int>, linkedlist<int>,...
29
Part B. Iterators
for (dynarray<int>::iterator i = array.begin(); i != array.end(); ++i) *i = 0;
30
Iterators in C++
• To “iterate” means to repeat
• An iterator object remembers where you are along a sequence of items/values.
• You can ask an iterator...
  – for access to current item
  – to advance to next item (or previous item)
• Code to “advance to next item” is totally different for different data structures...
  ...hence the need for abstraction!
31
Iterators in C++
• Lab 6 asks you to modify simple iterator
• What is output of this program?

class simple_iterator {
public:
    simple_iterator(int start) { m_value = start; }
    void operator++() { m_value += 2; }
    operator int() { return m_value; }
private:
    int m_value;
};

int main() {
    for (simple_iterator i(0); i <= 6; ++i)
        cout << i << endl;
}
32
linkedlist Iterator
• We know linked list iterator i should...
  – keep internal pointer to a node
  – when ++i happens, should advance to next node
  – when *i happens, should return reference to item

class iterator {
public:
    double& operator*()  { return m_node->item; }
    void    operator++() { m_node = m_node->next; }
    bool operator==(iterator j) { return m_node == j.m_node; }
    bool operator!=(iterator j) { return m_node != j.m_node; }
private:
    friend class linkedlist_of_double;  // lets the list call the private constructor
    iterator(node* n) { m_node = n; }
    node* m_node;
};
(for double)
33
linkedlist Interface Version 2
• Used information hiding to hide the internal node type behind iterator abstraction!

class linkedlist_of_double {
    ...
    class iterator {
        ...  // previous slide
    };

    // now begin/end should return an iterator
    iterator begin() { return m_front; }
    iterator end()   { return terminator(); }

private:
    struct node {  // node is now private!
        node*  next;
        double item;
    };
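As a sketch of how Version 2 fits together, the class below hides its node type and exposes only an iterator. It is not the course's implementation: the terminator is a stored node_base object (a swapped-in alternative to casting &m_front), and the friend declaration is what allows the list to call the iterator's private constructor. The names node_base, m_term and the free function sum are assumptions.

```cpp
class linkedlist_of_double {
    // Private node types: client code can no longer touch them.
    struct node_base { node_base* next; };
    struct node : node_base { double item; };

public:
    class iterator {
    public:
        double& operator*()  { return static_cast<node*>(m_node)->item; }
        void    operator++() { m_node = m_node->next; }
        bool operator==(iterator j) const { return m_node == j.m_node; }
        bool operator!=(iterator j) const { return m_node != j.m_node; }
    private:
        friend class linkedlist_of_double;  // only the list may build iterators
        iterator(node_base* n) : m_node(n) {}
        node_base* m_node;
    };

    linkedlist_of_double() { m_term.next = &m_term; }  // empty list

    ~linkedlist_of_double() {
        while (m_term.next != &m_term) {
            node* n = static_cast<node*>(m_term.next);
            m_term.next = n->next;
            delete n;
        }
    }

    void push_front(double item) {
        node* n = new node;
        n->item = item;
        n->next = m_term.next;
        m_term.next = n;
    }

    iterator begin() { return iterator(m_term.next); }
    iterator end()   { return iterator(&m_term); }

private:
    // Copying is disabled in this sketch (declared, never defined).
    linkedlist_of_double(const linkedlist_of_double&);
    linkedlist_of_double& operator=(const linkedlist_of_double&);

    node_base m_term;  // terminator; dereferencing end() is not allowed
};

// Client code sees only iterators, never nodes.
double sum(linkedlist_of_double& list) {
    double total = 0.0;
    for (linkedlist_of_double::iterator i = list.begin(); i != list.end(); ++i)
        total += *i;
    return total;
}
```

Because node is private, the list is free to change its internal layout later without breaking any code that uses it.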
34
Real Life™ Example of Linked List
• Example from Chrome source code

struct IOItem {
    IOHandler* handler;
    IOContext* context;
    DWORD bytes_transfered;
    DWORD error;
};

// This list will be empty almost always. It stores IO
// completions that have not been delivered yet.
std::list<IOItem> completed_io_;

std::list is the C++ standard version of a doubly linked list
35
Iterator Abstraction
• Standard C++ containers (STL) have them

vector<int>::iterator iter1;      // for dynamic array
list<double>::iterator iter2;     // for doubly linked list
map<string,int>::iterator iter3;  // for associative array
...

• Note: Programmers complained about loops being too verbose, so the C++11 standard (published in 2011) lets the auto keyword deduce the iterator type...

for (auto i = list.begin(); i != list.end(); ++i)
    *i = 0;
36
Exercise in Visual Studio
See Snippet #3
37
Real Life™ Example of Iterators
• Example from Chrome source code

typedef std::vector<AutocompleteMatch> ACMatches;

// All matches from all providers for a particular query.
// This also tracks what the default match should be
// if the user doesn't manually select another match.
class AutocompleteResult {
public:
    typedef ACMatches::const_iterator const_iterator;
    typedef ACMatches::iterator iterator;

    void AddMatch(const AutocompleteMatch& match);

std::vector has iterators too!
38
Dynamic Array Iterators
• dynarray can have iterators too...

for (dynarray<int>::iterator i = array.begin(); i != array.end(); ++i)
    *i = 0;

• Hint: remember this from Review?

for (int i = 0; i < 5; ++i)      // clear :)
    v[i] = 0;

for (int i = 0; i < 5; ++i)      // less clear :/
    *(v+i) = 0;

for (int* p = v; p < v+5; ++p)   // hard to read :(
    *p = 0;                      // ...but this works just like an iterator!
39
dynarray Interface Version 5
• Raw pointer works just fine as iterator!

template <typename T>
class dynarray {
public:
    // pointer to item works just fine as an 'iterator'!
    typedef T* iterator;

    iterator begin() { return &m_items[0]; }
    iterator end()   { return &m_items[m_size]; }

for (dynarray<int>::iterator i = array.begin(); i != array.end(); ++i)
    *i = 0;

• Pointer would not work as iterator for linked lists or most other structures.
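A stripped-down dynarray is enough to see the pointer-as-iterator idea run. This is only a sketch: the fixed-size constructor and the disabled copying are simplifications assumed here, and the real course dynarray has a larger interface (insert, erase, resize and so on).

```cpp
#include <cassert>

template <typename T>
class dynarray {
public:
    // A raw pointer already supports *i, ++i, == and !=, so it can be the iterator.
    typedef T* iterator;

    // Allocates 'size' value-initialized items (zeros for built-in types).
    explicit dynarray(int size) : m_items(new T[size]()), m_size(size) {}
    ~dynarray() { delete[] m_items; }

    int size() const { return m_size; }

    T& operator[](int i) {
        assert(i >= 0 && i < m_size);  // range checking to catch bugs
        return m_items[i];
    }

    iterator begin() { return m_items; }           // address of item 0
    iterator end()   { return m_items + m_size; }  // one past the last item

private:
    // Copying is disabled in this sketch (declared, never defined).
    dynarray(const dynarray&);
    dynarray& operator=(const dynarray&);

    T*  m_items;
    int m_size;
};
```

Incrementing such an iterator just moves to the next address, which is exactly why the same trick cannot work for a linked list, whose items are not stored consecutively.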
40
linkedlist Interface Version 3
• Template class similar to version 2

template <typename T>
class linkedlist {
public:
    void push_front(T item);
    void pop_front();

    class iterator {
    public:
        void operator++();
        T&   operator*();
        ...

linkedlist<string> list;
list.push_front("Christine O'Donnell");

for (linkedlist<string>::iterator i = list.begin(); i != list.end(); ++i)
    cout << *i << " is a witch";
41
Beware “Invalid” Iterators!
• An invalidated iterator is one that points to an item that no longer exists in the container

typedef linkedlist<string> name_list;

int main() {
    name_list names;
    names.push_front("Khalid");
    names.push_front("Sorin");
    name_list::iterator sorin  = names.begin();
    name_list::iterator khalid = names.begin();
    ++khalid;
    names.pop_front();

    cout << *khalid << endl;  // prints "Khalid"
    cout << *sorin  << endl;  // CRASH! ('sorin' points to a node that was deleted)
}
42
Beware “Invalid” Iterators!
• Linked lists only invalidate iterators that point to the item that was erased
• Dynamic arrays invalidate all iterators to all items if there was any change to the array!
• Why is dynarray so fragile?
  1. insert/erase move location of items (after index)
  2. insert/resize can move location of all items
• Summary: be careful modifying lists!
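One common way to stay safe while erasing during a traversal is to continue from the iterator that erase returns. The sketch below uses std::list rather than the course's linkedlist, and erase_all is a hypothetical helper name.

```cpp
#include <list>
#include <string>

// Erase every occurrence of 'name' from the list.
void erase_all(std::list<std::string>& names, const std::string& name) {
    for (std::list<std::string>::iterator i = names.begin(); i != names.end(); ) {
        if (*i == name)
            i = names.erase(i);  // old 'i' is now invalid; erase returns the next position
        else
            ++i;                 // advance only when nothing was erased
    }
}
```

Iterators that point to other items of the list stay valid throughout, which matches the first bullet above; the same loop shape also works for std::vector, but there every iterator at or after the erased position is invalidated.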
43
Part C. Doubly Linked Lists
[Figure: a chain of data nodes linked in both directions]
44
(Circular) Doubly Linked List
• Each node knows its successor (next) and its predecessor (prev) at all times

struct node {
    node* next;
    node* prev;
    T     item;
};

template <typename T>
class dlinkedlist {
    ...
    node* m_front;
    node* m_back;
};

[Figure: an empty list, with only front and back; and a 2-item list whose nodes are linked through next and prev]
Note: prev actually points to first byte of previous node, not to middle of node (as drawn).
45
Doubly Linked Lists (repeated)
• Advantages of doubly linked:
  1. can traverse list forward or backward
  2. insert/erase more natural: void erase(node* n); versus void erase_next(node* n);
• Disadvantages of doubly linked:
  1. speed/memory overhead
  2. slightly harder to implement
[Figure: erase(n) removes node n itself; erase_next(n) removes the node after n ("we skip this!")]
46
dlinkedlist Interface (Final)

template <typename T>
class dlinkedlist {
    ...
    void insert(iterator pos, T item);  // insert at pos
    void erase(iterator pos);           // erase at pos

    void push_back(T item);   // append item
    void pop_back();          // erase last item

    void push_front(T item);  // prepend copy of item
    void pop_front();         // erase first item

    T& front();  // first item
    T& back();   // last item

    iterator begin();  // iterator to first item
    iterator end();    // iterator *after* last item
47
dlinkedlist Iterator
• Can now go backwards with --i

class iterator {
public:
    T&   operator*()  { return m_node->item; }
    void operator++() { m_node = m_node->next; }
    void operator--() { m_node = m_node->prev; }  // new
    bool operator==(iterator j) { return m_node == j.m_node; }
    bool operator!=(iterator j) { return m_node != j.m_node; }
private:
    friend class dlinkedlist<T>;  // insert/erase need access to m_node
    iterator(node* n) { m_node = n; }
    node* m_node;
};
48
dlinkedlist Constructors

template <typename T>
dlinkedlist<T>::dlinkedlist()
{
    m_front = m_back = terminator();  // terminate immediately
}

template <typename T>
dlinkedlist<T>::dlinkedlist(int size)
{
    m_front = m_back = terminator();
    for (int i = 0; i < size; ++i)
        push_front(T());  // insert default-constructed T
}

[Figure: empty list, where front and back both point at the terminator]
49
dlinkedlist Operations

template <typename T>
void dlinkedlist<T>::push_front(T item) {
    insert(begin(), item);  // insert at first position
}

template <typename T>
void dlinkedlist<T>::pop_front() {
    erase(begin());  // erase at first position
}

template <typename T>
void dlinkedlist<T>::push_back(T item) {
    insert(end(), item);  // insert *after* last position
}

template <typename T>
void dlinkedlist<T>::pop_back() {
    erase(iterator(m_back));  // erase at last position
}
50
dlinkedlist Operations

template <typename T>
void dlinkedlist<T>::insert(iterator pos, T item) {
    node* b = pos.m_node;   // b points to same node as 'pos'
    node* a = b->prev;      // a points to node before b
    node* n = new node;
    n->item = item;
    n->next = b;            // n points to b
    n->prev = a;            // n points to a
    a->next = b->prev = n;  // a and b point to n
}

template <typename T>
void dlinkedlist<T>::erase(iterator pos)
{
    assert(!empty());
    node* n = pos.m_node;
    n->next->prev = n->prev;  // unlink 'n' from the list
    n->prev->next = n->next;
    delete n;                 // delete node
}
(EXERCISE ON BOARD)
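For checking your board work afterwards, here is one possible self-contained version of the circular doubly linked list. It is a sketch, not the course's reference code: the terminator is a stored dnode_base object (instead of the cast applied to m_front/m_back), items are passed by const reference, and the names dnode_base and m_term are assumptions.

```cpp
#include <cassert>

// Links shared by real nodes and the terminator.
struct dnode_base {
    dnode_base* next;
    dnode_base* prev;
};

template <typename T>
class dlinkedlist {
    struct node : dnode_base {
        explicit node(const T& value) : item(value) {}
        T item;
    };

public:
    class iterator {
    public:
        T&   operator*()  { return static_cast<node*>(m_node)->item; }
        void operator++() { m_node = m_node->next; }
        void operator--() { m_node = m_node->prev; }
        bool operator==(iterator j) const { return m_node == j.m_node; }
        bool operator!=(iterator j) const { return m_node != j.m_node; }
    private:
        friend class dlinkedlist<T>;  // the list needs m_node and the constructor
        iterator(dnode_base* n) : m_node(n) {}
        dnode_base* m_node;
    };

    dlinkedlist() { m_term.next = m_term.prev = &m_term; }  // empty: terminator links to itself
    ~dlinkedlist() { while (!empty()) pop_front(); }

    bool empty() const { return m_term.next == &m_term; }
    iterator begin() { return iterator(m_term.next); }
    iterator end()   { return iterator(&m_term); }

    // Insert a copy of 'item' just before 'pos'.
    void insert(iterator pos, const T& item) {
        dnode_base* b = pos.m_node;  // node at 'pos'
        dnode_base* a = b->prev;     // node before b
        node* n = new node(item);
        n->next = b;
        n->prev = a;
        a->next = b->prev = n;       // a and b now point to n
    }

    // Erase the item at 'pos'; 'pos' must not be end().
    void erase(iterator pos) {
        assert(pos != end());
        dnode_base* n = pos.m_node;
        n->next->prev = n->prev;     // unlink 'n' from the list
        n->prev->next = n->next;
        delete static_cast<node*>(n);
    }

    void push_front(const T& item) { insert(begin(), item); }
    void push_back(const T& item)  { insert(end(), item); }
    void pop_front() { erase(begin()); }
    void pop_back()  { erase(iterator(m_term.prev)); }

private:
    // Copying is disabled in this sketch (declared, never defined).
    dlinkedlist(const dlinkedlist&);
    dlinkedlist& operator=(const dlinkedlist&);

    dnode_base m_term;  // m_term.next acts as front, m_term.prev as back
};
```

Because the list is circular, insert and erase need no special cases for the first or last position: the terminator always supplies a valid prev and next.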
51
Exercise in Visual Studio
See Snippet #4