
CS 215 Lecture 17: Linked lists and time complexities

Xiwei (Jeffrey) Wang1

Department of Computer Science

University of Kentucky

Lexington, Kentucky 40506

23 July 2014

1 Most of the content is from the lecture slides of the book “C++ for Everyone” by Cay Horstmann.

Recap of last time

The permutations.

Mutual recursion.

The tower of Hanoi.

Queues and Stacks.

Xiwei (Jeffrey) Wang (UK CS) CS 215 Lecture 17 Summer 2014 2 / 31

Stacks and queues


A stack is a LIFO (last in, first out) data structure.

A queue is a FIFO (first in, first out) data structure.

Using linked lists

A linked list is a data structure that supports efficient addition and removal of elements in the middle of a sequence.

Consider the problem of storing employee records.

Assume we use a vector to hold the data. What if we needed the data sorted by the employees' last names?

Can we hope that the employees will be hired in order of their last names so that we will be putting them into the vector in the order we want?

How else can the data be kept in the order we want it in the vector?

Insert an employee; sort;

insert an employee; sort;

insert an employee; sort;

… …


Using linked lists

For each new employee, we could find the position in the vector where their data should be inserted and insert it.

But what about all the other records after this one? Won't they need to be MOVED inside the vector? Won't that take a lot of time shifting them toward the end of the vector?
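The find-then-insert approach for a vector can be sketched as follows. This is a minimal illustration, assuming the employee records are reduced to last names stored as plain strings; the function name insert_sorted is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Insert `name` into an already sorted vector so that it stays sorted.
// Returns the index at which the name was placed.
std::size_t insert_sorted(std::vector<std::string>& names, const std::string& name)
{
    // Find the first element that is not less than `name`.
    std::vector<std::string>::iterator pos =
        std::lower_bound(names.begin(), names.end(), name);
    std::size_t index = pos - names.begin();
    // vector::insert moves every element from `pos` onward one slot toward the end.
    names.insert(pos, name);
    return index;
}
```

Finding the position is cheap here; the hidden cost is the shifting that vector::insert performs on every element after the insertion point.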

Use a linked list, which does not need any shifting.

Rather than storing the data in a single block of memory, a linked list uses a different strategy.

Each value is stored in its own memory block, together with the locations of the block "before" it and "after" it in the sequence.

This memory block is traditionally called a node.


Using linked lists

Each node contains its data, a pointer to the previous node in the list, and a pointer to the next node in the list.

This is called a doubly-linked list.

In a singly-linked list each node has a link only to the next node in the list, with no link to the predecessor elements.



Using linked lists

Add a node to the doubly linked list:

Only the links need to be changed. No shifting required!



Using linked lists

And what must happen when an employee leaves the company? With a vector, we need to remove that record and then spend time shifting all the records after it toward the front of the vector. What about the linked list?

To delete a node, only the links need to be changed. No shifting required!
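To make "only the links need to be changed" concrete, here is a minimal sketch of a doubly-linked node with link-only insertion and removal. The Node layout and the names insert_after and detach are assumptions made for illustration; they are not the internals of any library list.

```cpp
#include <string>

// One node of a doubly-linked list (hypothetical layout for illustration).
struct Node
{
    std::string data;
    Node* prev;
    Node* next;
};

// Link `fresh` into the list immediately after `where`.
void insert_after(Node* where, Node* fresh)
{
    fresh->prev = where;
    fresh->next = where->next;
    if (where->next != nullptr) { where->next->prev = fresh; }
    where->next = fresh;
}

// Take `node` out of the list by re-linking its neighbors to each other.
// The caller still owns the node's memory.
void detach(Node* node)
{
    if (node->prev != nullptr) { node->prev->next = node->next; }
    if (node->next != nullptr) { node->next->prev = node->prev; }
    node->prev = nullptr;
    node->next = nullptr;
}
```

Neither function touches any node other than the one being added or removed and its immediate neighbors, no matter how long the list is.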



Using linked lists

While insertion and deletion themselves are easy, finding where to insert or delete is the problem.

Try to delete the 5th element in a linked list.

You would have to first do a linear search just to find the 5th element! This is called Sequential Access.

In a vector or an array, using the [] operator, you can go directly to an element position. This is called Random Access.

Random doesn't really mean "random"; it means "arbitrary": you can go directly to any specific item without having to go through all of the items before it in the sequence.

The standard C++ library has an implementation of the linked list container structure.

list<string> names;

names.push_back("Tom");

names.push_back("David");

names.push_back("Harry");


Iterator for linked lists

To visit an element, you cannot use []; you use a list iterator instead:

list<string>::iterator pos;

pos can "point" to an element in a list. It's not really a pointer, but it uses operator overloading to pretend to be one.

list<string>::iterator pos;

pos = names.begin();

pos++; // move pos to the next position in the list

pos--; // move pos backward

string value = *pos; // store the value from the list into value

*pos = "Romeo"; // the list value at the position is changed

Don’t get confused with these two notations:

*pos – the value in the list at pos

pos – the iterator that indicates a position in the list
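The iterator operations above can be combined into one small function. This is a sketch only; the name replace_second is hypothetical, and the function assumes the list holds at least two elements.

```cpp
#include <list>
#include <string>

// Replace the second name in the list and return the name that was there before.
// Assumes the list holds at least two elements.
std::string replace_second(std::list<std::string>& names, const std::string& replacement)
{
    std::list<std::string>::iterator pos = names.begin(); // first element
    pos++;                                                // second element
    std::string old_value = *pos; // read the value at the position
    *pos = replacement;           // overwrite the value at the position
    return old_value;
}
```

Note how pos itself only moves, while *pos is what reads or changes the stored string.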


Iterator for vectors

Vectors also support iterators. The following two for loops do the same job.

vector<Person*>::iterator pos;

for (pos = m_list.begin(); pos != m_list.end(); pos++)

{

(*pos)->print(cout);

}

for (int i = 0; i < m_list.size(); i++)

{

m_list[i]->print(cout);

}


Using linked lists

To insert a new element, use the insert method:

names.insert(pos, "Romeo");

If you want to insert an element at the head of the list, you have to reset the pos iterator first:

pos = names.begin();

names.insert(pos, "Romeo");

There is also an end method for lists which indicates the position AFTER the last one. That is exactly where a new last element should go:

pos = names.end(); // points past the end of the list

names.insert(pos, "Juliet"); // insert past the end of the list; a new last element is appended
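Both special positions can be exercised together. The sketch below uses a hypothetical helper, add_at_both_ends, to show that insert always places the new element before the position it is given.

```cpp
#include <list>
#include <string>

// Put `first` at the head of the list and `last` at the tail.
void add_at_both_ends(std::list<std::string>& names,
                      const std::string& first, const std::string& last)
{
    names.insert(names.begin(), first); // goes before the old first element
    names.insert(names.end(), last);    // goes before the past-the-end position
}
```

Because insertion happens before the given position, passing end() appends and passing begin() prepends.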


Using linked lists

Because end indicates the position AFTER the last one, dereferencing that position is an error, just as it is an error to access an element in an array past the last element.

string value = *names.end(); // ERROR!!

The begin and end methods are used when looping through all the elements in a list.

// while loop

pos = names.begin();

while (pos != names.end())

{

cout << *pos << endl;

pos++;

}

// for loop

for (pos = names.begin(); pos != names.end(); pos++)

{

cout << *pos << endl;

}


Note that the loop condition uses !=, not <= or <. List iterators cannot be compared with <.

Using linked lists

The erase method removes an element at a position:


pos = names.begin();

pos++;

pos = names.erase(pos);

This code removes the second element from the list.

erase returns the position after the one removed so pos now points to what was the third element.
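The return value of erase matters most when elements are removed inside a loop. The following sketch, with the hypothetical name erase_all, removes every occurrence of a value and relies on erase to supply the next valid position.

```cpp
#include <list>
#include <string>

// Remove every occurrence of `target`; returns how many elements were erased.
int erase_all(std::list<std::string>& names, const std::string& target)
{
    int removed = 0;
    std::list<std::string>::iterator pos = names.begin();
    while (pos != names.end())
    {
        if (*pos == target)
        {
            pos = names.erase(pos); // pos now refers to the element after the erased one
            removed++;
        }
        else
        {
            pos++; // advance only when nothing was erased
        }
    }
    return removed;
}
```

Incrementing pos after an erase without using the returned iterator would operate on an invalidated iterator, which is why the two branches are kept separate.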

http://cs.uky.edu/~xiwei/cs215/lectures/example-l17-1.cpp

This also works for vectors. In program 3, when you remove a person from the vector m_list, you should free the associated dynamic memory and erase the pointer from the vector as well.
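A rough sketch of that two-step removal is shown below. The Person struct here is a hypothetical stand-in, not the class from the assignment, and remove_person is an illustrative name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the Person class used in the assignment.
struct Person
{
    std::string name;
};

// Delete the Person at `index` and remove its pointer from the vector.
// Returns false if the index is out of range.
bool remove_person(std::vector<Person*>& people, std::size_t index)
{
    if (index >= people.size()) { return false; }
    std::vector<Person*>::iterator pos = people.begin() + index;
    delete *pos;       // free the dynamically allocated object first
    people.erase(pos); // then erase the now-dangling pointer
    return true;
}
```

The order matters: erasing the pointer first would lose the only handle to the object and leak its memory.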



Time complexity

How does the performance change with the size of the input?

Suppose it takes $2n^2 + 5n - 3$ operations to sort a list of size n.

What happens if we double the size of the input?

$2(2n)^2 + 5(2n) - 3 = 8n^2 + 10n - 3$: almost four times as long.

As n gets bigger and bigger, the $2n^2$ term becomes more important, and the $5n$ term becomes less important.

If we care mostly about big problems, we can ignore those lower-order terms.
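The "almost four times" claim can be checked numerically. The sketch below evaluates the example operation count for n and 2n; the function names are made up for this illustration.

```cpp
// Number of operations from the example: 2n^2 + 5n - 3.
long long operations(long long n)
{
    return 2 * n * n + 5 * n - 3;
}

// How many times more work is needed when the input size doubles.
double doubling_ratio(long long n)
{
    return static_cast<double>(operations(2 * n)) / static_cast<double>(operations(n));
}
```

For n = 10 the ratio is 897 / 247, about 3.63. For n = 1000 it is about 3.995, so the ratio approaches 4 as the lower-order terms stop mattering.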

The order of complexity of an algorithm is a measure of how its performance changes as the problem instance gets bigger.

Written in “big O notation”: $O(n^2)$, $O(n)$.

Write the expression for the number of operations: $2n^2 + 5n - 3$

Take the highest-order (fastest-growing) term: $2n^2$

Drop the constant: $n^2$

So we say the algorithm has complexity $O(n^2)$

"Order n squared" or "Big Oh of n squared"


Time complexity

An expression like $O(n^2)$ represents a complexity class:

All the formulas with the highest-order term $n^2$

$n^2 + 3n - 1$, $n^2/10 + n$, $100n^2 - 50$, …

Computer scientists have names for some of the most common complexity classes:

O(1): constant. Doesn't depend on the size of the input.

O(log n): logarithmic. Doubling the input adds a constant amount of time.

O(n): linear. Doubling the input doubles the time.

$O(n^2)$: quadratic. Doubling the input quadruples the time.

$O(n^k)$: polynomial with order k.

$O(2^n)$: exponential. Adding one to the input size doubles the time.


(The classes are listed from best to worst: each one further down the list is worse.)
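The logarithmic case is the least intuitive of these classes, so a small sketch may help. It counts the steps of a loop that halves its input each time, which is the pattern binary search follows; binary search is used here only as an assumed example and is not part of these slides.

```cpp
// Count how many times n can be halved before reaching 1.
// A loop that discards half of its input on every step runs this many times.
int halving_steps(long long n)
{
    int steps = 0;
    while (n > 1)
    {
        n /= 2;
        steps++;
    }
    return steps;
}
```

An input of 1024 takes 10 steps and an input of 2048 takes 11: doubling the input adds exactly one step.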

The efficiency of list, array, and vector operations

How efficient are these operations on lists, arrays, and vectors?

Getting the kth element.

Adding or removing an element at a given position (an iterator or index).

Adding or removing an element at the end.



list – getting the kth element

Getting to the kth element requires starting at the beginning and advancing the iterator k times.

If it takes time T to advance the iterator once, advancing the iterator to the kth element takes kT time, so locating the kth element is an O(k) operation.

array – getting the kth element

Getting to the kth element requires only a calculation for [] to go directly to the kth element.

A simple calculation is O(1), so locating the kth element is an O(1) operation.

vector – getting the kth element

Same as array, locating the kth element is an O(1) operation.
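The difference between the two access patterns can be sketched side by side. The function names are hypothetical, positions are counted from 0, and both functions assume k is a valid position.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// Reach the k-th element (0-based) of a list by stepping from the front.
// `steps` receives the number of iterator advances performed.
// Assumes k < names.size().
std::string kth_in_list(const std::list<std::string>& names, std::size_t k,
                        std::size_t& steps)
{
    std::list<std::string>::const_iterator pos = names.begin();
    steps = 0;
    while (steps < k)
    {
        ++pos;
        ++steps;
    }
    return *pos;
}

// Reach the k-th element of a vector with a single index calculation.
// Assumes k < names.size().
std::string kth_in_vector(const std::vector<std::string>& names, std::size_t k)
{
    return names[k];
}
```

The list version performs k iterator advances, while the vector version does the same amount of work for every k.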



list – inserting and deleting

These operations involve only changing a few pointer values, so they are O(1) operations.

Note that we will have already done the searching for where to do the insert or delete so that time is not considered here.

array and vector – inserting and deleting

Arrays are easy to visualize.

Inserting and deleting for vector requires seeing how the vector class is implemented.



vector – internal organization


A vector keeps its data in a dynamic buffer (an array) and a pointer to that area; it keeps the value of the current capacity of the buffer; and it keeps the number of elements currently in the buffer.




In both arrays and vectors, to insert an element at position k, the elements with higher index values must be moved to make room for the new element.

And size would be increased by one.




To delete an element at position k, the elements with higher index values must be moved to take up the room that had been used for the deleted element value.

And size would be reduced by one.



For the analysis, we need to know how many elements are affected in a move.

For simplicity, we will assume that insertions and deletions happen at random locations.

Then, on average, where n is the size of the array or vector, each insertion or deletion moves $n/2$ elements.

Insert and delete are O(n) operations.
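The number of moved elements can be counted directly. The sketch below performs the shifting by hand on a vector of ints, rather than calling vector::insert, so that the moves are visible; insert_with_shift is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// Insert `value` at index k by shifting elements by hand, and
// return how many existing elements had to be moved.
// Assumes k <= data.size().
std::size_t insert_with_shift(std::vector<int>& data, std::size_t k, int value)
{
    data.push_back(0); // make room for one more element at the end
    std::size_t moved = 0;
    for (std::size_t i = data.size() - 1; i > k; i--)
    {
        data[i] = data[i - 1]; // shift one element toward the end
        moved++;
    }
    data[k] = value;
    return moved;
}
```

Inserting at index k of n elements moves n - k of them: all n at the front, none at the end, and n/2 on average over random positions.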



list – adding or removing an element at the end

If we assume the list has a "maintenance pointer" which points to the last item in the list, then, same as always, we just reset some pointer values: an O(1) operation.

array – adding or removing an element at the end

The array must be large enough to insert at the end or we simply cannot insert.

Inserting and deleting involve [], a simple calculation as before, plus arithmetic on the size.

There is no moving of data so it’s O(1).

vector – adding or removing an element at the end

To insert at the end of a vector, the push_back method is used.

When the capacity is sufficient, this is an O(1) process requiring only accessing and assigning to a position already there in the dynamic array (the buffer) and arithmetic on the size.



vector – adding or removing an element at the end

When the buffer is filled to its current capacity, the buffer will be increased in size to accommodate the call to the push_back method.


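[Figure: v.push_back(9) on a full buffer, in three steps. Before the call: size 5, capacity 5. After reallocation: size 5, capacity 10. After the insertion: size 6, capacity 10.]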
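Reallocation can be observed on a real std::vector by watching capacity(). The sketch below counts how often the capacity changes during a run of push_back calls; the function name is hypothetical, and the exact growth pattern is implementation-defined, so the count differs between compilers.

```cpp
#include <cstddef>
#include <vector>

// Push n values into an empty vector and count how often capacity() changes,
// that is, how often the buffer was reallocated.
std::size_t count_reallocations(std::size_t n)
{
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; i++)
    {
        v.push_back(9);
        if (v.capacity() != last_capacity)
        {
            reallocations++;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

On typical implementations the count for 1000 insertions is a few dozen at most, far fewer than the number of push_back calls.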


vector – adding or removing an element at the end

The reallocation does not happen very often. That helps with the efficiency but makes the analysis a bit harder.

The reallocation algorithm also affects the analysis.

Suppose we choose to double the size with each reallocation. If we start a vector with capacity 10, we must reallocate when the buffer reaches sizes 10, 20, 40, 80, 160, 320, 640, 1280, and so on.

Assume that one insertion without reallocation takes time $T_1$. Assume that reallocation of k elements takes time $kT_2$.

What is the cost of 1,280 push_back operations?

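$1280 \cdot T_1$ for the 1280 insertions, and the reallocation cost is:

$$10T_2 + 20T_2 + 40T_2 + \cdots + 1280T_2 = (1 + 2 + 4 + \cdots + 128) \cdot 10 \cdot T_2 = 255 \cdot 10 \cdot T_2 < 256 \cdot 10 \cdot T_2 = 1280 \cdot 2 \cdot T_2$$

For 1280 push_backs, the total cost is a bit less than:

$$1280 \cdot (T_1 + 2T_2)$$

The cost of n push_back operations is then less than:

$$n \cdot (T_1 + 2T_2)$$

Because the second factor is a constant, we conclude that n push_back operations take O(n) time.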

But, we know that it isn't quite true that an individual push_back operation takes O(1) time because occasionally a push_back is unlucky and must reallocate the buffer.




However, if the cost of that reallocation is distributed over the preceding push_back operations, then the surcharge for each of them is still a constant amount.

We say that push_back takes amortized O(1) time, which is written as O(1)+.
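The amortized argument can be checked with a small simulation. This sketch does not use std::vector; it models a buffer that starts with capacity 10 and doubles when full, as assumed in the analysis, and counts one unit of work per element copied during a reallocation. The function name is hypothetical.

```cpp
#include <cstddef>

// Simulate n push_back calls on a buffer that starts with capacity 10 and
// doubles whenever it is full. Returns the total number of element copies
// caused by reallocations (one unit of cost per copied element).
std::size_t copies_for_pushes(std::size_t n)
{
    std::size_t size = 0;
    std::size_t capacity = 10;
    std::size_t copies = 0;
    for (std::size_t i = 0; i < n; i++)
    {
        if (size == capacity)
        {
            copies += size; // reallocating moves every existing element
            capacity *= 2;
        }
        size++;
    }
    return copies;
}
```

In this model the reallocation at size 1280 is triggered by the 1281st push_back, which brings the total to 2550 copies. In every case the total stays below 2n, so the reallocation surcharge per push_back is bounded by a constant.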


Action items

Read book chapter 13.

PA 3 is due tomorrow midnight.

Questions?

