
CSE 1520 EXAM NOTES

Topic A

Computing Systems:

Computing systems are dynamic entities used to solve problems and interact with their

environment.

They consist of devices, programs, and data.

Hardware - The physical elements of a computing system (printer, circuit boards, wires,

keyboard…).

Software - The programs that provide the instructions for a computer to execute.

Data - Information in a form a computer can use.

Abstraction - A mental model that removes complex details.

Early History of Computing:

o Abacus (2400 BC)

An early device to record numeric values.

o Blaise Pascal (1623-1662)

Created a mechanical device to add, subtract, divide & multiply.

o Gottfried Wilhelm von Leibniz (1646-1716)

Created a mechanical device to perform all four whole number operations.

o Joseph Jacquard

Jacquard’s Loom (1801), the punched card

o Charles Babbage (1791-1871)

Difference Engine, Analytical Engine

o Augusta Ada Byron (Lovelace)

Babbage’s assistant

Considered to be the first Programmer, Invented the concept of the loop

o William Burroughs (1857-1898)

Adding Machine

o Herman Hollerith (1860-1929)

Electro-mechanical Tabulator

o Alan Turing (1912-1954)

Turing Machine - an abstract mathematical model

Artificial Intelligence Testing

Early computers launch new era in mathematics, physics, engineering and economics.

Harvard Mark I (1939)

ENIAC - Electronic Numerical Integrator and Computer

EDVAC - Electronic Discrete Variable Automatic Computer

first machine with a stored program


UNIVAC I - Universal Automatic Computer (1951)

First Generation Hardware (1951-1959)

o Vacuum Tubes

Large, not very reliable, generated a lot of heat

o Magnetic Drum

Memory device that rotated under a read/write head

o Card Readers, Magnetic Tape Drives

Sequential auxiliary storage devices

Second Generation Hardware (1959-1965)

o Transistor

Replaced vacuum tubes: fast, small, durable, cheap

o Magnetic Cores

Replaced magnetic drums: information available instantly

o Magnetic Disks

Replaced magnetic tape: data can be accessed directly

Third Generation Hardware (1965-1971)

o Integrated Circuits

Replaced circuit boards: smaller, cheaper, faster, more reliable

o Transistors

Now used for memory construction

o Terminal

An input/output device with a keyboard and screen

Fourth Generation Hardware (1971-?)

o Large-scale Integration

Great advances in chip technology

o PCs, the Commercial Market, Workstations

Personal Computers were developed as new companies like Apple and Atari

came into being. Workstations emerged.

Parallel Computing and Networking

Parallel Computing

o Computers rely on interconnected central processing units that increase

processing speed.


Networking

o With Ethernet, small computers could be connected and share resources. A file server connected PCs in the late 1980s.

ARPANET and LANs led to the Internet.

First Generation Software (1951-1959)

o Machine Language

Computer programs were written in binary (1s and 0s).

o Assembly Languages and translators

Programs were written in artificial programming languages and were then

translated into machine language.

o Programmer Changes

Programmers divide into application programmers and systems programmers.

Second Generation Software (1959-1965)

o High Level Languages

English-like statements make programming easier. Fortran, COBOL, Lisp are

examples.

Third Generation Software (1965-1971)

o Systems Software

utility programs,

language translators,

and the operating system, which decides which

programs to run and when.

o Separation between Users and Hardware

Computer programmers began to write programs to be used by people who did

not know how to program.

Fourth Generation Software (1971-1989)

o Structured Programming

Pascal, C, C++

o New Application Software for Users

Spreadsheets, word processors, database management systems.

Fifth Generation Software (1990- present)

o Microsoft

The Windows operating system, and other Microsoft application programs

dominate the market.

o Object-Oriented Design

Based on a hierarchy of data objects (i.e. Java).

o World Wide Web

Allows easy global communication through the Internet.

o New Users

Today’s user needs no computer knowledge.

Topic A Laboratory Information

File Systems and Directories

File Systems

File

A named collection of related data.

File system

The logical view that an operating system provides so that users can manage

information as a collection of files.

Directory

A named group of files. Also called a folder.

Text and Binary Files

Text file:

A file in which the bytes of data are organized as characters from the ASCII or

Unicode character sets.

Binary file:

A file that contains data in a specific format, requiring interpretation.

The terms text file and binary file are somewhat misleading…

o They seem to imply that the information in a text file is not stored as binary data.

o Ultimately, all information on a computer is stored as binary digits.

o These terms refer to how those bits are formatted: as chunks of 8 or 16 bits,

interpreted as characters, or in some other special format.
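As a small illustration of the point above, the sketch below takes the same two bytes and reads them first as ASCII characters and then as one 16-bit integer. The byte values and variable names are chosen only for this example.

```python
# The same two bytes, interpreted in two different ways.
raw = bytes([0b01000001, 0b01000010])   # bit patterns 01000001 and 01000010

as_text = raw.decode("ascii")            # each 8-bit chunk is one character
as_number = int.from_bytes(raw, "big")   # all 16 bits form one unsigned integer

print(as_text)     # AB
print(as_number)   # 16706
```

The bits never change; only the interpretation applied to them does.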

File Types

Most files, whether they are in text or binary format, contain a specific type of

information.

For example, a file may contain a program, an image, or an audio clip.

The kind of information contained in a document is called the file type.

Most operating systems recognize a list of specific file types.

File names are often separated, usually by a period, into two parts:

1. Main name

2. File extension

The file extension indicates the type of the file.

File Access

Sequential access:

Information in the file is processed in order, and read and write operations

move the current file pointer as far as needed to read or write the data. The

most common file access technique, and the simplest to implement.

Direct access:

Files are conceptually divided into numbered logical records and each logical

record can be accessed directly by number.
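The difference between the two access techniques can be sketched with an in-memory file of fixed-length records. The record size, the record contents, and the function names are assumptions made for this example only.

```python
import io

RECORD_SIZE = 8  # assumed fixed record length in bytes (hypothetical)

def read_sequential(f, count):
    """Read records in order; each read moves the file pointer forward."""
    f.seek(0)
    return [f.read(RECORD_SIZE) for _ in range(count)]

def read_direct(f, record_number):
    """Jump straight to a numbered logical record without reading the others."""
    f.seek(record_number * RECORD_SIZE)
    return f.read(RECORD_SIZE)

# An in-memory stand-in for a file holding four fixed-length records.
data = io.BytesIO(b"record00record01record02record03")

print(read_direct(data, 2))         # b'record02'
print(read_sequential(data, 4)[3])  # b'record03'
```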

File Protection

In multiuser systems, file protection is of primary importance.

We don’t want one user to be able to access another user’s files unless the access is

specifically allowed.

A file protection mechanism determines who can use a file and for what general

purpose.

A file’s protection settings in the Unix operating system are divided into three categories:

1. Owner

2. Group

3. World
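As a rough sketch of these three categories, Python's standard `stat` module can render a Unix permission mode as text. The read/write/execute letters (r, w, x) are standard Unix notation that these notes do not spell out, and the mode value 640 and the helper name `describe` are example choices.

```python
import stat

def describe(mode):
    """Split a Unix permission mode into owner, group, and world settings."""
    text = stat.filemode(mode)  # e.g. '-rw-r-----'
    return {"owner": text[1:4], "group": text[4:7], "world": text[7:10]}

settings = describe(0o100640)  # a regular file with mode 640 (example value)
print(settings)  # {'owner': 'rw-', 'group': 'r--', 'world': '---'}
```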

Directory Trees

A directory of files can be contained within another directory.

The directory containing another is usually called the parent directory, and the

one inside is called a subdirectory.

Directory tree:

A logical view of a file system; a structure showing the nested directory

organization of a file system.

Root directory:

The directory at the highest level.

At any point in time, you can be thought of as working in a particular location (that is, a

particular subdirectory).

Working directory:

The subdirectory in which you are working.

Path Names

Path

A text designation of the location of a file or subdirectory in a file system,

consisting of the series of directories through which you must go to find the file.

Absolute path

A path that begins at the root and specifies each step down the tree until it

reaches the desired file or directory.

Examples of absolute path:

o C:\Program Files\MS Office\WinWord.exe

o C:\My Documents\letters\applications\vaTech.doc

o C:\Windows\System\QuickTime

Relative path

A path name that begins at the current working directory.

Suppose the current working directory is C:\My Documents\letters:

Then the following relative path names could be used:

cancelMag.doc, applications\calState.doc
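A minimal sketch of how a relative path combines with the working directory, using Python's standard `ntpath` module (Windows-style path handling that works on any platform). The function name `resolve` is an assumption for illustration.

```python
import ntpath

def resolve(working_dir, path):
    """Return an absolute path for `path`, relative to `working_dir` if needed."""
    if ntpath.isabs(path):
        return ntpath.normpath(path)   # already starts at the root
    return ntpath.normpath(ntpath.join(working_dir, path))

cwd = r"C:\My Documents\letters"
print(resolve(cwd, r"applications\calState.doc"))
# C:\My Documents\letters\applications\calState.doc
print(resolve(cwd, r"C:\Windows\System\QuickTime"))
# C:\Windows\System\QuickTime
```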

Topic B

Binary Number Systems

Positional Notation

10^4    10^3   10^2   10^1   10^0
10000   1000   100    10     1

Allows us to count past 10.

Each column of a number represents a power of the base. The exponent is the order of

magnitude for the column.


The Decimal system is based on the number of digits we have.


The magnitude of each column is the base, raised to its exponent.

10^4    10^3   10^2   10^1   10^0
10000   1000   100    10     1
2       7      9      1      6
20000 + 7000 + 900  + 10   + 6   = 27916

The magnitude of a number is determined by multiplying the magnitude of the column

by the digit in the column and summing the products.
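The rule above can be sketched as a short function that works for any base. The name `positional_value` is an assumption for illustration.

```python
def positional_value(digits, base):
    """Sum digit * base**position, with position 0 at the rightmost column."""
    total = 0
    for position, digit in enumerate(reversed(digits)):
        total += digit * base ** position
    return total

print(positional_value([2, 7, 9, 1, 6], 10))  # 27916
```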

Binary Numbers

The base in a Binary system is 2.

There are only 2 digits – 0 and 1.

Since we use the term frequently, “binary digit” can be shortened to ‘bit’.

8 bits together form a byte.

A Single Byte

2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
128   64    32    16    8     4     2     1
1     1     1     1     1     1     1     1
128 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = 255

255 is the largest decimal value that can be expressed in 8 bits.

Longer Numbers

Since 255 is the largest number that can be represented in 8 bits, larger values simply

require longer numbers.

o For example, 27916 is represented by: 0110110100001100
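One way to produce such bit patterns is repeated division by 2, collecting the remainders from right to left. This is a minimal sketch; the function name `to_binary` and the fixed-width padding are choices made for illustration.

```python
def to_binary(value, width):
    """Convert a non-negative integer to a fixed-width binary string."""
    if value < 0 or value >= 2 ** width:
        raise ValueError("value does not fit in the requested width")
    bits = []
    for _ in range(width):
        bits.append(str(value % 2))  # the remainder is the next bit, right to left
        value //= 2
    return "".join(reversed(bits))

print(to_binary(255, 8))  # 11111111
```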

Short Forms for Binary

Because large numbers require long strings of Binary digits, short forms have been

developed to help deal with them.

An early system used was called Octal.

It’s based on the 8 patterns in 3 bits.

Integers

To store integers, half the combinations are used to represent negative values.

The MSB is used to represent the sign.

The range for Integers in 1 byte is: -128 to +127

Which value of the sign bit (0 or 1) will represent a negative number? _________

Excess Notation

The notation system that uses a sign bit (MSB) of 0 to represent negative values.

Fixed length notation system.

Zero is the first non-negative value:

o 10000000

The pattern immediately before zero is -1:

o 01111111

The largest value is stored as 11111111 (+127)

The smallest value is stored as 00000000 (-128)

2’s Complement Notation

The notation system that uses a sign bit (MSB) of 1 to represent negative values.

Fixed length notation system.

Zero is the first non-negative value:

o 00000000

The pattern immediately before zero is -1:

o 11111111

The largest value is stored as 01111111 (+127)

The smallest value is stored as 10000000 (-128)

Fractions

A radix point separates the integer part from the fraction part of a number.

o 101.101

Columns to the right of the radix have negative powers of 2.

Scientific Notation

Very large and very small numbers are often represented such that their order of

magnitude can be compared.

The basic concept is an exponential notation using powers of 10.

a × 10^b

Where b is an integer,

and a is a real number such that: 1 ≤ |a| < 10

An electron's mass is about 0.00000000000000000000000000000091093826 kg.

o In scientific notation, this is written 9.1093826×10^−31 kg.

The Earth's mass is about 5,973,600,000,000,000,000,000,000 kg.

o In scientific notation, this is written 5.9736×10^24 kg.

E Notation

To allow values like this to be expressed on calculators and early terminals, × 10^b was replaced by Eb.

So 9.1093826×10^−31 becomes 9.1093826E−31

And 5.9736×10^24 becomes 5.9736E+24

The ‘a’ part of the number is called the mantissa or significand.

The ‘Eb’ part is called the exponent.

Since these numbers could also be negative they would typically have a sign as well.
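Most programming languages still accept and print E notation. The sketch below uses Python's standard format specifiers; the variable names are chosen for this example, and Python prints an ordinary hyphen for the minus sign.

```python
electron_mass = 9.1093826e-31   # kg, written in E notation
earth_mass = 5.9736e24          # kg

print(f"{electron_mass:.7E}")   # 9.1093826E-31
print(f"{earth_mass:.4E}")      # 5.9736E+24
```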

Floating Point Storage

In floating point notation the bit pattern is divided into 3 components:

Sign – 1 bit (0 for +, 1 for -)

Exponent – stored in Excess notation

Mantissa – must begin with 1

Mantissa

Assumes a radix point immediately left of the first digit.

The exponent will determine how far and in which direction to move the radix.

Representing Data

The information below is provided to help you review and practice

converting numbers.

Binary

numbers are expressed in a positional notation system. Each digit represents a power of 2.

In an 8-bit pattern the column positions have these values:

2^7   2^6   2^5   2^4   2^3   2^2   2^1   2^0
128   64    32    16    8     4     2     1

For example, the number 11111111 expresses the highest magnitude an 8-bit pattern can represent (128+64+32+16+8+4+2+1 = 255), which is exactly 1 less than the value of the next column (256).

Binary Fractions (radix point)

The radix point (.) separates integers from fractions. Fraction digits mirror the values of the integers:

. 1/2 1/4 1/8 1/16 1/32 1/64 1/128 ...

1011.1100 converts to 8 + 0 + 2 + 1 + 1/2 + 1/4 + 0 + 0 = 11 3/4
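The conversion above can be sketched as follows. The function name `binary_to_decimal` is an assumption for illustration.

```python
def binary_to_decimal(text):
    """Evaluate a binary string that may contain a radix point."""
    whole, _, fraction = text.partition(".")
    value = 0.0
    for position, bit in enumerate(reversed(whole)):
        value += int(bit) * 2 ** position    # 1, 2, 4, 8, ...
    for position, bit in enumerate(fraction, start=1):
        value += int(bit) * 2 ** -position   # 1/2, 1/4, 1/8, ...
    return value

print(binary_to_decimal("1011.1100"))  # 11.75
```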

Hexadecimal Notation

is a shorthand for representing a 4-bit pattern using only a single character (a base 16 digit).
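A minimal sketch of the grouping idea: split the bit string into 4-bit groups and replace each group with one base 16 digit. The function name `bits_to_hex` is an assumption for illustration.

```python
def bits_to_hex(bits):
    """Replace each 4-bit group with a single hexadecimal digit."""
    if len(bits) % 4 != 0:
        raise ValueError("bit string length must be a multiple of 4")
    digits = "0123456789ABCDEF"
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(digits[int(group, 2)] for group in groups)

print(bits_to_hex("11101110"))  # EE
```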

Excess Notation

makes it possible to store negative values by treating the Most Significant Bit (MSB) as the sign. In excess notation a 0 as the MSB indicates a negative (-) number.

In the case of the 4 bit pattern, for example:

0110

the value of the most significant bit is 8, so 4 bit patterns are called excess 8. To convert this example find the decimal value of the pattern (which is 6);

and subtract 8 (which is the normal value of the MSB).

The result is -2
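The same steps can be sketched for any pattern length, where the excess (bias) is the normal value of the MSB. The function names are assumptions for illustration.

```python
def excess_decode(bits):
    """Read the pattern as plain binary, then subtract the value of the MSB."""
    bias = 2 ** (len(bits) - 1)      # 8 for 4-bit patterns, 128 for 8-bit
    return int(bits, 2) - bias

def excess_encode(value, width):
    """Add the bias, then write the result as plain binary."""
    bias = 2 ** (width - 1)
    if not -bias <= value < bias:
        raise ValueError("value out of range for this width")
    return format(value + bias, f"0{width}b")

print(excess_decode("0110"))   # -2
print(excess_encode(-2, 4))    # 0110
```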

Two's Complement Notation

is another fixed length approach to representing negative values. It employs a sign bit of 0 to represent non-negative (+) and 1 for negative (-).

If the number is a non-negative, it may be evaluated as in standard binary notation. If the number is negative, evaluating the number involves finding the 1's

complement, adding 1 to the result, and remembering the sign. Alternatively, the following rule may be applied to find its positive complement.

"Copy the number from right to left up to and including the first "1", then

complement the rest of the number.

Evaluate as in standard binary and remember the sign."
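Both procedures described above can be sketched in a few lines. The function names are assumptions for illustration; `negate_by_copy_rule` implements the quoted right-to-left rule.

```python
def twos_complement_decode(bits):
    """Evaluate a fixed-length two's complement pattern."""
    if bits[0] == "0":
        return int(bits, 2)                      # non-negative: plain binary
    flipped = "".join("1" if b == "0" else "0" for b in bits)  # 1's complement
    return -(int(flipped, 2) + 1)                # add 1, remember the sign

def negate_by_copy_rule(bits):
    """Copy from the right through the first 1, then complement the rest."""
    last_one = bits.rfind("1")
    if last_one == -1:
        return bits                              # zero is its own negation
    head = "".join("1" if b == "0" else "0" for b in bits[:last_one])
    return head + bits[last_one:]

print(twos_complement_decode("1100"))  # -4
print(negate_by_copy_rule("0100"))     # 1100
```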

Mapping Notation Systems onto Each Other

Binary   Decimal   Hexadecimal   Excess (8)   Two's Complement
0000     0         0             -8           0
0001     1         1             -7           1
0010     2         2             -6           2
0011     3         3             -5           3
0100     4         4             -4           4
0101     5         5             -3           5
0110     6         6             -2           6
0111     7         7             -1           7
1000     8         8             0            -8
1001     9         9             1            -7
1010     10        A             2            -6
1011     11        B             3            -5
1100     12        C             4            -4
1101     13        D             5            -3
1110     14        E             6            -2
1111     15        F             7            -1

Floating Point Notation

consists of 3 parts:

1. a Sign bit ["0" is non-negative (+), "1" is negative (-)],

2. an Exponent,

3. and a Mantissa.

In an eight bit pattern,

the most significant bit (MSB) is the sign bit, followed by a 3-bit exponent (expressed in excess notation), followed by a 4-bit mantissa.

o The radix point is assumed to be at the left of the mantissa.

o In a normalized floating point notation, the mantissa must begin with a "1".

E.G. - 0 101 1001

o the sign bit is 0 - so the number represented is non-negative

o the exponent is 101 - which, in excess(4) notation, is 5-4, or +1

o the mantissa is 1001 - the radix goes at the left, producing .1001

o a positive exponent moves the radix to the right and a negative exponent moves the radix to the left

o applying the exponent shifts the radix 1 position right - 1.001

o which is 1 and 1/8th.

Therefore the number 01011001 in normalized floating point notation represents the value +1 1/8th
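The worked example can be sketched as a decoder for this 8-bit layout (1 sign bit, 3-bit excess-4 exponent, 4-bit mantissa). The function name `decode_float8` is an assumption, and the layout is the one used in these notes rather than a standard hardware format.

```python
def decode_float8(bits):
    """Decode sign (1 bit), excess-4 exponent (3 bits), mantissa (4 bits)."""
    bits = bits.replace(" ", "")           # allow "0 101 1001" style input
    sign = -1 if bits[0] == "1" else 1
    exponent = int(bits[1:4], 2) - 4       # excess-4: subtract the MSB value
    mantissa = int(bits[4:], 2) / 16       # radix point at the left: .mmmm
    return sign * mantissa * 2 ** exponent # exponent shifts the radix point

print(decode_float8("0 101 1001"))  # 1.125
```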

Here are some more examples for you to try out:

Number     Decimal   Hexadecimal   Excess(128)   Two's Complement   Floating Point (normalized)
11101110   238       EE            +110          -18                -3 1/2
01011011   91        5B            -37           +91                +1 3/8
10111000   184       B8            +56           -72                -1/4

Adding binary numbers

There are only 4 rules for binary addition:

  0     0     1     1
+ 0   + 1   + 0   + 1
---   ---   ---   ---
  0     1     1    10  (the carry rule)
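Applying these rules column by column, from right to left, adds binary numbers of any length. This is a minimal sketch; the function name `add_binary` is an assumption for illustration.

```python
def add_binary(a, b):
    """Add two binary strings column by column, carrying into the next column."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        total = int(bit_a) + int(bit_b) + carry
        result.append(str(total % 2))   # the bit written in this column
        carry = total // 2              # the bit carried to the next column
    if carry:
        result.append("1")
    return "".join(reversed(result))

print(add_binary("1", "1"))  # 10
```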

Subtracting binary numbers

The two's complement notation system is typically used to perform subtraction by using the

rules of addition, e.g., 5 - 4 may be expressed as 5 + (-4)

5 converts to 0101

4 converts to 0100, so -4 must be 1100

Add these two together:

0101

+1100

10001

But since we are limited to a fixed length (4 bits in this case) the last carry bit is discarded. This leaves an answer of

0001
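The same calculation can be sketched with integer operations, where a bit mask plays the role of the fixed length. The function name `subtract` and the default width are assumptions for illustration.

```python
def subtract(a, b, width=4):
    """Compute a - b by adding the two's complement of b in a fixed width."""
    mask = (1 << width) - 1
    neg_b = (~b + 1) & mask         # flip the bits of b and add 1
    total = (a + neg_b) & mask      # the mask discards the final carry bit
    return format(total, f"0{width}b")

print(subtract(5, 4))  # 0001
```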

Multiplying binary numbers

Multiplication can be re-expressed as repeated addition:

7 * 3 is the same as 7 + 7 + 7 ("seven times three" means "add seven together three times")

7 is represented as 0111

0111

+0111

1110

+0111

10101

which is, of course, 21.

Optimising Multiplication

Multiplying large numbers, say 1043 * 131, involves a lot of additions.

For improved efficiency a processor can apply 2 rules:

1. to multiply a binary number by 2, shift the bits one position to the left

2. multiplication distributes over addition

The decimal number 6 is represented by the binary pattern 110.

Shifting the bits left one position doubles the value of the column of each bit.

So the pattern for decimal 12 is 1100.

Multiplying by powers of 2, then, is simply a matter of shifting the bits left by the exponent.

8 is 2^3, so multiplying a number by 8 consists of shifting the bits 3 positions left.

In this calculation, 131 is the sum of 128, 2, and 1, so applying the property of Distribution produces an equivalent expression :

1043 * (128 + 2 + 1)

or 1043 * 128 + 1043 * 2 + 1043 * 1

1043 in binary is 10000010011

1043 * 2 is 100000100110

1043 * 128 is 100000100110000000

Adding these three numbers produces the answer 100001010110111001
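The two rules combine into the usual shift-and-add method: for every 1 bit in the multiplier, add a correspondingly shifted copy of the other number. This is a sketch of the idea, not a description of any particular processor; the function name is an assumption.

```python
def shift_and_add(a, b):
    """Multiply by adding a shifted copy of `a` for each 1 bit in `b`."""
    product = 0
    shift = 0
    while b:
        if b & 1:                   # this power of 2 is part of b
            product += a << shift   # a * 2**shift, done by shifting left
        b >>= 1
        shift += 1
    return product

print(bin(shift_and_add(1043, 131))[2:])  # 100001010110111001
```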

Dividing binary numbers

Division can be re-expressed as repeated subtraction:

To divide 21 by 7 see how many times you can subtract 7 from 21.

Of course, this is the same as adding -7, so we use two's complement notation.

21 is represented as 010101

7 is 000111, so -7 is 111001

0 1 0 1 0 1

+ 1 1 1 0 0 1

(1) 0 0 1 1 1 0

+ 1 1 1 0 0 1

(10) 0 0 0 1 1 1

+ 1 1 1 0 0 1

(11) 0 0 0 0 0 0

Since we're using 6 bit patterns, the extra carry bits would normally be discarded, but

notice that they contain the integer part of the answer.

The 6 bits to their right contain the remainder.

So 21 divided by 7 is 3, with 0 remainder.
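The procedure above can be sketched as a loop that keeps adding the two's complement of the divisor and counts the carry bits. The function name and the default 6-bit width are assumptions for illustration, and the sketch assumes a positive divisor and a dividend that fits in the chosen width.

```python
def divide_by_repeated_subtraction(dividend, divisor, width=6):
    """Return (quotient, remainder) by repeatedly adding -divisor."""
    mask = (1 << width) - 1
    neg_divisor = (-divisor) & mask          # two's complement of the divisor
    quotient = 0
    remainder = dividend & mask
    while remainder >= divisor:
        total = remainder + neg_divisor
        quotient += total >> width           # the carry bit counts one subtraction
        remainder = total & mask             # the low bits hold what is left
    return quotient, remainder

print(divide_by_repeated_subtraction(21, 7))  # (3, 0)
```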

Data Representation

Computers are multimedia devices, dealing with many categories of information.

Computers store, present, and help modify:

o Numbers

o Text

o Audio

o Images and graphics

o Video

Computers are finite. Computer memory and other hardware devices have only so

much room to store and manipulate a certain amount of data. The goal of data

representation is to represent enough of the world to satisfy our computational needs

and our senses of sight and sound.

Analog or Digital Information

Information can be represented in one of two ways: analog or digital:

Analog data A continuous representation, analogous to the actual

information it represents.

Digital data A discrete representation, breaking the information up into

separate elements.

A mercury thermometer exemplifies analog data as it continually rises and falls in direct

proportion to the temperature.

Digital displays only show discrete information.

Computers cannot work well with analog information, so we digitize information by

breaking it into pieces and representing those pieces separately.

Why do we use binary? Modern computers are designed to use and manage binary

values because the devices that store and manage the data are far less expensive and

far more reliable if they only need to represent one of two possible values.

Electronic Signals

An analog signal continually fluctuates up and down in voltage. But a digital signal has

only a high or low state, corresponding to the two binary digits.

All electronic signals (both analog and digital) degrade as they move down a line. That is,

the voltage of the signal fluctuates due to environmental effects.

Representing Text

To represent a text document in digital form, we need to be able to represent every

possible character that may appear.

There is a finite number of characters to represent, so the general approach is to list

them all and assign each a binary string.

A character set is a list of characters and the codes used to represent each one.

By agreeing to use a particular character set, computer manufacturers have made the

processing of text data easier.

The ASCII Character Set

ASCII stands for American Standard Code for Information Interchange.

The ASCII character set originally used seven bits to represent each character, allowing

for 128 unique characters.

Notice the organisation of the ASCII table.

The table divides in half according to the MSB.

Letters are all in the second half, so all codes for alphabetic characters start with 1.

This second half of the table divides in half again according to the next bit:

o UPPERCASE letters start 10.

o lowercase letters start 11.

The first half of the table also divides in half according to the next bit:

o Control characters start 00.

o Numerals and punctuation start 01.

Note that control characters (the first 32 in the ASCII character set) do not have simple

character representations that you could print to the screen.

Some, however, perform actions with which you are familiar.

Coding letters in ASCII is easy.

a. Let’s look at ‘j’ as an example:

i. Since ‘j’ is a letter, its code starts with a 1.

ii. Since it’s lowercase, the next bit is also a 1.

iii. Since it’s the tenth letter of the alphabet the rest of the code is 01010.

iv. The complete ASCII code for ‘j’ is 1101010.
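The steps for coding a letter can be sketched and checked against Python's built-in `ord`. The function name `ascii_letter_code` is an assumption for illustration.

```python
def ascii_letter_code(letter):
    """Build the 7-bit ASCII code of a letter from its case and alphabet position."""
    if len(letter) != 1 or not letter.isascii() or not letter.isalpha():
        raise ValueError("expected a single ASCII letter")
    prefix = "11" if letter.islower() else "10"      # lowercase 11, UPPERCASE 10
    position = ord(letter.lower()) - ord("a") + 1    # 'j' is letter number 10
    return prefix + format(position, "05b")

print(ascii_letter_code("j"))   # 1101010
print(format(ord("j"), "07b"))  # 1101010
```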

ASCII evolved so that eight bits were used.

The 7-bit codes were simply prefixed with another bit, giving another natural doubling.

o The original 7-bit codes were padded with 0.

So the code for ‘j’ became 01101010.

o 128 new characters were added.

The codes for this alternate character set start with 1.

The Unicode Character Set

Even the extended version of the ASCII character set is not enough for international use.

The Unicode character set uses 16 bits per character. The Unicode character set can

represent 2^16, or over 65 thousand, characters.

Unicode was designed to be a superset of ASCII. That is, the first 256 characters in the

Unicode character set correspond exactly to the extended ASCII character set.


Data Compression

It is important that we find ways to store and transmit data efficiently, which leads

computer scientists to find ways to compress it.

Data compression is a reduction in the amount of space needed to store a piece of data.

Compression ratio is the size of the compressed data divided by the size of the original

data.

A data compression technique can be

o lossless, which means the data can be retrieved without any loss of the original

information,

o lossy, which means some information may be lost in the process of compaction.

As examples, consider these 3 techniques:

1. keyword encoding

2. run-length encoding

3. Huffman encoding

Key Word Encoding

Frequently used words are replaced with a single character.

For example…

Note, that the characters used to encode cannot be part of the original text.

Consider the following paragraph,

The human body is composed of many independent systems, such as the

circulatory system, the respiratory system, and the reproductive system. Not

only must all systems work independently, they must interact and cooperate as

well. Overall health is a function of the well-being of separate systems, as well as

how these separate systems work in concert.

This version highlights the words that can be replaced.

The human body is composed of many independent systems, such as the

circulatory system, the respiratory system, and the reproductive system. Not only

must each system work independently, they must interact and cooperate as well.

Overall health is a function of the well-being of separate systems, as well as how

those separate systems work in concert.

This is the encoded paragraph:

The human body is composed of many independent systems, such ^ ~ circulatory

system, ~ respiratory system, + ~ reproductive system. Not only & each system

work independently, they & interact + cooperate ^ %. Overall health is a function

of ~ %- being of separate systems, ^ % ^ how # separate systems work in

concert.

There are a total of 349 characters in the original paragraph including spaces and

punctuation.

The encoded paragraph contains 314 characters, resulting in a savings of 35 characters.

The compression ratio for this example is 314/349 or approximately 0.9.

A compression ratio of .9 (90%) is NOT very good. The compressed file is 90% the size of

the original.

However, there are several ways this can be improved. Can you think of some?
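Keyword encoding can be sketched as a whole-word substitution. The notes do not list the substitution table, so the one below is an assumption inferred from the symbols that appear in the encoded paragraph; the function names are also illustrative.

```python
import re

# Assumed table, inferred from the encoded paragraph (not given in the notes).
KEYWORDS = {"as": "^", "the": "~", "and": "+", "must": "&", "well": "%", "those": "#"}

def keyword_encode(text):
    """Replace each whole keyword with its single-character code."""
    if any(symbol in text for symbol in KEYWORDS.values()):
        raise ValueError("code characters must not appear in the original text")
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, KEYWORDS)) + r")\b")
    return pattern.sub(lambda match: KEYWORDS[match.group(1)], text)

def keyword_decode(text):
    """Restore the keywords; safe because the codes never occur in the original."""
    for word, symbol in KEYWORDS.items():
        text = text.replace(symbol, word)
    return text

original = "such as the circulatory system, the respiratory system, and the reproductive system"
encoded = keyword_encode(original)
print(encoded)
# such ^ ~ circulatory system, ~ respiratory system, + ~ reproductive system
print(len(encoded) / len(original))  # compression ratio for this fragment
```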

Run-Length Encoding

A single character may be repeated over and over again in a long sequence. This type of

repetition doesn’t generally take place in English text, but often occurs in large data

streams.

In run-length encoding, a sequence of repeated characters is replaced by:

o a flag character,

o followed by the repeated character,

o followed by a single digit that indicates how many times the character is repeated.

Some examples:

AAAAAAA

would be encoded as

*A7

*n5*x9ccc*h6 some other text *k8eee

can be decoded into the following original text:

nnnnnxxxxxxxxxccchhhhhh some other text kkkkkkkkeee

In the second example, the original text contains 51 characters, and the encoded string

contains 35 characters, giving us a compression ratio of 35/51 or approximately 0.68.

Since we are using one character for the repetition count, it seems that we can’t encode

repetition lengths greater than nine. However, instead of interpreting the count

character as an ASCII digit, we could interpret it as a binary number.
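A minimal sketch of the scheme with `*` as the flag character and a single ASCII digit as the count. Encoding only runs of four or more characters is an assumption of this sketch (a three-character run would not shrink); it also assumes the flag character never occurs in the original text.

```python
FLAG = "*"

def rle_encode(text, min_run=4):
    """Replace runs of 4 to 9 repeated characters with flag + character + count."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        run = 1
        while i + run < len(text) and text[i + run] == ch and run < 9:
            run += 1                       # a single digit limits a run to 9
        out.append(f"{FLAG}{ch}{run}" if run >= min_run else ch * run)
        i += run
    return "".join(out)

def rle_decode(encoded):
    """Expand every flag + character + count triple back into a run."""
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] == FLAG:
            out.append(encoded[i + 1] * int(encoded[i + 2]))
            i += 3
        else:
            out.append(encoded[i])
            i += 1
    return "".join(out)

print(rle_encode("AAAAAAA"))  # *A7
print(rle_decode("*n5*x9ccc*h6 some other text *k8eee"))
# nnnnnxxxxxxxxxccchhhhhh some other text kkkkkkkkeee
```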

Huffman Encoding

Why should the blank, which is used very frequently, take up the same number of bits as

the character “X”, which is seldom used in text?

Huffman codes use variable-length bit strings to represent each character.

A few characters may be represented by five bits, and another few by six bits, and yet

another few by seven bits, and so forth.

If we use only a few bits to represent characters that appear often and reserve longer

bit strings for characters that don’t appear often, the overall size of the document being

represented will be smaller.

DOORBELL would be encoded in binary as 1011110110111101001100100.

If we used a fixed-size bit string to represent each character (say, 8 bits), then the binary

form of the original string would be 64 bits.

The Huffman encoding for that string is 25 bits long, giving a compression ratio of 25/64,

or approximately 0.39.

An important characteristic of any Huffman encoding is that no bit string used to

represent a character is the prefix of any other bit string used to represent a character.
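The prefix property is what makes decoding possible without separators between characters. The code table below is an assumption: the notes do not list one, and this table was chosen because it is prefix-free and reproduces the 25-bit string given for DOORBELL.

```python
# Assumed code table (not given in the notes); no code is a prefix of another.
CODES = {"D": "1011", "O": "110", "R": "111", "B": "1010", "E": "01", "L": "100"}

def huffman_encode(text):
    """Concatenate the variable-length code of each character."""
    return "".join(CODES[ch] for ch in text)

def huffman_decode(bits):
    """Read bits left to right; emit a character as soon as a code matches."""
    lookup = {code: ch for ch, code in CODES.items()}
    out = []
    current = ""
    for bit in bits:
        current += bit
        if current in lookup:          # unambiguous thanks to the prefix property
            out.append(lookup[current])
            current = ""
    if current:
        raise ValueError("bit string ends in the middle of a code")
    return "".join(out)

encoded = huffman_encode("DOORBELL")
print(encoded)                  # 1011110110111101001100100
print(huffman_decode(encoded))  # DOORBELL
```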

Representing Audio Information

We perceive sound when a series of air compressions vibrate a membrane in our ear,

which sends signals to our brain.

A stereo sends an electrical signal to a speaker to produce sound. This signal is an analog

representation of the sound wave. The voltage in the signal varies in direct proportion

to the sound wave.

To digitize the signal we periodically measure the voltage of the signal and record the

appropriate numeric value. The process is called sampling.

In general, a sampling rate of around 40,000 times per second is enough to create a

reasonable sound reproduction.

The standard sampling rate for CDs is 44.1 kHz. The Pro Audio standard is 48 kHz.

It should be noted that the potential loss of peak values suggested in the previous slide

is a myth. The time lapse between samples is much too short for any such loss.

The human ear hears sounds between 20 Hz and 20,000 Hz. Sampling at more than twice the

highest of these frequencies (more than 40,000 samples per second) eliminates any potential loss of data in that range.

For a complete explanation refer to the Nyquist–Shannon sampling theorem.
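Sampling can be sketched by measuring a signal at regular intervals and rounding each measurement to an integer. The sine wave stands in for the analog voltage; the function name, the 440 Hz test tone, and the 16-bit sample size are assumptions for illustration.

```python
import math

def sample_tone(frequency_hz, sample_rate_hz, sample_count, bits=16):
    """Measure a sine wave `sample_count` times and round to signed integers."""
    peak = 2 ** (bits - 1) - 1           # largest positive value, e.g. 32767
    return [
        round(peak * math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz))
        for n in range(sample_count)
    ]

# 441 samples at the CD rate of 44,100 per second cover 1/100 of a second.
samples = sample_tone(440, 44_100, 441)
print(len(samples), samples[0])  # 441 0
```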

A compact disc (CD) stores audio information digitally. On the surface of the CD are

microscopic pits that represent binary digits. A low intensity laser is pointed at the disc.

The laser light reflects strongly if the surface is smooth and reflects poorly if the surface

is pitted.

Audio Formats

WAV, AU, AIFF, VQF, and MP3.


MP3 is dominant

MP3 is short for MPEG-1 (or MPEG-2) Audio Layer III.

MP3 employs both lossy and lossless compression.

1. First it analyses the frequency spread and compares it to mathematical models of

human psychoacoustics (the study of the interrelation between the ear and the

brain), and it discards information that can’t be heard by humans.

2. Then the bit stream is compressed using a form of Huffman encoding to achieve

additional compression.

Representing Images and Graphics

Colour is our perception of the various frequencies of light that reach the retinas of our

eyes.

Our retinas have three types of colour photoreceptor cones which respond to different

sets of frequencies. These photoreceptor categories correspond to the colours of red,

green, and blue.

Colour is often expressed in a computer as an RGB (red, green, blue) value, which is

actually three numbers that indicate the relative contribution of each of these three

primary colours.

For example, an RGB value of (255, 255, 0) maximizes the contribution of red and

green, and minimizes the contribution of blue. The resulting colour is a bright yellow.

The amount of data that is used to represent a colour is called the colour depth.

HiColor is a term that indicates a 16-bit colour depth. Five bits are used for each number

in an RGB value and the extra bit is sometimes used to represent transparency.

TrueColor indicates a 24-bit colour depth. Therefore, each number in an RGB value gets

eight bits.

HiColor uses 5 bits for each number.

o Since 2^5 = 32, there are 32 different levels for each of the 3 primary colours. So

there are 32^3 (or 2^15) possible colours.

o This is a total of 32,768 different colours.

TrueColor uses eight bits for each colour component.

o 2^8 × 2^8 × 2^8 = 2^24 or 16,777,216 colours.
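The colour counts follow directly from the number of bits per component, and a TrueColor value can be packed into a single 24-bit number. This is a sketch; the function names are assumptions for illustration.

```python
def colour_count(bits_per_channel, channels=3):
    """Number of distinct colours for a given number of bits per component."""
    return (2 ** bits_per_channel) ** channels

def pack_rgb(red, green, blue):
    """Pack three 8-bit components into one 24-bit TrueColor value."""
    return (red << 16) | (green << 8) | blue

print(colour_count(5))                        # 32768
print(colour_count(8))                        # 16777216
print(format(pack_rgb(255, 255, 0), "024b"))  # 111111111111111100000000
```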

Some monitors can use as many as 32 bits for colour depth.

The human eye is able to distinguish about 200 intensity levels in each of the three

primaries red, green, and blue. All in all, up to 10 million different colours can be

distinguished.

So modern monitors are examples of solutions without a problem.

o If the human eye can distinguish only 10 million colours, why develop monitors

that can display over 4 billion?

Indexed color

A particular application such as a browser may support only a certain number of

specific colours, creating a palette from which to choose. For example, Netscape

Navigator’s colour palette has only 216 colours.

Digitized Images and Graphics

o The storage of image information on a pixel-by-pixel basis is called a

raster-graphics format.

o There are several popular raster file formats including:

BMP (bitmap)

GIF (Graphics Interchange Format)

JPEG (Joint Photographic Experts Group)

Vector Graphics

o Instead of assigning colours to pixels as we do in raster graphics, a vector-graphics format describes an image in terms of lines and geometric shapes.

o A vector graphic is a series of commands that describe a line’s direction,

thickness, and colour. The file size for these formats tends to be small

because every pixel does not need to be represented.

Representing Video

A video codec (COmpressor/DECompressor) refers to the methods used to shrink the

size of a movie to allow it to be played on a computer or over a network.

Almost all video codecs use lossy compression to minimize the huge amounts of data

associated with video.

o To simulate motion, movies need to record (and play back) at least 12 frames per

second.

o However, good sound quality requires 24 frames/s.

o 24 frames/s

= 1440 frames/minute

= 86,400 frames/hour

o Recall…

o If each frame has a resolution of 1024 x 768

there are 786,432 pixels in a frame.

o If the colour of each pixel is stored as 24 bits (3 bytes) of data, one frame alone

requires 2,359,296 bytes (about 2.25 MB) of memory.

o An hour of film, then, requires 203,843,174,400 bytes (194,400 MB, or roughly

190 Gigabytes) of storage – just for the images.
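
The storage figures can be reproduced with a small calculation. This is only a sketch: raw_video_bytes is an illustrative name, and it assumes uncompressed frames with no audio.

```python
def raw_video_bytes(width, height, bits_per_pixel, frames_per_second, seconds):
    """Bytes needed to store uncompressed video frames (images only)."""
    bytes_per_frame = width * height * bits_per_pixel // 8
    return bytes_per_frame * frames_per_second * seconds


one_frame = raw_video_bytes(1024, 768, 24, 1, 1)
one_hour = raw_video_bytes(1024, 768, 24, 24, 60 * 60)
print(one_frame)                 # 2359296
print(one_hour)                  # 203843174400
print(one_hour / 2 ** 20)        # size in MB (1 MB = 2^20 bytes): 194400.0
```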

The first step in compressing video is to reduce the amount of information stored for a

frame.

This problem is essentially the same as that faced when compressing still images.

Spatial compression

A technique based on removing redundant information within a frame.

Each compressed frame will still be quite large.

We can save even more space by recognizing that between two frames, most of the

image hasn’t changed. Storing only the changes (deltas) from one frame to the next is

much more efficient.

Temporal compression

A technique based on storing differences between consecutive frames.

Topic B Laboratory information

Topic C

Gates and Circuits

Computers and Electricity

Gate:

A device that performs a basic operation on electrical signals.

Circuits:

Gates combined to perform more complicated tasks.

There are three different, but equally powerful, notational methods for describing the

behavior of gates and circuits:

o Boolean expressions

o logic diagrams

o truth tables

Boolean expressions:

Expressions in Boolean algebra, a mathematical notation for expressing two-

valued logic. This algebraic notation is an elegant and powerful way to

demonstrate the activity of electrical circuits.

Logic diagram:

A graphical representation of a circuit. Each type of gate is represented by a

specific graphical symbol.

Truth table:

A table showing all possible input values and the associated output values.

Gates

Let’s examine the processing of the following

six types of gates:

o NOT

o AND

o OR

o XOR

o NAND

o NOR

Typically, logic diagrams are black and white, and the gates are distinguished only by

their shape.

NOT Gate

A NOT gate accepts one input value and produces one output value.

By definition, if the input value for a NOT gate is 0, the output value is 1, and if the input

value is 1, the output is 0.

A NOT gate is sometimes referred to as an inverter because it inverts the input value.

AND Gate

An AND gate accepts two input signals.

If the input values for an AND gate are both 1, the output is 1; otherwise, the output is

0.

OR Gate

If the two input values are both 0, the output value is 0; otherwise, the output is 1.

XOR Gate (eXclusive OR)

An XOR gate produces 0 if its two inputs are the same, and a 1 otherwise.

Note the difference between the XOR gate

and the OR gate; they differ only in one

input situation:

o When both input signals are 1, the OR gate produces a 1 but the XOR

produces a 0.

NAND and NOR Gates

The NAND and NOR gates are essentially the opposite of the AND and OR gates,

respectively.

Various representations of a NAND gate

Various representations of a NOR gate

Review of Gate Processing

A NOT gate inverts its single input value.

An AND gate produces 1 if both input values are 1.

An OR gate produces 1 if one or the other or both input values are 1.

An XOR gate produces 1 if one or the other (but not both) input values are 1.

A NAND gate produces the opposite results of an AND gate.

A NOR gate produces the opposite results of an OR gate.
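
The six gates reviewed above can be modelled as small Python functions on the values 0 and 1. This is a sketch for checking truth tables, not a description of real hardware; the helper truth_table is an illustrative name.

```python
def NOT(a):
    return 1 - a          # inverts the single input

def AND(a, b):
    return a & b          # 1 only if both inputs are 1

def OR(a, b):
    return a | b          # 0 only if both inputs are 0

def XOR(a, b):
    return a ^ b          # 1 if the inputs differ

def NAND(a, b):
    return NOT(AND(a, b))  # opposite of AND

def NOR(a, b):
    return NOT(OR(a, b))   # opposite of OR


def truth_table(gate):
    """List (A, B, output) for every combination of two inputs."""
    return [(a, b, gate(a, b)) for a in (0, 1) for b in (0, 1)]


for name, gate in [("AND", AND), ("OR", OR), ("XOR", XOR),
                   ("NAND", NAND), ("NOR", NOR)]:
    print(name, truth_table(gate))
```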

Constructing Gates

Transistor:

A device that acts, depending on the voltage level of an input signal, either as a wire

that conducts electricity or as a resistor that blocks the flow of electricity.

o A transistor has no moving parts, yet acts like a switch.

o It is made of a semiconductor material, which is neither a particularly good

conductor of electricity, such as copper, nor a particularly good insulator,

such as rubber.

A transistor has three terminals:

1. A source

2. A base

3. An emitter, typically connected to a ground wire

If the electrical signal is grounded, it is allowed to flow through an alternative route to

the ground (literally) where it can do no harm.

It turns out from the way a transistor works, the easiest gates to create are the NOT,

NAND, and NOR gates.

Circuits

Two general categories:

1. In a combinational circuit, the input values explicitly determine the output.

2. In a sequential circuit, the output is a function of the input values as well as the

existing state of the circuit.

As with gates, we can describe the operations

of entire circuits using three notations:

1. Boolean expressions

2. logic diagrams

3. truth tables

Combinational Circuits

Gates are combined into circuits by using the output of one gate as the input for

another.

Because there are three inputs to this circuit, eight rows are required to describe all

possible input combinations

This same circuit using Boolean algebra:

AB + AC

A (B + C)

Combinational Circuits

We have therefore just demonstrated circuit equivalence.

o That is, both circuits produce the exact same output for each input value

combination.
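
Circuit equivalence can be checked by brute force: compare the two Boolean expressions AB + AC and A (B + C) on all eight input rows. The function names below are illustrative only.

```python
from itertools import product


def circuit_1(a, b, c):
    """AB + AC"""
    return (a and b) or (a and c)


def circuit_2(a, b, c):
    """A (B + C)"""
    return a and (b or c)


def equivalent(f, g, inputs=3):
    """True if f and g give the same output for every input combination."""
    return all(bool(f(*bits)) == bool(g(*bits))
               for bits in product((0, 1), repeat=inputs))


print(equivalent(circuit_1, circuit_2))   # True
```

With three inputs, product((0, 1), repeat=3) yields exactly the eight rows of the truth table.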

Boolean algebra allows us to apply provable mathematical principles to help us design

logical circuits.

Adders

At the digital logic level, addition is performed in binary.

Addition operations are carried out by special circuits called, appropriately, adders.

The result of adding two binary digits could produce a carry value.

Recall that 1 + 1 = 10 in base two.

A circuit that computes the sum of two bits and produces the correct carry bit is called a

half adder.

Examine the adder’s truth table carefully.

The Sum column has the same results as the XOR gate.

The Carry column has the same results as the AND gate.

This circuit diagram represents a half adder.

As do these two Boolean expressions:

sum = A ⊕ B

carry = AB
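
A half adder can be sketched in Python using the bitwise operators for XOR and AND. The function name half_adder is chosen for illustration.

```python
def half_adder(a, b):
    """Add two bits; return (sum, carry)."""
    total = a ^ b     # XOR gives the sum bit
    carry = a & b     # AND gives the carry bit
    return total, carry


for a in (0, 1):
    for b in (0, 1):
        print(a, b, half_adder(a, b))
```

The last row, 1 + 1, gives sum 0 and carry 1, which is 10 in base two.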

Multiplexers

A multiplexer is a general circuit that produces a single output signal.

o The output is equal to one of several input signals to the circuit.

o The multiplexer selects which input signal is used as an output signal based on

the value represented by a few more input signals, called select signals or select

control lines.
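
The selection behaviour of a multiplexer can be sketched as follows. This models only the input/output behaviour, not the gates inside; it assumes the select bits are given most significant first, which is a convention chosen here for illustration.

```python
def multiplexer(inputs, select):
    """Return the input signal chosen by the select control lines.

    inputs: list of 2**n bits; select: list of n bits, most significant first.
    """
    if len(inputs) != 2 ** len(select):
        raise ValueError("n select lines choose among exactly 2**n inputs")
    index = 0
    for bit in select:
        index = index * 2 + bit   # read the select lines as a binary number
    return inputs[index]


print(multiplexer([0, 1, 1, 0], [1, 0]))   # select = 10 (two) picks inputs[2]: 1
```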

Sequential Circuits

Digital circuits can also be used to store information.

This application employs sequential circuits, because the output of the circuit is also

used as input to the circuit.

Circuits as Memory

An S-R latch stores a single binary digit

(1 or 0).

There are several ways an S-R latch circuit could be designed using various kinds of

gates.

The design of this circuit guarantees that the two outputs X and Y are always

complements of each other.

The value of X at any point in time is considered to be the current state of the circuit.

Therefore, if X is 1, the circuit is storing a 1; if X is 0, the circuit is storing a 0.
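
One possible S-R latch design can be simulated in Python. The sketch below assumes two cross-coupled NAND gates (X = NAND(S, Y), Y = NAND(R, X)); the notes do not fix a design, so this wiring and the name sr_latch are assumptions for illustration.

```python
def nand(a, b):
    """NAND gate: 0 only when both inputs are 1."""
    return 0 if (a == 1 and b == 1) else 1


def sr_latch(s, r, x, y):
    """Return the settled outputs (X, Y) of a cross-coupled NAND latch.

    s, r are the inputs; x, y are the outputs before the inputs are applied.
    """
    for _ in range(4):                 # a few passes let the feedback settle
        new_x = nand(s, y)
        new_y = nand(r, new_x)
        if (new_x, new_y) == (x, y):   # outputs stopped changing
            break
        x, y = new_x, new_y
    return x, y


state = (0, 1)                     # latch currently stores 0
state = sr_latch(0, 1, *state)     # pulse S low: store a 1
print(state)                       # (1, 0)
state = sr_latch(1, 1, *state)     # both inputs 1: state is held
print(state)                       # (1, 0)
state = sr_latch(1, 0, *state)     # pulse R low: store a 0
print(state)                       # (0, 1)
```

In this design both inputs are normally 1, and X and Y stay complements as long as S and R are not both 0 at once.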

Circuits as Memory (S-R latch)

There are many ways to construct memory circuits. The SR latch is cheap to build (only 4

transistors) but it requires that its inputs be 1 normally.

The flip-flop is a more expensive device that requires inputs of 0.

Integrated Circuits

Integrated circuit (also called a chip) - A piece of silicon on which multiple gates have

been embedded.

These silicon pieces are mounted on a plastic or ceramic package with pins along

the edges that can be soldered onto circuit boards or inserted into appropriate

sockets.

Integrated circuits (IC) are classified by the number of gates contained in them.

CPU Chips

The most important integrated circuit in any computer is the Central Processing Unit, or

CPU.

Each CPU chip has a large number of pins through which essentially all communication

in a computer system occurs.

A CPU adaptor:

Each hole receives a pin from the CPU.

Computing Components

Stored program concept, The von Neumann architecture

Arithmetic/Logic Unit

o Performs basic arithmetic operations such as adding.

o Performs logical operations such as AND, OR, and NOT.

o Most modern ALUs have a small number of special storage units called

registers.

Control Unit

The organizing force in the computer.

o There are two registers in the control unit:

1. The instruction register (IR) contains the instruction that is being

executed.

2. The program counter (PC) contains the address of the next

instruction to be executed.

o ALU and control unit comprise the Central Processing Unit, or CPU.

Memory

A collection of cells, each with a unique physical address.

The Fetch-Execute Cycle

o Fetch the next instruction

o Decode the instruction

o Get data if needed

o Execute the instruction
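
The four steps of the cycle can be sketched with a toy machine. Everything about this machine is hypothetical: the opcodes LOAD, ADD, STORE and HALT, the single accumulator register, and the separate program and data memories are assumptions made to keep the example short.

```python
def run(program, data):
    """Run a toy program; return the accumulator when HALT is reached.

    program: list of (opcode, operand) pairs; data: dict of address -> value.
    """
    pc = 0     # program counter: address of the next instruction
    acc = 0    # accumulator register (hypothetical machine has only one)
    while True:
        ir = program[pc]               # fetch the next instruction into the IR
        pc += 1
        opcode, operand = ir           # decode the instruction
        if opcode == "HALT":
            return acc
        elif opcode == "LOAD":
            acc = data[operand]        # get data, then execute
        elif opcode == "ADD":
            acc += data[operand]
        elif opcode == "STORE":
            data[operand] = acc
        else:
            raise ValueError("unknown opcode: " + str(opcode))


memory = {0: 2, 1: 3, 2: 0}
program = [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", None)]
print(run(program, memory))   # 5
print(memory[2])              # 5
```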

RAM and ROM

o RAM stands for Random Access Memory.

Inherent in the idea of being able to access each location is the ability to

change the contents of each location.

o ROM stands for Read Only Memory.

The contents in locations in ROM cannot be changed.

o RAM is volatile, ROM is not.

This means that RAM does not retain its bit configuration when the

power is turned off, but ROM does.

Secondary Storage Devices

o Because most of main memory is volatile and limited, it is essential that there be

other types of storage devices where programs and data can be stored when

they are no longer being processed.

o Secondary storage devices can be installed within the computer box at the

factory or added later as needed.

Compact Disks

o A CD drive uses a laser to read information stored optically on a plastic disk.

o CD-ROM is Read-Only Memory.

o CD-RW is Read/Write.

o CD-DA is Digital Audio.

o CD-WORM is Write Once, Read Many.

o DVD stands for Digital Versatile Disk.

Input/Output Units

o Input Unit

A device through which data and programs from the outside world are

entered into the computer.

Keyboard, mouse, and scanning devices

o Output unit

A device through which results stored in the computer memory are

made available to the outside world.

Printers and video display terminals

Touch Screens

Touch screen

A computer monitor that can respond to the user touching the screen with a

stylus or finger.

There are four types:

1. Resistive

2. Capacitive

3. Infrared

4. Surface acoustic wave (SAW)

Resistive touch screen

A screen made up of two layers of electrically conductive material.

One layer has vertical lines, the other has horizontal lines.

When the top layer is pressed, it comes in contact with the second layer

which allows electrical current to flow.

The specific vertical and horizontal lines that make contact indicate the

location on the screen that was touched.

Capacitive touch screen

A screen made up of a laminate applied over a glass screen.

The laminate conducts electricity in all directions, and a very small

current is applied equally on the four corners.

When the screen is touched, current flows to the finger or stylus.

The location of the touch on the screen is determined by comparing how

strong the flow of electricity is from each corner.

Infrared touch screen

A screen with crisscrossing horizontal and vertical beams of infrared light.

Sensors on opposite sides of the screen detect the beams.

When the user breaks the beams by touching the screen, the location of

the break can be determined.

Surface acoustic wave (SAW)

A screen with crisscrossing high frequency sound waves across the horizontal

and vertical axes.

When a finger touches the surface, the corresponding sensors detect the

interruption and determine the location of the touch.

Non-von Neumann Architectures

The linear machine cycle is still dominant.

Since 1990, the concept of parallel processing has attracted significant research.

3 basic approaches:

1. Synchronous processing

2. Pipelining

3. Shared-memory configuration

Topic D

Operating Systems

Software Categories

Application software

is written to address our specific needs—to solve problems in the

real world.

Word processing programs, games, inventory

control systems, automobile diagnostic programs,

and missile guidance programs are all application software.

System software

manages a computer system at a fundamental level.

It provides the tools and an environment in which application software

can be created and run.

Within the class of system software are two categories:

1. Utility software

programs for performing various activities fundamental to

computer installations, but not part of the OS. (Examples

include formatting a disk, networking, copying files, using a

modem, and data compression.)

2. Operating Systems

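Figure: the software hierarchy. Software divides into application software and system

software; system software comprises utility software and the operating system; the

operating system consists of a shell and a kernel.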

Operating System

An operating system consists of two parts:

1. The kernel manages computer resources, such as memory and input/output

devices.

2. The shell provides an interface through which a human can interact with the

computer.

An operating system also allows application programs to interact with the other system

resources.

An operating system interacts with many

aspects of a computer system.

The various roles of an operating system generally revolve around the idea of “sharing

nicely”.

An operating system manages resources, and these resources are often shared in one

way or another among programs that “want” to use them.

Managing Resources

Resource management consists of:

I. Memory management

II. Process management

III. CPU scheduling

Memory Management

Memory management

keeps track of what is stored in memory and where in memory it is.

Multiprogramming

is the technique of keeping multiple programs in main memory at the same time.

These programs compete for access to the CPU so that they can execute.

Memory is a continuous set of bits referenced by specific addresses

Logical and Physical Addresses

A program may include instructions that transfer control. For example, in BASIC a

programmer can say “GOTO 200”

where 200 is the line number of the instruction to be executed next.

This line number is relative to the start of the program and so is a logical address.

However, the physical address is the actual location in memory where this

instruction is stored.

Logical address

(sometimes called a virtual or relative address) is a value that

specifies a generic location, relative to the program but not to the

reality of main memory.

Physical address

is an actual address in the main memory device.

Operating systems must employ techniques to:

I. Track where and how a program resides in memory.

II. Convert logical program addresses into actual memory addresses.

There are three approaches to memory management depending on how we

conceive of memory being organised:

1) Single Contiguous Memory

2) Partitioned Memory

3) Paged Memory

Single Contiguous Memory Management

There are only two programs in memory:

1. The operating system

2. The application program

This approach is called single contiguous memory management.

In this system, a logical address is simply an integer value relative to the starting point of

the program.

To produce a physical address, we add a logical address to the starting address of the

program in physical main memory.

Partition Memory Management

When using fixed partitions, main memory is divided into a particular number of

partitions.

When using dynamic partitions, the partitions are created to fit the need of the

programs.

At any point in time, memory is divided into a set of partitions, some empty and some

allocated to programs.

The Base register holds the beginning address of the current partition.

The Bounds register holds the length of the current partition.

Address resolution in partition memory management
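
Address resolution with the base and bounds registers can be sketched as below. The range check against the bounds register is an assumption about how the length is used; the function name is illustrative. Single contiguous memory management is the same calculation without the check, with the program's starting address as the base.

```python
def resolve_partition_address(logical, base, bounds):
    """Translate a logical address into a physical one for the current partition.

    base: beginning address of the partition; bounds: length of the partition.
    """
    if not 0 <= logical < bounds:
        raise ValueError("logical address falls outside the partition")
    return base + logical


print(resolve_partition_address(100, 4000, 500))   # 4100
```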

Partition Selection

o First fit

Program is allocated to the first partition big enough to hold it.

o Best fit

Program is allocated to the smallest partition big enough to hold it.

o Worst fit

Program is allocated to the largest partition big enough to hold it.
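
The three selection strategies can be compared with a short sketch. The partition sizes are made-up values, and choose_partition is an illustrative name; it returns the index of the chosen free partition, or None if nothing is big enough.

```python
def choose_partition(free_sizes, program_size, strategy):
    """Pick a free partition for a program using first, best, or worst fit."""
    candidates = [(size, i) for i, size in enumerate(free_sizes)
                  if size >= program_size]
    if not candidates:
        return None
    if strategy == "first":
        return min(i for _, i in candidates)      # earliest big-enough partition
    if strategy == "best":
        return min(candidates)[1]                 # smallest big-enough partition
    if strategy == "worst":
        return max(candidates, key=lambda c: (c[0], -c[1]))[1]   # largest
    raise ValueError("strategy must be 'first', 'best', or 'worst'")


free = [300, 600, 350, 200, 750]   # hypothetical free partition sizes
for strategy in ("first", "best", "worst"):
    print(strategy, choose_partition(free, 325, strategy))
```

For a program of size 325, first fit picks the 600 partition, best fit the 350 partition, and worst fit the 750 partition.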

Paged Memory Management

Paged memory technique

main memory is divided into small fixed-size blocks of storage called frames.

A program is divided into pages that (for the sake of our discussion) we

assume are the same size as a frame.

The operating system maintains a separate page-map table (PMT) for each program in

memory.

To produce a physical address, you first look up the page in the PMT to find the frame

number in which it is stored.

Then multiply the frame number by the frame size and add the offset to get the physical

address.
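
The paged address calculation can be sketched as follows. The page-map table contents and the frame size of 1024 are made-up values, and both function names are illustrative. The sketch assumes a logical address can be split into a page number and an offset by dividing by the frame size.

```python
def split_logical(logical, frame_size):
    """Split a flat logical address into (page, offset)."""
    return divmod(logical, frame_size)


def paged_physical_address(page, offset, page_map_table, frame_size):
    """Look up the frame holding the page, then add the offset."""
    if not 0 <= offset < frame_size:
        raise ValueError("offset must be smaller than the frame size")
    frame = page_map_table[page]          # PMT lookup: page -> frame number
    return frame * frame_size + offset


pmt = {0: 5, 1: 12, 2: 15, 3: 7}          # hypothetical PMT for one program
print(paged_physical_address(1, 222, pmt, 1024))   # 12 * 1024 + 222 = 12510
page, offset = split_logical(2270, 1024)           # (2, 222)
print(paged_physical_address(page, offset, pmt, 1024))   # 15 * 1024 + 222 = 15582
```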

A paged memory management approach

An important extension is demand paging.

o Not all parts of a program actually have to be in memory at the same time.

o In demand paging, the pages are brought into memory on demand.

o The act of bringing in a page from secondary memory, which often causes

another page to be written back to secondary memory, is called a page swap.

The demand paging approach gives rise to the idea of virtual memory, the illusion that

there are no restrictions on the size of a program.

Too much page swapping, however, is called thrashing and can seriously degrade

system performance.

Resource Management

A process can be defined as a program in execution.

The operating system performs process management to carefully track the progress of

each process and all of its intermediate states.

Timesharing

Multiprogramming allowed multiple processes to be active at once, which gave rise to

the ability for programmers to interact with the computer system directly, while still

sharing its resources.

A timesharing system allows multiple users to interact with a computer at the same

time.

In a timesharing system, each user has his or her own virtual machine, in which all

system resources are (in effect) available for use.

The Process Control Block

The operating system must manage a large amount of data for each active process.

Usually that data is stored in a data structure called a process control block (PCB).

o Each state is represented by a list of PCBs, one for each process in that state.

Keep in mind that there is only one CPU and therefore only one set of CPU registers.

o These registers contain the values for the currently executing process.

o The values define the state of the machine at any given time.

Each time a process is moved to the running state:

o Register values for the interrupted process are stored into its PCB.

o Register values of the process admitted to the running state are loaded into the

CPU from its waiting state PCB.

o This exchange of information is called a context switch.

CPU Scheduling

The act of determining which process in the ready state should be moved to the running

state.

That is, decide which process should be given over to the CPU.

Nonpreemptive scheduling

occurs when the currently executing process gives up the CPU voluntarily.

Preemptive scheduling

occurs when the operating system decides to favour another process,

preempting the currently executing process.

Turnaround time

for a process is the amount of time between when the process arrives in the

ready state to the time it exits the running state for the last time.

First-Come, First-Served

The first ordering structure that comes to mind is the queue.

Processes are moved to the CPU in the order in which they arrive in the Ready state.

FCFS scheduling is nonpreemptive – one process completes before the next begins.

Shortest Job Next

This technique looks at all processes in the Ready state and dispatches the one with the

shortest service time.

It is also generally implemented as a nonpreemptive algorithm.

Round Robin Scheduling

…distributes the processing time equitably among all ready processes.

The algorithm establishes a particular time slice (or quantum), which is the amount of

time each process receives before being preempted. It is then returned to the ready

state to allow another process its turn.

The Round-robin algorithm is preemptive.

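Notice that Round Robin can be less efficient than the nonpreemptive algorithms: every

preemption costs a context switch, and the average turnaround time is often longer.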
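
The three algorithms can be compared on turnaround time with a small simulation. This is a sketch under stated assumptions: the service times are made-up, all processes arrive in the ready state at time 0, and context switches take no time.

```python
from collections import deque


def fcfs(service_times):
    """Turnaround time of each process, run in arrival order."""
    clock, turnaround = 0, []
    for t in service_times:
        clock += t
        turnaround.append(clock)
    return turnaround


def sjn(service_times):
    """Turnaround times when the shortest job is always dispatched next."""
    turnaround = [0] * len(service_times)
    clock = 0
    for i in sorted(range(len(service_times)), key=lambda i: service_times[i]):
        clock += service_times[i]
        turnaround[i] = clock
    return turnaround


def round_robin(service_times, quantum):
    """Turnaround times when each process gets one time slice per turn."""
    remaining = list(service_times)
    ready = deque(range(len(service_times)))
    turnaround = [0] * len(service_times)
    clock = 0
    while ready:
        i = ready.popleft()
        run = min(quantum, remaining[i])
        clock += run
        remaining[i] -= run
        if remaining[i] > 0:
            ready.append(i)          # preempted: back to the ready state
        else:
            turnaround[i] = clock    # finished
    return turnaround


times = [140, 75, 320, 280, 125]     # hypothetical service times
for name, result in [("FCFS", fcfs(times)), ("SJN", sjn(times)),
                     ("RR", round_robin(times, 50))]:
    print(name, result, sum(result) / len(result))
```

For this data the average turnaround times come out as 529.0 (FCFS), 435.0 (SJN), and 668.0 (Round Robin with a time slice of 50).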

Topic D Laboratory Information

Simulation

Simulation:

A model of a complex system and the experimental manipulation of the model

to observe the results.

Systems that are best suited to being simulated are dynamic, interactive,

and complicated.

Model:

An abstraction of a real system.

It is a representation of the objects within the system and the rules that

govern the interactions of the objects.

Constructing Models

Continuous simulation

o Treats time as continuous and expresses changes in terms of a set of differential

equations that reflect the relationships among the set of characteristics.

o Meteorological models fall into this category.

Discrete event simulation

o consists of entities, attributes, and events.

o Entity:

the representation of some object in the real system that must be

explicitly defined

o Attribute:

some characteristic of a particular entity

o Event:

an interaction between entities

Queuing Systems

Queuing system:

a discrete-event model that uses random numbers to represent the arrival and

duration of events.

The system is made up of servers and queues of objects to be served.

The objective is to utilize the servers as fully as possible while keeping the

wait time within a reasonable limit.

To construct a queuing model, we must know the following four things:

1. the number of servers

2. the number of events and how they affect the system

o in order to determine the rules of entity interaction

3. the distribution of arrival times

o in order to determine if an entity enters the system

4. the expected service time

o in order to determine the duration of an event
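
A minimal queuing model can be sketched from these four ingredients. The sketch assumes entities are served in arrival order by whichever server frees up first, and it uses exponentially distributed gaps between arrivals; both choices, and all names, are assumptions for illustration.

```python
import heapq
import random


def queue_waits(arrival_times, service_times, servers):
    """Wait time of each entity, given sorted arrival times and a server count."""
    free_at = [0] * servers            # time at which each server becomes free
    heapq.heapify(free_at)
    waits = []
    for arrive, service in zip(arrival_times, service_times):
        earliest = heapq.heappop(free_at)     # server that frees up first
        start = max(arrive, earliest)
        waits.append(start - arrive)
        heapq.heappush(free_at, start + service)
    return waits


def random_arrivals(count, mean_gap, seed=0):
    """Arrival times with random (exponential) gaps between entities."""
    rng = random.Random(seed)
    clock, times = 0.0, []
    for _ in range(count):
        clock += rng.expovariate(1.0 / mean_gap)
        times.append(clock)
    return times


print(queue_waits([0, 1, 2, 3], [5, 5, 5, 5], servers=1))   # [0, 4, 8, 12]
print(queue_waits([0, 1, 2, 3], [5, 5, 5, 5], servers=2))   # [0, 0, 3, 3]
arrivals = random_arrivals(10, mean_gap=4.0)
print(max(queue_waits(arrivals, [5] * 10, servers=2)))
```

Adding a second server cuts the longest wait in the fixed example from 12 to 3, which is the kind of trade-off between server utilization and wait time the model is meant to explore.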

Meteorological Models

Meteorological models are based on the time-dependent, partial differential equations

of fluid mechanics and thermodynamics.

Initial values for the variables are entered from observation, and the equations are

solved to define the values of the variables at some later time.

Computer models are designed to aid the weathercaster, not replace him or her.

The outputs from the computer models are predictions of the values of

variables in the future.

It is up to the weathercaster to determine what the values mean.

Topic E

Database Management

Database Management Systems

Database:

A structured set of data.

Database Management System:

(DBMS) A combination of software and data, including:

Physical database:

a collection of files that contain the data.

Database engine:

software that supports access to and modification of the database

contents.

Database schema

a specification of the logical structure of the data stored in the

database.

Specialized database languages allow the user to:

o specify the structure of data;

o add, modify, and delete data;

o query the database to retrieve specific stored data.

The elements of a database management system

Databases

Databases are a recent development in the management of large amounts of data.

As paper file systems were “computerized” each application was implemented

separately with its own data set.

These systems were riddled with both corrupt data and redundant data, none of which

could be shared.

The integration of separate systems into one database resolved these issues, but

introduced new ones.

With all data shared, control of access to the data becomes a major concern.

A schema is a description of the entire database structure used by the database

software to maintain the database.

A subschema is a description of only that part of the database that is particular to a

user’s needs.

A layered approach hides the complexities of database implementation.

o User sees data in terms of the application.

o The application “sees” data in terms of the database model.

o The DBMS “sees” data as it is organized.

Advantages of the layered approach include:

Simplification of the design process.

Better control of access.

Data Independence.

Applications can be written in terms of simple, conceptual views of the

data – the database model.

Database Models

A database model is a conceptual view of how to organize and manipulate data.

The most popular one is the Relational Model.

In a relational DBMS, the data items - and the relationships among them - are organized

into rectangular tables.

As with spreadsheets, these tables consist of rows and columns.

o Each table is called a relation.

o The rows are called tuples.

o The columns are called attributes.

Of course, different authors adopt different terms. There is a commonly used, alternate

set of names:

o Relations are also called tables.

o A tuple can be referred to as a record, and in this terminology a record is a

collection of related fields

We can express the schema for this database table as follows:

Movie (MovieId:key, Title, Genre, Rating)

We can express the schema for this table as:

Customer (CustomerId:key, Name, Address, CreditCardNumber)

A table can represent a collection of relationships between objects. The

RENTS table relates Customers to the Movies they’ve rented by their

respective Ids.

Relationships

We can also express the schema for a relationship:

Rents (CustomerId, MovieId, DateRented, DateDue)

Note the absence of a key field.

Relational operations

There are 3 fundamental operations that can be used to manipulate the tables in a

database:

SELECT

Extracts rows (tuples) from a table (relation)

PROJECT

Extracts columns (attributes) from a table (relation)

JOIN

Combines 2 tables (relations) into 1

The result of any relational operation is a new relation. We can express these operations

with a simple syntax.

NEW ← SELECT from MOVIE where RATING = “PG”

This operation creates a new relation (named NEW) by extracting all rows from

the MOVIE table that have a RATING of PG.

The same syntax can be used for the other operations.

PGmovies ← PROJECT MovieId, Title from NEW

This operation creates a new relation (named PGmovies) that extracts 2

attributes from the NEW relation.

A JOIN creates a new relation by combining 2 relations according to some criterion.

TEMP1 ← JOIN CUSTOMER and RENTS

where CUSTOMER.CustomerId = RENTS.CustomerId

The PROJECT operation can be used to remove the attributes we don’t want…

RENTALS ← PROJECT Name, Address, MovieId from TEMP1
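
The three operations can be sketched in Python by treating a relation as a list of dictionaries, one dictionary per tuple. The table contents below are made-up, and the join shown is a simple equality join on one shared attribute.

```python
def select(relation, predicate):
    """SELECT: keep the rows (tuples) for which the predicate is true."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """PROJECT: keep only the named columns, dropping duplicate rows."""
    result = []
    for row in relation:
        reduced = {name: row[name] for name in attributes}
        if reduced not in result:
            result.append(reduced)
    return result


def join(left, right, attribute):
    """JOIN: combine rows of two relations that agree on one attribute."""
    return [{**l, **r} for l in left for r in right
            if l[attribute] == r[attribute]]


# Hypothetical sample data
MOVIE = [
    {"MovieId": 101, "Title": "Sample A", "Genre": "action", "Rating": "PG"},
    {"MovieId": 102, "Title": "Sample B", "Genre": "drama", "Rating": "R"},
]
CUSTOMER = [{"CustomerId": 1, "Name": "Pat Doe", "Address": "1 Main St"}]
RENTS = [
    {"CustomerId": 1, "MovieId": 101},
    {"CustomerId": 2, "MovieId": 102},
]

NEW = select(MOVIE, lambda row: row["Rating"] == "PG")
PGmovies = project(NEW, ["MovieId", "Title"])
TEMP1 = join(CUSTOMER, RENTS, "CustomerId")
RENTALS = project(TEMP1, ["Name", "Address", "MovieId"])
print(PGmovies)
print(RENTALS)
```

Each function returns a new list, matching the rule that the result of any relational operation is a new relation.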

Structured Query Language

Structured Query Language (SQL)

A comprehensive database language for managing relational databases.

Queries in SQL

select attribute-list from table-list where condition

select Title from MOVIE where Rating = 'PG'

select Name, Address from CUSTOMER

select * from MOVIE where Genre like '%action%'

select * from MOVIE where Rating = 'R' order by Title

Modifying Database Content

insert into CUSTOMER values (9876, 'John Smith', '602 Greenbriar Court', '2938 3212 3402 0299')

update MOVIE set Genre = 'thriller drama' where Title = 'Unbreakable'

delete from MOVIE where Rating = 'R'
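
These statements can be tried out with Python's built-in sqlite3 module and an in-memory database. The rows inserted below, and the wrapper function demo, are made up for illustration; SQLite is only one of many relational DBMSs that accept this SQL.

```python
import sqlite3


def demo():
    """Run the query and modification statements on a small sample MOVIE table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table MOVIE "
                 "(MovieId integer primary key, Title text, Genre text, Rating text)")
    conn.executemany("insert into MOVIE values (?, ?, ?, ?)", [
        (101, "Sample Movie A", "action drama", "PG"),   # hypothetical rows
        (102, "Unbreakable", "thriller", "PG-13"),
        (103, "Sample Movie C", "comedy", "R"),
    ])
    pg_titles = [row[0] for row in
                 conn.execute("select Title from MOVIE where Rating = 'PG'")]
    action_ids = [row[0] for row in
                  conn.execute("select * from MOVIE where Genre like '%action%'")]
    conn.execute("update MOVIE set Genre = 'thriller drama' "
                 "where Title = 'Unbreakable'")
    genre = conn.execute("select Genre from MOVIE "
                         "where Title = 'Unbreakable'").fetchone()[0]
    conn.execute("delete from MOVIE where Rating = 'R'")
    remaining = conn.execute("select count(*) from MOVIE").fetchone()[0]
    conn.close()
    return pg_titles, action_ids, genre, remaining


print(demo())
```

With this sample data, demo() returns the one PG title, the one movie whose genre contains "action", the updated genre, and a count of 2 rows left after the delete.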

Database Design

Entity-relationship (ER) modeling

A popular technique for designing relational databases.

ER Diagram

Chief tool used for ER modeling.

o Captures the important record types, attributes, and relationships

in a graphical form.

These designations show the cardinality constraint of the relationship