CSE 1520 EXAM NOTES
Topic A
Computing Systems:
Computing systems are dynamic entities used to solve problems and interact with their
environment.
They consist of devices, programs, and data.
Hardware - The physical elements of a computing system (printer, circuit boards, wires,
keyboard…).
Software - The programs that provide the instructions for a computer to execute.
Data - Information in a form a computer can use.
Abstraction - A mental model that removes complex details.
Early History of Computing:
o Abacus (2400 BC)
An early device to record numeric values.
o Blaise Pascal (1623-1662)
Created a mechanical device to add, subtract, divide & multiply.
o Gottfried Wilhelm von Leibniz (1646-1716)
Created a mechanical device to perform all four whole number operations.
o Joseph Jacquard
Jacquard’s Loom (1801), the punched card
o Charles Babbage (1791-1871)
Difference Engine, Analytical Engine
o Augusta Ada Byron (Lovelace)
Babbage’s assistant
Considered to be the first Programmer, Invented the concept of the loop
o William Burroughs (1857-1898)
Adding Machine
o Herman Hollerith (1860-1929)
Electro-mechanical Tabulator
o Alan Turing (1912-1954)
Turing Machine - an abstract mathematical model
Artificial Intelligence Testing
Early computers launch new era in mathematics, physics, engineering and economics.
Harvard Mark I (1939)
ENIAC - Electronic Numerical Integrator and Computer
EDVAC - Electronic Discrete Variable Automatic Computer
first machine with a stored program
UNIVAC I - Universal Automatic Computer (1951)
First Generation Hardware (1951-1959)
o Vacuum Tubes
Large, not very reliable, generated a lot of heat
o Magnetic Drum
Memory device that rotated under a read/write head
o Card Readers, Magnetic Tape Drives
Sequential auxiliary storage devices
Second Generation Hardware (1959-1965)
o Transistor
Replaced vacuum tubes: fast, small, durable, cheap
o Magnetic Cores
Replaced magnetic drums: information available instantly
o Magnetic Disks
Replaced magnetic tape: data can be accessed directly
Third Generation Hardware (1965-1971)
o Integrated Circuits
Replaced circuit boards: smaller, cheaper, faster, more reliable
o Transistors
Now used for memory construction
o Terminal
An input/output device with a keyboard and screen
Fourth Generation Hardware (1971-?)
o Large-scale Integration
Great advances in chip technology
o PCs, the Commercial Market, Workstations
Personal Computers were developed as new companies like Apple and Atari
came into being. Workstations emerged.
Parallel Computing and Networking
Parallel Computing
o Computers rely on interconnected central processing units that increase
processing speed.
Networking
o With the Ethernet small computers could be connected and share resources. A
file server connected PCs in the late 1980s.
ARPANET and LANs -> Internet
First Generation Software (1951-1959)
o Machine Language
Computer programs were written in binary (1s and 0s).
o Assembly Languages and translators
Programs were written in artificial programming languages and were then
translated into machine language.
o Programmer Changes
Programmers divide into application programmers and systems programmers.
Second Generation Software (1959-1965)
o High Level Languages
English-like statements make programming easier. Fortran, COBOL, Lisp are
examples.
Third Generation Software (1965-1971)
o Systems Software
utility programs,
language translators,
and the operating system, which decides which
programs to run and when.
o Separation between Users and Hardware
Computer programmers began to write programs to be used by people who did
not know how to program.
Fourth Generation Software (1971-1989)
o Structured Programming
Pascal, C, C++
o New Application Software for Users
Spreadsheets, word processors, database management systems.
Fifth Generation Software (1990- present)
o Microsoft
The Windows operating system, and other Microsoft application programs
dominate the market.
o Object-Oriented Design
Based on a hierarchy of data objects (e.g. Java).
o World Wide Web
Allows easy global communication through the Internet.
o New Users
Today’s user needs no computer knowledge.
Topic A Laboratory Information
File Systems and Directories
File Systems
File
A named collection of related data.
File system
The logical view that an operating system provides so that users can manage
information as a collection of files.
Directory
A named group of files. Also called a folder.
Text and Binary Files
Text file:
A file in which the bytes of data are organized as characters from the ASCII or
Unicode character sets.
Binary file:
A file that contains data in a specific format, requiring interpretation.
The terms text file and binary file are somewhat misleading…
o They seem to imply that the information in a text file is not stored as binary data.
o Ultimately, all information on a computer is stored as binary digits.
o These terms refer to how those bits are formatted: as chunks of 8 or 16 bits,
interpreted as characters, or in some other special format.
File Types
Most files, whether they are in text or binary format, contain a specific type of
information.
For example, a file may contain a program, an image, or an audio clip.
The kind of information contained in a document is called the file type.
Most operating systems recognize a list of specific file types.
File names are often separated, usually by a period, into two parts:
1. Main name
2. File extension
The file extension indicates the type of the file.
File Access
Sequential access:
Information in the file is processed in order, and read and write operations
move the current file pointer as far as needed to read or write the data. The
most common file access technique, and the simplest to implement.
Direct access:
Files are conceptually divided into numbered logical records and each logical
record can be accessed directly by number.
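The two access techniques can be sketched in a few lines. This is only an illustration: it assumes fixed-size 4-byte logical records held in an in-memory stream, and the names RECORD_SIZE, read_sequential and read_direct are invented for the example.

```python
import io

RECORD_SIZE = 4  # assumed fixed size of one logical record, in bytes
data = io.BytesIO(b"AAAABBBBCCCCDDDD")  # stands in for a file of 4 records

def read_sequential(stream):
    """Process the records in order; each read advances the file pointer."""
    stream.seek(0)
    records = []
    while True:
        chunk = stream.read(RECORD_SIZE)
        if not chunk:
            break
        records.append(chunk)
    return records

def read_direct(stream, number):
    """Jump straight to logical record `number` (counting from 0)."""
    stream.seek(number * RECORD_SIZE)
    return stream.read(RECORD_SIZE)

print(read_sequential(data))
print(read_direct(data, 2))
```

Sequential access touches every record before the one it wants; direct access computes the position from the record number.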
File Protection
In multiuser systems, file protection is of primary importance.
We don’t want one user to be able to access another user’s files unless the access is
specifically allowed.
A file protection mechanism determines who can use a file and for what general
purpose.
A file’s protection settings in the Unix operating system are divided into three categories:
1. Owner
2. Group
3. World
Directory Trees
A directory of files can be contained within another directory.
The directory containing another is usually called the parent directory, and the
one inside is called a subdirectory.
Directory tree:
A logical view of a file system; a structure showing the nested directory
organization of a file system.
Root directory:
The directory at the highest level.
At any point in time, you can be thought of as working in a particular location (that is, a
particular subdirectory).
Working directory:
The subdirectory in which you are working.
Path Names
Path
A text designation of the location of a file or subdirectory in a file system,
consisting of the series of directories through which you must go to find the file.
Absolute path
A path that begins at the root and specifies each step down the tree until it
reaches the desired file or directory.
Examples of absolute path:
o C:\Program Files\MS Office\WinWord.exe
o C:\My Documents\letters\applications\vaTech.doc
o C:\Windows\System\QuickTime
Relative path
A path name that begins at the current working directory.
Suppose the current working directory is C:\My Documents\letters:
Then the following relative path names could be used:
cancelMag.doc, applications\calState.doc
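As a rough illustration of how a relative path is resolved, the sketch below joins the relative names above onto the working directory using Python's pathlib. The helper name resolve is made up for this example.

```python
from pathlib import PureWindowsPath

# Working directory taken from the example above.
working_dir = PureWindowsPath(r"C:\My Documents\letters")

def resolve(relative):
    """Join a relative path name onto the current working directory."""
    return working_dir / relative

print(resolve("cancelMag.doc"))
print(resolve(r"applications\calState.doc"))
print(PureWindowsPath("cancelMag.doc").is_absolute())  # a relative path is not absolute
```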
Topic B
Binary Number Systems
Positional Notation
10^4   10^3  10^2  10^1  10^0
10000  1000  100   10    1
Allows us to count past 10.
Each column of a number represents a power of the base. The exponent is the order of
magnitude for the column.
The Decimal system is based on the number of digits we have.
The magnitude of each column is the base, raised to its exponent.
10^4   10^3  10^2  10^1  10^0
10000  1000  100   10    1
2      7     9     1     6
20000 + 7000 + 900 + 10 + 6
= 27916
The magnitude of a number is determined by multiplying the magnitude of the column
by the digit in the column and summing the products.
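The rule above (multiply each digit by its column magnitude, then sum) can be written as a short function. This is a sketch; positional_value is a name chosen for the example, and digits are given most significant first.

```python
def positional_value(digits, base):
    """Sum digit * base**exponent, with the rightmost column at exponent 0."""
    total = 0
    for exponent, digit in enumerate(reversed(digits)):
        total += digit * base ** exponent
    return total

print(positional_value([2, 7, 9, 1, 6], 10))  # 20000 + 7000 + 900 + 10 + 6
print(positional_value([1, 0, 1], 2))         # 4 + 0 + 1
```

The same function works for any base, which is why the binary sections below follow the same pattern.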
Binary Numbers
The base in a Binary system is 2.
There are only 2 digits – 0 and 1.
Since we use the term frequently, “binary digit” can be shortened to ‘bit’.
8 bits together form a byte.
A Single Byte
2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
128  64   32   16   8    4    2    1
1    1    1    1    1    1    1    1
128 + 64 + 32 + 16 + 8 + 4 + 2 + 1
= 255
255 is the largest decimal value that can be expressed in 8 bits.
Longer Numbers
Since 255 is the largest number that can be represented in 8 bits, larger values simply
require longer numbers.
o For example, 27916 is represented by: 0110110100001100
Short Forms for Binary
Because large numbers require long strings of Binary digits, short forms have been
developed to help deal with them.
An early system used was called Octal.
It’s based on the 8 patterns in 3 bits.
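A small sketch of the idea: split the bits into groups of 3 starting from the right, and write each group as one digit from 0 to 7. The function name to_octal is illustrative only.

```python
def to_octal(bits):
    """Group a binary string into 3-bit chunks and name each chunk 0-7."""
    width = (len(bits) + 2) // 3 * 3   # round the length up to a multiple of 3
    padded = bits.zfill(width)         # pad with leading zeros on the left
    groups = [padded[i:i + 3] for i in range(0, width, 3)]
    return "".join(str(int(group, 2)) for group in groups)

print(to_octal("11111111"))  # groups: 011 111 111
```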
Integers
To store integers, half the combinations are used to represent negative values.
The MSB is used to represent the sign.
The range for Integers in 1 byte is: -128 to +127
Which value of the sign bit (0 or 1) will represent a negative number? _________
Excess Notation
The notation system that uses a sign bit (MSB) of 0 to represent negative values.
Fixed length notation system.
Zero is the first non-negative value:
o 10000000
The pattern immediately before zero is -1:
o 01111111
The largest value is stored as 11111111 (+127)
The smallest value is stored as 00000000 (-128)
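The four patterns above can be checked with a short sketch. It assumes the excess value (bias) is the value of the MSB, i.e. 128 for 8 bits; excess_decode and excess_encode are names invented for the example.

```python
def excess_decode(bits):
    """Read the pattern as plain binary, then subtract the value of the MSB."""
    bias = 2 ** (len(bits) - 1)
    return int(bits, 2) - bias

def excess_encode(value, width=8):
    """Add the bias, then write the result as a fixed-length binary pattern."""
    bias = 2 ** (width - 1)
    return format(value + bias, "0{}b".format(width))

for pattern in ("10000000", "01111111", "11111111", "00000000"):
    print(pattern, excess_decode(pattern))
```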
2’s Complement Notation
The notation system that uses a sign bit (MSB) of 1 to represent negative values.
Fixed length notation system.
Zero is the first non-negative value:
o 00000000
The pattern immediately before zero is -1:
o 11111111
The largest value is stored as 01111111 (+127)
The smallest value is stored as 10000000 (-128)
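A matching sketch for 2's complement, under the same fixed-length assumption. The names twos_decode and twos_encode are illustrative.

```python
def twos_decode(bits):
    """Read the pattern as plain binary; if the sign bit is 1, subtract 2**width."""
    value = int(bits, 2)
    if bits[0] == "1":
        value -= 2 ** len(bits)
    return value

def twos_encode(value, width=8):
    """Wrap the value into the range 0 .. 2**width - 1 and write it in binary."""
    return format(value % (2 ** width), "0{}b".format(width))

for pattern in ("00000000", "11111111", "01111111", "10000000"):
    print(pattern, twos_decode(pattern))
```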
Fractions
A radix point separates the integer part from the fraction part of a number.
o 101.101
Columns to the right of the radix have negative powers of 2.
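To make the negative powers concrete, the sketch below evaluates a binary string containing a radix point. The function name binary_to_decimal is an assumption of this example.

```python
def binary_to_decimal(text):
    """Evaluate a binary string such as '101.101' column by column."""
    whole, _, fraction = text.partition(".")
    value = int(whole, 2) if whole else 0
    for position, bit in enumerate(fraction, start=1):
        value += int(bit) * 2 ** -position   # 1/2, 1/4, 1/8, ...
    return value

print(binary_to_decimal("101.101"))  # 4 + 1 + 1/2 + 1/8
```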
Scientific Notation
Very large and very small numbers are often represented such that their order of
magnitude can be compared.
The basic concept is an exponential notation using powers of 10.
a × 10^b
Where b is an integer,
and a is a real number such that: 1 ≤ |a| < 10
An electron's mass is about 0.00000000000000000000000000000091093826 kg.
o In scientific notation, this is written 9.1093826×10^−31 kg.
The Earth's mass is about 5,973,600,000,000,000,000,000,000 kg.
o In scientific notation, this is written 5.9736×10^24 kg.
E Notation
To allow values like this to be expressed on calculators and early terminals
× 10^b
was replaced by Eb
So 9.1093826×10^−31 becomes 9.1093826E−31
And 5.9736×10^24 becomes 5.9736E+24
The ‘a’ part of the number is called the mantissa or significand.
The ‘Eb’ part is called the exponent.
Since these numbers could also be negative they would typically have a sign as well.
Floating Point Storage
In floating point notation the bit pattern is divided into 3 components:
Sign – 1 bit (0 for +, 1 for -)
Exponent – stored in Excess notation
Mantissa – must begin with 1
Mantissa
Assumes a radix point immediately left of the first digit.
The exponent will determine how far and in which direction to move the radix.
Representing Data
The information below is provided to help you review and practice
converting numbers.
Binary
numbers are expressed in a positional notation system. Each digit represents a power of 2.
In an 8-bit pattern the column positions have these values:
2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
128  64   32   16   8    4    2    1
For example, the number 11111111 expresses the highest magnitude an 8-bit
pattern can represent (128+64+32+16+8+4+2+1 = 255) which is exactly 1 less than the digit value of the next column which is 256
Binary Fractions (radix point)
the radix point (.) separates integers from fractions; fraction digits mirror the values of the integers:
.  1/2  1/4  1/8  1/16  1/32  1/64  1/128 ...
1011.1100 converts to 8 + 0 + 2 + 1 + 1/2 + 1/4 + 0 + 0 = 11 3/4
Hexadecimal Notation
is a shorthand for representing a 4-bit pattern using only a single character (a base 16 digit).
Excess Notation
makes it possible to store negative values by treating the Most Significant Bit (MSB) as the sign. In excess notation a 0 as the MSB indicates a negative (-) number.
In the case of the 4 bit pattern, for example:
0110
the value of the most significant bit is 8, so 4 bit patterns are called excess 8. To convert this example find the decimal value of the pattern (which is 6);
and subtract 8 (which is the normal value of the MSB).
The result is -2
Two's Complement Notation
is another fixed length approach to representing negative values. It employs a sign bit of 0 to represent non-negative (+) and 1 for negative (-).
If the number is a non-negative, it may be evaluated as in standard binary notation. If the number is negative, evaluating the number involves finding the 1's
complement, adding 1 to the result, and remembering the sign. Alternatively, the following rule may be applied to find its positive complement.
"Copy the number from right to left up to and including the first "1", then
complement the rest of the number.
Evaluate as in standard binary and remember the sign."
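The copy rule can be sketched directly on a bit string. The function name negate_by_copy_rule is invented for this example; the all-zeros case is handled separately because there is no "first 1" to copy up to.

```python
def negate_by_copy_rule(bits):
    """Copy from the right up to and including the first 1, then flip the rest."""
    last_one = bits.rfind("1")
    if last_one == -1:   # all zeros: zero is its own complement
        return bits
    flipped = "".join("1" if b == "0" else "0" for b in bits[:last_one])
    return flipped + bits[last_one:]

print(negate_by_copy_rule("1100"))  # magnitude of the negative pattern 1100
print(negate_by_copy_rule("0100"))  # pattern for -4
```

Applying the rule twice returns the original pattern, which is a quick way to check a conversion done by hand.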
Mapping Notation Systems onto Each Other
Binary  Decimal  Hexadecimal  Excess (8)  Two's Complement
0000    0        0            -8          0
0001    1        1            -7          1
0010    2        2            -6          2
0011    3        3            -5          3
0100    4        4            -4          4
0101    5        5            -3          5
0110    6        6            -2          6
0111    7        7            -1          7
1000    8        8            0           -8
1001    9        9            1           -7
1010    10       A            2           -6
1011    11       B            3           -5
1100    12       C            4           -4
1101    13       D            5           -3
1110    14       E            6           -2
1111    15       F            7           -1
Floating Point Notation
consists of 3 parts:
1. a Sign bit ["0" is non-negative (+), "1" is negative (-)],
2. an Exponent
3. and a Mantissa.
In an eight bit pattern,
the most significant bit (MSB) is the sign bit, followed by a 3-bit exponent (expressed in excess notation), followed by a 4-bit mantissa.
o The radix point is assumed to be at the left of the mantissa.
o In a normalized floating point notation, the mantissa must begin with a "1".
E.G. - 0 101 1001
the sign bit is 0 - so the number represented is non-negative
the exponent is 101 - which, in excess(4) notation, is 5-4, or +1
the mantissa is 1001 - the radix goes at the left, producing .1001
a positive exponent moves the radix to the right and a negative exponent moves the radix to the left.
o applying the exponent shifts the radix 1 position right - 1.001
o which is 1 and 1/8.
Therefore the number 01011001 in normalized floating point notation represents the value +1 1/8th
Here are some more examples for you to try out:
Number    Decimal  Hexadecimal  Excess(128)  Two's Complement  Floating Point (normalized)
11101110  238      EE           +110         -18               -3 1/2
01011011  91       5B           -37          +91               +1 3/8
10111000  184      B8           +56          -72               -1/4
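The floating point column can be checked with a short decoder. This is a sketch of the 8-bit layout described in these notes only (1 sign bit, 3-bit excess-4 exponent, 4-bit mantissa with the radix at its left); it is not IEEE floating point, and decode_float8 is a made-up name.

```python
from fractions import Fraction

def decode_float8(bits):
    """Decode sign | 3-bit excess-4 exponent | 4-bit mantissa (.mmmm)."""
    sign = -1 if bits[0] == "1" else 1
    exponent = int(bits[1:4], 2) - 4            # excess-4: subtract the MSB value
    mantissa = Fraction(int(bits[4:], 2), 16)   # .mmmm means mmmm / 2^4
    return sign * mantissa * Fraction(2) ** exponent

for pattern in ("01011001", "11101110", "01011011", "10111000"):
    print(pattern, decode_float8(pattern))
```

Fractions are used instead of floats so the results print as exact values such as 9/8 rather than rounded decimals.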
Adding binary numbers
There are only 4 rules for binary addition:
  0     0     1     1
+ 0   + 1   + 0   + 1
---   ---   ---   ---
  0     1     1    10  (the carry rule)
Subtracting binary numbers
The two's complement notation system is typically used to perform subtraction by using the
rules of addition, e.g., 5 - 4 may be expressed as 5 + (-4)
5 converts to 0101
4 converts to 0100, so -4 must be 1100
Add these two together:
0101
+1100
10001
But since we are limited to a fixed length (4 bits in this case) the last carry bit is discarded. This leaves an answer of
0001
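The same calculation can be sketched column by column, using only the four addition rules plus a carry. The function name add_bits is illustrative; both inputs are assumed to have the same length.

```python
def add_bits(a, b):
    """Add two equal-length bit strings; return (fixed-length sum, final carry)."""
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        column = int(bit_a) + int(bit_b) + carry
        result.append(str(column % 2))   # the digit written in this column
        carry = column // 2              # the digit carried to the next column
    return "".join(reversed(result)), carry

total, carry_out = add_bits("0101", "1100")   # 5 + (-4)
print(total, carry_out)                       # the carry is the bit that is discarded
```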
Multiplying binary numbers
Multiplication can be re-expressed as repeated addition:
7 * 3 is the same as 7 + 7 + 7 ("seven times three" means "add seven together three times")
7 is represented as 0111
0111
+0111
1110
+0111
10101
which is, of course, 21.
Optimising Multiplication
Multiplying large numbers, say 1043 * 131, involves a lot of additions.
For improved efficiency a processor can apply 2 rules:
1. to multiply a binary number by 2, shift the bits one position to the left
2. multiplication distributes over addition
The decimal number 6 is represented by the binary pattern 110.
Shifting the bits left one position doubles the value of the column of each bit.
So the pattern for decimal 12 is 1100.
Multiplying by powers of 2, then, is simply a matter of shifting the bits left by the exponent.
8 is 2^3 so multiplying a number by 8 consists of shifting the bits 3 positions left.
In this calculation, 131 is the sum of 128, 2, and 1, so applying the property of Distribution produces an equivalent expression :
1043 * (128 + 2 + 1)
or 1043 * 128 + 1043 * 2 + 1043 * 1
1043 in binary is 10000010011
1043 * 2 is 100000100110
1043 * 128 is 100000100110000000
Adding these three numbers produces the answer 100001010110111001
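Both rules combine into what is usually called shift-and-add multiplication. The sketch below is one way to write it for non-negative integers; shift_add_multiply is a name chosen for this example, not something a processor exposes.

```python
def shift_add_multiply(a, b):
    """Multiply by adding a shifted copy of `a` for every 1 bit in `b`."""
    product = 0
    shift = 0
    while b:
        if b & 1:                   # this power of 2 is part of b
            product += a << shift   # a * 2**shift, done as a left shift
        b >>= 1
        shift += 1
    return product

result = shift_add_multiply(1043, 131)   # 131 = 128 + 2 + 1
print(result, format(result, "b"))
```

Only three additions are needed here, one per 1 bit in 131, instead of 131 repeated additions.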
Dividing binary numbers
Division can be re-expressed as repeated subtraction:
To divide 21 by 7 see how many times you can subtract 7 from 21.
Of course, this is the same as adding -7, so we use two's complement notation.
21 is represented as 010101
7 is 000111, so -7 is 111001
0 1 0 1 0 1
+ 1 1 1 0 0 1
(1) 0 0 1 1 1 0
+ 1 1 1 0 0 1
(10) 0 0 0 1 1 1
+ 1 1 1 0 0 1
(11) 0 0 0 0 0 0
Since we're using 6 bit patterns, the extra carry bits would normally be discarded, but
notice that they contain the integer part of the answer.
The 6 bits to their right contain the remainder.
So 21 divided by 7 is 3, with 0 remainder.
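A sketch of the same procedure, assuming a non-negative dividend and a positive divisor that both fit in the chosen width. The name divide_by_subtraction is invented for the example; the quotient here is kept in a counter rather than read from the carry bits.

```python
def divide_by_subtraction(dividend, divisor, width=6):
    """Count how many times -divisor (in two's complement) can be added."""
    modulus = 2 ** width
    neg_divisor = (-divisor) % modulus   # two's complement pattern of -divisor
    quotient = 0
    remainder = dividend
    while remainder >= divisor:
        remainder = (remainder + neg_divisor) % modulus   # discard the carry
        quotient += 1
    return quotient, remainder

print(divide_by_subtraction(21, 7))
print(divide_by_subtraction(22, 7))
```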
Data Representation
Computers are multimedia devices, dealing with many categories of information.
Computers store, present, and help modify:
o Numbers
o Text
o Audio
o Images and graphics
o Video
Computers are finite. Computer memory and other hardware devices have only so
much room to store and manipulate a certain amount of data. The goal of data
representation is to represent enough of the world to satisfy our computational needs
and our senses of sight and sound.
Analog or Digital Information
Information can be represented in one of two ways: analog or digital:
Analog data A continuous representation, analogous to the actual
information it represents.
Digital data A discrete representation, breaking the information up into
separate elements.
A mercury thermometer exemplifies analog data as it continually rises and falls in direct
proportion to the temperature.
Digital displays only show discrete information.
Computers cannot work well with analog information, so we digitize information by
breaking it into pieces and representing those pieces separately.
Why do we use binary? Modern computers are designed to use and manage binary
values because the devices that store and manage the data are far less expensive and
far more reliable if they only need to represent one of two possible values.
Electronic Signals
An analog signal continually fluctuates up and down in voltage. But a digital signal has
only a high or low state, corresponding to the two binary digits.
All electronic signals (both analog and digital) degrade as they move down a line. That is,
the voltage of the signal fluctuates due to environmental effects.
Representing Text
To represent a text document in digital form, we need to be able to represent every
possible character that may appear.
There is a finite number of characters to represent, so the general approach is to list
them all and assign each a binary string.
A character set is a list of characters and the codes used to represent each one.
By agreeing to use a particular character set, computer manufacturers have made the
processing of text data easier.
The ASCII Character Set
ASCII stands for American Standard Code for Information Interchange.
The ASCII character set originally used seven bits to represent each character, allowing
for 128 unique characters
Notice the organisation of the ASCII table.
The table divides in half according to the MSB.
Letters are all in the second half so all codes for alphabetic characters start with 1.
This second half of the table divides in half again according to the next bit:
o UPPERCASE letters start 10.
o lowercase letters start 11.
The first half of the table also divides in half according to the next bit:
o Control characters start 00.
o Numerals and punctuation start 01.
Note that control characters (the first 32 in the ASCII character set) do not have simple
character representations that you could print to the screen.
Some, however, perform actions with which you are familiar.
Coding letters in ASCII is easy.
a. Let’s look at ‘j’ as an example:
i. Since ‘j’ is a letter, its code starts with a 1.
ii. Since it’s lowercase, the next bit is also a 1.
iii. Since it’s the tenth letter of the alphabet the rest of the code is 01010.
iv. The complete ASCII code for ‘j’ is 1101010.
ASCII evolved so that eight bits were used.
The 7-bit codes were simply prefixed with another bit, giving another natural doubling.
o The original 7-bit codes were padded with 0.
So the code for ‘j’ became 01101010.
o 128 new characters were added.
The codes for this alternate character set start with 1.
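The steps for coding a letter can be checked against Python's built-in ord. The helper names letter_code and ascii_bits are assumptions of this sketch, and letter_code only handles the 26 unaccented letters.

```python
def ascii_bits(char, width=7):
    """Return the character's code as a fixed-width binary string."""
    return format(ord(char), "0{}b".format(width))

def letter_code(letter):
    """Build a 7-bit code by rule: 1 (letter), case bit, 5-bit alphabet position."""
    case_bit = "1" if letter.islower() else "0"
    position = ord(letter.lower()) - ord("a") + 1   # 'a' is 1, 'j' is 10
    return "1" + case_bit + format(position, "05b")

print(letter_code("j"), ascii_bits("j"))   # built by rule vs. looked up
print(ascii_bits("j", 8))                  # 8-bit form, padded with a leading 0
```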
The Unicode Character Set
Even the extended version of the ASCII character set is not enough for international use.
The Unicode character set uses 16 bits per character. The Unicode character set can
represent 2^16, or over 65 thousand characters.
Unicode was designed to be a superset of ASCII. That is, the first 256 characters in the
Unicode character set correspond exactly to the extended ASCII character set.
Data Compressing
It is important that we find ways to store and transmit data efficiently, which leads
computer scientists to find ways to compress it.
Data compression is a reduction in the amount of space needed to store a piece of data.
Compression ratio is the size of the compressed data divided by the size of the original
data.
A data compression technique can be
o lossless, which means the data can be retrieved without any loss of the original
information,
o lossy, which means some information may be lost in the process of compression.
As examples, consider these 3 techniques:
1. keyword encoding
2. run-length encoding
3. Huffman encoding
Key Word Encoding
Frequently used words are replaced with a single character.
For example…
Note, that the characters used to encode cannot be part of the original text.
Consider the following paragraph,
The human body is composed of many independent systems, such as the
circulatory system, the respiratory system, and the reproductive system. Not
only must each system work independently, they must interact and cooperate as
well. Overall health is a function of the well-being of separate systems, as well as
how these separate systems work in concert.
This version highlights the words that can be replaced.
The human body is composed of many independent systems, such as the
circulatory system, the respiratory system, and the reproductive system. Not only
must each system work independently, they must interact and cooperate as well.
Overall health is a function of the well-being of separate systems, as well as how
these separate systems work in concert.
This is the encoded paragraph:
The human body is composed of many independent systems, such ^ ~ circulatory
system, ~ respiratory system, + ~ reproductive system. Not only & each system
work independently, they & interact + cooperate ^ %. Overall health is a function
of ~ %-being of separate systems, ^ % ^ how # separate systems work in
concert.
There are a total of 349 characters in the original paragraph including spaces and
punctuation.
The encoded paragraph contains 314 characters, resulting in a savings of 35 characters.
The compression ratio for this example is 314/349 or approximately 0.9.
A compression ratio of .9 (90%) is NOT very good. The compressed file is 90% the size of
the original.
However, there are several ways this can be improved. Can you think of some?
Run- Length Encoding
A single character may be repeated over and over again in a long sequence. This type of
repetition doesn’t generally take place in English text, but often occurs in large data
streams.
In run-length encoding, a sequence of repeated characters is replaced by:
o a flag character,
o followed by the repeated character,
o followed by a single digit that indicates how many times the character is repeated.
Some examples:
AAAAAAA
would be encoded as
*A7
*n5*x9ccc*h6 some other text *k8eee
can be decoded into the following original text:
nnnnnxxxxxxxxxccchhhhhh some other text kkkkkkkkeee
In the second example, the original text contains 51 characters, and the encoded string
contains 35 characters, giving us a compression ratio of 35/51 or approximately 0.68.
Since we are using one character for the repetition count, it seems that we can’t encode
repetition lengths greater than nine. However, instead of interpreting the count
character as an ASCII digit, we could interpret it as a binary number.
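A minimal sketch of the scheme with a single-digit count. It assumes the flag character * never appears in the original text, and it only encodes runs longer than 3, which is an assumption consistent with ccc and eee being left alone in the example above. The names rle_encode and rle_decode are illustrative.

```python
def rle_encode(text, flag="*"):
    """Replace runs of 4 to 9 repeated characters with flag + char + count."""
    out = []
    i = 0
    while i < len(text):
        run = 1
        while i + run < len(text) and text[i + run] == text[i] and run < 9:
            run += 1
        if run > 3:
            out.append(flag + text[i] + str(run))
        else:
            out.append(text[i] * run)   # short runs are cheaper left as they are
        i += run
    return "".join(out)

def rle_decode(encoded, flag="*"):
    """Expand every flag + char + digit triple back into the repeated run."""
    out = []
    i = 0
    while i < len(encoded):
        if encoded[i] == flag:
            out.append(encoded[i + 1] * int(encoded[i + 2]))
            i += 3
        else:
            out.append(encoded[i])
            i += 1
    return "".join(out)

print(rle_encode("AAAAAAA"))
print(rle_decode("*n5*x9ccc*h6 some other text *k8eee"))
```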
Huffman Encoding
Why should the blank, which is used very frequently, take up the same number of bits as
the character “X”, which is seldom used in text?
Huffman codes use variable-length bit strings to represent each character.
A few characters may be represented by five bits, and another few by six bits, and yet
another few by seven bits, and so forth.
If we use only a few bits to represent characters that appear often and reserve longer
bit strings for characters that don’t appear often, the overall size of the document being
represented will be smaller.
DOORBELL would be encoded in binary as 1011110110111101001100100.
If we used a fixed-size bit string to represent each character (say, 8 bits), then the binary
form of the original string would be 64 bits.
The Huffman encoding for that string is 25 bits long, giving a compression ratio of 25/64,
or approximately 0.39.
An important characteristic of any Huffman encoding is that no bit string used to
represent a character is the prefix of any other bit string used to represent a character.
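The prefix property is what makes decoding possible without separators. The sketch below uses an assumed code table: the table itself is not given in these notes, and the one here was chosen only because it reproduces the 25-bit string for DOORBELL quoted above. CODES, huffman_encode and huffman_decode are names invented for the example.

```python
# Assumed code table (not from the notes); no code is a prefix of another.
CODES = {"A": "00", "E": "01", "L": "100", "O": "110",
         "R": "111", "B": "1010", "D": "1011"}

def huffman_encode(text):
    """Concatenate the variable-length code of each character."""
    return "".join(CODES[char] for char in text)

def huffman_decode(bits):
    """Read bits left to right; emit a character as soon as a code matches."""
    lookup = {code: char for char, code in CODES.items()}
    out = []
    current = ""
    for bit in bits:
        current += bit
        if current in lookup:   # safe because of the prefix property
            out.append(lookup[current])
            current = ""
    return "".join(out)

encoded = huffman_encode("DOORBELL")
print(encoded, len(encoded))
print(huffman_decode(encoded))
```

Building the table from character frequencies is a separate step that this sketch does not cover.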
Representing Audio Information
We perceive sound when a series of air compressions vibrate a membrane in our ear,
which sends signals to our brain.
A stereo sends an electrical signal to a speaker to produce sound. This signal is an analog
representation of the sound wave. The voltage in the signal varies in direct proportion
to the sound wave.
To digitize the signal we periodically measure the voltage of the signal and record the
appropriate numeric value. The process is called sampling.
In general, a sampling rate of around 40,000 times per second is enough to create a
reasonable sound reproduction.
The standard sampling rate for CDs is 44.1 kHz. The Pro Audio standard is 48 kHz.
It should be noted that the supposed loss of peak values between samples is a myth.
The time lapse between samples is much too short for any such loss.
The human ear hears sounds between 20 Hz and 20,000 Hz. Sampling at more than twice the
highest audible frequency (over 40,000 samples per second) eliminates any potential loss of data.
For a complete explanation refer to the Nyquist–Shannon sampling theorem.
A compact disc (CD) stores audio information digitally. On the surface of the CD are
microscopic pits that represent binary digits. A low intensity laser is pointed at the disc.
The laser light reflects strongly if the surface is smooth and reflects poorly if the surface
is pitted.
Audio Formats
WAV, AU, AIFF, VQF, and MP3.
MP3 is dominant
MP3 is short for MPEG-1 (or MPEG-2) Audio Layer III.
MP3 employs both lossy and lossless compression.
1. First it analyses the frequency spread and compares it to mathematical models of
human psychoacoustics (the study of the interrelation between the ear and the
brain), and it discards information that can’t be heard by humans.
2. Then the bit stream is compressed using a form of Huffman encoding to achieve
additional compression.
Representing Images and Graphics
Colour is our perception of the various frequencies of light that reach the retinas of our
eyes.
Our retinas have three types of colour photoreceptor cones which respond to different
sets of frequencies. These photoreceptor categories correspond to the colours of red,
green, and blue.
Colour is often expressed in a computer as an RGB (red, green, blue) value, which is
actually three numbers that indicate the relative contribution of each of these three
primary colours.
For example, an RGB value of (255, 255, 0) maximizes the contribution of red and
green, and minimizes the contribution of blue. The resulting colour is a bright yellow.
The amount of data that is used to represent a colour is called the colour depth.
HiColor is a term that indicates a 16-bit colour depth. Five bits are used for each number
in an RGB value and the extra bit is sometimes used to represent transparency.
TrueColor indicates a 24-bit colour depth. Therefore, each number in an RGB value gets
eight bits.
HiColor uses 5 bits for each number.
o Since 2^5 = 32, there are 32 different levels for each of the 3 primary colours. So
there are 32^3 (or 2^15) possible colours.
o This is a total of 32,768 different colours.
TrueColor uses eight bits for each colour component.
o 2^8 × 2^8 × 2^8 = 2^24 or 16,777,216 colours.
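The colour counts follow from one formula: levels per primary, raised to the number of primaries. A short sketch, with colour_count as an illustrative name:

```python
def colour_count(bits_per_channel, channels=3):
    """Number of distinct colours when each primary gets the same number of bits."""
    levels = 2 ** bits_per_channel
    return levels ** channels

print(colour_count(5))   # HiColor: 5 bits per primary
print(colour_count(8))   # TrueColor: 8 bits per primary
```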
Some monitors can use as many as 32 bits for colour depth.
The human eye is able to distinguish about 200 intensity levels in each of the three
primaries red, green, and blue. All in all, up to 10 million different colours can be
distinguished.
So modern monitors are examples of solutions without a problem.
o If the human eye can distinguish only 10 million colours, why develop monitors
that can display over 4 billion?
Indexed colour
A particular application such as a browser may support only a certain number of
specific colours, creating a palette from which to choose. For example, Netscape
Navigator’s colour palette has only 216 colours.
Digitized Images and Graphics
o The storage of image information on a pixel-by-pixel basis is called a
raster-graphics format.
o There are several popular raster file formats including:
BMP (bitmap)
GIF (Graphics Interchange Format)
JPEG (Joint Photographic Experts Group)
Vector Graphics
o Instead of assigning colours to pixels as we do in raster graphics, a vector-
graphics format describes an image in terms of lines and geometric
shapes
o A vector graphic is a series of commands that describe a line’s direction,
thickness, and colour. The file size for these formats tends to be small
because every pixel does not need to be represented.
Representing Video
A video codec (COmpressor/DECompressor) refers to the methods used to shrink the
size of a movie to allow it to be played on a computer or over a network.
Almost all video codecs use lossy compression to minimize the huge amounts of data
associated with video.
o To simulate motion, movies need to record (and play back) at least 12 frames per
second.
o However, good sound quality requires 24 frames/s.
o 24 frames/s
= 1,440 frames/minute
= 86,400 frames/hour
o Recall…
o If each frame has a resolution of 1024 x 768
there are 786,432 pixels in a frame.
o If the colour of each pixel is stored as 24 bits (3 bytes) of data, one frame alone
requires 2,359,296 bytes (2.25 MB) of memory.
o An hour of film then, requires 203,843,174,400 bytes (194,400 MB, or about
190 gigabytes) of storage, just for the images.
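The storage figures can be reproduced with a few multiplications. This sketch assumes uncompressed frames and 1 MB = 2^20 bytes; uncompressed_video_bytes is a name made up for the example.

```python
def uncompressed_video_bytes(width, height, bytes_per_pixel, fps, seconds):
    """Bytes needed to store every pixel of every frame with no compression."""
    frame_bytes = width * height * bytes_per_pixel
    frame_count = fps * seconds
    return frame_bytes * frame_count

one_frame = uncompressed_video_bytes(1024, 768, 3, 1, 1)
one_hour = uncompressed_video_bytes(1024, 768, 3, 24, 3600)
print(one_frame, one_hour, one_hour / 2 ** 20)   # bytes, bytes, megabytes
```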
The first step in compressing video is to reduce the amount of information stored for a
frame.
This problem is essentially the same as that faced when compressing still images.
Spatial compression
A technique based on removing redundant information within a frame.
Each compressed frame will still be quite large.
We can save even more space by recognizing that between two frames, most of the
image hasn’t changed. Storing only the changes (deltas) from one frame to the next is
much more efficient.
Temporal compression
A technique based on storing differences between consecutive frames.
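A toy illustration of storing deltas, treating a frame as a flat list of pixel values. Real codecs are far more elaborate; frame_delta and apply_delta are hypothetical helpers used only to show the idea.

```python
def frame_delta(previous, current):
    """Record only the pixels that changed, as {pixel index: new value}."""
    return {index: new
            for index, (old, new) in enumerate(zip(previous, current))
            if old != new}

def apply_delta(previous, delta):
    """Rebuild the next frame from the previous frame plus the stored changes."""
    frame = list(previous)
    for index, value in delta.items():
        frame[index] = value
    return frame

frame1 = [0, 0, 5, 5, 9, 9]
frame2 = [0, 0, 5, 7, 9, 9]   # only one pixel differs
delta = frame_delta(frame1, frame2)
print(delta)
print(apply_delta(frame1, delta) == frame2)
```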
Topic B Laboratory information
Topic C
Gates and Circuits
Computers and Electricity
Gate:
A device that performs a basic operation on electrical signals.
Circuits:
Gates combined to perform more complicated tasks.
There are three different, but equally powerful, notational methods for describing the
behavior of gates and circuits:
o Boolean expressions
o logic diagrams
o truth tables
Boolean expressions:
Expressions in Boolean algebra, a mathematical notation for expressing two-
valued logic. This algebraic notation is an elegant and powerful way to
demonstrate the activity of electrical circuits.
Logic diagram:
A graphical representation of a circuit. Each type of gate is represented by a
specific graphical symbol.
Truth table:
A table showing all possible input values and the associated output values.
Gates
Let’s examine the processing of the following
six types of gates:
o NOT
o AND
o OR
o XOR
o NAND
o NOR
Typically, logic diagrams are black and white, and the gates are distinguished only by
their shape.
NOT Gate
A NOT gate accepts one input value and produces one output value.
By definition, if the input value for a NOT gate is 0, the output value is 1, and if the input
value is 1, the output is 0.
A NOT gate is sometimes referred to as an inverter because it inverts the input value.
AND Gate
An AND gate accepts two input signals.
If the input values for an AND gate are both 1, the output is 1; otherwise, the output is
0.
OR Gate
If the two input values are both 0, the output value is 0; otherwise, the output is 1.
XOR Gate (eXclusive OR)
An XOR gate produces 0 if its two inputs are the same, and a 1 otherwise.
Note the difference between the XOR gate
and the OR gate; they differ only in one
input situation:
o When both input signals are 1, the OR gate produces a 1 but the XOR
produces a 0.
NAND and NOR Gates
The NAND and NOR gates are essentially the opposite of the AND and OR gates,
respectively.
Various representations of a NAND gate
Various representations of a NOR gate
Review of Gate Processing
A NOT gate inverts its single input value.
An AND gate produces 1 if both input values are 1.
An OR gate produces 1 if one or the other or both input values are 1.
An XOR gate produces 1 if one or the other (but not both) input values are 1.
A NAND gate produces the opposite results of an AND gate.
A NOR gate produces the opposite results of an OR gate.
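The six gates reviewed above can be modelled as small Python functions on the values 0 and 1. This is only a sketch for checking truth tables; the helper truth_table is an invented name.

```python
from itertools import product

def NOT(a):
    return 1 - a

def AND(a, b):
    return a & b

def OR(a, b):
    return a | b

def XOR(a, b):
    return a ^ b

def NAND(a, b):
    return NOT(AND(a, b))

def NOR(a, b):
    return NOT(OR(a, b))

def truth_table(gate):
    """Outputs of a two-input gate for the inputs 00, 01, 10, 11 (in that order)."""
    return [gate(a, b) for a, b in product((0, 1), repeat=2)]

for gate in (AND, OR, XOR, NAND, NOR):
    print(gate.__name__, truth_table(gate))
# AND [0, 0, 0, 1]
# OR [0, 1, 1, 1]
# XOR [0, 1, 1, 0]
# NAND [1, 1, 1, 0]
# NOR [1, 0, 0, 0]
```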
Constructing Gates
Transistor:
A device that acts, depending on the voltage level of an input signal, either as a wire
that conducts electricity or as a resistor that blocks the flow of electricity.
o A transistor has no moving parts, yet acts like a switch.
o It is made of a semiconductor material, which is neither a particularly good
conductor of electricity, such as copper, nor a particularly good insulator,
such as rubber.
A transistor has three terminals:
1. A source
2. A base
3. An emitter, typically connected to a ground wire
If the electrical signal is grounded, it is allowed to flow through an alternative route to
the ground (literally) where it can do no harm.
It turns out from the way a transistor works, the easiest gates to create are the NOT,
NAND, and NOR gates.
Circuits
Two general categories:
1. In a combinational circuit, the input values explicitly determine the output.
2. In a sequential circuit, the output is a function of the input values as well as the
existing state of the circuit.
As with gates, we can describe the operations
of entire circuits using three notations:
1. Boolean expressions
2. logic diagrams
3. truth tables
Combinational Circuits
Gates are combined into circuits by using the output of one gate as the input for
another.
Because there are three inputs to this circuit, eight rows are required to describe all
possible input combinations
This same circuit using Boolean algebra:
AB + AC
A (B + C)
We have therefore just demonstrated circuit equivalence.
o That is, both circuits produce the exact same output for each input value
combination.
Boolean algebra allows us to apply provable mathematical principles to help us design
logical circuits.
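Circuit equivalence can also be checked by brute force: build the eight-row truth table for both expressions, AB + AC and A(B + C), and compare the outputs. The function names below are made up for this sketch.

```python
from itertools import product

def circuit_1(a, b, c):
    return (a & b) | (a & c)      # AB + AC

def circuit_2(a, b, c):
    return a & (b | c)            # A(B + C)

rows = list(product((0, 1), repeat=3))   # three inputs -> 2**3 = 8 rows
for a, b, c in rows:
    print(a, b, c, circuit_1(a, b, c), circuit_2(a, b, c))

equivalent = all(circuit_1(*row) == circuit_2(*row) for row in rows)
print(equivalent)   # True
```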
Adders
At the digital logic level, addition is performed in binary.
Addition operations are carried out by special circuits called, appropriately, adders.
The result of adding two binary digits could produce a carry value.
Recall that 1 + 1 = 10 in base two.
A circuit that computes the sum of two bits and produces the correct carry bit is called a
half adder.
Examine the adder’s truth table carefully.
The Sum column has the same results as the XOR gate.
The Carry column has the same results as the AND gate.
This circuit diagram represents a half adder.
As do these two Boolean expressions:
sum = A ⊕ B
carry = AB
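A short Python sketch of the half adder, using XOR for the sum and AND for the carry as described above. The function name half_adder is chosen for illustration.

```python
def half_adder(a, b):
    """Return (sum, carry) for two one-bit inputs."""
    return a ^ b, a & b

for a in (0, 1):
    for b in (0, 1):
        total, carry = half_adder(a, b)
        print(a, b, total, carry)
# 0 0 0 0
# 0 1 1 0
# 1 0 1 0
# 1 1 0 1
```

The last row is 1 + 1 = 10 in base two: sum 0, carry 1.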
Multiplexers
A multiplexer is a general circuit that produces a single output signal.
o The output is equal to one of several input signals to the circuit.
o The multiplexer selects which input signal is used as an output signal based on
the value represented by a few more input signals, called select signals or select
control lines.
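A sketch of a multiplexer in Python, assuming n select lines choose among 2**n input lines and that the select bits are read most significant first. Both the function name and the sample input values are invented.

```python
def multiplexer(inputs, select):
    """Return the input line chosen by the select lines.

    inputs: list of 2**n bits (D0, D1, ...).
    select: list of n bits, most significant bit first.
    """
    if len(inputs) != 2 ** len(select):
        raise ValueError("need exactly 2**n inputs for n select lines")
    index = 0
    for bit in select:
        index = index * 2 + bit     # build the binary number the select lines encode
    return inputs[index]

data = [0, 1, 1, 0, 1, 0, 0, 1]     # D0..D7, made-up values
print(multiplexer(data, [0, 1, 1]))  # select = 3 -> D3 -> 0
print(multiplexer(data, [1, 0, 0]))  # select = 4 -> D4 -> 1
```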
Sequential Circuits
Digital circuits can also be used to store information.
This application employs sequential circuits, because the output of the circuit is also
used as input to the circuit.
Circuits as Memory
An S-R latch stores a single binary digit
(1 or 0).
There are several ways an S-R latch circuit could be designed using various kinds of
gates.
The design of this circuit guarantees that the two outputs X and Y are always
complements of each other.
The value of X at any point in time is considered to be the current state of the circuit.
Therefore, if X is 1, the circuit is storing a 1; if X is 0, the circuit is storing a 0.
Circuits as Memory (S-R latch)
There are many ways to construct memory circuits. The SR latch is cheap to build (only 4
transistors) but it requires that its inputs be 1 normally.
The flip-flop is a more expensive device that requires inputs of 0.
Integrated Circuits
Integrated circuit (also called a chip) - A piece of silicon on which multiple gates have
been embedded.
These silicon pieces are mounted on a plastic or ceramic package with pins along
the edges that can be soldered onto circuit boards or inserted into appropriate
sockets.
Integrated circuits (IC) are classified by the number of gates contained in them.
CPU Chips
The most important integrated circuit in any computer is the Central Processing Unit, or
CPU.
Each CPU chip has a large number of pins through which essentially all communication
in a computer system occurs.
A CPU adaptor:
Each hole receives a pin from the CPU.
Computing Components
Stored program concept, The von Neumann architecture
Arithmetic/Logic Unit
o Performs basic arithmetic operations such as adding.
o Performs logical operations such as AND, OR, and NOT.
o Most modern ALUs have a small number of special storage units called
registers.
Control Unit
The organizing force in the computer.
o There are two registers in the control unit:
1. The instruction register (IR) contains the instruction that is being
executed.
2. The program counter (PC) contains the address of the next
instruction to be executed.
o ALU and control unit comprise the Central Processing Unit, or CPU.
Memory
A collection of cells, each with a unique physical address.
The Fetch-Execute Cycle
o Fetch the next instruction
o Decode the instruction
o Get data if needed
o Execute the instruction
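The cycle can be sketched with a toy machine in Python. Everything here is hypothetical: the three opcodes (LOAD, ADD, HALT), the single accumulator register, and the memory layout are invented for illustration and do not describe any real CPU. Note that instructions and data share the same memory, as in the stored program concept.

```python
def run(memory):
    """Run a toy stored-program machine until it reaches HALT."""
    pc = 0             # program counter: address of the next instruction
    accumulator = 0    # a single register
    while True:
        ir = memory[pc]              # fetch the next instruction into the IR
        pc += 1
        opcode, operand = ir         # decode the instruction
        if opcode == "HALT":
            return accumulator
        value = memory[operand]      # get data if needed
        if opcode == "LOAD":         # execute the instruction
            accumulator = value
        elif opcode == "ADD":
            accumulator += value
        else:
            raise ValueError(f"unknown opcode {opcode!r}")

program = [
    ("LOAD", 3),     # address 0
    ("ADD", 4),      # address 1
    ("HALT", None),  # address 2
    20,              # address 3: data
    22,              # address 4: data
]
print(run(program))   # 42
```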
RAM and ROM
o RAM stands for Random Access Memory.
Inherent in the idea of being able to access each location is the ability to
change the contents of each location.
o ROM stands for Read Only Memory.
The contents in locations in ROM cannot be changed.
o RAM is volatile, ROM is not.
This means that RAM does not retain its bit configuration when the
power is turned off, but ROM does.
Secondary Storage Devices
o Because most of main memory is volatile and limited, it is essential that there be
other types of storage devices where programs and data can be stored when
they are no longer being processed.
o Secondary storage devices can be installed within the computer box at the
factory or added later as needed.
Compact Disks
o A CD drive uses a laser to read information stored optically on a plastic disk.
o CD-ROM is Read-Only Memory.
o CD-RW is Read/Write.
o CD-DA is Digital Audio.
o CD-WORM is Write Once, Read Many.
o DVD stands for Digital Versatile Disk.
Input/Output Units
o Input Unit
A device through which data and programs from the outside world are
entered into the computer.
Keyboard, mouse, and scanning devices
o Output unit
A device through which results stored in the computer memory are
made available to the outside world.
Printers and video display terminals
Touch Screens
Touch screen
A computer monitor that can respond to the user touching the screen with a
stylus or finger.
There are four types:
1. Resistive
2. Capacitive
3. Infrared
4. Surface acoustic wave (SAW)
Resistive touch screen
A screen made up of two layers of electrically conductive material.
One layer has vertical lines, the other has horizontal lines.
When the top layer is pressed, it comes in contact with the second layer
which allows electrical current to flow.
The specific vertical and horizontal lines that make contact indicate the
location on the screen that was touched.
Capacitive touch screen
A screen made up of a laminate applied over a glass screen.
The laminate conducts electricity in all directions, and a very small
current is applied equally on the four corners.
When the screen is touched, current flows to the finger or stylus.
The location of the touch on the screen is determined by comparing how
strong the flow of electricity is from each corner.
Infrared touch screen
A screen with crisscrossing horizontal and vertical beams of infrared light.
Sensors on opposite sides of the screen detect the beams.
When the user breaks the beams by touching the screen, the location of
the break can be determined.
Surface acoustic wave (SAW)
A screen with crisscrossing high frequency sound waves across the horizontal
and vertical axes.
When a finger touches the surface, the corresponding sensors detect the
interruption and determine the location of the touch.
Non-von Neumann Architectures
The linear machine cycle is still dominant.
Since 1990, the concept of parallel processing has attracted significant research.
3 basic approaches:
1. Synchronous processing
2. Pipelining
3. Shared-memory configuration
Topic D
Operating Systems
Software Categories
Application software
is written to address our specific needs—to solve problems in the
real world.
Word processing programs, games, inventory
control systems, automobile diagnostic programs,
and missile guidance programs are all application software.
System software
manages a computer system at a fundamental level.
It provides the tools and an environment in which application software
can be created and run.
Within the class of system software are two categories:
1. Utility software
programs for performing various activities fundamental to
computer installations, but not part of the OS. (Examples
include formatting a disk, networking, copying files, using a
modem, and data compression.)
2. Operating Systems
Software
   Application Software
   System Software
      Utility Software
      Operating System
         Shell
         Kernel
Operating System
An operating system consists of two parts:
1. The kernel manages computer resources, such as memory and input/output
devices.
2. The shell provides an interface through which a human can interact with the
computer.
An operating system also allows application programs to interact with the other system
resources.
An operating system interacts with many
aspects of a computer system.
The various roles of an operating system generally revolve around the idea of “sharing
nicely”.
An operating system manages resources, and these resources are often shared in one
way or another among programs that “want” to use them.
Managing Resources
Resource management consists of:
I. Memory management
II. Process management
III. CPU scheduling
Memory Management
Memory management
keeps track of what is stored in memory and where in memory it is.
Multiprogramming
is the technique of keeping multiple programs in main memory at the same time.
These programs compete for access to the CPU so that they can execute.
Memory is a continuous set of bits referenced by specific addresses.
Logical and Physical Addresses
A program may include instructions that transfer control. For example, in BASIC a
programmer can say “GOTO 200”
where 200 is the line number of the instruction to be executed next.
This line number is relative to the start of the program and so is a logical address.
However, the physical address is the actual location in memory where this
instruction is stored.
Logical address
(sometimes called a virtual or relative address) is a value that
specifies a generic location, relative to the program but not to the
reality of main memory.
Physical address
is an actual address in the main memory device.
Operating systems must employ techniques to:
I. Track where and how a program resides in memory.
II. Convert logical program addresses into actual memory addresses.
There are three approaches to memory management depending on how we
conceive of memory being organised:
1) Single Contiguous Memory
2) Partitioned Memory
3) Paged Memory
Single Contiguous Memory Management
There are only two programs in memory:
1. The operating system
2. The application program
This approach is called single contiguous memory management.
In this system, a logical address is simply an integer value relative to the starting point of
the program.
To produce a physical address, we add a logical address to the starting address of the
program in physical main memory.
Partition Memory Management
When using fixed partitions, main memory is divided into a particular number of
partitions.
When using dynamic partitions, the partitions are created to fit the need of the
programs.
At any point in time, memory is divided into a set of partitions, some empty and some
allocated to programs.
The Base register holds the beginning address of the current partition.
The Bounds register holds the length of the current partition.
Address resolution in partition memory management
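A sketch of address resolution with the two registers, assuming the logical address is first checked against the Bounds value and then added to the Base value. The function name and the sample numbers are made up.

```python
def resolve(logical_address, base, bounds):
    """Translate a logical address using the Base and Bounds register values."""
    if not 0 <= logical_address < bounds:
        # The address falls outside the current partition.
        raise ValueError("address outside the current partition")
    return base + logical_address

# Hypothetical partition starting at address 40000 with a length of 8000.
print(resolve(100, base=40000, bounds=8000))    # 40100
```

Single contiguous memory management uses the same addition, with the starting address of the program in place of the base value.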
Partition Selection
o First fit
Program is allocated to the first partition big enough to hold it.
o Best fit
Program is allocated to the smallest partition big enough to hold it.
o Worst fit
Program is allocated to the largest partition big enough to hold it.
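The three strategies can be compared with a small Python sketch. It assumes the free partitions are given as a list of sizes in memory order; the function name and the sample sizes are invented.

```python
def choose_partition(free_sizes, program_size, strategy):
    """Return the index of the chosen free partition, or None if none is big enough."""
    candidates = [(size, index)
                  for index, size in enumerate(free_sizes)
                  if size >= program_size]
    if not candidates:
        return None
    if strategy == "first":
        return candidates[0][1]                            # first one big enough
    if strategy == "best":
        return min(candidates, key=lambda c: c[0])[1]      # smallest one big enough
    if strategy == "worst":
        return max(candidates, key=lambda c: c[0])[1]      # largest one big enough
    raise ValueError(f"unknown strategy {strategy!r}")

free = [500, 200, 300, 600]     # made-up free partition sizes
for strategy in ("first", "best", "worst"):
    print(strategy, choose_partition(free, 250, strategy))
# first 0
# best 2
# worst 3
```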
Paged Memory Management
Paged memory technique
main memory is divided into small fixed-size blocks of storage called frames.
A program is divided into pages that (for the sake of our discussion) we
assume are the same size as a frame.
The operating system maintains a separate page-map table (PMT) for each program in
memory.
To produce a physical address, you first look up the page in the PMT to find the frame
number in which it is stored.
Then multiply the frame number by the frame size and add the offset to get the physical
address.
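The two steps above can be sketched in Python. The page-map table contents and the frame size of 1024 are assumptions made up for the example, and the PMT is modelled as a simple dictionary from page number to frame number.

```python
def physical_address(page, offset, page_map_table, frame_size):
    """Translate a (page, offset) logical address into a physical address."""
    if not 0 <= offset < frame_size:
        raise ValueError("offset must be smaller than the frame size")
    frame = page_map_table[page]          # step 1: look up the frame number
    return frame * frame_size + offset    # step 2: frame start plus offset

pmt = {0: 5, 1: 12, 2: 15, 3: 7}          # hypothetical PMT for one program
print(physical_address(1, 222, pmt, 1024))   # 12 * 1024 + 222 = 12510
```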
A paged memory management approach
An important extension is demand paging.
o Not all parts of a program actually have to be in memory at the same time.
o In demand paging, the pages are brought into memory on demand.
o The act of bringing in a page from secondary memory, which often causes
another page to be written back to secondary memory, is called a page swap.
The demand paging approach gives rise to the idea of virtual memory, the illusion that
there are no restrictions on the size of a program.
Too much page swapping, however, is called thrashing and can seriously degrade
system performance.
Resource Management
A process can be defined as a program in execution.
The operating system performs process management to carefully track the progress of
each process and all of its intermediate states.
Timesharing
Multiprogramming allowed multiple processes to be active at once, which gave rise to
the ability for programmers to interact with the computer system directly, while still
sharing its resources.
A timesharing system allows multiple users to interact with a computer at the same
time.
In a timesharing system, each user has his or her own virtual machine, in which all
system resources are (in effect) available for use.
The Process Control Block
The operating system must manage a large amount of data for each active process.
Usually that data is stored in a data structure called a process control block (PCB).
o Each state is represented by a list of PCBs, one for each process in that state.
Keep in mind that there is only one CPU and therefore only one set of CPU registers.
o These registers contain the values for the currently executing process.
o The values define the state of the machine at any given time.
Each time a process is moved to the running state:
o Register values for the interrupted process are stored into its PCB.
o Register values of the process admitted to the running state are loaded into the
CPU from its PCB.
o This exchange of information is called a context switch.
CPU Scheduling
The act of determining which process in the ready state should be moved to the running
state.
That is, decide which process should be given over to the CPU.
Nonpreemptive scheduling
occurs when the currently executing process gives up the CPU voluntarily.
Preemptive scheduling
occurs when the operating system decides to favour another process,
preempting the currently executing process.
Turnaround time
for a process is the amount of time between when the process arrives in the
ready state to the time it exits the running state for the last time.
First-Come, First-Served
The first ordering structure that comes to mind is the queue.
Processes are moved to the CPU in the order in which they arrive in the Ready state.
FCFS scheduling is nonpreemptive – one process completes before the next begins.
Shortest Job Next
This technique looks at all processes in the Ready state and dispatches the one with the
shortest service time.
It is also generally implemented as a nonpreemptive algorithm.
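A sketch comparing the two nonpreemptive algorithms by average turnaround time. It assumes all processes arrive in the ready state at time 0 and ignores context-switch overhead; the service times and function names are made up for illustration.

```python
def turnaround_times(service_times):
    """Completion time of each process when run back to back from time 0."""
    finished = []
    clock = 0
    for service in service_times:
        clock += service
        finished.append(clock)
    return finished

def average(values):
    return sum(values) / len(values)

service = [140, 75, 320, 280, 125]            # arrival order, made-up values

fcfs = turnaround_times(service)              # run in arrival order
sjn = turnaround_times(sorted(service))       # run shortest job first

print(fcfs, average(fcfs))   # [140, 215, 535, 815, 940] 529.0
print(sjn, average(sjn))     # [75, 200, 340, 620, 940] 435.0
```

Under these assumptions, running the shortest jobs first lowers the average turnaround time.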
Round Robin Scheduling
…distributes the processing time equitably among all ready processes.
The algorithm establishes a particular time slice (or quantum), which is the amount of
time each process receives before being preempted. It is then returned to the ready
state to allow another process its turn.
The Round-robin algorithm is preemptive.
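A sketch of round robin in Python, assuming all processes arrive at time 0, a time slice of 50, and no context-switch overhead. The service times and the function name are invented for the example.

```python
from collections import deque

def round_robin(service_times, quantum):
    """Return each process's completion time under round-robin scheduling."""
    remaining = list(service_times)
    finish = [0] * len(service_times)
    ready = deque(range(len(service_times)))   # ready queue of process indices
    clock = 0
    while ready:
        process = ready.popleft()
        run = min(quantum, remaining[process])
        clock += run
        remaining[process] -= run
        if remaining[process] > 0:
            ready.append(process)      # preempted: back to the ready state
        else:
            finish[process] = clock    # process has completed
    return finish

finish = round_robin([140, 75, 320, 280, 125], quantum=50)
print(finish)                       # [515, 325, 940, 920, 640]
print(sum(finish) / len(finish))    # 668.0
```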
Notice that Round Robin often gives a longer average turnaround time than the
nonpreemptive algorithms, but it shares the CPU more evenly among processes.
Topic D Laboratory Information
Simulation
Simulation:
A model of a complex system and the experimental manipulation of the model
to observe the results.
Systems that are best suited to being simulated are dynamic, interactive,
and complicated.
Model:
An abstraction of a real system.
It is a representation of the objects within the system and the rules that
govern the interactions of the objects.
Constructing Models
Continuous simulation
o Treats time as continuous and expresses changes in terms of a set of differential
equations that reflect the relationships among the set of characteristics.
o Meteorological models fall into this category.
Discrete event simulation
o consists of entities, attributes, and events.
o Entity:
the representation of some object in the real system that must be
explicitly defined
o Attribute:
some characteristic of a particular entity
o Event:
an interaction between entities
Queuing Systems
Queuing system:
a discrete-event model that uses random numbers to represent the arrival and
duration of events.
The system is made up of servers and queues of objects to be served.
The objective is to utilize the servers as fully as possible while keeping the
wait time within a reasonable limit.
To construct a queuing model, we must know the following four things:
1. the number of servers
2. the number of events and how they affect the system
o in order to determine the rules of entity interaction
3. the distribution of arrival times
o in order to determine if an entity enters the system
4. the expected service time
o in order to determine the duration of an event
Meteorological Models
Meteorological models are based on the time-dependent, partial differential equations
of fluid mechanics and thermodynamics.
Initial values for the variables are entered from observation, and the equations are
solved to define the values of the variables at some later time.
Computer models are designed to aid the weathercaster, not replace him or her.
The outputs from the computer models are predictions of the values of
variables in the future.
It is up to the weathercaster to determine what the values mean.
Topic E
Database Management
Database Management Systems
Database:
A structured set of data.
Database Management System:
(DBMS) A combination of software and data, including:
Physical database:
a collection of files that contain the data.
Database engine:
software that supports access to and modification of the database
contents.
Database schema
a specification of the logical structure of the data stored in the
database.
Specialized database languages allow the user to:
o specify the structure of data;
o add, modify, and delete data;
o query the database to retrieve specific stored data.
The elements of a database management system
Databases
Databases are a recent development in the management of large amounts of data.
As paper file systems were “computerized” each application was implemented
separately with its own data set.
These systems were riddled with both corrupt data and redundant data, none of which
could be shared.
The integration of separate systems into one database resolved these issues, but
introduced new ones.
With all data shared, control of access to the data becomes a major concern.
A schema is a description of the entire database structure used by the database
software to maintain the database.
A subschema is a description of only that part of the database that is particular to a
user’s needs.
A layered approach hides the complexities of database implementation.
o User sees data in terms of the application.
o The application “sees” data in terms of the database model.
o The DBMS “sees” data as it is organized.
Advantages of the layered approach include:
Simplification of the design process.
Better control of access.
Data Independence.
Applications can be written in terms of simple, conceptual views of the
data – the database model.
Database Models
A database model is a conceptual view of how to organize and manipulate data.
The most popular one is the Relational Model.
In a relational DBMS, the data items - and the relationships among them - are organized
into rectangular tables.
As with spreadsheets, these tables consist of rows and columns.
o Each table is called a relation.
o The rows are called tuples.
o The columns are called attributes.
Of course, different authors adopt different terms. There is a commonly used, alternate
set of names:
o Relations are also called tables.
o A tuple can be referred to as a record, and in this terminology a record is a
collection of related fields
We can express the schema for this database table as follows:
Movie (MovieId:key, Title, Genre, Rating)
We can express the schema for this table as:
Customer (CustomerId:key, Name, Address, CreditCardNumber)
A table can represent a collection of relationships between objects. The
RENTS table relates Customers to the Movies they’ve rented by their
respective Ids.
Relationships
We can also express the schema for a relationship:
Rents (CustomerId, MovieId, DateRented, DateDue)
Note the absence of a key field.
Relational operations
There are 3 fundamental operations that can be used to manipulate the tables in a
database:
SELECT
Extracts rows (tuples) from a table (relation)
PROJECT
Extracts columns (attributes) from a table (relation)
JOIN
Combines 2 tables (relations) into 1
The result of any relational operation is a new relation. We can express these operations
with a simple syntax.
NEW ← SELECT from MOVIE where RATING = “PG”
This operation creates a new relation (named NEW) by extracting all rows from
the MOVIE table that have a RATING of PG.
The same syntax can be used for the other operations.
PGmovies ← PROJECT MovieId, Title from NEW
This operation creates a new relation (named PGmovies) that extracts 2
attributes from the NEW relation.
A JOIN creates a new relation by combining 2 relations according to some criterion.
TEMP1 ← JOIN CUSTOMER and RENTS
where CUSTOMER.CustomerId = RENTS.CustomerId
The PROJECT operation can be used to remove the attributes we don’t want…
RENTALS ← PROJECT Name, Address, MovieId from TEMP1
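The three operations can be sketched in Python by treating a relation as a list of dictionaries (one dictionary per tuple). The function names mirror the operations, but the implementation and all sample rows are made up; unlike a strict relational PROJECT, this sketch does not remove duplicate rows.

```python
def select(relation, predicate):
    """SELECT: keep the rows (tuples) that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attributes):
    """PROJECT: keep only the named columns (attributes)."""
    return [{name: row[name] for name in attributes} for row in relation]

def join(left, right, attribute):
    """JOIN: combine rows whose values for the shared attribute are equal."""
    return [{**l, **r} for l in left for r in right
            if l[attribute] == r[attribute]]

movie = [
    {"MovieId": 1, "Title": "Alpha", "Genre": "drama", "Rating": "PG"},
    {"MovieId": 2, "Title": "Bravo", "Genre": "action", "Rating": "R"},
    {"MovieId": 3, "Title": "Charlie", "Genre": "comedy", "Rating": "PG"},
]
customer = [
    {"CustomerId": 10, "Name": "Ann", "Address": "1 Elm St"},
    {"CustomerId": 11, "Name": "Bob", "Address": "2 Oak St"},
]
rents = [
    {"CustomerId": 10, "MovieId": 3},
    {"CustomerId": 10, "MovieId": 2},
]

new = select(movie, lambda row: row["Rating"] == "PG")
pg_movies = project(new, ["MovieId", "Title"])
print(pg_movies)
# [{'MovieId': 1, 'Title': 'Alpha'}, {'MovieId': 3, 'Title': 'Charlie'}]

temp1 = join(customer, rents, "CustomerId")
rentals = project(temp1, ["Name", "Address", "MovieId"])
print(rentals)
# [{'Name': 'Ann', 'Address': '1 Elm St', 'MovieId': 3},
#  {'Name': 'Ann', 'Address': '1 Elm St', 'MovieId': 2}]
```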
Structured Query Language
Structured Query Language (SQL)
A comprehensive database language for managing relational databases.
Queries in SQL
select attribute-list from table-list where condition
select Title from MOVIE where Rating = 'PG'
select Name, Address from CUSTOMER
select * from MOVIE where Genre like '%action%'
select * from MOVIE where Rating = 'R' order by Title
Modifying Database Content
insert into CUSTOMER values (9876, 'John Smith', '602 Greenbriar Court', '2938 3212 3402 0299')
update MOVIE set Genre = 'thriller drama' where title = 'Unbreakable'
delete from MOVIE where Rating = 'R'
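Statements like these can be tried out with Python's built-in sqlite3 module and an in-memory database. The table definition and the three sample rows are assumptions made up for this sketch; only the shape of the select, update, and delete statements follows the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # temporary database, nothing is saved to disk
conn.execute(
    "create table MOVIE "
    "(MovieId integer primary key, Title text, Genre text, Rating text)"
)
conn.executemany(
    "insert into MOVIE values (?, ?, ?, ?)",
    [
        (1, "Alpha", "action comedy", "PG"),
        (2, "Bravo", "drama", "R"),
        (3, "Charlie", "action thriller", "R"),
    ],
)

pg_titles = conn.execute(
    "select Title from MOVIE where Rating = 'PG'"
).fetchall()
action = conn.execute(
    "select Title from MOVIE where Genre like '%action%' order by Title"
).fetchall()

conn.execute("update MOVIE set Genre = 'thriller drama' where Title = 'Bravo'")
conn.execute("delete from MOVIE where Rating = 'R'")
remaining = conn.execute("select count(*) from MOVIE").fetchone()[0]
conn.close()

print(pg_titles)   # [('Alpha',)]
print(action)      # [('Alpha',), ('Charlie',)]
print(remaining)   # 1
```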
Database Design
Entity-relationship (ER) modeling
A popular technique for designing relational databases.
ER Diagram
Chief tool used for ER modeling.
o Captures the important record types, attributes, and relationships
in a graphical form.
These designations show the cardinality constraint of the relationship