Invitation to Computer Science, C++ Version, 6E 2
Objectives
In this chapter, you will learn about:
The components of a computer system
Putting all the pieces together – the Von Neumann architecture
The future: non-Von Neumann architectures
Introduction
Remember that computer science is the study of algorithms, including:
Their formal and mathematical properties (Chapters 1-3)
Their hardware realizations (Chapters 4-5)
Their linguistic realizations
Their applications
Computer organization examines the computer as a collection of interacting "functional units"
Functional units may be built out of the circuits already studied
A higher level of abstraction assists understanding by reducing complexity
The Components of a Computer System
The Von Neumann architecture has four functional units:
Memory
The unit that stores and retrieves instructions and data
Input/Output
Handles communication with the outside world
Arithmetic/Logic Unit
Performs mathematical and logical operations
Control Unit
Repeats the following three tasks:
1. Fetch an instruction from memory
2. Decode the instruction
3. Execute the instruction
The program is stored in memory
Instructions are executed sequentially
Memory and Cache
Information is stored in and fetched from the memory subsystem
Random Access Memory (RAM) maps addresses to memory locations
Cache memory keeps values currently in use in faster memory to speed access times
Memory Hierarchy
Fast, expensive, small, volatile memory (such as RAM) sits at the top of the hierarchy; slow, cheap, large, non-volatile storage sits at the bottom
Memory and Cache (con't)
RAM (Random Access Memory)
Memory made of addressable “cells”
Current standard cell size is 1 byte = 8 bits
All memory cells accessed in equal time
Memory address
The address is an unsigned binary number with N bits
The maximum memory size (or address space) is then 2^N cells
Memory and Cache (con't)
Memory Size
Memory sizes are powers of 2:
2^10 = 1K (1 kilobyte)
2^20 = 1M (1 megabyte)
2^30 = 1G (1 gigabyte)
2^40 = 1T (1 terabyte)
If the MAR is N bits long, the largest address is 2^N - 1
The maximum memory size for a MAR with
N = 16 is 64 K
N = 20 is 1 M
N = 31 is 2 G
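The address-space sizes above follow directly from the number of MAR bits; a minimal Python sketch (function name illustrative):

```python
def address_space(n_bits):
    """Number of addressable cells reachable with an n_bits-wide MAR."""
    return 2 ** n_bits

print(address_space(16))  # 65536 cells = 64 K
print(address_space(20))  # 1048576 cells = 1 M
print(address_space(31))  # 2147483648 cells = 2 G
```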
Memory and Cache (con't)
Memory subsystem
Fetch/store controller
Fetch: retrieve a value from memory
Store: store a value into memory
Memory address register (MAR)
Memory data register (MDR)
Memory cells, with decoder(s) to select individual cells
Decoder Circuit
A decoder is a control circuit with N inputs and 2^N outputs; exactly one output line is 1, the one whose number equals the binary value on the inputs
A 3-to-2^3 (3-to-8) decoder:

a b c | o0 o1 o2 o3 o4 o5 o6 o7
0 0 0 |  1  0  0  0  0  0  0  0
0 0 1 |  0  1  0  0  0  0  0  0
0 1 0 |  0  0  1  0  0  0  0  0
0 1 1 |  0  0  0  1  0  0  0  0
1 0 0 |  0  0  0  0  1  0  0  0
1 0 1 |  0  0  0  0  0  1  0  0
1 1 0 |  0  0  0  0  0  0  1  0
1 1 1 |  0  0  0  0  0  0  0  1
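The decoder's behavior can be sketched in Python; this is a behavioral model of the truth table above, not the gate-level circuit:

```python
def decode(a, b, c):
    """3-to-8 decoder: the inputs, read as a binary number,
    select which single output line is set to 1."""
    index = a * 4 + b * 2 + c
    outputs = [0] * 8
    outputs[index] = 1
    return outputs

print(decode(0, 1, 1))  # line 3 selected: [0, 0, 0, 1, 0, 0, 0, 0]
```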
Memory and Cache (con't)
Fetch operation
1. Load the address into the MAR
The address of the desired memory cell is moved into the MAR
2. Decode the address in the MAR
Fetch/store controller signals a “fetch,” accessing the memory cell.
The memory unit must translate the N-bit address stored in the MAR into the set of signals needed to access that one specific memory cell
A decoder circuit is used for such a purpose
3. Copy the content of the memory location into the MDR
The value at the MAR’s location flows into the MDR
Memory and Cache (con't)
Store operation
1. Load the address into the MAR
The address of the cell where the value should go is placed in the MAR
2. Load the value into the MDR
The new value is placed in the MDR
3. Decode the address in the MAR and store the content of the MDR into that memory location
Fetch/store controller signals a “store,” copying the MDR’s value into the desired cell
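The fetch and store sequences can be sketched together as a small Python model of the memory subsystem (class and attribute names are illustrative; the decoder is modeled by plain list indexing):

```python
class Memory:
    def __init__(self, n_bits):
        self.cells = [0] * (2 ** n_bits)  # 2^N addressable cells
        self.mar = 0  # memory address register
        self.mdr = 0  # memory data register

    def fetch(self, address):
        self.mar = address               # 1. load the address into the MAR
        self.mdr = self.cells[self.mar]  # 2-3. "decode" the address, copy the cell into the MDR
        return self.mdr

    def store(self, address, value):
        self.mar = address               # 1. load the address into the MAR
        self.mdr = value                 # 2. load the value into the MDR
        self.cells[self.mar] = self.mdr  # 3. copy the MDR into the selected cell

mem = Memory(4)      # a 16-cell memory
mem.store(9, 42)
print(mem.fetch(9))  # 42
```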
Memory and Decoding Logic
Suffers from a scalability problem!
Cache Memory
Memory access is much slower than processing time
Faster memory is too expensive to use for all memory cells
Locality principle
Once an instruction (or value) is used, it is likely to be used again
Once an instruction (or value) is used, its neighbors are likely to be used soon
A small amount of fast memory holding just the values currently in use speeds up computing
=> Cache
Memory Fetch Operation
Three major steps:
1. Look first in cache memory
2. If the desired information is not in the cache, then access it from RAM
3. Copy the data along with the k immediately following memory locations into cache
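The three steps of a cached fetch can be sketched in Python; a dictionary stands in for the cache here (real caches use tags and fixed-size lines):

```python
def cached_fetch(address, cache, ram, k=4):
    """Fetch with a cache: 1) try the cache, 2) on a miss go to RAM,
    3) copy the value plus its k following neighbors into the cache."""
    if address in cache:
        return cache[address]
    value = ram[address]
    for a in range(address, min(address + k + 1, len(ram))):
        cache[a] = ram[a]
    return value

ram = list(range(100))
cache = {}
print(cached_fetch(10, cache, ram))  # 10 (miss, loaded from RAM)
print(cached_fetch(12, cache, ram))  # 12 (hit: 10..14 were copied in)
```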
Assume
Information we need is in cache 70% of the time and in RAM 30% of the time
Cache access costs 5 nsec and memory access costs 20 nsec
Average access time = (0.7 × 5) + 0.3 × (5 + 20) = 11.0 nsec
A 45% reduction compared with 20 nsec for memory alone!
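The same calculation generalizes to any hit rate and access costs; a minimal Python version:

```python
def average_access_time(hit_rate, cache_time, memory_time):
    """Expected access time: hits cost cache_time; misses cost
    the cache probe plus the memory access."""
    return hit_rate * cache_time + (1 - hit_rate) * (cache_time + memory_time)

print(average_access_time(0.7, 5, 20))  # about 11.0 nsec, vs. 20 nsec uncached
```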
Input/Output and Mass Storage
Communication with outside world and external data storage
Human interfaces
monitor, keyboard, mouse, printer
Archival storage:
Machine-readable and not dependent on constant power: hard disks, floppy disks, CD-ROMs
External devices vary tremendously from each other
Input/Output and Mass Storage (con't)
Mass storage devices
Direct access storage device
Hard drive, CD-ROM, DVD, etc.
Uses its own addressing scheme to access data
Sequential access storage device
Tape drive, etc.
Stores data sequentially
Used for backup storage these days
Mass storage is nonvolatile, unlike volatile storage such as RAM
Magnetic Disks
A read/write head travels across a spinning magnetic disk, retrieving or recording data
Figure 5.8 The organization of a magnetic disk
Input/Output and Mass Storage (con't)
Direct access storage devices
Data stored on a spinning disk
Disk divided into concentric tracks
Each track is composed of sectors
Read/write head moves from one track to another while the disk spins
Access time depends on:
Time to move the head to the correct track
Time for the correct sector to spin under the head
Access time (times in msec)
               Best   Worst   Average
Seek time      0      19.98   10.00
Latency        0       8.33    4.17
Transfer time  0.13    0.13    0.13
Total          0.13   28.44   14.30
Seek time: time to position the read/write head over the correct track
Latency: time needed for the correct sector to rotate under the read/write head
Transfer time: time to read or write the data
On average, the head crosses half the tracks (seek) and the disk makes half a revolution (latency)
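Putting the three components together, with the average-case values from the table above (times assumed to be in msec):

```python
def disk_access_time(seek, latency, transfer):
    """Total time = seek + rotational latency + transfer."""
    return seek + latency + transfer

print(disk_access_time(10.00, 4.17, 0.13))  # average case, about 14.30 msec
```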
Input/Output and Mass Storage (con't)
I/O controller
Intermediary between central processor and I/O devices
Processor sends request and data, then goes on with its work
I/O controller interrupts processor when request is complete
Figure 5.18 The Organization of a Von Neumann Computer
The Arithmetic/Logic Unit
The ALU is made up of three parts:
Registers
Interconnections between components (the bus)
ALU circuitry, where the actual computations are performed
Primitive operation circuits: arithmetic (ADD, etc.), comparison (CE, etc.), logic (AND, etc.)
Data inputs and results are stored in registers
A multiplexor selects the desired output
A typical ALU has 16, 32, or 64 registers
In an arithmetic operation such as A + B, A and B are the operands and + is the operator
ALU Organization
Multiplexor
A multiplexor is a control circuit that selects one of its input lines and passes that line's value to the output.
To do so, it uses selector lines to indicate which input line to select.
A multiplexor has 2^N input lines, N selector lines, and one output. The N selector lines are set to 0s and 1s. When the values of the N selector lines are interpreted as a binary number, they give the number of the input line that must be selected.
With N selector lines you can represent numbers between 0 and 2^N - 1.
The single output is the value on the input line whose number the selector lines form.
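The selection rule can be sketched in Python (a behavioral model, not the gate-level circuit):

```python
def multiplexor(inputs, selectors):
    """2^N-to-1 multiplexor: the N selector bits, read as a binary
    number, pick which input line reaches the single output."""
    index = 0
    for bit in selectors:
        index = index * 2 + bit
    return inputs[index]

print(multiplexor([7, 8, 9, 10], [0, 1]))  # selector 01 -> input line 1 -> 8
```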
Multiplexor Circuit
A multiplexor circuit has 2^N input lines (numbered 0 to 2^N - 1), N selector lines, and 1 output
Example: if the selector lines read 00...01, which is the number 1, the output is the value on input line 1
Figure 5.12 Using a Multiplexor Circuit to Select the Proper ALU Result
RECALL: THE ARITHMETIC/LOGIC UNIT USES A MULTIPLEXOR
The ALU circuits (AL1, AL2, ...) feed their results into a multiplexor; the selector lines choose which result goes to the output and into register R. The GT, EQ, and LT bits live in the condition code register, alongside register R and the other registers.
The Arithmetic/Logic Unit (con't)
ALU process
1. Values for the operation are copied into the ALU's input registers
2. All circuits compute results for those inputs
3. The multiplexor selects the one desired result from all the computed values
4. The result is copied into the desired result register
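The four-step ALU process can be sketched as follows (operation names are illustrative; in real hardware every circuit computes in parallel and the multiplexor discards all but one result):

```python
def alu(a, b, op_select):
    """All circuits compute; the multiplexor keeps one result."""
    results = {
        "ADD": a + b,        # arithmetic circuit
        "SUB": a - b,
        "AND": a & b,        # logic circuit
        "CE": int(a == b),   # compare-equal circuit
    }
    return results[op_select]  # the multiplexor's selection

print(alu(6, 3, "ADD"))  # 9
print(alu(6, 3, "CE"))   # 0
```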
The Control Unit
A control unit comprises
Links to other subsystems
Instruction decoder circuit
Two special registers:
Program Counter (PC)
Stores the memory address of the next instruction to be executed
Instruction Register (IR)
Stores the code for the current instruction
The Control Unit (con’t)
Manages the execution of a stored program
Task
While not a HALT instruction or a fatal error
1. Fetch the next instruction to be executed from memory
2. Decode it: determine what is to be done
3. Execute it: issue appropriate command to ALU, memory, and I/O controllers
End of the while loop
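The loop above can be sketched as a toy Python simulator (the opcodes, memory layout, and single register are illustrative, not the book's exact machine):

```python
LOAD, ADD, STORE, HALT = 1, 2, 3, 0  # illustrative op codes

def run(memory_image):
    pc, r = 0, 0                           # program counter and one register
    while True:
        opcode, address = memory_image[pc] # 1. fetch the next instruction
        pc += 1
        if opcode == HALT:                 # 2. decode it ...
            break
        elif opcode == LOAD:               # 3. ... and execute it
            r = memory_image[address]
        elif opcode == ADD:
            r += memory_image[address]
        elif opcode == STORE:
            memory_image[address] = r
    return memory_image

# Compute b + c: cell 101 holds b, cell 102 holds c, result goes to cell 100
memory_image = {0: (LOAD, 101), 1: (ADD, 102), 2: (STORE, 100), 3: (HALT, 0),
                100: 0, 101: 4, 102: 5}
print(run(memory_image)[100])  # 9
```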
Machine Language Instructions
Instructions that can be decoded and executed by the control unit
An instruction
Operation code (op code)
Unique unsigned-integer code assigned to each machine language operation, such as +, -, *, /, cmp, jump, …
Address field(s)
Memory addresses of the values on which the operation will work
Figure 5.14 Typical Machine Language Instruction Format
00001001 0000000001100011 0000000001100100
ADD X Y
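Decoding that bit pattern is a matter of slicing fixed-width fields; field widths here are taken from the figure (an 8-bit op code and two 16-bit address fields):

```python
word = "00001001" + "0000000001100011" + "0000000001100100"
opcode = int(word[0:8], 2)  # 8-bit op code: ADD
x = int(word[8:24], 2)      # 16-bit address field X
y = int(word[24:40], 2)     # 16-bit address field Y
print(opcode, x, y)  # 9 99 100
```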
Operations of Machine Language
Data transfer
Move values to and from memory and registers
Arithmetic/logic
Perform ALU operations that produce numeric values
Compares
Set bits of the compare (condition code) register to hold the result
Branches
Jump to a new memory address to continue processing
Operations Available
Data transfer: load, store, move, clear
Arithmetic operations: add, increment, subtract, decrement
I/O operations: in, out
Compare: compare
Branch: jump, jumpgt, jumpeq, jumplt, jumpneq, halt
There is a different operation code (op code) for each operation
Putting All the Pieces Together—the Von Neumann Architecture
Subsystems connected by a bus
Bus: wires that permit data transfer among them
At this level, ignore the details of circuits that perform these tasks: Abstraction!
Computer repeats fetch-decode-execute cycle indefinitely
Figure 5.18 The Organization of a Von Neumann Computer
LOAD 101
The control unit places the address 101 in the MAR and signals a fetch; the value b in memory cell 101 is copied into register R
ADD 102
The value c in memory cell 102 is fetched; the ALU adds it to the value b from register R, and the result b+c is placed back in register R
STORE 100
The address 100 goes into the MAR and the value b+c from register R goes into the MDR; after the store signal, memory cell 100 holds b+c
Are All Architectures the von Neumann Architecture? No.
One of the bottlenecks in the von Neumann architecture is the fetch-decode-execute cycle.
With only one processor, that cycle is difficult to speed up.
I/O has been done in parallel for many years.
Why have the CPU wait for the transfer of data between memory and the I/O devices?
Most computers today also multitask: they make it appear that multiple tasks are being performed in parallel (when in reality they aren't, as we'll see when we look at operating systems).
But some computers do allow multiple processors.
Comparing Various Types of Architecture
Typically, synchronous computers have fairly simple processors, so there can be many of them, into the thousands. One built by Paracel (GeneMatcher) has over one million processors; it was used by Celera in completing the sequencing of the human genome.
Pipelined computers are often used for high-speed arithmetic calculations, as these pipeline easily.
Shared-memory computers configure independent computers to work on one task. Typically something like 8, 16, or at most 64 such computers are configured together.
Some recent parallel computers used for gaming, such as the PlayStation, are partially based on this architecture.
Synchronous processing
One approach to parallelism is to have multiple processors apply the same program to multiple data sets
Figure 5.6 Processors in a synchronous computing environment
Pipelining
Arranges processors in tandem, where each processor contributes one part to an overall computation
Figure 5.7 Processors in a pipeline
Shared-Memory
Four processors, each with its own local memory (Local Memory 1 through Local Memory 4), all connected to a single shared memory
Different processors do different things to different data.
A shared-memory area is used for communication.
Von Neumann Bottleneck
First-generation machines: 10,000 instructions per second
Second-generation machines: 1 million instructions per second (MIPS)
Today's processors: about 1,000~5,000 MIPS
Tens of billions of transistors, separated by distances of less than 0.000001 cm
Speed of light (approximately 3 × 10^8 m/s) => about 3 nsec to travel 1 meter
A real-time computer animation: 30 × 3000 × 3000 × 100 = 27 billion instructions per second (27,000 MIPS)
Beyond the ability of current processors!
What is the fastest computer in the world?
Visit the site
http://www.top500.org/list/2005/06/
to find out!
Non-Von Neumann Architectures
Physical limitations on speed of Von Neumann computers
Non-Von Neumann architectures explored to bypass these limitations
Parallel computing architectures can provide improvements: multiple operations occur at the same time
Single Instruction Stream/Multiple Data Stream (SIMD)
Multiple Instruction Stream/Multiple Data Stream (MIMD)
SIMD Parallel Processing Architecture
Multiple processors running in parallel
All processors execute same operation at one time
Each processor operates on its own data
Suitable for “vector” operations
Ex: V + 1 (add 1 to every element of vector V); this was the architecture of the first supercomputers, circa 1980
Multiple processors running in parallel
Each processor performs its own operations on its own data
Processors communicate with each other
MIMD Parallel Processing Architecture
High scalability!
Cluster computing
Grid Computing
Cloud Computing
The Key to Parallel Computing
To effectively utilize the large number of processors, we need parallel algorithms
Summary of Level 2
Focus on how to design and build computer systems
Chapter 4
Binary codes
Transistors
Gates
Circuits
Summary of Level 2 (con't)
Chapter 5
Von Neumann architecture
Shortcomings of the sequential model of computing
Parallel computers
Summary
Computer organization examines different subsystems of a computer: memory, input/output, arithmetic/logic unit, and control unit
Machine language gives codes for each primitive instruction the computer can perform, and its arguments
Von Neumann machine: sequential execution of stored program
Parallel computers improve speed by doing multiple tasks at one time
Connecting I/O devices
I/O devices cannot be connected directly to the buses that connect the CPU and memory
I/O devices are electromechanical, magnetic, or optical devices
CPU and memory are electronic devices.
I/O devices also operate at a much slower speed
Input/output devices are therefore attached to the buses through input/output controllers or interfaces.
Figure 5.9 Organization of an I/O Controller
Small Computer System Interface (SCSI)
First developed for Macintosh computer in 1984
Parallel interface with 8, 16, or 32 connections
Daisy-chained connection interface
Each device has a unique address
Both ends of the chain must be connected to a special device (terminator)
FireWire
IEEE standard 1394; Apple calls it FireWire, Sony calls it i.Link
Transfer rate up to 50 MB/sec, or double that
Connects up to 63 devices in a daisy-chain or tree connection interface
USB
Universal Serial Bus (USB)
A serial controller that connects both low- and high-speed devices
Supports hot swapping
Uses a cable with 4 wires:
Two (+5 volts and ground) provide power for low-power devices
The other two carry data, address, and control signals
Data is transferred in packets (device ID, control part, data part)
All devices receive the same packet and filter out packets not addressed to them using the device ID
USB 2: up to 127 devices connected in a tree topology with hubs; supported transfer rates are 1.5 Mbps, 12 Mbps, and 480 Mbps
Hot swapping describes replacing components without significant interruption to the system
Hot plugging describes adding components that expand the system without significant interruption to its operation
Types of Main Memory
There are two types of main memory
Random Access Memory (RAM)
holds its data as long as the computer is switched on
All data in RAM is lost when the computer is switched off
Described as being volatile
It is direct access, as it can be both written to and read from in any order
Its purpose is to temporarily hold programs and data for processing. In modern computers it also holds the operating system
Types of Main Memory (con’t)
Read Only Memory (ROM)
ROM holds programs and data permanently even when computer is switched off
Data can be read by the CPU in any order so ROM is also direct access
The contents of ROM are fixed at the time of manufacture
Stores a program called the bootstrap loader that helps start up the computer
Access time of between 10 and 50 ns
Types of RAM
1. Dynamic Random Access Memory (DRAM)
Contents are constantly refreshed, 1000 times per second
Access time 60-70 nanoseconds
2. Synchronous Dynamic Random Access Memory (SDRAM)
Quicker than DRAM; access time less than 60 nanoseconds
3. Direct Rambus Dynamic Random Access Memory (DRDRAM)
A newer RAM architecture; access time 20 times faster than DRAM; more expensive
Types of RAM (con’t)
4. Static Random Access Memory (SRAM)
Doesn’t need refreshing
Retains contents as long as power applied to the chip
Access time around 10 nanoseconds
Used for cache memory
Also used for date and time settings, as it is powered by a small battery
The Operation of Cache Memory
1. Cache fetches data from addresses next to the current address in main memory
2. CPU checks to see whether the next instruction it requires is in cache
3. If it is, the instruction is fetched from the cache, a very fast access
4. If not, the CPU has to fetch the next instruction from main memory (DRAM), a much slower process
The CPU, the cache memory (SRAM), and main memory (DRAM) are linked by bus connections
Others
Cache memory
Small amount of memory, typically 256 or 512 kilobytes
Temporary store for often-used instructions
Level 1 cache is built within the CPU (internal)
Level 2 cache may be on chip or nearby (external)
Faster for the CPU to access than main memory
Video Random Access Memory (VRAM)
Holds data to be displayed on the computer screen
Has two data paths, allowing a READ and a WRITE to occur at the same time
A system's amount of VRAM relates to the number of colours and the resolution
A graphics card may have its own VRAM chip on board
Others (con’t)
Virtual memory
Uses backing storage, e.g. the hard disk, as a temporary location for programs and data when insufficient RAM is available
Swaps programs and data between the hard-disk and RAM as the CPU requires them for processing
A cheap method of running large or many programs on a computer system
Cost is speed: the CPU can access RAM in nanoseconds but hard-disk in milliseconds (Note: a millisecond is a thousandth of a second)
Virtual memory is much slower than RAM
Paging
Allows a process to be composed of a number of fixed-size blocks, called pages
A virtual address is a page number and an offset within the page
Each page may be located anywhere in main memory
Its real address, or physical address, is its location in main memory
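Splitting a virtual address into page number and offset can be sketched in Python (the 4 KB page size and the frame-table contents are illustrative assumptions):

```python
PAGE_SIZE = 4096  # assumed page size: 4 KB = 2^12 bytes

def translate(virtual_address, page_table):
    page = virtual_address // PAGE_SIZE   # page number
    offset = virtual_address % PAGE_SIZE  # offset within the page
    frame = page_table[page]              # the page can live in any frame of RAM
    return frame * PAGE_SIZE + offset

print(translate(5000, {0: 7, 1: 2}))  # page 1, offset 904 -> 2*4096 + 904 = 9096
```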
Address Translation
The CPU emits virtual addresses; the MMU translates them into physical addresses
Address space: a group of memory addresses usable by something; each program (process) and the kernel has a potentially different address space
Address translation: translating the virtual addresses emitted by the CPU into the physical addresses of memory
The mapping is often performed in hardware by a Memory Management Unit (MMU)
Example of Address Translation
Prog 1 and Prog 2 each have their own virtual address space (code, data, heap, stack); Translation Map 1 and Translation Map 2 place those pieces, together with the OS code, data, heap, and stacks, at different locations in the single physical address space
Types of ROM
1. Programmable Read Only Memory (PROM)
Empty of data when manufactured
May be permanently programmed by the user
2. Erasable Programmable Read Only Memory (EPROM)
Can be programmed, erased and reprogrammed
The EPROM chip has a small window on top allowing it to be erased by shining ultra-violet light on it
After reprogramming the window is covered to prevent new contents being erased
Access time is around 45 – 90 nanoseconds
Types of ROM (con’t)
3. Electrically Erasable Programmable Read Only Memory (EEPROM)
Reprogrammed electrically without using ultraviolet light
Must be removed from the computer and placed in a special machine to do this
Access times between 45 and 200 nanoseconds
4. Flash ROM
Similar to EEPROM
However, can be reprogrammed while still in the computer
Easier to upgrade programs stored in Flash ROM
Used to store programs in devices e.g. modems
Access time is around 45 – 90 nanoseconds
Types of ROM (con’t)
5. ROM cartridges
Commonly used in games machines
Prevents software from being easily copied
Examples of instructions: LOAD X
The address X is placed in the MAR and a fetch is signaled; the value D in memory cell X is copied into register R
ADD X
The value D in memory cell X is fetched into one ALU input (ALU2) while register R supplies E to the other (ALU1); the ALU computes E+D, and the sum is placed back in register R
COMPARE X (assume D > E)
The value D in memory cell X and the value E in register R go to the ALU inputs (ALU1 and ALU2); since D > E, the condition codes GT EQ LT are set to 1 0 0
JUMPLT X
With condition codes GT EQ LT = 1 0 0, the LT bit is 0, so JUMPLT X does not jump; similarly for a condition code of 0 1 0