Computer Organization and System Software

• Lecturer: Szabolcs Mikulas
• E-mail: [email protected]
• URL: http://www.dcs.bbk.ac.uk/~szabolcs/coss.html
• Textbooks:
  – J.A. Harris, Operating Systems, Schaum’s Outline Series, McGraw-Hill, 2002
  – N. Carter, Computer Architecture, Schaum’s Outline Series, McGraw-Hill, 2002
• See also the URL for recommended readings.


Chapter 1: Computer System Overview

Operating Systems: Internals and Design Principles, 6/E
William Stallings

Patricia Roy, Manatee Community College, Venice, FL
©2008, Prentice Hall

With additional inputs from Computer Organization and Architecture, Parts 1 and 2

Computer Structure - Top Level

[Figure: top-level structure of a computer. Labels: Computer; Central Processing Unit; Main Memory; Input/Output; Systems Interconnection; Peripherals; Communication lines.]

The Central Processing Unit - CPU

[Figure: structure of the CPU. Labels: Computer (CPU, Memory, I/O, System Bus); CPU (Arithmetic and Logic Unit, Control Unit, Registers, Internal CPU Interconnection).]


Computer Components - Registers

Control and Status Registers

• Used by processor to control the operation of the processor
• Used by privileged operating system (OS) routines to control the execution of programs
• Program counter (PC): Contains the address of the next instruction to be fetched
• Instruction register (IR): Contains the instruction most recently fetched – currently executed
• Program status word (PSW): Contains status information

User-Visible Registers

• May be referenced by machine language, available to all programs – application programs and system programs
• Data
• Address
  – Index: Adding an index to a base value to get the effective address
  – Segment pointer: When memory is divided into segments, memory is referenced by a segment and an offset inside the segment
  – Stack pointer: Points to top of stack
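
The index and segment schemes above can be sketched as follows. This is a minimal illustration that assumes a flat integer address space; the function names, the bounds check and the sample values are hypothetical and do not describe any particular architecture.

```python
def indexed_address(base: int, index: int) -> int:
    """Effective address = base value + contents of the index register."""
    return base + index


def segmented_address(segment_base: int, offset: int, segment_size: int) -> int:
    """Effective address = start of the segment + offset inside the segment."""
    # Assumption: an offset outside the segment is treated as an error.
    if not 0 <= offset < segment_size:
        raise ValueError("offset lies outside the segment")
    return segment_base + offset


print(indexed_address(base=1000, index=12))  # 1012
print(segmented_address(segment_base=4096, offset=20, segment_size=1024))  # 4116
```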


Basic Instruction Cycle

Fetch Cycle

• Program Counter (PC) holds address of next instruction to be fetched
• Processor fetches instruction from memory location pointed to by PC
• Increment PC
  – Unless told otherwise
• Instruction loaded into Instruction Register (IR)
• Processor interprets instruction and performs required actions

Execute Cycle

• Data transfer (via the bus)
  – Between CPU and main memory
  – Between CPU and I/O module
• Data processing (by the arithmetic-logic unit, ALU)
  – Some arithmetic or logical operation on data
• Control (by the control unit)
  – Alteration of sequence of operations, e.g. jump
• Combinations of the above


Characteristics of a Hypothetical Machine

Simple Computation

• How to add the contents (3 and 2) of two memory locations (940 and 941) and store the result at a memory location (941)
  1. Load data (into accumulator register AC): LOAD 940, AC (3 -> AC)
  2. Perform addition: ADD 941, AC, AC (2+3 -> AC)
  3. Store result (in memory): STORE AC, 941 (5 -> 941)
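
The three steps can be traced with a toy fetch-execute loop. This is only a sketch: the instruction encoding as (operation, address) tuples, the program addresses 300 to 303 and the HALT instruction are assumptions made for illustration, not the encoding of the hypothetical machine in the slides.

```python
def run(memory: dict, pc: int) -> dict:
    """Fetch-execute loop for a toy accumulator machine (hypothetical encoding)."""
    ac = 0  # accumulator register AC
    while True:
        ir = memory[pc]   # fetch: instruction at the address held in the PC goes to the IR
        pc += 1           # increment the PC (no jumps in this toy machine)
        op, addr = ir     # decode the instruction held in the IR
        if op == "LOAD":
            ac = memory[addr]
        elif op == "ADD":
            ac = ac + memory[addr]
        elif op == "STORE":
            memory[addr] = ac
        elif op == "HALT":  # assumed instruction that ends the simulation
            return memory
        else:
            raise ValueError(f"unknown instruction {op!r}")


memory = {
    300: ("LOAD", 940),   # 3 -> AC
    301: ("ADD", 941),    # 2 + 3 -> AC
    302: ("STORE", 941),  # 5 -> 941
    303: ("HALT", None),
    940: 3,
    941: 2,
}
run(memory, pc=300)
print(memory[941])  # 5
```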


Example of Program Execution

CPU Speed

• The speed of a clocked CPU is measured as a frequency: 1 Hz (hertz) = 1 cycle per second
• 1 GHz = 10^3 MHz = 10^6 kHz = 10^9 Hz (instead of 10^3 one can use 2^10 = 1024)
• Length of a cycle is measured in seconds
• 1 s = 10^3 milliseconds = 10^6 microseconds = 10^9 nanoseconds
• Performing one operation may take longer than one clock cycle!
  – Accessing memory is slower than a pure arithmetic operation (using the registers)
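
The relation between clock frequency and cycle length can be checked with a short calculation. The sketch below assumes decimal prefixes (1 GHz = 10^9 Hz); the function names and the four-cycle operation are hypothetical examples.

```python
def cycle_time_ns(frequency_hz: float) -> float:
    """Length of one clock cycle in nanoseconds (period = 1 / frequency)."""
    return 1e9 / frequency_hz


def operation_time_ns(cycles: int, frequency_hz: float) -> float:
    """Time for an operation that needs several clock cycles."""
    return cycles * cycle_time_ns(frequency_hz)


print(cycle_time_ns(1e9))         # 1.0  (1 GHz -> 1 ns per cycle)
print(cycle_time_ns(2e9))         # 0.5
print(operation_time_ns(4, 2e9))  # 2.0  (a hypothetical 4-cycle operation at 2 GHz)
```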

Other Performance Measurements

• MIPS: millions of instructions per second
  NB: The same computation may take different numbers of instructions on different machines, see RISC vs. CISC
• CPI/IPC: cycles per instruction/instructions per cycle
• Benchmark suites
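
These measures are linked by simple arithmetic: IPC is the reciprocal of CPI, and a MIPS figure follows from the clock rate and the average CPI. The sketch below uses that standard relation; the function names and the sample clock rate and CPI are assumptions for illustration.

```python
def ipc(cpi: float) -> float:
    """Instructions per cycle is the reciprocal of cycles per instruction."""
    return 1.0 / cpi


def mips(clock_hz: float, cpi: float) -> float:
    """Millions of instructions per second = clock rate / (CPI * 10^6)."""
    return clock_hz / (cpi * 1e6)


print(ipc(4.0))        # 0.25
print(mips(2e9, 4.0))  # 500.0  (hypothetical 2 GHz processor with an average CPI of 4)
```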

Connecting

• All the units must be connected
• Different type of connection for different type of unit
  – Memory
  – Input/Output
  – CPU


Physical Realization of Bus Architecture


Computer Modules

Memory Connection

• Receives and sends data
• Receives addresses (of locations)
• Receives control signals
  – Read
  – Write
  – Timing

Input/Output Connection (1)

• Similar to memory from computer’s viewpoint
• Output
  – Receive data from computer
  – Send data to peripheral
• Input
  – Receive data from peripheral
  – Send data to computer

Input/Output Connection (2)

• Receive control signals from computer
• Send control signals to peripherals
  – e.g. spin disk
• Receive addresses from computer
  – e.g. port number to identify peripheral
• Send interrupt signals (control)

CPU Connection

• Reads instructions and data
• Writes out data (after processing)
• Sends control signals to other units
• Receives (& acts on) interrupts


Bus Interconnection Scheme

Data Bus

• Carries data
  – Remember that there is no difference between “data” and “instruction” at this level
• Width (number of lines) is a key determinant of performance, since this determines how many bits can be transferred in one go (cycle)
  – 32 to hundreds of bits

Address bus

• Identify the source or destination of data
• e.g. CPU needs to read an instruction (data) from a given location in memory
• Bus width determines maximum memory capacity of system
  – e.g. 8080 has 16 bit address bus giving 2^16=2^6*2^10=64K addresses
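
The link between address bus width and address space is a power of two, which a few lines can confirm. The function name is hypothetical, and the 32-line case is an extra example that is not taken from the slide.

```python
def addressable_locations(address_lines: int) -> int:
    """Number of distinct addresses an n-line address bus can express."""
    return 2 ** address_lines


print(addressable_locations(16))            # 65536
print(addressable_locations(16) // 2**10)   # 64  (64K addresses, as for the 8080)
print(addressable_locations(32) // 2**30)   # 4   (4G addresses for a 32-line bus)
```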

Control Bus

• Memory or I/O read/write signals
• Interrupt request/acknowledgment
• Clock signals
• Bus request/grant signals


Traditional (ISA) (with cache)


The Memory Hierarchy


Memory

• Typical memory hierarchy (numbers shown on the right are a bit out-dated)

Memory as storage

• Limited register size, so code and data have to be stored in (main) memory
• These are fetched by the CPU during the execution of the code
• Also the results of the computations must be stored
• These result in frequent access to (main) memory

Going Down the Hierarchy

• Decreasing cost per bit
• Increasing capacity
• Increasing access time
• Decreasing frequency of access to the memory by the processor (optimally - requires good design)

Main Memory

• Contains data (including instructions) in binary format: sequences of bits
• 1B=1 byte=8 bits=8b
• Word – a sequence of bytes, length is system specific (1, 2, 4, 8, etc. bytes)
• Block – a sequence of words, typically in the magnitude of several kilobytes (KB)
• An address - a location in memory. It specifies (the beginning of) a word or block - depending on the size of data transfer

Performance Balance

• Processor (logic) speed increases
• Memory capacity increases
• Memory speed increases but lags behind processor speed
• Speed is measured in frequency – how many cycles (execution of instruction or data transfer via the bus) happen in one second
• Typically: one bus cycle takes several clock cycles!


Logic and Memory Performance Gap

Cache Memory

• Processor speed faster than memory access speed
• Main memory becomes a bottleneck
• Exploit the principle of locality of reference: During the course of the execution of a program, memory references tend to cluster, e.g. loops, and the same data may be needed again
• Introduction of small, fast memory - cache


Cache and Main Memory

Cache Principles

• Contains copy of a (recently accessed) portion of main memory
• Processor first checks cache
• If not found (cache miss), block of memory read into cache (cache line)
• Because of locality of reference, likely future memory references are in that block
• Modern systems have several caches (instruction, data) on different levels (L1 on chip, L2, etc.)


Cache/Main-Memory Structure


Cache Read Operation

Size

• Cache size
  – Small caches have significant impact on performance, since accessing cache is faster than accessing main memory
• Block size
  – The unit of data exchanged between cache and main memory, typically several KB (kilobytes)

(Re)placement

• Mapping function
  – Determines which cache location the block will occupy when loaded into the cache
• Replacement algorithm
  – Chooses which block to replace
  – Least-recently-used (LRU) algorithm
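
Both ideas can be sketched in a few lines. The slide does not name a specific mapping function, so direct mapping (block number modulo number of lines) is shown here as one common choice, and LRU is shown for a fully associative cache, where any block may occupy any line. All names are hypothetical, and the cache tracks block numbers only, not their contents.

```python
from collections import OrderedDict


def direct_mapped_line(block_number: int, num_lines: int) -> int:
    """One possible mapping function: block i may only occupy line i mod num_lines."""
    return block_number % num_lines


class LRUCache:
    """Fully associative cache of block numbers with LRU replacement."""

    def __init__(self, num_lines: int):
        self.num_lines = num_lines
        self.lines = OrderedDict()  # ordered from least to most recently used

    def access(self, block_number: int) -> bool:
        """Return True on a hit, False on a miss (the block is then loaded)."""
        if block_number in self.lines:
            self.lines.move_to_end(block_number)  # mark as most recently used
            return True
        if len(self.lines) >= self.num_lines:
            self.lines.popitem(last=False)  # evict the least recently used block
        self.lines[block_number] = None
        return False


cache = LRUCache(num_lines=2)
hits = [cache.access(b) for b in [1, 2, 1, 3, 2]]
print(hits)  # [False, False, True, False, False]
print(direct_mapped_line(block_number=10, num_lines=4))  # 2
```

In the trace, block 2 is evicted when block 3 arrives because block 1 was used more recently, so the final access to block 2 misses.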

Write policy

• Dictates when the memory write operation takes place
  – Write through: occurs every time the block in the cache is updated
  – Write back: occurs when the block is replaced
    • Minimizes write operations
    • Leaves main memory in an obsolete state
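
The difference between the two policies can be seen by counting main-memory writes. This is a rough sketch under stated assumptions: each block is loaded once, updated some number of times and then replaced, and under write back only blocks that were actually modified are written out on replacement. The function name and sample data are hypothetical.

```python
from typing import Sequence


def memory_writes(updates_per_block: Sequence[int], policy: str) -> int:
    """Count main-memory writes for blocks that are updated and later replaced.

    updates_per_block[i] is how many times block i is updated while cached.
    """
    if policy == "write-through":
        # Every update in the cache is also written to main memory.
        return sum(updates_per_block)
    if policy == "write-back":
        # Assumption: one write per modified block, at replacement time.
        return sum(1 for n in updates_per_block if n > 0)
    raise ValueError(f"unknown policy {policy!r}")


print(memory_writes([5, 0, 3], "write-through"))  # 8
print(memory_writes([5, 0, 3], "write-back"))     # 2
```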

I/O Devices

• Programs with intensive I/O demands
• Large data throughput demands
• Processors can handle this, but memory is limited and slow
• Problem moving data
• Solutions:
  – Caching
  – Buffering
  – Higher-speed interconnection buses
  – More elaborate bus structures
  – Multiple-processor configurations

Typical I/O Device Data Rates (in bits per second)


Hard disk

Speed

• Seek time
  – Moving head above the correct track
• (Rotational) latency
  – Waiting for the correct sector to rotate under head
• Access time = Seek + Latency
• Transfer rate, typically in bits per second (bps)
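
The access-time formula can be tried out with sample numbers. The sketch assumes that the average rotational latency is half a revolution, a common approximation that the slide does not state; the seek time and rotation speed below are hypothetical values.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the sector: assumed to be half of one revolution."""
    ms_per_revolution = 60_000.0 / rpm
    return 0.5 * ms_per_revolution


def access_time_ms(seek_ms: float, rpm: float) -> float:
    """Access time = seek time + (rotational) latency."""
    return seek_ms + avg_rotational_latency_ms(rpm)


print(avg_rotational_latency_ms(6000))        # 5.0
print(access_time_ms(seek_ms=9.0, rpm=6000))  # 14.0
```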

Input/Output Problems

• Wide variety of peripherals
  – Delivering different amounts of data
  – At different speeds
  – In different formats
• All slower than CPU and main memory
• Need I/O modules

I/O Steps

• CPU checks I/O module device status
• I/O module returns status
• If ready, CPU requests data transfer
• I/O module gets data from device
• I/O module transfers data to CPU
• Variations for output, DMA, etc.

Input Output Techniques

• Programmed
• Interrupt driven
• Direct Memory Access (DMA)

Programmed I/O (1)

• CPU has direct control over I/O
  – Sensing status
  – Read/write commands
  – Transferring data
• CPU waits for I/O module to complete operation
• Wastes CPU time

Programmed I/O (2)

• CPU requests I/O operation
• I/O module performs operation
• I/O module sets status bits
• CPU checks status bits periodically
• I/O module does not inform CPU directly
• I/O module does not interrupt CPU
• CPU may wait or come back later

Programmed I/O (3)

• I/O module performs the action
• Sets the appropriate bits in the I/O status register
• CPU checks status bits periodically
• No interrupts occur
• Processor checks status until operation is complete
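
The polling behaviour described on the programmed I/O slides can be sketched as a busy-wait loop. The `IOModule` class below is a hypothetical stand-in for real hardware: it simply reports "not ready" for a fixed number of status checks before the data becomes available.

```python
class IOModule:
    """Hypothetical I/O module that becomes ready after a number of status checks."""

    def __init__(self, data: int, busy_polls: int):
        self._data = data
        self._remaining = busy_polls

    def status_ready(self) -> bool:
        """Status bit as seen by the CPU; the module never interrupts."""
        if self._remaining > 0:
            self._remaining -= 1
            return False
        return True

    def read_data(self) -> int:
        return self._data


def programmed_read(module: IOModule):
    """Check the status until ready, then transfer one word.

    Returns the data and the number of unsuccessful status checks.
    """
    wasted_polls = 0
    while not module.status_ready():
        wasted_polls += 1  # CPU time spent waiting instead of doing useful work
    return module.read_data(), wasted_polls


data, wasted = programmed_read(IOModule(data=42, busy_polls=3))
print(data, wasted)  # 42 3
```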


Program Flow of Control

Interrupt Driven I/O

• Overcomes CPU waiting
• No repeated CPU checking of device
• I/O module interrupts when ready


Program Flow of Control


Interrupts

• Interrupts the normal sequencing of the processor – suspends current activity and runs special code

• Program generated: result of an instruction, e.g. division by 0, overflow, illegal machine instruction

• Hardware generated: timer, I/O (when finished or error), other errors (e.g. parity check)

Interrupt Stage

• Processor checks for interrupts
• If interrupt occurred
  – Suspend execution of program
  – Execute interrupt-handler routine/interrupt service procedure
  – Afterwards control may be returned to suspended program
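
The interrupt stage can be pictured as an extra check at the end of each instruction cycle. The sketch below is a simplified trace generator, not a processor model: instructions are just names, and the function name, the set of interrupting positions and the handler steps are hypothetical inputs.

```python
def run_with_interrupts(program, interrupt_after, handler_steps):
    """Return the order of events when interrupts arrive during a program.

    program         -- list of instruction names
    interrupt_after -- positions in `program` after which an interrupt is pending
    handler_steps   -- event names produced by the interrupt handler
    """
    trace = []
    for position, instruction in enumerate(program):
        trace.append(instruction)           # fetch and execute stages
        if position in interrupt_after:     # interrupt stage: check for interrupts
            trace.append("suspend program")
            trace.extend(handler_steps)     # run the interrupt-handler routine
            trace.append("resume program")  # control returns to the suspended program
    return trace


print(run_with_interrupts(["i0", "i1", "i2"], interrupt_after={1}, handler_steps=["handler"]))
# ['i0', 'i1', 'suspend program', 'handler', 'resume program', 'i2']
```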


Transfer of Control via Interrupts


Instruction Cycle with Interrupts


Simple Interrupt Processing

Interrupt Driven I/O (2)

• CPU issues read command
• I/O module gets data from peripheral while CPU does other work
• I/O module interrupts CPU
• CPU requests data
• I/O module transfers data


Interrupt-Driven I/O (3)

• Processor is interrupted when I/O module ready to exchange data

• Processor saves context of program executing and begins executing interrupt-handler

Direct Memory Access

• Interrupt driven and programmed I/O require active CPU intervention
  – Transfer rate is limited
  – CPU is tied up
• DMA, an additional module (hardware) on bus
• DMA controller takes over from CPU for I/O

DMA Configurations (1)

• Single Bus, Detached DMA controller
• Each transfer uses bus twice
  – I/O to DMA then DMA to memory
• CPU is suspended twice


Typical DMA Module Diagram

DMA Operation

• CPU tells DMA controller:
  – Read/Write
  – Device address
  – Starting address of memory block for data
  – Amount of data to be transferred
• CPU carries on with other work
• DMA controller deals with transfer
• DMA controller sends interrupt when finished
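
The four items the CPU hands to the DMA controller map naturally onto a small request record. The sketch below is purely illustrative: the `DMARequest` fields, the list-based memory and device buffer, and the boolean return value standing in for the completion interrupt are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class DMARequest:
    to_memory: bool      # direction: True = device to memory, False = memory to device
    device_address: int  # identifies the device (not used further in this sketch)
    memory_start: int    # starting address of the memory block
    word_count: int      # amount of data to be transferred


def dma_transfer(request: DMARequest, device_words: list, memory: list) -> bool:
    """Move a block of words without CPU involvement; True stands for the final interrupt."""
    for i in range(request.word_count):
        if request.to_memory:
            memory[request.memory_start + i] = device_words[i]
        else:
            device_words[i] = memory[request.memory_start + i]
    return True  # completion interrupt sent to the CPU


memory = [0] * 8
request = DMARequest(to_memory=True, device_address=3, memory_start=2, word_count=3)
done = dma_transfer(request, device_words=[7, 8, 9], memory=memory)
print(memory, done)  # [0, 0, 7, 8, 9, 0, 0, 0] True
```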


Direct Memory Access

• Transfers a block of data directly to or from memory

• An interrupt is sent when the transfer is complete

• More efficient

DMA Transfer - Cycle Stealing

• DMA controller takes over bus for a cycle
• Transfer of one word of data
• Not an interrupt
  – CPU does not switch context
• CPU suspended just before it accesses bus
  – i.e. before an operand or data fetch or a data write
• Slows down CPU but not as much as CPU doing transfer

Improvements in Chip Organization and Architecture

• Increase hardware speed of processor
  – Fundamentally due to shrinking logic gate size
    • More gates, packed more tightly, increasing clock rate
    • Propagation time for signals reduced
• Increase size and speed of caches
  – Dedicating part of processor chip
    • Cache access times drop significantly
• Change processor organization and architecture
  – Increase effective speed of execution
  – Parallelism

Problems with Clock Speed and Logic Density

• Power
  – Power density increases with density of logic and clock speed
  – Dissipating heat
• RC delay
  – Speed at which electrons flow limited by resistance and capacitance of metal wires connecting them
  – Delay increases as RC product increases
  – Wire interconnects thinner, increasing resistance
  – Wires closer together, increasing capacitance
• Memory latency
  – Memory speeds lag processor speeds
• Solution: More emphasis on organizational and architectural approaches

Increased Cache Capacity

• Typically two or three levels of cache between processor and main memory
• Chip density increased
  – More cache memory on chip - faster cache access
• Pentium chip devoted about 10% of chip area to cache
• Pentium 4 devotes about 50%

More Complex Execution Logic

• Enable parallel execution of instructions
• Pipeline works like assembly line
  – Different stages of execution of different instructions at same time along pipeline
• Superscalar allows multiple pipelines within single processor
  – Instructions that do not depend on one another can be executed in parallel
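
The benefit of the assembly-line idea shows up in a simple cycle count. The sketch below uses the usual idealized model, which is not stated on the slide: every stage takes one clock cycle and there are no stalls or dependencies. The function names and the sample numbers are hypothetical.

```python
def sequential_cycles(instructions: int, stages: int) -> int:
    """Without pipelining, each instruction passes through all stages alone."""
    return instructions * stages


def pipelined_cycles(instructions: int, stages: int) -> int:
    """Ideal pipeline: the first instruction fills the pipe, then one finishes per cycle."""
    if instructions == 0:
        return 0
    return stages + (instructions - 1)


print(sequential_cycles(100, 5))  # 500
print(pipelined_cycles(100, 5))   # 104
```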

New Approach – Multiple Cores

• Multiple processors on single chip
  – Large shared cache
• Within a processor, increase in performance proportional to square root of increase in complexity
• If software can use multiple processors, doubling number of processors almost doubles performance
• So, use two simpler processors on the chip rather than one more complex processor
• Example: IBM POWER4
  – Two cores based on PowerPC
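
The argument for two simpler cores can be put into numbers. The sketch takes the square-root rule from the slide at face value and models "almost doubles" with a parallel efficiency factor; the function names and the value 0.95 are assumptions chosen for illustration only.

```python
import math


def single_core_speedup(complexity_factor: float) -> float:
    """Performance gain of one core whose complexity grows by the given factor."""
    return math.sqrt(complexity_factor)


def multi_core_speedup(cores: int, efficiency: float = 1.0) -> float:
    """Performance gain from several simple cores, scaled by parallel efficiency."""
    return cores * efficiency


print(round(single_core_speedup(2), 2))        # 1.41  (one core, twice as complex)
print(multi_core_speedup(2, efficiency=0.95))  # 1.9   (two simple cores, assumed 95% efficiency)
```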


Intel Microprocessor Performance