Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
Module 6: INPUT - OUTPUT (I/O)
Introduction
• Computers communicate with the outside world via I/O devices
– Input devices supply computers with data to operate on
• E.g: Keyboard, Mouse, Voice recognition hardware, Scanner, Digital camera
– Results of computations can be sent to output devices
• E.g : monitor, speaker, printer, motor, actuator
2 2012/13, Sem I
Relative speed • For a computer system, I/O generally slow
• CPUs can execute more than a billion instructions per second
• Modern memory systems can provide 2-4 GB/s bandwidth • For I/O
• A fast typist can enter 9-10 characters a second on a keyboard.
• Common local-area network (LAN) speeds go up to 100 Mbit/s, which is about 12.5MB/s
• Today’s hard disks provide a lot of storage and transfer speeds around 40-60MB per second
• I/O performance has not increased as quickly as CPU performance, partially due to neglect and partially to physical limitations
3
I/O Performance (1) • Primary characteristic
• Data rate (bandwidth) -- throughput
the amount of data that can be transferred in unit time (units = bytes/time)
This is a primary concern for applications which transfer large amounts of data in big blocks.
Eg: If you download large files, bandwidth will be the limiting factor.
• Latency -- I/O response time
the time taken for the smallest transfer (units = time)
This is a primary concern for programs that do many small dependent transfers.
4 2013/14, Sem I MMA
I/O Performance (2) • Contributing factors for I/O performance
– Partner: humans have slower data rates than machines
• Type of I/O - Input/output/both
5 2012/13, Sem I
I/O complexity • I/O has a wide variation
– frequency: <1Hz to GHz,
– Voltage: µV to KV
– Speed : < 1bit/sec to GB/sec
– Size of data transfer: 1 bit to Gbit
– Different format: async serial vs sync serial
• Therefore, I/O not connected directly to CPU and memory
– Use I/O module to manage data transfer. It contains logic for performing a communication function between peripherals and the bus.
6 2012/13, Sem I
I/O module connection
7 2012/13, Sem I
Programming Interface of I/O
• How are device registers identified?
– Memory-mapped I/O vs. Isolated I/O
• How is timing of transfer managed?
– Asynchronous vs. Synchronous
• Who controls transfer?
– Through CPU : by CPU (polling) vs. by device (interrupts)
– Direct to/from memory : by device - Direct Memory Access (DMA)
– Channel I/O : by device – independent operation
10 2012/13, Sem I
Transfer Control #1: Programmed I/O (Polling)
“Who determines when the next data transfer occurs?”
Polling - “Are we there yet? Are we there yet? Are we there yet?”
• Processor queries I/O device status register • Loops until it gets status it wants (ready for next
command) • Or tries again a little later
• Advantage: • Simple • Less hardware
• Disadvantage • Waste of processor’s time
• Processor much faster than I/O device • Worse for higher bandwidth I/O devices (e.g., disks)
14 2012/13, Sem I
Polling Example (1)
A hard-disk has a data rate of 16 MB/s and the data is transferred in block with size of 32 Bytes.
• How many times per second does the CPU need to poll the disk so that no data is lost?
• If the CPU speed is 400MHz, and if polls require 200 cycles, what portion of CPU time is spent polling?
Polling Example (2) • Analysis:
– CPU has to poll regardless of whether drive is actually transferring data or not!!
– Data is transferred at 16MB/second, in 32-Bytes blocks.
– Therefore number of polls required/second is 16MB/32 which is equal to 500,000 polls/sec.
– Each poll takes 200 cycles, so total number of cycles for polling is 500,000 x 200 = 100,000,000 cycles!
– Proportion of CPU time spent = 100x106 (cycles) / 400x106 (cycles/sec) x 100% which is equal to 25%
– So 25% of the time is just spent for polling data. Inefficient.
Processor overhead for polling = no. of polling/sec * (no. of cycles needed/CPU speed)
= (16MB/32B) * (200 / 400MHz)
= 0.25 0.25 * 100% = 25%
Transfer Control #2 : Interrupt (1)
“Wake me when we get there.” • Device sends a special signal to CPU when new data arrives
OR device ready for next data
CPU can be performing other tasks instead of polling device
• I/O interrupts are asynchronous
Not associated with any one instruction
Don’t need to be handled immediately
• I/O interrupts are prioritized
E.g High-bandwidth I/O devices have higher priority than low bandwidth
17 2012/13, Sem I
Interrupt-driven I/O (2) • Advantage Relieves the processor from having to continuously poll
for an I/O event
user program progress is only suspended during the actual transfer of I/O data to/from user memory space
• Disadvantage special hardware (overhead) is needed to:
• initiate an interrupt (I/O device)
• detect and recognize an interrupt
• save the necessary information to resume normal processing after servicing the interrupt (processor) -- context switching
18 2012/13, Sem I
Interrupt Example (1)
Suppose we have the same disk arrangement as before, but this time interrupts are used instead of polling.
• Find the fraction of processor time taken to process an interrupt, given that it takes 400 cycles to process an interrupt and that the disk sends data 10% of the time.
2012/13, Sem I 19
Interrupt Example (2) • Analysis:
– Each time the disk wants to transfer a 32 Bytes block, it will interrupt the processor. 16MB/32B = 500,000 interrupts/sec
– Number of cycles per interrupt = 400
– Therefore number of cycles per second to service interrupt is 400 x 500,000 = 200,000,000 or 200x10^6 cycles
– Percentage of CPU time spent processing interrupts per second is now (200x10^6) cycles/(400x10^6) cycles/sec = 50%
– Worse than before!!
– BUT interrupts occur only when the drive actually has data to transfer! This happens only 10% of the time!
– Therefore actual percentage of CPU time used per second is 10% of 50% = 5%. Much better than 25% polling time.
Processor overhead for interrupt = rate interrupt happening * no. of transfer/sec * (no. of cycles needed/CPU speed)
= (10/100) * (16MB/32B) * (400 / 400MHz)
= 0.05 0.05 * 100% = 5%
Transfer Control #3: Direct Memory Access (DMA)
• Interrupts remove overhead of polling …
But still, transfer data one word at a time through a CPU
• OK for low bandwidth I/O devices: mouse, microphones, etc.
• Bad for high bandwidth I/O devices: disks, monitors, etc.
21 2012/13, Sem I
Direct Memory Access (DMA) I/O controller has the ability to transfer data
directly to/from the memory without involving the processor
Transfers entire blocks (eg: pages, video frames) at a time • Can use bus “burst” transfer mode if available
Only interrupts processor when done (or if error occurs)
2012/13, Sem I 22
Interrupt & DMA Example
• A CPU has a clock rate (frequency) of 500MHz. Its interrupt handler requires 400 cycles and a total of 100 cycles for data transfer. The data rate is 4 MB/s and the percentage of I/O transfer is 50% of the total CPU time. For driven I/O interrupt, data is transfer in block with size of 16 B. On the other hand, direct memory access (DMA) setup takes 1600 cycles with 16KByte page transfer and one interrupt per page.
• Calculate the processor overhead during I/O operation for both I/O interrupt and DMA by filling in the table below with the correct value. Please show all related formula and workings.
2012/13, Sem I 23
DMA Processor only gets involved once per page, not once per 16 B
2012/13, Sem I 24
• Processor overhead for interrupt = rate interrupt happening * no. of transfer/sec * (no. of cycles needed/CPU speed)
= (50/100) * (4MB/16B) * ([400+100] / 500MHz)
= 0.125 0.125 * 100% = 12.5%
• Processor overhead for DMA = rate transfer happening * no. of transfer page /sec * (no. of cycles needed/CPU
speed)
= (50/100) * (4MB/16KB) * ([1600+400] / 500MHz)
= 0.0005 0.0005 * 100% = 0.05%
Parameters
• 500 MHz CPU
• Interrupt handler takes 400 cycles
• Data transfer takes 100 cycles
• 4 MB/s, 16 B interface disk transfers data 50% of time
• DMA setup takes 1600 cycles, transfer one 16KB page at a time
Interrupt handling on DMA completion
The interrupt and data transfer cycle
Transfer Control #4 :Channel I/O
• This is a special DMA similar to the CPU
Has its own fetch/execute cycle
Runs a channel program out of memory just like the CPU runs programs
• The channel program can perform more complicated I/O than simple block transfer
• Improves speed
Takes load off CPU
Dedicated processor is faster
27 2012/13, Sem I
2012/13, Sem I 28
Transfer Control: Comparison (1)
IOP – I/O program 29 2012/13, Sem I
Transfer Control: Comparison (2)
30 2012/13, Sem I
Bus • There are two main ingredients to I/O systems.
Devices like hard drives, keyboard, etc
Buses that connect devices to each other and the processor
31 2012/13, Sem I
Can be summarize as:
• Data sharing - able to transfer data between the computer peripherals connected to it.
• Addressing - A bus has address lines, which match those of the processor. This allows data to be sent to or from specific memory locations.
• Power - A bus supplies power to various peripherals that are connected to it.
• Timing - The bus provides a system clock signal to synchronize the peripherals attached to it with the rest of the system.
Function of Buses
32 2012/13, Sem I
Computer buses • Every computer has several small “networks” inside, called
buses, to connect processors, memory, and I/O devices.
• The simplest kind of bus is linear, as shown below.
All devices share the same bus.
Only one device at a time may transfer data on the bus.
• Simple is not always good!
With many devices, there might be a lot of contention.
The distance from one end of the bus to the other may also be relatively long, increasing latencies.
33 2012/13, Sem I
Hierarchical buses • We could split the bus into different segments.
• Faster devices: i.e. CPU and memory need to communicate often, a shorter and faster processor-memory bus can be dedicated to them.
• Slower devices: A separate I/O bus would connect the slower
devices to each other, and eventually to the processor.
34 2012/13, Sem I
What is the bus anyway? • A bus is just a bunch of wires which transmits three kinds of
information.
Control signals specify commands like “read” or “write.”
The location on the device to read or write is the address.
actual data being transferred
• Some buses include separate control, address and data lines, so all of this information can be sent in one clock cycle.
36 2012/13, Sem I
Bus Type Bus Width Bus Speed MB/sec
ISA 16 bits 8 MHz 16 MBps
EISA 32 bits 8 MHz 32 MBps
VL-bus 32 bits 25 MHz 100 MBps
VL-bus 32 bits 33 MHz 132 MBps
PCI 32 bits 33 MHz 132 MBps
PCI 64 bits 33 MHz 264 MBps
PCI 64 bits 66 MHz 512 MBps
PCI 64 bits 133 MHz 1 GBps
38 2012/13, Sem I
Bus Type and Its Characteristics
Advantages of Buses
• Versatility:
New devices can be added easily
Peripherals can be moved between computer systems that use the same bus standard
• Low Cost:
A single set of wires is shared in multiple ways
• Manage complexity by partitioning the design
by using a hierarchical bus ease the process of design
41 2012/13, Sem I
Disadvantages of Buses
• It creates a communication bottleneck
The bandwidth of that bus can limit the maximum I/O throughput
• The maximum bus speed is largely limited by:
The length of the bus
The number of devices on the bus
The need to support a range of devices with:
• Widely varying latencies
• Widely varying data transfer rates
42 2012/13, Sem I
Buses in modern PCs
43 2012/13, Sem I
Please look it up and
read.
Peripheral Component Interconnect (PCI)
• PCI is a synchronous 32-bit bus running at 33MHz, although it can be extended to 64 bits and 66MHz.
• The maximum bandwidth is about 132 MB/s.
Bandwidth = 33 million transfers/sec x 4 B/transfer = 132MB/s
• Cards in the motherboard PCI slots plug directly into the PCI bus.
• Devices made for older and slower ISA bus standard are connected via a “south bridge” controller chip, in a hierarchical manner.
44 2012/13, Sem I
External buses
• External buses are provided to support the frequent plugging and unplugging of devices
As a result their designs significantly differ from internal buses
• Two modern external buses, Universal Serial Bus (USB) & FireWire (IEEE 1394), have the following (desirable) characteristics:
Plug-and-play standards allow devices to be configured with software, instead of flipping switches or setting jumpers.
Hot plugging means that you don’t have to turn off a machine to add or remove a peripheral.
The cable transmits power! No more power cables or extension cords.
Serial links are used, so the cable and connectors are small. 45 2012/13, Sem I