CS1104 2001/02 Semester II Help Session I/O Colin Tan, S15-04-05, Ctank@comp.nus.edu.sg

I/O

• I/O devices are the interface between humans and their computers, computers and the rest of the world, and computers and auxiliary devices.

– E.g. keyboards allow you to input text instead of having to speak in binary.

– Disk drives allow you to store large amounts of data over long periods.

• Unfortunately I/O devices are slow.

– The CPU can transfer millions of bytes of data per second.

– I/O devices can transfer only thousands of bytes of data per second.

– The CPU might over-run an I/O device when writing, or under-run it when reading.

Solution - Polling

• Simple solution:

– The device has a Control Register which tells the CPU if the device is ready to send or receive data, and a Data Register which the CPU uses to read data or write data.

• The CPU goes into a loop reading the Control Register. When the Control Register says “READY”, the CPU will either write to the Data Register or read from it (a rough sketch of such a loop follows).
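
As a rough illustration (not from the original slides), a polling loop in C might look like the sketch below. The register addresses, the READY bit position and the read-one-word interface are all assumptions made up for illustration.

    #include <stdint.h>

    /* Hypothetical memory-mapped device registers (addresses are illustrative only) */
    #define CTRL_REG  ((volatile uint32_t *) 0xFFFF0000)
    #define DATA_REG  ((volatile uint32_t *) 0xFFFF0004)
    #define READY_BIT 0x1u

    /* Spin until the device reports READY, then read one word from the Data Register. */
    uint32_t poll_and_read(void)
    {
        while ((*CTRL_REG & READY_BIT) == 0) {
            /* busy-wait: the CPU does no useful work here */
        }
        return *DATA_REG;
    }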

Polling Example

• Suppose we have a hard-disk with a throughput of 4 MB/s. The disk transfers data in 4-word chunks. The drive actually transfers data only 5% of the time.

• How many times per second does the CPU need to poll the disk so that no data is lost? If the CPU speed is 500MHz, and if polls require 400 cycles, what portion of CPU time is spent polling?

Polling Example

• Analysis:

– The CPU has to poll regardless of whether the drive is actually transferring data or not!!

– Data is transferred at 4 MB/s, in 4-word (i.e. 16-byte) chunks.

– Therefore the number of polls required per second is 4 MB / 16 bytes, which is equal to 250,000 polls.

– Each poll takes 400 cycles, so the total number of cycles spent polling is 250,000 x 400 = 100,000,000 cycles!

– Proportion of CPU time spent = 100x10^6 / 500x10^6, which is equal to 20%.

– So 20% of CPU time is spent just polling the disk, even though the disk actually transfers data only 5% of the time. Inefficient.
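
A small C sketch (not part of the original slides) that simply re-does the arithmetic above:

    #include <stdio.h>

    int main(void)
    {
        double disk_rate   = 4e6;    /* 4 MB/s */
        double chunk_bytes = 16;     /* 4 words = 16 bytes per poll */
        double poll_cycles = 400;    /* cycles per poll */
        double cpu_rate    = 500e6;  /* 500 MHz */

        double polls_per_sec  = disk_rate / chunk_bytes;      /* 250,000 */
        double cycles_per_sec = polls_per_sec * poll_cycles;  /* 100,000,000 */

        printf("CPU time spent polling: %.0f%%\n", 100 * cycles_per_sec / cpu_rate);
        return 0;
    }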

Interrupts

• Alternative:

– The CPU makes a request, then does other things.

– When the I/O device is done, it will inform the CPU via an interrupt.

– Alternatively, the CPU might be minding its own business, and the I/O device interrupts the CPU when it needs attention.

Interrupt Example

• Suppose we have the same disk arrangement as before, but this time interrupts are used instead of polling. Find the fraction of processor time taken to process interrupts, given that it takes 500 cycles to process an interrupt and that the disk sends data 5% of the time.

Interrupt Example

• Analysis:

– Each time the disk wants to transfer a 4-word (i.e. 16 byte) block, it will interrupt the processor. Number of interrupts per second would be 4MB/16 = 250,000 interrupts per second.

– Number of cycles per interrupt = 500

– Therefore number of cycles per second to service interrupt is 500 x 250,000 = 125,000,000 or 125x10^6.

– Percentage of CPU time spent processing interrupts per second is now (125x10^6)/(500x10^6) = 25%

– Worse than before!!

– BUT interrupts occur only when the drive actually has data to transfer!

• This happens only 5% of the time!

– Therefore actual percentage of CPU time used per second is 5% of 25% = 1.25%.
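
As with the polling example, here is a small C sketch (not from the original slides) that checks the numbers, applying the 5% busy fraction at the end:

    #include <stdio.h>

    int main(void)
    {
        double disk_rate        = 4e6;                      /* 4 MB/s, 16-byte chunks */
        double interrupts_per_s = disk_rate / 16;           /* 250,000 interrupts/s */
        double cycles_per_s     = interrupts_per_s * 500;   /* 500 cycles per interrupt */
        double cpu_rate         = 500e6;                    /* 500 MHz */
        double busy_fraction    = 0.05;                     /* disk transfers only 5% of the time */

        double cpu_share = cycles_per_s / cpu_rate;         /* 25% if the disk were always busy */

        printf("If always busy: %.0f%%, actual: %.2f%%\n",
               100 * cpu_share, 100 * cpu_share * busy_fraction);
        return 0;
    }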

Difference Between Polling and Interrupts

• Polling basically works like this:

– You are expecting a phone call.

– Your phone does not have a ringer.

– You spend the entire day randomly picking up the phone to see if the other person is on the other end of the line.

– If he isn’t, you put the phone back down and try again later.

– If he is, you start talking.

• You can waste a lot of time doing this!

• Interrupts are like having a telephone with a ringer:

– You pick the phone up and talk only when it rings.

– More efficient!

Polling vs. Interrupts - The Conclusion

• Numerical examples show that polling is very expensive.

– The CPU has no way of knowing whether a device needs attention, and so has to keep polling repeatedly so as not to miss data.

– In our example, even if the drive is idle 95% of the time, the CPU still has to poll.

• Interrupts allow the CPU to do useful work during this 95% of the time.

– So even though processing an interrupt takes longer (500 cycles vs. 400 cycles for polling), in the end a smaller portion of CPU time is used to service the device (1.25% vs. 20%).

– Note that interrupt processing is much more complex than polling.

– Need vector tables, cause registers, handlers, etc.

Devices We Will Be Looking At

• Disk Drives

– Data is organized into fixed-size blocks.

– Blocks are organized in concentric circles called tracks. The tracks are created on both sides of metallic disks called “platters”.

– The corresponding tracks on each side of each platter form a cylinder (e.g. track 0 on side 0 of platter 0, track 0 on side 1 of platter 0, track 0 on side 0 of platter 1, etc.)

• Latencies are involved in finding the data and reading or writing it.

Devices We Will Be Looking At

• Network Devices

– Transmit data between computers.

– Data is often organized into blocks called “packets”, or sometimes into frames.

– Latencies are involved in sending/receiving data over a network.

Disk Drives

• Latencies involved in accessing drives:

– Controller Overhead: To read or write a drive, a request must be made to the drive controller. The drive controller may take some time to respond. This delay is called the “controller overhead”, and is usually ignored.

• Controller overhead time also consists of delays introduced by controller circuitry in transferring data.

– Head-Selection Time: Each side of each platter has a head. To read the disk, first select which side of which platter to read/write by activating its head and de-activating the other heads. Normally this time is ignored.

Disk Drives

• Latencies involved in accessing drives:

– Seek Time: Once the correct head has been selected, it must be moved over the correct track. The time taken to do this is called the “seek time”, and is usually between 8 and 20 ms (NOT NEGLIGIBLE!)

– Rotational Latency: Even when the head is over the correct track, it must wait for the block it wants to read to rotate by.

• The average rotational latency is T/2, where T is the period of the rotation in seconds (T = 60/R if the rotation speed R is specified in RPM, or 1/R if R is specified in RPS).
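
• For instance, at 7200 RPM (the speed used in the examples below), T = 60/7200 s ≈ 8.33 ms, so the average rotational latency is roughly 4.17 ms.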

Disk Drives

• Latencies involved in accessing drives:

– Transfer Time: This is the time taken to actually read the data. If the throughput of the drive is given as X MB/s and we want to read Y bytes of data, then the transfer time is given by:

Y / (X * 10^6) seconds
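
• For instance, reading one 16 KB block (16,384 bytes) from a 10 MB/s drive, as in the example that follows, takes roughly 16,384 / (10 * 10^6) ≈ 1.6 ms.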

Example

• A program is written to access 3 blocks of data (the blocks are not contiguous and may exist anywhere on the disk) from a disk with a rotation speed of 7200 rpm, a 12 ms seek time, a throughput of 10 MB/s and a block size of 16 KB. Compute the worst-case timing for accessing the 3 blocks.

Example

• Analysis:

– Each block can be anywhere on the disk.

• In the worst case, we must incur seek, rotational and transfer delays for every block.

– What is the timing for each delay?

• Controller overhead - negligible (since not given)

• Head-selection time - negligible (since not given)

• Seek time

• Rotational latency

• Transfer time

– How many times is each of these delays incurred? (A worked sketch follows.)
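
A minimal C sketch of one way to combine these delays (not from the original slides). It assumes the worst case charges a full rotation per block (the average case would use half a rotation) and treats 16 KB as 16,384 bytes:

    #include <stdio.h>

    int main(void)
    {
        double seek_ms     = 12.0;                   /* seek time per block */
        double rotation_ms = 60.0 / 7200 * 1000;     /* one full rotation at 7200 RPM ~ 8.33 ms */
        double transfer_ms = 16384.0 / 10e6 * 1000;  /* 16 KB block at 10 MB/s ~ 1.64 ms */

        double per_block_ms = seek_ms + rotation_ms + transfer_ms;
        double total_ms     = 3 * per_block_ms;      /* three non-contiguous blocks */

        printf("Per block: %.2f ms, total for 3 blocks: %.2f ms\n", per_block_ms, total_ms);
        return 0;
    }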

Example

• A disk drive has a rotational speed of 7200 rpm. Each block is 16 KB, and there are 16 blocks per track. There are 22 platters with 25 tracks each. The average seek time is 12 ms.

– What is the capacity of this disk?

– How long does it take to read 1 block of data?

Example

• Analysis:

– Size:

• How many sides are there? How many tracks per side? How many blocks per track? How big is each block?

– Time to read 1 block:

• Throughput is not given. How to work it out? (One possible approach is sketched below.)
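
A hedged C sketch of one way to answer both questions (not from the original slides). It assumes both sides of every platter are used, that “25 tracks each” means 25 tracks per recording surface, that 16 KB = 16,384 bytes, and that a full track of 16 blocks passes under the head in one rotation:

    #include <stdio.h>

    int main(void)
    {
        int platters         = 22;
        int sides            = platters * 2;    /* assume both sides of each platter are used */
        int tracks_per_side  = 25;
        int blocks_per_track = 16;
        double block_kb      = 16.0;             /* 16 KB per block */

        double capacity_kb = (double) sides * tracks_per_side * blocks_per_track * block_kb;
        printf("Capacity: %.0f KB (about %.1f MB)\n", capacity_kb, capacity_kb / 1024);

        /* One full track (16 blocks) passes the head in one rotation at 7200 RPM,
           which gives an implied throughput and hence the pure transfer time per block. */
        double rotation_ms   = 60.0 / 7200 * 1000;              /* ~8.33 ms */
        double block_read_ms = rotation_ms / blocks_per_track;  /* ~0.52 ms of transfer */

        /* Including the average positioning delays (12 ms seek + half a rotation) */
        double total_ms = 12.0 + rotation_ms / 2 + block_read_ms;
        printf("Transfer only: %.2f ms; with seek + average rotational latency: %.2f ms\n",
               block_read_ms, total_ms);
        return 0;
    }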

Network Devices

• Latencies Involved:

– Interconnect Time: This is the time taken for 2 stations to “hand-shake” and establish a communications session.

– Hardware Latencies: There is some latency in gaining access to a medium (e.g. In Ethernet the Network Interface Card (NIC) must wait for the Ethernet cable to be free of other activity) and in reading/writing to the medium.

– Software Latencies: Network access often requires multiple buffer copying operations, leading to delays.

Network Devices

• Latencies Involved:

– Propagation Delays: For very large networks stretching thousands of miles, signals do not reach their destination immediately, and take some time to travel in the wire. More details in CS2105.

– Switching Delays: Large networks often have intermediate switches to receive and re-transmit data (to restore signal integrity, for routing etc.). These switches introduce delays too. More details in CS2105.

Network Devices

• Latencies Involved:

– Data Transfer Time: Time taken to actually transfer the data. If we wish to transfer Y bytes of data over a network link with a throughput of X MBPS (megabytes per second), the data transfer time is given by:

(Y bytes) / (X * 10^6) seconds

– Aside from the Data Transfer Time (where real useful work is actually being done), all of the other latencies do not accomplish anything useful (but are still necessary), and these are termed “overheads”.

Network Devices

• Note that if the overheads are much larger than the data transfer time, it is possible for a slow network with low overheads to perform better than a fast network with high overheads.

– E.g. page 654 of Patterson & Hennessy.

Example

• A communications program was written and profiled, and it was found that it takes 40ns to copy data to and from the network. It was also found that it takes 100ns to establish a connection, and that effective throughput was 5 MBPS. Compute how long it takes to send a 32KB block of data over the network.

Example

• Analysis:

– What are the overheads? What is the data transfer time?

– Overheads: 100 ns to establish the connection, 40 ns to copy data to the network, and 40 ns to copy data from the network. Total = 180 ns.

– Data transfer time:

– 32 KB = 2^15 = 32,768 bytes (for simplicity we can also assume that 32 KB = 32,000 bytes; the relative error is 768/32,768 ≈ 2.3%).

– 5 MBPS = 5 x 10^6 bytes/second.

– Transfer time = 2^15 / (5 x 10^6) seconds ≈ 6.55 ms.

– Total time taken = 180 ns + transfer time.
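
A minimal C sketch of this calculation (not from the original slides; it includes the 100 ns connection set-up in the overhead):

    #include <stdio.h>

    int main(void)
    {
        double overhead_ns = 100.0 + 40.0 + 40.0;  /* connection set-up + copy to/from network */
        double bytes       = 32768.0;              /* 32 KB = 2^15 bytes */
        double throughput  = 5e6;                  /* 5 MBPS = 5 x 10^6 bytes/s */

        double transfer_s = bytes / throughput;               /* ~6.55 ms */
        double total_s    = overhead_ns * 1e-9 + transfer_s;

        printf("Transfer time: %.4f ms, total: %.4f ms\n", transfer_s * 1e3, total_s * 1e3);
        return 0;
    }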

Conclusions

• I/O is essential to allow a computer to interact with the world and remember things.

• I/O is slower than the CPU and coordination is required:

– Polling

– Interrupts

• Polling is simple but inefficient; interrupts are efficient but complex.

– Use polling when only a little data needs to be transferred, and only infrequently.

– Use interrupts when large amounts of data need to be transferred, or need to be transferred frequently.

Conclusions

• Disk drives incur delays through their controller electronics, having to move the drive arm over to the correct track, and having to wait for the correct block to roll by.

• Networks incur delays in the interface electronics, over the wires, and in actually transferring the data.
