
Interrupt Processing in Concurrent Processorsarch2.000webhostapp.com/appunti/AII_APP_Interrupts.pdf · 2019-04-26 · 1 Interrupt Processing in Concurrent Processors Wade Walker and


Interrupt Processing in Concurrent Processors

Wade Walker and Professor Harvey G. Cragon
The University of Texas at Austin

1 Introduction

As modern processors grow ever more complex, implementing interrupt processing strategies for them grows more complex as well. Today there is a confusing plethora of interrupt processing strategies with few unifying concepts to make learning about them or choosing between them easier.

In this paper we discuss interrupt processing in concurrent processors and present a taxonomy of interrupt processing strategies suitable for them. Concurrent processors are merely processors such as the superscalar PowerPC 601 or the superpipelined MIPS R4400 which can process more than one instruction at a time.

The taxonomy of interrupt processing strategies has several possible uses. One is to help processor designers systematically explore their options when designing an interrupt processing system, or to help researchers compare interrupt processing strategies and determine their similarities and differences. However, for those readers who don’t fit into the two previous categories this taxonomy can serve as an overview of the way interrupts can be implemented, with references provided in case the reader wishes to gain a more detailed understanding of a particular interrupt processing strategy.

2 What is an interrupt?

Originally, interrupts were used mainly in I/O processing to eliminate the need for software polling [Meyers 82]. Later, interrupts subsumed the function of traps (also known as internal [Kuck 78] or program [Baer 80] interrupts). Still later, the terminology was unified and all events other than branches which change the normal flow of program execution were classed as interrupts [Hennessy and Patterson 89]. This definition is simpler than most manufacturers’ practice of naming several different categories of interrupts depending on their cause, so we’ll use it throughout the paper, though we will refer to some specific types of interrupts by their common names such as I/O interrupts, page faults, and memory protection violations.


When an interrupt occurs, the processor must stop the currently executing process to deal with the interrupt. First, unless the interrupted process will not resume (such as the case of a memory protection violation) some state information about the interrupted process is saved. Then the interrupt is processed, the saved process state is restored, and the interrupted process resumes. All of these phases of interrupt processing will be described in more detail later, but this broad outline is sufficient for now, and is illustrated in Figure 2.0.

Figure 2.0: Typical interrupt processing

The fundamental piece of information about a process which must be saved in order for it to resume is its program counter. In practice, much more information may be saved, but the program counter is of prime importance in most processors because it defines exactly which instructions of the interrupted process have and have not completed. And this is where precise interrupts enter the picture.

2.1 What are precise interrupts?

Interrupts are precise if the following three conditions hold [Smith and Pleszkun 85] (slightly modified):

1 All instructions before the one indicated by the saved program counter have finished execution and have modified the process state correctly.

2 All instructions after the one indicated by the saved program counter are unexecuted and have not modified the process state.

3 If the interrupt was caused by an instruction, the saved program counter points to that instruction, called the interrupting instruction. This instruction must either be completely executed or completely unexecuted.


A pictorial example may help to clarify these three conditions. Our example pipeline in Figure 2.1 has four stages, all of which are currently processing an instruction. Instruction 1 causes an interrupt.

Figure 2.1: Pipeline example of precise interrupt conditions

Since the program counter points to Instruction 1, Smith and Pleszkun’s three conditions require that:

1 Instruction 0 and all instructions before it must have finished execution and modified the process state correctly

2 Instructions 2, 3, and all instructions after them must be completely unexecuted and must not have modified the process state

3 Instruction 1 must be either completely executed or completely unexecuted

If an interrupt is precise, the process state seen just before interrupt processing begins is described as serially correct, because the process state is just as if the program had been executed one instruction at a time in serial order, finishing each instruction before starting the next.
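The three conditions can also be read as a simple check on a snapshot of the pipeline. The following is a minimal sketch, not taken from [Smith and Pleszkun 85]; the `Instr` record, its two flags, and the function name `is_precise` are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    index: int            # position in program order
    completed: bool       # finished and made all of its state changes
    modified_state: bool  # has changed any process state so far


def is_precise(instrs, saved_pc):
    """Return True if the snapshot satisfies the three conditions
    for the instruction indicated by saved_pc (hypothetical model)."""
    for i in instrs:
        if i.index < saved_pc and not i.completed:
            return False  # condition 1: earlier instructions must be finished
        if i.index > saved_pc and i.modified_state:
            return False  # condition 2: later instructions must leave no trace
        if i.index == saved_pc and i.modified_state and not i.completed:
            return False  # condition 3: all or nothing for this instruction
    return True


# The situation of Figure 2.1: Instruction 1 interrupts, Instruction 0 is done.
pipeline = [Instr(0, True, True), Instr(1, False, False),
            Instr(2, False, False), Instr(3, False, False)]
print(is_precise(pipeline, saved_pc=1))
```

If Instruction 2 had already modified the process state in this snapshot, the check would fail on the second condition.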

2.2 Why would interrupts need to be precise?

Theoretically speaking, it is never absolutely necessary to have precise interrupts. There are at least three alternatives, each of which has good and bad points. Consider the following thought experiment:

We are designers working on a new superscalar processor. It will allow out-of-order instruction issue and completion and is intended for use in a multitasking, virtual memory environment.


The first type of interrupt we want to implement is the page fault, since we’re planning on using it to implement virtual memory. When a page fault occurs, the current process must be suspended, the necessary page must be loaded, and then the process must resume as if the page fault had not occurred. This scenario is illustrated in Figure 2.2.

Figure 2.2: Page fault interrupt processing

The usual strategy to ensure that the original process will be able to resume successfully is to make the page fault a precise interrupt with the faulting instruction completely unexecuted. That way, resuming the original process after the memory page is loaded consists merely of resuming instruction fetching starting with the faulting instruction.

2.3 Alternatives to precise interrupts

There are at least two more options, however, which do not require precise interrupts. The first one is to save the entire process state when a page fault occurs and restore the entire state after the new page is loaded. However, the entire process state is usually quite large in a superscalar processor. It includes all memory elements in the processor, even ones which are not visible to the programmer, like implemented registers and the latches between pipeline stages. Though this method would have the same effect as implementing precise interrupts, designers usually don’t choose this option because of the large amount of process state which must be saved and restored. However, if all that process state could be saved and restored in special on-chip memory areas, this might not be too difficult.

The second method is to halt the main processor while an auxiliary processor loads the new page into memory. This auxiliary processor performs all the tasks necessary to process the page fault, then restarts the main processor when the new page is safely in memory. This method is logically equivalent to the one above, except that instead of saving the process state explicitly, the process state is preserved in situ. This method suffers from the disadvantage of requiring an auxiliary processor to handle the page faults, which may be costly and difficult to design and implement. But now that greater integration is resulting in spiraling transistor budgets for processors, it might be possible to take an old, already implemented processor and use it as the auxiliary processor with not too much effort. This concept is not a new one; the idea of using an auxiliary processor to handle interrupts dates back at least to [Keller 75].

So it can be seen that there are three main classes of solution to the problem of resuming a process after an interrupt has been processed. The first and most common way, which implements precise interrupts, consists of serializing the process state before interrupt processing and restoring it afterwards. The second way consists of saving the entire process state before interrupt processing and restoring it afterwards. And the third way consists of allowing an auxiliary processor to handle interrupts so that the state of the main processor may be preserved in situ. Each of these three solutions will be discussed further in the taxonomy below.

3 Definitions

An internal interrupt is one which is caused by an instruction; an external interrupt is one which was caused by something external to the processor.

An interrupting instruction is the instruction which caused an interrupt (if the interrupt is internal). There is no interrupting instruction for external interrupts.

An interrupt handler is a piece of software which is invoked by the processor when an interrupt occurs. It is the interrupt handler’s responsibility to respond to the interrupt if it is recoverable and to relinquish control of the processor when it is finished. In theory a software interrupt handler is not necessary—specialized hardware may perform the necessary actions instead, or the interrupt may be handled by a combination of hardware and software.

An architected register (or more generally, architected resource) is one whose value is accessible to a machine language programmer [Venkatramani 90]. An implemented register (or implemented resource) contains internal processor state which is not visible to a machine language programmer. Examples of implemented resources would be register scoreboards, reorder buffers, latches between pipeline stages, and history buffers.

A processor state (also called an internal state) refers to the values of its registers, both architected and implemented. A process state is all the information relevant to a process. A process state may encompass the processor state (if the process is running) as well as some of the state of the system outside the processor, such as values in caches or main memory. An external state is all the state information in the whole system which is not part of the internal state. External state encompasses cache memory, main memory, and secondary memory, but not processor registers.

One thing to remember about process state is that the interrupt processing hardware we’ll describe below generally affects the processor state only. This is because it is seldom necessary to modify any external state in order to ensure that a process will be able to resume. The only reason we use the broader term process state instead of processor state is that it is theoretically possible to change state external to the processor during interrupt processing. One example of this sort of external state change is a checkpointing strategy which checkpoints main memory periodically and resumes a process from this checkpointed state after interrupt processing.

In the following discussion, when an instruction A is said to be before another instruction B in a pipeline, this means that A was issued before B. Similarly, if an instruction A is said to be after an instruction B, this means that A was issued after B. In this case, we take “issued” to mean, “issued into the first stage of a single pipeline.” In the case of superscalar processors, of course, there is the possibility of more than one instruction issuing at a time. However, if you treat each of the pipelines contained in a superscalar processor independently, the discussions below will still apply.

Serial instruction execution means that state changes are committed in the same order that the instructions entered the processor. A processor with one pipeline which does no instruction reordering executes instructions serially, though they are not executed sequentially. A serially correct state (of a processor or a process) is a state identical to that which would result from serial instruction execution.

Sequential instruction execution means that each instruction is executed completely before the next instruction is started. Note that a concurrent processor can execute instructions sequentially by waiting to issue each instruction until the previous one has completed.
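The difference between serial and sequential execution can be made concrete with two small predicates. This is only an illustrative sketch: the function names and the representation of an execution as commit orders and (start, finish) cycle pairs are assumptions, not notation from the paper.

```python
def is_serial(commit_order):
    """commit_order lists instruction indices (program order numbers)
    in the order in which their state changes were committed."""
    return list(commit_order) == sorted(commit_order)


def is_sequential(intervals):
    """intervals lists (start_cycle, finish_cycle) pairs in program order.
    Sequential means no instruction starts before its predecessor finishes."""
    return all(prev_finish <= start
               for (_, prev_finish), (start, _) in zip(intervals, intervals[1:]))


# A four-stage pipeline with no reordering: instruction i starts in cycle i
# and finishes in cycle i + 4, so instructions overlap in time.
pipelined = [(i, i + 4) for i in range(4)]
print(is_serial([0, 1, 2, 3]), is_sequential(pipelined))
```

Under these assumptions the single in-order pipeline commits serially but does not execute sequentially, which matches the distinction drawn above.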

4.0 If some interrupts are precise, must all of them be?

There are many different ways to implement precise interrupts. Some strategies, such as history buffers, are general enough to make all the interrupts in a processor precise if the designer so desires. Other strategies, such as adding special-purpose implemented registers to a processor, may only make one type of interrupts precise, such as page faults.

Some commercial processors make only a bare minimum of interrupts precise. One example of this strategy is the DEC 21064 (Alpha). The page faults in the Alpha are precise interrupts. However, arithmetic interrupts are imprecise unless the user sends a special instruction to the processor after an arithmetic instruction. This special instruction, TRAPB, prevents any more instructions from issuing until the arithmetic instruction has exited the pipeline, which effectively ensures that the processor state will be serial. Not providing for precise arithmetic interrupts in hardware makes designing and implementing the Alpha easier than if all types of interrupt were precise.

4.1 Location of the saved process state

In any processor which allows interrupts, the interrupt processing system must save enough process state for the original process to resume after an interrupt occurs. The process state is usually saved in two different places: the process state which is not part of the processor state is usually stored by the operating system in main or secondary memory, and the processor state is usually saved directly by the processor. The latency of interrupt processing depends quite heavily on how much process state is saved, and where. We’ll illustrate this by considering two example processors, the Motorola MC68030 and the MC88100. This example assumes that the process states stored by the operating system in either case are about the same size.

The MC68030 serializes its processor state and saves the program counter and other registers on the supervisor stack before running the interrupt handler. After the interrupt handler is run, the registers are then restored from the stack. Though serializing the processor state requires extra time, in this case it is worth it because serialization allows a much smaller state to be saved to the stack than if the processor state was not serialized. And since stack operations are relatively slow, saving a small state can decrease interrupt processing time even after the time taken in processor state serialization is added in.

The MC88100, on the other hand, continuously saves the values of the registers in a set of shadow registers, which will be discussed in more detail later. When an interrupt occurs, the state of the shadow registers is merely “frozen” and not allowed to change while the interrupt handler is run. Then, after the interrupt handler has run, the registers are restored from the values preserved in the shadow registers. Since the processor state is continuously saved inside the processor rather than being saved to the stack when an interrupt occurs, interrupt handling on the MC88100 is more time-efficient than on the MC68030. However, this lower interrupt processing latency is gained at the expense of having many shadow registers on the chip, which take up valuable chip area.

4.2 Processor state serialization

If precise interrupts are being implemented, the processor state must be serially correct before the processor state can be saved and interrupt handling can begin. For the sake of simplicity, this discussion doesn’t consider serializing process state which is external to the processor.

There are two main ways to serialize processor state. The first and easiest way only works on a processor which does not allow out-of-order instruction issue or completion, and in which, if the interrupt is internal, no instruction which was issued after the interrupting instruction has changed the processor state. This way is illustrated in Figure 4.2.0.

Figure 4.2.0: Easy way to serialize process state

If the interrupt in this figure is internal, Instruction 2 is the interrupting instruction (unless Instruction 2 is an instruction designed to cause an interrupt, such as a debugging instruction—in that case, depending on the processor, the saved program counter could point at Instruction 3). If the interrupt is external, the saved program counter could be placed at the discretion of the processor designer so long as it and all the instructions issued after it have not yet changed the processor state.

A second way to serialize processor state is to add some extra interrupt processing hardware into the processor. This way can be made to work on processors regardless of out-of-order issue/completion, and regardless of whether an instruction after an interrupting instruction has caused a processor state change. One case which this second way of serializing processor state was designed to handle is shown below in Figure 4.2.1.


Figure 4.2.1: Case which requires a harder way of serializing processor state

The instructions in this figure have been reordered by the processor. Instructions 4 and 0 have already modified the processor state, and Instructions 1, 3, and 2 have not yet modified the processor state. Instruction 2 has caused a page fault.

In this case, extra interrupt processing hardware is required which can undo the state changes caused by Instruction 4 so that processing can later resume from Instruction 1 (when Instruction 2 is reached again after the new memory page is loaded, a page fault will not occur). The types of extra hardware which may be added are discussed below.

5.0 Types of interrupt processing hardware

Interrupt processing hardware performs one of two functions: saving the state of the processor, or serializing the state of the processor. In this section we’ll describe the most common types of interrupt processing hardware as a preface to the taxonomy in section 6.

5.1 State-saving hardware

Probably the most common piece of state-saving hardware is a simple stack. The program counter and some amount of information are saved on the stack when an interrupt occurs. Usually only the program counter and some registers are saved, since external memory access is relatively slow.

In a shadow register implementation, some or all the normal registers of the processor are given counterparts, known as shadow registers. By “normal registers” we mean architected and implemented registers. Each time the value of a shadowed register is changed, the old value of the register is saved in the shadow register. The values of the normal registers may be restored from the shadow registers when an interrupt occurs in order to serialize the processor state. The values of the shadow registers may not be changed by an instruction until it is assured that the instruction will not cause an interrupt.

One example of a processor which uses shadow registers to save processor state (though they are not specifically named in the literature) is the Intel 80486. For segmentation and paging interrupts, some of the processor registers’ values are restored to what they were before the interrupting instruction began, presumably through the use of a few dedicated shadow registers which are used for these two types of interrupt only. Shadow registers are also used in the MC88100.

Shadow stacks are the same as shadow registers, above, except that each shadow register is replaced by a shadow stack. A shadow stack is merely a stack, of which each element serves the same purpose as an individual shadow register. Each time the value of a shadowed register is changed, the old value is pushed onto the shadow stack. The values of the normal registers may be restored from the shadow stack when an interrupt occurs. The advantage of the shadow stack mechanism over shadow registers is that, since more than one past value of a register may be saved in a shadow stack, nested interrupts may be accommodated with a shadow stack mechanism.
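The relationship between shadow registers and shadow stacks can be sketched in a few lines. This is a software model under simplifying assumptions, not a description of any of the processors named above; the class name `ShadowedRegister` and its `depth` parameter are hypothetical. A depth of 1 behaves like a single shadow register, while an unbounded depth behaves like a shadow stack.

```python
from collections import deque


class ShadowedRegister:
    """One register plus storage for its old values (illustrative model)."""

    def __init__(self, value=0, depth=None):
        self.value = value
        # depth=1 models a single shadow register; depth=None a shadow stack.
        self._old = deque(maxlen=depth)

    def write(self, new_value):
        # Save the old value before it is overwritten.
        self._old.append(self.value)
        self.value = new_value

    def restore(self):
        # Bring back the most recently saved old value.
        self.value = self._old.pop()


reg = ShadowedRegister(0)  # shadow stack: keeps every old value
reg.write(1)
reg.write(2)
reg.restore()              # back to 1
reg.restore()              # back to 0, as needed for a nested interrupt
print(reg.value)
```

With `depth=1` the second `restore()` would fail because only one old value is retained, which is the limitation that motivates the shadow stack.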

One implementation which we are classifying as a shadow stack is called a program counter stack mechanism [Grohoski 90]. Shadow stacks are provided for three processor registers whose values may have to be restored for certain types of interrupts.

Checkpointing can be used to save external state changes for the purposes of interrupt processing (in contrast to the above implementations, which only work for processor state changes). Checkpointing usually either saves the entire external state, or all the changes made to the external state over a certain time period. Since the amount of external state can be quite large, checkpointing usually saves the state either to primary or secondary memory. Readers who are interested in checkpointing in out-of-order execution processors can find a detailed exposition in [Hwu and Patt 87].

The last type of state-saving hardware we’ll discuss is an auxiliary processor, which is simply a second processor which processes the interrupt while the main processor waits (in the case of an internal interrupt) or possibly continues processing (in the case of an external interrupt). We classify this as state-saving hardware since the state of the main processor is saved in situ while the auxiliary processor is processing the interrupt.

5.2 State-serializing hardware


The best type of state-serializing hardware is a clever architecture. A clever architecture is just one which is designed so that instructions are completed serially and don’t cause any processor state changes that would need to be undone if an interrupt occurs. This implementation was illustrated in Figure 4.2.0. Many processors, such as the Sparc, MC88000, DEC 21064, and AMD 29000, realize a clever architecture by making the last stage of the pipeline the only one which can cause processor state changes.

In a result shift register implementation, instruction issue is forestalled, if necessary, to make sure that processor state changes occur in serial order. And since the results of an instruction are not committed until it is certain that the instruction didn’t cause an interrupt, there are never any processor state changes to undo. The result shift register is described in greater detail in [Smith and Pleszkun 85].

In a reorder buffer implementation, a special memory area called the reorder buffer is used to store the results of instructions. Instructions are allowed to issue and/or complete out of order, but the results are committed in order by the reorder buffer as they become available and only after it is certain that the instruction hasn’t caused an interrupt. The reorder buffer can be viewed as a generalization of the result shift register. The reorder buffer is described in greater detail in [Smith and Pleszkun 85].
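The in-order commit behavior of a reorder buffer can be illustrated with a small software model. This is a sketch under simplifying assumptions (one destination register per instruction, no operand bypassing), not the design in [Smith and Pleszkun 85]; the class and method names are hypothetical.

```python
class ReorderBuffer:
    """Results may arrive out of order but are committed in issue order."""

    def __init__(self):
        self.entries = []  # pending instructions, oldest first
        self.regs = {}     # architected register values

    def issue(self, dest):
        entry = {"dest": dest, "value": None, "done": False, "fault": False}
        self.entries.append(entry)
        return entry

    def complete(self, entry, value=None, fault=False):
        # Record a result (or an interrupt) without touching the registers.
        entry.update(value=value, done=True, fault=fault)

    def commit(self):
        """Commit finished entries from the head; stop at the first
        unfinished one. On a fault, discard everything still pending
        and return the faulting entry."""
        while self.entries and self.entries[0]["done"]:
            head = self.entries.pop(0)
            if head["fault"]:
                self.entries.clear()
                return head
            self.regs[head["dest"]] = head["value"]
        return None


rob = ReorderBuffer()
a, b, c = rob.issue("r1"), rob.issue("r2"), rob.issue("r3")
rob.complete(c, 30)           # finishes first, but cannot commit yet
rob.complete(a, 10)
rob.commit()                  # commits only a; b is still pending
rob.complete(b, fault=True)   # b causes an interrupt
faulting = rob.commit()       # c's result is thrown away
print(rob.regs, faulting["dest"])
```

Because results reach the registers only at commit time, there is never any register state to undo when the interrupt is recognized.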

A register update unit is a piece of hardware which allows a serial processor state to be maintained in a way similar to a reorder buffer. It also resolves dependencies using an extended version of Tomasulo’s dependency resolution algorithm. The register update unit is described in greater detail in [Sohi and Vajapeyam 87].

In a history buffer implementation, a special memory area called the history buffer is used to store old processor state information as it is replaced by new state information. When an interrupt occurs, the old processor state information is read out of the history buffer and used to restore the processor to its state just before the interrupt occurred. History buffers are described in more detail in [Smith and Pleszkun 85] and in [Ullah and Holle 93] and are used in the MC88110.
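A history buffer takes the opposite approach to a reorder buffer: registers are updated immediately and the old values are logged so they can be put back. The sketch below is an illustrative software model, not the mechanism of [Smith and Pleszkun 85] or the MC88110. The names are hypothetical, and it assumes that the issue logic never lets two in-flight instructions write the same register out of program order.

```python
class HistoryBuffer:
    """Registers are written eagerly; old values are logged for undo."""

    def __init__(self, regs):
        self.regs = dict(regs)
        self.log = []  # (instruction index, register, old value)

    def write(self, instr_index, reg, value):
        # Remember the value being replaced, then update the register.
        self.log.append((instr_index, reg, self.regs[reg]))
        self.regs[reg] = value

    def roll_back(self, resume_index):
        """Undo every logged write made by an instruction at or after
        resume_index, newest first, and drop those log entries."""
        for instr_index, reg, old in reversed(self.log):
            if instr_index >= resume_index:
                self.regs[reg] = old
        self.log = [e for e in self.log if e[0] < resume_index]


# The case of Figure 4.2.1: Instructions 4 and 0 have already changed
# state when Instruction 2 faults, and processing resumes at Instruction 1.
hb = HistoryBuffer({"r0": 0, "r4": 0})
hb.write(4, "r4", 44)
hb.write(0, "r0", 7)
hb.roll_back(resume_index=1)
print(hb.regs)
```

After the rollback only the write made by Instruction 0 survives, which is the serially correct state for resuming at Instruction 1.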

In a future file implementation, two separate register files are maintained—the future file, which is updated continuously as the processor operates, and the architectural file, which is updated in order by a reorder buffer. When an interrupt occurs, the architectural file is copied into the future file, thus effectively backing the processor state up to a serially correct point before the interrupt. The future file is described in greater detail in [Smith and Pleszkun 85].
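The two register files can be modeled very simply. This is a minimal sketch assuming that in-order commits are supplied by some external reorder buffer; the class and method names are hypothetical and are not taken from [Smith and Pleszkun 85].

```python
class FutureFile:
    """Two copies of the registers: one speculative, one serially correct."""

    def __init__(self, regs):
        self.future = dict(regs)         # updated as soon as results exist
        self.architectural = dict(regs)  # updated only by in-order commits

    def execute(self, reg, value):
        # A result is produced, possibly out of order.
        self.future[reg] = value

    def commit(self, reg, value):
        # The (assumed) reorder buffer retires a result in program order.
        self.architectural[reg] = value

    def interrupt(self):
        # Back the working state up to the last serially correct point.
        self.future = dict(self.architectural)


ff = FutureFile({"r1": 0, "r2": 0})
ff.execute("r2", 20)  # a later instruction finishes early
ff.execute("r1", 10)
ff.commit("r1", 10)   # only the earlier instruction has been committed
ff.interrupt()
print(ff.future)
```

The uncommitted write to `r2` disappears when the architectural file is copied over the future file.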

6 Phases of Interrupt Processing


For the purposes of our taxonomy, we divide interrupt processing into six phases, each of which has its own taxonomy of possible choices for the processor designer. Note that design choices for different phases are not independent—choices made in one phase can limit the possible choices in later phases. Briefly, the phases are:

• Phase 0: Detect the interrupt
• Phase 1: Finish pending instructions
• Phase 2: Undo process state changes
• Phase 3: Save the process state
• Phase 4: Run the interrupt handler
• Phase 5: Allow the interrupted process to resume

The taxonomy of each phase of interrupt processing is illustrated by a figure which places commercial processors within the taxonomy. Many branches of the taxonomy diagrams have no processor listed. This doesn’t necessarily mean that there isn’t a processor which would fit at that branch; it just means that none of the relatively narrow set of concurrent processors which we considered fit in that place.

For simplicity’s sake, the discussion below takes place entirely in a pipelined context. So if a processor’s functional units are not pipelined, interrupt handling in some cases will be simpler than is stated below. Additionally, some branches of the taxonomy whose explanations are self evident are not explained in the text due to space considerations—if the reader desires further explanation, this taxonomy is explained in further detail in [Walker 92].

6.0 Phase 0: Detect an interrupt

In this phase an interrupt is detected. The source of the interrupt may be automatically identified by the processor, or it may be identified later by the interrupt handler. The taxonomy for this phase of interrupt processing is shown in Figure 6.0.


Figure 6.0: Phase zero—detecting the interrupt

In this taxonomy, the detection of an interrupt within an instruction means that an interrupt can be detected while an instruction is actually inside one of the pipeline’s stages. Interrupts detected within an instruction are usually internal interrupts such as page faults. External interrupts, on the other hand, are usually not detected until the processor advances all the instructions in the pipeline from one stage to the next; this is referred to as recognizing interrupts between instructions.

6.1 Phase 1: Finish pending instructions

In this phase the processor performs some action on the pending instructions. Pending instructions are those instructions which must be executed before the interrupt handler is run if precise interrupts are to be implemented. If the interrupted process will not resume, then the instructions before the interrupting instruction can be discarded instead of executed. The taxonomy for this phase of interrupt processing is shown in Figure 6.1.


Figure 6.1: Phase one—finishing pending instructions

* Since the length of the Sparc pipeline is implementation-dependent, there may not be any instructions to run to completion.

6.2 Phase 2: Undo process state changes

In this phase the processor undoes any process state changes already caused by instructions which are required to be completely unexecuted by Smith and Pleszkun’s second precise interrupt condition. If precise interrupts are not being implemented, this stage of interrupt processing is unnecessary. The taxonomy for this phase is shown in Figure 6.2.

Figure 6.2: Phase two—undoing process state changes

* For segmentation and paging interrupts, some registers are restored to the values they had before the interrupting instruction began.

Note that in the above taxonomy, some processors don’t undo all process state changes in some cases. There are two possible reasons for this: a particular processor may not need to make interrupts entirely precise to ensure that the interrupted process will run correctly after resumption, or the interrupted process can’t or won’t resume, so precise interrupts are not necessary.

6.3 Phase 3: Save the state of the process

In this phase the processor saves the state which will be restored in phase 5. This saved state is that of the process after pending instructions are allowed to complete in phase 1 and state changes are undone in phase 2. The program counter which is saved in this phase generally points to the first instruction which will be loaded into the processor after the interrupt handler completes, but this may vary in individual implementations.

Many of the processors in the taxonomy use stacking to save their state. This is feasible since the process state was already serialized in phases 1 and 2, so there is not much state to save.

The taxonomy for this phase of interrupt processing is shown in Figure 6.3.

Figure 6.3: Phase three—saving the state of the process

In the “auxiliary processor” branch of the taxonomy the state, while not explicitly saved, is preserved in situ as described earlier, so the only branch in which absolutely no state saving or preservation takes place is the “no process resumption” branch.
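
The stacking approach mentioned above can be illustrated with a minimal sketch. This is a toy model under stated assumptions, not the mechanism of any processor in the taxonomy: the `Processor` class, its `pc` and `status` fields, and the handler address are all hypothetical, and a real processor may save more or less state than this.

```python
# Hypothetical sketch of stack-based state saving (phase 3) and
# restoring (phase 5). The saved fields are illustrative assumptions.

class Processor:
    def __init__(self):
        self.pc = 0        # program counter
        self.status = 0    # status word (assumed to be part of the saved state)
        self.stack = []    # stands in for the memory stack

    def save_state(self):
        """Phase 3: push the state that phase 5 will restore."""
        self.stack.append((self.pc, self.status))

    def restore_state(self):
        """Phase 5: pop the most recently saved state."""
        self.pc, self.status = self.stack.pop()

    def take_interrupt(self, handler_address):
        """Save the state, then point the program counter at the handler."""
        self.save_state()
        self.pc = handler_address
        self.status = 0


cpu = Processor()
cpu.pc, cpu.status = 0x100, 0b11
cpu.take_interrupt(handler_address=0x2000)  # handler would run here (phase 4)
cpu.restore_state()                         # interrupted process resumes
```

Because each saved state is pushed on top of the previous one, a stack of this kind also accommodates an interrupt that arrives while a handler is running.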

6.4 Phase 4: Run the interrupt handler

This phase actually involves doing two things: identifying the interrupt handler which needs to be run, then running it. The taxonomy for this phase of interrupt processing is shown in Figure 6.4.

Figure 6.4: Phase four—running the interrupt handler

Some interrupts could be designed so that no software interrupt handler is required. An example of this is a page fault, during which the necessary page could be loaded into main memory by specialized hardware or microcode rather than by a software interrupt handler. However, the authors are not aware of any processors which do this.

In the interrupt vector approach the starting addresses of the interrupt handlers are stored in a table where the processor can access them when an interrupt occurs. Special purpose hardware takes care of identifying the cause of the interrupt and loading the correct interrupt vector into the program counter. Note that within the interrupt handler invoked here, interrupt registers or devices could be polled. However, this polling is not what determines which interrupt handler to run.

In an interrupt register system, when an interrupt occurs, identifier bits are set in a register by the interrupting device to indicate the source of the interrupt. The processor or the interrupt handler tests this register and takes the appropriate action. This strategy seems to be most suitable for external interrupts.

In software polling, when an external interrupt occurs an interrupt handler is invoked. The interrupt handler must then poll the I/O devices to discover which one of them caused the interrupt.
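
The three identification strategies described above can be contrasted in a short sketch. It is an illustrative model only: the table contents, bit assignments, device names, and function names are hypothetical and do not correspond to any particular processor.

```python
# Hypothetical model of three ways to identify the interrupt handler.
# All names and bit assignments below are illustrative assumptions.

def timer_handler():
    return "handled timer"

def disk_handler():
    return "handled disk"

# 1. Interrupt vector: the cause selects a handler address from a table.
VECTOR_TABLE = {"timer": timer_handler, "disk": disk_handler}

def dispatch_by_vector(cause):
    return VECTOR_TABLE[cause]()

# 2. Interrupt register: the interrupting device sets an identifier bit,
#    and the processor or the handler tests the register.
TIMER_BIT = 0b01
DISK_BIT = 0b10

def dispatch_by_register(interrupt_register):
    if interrupt_register & TIMER_BIT:
        return timer_handler()
    if interrupt_register & DISK_BIT:
        return disk_handler()
    return None

# 3. Software polling: one generic handler asks each device in turn.
class Device:
    def __init__(self, name, requesting):
        self.name = name
        self.requesting = requesting

def dispatch_by_polling(devices):
    for device in devices:
        if device.requesting:
            return f"handled {device.name}"
    return None
```

In the first case the lookup stands in for special purpose hardware; in the other two the tests and the polling loop stand in for work done by the processor or by the handler itself.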

6.5 Phase 5: Interrupted process resumes

The interrupted process now resumes from the process state which was saved in phase 3. The taxonomy for this phase of interrupt processing is shown in Figure 6.5.

Figure 6.5: Phase five—interrupted process resumption

* The strategy which is used here is implementation dependent.

** Some types of instructions, such as string instructions, are backed up less than one pipeline stage and partially re-executed.

*** The MC68020, MC68030, and MC68040 use an “Instruction continuation - Continue from save” strategy for bus errors and page faults.

**** If the interrupting instruction was a fault, the interrupting instruction is continued.

The instructions which are being either completely or partially re-executed in this taxonomy are the ones which were required to be completely unexecuted by Smith and Pleszkun’s second condition, but which were already some distance into the pipeline when the interrupt occurred. In complete re-execution, all these instructions are restarted from the beginning of the pipeline. In partial re-execution, these instructions are either backstepped some integral number of pipeline stages (atomic stage backstep) or are backed up some number of clock cycles which is less than a full pipeline stage (partial stage backstep). Note that partial stage backstep is only possible in a pipeline which has a stage or stages which can require more than one clock cycle to complete in some cases.

In the case of instruction continuation, no work which was previously done by the processor need be redone. There are two ways this might occur. The first is that there is an auxiliary processor to handle interrupts, so the main processor was stopped when the interrupt occurred and restarted after the interrupt was handled. The second way is that the process state saved by the processor in phase three was so complete that no work must be redone when the interrupted process resumes.
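
The resumption strategies discussed in this section can be compared by asking where in the pipeline a pending instruction restarts. The sketch below is a simplified model under our own assumptions: a position is a hypothetical `(stage, cycle)` pair, the strategy names are illustrative labels, and an atomic stage backstep is assumed to restart at the first cycle of the earlier stage.

```python
# Hypothetical model: where does a pending instruction restart?
# A position is (stage index, clock cycle within that stage).

def resume_point(stage, cycle, strategy, amount=0):
    """Return the (stage, cycle) at which the instruction restarts."""
    if strategy == "complete_reexecution":
        return (0, 0)                          # back to the start of the pipeline
    if strategy == "atomic_stage_backstep":
        return (max(stage - amount, 0), 0)     # back up whole stages
    if strategy == "partial_stage_backstep":
        return (stage, max(cycle - amount, 0)) # back up cycles within the stage
    if strategy == "continuation":
        return (stage, cycle)                  # nothing is redone
    raise ValueError(f"unknown strategy: {strategy}")
```

For an instruction interrupted in stage 3 at cycle 2, complete re-execution gives `(0, 0)`, a one-stage atomic backstep gives `(2, 0)`, a one-cycle partial backstep gives `(3, 1)`, and continuation leaves it at `(3, 2)`.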

6.6 Taxonomizing interrupt processing

If you take a real microprocessor and attempt to taxonomize its interrupts using the notion of “phases of interrupt processing,” you will soon find that on one processor, different types of interrupts may use different processing strategies. For example, on many processors there are a few instructions which take a long time to execute, such as instructions which move blocks of memory around. If the execution of such an instruction takes, say, 2000 processor cycles, then an interrupt in an unfortunate spot could cause the processor to redo 1999 cycles of work. Consequently, most processors have a special way to continue instructions like this from some intermediate step in the calculations rather than restarting the whole instruction (e.g. the Intel Repeat String instruction [80386 86]). But the other interrupts on the processor may cause the instructions after the interrupting one to be restarted from the beginning of the pipeline. We can therefore conclude that, to be completely accurate, each possible interrupt or type of interrupt on a processor may have to be taxonomized individually; processors as a whole resist convenient taxonomization due to the kinds of special cases described above.
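
The idea of continuing a long instruction from an intermediate step can be sketched as follows. This is a toy model, loosely inspired by instructions that keep their progress in registers; the `block_move` function, the `state` dictionary, and the `budget` parameter (which stands in for the arrival of an interrupt) are all hypothetical.

```python
# Hypothetical sketch: a block move that keeps its progress in
# register-like state, so it can continue after an interrupt
# instead of restarting from the first element.

def block_move(memory, state, budget):
    """Copy elements until the count reaches zero or `budget` copies
    have been made (modelling an interrupt). Returns copies performed."""
    done = 0
    while state["count"] > 0 and done < budget:
        memory[state["dst"]] = memory[state["src"]]
        state["src"] += 1
        state["dst"] += 1
        state["count"] -= 1   # progress is visible in the saved state
        done += 1
    return done


memory = list(range(8)) + [0] * 8
state = {"src": 0, "dst": 8, "count": 8}

first = block_move(memory, state, budget=5)     # interrupted after 5 copies
second = block_move(memory, state, budget=100)  # re-issued after the handler
```

Here the two calls perform 5 and 3 copies respectively, so no element is copied twice; restarting from scratch would instead repeat the first 5 copies.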

If you look at the taxonomy diagrams at the beginning of this section, however, you will see that we classified entire processors as to their interrupt processing strategies, which we just said was not entirely accurate. We felt that for a treatment of this type, too much detail in the taxonomy would create confusion, so in this paper processors are taxonomized only according to their best-documented interrupt processing strategy, with exceptions noted where applicable.

7 Queued and Nested Interrupts

Our treatment of interrupts so far has concentrated on isolated interrupts, and ignored the problems of interrupts which occur simultaneously, or interrupts which occur during the processing of other interrupts. Since the inclusion of these problems into the above taxonomy would have made it quite unwieldy, and since queued and nested interrupts affect most interrupt processing strategies in the same way, we discuss queued and nested interrupts separately here.

First we define queued and nested interrupts. A queued interrupt is an interrupt which occurs while another interrupt is being processed, but which is not processed itself until the first one is completed. A nested interrupt is an interrupt which occurs while another interrupt is being processed, and which preempts the processing of the first interrupt until the new interrupt is completed.

The table below gives the choices the processor designer has about whether to allow queueing of interrupts or nesting or both. The “current interrupt” refers to the interrupt which is now being processed; the “new interrupt” refers to the interrupt which occurs during the processing of the current interrupt. This definition is general enough to handle any number of levels of queueing or nesting.

current interrupt    new interrupt    options

external             external         queue or nest
external             internal         nest
internal             external         queue or nest
internal             internal         nest

Table 7.0: Queued and nested interrupts

In other words, if an external interrupt occurs while processing another interrupt, it can be safely queued until later since the current interrupt can finish processing without having to process the new one. But if an internal interrupt occurs while processing another interrupt, it must be nested, since the current interrupt cannot finish processing without first processing the new one.
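
The rule expressed by Table 7.0 can be written as a small sketch. It is a simplified, single-level model under our own assumptions: the function names are hypothetical, each interrupt is a `(name, kind)` pair, and a nested interrupt is assumed to be processed to completion as soon as it arrives.

```python
from collections import deque

# Hypothetical sketch of the choices in Table 7.0.

def allowed_options(current_kind, new_kind):
    """Options for a new interrupt arriving during the current one.
    Per the table, only the kind of the new interrupt matters."""
    if new_kind == "internal":
        return {"nest"}
    return {"queue", "nest"}

def completion_order(current, arrivals, prefer_queue=True):
    """Names of interrupts in the order their processing completes."""
    finished = []
    queued = deque()
    for new in arrivals:
        options = allowed_options(current[1], new[1])
        if prefer_queue and "queue" in options:
            queued.append(new)        # deferred until the current one is done
        else:
            finished.append(new[0])   # nested: preempts and completes first
    finished.append(current[0])
    finished.extend(name for name, _ in queued)
    return finished


order = completion_order(
    ("disk", "external"),
    [("timer", "external"), ("page_fault", "internal")],
)
```

With queueing preferred, the internal page fault is nested and completes first, then the disk interrupt, then the queued timer interrupt.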

Most real microprocessors avoid the problem of providing arbitrarily many levels of interrupts by disabling new interrupts during interrupt processing (in the case of external interrupts) or by stopping execution and signalling an error if another interrupt occurs (in the case of internal interrupts). This saves them from having to handle nested interrupts.

8 Conclusion

In this paper we’ve described the various problems of interrupt processing in concurrent processors and presented a taxonomy of interrupt processing strategies which encompasses those used or proposed today as well as suggesting a few which may not have been used before. In this, we hope that this paper has served its purpose.

However, there are many issues which we didn’t address in this paper. Foremost is the issue of relative performance of interrupt processing strategies, which is a very real concern when designing real systems. Another issue we purposely didn’t cover is the possible commonality between hardware used to implement precise interrupts and that used to implement speculative execution. While these also are worthy issues, they deserve papers all their own.

9 Acknowledgements

We thank Professor Vijay Kumar Garg for his helpful comments offered while reviewing the thesis this paper is based on. We also thank all nine of our reviewers for their detailed, scholarly, and very helpful comments. Responsibility for any remaining errors, oversights or misunderstandings is, of course, ours alone.

This work was partially supported by a Microelectronics Development Fellowship and an Engineering Doctoral Fellowship, both from the University of Texas at Austin.

10 References

[80386 86] 80386 Hardware Reference Manual. Santa Clara, CA: Intel Corporation, 1986.

[Baer 80] Jean-Loup Baer, Computer Systems Architecture. Rockville, MD: Computer Science Press, 1980.

[Grohoski 90] G. F. Grohoski, “Machine organization of the IBM RISC System/6000 processor.” IBM Journal of Research and Development, Vol. 34, No. 1., January 1990, pp. 37-58. International Business Machines Corporation, 1990.

[Hennessy and Patterson 89] John L. Hennessy and David A. Patterson, Computer Architecture: A Quantitative Approach. Palo Alto, CA: Morgan Kaufmann Publishers, Inc., 1989.

[Hwu and Patt 87] W-M. W. Hwu and Y. N. Patt, “Checkpoint Repair for Out-of-Order Execution Machines.” IEEE Transactions on Computers, C-36, No. 12, December 1987, pp. 1515-1522.

[Keller 75] R. M. Keller, “Look-Ahead Processors.” Computing Surveys, December 1975, Vol. 7, No. 4, pp. 177-195.

[Kuck 78] David J. Kuck, The Structure of Computers and Computations. New York: John Wiley and Sons, 1978.

[Meyers 82] Glenford J. Meyers, Advances in Computer Architecture, Second Edition. New York: John Wiley and Sons, 1982.

[Smith and Pleszkun 85] James E. Smith and Andrew R. Pleszkun, “Implementation of Precise Interrupts in Pipelined Microprocessors.” Proceedings of the 12th Annual International Symposium on Computer Architecture, 1985, pp. 36-44.

[Sohi and Vajapeyam 87] Gurindar S. Sohi and Sriram Vajapeyam, “Instruction Issue Logic for High-Performance, Interruptible Pipelined Processors.” Proceedings of the 14th Annual International Symposium on Computer Architecture, 1987, pp. 27-34.

[Ullah and Holle 93] Nasr Ullah and Matt Holle, “The MC88110 Implementation of Precise Exceptions in a Superscalar Architecture,” Computer Architecture News, Vol. 21, No. 1, March 1993.

[Venkatramani 90] Venkatramani, K., A Semantics-Based Approach for the Design and Verification of Concurrent Processors. Ph. D. Dissertation, The University of Texas at Austin, August 1990.

[Walker 92] Wade A. Walker, Interrupt Processing Strategies in Pipelined Microprocessors. Master’s Thesis, The University of Texas at Austin, December 1992.
