9/23/2003 2:11 PM MSR Interrupts 1 Washington WASHINGTON UNIVERSITY IN ST LOUIS SPC-I and SPC-II Interrupts, APIC Interrupts and the NetBSD Interrupt Handling Code John DeHart Washington University [email protected] http://www.arl.wustl.edu/~jdd


SPC-I and SPC-II Interrupts, APIC Interrupts and the NetBSD Interrupt Handling Code

John DeHart
Washington University
[email protected]
http://www.arl.wustl.edu/~jdd


Issue

• There seems to be a bug in the system

– Right now only shows up on SPC-II

– In the past, has shown up on SPC-I

• But this could be similar symptoms of different problems.

– No recollection of it ever showing up on end hosts.

– All these different systems have different timing

• We ran into this problem in preparing for and doing the WU 150th anniversary demo.

• Fred is having this problem in his kernel testing.

• JohnD is having this problem in his final SPC-II performance testing.


Issue (continued)

• Symptoms:

– Transmit queue stalls for paced connections

• when it stalls the descriptor chain looks like this:

– ???

– Resuming connection as BE (then Paced) clears the queue most of the time

• sometimes it then stalls again and eventually we can not resume it.

– When it stalls and the APIC runs out of descriptors, we do get an ERROR interrupt for the out of descriptors state.

• This seems to imply that the APIC and ICU are in a state such that they can generate an APIC interrupt to the CPU.

• If the APIC had generated an interrupt that had been “lost” the APIC and/or ICU would probably not be in a state that would allow another APIC interrupt to reach the CPU.


Issue (continued)

• Suspects:

– “Lost” interrupt

– APIC Hardware bug

• interrupt handling

– timing between two instances of INTR signal being asserted.

• descriptor handling

• pacer

• flow control

• other?

– APIC driver bug

• interrupt handling

• descriptor handling

• other?

– NetBSD Interrupt handling bug

– SPC-II FPGA flow control bug


Issue (continued)

• Plan of Attack:

– Analyze APIC driver code:

• compare MSR vs. end host driver code.

– Get details of descriptor chain when it stalls:

• dump APIC descriptor chain as it exists in memory

• dump APIC current descriptor chain register for stalled channel

– monitor interrupt counts on SPC-II and compare to packet counts

• vmstat -i

– Note what IRQs are assigned to what at boot time.

– Turn off SPC-II FPGA flow control to APIC

• change VHDL

• rebuild bitfile

• re-program SPC-II FPGA

• retest


SPC-I System FPGA

• Supported:

– Four interrupts supported and statically assigned:

• PIT (IRQ 0)

• APIC (IRQ 5)

• COM1 (IRQ 4)

• COM2 (IRQ 3)

– Static fully-nested interrupt priority structure.

– Specific End of Interrupt is the only EOI mode supported

• Not Supported:

– Special Mask Mode

– Automatic End of Interrupt (AUTO_EOI_1, AUTO_EOI_2)

– Special Fully Nested Mode
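
The static fully-nested priority and specific-EOI behavior listed above can be sketched as a small C model. This is only an illustration under assumptions: the struct and function names are hypothetical, and the priority order (IRQ 0 highest) is the conventional 8259-style ordering, not something taken from the SPC-I FPGA source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one fully nested ICU.
 * Bit n of each register is IRQ n; IRQ 0 is assumed to be highest priority. */
struct icu_model {
    uint8_t irr;   /* requests raised by devices */
    uint8_t isr;   /* interrupts delivered to the CPU but not yet EOI'd */
    uint8_t imr;   /* mask register: 1 = masked */
};

/* Return the IRQ the ICU would deliver next, or -1 if none qualifies. */
static int icu_next_irq(const struct icu_model *icu)
{
    uint8_t candidates = icu->irr & (uint8_t)~icu->imr;

    for (int irq = 0; irq < 8; irq++) {
        uint8_t bit = (uint8_t)(1u << irq);
        if (icu->isr & bit)
            return -1;   /* equal or higher priority is still in service */
        if (candidates & bit)
            return irq;  /* highest-priority unmasked request */
    }
    return -1;
}

/* Specific EOI: clear exactly the named in-service bit. */
static void icu_specific_eoi(struct icu_model *icu, int irq)
{
    assert(irq >= 0 && irq < 8);
    icu->isr &= (uint8_t)~(1u << irq);
}
```

In this model, with COM2 (IRQ 3) in service a pending APIC request (IRQ 5) is not delivered until the specific EOI for IRQ 3 is issued, while a PIT request (IRQ 0) would still get through.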


SPC-II Interrupts

• Supported by a real Southbridge/ICU

• FPGA provides flow control

– but with the traffic patterns and rates we are using there should be no flow control asserted.


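Hardware Interrupt Structure (Ignoring Bus)

[Diagram: APIC, ICU and CPU. Signals shown: INTR from APIC to ICU, INTR from ICU to CPU, ACK and MASK/UNMASK from CPU to ICU, ACK from CPU to APIC.]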


Overview of what happens

• APIC generates INTR to ICU

– APIC will not generate another INTR until ACKed

• ICU pushes INTR(IRQ) onto Bus

– ICU will only send higher priority interrupts

• CPU gets INTR

– MASK IRQ in ICU

• ICU will not send this IRQ again

– ACK IRQ in ICU

• Allows lower priority interrupts from ICU

– Check priority and hold if lower than current

– Call APIC interrupt handler

• ACK Intr in APIC

– APIC can generate another INTR to ICU

• Intr processing…

– process all packets that have been received

– put packets being forwarded on transmit queue and resume transmit queue if needed

• Return

– UNMASK IRQ in ICU

• ICU can send us this IRQ again

– Check for other pending (held) interrupts.

– RETI (expand…)
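
The handshake in the overview above can be walked through with a toy state model. This is a rough sketch, not the driver or the hardware: all names are hypothetical, only the single APIC IRQ line is modeled, and edge versus level triggering is ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state of the APIC -> ICU -> CPU path for one IRQ line. */
struct irq_path {
    bool apic_waiting_ack; /* APIC raised INTR, driver has not ACKed it yet */
    bool icu_pending;      /* request latched in the ICU, not yet delivered */
    bool icu_masked;       /* CPU has masked this IRQ in the ICU */
    bool icu_in_service;   /* delivered to the CPU, not yet ACKed (EOI) */
    int  delivered;        /* number of interrupts that reached the CPU */
};

/* ICU delivers a latched request only if unmasked and not in service. */
static void icu_try_deliver(struct irq_path *p)
{
    if (p->icu_pending && !p->icu_masked && !p->icu_in_service) {
        p->icu_pending = false;
        p->icu_in_service = true;
        p->delivered++;
    }
}

/* APIC has work: it raises INTR only if the previous INTR was ACKed. */
static void apic_event(struct irq_path *p)
{
    if (p->apic_waiting_ack)
        return;
    p->apic_waiting_ack = true;
    p->icu_pending = true;
    icu_try_deliver(p);
}

/* Interrupt entry: MASK the IRQ in the ICU, then ACK it in the ICU. */
static void cpu_entry(struct irq_path *p)
{
    assert(p->icu_in_service);
    p->icu_masked = true;
    p->icu_in_service = false;
}

/* Driver handler: ACK in the APIC so it may raise INTR again. */
static void driver_ack(struct irq_path *p)
{
    p->apic_waiting_ack = false;
}

/* Interrupt exit: UNMASK the IRQ; a latched request is delivered now. */
static void cpu_exit(struct irq_path *p)
{
    p->icu_masked = false;
    icu_try_deliver(p);
}
```

In this model an APIC event that arrives between the driver's ACK and the UNMASK is not lost: it stays latched in the ICU and is delivered when the IRQ is unmasked. Whether the real hardware behaves this way is exactly the kind of question the investigation has to answer.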


sys/arch/i386/isa/vector.s

#include "opt_ddb.h"

#include <i386/isa/icu.h>
#include <dev/isa/isareg.h>

#define ICU_HARDWARE_MASK

#define IRQ_BIT(irq_num) (1 << ((irq_num) % 8))
#define IRQ_BYTE(irq_num) ((irq_num) / 8)

#ifdef ICU_SPECIAL_MASK_MODE // SPC System FPGA does not support SMM
#define ACK1(irq_num)
#define ACK2(irq_num) \
	movb $(0x60|IRQ_SLAVE),%al /* specific EOI for IRQ2 */ ;\
	outb %al,$IO_ICU1

#define MASK(irq_num, icu)
#define UNMASK(irq_num, icu) \
	movb $(0x60|(irq_num%8)),%al /* specific EOI */ ;\
	outb %al,$icu


sys/arch/i386/isa/vector.s

#else /* I.E. NOT ICU_SPECIAL_MASK_MODE */

#ifndef AUTO_EOI_1
#define ACK1(irq_num) \
	movb $(0x60|(irq_num%8)),%al /* specific EOI */ ;\
	outb %al,$IO_ICU1
#else
#define ACK1(irq_num)
#endif

#ifndef AUTO_EOI_2
#define ACK2(irq_num) \
	movb $(0x60|(irq_num%8)),%al /* specific EOI */ ;\
	outb %al,$IO_ICU2 /* do the second ICU first */ ;\
	movb $(0x60|IRQ_SLAVE),%al /* specific EOI for IRQ2 */ ;\
	outb %al,$IO_ICU1
#else
#define ACK2(irq_num)
#endif


sys/arch/i386/isa/vector.s

#ifdef ICU_HARDWARE_MASK
#define MASK(irq_num, icu) \
	movb _C_LABEL(imen) + IRQ_BYTE(irq_num),%al /* imen: interrupt mask enable (2 bytes) */ ;\
	orb $IRQ_BIT(irq_num),%al /* mask our irq (put a 1 in its place) */ ;\
	movb %al,_C_LABEL(imen) + IRQ_BYTE(irq_num) ;\
	FASTER_NOP ;\
	outb %al,$(icu+1) /* write it to the ICU */

#define UNMASK(irq_num, icu) \
	cli ;\
	movb _C_LABEL(imen) + IRQ_BYTE(irq_num),%al ;\
	andb $~IRQ_BIT(irq_num),%al ;\
	movb %al,_C_LABEL(imen) + IRQ_BYTE(irq_num) ;\
	FASTER_NOP ;\
	outb %al,$(icu+1) ;\
	sti

#else /* ICU_HARDWARE_MASK */
#define MASK(irq_num, icu)
#define UNMASK(irq_num, icu)
#endif /* ICU_HARDWARE_MASK */

#endif /* ICU_SPECIAL_MASK_MODE */
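
What the MASK and UNMASK macros do to the imen shadow can be sketched in C. IRQ_BIT and IRQ_BYTE are copied from the slides; everything else is a hypothetical stand-in: the shadow array starts at zero only for illustration, and the write to the ICU mask port (icu+1) is recorded in a variable instead of performed.

```c
#include <assert.h>
#include <stdint.h>

#define IRQ_BIT(irq_num)  (1 << ((irq_num) % 8))
#define IRQ_BYTE(irq_num) ((irq_num) / 8)

/* Stand-in for the kernel's 2-byte imen: byte 0 is ICU1, byte 1 is ICU2.
 * A 1 bit means the IRQ is masked. */
static uint8_t imen_shadow[2];

/* Stand-in for "outb %al,$(icu+1)": remembers the last mask byte written. */
static uint8_t last_written_mask[2];

/* Model of MASK(irq_num, icu): set the bit, then write the byte to the ICU. */
static void mask_irq(int irq)
{
    assert(irq >= 0 && irq < 16);
    imen_shadow[IRQ_BYTE(irq)] |= (uint8_t)IRQ_BIT(irq);
    last_written_mask[IRQ_BYTE(irq)] = imen_shadow[IRQ_BYTE(irq)];
}

/* Model of UNMASK(irq_num, icu): clear the bit, then write the byte. */
static void unmask_irq(int irq)
{
    assert(irq >= 0 && irq < 16);
    imen_shadow[IRQ_BYTE(irq)] &= (uint8_t)~IRQ_BIT(irq);
    last_written_mask[IRQ_BYTE(irq)] = imen_shadow[IRQ_BYTE(irq)];
}
```

For the APIC on IRQ 5 this touches bit 5 (0x20) of byte 0; an IRQ on the second ICU, such as IRQ 10, touches bit 2 (0x04) of byte 1.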


sys/arch/i386/isa/vector.s

#ifdef __ELF__

#define XINTR(irq_num) Xintr/**/irq_num

#define XHOLD(irq_num) Xhold/**/irq_num

#define XSTRAY(irq_num) Xstray/**/irq_num

#else

#define XINTR(irq_num) _Xintr/**/irq_num

#define XHOLD(irq_num) _Xhold/**/irq_num

#define XSTRAY(irq_num) _Xstray/**/irq_num

#endif


sys/arch/i386/isa/vector.s

/* Beginning of INTR Macro */

#define INTR(irq_num, icu, ack)

IDTVEC(resume/**/irq_num)

cli

jmp 1f

IDTVEC(recurse/**/irq_num)

pushfl

pushl %cs

pushl %esi

cli

Block the CPU from accepting any more interrupts.


sys/arch/i386/isa/vector.s

XINTR(irq_num):

pushl $0 /* dummy error code */

pushl $T_ASTFLT /* trap # for doing ASTs */

INTRENTRY

MAKE_FRAME

MASK(irq_num, icu) /* mask it in hardware */

ack(irq_num) /* and allow other intrs */

incl MY_COUNT+V_INTR /* statistical info */

Notes:

• (MASK) ICU will not send us any more of this IRQ.

• (ack) ACK this IRQ to the ICU. Allows it to generate other interrupts. Without this the ICU would only generate higher priority interrupts.

• When an interrupt occurs the CPU will clear the interrupt enable bit (equivalent of cli). An iret restores the bit.


sys/arch/i386/isa/vector.s

testb $IRQ_BIT(irq_num),_C_LABEL(cpl) + IRQ_BYTE(irq_num)

jnz XHOLD(irq_num) /* currently masked; hold it */

1: movl _C_LABEL(cpl),%eax /* cpl to restore on exit */

pushl %eax

orl _C_LABEL(intrmask) + (irq_num) * 4,%eax

movl %eax,_C_LABEL(cpl) /* add in this intr's mask */

sti /* safe to take intrs now */

Notes:

• (cpl) In-kernel interrupt mask.

• (sti) Allow CPU to accept more interrupts.

• (intrmask) Pre-computed masks for each IRQ:

IRQ 0: 0xe0000021
IRQ 3: 0xe0000039
IRQ 4: 0xe0000039
IRQ 5: 0xc0000020

[Diagram: low byte of the mask, with bits 5 4 3 2 1 0 labelled by IRQ number.]

• (XHOLD) Add IRQ bit to ipending.
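
The four mask values quoted above can be decoded mechanically to see which IRQs are held while a given handler runs. The sketch below models the testb/jnz XHOLD check and the "add in this intr's mask" step in C. Function and array names are hypothetical, and only the four intrmask entries given on the slide are filled in.

```c
#include <assert.h>
#include <stdint.h>

/* intrmask values quoted on the slide; all other entries are unknown here
 * and left as 0. */
static const uint32_t intrmask_example[16] = {
    [0] = 0xe0000021u,  /* PIT   */
    [3] = 0xe0000039u,  /* COM2  */
    [4] = 0xe0000039u,  /* COM1  */
    [5] = 0xc0000020u,  /* APIC  */
};

/* Mirrors "testb $IRQ_BIT(irq),cpl + IRQ_BYTE(irq); jnz XHOLD":
 * nonzero means the IRQ is blocked at this cpl and must be held. */
static int irq_is_held(uint32_t cpl, int irq)
{
    assert(irq >= 0 && irq < 16);
    return (cpl & (1u << irq)) != 0;
}

/* Mirrors "orl intrmask + irq*4,%eax; movl %eax,cpl":
 * the cpl in effect while the handler for irq runs. */
static uint32_t cpl_in_handler(uint32_t cpl, int irq)
{
    assert(irq >= 0 && irq < 16);
    return cpl | intrmask_example[irq];
}
```

Decoding the low bits: 0x20 is IRQ 5 only, 0x21 is IRQs 0 and 5, and 0x39 is IRQs 0, 3, 4 and 5. So with these values the APIC handler blocks only further APIC interrupts, while the clock and serial handlers all hold the APIC interrupt until they finish.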


sys/arch/i386/isa/vector.s

movl _C_LABEL(intrhand) + (irq_num) * 4,%ebx /* head of chain */

testl %ebx,%ebx

jz XSTRAY(irq_num) /* no handlers; we're stray */

STRAY_INITIALIZE /* nobody claimed it yet */

incl _C_LABEL(intrcnt) + (4*(irq_num)) /* XXX */


sys/arch/i386/isa/vector.s

7: movl IH_ARG(%ebx),%eax /* get handler arg */

testl %eax,%eax

jnz 4f

movl %esp,%eax /* 0 means frame pointer */

4: pushl %eax

call IH_FUN(%ebx) /* call it */

addl $4,%esp /* toss the arg */

STRAY_INTEGRATE /* maybe he claimed it */

incl IH_COUNT(%ebx) /* count the intrs */

movl IH_NEXT(%ebx),%ebx /* next handler in chain */

testl %ebx,%ebx

jnz 7b

STRAY_TEST /* see if it's a stray */

5: UNMASK(irq_num, icu) /* unmask it in hardware */

jmp _C_LABEL(Xdoreti) /* lower spl and do ASTs */

Notes:

• Locate a handler for this IRQ.

• (call IH_FUN) Call NetBSD interrupt handler.

• (UNMASK) ICU is now able to send us another interrupt for this IRQ.

• (Xdoreti) Return from interrupt: resume other interrupts, check for pending interrupts, restore stack, iret.
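
The handler loop at label 7 can be read as the following C sketch. The struct is a simplified, hypothetical stand-in for the kernel's handler record, with fields named after the IH_FUN, IH_ARG, IH_COUNT and IH_NEXT offsets used in the assembly. The two example handlers exist only to exercise the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for one entry in the per-IRQ handler chain. */
struct intrhand_sketch {
    int (*ih_fun)(void *);             /* handler; nonzero return = claimed */
    void *ih_arg;                      /* NULL means "pass the frame" */
    unsigned long ih_count;            /* per-handler interrupt count */
    struct intrhand_sketch *ih_next;   /* next handler sharing this IRQ */
};

/* Call every handler on the chain. Returns nonzero if any handler claimed
 * the interrupt; 0 corresponds to the stray case. */
static int run_handler_chain(struct intrhand_sketch *head, void *frame)
{
    int claimed = 0;

    for (struct intrhand_sketch *ih = head; ih != NULL; ih = ih->ih_next) {
        assert(ih->ih_fun != NULL);
        void *arg = ih->ih_arg ? ih->ih_arg : frame; /* 0 means frame pointer */
        claimed |= (ih->ih_fun(arg) != 0);           /* maybe he claimed it */
        ih->ih_count++;                              /* count the intrs */
    }
    return claimed;
}

/* Example handlers, purely illustrative. */
static int count_and_claim(void *arg) { ++*(int *)arg; return 1; }
static int not_mine(void *arg) { (void)arg; return 0; }
```

Note that every handler on the chain is called and counted, even after an earlier one has claimed the interrupt, so a shared IRQ costs one call per registered handler.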


sys/arch/i386/isa/vector.s

IDTVEC(stray/**/irq_num)

pushl $irq_num

call _C_LABEL(isa_strayintr)

addl $4,%esp

incl _C_LABEL(strayintrcnt) + (4*(irq_num))

jmp 5b

IDTVEC(hold/**/irq_num) // XHOLD()

orb $IRQ_BIT(irq_num),_C_LABEL(ipending) + IRQ_BYTE(irq_num)

INTRFASTEXIT

/* End of INTR Macro */


sys/arch/i386/isa/vector.s

INTR(0, IO_ICU1, ACK1) /* Clock interrupt */

INTR(1, IO_ICU1, ACK1)

INTR(2, IO_ICU1, ACK1)

INTR(3, IO_ICU1, ACK1) /* COM 2 Interrupt */

INTR(4, IO_ICU1, ACK1) /* Com 1 Interrupt */

INTR(5, IO_ICU1, ACK1) /* APIC Interrupt */

INTR(6, IO_ICU1, ACK1)

INTR(7, IO_ICU1, ACK1)

INTR(8, IO_ICU2, ACK2)

INTR(9, IO_ICU2, ACK2)

INTR(10, IO_ICU2, ACK2)

INTR(11, IO_ICU2, ACK2)

INTR(12, IO_ICU2, ACK2)

INTR(13, IO_ICU2, ACK2)

INTR(14, IO_ICU2, ACK2)

INTR(15, IO_ICU2, ACK2)
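
The difference between ACK1 and ACK2 in the table above can be made concrete by listing the port writes each IRQ's acknowledgement produces in the non-AUTO_EOI case. This is a sketch only: the port numbers 0x20 and 0xA0 and IRQ_SLAVE = 2 are the usual PC values and are assumed here rather than taken from the slides, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define IO_ICU1   0x20  /* assumed: master ICU command port on a PC */
#define IO_ICU2   0xA0  /* assumed: slave ICU command port on a PC */
#define IRQ_SLAVE 2     /* assumed: slave ICU cascades into master IRQ 2 */

struct port_write {
    uint16_t port;
    uint8_t  value;
};

/* Fill out[] with the specific-EOI writes for irq, in order.
 * Returns the number of writes (1 for ACK1, 2 for ACK2). */
static int eoi_writes(int irq, struct port_write out[2])
{
    int n = 0;

    assert(irq >= 0 && irq < 16);
    if (irq >= 8) {
        /* ACK2: do the second ICU first, then EOI the cascade on ICU1. */
        out[n++] = (struct port_write){ IO_ICU2, (uint8_t)(0x60 | (irq % 8)) };
        out[n++] = (struct port_write){ IO_ICU1, (uint8_t)(0x60 | IRQ_SLAVE) };
    } else {
        /* ACK1: a single specific EOI on ICU1. */
        out[n++] = (struct port_write){ IO_ICU1, (uint8_t)(0x60 | (irq % 8)) };
    }
    return n;
}
```

For the APIC on IRQ 5 this yields a single write of 0x65 to the first ICU. All four interrupts used on the SPC-I (IRQs 0, 3, 4, 5) are on the first ICU, so only the ACK1 path is exercised there.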


sys/arch/i386/isa/vector.s

/*
 * Add a mask to cpl, and return the old value of cpl.
 */

static __inline int

splraise(ncpl)

register int ncpl;

{

register int ocpl = cpl;

cpl = ocpl | ncpl;

return (ocpl);

}

/*
 * Restore a value to cpl (unmasking interrupts).
 * If any unmasked interrupts are pending,
 * call Xspllower() to process them.
 */

static __inline void

splx(ncpl)

register int ncpl;

{

cpl = ncpl;

if (ipending & ~ncpl)

Xspllower();

}

/*
 * Same as splx(), but we return the old value of spl, for the
 * benefit of some splsoftclock() callers.
 */

static __inline int
spllower(ncpl)
register int ncpl;
{
register int ocpl = cpl;
cpl = ncpl;
if (ipending & ~ncpl)
Xspllower();
return (ocpl);
}

Note: Call Xspllower if there is something pending that is higher priority than our new cpl.
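
How splraise() and splx() interact with a held interrupt can be sketched with plain variables in place of the kernel globals. All names ending in _model or _stub are hypothetical, and the stub simply clears the pending bits that are no longer masked instead of re-running the handlers as the real Xspllower() does.

```c
#include <assert.h>

static int cpl_model;        /* stand-in for cpl */
static int ipending_model;   /* stand-in for ipending */
static int xspllower_calls;  /* how often the stub was invoked */

/* Stand-in for Xspllower(): pretend every pending, unmasked IRQ was run. */
static void xspllower_stub(void)
{
    assert(ipending_model & ~cpl_model); /* only called with work to do */
    xspllower_calls++;
    ipending_model &= cpl_model;         /* keep only still-masked bits */
}

/* Add a mask to cpl, and return the old value of cpl. */
static int splraise_model(int ncpl)
{
    int ocpl = cpl_model;

    cpl_model = ocpl | ncpl;
    return ocpl;
}

/* Restore cpl; process anything that was held while it was raised. */
static void splx_model(int ncpl)
{
    cpl_model = ncpl;
    if (ipending_model & ~ncpl)
        xspllower_stub();
}
```

The usual pattern is `s = splraise_model(mask); ... splx_model(s);`. If IRQ 5 arrives while its bit is set in cpl, the hold path records it in ipending, and the later splx is what finally gets it processed.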


sys/arch/i386/isa/icu.s: spllower()

IDTVEC(spllower) // Xspllower()

pushl %ebx

pushl %esi

pushl %edi

movl _C_LABEL(cpl),%ebx # save priority

movl $1f,%esi # address to resume loop at

1: movl %ebx,%eax

notl %eax

andl _C_LABEL(ipending),%eax

jz 2f

bsfl %eax,%eax

btrl %eax,_C_LABEL(ipending)

jnc 1b

jmp *_C_LABEL(Xrecurse)(,%eax,4)

2: popl %edi

popl %esi

popl %ebx

ret

Notes:

• Is there a pending interrupt that is high enough priority?

• If yes, then restart it?
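
The loop in Xspllower() can be paraphrased in C as below. This is a simplified reading, not the kernel code: the variables are stand-ins for cpl and ipending, and the hypothetical recurse_stub() only logs the IRQ where the real code jumps through the Xrecurse table, runs the handler, and comes back to the top of the loop.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t cpl_m;        /* stand-in for cpl */
static uint32_t ipending_m;   /* stand-in for ipending */
static int dispatch_log[32];  /* IRQs restarted, in order */
static int dispatch_count;

/* Stand-in for "jmp *Xrecurse(,%eax,4)": just record which IRQ ran. */
static void recurse_stub(int irq)
{
    assert(dispatch_count < 32);
    dispatch_log[dispatch_count++] = irq;
}

static void xspllower_model(void)
{
    for (;;) {
        /* notl / andl: pending interrupts not masked by the current cpl */
        uint32_t ready = ipending_m & ~cpl_m;
        if (ready == 0)
            break;                      /* jz 2f: nothing left to restart */

        int irq = 0;                    /* bsfl: lowest set bit first */
        while (!(ready & (1u << irq)))
            irq++;

        ipending_m &= ~(1u << irq);     /* btrl: clear the pending bit */
        recurse_stub(irq);              /* restart the held interrupt */
    }
}
```

Because the scan takes the lowest set bit first, held interrupts are restarted in ascending IRQ order, and anything still masked by cpl stays in ipending for a later call.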