A Reconfigurable Accelerator Card for High Performance Computing
Michael Aitken
Supervisor: Prof M. Inggs
Co-Supervisor: Dr A. Langman
The Power of FPGAs
• Example: Virtex 5 – LX330T
– 331,000 Logic Cells
– 207,000 Flip-flops
– 11.6 Mb hardwired RAM
– 960 I/O pins
– Theoretical I/O Bandwidth: 960 I/O pins at 800 Mbps = 768 Gbps
– 24 On-board 3.2 Gbps Transceivers giving 153.6 Gbps
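The bandwidth figures above follow from simple multiplication. The sketch below reproduces the arithmetic in Python. The function name is illustrative, and reading the 153.6 Gbps transceiver figure as counting both transmit and receive directions is an assumption, inferred from 24 x 3.2 Gbps being 76.8 Gbps one way.

```python
def aggregate_mbps(channels, rate_mbps, directions=1):
    """Theoretical aggregate bandwidth in Mbps, ignoring protocol overhead.

    Integer Mbps values are used so the results are exact.
    """
    return channels * rate_mbps * directions


# 960 I/O pins, each toggling at 800 Mbps
io_total = aggregate_mbps(960, 800)

# 24 transceivers at 3.2 Gbps each, one direction only
mgt_one_way = aggregate_mbps(24, 3200)

# Assumption: the slide's figure counts transmit and receive together
mgt_both_ways = aggregate_mbps(24, 3200, directions=2)

print(f"I/O pins:               {io_total / 1000} Gbps")
print(f"Transceivers (one way): {mgt_one_way / 1000} Gbps")
print(f"Transceivers (TX + RX): {mgt_both_ways / 1000} Gbps")
```

These are upper bounds on raw signalling rate; usable throughput depends on line coding and protocol overhead.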
Why do we want our own accelerator?
• An advancement on existing cards:
– Latest FPGAs available (Xilinx Virtex 5 – more logic, faster clock rate)
– Faster I/O interface needed for HPC (1 GE not good enough)
– Faster memory devices available
– No compulsory engineering costs or bundled software
• Core computing concept for the Advanced Computer Engineering Laboratory at the CHPC.
Methodology
• Background Investigation
• Conceptual Design
• Design Review and Adjustment
• Component Sourcing begins
• Schematic Capture
• Design Specification & Layout Outsourcing
• Gateware development
• PCB Fabrication and Assembly
• Testing
AMD’s Direct Connect Architecture
• Improved Latency
• Peripheral HTX device has direct access to system RAM via DMA
Conceptual Design
[Block diagram; labels recovered from the figure:]
• Virtex5 LX110T or SX95T
• Virtex5 LX50
• HTX (HyperTransport)
• 6 × QDRII+ 4-word burst (2M x 18 bit)
• 2 × CX4 Connector
• 2 × XAUI (4 x 3.125 Gbps)
• Rocket IO
• Clock Generation
• Config
• XCF16P PROM (according to: http://www.xilinx.com/products/silicon_solutions/proms/pfp/virtex.htm)
• JTAG chain, JTAG Test Port
• Serial RS232
• Status LEDs
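The link and memory labels in the diagram can be turned into a rough budget. The sketch below makes two assumptions: that the XAUI links use the standard 8b/10b line coding (8 payload bits per 10 line bits), and that all six QDRII+ devices labelled in the diagram are populated. All names are illustrative.

```python
XAUI_LANES = 4
LANE_RATE_MBPS = 3125  # 3.125 Gbps per lane, as labelled in the diagram


def xaui_payload_mbps(lanes=XAUI_LANES, lane_rate_mbps=LANE_RATE_MBPS):
    """Payload rate of one XAUI link after 8b/10b coding overhead."""
    raw = lanes * lane_rate_mbps  # raw line rate across all lanes
    return raw * 8 // 10          # 8 payload bits for every 10 line bits


def qdr_capacity_mbit(devices, depth_mwords=2, width_bits=18):
    """Total QDRII+ capacity in Mbit for devices organised as 2M x 18 bit."""
    return devices * depth_mwords * width_bits


print(f"XAUI raw rate:     {XAUI_LANES * LANE_RATE_MBPS / 1000} Gbps per link")
print(f"XAUI payload rate: {xaui_payload_mbps() / 1000} Gbps per link")
print(f"QDRII+ capacity:   {qdr_capacity_mbit(6)} Mbit across 6 devices")
```

Under these assumptions each CX4 port carries a 10 Gbps payload stream, which is the kind of link the "1 GE not good enough" requirement calls for.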
CX4 Connector Test: Cable Loopback Test
Status Bits indicate both cores are synchronized to all 4 incoming signals
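The pass condition of the loopback test can be written as a check on the status bits. The sketch below assumes a hypothetical layout in which each XAUI core reports one sync bit per lane in the low four bits of a status word. The actual register layout of the cores is not given in these slides.

```python
LANES = 4
ALL_LANES_SYNCED = (1 << LANES) - 1  # 0b1111: one bit per lane (hypothetical layout)


def core_synced(status_word):
    """True when every lane sync bit in the status word is set."""
    return (status_word & ALL_LANES_SYNCED) == ALL_LANES_SYNCED


def loopback_passed(status_core_a, status_core_b):
    """The cable loopback test passes only if both cores see all 4 lanes."""
    return core_synced(status_core_a) and core_synced(status_core_b)


# Example readings: both cores locked, then one core missing lane 3
print(loopback_passed(0b1111, 0b1111))
print(loopback_passed(0b1111, 0b0111))
```

Masking before the comparison means any unrelated higher bits in the status word do not affect the result.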
Related work by other students
• Jane Hewitson – Preparing a FORTRAN processing engine
• Nick Thorne – Preparing a HyperTransport controller core and drivers
• Brandon Hamilton – Preparing the BORPH reconfigurable operating system