Digital Systems Design Overview
ENGIN 341 – Advanced Digital Design
University of Massachusetts Boston
Department of Engineering
Dr. Filip Cuckov
Overview
1. Introduction to Programmable Logic Devices
2. Field Programmable Gate Arrays (FPGAs)
3. Implementing Functions on FPGAs
4. The Zynq System on Chip (SoC)
5. FPGA Design Flow
1. Introduction to Programmable Logic Devices
Programmable Devices Comparison
• SPLD
• PLA
• PAL
• GAL
• PROM
• EPROM
• EEPROM
• CPLD
• FPGA
Implementing Functions Using ROMs
In General
Programmable Logic Arrays
F0 = A’B’ + AC’
F1 = B + AC’
F2 = A’B’ + BC’
F3 = AC + B
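As a rough illustration of the PLA idea, the four functions above can be modelled in Python as an AND plane that generates each distinct product term once, followed by an OR plane that sums the shared terms. The function names `and_plane` and `pla` are illustrative only and do not come from the slides.

```python
from itertools import product

def and_plane(a, b, c):
    """AND plane: each distinct product term is generated once (inputs are 0 or 1)."""
    na, nb, nc = 1 - a, 1 - b, 1 - c
    return {
        "A'B'": na & nb,
        "AC'": a & nc,
        "B": b,
        "BC'": b & nc,
        "AC": a & c,
    }

def pla(a, b, c):
    """OR plane: sums shared product terms and returns (F0, F1, F2, F3)."""
    p = and_plane(a, b, c)
    f0 = p["A'B'"] | p["AC'"]
    f1 = p["B"] | p["AC'"]
    f2 = p["A'B'"] | p["BC'"]
    f3 = p["AC"] | p["B"]
    return f0, f1, f2, f3

if __name__ == "__main__":
    print("A B C | F0 F1 F2 F3")
    for a, b, c in product((0, 1), repeat=3):
        print(a, b, c, "|", *pla(a, b, c))
```

Only five product terms are needed for the four outputs, because A’B’, AC’, and B are each used by two functions.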
PLA Example Implementation
PLA Architecture
GAL 22V10
Output Logic Macrocell Detail
CPLD Example – Xilinx CoolRunner XCR3064XL
2. Field Programmable Gate Arrays
• Field Programmable Logic Block Arrays (FPLBA ???)
• Name due to early PLD-type block usage in Altera devices.
• Modern FPGAs mostly use LUTs
• First FPGA: Xilinx XC2000 ca. 1985
• FPGA Market Share (2013 data)
Figure: FPGA market share pie chart. Segment labels: 36%, 31%, 10%, 7%, 6%, 6%, 2%, 2%. Legend: Xilinx, Altera, Actel, Vantis, Lattice, Lucent, QuickLogic, Cypress.
• FPGAs vs CPLDs
FPGA Architectures
FPGA Programmable Logic Block* Models
• Usually composed from Look-Up-Tables (LUTs)
• Actel: MUX based
*As in: basic building blocks
LUT-Based Logic Block
MUX-Based Logic Block
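The two block styles can be contrasted with a small Python sketch. It is only a behavioural model under simple assumptions: a LUT is treated as a stored truth table indexed by its inputs, and the MUX-based block is reduced to a single 2:1 multiplexer whose data inputs are wired to a variable or its complement. The helper names (`lut`, `mux2`, `xor_lut`, `xor_mux`) are hypothetical.

```python
def lut(table, *inputs):
    """LUT model: the inputs form an address (first input is the MSB) into a stored truth table."""
    addr = 0
    for bit in inputs:
        addr = (addr << 1) | bit
    return table[addr]

def mux2(sel, d0, d1):
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0

XOR_TABLE = [0, 1, 1, 0]  # outputs for input rows 00, 01, 10, 11

def xor_lut(a, b):
    # LUT style: the function is whatever bits are stored in the table.
    return lut(XOR_TABLE, a, b)

def xor_mux(a, b):
    # MUX style: a selects between b (when a = 0) and b' (when a = 1).
    return mux2(a, b, 1 - b)
```

Changing the function in the LUT style means changing the stored bits; in the MUX style it means changing what is wired to the data and select inputs.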
Actual FPGA Logic Block Variants
Simplified view of sample Xilinx (slice) and Altera (logic element) Logic Blocks
FPGA Programmable Interconnect
FPGA Programmable I/O Blocks
3. Implementing Functions on FPGAs
• Book Chapter 6
• Boole’s Expansion Theorem / Shannon’s Expansion or Decomposition
F(a,b,c,d,e) = a’∙F(0,b,c,d,e) + a∙F(1,b,c,d,e) = a’∙F0 + a∙F1
• Example:
  • F(a,b,c,d,e) = abc’e + a’b’cd + cde’ + a’b
  • F0 = 0∙bc’e + 1∙b’cd + cde’ + 1∙b = b’cd + cde’ + b
  • F1 = 1∙bc’e + 0∙b’cd + cde’ + 0∙b = bc’e + cde’
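The expansion in this example can be checked by brute force. The sketch below, with illustrative function names, evaluates F and the two simplified cofactors over all 32 input combinations and confirms that a’∙F0 + a∙F1 reproduces F.

```python
from itertools import product

def F(a, b, c, d, e):
    """F = abc'e + a'b'cd + cde' + a'b, evaluated on 0/1 integers."""
    na, nb, nc, ne = 1 - a, 1 - b, 1 - c, 1 - e
    return (a & b & nc & e) | (na & nb & c & d) | (c & d & ne) | (na & b)

def F0(b, c, d, e):
    """Cofactor for a = 0 in its simplified form: b'cd + cde' + b."""
    return ((1 - b) & c & d) | (c & d & (1 - e)) | b

def F1(b, c, d, e):
    """Cofactor for a = 1 in its simplified form: bc'e + cde'."""
    return (b & (1 - c) & e) | (c & d & (1 - e))

def expansion_holds():
    """True if the cofactors match F(0,...) and F(1,...) and recombine to F."""
    for a, b, c, d, e in product((0, 1), repeat=5):
        if F(0, b, c, d, e) != F0(b, c, d, e):
            return False
        if F(1, b, c, d, e) != F1(b, c, d, e):
            return False
        recombined = ((1 - a) & F0(b, c, d, e)) | (a & F1(b, c, d, e))
        if recombined != F(a, b, c, d, e):
            return False
    return True

if __name__ == "__main__":
    print("expansion holds:", expansion_holds())
```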
George Boole (1815–1864) vs Claude Elwood Shannon (1916–2001)
Expansion/Decomposition Example
• Implement F using 4-input LUTs (LUT4)
• F0 = b’cd + cde’ + b | F1 = bc’e + cde’ | F = a’∙F0 + a∙F1
LUT contents for F0 (b,c,d,e) and F1 (b,c,d,e):
b c d e | F0 F1
0 0 0 0 | 0  0
0 0 0 1 | 0  0
0 0 1 0 | 0  0
0 0 1 1 | 0  0
0 1 0 0 | 0  0
0 1 0 1 | 0  0
0 1 1 0 | 1  1
0 1 1 1 | 1  0
1 0 0 0 | 1  0
1 0 0 1 | 1  1
1 0 1 0 | 1  0
1 0 1 1 | 1  1
1 1 0 0 | 1  0
1 1 0 1 | 1  0
1 1 1 0 | 1  1
1 1 1 1 | 1  0
Original Function F(a,b,c,d,e):
a b c d e | F
0 0 0 0 0 | 0
0 0 0 0 1 | 0
0 0 0 1 0 | 0
0 0 0 1 1 | 0
0 0 1 0 0 | 0
0 0 1 0 1 | 0
0 0 1 1 0 | 1
0 0 1 1 1 | 1
0 1 0 0 0 | 1
0 1 0 0 1 | 1
0 1 0 1 0 | 1
0 1 0 1 1 | 1
0 1 1 0 0 | 1
0 1 1 0 1 | 1
0 1 1 1 0 | 1
0 1 1 1 1 | 1
1 0 0 0 0 | 0
1 0 0 0 1 | 0
1 0 0 1 0 | 0
1 0 0 1 1 | 0
1 0 1 0 0 | 0
1 0 1 0 1 | 0
1 0 1 1 0 | 1
1 0 1 1 1 | 0
1 1 0 0 0 | 0
1 1 0 0 1 | 1
1 1 0 1 0 | 0
1 1 0 1 1 | 1
1 1 1 0 0 | 0
1 1 1 0 1 | 0
1 1 1 1 0 | 1
1 1 1 1 1 | 0
LUT contents for F (a,F1,F0,*), where * is a LUT input we don’t care about:
a F1 F0 * | F
0 0  0  0 | 0
0 0  0  1 | 0
0 0  1  0 | 1
0 0  1  1 | 1
0 1  0  0 | 0
0 1  0  1 | 0
0 1  1  0 | 1
0 1  1  1 | 1
1 0  0  0 | 0
1 0  0  1 | 0
1 0  1  0 | 0
1 0  1  1 | 0
1 1  0  0 | 1
1 1  0  1 | 1
1 1  1  0 | 1
1 1  1  1 | 1
Diagram: two LUT4s with inputs b, c, d, e produce F0 and F1; a third LUT4 with inputs a, F1, and F0 (fourth input unused) produces F.
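A minimal Python sketch of this three-LUT4 arrangement follows. It assumes a LUT is simply a 16-entry table addressed by its four inputs (first input as the most significant address bit) and ties the unused fourth input of the combining LUT to 0. The helper names `make_lut`, `read_lut`, and `network` are illustrative.

```python
from itertools import product

def F(a, b, c, d, e):
    """F = abc'e + a'b'cd + cde' + a'b, evaluated on 0/1 integers."""
    na, nb, nc, ne = 1 - a, 1 - b, 1 - c, 1 - e
    return (a & b & nc & e) | (na & nb & c & d) | (c & d & ne) | (na & b)

def make_lut(func, n):
    """Tabulate func over all n-bit input rows, in counting order."""
    return [func(*bits) for bits in product((0, 1), repeat=n)]

def read_lut(table, *inputs):
    """Look up one row; the first input is the most significant address bit."""
    addr = 0
    for bit in inputs:
        addr = (addr << 1) | bit
    return table[addr]

# Two LUT4s hold the cofactors F0 = F(0,b,c,d,e) and F1 = F(1,b,c,d,e).
LUT_F0 = make_lut(lambda b, c, d, e: F(0, b, c, d, e), 4)
LUT_F1 = make_lut(lambda b, c, d, e: F(1, b, c, d, e), 4)
# The third LUT4 computes a'.F0 + a.F1; its fourth input is a don't-care.
LUT_F = make_lut(lambda a, f1, f0, unused: f1 if a else f0, 4)

def network(a, b, c, d, e):
    """Evaluate F through the three LUT4s."""
    f0 = read_lut(LUT_F0, b, c, d, e)
    f1 = read_lut(LUT_F1, b, c, d, e)
    return read_lut(LUT_F, a, f1, f0, 0)  # unused input tied to 0

if __name__ == "__main__":
    ok = all(network(*x) == F(*x) for x in product((0, 1), repeat=5))
    print("LUT4 network matches F:", ok)
```

The three tables printed by `LUT_F0`, `LUT_F1`, and `LUT_F` correspond to the three truth tables of this example.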
Further Decomposition
• F(a,b,c,d,e) = a’b’∙F(0,0,c,d,e) + a’b∙F(0,1,c,d,e) + ab’∙F(1,0,c,d,e) + ab∙F(1,1,c,d,e)
  = a’b’∙F00 + a’b∙F01 + ab’∙F10 + ab∙F11
• This would be done in order to implement F using 3-input LUTs (LUT3)
• Same Example:
  • F(a,b,c,d,e) = abc’e + a’b’cd + cde’ + a’b
  • F00 = 0∙0∙c’e + 1∙1∙cd + cde’ + 1∙0 = cd + cde’
  • F01 = 0∙1∙c’e + 1∙0∙cd + cde’ + 1∙1 = 1
  • F10 = 1∙0∙c’e + 0∙1∙cd + cde’ + 0∙0 = cde’
  • F11 = 1∙1∙c’e + 0∙0∙cd + cde’ + 0∙1 = c’e + cde’
• F00 = cd + cde’ | F01 = 1 | F10 = cde’ | F11 = c’e + cde’
• F = a’b’∙F00 + a’b∙F01 + ab’∙F10 + ab∙F11 = ( F1 + F2 + F3 ) + F4 = F123 + F4
• Here F1 = a’b’∙F00, F2 = a’b∙F01, F3 = ab’∙F10, and F4 = ab∙F11 name the four product terms; they are not the cofactors F0 and F1 of the previous example.
LUT contents for F00 (c,d,e):
c d e | F00
0 0 0 | 0
0 0 1 | 0
0 1 0 | 0
0 1 1 | 0
1 0 0 | 0
1 0 1 | 0
1 1 0 | 1
1 1 1 | 1
LUT contents for F01 (c,d,e), F10 (c,d,e), and F11 (c,d,e):
c d e | F01 F10 F11
0 0 0 | 1   0   0
0 0 1 | 1   0   1
0 1 0 | 1   0   0
0 1 1 | 1   0   1
1 0 0 | 1   0   0
1 0 1 | 1   0   0
1 1 0 | 1   1   1
1 1 1 | 1   0   0

LUT contents for F1 (a,b,F00), F2 (a,b,F01), F3 (a,b,F10), and F4 (a,b,F11). The third input column x stands for the cofactor wired to each LUT:
a b x | F1 F2 F3 F4
0 0 0 | 0  0  0  0
0 0 1 | 1  0  0  0
0 1 0 | 0  0  0  0
0 1 1 | 0  1  0  0
1 0 0 | 0  0  0  0
1 0 1 | 0  0  1  0
1 1 0 | 0  0  0  0
1 1 1 | 0  0  0  1

LUT contents for F123 (F1,F2,F3):
F1 F2 F3 | F123
0  0  0  | 0
0  0  1  | 1
0  1  0  | 1
0  1  1  | 1
1  0  0  | 1
1  0  1  | 1
1  1  0  | 1
1  1  1  | 1

LUT contents for F (F123,F4,*):
F123 F4 * | F
0    0  0 | 0
0    0  1 | 0
0    1  0 | 1
0    1  1 | 1
1    0  0 | 1
1    0  1 | 1
1    1  0 | 1
1    1  1 | 1
Further Decomposition Example
Diagram and Alternative MUX Implementation
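Diagram (LUT3 network): four LUT3s with inputs c, d, e produce F00, F01, F10, and F11; four LUT3s with inputs a, b, and one cofactor each produce F1, F2, F3, and F4; one LUT3 produces F123 and a final LUT3 produces F.
Diagram (alternative MUX implementation): the same four LUT3s with inputs c, d, e produce F00, F01, F10, and F11; multiplexers selected by b and then by a choose among them to produce F.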
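Both variants can be sketched in Python and compared against F. This is a behavioural model under the same assumptions as a plain table lookup: each LUT3 is an 8-entry table, the MUX tree selects first with b and then with a, and the helper names are illustrative rather than taken from the slides.

```python
from itertools import product

def F(a, b, c, d, e):
    """F = abc'e + a'b'cd + cde' + a'b, evaluated on 0/1 integers."""
    na, nb, nc, ne = 1 - a, 1 - b, 1 - c, 1 - e
    return (a & b & nc & e) | (na & nb & c & d) | (c & d & ne) | (na & b)

def make_lut(func, n):
    """Tabulate func over all n-bit input rows, in counting order."""
    return [func(*bits) for bits in product((0, 1), repeat=n)]

def read_lut(table, *inputs):
    """Look up one row; the first input is the most significant address bit."""
    addr = 0
    for bit in inputs:
        addr = (addr << 1) | bit
    return table[addr]

def mux2(sel, d0, d1):
    return d1 if sel else d0

AB = list(product((0, 1), repeat=2))
# Stage 1: cofactors F00, F01, F10, F11 as LUT3s over (c, d, e).
COF = {(a, b): make_lut(lambda c, d, e, a=a, b=b: F(a, b, c, d, e), 3) for a, b in AB}
# Stage 2: each LUT3 ANDs one (a, b) combination with its cofactor.
TERM = {(a, b): make_lut(lambda x, y, f, a=a, b=b: int(x == a and y == b) & f, 3)
        for a, b in AB}
OR3 = make_lut(lambda p, q, r: p | q | r, 3)       # F123
OR2 = make_lut(lambda p, q, unused: p | q, 3)      # F, third input is a don't-care

def lut_network(a, b, c, d, e):
    t = {k: read_lut(TERM[k], a, b, read_lut(COF[k], c, d, e)) for k in AB}
    f123 = read_lut(OR3, t[(0, 0)], t[(0, 1)], t[(1, 0)])
    return read_lut(OR2, f123, t[(1, 1)], 0)

def mux_network(a, b, c, d, e):
    f = {k: read_lut(COF[k], c, d, e) for k in AB}
    low = mux2(b, f[(0, 0)], f[(0, 1)])    # a = 0 half
    high = mux2(b, f[(1, 0)], f[(1, 1)])   # a = 1 half
    return mux2(a, low, high)
```

The LUT-only version uses ten LUT3s, while the MUX version needs only the four cofactor LUT3s plus three 2:1 multiplexers, which is why dedicated multiplexers are attractive for wide functions.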
4. The Zynq System on Chip (SoC)
• Xilinx Zynq System on Chip (SoC) Overview
  • Zynq Overview – 16m 25s (Optional – answers why Zynq was created)
  • Zynq Architecture – 10m 49s (Brief overview of the PS and PL components)
  • Zynq Processing System (PS) – 7m 25s (Optional – Brief overview of Zynq PS)
  • Zynq Programmable Logic (PL) – 9m 23s (Brief overview of Zynq PL)
• Zynq PL Architecture
  • Xilinx 7 Series FPGA Overview – 28m 21s (Optional – General overview, watch at 1.5x speed)
  • CLB Architecture – 21m 52s (A must see, IMDB rating of 9.1)
• Additional Xilinx videos and training:
  • http://www.xilinx.com/support/university.html
  • http://www.xilinx.com/training/
Digilent Zybo
Feature | Description
FPGA
  • Zynq-7000 AP SoC XC7Z010-1CLG400
I/O Interfaces
  • USB-UART for programming, serial comm., and power
  • One 10/100/1G Ethernet
  • USB OTG 2.0
  • USB-UART bridge
  • 16-bit VGA output
  • Dual role (Input/Output) HDMI
  • I2S CODEC
  • Audio Line-In, Line-Out, microphone
Memory
  • 512 Mbyte DDR3
  • 128 Mbit Quad-SPI Flash
  • MicroSD card connector
Switches and LEDs
  • 4 Slide switches accessible from PL
  • 4 LEDs accessible from PL
  • 1 LED accessible from PS
  • 4 Push-buttons accessible from PL
  • 2 Push-buttons accessible from PS
  • 1 Reset button accessible from PL
  • 1 Reset button accessible from PS
Clocks
  • One 50.000 MHz Oscillator for PS
Expansion ports
  • One processor-dedicated Pmod connector
  • One dual (analog/digital) Pmod connector
  • 4 Pmod connectors
Zynq-7000 AP SoC XC7Z010-1CLG400
• Processing System (PS)
  • 650 MHz dual-core Cortex-A9 processor
  • DDR3 memory controller with 8 DMA channels
  • High-bandwidth peripheral controllers: 1G Ethernet, USB 2.0, SDIO
  • Low-bandwidth peripheral controllers: SPI, UART, CAN, I2C
• Programmable Logic (PL)
  • Equivalent to Artix-7 (7 Series) FPGA:
    • 4,400 logic slices, each with four 6-input LUTs and 8 flip-flops
    • 240 KB of fast block RAM
    • Two clock management tiles, each with a PLL and an MMCM
    • 80 DSP slices
    • Internal clock speeds exceeding 450 MHz
    • On-chip analog-to-digital converter (XADC)
Xilinx 7 Series FPGA Configurable Logic Block
• Each CLB is composed of 2 slices
• Slices can be of type
  • SLICEL
  • SLICEM
• Slice features:
  • Real 6-input look-up table (LUT) technology
  • Dual LUT5 (5-input LUT) option
  • Distributed Memory and Shift Register Logic capability
  • Dedicated high-speed carry logic for arithmetic functions
  • Wide multiplexers for efficient utilization
Figure: Xilinx 7 Series FPGA CLB. Source: Xilinx 7 Series FPGA CLB User Guide
• Each of the four 6-input LUTs (A, B, C, and D; inputs A1-A6) provides 2 independent outputs (O5 and O6)
• The LUTs can implement:
  • Any arbitrarily defined six-input Boolean function OR
  • Two arbitrarily defined five-input Boolean functions, as long as these two functions share common inputs
• Slices also contain three multiplexers used to combine up to four LUTs to implement any function of 7 or 8 inputs.
  • F7AMUX: Used to generate 7-input functions from LUTs A and B
  • F7BMUX: Used to generate 7-input functions from LUTs C and D
  • F8MUX: Used to combine all LUTs to generate 8-input functions.
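The dual-output LUT and the wide multiplexers can be modelled behaviourally in Python. This is a simplified sketch, not the vendor's definition: it assumes O5 reads the half of the 64-entry table where A6 = 0, that A6 is tied to 1 in dual-LUT5 mode so O6 reads the other half, and that the 7th and 8th inputs drive the multiplexer selects. The class and function names, and the assignment of LUTs A to D to input combinations, are hypothetical.

```python
from itertools import product

def _addr(bits):
    """Pack 0/1 inputs into a table index, first input as the MSB."""
    addr = 0
    for bit in bits:
        addr = (addr << 1) | bit
    return addr

class Lut6:
    """Behavioural 6-input LUT with 64 stored bits and two outputs."""

    def __init__(self, func6):
        # func6 takes (a6, a5, a4, a3, a2, a1) and returns 0 or 1.
        self.init = [func6(*bits) for bits in product((0, 1), repeat=6)]

    def o6(self, a6, a5, a4, a3, a2, a1):
        return self.init[_addr((a6, a5, a4, a3, a2, a1))]

    def o5(self, a5, a4, a3, a2, a1):
        # Assumption: O5 reads the half of the table where A6 = 0.
        return self.init[_addr((0, a5, a4, a3, a2, a1))]

def dual_lut5(g, h):
    """Pack two 5-input functions sharing inputs: with A6 = 1, O6 = g and O5 = h."""
    return Lut6(lambda a6, *rest: g(*rest) if a6 else h(*rest))

def mux2(sel, d0, d1):
    return d1 if sel else d0

def build_wide(func8):
    """Split func8(x8, x7, x6, ..., x1) across four LUT6s keyed by (x8, x7)."""
    return {
        (x8, x7): Lut6(lambda *x, x8=x8, x7=x7: func8(x8, x7, *x))
        for x8, x7 in product((0, 1), repeat=2)
    }

def eval_wide(luts, x8, x7, *x):
    f7a = mux2(x7, luts[(0, 0)].o6(*x), luts[(0, 1)].o6(*x))  # F7AMUX: LUTs A and B
    f7b = mux2(x7, luts[(1, 0)].o6(*x), luts[(1, 1)].o6(*x))  # F7BMUX: LUTs C and D
    return mux2(x8, f7a, f7b)                                  # F8MUX
```

The wide-function construction is the same Shannon expansion used earlier: each multiplexer level removes one variable from the functions the LUTs must store.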
Xilinx 7 Series FPGA Slice Detail
SLICEL element shown. See source for SLICEM details.
• Carry Chain(s)
  • Dedicated fast lookahead carry logic can perform fast arithmetic addition and subtraction.
  • Cascadable to form wider add/subtract logic.
  • Run upward and have a height of four bits per slice. For each bit, there is a carry MUX and a XOR gate for adding/subtracting the operands with selected carry bits. The dedicated carry path and carry MUX can also be used to cascade function generators for implementing wide logic functions.
• Memory Elements
  • Flip-Flops
    • 8 in total
    • 4 are only DFF and
    • 4 can be configured as DFF or Latch
  • Distributed RAM
    • LUTs can be configured as RAM (SLICEM only)
    • From dual-port 32 x 1-bit RAM up to single-port 256 x 1-bit RAM
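The per-bit carry MUX and XOR gate of the carry chain can be illustrated with a short Python model. It is a sketch under the assumption that the LUT in front of each bit computes the propagate signal a XOR b and that the carry MUX forwards the incoming carry when propagate is 1 and otherwise takes operand bit a. The names `carry4` and `add` are illustrative.

```python
def carry4(a_bits, b_bits, cin):
    """Four carry-chain bits (one slice), LSB first. Returns (sum_bits, carry_out)."""
    sums = []
    carry = cin
    for a, b in zip(a_bits, b_bits):
        propagate = a ^ b                  # computed by the LUT in front of the chain
        sums.append(propagate ^ carry)     # dedicated XOR gate produces the sum bit
        carry = carry if propagate else a  # carry MUX: pass carry-in or take operand bit
    return sums, carry

def add(a, b, width=8, cin=0):
    """Cascade four-bit carry chains to add two width-bit unsigned integers."""
    assert width % 4 == 0, "width must be a multiple of the 4-bit slice height"
    result, carry = 0, cin
    for base in range(0, width, 4):
        a_bits = [(a >> (base + i)) & 1 for i in range(4)]
        b_bits = [(b >> (base + i)) & 1 for i in range(4)]
        sums, carry = carry4(a_bits, b_bits, carry)
        for i, s in enumerate(sums):
            result |= s << (base + i)
    return result, carry

if __name__ == "__main__":
    print(add(200, 100))                       # sum modulo 256 and the carry out
    print(add(50, ~20 & 0xFF, cin=1)[0])       # 50 - 20 via two's complement
```

Subtraction reuses the same chain: invert the second operand and set the carry-in of the lowest slice to 1.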
Other PL Elements
• Clock Management
  • Mixed-Mode Clock Manager (MMCM)
  • Phase Locked Loop (PLL)
• DSP Slices
  • 25x18 bit signed multiplier
  • 48 bit adder/accumulator
  • 25 bit pre-adder
• Block RAM
  • Dual-port 36 Kb blocks
  • Programmable FIFO logic
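To give a feel for how the DSP slice bit widths listed above interact, the following Python sketch models one multiply-accumulate step. It is a deliberately simplified assumption-level model, not the actual slice datapath: the pre-adder is assumed to add two operands, every stage wraps in two's complement at its stated width, and the names `wrap_signed` and `dsp_mac` are hypothetical.

```python
def wrap_signed(value, bits):
    """Interpret value modulo 2**bits as a two's-complement signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def dsp_mac(acc, a, b, d=0):
    """One multiply-accumulate step: acc + (a + d) * b, using the listed widths."""
    pre = wrap_signed(a + d, 25)            # 25-bit pre-adder
    prod = pre * wrap_signed(b, 18)         # 25x18 signed multiplier
    return wrap_signed(acc + prod, 48)      # 48-bit adder/accumulator

def dot(xs, ys):
    """Dot product computed by repeatedly reusing the accumulator."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = dsp_mac(acc, x, y)
    return acc

if __name__ == "__main__":
    print(dot([1, 3, -2], [2, 4, 5]))
```

The wide 48-bit accumulator is what lets many 25x18 products be summed before overflow becomes a concern.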
Zynq Layout Snapshot using Vivado
Figure: device layout with the PS and PL regions labeled.
5. FPGA Design Flow
• Computer Aided Design (CAD) Tools
  • Xilinx Vivado
  • Aldec Active-HDL (Free Student Edition)
• Functional Simulation
• Synthesis
• Post-Synthesis Simulation
• Mapping, Placement, Routing
• Implementation