Digital Systems Design Overview

ENGIN 341 – Advanced Digital Design

University of Massachusetts Boston

Department of Engineering

Dr. Filip Cuckov

Overview

1. Introduction to Programmable Logic Devices
2. Field Programmable Gate Arrays (FPGAs)
3. Implementing Functions on FPGAs
4. The Zynq System on Chip (SoC)
5. FPGA Design Flow

1. Introduction to Programmable Logic Devices

Programmable Devices Comparison

• SPLD
  • PLA
  • PAL
  • GAL
  • PROM
    • EPROM
    • EEPROM
• CPLD
• FPGA

Implementing Functions Using ROMs

In General

Programmable Logic Arrays
F0 = A’B’ + AC’
F1 = B + AC’
F2 = A’B’ + BC’
F3 = AC + B
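The four outputs above reuse a small set of product terms, which is the idea behind a PLA's AND plane feeding an OR plane. A minimal Python sketch of that structure follows; the helper names `and_plane`, `or_plane`, and `pla` are hypothetical and chosen only for illustration.

```python
from itertools import product

def and_plane(a, b, c):
    """Product terms shared by the four outputs (AND plane)."""
    na, nb, nc = 1 - a, 1 - b, 1 - c
    return {
        "A'B'": na & nb,
        "AC'": a & nc,
        "B": b,
        "BC'": b & nc,
        "AC": a & c,
    }

def or_plane(p):
    """Each output ORs a subset of the product terms (OR plane)."""
    return (
        p["A'B'"] | p["AC'"],   # F0 = A'B' + AC'
        p["B"] | p["AC'"],      # F1 = B + AC'
        p["A'B'"] | p["BC'"],   # F2 = A'B' + BC'
        p["AC"] | p["B"],       # F3 = AC + B
    )

def pla(a, b, c):
    """Return (F0, F1, F2, F3) for one input combination."""
    return or_plane(and_plane(a, b, c))

# The same behaviour stored as a ROM: one 4-bit word per input combination.
rom = {abc: pla(*abc) for abc in product((0, 1), repeat=3)}
```

Five product terms serve four outputs here, whereas the ROM version stores all eight words regardless of how much logic is shared.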

PLA Example Implementation

PLA Architecture

GAL 22V10

Output Logic Macrocell Detail

CPLD Example – Xilinx CoolRunner XCR3064XL

2. Field Programmable Gate Arrays

• Field Programmable Logic Block Arrays (FPLBA ???)
  • Name due to early PLD-type block usage in Altera devices.
  • Modern FPGAs mostly use LUTs

• First FPGA: Xilinx XC2000, ca. 1985
• FPGA Market Share (2013 data)
  [Pie chart. Slice labels: 36%, 31%, 10%, 7%, 6%, 6%, 2%, 2%. Legend: Xilinx, Altera, Actel, Vantis, Lattice, Lucent, QuickLogic, Cypress.]
• FPGAs vs CPLDs

FPGA Architectures

FPGA Programmable Logic Block* Models
• Usually composed from Look-Up Tables (LUTs)
• Actel: MUX based
*As in: basic building blocks

[Figures: LUT-Based Logic Block; MUX-Based Logic Block]
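To make the two block models concrete, here is a small Python sketch, offered as an illustration rather than a description of any vendor's circuit. A LUT is modeled as a truth table indexed by its inputs, and the same table is read out through a tree of 2-to-1 multiplexers. The names `lut`, `mux2`, `mux_block`, and `XOR_INIT` are hypothetical.

```python
def lut(init, *inputs):
    """k-input LUT: 'init' holds 2**k truth-table bits; inputs[0] is the address MSB."""
    addr = 0
    for bit in inputs:
        addr = (addr << 1) | bit
    return (init >> addr) & 1

def mux2(sel, d0, d1):
    """2-to-1 multiplexer."""
    return d1 if sel else d0

def mux_block(init, a, b):
    """The same 2-input function built from three 2-to-1 MUXes fed by the table bits."""
    t = [(init >> i) & 1 for i in range(4)]
    return mux2(a, mux2(b, t[0], t[1]), mux2(b, t[2], t[3]))

# Truth table for XOR: bit i is the output for address i = 2*a + b.
XOR_INIT = 0b0110
```

Calling `lut(XOR_INIT, a, b)` and `mux_block(XOR_INIT, a, b)` gives the same result for every input pair, which shows that a LUT is functionally a multiplexer over stored bits.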

Actual FPGA Logic Block Variants

Simplified view of sample Xilinx (slice) and Altera (logic element) Logic Blocks

FPGA Programmable Interconnect

FPGA Programmable I/O Blocks

3. Implementing Functions on FPGAs

• Book Chapter 6
• Boole’s Expansion Theorem / Shannon’s Expansion or Decomposition

F(a,b,c,d,e) = a’∙F(0,b,c,d,e) + a∙F(1,b,c,d,e) = a’∙F0 + a∙F1

• Example:
  • F(a,b,c,d,e) = abc’e + a’b’cd + cde’ + a’b
  • F0 = 0∙bc’e + 1∙b’cd + cde’ + 1∙b = b’cd + cde’ + b
  • F1 = 1∙bc’e + 0∙b’cd + cde’ + 0∙b = bc’e + cde’
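The expansion can be checked exhaustively. The following Python sketch encodes F and the two cofactors from the example and confirms the identity over all 32 input combinations; the function name `expansion_holds` is hypothetical.

```python
from itertools import product

def F(a, b, c, d, e):
    """F = abc'e + a'b'cd + cde' + a'b"""
    na, nb, nc, ne = 1 - a, 1 - b, 1 - c, 1 - e
    return (a & b & nc & e) | (na & nb & c & d) | (c & d & ne) | (na & b)

def F0(b, c, d, e):
    """Cofactor with a = 0: b'cd + cde' + b"""
    return ((1 - b) & c & d) | (c & d & (1 - e)) | b

def F1(b, c, d, e):
    """Cofactor with a = 1: bc'e + cde'"""
    return (b & (1 - c) & e) | (c & d & (1 - e))

def expansion_holds():
    """True if F == a'.F0 + a.F1 for every input combination."""
    return all(
        F(a, b, c, d, e) == ((1 - a) & F0(b, c, d, e)) | (a & F1(b, c, d, e))
        for a, b, c, d, e in product((0, 1), repeat=5)
    )
```

Because each cofactor depends on only four variables, a 5-input function becomes two 4-input functions plus a selection on `a`.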

[Portraits: George Boole (1815–1864) vs Claude Elwood Shannon (1916–2001)]

Expansion/Decomposition Example
• Implement F using 4-input LUTs (LUT4)
• F0 = b’cd + cde’ + b | F1 = bc’e + cde’ | F = a’∙F0 + a∙F1

F0 (b,c,d,e) and F1 (b,c,d,e)

b c d e | F0 F1
0 0 0 0 |  0  0
0 0 0 1 |  0  0
0 0 1 0 |  0  0
0 0 1 1 |  0  0
0 1 0 0 |  0  0
0 1 0 1 |  0  0
0 1 1 0 |  1  1
0 1 1 1 |  1  0
1 0 0 0 |  1  0
1 0 0 1 |  1  1
1 0 1 0 |  1  0
1 0 1 1 |  1  1
1 1 0 0 |  1  0
1 1 0 1 |  1  0
1 1 1 0 |  1  1
1 1 1 1 |  1  0

Original Function

a b c d e | F
0 0 0 0 0 | 0
0 0 0 0 1 | 0
0 0 0 1 0 | 0
0 0 0 1 1 | 0
0 0 1 0 0 | 0
0 0 1 0 1 | 0
0 0 1 1 0 | 1
0 0 1 1 1 | 1
0 1 0 0 0 | 1
0 1 0 0 1 | 1
0 1 0 1 0 | 1
0 1 0 1 1 | 1
0 1 1 0 0 | 1
0 1 1 0 1 | 1
0 1 1 1 0 | 1
0 1 1 1 1 | 1
1 0 0 0 0 | 0
1 0 0 0 1 | 0
1 0 0 1 0 | 0
1 0 0 1 1 | 0
1 0 1 0 0 | 0
1 0 1 0 1 | 0
1 0 1 1 0 | 1
1 0 1 1 1 | 0
1 1 0 0 0 | 0
1 1 0 0 1 | 1
1 1 0 1 0 | 0
1 1 0 1 1 | 1
1 1 1 0 0 | 0
1 1 1 0 1 | 0
1 1 1 1 0 | 1
1 1 1 1 1 | 0

F (a,F1,F0,*)

a F1 F0 * | F
0  0  0 0 | 0
0  0  0 1 | 0
0  0  1 0 | 1
0  0  1 1 | 1
0  1  0 0 | 0
0  1  0 1 | 0
0  1  1 0 | 1
0  1  1 1 | 1
1  0  0 0 | 0
1  0  0 1 | 0
1  0  1 0 | 0
1  0  1 1 | 0
1  1  0 0 | 1
1  1  0 1 | 1
1  1  1 0 | 1
1  1  1 1 | 1

* is a LUT input we don’t care about

[Diagram: LUT4 implementation]
  (b, c, d, e) → LUT4 → F0
  (b, c, d, e) → LUT4 → F1
  (a, F1, F0, *) → LUT4 → F
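The three-LUT4 structure can be simulated directly. In the sketch below, each LUT4 is a 16-bit truth table generated from the cofactor equations of this example; the helper names (`make_init`, `lut`, `F_lut4`, `F_ref`) and the bit ordering (first input is the address MSB) are assumptions made for illustration, not a vendor convention.

```python
from itertools import product

def make_init(func, k):
    """Pack a k-input function into a 2**k-bit truth table (first argument = address MSB)."""
    init = 0
    for addr, bits in enumerate(product((0, 1), repeat=k)):
        init |= func(*bits) << addr
    return init

def lut(init, *inputs):
    """Look up one bit of 'init' using the inputs as an address."""
    addr = 0
    for bit in inputs:
        addr = (addr << 1) | bit
    return (init >> addr) & 1

F0_INIT = make_init(lambda b, c, d, e: ((1 - b) & c & d) | (c & d & (1 - e)) | b, 4)
F1_INIT = make_init(lambda b, c, d, e: (b & (1 - c) & e) | (c & d & (1 - e)), 4)
# Third LUT4 computes a'.F0 + a.F1; its fourth input (*) is unused.
SEL_INIT = make_init(lambda a, f1, f0, x: ((1 - a) & f0) | (a & f1), 4)

def F_lut4(a, b, c, d, e):
    """Evaluate F through the three-LUT4 network."""
    f0 = lut(F0_INIT, b, c, d, e)
    f1 = lut(F1_INIT, b, c, d, e)
    return lut(SEL_INIT, a, f1, f0, 0)

def F_ref(a, b, c, d, e):
    """Reference: F = abc'e + a'b'cd + cde' + a'b"""
    return (a & b & (1 - c) & e) | ((1 - a) & (1 - b) & c & d) | (c & d & (1 - e)) | ((1 - a) & b)
```

With this bit ordering the three tables come out as 0xFFC0, 0x4A40, and 0xF0CC, matching the rows marked 1 in the truth tables above.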

Further Decomposition

• F(a,b,c,d,e) = a’b’∙F(0,0,c,d,e) + a’b∙F(0,1,c,d,e) + ab’∙F(1,0,c,d,e) + ab∙F(1,1,c,d,e)
               = a’b’∙F00 + a’b∙F01 + ab’∙F10 + ab∙F11
• This would be done in order to implement F using 3-input LUTs (LUT3)
• Same Example:
  • F(a,b,c,d,e) = abc’e + a’b’cd + cde’ + a’b
  • F00 = 0∙0∙c’e + 1∙1∙cd + cde’ + 1∙0 = cd + cde’
  • F01 = 0∙1∙c’e + 1∙0∙cd + cde’ + 1∙1 = 1
  • F10 = 1∙0∙c’e + 0∙1∙cd + cde’ + 0∙0 = cde’
  • F11 = 1∙1∙c’e + 0∙0∙cd + cde’ + 0∙1 = c’e + cde’
• F00 = cd + cde’ | F01 = 1 | F10 = cde’ | F11 = c’e + cde’
• F = a’b’∙F00 + a’b∙F01 + ab’∙F10 + ab∙F11 = ( F1 + F2 + F3 ) + F4 = F123 + F4
  • Here F1 = a’b’∙F00, F2 = a’b∙F01, F3 = ab’∙F10, and F4 = ab∙F11 name the four product terms; this F1 is not the cofactor F1 of the LUT4 example.

F00 (c,d,e)

c d e | F00
0 0 0 |  0
0 0 1 |  0
0 1 0 |  0
0 1 1 |  0
1 0 0 |  0
1 0 1 |  0
1 1 0 |  1
1 1 1 |  1


F01 (c,d,e), F10 (c,d,e), F11 (c,d,e)

c d e | F01 F10 F11
0 0 0 |  1   0   0
0 0 1 |  1   0   1
0 1 0 |  1   0   0
0 1 1 |  1   0   1
1 0 0 |  1   0   0
1 0 1 |  1   0   0
1 1 0 |  1   1   1
1 1 1 |  1   0   0

F1 (a,b,F00), F2 (a,b,F01), F3 (a,b,F10), F4 (a,b,F11)
(the third input of each LUT is its matching cofactor)

a b Fxx | F1 F2 F3 F4
0 0  0  |  0  0  0  0
0 0  1  |  1  0  0  0
0 1  0  |  0  0  0  0
0 1  1  |  0  1  0  0
1 0  0  |  0  0  0  0
1 0  1  |  0  0  1  0
1 1  0  |  0  0  0  0
1 1  1  |  0  0  0  1

F123 (F1,F2,F3)

F1 F2 F3 | F123
 0  0  0 |  0
 0  0  1 |  1
 0  1  0 |  1
 0  1  1 |  1
 1  0  0 |  1
 1  0  1 |  1
 1  1  0 |  1
 1  1  1 |  1

F (F123,F4,*)

F123 F4 * | F
  0   0 0 | 0
  0   0 1 | 0
  0   1 0 | 1
  0   1 1 | 1
  1   0 0 | 1
  1   0 1 | 1
  1   1 0 | 1
  1   1 1 | 1
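The LUT3 tables above can be wired together and checked against the original function. This Python sketch stores each LUT3 as a list of eight outputs for inputs 000 through 111; the table names `G00` to `G11`, `OR3`, and `OR2` are hypothetical labels for the combining LUTs.

```python
from itertools import product

def lut3(table, x, y, z):
    """3-input LUT: 'table' lists the 8 outputs for inputs 000..111."""
    return table[(x << 2) | (y << 1) | z]

# Cofactor tables over (c, d, e)
F00 = [0, 0, 0, 0, 0, 0, 1, 1]   # cd + cde'
F01 = [1, 1, 1, 1, 1, 1, 1, 1]   # 1
F10 = [0, 0, 0, 0, 0, 0, 1, 0]   # cde'
F11 = [0, 1, 0, 1, 0, 0, 1, 0]   # c'e + cde'

# Gating tables over (a, b, cofactor): pass the cofactor for one (a, b) pair only
G00 = [0, 1, 0, 0, 0, 0, 0, 0]   # F1 = a'b'.F00
G01 = [0, 0, 0, 1, 0, 0, 0, 0]   # F2 = a'b.F01
G10 = [0, 0, 0, 0, 0, 1, 0, 0]   # F3 = ab'.F10
G11 = [0, 0, 0, 0, 0, 0, 0, 1]   # F4 = ab.F11
OR3 = [0, 1, 1, 1, 1, 1, 1, 1]   # F123 = F1 + F2 + F3
OR2 = [0, 0, 1, 1, 1, 1, 1, 1]   # F = F123 + F4 (third input unused)

def F_lut3(a, b, c, d, e):
    """Evaluate F through the ten-LUT3 network."""
    f1 = lut3(G00, a, b, lut3(F00, c, d, e))
    f2 = lut3(G01, a, b, lut3(F01, c, d, e))
    f3 = lut3(G10, a, b, lut3(F10, c, d, e))
    f4 = lut3(G11, a, b, lut3(F11, c, d, e))
    f123 = lut3(OR3, f1, f2, f3)
    return lut3(OR2, f123, f4, 0)

def F_ref(a, b, c, d, e):
    """Reference: F = abc'e + a'b'cd + cde' + a'b"""
    return (a & b & (1 - c) & e) | ((1 - a) & (1 - b) & c & d) | (c & d & (1 - e)) | ((1 - a) & b)
```

This network uses ten LUT3s across four levels, compared with three LUT4s across two levels, which is the cost of restricting each block to three inputs.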

Further Decomposition Example

Diagram and Alternative MUX Implementation

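[Diagram: LUT3-only implementation, ten LUT3s]
  (c, d, e) → LUT3 → F00        (a, b, F00) → LUT3 → F1
  (c, d, e) → LUT3 → F01        (a, b, F01) → LUT3 → F2
  (c, d, e) → LUT3 → F10        (a, b, F10) → LUT3 → F3
  (c, d, e) → LUT3 → F11        (a, b, F11) → LUT3 → F4
  (F1, F2, F3) → LUT3 → F123
  (F123, F4, *) → LUT3 → F

[Diagram: alternative MUX implementation, four LUT3s and three 2-to-1 MUXes]
  (c, d, e) → LUT3 → F00, F01, F10, F11 (one LUT3 each)
  MUX selected by b: F00 or F01
  MUX selected by b: F10 or F11
  MUX selected by a: output of the two MUXes above → F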
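The MUX alternative keeps the four cofactor LUT3s and replaces the combining LUTs with a 4-to-1 selection on (a, b). A minimal Python sketch, assuming the MUX data inputs are ordered as Shannon's expansion suggests (the diagram residue does not show the exact wiring); `mux2` and `F_mux` are hypothetical names.

```python
from itertools import product

# Cofactor tables over (c, d, e), outputs for inputs 000..111
F00 = [0, 0, 0, 0, 0, 0, 1, 1]   # cd + cde'
F01 = [1, 1, 1, 1, 1, 1, 1, 1]   # 1
F10 = [0, 0, 0, 0, 0, 0, 1, 0]   # cde'
F11 = [0, 1, 0, 1, 0, 0, 1, 0]   # c'e + cde'

def mux2(sel, d0, d1):
    """2-to-1 multiplexer."""
    return d1 if sel else d0

def F_mux(a, b, c, d, e):
    """Four LUT3 lookups followed by a two-level MUX tree."""
    idx = (c << 2) | (d << 1) | e
    lo = mux2(b, F00[idx], F01[idx])   # branch for a = 0
    hi = mux2(b, F10[idx], F11[idx])   # branch for a = 1
    return mux2(a, lo, hi)

def F_ref(a, b, c, d, e):
    """Reference: F = abc'e + a'b'cd + cde' + a'b"""
    return (a & b & (1 - c) & e) | ((1 - a) & (1 - b) & c & d) | (c & d & (1 - e)) | ((1 - a) & b)
```

The selection works because exactly one of a’b’, a’b, ab’, ab is 1 for any input, so ORing the gated cofactors is the same as choosing one of them.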

4. The Zynq System on Chip (SoC)

• Xilinx Zynq System on Chip (SoC) Overview
  • Zynq Overview – 16m 25s (Optional – answers why Zynq was created)
  • Zynq Architecture – 10m 49s (Brief overview of the PS and PL components)
  • Zynq Processing System (PS) – 7m 25s (Optional – brief overview of the Zynq PS)
  • Zynq Programmable Logic (PL) – 9m 23s (Brief overview of the Zynq PL)
• Zynq PL Architecture
  • Xilinx 7 Series FPGA Overview – 28m 21s (Optional – general overview, watch at 1.5x speed)
  • CLB Architecture – 21m 52s (A must see, IMDB rating of 9.1)
• Additional Xilinx videos and training:
  • http://www.xilinx.com/support/university.html
  • http://www.xilinx.com/training/

Digilent Zybo

Feature: Description

FPGA
  • Zynq-7000 AP SoC XC7Z010-1CLG400
I/O Interfaces
  • USB-UART for programming, serial comm., and power
  • One 10/100/1G Ethernet
  • USB OTG 2.0
  • USB-UART bridge
  • 16-bit VGA output
  • Dual role (Input/Output) HDMI
  • I2S CODEC
  • Audio Line-In, Line-Out, microphone
Memory
  • 512 Mbyte DDR3
  • 128 Mbit Quad-SPI Flash
  • MicroSD card connector
Switches and LEDs
  • 4 Slide switches accessible from PL
  • 4 LEDs accessible from PL
  • 1 LED accessible from PS
  • 4 Push-buttons accessible from PL
  • 2 Push-buttons accessible from PS
  • 1 Reset button accessible from PL
  • 1 Reset button accessible from PS
Clocks
  • One 50.000 MHz oscillator for PS
Expansion ports
  • One processor-dedicated Pmod connector
  • One dual (analog/digital) Pmod connector
  • 4 Pmod connectors

Zynq-7000 AP SoC XC7Z010-1CLG400

• Processing System (PS)
  • 650 MHz dual-core Cortex-A9 processor
  • DDR3 memory controller with 8 DMA channels
  • High-bandwidth peripheral controllers: 1G Ethernet, USB 2.0, SDIO
  • Low-bandwidth peripheral controllers: SPI, UART, CAN, I2C
• Programmable Logic (PL)
  • Equivalent to Artix-7 (7 Series) FPGA:
    • 4,400 logic slices, each with four 6-input LUTs and 8 flip-flops
    • 240 KB of fast block RAM
    • Two clock management tiles, each with a PLL and an MMCM
    • 80 DSP slices
    • Internal clock speeds exceeding 450 MHz
    • On-chip analog-to-digital converter (XADC)
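The per-slice figures above translate into device totals with simple arithmetic. The short Python sketch below only multiplies the numbers quoted on these slides (the two-slices-per-CLB figure comes from the CLB slide); the variable names are hypothetical.

```python
SLICES = 4_400          # logic slices in the XC7Z010 PL
LUTS_PER_SLICE = 4      # four 6-input LUTs per slice
FFS_PER_SLICE = 8       # eight flip-flops per slice
LUT_INPUTS = 6
SLICES_PER_CLB = 2      # each CLB is composed of 2 slices

total_luts = SLICES * LUTS_PER_SLICE               # 17,600 LUTs
total_ffs = SLICES * FFS_PER_SLICE                 # 35,200 flip-flops
clbs = SLICES // SLICES_PER_CLB                    # 2,200 CLBs
lut_config_bits = total_luts * 2 ** LUT_INPUTS     # truth-table bits across all LUTs

print(total_luts, total_ffs, clbs, lut_config_bits)
```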

Xilinx 7 Series FPGA Configurable Logic Block
• Each CLB is composed of 2 slices
• Slices can be of type
  • SLICEL
  • SLICEM
• Slice features:
  • Real 6-input look-up table (LUT) technology
  • Dual LUT5 (5-input LUT) option
  • Distributed Memory and Shift Register Logic capability
  • Dedicated high-speed carry logic for arithmetic functions
  • Wide multiplexers for efficient utilization

[Figure: Xilinx 7 Series FPGA CLB]

Source: Xilinx 7 Series FPGA CLB User Guide

• Each of the four LUTs (A, B, C, and D) has six inputs (A1-A6) and provides 2 independent outputs (O5 and O6)
• The LUTs can implement:
  • Any arbitrarily defined six-input Boolean function, OR
  • Two arbitrarily defined five-input Boolean functions, as long as these two functions share common inputs
• Slices also contain three multiplexers used to combine up to four LUTs to implement any function of 7 or 8 inputs.
  • F7AMUX: Used to generate 7-input functions from LUTs A and B
  • F7BMUX: Used to generate 7-input functions from LUTs C and D
  • F8MUX: Used to combine all LUTs to generate 8-input functions.
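A behavioral Python sketch of these options may help. It models a LUT6 as a 64-bit truth table whose O6 output is the full six-input lookup and whose O5 output reads only the lower 32 bits, then builds 7- and 8-input functions with 2-to-1 multiplexers. The function names, argument order, and the assignment of tables to MUX inputs are assumptions made for illustration; they do not describe the physical pin mapping.

```python
def lut6(init, a6, a5, a4, a3, a2, a1):
    """Return (O6, O5) for a 64-bit truth table 'init'."""
    low = (a5 << 4) | (a4 << 3) | (a3 << 2) | (a2 << 1) | a1
    o5 = (init >> low) & 1                    # lower LUT5: bits 0..31
    o6 = (init >> ((a6 << 5) | low)) & 1      # full 6-input lookup
    return o6, o5

def mux2(sel, d0, d1):
    """2-to-1 multiplexer."""
    return d1 if sel else d0

def f7(init_hi, init_lo, a7, *a):
    """7-input function: two LUT6 outputs selected by a seventh input."""
    return mux2(a7, lut6(init_lo, *a)[0], lut6(init_hi, *a)[0])

def f8(inits, a8, a7, *a):
    """8-input function: four LUT6 tables, inits[2*a8 + a7] is the one selected."""
    lo = mux2(a7, lut6(inits[0], *a)[0], lut6(inits[1], *a)[0])
    hi = mux2(a7, lut6(inits[2], *a)[0], lut6(inits[3], *a)[0])
    return mux2(a8, lo, hi)

def table5(func):
    """Build a 32-bit truth table from a function of the 5-bit address."""
    return sum(func(i) << i for i in range(32))

# Dual LUT5 mode: two 5-input functions sharing inputs a5..a1, with a6 tied to 1.
PARITY5 = table5(lambda i: bin(i).count("1") & 1)
AND5 = table5(lambda i: int(i == 31))
DUAL = (AND5 << 32) | PARITY5    # O6 reads AND5, O5 reads PARITY5
```

With `a6` held at 1, `lut6(DUAL, 1, a5, a4, a3, a2, a1)` returns the AND of the five shared inputs on O6 and their parity on O5, which illustrates why the two five-input functions must share inputs.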

Xilinx 7 Series FPGA Slice Detail

SLICEL element shown. See source for SLICEM details.

• Carry Chain(s)
  • Dedicated fast lookahead carry logic can perform fast arithmetic addition and subtraction.
  • Cascadable to form wider add/subtract logic.
  • Run upward and have a height of four bits per slice. For each bit, there is a carry MUX and an XOR gate for adding/subtracting the operands with selected carry bits. The dedicated carry path and carry MUX can also be used to cascade function generators for implementing wide logic functions.
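The per-bit carry MUX and XOR gate can be modeled in a few lines. In this Python sketch, each 4-bit block takes a propagate bit per position (a XOR b, which the LUT would compute) and forwards either the incoming carry or one operand bit. It is a behavioral illustration under those assumptions, not a timing-accurate model; `carry4` and `add` are hypothetical names.

```python
def carry4(s, di, ci):
    """One slice's 4-bit carry chain.

    s[i]  : propagate bit from the LUT (a_i XOR b_i for addition)
    di[i] : value forwarded as the carry when s[i] == 0 (a_i for addition)
    ci    : carry entering bit 0
    Returns (sum_bits, carry_out).
    """
    out = []
    c = ci
    for i in range(4):
        out.append(s[i] ^ c)        # XOR gate per bit
        c = c if s[i] else di[i]    # carry MUX per bit
    return out, c

def add(a, b, width=8, cin=0):
    """Cascade width/4 blocks to add two unsigned integers; returns (sum, carry_out)."""
    assert width % 4 == 0
    result, c = 0, cin
    for base in range(0, width, 4):
        abits = [(a >> (base + i)) & 1 for i in range(4)]
        bbits = [(b >> (base + i)) & 1 for i in range(4)]
        s = [x ^ y for x, y in zip(abits, bbits)]
        sums, c = carry4(s, abits, c)
        for i, bit in enumerate(sums):
            result |= bit << (base + i)
    return result, c
```

Subtraction reuses the same chain by inverting the second operand and setting the carry-in to 1, for example `add(10, ~3 & 0xFF, 8, 1)`.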

• Memory Elements
  • Flip-Flops
    • 8 in total
    • 4 are only DFF and
    • 4 can be configured as DFF or Latch
  • Distributed RAM
    • LUTs can be configured as RAM (SLICEM only)
    • From Dual-Port 32 x 1-bit RAM up to Single-Port 256 x 1-bit RAM
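The 256 x 1-bit upper limit follows from the slice structure: four 6-input LUTs hold 64 bits each. The Python sketch below models that composition; the class names are hypothetical, and details such as synchronous writes and the hardware address decoding are deliberately left out.

```python
class LutRam64:
    """One SLICEM LUT used as a 64 x 1-bit RAM (its truth-table bits are the storage)."""

    def __init__(self):
        self.bits = 0

    def write(self, addr, value):
        self.bits = (self.bits & ~(1 << addr)) | ((value & 1) << addr)

    def read(self, addr):
        return (self.bits >> addr) & 1


class SliceRam256:
    """Single-port 256 x 1-bit RAM built from the four LUTs of one slice."""

    def __init__(self):
        self.luts = [LutRam64() for _ in range(4)]

    def write(self, addr, value):
        # The upper two address bits pick the LUT, the lower six pick the bit.
        self.luts[addr >> 6].write(addr & 63, value)

    def read(self, addr):
        return self.luts[addr >> 6].read(addr & 63)


ram = SliceRam256()
ram.write(200, 1)
print(ram.read(200), ram.read(199))
```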


Other PL Elements

• Clock Management
  • Mixed-Mode Clock Manager (MMCM)
  • Phase Locked Loop (PLL)
• DSP Slices
  • 25 x 18-bit signed multiplier
  • 48-bit adder/accumulator
  • 25-bit pre-adder
• Block RAM
  • Dual-port 36 Kb blocks
  • Programmable FIFO logic
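The three DSP slice elements listed above combine naturally into a multiply-accumulate step. The Python sketch below is a simplified behavioral model using only the bit widths quoted on the slide; a real DSP slice has many more operating modes and pipeline registers, and the names `wrap_signed`, `dsp_mac`, and `dot` are hypothetical.

```python
def wrap_signed(value, bits):
    """Wrap an integer to a two's-complement value of the given width."""
    mask = (1 << bits) - 1
    value &= mask
    return value - (1 << bits) if value >> (bits - 1) else value

def dsp_mac(acc, a, b, d=0):
    """One multiply-accumulate step: acc + (a + d) * b."""
    pre = wrap_signed(a + d, 25)             # 25-bit pre-adder
    prod = pre * wrap_signed(b, 18)          # 25 x 18-bit signed multiply
    return wrap_signed(acc + prod, 48)       # 48-bit adder/accumulator

def dot(xs, hs):
    """Dot product computed as repeated multiply-accumulate steps."""
    acc = 0
    for x, h in zip(xs, hs):
        acc = dsp_mac(acc, x, h)
    return acc
```

A call such as `dot([1, 2, 3], [4, 5, 6])` accumulates 4 + 10 + 18 one product at a time, which is how a FIR filter tap sum would map onto one slice over several clock cycles.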

Zynq Layout Snapshot using Vivado

[Figure: device layout with the PS and PL regions labeled]

5. FPGA Design Flow

• Computer Aided Design (CAD) Tools
  • Xilinx Vivado
  • Aldec Active-HDL (Free Student Edition)
• Functional Simulation
• Synthesis
• Post Synthesis Simulation
• Mapping, Placement, Routing
• Implementation