Customer Training Implementing, Simulating, & Debugging External Memory Interfaces A-MNL-ISDMI-12-0-v1

Customer Training · 2012-08-13


Table of Contents

Objectives
Agenda
Section 1 – Introduction to Altera’s Memory Solutions
  Memory Selection Criteria
  Double Data Rate Memory Interfaces
  DDR Logic Implementation in Altera FPGAs
  Altera High-Speed Memory Interface IP
  High Performance Controller II
  ALTMEMPHY and UniPHY
  UniPHY Calibration
  Hard Memory Interface
Section 2 – Memory Interface Design Flow in the Quartus II Software
  Parameterize with the MegaWizard™ Plug-In Manager
  Timing Derating
  Quartus II Project Settings
  Exercise 1: Create the design
Section 3 – Functionality and Simulation of a Memory System
  Controller Operation and Connections to User Logic
  Performing a Simulation
  Exercise 2: Simulation of the controller
Section 4 – Board and Termination Considerations
  Creating I/O Assignments
  Board Design and Simulation Basics
  Choosing Optimal Termination Settings
  Recommended Settings
  Exercise 3: Complete the controller
Section 5 – Timing Analysis
  Timing Analysis Methodology
  General Recommendations for Closing Timing
  Exercise 4: Perform timing analysis on the interface and test in hardware
Section 6 – Final Topics
  DDR2/3 Controllers with UniPHY EMIF Toolkit
  Using High Performance Interfaces with Nios II and Qsys
  Multiple Memory Controllers on a Single FPGA
Conclusions
Resources
Appendix


© 2012 Altera Corporation—Confidential

Objectives

Achieve a comfort level with Altera® memory interface IP, focusing on DDR3 interfaces
Parameterize and instantiate a High Performance memory controller in a Quartus® II project
Test and debug an external memory interface (EMIF) through:
  Simulation
  Static timing analysis
  External Memory Interface Toolkit
  In-system testing (SignalTap® II embedded logic analyzer)
Apply required I/O and other constraints to the interface
Gain practical experience with the entire design and verification flow through lab exercises

Agenda

Introduction to Altera’s memory interface options
  Source-synchronous double data rate (DDR) interfaces
Parameterizing memory controllers in the Quartus II software
Verifying functionality of a DDR interface in simulation
Board and termination considerations
Performing static timing analysis
Final topics
  External Memory Interface Debug Toolkit
  Memory interfaces with a Nios® II processor and Qsys
  Using multiple memory controllers inside an FPGA

Quartus II Software – Two Editions

Required for memory controller IP

                     Subscription Edition   Web Edition
Devices supported    All                    Selected devices
Features             100%                   95%
Distribution         Internet & DVD         Internet & DVD
Price                Paid                   Free

Feature comparison available on the Altera web site

Altera’s Complete Memory Solution

Advanced FPGA Architecture
  DQS phase shift circuitry
  Registers in I/O cells
  Feature-rich PLLs & clock management
Memory Controller MegaCore® IP
  Open source datapath
  Reference designs
  Graphical user interface
  Included in the free IP base suite (Subscription Edition)
Software Support
  Automatically generated constraints
  System-level timing analysis
  Spice and IBIS simulation models
Device Handbooks, External Memory Interface Handbook
  Interface description and use
  Timing analysis
  Electrical analysis
Development Kits & Hardware Reference Platforms
  Demo project
  Board design guidelines
  Schematic and Gerber files

Section 1: Introduction to Altera’s Memory Solutions

Current Common Memory Interfaces

(Chart: cost per bit versus cycle time for SRAM, QDRII, RLDRAM II, and DDR SDRAM)

QDRII and II+ SRAM
  DDR per port
  Separate RD and WR ports
  Memory access via one address bus
  SRAM: fastest access time, higher cost, lower density
RLDRAM II
  DDR
  Common I/O (single data bus) or separate I/O (read and write data buses)
  Reduced latency: SRAM-like fast access time but DRAM low price
  Good for latency-sensitive applications, such as traffic management, caches, and video
DDR/DDR2/DDR3 SDRAM
  DDR
  Lowest cost
  Access via row & column address buses
  Bank management increases bandwidth by interleaving
  Migrating to DDR3 is the trend: higher data rate & lower power

Altera FPGAs Support Multiple Interface Types

Example: a Stratix® III, IV, or V device with a DDR/2/3 memory system is a common solution to system requirements for data buffering and other low-latency storage

(Diagram: Altera FPGA connected to the following interfaces)
  QDRII/II+, RLDRAM II: HSTL-15/18 Class I, HSTL-15/18 Class II
  DDR3/DDR2/DDR: SSTL-15/18/2 Class I, SSTL-15/18/2 Class II, differential SSTL-2/15/18
  System I/O: other chips, backplanes

Common memory data width is 72 bits
  8 data bytes plus ECC
Support for up to 144-bit wide interfaces for DDR2 and DDR3
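
The 72-bit figure above can be checked with a little arithmetic. The sketch below splits an interface width into data and ECC bits and estimates peak payload bandwidth. The function names are illustrative, and the 1066 Mbps per-pin rate is only an example input, not a claim about any particular device.

```python
from typing import Tuple


def split_width(total_bits: int, ecc_bits: int = 8) -> Tuple[int, int]:
    """Split an interface width into (data bits, ECC bits)."""
    return total_bits - ecc_bits, ecc_bits


def peak_bandwidth_mb_s(data_rate_mbps: float, data_bits: int) -> float:
    """Peak payload bandwidth in megabytes per second.

    data_rate_mbps is the per-pin data rate; ECC bits are excluded
    because they carry no user payload.
    """
    return data_rate_mbps * data_bits / 8


if __name__ == "__main__":
    data_bits, ecc_bits = split_width(72)
    print(f"72-bit interface: {data_bits} data bits + {ecc_bits} ECC bits")
    print(f"Peak payload: {peak_bandwidth_mb_s(1066.0, data_bits):.0f} MB/s")
```

This is a peak figure only; refresh, bank management, and bus turnaround reduce the achievable bandwidth.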

Memory Selection Criteria

Parameter | DDR3 SDRAM | DDR2 SDRAM | DDR SDRAM | RLDRAM II | QDR II/II+ SRAM
Performance | 300-800 MHz | 200-533 MHz | 100-200 MHz | 200-533 MHz | 154-350 MHz
Altera supports | Up to 1066 Mbps | Up to 800 Mbps | Up to 400 Mbps | Up to 1600 Mbps | Up to 1400 Mbps
Density | 512 MB - 8 GB, 32 MB - 8 GB (DIMM) | 256 MB - 1 GB, 32 Mb - 4 GB (DIMM) | 128 MB - 1 GB, 32 Mb - 2 GB (DIMM) | 288 MB, 576 MB | 8 - 72 MB
I/O standard | SSTL-15 Class I, II | SSTL-18 Class I, II | SSTL-2 Class I, II | HSTL-1.8V/1.5V | HSTL-1.8V/1.5V
Data width | 4, 8, 16 | 4, 8, 16 | 4, 8, 16, 32 | 9, 18, 36 | 8, 9, 18, 36
Burst length | 8 | 4, 8 | 2, 4, 8 | 2, 4, 8 | 2, 4
CAS latency | 5 - 10 | 3, 4, 5 | 2, 2.5, 3 | 4, 6, 8 | N/A
Data strobe | Differential bidirectional strobe only | Differential or single-ended bidirectional strobe | Single-ended bidirectional strobe | Free running differential read and write clocks | Free running read and write clocks

Our Main Focus Today

DDR3 memory implementations
  Stratix series devices
  High Performance Controller II (HPCII)
  UniPHY physical interface
Basic discussion of how to implement other memory interface types, devices, and IP
  ALTMEMPHY
  Arria® and Cyclone® series devices, soft and hard implementations

Double Data Rate Memory Interfaces

DDR Memory Interfaces

Write cycle → FPGA to memory
  DQS strobe clock phase shifted 90° (center-aligned) with respect to data (DQ) signals
Read cycle → FPGA from memory
  Receives DQS edge-aligned with data and introduces phase shift to center-align for data capture

(Diagram: FPGA (logic + memory controller) connected to memory through DQ, DQS, and clk/clk#. In the write operation DQS is center-aligned with DQ; in the read operation DQS is edge-aligned with DQ.)
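
To give a feel for the size of the 90° shift described above, the following sketch converts a phase shift into a time delay for a given memory clock frequency. The function name and the example frequencies are illustrative assumptions, not values taken from the Altera IP.

```python
def phase_shift_ps(clock_mhz: float, phase_deg: float = 90.0) -> float:
    """Return the delay in picoseconds that corresponds to phase_deg
    of one clock period at clock_mhz."""
    period_ps = 1e6 / clock_mhz  # a 1 MHz clock has a 1,000,000 ps period
    return period_ps * phase_deg / 360.0


if __name__ == "__main__":
    for mhz in (200.0, 400.0, 533.0):
        print(f"{mhz:.0f} MHz: 90 degrees = {phase_shift_ps(mhz):.1f} ps")
```

The delay shrinks as the clock speeds up, which is why the capture window becomes harder to hit at higher data rates.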

DDR Data Read (FPGA input path)

(Diagram: data_in (DQ) is captured by input registers clocked by inclock, which is the strobe (DQS) shifted by 90°. The register outputs, including the intermediate neg_reg_out, drive data_out_h and data_out_l into the FPGA fabric.)

(Timing diagram)
  data_in      B0 A0 B1 A1 B2 A2
  neg_reg_out  B0 B1 B2 B3
  data_out_l   xx B0 B1 B2
  data_out_h   xx A0 A1 A2

DDR Write Logic (FPGA output path)

(Diagram: datain_h and datain_l from the FPGA fabric are registered on outclock and multiplexed onto dataout (DQ). Additional registers for center-aligning the DQS strobe are not shown.)

(Timing diagram)
  data_in_h  B0 B1 B2
  data_in_l  A0 A1 A2
  data_out   B0 A0 B1 A1 B2 A2
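
The read and write paths above amount to splitting one double-rate stream into two single-rate streams and interleaving them back. A minimal behavioral sketch follows, assuming idealized timing with no register latency or clock edges modeled. The function names are hypothetical and do not correspond to any Altera primitive.

```python
from typing import List, Sequence, Tuple


def ddr_output(data_h: Sequence[str], data_l: Sequence[str]) -> List[str]:
    """Interleave two single-data-rate streams into one DDR stream.

    Each clock cycle sends one word from data_h followed by one from data_l.
    """
    if len(data_h) != len(data_l):
        raise ValueError("both streams must have the same length")
    out: List[str] = []
    for high, low in zip(data_h, data_l):
        out.extend([high, low])
    return out


def ddr_input(ddr: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Split a DDR stream back into its two single-data-rate streams."""
    if len(ddr) % 2:
        raise ValueError("DDR stream must contain an even number of words")
    return list(ddr[0::2]), list(ddr[1::2])


if __name__ == "__main__":
    stream = ddr_output(["B0", "B1", "B2"], ["A0", "A1", "A2"])
    print(stream)
    print(ddr_input(stream))
```

The word labels match the timing diagrams: the write path produces B0 A0 B1 A1 B2 A2 on the pin, and the read path recovers the B and A streams.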

DDR vs. DDR2

DDR is the original

DDR2 SDRAM offers key improvements over DDR
  On-die termination (ODT) improves signal integrity and timing margin
  Consumes less power; SSTL-18 I/O standard instead of SSTL-2

DDR3 Improvements Over DDR2

Lower power (SSTL-15) and double performance
  Even lower power: DDR3L runs at 1.35 V
Over 400 MHz operation requires fly-by termination for CK and address/command signals
  Better signal integrity
  Complicates timing analysis / controller design

DDR3 Leveling

Clock signals routed in daisy-chain fly-by topology (see next slide)
  Improves signal integrity on high fan-out clocks
  Other signals still point-to-point
Special leveling circuitry required
  Automatically accounts for delays and phase adjustments
  Aligns all signals on writes and reads
  Stratix III, IV, and V devices only

DDR3 Write Leveling

(Diagram: memory controller driving a row of DRAM devices (D) with fly-by termination (T))

Legend:
  Clock for top & bottom memory rank (fly-by, double drop)
  DQ, DM, DQS, DQS# (point-to-point)

Device PHY is responsible for skewing outgoing DQ & DQS/DQS# to match the clock flight times to each component

DDR3 Read Leveling

(Diagram: memory controller and a row of DRAM devices (D) with fly-by termination (T))

Legend:
  Clock for top & bottom (fly-by, double drop)
  DQ, DM, DQS, DQS# (point-to-point)

Device PHY is responsible for de-skewing incoming DQ & DQS/DQS# to match the clock flight times to each component
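
The leveling idea can be illustrated with simple arithmetic. The sketch below assumes the clock arrival time at each memory device along the fly-by chain is known and that the point-to-point DQ/DQS flight times are equal for all devices. Both are simplifying assumptions: real PHY calibration measures these relationships in hardware, and this is not the UniPHY algorithm. The function names are hypothetical.

```python
from typing import List, Sequence


def write_launch_delays(clock_arrival_ps: Sequence[float]) -> List[float]:
    """Extra delay to add to each device's outgoing DQ/DQS so that the
    strobe reaches the device together with the fly-by clock."""
    earliest = min(clock_arrival_ps)
    return [t - earliest for t in clock_arrival_ps]


def read_deskew_delays(clock_arrival_ps: Sequence[float]) -> List[float]:
    """Extra delay to add to each device's incoming DQ/DQS so that read
    data from all devices lines up with the latest-arriving device."""
    latest = max(clock_arrival_ps)
    return [latest - t for t in clock_arrival_ps]


if __name__ == "__main__":
    arrivals = [100.0, 250.0, 400.0]  # illustrative clock flight times in ps
    print("write launch delays:", write_launch_delays(arrivals))
    print("read de-skew delays:", read_deskew_delays(arrivals))
```

Devices further along the chain see the clock later, so their write data is launched later and their read data needs less added delay.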

DDR Logic Implementation in Altera FPGAs

Memory Implementation in FPGAs

FPGAs implement DDR circuitry in different ways depending upon resources available

DDR data (DQ) and strobe (DQS) pins should be placed onto dedicated equi-skew DQ/DQS placement blocks in the device for optimal memory performance (more later)

Wraparound (banks placed around corner of device) or split (banks split between opposite sides) configurations supported in some scenarios

Note: See Appendix for additional device families

Cyclone V Devices

Dedicated bidirectional DQ/DQS pins on top, bottom, and right
Dedicated transceivers on left side (all devices)
Two hard memory controllers on top/bottom banks
DDR read / write logic implemented in I/O cells on 3 sides of device
Up to 8 reconfigurable fractional PLLs
Four DLLs available
  Each I/O bank can access adjacent DLLs for DQS phase shift
  Each DLL has two outputs - allows multiple interfaces to have separate frequencies
Differential DQS also possible

(Diagram: Cyclone V device floorplan showing fractional PLLs (FPLL), DLLs, hard controllers, DQ/DQS in I/O, bonding, and transceiver blocks)

Arria V Devices

Dedicated bidirectional DQ/DQS pins on top and bottom
Dedicated transceivers on left (all devices) and right (most GX and GT devices)
Four hard memory controllers on top/bottom banks
DDR read / write logic implemented in I/O cells on 2 or 3 sides of device
Up to 16 reconfigurable fractional PLLs
Four DLLs available
  Each I/O bank can access adjacent DLLs for DQS phase shift
  Each DLL has two outputs - allows multiple interfaces to have separate frequencies
Differential DQS also possible

(Diagram: Arria V device floorplan showing fractional PLLs (FPLL), DLLs, hard controllers, DQ/DQS in I/O, bonding, and transceiver blocks)

Arria II GZ / Stratix III / IV Devices

Dedicated bidirectional DQ/DQS pins on all banks
  Top/bottom banks optimized for memory performance
  Side banks optimized to support differential I/O
Some Stratix IV GX/GT devices have dedicated transceivers on left/right sides
DDR read / write logic implemented in I/O cells on all sides of device
Up to 12 reconfigurable PLLs
Four DLLs available
  Each I/O bank can access adjacent DLLs for DQS phase shift
  Each DLL has two outputs - allows multiple interfaces to have separate frequencies
Differential DQS also possible

(Diagram: Stratix III device floorplan showing PLLs and corner DLLs. Memory performance optimized on top and bottom of FPGA; LVDS (non-CDR) SERDES support on left and right sides of die.)

Stratix V Devices

Dedicated bidirectional DQ/DQS pins on top/bottom (all devices) and right (larger GS devices)
Dedicated transceivers on left side (all devices) and right (some devices)
DDR read / write logic implemented in I/O cells on 3 sides of device
Up to 28 reconfigurable fractional PLLs
Four DLLs available
  Each I/O bank can access adjacent DLLs for DQS phase shift
  Each DLL has two outputs - allows multiple interfaces to have separate frequencies
Differential DQS also possible

(Diagram: Stratix V device floorplan showing fractional PLLs (FPLL), DLLs, DQ/DQS in I/O, and transceiver blocks)

Example DQ/DQS Block

(Diagram: DQ/DQS block showing the DQ output path, the DQ input path, the phase delay via DLL, and the DQS clock)

FPGA External Memory Support

Family DDR3 DDR2 DDR RLDRAM II QDR-II+ QDR-II

Stratix V (1)

Stratix IV E/GX/GT

Stratix III

Hardcopy® III/IV

Arria V (1,2) (2)

Arria II GX

Arria II GZ

Cyclone V (1,2) (2)

Cyclone IV E/GX

Cyclone III

(1): Devices support DDR3 and DDR3L
(2): Soft or hard controller

Family/Protocol Support

Maximum half-rate frequency (MHz)

Family DDR3 DDR2 DDR RLDRAM II QDR-II+ QDR-II
Stratix V 1066 400 533 550 350
Stratix IV E/GX/GT 533 400 200 533 550 350
Stratix III 533 400 200 400 400 350
Hardcopy III/IV 400/533 333 200 400 350 300
Arria V 667 400 400 400 350
Arria II GX 400 333 200 250 250
Arria II GZ 400 333 350 350 300
Cyclone V 400 (h) / 300 (s) 400 (h) / 300 (s)
Cyclone IV GX 200 167
Cyclone IV E 167 133
Cyclone III 200 167

Try the External Memory Interface Spec Estimator: http://www.altera.com/technology/memory/estimator/mem-emif-index.html

Altera High-Speed Memory Interface IP

Memory Interfaces: A 2 (or 3) Part Solution

(Diagram: FPGA containing user logic, an optional multi-port front end (MPFE), the memory controller, and the PHY, connected to external memory)

PHY options
  UniPHY: QDR II, QDR II+, RLDRAM II/III, DDR2/3
  ALTMEMPHY: DDR1/2/3

Auto-calibrated PHY bridging physical interface and controller
Uses one PLL and automatically selects all required clock phases
Multi-port front end (MPFE) for multiple, independent accesses to hardened controller (discussed later)
  Cyclone V and Arria V hard memory interface only

DDR Memory Interface Blocks

User logic
  Generates data to be written to memory
  Receives data read from memory
Memory controller (soft or hard)
  Altera High Performance Controller (HPC) II or custom controller
  Initiator of read and write commands
  Instantiates PHY if used
UniPHY (soft or hard)
  Instantiated by Altera HPCII or can be added separately
  Read data/write data/address/command path
  Clock and reset management
  Automatic calibration during memory initialization
  I/O logic to external memory

High Performance Controller II

High Performance Controller II Features

Supports up to 1066 MHz DDR3 memory
Power management
Advanced bank management w/ command reordering
Inter-bank data reordering
Five cycle controller latency (6 w/ ECC)
ECC with sub-word write
Flexible system interface
Run time programmable
Efficiency Monitor and Protocol Checker
Multi-cast writes
Quarter-rate support
Quasi 1T/2T

Quick Review: DDRx Command Cycle

(State diagram)
  Idle → REF → Refreshing
  Idle → ACT → Bank active
  Bank active → READ or READ AP → Reading
  Bank active → WRITE or WRITE AP → Writing
  Reading / Writing → PRE → Pre-charging

DDR initialization and configuration not shown

Reads and writes are bursts (2, 4, or 8 bit sequential or interleaved column accesses)
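
The command cycle above can be written down as a small state table. The sketch below is a simplified model for illustration only: the state names, the `DONE` pseudo-event that returns the device to idle, and the treatment of the auto-precharge variants are assumptions, and initialization, timing parameters, and multiple banks are ignored.

```python
from typing import Dict, Iterable, Tuple

# (current state, command) -> next state
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("idle", "REF"): "refreshing",
    ("refreshing", "DONE"): "idle",        # refresh completes (pseudo-event)
    ("idle", "ACT"): "bank_active",
    ("bank_active", "READ"): "reading",
    ("bank_active", "READ_AP"): "reading",
    ("bank_active", "WRITE"): "writing",
    ("bank_active", "WRITE_AP"): "writing",
    ("reading", "PRE"): "precharging",
    ("writing", "PRE"): "precharging",
    ("precharging", "DONE"): "idle",       # precharge completes (pseudo-event)
}


def run(commands: Iterable[str], state: str = "idle") -> str:
    """Apply commands in order and return the final state.

    Raises ValueError when a command is not legal in the current state.
    """
    for cmd in commands:
        key = (state, cmd)
        if key not in TRANSITIONS:
            raise ValueError(f"illegal command {cmd!r} in state {state!r}")
        state = TRANSITIONS[key]
    return state


if __name__ == "__main__":
    print(run(["ACT", "READ", "PRE", "DONE"]))
```

A read issued from the idle state is rejected, which is the reason a bank must be activated before any column access.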

HPC II Advanced Bank Management

Look-ahead bank management
  Efficient bank interleaving support
  Issue activate and precharge commands early
  Use auto-precharge where possible
In-order read/writes
Per access open or close page policy
  Read/write accesses with auto-precharge
  Automatic cancellation of auto-precharge on page hits

(Diagram: with no look-ahead, the command bus sits idle, which is not efficient. With look-ahead bank management, idle cycles are used for bank management.)

Command | Address | Condition
Read | Bank 0 | Activate required
Read | Bank 1 | Precharge required
Read | Bank 2 | Precharge required

Inter-Bank Data Reordering

Intelligent reordering of read and write commands going to different bank addresses in an efficient manner
  Mitigates bus turn-around time (read to write, write to read)
  Reduces conflict between rows

(Diagram: command sequences with data reordering OFF and ON, annotated with WR-to-RD, RD-to-WR, and WR-to-WR turnarounds)
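
The benefit of reordering can be seen by counting bus direction changes. The sketch below is a deliberately simplified illustration: it ignores bank addresses, command aging, and starvation limits, and it simply groups commands by direction. The function names are hypothetical and this is not the HPC II reordering logic.

```python
from typing import List, Sequence


def count_turnarounds(commands: Sequence[str]) -> int:
    """Count read/write direction changes between consecutive commands."""
    return sum(1 for prev, cur in zip(commands, commands[1:]) if prev != cur)


def group_by_direction(commands: Sequence[str]) -> List[str]:
    """Reorder so that commands matching the first command's direction run
    first, keeping the original order within each group (stable sort)."""
    if not commands:
        return []
    first = commands[0]
    return sorted(commands, key=lambda c: c != first)


if __name__ == "__main__":
    sequence = ["WR", "RD", "WR", "RD"]
    reordered = group_by_direction(sequence)
    print(sequence, "->", count_turnarounds(sequence), "turnarounds")
    print(reordered, "->", count_turnarounds(reordered), "turnarounds")
```

Grouping reduces the number of turnarounds, but it also delays commands of the other direction, which is the problem that command aging and starvation control address.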

Other Aspects of Data Reordering

Command aging
  Mechanism to favor older commands over newer commands during data reordering if these commands are requested for access at the same time
  Management of aging reduces latency
Starvation control
  Commands are “starving” if not served after a period of time
  Starvation limit can be set (default is 10 commands)
  Logic added to prevent command from starving
  Also reduces latency

Without Starvation Control

(Diagram: command sequence showing the local commands (from user logic to controller) and the resulting memory commands, annotated with write-to-write and write-to-read turnarounds)

Note: Write to write turnaround time is shorter than write to read turnaround time

To minimize bus turnaround time, controller favors write over read if write was issued previously and vice versa
  Causes read command to be pushed to the end, resulting in large latency

With Starvation Control

(Diagram: command sequence showing the local commands and the resulting memory commands, annotated with write-to-write, write-to-read, and read-to-write turnarounds)

Note: Write to write turnaround time is shorter than write to read / read to write turnaround time

Note: Starvation counter increments for every command issued

User sets starvation limit
Starved command served immediately when starvation counter reaches limit
Example: starvation limit set to 2
  After 2 commands, read tagged as starved (promoted into priority command) and served immediately
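
The effect of the starvation limit can be reproduced with a toy scheduler. This is a sketch under stated assumptions: all commands are pending at the start, the scheduler prefers the oldest command with the same direction as the last one issued, and every waiting command's counter increments once per issued command. The function name and data layout are hypothetical and do not describe the HPC II implementation.

```python
from typing import List, Optional, Tuple


def schedule(pending: List[str],
             starvation_limit: Optional[int] = None) -> List[Tuple[int, str]]:
    """Return the issue order as (original index, direction) pairs."""
    # Each queue entry is [original index, direction, wait counter].
    queue = [[idx, kind, 0] for idx, kind in enumerate(pending)]
    issued: List[Tuple[int, str]] = []
    last: Optional[str] = None
    while queue:
        starved = [e for e in queue
                   if starvation_limit is not None and e[2] >= starvation_limit]
        if starved:
            choice = starved[0]            # oldest starved command is served first
        else:
            same = [e for e in queue if e[1] == last]
            choice = same[0] if same else queue[0]  # avoid a bus turnaround
        queue.remove(choice)
        for entry in queue:
            entry[2] += 1                  # counter increments per issued command
        issued.append((choice[0], choice[1]))
        last = choice[1]
    return issued


if __name__ == "__main__":
    cmds = ["WR", "RD", "WR", "WR", "WR"]
    print("no limit:", schedule(cmds))
    print("limit 2: ", schedule(cmds, starvation_limit=2))
```

Without a limit the read is pushed to the end of the sequence; with a limit of 2 it is served after two commands, matching the example on the slide.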

Full-Rate and Half-Rate Modes

Simplify design by halving application side frequency and doubling data width
  Half-rate mode required for DDR3

(Diagram, full-rate logic: user logic (SDR, 200 MHz, 16 bits) → DDR to SDR → memory (DDR, 200 MHz, 8 bits))

(Diagram, half-rate logic: user logic (HDR, 100 MHz, 32 bits) → SDR to HDR (SDR, 200 MHz, 16 bits) → DDR to SDR → memory (DDR, 200 MHz, 8 bits))

Implemented directly in I/O for Cyclone V, Arria V, and Stratix III / IV / V FPGAs
Implemented in fabric for Cyclone III / IV FPGAs

Quarter-Rate Mode

Allows the controller and user logic to run at a quarter of the memory clock frequency

Allows further flexibility without compromising performance
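
The relationship between the memory clock, the rate mode, and the user-side interface is simple arithmetic: the user clock is divided down and the user data width grows by the same factor, so bandwidth is unchanged. A minimal sketch follows, with a hypothetical function name and an 8-bit, 200 MHz interface as the example input.

```python
from typing import Dict, Tuple

RATE_DIVIDER: Dict[str, int] = {"full": 1, "half": 2, "quarter": 4}


def user_interface(mem_clock_mhz: float, dq_width_bits: int,
                   rate: str) -> Tuple[float, int]:
    """Return (user clock in MHz, user data width in bits) for a rate mode.

    DDR transfers two words per memory clock, so the full-rate user width
    is already twice the DQ width.
    """
    divider = RATE_DIVIDER[rate]
    user_clock_mhz = mem_clock_mhz / divider
    user_width_bits = dq_width_bits * 2 * divider
    return user_clock_mhz, user_width_bits


if __name__ == "__main__":
    for mode in ("full", "half", "quarter"):
        clock, width = user_interface(200.0, 8, mode)
        print(f"{mode:>7}: {clock:.0f} MHz, {width} bits")
```

In every mode the product of user clock and user width is the same, which is why the slower modes ease timing closure without reducing throughput.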

Quasi 1T/2T

Row commands: ACT or PRE
Column commands: READ or WRITE

Half-rate mode: 1 controller clock cycle = 2 memory clock cycles
Quarter-rate mode: 1 controller clock cycle = 4 memory clock cycles

Improve command bandwidth by issuing two commands every controller clock cycle

Flexible System Interface

Avalon interface
  Avalon®-ST (streaming) for user logic access to controller
  Avalon-MM (memory mapped) slave interface for access to configuration and status register (CSR)
  See the Avalon Interface Specifications http://www.altera.com/literature/manual/mnl_avalon_spec.pdf for details
Burst size adaptation for efficient DRAM accesses
  Combines short local transactions into memory bursts
  Splits long local transactions into memory bursts
Integrated low latency half-rate system interface
  Supports an optional half system interface speed
  Maintains the controller in the faster clock domain to reduce latency
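
Burst size adaptation can be pictured as two small transformations on a list of transactions. The sketch below assumes addresses and lengths are counted in beats and that memory bursts are aligned to the burst length. The function names are hypothetical and the logic is an illustration of the idea, not the controller's actual algorithm.

```python
from typing import List, Sequence, Tuple

Transaction = Tuple[int, int]  # (start address in beats, length in beats)


def combine_contiguous(transactions: Sequence[Transaction]) -> List[Transaction]:
    """Merge back-to-back transactions whose address ranges are contiguous."""
    merged: List[Transaction] = []
    for addr, length in transactions:
        if merged and merged[-1][0] + merged[-1][1] == addr:
            prev_addr, prev_len = merged[-1]
            merged[-1] = (prev_addr, prev_len + length)
        else:
            merged.append((addr, length))
    return merged


def split_into_bursts(start: int, length: int,
                      burst_len: int = 8) -> List[Transaction]:
    """Split one transaction at burst-aligned boundaries."""
    bursts: List[Transaction] = []
    addr = start
    end = start + length
    while addr < end:
        boundary = (addr // burst_len + 1) * burst_len  # next aligned address
        chunk_end = min(boundary, end)
        bursts.append((addr, chunk_end - addr))
        addr = chunk_end
    return bursts


if __name__ == "__main__":
    print(combine_contiguous([(0, 2), (2, 2), (4, 4), (16, 1)]))
    print(split_into_bursts(4, 16))
```

Four short transactions collapse into one full burst plus a leftover, and a long unaligned transaction is cut into a partial burst, a full burst, and another partial burst.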

Efficiency Monitor and Protocol Checker

Reports on read and write throughput of the interface by counting command transfers and wait times
Checks legality of commands issued by user logic to the controller
Measures full path memory read latency
  Read commands from user logic time-stamped
  Returned data timestamp compared to when command was issued
Implemented as Avalon slave interface for manual access or by EMIF Toolkit (described later)

Other Advanced Features

Runtime configurable
  Timing parameters
  Address widths
  Controller behavior
Error correction code (ECC) with sub-word writes
Multicast write to mitigate effects of multiple activates
Refresh timing control
  Programmable periodic refresh
  User requested auto-refresh
Power management
  User requested self-refresh
  Automatic entry / exit power-down mode

HPC II Architecture

See appendix for detailed block diagram and description of blocks that make up the High Performance Controller

ALTMEMPHY and UniPHY

Altera Memory PHY Solutions

Features compared for UniPHY and ALTMEMPHY (the per-feature check marks are not preserved in this copy):

Available as a MegaCore IP

Support for DDR2/3

Support for QDR II/II+ and RLDRAM II

PLL/DLL sharing

Smart calibration algorithms

Latency: UniPHY 0.5, ALTMEMPHY 1.0

UniPHY: for all new designs with supported memory

UniPHY Provides Higher Flexibility With Half the Latency

UniPHY Advantages

[Figure: side-by-side block diagrams of ALTMEMPHY and UniPHY, each sitting between the memory controller / user logic and the memory. Blocks shown include the calibration sequencer, clock generation (PLL, DLL), reconfiguration logic, I/O structure and I/O blocks, mimic path, DQ I/O FIFO, and the write, read, DQS, and address/command paths, each marked as hard or soft]

Hard read/write paths in all supported devices, implemented as FIFO in Stratix V devices

Soft I/O grouping and calibration sequencer provide flexibility

Better resource sharing (PLL, DLL) for multiple interfaces

UniPHY Benefits

UniPHY enhancement: benefit

Half the latency: improved system performance

PLL, DLL, and on-chip termination (OCT) logic sharing: easier to create multiple memory interfaces on a single device

More configurations: DDR1/2/3, QDRII/II+, RLDRAM II; mainstream configurations (widths, burst sizes, DIMM types, and multi-rank support)

Nios II processor-based calibration sequencer: higher performance with advanced calibration algorithms makes for easier design and debug

Easy to build custom PHY: modular clear text code

Ease of use enhancements: flexible timing model, pin and timing constraint enhancements, improved testbenches

PHY Device Support

Device ALTMEMPHY UniPHY

Arria II / II GX

Arria II GZ

Cyclone III

Cyclone IV

HardCopy III - IV

Stratix III - IV

Stratix V

All new device families will only support UniPHY

UniPHY Interface to Controller and Memory

DLL and PLL instantiated at same level as PHY:

Can be set to master or slave

Facilitates sharing between multiple controllers

OCT block can also be shared or instantiated outside of UniPHY

[Figure: UniPHY top-level file containing the UniPHY, OCT, DLL, and PLL blocks, with the Altera PHY Interface (AFI), reset interface, memory interface, RUP & RDN (or RZQ), PLL/DLL sharing interface, and OCT sharing interface]

UniPHY Clocks

Typical half-rate design clocks in table below

Default phases for a >240 MHz memory frequency

Additional clocks needed for different scenarios, i.e. half-rate to quarter-rate conversion

Run Report Clocks in TimeQuest timing analyzer for details

Clock             Source           Clock rate  Phase                            Description
pll_afi_clk       PLL c0           Half        0°                               Controller clock
pll_mem_clk       PLL c1           Full        0°                               Output memory clock
pll_write_clk     PLL c2           Full        90° (45° for Stratix V devices)  Write data clock
pll_addr_cmd_clk  PLL c3           Half        270° (adjustable)                Address/command output clock
pll_avl_clk       PLL c5           -           0°                               Nios sequencer clock
pll_config_clk    PLL c6           -           0°                               Scan chain clock
DQS               External memory  Full        90°                              Read data strobe
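
To make the rate and phase columns concrete, the following Python sketch derives the half-rate clock frequencies from a memory clock and converts a phase shift into a time offset. The helper names and the 400 MHz example are assumptions for illustration; the actual clocks for a design should be taken from Report Clocks in TimeQuest.

```python
def period_ps(freq_mhz):
    """Clock period in picoseconds."""
    return 1_000_000 / freq_mhz


def phase_offset_ps(freq_mhz, phase_deg):
    """Time offset that corresponds to a phase shift on a clock."""
    return period_ps(freq_mhz) * phase_deg / 360


def half_rate_clocks(mem_freq_mhz, write_phase_deg=90):
    """(frequency in MHz, phase in degrees) for the main half-rate clocks."""
    full, half = mem_freq_mhz, mem_freq_mhz / 2
    return {
        "pll_afi_clk": (half, 0),             # controller clock, half rate
        "pll_mem_clk": (full, 0),             # output memory clock, full rate
        "pll_write_clk": (full, write_phase_deg),
        "pll_addr_cmd_clk": (half, 270),      # default phase, adjustable
    }
```

With a hypothetical 400 MHz memory clock, the 90° write clock phase works out to a 625 ps offset, and the 270° address/command phase on the 200 MHz half-rate clock to 3750 ps.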

The UniPHY Sequencer

Parameterizable Nios II processor system generated at run time

Implements calibration algorithm to maintain center alignment of data and clock signals through I/O delay chain adjustment

Hands control over to memory controller once calibration is completed

For more information on UniPHY and sequencer architectural blocks, see the Appendix

[Figure: sequencer block diagram. Blocks shown: Nios II processor, RAM (calibration software storage), Avalon-MM interface, debug interface, scan chain control (SCC) manager, read write (RW) manager, PHY manager, data manager, and tracking manager. Labels on the connections: to debug, tracking samples from the DQS enable module, to I/Os, AFI, and PHY parameters (includes FIFO info)]

UniPHY Calibration

Overview of calibration

Calibration stages

Calibration signals

Overview of Calibration

Configures PHY and I/Os for reliable data transfer

Performed by Nios II processor-based sequencer

Determines the delay settings needed to center-align data signals

Two tasks performed:

1. FIFO buffer calibration: sets data valid (VFIFO) and read latency (LFIFO) lengths in the read datapath of UniPHY

2. I/O calibration: adjusts delay chains and clock phase settings

When calibration completes, control is passed to the memory controller

The Chicken-and-Egg Calibration Problem

Calibration, at a very high-level, works like this:

Set the knobs to some value

Write to memory

Read from memory

Check if what you read is correct

If so, the knob settings are good...

...if not... well either the write or the read failed

To test a write you need to be able to read

To test a read you need to be able to write
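
The ambiguity described above can be shown with a toy model. This Python sketch is illustrative only: `make_interface`, `try_settings`, the delay windows, and the way a bad write corrupts data are all hypothetical and do not represent the sequencer code.

```python
def make_interface(write_window, read_window):
    """Toy memory link: a transfer only works if its delay setting is in the window."""
    stored = {}

    def write(addr, data, write_delay):
        # A badly timed write stores corrupted data instead of the pattern.
        stored[addr] = data if write_delay in write_window else data ^ 0xFF

    def read(addr, read_delay):
        # A badly timed read captures nothing useful.
        return stored.get(addr) if read_delay in read_window else None

    return write, read


def try_settings(write, read, write_delay, read_delay, pattern=0xA5):
    """Set the knobs, write, read back, and compare."""
    write(0, pattern, write_delay)
    return read(0, read_delay) == pattern
```

In this model a bad write delay and a bad read delay both produce the same result (`False`), so a single failed check cannot say which side is at fault.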


“Guaranteed” Write

Special write mode that attempts to get known data into memory that can be used for read calibration

Write a constant burst of zeros to one bank, and a burst of ones to another bank

Back-to-back read of these two banks can be used for read calibration

Calibration Stages

Read calibration part one: DQS enable calibration

DQ/DQS centering

Write calibration part one: Leveling

Write calibration part two: DQ/DQS centering

Read calibration part two: Read latency minimization

Read Calibration Part One

Objectives:

Calculates when read data is received after a read command is issued to set up the Data Valid Prediction FIFO (VFIFO) cycle

Aligns the input DQ with respect to DQS to maximize the read margins

Actions:

Uses guaranteed writes to perform:

DQS enable phase calibration

DQ/DQS centering

DQS Enable Phase Calibration

Goal: set up phase and latency of VFIFO to best capture DQ without DQS glitches, including postamble

dqs_enable should be active from before first DQS rising edge until before last falling edge

DQ/DQS Centering

Goal: Center DQ signals with respect to each other and center DQS to aligned DQ

1. Sweep D1 (DQ input) delay chain to align DQ to each other

2. Sweep aligned DQ to center DQS

DQS not adjusted, only DQ delay
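
One way to picture the centering step is to sweep a delay chain, record which tap settings pass, and pick the middle of the passing window. The Python function below is a sketch under that assumption; its name and the pass/fail list input are hypothetical and it is not the sequencer's algorithm.

```python
def center_of_window(passes):
    """Return the middle tap of the longest run of passing taps, or None."""
    best_start, best_len = None, 0
    start = None
    # A trailing False acts as a sentinel that closes a run ending at the last tap.
    for tap, ok in enumerate(list(passes) + [False]):
        if ok and start is None:
            start = tap                      # a passing window opens
        elif not ok and start is not None:
            if tap - start > best_len:       # window closed, keep the widest
                best_start, best_len = start, tap - start
            start = None
    if best_start is None:
        return None
    return best_start + (best_len - 1) // 2
```

For a sweep where taps 2 through 6 pass, the function returns tap 4, which leaves equal margin on both sides of the window.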


Write Calibration Part One

Objectives:

Align DQS to the memory clock at each device

Compensate for address, command, and clock skew at each device

Actions:

Perform a variety of random burst pattern writes with different delay and phase settings followed by a read

Simple patterns could lead to incorrect alignment

Sequencer picks the closest delay and phase values to the center of the window

Write Leveling Procedure

pll_write_clk phase adjustment (PLL)

D5 and D6 output delay chain adjustment (in 50 ps increments)

PLL again

D5 and D6 delay adjustment again

Final Calibration Stages

Write calibration part two:

DQ/DQS centering similar to read calibration

D5 and D6 delay chains adjusted

Read calibration part two:

LFIFO at maximum latency so far

Reduce LFIFO latency until read fails

Increase latency by two for margin

Control handed over to memory controller
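
The read latency minimization steps above amount to a simple search. The Python sketch below assumes a caller-supplied `read_ok(latency)` check and caps the result at the starting latency; both the function name and those details are illustrative assumptions rather than the sequencer implementation.

```python
def minimize_read_latency(read_ok, max_latency, margin=2):
    """Lower the LFIFO latency until a read fails, then add margin back."""
    latency = max_latency              # calibration starts at maximum latency
    while latency > 0 and read_ok(latency):
        latency -= 1                   # reads still pass, try one cycle less
    # 'latency' is now the first failing setting (or 0 if nothing failed).
    return min(latency + margin, max_latency)
```

If reads pass at latency 5 and above but fail at 4, the search stops at 4 and the function settles on 6.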


Calibration Signals

Signal           Description
afi_cal_fail     Asserts high if calibration fails
afi_cal_success  Asserts high if calibration is successful
afi_cal_req      Synchronous reset for sequencer

Hard Memory Interface (Optional; Cyclone V & Arria V devices only)

Multi-Port Front End (MPFE)

[Figure: hard memory interface block diagram showing the Avalon-MM/ST adaptor, MPFE, MM register, hardened controller, and hardened PHY]

MPFE Architecture

Multiple Avalon ports for access to hard controller & PHY:

Up to 6 command ports

Up to 4 read-data ports

Up to 4 write-data ports

Configure as read-only or write-only or combine into bidirectional

Internal Avalon port widths from 32 to 256 bits depending on number used and whether uni- or bidirectional

Avalon-MM to ST implementation in fabric for connectivity

Request scheduling done through set priority levels (absolute) and weighted round robin (relative)
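
The two-level scheduling idea (absolute priority first, weighted round robin among equals) can be sketched in Python as below. This is a behavioural illustration only: the `arbitrate` function, the credit scheme, the port names, and the assumption that weights are at least 1 are all hypothetical and do not describe the hardened arbiter.

```python
def arbitrate(ports, pending, grants):
    """Grant up to `grants` requests.

    ports:   name -> (absolute_priority, weight); weights assumed >= 1 here
    pending: name -> number of queued requests
    """
    pending = dict(pending)                      # work on a copy
    credits = {name: 0 for name in ports}
    order = []
    for _ in range(grants):
        ready = [n for n in ports if pending.get(n, 0) > 0]
        if not ready:
            break
        top = max(ports[n][0] for n in ready)    # absolute priority decides first
        group = [n for n in ready if ports[n][0] == top]
        if all(credits[n] <= 0 for n in group):  # round finished, refill by weight
            for n in group:
                credits[n] = ports[n][1]
        winner = max(group, key=lambda n: credits[n])
        credits[winner] -= 1
        pending[winner] -= 1
        order.append(winner)
    return order
```

With a priority-7 port holding one request and two priority-3 ports weighted 2 and 1, the high-priority request is served first and the remaining grants alternate two-to-one between the other ports.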


Hardened Controller

Functionally similar to soft controller

DRAM interface is 40 bits wide to accommodate from 8 bits up to 32 bits + ECC

Multiple controllers can be bonded for wider interfaces, even if using different clocks:

From controller to user logic: synchronized

From controller to memory: not synchronized

Counters track data in FIFO buffers to ensure data is sent and received on same cycle

Hardened PHY

Again, similar to soft UniPHY

Portions of sequencer remain soft:

Soft: Nios II processor, instruction/data RAM, Avalon fabric

Hard: R/W manager, PHY manager, Data manager (run at full rate)

Connects to predefined I/O register blocks and pins

Pins available as regular user I/O if not using hard interface

Test Your Knowledge: Intro to Memory IP

1. What controller feature, required for running DDR3 at high speeds, is available only in Stratix devices?

A. Read and write leveling

2. What do the half-rate and quarter-rate modes allow you to do?

A. Run the internal interface logic at half or quarter of the speed of the external memory to ease timing closure

3. What FPGA settings are adjusted during the calibration process?

A. PLL clock output phase and I/O delay chain lengths


Section 1 Resources

Memory Resource Center: http://www.altera.com/technology/memory/mem-index.jsp

User guides:

External Memory Interfaces Handbook

Quartus II Software Handbook

Device handbooks:

Cyclone III, Cyclone IV, & Cyclone V FPGAs

Arria GX, Arria II GX/GZ, & Arria V FPGAs

Stratix III, Stratix IV, & Stratix V FPGAs

Application notes:

AN461: QDRII and QDRII+ with Stratix III/IV devices

AN637: Sharing External Memory Bandwidth Using the MPFE Reference Design

Section 2: Memory Interface Design Flow in the Quartus II Software

Recommended Memory Interface Design Flow

[Flowchart; steps in order]

Start design

Select device

Create and parameterize memory interface

Instantiate PHY & controller (example or custom design)

Perform functional simulation (optional, but recommended)

Expected results? If no, debug design; if yes, continue

Add constraints (I/O, timing, etc.) and compile

Verify timing

Meets timing and performance? If no, adjust constraints; if yes, continue

Board (PCB) related tasks (layout, simulation, termination, drive strength settings, etc.)

Verify functionality & SI on board

Works correctly? If no, debug design; if yes, design complete

3 Main Design Flows

MegaWizard™ Plug-In Manager flow:

Full custom parameterization of IP core variant

Instantiate anywhere in existing design

Can generate a complete example design and testbench

Our focus for today

SOPC Builder flow:

Generates complete simulation environment

Integrate memory interface with other custom IP

Uses Avalon-MM interfaces for easy integration

Not recommended for new designs; use Qsys instead

Qsys flow (discussed later):

All advantages of SOPC Builder plus…

Hierarchical system design

Higher performance interconnect

Create or Open a Quartus II Project

Create a new Quartus II project or open an existing project

Select the target device or device family:

Memory type support

Desired performance

Parameterize with the MegaWizard Plug-In Manager

Creating the Interface

MegaWizard Plug-In Manager

Easy creation and parameterization of entire interface

Tools menu or Tasks window

Select Memory Controller IP

Select PHY, output file HDL, and instance name


Parameterize the IP

Enable hard interface (Arria V & Cyclone V devices only)

Multiple settings tabs

Memory presets

PHY Settings

PHY-only generation for use with custom controller

Clock frequencies & half/full-rate mode selection

Fine adjustment of clock phases

Resource sharing options for multiple memory interfaces

Memory Parameters

Use presets as a “starting point” for customizing external memory parameters and timing

Adjust parameters (if needed) to match data sheet and memory use

Custom Memory Presets

Create new preset from MegaWizard settings (.qprs file)

Update and save with custom settings

Memory Initialization Options

Configures mode registers with MRS command during initialization

Adjust parameters (if needed) to match data sheet and memory use

Check memory device datasheet for details on each mode register setting

Memory Timing

Adjust memory timings to match datasheet; may need to derate default values

Reading Memory Datasheets

1GB Micron DDR3 MT9JSF12872AY-1G1 DIMM:

72 bits wide (including ECC check bits)

128M words deep (128 x 1,000,000)

(72-8)/8 = 8 bytes x 128M = 1 GB
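
The capacity arithmetic above can be checked with a few lines of Python. The function name is hypothetical, and the sketch assumes that "128M" means 128 x 2^20 words, which is what makes the result exactly 1 GiB; with a decimal million the total would be 1,024,000,000 bytes.

```python
def dimm_capacity_bytes(width_bits, ecc_bits, depth_words):
    """Usable capacity: data bytes per word times number of words."""
    data_bytes_per_word = (width_bits - ecc_bits) // 8   # ECC bits hold no user data
    return data_bytes_per_word * depth_words


# 72-bit wide DIMM with 8 ECC check bits, 128M words deep (assumed 128 * 2**20)
capacity = dimm_capacity_bytes(72, 8, 128 * 2**20)
```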

Component:

Simple chip / single device soldered on board (or DIMM)

Requires one datasheet for general speed ratings and options

DIMM (UDIMM or RDIMM):

Collections of components placed into sockets

Additional datasheet required for component-specific timing numbers

Setting Memory Options (Either Datasheet)

Obtain row and column addressing widths, number of clock pairs and chip selects, DQ width, etc.

Adjust Memory Parameters tab

Setting Memory Options (cont.)

If using different memory than preset, be sure to set options for correct operating speed, speed bin, and configuration

Example: CAS latency (CL) and CAS write latency (CWL) from component datasheet

For 533 MHz, -187E components must use CL of 7 or 8 and CWL of 6; CWL of 5 not allowed

Can cause initialization or calibration failures!

Setting Timing Parameters

Adjust for desired operating frequency on Memory Timing tab

Be aware of units! Different vendors use different units, and units used in specs may not match units in MegaWizard!

Most common error: clock cycles vs. ps/ns

With memory presets, units should match, but be sure to double check!
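
A small conversion helper makes the clock-cycles versus ps/ns distinction explicit. The Python sketch below uses integer picoseconds to avoid rounding surprises; the function names and the example numbers (a 1875 ps clock period and a 13125 ps parameter) are illustrative assumptions, not values taken from a specific datasheet.

```python
def ps_to_cycles(t_ps, tck_ps):
    """Smallest whole number of clock cycles that covers a time given in ps."""
    return -(-t_ps // tck_ps)          # ceiling division on integers


def cycles_to_ps(cycles, tck_ps):
    """Time in ps spanned by a number of clock cycles."""
    return cycles * tck_ps
```

For example, a 13125 ps requirement at a 1875 ps clock period is exactly 7 cycles, while 13126 ps would need 8.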


Timing Derating

Setup and hold time settings for DQ (wrt DQS/DQSn) and control/address (wrt CK/CK#) must be “derated”:

tDS, tDH, tIS, tIH specifications in component datasheet

Adjust values to account for different slew rates on signals, usually due to additional loading

Example: single vs. multi-rank (multi-CS) memory configurations

Without derating, timing analysis may be overly optimistic:

Timing analysis passes, actual board implementation fails!
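
Derating is usually expressed as a datasheet base value plus a slew-rate-dependent adjustment looked up in a table. The Python sketch below assumes that convention; the function name is hypothetical and the table values are placeholders for illustration only, NOT numbers from any memory datasheet.

```python
def derated_limit_ps(base_ps, delta_table_ps, slew_v_per_ns):
    """Base setup/hold value plus the adjustment tabulated for a slew rate."""
    try:
        delta = delta_table_ps[slew_v_per_ns]
    except KeyError:
        raise ValueError(f"no derating entry for slew rate {slew_v_per_ns} V/ns")
    return base_ps + delta


# Placeholder adjustment table (slew rate in V/ns -> delta in ps); illustration only.
EXAMPLE_DELTA_PS = {2.0: 50, 1.0: 0, 0.5: -50}
```

The slew rate used for the lookup is the one obtained from board simulation, which is why the derated value can differ between single-rank and multi-rank boards.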


Timing Derating (cont.)

1. Enter base values for settings (memory preset or from component datasheet)

2. Perform board simulations to determine slew rates with tools like Mentor Graphics® HyperLynx (discussed later)

3. Enter simulation information into MegaWizard Plug-In Manager to automatically derate

If board simulation results not available, Altera defaults can be used (not recommended)

Automatic Derating

1. Enter base values for settings in the Memory Timing tab

2. Enter slew rate information into Board Settings tab

Derated values automatically calculated

Additional Board Settings

Adjusts generated SDC constraints for timing analysis to account for board effects

Account for ISI effects usually found in multi-rank systems

Account for differences in board trace lengths

Controller Settings

Required for SOPC Builder and Qsys integration

Chip-row-bank-col: use bank look-ahead to hide effect of burst lengths greater than column width

Chip-bank-row-col: allocate separate physical banks to multiple masters
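
The difference between the two address orderings is easiest to see by decoding the same linear address both ways. In the Python sketch below, `split_address` and the field widths (1 chip bit, 13 row bits, 3 bank bits, 10 column bits) are hypothetical values chosen for illustration.

```python
def split_address(addr, order, widths):
    """Decode a linear address into fields; `order` lists fields MSB first."""
    fields = {}
    for name in reversed(order):                 # peel fields off from the LSB
        fields[name] = addr & ((1 << widths[name]) - 1)
        addr >>= widths[name]
    return fields


WIDTHS = {"chip": 1, "row": 13, "bank": 3, "col": 10}   # assumed widths
CHIP_ROW_BANK_COL = ["chip", "row", "bank", "col"]
CHIP_BANK_ROW_COL = ["chip", "bank", "row", "col"]

# First address past the end of the column range.
next_after_columns = 1 << WIDTHS["col"]
```

With chip-row-bank-col, `next_after_columns` lands in the next bank of the same row, so a sequential stream moves across banks; with chip-bank-row-col it lands in the next row of the same bank, which keeps each address region within one bank.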

Larger depth = more efficient, but more resources required (possible max freq. hit)

Enable data reordering and set starvation limit

Enable CSR interface and how it will be accessed

Enable ECC and choose whether to auto correct

Multi-Port Front End Settings (Hard Controller Only)

Bond with another hard controller to create wider data widths

Absolute priority (1-7; higher level has priority over lower levels)

Weight setting for weighted round robin (WRR) (0-32; relative priority)

Diagnostics Settings

Reduce simulation time by skipping calibration and initialization (only really needed in hardware)

Enable the Efficiency Monitor for use with the EMIF Toolkit

Generate the IP

Clicking Finish generates the IP

Choose whether or not to generate the example design

MegaWizard Output

Top-level wrapper file for instantiation:

<project_directory>/<variation_name>.v or .vhd

Files for synthesis and simulation pointed to by .qip (added to project automatically):

<project_directory>/<variation_name>/

<project_directory>/<variation_name>_sim/

Always add QIP file to Quartus II project:

Adds all IP HDL files to project for synthesis

One file to add instead of multiple files

Generated Example Design

Complete working reference design project (.qpf) if example design generation enabled

Files in <project_directory>/<variation_name>_example_design/example_project

[Figure: example design block diagram. A traffic generator with pass or fail outputs connects as master over the local interface (Avalon-MM) to the DDR/2/3 controller; the controller logic connects through the AFI to the PHY, which connects to the external DDR/2/3 memory. A clock source, PLL, and DLL are also shown]

Hierarchy of Example Design

Top level: <variation_name>_example.v/vhd

Driver / User Logic (issues reads/writes): <variation_name>_example_d0.v/vhd

MegaCore top level: <variation_name>_example_if0.v/vhd

Instantiates controller core: <variation_name>_example_if0_c0.v/vhd

Instantiates PHY core: <variation_name>_example_if0_p0.v/vhd

Top Level Code (Verilog)

module ddr3_top_example (
    // Must assign these ports to I/O pins
    input  wire        pll_ref_clk,
    input  wire        global_reset_n,
    output wire [12:0] mem_a,
    output wire [2:0]  mem_ba,
    output wire        mem_ck,
    output wire        mem_ck_n,
    output wire        mem_cke,
    inout  wire [63:0] mem_dq,
    inout  wire [7:0]  mem_dqs,
    inout  wire [7:0]  mem_dqs_n,
    output wire [7:0]  mem_dm,
    output wire        mem_cs_n,
    output wire        mem_ras_n,
    output wire        mem_cas_n,
    output wire        mem_we_n,
    output wire        mem_reset_n,
    output wire        mem_odt,
    // Designed for use with test driver only; could be promoted to debug header / port if desired
    output wire        drv_status_fail,
    output wire        drv_status_test_complete,
    output wire        drv_status_pass
);

Constraints Required

Generated by MegaWizard Plug-In Manager and pointed to by .qip

Full set of SDC timing constraints (.sdc files)

Tcl scripts to create timing report on memory interface margin

<variation_name>_report_timing.tcl

<variation_name>_report_timing_core.tcl

Pin I/O standards and grouping script (must be run by user)

<variation_name>_pin_assignments.tcl

Required from user

Pin placement constraints done in Quartus II Pin Planner

Pin Planner file (.ppf) available for early I/O planning flow

See the I/O System Design online training for details

QDR II/II+ SRAM Controller With UniPHY

[Figure: QDR II/II+ example design block diagram: traffic generator, local interface (Avalon-MM) with separate read and write ports, QDR II/II+ controller (write data FIFO, command issuing FSM), AFI, PHY, DLL, PLL, QDR II/II+ SRAM]

RLDRAM Controller With UniPHY

Similar to QDR except has timers that interrupt controller to do refresh

[Figure: RLDRAM example design block diagram: traffic generator, local interface (Avalon-MM), RLDRAM controller (write data FIFO, command issuing FSM, refresh timer, bank timers), AFI, PHY, DLL, PLL, RLDRAM II]

Quartus II Project Settings

Recommended Quartus II Settings

Optimize hold timing feature for All Paths

Standard Fit (highest effort) option

Default is Auto Fit (shorter compilation)

Stand-alone memory designs should meet timing with Auto Fit

Physical synthesis

Compiling the Example Design

Open and use .qpf project in example_project directory

Test Your Knowledge: Design Flow

1. Why is timing derating necessary when creating an external memory interface?

A. Adjusts setup and hold timing requirements to the external memory to account for board effects

2. What information is stored in a memory preset?

A. The default values for all memory interface parameter settings based on a given external memory device

3. What are the main components of the example design generated by the MegaWizard Plug-In Manager?

A. Top-level design, traffic generator, controller, and PHY

Section 2 Resources

User guides

External Memory Interfaces Handbook (Volume 2, Section I)

Quartus II Software Handbook

Device handbooks

Cyclone III, Cyclone IV, & Cyclone V FPGAs

Arria GX, Arria II GX/GZ, & Arria V FPGAs

Stratix III, Stratix IV, & Stratix V FPGAs

Please go to Exercise 1

Create a design that includes a high performance memory controller and PHY

Section 3: Functionality and Simulation of a Memory System

Agenda

Controller functionality

Connections to core RTL design

Latency

Consecutive burst support

Top-level system design description

Simulation models and directory structure

Simulation setup inside the Quartus II software

Choosing EDA simulator and setting up NativeLink

Using Quartus II software-generated scripts

Running simulation

Controller Operation and Connections to User Logic

Example Design Revisited

Simulation project with testbench and memory model found in <project_directory>/<variation_name>_example_design/simulation

[Figure: testbench block diagram: example design (traffic generator with pass or fail outputs, local interface (Avalon-MM), DDR/2/3 controller with controller logic, AFI, PHY, DLL, PLL) connected to a memory model, with a clock source]

Memory Controller Interface Signals

[Figure: DDR3 SDRAM controller block diagram: memory controller connected over AFI to UniPHY or ALTMEMPHY, with the signals listed below]

Local interface (Avalon-MM), user logic to controller: avl_addr, avl_be, avl_burstbegin, avl_read_req, local_refresh_req, local_refresh_chip, avl_size, avl_wdata, avl_write_req, local_autopch_req, local_self_rfsh_chip, local_self_rfsh_req, local_multicast, csr_addr, csr_be, csr_read_req, csr_write_req, csr_wdata

Local interface (Avalon-MM), controller to user logic: local_init_done, avl_rdata, avl_rdata_valid, avl_rdata_error, avl_ready, local_refresh_ack, local_power_down_ack, local_self_rfsh_ack, ecc_interrupt, csr_rdata, csr_rdata_valid, csr_waitrequest

External memory interface: mem_addr, mem_ac_parity, mem_ba, mem_cas_n, mem_cke, mem_cs_n, mem_dm, mem_odt, mem_ras_n, mem_we_n, parity_error_n, mem_dq, mem_dqs, mem_dqs_n, mem_err_out_n

Verify Design in Silicon

SignalTap II Logic Analyzer

Verify local memory interface and pass-or-fail signals

Do not use on external memory interface pins

Tapping signals adds stubs, affects timing!

Ops Initiated from User Logic or Example Driver

From local (Avalon) interface

Memory writes

Memory reads

Reads and writes must not be activated simultaneously, else core falls into unknown state

Refresh

If user-controlled refresh option enabled

Else Auto Refresh (ARF) command periodically issued

Local Interface: Hand Shaking Schemes

Local interface signals can be separated into 2 groups

1. request / avl_ready group

avl_write_req, avl_read_req, avl_addr, avl_size

Controller returns: avl_ready

avl_ready high: controller ready to accept signals

avl_ready low: user logic must hold read/write request, size, and address signals until avl_ready = 1

2. write / read data group

avl_wdata: data to write put on bus along with avl_write_req

avl_rdata_valid, avl_rdata: read data valid signal tells local interface that valid data is present

Hand Shaking when avl_ready Low

Local interface must hold read/write request, size, and address signals fixed until avl_ready = 1
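The hold rule can be illustrated with a small cycle-by-cycle model. This is only a behavioral sketch in Python, not RTL and not part of the generated IP; the function name `drive_requests` and the request fields (`write`, `addr`, `size`) are hypothetical names chosen for the illustration.

```python
from typing import Dict, List, Sequence, Tuple

Request = Dict[str, int]  # hypothetical fields: "write" (0/1), "addr", "size"


def drive_requests(requests: Sequence[Request],
                   avl_ready: Sequence[int]) -> List[Tuple[int, Request]]:
    """Return (cycle, request) pairs for each request the controller accepts.

    avl_ready holds one sampled value per local clock cycle.
    """
    accepted: List[Tuple[int, Request]] = []
    pending = list(requests)
    for cycle, ready in enumerate(avl_ready):
        if not pending:
            break
        held = pending[0]      # request, size, and address stay fixed
        if ready:              # avl_ready = 1: controller takes the request
            accepted.append((cycle, held))
            pending.pop(0)     # the next request may be presented next cycle
    return accepted


reads = [{"write": 0, "addr": 0x10, "size": 1},
         {"write": 0, "addr": 0x20, "size": 2}]
trace = drive_requests(reads, avl_ready=[0, 0, 1, 0, 1])
print([cycle for cycle, _ in trace])  # prints [2, 4]
```

The first request is held unchanged through cycles 0 and 1 and is accepted in cycle 2; the second is held through cycle 3 and accepted in cycle 4.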


HPC II Architecture: Functional Overview

Memory refresh

Periodically issue auto-refresh command for data retention

Can choose user-controlled refresh option

Memory initialization

Memory must be initialized before functional use

Initialize memory automatically in MegaWizard settings

Training/calibration (user transparent)

Between controller, PHY, and memory

Controller NOT READY during this phase

Memory write/read

Must follow bank management order

Core automatically manages bank within memory chips

Controller Operation – Bus Commands

Command                                Acronym  ras_n  cas_n  we_n
No operation                           NOP      High   High   High
Active (opens bank for reads/writes)   ACT      Low    High   High
Read                                   RD       High   Low    High
Write                                  WR       High   Low    Low
Burst terminate (DDR only)             BT       High   High   Low
Precharge                              PCH      Low    High   Low
Auto refresh                           ARF      Low    Low    High
Mode register set                      MRS      Low    Low    Low

DDR2/3 bursts can be burst chop of 4 (BC4), full burst length of 8 (BL8), or “on-the-fly” (set during initialization)
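When reading a simulation waveform, the command table can be used as a lookup from the three strobes to the command acronym. The sketch below is a Python illustration only; `COMMANDS` and `decode_command` are hypothetical names, 1 stands for High and 0 for Low, and it assumes the device is selected (mem_cs_n low), which the table does not show.

```python
from typing import Dict, Tuple

# (ras_n, cas_n, we_n) -> acronym; 1 = High, 0 = Low, as in the command table.
# Assumption: chip select (mem_cs_n) is asserted low for the sampled cycle.
COMMANDS: Dict[Tuple[int, int, int], str] = {
    (1, 1, 1): "NOP",  # no operation
    (0, 1, 1): "ACT",  # active: opens bank for reads/writes
    (1, 0, 1): "RD",   # read
    (1, 0, 0): "WR",   # write
    (1, 1, 0): "BT",   # burst terminate (DDR only)
    (0, 1, 0): "PCH",  # precharge
    (0, 0, 1): "ARF",  # auto refresh
    (0, 0, 0): "MRS",  # mode register set
}


def decode_command(ras_n: int, cas_n: int, we_n: int) -> str:
    """Translate sampled strobe levels into the command acronym."""
    return COMMANDS[(ras_n, cas_n, we_n)]


print(decode_command(0, 1, 1))  # prints ACT
print(decode_command(1, 0, 1))  # prints RD
```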


Read and Write Bursts

Controller allows you to use any burst length up to maximum burst length set on memory device

Full-rate: burst lengths 1, 2, 4 (local) = 2, 4, 8 (memory)

Half-rate: burst lengths 1, 2 (local) = 4, 8 (memory)

"On-the-fly" burst length selection with address pin lets controller optimize bursts for maximum efficiency

Controller   Local interface   Memory interface
Full-rate    1, 2, 4           2, 4, 8
Half-rate    1, 2              4, 8
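The local and memory burst lengths listed above are consistent with a simple relationship: memory burst length = local burst length x 2 (double data rate) x rate ratio. That relationship is inferred from the numbers on the slide rather than stated there, so treat this Python sketch as an assumption; `memory_burst_length` and `RATE_RATIO` are hypothetical names.

```python
# Rate ratio between memory clock and local clock (assumed: full = 1, half = 2).
RATE_RATIO = {"full": 1, "half": 2}

# Local burst lengths listed for each controller rate.
LOCAL_BURSTS = {"full": (1, 2, 4), "half": (1, 2)}


def memory_burst_length(local_burst: int, rate: str) -> int:
    """Map a local-interface burst length to the burst length seen by the memory."""
    if local_burst not in LOCAL_BURSTS[rate]:
        raise ValueError(f"local burst {local_burst} not listed for {rate}-rate")
    return local_burst * 2 * RATE_RATIO[rate]


for rate in ("full", "half"):
    print(rate, [memory_burst_length(b, rate) for b in LOCAL_BURSTS[rate]])
# prints: full [2, 4, 8]
#         half [4, 8]
```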

Consecutive Bursting

Data from one read or write command is concatenated with data from subsequent command

Effectively moving data on every clock cycle, despite burst size limits

No wait states/gaps on DDR data bus

Possible within an open row, when the next (RD/WR) command is issued within an interval of burst length / 2 cycles

HPCII manages this intelligently, storing commands and issuing them out of order if possible

Any gaps or wait states means the bus is empty; efficiency of controller goes down
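A rough way to see the cost of a gap is to count busy data-bus cycles. The Python sketch below is a simplified model under stated assumptions: each burst occupies burst length / 2 memory clock cycles (two beats per clock), command-to-data latency is a constant offset and is ignored, and `data_bus_efficiency` is a hypothetical helper, not a tool or IP function.

```python
from typing import Sequence


def data_bus_efficiency(burst_length: int, issue_cycles: Sequence[int]) -> float:
    """Fraction of memory clock cycles that carry data.

    issue_cycles lists the cycle on which each RD/WR command is issued,
    in increasing order. Measured from the first burst to the end of the last.
    """
    cycles_per_burst = burst_length // 2  # DDR: two data beats per clock
    for earlier, later in zip(issue_cycles, issue_cycles[1:]):
        if later - earlier < cycles_per_burst:
            raise ValueError("commands closer than BL/2 cycles would overlap")
    busy = cycles_per_burst * len(issue_cycles)
    span = issue_cycles[-1] + cycles_per_burst - issue_cycles[0]
    return busy / span


# BL8 commands every 4 cycles: bursts concatenate, no gaps.
print(data_bus_efficiency(8, [0, 4, 8, 12]))   # prints 1.0
# A 2-cycle gap before the third command leaves the bus idle.
print(data_bus_efficiency(8, [0, 4, 10, 14]) < 1.0)  # prints True
```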


Consecutive Bursting

Controller issues next command on BL/2 cycles interval to concatenate bursts as long as addressing within same row

Otherwise ACT needed to address new row

HPCII bank look-ahead mitigates this

As long as accessing row doesn't change, controller can continue bursting over and over (until Refresh)

[Figure: SDRAM rows 1, 2, and 3; a gap breaks the consecutive burst]

Consecutive READ Burst


Consecutive WRITE Burst


Altera PHY Interface (AFI)

Communication protocol between the controller and PHY

Single data rate interface transfers high and low data in one clock cycle

Up to PHY to split into rising and falling edge data

Handles the transition between full memory clock speed and half (or quarter) rate data

AFI bus signal width = mem_signal_width * signal_rate * AFI_RATE_RATIO

where

signal_rate = 1 for SDR protocols, 2 for DDR protocols

AFI_RATE_RATIO = 1 for full-rate, 2 for half-rate, 4 for quarter-rate

Ex.: 13-bit DDR3 address bus using half-rate mode

13 * 1 * 2 = 26-bit AFI_ADDR_WIDTH
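The width formula can be checked with a few lines of Python. This is a sketch of the arithmetic only; `afi_width` is a hypothetical helper name. Note that the slide's example uses signal_rate = 1 for the address bus, since address signals are driven at single data rate even on a DDR3 device.

```python
# AFI_RATE_RATIO values as given in the formula's definitions.
AFI_RATE_RATIO = {"full": 1, "half": 2, "quarter": 4}


def afi_width(mem_signal_width: int, signal_rate: int, rate: str) -> int:
    """AFI bus width = mem_signal_width * signal_rate * AFI_RATE_RATIO."""
    if signal_rate not in (1, 2):
        raise ValueError("signal_rate is 1 (SDR) or 2 (DDR)")
    return mem_signal_width * signal_rate * AFI_RATE_RATIO[rate]


# Slide example: 13-bit DDR3 address bus in half-rate mode.
print(afi_width(13, 1, "half"))  # prints 26

# Illustration only: a 64-bit double data rate data bus in half-rate mode.
print(afi_width(64, 2, "half"))  # prints 256
```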


Read Sequence from User Logic

As illustrated on next slide…

1) User logic requests READ by asserting avl_read_req along with size and address

Accepted by controller indicated by avl_ready high

2) Controller issues ACT to PHY over AFI

3) PHY issues ACT to memory

4) Controller issues READ over AFI

5) PHY issues READ to memory

6) PHY receives DDR data from memory

7) Controller receives SDR (half-rate) read data over AFI

8) User logic receives read data from controller

[Figure: read sequence diagram with callouts 1 through 8 matching the numbered read steps]

Write Sequence from User Logic

As illustrated on next slide…

1) User logic requests WRITE by asserting avl_write_req along with size and address

Accepted by controller indicated by avl_ready high

2) Controller receives write data

3) Controller issues ACT to PHY over AFI

4) PHY issues ACT to memory

5) Controller issues WRITE over AFI

6) PHY issues WRITE command to memory

7) Controller issues write data to PHY over AFI

8) PHY sends write data to memory

[Figure: write sequence diagram with callouts 1 through 8 matching the numbered write steps]

Latency

Read latency

Cycles for data to arrive at local interface after read request

Total latency: from read command to data

Memory latency: from read command hitting the memory to data back

Write latency

Cycles for data to arrive at memory interface after write request

Basic assumptions

Reading and writing to the rows that are already open

avl_ready signal is asserted high (no wait states)

Number of clock cycles using local (PHY) clock

Read Latency Components

Controller latency: avl_read_req to afi_rdata_en (AFI signal)

Command output latency: afi_rdata_en to mem_cs_n

CAS latency: read command to DQ data appearing on the bus

PHY read data input latency: until read data appears on the local interface

Write Latency Components

Controller latency: avl_write_req to write command on AFI

Command output latency: write command on AFI to mem_cs_n

PHY write data output latency: until write data appears on memory interface DQ/DQS pins

DDR3 Latency with UniPHY

Measured in full-rate (memory clock) clock cycles

Varies depending on read or write and whether memory's required CAS write latency (CWL) setting is odd or even

Controller  Controller   PHY          Memory    PHY     Controller  Round   Round trip
rate        address &    address &    maximum   read    read        trip    without
            command      command      read      return  return              memory
Quarter     20           8-11         5-11      14-17   8           57-65   52-56
Half        10           3-4          5-11      6-7     4           28-36   23-25
Full        5            0            5-11      4       10          24-30   19
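For the half-rate and full-rate rows, the round-trip figures equal the sum of the five component columns, and the "without memory" figures equal the same sum with the memory column left out. The Python sketch below reproduces that arithmetic; the names are hypothetical. The quarter-rate row is deliberately left out because its published round-trip range (57-65) is narrower than a plain sum of the component minimums and maximums (55-67), so its components presumably do not vary independently.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (minimum, maximum) in memory clock cycles

# Component order: controller address & command, PHY address & command,
# memory maximum read, PHY read return, controller read return.
HALF_RATE: List[Range] = [(10, 10), (3, 4), (5, 11), (6, 7), (4, 4)]
FULL_RATE: List[Range] = [(5, 5), (0, 0), (5, 11), (4, 4), (10, 10)]
MEMORY_INDEX = 2  # position of the memory read latency component


def round_trip(components: List[Range]) -> Range:
    """Sum the minimums and maximums of all components."""
    return (sum(lo for lo, _ in components), sum(hi for _, hi in components))


def round_trip_without_memory(components: List[Range]) -> Range:
    """Same sum, excluding the memory read latency component."""
    rest = [c for i, c in enumerate(components) if i != MEMORY_INDEX]
    return round_trip(rest)


print(round_trip(HALF_RATE), round_trip_without_memory(HALF_RATE))
# prints (28, 36) (23, 25)
print(round_trip(FULL_RATE), round_trip_without_memory(FULL_RATE))
# prints (24, 30) (19, 19)
```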


Performing a Simulation

Recommended Memory Interface Design Flow

[Figure: design flow chart with the following steps]

1. Start design

2. Select device

3. Create and parameterize memory interface

4. Instantiate PHY & controller (example or custom design)

5. Perform functional simulation (optional but recommended); if results are not as expected, debug design

6. Add constraints (I/O, timing, etc.) and compile

7. Verify timing; if design does not meet timing and performance, adjust constraints

8. Board (PCB) related tasks (layout, simulation, termination, drive strength settings, etc.)

9. Verify functionality & SI on board; if it does not work correctly, debug design

10. Design complete

Review: Design Files Generated

MegaWizard Plug-In Manager generates:

IP instance

Simulation model with scripts for simulating with Mentor Graphics, Cadence, or Synopsys® tools

(Optional) Example design and scripts for generating example design testbench

About the Traffic Generator

State machine with Avalon-MM interface

Individual/block reads/writes

Sequential/random addressing

Writes data patterns to a range of addresses in all memory banks

Reads back data

Checks to see if it matches

Testbench outputs

drv_status_pass, drv_status_fail: active high; indicates test pass or fail

drv_status_test_complete: transitions high for one clock cycle at end of test

Message printed to simulation console stating whether test PASSES or FAILS
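The write, read back, and compare pattern can be sketched in a few lines of Python. This is a behavioral illustration of the test method only, not the generated traffic generator; `run_traffic_test`, the data pattern, and the dictionary standing in for the memory are all assumptions made for the example. Only the three status names come from the slide.

```python
from typing import Callable, Dict, Iterable


def run_traffic_test(write: Callable[[int, int], None],
                     read: Callable[[int], int],
                     addresses: Iterable[int],
                     pattern: Callable[[int], int]) -> Dict[str, bool]:
    """Write a pattern to each address, read it back, and compare."""
    addrs = list(addresses)
    for addr in addrs:
        write(addr, pattern(addr))
    mismatches = [addr for addr in addrs if read(addr) != pattern(addr)]
    passed = not mismatches
    return {
        "drv_status_pass": passed,
        "drv_status_fail": not passed,
        "drv_status_test_complete": True,
    }


def address_pattern(addr: int) -> int:
    """Hypothetical 32-bit data pattern derived from the address."""
    return (addr * 0x9E3779B1) & 0xFFFFFFFF


memory: Dict[int, int] = {}  # stands in for the memory model
good = run_traffic_test(memory.__setitem__, memory.__getitem__,
                        range(0, 64, 4), address_pattern)
print("PASSES" if good["drv_status_pass"] else "FAILS")  # prints PASSES

# Inject a single-bit read error at address 8 to see the fail flag.
bad = run_traffic_test(memory.__setitem__,
                       lambda a: memory[a] ^ (1 if a == 8 else 0),
                       range(0, 64, 4), address_pattern)
print("PASSES" if bad["drv_status_pass"] else "FAILS")   # prints FAILS
```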


Greater Project Directory Structure

(Quartus II design project folder)

Project (.qpf) file

Settings (.qsf) file

Quartus IP (.qip) file (only file needed to be added to project)

(instance design files and constraints)

(standalone reference design)

• Example project .qpf, .qsf, and .qip for synthesis

• Scripts for generating Verilog or VHDL modules for simulation

• Includes top-level testbench and generic memory model

(functional simulation files for components of the IP plus scripts for simulating in 3rd party tools)

Generating Simulation Modules

1. Open simulation example project <variation_name>_example_design/simulation/<variation_name>_example_sim.qpf

2. Run appropriate Tcl script to generate Verilog or VHDL

generate_sim_verilog_example_design.tcl

generate_sim_vhdl_example_design.tcl

Creates verilog or vhdl folder

Creates submodules folder

cadence, synopsys, and mentor subfolders contain simulation scripts for each vendor's tool

Example Project Sim Hierarchy

[vhdl/verilog]/<variation_name>_example_sim.v/.vhd (simulation example design)

<variation_name>_example_sim_e0.v/.vhd (example design wrapper)

<variation_name>_example_sim_e0_d0.v/.vhd (traffic generator)

<variation_name>_example_sim_e0_if0.v/.vhd (example design DUT)

alt_mem_if_ddr3_mem_model_top_ddr3_mem_if_dm_pins_en_mem_if_dqsn_en.sv (generic memory model)

status_checker_no_ifdef_params.sv (pass/fail status checker)

In [vhdl/verilog]/submodules/ folder

Detailed Simulation Example Design

[Figure: simulation example design hierarchy: ddr3_top_example_sim.v/vhd contains ddr3_top_example_sim_e0.v/vhd, the memory model, and the status checker (pass, fail, test_complete); ddr3_top_example_sim_e0 contains ddr3_top_example_sim_e0_d0.v/vhd (traffic generator) and ddr3_top_example_sim_e0_if0.v/vhd (controller logic, AFI, PHY, DLL, PLL), joined by the local interface (Avalon-MM); inputs pll_ref_clk and global_reset_n]

Note: Simulate just UniPHY with your own controller and traffic generator with the Generate PHY only option

Generic or Vendor Model?

Generic: fully follows all memory protocol specifications

Guaranteed to simulate accurately with generated IP

No additional setup or configuration required

Vendor: standardized and more thorough than generic model

May require additional setup and configuration

Requires manual connection to testbench

Vendor model simulation is not supported, but may provide a more accurate simulation of your actual memory interface

Obtain Model from Vendor

Vendor downloads may include additional parameter files and instructions


Placement of Vendor Model Files

All sim modules generated in submodules folder including generic memory model

Place downloaded vendor model files (usually model and parameters files) here

Edit Vendor Model File

Define selected device parameters

Select correct speed grade and device width

    `define sg15E   // Speed grade -15E
    `define x16     // DQS bank width or part width

Be sure to include any parameter files

    `include "ddr3_parameters.vh"

See instructions included with model files for parameter selection details

Replace Generic Instantiation

[Figure: code screenshots showing the generic memory model instantiation ("Replace this:") and the vendor model instantiation that takes its place ("With this:")]

Point Script to Vendor Model

For Mentor Graphics ModelSim®, edit msim_setup.tcl

For Synopsys VCS, edit vcs_setup.sh or vcsmx_setup.sh

For Cadence NCSim, edit ncsim_setup.sh

Similar scripts for simulating just IP (not example design) in <variation_name>_sim/

VHDL Simulation Notes

Generated files are mix of VHDL and encrypted or plain-text Verilog

Requires mixed-language simulation tool

Encrypted files allow for use with ModelSim tool with VHDL-only license

Use Verilog or synthesis fileset for ModelSim mixed-language license

Run the Simulation

Run script appropriate to tool

From shell or within simulation tool

ModelSim example from shell: vsim -do run.do

ModelSim example in tool: do run.do

Post-Fit (Gate-Level) Simulation

Not possible with UniPHY due to inherent issues with its behavior in a post-fit netlist

Sampling X's during calibration

Internal 0-cycle transfers require delays for simulation

Can be worked around with a "quasi-post fit scheme" using Quartus incremental compilation

Pre-map RTL for UniPHY

Post-fit RTL for rest of design

See Simulating Memory IP chapter of the EMIF handbook for the step-by-step process

Using 3rd Party Controller IP

Vendor should provide test bench with test generator used to stimulate controller through Avalon (local) interface

If using own controller, follow Avalon interface specifications

Start design using the Altera example driver and add/connect extra signals as needed to exercise additional controller functionality

Use PHY-only generation to simplify setup

Connecting User Logic to Controller

Replace traffic generator with user logic

<variation_name>_example_sim_e0_d0.v in hierarchy

Borrow from test methodology defined inside traffic generator module

Obey functional requirements defined in controller handbook and external (physical) memory datasheets

Simulation Steps Summarized

Generate memory IP, choosing to generate the example design

Generate Verilog or VHDL simulation files using script

Edit and instantiate vendor memory simulation model

Point simulation script to location of vendor model

Run appropriate simulation script

Simulate and verify

Test Your Knowledge: Functionality & Simulation

1. What are the three main functions of the controller?

A. Memory initialization, refresh, and reading/writing; calibration is done by sequencer before controller handoff

2. What step must be performed in the simulation example project before you can run a simulation?

A. Generate files for Verilog or VHDL simulation using generated Tcl script

Section 3 Resources

User guides

External Memory Interfaces Handbook (Volume 2, Section I)

Quartus II Software Handbook

Device handbooks

Cyclone III, Cyclone IV, & Cyclone V FPGAs

Arria GX, Arria II GX/GZ, & Arria V FPGAs

Stratix III, Stratix IV, & Stratix V FPGAs

Please go to Exercise 2

Simulation of the controller

Implementing, Simulating, & Debugging External Memory Interfaces

Section 4: Board and Termination Considerations

Recommended Memory Interface Design Flow

[Flowchart] Start design, then: Select device; Create and parameterize memory interface; Instantiate PHY & controller (example or custom design); Perform functional simulation (optional, but recommended); Add constraints (I/O, timing, etc.) and compile; Verify timing; Board (PCB) related tasks (layout, simulation, termination, drive strength settings, etc.); Verify functionality & SI on board; Design complete.

Decision points: Expected results? (no: Debug design); Meets timing and performance? (no: Adjust constraints); Works correctly? (no: Debug design)

Agenda

Assigning I/O constraints Pin locations, loading

Termination schemes

Implementing, Simulating, & Debugging External Memory Interfaces

Creating I/O Assignments

Some I/O Assignments Done For You

MegaWizard Plug-In Manager generates Tcl script that includes certain I/O constraints: <variation_name>_if0_p0_pin_assignments.tcl

You must source this script yourself prior to running Pin Planner

Script automatically constrains:

I/O Standard assignments for the various memory interface pins

Input and Output Termination assignments (discussed later)

Current Strength assignments for the high fanout address/command/control signals

DQ Group assignments to associate DQS signals with the DQ signals they clock

Memory Interface Delay Chain Configuration assignments to set the DQ, DQS, and DM I/O to use the flexible timing model


Additional I/O Constraints to Specify

Board trace components or output pin load assignments

Based on board memory topology & memory datasheet specifications

Pin Location assignments


Creating I/O Location and Other Assignments

Create and manage in the Quartus II Pin Planner

Optimized locations for memory interface pins defined in device handbooks and highlighted in Pin Planner

Implement predefined pin-outs from board layout guidelines or define and send them to board developer

May be limited by locations on device of memory I/O specific features
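As a rough illustration of what a pin location assignment looks like outside the Pin Planner GUI, the sketch below formats `set_location_assignment` lines, the Quartus II Tcl form of a location assignment. The signal-to-pin map is hypothetical (only PIN_C6 appears in this training); real pin names come from the device pin-out and the board layout.

```python
# Hypothetical signal-to-pin map, for illustration only.
PIN_MAP = {
    "mem_dq[0]": "PIN_C6",
    "mem_dq[1]": "PIN_D7",
    "mem_dqs[0]": "PIN_E8",
}


def location_assignments(pin_map):
    """Return one Quartus II Tcl location assignment line per signal."""
    return [
        f"set_location_assignment {pin} -to {signal}"
        for signal, pin in sorted(pin_map.items())
    ]


for line in location_assignments(PIN_MAP):
    print(line)
```

The generated lines could be pasted into a Tcl script and sourced, which is one way to hand a pin-out to or from a board developer.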


Quartus II Pin Planner

Assignments menu > Pin Planner

Package View

Groups list

All Pins (signals) list


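Assigning DQ Pins

View or right-click menu

[Screenshot: Pin Planner example with a DQ pin at PIN_C6 using the 1.8-V HSTL Class II I/O standard; the package view legend marks DQ and DQS pins]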


Assigning DQS Pins

One DQS (or DQS differential pair) for each DQ block


Assigning All Other Required Pins

mem_a[ ]

mem_ba[ ]

mem_ck[ ]

mem_ck_n[ ]

mem_cas, mem_cas_n

mem_ras, mem_ras_n

mem_we, mem_we_n

mem_cs, mem_cs_n

mem_dm

mem_odt

etc.


Stratix V Layout Guidelines

Devices include unique architectural features to support highest frequency memory interfaces

Guidelines must be followed to guarantee timing closure

See the Stratix V device handbook chapter entitled External Memory Interfaces in Stratix V Devices for details

http://www.altera.com/literature/hb/stratix-v/stx5_51008.pdf


Stratix V Sub-Banks and Leveling Blocks

Each I/O bank made up of multiple "sub-banks"

Example: Upper-left bank has sub-banks named 8A, 8B, and 8C; more on larger devices

Leveling blocks

Generate delayed (PVT compensated) versions of the source clock (e.g. 0°, 45°, 90°) from the PHY clock tree (next slide)

Distribute output clocks to all I/Os in each sub-bank

Implement DQS phase shift

Connected to only one of the PHY clock trees (center, left, or right) available

[Diagram: DLL, PLL, PHY clock tree, leveling block, and the I/Os of one I/O sub-bank]


PLLs & PHY Clock Trees

Dedicated high-speed, low-skew balanced trees

Three each on top and bottom edge of device

Controlled by left, center, or right dedicated PLLs along edge

Each PLL can drive only one PHY clock tree

Each PHY clock tree can reach one device edge

Each leveling block can only access one PHY clock tree

[Diagram: left, center, and right PLLs along a device edge driving the left, center, and right PHY clock trees across the sub-banks]

DLL Placement and Limitations

4 DLLs available in the corners of a device

Each DLL can serve the 2 adjacent sides

Maximum number of incompatible interfaces on each side of a device is 2 When implementing multiple interfaces (discussed later)

DLL can be shared without sharing PLLs DLL is clocked by PLL hence their frequencies must be the same

[Diagram: one DLL in each of the four corners of the device]

Mandatory Stratix V Layout Guidelines

Interfaces must not be split between top and bottom edges of device Due to PLL/DLL limitations discussed

All CK/CK#, address, control, command pins for an interface should be in same I/O sub-bank For optimum timing, especially for >800 MHz interfaces

Highly recommended, but not required for ≤800 MHz


Highly Recommended Stratix V Guidelines

Use a center, instead of a left or right corner, PLL Prevents long PHY clock tree delay from corner PLLs

Set PLL input clock reference to pin that drives center PLL

For ≥800 MHz interfaces and for wide interfaces that must straddle between quadrants

If possible, avoid straddling

If center PLL not possible, all pins should be in same quadrant as PLL


Advanced I/O Timing for Memory

Cyclone III, Stratix III, and all newer devices

Adjust timing based on HSPICE model for more accurate timing analysis

Create models for all output and bidirectional memory I/O

Match model to selected termination scheme

For use mostly when a 3rd-party board simulation tool is not available

See the I/O System Design online training for details

Assign signal to pin, select pin, then View or right-click menu > Board Trace Model

Board Trace Constraints

Allow you to model board topology from board-level simulations into the Quartus II design flow

For example:

Near and far trace lengths

Near and far trace distributed inductance

Near and far trace distributed capacitance

Near end capacitor values

Far end capacitive (IC) load

Far end termination values


Setting Far Capacitance

Set load based on memory input capacitance

Found in vendor datasheet

Example 4 memory components each with 1 CS# input Load on single FPGA CS# output pin is 4 x 2 pF = 8 pF
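The arithmetic in the example above can be sketched as a small helper. This is only an illustration of the multiplication on the slide; the function name is made up, and the 2 pF figure is the slide's example value, not a datasheet number for any particular memory.

```python
def far_end_load_pf(num_components, input_capacitance_pf):
    """Total far-end capacitive load on one FPGA output pin that fans out
    to the same input on several memory components."""
    return num_components * input_capacitance_pf


# Slide example: 4 memory components, each CS# input contributing 2 pF.
print(far_end_load_pf(4, 2.0))
```

Signals that go to only one component (point-to-point DQ, for instance) would use a component count of 1.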

Implementing, Simulating, & Debugging External Memory Interfaces

Board Design and Simulation Basics

The Need for Good Board Design

Memory interfaces are getting faster

Increased speed comes at a price Smaller data valid windows

Signal integrity affected

Board design can be a balancing act Must ensure timing requirements are met

Must ensure quality signals at receivers

Often also need to minimize power


Achieving Good Signal Integrity

Meeting timing and good signal integrity (SI) requires good placement and routing Board routing constraints are a necessity

Good signal integrity requires termination

Termination (usually) requires components

Termination resistors

Termination power supply rail(s) circuitry

More components = more difficult placement + routing + greater power usage!

Follow guidelines in the External Memory Interfaces Handbook


Design Methodology for Optimal SI

Minimize component usage to save power

Take advantage of features and settings available in FPGA and memory

Perform board-level simulation, taking features and settings into account

Experiment before building board

Determine best topology

Help generate board routing constraints

After board build, compare actual results with simulations


Board Design 101: What is Impedance?

Impedance: The resistance to the flow of energy in a transmission line

Impedance (Z) is a complex number: Z = R + jX, where R = resistance (real part) and X = reactance (imaginary part)

Reactance comes from capacitors and inductors (not discussed here)

Characteristic Impedance, Z0: Purely real impedance found on lossless transmission lines

For this discussion, assume Z0 = 50 Ω (common line value)

Impedance discontinuities on an electrical path cause reflections of energy Impedance matched lines have minimal loss and yield better SI


Signal Reflections & Impedance Matching

[Diagram: a driver with source voltage VS and source impedance ZS drives a transmission line of impedance Z0 into a load ZT, where Z0 ≠ ZT. Callouts mark where the initial reflection and the problem reflection are seen.]

VS = source voltage

ZS = source impedance

Z0 = transmission line (characteristic) impedance

ZT = load impedance

Accumulated reflections at the load cause ringing and/or over/undershoot
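The size of a reflection at a discontinuity is usually expressed with the standard transmission-line reflection coefficient, Γ = (ZT − Z0) / (ZT + Z0). The slide does not state this formula; the sketch below adds it as general background, with hypothetical load values, to show why Z0 ≠ ZT matters.

```python
def reflection_coefficient(z_load, z0=50.0):
    """Fraction of the incident voltage wave reflected at a load z_load
    terminating a line of characteristic impedance z0 (both in ohms)."""
    return (z_load - z0) / (z_load + z0)


# Hypothetical loads on a 50-ohm line.
for z_load in (50.0, 150.0, 0.0):
    print(z_load, reflection_coefficient(z_load))
```

A matched load gives zero reflection, a load above Z0 reflects a positive fraction of the wave, and a short circuit reflects the whole wave inverted.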


Termination

Need ways to Prevent reflections: Match impedance all along path

Dissipate reflections: External components or device features to dissipate reflection energy

Termination schemes prevent or dissipate signal reflections

Device settings

Component topologies


Designing Termination Schemes

Design and choose best scheme through simulation

Example simulation tool: Mentor Graphics HyperLynx Simple graphical interface to configure board stack-up

Draw I/O buffers, board trace components

Apply IBIS models (custom or generic) to buffers

Virtual probes attached to points, usually receiver, in circuit

Simulations performed using an interface similar to an oscilloscope

Also needed for getting slew rate values for timing derating

See Appendix for details and examples


IBIS Models

Voltage or current vs. time tables describe buffer behavior

Custom IBIS models for design created in Quartus II software by EDA Netlist Writer

Generic models available from Altera web site

http://www.altera.com/support/software/download/ibis/ibs-ibis_index.jsp

Implementing, Simulating, & Debugging External Memory Interfaces

Choosing Optimal Termination Settings

Termination Schemes: Series Termination

Added to source impedance to match trace impedance

Dissipates reflections back at source

15 Ω included on DDR3 DIMMs (as shown)

Decent signal integrity with low power usage

However, source impedance changes whether signal high or low

[Diagram: FPGA connected over a Z0 trace to a DDR3 DIMM, with source impedance ZS and series resistor RS; ZS + RS = Z0]
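The matching condition ZS + RS = Z0 can be turned into a quick check. The sketch below is illustrative only: the function names are invented, and the 34 Ω driver value is borrowed from the DDR3 DIMM drive-strength example later in this section (34 Ω + 15 Ω = 49 Ω on a 50 Ω line).

```python
def series_resistor(z0, z_source):
    """Series resistance needed so that z_source + RS equals z0 (ohms)."""
    rs = z0 - z_source
    if rs < 0:
        raise ValueError("source impedance already exceeds line impedance")
    return rs


def source_mismatch(z0, z_source, rs):
    """Reflection coefficient seen looking back into the source end."""
    total = z_source + rs
    return (total - z0) / (total + z0)


print(series_resistor(50.0, 34.0))        # ideal RS for a 34-ohm driver
print(source_mismatch(50.0, 34.0, 15.0))  # with the 15-ohm DIMM resistor
```

With the 15 Ω resistor the source end sits about 1% away from a perfect match, which is why the combination is treated as matched in practice.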


Termination Schemes: Parallel

VTT (VCC/2) power supply required

DDR: 1.25V

DDR2: 0.9V

DDR3: 0.75V

Impedance matched at receiver

Good for unidirectional signals (command, address, etc.) or at the receiver of a bidirectional signal (DQ, DQS, etc.)

[Diagram: a driver (FPGA or memory) with source impedance ZS drives a Z0 trace to a receiver (FPGA or memory), where a parallel resistor ZP is tied to VTT; ZP = ZS = Z0]
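The VTT values listed on this slide follow from VTT = VCC/2. The short sketch below reproduces them, assuming the nominal I/O supply voltages of 2.5 V for DDR, 1.8 V for DDR2, and 1.5 V for DDR3; those supply values are not stated on the slide and are supplied here as an assumption.

```python
# Assumed nominal supply voltages per memory standard (volts).
NOMINAL_VCC = {"DDR": 2.5, "DDR2": 1.8, "DDR3": 1.5}


def vtt(vcc):
    """Parallel termination voltage: half of the I/O supply voltage."""
    return vcc / 2.0


for standard, vcc in NOMINAL_VCC.items():
    print(standard, vtt(vcc))
```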

FPGA and Memory Termination Settings

Reduce (or remove) external termination components using settings on FPGA and external memory

FPGA

Series and Parallel On-Chip Termination (OCT)

Dynamic OCT

DDR3 read/write leveling

See Appendix for additional features

External memory

Normal and reduced drive strength

On-Die Termination (ODT)

Dynamic ODT


Quartus II Software Support for OCT

Assign termination scheme in the Pin Planner

Select series or parallel with or without calibration


OCT with Calibration

Output (series) or input (parallel) buffer impedance automatically matched to external ±1% resistors at end of device configuration

Stratix III & IV devices: RUP & RDN for 25 Ω or 50 Ω settings

Stratix V devices: single RZQ for a range of values

Requires OCT calibration block resource

[Diagram: external reference resistors: RUP connected to VCCIO, and RZQ or RDN]

Stratix V OCT Calibration Blocks

4 OCT calibration blocks in the corners of a device

Each calibration block can connect to any sub-bank on any side of the device

Each sub-bank can connect to only one OCT calibration block

OCT calibration blocks can be shared by multiple interfaces (discussed later) provided that they have the same

Series/parallel termination settings

Sub-bank voltage

Dynamic OCT

Dynamically turn on or off series or parallel OCT

Power savings

Correctly matched impedances for reads or writes

[Diagram: Write (Class I): Stratix III / IV / V with series OCT on and parallel OCT off drives the Z0 line to the memory. Read (Class I): the memory drives the Z0 line to the Stratix III / IV / V with series OCT off and parallel OCT on (100 Ω / 100 Ω).]

Recall: DDR3 Leveling

DDR3 clock signals routed in daisy-chain fly-by topology on DIMM

Discrete memory components can be placed and routed on PCB to support leveling

Improves signal integrity on high fan-out clocks

Other signals still point-to-point

Stratix III / IV / V devices have special leveling circuitry

Automatically accounts for delays and phase adjustments

Aligns all signals on writes and reads

Required for DDR3 operating at 240 MHz or higher


Enabling Leveling

Automatically enabled based on frequency and geometry

Leveling on above 240 MHz

Leveling required for DDR3 DIMMs

Disable leveling for DDR2/3 star topologies


External Memory: Output Drive Strength

Calibrated w/ VT through external precision resistor

DIMM: 34 Ω; 34 + 15 = 49 Ω

Discrete: reduced 40 Ω

Set in memory's mode register MR1

[Diagram: DDR3 DIMM, normal drive: Zmem + RS = Z0 = ZS]

[Diagram: DDR3 component, reduced drive: Zmem ≈ Z0 = ZS]

External Memory: On-Die Termination

Enabled or disabled parallel termination with no external components

Dedicated ODT input pin on memory enabled by controller during writes

20, 30, 40, 60, 120 Ω calibrated nominal settings

Based on divided down 240 Ω RZQ resistor connected to memory

Dynamic ODT: automatically switch between a normal ODT setting for reads and a different setting for writes

Good for multi-DIMM configurations to reduce jitter and minimize reflections

[Diagram: FPGA with source impedance ZS drives a Z0 trace to the DDR3 memory, where ODT provides parallel termination ZP to VCC/2]
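The nominal ODT values are integer divisions of the 240 Ω RZQ reference. The sketch below lists them; the divisor set (12, 8, 6, 4, 2) is inferred from the five values on the slide together with the RZQ/4 and RZQ/6 labels used in the recommended settings, so treat it as an assumption to check against the memory datasheet.

```python
RZQ_OHMS = 240.0  # external reference resistor on the memory


def odt_settings(rzq=RZQ_OHMS, divisors=(12, 8, 6, 4, 2)):
    """Map each divider label to its nominal termination value in ohms."""
    return {f"RZQ/{d}": rzq / d for d in divisors}


for label, ohms in odt_settings().items():
    print(label, ohms)
```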


Recommendations

Many options and settings!

Need to find the best recommendations for features available in FPGA and memory

Try to get rid of components where possible to simplify board placement and routing

Simulate, simulate, simulate!


Recommended Settings: 50 Ω trace

Single-rank unregistered DIMM (UDIMM)

Signal type | FPGA side (1) (SSTL-15) | Memory side for writes (2) | Memory drive strength for reads (2)

DQ | Input & Output: calibrated 50 Ω | 60 Ω ODT (RZQ/4) | 40 Ω (RZQ/6)

DQS | Input & Output: calibrated differential 50 Ω | 60 Ω ODT (RZQ/4) | 40 Ω (RZQ/6)

DM | Output only: calibrated 50 Ω | 60 Ω ODT (RZQ/4) | 40 Ω (RZQ/6)

Address/command | Maximum drive strength | 39 Ω on-board fly-by termination

CK/CK# | Calibrated differential 50 Ω | 72 Ω differential on-board fly-by termination plus compensation capacitors

(1): set by <variation_name>_pin_assignments.tcl

(2): set in MegaWizard Plug-In Manager


Test Your Knowledge: I/O and Termination

1. What View option in the Pin Planner makes it easy to place external memory signals on the optimized pins of a device?

A. Show DQ/DQS Pins in x8/x9 mode or however wide your DQS byte lanes should be

2. How are the two types of OCT set on reads and writes when dynamic OCT is enabled on bidirectional signals?

A. Write: series OCT on, parallel OCT off; read: series OCT off, parallel OCT on


Section 4 Resources

User Guides External Memory Interfaces Handbook (Volume 2, Section I)

Quartus II Software Handbook

Device Handbooks Cyclone III, Cyclone IV, & Cyclone V FPGAs

Arria GX, Arria II GX/GZ, & Arria V FPGAs

Stratix III, Stratix IV, & Stratix V FPGAs

Application notes AN465: Implementing OCT Calibration in Stratix III Devices

AN476: Impact of I/O Settings on Signal Integrity in Stratix III Devices


Please go to Exercise 3

Completing the controller

Implementing, Simulating, & Debugging External Memory Interfaces

Section 5: Timing Analysis

Recommended Memory Interface Design Flow

[Flowchart] Start design, then: Select device; Create and parameterize memory interface; Instantiate PHY & controller (example or custom design); Perform functional simulation (optional, but recommended); Add constraints (I/O, timing, etc.) and compile; Verify timing; Board (PCB) related tasks (layout, simulation, termination, drive strength settings, etc.); Verify functionality & SI on board; Design complete.

Decision points: Expected results? (no: Debug design); Meets timing and performance? (no: Adjust constraints); Works correctly? (no: Debug design)

Agenda

Timing analysis methodology Timing components

Timing paths

Timing constraints and report files

Timing analysis description

Timing margin report

Timing closure Common issues

Optimizing timing


Timing Analysis Methodology

Meeting timing requirements is challenging

Simplified implementation through Physical layer interface IPs

Numerous device features

Supported through TimeQuest timing analyzer


Timing Components

Source-synchronous timing paths

Calibrated timing paths

Internal FPGA timing paths

FPGA timing parameters


Source-Synchronous Paths

Clock and data pass from transmitting device

Example: FPGA-to-memory write datapath Adjust phase of DQS to center clock within data valid window
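To make "centering" concrete: a phase shift corresponds to a time delay of (phase / 360°) × clock period. The sketch below computes that delay; the 800 MHz clock and the 90° shift are illustrative assumptions (90° being the usual nominal value for center-aligning a strobe in a double data rate interface), not figures taken from this slide.

```python
def phase_delay_ps(clock_mhz, phase_degrees):
    """Delay in picoseconds equivalent to a phase shift of the given clock."""
    period_ps = 1.0e6 / clock_mhz  # 1 MHz corresponds to a 1,000,000 ps period
    return period_ps * phase_degrees / 360.0


# Assumed example: 800 MHz memory clock, DQS shifted by 90 degrees.
print(phase_delay_ps(800.0, 90.0))
```

At double data rate each bit lasts half a clock period, so a quarter-period shift places the strobe edge in the middle of the ideal data valid window.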


Calibrated Paths

Data capture clocks dynamically positioned at power up

Reads

Sequencer analyzes path delays between read capture and read FIFO buffer

Sets up FIFO write clock phase for optimal timing

Read postamble calibration done similarly

Read data valid signal calibrated to delay between read command issued and data received

Writes

Write-leveling and programmable output delay chains align DQS with CK at memory

Both

Dynamic deskew adjusts delay of each DQ/DQS to center-align DQ with associated DQS


Internal FPGA Timing Paths

Impact memory interface timing

Common to all FPGA designs

Standard timing constraints required, i.e. clock constraints

TimeQuest timing analyzer reports these paths


FPGA Timing Parameters

I/O toggle rates vary based on Speed grade

Loading

I/O bank location

Termination

Drive strength

Slew rate

Output clock specifications (from FPGA datasheet)

Clock period jitter

Half-period jitter

Cycle-to-cycle jitter

Skew between FPGA clock outputs


Memory Timing Paths (Stratix III/IV Devices)

[Diagram labels: I/O source synchronous & calibrated; Calibrated; Internal source synchronous (ALTMEMPHY only; bypassed by UniPHY)]

Memory Timing Paths (Arria V, Cyclone V, & Stratix V Devices)

[Diagram label: I/O source synchronous & calibrated]

UniPHY – DDR2/DDR3 Timing Paths

Timing path | Applicable clock(s) | Description

Address and command | pll_addr_cmd_clk, pll_mem_clk | Setup and hold margin for all address and command pins, from FPGA outputs to memory inputs

Clock-to-strobe | pll_addr_cmd_clk | DQS arrival at memory with respect to CK/CK#

Core | pll_afi_clk | Internal timing of the UniPHY IP, between internal core registers

Core recovery/removal | PHY clocks | Internal timing of the asynchronous reset signals to the UniPHY IP

Read capture | DQS_IN | Setup and hold margin for the DQ pins with respect to DQS strobe at the FPGA capture registers

Write datapath | DQS_OUT, pll_write_clk | Setup and hold margin for the DQ and DM pins with respect to DQS strobe at the memory

Read resynch | pll_avl_clk | Synchronizing captured data with read FIFO


UniPHY Files

File (<variation_name>) | Description

.sdc | Clock constraints for PLL inputs; generated clock constraints for PLL outputs; derive clock uncertainty; exceptions (false paths and multi-cycle paths); output delays on address and command outputs; input and output delays on DQ inputs and outputs

_timing.tcl | Includes memory interface and FPGA device parameters

_report_timing.tcl | Main script for reporting timing slacks

_report_timing_core.tcl | Contains high-level procedures for report timing script

_pin_map.tcl | Library of functions and procedures used by other scripts

_parameters.tcl | Defines parameters describing the geometry of the core and PLL configuration (Do not change)

© 2012 Altera Corporation—Confidential

220

Implementing, Simulating, & Debugging External Memory Interfaces

A-MNL-ISDMI-12-0-v1 110

Page 115: Customer Training · 2012-08-13 · Graphical user interface Included in the free IP base suite (Subscription Edition) -rich PLLs & clock management Automatic generated constraints

Timing Analysis Description

Areas that are analyzed by the TimeQuest timing analyzer in a design that includes a memory IP
  Address and command
  Core and core reset
  Read capture
  Write
  Read resynchronization
  Write leveling
  Bus turnaround time

Address and Command

Single data rate signals

Latched by memory using the FPGA output clock

TimeQuest analyzes from set_output_delay constraints (-max and -min)
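The -max and -min values on an output constraint follow the usual source synchronous arithmetic. The generated .sdc file already creates these constraints, so the sketch below only illustrates where such numbers come from. The function name and every numeric value are assumptions for illustration, not datasheet or IP-generated figures.

```python
# Sketch only: derive set_output_delay -max/-min values for one address/command pin.
# All numbers are made-up placeholders; real values come from the memory datasheet
# and the board trace model.

def output_delay_bounds(t_is_ps, t_ih_ps,
                        data_trace_max_ps, data_trace_min_ps,
                        clk_trace_max_ps, clk_trace_min_ps):
    """Return (max_delay, min_delay) in ps relative to the memory clock."""
    # -max: memory setup time plus worst-case data delay relative to the clock
    max_delay = t_is_ps + data_trace_max_ps - clk_trace_min_ps
    # -min: negative hold time plus best-case data delay relative to the clock
    min_delay = -t_ih_ps + data_trace_min_ps - clk_trace_max_ps
    return max_delay, min_delay


max_d, min_d = output_delay_bounds(
    t_is_ps=200, t_ih_ps=275,
    data_trace_max_ps=620, data_trace_min_ps=580,
    clk_trace_max_ps=610, clk_trace_min_ps=590,
)
print(f"-max {max_d / 1000:.3f} ns, -min {min_d / 1000:.3f} ns")
```

A larger mismatch between address/command and clock trace delays widens the gap between the two values and reduces the margin left for the FPGA output.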

Core and Core Reset

Core analysis
  All internal core paths in the FPGA fabric

Core reset analysis
  Recovery/removal analysis of asynchronous reset signals to UniPHY

Read Capture

Timing analysis indicates slack for DQ signals
  Signals are latched by DQS

TimeQuest analyzes timing using
  set_input_delay (-max and -min)
  set_max_delay, set_min_delay

Base analysis from before calibration

Emulation of calibration and timing margins after calibration (Arria II, Cyclone IV, Stratix IV and V devices only)

See the “Analyzing Timing of Memory IP” chapter in the EMIF handbook (Volume 2, Section I, chapter 10) for details on calibration emulation
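A first-order, pre-calibration view of read capture margin can be worked out by hand. The sketch below is a simplified textbook model, not the calculation TimeQuest or the calibration emulation performs. It assumes DQ leaves the memory edge-aligned with DQS and that the FPGA shifts DQS by a fixed phase. Parameter names and numbers are illustrative assumptions.

```python
# Simplified model (an assumption, not the tool's algorithm): DQ is edge-aligned
# with DQS at the memory, and the FPGA delays DQS by dqs_shift_deg before capture.

def read_capture_margins(t_ck_ps, t_dqsq_ps, t_qhs_ps, dqs_shift_deg,
                         board_skew_ps, fpga_setup_ps, fpga_hold_ps):
    """Return (setup_margin, hold_margin) in ps for one DQ pin."""
    t_hp = t_ck_ps / 2                      # half period of the memory clock
    t_qh = t_hp - t_qhs_ps                  # DQ hold time after the DQS edge
    shift_ps = t_ck_ps * dqs_shift_deg / 360
    setup = shift_ps - t_dqsq_ps - board_skew_ps - fpga_setup_ps
    hold = t_qh - shift_ps - board_skew_ps - fpga_hold_ps
    return setup, hold


# Placeholder values: 3750 ps clock period, 90 degree DQS shift, 20 ps board skew
setup_m, hold_m = read_capture_margins(3750, 300, 400, 90, 20, 100, 100)
print(setup_m, hold_m)
```

In this model, moving the DQS phase shift trades setup margin against hold margin, and board skew reduces both.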

Write

Timing analysis indicates slack on DQ signals

Latched by memory using DQS strobe output from FPGA

TimeQuest analyzes timing using set_output_delay (-max and -min)

Analyzes base timing and after calibration

Read Resynchronization

UniPHY implements FIFO buffer (sequencer)
  Synchronizes data transfer from the data capture to the core

Calibration process sets the depth of the FIFO buffer

No dedicated synchronization clock is required

Write Leveling DDR2/3

tDQSS memory timing parameter
  Calibrated path details skew margin for the arrival of DQS/DQS# rising edge with respect to CK/CK# rising edge at memory

tDSS/tDSH memory timing parameters
  Setup and hold skew margin for arrival of DQS/DQS# falling edge to CK/CK# rising edge at memory
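The three parameters can be related to a single number, the DQS-to-CK skew at the memory. The sketch below assumes a 50% duty cycle strobe. The default limits, given as fractions of tCK, are placeholders: take the real limits from the memory datasheet. The function and parameter names are hypothetical.

```python
# Sketch: margins for tDQSS, tDSS and tDSH from the DQS-to-CK skew at the memory.
# Default limits (fractions of tCK) are placeholders, not datasheet values.

def write_leveling_margins(t_ck_ps, dqs_to_ck_skew_ps,
                           t_dqss_tck=0.25, t_dss_tck=0.2, t_dsh_tck=0.2):
    """Positive skew means the DQS rising edge arrives after the CK rising edge."""
    # tDQSS: DQS rising edge must land within a window around the CK rising edge
    dqss_margin = t_dqss_tck * t_ck_ps - abs(dqs_to_ck_skew_ps)
    # With a 50% duty cycle the DQS falling edge is half a period after its rise
    falling_edge = dqs_to_ck_skew_ps + t_ck_ps / 2
    # tDSS: DQS falling edge setup to the next CK rising edge
    dss_margin = (t_ck_ps - falling_edge) - t_dss_tck * t_ck_ps
    # tDSH: DQS falling edge hold from the previous CK rising edge
    dsh_margin = falling_edge - t_dsh_tck * t_ck_ps
    return dqss_margin, dss_margin, dsh_margin


margins = write_leveling_margins(t_ck_ps=2000, dqs_to_ck_skew_ps=100)
print(margins)
```

A positive skew eats into the tDSS margin and adds to the tDSH margin; a negative skew does the opposite.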

Bus Turnaround Time

Analyzes margin from when bus switches from writing to reading

Prevents possible data bus contention failures

Stratix IV and V devices only

Timing Margin Report

Report DDR task in TimeQuest
  Analyzes all external memory interfaces

Run <variation_name>_report_timing.tcl
  Analyzes only this particular interface

Reports timing slacks on specific paths mentioned
  Read capture
  Read resynchronization
  Address and command
  Core
  Core recovery and removal
  Write
  Write leveling

Timing Report Summary

Quickly check for any failures in any of the critical timing areas
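Checking the summary amounts to looking for a negative setup or hold slack in any row. The sketch below uses a hand-typed dictionary with made-up slack values. It does not parse real TimeQuest output, and the data layout is an assumption chosen for illustration.

```python
# Sketch: flag failing rows in a timing summary. The dictionary is hypothetical
# example data, not the format produced by <variation_name>_report_timing.tcl.

def failing_paths(slacks_ns):
    """Return names of paths whose setup or hold slack is negative."""
    return [name for name, (setup, hold) in slacks_ns.items()
            if setup < 0 or hold < 0]


example = {
    "Address and command": (0.412, 0.388),
    "Core": (1.105, 0.210),
    "Core recovery/removal": (0.950, 0.301),
    "Read capture": (0.087, -0.015),
    "Write": (0.233, 0.241),
}
print(failing_paths(example))
```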

General Recommendations

Accurately enter parameter settings and timing values, accounting for unit differences between datasheet and MegaWizard Plug-In Manager

Remember to derate timing parameters based on slew rates obtained through simulation

Create board trace models and enter board information in MegaWizard Plug-In Manager

Follow recommendations discussed today
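Unit mismatches are a common source of entry errors: a datasheet may quote a parameter in ns or as a fraction of tCK while the parameter editor expects ps. The helpers below are a minimal sketch of those conversions. The function names and sample numbers are hypothetical.

```python
# Sketch: convert datasheet timing values to picoseconds before entering them.

def tck_fraction_to_ps(fraction_of_tck, mem_clk_mhz):
    """Convert a value given in units of tCK to ps at the chosen memory clock."""
    t_ck_ps = 1e6 / mem_clk_mhz          # clock period in ps
    return fraction_of_tck * t_ck_ps


def ns_to_ps(value_ns):
    """Convert nanoseconds to picoseconds."""
    return value_ns * 1000.0


# Example with placeholder numbers: 0.25 tCK at a 400 MHz memory clock
print(tck_fraction_to_ps(0.25, 400), ns_to_ps(0.2))
```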

Common Issues and Solutions (1)

Missing timing margin report
  Ensure .sdc file is attached to Quartus project
    Done automatically by UniPHY
  PHY should not be the top-level project entity

Incomplete timing margin report
  Check if memory interface pins are optimized away
  Ensure memory pins are connected at the top-level of the FPGA design

Read capture timing failures (Stratix III/IV devices)
  DQS phase shift selected is not optimal
  Board skew is too large
    Make sure board skew parameters are set correctly
    Default mismatch is 20 ps

Common Issues and Solutions (2)

Write timing
  Negative margins reported if PLL phase shift is not optimal
  Adjust PLL phase shift on the write clock
    Edit clock c2 in ALTPLL MegaWizard Plug-In Manager for <variation_name>_pll_memphy.v
    Regenerating memory IP will overwrite this!

PHY reset recovery and removal
  PHY reset signals should not be globals
    Global Signal assignment set to Off (should be set by pin_placement.tcl)
  Manually adjust logic placement (last resort)

Address and Command Timing Solutions

Change the PLL phase shift used to generate these signals
  PHY settings tab: Additional CK/CK# phase and Additional Address and Command clock phase

Ensure board trace model is accurately represented
  Especially far-end load and trace delay differences between address/command and memory clock I/O

Make sure PLL phase shift is not negated by Fitter delay chain adjustment
  D5 Delay assignment set to 0
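When a report shows how far a path misses timing, it helps to translate that time into a phase value, since the PHY settings are entered in degrees. The sketch below shows the conversion in both directions. The function names and the 400 MHz figure are assumptions, and the set of phase steps a given PLL can actually produce is not modeled.

```python
# Sketch: convert between a time shift and a PLL phase setting in degrees.

def ps_to_phase_deg(shift_ps, clk_mhz):
    """Phase (degrees) that moves a clock edge by shift_ps at clk_mhz."""
    period_ps = 1e6 / clk_mhz
    return 360.0 * shift_ps / period_ps


def phase_deg_to_ps(phase_deg, clk_mhz):
    """Time shift (ps) produced by phase_deg at clk_mhz."""
    period_ps = 1e6 / clk_mhz
    return period_ps * phase_deg / 360.0


# Placeholder example: a 312.5 ps adjustment on a 400 MHz clock
print(ps_to_phase_deg(312.5, 400), phase_deg_to_ps(90, 400))
```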

Optimizing Timing

Quartus II optimization settings
  Optimization Technique to Speed

Physical synthesis optimizations
  Effort level to Extra (will greatly increase compile time)

Use Design Space Explorer (DSE) if necessary to sweep settings

Test Your Knowledge: Timing Analysis

1. What four components must be analyzed to perform a full timing analysis on an external memory interface?

A. Source synchronous timing paths, calibration timing paths, internal paths, timing parameters

2. What is the best thing you can do to make sure that an external memory interface meets timing?

A. Make sure that all parameter settings, especially the memory timing settings, are correct

3. What adjustments can you try making if the design does not meet timing but without regenerating the interface?

A. Depending on the type of failure: PLL write clock phase shift, global signal assignments, optimization settings

Section 5 Resources

User Guides
  External Memory Interfaces Handbook (Volume 2, Section I)
  Quartus II Software Handbook

Device Handbooks
  Cyclone III, Cyclone IV, & Cyclone V FPGAs
  Arria GX, Arria II GX/GZ, & Arria V FPGAs
  Stratix III, Stratix IV, & Stratix V FPGAs

Please go to Exercise 4

Perform timing analysis on the memory interface and test in hardware

Section 6: Final Topics

Agenda

DDR2/3 Controllers with UniPHY EMIF Toolkit

Using High Performance DDR Memory Controllers with Nios II and Qsys

Multiple controllers in single FPGA

DDR2/3 Controllers with UniPHY EMIF Toolkit

DDR2/3 Controllers with UniPHY Toolkit

GUI and Tcl task/report interface, similar to TimeQuest

Report on calibration status and selected settings

Margining activities

Generate and save reports on calibration and margins

Enabling Communication via CSR Port

Controller Settings tab

Diagnostics tab

EMIF Toolkit

[Figure: block diagram. Labels: Development system, JTAG, JTAG Avalon Master, CSR port, FPGA, HPC II Controller, AFI, UniPHY, Memory]

Launching the EMIF Toolkit

1. Program the device

2. Quartus II Tools menu > External Memory Toolkit

3. Link Project to Device task

4. Select the project’s JTAG debugging information (.jdi) file

5. Create connections to the memory interface and/or the efficiency monitor

6. Generate and view calibration and margining reports

See the optional steps at the end of Lab 4 for details on using the EMIF Toolkit

Memory Interfaces with a Nios II Processor and Qsys

Accessing Memory from Nios II Processor

Use Qsys to build system
  Supports DDR2/DDR3 SDRAM HPC plus QDR II/II+ and RLDRAM, all with UniPHY

Controller features Avalon interface

Stand-alone PHY not supported
  Needs a memory controller or driver

Add SDC file to Quartus II project and run Tcl scripts manually

System performance is limited by
  Nios II processor performance
  Number of peripherals connected to interconnect

Qsys Systems

Tools menu > Qsys
  Create new Qsys system inside Quartus II project

Add all microcontroller peripherals required in system
  Nios II processor, communications core/s, PLL, DMA controller, memory IP, etc.
  Same MegaWizard utility used as described previously
  Remember to enable Generate power-of-2 data bus widths on Controller Settings tab

Note: Nios II processor and DMA controllers can both initiate burst transactions with the memory
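The effect of the power-of-2 option can be estimated before generating the IP. The sketch below assumes that the local data width is four times the DQ width at half rate and two times at full rate, and that the option rounds this width down to the nearest power of two. Both assumptions should be confirmed in the EMIF handbook. The function name is hypothetical.

```python
# Sketch under stated assumptions: estimate the Avalon-side data width.

def avalon_data_width(mem_dq_width, half_rate=True, power_of_2=True):
    """Return the estimated local (Avalon) data bus width in bits."""
    width = mem_dq_width * (4 if half_rate else 2)
    if power_of_2:
        # Round down to the nearest power of two
        width = 1 << (width.bit_length() - 1)
    return width


# A 72-bit interface (64 data + 8 ECC bits) at half rate
print(avalon_data_width(72), avalon_data_width(72, power_of_2=False))
```

Under these assumptions a 72-bit half-rate interface presents 256 bits rather than 288 to the interconnect, so the extra DQ bits are not reachable through the Avalon port.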

Memory IP in Qsys

DDR interfaces

Models of external memory for simulation with Qsys system

Configurable traffic generator

Example System in Qsys

PLL included in memIP component; clock other components with it, if possible

Multiple Memory Controllers in a Single FPGA

Multiple Memory Interfaces

Save cost and board space, and reduce partitioning complexity

Independent read and write transactions on each interface

Unique address/command bus for each interface

If they do not contend for device elements, treat them as independent modules
  Otherwise, share resources if the controllers operate at the same frequency and are all half-rate or all full-rate

PLL, DLL, and OCT block can be shared

Creating Multiple Memory Controllers

Run through the MegaWizard flow once for each interface

Evaluate device resource availability
  Are there sufficient pins?
    Interfaces cannot share pins
  Need to share DLL?
    Interfaces must operate at same frequency
    DLL sharing possible across same side of some devices
  Need to share PLL and/or clock network?
    Interfaces must operate at same frequency
  Need to share OCT block?
    RUP & RDN or RZQ must connect to block and be in I/O bank with same voltage as the memory interface

Sharing of device resources requires RTL modifications

Example design available at: http://www.altera.com/support/examples/verilog/ver-stratix-v-multiple-ddr3-uniphy.html

DLLs in Stratix FPGAs

Support any number of interfaces running at same frequency, limited by pin, PLL, clock, and logic resources

4 located at corners of device
  Can shift DQS pins connected to adjacent sides of FPGA

Maximum of 4 unique frequencies
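Because each DLL serves one frequency, a quick feasibility check is to count the distinct interface frequencies in a planned design. The sketch below does only that. It ignores which device sides each DLL can reach, and the frequencies are placeholder values.

```python
# Sketch: check a list of planned interface frequencies against the DLL count.
# Placement (which sides each corner DLL can serve) is not modeled.

def dll_plan(interface_freqs_mhz, dll_count=4):
    """Return (sorted unique frequencies, True if they fit in dll_count DLLs)."""
    unique = sorted(set(interface_freqs_mhz))
    return unique, len(unique) <= dll_count


print(dll_plan([533, 533, 400, 300, 300]))
```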

PLL/DLL/OCT Sharing

PLL/DLL/OCT block instantiated at same level as PHY to ease sharing

Clocks from Master instance drive master instance and signals are exported

Clock and OCT inputs exposed and must be connected when set to Slave

Set to No sharing, Master, or Slave

Requirements for OCT/PLL/DLL Sharing

Same type of memory
  DDR3, QDRII, etc.

Same internal clock rate
  Full rate, half rate, quarter rate

Same interface clock rate
  533 MHz, for example

Same PLL input clock rate
  100 MHz, for example

Same clock phase requirements
  Setting on PHY Settings tab (previous slide)
  Additional core-to-periphery phase of 45°, for example
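The requirements above reduce to a field-by-field comparison of two interface configurations. The sketch below shows one way to express that check. The class, its field names, and the sample values are hypothetical and do not correspond to any file the IP generates.

```python
# Sketch: compare two planned interfaces against the sharing requirements.
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceConfig:
    memory_type: str          # e.g. "DDR3"
    rate: str                 # "full", "half" or "quarter"
    mem_clk_mhz: float        # interface clock rate
    pll_ref_clk_mhz: float    # PLL input clock rate
    extra_phase_deg: float    # additional clock phase setting


def sharing_mismatches(master, slave):
    """Return the names of fields that differ; an empty list means compatible."""
    fields = ("memory_type", "rate", "mem_clk_mhz",
              "pll_ref_clk_mhz", "extra_phase_deg")
    return [f for f in fields if getattr(master, f) != getattr(slave, f)]


a = InterfaceConfig("DDR3", "half", 533.0, 100.0, 45.0)
b = InterfaceConfig("DDR3", "half", 400.0, 100.0, 45.0)
print(sharing_mismatches(a, b))
```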

Example of Full Resource Sharing

[Figure: block diagram of a Master Controller and a Slave Controller, each with a Traffic Generator and Local IF, connected to Memory 1 and Memory 2. Shared blocks and signals: OCT (oct_rup, oct_rdn, oct_ctl_rs_value[13:0], oct_ctl_rt_value[13:0]); DLL (dll_delayctrl[5:0]); PLL (pll_ref_clk, afi_clk, afi_half_clk, afi_reset_n, pll_mem_clk, pll_write_clk, pll_addr_cmd_clk)]

Number of PLL Outputs by Device

Device family | Number of Enhanced PLL clock outputs | Number of dedicated clock outputs
Stratix III | Left/right: 7 clock outputs; Top/bottom: 10 clock outputs | Left/right: 2 single-ended or 1 differential pair; Top/bottom: 6 single-ended or 4 single-ended and 1 differential pair
Stratix IV | Left/right: 7 clock outputs; Top/bottom: 10 clock outputs | Left/right: 2 single-ended or 1 differential pair; Top/bottom: 6 single-ended or 4 single-ended and 1 differential pair
Stratix V | 18 clock outputs each | 4 single-ended or 2 single-ended and 1 differential pair

Global Resources Available and DDR3 Half-Rate Clocking Requirements

Device family | Global clock network | Regional clock network
Stratix III | 16 | 64-88
Stratix IV | 16 | 64-88
Stratix V | 16 | 92

Device family | # of full-rate clocks | # of half-rate clocks
Stratix III | 3 global | 1 global, 1 regional
Stratix IV | 3 global | 1 global, 1 regional
Stratix V | 1 global, 2 regional | 2 global

Stratix V Multiple Interface Guidelines

Pins from multiple interfaces cannot be placed in the same I/O sub-bank
  Pins in an interface all require access to the same leveling block
  OK if sharing resources (PLL, DLL, OCT)

Multiple interfaces cannot share the same PLL input reference clock
  Would force same PLL (for a single PHY clock tree) to be used for both interfaces
  Interfaces can’t share a single PHY clock tree
  OK if sharing PLL/DLL

Multi-Controller Considerations

Independent controllers (no resource sharing)
  Follow the design flow for each instance of the controller
  I/O constraints Tcl file uses generic top-level pin names (mem_dq, mem_dqs, etc.)
    Edit the memory interface pin names to match your top-level
    Or import .ppf file into Pin Planner to add unique prefix names to each instance

Additional considerations when sharing resources between multiple controllers
  Edit timing.tcl script for slave interface(s) to use master PLL clocks
  Edit report_timing_core.tcl script for slave interface(s) to allow master interface to adjust DQ/DQS delay chains
  See EMIF Handbook for details on editing these files
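Renaming the generic pin names is a mechanical text substitution, so it can be scripted when there are several instances. The sketch below prefixes every mem_* name in a list of assignment lines. The prefix, the sample line, and the helper name are illustrative assumptions, and the script does not touch any generated file.

```python
# Sketch: add a per-instance prefix to generic mem_* pin names in assignment text.
import re

# Match "mem_" only at the start of a name, so already-prefixed names are skipped
GENERIC_PIN = re.compile(r"\bmem_")


def prefix_pin_names(lines, prefix):
    """Return the lines with every generic mem_* name renamed to <prefix>_mem_*."""
    return [GENERIC_PIN.sub(f"{prefix}_mem_", line) for line in lines]


sample = ['set_instance_assignment -name IO_STANDARD "SSTL-15 CLASS I" -to mem_dq[0]']
renamed = prefix_pin_names(sample, "if0")
print(renamed[0])
```

Running the helper a second time on its own output leaves the names unchanged, because the underscore before mem_ prevents a second match.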

Efficiently Fitting Memory Interfaces

Example showing 2 possible fits for

• Two 72 bit DDR3 interfaces (different sizes) and

• Four 36 bit (18R / 18W) QDRII+ interfaces (different sizes)

[Figure: two Stratix III floorplans showing the number of I/O per bank (24, 32, 40, or 48) with the interfaces placed. Callouts: Two 114 pin 72 bit DDR3 implementations; One 159 pin 72 bit DDR3 implementation; One 66 pin 36 bit QDRII+ implementation; Four 64 pin 36 bit QDRII implementation]

Test Your Knowledge: Final Topics

1. What does the EMIF toolkit allow you to do?

A. Analyze how the interface was calibrated by the sequencer; view timing margin on all DQ paths

2. What does implementing the memory interface as a Qsys system component allow you to do?

A. Use a Nios II processor or any other Avalon master component to control the interface.

3. How does enabling master or slave PLL/DLL/OCT sharing affect the generation of the controller?

A. Master: outputs of resources exposed to top-level of controller; slave: inputs exposed and must be connected to resource output from master

Section 6 Resources

User Guides
  External Memory Interfaces Handbook (Volume 5, Section 2)
    Chapter 2: Implementing Multiple Memory Interfaces Using UniPHY
    Chapter 3: DDR3 SDRAM Controller with UniPHY Using Qsys
  Quartus II Software Handbook

Device Handbooks
  Cyclone III, Cyclone IV, & Cyclone V FPGAs
  Arria GX, Arria II GX/GZ, & Arria V FPGAs
  Stratix III, Stratix IV, & Stratix V FPGAs

Conclusions

Summary

Discussed Altera DDR3 memory interface IP
  How to find, parameterize, and instantiate a High Performance DDR3 memory controller in a Quartus II project

Listed the steps required to simulate a system

Pointed out what type of termination schemes to use

Walked through static timing analysis in the TimeQuest timing analyzer and techniques for solving common timing problems

Discussed using the High Performance Memory controllers inside a Qsys system

Highlighted how you can implement multiple memory controllers in a single FPGA

References

Resources

Memory Resource Center
  http://www.altera.com/technology/memory/mem-index.jsp

User Guides
  External Memory Interface Handbook
  http://www.altera.com/literature/lit-external-memory-interface.jsp

Device Handbooks
  Cyclone III, Cyclone IV, & Cyclone V FPGAs
  Arria GX, Arria II GX/GZ, & Arria V FPGAs
  Stratix III, Stratix IV, & Stratix V FPGAs

Learn More Through Technical Training

Instructor-Led Training
  With Altera's instructor-led training courses, you can:
    Learn from an experienced Altera technical training engineer (instructor)
    Complete hands-on exercises with guidance from an Altera instructor
    Ask questions and receive real-time answers from an Altera instructor
  Each instructor-led class is one or two days in length (8 working hours per day)

Virtual Classroom Training
  With Altera's virtual classroom training:
    Get the best of both worlds!
    All the benefits of a live, instructor-led training class from the comfort of your home or office

Online Training
  With Altera's online training courses, you can:
    Take a course at any time that is convenient for you
    Take a course from the comfort of your home or office (no need to travel as with instructor-led courses)
  Each online course takes approximately one to three hours to complete

http://www.altera.com/training

View training class schedule and register for a class

Altera Technical Support

Reference Quartus II software on-line help

Quartus II Handbook

World-wide web: http://www.altera.com
  Search for answers to problems with Knowledge Database
  Download literature
  View design examples
  View online trainings

MySupport: http://www.altera.com/mysupport

Field applications engineers: contact your local Altera sales office

Altera Wiki: www.alterawiki.com

Altera Forum: www.alteraforum.com

Intellectual Property Support
  http://www.altera.com/support/ip/ips-index.html

Instructor-Led and Virtual Training Curriculum

[Figure: curriculum chart grouping classes into Foundation Classes, Advanced Follow-On Classes, Specialized Classes, and Future Classes, with a marker for classes available as a virtual class. Classes shown include: Quartus II Software: Foundation; Introduction to VHDL; Introduction to Verilog; Introduction to Qsys; Quartus II Software: Timing Analysis; Optimization Using Incremental Compilation; Advanced VHDL; Advanced Verilog; Advanced Qsys; Advanced Timing Analysis; Timing Closure; External Memory Interfaces; Video Design Framework Workshop; Developing SW for the Nios II Processor (2 days); DSP Builder Advanced Blockset; DSP Builder Standard Blockset; Building Gigabit Interfaces in Altera Transceivers; Designing w/ ARM based SoC; System Verilog; Scripting; ModelSim; Designing w/ OpenCL; System Console; Partial Reconfiguration; Getting Started w/ PCIe]

http://www.altera.com/training

Thank You

© 2012 Altera Corporation—Confidential ALTERA, ARRIA, CYCLONE, HARDCOPY, MAX, MEGACORE, NIOS, QUARTUS and STRATIX words and logos are trademarks of Altera Corporation and registered in the U.S. Patent and Trademark Office and in other countries. All other words and logos identified as trademarks or service marks are the property of their respective holders as described at www.altera.com/legal.


Appendix

Appendix Table of Contents

Basics of source synchronous interfaces
Example applications
Older device resources for memory IP
HPCII, UniPHY, and sequencer architecture
ALTMEMPHY parameterization
Details on manual timing derating
Detailed hierarchy of example design
DDR/2 latencies
11.0 simulation with NativeLink and/or ALTMEMPHY
Simulation of initialization and calibration
Signal integrity analysis
More information on termination
Other FPGA features for signal integrity
ALTMEMPHY timing paths
Multiple interfaces with ALTMEMPHY
SOPC Builder performance considerations

Source Synchronous Interfaces

Strobe or clock signal is sent from the driver chip, not from a separate clock source
Target device uses the transmitted clock to sample incoming data
Data & clock are routed identically to maintain their phase relationship at the destination device
Example shown: driver shifts the clock to meet receiver timing

[Figure: driver sends data and a delayed clock to the receiver]

Source Synchronous Clocking Schemes

[Figure: timing diagrams of edge-aligned and center-aligned clocking for SDR and DDR]

Source Synchronous Benefits

Higher maximum bus speed than common clock systems
Common clock method is limited by the absolute delay of the data signal
Source synchronous method is limited by the delay difference (skew) between data and clock
  No theoretical frequency limit
  Practical limits include I/O edge rates, SSN, signal integrity, and minimum pulse widths
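As a rough illustration of the two limits above, the sketch below compares a first-order common-clock bound with a first-order source-synchronous bound. The formulas are simplified assumptions rather than Altera timing models, and every delay value is hypothetical.

```python
def common_clock_fmax_mhz(t_co_ns, t_flight_ns, t_su_ns):
    """Assumed first-order bound: the clock period must cover the
    absolute data delay (clock-to-out + flight time + setup)."""
    return 1000.0 / (t_co_ns + t_flight_ns + t_su_ns)


def source_sync_max_rate_mbps(skew_ns, t_su_ns, t_h_ns):
    """Assumed first-order bound: the bit time must cover the
    data-to-clock skew plus the receiver sampling window."""
    return 1000.0 / (skew_ns + t_su_ns + t_h_ns)


if __name__ == "__main__":
    # Hypothetical delays, chosen only to show the difference in scale.
    print(common_clock_fmax_mhz(3.0, 1.0, 1.0))
    print(source_sync_max_rate_mbps(0.25, 0.375, 0.375))
```

In this simplified view the absolute flight time drops out of the source-synchronous bound, which is why only skew appears in the second function.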

Double-Data Rate (DDR) Interfaces

Not restricted to use with memory systems

Data sent on rising and falling clock edges
Often uses complementary clocking for higher performance

[Figure: timing diagram with complementary clocks clk_p and clk_n; data beats AL AH BL BH CL CH DL DH EL EH form the 1st through 5th words, two beats per word]
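The word-to-beat relationship in the diagram (AL, AH make up the 1st word, BL, BH the 2nd, and so on) can be sketched as below. This is a minimal model, assuming the low half is sent on the rising edge and the high half on the falling edge; the function names and the 8-bit bus width are illustrative only.

```python
def ddr_serialize(words, width=8):
    """Split each 2*width-bit word into two beats, low half first."""
    mask = (1 << width) - 1
    beats = []
    for word in words:
        beats.append(word & mask)             # e.g. AL, assumed rising edge
        beats.append((word >> width) & mask)  # e.g. AH, assumed falling edge
    return beats


def ddr_deserialize(beats, width=8):
    """Recombine consecutive (low, high) beats into full words."""
    return [lo | (hi << width) for lo, hi in zip(beats[0::2], beats[1::2])]


if __name__ == "__main__":
    beats = ddr_serialize([0x1234, 0xABCD])
    print([hex(b) for b in beats])
    print([hex(w) for w in ddr_deserialize(beats)])
```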

Example – Packet Buffering Application

[Figure: Altera FPGA containing SPI 4.2 RX, SPI 4.2 TX, core logic, an RLDRAM II interface, and a QDRII SRAM interface; the RLDRAM II interface connects to 600 Mbps RLDRAM II devices and the QDRII SRAM interface connects to 1 Gbps QDRII SRAM devices]

Example - Embedded Application

[Figure: Altera FPGA containing a Nios II embedded processor, memory controller, memory interface, PCI interface, and DDR2 interface; the DDR2 interface connects to a 533 Mbps DDR2 SDRAM DIMM and the memory interface connects to 600 Mbps RLDRAM II or 1 Gbps QDRII SRAM]

Cyclone III / IV Devices

Cyclone III & IV E devices: dedicated bidirectional DQ/DQS pins on all banks around die
Cyclone IV GX devices: top, bottom, & right sides only
DDR write logic implemented in I/O cells
DDR read logic implemented in FPGA fabric
Up to 4 reconfigurable PLLs for static use during write and dynamic use during read operation
Performance optimized on top and bottom of FPGA

Note: Left and right-hand sides of FPGA support LVDS data I/O. Top and bottom optimized for memory performance

[Figure: Cyclone III device floorplan with I/O banks 1 through 8 and a PLL in each corner]

Arria GX / Arria II GX Devices

Dedicated bidirectional DQ/DQS pins on top and bottom banks
Non-DQS mode (PLL-generated capture clock) support on sides of device
Up to 4 PLLs for static use in read and write operations
DDR read/write logic implemented in I/O cells on top and bottom of FPGA
Can use regular I/O on sides of die for lower performance
DLL used for DQS phase shift in DQS blocks during read
Fast PLLs optimized to support LVDS and SERDES on side (non-CDR)
Built-in CDR circuitry (xcvr. on left-hand side in Arria II GX devices)

[Figure: Arria GX device floorplan with I/O banks, PLLs, DLLs, and transceiver blocks]

Stratix III / IV I/O Memory Interface IP

Hard IP with powerful features
Available behind every DQ pin on all 4 sides
Stratix III IOE: 31 registers vs. older device IOE: 6 registers
DLL-controlled delay chain on a per-bank basis

[Figure: Stratix III I/O element block diagram compared with an older device IOE (input, output, and OE registers only). The Stratix III IOE shows DQ and DQS paths with write leveling, half-rate resync, read calibration, and postamble logic; clocks DQ_CLK, 0°_CLK, and Resync_CLK with phase selects DQ_CLK_phase[8:0], 0°_CLK_phase[8:0], and Resync_CLK_phase[8:0]; divided DQ_CLK and divided 0°_CLK; data ports IOD0OUT to IOD3OUT and CDATA0IN to CDATA3IN; delay elements controlled by DLL or IP]

HPC II Architecture

[Figure: HPC II block diagram. Avalon-ST input interface (to user logic), command generator, timing bank pool, arbiter, rank timer, write data buffer, read data buffer, ECC, AFI interface (to PHY), and configuration & status register (CSR) interface]

HPC II Interfaces

Avalon-ST input interface
  Entry point into controller from user logic
  Communicates with masters requesting data

CSR (Avalon-MM slave) interface
  Provides runtime access to controller configuration and status registers
  Independent status and efficiency monitoring or can be used along with the EMIF debug toolkit (discussed later)

Altera PHY (AFI) interface
  Communicates between the controller and PHY using AFI 3.0 specification
  Single data rate (SDR) interface
  See External Memory Interface Handbook for details

HPC II Architectural Blocks

Command generator
  Accepts commands from streaming interface and ECC logic

Timing bank pool
  Parallel queue that works with arbiter to enable data reordering
  Tracks all incoming requests
  Passes data to arbiter once write data is ready

Arbiter
  Orders requests to be passed to external memory following arbitration rules
  If only one master issuing request, grant request immediately
  If 2 or more masters have outstanding requests:
    Read request granted before write request
    Otherwise, oldest request granted first
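The arbitration rules listed for the arbiter can be sketched as below. This is one reading of the slide, not the controller's actual logic: reads are preferred over writes, and ties are broken by age. The Request class and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    master: str    # hypothetical master name
    is_read: bool
    age: int       # hypothetical: larger means the request has waited longer


def grant(pending):
    """Pick the next request according to the rules on the slide."""
    if not pending:
        return None
    if len(pending) == 1:
        return pending[0]                    # single requester: grant immediately
    reads = [r for r in pending if r.is_read]
    candidates = reads if reads else pending  # reads go before writes
    return max(candidates, key=lambda r: r.age)  # otherwise oldest first


if __name__ == "__main__":
    queue = [Request("m0", False, 9), Request("m1", True, 2), Request("m2", True, 5)]
    print(grant(queue).master)
```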

Other Blocks

Rank timer
  Maintains rank-specific timing in multi-rank (multiple CS signals) memory topologies (typically for DIMMs)
  Limits number of activates in a given timing period
  Manages read-to-write and write-to-read turnaround time
  Manages delay between activating different banks

Write and read data buffers
  Store write data or read data as it passes between the user logic and the PHY/memory interface

ECC
  Encoder and decoder-corrector generates interrupts on errors
  Can detect and fix single-bit errors
  Can detect double-bit errors
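The single-error-correct, double-error-detect behavior described for the ECC block can be illustrated with a small extended Hamming code. This sketch is a generic textbook (8,4) code chosen for brevity; it is an assumption for illustration and does not represent the code width or implementation used by the controller.

```python
def encode(nibble):
    """Encode 4 data bits into 8 bits: index 0 is overall parity,
    indices 1..7 form a Hamming(7,4) codeword."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2             # overall parity bit
    return code


def decode(code):
    """Return (nibble, status); nibble is None for uncorrectable errors."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                 # XOR of set positions locates an error
    parity_error = sum(code) % 2 == 1
    if syndrome == 0 and not parity_error:
        status = "ok"
    elif parity_error:
        code[syndrome] ^= 1                 # single-bit error (0 = parity bit itself)
        status = "corrected"
    else:
        return None, "double-bit error"     # detected but not correctable
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


if __name__ == "__main__":
    word = encode(0xA)
    word[5] ^= 1                            # inject a single-bit error
    print(decode(word))
```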

Address & Command Datapath

Transfers address/command signals from the AFI clock domain to the SDR address/command clock domain
Cycle shifter can shift in full-rate cycles to implement correct write latencies

[Figure: address/command path from the core through a full-rate cycle shifter and DDIO, clocked by afi_clk or addr_cmd_clk, producing full-rate addr/cmd; timing diagram showing afi_clk, addr_cmd_clk (270°), the H0/L0 and H1/L1 address/command values, and mem_clk]

Write Datapath

[Figure: write datapath. afi_wdata_valid, wdata[1:0], and wdata[3:2] pass through HDR-to-SDR and SDR-to-DDR DDIO stages, clocked by phy_write_clk (0° phase), to drive DQ, DQS, and DQSn]

Read Datapath

Read FIFO
  Synchronizes read data with the afi_clk domain and converts SDR to HDR or QDR
VFIFO (valid) and LFIFO (read latency) parameters set by calibration

[Figure: read datapath. DQ is captured using gated DQS (data capture) and passes through the read FIFO, full-rate SDR in and HDR (or QDR) out on afi_clk (half-rate); rdata_en feeds the VFIFO, which produces DQS enable, and the LFIFO, which produces read enable]

Sequencer Architecture

SCC Manager
  Sets delays and phases on I/O through adjustment of dynamic delay chains based on calibration algorithm

RW Manager
  Acts as controller through AFI to send commands to memory: configure memory registers, activate, precharge, refresh, guaranteed writes, write/read bursts, etc.

PHY Manager
  Direct access to PHY to pass on calibration results, such as calibrated FIFO buffer parameters
  Indicates completion and pass/fail status of calibration process

Data Manager
  Stores parameter information for software access

Tracking Manager
  Takes over AFI interface during operation after refresh to track DQS enable signal
  Adjusts as needed due to voltage and temperature changes

ALTMEMPHY: Memory Settings

Settings analogous to UniPHY


ALTMEMPHY: PHY Settings (1)

Stratix devices: improve SI
Newer Stratix devices: fine-tune PLL phase shift to improve address/command timing (240 for dev. boards; typically use 270)
Devices with DLLs: share DLL resource with other core(s)
Arria, Cyclone, and older Stratix devices: fixed 90-degree phase shifts to improve address/command timing
Older Cyclone and Stratix devices: worst-case skew for timing constraints (newer devices use Board Settings)

ALTMEMPHY: PHY Settings (2)

For HardCopy II development
For newer Stratix devices (discussed later)
Reduce simulation time (analogous to UniPHY setting)

ALTMEMPHY Memory Presets

Filters on left filter displayed presets on right

Load custom presets (in XML) from IP’s /lib directory


ALTMEMPHY Preset Editor

White cells are programmable options (analogous to UniPHY Memory Parameters)
Gray cells for creating custom memory presets
Save as a custom preset

Saving Custom Memory Presets

Default save directory is the /lib folder for the IP component
  C:\altera\<ver>\ip\altera\ddr3_high_perf\lib
New memory added automatically to Presets list
Must manually Load Preset if saved in a different directory

Setting Timing Parameters (cont.)

But the Preset Editor uses ns, and the preset values are for 533 MHz!
  533 MHz: 4 x 1.875 ns = 7.5 ns
  400 MHz: 4 x 2.5 ns = 10 ns
Adjust appropriately and save as custom preset
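The arithmetic above can be sketched as a small helper. It assumes the parameter is defined as a fixed number of clock cycles, so its value in ns must be rescaled when the clock period changes; the function names are illustrative only.

```python
def cycles_to_ns(cycles, tck_ns):
    """Convert a cycle-based timing parameter to nanoseconds."""
    return cycles * tck_ns


def rescale_preset_ns(value_ns, preset_tck_ns, target_tck_ns):
    """Rescale a preset value in ns from the preset clock period to the
    target clock period, keeping the cycle count constant."""
    cycles = value_ns / preset_tck_ns
    return cycles * target_tck_ns


if __name__ == "__main__":
    print(cycles_to_ns(4, 1.875))              # 533 MHz preset
    print(cycles_to_ns(4, 2.5))                # 400 MHz target
    print(rescale_preset_ns(7.5, 1.875, 2.5))  # preset value moved to 400 MHz
```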

Manual Derating (Stratix III devices only)

1. Find correct derating values (∆tDS, ∆tDH, ∆tIS, ∆tIH) in component datasheet based on signal slew rates
2. Add derating to base values (Ex: tDS = tDS(base) + ∆tDS)
3. Normalize values to VREF and enter in Board Settings tab
   Datasheets reference to VIH & VIL, not VREF used by Altera

Example: ∆tIS, ∆tIH derating values; select values based on slew rates

See “Derate Memory Setup and Hold Timing” section (Volume 3, Section II, chapter 3) and “Timing Derating Methodology” chapter in the External Memory Interface Handbook for more details and examples
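Step 2 of the manual flow is a per-parameter addition, sketched below. All numbers are hypothetical placeholders rather than datasheet values, and step 3 (normalizing to VREF) is not modeled here.

```python
def derate(base_ps, delta_ps):
    """Apply t = t(base) + delta to each timing parameter (values in ps)."""
    return {name: base_ps[name] + delta_ps[name] for name in base_ps}


if __name__ == "__main__":
    # Hypothetical base values and derating deltas, in picoseconds.
    base = {"tDS": 100, "tDH": 175, "tIS": 200, "tIH": 275}
    delta = {"tDS": 25, "tDH": 12, "tIS": 50, "tIH": 30}
    print(derate(base, delta))
```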

Automatic Derating

1. Enter base values for settings
   Memory Timing tab (UniPHY) or Preset Editor (ALTMEMPHY)
2. Enter slew rate information and/or # of slots/devices for approximation
   Board Settings tab (UniPHY and ALTMEMPHY)

Derated values automatically calculated

[Figure: Board Settings tab screenshots for UniPHY and ALTMEMPHY]

ALTMEMPHY EDA Settings

Altera libraries needed for 3rd-party simulation
Special simulation model; cannot be synthesized
Timing and resource estimation netlist for 3rd-party synthesis tools

ALTMEMPHY Summary

Choose optional files to generate; main variation and Quartus II IP files always generated

MegaWizard Generation Messages

[Figure: generation message windows for UniPHY and ALTMEMPHY]

Custom Controller With ALTMEMPHY (1)


Custom Controller With ALTMEMPHY (2)

Quickly and easily configure physical layer interface

Connect to your own controller or driver and still gain the benefit of the Altera physical interface


Add SDC Files to Project

UniPHY
  <variation_name>.sdc
  <variation_name>_sequencer_cpu.sdc

ALTMEMPHY
  <variation_name>_phy_ddr_timing.sdc
  <variation_name>_example_top.sdc (optional; only if using example design)
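As a convenience, the file names above can be generated from the variation name and turned into Quartus II settings file (.qsf) assignments. This is a sketch only: the helper names are hypothetical, and the generated names simply follow the patterns listed on the slide.

```python
def sdc_files(variation_name, phy, use_example_design=False):
    """Return the SDC file names listed on the slide for the given PHY."""
    if phy == "UniPHY":
        return [f"{variation_name}.sdc",
                f"{variation_name}_sequencer_cpu.sdc"]
    if phy == "ALTMEMPHY":
        files = [f"{variation_name}_phy_ddr_timing.sdc"]
        if use_example_design:
            files.append(f"{variation_name}_example_top.sdc")  # optional file
        return files
    raise ValueError(f"unknown PHY type: {phy}")


def qsf_lines(files):
    """Format one SDC_FILE assignment per file for the project's .qsf."""
    return [f"set_global_assignment -name SDC_FILE {name}" for name in files]


if __name__ == "__main__":
    for line in qsf_lines(sdc_files("ddr3_top", "UniPHY")):
        print(line)
```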

SDRAM Memory (Typical Configuration)

[Figure: block diagram of a Micron 1 GB DDR3 SDRAM showing control logic, multiple memory banks, and read/write logic]

Example Design Revisited For 11.0

See main presentation for 11.1 and later

Design hierarchy:
  ddr3_top_example_sim_tb.v
    ddr3_top_example_sim.v
      ddr3_top_example_sim_ddr3_top_example_sim.v
        ddr3_top_example_sim_ddr3_top_example_sim_e0.v

[Figure: example design block diagram. Test driver (Avalon-MM master) connects over the local interface to the controller logic (Avalon-MM slave); the controller connects over AFI to the PHY, which contains the PLL and DLL and connects to the memory model; inputs are pll_ref_clk and global_reset_n; the status checker drives pass, fail, and test_complete]

DDR/DDR2 Read Latency - ALTMEMPHY

Column key: Ctrl = controller latency; A/C = address and command latency; CAS = CAS latency; RD = read data latency; Total = total read latency

Device                      Freq (MHz)  Interface  Ctrl  A/C FPGA  A/C I/O  CAS  RD FPGA  RD I/O  Total (local clock cycles)  Total (ns)
Arria GX                    233         Half-rate  5     3         1        2    4.5      1       18                          154
Arria GX                    167         Full-rate  5     2         1        4    5        1       19                          114
Arria II GX                 233         Half-rate  5     3         1        2.5  5.5      1       18                          154
Arria II GX                 167         Full-rate  5     2         1        4    6        1       20                          120
Cyclone III, Cyclone IV     200         Half-rate  5     3         1        2    4.5      1       18                          180
Cyclone III, Cyclone IV     167         Full-rate  5     2         1        4    5        1       19                          114
Stratix II, Stratix II GX   333         Half-rate  5     3         1        2    4.5      1       18                          108
Stratix II, Stratix II GX   267         Half-rate  5     3         1        2    4.5      1       18                          135
Stratix II, Stratix II GX   200         Full-rate  5     2         1        4    5        1       19                          95
Stratix III, Stratix IV     400         Half-rate  5     3         1        2.5  7.125    1.5     20                          100
Stratix III, Stratix IV     267         Full-rate  4     2         1.5      4    7        1       20                          75
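The Time (ns) column follows from the total cycle count and the local clock frequency. The sketch below assumes the local clock runs at half the memory clock frequency for half-rate interfaces and at the memory clock frequency for full-rate interfaces. The listed frequencies are rounded nominal values, so some rows can differ from this calculation by about 1 ns.

```python
def latency_ns(local_cycles, mem_clock_mhz, half_rate):
    """Convert a latency in local clock cycles to nanoseconds."""
    local_clock_mhz = mem_clock_mhz / 2 if half_rate else mem_clock_mhz
    return local_cycles * 1000.0 / local_clock_mhz


if __name__ == "__main__":
    print(round(latency_ns(20, 400, half_rate=True)))   # Stratix III/IV, half-rate
    print(round(latency_ns(19, 167, half_rate=False)))  # Cyclone III/IV, full-rate
```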

DDR/DDR2 Write Latency - ALTMEMPHY

Column key: Ctrl = controller latency; A/C = address and command latency; WL = memory write latency; Total = total write latency

Device                      Freq (MHz)  Interface  Ctrl  A/C FPGA  A/C I/O  WL   Total (local clock cycles)  Total (ns)
Arria GX                    233         Half-rate  5     3         1        1.5  12                          103
Arria GX                    167         Full-rate  5     2         1        3    12                          72
Arria II GX                 233         Half-rate  5     3         1        2.5  12                          103
Arria II GX                 167         Full-rate  5     2         1        4    12                          72
Cyclone III, Cyclone IV     200         Half-rate  5     3         1        1.5  12                          120
Cyclone III, Cyclone IV     167         Full-rate  5     2         1        3    12                          72
Stratix II, Stratix II GX   333         Half-rate  5     3         1        1.5  12                          72
Stratix II, Stratix II GX   267         Half-rate  5     3         1        1.5  12                          90
Stratix II, Stratix II GX   200         Full-rate  5     2         1        3    12                          60
Stratix III, Stratix IV     400         Half-rate  5     3         1        2    12                          60
Stratix III, Stratix IV     267         Full-rate  5     2         1.5      3    13                          49

DDR3 Typical Latency - ALTMEMPHY

Device       Controller Rate  Frequency (MHz)  Latency Type  Total Latency (local clock cycles)  Total Latency (ns)
Stratix III  Half             400              Read          23                                  115
Stratix III  Half             400              Write         14                                  68
Stratix IV   Half             400              Read          23                                  115
Stratix IV   Half             400              Write         14                                  68

The exact latency depends on your precise configuration. You should obtain precise latency from simulation, but this figure may vary in hardware because of the automatic calibration process.

DDR2 Latency - UniPHY

Controller Rate  Controller Address & Command  PHY Address & Command  Memory Maximum Read  PHY Read Return  Round Trip            Round Trip w/out Memory
Full             5                             1                      3-7                  4                13-17                 10
Half             10                            3 (1), 4 (2)           3-7                  8                24-28 (1), 26-28 (2)  21 (1), 22 (2)

(1) Even write latency
(2) Odd write latency

EDA Settings Revisited

Not necessary for UniPHY; generated designs work for both simulation and synthesis


Greater Project Directory Structure (ALTMEMPHY)

[Figure: directory tree with callouts]
  (Quartus II design project folder)
    Project (.qpf) file
    Settings (.qsf) file
    MegaWizard-generated IP files
  (testbench directory)

NativeLink Simulation (11.0 & ALTMEMPHY only)

Perform Analysis & Elaboration


Set NativeLink Simulation Options

Tools menu > Options
Path to simulator executable directory

Choose EDA Simulation Tool & Language


Establish Simulation Settings

Simulation files stored

here

Testbench control


NativeLink Test Benches

RTL test bench automatically created for example project (11.0)

Create new testbench manually (11.1 and later)


DUT and Test Bench Files

Replace generic model with vendor model

Run Simulation via NativeLink

Run EDA RTL Simulation
  Creates simulation files in “simulation/modelsim” directory
  Creates script: <variation_name>_example_sim_run_msim_rtl_verilog.do


Simulation Script

Compiles all required files, starts simulation, sets up waveform view, and advances the simulation
Can be edited and run manually
  Example: add or change waveforms to view
Overwritten each time simulation started with NativeLink

Edit and Source “.do” Script


Run Simulation Script Manually


Format Waveforms During Simulation


Simulation Details – Start-up Sequence

Self-calibrating control block
  Startup: training pattern
    Calibrates out process differences (board, memory, & FPGA)
  Normal operation: monitor & adjust
    Compensate for voltage and temperature variations
    No interruption of operation

[Figure: start-up flow: external memory device initialization, write data training, calibration, functional use of memory]

Core Function: Supplementary Information

Initialization is activated automatically by core immediately after reset release and cannot be stopped by user logic
Global reset going into the PHY can re-start sequence


Simulation Stages

1. Device initialization
2. Training and calibration stage
3. Functional test begins

Functional Stage

Functional use begins
  Test driver takes over after local_init_done goes high
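The three simulation stages and the reset behavior described earlier can be summarized as a small state machine. This is a behavioral sketch only: init_done and calibration_done are hypothetical flags standing in for the PHY's internal progress, while global_reset_n and local_init_done are the signal names used in this appendix.

```python
from enum import Enum


class Stage(Enum):
    INITIALIZATION = 1   # 1. Device initialization
    CALIBRATION = 2      # 2. Training and calibration stage
    FUNCTIONAL = 3       # 3. Functional test begins


def next_stage(stage, global_reset_n, init_done, calibration_done):
    """Advance one step; asserting reset (active low) restarts the sequence."""
    if not global_reset_n:
        return Stage.INITIALIZATION
    if stage is Stage.INITIALIZATION and init_done:
        return Stage.CALIBRATION
    if stage is Stage.CALIBRATION and calibration_done:
        return Stage.FUNCTIONAL
    return stage


def local_init_done(stage):
    """The test driver takes over only once this is high."""
    return stage is Stage.FUNCTIONAL


if __name__ == "__main__":
    s = Stage.INITIALIZATION
    s = next_stage(s, True, True, False)
    s = next_stage(s, True, True, True)
    print(s, local_init_done(s))
```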

HyperLynx LineSim GUI

Pick and place parts to build circuit
Set properties for each component
Simulate and view results in scope window
Display eye pattern at receiver

[Figure: LineSim schematic with callouts for board stackup, output buffer, series resistance, trace element, and input buffer]

Modeling Stratix III to DDR2 SDRAM DIMM

Hyperlynx setup


Hyperlynx Sim vs. Real Measurement

DDR2 SDRAM Read

Difference between simulation and measured trace could be due to capacitance in the via where the measurement was taken

Gauging Signal Integrity and Quality

“Eye” opening: persistent oscilloscope measurement at receiver
Bigger eye means better margin, less over/undershoot
Compare with simulation

[Figure: eye diagram annotated with width, height, overshoot, and undershoot]

Parts of the Eye

Width
  Faster edge rates widen the eye
  Helps meet setup and hold timing

Height
  Bigger height, larger signal swing
  Ensure meeting of VIH and VIL specifications

Over/undershoot
  Caused by accumulation of reflections or over-driving signal
  Ringing can cause false triggering
  Receiver damage possible over long term if specifications greatly exceeded
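The height check can be sketched numerically. This minimal example assumes voltage samples taken at the receiver's sampling instant have already been split into logic-high and logic-low groups; the function names, sample values, and VIH/VIL thresholds are all hypothetical.

```python
def eye_height(high_samples, low_samples):
    """Vertical eye opening: lowest high level minus highest low level."""
    return min(high_samples) - max(low_samples)


def meets_levels(high_samples, low_samples, vih_min, vil_max):
    """True if every high sample reaches VIH and every low sample stays under VIL."""
    return min(high_samples) >= vih_min and max(low_samples) <= vil_max


if __name__ == "__main__":
    highs = [1.5, 1.25, 1.375]   # hypothetical volts
    lows = [0.25, 0.5, 0.375]
    print(eye_height(highs, lows))
    print(meets_levels(highs, lows, vih_min=1.125, vil_max=0.625))
```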

Check Point

Does my simulated eye meet my criteria? If not, adjust settings and iterate

[Figure: flowchart. Determine board design constraints, then perform board-level simulations, then check "Meets timing & performance?". If no, adjust termination, drive strength, etc. and simulate again; if yes, continue design]

No Termination Scheme

Not recommended
Bad signal integrity on both reads and writes
No external components
Potential electrical discontinuities

[Figure: FPGA (impedance ZS) connected to memory (impedance ZS) through a trace of impedance Z0, with Z0 ≠ ZS]
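The effect of an impedance mismatch such as Z0 ≠ ZS can be quantified with the standard transmission-line reflection coefficient, Γ = (ZL − Z0) / (ZL + Z0). The formula is general transmission-line theory added here for illustration, not something stated on the slide, and the impedance values are hypothetical.

```python
def reflection_coefficient(z_load_ohm, z0_ohm):
    """Fraction of the incident voltage wave reflected at the load."""
    return (z_load_ohm - z0_ohm) / (z_load_ohm + z0_ohm)


if __name__ == "__main__":
    print(reflection_coefficient(50.0, 50.0))    # matched: no reflection
    print(reflection_coefficient(150.0, 50.0))   # mismatched high-impedance load
    print(reflection_coefficient(25.0, 50.0))    # mismatched low-impedance load
```

A coefficient of zero corresponds to a matched termination; any nonzero value means part of the wave returns toward the driver and contributes to over/undershoot.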

Eye (R/W) for No Termination Scheme

Bad signal integrity!

Small eye width and height

Large over/undershoot


Termination Schemes: Parallel Class II

Two resistors needed per line: one for each receiver
VTT power supply required
Impedance matched and energy absorbed at both ends
Good for bidirectional signals

[Figure: FPGA (ZS) and memory (ZS) joined by a trace of impedance Z0, with a parallel resistor ZP to VTT at each end; ZP = Z0 = 2ZS]

Non-Fly-By (Star) Topology

Parallel termination typically placed before memory
Creates un-terminated stubs
  Stub itself causes reflections and ringing

[Figure: star topology. Trace Z0 with parallel termination ZP branching into stubs Zstub1 and Zstub2]

Daisy Chain (Fly-By) Topology

ZP can be placed past receiver

Removes stubs

More difficult to route

[Diagram: single Z0 trace passing each receiver in turn, with ZP at the far end]


Board Area & Component Usage

Many components needed for full Class II!

[Board photo with labeled regions: fly-by pull-ups & bypass; pull-ups and VTT bypass; DIMM connector; FPGA]

Initial Conclusions

Termination in some form is essential for high speed designs

Class I good for unidirectional signals (address, command)

Class II good for bidirectional signals (DQ, DQS)

Lots of resistors needed

Extra power rail/plane needed


Utilize FPGA Adjustable I/O Features

Transistors inside I/O elements can be tailored to allow either drive strength or impedance adjustment

Thus, these options are mutually exclusive

The former is calibrated around drive strength and the latter around impedance, but either can be used in this context

Setting FPGA Drive Strength

Multiple settings available on all Altera devices

Settings depend on selected I/O standard

8 mA

Typical of Class I termination

Prevents over/undershoot due to less loading

16 mA

Typical of Class II termination

Increases eye height in more heavily loaded configuration


Programmable Slew Rate

Slew rate

Rate a signal takes to transition from one logic state to another

Measured in V/ns

Stratix III/IV devices

Set as 0 (slowest), 1, 2, or 3 (fastest; default)

Not available when series OCT in use

Stratix V devices

Only 2 settings: 0 (slow) and 1 (fast; default)

Not available when series OCT in use

Faster slew rate means faster, stronger edges

More chance of overshoot, noise (SSN)

Slower slew rate means slower edges

Slower signaling, but less noise

[Figure: fast and slow edge waveforms]

Board Trace Mismatch Compensation

Manually adjust I/O delay to compensate for longer/shorter board trace

Digitally programmable in 50 ps steps

Compensation for 0 to 5 ½ inches

FR4 (standard board material) delay: ~170 ps/inch

“Last resort” debugging; check IP parameters and board trace models first

Stratix III / IV / V

User controlled
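
The two figures on this slide (about 170 ps per inch of FR4 and 50 ps delay steps) are enough for a rough estimate of how many steps a given trace mismatch needs. This is a minimal sketch under those two assumptions; the function names are hypothetical and it does not model the actual delay-chain settings in any device.

```python
FR4_DELAY_PS_PER_INCH = 170.0   # approximate FR4 delay quoted on the slide
STEP_PS = 50.0                  # programmable delay step quoted on the slide

def compensation_steps(mismatch_inches: float) -> int:
    """Nearest whole number of delay steps for a trace-length mismatch."""
    delay_ps = mismatch_inches * FR4_DELAY_PS_PER_INCH
    return round(delay_ps / STEP_PS)

def residual_ps(mismatch_inches: float) -> float:
    """Skew left over after applying the nearest whole number of steps."""
    delay_ps = mismatch_inches * FR4_DELAY_PS_PER_INCH
    return delay_ps - compensation_steps(mismatch_inches) * STEP_PS

for inches in (1.0, 5.5):
    print(inches, "in ->", compensation_steps(inches), "steps, residual",
          residual_ps(inches), "ps")
```

Under these assumptions a 5 ½ inch mismatch is about 935 ps, or roughly 19 steps, which is in line with the compensation range listed above.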


Deliberately Skew DQ Data Output

Reduce simultaneous switching noise (SSN)

Delaying adjacent edges reduces the total number of simultaneous switching output (SSO) edges

Controllable in 50 ps steps

See Advanced I/O System Design online training for more information about SSN

[Timing diagram labels: 5 ns, 700 ps]

Series OCT (Class I)

Choose 50 Ω for Class I (unidirectional signals)

For typical Z0 of 50 Ω

Similar to standard 8 mA drive strength

Typically used for Class I

Eye with OCT slightly bigger than with drive strength setting (mutually exclusive settings)

[Diagram: FPGA driver with series OCT (ZS), trace (Z0), and parallel resistor ZP to VTT at the memory; ZS = ZP = Z0; VTT = VCC/2]


Series OCT (Class II)

Choose 25 Ω for Class II

Similar to standard 16 mA drive strength

Typically used for Class II

Eye with OCT slightly bigger than with drive strength setting

[Diagram: FPGA driver with series OCT (ZS), trace (Z0), and a parallel resistor ZP to VTT at each end; ZS = ZP/2 = Z0/2]

Parallel OCT

50 Ω Thevenin-equivalent parallel termination

No VTT required

Bidirectional and input pins only

No external termination resistor(s) needed at FPGA

Calibrates using external 50 Ω resistors

Uses same RUP, RDN resistors as for series OCT

Stratix III / IV / V

[Diagram: bidirectional FPGA pin and bidirectional memory (each with ZS) connected by a trace (Z0); two 100 Ω resistors at the FPGA, one of them to VCC, form the parallel termination]
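
The 50 Ω figure follows from the two 100 Ω legs shown in the diagram. The sketch below works through the Thevenin equivalent, assuming one leg goes to VCC and the other to ground and assuming a 1.5 V supply (a typical DDR3 I/O voltage, not a value stated on the slide).

```python
def thevenin(r_up: float, r_down: float, vcc: float) -> tuple[float, float]:
    """Thevenin resistance and voltage of a pull-up/pull-down divider."""
    r_th = (r_up * r_down) / (r_up + r_down)   # the two legs in parallel
    v_th = vcc * r_down / (r_up + r_down)      # divider midpoint voltage
    return r_th, v_th

VCC = 1.5  # assumed supply voltage, for illustration only
r_th, v_th = thevenin(100.0, 100.0, VCC)
print(f"R_th = {r_th:.1f} ohm, V_th = {v_th:.2f} V")
```

The equivalent source sits at VCC/2, which is the level an external VTT rail would otherwise supply, so no separate VTT is needed at the FPGA.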


Loading Considerations

Decisions on settings and termination need to take type of loading into account

DIMM vs. discrete components

No connector and no on-board series resistors

Reduces loading, improves signal integrity

Single-rank vs. dual-rank DIMMs

Higher density DIMMs increase load

Slows edge rates, affecting signal integrity

Improve with increased drive strength at the expense of more power usage

Multiple DIMMs

Many options: one or both populated, multiple sets of controls

See EMIF Handbook for detailed analysis and recommendations

Summary: Unidirectional Recommendations

Class I

ODT on memory if available

DDR2: 50 Ω or 75 Ω

DDR3: 60 Ω or 120 Ω

If ODT is not supported, use memory side external termination

Series OCT at 50 Ω

No external components possible with DDR2/3!

[Diagrams: FPGA with ZS = 50 Ω driving a Z0 = 50 Ω trace to the DIMM, terminated either by an external ZP to VTT = VCC/2 or by ODT on the DIMM]


Summary: Bidirectional Recommendations

Class II

FPGA side external termination (if not Stratix III / IV / V device)

ODT on memory on writes (dynamic ODT with DDR3)

If ODT is not supported, use external termination at memory

Dynamic series/parallel OCT at 50 Ω if available

Series OCT at 25 Ω if not

No extra components possible with Stratix III / IV / V device and DDR2/3!

Cyclone III and Cyclone IV

ALTMEMPHY performs initial data capture using self-calibrating circuit

DQS strobes from memory are not used for capture

Dynamic PLL clock used to capture DQ data signals


ALTMEMPHY Timing Paths (1)

| Timing Path | ALTMEMPHY Variations | Applicable Clock | Description |
|---|---|---|---|
| Address and command | All | mem_clk | Setup and hold margin for all address and command pins or for mem_cke, mem_cs_n, and mem_odt pins. |
| PHY | All | PHY clocks | Internal timing of the ALTMEMPHY megafunction. |
| PHY reset | All | PHY clocks | Internal timing of the asynchronous reset signals to the ALTMEMPHY megafunction. |
| DQS vs. CK | DDR/DDR2 | mem_clk_DQSS | Skew requirement for the DQS strobe at the memory with respect to the arrival time of CK/CK#. |
| Half-rate address and command | DDR/DDR2/DDR3 | mem_clk | Setup and hold margin for the address and command pins (except for mem_cs_n, mem_cke, and mem_odt pins) with respect to the mem_clk clock at the memory when the PHY is in half-rate mode. |
| Mimic | DDR/DDR2 | clk[0] | The setup margin for the voltage and temperature tracking mechanism. |
| Read capture | All | dqs | Setup and hold margin for the DQ pins with respect to DQS strobe at the FPGA capture registers. |

ALTMEMPHY Timing Paths (2)

| Timing Path | ALTMEMPHY Variations | Applicable Clock | Description |
|---|---|---|---|
| Read postamble | DDR/DDR2 | Postamble clock | Setup and hold time margin for the postamble path that is calibrated with the resynchronization clock phase. |
| Read postamble enable | DDR/DDR2 | DQS clocks | The setup and hold margin for the postamble logic that enables and disables the DQS signal going to the DQ registers. |
| Read resync. | DDR/DDR2/DDR3 | Resync. clock | Setup and hold margin for the DQ data with respect to resynchronization and the postamble clock at the resynchronization and the postamble registers. |
| Write datapath | All | DQS | Setup and hold margin for the DQ pins with respect to DQS strobe at the memory. |
| Write leveling tDQSS | DDR3 except Arria II GX | CK/CK# clocks | Skew margin for the arrival time of the DQS strobe at the memory with respect to the arrival time of CK/CK# at the memory. |
| Write leveling tDSS/tDQSH | DDR3 except Arria II GX | CK/CK# clocks | Setup and hold margin for the DQS falling edge with respect to the CK clock at the memory. |


ALTMEMPHY

| File | Description |
|---|---|
| ddr_timing.sdc | Clock constraints for PLL inputs. Generated clock constraints for PLL outputs. Derive clock uncertainty. Exceptions (false paths and multi-cycle paths). Output delays on address and command outputs. Output delays on DQS strobe outputs. |
| example_top.sdc | Provides constraints for the example driver block |
| ddr_timing.tcl | Includes memory interface and FPGA device parameters |
| report_timing.tcl | Reports timing slacks |
| report_timing_core.tcl | Contains high-level procedures for report timing script |
| ddr_pins.tcl | Library of useful functions |

Note: Files are preceded with variation name.

Mimic Path - ALTMEMPHY

Mimics the round trip delay

Enables calibration sequencer to track variations

Voltage

Temperature

Adjusts without affecting operation of controller

No timing constraints required for Arria II GX and Stratix IV devices

Cyclone III and IV devices place mimic register close to the IOE


DQS vs. CK Path (Cyclone III/IV)

Indicates skew requirement for arrival of DQS strobe at memory

Requires timing constraints to account for duty cycle distortion (set_output_delay max & min)

ALTMEMPHY DLL Sharing

Instantiate DLL externally

Ensure Instantiate DLL externally option is turned on in the PHY Settings page of the MegaWizard Plug-In Manager

Stratix devices only


ALTMEMPHY PLL Sharing

ALTMEMPHY requires 5 PLL output taps (min)

phy_clk_1x - static system clock for data path and controller

mem_clk_2x - static DQS output clock for DQS, CK/CK#, and input to DLL

write_clk_2x - static DQ output clock for DQ signals 90° before DQS

resynch_clk_2x - dynamic phase clock for resynchronization and postamble

measure_clk_2x - dynamic phase clock for VT tracking

mem_clk_ext_2x - optional static clock when dedicated outputs used for CK/CK#

ac_clk_2x - dedicated static clock for address and command signals (Arria II GX, Stratix III, IV, V devices only)

Multiple PLL clock output sharing options

Depending on number of clock networks and PLL outputs available

Share static clocks, saving up to 4 clock networks

With unique resynchronization clocks for each interface

While mimic paths can be shared or independent

User needs to design own logic to share mimic clock

Requires modifications to design

Once ALTMEMPHY Megafunction files changed, cannot re-open in MegaWizard Plug-In Manager

Number of PLL Outputs by Device

| Device Family | Number of Enhanced PLL Clock Outputs | Number of Dedicated Clock Outputs |
|---|---|---|
| Arria II GX | 4 clock outputs each | 1 single-ended or 1 differential pair; 3 single-ended or 3 differential pair total |
| Cyclone III, Cyclone IV | 5 clock outputs each | 1 single-ended or 1 differential pair total (not for memory interface use) |
| Stratix III | Left/right: 7 clock outputs; Top/bottom: 10 clock outputs | Left/right: 2 single-ended or 1 differential pair; Top/bottom: 6 single-ended or 4 single-ended and 1 differential pair |
| Arria II GZ, Stratix IV | Left/right: 7 clock outputs; Top/bottom: 10 clock outputs | Left/right: 2 single-ended or 1 differential pair; Top/bottom: 6 single-ended or 4 single-ended and 1 differential pair |
| Stratix V | 18 clock outputs each | 4 single-ended or 2 single-ended and 1 differential pair |


Example Stratix III/IV PLL Sharing Limitations (ALTMEMPHY)

Top / bottom PLLs have up to ten clock outputs

Up to three controllers sharing same PLL with separate resynchronization clock and measure clocks

(4 static) + (2 dynamic x 3)

Or up to five controllers sharing the same PLL (4 static) and measure clocks (1 dynamic) with a separate resynchronization clock (1 dynamic x 5)

(4 static) + (1 dynamic) + (1 dynamic x 5)

When sharing measure clock, ensure memory devices accessed by each different controller are laid out with same trace lengths
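
The controller counts on this slide come from a simple budget of PLL outputs. The sketch below reproduces that arithmetic; the function and parameter names are hypothetical, and it only counts outputs, ignoring clock-network and placement limits.

```python
def max_controllers(pll_outputs: int, static_clocks: int,
                    shared_dynamic: int, dynamic_per_controller: int) -> int:
    """Number of controllers whose dynamic clocks fit on one PLL."""
    remaining = pll_outputs - static_clocks - shared_dynamic
    if remaining < 0 or dynamic_per_controller <= 0:
        return 0
    return remaining // dynamic_per_controller

# Top/bottom PLL with 10 outputs and 4 shared static clocks.
separate_measure = max_controllers(10, 4, 0, 2)  # resync + measure per controller
shared_measure = max_controllers(10, 4, 1, 1)    # one shared measure, resync per controller

print(separate_measure, shared_measure)
```

The same budget applied to a 5-output PLL with 3 static and 2 dynamic clocks leaves room for a single interface.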

Cyclone III / IV Clock Sharing Limitations

PLLs have up to 5 clock outputs

(3 static) + (2 dynamic) for single interface

Need additional PLL to resynchronize other interface


Dealing with Clock Crossing Logic

Clock crossing bridges added if components not clocked by memory controller (adds latency)

Manually add bridge or use half-rate bridge in controller

Must cut unrelated timing paths from timing analysis and place & route

Uses clock-crossing FIFOs to translate transfers across clock domains

Example SDC Syntax to Cut Paths

#**************************************************************
# Create Clocks
#**************************************************************
# Define external clock frequency from oscillator:
create_clock -name clk_0 -period 20.000 -waveform {0.000 10.000} [get_ports clk] -add

#**************************************************************
# Define aliases for long clock names
# (braces keep Tcl from treating [1] and [0] as command substitution):
#**************************************************************
set uniphy_ddr2_0_clock_source {sopc_top_inst|the_uniphy_ddr2_0|mem_if|controller_phy_inst|memphy_top_inst|upll_memphy|altpll_component|auto_generated|pll1|clk[1]}
set system_clk {sopc_top_inst|the_pll|the_pll|altpll_component|auto_generated|pll1|clk[0]}

#**************************************************************
# Set False Paths
#**************************************************************
# Cutting the paths between the system clock and DDR local clock
# since there is a clock crossing bridge between them (FIFOs)
set_false_path -from [get_clocks $system_clk] -to [get_clocks $uniphy_ddr2_0_clock_source]
set_false_path -from [get_clocks $uniphy_ddr2_0_clock_source] -to [get_clocks $system_clk]


Performance Considerations (1)

Don’t waste bandwidth through width mismatches

Ideal data size at Avalon interface is 32 bits for (32-bit) Nios II processor

Double-data rate and half-rate stages both double incoming data width

16-bit memory device best for full-rate option (16x2=32)

8-bit external memory device best when using half-rate option (8x2x2=32)
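
The width rule on this slide can be written as a one-line calculation. This is a minimal sketch of that arithmetic only; the function name is hypothetical and it is not part of any Altera tool.

```python
def avalon_data_width(mem_dq_width: int, half_rate: bool) -> int:
    """Local (Avalon) data width for a given memory DQ width."""
    width = mem_dq_width * 2   # double data rate: two transfers per memory clock
    if half_rate:
        width *= 2             # half-rate stage doubles the width again
    return width

print(avalon_data_width(16, half_rate=False))  # full-rate, 16-bit device
print(avalon_data_width(8, half_rate=True))    # half-rate, 8-bit device
```

Both cases give a 32-bit local interface, matching the 32-bit Nios II data size. By the same rule, a 64-bit half-rate interface presents 256 bits at the local side.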

Performance Considerations (2)

Consider full-rate DDR (16-bit memory chip) example:

Data at Avalon interface (burst of 1): ABCDEFGH (32 bits per Avalon clock)

Data at memory interface (burst of 2): EFGH ABCD (2x16 bits at memory interface)

Consider half-rate DDR (8-bit memory chip) example:

Data at Avalon interface (burst of 1): ABCDEFGH (32 bits per Avalon clock)

Data at memory interface (burst of 4): GH EF CD AB (4x8 bits at memory interface)


CSR Address Map

| Address | Bit | Description |
|---|---|---|
| 0x004 | 25 | Reports the value of afi_cal_fail |
| 0x004 | 24 | Reports the value of afi_cal_success |
| 0x004 | 0 | Initiates a soft reset |
| 0x005 | 23:16 | Write figure of merit |
| 0x005 | 7:0 | Read figure of merit |
| 0x006 | 23:16 | Initial failing error group of calibration |
| 0x006 | 7:0 | Initial failing error stage of calibration |
| 0x007 | 31:0 | Indicates whether DQS edges have been identified for each group |

Figure of merit: sum over all groups of minimum margin on DQ + margin on DQS divided by 2; measure of interface health
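
As an illustration of how these bit fields might be unpacked once the registers have been read, here is a minimal sketch. The register values are made up, the helper names are hypothetical, and how the registers are actually read from the CSR port is outside its scope.

```python
def bits(value: int, high: int, low: int) -> int:
    """Extract the inclusive bit field value[high:low]."""
    mask = (1 << (high - low + 1)) - 1
    return (value >> low) & mask

def decode_csr(regs: dict) -> dict:
    """Decode the calibration fields listed in the CSR address map."""
    return {
        "cal_fail": bool(bits(regs[0x004], 25, 25)),
        "cal_success": bool(bits(regs[0x004], 24, 24)),
        "write_fom": bits(regs[0x005], 23, 16),
        "read_fom": bits(regs[0x005], 7, 0),
        "fail_group": bits(regs[0x006], 23, 16),
        "fail_stage": bits(regs[0x006], 7, 0),
    }

# Made-up register contents, for illustration only.
sample = {0x004: 0x01000000, 0x005: 0x00230041, 0x006: 0x00000000}
status = decode_csr(sample)
print(status)
```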


Implementing, Simulating, and Debugging External Memory

Interfaces

Exercise Manual

Software Requirements: Quartus® II software v. 12.0, ModelSim®-Altera® Edition software v. 10.0d

Hardware requirements: Stratix® IV GX FPGA development kit

Link to the Quartus II and External Memory Interfaces Handbooks:

http://www.altera.com/literature/hb/qts/quartusii_handbook.pdf

http://www.altera.com/literature/lit-external-memory-interface.jsp

Use the link below to download the design files for the exercises:

http://www.altera.com/customertraining/ILT/Memory_Interface_12_0_v1.zip


Implementing, Simulating, and Debugging External Memory Interfaces Exercises

Copyright © 2012 Altera Corporation

A-MNL-ISDMI-EX-12-0-v1


Exercise 1

Create a Quartus II Design with the High Performance DDR3 SDRAM Controller with UniPHY


Exercise 1

Objective:

• Create and parameterize an instance of the DDR3 High Performance Controller (HPC) II with UniPHY

Introduction:

This document walks you through the steps necessary to create, constrain, and verify operation of the DDR3 SDRAM High Performance Controller with UniPHY in a Stratix IV GX device. The lab is targeted to the Stratix IV GX FPGA Development Board. However, using the same steps, it could be targeted to any development board supporting DDR3 (or DDR2 with modifications).

Hardware Requirements:

- Altera Stratix IV GX FPGA Development Board, which includes:

o Stratix IV EP4SGX230KF40C2 FPGA

o Micron MT41J64M16LA-15E 1 Gb (128 MB) DDR3 SDRAM (x5) components (top interface: 128 MB; bottom interface: 512 MB)

- USB-Blaster™ programming interface built into development board and connected between the computer and the board via USB

- The appropriate power supply connected to the board

Performance Expectations:

The Stratix IV C2 device is rated at up to 533 MHz (1066 Mbps per pin) for DDR3 SDRAM. Using all 4 “bottom” port DDR3 devices (U5, U12, U18, U24) on the development board gives a maximum bandwidth of:

64 (bits wide) x 1066 Million (transfers/second) = 68224 Million bits per second or 68.2 Gbps

In this lab, you will connect the DDR3 memory controller to all four of the “bottom” port DDR3 SDRAM devices on the development board (64 bits wide; about 68.2 Gbps).
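
The bandwidth figure above can be reproduced with a short calculation. This is a minimal sketch of the peak (theoretical) number only; the function name is hypothetical, and real throughput is lower once refresh, bank management, and bus turnaround are taken into account.

```python
def peak_bandwidth_mbps(width_bits: int, mem_clock_mhz: int) -> int:
    """Peak DDR bandwidth in Mbps: two transfers per clock on every data pin."""
    data_rate_mtps = 2 * mem_clock_mhz   # double data rate
    return width_bits * data_rate_mtps

mbps = peak_bandwidth_mbps(64, 533)
print(f"{mbps} Mbps = {mbps / 1000:.1f} Gbps = {mbps / 8:.0f} MB/s")
```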

As you proceed through the exercises, be sure to completely read the instructions for each step and sub-step in this lab manual. Each step first summarizes what you will be doing in that step before providing detailed instructions. Use the lines next to each step (____) to keep track of your progress or to check off completed steps in the exercises.

If you have any questions or problems, please ask the instructor for assistance.


Step 1: Extract the lab files

____ 1. Unzip the lab project files, if necessary. In an Explorer window, go to C:/altera_trn/Memory. This is your lab installation directory. Please check with your instructor if you do not see this directory. Delete any old lab file folders that may already exist there. Double-click the executable file Memory_Interface_12_0_v1.exe. If you cannot find this file, ask your instructor for assistance. In the WinZip dialog box, just click Unzip to automatically extract the files in place to a new folder named MEM12_0 in the directory mentioned above.

From now on, this will be referred to as the <Mem lab install directory>.

____ 2. Create a Quartus II project in order to start creating the memory IP. Start the Quartus II software, version 12.0, from the Start menu (All Programs → Altera 12.0 Build 178 → Quartus II 12.0; use the 64-bit version if using an Altera training laptop or desktop) or from a shortcut on the desktop.

____ 3. From the File menu, select New Project Wizard. Choose the following options in completing the wizard:

a. Page 1 - Name the project and top-level design entity siv_ddr3 and set the project directory to <Mem lab install directory>.

b. Page 2 - Leave this page blank as you will add files later in the exercise.

c. Page 3 - Set the Device Family to Stratix IV (GT/GX/E). Set Devices to Stratix IV GX to help filter the Available devices list. You can further filter the list by setting the Pin count to 1517 and the Speed grade to 2. Select the EP4SGX230KF40C2 device. Be sure to select the correct device!

d. Page 4 - Though you will be simulating the design later, leave this page blank.

e. Click Finish.

The project you just created won’t actually be used in the labs today. You’ll be using the example project created by the MegaWizard® Plug-In Manager exclusively. However, in your own designs, you would create your own top-level project with your own user logic to talk to the controller and then instantiate the memory IP into it.


Step 2: Create and parameterize the DDR3 memory IP

In this step you will create a DDR3 IP block targeting the Stratix IV GX family. The DDR3 IP generated will include synthesizable and simulation versions of the Altera HPCII and UniPHY. The Quartus II 12.0 MegaWizard Plug-in Manager will also create an example project instantiating the DDR3 IP block, as well as a traffic generator, allowing you to test the controller in simulation as well as on the targeted FPGA development board.

____ 1. From the Tools menu, select MegaWizard Plug-In Manager.

____ 2. When the MegaWizard Plug-In Manager opens, select Create a new custom megafunction variation and click Next.

____ 3. On page 2a of the MegaWizard tool, make the following selections (if necessary):

| Setting | Value |
|---|---|
| Device family | Stratix IV |
| Type of output file | Verilog HDL |
| Output file | C:/altera_trn/Memory/MEM12_0/ddr3_top |

____ 4. Expand the Interfaces folder, the External Memory folder, and then the DDR3 SDRAM folder. Select DDR3 SDRAM Controller with UniPHY v12.0. The tool should look like the screenshot below. Click Next to continue.


The tool will launch the DDR3 SDRAM Controller with UniPHY dialog box. This dialog box has six tabs across the top. You will configure the DDR3 SDRAM controller for the Stratix IV GX Development Board.

The lab instructions don’t go into detail about each setting you’ll be making in the parameter editor. If you are interested in learning more about the settings, you can hover over the setting to get a tooltip; click Documentation in the upper right hand corner of the tool; or search for the setting name in the External Memory Interfaces Handbook, included with the exercise files in the Vendor_files folder or on the Altera web site at http://www.altera.com/literature/lit-external-memory-interface.jsp.

____ 5. Under the PHY Settings tab, make the following selections. Any setting not listed should be left at its default value.

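| Setting | Value |
|---|---|
| FPGA Speed Grade | 2 |

Clocks

| Setting | Value |
|---|---|
| Memory clock frequency | 533 MHz |
| PLL reference clock frequency | 50 MHz |
| Rate on Avalon-MM interface | Half |

Advanced PHY Settings

| Setting | Value |
|---|---|
| Advanced clock phase control | Enabled |
| Additional address and command clock phase | 0.0 |
| Additional CK/CK# phase | 0.0 |
| PLL, DLL, OCT sharing mode | No sharing |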

____ 6. Select a memory preset. From the Presets list on the right, select the memory found on the development board, MICRON MT41J64M16LA-15E, and click Apply.

The preset sets default values for the memory you’ll be interfacing to, saving time in having to scour the memory data sheet for information. If you are curious about this memory, its datasheet can be found in the Vendor_files folder in the lab installation directory.


____ 7. Select the Memory Parameters tab. Verify that the memory preset selected the correct values and set custom values for this design using the list below. Again, if a setting is not listed, leave it at its default value. (Some of these settings we’ll discuss in more detail throughout the rest of today.)

Memory Parameters

| Setting | Value |
|---|---|
| Memory vendor | Micron |
| Memory Format | Discrete Device |
| Memory device speed grade | 666.667 |
| Total interface width | 64 |
| DQ/DQS group size | 8 |
| Number of chip selects | 1 |
| Number of clocks per chip select | 1 |
| Row address width | 13 |
| Column Address Width | 10 |
| Bank address width | 3 |
| Enable DM pins | Enabled |

Memory Topology

| Setting | Value |
|---|---|
| Fly-by topology | Enabled |

Memory Initialization Options

| Setting | Value |
|---|---|
| READ Burst Type | Sequential |
| DLL precharge power down | DLL off |
| Memory CAS latency setting | 8 |
| Output drive strength setting | RZQ/7 |
| Memory additive CAS latency | Disabled |
| ODT Rtt nominal value | RZQ/4 |
| Auto selfrefresh method | Manual |
| Selfrefresh temperature | Normal |
| Memory write CAS latency setting | 6 |
| Dynamic ODT (Rtt_WR) value | RZQ/4 |
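
As a sanity check on the address widths above, the sketch below derives the device and interface capacity from them. It assumes each device is 16 bits wide, which is inferred from the MT41J64M16 part number (64M x 16) rather than from a wizard setting, and the function name is hypothetical.

```python
def device_capacity_bits(row_bits: int, col_bits: int,
                         bank_bits: int, device_width: int) -> int:
    """Capacity of one device: addressable locations times data width."""
    locations = 1 << (row_bits + col_bits + bank_bits)
    return locations * device_width

DEVICE_WIDTH = 16      # assumed from the x16 part number
INTERFACE_WIDTH = 64   # total interface width set in the wizard

per_device_bits = device_capacity_bits(13, 10, 3, DEVICE_WIDTH)
devices = INTERFACE_WIDTH // DEVICE_WIDTH
total_bytes = per_device_bits * devices // 8

print(per_device_bits // 2**30, "Gb per device;", devices, "devices;",
      total_bytes // 2**20, "MB total")
```

Under that assumption the result is 1 Gb per device and 512 MB across four devices, which agrees with the bottom-interface figure in the hardware requirements.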


____ 8. Select the Memory Timing tab, but do not make any changes to the values. If you want, compare the values here with those in the vendor data sheet, located in the Vendor_files directory of the class file installation directory. The setup and hold times (tIS, tIH, tDS, tDH) will get derated automatically based on the board parameters of the development kit.

Note that the automatic derating set in the next step will not change the values entered into the Memory Timing tab.

____ 9. Select the Board Settings tab. Under Setup and Hold Derating, switch the derating method to Specify slew rates to calculate setup and hold times. Enter the values for this and the Board Skews using the table below. There is no need to adjust the ISI values.

These are all unique values that have already been determined for the Stratix IV GX development board through board simulation. Don’t worry if you see warnings or errors as you enter these values; they will go away once all values have been entered.

Setup and Hold Derating (slew rates in V/ns)

    CK/CK# slew rate (Differential)                                 4.0
    Address and command slew rate                                   1.5
    DQS/DQS# slew rate (Differential)                               3.0
    DQ slew rate                                                    1.5

Board Skews (values in ns)

    Maximum CK delay to DIMM/device                                 0.618
    Maximum DQS delay to DIMM/device                                0.368
    Minimum delay difference between CK and DQS                     0.25
    Maximum delay difference between CK and DQS                     0.378
    Maximum skew within DQS group                                   0.017
    Maximum skew between DQS groups                                 0.128
    Average delay difference between DQ and DQS                     0.021
    Maximum skew within address and command bus                     0.072
    Average delay difference between address and command and CK     0.015

____ 10. Select the Controller Settings tab. On this tab, change Maximum Avalon-MM burst length to 64.

____ 11. Also on the Controller Settings tab, turn on Enable Configuration and Status Register Interface, making sure that CSR port host interface is set to Internal (JTAG).

The CSR port will be used later when you use the EMIF Toolkit with the interface. Leave all other settings at their defaults.

____ 12. Select the Diagnostics tab. Make sure that the Auto-calibration mode is set to Skip calibration to save time later when the interface gets simulated.

____ 13. Also on the Diagnostics tab, enable Skip Memory Initialization Delays and Enable verbose memory model output.

____ 14. Under Debugging Options, set the Debugging feature set to Option 1 if it isn’t set already.

____ 15. Finally, turn on Enable the Efficiency Monitor and Protocol Checker on the Controller Avalon Interface. Again, these features will be used later with the EMIF Toolkit.

____ 16. Click Finish. When prompted, be sure that Generate Example Design is enabled, and click Generate. Let the instructor know when you have started generating the IP.

Exercise Summary

• Created a top-level project for the memory interface

• Created, parameterized, and started generation of the IP

END OF EXERCISE 1

Exercise 2

Verify the High Performance DDR3 Memory Controller Functionality through Simulation

Objectives:

• Generate and examine the simulation module files created with the simulation scripts

• Simulate the example design

The MegaWizard Plug-In Manager generates three folders: a version of the IP for instantiation and synthesis; a version of the IP for simulation; and an example design folder. The example design folder contains a full example project for synthesis and scripts for generating files for simulation. The example synthesis project and the simulation files generated from the scripts each include a traffic generator block to test the design in simulation as well as on the development board. The simulation version of the project includes a testbench and a generic external memory model for testing. The diagram below represents the simulation system generated by the scripts.

In this exercise, you’ll generate the example simulation files (top-level entity is ddr3_top_example_sim), and perform a scripted simulation using the ModelSim simulator. To save time, you will simulate the system using the generic memory model. This is not as accurate as using a vendor memory model, but it will give a good approximation of the actual behavior of the interface.

Step 1: Generate the simulation files and examine them

As noted in the presentation, the IP generation only creates scripts to generate files for simulation instead of generating an entire project. This simplifies things since a Quartus II project is not necessary for simulating the design in a 3rd-party simulation tool, such as the ModelSim simulator. A simple project is generated, however, to make it easy to run the simulation file generation scripts.

____ 1. From the end of the last exercise, click Exit and choose to add the .qip file to the project if asked.

____ 2. From the File menu, select Open Project. Open the simulation generation project, generate_sim_example_design.qpf, located in

<Mem lab install directory>/ddr3_top_example_design/simulation/

For your own reference, there’s also a README file in this location that explains the simulation file generation we are about to go through.

____ 3. Change the device setting to the correct one. From the Assignments menu, select Device. Select from the Available devices list the same Stratix IV GX selected earlier (EP4SGX230KF40C2).

____ 4. From the Tools menu, select Tcl Scripts.

____ 5. Select the script named generate_sim_verilog_example_design.tcl. Examine the script, if you’d like, in the Preview window. When you’re ready, click Run.

____ 6. The script may take some time to execute, with no indication that it’s running. Click OK in the dialog that appears when it completes.

If you’d like, you can also run the generate_sim_vhdl_example_design.tcl script to generate the VHDL version of the simulation files. However, since the main IP was generated as Verilog, we’ll use the Verilog version for this and the rest of the lab exercises. Of course, if this were your own design and VHDL were your preferred HDL, you would use the VHDL version instead.

____ 7. Open the newly-generated ddr3_top_example_sim.v file in the verilog directory using the Quartus II text editor or WordPad. Do not edit the file.

The testbench instantiates the example design (ddr3_top_example_sim_e0), a status checker, and the generic DDR3 SDRAM memory module and connects the memory interface signals appropriately.

____ 8. Close the top-level file.

____ 9. If you’d like, examine the files listed in the diagram above, but do not edit them. They can be found in

<Mem lab install directory>/ddr3_top_example_design/simulation/verilog/submodules/

Step 2: Simulate the design in the ModelSim-Altera software

The generated simulation files include a script to completely handle the compilation of the simulation files and the setup and running of the simulation itself. You’ll replace this script with one that displays additional waveforms during the simulation.

____ 1. In an Explorer window, copy the run.do file from the Constraint_files folder in the lab install directory to the mentor simulation directory that was generated by the script, replacing the existing run.do:

<Mem lab install directory>/ddr3_top_example_design/simulation/verilog/mentor

____ 2. Start the ModelSim-Altera software from the Windows Start menu (All Programs → Altera 12.0 Build 178 → ModelSim-Altera 10.0d (Quartus II 12.0) Edition) or from a shortcut on the desktop. Click Close if the introduction window opens.

____ 3. From the File menu, select Change Directory. Change the directory to

<Mem lab install directory>/ddr3_top_example_design/simulation/verilog/mentor

____ 4. From the Tools menu, go to the Tcl submenu and select Execute Macro. Select the run.do file you copied and click Open.

The script runs, compiling the testbench and all the files that make up the interface in the submodules folder. When the simulation actually starts, you’ll see the Wave window appear. The window will populate with all of the top-level signals listed in the testbench file along with a “virtual” signal named memcommandwave. This signal, defined in run.do, creates a mnemonic for the control signals (ras_n, cas_n, we_n) to make it easy to see the commands being sent to the memory.
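To make the idea of a command mnemonic concrete, here is a minimal sketch of how the control pins map to DDR3 command names, following the standard DDR3 command truth table (all pins active low). The mnemonic strings and function names are assumptions for illustration; the labels actually defined in run.do may differ.

```python
# Sketch: map DDR3 control pin levels to command mnemonics.
# Keys are (ras_n, cas_n, we_n) with cs_n low. The mnemonic strings are
# illustrative and may not match the labels used by run.do.

DDR3_COMMANDS = {
    (0, 0, 0): "MRS",  # mode register set
    (0, 0, 1): "REF",  # refresh
    (0, 1, 0): "PRE",  # precharge
    (0, 1, 1): "ACT",  # activate (open a row)
    (1, 0, 0): "WR",   # write
    (1, 0, 1): "RD",   # read
    (1, 1, 0): "ZQC",  # ZQ calibration
    (1, 1, 1): "NOP",  # no operation
}


def decode_command(cs_n, ras_n, cas_n, we_n):
    """Return the command mnemonic for one clock edge."""
    if cs_n:
        return "DES"  # device deselected, other pins are don't-care
    return DDR3_COMMANDS[(ras_n, cas_n, we_n)]


def mode_register(ba):
    """For an MRS command, the two low bank address bits select MR0..MR3."""
    return "MR{}".format(ba & 0b11)


print(decode_command(0, 0, 0, 0), mode_register(2))
print(decode_command(0, 1, 0, 0))
print(decode_command(0, 1, 0, 1))
```

Reading the Wave window this way, a burst of MRS commands marks initialization, while alternating groups of ACT, WR, and RD mark the traffic generator tests.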

____ 5. Observe the simulation status in the Transcript and Wave windows.

The Transcript window displays the write and read tests performed by the interface and evaluates whether the results are correct.

The Wave window displays a graphical representation of the top-level signal levels. During the simulation, select the Wave window to highlight it and use the Wave zoom buttons to update the window and observe signal activity. Right-click a signal to change its radix (hex is useful for the address and dq buses). To hide the expanded signal names that include the name of the testbench, click the tiny button at the bottom of the signal list with the tooltip Toggle leaf names <-> full names.

The simulation will run for approximately 170 μs. When the functional simulation is successful, the testbench should output a SIMULATION PASSED message in the Transcript window. This will take several minutes.
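If you later run this simulation unattended, the pass message can be checked from a saved copy of the Transcript output rather than by eye. This is a minimal sketch, assuming the message text is exactly SIMULATION PASSED as described above and that the transcript has been saved to a text file; the function names and the sample text are illustrative.

```python
# Sketch: check saved simulator transcript text for the pass message.
# The sample text below is made up; only the "SIMULATION PASSED" string
# comes from the lab description.
from pathlib import Path

PASS_MESSAGE = "SIMULATION PASSED"


def simulation_passed(transcript_text):
    """True if the pass message appears anywhere in the transcript text."""
    return PASS_MESSAGE in transcript_text


def check_transcript_file(path):
    """Read a saved transcript file and report whether the test passed."""
    return simulation_passed(Path(path).read_text(errors="replace"))


sample = "# running traffic generator tests\n# SIMULATION PASSED\n"
print(simulation_passed(sample))
```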

____ 6. When the simulation finishes, a dialog box will appear asking if you are finished simulating. Click No so you will be able to explore the Wave window.

____ 7. Explore the functionality of the controller in greater detail by looking at the transactions on the waveforms in the wave window.

The entire simulation waveform should resemble the following:

The early activity on the DDR3 interface signals is the controller writing to the mode registers (cs_n, ras_n, cas_n, and we_n all low; ba selects which of the 4 mode registers is written). The quiet stretch that follows is where calibration would normally run; it is empty because calibration was set to be skipped when the IP was generated. After this, observe e0_emif_status_local_init_done going high. Once this happens, the traffic generator starts to test the DDR3 interface by generating write and read transactions. Once testing is finished, the e0_drv_status_test_complete signal goes high and the simulation finishes.

Diving into the waveforms, you can observe the read and write transactions on the DDR3 interface. Try to correlate the read-back data with the data written to memory. This is easiest to do with the tests at the very end of the simulation, which can be matched to information at the bottom of the Transcript window, but it can still be a little tricky due to the data masking (dm). Write and read tests occur in successive groups of issued commands, so the first dq data written with the WR command on memcommandwave should correspond to the first dq data read when memcommandwave is RD. The dm masking can be seen in the Wave window or listed with the data in the Transcript. Ask the instructor if you need assistance.
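The effect of dm on the comparison can be sketched as follows. In DDR3, a byte whose dm bit is high during a write is masked, so the memory keeps its previous contents for that byte. The byte values and the function name below are made up for illustration and are not taken from the simulation.

```python
# Sketch: predict what a read returns after a write with data masking.
# A dm bit of 1 masks the byte (old contents kept); 0 lets the write through.
# All data values here are hypothetical.

def apply_masked_write(old_bytes, new_bytes, dm_bits):
    """Return the memory contents after a masked write, byte by byte."""
    return bytes(
        old if masked else new
        for old, new, masked in zip(old_bytes, new_bytes, dm_bits)
    )


previous = bytes([0x00, 0x00, 0x00, 0x00])
written = bytes([0xDE, 0xAD, 0xBE, 0xEF])
dm = [0, 1, 0, 0]  # second byte is masked

expected_read = apply_masked_write(previous, written, dm)
print(expected_read.hex())  # de00beef
```

This is why a read may not match the dq value seen during the WR command exactly: masked bytes return whatever an earlier write left in memory.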

____ 8. Close the simulator after exploring the signals in the Wave window.

Exercise Summary

• Generated the files for performing a functional simulation of the example design

• Ran the simulation and examined the results

END OF EXERCISE 2

Exercise 3

Complete the DDR3 Memory Controller

Step 1: Modify the top-level of the design

In this step, you will use the synthesizable version of the design and set it up for running on the development board. As part of this, you’ll change the polarity of the test output signals so they can drive LEDs on the development board.

____ 1. From the Quartus II File menu, select Open Project.

____ 2. Open ddr3_top_example.qpf found in:

<Mem lab install directory>/ddr3_top_example_design/example_project

____ 3. Change the target device for the project. From the Assignments menu, select Device. Again select the same Stratix IV GX device (EP4SGX230KF40C2) from the Available devices list that you selected for the original top-level project. Don’t click OK yet.

____ 4. Change the reserved setting for unused pins on the device so that the device will draw less power. Click Device and Pin Options, and select the Unused Pins category.

____ 5. Set Reserve all unused pins to As input tri-stated. Click OK twice.

____ 6. Open the top level of the design by double-clicking ddr3_top_example in the Project Navigator.

____ 7. Change the example project status signals to active low so that they can drive LEDs on the development board. In the module definition at the top of the ddr3_top_example.v file, look for the output signals drv_status_test_complete, drv_status_pass, and drv_status_fail. For each of these three signals, append _n to the signal name.

____ 8. Add Verilog wire and assign statements to invert the three signals. A little further down in the file, you should see a number of wire type declarations. Underneath them, add the following new wire declarations and assign statements:

    wire drv_status_test_complete;
    wire drv_status_pass;
    wire drv_status_fail;

    assign drv_status_test_complete_n = ~drv_status_test_complete;
    assign drv_status_pass_n = ~drv_status_pass;
    assign drv_status_fail_n = ~drv_status_fail;

____ 9. Save and close the ddr3_top_example.v file.

Step 2: Add constraints and assign I/O locations

Signals entering or exiting the FPGA device must be assigned physical pin locations on the device. The signals that require these location assignments are listed in the tables below. In this step, you will source a Tcl script created by the MegaWizard Plug-In Manager to set up a number of I/O assignments and then use another script to create the required I/O location assignments. A synthesized netlist is required in order to run these scripts.

Top-level design inputs and outputs

    Inputs            Outputs
    pll_ref_clk       drv_status_pass_n
    global_reset_n    drv_status_fail_n
    soft_reset_n      drv_status_test_complete_n

DDR3 interface signals

    mem_a       mem_odt
    mem_ba      mem_ras_n
    mem_ck      mem_cas_n
    mem_ck_n    mem_we_n
    mem_cke     mem_dq
    mem_cs_n    mem_dqs
    mem_dm      mem_dqs_n
    oct_rdn     oct_rup

____ 1. Synthesize the project. From the Processing menu, go to Start, and select Start Analysis & Synthesis. You can also click the Start Analysis & Synthesis toolbar button. Ignore all warnings.

____ 2. While the design is synthesizing, copy the ddr3_top_globals_pin_locations_bot.tcl file from the <Mem lab install directory>/Constraint_files directory to the submodules directory: <Mem lab install directory>/ddr3_top_example_design/example_project/ddr3_top_example/submodules

____ 3. When synthesis is complete, add constraints generated by the MegaWizard Plug-In Manager to the project. From the Tools menu, select Tcl Scripts.

____ 4. Select the ddr3_top_example_if0_p0_pin_assignments.tcl script. Click Run. Click OK when the script completes.

This Tcl script sources 2 other Tcl scripts located in the submodules directory, creating I/O assignments for the DDR3 pins. If you like, open the Tcl scripts in the submodules folder and examine them.

____ 5. From the Assignments menu, open the Assignment Editor.

The Assignment Editor is basically a spreadsheet of the assignments for the project that are stored in the .qsf file. In the Assignment Editor, you can see all the assignments that were added by the script. Notice the Input Termination and Output Termination assignments. You may also notice a value of Flexible_timing for assignments named Memory Interface Delay Chain Configuration. This indicates the use of the newer, flexible timing model on the I/O delay chains used for the interface, as opposed to the older macro timing model.
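Because the .qsf file is plain text, the same assignments can also be listed outside the Assignment Editor. The following is a minimal sketch that pulls the instance assignments out of .qsf text. The sample lines are made up to illustrate the file syntax and are not copied from the generated project; the function name is also illustrative.

```python
# Sketch: list instance assignments (name, value, target) from .qsf text.
# SAMPLE_QSF is a hypothetical excerpt, not the contents of the lab project.
import shlex

SAMPLE_QSF = """\
set_global_assignment -name DEVICE EP4SGX230KF40C2
# made-up termination assignments for one DQ pin
set_instance_assignment -name INPUT_TERMINATION "PARALLEL 50 OHM WITH CALIBRATION" -to mem_dq[0]
set_instance_assignment -name OUTPUT_TERMINATION "SERIES 50 OHM WITH CALIBRATION" -to mem_dq[0]
"""


def instance_assignments(qsf_text):
    """Return a list of (name, value, target) tuples, one per instance assignment."""
    rows = []
    for line in qsf_text.splitlines():
        tokens = shlex.split(line, comments=True)  # drops '#' comments, keeps quoted values whole
        if not tokens or tokens[0] != "set_instance_assignment":
            continue
        name = value = target = None
        i = 1
        while i < len(tokens):
            if tokens[i] == "-name" and i + 2 < len(tokens):
                name, value = tokens[i + 1], tokens[i + 2]
                i += 3
            elif tokens[i] == "-to" and i + 1 < len(tokens):
                target = tokens[i + 1]
                i += 2
            else:
                i += 1
        rows.append((name, value, target))
    return rows


for name, value, target in instance_assignments(SAMPLE_QSF):
    print(target, name, value, sep=" | ")
```

Filtering the result by assignment name is a quick way to confirm that every DQ, DQS, and DM pin received a termination assignment.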

You’ll also see a number of Global Signal assignments. These assignments set the PLL clocks (signals named auto_generated|clk) to use the global routing resources and force a number of control signals to not use the global routing resources. Remember that putting control signals on the global resources could cause recovery and removal timing failures.

At this point in the design flow, you have to assign all of the pin-outs to the design as per your board requirements. Normally, you would use the Pin Planner or source a pre-defined pin-out script for this. Today, to save time, you will use a premade script named ddr3_top_globals_pin_locations_bot.tcl prepared specifically for this lab.

____ 6. Select Tcl Scripts from the Tools menu again, and Run the ddr3_top_globals_pin_locations_bot.tcl script that you copied to the submodules folder earlier. Click OK when complete.

____ 7. Open the Quartus II Pin Planner by selecting Pin Planner from the Assignments menu.

You can see that the interface has been placed along the bottom edge of the chip.

The Pin Planner should appear with DQ/DQS pin groups highlighted as shown above. If the pin groups are not highlighted, from the View menu, select Show, then Show DQ/DQS Pins and finally In x8/x9 Mode.

Feel free to examine the I/O assignments for all the top-level signals. Right-click in the All Pins list at the bottom of the window and select Customize columns to add columns for other assignments for each I/O pin, such as Output Termination and Input Termination.

____ 8. When you are finished, close the Pin Planner.

____ 9. From the File menu in the Quartus II software, select Save Project.

Step 3: Add a SignalTap™ II instance to the design

The SignalTap II embedded logic analyzer allows you to tap and monitor any internal node(s) in the design. Normally, you might add a SignalTap II instance to your design after you discover that something has gone wrong, and you need to debug it. In this section of the lab, you will add it to your design now for use later in the day to tap a number of useful signals. This approach will save us some time by avoiding an extra compile.

Recall that the traffic generator provides the pass, fail, and test_complete signals (named drv_status_pass, drv_status_fail, and drv_status_test_complete by default) to indicate whether the memory interface is operating correctly or not. You inverted these signals in the design to drive LEDs on the Stratix IV GX development board. The new signals, drv_status_pass_n, drv_status_fail_n, and drv_status_test_complete_n are tied to user LEDs D23, D22, and D21 (labeled 0, 1, and 2 on the board) respectively. D21 should turn on at the end of the test. D23 turns on to indicate a passing test while D22 turns on to indicate a test failure. You will add SignalTap II nodes to allow you to probe these signals along with a number of others.

The SignalTap II file has already been created for you.

____ 1. Examine the SignalTap II file. From the File menu, select Open (not Open project). Change the Files of type to SignalTap II Logic Analyzer Files (*.stp), and open the ddr3_top.stp file from <Mem lab install directory>.

Notice the signals that will be tapped by the logic analyzer. Besides the drv_status_test_complete_n, drv_status_pass_n, and drv_status_fail_n status signals, the other signals are all Avalon bus signals, indicated by the avl_ prefix. As mentioned earlier, you should only tap the local Avalon interface between the controller and the traffic generator (or your user logic) because you want to avoid adding stubbed routing paths on the timing-critical signals of the external interface.

The logic analyzer Setup tab should look like this:

____ 2. From the Assignments menu, select Settings. Go to the SignalTap II Logic Analyzer category.

____ 3. Turn on Enable SignalTap II Logic Analyzer. Click the browse button next to SignalTap II File name and open the ddr3_top.stp file. Don’t click OK yet.

Step 4: Make final project settings and compile the design

Before starting the compilation, you’ll make some final project settings that will help optimize the design in order to meet timing.

____ 1. Still in the Settings dialog box, go to the Fitter Settings category and set the Fitter effort to Standard fit.

____ 2. Make sure that Optimize hold timing is enabled and set to All Paths.

____ 3. Make sure that Optimize multi-corner timing is turned on.

____ 4. In the Compilation Process Settings category, make sure Use smart compilation and Run Assembler during compilation are turned on.

____ 5. In the Analysis & Synthesis Settings category, set the Optimization Technique to Speed.

____ 6. Finally, in the Physical Synthesis Optimizations category, turn on all 4 options under Optimize for performance (combinational logic, register retiming, asynchronous signal pipelining, and register duplication), and set the Effort level to Extra. Click OK to close the Settings dialog box.

While these options will certainly increase compile time, they improve the chances that the design will meet timing.

You are now ready to compile your design.

____ 7. Start compilation by either selecting Start Compilation from the Processing menu or clicking Start Compilation in the toolbar.

The compilation process will take some time. Please inform the instructor once you have started compiling.

Exercise Summary

• Made I/O related assignments, including termination settings and locations

• Added the SignalTap II embedded logic analyzer to the design to capture internal signal data during runtime

END OF EXERCISE 3

Exercise 4

Verify the High Performance DDR3 Memory Controller through Timing Analysis and In-System Testing on the Board

Step 1: Verify timing using the TimeQuest timing analyzer

____ 1. Once compilation has completed, open the TimeQuest Timing Analyzer by selecting TimeQuest Timing Analyzer from the Tools menu or by clicking the TimeQuest Timing Analyzer toolbar button.

____ 2. Double-click Report DDR in the Device Specific Reports folder in the Tasks pane. Once this is performed, a green checkmark should appear.

This will perform the necessary steps in order to obtain timing reports. It will create a post-fit, slow corner timing netlist, read in the SDC files that the MegaWizard Plug-In Manager automatically added to the project, and update the timing netlist based on the timing constraints. TimeQuest will then generate the DDR timing report.

____ 3. When the script is run, you will see timing information in the Console window at the bottom of the window. Your design should meet all DDR timing (but see note below). If you had failing paths, however, you would need to investigate and fix them.

If you see some timing failures in the Core, they can be safely ignored for our purposes. The CSR interface and the Efficiency Monitor add some additional delay between the example driver and the controller. For a final design, you could remove these debug features (which you’ll experiment with soon) in order to meet timing.

____ 4. Locate the worst address/command path in the Chip Planner. Highlight the first path shown in the if0 Address Command (setup) Summary of Paths tab.

____ 5. Right-click the path and select Locate Path.

____ 6. Select Locate in Chip Planner.

____ 7. Observe the path in the Chip Planner when it opens. Once the Chip Planner opens, from the View menu, select Show Delays. You may have to zoom in to see the delay label.

____ 8. Click the + on the delay label to expand the path into its actual routing segments through the device.

Feel free to cross-probe from other paths in TimeQuest to the Chip Planner to see how they were routed. Each path you locate from TimeQuest is stored in the Locate History window at the bottom of the Chip Planner for easy review later. You may also wish to look at the timing waveforms on the Waveform tab in TimeQuest to graphically explore timing margins, etc.

____ 9. Close the Chip Planner and TimeQuest when you are finished.

Step 2: Verify operation of the DDR3 interface

After verifying the timing requirements, you can now download the design to the FPGA and verify that the DDR3 interface works properly. You’ll use the SignalTap II embedded logic analyzer to do that in this step.

____ 1. Plug in the Stratix IV GX development board and turn it on. The fan should start running and a number of LEDs should light.

____ 2. Set the rotary dial switch (SW2) to position 1.

____ 3. Plug the USB A-B cable into the board and connect it to your computer.

You should not have to install the drivers for the built-in USB-Blaster hardware if you are working on an Altera training computer. If the New Hardware Wizard appears on a training computer or you are using your own machine, please ask the instructor for assistance in getting the driver set up.

____ 4. Open the SignalTap II file added to the project in Exercise 3 if it’s not already open. You will program the device from here.

____ 5. Select USB-Blaster [USB-0] from the Hardware menu if it is not already selected. If the USB-Blaster connection does not appear in the list, click Setup and select it from the Currently selected hardware list. Click Close.

If you don’t see the USB-Blaster connection or cannot connect, please ask the instructor for assistance.

____ 6. Click browse to point to the ddr3_top_example.sof file that was generated by the Assembler during compilation.

The JTAG Chain Configuration section should look like the screenshot below. Make sure that the EP4SGX230 is selected as the target Device.

____ 7. Click the Program Device button to program the device.

It will take about 20 to 30 seconds to program the device, during which time a status bar will appear to fill and refill a number of times. This is normal.

The example driver provides the pass, fail, and test_complete signals to determine whether your memory interface is operating correctly. The pass signal of the example driver is driven to logic high as long as the data written to memory matches what is read from the same location. Remember you connected these signals through inverters to LEDs on the development board. If D22 and D21 are turned on, that indicates proper functioning of the interface (the test passed and the test completed, respectively). If D23 turns on, the test has failed, indicating a mismatch between what was written and what was read back.

Other signals (like local_cal_fail) were not inverted earlier, so you should also see D20 lit, indicating that calibration did not fail.

____ 8. Press the design’s global reset button (PB0) to restart the test and observe the LEDs.

[Figure: development board showing the Test fail, Test pass, and Test complete LEDs and the Global reset button]
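The LED behavior described in this step can be sketched in a few lines of Python. This is only an illustration, not part of the lab design: it assumes the board LEDs light when their pin is driven low (which is what the inverter discussion above implies), and the `DriverStatus` and `led_states` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DriverStatus:
    """Logic levels of the status signals (True = logic high)."""
    passed: bool          # example driver "pass" output
    failed: bool          # example driver "fail" output
    test_complete: bool   # example driver "test_complete" output
    local_cal_fail: bool  # calibration failure flag from the interface


def led_states(status):
    """Return which LEDs are lit, assuming an LED lights when its pin is low."""
    pin_levels = {
        "D23": not status.failed,         # fail, routed through an inverter
        "D22": not status.passed,         # pass, routed through an inverter
        "D21": not status.test_complete,  # test_complete, routed through an inverter
        "D20": status.local_cal_fail,     # not inverted
    }
    return {led: not level for led, level in pin_levels.items()}


# A healthy interface: test passed and completed, calibration did not fail.
healthy = DriverStatus(passed=True, failed=False,
                       test_complete=True, local_cal_fail=False)
print(led_states(healthy))
```

For the healthy case, D22, D21, and D20 are reported as lit and D23 as off, which matches what you should see on the board.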

Step 3: Verify the design with the SignalTap II Embedded Logic Analyzer

With the SignalTap II logic analyzer, you can see the actual data transfers on the Avalon (local) bus, similar to what you saw earlier in the simulation on the external interface.

____ 1. Click Run Analysis in the SignalTap II Instance Manager to start the logic analyzer.

The logic analyzer starts looking for the trigger: the falling edge of drv_status_test_complete_n. You should observe the Status of the logic analyzer instance as Waiting for trigger.

The driver starts running as soon as the device is programmed. To actually catch the driver test, you’ll need to reset the design.

____ 2. Press the global_reset_n button (PB0) on the development board.

This should trigger the logic analyzer when the drv_status_test_complete_n signal goes low.

If you dig deeper into the waveforms (left-click to zoom in, right-click to zoom out), you can look at the read and write transactions on the Avalon interface and correlate the writes with the read back data. In the screenshot above (from a 16-bit version of the interface), the cursor (on the right) is placed at a location where avl_rdata_valid is high, indicating valid incoming data. If you look at avl_rdata at that point, highlighted in the Value column on the left, you can see it starts with 3C12… If you look at the indicated avl_wdata, you can see the write that generated this read. It may sometimes be difficult to find matching data because of the data reordering performed by the controller, but you should be able to find 16-bit or 32-bit patterns that match.

The pass signal always stays high, indicating that the test has passed.
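If you export the captured samples, the manual search for matching write and read data could be automated along the following lines. This is a minimal sketch under stated assumptions: the sample format (one dictionary per captured clock cycle), the helper names, and the data values are hypothetical, and the sketch ignores Avalon handshaking details such as the ready signal. Only the signal names come from the design.

```python
def chunks16(word, width_bits):
    """Split a data word into 16-bit patterns, least significant first."""
    return [(word >> shift) & 0xFFFF for shift in range(0, width_bits, 16)]


def match_reads_to_writes(samples, width_bits=32):
    """For each valid read sample, list the 16-bit patterns already seen in writes.

    samples: one dict per captured cycle with the keys
    avl_write_req, avl_wdata, avl_rdata_valid, avl_rdata.
    """
    written = set()
    matches = []
    for index, sample in enumerate(samples):
        if sample["avl_write_req"]:
            written.update(chunks16(sample["avl_wdata"], width_bits))
        if sample["avl_rdata_valid"]:
            found = [c for c in chunks16(sample["avl_rdata"], width_bits)
                     if c in written]
            matches.append((index, found))
    return matches


# Hypothetical capture: one write, then a read whose 16-bit halves are swapped.
capture = [
    {"avl_write_req": 1, "avl_wdata": 0x3C12A5F0,
     "avl_rdata_valid": 0, "avl_rdata": 0},
    {"avl_write_req": 0, "avl_wdata": 0,
     "avl_rdata_valid": 1, "avl_rdata": 0xA5F03C12},
]
for index, found in match_reads_to_writes(capture):
    print(index, [hex(c) for c in found])
```

Matching on 16-bit patterns rather than whole words is a deliberate choice here, since the controller may reorder data within a transfer.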

____ 3. Try triggering on other signals. Switch back to the Setup tab, and try triggering on the rising edge of avl_write_req, avl_read_req, and avl_rdata_valid, switching the other signals back to don’t care each time and setting the Trigger position to the Pre trigger position.

Triggering on the rising edge of each of these signals will let you see the behavior of the Avalon interface right after calibration.
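To make the trigger settings used in this step concrete, the sketch below models an edge trigger and a capture window in plain Python. It is only a conceptual model, not how the SignalTap II logic analyzer is implemented. The function names are hypothetical, and the default of 12% of the buffer before the trigger for the Pre trigger position is an assumption you should check against the SignalTap II documentation.

```python
def find_edge(samples, edge="falling"):
    """Return the index of the first sample that completes the requested edge."""
    for i in range(1, len(samples)):
        previous, current = samples[i - 1], samples[i]
        if edge == "falling" and previous == 1 and current == 0:
            return i
        if edge == "rising" and previous == 0 and current == 1:
            return i
    return None  # no trigger: the analyzer would keep waiting


def capture_window(samples, trigger_index, depth, pre_fraction=0.12):
    """Return up to `depth` samples, keeping a fraction of them before the trigger."""
    before = int(depth * pre_fraction)
    start = max(0, trigger_index - before)
    return samples[start:start + depth]


# Hypothetical trace of an active-low "test complete" signal.
trace = [1, 1, 1, 0, 0, 1]
print(find_edge(trace, "falling"))
print(find_edge(trace, "rising"))
```

With most of the buffer placed after the trigger, a rising-edge trigger on a request signal shows mainly the transfers that follow it.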

Step 4: Generate calibration and margin reports with the EMIF Debug Toolkit (optional; time permitting; discussed in the next presentation section)

The EMIF Toolkit lets you analyze how your memory interface was calibrated by the UniPHY sequencer and how much margin you have in your design. For example, if memory calibration failed, you could use the EMIF Toolkit to pinpoint which signal failed and why. The cause could be unmatched board routing or an incorrectly set delay chain. The toolkit can help you figure it out.

The EMIF Toolkit makes use of the CSR Avalon interface you enabled to communicate during runtime with the Qsys system-based UniPHY sequencer. Operation of the EMIF Toolkit is very similar to the TimeQuest timing analyzer, making use of tasks to connect the toolkit to the memory interface sequencer and to generate reports. To use the toolkit, you have to establish a link with the CSR interface of the sequencer.

____ 1. From the Quartus II Tools menu, select External Memory Interface Toolkit.

____ 2. From the Tasks pane in the toolkit, double-click Initialize Connections.

This task looks for all available connections through the JTAG interface and generates a Discovered Connections report. Each connection found is made up of a number of nodes, so you’ll see multiple rows in this report even though there is actually only a single connection. This report is useful for figuring out which memory interface you want to link the toolkit to if you have more than one JTAG connection or more than one memory interface on one or more devices on your board. Since this design uses only one interface with a single JTAG interface, you could have skipped this step.

____ 3. Double-click the Link Project to Device task. Click OK.

This establishes a link between the hardware connection and the Quartus II project through a JTAG debugging information (.jdi) file, which was generated by the Assembler during compilation. The file provides information about the debugging interfaces that were compiled into the design and programmed into the FPGA device. For the memory toolkit, this includes the CSR interface and the efficiency monitor.

____ 4. Create a connection to the memory interface sequencer. Double-click the Create Memory Interface Connection task.

____ 5. Look over the information displayed. Optionally, enter a Connection name. Click OK.

Once you establish this connection, a number of additional tasks become available. You can also look through some newly generated reports that summarize the interface and indicate whether any DQS groups or memory ranks were masked during calibration of the interface.

____ 6. Double-click the Rerun Calibration task in the Commands folder.

Since the toolkit was not active during the initial calibration after programming the device, you must rerun calibration to get information about it from the sequencer. This also generates the Calibration Report folder.

____ 7. Examine the generated reports in the Calibration Report folder.

The calibration reports indicate the margins that were observed during calibration and the final delay chain settings and phase adjustments that were selected to best center the DQS in the DQ bits.

____ 8. Double-click the Generate Margining Report task in the Commands folder, and examine the generated reports in the Margin Report folder.

The margining reports indicate how much margin there is on each DQ signal before a read or write would fail. The Read Data Valid Windows and Write Data Valid Windows reports graphically illustrate the DQ Pin Post Calibrations Margins report, displaying DQS for each DQ bus group as a black line within the data valid window (DVW).
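One way to read the margin numbers is to compute, for each DQ pin, the width of the data valid window and how far DQS sits from its center. The sketch below assumes each pin has a margin before DQS and a margin after DQS expressed in the same unit. The function name, the unit (picoseconds), and the values are hypothetical and are not taken from the toolkit reports.

```python
def window_summary(margins):
    """Summarize per-pin margins.

    margins: dict mapping a pin name to (margin_before_dqs, margin_after_dqs).
    """
    summary = {}
    for pin, (before, after) in margins.items():
        summary[pin] = {
            "window": before + after,        # total data valid window width
            "offset": (before - after) / 2,  # DQS position relative to window center
            "worst": min(before, after),     # smaller of the two margins
        }
    return summary


# Hypothetical margins in picoseconds for two pins.
example = {"dq0": (200, 180), "dq1": (150, 210)}
for pin, values in window_summary(example).items():
    print(pin, values)
```

An offset near zero means DQS is well centered. A positive offset means there is more margin before DQS than after it.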

____ 9. Make a connection to the efficiency monitor. Double-click the Create Efficiency Monitor Connection task.

____ 10. Again, look over the connection information and optionally enter a Connection name. Click OK.

____ 11. Examine the reports generated in the Avalon-MM Efficiency Monitor folder.

In general, efficiency is calculated as the number of active cycles of data transfer divided by the total number of operating cycles. The efficiency numbers you see in the Efficiency Monitor Statistics report are somewhat low because not much data was transferred by the traffic generator for its test. If you were using your own logic to continuously read and write data to the interface, you would get a better picture of the efficiency of the controller.
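As a worked example of the efficiency definition above, the following sketch divides active data-transfer cycles by total operating cycles. The function name and the cycle counts are hypothetical. They are not values from the Efficiency Monitor Statistics report.

```python
def efficiency(active_cycles, total_cycles):
    """Return efficiency as active data-transfer cycles over total cycles."""
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    if not 0 <= active_cycles <= total_cycles:
        raise ValueError("active_cycles must be between 0 and total_cycles")
    return active_cycles / total_cycles


# Hypothetical short test: data moves in 250 of 1000 cycles.
print(f"{efficiency(250, 1000):.0%}")

# Hypothetical sustained traffic: data moves in 900 of 1000 cycles.
print(f"{efficiency(900, 1000):.0%}")
```

The two cases show why a short traffic generator test reports a lower number than continuous user traffic would: the same total window contains fewer cycles in which data is actually transferred.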

The Protocol Checker Summary Statistics report indicates if there were any violations of the Avalon bus protocol between the user logic (the traffic generator in this example design) and the interface.

____ 12. When you are done looking through the reports, close the EMIF Toolkit and quit the Quartus II software.

Exercise Summary

• Performed a timing analysis on the interface

• Verified the functional operation of the memory interface using the board LEDs and the SignalTap II logic analyzer

• Used the EMIF Toolkit to generate reports relating to calibration and margin on the interface, as well as using it to evaluate controller efficiency

END OF EXERCISE 4