Multi Channel DMA IP for PCI Express User Guide
Updated for Intel® Quartus® Prime Design Suite: 21.1
UG-20297 | 2021.05.28
Latest document on the web: PDF | HTML
Contents
1. Terms and Acronyms...................................................................................................... 4
2. Introduction............................................................... 6
   2.1. Multi Channel DMA IP for PCI Express Features......................... 7
   2.2. Device Family Support................................................. 8
   2.3. Recommended Speed Grades.............................................. 9
   2.4. Performance and Resource Utilization.................................. 9
   2.5. Release Information...................................................10
3. Functional Description.....................................................11
   3.1. Multi Channel DMA.....................................................11
      3.1.1. H2D Data Mover...................................................11
      3.1.2. D2H Data Mover...................................................12
      3.1.3. Descriptors......................................................12
      3.1.4. Avalon-MM PIO Master.............................................17
      3.1.5. Avalon-MM Write (H2D) and Read (D2H) Master......................17
      3.1.6. Avalon-ST Source (H2D) and Sink (D2H)............................18
      3.1.7. User MSI-X.......................................................20
      3.1.8. User Functional Level Reset (FLR)................................20
      3.1.9. Control Registers................................................20
   3.2. Bursting Avalon-MM Master (BAM).......................................21
   3.3. Bursting Avalon-MM Slave (BAS)........................................22
   3.4. Config Slave (Non-Bursting AVMM Slave)................................22
   3.5. Hard IP Reconfiguration Interface.....................................23
   3.6. Config TL Interface (P-Tile Only).....................................23
   3.7. Configuration Intercept Interface (EP Only)...........................23
4. Interface Overview.........................................................24
   4.1. Port List.............................................................25
   4.2. Clocks................................................................27
   4.3. Resets................................................................27
   4.4. Multi Channel DMA.....................................................28
      4.4.1. Avalon-MM PIO Master.............................................28
      4.4.2. Avalon-MM Write Master (H2D).....................................29
      4.4.3. Avalon-MM Read Master (D2H)......................................29
      4.4.4. Avalon-ST Source (H2D)...........................................30
      4.4.5. Avalon-ST Sink (D2H).............................................30
      4.4.6. User MSI-X Interface.............................................31
      4.4.7. User FLR Interface...............................................32
   4.5. Bursting Avalon-MM Master (BAM) Interface.............................32
   4.6. Bursting Avalon-MM Slave (BAS) Interface..............................33
   4.7. Config Slave Interface (RP only)......................................33
   4.8. Hard IP Reconfiguration Interface.....................................34
   4.9. Config TL Interface...................................................35
   4.10. Configuration Intercept Interface (EP Only)..........................35
5. Parameters (H-Tile)........................................................36
   5.1. IP Settings...........................................................36
      5.1.1. System Settings..................................................36
      5.1.2. MCDMA Settings...................................................37
      5.1.3. Device Identification Registers..................................38
      5.1.4. Multifunction and SR-IOV System Settings Parameters..............39
      5.1.5. Configuration, Debug and Extension Options.......................39
      5.1.6. PHY Characteristics..............................................40
      5.1.7. PCI Express / PCI Capabilities Parameters........................40
   5.2. Example Designs.......................................................42
6. Parameters (P-Tile)........................................................44
   6.1. IP Settings...........................................................44
      6.1.1. Top-Level Settings...............................................44
   6.2. PCIe0 Settings........................................................45
      6.2.1. PCIe0 Multifunction and SR-IOV System Settings...................46
      6.2.2. PCIe0 Configuration, Debug and Extension Options.................46
      6.2.3. PCIe0 Device Identification Registers............................47
      6.2.4. MCDMA Settings...................................................48
      6.2.5. PCIe0 TPH/ATS for Physical Functions.............................49
      6.2.6. PCIe0 PCI Express / PCI Capabilities Parameters..................49
   6.3. Example Designs.......................................................53
7. Designing with the IP Core.................................................54
   7.1. Generating the IP Core................................................54
   7.2. Simulating the IP Core................................................55
   7.3. IP Core Generation Output - Intel Quartus Prime Pro Edition...........57
   7.4. Systems Integration and Implementation................................60
      7.4.1. Required Supporting IP...........................................60
8. Software Programming Model.................................................61
   8.1. Multi Channel DMA Custom Driver.......................................61
      8.1.1. Architecture.....................................................61
      8.1.2. libmqdma library details.........................................64
      8.1.3. Application......................................................65
      8.1.4. Software Flow....................................................67
      8.1.5. API Flow.........................................................68
      8.1.6. libmqdma Library API List........................................70
      8.1.7. Request Structures...............................................76
   8.2. Multi Channel DMA IP DPDK Poll-Mode based Driver......................76
   8.3. Multi Channel DMA IP Kernel Mode Driver...............................76
9. Registers..................................................................77
   9.1. Queue Control (QCSR)..................................................78
   9.2. MSI-X Memory Space....................................................82
   9.3. Control Register (GCSR)...............................................83
10. Revision History..........................................................................................................85
1. Terms and Acronyms

Table 1. Acronyms
Term Definition
PCIe* Peripheral Component Interconnect Express (PCI Express*)
DMA Direct Memory Access
MCDMA Multi Channel Direct Memory Access
PIO Programmed Input/Output
H2D Host-to-Device
D2H Device-to-Host
H2DDM Host-to-Device Data Mover
D2HDM Device-to-Host Data Mover
QCSR Queue Control and Status register
GCSR General Control and Status Register
IP Intellectual Property
HIP Hard IP
PD Packet Descriptor
QID Queue Identification
TIDX Queue Tail Index (pointer)
HIDX Queue Head Index (pointer)
TLP Transaction Layer Packet
IMMWR Immediate Write Operation
MRRS Maximum Read Request Size
CvP Configuration via Protocol
PBA Pending Bit Array
API Application Programming Interface
Avalon®-MM (or AVMM) Avalon Memory-Mapped Interface
Avalon-ST (or AVST) Avalon Streaming Interface
SOF Start of a File (or packet) for streaming
EOF End of a File (or packet) for streaming
File (or Packet) A group of descriptors defined by the SOF and EOF bits of the descriptor for streaming. At the Avalon-ST user interface, a file (or packet) is marked by means of sof/eof.
Intel Corporation. All rights reserved. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Intel warrants performance of its FPGA and semiconductor products to current specifications in accordance with Intel's standard warranty, but reserves the right to make changes to any products and services at any time without notice. Intel assumes no responsibility or liability arising out of the application or use of any information, product, or service described herein except as expressly agreed to in writing by Intel. Intel customers are advised to obtain the latest version of device specifications before relying on any published information and before placing orders for products or services.
*Other names and brands may be claimed as the property of others.
ISO 9001:2015 Registered
Term Definition
BAM Bursting Avalon-MM Master
BAS Bursting Avalon-MM Slave
MSI Message Signaled Interrupt
MSI-X Message Signaled Interrupt - Extended
FLR Functional Level Reset
FAE Field Applications Engineer
2. Introduction

Figure 1. Multi Channel DMA IP for PCI Express Usage in Server Hardware Infrastructure

(Figure: VMs on the host, coordinated by a Virtual Machine Manager, reach the Intel FPGA through the Root Complex and PCIe Hard IP. Within the MCDMA block, each channel (Ch. 0 through Ch. n) has an H2D queue, a D2H queue, and H2D/D2H QCSRs, and connects to the user logic through an AVMM/AVST port.)
The Multi Channel DMA IP for PCI Express enables you to efficiently transfer data between the host and device. The IP supports multiple DMA channels between the host and device over the underlying PCIe link. A DMA channel consists of an H2D (host-to-device) and a D2H (device-to-host) queue pair.

As shown in the figure above, the Multi Channel DMA IP for PCI Express can be used in a server's hardware infrastructure to allow communication between various VM clients and their FPGA-device-based counterparts. The IP operates on descriptor-based queues set up by driver software to transfer data between the local FPGA and the host. The IP's control logic reads the queue descriptors and executes them.

The Multi Channel DMA IP for PCI Express integrates the Intel® PCIe Hard IP and interfaces with the host Root Complex via the PCIe serial lanes. On the user logic side, Avalon-MM/Avalon-ST interfaces allow easy integration of the IP with other Platform Designer components.

Besides DMA functionality, the Multi Channel DMA IP for PCI Express enables standalone Endpoint or Root Port functionality with Avalon-MM interfaces to the user logic. This functionality is described in more detail in the Functional Description chapter.
2.1. Multi Channel DMA IP for PCI Express Features
Endpoint mode
• PCIe Gen3 / Gen4 x8 / Gen4 x16 in Intel Stratix® 10 DX and Intel Agilex™ devices
• PCIe Gen3 x8 / Gen3 x16 in Intel Stratix 10 GX / Intel Stratix 10 MX devices
• User Mode options:
— Multi Channel DMA
— Bursting Master
— Bursting Slave
— Bursting Avalon-MM Master and Bursting Avalon-MM Slave
— Bursting Avalon-MM Master and Multi Channel DMA
• Supports up to 2K DMA channels.

Table 2. Maximum DMA Channels (depends on the device and user interface type)

Device                       | AVMM  | 4 AVST Ports | 1 AVST Port
Intel Stratix 10 GX / MX     | 2048* | 4            | 256
Intel Stratix 10 DX / Agilex | 2048* | 4            | 64

* Maximum 512 channels per function
• Per Descriptor completion notification with MSI-X or Writebacks
• Architectural support for 'Head-of-line' blocking prevention for 4 Avalon-ST ports
• Option to select Avalon-MM or Avalon-ST DMA for user logic interface
• Alternate option to enable 4 Avalon-ST DMA ports with 1 DMA channel per port
• SR-IOV
Note: SR-IOV is only enabled when a single port configuration is enabled in the Multi Channel DMA IP
• User MSI-X
Note: MSI is currently not supported
• FLR
Note: User MSI-X and FLR are supported only in Multi Channel DMA mode
Root Port Mode
• PCIe Gen3 x8 / Gen3 x16 in Intel Stratix 10 GX / Intel Stratix 10 MX devices
• PCIe Gen3 x16 / Gen4 x16 in Intel Stratix 10 DX and Intel Agilex devices
• Configuration Slave (CS) interface for accessing Endpoint’s config space
• User mode options:
— Bursting Master
— Bursting Slave
— Bursting Avalon-MM Master and Bursting Avalon-MM Slave
• Maximum payload size is 512 bytes
2.2. Device Family Support
The following terms define Multi Channel DMA IP for PCI Express core support levels for Intel FPGA IP cores in Intel Stratix 10 devices:

• Advanced support: the IP core is available for simulation and compilation for this device family. Timing models include initial engineering estimates of delays based on early post-layout information. The timing models are subject to change as silicon testing improves the correlation between the actual silicon and the timing models. You can use this IP core for system architecture and resource utilization studies, simulation, pinout, system latency assessments, basic timing assessments (pipeline budgeting), and I/O transfer strategy (data-path width, burst depth, I/O standards tradeoffs).

• Preliminary support: the IP core is verified with preliminary timing models for this device family. The IP core meets all functional requirements, but might still be undergoing timing analysis for the device family. It can be used in production designs with caution.

• Final support: the IP core is verified with final timing models for this device family. The IP core meets all functional and timing requirements for the device family and can be used in production designs.
Table 3. Device Family Support Table
Device Family Support Level
Intel Stratix 10 Preliminary
Intel Agilex Preliminary
Other device families No support
Related Information
Timing and Power Models
Reports the default device support levels in the current version of the Intel Quartus Prime Pro Edition software.
2.3. Recommended Speed Grades
Table 4. Recommended Speed Grades

PCIe Gen. | Device         | Tile Variant | PLD Clock Frequency | Recommended Speed Grade
Gen3      | S10 GX/MX      | H-Tile       | 250 MHz             | -1, -2
Gen3      | S10 DX, Agilex | P-Tile       | 250 MHz             | -1, -2, -3
Gen4      | S10 DX         | P-Tile       | 350 MHz             | -1, -2
Gen4      | Agilex         | P-Tile       | 400 MHz             | -1, -2, -3
Related Information
Quartus Standard to Timing Closure and Optimization
Use this link for the Quartus Prime Pro Edition Software.
2.4. Performance and Resource Utilization
Note: Resource utilization for the x8 link width is much lower. Contact your Intel FAE if you need resource utilization for a specific IP configuration.
Table 5. Avalon-MM

Tile (Device)                              | User Mode  | Link Conf | DMA Channels | ALMs                     | Logic Registers             | M20Ks
H-Tile (Intel Stratix 10 GX/MX)            | MCDMA      | Gen3 x16  | 1 / 32 / 64  | 51,427 / 51,580 / 51,465 | 119,197 / 119,279 / 119,267 | 618 / 618 / 620
H-Tile (Intel Stratix 10 GX/MX)            | BAM_MCDMA  | Gen3 x16  | 1 / 32 / 64  | 55,873 / 55,818 / 55,791 | 129,649 / 129,723 / 129,727 | 685 / 685 / 687
H-Tile (Intel Stratix 10 GX/MX)            | BAM        | Gen3 x16  | n/a          | 32,418                   | 65,672                      | 419
H-Tile (Intel Stratix 10 GX/MX)            | BAS        | Gen3 x16  | n/a          | 34,181                   | 74,626                      | 384
P-Tile (Intel Stratix 10 DX, Intel Agilex) | MCDMA      | Gen4 x16  | 1 / 32 / 64  | 45,527 / 45,602 / 45,577 | 111,256 / 111,388 / 111,584 | 596 / 596 / 598
P-Tile (Intel Stratix 10 DX, Intel Agilex) | BAM_MCDMA  | Gen4 x16  | 1 / 32 / 64  | 49,167 / 49,267 / 49,074 | 121,554 / 121,660 / 121,303 | 663 / 663 / 665
P-Tile (Intel Stratix 10 DX, Intel Agilex) | BAM        | Gen4 x16  | n/a          | 23,692                   | 54,906                      | 398
P-Tile (Intel Stratix 10 DX, Intel Agilex) | BAS        | Gen4 x16  | n/a          | 26,971                   | 63,720                      | 363
Table 6. 1 Port Avalon-ST

Tile (Device)                              | User Mode  | Link Conf | DMA Channels | ALMs                     | Logic Registers             | M20Ks
H-Tile (Intel Stratix 10 GX/MX)            | MCDMA      | Gen3 x16  | 1 / 32 / 64  | 54,501 / 54,535 / 54,481 | 124,804 / 125,019 / 124,875 | 640 / 639 / 641
H-Tile (Intel Stratix 10 GX/MX)            | BAM_MCDMA  | Gen3 x16  | 1 / 32 / 64  | 58,143 / 58,105 / 58,236 | 135,284 / 135,365 / 135,554 | 708 / 705 / 708
H-Tile (Intel Stratix 10 GX/MX)            | BAM        | Gen3 x16  | n/a          | 32,418                   | 65,672                      | 419
H-Tile (Intel Stratix 10 GX/MX)            | BAS        | Gen3 x16  | n/a          | 34,181                   | 74,626                      | 384
P-Tile (Intel Stratix 10 DX, Intel Agilex) | MCDMA      | Gen4 x16  | 1 / 32 / 64  | 48,087 / 47,997 / 47,920 | 116,788 / 116,732 / 116,501 | 619 / 618 / 620
P-Tile (Intel Stratix 10 DX, Intel Agilex) | BAM_MCDMA  | Gen4 x16  | 1 / 32 / 64  | 51,688 / 51,715 / 51,883 | 127,132 / 127,197 / 127,578 | 686 / 685 / 688
P-Tile (Intel Stratix 10 DX, Intel Agilex) | BAM        | Gen4 x16  | n/a          | 23,692                   | 54,906                      | 398
P-Tile (Intel Stratix 10 DX, Intel Agilex) | BAS        | Gen4 x16  | n/a          | 26,971                   | 63,720                      | 363
2.5. Release Information
IP versions are the same as the Intel Quartus® Prime Design Suite software versions up to v19.1. Starting with Intel Quartus Prime Design Suite software version 19.2, IP cores use a new IP versioning scheme. If an IP core version is not listed, the user guide for the previous IP core version applies. The IP versioning scheme (X.Y.Z) number changes from one software version to another.
A change in:
• X indicates a major revision of the IP. If you update your Intel Quartus Primesoftware, you must regenerate the IP.
• Y indicates the IP includes new features. Regenerate your IP to include these newfeatures.
• Z indicates the IP includes minor changes. Regenerate your IP to include thesechanges.
Table 7. Release Information for the Multi Channel DMA IP for PCI Express Core

Item                        | Description
IP Version                  | H-Tile IP version: 2.0.0; P-Tile IP version: 1.0.0
Intel Quartus Prime Version | Intel Quartus Prime Pro Edition 21.1 Software Release
Ordering Part Number (OPN)  | H-Tile: IP-PCIEMCDMA; P-Tile: IP-PCIEMCDMA
3. Functional Description

Figure 2. Multi Channel DMA IP for PCI Express Block Diagram

(Figure: the PCIe Hard IP (AVST Rx/Tx) connects through a HIP interface/scheduler/arbiter to the MCDMA block — CSR (BAR0), Rx PIO interface (BAR2), descriptor fetch engine with H2D and D2H descriptor FIFOs, H2D and D2H data movers, and MSI-X/Writeback generation — and to the Bursting AVMM Master (BAM), Bursting AVMM Slave (BAS), and Config Slave (CS) blocks. The user logic side exposes AVMM slave/master ports, AVST source/sink (x1/x4) ports, and the User MSI-X / User FLR interfaces.)

* MCDMA block in Endpoint only; ** Config Slave block in Root Port only
Rx PIO I/F: Programmed I/O Interface – AVMM Master
H2D I/F: Host to Device Interface – AVMM Master (Write) or AVST Source (1 port / 4 ports)
D2H I/F: Device to Host Interface – AVMM Master (Read) or AVST Sink (1 port / 4 ports)
3.1. Multi Channel DMA
The Multi Channel DMA IP for PCI Express consists primarily of the H2DDM and D2HDM blocks. It also offers a DMA-bypass capability that lets the Host do PIO reads/writes to device memory.
3.1.1. H2D Data Mover
The Host-to-Device Data Mover (H2DDM) module transfers data from the host memory to local memory through the PCIe Hard IP and the Avalon-ST Source interface.
There are two modes of usage for the H2DDM: queue descriptors fetching and H2Ddata payload transfer.
When used for descriptor fetching, the destination of the completion data is the internal descriptor FIFOs, where descriptors are stored before being dispatched to the H2DDM or D2HDM for actual data transfer.
UG-20297 | 2021.05.28
Send Feedback
Intel Corporation. All rights reserved. Intel, the Intel logo, and other Intel marks are trademarks of IntelCorporation or its subsidiaries. Intel warrants performance of its FPGA and semiconductor products to currentspecifications in accordance with Intel's standard warranty, but reserves the right to make changes to anyproducts and services at any time without notice. Intel assumes no responsibility or liability arising out of theapplication or use of any information, product, or service described herein except as expressly agreed to inwriting by Intel. Intel customers are advised to obtain the latest version of device specifications before relyingon any published information and before placing orders for products or services.*Other names and brands may be claimed as the property of others.
ISO9001:2015Registered
When used for data payload transfer, it generates Mem Rd TLPs based on descriptor information such as the PCIe address (source), data size, and MRRS value, and forwards the received data to the user logic through the Avalon-MM Write Master / Avalon-ST Source interface. The received completions are re-ordered to ensure the read data is delivered to the user logic in order.

When a descriptor is completed, that is, all read data has been received and forwarded to the Avalon-MM Write Master / Avalon-ST Source interface, the H2DDM performs housekeeping tasks that include:
• Schedule MSI-X for a completed queue, if enabled
• Schedule Writeback Consumed Head Pointer for a completed queue, if enabled
• Update Consume Head Pointer for software polling
Based on the updated status, software can proceed with releasing the transmit buffer and reusing the descriptor ring entries.
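The software-polling path described above can be sketched in C. This is an illustrative sketch only: the writeback location name (Q_CONSUMED_HEAD_ADDR) comes from this guide, but the helper and the bare busy-wait loop are not from Intel's driver, and a real driver would bound the spin and order the read appropriately.

```c
#include <stdint.h>

/* Hardware writes the index of the last completed descriptor to the
 * writeback location the driver programmed (Q_CONSUMED_HEAD_ADDR).
 * Software can poll that word instead of taking an MSI-X interrupt. */
static uint32_t poll_consumed_head(const volatile uint32_t *writeback,
                                   uint32_t last_seen)
{
    uint32_t head;
    while ((head = *writeback) == last_seen)
        ;  /* spin until the H2DDM advances the consumed head */
    return head;  /* entries up to 'head' may now be recycled */
}
```

Once `poll_consumed_head()` returns, the driver can release transmit buffers and reuse the descriptor ring entries up to the returned index.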
3.1.2. D2H Data Mover
The D2H Data Mover (D2HDM) transfers data from device memory to host memory. It receives the data from the user logic through the Avalon-MM Read Master / Avalon-ST Sink interface and generates Mem Wr TLPs to move the data to the host, based on descriptor information such as the PCIe address (destination), data size, and MPS value.

When a descriptor is completed, that is, all DMA data has been sent to the host, the D2HDM performs housekeeping tasks that include:
• Schedule MSI-X for a completed queue, if enabled
• Schedule Writeback Consumed Head Pointer for a completed queue, if enabled
• Update Consume Head Pointer for software polling
Based on the updated status, software can proceed with releasing the receive buffer and reusing the descriptor ring entries.
3.1.3. Descriptors
A DMA channel supporting Multi Channel DMA data movement consists of a pair of descriptor queues: one H2D descriptor queue and one D2H descriptor queue. Descriptors are arranged contiguously within a 4 KB page.

Each descriptor is 32 bytes in size. The descriptors are kept in host memory in a linked list of 4 KB pages. With a 32-byte descriptor and a 4 KB page, each page contains up to 128 descriptors. The last descriptor in a 4 KB page must be a "link descriptor": a descriptor containing a link to the next 4 KB page, with the link bit set to 1. The last entry in the linked list must be a link pointing to the base address programmed in the QCSR, in order to form a circular buffer from the linked list of 4 KB pages. The figure below shows the descriptor linked list.
Figure 3. Descriptor Linked-List

(Figure: a chain of 4 KB pages, each holding descriptors 0–127. Entries with Link = 0 carry data transfers; the last entry of each page has Link = 1 and points to the next 4 KB page. Q_START_ADDR_L/H in the QCSR points to the first page, and the final link descriptor points back to that base address to close the circular list.)
Software and hardware communicate and manage the descriptors using the tail index pointer (Q_TAIL_POINTER) and head index pointer (Q_HEAD_POINTER) QCSR registers, as shown in the following figure. The DMA starts when software writes the last valid descriptor index to the Q_TAIL_POINTER register.
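A minimal C sketch of the software side of this handshake, assuming a simplified ring of RING_ENTRIES descriptors (real rings chain 4 KB pages with link descriptors as described above); the `qcsr` struct and `ring_push()` helper are illustrative stand-ins for the real MMIO-mapped QCSR registers, not Intel driver code:

```c
#include <stdint.h>

#define DESC_SIZE     32u
#define PAGE_4K       4096u
#define RING_ENTRIES  (PAGE_4K / DESC_SIZE)  /* 128 per 4 KB page */

/* Stand-in for one queue's QCSR registers (really accessed via MMIO). */
struct qcsr {
    uint64_t q_start_addr;    /* base of the first 4 KB descriptor page */
    uint32_t q_tail_pointer;  /* last valid descriptor index added by software */
    uint32_t q_head_pointer;  /* descriptor last fetched by hardware */
};

/* Publish n new descriptors: advance the tail index and "ring the doorbell".
 * On real hardware the assignment below is an MMIO write to Q_TAIL_POINTER,
 * which is what actually starts the DMA. */
static uint32_t ring_push(struct qcsr *q, uint32_t n)
{
    q->q_tail_pointer = (q->q_tail_pointer + n) % RING_ENTRIES;
    return q->q_tail_pointer;
}
```

The wrap-around in `ring_push()` mirrors the circular buffer formed by the final link descriptor pointing back to Q_START_ADDR_L/H.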
Figure 4. Descriptor Ring Buffer

(Figure: a circular ring of descriptors DESC_IDX 0 through n. Q_TAIL_POINTER marks the last valid descriptor added by software; Q_HEAD_POINTER marks the descriptor last fetched by hardware.)
Table 8. Software Descriptor Format

Name | Width | Description
SRC_ADDR [63:0] | 64 | If the Link bit = 0, this field contains the source address: the starting system address of the allocated transmit buffer read by the DMA, which can have any byte alignment. If the queue is H2D, this field contains the address in host memory; if the queue is D2H, this is the AVMM address in device memory. If the Link bit is set, this field contains the address of the next 4 KB page of descriptors in host memory.
DEST_ADDR [127:64] | 64 | When Link = 0: the starting local AVMM address written by the DMA, which can have any byte alignment. If the queue is D2H, this field contains the address in host memory; if the queue is H2D, this is the AVMM address in device memory.
PYLD_CNT [147:128] | 20 | When Link = 0: DMA payload size in bytes. Maximum 1 MB, with 20'h0 indicating 1 MB.
RSRVD [159:148] | 12 | Reserved
DESC_IDX [175:160] | 16 | Unique identifier for each descriptor, assigned by the software driver. This value is written to the Q_COMPLETED_POINTER register when a descriptor data transfer is complete.
MSIX_EN [176] | 1 | Enable MSI-X per descriptor
WB_EN [177] | 1 | Enable Writeback per descriptor
RSRVD [191:178] | 14 | Reserved
RX_PYLD_CNT [211:192] | 20 | Received actual payload for D2H data movement (upstream)
RSRVD [221:212] | 10 | Reserved
SOF [222] | 1 | SOF indicator for Avalon-ST streaming. In H2D streaming, this bit causes the Avalon-ST Source interface to assert h2d_st_sof_o, indicating start of a file/packet. In D2H streaming, this bit is set in the descriptor itself by a Writeback (if Writeback is enabled) when the user logic asserts d2h_st_sof_i, indicating start of a file/packet.
  Note: In H2D streaming, both SOF and EOF can be set in the same descriptor (file size = payload count), or it can span multiple descriptor pages.
  Note: In D2H streaming, if the user logic prematurely ends the data transfer by asserting d2h_st_eof_i in the middle of a descriptor data move and then starts a next file/packet, the SOF bit in the next descriptor is set by a Writeback.
  Note: The SOF bit is an optional feature for DMAs involving file data transfers using the Avalon-ST interface.
EOF [223] | 1 | EOF indicator for Avalon-ST streaming. In H2D streaming, this bit causes the Avalon-ST Source interface to assert h2d_st_eof_o, indicating end of a file/packet. In D2H streaming, this bit is set in the descriptor itself by a Writeback (if Writeback is enabled) when the user logic asserts d2h_st_eof_i, indicating end of a packet. Along with the EOF bit, the Writeback also updates the actual received payload count (RX_PYLD_CNT) field of the last descriptor.
  Note: The EOF bit is an optional feature for DMAs involving file data transfers using the Avalon-ST interface.
RSRVD [253:224] | 30 | Reserved
DESC_INVALID [254] | 1 | Indicates whether the current descriptor content is valid or stale
LINK [255] | 1 | Link = 0: the descriptor contains the source address, destination address, and length. Link = 1: the descriptor contains the address of the next 4 KB page of descriptors in host memory.
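The 32-byte layout in Table 8 can be mirrored in a C structure. This is an unofficial sketch for illustration only (the bitfield packing shown matches common little-endian ABIs, not necessarily every compiler, and it is not Intel's driver header); field names follow the table:

```c
#include <stdint.h>

/* One 32-byte MCDMA software descriptor, little-endian layout per Table 8. */
struct mcdma_desc {
    uint64_t src_addr;          /* [63:0]    source addr, or next-page addr when LINK=1 */
    uint64_t dest_addr;         /* [127:64]  destination address */
    uint32_t pyld_cnt : 20;     /* [147:128] payload bytes; 20'h0 encodes 1 MB */
    uint32_t rsvd0    : 12;     /* [159:148] reserved */
    uint16_t desc_idx;          /* [175:160] index written back on completion */
    uint16_t msix_en  : 1;      /* [176]     per-descriptor MSI-X enable */
    uint16_t wb_en    : 1;      /* [177]     per-descriptor Writeback enable */
    uint16_t rsvd1    : 14;     /* [191:178] reserved */
    uint32_t rx_pyld_cnt : 20;  /* [211:192] actual D2H payload received */
    uint32_t rsvd2    : 10;     /* [221:212] reserved */
    uint32_t sof      : 1;      /* [222]     start of file/packet (Avalon-ST) */
    uint32_t eof      : 1;      /* [223]     end of file/packet (Avalon-ST) */
    uint32_t rsvd3    : 30;     /* [253:224] reserved */
    uint32_t desc_invalid : 1;  /* [254]     descriptor content is stale */
    uint32_t link     : 1;      /* [255]     1 = points at next 4 KB descriptor page */
};
```

A driver can sanity-check the layout at startup, since the hardware consumes exactly 32 bytes per descriptor.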
3.1.3.1. Metadata Support
8 Byte Metadata
In Avalon Streaming mode, once you select 8 Byte metadata support during IP generation, the source and destination address fields in the existing descriptor structure are repurposed for metadata support. The following fields of the descriptor defined above have revised properties.
Table 9.
Name Width Description
SRC_ADDR [63:0] | 64 | If the Link bit = 0: if the queue is H2D, this field contains the address in host memory; if the queue is D2H, this is the 8 Byte Metadata. If the Link bit is set, this field contains the address of the next 4 KB page of descriptors in host memory.
DEST_ADDR [127:64] | 64 | When Link = 0: if the queue is D2H, this field contains the address in host memory; if the queue is H2D, this is the 8 Byte Metadata.
3.1.3.2. MSI-X/Writeback
The MSI-X and Writeback blocks update the host with the currently processed queue's head pointer and an interrupt. Apart from the global MSI-X Enable and Writeback Enable, there is a provision to selectively enable or disable MSI-X and Writeback on a per-descriptor basis. Applications can use this feature to throttle MSI-X/Writeback generation.

The table below shows the relation between the global and per-descriptor MSI-X/Writeback enables.
Table 10. Multi Channel DMA Per-descriptor Enable vs. Global MSI-X/Writeback Enable
Global Enable Per-descriptor Enable MSI-X/Writeback Generation
1 1 On
1 0 On only for SOF/EOF and error conditions
0 1 Off
0 0 Off
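Table 10 reduces to a small predicate. The sketch below is a hypothetical software model of the table, not IP code; `sof_eof_or_err` stands for the SOF/EOF and error conditions named in the second row.

```c
#include <stdbool.h>

/* Software model of Table 10: whether an MSI-X/Writeback is generated for
 * a completed descriptor. Hypothetical helper, not part of the IP. */
bool msix_wb_generated(bool global_en, bool per_desc_en, bool sof_eof_or_err)
{
    if (!global_en)
        return false;          /* rows 3-4: global disable wins */
    if (per_desc_en)
        return true;           /* row 1: fully enabled */
    return sof_eof_or_err;     /* row 2: only SOF/EOF and error conditions */
}
```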
If enabled, a Writeback is sent to the host to update the status (completed descriptor ID) stored at the Q_CONSUMED_HEAD_ADDR location. In addition, for D2H streaming DMA, an additional Writeback is issued to the D2H descriptor itself when the IP's Avalon-ST sink interface has received an sof/eof from the user logic. It updates the D2H descriptor packet information fields such as start of a file/packet (SOF), end of a file/packet (EOF), and received payload count (RX_PYLD_CNT).
3.1.4. Avalon-MM PIO Master
The Avalon-MM PIO Master bypasses the DMA block and provides a way for the Host to do MMIO reads/writes to CSR registers of the user logic. PCIe BAR2 is mapped to the Avalon-MM PIO Master. Any TLP targeting BAR2 is forwarded to the user logic. TLP addresses targeting the PIO interface must be 8-byte aligned. The PIO interface supports non-bursting 64-bit write and read transfers.
The Avalon-MM PIO Master is present only if you select Multi Channel DMA User Mode for MCDMA Settings in the IP Parameter Editor GUI. The Avalon-MM PIO Master is always present irrespective of the Interface type (Avalon-ST/Avalon-MM) that you select.
3.1.5. Avalon-MM Write (H2D) and Read (D2H) Master
The Avalon-MM interface is used to transfer data between the host and device through the memory-mapped interface. You can enable the memory-mapped interface by selecting the AVMM Interface type in the IP Parameter Editor. The Multi Channel DMA IP for PCI Express supports 1 write master port and 1 read master port.
Avalon-MM Write Master
The Avalon-MM Write Master is used to write H2D DMA data to the Avalon-MM slave in the user logic through the memory-mapped interface. The Write Master can issue AVMM write commands for up to 8 bursts (burst count = 8). The waitrequestAllowance of this port is enabled and set to 16, allowing the master to transfer up to 16 additional write command cycles after the waitrequest signal has been asserted.
Figure 5. Avalon-MM Write with waitrequestAllowance 16 (timing diagram: clock, write, burstcount[3:0], writedata[511:0], and waitrequest; the master continues to issue write data beats DA0-DA7 and DB0-DB7 after waitrequest asserts)
Avalon-MM Read Master
The Avalon-MM Read Master is used to read D2H DMA data from the Avalon-MM slave in the user logic through the memory-mapped interface. The Read Master can issue AVMM read commands for up to 8 bursts (burst count = 8).
Figure 6. Avalon-MM Read Master Timing Diagram (clock, address, waitrequest, read, readdata[511:0], burstcount[3:0], readdatavalid; a burst of 8 at address A0 returns read data beats D(A0) through D(A0+7))
3.1.6. Avalon-ST Source (H2D) and Sink (D2H)
Multi Channel DMA provides Avalon Streaming interfaces for transferring DMA data between the host and device. The Avalon-ST Source interface is used to move H2D DMA data to external user logic. The Avalon-ST Sink interface is used to move D2H DMA data to the host. If you select the Avalon-ST interface for Multi Channel DMA mode, you can choose either 4 Avalon-ST ports or 1 Avalon-ST port for DMA.
3.1.6.1. Avalon-ST 4-Port mode
When you select 4-Port mode, the IP provides 4 Avalon-ST Source ports for H2D DMA and 4 Avalon-ST Sink ports for D2H DMA and supports up to 4 DMA channels. Each port and DMA channel have a 1:1 mapping.
Head-of-Line Blocking Prevention
In this mode, if one of the four channels stalls on the user-logic side, a head-of-line blocking situation could occur because the H2D and D2H Data Movers service each channel independently based on a round-robin arbitration scheme. To prevent head-of-line (HOL) blocking in one of the 4 ports from impacting the performance of the other ports, the Multi Channel DMA IP for PCI Express provides up to eight parallel descriptor fetch streams (4 for H2D descriptor fetch and 4 for D2H) and up to four parallel host-to-device data streams. These data/descriptor fetch streams are independent of each other. Persistent backpressure from an Avalon-ST Source port might stall one of the four H2D streams. However, the concurrent architecture along with round-robin arbitration keeps the other streams mutually independent so they continue to operate without any impact.
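The arbitration behavior can be modeled as a round-robin pick that skips channels whose stream is backpressured, so a stalled channel never blocks the others. This is a behavioral sketch under our own naming, not the RTL.

```c
#include <stdbool.h>

#define NUM_CH 4

/* Behavioral model: return the next channel after last_grant whose stream
 * is ready, in round-robin order; -1 if every channel is stalled. */
int rr_next_grant(int last_grant, const bool ready[NUM_CH])
{
    for (int i = 1; i <= NUM_CH; i++) {
        int ch = (last_grant + i) % NUM_CH;
        if (ready[ch])
            return ch;          /* stalled channels are simply skipped */
    }
    return -1;                  /* all four streams backpressured */
}
```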
The following is the Avalon-ST interface timing for both H2D and D2H directions. A data transfer happens when both valid and ready signals are '1'. Both valid and ready signals can go to '0' within a packet boundary.
Figure 7. Avalon-ST Interface Timing Diagram (clock, ready, valid, sof, eof, empty, data[511:0]; data beats D0-D6 transfer only on cycles where both valid and ready are '1', with empty valid on the eof beat)
3.1.6.2. Avalon-ST 1-Port Mode (Available in Intel Quartus Prime release 21.1 and later)
When you select AVST 1-port mode, the IP provides 1 AVST Source and Sink port for DMA. In this mode, you can enable up to 64 DMA channels if an Intel Stratix 10 GX/MX device (H-Tile) is selected, and 256 DMA channels if an Intel Stratix 10 DX or Intel Agilex device (P-Tile) is selected.
Table 11. IP Parameters specific to D2H Descriptor Fetch in Avalon-ST 1 Port Mode
IP GUI Parameter Description Value for MCDMA (H-Tile) Value for MCDMA (P-Tile)
D2H Prefetch Channels Number of prefetch channels 8, 16, 32, 64 8, 16, 32, 64, 128, 256
Maximum Descriptor Fetch Number of descriptors that can be fetched for each prefetch channel 16, 32, 64 16, 32, 64
For details about these parameters, refer to the D2H Data Mover section.
3.1.6.3. Packet (File) Boundary
When streaming DMA data, the packet (file) boundary is indicated by the SOF and EOF bits of the descriptor and the corresponding sof and eof signals of the Avalon-ST interface.
Table 12. Multi Channel DMA Streaming Packet Boundary
<n>: 0-3 for 4 ports, 0 for 1 port
Packet Boundary Descriptor Field AVST Source (H2D) Signal AVST Sink (D2H) Signal
Start of Packet SOF h2d_st_sof_<n>_o d2h_st_sof_<n>_i
End of Packet EOF h2d_st_eof_<n>_o d2h_st_eof_<n>_i
In Avalon-ST 1-port mode, a channel switch can only happen at a packet boundary.
3.1.6.4. Metadata
When streaming DMA data, you can optionally enable 8-byte Metadata that carries metadata for the user application. When enabled, the H2D descriptor destination address field is replaced with metadata and the D2H descriptor source address field is replaced with metadata.
With Metadata enabled, the Avalon-ST SOF qualifies only the metadata and does not carry any data. Since the metadata size is always 8 bytes with a predefined property, the user side does not expect an empty signal.
Figure 8. Avalon-ST Source Timing Diagram with Metadata Enabled (clk, h2d_st_ready_i, h2d_st_valid_0, h2d_st_sof_0, h2d_st_eof_0, h2d_st_data_0[511:0], h2d_st_channel_0[10:0], h2d_st_empty_0[5:0]; the sof beat carries only the 8-byte metadata, followed by data beats D1-D3)
3.1.7. User MSI-X
User MSI-X is arbitrated along with the H2D/D2H MSI-X/Writeback requests, and is handled exactly the same way as the others after arbitration. The high-level diagram for the MSI-X handling mechanism is shown below.
Each DMA Channel is allocated 4 MSI-X vectors:
• 2’b00: H2D DMA Vector
• 2’b01: H2D Event Interrupt
• 2’b10: D2H DMA Vector
• 2’b11: D2H Event Interrupt
2'b00 and 2'b10 address the descriptor completion related interrupts (DMA operation MSI-X) on both paths.
2'b01 and 2'b11 are used for user MSI-X.
Note: msix_queue_dir indicates the queue direction. D2H = 0, H2D = 1
3.1.8. User Functional Level Reset (FLR)
When the DMA engine receives Functional Level Resets from the PCIe Hard IP module, the reset requests are propagated to the downstream logic via this interface. In addition to resetting its internal logic, the DMA engine waits for an acknowledgment of the reset request from the user logic before it issues an acknowledgment to the PCIe Hard IP.
3.1.9. Control Registers
The Multi Channel DMA IP for PCI Express provides 4 MB of control register space that is internally mapped to PCIe BAR0. The control register block contains all the registers required to support the DMA operations. This includes QCSR space for individual queue control, MSI-X for interrupt generation, and GCSR for general global information.
The following table shows the 4 MB space mapped for each function in PCIe config space through BAR0.
Table 13. Control Registers
Address Space Range Size Description
QCSR (D2H, H2D) 22'h00_0000 - 22'h0F_FFFF 1 MB Individual queue control and status registers, up to 2048 D2H and 2048 H2D queues
MSI-X (Table and PBA) 22'h10_0000 - 22'h1F_FFFF 1 MB MSI-X Table and PBA space
GCSR 22'h20_0000 - 22'h2F_FFFF 1 MB General DMA control and status registers. Only for PF0.
Reserved 22'h30_0000 - 22'h3F_FFFF 1 MB Reserved
Note: For more information on Control registers, refer to Control Register (GCSR) on page 83.
3.2. Bursting Avalon-MM Master (BAM)
The BAM bypasses the Multi Channel DMA IP for PCI Express and provides a way for a Host to perform bursting PIO reads/writes to the user logic. The BAM converts memory read and write TLPs initiated by the remote link partner and received over the PCIe link into Avalon-MM burst read and write transactions, and sends back CplD TLPs for the read requests it receives. Since the BAM user interface is Avalon-MM, completions are always expected in order from the user logic/Qsys fabric. The BAM supports bursts of up to 512 bytes and up to 32 outstanding read requests.
BAM Address Mapping
You can map any BAR other than BAR0 of the physical function to the BAM side for the user application. The BAM interface address mapping is as follows:
BAM address = {vf_active, pf, vf, bar_num, bam_addr}
1. vf_active: Indicates that SR-IOV is enabled.
2. pf [PF_NUM-1:0]: Physical function number decoded from the PCIe header received from the HIP. PF_NUM, which is ($clog2(pf_num_tcl)), is the RTL design parameter selected by the user such that the Multi Channel DMA only allocates the required number of bits on the Avalon-MM side to limit the number of wires on the user interface.
3. vf [VF_NUM-1:0]: Virtual function number decoded from the PCIe header received from the HIP. VF_NUM, which is ($clog2(vf_num_tcl)), is the RTL design parameter selected by the user such that the Multi Channel DMA only allocates the required number of bits on the Avalon-MM side to limit the number of wires on the user interface.
4. bar_num [2:0]: Denotes the BAR number where the Avalon-ST transaction was received.
5. bam_addr [ADDR_SIZE-1:0]: Lower address based on the maximum aperture size among all the BARs. For example, if BAR3 is 16 MB and BAR2 is 4 GB, then ADDR_SIZE = 32, corresponding to BAR2.
The Multi Channel DMA core passes the maximum aperture size parameter for the address offset and the PF/VF so that the BAM module outputs the address in the format shown above.
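The concatenation {vf_active, pf, vf, bar_num, bam_addr} can be expressed as bit packing. In the sketch below the field widths are function parameters because they depend on the generated IP configuration (pf_num_tcl, vf_num_tcl, BAR aperture); the function itself is illustrative, not part of the IP deliverables.

```c
#include <stdint.h>

/* Illustrative packing of the BAM address {vf_active, pf, vf, bar_num,
 * bam_addr}. Field widths (pf_w = clog2 of PF count, vf_w = clog2 of VF
 * count, addr_w = maximum BAR aperture width) are generation-time
 * parameters of the IP. */
uint64_t bam_pack_addr(unsigned vf_active, unsigned pf, unsigned vf,
                       unsigned bar_num, uint64_t bam_addr,
                       unsigned pf_w, unsigned vf_w, unsigned addr_w)
{
    uint64_t a = vf_active & 1u;
    a = (a << pf_w)   | (pf & ((1u << pf_w) - 1));
    a = (a << vf_w)   | (vf & ((1u << vf_w) - 1));
    a = (a << 3)      | (bar_num & 0x7u);                  /* bar_num[2:0] */
    a = (a << addr_w) | (bam_addr & ((1ull << addr_w) - 1));
    return a;
}
```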
3.3. Bursting Avalon-MM Slave (BAS)
The Avalon-MM TX Bursting Slave module translates Avalon-MM read and write transactions from the user logic into PCI Express MRd and MWr TLPs. The returned PCI Express CplD packets are translated to the Avalon-MM interface as responses to Avalon-MM read transactions.
The BAS supports both 256-bit and 512-bit data widths to achieve the bandwidths required for Gen4 x8 and Gen4 x16. It supports bursts of up to 512 bytes and multiple outstanding read requests. The default support is for 64 NP outstanding requests only.
Figure 9. Bursting Avalon-MM Slave Definition (AVMM reads and writes from the user interface enter the BAS, which drives AVST read and posted packet interfaces and receives AVST completions through a completion re-ordering block)
The Avalon-MM BAS interface is a slave interface to the user Avalon-MM. The user AVMM can initiate AVMM reads to the host interface, and these are translated to BAS non-posted packet interface signals. The BAS module keeps track of the initiated NP requests and checks them against the completions received from the PCIe on the scheduler completion packet interface.
Since completions from the PCIe can arrive out of order, the completion re-ordering module ensures the returned completions are re-ordered against the pending requests and sent in order on the AVMM interface, because AVMM does not track out-of-order completions.
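The re-ordering described above behaves like a tag-indexed buffer that releases data only in request order. The model below is a hypothetical software analogue (the tag count and names are ours), shown only to make the ordering rule concrete.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_TAGS 8

typedef struct {
    uint32_t data[NUM_TAGS];
    bool     valid[NUM_TAGS];
    unsigned head;             /* tag of the oldest outstanding request */
} reorder_buf_t;

/* A completion for 'tag' lands whenever PCIe returns it, possibly early. */
void cpl_arrive(reorder_buf_t *rb, unsigned tag, uint32_t data)
{
    rb->data[tag]  = data;
    rb->valid[tag] = true;
}

/* Release completions strictly in request order, as AVMM requires. */
bool cpl_pop_in_order(reorder_buf_t *rb, uint32_t *out)
{
    if (!rb->valid[rb->head])
        return false;          /* oldest request not complete yet: wait */
    *out = rb->data[rb->head];
    rb->valid[rb->head] = false;
    rb->head = (rb->head + 1) % NUM_TAGS;
    return true;
}
```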
3.4. Config Slave (Non-Bursting AVMM Slave)
The Config Slave (CS) is a non-bursting AVMM interface to the user interface that converts single-cycle Avalon-MM read and write transactions into AVST reads and writes for the PCIe configuration TLPs to be sent to the PCIe Hard IP (and over the PCIe link). This module also processes the completion TLPs (Cpl and CplD) it receives in return. This interface is applicable only in Root Port mode.
The CS module converts the AVMM request into a configuration TLP with a fixed TAG (decimal value 255) assigned to it and sends it to the scheduler. One unique TAG is enough because the module does not support more than one outstanding transaction. This unique TAG helps in re-routing the completions to the CS module.
Re-routing the completion is handled at the top level. Since only 1 NP outstanding request is needed, the TLP RX scheduler parses the completion field to decode the completion on the fixed TAG and routes the transaction over to the CS.
Figure 10. Avalon-MM Config Slave Module (the CS AVMM slave accepts requests from the CS master interface AVMM and exchanges cs_np and cs_cpl packets with the Configuration Slave logic)
AVMM address format for the CS module: the AVMM address [28:0] is decoded as shown below and is consistent with the PCIe definition of the header space.
Figure 11. Avalon-MM Config Slave Module Address Format
Bit [28]: 0 = Type 0, 1 = Type 1
Bus Number [7:0] | Device Number [4:0] | Function Number [2:0] | Ext. Reg [3:0] | Register [7:2] | 2'b00 (DW aligned)
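Reading the address format left to right gives field widths of 1 + 8 + 5 + 3 + 4 + 6 + 2 = 29 bits, matching the [28:0] address. A host-side encoder might look like the following; the exact bit boundaries are our reading of the figure, so treat this as a sketch.

```c
#include <stdint.h>

/* Sketch of the CS AVMM address [28:0]:
 *   [28]    1 = Type 1, 0 = Type 0 configuration TLP
 *   [27:20] Bus Number
 *   [19:15] Device Number
 *   [14:12] Function Number
 *   [11:8]  Extended Register
 *   [7:2]   Register (DW aligned, so [1:0] = 2'b00)
 * Bit boundaries follow from the field widths in Figure 11 (our reading). */
uint32_t cs_avmm_addr(unsigned type1, unsigned bus, unsigned dev,
                      unsigned fn, unsigned ext_reg, unsigned reg)
{
    return ((uint32_t)(type1   & 0x1u)  << 28) |
           ((uint32_t)(bus     & 0xFFu) << 20) |
           ((uint32_t)(dev     & 0x1Fu) << 15) |
           ((uint32_t)(fn      & 0x7u)  << 12) |
           ((uint32_t)(ext_reg & 0xFu)  << 8)  |
           ((uint32_t)(reg     & 0x3Fu) << 2);
}
```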
3.5. Hard IP Reconfiguration Interface
The Hard IP Reconfiguration interface is an Avalon-MM slave interface with a 21-bit address bus and an 8-bit data bus. You can use this bus to dynamically modify the values of configuration registers that are read-only at run time.
Note: After a warm reset or cold reset, changes made to the configuration registers of the Hard IP via the Hard IP Reconfiguration interface are lost and these registers revert back to their default values.
3.6. Config TL Interface (P-Tile Only)
The Config TL interface extracts the required information stored in the PCIe Hard IP config space in order for the DMA to operate properly. Examples include MPS, MRRS, and Bus Master Enable.
The configuration register extraction only occurs periodically, so the assumption is that these are fairly static signals and that there are significant delays after the config space is updated by software.
3.7. Configuration Intercept Interface (EP Only)
For detailed information about this interface, refer to the P-Tile Avalon Streaming Intel FPGA IP for PCI Express User Guide (Chapter 4, Section 4.11).
Related Information
P-Tile Avalon Streaming Intel FPGA IP for PCI Express User Guide, Chapter 4, Section 4.11
4. Interface Overview
Interfaces for the Multi Channel DMA IP for PCI Express are:
• Clocks
• Resets
• Multi Channel DMA mode interfaces (EP only):
— Avalon-MM PIO Master Interface
— Avalon-MM Write Master Interface
— Avalon-MM Read Master Interface
— Avalon-ST Source Interface
— Avalon-ST Sink Interface
— User MSI-X
— User FLR
• Bursting Avalon-MM Master Interface (BAM)
• Bursting Avalon-MM Slave Interface (BAS)
• Config Slave Interface (RP only)
• Hard IP Reconfig Interface (P-Tile only)
• Config TL Interface (P-Tile only)
4.1. Port List
Figure 12. Multi Channel DMA IP for PCI Express Port List (H-Tile)
(Block diagram of MCDMA_TOP showing the Clock, Reset, PCIe Serial, H2D Avalon-MM Write Master, D2H Avalon-MM Read Master, H2D Avalon-ST Source, D2H Avalon-ST Sink, PIO Avalon-MM Master, Bursting Avalon-MM Master (BAM), Bursting Avalon-MM Slave (BAS), Config Slave (CS), HIP Dynamic Reconfiguration, User MSI-X, and User FLR interfaces, plus the PIPE interface for simulation and hardware debug, using ltssm_state[5:0] in Signal Tap (for future use). <n> = 0 to 15; <k> = 0 to 3 (4 AVST ports) or 0 (1 AVST port); <a> = {vf_active, pf, vf, bar_num, bam_addr} - 1; <m> = {vf_active, pf, vf, PIO BAR2 Address Width} - 1. The individual signals are listed in the interface tables that follow.)
Figure 13. Multi Channel DMA IP for PCI Express Port List (P-Tile)
(Block diagram of MCDMA_TOP with the same interface groups as the H-Tile port list, plus the Configuration Output Interface (usr_hip_tl_cfg_ctl_o[31:0], usr_hip_tl_cfg_func_o[1:0], usr_hip_tl_cfg_add_o[4:0]). P-Tile clock and reset signals are refclk0, refclk1, coreclkout_hip, app_clk, pin_perst, ninit_done, app_rst_n, p0_pld_link_req_rst_o, and p0_pld_warm_rst_rdy_i. <n>, <k>, <a>, and <m> are defined as for the H-Tile port list.)
4.2. Clocks
Table 14. Multi Channel DMA IP for PCI Express Clock Signals
Signal Name I/O Type Description Clock Frequency
H-Tile
refclk Input PCIe reference clock defined by the PCIe specification. This input reference clock must be stable and free-running at device power-up for a successful device configuration.
100 MHz ± 300 ppm
coreclkout_hip Output This is an output clock provided to user logic. Avalon-MM / Avalon-ST user interfaces are synchronous to this clock.
250 MHz
P-Tile
refclk0 Input PCIe reference clock defined by the PCIe specification. These clocks must be free-running.
100 MHz ± 300 ppm
refclk1 Input
coreclkout_hip Output Do not use this clock. Use app_clk.
app_clk Output Application clock Gen3: 250 MHz; Gen4: 350 MHz (Intel Stratix 10 DX), 400 MHz (Intel Agilex)
4.3. Resets
Table 15. Multi Channel DMA IP for PCI Express Reset Signals
Signal Name I/O Type Description
H-Tile
pin_perst_n Input This is an active-low input to the PCIe Hard IP, and implements the PERST# function defined by the PCIe specification.
npor Input Application drives this active-low reset input to the PCIe Hard IP. This resets the entire PCIe Hard IP. If not used, you must tie this input to 1.
app_nreset_status Output This is an active-low reset status. It is deasserted after the PCIe Hard IP has come out of reset.
ninit_done Input This is an active-low input signal. A "1" indicates that the FPGA device is not yet fully configured. A "0" indicates the device has been configured and is in normal operating mode.
To use the ninit_done input, instantiate the Reset Release Intel FPGA IP in your design and use its ninit_done output. The Reset Release IP is required in Intel Stratix 10 designs. It holds the Multi Channel DMA IP for PCI Express in reset until the FPGA is fully configured and has entered user mode.
P-Tile
pin_perst_n Input See H-Tile pin_perst_n
ninit_done Input See H-Tile ninit_done
app_rst_n Output See H-Tile app_nreset_status
p0_pld_link_req_rst_o Output Warm reset request to application
p0_pld_warm_rst_rdy_i Input Warm reset ready from application
4.4. Multi Channel DMA
4.4.1. Avalon-MM PIO Master
The Avalon-MM PIO Master interface is used to write to / read from external registers implemented in the user logic.
Table 16. Avalon-MM PIO Master
Interface Clock Domain for H-Tile: coreclkout_hip
Interface Clock Domain for P-Tile: app_clk
Signal Name I/O Type Description
rx_pio_address_o[n:0] Output PIO Read/Write Address.
H-Tile:
<n> = (14+PIO BAR2 Address Width)-1
P-Tile:
<n> = (15+PIO BAR2 Address Width)-1
Address = {vf_active, clog2(PF_NUM), clog2(VF_NUM), PIO BAR2 Address}
rx_pio_writedata_o[63:0] Output PIO Write Data Payload.
rx_pio_byteenable_o[7:0] Output PIO Write Data Byte Enable.
rx_pio_write_o Output PIO Write.
rx_pio_read_o Output PIO Read
rx_pio_burstcount_o[3:0] Output PIO Write Burst Count.
rx_pio_waitrequest_i Input PIO Write WaitRequest.
rx_pio_writeresponsevalid_i Input PIO response valid to a write request
rx_pio_readdata_i[63:0] Input PIO Read Data.
rx_pio_readdatavalid_i Input PIO Read data valid
rx_pio_response_i[1:0] Input PIO response. Reserved for future release. Tie to 0.
4.4.2. Avalon-MM Write Master (H2D)
The H2D Avalon-MM Write Master interface is used to write H2D DMA data to the external Avalon-MM slave. This port is a 256-bit (x8) / 512-bit (x16) write master that supports a maximum burst count of 8. The waitrequestAllowance of this port is enabled and set to 16, allowing the master to transfer 16 more data phases after the waitrequest signal has been asserted.
Table 17. Avalon-MM Write Master (H2D)
Interface Clock Domain for H-Tile: coreclkout_hip
Interface Clock Domain for P-Tile: app_clk
Signal Name I/O Type Description
h2ddm_waitrequest_i Input H2D Wait Request
h2ddm_write_o Output H2D Write
h2ddm_address_o[63:0] Output H2D Write Address
h2ddm_burstcount_o[3:0] Output H2D Write Burst Count
x16: h2ddm_writedata_o[511:0] / x8: h2ddm_writedata_o[255:0]
Output H2D Write Data Payload
x16: h2ddm_byteenable_o[63:0] / x8: h2ddm_byteenable_o[31:0]
Output H2D Byte Enable
4.4.3. Avalon-MM Read Master (D2H)
The D2H Avalon-MM Read Master interface is used to read D2H DMA data from the external AVMM slave. This port is a 256-bit (x8) / 512-bit (x16) read master that supports a maximum burst count of 8.
Table 18. Avalon-MM Read Master (D2H)
Interface Clock Domain for H-Tile: coreclkout_hip
Interface Clock Domain for P-Tile: app_clk
Signal Name I/O Type Description
d2hdm_read_o Output D2H Read.
d2hdm_address_o[63:0] Output D2H Read Address.
x16: d2hdm_byteenable_o[63:0] / x8: d2hdm_byteenable_o[31:0]
Output D2H Byte Enable
d2hdm_burstcount_o[3:0] Output D2H Burst Count.
d2hdm_waitrequest_i Input D2H WaitRequest.
d2hdm_readdatavalid_i Input D2H Read Data Valid.
x16: d2hdm_readdata_i[511:0] / x8: d2hdm_readdata_i[255:0]
Input D2H Read Data.
d2hdm_response_i[1:0] Input Tied to 0
4.4.4. Avalon-ST Source (H2D)
The H2D Avalon-ST source interface is used to send H2D DMA data to the external Avalon-ST sink logic.
Table 19. Avalon-ST Source (H2D)
<n> = 0-3 (4-port mode) / 0 (1-port mode)
Signal Name I/O Type Description
x16: h2d_st_data_<n>_o[511:0] / x8: h2d_st_data_<n>_o[255:0]
Output H2D Streaming data from host to device
h2d_st_valid_<n>_o Output Valid for all outgoing signals. A '1' represents the readiness of data to be sent.
h2d_st_ready_<n>_i Input Backpressure from device. A '1' represents device readiness for receiving data.
h2d_st_sof_<n>_o Output Start of file (or packet) as instructed in the host descriptor.
h2d_st_eof_<n>_o Output End of file (or packet) as instructed in the host descriptor.
x16: h2d_st_empty_<n>_o[5:0] / x8: h2d_st_empty_<n>_o[2:0]
Output Represents the number of empty bytes in h2d_st_data_<n>_o, valid only when both h2d_st_valid_<n>_o and h2d_st_eof_<n>_o are '1'.
h2d_st_channel_<n>_o[10:0] Output To support multiple channels per port.
4.4.5. Avalon-ST Sink (D2H)
The D2H Avalon-ST Sink interface is used to read D2H DMA data from the external Avalon-ST source logic.
Table 20. Avalon-ST Sink (D2H)
<n> = 0-3 (4-port mode) / 0 (1-port mode)
Signal Name I/O Type Description
d2h_st_valid_<n>_i Input Valid for all incoming signals. A '1' represents the device readiness for data to be sent.
x16: d2h_st_data_<n>_i[511:0] / x8: d2h_st_data_<n>_i[255:0]
Input D2H Streaming data from device to host.
d2h_st_ready_<n>_o Output Backpressure from the Multi Channel DMA IP for PCI Express. A '1' represents IP readiness for receiving data.
x16: d2h_st_empty_<n>_i[5:0] / x8: d2h_st_empty_<n>_i[2:0]
Input Represents the number of empty bytes in d2h_st_data_<n>_i, valid only when both d2h_st_valid_<n>_i and d2h_st_eof_<n>_i are '1'.
d2h_st_sof_<n>_i Input Start of file (or packet) as instructed by the user logic.
d2h_st_eof_<n>_i Input End of file (or packet) as instructed by the user logic.
d2h_st_channel_<n>_i[10:0] Input To support multiple channels per port.
4.4.6. User MSI-X Interface
User logic requests the DMA engine to send an event interrupt for a queue associated with a PF/VF.
Table 21. User MSI-X Interface
Interface Clock Domain for H-Tile: coreclkout_hip
Interface Clock Domain for P-Tile: app_clk
Signal Name I/O Description
usr_event_msix_valid_i Input The valid signal qualifies valid data on any cycle with data transfer.
usr_event_msix_ready_o Output On interfaces supporting backpressure, the sink asserts ready to mark the cycles where transfers may take place.
usr_event_msix_data_i[15:0] Input {rsvd[3:0], msix_queue_dir, msix_queue_num_i[10:0]}
Note: msix_queue_dir indicates the queue direction. D2H = 0, H2D = 1
4.4.7. User FLR Interface
Table 22. User FLR Interface
Interface Clock Domain for H-Tile: coreclkout_hip
Interface Clock Domain for P-Tile: app_clk
Signal Name I/O Description
usr_flr_rcvd_val_o Output Indicates to the user logic to begin FLR for the specified channel in usr_flr_rcvd_chan_num_o. Asserted until the usr_flr_completed_i input is sampled 1'b1.
usr_flr_rcvd_chan_num_o[10:0] Output Indicates the channel number for which FLR has to be initiated by the user logic.
usr_flr_completed_i Input One-cycle pulse from the application that indicates completion of FLR activity for the channel in usr_flr_rcvd_chan_num_o.
4.5. Bursting Avalon-MM Master (BAM) Interface
Table 23. BAM Signals
Signal Name I/O Type Description
bam_address_o[<n>:0] Output Represents a byte address. The value of address must align to the data width. <n>: {vfactive + $clog2(PF_NUM) + 11 + 3 + BAR_addr_width} - 1, where vfactive = 1, PF_NUM = number of PFs enabled, 11 = $clog2(2048), 3 = bar_num width, BAR_addr_width = 22 bits (H-Tile) / max(BAR_addr_width) (P-Tile)
x16: bam_byteenable_o[63:0] / x8: bam_byteenable_o[31:0]
Output Enables one or more specific byte lanes during transfers on interfaces
bam_burstcount_o[3:0] Output Used by a bursting master to indicate the number of transfers in each burst.
bam_read_o Output Asserted to indicate a read transfer.
x16: bam_readdata_i[511:0] / x8: bam_readdata_i[255:0]
Input Read data from the user logic in response to a read transfer
bam_readdatavalid_i Input When asserted, indicates that the readdata signal contains valid data. For a read burst with burstcount value <n>, the readdatavalid signal must be asserted <n> times, once for each readdata item.
bam_write_o Output Asserted to indicate a write transfer
x16: bam_writedata_o[511:0]x8: bam_writedata_o[255:0]
Output Data for write transfers
bam_waitrequest_i Input When asserted, indicates that theAvalon-MM slave is not ready torespond to a request.
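The bam_address_o width formula in the table above can be checked with a short calculation. The helper name is ours; the arithmetic follows the formula as given.

```python
import math

def bam_address_msb(num_pfs, bar_addr_width=22, vfactive=1):
    """MSB index <n> of bam_address_o[<n>:0]:
    {vfactive + $clog2(PF_NUM) + 11 + 3 + BAR_addr_width} - 1,
    where 11 = $clog2(2048) VF-number bits and 3 = bar_num width."""
    clog2_pfs = math.ceil(math.log2(num_pfs)) if num_pfs > 1 else 0
    return vfactive + clog2_pfs + 11 + 3 + bar_addr_width - 1
```

With 4 PFs enabled and the 22-bit H-Tile BAR address width, the bus is bam_address_o[38:0].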
4.6. Bursting Avalon-MM Slave (BAS) Interface
Table 24. BAS Signals
Signal Name I/O Type Description
bas_vfactive_i Input When asserted, this signal indicates that the AVMM transaction is targeting a virtual function
H-Tile: bas_pfnum_i[1:0] P-Tile: bas_pfnum_i[2:0]
Input Specifies a target PF number
bas_vfnum_i[10:0] Input Specifies a target VF number
bas_address_i[63:0] Input Represents a byte address. The value of address must align to the data width.
x16: bas_byteenable_i[63:0] x8: bas_byteenable_i[31:0]
Input Enables one or more specific byte lanes during transfers on interfaces
bas_burstcount_i[3:0] Input Used by a bursting master to indicate the number of transfers in each burst.
bas_read_i Input Asserted to indicate a read transfer.
x16: bas_readdata_o[511:0] x8: bas_readdata_o[255:0]
Output Read data to the user logic in response to a read transfer
bas_readdatavalid_o Output When asserted, indicates that the readdata signal contains valid data. For a read burst with burstcount value <n>, the readdatavalid signal must be asserted <n> times, once for each readdata item.
bas_write_i Input Asserted to indicate a write transfer
x16: bas_writedata_i[511:0] x8: bas_writedata_i[255:0]
Input Data for write transfers
bas_waitrequest_o Output When asserted, indicates that the Avalon-MM slave is not ready to respond to a request.
4.7. Config Slave Interface (RP only)
Table 25. Config Slave Interface Signals
Signal Name I/O Type Description
cs_address_i[28:0] Input Represents a byte address. The value of address must align to the data width.
cs_byteenable_i[3:0] Input Enables one or more specific byte lanes during transfers on interfaces
cs_read_i Input Asserted to indicate a read transfer.
cs_readdata_o[31:0] Output Read data to the user logic in response to a read transfer
cs_readdatavalid_o Output When asserted, indicates that the readdata signal contains valid data.
cs_write_i Input Asserted to indicate a write transfer
cs_writedata_i[31:0] Input Data for write transfers
cs_writeresponse_valid_o Output Write responses for write commands. When asserted, the value on the response signal is a valid write response.
cs_waitrequest_o Output When asserted, indicates that the Avalon-MM slave is not ready to respond to a request.
cs_response_o[1:0] Output Carries the response status: 00: OKAY - Successful response for a transaction. 01: RESERVED - Encoding is reserved. 10: SLAVEERROR - Error from an endpoint agent; indicates an unsuccessful transaction. 11: DECODEERROR - Indicates attempted access to an undefined location.
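The two-bit response encoding above can be captured in a small lookup. This is an illustrative helper only; the dictionary and function names are ours.

```python
# cs_response_o[1:0] encodings from Table 25
CS_RESPONSE = {
    0b00: "OKAY",         # successful response for a transaction
    0b01: "RESERVED",     # encoding is reserved
    0b10: "SLAVEERROR",   # error from an endpoint agent
    0b11: "DECODEERROR",  # attempted access to an undefined location
}

def decode_cs_response(code):
    """Map a sampled cs_response_o value to its status name."""
    return CS_RESPONSE[code & 0b11]
```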
4.8. Hard IP Reconfiguration Interface
Table 26. Hard IP Reconfiguration Interface
Signal Name I/O Description
usr_hip_reconfig_clk Input Reconfiguration clock. Range: 50 MHz - 125 MHz. Recommended: 100 MHz.
usr_hip_reconfig_readdata_o[7:0] Output Read data out
usr_hip_reconfig_readdatavalid_o Output When asserted, the data on hip_reconfig_readdata[7:0] is valid.
usr_hip_reconfig_write_i Input Write enable
usr_hip_reconfig_read_i Input Read enable
usr_hip_reconfig_address_i[21:0] Input Reconfig register address
usr_hip_reconfig_writedata_i[7:0] Input Write data
usr_hip_reconfig_waitrequest_o Output When asserted, this signal indicates that the IP core is not ready to respond to a request.
4.9. Config TL Interface
Table 27. Config TL Interface Signals
Signal Name I/O Type Description
usr_hip_tl_config_func_o[1:0] Output Specifies the function whose Configuration Space register values are being driven out on the tl_cfg_ctl_o bus.
usr_hip_tl_config_add_o[4:0] Output This address bus contains the index indicating which Configuration Space register information is being driven onto the tl_cfg_ctl_o bus. For detailed information on Config Space registers, refer to Configuration Output Interface (Table 62) of the P-Tile Avalon Streaming Intel FPGA IP for PCI Express User Guide.
usr_hip_tl_config_ctl_o[31:0] Output Multiplexed data output from the register specified by tl_cfg_add_o[4:0]. For detailed information on each field in this bus, refer to Configuration Output Interface (Table 62) of the P-Tile Avalon Streaming Intel FPGA IP for PCI Express User Guide.
4.10. Configuration Intercept Interface (EP Only)
For detailed information about this interface, refer to the P-Tile Avalon Streaming Intel FPGA IP for PCI Express User Guide (Chapter 4, Section 4.11).
Related Information
P-Tile Avalon Streaming Intel FPGA IP for PCI Express User Guide, Chapter 4, Section 4.11
5. Parameters (H-Tile)
This chapter provides a reference for all the H-Tile parameters of the Multi Channel DMA IP for PCI Express.
Table 28. Design Environment Parameter
Starting in Intel Quartus Prime 18.0, there is a new parameter Design Environment in the parameter editor window.
Parameter Value Description
Design Environment
Standalone / System
Identifies the environment that the IP is in.
• The Standalone environment refers to the IP being in a standalone state where all its interfaces are exported.
• The System environment refers to the IP being instantiated in a Platform Designer system.
5.1. IP Settings
5.1.1. System Settings
Figure 14. Multi Channel DMA IP for PCI Express Parameter Editor
Table 29. System Settings
Parameter Value Description
Hard IP mode Gen3x16, 512-bit interface, 250 MHz Selects the following elements:
• The lane data rate. Gen3 is supported
• The Application Layer interface frequency
Intel Corporation. All rights reserved. Intel, the Intel logo, and other Intel marks are trademarks of IntelCorporation or its subsidiaries. Intel warrants performance of its FPGA and semiconductor products to currentspecifications in accordance with Intel's standard warranty, but reserves the right to make changes to anyproducts and services at any time without notice. Intel assumes no responsibility or liability arising out of theapplication or use of any information, product, or service described herein except as expressly agreed to inwriting by Intel. Intel customers are advised to obtain the latest version of device specifications before relyingon any published information and before placing orders for products or services.*Other names and brands may be claimed as the property of others.
ISO9001:2015Registered
The width of the data interface between the hard IP Transaction Layer and the Application Layer implemented in the FPGA fabric.
Port type Native Endpoint Specifies the port type.
Enable multiple physical functions
Off This parameter is not available for H-Tile.
5.1.2. MCDMA Settings
Figure 15. MCDMA Settings Parameters
Table 30. MCDMA Settings
Parameter Value Description
Enable PIPE PHY Interface On/Off PIPE PHY Interface is for simulation only. This should be enabled for example design generation. Default: On
PIO BAR2 Address Width 4 Mbytes – 22 bits Address width for the PIO AVMM port. Default address width is 22 bits
User Mode Multi channel DMA / Bursting Master / Bursting Slave / BAM-BAS / BAM-MCDMA
This option allows the user to configure the mode of operation for the MCDMA IP. MCDMA mode has the DMA functionality. BAM and BAS offer Bursting Master and Slave AVMM capabilities without DMA functionality
Interface type AVMM / AVST
User logic interface type for D2HDM and H2DDM. Default: Avalon-MM Interface
Number of ports 1 In the Intel Quartus Prime 21.1 release, the number of Avalon-ST/Avalon-MM ports is fixed.
Enable User-MSIX On / Off User MSI-X enables the user application to initiate interrupts through MCDMA. This option is available only if the user selects MCDMA mode.
Enable User-FLR On / Off The User FLR interface allows passing of FLR signals to the user-side application.
D2H Prefetch channels 64 Sets the D2H Active channels
Maximum Descriptor Fetch 16 Sets the max Descriptor fetch
Enable Metadata On / Off Enable Metadata
Total number of Dynamic channels allocated
64 Sets the number of DMA channels allocated if dynamic channel allocation is enabled.
Note: This feature is not supported in the Intel Quartus Prime 21.1 release. An Intel Quartus Prime software patch will be released to hide this feature.
Enable config slave Off This parameter is not available for the21.1 release
5.1.3. Device Identification Registers
The following table lists the default values of the read-only registers in the PCI Configuration Header Space. You can use the parameter editor to set the values of these registers.
You can specify Device ID registers for each Physical Function.
Table 31. PCIe0 Device Identification Registers
Parameter Value Description
Vendor ID 0x00001172 Sets the read-only value of the Vendor ID register. This parameter cannot be set to 0xFFFF per the PCI Express Base Specification. Address offset: 0x000.
Device ID 0x00000000 Sets the read-only value of the Device ID register. Address offset: 0x000.
Revision ID 0x00000001 Sets the read-only value of the Revision ID register. Address offset: 0x008.
Class Code 0x00ff0000 Sets the read-only value of the Class Code register. You must set this register to a non-zero value to ensure correct operation. Address offset: 0x008.
Subsystem Vendor ID 0x00000000 Address offset: 0x02C.
Sets the read-only value of the Subsystem Vendor ID register in the PCI Type 0 Configuration Space. This parameter cannot be set to 0xFFFF per the PCI Express Base Specification. This value is assigned by PCI-SIG to the device manufacturer. This value is only used in Root Port variants.
Subsystem Device ID 0x00000000 Sets the read-only value of the Subsystem Device ID register in the PCI Type 0 Configuration Space. This value is only used in Root Port variants. Address offset: 0x02C.
5.1.4. Multifunction and SR-IOV System Settings Parameters
Figure 16. Multifunction and SR-IOV System Settings Parameters
Table 32. PCIe0 Multifunction and SR-IOV System settings
Parameter Value Description
Total physical functions (PFs) 1-4 Sets the number of physical functions
Enable SR-IOV support On / Off Enable SR-IOV support
Number of DMA channels allocated to PF0
4 Number of DMA channels between the host and device PF Avalon-ST / Avalon-MM ports. For the Avalon-ST interface type, only 1 channel per port is supported. For the Avalon-MM interface type, up to 8 channels are supported.
Number of DMA channels allocated to each VF in PF0
0 Note: This parameter is not available for the Intel Quartus Prime 21.1 release.
5.1.5. Configuration, Debug and Extension Options
Table 33. PCIe0 Configuration, Debug and Extension Options
Parameter Value Description
Enable HIP dynamic reconfiguration of PCIe read-only registers
On / Off When on, creates an Avalon-MM slave interface that software can drive to update global configuration registers which are read-only at run time.
Enable transceiver dynamic reconfiguration
On / Off When on, creates an Avalon-MM slave interface that software can drive to update transceiver reconfiguration registers
Enable Native PHY, LCPLL, and fPLL ADME for Toolkit
On / Off When on, Native PHY, ATX PLL, and fPLL ADME are enabled for Transceiver Toolkit. You must enable transceiver dynamic reconfiguration before enabling ADME.
Enable PCIe Link Inspector On / Off When on, the PCIe Link Inspector is enabled. You must enable HIP dynamic reconfiguration, transceiver dynamic reconfiguration, and ADME for Toolkit to use the PCIe Link Inspector.
Enable PCIe Link Inspector AVMM Interface
Off When on, the PCIe Link Inspector AVMM interface is exported, and a JTAG to Avalon Bridge IP instantiation is included in the example design generation for debug.
5.1.6. PHY Characteristics
Table 34. PHY Characteristics
Parameter Value Description
Gen2 TX de-emphasis
3.5 dB / 6 dB
Specifies the transmit de-emphasis for Gen2. Intel recommends the following settings:
• 3.5 dB: Short PCB traces
• 6.0 dB: Long PCB traces
VCCR/VCCT supply voltage for the transceiver
1_1V / 1_0V
Allows you to report the voltage supplied by the board for the transceivers.
5.1.7. PCI Express / PCI Capabilities Parameters
This group of parameters defines various capability properties of the IP core. Some of these parameters are stored in the PCI Configuration Space - PCI Compatible Configuration Space. The byte offset indicates the parameter address.
5.1.7.1. Device
Figure 17. Device Table
Table 35. Device
Parameter Possible Values Default Value Address Description
Maximum payload sizes supported
512 bytes (Note: value is fixed at 512 bytes)
512 bytes 0x074 Specifies the maximum payload size supported. This parameter sets the read-only value of the max payload size supported field of the Device Capabilities register.
5.1.7.2. Link
Table 36. Link Table
Parameter Value Description
Link port number (Root Port only)
0x01 Sets the read-only value of the port number field in the Link Capabilities register. This parameter is for Root Ports only. It should not be changed.
Slot clock configuration
On/Off When you turn this option On, it indicates that the Endpoint uses the same physical reference clock that the system provides on the connector. When Off, the IP core uses an independent clock regardless of the presence of a reference clock on the connector. This parameter sets the Slot Clock Configuration bit (bit 12) in the PCI Express Link Status register.
5.1.7.3. MSI-X
Note: The parameters in this feature are not available to set or modify for the Intel Quartus Prime 21.1 release.
5.1.7.4. Power Management
Table 37. Power Management Parameters
Parameter Value Description
Endpoint L0s acceptable latency
Maximum of 64 ns / Maximum of 128 ns / Maximum of 256 ns / Maximum of 512 ns / Maximum of 1 us / Maximum of 2 us / Maximum of 4 us / No limit
This design parameter specifies the maximum acceptable latency that the device can tolerate to exit the L0s state for any links between the device and the root complex. It sets the read-only value of the Endpoint L0s acceptable latency field of the Device Capabilities Register (0x084). This Endpoint does not support the L0s or L1 states. However, in a switched system there may be links connected to switches that have L0s and L1 enabled. This parameter is set to allow system configuration software to read the acceptable latencies for all devices in the system and the exit latencies for each link to determine which links can enable Active State Power Management (ASPM). This setting is disabled for Root Ports. The default value of this parameter is 64 ns. This is a safe setting for most designs.
Endpoint L1 acceptable latency
Maximum of 1 us / Maximum of 2 us / Maximum of 4 us / Maximum of 8 us / Maximum of 16 us / Maximum of 32 us / Maximum of 64 us / No limit
This value indicates the acceptable latency that an Endpoint can withstand in the transition from the L1 to L0 state. It is an indirect measure of the Endpoint's internal buffering. It sets the read-only value of the Endpoint L1 acceptable latency field of the Device Capabilities Register. This Endpoint does not support the L0s or L1 states. However, a switched system may include links connected to switches that have L0s and L1 enabled. This parameter is set to allow system configuration software to read the acceptable latencies for all devices in the system and the exit latencies for each link to determine which links can enable Active State Power Management (ASPM). This setting is disabled for Root Ports. The default value of this parameter is 1 µs. This is a safe setting for most designs.
The Intel Stratix 10 Avalon-ST Hard IP for PCI Express and Intel Stratix 10 Avalon-MM Hard IP for PCI Express do not support the L1 or L2 low power states. If the link ever gets into these states, performing a reset (by asserting pin_perst, for example) allows the IP core to exit the low power state and the system to recover.
These IP cores also do not support the in-band beacon or sideband WAKE# signal, which are mechanisms to signal a wake-up event to the upstream device.
5.1.7.5. Vendor Specific Extended Capability (VSEC)
Table 38. VSEC
Parameter Value Description
User ID register from the Vendor Specific Extended Capability
Custom value Sets the read-only value of the 16-bit User ID register from the Vendor Specific Extended Capability. This parameter is only valid for Endpoints.
5.2. Example Designs
Table 39. Example Designs
Parameter Value Description
Currently Selected Example Design
PIO using MQDMA Bypass mode (default)
(For AVST Interface type only)
• Device-side Packet loopback
• Packet Generate/Check
(For AVMM Interface type only)
AVMM DMA
Select an example design available from the pull-down list. The Avalon-ST/Avalon-MM Interface type setting determines the available example designs.
Simulation On/Off When On, the generated output includes a simulation model.
Select simulation Root Complex BFM
Third-party BFM / Intel FPGA BFM
Choose the appropriate BFM for simulation. Intel FPGA BFM: Default. This bus functional model (BFM) supports x16 configurations by downtraining to x8. Third-party BFM: Select this if you want to simulate all 16 lanes using a third-party BFM.
Synthesis On/Off When On, the generated output includes a synthesis model.
Generated HDL format
Verilog/VHDL Only Verilog HDL is available in the current release.
Target Development Kit
None / Intel Stratix 10 GX H-Tile Development Kit / Intel Stratix 10 MX H-Tile Development Kit
Select the appropriate development board. If you select one of the development boards, system generation overwrites the device you selected with the device on that development board.
Note: If you select None, system generation does not make any pin assignments. You must make the assignments in the .qsf file.
Note: For more information about example designs, refer to the PCIe Multi-Channel Direct Memory Access IP for H-Tile Design Example User Guide.
Related Information
Multi Channel DMA for PCI Express IP Design Example User Guide
6. Parameters (P-Tile)
This chapter provides a reference for all the P-Tile parameters of the Multi Channel DMA IP for PCI Express.
Table 40. Design Environment Parameter
Starting in Intel Quartus Prime 18.0, there is a new parameter Design Environment in the parameter editor window.
Parameter Value Description
Design Environment
Standalone / System
Identifies the environment that the IP is in.
• The Standalone environment refers to the IP being in a standalone state where all its interfaces are exported.
• The System environment refers to the IP being instantiated in a Platform Designer system.
6.1. IP Settings
6.1.1. Top-Level Settings
Figure 18. Multi Channel DMA IP for PCI Express Parameter Editor
Table 41. Top-Level Settings
Parameter Value Description
Hard IP mode Gen4x16, Interface – 512 bit / Gen3x16, Interface – 512 bit / Gen4x8, Interface – 256 bit / Gen3x8, Interface – 256 bit
Selects the following elements:
• The lane data rate. Gen3 and Gen4 are supported
• The Application Layer interface frequency
The width of the data interface between the hard IP Transaction Layer and the Application Layer implemented in the FPGA fabric.
Port type Native Endpoint / Root Port Specifies the port type.
Enable PHY Reconfiguration
On / Off When on, creates an Avalon-MM slave interface that software can drive to update transceiver reconfiguration registers
PLD Clock Frequency
400 MHz / 350 MHz
Select the frequency of the Application clock. The options available vary depending on the setting of the Hard IP Mode parameter. For Gen4 modes, the available clock frequencies are 500 MHz / 400 MHz / 350 MHz (for Intel Agilex) and 400 MHz / 350 MHz (for Intel Stratix 10 DX). For Gen3 modes, the available clock frequency is 250 MHz (for Intel Agilex and Intel Stratix 10 DX).
Enable SRIS Mode True/False Enable the Separate Reference Clock with Independent Spread Spectrum Clocking (SRIS) feature. Default: False
P-Tile Sim Mode True/False Enabling this parameter reduces the simulation time of Hot Reset tests by 5 ms. Default: False
Note: Do not enable this option if you need to run synthesis.
Enable RST of PCS & Controller
True/False Default: False. Enables the reset of PCS and Controller in User Mode for Endpoint and Bypass Upstream modes. When this parameter is True, depending on the topology, new signals (p_pld_clrpcs_n) are exported to the Avalon Streaming interface. When this parameter is False (default), the IP internally ties off these signals instead of exporting them.
Note: This parameter is required for the independent reset feature, which is only supported in the x8x8 Endpoint/Endpoint or Bypass Upstream/Bypass Upstream mode and is scheduled to be supported in a future release of the P-Tile Avalon Streaming Intel FPGA IP for PCI Express.
Note: If you have more questions regarding the independent reset feature and its usage, contact your Intel Field Application Engineer.
6.2. PCIe0 Settings
6.2.1. PCIe0 Multifunction and SR-IOV System settings
Figure 19. PCIe0 Multifunction and SR-IOV System settings parameters
Table 42. PCIe0 Multifunction and SR-IOV System settings
Parameter Value Description
Enable SR-IOV support On / Off Enable SR-IOV support
Number of DMA channels allocated to PF0
4 Number of DMA channels between the host and device PF0 Avalon-ST / Avalon-MM ports. For the Avalon-ST interface type, only 1 channel per port is supported. For the Avalon-MM interface type, up to 8 channels are supported.
Number of DMA channels allotted to each VF in PF0
0 Number of DMA channels allotted to each VF in PF0
6.2.2. PCIe0 Configuration, Debug and Extension Options
Table 43. PCIe0 Configuration, Debug and Extension Options
Parameter Value Description
Gen 3 Requested equalization far-end TX preset vector
0x00000004 Specifies the Gen 3 requested phase 2/3 far-end TX preset vector. Choosing a value different from the default is not recommended for most designs.
Gen 4 Requested equalization far-end TX preset vector
0x00000270 Specifies the Gen 4 requested phase 2/3 far-end TX preset vector. Choosing a value different from the default is not recommended for most designs.
Enable Debug Toolkit On / Off Note: This parameter is not available for Intel Quartus Prime 21.1.
Enable HIP Reconfig interface On / Off Enables the HIP reconfiguration interface
6.2.3. PCIe0 Device Identification Registers
Table 44. PCIe0 Device Identification Registers
Parameter Value Description
Vendor ID 0x00001172 Sets the read-only value of the Vendor ID register. This parameter cannot be set to 0xFFFF per the PCI Express Base Specification. Address offset: 0x000.
Device ID 0x00000000 Sets the read-only value of the Device ID register. Address offset: 0x000.
Revision ID 0x00000001 Sets the read-only value of the Revision ID register. Address offset: 0x008.
Class Code 0x00ff0000 Sets the read-only value of the Class Code register. You must set this register to a non-zero value to ensure correct operation. Address offset: 0x008.
Subsystem Vendor ID 0x00000000 Sets the read-only value of the Subsystem Vendor ID register in the PCI Type 0 Configuration Space. This parameter cannot be set to 0xFFFF per the PCI Express Base Specification. This value is assigned by PCI-SIG to the device manufacturer. This value is only used in Root Port variants. Address offset: 0x02C.
Subsystem Device ID 0x00000000 Sets the read-only value of the Subsystem Device ID register in the PCI Type 0 Configuration Space. This value is only used in Root Port variants. Address offset: 0x02C.
6.2.4. MCDMA Settings
Figure 20. MCDMA Settings Parameters
Table 45. MCDMA Settings
Parameter Value Description
BAR2 Address Width 4 Mbytes – 22 bits Address width for the PIO AVMM port. Default address width is 22 bits
User Mode Multi channel DMA / Bursting Master / Bursting Slave / BAM-BAS / BAM-MCDMA
This option allows the user to configure the mode of operation for the MCDMA IP. MCDMA mode has the DMA functionality. BAM and BAS offer Bursting Master and Slave AVMM capabilities without DMA functionality
Interface type AVMM / AVST
User logic interface type for D2HDM and H2DDM. Default: Avalon-MM Interface
Number of ports 1 If Interface Type = AVMM, Value = 1. If Interface Type = AVST, Value = 1 or 4.
Enable User-MSIX On / Off User MSI-X enables the user application to initiate interrupts through MCDMA. This option is available only if the user selects MCDMA mode.
Enable User-FLR On / Off The User FLR interface allows passing of FLR signals to the user-side application
D2H Prefetch channels 64 Sets the D2H Active channels. For the P-Tile AVMM interface the D2H Prefetch channels value is fixed to 64; for AVST the options are 8, 16, 32, 64.
Maximum Descriptor Fetch 16 Sets the max Descriptor fetch
For the P-Tile AVMM interface the Maximum Descriptor Fetch is fixed to 64; for AVST the options are 16, 32, 64.
Enable Metadata On / Off Enables Metadata
Note: This feature is not available for P-Tile AVMM.
Enable Dynamic Channel Allocation On / Off Enables Dynamic Channel Allocation
Note: This feature is not supported in the Intel Quartus Prime 21.1 release. An Intel Quartus Prime software patch will be released to hide this feature.
Total number of Dynamic channels allocated
64 Sets the number of DMA channels allocated if dynamic channel allocation is enabled.
Note: This feature is not supported in the Intel Quartus Prime 21.1 release. An Intel Quartus Prime software patch will be released to hide this feature.
Enable Configuration Intercept Interface
On / Off Select to enable the configuration intercept interface.
6.2.5. PCIe0 TPH/ATS for Physical Functions
Table 46. PCIe0 TPH/ATS for Physical Functions
Parameter Value Description
Enable Address Translation Services (ATS)
Off Enable or disable the Address Translation Services (ATS) capability. When ATS is enabled, senders can request and cache translated addresses using the RP memory space for later use.
Enable TLP Processing Hints (TPH) Off Enable or disable the TLP Processing Hints (TPH) capability. Using TPH may improve the latency performance and reduce traffic congestion.
6.2.6. PCIe0 PCI Express / PCI Capabilities Parameters
This group of parameters defines various capability properties of the IP core. Some of these parameters are stored in the PCI Configuration Space - PCI Compatible Configuration Space. The byte offset indicates the parameter address.
6.2.6.1. PCIe0 Device Capabilities
Table 47. PCIe0 Device
Parameter Value Description
Maximum payload size supported 512 Bytes / 256 Bytes / 128 Bytes
Specifies the maximum payload size supported. This parameter sets the read-only value of the max payload size supported field of the Device Capabilities registers.
Support Extended Tag Field On / Off Sets the Extended Tag Field Supported bit in the Configuration Space Device Capabilities Register.
Note: This parameter is not available for modification in the Intel Quartus Prime 21.1 release.
Enable multiple physical functions On / Off Enables multiple physical functions.
6.2.6.2. PCIe0 Link Capabilities
Table 48. PCIe0 Link Capabilities
Parameter Value Description
Link port number (Root Port only)
0x01 Sets the read-only value of the port number field in the Link Capabilities register. This parameter is for Root Ports only. It should not be changed.
Slot clock configuration
On/Off When you turn this option On, it indicates that the Endpoint uses the same physical reference clock that the system provides on the connector. When Off, the IP core uses an independent clock regardless of the presence of a reference clock on the connector. This parameter sets the Slot Clock Configuration bit (bit 12) in the PCI Express Link Status register.
6.2.6.3. PCIe0 MSI-X Capabilities
Table 49. PCIe0 MSI-X Capabilities
Parameter Value Description
Enable MSI-X On / Off When On, adds the MSI-X capability structure, with the parameters shown below.
Note: This parameter is not available for modification in the Intel Quartus Prime 21.1 release.
Table size 15 System software reads this field to determine the MSI-X table size <n>, which is encoded as <n-1>.
Table offset 0x0000000000020000 Points to the base of the MSI-X table. The lower 3 bits of the table BAR indicator (BIR) are set to zero by software to form a 64-bit qword-aligned offset. This field is read-only after being programmed.
Table BAR indicator 0 Specifies which one of a function's base address registers, located beginning at 0x10 in the Configuration Space, maps the MSI-X table into memory space. This field is read-only.
Pending bit array (PBA) offset 0x0000000000030000 Used as an offset from the address contained in one of the function's Base Address registers to point to the base of the MSI-X PBA. The lower 3 bits of the PBA BIR are set to zero by software to form a 32-bit qword-aligned offset. This field is read-only after being programmed.
PBA BAR indicator 0 Specifies the function Base Address registers, located beginning at 0x10 in Configuration Space, that maps the MSI-X PBA into memory space. This field is read-only in the MSI-X Capability Structure.
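How software combines the BAR base with the table and PBA offsets above can be sketched as follows. The function name and addresses are illustrative only.

```python
def msix_region_base(bar_base, offset_field):
    """Effective base of the MSI-X table or PBA: the BAR base plus the
    offset field with its low 3 bits (the BIR) masked to zero."""
    return bar_base + (offset_field & ~0x7)
```

For example, with the indicated BAR mapped at 0x8000_0000 and the default table offset 0x20000, the MSI-X table starts at 0x8002_0000 and, with the default PBA offset 0x30000, the PBA starts at 0x8003_0000.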
6.2.6.4. PCIe0 DEV SER
Table 50. PCIe0 DEV SER
Parameter Value Description
Enable Device Serial Number Capability
On / Off Enables the Device Serial Number Capability (DEV SER)
6.2.6.5. PCIe0 PRS
Table 51. PCIe0 PRS
Parameter Value Description
PF0 Enable PRS On / Off Enable or disable the Page Request Service (PRS) capability.
6.2.6.6. Slot Capabilities
Table 52. Slot Capabilities
Parameter Value Description
Use Slot register On/Off This parameter is only supported in Root Port mode. The slot capability is required for Root Ports if a slot is implemented on the port. Slot status is recorded in the PCI Express Capabilities register. Defines the characteristics of the slot. You turn on this option by selecting Enable slot capability. Refer to the figure below for bit definitions.
Slot power scale 0-3 Specifies the scale used for the Slot power limit. The following coefficients are defined:
• 0 = 1.0x
• 1 = 0.1x
• 2 = 0.01x
• 3 = 0.001x
The default value prior to hardware and firmware initialization is b'00. Writes to this register also cause the port to send the Set_Slot_Power_Limit Message.
Refer to Section 6.9 of the PCI Express Base Specification Revision for more information.
Slot power limit 0-255 In combination with the Slot power scale value, specifies the upper limit in watts on power supplied by the slot. Refer to Section 7.8.9 of the PCI Express Base Specification for more information.
Slot number 0-8191 Specifies the slot number.
Figure 21. Slot Capability
The figure shows the Slot Capability register bit fields: Physical Slot Number [31:19], No Command Completed Support [18], Electromechanical Interlock Present [17], Slot Power Limit Scale [16:15], Slot Power Limit Value [14:7], Hot-Plug Capable [6], Hot-Plug Surprise [5], Power Indicator Present [4], Attention Indicator Present [3], MRL Sensor Present [2], Power Controller Present [1], Attention Button Present [0].
6.2.6.7. Power Management
Table 53. Power Management Parameters
Parameter Value Description
Endpoint L1 acceptable latency
Maximum of 1 us / Maximum of 2 us / Maximum of 4 us / Maximum of 8 us / Maximum of 16 us / Maximum of 32 us / Maximum of 64 us / No limit
This value indicates the acceptable latency that an Endpoint can withstand in the transition from the L1 to L0 state. It is an indirect measure of the Endpoint's internal buffering. It sets the read-only value of the Endpoint L1 acceptable latency field of the Device Capabilities Register. This Endpoint does not support the L0s or L1 states. However, a switched system may include links connected to switches that have L0s and L1 enabled. This parameter is set to allow system configuration software to read the acceptable latencies for all devices in the system and the exit latencies for each link to determine which links can enable Active State Power Management (ASPM). This setting is disabled for Root Ports. The default value of this parameter is 1 µs. This is a safe setting for most designs.
The Intel Stratix 10 Avalon-ST Hard IP for PCI Express and Intel Stratix 10 Avalon-MM Hard IP for PCI Express do not support the L1 or L2 low power states. If the link ever gets into these states, performing a reset (by asserting pin_perst, for example) allows the IP core to exit the low power state and the system to recover.

These IP cores also do not support the in-band beacon or sideband WAKE# signal, which are mechanisms to signal a wake-up event to the upstream device.
6.2.6.8. Vendor Specific Extended Capability (VSEC)
Table 54. VSEC
Parameter Value Description
User ID register from the Vendor Specific Extended Capability
Custom value Sets the read-only value of the 16-bit User ID register from the Vendor Specific Extended Capability. This parameter is only valid for Endpoints.
6.3. Example Designs
Table 55. Example Designs
Parameter Value Description
Simulation On / Off When the Simulation box is checked, all necessary filesets required for simulation are generated. When this box is not checked, filesets required for simulation are not generated; instead, a qsys example design system is generated.
Synthesis On / Off When the Synthesis box is checked, all necessary filesets required for synthesis are generated. When this box is not checked, filesets required for synthesis are not generated; instead, a qsys example design system is generated.
Generated file format Verilog HDL format
Current development kit None / Intel Stratix 10 DX P-Tile ES1 FPGA Development Kit / Intel Agilex F-Series P-Tile ES0 FPGA Development Kit
This option provides support for the Development Kits listed. Details of Intel FPGA Development Kits can be found on the Intel FPGA website. If this menu is grayed out, it is because no board is supported for the options selected (for example, synthesis deselected). If an Intel FPGA Development board is selected, the Target Device used for generation is the one that matches the device on the Development Kit.
Currently Selected Example Design PIO using MQDMA Bypass mode / Device-side Packet loopback / Packet Generate/Check
Based on the parameterization, you can select the appropriate example design.
Note: For more information about example designs, refer to the Multi-Channel Direct Memory Access IP for PCI Express Design Example User Guide.
Related Information
Multi Channel DMA for PCI Express IP Design Example User Guide
7. Designing with the IP Core
7.1. Generating the IP Core
You can use the Intel Quartus Prime Pro Edition IP Catalog or Platform Designer to define and generate a Multi Channel DMA IP for PCI Express custom component.

Follow the steps shown in the figure below to generate a custom Multi Channel DMA IP for PCI Express component.
Figure 22. IP Generation Flowchart
You can select Multi Channel DMA IP for PCI Express in the Intel Quartus Prime Pro Edition IP Catalog or Platform Designer as shown below.
Figure 23. Intel Quartus Prime Pro Edition IP Catalog (with filter applied)
Intel Corporation. All rights reserved. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Intel warrants performance of its FPGA and semiconductor products to current specifications in accordance with Intel's standard warranty, but reserves the right to make changes to any products and services at any time without notice. Intel assumes no responsibility or liability arising out of the application or use of any information, product, or service described herein except as expressly agreed to in writing by Intel. Intel customers are advised to obtain the latest version of device specifications before relying on any published information and before placing orders for products or services. *Other names and brands may be claimed as the property of others.
ISO 9001:2015 Registered
Figure 24. Intel Quartus Prime Pro Edition IP Catalog (with filter applied)
Figure 25. Platform Designer IP Catalog (with filter applied)
7.2. Simulating the IP Core
The Intel Quartus Prime Pro Edition software optionally generates a functional simulation model, a testbench or design example, and vendor-specific simulator setup scripts when you generate your parameterized Multi Channel DMA for PCI Express IP core. For Endpoints, the generation creates a Root Port BFM. There is no support for Root Ports in this release of the Intel Quartus Prime Pro Edition.

To enable IP simulation model generation, set Create simulation model to Verilog or VHDL when you generate HDL:
Figure 26. Multi Channel DMA IP for PCI Express Simulation in Intel Quartus Prime Pro Edition
The Intel Quartus Prime Pro Edition supports the following simulators.
Table 56. Supported Simulators (H-Tile)
Vendor Simulator Version Platform
Aldec Active-HDL 11.1 Windows
Aldec Riviera-PRO 2019.10 Windows, Linux
Cadence Xcelium Parallel Simulator 20.09 Linux
Mentor Graphics ModelSim PE 2020.4 Windows
Mentor Graphics ModelSim SE 2020.4 Windows, Linux
Mentor Graphics QuestaSim 2020.4 Windows, Linux
Synopsys VCS/VCS MX Q-2020.03-SP2 Linux
Table 57. Supported Simulators (P-Tile)
Vendor Simulator Version Platform
Mentor Graphics Questa Advanced Simulator 2020.4 (64-bit only) Windows, Linux
Synopsys VCS/VCS MX Q-2020.03-SP2 (64-bit only) Linux
Note: The Intel testbench and Root Port BFM provide a simple method to do basic testing of the Application Layer logic that interfaces to the PCIe IP variation. This BFM allows you to create and run simple task stimuli with configurable parameters to exercise basic functionality of the example design. The testbench and Root Port BFM are not intended to be a substitute for a full verification environment. Corner cases and certain traffic profile stimuli are not covered. To ensure the best verification coverage possible, Intel strongly recommends that you obtain commercially available PCIe verification IP and tools, or do your own extensive hardware testing, or both.
Related Information
• Introduction to Intel FPGA IP Cores
• Simulating Intel FPGA IP Cores
• Simulation Quick-Start
• Multi Channel DMA for PCI Express Design Example User Guide
7.3. IP Core Generation Output - Intel Quartus Prime Pro Edition
The Intel Quartus Prime Pro Edition software generates the following output file structure for individual IP cores that are not part of a Platform Designer system.
Figure 27. Individual IP Core Generation Output (Intel Quartus Prime Pro Edition)

<Project Directory>
  <your_ip>.ip - Top-level IP variation file
  <your_ip>_inst.v or .vhd - Lists file for IP core synthesis
  <your_ip>.qip - Lists files for IP core synthesis
  <your_ip>.cmp - VHDL component declaration *
  <your_ip>_bb.v - Verilog HDL black box EDA synthesis file
  <your_ip>_generation.rpt - IP generation report
  <your_ip>.bsf - Block symbol schematic file
  <your_ip>.ppf - XML I/O pin information file
  <your_ip>.spd - Simulation startup scripts
  <your_ip>.qgsimc - Simulation caching file (Platform Designer)
  <your_ip>.qgsynthc - Synthesis caching file (Platform Designer)
  <your_ip> - IP core variation files
    sim - IP simulation files
      <your_ip>.v or .vhd - Top-level simulation file
      <simulator vendor> - Simulator setup scripts
    synth - IP synthesis files
      <your_ip>.v or .vhd - Top-level IP synthesis file
    <IP Submodule>_<version> - IP Submodule Library
      sim - IP submodule 1 simulation files
      synth - IP submodule 1 synthesis files
  <your_ip>_tb - IP testbench system *
    <your_testbench>_tb.qsys - testbench system file
    <your_ip>_tb - IP testbench files
      <your_testbench>_tb.csv or .spd - testbench file
      sim - IP testbench simulation files

* If supported and enabled for your IP core variation.
Table 58. Output Files of Intel FPGA IP Generation
File Name Description
<your_ip>.ip Top-level IP variation file that contains the parameterization of an IP core in your project. If the IP variation is part of a Platform Designer system, the parameter editor also generates a .qsys file.
<your_ip>.cmp The VHDL Component Declaration (.cmp) file is a text file that contains local generic and port definitions that you use in VHDL design files.
<your_ip>_generation.rpt IP or Platform Designer generation log file. Displays a summary of the messages during IP generation.
<your_ip>.qgsimc (Platform Designer systems only)
Simulation caching file that compares the .qsys and .ip files with the current parameterization of the Platform Designer system and IP core. This comparison determines if Platform Designer can skip regeneration of the HDL.
<your_ip>.qgsynthc (Platform Designer systems only)
Synthesis caching file that compares the .qsys and .ip files with the current parameterization of the Platform Designer system and IP core. This comparison determines if Platform Designer can skip regeneration of the HDL.
<your_ip>.qip Contains all information to integrate and compile the IP component.
<your_ip>.csv Contains information about the upgrade status of the IP component.
<your_ip>.bsf A symbol representation of the IP variation for use in Block Diagram Files(.bdf).
<your_ip>.spd Input file that ip-make-simscript requires to generate simulation scripts. The .spd file contains a list of files you generate for simulation, along with information about memories that you initialize.
<your_ip>.ppf The Pin Planner File (.ppf) stores the port and node assignments for IP components you create for use with the Pin Planner.
<your_ip>_bb.v Use the Verilog blackbox (_bb.v) file as an empty module declaration for use as a blackbox.
<your_ip>_inst.v or _inst.vhd HDL example instantiation template. Copy and paste the contents of this file into your HDL file to instantiate the IP variation.
<your_ip>.regmap If the IP contains register information, the Intel Quartus Prime software generates the .regmap file. The .regmap file describes the register map information of master and slave interfaces. This file complements the .sopcinfo file by providing more detailed register information about the system. This file enables register display views and user customizable statistics in System Console.
<your_ip>.svd Allows HPS System Debug tools to view the register maps of peripherals that connect to HPS within a Platform Designer system. During synthesis, the Intel Quartus Prime software stores the .svd files for slave interfaces visible to the System Console masters in the .sof file in the debug session. System Console reads this section, which Platform Designer queries for register map information. For system slaves, Platform Designer accesses the registers by name.
<your_ip>.v <your_ip>.vhd HDL files that instantiate each submodule or child IP core for synthesis or simulation.
/mentor/ Contains a msim_setup.tcl script to set up and run a ModelSim simulation.
/aldec/ Contains a Riviera-PRO script rivierapro_setup.tcl to set up and run a simulation.
/synopsys/vcs/
/synopsys/vcsmx/
Contains a shell script vcs_setup.sh to set up and run a VCS simulation. Contains a shell script vcsmx_setup.sh and a synopsys_sim.setup file to set up and run a VCS MX simulation.
/cadence/ Contains a shell script ncsim_setup.sh and other setup files to set up and run an NCSIM simulation.
/submodules/ Contains HDL files for the IP core submodule.
/<IP submodule>/ Platform Designer generates /synth and /sim sub-directories for each IP submodule directory that Platform Designer generates.
7.4. Systems Integration and Implementation
7.4.1. Required Supporting IP
Intel Stratix 10 and Intel Agilex devices use a parallel, sector-based architecture that distributes the core fabric logic across multiple sectors. Device configuration proceeds in parallel, with each Local Sector Manager (LSM) configuring its own sector. Consequently, FPGA registers and core logic are not released from reset at exactly the same time, as was the case in previous families.

In order to keep application logic held in the reset state until the entire FPGA fabric is in user mode, Intel Stratix 10 and Intel Agilex devices require you to include the Intel Stratix 10 Reset Release IP.

Refer to the Multi Channel DMA for PCI Express IP design example to see how the Reset Release IP is connected with the Multi Channel DMA for PCI Express IP component.
Related Information
AN 891: Using the Reset Release Intel FPGA IP
8. Software Programming Model

The Multi Channel DMA IP for PCI Express Linux software consists of the following components:
• Test Application (perfq_app)
• User Space Library and API (libmqdma)
• User Space Driver (vfio-pci, uio)
• Kernel Framework (ifc_uio)
The software files are created in the Multi Channel DMA IP for PCI Express design example project folder when you generate a Multi Channel DMA IP for PCI Express design example from the IP Parameter Editor as shown below. The software configuration is specific to the example design generated by Intel Quartus Prime.
Figure 28. Software Folder Structure
Name
kernel
user
dpdk
readme
8.1. Multi Channel DMA Custom Driver
8.1.1. Architecture
The figure below shows the software architecture block diagram of the MCDMA custom driver.
Figure 29. Block Level Software Architecture
The figure shows the host user space (application and McDMA library), kernel space (ifc_uio.ko or vfio-pci, plus the IOMMU and hypervisor), the DMA channels, and the PCI Express Bridge (Root Complex). A guest VM with its own application/container, McDMA library, and vfio-pci driver attaches through the PF and VFs.
In the above block diagram, dotted lines represent the memory-mapped I/O interface. The other two lines represent read and write operations triggered by the device.
The Multi Channel DMA IP for PCI Express supports the following kernel-based modules to expose the device to user space.
• vfio-pci
• UIO
These drivers do not perform any device management; they indicate to the Operating System (OS) that the devices are being used by user space so that the OS does not perform any action (e.g., scanning the device) on them.
vfio-pci
This is a secure kernel module provided by the kernel distribution. This module allows you to program the I/O Memory Management Unit (IOMMU). The IOMMU is the hardware that helps ensure memory safety in user space drivers. If you are using Single Root I/O Virtualization (SR-IOV), you can load vfio-pci and bind the device.
• This module enables IOMMU programming and Function Level Reset (FLR)
• To expose device Base Address Registers (BARs) to user space, vfio-pci enables ioctl
• Supports MSI-X (Message Signaled Interrupts extension) interrupts
• On kernel versions >= 5.7, supports enabling virtual functions by using the sysfs interface.
If you are using a kernel version below 5.7, you have the following alternatives:
• Use ifc_uio, which supports enabling VFs.
• Apply a patch on the kernel to enable virtual functions by using sysfs. This requires a kernel rebuild.
ifc_uio
This is an alternative driver to vfio-pci, which does not use the IOMMU.

Using PCIe, sysfs, and interrupt framework utilities, this module allows user space to access the device.

Like vfio-pci, this module can also be used from a guest VM through the hypervisor. This driver allows enabling and disabling virtual functions. Once a virtual function is created, by default it binds to ifc_uio. Based on your requirements, you may unbind it and bind it to another driver.
Following are the functionalities supported by using this module:
• Allows enabling/disabling virtual functions by using the sysfs interface.
• Probes and exports channel BARs to libmqdma
• Supports Interrupt notification/clearing
libmqdma
This is a user-space library used by the application to access the PCIe device.
• This library has the APIs to access the MCDMA IP design, and you can develop your application using these APIs.
• It features calls for allocation, release, and reset of the channels
• libmqdma supports accessing devices bound by UIO or Virtual Function I/O (VFIO).
The libmqdma supports two user space drivers.
• uio
• vfio-pci
You can tune these options from the makefile.
With UIO, the ifc_uio driver reads the BAR register information by using sysfs and registers MSI-X information by using eventfds.

With VFIO, user space uses ioctl commands to read BAR registers and MSI-X information and to program the IOMMU table.

Typically, when an application is running in a virtualized environment, you bind the device to the vfio-pci module, and libmqdma accesses the device by using ioctl. Currently, the choice between UIO and VFIO can be switched from the common.mk file. UIO is enabled by default.
Sample application
This application uses the APIs from libmqdma and takes the following command line arguments as input.
• Total message sizes / time duration
• Packet size per descriptor
• Write/Read
• Completion reporting method
• Number of channels
It runs multiple threads for accessing the DMA channels. It also has performance measuring capabilities. Based on the number of threads you are using and the number of channels you are processing, queues are scheduled on threads.
8.1.2. libmqdma library details
The libmqdma library provides the user space framework that enables DMA operation with the PCIe device. It is responsible for the following actions:
• Device management
• Channel management
• Descriptor Memory Management
• Interrupts management
The libmqdma framework is installed on the host as a dynamic link library and exports the APIs to the application. Applications running in user space use the MCDMA IP through those APIs.
8.1.2.1. Channel Initialization
When libmqdma hands over an available channel to the application, it performs the following functions:
1. Reset the channel
• The libmqdma sets the reset register of the channel.
• It polls the register until the reset completes.
This process resets the queue logic and sets all the channel parameters to default.
2. Initialize the channel
• Allocates the required number of descriptors in the host.
• Updates the starting address of the descriptors in the registers.
• Updates the number of descriptors.
Based on these parameters, hardware performs queue management.
3. Enable the channel
8.1.2.2. Descriptor Memory Management
At the time of channel initialization, the device allocates the descriptor and data memory.
Descriptor memory
The maximum length of data in a descriptor is 1 MB. The Link field specifies whether the next descriptor is in another page.
AVST H2D/D2H descriptor
• Source address
• Destination address
• Data length
• Start of file (SOF)
• End of file (EOF)
• Descriptor index
• Link
The application needs to pass these values to the hardware through libmqdma.
Data Memory
The user space data page can be much bigger than the normal TLB entry page size of 4 KB. The libmqdma library implements an allocator to organize the memory.
The following are the hardware registers that the software updates as part of the channel enumeration.

• Q_START_ADDR_L, Q_START_ADDR_H: Contains the physical address of the start of the descriptor array.
• Q_SIZE: Logarithmic value of the number of descriptors
• Q_CONS_HEAD_ADDR_L, Q_CONS_HEAD_ADDR_H: Physical address of the head index of the ring, where the FPGA syncs the value of the head.
8.1.2.3. Descriptor Completion Status Update
There are two modes for reporting descriptor completion status: MSI-X and Writeback. The default mode is Writeback. This can be changed in the following C header file.

Software/user/common/include/ifc_libmqdma.h

/* Set default descriptor completion */
#define IFC_CONFIG_QDMA_COMPL_PROC <set with one of the following policies>
Descriptor Completion Status Update Policy
1. Writeback mode (CONFIG_QDMA_QUEUE_WB): In this approach, the MCDMA IP updates the completed descriptor index in the host memory. libmqdma performs a local read rather than a PCIe read.
2. MSI-X interrupt mode (CONFIG_QDMA_QUEUE_MSIX): In this approach, when the transaction completes, the MCDMA IP sends an interrupt to the host and updates the completed descriptor index in the host memory. libmqdma reads the completion status upon receiving the interrupt.
8.1.3. Application
At application start, it reads the MCDMA capabilities, creates the application context, opens BAR registers, and initializes the PCI Express functions. At termination, it clears the application context and stops all the channels.
Multi-threading with Multiple Channels
Based on the input parameters, the application starts multiple threads with POSIX thread APIs, associates a queue with each thread, and submits DMA transactions independently, one at a time. As part of this, the driver updates the tail register of that channel. Upon the tail index update, the hardware picks up the channel and starts the DMA operation.

Each thread performs the following tasks:
1. Get the device context based on BDF (Bus Device Function)
2. Acquire the available channel
3. Get DMA capable memory
4. Start DMA operation
5. Release the channel
As multiple threads can try to grab and release a channel at the same time, the userspace driver (libmqdma) handles synchronization problems while performing channel management.
Scheduling Threads
Because POSIX libraries are used for thread management, the Linux scheduler takes care of scheduling the threads; there is no custom scheduler for scheduling the threads.

perfq_app can schedule multiple queues on a single thread for DMA operations.
1. Reads the number of channels from the user (-c <num>)
2. Reads the number of threads from the user (-a <num>)
3. Calculates the number of queues each thread needs to perform DMA on
4. After every 200 ms or after every TID update, perfq_app swaps out a queue and swaps in another queue to perform DMA operations.
8.1.4. Software Flow
Figure 30. Multi Channel DMA IP for PCI Express Software Operation Flow
The figure shows the custom example test application and libqdma in user space (with ifc_uio.ko or vfio-pci exposing device files to user space through sysfs/ioctl and mmap), per-channel H2D and D2H descriptor rings, completion status, and the physical addresses of data in host memory (huge page file system), and the tail pointer FIFO and HW DMA block in the FPGA logic. The numbered steps 1 through 6 in the figure correspond to the steps described below.
Step 1
• The application creates a thread based on the required port of a channel
• After spawning the thread, the thread tries to acquire an available channel's port. If all channel ports are busy, the thread waits in poll mode
• In the context of the thread, libqdma allocates descriptor buffer memory in the host
• libqdma initializes the following registers, which include the starting address of the descriptors, the queue size, and the writeback address for the Consumed Head, and then enables the channels
QCSR registers:
• Q_START_ADDR_L (Offset 8’h08)
• Q_START_ADDR_H (Offset 8’h0C)
• Q_SIZE (Offset 8’h10)
• Q_CONSUMED_HEAD_ADDR_L (Offset 8’h20)
• Q_CONSUMED_HEAD_ADDR_H (Offset 8’h24)
• Q_BATCH_DELAY (Offset 8’h28)
• Q_CTRL (Offset 8’h00)
GCSR register:
• WB_INTR_DELAY (Offset 8’h08)
Step 2
• Threads continuously try to send/receive the data, and the library keeps checking whether the channel is busy or the descriptor ring is full
• If the channel is not busy and the descriptor ring is not full, it goes to Step 3. If the channel is busy or the descriptor ring is full, the thread retries the transfer

A full descriptor ring is identified by checking the Consumed Head and Tail pointer registers. A busy channel is identified based on the H2D_TPTR_AVL and D2H_TPTR_AVL registers, which report the available space for tail pointer updates on the H2D and D2H ports.
Step 3
The thread requests a new descriptor to submit the request and updates the required fields, i.e., descriptor index, SOF, EOF, payload, MSI-X enable, and writeback enable.
Step 4
After initializing the descriptor ring buffer, libqdma writes the number of descriptor updates into the tail register of the QCSR region. On every descriptor update, the tail pointer is incremented by 1.
QCSR tail pointer register: Q_TAIL_POINTER (Offset 8’h14)
Step 5
• Once the tail pointer write happens, the Multi Channel DMA IP for PCI Express fetches descriptors from host memory starting from the programmed Q_START_ADDR_L/H address
• The Multi Channel DMA IP for PCI Express parses the descriptor content to find the source address, destination address, and length of the data from the descriptor and starts the DMA operation
Step 6
Once descriptor processing is completed, the IP notifies the completion status based on the following methods, which can be enabled in each descriptor.

• MSI-X Interrupt: the Multi Channel DMA IP for PCI Express sends an MSI-X interrupt to the host if enabled in Q_CTRL.
• Writeback: the Multi Channel DMA for PCI Express IP updates Q_CONSUMED_HEAD_ADDR_L/H if writeback is enabled in Q_CTRL.
8.1.5. API Flow
8.1.5.1. Single Descriptor Load and Submit
The API flow below shows loading one descriptor into the descriptor ring buffer and then submitting the DMA transfer by updating the tail pointer register by an increment of 1.
Figure 31. Single Descriptor Load and Submit
The sequence diagram shows the app, libmqdma, uio, and FPGA lanes. The calls, in order:
1. lfc_app_start() - enumerate the PCIe device over sysfs, mmap the PCIe BAR, and allocate memory from huge pages
2. lfc_qdma_device_get() - return the device context (dev)
3. lfc_qdma_channel_get(dev) - allocate a channel from the pool, reset and enable it (mmio probe of channel resources)
4. Prefill loop: lfc_qdma_request_malloc() - allocate requests (rq), then lfc_qdma_request_start(chnl, rq, TX/RX) - populate one descriptor per request and bump the ring tail
5. Processing loop: lfc_qdma_request_start(chnl, rq[i], TX/RX) and lfc_qdma_completions_poll(chnl, TX/RX, rq[]) - the FPGA reads descriptors, reads/writes memory, and asynchronously updates the ring head
6. Release loop: lfc_qdma_request_free(rq[i]) for each request
7. lfc_qdma_channel_put(chnl) - disable the channel and return it to the pool
8. lfc_qdma_device_put(dev) and lfc_app_stop() - release the device context and stop
8.1.5.2. Multiple Descriptor Load and Submit
The API flow below shows loading a batch of descriptors into the descriptor ring buffer and then submitting them for DMA transfer by updating the tail pointer register with the total number of loaded descriptors.
Figure 32. Multiple Descriptor Load and Submit
The sequence diagram matches Figure 31, except that the prefill and processing loops call lfc_qdma_request_prepare(chnl, rq, TX/RX) once per descriptor and then a single lfc_qdma_request_submit(chnl, TX/RX), which bumps the ring tail once for the whole batch before lfc_qdma_completions_poll(chnl, TX/RX, rq[]) collects the completions.
8.1.6. libmqdma Library API List
This section lists the APIs exposed to the application.
8.1.6.1. ifc_app_start

Table 59. ifc_app_start

API: void ifc_app_start(void)
Description: Called at application initialization; probes and prepares the application for DMA transactions. It maps the enabled device memory to user space and allocates memory from the huge page file system. hugepagefs lets user space obtain contiguous, unswappable memory, which you can use for DMA operations. Set the default huge page size to 1 GB at boot time and allocate the required memory from huge pages.
Input Parameters: void
Return Values: 0 on success; negative otherwise
8.1.6.2. ifc_qdma_device_get
Table 60. ifc_qdma_device_get

API: int ifc_qdma_device_get(const char *bdf, struct ifc_qdma_device **qdev)
Description: Based on the BDF, this API returns the corresponding device context to the application. The application must maintain the device context and use it for further operations. When the application is done with I/O, it releases the context using the ifc_qdma_device_put API.
Input Parameters: bdf - BDF; qdev - address of the pointer to the device context
Return Values: Updates the device context and returns 0 on success; negative otherwise
8.1.6.3. ifc_num_channels_get
Table 61. ifc_num_channels_get

API: int ifc_num_channels_get(struct ifc_qdma_device *qdev)
Description: Returns the total number of channels supported by the QDMA device.
Input Parameters: qdev - pointer to the device context
Return Values: Number of channels supported
8.1.6.4. ifc_qdma_channel_get
Table 62. ifc_qdma_channel_get

API: int ifc_qdma_channel_get(struct ifc_qdma_device *qdev, struct ifc_qdma_channel **chnl, int chno)
Description: Before submitting DMA transactions, the application is responsible for acquiring a channel and passing its context in further interactions with the framework. This API performs the following:
• Gets the next available channel
• Initializes the descriptors and data memory for both TX and RX queues
• Enables the channel
The last parameter is the channel number. If you pass -1, the API returns any available free channel; otherwise, it allocates the requested channel if it is free.
Input Parameters: qdev: QDMA device; chnl: pointer to update the channel context; chno: channel number if the user wants a specific channel, -1 otherwise
Return Values: 0: success; populates the channel context. -1: no channel is ready to be used; the channel context is returned as NULL. -2: the requested channel is already allocated, but a valid channel context is returned; the application may use this channel context.
8.1.6.5. ifc_qdma_acquire_channels

Table 63. ifc_qdma_acquire_channels

API: int ifc_qdma_acquire_channels(struct ifc_qdma_device *qdev, int num)
Description: Acquires the requested number of channels from hardware. Once the channels are acquired, the user must call ifc_qdma_channel_get() to initialize the channels and use them for DMA.
Input Parameters: qdev: QDMA device; num: number of channels requested
Return Values: Number of channels acquired successfully; negative otherwise
8.1.6.6. ifc_qdma_release_all_channels
Table 64. ifc_qdma_release_all_channels

API: int ifc_qdma_release_all_channels(struct ifc_qdma_device *qdev)
Description: Releases all the channels acquired by the device. The user must make sure to stop the traffic on all the channels before calling this function. Perfq_app calls this API at application exit.
Input Parameters: qdev: QDMA device
Return Values: 0 on success; negative otherwise
8.1.6.7. ifc_qdma_device_put
Table 65. ifc_qdma_device_put

API: void ifc_qdma_device_put(struct ifc_qdma_device *qdev)
Description: Unmaps the device memory and releases the allocated resources.
Input Parameters: qdev: QDMA device
Return Values: 0 on success; negative otherwise
8.1.6.8. ifc_qdma_channel_put
Table 66. ifc_qdma_channel_put

API: void ifc_qdma_channel_put(struct ifc_qdma_channel *qchnl)
Description: Once the DMA transactions are completed, the application must call this API to release the acquired channel so that another process or thread can acquire it again. libmqdma disables this channel so that hardware no longer looks for DMA transactions on it.
Input Parameters: qchnl: channel context
Return Values: 0 on success; negative otherwise
8.1.6.9. ifc_qdma_completion_poll
Table 67. ifc_qdma_completion_poll

API: int ifc_qdma_completion_poll(struct ifc_qdma_channel *qchnl, int direction, void *pkt, int quota)
Description: Checks whether any previously queued and pending requests have completed. If so, the number of completed transactions is returned to the caller so that the application can process those transactions.
Input Parameters: qchnl: channel context; dir: DMA direction, one of IFC_QDMA_DIRECTION_*; pkts: address where completed requests are to be copied; quota: maximum number of requests to search
Return Values: 0 on success; negative otherwise
8.1.6.10. ifc_qdma_request_start
Table 68. ifc_qdma_request_start

API: int ifc_qdma_request_start(struct ifc_qdma_channel *qchnl, int dir, struct ifc_qdma_request *r)
Description: Depending on the data direction, selects the TX/RX queue, populates the descriptors based on the passed request object, and submits the DMA transactions. This is a non-blocking request; you may need to poll for the completion status.
Input Parameters: qchnl: channel context received from ifc_qdma_channel_get(); dir: DMA direction, one of IFC_QDMA_DIRECTION_*; r: request struct that needs to be processed
Return Values: 0 on success; negative otherwise
8.1.6.11. ifc_qdma_request_prepare
Table 69. ifc_qdma_request_prepare

API: int ifc_qdma_request_prepare(struct ifc_qdma_channel *qchnl, int dir, struct ifc_qdma_request *r)
Description: Depending on the direction, selects the queue and prepares the descriptor but does not submit the transaction. The application must use the ifc_qdma_request_submit API to submit the transactions to the DMA engine.
Input Parameters: qchnl: channel context received from ifc_qdma_channel_get(); dir: DMA direction, one of IFC_QDMA_DIRECTION_*; r: request struct that needs to be processed
Return Values: Number of transactions completed; negative otherwise
8.1.6.12. ifc_qdma_request_submit
Table 70. ifc_qdma_request_submit

API: int ifc_qdma_request_submit(struct ifc_qdma_channel *qchnl, int dir)
Description: Submits all prepared and pending DMA transactions to the MCDMA engine. Before calling this API, the application may need to call ifc_qdma_request_prepare to prepare the transactions. To give priority to one channel and submit more transactions at a time from that channel, the application may call ifc_qdma_request_prepare multiple times and then call the submit API once to submit all the transactions.
Input Parameters: qchnl: channel context received from ifc_qdma_channel_get(); dir: DMA direction, one of IFC_QDMA_DIRECTION_*
Return Values: 0 on success; negative otherwise
8.1.6.13. ifc_qdma_pio_read32
Table 71. ifc_qdma_pio_read32

API: uint32_t ifc_qdma_pio_read32(struct ifc_qdma_device *qdev, uint64_t addr)
Description: Reads the value from a BAR2 address. This API is used for PIO testing, dumping statistics, and pattern generation.
Input Parameters: qdev: QDMA device; addr: address to read
Return Values: 0 on success; negative otherwise
8.1.6.14. ifc_qdma_pio_write32
Table 72. ifc_qdma_pio_write32

API: void ifc_qdma_pio_write32(struct ifc_qdma_device *qdev, uint64_t addr, uint32_t val)
Description: Writes the value to a BAR2 address.
Input Parameters: qdev: QDMA device; addr: address to write; val: value to write
Return Values: void
8.1.6.15. ifc_request_malloc
Table 73. ifc_request_malloc

API: struct ifc_qdma_request *ifc_request_malloc(size_t len)
Description: libmqdma allocates the buffer for an I/O request. The returned buffer is DMA-able and allocated from huge pages.
Input Parameters: len - size of the data buffer for the I/O request
Return Values: Pointer to the allocated request on success; NULL otherwise
8.1.6.16. ifc_request_free
Table 74. ifc_request_free

API: void ifc_request_free(void *req)
Description: Releases the passed buffer and adds it back to the free pool.
Input Parameters: req - start address of the allocated buffer
Return Values: void
8.1.6.17. ifc_qdma_channel_set
Table 75. ifc_qdma_channel_set

API: int ifc_qdma_channel_set(struct ifc_qdma_channel *chnl, int dir, enum ifc_qdma_queue_param type, int val)
Description: Sets channel parameters.
Input Parameters: chnl: channel context; dir: direction (0 - RX, 1 - TX); type: parameter to set; val: value to set
Return Values: 0: success; populates the channel context. -1: channel not available. -2: requested a specific channel that is already occupied.
8.1.6.18. ifc_app_stop
Table 76. ifc_app_stop

API: void ifc_app_stop(void)
Description: ifc_app_stop unmaps the mapped resources and releases the allocated memory.
Input Parameters: void
Return Values: void
8.1.6.19. ifc_qdma_poll_init
Table 77. ifc_qdma_poll_init

API: int ifc_qdma_poll_init(struct ifc_qdma_device *qdev)
Description: Resets the poll eventfds. The application needs to pass this fd_set to poll if MSI-X interrupts are enabled.
Input Parameters: qdev: QDMA device
Return Values: 0 on success; negative otherwise
8.1.6.20. ifc_qdma_poll_add
Table 78. ifc_qdma_poll_add

API: int ifc_qdma_poll_add(struct ifc_qdma_device *qdev, struct ifc_qdma_channel *chnl, int dir)
Description: Appends event fds to the poll list.
Input Parameters: qdev: QDMA device; chnl: channel context; dir: direction to poll
Return Values: 0 on success; negative otherwise
8.1.6.21. ifc_qdma_poll_wait
Table 79. ifc_qdma_poll_wait

API: int ifc_qdma_poll_wait(struct ifc_qdma_device *qdev, struct ifc_qdma_channel **chnl, int *dir)
Description: Monitors interrupts for all added queues and returns when any interrupt arrives. Timeout: 1 msec.
Input Parameters: qdev: QDMA device; chnl: address of the channel context; dir: address of the direction parameter
Return Values: 0 on success, updates the channel context and direction; negative otherwise
8.1.7. Request Structures
To request a DMA operation, the application needs to use a common structure called ifc_qdma_request.
It contains the following fields.
struct ifc_qdma_request {
    void *buf;        /* src/dst buffer */
    uint32_t len;     /* number of bytes */
    uint64_t pyld_cnt;
    uint32_t flags;   /* SOF and EOF */
    void *ctx;        /* libqdma context, NOT for application */
};
1. buf: DMA buffer. In the case of H2D, the data in this buffer is moved to the FPGA. In the case of D2H, the FPGA copies the content to this buffer.
2. len: Length of the data in this descriptor.
3. pyld_cnt: For D2H, the length of the valid data if the descriptor contains EOF. For H2D, this field is not used.
4. flags: A mask containing the flags that describe the content. Currently, these flags are used to signal the SOF and EOF of the data.
Note: For the Single Port AVST design, SOF and EOF should be on the same descriptor, or SOF can be at the start descriptor and EOF at the end descriptor of a single TID update.
8.2. Multi Channel DMA IP DPDK Poll-Mode based Driver
Note: This feature is planned for a future release.
Note: Contact your Intel Sales or Intel FAE representative for additional information.
8.3. Multi Channel DMA IP Kernel Mode Driver
Note: This feature is planned for a future release.
Note: Contact your Intel Sales or Intel FAE representative for additional information.
9. Registers

The Multi Channel DMA IP for PCI Express provides configuration, control, and status registers to support the DMA operations, including:
• D2H and H2D Queue control and status (QCSR)
• MSI-X Table and PBA for interrupt generation
• General/global DMA control (GCSR)
These Multi Channel DMA registers are mapped to BAR0 of a function.
Note: GCSR is only for PF0.
The following table shows the 4 MB aperture space mapped for PF0 in PCIe config space through BAR0.
Table 80. Multi Channel DMA CSR Address Space
Address Space Name Range Size Description
QCSR (D2H, H2D) 22’h00_0000 - 22’h0F_FFFF 1 MB Individual queue control registers. Up to 2048 D2H and 2048 H2D queues.
MSI-X (Table and PBA) 22’h10_0000 - 22’h1F_FFFF 1 MB MSI-X Table and PBA space
GCSR 22’h20_0000 - 22’h2F_FFFF 1 MB General DMA control and status registers.
Reserved 22’h30_0000 – 22’h3F_FFFF 1 MB Reserved
The following table shows how the QCSR registers for each DMA channel are mapped within the 1 MB QCSR space.
Table 81. QCSR Address Space
Address Space Name Size DMA Channel Size Description
QCSR (D2H) 512 KB DMA Channel 0 256 B QCSR for DMA channel 0
DMA Channel 1 256 B QCSR for DMA channel 1
…. …. ….
DMA Channel N 256 B QCSR for DMA channel N
QCSR (H2D) 512 KB DMA Channel 0 256 B QCSR for DMA channel 0
DMA Channel 1 256 B QCSR for DMA channel 1
…. …. ….
DMA Channel N 256 B QCSR for DMA channel N
9.1. Queue Control (QCSR)
QCSR space contains queue control and status information. This register space of 1 MB can support up to 2048 H2D and 2048 D2H queues, where each queue is allocated 256 bytes of register space. The memory space allocated to each function is enough for each function to have all the DMA channels allocated. However, the actual number depends on the parameters input at IP generation time.
Address [7:0] : Registers for the queues
Address [18:8]: Queue number
Address [19]: 0 = D2H, 1=H2D
The following registers are defined for H2D/D2H queues. The base addresses for H2D and D2H are different, but the registers (H2D and D2H) have the same address offsets.
Table 82. Queue Control Registers
Register Name Address Offset Access Type Description
Q_CTRL 8’h00 R/W Control Register
RESERVED 8’h04 RESERVED
Q_START_ADDR_L 8’h08 R/W Lower 32-bit of the queue base address in system memory. This is the beginning of the linked list of 4 KB pages containing the descriptors.
Q_START_ADDR_H 8’h0C R/W Upper 32-bit of the queue base address in system memory. This is the beginning of the linked list of 4 KB pages containing the descriptors.
Q_SIZE 8’h10 R/W Number of max entries in a queue. Powers of 2 only.
Q_TAIL_POINTER 8’h14 R/W Current pointer to the last valid descriptor queue entry in the host memory.
Q_HEAD_POINTER 8’h18 RO Current pointer to the last descriptor that was fetched. Updated by the Descriptor Fetch Engine.
Q_COMPLETED_POINTER 8’h1C RO Last completed pointer after DMA is done. Software can poll this for status if writeback is disabled.
Q_CONSUMED_HEAD_ADDR_L 8’h20 R/W Lower 32-bit of the system address where the ring consumed pointer is stored. This address is used for consumed pointer writeback.
Q_CONSUMED_HEAD_ADDR_H 8’h24 R/W Upper 32-bit of the system address where the ring consumed pointer is stored. This address is used for consumed pointer writeback.
Q_BATCH_DELAY 8’h28 R/W Delay the descriptor fetch until the time elapsed from a prior fetch exceeds the delay value in this register, to maximize fetching efficiency.
RESERVED 8’h2C RESERVED
RESERVED 8’h30 RESERVED
RESERVED 8’h34 RESERVED
Q_DEBUG_STATUS_1 8’h38 RO RESERVED
Q_DEBUG_STATUS_2 8’h3C RO RESERVED
Q_DEBUG_STATUS_3 8’h40 RO RESERVED
Q_DEBUG_STATUS_4 8’h44 RO RESERVED
Q_RESET 8’h48 R/W Queue reset requested
The following registers are defined for each implemented H2D and D2H queue. The total QCSR address space for each H2D/D2H queue is 256 B and requires 8 bits of address.
Table 83. Q_CTRL (Offset 8’h0)
Bit [31:0] Name R/W Default Description
[31:10] rsvd Reserved
[9] q_intr_en R/W 0 If set, upon completion generate an MSI-X interrupt.
[8] q_wb_en R/W 0 If set, upon completion, do a writeback.
[7:1] rsvd Reserved
[0] q_en R/W 0 Enable. Once it is enabled, the DMA starts fetching pending descriptors and executing them.
9. Registers
UG-20297 | 2021.05.28
Send Feedback Multi Channel DMA IP for PCI Express User Guide
79
Table 84. Q_START_ADDR_L (Offset 8’h8)
Bit [31:0] Name R/W Default Description
[31:0] q_strt_addr_l R/W 0 After software allocates the descriptor ring buffer, it writes the lower 32 bits of the allocated address to this register. The descriptor fetch engine uses this address and the pending head/tail pointers to fetch the descriptors.
Table 85. Q_START_ADDR_H (Offset 8’hC)
Bit [31:0] Name R/W Default Description
[31:0] q_strt_addr_h R/W 0 After software allocates the descriptor ring buffer, it writes the upper 32 bits of the allocated address to this register. The descriptor fetch engine uses this address and the pending head/tail pointers to fetch the descriptors.
Table 86. Q_SIZE (Offset 8’h10)
Bit [31:0] Name R/W Default Description
[31:5] rsvd Reserved
[4:0] q_size R/W 1 Size of the descriptor ring as a power of 2, with a maximum value of 16. The unit is number of descriptors. Hardware defaults to a value of 1 if an illegal value is written. A value of 1 means a queue size of 2 (2^1). A value of 16 (0x10) means a queue size of 64K (2^16).
Table 87. Q_TAIL_POINTER (Offset 8’h14)
Bit [31:0] Name R/W Default Description
[31:16] rsvd Reserved
[15:0] q_tl_ptr R/W 0 After software sets up a last valid descriptor in the descriptor buffer, it programs this register with the position of the last (tail) valid descriptor that is ready to be executed. The DMA Descriptor Engine fetches descriptors from the buffer up to this position of the buffer.
Table 88. Q_HEAD_POINTER (Offset 8’h18)
Bit [31:0] Name R/W Default Description
[31:16] rsvd Reserved
[15:0] q_hd_ptr R/W 0 After the DMA Descriptor Fetch Engine fetches the descriptors from the descriptor buffer, up to the tail pointer, it updates this register with the last fetched descriptor position. The fetch engine only fetches descriptors if the head and tail pointers are not equal.
Table 89. Q_COMPLETED_POINTER (Offset 8’h1C)
Bit [31:0] Name R/W Default Description
[31:16] rsvd Reserved
[15:0] q_cmpl_ptr R/W 0 This register is updated by hardware to store the last descriptor position (pointer) that DMA has completed, that is, all data for that descriptor and previous descriptors has arrived at the intended destinations. Software can poll this register to find out the status of the DMA for a specific queue.
Table 90. Q_CONSUMED_HEAD_ADDR_L (Offset 8’h20)
Bit [31:0] Name R/W Default Description
[31:0] q_cnsm_hd_addr_l R/W 0 Software programs this register with the lower 32-bit address location where the writeback will target after DMA is completed for a set of descriptors.
Table 91. Q_CONSUMED_HEAD_ADDR_H (Offset 8’h24)
Bit [31:0] Name R/W Default Description
[31:0] q_cnsm_hd_addr_h R/W 0 Software programs this register with the upper 32-bit address location where the writeback will target after DMA is completed for a set of descriptors.
Table 92. Q_BATCH_DELAY (Offset 8’h28)
Bit [31:0] Name R/W Default Description
[31:20] rsvd Reserved
[19:0] q_batch_dscr_delay R/W 0 Software programs this register with the amount of time between fetches for descriptors. Each unit is 2 ns.
Table 93. Q_RESET (Offset 8’h48)
Bit [31:0] Name R/W Default Description
[31:1] rsvd Reserved
[0] q_reset R/W 0 Request a reset for the queue by writing 1’b1 to this register, and poll for a value of 1’b0, which indicates that the reset has been completed by hardware. Hardware clears this bit after completing the reset of a queue.
9.2. MSI-X Memory Space
The MSI-X Table and PBA memory is mapped to the second MB of the register address space. The allocated memory space can support up to 2048 MSI-X interrupts for a function. The actual amount of memory depends on the Multi Channel DMA IP for PCI Express configuration.
MSI-X Table
Each entry (vector) is 16 bytes (4 DWORDs) and is divided into Message Address, Message Data, and Mask (Vector Control) fields as shown in the figure below. To support 2048 interrupts, the MSI-X Table requires 32 KB of space per function, but it is mapped to a 512 KB space.
9. Registers
UG-20297 | 2021.05.28
Multi Channel DMA IP for PCI Express User Guide Send Feedback
82
Figure 33. MSI-X Table Structure

The table is an array of entries, each 4 DWORDs wide (Message Address, Message Upper Address, Message Data, Vector Control). Entry 0 sits at the table base, and entry n at host byte address Base + n x 16, up to entry (N - 1).
MSI-X PBA
MSI-X PBA (Pending Bit Array) memory space is mapped to a 512 KB region. The actual amount of memory depends on the IP configuration. The Pending Bit Array contains the pending bits, one per MSI-X Table entry, in an array of QWORDs (64 bits). The PBA format is shown below.
Figure 34. MSI-X PBA Structure

Pending bits are packed 64 per QWORD: QWORD 0 at the PBA base holds pending bits 0 through 63, QWORD 1 at Base + 1 x 8 holds bits 64 through 127, and so on up to QWORD ((N - 1) div 64) at Base + ((N - 1) div 64) x 8.
Each DMA Channel is allocated 4 MSI-X vectors:
• 2’b00: H2D DMA Vector
• 2’b01: H2D Event Interrupt
• 2’b10: D2H DMA Vector
• 2’b11: D2H Event Interrupt
9.3. Control Register (GCSR)
This space contains global control/status registers that control the DMA operation. Access to this register set is restricted to PF0 only.
Table 94. Control Register
Register Name Address Offset Access Type Description
CTRL 8’h00 R/W Reserved
RESERVED 8’h04 Reserved
WB_INTR_DELAY 8’h08 R/W Delay the writeback and/or the MSI-X interrupt until the time elapsed from a prior writeback/interrupt exceeds the delay value in this register.
RESERVED 8’h0C – 8’h6F Reserved
VER_NUM 8’h70 RO Multi Channel DMA IP for PCI Express version number
SW_RESET 9’h120 RW Write this register to issue a Multi Channel DMA IP reset without disturbing the PCI Express link. This resets all queues and erases all the context. Can be issued only from PF0.
Table 95. CTRL (Offset 8’h0)
Bit [31:0] Name R/W Default Description
[31:0] rsvd Reserved
Table 96. WB_INTR_DELAY (Offset 8’h08)
Bit [31:0] Name R/W Default Description
[31:20] rsvd Reserved
[19:0] wb_intr_delay R/W 0 Delay the writeback and/or the MSI-X interrupt until the time elapsed from a prior writeback/interrupt exceeds the delay value in this register. Each unit is 2 ns.
Table 97. VER_NUM (Offset 8’h70)

Bit [31:0] Name R/W Default Description
[31:24] rsvd RESERVED
[23:16] MAJOR_VER RO 1 or 2 Major version number of Multi Channel DMA IP for PCI Express: 1 for P-Tile in QPDS 21.1; 2 for H-Tile in QPDS 21.1
[7:0] MIN_VER RO 0 Minor version number of Multi Channel DMA IP for PCI Express
10. Revision History

Table 98. Revision History for Multi Channel DMA IP for PCI Express User Guide
Date | Intel Quartus Prime Version | IP Version | Changes

2021.05.28 | 21.1 | 2.0.0 (H-Tile), 1.0.0 (P-Tile):
• PCIe Gen4 (P-Tile) support
• Support for x8 link width
• MCDMA 1 port AVST interface
• BAM, BAS, BAM+BAS, BAM+MCDMA modes
• SR-IOV support
• Root Port support (IP only)
• Config Slave interface for RP

2020.07.20 | 20.2 | 20.0.0 (H-Tile): Initial Release
Intel Corporation. All rights reserved. Intel, the Intel logo, and other Intel marks are trademarks of IntelCorporation or its subsidiaries. Intel warrants performance of its FPGA and semiconductor products to currentspecifications in accordance with Intel's standard warranty, but reserves the right to make changes to anyproducts and services at any time without notice. Intel assumes no responsibility or liability arising out of theapplication or use of any information, product, or service described herein except as expressly agreed to inwriting by Intel. Intel customers are advised to obtain the latest version of device specifications before relyingon any published information and before placing orders for products or services.*Other names and brands may be claimed as the property of others.
ISO 9001:2015 Registered