
SN0530953-00 Rev. B 10/17 1

White Paper

iSER RDMA Accelerates Storage

Accelerate iSCSI Storage with Universal RDMA

Transition from iSCSI to iSER delivers more IOPS, saves CPU cycles

INTRODUCTION

Flash and NVMe storage are driving the need for higher throughput and lower latency in iSCSI storage. Cavium has been a long-term leader in iSCSI adapters that provide connectivity to iSCSI SANs with full protocol offload and is now leading the transition to high-performance 25Gb Ethernet (25GbE), 40Gb Ethernet (40GbE), and 100Gb Ethernet (100GbE). Data centers can achieve higher network performance, lower latency, and enhanced CPU efficiency with next-generation Cavium FastLinQ QL45000 Series 25/40/100GbE Adapters.

Cavium FastLinQ QL45000 Series Adapters are ideally suited for a wide range of network and storage applications. They deliver unsurpassed performance with unique Universal RDMA support that includes RDMA over Converged Ethernet (RoCE), RoCE v2, and Internet wide area RDMA protocol (iWARP). They also reduce operating expense with comprehensive, unified adapter management across the data center with Cavium QConvergeConsole® (QCC). One of the key new enhancements is support for iSCSI Extensions for RDMA (iSER) that adds RDMA capabilities to the traditional iSCSI protocol.

RDMA is a critical technology that enables network adapters to use zero-copy networking: data moves directly between application memory and the network adapter, and then across the network. This eliminates the multiple copies that would otherwise be required to pass data through the network layers of the operating system (see Figure 1).

KEY BENEFITS

Cavium™ FastLinQ® QL45000 Series Adapters:

• Deliver 120% higher IOPS with iSER, when compared to iSCSI

• Reduce server CPU utilization by half with iSER compared to iSCSI

• Enable up to 70% lower latency with iSER compared to iSCSI

• Deliver head-to-head performance at line rate vs. Mellanox®

• Provide integrated iSER configuration and management GUI

• Deliver universal Remote Direct Memory Access (RDMA) technology and investment protection with support for concurrent RoCE, RoCEv2, and iWARP


RDMA transfers require minimal processing by CPUs, caches or context switches, and transfers are done in parallel with other system operations. Key RDMA benefits include:

• Higher bandwidth

• Lower latency

• Better CPU efficiency

UNIVERSAL RDMA

While RDMA accelerates performance and offloads server CPU cycles, there are multiple, mutually incompatible RDMA standards. The two prominent Ethernet-based standards are RoCE and iWARP. RoCE requires a lossless Ethernet fabric and has a routable version called RoCEv2. iWARP relies on standard TCP offload, is routable, and can operate on any Ethernet fabric. The Cavium FastLinQ QL45000 Series Adapter is the only network adapter that provides customers with technology choice and investment protection through concurrent support for RoCE, RoCEv2, and iWARP.

iSER

iSCSI Extensions for RDMA (iSER) is a new standard for extending iSCSI with RDMA. iSER is being rapidly adopted to accelerate storage I/O in all-flash data centers.

iSER increases storage performance and reduces latency by eliminating the need for the traditional, general-purpose TCP/IP stack. iSER is available over various RDMA transports, including RoCE, RoCEv2, and iWARP. Compared to iSCSI, iSER can significantly increase flash storage performance, which paves the way for efficient use of high-speed flash and NVMe devices.
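Because iSER can run over any of these transports, the choice largely depends on the fabric. The following is a minimal sketch, not a vendor tool, that encodes the transport properties described under Universal RDMA and filters them against two fabric constraints. The names `RdmaTransport` and `usable_transports` are hypothetical, and treating RoCEv2 as needing a lossless fabric is an assumption made here for illustration; the text states that requirement only for RoCE.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RdmaTransport:
    name: str
    needs_lossless_fabric: bool
    routable: bool


# Properties as described in the text. The lossless flag for RoCEv2 is an
# assumption of this sketch, not a statement from the paper.
TRANSPORTS = (
    RdmaTransport("RoCE", needs_lossless_fabric=True, routable=False),
    RdmaTransport("RoCEv2", needs_lossless_fabric=True, routable=True),
    RdmaTransport("iWARP", needs_lossless_fabric=False, routable=True),
)


def usable_transports(fabric_is_lossless: bool, must_route: bool) -> List[str]:
    """Return the transports that fit the given fabric constraints."""
    return [
        t.name
        for t in TRANSPORTS
        if (fabric_is_lossless or not t.needs_lossless_fabric)
        and (t.routable or not must_route)
    ]


# A lossless, single-subnet fabric can use any transport.
print(usable_transports(fabric_is_lossless=True, must_route=False))
# A standard Ethernet fabric that must route across subnets leaves iWARP.
print(usable_transports(fabric_is_lossless=False, must_route=True))
```

An adapter that supports all three transports concurrently lets the same server serve both kinds of fabric without a hardware change, which is the investment-protection argument made above.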

Cavium FastLinQ QL45000 Series Adapters enable RDMA benefits for iSCSI networking with iSER support. Data is transferred directly between SCSI memory buffers and the adapters, eliminating intermediate data copies and reducing CPU overhead (see Figure 2).

Figure 2. iSCSI vs. iSER Protocol Stack Comparison

Cavium test labs conducted a series of benchmarks to evaluate iSER performance benefits that are provided with Cavium FastLinQ QL45000 Series Adapters. This paper details test results that show key performance improvements.

Figure 1. RDMA Technology Saves CPU Cycles (diagram of a client and a server, each with memory and a NIC: an RDMA NIC offloads the server CPU from moving data; a standard NIC with no offloads requires additional CPU cycles to move data)


iSER PERFORMANCE TEST PLATFORM

IOPS, throughput, and CPU efficiency benchmarks were run on x86 initiator and target servers with 18 cores (36 CPUs per server) running the CentOS 7.1 Linux distribution. RDMA was supported using RoCE. The following network adapters were evaluated:

• Cavium FastLinQ QL45000 Series 25GbE and 40GbE Adapters

• Mellanox ConnectX-4 25GbE and ConnectX-3 40GbE Adapters

iSER vs. iSCSI

The first test objective was to quantify performance benefits for iSER vs. software iSCSI using Cavium FastLinQ QL45000 25GbE Adapters. The following sections show test results for IOPS and CPU efficiency.

IOPS (Transactional Performance)

iSER averaged up to 100% higher IOPS than iSCSI for Read operations (Figure 3). With a mix of Read and Write operations, iSER delivered 120% higher IOPS at 4KB block size and 50% higher IOPS at 8KB block size (Figure 4). Block sizes of 4KB and 8KB are typical for most enterprise applications.
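To make the "percent higher" figures concrete, here is a small sketch of the arithmetic behind them. The function names and the sample IOPS values are hypothetical and chosen only so that the result matches the 120% case; they are not the measured data behind Figures 3 and 4.

```python
def percent_higher(new_iops: float, baseline_iops: float) -> float:
    """Return how much higher new_iops is than baseline_iops, in percent."""
    if baseline_iops <= 0:
        raise ValueError("baseline_iops must be positive")
    return (new_iops - baseline_iops) * 100.0 / baseline_iops


def speedup_factor(percent: float) -> float:
    """Convert 'N% higher' into a multiplier of the baseline."""
    return (100.0 + percent) / 100.0


# Hypothetical sample values, NOT measurements from the test lab.
iscsi_iops = 500_000
iser_iops = 1_100_000

gain = percent_higher(iser_iops, iscsi_iops)
print(f"iSER is {gain:.0f}% higher, or {speedup_factor(gain):.1f}x the iSCSI IOPS")
```

Note that "120% higher" corresponds to 2.2 times the baseline, and "50% higher" to 1.5 times, not 1.2 and 0.5 times.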

Figure 3. 100% Higher Average Read IOPS with FastLinQ QL45000 Series Adapters with iSER vs. Software iSCSI (chart: 25GbE iSCSI vs. iSER Read IOPS Performance; x-axis: block size of I/O operation, 512B to 8K; y-axis: IOPS, 0 to 2,000,000; series: Cavium 45xxx SW iSCSI, Cavium 45xxx iSER)

Figure 4. 120% Higher for 4KB Block Size and 50% Higher for 8KB Block Size Operations with iSER vs. iSCSI (chart: 25GbE iSCSI vs. iSER Read+Write IOPS Performance; x-axis: block size of I/O operation, 4K and 8K; y-axis: IOPS, 0 to 1,400,000; series: Cavium 45xxx SW iSCSI, Cavium 45xxx iSER)


CPU Efficiency

CPU efficiency was measured as IOPS divided by CPU usage (%). This metric showed the greatest benefit for iSER vs. iSCSI, with a 183% average increase over the full range of data transfer sizes. Test results indicate that for the key block sizes of 4KB and 8KB, iSER cut CPU utilization in half compared to iSCSI. This is a critical benefit for virtualized servers: reducing CPU usage for I/O frees CPU resources to run more virtual machines and optimize server resources. (See Figures 5 and 6.)
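The efficiency metric defined above (IOPS divided by CPU usage in percent) can be sketched as follows. The function names and the sample IOPS and CPU values are hypothetical, chosen only to show how higher IOPS and lower CPU usage compound in this metric; they are not the measured data behind Figures 5 and 6.

```python
def cpu_efficiency(iops: float, cpu_percent: float) -> float:
    """IOPS delivered per percentage point of CPU usage."""
    if not 0 < cpu_percent <= 100:
        raise ValueError("cpu_percent must be in (0, 100]")
    return iops / cpu_percent


def percent_increase(new: float, baseline: float) -> float:
    """Relative increase of new over baseline, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (new - baseline) * 100.0 / baseline


# Hypothetical sample values, NOT measurements from the test lab.
iscsi_eff = cpu_efficiency(iops=400_000, cpu_percent=40.0)
iser_eff = cpu_efficiency(iops=800_000, cpu_percent=20.0)

print(f"iSCSI: {iscsi_eff:,.0f} IOPS per CPU %")
print(f"iSER:  {iser_eff:,.0f} IOPS per CPU %")
print(f"Efficiency increase: {percent_increase(iser_eff, iscsi_eff):.0f}%")
```

Because the metric is a ratio, a gain in IOPS and a drop in CPU usage multiply rather than add, which is why efficiency can improve by more than either measure alone.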

Figure 5. 183% Higher Read CPU Efficiency with FastLinQ QL45000 Series Adapters with iSER vs. iSCSI (chart: 25GbE iSCSI vs. iSER Read CPU Efficiency; x-axis: block size of I/O operation, 512B to 1MB; y-axis: IOPS/CPU % efficiency, 0 to 60,000; series: Cavium 45xxx SW iSCSI, Cavium 45xxx iSER)

Figure 6. iSER CPU Utilization is Half of iSCSI while Delivering Higher Read and Write Performance (IOPS) vs. iSCSI (chart: 25GbE iSCSI vs. iSER Read+Write IOPS and CPU Performance; x-axis: block size of I/O operation, 4K and 8K; axes: IOPS, 0 to 1,400,000, and CPU utilization (%), 0 to 50; series: Cavium 45xxx SW iSCSI, Cavium 45xxx iSER, iSCSI CPU %, iSER CPU %)

Latency

Fast response is one of the key benefits driving the transition to flash storage. Cavium FastLinQ QL45000 Series Adapters with iSER/RDMA support are ideally suited for server connectivity to flash and NVMe drives. Figure 7 shows that iSER/RDMA provides 72% lower latency when compared to software iSCSI.
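A "percent lower" latency figure is easy to misread as a speedup factor, so the sketch below converts between the two. Only the 72% figure comes from the text; the function names and the 100-microsecond baseline are hypothetical and used purely for illustration.

```python
def latency_after_reduction(baseline_us: float, percent_lower: float) -> float:
    """Latency that remains after a reduction of percent_lower percent."""
    if not 0 <= percent_lower < 100:
        raise ValueError("percent_lower must be in [0, 100)")
    return baseline_us * (100.0 - percent_lower) / 100.0


def equivalent_speedup(percent_lower: float) -> float:
    """How many times faster a response is after the reduction."""
    if not 0 <= percent_lower < 100:
        raise ValueError("percent_lower must be in [0, 100)")
    return 100.0 / (100.0 - percent_lower)


# Hypothetical baseline latency; the paper does not give absolute values here.
iscsi_latency_us = 100.0
iser_latency_us = latency_after_reduction(iscsi_latency_us, 72.0)

print(f"iSER latency: {iser_latency_us:.0f} us")
print(f"Equivalent speedup: {equivalent_speedup(72.0):.2f}x")
```

In other words, 72% lower latency means each operation completes in 28% of the original time, which is a speedup of roughly three and a half times.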

Figure 7. 72% Lower Latency with FastLinQ QL45000 Series Adapters and iSER/RDMA (chart: Latency Comparison between iSER and iSCSI; series: iSER/RDMA, Software iSCSI)

Cavium FastLinQ QL45411 vs. Mellanox ConnectX-3 and ConnectX-4

An evaluation compared the Cavium FastLinQ QL45000 Series 25GbE and 40GbE Adapters with the Mellanox ConnectX-4 25GbE and ConnectX-3 40GbE adapters on iSER performance, management, and the flexibility to choose an RDMA transport.

Table 1 compares the key findings for flexibility and management for iSER.

Table 1. Cavium FastLinQ vs Mellanox ConnectX

Columns: Feature | Cavium FastLinQ 45xxx | Mellanox ConnectX | Comments

• 10/25/40/50/100GbE Speeds: Broad range of Ethernet connectivity

• iSER Support: iSCSI extensions over RDMA

• NIC Partitioning (NPAR): Cavium FastLinQ supports up to 16 partitions

• Universal RDMA: Concurrent RoCE, RoCEv2 and iWARP

• iSER Management GUI: Cavium QConvergeConsole provides iSER configuration and management options


The performance charts (Figures 8-10) show how the iSER performance of the Cavium FastLinQ RDMA NICs compares with that of the Mellanox portfolio.

Figure 8. Both Cavium and Mellanox RDMA NICs Deliver 25GbE Line Rate Read Performance for iSER (chart: 25GbE iSER Performance Reads vs. Mellanox; x-axis: block size of I/O operation, 4K to 1MB; y-axis: throughput (MBps), 1,000 to 3,000; series: Mellanox CX4 iSER, Cavium 45xxx iSER)

Figure 9. Both Cavium and Mellanox RDMA NICs Deliver 25GbE Line Rate Write Performance for iSER (chart: 25GbE iSER Performance Writes vs. Mellanox; x-axis: block size of I/O operation, 4K to 1MB; y-axis: throughput (MBps), 1,000 to 3,000; series: Mellanox CX4 iSER, Cavium 45xxx iSER)
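Figures 8 and 9 report throughput in MBps against a 25GbE link, so it helps to see how IOPS, block size, and line rate relate. The sketch below assumes decimal units (1 MB = 1,000,000 bytes) and ignores Ethernet and protocol overhead, so the real ceiling for payload data is somewhat lower. The function names and the sample workload are hypothetical, not values read from the charts.

```python
def line_rate_mbps(link_gbps: float) -> float:
    """Raw link capacity in megabytes per second (decimal units)."""
    return link_gbps * 1000.0 / 8.0


def throughput_mbps(iops: float, block_bytes: int) -> float:
    """Throughput implied by an IOPS figure at a given block size."""
    return iops * block_bytes / 1_000_000


def link_utilization(iops: float, block_bytes: int, link_gbps: float) -> float:
    """Fraction of the raw link capacity consumed by the workload."""
    return throughput_mbps(iops, block_bytes) / line_rate_mbps(link_gbps)


# Hypothetical workload: 45,000 IOPS at 64 KiB blocks on a 25GbE link.
iops, block, link = 45_000, 64 * 1024, 25.0

print(f"25GbE raw line rate: {line_rate_mbps(link):,.0f} MBps")
print(f"Workload throughput: {throughput_mbps(iops, block):,.2f} MBps")
print(f"Link utilization:    {link_utilization(iops, block, link):.1%}")
```

This also explains why line rate is reached at larger block sizes: each I/O moves more data, so far fewer operations per second are needed to fill the link.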

Figure 10. Cavium FastLinQ 45000 Series 40GbE Adapters Deliver 31% Higher Read and Write Performance for Key 8KB Block Size (chart: 40GbE iSER Performance Read+Writes vs. Mellanox; x-axis: block size of I/O operation, 2K to 16K; y-axis range: 0 to 1,800,000; series: Cavium 45xxx iSER, MLNX CX3 iSER)

UNIFIED MANAGEMENT

Management is another critical factor in evaluating network adapters. Administrators can manage all generations of Cavium FastLinQ adapters across the data center from a single location using the QCC Management Suite. This saves administration time and helps ensure optimum network reliability and performance.

CONCLUSION

Cavium provides the industry's most comprehensive network adapter portfolio: FastLinQ Standard Ethernet Adapters, Converged Networking Adapters, and LiquidIO™ Intelligent NICs, which together cover the entire spectrum of customer Ethernet connectivity and offload requirements. Figure 11 illustrates the currently available Cavium FastLinQ Adapters.

Figure 11. Cavium FastLinQ NIC with iSER Support


Corporate Headquarters Cavium, Inc. 2315 N. First Street San Jose, CA 95131 408-943-7100

International Offices UK | Ireland | Germany | France | India | Japan | China | Hong Kong | Singapore | Taiwan | Israel


Copyright © 2017 Cavium, Inc. All rights reserved worldwide. Cavium, QConvergeConsole, and FastLinQ are registered trademarks or trademarks of Cavium Inc., registered in the United States and other countries. All other brand and product names are registered trademarks or trademarks of their respective owners.

This document is provided for informational purposes only and may contain errors. Cavium reserves the right, without notice, to make changes to this document or in product design or specifications. Cavium disclaims any warranty of any kind, expressed or implied, and does not guarantee that any results or performance described in the document will be achieved by you. All statements regarding Cavium’s future direction and intent are subject to change or withdrawal without notice and represent goals and objectives only.


Key Benefits of Cavium FastLinQ 10/25/40/50/100GbE Adapters are:

• Broad Spectrum of Ethernet Connectivity Speeds – 10/25/40/50/100GbE to host the most demanding enterprise, telco, and cloud applications and deliver scalability to drive business growth, available in cost optimized standard, blade, and OCP form factors.

• Universal RDMA – Industry’s only network adapter that provides customers the technology choice and investment protection with support for concurrent RoCE, RoCEv2, and iWARP. The FastLinQ adapters support RDMA applications like iSER, NFSoRDMA, NVMe-oF, and SMB Direct.

• Network Virtualization Offloads – Acceleration for Network Virtualization by offloading protocol processing for VxLAN, NVGRE, GRE, and GENEVE, enabling customers to build and scale virtualized networks without impacting network performance.

• Server Virtualization – Optimize infrastructure costs and increase virtual machine density by leveraging in-built technologies like SR-IOV and Network Partitioning (NPAR) that deliver acceleration and QoS for workloads and infrastructure traffic.

• Network Function Virtualization (NFV) – Leading small packet performance, connectivity at up to 100GbE, and integration with DPDK and OpenStack enable telcos and NFV application vendors to seamlessly deploy, manage, and accelerate the most demanding NFV workloads.

• Storage Acceleration – Full protocol offload for iSCSI and FCoE delivers up to 2.6M IOPS while consuming the fewest server CPU cycles, leaving headroom for virtual applications and delivering a higher RoI on server investments.

• Broad Operating System Integration – Extensive support of a broad variety of enterprise and cloud operating systems (including Windows, Linux, VMware ESXi, FreeBSD, and Solaris) provides seamless integration for diverse application platforms.

• Comprehensive Management – A single-pane-of-glass, web-based, distributed adapter management solution, QConvergeConsole is available across heterogeneous platforms and operating systems and automates and simplifies the deployment and orchestration of physical and virtual infrastructure.

• Leadership – An industry leader with growing Ethernet port share and an Ethernet portfolio that is broadly distributed by Tier 1 OEMs and distributors across the world.

ABOUT CAVIUM

Cavium, Inc. (NASDAQ: CAVM), offers a broad portfolio of infrastructure solutions for compute, security, storage, switching, connectivity and baseband processing. Cavium’s highly integrated multi-core SoC products deliver software compatible solutions across low to high performance points enabling secure and intelligent functionality in Enterprise, Data Center and Service Provider Equipment. Cavium processors and solutions are supported by an extensive ecosystem of operating systems, tools, application stacks, hardware reference designs and other products. Cavium is headquartered in San Jose, CA with design centers in California, Massachusetts, India, Israel, China and Taiwan.