IMPLEMENTING SCALABLE AND COST-EFFECTIVE SESSION BORDER CONTROL ON GENERIC SERVER HARDWARE

David Reekie, SVP Engineering
ETSI Future Network Workshop, 10th April 2013

METASWITCH NETWORKS | PROPRIETARY AND CONFIDENTIAL | METASWITCH.COM | © 2013



AGENDA

• Why is this an interesting case study?
• The route to virtualisation
• How most people build SBCs today
• Aspects of performance
  • Packet throughput
  • Media transcoding
• Remaining challenges


THE ROUTE TO COTS HARDWARE AND VIRTUALISATION

• Implement all SBC functions in software on generic server
  • Signaling plane functions
  • Media plane functions
• Achieve sensible levels of scaling
  • 200k – 1M+ registered endpoints per server
  • > 20k concurrent media sessions per server
• Get all this working in a virtualised environment
  • Preserve capacity and performance
  • Preserve failover capabilities


TODAY'S SBCS ARE BUILT ON PROPRIETARY HARDWARE

[Diagram: hardware blocks in a proprietary SBC. Blocks shown: Network Processor, TCAM, Crypto, DSPs, General Purpose CPU. Functions labelled on the diagram: SIP signaling; RTP media; accelerate SRTP, TLS, IMS AKA; perform transcoding; discard packets from blacklisted IP addresses; look up RTP flows to manage media relay.]


SOFTWARE SOLUTIONS FOR THESE CHALLENGES

• Spread load across multiple cores
  • A 16-core server can do a lot of heavy lifting
• Make intelligent use of cache memory
  • Fast look-up for media flows and IP address blacklists
• Do performance-critical jobs close to the metal
  • Custom kernel modules for low-level packet handling
• Leverage built-in crypto instruction set
  • TLS, IPsec, SRTP
• Do the best you can with software-based transcoding
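The first two points above can be illustrated with a minimal sketch. It is not taken from the talk: the names `FlowKey`, `core_for_flow` and `MediaRelayTable` are hypothetical, and CRC32 is only a stand-in for the hash that an RSS-capable NIC computes in hardware. The idea is that hashing a flow's 5-tuple pins the flow to one core, so each core can keep its own compact flow table and blacklist and avoid cross-core locking.

```python
import zlib
from typing import Dict, NamedTuple, Optional, Set, Tuple

Target = Tuple[str, int]  # (ip, port) a packet should be relayed to


class FlowKey(NamedTuple):
    """5-tuple identifying a media flow (hypothetical structure)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str = "udp"


def core_for_flow(key: FlowKey, num_cores: int = 16) -> int:
    """Pick a core by hashing the 5-tuple (CRC32 is a stand-in hash)."""
    data = "|".join(str(field) for field in key).encode()
    return zlib.crc32(data) % num_cores


class MediaRelayTable:
    """Per-core state with average-case O(1) look-ups."""

    def __init__(self) -> None:
        self.blacklist: Set[str] = set()
        self.flows: Dict[FlowKey, Target] = {}

    def lookup(self, key: FlowKey) -> Optional[Target]:
        """Return the relay target, or None if the packet should be dropped."""
        if key.src_ip in self.blacklist:
            return None
        return self.flows.get(key)


# One table per core; a flow is only ever touched by the core it hashes to.
tables = [MediaRelayTable() for _ in range(16)]
key = FlowKey("192.0.2.10", 40000, "198.51.100.1", 20000)
core = core_for_flow(key)
tables[core].flows[key] = ("203.0.113.5", 30000)
```

Because the hash is deterministic, every packet of a given flow lands on the same core, which keeps that flow's state warm in that core's cache.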


PERFORMANCE OF SOFTWARE-BASED SBC

• Standard off-the-shelf server, 1U, $9k
  • NICs need to support Receive Side Scaling (RSS)
  • If NICs support Intel DPDK integration, performance ↑ 2.5x
• Performing both signaling and media functions
  • "Integrated SBC" configuration
• Signaling throughput > 12,000 SIP messages per second
  • 6M BHCA (based on 7 x SIP messages per call)
• Media handling: up to 18,000 concurrent media streams (48,000 with DPDK)
  • RTP-to-RTP relay, any codec, no transcoding

Intel claims 80M pps forwarding of 64-byte frames with 8-core Xeon and DPDK (on bare metal)
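The BHCA figure follows from the message rate by simple arithmetic. The helper below, whose name `bhca` is chosen only for this illustration, makes the conversion explicit: messages per second divided by messages per call gives calls per second, and one busy hour is 3,600 seconds.

```python
def bhca(sip_messages_per_second: float, messages_per_call: float) -> float:
    """Busy-hour call attempts sustainable at a given SIP message rate."""
    calls_per_second = sip_messages_per_second / messages_per_call
    return calls_per_second * 3600


print(f"{bhca(12_000, 7):,.0f} BHCA")
```

At 12,000 messages per second and 7 messages per call this comes to roughly 6.2M, consistent with the 6M BHCA quoted on the slide.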


WHAT HAPPENS WHEN SBC RUNS OVER HYPERVISOR

• Signaling function performs as expected
  • Hypervisor overhead in the range 5-20%
  • Cross-core locking a major impact
  • Typical of network-intensive virtualised applications
  • Can be detailed issues to resolve, e.g. high-res system clock on KVM
• Media function suffers substantial throughput reduction
  • RTP relay requires high throughput of small UDP packets
  • Bare metal performance of 4M packets/second is achievable
  • Hypervisor can introduce order-of-magnitude reduction
  • Highly dependent on hypervisor and vNIC driver choice
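To see why packet rate rather than bandwidth is the limiting factor for RTP relay, a rough estimate helps. The sketch below rests on assumptions that are not in the slides: 20 ms packetisation (50 packets per second per direction) and two directions per session. The function names are hypothetical.

```python
def relay_packets_per_second(sessions: int, ptime_ms: float = 20.0,
                             directions: int = 2) -> float:
    """Packets per second arriving at the relay (assumed 20 ms ptime)."""
    packets_per_stream = 1000.0 / ptime_ms
    return sessions * directions * packets_per_stream


def sessions_for_budget(packets_per_second: float, ptime_ms: float = 20.0,
                        directions: int = 2) -> float:
    """Sessions that fit within a given received-packet budget."""
    return packets_per_second / (directions * (1000.0 / ptime_ms))


print(relay_packets_per_second(20_000))   # 20k sessions
print(sessions_for_budget(4_000_000))     # bare-metal budget from the slide
print(sessions_for_budget(400_000))       # after a 10x reduction
```

Under these assumptions 20,000 sessions generate 2M received packets per second, and a tenfold drop in packet throughput cuts the supportable session count tenfold as well.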


HYPERVISOR UDP PACKET THROUGHPUT VS RAW

[Chart: throughput in Mbps for 64-byte packets, hypervisors compared with raw hardware; vertical axis marked 0, 500, 1000.]

Acknowledgement: "Performance Comparison of Common Server Hardware Virtualization Solutions Regarding the Network Throughput of Virtualized Systems", Daniel Schlosser, Michael Duelli, and Sebastian Goll, University of Würzburg, Germany, March 2011
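The chart is expressed in Mbps, while the surrounding slides talk in packets per second. A small conversion, sketched here with a hypothetical helper, relates the two for 64-byte packets. Whether the chart's Mbps values include Ethernet preamble and inter-frame gap is not stated, so the per-packet wire overhead is left as a parameter; the default of 20 bytes (8 preamble plus 12 inter-frame gap) is the standard Ethernet figure.

```python
def packets_per_second(mbps: float, frame_bytes: int = 64,
                       wire_overhead_bytes: int = 20) -> float:
    """Convert a bit rate into a packet rate for fixed-size frames."""
    bits_per_packet = (frame_bytes + wire_overhead_bytes) * 8
    return mbps * 1_000_000 / bits_per_packet


# Gigabit line rate with 64-byte frames, counting wire overhead.
print(int(packets_per_second(1000)))
# The same rate if the overhead is not counted.
print(int(packets_per_second(1000, wire_overhead_bytes=0)))
```

At the top of the chart's scale, 1000 Mbps corresponds to about 1.5M packets per second with wire overhead counted and about 1.95M without.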


HYPERVISOR PACKET THROUGHPUT CHALLENGE

• Quick fix by making use of passthrough mode
  • Nailed-up connection between SBC app and physical NIC(s)
  • Reduces flexibility of virtualised SBC
• Preferable to address hypervisor limitations
  • What's needed: improved packet throughput of vSwitch component
  • Intel very active in this space
  • SR-IOV (Single Root I/O Virtualisation): virtualises the NIC's resources for sharing between cores / guests
  • Also ARM + DSP SoC
  • But none currently compatible with virtual routing


TRANSCODING IN SOFTWARE

• Historically, transcoding has been performed by DSPs
• Transcoding in software on x86 is more costly
  • Hardware costs, power consumption costs
  • Over five years, software-based transcoding ~ 2X cost of DSP-based

BUT

• DSP resources can only be used for transcoding
• x86 resources can be used for any kind of processing
• Modern codecs (e.g. SILK) optimised for CPU not DSP
• Flexibility of software-based transcoding compensates for higher cost


REMAINING CHALLENGES

• Cohabitation of fast packet processing and virtual routing
  • Impacts NICs, servers, hypervisors and OSs
  • Needs support across that vendor ecosystem to resolve
• Plus
  • 1:1 FT needs control of instance location
  • Networking control:
    • Multiple redundant interfaces? Bonded?
    • Separate signaling and management networks / VLANs?
  • Cloud owner vs service owner management and monitoring
  • Reliability: do you depend on any cloud services (obvious or hidden)?
