Performance of Packet Capturing Systems
Hardware Selection for Monitoring
Fabian Schneider [email protected]
Technische Universität Berlin, Deutsche Telekom Laboratories
11.12.2006
Fabian Schneider (TU Berlin/DT Labs) Performance of Packet Capturing Systems 11.12.2006 1 / 34
Motivation
• high speed networks → high data and packet rate
• network security tools need to capture this traffic
• 2 choices:
   • expensive special hardware
   • cheap commodity systems
⇒ Is it feasible to capture the traffic with commodity hardware?
Outline
1 Monitoring 10 Gigabit
2 Measurement Setup
   Systems under Test
   Topology
   Procedure
   Profiling
3 Workload
   Workload Generation
   Packet Size Distribution
   Output
4 Results
   Using multiple processors?
   Increasing the buffer size
   Additional filtering
   Additional copy operations
   mmapped pcap (Linux)
   Write to disk
   Further Results
5 Conclusion
   Summary
   Future Work
   Resources
Monitoring 10 Gigabit
• monitoring 10 Gigabit of traffic requires approx. 2500 MByte/s (both directions)
• no recent bus or disk system can handle this!
• need to split up traffic:
   • use a switch: e.g. link bundling feature (Cisco: Etherchannel)
   • use specialized hardware
• But be careful: do not split up data that belongs together
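The bandwidth figure and the warning about splitting can be illustrated with a minimal sketch. It assumes that "data that belongs together" means both directions of one connection, and it uses a CRC32 hash over a direction-independent flow key. The function names are hypothetical, and this is not how Etherchannel or any particular splitter selects its output link.

```python
import zlib

def required_mbytes_per_s(link_gbit, directions=2):
    """MByte/s that must be handled to monitor a link of the given speed."""
    return link_gbit * 1000 / 8 * directions

def monitor_for(src_ip, src_port, dst_ip, dst_port, proto, n_monitors):
    """Choose a monitor so that both directions of a flow reach the same one."""
    # Sorting the two endpoints makes the key independent of the direction.
    lo, hi = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{lo[0]}:{lo[1]}|{hi[0]}:{hi[1]}|{proto}".encode()
    return zlib.crc32(key) % n_monitors

# Example addresses are made up for illustration.
forward = monitor_for("10.0.0.1", 40000, "10.0.0.2", 80, "tcp", 4)
reverse = monitor_for("10.0.0.2", 80, "10.0.0.1", 40000, "tcp", 4)
```

With this key, `forward` and `reverse` always name the same monitor, so a connection is never split across capture systems.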
Measurement Setup
Systems under Test
Opterons: 2x AMD Opteron 244 (1 MB Cache, 1.8 GHz), 2 GB RAM, Intel 82544EI optical GigE, Disk System: ATA-RAID on 3ware 7000 Controller
Xeons: 2x Intel Xeon (512 kB Cache, 3.06 GHz), 2 GB RAM, Intel 82544EI optical GigE, Disk System: ATA-RAID on 3ware 7000 Controller
Dual-Core Opterons: 2x2 AMD Opteron 270 (1 MB Cache, 2.0 GHz), 2 GB RAM, Intel 82544EI optical GigE, Disk System: SCSI-RAID on Compaq Smart Array 64xx & external RAID (easyRAID, SATA based) attached via SCSI
Two instances of each system: one installed with Linux, the other with FreeBSD.
Topology
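[Figure: measurement topology. Labels: workload generator gen, Cisco C3500XL switch, splitter, sniffers swan, snipe, flamingo and moorhen; paths: workload, SNMP interface counter queries, control network; interfaces eth0, eth1, eth2.]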
Procedure
Measurement categories:
• Capturing Rate
• System Load
Measurement Sequence:
1 Log in to the four sniffers → start the capturing and profiling applications (save process IDs).
2 Log in to gen → read SNMP packet counters of the switch.
3 Log in to gen → start packet generation.
4 Log in to gen → read SNMP packet counters of the switch.
5 Log in to the four sniffers → stop the applications (via saved process IDs).
Details: see the Measurement Specification slide.
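A minimal sketch of how a capturing rate could be derived from the two SNMP counter readings taken in steps 2 and 4. It assumes the switch counter is a 32-bit SNMP counter that may wrap around, and that the number of captured packets is reported by the capturing application. The function name and the example values are hypothetical.

```python
def capturing_rate(counter_before, counter_after, captured, counter_bits=32):
    """Percentage of the packets seen by the switch that the sniffer captured."""
    # Modulo arithmetic tolerates one wrap-around of the SNMP counter.
    sent = (counter_after - counter_before) % (2 ** counter_bits)
    if sent == 0:
        raise ValueError("switch counters did not advance")
    return 100.0 * captured / sent

# Made-up readings: one million packets forwarded, 950000 captured.
rate = capturing_rate(1_000, 1_001_000, 950_000)
```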
Profiling
Goal: record CPU usage while capturing
• based on the mechanisms used by top
• CPU accounting information (user, system, idle, interrupt, ...) written to a file twice per second
• additional minimum/maximum/average identification
• "under load" condition and resulting averages identified by an awk script.
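The accounting described above can be sketched as follows. This is an illustration, not the cpusage or trimusage.awk code: it assumes cumulative tick counters per CPU state, and the idle threshold used to decide what counts as "under load" is an assumption made for this example.

```python
from statistics import mean

def usage_percent(prev, curr):
    """Convert two samples of cumulative CPU tick counters into percentages."""
    deltas = {state: curr[state] - prev[state] for state in curr}
    total = sum(deltas.values())
    if total == 0:
        return {state: 0.0 for state in deltas}
    return {state: 100.0 * d / total for state, d in deltas.items()}

def under_load_average(samples, idle_threshold=95.0):
    """Average busy share over samples whose idle share is below the threshold.

    The threshold is a hypothetical criterion for the "under load" condition.
    """
    busy = [100.0 - s["idle"] for s in samples if s["idle"] < idle_threshold]
    return mean(busy) if busy else 0.0

# Made-up counters from two consecutive samples, half a second apart.
prev = {"user": 100, "system": 200, "interrupt": 50, "idle": 650}
curr = {"user": 110, "system": 240, "interrupt": 60, "idle": 690}
sample = usage_percent(prev, curr)
```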
Workload
Workload Generation
• Requirements:
   Speed: line speed (1 Gbit/s) is desired
   Reproducibility: of the load, to avoid unrepeatable failures
   Realness: at least the packet sizes should match
• Checked different existing tools → none was sufficient
• Best: the Linux Kernel Packet Generator, but it can only generate packets of fixed size
⇒ Need to add generation of different packet sizes → identify distributions.
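Drawing packet sizes from an observed distribution can be sketched in user space as below. The patched kernel generator is not shown here; the size counts are invented for illustration, and the fixed seed stands in for the reproducibility requirement.

```python
import random

def make_size_sampler(size_counts, seed=0):
    """Return a function that draws packet sizes according to observed counts."""
    sizes = list(size_counts)
    weights = [size_counts[s] for s in sizes]
    rng = random.Random(seed)  # fixed seed: the same sequence on every run

    def sample(n):
        return rng.choices(sizes, weights=weights, k=n)

    return sample

# Hypothetical counts per packet size in bytes, not taken from the trace.
observed = {40: 50, 576: 20, 1500: 30}
draws = make_size_sampler(observed, seed=1)(1000)
```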
Observed Packet Size Distribution
75 % of all packets are in the 13 most frequent sizes!
[Figure: percentage and cumulated percentage of packets per packet size, sorted by percentage descending; y-axis: percentage (0 to 100); remaining sizes labeled "rest".]
[Figure: number of packets per packet size in a 24h trace; x-axis: packet size (0 to 1500); y-axis: number of packets per size (10^1 to 10^9).]
Only few frequent sizes
Output: Packet Size Distribution
[Figure: number of packets (10^1 to 10^9) per size class, with packets classified by size and sorted descending by quantity of packets; series: originally captured, generated.]
Output: Data and Packet Rate
[Figure: generator output, median generation rate with min/max error bars. Packet rate (kpps, axis 0 to 800) and data rate (Mbit/s, axis 0 to 1000) for three workloads: max packet size (1500 bytes), distribution, min packet size (40 bytes).]
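Data rate and packet rate are linked through the packet size, which is why small packets stress a capture system far more than large ones at the same data rate. The helper below is a rough sketch with a hypothetical name; it ignores Ethernet framing overhead (preamble, inter-frame gap, minimum frame size), so achievable rates for very small packets are lower than this formula suggests.

```python
def packet_rate_kpps(data_rate_mbit, packet_size_bytes):
    """Packets per second (in thousands) for a data rate and a fixed packet size."""
    bits_per_packet = packet_size_bytes * 8
    return data_rate_mbit * 1e6 / bits_per_packet / 1e3

# Line speed with maximum-size packets.
large = packet_rate_kpps(1000, 1500)
```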
Results
Only one processor
[Figure (32) no-improvement: no SMP, no HT, 1 app; traffic: generated, no filter, no load. Capturing rate [%] and CPU usage [%] versus data rate [Mbit/s] (50 to 950) for Linux/AMD (swan), Linux/Intel (snipe), FreeBSD/AMD (moorhen) and FreeBSD/Intel (flamingo).]
Multiprocessor (SMP)
[Figure (31) no-improvement: SMP, no HT, 1 app; traffic: generated, no filter, no load. Capturing rate [%] and CPU usage [%] versus data rate [Mbit/s] (50 to 950) for Linux/AMD (swan), Linux/Intel (snipe), FreeBSD/AMD (moorhen) and FreeBSD/Intel (flamingo).]
increased buffers
[Figure (17) increased-buffers: SMP, no HT, 1 app; traffic: generated, no filter, no load. Capturing rate [%] and CPU usage [%] versus data rate [Mbit/s] (50 to 950) for Linux/AMD (swan), Linux/Intel (snipe), FreeBSD/AMD (moorhen) and FreeBSD/Intel (flamingo).]
additional filtering (BPF/LSF)
[Figure (21) filter: SMP, no HT, 1 app; traffic: generated, 50 BPF instr., no load. Capturing rate [%] and CPU usage [%] versus data rate [Mbit/s] (50 to 950) for Linux/AMD (swan), Linux/Intel (snipe), FreeBSD/AMD (moorhen) and FreeBSD/Intel (flamingo).]
50 additional copy ops
[Figure (27) memcpy-50: SMP, no HT, 1 app; traffic: generated, no filter, no load. Capturing rate [%] and CPU usage [%] versus data rate [Mbit/s] (50 to 950) for Linux/AMD (swan), Linux/Intel (snipe), FreeBSD/AMD (moorhen) and FreeBSD/Intel (flamingo).]
mmap Patch (only Linux)
[Figure (19) mmapped pcap: SMP, no HT, 1 app; traffic: generated, no filter, no load. Capturing rate [%] and CPU usage [%] versus data rate [Mbit/s] (50 to 950) for Linux/AMD mmap (swan), Linux/AMD (swan alt), Linux/Intel mmap (snipe) and Linux/Intel (snipe alt).]
Dual Core: writing to disk
[Figure (2-8) write to disk: SMP, 1 app; traffic: generated, no filter, no load. Capturing rate [%] and CPU usage [%] versus data rate [Mbit/s] (50 to 950) for 32bit FreeBSD/Opteron, 64bit FreeBSD/Opteron, 32bit Linux/Opteron and 64bit Linux/Opteron.]
Further Results
• Running multiple capturing applications concurrently leads to bad performance.
• Measurements with additional compression show some advantage for Intel systems.
• Intel Hyper-Threading does not change the performance.
• FreeBSD 5.4 performs better than FreeBSD 5.2.1 (no comparable measurements for FreeBSD 6 at the moment).
• Using 4 processors (2x dual core) is only minimally better than 2 processors (dual core).
Conclusion
Summary
• FreeBSD/AMD Opteron combination in general performs best
• choosing the right buffer size is important
• filtering is cheap with respect to its benefit
• using the memory-map (mmap) libpcap patch from Phil Wood does help
• 64-bit systems drop more packets
• capturing full traces to disk is feasible up to about 600 Mbit/s.
Future Work
• 10 Gigabit Ethernet
• future operating system versions / direct comparison of different versions on the same machine (e.g. FreeBSD 4.x, 5.x, 6.x)
• new Intel I/O Acceleration Technology
• implement mmapped packet reception for FreeBSD
• (Windows platform)
Questions?
Thanks for the attention!
Software
Profiling
• cpusage: available at http://www.net.in.tum.de/~schneifa/sources/cpusage-0.2.tar.gz
• trimusage.awk script: http://www.net.in.tum.de/~schneifa/sources/trimusage.awk
Capturing
• createDist: available at http://www.net.in.tum.de/~schneifa/sources/createDist-0.1.tar.gz
• tcpdump: available at www.tcpdump.org
Workload
• patched LKPG: available at http://www.net.in.tum.de/~schneifa/pktgen-lkpg-dist-0.1.tar.gz
Further Reading
F. Schneider. Best Packet Capture System. http://www.net.t-labs.tu-berlin.de/research/bpcs/
F. Schneider. Performance evaluation of packet capturing systems for high-speed networks. Diploma thesis, http://www.net.in.tum.de/~schneifa/papers/da.ps
Measurement Specification
• seven similar measurements → to avoid errors
• a million packets per run
• 26 different inter-packet gaps per measurement → increasing data and packet rate
• average of the different runs, with error bars for min and max values
• no filter to capture all the packets
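The aggregation over repeated runs can be sketched as a small helper. The seven capturing rates below are invented for illustration, and the function name is hypothetical.

```python
from statistics import mean

def summarize_runs(rates):
    """Average of several runs plus the min/max values used for error bars."""
    return {"avg": mean(rates), "min": min(rates), "max": max(rates)}

# Hypothetical capturing rates [%] of seven runs at one inter-packet gap.
summary = summarize_runs([90.0, 92.0, 94.0, 96.0, 98.0, 100.0, 95.0])
```

One such summary per inter-packet gap gives the 26 points of a curve.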
Workload – Implementation
Packet Reception in FreeBSD
• interrupt context
• double buffer as interface to userspace
• one buffer pair per capturing session
• 3 packet copy operations
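The double-buffer interface can be modeled in a few lines. This is a simplified user-space model, not the FreeBSD kernel code: the two buffers are named store and hold here, the class and method names are hypothetical, and neither the interrupt context nor the copy operations are modeled. It shows why packets are dropped when the reader is too slow: once the store buffer is full and the hold buffer has not been read, there is nowhere to put the next packet.

```python
class DoubleBuffer:
    """Toy model of one buffer pair of a capturing session."""

    def __init__(self, capacity):
        self.capacity = capacity  # bytes per buffer
        self.store = []           # filled by the packet reception side
        self.store_bytes = 0
        self.hold = None          # full buffer waiting for the reader
        self.dropped = 0

    def _rotate(self):
        # The store buffer becomes the hold buffer; a fresh store buffer starts.
        self.hold = self.store
        self.store, self.store_bytes = [], 0

    def enqueue(self, packet):
        """Add a packet; return False if it had to be dropped."""
        if len(packet) > self.capacity:
            self.dropped += 1
            return False
        if self.store_bytes + len(packet) > self.capacity:
            if self.hold is not None:  # reader has not emptied the hold buffer
                self.dropped += 1
                return False
            self._rotate()
        self.store.append(packet)
        self.store_bytes += len(packet)
        return True

    def read(self):
        """Hand the hold buffer (or a partially filled store buffer) to the reader."""
        if self.hold is None and self.store:
            self._rotate()
        packets, self.hold = self.hold or [], None
        return packets

buf = DoubleBuffer(capacity=100)
results = [buf.enqueue(b"x" * 40) for _ in range(5)]
```

In this model a larger `capacity` lets the reader fall further behind before the first drop occurs.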
Packet Reception in Linux
• soft-interrupts used
• central memory block for all packets handled in kernel
• pointer queue as interface to userspace
• 2 packet copy operations