Accurate And Flexible Flow-Based Monitoring For High-Speed Networks
Reporter: Hsuan-Ju Li
2014/12/25
Field Programmable Logic and Applications (FPL), 2013 23rd International Conference on, Sept. (2013)
Marco Forconesi, Gustavo Sutter, Sergio Lopez-Buedo, Javier Aracil
Outline
  Introduction
  Related Work
  Flow-Based Monitoring Technique And Development Platform
  Proposed Architectures
  Results And Validation
  Conclusions
Introduction
Network operators routinely use flow-based tools to track:
  Bandwidth utilization
  Network malfunctions
  Network attacks
Introduction (cont.)
[Figure: flow-based tools infrastructure, showing a flow exporter and a flow collector]
Introduction (cont.)
Flow exporter
  Analyzes packets in the network
  Creates the flows
  Periodically sends finished flows to the collector
  Typically implemented inside routers
  Benefit: no extra component needs to be added
Introduction (cont.)
Flow collector
  Receives flows from exporters
  Stores flows for future processing
Introduction (cont.)
Drawbacks of the flow exporter
  Routers can be overloaded by too much traffic
  Flow monitoring is then skipped so that all computing resources can be dedicated to routing packets
  As a result, all flow information is lost
Introduction (cont.)
Drawbacks of the flow exporter
  In high-speed networks, routers and switches perform flow-based monitoring on the basis of packet sampling:
    Not all the packets on the network are analyzed
    The data delivered to the collector has poor accuracy
Introduction (cont.)
Drawbacks of the flow exporter
  Routers and switches have limited resources, so they cannot scale to higher link rates or to larger memories that store more active flows
  Network devices are closed platforms
    Network engineers are not free to modify how flows are defined
    As a result, they do not know what type of information is collected
Related Work
Network probes that generate network flows at 10 Gbps
  Software-based probes
  FPGA-based probes
Related Work (cont.)
Software-based probes
  Use commodity servers
  Rely on multicore architectures and a careful balance between cores
  Popular open-source approaches achieve close to 10 Mpps
    nProbe, softflowd, ffProbe
Related Work (cont.)
FPGA-based probes
  The first implementation was on NetFPGA-1G (a Virtex-2 platform)
  "Reconfigurable architecture for network flow analysis" (IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 16, no. 1)
    An architecture for network flow analysis on a Virtex-2 device that stores up to 65,536 concurrent flows at a maximum rate below 3 Mpps
Related Work (cont.)
FPGA-based probes
  FlowMon for network monitoring
    A 10 Gbps implementation with 256,000 concurrent active flows, presented in FlowMon using a Virtex-5 in the context of the Liberouter project
  "An FPGA based hardware architecture for network flow analysis"
    Gives some results for Virtex-5, claiming a superior speed but only for 500 concurrent flows
Flow-Based Monitoring Technique And Development Platform
According to Cisco's definition:
  A flow is a unidirectional stream of packets between a given source and destination
  A flow is identified by five key fields (the 5-tuple):
    Source IP address, destination IP address, source port number, destination port number, and Layer 3 protocol type
  Packets with the same 5-tuple belong to the same flow
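The 5-tuple definition can be illustrated with a short sketch. The name FlowKey and the sample addresses are made up for illustration; the point is only that identical 5-tuples group into one flow, while the reverse direction counts as a separate flow because flows are unidirectional.

```python
from collections import Counter, namedtuple

# Hypothetical name: FlowKey is an illustrative container for the 5-tuple.
FlowKey = namedtuple("FlowKey", "src_ip dst_ip src_port dst_port protocol")

# Made-up sample packets, already reduced to their 5-tuples (protocol 6 = TCP).
packets = [
    FlowKey("10.0.0.1", "10.0.0.2", 40000, 80, 6),
    FlowKey("10.0.0.1", "10.0.0.2", 40000, 80, 6),  # same 5-tuple, same flow
    FlowKey("10.0.0.2", "10.0.0.1", 80, 40000, 6),  # reverse direction, new flow
]

# Counting by key groups the packets into flows.
packets_per_flow = Counter(packets)
```

Here the three packets form two flows, the first of which holds two packets.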
[Figure: a flow identified by its key fields: Src. IP address, Dst. IP address, Src. port #, Dst. port #, network layer]
Flow-Based Monitoring Technique And Development Platform (cont.)
Flow cache
  A fast local memory inside the exporter
  Used to store the active flows of the link being monitored
Flow table
  A data structure in the flow cache
  Consists of a list of flow records
Flow record
  Contains the number of packets, the total number of transmitted bytes, the timestamps of flow creation and expiration, and the TCP flags
[Figure: flow record fields: # packets, total # transmitted bytes, timestamp of flow creation, TCP flags]
Flow-Based Monitoring Technique And Development Platform (cont.)
[Figure: the flow table consists of flow records 0 to n; each flow record consists of the 5-tuple (Src. IP address, Dst. IP address, Src. port #, Dst. port #, network layer) and the counters (# packets, total # transmitted bytes, timestamp of flow creation, TCP flags)]
Flow-Based Monitoring Technique And Development Platform (cont.)
Every time a packet is received, the memory is polled to determine whether the extracted 5-tuple matches an active flow
  If there is no match, a new flow entry is created
  Otherwise, the active flow in the flow table is updated
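The create-or-update step can be sketched in software as follows. This is a minimal illustration, not the hardware design: a Python dict stands in for the flow table, and the names FlowRecord, process_packet and last_seen are assumptions introduced here.

```python
from dataclasses import dataclass

@dataclass
class FlowRecord:
    """Per-flow counters from the flow cache slide (field names are assumed)."""
    packets: int
    total_bytes: int
    created_at: float
    last_seen: float
    tcp_flags: int

def process_packet(flow_table, key, length, timestamp, tcp_flags=0):
    """Create a record on the first packet of a flow, update it afterwards."""
    record = flow_table.get(key)
    if record is None:
        # No active flow matches the 5-tuple: create a new entry.
        flow_table[key] = FlowRecord(1, length, timestamp, timestamp, tcp_flags)
    else:
        # The 5-tuple matches an active flow: update its counters.
        record.packets += 1
        record.total_bytes += length
        record.last_seen = timestamp
        record.tcp_flags |= tcp_flags  # accumulate the flags seen so far
    return flow_table[key]

flow_table = {}
key = ("10.0.0.1", "10.0.0.2", 40000, 80, 6)
process_packet(flow_table, key, 64, 0.0, tcp_flags=0x02)    # SYN
process_packet(flow_table, key, 1500, 0.5, tcp_flags=0x10)  # ACK
```

After the two calls the table holds a single record with two packets and 1564 bytes.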
Flow-Based Monitoring Technique And Development Platform (cont.)
In parallel with flow creation and updates, a mechanism is needed to remove flows from the flow table once they are no longer on the link. A flow is removed on:
  A timeout
  The end of a TCP connection, signaled by the FIN and RST flags
Flow-Based Monitoring Technique And Development Platform (cont.)
In parallel with flow creation and updates
  Two concurrent processes access the memory (flow table)
Flow-Based Monitoring Technique And Development Platform (cont.)
The design has been implemented and tested on NetFPGA-10G
  Second release of the NetFPGA project
  An open-source hardware and software platform
  Developed by Stanford University together with Xilinx Labs
  Virtex-5 TX240T FPGA, which provides four independent 10 Gbps Ethernet ports
  Populated with three QDR-II and four RLDRAM memory devices
    These provide 27 MB and 288 MB of external storage, respectively
Proposed Architectures
Two implementations were developed
  NF_BRAM: uses internal BlockRAMs to store the active flows
    Supports up to 16,384 concurrent flows
  NF_QDR: uses external QDR-II memory
    Supports up to 786,432 concurrent flows
Proposed Architectures (cont.)
[Figure: architecture of NF_BRAM]
Proposed Architectures (cont.)
Extracts the 5-tuple from the Ethernet frames
  Also extracts the information needed to create or update a flow: timestamp, TCP flags, and number of bytes
Proposed Architectures (cont.)
Calculates a hash code to obtain the address where the flow record will be stored
  The probability of collision depends on the input 5-tuples, which follow a non-uniform distribution, so this module is intended to be modified
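As a rough illustration of hashing a 5-tuple to a table address, the sketch below uses CRC-32 as a stand-in. The slides do not say which hash function the design uses, so the function choice, the IPv4-only packing and the name flow_address are all assumptions.

```python
import struct
import zlib
from ipaddress import IPv4Address

TABLE_SIZE = 16_384  # NF_BRAM capacity quoted in the slides

def flow_address(src_ip, dst_ip, src_port, dst_port, protocol):
    """Map a 5-tuple to a flow table address (CRC-32 is a stand-in hash)."""
    # Pack the five fields into 13 bytes in network byte order.
    packed = struct.pack(
        "!IIHHB",
        int(IPv4Address(src_ip)),
        int(IPv4Address(dst_ip)),
        src_port,
        dst_port,
        protocol,
    )
    return zlib.crc32(packed) % TABLE_SIZE

address = flow_address("10.0.0.1", "10.0.0.2", 40000, 80, 6)
```

Two different 5-tuples can map to the same address. That is the collision case the slide refers to, and how often it happens depends on the traffic mix.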
Proposed Architectures (cont.)
This module implements 'Process A'
With the previously calculated hash code, the flow table is addressed and its content analyzed
  Busy flag: if it is set, an active flow occupies that memory location
  The received 5-tuple is compared to the stored one
    If they match, the flow is updated with the information of the received packet
    If they do not match, a collision has occurred and the received packet is discarded
Proposed Architectures (cont.)
This module implements 'Process A' (continued)
With the previously calculated hash code, the flow table is addressed and its content analyzed
  Busy flag: if it is clear, a new flow record is created in that position of the flow table
  RST and FIN flags: if either TCP flag is received, the memory is polled to check whether there is an active flow to which the packet belongs; that flow is updated and exported immediately
Proposed Architectures (cont.)
Process A() {
    if (busy_flag == 1) {    // an active flow is in that memory location
        if (received 5-tuple == stored 5-tuple) {
            // match
            update this flow with the information of the received packet;
        } else {
            // collision
            discard the received packet;
        }
    }
}
Proposed Architectures (cont.)
Process A() {
    if (busy_flag == 0) {    // the busy flag is clear
        create a new flow record in that position of the flow table;
    }
    if (RST == 1 || FIN == 1) {    // a TCP termination flag is asserted
        poll the memory to check whether there is an active flow to which the packet belongs;
        update that flow and export it immediately;
    }
}
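The Process A pseudocode can be turned into a small runnable sketch. It assumes a direct-mapped table modeled as a Python list, with None standing for a clear busy flag, and the hash address passed in by the caller. The name process_a, the dict layout and freeing the slot on export are illustrative choices, not details taken from the slides.

```python
FIN, RST = 0x01, 0x04  # TCP flag bit masks

def process_a(table, address, key, length, flags, exported):
    """Handle one packet against a direct-mapped flow table."""
    entry = table[address]
    if entry is None:
        # Busy flag clear: create a new flow record at this address.
        entry = {"key": key, "packets": 1, "bytes": length, "flags": flags}
        table[address] = entry
        outcome = "created"
    elif entry["key"] == key:
        # Busy and matching 5-tuple: update the active flow.
        entry["packets"] += 1
        entry["bytes"] += length
        entry["flags"] |= flags
        outcome = "updated"
    else:
        # Busy with a different 5-tuple: collision, the packet is discarded.
        return "discarded"
    if flags & (FIN | RST):
        # Assumption: a terminated flow is exported and its slot freed at once.
        exported.append(entry)
        table[address] = None
        outcome = "exported"
    return outcome

table, exported = [None] * 8, []
flow_a = ("10.0.0.1", "10.0.0.2", 40000, 80, 6)
flow_b = ("10.0.0.3", "10.0.0.2", 40001, 80, 6)
outcomes = [
    process_a(table, 3, flow_a, 64, 0x02, exported),        # SYN
    process_a(table, 3, flow_a, 100, 0x10, exported),       # ACK
    process_a(table, 3, flow_b, 64, 0x02, exported),        # collides with flow_a
    process_a(table, 3, flow_a, 64, FIN | 0x10, exported),  # FIN ends flow_a
]
```

In this run the packet of flow_b is dropped because flow_a already occupies address 3, which is the collision loss the slides describe.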
Proposed Architectures (cont.)
This module is implemented with the BlockRAM of the FPGA
  Each flow record is stored and read back in a single memory address
  Processes A and B access the memory completely in parallel and independently
Proposed Architectures (cont.)
This module implements 'Process B' and performs two checks
  First, it checks that the time elapsed since the last packet of the flow arrived is less than a predefined inactivity timeout
  Second, it checks that the time elapsed since the flow record was created is less than a predefined maximum flow duration
Proposed Architectures (cont.)
This module implements 'Process B' (continued)
  If either check fails, that is, a timeout has been exceeded:
    The flow record is exported
    The flow is removed from the flow table
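The two Process B checks can be sketched as a single pass over the table. The timeout values below are placeholders chosen for illustration, since the slides only say the limits are predefined. The record fields created_at and last_seen and the name process_b are likewise assumptions.

```python
INACTIVITY_TIMEOUT = 15.0   # seconds, assumed value
MAX_FLOW_DURATION = 1800.0  # seconds, assumed value

def process_b(table, now, exported):
    """Scan the table once, exporting and removing every expired flow."""
    for address, entry in enumerate(table):
        if entry is None:
            continue  # empty slot, nothing to expire
        idle = now - entry["last_seen"] >= INACTIVITY_TIMEOUT
        too_long = now - entry["created_at"] >= MAX_FLOW_DURATION
        if idle or too_long:
            exported.append(entry)  # export the flow record
            table[address] = None   # remove it from the flow table

table = [
    {"id": "idle", "created_at": 1900.0, "last_seen": 1950.0},
    {"id": "active", "created_at": 1900.0, "last_seen": 1999.0},
    None,
    {"id": "long", "created_at": 100.0, "last_seen": 1999.5},
]
exported = []
process_b(table, now=2000.0, exported=exported)
```

Only the flow that is both recent and short-lived stays in the table. The other two are exported, one for inactivity and one for exceeding the maximum duration.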
Proposed Architectures (cont.)
Receives the flow records that were purged from the flow table and exports them out of the flow cache core
  Could also be connected to the PCIe DMA engine in order to send flow records directly to the host computer
  Implements the flow exporting protocol (NetFlow, IPFIX)
  Flows are exported through one of the 10 Gbps Ethernet ports available in NetFPGA-10G
Proposed Architectures (cont.)
[Figure: architecture of NF_QDR]
Proposed Architectures (cont.)
The NF_QDR architecture improves flow monitoring in three ways
  The QDR-II memory allows a much bigger flow table (786,432 vs. 16,384 records for the NF_BRAM architecture)
  Flow records are now 288 bits long, instead of the 241 bits of the original NF_BRAM design
    The 47 additional bits store extra information
  It reduces flow drops caused by collisions in the hash function
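The table sizes implied by these figures can be checked with a few lines of arithmetic. The sketch uses only the flow counts and record widths quoted above and assumes no padding or parity overhead.

```python
NF_BRAM_FLOWS, NF_BRAM_RECORD_BITS = 16_384, 241
NF_QDR_FLOWS, NF_QDR_RECORD_BITS = 786_432, 288

def table_bytes(flows, record_bits):
    """Raw flow table size in bytes, ignoring any padding or parity."""
    return flows * record_bits / 8

MIB = 1024 * 1024
nf_bram_kib = table_bytes(NF_BRAM_FLOWS, NF_BRAM_RECORD_BITS) / 1024
nf_qdr_mib = table_bytes(NF_QDR_FLOWS, NF_QDR_RECORD_BITS) / MIB
extra_bits = NF_QDR_RECORD_BITS - NF_BRAM_RECORD_BITS
```

Under these assumptions the NF_BRAM table needs 482 KiB and the NF_QDR table exactly 27 MiB, which matches the 27 MB of QDR-II storage quoted for the NetFPGA-10G board.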
Proposed Architectures (cont.)
The QDR-II external memories have only one available port
  A multiplexing mechanism lets both processes share the communication with the memory
Proposed Architectures (cont.)
First looks up whether the active flow record is in the internal cache module
  If the flow is in the cache, it is updated and the update is written back to the external main memory
  If the flow is not found in the cache, a read operation to the main memory is performed
Proposed Architectures (cont.)
The cache module stores the most recently created flows
  Bursts of packets that belong to the same flow do not poll the memory every time
  The external memory is addressed only to write the updated flow back, so an exact copy of the information in the cache is present in the main memory
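The cache behavior described above resembles a write-through cache, and a minimal software model is sketched below. The class name, the dict-based memories and the unbounded cache size are simplifications introduced here. A real cache holds a limited number of entries and needs a replacement policy, which the slides do not detail.

```python
class WriteThroughFlowCache:
    """Small cache in front of a slower main memory (names are illustrative)."""

    def __init__(self, main_memory):
        self.main_memory = main_memory  # dict standing in for the QDR-II memory
        self.cache = {}
        self.main_reads = 0
        self.main_writes = 0

    def update(self, address, length):
        record = self.cache.get(address)
        if record is None:
            # Cache miss: the record has to be read from main memory.
            self.main_reads += 1
            stored = self.main_memory.get(address, {"packets": 0, "bytes": 0})
            record = dict(stored)
        record["packets"] += 1
        record["bytes"] += length
        self.cache[address] = record
        # Write-through: main memory always holds an exact copy.
        self.main_memory[address] = dict(record)
        self.main_writes += 1
        return record

main_memory = {}
flow_cache = WriteThroughFlowCache(main_memory)
for _ in range(4):  # a burst of four packets from the same flow
    flow_cache.update(7, 64)
```

The burst of four packets costs one main-memory read and four writes, so the read port is polled only once while the copy in main memory stays current.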
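Proposed Architectures (cont.)
[Figure: NF_QDR memory access: processes A (PA) and B (PB) share the memory through a MUX; the cache of most recently created flows is looked up first, and the external memory is addressed to write]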
Proposed Architectures (cont.)
The dispatcher keeps a number of read operations queued so as to maximize the memory throughput
Results And Validation
Hardware resource utilization
  The designs were coded in VHDL and synthesized using Xilinx EDK/XST v13.4
  The clock frequency for both architectures is 200 MHz
Results And Validation (cont.)
Two general-purpose PCs with 10 Gbps Ethernet interfaces were connected to the NetFPGA-10G platform
  The first PC was used as a traffic generator, running a high-performance network driver capable of saturating a 10 Gbps link
  The second PC captured to a file the flow records exported by the design under test
  The same input traffic was processed offline with a well-known flow capturing software
    P. S. del Río, D. Corral, J. García-Dorado, and J. Aracil, "On the impact of packet sampling on Skype traffic classification," IFIP/IEEE International Symposium on Integrated Network Management (IM 2013), 2013.
Results And Validation (cont.)
Worst-case test: a loop generator of minimum-size packets with minimum interframe gaps during a 100-second run
Real traffic captures: flow creation was tested in a real scenario and the output was checked against the software tools mentioned above
Conclusions
The proposed design copes with saturated 10 Gbps links even at the highest packet rates
  14.88 Mpps for the shortest 64-byte Ethernet frames
The design supports up to 786,432 concurrent active flows
The HDL code for both architectures has been released as public open-source hardware projects
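The 14.88 Mpps figure follows from standard Ethernet framing overhead, as the short calculation below shows. It also derives how many 200 MHz clock cycles are available per packet at that rate. The cycle budget is a derived illustration, not a number stated in the slides.

```python
LINK_RATE_BPS = 10_000_000_000
MIN_FRAME_BYTES = 64
PREAMBLE_BYTES = 8         # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12  # minimum gap between frames
CLOCK_HZ = 200_000_000     # clock frequency quoted in the results

# Each minimum-size frame occupies 84 bytes of link time.
bits_per_frame = (MIN_FRAME_BYTES + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8
max_pps = LINK_RATE_BPS / bits_per_frame
cycles_per_packet = CLOCK_HZ / max_pps
```

Each minimum-size frame takes 672 bit times on the wire, which gives about 14.88 million packets per second and roughly 13.4 clock cycles per packet at 200 MHz.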
THANK YOU