IP Network Performance Measurements Bruce Morgan AARNet Pty Ltd

Just checking…

Why metrics?

Metrics are important to identify network related issues, especially performance

Metrics can be diverse

No one metric is suitable for all needs

Types of Measurement

Active Measurement: injecting measurement data into the network, e.g. UDP, TCP, ICMP packets

Passive Measurement: measuring what is there already

The Problem

Measurement of the network cloud is difficult – but is essential if we are to gauge user perception of the internet

The World Wide Wait

Some problems are host based, while others are network based:

Physical latency

Network queuing and delays

Server processing delay

Timeouts and packet loss

TCP protocol delays

The Dark Cloud

Diverse network paths

Asymmetric paths

Policy routing

Committed Access Rates

Firewalls and filters

IP Performance Metrics

Framework spelt out in RFC 2330 from the IPPM Working Group

Goal: “to achieve a situation in which users and providers of Internet transport service have an accurate common understanding of the performance and reliability of the Internet component 'clouds' that they use/provide.”

On the Standards track…

RFC 2678 IPPM Metrics for Measuring Connectivity

RFC 2679 A One-way Delay Metric for IPPM.

RFC 2680 A One-way Packet Loss Metric for IPPM.

RFC 2681 A Round-trip Delay Metric for IPPM.

A One-way Delay Metric

Type-P-One-way-Delay

The P is for protocol

A Poisson distribution is chosen to inject packets

Both source and destination require time synchronisation
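The Poisson injection mentioned above can be sketched as follows. This is a minimal illustration, not the IPPM reference method: the function name, the rate of 2 probes per second and the seed are assumptions chosen for the example. Gaps between probes are drawn from an exponential distribution, which is what makes the send times a Poisson process.

```python
import random


def poisson_send_times(rate_per_s, duration_s, seed=None):
    """Return probe send times (seconds from start) for a Poisson stream.

    rate_per_s is the average number of probes per second (hypothetical
    parameter name); gaps between probes are exponentially distributed.
    """
    rng = random.Random(seed)
    times = []
    t = rng.expovariate(rate_per_s)  # gap before the first probe
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate_per_s)  # gap before the next probe
    return times


if __name__ == "__main__":
    schedule = poisson_send_times(rate_per_s=2.0, duration_s=10.0, seed=1)
    print(len(schedule), "probes scheduled in 10 s")
```

Because the gaps are random, the probes do not synchronise with periodic behaviour in the network, which a fixed-interval schedule could do.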

A Round-trip Delay Metric

Many applications do not perform well with large end-to-end delays

Ease of deployment compared to one-way metrics

Ease of interpretation

Ping

Two-way path measurement based on RTTs (round-trip times)

Choice of monitored address:

Host

Router interface

Router loopback address

Packet Loss on ICMP

Loss asymmetry: Loss = 1 - ((1 - Loss_fwd) × (1 - Loss_rcv))

Path asymmetry

Internet Service Providers (ISPs), sites or even hosts may rate limit (or completely block) ICMP echo, giving rise to invalid packet loss measurements.
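The loss formula above can be checked with a short calculation. This is a sketch that assumes forward and return losses are independent; the function name and the sample loss rates are hypothetical.

```python
def round_trip_loss(loss_fwd, loss_rcv):
    """Combine one-way loss probabilities into the loss seen by ping.

    Assumes losses in the two directions are independent.
    """
    for p in (loss_fwd, loss_rcv):
        if not 0.0 <= p <= 1.0:
            raise ValueError("loss must be a probability between 0 and 1")
    # A probe survives only if both the request and the reply get through.
    return 1.0 - (1.0 - loss_fwd) * (1.0 - loss_rcv)


if __name__ == "__main__":
    # 1% forward loss and 2% return loss (illustrative values)
    print(round_trip_loss(0.01, 0.02))
    # The same round-trip figure can hide very different one-way losses
    print(round_trip_loss(0.0298, 0.0))
```

A round-trip measurement alone cannot say which direction lost the packet, which is the limitation the one-way metrics address.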

PingER

PingER (Ping End-to-end Reporting) is the name given to the Internet End-to-end Performance Measurement (IEPM) project to monitor end-to-end performance of Internet links

Uses ICMP RTT for measurement

Surveyor

Dedicated PC running Unix at key sites

GPS for clock synchronization

One-way delay & loss measurements

Community is Internet 2 clients, HEP sites collaborating with Surveyor

PingER/Surveyor Comparison

PingER uses the ICMP echo facility (ping) and thus only makes round trip measurements.

Surveyor uses a GPS system to synchronise time between sites and makes one way measurements.


Surveyor requires a dedicated platform (PC) to be installed at each site that is monitored, whereas PingER uses an existing host with no special software installed at the monitored site.

PingER cheaper!


Surveyor is more accurate and better for short term measurement, especially for sites which have good connectivity.

PingER is a more lightweight solution: it requires less management, uses less bandwidth, requires less storage, and needs nothing installed at the remotely monitored sites. It is good for remote sites with poor connectivity.


             Surveyor                  PingER
Method       1 way delay               2 way ping
Hosts        dedicated                 selected
Frequency    ~2*2/s                    ~0.01/s
Timing       Poisson <2/s>             bursty (30 min intervals)
Monitors     ~30                       18
Remotes      ~30 (~full mesh)          ~300 (hierarchical)
Pairs        ~900                      ~1200
Storage      ~38 Mbytes / pair / mo    ~0.6 Mbytes / pair / mo
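As a rough worked example, the Pairs and Storage figures in the table above imply the following total monthly storage for each system. The figures are the approximate ones from the table; the function and dictionary names are made up for this sketch.

```python
# Approximate figures taken from the comparison table above.
TABLE = {
    "Surveyor": {"pairs": 900, "mbytes_per_pair_month": 38.0},
    "PingER": {"pairs": 1200, "mbytes_per_pair_month": 0.6},
}


def monthly_storage_mbytes(pairs, mbytes_per_pair_month):
    """Total storage per month across all monitored pairs, in Mbytes."""
    return pairs * mbytes_per_pair_month


if __name__ == "__main__":
    for name, row in TABLE.items():
        total = monthly_storage_mbytes(**row)
        print(f"{name}: about {total:.0f} Mbytes per month")
```

On these numbers Surveyor stores roughly 34 Gbytes per month against under 1 Gbyte for PingER, even though PingER covers more pairs.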

PingER - Surveyor Complementarity

Agree well

Surveyor has one-way measurements, PingER only round-trip

Surveyor has dedicated platforms & strong central management; experience with PingER shows this has benefits

PingER is more parsimonious/lightweight (bandwidth, disk space, CPU) but necessarily less accurate, especially at small (hourly) time resolution on low loss links

PingER is good for looking at long term trends & grouping, where statistics are less of a problem

TCP SYN / ACK tools

In order to truly measure Web traffic, which is almost entirely TCP/IP traffic, it is best to probe using TCP/IP rather than ICMP

SYN/ACK mechanism proves useful for this purpose

TCP SYN/ACK tools: 3-way handshake

Client                 Server
Send SYN seq=x
                       Receive SYN
                       Send SYN seq=y, ACK x+1
Receive SYN+ACK
Send ACK y+1
                       Receive ACK

TCP SYN/ACK

The tool requests a connection with a SYN and measures the time taken by the target to respond with a SYN+ACK

The connection is promptly cleared by another exchange of packets, this time containing the FIN control flag.
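A simple way to approximate this measurement without crafting raw packets is to time an ordinary TCP connect, since the connect call returns once the SYN+ACK has arrived. This is a sketch only and not the tool used for the results in these slides; the function name and the default timeout are assumptions.

```python
import socket
import time


def tcp_connect_time(host, port, timeout=3.0):
    """Return seconds taken to open a TCP connection, or None on failure.

    Pass an IP address rather than a name so that DNS lookup time is not
    included in the measurement.
    """
    start = time.perf_counter()
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return None  # refused, filtered or timed out: treat as a lost probe
    elapsed = time.perf_counter() - start
    sock.close()  # normal close, which clears the connection with FIN
    return elapsed


if __name__ == "__main__":
    # 192.0.2.1 is a documentation address; substitute a real target.
    print(tcp_connect_time("192.0.2.1", 80, timeout=1.0))
```

The result includes local stack overhead as well as network delay, so it is best compared against other samples taken the same way.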

TCP SYN/ACK tools

Metric                Ping           SYN/ACK
Samples               30000          30000
Average               161.6 ms       158.0 ms
Standard Deviation    33.0 ms        11.6 ms
Median                154.4 ms       153.0 ms
Minimum               151 ms         150 ms
Maximum               1222 ms        610 ms
Lost packets          528 (1.76%)    469 (1.56%)
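The metrics in the table above can be computed from raw samples along these lines. This is a sketch with hypothetical names: it assumes a lost probe is recorded as None and that delay statistics are taken over received probes only, which the slides do not state.

```python
import statistics


def summarise_rtts(samples_ms):
    """Summarise RTT samples in milliseconds; None marks a lost probe.

    Needs at least two received samples for the standard deviation.
    """
    received = [s for s in samples_ms if s is not None]
    lost = len(samples_ms) - len(received)
    return {
        "samples": len(samples_ms),
        "average": statistics.mean(received),
        "stdev": statistics.stdev(received),  # sample standard deviation
        "median": statistics.median(received),
        "minimum": min(received),
        "maximum": max(received),
        "lost": lost,
        "lost_pct": 100.0 * lost / len(samples_ms),
    }


if __name__ == "__main__":
    # Illustrative values only, not measured data
    print(summarise_rtts([150.0, 152.0, 154.0, None]))
```

The median and minimum are less sensitive to occasional very slow replies than the average, which is one reason to report all of them.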

TCP SYN/ACK tools

Sting

Sting is a TCP-based network measurement tool that measures end-to-end network path characteristics. Sting is unique because it can estimate one-way properties, such as loss rate, through careful manipulation and observation of TCP behaviour.

Avoids increasing problems with ICMP-based network measurement (blocking, spoofing, rate limiting, etc).

http://www.cs.washington.edu/homes/savage/sting/

Current AARNet Measurements

MRTG

Perf: ICMP RTT measurements, ICMP packet loss measurements

WA: host/endpoint reachability

TCP HTTP file transfer measurements

Netflow data

MRTG

Uses SNMP interface statistics

Provides multi-functionality from router temperature to throughput

Visualisation package

Lacks granularity with time

Deployed at each RNO
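Throughput is typically derived from SNMP interface statistics by differencing two readings of an octet counter. The sketch below is not MRTG code: the function name is hypothetical, and it assumes a 32-bit counter that wraps at most once between polls.

```python
COUNTER32_MODULUS = 2 ** 32  # assumed 32-bit SNMP octet counter


def throughput_bps(octets_before, octets_after, interval_s,
                   modulus=COUNTER32_MODULUS):
    """Average bits per second between two octet counter readings."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    # Modular subtraction copes with a single counter wrap between polls.
    delta = (octets_after - octets_before) % modulus
    return delta * 8 / interval_s


if __name__ == "__main__":
    # Two readings 300 s apart (illustrative numbers)
    print(throughput_bps(1000, 376000, 300))
    # The counter wrapped between readings
    print(throughput_bps(2 ** 32 - 100, 200, 300))
```

Each result is an average over the polling interval, so short bursts are smoothed out; this is the lack of granularity with time noted above.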

MRTG graphs

WARNO / International traffic on June 18

WARNO / VRNO traffic on June 18

Perf Tool

Perfd – uses a BSD-based ping for RTT and packet loss calculation

Perf – web display tool of the data

Deployed at each RNO to measure all points of the mesh

Used to check the SLA with Cable and Wireless Optus

Perf – LA Cable 21 June 2000 ICMP Loss

Perf – LA Cable 21 June 2000 ICMP RTT

Perf – Optus IA321 June 2000 Packet Loss

Perf – Optus IA321 June 2000 ICMP RTT

Perf 6 June Optus international ICMP Loss

Perf 6 June Optus international ICMP RTT

Perf 6 June ACTRNO ICMP Loss

Perf 6 June ACTRNO ICMP RTT

WA

“What’s alive” is based on NOCOL

Checks reachability of hosts/endpoints

Uses ICMP echo, but could be easily extended to check on service level availability

Frequent check of all hosts

TCP based Measurements

Uses an active HTTP file transfer

Measure at host

Measure from Netflow records

Can detect retransmissions

These may occur from packet loss/out of sequence packets in either direction

Load balancing impacts

Can use contiguous IP addresses on monitoring machine to monitor per destination load balancing

Monitoring machine can determine performance on link but unable to determine which link is used.

If a link fails then traffic will divert to other links

Load Balancing – round robin

Load Balancing – per packet

Load Balancing – 14 May



Flows…

A flow is taken to be either a bidirectional or unidirectional communication between a source and destination host. The communication shares an address/port correspondence.

Flow records are generally the biggest indicator of scan/DoS attacks!
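The flow definition above can be illustrated by grouping packets on a key built from the address/port correspondence. This is a sketch with hypothetical names and made-up addresses, not the Netflow implementation.

```python
from collections import defaultdict


def flow_key(src, sport, dst, dport, proto, bidirectional=False):
    """Build a key identifying the flow a packet belongs to."""
    a, b = (src, sport), (dst, dport)
    if bidirectional and b < a:
        a, b = b, a  # order the endpoints so both directions share a key
    return (a, b, proto)


def count_packets_per_flow(packets, bidirectional=False):
    """Count packets per flow; each packet is (src, sport, dst, dport, proto)."""
    flows = defaultdict(int)
    for src, sport, dst, dport, proto in packets:
        flows[flow_key(src, sport, dst, dport, proto, bidirectional)] += 1
    return dict(flows)


if __name__ == "__main__":
    packets = [
        ("10.0.0.1", 1025, "10.0.0.2", 80, "tcp"),  # request
        ("10.0.0.2", 80, "10.0.0.1", 1025, "tcp"),  # reply
        ("10.0.0.1", 1025, "10.0.0.2", 80, "tcp"),  # request
    ]
    print(len(count_packets_per_flow(packets)), "unidirectional flows")
    print(len(count_packets_per_flow(packets, bidirectional=True)),
          "bidirectional flow")
```

A scan tends to show up as one source with a very large number of distinct keys, each carrying only a packet or two.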

Netflow Records

We keep detailed flow records:

Timestamps and durations

Source/destination addresses

Protocol types

Cumulative IP flags

ICMP control types

Netflow Records

Useful for determining metric targets, e.g. top 100 WWW hosts

Can derive useful measurements from the netflow data itself

Be wary of derived throughput – flows can take a long time.
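The caution about derived throughput can be seen with a small calculation. This is a sketch under assumed field names (octets, start and end timestamps in seconds); real flow records may use different fields and units.

```python
def derived_throughput_bps(octets, start_s, end_s):
    """Average bits per second over the life of one flow record.

    Returns None when the recorded duration is zero or negative, as
    happens for single-packet flows, since no rate can be derived.
    """
    duration = end_s - start_s
    if duration <= 0:
        return None
    return octets * 8 / duration


if __name__ == "__main__":
    # The same 1,000,000 octets, transferred briskly or with long idle gaps
    print(derived_throughput_bps(1_000_000, 0, 10))
    print(derived_throughput_bps(1_000_000, 0, 1000))
```

Both flows moved the same data, but the second looks a hundred times slower simply because it stayed open longer, so a low derived rate does not by itself indicate a slow path.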

What are the choices?

Various tools and methods are available

No one tool is good for everything

Combinations of tools, both passive and active, lead to interesting and more detailed analysis

AARNet futures…

Deployment of measurement machines

Monitoring and measuring ICMP, TCP and UDP

Monitoring QoS

Deploying one-way and round-trip metrics

To ensure the network does what it's supposed to do…
