Sting: a TCP-based Network Measurement Tool
Stefan Savage
Jianxuan Xu
Measurement & Analysis
The Internet is extremely hard to measure:
– Very heterogeneous
– Very large
– Heisenberg effects
The Heisenberg effect describes a system in which observing or measuring an event changes the event itself.
Still… lots of efforts to measure and understand traffic dynamics, routing, user characteristics, etc…
Understanding wide-area network characteristics is critical for evaluating the performance of Internet applications.
Measurement & Analysis
ICMP-based tools (e.g., ping, traceroute)
– Can't measure one-way loss
Measurement infrastructures (e.g., NIMI)
– Require cooperation from remote endpoints
Features
Measures one-way packet loss rates
TCP-based measurement traffic (not filtered)
Relies only on standard TCP behavior
Target only needs to run a TCP service (e.g., a web server); no remote cooperation required
Basic approach
Send selected TCP packets to remote host
Analyze TCP behavior to deduce which packets were lost in each direction
Deducing losses in a TCP transfer
What we know
How many data packets we sent
How many acknowledgements we received
What we need to know
How many data packets were received?
Remote host’s TCP MUST know
How many acknowledgements were sent?
Easy, if one ACK is sent for each data packet (ACK parity)
How TCP reveals packet loss
Data packets ordered by seq#
ACK packets specify next seq# expected
Basic loss deduction algorithm
Forward loss
Data seeding:
– Source sends in-sequence TCP data packets to target, each of which will be a loss sample
Hole-filling:
– Send a TCP data packet with sequence number one greater than the last seeding packet
– If the target ACKs this new packet, no loss occurred
– Otherwise, each ACK reveals a missing packet (a "hole")
– Hole-filling must be reliable: retransmissions are repeated until every hole is acknowledged
Data Seeding phase
    for i := 1 to n
        send packet w/ seq# i
        dataSent++
    wait for a long time

    for each ack received
        acksReceived++

Hole Filling phase
    lastAck := 0
    while lastAck = 0
        send packet w/ seq# n+1
    while lastAck < n+1
        dataLost++
        retransPkt := lastAck
        while lastAck = retransPkt
            send packet w/ seq# retransPkt
    dataReceived := dataSent - dataLost
    acksSent := dataReceived

    for each ack received w/ ack# j
        lastAck := MAX(lastAck, j)
Example
Basic loss deduction algorithm
Reverse loss
Data seeding:
– Skip first sequence number, ensuring out-of-sequence data (Fast Retransmit)
– Receiver will immediately acknowledge each data packet received
– Measure lost ACKs
Hole-filling:
– Transmit the first sequence number
– Continue as before
Guaranteeing ACK parity
How do we know one ACK is sent for each data packet received?
Exploit TCP’s fast retransmit algorithm
TCP must send an immediate ACK for each out-of-order packet it receives
Send all data packets out-of-order
Skip first sequence number
Don’t count first “hole” in hole filling phase
Sending Large Bursts
– Large bursts can overflow the receiver's buffer
– Mitigate by overlapping sequence numbers
Delaying connection termination
Some Web servers/firewalls terminate connections abruptly by sending RST
Solutions:
– Format data packets as a valid HTTP request
– Set the advertised receiver window to 0 bytes
Sting implementation details
Raw sockets to send hand-built TCP segments
Packet filter (libpcap) to get responses
Currently runs on Tru64 and FreeBSD
Last-generation user interface
# sting -c 100 -f poisson -m 0.500 -p 80 www.audiofind.com
Source = 128.95.2.93
Target = 207.138.37.3:80
dataSent = 100
dataReceived = 98
acksSent = 98
acksReceived = 97
Forward drop rate = 0.020000
Reverse drop rate = 0.010204
Forward Loss Results
Reverse Loss Results
"Popular" Web Servers
Random Web Servers
Results
Loss rates increase during business hours and decrease afterwards
Forward and reverse loss rates vary independently
On average, for popular web servers, the reverse loss rate is more than 10 times the forward loss rate
Conclusions
TCP protocol features can be leveraged for non-standard purposes
Packet loss is highly asymmetric
Ongoing work:
Using TCP to estimate one-way queuing delays, bottleneck bandwidths, propagation delay and server load
Useful or Useless
Purpose of network measurement:
– Diagnose current problems
– Design future services
Real-time data needed for network control
Data sampling:
– Event-driven
– Fixed interval
Research Goal
Implement a new TCP congestion control algorithm using fuzzy logic control
Develop, test and debug it in Linux
Performance Evaluation
Traditional protocol hacking
Directly modify the kernel source
Migrate protocol stack and related stuff to user space
Simulate the algorithm with NS-2
Kernel Hacking
Insert and modify the algorithm in kernel source directly
Example:
– Vegas, Westwood+ and BIC implementations in the Linux kernel before version 2.6.13
Kernel Hacking
Pros:
– Welcome to the real world
– Less overhead
Cons:
– Not easy to develop, trace, debug and maintain
– Incompatible across different kernel versions
User space migration
Move the whole protocol stack and related machinery to user space
Gives total control over protocol state and variables
Example–Sting
User space migration
Pros:
– High flexibility in protocol hacking
– Can use general debugging tools, e.g., gdb
Cons:
– A large and thorny project to migrate the protocol stack to user space
– Incompatible across different kernel versions
– Large overhead
Simulation
The algorithm is implemented on a virtual testbed
Virtual experiments can be run easily
NS-2 is the usual simulator
E.g., research on FAST TCP and HighSpeed TCP
Simulation
Pros:
– Quick implementation of the algorithm
– Low experimental cost
– Easy data collection and statistics
Cons:
– Results are too idealistic
– Further development needed for a final product
Traditional methods are not suitable
Source code modification and user-space migration require a good understanding of the kernel architecture
NS-2 is not as realistic as testing on top of PlanetLab
All of them are kernel-version dependent
My new approach
Combine the pluggable congestion control framework with kernel hacking
Implement the new control algorithm within a single kernel module
Pluggable congestion control module
Starting from version 2.6.13, a new way of hooking in TCP congestion control was introduced
New algorithms can be written as modules and inserted into the kernel at run time, just like ordinary device drivers
BIC, CUBIC, HighSpeed, H-TCP, Hybla, Scalable, Vegas and Westwood+ are already implemented as modules
Pluggable congestion control module
A congestion control mechanism is registered through functions in tcp_cong.c
The functions used by the congestion control mechanism are registered by passing a tcp_congestion_ops struct to tcp_register_congestion_control()
At a minimum, name, ssthresh, cong_avoid and min_cwnd must be valid
Pluggable congestion control module
Which congestion control mechanism is used is determined by the sysctl net.ipv4.tcp_congestion_control
The default congestion control is the last one registered (LIFO)
NewReno is built in and always available
A particular default can be set via sysctl
Pluggable congestion control module
The tcp_congestion_ops struct provides the following function entry points:
– init
– release
– ssthresh
– min_cwnd
– cong_avoid
– rtt_sample
– set_state
– cwnd_event
– undo_cwnd
– pkts_acked
– get_info
Pluggable congestion control module
All algorithm-related code is packed into a single module file
A standardized framework can be followed; the code required to implement an algorithm is greatly reduced (e.g., NewReno uses 77 lines where BIC uses 335)
The module stays compatible unless the framework itself changes
Kernel Hacking Still Needed
Raw, accurate, real-time data is needed by the control algorithm:
– Packet loss rate
– Bandwidth estimation
– RTT
– (e.g., TCP Vegas uses RTT; Westwood uses bandwidth estimation)
PLR Calculation in Linux Kernel
tcp_input.c is the core of the TCP protocol implementation; it:
– handles incoming packets and ACKs
– identifies duplicate ACKs and packet losses
– adjusts the congestion window accordingly
PLR Calculation in Linux Kernel
Two types of events are caused by congestion: retransmission timeout (RTO) and packet loss.
The timeout event is checked by tcp_head_timedout()
The packet-loss event is checked by tcp_mark_head_lost()
PLR Calculation in Linux Kernel
TCP's congestion avoidance (CA) phase is decomposed into five states (defined in the ca_state field of the tcp_opt data structure):
– TCP_CA_Open
– TCP_CA_Disorder
– TCP_CA_CWR
– TCP_CA_Recovery
– TCP_CA_Loss
PLR Calculation in Linux Kernel
The state machine is implemented in tcp_fastretrans_alert(), which processes "dubious" ACK events
PLR Calculation in Linux Kernel
tcp_update_scoreboard():
– Marks as lost all packets that were not SACKed (up to the highest SACKed sequence number). Packets that have waited for their ACKs for an interval equivalent to the retransmission timeout are also marked as lost. The accounting for lost, SACKed and left-out packets is also done in this function.
PLR Calculation in Linux Kernel
left_out = sacked_out + lost_out
sacked_out: packets that arrived at the receiver out of order and hence were not cumulatively ACKed. With SACK this is simply the amount of SACKed data; even without SACK, a fairly reliable estimate can be made by counting duplicate ACKs.
lost_out: packets lost by the network. TCP has no explicit "loss notification" feedback from the network (for now), so this number can only be guessed. In fact, it is the heuristics used to predict loss that distinguish the different algorithms.