Multiscale Models for Network Traffic Vinay Ribeiro Rolf Riedi, Matt Crouse, Rich Baraniuk Dept. of Electrical Engineering Rice University (Houston, Texas)

Multiscale Models for Network Traffic

Vinay Ribeiro

Rolf Riedi, Matt Crouse, Rich Baraniuk

Dept. of Electrical Engineering, Rice University (Houston, Texas)

Outline

• Multiscale nature of network traffic

• Wavelets

• Wavelet models for traffic

• Network inference applications

Time Scales

[Figure: time scales in discrete time, with time units 1, 2, ..., 2^n]

Multiscale Nature of Network Traffic

• Network traffic (local area networks, wide area networks, video traffic etc.) - variance decays slowly with aggregation

• i.i.d. data – variance decays faster with aggregation

[Figure: Internet bytes/time trace (LBL’93) and an i.i.d. lognormal time series, shown at time units of 6 ms, 60 ms, and 600 ms]

Fractional Gaussian Noise (fGn)

• Stationary Gaussian process

• Covariance (Hurst parameter: 0<H<1)

• Long-range dependence (LRD) if ½<H<1

• Second-order self-similarity

[Figure: variance-time plot]
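The straight-line variance-time plot of fGn can be checked numerically. The sketch below assumes the standard fGn autocovariance, r(k) = (σ²/2)(|k+1|^2H − 2|k|^2H + |k−1|^2H), since the formula on the slide did not survive extraction. The function names `fgn_autocovariance` and `aggregated_variance` are illustrative and not from the talk.

```python
import math


def fgn_autocovariance(lag, hurst, variance=1.0):
    """Standard fGn autocovariance at integer lag (assumed form, see lead-in)."""
    k = abs(lag)
    two_h = 2.0 * hurst
    return 0.5 * variance * ((k + 1) ** two_h - 2 * k ** two_h + abs(k - 1) ** two_h)


def aggregated_variance(block, hurst, variance=1.0):
    """Variance of the mean of `block` consecutive fGn samples.

    Computed directly from the autocovariance, so it can be compared
    with the closed form variance * block**(2H - 2).
    """
    total = sum(
        (block - abs(k)) * fgn_autocovariance(k, hurst, variance)
        for k in range(-block + 1, block)
    )
    return total / block ** 2


if __name__ == "__main__":
    for h in (0.5, 0.8):
        # Slope of log2(variance) against log2(block size) is 2H - 2.
        v1, v2 = aggregated_variance(4, h), aggregated_variance(8, h)
        print(h, math.log2(v2 / v1))
```

With H = 0.5 the samples are uncorrelated and the variance of the block mean falls as 1/m; with ½ < H < 1 it falls more slowly, which is the slow decay described for network traffic.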

fGn is a 1/f-Process

• Power spectral density decays in a 1/f fashion

– Low-frequency components correspond to long-term correlations

[Figure: power vs. frequency]

Towards Generalizations of fGn

• Variance decay of traffic is not always a straight line, as it is for fGn

• Goal: develop LRD models

– Generalize fGn

– Parsimonious (few parameters)

– Fast synthesis for simulations

[Figure: variance-time plot of Auckland Univ. traffic; horizontal axis: time scale]

Wavelets

• Consider only orthonormal wavelet bases in L2(R)

• Prototype functions

– approximation function

– wavelet function

• Basis formed by scaled and shifted versions of prototype functions

• Approximation and wavelet coefficients

The Haar Wavelet Basis

Computing the Haar Transform

• Wavelet Transform: fine to coarse (bottom to top)

• Inverse Wavelet Transform: coarse to fine (top to bottom)
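The fine-to-coarse and coarse-to-fine passes can be sketched as follows. This is a minimal illustration using the orthonormal Haar convention (sums and differences scaled by 1/√2) and assuming the input length is a power of two. The function names are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """Fine to coarse: return (coarsest approximation, details by scale).

    `details[0]` holds the coarsest-scale wavelet coefficients and
    `details[-1]` the finest. Input length must be a power of two.
    """
    approx = [float(v) for v in signal]
    details = []
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.append([(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    details.reverse()
    return approx[0], details


def haar_inverse(root, details):
    """Coarse to fine: rebuild the signal from the root and the details."""
    approx = [root]
    for scale in details:
        finer = []
        for u, w in zip(approx, scale):
            finer.append((u + w) / SQRT2)
            finer.append((u - w) / SQRT2)
        approx = finer
    return approx


if __name__ == "__main__":
    x = [4.0, 2.0, 5.0, 5.0, 1.0, 0.0, 3.0, 7.0]
    root, details = haar_forward(x)
    print(haar_inverse(root, details))
```

Each pass touches every coefficient once per scale, so both directions cost O(N) operations for a signal of length N.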

Wavelets and Filtering

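• Wavelet coefficients at any scale j are the output of a bandpass filter

• Coarse scales correspond to the low-frequency band

• Fine scales correspond to the high-frequency band

• Widths of the bandpass filters increase exponentially from coarse to fine scales

[Figure: bandpass filters along the frequency axis]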

Wavelets “Decorrelate” 1/f Processes

• Analysis of 1/f data

– Sample means converge faster in wavelet domain

– Estimate H in wavelet domain

• Synthesis of 1/f data

– Exploit weak correlation in wavelet domain

– Generate independent wavelet coefficients with appropriate variance

– Invert wavelet transform

[Figure: power vs. frequency. Time domain: 1/f spectrum, strong correlation. Wavelet domain: not 1/f, weak correlation.]

Haar Wavelet “Additive” 1/f Model

• Choose Wj,k i.i.d. within scale j

• Set var(Wj,k) to obtain required decay of var(Vj,k)

• Fast O(N) synthesis

• log2(N) parameters

• Asymptotically Gaussian
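A minimal sketch of the coarse-to-fine synthesis follows. It assumes the orthonormal Haar recursion, in which each parent V is split into children (V + W)/√2 and (V − W)/√2, and it takes the per-scale standard deviations of W as an input rather than deriving them from a target variance decay. All names and parameter values are illustrative.

```python
import math
import random

SQRT2 = math.sqrt(2.0)


def synthesize_additive(n_scales, detail_std, root_mean, root_std, rng):
    """Generate 2**n_scales samples, coarse to fine.

    detail_std[j] is the standard deviation of the i.i.d. Gaussian
    wavelet coefficients at scale j (one parameter per scale).
    """
    approx = [rng.gauss(root_mean, root_std)]
    for j in range(n_scales):
        finer = []
        for v in approx:
            w = rng.gauss(0.0, detail_std[j])  # independent innovation
            finer.append((v + w) / SQRT2)
            finer.append((v - w) / SQRT2)
        approx = finer
    return approx


def approx_variances(root_var, detail_std):
    """Variance of the approximation coefficients at each scale.

    Follows from var((V +/- W)/sqrt(2)) = (var(V) + var(W)) / 2 when
    V and W are independent.
    """
    variances = [root_var]
    for s in detail_std:
        variances.append((variances[-1] + s * s) / 2.0)
    return variances


if __name__ == "__main__":
    trace = synthesize_additive(10, [2.0 ** (-0.3 * j) for j in range(10)],
                                root_mean=100.0, root_std=1.0,
                                rng=random.Random(0))
    print(len(trace), min(trace))
```

Each coefficient is generated once, so the cost is O(N) for N output samples, and the model has one variance parameter per scale, that is, log2(N) parameters.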

Sample Realization

• Realization is Gaussian and can take negative values

• Network traffic may be non-Gaussian and is always positive

Multiplicative Cascade Model

• Replace additive innovations by multiplicative innovations

• Aj,k ∈ [0,1], for example β-distributed

• Choose var(Aj,k) to get appropriate decay of var(Vj,k)

• Fast O(N) synthesis

• log2(N) parameters

• Positive data

• Asymptotically lognormal at fine time scales
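The cascade can be sketched in a few lines. This sketch assumes a mass-conserving convention in which each parent V is split into children A·V and (1 − A)·V, with A drawn from a symmetric beta distribution on [0,1]. The normalization used in the talk may differ by a constant factor per scale. The shape parameters and names are illustrative.

```python
import math
import random


def synthesize_cascade(n_scales, beta_shape, root, rng):
    """Generate 2**n_scales non-negative samples, coarse to fine.

    beta_shape[j] is the shape parameter p of the symmetric Beta(p, p)
    multiplier at scale j. A larger p concentrates A near 1/2 and so
    gives smaller var(A).
    """
    approx = [root]
    for j in range(n_scales):
        finer = []
        for v in approx:
            a = rng.betavariate(beta_shape[j], beta_shape[j])
            finer.append(a * v)          # left child
            finer.append((1.0 - a) * v)  # right child
        approx = finer
    return approx


if __name__ == "__main__":
    trace = synthesize_cascade(10, [2.0 + j for j in range(10)],
                               root=1.0e6, rng=random.Random(0))
    print(len(trace), min(trace))
```

Because every multiplier lies in [0,1] and the root is positive, all samples are non-negative by construction, in contrast to the additive model.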

Sample Realization

• Data is positive

• Same var(Vj,k) as additive model

Additive vs. Multiplicative Models

• Multiplicative model marginals closer to real data than additive model

[Figure: marginals of the additive model, the multiplicative model, and Internet data (Auckland Univ.) at time units of 6 ms, 12 ms, and 24 ms]

Queuing Experiment

• Additive and multiplicative models have the same var(Vj,k)

• Multiplicative model outperforms additive model

• High-order moments can influence queuing (open loop)

[Figure: queuing results for real traffic, the multiplicative model, and the additive model; axis in kilobytes]
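An open-loop queuing experiment of this kind can be sketched with Lindley's recursion for a single queue with constant service per time unit. The slides do not give the link capacity or buffer settings, so the numbers below are placeholders and the function name is illustrative.

```python
def queue_sizes(arrivals, capacity):
    """Queue content after each time unit for an infinite-buffer queue.

    arrivals: bytes arriving in each time unit (for example a synthesized
    trace); capacity: bytes served per time unit.
    """
    sizes = []
    backlog = 0
    for a in arrivals:
        # Lindley's recursion: carry over unserved work, never below zero.
        backlog = max(backlog + a - capacity, 0)
        sizes.append(backlog)
    return sizes


if __name__ == "__main__":
    print(queue_sizes([3, 0, 5, 1], capacity=2))
```

Feeding the real trace and both synthetic traces through the same function and comparing the empirical distributions of the returned queue sizes gives one way to reproduce the comparison.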

Shortcomings of Multiscale Models

• Open-loop

– Do not capture closed-loop nature of network protocols and user behavior

• Physical intuition

– Cascades model “redistribution” of traffic (multiplexing at queues, TCP)?

• Stationarity: first-order stationary but not second-order stationary

– Time-averaged correlation structure is close to fGn

– Queuing of additive model close to stationary Gaussian data (simulations and theory)

Selected References

• Self-similar traffic and networks (up to 1996)

– W. Willinger, M. Taqqu, A. Erramilli, “A bibliographical guide to self-similar traffic and performance modeling for modern high-speed networks”, Stochastic Networks: Theory and Applications, vol. 4, Oxford Univ. Press, 1996.

• Wavelets

– S. Burrus and R. Gopinath, “Introduction to Wavelets and Wavelet Transforms”, Prentice Hall, 1998.

– I. Daubechies, “Ten lectures on wavelets”, SIAM, New York, 1992.

• Additive model

– S. Ma and C. Ji, “Modeling heterogeneous network traffic in wavelet domain”, IEEE Trans. Networking, vol. 9, no. 5, Oct 2001.

• Multiplicative model

– R. Riedi, M. Crouse, V. Ribeiro, R. Baraniuk, “A multifractal wavelet model with application to network traffic”, IEEE Trans. Info. Theory, vol. 45, no. 3, April 1999.

– A. Feldmann, A. C. Gilbert, W. Willinger, “Data networks as cascades: investigating the multifractal nature of Internet WAN traffic”, ACM SIGCOMM, pp. 42-55, 1998.

– P. Mannersalo and I. Norros, “Multifractal analysis of real ATM traffic: A first look”, Technical report, VTT Information Technology, 1997, COST257TD(97)19.

Network Inference Applications

Why Network Inference?

Each dot is one Internet Service Provider

• Different parts of Internet owned by different organizations

• Information sharing difficult

– Commercial interests/trade secrets

– Privacy

– Sheer volume of “network state”

Edge-based Probing

• Inject probe packets into network

• Infer internal properties from packet delay/loss

• Current tools infer

– Topology

– Link bandwidths

– End-to-end available bandwidth

– Congestion locations

Cross-Traffic Inference

• Simple network path – single queue

• Spread of packet pair gives cross-traffic over small time interval
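One common way to turn the packet-pair spread into a cross-traffic estimate is sketched below. It assumes a single FIFO queue of known capacity that stays busy between the two probe packets, so everything the link serves in the gap is either the second probe or cross-traffic. This is a stated assumption, not necessarily the estimator used in the talk, and the names and units are illustrative.

```python
def cross_traffic_bytes(link_capacity, output_spacing, probe_size):
    """Estimate cross-traffic bytes that arrived between the two probes.

    link_capacity: bytes per second served by the queue
    output_spacing: seconds between the probes when they leave the queue
    probe_size: size of the second probe packet in bytes
    """
    served = link_capacity * output_spacing  # bytes served during the gap
    return max(served - probe_size, 0.0)


if __name__ == "__main__":
    # 10 Mb/s link (1.25e6 bytes/s), 2 ms spread, 1000-byte probes.
    print(cross_traffic_bytes(1.25e6, 0.002, 1000))
```

If the queue empties between the probes, the busy-queue assumption fails and the spread no longer reflects cross-traffic, which is one reason the estimate only covers a small time interval.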

Inferring cross-traffic over large time interval [0,T]

• Probing uncertainty principle

– Dense sampling: accurate inference, but affects cross-traffic

– Sparse sampling: less accurate inference, less influence on cross-traffic

Problem Statement

Given N probe pairs, how must we space them over time interval [0,T] to optimally estimate the total cross-traffic in [0,T]?

• Answer depends on

– cross-traffic

– optimality criterion

Multiscale Cross-Traffic Model

• Choose N leaf nodes to give best linear estimate (in terms of mean squared error) of root node

• Take a guess!

– Bunch probes together

– Exponentially space probe pairs

– Uniformly space probes over interval

– Your favorite solution

[Figure: tree with root node and leaf nodes]

Sensor Networks Application

• Each sensor samples local value of process (pollution, temperature etc.)

• Sensors cost money!

• Find best placement for N sensors to measure global average

[Figure labels: global average; possible sensor location]

Independent Innovations Trees

• Each node is a linear combination of parent and an independent random innovation

• Optimal solution obtained by a water-filling procedure

• X: arbitrary set of leaf nodes, with size |X|

• Leaves of X on the left and leaves of X on the right are counted separately

• Linear minimum mean squared error of estimating the root using X

[Equations not recoverable from the source]

Water-Filling

[Figure: $f_L(l)$ and $f_R(l)$ for $l = 0, \dots, 4$, with water-filling steps labeled by N]

• Repeat at next lower scale with N replaced by $l^*_N$ (left) and $(N - l^*_N)$ (right)

• Result: if innovations are identically distributed within each scale, then distribute leaves uniformly, $l^*_N = \lfloor N/2 \rfloor$
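The split-and-recurse procedure can be sketched as a greedy allocation. The exact form of the left and right functions did not survive extraction, so this sketch assumes they are concave gain functions of the number of leaves placed on each side, and uses `math.sqrt` purely as a stand-in. The names `water_fill` and `allocate` are illustrative.

```python
import math


def water_fill(n, gain_left, gain_right):
    """Split n leaves between two subtrees; return the number on the left.

    Leaves are added one at a time to the side with the larger marginal
    gain. Ties go to the right, which yields floor(n / 2) on the left
    when both sides share the same strictly concave gain.
    """
    left = right = 0
    for _ in range(n):
        d_left = gain_left(left + 1) - gain_left(left)
        d_right = gain_right(right + 1) - gain_right(right)
        if d_left > d_right:
            left += 1
        else:
            right += 1
    return left


def allocate(n, depth, gain=math.sqrt):
    """Repeat the split at each lower scale; return the count per leaf."""
    if depth == 0:
        return [n]
    left = water_fill(n, gain, gain)
    return allocate(left, depth - 1, gain) + allocate(n - left, depth - 1, gain)


if __name__ == "__main__":
    print(allocate(4, 3))
```

With identical gains at every scale the four leaves land at evenly spaced positions among the eight candidates, matching the uniform result stated above.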

Covariance Trees

• Distance: two leaf nodes have distance j if their lowest common ancestor is at scale j

• Covariance tree: covariance between leaf nodes with distance j is $c_j$ (only a function of distance); covariance between root and any leaf node is constant

• Positively correlated tree: $c_j > c_{j+1}$

• Negatively correlated tree: $c_j < c_{j+1}$

Covariance Tree Result

• Result: For a positively correlated tree choosing leaf nodes uniformly in the tree is optimal. However, for negatively correlated trees this same uniform choice is the worst case!

• Optimality proof: Simply construct an independent innovations tree with similar correlation structure

• Worst case proof:

– The uniform choice maximizes the sum of the elements of $S_X$

– Eigen-analysis shows that this implies the uniform choice minimizes the sum of the elements of $S_X^{-1}$
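The result can be checked numerically on a small tree. The sketch below makes several assumptions that are not stated on the slides: a binary tree with 8 leaves, distance measured as the number of levels from the leaves up to the lowest common ancestor, a root equal to the average of all leaves, and $S_X$ taken to be the covariance matrix of the selected leaves. The covariance values are arbitrary examples chosen to be positive definite.

```python
def leaf_distance(i, j):
    """Levels from the leaves up to the lowest common ancestor (0 if i == j)."""
    return (i ^ j).bit_length()


def solve(matrix, rhs):
    """Solve matrix @ y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * y[c] for c in range(r + 1, n))
        y[r] = (aug[r][n] - tail) / aug[r][r]
    return y


def root_mse(selected, cov_by_distance, n_leaves):
    """Linear MMSE of estimating the root from the selected leaves.

    Assumes the root is the average of all leaves, so cov(root, leaf)
    is the same constant rho for every leaf and var(root) equals rho.
    """
    def cov(i, j):
        return cov_by_distance[leaf_distance(i, j)]

    rho = sum(cov(0, j) for j in range(n_leaves)) / n_leaves
    s_x = [[cov(i, j) for j in selected] for i in selected]
    inv_sum = sum(solve(s_x, [1.0] * len(selected)))  # sum of S_X^{-1}
    return rho - rho * rho * inv_sum


if __name__ == "__main__":
    uniform, bunched = [0, 2, 4, 6], [0, 1, 2, 3]
    positive = [1.0, 0.6, 0.4, 0.2]  # covariance falls with distance
    negative = [1.0, 0.1, 0.2, 0.3]  # covariance grows with distance
    print(root_mse(uniform, positive, 8), root_mse(bunched, positive, 8))
    print(root_mse(uniform, negative, 8), root_mse(bunched, negative, 8))
```

Because the root-leaf covariance is constant, the error depends on the selection only through the sum of the elements of $S_X^{-1}$: a larger sum means a smaller error, which is why the worst-case argument focuses on that sum.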

Future Directions

• Sampling

– More general tree structures

– Non-linear estimates

– Non-tree stochastic processes

• Traffic estimation

– More complex networks

• Sensor networks

– Jointly optimize with other constraints like power transmission

References

• Estimation on multiscale trees

– A. Willsky, “Multiresolution Markov models for signal and image processing”, Proc. of the IEEE 90(8), August 2002.

• Optimal sampling on trees

– V. Ribeiro, R. Riedi, and R. Baraniuk, “Optimal sampling strategies for multiscale models and their application to computer networks”, preprint.
