Multiscale Models for Network Traffic
Vinay Ribeiro
Rolf Riedi, Matt Crouse, Rich Baraniuk
Dept. of Electrical Engineering, Rice University (Houston, Texas)
Outline
• Multiscale nature of network traffic
• Wavelets
• Wavelet models for traffic
• Network inference applications
Time Scales
[Figure: a discrete-time series viewed at time units 1, 2, ..., 2^n]
Multiscale Nature of Network Traffic
• Network traffic (local area networks, wide area networks, video traffic, etc.): variance decays slowly with aggregation
• i.i.d. data: variance decays faster with aggregation
[Figure: bytes-per-time-unit traces at time units of 6 ms, 60 ms, and 600 ms, for an Internet trace (LBL’93) and an i.i.d. lognormal time series]
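The contrast between slow and fast variance decay can be checked numerically. The following is a minimal sketch, not taken from the slides: the helper names `aggregate` and `variance_time` are hypothetical, and the i.i.d. lognormal series is generated here only as a stand-in for the i.i.d. example. For i.i.d. data the variance of block means should fall roughly like 1/m with block length m; a long-range-dependent trace would decay more slowly.

```python
import random
import statistics

def aggregate(series, m):
    """Average consecutive non-overlapping blocks of length m (hypothetical helper)."""
    n_blocks = len(series) // m
    return [sum(series[i * m:(i + 1) * m]) / m for i in range(n_blocks)]

def variance_time(series, scales):
    """Return {m: variance of the series aggregated at block length m}."""
    return {m: statistics.pvariance(aggregate(series, m)) for m in scales}

# Assumed test input: an i.i.d. lognormal series with a fixed seed.
rng = random.Random(0)
iid = [rng.lognormvariate(0.0, 1.0) for _ in range(2 ** 14)]
vt = variance_time(iid, [1, 2, 4, 8, 16, 32])

for m, v in vt.items():
    print(f"block length {m:2d}: variance {v:.4f}")
```

Plotting log(variance) against log(m) gives the variance-time plot; for i.i.d. data the slope is close to -1.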
Fractional Gaussian Noise (fGn)
• Stationary Gaussian process
• Covariance determined by the Hurst parameter H, 0<H<1
• Long-range dependence (LRD) if ½<H<1
• Second-order self-similarity
[Figure: variance-time plot]
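For reference, the standard textbook expressions for fGn are collected below. The symbols are assumptions, since the slide's own notation did not survive: r(k) is the autocovariance at lag k, sigma^2 the variance, and X^(m) the mean of the process over non-overlapping blocks of length m.

```latex
% Autocovariance of fractional Gaussian noise with Hurst parameter H
r(k) = \frac{\sigma^2}{2}\left( |k+1|^{2H} - 2|k|^{2H} + |k-1|^{2H} \right)

% Tail behaviour for H \neq 1/2: non-summable (LRD) when 1/2 < H < 1
r(k) \sim \sigma^2 H (2H-1)\, k^{2H-2} \quad \text{as } k \to \infty

% Second-order self-similarity: variance of the aggregated process
\operatorname{var}\left( X^{(m)} \right) = \sigma^2 m^{2H-2}
```

On a log-log variance-time plot the last line is a straight line of slope 2H-2, which equals -1 only in the uncorrelated case H = 1/2.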
fGn is a 1/f-Process
• Power spectral density decays in a 1/f fashion
– Low-frequency components correspond to long-term correlations
[Figure: power versus frequency]
Towards Generalizations of fGn
• Variance decay of traffic not always straight line like fGn
• Goal: develop LRD models
– Generalize fGn
– Parsimonious (few parameters)
– Fast synthesis for simulations
[Figure: variance-time plot of Auckland Univ. traffic, variance versus time scale]
Wavelets
• Consider only orthonormal wavelet bases in L2(R)
• Prototype functions: an approximation (scaling) function and a wavelet function
• Basis formed by scaled and shifted versions of prototype functions
• Approximation and wavelet coefficients
The Haar Wavelet Basis
Computing the Haar Transform
• Wavelet Transform: fine to coarse (bottom to top)
• Inverse Wavelet Transform: coarse to fine (top to bottom)
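The fine-to-coarse and coarse-to-fine passes can be sketched as follows. This is an illustrative implementation of the orthonormal Haar transform under assumed conventions: the function names `haar_forward` and `haar_inverse` are hypothetical, the input length must be a power of two, and detail coefficients are returned coarsest scale first.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(x):
    """Fine to coarse: return (coarsest approximation coefficient,
    list of detail-coefficient lists ordered coarse to fine)."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = list(x)
    details = []
    while len(approx) > 1:
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx), 2)]
        details.append([(a - b) / SQRT2 for a, b in pairs])  # wavelet coefficients
        approx = [(a + b) / SQRT2 for a, b in pairs]          # approximation coefficients
    details.reverse()  # computed fine to coarse, stored coarse to fine
    return approx[0], details

def haar_inverse(root, details):
    """Coarse to fine: rebuild the signal from the root and the details."""
    approx = [root]
    for scale in details:
        finer = []
        for a, w in zip(approx, scale):
            finer.append((a + w) / SQRT2)
            finer.append((a - w) / SQRT2)
        approx = finer
    return approx

root, details = haar_forward([4.0, 2.0, 5.0, 5.0])
print(root, details)
print(haar_inverse(root, details))
```

Each pass touches every coefficient once per scale with geometrically shrinking length, so the total cost is O(N).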
Wavelets and Filtering
• Wavelet coefficients at any scale j are the output of a bandpass filter
• Coarse scales: low-frequency band
• Fine scales: high-frequency band
• Width of the bandpass filters increases exponentially
[Figure: bandpass filter responses versus frequency]
Wavelets “Decorrelate” 1/f Processes
• Analysis of 1/f data
– Sample means converge faster in wavelet domain
– Estimate H in wavelet domain
• Synthesis of 1/f data
– Exploit weak correlation in wavelet domain
– Generate independent wavelet coefficients with appropriate variance
– Invert wavelet transform
[Figure: power versus frequency. Time domain: 1/f spectrum, strong correlation. Wavelet domain: not 1/f, weak correlation]
Haar Wavelet “Additive” 1/f Model
• Choose Wj,k i.i.d. within scale j
• Set var(Wj,k) to obtain required decay of var(Vj,k)
• Fast O(N) synthesis
• log2(N) parameters
• Asymptotically Gaussian
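A minimal sketch of the coarse-to-fine synthesis follows. It assumes the orthonormal Haar recursion, in which each node V splits into (V + W)/sqrt(2) and (V - W)/sqrt(2), and draws W as Gaussian, i.i.d. within each scale. The function name `synthesize_additive` and the example variances are hypothetical; in practice the per-scale variances would be chosen to match the target decay of var(Vj,k).

```python
import math
import random

def synthesize_additive(root, wavelet_vars, rng):
    """Generate 2**len(wavelet_vars) samples from a root scaling coefficient.

    wavelet_vars[j] is the variance of the wavelet coefficients at scale j,
    ordered coarse to fine (an assumed convention).
    """
    sqrt2 = math.sqrt(2.0)
    approx = [root]
    for var_j in wavelet_vars:
        sd = math.sqrt(var_j)
        finer = []
        for v in approx:
            w = rng.gauss(0.0, sd)            # additive innovation
            finer.append((v + w) / sqrt2)
            finer.append((v - w) / sqrt2)
        approx = finer
    return approx

rng = random.Random(42)
# Hypothetical variances: one parameter per scale, log2(N) in total.
variances = [8.0, 4.0, 2.0, 1.0]
trace = synthesize_additive(0.0, variances, rng)
print(len(trace), min(trace), max(trace))
```

With a zero root the samples sum to zero, so some of them are necessarily negative, which is the limitation of the additive model noted on the next slide.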
Sample Realization
• Realization is Gaussian and can take negative values
• Network traffic may be non-Gaussian and is always positive
Multiplicative Cascade Model
• Replace additive innovations by multiplicative innovations
• Aj,k ∈ [0,1], for example beta-distributed
• Choose var(Aj,k) to get appropriate decay of var(Vj,k)
• Fast O(N) synthesis
• log2(N) parameters
• Positive data
• Asymptotically lognormal at fine time scales
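A sketch of the multiplicative construction is given below, under stated assumptions rather than as the authors' exact recursion. Each parent value v is split into A*v and (1 - A)*v with A drawn from a symmetric Beta(p, p) distribution, so the total is conserved and every sample stays non-negative. The normalization (block sums instead of orthonormal Haar coefficients), the function name `synthesize_cascade`, and the shape parameters are all assumed for illustration.

```python
import random

def synthesize_cascade(total, shape_params, rng):
    """Split `total` down a binary tree with Beta(p, p) multipliers.

    shape_params[j] is the Beta shape p used at scale j, coarse to fine.
    For Beta(p, p), var(A) = 1 / (4 * (2 * p + 1)), so a larger p means
    a smaller multiplier variance at that scale.
    """
    values = [total]
    for p in shape_params:
        finer = []
        for v in values:
            a = rng.betavariate(p, p)          # multiplier in [0, 1]
            finer.append(a * v)
            finer.append((1.0 - a) * v)
        values = finer
    return values

rng = random.Random(1)
trace = synthesize_cascade(1000.0, [2.0, 3.0, 4.0, 5.0], rng)
print(len(trace), min(trace), sum(trace))
```

As with the additive model, there is one parameter per scale and one random draw per tree node, which gives O(N) synthesis.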
Sample Realization
• Data is positive
• Same var(Vj,k) as additive model
Additive vs. Multiplicative Models
• Multiplicative model marginals closer to real data than additive model
[Figure: additive model, multiplicative model, and Internet data (Auckland Univ.) compared at time units of 6 ms, 12 ms, and 24 ms]
Queuing Experiment
• Additive and multiplicative models have the same var(Vj,k)
• Multiplicative model outperforms additive model
• High-order moments can influence queuing (open loop)
[Figure: queuing behavior of real traffic, the multiplicative model, and the additive model, with queue size in kilobytes]
Shortcomings of Multiscale Models
• Open-loop
– Do not capture closed-loop nature of network protocols and user behavior
• Physical intuition
– Cascades model “redistribution” of traffic (multiplexing at queues, TCP)?
• Stationarity: first-order stationary but not second-order stationary
– Time-averaged correlation structure is close to fGn
– Queuing of additive model close to stationary Gaussian data (simulations and theory)
Selected References
• Self-similar traffic and networks (up to 1996)
– W. Willinger, M. Taqqu, A. Erramilli, “A bibliographical guide to self-similar traffic and performance modeling for modern high-speed networks”, Stochastic Networks: Theory and Applications, vol. 4, Oxford Univ. Press, 1996.
• Wavelets
– S. Burrus and R. Gopinath, “Introduction to Wavelets and Wavelet Transforms”, Prentice Hall, 1998.
– I. Daubechies, “Ten lectures on wavelets”, SIAM, Philadelphia, 1992.
• Additive model
– S. Ma and C. Ji, “Modeling heterogeneous network traffic in wavelet domain”, IEEE Trans. Networking, vol. 9, no. 5, Oct 2001.
• Multiplicative model
– R. Riedi, M. Crouse, V. Ribeiro, R. Baraniuk, “A multifractal wavelet model with application to network traffic”, IEEE Trans. Info. Theory, vol. 45, no. 3, April 1999.
– A. Feldmann, A. C. Gilbert, W. Willinger, “Data networks as cascades: investigating the multifractal nature of Internet WAN traffic”, ACM SIGCOMM, pp. 42-55, 1998.
– P. Mannersalo and I. Norros, “Multifractal analysis of real ATM traffic: A first look”, Technical report, VTT Information Technology, 1997, COST257TD(97)19.
Network Inference Applications
Why Network Inference?
Each dot is one Internet Service Provider
• Different parts of Internet owned by different organizations
• Information sharing difficult
– Commercial interests/trade secrets
– Privacy
– Sheer volume of “network state”
Edge-based Probing
• Inject probe packets into network
• Infer internal properties from packet delay/loss
• Current tools infer
– Topology
– Link bandwidths
– End-to-end available bandwidth
– Congestion locations
Cross-Traffic Inference
• Simple network path – single queue
• Spread of packet pair gives cross-traffic over small time interval
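A common way to turn packet-pair spread into a cross-traffic estimate is sketched below. It relies on assumptions that the slide does not state: a single FIFO queue of known capacity that stays busy between the two probe packets, so that everything served between the two probe departures is cross-traffic plus the second probe. The function name and parameters are hypothetical.

```python
def cross_traffic_bytes(capacity_bytes_per_s, output_spacing_s, probe_size_bytes):
    """Estimate cross-traffic that arrived between two probe packets.

    Assumes the queue never empties between the probes, so the bytes served
    during the output spacing are (second probe + cross-traffic).
    """
    served = capacity_bytes_per_s * output_spacing_s
    return max(0.0, served - probe_size_bytes)

# Example with made-up numbers: 1 MB/s link, 1000-byte probes leaving 2 ms apart.
print(cross_traffic_bytes(1_000_000, 0.002, 1000))
```

If the queue does empty between the probes, the output spacing reflects the input spacing instead and this estimate no longer applies.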
Inferring cross-traffic over large time interval [0,T]
• Probing uncertainty principle
– Dense sampling: accurate inference, affects cross-traffic
– Sparse sampling: less accurate inference, less influence on cross-traffic
Problem Statement
Given N probe pairs, how must we space them over the time interval [0,T] to optimally estimate the total cross-traffic in [0,T]?
• Answer depends on
– cross-traffic
– optimality criterion
Multiscale Cross-Traffic Model
• Choose N leaf nodes to give best linear estimate (in terms of mean squared error) of root node
• Take a guess!
– Bunch probes together
– Exponentially space probe pairs
– Uniformly space probes over interval
– Your favorite solution
[Figure: multiscale tree with the root and the leaves labeled]
Sensor Networks Application
• Each sensor samples local value of process (pollution, temperature etc.)
• Sensors cost money!
• Find best placement for N sensors to measure global average
[Figure: tree with the global average at the root and possible sensor locations at the leaves]
Independent Innovations Trees
• Each node is a linear combination of parent and an independent random innovation
• Optimal solution obtained by a water-filling procedure
• X: arbitrary set of leaf nodes, together with its size |X|
• The leaves of X on the left and the leaves of X on the right are treated separately
• Criterion: linear min. mean sq. error of estimating the root using X
Water-Filling
[Figure: water-filling on the functions fL(l) and fR(l), plotted for l = 0, 1, 2, 3, 4, as N increases from 0 to 4]
• Repeat at next lower scale with N replaced by l*_N (left) and N - l*_N (right)
• Result: If innovations are identically distributed within each scale, then uniformly distribute leaves: l*_N = ⌊N/2⌋
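One step of the split can be illustrated with a greedy allocation. This is a sketch under assumptions, not the authors' exact procedure: fL and fR are taken to be concave "gain" sequences to be maximized (the slide's formulas did not survive), given as lists indexed by the number of leaves, and each of the N leaves goes to whichever side offers the larger marginal gain. The function name `split_leaves` is hypothetical.

```python
def split_leaves(f_left, f_right, n):
    """Greedily split n leaves between two subtrees.

    f_left[l] / f_right[r] are assumed concave gains for placing l / r leaves.
    Returns (l, r) with l + r == n. Ties go to the right subtree.
    """
    if n > (len(f_left) - 1) + (len(f_right) - 1):
        raise ValueError("not enough leaf positions for n")
    l = r = 0
    for _ in range(n):
        gain_l = f_left[l + 1] - f_left[l] if l + 1 < len(f_left) else float("-inf")
        gain_r = f_right[r + 1] - f_right[r] if r + 1 < len(f_right) else float("-inf")
        if gain_l > gain_r:
            l += 1
        else:
            r += 1
    return l, r

# Identical concave gains on both sides (hypothetical numbers): the split is even.
f = [0, 4, 7, 9, 10]
print(split_leaves(f, f, 3))   # (1, 2)
print(split_leaves(f, f, 4))   # (2, 2)
```

For concave sequences this greedy choice reaches the maximum of f_left[l] + f_right[n - l]. With identical functions on both sides the left subtree receives floor(n/2) leaves, in line with the uniform result stated above. Applying the same step recursively to each subtree places the leaves scale by scale.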
Covariance Trees
• Distance: two leaf nodes have distance j if their lowest common ancestor is at scale j
• Covariance tree: covariance between leaf nodes with distance j is cj (only a function of distance); covariance between the root and any leaf node is constant
• Positively correlated tree: cj>cj+1
• Negatively correlated tree: cj<cj+1
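The distance and the resulting leaf covariance matrix can be sketched for a binary tree. The indexing here is an assumption: scales are counted upward from the leaves (leaves at scale 0), so that a larger distance means a coarser common ancestor, which matches cj>cj+1 meaning that nearby leaves are more strongly correlated. Leaves are numbered 0 to 2**depth - 1 from left to right, and c[0] is taken to be the leaf variance. Function names are hypothetical.

```python
def leaf_distance(a, b):
    """Levels from the leaves up to the lowest common ancestor of leaves a and b."""
    return (a ^ b).bit_length()

def leaf_covariance_matrix(c, depth):
    """Covariance matrix of the 2**depth leaves; c[j] is the covariance at distance j."""
    if len(c) != depth + 1:
        raise ValueError("need one covariance value per distance 0..depth")
    n = 1 << depth
    return [[c[leaf_distance(a, b)] for b in range(n)] for a in range(n)]

# Hypothetical positively correlated tree: c0 > c1 > c2.
for row in leaf_covariance_matrix([4.0, 2.0, 1.0], depth=2):
    print(row)
```

The XOR of two leaf indices has its highest set bit at the first tree level where their paths from the root differ, so its bit length is the number of levels up to the common ancestor.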
Covariance Tree Result
• Result: For a positively correlated tree choosing leaf nodes uniformly in the tree is optimal. However, for negatively correlated trees this same uniform choice is the worst case!
• Optimality proof: Simply construct an independent innovations tree with similar correlation structure
• Worst-case proof:
– The uniform choice maximizes the sum of the elements of S_X
– Eigen-analysis shows that this implies the uniform choice minimizes the sum of the elements of S_X^{-1}
Future Directions
• Sampling
– More general tree structures
– Non-linear estimates
– Non-tree stochastic processes
• Traffic estimation
– More complex networks
• Sensor networks
– Jointly optimize with other constraints like power transmission
References
• Estimation on multiscale trees
– A. Willsky, “Multiresolution Markov models for signal and image processing”, Proc. of the IEEE, vol. 90, no. 8, August 2002.
• Optimal sampling on trees
– V. Ribeiro, R. Riedi, and R. Baraniuk, “Optimal sampling strategies for multiscale models and their application to computer networks”, preprint.