Data Mining and Machine Learning for Analysis of Network ... ljilja/cnl/presentations/ljilja/ISPA_IUCC

  • View
    0

  • Download
    0

Embed Size (px)

Text of Data Mining and Machine Learning for Analysis of Network ......

  • Data Mining and Machine Learning for Analysis of Network Traffic

    Ljiljana Trajković ljilja@cs.sfu.ca

    Communication Networks Laboratory

    http://www.ensc.sfu.ca/cnl School of Engineering Science

    Simon Fraser University, Vancouver, British Columbia Canada

  • Roadmap

    n  Introduction n  Traffic collection, characterization, and modeling n  Case studies:

    n  telecommunication network: BCNET n  public safety wireless network: E-Comm n  satellite network: ChinaSat n  packet data networks: Internet

    n  Conclusions

    December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China 2

  • lhr: 535,102 nodes and 601,678 links

    December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China

    http://www.caida.org/home/

    3

  • Roadmap

    n  Introduction n  Traffic collection, characterization, and modeling n  Case studies:

    n  telecommunication network: BCNET n  public safety wireless network: E-Comm n  satellite network: ChinaSat n  packet data networks: Internet

    n  Conclusions

    December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China 4

  • n  Traffic measurements: n  help understand characteristics of network traffic n  are basis for developing traffic models n  are used to evaluate performance of protocols and

    applications n  Traffic analysis:

    n  provides information about the network usage n  helps understand the behavior of network users

    n  Traffic prediction: n  important to assess future network capacity

    requirements n  used to plan future network developments

    Measurements of network traffic

    December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China 5

  • Traffic modeling: self-similarity

    n  Self-similarity implies a ‘‘fractal-like’’ behavior n  Data on various time scales have similar patterns n  Implications:

    n  no natural length of bursts n  bursts exist across many time scales n  traffic does not become ‘‘smoother” when

    aggregated (unlike Poisson traffic) n  it is unlike Poisson traffic used to model traffic in

    telephone networks n  as the traffic volume increases, the traffic

    becomes more bursty and more self-similar

    December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China 6

  • December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China

    Self-similarity

    n  Self-similarity implies a ‘‘fractal-like’’ behavior: data on various time scales have similar patterns

    n  A wide-sense stationary process X(n) is called (exactly second order) self-similar if its autocorrelation function satisfies: n  r(m)(k) = r(k), k ≥ 0, m = 1, 2, …, n, where m is the level of aggregation

    7

  • December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China

    n  Properties: n  slowly decaying variance n  long-range dependence n  Hurst parameter (H)

    n  Processes with only short-range dependence (Poisson): H = 0.5

    n  Self-similar processes: 0.5 < H < 1.0 n  As the traffic volume increases, the traffic becomes

    more bursty, more self-similar, and the Hurst parameter increases

    Self-similar processes

    8

  • Self-similarity: influence of time-scales

    n  Genuine MPEG traffic trace

    0

    50000

    100000

    150000

    200000

    250000

    300000

    0 100 200 300 400 500

    time unit = 160 ms (4 frames)

    bi ts

    /ti m

    e u ni

    t

    0.E+00

    2.E+05

    4.E+05

    6.E+05

    8.E+05

    1.E+06

    0 100 200 300 400 500

    time unit = 640 ms (16 frames)

    bi ts

    /ti m

    e un

    it

    0.E+00

    1.E+06

    2.E+06

    3.E+06

    4.E+06

    5.E+06

    0 100 200 300 400 500

    time unit = 2560 ms (64 frames)

    bi ts

    /ti m

    e un

    it

    December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China

    W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson, “On the self-similar nature of Ethernet traffic (extended version),” IEEE/ACM Trans. Netw., vol. 2, no 1, pp. 1-15, Feb. 1994.

    9

  • Self-similarity: influence of time-scales

    n  Synthetically generated Poisson model

    0

    50000

    100000

    150000

    200000

    250000

    300000

    0 100 200 300 400 500

    time unit = 160 ms (4 frames)

    bi ts

    /ti m

    e un

    it

    0.E+00

    2.E+05

    4.E+05

    6.E+05

    8.E+05

    1.E+06

    0 100 200 300 400 500

    time unit = 640 ms (16 frames)

    bi ts

    /ti m

    e un

    it

    0.E+00

    1.E+06

    2.E+06

    3.E+06

    4.E+06

    5.E+06

    0 100 200 300 400 500

    time unit = 2560 ms (64 frames)

    bi ts

    /ti m

    e un

    it

    December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China

    W. E. Leland, M. S. Taqqu, W. Willinger, and D. V. Wilson, “On the self-similar nature of Ethernet traffic (extended version),” IEEE/ACM Trans. Netw., vol. 2, no 1, pp. 1-15, Feb. 1994.

    10

  • Roadmap

    n  Introduction n  Traffic collection, characterization, and modeling n  Case studies:

    n  telecommunication network: BCNET n  public safety wireless network: E-Comm n  satellite network: ChinaSat n  packet data networks: Internet

    n  Conclusions

    December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China 11

  • Case study: BCNET

    n  BCNET is the hub of advanced telecommunication network in British Columbia, Canada that offers services to research and higher education institutions

    n  The BCNET network is high-speed fiber optic research network

    n  British Columbia's network extends to 1,400 km and connects Kamloops, Kelowna, Prince George, Vancouver, and Victoria

    December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China 12

  • BCNET packet capture

    December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China 13

  • BCNET packet capture

    n  BCNET transits have two service providers with 10 Gbps network links and one service provider with 1 Gbps network link

    n  Optical Test Access Point (TAP) splits the signal into two distinct paths

    n  The signal splitting ratio from TAP may be modified n  The Data Capture Device (NinjaBox 5000) collects the

    real-time data (packets) from the traffic filtering device

    December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China 14

  • Net Optics Director 7400: application diagram

    n  Net Optics Director 7400 is used for BCNET traffic filtering

    n  It directs traffic to monitoring tools such as NinjaBox 5000 and FlowMon

    December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China 15

  • Network monitoring and analyzing: Endace card

    n  Endace Data Acquisition and Generation (DAG) 5.2X card resides inside the NinjaBox 5000

    n  It captures and transmits traffic and has time-stamping capability

    n  DAG 5.2X is a single port Peripheral Component Interconnect Extended (PCIx) card and is capable of capturing on average Ethernet traffic of 6.9 Gbps

    December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China 16

  • Real time network usage by BCNET members

    December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China 17

  • Roadmap

    n  Introduction n  Traffic collection, characterization, and modeling n  Case studies:

    n  telecommunication network: BCNET n  public safety wireless network: E-Comm n  satellite network: ChinaSat n  packet data networks: Internet

    n  Conclusions

    December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China 18

  • December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China

    Case study: E-Comm network

    n  E-Comm network: an operational trunked radio system serving as a regional emergency communication system

    n  The E-Comm network is capable of both voice and data transmissions

    n  Voice traffic accounts for over 99% of network traffic n  A group call is a standard call made in a trunked radio

    system n  More than 85% of calls are group calls n  A distributed event log database records every event

    occurring in the network: call establishment, channel assignment, call drop, and emergency call

    19

  • December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China

    E-Comm network

    20

  • December 14, 2017 IEEE ISPA/IEEE IUCC/SpaCCS 2017, Guangzhou, China

    E-Comm network architecture

    Burnaby

    Vancouver Other EDACS systems

    PSTN PBX Dispatch consoleUsers

    Database server

    Data gateway

    Management console

    Transmitters/Repeaters

    Network switch

    1 2 3 4 5 6 7 8 9 * 8 #

    I B M

    21

  • December 14, 2017 IE

View more