
Large-Scale Network-Service Disruption:

Dependencies & External Factors

Course: Network Operations and Control
Lecturer: M. Kuang
Group members: L Zhang, C Qu, Y Zhang, ShuWei Kuo


Summary

Goal: identify the dependencies and external factors that result in large-scale network-service disruptions, through a case study of Hurricane Ike.

Aspects of study:
1. Service disruptions and their dependencies in communication networks: unavailability of networks and the affected organizations.
2. Dependencies of communication disruptions on external factors: weather and power.

Conclusions:
1. Organization is a logical variable that characterizes subnets becoming unreachable dependently, just as an Autonomous System (AS) affects subnets from different organizations.
2. Subnet unreachability and the storm are weakly correlated. Power outages, on the other hand, strongly impact subnet unreachability.
3. Network systems need to be improved.


Background & Current Situation

1. Network availability is essential, e.g. $1.7 billion in profits was lost in 2009 due to network disruptions that lasted four hours or more in the US [1].
2. Factors like power outages and weather have been observed at large scale (state/country) but not at small scale.
3. Large-scale and/or dependent disruptions cause more damage but remain poorly understood.
4. Complex dependencies exist among communication networks, external disturbances, and power systems.


Outline

1 Network Disruption and Dependency
  1.1 Network Unreachability
  1.2 Temporal Independence
  1.3 Comparison with Normal Operations

2 External Factor: Correlation with Storm
  2.1 Relating Network Unreachability and Storm
  2.2 Non-causality of subnet unreachability
    2.2.1 A practical example
    2.2.2 Mathematical analysis
  2.3 Root cause
    2.3.1 Network dependence
    2.3.2 Power outage

3 Improvement
  3.1 Improvement for the report
  3.2 Improvement for the network

4 References


Network Unreachability

1. Identify disruptions from the unreachability of subnets, inferred from BGP update messages (withdrawals and announcements) collected for 230 subnets.

2. 120 subnets became unreachable during the storm (9/12 to 9/13).


Fig. 1. Unreachable subnets between 1/8/08 - 9/20/08.
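
A minimal sketch of how unreachable periods could be derived from a subnet's BGP updates. It assumes a simplified model in which a prefix is unreachable from a withdrawal until the next announcement; the function name, the `"W"`/`"A"` labels, and the sample timestamps are hypothetical, and a real analysis would have to account for updates seen from multiple peers.

```python
from typing import List, Tuple


def unreachable_intervals(
    updates: List[Tuple[float, str]], end_time: float
) -> List[Tuple[float, float]]:
    """Return (start, end) periods during which a prefix is considered unreachable.

    updates: (timestamp, kind) pairs, kind is "W" (withdrawal) or "A" (announcement).
    end_time: end of the observation window, used to close an open interval.
    """
    intervals = []
    down_since = None  # time of the withdrawal that opened the current outage
    for ts, kind in sorted(updates):
        if kind == "W" and down_since is None:
            down_since = ts  # first withdrawal starts the outage
        elif kind == "A" and down_since is not None:
            intervals.append((down_since, ts))  # announcement ends it
            down_since = None
    if down_since is not None:
        intervals.append((down_since, end_time))  # still down at window end
    return intervals


# Hypothetical update stream for one prefix (times in minutes).
sample = [(10, "W"), (12, "W"), (50, "A"), (70, "W")]
result = unreachable_intervals(sample, end_time=100)
```

The start of the first interval corresponds to what the later slides call the initial time of unreachability.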


Temporal Independence

Step & Method:
1. Group the 120 subnets by unique initial time, using a small time scale of less than 3 minutes: 56 samples.

2. Choose three disjoint intervals to determine the initial time of each subnet and the average unreachable duration.

3. Use Pearson's chi-square test to check that the unique initial times occur statistically independently.

Values on the slide: 12.71 and the critical value $\chi^2_{0.05,3} = 7.82$.


Fig. 2. Unreachable subnet groups with unique initial times between 9/12/08 - 9/14/08. Intervals 1-3 with different average inter-unreachable times.
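
The grouping and test steps can be sketched as follows. This is an illustration under stated assumptions, not the method of the slides: independence is checked here by testing whether inter-unreachable times fit an exponential distribution (as they would for Poisson arrivals), and the function names, the bin count, and the equal-probability binning are all hypothetical choices. With 5 bins and one fitted parameter, the statistic has 3 degrees of freedom, so it would be compared against the 5% critical value of about 7.81.

```python
import math
from typing import List


def group_initial_times(times_min: List[float], window: float = 3.0) -> List[float]:
    """Keep one unique initial time per group of times closer than `window` minutes.

    A time joins the current group if it is less than `window` after the
    group's first time; otherwise it starts a new group.
    """
    unique: List[float] = []
    for t in sorted(times_min):
        if not unique or t - unique[-1] >= window:
            unique.append(t)
    return unique


def chi_square_exponential(inter_arrivals: List[float], n_bins: int = 5) -> float:
    """Pearson chi-square statistic against an exponential with the sample mean.

    Bins are chosen so that each has equal probability under the fitted
    exponential, which makes every expected count equal to n / n_bins.
    """
    n = len(inter_arrivals)
    mean = sum(inter_arrivals) / n
    # Quantiles of the fitted exponential: F^{-1}(p) = -mean * ln(1 - p).
    edges = [-mean * math.log(1 - k / n_bins) for k in range(1, n_bins)]
    observed = [0] * n_bins
    for x in inter_arrivals:
        observed[sum(x > e for e in edges)] += 1
    expected = n / n_bins
    return sum((o - expected) ** 2 / expected for o in observed)


# Hypothetical initial times in minutes.
unique_times = group_initial_times([0, 1, 2, 10, 11, 20])
gaps = [b - a for a, b in zip(unique_times, unique_times[1:])]
```

A statistic below the critical value would mean the exponential hypothesis is not rejected at the 5% level.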


Comparison with Normal Operations

Compare the statistics between two different periods: pre-Ike and Ike.

Unreachabilities that occurred during Ike were 158.54 times greater in quantity, at a 40.05 times higher rate, and of 7.09 days longer duration: the impact of the hurricane!

Callouts on Tab. 1 (as extracted): 86.15 times, 58.53 times, 72 times.


Tab. 1. Comparison of subnet unreachability between the pre-Ike and Ike periods


Relating Network Unreachability and Storm

1. Correlation with the storm: study three parameters:

hitting time: when the storm coverage first overlaps with a subnet region

initial time: when the subnet first became unreachable

coverage probability: how likely the hurricane is to impact subnet unreachability directly

thus: the hitting times and the initial times are weakly correlated.

Fig. 4. Distributions of initial times, hitting times, and coverage probabilities.
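
A rough sketch of how a hitting time could be computed. It assumes the storm coverage at each time step is a circle (center and radius) and that a subnet region is reduced to a single geo-location; both are simplifications, and the track format, function names, and coordinates below are hypothetical.

```python
import math
from typing import List, Optional, Tuple


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    earth_radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def hitting_time(
    track: List[Tuple[float, float, float, float]],
    subnet_lat: float,
    subnet_lon: float,
) -> Optional[float]:
    """First time at which the storm circle covers the subnet location.

    track: time-ordered (time, center_lat, center_lon, radius_km) entries.
    Returns None if the storm never covers the subnet.
    """
    for t, lat, lon, radius_km in track:
        if haversine_km(lat, lon, subnet_lat, subnet_lon) <= radius_km:
            return t
    return None


# Hypothetical two-step storm track and subnet locations.
track = [(0.0, 25.0, -90.0, 100.0), (1.0, 29.0, -95.0, 100.0)]
t_hit = hitting_time(track, 29.3, -94.8)
t_miss = hitting_time(track, 40.0, -75.0)
```

Comparing each subnet's hitting time with its initial time then shows whether unreachability started before or after the storm arrived.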


Non-causality of subnet unreachability: a practical example

2. Non-causality with the storm. Findings:
1) Subnets could become unreachable before the storm approached.
2) When the storm entered the majority of subnet regions, more subnets became unreachable outside the storm coverage.

thus: the storm may not be the only direct cause of subnet unreachability. There may exist hidden variables that are neither taken into consideration nor inferable from our data.

Fig. 5. Storm coverage at the end of each sub-interval (blue circle) and geo-locations of subnets (white circles and squares) for different sub-intervals.


Non-causality of subnet unreachability: mathematical analysis

Correlate the initial time of unreachability with the hitting time for individual subnets.

$$I_{ui}(t) = \begin{cases} 1, & \text{if } t = t_{ui}, \text{ i.e., when subnet } i \text{ initially became unreachable;} \\ 0, & \text{otherwise.} \end{cases}$$

$$I_{hi}(t) = \begin{cases} 1, & \text{if } t \in [t_{hi} - T,\ t_{hi}], \text{ i.e., the storm hits subnet region } i \text{ when } t \text{ falls in this interval;} \\ 0, & \text{otherwise.} \end{cases}$$

$$\rho(T) = \frac{1}{n} \sum_{i=1}^{n} I_{ui}[t_{ui}]\, I_{hi}[t_{ui}]$$

$\rho(T)$ close to 1 indicates high correlation.

thus: correlations are consistently small.

Fig. 6. Sample correlation coefficients for T = 15 and 30 minutes.
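
A small sketch of the sample correlation between initial times and hitting times. It assumes the coefficient is the fraction of subnets whose initial time falls inside a window of length T that ends at the subnet's hitting time; the placement of the window, the function name, and the sample values are assumptions made for illustration.

```python
from typing import List


def sample_correlation(
    initial_times: List[float], hitting_times: List[float], window: float
) -> float:
    """Fraction of subnets with initial time in [hitting time - window, hitting time]."""
    if len(initial_times) != len(hitting_times) or not initial_times:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(
        1
        for t_u, t_h in zip(initial_times, hitting_times)
        if t_h - window <= t_u <= t_h
    )
    return hits / len(initial_times)


# Hypothetical times in minutes for four subnets.
t_initial = [100.0, 200.0, 300.0, 400.0]
t_hitting = [110.0, 150.0, 330.0, 400.0]
rho_15 = sample_correlation(t_initial, t_hitting, window=15.0)
rho_30 = sample_correlation(t_initial, t_hitting, window=30.0)
```

A value near 1 would mean most subnets became unreachable close to when the storm reached them; consistently small values point to other causes.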


Root cause

Cause 1: network dependence

Subnets from the same group and the same organization exhibit a similar pattern of BGP withdrawal bursts; such subnets are dependent within an organization.

Calculate the correlation coefficient using the inter-arrival times of BGP updates inside the bursts of different subnets.

18 groups (77 within-organization dependent subnets, 14 organizations, 14 ASes) have correlation coefficients between 0.9150 and 1.

84.58% of subnets: same ISP
92.06% of subnets: same unreachability duration


Tab. 2. Example of withdrawal bursts belonging to two subnets
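
The correlation between two withdrawal bursts can be sketched as a Pearson coefficient over their inter-arrival times. This assumes both bursts contain the same number of updates so the inter-arrival sequences can be paired; the function names and timestamps are hypothetical.

```python
import math
from typing import List


def inter_arrival_times(timestamps: List[float]) -> List[float]:
    """Gaps between consecutive BGP update timestamps within one burst."""
    ts = sorted(timestamps)
    return [b - a for a, b in zip(ts, ts[1:])]


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical update timestamps (seconds) for bursts of two subnets.
burst_a = [0.0, 1.0, 3.0, 6.0, 10.0]
burst_b = [5.0, 6.0, 8.0, 11.0, 15.0]
r = pearson(inter_arrival_times(burst_a), inter_arrival_times(burst_b))
```

Here the second burst is the first one shifted in time, so the inter-arrival patterns match and the coefficient is 1, the kind of value that would place two subnets in the same dependent group.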


Root cause

Cause 2: power outage

Reports offered by 4 organizations owning unreachable subnets:
1. Houston Advanced Research Center (HARC): the network administrators shut down the network as a precaution due to a lack of spare power.
2. ISP in Texas: the root causes were reported to be power outages.
3. NASA-Johnson Space Center: the root causes were reported to be power outages.
4. University of Texas Medical Branch at Galveston (UTMB): power failed when the rising water flooded the power generators.

thus: power is another external factor.


Improvement for the report

1. Present new findings on large-scale and/or dependent disruptions.

2. Consider network security when suggesting that information be shared across multiple organizations and providers.

3. More detailed data is needed when analyzing power outage as a root cause of network disruption.

4. The structures of the different subnets are unknown, so the pros and cons of how each organization prevents network disruption cannot be compared.


Improvement for the network

Suggestions on this topic:
1. Increase the mobility or backup of network equipment (switches, routers, base stations, databases, etc.) to avoid the effect of external factors (security).

2. Carry out further research on optimizing the network structure in different organizations.

3. Establish a plan for service-failure recovery (configuration).


References

[1] J. H. Cowie, A. T. Ogielski, BJ Premore, E. A. Smith, and T. Underwood, “Impact of the 2003 blackout on the Internet communications,” Renesys Corporation, Nov. 2003.
[2] J. Cowie, “Lights out in Rio,” Renesys Corporation, Nov. 2009.
[3] M. Brown, “Ike hammers Texas Internet,” Renesys Corporation, Sep. 2008.
[4] S. Erjongmanee, C. Ji, J. Stokely, and N. Hightower, “Knowledge discovery from sensor data,” M. Gaber, et al., editors, Lecture Notes in Computer Science, vol. 5840. Springer, 2010.
[5] S. C. Liew and K. W. Lu, “A framework for characterizing disaster-based network survivability,” IEEE J. Sel. Areas Commun., vol. 12, no. 1, pp. 52–58, Jan. 1994.
[6] R. Berg, “Tropical cyclone report: Hurricane Ike (AL092008) 1-14 Sep. 2008,” National Hurricane Center, Jan. 2008.
[7] Y. Rekhter, T. Li, and S. Hares, “A Border Gateway Protocol 4,” RFC 1771, 1995.
[8] A. Feldmann, O. Maennel, Z. M. Mao, A. Berger, and B. Maggs, “Locating Internet routing instabilities,” Comput. Commun. Rev., vol. 34, no. 4, pp. 205–218, Oct. 2004.
[9] K. Xu, J. Chandrashekar, and Z. L. Zhang, “A first step towards understanding inter-domain routing,” in Proc. 2005 ACM SIGCOMM Workshop Mining Netw. Data.
[10] Route Views project. Available: http://www.routeview.org.
[11] L. Kaufman and P. J. Rousseeuw, Finding Groups in Data:


Any questions?



Thank you!
