Operational Experiences from the Viewpoint of University IT System Administrators in the Metropolitan Area on East Japan Great Earthquake. Kohichi Ogawa and Noriaki Yoshiura, Information Technology Center, Saitama University. ACM SIGUCCS 2012 Service & Support Conference. Friday, October 19, 12

Siguccs presentation pre


Page 1: Siguccs presentation pre

Operational Experiences from the Viewpoint of University IT System Administrators in the Metropolitan Area on East Japan Great Earthquake

Kohichi Ogawa and Noriaki Yoshiura

Information Technology Center, Saitama University

ACM SIGUCCS 2012 Service & Support Conference

Friday, October 19, 12

Page 2: Siguccs presentation pre

Great Earthquake and Great Tsunami

Page 11: Siguccs presentation pre

Location of Earthquake and our University

[Map labels: damaged areas; epicenter; Tokyo; Saitama University; about 130 miles]

Page 12: Siguccs presentation pre

Topics of the presentation

• Energy Problems by the Earthquake
• Some Troubles in the Rolling Blackouts
• Relocation to Data Center and VPS
• Lessons and Experiences

Page 13: Siguccs presentation pre

1. Introduction
  – System at the earthquake
2. Situation after the Earthquake
  – Immediately after the Earthquake
  – Operation for rolling power outage
  – Impact of rolling power outage
3. Countermeasures against this situation
  – Data Center
  – VPS
4. Effectiveness by countermeasures
5. New System after the earthquake
6. Lessons and Approaches

Page 15: Siguccs presentation pre

System at the disaster (2007-2011)

• Network System
  – L3 switches: 6
  – Wi-Fi access points: 80
• Server System
  – About 40 servers
• Hosting Services
  – Web Hosting Service (200 sites)
  – DNS Hosting Service (100 zones)
  – Mail Hosting Service (40 subdomains)
• Housing Service
  – Server room space rented to other organizations in the university

Page 16: Siguccs presentation pre

Network Topology

• Star topology network
• One-to-one connection from each lab to the server room
• No network switch between each room and the server room

Page 18: Siguccs presentation pre

Situation immediately after the Great Earthquake

• Seismic intensity 5-lower at Saitama University
• No direct damage such as collapsed buildings
• Information Infrastructure
  – No optical fiber cut in the server room
  – No troubles in network equipment and servers

Page 20: Siguccs presentation pre

The Rolling Blackouts

• Damaged nuclear power plant → the supply of electricity was weakened
• The government announced implementation of the rolling blackouts.
• 5 groups by regions
• 4th group at Saitama University
• Blackouts for about 4 hours at a time

Page 21: Siguccs presentation pre

Impacts of rolling blackouts

[Figure labels: Groups of the Rolling Blackouts; Electricity; Place by the Rolling Blackouts]

Page 22: Siguccs presentation pre

Countermeasures against the disasters

• Information Infrastructure during Rolling Blackouts
  – to support the activities of the university by email and web servers
• Rented Power Generator
• Switching to the emergency power supply
  – manpower

Page 23: Siguccs presentation pre

Practical use of Rented Power Generator

[Photos: Rented Power Generator; Temporary Power Connection Board]

Page 24: Siguccs presentation pre

Schedule for the rolling blackouts

[Table: dates 3/14 (Mon) to 3/23 (Wed) across the columns; times of day 0:00, 6:00, 12:00, 18:00 down the rows. Each cell is either "Wait" or a blackout window. Blackout windows shown: 9:20~12:30, 6:20~10:00, 13:50~17:30, 15:20~18:40, 15:50~18:45, 18:50~21:45, 18:20~21:00]
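As a rough illustration, the blackout windows in the schedule can be converted into durations to see how long each outage lasted. This is a minimal sketch: the helper name `window_minutes` is hypothetical, and the windows are copied from the schedule slide without their date assignment.

```python
def window_minutes(window: str) -> int:
    """Return the length in minutes of a window written as 'H:MM~H:MM'."""
    start, end = window.split("~")

    def to_minutes(clock: str) -> int:
        hours, minutes = clock.split(":")
        return int(hours) * 60 + int(minutes)

    return to_minutes(end) - to_minutes(start)


# Blackout windows as they appear in the schedule slide.
WINDOWS = [
    "9:20~12:30", "6:20~10:00", "13:50~17:30", "15:20~18:40",
    "15:50~18:45", "18:50~21:45", "18:20~21:00",
]

# Map each window to its duration in minutes.
durations = {w: window_minutes(w) for w in WINDOWS}
```

Under this reading, the listed windows run from 160 to 220 minutes each.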

Page 25: Siguccs presentation pre

Some troubles for Information Infrastructure

• March 22: Three UPS units and two servers failed when the power supply was switched.
  – Failure of the DNS server
  – E-mail and Web servers could not be accessed
• March 23: Troubles of L3 switches
  – Layer 3 switch trouble caused by a routing processing unit failure
  – A part of the Campus Network stopped for 3 days

Page 26: Siguccs presentation pre

Problems of fuel exhaustion

• Emergency power fuel exhaustion
  – Oil refinery damaged by earthquake
  – Reduction of oil fuel supply
• Staff Problems:
  – Scheduling of operation staff
  – Traffic paralysis
  – Health status of operation staff

The difficulty of maintaining the information infrastructure

Page 28: Siguccs presentation pre

Countermeasures against this situation

• Data Center
  – Physical Relocation
  – Relocation of critical servers
• VPS (Virtual Private Server)
  – Logical Relocation
  – Relocation of functions

Page 29: Siguccs presentation pre

Preparation of data center relocation

• Ready-to-use Data Center
• Tour of the data center
  – Two weeks before the earthquake
• A data center near the university by chance
• Specification
  – 1 rack (Full Rack) 60A/100V
  – 100Mbps internet
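The 60A/100V rack specification implies a power budget of 60 × 100 = 6000 VA. The sketch below shows one way to check whether a set of servers fits that budget. The function names, the per-server draw figures, and the 80% headroom factor are assumptions for illustration; none of them come from the presentation.

```python
from typing import Sequence


def rack_capacity_va(amps: float, volts: float) -> float:
    """Apparent power available to the rack, in volt-amperes."""
    return amps * volts


def fits_in_rack(server_draws_va: Sequence[float],
                 amps: float = 60.0,
                 volts: float = 100.0,
                 headroom: float = 0.8) -> bool:
    """True if the summed draw stays within `headroom` of the rack capacity.

    `headroom` is an assumed safety margin, not a figure from the slides.
    """
    return sum(server_draws_va) <= rack_capacity_va(amps, volts) * headroom


# Hypothetical example: ten servers drawing 300 VA each.
example_ok = fits_in_rack([300.0] * 10)
```

With the assumed margin, ten 300 VA servers (3000 VA) fit, while twenty (6000 VA) do not.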

Page 30: Siguccs presentation pre

Standards for selecting the data center

• Easy access, near the university
• Private power generator prepared
  – fuel is always stored for 3 days
• Physical security
• Earthquake-proof construction

Page 31: Siguccs presentation pre

Plan of relocation to the Data Center

• Carry out servers in three groups
  – Many checks
  – Carefully
• First relocation
  – impacting only a few users
• Last relocation
  – E-mail System
  – impacting many users
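The ordering idea in the plan (move low-impact servers first, the e-mail system last, in three groups) can be sketched as a sort followed by a split. This is only an illustration: the `Server` class, `plan_relocation`, and every user count below are invented; the server names are borrowed from the relocation diagram.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    name: str
    affected_users: int  # hypothetical estimate, not from the slides


def plan_relocation(servers: List[Server], groups: int = 3) -> List[List[Server]]:
    """Order servers by user impact (lowest first) and split them into groups."""
    ordered = sorted(servers, key=lambda s: s.affected_users)
    n = len(ordered)
    # Slice boundaries i*n//groups always yield exactly `groups` slices.
    return [ordered[i * n // groups:(i + 1) * n // groups] for i in range(groups)]


servers = [
    Server("University Mail Server", 10000),
    Server("LDAP Server", 500),
    Server("Mailing Lists Server", 300),
    Server("DNS Hosting Server", 100),
    Server("Web Mail Server", 5000),
    Server("Mail Hosting Server", 800),
]
plan = plan_relocation(servers)
```

Moving the least visible servers first means the checks and procedure are rehearsed before the move that affects the most users.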

Page 40: Siguccs presentation pre

How to move to the Data Center

[Diagram labels: Spam Filter Appliance; LDAP Server; Mailing Lists Server; DNS Hosting Server; Web Mail Server; Outside SMTP Server; Mail Hosting Server; University Mail Server; Firewall; Net]

Page 41: Siguccs presentation pre

The actual relocation to the Data Center

• About one week from applying for the data center
• Completed the relocation of all the hardware at the end of March, 2011
• Relocation experience
  – one of the operation staff

Page 42: Siguccs presentation pre

Some Troubles of DNS settings

• Mis-operation of DNS settings
  – mail servers became inaccessible
• Changing the IP addresses of servers in the data center relocation
  – shorten the "TTL values" of the DNS configuration
• Laboratory Routers
  – have a DNS cache function
  – reboot after a big change of infrastructure
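The reason for shortening TTL values before changing IP addresses is that resolvers may keep serving the old record for up to the old TTL. A minimal sketch of that timing follows; the function names and the example date and TTL are hypothetical, and it assumes resolvers honour TTLs (a caching router that does not would still need the reboot mentioned above).

```python
from datetime import datetime, timedelta


def earliest_safe_cutover(ttl_lowered_at: datetime, old_ttl_seconds: int) -> datetime:
    """Time after which caches that honour TTLs hold only the short-TTL record.

    A resolver that cached the record just before the TTL was lowered
    keeps it for up to the old TTL.
    """
    return ttl_lowered_at + timedelta(seconds=old_ttl_seconds)


def worst_case_stale_window(new_ttl_seconds: int) -> timedelta:
    """Longest time a resolver can serve the old address after the change."""
    return timedelta(seconds=new_ttl_seconds)


# Hypothetical example: TTL lowered from one day to five minutes.
lowered_at = datetime(2011, 3, 24, 9, 0)
cutover = earliest_safe_cutover(lowered_at, old_ttl_seconds=86400)
stale = worst_case_stale_window(new_ttl_seconds=300)
```

In this example the address change is best made no earlier than one day after the TTL was lowered, after which clients pick up the new address within about five minutes.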

Page 43: Siguccs presentation pre

Usage of VPS

• VPS (Virtual Private Server)
  – Operations via Web Browsers
  – Installing and setting up some OS (CentOS, Fedora…)
  – Setting up Servers freely

Page 44: Siguccs presentation pre

Servers relocated to VPS

• Secondary Mail Spool Server
  – Prevent lost mail during data center relocation or the rolling blackouts
• DNS Server (Slave Server)
  – Secondary DNS Server
• Web Server of Saitama University
  – www.saitama-u.ac.jp
• Web Hosting Server
  – Virtual Web server for laboratory and office
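A secondary mail spool prevents lost mail because sending servers try MX hosts in order of preference (lowest number first) and fall back to the next host when one is unreachable. The sketch below models that selection. The host names, preference values, and function names are hypothetical; the actual MX configuration is not given in the presentation.

```python
from typing import Callable, List, Optional, Tuple

MxRecord = Tuple[int, str]  # (preference, host); lower preference is tried first


def delivery_order(mx_records: List[MxRecord]) -> List[str]:
    """Return MX hosts in the order a sending server would try them."""
    return [host for _, host in sorted(mx_records)]


def pick_mail_host(mx_records: List[MxRecord],
                   is_reachable: Callable[[str], bool]) -> Optional[str]:
    """Return the first reachable MX host, or None if all are down."""
    for host in delivery_order(mx_records):
        if is_reachable(host):
            return host
    return None


# Hypothetical records: primary on campus, secondary spool on the VPS.
records = [(20, "spool.vps.example.net"), (10, "mail.example.ac.jp")]

# During a blackout the campus host is down, so the VPS spool takes the mail.
during_blackout = pick_mail_host(records, lambda h: h != "mail.example.ac.jp")
```

When no host is reachable, real sending servers queue the message and retry later, which is why a reachable secondary shortens delays rather than being the only protection against loss.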

Page 46: Siguccs presentation pre

Effectiveness of Relocation (1)

• Relocation of the servers decreases consumption of electric power.
• The reduction in electricity consumption covers the cost of the data center.

Page 53: Siguccs presentation pre

Trends in electricity usage

[Chart: electricity use (vertical axis) by date (horizontal axis), with markers for the earthquake and for the start of the new system]

Page 54: Siguccs presentation pre

Effectiveness of Relocation (2)

• Reduction of the operation for maintaining information infrastructure
• Contribution for stable Mail service and Web services
  – Availability of remote support without going to the server room at the university

Page 56: Siguccs presentation pre

New System (2012~now)

• Wi-Fi Access Points (300 APs)
• Virtualization Technology
• Designed with use of the Data Center in mind

[Photos: Cisco UCS; EMC VMX]

Page 57: Siguccs presentation pre

Comparison of server hardware

• 2007~2011: about 40 Servers (1U or 2U Server)
• 2012~now: 2 Servers (Cisco UCS)

Page 62: Siguccs presentation pre

Organizations

Lessons

• The top executives of the university and the person in charge have the same viewpoint.
  – "The information infrastructure is important"
• Staff skill and manpower are important

Approaches

• Communicate smoothly within the organization
• Improve technology skills of operation staff
• Make the information system compact
• Set the priorities of elements of the system

Page 65: Siguccs presentation pre

Environments

Lessons

• Because it was one campus, communication between faculty and staff was good.

Approaches

• With separate campuses, telephones may be unavailable → prepare satellite-based mobile phones

Page 68: Siguccs presentation pre

Cooperation among Universities

Lessons

• We back up the data among universities for each other
• Service for the damaged university was provided by other non-damaged university

Approaches

• "Disaster Net Box" (from WTC2012)
  – Low cost backup system among universities

Page 71: Siguccs presentation pre

System Administrators in disasters

Lessons

• Switching to the power generator required manpower.
• In disasters, the traffic paralysis disrupted the commute of system administrators.

Approaches

• The measures to maintain the information infrastructure remotely are effective.

Page 78: Siguccs presentation pre

Contribution for Areas near the University

Lessons

• Mobile phones were unavailable in disasters.
• People could not use the Internet during disasters.

Approaches

• Open the university resources for commuters and the neighborhood inhabitants in disasters
  – The information infrastructure of the university
• Be careful about false rumors!

Page 79: Siguccs presentation pre

Conclusion

• We relocated servers to Data Center and VPS as countermeasures against Rolling Blackouts.
• We learned some lessons by the Great Earthquake and the Rolling Blackouts.

Page 80: Siguccs presentation pre

Questions and Answers

If you have any questions or interest, please send e-mail as follows.

E-mail: [email protected]-u.ac.jp
Twitter: @gawakouen
Facebook: gawakou

Face-to-face communication is also welcome.