Dependability in the
Internet Era
Jim Gray
Microsoft Research
High Dependability Computing Consortium Conference
Santa Cruz, CA 7 May 2001
REVISED: 13 Feb 2005 Stanford, CA
2
Outline
• The glorious past (Availability Progress)
• The dark ages (current scene)
• Some recommendations
3
The Last 10 Years: Availability Dark Ages
Ready for a Renaissance?
• Things got better, then things got a lot worse!
[Figure: availability (axis from 9% to 99.999%) versus year (1950 to 2010) for Computer Systems, Telephone Systems, Cellphones, and the Internet]
4
DEPENDABILITY: The 3 ITIES
• RELIABILITY / INTEGRITY: Does the right thing. (also MTTF >> 1)
• AVAILABILITY: Does it now. (also 1 >> MTTR)
  Availability = MTTF / (MTTF + MTTR)
• System Availability: If 90% of terminals up & 99% of DB up?
  (=> 89% of transactions are serviced on time).
• Holistic vs. Reductionist view
[Figure: Security, Integrity, Reliability, Availability]
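The arithmetic behind the 89% figure can be sketched as follows. This is a minimal illustration, assuming the terminals and the database fail independently and that a transaction needs both; the function names are hypothetical and not from the talk.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


def series_availability(*parts: float) -> float:
    """Availability of a request that needs every part to be up.

    Assumes the parts fail independently, so the probabilities multiply.
    """
    result = 1.0
    for part in parts:
        result *= part
    return result


# Slide example: 90% of terminals up and 99% of the database up.
end_to_end = series_availability(0.90, 0.99)
print(f"end-to-end availability: {end_to_end:.1%}")
```

The product is 0.891, which the slide rounds to 89%: the system seen by the user is worse than its weakest component.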
5
Fail-Fast is Good, Repair is Needed
Improving either MTTR or MTTF gives benefit
Simple redundancy does not help much.
[Figure: lifecycle of a module: Fault, Detect, Repair, Return; fail-fast gives short fault latency]
High Availability is low UN-Availability
Unavailability ~ MTTR / MTTF
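A small sketch of why improving either MTTR or MTTF helps: under the approximation Unavailability ~ MTTR / MTTF, halving repair time and doubling time-to-failure give the same result. The 1,000-hour and 1-hour figures are illustrative assumptions, not numbers from the talk.

```python
def unavailability_exact(mttf: float, mttr: float) -> float:
    """Exact steady-state unavailability: MTTR / (MTTF + MTTR)."""
    return mttr / (mttf + mttr)


def unavailability_approx(mttf: float, mttr: float) -> float:
    """Approximation from the slide, valid when MTTF >> MTTR."""
    return mttr / mttf


# Hypothetical module: fails every 1,000 hours, takes 1 hour to repair.
base = unavailability_approx(1000.0, 1.0)
faster_repair = unavailability_approx(1000.0, 0.5)  # halve MTTR
longer_life = unavailability_approx(2000.0, 1.0)    # double MTTF

print(base, faster_repair, longer_life)
print(unavailability_exact(1000.0, 1.0))  # close to the approximation
```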
6
Fault Model
• Failures are independent
  So, single fault tolerance is a big win
• Hardware fails fast (dead disk, blue-screen)
• Software fails-fast (or goes to sleep)
• Software often repaired by reboot:
– Heisenbugs
• Operations tasks: major source of outage
– Utility operations
– Software upgrades
7
Disks (RAID): the BIG Success Story
• Duplex or Parity: masks faults
• Disks @ 1M hours (~100 years)
• But
– controllers fail and
– have 1,000s of disks.
• Duplexing or parity, and dual path gives “perfect disks”
• Wal-Mart never lost a byte (thousands of disks, hundreds of failures).
• Only software/operations mistakes are left.
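The disk numbers above can be made concrete with a rough sketch. The 1M-hour disk MTTF is from the slide; the fleet size of 1,000 disks and the 24-hour repair time are assumptions for illustration. The pair formula MTTF^2 / (2 * MTTR) is the usual textbook approximation for a duplexed pair with independent failures and prompt repair, not something stated on the slide.

```python
DISK_MTTF_HOURS = 1_000_000  # from the slide: ~100 years per disk


def fleet_hours_between_failures(n_disks: int, mttf: float = DISK_MTTF_HOURS) -> float:
    """Expected hours between failures somewhere in a fleet of n_disks."""
    return mttf / n_disks


def mirrored_pair_mttf(mttf: float, mttr: float) -> float:
    """Approximate MTTF of a duplexed pair: both copies must fail
    within one repair window. Assumes independent failures."""
    return mttf ** 2 / (2 * mttr)


# Assumption: 1,000 disks, each failed disk replaced within 24 hours.
print(fleet_hours_between_failures(1000))                 # hours between disk failures
print(mirrored_pair_mttf(DISK_MTTF_HOURS, 24.0) / 8760)   # pair MTTF in years
```

With thousands of disks some disk fails every few weeks, which is why hundreds of failures are unremarkable, while each duplexed pair is expected to outlive the installation by a wide margin.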
8
Fault Tolerance vs Disaster Tolerance
• Fault-Tolerance: masks local faults
– RAID disks
– Uninterruptible Power Supplies
– Cluster Failover
• Disaster Tolerance: masks site failures
– Protects against fire, flood, sabotage, …
– Also, software changes, site moves, …
– Redundant system and service at remote site.
9
Availability
[Figure: availability ladder, axis labeled in nines (99, 999), rising from un-managed systems to a well-managed GeoPlex]
• Un-managed
• Well-managed nodes: masks some hardware failures
• Well-managed packs & clones: masks hardware failures and operations tasks (e.g. software upgrades); masks some software failures
• Well-managed GeoPlex: masks site failures (power, network, fire, move, …); masks some operations failures
10
Case Study - Japan
"Survey on Computer Security", Japan Info Dev Corp., March 1986. (trans: Eiichi Watanabe).

Cause                             MTTF
Vendor (hardware and software)    5 Months
Application software              9 Months
Communications lines              1.5 Years
Operations                        2 Years
Environment                       2 Years
Overall                           10 Weeks

1,383 institutions reported (6/84 - 7/85)
7,517 outages, MTTF ~ 10 weeks, avg duration ~ 90 MINUTES
To Get 10 Year MTTF, Must Attack All These Areas
[Pie chart: share of outages by cause. Slice labels: Vendor, Environment, Operations, Application Software, Tele Comm lines. Slice values: 42%, 12%, 25%, 9.3%, 11.2%.]
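The per-cause MTTFs above combine into the overall figure by adding failure rates. The sketch below assumes the causes are independent and that a year has 52 weeks; the dictionary keys repeat the slide's categories, and the computed shares are derived from the MTTFs rather than read from the pie chart.

```python
WEEKS_PER_YEAR = 52.0

# MTTF per cause in years, as listed on the slide.
mttf_years = {
    "Vendor (hardware and software)": 5 / 12,
    "Application software": 9 / 12,
    "Communications lines": 1.5,
    "Operations": 2.0,
    "Environment": 2.0,
}

# Independent causes: failure rates (failures per year) add up.
rates = {cause: 1.0 / mttf for cause, mttf in mttf_years.items()}
total_rate = sum(rates.values())
overall_mttf_weeks = WEEKS_PER_YEAR / total_rate
shares = {cause: rate / total_rate for cause, rate in rates.items()}

print(f"overall MTTF: {overall_mttf_weeks:.1f} weeks")
for cause, share in shares.items():
    print(f"{cause}: {share:.1%} of outages")
```

The result lands near the reported ~10 weeks, and it shows why every category has to improve: removing any single cause still leaves an overall MTTF well short of 10 years.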
11
Case Studies - Tandem Trends
• MTTF improved
• Shift from Hardware & Maintenance (from 50% to 10%) to Software (62%) & Operations (15%)
• NOTE: Systematic under-reporting of
– Environment
– Operations errors
– Application Software
[Figures: "Outages / 1000 System Years by Primary Cause" and "% of Outages by Primary Cause", 1985 to 1989; series: unknown, environment, operations, maintenance, hardware, software]
12
Dependability Status circa 1995
• ~4-year MTTF
• 5 9s for well-managed sys. Fault Tolerance Works.
• Hardware is GREAT (maintenance and MTTF).
• Software masks most hardware faults.
• Many hidden software outages in operations:
– New Software.
– Utilities.
• Need to make all hardware/software changes ONLINE.
• Software seems to define a 30-year MTTF ceiling.
• Reasonable Goal: 100-year MTTF. class 4 today => class 6 tomorrow.
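To make "5 9s" and "class 4 => class 6" concrete, the sketch below converts an availability figure into a count of nines and into downtime per year. It assumes "class" means the number of leading nines, i.e. floor(log10(1 / (1 - A))); the function names are illustrative.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def availability_class(availability: float) -> int:
    """Number of nines in the availability figure.

    The round() guards against floating-point error, e.g. 1 - 0.99999
    is not exactly 1e-5.
    """
    return math.floor(round(-math.log10(1.0 - availability), 9))


def downtime_minutes_per_year(availability: float) -> float:
    """Expected minutes of outage per year at the given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


for a in (0.99, 0.9999, 0.99999, 0.999999):
    print(availability_class(a), f"{downtime_minutes_per_year(a):.2f} min/year")
```

Moving from class 4 to class 6 means going from roughly an hour of outage per year to about half a minute.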
13
Honorable Mention
• The nice folks at Tandem (now HP)
– Made failover fast (30 seconds or less).
– Made change online
  • Add hardware/software
  • Reorganize database.
  • Rolling upgrades.
– Added at least one 9 to their story.
14
And Then?
• Hardware got better (& more complex)
• Software got better (& more complex)
• RAID is standard, Snapshots becoming standard
• Cluster in a box: commodity failover
• Remote replication is standard.
15
Outline
• The glorious past (Availability Progress)
• The dark ages (current scene)
• Some recommendations
16
Progress?
• MTTF improved from 1950-1995
• MTTR incremental improvements 1970 --- failover
• Hardware and Software online change (PnP) is now standard
• Then the Internet arrived:
– No project can take more than 3 months.
– Time to market is everything
– Change is good.
[Figure legend: Computer Systems, Telephone Systems, Cellphones, Internet]
17
The Internet Changed Expectations
1990
Phones delivered 99.999%
ATMs delivered 99.99%
Failures were front-page news.
Few hackers
Outages last an “hour”
2005
Cell phones deliver 90%
Web sites deliver 99%
Failures are business-page news
Many hackers.
Outages last a “day”
This is progress?
18
Eric Brewer said it best:
ACID vs BASE: the internet litmus test
“copy” of slide 8 of http://www.ccs.neu.edu/groups/IEEE/ind-acad/brewer/sld008.htm

ACID
• Atomicity, Consistency, Isolation, Durability
• Availability?
• Strong consistency
• Isolation
• Focus on commit
• Conservative (pessimistic)
• Difficult evolution (e.g. schema)
• Nested transactions

BASE
• Basic Availability, Soft State, Eventual Consistency
• Availability FIRST
• Weak consistency: stale data is OK
• Approximate answers OK
• Best effort
• Aggressive (optimistic)
• Easier evolution
• Simpler!
• Faster
I think it is a spectrum
19
Why (1) Complexity
• Internet sites are MUCH more complex.
– NAP
– Firewall/proxy/IPsprayer
– Web
– DMZ
– App server
– DB server
– Links to other sites
– tcp/http/html/dhtml/dom/xml/com/corba/cgi/sql/fs/os…
• Skill level is much reduced
20
A Data Center (500 servers)
[Figure: Canyon Park Data Center, Microsoft.com Network Diagram. Drawn by: Matt Groshong. Last Updated: April 12, 2000. IP addresses removed by Jim Gray to protect security.]
The diagram shows Cisco 7000 routers, Catalyst 5000 and Catalyst 2926 switches (HSRP), and ASX-1000 switches in front of IIS server farms for WWW, SEARCH, REGISTER, SUPPORT, WINDOWS, WINDOWSMEDIA, MSDN, DOWNLOAD, FTP, and other MICROSOFT.COM sites, together with Live, Backup, Consolidator, and Misc. SQL Servers, Stagers, Build Servers, PPTP / Terminal Servers, and Monitoring Servers.

Microsoft.com Server Count
Server type           Count
FTP                       6
Build Servers            32
IIS                     210
Application               2
Exchange                 24
Network/Monitoring       12
SQL                     120
Search                    2
NetShow                   3
NNTP                     16
SMTP                      6
Stagers                  26
Total                   459
21
A Schematic of HotMail
• ~7,000 servers
• 100 backend stores with 300TB (cooked)
• many data centers
• Links to
– Internet Mail gateways
– Ad-rotator
– Passport
– …
• ~5 B messages per day
• 350M mailboxes, 250M active
• ~1M new per day.
• New software every 3 months (small changes weekly).
[Figure: schematic showing the Internet, Local Directors, server groups (Front Doors, Incoming Mail Servers, AD Servers, Graphics Servers, Login Servers), Member Directory, gateways, Telnet management, and Switched Ethernet to the data USTORES]
22
Why (2) Velocity
• No project can take more than 13 weeks.
• Time to market is everything
• Functionality is everything
• Faster, cheaper, …
[Figure: Schedule, Quality, and Functionality, with a “trend” arrow]
23
Why (3) Hackers
• Hackers are a new, increased threat
• Any site can be attacked from anywhere
• Motives include ego, malice, and greed.
• Complexity makes it hard to protect sites.
• Whole internet attacks: Slammer
• Concentration of wealth makes attractive target:
  Reporter: “Why did you rob banks?”
  Willie Sutton: “Cause that’s where the money is!”
Note: Eric Raymond’s How to Become a Hacker http://www.tuxedo.org/~esr/faqs/hacker-howto.html is the positive use of “Hacker”; here I mean malicious and anti-social hackers. Black-hats, not white-hats.
24
How Bad Is It?
http://www-iepm.slac.stanford.edu/
Connectivity is poor.
http://www.internettrafficreport.com/main.htm
25
How Bad Is It?
• Median monthly % ping packet loss for 2/99
http://www-iepm.slac.stanford.edu/pinger/
26
And in 2006, about the same
27
Or In the US
28
Keynote measures Response Time
and Up Time
Measures response time around the world
Business service is better than popular service
Has many proprietary services for SLAs.
                  Week of April 22 - April 28, 2001    Previous Week
Index             15.90                                15.78

Web Sites with Best Performance Averages
Week of April 22 - April 28, 2001    Previous Week
Ameritrade (65)   3.29               Ameritrade (64)   3.35
Lycos (81)        5.41               Lycos (80)        5.58
Yahoo! (81)       5.79               Yahoo! (80)       5.74
Altavista (19)    6.03               Ask Jeeves (7)    6.11
Go.com            7.02               Altavista (18)    6.17

Worst Average
(anonymous)       38.04              (anonymous)       37.44
29
2006: typical 97.48% Availability
30
Netcraft’s Crisis-of-the-Day
31
32
Service Level Measurements
• Many organizations are measured on SLAs
• Example: 1 sec response 99% of prime time
• Keynote, Netcraft, …
– offer to monitor your site (probe every few min)
• This probing can go deep into the tree to detect services.
– Send alerts via email
– Give monthly reports.
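An SLA such as "1 sec response 99% of prime time" can be checked from probe results roughly as follows. This is a sketch only: the function name and the convention that a failed probe is recorded as None are assumptions, and real monitoring services use their own (proprietary) definitions.

```python
from typing import Optional, Sequence


def meets_sla(
    response_times_sec: Sequence[Optional[float]],
    limit_sec: float = 1.0,
    required_fraction: float = 0.99,
) -> bool:
    """True if enough probes were answered within limit_sec.

    A probe that got no answer is recorded as None and counts as a miss.
    """
    if not response_times_sec:
        raise ValueError("no probes recorded")
    ok = sum(1 for t in response_times_sec if t is not None and t <= limit_sec)
    return ok / len(response_times_sec) >= required_fraction


# Hypothetical prime-time samples: 100 probes each.
good_day = [0.3] * 99 + [2.5]        # one slow response
bad_day = [0.3] * 98 + [2.5, None]   # one slow response, one outage
print(meets_sla(good_day), meets_sla(bad_day))
```

Counting failed probes as misses matters: an SLA that only averages the responses it did receive would hide outages entirely.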
33
In addition
• Most large sites build their own instrumentation (several times)
• This instrumentation is elaborate and essential for the Network Operations Center (NOC).
• There are attempts now to systematize it: Tivoli, OpenView, NetIQ, WhatsUP, MOM, …
34
Microsoft.Com
• Operations mis-configured a router
• Took a day to diagnose and repair.
• DOS attacks cost a fraction of a day.
• Regular security patches.
35
Back-End Servers are More Stable
• Generally deliver 99.99%
• TerraServer for example: single back-end failed after 2.5 y.
• Went to 4-node cluster
• Fails every 2 mo. Transparent failover in 30 sec. Online software upgrades. So… 99.999% in backend…

Year 1                    Time          %
Total Up Time             8754:07:22    99.93%
Total Down Time           5:52:38       0.07%
Total Time                8760:00:00    100.00%
Scheduled Down            2:50:45
Scheduled Availability    8757:09:15    99.97%
Un-Scheduled Down         3:01:53

Through 18 Months         Time          %
Up Time                   12888:21:49   99.519%
Scheduled Down            4:00:25       0.031%
Unscheduled Down          58:20:46      0.451%
Total Time                12950:43:00   99.52%
Total Down                62:21:11      0.48%
Down 30 hours in July (hardware stop, auto restart failed, operations failure)
Down 26 hours in September (Backplane failure, I/O Bus failure)
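The "So… 99.999% in backend" claim for the cluster follows from the failover numbers in the bullets: a failure every 2 months with a 30-second failover. A minimal check, assuming each failure costs exactly one failover period and nothing else (the function name is illustrative):

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def availability_with_failover(failures_per_year: float, outage_seconds: float) -> float:
    """Availability when each failure costs one short failover outage."""
    downtime = failures_per_year * outage_seconds
    return 1.0 - downtime / SECONDS_PER_YEAR


# From the slide: fails every 2 months (6 per year), 30 sec failover.
backend = availability_with_failover(failures_per_year=6, outage_seconds=30)
print(f"{backend:.5%}")
```

Three minutes of failover per year is comfortably inside five nines; the multi-hour outages listed above are what pull the measured figures down.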
36
eBay: A very honest site
• Publishes operations log.
• Has 99% of scheduled uptime
• Schedules about 2 hours/week down.
• Has had some operations outages
• Has had some DOS problems.
http://www2.ebay.com/aw/announce.shtml
37
And 2006….
Welcome to eBay's System Board. Visit this board for information on scheduled site maintenance or system issues that are affecting Marketplace trading. For general eBay news, please see our General Announcements Board.

***Resolved - PayPal site slowness***
February 08, 2006 | 05:20PM PST/PT
For several hours today, members may have experienced slowness while trying to access the PayPal website. This issue has now been resolved. Thank you for your patience.

***PayPal site slowness***
February 08, 2006 | 02:38PM PST/PT
Members may be experiencing intermittent slowness while trying to access the PayPal website. We're aware of this issue and are working to fix it as quickly as possible. Thank you for your patience.

***Scheduled Maintenance For This Week***
February 08, 2006 | 02:03PM PST/PT
The eBay system will be undergoing general maintenance from approximately 23:00 PT on Thursday, February 9th to 01:00 PT on Friday, February 10th. During this maintenance period, certain eBay site features may be intermittently unavailable or slow.
http://www2.ebay.com/aw/announce.shtml
38
Some Cool New Things
• There are 100,000 node services.
• Google File System shows importance & benefit of Triplex
• DB replication & mirroring works (is easy)
• Little things I have done
– With Leslie Lamport: unified Paxos & 2PC
– Measured mean-time-to-data-loss (and continue to measure things).
39
Outline
• The glorious past (Availability Progress)
• The dark ages (current scene)
• Some recommendations
40
Not to throw stones but…
• Everyone has a serious problem.
• The BEST people publish their stats.
• The others HIDE their stats (check Netcraft to see who I mean).
• We have good NODE-level availability
  5-9s is reasonable.
• We have TERRIBLE system-level availability
  2-9s “scheduled” is the goal (!).
41
Gresham’s Law: “bad money drives out good”
• People WANT features!
• People WANT convenience!
• People WANT cheap!
• In exchange, they seem to be willing to tolerate some
– Un-availability (= inconvenience)
– “Dirty data” that needs reconciliation
– Insecurity
• I see it as our task to make it easier & cheaper to get high availability and Security.
[Figure: Schedule, Quality, and Functionality, with a “trend” arrow]
42
Recommendation #1
• Continue progress on back-ends.
– Make management easier (AUTOMATE IT!!!)
– Measure
– Compare best practices
– Continue to look for better algorithms.
• Live in fear
– We are at 10,000 node servers
– We are headed for 1,000,000 node servers
43
Recommendation #2
• Current security approach is unworkable:
– Anonymous clients
– Firewall is clueless
– Incredible complexity
• We can't win this game!
• So change the rules (redefine the problem):
– No anonymity
– Unified authentication/authorization model
– Single-function devices (with simple interfaces)
– Only one kind of interface (uddi/wsdl/soap/…).
44
Recommendation #3
• Dependability requires holistic not reductionist approach.
• It’s the WHOLE system (end-to-end, top-to-bottom)
• Hard to publish in this area, hard to get tenure.
– Journals want theorem+proof and crisp statements.
• Companies want to make money, so do not share their knowledge.
• Dependability is an important social good,
• So, Dependability Research needs government or philanthropic sponsorship
45
References
Adams, E. (1984). “Optimizing Preventive Service of Software Products.” IBM Journal of Research and Development. 28(1): 2-14.
Anderson, T. and B. Randell. (1979). Computing Systems Reliability.
Garcia-Molina, H. and C. A. Polyzois. (1990). Issues in Disaster Recovery. 35th IEEE Compcon 90. 573-577.
Gray, J. (1986). Why Do Computers Stop and What Can We Do About It. 5th Symposium on Reliability in Distributed Software and Database Systems. 3-12.
Gray, J. (1990). “A Census of Tandem System Availability between 1985 and 1990.” IEEE Transactions on Reliability. 39(4): 409-418.
Gray, J. N., Reuter, A. (1993). Transaction Processing Concepts and Techniques. San Mateo, Morgan Kaufmann.
Lampson, B. W. (1981). Atomic Transactions. Distributed Systems -- Architecture and Implementation: An Advanced Course. ACM, Springer-Verlag.
Laprie, J. C. (1985). Dependable Computing and Fault Tolerance: Concepts and Terminology. 15th FTCS. 2-11.
Long, D. D., J. L. Carroll, and C. J. Park (1991). A study of the reliability of Internet sites. Proc. 10th Symposium on Reliable Distributed Systems, pp. 177-186, Pisa, September 1991.
Siewiorek, D. and R. Swarz. Theory and Practice of Reliable System Design.
Birman, K. P. Building Secure and Reliable Network Applications.
Long, D., A. Muir, and R. Golding (1995). “A Longitudinal Study of Internet Host Reliability.” Proc. of the Symposium on Reliable Distributed Systems, Bad Neuenahr, Germany: IEEE, pp. 2-9.
http://www.netcraft.com/ They have even better for-fee data as well, but for-free is really excellent.
http://www2.ebay.com/aw/announce.shtml#top eBay is an Excellent benchmark of best Internet practices
Empirical Measurements of Disk Failure Rates and Error Rates, + C. van Ingen, moving 2P with cheap iron
“Consensus on Transaction Commit”, +, L. Lamport, unifies 2PC and Byzantine-Paxos