Coterie availability in sites. Flavio Junqueira and Keith Marzullo, University of California, San Diego. DISC, Krakow, Poland, September 2005.
Coterie availability in sites
Flavio Junqueira and Keith Marzullo
University of California, San Diego
DISC, Krakow, Poland, September 2005
Multi-site systems
  Emerging class of distributed systems
    Collection of sites across a WAN
    Multiple nodes in each site
    Share resources
      Data sets
      Computational power
    E.g. BIRN, Geon, TeraGrid, PlanetLab
  Site failure
    All the nodes in a site simultaneously unavailable
Site availability: BIRN
  10 sites experience at least one outage
  One site under 97% availability
Improving availability
  Better availability through replication
  Coteries
    Set system of processes: a set of subsets of processes
    Each subset is called a quorum
    Minimal sets, pairwise intersect
  Coteries are useful
    Distributed mutual exclusion
    Distributed registers
    Consensus through Paxos
  Coterie availability in multi-site systems
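The two coterie properties named on this slide (minimal sets, pairwise intersection) can be checked mechanically. The following is a minimal sketch, not the authors' code; the function name `is_coterie` and the example process names are assumptions made for illustration.

```python
from itertools import combinations


def is_coterie(quorums):
    """Return True if the given family of sets is a coterie.

    Checks the two properties from the slide:
      - pairwise intersection: every two quorums share a process
      - minimality: no quorum contains another quorum
    """
    # Deduplicate so that a repeated quorum is not reported as non-minimal.
    qs = list({frozenset(q) for q in quorums})
    if not qs or any(len(q) == 0 for q in qs):
        return False
    for a, b in combinations(qs, 2):
        if not (a & b):          # disjoint quorums violate intersection
            return False
        if a <= b or b <= a:     # one quorum inside another violates minimality
            return False
    return True


# Majority quorums over three hypothetical processes a, b, c.
majority = [set(c) for c in combinations("abc", 2)]
print(is_coterie(majority))            # True
print(is_coterie([{"a"}, {"b"}]))      # False: the two sets are disjoint
print(is_coterie([{"a"}, {"a", "b"}])) # False: {"a", "b"} is not minimal
```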
Roadmap
  System model
  Availability metrics
    Previous deterministic metrics not necessarily good
    A new metric
  Failure model
    Characterize failures using survivor sets
    Survivor sets: more expressive
  Quorum construction
    Multi-site hierarchical construction
  Practical issues
    Failure model in practice
    PlanetLab experiment
  Conclusions
System model
  Set P of processes
    Pairwise connected by quasi-reliable asynchronous channels
    Process failure: crash
    Processes can recover
  Set B of sites
    Partition of the set of processes
    Site failure: simultaneous failure of all the processes in the site
    Process failures are not independent
  Execution
    Sequence of steps of processes
    ℰ: set of all executions
  In a step s
    F(s) = {p : (p ∈ P) ∧ (p is faulty in s)}
    NF(s) = P \ F(s)
    Available process in s: p ∈ P is available if p ∉ F(s)
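Survivor sets
  A set S ⊆ P is a survivor set iff (with ℰ the set of all executions)
    ∃E ∈ ℰ : ∃s ∈ E : S = NF(s)
    ∀p ∈ S : ∀E ∈ ℰ : ∀s ∈ E : S \ {p} ≠ NF(s)
  Example
    [Figure: processes grouped into sites; ℰ = {E1, E2, E3, E4}, where E1 and E2 have steps s1 and s2, and E3 and E4 each have step s1; the figure marks NF(si) for each step and the resulting survivor sets]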
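The survivor-set definition can be evaluated directly on a small, finite set of executions. The sketch below applies the two conditions literally: S must equal NF(s) for some step, and removing any single process from S must not yield the NF set of any step. Representing each step by its set of faulty processes F(s) is an assumption made here, and the example executions are hypothetical, since the figure's data did not survive extraction.

```python
def survivor_sets(processes, executions):
    """Compute survivor sets from a finite collection of executions.

    processes:  iterable of process names (the set P)
    executions: iterable of executions; each execution is a sequence of
                steps, and each step is given as its faulty set F(s)
    """
    P = frozenset(processes)
    # NF(s) = P \ F(s) for every step s of every execution.
    nf_sets = {P - frozenset(faulty)
               for execution in executions
               for faulty in execution}
    # Keep S if no single-process removal gives the NF set of some step.
    return {S for S in nf_sets
            if all(S - {p} not in nf_sets for p in S)}


# Hypothetical example: three processes, two executions.
P = {"a", "b", "c"}
E1 = [set(), {"a"}]   # step s1: nobody faulty; step s2: a is faulty
E2 = [{"b"}]          # step s1: b is faulty
result = survivor_sets(P, [E1, E2])
print(sorted(sorted(s) for s in result))  # [['a', 'c'], ['b', 'c']]
```

Here {a, b, c} is not a survivor set because removing a gives {b, c}, which is the NF set of another step.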
Availability metrics
  Traditional deterministic metrics
    Undirected graph: nodes = processes, edges = communication links
    Node vulnerability: minimal number of nodes whose failure prevents any quorum from forming
    Edge vulnerability: minimal number of edges whose failure prevents any quorum from forming
  Majority is optimal [Barbara and Garcia-Molina’86]
    Complete graphs
A counterexample
  [Figure: processes grouped into sites, with the survivor sets marked]
  Majority
    Quorum: 5 processes
    In some step, no quorum can be formed
  Using Sp as quorums
    In every step, at least one quorum can be formed
  Majority is not optimal
Availability metrics (continued)
  A new metric
    A(𝒬), where 𝒬 is a coterie
    Number of covered survivor sets in 𝒬
    A survivor set S is covered in 𝒬 if: ∃Q ∈ 𝒬 : Q ⊆ S
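The metric counts the survivor sets that contain at least one quorum. A minimal sketch follows; the function name `availability_metric` and the four-process example are assumptions for illustration and are not data from the talk.

```python
from itertools import combinations


def availability_metric(coterie, survivor_sets):
    """A(Q): number of survivor sets S with some quorum Q' in Q, Q' a subset of S."""
    quorums = [frozenset(q) for q in coterie]
    return sum(1 for s in survivor_sets
               if any(q <= frozenset(s) for q in quorums))


# Hypothetical system with processes a, b, c, d and three survivor sets.
sp = [{"a", "b"}, {"b", "c"}, {"a", "c", "d"}]

# Majority quorums: any three of the four processes.
majority = [set(c) for c in combinations("abcd", 3)]

print(availability_metric(majority, sp))  # 1: only {a, c, d} holds a majority
print(availability_metric(sp, sp))        # 3: each survivor set covers itself
```

In this example the survivor sets pairwise intersect and none contains another, so they can serve as quorums, and they cover more survivor sets than majority quorums do.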
Failure model
  Multi-site hierarchical model
    A set Fs of subsets of B
      Subsets of simultaneously faulty sites
    An array Fp
      One entry per site
      Each entry: subsets of processes in the site
        Subsets of simultaneously faulty processes at a site
    A survivor set S: ∃FS ∈ Fs such that
      ∀Bi ∉ FS : ∃FP ∈ Fp[i] : Bi \ FP ⊆ S
      ∀Bi ∈ FS : Bi ∩ S = ∅
  [Figure: example with three sites B1, B2, B3 (the set B), each with processes 1, 2, 3 (the set P)]
    Fs = {{B1}, {B2}, {B3}}
    Fp[b] = {{i} : i ∈ {1,2,3}} for b ∈ {1,2,3}, where i ranges over the processes of site Bb
    Sp = {{i, j, k, l} : i, j, k, l ∈ {1,2,3} ∧ i ≠ j ∧ k ≠ l}, one such family for each of the three pairs of sites, with i, j taken from one site and k, l from the other
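For small systems, the survivor sets of the multi-site hierarchical model can be enumerated from Fs and Fp. The sketch below is an illustration under stated assumptions: process names such as `B1.1` are hypothetical, each surviving site contributes exactly Bi \ FP for one FP in Fp[i], and only inclusion-minimal candidates are kept.

```python
from itertools import product


def msh_survivor_sets(sites, Fs, Fp):
    """Enumerate survivor sets for the multi-site hierarchical model.

    sites: dict mapping site name -> set of processes in that site
    Fs:    iterable of sets of site names that can fail together
    Fp:    dict mapping site name -> iterable of sets of processes
           that can fail together within that site
    """
    candidates = set()
    for faulty_sites in Fs:
        alive_sites = [b for b in sites if b not in faulty_sites]
        # For each surviving site, the possible sets of surviving processes.
        per_site = [[frozenset(sites[b]) - frozenset(fp) for fp in Fp[b]]
                    for b in alive_sites]
        for choice in product(*per_site):
            candidates.add(frozenset().union(*choice))
    # Keep only inclusion-minimal candidates.
    return {s for s in candidates if not any(t < s for t in candidates)}


# Three sites with three processes each, as in the slide's example.
sites = {f"B{b}": {f"B{b}.{i}" for i in (1, 2, 3)} for b in (1, 2, 3)}
Fs = [{"B1"}, {"B2"}, {"B3"}]
Fp = {b: [{p} for p in procs] for b, procs in sites.items()}

sp = msh_survivor_sets(sites, Fs, Fp)
print(len(sp))                  # 27
print({len(s) for s in sp})     # {4}
```

Every survivor set here has four processes, which is fewer than the five needed for a majority of nine.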
Quorum construction
  Optimal availability with respect to A
    Coterie 𝒬: Sp = 𝒬 or 𝒬 dominates Sp
    Survivor sets in Sp pairwise intersect
    If not, then optimally discarding survivor sets is NP-Complete
  A special case: Qsite
    All subsets of B of size fs are in Fs
    All subsets of size t of Bi are in Fp[i], for every i
    E.g.: fs = 1, t = 1
  [Figure: quorums across Site 1, Site 2, Site 3 for fs = 1, t = 1]
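For the threshold special case, quorums can be generated by choosing which sites survive and, within each surviving site, which processes survive. The sketch below assumes that each quorum takes |B| - fs sites and |Bi| - t processes from each of them; the function names and process names are hypothetical. It also checks the pairwise-intersection requirement, which depends on the chosen thresholds.

```python
from itertools import combinations, product


def qsite_quorums(sites, fs, t):
    """Quorums for thresholds fs (site failures) and t (process failures per site)."""
    names = sorted(sites)
    quorums = set()
    for alive in combinations(names, len(names) - fs):
        # All ways to keep |Bi| - t processes in each surviving site.
        per_site = [[frozenset(c)
                     for c in combinations(sorted(sites[b]), len(sites[b]) - t)]
                    for b in alive]
        for choice in product(*per_site):
            quorums.add(frozenset().union(*choice))
    return quorums


def pairwise_intersect(sets):
    """True if every two sets in the family share at least one element."""
    return all(a & b for a, b in combinations(list(sets), 2))


sites = {f"B{b}": {f"B{b}.{i}" for i in (1, 2, 3)} for b in (1, 2, 3)}

q11 = qsite_quorums(sites, fs=1, t=1)
print(len(q11), pairwise_intersect(q11))   # 27 True

q12 = qsite_quorums(sites, fs=1, t=2)
print(pairwise_intersect(q12))             # False: one process per site is too few
```

With fs = 1 and t = 1, any two quorums share a site, and within that site each keeps two of three processes, so they intersect.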
Model in practice
  Qsite
    fs: threshold on site failures
      Data on site availability
    t: threshold on process failures
      Markov chains
      One Markov chain for each site
  Transitions
    Failure transitions: same probability, homogeneous processes
    Repair transitions: variable probability, amount of resources used
  [Figure: Markov chain with failure transitions and repair transitions]
PlanetLab experiment
  Toy application
    Paxos: quorums of acceptors
    Client accessing quorums
  Hosts used
    Three sites: three hosts from each site
    One UCSD host: proposer, learner
  Three settings
    3Sites: one acceptor per site
      Quorum: two hosts
    3SitesMaj: all hosts
      Quorum: four hosts, majority from each of two sites
    SimpleMaj: all hosts
      Quorum: any five processes
  [Figure: host locations at UC Davis, UT Austin, Duke, and UC San Diego]
  SimpleMaj has worse availability
  3SitesMaj has better availability
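The three quorum rules can be written as predicates over the set of hosts that are currently up. This is a sketch of the rules as stated on the slide, not the experiment's code; the host names and the choice of which host acts as the single acceptor per site are hypothetical.

```python
# Hypothetical host names; the slides name only the sites.
SITES = {
    "UC Davis": {"davis-1", "davis-2", "davis-3"},
    "UT Austin": {"austin-1", "austin-2", "austin-3"},
    "Duke": {"duke-1", "duke-2", "duke-3"},
}
# 3Sites uses one acceptor per site (which host is chosen is an assumption).
ACCEPTORS_3SITES = {"davis-1", "austin-1", "duke-1"}


def three_sites_available(alive):
    """3Sites: a quorum is any two of the three per-site acceptors."""
    return len(ACCEPTORS_3SITES & alive) >= 2


def three_sites_maj_available(alive):
    """3SitesMaj: a quorum is a majority of hosts in each of two sites."""
    sites_with_majority = sum(
        1 for hosts in SITES.values()
        if len(hosts & alive) >= len(hosts) // 2 + 1
    )
    return sites_with_majority >= 2


def simple_maj_available(alive):
    """SimpleMaj: a quorum is any five of the nine hosts."""
    all_hosts = set().union(*SITES.values())
    return len(all_hosts & alive) >= 5


# One whole site is down and one host is down in each remaining site.
alive = {"davis-1", "davis-2", "austin-1", "austin-2"}
print(three_sites_available(alive))      # True
print(three_sites_maj_available(alive))  # True
print(simple_maj_available(alive))       # False
```

The scenario shows a failure pattern that the site-aware settings tolerate and the simple majority does not, which is consistent with the availability ordering reported on the slide.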
The Bimodal model
  Sites are survivor sets
    Sp is not a coterie
  “Throw out” survivor sets
    In general, optimal solution is NP-Complete
    Simple solution for this model
  Practical issues
    Practical for two sites
    More than two sites: open problem
  [Figure: grid of states labeled 00, 01, ..., 0t, ..., 0n through n0, n1, ..., nt, ..., nn]
Conclusions
  Coteries for multi-site systems
    Site failures: process failures not independent
  A new metric
    Counts covered survivor sets
  Multi-site hierarchical construction
    Practical
    Illustrated with Markov model
    Experiment shows better availability
  Using majority quorums is not a good idea
    Not optimal
    Poor performance
  Future work
    More experiments, more constructions, real deployment
END
Backup Slides
Failure models
  The multi-site hierarchical model
    A set Fs of subsets of B
    An array Fp
      One entry per site
      Each entry: subsets of processes in the site
    A survivor set S: ∃FS ∈ Fs such that
      ∀Bi ∉ FS : ∃FP ∈ Fp[i] : Bi \ FP ⊆ S
      ∀Bi ∈ FS : Bi ∩ S = ∅
  The bimodal model
    A set Fs of subsets of B
      There is one site that is in no element of Fs
    An array Fp
    A survivor set S
      As in the previous model, OR
      ∃Bi ∈ B : S = Bi
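  [Figure: example with two sites B1 and B2, each with processes 1, 2, 3; the value of Fs is not legible]
    Fp[1] = {{i} : i ∈ {1,2,3}}, over the processes of B1
    Fp[2] = {{i} : i ∈ {1,2,3}}, over the processes of B2
    MSH: Sp = {{i, j, k, l} : i, j, k, l ∈ {1,2,3} ∧ i ≠ j ∧ k ≠ l}
    Bimodal: Sp = {{i, j, k, l} : i, j, k, l ∈ {1,2,3} ∧ i ≠ j ∧ k ≠ l} ∪ B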
Bimodal construction
  Bimodal model
    By construction: not all pairs of survivor sets intersect
  Discard survivor sets until the remaining ones intersect
    Selecting optimally is NP-Complete
  Solution: remove |B|-1 survivor sets
    Survivor sets containing processes from multiple sites pairwise intersect
    Construction is also optimal with respect to metric A
  A special case: Bsite
    All elements of Fs have size fs
    All elements of Fp[i] have the same size t, for every i
    E.g.: fs = 1, t = 1
  [Figure: quorums across B1 and B2 for fs = 1, t = 1]
Site availability
  Goals
    Show that sites are unavailable frequently enough
  BIRN - Biomedical Informatics Research Network
    Test bed projects centered around brain imaging
    Currently: 19 universities, 26 research groups
  Availability
    Monthly basis
    Pings (BIRN-CC)
    Storage broker logs
  Site availability
    Jan/04-Aug/04
    Availability under 100%
      On average in 5 out of the 8 months
  Availability = (Total hours - Unplanned outages) / Total hours × 100
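The availability formula is straightforward to compute per month. A minimal sketch follows; the function name and the sample numbers are illustrative assumptions and are not BIRN measurements.

```python
def availability_percent(total_hours, unplanned_outage_hours):
    """Availability = (total hours - unplanned outages) / total hours * 100."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    if not 0 <= unplanned_outage_hours <= total_hours:
        raise ValueError("outage hours must lie between 0 and total_hours")
    return (total_hours - unplanned_outage_hours) / total_hours * 100


# Hypothetical 30-day month (720 hours) with 21.6 hours of unplanned outage.
print(round(availability_percent(720, 21.6), 2))  # 97.0
```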
Causes of site failures
  Misconfigured software
  Shared resources
    1. Storage
    2. Power circuits
    3. Cooling pipes
    4. Air conditioning
    5. Network