Symmetric Allocations for Distributed Storage
Derek Leong (1), Alexandros G. Dimakis (2), Tracey Ho (1)
(1) California Institute of Technology, USA; (2) University of Southern California, USA
GLOBECOM 2010, 2010-12-09
A Motivating Example
[Figure: five storage nodes, numbered 1 to 5]
Suppose you have a distributed storage system comprising 5 storage devices (“nodes”)…
A Motivating Example
[Figure: nodes 1 to 5, with nodes 2 and 4 failed]
Each node independently fails with probability 1/3 and survives with probability 2/3, so this particular outcome (two failures, three survivals) occurs with probability $(1/3)^2 (2/3)^3 = 8/243 \approx 0.0329218$ …
A Motivating Example
[Figure: nodes 1 to 5, all failed]
Each node independently fails with probability 1/3 and survives with probability 2/3, so all five nodes fail together with probability $(1/3)^5 = 1/243 \approx 0.00411523$ …
A Motivating Example
You are given a single data object of unit size,
and a total storage budget of 7/3 …
A Motivating Example
You can use any coding scheme to store any amount of coded data in each node, as long as the total amount
of storage used is at most the given budget 7/3 …
A Motivating Example
[Figure: nodes 1 to 5 holding different amounts of coded data, drawn as bit strings of different lengths]
A Motivating Example
[Figure: the same allocation under a failure outcome of probability $(1/3)^2 (2/3)^3 \approx 0.0329218$; a question mark asks whether the data object can still be recovered]
A Motivating Example
For maximum reliability, we need to find
(1) an optimal allocation of the given budget over the nodes, and
(2) an optimal coding scheme
that jointly maximize the probability of successful recovery
A Motivating Example
Using an appropriate code, successful recovery occurs whenever the data collector accesses at least a unit amount of data (= size of the original data object)
[Figure: source S, nodes 1 to 5, and terminals t1 and t2]
A Motivating Example
Recovery probability of three allocations of the budget 7/3, for p = 2/3:

Allocation   Node 1   Node 2   Node 3   Node 4   Node 5   Recovery probability
A            7/15     7/15     7/15     7/15     7/15     0.79012
B            7/6      7/6      0        0        0        0.88889
C            2/3      2/3      1/3      1/3      1/3      0.90535
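The three recovery probabilities can be checked by enumerating all 2^5 access patterns. The sketch below assumes that each node is accessed independently with probability p = 2/3 and that recovery succeeds when the accessed nodes hold at least one unit of data in total. The function name `recovery_probability` is illustrative and does not come from the slides.

```python
from fractions import Fraction
from itertools import product

def recovery_probability(allocation, p):
    """Exact probability that the accessed nodes jointly hold at least 1 unit."""
    total = Fraction(0)
    # Enumerate every access pattern: accessed[i] is True if node i is accessed.
    for accessed in product([False, True], repeat=len(allocation)):
        stored = sum(x for x, ok in zip(allocation, accessed) if ok)
        if stored >= 1:
            pattern_prob = Fraction(1)
            for ok in accessed:
                pattern_prob *= p if ok else 1 - p
            total += pattern_prob
    return total

p = Fraction(2, 3)
allocations = {
    "A": [Fraction(7, 15)] * 5,
    "B": [Fraction(7, 6)] * 2 + [Fraction(0)] * 3,
    "C": [Fraction(2, 3)] * 2 + [Fraction(1, 3)] * 3,
}
results = {name: recovery_probability(x, p) for name, x in allocations.items()}
for name, prob in results.items():
    print(name, prob, round(float(prob), 5))
```

Exact fractions avoid rounding issues at the threshold, for example when 2/3 + 1/3 must count as exactly one unit. Under these assumptions the non-symmetric allocation C comes out ahead of both A and B.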
Problem Formulation
Given n nodes, access probability p, and total storage budget T, find an optimal allocation $(x_1, \dots, x_n)$ that maximizes the probability of successful recovery:
maximize $\sum_{R \subseteq \{1, \dots, n\}} p^{|R|} (1-p)^{n-|R|} \, \mathbf{1}\left[\sum_{i \in R} x_i \ge 1\right]$ (recovery probability)
subject to $\sum_{i=1}^{n} x_i \le T$ and $x_i \ge 0$ for all i (budget constraint)
Trivial cases of minimum and maximum budgets: when T = 1, the allocation (1, 0, …, 0) is optimal; when T = n, the allocation (1, 1, …, 1) is optimal
The optimal allocation also tells us whether coding is beneficial for reliable storage
The recovery probability is #P-hard to compute for a given allocation and choice of p
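Because exact evaluation is hard in general, one practical workaround is to estimate the recovery probability by sampling. The sketch below is a plain Monte Carlo estimator, not a method from the slides. It assumes independent node access with probability p, and the function name, trial count and seed are illustrative choices.

```python
import random
from fractions import Fraction

def estimate_recovery_probability(allocation, p, trials=100_000, seed=0):
    """Monte Carlo estimate of P[accessed nodes hold at least 1 unit in total]."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    successes = 0
    for _ in range(trials):
        # Each node is accessed independently with probability p.
        accessed_amount = sum(x for x in allocation if rng.random() < p)
        if accessed_amount >= 1:
            successes += 1
    return successes / trials

# Allocation B of the motivating example: two nodes store 7/6 each.
allocation_b = [Fraction(7, 6), Fraction(7, 6), 0, 0, 0]
estimate = estimate_recovery_probability(allocation_b, p=2 / 3)
print(estimate)  # should be close to the exact value 8/9
```

The estimate carries sampling error of order 1/sqrt(trials), so it suits quick comparisons between allocations rather than proofs of optimality.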
Related Work
Discussion between R. Karp, R. Kleinberg, C. Papadimitriou, E. Friedman, and others at UC Berkeley, 2005
S. Jain, M. Demmer, R. Patra, K. Fall, “Using redundancy to cope with failures in a delay tolerant network,” SIGCOMM 2005
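Symmetric Allocations
We are particularly interested in symmetric allocations because they are easy to describe and implement
Successful recovery for the symmetric allocation that stores $T/m$ in each of $m$ nonempty nodes occurs if and only if at least $\lceil m/T \rceil$ out of the $m$ nonempty nodes are accessed
Therefore, the recovery probability of this allocation is $\Pr[\mathrm{Bin}(m, p) \ge \lceil m/T \rceil]$, where $\mathrm{Bin}(m, p)$ is a binomial random variable with $m$ trials and success probability $p$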
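For a symmetric allocation the recovery probability reduces to a binomial tail. The sketch below assumes that m nonempty nodes each store T/m, so that at least ceil(m/T) of them must be accessed. The function name is illustrative, and T and p are passed as exact fractions.

```python
from fractions import Fraction
from math import ceil, comb

def symmetric_recovery_probability(m, T, p):
    """P[Bin(m, p) >= ceil(m / T)] for m nonempty nodes storing T/m each."""
    T = Fraction(T)
    k = ceil(Fraction(m) / T)  # minimum number of nonempty nodes to access
    return sum(comb(m, r) * p**r * (1 - p)**(m - r) for r in range(k, m + 1))

T, p = Fraction(7, 3), Fraction(2, 3)
for m in range(1, 6):
    prob = symmetric_recovery_probability(m, T, p)
    print(m, prob, round(float(prob), 5))
```

With n = 5, T = 7/3 and p = 2/3, the cases m = 5 and m = 2 correspond to allocations A and B of the motivating example.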
Asymptotic Optimality of Max Spreading
The symmetric allocation that spreads the budget maximally over all n nodes is asymptotically optimal when the budget T is sufficiently large
RESULT 1
The gap between the recovery probabilities for an optimal allocation and for the symmetric allocation that spreads the budget over all $n$ nodes is at most $pT \, \Pr[\mathrm{Bin}(n-1, p) \le \lceil n/T \rceil - 2]$.
If p and T are fixed such that $T > 1/p$, then this gap approaches zero as $n \to \infty$.
Asymptotic Optimality of Max Spreading
Proof Idea: Bounding the optimal recovery probability…
1. By conditioning on the number of accessed nodes r, we can express the probability of successful recovery as $\sum_{r=0}^{n} p^r (1-p)^{n-r} S_r$, where $S_r$ is the number of successful r-subsets
2. We can in turn bound $S_r$ by observing that we have $S_r$ inequalities of the form $\sum_{i \in R} x_i \ge 1$, one for each successful r-subset $R$, which can be summed up to produce $\sum_{i=1}^{n} a_{r,i} x_i \ge S_r$, where $a_{r,i} \le \binom{n-1}{r-1}$ is the number of successful r-subsets that contain node i
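Asymptotic Optimality of Max Spreading
Proof Idea: Bounding the optimal recovery probability…
3. We therefore have $S_r \le \binom{n-1}{r-1} T$
4. Applying the bound $S_r \le \min\left\{\binom{n-1}{r-1} T, \binom{n}{r}\right\}$ to $\sum_{r=0}^{n} p^r (1-p)^{n-r} S_r$ leads to the conclusion that the optimal recovery probability is at most $\Pr[\mathrm{Bin}(n, p) \ge \lceil n/T \rceil] + pT \, \Pr[\mathrm{Bin}(n-1, p) \le \lceil n/T \rceil - 2]$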
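The algebra behind step 4 can be written out as follows. This is a sketch under stated assumptions: $1 \le T \le n$, and the per-subset bound $S_r \le \min\{\binom{n-1}{r-1} T, \binom{n}{r}\}$, which holds because each node lies in at most $\binom{n-1}{r-1}$ of the r-subsets and there are $\binom{n}{r}$ subsets in total. The symbol $P_{\mathrm{opt}}$ and the notation $\mathrm{Bin}(\cdot, \cdot)$ for a binomial random variable are introduced here for illustration.

```latex
\begin{align*}
P_{\mathrm{opt}}
  &= \sum_{r=0}^{n} p^{r} (1-p)^{n-r} S_r
   \le \sum_{r=1}^{n} p^{r} (1-p)^{n-r}
       \min\left\{ \binom{n-1}{r-1} T,\ \binom{n}{r} \right\} \\
  &= \sum_{r=1}^{\lceil n/T \rceil - 1} \binom{n-1}{r-1} T \, p^{r} (1-p)^{n-r}
     + \sum_{r=\lceil n/T \rceil}^{n} \binom{n}{r} p^{r} (1-p)^{n-r} \\
  &= pT \, \Pr\left[ \mathrm{Bin}(n-1, p) \le \lceil n/T \rceil - 2 \right]
     + \Pr\left[ \mathrm{Bin}(n, p) \ge \lceil n/T \rceil \right].
\end{align*}
```

The first line drops the $r = 0$ term because the empty subset never succeeds. The second line uses $\binom{n-1}{r-1} T \le \binom{n}{r}$ exactly when $r \le n/T$, and the last line substitutes $j = r - 1$ in the first sum.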
Asymptotic Optimality of Max Spreading
Proof Idea: Bounding the suboptimality gap for max spreading…
1. The recovery probability of the allocation that spreads the budget over all $n$ nodes is $\Pr[\mathrm{Bin}(n, p) \ge \lceil n/T \rceil]$
2. The suboptimality gap for this allocation is therefore at most the difference between the upper bound for the optimal recovery probability and the expression in step 1, which is $pT \, \Pr[\mathrm{Bin}(n-1, p) \le \lceil n/T \rceil - 2]$
3. For $p > 1/T$, we can apply the Chernoff bound to obtain an upper bound on this gap that decays exponentially in $n$
4. As $n \to \infty$, this upper bound approaches zero
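The shrinking gap can be illustrated numerically. The sketch below assumes that the gap bound has the form p·T·P[Bin(n−1, p) ≤ ⌈n/T⌉ − 2], which is what the counting argument in the proof idea yields. The function names and the parameter values T = 7/3, p = 2/3 (so that p > 1/T) are illustrative.

```python
from fractions import Fraction
from math import ceil, comb

def binomial_cdf(n, p, k):
    """P[Bin(n, p) <= k]; an empty sum gives 0 when k < 0."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(0, k + 1))

def gap_bound(n, T, p):
    """Assumed bound on the suboptimality gap of maximal spreading."""
    threshold = ceil(Fraction(n) / Fraction(T)) - 2
    return p * T * binomial_cdf(n - 1, p, threshold)

T, p = Fraction(7, 3), Fraction(2, 3)  # p > 1/T, the regime of RESULT 1
for n in (10, 50, 100, 200):
    print(n, float(gap_bound(n, T, p)))
```

For these parameters the printed values fall rapidly as n grows, in line with the exponential decay that a Chernoff bound predicts.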
Optimal Symmetric Allocation
The problem is nontrivial even when restricted to symmetric allocations…
[Figure: plot; axis label: number of nonempty nodes in the symmetric allocation]
Optimal Symmetric Allocation
Maximal spreading is optimal among symmetric allocations when the budget T is sufficiently large
RESULT 2
If [condition on p and T], then either the symmetric allocation with $m = n$ nonempty nodes or the one with $m = \lfloor \lfloor n/T \rfloor T \rfloor$ nonempty nodes is an optimal symmetric allocation.
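Optimal Symmetric Allocation
Minimal spreading is optimal among symmetric allocations when the budget T is sufficiently small
RESULT 3
If [condition on p and T], then the symmetric allocation with $m = \lfloor T \rfloor$ nonempty nodes is an optimal symmetric allocation.
Coding is unnecessary for such an allocation, since each nonempty node stores $T / \lfloor T \rfloor \ge 1$ and can hold the whole data object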
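Optimal Symmetric Allocation
Proof Idea: Finding the optimal symmetric allocation…
Recall that the recovery probability of the symmetric allocation with $m$ nonempty nodes is given by $\Pr[\mathrm{Bin}(m, p) \ge \lceil m/T \rceil]$
For constant p and k, $\Pr[\mathrm{Bin}(m, p) \ge k]$ is a nondecreasing function of m
1. Observe that we can find an optimal m* from among $\lceil n/T \rceil$ candidates: $m = \lfloor T \rfloor, \lfloor 2T \rfloor, \dots, \lfloor \lfloor n/T \rfloor T \rfloor, n$
2. For $m = \lfloor kT \rfloor$, where $k = 1, \dots, \lfloor n/T \rfloor$, the recovery probability is $\Pr[\mathrm{Bin}(\lfloor kT \rfloor, p) \ge k]$
3. RESULT 2 (max spreading optimal) is a sufficient condition on p and T for $\Pr[\mathrm{Bin}(\lfloor kT \rfloor, p) \ge k]$ to be nondecreasing in k
4. To obtain RESULT 3 (min spreading optimal), we first establish a sufficient condition on p and T for $\Pr[\mathrm{Bin}(\lfloor kT \rfloor, p) \ge k]$ to be nonincreasing in k; we subsequently expand the condition to include other points for which $m = \lfloor T \rfloor$ remains optimal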
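The candidate search in steps 1 and 2 can be sketched as follows. The sketch assumes 1 ≤ T ≤ n, exact fractions for T and p, and the binomial-tail form of the symmetric recovery probability. The candidate set m = ⌊kT⌋ for k = 1, …, ⌊n/T⌋, together with m = n, follows from the monotonicity in m for fixed k. All function names are illustrative.

```python
from fractions import Fraction
from math import ceil, comb, floor

def symmetric_recovery_probability(m, T, p):
    """P[Bin(m, p) >= ceil(m / T)] for m nonempty nodes storing T/m each."""
    k = ceil(Fraction(m) / Fraction(T))
    return sum(comb(m, r) * p**r * (1 - p)**(m - r) for r in range(k, m + 1))

def candidate_sizes(n, T):
    """Candidates m = floor(k*T) for k = 1..floor(n/T), plus m = n."""
    T = Fraction(T)
    sizes = {floor(k * T) for k in range(1, floor(Fraction(n) / T) + 1)}
    sizes.add(n)
    return sorted(sizes)

def best_symmetric(n, T, p):
    """Return (probability, m) maximizing the recovery probability over candidates."""
    return max((symmetric_recovery_probability(m, T, p), m)
               for m in candidate_sizes(n, T))

n, T, p = 5, Fraction(7, 3), Fraction(2, 3)
print(candidate_sizes(n, T))
best_prob, best_m = best_symmetric(n, T, p)
print(best_m, best_prob)
```

Ties are broken toward the larger m because the tuples are compared lexicographically. Checking only the candidates instead of every m from 1 to n reduces the search to about n/T evaluations.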
Optimal Symmetric Allocation
[Figure: regions of p and T. In one region maximal spreading is optimal among symmetric allocations; in another, minimal spreading is optimal among symmetric allocations; other symmetric allocations may be optimal in the gap between them]
Conclusion
The optimal allocation is not necessarily symmetric
However, the symmetric allocation that spreads the budget maximally over all n nodes is asymptotically optimal when the budget is sufficiently large
Furthermore, we are able to specify the optimal symmetric allocation for a wide range of parameter values of p and T
Thank you!