Exact Regenerating Codes on Hierarchical Codes
Ernst Biersack, Eurecom, France
Joint work with Zhen Huang
Outline
:: Introduction and motivation
:: Hierarchical Codes
:: Regenerating Codes
:: Combining Hierarchical Codes and Regenerating Codes
:: Conclusion
Motivation: Elements of a P2P backup system
Performance metrics:
- Storage efficiency: how much redundant information do you store?
From Julian Monteiro
Motivation: Network Bandwidth is a scarce resource
Our first objective is to find erasure codes that consume less communication bandwidth, i.e. have a better efficiency factor ρ
- Network communication bandwidth cannot be “put aside” for later use
A second objective should be to adopt repair policies that provide a smooth utilization of the communication bandwidth
Hierarchical Codes
Regenerating Codes
ER-Hierarchical Codes
Linear Codes: Overview
- A particular way to build erasure codes is linear codes
[Figure: original fragments o1, o2, o3, o4 are combined into parity fragments p1, ..., p6; each parity fragment p_i is a linear combination of the original fragments with coefficient vector [c_{i,1} c_{i,2} c_{i,3} c_{i,4}]]
$p_i = \sum_j c_{i,j} \, o_j$
$P = C\,O, \qquad O = C^{-1} P$
If C is invertible, i.e. the coefficient vectors are linearly independent, we can reconstruct the original fragments.
If coefficients are chosen randomly in GF(2^16), the matrix is invertible with a very high probability.
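The relations P = CO and O = C^{-1}P can be tried out with a minimal sketch. It makes one loud assumption: arithmetic is done modulo the prime 65537 instead of in GF(2^16), because a prime field needs no polynomial arithmetic. The names `encode` and `solve_mod` are hypothetical helpers, not part of any library.

```python
import random
from itertools import combinations

P = 65537  # ASSUMPTION: prime field standing in for GF(2^16)

def encode(coeffs, originals):
    """Parity fragment i is sum_j c[i][j] * o[j] (mod P)."""
    return [sum(c * o for c, o in zip(row, originals)) % P for row in coeffs]

def solve_mod(matrix, rhs):
    """Solve matrix * x = rhs (mod P) by Gauss-Jordan elimination.
    Returns None when the matrix is singular."""
    n = len(matrix)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], P - 2, P)  # modular inverse via Fermat
        aug[col] = [(v * inv) % P for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(a - f * b) % P for a, b in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

rng = random.Random(0)
k, n_parity = 4, 6
originals = [rng.randrange(P) for _ in range(k)]
C = [[rng.randrange(P) for _ in range(k)] for _ in range(n_parity)]
parity = encode(C, originals)

# Try to decode from every set of k parity fragments.
results = []
for subset in combinations(range(n_parity), k):
    rows = [C[i] for i in subset]
    values = [parity[i] for i in subset]
    results.append(solve_mod(rows, values))

recovered = [r for r in results if r is not None]
print(f"{len(recovered)} of {len(results)} subsets decode the original fragments")
```

Every subset whose coefficient rows are linearly independent returns the original fragments; a singular subset would return `None`, which is rare for random coefficients in a field of this size.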
Hierarchical codes: Idea
Let us try to change the way the code is built:
[Figure: original fragments o1 to o4 and parity fragments p1 to p7]
There are sets of 4 parity fragments that are not sufficient to reconstruct the original file.
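The statement about insufficient sets can be checked with a rank computation. The sketch below rests on an assumption that is not spelled out in the text (the construction was shown only as a figure): p1 to p3 combine only o1 and o2, p4 to p6 combine only o3 and o4, and p7 combines all four. Arithmetic is modulo a prime rather than in GF(2^16), and `rank_mod` is a hypothetical helper.

```python
import random
from itertools import combinations

P = 65537  # ASSUMPTION: prime field used for illustration

def rank_mod(rows):
    """Rank of a matrix over the integers modulo P."""
    rows = [r[:] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], P - 2, P)
        rows[rank] = [(v * inv) % P for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(a - f * b) % P for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

# ASSUMED structure of the 4+3 hierarchical code: which originals each parity uses.
support = {
    "p1": [0, 1], "p2": [0, 1], "p3": [0, 1],
    "p4": [2, 3], "p5": [2, 3], "p6": [2, 3],
    "p7": [0, 1, 2, 3],
}
rng = random.Random(1)
coeff = {
    name: [rng.randrange(1, P) if j in cols else 0 for j in range(4)]
    for name, cols in support.items()
}

all_sets = list(combinations(sorted(coeff), 4))
failing = [s for s in all_sets if rank_mod([coeff[n] for n in s]) < 4]

print(f"{len(failing)} of {len(all_sets)} sets of 4 fragments cannot rebuild the file")
for s in failing:
    print("  insufficient:", s)
```

Under this assumed structure, any set that takes all three fragments from one local group has rank at most 3, so it cannot rebuild the file. A traditional 4+3 erasure code with generic coefficients has no such sets.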
[Figure: probability of failure vs. number of unavailable fragments (1 to 7); panels: 4+3 Traditional Erasure Code and 4+3 Hierarchical Code]
Hierarchical codes: Repair degree
[Figure: probability of repair cost / failure vs. number of unavailable fragments (1 to 7); panels: 4+3 Traditional Erasure Code (annotated ρ = 4) and 4+3 Hierarchical Code (annotated ρ = 2); legend entry: Failure]
The repair degree determines the efficiency factor ρ
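A small sketch of a repair with degree 2, assuming (this is not stated in the text) that p1, p2 and p3 are all linear combinations of only o1 and o2. The coefficients and fragment values are made up for illustration, and arithmetic is modulo a prime rather than in GF(2^16).

```python
P = 65537  # ASSUMPTION: prime field used for illustration

# Illustrative coefficient vectors over (o1, o2) for one local group.
c = {"p1": (3, 5), "p2": (7, 11), "p3": (13, 17)}
o1, o2 = 1234, 4321  # made-up original fragment values

frag = {name: (a * o1 + b * o2) % P for name, (a, b) in c.items()}

# p1 is lost. Find x, y with x*c(p2) + y*c(p3) = c(p1) (mod P) by Cramer's rule.
(a1, b1), (a2, b2), (a3, b3) = c["p1"], c["p2"], c["p3"]
det = (a2 * b3 - a3 * b2) % P
det_inv = pow(det, P - 2, P)
x = ((a1 * b3 - a3 * b1) * det_inv) % P
y = ((a2 * b1 - a1 * b2) * det_inv) % P

# Only two fragments are downloaded, so the repair degree is 2.
repaired = (x * frag["p2"] + y * frag["p3"]) % P
print("repair degree:", 2, "| p1 restored:", repaired == frag["p1"])
```

The repair never touches o3, o4 or any fragment outside the local group. In a traditional 4+3 code every parity fragment depends on all four originals, so the same repair needs 4 fragments.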
Hierarchical codes: Recursive Construction
HC-(k,h)
• k original blocks
• h redundant blocks
Hierarchical codes: Theory
Hierarchical codes: Repair
What if p_1 and p_3 are lost?
• Use p_2, 1 out of {p_7, p_8} and 1 out of {p_4, p_5, p_6}: need 3 blocks
What if p_1, p_2, and p_3 are lost?
• Use ….. need ???? blocks
In HC, the earlier we repair, the “cheaper” the repair often is
64+64 hierarchical codes: Reliability vs Cost
[Figure: two panels showing the probability of each repair cost and of failure vs. number of unavailable fragments (1 to 61); legend: Cost=2, Cost=4, Cost=8, Cost=16, Cost=32, Cost=64, Failure]
Two possible instances of a 64+64 hierarchical code
- Lower repair cost comes at the price of reduced reliability
Regenerating Codes: Idea
What happens if…
- upon a repair we contact more than k peers (d > k)?
- every peer stores a parity block that is larger than (or equal to) the usual parity fragment (i.e. 1/k of the file size), so that |block| ≥ |file|/k?
Regenerating codes (by A. G. Dimakis) give the answer: the repair communication requirements are much smaller.
[Figure: peers p1, p2, p5, p7, p8 contribute to a new block p'4 (d > k); block b1 is derived from o1 o2 o3 o4]
Regenerating codes: Performance
- regenerating codes are controlled by two additional parameters beyond k and h:
:: d, the repair degree, with k ≤ d ≤ k+h-1
:: i, the block expansion index, with 0 ≤ i ≤ k-1
- if we consider a regenerating code with k=32 and h=32:
[Figure, first panel: block size stretch (0.8 to 2) vs. d (32 to 62), one curve each for i=31, i=22, i=15, i=7, i=0; annotated "Additional space"]
[Figure, second panel: repair-down reduction (0.01 to 1) vs. d (32 to 62); annotated "Reduced communications bandwidth"; classical erasure codes marked]
MBR: Minimum-Bandwidth Regenerating
MSR: Minimum-Storage Regenerating
Regenerating codes: Performance
- k=32 and h=32 and a stored file of 1 MB:
[Figure: block size stretch and repair-down reduction vs. d (32 to 62), same two panels as on the previous slide]

code                             d    i    repairdown   storage
Classical erasure code           32   0    1 MB         2 MB
“extreme” regenerating code      63   30   42.47 KB     2.61 MB
“reasonable” regenerating code   40   7    84.62 KB     2.11 MB

Communication is impressively reduced with a small amount of extra storage.
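The table rows can be reproduced with a short sketch. The two closed-form expressions below are not given on the slides; they are an assumption taken from the regenerating-codes literature as recalled here, and are checked only against the three rows of the table. The function names are hypothetical.

```python
def _denominator(k, d, i):
    # ASSUMED common denominator of the block-size and repair-download formulas.
    return 2 * k * (d - k + 1) + i * (2 * k - i - 1)

def check_params(k, h, d, i):
    """Enforce the parameter ranges k <= d <= k+h-1 and 0 <= i <= k-1."""
    if not (k <= d <= k + h - 1):
        raise ValueError("repair degree d must satisfy k <= d <= k+h-1")
    if not (0 <= i <= k - 1):
        raise ValueError("block expansion index i must satisfy 0 <= i <= k-1")

def block_size(k, h, d, i, file_size):
    """Size of one stored block (ASSUMED formula)."""
    check_params(k, h, d, i)
    return file_size * 2 * (d - k + i + 1) / _denominator(k, d, i)

def repair_download(k, h, d, i, file_size):
    """Total data downloaded from the d peers for one repair (ASSUMED formula)."""
    check_params(k, h, d, i)
    return file_size * 2 * d / _denominator(k, d, i)

def total_storage(k, h, d, i, file_size):
    """Storage used by all k+h blocks."""
    return (k + h) * block_size(k, h, d, i, file_size)

FILE_KB = 1024.0  # the 1 MB file from the table, in KB
k, h = 32, 32
rows = {}
for label, d, i in [("classical", 32, 0), ("extreme", 63, 30), ("reasonable", 40, 7)]:
    repair_kb = repair_download(k, h, d, i, FILE_KB)
    storage_mb = total_storage(k, h, d, i, FILE_KB) / 1024.0
    rows[label] = (repair_kb, storage_mb)
    print(f"{label:>10}: d={d:2d} i={i:2d} repair={repair_kb:8.2f} KB storage={storage_mb:.2f} MB")
```

With i=0 and d=k the expressions collapse to the classical erasure code: one repair downloads the whole file and total storage is (k+h)/k times the file size.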
Regenerating codes: A new dimension in the trade-off
Storage
Communication
Erasure Codes
Replication
Regenerating codes can be seen as a generalization of replication and RSE that allows communication and storage requirements to be traded off more flexibly.
RC(k,h,d,i)
• k original pieces
• h additional pieces
• d repair degree
• i block expansion factor
Regenerating codes: Want to know more
See http://csi.usc.edu/~dimakis/StorageWiki/doku.php
A wiki on Coding for Distributed Storage maintained by Alexandros G. Dimakis
ER-Hierarchical Codes
• Can we combine Hierarchical Codes and Regenerating Codes?
• Yes:
ER-Hierarchical Codes combine concepts of Hierarchical Codes and Regenerating Codes, namely that
• most parity blocks are linear combinations of only a small subset of all original blocks and that
• a storage block consists of α fragments, while a repair block has only β fragments, with β < α
ER-Hierarchical Codes: Construction
• How to transform a Hierarchical Code into an ER-Hierarchical Code?
ER-Hierarchical Codes: Repair
• In HC we would need to download 4 blocks of size 1 each
  • 4 units of traffic
• In ER-HC we now download 5 fragments of size ½ each
  • 2.5 units of traffic
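Written out, the arithmetic for this one repair example is as follows; the symbols T_HC and T_ER-HC are introduced here only for illustration:

```latex
\begin{align*}
T_{\mathrm{HC}}    &= 4 \times 1 = 4 \text{ units} \\
T_{\mathrm{ER-HC}} &= 5 \times \tfrac{1}{2} = 2.5 \text{ units} \\
1 - \frac{T_{\mathrm{ER-HC}}}{T_{\mathrm{HC}}} &= 1 - \frac{2.5}{4} = 0.375
\end{align*}
```

So in this particular example the repair traffic drops by 37.5%, even though one more peer is contacted.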
ER-Hierarchical Codes: Traffic reduction (analysis)
• ER-HC reduces the traffic by more than
  • 85% as compared to RSE and Regenerating Codes
  • 40% as compared to Hierarchical Codes
The regenerating code used for comparison is MSR with d=k+1
ER-Hierarchical Codes: Repair Strategies
ER-Hierarchical Codes: Performance (simulation)
In HC and ER-HC, the earlier we repair, the “cheaper” the repair; this is not the case for RG and RSE
Conclusion
- Have presented some new codes that greatly reduce the communications overhead
- Regenerating codes apply principles of network coding to distributed storage and allow storage space to be traded off for communications bandwidth
- As compared to RSE codes:
  - Regenerating codes increase the repair degree (number of nodes that must be contacted for repair) but significantly reduce the amount of data downloaded from each node
  - Hierarchical codes significantly reduce the repair degree while keeping the amount of data transferred by each node the same (as RSE)
- Combining Regenerating Codes and Hierarchical Codes makes us win on both fronts
  - Reduces the repair degree and the amount of data transmitted by each node
Future work
• Further exploit the possibilities offered by ER-Hierarchical Codes
• Study the relationship between coding and repair policies for systems with churn
  • Reactive repair results in repair bursts
  • Proactive repair has smoother repair traffic but does unnecessary repairs. If repairs are cheap, as they are for ER-HC, proactive repair becomes much more attractive, since the earlier we repair, the cheaper the repair
Thanks. Questions?