Accelerating the Reconstruction Process in Network Coding Storage System by Leveraging Data Temperature
Kai Li, Yuhui Deng
Department of Computer Science, Jinan University, Guangzhou 510632, P.R. China
NPC 2014: The 11th IFIP International Conference on Network and Parallel Computing. Ilan, Taiwan. Sept. 18-20, 2014
Slide 2
Agenda
1 Introduction
2 Background and Motivation
3 Design and Implementation
4 System Evaluation
5 Conclusions and Future Work
Slide 3
Introduction Big Data 1
Slide 4
Introduction Distributed Storage Systems 2
To provide massive data storage service:
Google: Google File System
Amazon: Dynamo
Microsoft: Windows Azure
IBM: GPFS
Red Hat: Gluster
Slide 5
Introduction Fault Tolerance in Distributed Storage Systems 3
Two traditional ways 1. Replica 2. Erasure Codes
Slide 6
Introduction Fault Tolerance in Distributed Storage Systems 3
Two traditional ways 1. Replica 2. Erasure Codes

                 Storage Efficiency   Reconstruction Bandwidth   Implementation Complexity
Replica          Low                  Optimal                    Easy
Erasure Codes    High                 Bad                        Complex
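The trade-off in the comparison above can be made concrete with a small calculation. The sketch below is illustrative only: the function names and the parameters (3 copies, a (9, 6) code) are assumptions chosen for the example, not values from the slides. It assumes conventional repair for a maximum-distance-separable (MDS) erasure code, where rebuilding one lost block requires reading k surviving blocks.

```python
from fractions import Fraction


def replication_costs(copies: int) -> dict:
    """Storage overhead and repair reads (in blocks) for plain replication."""
    # Every block is stored `copies` times; a lost copy is rebuilt from one other copy.
    return {"storage_overhead": Fraction(copies), "repair_blocks_read": 1}


def erasure_code_costs(n: int, k: int) -> dict:
    """Same quantities for an (n, k) MDS erasure code with conventional repair."""
    # k data blocks are encoded into n blocks; any k of them rebuild a lost block.
    return {"storage_overhead": Fraction(n, k), "repair_blocks_read": k}


# Hypothetical parameters, chosen only for illustration.
rep = replication_costs(3)
ec = erasure_code_costs(9, 6)

print("replication :", rep)
print("erasure code:", ec)
```

With these example parameters, replication stores three times the data but repairs by reading a single block, while the erasure code stores 1.5 times the data but must read six blocks per repair.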
Background and Motivation MBR Codes 2 Minimum-Bandwidth
Regenerating (MBR) codes achieve optimal repair bandwidth efficiency.
Slide 10
Background and Motivation MBR Codes 2 Minimum-Bandwidth
Regenerating (MBR) codes achieve optimal repair bandwidth efficiency.
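To put a number on "optimal repair bandwidth", the sketch below evaluates the standard MBR-point expression from the regenerating-codes literature, 2Md / (k(2d - k + 1)), where M is the file size, k the number of nodes needed to recover the file, and d the number of helper nodes contacted during repair. The formula is not taken from the slides, and the function names and parameter values are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def mbr_repair_bandwidth(file_size: int, k: int, d: int) -> Fraction:
    """Total data downloaded to repair one node at the MBR point."""
    return Fraction(2 * file_size * d, k * (2 * d - k + 1))


def conventional_repair_bandwidth(file_size: int) -> Fraction:
    """Conventional erasure-code repair downloads k blocks of size M/k, i.e. M in total."""
    return Fraction(file_size)


# Hypothetical example: file of 12 units, n = 5 nodes, k = 3, d = n - 1 helpers.
M, n, k = 12, 5, 3
d = n - 1

mbr = mbr_repair_bandwidth(M, k, d)
conventional = conventional_repair_bandwidth(M)

print("MBR repair download         :", mbr)
print("conventional repair download:", conventional)
```

For these example values the MBR repair downloads 16/3 units, less than half of the 12 units a conventional repair would transfer.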
Slide 11
Background and Motivation E-MBR Codes 3
Slide 12
Background and Motivation E-MBR Codes 3
Slide 13
Background and Motivation Motivation 4 Heavy computational cost
occurs in regenerating codes.

Function time complexity (table columns: RAID-5, RAID-6, E-MBR)
mbr_find_block_id (mbr_find_dup_block_id): nB
mbr_get_dup_disk_id (mbr_get_disk_id): (n-1)B
mbr_get_dup_block_no (mbr_get_block_no): nB
gf_gen_tables: C
gf_get_coefficient: C
gf_mul: C
open: (n-1)B, B
pread: (n-1)B, B
pwrite: (n-1)B, B
Slide 14
Background and Motivation Motivation 4 Heavy computational cost
occurs in regenerating codes.
Design and Implementation Algorithms 2
Algorithm Tracker:
while (1) {
    initiate hot_seg;
    fpid = fork();
    if (fpid == 0) {                 // child process
        while (1) {
            hot_seg = max(temp[]);   // segment with the highest temperature
            listen(taskexecuter_sd);
            send(hot_seg);
            if finish reconstruction, break;
        }
    } else if (fpid > 0) {           // parent process
        recv(req_disk, offset);
        segment = offset / mbr_segment_size;
        temp[segment]++;
    }
    if finish reconstruction, break;
}
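The core bookkeeping of the Tracker can be sketched in a few lines. This is a minimal single-process illustration, not the authors' implementation: the class name, the method names, and the `rebuilt` flags (used so that a segment is handed out only once) are assumptions added for the example. The fork, socket, and task-executer parts of the algorithm are omitted.

```python
class TemperatureTracker:
    """Counts accesses per segment and hands out the hottest pending segment."""

    def __init__(self, num_segments: int, segment_size: int):
        self.segment_size = segment_size
        self.temp = [0] * num_segments          # access count per segment
        self.rebuilt = [False] * num_segments   # assumption: track finished segments

    def record_access(self, offset: int) -> None:
        # Mirrors: segment = offset / mbr_segment_size; temp[segment]++
        self.temp[offset // self.segment_size] += 1

    def next_segment(self):
        # Mirrors: hot_seg = max(temp[]), restricted to segments not yet rebuilt.
        pending = [s for s in range(len(self.temp)) if not self.rebuilt[s]]
        if not pending:
            return None  # reconstruction finished
        hottest = max(pending, key=lambda s: self.temp[s])
        self.rebuilt[hottest] = True
        return hottest


# Hypothetical request trace: byte offsets issued by clients during reconstruction.
tracker = TemperatureTracker(num_segments=4, segment_size=1024)
for offset in (0, 2048, 2100, 3100, 2500):
    tracker.record_access(offset)

order = []
while (seg := tracker.next_segment()) is not None:
    order.append(seg)
print(order)
```

In this trace, segment 2 receives three requests and is therefore reconstructed first; ties are broken by the lower segment index.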
Slide 18
System Evaluation Workloads 1 User access patterns usually
follow the Pareto principle. Therefore, we use Zipf's law to
imitate the distribution of the workload issued by clients.
where
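A Zipf-distributed request stream of this kind can be generated as sketched below. The slide's own formula is not available here, so the sketch assumes the common form in which the segment of rank i is requested with probability proportional to 1 / i^alpha. The function names, the seed, and the parameter values are hypothetical.

```python
import random


def zipf_weights(num_segments: int, alpha: float) -> list:
    """Normalized Zipf probabilities: p(i) proportional to 1 / i**alpha, rank i >= 1."""
    raw = [1.0 / (rank ** alpha) for rank in range(1, num_segments + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def sample_requests(num_segments: int, alpha: float, count: int, seed: int = 0) -> list:
    """Draw `count` segment indices (0-based) following the Zipf weights."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    weights = zipf_weights(num_segments, alpha)
    return rng.choices(range(num_segments), weights=weights, k=count)


# Hypothetical workload: 8 segments, skew alpha = 1.0, 1000 requests.
weights = zipf_weights(8, 1.0)
requests = sample_requests(8, 1.0, 1000)
print([round(w, 3) for w in weights])
```

A larger alpha concentrates requests on fewer segments, while alpha = 0 degenerates to a uniform workload.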
Slide 19
System Evaluation Impact of Alpha Performance Evaluation 2
Slide 20
System Evaluation Impact of block size. Performance Evaluation
2
Slide 21
System Evaluation Impact of node number Performance Evaluation
2
Slide 22
Conclusions and Future Work
Conclusions 1
Our method, DTemp, reconstructs the hot data prior to the cold data.
DTemp outperforms the existing E-MBR reconstruction, with improvements
of up to 33.17%, 60.61%, and 37.77% in reconstruction time, throughput,
and average response time, respectively.
Future Work 2
1. To include more regenerating code schemes and evaluate their
reconstruction performance using data temperature for horizontal comparison.
2. To investigate the impact of DTemp in distributed storage systems with
real workloads.