Developing More Efficient Object Replication on OpenStack Swift
2014/05/16 (OpenStack Juno Design Summit)

Kota Tsuyuzaki
Developer (Swift ATC)
Advanced Information Processing Technology SE Project
NTT Software Innovation Center

Copyright©2014 NTT corp. All Rights Reserved.
Copyright(c)2009-2014 NTT CORPORATION. All Rights Reserved.

More Efficient Object Replication in OpenStack Summit Juno


DESCRIPTION

This slide is related to http://junodesignsummit.sched.org/event/7ae1af936b54b937a92db9c4344dfe66#.U3m1OPl_t8E



Page 2: More Efficient Object Replication in OpenStack Summit Juno

Outline

1. Global Distributed Cluster
2. More Efficient Object Replication
3. Benchmark Analysis

Extra: ssync issue

Etherpad: https://etherpad.openstack.org/p/juno_swift_object_replication

Page 3: More Efficient Object Replication in OpenStack Summit Juno

1. Global Distributed Cluster

Demands:
• World Wide Services
• Capacity Optimization
• Disaster Recovery

Solution:
• Global Distributed Cluster

Page 4: More Efficient Object Replication in OpenStack Summit Juno

1. Global Distributed Cluster

Network Issues:
• High Latency: tens of ms to ~100 ms
• Narrow: 1~10Gbps
• Expensive: $15000/Gbps/mo

Page 5: More Efficient Object Replication in OpenStack Summit Juno

1. Global Distributed Cluster

Network Issues:
• High Latency: Excellent
  -> Regions
  -> Affinity Controls

[Figure: Region1 and Region2, from SwiftStack Blog, https://swiftstack.com/blog/]

Page 6: More Efficient Object Replication in OpenStack Summit Juno

1. Global Distributed Cluster

Network Issues:
• Narrow, Expensive: Not Good Enough Yet
  -> ???
  -> ???

• Large Amounts of Transfer
• Replication Delay

Page 7: More Efficient Object Replication in OpenStack Summit Juno

2. More Efficient Object Replication

Objective:
Reducing the Amount of Replication Network Transfer between Regions
(focus on Narrow Network)

Page 8: More Efficient Object Replication in OpenStack Summit Juno


2. More Efficient Object Replication

Current Behavior

Page 9: More Efficient Object Replication in OpenStack Summit Juno

2. More Efficient Object Replication

Current:
Model: 2 Regions, 3 Replicas with Write Affinity

[Figure: a User sends "PUT object" over the Internet; Region1 and Region2 are joined by the network between regions; nodes are labeled Primary or Handoff.]

Page 10: More Efficient Object Replication in OpenStack Summit Juno

2. More Efficient Object Replication

Current:
Model: 2 Regions, 3 Replicas with Write Affinity

[Figure: User, Internet, Region1, Region2, the network between regions, and nodes labeled Primary or Handoff.]

Unfortunately, the same object is copied between regions twice or more.

Page 11: More Efficient Object Replication in OpenStack Summit Juno


2. More Efficient Object Replication

Proposed Approach

Page 12: More Efficient Object Replication in OpenStack Summit Juno

2. More Efficient Object Replication

Approach:
• Only push to one remote node, chosen based on affinity
• Request that remote node to sync to the others
• Change only a small amount of code in object-replicator and object-server

[Figure: Region1 and Region2 joined by the network between regions, with the labels "Only push to one remote" and "Sync to others".]

Page 13: More Efficient Object Replication in OpenStack Summit Juno

2. More Efficient Object Replication

*Additional code

[Object-Replicator]

    find local part suffixes
    for each:
        find other primary locations
        check remote
        if not in remote:
            if (remote region is local) or (remote region not in synced region):
                push data
                create remote suffix with request to sync in remote region
                add remote region to synced region

[Object-Server (REPLICATE)]

    create local suffix
    if sync request in header:
        push data to requested remotes
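The pseudocode above can be read as the following small simulation. It is only a sketch and is not Swift's actual code: `Node`, `original_pushes`, `proposed_pushes` and the node names are hypothetical, and "push" is modelled as a (source, destination) pair rather than a real rsync/ssync transfer.

```python
from collections import namedtuple

# Hypothetical stand-in for a ring device: just a name and a region number.
Node = namedtuple("Node", ["name", "region"])


def original_pushes(local, primaries, has_data):
    """Current behaviour: push to every other primary that lacks the data."""
    return [(local.name, n.name) for n in primaries
            if n != local and n.name not in has_data]


def proposed_pushes(local, primaries, has_data):
    """Proposed behaviour: at most one push per remote region.

    Returns (pushes, sync_requests). sync_requests maps the node that
    received the push to the primaries in its own region that it is asked
    to sync (the "request to sync in remote region" step).
    """
    pushes, sync_requests, synced_regions = [], {}, set()
    for node in primaries:
        if node == local or node.name in has_data:
            continue
        if node.region == local.region:
            # Same region: push directly, as today.
            pushes.append((local.name, node.name))
        elif node.region not in synced_regions:
            # First node seen in this remote region: push once and delegate.
            pushes.append((local.name, node.name))
            sync_requests[node.name] = [
                n.name for n in primaries
                if n.region == node.region and n != node
                and n.name not in has_data
            ]
            synced_regions.add(node.region)
    return pushes, sync_requests


# A handoff node in region 1 holds an object whose primaries are one node
# in region 1 (already in sync) and two nodes in region 2.
local = Node("r1-handoff", 1)
primaries = [Node("r1-a", 1), Node("r2-a", 2), Node("r2-b", 2)]
before = original_pushes(local, primaries, has_data={"r1-a"})
after, delegated = proposed_pushes(local, primaries, has_data={"r1-a"})
```

In this example `before` contains two pushes that cross the regions, while `after` contains one, and `delegated` records that `r2-a` is asked to sync `r2-b` inside region 2.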

Page 14: More Efficient Object Replication in OpenStack Summit Juno

3. Performance Analysis

Objective:
• Analyze Replication Performance
  • Total transferred data amount
  • Average network bandwidth between regions
  • One pass time

Page 15: More Efficient Object Replication in OpenStack Summit Juno

3. Benchmark Scenario

Model:
• 2 Regions, 3 Replicas
• 1 Gateway Node (GW) between Regions

Scenario:
• Shape the GW network to 1Gbps
• Stop object-replicator
• Load objects with Write Affinity
  • 1Gbps -> 8MB * 5,000 (40GB total)
• Run object-replicator in once mode (concurrency 32)

Benchmark Patterns:
• Original (ssync)
• Proposed (ssync, rsync)

Page 16: More Efficient Object Replication in OpenStack Summit Juno

3. Benchmark Environment

[Figure: Region 1 and Region 2, each with an Infiniband switch (LAN); storage nodes Storage1 to Storage4 with 36 disks each; Proxy, GW and Client (Ethernet); links are 20Gbps, with the GW links shaped to 1G.]

Storage:
  CPU: 2 * Intel X5650 2.67GHz (6 core * HT)
  MEM: 48GB RAM
  NIC: 20Gbps Infiniband
  Disks: 3TB SATA (7,200 rpm) x 36 disks

GW:
  CPU: 2 * Intel X5650 2.67GHz (6 core * HT)
  MEM: 64GB RAM
  NIC: 2 * 20Gbps Infiniband (Shaping 1G)

Page 17: More Efficient Object Replication in OpenStack Summit Juno

3. Result (w/1Gbps shaping)

[Chart: One Replication Pass Time (1Gbps). Y-axis: elapsed time (sec), 0 to 600. Bars: Original, Proposed (ssync), Proposed (rsync). Callout: Very Good!]

[Chart: Transferred Data on One Pass (1Gbps). Y-axis: transferred data amount (GB), 0 to 70. Bars: Original, Proposed (ssync), Proposed (rsync). Callout: Very Good!]

[Chart: Average Network Bandwidth (1Gbps). Y-axis: average network bandwidth (Gbps), 0 to 1. Bars: Original, Proposed (ssync), Proposed (rsync). Callout: Little decrease]

- Good reduction in transferred data amount
- Little decrease in average network bandwidth
- Good reduction in one pass time
-- ssync is more efficient than rsync.
-- The proposed algorithm has a small overhead from waiting for the remote node to sync.
-- It can ensure that all primary nodes are synced in a shorter time and with a smaller amount of data transfer.

Chart annotations:
• 40GB * 3 replicas / 2 = 60GB
• 1/3 has 2 copies in Region2
• 40GB = theoretical value
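The 60GB and 40GB figures in the annotations can be checked with a short calculation. This is one reading of the annotations, not the author's own script: it assumes decimal units (1GB = 1000MB), that the three replicas are spread evenly over the two regions on average, and that the proposed scheme sends each object across the regions exactly once. The pass-time values are idealized lower bounds for a fully used 1Gbps link with no protocol overhead; they are not the measured results shown in the charts.

```python
OBJECT_SIZE_MB = 8      # from the benchmark scenario
OBJECT_COUNT = 5000     # from the benchmark scenario
REPLICAS = 3
REGIONS = 2
LINK_GBPS = 1.0         # shaped gateway bandwidth

# Data loaded into the cluster with write affinity.
loaded_gb = OBJECT_SIZE_MB * OBJECT_COUNT / 1000        # 40.0

# Original: every replica that belongs in the other region crosses the link.
original_gb = loaded_gb * REPLICAS / REGIONS            # 60.0

# Proposed: each object crosses the link once, then spreads inside the region.
proposed_gb = loaded_gb                                 # 40.0

# Fraction of inter-region traffic saved under these assumptions.
reduction = 1 - proposed_gb / original_gb


def min_pass_seconds(gigabytes, gbps=LINK_GBPS):
    """Idealized lower bound on transfer time over the shaped link."""
    return gigabytes * 8 / gbps


original_floor = min_pass_seconds(original_gb)          # 480.0
proposed_floor = min_pass_seconds(proposed_gb)          # 320.0
```

Under these assumptions the proposed scheme removes one third of the inter-region traffic, and the shaped link alone accounts for at least 320 to 480 seconds of one replication pass.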

Page 18: More Efficient Object Replication in OpenStack Summit Juno

Conclusion

1. Global Distributed Cluster
• Efficient Replication Needs

2. More Efficient Object Replication
• Affinity based approach
• Only push to one remote

3. Benchmark Analysis
• Good reduction of data transfer
• Little overhead in One Pass Time

Acknowledgment: SwiftStack members, Ken Igarachi, Yohei Hayashi, Takashi Shito, Hiromichi Ito, Naoto Nishizono

Page 19: More Efficient Object Replication in OpenStack Summit Juno

Discussions

• Is it necessary to ensure that all nodes are synced?
  • Request the sync at the time of REPLICATE:
    • Pros: Able to ensure that all replicas are synced
    • Cons: Small overhead from waiting for the sync
  • Do not request the sync; update the replicas asynchronously:
    • Pros: Simple
    • Cons: Unable to ensure that all replicas are synced

• What is a good way to sync the other nodes from Object-Server?
  • Naïve (but very simple): Use an object-replicator instance, which carries unnecessary information (e.g. Ring)
  • Complex: Create a syncing function or class for object-server

• Are there more efficient ways?

[Slide callouts: "current" (x 2)]

Page 20: More Efficient Object Replication in OpenStack Summit Juno


Kota Tsuyuzaki

IRC: Kota

[email protected]

Page 21: More Efficient Object Replication in OpenStack Summit Juno

Extra: Ssync issue

Ssync:
• Replication process improvement based on HTTP
• Replacement for rsync (designed to be slimmer)
• Sender / Receiver Model

Issue:
• Poor parallel I/O performance, (possibly) caused by eventlet
• Unable to access local disks in parallel (maybe due to a constraint of the Python VM)
• Slower than rsync in my experiment
• Possible Solution:
  • Launch the sender as a subprocess so that another CPU core can be used for disk reads, similar to rsync.
  • When using os.fork(), performance improved to about the same as rsync.
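The os.fork() idea can be illustrated with a minimal sketch. This is not the ssync sender and not the code used in the experiment: `read_in_child` is a hypothetical helper that only shows the general pattern of moving a disk read into a forked child process and streaming the bytes back to the parent over a pipe. It assumes a Unix-like system where `os.fork()` is available.

```python
import os


def read_in_child(path, chunk_size=65536):
    """Read `path` in a forked child and return its bytes in the parent."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: performs the disk read, so the OS can schedule it
        # on a different CPU core from the parent.
        status = 1
        try:
            os.close(read_fd)
            with open(path, "rb") as src:
                while True:
                    chunk = src.read(chunk_size)
                    if not chunk:
                        break
                    view = memoryview(chunk)
                    while view:
                        # os.write may write fewer bytes than requested.
                        written = os.write(write_fd, view)
                        view = view[written:]
            status = 0
        finally:
            # Leave without running the parent's cleanup handlers.
            os._exit(status)

    # Parent process: close the unused write end and drain the pipe.
    os.close(write_fd)
    chunks = []
    with os.fdopen(read_fd, "rb") as pipe:
        while True:
            chunk = pipe.read(chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
    _, wait_status = os.waitpid(pid, 0)
    if wait_status != 0:
        raise OSError("child reader failed for %s" % path)
    return b"".join(chunks)
```

Calling `read_in_child("/path/to/object.data")` returns the file contents. A real sender would forward each chunk to the receiver instead of collecting the whole file in memory.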