Distributed Storage And WAN Transport Peter Kunszt SyBIT Tech Day Nov. 23 2011, Bern


Page 1: Distributed Storage And WAN Transport

Distributed Storage And WAN Transport

Peter Kunszt
SyBIT Tech Day

Nov. 23 2011, Bern

Page 2: Distributed Storage And WAN Transport

Distributed Storage Systems


Distributed FS
• Make it look like a local FS
• User sees one space
• Remote user sees the same local space
• Policies on sharing and access should be available

Caching FS
• Data lives somewhere else
• But looks local due to a smart WAN cache

Page 3: Distributed Storage And WAN Transport

Gluster (bought by Red Hat)


www.gluster.org
GlusterFS. Many commercial users. The software is open source; they sell an appliance and support (just like Red Hat).

• Single global namespace
• Block storage clustering, no central metadata
• Works over 1GbE, 10GbE, InfiniBand
• Replication
• ‘NFS-like’ native
• No kernel dependencies, simple installation
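Without a central metadata server, every client has to work out on its own where a file lives. Hashing the file path is one common way to do that. The following is a minimal sketch under that assumption, not GlusterFS's actual placement algorithm; the function name `pick_brick` and the brick names are made up for illustration.

```python
import hashlib

def pick_brick(path, bricks):
    """Map a file path to one storage brick using only the path itself.

    Every client computes the same answer, so no metadata lookup is needed.
    Simplification: plain modulo hashing reshuffles most files when the
    number of bricks changes.
    """
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return bricks[int.from_bytes(digest[:8], "big") % len(bricks)]

bricks = ["server1:/export", "server2:/export", "server3:/export"]  # hypothetical
location = pick_brick("/projects/run42/result.dat", bricks)
```

The same path always maps to the same brick, which is what lets clients skip a central lookup.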

Page 4: Distributed Storage And WAN Transport

XtreemFS


Part of the XtreemOS project (EU FP7). Only the German MosGrid uses the latest version in production.

• Object-based design
• Global FS namespace
• Metadata and Replica Service stores info
• Data on Object Storage Servers
• Linked through Replica Management Service
• Written in Java, using native Memblocking
• Key-value store used: BabuDB
• Uses the Linux FUSE kernel module
• MIT Vivaldi algorithm for replica automation and selection
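The idea behind Vivaldi-based replica selection can be sketched as follows: each node keeps synthetic coordinates, nudges them after every round-trip measurement, and a client then picks the replica that is closest in coordinate space. This is a simplified illustration, not XtreemFS code. The names `vivaldi_update` and `pick_replica`, the 2-D coordinates, and the constant step size `delta` are assumptions; the published algorithm adapts the step size from error estimates.

```python
import math

def distance(a, b):
    """Euclidean distance between two coordinate tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def vivaldi_update(own, remote, measured_rtt, delta=0.25):
    """Move `own` so that its distance to `remote` better matches the RTT."""
    dist = distance(own, remote)
    error = measured_rtt - dist  # positive: nodes are too close in coordinate space
    if dist == 0:
        direction = (1.0,) + (0.0,) * (len(own) - 1)  # arbitrary push direction
    else:
        direction = tuple((o - r) / dist for o, r in zip(own, remote))
    return tuple(o + delta * error * d for o, d in zip(own, direction))

def pick_replica(client, replicas):
    """Return the name of the replica whose coordinates are nearest the client."""
    return min(replicas, key=lambda name: distance(client, replicas[name]))

# Hypothetical replica coordinates, for illustration only.
replicas = {"replica_a": (1.0, 1.0), "replica_b": (5.0, 5.0)}
nearest = pick_replica((0.0, 0.0), replicas)
```

Coordinate distance stands in for network latency, so a client can rank replicas without probing each one.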

Page 5: Distributed Storage And WAN Transport

DDN WOS


www.ddn.com/industry/life-sciences
Storage appliance, sold with several interfaces including S3 and REST. GPFS based. Highly resilient to failure.

• Policy-based replication
• Data protection mechanism: several copies stored
• Break data into fragments, store those x times
• Can be combined with replication
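"Break data into fragments, store those x times" can be illustrated with a small placement sketch. This is a toy model, not how WOS actually places data: the round-robin placement, the function names, and the node names are assumptions for illustration.

```python
def split_fragments(data, fragment_size):
    """Cut a byte string into fixed-size fragments (the last may be shorter)."""
    return [data[i:i + fragment_size] for i in range(0, len(data), fragment_size)]

def place_fragments(data, fragment_size, copies, nodes):
    """Store every fragment on `copies` distinct nodes, round-robin."""
    if copies > len(nodes):
        raise ValueError("need at least as many nodes as copies")
    placement = {node: {} for node in nodes}
    for index, fragment in enumerate(split_fragments(data, fragment_size)):
        for c in range(copies):
            placement[nodes[(index + c) % len(nodes)]][index] = fragment
    return placement

def reassemble(placement, failed=()):
    """Rebuild the data from surviving nodes; KeyError if a fragment is lost."""
    surviving = {}
    for node, stored in placement.items():
        if node not in failed:
            surviving.update(stored)
    count = 1 + max((i for stored in placement.values() for i in stored), default=-1)
    return b"".join(surviving[i] for i in range(count))

nodes = ["node1", "node2", "node3"]  # hypothetical storage nodes
layout = place_fragments(b"abcdefghij", fragment_size=3, copies=2, nodes=nodes)
restored = reassemble(layout, failed=("node2",))
```

With two copies per fragment, the data survives the loss of any single node in this sketch; losing two nodes can lose a fragment.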

Page 6: Distributed Storage And WAN Transport

IBM Panache aka Active Cloud Engine


www.almaden.ibm.com/storagesystems/projects/panache/

• Clustered Filesystem CACHE for parallel I/O
• Can cache from multiple nodes
• GPFS for local FS, pNFS for remote access, also using parallel I/O
• No proprietary HW or SW needed for installation
• Very resilient to failures, late sync if necessary

Page 7: Distributed Storage And WAN Transport


IBM Active Cloud Engine™: WAN Caching capabilities (Statement of Direction)

• Fileset on home cluster is associated with a fileset on one or more cache clusters
• If data is in cache …
  – Cache hit at local disk speeds
  – Client sees local GPFS performance if file or directory is in cache
• If data not in cache …
  – Data and metadata (files and directories) pulled on-demand at network line speed and written to GPFS
  – Uses NFS for WAN data transfer
• If data is modified at home
  – Revalidation done at a configurable timeout
  – Close to NFS style close-to-open consistency across sites
  – POSIX strong consistency within cache site
• If data is modified at cache
  – Writes see no WAN latency
  – Writes are done to the cache (i.e. local GPFS), then asynchronously pushed home
• If network is disconnected …
  – Cached data can still be read, and writes to cache are written back after reconnection

[Figure: cache cluster sites and the home cluster site (SONAS system), each with IO nodes and a SONAS layer, connected by NFS over the WAN. Pull on cache miss, push on write.]
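The caching behaviour described on this slide (pull on miss, revalidate after a timeout, local writes pushed home later, continued operation while disconnected) can be modelled in a few lines. This is a toy sketch, not IBM's implementation: the class `WanCache`, its method names, and the use of a plain dict as the "home cluster" are assumptions for illustration.

```python
import time

class WanCache:
    """Toy model of a cache site in front of a remote home site."""

    def __init__(self, home, revalidate_after=30.0, clock=time.monotonic):
        self.home = home                  # dict standing in for the home cluster
        self.revalidate_after = revalidate_after
        self.clock = clock
        self.connected = True             # set to False to simulate a WAN outage
        self.cache = {}                   # path -> (data, time of last validation)
        self.dirty = []                   # paths written locally, not yet pushed home

    def read(self, path):
        entry = self.cache.get(path)
        if entry is not None:
            data, validated = entry
            fresh = self.clock() - validated < self.revalidate_after
            if fresh or path in self.dirty or not self.connected:
                return data               # served locally, no WAN round trip
        if not self.connected:
            raise OSError(f"{path} is not cached and home is unreachable")
        data = self.home[path]            # pull on cache miss or after the timeout
        self.cache[path] = (data, self.clock())
        return data

    def write(self, path, data):
        self.cache[path] = (data, self.clock())  # the write completes locally
        if path not in self.dirty:
            self.dirty.append(path)

    def push_home(self):
        """Write queued changes back; returns how many paths were pushed."""
        if not self.connected:
            return 0
        pushed = len(self.dirty)
        for path in self.dirty:
            self.home[path] = self.cache[path][0]
        self.dirty.clear()
        return pushed
```

In a real system `push_home` would run asynchronously in the background; here it is an explicit call so the ordering is easy to follow.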

Page 8: Distributed Storage And WAN Transport


IBM Active Cloud Engine™

What is IBM Active Cloud Engine?
• Policy-driven engine that helps improve storage efficiency by automatically
  – Distributing files, images, and application updates to multiple locations *
  – Identifying files for backup or replication to a DR location
  – Moving desired files to the right tier of storage, including tape in a TSM hierarchy
  – Deleting expired or unwanted files
• High-performance: can scan billions of files in minutes

What client value does Active Cloud Engine deliver?
• Enables ubiquitous access to files from across the globe *
• Reduces network costs and helps improve application performance by distributing files closer to users *
• Improves data protection by identifying candidates for backup or DR
• Lowers storage cost by moving files transparently to the most appropriate tier of storage
• Controls storage growth by moving older files to tape and deleting unwanted or expired files
• Enhances administrator productivity by automating file management

What capabilities are supported by Active Cloud Engine in SONAS?
• Active Cloud Engine on SONAS supports all the functions described above

What capabilities are supported by Active Cloud Engine in Storwize V7000 Unified?
• Active Cloud Engine on Storwize V7000 Unified supports all the functions described above except distribution to multiple locations

* Active Cloud Engine Statement of Direction

Page 9: Distributed Storage And WAN Transport

Fast Transport


• Network bandwidth maximization
• Fair share
• Congestion control
• Scheduling

TCP based: GridFTP and similar
• FTP blocksize adjustment
• Many parallel threads
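Why TCP-based tools tune buffer sizes and open many parallel connections follows from a simple bound: one TCP stream cannot move more than one window of data per round trip. The arithmetic below is a back-of-the-envelope sketch; the function names and the example numbers (64 KiB window, 100 ms round trip, 1 Gbit/s link) are assumptions, not measurements.

```python
import math

def tcp_stream_limit_mbit(window_bytes, rtt_ms):
    """Upper bound for one TCP stream in Mbit/s: window size / round-trip time."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6

def streams_needed(link_mbit, window_bytes, rtt_ms):
    """Number of parallel streams needed to fill the link at that per-stream bound."""
    return math.ceil(link_mbit / tcp_stream_limit_mbit(window_bytes, rtt_ms))

per_stream = tcp_stream_limit_mbit(64 * 1024, rtt_ms=100)      # about 5.2 Mbit/s
parallel = streams_needed(1000, 64 * 1024, rtt_ms=100)         # 191 streams
```

Raising the window to 4 MiB in the same formula lifts the single-stream bound to roughly 336 Mbit/s, which is why buffer tuning and parallel streams are usually combined.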

Page 10: Distributed Storage And WAN Transport

Aspera


www.asperasoft.com
Built into other appliances, many users.

• UDP based transport
• Swarming: can look like a DoS
• Also has an FTP connection for control information
• Configurable, has server and client UI for transport control
• Congestion control
• Fair share control

Page 11: Distributed Storage And WAN Transport

FileCatalyst


www.filecatalyst.com
Similar to Aspera: UDP based transport

Page 12: Distributed Storage And WAN Transport

Signiant


www.signiant.com
And one more. It is not cheap, but I didn’t find out more.