Upload
donhi
View
215
Download
1
Embed Size (px)
Citation preview
Energy Efficiency Strategies for Storage in HPC Environments
Alan G. Yoder, Ph.D. , NetApp
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Abstract
Energy Efficiency Strategies forStorage in HPC Environments
With some large-scale HPC data center owners admitting that storage and computing resources can cost more to power over their lifetimes than to purchase, attention to energy management in HPC is timely. This talk will focus onstorage; it will survey various storage schemes for HPC, ranging from Hadoop clusters to enterprise storage arrays, and compare energy usage and management in the various schemata. The potential contribution of point technologies such as data deduplication will also be covered.
2
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Outline
Typical Hadoop configurationTypical Lustre configurationNFS v4.1Replication – love and hateRAID 6Other technologies for increased efficiency
3
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Typical Hadoop configuration
HDFS does the storageMaster/slave architecture : single metadata server
4
NameNodeMetadata (name, replicas, .....): /n/foo, 3, .....
Filesystem ops: create, rm, mv, cp, etc.
DataNode
DataNode
Rack 2
DataNode
clientsDataNode
DataNode
Rack 1
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Typical Hadoop configuration
Replication
5
NameNode
DataNode
DataNode
Rack 2
DataNode
DataNode
DataNode
Rack 1
Design parameters:1. Fear of whole-rack failure2. Faster in-rack LAN operations
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Typical Hadoop configuration
Typical data path operation
6
NameNode
DataNode
DataNode
Rack 2
DataNode
clientsDataNode
DataNode
Rack 1
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Hadoop pros and cons (v 0.20)
Parallel reads of large filesMatching of computation to data location
Jobtracker jobtasker communication
“Doesn’t need RAID”If “false is the new true”, this is a good thing
Default replication value is 3 per data centerName server is a SPOF
Log-structured metadata storeLog replay only upon reboot ( ½ hour or more downtime)Loss of NameNode filesystem is equivalent of a head crash
No provision for high availability or disaster recovery
7
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Lustre
Similar in concept to Hadoopwith an intermediate storage layerwith failover and multipathing capability
8
MDS
OSS node
OST node
storage
OST node
storage
OSS node
OST node
storage
OST node
storage
OSS node
OST node
storage
OST node
storage
clients
MDS: MetaData ServerOSS: ObjectStorage ServerOST: ObjectStorage Target
2..8 2..8 2..8
fat pipe e.g. Infiniband
SAS
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Pros and cons of Lustre
Much more “enterprise ready” than Hadoophigh availabilitytesting with and support for enterprise-ready Linux etc.
Additional OSS layer insulates FS from rogue clientssaferfaster access to cached dataless efficient for streaming operations
Less dependent on pure replicationActive/passive failoverMDS a SPOF
9
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
NFS v4.1 (pNFS)
Hadoop++, I mean +++++++Master/slave architecture : replicatable metadata server
10
Metadata Server
DataServer
DataServer
Local cluster
DataServerclients
DataServer
DataServer
Remote cluster
DataServer
Metadata Server
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
NFS v4.1 additional features
Delegations – similar to CIFS oplocksoverlapping read lock supportcached local writes
Layouts – similar to Lustre layoutsclient-side control over striping etc.
Designed for enterprise storagehigh availability operations assumed and accommodatedenterprise level security incorporated
Allows sophisticated data management techniquese.g. single instance store, data dedup, delta snapshots, thin provisioning
11
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
The theme so far (mostly)
Lots of replication!
12
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Replication
Traditional data center system redundancyOverprovisioning – protect against volume-out-of-space application crashesTest/dev copies – protect live data from mutilation by unbaked codeDR Mirror – protect against whole-site disastersBackups – protect against failures and unintentional deletions/changesCompliance archive – protect against heavy fines
Big data systems like HadoopTypically 3x replication locallyPartly for performance
“Brick” architecturesGoogle, MicrosoftAt least 2x
13
shocking discovery: disk
failure considered non-binary
shocking discovery: disk
failure considered non-binary
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
AppData
1 TB
5 TB
10 TB
Test/Devcopies
Data
RAID10
“Growth”Snapshots
Data
RAID10
“Growth”Snapshots
Backup
Archive
Test
Test
Test
Test
Test
ComplianceArchive
Data
RAID10
“Growth”Snapshots
Data
RAID10
“Growth”Snapshots
Backup
Archive
DiskBackup
Data
RAID10
“Growth”Snapshots
Data
RAID10
“Growth”Snapshots
Backup
DRMirror
Data
RAID10
“Growth”Snapshots
Data
RAID10
“Growth”Snapshots
Snap-shots
Data
RAID10
“Growth”Snapshots
Over-provision
Data
RAID10
“Growth”
~10x +
RAID 10Overhead
Data
RAID10
Data
- Power consumption is roughly linear in the number of naïve (full) copies
Result of all that replication
14
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
RAID
Requirement is to tolerate n failuresChallenge is to do this with least amount of raw capacityReplication (for data protections’ sake) is a naive form of RAID – 50 – 75% overheadRAID 1 – 50% overheadRAID 5 – 15% to 20% overhead
larger disks unacceptable RG reconstruct timesRAID 6 – 15% to 25% overhead (8 + 2 common)
15
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
RAID types
RAID 0simple stripingno parityperformance gain
16
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
RAID 1
Mirroringgold standard of data protection for many yearssimple duplication in hardwaresome potential performance benefittolerates 1 failure in any 2-disk “RAID group”
17
LUNLUNLUNLUN
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
RAID 4
Parity on a separate spindleParity is XOR of other stripes (equivalent to addition)Parity reconstructed by simple math (more disks more time)
18
7328
2085
4532
1381315
7+2+47+2+413-7-213-7-213-7-413-7-413-4-313-4-3
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Notes on RAID 4
Parity disk is a hot spot in naive implementationsNeeds aggressive write staging
RAID group can be expanded with zeroed drives
19
7328
2085
4532
1381315
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
RAID 5
Striped paritySame math – XOREliminates hot spotSome performance benefit toutedCan’t easily expand RAID group, however
20
732
20
5
4
32
135
88
81315
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
RAID 6
Tolerates two drive failures per RAID groupRow-diagonal parity (usually)
One parity stripe horizontallyOne parity stripe diagonally
Parity rebuild process is complex but provably correctLess than 3% performance hit in optimal implementations
21
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Notes on Gordon (SDSC)
> 200 Tflops, 64TB DRAM, 320TB SSD in 64 nodesExpectations based on my experience:
High power use (> HDD per IOP and Gb/s)Much faster for very large random I/O (WS > 64TB)Faster for very large sequential readsSlower for data ingestRelatively small (< 1% the capacity of Sequoia)
22
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Flash and stash, tiering, MAID
Flash and stashproven technology in enterprise spacelarge flash caches (> 1TB) fronting low-power SATA diskresulting systems are hi-perf except on streaming writesexpectation that flash will move to host over timerecommend adding this technology to Lustre OSS nodes
Tieringsame idea, keep hot data on SSD, cold data on SATAruns well on Powerpoint
MAIDuseful for hot spares
23
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Other energy efficiency technologies of note
“Green” facility placementWater and natural cooling Hot aisle technologiesFlywheel UPSesPUE monitoringThin provisioningCompressionDelta snapshotsParity RAIDDeduplication and SIS
Capacity vs. high performance drivesILM / HSM / TieringMAIDSSDs / “Flash and stash”Power supply and fan efficiencies
24
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
NFS v4.1 – the future?
FOSS development underway – UMichSupport from major vendorsBuilds on 20 years of NFS development effortFewer “moving parts” than LustreIntegrates well with other storage technologiesEasier to distribute, archiveFamiliar POSIX interface for end usersEasier to secure
25
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Q & A
26
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Additional material
27
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Notes
Tutorial: RAIDHow does a Lustre storage system work?How does a Hadoop cluster work?Get rid of COMs mostlySan Diego flash system (Gordon)Tiering vs. Caching on flashLustre vs NFSv4.1 vs MAIDWright, Disruptive Technologies for Data...
28
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Servers, storage and switches are HEATERS100% efficient energy-to-heat conversionRotating media uses 85% of max power at idle!
A/C is a big “undo” mechanism for overheatingBut less than 100% efficient (typically 70%)
Problem: making heat just to cool it
> 60% of the power in a traditional data center does no IT
work
(PUE* ~ 2.5)
HEATBUILDING
UNDO
* PUE defined later 29
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Problem: unused space
Overprovisioning of systemsOverprovisioning of containers
30
data
data
data
data
data
data
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Typical Hadoop configuration
HDFS does the storageMaster/slave architecture : single metadata server
31
NameNodeMetadata (name, replicas, .....): /n/foo, 3, .....
Filesystem ops: create, rm, mv, cp, etc.
DataNode
DataNode
Rack 2
DataNode
clients
DataNode
DataNode
Remote Rack
DataNode
DataNode
DataNode
Rack 1
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Oak Ridge Labs “Spider”
32
Aggregate Bandwidth 240GB/s
Storage Systems 48 x DDN S2A9900 Storage Arrays
Hard Drives 13,440 1TB SATA Hard Drives
Aggregate Capacity 13.44 Petabytes (Raw), 10.7 Petabytes (Usable – 8+2 RAID 6)
Lustre Storage Servers 192 Lustre OSS Servers
Cabling Over 1,000 20Gb InfiniBand Cables
Data Center Cabinets 32 Data Center Racks, 572 ft
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Green facilities
PUE – Power Use Efficiency
Weighted upward byUPS and power conditioning ineffiecienciesInefficient cooling
Traditionally 2.5, modern best practice = 1.25Can be gamed
Use of equipment fans to drive hot air exhaust
33
PUE Total _ Facility _ Power
IT _ Power
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Hot aisle / cold aisle technologies
Segregate airflows into hot and/or cold aisles (backs and fronts of servers)
More precise controlAllows higher temperature differentials (more efficient)Current trend toward hot aisle containment with coldair plenumMust-have: blanking plates
Very important
Normally deployed in comb.w/ air economizers
34
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
UNDO(cool bldg.)
Green facilities
HEATBUILDING
HOT AISLE
EXHAUST
outside air
e.g.
BEFORE
AFTER
40% SAVINGS (w/ air economizer)(PUE: 2.5 1.5)
35
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Contributing to a green facility
Hot aisle containmentNeed temperature monitoring
Recomment rack level PDUs
Rigorous use of blanking plates requiredWith appropriate air economizers etc., the single biggest contribution to energy conservation you can make
Note re “green power”It’s only green if you generate it yourself (TGG)
36
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Thin provisioning
37
datadatadata
data
data
datadata
Add storage as you need it
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Thin provisioning notes
Most useful w/ variable ratios of compute and storage requirement
Especially true in cloud PaaS and SaaS environmentsTP biases toward centralized storage
38
vs.
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Solution: Capacity optimization technologies
- Green storage technologies use less raw capacity to store and use the same data set
- Power consumption falls accordingly
RAID 5/6 ThinProvisioning
VirtualClones
Dedupe&
Compression
1 TB
5 TB
10 TB
Data
RAID10
“Growth”Snapshots
Data
RAID10
“Growth”Snapshots
Backup
Archive
Test
Test
Test
Test
Test
DataRAIDDP“Growth”
Snapshots
DataRAID DP“Growth”
Snapshots
Backup
Archive
Test
Test
Test
Test
Test
DataRAIDDP“Growth”
Snapshots
DataRAID DP“Growth”
Snapshots
Backup
Archive
Test
Test
Test
Test
Test
DataRAIDDP“Growth”
Snapshots
DataRAID DP“Growth”
Snapshots
BackupArchive
Test
Test
Test
Test
Test
DataRAIDDP“Growth”
Snapshots
DataRAID DP“Growth”
Snapshots
BackupArchive
Test
Test
Test
Test
Test
DataRAIDDP“Growth”
Snapshots
DataRAID DP“Growth”
SnapshotsBackupArchive
Multi-Use
Backups
39
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Green software technologies
CompressionDelta snapshotsThin provisioningParity RAIDDeduplication and SIS
40
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Compression
CompressionOld and venerableAlmost a “gimme”Configuration matters
Compress before encrypting, decrypt before decompressing
41
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Delta snapshots
Data sharingForm of deduplicationData in snapshot shared with live data until one of them is writtenTwo fundamental techniques
Copy Out on WriteWrite to new live location
Mainly useful in HPC for VM bootingOne storage system can support a couple hundred diskless servers
42
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Deduplication and SIS
Find duplicates at some level, substitute pointers to a single shared copyBlock or sub-file based (dedup)Content or name based (SIS *, “file folding”)Inline (streaming) and post-process techniquesSavings increase with number of copies foundPerformance hit may be too expensive for HPC
* SIS = Single Instance StoreCheck out the SNIA Dictionary !
www.snia.org/dictionary
43
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
SSDs (Solid State Disks)
ProsGreat READ performanceAt rest power consumption = 0No access time penalty when idle (cf. MAID)No need to keep some disks spinning (cf. MAID)
ConsWRITE performance usually < mechanical disksCost >> mechanical disks except at very high perf pointsWear leveling requires a high space overhead
Note: these dynamics changing rapidly with time
44
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
“Flash and stash”
Large arrays of SATA-based disks fronted by large flash caches
> 1TB flashGreat for high I/O, esp. w/ contained working setsReduced power (SATA vs. SAS)Not useful for write-intensive workloads
45
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Power supply and fan efficiencies
Efficiency of power supply an up front wasteFormerly 60-70%Nowadays 80-95%
Climate Savers80plus group
Variable speed fansCommon nowadaysSoftware (OS) control
46
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Workload trends
Many HPC workloads favor centralized storage and management
nonproliferation, counter-terrorism, energy security, etc.health care analytics (SOX, HIPAA)pharma research (FDA)weapons
Security and retention/complianceboth easier and better with the enforcement point in the storage layerphysical security still necessary
47
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Other points
Centralized storage also offersMUCH better data managementVery efficient DR and remote replicationStrong vendor supportGreat virtualization support
Butcapabilities may be wasted in extreme HPC environmentsincreased up-front cost often a psychological barrier
48
Energy Efficiency Strategies for Storage in HPC Environments© 2011 Storage Networking Industry Association. All Rights Reserved.
Key takeaways
Hot aisle containmentRAID 6 is current best practiceHeadless server configurations are possibleUse of software capacity optimizations highly recommended in cloud-style environments
For more detail: http://www.snia.org/education/tutorials/2009/fall#green
49