Simulation of Hierarchical Storage Systems for TCO
Jakob Luttgau and Julian Kunkel
German Climate Compute Center (DKRZ)
June 22, 2017
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 1 / 20
Overview
1. Motivation and Background
2. Modeling and Simulation Tape Storage Systems
3. Evaluation
4. Conclusion / Discussion
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 2 / 20
MotivationLong-term storage and upcoming challenges for exascale supercomputers.
I Storing data from a supercomputer is a common bottleneckI Deep storage hierarchies to balance cost and performanceI Tape is among the most affordable storage solutionsI RAIT and object semantics change how tapes are usedI Innovation mostly dependent on vendors
But given a simulator it would be possible to:I Experiment with alternative configurationsI Better understand cost and deploy more informedI Explore strategies to rollout LTO generations
This is still a work in progress.
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 3 / 20
Automated Tape LibrariesArchives; Data reduction and compression; Encryption; Self-describing tape formats;
IBM TS3500 Library Complex (IBM, 2011b)
StorageTek SL8500 Library Complex (Oracle, 2015)
TFinity Library Complex (Spectralogic, 2016b)
Scalar i6000 Library Complex (Quantum, 2015)
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 4 / 20
1. Motivation and Background
2. Modeling and Simulation Tape Storage Systems
3. Evaluation
4. Conclusion / Discussion
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 5 / 20
Model OverviewHardware and software components in a combined overview.
Client GroupClient
Tape Silo
Shared Cache
Switch
I/O
Servers...
Cache
Switch
Drive
Drive
Drive Drive
Drive Drive
Drive
Drive
Netw
ork
I/O
Sch
ed
ulin
g
Tape Manager
File Manager
Direct RAIT
Cache Policies
Robot
Sched.
Library Topologies
Workload
Generation
Load Balancing
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 6 / 20
Handling READ RequestsStaging of recently accessed files for reads.
Client Group
Client
Shared Cache
SwitchCacheI/O
Server
READ (cached)
Tape Drive
READ (not in cache)Shared Cache
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 7 / 20
Handling WRITE RequestsTwo-Phase write with delayed persistence on tape.
WRITE (Phase 2)
Client Group
Client
SwitchCacheI/O
Server
WRITE (Phase 1)
Tape Drive
delay
Shared Cache
Shared Cache
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 8 / 20
Network Topology ModelNot packet based but allocated for the duration of a transfer.
15000.0
15000.0
10.0
10.0
1000.0
1000.0
1000.0
1000.0
15000.0
15000.0
100.0
100.0
100.0
200.0
200.0
200.0
200.0
200.0
100.0
100.0
100.0
200.0
200.0
200.0
200.0
200.0
1000.0
1000.0
1000.0
1000.0
0::Dummy
1::Node Switch
3::Drive Switch
4::I/O:A
6::Drive:A7::Drive:B
9::Drive 1
10::Drive 2
11::Drive 3
12::Drive 4
13::Drive 5
...
15::mistralpp.dkrz.de
16::lobster1.dkrz.de
17::mistralpp03.dkrz.de
...
...
15000.0
15000.0
2::Switch
Disk Cache
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 9 / 20
Robot Tape LibraryExample: Model of the SL8500 library with robot hands and elevators.
Rr,i
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 10 / 20
Serpentine Tape ModelEstimating spool and seek times for tape access.
Tseek(posj , posi) = max
(|posix − posjx|
vspool,|posit − posjt|
vhead
)
Tread/write(bytes) =bytes
vread/write
Tbusy = Tmount +
BOT,...,BOT∑posi,posi+1
Tseek(posi, posj) + Tread/write(bytesi)
+ Tunmount
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 11 / 20
Scheduling and Request QueuesChaining specialized request queues makes resource allocation manageable.
Drive
Drive
DriveR1,1Rr,i
R1,1Rr,i
Disk I/O Dirty
Queue
Tape I/O
IN
OUT
Robots
uncached reads
cached read
& writes
serve
writeserve
move tape
service
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 12 / 20
1. Motivation and Background
2. Modeling and Simulation Tape Storage Systems
3. Evaluation
4. Conclusion / Discussion
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 13 / 20
0
20
40
60
Sat Mon Wed Fri Sun Tue Thu Sat Mon Wed Fri
FT
P J
obs
0
20
40
60
Sat Mon Wed Fri Sun Tue Thu Sat Mon Wed Fri
Sta
ges
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 14 / 20
30 drives
0
20
40
60
Sat Mon Wed Fri Sun Tue Thu Sat Mon Wed Fri
Sta
ges
45 drives
0
20
40
60
Sat Mon Wed Fri Sun Tue Thu Sat Mon Wed Fri
Sta
ges
75 drives
0
20
40
60
Sat Mon Wed Fri Sun Tue Thu Sat Mon Wed Fri
Sta
ges
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 15 / 20
30 drives
0
250
500
750
1000
Sat Mon Wed Fri Sun Tue Thu Sat Mon Wed Fri
Num
ber
of J
obs
wait−times< 1m< 2m< 3m< 4m< 5m< 8m< 10m< 15m< 20m< 30m< 1h< 2h< 4h< 8hmore
45 drives
0
10
20
Sat Mon Wed Fri Sun Tue Thu Sat Mon Wed Fri
Num
ber
of J
obs
wait−times< 1m< 2m< 3m< 4m< 5m< 8m< 10m< 15m< 20m< 30m< 1h< 2h< 4h< 8hmore
75 drives
0.0
2.5
5.0
7.5
10.0
12.5
Sat Mon Wed Fri Sun Tue Thu Sat Mon Wed Fri
Num
ber
of J
obs
wait−times< 1m< 2m< 3m< 4m< 5m< 8m< 10m< 15m< 20m< 30m< 1h< 2h< 4h< 8hmore
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 16 / 20
Example: QoS for Total-WaittimeE.g.: How many drives to serve x % of requests in under y minutes.
0.00
0.05
0.10
0.15
0.20
0.25
0 50 100 150 200
duration in seconds
dens
ity
type
read
write
0.00
0.05
0.10
0.15
0.20
0 50 100 150 200
duration in seconds
dens
ity
type
read
write
0.00
0.25
0.50
0.75
1.00
0 50 100 150 200
duration in seconds
y
type
read
write
0.00
0.25
0.50
0.75
1.00
0 50 100 150 200
duration in seconds
y
type
read
write
e.g. fewer vs. more drives
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 17 / 20
Example: Power Consumption EstimatesApproximate power consumption for different configurations.
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 18 / 20
Conclusion and Discussion
SummaryI Potential helper for procurement of HSM and tape systemsI Enabler for open research otherwise only performed by vendorsI Depth of stack makes development of DES very complicated
Future WorkI Fine-tuning of various simulation parametersI Conducting experiments related drive and tape placement, LTO
generation, disk cache capacities and RAIT deploymentsI Comparison with different deployments at other sites
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 19 / 20
Bibliography I
Fontana, R. E., Decad, G. M., and Hetzler, S. R. (2013). The Impact of Areal Density and Millions ofSquare Inches ( MSI ) of Produced Memory on Petabyte Shipments of TAPE , NAND Flash , andHDD Storage Class Memories. Proceedings of the 29th IEEE Symposium on Massive Storage Systems andTechnologies.
IBM (2011a). High Performance Storage System. Technical report.
IBM (2011b). IBM System Storage TS3500 Tape Library Connector and TS1140 Tape Drive support forthe IBM TS3500 Tape Library. pages 1–15.
Oracle (2015). StorageTek SL8500 Modular Library System User’s Guide.
Quantum (2015). Quantum Scalar i6000 Datasheet.
Spectralogic (2016a). LTO Roadmap. https://www.spectralogic.com/features/lto-7/. [Online; accessed2016-01-24].
Spectralogic (2016b). Spectralogic TFinity - Enterprise Performance.https://www.spectralogic.com/products/spectra-tfinity/tfinity-features-enterprise-performance/.[Online; accessed 2016-02-12].
Sun (2006). StorageTek StreamLine SL8500 - User Guide. (96154).
Jakob Luttgau DKRZ Modeling and Simulation of Tape Libraries June 22, 2017 20 / 20