HPSS Best Practices Erich Thanhardt Bill Anderson Marc Genty B

HPSS Best Practices
Erich Thanhardt
Bill Anderson
Marc Genty
B

Overview
● Idea is to “Look Under the Hood” of HPSS to help you better understand Best Practices
  ○ Expose you to concepts, architecture, and tape tech
  ○ Cite Best Practices in context along the way
  ○ Talk ends with references to further resources
● Talk is interactive, please ask questions along the way

HPSS - What is it?

● Acronym
● Stands for High Performance Storage System
● “HPSS is software that manages petabytes of data on disk and robotic tape libraries”.
  ■ Quoted from: http://www.hpss-collaboration.org

HPSS - What makes it different?

● Hardware: Use of tape technology is a distinguishing characteristic of HPSS

● Use case: HPSS is an archive and not a (parallel) file system
  ○ system is remote, not cross mounted
  ○ operation set is limited to metadata and file transfers

Best Practice: Be aware of what makes HPSS very different from GLADE - its intended use

HPSS Main Use Cases

● Archive
  ○ Data is stored and preserved indefinitely
    ■ While system components come and go
    ■ Model data and observational data collections
● Disaster Recovery
  ○ Leverage dual sites for geographic separation
  ○ Additional level of archival preservation

HPSS Software Architecture

[Diagram labels: HPSS End User, Linux/Unix Host, HSI/HTAR Client, Client Interface (CLI), 4x Gateway Servers, Gateway, AUTH (Authentication), Control, Metadata, DATA, HPSS]

HPSS Software Architecture

● Best Practice: Reporting errors via EV ticket
  ○ include: name, host, datetime, -d4 error tracing
  ○ authentication problems
  ○ those pesky parallel file transfer limits
    ■ your guaranteed on-ramp to the system
    ■ “data bandwidth” allocation
    ■ will be increasing over the next few months

HPSS Software Architecture

● Best Practice: Validating that a file was written
  ○ “ls -l” both locally and on HPSS
  ○ compare pathname and size
  ○ not sufficient to see the pathname (ls)
● Here is what can happen:
  ○ Creating pathname in HPSS happens first
  ○ Then data transfer between client and HPSS
  ○ That transfer can be interrupted
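
The size comparison described above could be scripted. The following is only a minimal sketch: it assumes the HPSS listing line uses the common `ls -l` column layout with the size in the fifth field, which should be checked against real output before relying on it. The helper names and the sample line in the usage note are hypothetical.

```python
import os


def size_from_ls_line(line: str) -> int:
    """Extract the size from one `ls -l`-style line (assumption: 5th column)."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"not an ls -l style line: {line!r}")
    return int(fields[4])


def sizes_match(local_size: int, hpss_ls_line: str) -> bool:
    """True when the local size equals the size listed for the HPSS copy."""
    return local_size == size_from_ls_line(hpss_ls_line)


def local_file_matches(local_path: str, hpss_ls_line: str) -> bool:
    """Same check, reading the local size from the file system."""
    return sizes_match(os.path.getsize(local_path), hpss_ls_line)
```

For example, `sizes_match(1048576, "-rw-r----- 1 user proj 1048576 Sep 12 2014 run01.nc")` is true, while a listing that shows the pathname with a smaller size indicates an interrupted transfer.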

HPSS - One System/Two Sites

[Diagram: one HPSS system spanning two sites, NWSC (Cheyenne) and MLCF (Boulder). Labels: Oracle SL8500 Tape Library, Oracle Tape Drives + Media, Disk Cache, ARCHIVE, DISASTER RECOVERY]

HPSS Libraries - Oracle SL8500

HPSS Tape Libraries Frontal View

[Diagram labels: ACSLS Server, MLCF, SL8500 Tape Library]

HPSS Libraries Top View

Tape Library

HPSS Libraries - Photos

ORACLE DRIVE & MEDIA

Small File Problem
● Cost of a random read:
  ○ Robot retrieval, mount, seek: 70 secs to avg file
  ○ Transfer data rate: 240 MB/sec
  ○ 184 MB file means 99% latency 1% transfer
● Cost of returning tape
  ○ Double it - indirect cost to you
  ○ 368 MB file means 99% latency 1% transfer
● Compare these with avg filesize of 166 MB
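
The percentages above follow from simple arithmetic, sketched here using the slide's figures (70 seconds of latency, 240 MB/sec transfer rate). The function name is illustrative, and the model ignores everything except fixed latency plus streaming transfer.

```python
def latency_fraction(file_mb: float,
                     latency_s: float = 70.0,
                     rate_mb_s: float = 240.0) -> float:
    """Fraction of a tape read spent on robot retrieval, mount, and seek."""
    transfer_s = file_mb / rate_mb_s
    return latency_s / (latency_s + transfer_s)


# Print the latency share for a few file sizes, including the 166 MB average.
for size_mb in (166, 184, 1_000, 16_800):
    print(f"{size_mb:>6} MB: {latency_fraction(size_mb):.1%} latency")
```

Under this model, latency and transfer time are equal only at 70 × 240 = 16,800 MB, so a file has to be roughly 17 GB before half of the read time is spent moving data.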

Small File Problem
● Best Practice: best is to avoid small files, but where needed - aggregate with htar
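
One way to plan such aggregation is to group small files into batches of a target size and write each batch as one htar archive. This is a sketch under assumptions: the target size, the file names, and the archive path are hypothetical, and the command line assumes htar's tar-like `-cvf` form, which should be confirmed against the local htar documentation. The code only builds the command; it does not run it.

```python
def batch_by_size(files, target_bytes):
    """Greedily group (name, size) pairs into batches of at least target_bytes.

    The last batch may be smaller than the target.
    """
    batches, current, current_size = [], [], 0
    for name, size in files:
        current.append(name)
        current_size += size
        if current_size >= target_bytes:
            batches.append(current)
            current, current_size = [], 0
    if current:
        batches.append(current)
    return batches


def htar_command(archive_path, members):
    """Build (not run) an htar create command for one batch (assumed syntax)."""
    return ["htar", "-cvf", archive_path, *members]
```

For instance, `batch_by_size([("a", 40), ("b", 70), ("c", 30), ("d", 10)], 100)` yields two batches, `["a", "b"]` and `["c", "d"]`, each of which would become a single archive.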

File Deletion
● Deleting files
  ○ Deleting data on tape creates unusable spaces on tape because it’s linear and continuous
  ○ Mischaracterizations and system data migrations
● Best Practice - delete un-needed files but also avoid temporary files (whether rewriting or create/deletes)

Repeated Reads and Writes
● Best Practice: avoid both repeated reads from and repeated writes to an archive file - bring the file out and park it somewhere else

File Rescue
● Adopting orphaned files from others
  ○ user/proj combo goes invalid after period of time
  ○ someone needs to take ownership and pay storage costs
● Best Practice - never use “cp” to copy data internally in order to move it if you don’t have proper permissions - open ticket

Optimizing Reads
● Best Practice - if you are reading back data at large scales, contact Helpdesk at [email protected] for ways to order your requests - it can be done!
● Process is not perfect but usually has a positive effect
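
The slide does not say how requests get ordered, so the following is only an illustration of the general idea behind tape-aware ordering, not CISL's procedure. It assumes each request comes with a tape identifier and a position on that tape; both are hypothetical inputs that users may not be able to see, which is one reason to contact the Helpdesk.

```python
from itertools import groupby
from operator import itemgetter


def order_requests(requests):
    """Group read requests by tape and sort each group by position.

    requests: iterable of (path, tape_id, position) tuples (hypothetical data).
    Returns a list of (tape_id, [paths in ascending position]).
    """
    ordered = sorted(requests, key=itemgetter(1, 2))
    return [
        (tape, [path for path, _, _ in group])
        for tape, group in groupby(ordered, key=itemgetter(1))
    ]
```

Reading all files from one tape in a single pass avoids mounting the same tape repeatedly, which is where most of the per-file latency comes from.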

Storage Hierarchy Concept

[Diagram: storage hierarchy with levels CPU, Memory, Disk, Tape]

Attributes of Storage Hierarchy
● Cost & Characteristics
  ○ Speed & Capacity
  ○ Persistence & Reliability
    ■ hardware, RAID/RAIT, dual copy
  ○ Availability
    ■ online/nearline/offline
  ○ Location
    ■ onsite/offsite

HPSS Storage Pyramid

[Diagram: pyramid with two tiers, Disk (DISK CACHE) and Tape (TAPE LIBS, ROBOTICS, DRIVES & MEDIA)]

Hierarchical Storage Manager (HSM)

[Diagram: DISK and TAPE tiers with the operations Stage, Migrate, and Purge]
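
The three HSM operations can be illustrated with a toy model. This is a sketch of general HSM semantics, not of HPSS internals: it assumes new files land in the disk cache, migrate copies a file from disk to tape, purge frees the disk copy once a tape copy exists, and stage copies a file from tape back to disk. The class and method names are hypothetical.

```python
class ToyHSM:
    """Toy two-tier store tracking which files are on disk and on tape."""

    def __init__(self):
        self.disk = set()
        self.tape = set()

    def write(self, name):
        """New files land in the disk cache (assumption)."""
        self.disk.add(name)

    def migrate(self, name):
        """Copy a file from disk to tape; the disk copy remains until purged."""
        if name not in self.disk:
            raise KeyError(f"{name} is not in the disk cache")
        self.tape.add(name)

    def purge(self, name):
        """Free the disk copy, but only if a tape copy already exists."""
        if name not in self.tape:
            raise RuntimeError(f"{name} has no tape copy; refusing to purge")
        self.disk.discard(name)

    def stage(self, name):
        """Copy a file from tape back into the disk cache."""
        if name not in self.tape:
            raise KeyError(f"{name} is not on tape")
        self.disk.add(name)

    def read(self, name):
        """Serve from disk if present, otherwise stage from tape first."""
        if name in self.disk:
            return "disk"
        self.stage(name)
        return "tape (staged)"
```

In this model a read of a purged file has to wait for a stage from tape, while a write only has to reach disk.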

User Interaction with HPSS

[Diagram: DISK and TAPE tiers with the operations Stage, Migrate, and Purge]

Basic Stats Jun-Aug 2014
● Writes/Reads ratio ~4-5 to 1
● User response times
  ○ ~116 sec/read vs. ~9-10 sec/write
  ○ ratio read/write response times ~ 13 to 1

Tape Technology Upgrades

[Diagram labels: DISK, TAPE, Stage, Migrate, Purge, and a second Migrate]

Data Services Pyramid - Workflow

[Diagram: pyramid with GLADE/GPFS (PFS) and HPSS (Archive, DR); bandwidth labels 90 GB/sec and 9 GB/sec]

Workflow - Optimal
➔ Create data on GLADE/GPFS
➔ Post process (new data plus deletes)
➔ Commit data selectively to HPSS
➔ Best Practice!

Workflow - Realistic
➔ Create data on GLADE/GPFS
➔ Commit to HPSS (back it up)
➔ Post process (new data)
  ◆ Commit post-processed data (selectively?) to HPSS

Workflow - To Avoid
➔ Create data on GLADE/GPFS
➔ Commit to HPSS (back it up)
➔ Delete from GLADE/GPFS
➔ …. time passes
➔ Stage from HPSS back to GLADE/GPFS
➔ …. process staged data

Workflow - To Avoid

➔ Create data on GLADE/GPFS
➔ Commit to HPSS (back it up)
➔ Delete from GLADE/GPFS
➔ …. time passes
➔ Stage from HPSS back to GLADE/GPFS
➔ …. process staged data

BEST PRACTICE - contact [email protected]

Additional Resources
● CISL Support & Allocations
  ○ Helpdesk & CISL Consulting
    ■ send email to [email protected]
● HPSS Documentation
  ○ http://www2.cisl.ucar.edu/docs/hpss
● Best Practices doc
  ○ http://www2.cisl.ucar.edu/docs/best_practices

The End