
HPSS Collaboration Development Executive Committee Overview

Dick Watson
Lawrence Livermore National Laboratory
925-588-4194
dwatson@llnl.gov

Prepared for the HPSS User Forum 2017, Tsukuba, Japan, October 16 – 20, 2017.

LLNL-PRES-739016

Development Partners

HPSS Web Site URL: www.hpss-collaboration.org

Lawrence Livermore National Laboratory

Los Alamos National Laboratory

National Energy Research Scientific Computing Center

Oak Ridge National Laboratory

Sandia National Laboratories

IBM


Disclaimer

Forward-looking information, including schedules and future software, reflects current planning that may change and should not be taken as a commitment by IBM or the other members of the HPSS Collaboration.


Outline

• Review: history of the HPSS Collaboration and current status

• Roles of the HPSS EC and TC

• DOE lab five-year HPC plans

• HPSS Collaboration five-year plans
  – Release planning
  – DOE six initiatives
  – Treefrog
  – HTAR/HSI
  – DSI

• Questions/discussion


History of the HPSS Collaboration

• In the early 1990s there was no COTS or other scalable archive/HSM on the market, or under development, that met the projected requirements of terascale HPC (now petascale, and tomorrow exascale) and other large-scale, data-intensive storage applications.

  – Many more storage management options are available today than 25 years ago, yet when we look at our requirements for an on-premises solution (security/performance/cost) and for scalability in several dimensions, such as single-file and total throughput transfer rates, the same situation still holds today.

• The National Labs had extensive experience building and deploying mass storage systems, and IBM had been doing research related to these systems.

• The IEEE Mass Storage Reference Model – modular, scalable, and distributable – was developed in the late 1980s and published in the 1990s, the work of IBM and other vendors, the National Labs, and other government labs.

• HPSS – a joint DOE Lab/IBM development collaboration utilizing all partners' strengths and experience – started in mid-1992.

  – Over 20 organizations, including industry, the Department of Energy (DOE), other federal laboratories, universities, National Science Foundation (NSF) supercomputer centers, and the French Commissariat à l'Énergie Atomique (CEA), have contributed to various aspects of this effort.


HPSS Collaboration Today

• After twenty-five years, the collaboration is going strong.
  – Software development and releases of new capabilities and fixes continue.
  – HPSS is the best-of-breed scalable tape data repository for HPC and big data.
  – HPSS disk capacity used as a cache is highly scalable.
  – HPSS is well positioned to exploit new storage technology and configurations.

• All major DOE HPC facilities rely upon HPSS as their primary archival storage solution. HPSS supports universities, weather and climate data centers, nuclear energy programs, defense systems, and research labs at ~40 sites worldwide.


Largest HPSS sites worldwide (2Q 2017)

Per single HPSS system and namespace (Petabytes | Millions of files):

  (ECMWF) European Centre for Medium-Range Weather Forecasts | 335.88 | 317.22
  (UKMO) United Kingdom Met Office | 199.56 | 195.72
  (NOAA-RD) National Oceanic and Atmospheric Administration Research & Development | 120.90 | 89.24
  (BNL) Brookhaven National Laboratory | 119.46 | 134.45
  (LBNL-User) Lawrence Berkeley National Laboratory - User | 113.01 | 220.41
  (Meteo-France) Meteo France | 92.85 | 355.73
  (CEA TERA) Commissariat a l'Energie Atomique - GENO | 83.54 | 17.79
  (NCAR) National Center for Atmospheric Research | 83.05 | 272.85
  (MPCDF) Max Planck | 78.13 | 168.42
  (ORNL) Oak Ridge National Laboratory | 74.66 | 77.97
  (LANL-Secure) Los Alamos National Laboratory - Secure | 73.55 | 716.06
  (LLNL-Secure) Lawrence Livermore National Laboratory - Secure | 69.40 | 913.45
  (DKRZ) Deutsches Klimarechenzentrum | 63.69 | 18.77

*List is based on publicly disclosed HPSS sites. Rank is based on data stored.


HPSS is Best-of-Breed Scalable Tape

• HPSS enables faster tape reads and writes.
  – Striped tape exceeds single-tape performance.
  – Small-file tape aggregation can deliver near-native tape transfer rates (the idea is sketched after this list).
  – Recommended Access Order (RAO) tape recalls can improve the seek time between files by 40% to 60%.

• HPSS cuts redundant tape costs with RAIT.

• HPSS identifies data corruption, and redundant tape minimizes loss.

• HPSS maximizes automated tape library mount rates.
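Below is a minimal sketch of the idea behind small-file tape aggregation, assuming a plain tar file as the aggregate container and illustrative function names; it is not HPSS's actual aggregation mechanism. Many small files become one large sequential write, and a single member can still be recalled without unpacking everything.

    import os
    import tarfile

    def build_aggregate(small_files, aggregate_path):
        """Bundle many small files into one aggregate, written as a single sequential stream."""
        with tarfile.open(aggregate_path, "w") as agg:   # uncompressed tar as the aggregate container
            for path in small_files:
                agg.add(path, arcname=os.path.basename(path))
        return aggregate_path

    def recall_member(aggregate_path, member_name):
        """Recall one member from the aggregate rather than staging each small file separately."""
        with tarfile.open(aggregate_path, "r") as agg:
            return agg.extractfile(member_name).read()

Writing one aggregate keeps the tape drive streaming at near-native rate, which is the effect described in the bullet above.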


HPSS Well Positioned for Exascale

• HPSS has continuously evolved and remained world-class, demonstrating production scalability in I/O rates and amount of data stored by factors of 1,000s since it went into production about 21 years ago.
  – Continued scaling of I/O rates for the exascale era is primarily a matter of providing adequate and balanced network and storage-device bandwidth.

• The HPSS architecture pioneered the separation of data and metadata I/O and storage.
  – The extensibility of the architecture is evident as HPSS continues to scale.

• HPSS supports striped files and parallel file I/O, and is cloud-enabled through OpenStack Swift.

• Adding new classes of storage devices that may emerge is straightforward. We are currently putting solid-state disk into production for metadata storage.
  – Tape remains a viable archive medium. Industry continues research and development to improve tape media bit density and transfer rate.


HPSS Well Positioned for Exascale

• The HPSS metadata engine is a scalable COTS relational database management system (IBM Db2).

• Small-file performance is supported through explicit (user-directed) and implicit (system-supported) small-file aggregation, as well as continuously improving metadata performance through Db2 partitioning and off-node Db2 partitions.

• Because HPSS metadata is built on an RDBMS, capabilities exist for user-level metadata storage (User-Defined Attributes) which, while not yet fully leveraged, are currently used by various utilities and are expected to be key to enabling user optimization of data-set management (a conceptual sketch follows at the end of this slide).

• A commercial multi-exabyte HPSS solution is currently being deployed.
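The sketch below illustrates the user-defined-attribute concept only: per-file key/value metadata kept in a relational store and queried to locate archived data. SQLite and the table/column names here are illustrative stand-ins; HPSS keeps its metadata and UDAs in Db2 with its own schema.

    import sqlite3

    # In-memory database as a stand-in for the archive's metadata RDBMS.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE uda (path TEXT, attr_key TEXT, attr_value TEXT)")

    def set_attr(path, key, value):
        """Attach a user-defined attribute to an archived file."""
        conn.execute("INSERT INTO uda VALUES (?, ?, ?)", (path, key, value))

    def find_by_attr(key, value):
        """Search the namespace by a user-defined attribute."""
        rows = conn.execute(
            "SELECT path FROM uda WHERE attr_key = ? AND attr_value = ?", (key, value))
        return [r[0] for r in rows]

    set_attr("/hpss/project/run42/output.h5", "experiment", "run42")
    set_attr("/hpss/project/run42/output.h5", "instrument", "spectrometer-b")
    print(find_by_attr("experiment", "run42"))   # -> ['/hpss/project/run42/output.h5']

This kind of attribute-driven lookup is what the archive-searchability work described later builds toward.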


HPSS Well Positioned to Support Scalable Project and Campaign Storage Solutions

• Top tiers of HPC storage do not require HPSS management.

• Top-tier data objects, files, and data sets will eventually find their way to HPSS to rest on tape or other scalable, high-latency storage media.

• Multiple methods can be used to move data between a top tier (not managed by HPSS) and HPSS storage.
  – One method under development by the HPSS Collaboration is Treefrog.

• Organizing and aggregating project objects and files into data sets will enable moving more data with fewer metadata and data-repository operations (see the sketch below).
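A minimal sketch of the data-set idea: one manifest describes many member files, so a whole project data set can be archived or recalled with a single repository operation instead of one operation per file. The field names and paths are illustrative assumptions; Treefrog's actual format and interfaces are not shown here.

    import json
    import pathlib

    def build_manifest(dataset_name, member_dir):
        """Describe every file in a project directory with one data-set record."""
        members = sorted(str(p) for p in pathlib.Path(member_dir).rglob("*") if p.is_file())
        return json.dumps({
            "dataset": dataset_name,
            "member_count": len(members),
            "members": members,        # one record covering every member file
        }, indent=2)

    # Example (placeholder path): one archive request for the whole set,
    # rather than one request per member file.
    # print(build_manifest("climate-run-2017q2", "/scratch/project/climate-run-2017q2"))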


Development Executive Committee (EC)

• Development EC – one member from each of the DOE labs and IBM.
  – Generally meets by telecon weekly.
  – Two TC members also join the meetings: the TC chair and the TC/EC liaison.

• Development EC members
  – IBM – Bob Coyne, Co-chair
  – Lawrence Livermore National Laboratory – Dick Watson, Co-chair
  – Los Alamos National Laboratory – Brett Hollander
  – Oak Ridge National Laboratory – Sudharshan Vazhkudai
  – National Energy Research Scientific Computing Center – Damian Hazen
  – Sandia National Laboratories – John Noe

• Development EC role
  – The Executive Committee is responsible for commercialization and deployment decisions; HPSS design approval, function, and content; HPSS promotion and publicity; HPSS proposal support; and final project decisions.


Technical Committee (TC)

• TC – one member from each of the DOE labs and IBM.
  – Generally meets by telecon weekly.
  – Subject-matter experts join the TC telecons when needed.

• TC members
  – Ramin Nosrat, Chair – IBM
  – Michael Meseke, attends meetings as Development Manager – IBM
  – Lawrence Livermore National Laboratory – Geoff Cleary
  – Los Alamos National Laboratory – David Sherrill
  – Oak Ridge National Laboratory – Vicky White
  – National Energy Research Scientific Computing Center – Damian Hazen
  – Sandia National Laboratories – Susan McRee

• TC role
  – The Technical Committee is responsible for resource/personnel allocation; project scheduling; requirements generation and analysis; design, code, test, and documentation; defining standards; and other technical matters.


DOE High-performance Computing Plans

• The US Department of Energy has two main organizations involved in HPC: the Office of Science (DOE-SC) and the National Nuclear Security Administration (NNSA).

• The HPSS Collaboration DOE development partners are:
  – For DOE-SC, Oak Ridge National Laboratory (ORNL) and Lawrence Berkeley National Laboratory's National Energy Research Scientific Computing Center (NERSC).
  – For NNSA, Los Alamos National Laboratory (LANL), Lawrence Livermore National Laboratory (LLNL), and Sandia National Laboratories (SNL).


DOE High-performance Computing Plans

• The Advanced Scientific Computing Research (ASCR) labs (including Argonne National Laboratory (ANL)) and the National Nuclear Security Administration (NNSA) labs are organized into two teams for the procurement of capability/leadership Advanced Technology Systems (ATS) – the fastest systems that can be built for given power and dollar budgets.
  – The Collaboration of ORNL, ANL, and LLNL (CORAL) is one team; LANL, SNL, and NERSC (the Alliance for Application Performance at Extreme Scale, APEX) are the second.

• In addition, the NNSA labs also procure commodity HPC (capacity) systems in increments called scalable units.

• The ASCR labs have different DOE roles and thus different strategies for their mix of users and systems.
  – NERSC ATS deployments will support the entire spectrum of DOE Office of Science research at NERSC.
  – ORNL will have both an ATS and a small number of commodity systems.


Advanced Technology Systems – The Leading Edge of Simulation

[Figure: progression of leading-edge system performance across generations, from kiloflops and megaflops through gigaflops and teraflops to 1s – 100s of petaflops.]


DOE Platform Plan

• Advanced Technology Systems (ATS)
  – Every three years, staggered between the two teams.
• NNSA Commodity Technology Systems (CTS)
  – Every four years, "coordinated" by LLNL.

[Figure: platform timeline across fiscal years '12 – '21, showing development/deployment, system delivery, use, and retirement phases for each system.]

• ATS procurements: Cielo (LANL/SNL) and Hopper (NERSC); Sequoia (LLNL) and Titan (ORNL), different architectures; ATS-1 – Trinity (LANL/SNL) and Cori (NERSC); ATS-2 – Sierra (LLNL) and Summit (ORNL); ATS-3 – Crossroads (LANL/SNL) and NERSC-9.
• CTS procurements: NNSA Tri-lab Linux Capacity Cluster II (TLCC II), NNSA CTS-1, NNSA CTS-2.
• Included in the DOE platform strategy is support for:
  – Programming methodology and algorithms
  – Application transitions
• ATS-4 and ATS-5 systems beyond 2020 are in discussion; Exascale probably early 2020s.


LLNL Sequoia compute platform

• Third-generation IBM BlueGene
• 20 PF/s
• 1.6 PB memory
• 1.6M cores
• 60 TB/s bi-section bandwidth
• 9.6 MW power, 4,000 ft²
• Hybrid cooled
• 50 PB file system
• File system I/O throughput goal > 1 TB/s
• Generally available July 2013


ORNL's 27 Petaflop Titan System

• Installed in 2012
• Designed for science from the ground up
• Similar number of cabinets (200), cabinet design, and cooling as Jaguar
• System architecture:
  – Cray XK7, 18,688 nodes with 16 AMD cores and 1 NVIDIA K20X Kepler GPU per node
  – Memory: 710 TB of system memory, with 32 GB of CPU memory and 6 GB of GPU memory per node
  – Gemini 3-D torus interconnect with advanced synchronization features
• Operating system upgraded to Linux
• 27 PF peak performance
• Spider storage: a Lustre-based parallel file system, 32 PB and 1 TB/s performance
• No. 4 on the Top500 supercomputer list


Trinity/Cori High-Level Architecture

Trinity is deployed at the LANL/SNL Alliance for Computing at Extreme Scale (ACES), and Cori is deployed at NERSC.

("Cori" will be a 27 PF configuration.)


LLNL Sierra and ORNL Summit Systems
Same Architecture, Different Configurations

Sierra/Summit is a heterogeneous IBM POWER9, NVIDIA Volta architecture. (The numbers below are those publicly announced for Summit; Sierra will be similar, but with fewer nodes.)

• Components: IBM POWER with NVLink; NVIDIA Volta with HBM and NVLink; Mellanox® interconnect, dual-rail EDR InfiniBand®, 23 GB/s
• Compute node: 2 POWER9® architecture processors, 6 NVIDIA® Volta™ accelerators, >40 TF, NVMe-compatible PCIe 1.6 TB SSD, >512 GB DDR4 + 96 GB HBM, coherent shared memory
• Compute rack: standard 19", warm-water cooling
• Compute system: ~4,600 nodes, >10 PB DDR4 + HBM + NV memory, ~150 – 200 PFLOPS, ~15 MW peak
• IBM Spectrum Scale (GPFS)™ file system: 250 PB usable storage, 2.5 TB/s bandwidth


• Gem storage only
  – Explicit movement to/from disk and tape
    • Exploring how we might use FUSE for cached data for implicit movement
  – Large HPSS disk cache (6 PB OCF, 9 PB SCF) – users' files remain resident about 1 year in the OCF and 1.3 years in the SCF
  – Quotas
  – Estimate for 2017: will store ~5.5 PB/yr OCF, ~7 PB/yr SCF
  – Estimate 90% of data are large files (>256 MB); 90% of files are small files (<1 MB)

• Archives are forever
  – Outlive vendors, OSs, file systems, technologies, users, …
  – Protect billions of dollars of data investment
  – Risk-averse
  – Only a small percentage of data is ever retrieved
    • ~10% OCF, ~5% SCF
  – 48 years and counting
  – Tape

[Figure: site storage hierarchy. The user's application accesses Home/Project/Scratch (NFS – 4.2 PB & 2.2 PB), parallel file systems (Lustre – 44 PB & 85 PB, 60-day purge policy), and the archive (HPSS – 41 PB & 69 PB) via FTP, NFT, HTAR, PFTP, …; total-throughput data rates 30 GB/s & 30 GB/s.]


HPSS Collaboration Five-Year Plans

• The approach to release planning involves:
  – Evaluating the current set of Change Requests (CRs) and setting a timeframe for the release.
  – Making resource and timing estimates from high-level CR/bug designs.
  – Prioritizing CRs against Burning Issues, IBM business needs, and lab programmatic requirements.
  – The EC selects the set of features to go into the next release.

• V 7.5.1 upgrade planning and Burning Issues: talks by Jonathan Procknow.

• HPSS roadmap, including V 7.5.2 details and timing: talk by Michael Meseke.

• The TC is currently looking at possible 7.5.3 features.
• The process for the next release will then start.


DOE Six Initiatives

• Implement scale-out metadata capabilities, such as those provided by a multiple-core-server architecture.

• Provide a transparent file system interface capable of prohibiting abusive archive (e.g., tape) use/access (the gating idea is sketched after this list).

• Provide a transparent object storage interface capable of prohibiting abusive archive (e.g., tape) use/access.

• Provide enhanced archive searchability, including metadata extensions and indexing.

• Provide UIs or utilities to replace HSI/HTAR/DSI, etc.

• Leverage more COTS products and capabilities within HPSS (e.g., using a file system such as ZFS as part of HPSS).
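Below is a minimal sketch of the "non-abusive" interface idea from the list above, assuming an illustrative gateway class and thresholds: requests pass through a policy check before reaching the archive, and patterns that would hammer tape (for example, floods of tiny reads) are refused. It is a conceptual illustration, not an HPSS interface or policy.

    import time
    from collections import deque

    SMALL_READ_BYTES = 1 << 20        # reads below 1 MiB count as "small" (illustrative)
    MAX_SMALL_READS_PER_MINUTE = 30   # illustrative per-user budget

    class ArchiveGateway:
        def __init__(self, backend_read):
            self.backend_read = backend_read     # function that actually recalls the data
            self.recent_small_reads = {}         # user -> timestamps of recent small reads

        def read(self, user, path, size):
            now = time.time()
            window = self.recent_small_reads.setdefault(user, deque())
            while window and now - window[0] > 60:   # keep only the last minute
                window.popleft()
            if size < SMALL_READ_BYTES:
                if len(window) >= MAX_SMALL_READS_PER_MINUTE:
                    raise PermissionError("small-read budget exceeded; aggregate or stage first")
                window.append(now)
            return self.backend_read(path, size)

A transparent file system or object interface built this way lets users see the archive as ordinary storage while the gateway protects the tape tier behind it.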


Future Architectural Changes

• The DOE development labs agreed to move HPSS forward with incremental change releases. Each lab will propose incremental CRs to move forward in areas of interest. Some examples:
  – ORNL has archive searchability as a high priority and is developing a rich indexing infrastructure based on user-defined metadata attributes.
  – LLNL has scale-out metadata capability as a high priority for exascale. Movement in this direction (the partitioning idea is sketched after this list):
    • 7.5.1: Db2 partitioning.
    • 7.5.2: Off-node Db2 partitions.
    • 7.5.3: Improved Name Server caching mechanisms/performance.
  – LLNL is interested in a non-abusive transparent file system interface and is studying FUSE improvements and performance to support this goal.
  – NERSC and ORNL are planning to work with Globus to extend the GridFTP DSI for HPSS.
  – LANL, with IBM assistance, prototyped using ZFS as a device (to be included in 7.5.2).

• IBM, with input from Collaboration members, is working on Treefrog (separate talk by Alan Giddens).
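The following is an illustrative sketch of the scale-out-metadata idea referenced above: hash the namespace so that metadata operations for different files land on different database partitions and can proceed in parallel. The partition count and routing function are assumptions for illustration; HPSS's actual Db2 partitioning scheme is not shown.

    import hashlib

    NUM_PARTITIONS = 4   # e.g., one partition per metadata node (illustrative)

    def partition_for(path):
        """Route a namespace entry to a partition by hashing its path."""
        digest = hashlib.sha1(path.encode()).digest()
        return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS

    # Stand-ins for per-partition metadata tables.
    partitions = [dict() for _ in range(NUM_PARTITIONS)]

    def insert_metadata(path, record):
        partitions[partition_for(path)][path] = record

    def lookup_metadata(path):
        return partitions[partition_for(path)].get(path)

    insert_metadata("/hpss/users/alice/run1.dat", {"size": 123, "cos": 10})
    print(partition_for("/hpss/users/alice/run1.dat"),
          lookup_metadata("/hpss/users/alice/run1.dat"))

Because routing depends only on the path, clients and servers agree on where each entry lives, which is what lets metadata throughput grow by adding partitions (including off-node partitions, as in 7.5.2).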


User Interfaces

• HSI/HTAR is being transitioned from a Gleicher Enterprise LLC-supported product to an HPSS Collaboration-supported product.
  – Separate talk by Nick Balthaser of NERSC.

• The future of Globus GridFTP support and of the Data Storage Interface (DSI) to HPSS is under discussion between the HPSS Collaboration and Globus (a transfer sketch follows below).
  – Globus has an open-source server with a command-line interface, and a subscription service with a web interface and other advanced features.
  – A question for the HUF community: which sites are interested in which of these Globus capabilities?
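For context, this is a sketch of what driving a transfer to an HPSS-backed endpoint can look like from Python, assuming the Globus Python SDK (globus_sdk); the access token and endpoint UUIDs are placeholders, and the HPSS side is assumed to be an endpoint exposed through the GridFTP DSI discussed above.

    import globus_sdk

    ACCESS_TOKEN = "REPLACE_WITH_TRANSFER_TOKEN"            # placeholder credential
    SRC_ENDPOINT = "aaaaaaaa-0000-0000-0000-000000000000"   # placeholder source endpoint UUID
    HPSS_ENDPOINT = "bbbbbbbb-0000-0000-0000-000000000000"  # placeholder HPSS/DSI endpoint UUID

    # Authenticate and submit one asynchronous transfer task.
    tc = globus_sdk.TransferClient(authorizer=globus_sdk.AccessTokenAuthorizer(ACCESS_TOKEN))
    tdata = globus_sdk.TransferData(tc, SRC_ENDPOINT, HPSS_ENDPOINT, label="archive run output")
    tdata.add_item("/scratch/project/run42/output.h5", "/hpss/project/run42/output.h5")
    task = tc.submit_transfer(tdata)
    print("submitted transfer task", task["task_id"])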


Questions and Discussion


Acknowledgement

This work was, in part, performed by the Lawrence Livermore National Laboratory, Los Alamos National Laboratory, Oak Ridge National Laboratory, National Energy Research Scientific Computing Center, and Sandia National Laboratories under the auspices of the U.S. Department of Energy, and by IBM Global Services.

Lawrence Livermore National Laboratory is operated by Lawrence Livermore National Security, LLC, for the U.S. Department of Energy, National Nuclear Security Administration, under Contract DE-AC52-07NA27344.

Disclaimer

This document was prepared as an account of work sponsored by an agency of the United States government. Neither the United States government nor Lawrence Livermore National Security, LLC, nor any of their employees makes any warranty, expressed or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States government or Lawrence Livermore National Security, LLC. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States government or Lawrence Livermore National Security, LLC, and shall not be used for advertising or product endorsement purposes.