Convergence between HPC and Big Data: The Day After Tomorrow
Maxime Martinasso
Swiss National Supercomputing Centre (CSCS / ETH Zurich)
Supercomputing 2018


  • Piz Daint and the User Lab

    Model: Cray XC40/XC50
    XC50 compute nodes: Intel® Xeon® E5-2690 v3 @ 2.60GHz (12 cores, 64 GB RAM) and NVIDIA® Tesla® P100 16GB
    XC40 compute nodes: Intel® Xeon® E5-2695 v4 @ 2.10GHz (18 cores, 64/128 GB RAM)
    Interconnect configuration: Aries routing and communications ASIC, Dragonfly network topology
    Scratch capacity: ~9 + 2.7 PB

  • New requirements to access HPC hardware

    § Definition of users and their capabilities
      § Workflows, scientific devices, web portals
      § User-defined software stacks

    § Connectivity of HPC
      § Increasing connectivity of compute nodes (Internet, 3rd-party services)
      § HPC service needs an interface to be used by other services (see the sketch after this slide)

    § Interactivity
      § Jupyter notebooks (JHub, JLab, …)
      § Community portal

    § Data management
      § Data mover/broker service
      § Select physical storage: SSD and in-memory on compute nodes, Scratch, Archive

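    A minimal sketch of what the "interface to be used by other services" requirement could
    look like, assuming a hypothetical REST gateway at https://hpc-api.example.org with a
    /jobs endpoint and bearer-token authentication; none of these names come from the slides.

        import time
        import requests

        API = "https://hpc-api.example.org"            # hypothetical gateway URL
        HEADERS = {"Authorization": "Bearer <access token>"}

        # A third-party service submits a batch job as a JSON document instead of
        # logging in over SSH and writing a scheduler script by hand.
        job = {
            "script": "#!/bin/bash\nsrun ./simulation --steps 1000\n",
            "nodes": 4,
            "time_limit": "01:00:00",
        }
        job_id = requests.post(f"{API}/jobs", json=job, headers=HEADERS).json()["job_id"]

        # Poll the same interface until the scheduler reports a terminal state.
        state = "PENDING"
        while state not in ("COMPLETED", "FAILED", "CANCELLED"):
            time.sleep(30)
            state = requests.get(f"{API}/jobs/{job_id}", headers=HEADERS).json()["state"]
        print(job_id, state)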

  • Paul Scherrer Institute


    § PSI mission
      § Study the internal structure of a wide range of different materials
      § Research facilities: the Swiss Light Source (SLS), the free-electron X-ray laser SwissFEL, the SINQ neutron source and the SμS muon source

    § PSI facility users reserve a scientific device for a period of time
      § Compute power should also be available
      § Storage and archive availability during the experiment
      § Data retrievable after the experiment by the users of PSI facilities (not PSI)

    § Proposal to interface Piz Daint with their workflow (see the sketch after this slide)
      § Use an API to access compute and data services (job scheduler, data mover)
      § Create a reservation service to reserve compute nodes
      § Provide a portal running on OpenStack to let PSI users access archived data at CSCS
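
    A rough sketch of the proposed PSI workflow against the same hypothetical gateway as
    above; the /reservations and /datamover endpoints, field names and paths are
    illustrative assumptions, not an existing CSCS API.

        import requests

        API = "https://hpc-api.example.org"            # hypothetical gateway URL
        HEADERS = {"Authorization": "Bearer <facility user token>"}

        # Reserve compute nodes for the beamtime window in which the device is booked.
        reservation = requests.post(f"{API}/reservations", json={
            "nodes": 16,
            "start": "2018-11-14T08:00:00Z",
            "end": "2018-11-14T20:00:00Z",
        }, headers=HEADERS).json()

        # Run the on-line analysis inside that reservation while data is being taken.
        requests.post(f"{API}/jobs", json={
            "reservation": reservation["id"],
            "script": "#!/bin/bash\nsrun ./reconstruct --input /scratch/run042\n",
        }, headers=HEADERS)

        # Afterwards, ask the data mover to archive the results so that the
        # facility users can retrieve them later through the portal.
        requests.post(f"{API}/datamover/archive", json={
            "source": "/scratch/run042",
            "target": "archive://psi/run042",
        }, headers=HEADERS)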

  • Convergence of HPC and Big Data Science Workflows

    [Architecture diagram: external users and data science workflows reach CSCS over the Internet via public IPs and a gateway; APIs front the login node, the compute nodes (CN) with local and in-memory storage, the HPC parallel file system and the archives; a software-defined infrastructure hosts containers, interactive compute and data science workflows.]

  • Challenges

    § Authentication and authorization infrastructure (see the token sketch after this slide)
      § Enable multiple identity providers (not only users known by the HPC centre)
      § Identify “who” (workflow, scientific device, web portal, …) is authorized to use HPC services

    § Data management (see the staging sketch after this slide)
      § Complex data ownership and security with multiple identity providers
      § Automated staging in/out and transformation of data (POSIX to SWIFT)

    § Workflow systems
      § Which workflow engines or standards to support?
      § Enable access to HPC services via a REST API (compute, data, reservation)
      § Interactive service and batch scheduling (preemption, priority)
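
    A minimal sketch of the multiple-identity-provider idea: the client obtains a standard
    OAuth2/OIDC access token from the facility's own identity provider and presents it to
    the HPC services, which only have to validate the token and decide what that client may
    do. The issuer URL, client registration and gateway URL are hypothetical.

        import requests

        # Hypothetical identity provider operated by the facility, not by the HPC centre.
        TOKEN_URL = "https://idp.facility.example/oauth2/token"

        # Standard OAuth2 client-credentials flow: here the caller is a workflow
        # engine or web portal, not a person with an account at the centre.
        token = requests.post(TOKEN_URL, data={
            "grant_type": "client_credentials",
            "client_id": "beamline-workflow",          # hypothetical client registration
            "client_secret": "<secret>",
        }).json()["access_token"]

        # The HPC service validates the token and maps the client to the services it is
        # authorized to use (compute, data, reservation), i.e. the "who" question.
        requests.get("https://hpc-api.example.org/jobs",
                     headers={"Authorization": f"Bearer {token}"})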

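    A minimal sketch of automated staging out from the POSIX scratch file system to an
    OpenStack Swift object store, using Swift's plain HTTP API (PUT with an X-Auth-Token
    header); the object-store URL, token and paths are assumptions.

        import requests

        SWIFT = "https://object.cscs.example/v1/AUTH_project"   # hypothetical Swift endpoint
        HEADERS = {"X-Auth-Token": "<token>"}

        # Create the target container (idempotent in Swift), then upload the file.
        requests.put(f"{SWIFT}/run042", headers=HEADERS)
        with open("/scratch/run042/frames.h5", "rb") as f:
            requests.put(f"{SWIFT}/run042/frames.h5", data=f, headers=HEADERS)

        # A web portal or third-party service can now fetch the object over HTTP
        # without ever touching the parallel file system.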

  • Thank you for your attention.