Singularity Container Workflows for Compute. Gregory M. Kurtzer CEO, Sylabs Inc. @gmkurtzer http://www.sylabs.io @SylabsIO @SingularityApp

Singularity - HPC-AI Advisory Council...$ singularity verify container.sif Data integrity checked, authentic and signed by: ... application and data • Disrupts the barriers of portability

Page 1

Singularity Container Workflows for Compute.

Gregory M. Kurtzer, CEO, Sylabs Inc.

@gmkurtzer

http://www.sylabs.io @SylabsIO @SingularityApp

Page 2

Gregory M. Kurtzer

CEO and Founder, Sylabs Inc.

Previously spent ~20 years at LBNL/DOE as the HPC Systems Architect.

I’m also known for founding various open source projects like Warewulf, CentOS Linux, and most recently Singularity!

INTRODUCTIONS…

Page 3

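APPLICATION CONTAINERIZATION 101

[Diagram: Host 1 and Host 2, each showing CPU, Memory, and Devices beneath a Kernel. One host runs applications, libraries, and services directly; the other runs apps, libs, and services in a Container. Transfer methods shown: SCP, HTTP, FTP, Archive.]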

An environment can be built on one host, encapsulated, and packaged up into a container image.

The container image can be copied to another host, and applications can be executed directly as if they were running natively.

You can additionally isolate or integrate the container environment on the host as needed.
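The build, copy, and run steps above can be sketched as a short dry-run script. This is a minimal sketch, not a prescribed workflow: the image name, definition file, remote host, and the `run` and `ship_and_run` helpers are all hypothetical, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: image name, definition file, and remote host are hypothetical.
IMAGE=myapp.sif
DEF=myapp.def
REMOTE=user@host2

# Print each command instead of running it unless DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

ship_and_run() {
    # 1. Build the environment into a single SIF file on host 1.
    run sudo singularity build "$IMAGE" "$DEF"
    # 2. Copy the image to host 2 like any other file.
    run scp "$IMAGE" "$REMOTE:"
    # 3. Execute a program from the image on host 2.
    run ssh "$REMOTE" singularity exec "$IMAGE" cat /etc/os-release
}

ship_and_run
```

Because the image is a single file, step 2 could just as well be HTTP, FTP, or an archive.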

Page 4

Singularity is differentiated by two primary categories:

• Container Format: Sylabs created an image format to encapsulate OCI and Docker based containers, which is single file based, cryptographically signed, trusted, and immutable.

• Runtime Engine: Standardizing on existing POSIX security practices, Singularity improves performance, integration, and ease of use, and reduces attack surfaces, while enabling HPC and meeting the growing need for compute-based orchestration.

DESIGNED FOR SECURITY, MOBILITY, AND PERFORMANCE

[Diagram labels: Runtime Engine; Environment Format]

Page 5

SINGULARITY IMAGE FORMAT (SIF)

“Building a container can be done in only 52 lines of code!” – Liz Rice, Container Camp 2016

SIF is a unique, single-file container encapsulation format!

SIF is to containers what RPM and DEB are to source code!

[Diagram: contents of a SIF file. Labels shown: Immutable Runtime Container Image, Global Header, Recipe Definition, Labels, Environment, Writable Overlay, Signature Block, Descriptors. The file is marked CRYPTOGRAPHICALLY SIGNED.]

Page 6

SIF encapsulates OCI and Docker containers into a single file adding benefits such as:

• Guaranteed immutable and reproducible

• Easy to move, share, archive, etc.

• POSIX compatible

• Encapsulates the entire application and environment stack

• Cryptographic signatures and validation

• No layers or dependencies

• No tarballs, SIF is the runtime format

• Encryption (with in-kernel decryption) coming soon
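To make the recipe, labels, and environment that a SIF file carries more concrete, here is a minimal sketch that writes a small definition file. The base image, package, label value, and file location are illustrative assumptions, not taken from the slides; the build command is shown only as a comment.

```shell
#!/bin/sh
# Sketch only: base image, package, and label values are illustrative assumptions.
DEF_FILE="${TMPDIR:-/tmp}/example.def"

cat > "$DEF_FILE" <<'EOF'
Bootstrap: docker
From: ubuntu:18.04

%post
    # Commands run once, at build time, inside the container.
    apt-get update && apt-get install -y python3

%environment
    # Variables set each time the container runs.
    export LC_ALL=C

%labels
    Maintainer example-user

%runscript
    # What `singularity run` executes.
    exec python3 "$@"
EOF

echo "wrote $DEF_FILE"
# To build a SIF image from it (typically needs root or a remote builder):
#   sudo singularity build example.sif "$DEF_FILE"
```

The resulting example.sif would be a single file that can be signed, copied, and executed without unpacking.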

A NEW DELIVERY PARADIGM FOR SOFTWARE

[Diagram labels: Singularity Container; TRUST; a stack of truncated sha256 layer hashes; sha256:becac…; Host Operating System; Presentation Layer; Root Owned Container Daemon; Network Registry]

Page 7

SIF PERFORMANCE

Objectives:

1. Measure scaling of Python startup and import speed with increasing numbers of concurrent Python interpreters

2. Compare scaling of a standard Python installation with an identical containerized installation

Note: The underlying file system is NFS, the maximum was 5120 jobs over 320 nodes, and the graph is logarithmic on both axes.

DR. WOLFGANG RESCH

HTTPS://GITHUB.COM/WRESCH/PYTHON_IMPORT_PROBLEM

Invocation performance over shared storage

Page 8

Singularity provides absolute trust and accountability

Execution of containers can be limited to only valid keys, and/or key fingerprints

If a malicious user is found, keys are revoked from the Sylabs KeyStore, limiting exposure

ABSOLUTE TRUST OF ALL WORKLOADS

$ singularity pull container.sif library://user/container

$ singularity verify container.sif

Data integrity checked, authentic and signed by:

Gregory Kurtzer [email protected], KeyID F4EIAL82E…

$ singularity sign container.sif

$ singularity push container.sif library://user/container
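A user-side way to act on the verify step is to refuse to execute an image whose verification fails. The following is a minimal sketch; `verified_exec` is a hypothetical helper name, not part of the Singularity CLI, and it assumes `singularity verify` exits non-zero when the check fails. Site-wide enforcement by key or fingerprint is an administrator setting and is not shown here.

```shell
#!/bin/sh
# Sketch only: assumes `singularity verify` exits non-zero when a check fails.
# `verified_exec` is a hypothetical helper, not part of the Singularity CLI.
verified_exec() {
    image=$1
    shift
    if singularity verify "$image" >/dev/null 2>&1; then
        # Signature and integrity check passed: run the requested command.
        singularity exec "$image" "$@"
    else
        echo "refusing to run $image: verification failed" >&2
        return 1
    fi
}

# Usage: verified_exec container.sif whoami
```

The wrapper only defines a function, so it can be sourced from a job script and used in place of a bare `singularity exec`.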

Page 9

EXTREME MOBILITY OF COMPUTE – BYOE

Absolute mobility from laptop to HPC, cloud, and all the way out to the edge.

• Changing the packaging and mobility paradigm for application and data

• Disrupts the barriers of portability and bridges the gaps between all available resources

• From private resources, to public clouds and all the way out to edge and IoT

[Diagram labels: Local Compute; IoT Edge; NVIDIA DGX]

Page 10

Designed for the complicated integration needs of compute

• Container Runtime:

    • Works on all supported Linux distributions (runtimes and kernels)

    • Designed for massive efficiency and performance

    • Additional support for alignment between user and kernel space

• Container Image:

    • Designed for absolute mobility, user freedom, and reproducibility

    • Highly performant on shared and parallel file system deployments

    • Can be easily shared, archived, and kept compliant with controls; containers are just data

• Environment:

    • Optimized for application workflows like MPI and schedulers

    • Allows direct access to GPUs, InfiniBand, FPGAs, file systems, data, etc.

COMPATIBLE AND INTEGRATION AWARE

Page 11

Data is shared between container and host as fluently as if contained applications were running natively on the host.

NATIVE HOST INTEGRATION

$ singularity exec ubuntu.sif pwd

$ singularity exec ubuntu.sif python ./python_script_in_pwd.py

$ cat python_script_in_pwd.py | singularity exec docker://python:latest python

Page 12

Singularity integrates with all batch resource managers, with zero modifications, by calling the Singularity command directly within the batch script

BATCH SUPPORT

#!/bin/sh

#SBATCH -N 32

mpirun singularity exec ~/ubuntu.sif mpi_program.exe

Page 13

With a PMIx supporting launcher, you can run a fully contained MPI process directly from a compatible resource manager

MPI AND SLURM

$ srun -n 32 singularity exec ubuntu.sif mpi_program.exe

Page 14

When a container includes a GPU enabled application and libraries, Singularity (with the “--nv” option) can properly inject the required Nvidia GPU driver libraries into the container, to match the host’s kernel

GPU / CUDA SUPPORT

$ singularity exec --nv ubuntu.sif gpu_program.exe

$ singularity run --nv docker://tensorflow/tensorflow:latest-gpu

Page 15

MVAPICH2 APPLICATION PERFORMANCE

Benchmarks published by MVAPICH team at Ohio State University

http://mvapich.cse.ohio-state.edu/performance/singularity-application/

Page 16

IMB NETWORK PERFORMANCE

Benchmarks published by SDSC at UCSD

https://dl.acm.org/citation.cfm?doid=3093338.3106737

[Figure: IMB SendRecv Run using Singularity and Non-Singularity]

[Figure: IMB PingPong Run using Singularity and Non-Singularity]

Content published here with explicit permission from the authors

Page 17

OSU NETWORK LATENCY

Benchmarks published by SDSC at UCSD

https://dl.acm.org/citation.cfm?doid=3093338.3106737

Content published here with explicit permission from the authors

Page 18

LS-DYNA PERFORMANCE

Benchmarks published by the Dell EMC HPC Innovation Lab http://en.community.dell.com/techcenter/high-performance-computing/b/general_hpc/archive/2018/02/19/performance-of-ls-dyna-on-singularity-containers

“The performance difference while running LS-DYNA within Singularity containers remains within 2%, which is within the run-to-run variability of the application itself.”

Page 19

Designed for the security needs of compute

• Container Engine:

    • Singularity has no root owned daemon processes

    • Implements privilege separation over an API to a secure thread

    • DoD: Singularity is the only allowed container system

    • Audited and certified by EU lab for use on the European Compute Grid

    • NSF grant for 3rd party security assessment (in progress, going well!)

• Container:

    • Singularity containers are immutable

    • Cryptographically signed and verifiable

    • Public keys can be managed over standard HKP protocol (or Sylabs key services)

• Environment Requirements:

    • Containers are run as the calling user

    • Blocks all privilege escalation from within the container

SECURITY FOCUSED

Page 20

You are always yourself within a Singularity context, and Singularity will block escalation attempts within the container

Even if you know the root password, even if you have sudo installed, even if you implement a SUID hack, Singularity will prevent privilege escalation

SECURITY BLOCKS

$ singularity exec centos.sif whoami

$ singularity exec centos.sif sudo su -

$ singularity exec centos.sif /proc/$$/root/bin/su

Page 21

• System administrators, always in 100% control

• Supports User Namespace (when kernel supports it)

• Linux Capabilities (per user or group ACLs)

• Directly integrates with host’s:

    • SELinux

    • AppArmor

    • Seccomp

• Container execution can be limited by:

    • Container owner or group

    • Location on file system (trusted paths)

    • Whitelist/blacklist by signed container fingerprints
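The owner, group, and trusted-path limits listed above are set by the administrator in singularity.conf. Below is a minimal sketch that writes a sample fragment to a scratch file for review. The user, group, and path values are hypothetical placeholders, and the directive names are given as I understand them from Singularity 3.x; check them against the comments in your installed singularity.conf before use. Fingerprint whitelisting is configured separately and is not shown.

```shell
#!/bin/sh
# Sketch only: user, group, and path values are hypothetical placeholders.
# Directive names are assumed from singularity.conf in Singularity 3.x;
# confirm them against your installed singularity.conf.
SAMPLE_CONF="${TMPDIR:-/tmp}/singularity.conf.sample"

cat > "$SAMPLE_CONF" <<'EOF'
# Only run containers owned by these users.
limit container owners = root, hpcadmin

# Only run containers owned by these groups.
limit container groups = hpcstaff

# Only run containers stored under these trusted paths.
limit container paths = /opt/containers, /scratch/containers
EOF

# Show the active (non-comment, non-blank) directives.
grep -v '^#' "$SAMPLE_CONF" | grep .
```

The script never touches the live configuration; merging the fragment into the real singularity.conf is left to the administrator.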

ADDITIONAL SECURITY FEATURES

Page 22

• Backend code updated to Go

• Fully OCI compatible (3.1: `singularity oci …`)

• Integration with enterprise standards:

    • OCI: Image support with all container registries

    • CNI: Support for all container networking options (port forwarding, NAT, etc.)

    • CGroups: Resource limitations

• SIF updates

    • Encapsulation of OCI and Docker formats

    • Immutable and 100% guaranteed reproducible

    • Cryptographically signed and verifiable

    • No tarballs or archives: SIF is the runtime container format

• Multi-stage builds, and “disposable” development overlay

    • Nvidia HPC-CM container builder

    • Build tool integration: Spack, EasyBuild,… Docker, Img, Buildah, etc…

• Native support for macOS and Windows (coming soon)

• Kubernetes Support (native CRI)

WHAT ELSE IS NEW AND COMING SOON

Page 23

NATIVE SINGULARITY SUPPORT ON MACOS

Singularity Desktop, coming soon (Q1 2019)

Page 24

BRIDGING THE GAP BETWEEN COMPUTE AND SERVICES

Native integration of Singularity with OCI, Kubernetes, and Nomad to be completed in Q1 2019.

Page 25

AI workflows typically have “train” and “execute” stages, and training is the most computationally intensive.

Singularity enables this workflow, enables large-scale distribution, and provides the needed assurance, security, and accountability for scale and production.

[Diagram labels: Train; Distribute; Build; Inference]

ARTIFICIAL INTELLIGENCE MULTISTAGE WORKFLOWS

Page 26

• Parallel training

• Distribution of trained models

• Real time AI / compute

• Data streaming

• Complete validation and trust

• Supporting all tools

• “HPC as a Service”

Singularity is the unifying substrate for all compute needs

EXPANDING THE WORKFLOW SUPPORT OF THE ECOSYSTEM

[Diagram labels: Data Stream(s); Kubernetes; Kafka - Stream Splitter and Balancer; Compute Based Service (shown four times); Real time collectors, Visualization, Storage, analytics, etc.]

Page 27

TENSORFLOW GPU PERFORMANCE

HPC and AI Solutions Engineering group at Dell EMC https://www.nextplatform.com/2018/03/19/singularity-containers-for-hpc-deep-learning/

“The performance comparison between a bare metal versus a containerized version of the framework at 32 Tesla V100 is still under 2%, showing negligible performance delta between the two.”

Page 28

NVIDIA SUPPORTS SINGULARITY

Page 29

Three years and counting of HPCwire awards for Singularity

Page 30

Singularity, the container runtime of choice for HPC, EPC/AI, and enterprise workloads

As of Singularity 3.0:

• Multi-millions of container runs per day

• Approx 250,000 downloads (not counting redistributors)

• Installed on over 5 million x86 cores, 250k ARM

The same reasons that make Singularity fantastic for HPC are what make Singularity fantastic for all enterprise compute needs!

MASSIVE ADOPTION AND GROWTH

Page 31

SYLABS IN THE NEWS

Page 32

Sylabs Among “The 10 Hottest Container Startups Of 2018”

Page 33

• SingularityPRO:

    • Fully supported versions of Singularity

    • Code curated, trusted builds, RPM/DEB, simple deployment

    • Feature identical to open source

    • Stable with long term life

    • Per node or site licensed

• Sylabs Cloud Services (SCS):

    • KeyStore: Public key service for signed containers

    • Container Library: A place to host, develop, sell, reference, and share containers and AI trained models

    • Remote Builder: safely and securely build containers without root, with a web based development interface or use the native Singularity CLI

    • Pipelines: CI/CD configurable pipelines for DevOps workflows (coming soon)

• Professional services, support, training, development, etc.

SYLABS OFFERINGS

Page 34

THE SYLABS TEAM

Page 35

Singularity User’s Group!

March 12th-13th

San Diego Supercomputer Center

CFP closing tomorrow!

THE INAUGURAL SINGULARITY USER GROUP MEETING

Page 36

Come see me after if you want Singularity swag!

T-shirts, stickers, pens, etc.

Page 37

Singularity Container Workflows for Compute.

Gregory M. Kurtzer, CEO, Sylabs Inc.

@gmkurtzer

http://www.sylabs.io @SylabsIO @SingularityApp