Academic Compute Cloud Provisioning and Usage Project Peter Kunszt ETH/SystemsX.ch 2012, November 19 Bern


Page 1: Academic Compute Cloud Provisioning and Usage Project

Academic Compute Cloud Provisioning and Usage Project

Peter Kunszt, ETH/SystemsX.ch

19 November 2012, Bern

Page 2: Academic Compute Cloud Provisioning and Usage Project

Motivation

Researchers often only want services, not products.

Services rely on
• Infrastructure
• Middleware
• Application Software
• Research Informatics ‘Glue’

We ‘the supporters’ want to offer ‘Apps’
• Maintainable services
• Published, usable tools and software
• Browsable published research data

19 Nov. 2012 SDCD Bern

Page 3: Academic Compute Cloud Provisioning and Usage Project

Motivation: SystemsX.ch

[Logo: Schweizerische Eidgenossenschaft / Confédération suisse / Confederazione Svizzera / Confederaziun svizra]

Largest Swiss national research effort to date

Page 4: Academic Compute Cloud Provisioning and Usage Project

Some numbers

• Funded by the Swiss government with CHF 25 million/year for 2008-2011, 2012, and 2013-2016
• 12 Swiss universities and research institutions invest a matching 25 million/year
• Projects approved by the SNSF
• 14 large research projects (CHF 4-7 million) until 2012; 10 new ones starting in 2013 (CHF 3 million)
• 50+ PhD projects
• 20+ interdisciplinary pilot projects
• 1 strategic support project: SyBIT, CHF 2 million/year on average

Page 5: Academic Compute Cloud Provisioning and Usage Project

SyBIT Project Motivation

• SystemsX.ch will produce and analyze a large amount of data
• Strong need for coordination among data providers
• Strong need for common semantics and compatible service offerings
• Increased need for professionally supported tools and services

Page 6: Academic Compute Cloud Provisioning and Usage Project

[Diagram: SyBIT provides support]

Diagram labels:
• Service Providers
• Bioinformatics, IT Infrastructure, Platforms
• SyBIT
• Research projects: PhosphoNetX, LipidX, MetaNetX, PlantGrowth, CellPlasticity, LiverX, CycliX, Neurochoice, WingX, YeastX, DynamiX, CINA, BattleX, InfectX
• IPP, IPHD

Page 7: Academic Compute Cloud Provisioning and Usage Project

[Diagram: SyBIT gives feedback]

Diagram labels:
• Service Providers
• Bioinformatics, IT Infrastructure, Platforms
• SyBIT
• Research projects: PhosphoNetX, LipidX, MetaNetX, PlantGrowth, CellPlasticity, LiverX, CycliX, Neurochoice, WingX, YeastX, DynamiX, CINA, BattleX, InfectX
• IPP, IPHD

Page 8: Academic Compute Cloud Provisioning and Usage Project

Motivation

Researchers often only want services, not products.

Services rely on
• Infrastructure
• Middleware
• Application Software
• Research Informatics ‘Glue’

We ‘the supporters’ want to offer ‘Apps’
• Maintainable services
• Published, usable tools and software
• Browsable published research data

Page 9: Academic Compute Cloud Provisioning and Usage Project

Project Goals

• How to extend current cluster services using cloud technology?
• Support new application models (MapReduce, specialized servers).
• Test real applications.
• Understand performance implications.

1. Define Service Models: How to move to cloud-like service orientation models.
2. Define Business Models: How to accommodate pay-per-use, OpEx vs. CapEx, how to plan an academic private cloud, and how to use and offer public clouds.
3. Run real applications: Run a regular, a compute-intensive and a data-intensive application on the cloud.

Page 10: Academic Compute Cloud Provisioning and Usage Project

Project Goals

• How to extend current cluster services using cloud technology?
• Support new application models (MapReduce, specialized servers).
• Test real applications.
• Understand performance implications.

1. Define Service Models: How to move to cloud-like service orientation models.
2. Define Business Models: How to accommodate pay-per-use, OpEx vs. CapEx, how to plan an academic private cloud, and how to use and offer public clouds.
3. Run real applications: Run a regular, a compute-intensive and a data-intensive application on the cloud.

Provide input to the mid- and long-term strategy for cluster and cloud infrastructure at ETH and UZH.

Disseminate results broadly in Switzerland, in academia and to interested parties (workshop at project end).

Page 11: Academic Compute Cloud Provisioning and Usage Project

Cloud Attributes: When do we talk about a cloud?

DEFINITION

• Self-service, On-demand, Cost transparency
  – Access to immediately available resources, paying for usage only. No long-term commitments. No up-front investments needed. Operational expenses only.
• Elasticity, Multi-tenancy, Scalability
  – Grow and shrink size of resource on request. Sharing with other users without impacting each other. Economies of scale.

Page 12: Academic Compute Cloud Provisioning and Usage Project

Definitions

• Self-service: A consumer can unilaterally provision computing capabilities, such as server time and network storage, without requiring human interaction.
• On-demand: As needed, at the time when needed, automatic provisioning.
• Cost Transparency: Actual usage is accounted for transparently to both the user and the service provider, measured in corresponding units (CPU hours, GB per month, MB transferred, etc.).
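The cost-transparency definition can be made concrete with a minimal sketch of usage-based accounting. Only the units (CPU hours, GB per month, MB transferred) come from the slide; the `UsageRecord` type, the `itemized_bill` function and all prices are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical unit prices in CHF; these are NOT figures from the project.
RATES = {"cpu_hours": 0.10, "gb_months": 0.05, "mb_transfer": 0.0001}


@dataclass
class UsageRecord:
    """Metered usage of one user over one billing period (assumed structure)."""
    user: str
    cpu_hours: float
    gb_months: float
    mb_transfer: float


def itemized_bill(record: UsageRecord) -> dict:
    """Return the cost per metered item plus a total.

    User and provider both see the same itemized numbers, which is the
    point of cost transparency.
    """
    items = {name: round(getattr(record, name) * rate, 2)
             for name, rate in RATES.items()}
    items["total"] = round(sum(items.values()), 2)
    return items


bill = itemized_bill(UsageRecord("alice", cpu_hours=1200,
                                 gb_months=500, mb_transfer=20000))
print(bill)
```

Keeping the rates in one table and the metering in one record type means the same data can drive both the user's statement and the provider's accounting.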

Page 13: Academic Compute Cloud Provisioning and Usage Project

Definitions

• Elastic: Capabilities can be elastically provisioned and released, in some cases automatically, to scale rapidly outward and inward commensurate with demand.
• Multi-tenant: The provider’s computing resources are pooled to serve multiple consumers, with resources dynamically assigned and reassigned according to consumer demand.
• Scalable: To the consumer, the capabilities available for provisioning often appear to be unlimited and can be appropriated in any quantity at any time.

Source: http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf
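As an illustration of the "Elastic" definition, here is a minimal sketch of a scaling decision that grows and shrinks a node pool with demand. The function names, the figure of 8 jobs per node and the pool limits of 1 to 16 nodes are assumptions made for this example; the slides do not specify a scaling policy.

```python
import math


def desired_nodes(pending_jobs: int, jobs_per_node: int,
                  min_nodes: int = 1, max_nodes: int = 16) -> int:
    """Nodes needed for the current demand, clamped to the pool limits."""
    needed = math.ceil(pending_jobs / jobs_per_node)
    return max(min_nodes, min(max_nodes, needed))


def scaling_action(current_nodes: int, pending_jobs: int,
                   jobs_per_node: int = 8) -> int:
    """Positive: nodes to add; negative: nodes to release; zero: no change."""
    return desired_nodes(pending_jobs, jobs_per_node) - current_nodes


# Replay a made-up demand curve and record the pool size after each step.
demand = [4, 40, 200, 16, 0]
nodes = 1
history = []
for pending in demand:
    nodes += scaling_action(nodes, pending)
    history.append(nodes)
print(history)
```

The upper clamp is where a private cloud stops looking "unlimited" to the consumer; demand beyond it would have to queue or burst elsewhere.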

Page 14: Academic Compute Cloud Provisioning and Usage Project

HPC Pyramid

[Figure: pyramid with axes "Number of users" and "Computing needs"; label "CSCS"]

Page 15: Academic Compute Cloud Provisioning and Usage Project

Relation to Cloud: As User (extension)

[Figure: HPC pyramid with axes "Number of users" and "Computing needs" and label "CSCS"; additional labels "Cloud", "CloudBurst", "Use"]

Page 16: Academic Compute Cloud Provisioning and Usage Project

Today, University Clusters do not make use of the Cloud:

• Technical details to be investigated:
  – Bursting the cluster into the cloud
    • Networking?
    • User Management?
    • File System?
• Cloud-compatible licenses for commercial products are often not available
• No billing mechanism to bill users of cluster for pay-per-use services

Page 17: Academic Compute Cloud Provisioning and Usage Project

Relation to Cloud: As Provider

[Figure: HPC pyramid with axes "Number of users" and "Computing needs" and label "CSCS"; additional labels "Cloud", "Expose to", "Account / charge usage"]

Page 18: Academic Compute Cloud Provisioning and Usage Project

Not clear how to be a Cloud Provider with a University Cluster

• University cluster is not self-service
• Capital expenses, not just pay-per-use
• Long-term commitment
• Not extensible on-demand, not elastic
• Sharing with others only according to policies
• More stringent terms of use, needs account

• We have examples to look at:
  – SDSC, Cornell, Oslo

Page 19: Academic Compute Cloud Provisioning and Usage Project

Infrastructure and Platform as a Service

[Figure, from www.cloudadoption.org. Labels: "Classic Approach", "Today", "START", "FINISH", "Infrastructure", "Platform", "Software", "IaaS", "PaaS", "SaaS", "95% time savings"]

Page 20: Academic Compute Cloud Provisioning and Usage Project

Software & Apps run on platforms, NOT infrastructure

www.cloudadoption.org

Page 21: Academic Compute Cloud Provisioning and Usage Project

Cloud Stack

DEFINITION

[Diagram: layered cloud stack]
• CLIENTS: Users or Portals. Can directly use each stack.
• Software: User Interface, Machine Interface
• Platform: Components, Services
• Infrastructure: Compute, Network, Storage
• HARDWARE: Any kind of infrastructure for any of the stacks.

Page 22: Academic Compute Cloud Provisioning and Usage Project

Who can make use of what

[Diagram labels: User, Portal, SaaS, PaaS, IaaS, Hardware]

• Users may use any service
• Portals may use any service
• SaaS may or may not be built on top of PaaS or IaaS
• PaaS may or may not be built on top of IaaS

Page 23: Academic Compute Cloud Provisioning and Usage Project

Public, Private, Hybrid Clouds

DEFINITION

Public Cloud
• Offered by partner organizations or cloud providers
• Only operational expenses
• No control on cloud stack, dependency on external partner

Hybrid Cloud
• Private Cloud connected to Public Cloud
• Remote cloud resources on-demand
• Constraints on own cloud stack: needs to interoperate with public cloud

Private Cloud
• Own infrastructure only
• In-house or hosted
• Internal use or for sale
• Full control on cloud stack, accounting, etc.

[Diagram labels: Connect; Institutional boundary]

Page 24: Academic Compute Cloud Provisioning and Usage Project

How to evolve the HPC Service...

• ...to be able to offer a Platform as a Service.
• ...to be able to make use of public clouds seamlessly (Hybrid model, CloudBursting)

Page 25: Academic Compute Cloud Provisioning and Usage Project

Information Gathering

• We collected a lot of information and conducted a survey on existing solutions (mandate to CloudBroker)


Page 26: Academic Compute Cloud Provisioning and Usage Project

Lots of Interactions

• With Cloud providers
  – IBM, Amazon, CloudSigma, HP, Google
• Software providers
  – VMware, HP, Dell, OpenStack flavors (Piston, ..)
• Universities
  – SWITCH, ZHAW, SDSC, Cornell, Imperial College, U Oslo, Zaragoza

Page 27: Academic Compute Cloud Provisioning and Usage Project

Choices

• Commercial Cloud Appliance
  – Evaluate HP CloudSystem Matrix
  – Integrated hardware: HP blades and 3PAR storage
  – Runs with VMware or Hyper-V
  – Complete management and end-user interfaces
• Build our own
  – 2 different systems (Dell based)
  – OpenStack: Several distributions to test
  – Special software: ScaleMP, cloud FS

Page 28: Academic Compute Cloud Provisioning and Usage Project

Cloud Stack Comparison Matrix

Page 29: Academic Compute Cloud Provisioning and Usage Project

OpenStack Distribution comparison

Page 30: Academic Compute Cloud Provisioning and Usage Project

Public IaaS Comparison

Page 31: Academic Compute Cloud Provisioning and Usage Project

Infrastructure 1

• ETH: HP CloudSystem Matrix Testbed
  – Operational as of THIS WEEK
    • 8 Intel, 8 AMD blades
    • 128GB memory per blade
    • 10TB storage 3PAR
• HP Matrix cloud software is fixed
• This is RENTED: we have to give it back

Page 32: Academic Compute Cloud Provisioning and Usage Project

Infrastructure 2

• ETH: Build our own from new components.
  – Standard cluster nodes x16, diskless
  – 128GB RAM on each node
  – Very fast storage (SSD based) for VM images
• Attach standard NAS storage from ETH
• Cloud Stack:
  – OpenStack
  – VMware
• Being installed next Monday
• This remains at ETH after the project

Page 33: Academic Compute Cloud Provisioning and Usage Project

Infrastructure 3

• University of Zurich: Recycle existing components.
  – Set of old cluster nodes, heterogeneous
  – Cloud filesystem using local node storage (technologies will be evaluated)
    • GlusterFS
    • Ceph

Page 34: Academic Compute Cloud Provisioning and Usage Project

HPC + Cloud: On the same HW

[Diagram labels: Compute Nodes, Storage, HPC CLUSTER, CLOUD HW]

Page 35: Academic Compute Cloud Provisioning and Usage Project

HPC + Cloud: On the same HW

[Diagram labels: Compute Nodes, Storage, HPC CLUSTER, CLOUD HW]

Classic CLUSTER – Not Virtualized
• Can be heterogeneous HW
• OS controlled by Admins
• Scheduler for job submission
• Applications compiled and installed
• Shared FS

CLOUD – Virtualized
• Hypervisor and Cloud Stack controlled by Admins
• Template ‘Apps’
• Users can create new
• Different kinds of storage
• Different setups possible
• Virtual SMP

Page 36: Academic Compute Cloud Provisioning and Usage Project

Storage

• Ceph, Gluster
• Mount REAL (non-virtual) cluster FS (Lustre, GPFS)
• Mount NFS
• Object stores, e.g. Swift
• Different HW
  – Local Disks
  – iSCSI
  – Very fast SSD-based appliance over 10Gb or FC or IB (deduplication, compression) – for VMs and fast disk

Page 37: Academic Compute Cloud Provisioning and Usage Project

Cloud HPC Use Cases to Test 1

• Extending the regular cluster into the cloud
  – Just run cluster node instances
  – Register back with cluster scheduler
  – Jobs can request these nodes explicitly
  – ALREADY tested using Amazon
• Building a full virtualized cluster in our Cloud
  – Everything virtual: Cluster nodes, headnodes
  – Cluster FS: several options (see storage)
  – What do we learn? Reality check: HPC performance
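The first use case (run cluster node instances, register them back with the scheduler, let jobs request them explicitly) can be sketched as a small in-memory simulation. The `Node` and `Scheduler` classes, the node names and the `kind` parameter are all hypothetical; this is not the API of any real cluster scheduler or cloud provider, and no instances are actually started.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str          # "local" or "cloud" (assumed labels)
    busy: bool = False


@dataclass
class Scheduler:
    """Toy stand-in for a cluster scheduler; illustration only."""
    nodes: list = field(default_factory=list)

    def register(self, node: Node) -> None:
        """A freshly booted instance announces itself to the scheduler."""
        self.nodes.append(node)

    def submit(self, job: str, kind: str = "local"):
        """Place the job on the first idle node of the requested kind."""
        for node in self.nodes:
            if node.kind == kind and not node.busy:
                node.busy = True
                return node.name
        return None  # no idle node of that kind: the job would stay queued


sched = Scheduler([Node("node01", "local"), Node("node02", "local")])

# Burst: two simulated cloud instances register back with the scheduler.
for i in range(2):
    sched.register(Node(f"cloud{i:02d}", "cloud"))

placements = [
    sched.submit("job-a"),                # default: local node
    sched.submit("job-b", kind="cloud"),  # explicit request for a cloud node
    sched.submit("job-c", kind="cloud"),
    sched.submit("job-d", kind="cloud"),  # cloud nodes exhausted
]
print(placements)
```

Requiring jobs to ask for cloud nodes explicitly keeps ordinary jobs on local hardware, so only users who opt in would incur pay-per-use cost.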

Page 38: Academic Compute Cloud Provisioning and Usage Project

Test Case 1 Software

• Use regular cluster workloads, NOT data intensive
• Rosetta: structural biology
• GAMESS: molecular chemistry simulation
• SMSCG workloads (if we get there)

Page 39: Academic Compute Cloud Provisioning and Usage Project

Cloud HPC Use Cases to Test 2

• Hadoop cluster
  – Build the virtual cluster dedicated to Hadoop
  – HDFS or Swift
• Commercial tool cluster: Matlab
  – Matlab ‘cluster’: allocate a few ‘fat’ VMs to Matlab
  – Let it run its internal clustering, expose to user

Page 40: Academic Compute Cloud Provisioning and Usage Project

Test Case 2 Software

• A bit more data intensive
• Hadoop use cases
  – Proteomics: analysis of selected reaction monitoring data
  – Genomics: Bowtie over Hadoop (Crossbow)
• Matlab and R
  – Set up cluster Matlab on regular cluster
  – On SMP’d nodes
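For readers unfamiliar with the MapReduce model behind the Hadoop use cases, the following is a minimal single-process sketch of the map, shuffle and reduce phases. It uses neither the Hadoop API nor Crossbow; the record format ("read id, tab, chromosome") and the sample data are invented for illustration.

```python
from collections import defaultdict


def map_phase(record: str):
    """Emit (chromosome, 1) for one aligned read.

    Records are assumed to look like 'read_id<TAB>chromosome'.
    """
    _read_id, chromosome = record.split("\t")
    yield chromosome, 1


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the counts collected for one key."""
    return key, sum(values)


records = ["r1\tchr1", "r2\tchr2", "r3\tchr1", "r4\tchrX", "r5\tchr1"]
pairs = (pair for record in records for pair in map_phase(record))
counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
print(counts)
```

In a real Hadoop deployment the map and reduce calls run on different nodes and the shuffle moves data over the network, which is why storage and networking choices matter for this test case.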

Page 41: Academic Compute Cloud Provisioning and Usage Project

Cloud HPC Use Cases to Test 3

• Data intensive workflow
  – InfectX pipeline: Image analysis – several TB of small files
  – Many kinds of scripts, mostly Matlab
  – Same workflow can be submitted many times
  – Error prone!
• OpenBIS on-demand workflow
  – Extend metadata catalog with some basic processing capabilities using remote resources
  – Streaming of data to perform some processing in the cloud

Page 42: Academic Compute Cloud Provisioning and Usage Project

Business Models

• Cannot charge at full cost if we want to be the service provider (competitive advantage)
• Internal and external views
• Efficient, fair, feasible and generally accepted funding and charging model
• New opportunities should not require changing existing business procedures for existing infrastructure (evolution, not revolution)
• Transparent financial accounting mechanism

Page 43: Academic Compute Cloud Provisioning and Usage Project

Business Models

• Several models are being worked out
  – Shareholder model – one-time fee for TFLOPS or TB
  – Subscription model – yearly fee
  – Pay-per-use model
• Self service options
  – Very detailed like Amazon
  – High-level ‘virtual cluster’ or PaaS
  – Top-level SaaS user gateways
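To show how the three charging models trade off, here is a minimal cost-comparison sketch. The model names come from the slide; every price (one-time fee, yearly fee, rate per CPU hour) is a made-up placeholder, and the functions are hypothetical rather than the project's actual business model.

```python
def shareholder_cost(years: int, one_time_fee: float) -> float:
    """One-time fee for a share of capacity; does not grow with the years."""
    return one_time_fee


def subscription_cost(years: int, yearly_fee: float) -> float:
    """Flat fee charged once per year."""
    return years * yearly_fee


def pay_per_use_cost(years: int, cpu_hours_per_year: float,
                     rate: float) -> float:
    """Charge proportional to metered CPU hours."""
    return years * cpu_hours_per_year * rate


def cheapest(years: int, cpu_hours_per_year: float):
    """Return the cheapest model name and all costs (placeholder prices, CHF)."""
    costs = {
        "shareholder": shareholder_cost(years, one_time_fee=30000),
        "subscription": subscription_cost(years, yearly_fee=12000),
        "pay-per-use": pay_per_use_cost(years, cpu_hours_per_year, rate=0.10),
    }
    return min(costs, key=costs.get), costs


for years, hours in [(3, 20000), (1, 150000), (3, 200000)]:
    model, _costs = cheapest(years, hours)
    print(f"{years} year(s), {hours} CPU h/year -> {model}")
```

With these placeholder prices, light users come out cheapest on pay-per-use and heavy long-term users on the shareholder model, which is one reason several models might be offered side by side.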

Page 44: Academic Compute Cloud Provisioning and Usage Project

Timeline

[Gantt chart; time axis: Apr ’12, Jul ’12, Oct ’12, Jan ’13, Apr ’13, with a "today" marker]

• Milestones: ETH Project Start, SWITCH AAA Project Start, SWITCH AAA Project End
• Information Gathering
• Refinement of Targets
• Business Model
• Application Definition
• HP CloudSystem on lease: delivered, ready, return to HP
• ETH Self-built system: call, decision, delivery, ready
• UZH Self-built system: assembly from existing stuff, ready
• Application testing

Page 45: Academic Compute Cloud Provisioning and Usage Project

Output

• Workshop in April ’13 to show results of project
  – For the entire Swiss research community
  – See you there!
• Input to ETH, UZH strategies for research infrastructure
  – Drive next procurement processes
  – Drive strategies for cooperation/outsourcing models
  – Drive new policy models for funding and sustainability