Open Cloud Workshop 2020

Open Cloud Workshop 2020 · –Enabled 100s publications on and in the cloud: security, performance, new ... Shared code repositories for operational tooling and the “glue” code


MOC's Audacious Vision 2013/14

• Vision Statement

“To create a self-sustaining at-scale public cloud based on the Open Cloud eXchange model… a marketplace for industry partners as well as a place for researchers and industry to innovate and expose innovation to users.”

• Project Overview and Goals
– At-scale efficient production cloud for broad set of applications
– Create and deploy a new model of an Open Cloud
– Testbed for research, open source developers, companies


[Diagram: MOC with Cloud Research, Users, Core Partners, OCT, NERC, OF, and OIL]


The interrelated initiatives

● Mass Open Cloud (MOC): Peter Desnoyers (NEU)

● Open Cloud Testbed (OCT): Mike Zink (UMass)

● New England Research Cloud (NERC): Scott Yockel (HU) & Wayne Gilmore (BU)

● Open Infrastructure Labs (OIL): Jonathan Bryce (OSF)

● Operate First (OF): Hugh Brock (RH)


The Mass Open Cloud

[Diagram: MOC with Cloud Research, Users, and Core Partners]


• 90,000 square feet + can grow
• 10s of thousands of HPC users, potentially many more cloud users


Imagine Pacific Research Platform

MGHPCC Consortium comparable to Pacific Research Platform
• Huge community covering every field of research
• Collaborations across the globe
• Massive data and computational requirements
• Massive student population covering every discipline

Widths are proportional to enrollment

Shrinking to the size of a building


The Massachusetts Open Cloud

MGHPCC seed fund grant:
- submitted Fall 2012
- awarded Spring 2013

State proposal submitted Fall 2013


1+ PB

2500 cores, ~40 TB RAM

Elastic Secure Infrastructure

• Block and S3 Object storage
• Bare Metal Physical machines
• IaaS – VM, Volume
• Spark, Hadoop
• OpenShift: enterprise deployment of Kubernetes container platform:
  • Built in CI, Monitoring, Load Balancing,

400 Power9 Cores, 40 GPUs, 5 TB RAM

POWER9 AC922 servers
• 5.6x CPU to GPU BW vs standard Intel via NVLink 2.0
• 40 NVIDIA Tesla V100 GPUs delivering up to 5,000 Teraflops for Deep Learning
• All major open source Deep Learning Frameworks
• PowerAI Distributed Deep Learning enables AI across multiple servers

Current HDV hosted on AWS
• 81,000 Datasets
• 490,000 Files
• 5.8 Million Downloads
Moving to the MOC

New North East Storage Exchange (NESE)
• 20 PB + file system & Object storage
• Massive data lake for region, co-located with MOC
• Fraction of the cost of AWS S3

20+ PB

It worked!


The Team

Rob, Rado, Lars, Kristi, Naved, Michael, Jen


The larger team...

● Students
● Interns
● Employees
● Collaborators


MOC Status
• Has inspired and made possible extensive systems research in the cloud
  – Enabled 100s of publications on and in the cloud: security, performance, new OS, analysis of cloud data, ...
  – Instrumental in over $20M in grants and contracts
• Is being increasingly used by lots of researchers and courses that “just want to get their stuff done”
  – Around 400 users in school year
  – Increasing use by industry
  – Indirectly, 10s of thousands of users


Current MOC Model

[Diagram: MOC with Cloud Research (Regional), Users, and Core Partners]


Open Cloud Testbed

[Diagram: MOC with Cloud Research (Nationwide and beyond), Users, OCT, and Core Partners]


OCT Motivation
• Cloud testbeds are critical for enabling research into new cloud technologies (see demand for CloudLab and Chameleon)
• Today’s cloud testbeds are isolated silos


Research "in" the MOC

[Diagram: Cloud Users, Cloud Researchers, the MOC production cloud, NESE, ESI, and local research infrastructure, linked by: logs/usage data, access to cloud data sets, access to cloud metadata, industry engagement, exposing experimental services]



• Scientific infrastructure for cloud research

• Three main clusters (Utah, Wisconsin, and Clemson), which offer 15,000 cores
  – Each cluster has a different focus: storage and networking (using hardware from Cisco, Seagate, and HP), high-memory computing (Dell), and energy-efficient computing (HP).

• Designed specifically for reproducible research

• Hard isolation to create many parallel “slices”



Research "in" the MOC

[Diagram: Cloud Users, Cloud Researchers, the MOC production cloud, NESE, ESI, NERC, and local research infrastructure, plus CloudLab sites: Utah, Clemson, Wisconsin, and OCT/MGHPCC]



19 servers today
> 100 servers + 15 FPGAs by end of year!


30 new FPGA nodes


More on OCT

● Monday, 11:30 - 12:30 pm during the Micro-talks I session.

● Monday, 12:30 - 2:00 pm Advisory Board Meeting (closed session)

● Tuesday, 9:45 - 10:30 am during the Open Cloud Testbed session.

● Tuesday, 2:00 - 3:10 pm and 4:00 - 5:10 pm during the Deep Dives sessions.


Production cloud – New England Research Cloud (NERC)

[Diagram: MOC with Cloud Researchers, Users experimenting with technology, OCT, Core Partners, NERC, and Users doing Science]


New England Research Cloud

Vision and Pilot

2019-2021

Wayne Gilmore, Scott Yockel


The Vision for NERC

Build a cost-effective, professionally operated on-premises cloud service that includes various levels of services:
● Self-service Software-as-a-Service for easy access
● Automated Platforms-as-a-Service for custom workflows
● Standardized Infrastructure-as-a-Service that includes emerging new technologies for hardware acceleration

Set standards of deployment and automation that will allow other institutions to easily deploy the full suite of services built within the NERC.

Through facilitation we will increase the capabilities of the emerging workforce leaving local universities as well as the innovation hubs and start-ups connected to our universities.

Beyond the pilot, attract funding (NSF/NIH), expand the reach.


BU / Harvard leaders in Research Computing & Facilitation

The Importance of Human-Driven Facilitation
● Co-creating and co-learning with diverse communities is key to broad adoption and understanding of innovative technologies.
● Involved in national community-led efforts extending facilitation.
  ○ ACI-REF, Cyberteams, CaRCC, PEARC
● Provides BU/Harvard with a unique competitive advantage.

Boston University and Harvard both:
● Have well established, large, and mostly centralized groups.
● Succeed through scalable infrastructure/services coupled with human-driven facilitation.
● Serve broad and diverse research domains and communities.
● Are involved with innovative cloud researchers at local universities.


NERC Pilot Timeline

FY19 FY20 FY21

Planning: Begin Collaboration
- Develop key concepts, governance, budget

Phase One: Minimum Viable Product
- Add 2 Systems Engineers
- Repurpose servers, set up IaaS, connect to NESE storage
- Align POC needs

Phase Two: Standardization & Automation
- Add 2 Systems Software Developers
- Build resource allocations and accounting portal
- Deploy PaaS (OpenShift, OpenData Hub)

Phase Three: Scaling User Base & Operations
- Deploy SaaS (fully self-service data science)
- Engage external HU/BU POC use cases
- Engage with other MGHPCC Systems Engineers
- Open to all HU/BU research


POC Use Cases

Harvard & BU: General Purpose: SaaS notebook service that would support RStudio, JupyterHub, …
● Open OnDemand – OSC - NSF Funded Web-app
● Domino Data Lab – Pilot with HBS
● Open Data Hub – Data and AI platform in partnership with Red Hat

Mike Dietze: BU, Department of Earth and Environment
● Ecological Forecasting Lab PEcAn (Predictive Ecosystem Analyzer)

Frederick Jansen: BU, Hariri Software and Application Innovation Lab (SAIL)
● Conclave - Multi-party computation (MPC)
● Single Cell Toolkit - Joshua Campbell, BU Medical Center

Eric Kolaczyk: BU, MS Statistical Practice (MSSP)
● Consulting Service Projects


POC Use Cases

Curtis Huttenhower: Harvard, SPH Biostats
● Galaxy - open source, web-based platform for data intensive biomedical research.

Randy Buckner: Harvard, Psychology NCF; SPH; MGH; McLean
● DPdash is a Deep/Digital Phenotyping Dashboard designed to manage and visualize multiple data streams coming in continuously over extended periods of time in individuals

Alyssa Goodman: Harvard, Center for Astrophysics
● Glue - Python visualization library to explore relationships within and between related datasets

Francesca Dominici: Harvard, Biostatistics, Population and Data Science
● Soot Project: Air pollution / Environmental Health

Jeff Blossom, Ben Lewis: Harvard, Center for Geographic Analysis
● Worldmap: Open Source web-based platform to build your own mapping portal


Benefits of NERC

Technology Infrastructure for a Wide Range of Communities
• Ubiquitous infrastructure for data analytics
• Boston University Example:
  • 2012 → 51 departments: Basic Science, Engineering, & Medicine
  • 2019 → 100 departments: Every University College & Center
• Extend services to regional Colleges and Universities

Workforce Development/Training
• 35% of all degrees in Commonwealth are from MGHPCC Universities.
• Extending data-analytics platforms and curriculum beyond higher-ed

Resources Supporting Competitiveness and Innovation
• Innovation Hub provides industry partners the ability to service NERC’s extensive communities.
• Large-scale multi-institutional cloud environment would place MGHPCC in a league of world class facilities.


OpenInfra Labs (OIL)

[Diagram: MOC with Cloud Research, OIL, OCT, Core Partners, and NERC]


OpenInfra Labs

An OSF pilot project

Connecting open source projects to production


Why? Computing demand is growing and changing

● Top IT folks at most of the 151 North American organizations with their own data centers surveyed recently by IHS Markit expect to at least double the number of physical servers in their data centers this year

● New workloads and requirements like AI/Machine Learning, 5G, and edge computing are putting new demands on underlying infrastructure technologies

● The proliferation of open source software brings more innovation and new capabilities faster than ever before, along with operational challenges


OSF Global Community

100,000 MEMBERS

187 COUNTRIES

675 ORGANIZATIONS


Platinum & Gold Members


Open Infrastructure today requires integrating many open source components

OpenStack for cloud services, Ceph for storage, Kubernetes for container orchestration, and TensorFlow for machine learning are just a few example projects that deliver modern capabilities:

Each integrates with, relies on, & enables dozens of other OSS projects.

This many-to-many relationship poses an integration challenge that can make open source options less attractive than proprietary choices which are more turn-key.


OpenInfra Labs - Last Mile for OSS

● Integrated testing of all the software necessary to provide a complete use case

● Documentation of operational and functional gaps required to run upstream projects in a production environment

● Shared code repositories for operational tooling and the “glue” code that is often written independently by users


Looking to the future

● Prioritize existing requirements and use cases
● Expand to broader set of institutions and organizations
● Replicate & limited federation to other regions & commercial clouds
● Explore broader set of services and heterogeneous infrastructure


Ways To Help

● People: help operate participating community clouds
● Code: developers contribute on priority technical gaps
● Hardware: more gear to run in MOC and community clouds
● Partnerships: federated connections between clouds and commercial providers
● Funding: put to use on building and operating the community cloud environments


Join Us!

● Chat on freenode IRC: #openinfralabs
● Email: http://lists.opendev.org/cgi-bin/mailman/listinfo/openinfralabs
● Code repos: https://opendev.org/openinfralabs

https://openinfralabs.org/


Operate First (OF)

[Diagram: MOC with Cloud Research, OIL, OCT, Core Partners, NERC, and OF]


To Operate At Scale, We Must Operate Upstream First

Operate First

Hugh Brock
Research Director, Red Hat Office of the CTO

Before Open Source, Code Was Value

But operating the code, even at scale, was left to the folks in the basement

Then open source happened...

“RALEIGH, N.C. — March 26, 2002 —Red Hat, Inc. (Nasdaq: RHAT) today announced Red Hat Linux Advanced Server, the first enterprise-class Linux operating system.”

… And The Value Moved From The Code To The Product

But we still didn’t put much value on what those folks in the basement were doing


Then everything grew like crazy and scale got really really important

“Amazon Web Services (AWS) had $17.46 billion in annual revenue in 2017. By end of 2018, the number had grown to $25.65 billion. AWS reported 37% growth in 2019. In 2019, AWS alone accounted for 12% of Amazon's profits (up from 11% in 2018).”

Suddenly The Folks In The Basement Are Valuable!

… And they’re not in the basement any more. But like code in the days before open source, the tools and techniques and knowledge of operation at scale are proprietary.

If the value in IT is in ops, and ops are proprietary, then open source has a problem.

Operate First Is The Solution

With the Mass Open Cloud and Open Infra Labs, Red Hat is launching an effort to open source cloud operations at scale.

Upstream Projects Operate First
We will open the MOC to upstream communities who need a place to operate their services in order to develop them.

Red Hat Products Operate First
We will begin operating our own products on the MOC in the open, before we ship them to our customers.

Open Telemetry, Open Tracing, Open Ops
We will work with the community around the MOC to gather data and develop new tools in the open that will be the key to autonomous operations at scale.

Join Us.

linkedin.com/company/red-hat

youtube.com/user/RedHatVideos

facebook.com/redhatinc

twitter.com/RedHat

Red Hat is the world’s leading provider of enterprise open source software solutions. Award-winning support, training, and consulting services make Red Hat a trusted adviser to the Fortune 500.

Thank you


Thank you to all the collaborators!


New Hardware from Two Sigma

● 180 (slightly used) servers

● Mass CloudLab (19 additional R630 => 380 additional cores)

● 3 additional racks of servers (R630, R620, and R720/R730):

• ~ 1600 cores

• Part of CloudLab

• Part of MOC

• Can be used by industry


Interrelated initiatives

● Mass Open Cloud (MOC): experimental services

● Open Cloud Testbed (OCT): cloud research

● New England Research Cloud (NERC): a production cloud

● Open Cloud Initiative (OCI): standardized reproducible and federated open source clouds

● Operate First: open source for operations


The whole is greater...

● Provide an economical scalable cloud for research users

● Accelerate cloud research and development

● Enable broad industry involvement in cloud technologies

● Greatly accelerate open source community

● Enable a broad diversity of clouds to be more easily stood up and then federated together

A broader vision than what the MOC started with in 2014...


Today’s schedule: Big picture

Time             Session
8:00 – 9:00      Check-in, Breakfast
9:00 – 10:30     Welcome and overview
10:30 – 10:50    Break
10:50 – 11:30    Keynote - Chris Wright
11:30 – 12:30    Micro-talks 1
12:30 – 2:00     Lunch & Posters
2:00 – 3:30      Industry Luminaries Panel
3:30 – 4:10      Keynote - Matthew Adiletta
4:10 – 4:30      Break
4:30 – 5:30      Micro-talks 2
5:30 – 7:35      Closing Remarks and Reception


Micro-talks day 1

● The Open Cloud FPGA Testbed – Martin Herbordt, BU, and Miriam Leeser, NU

● Programming FPGAs – The Open Source Way – Ahmed Sanaullah, Red Hat

● Leveraging Distributed Research Cloud Infrastructures ... – Anirban Mandal, RENCI

● Management ... Cloud Resources – Jonathan Chamberlain and Zhenpeng Shi, BU

● Secure and Customized Hypervisor with Qemu – Daniele Buono, IBM

● Harvard Data Commons – Merce Crosas, Harvard

● RSpace electronic lab notebook... – Rory Macneil, RSpace

● The Open Storage Network – John Goodhue, MGHPCC

● What is FABRIC? – Paul Ruth, RENCI

● Where is Ironic, and where is it going? – Julia Kreger, Red Hat


Testbed capabilities

● The Open Cloud FPGA Testbed – Martin Herbordt, BU, and Miriam Leeser, NU

● Programming FPGAs – The Open Source Way – Ahmed Sanaullah, Red Hat

● Leveraging Distributed Research Cloud Infrastructures ... – Anirban Mandal, RENCI

● Management ... Cloud Resources – Jonathan Chamberlain and Zhenpeng Shi, BU

● Secure and Customized Hypervisor with Qemu – Daniele Buono, IBM

● Harvard Data Commons – Merce Crosas, Harvard

● RSpace electronic lab notebook... – Rory Macneil, RSpace

● The Open Storage Network – John Goodhue, MGHPCC

● What is FABRIC? – Paul Ruth, RENCI

● Where is Ironic, and where is it going? – Julia Kreger, Red Hat


Hybrid Cloud for Science

● The Open Cloud FPGA Testbed – Martin Herbordt, BU, and Miriam Leeser, NU

● Programming FPGAs – The Open Source Way – Ahmed Sanaullah, Red Hat

● Leveraging Distributed Research Cloud Infrastructures ... – Anirban Mandal, RENCI

● Management ... Cloud Resources – Jonathan Chamberlain and Zhenpeng Shi, BU

● Secure and Customized Hypervisor with Qemu – Daniele Buono, IBM

● Harvard Data Commons – Merce Crosas, Harvard

● RSpace electronic lab notebook... – Rory Macneil, RSpace

● The Open Storage Network – John Goodhue, MGHPCC

● What is FABRIC? – Paul Ruth, RENCI

● Where is Ironic, and where is it going? – Julia Kreger, Red Hat


Industry Luminaries Panel

• Moderator: Jonathan Bryce, Executive Director, OSF

• Panel:

• Matthew J. Adiletta, Intel® Senior Fellow and Director of the Systems Innovation Lab

• Mark Astley, Head of Data Engineering at Two Sigma Investments

• Gene Bagwell, Associate Fellow in Verizon’s Technology Architecture & Strategy Group

• Dr. Stefanie Chiras, VP and GM of the RHEL Business Unit at Red Hat

• Briana Franks, Director of PM, IBM Cloud and Cognitive Software


● ESI

– ESI/Multi-tenant IRONIC ESI Integration with CloudLab

● AI for researchers

– Tools for AI in an open cloud offering.

● OIL CODE Contributions

– Goals for contributions to OIL.

● operators for OpenStack hosted services

– How can OpenStack leverage Kubernetes operators ecosystem to expand services.

● Monitoring Requirements for Open Infra Labs

– monitoring/telemetry system for an Open Cloud?

● ChRIS platform -- current status, updates, and what's next

– Platform for healthcare: updates and discussion

● OIL Community Building

– What are the groups and the expectations those groups have?

Latest Schedule and Locations at Front Desk or Follow this Link:


Time Session

8:00 – 9:00 Breakfast

9:00 – 9:45 Keynote - Giovanni Pacifici

9:45 – 10:30 Open Cloud Testbed

10:30 – 11:00 Break

11:00 – 12:30 Cloud Computing Research

12:30 – 2:00 Lunch & Posters

2:00 – 3:30 Deep Dives & Micro-talks 3

3:30 – 4:00 Break

4:00 – 5:30 Deep Dives & Micro-talks 4

5:30 – 6:05 Closing Remarks

Tuesday: Focus on cloud research


Time Session

8:00 – 9:00 Breakfast

9:00 – 9:45 Keynote - Giovanni Pacifici

9:45 – 10:30 Open Cloud Testbed

10:30 – 11:00 Break

11:00 – 12:30 Infrastructure Cloud Computing Research Panel

12:30 – 2:00 Lunch & Posters

2:00 – 3:30 Deep Dives & Micro-talks 3

3:30 – 4:00 Break

4:00 – 5:30 Deep Dives & Micro-talks 4

5:30 – 6:05 Closing Remarks


Infrastructure for Cloud Research

• Moderator: Jack Brassil is the Senior Director of Advanced CyberInfrastructure at Princeton

• Panel:

• Jack Brassil: Director of Advanced CyberInfrastructure, Princeton

• James Barr von Oehsen: Associate VP OARC, Rutgers

• Elizabeth Bruce: Director Technology and Corporate Responsibility, MSFT

• John Cohn: IBM Fellow, MIT-IBM Watson AI Research Group

• Rob Ricci: PI of the NSF funded CloudLab research infrastructure, Utah

• Paul Ruth: co-PI of the NSF funded Chameleon Cloud research infrastructure, UNC


Micro-talks day 3

● Challenges and Opportunities for AI Industrialization – Hui Lei, Futurewei

● Bayesian Learning for Online Tuning of Complex Apps – Rohan Basu Roy, NU

● Hybrid Cloud Storage – Emine Ugur Kaynar and Amin Mosayyebzadeh, BU

● Impact of OS Design and HW Configuration on the Power ... – Han Dong, BU

● Using ESI in ... Why and How? – Sahil Tikale, BU, and Apoorve Mohan, NU

● Providing ... insights into enterprise datacenters – John Liagouris, BU

● A Just-in-Time Framework for Tracing Cloud Applications – Emre Ates, BU

● Workflow motif: an abstraction for debugging distributed systems – Mania Abdi, NU

● Challenges and opportunities in large-scale stream processing – Vasiliki Kalavri, BU

● Fuzzing Virtual Devices in Cloud Hypervisors – Alexander Bulekov, BU


Deep Dives

• The Open Cloud FPGA Testbed – Martin Herbordt, BU & Miriam Leeser, NEU

• Open Data Hub – Sherard Griffin, Red Hat

• OpenStack on CloudLab Demo and Tutorial – David Irwin, UMass

• Hybrid Cloud Storage – Peter Desnoyers, Northeastern University
