
EMI INFSO-RI-261611 Cloud Task Force Eric Yen and Simon Lin 22 Nov. 2010


Toward the DCI on Grid and Cloud

• Enabling collaboration to realize that the whole is greater than the sum of its parts
• WWG realized the global e-Infrastructure to share resources over the Internet

(Mário Campolargo, European Commission - DG INFSO – OGF 23, Barcelona, June 2008)

• Cloud offers versatile granularity and new usage patterns to the DCI services
– Granularity: service-oriented layers in infrastructure, platform, software, data, network, etc.
– Usage pattern: on-demand elasticity
• More user-customized and user-controlled environments on remote resources


DCI: e-Science Infrastructure

• Driven by Data Deluge
– Turning data into insight and knowledge base efficiently
– Open, consistent and well-designed data format, interface, protocol and quality code
– Searchability, accessibility and sustainability
• Resources and Tools are sharable cross-disciplinarily
• Enable Service-Oriented Science
– “scientific research enabled by distributed networks of interoperating services”
– New e-Infrastructure is required to host both the data and services


Definition of Cloud Computing

• Definition (NIST)
– Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.
• The challenge of IT is the complexity rather than the scale
• Therefore, the goals of DCI are to have
– Elastic services & resources: on-demand, unlimited scalability, dynamic & low-cost migration
– Customized applications/tools/services to accelerate knowledge discovery: simple tools to answer complex questions

– …


Requirements from e-Science Use Cases

• Job and resource matchmaking now depends on performance (architecture features), reliability, cost and data availability, rather than being dominated by resource availability
– Jobs are then queued by priority at the target site
– Auto migration, VM management, Cloud federation
• Jobs can overflow to collaborating sites (of the same VO or even of other VOs) whenever necessary, based on a performance requirement (e.g., Time to Finish)
• Computing environment on-demand
– E.g., MapReduce computing environment; distributed database like HBase; file system (GPFS/HDFS/Lustre/NFS), etc.
• Storage space on-demand?
– Have to retain the original consistent storage system
– Have to meet performance requirement
• Data as a Service
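
The matchmaking and overflow requirements above can be illustrated with a small sketch. This is not an EMI component: the `Site` fields, the `choose_site` function and the ranking order (data availability, then reliability, then cost) are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical description of a site as seen by the matchmaker."""
    name: str
    est_time_to_finish_h: float  # estimated Time to Finish for this job, in hours
    reliability: float           # fraction of jobs that complete, 0..1
    cost_per_job: float
    has_input_data: bool


def choose_site(home, partners, deadline_h):
    """Keep the job at the home site if it meets the deadline, else overflow."""
    if home.est_time_to_finish_h <= deadline_h:
        return home
    # Only collaborating sites that can meet the Time to Finish requirement qualify.
    candidates = [s for s in partners if s.est_time_to_finish_h <= deadline_h]
    if not candidates:
        return home  # nothing better is available, so the job queues at home
    # Assumed ranking: data locality first, then reliability, then lower cost.
    return max(candidates,
               key=lambda s: (s.has_input_data, s.reliability, -s.cost_per_job))


home = Site("home", 10.0, 0.99, 0.0, True)
partners = [Site("a", 3.0, 0.90, 1.0, False), Site("b", 4.0, 0.95, 2.0, True)]
print(choose_site(home, partners, deadline_h=5.0).name)
```

A real matchmaker would also have to take VO policy and queue priority at the target site into account; both are left out here.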


Requirements from Cloud Service Providers

• Easier, more user-friendly AAI
• Storage as a Service
• Cloud Federation
– Interoperability (DCI information system), automatic scaling, HA provisioning, policy, AAI, accounting and monitoring, …
• Standards to prevent vendor lock-in
• Network as a Service


Collaborations

• OCCI (and OGF)
• VenusC
• StratusLab
• WNoD
• RESERVOIR (OpenNebula)
• EDGI
• SIENA (Standards and Interoperability for eInfrastructure implemeNtation initiAtive)
• Globus and Nimbus
• DMTF (Distributed Management Task Force)
• OMG (Object Management Group)
• SNIA (Storage Networking Industry Association)
• OCC (Open Cloud Consortium)


Challenges of Cloud at this moment

• Application management
– Community appliance management and customized computing environment provisioning
• AAI trust framework
• Reliability
• Elasticity
– Automatic overflow and scalability
– Scale out to new communities with diverse needs
• Performance of deployment and runtime
– Data transfer: LanTorrent, streaming with minimal congestion, avoiding duplicate transfers
– 1,000 VMs in 10 min (by Nimbus)
• Cost
• Standardization
• Interoperability and Federation (with both Clouds and Grids)


What Should the Cloud Task Force Do?

• Interoperability between grids, supercomputers and emerging computing models like clouds and desktop grids will be extended to address scalability and accessibility requirements.
• Evaluate integration scenarios with off-the-shelf computing cloud systems
• Bridge the gap between grid and cloud infrastructure
• Report with architecture/infrastructure sketch
• M30 - Successful computational usage of emerging computing models, i.e., clouds, with EMI components (and a test bed)


Plan for the next 6 months

• Identify probable scenarios for the evolution of EMI and Cloud
– Based on e-Science applications
– Barriers to effective computing, vision of the future computing model from the user community, etc.
• Identify which EMI components should be adapted to, made interoperable with, or integrated with which Cloud technologies, with steps and schedule
– OCCI focuses on generic standardization issues from the use case analysis, performance assessment metrics, etc.
– EMI should aim for the scope between EMI software stacks and Clouds
– What cloud technology has to be deployed
• Test bed design and evaluation
• Make recommendations on EMI components from the evolution roadmap


Backup Slides


Deliverable & Milestone

• Identify which EMI components should be adapted to, made interoperable with, or integrated with which Cloud technologies, with steps and schedule
– EMI MW stack
  • Or just in terms of the EMI areas first
– What cloud technology has to be deployed
• Make recommendations on EMI components from the evolution roadmap


Experiences from User Communities

• STAR
– Nimbus + EC2 (Magellan)
• Ocean Observatories Initiative
– Adaptive, reliable and elastic computing, and resource strategy
– HA
– Nimbus + EC2 + FutureGrid
• BaBar
– Nimbus for appliance preparation and cloud scheduler
• CANFAR (Canadian Advanced Network for Astronomical Research)
– Nimbus + WestGrid for appliance management
• CloVR
– Virtual appliance for automated and portable sequence analysis: push-button pipelines, from desktop to cloud
– Nimbus + EC2 + Magellan


Sky Computing


Network as a Service

• The LHC experiments, with their distributed Computing Models and world-wide hands-on involvement in LHC physics, have brought renewed focus on networks
– This has given the experiments the confidence to seek more agile and effective models of data distribution and/or remote access
– Bringing new challenges and opportunities, both for the Computing Models and a broader network infrastructure
• An exponential growth in capacity
– 10X in usage every 47 months in ESnet over 18 years
– 6M times capacity growth over 25 years across the Atlantic (LEP3Net in 1985 to US LHCNet in 2010)
– The transitions from 10G to 40G (2011-12) and 100G (2013-14) are the next steps
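
To compare the two growth figures above on a common scale, they can be converted to an equivalent yearly multiplier. The sketch below assumes steady compound growth over each period, which is a simplification and not a claim made on the slide; the helper name `annual_growth_factor` is chosen here for illustration.

```python
def annual_growth_factor(total_factor, years):
    """Constant yearly multiplier that compounds to total_factor over `years`."""
    return total_factor ** (1.0 / years)


# 10x in usage every 47 months (ESnet)
esnet = annual_growth_factor(10, 47 / 12)
# 6M times capacity growth over 25 years (LEP3Net 1985 to US LHCNet 2010)
atlantic = annual_growth_factor(6e6, 25)

print(f"ESnet usage: x{esnet:.2f} per year")
print(f"Transatlantic capacity: x{atlantic:.2f} per year")
```

Under that assumption both figures work out to a little under a doubling each year, so the two statements describe a similar pace of growth.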


Purposes of the TF Session & from the AHM

• Work Plan (with directional issues)
– Ask for comments
– Call for contributions
• Identify the division of functions between the Virtualization TF and the Cloud TF
• Define the agenda for the next 6 months
• Request a session at the OGF in Taipei, and at the EGI UF as well


IaaS PaaS SaaS DaaS NaaS

Common On-demand,

Information & Monitoring

Job Management

Auto overflow to other site/cloud

Customized comp env. provisioning

HPC, HTC

Data

Search, access Light speed connection with flexible bandwidth

Security


Notes

• Automatically scale out and overflow computations to any available resources
– CEs capable of scaling out to clouds and cloud services
• Data transmission at the speed of light
• Security & trust framework: security team?
• Review the US solutions
• Cloud Federation
• Summary from HEPiX (summarized slides)
• Storage
• Benchmarking


Organisation of HEPiX Virtualization WG

5 work areas
– Image Generation Policy
– Image Exchange
– Image Expiry/Revocation
– Image Contextualisation
– Multiple Hypervisor Support


Policy for Trusted Image Generation

• You recognise that VM base images, VO environments and VM complete images must be generated according to current best practice, the details of which may be documented elsewhere by the Grid. These include but are not limited to:
– any image generation tool used must be fully patched and up to date;
– all operating system security patches must be applied to all images and be up to date;
– images are assumed to be world-readable and as such must not contain any confidential information;
– there should be no installed accounts, host/service certificates, ssh keys or user credentials of any form in an image;
– images must be configured such that they do not prevent Sites from meeting the fine-grained monitoring and control requirements defined in the Grid Security Traceability and Logging policy to allow for security incident response;
– the image must not prevent Sites from implementing local authorisation and/or policy decisions, e.g. blocking the running of Grid work for a particular user.

• http://www.jspg.org/wiki/Policy_Trusted_Virtual_Machines
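
The "no credentials in an image" rule lends itself to a simple automated pre-check. The sketch below is only an illustration of that idea: the file-name list and the function `find_credential_files` are hypothetical and are not part of the policy, and a real audit would need to go well beyond file names.

```python
from pathlib import PurePosixPath

# Hypothetical examples of file names that suggest an embedded credential.
FORBIDDEN_NAMES = {
    "authorized_keys",
    "id_rsa",
    "id_dsa",
    "hostcert.pem",
    "hostkey.pem",
}


def find_credential_files(paths):
    """Return the paths (in input order) whose file name looks like a credential."""
    return [p for p in paths if PurePosixPath(p).name in FORBIDDEN_NAMES]


# Example listing of files inside an image, assumed to come from any listing tool.
listing = [
    "/etc/passwd",
    "/root/.ssh/id_rsa",
    "/etc/grid-security/hostcert.pem",
]
for hit in find_credential_files(listing):
    print("credential-like file found:", hit)
```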


Image Cataloguing and Exchange


Image Contextualisation

• Contextualisation is needed so that sites can configure images to interface to local infrastructure
– e.g. for syslog, monitoring & batch scheduler.
• Contextualisation is limited to these needs! Sites may not alter the image contents in any way.
– Any site concerned about security aspects of an image should refuse to instantiate it and notify the endorser.
• Contextualisation mechanism
– Images should attempt to mount a CDROM image provided by the sites and, if successful, invoke two scripts from the CDROM image:
  • prolog.sh before network initialisation
  • epilog.sh after network initialisation
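
The two-script mechanism above can be sketched from the image's side. This is an illustrative sketch only: it assumes the CDROM image has already been mounted (or the mount attempt has failed) before the function is called, and the function name `run_context_script` and the phase labels are hypothetical.

```python
import subprocess
from pathlib import Path

# Script names come from the slide; the phase labels are assumptions.
SCRIPTS = {"pre-network": "prolog.sh", "post-network": "epilog.sh"}


def run_context_script(mount_point, phase):
    """Run the site-provided script for this boot phase, if it exists.

    Returns the script's exit code, or None when no script was found
    (for example because no contextualisation CDROM was provided).
    """
    script = Path(mount_point) / SCRIPTS[phase]
    if not script.is_file():
        return None  # nothing to do: the boot continues unchanged
    # check=False: a failing site script is reported to the caller, not raised.
    return subprocess.run(["/bin/sh", str(script)], check=False).returncode
```

An init hook in the image would call `run_context_script(mount_point, "pre-network")` before bringing up the network and `run_context_script(mount_point, "post-network")` afterwards.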


Multiple Hypervisor Support

• Andrea
– Surveyed sites; results show that KVM and Xen dominate as hypervisors, especially in the batch virtualisation area.
– Documented a method to produce a VM image that can be used with both KVM and Xen
  • Method tested by Sebastien Goasguen and Abdeslem Djaoui (RAL)


Current Status

• Generation policy
– Clear, but probably needs to be formally approved by JSPG?
• Contextualisation & KVM/Xen support
– Also clear.
• Image Cataloguing & Exchange
– Ideas sound, and working internal catalogue @ CERN, but we need functioning inter-site exchange!
– Key issue is lack (to date) of working group member(s) with management support to deliver (and support!) a solution.
– StratusLab, as Michel will report later this week, has developed similar ideas.
  • Joint intention to explore collaboration, but no opportunity to do so before late November.


Other thoughts

• The CernVM filesystem offers an attractive way to ensure sites have the correct VO software, reducing the need for VM images as a mechanism for this.
– See Ian Collier’s presentation shortly.
– CVMFS team have asked Romain Wartel to lead a security audit
  • This should allay any fears from sites about using CVMFS.
• Virtualisation is an area with much scope for communication failures!
– We must be clear that “image endorsement” is a very rapid process
  • The person creating an image endorses it, and it is then immediately available for instantiation by all sites who trust the endorser; there is no need for a lengthy process of verification at sites.
– Some sites talk about restricting instantiated VM images but the actual impact for end-users is likely small
  • e.g. VM images would have no need to connect to an NFS-based shared storage area, so it would not matter if “isolated from the rest of the network” just means “no access to our NFS servers”. This appears to be the likely situation at NIKHEF, one of the sites most reluctant to enable instantiation of remotely generated images.
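
The endorsement model described above (an image is usable as soon as a trusted endorser publishes it, with no per-site verification) amounts to a filter over the image catalogue. The sketch below assumes a catalogue entry carries an endorser, an expiry date and a revocation flag; the `ImageEntry` type and `instantiable` function are hypothetical and do not describe the working group's actual catalogue.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ImageEntry:
    """Hypothetical catalogue record for an endorsed VM image."""
    image_id: str
    endorser: str
    expires: date
    revoked: bool = False


def instantiable(catalogue, trusted_endorsers, today):
    """IDs of images a site may start: trusted endorser, not revoked, not expired."""
    return [
        entry.image_id
        for entry in catalogue
        if entry.endorser in trusted_endorsers
        and not entry.revoked
        and entry.expires >= today
    ]


catalogue = [
    ImageEntry("img-1", "endorser-a", date(2011, 6, 30)),
    ImageEntry("img-2", "endorser-b", date(2011, 6, 30)),
    ImageEntry("img-3", "endorser-a", date(2010, 1, 1)),
    ImageEntry("img-4", "endorser-a", date(2011, 6, 30), revoked=True),
]
print(instantiable(catalogue, {"endorser-a"}, date(2010, 11, 22)))
```

The site's only decision is which endorsers go into `trusted_endorsers`; expiry and revocation are handled by the catalogue data.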


Summary

• The working group has made good progress in establishing policies to allow the exchange of VM images…
• … but not such good progress in delivering a distributed catalogue of endorsed images.
• CVMFS is probably the neatest solution to the problem of VO software distribution…
• … but VM exchange remains interesting
– as an option for sites to run hypervisors not OSes and automatically migrate to the latest patched system as images instantiate, and
– if the VM images can contact pilot job frameworks directly, simplifying the scheduling problems at sites.
