Cloud Computing and Grid Computing 360-Degree Compared
Rashid Tahir

Then to now!
- 1961: John McCarthy predicts that computation may someday be organized as a public utility.
- Mid-1990s: The term "Grid" is coined by the community.
- 2002: Ian Foster publishes a grid checklist.
- July 2002: Amazon launches AWS.
- Subsequent years: Large-scale federated systems such as TeraGrid and the Open Science Grid are developed.

What is the Grid? A Three Point Checklist

Definitions!!
Grid Computing:
- The ability, using a set of open standards and protocols, to gain access to applications and data, processing power, storage capacity and a vast array of other computing resources over the Internet.
- A grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of resources distributed across multiple administrative domains based on their (the resources') availability, capacity, performance, cost and users' quality-of-service requirements.[1]
1- IBM Solutions Grid for Business Partners: Helping IBM Business Partners to Grid-enable applications for the next phase of e-business on demand

Characteristics of a Grid
- Coordinates resources that are not subject to centralized control
- Uses standard, open, general-purpose protocols and interfaces
- Delivers non-trivial qualities of service

Definitions!!
Cloud Computing: A large-scale distributed computing paradigm that is driven by economies of scale, in which a pool of abstracted, virtualized, dynamically scalable, managed computing power, storage, platforms, and services are delivered on demand to external customers over the Internet.

Characteristics of a Cloud
- Massively scalable
- Can be encapsulated as an abstract entity that delivers different levels of services to customers outside the Cloud
- Driven by economies of scale
- Can be dynamically configured (via virtualization or other approaches) and delivered on demand

The BIG Question!!!
Is Cloud Computing just a new name for Grid?

- YES: The vision is the same.
- BUT NO: The approach is different.
- NEVERTHELESS, YES: The problem spaces that the two paradigms attempt to address overlap.

Why cloud computing?
- Exponentially growing data size in scientific instrumentation/simulation and in Internet publishing and archiving
- Widespread adoption of computing services and Web 2.0 applications
- Rapid decrease in hardware cost and increase in computing power and storage capacity, along with the advent of multi-core architectures and modern supercomputers consisting of hundreds of thousands of cores

Relationship between Clouds and other domains it overlaps with

Side-by-Side: Grids vs. Clouds
Six comparison metrics:
- Business Model
- Architecture
- Resource Management
- Programming Model
- Application Model
- Security Model

Business Model
Cloud:
- On-demand, pay-per-use model
- Services are usually charged per instance-hour, per GB-month of storage, per TB of data transferred each month, etc.
Grid:
- Project-oriented model
- Users pool resources that are shared by the community
- Each Grid site is responsible for maintaining its own set of resources
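The pay-per-use charging scheme can be illustrated with a small calculation. This is only a sketch: the `Rates` class, the `monthly_bill` function, and every price in it are hypothetical and are not taken from any real provider's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical prices in US dollars (assumed for illustration only)."""
    per_instance_hour: float = 0.10
    per_gb_month_storage: float = 0.15
    per_tb_transfer: float = 100.0


def monthly_bill(instance_hours, gb_stored, tb_transferred, rates=Rates()):
    """Add up the three usage-based charges for one month."""
    compute = instance_hours * rates.per_instance_hour
    storage = gb_stored * rates.per_gb_month_storage
    transfer = tb_transferred * rates.per_tb_transfer
    return round(compute + storage + transfer, 2)


if __name__ == "__main__":
    # 100 instance-hours, 200 GB stored, 0.5 TB transferred
    print(monthly_bill(100, 200, 0.5))
```

In the Grid's project-oriented model there is no comparable per-unit bill; access is typically granted to a project as a share of the pooled resources.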

Architecture (Grid)
Grids provide protocols and services at five different layers:
- Fabric Layer: Provides access to different resource types such as compute, storage and network resources, code repositories, etc.
- Connectivity Layer: Defines core communication and authentication protocols for easy and secure network transactions.
- Resource Layer: Defines protocols for the publication, discovery, negotiation, monitoring, accounting and payment of sharing operations on individual resources.
- Collective Layer: Captures interactions across collections of resources.
- Application Layer: Comprises the user applications that are built on top of the above protocols and APIs and operate in Virtual Organization (VO) environments.

Architecture (Cloud)
Cloud architecture can be divided into four different layers:
- Fabric Layer: Contains the raw hardware-level resources.
- Unified Resource Layer: Contains resources that have been abstracted/encapsulated (usually by virtualization) so that they can be exposed to the upper layers.
- Platform Layer: Adds a collection of specialized tools, middleware and services on top of the unified resources.
- Application Layer: Contains the applications that run in the Clouds.

Architecture: Grids vs. Clouds

Architecture (Cloud)
Clouds provide services at three different levels:
- IaaS (Infrastructure as a Service): Provisions hardware, software, and equipment to deliver software application environments with a resource usage-based pricing model.
- PaaS (Platform as a Service): Offers a high-level integrated environment to build, test, and deploy custom applications.
- SaaS (Software as a Service): Delivers special-purpose software that is remotely accessible by consumers through the Internet with a usage-based pricing model.
Together the three are referred to as the Cloud Computing Onion.

Cloud Computing Onion

Resource Management
Comparison based on the following metrics:
- Compute Model
- Data Model
- Data Locality
- Combining Compute and Data Management
- Virtualization
- Monitoring
- Provenance

Resource Management
Compute Model
  Cloud: Users share resources simultaneously and are serviced instantly.
  Grid: Queue-based, batch-scheduled model.
Data Model
  Cloud: The current data model houses the data inside the cloud; this could change in the future.
  Grid: Specially designed Data Grids, along with metadata catalogs that keep track of the location of each piece of data.
Data Locality
  Cloud: Strong focus on storing and replicating data near the associated compute unit.
  Grid: Data is stored in shared file systems, where data locality cannot easily be applied. However, data-aware schedulers dramatically improve performance.
Combining Compute and Data Management
  Cloud: Initially clouds were not used much for data-intensive applications, so little work had been done on managing large amounts of data across compute resources. This has since changed substantially.
  Grid: Grids achieve this well because data-aware schedulers place jobs close to the data they need.
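The idea behind a data-aware scheduler can be sketched in a few lines. This is a simplified illustration, not the algorithm of any particular Grid scheduler: the function `pick_node`, the dictionary standing in for a metadata catalog, and the file and node names are all hypothetical.

```python
def pick_node(job_inputs, replica_locations, free_nodes):
    """Return the free node that already holds the most of the job's inputs.

    job_inputs: list of file names the job reads.
    replica_locations: dict mapping file name -> set of nodes holding a copy
        (a stand-in for a metadata catalog).
    free_nodes: iterable of node names currently able to accept work.
    """
    candidates = sorted(free_nodes)  # sorted so ties break alphabetically
    if not candidates:
        return None

    def local_files(node):
        # Count how many input files have a replica on this node.
        return sum(1 for f in job_inputs if node in replica_locations.get(f, set()))

    return max(candidates, key=local_files)


if __name__ == "__main__":
    catalog = {"a.dat": {"n1", "n2"}, "b.dat": {"n2"}, "c.dat": {"n3"}}
    print(pick_node(["a.dat", "b.dat"], catalog, {"n1", "n2", "n3"}))
```

A production scheduler would also weigh file sizes, queue lengths and network cost; counting local replicas is only the simplest possible locality score.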

Resource Management
Virtualization
  Cloud: Highly virtualized to meet service level agreements. Abstractions are provided at each layer to assist the process of virtualization.
  Grid: Little or no support for virtualization.
Monitoring
  Cloud: Since the services provided are layer-specific, monitoring information is not provided for the entire system.
  Grid: Users have greater flexibility over the resources they are allocated and hence can deploy fine-grained monitoring infrastructure.
Provenance
  Cloud: Still an under-explored area; however, given that clouds are increasingly being used for e-science research, new provenance systems are emerging fast.
  Grid: Since Grids are project-oriented, provenance is essential, and Grids therefore provide a lot of support for it.

Programming Model
Cloud:
- The most common parallel programming model is MapReduce
- Standard parallel programming models are also used
- Scripting is used in place of workflow management systems
- Clouds have generally adopted Web Services APIs for providing services over the web
Grid:
- The most commonly used parallel programming model is based on message passing
- Less-used models employ coordination languages that allow heterogeneous components to coordinate and interact
- In a recent effort to develop a service-oriented Grid programming model, the community has started using WSRF (Web Services Resource Framework)
- Workflow management systems are used when processing large sets of data involving complex tasks
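The MapReduce model named above can be illustrated with a single-process word count. This is a toy sketch of the map, shuffle and reduce phases in plain Python; it does not use any real MapReduce framework, and the function names are chosen here for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values collected for one key."""
    return key, sum(values)


def word_count(documents):
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["grid cloud", "cloud cloud"]))
```

In a real deployment the map and reduce calls run on many machines and the shuffle moves data across the network; the programmer still writes only the two functions.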

Application Model
Cloud:
- Cloud computing is still in its infancy, so the application space is not yet clearly understood; however, most applications can be characterized as loosely coupled
- All three layers mentioned previously provide their own functionality, and applications and tools are being developed to exploit it
Grid:
- Grids support a wide variety of applications, such as High Performance Computing (HPC) and High Throughput Computing (HTC) applications
- Tightly coupled applications use the Message Passing Interface (MPI) for inter-process communication, whereas loosely coupled applications rely on workflow management systems
- Another emerging set of applications is scientific gateways, which provide a large variety of services through a browser-based user interface
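What "loosely coupled" means in practice can be shown with a minimal sketch: each task runs independently and never communicates with the others while it executes. The `simulate` function is a hypothetical stand-in for real work, and a local thread pool stands in for a set of cloud or grid workers.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(parameter):
    """Stand-in for one independent task; it needs no data from other tasks."""
    return parameter * parameter


def run_loosely_coupled(parameters, workers=4):
    """Run every task independently and collect the results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate, parameters))


if __name__ == "__main__":
    print(run_loosely_coupled([1, 2, 3]))
```

A tightly coupled application differs in that its processes exchange messages during the computation, which is why such codes rely on MPI and low-latency interconnects rather than on independent task dispatch.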

Security Model
Cloud:
- Clouds mostly comprise dedicated data centers with largely homogeneous infrastructure; however, compatibility issues arise when data centers interact with one another
- Clouds seem to have relatively simpler and less secure security models
- Security is the biggest obstacle to the wide-scale adoption of cloud computing
Grid:
- Grids are built under the assumption that the shared resources will be mostly heterogeneous, and hence are better equipped for interoperability and compatibility challenges
- Since each Grid site has operational autonomy, security is engineered into each site by its administrators
- The security model is more complex than that of Clouds and perhaps more time-consuming too

Conclusion
Clouds and Grids share a common vision and have overlapping architecture, technology and application space. However, they take different steps to provide scalable, distributed, on-demand computation, resulting in the evolution of two parallel infrastructures. Tomorrow's distributed systems will need to have the centralized scale of today's Cloud utilities and the distribution and interoperability of today's Grid facilities.

Discussion Questions
- Is Cloud computing the same as Grid computing? Why? Why not?
- What does the future entail for each of these paradigms? Can you see a unified distributed paradigm in the future? Why? Why not?
- Will the community shift more and more towards Clouds and Grids, or do you envision a future where high-end desktop machines will dominate the market?
- Do you think Clouds and Grids are reliable enough to be trusted with sensitive data and computation?
- Currently Clouds can't guarantee reliability, security and control; however, they provide ease of access, portability, etc. at a very low cost. Do you think the benefits outweigh the costs?

References
1- Ian Foster, Yong Zhao, Ioan Raicu, Shiyong Lu. Cloud Computing and Grid Computing 360-Degree Compared. Grid Computing Environments Workshop, 2008 (GCE '08).