Cloud Computing and Grid Computing 360-Degree Compared Rashid Tahir

Slide 1

Slide 2
Cloud Computing and Grid Computing 360-Degree Compared
Rashid Tahir

Slide 3

Slide 4
Then to now!
  • 1961: John McCarthy predicts that computation may someday be organized as a public utility.
  • Mid-1990s: The term "Grid" is coined by the community.
  • 2002: Ian Foster publishes a Grid checklist.
  • July 2002: Amazon launches AWS.
  • Subsequent years: Large-scale federated systems such as TeraGrid and Open Science Grid are developed.

Slide 5
Definitions!!
Grid Computing:
  • The ability, using a set of open standards and protocols, to gain access to applications and data, processing power, storage capacity and a vast array of other computing resources over the Internet.
  • A grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of resources distributed across multiple administrative domains based on the resources' availability, capacity, performance and cost, and on users' quality-of-service requirements. (1)
1- IBM Solutions Grid for Business Partners: Helping IBM Business Partners to Grid-enable applications for the next phase of e-business on demand

Slide 6
Characteristics of a Grid
  • Coordinates resources that are not subject to centralized control
  • Uses standard, open, general-purpose protocols and interfaces
  • Delivers non-trivial qualities of service

Slide 7
Definitions!!
Cloud Computing: A large-scale distributed computing paradigm that is driven by economies of scale, in which a pool of abstracted, virtualized, dynamically scalable, managed computing power, storage, platforms, and services is delivered on demand to external customers over the Internet.

Slide 8
Characteristics of a Cloud
  • Massively scalable
  • Can be encapsulated as an abstract entity that delivers different levels of service to customers outside the Cloud
  • Driven by economies of scale
  • Can be dynamically configured (via virtualization or other approaches) and delivered on demand

Slide 9
The BIG Question!!!
Is Cloud Computing just a new name for Grid?
  • YES: The vision is the same.
  • BUT NO: The approach is different.
  • NEVERTHELESS, YES: The problem spaces that the two paradigms attempt to solve overlap.

Slide 10
Why cloud computing?
  • Exponentially growing data size in scientific instrumentation/simulation and in Internet publishing and archiving
  • Widespread adoption of computing services and Web 2.0 applications
  • Rapid decrease in hardware cost and increase in computing power and storage capacity, along with the advent of multi-core architectures and modern supercomputers consisting of hundreds of thousands of cores

Slide 11
Relationship between Clouds and the other domains they overlap with

Slide 12
Side-by-Side: Grids vs. Clouds
Six comparison metrics:
  • Business Model
  • Architecture
  • Resource Management
  • Programming Model
  • Application Model
  • Security Model

Slide 13
Business Model
CLOUD
  • On-demand, pay-per-use model
  • Services are usually charged per instance-hour, per GB-month of storage, per TB/month of data transfer, etc.
GRID
  • Project-oriented model
  • Users pool resources that are shared by the community
  • Each Grid site is responsible for maintaining its own set of resources

Slide 14
Architecture
GRID
Grids provide protocols and services at five different layers:
  1. Fabric Layer: Provides access to different resource types such as compute, storage and network resources, code repositories, etc.
  2. Connectivity Layer: Defines core communication and authentication protocols for easy and secure network transactions.
  3. Resource Layer: Defines protocols for the publication, discovery, negotiation, monitoring, accounting and payment of sharing operations on individual resources.
  4. Collective Layer: Captures interactions across collections of resources.
  5. Application Layer: Comprises the user applications that are built on top of the above protocols and APIs and operate in VO (Virtual Organization) environments.
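
The pay-per-use charging listed on Slide 13 (per instance-hour, per GB-month of storage, per TB of data transferred) amounts to a simple metered sum. The sketch below is only an illustration: the `Rates` class, the `monthly_bill` function and every price in it are hypothetical and not taken from any real provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical unit prices (assumed values, not real provider rates)."""
    per_instance_hour: float      # price of one VM instance running for one hour
    per_gb_month_storage: float   # price of storing one GB for one month
    per_tb_transfer: float        # price of transferring one TB in the month


def monthly_bill(rates: Rates, instance_hours: float,
                 storage_gb: float, transfer_tb: float) -> float:
    """Return the metered charge for one month, rounded to cents."""
    compute = rates.per_instance_hour * instance_hours
    storage = rates.per_gb_month_storage * storage_gb
    transfer = rates.per_tb_transfer * transfer_tb
    return round(compute + storage + transfer, 2)


# One instance running all month (720 h), 200 GB stored, 0.5 TB transferred.
rates = Rates(per_instance_hour=0.10, per_gb_month_storage=0.15, per_tb_transfer=100.0)
bill = monthly_bill(rates, instance_hours=720, storage_gb=200, transfer_tb=0.5)
print(bill)
```

Under this model a customer who uses nothing pays nothing, which is the contrast with the project-oriented Grid model, where sites maintain pooled resources regardless of individual usage.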
Slide 15
Architecture
CLOUD
Cloud architecture can be divided into four different layers:
  1. Fabric Layer: Contains the raw hardware-level resources.
  2. Unified Resource Layer: Contains resources that have been abstracted/encapsulated (usually by virtualization) so that they can be exposed to the upper layers.
  3. Platform Layer: Adds a collection of specialized tools, middleware and services on top of the unified resources.
  4. Application Layer: Contains the applications that run in the Cloud.

Slide 16
Architecture
Grids vs. Clouds

Slide 17
Architecture
CLOUD
Clouds provide services at three different levels:
  1. IaaS: Provisions hardware, software, and equipment to deliver software application environments with a resource usage-based pricing model.
  2. PaaS: Offers a high-level integrated environment to build, test, and deploy custom applications.
  3. SaaS: Delivers special-purpose software that is remotely accessible by consumers through the Internet with a usage-based pricing model.
Together the three are referred to as the Cloud Computing Onion.

Slide 18
Cloud Computing Onion

Slide 19
Resource Management
Comparison based on the following metrics:
  • Compute Model
  • Data Model
  • Data Locality
  • Combining Compute and Data Management
  • Virtualization
  • Monitoring
  • Provenance

Slide 20
Resource Management
Compute Model
  • CLOUD: Users share resources simultaneously and are serviced instantly.
  • GRID: Queuing-based, batch-scheduled model.
Data Model
  • CLOUD: The current data model houses the data inside the cloud; however, this could change in the future.
  • GRID: Specially designed Data Grids, along with metadata catalogs that keep track of the location of each piece of data.
Data Locality
  • CLOUD: High focus on storing and replicating data near the associated compute unit.
  • GRID: Data is stored in shared file systems, where data locality cannot easily be applied. However, data-aware schedulers dramatically improve performance.
Combining Compute and Data Management
  • CLOUD: Initially clouds were not used much for data-intensive applications, so not much work had been done on managing large amounts of data over compute resources. However, this has changed substantially.
  • GRID: Grids achieve this well because data-aware schedulers place jobs on nodes close to the data they process.

Slide 21
Resource Management
Virtualization
  • CLOUD: Highly virtualized to meet service level agreements. Abstractions are provided at each layer to assist the process of virtualization.
  • GRID: Little or no support for virtualization.
Monitoring
  • CLOUD: Since the services provided are layer-specific, monitoring information is not provided for the entire system.
  • GRID: Users have greater flexibility over the resources they are allocated and hence can deploy fine-grained monitoring infrastructure.
Provenance
  • CLOUD: Still an under-explored area; however, given that clouds are increasingly being used for e-science research, new provenance systems are emerging fast.
  • GRID: Since Grids are project-oriented, provenance is essential, and Grids therefore provide a lot of support for it.

Slide 22
Programming Model
CLOUD
  • The most common parallel programming model is MapReduce
  • Standard parallel programming models are also used
  • Scripting is used in place of workflow management systems
  • Clouds have generally adopted Web Services APIs for providing services over the web
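
To make the MapReduce model named above concrete, here is a minimal single-process sketch of its three phases (map, shuffle, reduce) applied to word counting. It is an illustration only: the function names are chosen for this example, and a real MapReduce runtime would distribute the map and reduce tasks across many machines rather than run them in one loop.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["grid cloud grid", "cloud cloud"])
print(counts)
```

Because each map call sees only one record and each reduce call sees only one key, both phases can be parallelized without the tasks communicating with each other, which is what makes the model a good fit for loosely coupled workloads.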
GRID
  • The most commonly used parallel programming model is based on message passing
  • Less commonly used models employ coordination languages that allow heterogeneous components to coordinate and interact
  • In a recent effort to develop a service-oriented Grid programming model, the community has started using WSRF (Web Services Resource Framework)
  • Workflow management systems are used when processing large data sets that involve complex tasks

Slide 23
Application Model
CLOUD
  • Cloud computing is still in its infancy, so the application space is not yet clearly understood; however, most applications can be characterized as loosely coupled
  • All three layers mentioned previously provide their own set of functionalities, and applications and tools are being developed to exploit them
GRID
  • Grids support a wide variety of applications, such as High Performance Computing (HPC) and High Throughput Computing (HTC) applications
  • Tightly coupled applications use the Message Passing Interface (MPI) for inter-process communication, whereas loosely coupled applications rely on workflow management systems
  • Another emerging set of applications is science gateways. These provide a large variety of services through a browser-based user interface

Slide 24
Security Model
CLOUD
  • Clouds mostly comprise dedicated data centers with largely homogeneous infrastructure; however, compatibility issues arise when cross-data-center interaction occurs
  • Clouds seem to have relatively simpler and less secure security models
  • Security is the biggest obstacle to the wide-scale adoption of cloud computing
GRID
  • Grids are built under the assumption that the shared resources will be mostly heterogeneous, and hence are better equipped for interoperability and compatibility challenges
  • Since each Grid site has operational autonomy, security is engineered into each site by its administrators
  • The security model is more complex than that of Clouds and perhaps more time-consuming too

Slide 25
Conclusion
Clouds and Grids share a common vision and have overlapping architecture, technology and application space. However, they take different steps to provide scalable, distributed, on-demand computation, resulting in the evolution of two parallel infrastructures.
Tomorrow's distributed systems will need to have the centralized scale of today's Cloud utilities and the distribution and interoperability of today's Grid facilities.

Slide 26
Discussion Questions
  • Is Cloud computing the same as Grid computing? Why? Why not?
  • What does the future entail for each of these paradigms? Can you see a unified distributed paradigm in the future? Why? Why not?
  • Will the community shift more and more towards Clouds and Grids, or do you envision a future where high-end desktop machines will dominate the market?
  • Do you think Clouds and Grids are reliable enough to be trusted with sensitive data and computation?
  • Currently Clouds can't guarantee reliability, security and control; however, they provide ease of access, portability, etc. at a very low cost. Do you think the benefits outweigh the costs?

Slide 27
References
1- Ian Foster, Yong Zhao, Ioan Raicu, Shiyong Lu. Cloud Computing and Grid Computing 360-Degree Compared. Grid Computing Environments Workshop, 2008 (GCE '08).