Grid Computing Material

1. What is Grid Computing? (Apr 11) (Nov 10) Grid Computing enables virtual organizations to share geographically distributed resources as they pursue common goals, assuming the absence of central location, central control, omniscience, and an existing trust relationship.

2. What is High Performance Computing? High-performance computing generally refers to what has traditionally been called supercomputing. There are hundreds of supercomputers deployed throughout the world. Key parallel processing algorithms have already been developed to support execution of programs on different, but co-located, processors. High-performance computing system deployment, contrary to popular belief, is not limited to academic or research institutions. In fact, more than half of the supercomputers deployed in the world today are in use at various corporations.

3. Explain cluster computing. (Nov 10) Cluster computing came about as a response to the high prices of supercomputers, which put those systems out of reach for many research projects. Clusters are high-performance, massively parallel computers built primarily out of commodity hardware components, running a free-software operating system such as Linux or FreeBSD, and interconnected by a private high-speed network. A cluster consists of a group of PCs or workstations dedicated to running high-performance computing tasks. The nodes in the cluster do not sit on users’ desks, but are dedicated to running cluster jobs. A cluster is usually connected to the outside world through only a single node.

4. Explain Peer-to-Peer Computing. Although Peer-to-Peer (P2P) networks and file sharing have only recently come into the public eye, methods for transferring files and information between computers have, in fact, been around almost as long as computing itself. Until recently, however, systems for sharing files and information between computers were exceedingly limited. They were largely confined to Local Area Networks (LANs) and the exchange of files with known individuals over the Internet. LAN transfers were executed mostly via a built-in system or network software, while Internet file exchanges were mostly executed over an FTP (File Transfer Protocol) connection. The reach of this Peer-to-Peer sharing was limited to the circle of computer users an individual knew and agreed to share files with. Users who wanted to communicate with new or unknown users could transfer files using IRC (Internet Relay Chat) or similar bulletin boards dedicated to specific subjects, but these methods never gained mainstream popularity because they were somewhat difficult to use.

5. What is Internet Computing? The explosion of the Internet and the increasing power of the home computer prompted computer scientists and engineers to apply techniques learned in high-performance and cluster-based distributed computing to utilize the vast processing cycles available at users’ desktops. This has come to be known as Internet computing.

6. Explain Grid Applications Service Providers. The Grid Applications Service Provider (GASP) provides end-to-end Grid Computing services to the user of a particular application or applications. The customer in this case purchases “application time” from the provider and supplies the data or parameters to the GASP through an application portal and, in the future, through published Web service specifications. The GASP may choose to purchase services from a Grid Resource Provider (GReP) or may choose to build the infrastructure organically.

7. Compare Peer-to-Peer Networks and Grid Computing. Peer-to-peer networks (e.g., Kazaa) fall within our definition of Grid Computing. The resource in peer-to-peer networks is the storage capacity of each node (mostly desktops). Desktops are globally distributed and there is no central controlling authority. The exchange of files between users also does not presuppose any pre-existing trust relationship. It is not surprising, given how snugly P2P fits in our definition of Grid Computing, that the Peer to Peer Working Group has become part of the grid standards body, the Global Grid Forum (GGF).

8. Compare Cluster Computing and Grid Computing. From a Grid Computing perspective, a cluster is a resource that is to be shared. A grid can be considered a cluster of clusters.

9. Compare Internet Computing and Grid Computing. The Internet computing examples presented earlier, in our opinion, fit this broad definition of Grid Computing. A virtual organization is assembled for a particular project and disbanded once the project is complete. The shared resource, in this case, is the Internet-connected desktop.

10. What are the types of Grids?

· Departmental Grids
· Enterprise Grids
· Extraprise Grids
· Global Grids
· Compute Grids
· Data Grids
· Utility Grids

11. What are Departmental Grids? Departmental grids are deployed to solve problems for a particular group of people within an enterprise. The resources are not shared by other groups within the enterprise. Following is a list of vendor definitions that we believe refer to departmental grids.

· Cluster Grids
· Infra Grids

12. What are Cluster Grids? Cluster grid is a term used by Sun Microsystems and consists of one or more systems working together to provide a single point of access to users. It is typically used by a team for a single project and can be used to support both high throughput and high performance jobs.

13. What is Infra Grid? Infra grid is a term used by IBM to define a grid that optimizes resources within an enterprise and does not involve any other internal partner. It can be within a campus or across campuses.

14. What are Enterprise Grids? Enterprise grids consist of resources spread across an enterprise and provide service to all users within that enterprise. An enterprise grid, according to Platform Computing, is deployed within large corporations that have a global presence or a need to access resources outside a single corporate location. Enterprise grids run behind the corporate firewall. The following vendor definitions fall into this category.

· Enterprise Grids
· Intra Grids
· Campus Grids

15. What are Intra Grids? According to IBM, resource sharing among different groups within an enterprise constitutes an intra grid. An intra grid can be local or traverse the wide area network. Intra grids are located within the corporate firewall.

16. What are Campus Grids? Campus grids, according to Sun Microsystems, enable multiple projects or departments to share computing resources in a cooperative way. Campus grids may consist of dispersed workstations and servers as well as centralized resources located in multiple administrative domains, in departments, or across the enterprise.

17. What are Extraprise Grids? Extraprise grids are established between companies, their partners, and their customers. The grid resources are generally made available through a virtual private network. Following are some of the terms used by various vendors to describe such grids.

18. What are Extra Grids? Extra grids, according to IBM, enable sharing of resources with external partners. This assumes that connectivity between the two enterprises is through some trusted service, such as a private network or a virtual private network.

19. What are Partner Grids? Platform Computing defines these as grids between organizations within similar industries, which have a need to collaborate on projects and use each other’s resources as a means to reach a common goal.

20. What are Global Grids? Grids established over the public Internet constitute global grids. They can be established by organizations to facilitate their business or purchased in part, or in whole, from service providers. The vendor definitions that fall in this category are global grids and inter grids. Global grids, as defined by Sun, allow users to tap into external resources. Global grids provide the power of distributed resources to users anywhere in the world for computing and collaboration. They can be used by individuals or organizations to send overflow work over the public network to a grid services provider.

21. What are Inter Grids? Inter grids, according to IBM, provide the ability to share compute and data/storage resources across the public Web. This can involve sharing resources with other enterprises or buying or selling of excess capacity.

22. What are Compute Grids? Compute grids are created solely for the purpose of providing access to computational resources. Compute grids can be further classified by the type of computational hardware deployed.

23. What are Desktop Grids? (Apr 11) These are grids that leverage the compute resources of desktop computers. Because of the true (but unfortunate) ubiquity of the Microsoft® Windows® operating system in corporations, desktop grids are assumed to apply to the Windows environment. The Mac OS™ environment is supported by a limited number of vendors.

24. What are Server Grids? Some corporations, while adopting Grid Computing, keep it limited to server resources that are within the purview of the IT department. Special servers, in some cases, are bought solely for the purpose of creating an internal “utility grid” with resources made available to various departments. No desktops are included in server grids. These usually run some flavor of the Unix/Linux operating system.

25. What are Data Grids?

Grid deployments that require access to, and processing of, data are called data grids. They are optimized for data-oriented operations. Although they may consume a lot of storage capacity, these grids are not to be confused with storage service providers.

26. What are Utility Grids? Utility grids are commercial compute resources that are maintained and managed by a service provider. Customers that need to augment their existing, internal computational resources may purchase “cycles” from a utility grid. In addition to overflow applications, customers may choose to use utility grids for business continuity and disaster recovery purposes. Utility grid providers are also called Grid Resource Providers (GReP). Along with computing resources, some utility grids also offer key business applications that can be purchased “by the minute.”

Unit – II

27. Explain the Open Grid Services Architecture (OGSA). (Nov 10) The Open Grid Services Architecture (OGSA) provides a strategy for service providers to create service-oriented infrastructures which support more flexible resource management. Web Services supplies a paradigm that supports dynamic resource modeling, a fundamental requirement in pursuit of comprehensive management for evolutionary infrastructures. Peer-to-peer technology creates a mechanism for ad hoc relationships to be formed on demand, without a centralized controlling mechanism.

28. Explain Grid Service Providers (GSP). Grid Service Providers (GSPs) supply an enabling infrastructure that supports the user-driven service innovation of their customers, free from the delays associated with current infrastructural paradigms. When each user has the ability to innovate network services based on their own needs, the promise of the past decade will have arrived.

29. Explain the Montague River Grid (MRG). Montague River Grid (MRG) supplies the necessary functionality to support the inter-domain and inter-provider management of network-based services. MRG is a self-organizing grid adapter/gateway for network-attached resources and is deployed in conjunction with technology-specific domain managers. MRG acts as the community authority/virtual organization for locally advertised and controlled network-based services. Discovery, membership, registry, mapper, factory, notification, topology, and threading services are all supported. Each MRG supports an aggregate capabilities dictionary from which domain-level capabilities are inherited and re-advertised. Furthermore, complex inter-domain services can be constructed and advertised as single service entities. Once deployed, MRG enables user-controlled, end-to-end, inter-domain and inter-provider services.

30. What are the Components of the MRG? (Apr 11)

· Inter Domain Services: used to represent the persistent service datastore, service configuration, etc.
· Inter Domain Factory: primary entrance factory for users
· Factory: operational interface; used to implement the process of managing network services
· Registry: used to identify existing persistent service instances; inherited operational functionality of network devices
· Mapper: used to extract detailed information about existing service instances
· Notifications: used to relay asynchronous alarms and notifications from the network to the pertinent registered users of the affected resources
· Membership: used to enhance path selection within a business relationship or service paradigm
· Discovery: used to identify and propagate existing services within a business relationship or service paradigm

31. Draw the Montague River Grid architecture

32. Explain the Montague River Domain (MRD). Montague River Domain (MRD) supplies the necessary functionality to support device-specific, domain-level management for network-based services. MRD is a service fulfillment/configuration management platform for network-attached devices such as transport equipment, storage platforms, and computational servers. Its dynamically coupled network model allows network and service evolution without the re-engineering of the platform. Each MRD implements a capabilities dictionary from which component and comprehensive services are composed for the specific devices within its domain. Furthermore, MRD implements standard functionality such as journaled transaction management, inventory upload and reconciliation, service configuration and rollback, service and topology reporting, and alarm correlation.

33. What are the Components of the MRD?

· Domain Services: service configuration operations, e.g., provisioning, service discovery, service grooming, etc.
· Domain Factory: network configuration operations, e.g., upload, transaction management, etc.
· Security: service and network security, including resource tagging, user enablement, etc.
· Configuration: specific service configuration operations, e.g., partitionLightPath, findASPath, addXC, deleteXC, etc.
· Inventory: network device and component management, virtualized persistence of physical network, etc.
· Reporting: domain-level network and service reporting.
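
The journaled transaction management and configuration rollback attributed to the MRD can be illustrated with a small sketch. This is not the MRD's implementation: the class name, the `add_xc`/`delete_xc` methods (loosely named after the addXC and deleteXC operations listed above), and the journal layout are all hypothetical, chosen only to show how recording an undo action for every change makes rollback possible.

```python
class CrossConnectDomain:
    """Toy domain manager (hypothetical): cross-connect changes with an undo journal."""

    def __init__(self):
        self.cross_connects = set()
        self._journal = []  # undo actions for the currently open transaction

    def add_xc(self, a, b):
        xc = (a, b)
        if xc in self.cross_connects:
            raise ValueError(f"cross-connect {xc} already exists")
        self.cross_connects.add(xc)
        self._journal.append(("remove", xc))  # how to undo this change

    def delete_xc(self, a, b):
        xc = (a, b)
        if xc not in self.cross_connects:
            raise KeyError(f"cross-connect {xc} does not exist")
        self.cross_connects.remove(xc)
        self._journal.append(("add", xc))  # how to undo this change

    def commit(self):
        # Changes become permanent; nothing is left to undo.
        self._journal.clear()

    def rollback(self):
        # Undo the journaled changes in reverse order.
        while self._journal:
            action, xc = self._journal.pop()
            if action == "remove":
                self.cross_connects.discard(xc)
            else:
                self.cross_connects.add(xc)
```

Undoing in reverse order matters: a later change may depend on an earlier one, so the most recent change has to be reverted first.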

34. Draw the Montague River Domain architecture

35. What is Data Catalog? The data catalog is meta-data that identifies the data sets being managed. Typical meta-data includes the name of the data set, its location, the date-time it was last modified, its size, its type, and the correct access method. The catalog should be flexible enough to track everything from a subset of a file in a native operating system file format to a single record in a database manager to an entire database. Key issues in meta-data management include the completeness of the meta-data and the timeliness of updates to it. Catalog management should be subject to strict access controls and most catalog maintenance activity should be assiduously logged.

36. What are Portals? Web-based applications that encapsulate grid operations and present a uniform user interface to the grid user base can be a useful integration point for all this management capability. For the average grid user, the portal provides a uniform user interface that masks the difference between the hardware and software resources available on the grid. For the administrative user, the portal provides a coherent application environment that uniformly enforces access controls and ensures consistent logging.
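
The meta-data fields listed for the data catalog map naturally onto a record type. The sketch below is only an illustration under assumptions: the `CatalogEntry` and `DataCatalog` names, the field names, and the in-memory storage are hypothetical and do not come from any particular grid product. It shows one record per data set plus a log of maintenance activity.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class CatalogEntry:
    """One meta-data record describing a managed data set (hypothetical schema)."""
    name: str
    location: str
    last_modified: datetime
    size_bytes: int
    data_type: str
    access_method: str


class DataCatalog:
    """In-memory catalog that logs every maintenance operation."""

    def __init__(self):
        self._entries = {}
        self.audit_log = []

    def register(self, entry, user):
        # Insert or overwrite the record, then log who changed it.
        self._entries[entry.name] = entry
        self.audit_log.append((user, "register", entry.name))

    def lookup(self, name):
        # Returns None when the data set is not catalogued.
        return self._entries.get(name)

    def stale_entries(self, cutoff):
        # Names of entries not updated since `cutoff`: a crude timeliness check.
        return [e.name for e in self._entries.values() if e.last_modified < cutoff]
```

A real catalog would also enforce access controls before `register` is allowed to run; that check is omitted here for brevity.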

37. What are the requirements for Grid-Enabling Software? Two requirements must be met in order to modify software for grid deployment: access to the application source code and the ability to modify it; in other words, both the legal right and the development expertise necessary to change an uncompiled application. There are three groups that meet these requirements.

The first group consists of independent software vendors (ISVs) who develop and commercially distribute software applications. ISVs own their software code and have software developers in their employ.

The second group is made up of academic institutions and enterprises in research-intensive industries such as life sciences that use open source software applications. Open source software licenses permit modifications of code, and in many cases allow redistribution of the modified version, subject to certain conditions.

The third group consists of enterprises that have developed their own proprietary software applications for internal deployment, often with a view to securing competitive advantage through superior implementation of information technology. As these applications are proprietary, enterprises typically own or in some fashion retain intellectual property rights to the source code.

Both open source and proprietary software applications can be modified for grid deployment either by internal software developers or third-party solution integrators (SIs).

38. What is the process of Grid-enabling Software Applications? The process of grid-enabling a software application is fairly straightforward. Using the GridIron XLR8 application development tool, an experienced developer familiar with the software application to be grid-enabled should complete the code modifications in a reasonably short period of time. A distributable algorithm or job can be equivalently expressed as one or more steps. The most time-consuming, processing-intensive steps that will be distributed for grid processing must have the following three characteristics:

· They can be split into smaller tasks.
· Each task can be processed on a separate computer.
· The results from each task can be returned and reassembled into one final result.
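
The three characteristics above amount to a split, process, and reassemble pattern. The following is a minimal sketch of that pattern, not GridIron XLR8 code: the function names are hypothetical, the workload (a sum of squares) is a stand-in, and a local thread pool plays the role of the separate computers.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, n_tasks):
    """Split the input into at most n_tasks contiguous chunks."""
    if n_tasks < 1:
        raise ValueError("n_tasks must be at least 1")
    size = max(1, -(-len(data) // n_tasks))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def process(chunk):
    """Independent task: needs only its own chunk (here, a sum of squares)."""
    return sum(x * x for x in chunk)


def reassemble(partials):
    """Combine the per-task results into one final result."""
    return sum(partials)


def run_distributed(data, n_tasks=4):
    chunks = split(data, n_tasks)
    # Each worker stands in for a separate computer on the grid.
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        partials = list(pool.map(process, chunks))
    return reassemble(partials)
```

The pattern works here because `process` never looks outside its own chunk; a step whose tasks depend on one another's intermediate results would not satisfy the second characteristic.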

39. Explain the Overview of GridIron XLR8. GridIron XLR8 is a product that allows software developers to add the speed of distributed computing to commercial software applications. XLR8 enables computationally intensive software applications to run faster on multiple computers.

GridIron XLR8 consists of two parts: an application developers’ toolkit, or SDK, comprising APIs that are added to the source code of a computationally intensive application, plus documentation, sample applications, and other tools and materials to assist software developers in modifying their code for processing by multiple computers; and runtime software that is installed on each computer in a network, providing additional processing power.

GridIron XLR8 reduces the complexity of embedding distributed computing within an application by providing all the necessary programmatic elements at a high level of abstraction. All of the job control logic can be defined and controlled through the use of just six XLR8 job control functions and four job execution methods provided by application plug-ins. Additional XLR8 functions are available for administration, management, and data marshalling. By comparison, protocols such as MPI are significantly more complex, with some 380 primary calls.

Finally, GridIron XLR8 is embedded directly into the software applications. Once compiled and installed, users can benefit from the speed of distributed computing without having to change the way they use the application and without learning special skills.

40. What is Hyperthreading? There are a number of currently available technologies that provide the facility for performance improvement through coprocessor and software optimizations, such as vectorization (e.g., AltiVec), Single Instruction Multiple Data (SIMD), Pthreads, SSE2, etc. One such technology is hyperthreading.

Hyperthreading is an evolving Intel processor technology (first available on Intel’s Xeon server processors and now being delivered on all desktop 3.06 GHz+ processors) that provides simultaneous execution of two threads on the same physical processor. Performance improvements for most multi-threaded applications range from a typical 5 percent to a current theoretical maximum of approximately 30 percent. Hyperthreading was utilized in this implementation to demonstrate that such technologies are complementary to distributed computing and will achieve cumulative performance improvements.
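
To put rough numbers on the idea that the two gains are cumulative, the sketch below multiplies an idealized node count by the per-node hyperthreading gain. It is back-of-the-envelope arithmetic only: the function name is hypothetical, and the assumption of perfectly linear scaling across nodes is an idealization that real workloads rarely reach.

```python
def combined_speedup(nodes, ht_gain):
    """Idealized speedup of `nodes` machines, each gaining `ht_gain` from hyperthreading.

    Assumes perfect linear scaling across nodes (an idealization).
    `ht_gain` is a fraction, e.g. 0.05 for 5 percent.
    """
    if nodes < 1 or ht_gain < 0:
        raise ValueError("nodes must be >= 1 and ht_gain must be non-negative")
    return nodes * (1.0 + ht_gain)


# The 5 percent and 30 percent figures quoted above, applied to a 4-node grid.
low = combined_speedup(4, 0.05)
high = combined_speedup(4, 0.30)
```

Under these assumptions a 4-node grid would land somewhere between about 4.2 and 5.2 times the speed of a single node without hyperthreading.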

41. What are the advantages of a grid?

· Access: Seamless, transparent, remote, secure, wireless access to computing, data, experiments, instruments, sensors, etc.
· Virtualization: Access to compute and data services, not the servers themselves, without caring about the infrastructure.
· On Demand: Get the resources you need, when you need them, at the quality you need.
· Sharing: Enable collaboration of (virtual) teams, over the Internet, to jointly work on one complex task.
· Failover: In case of system failure, migrate and restart applications automatically on another system.
· Heterogeneity: In large and complex grids, resources are heterogeneous (platforms, operating systems, devices, software, etc.). Users can choose the system that is best suited for their specific application.
· Utilization: Grids are known to increase average utilization from some 20 percent to 80 percent and more. For example, our internal Sun Enterprise Grid (with currently more than 7,000 processors in three different locations), which is used to design Sun’s next-generation processors, is utilized at more than 95 percent, on average.

42. Explain the Globus Toolkit 2.0. The Globus Toolkit is an open architecture, open source software toolkit developed by the Globus Project. A brief explanation of GT2.0 is given here for completeness. A full description of the Globus Toolkit can be found at the Globus Web site. GT3.0 re-implements much of the functionality of GT2.x but is based upon the Open Grid Services Architecture (OGSA). The three core components of GT2.0 (and GT2.2) are:

1. Globus Security Infrastructure (GSI)
2. Globus Resource Allocation Manager (GRAM)
3. Monitoring and Discovery Services (MDS)

43. What are the services provided by the Grid?

· Single sign-on: Globus creation using Grid Security Infrastructure and X509 certificates. This allows the user to seamlessly establish his or her identity across all campus grid resources.

· Resource information: Viewable status information on grid resources, both static and dynamic attributes such as operating systems,

· Job specification and submission: A GUI that enables the user to enter job specifications such as the compute resource, I/O, and queue requirements. Automated translation of these requirements into Resource Specification Language (RSL) and subsequent job submission to Globus Resource Allocation Managers (GRAM) are supported by the portal. Scripts have been implemented to enable job handoff to SGE via Globus services. Further, automated translation of some job requirements into SGE parameters is supported.

· Precise usage control: Policy-based authorization and accounting services to examine and evaluate usage policies of the resource providers. Such a model is critical when sharing resources in a heterogeneous environment such as the campus grid.

· Job management: Storage and retrieval of relevant application profile information, history of job executions, and related information. Application profiles are meta-data that can be composed to characterize the applications.

· Data handling: Users can transparently authenticate with and browse remote file systems of the grid resources. Data can be securely transferred between grid resources using the GSI-enabled data transport services.

44. Explain Grid Engine Enterprise Edition. Grid Engine Enterprise Edition (GEEE) is installed at each of the four nodes: Maxima, Snowdon, Titania, and Pascali. The command line and GUI of GEEE are the main access point to each node for local users. The Enterprise Edition version of Grid Engine provides policy-driven resource management at the node level. There are four policy types which may be implemented:

· Share Tree Policy: GEEE keeps track of how much usage users/projects have already received. At each scheduling interval, the Scheduler adjusts all jobs’ share of resources to ensure that users/groups and projects get very close to their allocated share of the system over the accumulation period.
· Functional Policy: Functional scheduling, sometimes called priority scheduling, is a non-feedback scheme for determining a job’s importance by its association with the submitting user/project/department.
· Deadline Policy: Deadline scheduling ensures that a job is completed by a certain time by starting it soon enough and giving it enough resources to finish on time.
· Override Policy: Override scheduling allows the GEEE operator to dynamically adjust the relative importance of an individual job or of all the jobs associated with a user/department/project.
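
The difference between the feedback-based share tree policy and the non-feedback functional policy can be made concrete with a small sketch. This is not the GEEE scheduler's algorithm: the function names, the ratio used as a priority, and the flat (non-hierarchical) share table are simplifying assumptions for illustration only.

```python
def share_tree_priorities(target_shares, usage):
    """Feedback scheme: users whose past usage lags their target share rank higher.

    Returns {user: ratio}; a ratio above 1 means the user is under-served.
    """
    total_share = sum(target_shares.values())
    total_usage = sum(usage.get(u, 0.0) for u in target_shares)
    priorities = {}
    for user, share in target_shares.items():
        entitled = share / total_share
        used = usage.get(user, 0.0) / total_usage if total_usage else 0.0
        # A small floor avoids division by zero for users with no usage yet.
        priorities[user] = entitled / max(used, 1e-9)
    return priorities


def functional_priorities(jobs, weights):
    """Non-feedback scheme: a job's importance depends only on its owner's weight."""
    return {job: weights.get(owner, 0) for job, owner in jobs.items()}
```

With equal target shares, a user who has already consumed 75 percent of the accumulated usage gets a ratio below 1, while the user who consumed 25 percent gets a ratio of 2 and would be favoured at the next scheduling interval. The functional scheme ignores that history entirely.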

Unit – III

45. List the characteristics of a Data Grid.

· They are numerous.
· They are owned and managed by different, potentially mutually distrustful organizations and individuals.
· They are potentially faulty.
· They have different security requirements and policies.
· They are heterogeneous, i.e., they have different CPU architectures, are running different operating systems, and have different amounts of memory and disk.
· They are connected by heterogeneous, multilevel networks.
· They have different resource management policies.
· They are likely to be separated geographically (on a campus, in an enterprise, on a continent).

46. Write short notes on Network File System (NFS). NFS is the standard Unix solution for accessing files on remote machines within a LAN. With NFS, a disk on a remote machine can be made part of the local machine’s file system. Accessing data from the remote system now becomes a matter of accessing a particular part of the file system in the usual manner.

47. Write short notes on File Transfer Protocol (FTP). FTP has been the tool of choice for transferring files between computers since the 1970s. FTP is a command-line tool that provides its own command prompt and has its own set of commands. Several of the commands resemble Unix commands, although others, particularly those for transferring files and for manipulating the local file system, are different. FTP may be used within a script; however, in that case, the password for the remote machine must be stored in a clear-text file on the local machine.

48. Write short notes on GridFTP. GridFTP is a tool for transferring files. It is built on top of the Globus Toolkit. GridFTP is an example of a service that characterizes the Globus “sum of services” approach for a grid architecture.

49. Write short notes on Andrew File System (AFS)

The Andrew File System is a distributed network file system that enables access to files and directories distributed across multiple sites. Access to files involves becoming part of a single virtual file system. AFS comprises several cells, with each cell representing an independently administered file system.

50. Explain Avaki Data Grid. The objective of Avaki Data Grid is to provide high performance; easy, transparent, and secure collaboration; and coherent sharing of data between different locations, administrative domains, and organizations.

· High-performance: Nobody wants a low-performance system. Yet remote access is inherently slower than local access due to the combination of higher latency and often lower bandwidth.

· Coherent: Caching data is great for performance. Unfortunately, it can lead to inconsistent copies of the data, which can lead in turn to incorrect application results or bad decisions based on out-of-date data.

· Transparent: The data grid must be transparent to end users and applications.

· Secure: “Secure” is a word that covers a wide range of issues. Many users believe that a data grid must support strong authentication with identities that span administrative domains and organizations, support the establishment of virtual organizations (groups that span organizations), enforce access control policies, and protect data.

51. Write short notes on Grid Servers. A grid server is the primary component of a data grid. A grid server performs grid-related tasks such as domain creation, authentication, access control, meta-data management, monitoring, searching, etc. When deploying a data grid, the first grid server deployed typically bears the responsibility of starting a grid. This grid server is also called a grid domain controller (GDC). The GDC creates and defines a domain. A domain represents a single grid. Every domain has exactly one GDC. Multiple domains may be interconnected by invoking the appropriate functions on their respective GDCs.

52. Write short notes on Share Servers. A share server is an Avaki Data Grid (ADG) component that is responsible for bulk data transfer to and from a local disk on a machine. A share server is always associated with a grid server. The grid server is responsible for verifying whether a given read/write request is permissible or not. If the request is permitted, the grid server passes a handle to the user as well as to the share server. The user’s request is then forwarded to the share server along with this handle. Subsequent requests are satisfied by the share server without the intervention of the grid server. Naturally, if the user issues a new request, for instance, to a new file, the grid server verifies the request anew before delegating the transfer to the share server.
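
The division of labour between the grid server and the share server can be sketched as follows. This is a toy model, not Avaki's protocol: the class and method names, the access-control table, and the use of a random token as the handle are all assumptions made for illustration.

```python
import secrets


class ShareServer:
    """Serves bulk reads for requests that carry a handle issued by the grid server."""

    def __init__(self, files):
        self.files = files   # {path: bytes}
        self._handles = {}   # handle -> (path, mode)

    def accept_handle(self, handle, path, mode):
        self._handles[handle] = (path, mode)

    def read(self, handle, offset, length):
        path, mode = self._handles[handle]  # KeyError for an unknown handle
        if mode != "read":
            raise PermissionError("handle is not valid for reading")
        return self.files[path][offset:offset + length]


class GridServer:
    """Verifies each new request against access control, then delegates."""

    def __init__(self, acl):
        self.acl = acl  # {(user, path): {"read", "write"}}

    def authorize(self, user, path, mode, share_server):
        if mode not in self.acl.get((user, path), set()):
            raise PermissionError(f"{user} may not {mode} {path}")
        handle = secrets.token_hex(8)
        # The handle goes to the share server as well as back to the user.
        share_server.accept_handle(handle, path, mode)
        return handle
```

Once `authorize` has returned a handle, repeated `read` calls go straight to the share server, which mirrors the point that subsequent requests do not involve the grid server. A request for a different file needs a fresh call to `authorize`.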

53. Explain Data Grid Access Servers (DGAS). A DGAS provides a standards-based mechanism to access a data grid. A DGAS is a server that responds to NFS 2.0/3.0 protocols and interacts with other data grid components. When an NFS client on a machine mounts a DGAS, it effectively mounts the entire data grid in a single step, mapping the ADG global name space into the local file system and providing completely transparent access to data throughout the grid without even installing Avaki software. This NFS-based access to an ADG complements the command-line and Web-based access that Avaki provides as part of every data grid deployment. An upcoming version of the DGAS will support the Common Internet File System (CIFS) protocol for Windows clients as well.

54. Write short notes on Proxy Servers. A proxy server enables access across a firewall. A proxy server requires a single port in the firewall to be opened for TCP (specifically HTTP/HTTPS) traffic. All Avaki traffic passes through this port. Opening a firewall port essentially involves permitting traffic in and out of that port on the firewall machine and forwarding incoming traffic to another machine inside the firewall on which the Avaki proxy server is started. The proxy server accepts all Avaki traffic forwarded from the firewall and redirects the traffic to the appropriate components running on machines within the firewall. The responses of these machines are sent back to the proxy server, which forwards this traffic to the appropriate destination through the open port on the firewall.

55. Explain Failover Servers. A failover server is a grid server that serves as a backup for the GDC. A failover server is configured to synchronize its internal database periodically with a GDC. As a result, if a GDC becomes unavailable either because the machine on which it is running is down or because the network is partitioned or for any other reason, users can continue to access grid data without significant interruption in service.
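
The periodic synchronization and fallback behaviour described for failover servers can be sketched in a few lines. The classes, the dictionary standing in for the internal database, and the `lookup` helper are hypothetical simplifications and do not reflect how Avaki implements failover.

```python
import copy


class DomainController:
    """Stand-in for a GDC holding grid meta-data in a dictionary."""

    def __init__(self):
        self.database = {}
        self.available = True

    def get(self, key):
        if not self.available:
            raise ConnectionError("GDC unavailable")
        return self.database[key]


class FailoverServer:
    """Keeps a replica of the GDC database, refreshed on each synchronization."""

    def __init__(self, gdc):
        self.gdc = gdc
        self.replica = {}

    def synchronize(self):
        # Meant to be called periodically; copies the database while the GDC is reachable.
        if self.gdc.available:
            self.replica = copy.deepcopy(self.gdc.database)

    def get(self, key):
        return self.replica[key]


def lookup(key, gdc, failover):
    """Try the GDC first and fall back to the failover server if it is down."""
    try:
        return gdc.get(key)
    except ConnectionError:
        return failover.get(key)
```

Because synchronization is periodic rather than continuous, anything written to the GDC after the last synchronization is missing from the replica. That gap is the trade-off behind the phrase "without significant interruption".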

56. Draw Flynn’s classification diagram.

57. What are the Key Elements of Desktop Grid Technology?

· Security
· Unobtrusiveness
· Openness/Ease of Application Integration
· Robustness
· Scalability
· Central Manageability

58. Explain MIMD computer classification.

59. What are the components of Desktop Grids?

Ø Grid: This term will be used interchangeably with Desktop Grid for simplicity.

Ø Grid Server: This is a central machine that controls and administers the Desktop Grid.

Ø Grid Client: An individual node that is a member of the Desktop Grid from which spare computational resources will be harvested. A Grid Client is typically an existing desktop or laptop PC; however, any Windows-based PC connected to the corporate network can become a Grid Client.

Ø Grid Client Executive: The software component of the grid infrastructure that resides on a PC, enables that PC to serve as a Grid Client, and manages all interaction between the Grid Client and the Grid Server.

Ø Work Unit: The packet of computation assigned to a Grid Client by the Grid Server. This packet includes a grid-enabled version of an application, instructions for establishing an environment for the application on the Grid Client, the input data (or a pointer to the location of the input data), and instructions on how to execute the application and produce the output data.
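The contents of a Work Unit and its hand-off from Grid Server to Grid Client can be sketched as a small data structure. This is an illustrative sketch only: the field names, the `GridServer` methods, and the first-in-first-out assignment policy are assumptions, not the interface of any particular desktop grid product.

```python
from dataclasses import dataclass


@dataclass
class WorkUnit:
    application: str        # grid-enabled version of the application
    environment_setup: list # instructions for preparing the client environment
    input_data: str         # the input data itself, or a pointer to its location
    run_instructions: str   # how to execute the application and produce output


class GridServer:
    """Hypothetical central machine that hands Work Units to Grid Clients."""

    def __init__(self):
        self.pending = []    # Work Units waiting for a client
        self.assigned = {}   # client id -> Work Unit currently assigned

    def submit(self, unit):
        self.pending.append(unit)

    def assign(self, client_id):
        # Hand the oldest pending Work Unit to the requesting client,
        # or return None when there is no work left.
        if not self.pending:
            return None
        unit = self.pending.pop(0)
        self.assigned[client_id] = unit
        return unit
```

In a real deployment the Grid Client Executive would be the component calling something like `assign` on behalf of the PC it runs on.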

60. What are the Uses of Desktop Grids?

· Data Mining: Demographic analysis and legal discovery
· Engineering Design: CAD/CAM and two-dimensional rendering
· Financial Modeling: Portfolio management and risk management
· Geophysical Modeling: Climate prediction and seismic computations
· Graphic Design: Animation and three-dimensional rendering
· Life Sciences: Disease simulation and target identification
· Material Sciences: Physical property prediction and product optimization
· Supply Chain Management: Process optimization and total cost minimization

61. Write the Challenges of Desktop Grids

Intermittent Availability: Unlike a dedicated compute infrastructure, a user may choose to turn off or reboot his PC at any time. In addition, the increasing trend of using a laptop (portable) computer as a desktop replacement means that some PCs may disappear and reappear and may connect from multiple locations over network connections of varying speeds and quality.

User Expectations: The user of the PC on the corporate desktop views it as a truly “personal” part of his work experience, much like a telephone or a stapler. It is often running many concurrent applications and needs to appear as if it is always and completely available to serve that employee’s needs. After a distributed computing component is deployed on an employee’s PC, that component will tend to be blamed for every future fault that occurs, at least until the next new component or application is installed.

UNIT – IV

62. Write short notes on Enterprise High Throughput Grids (EHTG)

Enterprise High Throughput Grids (EHTG) allow an easy and robust integration of the whole corporate network into a computing platform. Companies implementing an EHTG can transform their dispersed and heterogeneous computers (high-end servers, workstations, desktop PCs) into a single virtual utility. An EHTG also allows departments to collaborate by defining execution policies for sharing their resources. In critical periods one department may require more computational power than it owns; it can then ask the EHTG to find and use underused computational resources of other departments.

63. Write note on Call Data Records (CDRs)

Data related to calls are stored in CDR (Call Data Record) files. A 6-million-user operator may process about 200 million CDRs a day. The processing time of these files almost reaches the capacity of a current supercomputer. In the next few years, registered data will include information about many other activities carried out by users, in addition to voice services.

64. What is the function of EDR?

It is a two-step process. The first step consists of a validation of the data contained in the EDRs arriving from the network stations. The system loads the validation rules from a database and sends them to the grid platform. The central grid server distributes the EDRs among the nodes for processing in distributed mode.

Once the validation step has been completed, an evaluation of the EDRs takes place. In this evaluation, the EDR data are transformed for inclusion in the data warehouse. The system loads the evaluation rules from a database and sends them to the platform. The evaluation of the validated EDRs is distributed among the computers in the platform. Finally the results are committed to the data warehouse.

65. Write short notes on Smart System Software (SSS)

Smart System Software (SSS) virtualizes independent operating-system instances to provide an HPC service. Alongside the attractive price/performance of COTS components, SSS plays a key role here. SSS allows a number of distinct systems to appear as one, even though each runs its own instance of the operating system. There are two possibilities for SSS. At one extreme, the Single System Image (SSI) is SSS that involves kernel modification. At the other extreme, the Single System Environment (SSE) is SSS that runs in user space as a layered service. The arrows in the corresponding figure emphasize interconnections and corresponding communications.

66. Explain about Single System Environment (SSE)

Clustering solutions can also be delivered via an SSE. In contrast to SSI, clustering via SSE does not require modifications to the kernel. Instead, SSE runs in user space and provides a distributed process abstraction that includes primitives for process creation and process control.

The user-space approach removes the single-operating-system restriction and allows third parties to craft cross-platform clusters based on Linux, Mac OS, UNIX, and/or Windows. SSE directly addresses the tension between supply and demand by matching an application’s resource requirements with the resources capable of filling the need. By effectively arbitrating the supply-demand budget over an enterprise-scale IT infrastructure, subject to policy-driven objectives, SSE solutions allow organizations to derive maximal utilization from all available computer resources.

67. Write short notes on Electronic Design Automation (EDA)

The high-tech field of electronic design automation (EDA) offers rich possibilities for illustrating SSE in capacity-driven simulation. In EDA, the fundamental challenge stems from incremental progress into deeper sub-micron design technologies; this advance implies staggering challenges for design synthesis, verification, timing closure, and power consumption. Through direct association with Moore’s Law, design synthesis has gained a high profile. However, it is design verification that has an even greater potential to become the ultimate design bottleneck: as design complexity increases, verification requirements escalate rapidly.

68. Explain about Open Grid Services Architecture (OGSA)

The Open Grid Services Architecture (OGSA) is a set of technical specifications that define a common framework allowing businesses to build grids both across the enterprise and with their business partners. It is expected that OGSA will define the standards required for both open source and commercial software for a broadly applicable and widely adopted global grid infrastructure.

OGSA has been proposed as an enabling infrastructure for systems and applications that require the integration and management of services within distributed, heterogeneous, dynamic “virtual organizations.”

69. Write short notes on Submission-execution topologies for Platform MultiCluster

70. Write note on OGSA Platform

The OGSA Platform is made up of three components: the Open Grid Services Infrastructure (OGSI), the OGSA Platform Interfaces, and the OGSA Platform Models.

OGSI represents the convergence of Web Services and grid technologies. It defines the underlying mechanisms for managing Grid Service instances (e.g., messaging, lifecycle management, etc.).

OGSA Platform Interfaces are OGSI-compliant Grid Services (i.e., interfaces and associated behaviors) that are not defined within OGSI. The focus here is on defining the higher-level (but basic) services common in many grid deployments. Examples include registries, data access and integration, and resource manager interfaces.

OGSA Platform Models are the combination of OGSA services and information schemas for representing real entities on the grid. For example, a standard definition of terms describing a computer system and the associated behavior is a model for a computer system.

71. List out the Properties of the Core Grid Service

Ø Service Description and Service Instance
Ø Modeling Time in OGSI
Ø XML Element Lifetime Declaration Properties
Ø Interface Naming and Change Management
Ø Naming Grid Service Instances
Ø Grid Service Lifecycle
Ø Common Handling of Operation Faults
Ø Extensible Operations

72. Write short notes on Data Access and Integration Services (DAIS)

The Data Access and Integration Services working group is focused on defining grid data services that provide consistent access to existing, autonomously managed databases. Although there had already been a lot of work around Grid Services for file management (e.g., GridFTP), database integration was not really covered by this work, even though databases play a central role in both the research and commercial computing domains.

73. Explain the PortTypes for Basic Services

OGSI defines a set of portTypes and describes the behavior of a collection of common distributed computing patterns that are fundamental to OGSI.

· GridService: encapsulates the root behavior of the service model.
· HandleResolver: maps a Grid Service Handle (GSH) to a Grid Service Reference (GSR).
· NotificationSource: allows clients to subscribe to notification messages.
· NotificationSubscription: defines the relationship between a single NotificationSource and NotificationSink pair.
· NotificationSink: defines a single operation for delivering a notification message to the service instance that implements the operation.
· Factory: standard operation for creation of Grid Service instances.
· ServiceGroup: allows clients to maintain groups of services.
· ServiceGroupRegistration: allows Grid Services to be added to and removed from a ServiceGroup.
· ServiceGroupEntry: defines the relationship between a Grid Service and its membership within a ServiceGroup.

74. List out the serviceData elements in the GridService

· interface: a list of the QNames of all portTypes implemented by the service.
· serviceDataName: a list of QNames of all SDEs supported by this service instance. This includes SDEs defined at the interface level, as well as SDEs added dynamically during the lifetime of the service instance.
· factoryLocator: a service locator that points to the Grid Service instance that created this Grid Service instance.
· gridServiceHandle: zero or more GSHs of this Grid Service instance.
· gridServiceReference: zero or more GSRs of this Grid Service instance.
· findServiceDataExtensibility: a set of operation extensibility declarations for the findServiceData operation. The client can use a query expression that conforms to any of the listed inputElement types.
· setServiceDataExtensibility: operation extensibility declarations for the setServiceData operation. Similar to findServiceDataExtensibility.
· terminationTime: the termination time for the service.

75. Write the functions of OGSA Platform Interfaces

Ø Service Groups and Discovery Interfaces
Ø Service Domain Interfaces
Ø Security
Ø Policy
Ø Data Management Services
Ø Messaging and Queuing
Ø Events
Ø Distributed Logging
Ø Metering and Accounting

76. Define WS-Agreement

WS-Agreement defines the Agreement-based Grid Service Management model, which defines a set of OGSI-compliant portTypes allowing clients to negotiate with management services in order to manage Grid Services or other legacy applications (e.g., a local resource manager).

WS-Agreement defines fundamental mechanisms based on OGSI-compliant Agreement services, which represent an ongoing relationship between an agreement provider and an agreement initiator. The agreements define the behavior of a delivered service with respect to a service consumer. The Agreement will most likely be defined in sets of domain-specific agreement terms (defined in other specifications), as the WS-Agreement specification is focused on defining the abstraction of the agreement and the protocol for coming to agreement, rather than on defining sets of agreement terms.

UNIT – V

77. What is Hive Computing? (Nov 10)

Hive Computing is an approach to the development, deployment, and management of mission-critical applications that is designed to complement and extend the vision of Grid Computing.

Hive Computing enables businesses to build a transactional resource, called a Hive, that can be plugged into a grid and host the transaction-oriented applications upon which businesses depend. The goal of Hive Computing is to expand the range of problems that can be solved with a grid and bring the benefits of Grid Computing to the mainstream of business computing.

Hive Computing defines a new type of resource called a Transactional Resource that can be integrated into an existing grid. The transactional resource handles all the transaction-oriented applications.

78. What are the services performed by the Hive?

· Get a real-time quote based on a CUSIP (stock identifier)

· Get a delayed quote based on a CUSIP

· Generate a 30-day or other price chart based on a CUSIP

79. What are the components of Hive Computing?

80. What are the capabilities of Hive Computing?

· A Hive Is Self-organizing, Self-healing, and Self-managing

· A Hive Creates a Mission-critical Computing Environment

· A Hive Utilizes Large Numbers of Dedicated Commodity Computers

· A Hive Is Designed to Host Transaction-oriented Applications

81. What are the benefits of Hive Computing?

· Reliability
· Scalability
· Availability
· Predictability
· Affordability
· Usability
· Adaptability
· Maintainability
· Commodity Components

82. What are the steps involved in implementation of a Grid Service?

Ø Write a WSDL PortType definition, using OGSA types (or defining new ones).

Ø Write a WSDL binding definition, identifying ways in which one could connect to the service, e.g., by using SOAP/HTTP, TCP/IP, etc.

Ø Write a WSDL service definition based on the PortTypes supported by the service and identified in Step 1.

Ø Implement a factory by extending the FactorySkeleton provided, to indicate how new instances of a service are to be created.

Ø Configure the factory with various options available, such as schemas supported.

Ø Implement the functionality of the service by extending the ServiceSkeleton class. If existing (legacy) code is to be used in some way, then the delegation mechanism should be used. When used in this mode, the factory returns a skeleton instance in Step 4.

Ø Implement code for the client that must interact with the service.

83. What are the requirements when implementing a Grid Service?

Ø Scalability and cost
Ø Uniformity
Ø Expressiveness
Ø Extensibility
Ø Diversity (Multiple information sources)
Ø Dynamicity
Ø Flexibility
Ø Security
Ø Deployability
Ø Decentralized maintenance

84. What is MDS?

The Globus Toolkit contains a grid information service called MDS. Initially an acronym for Metacomputing Directory Service, MDS now denotes Monitoring and Discovery Service to better reflect that MDS is more than an information service for metacomputers.

85. What are the services provided by MDS?

The MDS comprises the Grid Resource Information Service (GRIS) and the Grid Index Information Service (GIIS). A GRIS is an information service that runs on a single resource and can answer queries from a user about that particular resource by directing these queries to an information provider deployed on that resource. An information provider is a service that generates information about a specific aspect of a resource. A GIIS is an aggregate directory service that builds a collection of information services out of multiple GRIS instances. It supports queries against information spread across multiple GRIS resources.
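The division of labor between GRIS, information providers, and GIIS can be sketched as follows. This is a simplified illustration, not the Globus implementation: the class and method names are hypothetical, information providers are modeled as plain Python callables, and the directory protocol that a real MDS deployment uses is left out entirely.

```python
class GRIS:
    """Hypothetical per-resource information service."""

    def __init__(self, resource_name):
        self.resource_name = resource_name
        self.providers = {}  # aspect name -> information provider (callable)

    def register_provider(self, aspect, provider):
        self.providers[aspect] = provider

    def query(self, aspect):
        # Direct the query to the information provider for that aspect.
        return self.providers[aspect]()


class GIIS:
    """Hypothetical aggregate index over several GRIS instances."""

    def __init__(self):
        self.members = []

    def register(self, gris):
        self.members.append(gris)

    def query(self, aspect):
        # Collect answers from every registered GRIS that knows the aspect.
        return {
            g.resource_name: g.query(aspect)
            for g in self.members
            if aspect in g.providers
        }
```

A user who cares about one machine would query its GRIS directly, while a user who wants, for example, the memory of every resource would ask the GIIS once.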

86. Explain the classification of UDDI?

UDDI may be classified as follows:

White pages: These contain basic contact information and identifiers about a company, including business name, address, contact information, and unique identifiers such as its Dun & Bradstreet D-U-N-S number or tax ID. This information allows others to discover Web Services based on business identification. In the context of Grid Computing, white pages can be used to retrieve the IP address of a particular resource or the amount of memory available on it.

Yellow pages: These contain information that describes a Web Service using different business categories (taxonomies). This information allows others to discover Web Services based on their categorization (such as flower sellers or car sellers).

Green pages: These contain technical information about Web Services that are exposed by a business, including references to specifications of interfaces, as well as support for pointers to various file and URL-based discovery mechanisms.
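The three kinds of lookup can be illustrated with a small in-memory registry. This is a sketch under stated assumptions and does not use the UDDI data model or API: `BusinessEntry`, `Registry`, and their fields are hypothetical names, and the example.com addresses are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessEntry:
    # White pages: who the business is.
    name: str
    contact: str
    identifiers: dict = field(default_factory=dict)
    # Yellow pages: business categories (taxonomies).
    categories: set = field(default_factory=set)
    # Green pages: service name -> pointer to its interface specification.
    interfaces: dict = field(default_factory=dict)


class Registry:
    def __init__(self):
        self.entries = []

    def publish(self, entry):
        self.entries.append(entry)

    def white_pages(self, name):
        """Look up businesses by name."""
        return [e for e in self.entries if e.name == name]

    def yellow_pages(self, category):
        """Look up businesses by category."""
        return [e for e in self.entries if category in e.categories]

    def green_pages(self, name):
        """Return technical interface pointers for a named business."""
        return {
            service: url
            for e in self.white_pages(name)
            for service, url in e.interfaces.items()
        }
```

For example, after publishing an entry in the category "flower sellers", `yellow_pages("flower sellers")` finds it by category, and `green_pages` on its name returns the pointers a client would need to bind to its services.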

87. How are applications classified in a Grid?

Ø Parallelism
   o Single Program, Single Data (SPSD)
   o Single Program, Multiple Data (SPMD)
   o Multiple Program, Multiple Data (MPMD)
   o Multiple Program, Single Data (MPSD)
Ø Communications
Ø Granularity
Ø Dependency
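Two of the parallelism classes above can be contrasted in a short sketch. It assumes that threads in a local pool stand in for grid nodes; the helper names `run_spmd`, `run_mpmd`, and `word_count` are hypothetical and exist only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_spmd(program, data_chunks, workers=4):
    """Single Program, Multiple Data: one program applied to every chunk."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(program, data_chunks))


def run_mpmd(programs, data_chunks, workers=4):
    """Multiple Program, Multiple Data: each chunk gets its own program."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(p, d) for p, d in zip(programs, data_chunks)]
        # Collect results in submission order.
        return [f.result() for f in futures]


def word_count(text):
    """Example program: count whitespace-separated words."""
    return len(text.split())
```

With `run_spmd(word_count, chunks)` the same function processes each chunk independently, which is the pattern most desktop grid workloads follow; `run_mpmd` pairs a different function with each chunk.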

88. What are the functional requirements of grids? (Nov 10)

Ø Interfaces
Ø Job Scheduling
Ø Data Management
Ø Remote Execution Environment
Ø Security
Ø Gang Scheduling
Ø Checkpointing and Job Migration
Ø Management

88. What is Information Technology? (Apr 11)

89. Write the types of GIS. (Nov 10)

90. What is Replication Mechanism? (Nov 10)

91. What is Compute Intensity (CI) Ratio? (Nov 10)

92. List the key factors to be considered for determining the appropriate method of grid deployment in scriptable application. (Nov 10)

93. Write the methods for grid deployment of applications in processing intensive. (Apr 11)

94. What is the need of Genetic Algorithm? (Apr 11)

95. What does Granularity refer to? (Apr 11)

96. Write notes on Grid services and their main pillars. (Apr 11)

97. Define: Gang Scheduling. (Apr 11)

98. Write an example code for using Property Bag. (Apr 11)

99. What is Wi-Fi? (Nov 10)

100. Mention any 5 industries using Grid Computing.
