Distributed Computing Report

INTRODUCTION:-

In the term distributed computing, the word distributed means spread out across

space. Thus, distributed computing is an activity performed on a spatially distributed

system.

A distributed system consists of a collection of autonomous computers, connected

through a network and distributed operating system software, which enables computers to

coordinate their activities and to share the resources of the system - hardware, software

and data, so that users perceive the system as a single, integrated computing facility.

(Figure 1-Distributed Computing)

These networked computers may be in the same room, same campus, same

country, or on different continents. A distributed system may have a common goal, such

as solving a large computational problem. Alternatively, each computer may have its own

user with individual needs, and the purpose of the distributed system is to coordinate the

use of shared resources or provide communication services to the users.

Rise of Distributed Computing:-

Computer hardware prices are falling and power increasing.

Network connectivity is increasing.

High-bandwidth connections ("fat pipes") are widely available.

It is easy to connect hardware together.

A combination of cheap processors is often more cost-effective than one expensive fast system.

Flexibility to add according to needs.

Potential increase of reliability.

Sharing of resources.

Characteristics of Distributed Computing:-

Six key characteristics are primarily responsible for the usefulness of distributed

systems. They are resource sharing, openness, concurrency, scalability, fault tolerance and

transparency. It should be emphasized that they are not automatic consequences of

distribution; a system must be carefully designed in order to ensure that they are achieved.

Resource Sharing :-

Resource sharing is the ability to use any hardware, software or data anywhere in

the system. Resources in a distributed system, unlike the centralized one, are physically

encapsulated within one of the computers and can only be accessed from others by

communication. It is the resource manager that offers a communication interface enabling

the resource to be accessed, manipulated and updated reliably and consistently. There are

mainly two models of resource managers: the client/server model and the object-based

model. The Object Management Group uses the latter in CORBA, in which any resource

is treated as an object that encapsulates the resource by means of operations that users

can invoke.

Openness :-

Openness is concerned with extensions and improvements of distributed systems.

New components have to be integrated with existing components so that the added

functionality becomes accessible from the distributed system as a whole. Hence, the static

and dynamic properties of services provided by components have to be published in

detailed interfaces.

Concurrency :-

Concurrency arises naturally in distributed systems from the separate activities

of users, the independence of resources and the location of server processes in separate

computers. Components in distributed systems are executed in concurrent processes.

These processes may access the same resource concurrently. Thus the server process

must coordinate their actions to ensure system integrity and data integrity.
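The coordination described above can be sketched in a few lines. This is a minimal illustration only, assuming a single server process that guards one shared value with a lock; the names AccountServer and run_clients are hypothetical and threads stand in for client processes on other computers.

```python
import threading


class AccountServer:
    """Hypothetical server process guarding one shared resource (a balance)."""

    def __init__(self, balance: int = 0) -> None:
        self._balance = balance
        self._lock = threading.Lock()  # serializes concurrent requests

    def deposit(self, amount: int) -> None:
        # The read-modify-write must be atomic, or concurrent updates can be lost.
        with self._lock:
            current = self._balance
            self._balance = current + amount

    def balance(self) -> int:
        with self._lock:
            return self._balance


def run_clients(server: AccountServer, clients: int, deposits: int) -> int:
    """Start `clients` threads that each make `deposits` deposits of 1."""

    def client() -> None:
        for _ in range(deposits):
            server.deposit(1)

    threads = [threading.Thread(target=client) for _ in range(clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server.balance()
```

Because every update goes through the lock, the final balance equals the total number of deposits regardless of how the client threads interleave.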

Scalability :-

Scalability concerns the ease of increasing the scale of the system (e.g. the

number of processors) so as to accommodate more users and/or to improve the

corresponding responsiveness of the system. Ideally, components should not need to be

changed when the scale of a system increases.

Fault tolerance :-

Fault tolerance concerns the reliability of the system, so that in case of failure of

hardware, software or network, the system continues to operate properly, without

significantly degrading the performance of the system. It may be achieved by recovery

(software) and redundancy (both software and hardware).
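One common way to combine recovery and redundancy is to retry a request against a list of replicas. The sketch below is an assumption-laden illustration, not a prescribed design: replicas are modelled as plain functions, and ReplicaDown, call_with_failover, broken and healthy are hypothetical names.

```python
from typing import Callable, Sequence


class ReplicaDown(Exception):
    """Raised by a replica that cannot serve the request."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each redundant replica in turn; return the first successful reply."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except ReplicaDown as exc:
            errors.append(str(exc))  # recovery step: note the failure, try the next
    raise ReplicaDown(f"all {len(replicas)} replicas failed: {errors}")


def broken(request: str) -> str:
    """Stand-in for a crashed or unreachable machine."""
    raise ReplicaDown("primary unreachable")


def healthy(request: str) -> str:
    """Stand-in for a working replica."""
    return f"ok:{request}"
```

Calling call_with_failover([broken, healthy], "read") returns the reply from the second replica, so the failure of the first is hidden from the caller.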

Transparency :-

Transparency hides the complexity of the distributed system from the users and

application programmers. They can perceive it as a whole rather than a collection of

cooperating components in order to reduce the difficulties in design and in operation.

This characteristic is orthogonal to the others. There are many aspects of transparency,

including access transparency, location transparency, concurrency transparency,

replication transparency, failure transparency, migration transparency, performance

transparency and scaling transparency.

Distributed Computing Architecture:-

Various hardware and software architectures are used for distributed computing. At

a lower level, it is necessary to interconnect multiple CPUs with some sort of network,

regardless of whether that network is printed onto a circuit board or made up of loosely-

coupled devices and cables. At a higher level, it is necessary to

interconnect processes running on those CPUs with some sort of communication system.

Distributed programming typically falls into one of several basic architectures or

categories: Client-server, 3-tier architecture, N-tier architecture, Distributed objects, loose

coupling, or tight coupling.

Client-server :-

Smart client code contacts the server for data, then formats and displays it to the

user. Input at the client is committed back to the server when it represents a

permanent change.

3-tier architecture :-

Three tier systems move the client intelligence to a middle tier so that stateless

clients can be used. This simplifies application deployment. Most web

applications are 3-Tier.

N-tier architecture :-

N-Tier refers typically to web applications which further forward their requests

to other enterprise services. This type of application is the one most responsible

for the success of application servers.

Tightly coupled (clustered) :-

Tightly coupled architecture refers typically to a cluster of machines that

work closely together, running a shared process in parallel. The task is subdivided

into parts that are processed individually by each machine and then put back together to

produce the final result.
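The subdivide, process and recombine pattern described above can be sketched as follows. This is only an illustration under stated assumptions: a thread pool stands in for the machines of a cluster, the task (a sum of squares) is arbitrary, and split and parallel_sum_of_squares are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Divide data into `parts` nearly equal contiguous chunks."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def partial_result(chunk):
    """The work done individually by one machine."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    """Subdivide the task, process the parts in parallel, combine the results."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_result, chunks))
    return sum(partials)  # put the parts back together
```

In a real cluster the chunks would be shipped to other machines over the network; the structure of the computation stays the same.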

Peer-to-peer :-

Peer-to-peer is an architecture where there is no special machine or machines

that provide a service or manage the network resources. Instead all responsibilities

are uniformly divided among all machines, known as peers. Peers can serve both

as clients and servers.

Space based :-

Space based refers to an infrastructure that creates the illusion (virtualization)

of one single address-space. Data are transparently replicated according to

application needs. Decoupling in time, space and reference is achieved.

Another basic aspect of distributed computing architecture is the method of

communicating and coordinating work among concurrent processes. Through various

message passing protocols, processes may communicate directly with one another,

typically in a master/slave relationship. Alternatively, a "database-centric"

architecture can enable distributed computing to be done without any form of direct inter-

process communication, by utilizing a shared database.
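A database-centric arrangement can be sketched with a shared task table that workers read from and write to, so that they never talk to each other directly. The example below is a simplified assumption: it uses an in-memory SQLite database, the workers take turns on one connection rather than running on separate machines, and the table layout and function names are hypothetical.

```python
import sqlite3


def make_db():
    """Create the shared database with one table of tasks."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE tasks (id INTEGER PRIMARY KEY, payload INTEGER, "
        "owner TEXT, result INTEGER)"
    )
    return db


def submit(db, payloads):
    """A producer adds work by inserting rows; it contacts no worker."""
    db.executemany("INSERT INTO tasks (payload) VALUES (?)", [(p,) for p in payloads])
    db.commit()


def claim_and_process(db, worker):
    """Claim one unowned task and store its result; return False when none remain."""
    row = db.execute(
        "SELECT id, payload FROM tasks WHERE owner IS NULL ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return False
    task_id, payload = row
    # The 'owner IS NULL' guard keeps a task from being claimed twice.
    db.execute(
        "UPDATE tasks SET owner = ?, result = ? WHERE id = ? AND owner IS NULL",
        (worker, payload * payload, task_id),
    )
    db.commit()
    return True
```

All coordination happens through the rows of the table: a worker learns what to do by querying it and reports completion by updating it.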

Distributed Computing Paradigms:-

The Message Passing Paradigm :-

Message passing is the most fundamental paradigm for distributed applications.

A process sends a message representing a request. The message is delivered to a receiver,

which processes the request, and sends a message in response. In turn, the reply may

trigger a further request, which leads to a subsequent reply, and so forth.
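The request and reply cycle described above can be sketched with two queues. This is a minimal illustration, assuming that in-process queues stand in for the network and that the message fields (id, value, reply_to, result) are hypothetical.

```python
import queue
import threading


def receiver(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Process each request message and send a reply; stop on the sentinel None."""
    while True:
        message = inbox.get()
        if message is None:
            break
        outbox.put({"reply_to": message["id"], "result": message["value"] * 2})


def exchange(values):
    """Send one request per value and collect the matching replies."""
    inbox, outbox = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=receiver, args=(inbox, outbox))
    worker.start()
    replies = []
    for i, value in enumerate(values):
        inbox.put({"id": i, "value": value})  # send the request message
        replies.append(outbox.get())          # block until the reply arrives
    inbox.put(None)                           # tell the receiver to stop
    worker.join()
    return replies
```

Each reply carries the identifier of the request it answers, which is what lets a sender match responses to requests when several are in flight.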

The Client-Server Paradigm :-

Perhaps the best known paradigm for network applications, the client-server

model assigns asymmetric roles to two collaborating processes. One process, the server,

plays the role of a service provider which waits passively for the arrival of requests. The

other, the client, issues specific requests to the server and awaits its response. Simple in

concept, the client-server model provides an efficient abstraction for the delivery of

network services. Operations required include those for a server process to listen and to

accept requests, and for a client process to issue requests and accept responses. By

assigning asymmetric roles to the two sides, event synchronization is simplified: the

server process waits for requests, and the client in turn waits for responses. Many Internet

services are client-server applications. These services are often known by the protocol

that the application implements. Well known Internet services include HTTP, FTP, DNS,

etc.
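The listen, accept, request and response operations named above map directly onto the socket interface. The sketch below is illustrative only: it runs both roles on the local machine, the "service" (upper-casing the request) is an arbitrary assumption, and serve_once, request and demo are hypothetical names.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Server role: wait passively for one request, then answer it."""
    conn, _addr = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def request(port: int, payload: bytes) -> bytes:
    """Client role: issue one request and wait for the response."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(payload)
        return sock.recv(1024)


def demo(payload: bytes = b"hello") -> bytes:
    """Run one server and one client on the loopback interface."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen()
        port = listener.getsockname()[1]
        server = threading.Thread(target=serve_once, args=(listener,))
        server.start()
        reply = request(port, payload)
        server.join()
    return reply
```

The asymmetry of the roles is visible in the code: only the server calls listen and accept, and only the client initiates a connection.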

The Peer-to-Peer Distributed Computing Paradigm :-

In the peer-to-peer paradigm, the participating processes play equal roles, with

equivalent capabilities and responsibilities (hence the term “peer”). Each participant may

issue a request to another participant and receive a response. The peer-to-peer paradigm

is more appropriate for applications such as instant messaging, peer-to-peer file transfers,

video conferencing, and collaborative work. It is also possible for an application to be

based on both the client-server model and the peer-to-peer model. A well-known example

of a peer-to-peer file transfer service is Napster.com or similar sites which allow files

(primarily audio files) to be transmitted among computers on the Internet. Such a service makes use of

a server as a directory in addition to peer-to-peer computing.
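The hybrid arrangement described above, a central directory plus direct transfers between peers, can be sketched as follows. This is an in-process simulation under stated assumptions, not a description of how any real service was implemented; Directory and Peer are hypothetical classes and method calls stand in for network transfers.

```python
class Directory:
    """Central index: maps a file name to the peers that hold it."""

    def __init__(self):
        self._index = {}

    def register(self, peer, filename):
        self._index.setdefault(filename, []).append(peer)

    def lookup(self, filename):
        return list(self._index.get(filename, []))


class Peer:
    """A peer acts as both a client (fetch) and a server (serve)."""

    def __init__(self, name, directory):
        self.name = name
        self.files = {}
        self.directory = directory

    def share(self, filename, data):
        self.files[filename] = data
        self.directory.register(self, filename)

    def serve(self, filename):
        return self.files[filename]

    def fetch(self, filename):
        holders = [p for p in self.directory.lookup(filename) if p is not self]
        if not holders:
            raise FileNotFoundError(filename)
        data = holders[0].serve(filename)  # direct peer-to-peer transfer
        self.share(filename, data)         # this peer can now serve the file too
        return data
```

Only the lookup goes through the directory; the file contents move from one peer to another without passing through the server.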

Application:-

There are many examples of commercial applications of distributed systems, such as

database management systems, distributed computing using mobile agents, local

intranets, the Internet (World Wide Web), Java RMI, etc.

Distributed Computing Using Mobile Agents :-

Mobile agents can roam around a network, using free resources for

their own computations.

Local Intranet :-

A portion of the Internet that is separately administered and supports internal sharing

of resources (file/storage systems and printers) is called a local intranet.

Internet :-

The Internet is a global system of interconnected computer networks that use the

standardized Internet Protocol Suite (TCP/IP).

Java RMI :-

Communicating Entities:-

Implementing some application for user

Using support of distributed services

Layers of support

Client/server

Embedded in the Java language:-

Object variant of remote procedure call

Adds naming compared with RPC

Restricted to Java environments

RMI Features :-

Distributed object model:-

Objects: normal and remote

Idea:-

Remote object exists on other host

Remote object can be used as normal object

Behavior described by interface

Environment takes care of remote invocation

Differences between normal and remote objects:-

Remote references can be distributed freely

Clients only know/use interface, not actual implementation

Passing remote objects by reference, normal objects by copying

Failure handling more complicated since invocation itself can also fail
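The ideas in the list above (a client that uses a remote object like a normal one, knows only its interface, and must handle failures of the invocation itself) can be illustrated outside Java. The sketch below is an analogy only and is not Java RMI: it assumes Python's standard XML-RPC modules, which provide remote procedure calls rather than a full distributed object model, and Calculator, start_server and remote_add are hypothetical names.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


class Calculator:
    """The remote object; clients see only its public methods."""

    def add(self, a, b):
        return a + b


def start_server():
    """Export a Calculator on a free local port and serve it in the background."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_instance(Calculator())
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def remote_add(port, a, b):
    """Invoke add() through a proxy, as if it were a local object."""
    try:
        with xmlrpc.client.ServerProxy(f"http://127.0.0.1:{port}") as proxy:
            return proxy.add(a, b)
    except (OSError, xmlrpc.client.Error) as exc:
        # Unlike a local call, the invocation itself can fail.
        raise RuntimeError(f"remote invocation failed: {exc}") from exc
```

The call proxy.add(a, b) looks like a normal method call, but the extra except clause shows the added failure handling that remote invocation requires.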

RMI Architecture :-

Advantages:-

Economics :-

Computers harnessed together give a better price/performance ratio than

mainframes.

Speed :-

A distributed system may have more total computing power than a mainframe.

Inherent distribution of applications :-

Some applications are inherently distributed. E.g., an ATM-banking application.

Reliability :-

If one machine crashes, the system as a whole can still survive if you have

multiple server machines and multiple storage devices (redundancy).

Extensibility and Incremental Growth :-

Possible to gradually scale up (in terms of processing power and functionality) by

adding more resources (both hardware and software). This can be done without

disruption to the rest of the system.

Distributed custodianship :-

The National Spatial Data Infrastructure (NSDI) calls for a system of

partnerships to produce a future national framework for data as a patchwork quilt

of information collected at different scales and produced and maintained by

different governments and agencies. NSDI will require novel arrangements for

framework management, area integration, and data distribution. This research will

examine the basic feasibility and likely effects of such distributed custodianship

in the context of distributed computing architectures, and will determine the

institutional structures that must evolve to support such custodianship.

Data integration :-

This research will contribute to the integration of geographic information and

GISs into the mainstream of future libraries, which are likely to have full digital

capacity. The digital libraries of the future will offer services for manipulating

and processing data as well as for simple searches and retrieval.

Missed opportunities :-

By anticipating the impact that a rapidly advancing technology will have on

GISs, this research will allow the GIS community to take better advantage of the

opportunities that the technology offers.

Disadvantages:-

Lack of experience in designing and implementing a distributed system. E.g.

which platform (hardware and OS) to use, which language to use etc. But this is

changing now.

If the network underlying a distributed system saturates or goes down, then the

distributed system will be effectively disabled thus negating most of the

advantages of the distributed system.

Security is a major hazard since easy access to data means easy access to secret

data as well.

Conclusions:-

In this age of optimization, everybody is trying to get the best output from their

limited resources. Distributed computing is an efficient way to

achieve this optimization. In distributed computing, the actual task is modularized

and distributed among various computer systems. This not only increases the efficiency of

the task but can also reduce the total time required to complete it. The more advanced

concept of distributed computing through mobile

agents is setting a new landmark in this technology. A mobile agent is a process that can

transport its state from one environment to another, with its data intact, and be capable of

performing appropriately in the new environment.
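The definition above, a process that carries its state from one environment to another, can be sketched by serializing an agent before each hop. This is a simplified assumption rather than a real agent platform: hosts are simulated in one process, JSON is chosen arbitrarily as the wire format, and CountingAgent and tour are hypothetical names.

```python
import json


class CountingAgent:
    """Hypothetical agent whose state is the hosts visited and a running total."""

    def __init__(self, visited=None, total=0):
        self.visited = list(visited or [])
        self.total = total

    def run(self, host_name, local_values):
        """Do some work with the resources of the current host."""
        self.visited.append(host_name)
        self.total += sum(local_values)

    def pack(self) -> bytes:
        """Serialize the agent's state for transport."""
        return json.dumps({"visited": self.visited, "total": self.total}).encode()

    @classmethod
    def unpack(cls, wire: bytes) -> "CountingAgent":
        """Rebuild the agent, with its data intact, in the new environment."""
        state = json.loads(wire.decode())
        return cls(state["visited"], state["total"])


def tour(hosts):
    """Move the agent from host to host, carrying its state along."""
    agent = CountingAgent()
    for name, values in hosts:
        agent = CountingAgent.unpack(agent.pack())  # "arrive" on the next host
        agent.run(name, values)
    return agent
```

After a tour the agent holds the results accumulated on every host it visited, even though a fresh object was created at each hop.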
