
Page 1: Distributed Computing

Presented by Prashant Tiwari and Archana Sahu

DISTRIBUTED COMPUTING

Page 2: Distributed Computing

•Folding@Home, as of August 2009, is sustaining over 7 PFLOPS, the first computing project of any kind to cross the four-petaFLOPS milestone. This level of performance is primarily enabled by the cumulative effort of a vast array of PlayStation 3 consoles and powerful GPUs.

•The entire BOINC network averages over 1.5 PFLOPS as of March 15, 2009.

•SETI@home averages more than 528 TFLOPS of computation.

•Einstein@Home is crunching more than 150 TFLOPS

•As of August 2008, GIMPS is sustaining 27 TFLOPS.


Consider The Facts


Page 3: Distributed Computing

This Is What The Power of Distributed Computing Is.


This Is What Distributed Computing Is.


Page 4: Distributed Computing

1 petaFLOPS = 10^15 FLOPS, or 1000 teraFLOPS. No single computer had achieved this performance until 2008; distributed projects such as Folding@home crossed the milestone first.

PetaFLOPS: PETA FLoating point OPerations per Second, i.e., one quadrillion floating point operations per second.

As of 2008, the fastest PC processors (quad-core) perform over 70 GFLOPS (Intel Core i7 965 XE)
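For a sense of scale, here is a quick back-of-the-envelope check in Python (a sketch; 70 GFLOPS is the slide's approximate figure for one such CPU):

# How many ~70 GFLOPS quad-core CPUs would it take to reach 1 petaFLOPS?
PETAFLOPS = 10**15        # floating point operations per second
CPU_FLOPS = 70 * 10**9    # approximate peak of an Intel Core i7 965 XE

print(round(PETAFLOPS / CPU_FLOPS))  # ~14286 CPUs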


What is PetaFLOPS?


OVERVIEW

Introduction To DISTRIBUTED COMPUTING

Why DISTRIBUTED COMPUTING

Implementing DISTRIBUTED COMPUTING

Architectures of DISTRIBUTED COMPUTING

Technical Issues

Languages and Projects

Conclusion and Summary

Page 5: Distributed Computing

The Definition, The Concept, The Processes

Introduction to DISTRIBUTED COMPUTING

Page 6: Distributed Computing

Distributed computing deals with hardware and software systems containing more than one processing element or storage element, concurrent processes, or multiple programs, running under a loosely or tightly controlled regime. In distributed computing, a program is split up into parts that run simultaneously on multiple computers communicating over a network. Distributed computing is a form of parallel computing.


Introduction To Distributed Computing


Common Distributed Computing Model

Page 7: Distributed Computing

In distributed computing, a program is split up into parts that run simultaneously on multiple computers communicating over a network.


THE CONCEPT

[Diagram: a problem instruction set is broken into tasks T1 through T5, which are distributed across networked computers and executed concurrently.]

Page 8: Distributed Computing

Consider: If There Are n Systems Connected In A Network, Then We Can Split One Program Into n Different Tasks And Compute Them Concurrently.
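As a minimal sketch of this idea in Python, the snippet below splits one job into n tasks and computes them concurrently. Local worker processes stand in for the n networked systems; the chunking scheme and task function are illustrative, not part of the original deck:

from multiprocessing import Pool

def run_task(chunk):
    # One unit of work; in a real distributed system this would
    # execute on a separate computer reached over the network.
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    n = 4                                    # number of "systems"
    chunks = [data[i::n] for i in range(n)]  # split the job into n tasks
    with Pool(n) as pool:
        partials = pool.map(run_task, chunks)  # compute concurrently
    print(sum(partials))                       # combine the partial results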


ReConsider The Facts


Page 9: Distributed Computing

Why do we need Distributed Computing?

Why DISTRIBUTED COMPUTING ?

Page 10: Distributed Computing

•Computation requirements are ever increasing

•Silicon-based (sequential) architectures are reaching their limits in processing capability (clock speed), constrained by fundamental physical limits.

•Significant development in networking technology is paving the way for network-based, cost-effective parallel computing.

•Parallel processing technology is mature and is being exploited commercially.


Need Of Distributed Computing


Page 11: Distributed Computing

Speedup achieved by distributed computing:

Speedup S = log2(P), where P is the number of processors.

[Plot: speedup S versus number of processors P, following the log2(P) curve.]

Speedup Factor
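Taking the slide's formula at face value, a short Python loop shows how strongly sublinear this model is; doubling the processors adds only one more unit of speedup:

import math

# Speedup predicted by the slide's model: S(P) = log2(P)
for p in (2, 4, 8, 16, 32, 64):
    print(f"P = {p:2d} processors -> speedup = {math.log2(p):.0f}x")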

Page 12: Distributed Computing

The Organization, The Architecture

Implementing DISTRIBUTED COMPUTING

Page 13: Distributed Computing

Organizing the interaction between the computers that execute distributed computations is of prime importance.

In order to be able to use the widest possible variety of computers, the protocol or communication channel should be universal.

Software Portability


Implementing Distributed Computing


Motivation Factor: The human brain consists of a large number (more than a billion) of neural cells that process information. Each cell works like a simple processor, and only the massive interaction between all cells and their parallel processing makes the brain's abilities possible.

Page 14: Distributed Computing

There are many different types of distributed computing systems and many challenges to overcome in successfully designing one. The main goal of a distributed computing system is to connect users and resources in a transparent, open, and scalable way. Ideally this arrangement is drastically more fault tolerant and more powerful than many combinations of stand-alone computer systems.


Implementing Distributed Computing


Page 15: Distributed Computing


Distributed Memory MIMD

[Diagram: a distributed-memory MIMD organization. Each processor is attached to its own memory system over a private memory bus; the processor-memory pairs form independent nodes.]
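A minimal way to mimic this organization on one machine is with message passing between processes that share no memory. The sketch below (an illustration, not the deck's example) uses Python processes as stand-ins for the MIMD nodes:

from multiprocessing import Process, Pipe

def node(conn):
    # Each process has only its own private memory, as in distributed-
    # memory MIMD; data arrives solely via explicit messages.
    data = conn.recv()
    conn.send(sum(data))
    conn.close()

if __name__ == "__main__":
    parent_end, child_end = Pipe()
    worker = Process(target=node, args=(child_end,))
    worker.start()
    parent_end.send([1, 2, 3, 4])   # message out to the other node
    print(parent_end.recv())        # message back: 10
    worker.join()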

Page 16: Distributed Computing

Possible ways to Implement Distributed Computing

Architectures of DISTRIBUTED COMPUTING

Page 17: Distributed Computing

Various hardware and software architectures are used for distributed computing. At a lower level, it is necessary to interconnect multiple CPUs with some sort of network, regardless of whether that network is printed onto a circuit board or made up of loosely-coupled devices and cables. At a higher level, it is necessary to interconnect processes running on those CPUs with some sort of communication system.


The Architectures


Page 18: Distributed Computing

Client-server — Smart client code contacts the server for data, then formats and displays it to the user (see the socket sketch after this list).

3-tier architecture — Three tier systems move the client intelligence to a middle tier so that stateless clients can be used. Most web applications are 3-Tier.

N-tier architecture — N-Tier refers typically to web applications which further forward their requests to other enterprise services. This type of application is the one most responsible for the success of application servers.

Tightly coupled (clustered) — refers typically to a cluster of machines that closely work together, running a shared process in parallel.

Peer-to-peer — architecture where there is no special machine or machines that provide a service or manage the network resources. Instead all responsibilities are uniformly divided among all machines, known as peers. Peers can serve both as clients and servers.
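To make the client-server entry concrete, here is a minimal socket sketch in Python. The host, port, and echo protocol are hypothetical, chosen only for illustration:

import socket, threading, time

HOST, PORT = "127.0.0.1", 9000   # hypothetical address for the demo

def serve_once():
    # Server: waits for one request, then returns data to the client.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind((HOST, PORT))
        srv.listen(1)
        conn, _ = srv.accept()
        with conn:
            request = conn.recv(1024).decode()
            conn.sendall(f"echo: {request}".encode())

def client():
    # Smart client: contacts the server for data, then displays it.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as cli:
        cli.connect((HOST, PORT))
        cli.sendall(b"hello")
        print(cli.recv(1024).decode())

if __name__ == "__main__":
    threading.Thread(target=serve_once, daemon=True).start()
    time.sleep(0.5)   # crude: give the server a moment to start listening
    client()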


The Architectures


Client Server Architecture

3-Tier Architecture

N-Tier Architecture

Tightly Coupled

Peer To Peer

Page 19: Distributed Computing

Distributed computing implements a kind of concurrency. It is so tightly interrelated with concurrent programming that the two are sometimes not taught as distinct subjects.


The Concurrency


Page 20: Distributed Computing

Multiprocessor systems: A multiprocessor system is simply a computer that has more than one CPU on its motherboard.

Multicore systems: Intel CPUs from the late Pentium 4 era (Northwood and Prescott cores) employed a technology called Hyper-Threading that allowed more than one thread (usually two) to run on the same CPU.

Multicomputer Systems

Computer clusters: A cluster consists of multiple stand-alone machines acting in parallel across a local high-speed network.

Grid computing: A grid uses the resources of many separate computers, loosely connected by a network (usually the Internet), to solve large-scale computation problems.
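As a small illustration (not from the deck), Python can report the logical CPUs such systems expose and fan work out across them:

import os
from concurrent.futures import ThreadPoolExecutor

# Logical CPUs visible to the OS; with Hyper-Threading each physical
# core typically appears as two logical CPUs.
print(f"logical CPUs: {os.cpu_count()}")

# Fan a small job out across a pool of threads. (For pure-Python code,
# CPython's GIL limits true parallelism; processes would be used instead.)
with ThreadPoolExecutor(max_workers=os.cpu_count()) as pool:
    print(list(pool.map(lambda n: n * n, range(8))))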


The Concurrency


Page 21: Distributed Computing


Technical Issues

Page 22: Distributed Computing

If not planned properly, a distributed system can decrease the overall reliability of computations, since the unavailability of a single node can cause disruption of the other nodes.

Leslie Lamport famously quipped: "A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable."

Troubleshooting and diagnosing problems in a distributed system can also become more difficult, because the analysis may require connecting to remote nodes or inspecting communication between nodes.
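A common defensive pattern against exactly this failure mode (a general-practice sketch, not something prescribed by the deck) is to never block forever on a remote node: fail fast with a timeout so one dead machine cannot hang the rest.

import socket

def fetch_with_timeout(host, port, timeout_s=2.0):
    # Returns the node's reply, or None if the node is down or too slow,
    # letting the caller degrade gracefully instead of hanging.
    try:
        with socket.create_connection((host, port), timeout=timeout_s) as conn:
            conn.settimeout(timeout_s)
            return conn.recv(1024)
    except OSError:   # covers timeouts, refused connections, etc.
        return None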


Technical Issues


Page 23: Distributed Computing

Languages used to build distributed systems, and projects that have been implemented

Languages and Projects

Page 24: Distributed Computing

Remote procedure calls distribute operating system commands over a network connection. Systems like CORBA, Microsoft DCOM, Java RMI and others try to map object-oriented design to the network.

Loosely coupled systems communicate through intermediate documents that are typically human readable (e.g. XML, HTML, SGML, X.500, and EDI).
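Python's standard library ships a small XML-RPC implementation that illustrates both points at once: the call looks like a local procedure, yet it travels over the network as an XML document. A minimal sketch (address and function chosen for illustration):

from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy
import threading

# Server: expose an ordinary function as a remote procedure.
server = SimpleXMLRPCServer(("127.0.0.1", 8000), logRequests=False)
server.register_function(lambda a, b: a + b, "add")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Client: invoking the procedure sends an XML-encoded request over HTTP.
proxy = ServerProxy("http://127.0.0.1:8000")
print(proxy.add(2, 3))  # 5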


The Organization


Page 25: Distributed Computing


ReConsider The Facts

Projects

Folding@Home

•Stanford University Chemistry Department Folding@home project

•Focused on simulations of protein folding to find disease cures and to understand biophysical systems.

•Folding@Home, as of August 2009, is sustaining over 7 PFLOPS.

SETI@Home

•Space Sciences Laboratory at the University of California, Berkeley

•Focused on analyzing radio-telescope data to find evidence of intelligent signals from space

•SETI@home averages more than 528 TFLOPS of computation.

Page 26: Distributed Computing


ReConsider The Facts

Project | Start | Affiliation | Area | Peak #hosts | Status | Computing power
GIMPS | 1996 | ? | mathematics | 10,000 | active | 27 TFLOPS
distributed.net | 1997 | U.S. non-profit organization | cryptography | 100,000 | active | ?
SETI@home | 1999 | University of California, Berkeley | SETI | 362,000 | active | 528 TFLOPS
Electric Sheep | 1999 | ? | art | 57,000 | active | ?
Folding@home | 2000 | Stanford University | biology | 406,000 | active | 8.1 PFLOPS
BOINC | 2002 | University of California, Berkeley | biomedicine, other | 550,000 | active | 1.5 PFLOPS
Grid.org | 2002 | philanthropic by United Devices | biomedicine, other | 3,734,000 | closed | ?
Climateprediction.net | 2003 | University of Oxford | climate change | 150,000 | active | ?
LHC@home | 2004 | CERN | physics | 60,000 | active | ?
World Community Grid | 2004 | philanthropic by IBM | biomedicine, other | 700,000 | active | ?
Einstein@home | 2005 | LIGO | astrophysics | 200,000 | active | 150 TFLOPS
Rosetta@home | 2005 | University of Washington | biology | 100,000 | active | ?

Page 27: Distributed Computing

Implemented Distributed Computing

Conclusion And Summary

Page 28: Distributed Computing

• Distributed Computing has become a reality:
  – The threads concept is utilized everywhere.
  – Clusters have emerged as popular data centers and processing engines, e.g., the Google search engine.

• The emergence of commodity high-performance CPUs, networks, and OSs has made parallel computing applicable to enterprise applications, e.g., Oracle 9i/10g databases on clusters/grids.


Page 29: Distributed Computing

Any Questions?

Thank You For Listening