PARALLEL COMPUTING overview What is Parallel Computing? Traditionally, software has been written for serial computation: To be run on a single computer

PARALLEL COMPUTING overview

What is Parallel Computing?

Traditionally, software has been written for serial computation:

* To be run on a single computer having a single Central Processing Unit (CPU)
* A problem is broken into a discrete series of instructions
* Instructions are executed one after another
* Only one instruction may execute at any moment in time

In the simplest sense, parallel computing is the simultaneous use of multiple compute resources to solve a computational problem:

* To be run using multiple CPUs
* A problem is broken into discrete parts that can be solved concurrently
* Each part is further broken down to a series of instructions
* Instructions from each part execute simultaneously on different CPUs
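As a rough illustration of the steps above, the following Python sketch breaks a sum-of-squares problem into parts, runs each part in a separate worker, and combines the partial results. The problem itself and the names `split`, `sum_of_squares` and `parallel_sum_of_squares` are assumptions made for this example and do not come from the slides. By default it uses the standard library's `ProcessPoolExecutor`, so the parts can run on different CPUs.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def split(data, parts):
    """Break the problem into `parts` contiguous chunks of nearly equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional element each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The serial series of instructions that is applied to one part."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, parts=4, executor_cls=ProcessPoolExecutor):
    """Solve every part concurrently, then combine the partial results."""
    with executor_cls(max_workers=parts) as pool:
        partials = list(pool.map(sum_of_squares, split(data, parts)))
    return sum(partials)
```

With the default process pool, the call should be made from under `if __name__ == "__main__":` so that worker processes can import the module safely. Passing `ThreadPoolExecutor` as `executor_cls` keeps the same structure inside a single process, which is convenient for trying the decomposition out.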

Cluster Computing - Introduction

Definition: Cluster computing is the technique of linking two or more computers into a network (usually through a local area network) in order to take advantage of the parallel processing power of those computers.

Types of Clusters

* High Availability Clusters

HA clusters are designed to ensure constant access to service applications. They maintain redundant nodes that can act as backup systems in the event of failure. The minimum number of nodes in an HA cluster is two (one active and one redundant), though most HA clusters will use considerably more nodes.

* Benefits of Computer Clusters 

Computer clusters offer a number of benefits over mainframe computers, including: 

Reduced Cost: The price of off-the-shelf consumer desktops has plummeted in recent years, and this drop in price has corresponded with a vast increase in their processing power and performance. The average desktop PC today is many times more powerful than the first mainframe computers.

Processing Power: The parallel processing power of a high-performance cluster can, in many cases, prove more cost effective than a mainframe with similar power. This reduced price per unit of power enables enterprises to get a greater ROI from their IT budget.

Improved Network Technology: Driving the development of computer clusters has been a vast improvement in the technology related to networking, along with a reduction in the price of such technology. 

Computer clusters are typically connected via a single virtual local area network (VLAN), and the network treats each computer as a separate node. Information can be passed throughout these networks with very little lag, ensuring that data doesn’t bottleneck between nodes. 

Scalability: Perhaps the greatest advantage of computer clusters is the scalability they offer. While mainframe computers have a fixed processing capacity, computer clusters can be expanded as requirements change by adding nodes to the network.

Availability: When a mainframe computer fails, the entire system fails. However, if a node in a computer cluster fails, its operations can be transferred to another node within the cluster, so that service continues with little or no interruption.

HA clusters aim to solve the problems that arise from mainframe failure in an enterprise. Rather than losing all access to IT systems when one machine fails, an enterprise with an HA cluster keeps 24/7 access to computational power. This feature is especially important in business, where data processing is usually time-sensitive.
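The failover behaviour described for HA clusters can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `Node` and `HACluster` classes are hypothetical, the `healthy` flag stands in for a real health check, and a production HA cluster would also need heartbeats and state replication, which are not shown here.

```python
class Node:
    """Hypothetical cluster node; `healthy` stands in for a real health check."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"


class HACluster:
    """Send each request to the first healthy node, in priority order."""

    def __init__(self, nodes):
        if len(nodes) < 2:
            # One active and one redundant node is the minimum.
            raise ValueError("an HA cluster needs at least two nodes")
        self.nodes = list(nodes)

    def handle(self, request):
        for node in self.nodes:
            try:
                return node.handle(request)
            except ConnectionError:
                continue  # fail over to the next redundant node
        raise RuntimeError("no healthy node available")
```

For example, with `HACluster([Node("active"), Node("standby")])`, requests go to the active node until its `healthy` flag is cleared, after which the standby node answers them without the caller changing anything.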

* Load-balancing Clusters

Load-balancing clusters operate by routing all work through one or more load-balancing front-end nodes, which then distribute the workload efficiently between the remaining active nodes. Load-balancing clusters are extremely useful for those working with limited IT budgets. Devoting a few nodes to managing the workflow of a cluster ensures that limited processing power can be optimised.
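One possible way for a front-end node to distribute work is to send each job to whichever worker currently has the least outstanding load. The sketch below assumes that policy; the `LoadBalancer` class, the worker names and the numeric job costs are hypothetical, and real load balancers may use round-robin or other strategies instead.

```python
class LoadBalancer:
    """Hypothetical front-end node that assigns each job to the least-loaded worker."""

    def __init__(self, worker_names):
        # Outstanding load per worker node, in arbitrary cost units.
        self.load = {name: 0 for name in worker_names}

    def dispatch(self, job_cost):
        # Pick the worker with the smallest load; ties go to the earliest-listed worker.
        worker = min(self.load, key=self.load.get)
        self.load[worker] += job_cost
        return worker
```

After one expensive job lands on the first worker, the following cheap jobs are spread over the other workers until their loads catch up, which is the "limited processing power can be optimised" effect described above.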

* High-performance Clusters

HPC clusters are designed to exploit the parallel processing power of multiple nodes. They are most commonly used to perform functions that require nodes to communicate as they perform their tasks – for instance, when calculation results from one node will affect future results from another.
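To make the dependency between nodes concrete, here is a small single-process simulation, offered only as a sketch. It assumes a smoothing computation in which every cell of a list becomes the mean of its two neighbours, with the list divided between two simulated nodes. Before each step the nodes swap their edge cells, because each node's next result depends on a value the other node holds. The functions `smooth` and `run_two_nodes` are hypothetical, and no real network or parallel execution is involved.

```python
def smooth(values, left, right):
    """One step on a node's local block: each cell becomes the mean of its neighbours.

    `left` and `right` are the boundary values supplied from outside the block.
    """
    padded = [left] + values + [right]
    return [(padded[i - 1] + padded[i + 1]) / 2 for i in range(1, len(padded) - 1)]


def run_two_nodes(data, steps):
    """Simulate two nodes that each own half of `data` (needs at least two cells)."""
    mid = len(data) // 2
    a, b = data[:mid], data[mid:]
    for _ in range(steps):
        # Communication phase: each node sends its edge cell to the other node.
        a_edge, b_edge = a[-1], b[0]
        # Compute phase: the outer boundaries reuse the node's own end cell.
        a, b = smooth(a, a[0], b_edge), smooth(b, a_edge, b[-1])
    return a + b
```

Running `run_two_nodes` gives the same values as applying `smooth` repeatedly to the whole list on one machine, but only because the edge cells are exchanged at every step. Skipping the exchange would let the two halves drift apart, which is why this kind of workload needs a fast interconnect.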

Cluster Architecture

A cluster is a type of parallel or distributed processing system which consists of a collection of interconnected stand-alone computers working together as a single integrated computing resource.

A computer node can be a single or multiprocessor system with memory, I/O facilities and an operating system.

A cluster generally refers to two or more computers connected together.

The nodes can exist in a single cabinet or be physically separated and connected via a LAN.

An interconnected cluster of computers can appear as a single system to users and applications.

[Figure: cluster architecture]

The following are some prominent components of cluster computers:

* Multiple High Performance Computers (PCs, Workstations or SMPs)
* State-of-the-art Operating Systems (Layered or Microkernel based)
* High Performance Networks/Switches (such as Gigabit Ethernet and Myrinet)
* Network Interface Cards (NICs)

* Fast communication protocols and services (such as Active Messages and Fast Messages)
* Cluster Middleware (Single System Image and System Availability infrastructure)
* Hardware (such as Digital (DEC) Memory Channel, hardware DSM and SMP techniques)
* Operating System Kernel or Gluing Layer (such as Solaris MC and GLUnix)

* Applications and Subsystems
  * Applications (such as system management tools and electronic forms)
  * Runtime Systems (such as software DSM and parallel file system)
  * Resource Management and Scheduling software (such as LSF (Load Sharing Facility) and CODINE (Computing in Distributed Networked Environment))

* Parallel Programming Environments and Tools (such as compilers, PVM (Parallel Virtual Machine), and MPI (Message Passing Interface))
* Applications
  * Sequential
  * Parallel or Distributed

The network interface hardware acts as a communication processor and is responsible for transmitting and receiving packets of data between cluster nodes via a network switch.

Communication software offers a means of fast and reliable data communication among cluster nodes and to the outside world.

The cluster nodes can work collectively, as an integrated computing resource, or they can operate as individual computers.

The cluster middleware is responsible for offering an illusion of a unified system image and availability out of a collection of independent but interconnected computers.

Programming environments can offer portable, efficient and easy-to-use tools for the development of applications. They include message passing libraries, debuggers and profilers.
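The message passing style that such libraries support can be sketched with Python's standard library alone. The example below is not MPI or PVM: it assumes threads in place of cluster nodes and `queue.Queue` objects in place of the network, and the names `worker` and `scatter_and_gather` are hypothetical. It only shows the pattern of a coordinator sending work out as messages and receiving results back as messages.

```python
import queue
import threading


def worker(rank, inbox, outbox):
    """Receive one chunk of numbers and send back (rank, partial sum)."""
    chunk = inbox.get()              # blocking receive
    outbox.put((rank, sum(chunk)))   # send the result to the coordinator


def scatter_and_gather(chunks):
    """Coordinator: send one chunk to each worker, then collect every reply."""
    results = queue.Queue()
    threads = []
    for rank, chunk in enumerate(chunks):
        inbox = queue.Queue()
        thread = threading.Thread(target=worker, args=(rank, inbox, results))
        thread.start()
        inbox.put(chunk)             # send the work as a message
        threads.append(thread)
    for thread in threads:
        thread.join()
    # Replies may arrive in any order, so they are indexed by sender rank.
    replies = dict(results.get() for _ in chunks)
    return [replies[rank] for rank in range(len(chunks))]
```

The workers share no variables with the coordinator; everything they need arrives in a message and everything they produce leaves in one. That property is what lets the same program structure move from threads on one machine to separate nodes connected by a network.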

Parallel Programming Models

Applications of Clusters