Cluster Architecture

Cluster Architecture

Jaringan Terdistribusi

TI Bayu



Computing speed isn't just a convenience. Faster computers allow us to solve larger problems, and to find solutions more quickly, with greater accuracy, and at a lower cost.

Traditional high-performance clusters have proved their worth in a variety of usespredicting the weather to industrial design, from molecular dynamics to astronomical modeling. High-performance computing (HPC) has created a new approach to science.


Computing speed isn't just a convenience. Faster computers allow us to solve larger problems, and to find solutions more quickly, with greater accuracy, and at a lower cost.

performance clusters have proved their worth in a variety of uses—from predicting the weather to industrial design, from molecular dynamics to astronomical modeling.

performance computing (HPC) has created a new approach to science.


Clusters are also playing a greater role in business. High performance is a key issue in data mining or in image rendering. Advances in clustering technology have led to highavailability and load-balancing clusters. availability and load-balancing clusters. Clustering is now used for missionapplications such as web and FTP servers. For example, Google uses an evercomposed of tens of thousands of computers.


Clusters are also playing a greater role in business. High performance is a key issue in data mining or in image rendering. Advances in clustering technology have led to high-

balancing clusters. balancing clusters. Clustering is now used for mission-critical applications such as web and FTP servers. For example, Google uses an ever-growing cluster composed of tens of thousands of computers.

Modern Computing and the Role of Clusters

When computing, there are three basic approaches to improving performance—use a faster computer, or divide the calculation among multiple computers. A very common analogy is that of a horse-drawn cart.

First, consider what you are trying to calculate. First, consider what you are trying to calculate.

While there are no hard and fast rules, it is not unusual to see a quadratic increase in cost with a linear increase in performance, particularly as you move away from commodity technology.

The third approach is parallelism, i.e., executing instructions simultaneously.

Modern Computing and the Role of

When computing, there are three basic approaches —use a better algorithm,

use a faster computer, or divide the calculation among multiple computers. A very common analogy

drawn cart.

First, consider what you are trying to calculate.First, consider what you are trying to calculate.

While there are no hard and fast rules, it is not unusual to see a quadratic increase in cost with a linear increase in performance, particularly as you move away from commodity technology.

The third approach is parallelism, i.e., executing instructions simultaneously.

Uniprocessor Computers

The traditional classification of computers based on size and performance, i.e., classifying computers as microcomputers, workstations, minicomputers, mainframes, and supercomputers, has become obsolete.

Regardless of where we place them in the traditional classification, most computers today are based on an architecture often attributed to the Hungarian mathematician John von Neumann. The basic structure of a von Neumann computer is a CPU connected to memory by a communications channel or bus.


The traditional classification of computers based on size and performance, i.e., classifying computers as microcomputers, workstations, minicomputers, mainframes, and supercomputers,

Regardless of where we place them in the traditional classification, most computers today are based on an architecture often attributed to the Hungarian mathematician John von Neumann. The basic structure of a von Neumann computer is a CPU connected to memory by a communications


The development of reduced instruction set computer (RISC) architectures and postarchitectures has led to more uniform instruction sets. This eliminates cycles from some instructions and allows a higher clocksome instructions and allows a higher clockrate.

Superscalar architectures and pipelining have also increased processor speeds. Superscalar architectures execute two or more instructions simultaneously.


The development of reduced instruction set computer (RISC) architectures and post-RISC architectures has led to more uniform instruction sets. This eliminates cycles from some instructions and allows a higher clock-some instructions and allows a higher clock-

Superscalar architectures and pipelining have also increased processor speeds. Superscalar architectures execute two or more instructions

Multiple Processors

In recent years we have come to augment that definition to include parallel computers with hundreds or thousands of CPUs, otherwise known as multiprocessor computers. Multiprocessor computers fall into two basic Multiprocessor computers fall into two basic categories—centralized multiprocessors (or single enclosure multiprocessors) and multicomputers.

Multiple Processors

In recent years we have come to augment that definition to include parallel computers with hundreds or thousands of CPUs, otherwise known as multiprocessor computers. Multiprocessor computers fall into two basic Multiprocessor computers fall into two basic

centralized multiprocessors (or single enclosure multiprocessors) and

Centralized multiprocessors

With centralized multiprocessors, there are two architectural approaches based on how memory is managed—uniform memory access (UMA) and nonuniform memory access (NUMA) machines.

With UMA machines, also called symmetric With UMA machines, also called symmetric multiprocessors (SMP), there is a common shared memory. Identical memory addresses map, regardless of the CPU, to the same location in physical memory. Main memory is equally accessible to all CPUs. To improve memory performance, each processor has its own cache.


With centralized multiprocessors, there are two architectural approaches based on how memory is

uniform memory access (UMA) and memory access (NUMA) machines.

With UMA machines, also called symmetric With UMA machines, also called symmetric multiprocessors (SMP), there is a common shared memory. Identical memory addresses map, regardless of the CPU, to the same location in physical memory. Main memory is equally accessible to all CPUs. To improve memory performance, each processor has its own cache.

Centralized multiprocessorsCentralized multiprocessors


A closely related architecture is used with NUMA machines. Roughly, with this architecture, each CPU maintains its own piece of memory. Effectively, memory is divided among the processors, but each process has access to all the memory.access to all the memory.

Operating system support is required with either multiprocessor scheme. Fortunately, most modern operating systems, including Linux, provide support for SMP systems, and support is improving for NUMA architectures.


A closely related architecture is used with NUMA machines. Roughly, with this architecture, each CPU maintains its own piece of memory. Effectively, memory is divided among the processors, but each process has access to all the memory.access to all the memory.

Operating system support is required with either multiprocessor scheme. Fortunately, most modern operating systems, including Linux, provide support for SMP systems, and support is improving for NUMA architectures.

Centralized multiprocessorsCentralized multiprocessors


A third architecture worth mentioning in passing is processor array, which, at one time, generated a lot of interest. A processor array is a type of vector computer built with a collection of identical, synchronized processing elements. of identical, synchronized processing elements. Each processor executes the same instruction on a different element in a data array.


A third architecture worth mentioning in passing is processor array, which, at one time, generated a lot of interest. A processor array is a type of vector computer built with a collection of identical, synchronized processing elements. of identical, synchronized processing elements. Each processor executes the same instruction on a different element in a data array.

Multicomputers

A multicomputer configuration, or cluster, is a group of computers that work together. A cluster has three basic elementsof individual computers, a network connecting those computers, and software that enables a computer to share work among the other computer to share work among the other computers via the network.

For most people, the most likely thing to come to mind when speaking of Beowulf cluster. While this is perhaps the bestknown type of multicomputer, a number of variants now exist.

A multicomputer configuration, or cluster, is a group of computers that work together. A cluster has three basic elements—a collection of individual computers, a network connecting those computers, and software that enables a computer to share work among the other computer to share work among the other computers via the network.

For most people, the most likely thing to come to mind when speaking of multicomputers is a Beowulf cluster. While this is perhaps the best-known type of multicomputer, a number of

Multicomputers

First, both commercial multicomputers and commodity clusters are available. Commodity clusters, including Beowulf clusters, are constructed using commodity, off(COTS) computers and hardware.(COTS) computers and hardware.

When constructing a commodity cluster, the norm is to use freely available, open source software. This translates into an extremely low cost that allows people to build a cluster when the alternatives are just too expensive.

First, both commercial multicomputers and commodity clusters are available. Commodity clusters, including Beowulf clusters, are constructed using commodity, off-the-shelf (COTS) computers and hardware.(COTS) computers and hardware.

When constructing a commodity cluster, the norm is to use freely available, open source software. This translates into an extremely low cost that allows people to build a cluster when the alternatives are just too expensive.

Multicomputers

In commodity clusters, the software is often mix-and-match.

Commercial clusters often use proprietary computers and software.computers and software.

A network of workstations (NOW), sometimes called a cluster of workstations (COW), is a cluster composed of computers usable as individual workstations.

In commodity clusters, the software is often

Commercial clusters often use proprietary computers and software.computers and software.

A network of workstations (NOW), sometimes called a cluster of workstations (COW), is a cluster composed of computers usable as individual workstations.

Cluster structure

It's tempting to think of a cluster as just a bunch of interconnected machines, but when you begin constructing a cluster, you'll need to give some thought to the internal structure of the cluster.cluster.

It's tempting to think of a cluster as just a bunch of interconnected machines, but when you begin constructing a cluster, you'll need to give some thought to the internal structure of the

Cluster structure

The simplest approach is a symmetric cluster. With a symmetric cluster each node can function as an individual computer. This is extremely straightforward to set up.

There are several disadvantages to a symmetric cluster. Cluster management and security can be more difficult. Workload distribution can become a problem, making it more difficult to achieve optimal performance.

The simplest approach is a symmetric cluster. With a symmetric cluster each node can function as an individual computer. This is extremely straightforward to set up.

There are several disadvantages to a symmetric cluster. Cluster management and security can be more difficult. Workload distribution can become a problem, making it more difficult to achieve optimal performance.

Cluster structure

Cluster structure

For dedicated clusters, an asymmetric architecture is more common. With asymmetric clusters one computer is the head node or frontend. It serves as a gateway between the remaining nodes and the users.remaining nodes and the users.

The primary disadvantage of this architecture comes from the performance limitations imposed by the cluster head. For this reason, a more powerful computer may be used for the head.

For dedicated clusters, an asymmetric architecture is more common. With asymmetric clusters one computer is the head node or frontend. It serves as a gateway between the remaining nodes and the users.remaining nodes and the users.

The primary disadvantage of this architecture comes from the performance limitations imposed by the cluster head. For this reason, a more powerful computer may be used for the

Cluster structure

Cluster structure

I/O represents a particular challenge. It is often desirable to distribute a shared filesystem across a number of machines within the cluster to allow parallel access.

Network design is another key issue. With small clusters, a simple switched network may be adequate. With larger clusters, a fully connected network may be prohibitively expensive.

I/O represents a particular challenge. It is often desirable to distribute a shared filesystem across a number of machines within the cluster to allow parallel access.

Network design is another key issue. With small clusters, a simple switched network may be adequate. With larger clusters, a fully connected network may be prohibitively

Cluster structure

Types of Clusters

Originally, "clusters" and "highcomputing" were synonymous. Today, the meaning of the word "cluster" has expanded beyond high-performance to include highavailability (HA) clusters and loadavailability (HA) clusters and load(LB) clusters.

Originally, "clusters" and "high-performance computing" were synonymous. Today, the meaning of the word "cluster" has expanded

performance to include high-availability (HA) clusters and load-balancing availability (HA) clusters and load-balancing

Types of Clusters

High-availability clusters, also called failover clusters, are often used in missionapplications. If you can't afford the lost business that will result from having your web server go down, you may want to implement it using a HA down, you may want to implement it using a HA cluster. The key to high availability is redundancy.

availability clusters, also called failover clusters, are often used in mission-critical applications. If you can't afford the lost business that will result from having your web server go down, you may want to implement it using a HA down, you may want to implement it using a HA cluster. The key to high availability is

Types of Clusters

The idea behind a loadprovide better performance by dividing the work among multiple computers. For example, when a web server is implemented using LB clustering, the different queries to the server are clustering, the different queries to the server are distributed among the computers in the clusters. This might be accomplished using a simple round-robin algorithm.

The idea behind a load-balancing cluster is to provide better performance by dividing the work among multiple computers. For example, when a web server is implemented using LB clustering, the different queries to the server are clustering, the different queries to the server are distributed among the computers in the clusters. This might be accomplished using a simple

Types of Clusters

Keep in mind, the term "loaddifferent things to different people. A highperformance cluster used for scientific calculation and a cluster used as a web server would likely approach loadwould likely approach loaddifferent ways. Each application has different critical requirements.

Keep in mind, the term "load-balancing" means different things to different people. A high-performance cluster used for scientific calculation and a cluster used as a web server would likely approach load-balancing in entirely would likely approach load-balancing in entirely different ways. Each application has different

Distributed Computing and Clusters

While the term parallel is often used to describe clusters, they are more correctly described as a type of distributed computing.

Typically, the term parallel computing refers to Typically, the term parallel computing refers to tightly coupled sets of computation. Distributed computing is usually used to describe computing that spans multiple machines or multiple locations.


While the term parallel is often used to describe clusters, they are more correctly described as a type of distributed computing.

Typically, the term parallel computing refers to Typically, the term parallel computing refers to tightly coupled sets of computation. Distributed computing is usually used to describe computing that spans multiple machines or


Clusters are generally restricted to computers on the same subnetworkcomputing is frequently used to describe computers working together across a WAN or the Internet.

Peer-to-peer computing provides yet another approach to distributed computing. Again this is an ambiguous term. Peersharing cycles, to the communications infrastructure, or to the actual data distributed across a WAN or the Internet.


Clusters are generally restricted to computers subnetwork or LAN. The term grid

computing is frequently used to describe computers working together across a WAN or

peer computing provides yet another approach to distributed computing. Again this is an ambiguous term. Peer-to-peer may refer to sharing cycles, to the communications infrastructure, or to the actual data distributed across a WAN or the Internet.

Limitations

While clusters have a lot to offer, they are not panaceas. There is a limit to how much adding another computer to a problem will speed up a calculation. In the ideal situation, you might expect a calculation to go twice as fast on two expect a calculation to go twice as fast on two computers as it would on one. Unfortunately, this is the limiting case and you can only approach it.

While clusters have a lot to offer, they are not panaceas. There is a limit to how much adding another computer to a problem will speed up a calculation. In the ideal situation, you might expect a calculation to go twice as fast on two expect a calculation to go twice as fast on two computers as it would on one. Unfortunately, this is the limiting case and you can only

Amdahl's Law

In a nutshell, Amdahl's Law states that the serial portion of a program will be the limiting factor in how much you can speed up the execution of the program using multiple processors.processors.

Having said this, it is important to remember that Amdahl's Law does clearly state a limitation of parallel computing. But this limitation varies not only from problem to problem, but with the size of the problem as well.

In a nutshell, Amdahl's Law states that the serial portion of a program will be the limiting factor in how much you can speed up the execution of the program using multiple

Having said this, it is important to remember that Amdahl's Law does clearly state a limitation of parallel computing. But this limitation varies not only from problem to problem, but with the size of the problem as well.

Amdahl's Law

Amdahl's Law

One last word about the limitations of clusters the limitations are often tied to a particular approach. It is often possible to mix approaches and avoid limitations. For example, in constructing your clusters, you'll want to use in constructing your clusters, you'll want to use the best computers you can afford. This will lessen the impact of inherently serial code. And don't forget to look at your algorithms!

One last word about the limitations of the limitations are often tied to a

particular approach. It is often possible to mix approaches and avoid limitations. For example, in constructing your clusters, you'll want to use in constructing your clusters, you'll want to use the best computers you can afford. This will lessen the impact of inherently serial code. And don't forget to look at your algorithms!

Documents

Cluster Architecture