Advanced Computer Architecture CSE 8383

Page 1: Advanced Computer Architecture CSE 8383

Computer Science and Engineering

Advanced Computer Architecture

CSE 8383

April 24, 2008

Session 12

Page 2: Advanced Computer Architecture CSE 8383

Contents

Message Passing Systems (Chapters 5 & 7)
Communication Patterns
Network Computing
Client/Server Systems
Clusters
Grid
Interconnection Networks

Page 3: Advanced Computer Architecture CSE 8383

Message Passing Mechanisms

Message Format

Message: an arbitrary number of fixed-length packets

Packet: the basic unit, containing the destination address. A sequence number is needed so that the message can be reassembled

A packet can be further divided into flits (flow control digits)

Routing information and the sequence number occupy the header flit
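The message format above can be sketched in C. This is a minimal illustration, not a real network stack: the payload size `PAYLOAD_BYTES`, the `struct packet` layout, and the function names are all hypothetical. It splits a message into fixed-length packets and stamps each with the destination and a sequence number.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAYLOAD_BYTES 8   /* hypothetical fixed payload size per packet */

/* One fixed-length packet: header fields plus payload. */
struct packet {
    unsigned      dest;                    /* destination address */
    unsigned      seq;                     /* sequence number for reassembly */
    size_t        used;                    /* payload bytes actually used */
    unsigned char payload[PAYLOAD_BYTES];
};

/* Number of packets needed for a message of msg_len bytes. */
size_t packet_count(size_t msg_len)
{
    return (msg_len + PAYLOAD_BYTES - 1) / PAYLOAD_BYTES;
}

/* Split msg into packets. Returns the number of packets written,
 * or 0 if out[] cannot hold them all. */
size_t packetize(const unsigned char *msg, size_t msg_len, unsigned dest,
                 struct packet *out, size_t max_packets)
{
    size_t n = packet_count(msg_len);
    assert(out != NULL);
    if (n > max_packets)
        return 0;
    for (size_t i = 0; i < n; i++) {
        size_t offset = i * PAYLOAD_BYTES;
        size_t chunk = msg_len - offset;
        if (chunk > PAYLOAD_BYTES)
            chunk = PAYLOAD_BYTES;
        out[i].dest = dest;
        out[i].seq = (unsigned)i;
        out[i].used = chunk;
        memset(out[i].payload, 0, PAYLOAD_BYTES);   /* pad the last packet */
        memcpy(out[i].payload, msg + offset, chunk);
    }
    return n;
}
```

A 17-byte message with an 8-byte payload needs three packets; the last one carries a single byte and is padded to the fixed length.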

Page 4: Advanced Computer Architecture CSE 8383

Message, Packets, Flits

[Figure: a message divided into packets; a packet divided into flits, with destination and sequence fields followed by data flits]

Page 5: Advanced Computer Architecture CSE 8383

Store and Forward Routing

Packets are the basic units of information flow

Each node uses a packet buffer

A packet is transferred from S to D (source to destination) through a sequence of intermediate nodes

Both the channel and the buffer must be available for a transfer

Page 6: Advanced Computer Architecture CSE 8383

Wormhole Routing

Flits are the basic units of information flow

Each node uses a flit buffer

Flits are transferred from S to D through a sequence of intermediate routers, in order (pipelined)

Can be visualized as a railroad train

Flits from different packets cannot be mixed up

Page 7: Advanced Computer Architecture CSE 8383

Latency Analysis

L: packet length (in bits)
W: channel bandwidth (bits/sec)
D: distance (number of hops)
F: flit length (in bits)

Page 8: Advanced Computer Architecture CSE 8383

Store and Forward Latency

[Figure: timing diagram; the packet takes L/W on each of the D hops, giving the total T_SF]

Page 9: Advanced Computer Architecture CSE 8383

Wormhole (WH) Latency

[Figure: timing diagram; flits are pipelined across the D hops, giving the total T_WH]

Page 10: Advanced Computer Architecture CSE 8383

Latency Analysis

L: packet length (in bits)
W: channel bandwidth (bits/sec)
D: distance (number of hops)
F: flit length (in bits)

$T_{SF} = D \cdot \frac{L}{W}$

$T_{WH} = \frac{L}{W} + D \cdot \frac{F}{W} \approx \frac{L}{W}$ if $L \gg F$ (independent of D)
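The two latency formulas translate directly into code. The sketch below uses the slide's symbols L, W, D, and F; the function names are hypothetical.

```c
#include <assert.h>

/* Store-and-forward: the whole packet is received at each of the D hops,
 * so the latency is D * L/W. */
double t_store_forward(double L, double W, int D)
{
    assert(W > 0.0 && D >= 0);
    return D * (L / W);
}

/* Wormhole: the packet is sent once (L/W) and each of the D hops adds
 * only one flit time (F/W), so the latency is L/W + D * F/W. */
double t_wormhole(double L, double W, int D, double F)
{
    assert(W > 0.0 && D >= 0);
    return L / W + D * (F / W);
}
```

For L = 4096 bits, W = 1024 bits/sec, D = 4 hops, and F = 32 bits, the functions give 16 s for store-and-forward and 4.125 s for wormhole. The wormhole figure is already close to L/W = 4 s, which illustrates why it is nearly independent of D when L is much larger than F.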

Page 11: Advanced Computer Architecture CSE 8383

Communication Patterns

Point to Point: 1 - 1
Multicast: 1 - n
Broadcast: 1 - all
Conference: n - n

Page 12: Advanced Computer Architecture CSE 8383

Routing: potential problems

Deadlock:

When two messages each hold the resources the other needs in order to move, both messages are blocked (a cyclic dependency on resources)

A straightforward (but inefficient) solution is rerouting

Another solution is to avoid deadlock altogether by imposing a strict monotonic order on network resources

The channel dependency graph (CDG) is a technique for developing a deadlock-free routing algorithm.
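One common way to use a CDG is to check it for cycles: if no cycle exists, the cyclic resource dependency described above cannot form. The sketch below assumes the graph is given as an adjacency matrix over channels; `struct cdg`, `MAX_CHANNELS`, and the function names are hypothetical, and the depth-first search is a standard cycle test rather than anything specific to the course.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CHANNELS 16

/* dep[i][j] is true when a message holding channel i may request channel j. */
struct cdg {
    int  n;
    bool dep[MAX_CHANNELS][MAX_CHANNELS];
};

/* Depth-first visit; state is 0 = unvisited, 1 = on current path, 2 = done. */
static bool visit(const struct cdg *g, int c, int *state)
{
    state[c] = 1;
    for (int next = 0; next < g->n; next++) {
        if (!g->dep[c][next])
            continue;
        if (state[next] == 1)
            return true;                 /* back edge: cyclic dependency */
        if (state[next] == 0 && visit(g, next, state))
            return true;
    }
    state[c] = 2;
    return false;
}

/* Returns true if the dependency graph contains a cycle. */
bool cdg_has_cycle(const struct cdg *g)
{
    int state[MAX_CHANNELS] = {0};
    assert(g->n >= 0 && g->n <= MAX_CHANNELS);
    for (int c = 0; c < g->n; c++)
        if (state[c] == 0 && visit(g, c, state))
            return true;
    return false;
}
```

Four channels that depend on each other in a ring form a cycle; removing any one dependency breaks it.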

Page 13: Advanced Computer Architecture CSE 8383

[Figure: A 4-node network and its CDGs. (a) A 4-node network with nodes 0 to 3 and channels c1 to c8; (b) channel dependency graph (CDG); (c) CDG for a deadlock-free version of the network]

Page 14: Advanced Computer Architecture CSE 8383

Livelock:

A message goes around the network and never reaches its destination

It results from using adaptive routing algorithms with dynamic injection, where nodes inject their messages into the network at arbitrary times

Policies to avoid livelock are based on assigning a priority to each message injected into the network:

Messages are routed according to their priorities

Once a message is injected, only a finite number of messages will be injected with higher or equal priority.

Page 15: Advanced Computer Architecture CSE 8383

Starvation:

A node suffers from starvation if it has a message to inject into the network but is never allowed to do so.

The simplest policy to avoid starvation is to allow each node to have an injection queue that competes with the queues of the incoming links to the same node.

The main disadvantage is that a node with a high message injection rate can slow down all the other nodes in the network.

Page 16: Advanced Computer Architecture CSE 8383

Routing Efficiency

Two Parameters

Channel Traffic (number of channels used to deliver the message involved)

Communication Latency (distance)
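Both parameters can be computed for a multicast carried out as separate unicasts on a mesh. The sketch below makes two assumptions that are not stated on the slides: routing is minimal, so a path uses as many channels as the Manhattan distance, and latency is taken as the longest source-to-destination distance. `struct node` and the function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct node { int x, y; };

/* Hops between two mesh nodes under minimal routing. */
int mesh_hops(struct node a, struct node b)
{
    return abs(a.x - b.x) + abs(a.y - b.y);
}

/* Multicast as n separate unicasts: traffic is the sum of the path
 * lengths, latency is the longest path. */
void unicast_multicast_cost(struct node src, const struct node *dst, int n,
                            int *traffic, int *latency)
{
    assert(n >= 0);
    *traffic = 0;
    *latency = 0;
    for (int i = 0; i < n; i++) {
        int h = mesh_hops(src, dst[i]);
        *traffic += h;
        if (h > *latency)
            *latency = h;
    }
}
```

A multicast tree that shares channels among destinations can lower the traffic figure, which is the trade-off the following mesh slides ask about.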

Page 17: Advanced Computer Architecture CSE 8383

Multicast on a mesh (5 unicasts)

Traffic ?

Latency ?

Page 18: Advanced Computer Architecture CSE 8383

Multicast on a mesh (multicast pattern 1)

Traffic ?

Latency ?

Page 19: Advanced Computer Architecture CSE 8383

Multicast on a mesh (multicast pattern 2)

Traffic ?

Latency ?

Page 20: Advanced Computer Architecture CSE 8383

Broadcast (tree structure)

3 2 3 4

2 1 2 3

1 1 2
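One reading of the numbers above, offered as an assumption since the figure itself is lost: they label a 3 x 4 mesh with the step at which each node receives the broadcast, and the unlabeled node in the bottom row (second column) is the source. Under that reading each label is the node's hop distance from the source. The function name and mesh dimensions below are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define ROWS 3
#define COLS 4

/* Fill step[r][c] with the step at which node (r, c) receives the
 * broadcast, assuming every node forwards to its mesh neighbors one
 * step after it receives the message. The source gets step 0. */
void broadcast_steps(int src_r, int src_c, int step[ROWS][COLS])
{
    assert(src_r >= 0 && src_r < ROWS && src_c >= 0 && src_c < COLS);
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            step[r][c] = abs(r - src_r) + abs(c - src_c);
}
```

Calling `broadcast_steps(2, 1, step)` reproduces the three rows of labels, with 0 at the source position.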

Page 21: Advanced Computer Architecture CSE 8383

Message Passing in PVM (Revisit)

[Figure: a sending task (user application, library, daemon) with points 1 to 4, and a receiving task (user application, library, daemon) with points 5 to 8]

Page 22: Advanced Computer Architecture CSE 8383

Standard PVM asynchronous communication

A sending task issues a send command (point 1)

The message is transferred to the daemon (point 2)

Control is returned to the user application (points 3 & 4)

The daemon will transmit the message on the physical wire sometime after returning control to the user application (point 3)

Page 23: Advanced Computer Architecture CSE 8383

Standard PVM asynchronous communication (cont.)

The receiving task issues a receive command (point 5) at some other time

In the case of a blocking receive, the receiving task blocks on the daemon waiting for a message (point 6). After the message arrives, control is returned to the user application (points 7 & 8)

In the case of a non-blocking receive, control is returned to the user application immediately (points 7 & 8)

Page 24: Advanced Computer Architecture CSE 8383

Send (3 steps)

1. A send buffer must be initialized
2. The message is packed into the buffer
3. The completed message is sent to its destination(s)

Page 25: Advanced Computer Architecture CSE 8383

Receive (2 steps)

1. The message is received
2. The received items are unpacked

Page 26: Advanced Computer Architecture CSE 8383

Message Buffers

Buffer Creation (before packing)

bufid = pvm_initsend(encoding_option)

bufid = pvm_mkbuf(encoding_option)

Encoding option    Meaning
0                  XDR
1                  No encoding
2                  Leave data in place

Page 27: Advanced Computer Architecture CSE 8383

Message Buffers (cont.)

Data Packing

pvm_pk*()

pvm_pkstr() – one argument

pvm_pkstr("This is my data");

Others – three arguments

1. Pointer to the first item
2. Number of items to be packed
3. Stride

pvm_pkint(my_array, n, 1);

Packing functions can be called multiple times to pack data into a single message
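The three packing arguments, and the stride in particular, can be illustrated without PVM itself. The sketch below is a stand-in, not PVM code: `struct int_buffer`, `buffer_init`, and `pack_ints` are hypothetical names that only mimic the pointer/count/stride convention and the fact that several packing calls append to one message.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_CAPACITY 64

/* Hypothetical stand-in for a send buffer; not PVM's internal layout. */
struct int_buffer {
    int    data[BUF_CAPACITY];
    size_t len;
};

/* Start with an empty buffer (the "initialize" step of a send). */
void buffer_init(struct int_buffer *b)
{
    b->len = 0;
}

/* Append n items from src, taking every stride-th element.
 * Returns 0 on success, -1 if the buffer would overflow. */
int pack_ints(struct int_buffer *b, const int *src, size_t n, size_t stride)
{
    assert(stride >= 1);
    if (b->len + n > BUF_CAPACITY)
        return -1;
    for (size_t i = 0; i < n; i++)
        b->data[b->len++] = src[i * stride];
    return 0;
}
```

With a stride of 1 the items are taken contiguously; with a stride of 2 every other element is taken, so packing 3 items from {10, 11, 12, 13, 14, 15} yields 10, 12, 14.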

Page 28: Advanced Computer Architecture CSE 8383

Sending a message

Point to point (one receiver)

info = pvm_send(tid, tag)

Broadcast (multiple receivers)

info = pvm_mcast(tids, n, tag)
info = pvm_bcast(group_name, tag)

Pack and Send (one step)

info = pvm_psend(tid, tag, my_array, length, data_type)

Page 29: Advanced Computer Architecture CSE 8383

Receiving a message

Blocking

bufid = pvm_recv(tid, tag)
-1 acts as a wildcard in either tid or tag

Nonblocking

bufid = pvm_nrecv(tid, tag)
bufid = 0 (no message was received)

Timeout

bufid = pvm_trecv(tid, tag, timeout)
bufid = 0 (no message was received)

Page 30: Advanced Computer Architecture CSE 8383

Different Receive in PVM

[Figure: timelines for the three receive calls. pvm_recv() (blocking): after the function is called, the task waits until the message arrives, then resumes execution. pvm_nrecv() (non-blocking): the task continues execution. pvm_trecv() (timeout): the task waits until the message arrives or the time expires, then resumes execution.]

Page 31: Advanced Computer Architecture CSE 8383

Data unpacking

pvm_upk*()

pvm_upkstr() – one argument

pvm_upkstr(string);

Others – three arguments

1. Pointer to the first item
2. Number of items to be unpacked
3. Stride

pvm_upkint(my_array, n, 1);

Page 32: Advanced Computer Architecture CSE 8383

Network Computing

Four categories: wide area (WAN), metropolitan area (MAN), local area (LAN), and system or storage area (SAN) networks

Internet

TCP/IP

Page 33: Advanced Computer Architecture CSE 8383

Other Network Technologies

Fast Ethernet and Gigabit Ethernet
The Fiber Distributed Data Interface (FDDI)
High-Performance Parallel Interface (HiPPI)
Asynchronous Transfer Mode (ATM)
Scalable Coherent Interface (SCI)

Page 34: Advanced Computer Architecture CSE 8383

[Figure: A representation of network technologies. 10 Base T, 100 Base T, 1000 Base T, FDDI, ATM, HiPPI, and SCI placed by network category (SAN, LAN, MAN, WAN) and data rate (10 Mbps, 100 Mbps, 1000 Mbps, 10 Gbps)]

Page 35: Advanced Computer Architecture CSE 8383

Client/Server Systems

[Figure: clients and a server connected through an interconnection network; the server runs server threads]

Page 36: Advanced Computer Architecture CSE 8383

Sockets

Sockets are used to provide the capability of making connections from one application running on one machine to another running on a different machine.

Once a socket is created, it can be used to wait for an incoming connection (passive socket) or to initiate a connection (active socket).

[Figure: A Socket Connection between a client and a server]
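The passive/active distinction maps onto the POSIX socket calls roughly as follows. This is a minimal sketch assuming a POSIX system and TCP over the loopback address; the helper names `open_passive` and `open_active` are hypothetical, and production code would need fuller error handling.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Passive socket: bind to a loopback port chosen by the OS and listen.
 * Returns the listening descriptor and stores the port in *port,
 * or returns -1 on failure. */
int open_passive(unsigned short *port)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int fd;

    assert(port != NULL);
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* let the OS pick a free port */
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 1) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(addr.sin_port);
    return fd;
}

/* Active socket: initiate a connection to the given loopback port.
 * Returns the connected descriptor, or -1 on failure. */
int open_active(unsigned short port)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The server side then calls `accept()` on the passive descriptor to obtain a connected socket, after which both ends exchange data with `read()` and `write()`.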

Page 37: Advanced Computer Architecture CSE 8383

A Client Server Framework for Parallel Applications

[Figure: a client, a master (supervisor), and slaves (workers) Server 1, Server 2, Server 3, ..., Server n, connected by interconnection networks]

Page 38: Advanced Computer Architecture CSE 8383

Computer Clusters

Advances in commodity processors and network technology

A network of PCs and workstations connected via LAN or WAN forms a parallel system

Clusters compete favorably on cost/performance

Page 39: Advanced Computer Architecture CSE 8383

Cluster Architecture

[Figure: a home cluster of nodes, each labeled M, C, P, I/O, and OS, joined by an interconnection network, with middleware and a programming environment layered on top]

Page 40: Advanced Computer Architecture CSE 8383

Grids

Dependable, consistent, pervasive, and inexpensive access to high-end computing.

Geographically distributed platforms.

[Figure: platforms connected through the Internet]

Page 41: Advanced Computer Architecture CSE 8383

Interconnection Networks

Ethernet

A packet-switched LAN technology.

All hosts connected to an Ethernet receive every transmission, making it possible to broadcast a packet to all hosts at the same time.

Ethernet uses a distributed access control scheme called Carrier Sense Multiple Access with Collision Detection (CSMA/CD).

Each computer connected to an Ethernet network is assigned a unique 48-bit address known as its Ethernet address, also called the media access control (MAC) address.

Page 42: Advanced Computer Architecture CSE 8383

Switches

An n1 x n2 switch consists of:

n1 input ports
n2 output ports
Links connecting each input to every output
Control logic to select a specific connection
Internal buffers

The connections between input ports and output ports may be:

One-to-one (point-to-point)
One-to-many (multicast or broadcast)
Many-to-one: may cause conflicts at the output ports and needs arbitration.

Page 43: Advanced Computer Architecture CSE 8383

When only one-to-one connections are allowed, the switch is called a crossbar.

An n x n crossbar switch can establish n! connections.

If we allow both one-to-one and one-to-many connections in an n x n switch, the number of connections that can be established is $n^n$.

(We discussed this before, remember?)
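The two counts are easy to check numerically. In the sketch below the function names are hypothetical; the reasoning in the comments is the usual counting argument (permutations for one-to-one, an independent choice of input per output when one-to-many is allowed).

```c
#include <assert.h>
#include <stdint.h>

/* One-to-one only: each input is connected to a distinct output,
 * so the configurations are the n! permutations. */
uint64_t crossbar_permutations(unsigned n)
{
    uint64_t r = 1;
    assert(n <= 20);                  /* 21! does not fit in 64 bits */
    for (unsigned i = 2; i <= n; i++)
        r *= i;
    return r;
}

/* One-to-one and one-to-many: each of the n outputs independently
 * selects one of the n inputs, giving n^n configurations. */
uint64_t switch_mappings(unsigned n)
{
    uint64_t r = 1;
    assert(n <= 15);                  /* 16^16 does not fit in 64 bits */
    for (unsigned i = 0; i < n; i++)
        r *= n;
    return r;
}
```

For a 4 x 4 switch this gives 24 one-to-one configurations and 256 when one-to-many connections are also allowed.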

Page 44: Advanced Computer Architecture CSE 8383

Routing can be achieved using 2 mechanisms:

Source-path: the entire path to the destination is stored in the packet header at the source location.

Table-based: the switch must have a complete routing table that determines the corresponding port for each destination.

[Figure: Source-path Routing versus Table-based Routing. Two 8-port switches (Port 0 to Port 7): in the source-path example the packet header reads 605; in the table-based example the header carries a Dest-id that the routing table maps to an output port]
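The difference between the two mechanisms shows up in what a switch does with the packet header. The sketch below is illustrative only: the struct layouts and function names are hypothetical, and reading the figure's "605" header as the port sequence 6, 0, 5 is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HOPS 8
#define MAX_DEST 16

/* Source-path: the header carries one output port per switch on the path. */
struct source_route {
    unsigned char ports[MAX_HOPS];
    size_t        len;    /* number of ports stored in the header */
    size_t        next;   /* index of the port the next switch will use */
};

/* Each switch consumes the leading port of the header.
 * Returns -1 once the header is exhausted. */
int source_path_next_port(struct source_route *hdr)
{
    assert(hdr->len <= MAX_HOPS);
    if (hdr->next >= hdr->len)
        return -1;
    return hdr->ports[hdr->next++];
}

/* Table-based: the packet carries only a destination id and every
 * switch looks the output port up in its own table. */
struct routing_table {
    int port_for_dest[MAX_DEST];   /* -1 means no route */
};

int table_lookup_port(const struct routing_table *t, unsigned dest_id)
{
    if (dest_id >= MAX_DEST)
        return -1;
    return t->port_for_dest[dest_id];
}
```

Source-path routing keeps the switches simple at the cost of a longer header, while table-based routing keeps the header short but requires a complete table in every switch.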

Page 45: Advanced Computer Architecture CSE 8383

Myrinet Clos network

Myrinet is a high-performance, packet communication and switching technology.

Myrinet switches are multiple-port components that route a packet entering on an input channel of a port to the output channel of the port selected by the packet.

Page 46: Advanced Computer Architecture CSE 8383

Myrinet Clos network

[Figure: 128-host Clos Network using 16-port Myrinet Switch. The Clos “spreader” network connects the network spine (upper 8 switches) to the leaves (16 lower switches), which attach the 128 hosts]

Page 47: Advanced Computer Architecture CSE 8383

Myrinet Clos network

[Figure: 64-host Clos Network using 16-port Myrinet Switch. Each line between the network spine and the lower switches represents 2 links]

Page 48: Advanced Computer Architecture CSE 8383

Myrinet Clos network

[Figure: 32-host Clos Network using 16-port Myrinet Switch. Each line between the network spine and the lower switches represents 4 links]
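The three Clos configurations follow one sizing rule, sketched below under an assumption inferred from the 128-host case rather than stated on the slides: each leaf switch uses half of its ports for hosts and half for uplinks, and the uplinks are spread evenly over the spine switches. `struct clos_size` and `clos_two_level` are hypothetical names.

```c
#include <assert.h>

struct clos_size {
    int leaves;           /* leaf switches, where hosts attach */
    int spines;           /* spine switches */
    int links_per_pair;   /* parallel links between each leaf and each spine */
};

/* Size a two-level Clos network built from identical switches.
 * Returns 0 on success, -1 if the numbers do not divide evenly. */
int clos_two_level(int ports, int hosts, struct clos_size *out)
{
    int down, leaves, uplinks, spines;

    assert(ports > 0 && ports % 2 == 0 && hosts > 0);
    down = ports / 2;                 /* host ports (and uplinks) per leaf */
    if (hosts % down != 0)
        return -1;
    leaves = hosts / down;
    uplinks = leaves * down;          /* total leaf-to-spine links */
    if (uplinks % ports != 0)
        return -1;
    spines = uplinks / ports;         /* every spine port carries one link */
    if (down % spines != 0)
        return -1;
    out->leaves = leaves;
    out->spines = spines;
    out->links_per_pair = down / spines;
    return 0;
}
```

With 16-port switches this gives 16 leaves and 8 spine switches with 1 link per pair for 128 hosts, 2 links per pair for 64 hosts, and 4 links per pair for 32 hosts, matching the figure captions.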

Page 49: Advanced Computer Architecture CSE 8383

The Quadrics network (QsNet)

Consists of 2 hardware building blocks:

A programmable network interface called Elan, which connects the Quadrics network to a processing node containing one or more CPUs. Elan provides substantial local processing power to implement high-level message passing protocols (e.g., MPI).

A high-bandwidth, low-latency communication switch called Elite. QsNet connects Elite switches in a quaternary fat-tree topology.

Page 50: Advanced Computer Architecture CSE 8383

The Quadrics network (QsNet)

Processing Nodes