Cost-effective algorithms for multicast connection in ATM switches based on self-routing multistage networks

ELSEVIER Computer Communications 21 (1998) 54-64


Jaehyung Park*, Hyunsoo Yoon

Department of Computer Science and Center for Artificial Intelligence Research, Korea Advanced Institute of Science and Technology, 373-1, Kusong-Dong, Yusung-Gu, Taejon 305-701, South Korea

Received 3 April 1996; revised 12 September 1996; accepted 19 September 1996

Abstract

In this paper, we discuss the multicast connection in the self-routing multistage interconnection network (MIN) for constructing the internal architecture of asynchronous transfer mode (ATM) switches. Many applications of ATM switches require multicast connections in addition to conventional point-to-point connections. Multicast connection, in which the same message is delivered from a source to an arbitrary number of destinations, is fundamental in supporting collective connection primitives including cable TV, teleconferencing, and video-on-demand services. This paper presents a novel approach to supporting multicast connection, on the basis of the recursive scheme that recycles a multicast packet one or more times through the network to reach desired destinations. We also propose cost-effective multicast algorithms providing deadlock-freedom in MIN-based ATM switches. The proposed algorithms require a small and fixed number of recycling passes and a reasonable number of links used. The proposed algorithms can be easily applied to buffered MIN-based ATM switches. © 1998 Elsevier Science B.V.

Keywords: ATM switch architectures; Multistage networks; Multicast connection; Recursive scheme; Deadlock freedom

1. Introduction

The broadband integrated services digital network (B-ISDN) should support a large number of services such as video communication, hi-fi sound, graphics applications, and high speed data communication [1,2]. These broadband services require a wide range of bandwidth, varying from an estimated bit rate of a few bit/s up to some hundreds of Mbit/s. For supporting these services efficiently, the asynchronous transfer mode (ATM) has been widely accepted as a well-suited transmission and switching principle [3]. The ATM switching technique allows the integration of various services with different bandwidth requirements in an efficient manner since the information transfer is based on a short fixed-sized packet (cell) [4].

Most of the services provided in the last decades have typically been between two end users, that is, point-to-point services. As video services are playing the key role in broadband services, it is important for B-ISDNs to provide point-to-multipoint (multicast) services as well [1,2,5]. The multicast service, in which the same message is delivered from a source to an arbitrary number of destinations, includes one-to-all (broadcast) services as a special case. For various broadband applications such as terrestrial broadcasting, teleconferencing, and video-on-demand services, the capability of handling multicast services is demanded in designing ATM switch architectures [1,2,6].

* Corresponding author.

0140-3664/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII S0140-3664(97)00065-0

Many of the ATM switch architectures which support broadband services efficiently employ multistage interconnection networks (MINs), either buffered or unbuffered, as a routing network. To provide the multicast capability, two alternative approaches are possible. The first is to add a copy network at the front of a routing network, which replicates the cell to obtain the number of copies requested by a given multicast connection. Those copies are then transmitted through the routing network to the desired destinations. The design of the copy network may be quite different from that of the routing network, and requires extra component complexity [4,7,8]. Instead of designing an additional copy network, the second approach is to exploit the inherent multicast capability of a routing network by recycling the cells at the outputs of a routing network to its inputs to generate extra copies. This approach is known as a recursive scheme, since it simulates the function of a copy network by using the routing network recursively. This approach has the advantage that it does not require extra hardware. The simplicity of the network control algorithm and hardware implementation makes this scheme more attractive.

Many multicast routing algorithms based on such a recursive scheme have been proposed [9-13]. The algorithm in [10] employs header encoding schemes based on a multi-address addressing scheme, so that a source node directly sends a multicast message to all destination nodes at one time. However, such a multi-address encoding scheme is not suitable for ATM switches handling fixed-sized cells, because it requires variable-sized headers. Other algorithms [9,11,13], which employ a restricted address encoding scheme and construct a fixed-sized multicast header, suffer from a large number of passes for one multicast connection and from deadlock, which radically degrades the switch performance for multiple multicast connections [14]. Also, the performance of these multicast algorithms is not analyzed for the case of several multicast connections, but is evaluated only in terms of the number of recycling passes for the case of one multicast connection.

In this paper, we describe simple header encoding schemes based on the restricted address encoding scheme, which constructs a short fixed-sized multicast header. We also propose a novel approach to support multicast connection in the wrap-around MIN-based ATM switch. This paper proposes cost-effective recursive algorithms which provide deadlock-free multicast connections in the unbuffered MIN by employing the restricted address encoding scheme. The proposed algorithms exploit the intrinsic nonblocking properties of the MIN. The multicast algorithms have been designed with the aim of minimizing both the number of passes across the MIN and the number of internal links used. The proposed multicast algorithms can be easily applied to MIN-based ATM switches, such as the Phoenix switch, Adaptive Corporation ATMX, and Fore Systems ASX-100.

The rest of this paper is organized as follows. The next section describes the basic architecture, the intrinsic nonblocking property, the header encoding schemes, and related work. In Section 3, cost-effective multicast algorithms based on the recursive scheme are proposed, together with an approach to enhance their performance. We evaluate the performance of the proposed algorithms in comparison with others in Section 4. Section 5 concludes the paper.

2. System models and related work

This section describes the basic architecture, the nonblocking property of the banyan networks, and header encoding schemes. It also briefly reviews some algorithms on MIN-based ATM switches.

2.1. Basic architecture and nonblocking property

The banyan network is a class of MINs with the property that there is a unique path between any pair of source and destination [15]. The banyan network has its self-routing property due to its unique path property. Many of the known MINs, such as the omega network, multistage cube network, flip network, baseline network, and data manipulator, belong to the class of banyan networks and have been shown to be topologically equivalent [16]. For this reason, the banyan network is considered throughout this paper.

The banyan network is an $N \times N$ interconnection network with $n = \log_k N$ stages. Each stage contains $N/k$ switching elements (SEs) of size $k \times k$. The stages are labeled in sequence from $(n-1)$ to 0, with $(n-1)$ for the first stage. The N input/output ports at each stage are labeled using n k-ary digits $(a_{n-1} a_{n-2} \ldots a_0)$, starting from the top within each stage. The SEs at each stage are labeled using $(n-1)$ k-ary digits $(a_{n-1} a_{n-2} \ldots a_1)$, starting from the top. When we refer to one of the k input ports of a specific SE, we use the k-ary digit $a_0$ to represent that input port for simplicity. We consider butterfly interconnection patterns between stages, and a perfect shuffle interconnection pattern between input controllers and stage $(n-1)$. The formal definitions of these connection patterns are as follows:

Definition 1. The interconnection function $IC_i$ for the output ports at stage i in the banyan network under consideration, for $n-1 \ge i \ge 0$, and that for the outputs of the input controllers are respectively defined by

$$IC_i(a_{n-1} \ldots a_{i+1} a_i a_{i-1} \ldots a_1 a_0) = (a_{n-1} \ldots a_{i+1} a_0 a_{i-1} \ldots a_1 a_i)$$

$$(a_{n-1} a_{n-2} \ldots a_1 a_0) \mapsto (a_{n-2} a_{n-3} \ldots a_1 a_0 a_{n-1})$$

where the right-hand-side labels are those of the input links of the next stage and $0 \le a_i \le k-1$.
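The two connection patterns of Definition 1 can be exercised with a short Python sketch. It assumes labels are given as digit lists written $a_{n-1} \ldots a_0$, and that the SE at stage i overwrites the digit $a_0$ with digit i of the destination address (the usual self-routing rule, stated here as an assumption). The function names `perfect_shuffle`, `butterfly`, `route` and `all_labels` are illustrative and do not come from the paper.

```python
from itertools import product


def perfect_shuffle(digits):
    """Input controllers -> stage n-1: rotate the label left by one digit."""
    return list(digits[1:]) + list(digits[:1])


def butterfly(digits, i):
    """IC_i: exchange digits a_i and a_0 (labels are listed a_{n-1} ... a_0)."""
    out = list(digits)
    n = len(out)
    out[n - 1 - i], out[n - 1] = out[n - 1], out[n - 1 - i]
    return out


def route(src, dst):
    """Trace one unicast packet from input src to output dst.

    Returns (trace, final): trace[j] is the input-port label seen at the
    j-th stage visited (stage n-1 first), final is the output label reached.
    """
    n = len(src)
    port = perfect_shuffle(list(src))
    trace = []
    for i in range(n - 1, -1, -1):
        trace.append(list(port))
        port[-1] = dst[n - 1 - i]  # assumed rule: SE at stage i selects digit i of dst
        port = butterfly(port, i)  # link pattern after stage i
    return trace, port


def all_labels(k, n):
    """All n-digit k-ary labels, as tuples (a_{n-1}, ..., a_0)."""
    return list(product(range(k), repeat=n))
```

Under these assumptions every packet ends at the output whose label equals its destination address, which is the unique-path, self-routing behaviour described in the text.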

A broadcast banyan network is a banyan network with SEs which are capable of packet replication. A packet arriving at a broadcast SE can be either routed to one of the output ports, or replicated and sent out on both ports. Fig. 1 illustrates the basic ATM switch architecture constructed from a $\log_k N$-stage broadcast banyan network with wrap-around links. In a broadcast banyan network with wrap-around links, output controllers are connected to input controllers through external links in order to recycle packets.

The banyan network itself is a blocking network. Two packets with different destinations may be routed through the same internal link at the same time. However, it is known that if the incoming packets with distinct destinations are arranged in ascending or descending order and the active sources are all connected to consecutive input links, then the banyan network becomes nonblocking [17]. The nonblocking condition is formally described as follows.

Definition 2. The binary relation $<_R$ is defined between two sets Y and Y' which consist of destinations as follows: $Y <_R Y'$ if and only if $y < y'$ for all $y \in Y$ and $y' \in Y'$.

Fig. 1. The basic switch architecture based on the broadcast banyan network, where N = 8 and k = 2.

Property 1. A broadcast banyan network is nonblocking if the active sources $x_1, \ldots, x_k$ and the corresponding sets of destinations $Y_1, \ldots, Y_k$ satisfy the following.

1. Monotone: $Y_1 <_R Y_2 <_R \cdots <_R Y_k$ or $Y_k <_R \cdots <_R Y_2 <_R Y_1$.

2. Concentration: any source between two active sources is also active. That is, $x_l \le w \le x_m$ implies source w is active, where $1 \le l < m \le k$.

A proof of the sufficient condition is given in [8]. An example with active sources $x_1 = 3$, $x_2 = 4$, $x_3 = 5$, and corresponding destinations $Y_1 = \{1,3\}$, $Y_2 = \{5,6,7\}$, $Y_3 = \{8,10,13,14\}$ is depicted in Fig. 2. By Definition 2, $Y_1 <_R Y_2 <_R Y_3$.
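The two conditions of Property 1 are easy to test mechanically. The following Python sketch is only an illustration: it assumes the active sources are listed in ascending order with their destination sets in the same order, and the names `precedes`, `is_monotone`, `is_concentrated` and `is_nonblocking` are hypothetical.

```python
def precedes(y, y_prime):
    """Y <_R Y' (Definition 2): every element of Y is smaller than every element of Y'."""
    return max(y) < min(y_prime)


def is_monotone(dest_sets):
    """Destination sets are ordered by <_R, either ascending or descending."""
    pairs = list(zip(dest_sets, dest_sets[1:]))
    ascending = all(precedes(a, b) for a, b in pairs)
    descending = all(precedes(b, a) for a, b in pairs)
    return ascending or descending


def is_concentrated(sources):
    """Active sources (sorted ascending) occupy consecutive input links."""
    return all(b - a == 1 for a, b in zip(sources, sources[1:]))


def is_nonblocking(sources, dest_sets):
    """Sufficient condition of Property 1."""
    return is_concentrated(sources) and is_monotone(dest_sets)
```

For the example of Fig. 2, `is_nonblocking([3, 4, 5], [{1, 3}, {5, 6, 7}, {8, 10, 13, 14}])` evaluates to `True`. Since Property 1 is only a sufficient condition, a `False` result does not by itself prove that blocking occurs.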

2.2. Header encoding schemes

The restricted address encoding scheme constructs a multicast routing header from reachable destinations which are restricted to a single cube or a single region in the MINs.

As one of the restricted address encoding schemes, the cube encoding specifies arbitrary destinations forming a single cube C. The multicast routing header for the cube encoding scheme is specified by $\{R, M\}$, where $R = r_{n-1} \ldots r_1 r_0$ contains the routing information and $M = m_{n-1} \ldots m_1 m_0$ contains the multicast information [9,11]. To handle the multicast header $\{R, M\}$, an SE at stage i ($0 \le i \le n-1$) examines $r_i$ and $m_i$. If $m_i$ is 0, normal unicast routing is performed according to $r_i$. If $m_i$ is 1, $r_i$ is ignored and the broadcast is performed. Fig. 3 shows routing paths for the destinations which form a cube C = 1x0x; correspondingly, M is 0101 and R can be any one of the destination addresses belonging to the cube C.
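As a minimal sketch of how a cube header expands into destinations, the Python functions below walk the stages from $n-1$ down to 0 and apply the rule just described for binary SEs. The header fields are passed as integers, and the names `cube_destinations` and `cube_string` are illustrative, not part of the paper.

```python
def cube_destinations(r, m, n):
    """Destinations reached by header {R, M} in an n-stage binary network.

    Stage i appends bit r_i when m_i = 0 (unicast) and both bits when
    m_i = 1 (broadcast), so partial addresses grow from the most
    significant bit downwards.
    """
    dests = [0]
    for i in range(n - 1, -1, -1):
        r_i = (r >> i) & 1
        m_i = (m >> i) & 1
        if m_i:
            dests = [(d << 1) | b for d in dests for b in (0, 1)]
        else:
            dests = [(d << 1) | r_i for d in dests]
    return sorted(dests)


def cube_string(r, m, n):
    """Cube notation of the header, e.g. '1x0x'."""
    return "".join(
        "x" if (m >> i) & 1 else str((r >> i) & 1)
        for i in range(n - 1, -1, -1)
    )
```

With R = 1000 and M = 0101, the header describes the cube 1x0x, that is, outputs 8, 9, 12 and 13. Choosing another member of the cube for R, such as 1101, gives the same result because the bits of R under a 1 in M are ignored.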

Fig. 2. An example of nonblocking connection in broadcast banyan networks.

Another restricted address encoding scheme is the region encoding scheme, which specifies arbitrary consecutive destinations forming a single region [8,13]. The multicast routing header in the region encoding scheme indicates the minimum and maximum addresses of the consecutive destinations. Fig. 4 illustrates an example of routing paths through the MIN-based switch for a multicast packet with a routing header containing the address interval specified by 0100 and 1000.

The size of the multicast routing header in both restricted address encoding schemes is fixed at 2n bits. An SE using the region encoding scheme must be capable of header modification, whereas an SE using the cube encoding scheme need not be. Hence, an SE using the region encoding scheme has a higher hardware cost than one using the cube encoding scheme.

Fig. 3. An example of self-routing based on the cube encoding scheme.

Fig. 4. An example of self-routing based on the region encoding scheme.

2.3. Related work

An algorithm considering the blocking effect among several constituent copies of one multicast packet at any pass is proposed in [9], which uses the cube encoding scheme. This multicast algorithm checks whether any two constituent copies of one multicast packet can be routed without conflict. As any two constituent copies which conflict cannot be sent in the same pass, this algorithm prevents blocking. The algorithm generates a multicast tree which models the sequence of recycling passes through the network for routing a multicast packet to all destinations. In the multicast tree, a parent node sends a copy with one multicast routing header to its children. Fig. 5 shows the generated nonblocking multicast tree and the multicast routing example, where source 5 sends a packet to destinations 0, 1, 3, 7, 10, 12 and 15. However, the algorithm has $O(d^2)$ time complexity to construct the nonblocking multicast tree for the d destinations of one multicast packet. In the worst case, it also needs d passes to deliver the packet to d arbitrary destinations.

Raghavendra et al. [13] proposed a two-phase multicast algorithm, which guarantees nonblocking between any two constituent copies in the same phase. Because outputs which are not included in the multicast destinations can also receive and send a copy, this algorithm has the advantage that it can easily minimize the number of passes for one multicast packet across the MIN. This algorithm constructs a nonblocking multicast tree as in [9]. However, the region encoding scheme is used in the first phase, while the cube encoding scheme is used in the second phase. Since the SE structure for handling the multicast routing header in the second phase differs from that in the first phase, additional hardware is needed to process two encoding schemes.

These proposed multicast algorithms require excessive network resources such as very complex SEs and/or a large number of internal links. Furthermore, they do not consider deadlock occurring among multiple multicast packets [14].

Fig. 5. A multicast routing example.

Fig. 6. An example of the routing phase in multicast algorithm I.

3. Multicast algorithms in MIN-based ATM switches

In this section, we propose recursive multicast algorithms in wrap-around MIN-based ATM switches which employ the cube encoding scheme. In such a header encoding scheme, an arbitrary set of multicast destinations cannot always be specified by a single multicast header. Our approach to routing a multicast packet is to use the routing in the MIN to make copies and to recycle these copies so that they are routed to the desired destinations in subsequent phases. The proposed algorithms avoid the blocking effect by exploiting the intrinsic nonblocking property of the MIN as described in Section 2.1. This section also considers the occurrence of deadlock among multiple multicast packets and presents an approach to deadlock avoidance.

3.1. Two-phase multicast algorithm

This algorithm routes a multicast packet to arbitrary d destinations in two phases, a copying phase and a routing phase. During the copying phase, a source sends a multicast packet to $2^k$ consecutive outputs, where k is the minimum integer such that $d \le 2^k$. The multicast routing tag is constructed using the cube encoding scheme. In the second phase, the recycled packets from d of these consecutive outputs are sent to the desired destinations. The two-phase multicast algorithm with the cube encoding is described as follows.

3.1.1. Multicast algorithm I

Assume that the d arbitrary destinations for a multicast packet are sorted in ascending order as $D_0, D_1, \ldots, D_{d-1}$.

Phase 1. Copy from the source to $2^k$ consecutive outputs which form a single cube C through the wrap-around MIN. The cube C which consists of $2^k$ consecutive outputs is represented as $r_{n-1} \ldots r_k \underbrace{x x \ldots x}_{k}$. Then, the $2^k$ consecutive outputs are addressed as $s, s+1, \ldots, s+2^k-1$, where $r_i$, $k \le i \le n-1$, is randomly selected and the start address s is $r_{n-1} \ldots r_k \underbrace{0 0 \ldots 0}_{k}$. The cube encoding scheme is used to copy the packet to the $2^k$ consecutive outputs.

Phase 2. Route the recycled copy from the output $(s+l)$ to the destination $D_l$, where $0 \le l \le d-1$. The routing header of the packet from input $(s+l)$ contains $D_l$. The remaining outputs $(s+l)$, $d \le l \le 2^k-1$, do nothing.

In Fig. 6, an example of the second phase is shown, where source 5 sends a multicast packet to destinations {3,4,6,7,9,11,13}. During the copying phase, source 5 sends a copy to $2^3$ outputs {8,9,10,11,12,13,14,15} which form a cube C = 1xxx, where k = 3 and the start address s is 8. In the second phase, the outputs 8, 9, 10, 11, 12, 13, and 14 send their own recycled copies of the multicast packet while output 15 does not.
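The two phases of multicast algorithm I can be summarized as a small planning routine. The Python sketch below is an illustration under stated assumptions: the parameter `prefix` stands for the randomly selected high-order bits $r_{n-1} \ldots r_k$ and is passed in explicitly rather than drawn at random, and the name `two_phase_plan` is hypothetical.

```python
def two_phase_plan(dests, n, prefix=0):
    """Plan for multicast algorithm I in an n-stage binary network.

    Returns (k, s, routing, idle):
      k       - minimum integer with d <= 2**k
      s       - start address of the copy cube
      routing - phase-2 map {intermediate output s + l: destination D_l}
      idle    - intermediate outputs that hold a copy but send nothing
    """
    dests = sorted(dests)
    d = len(dests)
    if d == 0:
        raise ValueError("at least one destination is required")
    k = (d - 1).bit_length()
    if k > n or not 0 <= prefix < 2 ** (n - k):
        raise ValueError("cube does not fit in the network")
    s = prefix << k
    copy_outputs = list(range(s, s + 2 ** k))
    routing = {s + l: dests[l] for l in range(d)}
    idle = copy_outputs[d:]
    return k, s, routing, idle
```

For the example of Fig. 6, `two_phase_plan([3, 4, 6, 7, 9, 11, 13], n=4, prefix=1)` gives k = 3 and s = 8, maps outputs 8 to 14 onto the seven sorted destinations, and leaves output 15 idle.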

Theorem 1. In the two-phase multicast algorithm, blocking does not occur among any constituent copies.

Proof. In the first phase, it is trivial because the packet is copied from a single input to $2^k$ consecutive outputs. In the second phase, by contrast, these copies move from multiple inputs to the corresponding destinations through the MIN. However, the active inputs which receive the copies at the end of the first phase are concentrated on inlets $s, s+1, \ldots, s+d-1$ and the output addresses satisfy the monotonicity property, i.e. $D_0, D_1, \ldots, D_{d-1}$ are in ascending order. Since the conditions of Property 1 are satisfied, the constituent copies pass without blocking in multicast algorithm I. □

Theorem 1 guarantees that a multicast packet that is destined for any arbitrary set of destinations is sent to the desired destinations in two phases across the MIN.

3.2. Three-phase multicast algorithm

In order to minimize the use of excessive links and outputs in algorithm I, this multicast algorithm routes a packet to arbitrary d destinations in three phases: two copying phases and one routing phase. While the first copying phase routes the packet to $2^k$ intermediate outputs, the second routes it to the remaining $(d - 2^k)$ intermediate outputs, where k is the maximum integer such that $d \ge 2^k$.

3.2.1. Multicast algorithm II

Assume that the d multicast destinations are sorted in ascending order as $D_0, D_1, \ldots, D_{d-1}$ and d is represented as $d_{n-1} d_{n-2} \ldots d_0$. Let $B$ ($= B_0 > B_1 > \cdots > B_{m-1}$) be a set of sorted numbers such that $d_{B_l} = 1$, where $0 \le B_l < n$ and $0 \le l \le m-1$.

Phase 1. Copy from the source to $2^k$ ($= 2^{B_0}$) consecutive outputs which form a single cube C through the wrap-around MIN. The cube C which consists of $2^k$ consecutive outputs is represented as $r_{n-1} \ldots r_k \underbrace{x x \ldots x}_{k}$. Then, the $2^k$ consecutive outputs are addressed as $s, s+1, \ldots, s+2^k-1$, where $r_i$, $k \le i \le n-1$, is randomly selected and the start address s is $r_{n-1} \ldots r_k \underbrace{0 0 \ldots 0}_{k}$. The cube encoding scheme is used to copy the packet to the $2^k$ consecutive outputs.

Phase 2. Copy from the output $s+l$ to $2^{B_l}$ consecutive outputs, where $1 \le l \le m-1$. The $2^{B_l}$ outputs are addressed as $s + \sum_{j=0}^{l-1} 2^{B_j}$, $s + \sum_{j=0}^{l-1} 2^{B_j} + 1$, ..., $s + \sum_{j=0}^{l-1} 2^{B_j} + 2^{B_l} - 1$. The remaining outputs $s+l$ do nothing, where $l = 0$ or $m-1 < l \le 2^k-1$. The cube encoding scheme is also used to copy the packet to the $(d - 2^k)$ consecutive outputs.

Phase 3. Route the recycled copy from the output $s+l$ to the destination $D_l$, where $0 \le l \le d-1$. The routing header of the packet from input $(s+l)$ contains $D_l$.

As an example, we apply multicast algorithm II to the connection of Fig. 6. Here d = 7 is represented as 0111 and B = {2,1,0}. In the first copying phase, source 5 sends a copy to $2^2$ outputs {8,9,10,11} which form a cube C = 10xx, where k = 2 and the start address s is 8. In the second, the output 9 (= s + 1) sends a copy to $2^1$ consecutive outputs forming a cube 110x and the output 10 (= s + 2) sends a copy to $2^0$ consecutive outputs forming a cube 1110. As a result, 7 consecutive outputs receive the copies of the multicast packet after two copying phases. In the routing phase, these outputs send the recycled copies to the 7 desired destinations respectively.
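The three phases of multicast algorithm II can be planned in the same style. In the Python sketch below, `prefix` again stands for the high-order bits $r_{n-1} \ldots r_k$ chosen for the first cube; the check that all d intermediate outputs fit below $2^n$ is an assumption added for the sketch, since the text does not say how the random choice is constrained. The names `set_bit_positions` and `three_phase_plan` are illustrative.

```python
def set_bit_positions(d):
    """Positions of the 1 bits of d, highest first: B_0 > B_1 > ... > B_{m-1}."""
    return [b for b in range(d.bit_length() - 1, -1, -1) if (d >> b) & 1]


def three_phase_plan(dests, n, prefix=0):
    """Plan for multicast algorithm II in an n-stage binary network.

    Returns (B, phase1, phase2, routing):
      phase1  - outputs reached by the source in the first copying phase
      phase2  - {output s + l: outputs it copies to} for 1 <= l <= m - 1
      routing - phase-3 map {intermediate output s + l: destination D_l}
    """
    dests = sorted(dests)
    d = len(dests)
    if d == 0:
        raise ValueError("at least one destination is required")
    B = set_bit_positions(d)
    k = B[0]
    s = prefix << k
    if s + d > 2 ** n:  # assumption: every intermediate output must exist
        raise ValueError("intermediate outputs do not fit in the network")
    phase1 = list(range(s, s + 2 ** k))
    phase2 = {}
    offset = 2 ** k  # running sum of 2**B_j for j < l
    for l in range(1, len(B)):
        start = s + offset
        phase2[s + l] = list(range(start, start + 2 ** B[l]))
        offset += 2 ** B[l]
    routing = {s + l: dests[l] for l in range(d)}
    return B, phase1, phase2, routing
```

For the seven destinations of Fig. 6 with `prefix=2` (so that s = 8), the plan copies to outputs 8 to 11 first, then output 9 copies to {12, 13} and output 10 copies to {14}, which matches the cubes 110x and 1110 of the worked example.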

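Theorem 2. Let d be represented as $d_{n-1} d_{n-2} \ldots d_0$ and $B$ ($= B_0 > B_1 > \cdots > B_{m-1}$) be a set of sorted numbers such that $d_{B_l}$ is 1, where $0 \le B_l < n$ and $0 \le l \le m-1$. Then $2^{B_0} \ge m - 1$.

Proof. Since $B_0$ is the position of the most significant 1 bit of d, at most $B_0 + 1$ bits of d are set, so $m - 1 \le B_0$. For all $B_0$, $0 \le B_0 \le n-1$, $2^{B_0} \ge B_0$ holds, and hence $2^{B_0} \ge m - 1$. □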
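The inequality of Theorem 2 can also be checked exhaustively for small fanouts. This Python sketch is only a sanity check, not a proof; the helper names `set_bit_positions` and `theorem2_holds` are hypothetical.

```python
def set_bit_positions(d):
    """Positions of the 1 bits of d, highest first: B_0 > B_1 > ... > B_{m-1}."""
    return [b for b in range(d.bit_length() - 1, -1, -1) if (d >> b) & 1]


def theorem2_holds(d):
    """True when 2**B_0 >= m - 1 for the fanout d >= 1."""
    B = set_bit_positions(d)
    return 2 ** B[0] >= len(B) - 1
```

The check matters for algorithm II because the outputs $s+1, \ldots, s+m-1$ that act as sources in the second copying phase must all lie among the $2^{B_0}$ outputs filled in the first phase.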

Theorem 3. In the three-phase multicast algorithm, blocking does not occur among any constituent copies.

Proof. In the first phase, it is trivial because the packet is copied from a single input to $2^{B_0}$ consecutive outputs. In the second phase, by contrast, some of these $2^{B_0}$ copies move from multiple inputs to the corresponding outputs through the MIN. However, the active inputs which receive the copies at the end of the first phase are concentrated on inlets $s+1, s+2, \ldots, s+m-1$ and the output addresses satisfy the monotonicity property, i.e. $C_{s+1} <_R C_{s+2} <_R \cdots <_R C_{s+m-1}$ are in ascending order by Definition 2, where output $s+l$ sends copies to $2^{B_l}$ consecutive outputs forming a cube $C_{s+l}$ for $1 \le l \le m-1$. Thus the constituent copies pass without blocking. In the third phase, the proof is similar to that of Theorem 1. □

Theorem 2 ensures that d consecutive outputs have their own copies of the multicast packet after two copying phases. Theorem 3 guarantees that a multicast packet which is destined for any arbitrary set of destinations is sent to the desired destinations in three passes across the MIN in multicast algorithm II.

3.3. Enhanced multicast algorithms

In the wrap-around MIN, the routing phase can use the same header encoding scheme that the copying phase employs. The number of intermediate outputs in multicast algorithms I and II is minimized by coalescing all destinations into a set of cubes.

Definition 3. Assume that C is represented as $c_{n-1} \ldots c_1 c_0$ in n bits, where $c_i$ is 0, 1, or x, $0 \le i \le n-1$. The binary relation $=_C$ is defined between two cubes C and C' as follows: $C =_C C'$ if and only if there exists a j such that $c_j \ne c'_j$ and $c_i = c'_i$ for all $0 \le i \ne j \le n-1$.

For algorithms I and II employing the cube encoding scheme, Procedure 1 describes coalescing a set of destinations into a set of cubes by Definition 3.

Procedure 1. Coalescing destinations into cubes:

Input: C' = {{D_0}, {D_1}, ..., {D_{d-1}}}, D_{l-1} < D_l, 0 < l ≤ d - 1.

Output: C = {C_0, C_1, ..., C_{f-1}}, C_{m-1} <_R C_m, 0 < m ≤ f - 1.

begin
  do
    m = 0; C = C'; l = 1;
    while (l ≤ |C| - 1) do
      if (|C_{l-1}| = |C_l| and C_{l-1} =_C C_l) then
        C'_m = C_{l-1} ∪ C_l; l = l + 2; m = m + 1;
      else
        C'_m = C_{l-1}; l = l + 1; m = m + 1;
      if (l = |C|) then C'_m = C_{l-1};
    endwhile
  until (|C| = |C'|)
end

Given that the destinations are {3,4,6,7,9,11,13} in Fig. 6, the set of destinations is coalesced into {{3},{4,6},{7},{9,11},{13}} by Procedure 1. According to Definition 3, 4 (= 0100) $=_C$ 6 (= 0110) and 9 (= 1001) $=_C$ 11 (= 1011); therefore {4,6} and {9,11} are represented as 01x0 and 10x1, respectively.
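A runnable sketch of the coalescing idea follows, in Python. It is written in the spirit of Procedure 1 rather than as a line-by-line transcription: cubes are held as strings over the characters 0, 1 and x, adjacent cubes of equal size that differ in exactly one position are merged, and passes repeat until nothing changes. The names `mergeable`, `merge` and `coalesce` are illustrative.

```python
def mergeable(a, b):
    """Equal-sized cubes that differ in exactly one position (Definition 3)."""
    if a.count("x") != b.count("x"):
        return False
    return sum(p != q for p, q in zip(a, b)) == 1


def merge(a, b):
    """Union of two mergeable cubes: the differing position becomes 'x'."""
    return "".join(p if p == q else "x" for p, q in zip(a, b))


def coalesce(dests, n):
    """Coalesce destination addresses into an ordered list of cubes."""
    cubes = [format(addr, "0{}b".format(n)) for addr in sorted(dests)]
    while True:
        merged = []
        l = 0
        while l < len(cubes):
            if l + 1 < len(cubes) and mergeable(cubes[l], cubes[l + 1]):
                merged.append(merge(cubes[l], cubes[l + 1]))
                l += 2
            else:
                merged.append(cubes[l])
                l += 1
        if len(merged) == len(cubes):  # a full pass merged nothing
            return merged
        cubes = merged
```

For the destinations {3,4,6,7,9,11,13} with n = 4 this returns the five cubes 0011, 01x0, 0111, 10x1 and 1101, so only five intermediate outputs are needed instead of seven.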

Fig. 7. An example of the routing phase in enhanced multicast algorithm II.

Fig. 7 shows an example of the routing phase in enhanced multicast algorithm II. In this algorithm, the number of intermediate outputs is minimized by Procedure 1.

Theorem 4. In enhanced multicast algorithm I, blocking does not occur among any constituent copies.

Proof. In the first phase, it is trivial. We therefore consider the second phase, where one or more constituent copies are passing through the MIN. The active inputs which received the copies in the copying phase are concentrated as $s, s+1, \ldots, s+d-1$ and the output addresses satisfy the monotonicity property, i.e. any two outputs $C_l$ and $C_{l'}$ satisfy $C_l <_R C_{l'}$ for $0 \le l < l' \le d-1$, where $C_l$ and $C_{l'}$ are in C. Since the conditions of Property 1 are satisfied, the constituent copies pass without blocking in enhanced algorithm I. □

Theorem 4 guarantees that a multicast packet that is destined for any arbitrary set of destinations is sent to the desired destinations in two or three passes across the MIN by enhanced multicast algorithms I and II, respectively.

3.4. Deadlock avoidance

These multicast algorithms suffer from potential deadlock due to multiple multicast packets in the unbuffered switch. In multicast switches using distributed control, a packet which is copied at stage i for a multicast packet does not know whether the other copies of the same multicast packet are forwarded or blocked at stage j (j > i) by other multicast packets. As a result, it is not guaranteed that all copies of a multicast packet arrive at the desired destinations. This is the situation of deadlock occurring among multiple multicast packets. Fig. 8 shows an example of a deadlock situation in a multicast algorithm given two multicast packets 4 → {0,2,4,6,8,10,12,14} and 14 → {5,7,13,15}. One packet with source 4 collides with another with source 14 at SE μ and at SE ν. As shown in Fig. 8, at SE μ the packet from source 4 is randomly selected to be broadcast to the two output ports and at SE ν the packet from source 14 is randomly selected to be broadcast to the two output ports. As a result, the packet from source 14 is blocked at SE μ and the packet from source 4 is blocked at SE ν, assuming a packet blocking policy at the SEs. Hence, a deadlock occurs and neither the packet from source 4 nor the packet from source 14 can go forward.

Fig. 8. An example of a deadlock situation.

In order to avoid deadlock, SEs must use some arbitration mechanism for multiple multicast packets. The most intuitive approach is a scheme in which a globally unique time is added to all packets at the source [10]. This scheme, however, is centralized, and it is difficult to maintain a globally unique time. We present a simple scheme, the upper input port first scheme. This arbitration method, which chooses the packet from the upper input port, is a distributed scheme.
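A minimal Python sketch of this arbitration at a single SE is given below. It assumes that "upper" means the smaller input-port label (ports are labeled from the top) and that a packet is blocked as a whole if any output port it requests has already been granted, which corresponds to the packet blocking policy mentioned above. The function name `arbitrate` and the request format are hypothetical.

```python
def arbitrate(requests):
    """Upper input port first arbitration at one switching element.

    requests maps an input-port label to the set of output ports that the
    packet on that input asks for (two ports for a broadcast).
    Returns (granted, blocked): the requests that go through and the list
    of input ports whose packets are blocked.
    """
    granted = {}
    taken = set()
    blocked = []
    for port in sorted(requests):  # smallest label = uppermost port
        wanted = set(requests[port])
        if wanted & taken:
            blocked.append(port)   # conflicts with a packet from an upper port
        else:
            granted[port] = wanted
            taken |= wanted
    return granted, blocked
```

Because the decision depends only on the position of the input port, every SE resolves a conflict between the same two multicast packets in favour of the same packet, which is the intuition behind the deadlock-freedom argument.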

Theorem 5. In multicast algorithms, the upper input port first scheme guarantees a deadlock-free transmission through the multicast switch.

Proof. In the two-phase multicast algorithm, all multicast packets simultaneously request output port(s) on one or more SEs at the same stage i, $0 \le i \le n-1$. Assume that the source address and the multicast routing header of a multicast packet are $s_{n-1} \ldots s_0$ and $C = c_{n-1} c_{n-2} \ldots c_0$. The address of the input port at stage $(n-1)$ into which this multicast packet enters is $s_{n-2} \ldots s_0 s_{n-1}$ by Definition 1. Using our notation described in Section 2.1, the label of the particular SE is then $s_{n-2} \ldots s_0$ and the specific input port of this SE is identified by the bit $s_{n-1}$. Similarly, the address of the input port(s) at stage i, $n-2 \ge i \ge 0$, into which the packet copies enter, is $c_{n-1} \ldots c_{i+1} s_{i-1} \ldots s_1 s_0 s_i$. Again using our notation described in Section 2.1, the label(s) of the specific SE(s) are then $c_{n-1} \ldots c_{i+1} s_{i-1} \ldots s_1 s_0$ and the specific input port(s) of these SE(s) are identified by the corresponding bit $s_i$. Thus, the position of the input port(s) at any stage i at which the multicast packet(s) arrive from a particular source depends only on the source address. Therefore the copies of a multicast packet that request output port(s) at one or more SE(s) at any stage have the same priority, and hence the upper input port first scheme renders multiple multicast connections deadlock-free. □

In the buffered MIN, the deadlock situation for multiple packets is avoided, because a packet blocked by other packets is buffered at the SE [18]. After the other packets release the port(s) on the SE, the blocked packet can acquire the port(s).

Fig. 9. Comparison of average number of internal links per fanout. (a) n = 6; (b) n = 7. 0, algorithm I; +, algorithm II. Horizontal axis of both panels: number of destinations (fanout).

4. Performance evaluation

We evaluate the performance of the proposed recursive multicast algorithms in terms of the number of recycling passes and the number of internal links used. An algorithm with the minimum number of recycling passes, which routes a multicast packet to an arbitrary set of destinations in a single phase, requires individual switch control and can be expensive [10,13].

4.1. Links used by the multicast algorithms

The number of internal links used by a multicast algorithm depends on the number of destinations (d) for a given multicast connection. A multicast routing in any phase is accomplished by selecting links of the MIN that form a binary tree with the source as the root and the d destinations as the leaves. The depth of the binary tree is n, as an n-stage MIN is used. The two proposed algorithms without the coalescing procedure use the same number of internal links in the routing phase, which equals dn; hence we consider only the number of internal links used in the copying phases.

To compute the number of links used by multicast algorithms I and II, let the binary representation of the integer d (the number of destinations) be $\sum_{i=0}^{n-1} d_i 2^i$. Assume that k is the maximum integer such that $2^k \le d$. In multicast algorithm I, the first phase routes a multicast packet to $2^{k+1}$ consecutive outputs forming a cube. The number of links used is

$$\sum_{i=0}^{k+1} 2^i + (n - (k+1)).$$

On the other hand, the source sends a packet to $2^k$ consecutive outputs forming a cube in the first phase of multicast algorithm II. Some of the $2^k$ outputs then send it to $d - 2^k$ consecutive outputs in the second phase. The total number of internal links used by the two copying phases is

Fig. 9(a) shows the normalized average number of internal links required for an arbitrary set of d destinations where n = 6. For an arbitrary set of d destinations, algorithm I requires more links than algorithm II for a given multicast connection. The average number of links used in algorithm II is twice the number of destinations for a multicast connection. The same result is shown in Fig. 9(b), where n = 7.
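The gap between the two algorithms comes from how many outputs each copy phase targets. A minimal sketch, assuming only the definitions given in the text (k is the largest integer with 2^k ≤ d; algorithm I copies to 2^(k+1) outputs in one phase; algorithm II copies to 2^k outputs and then to d - 2^k more), is shown below. The function name `copy_phase_targets` is hypothetical.

```python
def copy_phase_targets(d):
    """Return (k, sizes for algorithm I, sizes for algorithm II).

    k is the largest integer with 2**k <= d. Each list holds the number of
    consecutive outputs targeted in each copy phase, as described in the text.
    """
    if d < 1:
        raise ValueError("d must be a positive number of destinations")
    k = d.bit_length() - 1            # largest k such that 2**k <= d
    algorithm_1 = [2 ** (k + 1)]      # one copy phase, cube of 2^(k+1) outputs
    algorithm_2 = [2 ** k, d - 2 ** k]  # two copy phases
    return k, algorithm_1, algorithm_2


if __name__ == "__main__":
    for d in (5, 8, 13):
        print(d, copy_phase_targets(d))
```

For d = 5, for example, algorithm I targets a cube of 8 outputs while algorithm II targets 4 outputs and then 1 more, which illustrates why algorithm I tends to use more copy-phase links.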

4.2. Links used by the enhanced multicast algorithms

Fig. 10. Comparison of total number of internal links used per fanout, plotted against the number of destinations (fanout): (a) n = 6; (b) n = 7. Algorithms (solid lines) and enhanced algorithms (dashed lines) I (○) and II (+).

For a given multicast connection with d destinations, we now consider the total number of links used in the proposed enhanced algorithms. Assume that the number of outputs in the copying phase is minimized to f by Procedure 1 and that the average number of destinations in the f cubes is c after the coalescing procedure. Let the binary representations of the integers f and c be $\sum_{i=0}^{n-1} f_i 2^i$ and $\sum_{i=0}^{n-1} c_i 2^i$, respectively. Assume that l is the minimum integer such that $2^l \ge f$ and m is the minimum integer such that $2^m \ge c$. In multicast algorithm I with Procedure 1, the total number of internal links used by two phases is

r=u

Fig. 11. Comparison of total number of internal links used per fanout, plotted against the network size (n = log N): (a) p = 0.05; (b) p = 0.10; (c) p = 0.25. Algorithms compared: Park's algorithm, enhanced algorithm I (+), enhanced algorithm II, Chen's algorithm (×), and Raghavendra's algorithm (△).

In multicast algorithm II with Procedure 1, the total number of internal links used by three phases is

In Fig. 10, the total number of internal links used in the proposed multicast algorithms is plotted against the number of destinations. Algorithms I and II without the coalescing procedure require at least (n + 2) times as many internal links as there are destinations. In contrast, the number of links used in the enhanced algorithms with the coalescing procedure is independent of the number of stages (n) in the MIN. In terms of the number of links used, the performance of algorithms I and II is considerably improved by the coalescing procedure (Procedure 1).
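As a rough check of the (n + 2) factor, and assuming the copy phase uses about 2d links on average (the figure reported in Section 4.1 for algorithm II) while the routing phase uses dn links, the total for the algorithms without coalescing is approximately:

$$L_{\text{total}} \approx \underbrace{2d}_{\text{copy phase}} + \underbrace{dn}_{\text{routing phase}} = (n + 2)\,d .$$

The symbol $L_{\text{total}}$ is introduced here only for illustration. Algorithm I uses more copy-phase links than algorithm II, which is consistent with (n + 2)d acting as a lower bound.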

Fig. 11 compares the total number of links used in the proposed enhanced algorithms and in other algorithms against the size of the multistage network (n), where p is the ratio of the number of destinations to the N outputs. The proposed algorithms employing the cube encoding scheme require, on average, fewer internal links than the other algorithms. Of the other algorithms considered, that of Chen and Kumar [9] requires a large number of recycling passes, and that of Raghavendra et al. [13] incurs additional hardware cost because SEs must process two different header encoding schemes for multicast packets. The algorithm of Park et al. [12], which requires two recycling passes, uses a large number of links.

5. Conclusions

In this paper, we have considered multicast issues in the MIN for constructing the internal architecture of high-performance ATM switches. To support multicast connections efficiently, a novel approach based on the recursive scheme is proposed, employing cube encoding as one of the restricted address encoding schemes. The proposed algorithms prevent deadlock among multiple multicast connections in ATM switch architectures based on the unbuffered MIN. We have shown that the proposed algorithms perform well in terms of the number of recycling passes and the number of internal links used. In particular, a fixed number of recycling passes (two or three) is required, and the average number of links used to route a multicast packet to its destinations increases linearly with the number of destinations. The proposed algorithms can also be easily applied to buffered MIN-based ATM switches.

Acknowledgements

This work is supported by Samsung Group under the Fundamental Research on Multimedia Engines project.

References

[1] R. Handel, M.N. Huber, Integrated broadband networks: an introduction to ATM-based networks, Addison-Wesley, 1991.

[2] M.D. Prycker, Asynchronous transfer mode: solution for BISDN, Ellis Horwood Ltd., 1991.

[3] R. Rooholamini, V. Cherkassky, M. Garver, Finding the right ATM switch for the market, IEEE Computer (April 1994) 16-28.

[4] J.S. Turner, Design of a broadcast packet switching network, IEEE Transactions on Communications COM-36 (June 1988) 734-743.

[5] P. Giacomazzi, A. Pattavina, Providing multicast services by the ATM shuffleout switch, Proc. of IEEE Globecom, 1994, pp. 1483-1489.

[6] J.F.K. Buford, Multimedia Systems, Addison-Wesley, 1994.

[7] D.X. Chen, J.W. Mark, Multicasting in the SCOQ switch, Proc. of IEEE Infocom, April 1994, pp. 290-297.

[8] T.T. Lee, Nonblocking copy networks for multicast packet switching, IEEE Journal on Selected Areas in Communications 6 (1988) 1455-1461.

[9] X. Chen, V. Kumar, Multicast routing in self-routing multistage networks, Proc. of IEEE Infocom, April 1994, pp. 306-314.

[10] C.-M. Chiang, Multicasting in multistage interconnection networks, Ph.D. thesis, Department of Computer Science, Michigan State University, 1995.

[11] R. Cusani, F. Sestini, A recursive multistage structure for multicast ATM switching, Proc. of IEEE Infocom, April 1991, pp. 1289-1295.

[12] J. Park, D. Yoo, H. Yoon, S.R. Maeng, Efficient two-pass multicast algorithms in ATM switches based on multistage networks, Proc. of the International Conference on Systems Engineering, July 1996, pp. 55-60.

[13] C.S. Raghavendra, X. Chen, V.P. Kumar, A two phase multicast routing algorithm in self-routing multistage networks, Proc. of International Conference on Communications, June 1995, pp. 1612-1618.

[14] S.C. Liew, A general packet replication scheme for multicasting in interconnection networks, Proc. of IEEE Infocom, April 1995, pp. 394-401.

[15] L.R. Goke, G.J. Lipovski, Banyan networks for partitioning multiprocessor system, Proc. of the Annual Symposium on Computer Architectures, 1973, pp. 21-28.

[16] C.L. Wu, T.-Y. Feng, On a class of multistage interconnection networks, IEEE Transactions on Computers C-29 (August 1980) 694-702.

[17] K.E. Batcher, Sorting networks and their applications, AFIPS Proc. of the Spring Joint Computer Conference, 1968, pp. 307-314.

[18] D.K. Panda, R. Sivaram, Fast broadcast and multicast in wormhole multistage networks with multidestination worms, Technical Report OSU-CISRC-4/95-TR21, Department of Computer and Information Science, Ohio State University, 1995.

Jaehyung Park received his B.S. in computer science from Yonsei University, Korea, in 1991, and his M.S. in computer science from Korea Advanced Institute of Science and Technology (KAIST), Korea, in 1993. He is currently working toward his Ph.D. in the Department of Computer Science at KAIST. His major research areas are ATM switching, multicast switches, interconnection networks, and parallel processing.


Hyunsoo Yoon received his B.S. degree in electronics engineering from Seoul National University, Korea, in 1979, M.S. degree in computer science from Korea Advanced Institute of Science and Technology in 1981, and Ph.D. degree in computer and information science from Ohio State University, Columbus, Ohio, in 1988. From 1978 to 1980, he was with the Tongyang Broadcasting Company, Korea; from 1980 to 1984, with the Samsung Electronics Company, Korea; and from 1988 to 1989, with the AT&T Bell Labs. as a Member of Technical Staff. He joined the faculty of KAIST in 1989. His research interests include parallel computer architecture, parallel computing, interconnection networks, protocol engineering, and B-ISDN/ATM switching.
