ELSEVIER Computer Communications 21 (1998) 54-64
Cost-effective algorithms for multicast connection in ATM switches based on self-routing multistage networks
Jaehyung Park*, Hyunsoo Yoon
Department of Computer Science and Center for Artificial Intelligence Research, Korea Advanced Institute of Science and Technology, 373-1, Kusong-Dong, Yusung-Gu, Taejon 305-701, South Korea
Received 3 April 1996; revised 12 September 1996; accepted 19 September 1996
Abstract
In this paper, we discuss the multicast connection in the self-routing multistage interconnection network (MIN) for constructing the
internal architecture of asynchronous transfer mode (ATM) switches. Many applications of ATM switches require multicast connections in
addition to conventional point-to-point connections. Multicast connection, in which the same message is delivered from a source to an arbitrary number of destinations, is fundamental in supporting collective connection primitives including cable TV, teleconferencing, and video-on-demand services. This paper presents a novel approach to supporting multicast connection, on the basis of the recursive scheme that
recycles a multicast packet one or more times through the network to reach desired destinations. We also propose cost-effective multicast algorithms providing deadlock-freedom in MIN-based ATM switches. The proposed algorithms require a small and fixed number of recycling passes and a reasonable number of links used. The proposed algorithms can be easily applied to buffered MIN-based ATM switches. © 1998 Elsevier Science B.V.
Keywords: ATM switch architectures; Multistage networks; Multicast connection; Recursive scheme; Deadlock freedom
1. Introduction
The broadband integrated services digital network (B-ISDN) should support a large number of services such as video communication, hi-fi sound, graphics applications, and high-speed data communication [1,2]. These broadband services require a wide range of bandwidth, varying from an estimated bit rate of a few bit/s up to some hundreds of Mbit/s. To support these services efficiently, the asynchronous transfer mode (ATM) has been widely accepted as a well-suited transmission and switching principle [3]. The ATM switching technique allows the integration of various services with different bandwidth requirements in an efficient manner, since the information transfer is based on a short fixed-sized packet (cell) [4].
Most of the services provided in the last decades have typically been between two end users, that is, point-to-point services. As video services play a key role among the broadband services, it is important for B-ISDNs to provide point-to-multipoint (multicast) services as well [1,2,5]. The multicast service, in which the same message is delivered from a source to an arbitrary number of destinations, includes one-to-all (broadcast) services as a special case. For various broadband applications such as terrestrial broadcasting, teleconferencing, and video-on-demand services, the capability of handling multicast services is demanded in designing ATM switch architectures [1,2,6].
* Corresponding author.
0140-3664/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII S0140-3664(97)00065-0
Many of the ATM switch architectures which support broadband services efficiently employ multistage interconnection networks (MINs), either buffered or unbuffered, as a routing network. To provide the multicast capability, two alternative approaches are possible. The first is to add a copy network at the front of a routing network, which replicates the cell to obtain the number of copies requested by a given multicast connection. Those copies are then transmitted through the routing network to the desired destinations. The design of the copy network may be quite different from that of the routing network, and requires extra component complexity [4,7,8]. Instead of designing an additional copy network, the second approach is to exploit the inherent multicast capability of a routing network by recycling the cells at the outputs of the routing network to its inputs to generate extra copies. This approach is known as a recursive scheme, since it simulates the function of a copy network by using the routing network recursively. This approach has the advantage that it does not require extra hardware. The simplicity of the network control algorithm and hardware implementation makes this scheme more attractive.
Many multicast routing algorithms based on such a recursive scheme have been proposed [9-13]. The algorithm in [10] employs header encoding schemes based on a multi-address addressing scheme, so that a source node directly sends a multicast message to all destination nodes at one time. However, such a multi-address encoding scheme is not suitable for ATM switches handling fixed-sized cells, because it requires variable-sized headers. Other algorithms [9,11,13], which employ a restricted address encoding scheme and construct a fixed-sized multicast header, suffer from a large number of passes for one multicast connection and from deadlock, which radically degrades the switch performance for multiple multicast connections [14]. Also, the performance of these multicast algorithms is not analyzed for the case of several multicast connections, but is evaluated only in terms of the number of recycling passes for the case of one multicast connection.
In this paper, we describe simple header encoding schemes based on the restricted address encoding scheme, which constructs a short fixed-sized multicast header. We also propose a novel approach to support multicast connection in the wrap-around MIN-based ATM switch. This paper proposes cost-effective recursive algorithms which provide deadlock-free multicast connections in the unbuffered MIN by employing the restricted address encoding scheme. The proposed algorithms exploit the intrinsic nonblocking properties of the MIN. The multicast algorithms have been designed with the aim of minimizing the number of passes across the MIN and the number of internal links used. The proposed multicast algorithms can be easily applied to MIN-based ATM switches, such as the Phoenix switch, Adaptive Corporation ATMX, and Fore Systems ASX-100.
This paper is organized as follows. The next section describes the basic architecture, the intrinsic nonblocking property, the header encoding schemes, and related work. In Section 3, cost-effective multicast algorithms based on the recursive scheme are proposed, together with an approach to enhance their performance. We evaluate the performance of the proposed algorithms in comparison with others in Section 4. Section 5 concludes the paper.
2. System models and related work
This section describes the basic architecture, the nonblocking property of the banyan networks, and header encoding schemes. It also briefly reviews some algorithms on MIN-based ATM switches.
2.1. Basic architecture and nonblocking property
The banyan network is a class of MINs with the property that there is a unique path between any pair of source and destination [15]. The banyan network has its self-routing property due to its unique path property. Many of the known MINs, such as the omega network, multistage cube network, flip network, baseline network, and data manipulator, belong to the class of banyan networks and have been shown to be topologically equivalent [16]. For this reason, the banyan network is considered throughout this paper.
The banyan network is an $N \times N$ interconnection network with $n = \log_k N$ stages. Each stage contains $N/k$ $(k \times k)$ switching elements (SEs). The stages are labeled in sequence from $(n-1)$ to 0, with $(n-1)$ for the first stage. The $N$ input/output ports at each stage are labeled using $n$ $k$-ary digits $(a_{n-1} a_{n-2} \ldots a_0)$, starting from the top within each stage. The SEs at each stage are labeled using $(n-1)$ $k$-ary digits $(a_{n-1} a_{n-2} \ldots a_1)$, starting from the top. When we refer to one of the $k$ input ports of a specific SE, we use the $k$-ary digit $a_0$ to represent that input port for simplicity. We consider butterfly interconnection patterns between stages, and a perfect shuffle interconnection pattern between the input controllers and stage $(n-1)$. The formal definitions of these connection patterns are as follows:
Definition 1. The interconnection function $IC_i$ for the output ports at stage $i$ in the banyan network under consideration, for $n-1 \ge i \ge 0$, and that for the outputs of the input controllers, $IC_n$, are respectively defined by
$$IC_i(a_{n-1} \ldots a_{i+1} a_i a_{i-1} \ldots a_0) = (a_{n-1} \ldots a_{i+1} a_0 a_{i-1} \ldots a_1 a_i)$$
$$IC_n(a_{n-1} a_{n-2} \ldots a_1 a_0) = (a_{n-2} a_{n-3} \ldots a_1 a_0 a_{n-1})$$
where the right-hand-side labels are those of the input links of the next stage and $0 \le a_i \le k-1$.
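The two connection patterns of Definition 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: labels are tuples of digits written most significant first, the function names `shuffle`, `butterfly` and `unicast_path` are hypothetical, and the SE is assumed to forward a unicast packet to the output port named by the destination digit of its stage.

```python
def shuffle(label):
    """Perfect shuffle: (a_{n-1}, a_{n-2}, ..., a_0) -> (a_{n-2}, ..., a_0, a_{n-1})."""
    return label[1:] + label[:1]


def butterfly(label, i):
    """Butterfly after stage i: exchange digit a_i with digit a_0."""
    n = len(label)
    out = list(label)
    # a_j is stored at tuple index n - 1 - j, so a_0 is the last element
    out[n - 1 - i], out[n - 1] = out[n - 1], out[n - 1 - i]
    return tuple(out)


def unicast_path(source, dest):
    """Trace one packet; returns (input-port label at each stage, final output label).

    Assumption: the SE at stage i sends the packet to the output port
    selected by destination digit c_i (self-routing on the destination).
    """
    n = len(source)
    port = shuffle(source)                 # input controllers -> stage n-1
    trace = []
    for stage in range(n - 1, -1, -1):
        trace.append(port)
        out_port = port[:-1] + (dest[n - 1 - stage],)   # same SE, port c_stage
        port = butterfly(out_port, stage)               # link to the next stage
    return trace, port


trace, final = unicast_path((0, 1, 0, 1), (1, 1, 0, 0))   # source 5, destination 12
print(trace, final)
```

Tracing every source and destination pair of a small network is a quick way to confirm that the reconstructed patterns deliver each packet to the output named by its destination address.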
A broadcast banyan network is a banyan network with SEs which are capable of packet replication. A packet arriving at a broadcast SE can be either routed to one of the output ports, or replicated and sent out on both ports. Fig. 1 illustrates the basic ATM switch architecture constructed from a $\log_k N$-stage broadcast banyan network with wrap-around links. In a broadcast banyan network with wrap-around links, output controllers are connected to input controllers through external links in order to recycle packets.
The banyan network itself is a blocking network: two packets with different destinations may be routed through the same internal link at the same time. However, it is known that if the incoming packets with distinct destinations are arranged in ascending or descending order and the active sources are all connected to consecutive input links, then the banyan network becomes nonblocking [17]. The nonblocking condition is formally described as follows.
Definition 2. The binary relation $<_R$ is defined between two sets $Y$ and $Y'$ of destinations as follows: $Y <_R Y'$ if and only if $y < y'$ for all $y \in Y$ and $y' \in Y'$.
Fig. 1. The basic switch architecture based on the broadcast banyan network, where N = 8 and k = 2. Input controllers and output controllers are labeled 000 to 111.
Property 1. A broadcast banyan network is nonblocking if the active sources $x_1, \ldots, x_k$ and the corresponding sets of destinations $Y_1, \ldots, Y_k$ satisfy the following.
1. Monotone: $Y_1 <_R Y_2 <_R \cdots <_R Y_k$ or $Y_k <_R \cdots <_R Y_2 <_R Y_1$.
2. Concentration: any source between two active sources is also active. That is, $x_l \le w \le x_m$ implies source $w$ is active, where $1 \le l < m \le k$.
A proof of the sufficient condition is shown in [8]. An example with active sources $x_1 = 3$, $x_2 = 4$, $x_3 = 5$, and corresponding destinations $Y_1 = \{1,3\}$, $Y_2 = \{5,6,7\}$, $Y_3 = \{8,10,13,14\}$ is depicted in Fig. 2. By Definition 2, $Y_1 <_R Y_2 <_R Y_3$.
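The sufficient condition of Property 1 is easy to check mechanically. The following Python sketch tests the monotone and concentration conditions for a set of connections; the function names and the dictionary representation (source mapped to its destination set) are assumptions made for illustration only.

```python
def precedes(y, y_prime):
    """Definition 2: Y <_R Y' iff every element of Y is below every element of Y'."""
    return max(y) < min(y_prime)


def satisfies_property_1(connections):
    """connections maps each active source to its non-empty set of destinations.

    Returns True when both the concentration and the monotone conditions
    hold. This is only the sufficient condition; False does not prove blocking.
    """
    sources = sorted(connections)
    # Concentration: the active sources occupy consecutive inputs
    concentrated = sources == list(range(sources[0], sources[-1] + 1))
    sets = [connections[s] for s in sources]
    pairs = list(zip(sets, sets[1:]))
    ascending = all(precedes(a, b) for a, b in pairs)
    descending = all(precedes(b, a) for a, b in pairs)
    return concentrated and (ascending or descending)


# The example of Fig. 2
fig2 = {3: {1, 3}, 4: {5, 6, 7}, 5: {8, 10, 13, 14}}
print(satisfies_property_1(fig2))
```

Checking only adjacent pairs is enough because the relation of Definition 2 is transitive on non-empty sets.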
2.2. Header encoding schemes
The restricted address encoding scheme constructs a
multicast routing header from reachable destinations which are restricted into a single cube or a single region in the MINs.
As one of the restricted address encoding schemes, a cube encoding specifies arbitrary destinations forming a single cube C. The multicast routing header for the cube encoding scheme is specified by $\{R, M\}$, where $R = r_{n-1} \ldots r_1 r_0$ contains the routing information and $M = m_{n-1} \ldots m_1 m_0$ contains the multicast information [9,11]. To handle the multicast header $\{R, M\}$, an SE at stage $i$ ($0 \le i \le n-1$) examines $r_i$ and $m_i$. If $m_i$ is 0, normal unicast routing is performed according to $r_i$. If $m_i$ is 1, $r_i$ is ignored and the broadcast is performed. Fig. 3 shows routing paths for the destinations which form a cube $C = 1x0x$; correspondingly, $M$ is 0101 and $R$ can be any one of the destination addresses belonging to the cube $C$.
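The per-stage rule for the cube header can be illustrated with a short Python sketch that expands a header into the destinations it reaches in a binary (k = 2) network. The function name `cube_destinations` and the use of plain integers for R and M are assumptions for illustration, not part of the original scheme description.

```python
def cube_destinations(r, m, n):
    """Expand a cube header {R, M} of n bits into the sorted destination list.

    Stage i looks at bits r_i and m_i: m_i = 0 means unicast on r_i,
    m_i = 1 means broadcast to both outputs (r_i is ignored).
    """
    dests = [0]
    for i in range(n - 1, -1, -1):           # stages n-1 down to 0
        if (m >> i) & 1:
            # broadcast: every partial address continues with bit 0 and bit 1
            dests = [d | (b << i) for d in dests for b in (0, 1)]
        else:
            bit = (r >> i) & 1
            dests = [d | (bit << i) for d in dests]
    return sorted(dests)


# Cube C = 1x0x: M = 0101, R may be any member of the cube
print(cube_destinations(0b1000, 0b0101, 4))
```

Because the broadcast bits mask the corresponding routing bits, any member of the cube used as R yields the same destination set.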
Fig. 2. An example of nonblocking connection in broadcast banyan networks.
Another restricted address encoding scheme is a region encoding scheme which specifies arbitrary consecutive destinations forming a single region [8,13]. The multicast routing header of the region encoding scheme indicates the minimum and maximum addresses of the consecutive destinations. Fig. 4 illustrates an example of routing paths through the MIN-based switch for a multicast packet with a routing header containing the address interval specified by 0100 and 1000.
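One way an SE could process a region header is interval splitting, in which the header is rewritten whenever the packet is replicated. The Python sketch below is an assumption made for illustration (the text only states that the header carries the minimum and maximum addresses); the function name `region_destinations` is hypothetical.

```python
def region_destinations(lo, hi, n):
    """Expand a region header (lo, hi) of n-bit addresses, one stage per bit.

    Assumed rule at stage i: if bit i of lo and hi agree, forward the
    header unchanged; otherwise replicate it and split the interval.
    """
    headers = [(lo, hi)]
    for i in range(n - 1, -1, -1):
        nxt = []
        for mn, mx in headers:
            if ((mn >> i) & 1) == ((mx >> i) & 1):
                nxt.append((mn, mx))                   # unicast on this bit
            else:
                prefix = (mn >> (i + 1)) << (i + 1)    # bits above i are shared
                # upper output: from mn up to prefix,0,1...1
                nxt.append((mn, prefix | ((1 << i) - 1)))
                # lower output: from prefix,1,0...0 up to mx
                nxt.append((prefix | (1 << i), mx))
        headers = nxt
    return [mn for mn, _ in headers]     # after stage 0 every interval is a point


# The interval of Fig. 4: 0100 to 1000
print(region_destinations(0b0100, 0b1000, 4))
```

The need to rewrite the minimum or maximum field on every split is what distinguishes this scheme from the cube encoding, whose header passes through the SEs unmodified.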
The size of the multicast routing header in both restricted address encoding schemes is fixed at 2n bits. While an SE using the region encoding scheme must have the capability of header modification, an SE using the cube encoding scheme need not. Hence, the SE using the region encoding scheme incurs a higher hardware cost than one using the cube encoding scheme.
Fig. 3. An example of self-routing based on the cube encoding scheme.
Fig. 4. An example of self-routing based on the region encoding scheme.
2.3. Related work
An algorithm that considers the blocking effect among the constituent copies of one multicast packet at any pass is proposed in [9]; it uses the cube encoding scheme. This multicast algorithm checks whether any two constituent copies of one multicast packet can be routed without conflict. Since any two constituent copies that would conflict are not sent in the same pass, the algorithm prevents blocking. The algorithm generates a multicast tree which models the sequence of recycling passes through the network for routing a multicast packet to all destinations. In the multicast tree, a parent node sends a copy with one multicast routing header to its children. Fig. 5 shows the generated nonblocking multicast tree and the multicast routing example, where source 5 sends a packet to destinations 0, 1, 3, 7, 10, 12 and 15. However, the algorithm has $O(d^2)$ time complexity to construct the nonblocking multicast tree for given d destinations of one multicast packet. This algorithm also needs d passes in the worst case to deliver the packet to d arbitrary destinations.
Raghavendra et al. [13] proposed a two-phase multicast algorithm, which guarantees nonblocking between any two constituent copies in the same phase. Because outputs which are not included in the multicast destinations can also receive and send a copy, this algorithm has the advantage that it can easily minimize the number of passes for one multicast packet across the MIN. This algorithm constructs a nonblocking multicast tree as in [9]. However, the region encoding scheme is used in the first phase, while the cube encoding scheme is used in the second phase. Since the SE structure for handling the multicast routing header in the second phase differs from that in the first phase, additional hardware is needed to process the two encoding schemes.
These proposed multicast algorithms require excessive network resources such as very complex SEs and/or a large number of internal links. Furthermore, they do not consider deadlock occurring among multiple multicast packets [14].
Fig. 5. A multicast routing example.
Fig. 6. An example of the routing phase in multicast algorithm I.
3. Multicast algorithms in MIN-based ATM switches
In this section, we propose recursive multicast algorithms in wrap-around MIN-based ATM switches which employ the cube encoding scheme. With such a header encoding scheme, an arbitrary set of multicast destinations cannot, in general, be specified by a single multicast header. Our approach to routing a multicast packet is to use the routing in the MIN to make copies and to recycle these copies so that they are routed to the desired destinations in subsequent phases. The proposed algorithms avoid the blocking effect by exploiting the intrinsic nonblocking property of the MIN as described in Section 2.1. This section also considers the occurrence of deadlock among multiple multicast packets and presents an approach to deadlock avoidance.
3.1. Two-phase multicast algorithm
This algorithm routes a multicast packet to arbitrary d destinations in two phases, a copying phase and a routing phase. During the copying phase, a source sends a multicast packet to $2^k$ consecutive outputs, where k is the minimum integer such that $d \le 2^k$. The multicast routing tag is constructed using the cube encoding scheme. In the second phase, the recycled packets from the first d of these consecutive outputs are sent to the desired destinations. The two-phase multicast algorithm with the cube encoding is described as follows.
3.1.1. Multicast algorithm I
Assume that the d arbitrary destinations for a multicast packet are sorted in ascending order as $D_0, D_1, \ldots, D_{d-1}$.
Phase 1. Copy from the source to $2^k$ consecutive outputs which form a single cube C through the wrap-around MIN. The cube C which consists of $2^k$ consecutive outputs is represented as $r_{n-1} \ldots r_k \underbrace{x x \ldots x}_{k}$. Then, the $2^k$ consecutive outputs are addressed as $s, s+1, \ldots, s+2^k-1$, where $r_i$, $k \le i \le n-1$, is randomly selected and the start address s is $r_{n-1} \ldots r_k \underbrace{0 0 \ldots 0}_{k}$. The cube encoding scheme is used to copy the packet to $2^k$ consecutive outputs.
Phase 2. Route the recycled copy from the output $(s + l)$ to the destination $D_l$, where $0 \le l \le d-1$. The routing header of the packet from input $(s + l)$ contains $D_l$. The remaining outputs $(s + l)$, $d \le l \le 2^k - 1$, do nothing.
In Fig. 6, an example of the second phase is shown, where source 5 sends a multicast packet to destinations {3,4,6,7,9,11,13}. During the copying phase, source 5 sends a copy to $2^3$ outputs {8,9,10,11,12,13,14,15} which form a cube C = 1xxx, where k = 3 and the start address s is 8. In the second phase, the outputs 8, 9, 10, 11, 12, 13, and 14 send their own recycled copies of the multicast packet while output 15 does not.
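The bookkeeping of multicast algorithm I can be sketched in Python as follows. This is a minimal planning sketch, not a switch simulation: the function name `plan_algorithm_one` is hypothetical, and the randomly selected high-order bits are passed in as the assumed parameter `prefix` so that the result is reproducible.

```python
def plan_algorithm_one(destinations, n, prefix=0):
    """Plan the two phases for a set of destinations in a 2^n-port switch.

    prefix stands for the freely chosen high-order bits r_{n-1}...r_k
    (an assumption of this sketch; the text selects them at random).
    Returns (k, s, cube, routing, idle).
    """
    dests = sorted(destinations)
    d = len(dests)
    k = (d - 1).bit_length()               # minimum k with d <= 2**k
    if k > n or prefix >= 2 ** (n - k):
        raise ValueError("destinations or prefix do not fit in n bits")
    s = prefix << k                         # start address r_{n-1}...r_k 0...0
    high = format(prefix, f"0{n - k}b") if k < n else ""
    cube = high + "x" * k                   # phase 1 cube header
    # Phase 2: output s + l forwards its recycled copy to destination D_l
    routing = {s + l: dests[l] for l in range(d)}
    idle = list(range(s + d, s + 2 ** k))   # outputs that received a copy but do nothing
    return k, s, cube, routing, idle


k, s, cube, routing, idle = plan_algorithm_one([3, 4, 6, 7, 9, 11, 13], n=4, prefix=1)
print(k, s, cube, routing, idle)
```

With `prefix=1` the call reproduces the example of Fig. 6: the copying phase fills the cube 1xxx starting at address 8, and output 15 stays idle in the routing phase.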
Theorem 1. In the two-phase multicast algorithm, blocking does not occur among any constituent copies.
Proof. The first phase is trivially nonblocking because the packet is copied from a single input to $2^k$ consecutive outputs. In the second phase, by contrast, the copies move from multiple inputs to the corresponding destinations through the MIN. However, the active inputs which received the copies at the end of the first phase are concentrated on inlets $s, s+1, \ldots, s+d-1$, and the output addresses satisfy the monotonicity property, i.e. $D_0, D_1, \ldots, D_{d-1}$ are in ascending order. Since the conditions of Property 1 are satisfied, the constituent copies pass without blocking in multicast algorithm I. □
Theorem 1 guarantees that a multicast packet destined for an arbitrary set of destinations is sent to the desired destinations in two phases across the MIN.
3.2. Three-phase multicast algorithm
In order to reduce the excessive use of links and outputs in algorithm I, this multicast algorithm routes a packet to arbitrary d destinations in three phases: two copying phases and one routing phase. The first copying phase routes the packet to $2^k$ outputs, and the second routes it to the $(d - 2^k)$ remaining outputs, where k is the maximum integer such that $d \ge 2^k$.
3.2.1. Multicast algorithm II
Assume that the d multicast destinations are sorted in ascending order as $D_0, D_1, \ldots, D_{d-1}$ and d is represented as $d_{n-1} d_{n-2} \ldots d_0$. Let $B$ ($= B_0 > B_1 > \cdots > B_{m-1}$) be a set of sorted numbers such that $d_{B_l} = 1$, where $0 \le B_l < n$ and $0 \le l \le m-1$.
Phase 1. Copy from the source to $2^k$ ($= 2^{B_0}$) consecutive outputs which form a single cube C through the wrap-around MIN. The cube C which consists of $2^k$ consecutive outputs is represented as $r_{n-1} \ldots r_k \underbrace{x x \ldots x}_{k}$. Then, the $2^k$ consecutive outputs are addressed as $s, s+1, \ldots, s+2^k-1$, where $r_i$, $k \le i \le n-1$, is randomly selected and the start address s is $r_{n-1} \ldots r_k \underbrace{0 0 \ldots 0}_{k}$. The cube encoding scheme is used to copy the packet to $2^k$ consecutive outputs.
Phase 2. Copy from the output $s + l$ to $2^{B_l}$ consecutive outputs, where $1 \le l \le m-1$. The $2^{B_l}$ outputs are addressed as $s + \sum_{j=0}^{l-1} 2^{B_j}$, $s + \sum_{j=0}^{l-1} 2^{B_j} + 1$, ..., $s + \sum_{j=0}^{l-1} 2^{B_j} + 2^{B_l} - 1$. The remaining outputs $s + l$ do nothing, where $l = 0$ or $m-1 < l \le 2^k - 1$. The cube encoding scheme is also used to copy the packet to the $(d - 2^k)$ consecutive outputs.
Phase 3. Route the recycled copy from the output $s + l$ to the destination $D_l$, where $0 \le l \le d-1$. The routing header of the packet from input $(s + l)$ contains $D_l$.
For the example in Fig. 6, we consider multicast algorithm II. Then d is represented as 0111 and B = (2,1,0). In the first copying phase, source 5 sends a copy to $2^2$ outputs {8,9,10,11} which form a cube C = 10xx, where k = 2 and the start address s is 8. In the second copying phase, the output 9 (= s + 1) sends a copy to $2^1$ consecutive outputs forming a cube 110x and the output 10 (= s + 2) sends a copy to $2^0$ consecutive outputs forming a cube 1110. As a result, 7 consecutive outputs receive the copies of the multicast packet after two copying phases. In the routing phase, these outputs send the recycled copies to the 7 desired destinations, respectively.
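The two copying phases of multicast algorithm II follow directly from the binary representation of d. The Python sketch below computes which outputs hold a copy after each copying phase; the function name `plan_copy_phases` is hypothetical, and the start address s is taken as an input assumed to be a multiple of $2^{B_0}$.

```python
def plan_copy_phases(d, s, n):
    """Return (B, phase1, phase2) for d destinations and start address s.

    B lists the positions of the 1 bits of d in descending order.
    phase1 is the list of outputs reached by the source; phase2 maps the
    copying output s + l to the block of 2**B[l] outputs it fills.
    """
    B = [b for b in range(n - 1, -1, -1) if (d >> b) & 1]
    k = B[0]
    phase1 = list(range(s, s + 2 ** k))
    phase2 = {}
    offset = 2 ** k                        # running sum of 2**B[j] for j < l
    for l in range(1, len(B)):
        start = s + offset
        phase2[s + l] = list(range(start, start + 2 ** B[l]))
        offset += 2 ** B[l]
    return B, phase1, phase2


B, phase1, phase2 = plan_copy_phases(d=7, s=8, n=4)
print(B, phase1, phase2)
```

For d = 7 and s = 8 this gives the blocks of the worked example: output 9 fills {12, 13} and output 10 fills {14}, so the d consecutive outputs 8 to 14 hold copies before the routing phase.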
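Theorem 2. Let d be represented as $d_{n-1} d_{n-2} \ldots d_0$ and $B$ ($= B_0 > B_1 > \cdots > B_{m-1}$) be a set of sorted numbers such that $d_{B_l}$ is 1, where $0 \le B_l < n$ and $0 \le l \le m-1$. Then $2^{B_0} \ge m - 1$.
Proof. The elements of B are distinct integers between 0 and $B_0$, so the maximum value of m is $B_0 + 1$, that is, $m - 1 \le B_0$. For all $B_0$, $0 \le B_0 \le n-1$, $2^{B_0} \ge B_0$ holds. □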
Theorem 3. In the three-phase multicast algorithm, blocking does not occur among any constituent copies.
Proof. The first phase is trivially nonblocking because the packet is copied from a single input to $2^{B_0}$ consecutive outputs. In the second phase, by contrast, some of these $2^{B_0}$ copies move from multiple inputs to the corresponding outputs through the MIN. However, the active inputs which received the copies at the end of the first phase are concentrated on inlets $s+1, s+2, \ldots, s+m-1$, and the output addresses satisfy the monotonicity property, i.e. $C_{s+1} <_R C_{s+2} <_R \cdots <_R C_{s+m-1}$ are in ascending order by Definition 2, where output $s + l$ sends copies to $2^{B_l}$ consecutive outputs forming a cube $C_{s+l}$ for $1 \le l \le m-1$. Thus the constituent copies pass without blocking. For the third phase, the proof is similar to that of Theorem 1. □
Theorem 2 ensures that d consecutive outputs have their own copies of the multicast packet after two copying phases. Theorem 3 guarantees that a multicast packet destined for an arbitrary set of destinations is sent to the desired destinations in three passes across the MIN in multicast algorithm II.
3.3. Enhanced multicast algorithms
In the wrap-around MIN, the routing phase can use the same header encoding scheme that the copying phase employs. The number of intermediate outputs in multicast algorithms I and II is minimized by coalescing all destinations into a set of cubes.
Definition 3. Assume that C is represented as $c_{n-1} \ldots c_1 c_0$ in n bits, where $c_i$ is 0, 1, or x, $0 \le i \le n-1$. The binary relation $=_C$ is defined between two cubes $C$ and $C'$ as follows: $C =_C C'$ if and only if there exists a $j$ such that $c_j \ne c'_j$ and $c_i = c'_i$ for all $0 \le i \ne j \le n-1$.
For algorithms I and II employing the cube encoding scheme, Procedure 1 describes coalescing a set of destinations into a set of cubes by Definition 3.
Procedure 1. Coalescing destinations into cubes:
Input: $C' = \{\{D_0\}, \{D_1\}, \ldots, \{D_{d-1}\}\}$, $D_{l-1} < D_l$, $0 < l \le d-1$.
Output: $C = \{C_0, C_1, \ldots, C_{f-1}\}$, $C_{m-1} <_R C_m$, $0 < m \le f-1$.
begin
  do
    $m = 0$; $C = C'$; $l = 1$;
    while ($l \le |C| - 1$) do
      if ($|C_{l-1}| = |C_l|$ and $C_{l-1} =_C C_l$) then
        $C'_m = C_{l-1} \cup C_l$; $l = l + 2$; $m = m + 1$;
      else
        $C'_m = C_{l-1}$; $l = l + 1$; $m = m + 1$;
      if ($l = |C|$) then $C'_m = C_{l-1}$;
    endwhile
  until ($|C| = |C'|$)
end
Given that the destinations are {3,4,6,7,9,11,13} in Fig. 6, the set of destinations is coalesced into {{3},{4,6},{7},{9,11},{13}} by Procedure 1. According to Definition 3, 4 (= 0100) $=_C$ 6 (= 0110) and 9 (= 1001) $=_C$ 11 (= 1011); therefore {4,6} and {9,11} are represented as 01x0 and 10x1, respectively.
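The coalescing of Procedure 1 can be sketched in Python with cubes written as strings over 0, 1 and x. The helper names are hypothetical, and the inner loop is written with a simple index scan that is intended to behave like the pseudocode (pair neighbours from left to right, repeat until no pass changes the list).

```python
def to_cube(address, n):
    """An n-bit address as a cube string without x symbols."""
    return format(address, f"0{n}b")


def adjacent(c1, c2):
    """Equal size and the relation of Definition 3: differ in exactly one position."""
    same_size = c1.count("x") == c2.count("x")
    differing = sum(a != b for a, b in zip(c1, c2))
    return same_size and differing == 1


def merge(c1, c2):
    """Union of two adjacent cubes: the differing position becomes x."""
    return "".join(a if a == b else "x" for a, b in zip(c1, c2))


def coalesce(destinations, n):
    """Repeat left-to-right pairing passes until the number of cubes stops shrinking."""
    cubes = [to_cube(d, n) for d in sorted(destinations)]
    while True:
        merged, i = [], 0
        while i < len(cubes):
            if i + 1 < len(cubes) and adjacent(cubes[i], cubes[i + 1]):
                merged.append(merge(cubes[i], cubes[i + 1]))
                i += 2
            else:
                merged.append(cubes[i])
                i += 1
        if len(merged) == len(cubes):
            return merged
        cubes = merged


print(coalesce([3, 4, 6, 7, 9, 11, 13], 4))
```

On the destinations of Fig. 6 the first pass merges 4 with 6 and 9 with 11, and the second pass finds nothing further to merge, leaving five cubes in ascending order.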
Fig. 7. An example of the routing phase in enhanced multicast algorithm II.
Fig. 7 shows an example of the routing phase in enhanced algorithm II. In this algorithm, the number of intermediate outputs is minimized by Procedure 1.
Theorem 4. In enhanced multicast algorithm I, blocking does not occur among any constituent copies.
Proof. The first phase is trivially nonblocking. Consider the second phase, where one or more constituent copies pass through the MIN. The active inputs which received the copies in the copying phase are concentrated on inlets $s, s+1, \ldots, s+d-1$, and the output addresses satisfy the monotonicity property, i.e. any two cubes $C_l$ and $C_{l'}$ in $C$ satisfy $C_l <_R C_{l'}$ for $0 \le l < l' \le d-1$. Since the conditions of Property 1 are satisfied, the constituent copies pass without blocking in enhanced algorithm I. □
Theorem 4 guarantees that a multicast packet destined for an arbitrary set of destinations is sent to the desired destinations in two or three passes across the MIN by enhanced multicast algorithms I and II, respectively.
3.4. Deadlock avoidance
These multicast algorithms suffer from potential deadlock due to multiple multicast packets in the unbuffered switch. In multicast switches using distributed control, one packet which is copied at stage i for a multicast packet does not know whether the other copies of that multicast packet are forwarded or blocked at stage j (j > i) due to other multicast packets. As a result, it is not guaranteed that all copies of a multicast packet arrive at the desired destinations. This is the situation of deadlock occurring among multiple multicast packets. Fig. 8 shows an example of a deadlock situation in a multicast algorithm given two multicast packets, 4 → {0,2,4,6,8,10,12,14} and 14 → {5,7,13,15}. The packet from source 4 collides with the packet from source 14 at SE μ and at SE ν. As shown in Fig. 8, at SE μ the packet from source 4 is randomly selected to be broadcast to the two output ports, and at SE ν the packet from source 14 is randomly selected to be broadcast to the two output ports. As a result, the packet from source 14 is blocked at SE μ and the packet from source 4 is blocked at SE ν, assuming a packet blocking policy at the SEs. Hence, a deadlock occurs and neither the packet from source 4 nor the packet from source 14 can go forward.
Fig. 8. An example of a deadlock situation.
In order to avoid deadlock, SEs must use some arbitration mechanism for multiple multicast packets. The most intuitive approach is a scheme in which a globally unique time is added to all packets at the source [10]. This scheme, however, is centralized, and it is difficult to maintain a globally unique time. We present a simple scheme, the upper input port first scheme. This arbitration method, which chooses the packet from the upper input port, is a distributed scheme.
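A single SE applying the upper input port first rule might be sketched in Python as below. Several details are assumptions made for illustration: the upper input port is taken to be the one with the smaller index, a packet is granted either all of its requested output ports or none (a blocking policy), and the name `arbitrate` is hypothetical.

```python
def arbitrate(requests):
    """Resolve contention inside one SE.

    requests maps an input port index to the set of output ports its
    packet asks for (two ports for a broadcast). Ports are served from the
    upper (lowest index) input downward; a packet that needs an output
    already claimed is blocked as a whole.
    """
    taken = set()
    grants = {}
    for port in sorted(requests):          # upper input port first
        wanted = requests[port]
        if wanted & taken:
            grants[port] = False           # loses to a packet from an upper port
        else:
            grants[port] = True
            taken |= wanted
    return grants


# Two broadcasts colliding in a 2 x 2 SE: the upper input always wins
print(arbitrate({0: {0, 1}, 1: {0, 1}}))
```

Unlike random selection, the decision depends only on the input port position, so it needs no information shared between SEs.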
Theorem 5. In the multicast algorithms, the upper input port first scheme guarantees a deadlock-free transmission through the multicast switch.
Proof. In the two-phase multicast algorithm, all multicast packets simultaneously request output port(s) on one or more SEs at the same stage i, $0 \le i \le n-1$. Assume that the source address and the multicast routing header of a multicast packet are $s_{n-1} \ldots s_0$ and $C = c_{n-1} c_{n-2} \ldots c_0$. The address of the input port at stage $(n-1)$ into which this multicast packet enters is $s_{n-2} \ldots s_0 s_{n-1}$ by Definition 1. Using our notation described in Section 2.1, the label of the particular SE is then $s_{n-2} \ldots s_0$ and the specific input port of this SE is identified by the bit $s_{n-1}$. Similarly, the address of the input port(s) at stage i, $n-2 \ge i \ge 0$, into which the packet copies enter, is $c_{n-1} \ldots c_{i+1} s_{i-1} \ldots s_1 s_0 s_i$. Again using our notation described in Section 2.1, the label(s) of the specific SE(s) are then $c_{n-1} \ldots c_{i+1} s_{i-1} \ldots s_1 s_0$ and the specific input port(s) of these SE(s) are identified by the corresponding bit $s_i$. Thus, the position of the input port(s) at any stage i at which the multicast packet(s) arrive from a particular source depends only on the source address. Therefore the copies of a multicast packet that request output port(s) at one or more SE(s) at any stage have the same priority, and hence the upper input port first scheme renders multiple multicast connections deadlock-free. □
In the buffered MIN, the deadlock situation for multiple packets is avoided, because a packet blocked by other packets is buffered at the SE [18]. After the other packets release the port(s) on the SE, the blocked packet can acquire the port(s).
Fig. 9. Comparison of average number of internal links per fanout, plotted against the number of destinations (fanout). (a) n = 6; (b) n = 7. ○, algorithm I; +, algorithm II.
4. Performance evaluation
We evaluate the performance of the proposed recursive multicast algorithms in terms of the number of recycling passes and the number of internal links used. An algorithm with the minimum number of recycling passes, one that routes a multicast packet to an arbitrary set of destinations in a single phase, requires individual switch control and can be expensive [10,13].
4.1. Links used by the multicast algorithms
The number of internal links used by a multicast algorithm depends on the number of destinations (d) for a given multicast connection. A multicast routing in any phase is accomplished by selecting links of the MIN that form a binary tree with the source as the root and the d destinations as the leaves. The depth of the binary tree is n, as an n-stage MIN is used. The two proposed algorithms without the coalescing procedure use the same number of internal links in the routing phase, which equals dn; hence we consider only the number of internal links used in the copying phases.
To compute the number of links used by multicast algorithms I and II, let the binary representation of the integer d (the number of destinations) be $\sum_{i=0}^{n-1} d_i 2^i$. Assume that k is the maximum integer such that $2^k \le d$. In multicast algorithm I, the first phase routes a multicast packet to
$2^{k+1}$ consecutive outputs forming a cube. The number of links used is
$$\sum_{i=0}^{k+1} 2^i + (n - (k+1)).$$
On the other hand, the source sends a packet to $2^k$ consecutive outputs forming a cube in the first phase of multicast algorithm II. Then some of the $2^k$ outputs send it to $d - 2^k$ consecutive outputs in the second phase. The total number of internal links used by the two copying phases is
Fig. 9(a) shows the normalized average number of internal links required for an arbitrary set of d destinations where n = 6. For an arbitrary set of d destinations, algorithm I requires more links than algorithm II for a given multicast connection. The average number of links used in algorithm II is twice the number of destinations for a multicast connection. The same result is shown in Fig. 9(b), where n = 7.
4.2. Links used by the enhanced multicast algorithms
For a given multicast connection with d destinations, we now consider the total number of links used in the proposed enhanced algorithms. Assume that the number of outputs in the copying phase is minimized to f by Procedure 1, and let c be the average number of destinations in the f cubes produced by the coalescing procedure. Let the binary representations of the integers f and c be $\sum_{i} f_i 2^i$ and $\sum_{i} c_i 2^i$, respectively. Assume that l is the minimum integer such that $2^l \ge f$ and m is the minimum integer such that $2^m \ge c$. In multicast algorithm I with Procedure 1, the total number of internal links used by two phases is
In multicast algorithm II with Procedure 1, the total number of internal links used by three phases is
Fig. 10. Comparison of total number of internal links used per fanout. (a) n = 6; (b) n = 7. Horizontal axis: number of destinations (fanout). Solid lines: algorithms I and II; dashed lines: enhanced algorithms I and II.
Fig. 11. Comparison of total number of internal links used per fanout. (a) p = 0.05; (b) p = 0.10; (c) p = 0.25. Horizontal axis: network size (n = log N). Curves: Park's algorithm, enhanced algorithm I, enhanced algorithm II, Chen's algorithm, and Raghavendra's algorithm.
In Fig. 10, the total number of internal links used in the proposed multicast algorithms is plotted against the number of destinations. Algorithms I and II without the coalescing procedure require at least (n + 2) times as many internal links as there are destinations: dn links in the routing phase plus about 2d or more in the copy phase. In contrast, the number of links used in the enhanced algorithms with the coalescing procedure is independent of the number of stages (n) in the MIN. In terms of the number of links used, the performance of algorithms I and II is considerably improved by the coalescing procedure (Procedure 1).
Fig. 11 compares the total number of links used in the proposed enhanced algorithms and in other algorithms against the size of the multistage network (n), where p is the ratio of the number of destinations to the N outputs. The proposed algorithms employing the cube encoding scheme require, on average, fewer internal links than the other algorithms do. Of the other algorithms considered, that of Chen and Kumar [9] requires a large number of recycling passes, and that of Raghavendra et al. [13] incurs additional hardware cost because the SEs must process two different header encoding schemes for multicast packets. The algorithm of Park et al. [12], which requires two recycling passes, uses a large number of links.
5. Conclusions
In this paper, we have considered the multicast issues in the MIN for constructing the internal architecture of high performance ATM switches. To support multicast connections efficiently, a novel approach based on the recursive scheme is proposed, employing cube encoding as one of the restricted address encoding schemes. These algorithms prevent deadlock among multiple multicast connections in ATM switch architectures based on the unbuffered MIN. We have shown that the proposed algorithms perform well in terms of the number of recycling passes and the number of internal links used. In these algorithms, a fixed number of recycling passes (two or three) is required, and the average number of links used to route a multicast packet to its destinations increases linearly with the number of destinations. The proposed algorithms can also be easily applied to buffered MIN-based ATM switches.
Acknowledgements
This work is supported by Samsung Group under the Fundamental Research on Multimedia Engines project.
References
[1] R. Handel, M.N. Huber, Integrated broadband networks: an introduction to ATM-based networks, Addison-Wesley, 1991.
[2] M.D. Prycker, Asynchronous transfer mode: solution for BISDN, Ellis Horwood Ltd., 1991.
[3] R. Rooholamini, V. Cherkassky, M. Garver, Finding the right ATM switch for the market, IEEE Computer (April 1994) 16-28.
[4] J.S. Turner, Design of a broadcast packet switching network, IEEE Transactions on Communications COM-36 (June 1988) 734-743.
[5] P. Giacomazzi, A. Pattavina, Providing multicast services by the ATM shuffleout switch, Proc. of IEEE Globecom, 1994, pp. 1483-1489.
[6] J.F.K. Buford, Multimedia Systems, Addison-Wesley, 1994.
[7] D.X. Chen, J.W. Mark, Multicasting in the SCOQ switch, Proc. of IEEE Infocom, April 1994, pp. 290-297.
[8] T.T. Lee, Nonblocking copy networks for multicast packet switching, IEEE Journal on Selected Areas in Communications 6 (1988) 1455-1461.
[9] X. Chen, V. Kumar, Multicast routing in self-routing multistage networks, Proc. of IEEE Infocom, April 1994, pp. 306-314.
[10] C.-M. Chiang, Multicasting in multistage interconnection networks, Ph.D. thesis, Department of Computer Science, Michigan State University, 1995.
[11] R. Cusani, F. Sestini, A recursive multistage structure for multicast ATM switching, Proc. of IEEE Infocom, April 1991, pp. 1289-1295.
[12] J. Park, D. Yoo, H. Yoon, S.R. Maeng, Efficient two-pass multicast algorithms in ATM switches based on multistage networks, Proc. of the International Conference on Systems Engineering, July 1996, pp. 55-60.
[13] C.S. Raghavendra, X. Chen, V.P. Kumar, A two phase multicast routing algorithm in self-routing multistage networks, Proc. of International Conference on Communications, June 1995, pp. 1612-1618.
[14] S.C. Liew, A general packet replication scheme for multicasting in interconnection networks, Proc. of IEEE Infocom, April 1995, pp. 394-401.
[15] L.R. Goke, G.J. Lipovski, Banyan networks for partitioning multiprocessor system, Proc. of the Annual Symposium on Computer Architectures, 1973, pp. 21-28.
[16] C.L. Wu, T.-Y. Feng, On a class of multistage interconnection networks, IEEE Transactions on Computers C-29 (August 1980) 694-702.
[17] K.E. Batcher, Sorting networks and their applications, AFIPS Proc. of the Spring Joint Computer Conference, 1968, pp. 307-314.
[18] D.K. Panda, R. Sivaram, Fast broadcast and multicast in wormhole multistage networks with multidestination worms, Technical Report OSU-CISRC-4/95-TR21, Department of Computer and Information Science, Ohio State University, 1995.
Jaehyung Park received his B.S. in computer science from Yonsei University, Korea, in 1991, and his M.S. in computer science from Korea Advanced Institute of Science and Technology (KAIST), Korea, in 1993. He is currently working toward his Ph.D. in the Department of Computer Science at KAIST. His major research areas are ATM switching, multicast switches, interconnection networks, and parallel processing.
Hyunsoo Yoon received his B.S. degree in electronics engineering from Seoul National University, Korea, in 1979, M.S. degree in computer science from Korea Advanced Institute of Science and Technology in 1981, and Ph.D. degree in computer and information science from Ohio State University, Columbus, Ohio, in 1988. From 1978 to 1980, he was with the Tongyang Broadcasting Company, Korea; from 1980 to 1984, with the Samsung Electronics Company, Korea; and from 1988 to 1989, with the AT&T Bell Labs. as a Member of Technical Staff. He joined the faculty of KAIST in 1989. His research interests include parallel computer architecture, parallel computing, interconnection networks, protocol engineering, and B-ISDN/ATM switching.