End-to-End Survivable Broadband Networks

Within the EC-sponsored RACE program, the IMMUNE project was established to analyze and specify appropriate strategies for introducing end-to-end survivability into corporate and public broadband networks.

Leo Nederlof, Kris Struyve, Chris O'Shea, Howard Misser, Yonggang Du, and Braulio Tamayo

LEO NEDERLOF is with the Alcatel Corporate Research Centre in Antwerp. KRIS STRUYVE is with the Department of Information Technology at IMEC/University of Ghent. CHRIS O'SHEA works in the Broadband Multiservice Networks Unit of BT Research. HOWARD MISSER is with PTT Telecom. YONGGANG DU is with Philips Research Laboratories. BRAULIO TAMAYO is with the Alcatel Corporate Research Centre in Madrid.

In recent years, a wide range of protection and restoration techniques have been developed to support the survivability of today's and tomorrow's broadband networks. Component redundancy, route diversity, self-healing rings and dynamic restoration are among the solutions to help specific parts of specific networks survive failures of one or more of their comprising elements. Some of these techniques are still under study, while others have been demonstrated to work, and are already available to network operators and planners. What is still lacking is a coherent and integrated assessment of end-to-end survivability, consisting of the identification and definition of requirements, metrics and evaluation methods, as well as the definition of the interaction between restoration mechanisms applied in different network layers or parts, and the role of the network management system.

Within the EC-sponsored RACE program, the IMMUNE project, begun in January 1994, has set as its objectives to analyze and specify appropriate strategies for introducing end-to-end survivability into corporate and public broadband networks, to support these strategies by proper techniques and evaluation tools, and to demonstrate distributed restoration on PSN (public switched networks) and CPN (customer premises networks) laboratory models. Six partners from five European countries have joined in the IMMUNE consortium: BT Labs (United Kingdom), the Research Institute IMEC (Belgium), PTT Research (The Netherlands), Alcatel Standard Eléctrica (Spain), Philips Research Lab (Germany), and Alcatel Bell (Belgium) as coordinating partner.

Since survivability has only recently become an area of interest by itself, no standardized definitions or normalized quantifications exist as yet. The first objective was, therefore, to define a set of survivability requirements and metrics to be used in the rest of the project. This has led to the identification of a range of survivability strategy options and how they can be mapped onto user, service provider and operator requirements. An extract of the concluding report is given in the next section. The next step on the road to integral survivability is designing and planning survivable networks, and the evaluation of the restoration and protection mechanisms that will be applied in these networks. The third section gives an overview of this part of the project. Most protection and restoration mechanisms operate within a single network layer and network part, autonomous from network management. The interaction of mechanisms in different network layers or in different network parts, and the role of network management, are discussed in the fourth and fifth sections. For the demonstration lab models, two techniques have been selected for implementation: a distributed restoration mechanism for a meshed ATM PSN, and a CPN ATM ring protection switching mechanism. These techniques are described in the sixth section. Finally, in the last section, an overview is given of the ongoing activities within the IMMUNE project, with a summary of the status of the demo models.

Requirements for Network Survivability

In order to determine a network's survivability, a set of metrics needs to be specified, along with methods to measure and quantify the performance unambiguously. For the interpretation of these metrics, a set of survivability requirements has been derived from the point of view of users, service providers and network operators. These requirements have been quantified in order to determine how metrics can best be applied to assess the performance of proposed end-to-end survivability strategies. In addition to this, reference network configurations have been presented which form the basis of the network models used for the development of restoration techniques and for the computer simulation of these techniques.

IEEE Communications Magazine, September 1995. 0163-6804/95/$04.00 © 1995 IEEE


Table 1. Matrix of user and service classes (column headings: recovery time, potential service class).

Figure 1. Survivability evaluation framework.

Survivability can be measured from the context of end-to-end performance of a particular demand, in order to optimize service to particular customers. On the other hand, an operator also needs to ensure that the whole network meets some survivability criteria. Therefore, a method needs to be formulated to measure whole network survivability as well. There are also short term and long term availability measures for QoS. Long term measures, calculated from MTBF (mean time between failure) and MTTR (mean time to repair), are more applicable to survivability metrics, while short term measures (e.g., bit error rate or packet loss rate) give a measure of the quality of a connection, link or path, at a given point in time. Short term performance measures can also be used to determine when a circuit becomes unavailable, and trigger recovery systems.
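As a rough illustration of a long term measure, steady-state availability is commonly computed as MTBF / (MTBF + MTTR), treating MTBF as the mean up time between failures. The sketch below assumes that standard formula; the function names and the sample figures are illustrative and are not taken from the IMMUNE project.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from mean time between failures and mean time to repair."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def expected_downtime_minutes_per_year(avail: float) -> float:
    """Expected unavailable minutes in a 365-day year."""
    return (1.0 - avail) * 365 * 24 * 60


# Hypothetical link: fails on average every 10,000 h and is repaired in 4 h.
a = availability(10_000, 4)
print(f"availability = {a:.6f}")
print(f"downtime     = {expected_downtime_minutes_per_year(a):.1f} min/year")
```

A measure of this kind describes a link or path over months or years; it says nothing about the instantaneous quality that the short term measures capture.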

The deployment of survivability options involves a choice between strategies, in which many factors are involved, related to cost, service type, network topology, etc. The requirements need to be driven by the end user and should be satisfied by the service provider through the range and quality of the services provided via the transport network that is in turn provided by the network operator. (It is also possible that the service provider and the operator are one and the same.)

The most user-perceived factor that determines the QoS is termed an outage. An outage is defined as a significant degradation in the ability of a customer to establish and maintain a channel of communications as a result of failure in an operator's network [1]. This in itself is not sufficient because the term "significant degradation" needs to be quantified. The performance thresholds for declaring an outage will depend on the QoS criteria agreed upon with the customer and the sensitivity of the service to degraded performance [2]. From the point of view of service providers, current and new services can best be classified according to their resilience to network outages. The duration of a service outage can be set out against the impact, defined in terms of call dropping, session time-outs, social and business impacts, etc. Operator requirements are derived from users and service providers. However, network operator requirements are also driven by the fact that the operator needs to make an overall profit while maintaining an acceptable level of service to all customers. Table 1 summarizes the user (u1-u4) and service classes (s1-s6) together in a matrix. The operator must then decide upon a restoration/protection strategy to be applied to the cases in question.

The operator considerations include the extent over which a disruption/failure occurs; the range of potential failure scenarios; fault propagation; back-to-normal operation; and algorithm complexity, maintainability, reliability and sensitivity. All of these issues are discussed in [3].

In order to provide an accurate analysis of network survivability strategies, information on existing and planned network configurations has been gathered from a number of operators. Also, the layering and partitioning concepts as described in [4] have been taken into account. A survey has been carried out of various network architectures in the US and Europe, covering SDH/Sonet, pure ATM, ATM/SDH and PDH/ATM networks. From this study, and from literature, a library of network parts (e.g., mesh, ring, star, ...) has been established, from which a generic set of reference network configurations has been derived. These reference configurations form the basis of the comparative evaluations of survivability strategies.

Survivability Evaluation Model

Networks are made survivable by implementing restoration techniques on one hand, and by providing spare capacity on the other. For the evaluation of a certain survivability strategy as a whole, tools are needed to validate the restoration techniques and to optimize the allocation of spare capacity. An assessment of performance versus cost can thus be made for a network, prior to actual deployment of the strategy. This section describes a model that was defined as part of the IMMUNE project, and also some of the tools that were developed for the evaluation.

The evaluation model of restoration and resource allocation methods is shown in Fig. 1. Each part represents a particular aspect of the modeling process. The fault generation, spare capacity planning and recovery strategy parts are inputs to the network environment part. The outcome of calculations is evaluated according to certain criteria. In the next paragraphs, the different parts of the framework in Fig. 1 will be further described. Some techniques used for the recovery strategies are discussed in the section on restoration mechanisms for meshed and ring networks.

Fault Scenarios

Failures occurring in real-life networks may be grouped into two major classes: link and node failures. As such, a link failure at the physical layer might be used to model a fiber cable break, while a link failure at the SDH VC-4 layer might model a port failure. Multiple simultaneous failures also must be considered, since a single failure may cause multiple failures in higher network layers, as is illustrated by Fig. 2. Also, a fire in a hub building possibly affects multiple network elements.
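The propagation of a single server-layer failure into several client-layer failures can be sketched as a lookup over a layer mapping. The example below is a minimal illustration only: the link and span names and the mapping itself are hypothetical, and the project's own fault generation is not described at this level of detail.

```python
from typing import Dict, List, Set

# Hypothetical mapping: each ATM link (client layer) is carried over one or
# more SDH spans (server layer). All names are made up for illustration.
ATM_LINK_TO_SDH_SPANS: Dict[str, List[str]] = {
    "atm-A-B": ["sdh-1"],
    "atm-A-C": ["sdh-1", "sdh-2"],
    "atm-B-C": ["sdh-3"],
}


def affected_client_links(failed_spans: Set[str],
                          mapping: Dict[str, List[str]]) -> List[str]:
    """Return the client-layer links that use at least one failed server-layer span."""
    return sorted(link for link, spans in mapping.items()
                  if failed_spans.intersection(spans))


# A single SDH span failure takes down every ATM link routed over it.
print(affected_client_links({"sdh-1"}, ATM_LINK_TO_SDH_SPANS))
```

With this mapping, the failure of span `sdh-1` alone affects two ATM links, which is the situation a multiple-failure scenario in the client layer has to cover.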

Network Environment

The network environment comprises three levels: the transport functional level, the control functional level and the management functional level. The transport functional level models the transfer of client information in the network. The control functional level realizes the transfer of operations- and maintenance-related information (e.g., fault detection signals, ...). This level also includes the exchange of restoration messages. The management functional level, which models the interaction between autonomously operating survivability mechanisms and TMN, is further considered in the following sections.

ITU-T Recommendation G.803 [4] provides a basic tool to model multilayer multipart networks which are composed of links (e.g., fiber cables, coax cables, radio links, ...) and elements (e.g., MUX, ADM, CC, ...). The integration of the restoration mechanism in the network elements can be included in the model, especially if the speed of restoration is assessed by means of simulation. In order to predict the execution time of an algorithm, it is necessary to use a representative model of the nodes, taking into account the software and hardware architecture. For initial evaluation studies, the modeling of network elements has been brought down to three parameters:

* Cross-connect delay: the time needed to set up a cross-connect point in a node.
* Internal communication delay: the time needed to pass alarms and restoration messages through the control architecture of a node.
* Processing delay: the actual CPU time needed to process an algorithm's code.
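The three parameters above can be combined into a simple timing estimate. The sketch below is an assumption-laden illustration, not the project's node model: it supposes that a single restoration message visits the nodes of a path in sequence, that each node processes it, passes it through its control architecture and sets up one cross-connect, and that every link adds a fixed transit delay. The class name, function name and sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NodeModel:
    """Three-parameter node model; all values in milliseconds."""
    cross_connect_delay: float   # time to set up one cross-connect point
    internal_comm_delay: float   # time to pass a message through the node's control architecture
    processing_delay: float      # CPU time to run the algorithm's code for one message


def path_setup_time(nodes: List[NodeModel], link_delay_ms: float) -> float:
    """Estimate the time to set up a restoration path hop by hop (sequential handling assumed)."""
    per_node = sum(n.internal_comm_delay + n.processing_delay + n.cross_connect_delay
                   for n in nodes)
    links = max(len(nodes) - 1, 0) * link_delay_ms
    return per_node + links


# Hypothetical four-node restoration path with identical nodes.
node = NodeModel(cross_connect_delay=10.0, internal_comm_delay=2.0, processing_delay=5.0)
print(path_setup_time([node] * 4, link_delay_ms=1.0))
```

A simulation would apply the same per-node figures to every message exchanged, rather than to a single pass along the path.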

Figure 2. Failure propagation. A single SDH span failure causes multiple ATM link failures.

Table 2. Evaluation criteria.

Performance benchmarking of different recovery mechanisms requires, among others, agreement on a reference network. It is, however, not impossible that the performance of a mechanism depends on some properties of the network environment for which it is evaluated. Hence, a more generalized comparison should include a sensitivity analysis of each recovery mechanism to parameters such as network size, network connectivity, relative amount of spare capacity and traffic load. For this analysis, a network generation tool is developed that can generate reference networks according to a set of tunable parameters. For instance, some distributed restoration algorithms were applied to a number of reference networks with varying connectivity, while other parameters were kept constant. The results of this study are given in [5].
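A generator of this kind can be sketched in a few lines. The version below is only an assumed illustration of generating reference networks from tunable parameters (here network size and average node degree as a stand-in for connectivity); it is not the tool developed in the project, and the function name and parameters are hypothetical.

```python
import random
from typing import List, Set, Tuple


def generate_network(num_nodes: int, avg_degree: float, seed: int = 0) -> Set[Tuple[int, int]]:
    """Generate a connected random mesh as a set of undirected links (u, v) with u < v.

    A random spanning tree guarantees connectivity; extra links are then added
    at random until the target average node degree is reached.
    """
    if num_nodes < 2:
        raise ValueError("need at least two nodes")
    max_links = num_nodes * (num_nodes - 1) // 2
    target_links = min(max_links, max(num_nodes - 1, round(avg_degree * num_nodes / 2)))
    rng = random.Random(seed)

    order = list(range(num_nodes))
    rng.shuffle(order)
    links: Set[Tuple[int, int]] = set()
    for i in range(1, num_nodes):
        u, v = order[i], rng.choice(order[:i])  # attach to an already connected node
        links.add((min(u, v), max(u, v)))

    candidates: List[Tuple[int, int]] = [(u, v) for u in range(num_nodes)
                                         for v in range(u + 1, num_nodes)
                                         if (u, v) not in links]
    rng.shuffle(candidates)
    links.update(candidates[:target_links - len(links)])
    return links


# Vary connectivity while keeping the network size constant.
for degree in (2.0, 3.0, 4.0):
    print(degree, len(generate_network(10, degree, seed=1)))
```

Fixing the seed makes each reference network reproducible, so that different restoration algorithms can be compared on identical topologies.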

Evaluation Criteria

To interpret the output of a performance evaluation, a set of criteria is needed. For recovery mechanisms, two classes of evaluation criteria are considered: basic criteria, which are the direct output from the simulations; and advanced criteria, which are calculated from the basic criteria and enable the evaluation of the overall performance of the recovery strategy. Examples are given in Table 2.
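The relation between the two classes can be illustrated with a small calculation. The criteria used below are assumed examples chosen for illustration, not the entries of Table 2: the basic criterion is a per-connection restoration time as a simulator might report it, and the derived values are a restoration ratio and a worst-case restoration time. All names are hypothetical.

```python
from typing import List, Optional


def restoration_ratio(restore_times_ms: List[Optional[float]]) -> float:
    """Fraction of failed connections that were restored (None = never restored)."""
    if not restore_times_ms:
        raise ValueError("no failed connections to evaluate")
    restored = [t for t in restore_times_ms if t is not None]
    return len(restored) / len(restore_times_ms)


def worst_restoration_time(restore_times_ms: List[Optional[float]]) -> Optional[float]:
    """Longest restoration time among restored connections, or None if none was restored."""
    restored = [t for t in restore_times_ms if t is not None]
    return max(restored) if restored else None


# Hypothetical simulator output for four failed connections (assumed basic criterion).
times = [120.0, 340.0, None, 95.0]
print(restoration_ratio(times), worst_restoration_time(times))
```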

Evaluation Tools

The complexity of the network model requires the use of computers to simulate survivability strategies and to plan spare capacity. Analytical methods tend to be useful for evaluating simple models, such as self-healing protection rings, as the number of system states is limited. Analytical methods can also be used to provide upper and lower bounds on possible performance.


Within the consortium of the RACE II project IMMUNE, different computer simulation tools have been adapted to evaluate recovery and spare capacity planning algorithms.

The following tools have been used for evaluation of recovery mechanisms.

* Alcatel Standard Eléctrica's INSE tool helps evaluate survivability in a multilayered network. The evaluator allows several survival architectures to be created interactively and analyzes their effect on overall network survival performance. Evaluation of loss of connectivity and loss of capacity metrics is currently supported to determine the most vulnerable parts of the network for a variety of fault scenarios.
* The BT tool, called TENDRA, has been developed to investigate distributed restoration protocols in transport networks [6]. Restoration protocols are encoded as objects associated with each node object, such that it is possible to plug different restoration protocols into a TENDRA simulation. Message-passing between restoration processes executing on processors at different node sites is achieved using discrete-event simulation techniques in which appropriate delays are used, modeling finite link transit delays, and nodal processing and cross-connection times.
* IMEC has developed three tools, called WDMSIM, SDHSIM and ATMSIM. The WDMSIM and SDHSIM simulators produce performance parameters of centralized restoration algorithms for multilayer networks. These multilayer network models are based on ITU-T Recommendation G.803. The ATMSIM simulator is specifically developed to study distributed restoration algorithms in single layer networks using discrete-event simulation techniques.
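Discrete-event simulation of message passing, as used by several of the tools above, can be sketched with a time-ordered event queue. The example below is a generic illustration and does not reproduce TENDRA or ATMSIM: it floods one message through a small hypothetical topology and records when each node first receives it, using assumed link transit delays and an assumed fixed nodal processing delay.

```python
import heapq
import itertools
from typing import Dict, List, Tuple

# Hypothetical topology: link transit delays in ms between neighbouring nodes.
LINK_DELAY_MS: Dict[Tuple[str, str], float] = {
    ("A", "B"): 2.0, ("B", "C"): 3.0, ("A", "C"): 12.0,
}
PROCESSING_DELAY_MS = 5.0  # assumed nodal processing time before forwarding


def neighbours(node: str) -> List[Tuple[str, float]]:
    """Return (neighbour, link delay) pairs for a node."""
    out = []
    for (u, v), delay in LINK_DELAY_MS.items():
        if u == node:
            out.append((v, delay))
        elif v == node:
            out.append((u, delay))
    return out


def flood(source: str) -> Dict[str, float]:
    """Flood a message from source; return the time each node first receives it."""
    first_seen: Dict[str, float] = {source: 0.0}
    counter = itertools.count()  # tie-breaker so the heap never compares node names
    events = [(delay, next(counter), n) for n, delay in neighbours(source)]
    heapq.heapify(events)
    while events:
        time, _, node = heapq.heappop(events)  # next event in time order
        if node in first_seen:
            continue  # a later copy of the message, ignored
        first_seen[node] = time
        for nxt, delay in neighbours(node):
            if nxt not in first_seen:
                arrival = time + PROCESSING_DELAY_MS + delay
                heapq.heappush(events, (arrival, next(counter), nxt))
    return first_seen


print(flood("A"))
```

In this topology node C is reached first via B, because the two-hop route with its processing delay is still faster than the slow direct link.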

Resource Allocation Tools

Planning tools are used to provide the network environment with sufficient spare resources to perform restoration. Given a traffic demand matrix, and the type of recovery to be implemented, the links and nodes in the network can be dimensioned and the cost calculated. If those planning tools are to be applied to existing networks, rather than just for dimensioning reference networks for evaluation of survivability strategies, real-life constraints can be taken into account, e.g., geographical aspects and installed base.
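The dimensioning step can be illustrated for one simple recovery type. The sketch below assumes dedicated path protection, where each demand reserves capacity on a working route and on a disjoint protection route, and a linear cost per capacity unit on each link. The routes, demand volumes and costs are hypothetical, and the project's planning tools use their own, more elaborate methods.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Link = Tuple[str, str]


def link_key(u: str, v: str) -> Link:
    """Canonical name for an undirected link."""
    return (u, v) if u < v else (v, u)


def dimension(demands: Dict[Tuple[str, str], int],
              routes: Dict[Tuple[str, str], List[List[str]]]) -> Dict[Link, int]:
    """Capacity needed per link when each demand is reserved on every listed route."""
    capacity: Dict[Link, int] = defaultdict(int)
    for pair, volume in demands.items():
        for path in routes[pair]:
            for u, v in zip(path, path[1:]):
                capacity[link_key(u, v)] += volume
    return dict(capacity)


def network_cost(capacity: Dict[Link, int], cost_per_unit: Dict[Link, float]) -> float:
    """Total cost assuming a linear cost per capacity unit on each link."""
    return sum(units * cost_per_unit[link] for link, units in capacity.items())


# Hypothetical triangle network with a working and a protection route per demand.
demands = {("A", "B"): 4, ("B", "C"): 2}
routes = {
    ("A", "B"): [["A", "B"], ["A", "C", "B"]],
    ("B", "C"): [["B", "C"], ["B", "A", "C"]],
}
capacity = dimension(demands, routes)
unit_cost = {link: 1.0 for link in capacity}
print(capacity, network_cost(capacity, unit_cost))
```

Restoration schemes that share spare capacity between failure scenarios would need less capacity than this dedicated reservation, which is the kind of trade-off a planning tool is meant to quantify.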

The following two tools are examples of the resource allocation tools, and have been used in the IMMUNE project.

* The ALCALA tool developed by Alcatel Standard Eléctrica provides a means of planning and designing PDH and SDH networks with protection via link and path diversity with unidirectional rings and meshes [7]. The design is optimized in terms of equipment dimensioning and overall total network installation cost for a given level of protection between any node pairs.
* BT's Restoration Capacity Heuristic (RCH) program deals with the problem of allocating spare capacity to protect telecommunications traffic in a fully or partly meshed transport network topology. The method used is applicable to any mesh-based restoration strategy, and the designs produced have been verified using the TENDRA network simulation tool.

Figure 3. Escalation between layers: (a) failure recovery in the SDH layer; (b) escalation to the ATM layer.

The tools available within the consortium need to be further extended. The incorporation of different recovery mechanisms in different parts and layers of the network models must be improved, and dynamic scenarios for the interworking of mechanisms in different layers and parts of a network should be covered by the tools. This interworking, or escalation, is described in the following section. The tools can also be adapted to deal directly with a number of the evaluation criteria that were mentioned in this section. Furthermore, the incorporation of TMN aspects should be further investigated. Methodologies optimizing the interworking between recovery mechanisms and spare capacity planning algorithms should also be studied further.

Escalation Issues

Different recovery mechanisms may be deployed in different network layers or subnetworks. The interworking between these mechanisms is called escalation. The objective of an escalation strategy is to optimize the performance of the network under all circumstances, using available resources and implemented mechanisms. If no escalation strategy is provided, different recovery mechanisms may prevent each other from acting in an efficient way, and may even lock up the network in an indefinite state. In an ideal situation, different recovery systems act in a complementary way and share spare resources.

Since network survivability is a relatively new subject, as was said in the introduction to this article, the issue of escalation is even newer. Sparse instances of operational survivable networks are presently implemented, but escalation is as yet a purely theoretical issue. No custom-made solutions can therefore be presented as yet. Instead, the steps taken on this part of the road to end-to-end survivability are limited to an identification of the issues and the options, and a qualitative debate on some escalation strategies.

    Escalation Between Layers

    A network can be modeled as consisting of network layers, with a
    client/server relationship between adjacent layers. A layer providing
    transport is called a server, and the layer using transport is called a
    client [4]. Different recovery mechanisms may be implemented in different
    layers for several reasons: the natural evolution of communications
    networks may result in adding new survivable layers to the existing ones,
    or the differences in survivability requirements between network layers
    can result in the implementation of different recovery mechanisms. Within
    this context, an escalation strategy between layers defines the
    coordination of restoration mechanisms in different layers to avoid
    contention, promote cooperation and increase overall survivability. In the
    simplest case, the responsibility to restore services in case of a failure
    is passed from one layer to another, typically when one layer has
    exhausted its restoration capabilities or when a predefined time interval
    has passed. Figure 3 gives an example of escalation from an SDH VC-4
    server layer to an ATM VP client layer. In Fig. 3a, a failure is recovered
    in the SDH layer, and the ATM layer only notices a short service
    interruption. When the SDH layer is not able to recover from the failure,
    e.g., due to lack of spare resources, the failure escalates to the ATM
    layer, where restoration is performed using an alternative VP trail. Note
    that the server layer providing transport to the alternative VP route does
    not necessarily have to be the same as the original server layer from
    which the failure escalated.
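
The simplest case described above, where a layer hands over responsibility once its own restoration capability is exhausted, can be sketched as follows. This is only an illustration under stated assumptions: the `Layer` class, the `restore_with_escalation` function and the integer capacity model are hypothetical and do not come from the IMMUNE project.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """A network layer with a pool of spare capacity (hypothetical model)."""
    name: str
    spare_capacity: int  # spare units this layer may use for restoration

    def try_restore(self, demand: int) -> bool:
        """Restore `demand` units if enough spare capacity is left."""
        if demand <= self.spare_capacity:
            self.spare_capacity -= demand
            return True
        return False


def restore_with_escalation(layers, demand):
    """Try each layer in order, server layer first.

    Returns the name of the layer that restored the traffic, or None if
    every layer has exhausted its restoration capability.
    """
    for layer in layers:
        if layer.try_restore(demand):
            return layer.name
        # This layer cannot recover: responsibility escalates to the next one.
    return None


sdh = Layer("SDH VC-4", spare_capacity=2)
atm = Layer("ATM VP", spare_capacity=5)
print(restore_with_escalation([sdh, atm], demand=1))  # SDH VC-4
print(restore_with_escalation([sdh, atm], demand=3))  # ATM VP
```

The second call escalates because the SDH layer has only one spare unit left after the first restoration. A timer-based hand-over would need an additional notion of elapsed time, which this sketch leaves out.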

    The main issues related to escalation between layers are defining in which
    layer the restoration process starts, when it escalates to another layer
    and to which layer it escalates. Additional issues to be solved are
    defining the relation between the escalation strategy and the network
    management system and the form of signaling to be used in the escalation
    mechanism.

    Escalation Between Subnetworks

    Each layer network can be divided into subnetworks in a way that reflects
    the internal structure of that layer. From a survivability point of view,
    a survivable subnetwork (SSN) is defined as a set of network elements
    grouped together by the fact that they all share one single restoration
    mechanism, e.g., a self-healing ring or a meshed network with back-up
    routes. When the restoration mechanism in one SSN is not able to recover
    fully from a failure, adjacent SSNs need to be involved. Two types of
    escalation between SSNs will be illustrated by means of examples.

    One type of interworking between SSNs concentrates on the interconnection
    or gateway nodes. Figure 4 depicts the dual access between two SSNs: a
    feeder ring and a core mesh network. When any of the links or nodes
    outside the light blue shaded areas fails, each of the SSNs is capable of
    restoring the failure autonomously. When one of the gateway nodes or links
    fails, both SSNs have to cooperate in moving traffic from one gateway to
    the other.

    Table 3. Where to start restoration.

    Figure 4. Interconnection between survivable subnetworks (core mesh and
    feeder ring).

    A second type of escalation can occur when SSNs are organized in a
    hierarchical way, i.e., when several SSNs together form a larger SSN. This
    larger SSN can then use the spare resources of its comprising SSNs after
    it has received control over the restoration, according to the escalation
    strategy, and execute a restoration mechanism on a wider scale.

    This latter type of escalation between SSNs closely resembles escalation
    between layers, where an SSN in a client layer can use transport
    facilities of several concatenated server SSNs. If, for instance, the
    interconnection between two SDH subnetworks fails, an overlay ATM network
    may restore its traffic on another VP trail, using resources in the same
    or in different server SSNs. Note one difference: in escalation between
    SSNs in one layer, the same spare resources can be reused after
    escalation, while in general sharing spare resources between network
    layers is not a straightforward issue.

    Escalation Strategies

    The set of rules used to decide which mechanisms to activate and when to
    halt mechanisms and to activate others, is called the escalation strategy.
    Two types of escalation strategies can be identified: activation of
    multiple restoration mechanisms in parallel, and sequential activation of
    restoration mechanisms.

    In parallel strategies, different restoration mechanisms are activated at
    the same time, as a result of a single failure event. When one mechanism
    succeeds in restoring the failure, all activities are stopped. Although
    this will achieve the fastest result, the individual mechanisms must be
    conditioned carefully so as not to obstruct each other or contend for the
    same spare resources.

    Table 4. When to escalate.

    Timer (entry only partly legible): ...ion mechanism when a timer ...
    active restoration mechanism ...

    Diagnostics: Using a diagnostics method requires more interaction between
    the escalation mechanism and the restoration mechanisms, but can reduce
    restoration time as compared to the use of a timer, where it is possible
    that a restoration mechanism has already given up its efforts before the
    timer has expired. A diagnostics method can detect when a mechanism has
    failed or deduce that a mechanism will not be successful, and hand over
    restoration control to the next mechanism.

    Sequential mechanisms may lead to longer overall restoration times than
    parallel activation, but are easier to keep under control. Individual
    mechanisms can then be optimized without risking problems of contention.
    A sequential escalation strategy determines the order of activation of the
    mechanisms and coordinates between the mechanisms. Two variables in
    sequential escalation are the order in which the mechanisms are activated
    (Table 3) and the criteria used to decide when to escalate (Table 4).
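
The difference between the two escalation criteria can be made concrete with a small calculation. The sketch below is an assumption-laden illustration, not the project's method: the function name `sequential_restoration_time`, the tuple layout and the millisecond figures are invented for the example. A timer-based strategy waits for the full interval after a failed attempt, while a diagnostics-based strategy hands over as soon as the mechanism reports that it has given up.

```python
def sequential_restoration_time(mechanisms, timer=None):
    """Return (total_time_ms, winner) for a sequential escalation strategy.

    mechanisms: list of (name, run_time_ms, succeeds) in activation order,
        where run_time_ms is how long the mechanism works before it either
        succeeds or gives up.
    timer: if set, escalation happens only when this fixed interval (ms)
        expires (timer criterion); if None, control is handed over as soon
        as the mechanism reports failure (diagnostics criterion).
    """
    elapsed = 0
    for name, run_time, succeeds in mechanisms:
        # A mechanism wins if it succeeds before any timer would halt it.
        if succeeds and (timer is None or run_time <= timer):
            return elapsed + run_time, name
        # Failed attempt: the timer waits its full interval, diagnostics do not.
        elapsed += run_time if timer is None else timer
    return elapsed, None


# Hypothetical figures: the first mechanism gives up after 50 ms,
# the second one restores the traffic in 500 ms.
mechanisms = [("ring protection", 50, False), ("mesh restoration", 500, True)]
print(sequential_restoration_time(mechanisms, timer=2000))  # (2500, 'mesh restoration')
print(sequential_restoration_time(mechanisms))              # (550, 'mesh restoration')
```

With these made-up numbers the timer criterion wastes 1950 ms waiting for a mechanism that has already stopped trying, which is the effect the diagnostics entry of Table 4 describes.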

    Management of Restoration

    Though it is a goal to deploy autonomously operating survivability
    mechanisms, at some point these mechanisms will interact with the network
    management, or TMN. This may range from simply informing the TMN of the
    progress of the restoration process, to the active participation of TMN in
    the restoration process (e.g., instigating/terminating restoration
    mechanisms in other layers or parts of the network, ...).

    The restoration management functions interact with nearly all five
    management functional areas defined in the TMN management concepts [8],
    namely, configuration management, performance management, fault
    management, accounting management and security management. Most of them
    are obviously related to the fault management area. However, important
    restoration management functions are also performed in one or more other
    functional areas, e.g., those functions dealing with escalation. In the
    following paragraphs, the role of each functional area in the restoration
    management is described, and the functions are defined that are required
    in each functional area to obtain a complete restoration management
    system.

    The fault management area contains all functions related to the detection
    and reporting of the faults, including fault diagnostic and recovery
    functions. The alarm surveillance activity is an intrinsic part of any
    restoration mechanism. A restoration algorithm, either centralized or
    distributed, needs some automatic fault detection mechanism to trigger it.
    Then, once a failure has been detected, a fault diagnostic procedure must
    be invoked to localize and analyze the fault, e.g., determine whether it
    is a link or node failure and in which layer the fault occurs. Based on
    this information, the fault correction function will activate the
    appropriate restoration mechanism and control, if required, any escalation
    procedure and subsequent repair actions. The fault management area also
    includes tracking the status of the damaged network component, and the
    strategies for switching the network back to its normal status when the
    fault has been repaired (normalization).
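
The chain of detection, diagnosis and activation described above could look roughly like the sketch below. Everything in it is hypothetical: the article does not say how a link failure is told apart from a node failure, so the heuristic used here (a node is blamed when all of its links raise alarms) is an assumption, as are the function names, the topology and the mechanism table.

```python
def diagnose(alarmed_links, topology):
    """Classify a set of link alarms as one node failure or as link failures.

    Assumed heuristic (not from the article): if every link attached to a
    node with more than one link is alarmed, the node itself has failed.
    """
    alarmed = {frozenset(link) for link in alarmed_links}
    for node, neighbours in topology.items():
        links = {frozenset((node, n)) for n in neighbours}
        if len(links) > 1 and links <= alarmed:
            return ("node", node)
    return ("link", sorted(tuple(sorted(link)) for link in alarmed))


def select_mechanism(fault_kind, layer, table):
    """Pick the restoration mechanism for a diagnosed fault in a given layer.

    Falls back to handing the fault to the TMN when no entry exists.
    """
    return table.get((layer, fault_kind), "escalate to TMN")


# Hypothetical four-node topology and mechanism table.
topology = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C"],
}
table = {
    ("SDH VC-4", "link"): "ring protection switching",
    ("ATM VP", "link"): "distributed restoration",
    ("ATM VP", "node"): "distributed restoration",
}

kind, where = diagnose([("A", "B"), ("A", "C")], topology)
print(kind, where)                                # node A
print(select_mechanism(kind, "ATM VP", table))    # distributed restoration
print(select_mechanism(kind, "SDH VC-4", table))  # escalate to TMN
```

Status tracking and normalization are left out; they would need a record of which mechanism is active for each fault and a trigger from the repair process.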

    In the configuration management area, an important function is the
    installation of the restoration mechanism in the network, and the related
    introduction strategy (e.g., which nodes of a meshed network will be
    chosen first to be protected by a distributed restoration algorithm). The
    provisioning of the restoration data to all concerned network elements is
    also related to that function. Another important configuration function is
    the monitoring of the network status and the results of the restoration
    process, in order to inform the network operator on the efficiency of
    restoration. Beside these two main functions, the normalization and the
    deadlock processing functions will also imply some configuration
    management specific actions; they are therefore also included in the
    configuration management area.

    Performance management of the restoration process should include functions
    to modify any parameter that influences its performance, such as timer
    values used as thresholds in escalation strategies or deadlock situation
    detection. One of these functions is the validation of the restoration
    mechanism (i.e., the background test of the distributed restoration
    algorithm or the self-healing ring) in order to get an estimate of the
    restoration performance. Another one is precisely the detection of a
    deadlock in the restoration process. This last function is more specific
    to a restoration algorithm, for which it is important to detect the
    collapse through precise threshold values (timers). The optimization of
    the network is also an important performance management function that will
    clean up the network in a post-restoration phase.

    In the area of accounting management several aspects must be taken into
    account. When the restoration process is offered as a service to network
    users, possibly including the subscription to a higher priority class,
    rules must be defined to determine the corresponding subscription rates.
    A related charging adaptation must be considered to take into account the
    impact of the network fault on the user. The situation in which a user
    would be charged more because the connection is using a longer path after
    restoration must be avoided.

    The restoration management does not introduce any new specific functions
    in the security management area. The existing security procedures of other
    network management functions can be extended for the restoration
    management.

    The mapping of restoration management functions with the TMN management
    concepts provides a clear picture of the specific management issues that
    are addressed within the RACE II IMMUNE project. Tackling these issues
    will lead to practical restoration management applications in the two
    testbeds to be set up toward the end of the project.

    Figure 5. The three phases of the flooding algorithm.

    Restoration Mechanisms for Meshed and Ring Networks

    Part of the objectives of the IMMUNE project is to demonstrate
    survivability in a laboratory environment. To this end, two testbeds are
    currently being set up: a five-node meshed ATM PSN model at the Alcatel
    research lab in Antwerp, and a bi-directional ATM ring CPN model at the
    Philips research lab in Aachen. In Antwerp, a distributed restoration
    algorithm will be demonstrated; in Aachen, a ring protection switching
    mechanism, based on a new hardware concept, will be demonstrated.

    For meshed restoration, centralized systems have been implemented as
    network management applications. Distributed restoration is still a new
    area under investigation; so far only simulations have validated several
    algorithms. For the testbed in Antwerp, a distributed restoration
    algorithm has been developed [9] and validated by simulations. This
    algorithm is currently being integrated in the control software of a
    commercial ATM cross-connect. The algorithm is based on a previously
    published two-prong algorithm [10] and some extensions have been added to
    also cover multiple link and node failures, and to make it more robust.
    Figure 5 explains the three phases of the algorithm.

    When a failure is detected, nodes adjacent to the failure (referred to as
    request source, or RS, nodes) broadcast request messages, containing a
    signature, the RS node's own ID, the requested bandwidth and a hop count.
    The signature is used to distinguish between multiple requests on the same
    link. Any intermediate node that receives a request message (tandem, or T
    node) stores the information of the message, updates the message contents
    where necessary, e.g., when the requested bandwidth is not available, and
    broadcasts it further after incrementing the hop count. Whenever the hop
    count in a request message reaches a preset maximum value, the message is
    no longer forwarded.
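
The request phase described above can be approximated by a hop-limited flood. The sketch below is a simplified model and not the algorithm of [9]: signatures are omitted, each node is assumed to forward only the first request it receives, and the names `flood_requests` and `topology` as well as the spare-bandwidth figures are made up for illustration.

```python
from collections import deque


def flood_requests(topology, source, bandwidth, max_hops, failed_link=None):
    """Flood a request from one RS node and record what each node stores.

    topology: {node: {neighbour: spare bandwidth on that link}}
    Returns {node: (hop_count, bandwidth carried by the first request)}.
    """
    failed = frozenset(failed_link) if failed_link else None
    reached = {source: (0, bandwidth)}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        hops, carried = reached[node]
        if hops == max_hops:
            continue  # preset maximum reached: the message is not forwarded
        for neighbour, spare in topology[node].items():
            if frozenset((node, neighbour)) == failed:
                continue  # the failed link cannot carry request messages
            offered = min(carried, spare)  # lower the request if spare is short
            if offered > 0 and neighbour not in reached:
                reached[neighbour] = (hops + 1, offered)
                queue.append(neighbour)
    return reached


# Hypothetical five-node mesh; link A-B has failed and A acts as RS node.
topology = {
    "A": {"B": 10, "C": 4},
    "B": {"A": 10, "C": 6, "D": 3},
    "C": {"A": 4, "B": 6, "E": 8},
    "D": {"B": 3, "E": 5},
    "E": {"C": 8, "D": 5},
}
result = flood_requests(topology, "A", bandwidth=5, max_hops=2,
                        failed_link=("A", "B"))
print(result)  # {'A': (0, 5), 'C': (1, 4), 'B': (2, 4), 'E': (2, 4)}
```

Node D is never reached because it lies three hops from A, beyond the hop limit of two. The requested bandwidth drops from 5 to 4 on the first hop because link A-C has only 4 spare units.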

    Eventually, two branches of the request trees will meet in a tandem node,
    which is then referred to as the confirm (CF) node. The confirm node sends
    a confirm message to the RS node with the lowest ID, which is now assigned
    the role of chooser (CR) node; the other RS node will be the chosen (CN)
    node. The confirm messages contain the amount of bandwidth that is
    available on the newly found route. In general, not all bandwidth that was
    on the failed link will be available on one alternative route, and the
    lost traffic will be restored along several other routes. Thus, the
    chooser node will receive several confirm messages, indicating which
    routes are available, and how much bandwidth there is on each route. The
    chooser node now sends connect messages along the alternative routes,
    containing information for the tandem nodes and the chosen node about the
    new configuration.
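
How the chooser node might spread the lost traffic over the confirmed routes can be sketched as a simple greedy allocation. This is an assumption for illustration only: the article does not state in which order the chooser uses the routes, so preferring shorter routes is a choice made here, and `allocate_routes` is a hypothetical name.

```python
def allocate_routes(lost_bandwidth, confirms):
    """Distribute lost bandwidth over the routes reported in confirm messages.

    confirms: list of (route, available_bandwidth), route being a node list.
    Returns (plan, unrestored), where plan is a list of (route, bandwidth).
    Assumed policy (not from the article): shorter routes are used first.
    """
    plan = []
    remaining = lost_bandwidth
    for route, available in sorted(confirms, key=lambda c: len(c[0])):
        if remaining == 0:
            break
        used = min(available, remaining)
        if used > 0:
            plan.append((route, used))
            remaining -= used
    return plan, remaining


# Hypothetical confirm messages for a failed link A-B that carried 10 units.
confirms = [
    (["A", "C", "E", "D", "B"], 3),
    (["A", "C", "B"], 4),
]
plan, unrestored = allocate_routes(10, confirms)
print(plan)        # [(['A', 'C', 'B'], 4), (['A', 'C', 'E', 'D', 'B'], 3)]
print(unrestored)  # 3
```

Each entry of the plan corresponds to one connect message the chooser would send. The two example routes share link A-C; a real implementation would have to account for that shared capacity, which this sketch ignores.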

    In the core of public networks, DCS-based meshed network topologies are
    promising structures for survivability mechanisms. In the
    customer-oriented parts of corporate networks, by contrast, distributed
    ATM-switching systems based on ring topology have recently been proposed
    [11]. These switches are already based on distributed control and,
    therefore, distributed survivability mechanisms are logical extensions of
    these control structures. However, not much practical experience has been
    gained yet.

    The second demonstrator will be based on a distributed multi-ring ATM
    switching network, which is described in detail in another paper [12] in
    this issue. The restoration mechanism for ATM multi-ring networks is based
    on a unique ATM switching element that is called a duplex ATM transceiver.
    These duplex transceivers are connected in a bidirectional ring using UTP.
    The duplex transceiver works like an ATM add-drop multiplexer. This adds
    the necessary redundancy into the ring connectivity to protect services
    from ring break-down. The duplex ATM transceiver can be programmed either
    as a normal user access node or as a bridge to interconnect two different
    rings. In this project a prototype duplex ATM transceiver is constructed
    from three simplex ATM transceivers.
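
The redundancy offered by a bidirectional ring can be illustrated generically: if the path in one direction crosses a broken link, the other direction still reaches the destination. The sketch below is not the duplex transceiver mechanism, whose details are given in [12] and not in this article; the function names and the four-node ring are hypothetical.

```python
def ring_path(ring, src, dst, step):
    """Walk the ring from src to dst; step=+1 is clockwise, -1 the reverse."""
    i = ring.index(src)
    path = [src]
    while path[-1] != dst:
        i = (i + step) % len(ring)
        path.append(ring[i])
    return path


def uses_link(path, link):
    """True if the path traverses the given link in either direction."""
    return any({a, b} == set(link) for a, b in zip(path, path[1:]))


def protected_path(ring, src, dst, failed_link=None):
    """Return a path that avoids the failed link, trying clockwise first."""
    for step in (+1, -1):
        path = ring_path(ring, src, dst, step)
        if failed_link is None or not uses_link(path, failed_link):
            return path
    return None  # not reachable for a single link failure with src != dst


ring = ["N1", "N2", "N3", "N4"]  # hypothetical ring, listed clockwise
print(protected_path(ring, "N1", "N3"))                            # ['N1', 'N2', 'N3']
print(protected_path(ring, "N1", "N3", failed_link=("N2", "N3")))  # ['N1', 'N4', 'N3']
```

A single link failure always leaves one of the two directions intact, which is why the fallback `None` is only a safeguard in this model.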

    Based on the duplex ATM transceiver concept, fast self-healing and dual
    homing algorithms are developed for the multi-ring based ATM CPNs. The
    multi-ring restoration system is then incorporated into the existing
    distributed switch control software. Its interaction with the distributed
    configuration and resource management functions is studied, and the
    restoration performance is evaluated.

    Conclusions and Targets of the Project

    Although the efforts in studying restoration techniques, evaluation
    methods, escalation strategies and management interaction form a
    substantial part of the IMMUNE project, the most tangible output will be
    the demonstration of restoration mechanisms on the two separate laboratory
    testbeds. Currently, Alcatel Bell and PKI are working on the hardware and
    software configuration and the integration of the restoration algorithms.
    In order to extrapolate the laboratory results to real-life networks, the
    simulation and planning tools described in the third section will be
    applied. Particularly for the distributed restoration algorithm, extensive
    simulations will be required to validate the results obtained on the
    five-node testbed.

    The performance evaluation tools will also be applied to validate the
    proposed strategies for escalation across network layers and parts. The
    evolution of this study field from theory to practical networks will
    inevitably lead to a need for standards regarding survivability,
    considering the multi-operator and multi-vendor environment of today's
    telecommunication networks. Following an incentive exerted by the RACE
    program, it is the intention of the IMMUNE project to contribute to
    standardization within the area of network survivability.

    References

    [1] M. Daneshmand et al., "Measuring Outages in Telecommunications
        Switched Networks," IEEE Commun. Mag., vol. 31, no. 6, June 1993.
    [2] K. Glossbrenner, "Availability and Reliability of Switched Services,"
        IEEE Commun. Mag., vol. 31, no. 6, June 1993.
    [3] RACE II - IMMUNE, "Requirements & Reference Configurations for
        Survivability," R2101/IMMUNE/BT/D&P/DS/P/002/b0, 31 Aug. 1994.
    [4] ITU-T Recommendation G.803, "Architectures of Transport Networks based
        on the Synchronous Digital Hierarchy," July 1992.
    [5] RACE II - IMMUNE, "Performance report on network restoration
        algorithms," Deliverable D7, Sept. 1995.
    [6] D. Johnson et al., "Distributed restoration in telecommunications
        networks," BT Technology Journal, vol. 12, no. 2, April 1994.
    [7] E. Lafuente and C. Alcazar, "Planning of High Capacity Transmission
        Networks with Flexibility," Proc. Sixth Int'l. Network Planning
        Symposium, Budapest, Sept. 1994.
    [8] ITU-T Recommendation M.3010, "Principles for a Telecommunications
        Management Network," Oct. 1992.
    [9] L. Nederlof, H. Vanderstraeten, and P. Vankwikelberge, "A New
        Distributed Restoration Algorithm to Protect ATM Meshed Networks
        against Link and Node Failures," Proc. ISS '95, April 24-28, 1995,
        Berlin, pp. 398-402.
    [10] Chow, J. Bicknell, and S. McCaughey, "A Fast Distributed Network
        Restoration Algorithm," Proc. Int'l. Phoenix Conf. on Computer and
        Communications, March 22-26, 1993, Tempe, AZ.
    [11] Y. Du, H. J. Reumerman, and R. Kraemer, "A Distributed Architecture
        for Switched ATM LAN," Proc. ISS '95, April 24-28, 1995, Berlin,
        Germany, pp. 484-488.
    [12] K. P. May et al., "Self-Healing ATM Rings for Local Area Networks,"
        to appear in this issue.

    Biographies

    LEO NEDERLOF received an M.Sc.E.E. from the Delft University of Technology
    in the Netherlands in 1992, and joined the Alcatel Corporate Research
    Centre in Antwerp, Belgium, in that same year. He has been working on
    network survivability and self-healing networks, and contributed to the
    RACE 1022 project and the RACE 2101 project IMMUNE. He is currently
    working on the implementation of a distributed restoration algorithm on a
    laboratory testbed of 5 ATM cross-connects.

    KRIS STRUYVE received an M.Sc. degree in electrical engineering from the
    University of Ghent, Belgium, in 1992. In 1993 he joined the Broadband
    Communications Network group at the Department of Information Technology
    at IMEC/University of Ghent. His activities are focused on research of
    distributed restoration techniques for ATM networks.

    CHRIS O'SHEA graduated from the University of Salford with a B.Sc. degree
    in electronic engineering in 1981 and was subsequently employed by BT
    Research within the Submarine Optical Systems Division. From 1986 to 1988
    he successfully completed a part-time M.Sc. in Telecommunications at the
    University of Essex. From 1992 to 1993 he supported the design of a
    standardized environment for element managers using a UNIX platform. He
    currently works within the Broadband Multiservice Networks Unit on
    advanced network restoration strategies for SDH/ATM technology
    development.

    HOWARD MISSER received an M.Sc. in electrical engineering from the Delft
    University of Technology (The Netherlands) in 1990. In 1990 he worked at
    the same university as a researcher in the field of telecommunication and
    traffic control systems. From 1991 until the first half of 1995 he worked
    at KPN Research, the research department of the Koninklijke PTT Nederland
    (Royal Dutch PTT). During this period he contributed to research in the
    field of broadband communications and reliability engineering. He recently
    joined the Tactical Planning department of the network division of PTT
    Telecom, The Netherlands.

    YONGGANG DU received a diploma and Ph.D. in electrical engineering from
    the Aachen University of Technology in 1985 and 1991, respectively. From
    1986 to 1991 he was involved in low bit rate still/motion image coding and
    image statistical modeling. In 1992 he joined the Philips Research
    Laboratories in Aachen. He has been active in the evaluation of ATM system
    performance within the RACE 1022 project, and in the development of
    distributed ATM switch architecture for CPN. He is currently responsible
    for the survivability research work for the Philips distributed ATM switch
    within the framework of the CEC project IMMUNE.

    BRAULIO TAMAYO graduated in 1968 from the Polytechnic University of Madrid
    as an electronic engineer. He joined ITT Standard Electrica in 1969 and
    was appointed project leader of PCM systems. In 1982 he moved to the ITC
    steering center for S-12 switching development programs, where he was
    later involved in the development of defense projects. Currently he is a
    member of the Alcatel SESA Network Architecture team at the Alcatel
    Corporate Research Centre in Madrid.

    IEEE Communications Magazine, September 1995