
QoS in Switched Industrial Ethernet

Linus Thrybom ABB AB, Corporate Research

Forskargränd 7 SE-721 78 Västerås, Sweden [email protected]

Gunnar Prytz ABB AS, Corporate Research

Bergerveien 12 NO-1396 Billingstad, Norway

[email protected]

Abstract

As Industrial Ethernet evolves it will increasingly include integration with the “IT network” in order to utilize the benefits which the use of Ethernet provides. This will result in a mixed protocol environment also in the industrial networks, which in turn will require proper usage of QoS in order to maintain the demanding requirements of latency, jitter and packet loss in the Industrial Ethernet protocols.

This paper highlights the emerging need for using QoS as well as some other related technologies in Industrial Ethernet networks and outlines some guidelines to achieve well-performing networks and efficient communication both for real-time control data and other less time critical data.

1. Introduction

This paper will address some important functions of switched Industrial Ethernet networks, why they are important and how they are best deployed in an automation system.

1.1. Ethernet is everywhere

Ethernet as defined by IEEE 802.3 has emerged as the wired communication medium of choice for almost any situation, be it at home, at the office or in the automation plant. The main reasons for the strong industrial interest in Ethernet technologies are the excellent relationship between price and performance, as well as the abundance of available technologies and solutions. As industrial systems grow bigger, more advanced and more complex, with ever tougher requirements, the days of the fieldbuses seem to be numbered, and a whole new set of industrial protocols has entered the scene.

Some examples of such industrial protocols targeting automation systems are PROFINET IO, EtherCAT, EtherNet/IP, Modbus TCP and IEC 61850. While these protocols have their differences and may have different levels of ambition they are all Ethernet based protocols in common use in industrial systems.

However, there are a number of significant differences between office or home networks and Industrial Ethernet networks. The requirements on determinism, and thus on the ability to respond with a predictable time delay, are typically much stronger in Industrial Ethernet systems, since the communication there typically relates to processes requiring responses on timescales from seconds down to microseconds. This was previously used as a strong argument against introducing Ethernet in automation systems, but history has shown that the requirements of industrial systems can be handled to a highly satisfactory degree by modern Industrial Ethernet solutions.

The traffic in an Ethernet network typically has an unpredictable behavior, so it is normally not possible to say whether delays may occur due to queuing in some of the devices (e.g. switches). See [14] and [15] for prediction of delay bounds in networks. For systems requiring the highest performance in terms of determinism, data update rates and communication jitter, a different type of solution has been developed. The so-called real-time Ethernet protocols (e.g. EtherCAT) can take full control of the network so that no unexpected delays or jitter occur. These types of networks are used in highly time-critical applications typically found within factory automation and are outside the scope of the present paper, which deals with switched Ethernet networks (although technologies such as EtherCAT can also be used on switched networks).

By keeping the amount of high priority packets in an Ethernet network at a limited level and only delaying low priority packets, the requirements of most automation systems can be met. This is the reason why switched Industrial Ethernet networks as defined by [1] are being deployed all over the industrial area, except in the cases where the real-time Ethernet protocols provide the only feasible solution. To put it differently, by applying QoS technologies in an appropriate manner, the Industrial Ethernet network will be able to service both the high-priority, process critical communication and the low-priority, non-critical communication at the same time.

978-1-4244-2728-4/09/$25.00 ©2009 IEEE

1.2. Quality of service

QoS (Quality of Service) may be understood both as a way to measure, and as the set of mechanisms to manage, network quality and availability. The quality of the network, and thus the result of applying QoS, is measured by the latency, jitter and packet loss in a network, and these three parameters are very important to control and measure in Industrial Ethernet applications. The automation network is becoming more and more integrated with the “IT network”, and if not already there, Industrial Ethernet networks will soon be multifunctional, carrying high priority real-time data as well as medium and low priority non real-time data.

In order to have a smooth and efficient migration path towards Industrial Ethernet, it is important that both the network device manufacturers and the industrial users understand the special features and functionalities which Industrial Ethernet offers, e.g. regarding QoS. The clear benefit of having control over the network is that it can be used optimally and cost-effectively, that the available network services can be exploited, and that advantage can be taken of the mass market that Ethernet technology represents.

The development of industrial switches is fast and ongoing. The good news is that the user gets better means to operate the network; the drawback is that when the complexity of the network increases, so does the complexity of managing it.

2. Example

2.1. Streaming video and process control data

The impact of using QoS in a network can be illustrated by a simple example. Consider a network consisting of a number of automation devices like process controllers and other process equipment. In addition there is some video equipment streaming video across the network. If the network is sufficiently complex, it is likely that the streaming video traffic will share some links with the automation related traffic coming from, say, a PROFINET IO device.

Without any QoS functionality, the PROFINET IO traffic may be significantly delayed in one or more switches of the network [14], since it may have to wait in line in a switch queue before being forwarded. The delay will depend on the amount of video packets, and under a certain network load the switch queues may become long; if a switch queue is completely filled when a PROFINET IO packet enters the switch, that packet will be lost. More likely, the PROFINET IO packet will be severely delayed, possibly to an extent that adversely affects the associated industrial process, perhaps also causing significant trouble.

By using QoS, this situation may be easily avoided. By letting the PROFINET IO traffic have a sufficiently high priority and running the streaming video traffic at a lower priority, the queuing delay of the PROFINET IO packet will be limited. Upon entering the switch, the high-priority PROFINET IO packet is put into the high-priority queue and only waits for any ongoing transmission to finish before being sent. This jump-ahead-of-the-queue functionality that QoS represents is therefore very efficient and improves determinism vastly.
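The improvement can be quantified with a back-of-the-envelope sketch. The figures below (a 100 Mbit/s link, 1500-byte video frames and a queue depth of 20 frames) are illustrative assumptions, not measurements from a real system:

```python
# Worst-case queuing delay for a control packet behind best-effort traffic.
# Assumed example figures: 100 Mbit/s link, 1500-byte video frames.
LINK_BPS = 100e6
FRAME_BITS = 1500 * 8

def tx_time(bits, link_bps=LINK_BPS):
    """Serialization time of one frame on the link, in seconds."""
    return bits / link_bps

# Without QoS: the control packet may wait behind an entire queue of
# video frames, here assumed to hold 20 of them.
delay_no_qos = 20 * tx_time(FRAME_BITS)   # 2.4 ms

# With strict priority: it waits at most for the one frame already in
# transmission, since switches do not preempt an ongoing frame.
delay_qos = tx_time(FRAME_BITS)           # 120 microseconds
```

Even under these mild assumptions, strict priority reduces the worst-case queuing delay by a factor equal to the queue depth.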

3. QoS tools and technologies

The main idea of QoS is to classify the packets, and treat them according to their classification in the network. The classification may be done in the end device or in the switch.

3.1. Priority tags

One of the most common QoS tools is to use a priority field in a protocol, with corresponding support in the network to treat high priority tagged packets differently from low priority tagged packets. In short, there is one priority field in the IP header and one priority field in the Ethernet header.

Typically, an end device application using the IP protocol uses the priority field in the IP header called the “ToS byte”. This byte has been redefined several times over the years; the latest definition is [11], which defines the “DiffServ Code Point (DSCP)”. However, this RFC is backward compatible with RFC 1812 [12], so the byte may be used either as “IP Precedence” or as “DiffServ Code Point (DSCP)” to indicate the priority of the IP packet. “IP Precedence” provides 8 priority levels and represents a simple way to implement QoS since it maps well to the 8 levels of “CoS”, the Layer 2 priorities. “DSCP”, on the other hand, is used by the DiffServ (Differentiated Services) feature, which provides four main service levels: Default Forwarding, Assured Forwarding, Expedited Forwarding and Class Selector. Default Forwarding provides “best effort”, Assured Forwarding provides a set of services above “best effort” in priority, and Expedited Forwarding represents the highest priority setting. The Class Selector is the backward compatibility service which thus implements “IP Precedence”.
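On a typical POSIX system, an application can set the ToS/DSCP byte of its outgoing packets directly through a socket option. The following is a minimal sketch; the choice of DSCP 46 (Expedited Forwarding) is only an illustrative example:

```python
import socket

def dscp_to_tos(dscp):
    """The DSCP occupies the upper 6 bits of the former ToS byte (RFC 2474)."""
    return (dscp & 0x3F) << 2

# Expedited Forwarding is DSCP 46, which corresponds to ToS byte 0xB8.
EF = dscp_to_tos(46)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Every IP packet sent on this socket will now carry DSCP 46 in its header.
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, EF)
```

The switches and routers along the path must of course be configured to honor the marking, as discussed below.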

Similarly, an end device application using the Ethernet protocol on Layer 2 may use the priority field defined in [1] called “CoS” (Class of Service), or “PCP” (Priority Code Point) to indicate the priority of the packet. This Layer 2 priority field provides 8 priority levels and is located in the so called “VLAN tag” [2].

The figure below illustrates where the priority field is inserted in the Ethernet frame and in the IP header.

Figure 1. QoS bits in MAC frame and IP header. The tagged Ethernet frame consists of Destination MAC, Source MAC, the VLAN Tag, EtherType/Length, Payload, Pad and CRC. The VLAN Tag itself is structured as follows:

TPID (Tag Protocol Identifier, 0x8100): 16 bits
PCP (Priority Code Point): 3 bits
CFI (Canonical Format Indicator): 1 bit
VLAN ID (VLAN Identifier): 12 bits

In the IP header (Version, Header length, TOS, Length, ID, Flags, Offset, TTL, Protocol, Checksum, Source IP, Destination IP), the TOS byte carries the IP Precedence or DSCP value.
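The bit layout of the VLAN tag can be made concrete with a short sketch that packs and unpacks the 4-byte 802.1Q tag (TPID plus the 16-bit Tag Control Information holding PCP, CFI and VLAN ID):

```python
import struct

TPID = 0x8100  # Tag Protocol Identifier for IEEE 802.1Q

def build_vlan_tag(pcp, cfi, vlan_id):
    """Assemble the 4-byte 802.1Q tag: TPID followed by the 16-bit TCI
    (3 bits PCP | 1 bit CFI | 12 bits VLAN ID)."""
    tci = ((pcp & 0x7) << 13) | ((cfi & 0x1) << 12) | (vlan_id & 0xFFF)
    return struct.pack("!HH", TPID, tci)

def parse_vlan_tag(tag):
    """Split a 4-byte 802.1Q tag back into (pcp, cfi, vlan_id)."""
    tpid, tci = struct.unpack("!HH", tag)
    assert tpid == TPID, "not an 802.1Q tag"
    return tci >> 13, (tci >> 12) & 0x1, tci & 0xFFF
```

For example, a packet with priority 5 on VLAN 100 carries the tag bytes 81 00 a0 64.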

Note the term “indicate the priority” in the paragraph above. The rest is actually up to the network: the configuration of the network stack in the end device, as well as the configuration of the switches and routers in the network. Even though the end device may set the priority value of the packet, with the advantage of a fine grained priority setting, it may in some cases be better to let the switch assign the priority to the packet. Some reasons for this are:

• It is easier to verify that the same priority settings apply to the whole network.

• There may be other types of traffic, e.g. FTP and HTTP, which don’t provide an easy way to apply QoS to the protocol.

• Lack of QoS support in low cost devices.
• There might be cases where an application uses different priority settings depending on the use-case.

If the configuration of the switch is the way to tune these parameters, it is easy to maintain control of the network. From a safety perspective it is also preferable to have the network control the priority, in order to decrease the impact of unexpected traffic.

3.2. In the end device

In order for an end device to benefit from a well-configured and efficient network, there are some requirements on the device itself. For an end device, the choice of PHY, Ethernet controller and network stack in the operating system are of high importance.

For example, good time synchronization requires IEEE-1588 support in hardware (e.g. the PHY), efficient multicast requires support both in the Ethernet controller and in the operating system and so on.

A device also needs a good RTOS (real-time operating system) and a good network stack in order to behave in a predictable manner (although sometimes an intelligent scheduler may be sufficient). The user wants the RTOS and network stack to be small and fast, but also smart enough to support e.g. traffic shaping. In fact, in several cases the bottleneck is not the network; the bottleneck is the network stack processing time in the end device [4].

Low cost end devices may only have a basic network stack which may not include packet classification functionality; in this case it is the switch which will apply the priority tag to the packet. More advanced end devices may on the other hand have the functionality to set the priority tag, and also to shape their egress (outgoing) traffic. Most network stacks for embedded applications do not provide prioritization inside the network stack itself. One might expect that the growth of e.g. VoIP applications would push the evolution of network stacks, but unfortunately it seems there is still some work to be done here. In general, a stronger focus on this issue among the RTOS suppliers would be appreciated.

The ability to support priority tagged traffic, to prioritize various stacks differently, to prioritize the various IP ports differently and more specifically to be able to set the “CoS” priority in the VLAN Tag should be a standard part of any modern RTOS, but this is far from being the case. These are shortcomings that industrial device vendors need to be aware of and that may have significant consequences in a practical application.
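As an illustration of the kind of API a network stack can offer for this, Linux exposes a per-socket priority that a VLAN interface's egress map can translate into the 802.1Q “CoS”/PCP bits. This is a Linux-specific sketch, not a feature every RTOS provides:

```python
import socket

# Linux-specific: SO_PRIORITY sets the kernel's packet priority, which a
# VLAN interface's egress map can translate into the 802.1Q PCP bits.
# The constant may be missing on other platforms; 12 is its Linux value.
SO_PRIORITY = getattr(socket, "SO_PRIORITY", 12)

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# Priorities 0-6 can be set without CAP_NET_ADMIN; 7 requires privileges.
sock.setsockopt(socket.SOL_SOCKET, SO_PRIORITY, 6)
```

An RTOS with comparable hooks would let the application differentiate its sockets in exactly the way the paragraph above calls for.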

3.3. In the switch

The switch treats an incoming packet roughly in the following flow: classification, policing, queuing, shaping and output. These phases are briefly described below.

In the classification phase, the network switch often supports several ways to apply priority to the packets it receives on its ingress ports. It may use the priority indicated by the end device, in the VLAN Tag [2], or the IP-priority. It may also use a fixed priority depending on the port the packet is received on, or depending on the destination address, or source address, or Ethernet protocol type, or IP protocol used. The switch may also be configured to set the Layer 2 priority tag of the packet depending on the classification, in order for the packet to maintain its priority in the network.
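The classification logic can be pictured as an ordered rule table mapping packet metadata to a priority. The sketch below is illustrative only; the rules and the port-based priority are invented examples, though 0x8892 is the real PROFINET EtherType:

```python
# Sketch of a switch ingress classifier. The rule set is an invented
# example; real switches expose similar match criteria via configuration.
DEFAULT_PRIORITY = 0

RULES = [
    # (predicate over packet metadata, function yielding the priority)
    (lambda p: p.get("pcp") is not None,     lambda p: p["pcp"]),  # trust tag
    (lambda p: p.get("ethertype") == 0x8892, lambda p: 6),         # PROFINET
    (lambda p: p.get("ingress_port") == 3,   lambda p: 4),         # port-based
]

def classify(packet):
    """Return the priority with which a packet is queued: the first
    matching rule wins, otherwise the best-effort default applies."""
    for match, prio in RULES:
        if match(packet):
            return prio(packet)
    return DEFAULT_PRIORITY
```

The switch would additionally rewrite the Layer 2 priority tag according to the outcome, so the classification survives across the rest of the network.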

Policing is used to protect the network from devices sending too much data; typically this can be a failed Ethernet controller. This feature may also be called “Ingress Port Rate Limitation” or “Storm Control”. In this case, the user may pre-configure a maximum allowed bandwidth for the device(s) connected to a specific port. If the ingress port traffic exceeds the settings, packets are simply dropped. The user must thus verify what bandwidth the device needs and how the traffic is scheduled, and must also observe the time window used for the bandwidth estimation; if this is short, even very short bursts of data will trigger the policing. If it is too long, then a “crazy sender” might be able to fill the switch queues before the policing starts to drop packets. If possible, it is generally good to apply shaping in the sending device to avoid bursts of packets.
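A common way to implement such a policer is a token bucket, where the configured rate refills the bucket and the bucket depth bounds the tolerated burst. A minimal, deterministic sketch (time is passed in explicitly; real implementations read a clock):

```python
class TokenBucketPolicer:
    """Ingress policer sketch: `rate` in bytes/s refills the bucket,
    `burst` in bytes caps it and thus bounds the tolerated burst size."""

    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst
        self.tokens, self.last = burst, 0.0

    def allow(self, size, now):
        """Forward a packet of `size` bytes arriving at time `now`?"""
        # Refill tokens for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True    # forward the packet
        return False       # police (drop) it
```

The trade-off discussed above is visible in the parameters: a small `burst` triggers on short legitimate bursts, while a large one lets a misbehaving sender fill downstream queues before drops begin.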

Some switches are also able to differentiate the port rate per type of packet, e.g. one limit for multicast, one limit for broadcast and one port rate limit for unicast packets. This may be handy if a device which e.g. only should send multicast streams of data is connected to the network. In this case the switch port may be configured to allow a high rate of multicast packets, but only allow a low rate of unicast and broadcast packets.

If a port reaches its configured maximum port rate, this may indicate an error case, and it is therefore important that the information is provided to the supervision system, e.g. as an SNMP trap or as an event in the data model of an Industrial Ethernet protocol, in order for the switch to report the error.

After classification and policing the packet is ready to be forwarded on an egress port, but in case the port is already busy transmitting, the packet is queued in the queue which is mapped to the priority level the packet was classified to. Many Industrial Ethernet switches today support 4 queues but here 8 queues would be preferred since this better maps the available priority levels of “CoS” and “IP Precedence”, the Layer 2 and Layer 3 priority fields. In fact, by having only 4 priority queues, and not 8, the ability of a device to prioritize traffic has been limited and this may result in unexpected behavior.

The queuing mechanism of a switch provides different scheduling algorithms which decide which packet to send first. In this sense, both the scheduling algorithm and the queue buffer size are thus important factors for the user. One common scheduling algorithm is “Strict Priority” where packets in the highest priority queue are always sent first. Other schedule principles include “Weighted Fair Queuing” and “Shared Round Robin”, where lower priority packets may be transmitted even if there is a higher priority packet waiting in the queue. The reasoning for this is that one wants to avoid starvation of the low priority packets and their applications. The queuing mechanism may also include algorithms to drop packets in case the queue buffer is full e.g. “Weighted Tail Drop”.
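The “Strict Priority” discipline described above can be sketched in a few lines: one FIFO per priority level, with the highest non-empty queue always served first. This is an illustrative model, not any particular switch's implementation:

```python
from collections import deque

class StrictPriorityScheduler:
    """Egress scheduler sketch: one FIFO queue per priority level;
    the highest-priority non-empty queue is always served first."""

    def __init__(self, levels=8):
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, priority, packet):
        self.queues[priority].append(packet)

    def dequeue(self):
        """Pick the next packet to transmit, or None if all queues are empty."""
        for q in reversed(self.queues):  # highest priority first
            if q:
                return q.popleft()
        return None
```

The starvation risk mentioned above is evident: as long as high-priority packets keep arriving, the lower queues are never visited, which is exactly why weighted variants exist.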

Finally the switch may be able to shape its egress port in order to limit the used bandwidth on the port; however in time critical applications this should be avoided since this adds latency and jitter to the packets.

It must also be ensured that the priority handling is maintained within the whole network and all switches/routers, since if one switch is badly configured, it may give the packets the wrong treatment, and this may very likely affect the performance of the network.

3.4. Integration of wired and wireless networks

Integration of gateways for wireless networks with the Industrial Ethernet network may also be an issue. In this case, the end device may have set its priority tag according to the specific wireless standard used, and it must then be mapped in the gateway to appropriate priority settings for the switched Ethernet network. Alternatively, the gateway may be configured to set the priority tag accordingly, or even the switch to which the gateway is connected may set the priority tag.
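As an example of such a mapping, a gateway for an IEEE 802.11e/WMM wireless network could translate the four WMM access categories into 802.1Q PCP values for the wired side. The specific PCP chosen per category below is a deployment decision assumed for illustration, not mandated by either standard:

```python
# Sketch of a wireless-to-wired priority mapping in a gateway.
# The PCP value per WMM access category is an assumed deployment choice.
WMM_TO_PCP = {
    "AC_BK": 1,  # background
    "AC_BE": 0,  # best effort
    "AC_VI": 5,  # video
    "AC_VO": 6,  # voice
}

def wired_pcp(access_category):
    """802.1Q priority tag to apply when forwarding onto the switched
    network; unknown categories fall back to best effort."""
    return WMM_TO_PCP.get(access_category, 0)
```

Whichever device performs the mapping, the important point is that it is done consistently with the priority scheme used in the rest of the network.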

3.5. QoS for industrial Ethernet protocols

Some of the Industrial Ethernet protocols have defined in their standards which priority tag and value to use; PROFINET IO devices use e.g. the Layer 2 “CoS” priorities 5 and 6 for their alarms. The IEC 61850 GOOSE (Generic Object Oriented Substation Event) protocol uses the Layer 2 priority with a default setting of 4, but the user may also set a priority of his own in the range 0-7. Fieldbus Foundation specifies FF HSE to use the Layer 3 priority as defined in [13], with the default setting “D=1” indicating minimum delay. This means that the “ToS byte” may be evaluated as DSCP and mapped into some corresponding “CoS” priority value by the switch. However, the user is able to change the value of all bits in the “ToS byte” in the FF HSE application. EtherNet/IP often uses the “CoS” value of 5, and this setting is to be done in the switch. The Modbus/TCP protocol does not define any specific use of priority.

As pointed out earlier, the fact that the Industrial Ethernet protocols support Layer 2 or Layer 3 priority is not enough to achieve the required low jitter and low latency through the network. The switches and the network must also be configured to apply QoS in a good manner, and this applies to e.g. the policing and scheduling phases of the switch. Policing provides some protection against erroneous end devices and is thus a feature that should be used where possible. In the queuing phase, “Strict Priority” is often the best choice for Industrial Ethernet solutions since this algorithm sends all the high priority packets before sending any lower priority packet.

To summarize, a user who wants to accomplish good QoS in an automation network consisting of multiple protocols must thus be careful and possibly reorder the priorities, since there are different requirements and needs regarding jitter and latency; a diagnostic message is, for example, less critical than an emergency event. In addition, the user should make sure that other protocols like FTP and HTTP only use the lowest priorities, and that the network is configured accordingly.

3.6. Good practices

From the discussion above and experience from industrial control systems, the following practices regarding the usage of QoS in Industrial Ethernet networks can be concluded:

• Enable/configure usage of the Layer 2 or Layer 3 priority in the network and use strict priority scheduling in the switch.
• Use port rate limitation where possible.
• Disable egress shaping on the switch.
• Limit the usage of the highest priorities; differentiate the priority settings.
• Shape the end device egress traffic when possible.
• Use an RTOS and network stacks with proper support for QoS.

4. Related tools, features and technologies

There are also some other features of switched Ethernet which indirectly influence the achieved QoS in a network, so they are also discussed here.

4.1. Flow control

The IEEE 802.3x (flow control) function gives the network a possibility to control the sending of the end devices. If the network detects congestion, it may send “stop” messages to its connected end devices until the traffic situation improves, after which the network sends “start” messages to its connected end devices. In Industrial Ethernet solutions, IEEE 802.3x flow control should be disabled since it stops all types of packets, even the high priority ones.

More interesting are instead the newly proposed IEEE 802.1Qau (congestion management) and the Per-Priority Pause function, as these are more fine grained and enable the network/switch to control the medium or lower priority traffic in case of network congestion.

4.2. Multicast

Multicast is “one to many” communication, in contrast to unicast, which is “one to one”, and broadcast, which is “one to all” communication. Correctly used, multicast will improve the efficiency of the network bandwidth usage.

However, some issues need to be addressed when it comes to multicast. Multicast is often used to provide publish/subscribe protocols, e.g. IEC 61850 GOOSE (Generic Object Oriented Substation Event) and DDS (Data Distribution Service), and is also a good solution when distributing video & audio to multiple destinations. Multicast is applied both on Layer 2 and on Layer 3, which may be a source of confusion.

The mechanism behind multicast (over IP) includes a membership protocol called IGMP; see [6], [7] and [8]. IGMP enables devices to subscribe or unsubscribe to a multicast group by sending “join” or “leave” messages to the network. Many Industrial Ethernet switches support IGMP snooping, which is a way for Layer 2 devices to benefit from a Layer 3 protocol: a switch with IGMP snooping snoops the IGMP messages from the end devices to find out which multicast groups it should forward to which ports.

The receiver of the IGMP messages is the router, or the “IGMP querier” located in one of the switches and used when no router is available. The IGMP querier or router regularly polls all multicast enabled devices for open multicast streams. The idea is that the router then knows which multicast streams it should forward into this network, and which not to forward. With IGMP snooping, this knowledge of which streams to forward to a specific port is extended into the network, since the switch also has it.

To summarize, using one IGMP querier and IGMP snooping enabled in the network, it is possible to use multicast very efficiently, since only those end devices which have sent the join message for a specific multicast stream will receive this multicast stream.

IGMP is specified in versions 1, 2 and 3. IGMPv1 [6] includes only the membership query sent by the querying router/switch and the membership report sent by the end device. IGMPv2 [7] additionally includes leave and join messages sent by the end device. IGMPv3 [8] adds support for bulk join and bulk leave messages, and also for joining or leaving a multicast stream from a specific source address. Note that both the network and the end device itself need to support the IGMP version in use.

One possible source of problems is however the multicast address aliasing effect. IP multicast uses the addresses in the range from 224.0.0.0 to 239.255.255.255, and thus a 28 bit address range. The multicast IP address is mapped to a Layer 2 MAC address, which is 48 bits, but Ethernet uses only 23 bits to differentiate between multicast addresses, as the other bits are fixed by IANA [5]. This means that 32 (2⁵) IP multicast addresses map to the very same MAC multicast address. When a received multicast packet is forwarded to the IP layer in the network stack, it is thus not certain that it belongs to a subscribed group. In this case it will be filtered by the IP layer, which is aware of the joined multicast groups, at the cost of CPU performance. This may also be the case when more multicast groups are used than the Ethernet controller supports; it is then also the task of the network stack to filter the multicast streams.
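The mapping, and the aliasing it causes, can be demonstrated in a few lines: the MAC address is the fixed IANA prefix 01:00:5e plus the lower 23 bits of the IP address, so any two groups differing only in the discarded 5 bits collide:

```python
import struct
import socket

def ip_to_multicast_mac(ip):
    """Map an IPv4 multicast address to its Ethernet MAC address:
    the fixed prefix 01:00:5e plus the lower 23 bits of the IP address."""
    addr, = struct.unpack("!I", socket.inet_aton(ip))
    low23 = addr & 0x7FFFFF
    return "01:00:5e:%02x:%02x:%02x" % (
        (low23 >> 16) & 0xFF, (low23 >> 8) & 0xFF, low23 & 0xFF)

# Aliasing: these two distinct groups differ only in bits that are not
# carried into the MAC address, so they share one MAC address.
assert ip_to_multicast_mac("224.1.1.1") == ip_to_multicast_mac("239.129.1.1")
```

A network planner can use such a check to ensure that user-chosen multicast groups do not collide with each other or, worse, with reserved management groups.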

More severe is the fact that the alias effect may overload the switch CPU itself, in case a management multicast address is aliased with a user multicast address. Some examples of reserved management multicast addresses are:

• 224.0.0.1 The “All Hosts” multicast group
• 224.0.0.2 The “All Routers” multicast group
• 224.0.0.5 OSPF
• 224.0.0.6 OSPF
• 224.0.0.9 RIP
• 224.0.0.18 Virtual Router Redundancy Protocol
• 224.0.0.22 IGMP Version 3

There are however switches which avoid the aliasing problem by using the IP multicast group instead of the MAC address for filtering purposes; in that case this is no longer an issue for IP multicast. However, it remains an issue in the Layer 2 multicast case.

4.3. Daisy chaining

Daisy chaining of end devices using embedded switches makes the installation easy and convenient. However, there are some drawbacks; to mention a few: the limited bandwidth, the added latency and the limited support for QoS in the embedded switches. A single point of failure may also influence several end devices. This is however not discussed further in this document; the topic has been touched upon in [3], which discusses the complex internal architecture of a switch together with some critical pitfalls that may cause serious problems when not handled properly.

4.4. VLAN

In Industrial Ethernet networks, the main usage of VLAN (Virtual LAN) is for security reasons, but it is also a way to limit the broadcast and multicast domain. The alternatives to VLAN should however be investigated thoroughly, due to the additional effort VLAN requires in e.g. configuration and maintenance of the network. The reason is that an end device in one VLAN is not “visible” from any other VLAN, and that a router is needed for any inter-VLAN communication. Also, for latency reasons, routing of time critical packets should in general be avoided. It should further be noted that VLAN cannot solve traffic congestion, since the packets will still share the same infrastructure. The traffic will rather increase, since the packets between the VLANs first go to the router in one VLAN and then out on the other VLAN, and these VLANs may share the same network segments and switches.

When using VLAN for security reasons, the switch may be configured to put any unauthorized end device in a dedicated VLAN with very restricted access to the network. The router then decides what access rights the unauthorized devices should have.

4.5. Redundancy protocols

Industrial Ethernet networks often require the higher availability provided by a redundancy protocol [9]. The protocol used does not directly influence the QoS parameters of a network; in case of a failure, however, there is an impact. During the reconfiguration time of the network, which may last from a few milliseconds up to some hundred milliseconds or even some seconds, the switches have no valid forwarding information and the network may thus lose packets.

The “IT networks” often deploy STP (Spanning Tree Protocol) or RSTP (Rapid Spanning Tree Protocol) to handle redundant paths between switches in the network, but the recovery time needed by these protocols is typically much longer than required by Industrial Ethernet real-time applications, although in the RSTP case this may depend on the actual implementation.

For the Industrial Ethernet use-case there is a set of proprietary ring protocols available, and almost every industrial switch manufacturer has its own protocol. These ring protocols are similar in the sense that they can provide network redundancy without any additional infrastructure cost, and each end device is connected to only one switch. The worst case recovery time is in the range of around 10 ms – 500 ms depending on protocol type and network size. The failure of a switch will only have impact on the end devices connected to the failing switch, whereas the rest of the network will be able to communicate, except during the recovery time.
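These recovery times translate directly into lost cyclic data for the affected devices. A rough, illustrative calculation (the function name and figures are ours, not taken from any protocol specification):

```python
import math

def lost_cyclic_samples(recovery_time_ms: float, cycle_time_ms: float) -> int:
    """Worst-case number of cyclic data samples lost while the ring
    reconfigures and the switches have no valid forwarding information."""
    return math.ceil(recovery_time_ms / cycle_time_ms)

# A 500 ms worst-case recovery with a 10 ms application cycle:
print(lost_cyclic_samples(500, 10))  # 50
```

For a control loop this means the application must tolerate a gap of that many updates, which is often the deciding factor when choosing between a fast ring protocol and RSTP.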

In addition there is ongoing standardization work (IEC 62439) which defines some new types of redundancy protocols targeting the tough recovery time requirements found in industrial networks. MRP (Media Redundancy Protocol) is part of the IEC 62439 standard and is a ring protocol for redundancy, thus similar to the proprietary solutions. Its worst case recovery time can be configured to be as low as 10 ms.

PRP (Parallel Redundancy Protocol [10]) is also part of IEC 62439 and is built upon two networks and end devices with two network interfaces attached to the two redundant networks. In PRP a “link redundancy entity” is inserted between the network interfaces and the upper layers in the devices. This redundancy entity ensures that packets are sent on both interfaces, and it filters received duplicates before forwarding them to the upper layers. One big advantage of PRP is that there is no recovery time at all, since the whole network infrastructure, including device interfaces, is duplicated. HSR (High Availability Seamless Redundancy) can be seen as an extension of PRP with an inbuilt switch in the devices. It is planned to be published as an international standard as part of IEC 62439 edition 2.
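The duplicate filtering performed by the link redundancy entity can be sketched as follows. This is a deliberately simplified model; the real drop algorithm, window handling and frame format are defined in IEC 62439-3, and the class name here is ours:

```python
class DuplicateFilter:
    """Simplified PRP-style duplicate discard: remember the last few
    sequence numbers seen from each source and drop repeats arriving
    via the other redundant network."""

    def __init__(self, window: int = 16):
        self.window = window
        self.seen = {}  # source -> list of recently seen sequence numbers

    def accept(self, source: str, seq: int) -> bool:
        recent = self.seen.setdefault(source, [])
        if seq in recent:
            return False  # duplicate of a frame already delivered
        recent.append(seq)
        if len(recent) > self.window:
            recent.pop(0)  # forget the oldest entry
        return True

f = DuplicateFilter()
print(f.accept("dev-A", 1))  # True  (first copy, delivered)
print(f.accept("dev-A", 1))  # False (second copy, discarded)
print(f.accept("dev-A", 2))  # True
```

Because both copies are always in flight, a single link or switch failure simply means one copy is missing and the other is delivered, with no reconfiguration gap.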

One thing to consider when selecting a redundancy protocol is of course the recovery time. Another important consideration in the non-PRP/HSR case is the behavior of multicast and IGMP during the recovery process, since the switch may lose its IGMP subscriber information and thus stop forwarding multicast packets until the next IGMP query is sent by the querier. At the moment the proprietary protocols are usually the fastest alternative and are therefore often the preferred choice.

4.6. Usage of QoS related technologies in industrial Ethernet protocols

The FF HSE and EtherNet/IP protocols use IP multicast, and the network should in this case provide IGMP support for best performance, as described above. (The EtherNet/IP protocol may also use broadcast.) It is also the responsibility of the user to notice and avoid the aliasing effect described above.

The IEC 61850 and PROFINET IO protocols use only the Layer 2 part of multicast, which means that no IGMP functionality is used in the network. So far these protocols have been used in separate networks with almost no other network traffic, and then it does not matter that the switch floods the ports with the multicast packets. As Industrial Ethernet evolves there will very likely be larger networks, with multiple protocols. Then it will be important to handle both Layer 2 and Layer 3 multicast groups in the same network in an efficient way.

One way to limit the Layer 2 multicast flooding is to map each multicast group to a specific VLAN, so that the flooding is limited to that VLAN only. The drawback, however, is that these end devices likely need to send unicast packets to end devices in other VLANs, which requires routing.

A more elegant solution would be to use GMRP (GARP Multicast Registration Protocol), which is supported by some advanced switches today and works much in the same way as IGMP does on Layer 3. Obviously, the devices also have to support GMRP in order for this to work. Another Layer 2 solution is to use MMRP (Multiple MAC Registration Protocol), defined in IEEE 802.1ak-2007 as the replacement for GMRP, to register group MAC addresses in the switch; however, this protocol is currently not widely supported in switches and network stacks.

A third solution is to manually configure the switch with the used MAC multicast addresses via e.g. the switch’s web interface.
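The effect of such static configuration on forwarding can be sketched as follows. The class and method names are illustrative, not any vendor's API: a statically registered group MAC is forwarded only to its member ports, while unregistered multicast is flooded to all other ports.

```python
class L2Switch:
    """Toy model of Layer 2 multicast forwarding with a static
    filtering table, as set up e.g. via a switch's web interface."""

    def __init__(self, num_ports: int):
        self.ports = set(range(num_ports))
        self.static_groups = {}  # group MAC -> set of member ports

    def add_static_group(self, mac: str, member_ports):
        self.static_groups[mac] = set(member_ports)

    def egress_ports(self, dst_mac: str, ingress_port: int) -> set:
        # Registered groups go to their members; unknown multicast floods.
        members = self.static_groups.get(dst_mac, self.ports)
        return members - {ingress_port}

sw = L2Switch(8)
sw.add_static_group("01:00:5e:01:01:01", {1, 2, 3})
print(sw.egress_ports("01:00:5e:01:01:01", 1))  # {2, 3}
print(sw.egress_ports("01:00:5e:09:09:09", 0))  # flooded to ports 1-7
```

Manual configuration achieves the same filtering as GMRP/MMRP, at the cost of having to keep the tables up to date by hand when devices are added or moved.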

The daisy chaining and redundancy concepts discussed above are rather topology aspects of the network and do not depend on the specific Industrial Ethernet protocol used; nevertheless they influence the Industrial Ethernet network behavior and should be considered accordingly. One difference noticed is that EtherNet/IP systems are recommended by manufacturers to use the STP/RSTP redundancy protocol, whereas e.g. IEC 61850 and PROFINET IO are often used with ring protocols, either proprietary ones or MRP.

4.7. Good practices

From the discussion above and experience from industrial control systems, the following practices regarding the usage of QoS related technologies can be concluded:

• Disable flow control.
• Use multicast when appropriate, with IGMP and IGMP snooping, or GMRP/MMRP.
• Only use VLAN when it is really needed, e.g. for security reasons.
• Use proprietary ring protocols instead of RSTP, and be prepared to use the standardized protocols in a few years.

5. Conclusion

5.1. Trend

The trend towards more data, faster networks, tighter integration with “IT networks”, wireless networks and remote access will continue. There is also a trend from a controller centric automation network towards a network centric automation network. One aspect that then needs special attention is QoS; other aspects are e.g. security, availability and usability.

The industrial device vendors and switch manufacturers have also realized this trend and market, and provide the Industrial Ethernet users with more or less appropriate solutions. However, there are still some features that are missing. Here is a current wish list:

• Support 8 priority queues in the switch.
• Support GMRP/MMRP in the switch for MAC multicast registration.
• Improved RTOS and network stack support regarding QoS.
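To illustrate the first wish: with eight egress queues a switch can serve time critical traffic ahead of best-effort traffic, one queue per 802.1p priority. A minimal strict-priority sketch (strict priority is only one possible scheduling discipline, and the class name is ours):

```python
from collections import deque

class StrictPriorityScheduler:
    """Eight FIFO egress queues; always dequeue from the highest-priority
    non-empty queue (7 = highest, matching IEEE 802.1p priorities)."""

    def __init__(self):
        self.queues = [deque() for _ in range(8)]

    def enqueue(self, priority: int, frame: str):
        self.queues[priority].append(frame)

    def dequeue(self):
        for q in reversed(self.queues):  # scan from priority 7 down
            if q:
                return q.popleft()
        return None  # all queues empty

s = StrictPriorityScheduler()
s.enqueue(1, "best-effort")
s.enqueue(7, "control")
print(s.dequeue())  # control
print(s.dequeue())  # best-effort
```

With fewer hardware queues, several priority classes must share a queue and the scheduler can no longer keep them apart, which is why eight queues are on the wish list.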

5.2. Finally

Automatic QoS settings are available in some switches today; this is better than nothing, but to achieve good control of the data flow in the network regarding latency, jitter and packet loss, QoS must be understood and applied to the whole system. The fusion of the “IT network” side and the automation network side will require some effort, but it will also result in a more efficient networking solution and automation system.

This paper has given an introduction to QoS from the user perspective of an Industrial Ethernet network, and highlighted some issues for the user to take care of. Furthermore, the paper has highlighted some issues which the network industry needs to take seriously and improve on.

References

[1] IEEE 802.1D-2004. Media Access Control (MAC) Bridges. IEEE 2004.

[2] IEEE 802.1Q-2005. Virtual Bridged Local Area Networks. IEEE 2005.

[3] Skaalvik, J. and Prytz, G., “Challenges related to automation devices with inbuilt switches”, 7th IEEE International Workshop on Factory Communication Systems WFCS 2008. Proceedings, pp. 331-329, 2008.

[4] Tor Skeie, Svein Johannessen and Oyvind Holmeide, “The Road to a Deterministic Ethernet End-to-End”, 4th IEEE International Workshop on Factory Communication Systems WFCS 2002. Proceedings, pp. 3-9, 2002.

[5] http://www.iana.org/ Internet Assigned Numbers Authority (IANA)

[6] http://www.ietf.org/rfc/rfc1112.txt IETF, RFC 1112, IGMPv1


[7] http://www.ietf.org/rfc/rfc2236.txt IETF, RFC 2236, IGMPv2

[8] http://www.ietf.org/rfc/rfc3376.txt IETF, RFC 3376, IGMPv3

[9] Prytz, G: “Redundancy in industrial Ethernet networks”, 6th IEEE International Workshop on Factory Communication Systems WFCS 2006. Proceedings, pp. 380-385, 2006.

[10] Kirrmann, H, Hansson, M. and Muri, P, “IEC 62439 PRP: Bumpless recovery for highly available, hard real-time industrial networks”, 12th International Conference on Emerging Technologies and Factory Automation ETFA 2007. Proceedings, pp. 1396-1399, 2007.

[11] http://www.ietf.org/rfc/rfc2474.txt IETF, RFC 2474, Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers

[12] http://www.ietf.org/rfc/rfc1812.txt IETF, RFC 1812, Requirements for IP Version 4 Routers

[13] http://www.ietf.org/rfc/rfc1122.txt IETF, RFC 1122, Requirements for Internet Hosts - Communication Layers

[14] J. Jasperneite, P. Neumann, M. Theis and K. Watson, “Deterministic real-time communication with switched Ethernet”, 4th IEEE International Workshop on Factory Communication Systems WFCS 2002. Proceedings, pp. 11-18, 2002.

[15] A. Ermedahl, H. Hansson and M. Sjodin, “Response-time guarantees in ATM networks”, 18th IEEE Real-Time Systems Symposium 1997. Proceedings, pp. 274-284, 1997.