Chapter 1 – How the ServerIron Works
Methods and Modes
Foundry Networks provides load balancing of TCP/IP applications through its product known as the ServerIron. Many different topologies are supported, each with its own characteristics and feature set.
The current release of the SI code accomplishes load balancing via NAT. A soon-to-be-released software (and possibly hardware) upgrade will allow for both proxy and NAT operation. In many cases proxy load balancing is not needed, so it is not discussed here.
With NAT, the load balancing device is transparent: a connection request is passed to the device, information inside the packet is translated, and the packet is passed on to a real server. When the real server responds, the information is translated back and the response is sent to the requester. What this type of device provides is speed and simplicity, and most sites' load balancing requirements are met with a NAT-based load balancer.
As you will read in this document, Foundry Networks has even found a mode that speeds up responses from the real servers while remaining NAT-based.
Methods and Modes
The figures throughout this document give examples of the topologies that can be formed
by two of the configuration methods of the Foundry Networks ServerIron product. These
methods are:
• Single SI – a single ServerIron that provides load balancing with no backup mechanism
• Redundant SIs – ServerIrons that can back each other up while providing load balancing. The ServerIron provides two modes of redundancy:
– Single Active SI with switch backup – a single active SI that allows for switch-level redundancy
– Dual Active SI with VIP backup – dual active ServerIrons that allow for VIP-level redundancy
For the redundant configuration, a ServerIron backs up another ServerIron either on a full
SI level or on a VIP level. This backup ServerIron virtually mirrors the configuration of
its active partner.
In the Single-Active SI configuration, one ServerIron is servicing all requests and the
other is dormant. In the event of a failure in the active SI, the dormant SI becomes
active.
In the Dual-Active SI configuration, two ServerIrons are active, each servicing requests for its own unique VIP(s). Each ServerIron in this case can back up the other's VIP(s) in the event of a failure.
Each of the above methods can implement either of two modes of operation:
• DSR – Direct Server Return, which allows the return path of the data flow to be
between the real server and the requestor. The SI is not involved in the return path.
• Classic-SLB – forces the ServerIron to be the intermediate device between the real server and the requestor.
All of the above accomplish the same goal: resiliency. The purpose of these topologies is to provide fault-resilient load balancing for inbound requests to services residing on the servers.
Concepts
During installation, the ServerIron is configured with three things:
• the VIP(s) IP address and TCP/UDP port numbers for all the ports (applications) that
will be supported
• the IP addresses and TCP/UDP port numbers of the “real” physical servers. Real
servers are those devices for which the ServerIron is load balancing. Included are the
individual TCP/UDP port numbers that each server supports
• the binding of the VIP port numbers to the real server port numbers.
The ServerIron is assigned one or more "virtual" IP addresses (VIPs); these are the addresses to which outside connections are routed. The IP address assigned to the VIP is the publicly known address (it is the one in DNS). A user requesting www.bicycles.com is given an IP address by DNS, and the connection attempt is then routed to the IP address of the ServerIron (the VIP). Based on inbound connection requests (TCP/SYN packets), a session table is built.
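To make the session table concrete, the sketch below models it as a simple lookup structure. This is purely illustrative Python, not ServerIron code; the addresses and the naive round-robin choice are invented for the example.

# Illustrative sketch only -- not ServerIron source code.
# A TCP/SYN to the VIP creates a session entry keyed by the client's
# source IP/port; later packets of the flow reuse the same real server.

sessions = {}

def handle_syn(src_ip, src_port, dst_port, real_servers):
    """Pick a real server for a new flow and remember the choice."""
    server = real_servers[len(sessions) % len(real_servers)]  # naive round robin
    sessions[(src_ip, src_port, dst_port)] = server
    return server

def lookup(src_ip, src_port, dst_port):
    """Later packets of an established flow hit the table, not the balancer."""
    return sessions.get((src_ip, src_port, dst_port))

handle_syn("192.0.2.7", 1025, 80, ["10.1.1.1", "10.1.1.2"])
print(lookup("192.0.2.7", 1025, 80))   # -> 10.1.1.1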
A connection attempt to the VIP is translated to the IP address of a real server. The real server is selected by the ServerIron using any one of three balancing metrics. The inbound packet is translated and sent to the real server, which processes it and sends it back. The ServerIron intercepts this packet, translates it, and sends it back to the requester. This is known as the classic-SLB mode of operation.
Another mode of operation is the Direct Server Return (DSR) mode. The processing of the user request is the same: the packet is received by the ServerIron and translated toward the real server. However, in this mode the real server is allowed to respond directly to the requester, without the ServerIron as the intermediate device.
Both modes of operation are fully detailed in later chapters. This is simply a brief
introduction.
ServerIron Hardware
Fully redundant, these are actually Backbone-class L2 switches with L4 switching functions; they are not routers. The ones used in the diagrams are 16-port switches. The ServerIron is available with 8, 16, or 24 ports of 10/100 and up to 8 Gigabit ports (using the TurboIron/8 product); optionally, the ServerIron software can run on the TurboIron/8 platform as well, and soon the ServerIron will run on the BigIron platform. The ones pictured contain 16 10/100 ports and 2 Gigabit ports, with each port capable of full duplex operation. The connectivity between all the switches can be any combination of Gigabit and 10/100. You can use the Gigabit ports on the ServerIron to downlink to the BI4000s or as uplinks to the NetIrons. You can also achieve high-speed connectivity by trunking between any of the switches. The NetIrons and ServerIrons support trunking of up to four 10/100 ports or two Gigabit ports. For connectivity between the ServerIrons (active and standby), trunking is recommended; this is for redundancy, not necessarily speed. Trunking is the process of combining multiple physical ports into one logical port. For example, you can combine Ethernet interfaces 1 through 4 into one logical port; all of the parameters for the link are then written to Ethernet interface 1.
Startup without redundancy
In non-backup operation (an SI with backup is covered under the heading Startup with redundancy), a single SI reads its configuration file and from that determines who the real servers are. The SI ARPs for these servers and, when it receives a response, marks the MAC address and the port on which the response arrived. In this way it knows where its servers are, and those servers are not bound to any particular physical port. Since the SI can handle 1,000,000 sessions, all sessions can be dynamically bound to a single interface; the SI is not limited to a fixed number of sessions per interface.
Other health checks are performed before a server is placed on the active rotation list. By default, these include PINGing the IP address of the real server and, for well-known TCP-bound ports, attempting a TCP ACTIVE OPEN request as a single health check. Beginning with release 5.0, the network administrator can declare which ports are TCP and which are UDP, allowing TCP active-open health checking of "not-so-well-known" ports as well as UDP health checking. Once all services have been positively acknowledged, the server is placed on the active rotation list. The ServerIron then waits for TCP/SYN connection packets and assigns each connection to a particular server using one of three configurable load balance metrics (detailed later):
• Round Robin
• Least connections
• Weighted
For sessions already built in the table, an inbound TCP/UDP packet is examined for its port number. Having extracted this information, the ServerIron makes a load balance decision: the datagram is forwarded to a particular server supporting the requested inbound TCP/UDP port number. The server is selected by round robin, least connections, or weights. After this selection, an IP address translation takes place and the packet is forwarded to that server.
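A minimal sketch of the three metrics follows, in illustrative Python. The server list, connection counts, and weights are hypothetical, and real weighted scheduling distributes connections proportionally, which is simplified here to preferring the largest weight.

# Illustrative sketch of the three configurable balance metrics.
from itertools import cycle

servers = [
    {"ip": "10.1.1.1", "connections": 12, "weight": 3},
    {"ip": "10.1.1.2", "connections": 4,  "weight": 1},
]

rr = cycle(servers)

def round_robin():
    return next(rr)                                       # each server in turn

def least_connections():
    return min(servers, key=lambda s: s["connections"])   # least loaded wins

def weighted():
    return max(servers, key=lambda s: s["weight"])        # simplified weight bias

print(round_robin()["ip"], least_connections()["ip"], weighted()["ip"])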
Redundancy
Depending on the method of implementation, there are two backup scenarios:
• ServerIron Redundancy
• Real Server Redundancy
Let's discuss ServerIron redundancy first. There are always two physical ServerIrons involved, usually in the same broadcast domain, but they operate differently depending on the configuration, as detailed below.
Single-Active SI with Backup
In this backup configuration, one SI is the active SI and the other becomes the standby (backup) SI. The active SI and the standby SI have exactly the same configuration files. The backup command in each of the SIs carries the MAC address that is to be shared between the two SIs. A backup link is used between the two SIs; it not only carries heartbeat signals but also allows information about sessions to be transferred between the two SIs (classic-SLB mode only; this mode of operation is detailed later). ARP, MAC, session table, and session statistics information is transferred between the two SIs. The standby SI is dormant: it does not answer ARPs on behalf of the VIPs and does not pass traffic (there are a few exceptions, but those packets do not interfere with the operation). Heartbeat signals are sent between the two SIs every 1/10th of a second. If these signals are missed over a period of a second, the standby SI determines whether the active SI is still alive by checking its data path interfaces (all the interfaces except the backup link interface). If the standby SI can "hear" the active SI, no convergence takes place (if you have a sniffer on-line, these packets are identified by the special MAC address 00E0.5200.0000). They serve no purpose except to indicate that the active SI is alive; the standby SI does not send these packets, and if no backup command exists in the config file, they are not sent at all.
If the standby SI does not hear from the active SI by either method, the standby SI becomes the active SI. It immediately ARPs itself to allow all L2 forwarding devices to update their forwarding tables. ARP tables need not be updated, for the standby SI uses the same MAC address for all of its VIPs as the active SI did; as far as ARP tables are concerned, the same switch is operating.
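The failover decision can be summarized with the sketch below: illustrative Python using the timings from the text (1/10th-second heartbeats, one-second timeout), with the data-path test reduced to a hypothetical callback.

# Illustrative failover logic for the Single-Active backup scheme.
HEARTBEAT_INTERVAL = 0.1   # active SI sends a heartbeat every 1/10th second
FAILOVER_TIMEOUT = 1.0     # standby reacts after a full second of silence

def standby_decision(seconds_since_heartbeat, hears_active_on_data_path):
    """Decide whether the standby SI should take over."""
    if seconds_since_heartbeat < FAILOVER_TIMEOUT:
        return "stay standby"              # backup link is healthy
    if hears_active_on_data_path():
        return "stay standby"              # active alive; only the link failed
    # Take over: the gratuitous ARP updates L2 forwarding tables; ARP caches
    # need no change because the shared MAC address stays the same.
    return "become active, send gratuitous ARP"

print(standby_decision(2.0, lambda: False))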
Dual-Active SIs with Backup
For Dual-Active SIs with a backup configuration, both SIs are active at L2, each serving the configured VIPs for which it has priority. Each SI is configured with one or more VIPs, and each VIP carries a priority; the higher the priority, the more likely that SI will be the master for the VIP. Each SI then monitors the other to ensure that the active SI for a VIP remains active. If for any reason that SI is not heard from, the SI that is standby for the VIP makes that VIP active on itself.
In a Dual-Active SI configuration (current software release 5.x), there is no backup command associated with the SI. Furthermore, there is no private link between the two SIs; the SIs find each other through the data path and must be in the same broadcast domain. A backup link may be required in a later release.
Real Server Redundancy
This type of redundancy is provided through the ServerIron. Yes, redundancy can be provided for the real servers by placing multiple NIC cards in them, but the ServerIron can provide redundancy as well.
There is a special command known as remote-name. When a ServerIron has determined that all of its real servers have failed, it can use the servers listed under remote-name: requests for services of that ServerIron's VIP are redirected to the IP address indicated by the remote-name. The remote-name can be another VIP on another SI, or that of a real server.
For HTTP requests only, the ServerIron can also provide redundancy when all of its real servers fail by issuing an HTTP redirect to the requester. This is an HTTP 1.0/1.1 response that redirects a request to another server.
Data-flow with Classic-SLB
In classic-SLB mode, the ServerIron must be in the direct path of the data flow. An inbound connection is attempted to the VIP on the ServerIron. This request is translated and passed through to the real server (no connections are terminated on the ServerIron). This means that both the inbound and the return path must physically pass through the SI.
The destination IP address of the inbound packet is rewritten (from the VIP address) to the real server's IP address, the IP checksum is recalculated, the destination MAC address of the real server is placed in the data link header, and the packet is forwarded to the real server. This is known as partial NATing of the address. Running the ServerIron in the non-DSR operating environment forces the return path of a forwarded packet back through the SI for translation. Datagrams received from the real servers are translated again, placing the VIP's IP address in the source IP address field of the IP header; the checksum is recalculated and the packet is forwarded to the appropriate next hop. In this way, the Foundry Networks ServerIron products are extremely fast and provide complete transparency to users and servers.
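The partial NAT described above can be pictured as follows. This is an illustrative Python sketch in which a packet is a plain dictionary and header_words stands in for the 16-bit words of the already-rewritten IP header; real hardware does this per field, in silicon.

# Illustrative sketch of classic-SLB "partial NAT". Only the destination
# IP, its checksum, and the destination MAC change on the inbound path.

def ip_checksum(header_words):
    """Ones-complement sum over the header's 16-bit words (checksum field 0)."""
    total = sum(header_words)
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def slb_inbound(packet, real_ip, real_mac):
    packet["dst_ip"] = real_ip                        # VIP -> real server
    packet["dst_mac"] = real_mac                      # deliver on the segment
    packet["checksum"] = ip_checksum(packet["header_words"])
    return packet

def slb_return(packet, vip_ip):
    packet["src_ip"] = vip_ip                         # hide the real server
    packet["checksum"] = ip_checksum(packet["header_words"])
    return packet

pkt = {"dst_ip": "1.1.1.1", "dst_mac": None,
       "header_words": [0x4500, 0x0028], "checksum": 0}
print(slb_inbound(pkt, "10.10.10.1", "00e0.5200.0001")["dst_ip"])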
Data-flow with Direct Server Return (DSR) mode
When designing the network, you should allow the real server to forward packets directly back to the requester. The return packet can traverse the active SI, but its speed of return is then greatly reduced compared to bypassing the originating SI. In other words, the return path should not include the ServerIron; it can be there, but it does not have to be. Having the SI as an intermediate device on the return path forces a received packet to be sent to the processor for path determination, which reduces the speed to the non-DSR effective throughput. This is demonstrated later in the topology design chapter of this paper. Therefore, if designing a DSR topology, you should use either the single SI with no backup or dual-active SIs with backup.
The DSR function of the ServerIron allows the real server to communicate directly with the requestor on the return path of the data flow; the return path bypasses the ServerIron and no translation takes place on it. This allows tremendous speed improvements, up to the wire speed of the topology; the improvement is governed by the available bandwidth of the path back to the requestor.
The details of how DSR works are given in Chapter 4, but a brief description follows. DSR operates on the idea that there will be two or more identical IP addresses in the same broadcast domain, serving two functions. One is the IP address of a VIP on the ServerIron (the logical ports associated with this VIP are marked DSR-mode); the same IP address is also placed on the real server or servers. The law of IP uniqueness is stretched here, but no other station on the broadcast domain will know, because of a little-known property of the loopback interface.
The real server is now assigned multiple IP addresses: one for the NIC and two for the loopback interface. The NIC is assigned a unique IP address; the loopback interface is assigned the same address as the VIP on the ServerIron (alongside its default loopback address). Each real server in the SI's rotation list for a port number will therefore have the same IP address. Uniqueness is preserved by the inability of a loopback interface to respond to ARP requests: when the IP address is ARPed for on any inbound interface, it is the ServerIron that responds, and the inbound packet is sent to the ServerIron. In turn, the ServerIron sends this packet to the next real server in the rotation list, translating only the destination MAC address; the packet is then forwarded to that real server.
As far as the real server is concerned, the packet was forwarded directly to it by the requestor, not by the ServerIron. The real server processes the packet and responds directly to the requester. No translation of any header occurs on the return path of this data flow.
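Contrast this with the classic-SLB rewrite sketched earlier: in DSR the inbound rewrite touches only the frame header, as in this illustrative sketch (the MAC values are invented).

# Illustrative DSR inbound rewrite: only the destination MAC changes.
# The destination IP stays the VIP, which the chosen real server also
# carries on its loopback interface, so the server accepts the packet.

def dsr_inbound(packet, real_server_mac):
    packet["dst_mac"] = real_server_mac   # steer the frame to the chosen server
    return packet                         # IP header untouched: no checksum work

pkt = {"dst_ip": "1.1.1.1", "dst_mac": "00e0.5200.00ff"}
print(dsr_inbound(pkt, "00aa.bbcc.0001"))
# There is no return-path function at all: replies bypass the ServerIron.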
As you learned with classic-SLB, translation occurs on both the forward and the return path of the data flow: when not using Source-NAT (the ability of the ServerIron to translate the source IP address before sending the packet to the real server), the SI translates the destination MAC, destination IP, and IP checksum. With DSR enabled, translation occurs only on the inbound path, and only the destination MAC is translated. This improves the speed of forwarding the packet to the real server. On the return there is no translation, for the packet is forwarded directly back to the requestor at whatever speed the bandwidth allows. In many environments, speeds up to 2 Gbps are attainable. Foundry Networks holds the speed record for load balancing!
Health Checks
Load balancing is just one feature that Foundry Networks ServerIrons provide. Without some capability for health checking the services being load balanced, you may as well provide only round-robin DNS.
Foundry Networks provides protocol level health checks for Layers 2, 3, 4, and 5-7.
These are defined as:
• Layer 2 (ARP)
• Layer 3 (PING for UDP and server connectivity)
• Layer 4 (TCP Active Open [on a per TCP port basis])
• Layer 4 for not-so-well-known ports – assign "unknown" ports as TCP ports and health check them
Along with the L4 health checks, the ServerIron allows timers and retry counters to be assigned to the health check intervals (Layer 3 and higher). With this, we can health check at any time interval.
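A Layer 4 TCP active-open check amounts to attempting a handshake against the service port within a timeout, retrying a configured number of times. The sketch below shows the idea in illustrative Python; the timeout and retry values are hypothetical defaults, not the SI's.

# Illustrative sketch of a TCP active-open (Layer 4) health check.
import socket

def tcp_active_open(ip, port, timeout=2.0, retries=3):
    """True if the port completes a TCP handshake within `retries` attempts."""
    for _ in range(retries):
        try:
            with socket.create_connection((ip, port), timeout=timeout):
                return True          # SYN / SYN-ACK / ACK completed
        except OSError:
            continue                 # timeout or RST: retry
    return False

# A server joins the active rotation list only after its checks pass, e.g.:
#   if tcp_active_open("10.10.10.1", 80): rotation_list.append(server)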
But proving the existence of an application does not prove it can respond to requests. Therefore Foundry Networks provides application-level (Layer 7) health checks for the following protocols:
• URL (HTTP)
• TELNET
• FTP – (port 21)
• POP3
• SMTP
• IMAP4
• RADIUS-old (port 1645)
• RADIUS (port 1812)
• DNS
• LDAP
• NNTP
User-defined Health Checking (port profiles)
Besides these application layer checks, the ServerIron offers the ability to define ports as TCP or UDP and have those ports application checked as well. For example, let's say you have defined multiple instances of a Web server to run on a single machine. To accomplish this, you must assign a unique port number to each instance of the Web server. With the ServerIron's port profile feature, you can tell the ServerIron which port has been defined, assign it a TCP attribute, and even assign it Layer 7 URL health checks. Another example is an application built in house that runs on a TCP stack and is assigned port number 30000: the ServerIron can be told that this is a TCP port and will then provide Layer 4 health checks on it.
Congestion Avoidance
There are two features the ServerIron exhibits for congestion avoidance.
• QoS using access policies
• Server reassignment
The ServerIron provides multiple methods to achieve QoS. QoS policies can be set by TCP/UDP port number, MAC address, or VLAN. The network administrator can write policies in the ServerIron to indicate which packets have priority over others and are therefore placed into a higher-priority queue for forwarding.
An advanced yet easy-to-use feature that the ServerIron provides for load balancing connections is server reassignment. This feature checks server response time: if the server does not respond within three connection requests (possibly indicating a congested server), the ServerIron moves the connection to the next server in the rotation (according to the port bindings; that is, to the next server in the rotation supporting that logical port). Even though that particular connection request has moved to the next server in the rotation, the ServerIron does not yet mark the service as failed. After a settable number of requests to a single service have gone unanswered (known as the reassign threshold), the ServerIron takes that service (the logical port, not the server) out of the rotation list and health checks it. If the service responds, the ServerIron places the service back in the rotation list and again forwards outside connection requests to it. To allow outside notification of this event, the ServerIron transmits an SNMP TRAP to the specified trap receiver and writes the event to the log; for sites supporting SyslogD, the event is also written to the SyslogD server. Combined with the least-connections load balancing metric, this feature allows the ServerIron to determine which real server is most available and redirect requests to that server. It continues this dynamically until other servers can respond in an adequate amount of time. This time is discerned by TCP and the user applications, not by special software that must be implemented on each real server.
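The reassignment bookkeeping can be sketched as follows. This is illustrative Python; the reassign threshold value and the notification hooks are placeholders standing in for the SI's configurable settings.

# Illustrative sketch of the reassignment bookkeeping. A connection that
# goes unanswered three times is moved to the next server bound to the
# same logical port; once a service accumulates `reassign_threshold`
# misses, it is pulled from rotation and health checked.

reassign_threshold = 6    # settable; this value is hypothetical

def record_unanswered(service, rotation):
    service["misses"] += 1
    if service["misses"] >= reassign_threshold:
        rotation.remove(service)                     # only this logical port
        print("health checking", service["name"])    # re-added on success
        print("SNMP trap sent, event logged")        # plus SyslogD if configured

svc = {"name": "web1:80", "misses": 5}
rotation = [svc]
record_unanswered(svc, rotation)
print(rotation)   # -> [] until the health check succeeds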
The advantage of this feature is that no special software is required on the real servers; the ServerIron detects possible congestion through a naturally occurring event in the TCP/IP protocol. Again, this allows for ease of use, transparency, and greater throughput, and it ensures that you don't spend your time "tweaking" software that may produce unnecessary and unpredictable events.
Multiple NICs in the Servers
Redundancy comes in many forms, and the ServerIron in the right topology can provide for it. The real servers can provide redundancy as well. Running multiple NICs in a real server is operating-system dependent and in no way depends on the ServerIron.
Multiple NICs in a real server can provide redundancy, and they can also provide real-time load sharing. There are usually two modes of operation: each NIC can be active, or one can be active while the other waits for it to fail. In either case, redundancy can be provided by the real servers.
Futures
Foundry Networks will be providing alternative topologies to suit different customer environments; however, it is not the purpose of this paper to review these features or the new topologies used to enable them. These features are currently in beta testing. They include:
• Global server load balancing, available by summer of 1999 – that is, providing SLB no matter where the servers exist on the Internet. This feature will allow the ServerIron to provide health-checked DNS services based on checks such as delay, reachability, route policies, or BGP AS_PATH cost.
• Proxy Server Load Balancing – SSL ID and URL switching
The next migration and alternative would be to integrate the back-end networks into a single pair of switching routers using VLANs and Layer 3 routing. This topology would integrate the ServerIron technology and the bottom two BI4000 switches. The servers would also be dual-connected, and VLANs or applications would be set up for the highest availability by striping them across different slots and ports on the BigIron. Integrating these networks will simplify the current configuration, reduce the amount of network hardware required to support the different networks, and provide the highest availability. There are numerous options for integrating networks and optimizing the overall design. The BigIron also supports Gigabit server connections as well as 10/100, allowing scalable growth not found in any other product. All the software features of the NetIron are also available in the BigIron product. If we use GateD (routing) on the servers (SUN), we can also dual-home every server and use all links – very efficient.
Chapter 2 – Single Active SI with Backup
The first topology to look at for the Single Active SI with redundancy is the six-pack topology. This topology is used in many installations and has been a trusted method of implementing SLB for two years. As shown here, it allows for 100% redundancy in either a switched or a routed configuration. However, network design is really based on acceptable risk: this topology can be slimmed down to require fewer components, but in doing so the risk factor rises.
There are two methods of implementation: L2 switching and L3 switching. To effect the L3 topology (discussed first; L2 is discussed next), several pieces of hardware and software need to be implemented.
• Two Foundry Networks NetIron switching routers (with Integrated Switch Routing [ISR] and FSRP/VRRP)
• Two ServerIron Server Load Balance switches
• Two BigIron 4000 Layer 2 switches
• Two Cisco 7000 series routers (for WAN connectivity)
The software involved is:
• BGP – on the 7500s only; used in this case for connection to the Internet
• OSPF/ECMP (Open Shortest Path First/Equal Cost Multi Path)
• VRRP/FSRP (VRRP is the standard [RFC 2338] version of router redundancy)
The purpose of this topology is to provide not only load balancing of the physical servers attached to the BigIron 4000s but also load balancing of the inbound and outbound data and fault resiliency for the entire topology.
Cisco 7500, NetIron, and OSPF
Starting from the top of the diagram, there are two Cisco routers. These routers are attached to the Internet and run the BGP and OSPF protocols. OSPF is needed for the six-pack topology; BGP is not and is shown here merely for completeness. Attached and physically cross-connected to the Cisco 7500 routers are two Foundry Networks NetIron switching routers. These routers can route at 3 million packets per second and have a 4.2 Gigabit switching capacity. The NetIron allows up to 24 10/100 auto-sensing copper ports as well as a two-port Gigabit expansion module. Since these are routers, the protocol used between the Cisco 7500s and the NetIron routers is OSPF. An advantage of OSPF is its ability to load balance across equal-cost paths, a feature known as equal-cost multi-path (ECMP). Any data forwarded between the Cisco 7500s and the Foundry NetIron switch routers is load balanced: all paths are active, allowing outbound or inbound data to be forwarded on any of the available paths.
Another key advantage of using Foundry NetIron routers instead of Layer 2 switches is cost: a 100 Mbps router port from Foundry Networks is under $300, versus $3000 on a Cisco router. The NetIron also offloads the Cisco router from burning processing cycles on local routing (server to server, local network to network, etc.) and Hot Standby Router Protocol (HSRP), leaving the Cisco router with better performance for handling the WAN and access lists. The NetIron adds another level of access lists at faster speeds, increasing security one more level. Integrated Switch Routing is another advantage the NetIron brings to this fully redundant design. Lastly, the NetIron supports EtherChannel trunking with the Cisco router to allow for bandwidth growth to the WAN.
NetIron with ISR and VRRP
Redundant connectivity to the ServerIrons (which provide the load balancing) is provided by two unique features of the NetIron architecture: Virtual Router Redundancy Protocol (VRRP) and Integrated Switch Routing (ISR). VRRP provides redundancy for the servers connected to the BigIron chassis at the bottom of the topology. VRRP is particularly useful when the users on a subnet require continuous access to resources in the network. This protocol is similar in function to Cisco's Hot Standby Router Protocol (HSRP). It allows a shared IP and MAC address between NI1 and NI2; this shared address is assigned to the real servers as their default gateway address. VRRP allows one router to automatically assume the function of the other router if that router fails. VRRP works by the exchange of messages that advertise priority among VRRP-configured routers; one becomes the dominant router for default routing of the real servers. When the active router fails to send a hello message within a configurable period of time, the standby router with the highest priority becomes the active router. The transition of packet-forwarding functions between routers is completely transparent to all hosts on the network.
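The election at the heart of VRRP reduces to the sketch below. This is illustrative Python with invented priorities; the real protocol also covers advertisement intervals, preemption, and address ownership.

# Illustrative sketch of VRRP-style election between NI1 and NI2.
def elect_master(routers):
    """The highest-priority live router answers for the shared IP/MAC."""
    alive = [r for r in routers if r["alive"]]
    return max(alive, key=lambda r: r["priority"])

ni1 = {"name": "NI1", "priority": 200, "alive": True}
ni2 = {"name": "NI2", "priority": 100, "alive": True}

print(elect_master([ni1, ni2])["name"])   # NI1 serves as the default gateway
ni1["alive"] = False                      # NI1's advertisements stop arriving
print(elect_master([ni1, ni2])["name"])   # NI2 assumes the shared IP/MAC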
To allow for physical redundancy (a single subnet attached to two different ServerIrons), we take advantage of a Foundry Networks feature known as Integrated Switch Routing. In a normal configuration, a router is not allowed to have more than one physical port assigned to a single subnet. The NetIrons adhere to that rule by creating a virtual Ethernet interface, which allows more than one physical port to be assigned to a single IP subnet. The advantage is that if one port becomes disabled, the other physical port assumes responsibility for forwarding packets. Data travelling on the same VLAN (broadcast domain) is switched locally between the ports, hence the name Integrated Switch Routing (ISR).
ServerIron Redundancy
You should notice in the figures that there are two ServerIrons. The second switch provides a redundancy function in the event the primary ServerIron stops responding. In doing this, there are no loops in the configuration, thereby eliminating the need for spanning tree on the ServerIrons at this level of the topology. The backup process works by dedicating a link (or trunked links) between the two ServerIrons. Notice in the drawing that this link is a trunk link: more than one physical link makes up the connection. This provides redundancy; if one physical link becomes disabled, all information passes over the other link.
ServerIron Backup Process
Upon startup, each ServerIron starts in standby mode and looks for the active ServerIron. One of the two becomes the active ServerIron and the other is placed in standby mode; with the topology being totally symmetrical, there is no need to assign a dominant SI. During this time, the standby SI is constantly looking for information from the active SI. A signal is sent between the two SIs every 1/10th of a second. Should the active ServerIron stop sending messages, the standby SI takes over as the active SI within 1 second. Once active, an SI always remains active, even if the previously active SI starts responding again. All of the active SI's information is synchronized to the backup SI. A safety feature is built in as well, using the data path (any active interface other than the dedicated backup link) of the SIs to ensure the validity of the backup link: should the backup link become disabled while the active ServerIron is still valid, the data path is used to test for its presence.
The configuration files on each of the SIs are identical. In fact, the address of the VIP is assigned with the backup command. In the event of a failover, the SI taking over will ARP itself to ensure that the switch tables in the topology are correctly updated.
The standby SI does not function in any capacity except to receive messages from the active SI; it does not forward any L2 traffic. The information passed from the active SI contains state table information as well as ARP and other table information, such that when the standby becomes active, it appears as if it had been the active switch all along. With this, there is no re-ARPing for information, and the current state table information is fresh in the sense that no re-SYNing of connections is required. In essence, no one will know that the standby switch has become active, with the exception of an SNMP trap message being sent to a preconfigured SNMP receiver or someone perusing the log file.
BigIron 4000
Below the ServerIron switches are BigIron 4000 or 8000 switches. These high-capacity switches have 128/256 Gbit/sec switching capacity respectively (each card has 32 Gbit/sec switching capacity). Each 10/100 card can hold up to 24 10/100 auto-sensing ports. There is one management card on the switch, which is where the console connection is placed. The BigIron offers both Layer 2 switching and Layer 3 routing on a per-port basis and supports VLANs; for this configuration, only Layer 2 switching is enabled. The BI4000s are cross-connected to provide redundant paths between the ServerIrons. At first it may appear that there is an L2 loop involved and that spanning tree must be enabled. This would be correct if the standby ServerIron were actively forwarding packets; but the standby SI is dormant, so there are no loops involved, and STA can and should be turned off.
To add another layer of redundancy, the BigIron will offer redundant Management Modules in July 1999. All of the Foundry products also offer QoS capability to allow the prioritization of traffic on a per-application basis throughout the L4 network. So if video or another delay-sensitive application were running across the network, we could ensure that it was always serviced before other traffic should the network become congested.
This design offers a great deal of flexibility, scalability, and redundancy to ensure no single point of failure while allowing growth. It also allows us to integrate functions to reduce complexity and management costs.
Servers
The real servers are the servers attached below the SIs (to the BigIron 4000s). The servers involved here need no special configuration to support this topology, with the exception of having dual NICs. Most NIC manufacturers now support the ability to place two or more NICs in the same computer. Two configurations can occur here: the second NIC is dormant until the primary NIC stops working, or the second NIC is active with its own IP and MAC address. Usually it is the first configuration, in which the dormant NIC assumes the MAC address and the IP address of the active NIC when the active NIC fails. You should check with the manufacturer for this capability.
Another feature from some Gigabit Ethernet NIC manufacturers is a virtual MAC and IP address. These cards have two Gigabit Ethernet interfaces on the same NIC; when one interface fails, the other becomes active using the same MAC and IP address.
All of the servers must contain the same data content, for they are redundant.
The Topology
So why all the redundancy? Why not plug the servers directly into the ServerIron switch? The answer is: you can. That is the most simplistic load balancing topology. However, it provides no resiliency.
So let's provide for backup and have two SIs but eliminate the L2 switches at the lower end of the figure. In other words, why not simply take a server with two NICs and plug each into a ServerIron? This would seem to provide the same function as the dual L2 switches below the ServerIrons, and it does provide redundancy, with the exception of the NICs: if one NIC fails and the other becomes active, it becomes active into an inactive SI (the backup SI).
Suffice it to say, the six-pack topology allows for the most fault-resilient data path, including backup of the session tables.
An Example Data Flow – Using routing
So let's take a look at the data flow; refer to figure 1. It is assumed here that all the previously explained startup processing has taken place and the topology is fully functional. There are previously established connections and the state table is built. The ServerIron on the left is the active ServerIron (SI1). The first two servers (top to bottom) on the left support FTP and HTTP; the last server supports only TELNET. The ServerIrons are configured to load balance using round robin.
Data can be received on either Cisco 7500, and responses from the VIP will be forwarded out either Cisco 7500 via the NetIrons, all courtesy of OSPF ECMP. Using OSPF ECMP eliminates the need for Cisco's proprietary HSRP and provides much better data flow. The VIP is a publicly known IP address, and requests to the real servers are made from some source address to the VIP address. Both NI1 and NI2 have a forwarding path to the active SI. Both NetIrons will find the active SI because it responds like an end station: the NetIrons ARP for the VIP, and the SI on the left responds. The only standby components are SI2 for server load balancing and NI2 for FSRP.
Upon receipt of the packet, SI1 determines this is a previously existing session; SI1 extracts the port information from the received datagram and determines which real server to forward the datagram to. (If there were no entry in the session table, SI1 would send a TCP/RST packet to the source.) SI1 extracts the port, which is TCP port 80. After determining the next real server in the rotation, the SI performs a NAT translation on the destination IP address, recalculates the checksum, and forwards the packet to the selected real server using the real server's MAC address as the destination MAC address. The real server accepts the packet because the initial connection attempt (TCP/SYN) from that source address was passed on to it by the SI; the real server therefore also has that source address in its own TCP connection table. The advantages of NAT (Network Address Translation) server load balancing are throughput and transparency.
When the real server responds, it truly thinks it is talking to the source client (somewhere out on the Internet) and returns the packet via its default gateway. It can ARP for its default gateway, for the SIs are Layer 2 switches and forward ARPs just like any other switch. Since the NetIrons are running FSRP, one of them will respond to the server (the one with the higher priority wins the right to serve as the active default router).
When the real server sends the packet toward the router, the SI intercepts the return packet and, based on information in its state table, knows that this is an existing session. The SI NATs the source address back to the VIP's IP address and sends the packet to the default gateway (as indicated by the destination MAC address already in the packet). One of the two NIs receives the packet, based on the outcome of the FSRP protocol.
The FSRP NI sends the packet to one of the Cisco 7500s (based on OSPF ECMP), which in turn routes the packet back over the Internet to the indicated source.
An Example Data Flow – Using switching
The only difference between this topology and that shown in figure 1 is that we have swapped out the two NetIrons and put two FastIron Backbone switches in their place. We are still providing redundancy, except that on the basis of cost we have decided to use switches instead of routers. In this topology we have eliminated the need for OSPF ECMP. The Cisco 7500s must now also run HSRP (to take the place of FSRP). HSRP is Cisco's proprietary protocol that allows two or more routers to share an IP and MAC address, "sharing" the responsibility of accepting packets from unsuspecting workstations that use the default gateway parameter (i.e., those that cannot run a routing protocol). One router is elected the active router, and upon its demise the other router provides the data path. Foundry Networks provides the same capability through a protocol known as the Foundry Standby Router Protocol (FSRP).
This configuration is offered as a lower-cost alternative to the routed solution. However, lowering the cost raises another issue: convergence time. In this topology we have added a link between the two switches and eliminated the crossover links between the two Cisco routers. You must run spanning tree on the links of the upper L2 switches, which eliminates the need to run it on the ServerIrons. Foundry Networks products support Fast Spanning Tree, which allows our switches to converge in approximately 7 seconds. The bottom two switches must also have a link between them; this provides failover in the event any of the diagonal links between the active or standby SI go down.
To configure spanning tree for this topology, it is best to select the upper-left L2 switch as the root bridge. The upper-right L2 switch can be configured to become the root bridge in the event the first root bridge goes down; simply set its priority to a value less than 8000 hex.
Under no circumstances should spanning tree be enabled on the SIs. (Other SI topologies – Symmetric SLB and Direct Server Return [DSR] – allow for STA, but not "classic" SLB.) Therefore, the only switches that should have spanning tree turned on are the L2 switches above and below the SIs.
But this presents a problem: the upper-right switch now has two equal-cost paths to the root, since the diagonal link down through the active SI and the link directly attached to the upper-left L2 switch provide equal costs. In this case, we want the diagonal link between the upper two L2 switches to become disabled. Simply set the cost of that link on each L2 switch to a value higher than 5 (the default path cost for Foundry switches); in most cases it is set to 100 for simplicity. Remember to set each side: both L2 switches must have that link set to 100. The rest of the network will take care of itself; the bottom two L2 switches will naturally disable the horizontal link between them.
Lastly, it is important to run Fast Span on all L2 switches. This is a simple configuration on the L2 switches: under the global span command, use the forward-delay argument and set it to 2, and set the max-age parameter to 67. For more information on these settings, please refer to the IronWare release notes, version 4.5.
FastIron(config)# span forward-delay 2
FastIron(config)# span max-age 67
Trunking for Redundancy
This feature is still under test, but it has been proven to run effectively for redundancy purposes. Any link in figure 1 or 2 may be trunked for redundancy. There are two types of trunks: server and switch. A server trunk load balances by source address and is used primarily when trunking into a high-end workstation, such as a quad-Ethernet card in a SUN workstation. A server trunk also sends each packet to the CPU for processing rather than to the hardware; initial indications are that trunking via the server command has no noticeable effect on the data flows of the SI, but testing is continuing.
A switch trunk is for trunking between switches. Although this type of trunk works very effectively in switch-to-switch topologies, it is still an unknown for the SI.
Currently, if you are going to trunk interfaces on the SI, use the server type of trunk. In the topologies given in figures 1 and 2, you may have noticed that the server type of trunk may not be effective for load sharing across multiple links: effectively, there will be only two source addresses (one from each router). More information will be provided as testing continues.
VLANs and the ServerIron
VLANs are fully supported in the ServerIron. The ServerIron supports VLANs based on port, protocol, subnet, and MAC address. The ServerIron software resides "on top" of the VLAN software, enabling it to examine packets from all VLANs.
This offers a great advantage when purchasing a ServerIron. It may happen that not all ports are utilized after your installation, and you would like to take advantage of the unused ports. VLANs help in this situation by allowing you to produce two ServerIrons in the same box. This can save both money and space.
VLANs may be used in the same or different subnets. Using VLANs, it is possible to reduce the six-pack to a four-pack, or to use the six-pack topology to run two separate broadcast domains. Figure 3 shows a configuration that implements port-based VLANs with redundancy. This topology eliminates two of the L2 switches. It looks complicated, but it really is not: the lines change to indicate whether they connect to VLAN 1 or VLAN 2. We have eliminated two L2 switches, so we must place those cables somewhere; the SIs and the remaining L2 switches are doubled up in cabling, so the topology looks complicated but in reality it is only messy!
The drawback of this topology is that the loss of a single L2 device will take down half of the data paths and servers. Yes, the servers can be multi-connected to both SIs, and this will eliminate that drawback. But some are not comfortable with the reduction in switches and feel that more switches provide more reliability; the six-pack continues to offer this reassurance. However, the four-pack can be used with complete confidence in its fault resiliency.
The topology shown in figure 3 is for one subnet only. The SI fully supports multiple subnets in a single broadcast domain with VLANs; however, this can lead to a complicated topology that is not fully explained here. Please contact your Foundry Networks systems engineer for more details.
Security
One of the key benefits of this topology is the ability to give the real servers private addresses (see RFC 1918). This provides some security in that private addresses cannot be routed on the Internet. The S-SLB topology allows for this as well, but the DSR topology does not; these topologies are fully explained later in this document.
If the servers (as shown in figure 1) were assigned the addresses 10.10.10.1, 10.10.10.2, and 10.10.10.3, you should first notice that these are what are known as private addresses. In fact, any address in the 10.0.0.0 range is a private address and cannot be routed over the Internet.
The six-pack topology allows the servers to be assigned private addresses and still be used in a public network: we provide NAT translation of the source IP address when the server responds to the source. The address of the ServerIron must be a public (well-known) address.
But providing these private addresses can cause complications in the network they are attached to. For example, since the real server does not know about the ServerIron, it responds directly to the source; it therefore still needs a default gateway to provide the data flow back to the source. When using private addresses, this requires that the routers be multinetted; that is, the router interface must support more than one network address (or subnet address). In many cases you do not want to do this. The ServerIron provides a method to alleviate this condition, called Source-NAT.
There are alternatives to multinetting the customer router interfaces. One alternative to multinetting the border routers (customer routers) is simply to put in an L3 six-pack solution. This still requires that the VIP(s) on the SI are public addresses, but the NetIrons provide a front to the customer routers: the NetIrons carry the multinetted public and private addresses on their ports, relieving the customer routers of this. But the VIP of the SI inside the "ServerIron domain" must be a public address, and it will probably be the only address on that subnet. This requires some forethought before venturing in.
Another alternative is to statically place an ARP entry in the real server's ARP cache. This indicates the default gateway, so the real server does not have to ARP for it. It sends the packet out its interface, correctly addressed; the ServerIron intercepts it, NATs the necessary addresses (source IP and source MAC), and sends the packet to the router. The drawback of this solution is administration of the real servers: they will have to be administered locally or through some type of NAT device on their subnet. The ServerIron will not provide this administrative function, and those individuals who control the real servers (usually not a function of the network infrastructure team) may have the final say on this!
If none of the above ideas are suitable, the ServerIron allows the translation of the source address before the packet is sent to the real server. Using Source-NAT, the SI translates the source and destination IP addresses and the destination MAC address (to get the packet to the server). When the server responds, it thinks the source is on the same subnet as itself, so the server ARPs for the SI and sends the packet locally to the SI. The SI looks up the packet in the session table, modifies the source and destination IP addresses, and sends the packet to the router (if the packet is not local).
Currently, Source-NAT is not available in a fault-tolerant topology. It can run in the "classic" SLB six-pack topology, but when failover occurs, the Source-NAT table does not transfer with it.
Source-NAT has some attributes that may not be suitable for some installations, especially financial institutions; please contact Foundry Networks for more information.
Methods of Simplification
In many cases, only a couple of the physical interfaces on the L2 switches are being used, and requests have surfaced to use the available interfaces more efficiently. To allow for this, a unique twist on the six-pack is configured: the four-pack. Logically, this is still the six-pack, and you should not confuse it with the true four-pack topology used with DSR in an active-active operating environment (covered in the DSR chapter). We are using VLANs to provide for the absence of two of the L2 switches. This may be confusing, but the functions of the active-standby operating environment are preserved: what we have done is remove two L2 switches and provide for them using VLANs.
Refer to figure 3. All of the switches have been configured with two port-based VLANs, indicated on the drawing using the colors blue and black. This logically divides each L2 switch into two broadcast domains. Neither VLAN can see the other, which means no packets are transferred between the port-based VLANs unless some other intervention has been configured.
Other Methods of Simplification
Is there any other method to simplify this network topology? Yes, there is, but again the six-pack provides the most robust, fault-resilient topology for server load balancing. Refer to figure 4.
Still using the classic-SLB mode of operation, we can remove some components and still provide fault tolerance. Take a good look at this picture; we will revisit it in chapters 3 and 4. You should notice that we have reduced the number of switches to four. In this configuration, we are integrating the ServerIrons into an existing environment. Many customer sites today have a topology very similar to figure 4: two Cisco high-end routers followed by a couple of Catalyst 5000s and then the rest of the network below.
To take advantage of the existing topology that a customer may have, rather than force a new topology (the six-pack) on them, we can use the topology of figure 4. The ServerIrons now become more of a network appliance than an integral component.
The data flow here is similar to the others. A packet for a VIP is received and forwarded by the router to the VIP on the ServerIron; the ServerIron sends it to the real server, and the real server sends it back to its router. But if the router is directly in the path of the real server, how does the packet get back to the ServerIron for translation? The ServerIron must have a feature called Source-NAT turned on.
With this, the ServerIron receives an inbound packet and prepares it for transmission to the real server. The normal translation of classic-SLB takes place, but with Source-NAT one more translation occurs: the ServerIron translates the source IP address of the inbound packet as well, to one of the addresses preconfigured on the ServerIron. As shown in figure 2.4, the source IP address that the real server receives is 1.1.1.2. (For those interested, we use PAT, port address translation, to ensure the uniqueness of multiple sessions from the ServerIron.) Up to four source IP addresses can be assigned; with 65,535 ports per source IP address, a Source-NAT ServerIron can hold up to 256,000 connections or 500,000 sessions.
In this way, it looks as if the ServerIron originated the packet to the real server. In response, the real server sends the packet back to the IP address generated by the ServerIron; the ServerIron translates the headers and sends the packet back to the requester. This is not the most efficient data flow, but it works quite well for those sites that require classic-SLB load balancing. We will revisit this topology in chapters 3 and 4.
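A rough model of the PAT bookkeeping is sketched below in illustrative Python. The pool size and the 1.1.1.2 address come from the text, but the allocator itself (which in real code would recycle ports as sessions close) is invented.

# Illustrative sketch of Source-NAT with PAT. Each flow toward a real
# server borrows a configured source IP plus a unique port, so the
# server's reply returns to the ServerIron for translation.

SOURCE_IPS = ["1.1.1.2"]       # up to four source IPs may be configured
nat_table = {}                 # (nat ip, nat port) -> (client ip, client port)
_next_port = 1024              # naive allocator; real code recycles ports

def source_nat(client_ip, client_port):
    global _next_port
    nat_ip = SOURCE_IPS[_next_port % len(SOURCE_IPS)]
    nat_port = _next_port
    _next_port += 1
    nat_table[(nat_ip, nat_port)] = (client_ip, client_port)
    return nat_ip, nat_port

print(source_nat("192.0.2.7", 40000))   # server sees e.g. ('1.1.1.2', 1024)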
Figure 1 – Six-pack solution using OSPF with ECMP. [Diagram: two Cisco 7500 routers at the top; below them, cross-connected NetIron routers NI1 and NI2 running FSRP and ISR; then ServerIrons SI1 and SI2; then BigIron switches BI1 and BI2 with the real servers attached.]
Figure 2 – Six-pack solution using spanning tree (STA = Spanning Tree). [Diagram: two routers running HSRP at the top; FastIron L2 switches FI1 and FI2 running STA; ServerIrons SI1 and SI2; BigIron switches BI1 and BI2 with the servers attached.]
Figure 3 – 4-pack configuration using VLANs and active-standby SLB. [Diagram: HSRP routers at the top; two L2 switches carrying port-based VLAN 1 and VLAN 2; an active and a standby ServerIron joined by the ServerIron backup link; four servers at the bottom.]
Figure 2.4 – Relaxing the six-pack rules by using Source-NAT
[Figure: the Internet feeds HSRP routers; Active SLB 1 and Standby SLB 2 are each configured with VIP1 = 1.1.1.1 and Source-NAT address 1.1.1.2; FastIron II switches connect to the servers.]
Chapter 3 – Dual-Active SI with Backup
The S-SLB Operating Topology
Many installations request the use of redundant SIs, but not in a Single-Active SI with backup operating environment. The requirement is for redundancy while all components remain active: at layer 2, every component is active, and at layer 4 individual VIPs are either active or standby. S-SLB allows for this.
When using S-SLB, both modes of operation can be invoked:
• Direct Server Return (DSR)
• Classic-SLB
There are some advantages to this topology. For example, having both SIs active doubles the number of connections. Currently, each SI can handle up to 500,000 connections, so S-SLB allows up to 1,000,000 connections between two SIs (500,000 connections, or 1,000,000 sessions, per SI).
Classic-SLB mode can be used with this topology, but DSR mode makes the most efficient use of it. Without DSR, there are some caveats to be aware of. The topology rules that applied to the six-pack configuration do not apply here: with the Dual-Active SI with backup configuration, a six-pack is not necessary, and the most that will be needed is a four-pack. Even then, the standard topology gets tricky!
With this topology, both SIs are configured with an active VIP and both are active at Layer 2 and Layer 4. Each VIP services its respective (bound) servers. Each server can be configured with two NICs, each NIC in the same subnet but with a different host ID. Check with your network administrator and verify whether your real server's operating system can accomplish this. It is not a requirement for proper operation of the ServerIron, but it is the most efficient and effective arrangement.
Both SIs are configured with the same VIPs, each containing the same bound ports. Where they differ is in a priority scheme assigned to each VIP. Upon startup, the SIs discover each other and compare priorities. For each VIP, the SI with the lower priority enters standby mode for that VIP only (not for the whole ServerIron, as in the Single-Active SI with backup topology); conversely, the SI with the higher priority number is active for that VIP.
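A minimal Python sketch of this per-VIP election follows; the highest-number-wins rule comes from the text above, while the names and the tie-breaking behavior are assumptions.

    def elect(si1_priorities, si2_priorities):
        # Return vip -> (active SI, standby SI) given per-VIP priorities.
        roles = {}
        for vip, p1 in si1_priorities.items():
            p2 = si2_priorities[vip]
            # Higher priority number wins the VIP; the loser goes standby
            # for that VIP only, not for the whole ServerIron.
            roles[vip] = ("SI1", "SI2") if p1 > p2 else ("SI2", "SI1")
        return roles

    print(elect({"1.1.1.1": 2, "1.1.1.2": 1}, {"1.1.1.1": 1, "1.1.1.2": 2}))
    # {'1.1.1.1': ('SI1', 'SI2'), '1.1.1.2': ('SI2', 'SI1')}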
The simplest configuration for a Dual-Active topology is shown in figure 1. Simplicity is the key thought here, but there is still a small level of redundancy. Should the ServerIron on the left side become disabled, the ServerIron on the right side can take over requests for the disabled VIP. It is easier if the operating system of the real servers can handle dual NICs, but the ServerIron can compensate if a real server contains only a single NIC.
The concern in figure 1 is primarily the splitting of a subnet. Subnets are not allowed to become discontiguous, and if the link between the two SIs is broken, the subnet becomes discontiguous; that is, you now have two instances of the same subnet. Trunking can alleviate this concern.
Figure 1 does not provide 100% redundancy like the six-pack or four-pack configurations, but then again, the concerns corrected by the six-pack configuration are not for everyone. Layer 4 network design is simply about acceptable risk; the different options shown here carry different risks, each associated with a cost.
Classic-SLB and DSR can both be applied to this topology. Figure 1 leans more towards classic-SLB; DSR will not provide much benefit here. Expanding the topology beyond figure 1 is where it can get challenging if you are not paying attention. Broadening the topology to that shown in figure 2 can cause data-flow problems. For example, individual links can become disabled so that the return path from the real server no longer runs back through the original SI that forwarded the request. To correct this in the topology shown in figure 2, Source-NAT must be enabled on the sending switch, or the mode of operation must include DSR.
Source-NAT is the ability of the SI to translate the source IP address on the inbound path. As we know from Chapter 1, without DSR the SI translates the destination IP and destination MAC address before sending the packet to the real server. With Source-NAT, the SI now translates the source IP address as well. In this way, the real server assumes the requestor is on the local subnet and does not try to send the reply to its router; instead, it ARPs for the return address (which belongs to the SI) and sends the reply to the SI. The SI then translates the headers and forwards the reply back to the requestor.
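The reply-path logic the real server applies can be pictured with a short Python sketch using the standard ipaddress module; the subnet and gateway values are illustrative.

    import ipaddress

    def next_hop(dest_ip, local_net="1.1.1.0/24", default_gw="1.1.1.254"):
        # A host delivers locally (via ARP) when the destination falls inside
        # its own subnet; otherwise it hands the packet to its router.
        if ipaddress.ip_address(dest_ip) in ipaddress.ip_network(local_net):
            return "ARP for %s and deliver directly" % dest_ip
        return "forward to default gateway %s" % default_gw

    print(next_hop("1.1.1.2"))    # Source-NAT address: reply goes back to the SI
    print(next_hop("200.1.1.9"))  # untranslated client: reply is routed past the SI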
Figure 2 adds some redundancy. In this configuration, Source-NAT or DSR must be used; if neither can be used, some link failures may cause asymmetric data paths. Figure 2 is shown here to indicate how the rules of the six-pack topology are hard to apply to a Dual-Active SI topology. There are some customer sites that require all components to be active while DSR is not allowed, and figure 2 shows an example of how this might be accomplished. Trunking does add some resiliency.
Figure 3 shows the most effective use of Dual-Active SIs: with DSR. Combining Dual-Active SIs with DSR provides the most powerful layer 4 topology. There are fewer components, less cabling and connectivity, simpler configuration, faster failover, and so on. There can be some obstacles, though (they are not so much technical), and they are given below.
All components are active, and wire-speed access is available on the return paths. So, after all these chapters, it would seem obvious: why would anyone do anything but S-SLB with DSR? There are a few reasons. First, session-level redundancy is currently provided only in the classic-SLB configuration without DSR; an upcoming software release will provide session-level redundancy at all levels. Also, the real servers are generally managed by a different group than the network infrastructure group, and some have objected to multinetted loopback interfaces. Another reason is the ability to hide the real IP address of the real server. Many installations use private addressing in their sites and will not place a public address on the server. DSR cannot provide service for private addressing on the loopback interface: locally it can, but a private address is non-routable on the Internet.
Chapter 4 contains a complete description of DSR, and figure 3 is fully detailed there.
Server NICs
A full description of how NICs can be used with the ServerIron is contained in Chapter 1. The dotted lines in the figures indicate a possible dual-NIC configuration.
Figure 1 – A simple S-SLB configuration
[Figure: the Internet feeds HSRP routers; S-SLB SI1 (VIP1 = 1.1.1.1, backup for VIP2) and S-SLB SI2 (VIP2 = 1.1.1.2, backup for VIP1), each attached through a FastIron switch.]
Figure 2 – Dual-Active S-SLB with Source-NAT
[Figure: the Internet feeds HSRP routers; S-SLB SI1 and S-SLB SI2, each with Source-NAT enabled; FastIron switches below; STA blocks one path.]
Figure 3 – S-SLB with DSR
[Figure: the Internet feeds HSRP routers; Symmetric SLB 1 (VIP1 = 1.1.1.1, backup for 1.1.1.2) and Symmetric SLB 2 (VIP2 = 1.1.1.2, backup for 1.1.1.1); FastIron II switch connecting to the ServerIrons.]
Chapter 4 – Direct Server Return (DSR)

This mode of operation can be used with any configuration (Single SI, or Single- or Dual-Active SI with backup), although the best-known topology for it is currently the Dual-Active SI topology. Some or all of the logical ports of a VIP can be assigned to the DSR function. The following is how it operates.
Figure 4.1 shows a typical DSR topology. This is completely different from the other topologies shown in previous examples. The six-pack topology can be used with DSR, but it is no longer necessary; in fact, it can inhibit the return-path speed. A four-pack is usually as much L2 redundancy as you will need. The reason for this is that the SI is no longer an integral part of the topology; it acts more like a one-armed appliance. (Trunking is allowed for redundancy.)
Topology Configuration
The usual configuration of the SI still applies. However, when you define the logical ports on virtual servers (VIPs), you now define those ports as DSR ports. This means the real server is allowed to respond directly to the source and does not have to traverse the SI for translation.
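As a rough illustration of this configuration step, here is a minimal Python data model in which some logical ports of a VIP are flagged for DSR; the names are illustrative, and this is not ServerIron configuration syntax.

    from dataclasses import dataclass, field

    @dataclass
    class VirtualServer:
        vip: str
        ports: dict = field(default_factory=dict)  # port -> {"dsr": bool, "binds": []}

        def add_port(self, port, dsr=False):
            self.ports[port] = {"dsr": dsr, "binds": []}

        def bind(self, port, real_server_nic_ip):
            # Real servers are identified by their NIC address (see the
            # Operation section), even though DSR frames carry the VIP.
            self.ports[port]["binds"].append(real_server_nic_ip)

    vip1 = VirtualServer("1.1.1.1")
    vip1.add_port(80, dsr=True)        # HTTP replies return directly to the source
    vip1.bind(80, "1.1.1.251")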
Refer to figure 4.1. The ServerIrons are attached to high-speed switches (preferably BigIrons) as one-armed bandits. This makes them operate more like a network appliance than an integral part of the network. VIPs are still configured on the SIs. As for addressing, notice that both the real servers and the VIPs are configured with the same IP address. Normally, this is not allowed: when the router needs to send a packet to 1.1.1.1, who should respond, the SI or the real server? The answer here is the SI, but how do you keep the real server from responding to that ARP request? The trick is a little-known feature of the elusive loopback address.
How it Works – the mysterious loopback address
A loopback interface is a logical interface that appears on every TCP/IP host. It is not a physical interface like a NIC; it is purely logical. It performs various tasks internal to the host and is usually assigned an address of the form 127.anything, meaning that 127.0.0.1 is treated the same as 127.1.1.1. A 127.x address is not accessible outside the system on which it resides. But can more addresses be applied to a loopback interface? Yes; like any other IP interface, multiple addresses are allowed. Some operating systems allow up to 1,000 addresses per interface, including the loopback.
By definition, the loopback interface does not respond to ARP requests. If it is the function of the loopback interface never to reply to ARPs, then another rule follows: no matter what address the loopback interface is given, that address will not reply to an ARP. The loopback interface can, however, respond directly to those who address it, as long as the address does not start with 127.x. The point is that we can assign a non-127 address to the loopback interface, and it can respond directly to any packet as long as that packet is not an ARP. This is the key to making DSR operate.
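The two rules together can be stated in a few lines of Python; the interface names and addresses are illustrative.

    # Addresses on the loopback interface accept IP packets but never answer
    # ARP, so the SI can own the ARP entry for the VIP while the real server
    # still processes frames delivered to it by MAC address.
    host_interfaces = {
        "nic":      ["1.1.1.251"],             # answers ARP normally
        "loopback": ["127.0.0.1", "1.1.1.1"],  # silent to ARP
    }

    def answers_arp(ip):
        return ip in host_interfaces["nic"]

    def accepts_packet(ip):
        return any(ip in addrs for addrs in host_interfaces.values())

    print(answers_arp("1.1.1.1"))     # False: the router's ARP is won by the SI
    print(accepts_packet("1.1.1.1"))  # True: a frame sent to the NIC's MAC is processed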
Operation
First, we must review the IP address scheme required to make this work:
• The SIs need an administrative IP address
• The SIs need IP address(es) for the VIP(s)
• The servers need an IP address for the NIC (possibly two for redundancy)
• The servers need a loopback address of 127.x
• The servers need an IP address matching that of the VIP(s) on the loopback interface, one per VIP
What MAC address is used for the loopback interface? Like any other multinetted interface, a single MAC address is used for all addresses of the NIC and loopback on that server. The real server must therefore also have an IP address assigned to the NIC; this IP address is the one used when defining a real server on the SI. The NIC's MAC address serves the loopback interface as well, and it is the same MAC address no matter how many loopback addresses are defined on the real server. The SI therefore ARPs for the IP address assigned to the NIC interface and uses the resulting MAC address to send packets addressed to the loopback interface.
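A sketch of that resolution step, with an assumed MAC address:

    # The SI ARPs for the NIC address, then reuses that MAC for frames whose
    # IP destination is the non-ARPable loopback alias (the VIP).
    arp_table = {"1.1.1.251": "00:e0:52:aa:bb:01"}   # NIC IP -> learned MAC

    def frame_for_real_server(nic_ip, vip):
        return {
            "dst_mac": arp_table[nic_ip],  # resolved from the NIC address
            "dst_ip": vip,                 # unchanged: matches the loopback alias
        }

    print(frame_for_real_server("1.1.1.251", "1.1.1.1"))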
Refer to figure 4.1. A user sends an HTTP request (it could be any IP-based protocol) to the server 1.1.1.1. The router receives this packet and ARPs for 1.1.1.1. The SI responds to the ARP, so the router learns the MAC address of the SI VIP and sends the packet to the SI. Finding the next server in the rotation list, the SI relays the packet (changing only the destination MAC address) to the real server's loopback interface, via its IP address, which is the same as the VIP. When the server needs to respond, it responds directly to the requester, bypassing the SI; in this case, it sends the reply to the HSRP address. As far as the requester and the server are concerned, the SI does not exist. It is transparent.
So the data flow for DSR is the following (according to figure 4.1):
• The packet is received by the router for address 1.1.1.1
• The packet is sent to the SI
• The packet is sent to the associated real server
• The real server responds directly to the source
For those readers who have read Chapter 2 and understand classic-SLB, there may be some confusion here about the real server being able to respond directly to the source.
With classic-SLB, the data flow is as follows:
• The router sends the packet to the SI
• The SI translates the destination IP and destination MAC address
• The SI sends the packet to the associated real server
• The real server attempts to send the packet back to the source, either directly or via a router
• The SI intercepts the packet, changes the source IP and source MAC (starting in release 5.0.00), and sends the packet to the node indicated by the destination MAC
If any of the above fails (for example, the packet is sent directly back to the source by the real server), the source will send a RST (reset) for the connection: the source IP address of the real server differs from the VIP, and the source made its connection to the VIP, not to the real server.
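The difference between the two modes comes down to which header fields the SI rewrites, as this side-by-side Python sketch shows (field names are illustrative):

    def forward_dsr(pkt, real_mac):
        pkt["dst_mac"] = real_mac         # dst IP stays the VIP (loopback alias)
        return pkt

    def forward_classic_inbound(pkt, real_ip, real_mac):
        pkt["dst_ip"], pkt["dst_mac"] = real_ip, real_mac
        return pkt

    def forward_classic_outbound(reply, vip):
        reply["src_ip"] = vip             # without this rewrite, the client RSTs
        return reply

With DSR there is no outbound step at all, which is exactly why the server's reply can bypass the SI without triggering a reset.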
With DSR, the IP address of the real server is the same as the VIP. More precisely, one of the addresses on the real server's loopback interface is the same as the ServerIron's VIP. To allow for load balancing, each real server that services packets for a VIP carries that same IP address on its loopback interface.
Notice in figure 4.1 that there are two SIs with two different addressable VIPs. This is the Dual-Active SI topology (S-SLB) in DSR mode. Both SIs are active and relaying requests to the servers. The Dual-Active SI with backup topology was covered in detail in Chapter 3.
Each server has two NIC interfaces, and each NIC serves one VIP on one SI. When a failure occurs, the surviving SI takes over through the separate NIC that is bound to its standby VIP. The ability to have more than one NIC interface is usually a function of the real server's operating system; the trick is to make sure the NICs can be addressed with the same subnet address but different host addresses. Your operating system may not support this capability. If not, the ServerIron can make up for it with some additional configuration.
Redundancy Options
Layer 4 network design is all about acceptable risk. Figure 4.1 could be designed many different ways, including adding the top L3-layer devices that we showed in the six-pack design (Chapter 2). This would allow OSPF ECMP between the Cisco routers and the L4 topology, while Foundry Networks' exclusive ISR, in combination with VRRP, allows for redundancy between the NetIron routers and the ServerIrons. But with redundancy comes complexity. So let's take a look at what we could do with the topology in figure 4.1.
First, the dual NIC cards. Depending on the operating system, these could be active or standby: each NIC could be assigned its own IP address (as shown in figure 4.1), or one could sit in standby mode waiting for the other to become disabled.
Next, we have SI redundancy. If SLB1 becomes disabled, SLB2 will take over for it, using the NIC interfaces that were configured for SLB1. Of course, this assumes that we have dual active NICs; otherwise, the SI will compensate for a single IP address on a single NIC, though that requires more configuration on the SI.
To keep things simple, this configuration shows each VIP on each SI bound to a single IP address on a unique NIC. If the BI 4000 chassis on the left side becomes disabled, we lose all the servers and SLB1 as well; SLB2 would not know how to take over for SLB1's VIP without the servers' NIC cards being up. To compensate, you could multinet the NICs with two IP addresses each and configure each SI separately to use different IP addresses when taking over for a failed SI. That is, we would now have four IP addresses (two per NIC). Each SI would be configured to use its primary NIC IP address for its VIP, but if an SI fails and the backup SI takes over, the backup uses the secondary address on its own NIC. Alternatively, you could configure the SI to allow many VIPs to communicate with one IP address on a single NIC. Again, redundancy can get complex.
We have redundancy for the Cisco router interfaces using HSRP, and the interfaces between the BI 4000s and the ServerIrons are trunked.
With the topology shown in figure 4.1, we can build a fault-resilient topology using less equipment and cabling. At the same time, we gain throughput, sometimes by a factor of 10.
Finally, you could add the NetIrons shown in Chapter 2. This allows OSPF ECMP to take effect between the two NetIron routers; the NetIron routers also provide Layer 2 redundancy between the BI 4000 chassis by using the ISR feature. In applying this extra layer, you will have to be careful of Layer 2 loops: since all components are active at this layer, loops are formed, and spanning tree will have to disable a path to break each loop.
Health Checking with DSR
Health checking of the real server and its logical ports had to be modified for DSR: since the real server's responses no longer traverse the SI, it is impossible for the SI to passively determine the health of the real server. Therefore, the SI continually sends out the various levels of health checks (L4 and L7) to ensure that the server can respond to requests. Health checking is provided for the loopback interfaces as well.
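A minimal Python sketch of such out-of-band probes is below; the addresses are illustrative, and the SI's real probes are richer than this.

    import socket

    def l4_check(ip, port, timeout=2.0):
        # Layer 4: can a TCP connection be opened to the bound port?
        try:
            with socket.create_connection((ip, port), timeout=timeout):
                return True
        except OSError:
            return False

    def l7_check(ip, port, host, timeout=2.0):
        # Layer 7: does the server answer an HTTP request on that port?
        try:
            with socket.create_connection((ip, port), timeout=timeout) as s:
                s.sendall(("HEAD / HTTP/1.0\r\nHost: %s\r\n\r\n" % host).encode())
                return s.recv(12).startswith(b"HTTP/")
        except OSError:
            return False

    # e.g. l4_check("1.1.1.251", 80) and l7_check("1.1.1.251", 80, "1.1.1.1")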
Conclusion
What is the purpose of all this? Two things: simplicity and speed. Speeds up to 2 Gbps are now achievable. How? With DSR, the real server uses a direct path back to the source, and with Foundry Networks L2/L3 switches, wire-speed forwarding is achievable at L2 or L3 (the latency between L2 and L3 in Foundry products is the same). The SI does not get involved in the return path. This is sensible, since the data flow tends to be larger from the web server to the requester than from the requester to the web server.
Figure 4.1 – DSR with Symmetric-SLB
[Figure: the Internet feeds HSRP routers; Symmetric SLB 1 (VIP1 = 1.1.1.1, port http, DSR; backup for 1.1.1.2) and Symmetric SLB 2 (VIP2 = 1.1.1.2, port http, DSR; backup for 1.1.1.1) hang one-armed off FastIron II switches. Three real servers: NIC1/NIC2 addresses 1.1.1.251/1.1.1.250, 1.1.1.253/1.1.1.252, and 1.1.1.255/1.1.1.254; each also carries loopback0 = 127.x plus loopback aliases 1.1.1.1 and 1.1.1.2.]