Appendix B - Design Considerations v2


Emphasis on highly available, if not entirely nonstop, applications has drawn attention to storage as a potential point of failure. Even with fully redundant storage solutions (such as RAID mirroring), as long as storage is accessible only through the server, the server itself must also be made fully redundant. In a SAN, the storage system is independent of the application server and can be readily switched from one server to another. In principle, any server can provide failover capability for any other, resulting in more protected servers at lower cost than in traditional 2-4 server cluster arrangements.

A central pool of storage can be managed cost-effectively using SAN tools. Central management makes it easier to measure and predict demands for service, and it leverages management tools and training across a broader base of systems.

Also, the storage-to-storage connection allows some administrative operations to proceed independent of the load on the application servers. The most obvious of these are backup and archiving operations, but the list also includes replication and snapshots, expansion and reallocation of storage resources, and operation of support services such as location and name resolution.


System Domain

In this domain, you rely on system-specific availability attributes such as hot-plug devices, redundancy, reboot or remote reboot on failure, remote management, multithreaded operating systems, enhanced support, and uninterruptible power supplies (UPSs).

Application Domain

In this domain, you find availability components such as redundant servers, application clustering, load balancing, network connection redundancy, and transparent failover.

Storage Domain

In this domain, you rely on the availability components of your SAN, NAS, and/or DAS. Also in this domain, you focus on RAID, hot-swap drives, external storage, and back-up systems, with centrally managed back-up and restore.

Infrastructure Domain

This domain is responsible for continuous operation and real-time data availability:

Replication

Snapshots

Source: Storage Security: Protecting SANs, NAS, and DAS by John Chirillo and Scott Blaul


    Isolate applications and dedicate hardware

A single server handling multiple functions is an attempt to decrease hardware and licensing costs. This may reduce device management overhead, and there may be reasons to share network resources such as network links. However, as an organization grows, shared resources can become bottlenecks and behavior can become unpredictable.

Best practice is to isolate specific applications and application functions. Combining two I/O-intensive applications on a single server can cripple performance. You should also look at the interaction between the various application servers and other servers, such as the Microsoft Active Directory (AD) domain controllers.

    Optimize server iSCSI connectivity

You can connect any server to an iSCSI SAN using an iSCSI software initiator and NIC combination or an iSCSI HBA. A NIC with an iSCSI software initiator provides good performance in most installations; however, an iSCSI HBA may improve performance by offloading iSCSI processing from the CPU.

Another factor to consider is the placement of these devices on the I/O buses within the server. It can be helpful to install the interface in a PCI-X slot to benefit from the higher clock speed (133 MHz). Some servers have multiple PCI-X buses; where two or more HBAs or NICs are configured for MPIO, it can be beneficial to install these PCI-X devices on separate PCI-X buses.
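As a quick illustration of the software-initiator path, here is a minimal sketch that discovers and logs in to an iSCSI target using the standard Linux Open-iSCSI iscsiadm tool from Python. The portal address is a placeholder, and a production setup would normally also configure MPIO across the NICs discussed above.

```python
# Minimal sketch: discover and log in to an iSCSI target with the
# Open-iSCSI software initiator (Linux). The portal address and the
# discovered IQNs are illustrative placeholders.
import subprocess

PORTAL = "192.168.10.50"  # hypothetical group/portal IP of the array


def discover_targets(portal: str) -> list[str]:
    """Run sendtargets discovery and return the advertised IQNs."""
    out = subprocess.run(
        ["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", portal],
        check=True, capture_output=True, text=True,
    ).stdout
    # Each line looks like "192.168.10.50:3260,1 iqn.2001-05.com.example:volume1"
    return [line.split()[-1] for line in out.splitlines() if line.strip()]


def login(target_iqn: str, portal: str) -> None:
    """Log the software initiator in to one discovered target."""
    subprocess.run(
        ["iscsiadm", "-m", "node", "-T", target_iqn, "-p", portal, "--login"],
        check=True,
    )


if __name__ == "__main__":
    for iqn in discover_targets(PORTAL):
        print("logging in to", iqn)
        login(iqn, PORTAL)
```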

    Isolate iSCSI SAN traffic

Gain performance benefits by segregating iSCSI SAN traffic from client network traffic. If you cannot physically isolate iSCSI traffic from the client network using dedicated switch hardware, configure VLANs on the switches. This matters because iSCSI traffic is unicast and can typically utilize the entire link. In fact, it is recommended that you disable unicast storm control on switches that handle iSCSI traffic and provide this feature; the use of broadcast and multicast storm control is still encouraged on switches.

    Optimize the network and switch infrastructure

For best performance, connect arrays and hosts to a Gigabit Ethernet switched network.

It is recommended that you enable jumbo frames and flow control (802.3x) on each network switch port that handles iSCSI traffic, and also on any server that uses a software iSCSI initiator and NIC combination. Not all switches can do both flow control and jumbo frames; if that is the case, choose flow control. On the PS Series array, configure and enable all interfaces to maximize network throughput to the array.
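Before relying on jumbo frames, it is worth confirming that they actually survive the whole path from host to array. The sketch below assumes a Linux host and the standard ping utility, sending non-fragmentable pings sized for a 9000-byte MTU; the array address is a placeholder.

```python
# Minimal sketch: verify that the path to the array really carries jumbo
# frames by pinging with "don't fragment" set and a payload sized for a
# 9000-byte MTU (9000 - 20 IP header - 8 ICMP header = 8972 bytes).
import subprocess

ARRAY_IP = "192.168.10.50"     # hypothetical array interface
JUMBO_PAYLOAD = 9000 - 20 - 8  # payload that fills a 9000-byte frame


def path_supports_jumbo(host: str) -> bool:
    """Return True if an 8972-byte, non-fragmentable ping succeeds (Linux ping)."""
    result = subprocess.run(
        ["ping", "-c", "3", "-M", "do", "-s", str(JUMBO_PAYLOAD), host],
        capture_output=True, text=True,
    )
    return result.returncode == 0


if __name__ == "__main__":
    if path_supports_jumbo(ARRAY_IP):
        print("Jumbo frames pass end to end; safe to leave them enabled.")
    else:
        print("A switch or NIC in the path is dropping jumbo frames; "
              "fall back to standard frames or fix the offending hop.")
```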

    Optimize I/O distribution across disk devices

Where data is written and how the storage subsystem is configured have a huge impact on performance. Ideally, I/O should be distributed across multiple disk drives (spindles) and multiple drive controllers to balance read and write requests. When using PS Series storage, there is no need to provision data manually because the SAN itself automatically distributes I/O across all the storage arrays with no interruption in data availability.


Storage used by key applications, such as Exchange and databases, must be highly scalable to accommodate changes in the workload such as:

Adding users

Increasing capacity needs

Performing service procedures such as replacing failed hardware

These changes should not affect users or data availability.

Adding storage: DAS vs. SAN

When using direct attached storage (DAS), you must plan the application server carefully because adding usable storage later is difficult. With DAS, bringing new storage resources online usually requires:

A full back-up

Installation of a new, bigger storage subsystem

A full restore

resulting in long downtime and many hours of work for administrators.

A PS Series SAN can easily handle this growth, so businesses don't have to over-provision storage resources or incur downtime to add capacity. On demand, you can seamlessly increase SAN capacity without affecting availability.

Expanding the SAN by adding network interfaces, controllers, and arrays can also improve fault tolerance and increase performance.


Characteristics of disk-to-disk and disk-to-disk-to-tape backups:

Heavy server loads: all of these approaches have servers performing the backup operations.

Long backup/restore operations: limited by drive performance, and data must be converted on restore.

12-24 hours of potential data loss: backups are run infrequently (typically, only once a day).

    Integration of snapshots into the backup process is a better solution.


    Snapshots

Provide instant copies of disks without overhead

Allow more frequent backup operations

Offer instant restore of disk images

Allow another server to access the snapshot


    The backup process is now cleaner. Snapshots:

    Offload application servers

    Improve backup of running applications

    Provide instant recovery

    Use less disk space than disk-to-disk backups

    Provide history (multiple copies)

Snapshots remain vulnerable, however, because they are still local to the parent disk: snapshot data is co-located with the primary storage and depends on access to the primary data. The backup taken from the snapshot therefore still needs to be archived to a remote site (the overall flow is sketched below).
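To make the cycle concrete, here is a minimal sketch of the snapshot-integrated backup flow described in this section. Every helper in it is a hypothetical stub standing in for whatever management interface is actually used (array CLI, Group Manager, or host integration tools); only the ordering of the steps comes from the text.

```python
# Minimal sketch of a snapshot-integrated backup cycle. The four helpers
# are hypothetical stubs, not a real array API; they only mark where the
# corresponding management operation would happen.
import datetime


def create_snapshot(volume: str, snap: str) -> None:
    print(f"[stub] create array snapshot {snap} of {volume}")


def present_to_backup_server(snap: str) -> str:
    print(f"[stub] present {snap} to the backup server")
    return f"/dev/mapper/{snap}"        # hypothetical device path


def archive_offsite(device: str, destination: str) -> None:
    print(f"[stub] copy {device} to {destination}")


def delete_snapshot(snap: str) -> None:
    print(f"[stub] remove snapshot {snap}")


def backup_volume(volume: str) -> None:
    snap = f"{volume}-{datetime.datetime.now():%Y%m%d-%H%M%S}"
    create_snapshot(volume, snap)                # instant, low-overhead copy
    try:
        device = present_to_backup_server(snap)  # offloads the application server
        # Snapshots share blocks with the parent volume, so the copy must
        # leave the array to protect against loss of the primary storage.
        archive_offsite(device, destination="remote-vault")
    finally:
        delete_snapshot(snap)                    # keep only the history you need


if __name__ == "__main__":
    backup_volume("exchange-db")
```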


Cache helps with write loads by buffering I/Os.

The system can sustain short write bursts of tens of thousands of I/Os to cache.

The cache flush is aggressive, yet efficient, due to seek optimizations.

Cache helps with read loads when access is sequential (and recent).

Keeping file systems defragmented helps maintain a sequential view to the array.


Routing: All PS Series arrays must be able to reach each other over ALL network links.

L2 Spanning Tree versus L3 routing is a customer choice; Dell EqualLogic does not make a specific recommendation. Note that Spanning Tree can delay interface availability.

Using multiple subnets can force traffic through the router; watch for signs that the router is becoming a bottleneck.

In V2.0, PS Series arrays learn router availability by monitoring RIP. OSPF is not supported.

When an initiator connects, the array pings the initiator through all active Ethernet interfaces.

If an interface can see the initiator, the array marks that interface as usable for that initiator.

Interfaces that cannot see the initiator are not included in port balancing (a sketch of this check follows).
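The sketch below illustrates that eligibility check on a Linux-style host: ping the initiator from each active interface and keep only the interfaces that answer. It is an illustration of the behavior described above, not the array firmware; the interface names and initiator address are placeholders.

```python
# Minimal sketch of the eligibility check described above: ping the
# initiator from each active interface and keep only the interfaces that
# can reach it for connection balancing.
import subprocess

ACTIVE_INTERFACES = ["eth0", "eth1", "eth2"]   # hypothetical array ports
INITIATOR_IP = "192.168.20.31"                 # hypothetical initiator address


def reachable(interface: str, host: str) -> bool:
    """Ping host from a specific source interface (Linux ping -I)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", "-I", interface, host],
        capture_output=True,
    )
    return result.returncode == 0


eligible = [i for i in ACTIVE_INTERFACES if reachable(i, INITIATOR_IP)]
print("interfaces usable for this initiator:", eligible)
```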

Ethernet port failover: better redundancy with fewer connections.

A link failure on a port causes failover to the corresponding port on the other CM (E series only).

Currently operates on eth0 and eth1.

The switchover occurs at the physical level, so the alternate port keeps the same MAC address.


When Spanning Tree reconvergence happens, it appears as 10-30 seconds of lost communication when connections start or stop. Shorten the convergence time in the switch (be careful of loops!) by using the Fast Connect, End Node, or Spanning Tree PortFast setting.

Flow control: Always enable it if possible.

Automatically enabled in the PS Series array

Enabled on each switch interface

Enabled on each NIC / iSCSI initiator

Jumbo frames: Enabling large frames (9K) on switches and iSCSI initiators yields maximum performance.

Jumbo frames are automatically enabled and disabled in the PS Series array.

Not all switches support jumbo frames.

Jumbo frames require the entire path (initiator to target) to support the jumbo frame size.

    Minimize routing.

    Minimize inter-switch links; use multiple trunks for higher bandwidth.

    Enable all of the network interfaces on the array(s).

    Cabling

Gigabit Ethernet requires Category (Cat) 5e or 6 cabling for optimal performance and operation. Using the wrong cable causes problems:

Missing or improper connections: are the link LEDs lit?

Improper cables (Cat 5 instead of Cat 5e, bad crimps)

An improper switch or failed auto-negotiation: a link running at 10 or 100 Mb/s shows up as slow performance. Major clue: throughput tops out around 12 MB/sec, which is what a 100 Mb/s link delivers (100 Mb/s divided by 8 bits per byte is about 12.5 MB/s before protocol overhead).

Always monitor the following (a host-side sketch follows this list):

Check error counters on the array Ethernet ports.

Check switch error counters for each port involved.

Check TCP/IP counters on the initiator or NIC.
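As one example of the initiator-side check, the sketch below samples the NIC error and drop counters that Linux exposes under /sys/class/net. The interface name is a placeholder, and the array and switch counters still have to be read with their own management tools.

```python
# Minimal sketch: poll host-side NIC error counters on Linux via
# /sys/class/net/<iface>/statistics. A steadily climbing error or drop
# counter usually points to a cabling or negotiation problem.
from pathlib import Path
import time

IFACE = "eth0"   # hypothetical iSCSI-facing NIC
COUNTERS = ["rx_errors", "tx_errors", "rx_dropped", "tx_dropped"]


def read_counters(iface: str) -> dict[str, int]:
    stats = Path("/sys/class/net") / iface / "statistics"
    return {c: int((stats / c).read_text()) for c in COUNTERS}


before = read_counters(IFACE)
time.sleep(60)                       # sample interval
after = read_counters(IFACE)

for name in COUNTERS:
    delta = after[name] - before[name]
    if delta:
        print(f"{name} increased by {delta} in the last minute - investigate")
```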
