Appendix B - Design Considerations v2


Emphasis on highly available, if not entirely nonstop, applications has drawn attention to storage as a potential point of failure. Even with fully redundant storage solutions (such as RAID mirroring), as long as storage is accessible only through the server, the server itself must also be made fully redundant. In a SAN, the storage system is independent of the application server and can be readily switched from one server to another. In principle, any server can provide failover capability for any other, resulting in more protected servers at lower cost than in traditional 2-4 server cluster arrangements.

A central pool of storage can be managed cost-effectively using SAN tools. Central management makes it easier to measure and predict demands for service, and it leverages management tools and training across a broader base of systems.

Also, the storage-to-storage connection allows some administrative operations to proceed independent of the load on the application servers. The most obvious of these are backup and archiving operations, but the list also includes replication and snapshots, expansion and reallocation of storage resources, and operation of support services such as location and name resolution.


System Domain

In this domain, you rely on system-specific availability attributes such as hot-plug devices, redundancy, reboot or remote reboot on failure, remote management, multithreaded operating systems, enhanced support, and uninterruptible power supplies (UPSs).

Application Domain

In this domain, you find availability components such as redundant servers, application clustering, load balancing, network connection redundancy, and transparent failover.

Storage Domain

In this domain, you rely on the availability components of your SAN, NAS, and/or DAS. Also in this domain, you focus on RAID, hot-swap drives, external storage, and back-up systems, with centrally managed back-up and restore.

Infrastructure Domain

This domain is responsible for continuous operation and real-time data availability:

Replication

Snapshots

Source: Storage Security: Protecting SANs, NAS, and DAS by John Chirillo and Scott Blaul


    Isolate applications and dedicate hardware

A single server handling multiple functions is an attempt to decrease hardware and licensing costs. This may reduce device management overhead, and there may be reasons to share network resources such as network links. However, as an organization grows, shared resources can become bottlenecks and behavior can become unpredictable.

Best practice is to isolate specific applications and application functions. Combining two I/O-intensive applications on a single server can cripple performance. You should also look at the interaction between the various application servers and other servers, such as the Microsoft Active Directory (AD) domain controllers.

    Optimize server iSCSI connectivity

You can connect any server to an iSCSI SAN using an iSCSI software initiator and NIC combination or an iSCSI HBA. A NIC with an iSCSI software initiator provides good performance in most installations; however, an iSCSI HBA may improve performance by offloading iSCSI processing from the CPU.

Another factor to consider is the placement of these devices on the I/O buses within the server. It can be helpful to install the interface in a PCI-X slot to benefit from the higher clock speed (133 MHz). Some servers have multiple PCI-X buses; where two or more HBAs or NICs are configured for MPIO, it can be beneficial to install these PCI-X devices on separate PCI-X buses.
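As a quick illustration of the software-initiator path, here is a minimal sketch that discovers and logs in to an iSCSI target using the standard Linux Open-iSCSI iscsiadm tool from Python. The portal address is a placeholder, and a production setup would normally also configure MPIO across the NICs discussed above.

```python
# Minimal sketch: discover and log in to an iSCSI target with the
# Open-iSCSI software initiator (Linux). The portal address and the
# discovered IQNs are illustrative placeholders.
import subprocess

PORTAL = "192.168.10.50"  # hypothetical group/portal IP of the array


def discover_targets(portal: str) -> list[str]:
    """Run sendtargets discovery and return the advertised IQNs."""
    out = subprocess.run(
        ["iscsiadm", "-m", "discovery", "-t", "sendtargets", "-p", portal],
        check=True, capture_output=True, text=True,
    ).stdout
    # Each line looks like "192.168.10.50:3260,1 iqn.2001-05.com.example:volume1"
    return [line.split()[-1] for line in out.splitlines() if line.strip()]


def login(target_iqn: str, portal: str) -> None:
    """Log the software initiator in to one discovered target."""
    subprocess.run(
        ["iscsiadm", "-m", "node", "-T", target_iqn, "-p", portal, "--login"],
        check=True,
    )


if __name__ == "__main__":
    for iqn in discover_targets(PORTAL):
        print("logging in to", iqn)
        login(iqn, PORTAL)
```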

    Isolate iSCSI SAN traffic

Gain performance benefits by segregating iSCSI SAN traffic from client network traffic. If you cannot physically isolate iSCSI traffic from the client network using dedicated switch hardware, configure VLANs on the switches. This matters because iSCSI traffic is unicast and can typically utilize the entire link. In fact, it is recommended that you disable unicast storm control on switches that handle iSCSI traffic and provide this feature; the use of broadcast and multicast storm control is still encouraged on switches.

    Optimize the network and switch infrastructure

For best performance, connect arrays and hosts to a Gigabit Ethernet switched network.

It is recommended that you enable jumbo frames and flow control (802.3x) on each network switch port that handles iSCSI traffic, and also on any server that uses a software iSCSI initiator and NIC combination. Not all switches can do both flow control and jumbo frames; if that is the case, choose flow control. On the PS Series array, configure and enable all interfaces to maximize network throughput to the array.
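Before relying on jumbo frames, it is worth confirming that they actually survive the whole path from host to array. The sketch below assumes a Linux host and the standard ping utility, sending non-fragmentable pings sized for a 9000-byte MTU; the array address is a placeholder.

```python
# Minimal sketch: verify that the path to the array really carries jumbo
# frames by pinging with "don't fragment" set and a payload sized for a
# 9000-byte MTU (9000 - 20 IP header - 8 ICMP header = 8972 bytes).
import subprocess

ARRAY_IP = "192.168.10.50"     # hypothetical array interface
JUMBO_PAYLOAD = 9000 - 20 - 8  # payload that fills a 9000-byte frame


def path_supports_jumbo(host: str) -> bool:
    """Return True if an 8972-byte, non-fragmentable ping succeeds (Linux ping)."""
    result = subprocess.run(
        ["ping", "-c", "3", "-M", "do", "-s", str(JUMBO_PAYLOAD), host],
        capture_output=True, text=True,
    )
    return result.returncode == 0


if __name__ == "__main__":
    if path_supports_jumbo(ARRAY_IP):
        print("Jumbo frames pass end to end; safe to leave them enabled.")
    else:
        print("A switch or NIC in the path is dropping jumbo frames; "
              "fall back to standard frames or fix the offending hop.")
```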

    Optimize I/O distribution across disk devices

Where data is written and how the storage subsystem is configured have a huge impact on performance. Ideally, I/O should be distributed across multiple disk drives (spindles) and multiple drive controllers to balance read and write requests. When using PS Series storage, there is no need to provision data manually because the SAN itself automatically distributes I/O across all the storage arrays with no interruption in data availability.


Storage used by key applications, such as Exchange and databases, must be highly scalable to accommodate changes in the workload such as:

Adding users

Increasing capacity needs

Performing service procedures such as replacing failed hardware

These changes should not affect users or data availability.

Adding storage: DAS vs. SAN

When using direct attached storage (DAS), you must plan the application server carefully because adding usable storage later is difficult. With DAS, bringing new storage resources online usually requires:

A full back-up

Installation of a new, bigger storage subsystem

A full restore

resulting in long downtime and many hours of work for administrators.

A PS Series SAN can easily handle this growth, so businesses don't have to over-provision storage resources or incur downtime to add capacity. On demand, you can seamlessly increase SAN capacity without affecting availability.

Expanding the SAN by adding network interfaces, controllers, and arrays can also improve fault tolerance and increase performance.


Characteristics of disk-to-disk and disk-to-disk-to-tape backups:

Heavy server loads: all of these approaches have servers performing the backup operations.

Long backup/restore operations: limited by drive performance, and data must be converted on restore.

12-24 hours of potential data loss: backups are run infrequently (typically, only once a day).

    Integration of snapshots into the backup process is a better solution.


    Snapshots

Provide instant copies of disks without overhead

Allow more frequent backup operations

Offer instant restore of disk images

Allow another server to access the snapshot


    The backup process is now cleaner. Snapshots:

    Offload application servers

    Improve backup of running applications

    Provide instant recovery

    Use less disk space than disk-to-disk backups

    Provide history (multiple copies)

Snapshots remain vulnerable, however, because they are still local to the parent disk: snapshot data is co-located with the primary storage and depends on access to the primary data. The backup taken from the snapshot therefore still needs to be archived to a remote site (the overall flow is sketched below).
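To make the cycle concrete, here is a minimal sketch of the snapshot-integrated backup flow described in this section. Every helper in it is a hypothetical stub standing in for whatever management interface is actually used (array CLI, Group Manager, or host integration tools); only the ordering of the steps comes from the text.

```python
# Minimal sketch of a snapshot-integrated backup cycle. The four helpers
# are hypothetical stubs, not a real array API; they only mark where the
# corresponding management operation would happen.
import datetime


def create_snapshot(volume: str, snap: str) -> None:
    print(f"[stub] create array snapshot {snap} of {volume}")


def present_to_backup_server(snap: str) -> str:
    print(f"[stub] present {snap} to the backup server")
    return f"/dev/mapper/{snap}"        # hypothetical device path


def archive_offsite(device: str, destination: str) -> None:
    print(f"[stub] copy {device} to {destination}")


def delete_snapshot(snap: str) -> None:
    print(f"[stub] remove snapshot {snap}")


def backup_volume(volume: str) -> None:
    snap = f"{volume}-{datetime.datetime.now():%Y%m%d-%H%M%S}"
    create_snapshot(volume, snap)                # instant, low-overhead copy
    try:
        device = present_to_backup_server(snap)  # offloads the application server
        # Snapshots share blocks with the parent volume, so the copy must
        # leave the array to protect against loss of the primary storage.
        archive_offsite(device, destination="remote-vault")
    finally:
        delete_snapshot(snap)                    # keep only the history you need


if __name__ == "__main__":
    backup_volume("exchange-db")
```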


Cache helps with write loads by buffering I/Os.

The system can sustain short write bursts of tens of thousands of I/Os to cache.

The cache flush is aggressive, yet efficient, due to seek optimizations.

Cache helps with read loads when access is sequential (and recent).

Keeping file systems defragmented helps maintain a sequential view to the array.


Routing: All PS Series arrays must be able to reach each other over ALL network links.

L2 Spanning Tree versus L3 routing is a customer choice; Dell EqualLogic does not make a specific recommendation. Note that Spanning Tree can delay interface availability.

Using multiple subnets can force traffic through the router; watch for signs that the router is becoming a bottleneck.

In V2.0, PS Series arrays learn router availability by monitoring RIP. OSPF is not supported.

When an initiator connects, the array pings the initiator through all active Ethernet interfaces.

If an interface can see the initiator, the array marks that interface as usable for that initiator.

Interfaces that cannot see the initiator are not included in port balancing (a sketch of this check follows).
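The sketch below illustrates that eligibility check on a Linux-style host: ping the initiator from each active interface and keep only the interfaces that answer. It is an illustration of the behavior described above, not the array firmware; the interface names and initiator address are placeholders.

```python
# Minimal sketch of the eligibility check described above: ping the
# initiator from each active interface and keep only the interfaces that
# can reach it for connection balancing.
import subprocess

ACTIVE_INTERFACES = ["eth0", "eth1", "eth2"]   # hypothetical array ports
INITIATOR_IP = "192.168.20.31"                 # hypothetical initiator address


def reachable(interface: str, host: str) -> bool:
    """Ping host from a specific source interface (Linux ping -I)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", "-I", interface, host],
        capture_output=True,
    )
    return result.returncode == 0


eligible = [i for i in ACTIVE_INTERFACES if reachable(i, INITIATOR_IP)]
print("interfaces usable for this initiator:", eligible)
```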

Ethernet port failover: better redundancy with fewer connections.

A link failure on a port causes failover to the corresponding port on the other CM (E series only).

Currently operates on eth0 and eth1.

The switchover occurs at the physical level, so the alternate port keeps the same MAC address.


When Spanning Tree reconvergence happens, it appears as 10-30 seconds of lost communication when connections start or stop. Shorten the convergence time in the switch (be careful of loops!) by using the Fast Connect, End Node, or Spanning Tree PortFast setting.

Flow control: Always enable it if possible.

Automatically enabled in the PS Series array

Enabled on each switch interface

Enabled on each NIC / iSCSI initiator

Jumbo frames: Enabling large frames (9K) on switches and iSCSI initiators yields maximum performance.

Jumbo frames are automatically enabled and disabled in the PS Series array.

Not all switches support jumbo frames.

Jumbo frames require the entire path (initiator to target) to support the jumbo frame size.

    Minimize routing.

    Minimize inter-switch links; use multiple trunks for higher bandwidth.

    Enable all of the network interfaces on the array(s).

    Cabling

Gigabit Ethernet requires Category (Cat) 5e or 6 cabling for optimal performance and operation. Using the wrong cable causes problems:

Missing or improper connections: are the link LEDs lit?

Improper cables (Cat 5 instead of Cat 5e, bad crimps)

An improper switch or failed auto-negotiation: a link running at 10 or 100 Mb/s shows up as slow performance. Major clue: throughput tops out around 12 MB/sec, which is what a 100 Mb/s link delivers (100 Mb/s divided by 8 bits per byte is about 12.5 MB/s before protocol overhead).

Always monitor the following (a host-side sketch follows this list):

Check error counters on the array Ethernet ports.

Check switch error counters for each port involved.

Check TCP/IP counters on the initiator or NIC.
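As one example of the initiator-side check, the sketch below samples the NIC error and drop counters that Linux exposes under /sys/class/net. The interface name is a placeholder, and the array and switch counters still have to be read with their own management tools.

```python
# Minimal sketch: poll host-side NIC error counters on Linux via
# /sys/class/net/<iface>/statistics. A steadily climbing error or drop
# counter usually points to a cabling or negotiation problem.
from pathlib import Path
import time

IFACE = "eth0"   # hypothetical iSCSI-facing NIC
COUNTERS = ["rx_errors", "tx_errors", "rx_dropped", "tx_dropped"]


def read_counters(iface: str) -> dict[str, int]:
    stats = Path("/sys/class/net") / iface / "statistics"
    return {c: int((stats / c).read_text()) for c in COUNTERS}


before = read_counters(IFACE)
time.sleep(60)                       # sample interval
after = read_counters(IFACE)

for name in COUNTERS:
    delta = after[name] - before[name]
    if delta:
        print(f"{name} increased by {delta} in the last minute - investigate")
```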
