Appendix B - Design Considerations v2 (7/29/2019)
Emphasis on highly available, if not entirely nonstop, applications has drawn attention to storage as a potential point of failure. Even with fully redundant storage solutions (such as RAID mirroring), as long as storage is accessible only through the server, the server itself must also be made fully redundant. In a SAN, the storage system is independent of the application server and can be readily switched from one server to another. In principle, any server can provide failover capability for any other, resulting in more protected servers at lower cost than in traditional 2-4 server cluster arrangements.
A central pool of storage can be managed cost-effectively using SAN
tools. Central management makes it easier to measure and predict
demands for service, and it leverages management tools and training
across a broader base of systems.
Also, the storage-to-storage connection allows some administrative operations to proceed independently of the load on the application servers. The most obvious of these are backup and archiving operations, but the list also includes replication and snapshots, expansion and reallocation of storage resources, and operation of support services such as location and name resolution.
System Domain
In this domain, you rely on system-specific attributes of availability such as hot-plug devices, redundancy, reboot or remote reboot on failure, remote management, multithreaded operating systems, enhanced support, and uninterruptible power supplies (UPSs).
Application Domain
In this domain, you find availability components like redundant
servers, application clustering, load balancing, network
connection redundancy, and transparent failover.
Storage Domain
In this domain, you rely on availability components of your SAN, NAS, and/or DAS. Also in this domain, you focus on RAID, hot-swap drives, external storage, and backup systems, with centrally managed backup and restore.
Infrastructure Domain
This domain is responsible for continuous operation and real-time data availability.
Replication
Snapshots
Source: Storage Security: Protecting SANs, NAS, and DAS by John Chirillo and Scott Blaul
Isolate applications and dedicate hardware
A single server handling multiple functions is an attempt to decrease hardware and licensing costs. This may decrease device management overhead, and there may be reasons to share network resources such as network links. However, as an organization grows, shared resources can become bottlenecks and behavior may become unpredictable.
Following best practices, it is recommended to isolate specific applications and application functions. Combining two I/O-intensive applications on a single server can cripple performance. You should also look at the interaction between the various application servers and other servers, such as Microsoft Active Directory (AD) domain controllers.
Optimize server iSCSI connectivity
You can connect any server to an iSCSI SAN using an iSCSI software initiator and NIC combination or an iSCSI HBA. Using a NIC with an iSCSI software initiator will provide good performance in most installations. However, an iSCSI HBA may improve performance by offloading CPU processing.
Another factor to consider is the placement of these devices on the I/O buses within the server. It can be helpful to install your interface in a PCI-X slot to benefit from the higher clock speed (133 MHz). Some servers have multiple PCI-X buses, and in cases where there are two or more HBAs or NICs configured for MPIO, it can be beneficial to install these PCI-X devices on separate PCI-X buses.
Isolate iSCSI SAN traffic
Gain performance benefits by segregating iSCSI SAN traffic from client network traffic. If you cannot physically isolate iSCSI traffic from the client network using dedicated switch hardware, configure VLANs on the switches. Because iSCSI traffic is unicast and can typically utilize the entire link, it is recommended that you disable unicast storm control on switches that handle iSCSI traffic and provide this functionality. Note that broadcast and multicast storm control is still encouraged on these switches.
Optimize the network and switch infrastructure
For best performance, you should connect arrays and hosts to a Gigabit Ethernet switched network.
It is recommended that you enable jumbo frames and flow control (IEEE 802.3x) on each network switch port that handles iSCSI traffic, and also on any server that is using a software iSCSI initiator and NIC combination. Not all switches can do both flow control and jumbo frames; if that is the case, choose flow control. On the PS Series array, configure and enable all interfaces to maximize network throughput to the array.
Optimize I/O distribution across disk devices
Where data is written and how the storage subsystem is configured have a huge impact on performance. Ideally, I/O should be distributed across multiple disk drives (spindles) and multiple drive controllers to balance read and write requests. When using PS Series storage, there is no need to provision data manually because the SAN itself automatically distributes I/O across all the storage arrays with no interruption in data availability.
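The PS Series does this distribution automatically, but the underlying principle can be sketched with a simple round-robin placement. This is a hypothetical illustration of why spreading requests balances load, not the array's actual (proprietary) placement algorithm:

```python
# Illustrative sketch only: round-robin distribution of I/O requests
# across disk spindles, so no single drive becomes a hot spot.

def distribute_io(requests, num_spindles):
    """Assign each request to a spindle in round-robin order."""
    queues = [[] for _ in range(num_spindles)]
    for i, req in enumerate(requests):
        queues[i % num_spindles].append(req)
    return queues

# 12 requests spread over 4 spindles: each spindle services 3.
queues = distribute_io(list(range(12)), 4)
print([len(q) for q in queues])  # [3, 3, 3, 3]
```

With the load spread evenly, each spindle's queue stays short and reads and writes complete in parallel rather than serializing behind one busy drive.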
Storage used by key applications, such as Exchange and databases,
must be highly scalable to accommodate changes in the workload
such as:
Adding users
Increasing capacity needs
Performing service procedures such as replacing failed hardware
These changes should not affect users or data availability.
Adding storage: DAS vs. SAN
When using direct attached storage (DAS), you must plan carefully to build and maintain the application server due to the difficulty of adding usable storage. With DAS, bringing new storage resources online usually requires:
A full backup
Installation of a new, larger storage subsystem
A full restore
This results in long downtime and many hours of work for administrators.
A PS Series SAN can easily handle the growth so that businesses don't have to over-provision storage resources or incur downtime to add capacity. On demand, you can seamlessly increase SAN capacity without affecting availability.
Expanding the SAN can also improve fault tolerance and increase performance, because adding network interfaces, controllers, and arrays increases both bandwidth and redundancy.
Characteristics of disk-to-disk and disk-to-disk-to-tape backups:
Heavy server loads
All have servers doing backup operations
Long backup/restore operations
Drive performance
Data must be converted on restore
12-24 hour data loss
Backups are run infrequently (typically, only once a day)
Integration of snapshots into the backup process is a better solution.
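The 12-24 hour data-loss figure above follows directly from backup frequency: the worst case equals the interval since the last completed backup. A small arithmetic sketch (the function name is illustrative):

```python
# Worst-case data loss (the recovery point) equals the time since the
# last completed backup. With one backup per day, up to 24 hours of
# changes can be lost; hourly snapshots shrink that window to 1 hour.

def worst_case_data_loss_hours(backups_per_day):
    return 24 / backups_per_day

print(worst_case_data_loss_hours(1))   # 24.0 - daily backup
print(worst_case_data_loss_hours(24))  # 1.0  - hourly snapshots
```

This is why more frequent snapshots, which are cheap enough to run many times a day, directly reduce exposure compared with a nightly backup window.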
Snapshots
Provides instant copies of disks without overhead
Allows more frequent backup operations
Offers instant restore of disk images
Allows another server to access the snapshot
sign Considerations B
The backup process is now cleaner. Snapshots:
Offload application servers
Improve backup of running applications
Provide instant recovery
Use less disk space than disk-to-disk backups
Provide history (multiple copies)
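The space savings typically come from copy-on-write: a snapshot records only blocks that change after it is taken, sharing everything else with the live volume. A minimal sketch, assuming a copy-on-first-write scheme (illustrative only, not the PS Series' actual block-level implementation):

```python
# Minimal copy-on-first-write snapshot sketch. The snapshot consumes
# no space at creation; it grows only as live blocks are overwritten.

class Volume:
    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snap_old = None  # old copies of blocks changed since snapshot

    def take_snapshot(self):
        self.snap_old = {}    # snapshot starts empty: nothing copied yet

    def write(self, n, data):
        if self.snap_old is not None and n not in self.snap_old:
            self.snap_old[n] = self.blocks[n]  # preserve old block once
        self.blocks[n] = data

    def read_snapshot(self, n):
        # Changed blocks come from the snapshot; the rest are shared.
        return self.snap_old.get(n, self.blocks[n])

vol = Volume(["a", "b", "c"])
vol.take_snapshot()
vol.write(1, "B")
print(vol.read_snapshot(1), vol.blocks[1])  # b B
print(len(vol.snap_old))                    # 1 - only one block stored
```

After one write, the snapshot holds a single preserved block rather than a full copy of the volume, which is the contrast with disk-to-disk backups drawn above.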
Snapshots are vulnerable because they are still local to the parent disk:
Snapshots are co-located with primary storage, and snapshot data requires access to the primary data.
A backup of the snapshot still needs to be archived to a remote site.
Cache helps write loads by buffering I/Os.
System can sustain short write bursts of tens of thousands of I/Os to cache.
Cache flush is aggressive, yet efficient, due to seek optimizations.
Cache helps with read loads when sequential (and recent).
Keeping file systems defragmented helps maintain a sequential view to the array.
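The seek optimizations mentioned above amount to ordering buffered writes by block address before destaging, so the heads sweep across the disk instead of bouncing back and forth. An elevator-style sketch (the array's actual flush policy is not public):

```python
# Elevator-style flush sketch: sorting buffered writes by block address
# before destaging reduces total head travel compared with flushing in
# arrival order. Illustrative only.

def seek_distance(order, start=0):
    """Total head travel to service writes in the given block order."""
    pos, total = start, 0
    for block in order:
        total += abs(block - pos)
        pos = block
    return total

pending = [90, 10, 80, 20, 70, 30]  # arrival order
optimized = sorted(pending)         # one sweep across the disk

print(seek_distance(pending))    # 390 - lots of back-and-forth
print(seek_distance(optimized))  # 90  - a single sweep
```

The same idea explains the defragmentation advice: a sequential on-disk layout lets the array service reads and flushes with minimal seeking.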
Routing: All PS Series arrays must be able to talk to each other over
ALL network links.
L2 Spanning Tree and L3 routing: This is a customer choice. Dell EqualLogic does not make any specific recommendations. Note that Spanning Tree can cause delays in interface availability.
Using multiple subnets can cause traffic to flow through the
router. Watch for signs it is becoming a bottleneck.
In V2.0, PS Series arrays learn router availability by monitoring
RIP. No support for OSPF.
When initiator connects, we ping the initiator through all active
Ethernet interfaces.
If we can see the initiator, we mark the interface as usable for
that initiator.
Interfaces that cannot see the initiator are not included in port
balancing.
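The eligibility logic above can be sketched as: ping the initiator through every active interface and include in port balancing only the interfaces that get a reply. A minimal sketch, where `reachable` is a stand-in for the array's actual ping test and the topology is hypothetical:

```python
# Sketch of the port-eligibility check: an interface participates in
# port balancing only if the initiator answers a ping through it.

def usable_interfaces(interfaces, initiator, reachable):
    """Return the interfaces through which the initiator is reachable."""
    return [eth for eth in interfaces if reachable(eth, initiator)]

# Hypothetical topology: eth0 shares a subnet with the initiator, eth1 doesn't.
links = {("eth0", "10.0.1.5"): True, ("eth1", "10.0.1.5"): False}
ok = usable_interfaces(["eth0", "eth1"], "10.0.1.5",
                       lambda eth, init: links[(eth, init)])
print(ok)  # ['eth0']
```

An interface that cannot see the initiator is simply left out of the balancing pool for that initiator, rather than causing connection failures.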
Ethernet port failover: better redundancy with fewer connections
Link failure on port causes failover to corresponding port on
other CM (E series only)
Currently, operates on eth0 and eth1
Switch occurs at the physical level, so the alternate port keeps the same MAC address.
Spanning Tree convergence appears as a loss of communication for 10-30 seconds when connections start or stop.
Shorten convergence time in switch. (Be careful of loops!)
Fast Connect, End Node, or Spantree Portfast setting
Flow control: Always enable if possible.
Automatically enabled in the PS Series array
Enabled on each switch interface
Enabled on each NIC / iSCSI initiator
Jumbo frames: Enabling large frames (9K) on switches and iSCSI initiators will yield maximum performance.
Jumbo frames are automatically enabled and disabled in the PS Series array.
Not all switches support jumbo frames.
Jumbo frames rely on the entire path (initiator to target) supporting the jumbo frame size.
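The payoff from jumbo frames is mostly per-frame overhead: every frame carries roughly the same header bytes, so a 9000-byte MTU wastes a smaller fraction of the link than a 1500-byte MTU. A back-of-envelope sketch (assumes plain 20-byte IP and 20-byte TCP headers inside the MTU and 18 bytes of Ethernet framing outside it; VLAN tags and options change the exact numbers):

```python
# Rough per-frame efficiency: TCP payload per byte on the wire.
# Assumes IP (20 B) + TCP (20 B) headers inside the MTU and 18 B of
# Ethernet header/FCS outside it; no VLAN tag, no header options.

def tcp_efficiency(mtu):
    payload = mtu - 40  # subtract IP + TCP headers
    wire = mtu + 18     # add Ethernet framing
    return payload / wire

print(round(tcp_efficiency(1500), 3))  # 0.962 - standard frames
print(round(tcp_efficiency(9000), 3))  # 0.994 - jumbo frames
```

The headline gain looks small, but jumbo frames also mean roughly six times fewer frames per megabyte, which cuts per-packet interrupt and CPU cost on software initiators.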
Minimize routing.
Minimize inter-switch links; use multiple trunks for higher bandwidth.
Enable all of the network interfaces on the array(s).
Cabling
Gigabit Ethernet requires Category (CAT) 5e or 6 cabling for optimal performance and operation. Using the wrong cable causes problems:
No or improper connections: are link LEDs lit?
Improper cables (Cat 5 instead of Cat 5e, bad crimps)
Improper switch settings or bad auto-negotiation: running at 10 or 100 Mb/s appears as slow performance.
Major clue: Throughput is ~12 MB/sec.
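That ~12 MB/sec figure is simply a 100 Mb/s link running flat out: divide the line rate in bits by 8 to get bytes, and real throughput lands a little below that ceiling once protocol overhead is subtracted.

```python
# Why ~12 MB/sec points at a link negotiated down to 100 Mb/s:
# line rate in megabits / 8 = ceiling in megabytes per second.

def line_rate_mb_per_sec(megabits):
    return megabits / 8

print(line_rate_mb_per_sec(100))   # 12.5  - Fast Ethernet ceiling
print(line_rate_mb_per_sec(1000))  # 125.0 - Gigabit Ethernet ceiling
```

Sustained throughput pinned near 12 MB/sec on a supposedly Gigabit path is therefore a strong hint that a port, cable, or auto-negotiation fell back to Fast Ethernet.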
Always monitor:
Check error counters on array Ethernet ports.
Check switch error counters for each port involved.
Check TCP/IP counters on initiator or NIC.