
Implementing Disaster Recovery using Veritas Global Clusters and EMC SRDF

Applied Technology

Abstract

This white paper documents the steps in setting up a multi-site disaster recovery solution using Veritas Cluster Server and EMC® SRDF® in a Solaris 10 environment. The paper goes on to describe the system's functionality and corresponding administrative tasks.

November 2009


Copyright © 2009 EMC Corporation. All rights reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com

All other trademarks used herein are the property of their respective owners.

Part Number h6521



Table of Contents

Executive summary
Introduction
    Audience
Technology overview
    EMC Symmetrix Remote Data Facility (SRDF)
    Veritas Cluster Server
System configuration and administration
    System components
        Hardware
        Software
    System architecture
    Storage configuration
    Host configuration
        Multipathing software
        Network interface
    SRDF/Star configuration
    Veritas Cluster configuration
        Post-installation configuration
        Configuring global service groups and resources
        Configuring Symmetrix heartbeat for VCS
    System administration
        VCS Cluster Manager
        CLI commands
Conclusion
References



Executive summary
Server clustering software is often deployed in the data center to implement high availability and load balancing across multiple hosts. Today, clustering products also protect entire data centers by connecting globally distributed clusters, enabling disaster recovery at a secondary site. This functionality was introduced to Veritas Cluster Server with the Global Cluster Option (GCO). When integrated with the EMC® Symmetrix® storage platform's SRDF® remote replication capability, GCO-enabled clusters provide simplified management and a robust framework for handling planned and unplanned data center downtime while minimizing the impact on application uptime. Veritas Cluster also supports the use of SRDF/Star to implement and protect a three-site solution.

This white paper is intended as a guide to installing and configuring Veritas Cluster GCO with EMC SRDF, focusing in particular on SRDF/Star.

Introduction
This white paper provides insight into the implementation of Veritas Cluster Server (VCS) with EMC SRDF. The paper follows a logical progression in setting up a generic, high-availability global cluster environment using SRDF/Star. The intent is that readers will be able to follow the documentation step by step to instantiate a disaster-recoverable system, which can then be further modified to fit specific applications and requirements. While this white paper is not intended to be a comprehensive guide to VCS and SRDF, many of the configuration steps can be generalized to other environments. The paper also provides information on the live environment's functionality and operational features.

VCS is a server-clustering solution designed for UNIX, Linux, and Windows operating systems. It makes applications highly available by enabling application failover in the event of component failures. With the Global Cluster Option, geographically dispersed Veritas Clusters can be linked together to protect applications from events affecting entire data centers; this expedites and simplifies disaster recovery as well as production site migrations. Applications can thus remain online during production site outages by moving to a backup cluster.

EMC SRDF, or Symmetrix Remote Data Facility, is a remote data replication product for EMC's flagship enterprise storage array. SRDF technology is storage-based, allowing for the replication of any data stored on one Symmetrix to a remote Symmetrix over a number of transport protocols. SRDF can be integrated into the cluster framework via VCS agents to simplify management. These agents support two- and three-site replication.

Audience This white paper is appropriate for technical personnel looking to implement and use VCS with SRDF. Knowledge in the system components and their operations will be helpful in customizing the environment beyond the steps taken in this paper.

Technology overview

EMC Symmetrix Remote Data Facility (SRDF) The EMC Symmetrix Remote Data Facility (SRDF) family of remote mirroring software is the most field-proven, widely deployed array-based disaster restart solution in the world with tens of thousands of licenses shipped in the most demanding customer environments. Leveraging the industry-leading high-end Symmetrix system, SRDF offers the most choice and flexibility to meet any service level requirement.

The SRDF family of software provides remote mirroring independent of the host and operating system, application, and database. SRDF remote mirroring helps companies manage planned and unplanned outages, enabling 7x24x365 data availability and allowing businesses to focus on maximizing revenue generation and customer support opportunities, improving productivity, and controlling or reducing costs for increased competitive advantage.

The SRDF family of software includes:

• SRDF/Synchronous (SRDF/S) maintains a real-time synchronized mirror of a Symmetrix production data device on a secondary-site Symmetrix data device, usually at campus, metropolitan, or regional distances, providing a recovery point objective of zero data loss.

• SRDF/Asynchronous (SRDF/A) maintains a near real-time synchronized mirror of a Symmetrix production data device on a secondary-site Symmetrix data device, usually at extended or out-of-region distances, providing a recovery point objective that can be as little as a few seconds.

• SRDF/Data Mobility (SRDF/DM) provides for the transfer of a Symmetrix production data device to a secondary-site Symmetrix data device at any distance, permitting information to be periodically mirrored for disaster restart, shared for decision support or data warehousing activities, or migrated between Symmetrix systems.

The SRDF family of software also consists of other options including advanced three-site capabilities using the combination of SRDF/S, SRDF/A, and/or SRDF/DM, offering the most comprehensive portfolio of remote mirror solutions in the industry to meet a wide range of business needs for high data availability, data mobility, and disaster restart requirements.

The other SRDF options and advanced three-site solutions include:

• SRDF/Automated Replication (SRDF/AR) enables rapid disaster restart over any distance with a two-site single-hop option using SRDF/DM in combination with TimeFinder, or a three-site multi-hop option using SRDF/S, SRDF/DM, and TimeFinder in combination.

SRDF/AR Single-Hop provides remote disaster restart at the secondary site with low data loss exposure (minutes to tens of minutes). SRDF/AR Multi-Hop provides long-distance remote disaster restart with zero data loss achievable at the "out-of-region" site.

• SRDF/Cluster Enabler (SRDF/CE) enables automated or semi-automated site failover using SRDF/S or SRDF/A with Microsoft Failover Clusters. SRDF/CE allows Windows Server 2003 and Windows Server 2008 Enterprise and Datacenter editions running Microsoft Failover Clusters to operate across a single pair of SRDF connected Symmetrix arrays as geographically distributed clusters.

• SRDF/Star is a three-site disaster-restart solution that can enable resumption of SRDF/A with no data loss between two remaining sites, providing continued remote-data mirroring and preserving disaster-restart capabilities.

It offers a combination of continuous protection, changed-data resynchronization, and enterprise consistency between two remaining sites in the event of the Workload Site going offline due to a site failure, fault, or disaster event.

As more businesses require solutions to provide the highest levels of disaster restart capabilities with zero to minimal data loss, and low RTO, SRDF/Star is the industry’s first solution to enable organizations to satisfy those requirements.

• SRDF Concurrent enables remote mirroring of a Symmetrix production site data device to two secondary-site Symmetrix data devices simultaneously, using either SRDF/S or a combination of SRDF/S and SRDF/A.

• SRDF Cascaded is an advanced three-site solution that can synchronously mirror a Symmetrix production site data device with SRDF/S to a secondary site Symmetrix data device, then asynchronously mirror that secondary Symmetrix data device with SRDF/A to an out-of-region Symmetrix data device, with no data loss achievable in the event of a production site disaster event.



• SRDF/Extended Distance Protection (SRDF/EDP) is a new two-site disaster restart solution that enables customers to achieve no data loss at an out-of-region site at a lower cost. Using the cascaded SRDF mode of operation as the building block for this solution, combined with the new diskless R21 device in the intermediate site, allows the intermediate site to provide data pass-through to the out-of-region site.

• SRDF/Consistency Groups (SRDF/CG), provided at no additional cost, ensures application-dependent write consistency of the application data being remotely mirrored by SRDF in the event of a rolling disaster, across multiple Symmetrix systems or across multiple devices within a Symmetrix, providing a business point of consistency for remote site disaster restart for all identified applications associated with a business function.

Veritas Cluster Server
Veritas Cluster is Symantec's server clustering product. It is supported on numerous operating systems, including Solaris (x86 and SPARC), AIX, RHEL, SLES, OEL, HP-UX, Windows, and ESX. Veritas Cluster makes applications highly available by enabling failover to other nodes in the event of component failures. Failure detection is handled through a heartbeat mechanism run on dedicated private networks. Data sharing through private intra-cluster networks is done through the Low Latency Transport (LLT) and Group Membership and Atomic Broadcast (GAB) components of VCS. Coordinator disks are used for I/O fencing.

Applications are protected by Veritas Cluster through the use of agents. An agent contains management methods that are used to start, stop, monitor, and perform corrective measures on the application. Agents are instantiated as resources, which can then be grouped into service groups. By creating dependencies between resources, a service group can be used to manage the entire operating environment supporting an application: from the application itself to its file systems, volume managers, and underlying data replication. Many agents are available for download from the Symantec website, in addition to agents created and maintained by partners. There is also a generic agent that can be used to manage custom applications or scripts. Service groups can be made active on any node in a cluster.

Building on the protection offered by Veritas Cluster, the Global Cluster Option allows geographically dispersed Veritas Clusters to communicate with each other over IP networks. Once an ICMP heartbeat is established, service groups may be reconfigured as global service groups, with an instance of the service group located at each site (that is, each cluster). Only one instance can be active at a given time. Using this mechanism to move the active site between clusters enables expedient recovery from data center disaster events as well as greater flexibility for planned downtime (for example, maintenance windows).
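To make these concepts concrete, the following is a minimal sketch of how a global service group with a resource dependency might appear in the VCS configuration file (main.cf). All group, resource, node, and cluster names here are hypothetical illustrations, not values from this paper's environment:

```
// Hypothetical main.cf fragment. ClusterList makes "appsg" a global
// service group that can fail over between two GCO-linked clusters.
group appsg (
    SystemList = { nodeA = 0 }
    ClusterList = { clusA = 0, clusB = 1 }
    Authority = 1
    )

    DiskGroup appdg_res (
        DiskGroup = appdg
        )

    Mount app_mnt (
        MountPoint = "/app"
        BlockDevice = "/dev/vx/dsk/appdg/appvol"
        FSType = vxfs
        FsckOpt = "-y"
        )

    // Dependency: the file system can only mount once the disk group
    // is imported, so bringing the group online orders the resources.
    app_mnt requires appdg_res
```

A real configuration would typically also include an SRDF replication resource at the bottom of the dependency tree, as shown later in this paper.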

System configuration and administration

System components

Hardware
• Sun Fire T2000 (with UltraSPARC T1 processors) running Solaris 10 update 5
• Symmetrix DMX-3 950 running Enginuity™ 5773.123.83

Software
• EMC Solutions Enabler 6.5.2.0
• Veritas Storage Foundation 5.0 MP3



System architecture
There are numerous viable ways to configure VCS with SRDF. Variables include the number of nodes per cluster, the number of data centers, replication distance, RTO and RPO, networking options, and so on.

For the purposes of this paper, which focuses on the use of SRDF in a multi-site VCS framework, three one-node Veritas clusters were configured. Corresponding to these are three Symmetrix DMX-3 950s, one for each cluster. The node in each cluster acts as the management host for the local Symmetrix. The arrays were zoned to each other through the SAN and configured for SRDF. With this infrastructure we can implement two-site or three-site disaster recovery. In addition to Veritas Cluster Server, Veritas Volume Manager was also installed and configured for each cluster.

Table 1 lists the naming convention followed in this paper.

Table 1. Equipment usage and designation

Cluster      Site A            Site B            Site C
Host         Licod230          Licod231          Licod207
Symmetrix    HK000190300359    HK000190300570    HK000190300571

Figure 1. System topology

Storage configuration The test environment used the following director flags: C, D, UWN, VCM, EAN.



As a matter of best practice, multiple paths should be made available between each host and its respective Symmetrix. In addition, using two or more RA directors for each RDF group allows for high availability and load balancing.

After the devices have been provisioned, each must be mapped and masked to the hosts. To map a range of devices to a Symmetrix director port, use the symconfigure command. For example:

Licod230:/> symconfigure -sid 359 commit -cmd "map dev 00A8:00B3 to dir 1C:0 starting lun=1;"

To mask a range of devices from a Symmetrix director port to an HBA, use the symmask command. For example:

Licod230:/> symmask -sid 359 add devs 00A8:00B3 -dir 1C -p 0 -wwn 210000E08B93CC86

To turn on dynamic SRDF capabilities for a LUN:

Licod230:/> symconfigure -sid 359 commit -cmd "set dev 00A8:00B3 attribute=dyn_rdf;"

The VCS I/O fencing feature requires that devices used as data disks or coordinator disks support SCSI-3 Persistent Reservations. To turn on the SCSI-3 Persistent Reservations flag:

Licod230:/> symconfigure -sid 359 commit -cmd "set dev 00A8:00B3 attribute=scsi3_persist_reserv;"

Host configuration

Multipathing software

PowerPath
EMC PowerPath® load balancing, optimized for Symmetrix, is enabled by the following command:

Licod230:/> powermt set policy=so dev=all

MPxIO
MPxIO is integrated with the Solaris operating system. To enable MPxIO, edit the file /kernel/drv/fp.conf by changing the value of the mpxio-disable line to "no":

mpxio-disable="no";

If the line does not exist, add it to the file.

To enable MPxIO operations with Symmetrix devices, edit the file /kernel/drv/scsi_vhci.conf to include the following lines:

device-type-scsi-options-list="EMC     SYMMETRIX", "symmetric-option";
symmetric-option=0x1000000;

Note that there should be five white spaces between "EMC" and "SYMMETRIX".

Perform a reconfigure reboot after these edits (ok boot -r).
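The five-space padding matters because the SCSI INQUIRY Vendor ID field is fixed at eight characters, so the three-character vendor string "EMC" must be padded to eight. As an illustrative sketch (working on a local sample copy rather than the live /kernel/drv/scsi_vhci.conf, and with hypothetical file names), the padding can be verified before rebooting:

```shell
#!/bin/sh
# Write a sample scsi_vhci.conf entry to a local copy, then verify that
# the vendor field pads "EMC" to 8 characters with exactly five spaces.
# In practice, point conf at /kernel/drv/scsi_vhci.conf instead.
conf=./scsi_vhci.conf

cat > "$conf" <<'EOF'
device-type-scsi-options-list="EMC     SYMMETRIX", "symmetric-option";
symmetric-option=0x1000000;
EOF

# grep for the exact 8-character vendor field: "EMC" + 5 spaces
if grep -q '"EMC     SYMMETRIX"' "$conf"; then
    echo "vendor padding OK"
else
    echo "vendor padding WRONG" >&2
    exit 1
fi
```

A mistyped entry (four or six spaces) would fail the check, which is an easy error to make when copying these lines by hand.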

VxDMP
VxDMP is turned on by default upon configuration of Veritas Volume Manager. VxDMP will multipath and load balance for devices provisioned from the Symmetrix.

To disable VxDMP, use the command vxdiskadm and select "Prevent multipathing/Suppress devices from VxVM's view". Options allow users to select a subset of devices by controller, by specified paths, or by Vendor ID and Product ID.

Licod230:/> vxdiskadm

Volume Manager Support Operations
Menu: VolumeManager/Disk

 1  Add or initialize one or more disks
 2  Encapsulate one or more disks
 3  Remove a disk
 ...
 17 Prevent multipathing/Suppress devices from VxVM's view
 18 Allow multipathing/Unsuppress devices from VxVM's view
 ...
Select an operation to perform: 17

Exclude Devices
Menu: VolumeManager/Disk/ExcludeDevices

VxVM INFO V-5-2-1239 This operation might lead to some devices being suppressed from VxVM's view or prevent them from being multipathed by vxdmp (This operation can be reversed using the vxdiskadm command).

Do you want to continue? [y,n,q,?] (default: y) y

Volume Manager Device Operations
Menu: VolumeManager/Disk/ExcludeDevices

 1  Suppress all paths through a controller from VxVM's view
 2  Suppress a path from VxVM's view
 3  Suppress disks from VxVM's view by specifying a VID:PID combination
 4  Suppress all but one paths to a disk
 5  Prevent multipathing of all disks on a controller by VxVM
 6  Prevent multipathing of a disk by VxVM
 7  Prevent multipathing of disks by specifying a VID:PID combination
 8  List currently suppressed/non-multipathed devices
 ?  Display help about menu
 ?? Display help about the menuing system
 q  Exit from menus

Select an operation to perform: 5

Exclude controllers from DMP
Menu: VolumeManager/Disk/ExcludeDevices/CTLR-DMP

Use this operation to exclude all disks on a controller from being multipathed by vxdmp. As a result of this operation, all disks having a path through the specified controller will be claimed in the OTHER_DISKS category and hence, not multipathed by vxdmp. This operation can be reversed using the vxdiskadm command.

VxVM INFO V-5-2-1263 You can specify a controller name at the prompt. A controller name is of the form c#, example c3, c11 etc. Enter 'all' to exclude all paths on all the controllers on the host. To see the list of controllers on the system, type 'list'.

Enter a controller name [<ctlr-name>,all,list,list-exclude,q,?] all

VxVM INFO V-5-2-1182 No disk will be multipathed by vxdmp as a result of this operation !

Continue operation? [y,n,q,?] (default: n) y

VxVM NOTICE V-5-2-1241 This operation will take effect only after a reboot.
...
VxVM vxdiskadm NOTICE V-5-2-1231 The system must be shut down and rebooted for the device suppression/unsuppression operations you have performed to take effect. To shutdown your system, cd to / and type shutdown -g0 -y -i6
Do not attempt to use the device suppression/unsuppression operations again before the system is rebooted.

Goodbye.

Network interface
In a multi-node Veritas cluster, each node requires at least three available network adapter ports: one for the public network and two for the dedicated intra-cluster private network. The requirement for two private network channels protects the cluster against network partitioning. For a single-node cluster, where there is no private network, only one network adapter is needed.
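For reference, LLT binds the private heartbeat links through the /etc/llttab file on each node. A minimal sketch for a Solaris node with two dedicated interfaces might look like the following; the node ID, cluster ID, and interface instances are illustrative assumptions, not values taken from this paper's single-node environment:

```
# /etc/llttab - hypothetical two-link LLT configuration
set-node 0
set-cluster 10
link link1 /dev/e1000g:1 - ether - -
link link2 /dev/e1000g:2 - ether - -
```

The cluster ID here must match the one supplied to the installvcs configuration dialog, and each link line names one dedicated private interface.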

SRDF/Star configuration
First, enable dynamic SRDF on each Symmetrix:

Licod230:/> symconfigure -sid 359 commit -cmd "set symmetrix dynamic_rdf=enable;"

Then enable the dynamic SRDF bit on the devices presented to the hosts:

Licod230:/> symconfigure -sid 359 commit -cmd "set dev 0101:0106 attribute=dyn_rdf;"

Next, from a corresponding control host, create a dynamic SRDF group for each of the three legs in the SRDF/Star configuration: the synchronous, asynchronous, and recovery links:

Licod230:/> symrdf addgrp -label vcssync -sid 359 -rdfg 80 -dir 2d -remote_sid 570 -remote_rdfg 80 -remote_dir 1d

Licod230:/> symrdf addgrp -label vcsasync -sid 359 -rdfg 81 -dir 2d -remote_sid 571 -remote_rdfg 81 -remote_dir 1d

Licod231:/> symrdf addgrp -label vcsrecov -sid 570 -rdfg 82 -dir 1d -remote_sid 571 -remote_rdfg 82 -remote_dir 1d

Further, an SRDF group can be spread across additional directors for load balancing and high availability using the symrdf modifygrp command. For example:

Licod230:/> symrdf modifygrp -add -rdfg 80 -sid 359 -dir 16d -remote_dir 16d

Now create the device pairs on the synchronous and asynchronous legs. The source device at the workload site (site A) will be concurrently replicated to both remote sites:

Licod230:/> symrdf createpair -f syncpairs -sid 359 -rdfg 80 -type R1 -rdf_mode sync -invalidate R2

Licod230:/> symrdf createpair -f asyncpairs -sid 359 -rdfg 81 -type R1 -rdf_mode async -invalidate R2

The input files "syncpairs" and "asyncpairs" contain a tab-delimited one-to-one mapping of R1 to R2 devices. Example:

#content of syncpairs, maps Sym device numbers from
#R1 (Sym 359) to R2 (Sym 570)
0101 00a8
0102 00a9
0103 00aa
0104 00ab
0105 00ac
0106 00ad
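Since createpair depends on this file being well formed, it can be worth sanity-checking it before committing. The following is an illustrative sketch (file name and sample contents are local stand-ins for a real pairs file) that verifies each non-comment line holds exactly two hexadecimal device numbers; awk accepts either tab or space separation:

```shell
#!/bin/sh
# Validate an SRDF createpair input file: every non-comment, non-blank
# line must contain exactly two whitespace-separated hex device numbers
# (R1 device, then R2 device).
pairs=./syncpairs

cat > "$pairs" <<'EOF'
#content of syncpairs, maps Sym device numbers from
#R1 (Sym 359) to R2 (Sym 570)
0101 00a8
0102 00a9
EOF

awk '
    /^#/ || NF == 0 { next }       # skip comment and blank lines
    NF != 2 || $1 !~ /^[0-9A-Fa-f]+$/ || $2 !~ /^[0-9A-Fa-f]+$/ {
        printf "bad line %d: %s\n", NR, $0
        bad = 1
    }
    END { exit bad }
' "$pairs" && echo "pairs file OK"
```

Running this against a file with a stray third column or a non-hex token prints the offending line number instead of "pairs file OK".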

At this point, we have a complete setup for concurrent SRDF. We can now create a composite group for monitoring and managing the replicated devices. Composite groups are also used to convert the concurrent SRDF setup into a SRDF/Star setup. To create a composite group named vcsstar_cg located at site A:

Licod230:/> symcg create vcsstar_cg -type rdf1 -rdf_consistency

The -rdf_consistency flag will cause the group to be added to the consistency database, which allows it to be enabled for SRDF/Star consistency protection. Once consistency is enabled, the composite group becomes known as a consistency group.

We now add the replicated devices into the composite group:

Licod230:/> symcg -cg vcsstar_cg -rdfg 80 -sid 359 addall dev

Though this command specifies only one RDF group, the composite group will actually become aware of both legs of the concurrent configuration. In our case, a subsequent symcg show will display information on both RDF groups 80 and 81.

Next, we use the symcg set -name command to provide a designation for the synchronous and asynchronous target sites. In this case, those sites are simply called site B and site C, respectively:

Licod230:/> symcg -cg vcsstar_cg set -name siteB -rdfg 359:80 -recovery_rdfg 82
Licod230:/> symcg -cg vcsstar_cg set -name siteC -rdfg 359:81 -recovery_rdfg 82

In these commands, the -rdfg parameter takes the local Symmetrix ID and the RDF group to be named, separated by a colon. The -recovery_rdfg flag dictates the RDF group used for the recovery link at the target site. In this case, RDF group 82 is used on both site B and site C for the recovery link should the workload site become unavailable.

The next step is to create an options file that will be used to dictate SRDF/Star parameters, such as the mapping between the synchronous and asynchronous target sites and their designated names, as defined in the symcg set -name commands above.

The following is the contents of the text options file sconfig:

SYMCLI_STAR_WORKLOAD_SITE_NAME = siteA
SYMCLI_STAR_SYNCTARGET_SITE_NAME = siteB
SYMCLI_STAR_ASYNCTARGET_SITE_NAME = siteC
SYMCLI_STAR_ALLOW_CASCADED_CONFIGURATION = YES

The only required lines are the first three, which map the workload, synchronous, and asynchronous target sites. There are a number of optional parameters that override default values. The full listing may be accessed through the man page for the symstar command.

Next, the symstar setup command is invoked from the workload site (Licod230 at site A) to build the internal SRDF/Star definition file for the R1 composite group.

Licod230:/> symstar -cg vcsstar_cg setup -options sconfig -nop

Once setup completes, the definition file can be distributed to the target sites to allow for SRDF/Star operations from the remote control hosts. The file has the same name as the composite group and can be found in /var/symapi/config/STAR/def/ on Solaris. Place this file into the same directory on the remote hosts and run the symstar buildcg command from the workload site:



Licod230:/> symstar -cg vcsstar_cg buildcg -site siteB -nop
Licod230:/> symstar -cg vcsstar_cg buildcg -site siteC -nop

With Solutions Enabler 7.0, the commands above are no longer necessary. By default, the configuration file is distributed to the remote sites.

Next, perform symstar connect commands to enable read/write on the SRDF links. The links are initially set to adaptive copy mode to quickly replicate data; the command does not wait for synchronization.

Licod230:/> symstar -cg vcsstar_cg -site siteB connect -nop
Licod230:/> symstar -cg vcsstar_cg -site siteC connect -nop

We can now perform symstar protect commands. This command checks that the links are synchronized. By default, once invalid tracks fall to 30,000 or fewer, the links are transitioned out of adaptive copy and into synchronous or asynchronous mode. This threshold can be changed via the options file. The command then enables SRDF consistency protection on each link.

Licod230:/> symstar -cg vcsstar_cg -site siteB protect -nop
Licod230:/> symstar -cg vcsstar_cg -site siteC protect -nop

Finally, we enable SRDF/Star protection using the symstar enable command. This command creates and initiates the SDDF (Symmetrix Differential Data Facility) resources. Once the R2 is recoverable on both legs, the Star system state becomes Protected.

Licod230:/> symstar -cg vcsstar_cg enable -nop

Veritas Cluster configuration

Post-installation configuration
The general configuration for Veritas Cluster can be completed separately after installation. To do so, run the installvcs script located on the installation media with the -configure argument. For a three-site solution, this must be done at all three sites.

Licod230:/> ./installvcs -configure

Veritas Cluster Server 5.0MP3 Configuration Program

Copyright (c) 2008 Symantec Corporation. All rights reserved. Symantec, the Symantec Logo are trademarks or registered trademarks of Symantec Corporation or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners.

The Licensed Software and Documentation are deemed to be "commercial computer software" and "commercial computer software documentation" as defined in FAR Sections 12.212 and DFARS Section 227.7202.

Logs for installvcs are being created in /var/tmp/installvcs-NBwhYS.

Enter the system names separated by spaces on which to configure VCS: Licod230

If you plan to run VCS on a single node without any need for adding cluster node online, you have an option to proceed without starting GAB and LLT. Starting GAB and LLT is recommended.

Do you want to start GAB and LLT? [y,n,q,?] (n) y

Initial system check:

Implementing Disaster Recovery using Veritas Global Clusters and EMC SRDF Applied Technology 12


Checking VCS installation on Licod230............................. 5.0
Checking architecture on Licod230............................... sparc

Checking system licensing

NFR VCS license registered on Licod230

Do you want to enter another license key for Licod230? [y,n,q] (n) n

Stopping VCS processes. Please wait...
VCS processes have been stopped.

Press [Return] to continue:

To configure VCS, please answer the sets of questions on the next screen. When [b] is presented after a question, 'b' may be entered to go back to the first question of the configuration set. When [?] is presented after a question, '?' may be entered for help or additional information about the question.

Following each set of questions, the information you have entered will be presented for confirmation. To repeat the set of questions and correct any previous errors, enter 'n' at the confirmation prompt. No configuration changes are made to the systems until all configuration questions are completed and confirmed.

Press [Return] to continue:

To configure VCS the following information is required:

A unique Cluster name
A unique Cluster ID number between 0-65535
Two or more NIC cards per system used for heartbeat links

One or more heartbeat links are configured as private links
One heartbeat link may be configured as a low priority link

All systems are being configured to create one cluster

Enter the unique cluster name: [?] siteC
Enter the unique Cluster ID number between 0-65535: [b,?] 0

Discovering NICs on Licod230 ... discovered e1000g0 e1000g1 e1000g2 e1000g3

To use aggregated interfaces for private heartbeat, enter the name of the aggregated interface. To use a NIC for private heartbeat, enter a NIC which is not part of an aggregated interface.


Enter the NIC for the first private heartbeat link on Licod230: [b,?] e1000g1
Would you like to configure a second private heartbeat link? [y,n,q,b,?] (y) y
Enter the NIC for the second private heartbeat link on Licod230: [b,?] e1000g2
Would you like to configure a third private heartbeat link? [y,n,q,b,?] (n) n
Do you want to configure an additional low priority heartbeat link? [y,n,q,b,?] (n) n

Cluster information verification:

Cluster Name: siteC
Cluster ID Number: 0
Private Heartbeat NICs for Licod230: link1=e1000g1 link2=e1000g2

Is this information correct? [y,n,q] (y) y

Veritas Cluster Server can be configured to utilize Symantec Security Services. Running VCS in Secure Mode guarantees that all inter-system communication is encrypted and that users are verified with security credentials. When running VCS in Secure Mode, NIS and system usernames and passwords are used to verify identity. VCS usernames and passwords are no longer utilized when a cluster is running in Secure Mode.

Before configuring a cluster to operate using Symantec Security Services, another system must already have Symantec Security Services installed and be operating as a Root Broker. Refer to the Veritas Cluster Server Installation Guide for more information on configuring a Symantec Product Authentication Service Root Broker.

Would you like to configure VCS to use Symantec Security Services? [y,n,q] (n) n

The following information is required to add VCS users:

A user name
A password for the user
User privileges (Administrator, Operator, or Guest)

Do you want to set the username and/or password for the Admin user (default username = 'admin', password='password')? [y,n,q] (n) y

Enter the user name: [b,?] (admin) root
Enter the password:
Enter again:

Do you want to add another user to the cluster? [y,n,q] (y) y

Veritas Cluster Server 5.0MP3 Configuration Program


VCS User verification:

User: root     Privilege: Administrators

Passwords are not displayed

Is this information correct? [y,n,q] (y) y

The following information is required to configure SMTP notification:

The domain-based hostname of the SMTP server
The email address of each SMTP recipient
A minimum severity level of messages to send to each recipient

Do you want to configure SMTP notification? [y,n,q,?] (y) n

The following information is required to configure SNMP notification:

System names of SNMP consoles to receive VCS trap messages
SNMP trap daemon port numbers for each console
A minimum severity level of messages to send to each console

Do you want to configure SNMP notification? [y,n,q,?] (y) n

The following is required to configure the Global Cluster Option:

A public NIC used by each system in the cluster
A Virtual IP address and netmask

Do you want to configure the Global Cluster Option? [y,n,q,?] (y) y

Active NIC devices discovered on Licod230: e1000g0

Enter the NIC for Global Cluster Option to use on Licod230: [b,?] (e1000g0) e1000g0
Is e1000g0 to be the public NIC used by all systems? [y,n,q,b,?] (y) y
Enter the Virtual IP address for the Global Cluster Option: [b,?] 10.243.157.70
Enter the netmask for IP 10.243.157.70: [b,?] (255.255.255.0) 255.255.255.0

Global Cluster Option configuration verification:

NIC: e1000g0
IP: 10.243.157.70
Netmask: 255.255.255.0

Is this information correct? [y,n,q] (y) y


Creating Veritas Cluster Server configuration files.............. Done
Copying configuration files to Licod230.......................... Done

Do you want to start Veritas Cluster Server processes now? [y,n,q] (y) y

Logs for installvcs are temporarily being created in /var/tmp/installvcs-NBwhYS.

Starting VCS: 0% 20% 40% 60% 80% 100%

Startup completed successfully on all systems

Configuration log files, summary file, and response file are saved at:

/opt/VRTS/install/logs/installvcs-NBwhYS

It is strongly recommended to reboot the following systems:

Licod230

Execute '/usr/sbin/shutdown -y -i6 -g0' to properly restart your systems.

General configuration for Veritas Cluster is now complete.

Configuring global service groups and resources

Next, we will use VCS Cluster Manager, the Java-based GUI, to build a global service group capable of failing over between sites. A service group is a collection of interdependent and standalone resources; it can be used to represent the entire management stack, from storage to application.

To start Cluster Manager, run the command hagui. Once logged in to the cluster, click the Add Resource Group icon on the quick bar. Give the service group a name; in this example, the service group was named testapp. Hosts in the cluster are initially shown on the left. To add the service group to multiple systems, select the hosts and click the right arrow. Users can specify whether the service group should start automatically on boot.

Users can also select the failover policy for the group: Failover, Parallel, or Hybrid. Failover is an active/passive configuration in which the application runs on only one host at a time, whereas Parallel is designed for active/active applications with multiple simultaneous instances, such as Oracle RAC. Hybrid is a mix of the two: the group behaves as a failover group within a system zone and as a parallel group across system zones. System zones are a logical partitioning of the hosts in a cluster and can be defined as an attribute of the resource group. A system's priority determines the order in which the service group fails over. Note that the Show Command button toggles a text window that dynamically shows the exact command to be run.
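The same group can be created from the command line. The sketch below is a hedged dry-run of the standard VCS configuration commands (the kind of output the Show Command button displays); commands are echoed rather than executed so the sketch can run anywhere, and the group and host names are the ones used in this example.

```shell
#!/bin/sh
# Dry-run wrapper: prints each VCS command instead of executing it.
# Change the body of run() to "$@" to execute on a real cluster.
run() { echo "+ $*"; }

run haconf -makerw                                # open the config read-write
run hagrp -add testapp                            # create the service group
run hagrp -modify testapp SystemList Licod230 0   # member hosts with priorities
run hagrp -modify testapp AutoStartList Licod230  # start automatically on boot
run haconf -dump -makero                          # commit and close the config
```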


Figure 2. Choose the Service Group Type

With the service group created, we can now add resources. By deploying agents as resources, we can control and manage applications, databases, and replication technologies through VCS. Each agent implements a number of standard "entry points" that allow it to communicate with the underlying management interface. For example, the SRDF and SRDF/Star agents use SymCLI scripts for control operations and to gather information. Some basic entry points are online, offline, monitor, and clean. Arguments defined at the time of resource creation modify the behavior of each entry point. For example, the HaltOnOffline attribute of an SRDF/Star resource determines whether an offline action performs a symstar halt.

Before we can make use of the SRDF/Star agent, we must import the SRDFStarTypes.cf file into the VCS configuration. Once the package for the agent is installed, the .cf file can be found in /etc/VRTSvcs/conf/. To import it, on the File menu, click Import Types, then locate the file.


Figure 3. Import the SRDFStarTypes file

Next, we add an SRDF/Star resource to the service group testapp. To add a resource, right-click on the service group and select Add Resource…

Figure 4. Add a resource

Select the SRDF/Star resource type and give it an appropriate name. A list of attributes corresponding to the resource type appears in the scroll box. To see a description of each attribute, click the Edit icon. Bolded attributes are required; in this case, users must supply the name of the consistency group. Note that toggling Critical causes the service group to fault when the resource, or any resource that depends on it, faults. Once values are specified for the mandatory attributes, the resource can be toggled to Enabled, which allows it to be monitored and brought online.

Repeat this to add other relevant resources to the service group. For example, testapp may consist of resources for an application, file system mount, VxVM volume and disk group, and SRDF replication.
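These resource additions can also be scripted. The dry-run sketch below shows the general shape of the CLI equivalent; the resource name (star_res) is hypothetical, SRDFStar is assumed to be the type name installed by the agent package, and GrpName is assumed to be the attribute carrying the consistency group name — consult the SRDF/Star agent guide for the exact names.

```shell
#!/bin/sh
# Dry-run wrapper: prints each VCS command instead of executing it.
run() { echo "+ $*"; }

run hares -add star_res SRDFStar testapp       # new resource of assumed type SRDFStar
run hares -modify star_res GrpName vcsstar_cg  # required: the consistency group name
run hares -modify star_res Critical 1          # fault the group if this resource faults
run hares -modify star_res Enabled 1           # allow monitoring and onlining
```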


Once all resources are defined, they can be linked together to form dependencies. When a service group is brought online, these parent-child relationships ensure that child resources (those that are depended on) come online before their parents. To create links, select the service group and click the Resources tab. Click Link, then click the resources in order from parent to child.
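From the CLI, the same dependencies are created with hares -link, from parent to child. The resource names below are hypothetical examples for a stack like testapp's; the dry-run wrapper only prints the commands.

```shell
#!/bin/sh
# Dry-run wrapper: prints each VCS command instead of executing it.
run() { echo "+ $*"; }

# Children come online before parents: replication first, app last.
run hares -link app_res mount_res   # the application depends on the mount
run hares -link mount_res dg_res    # the mount depends on the disk group
run hares -link dg_res star_res     # the disk group depends on SRDF/Star
```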

Figure 5. Create links

At this time, with all resources enabled, the service group is ready to start on the local cluster. Before doing so, however, we will configure it as a global service group. The first step is to connect the three separate clusters with ICMP heartbeats. To do so, on the Edit menu, select Add/Delete Remote Cluster… and fill in the IP address and user authentication for the remote cluster. Repeat this for all remote clusters; the heartbeat is set up automatically. Once the details are filled in, click Next, then Finish at the wizard summary window.

Figure 6. Enter cluster details


To facilitate the movement of the service group between clusters, it must be instantiated on each of the remote clusters. In this example, a service group named testapp is manually created on each of the clusters siteA, siteB, and siteC. The service group at each cluster is customized to represent the management stack specific to that site.

Once the three local service groups are created, they can be converted into a global service group. To do so, right-click on the service group, select Configure As Global…, and choose the remote clusters that will participate in the global service group. The three options for cluster failover policy are Manual, Auto, and Connected. In Manual mode, a service group never automatically fails over to another cluster; users must monitor the service group and react to any faults themselves. Auto mode enables the group to fail over automatically to another cluster if it is unable to fail over within the cluster, or if the entire cluster faults. Connected mode, in contrast, enables the group to fail over automatically only if it is unable to fail over within the cluster. After completing the form, click Next.
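The wizard's effect can be sketched in CLI terms: the participating clusters and their priorities are held in the group's ClusterList attribute, and the failover policy in ClusterFailOverPolicy. The dry-run below shows the assumed form of these commands using this example's cluster names.

```shell
#!/bin/sh
# Dry-run wrapper: prints each VCS command instead of executing it.
run() { echo "+ $*"; }

# Participating clusters with failover priorities (0 = preferred).
run hagrp -modify testapp ClusterList siteA 0 siteB 1 siteC 2
# Cluster failover policy: Manual, Auto, or Connected.
run hagrp -modify testapp ClusterFailOverPolicy Manual
```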

Figure 7. Select clusters for the service group

Next, enter the IP address of each remote cluster, or the IP address or hostname of a node in that cluster, along with the login information for the remote cluster. After all remote clusters are entered, click Next, then Finish.


Figure 8. Enter remote cluster information

Note that if the service group does not exist on all of the specified remote clusters, the process will not complete successfully. Otherwise, the service group is now configured as a global group.

Configuring Symmetrix heartbeat for VCS

An optional Symmetrix heartbeat can be configured. This heartbeat runs over the SRDF link between the local and remote arrays, which helps prevent VCS from mistakenly interpreting the loss of ICMP heartbeats over the public network as a site failure. The Symmetrix heartbeat is maintained via symrdf ping commands. To add a Symmetrix heartbeat at each cluster, on the Edit menu, click Configure Heartbeats. In the Heartbeat Configuration box, enter the name of the heartbeat (Symm) and select the checkboxes next to the clusters to which this heartbeat will apply. Next, click the Configure icon to customize the heartbeat settings. Specify the Symmetrix ID of the remote array as the value of the Arguments field. Finally, set the "Are You Alive" RetryLimit attribute to 1 less than that of the ICMP heartbeat. This ensures that VCS detects array failures first and does not confuse a site failure with an all-host failure.
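A rough CLI equivalent uses the hahb command. In the dry-run sketch below, the Symmetrix ID is a placeholder, AYARetryLimit is assumed to be the attribute name behind the "Are You Alive" retry limit, and the value 3 assumes an ICMP heartbeat retry limit of 4; verify the attribute names against the heartbeat agent documentation.

```shell
#!/bin/sh
# Dry-run wrapper: prints each VCS command instead of executing it.
run() { echo "+ $*"; }

run hahb -add Symm                                  # create the heartbeat
run hahb -modify Symm ClusterList siteB             # clusters it applies to
run hahb -modify Symm Arguments 000194900123 -clus siteB   # placeholder Symm ID
run hahb -modify Symm AYARetryLimit 3 -clus siteB   # assumed: 1 less than ICMP's
```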


Figure 9. Configure a Symmetrix heartbeat

System administration

VCS Cluster Manager

VCS Cluster Manager can be used to monitor and control the infrastructure.

Users can monitor the status of the cluster by clicking on the top level of the list view panel on the left in Cluster Manager. This brings up the relevant information on the main panel. By changing the view from Status to Remote Cluster Status, users can monitor other connected clusters, intercluster heartbeats, and global service groups. From here, users can also view Service Groups, System Connectivity, and Properties.


Figure 10. Monitor cluster status

The state of a service group can be monitored by clicking on it from the list view panel. The main panel can display three views: Status, Resources, and Properties. Status view informs users of the state of the group (that is, online, offline, faulted) as well as the names, types, and states of all resources. It also displays the status of the service group on other clusters if it is configured to be global. The Resources view visualizes the dependency relationships between resources. The Properties view allows users to view and modify the behavior of the group.

Similarly, information on specific resources in a group can be found by drilling down from the list panel.

To bring a service group online, ensure that all of its resources are enabled. To enable all resources in a service group, right-click on the group and select Enable Resources. The service group can then be started: right-click on the group and select Online, then either select a particular system or choose Any System. For global groups, there is also a Remote online option. Similarly, service groups are taken offline via the Offline menu.

Figure 11. Bring a service group online


To switch a service group from one cluster to another, right-click the relevant group and select Switch To, then Remote Switch… In the dialog box, specify the cluster to which the group will be switched. A particular node may be specified, or select Any System; then click OK.

Figure 12. Switch a service group

CLI commands

To check the status of local and remote clusters:

Licod230:/> hastatus

To obtain information on the state of a local or remote cluster:

Licod230:/> haclus -display [cluster | -localclus]

To display service group information across clusters:

Licod230:/> hagrp -display [service_group] [-attribute attributes] [-sys systems] [-clus cluster | -localclus]

To find information on the state of heartbeats configured on the local cluster:

Licod230:/> hahb -display [heartbeat ...]

To online a service group across clusters:

Licod230:/> hagrp -online service_group [-any | -sys system] [-clus cluster | -localclus]

Specifying the -any flag causes a failover group to come online on any one node in the specified cluster; a parallel group is brought online on all systems designated for the service group.

To take a service group offline across clusters:

Licod230:/> hagrp -offline [-ifprobed] service_group [-any | -sys system] [-clus cluster | -localclus]

Similarly, specifying the -any flag in this command brings all instances of the service group offline. In addition, the -ifprobed flag ensures that all resources are probed (that is, monitored to determine their status) before the operation takes place.

To switch a global service group across clusters:

Licod230:/> hagrp -switch service_group [-to system] [-clus cluster | -localclus]

If a particular system is not specified, the service group will be brought online on any node within the specified cluster.

For detailed information and a more complete list of VCS CLI commands and flags, see the Veritas Cluster Server User’s Guide as well as the command man pages.


Conclusion

Veritas Cluster Server allows applications to be monitored across a number of servers from a single pane of glass. VCS provides high availability by facilitating failover between nodes, and the VCS Global Cluster Option extends that failover capability to geographically dispersed clusters. SRDF mirrors devices between physically separate Symmetrix arrays, with solutions that span up to four sites. Leveraging the Symmetrix high-end enterprise framework, SRDF enables flexibility in disaster recovery deployment as well as massive scalability. By deploying SRDF's powerful data replication technologies as the underlying DR mechanism for intercluster failover, data centers gain business continuity for mission-critical applications.

References

For more information on SRDF, see the following on EMC's Powerlink® website:

• EMC Solutions Enabler Using SYMCLI to Implement SRDF/Star Technical Notes

• EMC Solutions Enabler Symmetrix SRDF Family CLI Product Guide

For more information on implementing SRDF with Veritas Cluster, see:

• Veritas Cluster Support for EMC SRDF

• Veritas Cluster Server Agent for EMC SRDF Installation and Configuration Guide

• Veritas Cluster Server Agent for EMC SRDF/Star Installation and Configuration Guide

For more information on Veritas Cluster Server, see the following resources and visit http://www.symantec.com/business/cluster-server.

• Veritas Cluster Server Installation Guide

• Veritas Cluster Server User’s Guide
