EMC RECOVERPOINT REPLICATION OF XTREMIO
Understanding the essentials of RecoverPoint Snap-based replication for XtremIO

White Paper, August 2015

Abstract

This white paper explains RecoverPoint replication of XtremIO arrays using Snap-Based replication technology. It discusses architecture, deployment, topologies and use cases of RecoverPoint protection for XtremIO.



Copyright © 2015 EMC Corporation. All Rights Reserved. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. The information in this publication is provided “as is.” EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. All other trademarks used herein are the property of their respective owners. Part Number H14296


Table of Contents

Executive Summary
  Audience
  Document List of Changes
Terminology
Snap-Based Replication
  Snap-Based Replication Use Cases
RecoverPoint Replication for XtremIO
  Snapshots in XtremIO
  Snapshots management in XtremIO
  RecoverPoint Snapshot Operations
  Replication Flow
  Image Access Flow
  Failover Flow
  Replication Modes
  Replication States
Configuring XtremIO Replication
  Zoning
  RPA Initiator Registration in XtremIO
  RecoverPoint Cluster Installation
  Registering the XMS in RecoverPoint
  Connectivity between RecoverPoint and XtremIO
  Replication configuration
  Default Replication Mode
  Maximum number of snapshots
  Required Protection Window
  Snapshot Pruning
Replication Topologies and Use Cases
  Use Cases
    Homogenous Replication
    Use Cases of Homogenous Replication
    Heterogeneous Replication: non-XtremIO to XtremIO and vice versa
    Use cases of Heterogeneous replication
Appendix: Different Ways to Protect XtremIO with RecoverPoint
  VPLEX Splitter
  RecoverPoint for Virtual Machines
Conclusion
References


Executive Summary

Data protection has become an integral and essential part of any successful business. The need for a powerful, scalable, yet simple disaster and operational recovery solution is at an all-time high.

XtremIO is the highly acclaimed all-flash array featuring scale-out architecture and ultra-high performance. XtremIO delivers high and consistent performance at all times, while remaining cost-effective across the board thanks to its inherent data reduction technologies.

EMC RecoverPoint is a popular replication solution and has worldwide deployments with both enterprise and commercial customers. It is a universal replication solution that supports all EMC block storage natively, and over 50 storage families through VPLEX storage virtualization.

This document is a comprehensive guide to all aspects of RecoverPoint protection for XtremIO arrays, and to how this solution enables world-class RecoverPoint protection for businesses running high-performance, XtremIO-based environments.

Audience

This white paper is intended for EMC customers, partners, and employees who want to better understand, evaluate, and choose their XtremIO replication solution using RecoverPoint. Familiarity with RecoverPoint and XtremIO-based solutions is required.

Document List of Changes

Date           Revision
August 2015    01


Terminology

RPA - The RecoverPoint Appliance is a hardware-based appliance that runs the RecoverPoint software.

RecoverPoint Cluster - A group of 2-8 RecoverPoint Appliances configured in a cluster.

RecoverPoint System - One or more connected RecoverPoint clusters.

Consistency Group - One or more volumes that must be kept consistent with one another and are therefore managed as a single group.

Splitter - A mechanism used to intercept writes so that they are sent to their normally designated storage volumes and to the RPA simultaneously.

Full Sweep - An efficient initialization process that is performed on all of the volumes in a consistency group when the RecoverPoint system cannot identify which blocks are identical between the production and replica volumes, and must therefore mark all blocks of all volumes in the consistency group as dirty.

Short init - An initialization process that uses marking information to re-synchronize a copy's replica volumes with their production sources.

Failover - Moving production to one of the copies.

Image Access - Enabling access to a selected point-in-time at one of the copies.

Recovery Point Objective (RPO) - The maximum amount of data that an organization is willing to lose in case of a disaster. For example, an RPO of 30 seconds means that in case of a disaster, no more than the data generated in the last 30 seconds may be lost.

Recovery Time Objective (RTO) - The duration of time within which a business process must be restored after a disaster. For example, an RTO of 1 hour means that in case of a disaster, the data must be restored within 1 hour.

Asynchronous Replication – A replication mode that enables you to replicate data over long distances while maintaining a dependent write consistent copy of data between the local and remote sites at all times.

Synchronous Replication – A replication mode in which the host initiates a write to the array at local site and the data must be successfully stored in both local and remote sites before an acknowledgement is sent back to the host. There is always only one outstanding IO per LUN in a synchronous replication.


Snap-Based Replication

Snap-based replication is a new asynchronous replication method introduced in RecoverPoint version 4.1. Snap-based replication utilizes array-based snaps and transfers the difference between them to the target, as opposed to normal asynchronous RecoverPoint replication, where writes are intercepted by the splitter before being sent to the target. Snap-based replication is available for VNX arrays when VNX is at the production copy, and for XtremIO when it is at the production and/or target copies.

In this paper we elaborate on snap-based replication for XtremIO volumes. While there are common concepts, the implementations of snap-based replication for VNX and for XtremIO are very different. For more information on snap-based replication for VNX, please refer to the RecoverPoint 4.1 Administrator's Guide as well as the RecoverPoint Deploying VNX and CLARiiON Arrays and Splitter Technical Notes.

The key differences between snap-based replication in XtremIO and Async replication are:

Write interception – With XtremIO at the production, there is no write splitter and no extra installations are required on the array. This is in contrast to Async replication of Symmetrix VMAX, VNX and VPLEX, which employs a write splitter integrated into the array operating environment.

Target side storage – When XtremIO is at the target, as discussed later in this paper, RecoverPoint distributes to XtremIO snapshots. Moreover, the replica volume is a reference to an array-based snap. In contrast, when non-XtremIO arrays are at the target, RecoverPoint writes to journal volumes and the data is distributed to the replica volumes by the target RPAs.

Granularity of Points-in-time – Asynchronous replication without snap-based replication provides near-zero RPO with any-point-in-time (AnyPiT) capability. In snap-based replication for XtremIO, the number of points-in-time is dictated by the maximum number of XtremIO snapshots that RecoverPoint, and the array in general, can create for a given volume or set of volumes. A minimum RPO of 60 seconds can be achieved in snap-based replication.

Snap-Based Replication Use Cases

Use Case 1: High performance environments – Snap-based replication is suitable for write-intensive host environments, since RecoverPoint with snap-based replication replicates deltas between array-based snapshots without intercepting the writes in real time as they are sent to the storage array.

Use Case 2: Limited WAN bandwidth – Where available bandwidth is limited, snap-based replication in periodic mode can provide WAN savings because of write folding. Write folding comes in addition to the other bandwidth reduction techniques RecoverPoint leverages, which are Deduplication and Compression. The different snap-based replication modes are discussed later in this document.

Use Case 3: Relaxed RPO – Snap-based replication can be configured where RPO requirements are less stringent. Additionally, a requirement for only a small number of Business Continuity copies is a good fit for Periodic snap-based replication, as the replication interval can be configured to suit the low frequency of points-in-time.
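The write folding mentioned in Use Case 2 can be illustrated with a short sketch. It assumes the usual meaning of the term: when a block is overwritten several times between two snapshots, only its final content appears in the snapshot delta and crosses the WAN. The function name and the sample write stream are hypothetical and are not part of RecoverPoint.

```python
def fold_writes(writes):
    """Collapse a sequence of (block_address, data) writes to the last write per block."""
    folded = {}
    for block, data in writes:
        folded[block] = data  # a later write to the same block replaces the earlier one
    return folded


# Hypothetical write stream seen by the array during one replication interval.
interval_writes = [(10, b"a1"), (11, b"b1"), (10, b"a2"), (10, b"a3"), (12, b"c1")]

folded = fold_writes(interval_writes)
print(f"{len(interval_writes)} host writes folded into {len(folded)} blocks to transfer")
```

In this toy case, five host writes result in three blocks being replicated. The longer the interval and the more often the same blocks are rewritten, the larger the saving.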

RecoverPoint Replication for XtremIO

This section discusses how XtremIO volumes are replicated by RecoverPoint. To understand the replication flows RecoverPoint employs when replicating from or to XtremIO, it is necessary to first understand the fundamentals of snapshots in XtremIO.

Snapshots in XtremIO

Snapshots in XtremIO are regular volumes created as writeable snapshots. Creating snapshots in XtremIO does not affect system performance, and a snapshot can be taken either directly from a source volume or from another snapshot. XtremIO snapshots are inherently writeable, but can be created as read-only. As of release 4.1SP2, RecoverPoint creates and manages only writeable snapshots. When a snap is created, the following steps occur:

1) Two empty containers are created in memory
2) The snapshot SCSI personality points to the new snapshot sub-node
3) The SCSI personality that the host is using is linked to the second node in the internal data tree


Figure 1: XtremIO snapshot creation diagram
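The snapshot creation steps listed above can be modeled with a toy data tree. This is only an illustrative sketch: the `Node` and `Personality` classes, the dictionary of blocks, and the volume names are assumptions made for the example and do not describe XtremIO's actual internal data structures.

```python
class Node:
    """A node in a toy snapshot tree; it holds only the blocks written at this node."""

    def __init__(self, parent=None):
        self.parent = parent
        self.blocks = {}


class Personality:
    """Stand-in for a SCSI personality: the identity a host addresses, bound to one node."""

    def __init__(self, name, node):
        self.name = name
        self.node = node

    def write(self, block, data):
        self.node.blocks[block] = data

    def read(self, block):
        node = self.node
        while node is not None:  # walk up the tree until an ancestor holds the block
            if block in node.blocks:
                return node.blocks[block]
            node = node.parent
        return None


def create_snapshot(volume, snap_name):
    parent = volume.node
    snap_node = Node(parent)  # step 1: two empty containers under the current node
    live_node = Node(parent)
    snapshot = Personality(snap_name, snap_node)  # step 2: snapshot personality -> new sub-node
    volume.node = live_node  # step 3: host personality -> second node
    return snapshot


vol = Personality("vol1", Node())
vol.write(0, "old")
snap = create_snapshot(vol, "vol1.snap1")
vol.write(0, "new")
print(vol.read(0), snap.read(0))
```

In this model no data is copied when the snapshot is taken: both personalities share the blocks of the common ancestor, and each one records only its own later writes.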

Snapshots management in XtremIO

To understand how RecoverPoint leverages XtremIO's snapshot technology, the following terms related to XtremIO's snapshot management need to be introduced:

Consistency Group – CGs are used to create a consistent image of a set of volumes. RecoverPoint uses XtremIO's consistency groups to create snapshots at both the production and the target. RecoverPoint Consistency Groups are fully aligned with XtremIO's Consistency Groups.

Snapshot Set – Snapshots taken at the exact same time on all volumes in a Consistency Group; in other words, a snapshot of a consistency group. RecoverPoint uses snap sets for various snap-specific operations, such as calculating the diff between snaps and promoting a snap.

DIFF protocol – A vendor-specific SCSI command that RecoverPoint uses to query XtremIO in order to obtain a bitmap of changes between two snapshot sets. RecoverPoint uses the output of the DIFF command to read the actual data and transfer it to the target side.
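As an illustration of how a change bitmap can drive the subsequent reads, the sketch below converts a bitmap into contiguous extents. The paper does not describe the bitmap format, so the one-flag-per-block list used here, the function name, and the sample values are all assumptions made for the example.

```python
def bitmap_to_extents(bitmap):
    """Convert a per-block change bitmap into (start_block, block_count) extents."""
    extents = []
    start = None
    for i, changed in enumerate(bitmap):
        if changed and start is None:
            start = i  # a run of changed blocks begins
        elif not changed and start is not None:
            extents.append((start, i - start))  # the run ended at block i - 1
            start = None
    if start is not None:  # the bitmap ended inside a run
        extents.append((start, len(bitmap) - start))
    return extents


# Hypothetical DIFF result for a 10-block volume: 1 = block changed between the snapshot sets.
diff_bitmap = [0, 1, 1, 0, 0, 1, 0, 1, 1, 1]
extents = bitmap_to_extents(diff_bitmap)
print(extents)  # [(1, 2), (5, 1), (7, 3)]
```

Only the blocks covered by these extents would need to be read from the newer snapshot and sent to the target.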

RecoverPoint Snapshot Operations


RecoverPoint is responsible for all aspects of snapshot management in XtremIO. The main operations are as follows:

1) Snap creation
2) Snap deletion
3) Snap promotion – an operation in which the SCSI personality of the root volume is moved to reference the snapshot being promoted.

Note that the XtremIO Administrator must not alter any snapshot created and managed by RecoverPoint.

Replication Flow

In snap-based replication for XtremIO, there are two cases where the replication flows are substantially different from splitter-based (normal) replication or from other snap-based replication mechanisms: XtremIO at the production copy and XtremIO at the target copy.

XtremIO volumes are configured on the production copy (source)

a. RecoverPoint creates the first snapshot from the root volume.
b. RecoverPoint requests a DIFF between the first snapshot and the root volume. Note that the DIFF of the first snapshot and the root volume returns all the written data on the root volume.
c. RecoverPoint performs initialization based on the DIFF result. Note that this triggers a full sweep. In a full sweep, the production and target volumes are read and only blocks that differ are transferred across the wire. This is an efficient replication method, since it minimizes WAN consumption even when replication is being configured for the first time. In XtremIO's case, the full sweep is based on a DIFF between the first snapshot and the root volume. The DIFF returns a bitmap of only the written blocks.
d. RecoverPoint creates a second snapshot, and the SCSI personality of the snapshot is moved to the new snapshot.
e. RecoverPoint requests a DIFF between the second snapshot and the first snapshot.
f. RecoverPoint deletes the first snapshot.
g. RecoverPoint performs initialization based on the DIFF between the two snapshots.
h. Steps d-g are repeated continuously.


Figure 2: Replication Flow – XtremIO at Production
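The production-side cycle (steps a-h above) can be simulated in a few lines. This is a minimal sketch under simplifying assumptions: volumes and snapshots are modeled as Python dictionaries of block contents, `diff` stands in for the DIFF command, `transfer` stands in for reading changed blocks and sending them to the target, and the host writes are invented sample data.

```python
def diff(newer, older):
    """Blocks whose content differs between two snapshots (stand-in for the DIFF bitmap)."""
    return {block for block, data in newer.items() if older.get(block) != data}


def transfer(snapshot, changed, target):
    """Read the changed blocks from the snapshot and apply them to the target copy."""
    for block in changed:
        target[block] = snapshot[block]


production = {0: "a", 1: "b"}  # hypothetical production volume with two written blocks
target = {}

first = dict(production)  # a. first snapshot
transfer(first, diff(first, {}), target)  # b-c. the initial DIFF returns all written blocks

previous = first
history = []
for host_writes in [{1: "b2"}, {2: "c"}, {0: "a2", 2: "c2"}]:
    production.update(host_writes)  # the host keeps writing between cycles
    current = dict(production)  # d. next snapshot
    changed = diff(current, previous)  # e. DIFF against the previous snapshot
    previous = current  # f. the older snapshot is dropped
    transfer(current, changed, target)  # g. replicate only the delta
    history.append(sorted(changed))  # h. repeat

print(history)  # [[1], [2], [0, 2]]
```

After every cycle the target matches the most recent snapshot, and only the blocks listed in `history` were sent in each cycle.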

XtremIO volumes are configured on the target copy

a. RecoverPoint creates a snapshot from the root volume, also referred to as the working snap.
b. RecoverPoint distributes to that working snap.
c. RecoverPoint creates another snap, and the snapshot SCSI personality is moved to the new snapshot.
d. RecoverPoint promotes the first snapshot. In this operation, the references to the root volume are changed to point to the first snapshot. Furthermore, the SCSI personality is moved as well. This promotion is done every 30 minutes.


Figure 3: Replication Flow – XtremIO at Target

Note that once RecoverPoint creates another snap, it starts distributing to it and a Point-in-time (PiT) is created in RecoverPoint. That PiT represents the first snapshot created on the target XtremIO array for that target copy. Every PiT in RecoverPoint equals a snapshot in XtremIO. This is quite different from a PiT when there are no XtremIO volumes at the target copy: there, a PiT represents a data point on the target journal, whereas with XtremIO it represents an array snap.

In normal replication, the replica volume always contains the most recent and consistent data, which is the latest PiT in the target journal. With XtremIO at the target, the replica volume contains the latest consistent PiT. Hence, with XtremIO at the target, the replica volume has the same capacity requirements as the production volume but is merely a reference to the latest consistent snapshot.

When replicating to XtremIO, the target journal volume is used for metadata, as the data itself is kept in the form of XtremIO snapshots. Consequently, the journal can be of minimal capacity. With RecoverPoint 4.1, the minimum journal capacity is 10GB for normal Consistency Groups and 40GB for Distributed Consistency Groups.

Another important matter to consider when planning, implementing or managing replication is that access to the replica volume is not denied when XtremIO is at the target. When there is a splitter on the target, in other words when non-XtremIO storage is at the target copy, the splitter prevents access by failing all IOs while not in image access. With XtremIO, there is no splitter and consequently nothing to prevent the user from accessing the replica or snaps while not in image access. It is highly recommended to mount the data on the replica only when in image access, and to unmount filesystems before disabling image access. Similarly, in failover, the former production is not blocked for IO operations. Therefore, it is vital to shut down production or unmount all relevant filesystems before performing a failover.

As of RecoverPoint 4.1SP2 with XtremIO 4.0, the mitigation for these caveats is that the snaps which RecoverPoint creates can only be managed by the rp_user user in XtremIO. Therefore, XtremIO users such as admin are not able to manage the snap sets, snaps and XtremIO Consistency Groups which RecoverPoint has created and constantly manages. In fact, the volumes that RecoverPoint creates in XtremIO show up as internal volumes in the XtremIO CLI and do not show up at all in the XtremIO management application. This is the behavior for all XtremIO users other than the "tech" user, the "rp_user" user, or any user which RecoverPoint uses to access the XMS (see the "Registering the XMS in RecoverPoint" section below).

Image Access Flow

Image access when XtremIO is at the target is substantially different from the image access flows used when non-XtremIO arrays are at the target. There are three types of image access when there are non-XtremIO volumes at the target: logged, virtual and direct image access. Image access for XtremIO is different in the sense that it does not involve data distribution as in logged image access, does not reference IOs to a journal as in virtual image access, and does not pause replication and move to marking mode as in direct image access. For more information on the different image access types, refer to the RecoverPoint 4.1 Administrator's Guide.

Image access for XtremIO volumes is instantaneous since, as can be seen in the flow below, it merely changes the references of the replica volume and has no impact on replication.

1) The user selects a certain Point-in-time to access.
2) Snap promotion - A new snap is created from the selected snap in order to store the writes made during image access. A new volume is created to reference the selected snap, and the SCSI personality is moved to that new volume.
3) The host may access the replica using the same SCSI personality.
4) During any of these operations, the replication flow at the target site continues without any disruption. That means that RecoverPoint continues to distribute to the working snap and to create new snaps. Nonetheless, promotion of snaps does not occur while an image is being accessed.
5) When the user disables image access, RecoverPoint simply resumes the snap promotion of the latest snapshot.


Figure 4: Image Access Flow

Failover Flow

Failover in RecoverPoint consists of three phases: image access, shifting the target to be production, and finally replication to the former production copy. Specifically, the failover flow consists of the following steps:

1) Image access takes place from the selected PiT. Once the user selects to fail over, the flow continues.
2) RecoverPoint creates a snap off of the working snap.
3) RecoverPoint calculates the DIFF between the latest snap and the snap selected for failover.
4) RecoverPoint calculates the DIFF between the accessed snap, including the writes made to it, and the snap selected for failover.
5) RecoverPoint merges these two DIFFs. This means that all the data which has been changed in both snaps will be replicated to the other side.
6) The target and production change roles. This is also referred to as "set as production", since the target is configured as the new production and the replication direction reverses.
7) RecoverPoint replicates data to the former production. Note that this step is optional: a failover can be performed without replicating data to the former production as part of the flow, and the replication can be done on demand after the failover has occurred.
8) RecoverPoint cleans up redundant snaps, that is, snaps and volumes which were created after the selected snap was created.
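The merge in step 5 can be pictured as a union of two change bitmaps. The paper does not specify how the merge is implemented, so the sketch below rests on assumptions: each DIFF is a list with one flag per block, and merging means a block is replicated if it changed in either DIFF. The function name and the sample bitmaps are hypothetical.

```python
def merge_diffs(diff_a, diff_b):
    """Union (bitwise OR) of two equal-length change bitmaps."""
    if len(diff_a) != len(diff_b):
        raise ValueError("bitmaps must cover the same number of blocks")
    return [a | b for a, b in zip(diff_a, diff_b)]


# Hypothetical DIFF results for a 6-block volume (1 = block differs from the selected snap).
latest_vs_selected = [0, 1, 1, 0, 0, 0]    # step 3: latest snap vs. snap selected for failover
accessed_vs_selected = [0, 0, 1, 0, 1, 0]  # step 4: accessed snap (with its writes) vs. selected snap

merged = merge_diffs(latest_vs_selected, accessed_vs_selected)
print(merged)  # [0, 1, 1, 0, 1, 0]
```

Under this interpretation, blocks 1, 2 and 4 would be sent to the former production so that it can be brought in line with the new production image.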

Note that the Recover Production flow is very similar. The only difference is that the roles do not shift; only the replication direction changes until the image is accessed on the production copy. Please refer to the RecoverPoint 4.1 Administrator's Guide for more information.

When failing over to or from an XtremIO copy, there are cases where the replication mode has to be changed before reversing replication. The reason is that snap-based replication is not supported on every array, so the replication mode may need to be modified manually (snap-based replication to Async, Async to snap-based replication, etc.). As of RecoverPoint release 4.2.SP1, the following table describes the expected behavior:

Table 1: Failover scenarios in XtremIO replication

Replication Modes


Snap-based replication can be configured on a per-link basis. Snap-based replication with XtremIO has two modes:

Continuous – In this mode, replication starts as soon as possible after the previous snap DIFF has finished replicating. Continuous mode offers the best RPO possible in snap-based replication, since the delay between the replication of the first DIFF and the second DIFF is minimal. RPO in Continuous mode should be planned for a minimum of 60 seconds. The effective RPO depends on variables such as the amount of changes to be transferred, available WAN bandwidth, RPA utilization, target side performance and more.

Periodic – Periodic mode is similar to Continuous snap-based replication, with the addition of a user-configurable time interval between the transfers. That interval can range from 1 minute to 1 day. The interval is counted from when the snap transfer began. If the interval is reached while replication is still active (the replication state is "Replicating Snap"), the next snap replication occurs right after the current replication is done.

Snap-based replication is a function of the array type used in production. These snap-based replication modes can be configured if XtremIO volumes are used in the production copy.

Figure 5: Snap-based Replication Modes
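The timing rules of the two modes can be expressed as a small function. This is a sketch of the behavior described above, not RecoverPoint code: the function name, the use of seconds, and the convention that `interval=None` denotes Continuous mode are assumptions made for the example.

```python
MIN_INTERVAL = 60          # 1 minute, lower bound of the Periodic interval per the text
MAX_INTERVAL = 24 * 3600   # 1 day, upper bound of the Periodic interval per the text


def next_snap_start(prev_start, prev_end, interval=None):
    """Earliest start time (in seconds) of the next snap transfer.

    interval=None models Continuous mode; a number of seconds models Periodic mode.
    """
    if interval is None:
        return prev_end  # Continuous: start right after the previous transfer finished
    if not MIN_INTERVAL <= interval <= MAX_INTERVAL:
        raise ValueError("Periodic interval must be between 1 minute and 1 day")
    # Periodic: the interval is counted from the start of the previous transfer,
    # but a transfer that overruns the interval delays the next one until it is done.
    return max(prev_start + interval, prev_end)


print(next_snap_start(0, 40))        # 40  (Continuous)
print(next_snap_start(0, 40, 300))   # 300 (Periodic, transfer finished within the interval)
print(next_snap_start(0, 420, 300))  # 420 (Periodic, transfer overran the interval)
```

The time between `prev_end` and the returned value corresponds to the period in which no snap deltas are being transferred.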

Replication States

Snap-based replication introduces two new replication states. The first is "Snap Idle", which represents a state where snap deltas are not being transferred. It can occur during the Periodic interval, or briefly between snap DIFF transfers in Continuous snap-based replication mode. The second is "Replicating Snap". This state represents active replication of snap deltas and is followed by a percentage that indicates the progress of the DIFF initialization.


Figure 6: Snap Idle State


Figure 7: Replicate Snap state

Configuring XtremIO Replication

RecoverPoint requires FC connectivity to the XtremIO array as well as IP connectivity. This section discusses the planning considerations as well as recommendations for the actual replication configuration.

Zoning


It is recommended to zone the RecoverPoint Appliances to all available storage controllers in an even manner. This means that, per fabric, all RPA FC ports should be zoned to all storage controller FC ports. RecoverPoint's built-in multipathing software works with a subset of the paths, spread evenly across all available storage controllers in a round-robin fashion. For simplicity, one zone per fabric containing all RPA ports and storage controller ports can be configured. The following is an example of a suggested zoning scheme:

Table 2: Example of RecoverPoint and XtremIO Zoning Scheme

RPA Initiator Registration in XtremIO

RecoverPoint appliances should be registered as a standard host. It is recommended to register the initiators as Linux OS initiators. Each port on the RPA should be registered separately in XtremIO. Afterwards, all RPA ports of the same designated cluster should be grouped into a single Initiator Group on XtremIO. If the RPA cluster is going to be deployed on XtremIO, then the RecoverPoint Repository volume must be mapped to the RPA initiator group.

RecoverPoint Cluster Installation

The RecoverPoint cluster installation flow has not been changed for XtremIO replication. It is the same RecoverPoint Installer flow, which must be run via RecoverPoint's Deployment Manager. For more information, please refer to the RecoverPoint Installation and Deployment Guide.

Registering the XMS in RecoverPoint

For every RecoverPoint cluster in which XtremIO replication is required, either at the production, the target or both, XtremIO's management server (XMS) must be registered in order to enable communication between RecoverPoint and XtremIO. The registration can be done via Unisphere for RecoverPoint, the CLI or the REST API. In Unisphere for RecoverPoint, navigate to RPA Clusters > Select Appropriate RP cluster > Storage > Add.


Figure 8: XMS Registration in Unisphere for RecoverPoint

Connectivity between RecoverPoint and XtremIO

After the XMS (XtremIO Management Server) has been registered in RecoverPoint, RecoverPoint communicates with the XMS over TCP port 443 (HTTPS) and retrieves the XtremIO SYM IPs. RecoverPoint then communicates with the SYM over TCP port 11111 (XML-RPC). The communication with the XMS is used for sending snapshot management commands after replication has been configured. These snapshot management commands are CG creation and CG modification. The first is used when a new CG or copy has been configured in RecoverPoint, and the second is initiated when an existing CG copy is altered.


If there is a connectivity problem between the RPAs and the XMS, or if the XMS fails, replication is not disrupted, but new configuration cannot take place for current CGs, and new volumes originating from the same XtremIO array cannot be protected.

The communication with the SYM over IP is used for ongoing snapshot management commands such as snapshot creation and snapshot promotion. This communication path is not used for configuration commands, only for ongoing snapshot-related operations.

Moreover, RecoverPoint communicates with XtremIO over FC for DIFF-related communication, such as the DIFF request and response, as well as for the actual reads if XtremIO is at the production, and for writes and reads when XtremIO is at the target. The following table summarizes the communication channels as well as the impact of connectivity loss:

Table 3: RecoverPoint and XtremIO communication channels
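The two IP channels can be checked for basic reachability before registering the XMS. The sketch below is a generic TCP connection test using Python's standard library, not a RecoverPoint or XtremIO tool. The host names are placeholders (assumptions), and only the port numbers 443 and 11111 come from the text. A successful connection shows that the port is reachable; it does not validate credentials or the protocol itself.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or name resolution failed
        return False


# Placeholder addresses; replace with the real XMS and SYM IPs of the environment.
CHECKS = {
    "XMS (HTTPS)": ("xms.example.com", 443),
    "SYM (XML-RPC)": ("sym.example.com", 11111),
}


def report(checks):
    """Map each channel name to True/False depending on TCP reachability."""
    return {name: tcp_port_open(host, port) for name, (host, port) in checks.items()}
```

Calling `report(CHECKS)` from a host on the same management network as the RPAs gives a quick indication of whether a firewall is blocking either channel.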

Replication configuration

The flow of Consistency Group creation is identical when protecting XtremIO volumes. For more information on how to create a Consistency Group in RecoverPoint, please refer to the RecoverPoint 4.1 Administrator's Guide.

Default Replication Mode

If XtremIO is at the production copy, then snap-based replication is automatically configured in Periodic mode with a 1-minute interval. This can be changed on a per-link basis by editing the link policy, either during Consistency Group creation or after the group has been created.


Figure 9: SBR mode configuration on CG creation

Figure 10: Editing SBR mode on an existing CG by editing its Link Policy

Maximum number of snapshots

If XtremIO volumes are at the target copy, the user can configure the maximum number of snapshots RecoverPoint will use. As of RecoverPoint release 4.1SP2, RecoverPoint can consume up to 500 snapshots per volume, which is effectively up to 500 snapshots per CG since all snaps are taken in parallel. The maximum number of snapshots also corresponds to the maximum number of points-in-time a given target copy can have. The higher the number of snapshots RecoverPoint can create, the lower the number of user snaps that can be created on the same volumes. As of XtremIO 4.0, the maximum number of snaps per volume is 512. For example, if the maximum number of snaps in RecoverPoint is set to 64, then 512 minus 64 leaves 448, which is the maximum number of user snapshots that can be taken from the same volume set. Note that if XtremIO is at the production, this setting takes effect only when the replication direction is changed and that production copy becomes a target copy. In terms of snapshot consumption on the production XtremIO volumes, RecoverPoint consumes 2 snapshots per volume for replication, as the DIFF protocol calculates changed blocks between two volumes. The first DIFF is taken with 1 snapshot, as it is done from the root volume. The maximum number of snapshots can be changed during CG creation, or afterwards by navigating to the copy and its Copy Policy.

Figure 11: Configuration of max snapshots in the group policy
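The snapshot budget arithmetic above can be expressed as a short helper. This is a minimal sketch: the function and constant names are illustrative, and treating any value below 1 as invalid is an assumption rather than a documented RecoverPoint rule.

```python
XTREMIO_MAX_SNAPS_PER_VOLUME = 512  # XtremIO 4.0 limit quoted above
RP_MAX_SNAPS_LIMIT = 500            # RecoverPoint 4.1SP2 limit quoted above


def user_snapshot_budget(rp_max_snaps: int) -> int:
    """Return how many user snapshots remain per volume.

    rp_max_snaps is the "maximum snapshots" value set in the copy policy.
    The lower bound of 1 is an assumption made for this sketch.
    """
    if not 1 <= rp_max_snaps <= RP_MAX_SNAPS_LIMIT:
        raise ValueError(
            f"rp_max_snaps must be between 1 and {RP_MAX_SNAPS_LIMIT}")
    return XTREMIO_MAX_SNAPS_PER_VOLUME - rp_max_snaps


print(user_snapshot_budget(64))   # 448, matching the example in the text
print(user_snapshot_budget(500))  # 12
```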


Required Protection Window

When the target side holds volumes from other storage arrays, such as VPLEX, VNX or Symmetrix, the Required Protection Window parameter simply alerts the user when the current protection window drops below the specified value. If there are XtremIO volumes at the target copy, the copy-policy setting “Required Protection Window” has an additional function: it determines the time window for which PiTs/snapshots are kept. Any snapshot older than the specified value is expired and deleted by RecoverPoint. As of RecoverPoint version 4.1SP2, the default required protection window when XtremIO is at the target is 30 days.

Figure 12: Required Protection Window setting under Copy policy
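The age-based expiration rule for XtremIO target copies can be illustrated as follows. This sketch only models the rule "older than the required protection window is expired"; the function name and the representation of snapshots as timestamps are assumptions, and it does not reflect how RecoverPoint implements expiration internally.

```python
from datetime import datetime, timedelta
from typing import Iterable, List

DEFAULT_REQUIRED_WINDOW = timedelta(days=30)  # default quoted above


def expired_snapshots(snapshot_times: Iterable[datetime],
                      now: datetime,
                      required_window: timedelta = DEFAULT_REQUIRED_WINDOW
                      ) -> List[datetime]:
    """Return the snapshot timestamps that fall outside the window."""
    cutoff = now - required_window
    return [taken for taken in snapshot_times if taken < cutoff]


now = datetime(2015, 8, 31, 12, 0)
snaps = [datetime(2015, 7, 15), datetime(2015, 8, 10), datetime(2015, 8, 31)]
print(expired_snapshots(snaps, now))  # only the 15 July snapshot is expired
```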

Snapshot Pruning

Replication to XtremIO is unique in that, among other aspects, it involves working with array-based snapshots. Snapshots are a finite array resource, and so is the number of points-in-time RecoverPoint can create on XtremIO-based target copies. Therefore, RecoverPoint allows the user to configure the maximum number of snapshots/PiTs per copy. To prevent the relatively low number of PiTs from impacting the protection window or RPO, a snapshot expiration mechanism deletes snapshots according to policy-based logic. The objective of that logic is to maintain the required protection window, with the assumption that the most recent snaps should be kept at higher granularity for operational recovery purposes. For example, with a required RPO of 1 minute and a required protection window of 30 days, and without expiring any snapshots, the theoretical maximum protection window would be 500 (maximum number of snaps) x 60 seconds (minimum RPO), which equals 30,000 seconds, or roughly 8 hours and 20 minutes. This emphasizes the need for a snapshot expiration mechanism to maximize the protection window. The policy under which that mechanism operates is static and defines the percentage of snaps to keep in each period of time. The following table presents the snapshot pruning policy:

Table 4: Snapshot Pruning Policy in RecoverPoint 4.1.2
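The protection window that would result without any pruning can be verified with a few lines of arithmetic. This is a sketch only; the function name is illustrative, and the inputs are the 500-snapshot maximum and 60-second minimum RPO quoted in the text.

```python
from datetime import timedelta


def unpruned_protection_window(max_snaps: int, rpo_seconds: int) -> timedelta:
    """Window covered if every snapshot is kept and none is ever pruned."""
    return timedelta(seconds=max_snaps * rpo_seconds)


window = unpruned_protection_window(max_snaps=500, rpo_seconds=60)
print(window)                         # 8:20:00
print(window < timedelta(days=30))    # True: far short of a 30-day window
```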

As can be seen from the table, the maximum protection window is 30 days, which is also the default value of the Required Protection Window parameter. This setting means that snapshots older than the configured value are deleted. If the required protection window is configured to less than 30 days, the pruning policy ignores the time windows that no longer apply and adjusts the percentages accordingly. For example, if the required protection window is configured as 2 days, snaps older than 2 days are expired by the snapshot pruning mechanism. The percentages of the remaining time windows are adjusted to sum to 100%, so the effective, approximate snapshot pruning policy is:

Age of snapshots    Percentage of total
0–2 hours           45%
2–24 hours          34%
1–2 days            21%

Table 5: Example of Pruning policy for 2 day Protection Window
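To make the percentages in Table 5 concrete, the sketch below converts them into snapshot counts for a given maximum, and shows one way percentages could be rescaled when a time window is dropped. Proportional rescaling is an assumption made for illustration; the text only states that the remaining windows are aligned to sum to 100%. All names are hypothetical.

```python
from typing import Dict, List, Tuple

Policy = List[Tuple[str, int]]  # (age bucket label, percentage of total)

TABLE_5_POLICY: Policy = [
    ("0-2 hours", 45),
    ("2-24 hours", 34),
    ("1-2 days", 21),
]


def snapshots_per_bucket(policy: Policy, max_snaps: int) -> Dict[str, int]:
    """Split a snapshot budget across age buckets (rounded down)."""
    total = sum(pct for _, pct in policy)
    return {label: max_snaps * pct // total for label, pct in policy}


def rescale(policy: Policy) -> Policy:
    """Rescale percentages proportionally so they sum to about 100.

    ASSUMPTION: proportional scaling; the exact rule is not given here.
    """
    total = sum(pct for _, pct in policy)
    return [(label, round(100 * pct / total)) for label, pct in policy]


print(snapshots_per_bucket(TABLE_5_POLICY, 500))
# {'0-2 hours': 225, '2-24 hours': 170, '1-2 days': 105}
```

With a maximum of 500 snapshots, roughly 225 of them would cover the most recent two hours, which reflects the intent of keeping recent points-in-time at higher granularity.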

The snapshot pruning mechanism deletes snapshots from the middle of the time window. Moreover, user bookmarks take precedence over other points-in-time, so if there is a bookmark in the middle of a time window, there are cases where the nearest system-generated PiT is deleted instead of the user bookmark. In fact, user bookmarks are deleted only when they reach 80% of the overall snapshot count. It is therefore imperative to closely monitor the number of user bookmarks, as they can negatively impact the protection window: older PiTs are deleted before user bookmarks are.
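A simple monitoring check for the share of user bookmarks might look like the sketch below. The 80% figure comes from the text; the `warn_at` early-warning level and both function names are hypothetical choices for illustration, not RecoverPoint settings.

```python
BOOKMARK_DELETION_THRESHOLD = 0.80  # share quoted in the text above


def bookmark_share(bookmark_count: int, total_snapshots: int) -> float:
    """Fraction of a copy's snapshots that are user bookmarks."""
    if total_snapshots <= 0:
        raise ValueError("total_snapshots must be positive")
    return bookmark_count / total_snapshots


def bookmarks_need_attention(bookmark_count: int, total_snapshots: int,
                             warn_at: float = 0.50) -> bool:
    """Flag a copy whose bookmark share has reached a warning level.

    warn_at is a hypothetical early-warning level chosen for this sketch.
    """
    return bookmark_share(bookmark_count, total_snapshots) >= warn_at


print(bookmarks_need_attention(100, 500))  # False: 20% of snapshots
print(bookmarks_need_attention(300, 500))  # True: 60% of snapshots
```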

Replication Topologies and Use Cases

RecoverPoint replication of XtremIO volumes using snap-based replication fully supports heterogeneous and homogenous replication, as well as local and/or remote replication, including the ability to leverage RecoverPoint’s multisite capabilities. Heterogeneous replication means that it is possible to replicate from XtremIO to non-XtremIO arrays and vice versa.

Multisite replication enables RecoverPoint to support concurrent replication (FAN-OUT) and FAN-IN. In terms of system topology, it is possible to have multiple RecoverPoint clusters in a single RecoverPoint system. Connectivity between these clusters can be over FC or IP. All clusters in a system can be connected (MESH/Full topology), or only some clusters can be connected (STAR/Partial topology), which is possible but reduces the flexibility of replication relationships. RecoverPoint supports CGs with up to 5 copies and up to 5 clusters in a system. As of RecoverPoint release 4.1SP2, two remote copies are supported per CG when at least one copy has XtremIO volumes. Furthermore, a RecoverPoint system that replicates to or from XtremIO volumes is supported with a maximum of 3 clusters.


Figure 13: Example of Possible CG Topologies
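The topology limits stated above can be checked with a small validation helper. This is a sketch under the limits quoted for RecoverPoint 4.1SP2; the function and parameter names are hypothetical and it does not cover any limits not mentioned in this section.

```python
from typing import List


def topology_problems(cluster_count: int,
                      copies_in_cg: int,
                      remote_copies_in_cg: int,
                      cg_has_xtremio_copy: bool,
                      system_has_xtremio: bool) -> List[str]:
    """Return the limits (as quoted in the text) that a design exceeds."""
    problems = []
    if copies_in_cg > 5:
        problems.append("a CG supports at most 5 copies")
    max_clusters = 3 if system_has_xtremio else 5
    if cluster_count > max_clusters:
        problems.append(f"system supports at most {max_clusters} clusters")
    if cg_has_xtremio_copy and remote_copies_in_cg > 2:
        problems.append("a CG with an XtremIO copy supports at most "
                        "2 remote copies")
    return problems


# A 4-cluster system replicating XtremIO volumes exceeds the cluster limit.
print(topology_problems(4, 3, 2, True, True))
```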

It is worth mentioning that in RecoverPoint every copy is independent, so every target copy has its own independent PiTs. This also allows for a different storage array or splitter per copy. Moreover, RecoverPoint fully supports bi-directional replication, as the production role is set at the consistency group copy level. Currently, VNX and Symmetrix volumes can co-exist on the same copy. XtremIO volumes cannot co-exist with volumes that belong to other storage arrays, including a different XtremIO array. This restriction applies to co-existence within a single copy; mixing different arrays or splitter types across copies is fully supported. The following table summarizes the rules for co-existence of different storage arrays or splitters in a single RecoverPoint system:

Table 6: Co-existence rules as of RecoverPoint 4.1SP2
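The per-copy XtremIO rule described above can be sketched as a check over the volumes of a single copy. The volume representation (array type plus an array identifier) and the function name are assumptions for illustration, and only the XtremIO rule stated in the text is modeled, not the full co-existence table.

```python
from typing import Iterable, Tuple

Volume = Tuple[str, str]  # (array type, array identifier), both hypothetical


def violates_xtremio_rule(copy_volumes: Iterable[Volume]) -> bool:
    """True if XtremIO volumes share a copy with any other array.

    "Other array" includes a different XtremIO array, per the text above.
    """
    arrays = {(kind.lower(), ident) for kind, ident in copy_volumes}
    has_xtremio = any(kind == "xtremio" for kind, _ in arrays)
    return has_xtremio and len(arrays) > 1


print(violates_xtremio_rule([("XtremIO", "A"), ("XtremIO", "A")]))  # False
print(violates_xtremio_rule([("XtremIO", "A"), ("XtremIO", "B")]))  # True
print(violates_xtremio_rule([("VNX", "C"), ("Symmetrix", "D")]))    # False
```

Because the rule applies per copy, the check would be run once for each copy of a consistency group rather than across the whole group.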


Use Cases

Let us explore some of the possible replication topologies and the use cases they enable:

Homogenous Replication

Figure 14: Homogenous replication

In Homogenous replication, XtremIO is at both the production and the target. Both the production and replica snap-based replication flows apply, as XtremIO snaps are leveraged on the production and target arrays.

Use Cases of Homogenous Replication

Homogenous XtremIO replication enables Disaster Recovery/Business Continuity solutions for high-performing host environments residing on XtremIO volumes. As discussed earlier in this document, RecoverPoint replication for XtremIO using snap-based replication can deliver an RPO as low as 60 seconds, with multiple points-in-time for Operational Recovery purposes as well. Additionally, replication to XtremIO offers a low RTO, since image access is instantaneous. The RecoverPoint and XtremIO integration also supports end-to-end scale-out: if there is a need to add more capacity or to enable higher performance levels, XtremIO X-Bricks and RecoverPoint appliances can be added non-disruptively.

Heterogeneous Replication: non-XtremIO to XtremIO and vice versa


Figure 15: Heterogeneous Replication

In Heterogeneous Replication, there are two distinct cases:

a. XtremIO to non-XtremIO: Snap-based replication is used at the production XtremIO array, while at the target a splitter is used on the supported arrays and changes are kept in a journal. Since every copy is independent, the replication behavior at the target does not change with the replication mode employed at the production. Nevertheless, the effective RPO is determined by the replication mode at the production. The only exception to that statement is when XtremIO is at the target.

b. Non-XtremIO to XtremIO: A splitter is used at the production while XtremIO snapshots are used at the target. When replicating to XtremIO from splitter-based volumes, the effective RPO is 60 seconds. The RPAs at the production send bulks of writes as received from the splitter, normally with no regard to the replication behavior at the target. The RPAs at the target receive these writes and send them to the working snap in XtremIO. That working snap is promoted at a fixed 60-second interval, hence the RPO is 1 minute.

Use cases of Heterogeneous replication

Migrations or Data Center Relocations

Heterogeneous replication can be suitable for cases where there is replication between the production copy on a non-XtremIO array, such as Symmetrix, and another non-XtremIO array, such as VNX. When XtremIO is added at the production site, RecoverPoint can be used to replicate data from the production array concurrently to XtremIO while maintaining the DR copy on the remote VNX. The user can fail over the production to the local copy residing on XtremIO and fail back if needed. After the failover, replication to the remote site goes through a short initialization without impact to RPO.

Figure 16: Migrations Leveraged by Heterogeneous Replication

Technology Refresh

If there is a need to perform a Tech Refresh on the non-XtremIO production array, production can be made available on the target XtremIO by performing a failover in RecoverPoint, and failed back when the former production array is operational again. This enables relatively short production downtime because the production applications are brought up at the target XtremIO. In a scenario where such a Tech Refresh is needed because of power maintenance, OS patches or an unplanned need to shut down the production host or array, shut down the production applications and servers, create a bookmark for the relevant Consistency Groups, wait until replication is done, and finally fail over to XtremIO using the bookmark previously created. Once the production environment is back up, resume replication to the non-XtremIO based copy. To fail back, wait until there are points-in-time on the target, shut down production, create a bookmark, wait until replication is done, and finally fail over to the former production copy.


Figure 17: Tech Refresh failover example

Figure 18: Tech Refresh failback example
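The Tech Refresh failover and failback sequences described above are order-sensitive, so they can be tracked as an ordered checklist. The sketch below is purely illustrative: it does not call any RecoverPoint interface, and the `run_checklist` helper and its `confirm` callback are hypothetical stand-ins for an operator confirming each step.

```python
from typing import Callable, List

FAILOVER_STEPS = [
    "Shut down production applications and servers",
    "Create a bookmark for the relevant Consistency Groups",
    "Wait until replication is done",
    "Fail over to XtremIO using the bookmark",
]

FAILBACK_STEPS = [
    "Resume replication to the non-XtremIO based copy",
    "Wait until there are points-in-time on the target",
    "Shut down production",
    "Create a bookmark",
    "Wait until replication is done",
    "Fail over to the former production copy",
]


def run_checklist(steps: List[str],
                  confirm: Callable[[str], bool]) -> List[str]:
    """Walk the steps in order; stop at the first step not confirmed."""
    completed = []
    for step in steps:
        if not confirm(step):
            break
        completed.append(step)
    return completed


# Example: every step is confirmed, so the whole failover list completes.
print(len(run_checklist(FAILOVER_STEPS, lambda step: True)))  # 4
```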

Post-Processing

Heterogeneous replication capabilities can also be leveraged to create Development, Test, or other post-processing copies on XtremIO. RecoverPoint can replicate from EMC and 3rd party arrays to XtremIO so that XtremIO’s inherent snapshot and data reduction capabilities apply. Specifically, for such post-processing copies, XtremIO user snapshots can be taken from the RecoverPoint replica.

Figure 19: Post-Processing with Heterogeneous replication

Appendix: Different Ways to Protect XtremIO with RecoverPoint

This paper discussed how RecoverPoint can protect XtremIO block-based volumes using snap-based replication. It is worth mentioning other ways XtremIO can be protected using RecoverPoint:

VPLEX Splitter

XtremIO volumes can be encapsulated into VPLEX. As a result, the VPLEX splitter can be used with RecoverPoint to provide granular, journal-based Async or Sync replication. In addition, the inherent continuous availability, storage virtualization and non-disruptive data migration capabilities of VPLEX apply transparently to hosts working with VPLEX and XtremIO back-end arrays. Moreover, combining XtremIO replication through RecoverPoint and the VPLEX splitter with VPLEX Metro enables the MetroPoint topology, which provides highly available disaster recovery. Note that the journal volumes can be allocated from VPLEX or directly from XtremIO. Journal volumes on VPLEX can be non-disruptively migrated to different volumes or arrays. On the other hand, allocating journal volumes directly from XtremIO or any other back-end array has the advantage of offloading load from VPLEX as well as reducing capacity license requirements. In terms of RecoverPoint licensing, VPLEX requires RP/EX, which supports allocating journals from unlicensed arrays. If allocating journal volumes directly from XtremIO, make sure to use RecoverPoint 4.1 SP2 P1 or later.

RecoverPoint for Virtual Machines

RecoverPoint for VMs, which is storage agnostic, can replicate VMware-based Virtual Machines residing on XtremIO datastores and/or VMs which have RDMs from XtremIO. RecoverPoint for VMs uses a journal to store write history and the ESX splitter to intercept writes to protected virtual disks. RecoverPoint for VMs also supports Async and Sync replication.

Conclusion

This paper provided information on RecoverPoint replication for XtremIO arrays leveraging snap-based replication technology. This paper thoroughly discussed the concepts of snap-based replication, RecoverPoint and XtremIO integration design, deployment as well as the rich use-cases and topologies this solution enables.

References

The following documents were used in writing this white paper. All documents are available at EMC’s Support site https://support.emc.com.

RecoverPoint Installation and Deployment Guide
RecoverPoint XtremIO Technical Notes
RecoverPoint Deploying VNX and CLARiiON Arrays and Splitter Technical Notes
RecoverPoint 4.1 Administrator’s Guide
RecoverPoint 4.1 Release notes
RecoverPoint and XtremIO Scale and Performance Guide