
EMC Backup and Recovery for Microsoft Applications

Deduplication Enabled by EMC CLARiiON and Data Domain

A Detailed Review

EMC Information Infrastructure Solutions

Abstract

EMC® Data Domain® deduplication storage systems dramatically reduce the amount of disk storage needed to retain and protect Microsoft Exchange and SharePoint data. This white paper provides best practices and demonstrates how the EMC Data Domain deduplication storage solution integrates with EMC backup and recovery products to provide full or granular-level backups for Microsoft Exchange Server 2010 and Enterprise SharePoint Server 2007.

August 2010


Copyright © 2010 EMC Corporation. All rights reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED “AS IS.” EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.

All other trademarks used herein are the property of their respective owners.

Part number: h7051


Table of Contents

Executive summary
  Business case
  Product solution
  Key results

Introduction
  Overview
  Purpose
  Scope
  Audience
  Key components
  EMC CLARiiON CX4-480
  EMC Replication Manager
  EMC NetWorker
  EMC Data Domain DD690
  EMC SnapView
  Microsoft Office SharePoint Server
  Microsoft Exchange 2010
  Exchange 2010 DAG
  VMware ESX Server
  Kroll Ontrack
  Environment profile
  Physical environment
  Hardware resources
  Software resources

Microsoft Exchange design

  Exchange 2010 design in a virtualized environment
    Introduction
    Exchange user profiles
    Storage design for Exchange database and log LUNs
    Building Block
    Step 1: Identify requirements
    Step 2: Calculate storage requirements
    Step 3: Identify Exchange server Mailbox design
    Step 4: Finalize Exchange server Mailbox storage configuration
    Exchange DAG configuration
    Exchange virtualization resource allocation

  Backup and recovery design for Exchange
    Overview
    LAN-free backup design for Exchange 2010


    Storage design and consideration for Replication Manager
    Automation scripts during backup
    NetWorker save set configuration for Exchange 2010

Microsoft SharePoint design

  SharePoint 2007 design in a virtualized environment
    Introduction
    SharePoint content database consideration
    SharePoint farm search component consideration
    SharePoint virtualization resource allocation
    SharePoint storage design

  Backup and recovery design for SharePoint 2007
    Introduction
    Full disaster backup and recovery design
    VSS providers overview
    VSS Writer overview
    LAN-free backup design for SharePoint full farm
    Clone group design
    Snapshot policy consideration
    Full farm conventional recovery design

  Granular backup and recovery design for SharePoint
    Introduction
    Granular LAN-based backup and recovery design
    Save set configuration for SharePoint

Data Domain design and configuration
  Data Domain system overview
  Data Domain sizing considerations
  Data Domain deduplication ratio considerations
  Data Domain space management considerations
  Data Domain VTL with NetWorker
  Data Domain configuration

Testing and validation
  Introduction

  Exchange backup scenarios
    Introduction
    Scenario 1: Initial Exchange 2010 full backup
    Scenario 2: Daily full backup

  Exchange 2010 recovery scenarios
    Introduction
    Scenario 1: Single database recovery
    Scenario 2: Exchange 2010 mailbox recovery


  SharePoint backup using VSS framework
    Introduction
    Workload simulation
    Scenario 1: Initial SharePoint farm full backup
    Scenario 2: Daily full farm backup
    Scenario 3: Database-level full backup

  Granular backup for SharePoint scenarios
    Introduction
    Scenario 1: Granular backup

  SharePoint recovery using VSS
    Introduction
    Scenario 1: Full farm recovery using NMM 2.2 SP1
    Scenario 2: Database-level conventional recovery

  Granular recovery for SharePoint
    Introduction
    Scenario 1: Item-level granular recovery using NMM 2.2 SP1
    Scenario 2: Item-level recovery using Kroll Ontrack 6.0
    Scenario 3: Site-level recovery using Kroll Ontrack 6.0

  Combined backup scenarios
    Introduction
    Scenario 1: Combined initial full backup to Data Domain
    Scenario 2: Combined daily full backup to Data Domain after new data generation

  Combined recovery scenario
    Introduction

Conclusion
  Summary
  Key findings
  Next steps

References
  White papers
  Product documentation
  Other documentation

Additional information
  Introduction
  Automation scripts during backup
  Automation script to reclaiming array storage


Executive summary

Business case

Data within enterprise environments is growing fast, according to a Digital Universe study by analyst firm IDC. The technology research and consulting firm estimates the worldwide volume of digital data grew by 62 percent between 2008 and 2009 to nearly 800,000 petabytes (PB). IDC claims this Digital Universe will grow to 1.2 million PB, or 1.2 zettabytes (ZB) in 2010, and reach 35 ZB by 2020.

Microsoft Exchange and SharePoint data stores are also experiencing massive data growth because of the increasing capacity of Exchange mailboxes, rising use of multimedia files, and increasing desire to share and collaborate on these documents in an enterprise environment.

As these environments continue to scale and expand, their criticality increases while protecting them becomes more and more difficult.

EMC, with its rich portfolio of hardware, software, and partner offerings, is well positioned to offer a solution that combines recommendations and best practices for creating a robust backup solution for both Exchange 2010 and SharePoint.

Product solution

The solution illustrates how the EMC® Data Domain® deduplication storage solution integrates with EMC NetWorker®, Replication Manager software, and Kroll Ontrack PowerControls to provide a full or granular-level backup of Exchange Server 2010 and Enterprise SharePoint Server 2007.


Key results

This white paper demonstrates the following benefits:

• Reduced backup timeframes: Backing up 9.2 TB of data takes 4 days with a traditional tape-based backup solution. However, this solution enables a daily full backup in less than 10 hours. Data Domain, configured as a virtual tape library (VTL), can use Fibre Channel (FC) to improve throughput.

• Reduced backup storage requirements: Compared with the traditional 100 percent backup storage capacity requirement on tape, only a relatively small amount of additional capacity is required on Data Domain for daily full backups with a data deduplication ratio of 25:1.

Note: Actual deduplication ratios vary with the amount of duplicate data in your environment.

• No impact to production: Because the passive copy of the Exchange DAG is backed up with space-efficient LAN-free snapshots (through CLARiiON® SnapView and the NetWorker proxy client), backup operations do not affect the production environment.

• Reduced physical infrastructure footprint: By consolidating the Active Directory, domain controllers, and Exchange/SharePoint application servers onto the VMware virtualization platform, the number of physical servers needed in this solution is significantly reduced.

• Simple and efficient item-level recovery: EMC selected the Kroll Ontrack tool for both mailbox-level and item-level granular recovery, which significantly reduces administrators’ time and effort.
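The backup window and storage figures in the list above can be checked with some simple arithmetic. The following is a minimal sketch, assuming decimal units (1 TB = 10^12 bytes), treating the "less than 10 hours" window as exactly 10 hours (so the computed rate is a lower bound), and assuming the 25:1 ratio applies to the full 9.2 TB data set. The function names are hypothetical and chosen only for illustration.

```python
TB = 10**12  # decimal terabyte (assumption; the paper does not state the unit base)
MB = 10**6   # decimal megabyte


def throughput_mb_per_s(data_bytes, hours):
    """Average rate needed to move data_bytes within the given window."""
    return data_bytes / (hours * 3600) / MB


def stored_tb(logical_tb, dedup_ratio):
    """Physical capacity consumed when logical data deduplicates at dedup_ratio:1."""
    return logical_tb / dedup_ratio


data_tb = 9.2
tape_rate = throughput_mb_per_s(data_tb * TB, 4 * 24)  # 4-day tape backup
vtl_rate = throughput_mb_per_s(data_tb * TB, 10)       # 10-hour window (upper bound on time)

print(f"tape: {tape_rate:.1f} MB/s, this solution: at least {vtl_rate:.1f} MB/s")
print(f"capacity for 9.2 TB at 25:1: {stored_tb(data_tb, 25):.3f} TB")
```

Under these assumptions the 4-day tape backup averages roughly 26.6 MB/s, while a 10-hour window implies at least about 255.6 MB/s, which is a reduction of the backup window by a factor of 9.6.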


Introduction

Overview

This white paper describes the benefits of integrating Data Domain deduplication with the NetWorker family for Exchange and SharePoint backup and restore. It also covers some of the features of the NetWorker Module for Microsoft Applications (NMM), and offers some insight into how customers can back up SharePoint and stream all backed-up data to highly efficient Data Domain backup storage through the NetWorker family.

Purpose

The purpose of this white paper is to:

• Demonstrate a rapid and efficient backup and restore of a multi-terabyte Microsoft environment using EMC snapshot replication, backup, and backup deduplication technologies.

• Validate the backup and restore performance for SharePoint 2007 and Exchange 2010 by integrating NetWorker with Replication Manager and Data Domain deduplication storage.

• Validate the deduplication function and document the deduplication ratio of Data Domain with NetWorker, a combination well suited to long-term backup of Exchange and SharePoint.

• Provide the design and architecture for deployment on the VMware virtualization platform to reduce the physical footprint.

• Summarize best practices, including Data Domain and NMM design overview and considerations.

Scope

The scope of this white paper is to:

• Present an overview of the concepts and technologies in the solution

• Document the backup and restore performance and deduplication ratios of Data Domain for both Exchange Server 2010 and Enterprise SharePoint Server 2007, across different scenarios, including granular restore

• Present realistic capabilities and the deduplication ratio of the Data Domain product

This white paper does not provide detailed installation instructions. Results in actual implementations can vary from the test results shown, due to customer-specific environmental factors.

Audience

This white paper is intended for corporate management and business decision-makers, including storage, server, and IT managers, and application engineers, as well as storage integrators, consultants, and distributors. Database administrators who wish to restore Exchange mail, SharePoint documents, and file system data will also find this paper helpful.


Key components

This section briefly describes the key solution components. For details on all of the components that make up the architecture, see the “Environment profile” section in this paper.

EMC CLARiiON CX4-480

The EMC CLARiiON CX4-480 is a versatile and cost-effective solution for organizations seeking an alternative to server-based storage. It delivers performance, scalability, and advanced data management features in one easy-to-use storage solution.

EMC Replication Manager

Replication Manager automates and simplifies the management of replicas. It orchestrates critical business applications, middleware, and underlying EMC replication technologies to create and manage replicas at the application level for a variety of purposes, including operational recovery, backup, restore, development, and simulation. Customers interested in reducing manual scripting efforts, improving recovery, and creating parallel access to information can implement Replication Manager to put the right data in the right place at the right time.

EMC NetWorker

NetWorker helps organizations to control costs by bringing management and control of the entire information environment into one central offering. NetWorker uses this centralized, broad protection to bridge the gap between traditional backup and deduplication backup and allows new backup technologies to be introduced nondisruptively into complex IT operations by providing a common platform for both.

EMC Data Domain DD690

Data Domain solutions can perform data deduplication while maintaining high levels of performance and reliability. Data deduplication enables organizations to reduce back-end capacity requirements by minimizing the amount of redundant data that is ultimately written to disk backup targets. The actual data reduction can vary significantly from organization to organization or from application to application, depending on a number of factors—the most important being the rate at which data is changing, the frequency of backup and archive events, and how long that data is retained online.

Data Domain integrates easily into existing data centers and can be configured with FC connections to the SAN.
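To illustrate why repeated full backups consume little additional capacity, the following toy sketch stores each unique chunk of data only once, keyed by its fingerprint. It is an illustration of the general idea of deduplication, not a description of how Data Domain systems segment or index data; the fixed chunk size, the function name, and the sample data are all hypothetical.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, chosen only for illustration


def dedup_backups(backups, chunk_size=CHUNK_SIZE):
    """Store each unique chunk once; return (logical_bytes, physical_bytes)."""
    store = {}   # chunk fingerprint -> chunk contents
    logical = 0  # bytes sent by the backup application
    for data in backups:
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            logical += len(chunk)
            # Only a chunk with an unseen fingerprint consumes new space.
            store.setdefault(hashlib.sha256(chunk).digest(), chunk)
    physical = sum(len(c) for c in store.values())
    return logical, physical


# Two "daily full" backups of ten chunks each; one chunk changes on day 2.
day1 = b"".join(bytes([i]) * CHUNK_SIZE for i in range(10))
day2 = bytes([200]) * CHUNK_SIZE + day1[CHUNK_SIZE:]

logical, physical = dedup_backups([day1, day2])
print(f"logical={logical} physical={physical} ratio={logical / physical:.2f}:1")
```

In this example 20 chunks are sent but only 11 are stored. The ratio grows as more full backups of slowly changing data are retained, which is consistent with the factors listed above: change rate, backup frequency, and retention period.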

EMC SnapView

EMC SnapView lets you create local point-in-time snapshots and complete data clones for testing, backup, and recovery operations. With SnapView, you can create multiple copies of production data on your EMC CLARiiON networked storage system quickly and easily.


Microsoft Office SharePoint Server

Microsoft Office SharePoint Server (MOSS) is an integrated suite of server capabilities that can help to improve organizational effectiveness by providing comprehensive content management and enterprise search, accelerating shared business processes, and facilitating information sharing across boundaries for better business insight. Additionally, this collaboration and content management server provides IT professionals and developers with the platform and tools they need for server administration, application extensibility, and interoperability.

Microsoft Exchange 2010

Microsoft Exchange Server 2010 is designed to meet today’s communication and collaboration challenges. It provides advanced e-mail and scheduling while delivering new methods of access for employees, greater productivity for IT administrators, and increased security and compliance capabilities for organizations.

Exchange Server 2010 introduces significant improvements in its database. Mailbox servers can now be defined as part of a Database Availability Group (DAG) to provide automatic recovery at the individual mailbox database level instead of at the server level. Furthermore, the transactional input/output (I/O) requirements for Exchange 2010 have been reduced from those in Exchange Server 2007. With these new features, customers can deploy much larger mailboxes than with previous versions of Exchange Server, and can use less expensive drive types such as Serial Attached SCSI (SAS) and Serial Advanced Technology Attachment (SATA) for Exchange Server 2010 mailbox storage.

Exchange 2010 DAG

A Database Availability Group (DAG) is a set of up to 16 Microsoft Exchange Server 2010 Mailbox servers that provide automatic database-level recovery from a database, server, or network failure. Mailbox servers in a DAG monitor each other for failures. When a Mailbox server is added to a DAG, it works with the other servers in the DAG to provide automatic, database-level recovery from database, server, and network failures.

VMware ESX Server

VMware ESX Server is software for partitioning, consolidating, and managing servers in mission-critical environments. Ideally suited for enterprise data centers, ESX Server minimizes the total cost of ownership of computing infrastructure by increasing resource utilization and maximizing administration flexibility.

Kroll Ontrack

Kroll Ontrack provides technology-driven services and software to help corporate, legal, and government entities and consumers recover, search, analyze, produce, and present data efficiently and cost-effectively. In addition to its award-winning suite of software, it also provides data recovery, advanced search, paper and electronic discovery, computer forensics, ESI and trial consulting, and presentation services.


Environment profile

This section identifies and briefly describes the technology and components used in the environment.

Physical environment

The following diagram illustrates the overall physical architecture of the environment:


Hardware resources

The hardware used to validate the solution is listed in the following table.

| Equipment | Quantity | Configuration |
| --- | --- | --- |
| CX4-480 | 1 | 45 x 300 GB 15k FC disks; 105 x 1 TB 7.2k SATA II disks |
| SAN Switch | 1 | Cisco MDS 9509 |
| IP Switch | 1 | Cisco Catalyst 3560E |
| Data Domain DD690 | 1 | 16 TB raw devices |
| Dell PowerEdge R900 (24 core, 128 GB RAM, 2 dual-port 4 Gb/s Emulex HBAs, 2 x quad NICs) | 1 | Virtual machines (cluster): 2 x Exchange MBX Server (2 x 4 vCPUs/24 GB); 1 x Exchange HUB/CAS Server (4 vCPUs/8 GB); 1 x Domain Controller (2 x 4 vCPUs/4 GB); 1 x Replication Manager Server (2 vCPUs/4 GB) |
| Dell PowerEdge R900 (24 core, 128 GB RAM, 2 dual-port 4 Gb/s Emulex HBAs, 2 x quad NICs) | 1 | Virtual machines (cluster): 1 MOSS Index Server (8 vCPUs/6 GB); 2 MOSS web front ends (WFEs) (2 x 4 vCPUs/4 GB); 1 MOSS App (Central Admin/Excel/Docu) (2 x vCPUs/4 GB); 1 x NetWorker Server (4 vCPUs/2 GB) |
| Dell PowerEdge R900 (24 core, 128 GB RAM, 2 dual-port 4 Gb/s Emulex HBAs, 2 x quad NICs) | 1 | Virtual machines (cluster): 2 x MOSS WFEs (2 x 4 vCPUs/4 GB); 1 x MOSS SQL Server (8 vCPUs/16 GB); 1 x Exchange MBX Server (4 vCPUs/24 GB); 1 x Exchange HUB/CAS Server (4 vCPUs/8 GB) |
| DC Server | 1 | Dell PowerEdge R710, 8 core, 32 GB RAM, 3 x 4-port NICs |
| NetWorker Proxy Server | 1 | Dell PowerEdge 6850/R710, 8 core, 32 GB RAM, 3 x 4-port NICs, 2 dual-port 4 Gb/s Emulex HBAs |

Software resources

The software used to validate the solution is listed in the following table.

| Software | Version |
| --- | --- |
| Windows Server 2008 R2 Enterprise edition | RTM |
| Windows Server 2008 Enterprise edition | SP2 |
| Microsoft SQL Server 2008 | SP1 |
| Microsoft Office SharePoint Server 2007 SP2 | SP2 with July Cumulative Update 12.0.6510 |
| Microsoft Exchange 2010 Enterprise Edition | RTM (14.0.639.21) |
| Exchange MAPI and CDO 1.2.1 | Latest |
| EMC PowerPath® | 5.3 SP1 |
| EMC PowerPath/VE | 5.4 SP1 |
| EMC Navisphere® Agent/CLI | 6.29.5.0.66 |
| Visual Studio Test Suite | 2008 |
| KnowledgeLake Document Loader | Latest |
| NetWorker Module for Microsoft Applications | 2.2 SP1 |
| NetWorker | 7.6 |
| Replication Manager | 5.2.3 |
| Kroll Ontrack PowerControls | 6.0 |
| Data Domain Enterprise Manager | 4.8.0.3 (beta code) |
| Driver for Emulex | 7.2.20.6 |
| VMware vSphere | 4.0 |
| Microsoft LoadGen 2010 | Beta |
| Solutions Enabler | 7.1.1 |
| Solutions Enabler with VSS Provider | 7.1.042 |

Microsoft Exchange design

Exchange 2010 design in a virtualized environment

Introduction

The design and testing principles applied to this environment demonstrate how Exchange users with large mailboxes can achieve a high level of backup and recovery performance while using minimal resources. Testing was based on virtualized Exchange 2010 servers, with DAG implemented to provide mailbox database high availability. Snapshots taken from passive DAG copies are used for both backup and recovery, which minimizes the impact on active mailbox databases during backup and speeds up recovery.

This solution is intended not only to meet the basic functionality requirements when deploying an efficient, repeatable backup and recovery design on a large-scale virtualized Microsoft Exchange Server 2010 platform, but also to provide a solid foundation for future growth and development of the environment.

Exchange user profiles

The following table summarizes the Exchange environment profile in this solution.

| Profile characteristic | Value |
| --- | --- |
| Number of users | 8,000 |
| Exchange 2010 IOPS | 0.15 (Very Heavy) |
| Read/Write Ratio | 3:2 |
| Mailbox server | 3 (Virtual Machines) |
| Number of DAG copies | 2 |
| User count per server | 4,000 |
| Mailbox size | 1 GB |
| Number of databases per server | 10 |
| User count per database | 400 |
| RAID type | RAID 10, 1 TB 7.2k SATA |

Storage design for Exchange database and log LUNs

Sizing and configuring storage for Microsoft Exchange Server 2010 can be a complicated process because many of the variables and factors differ from organization to organization. One way to simplify the process is to define a unit of measure: a building-block.

Building Block

A building-block represents the amount of disk and server resources required to support a specific number of Exchange 2010 users. The amount of required resources is derived from a specific user profile type, mailbox size, and disk requirements.

Using the building-block approach takes out the guesswork and simplifies the implementation of the Exchange 2010 Mailbox server. Once the initial building-block is designed, an organization can multiply it until the configuration supports the desired number of Microsoft Exchange users (that is, Microsoft Messaging API (MAPI) Outlook users) and satisfies the performance metrics recommended for Microsoft Exchange Server.

EMC’s best practices for the building-block approach to Exchange Server design have proved successful in many customer implementations.

The process of creating a building-block involves four simple steps:

1. Identify user requirements.

2. Identify and calculate storage requirements (based on both IOPS and capacity).

3. Identify your Exchange Mailbox server database design.

4. Finalize the Exchange Mailbox server storage configuration.

Step 1: Identify requirements

As the Exchange user profiles outlined above show, the test environment needs to support 8,000 users (4,000 users per server) at 0.15 IOPS per user with a 1 GB mailbox quota.

Step 2: Calculate storage requirements

Use this formula to calculate storage for the Exchange 2010 Mailbox server role:

((IOPS * %R) + WP * (IOPS * %W)) / Physical Disk Speed = Required Physical Disks

| Where | Is |
| --- | --- |
| IOPS | the number of input/output operations per second |
| %R | the percentage of I/Os that are reads |
| %W | the percentage of I/Os that are writes |
| WP | the RAID write penalty multiplier (RAID 1=2, RAID 5=4) |
| Physical Disk Speed | For the CLARiiON CX4™ series, 155 for 15k rpm FC drives, 130 for 10k rpm FC drives, and 55 for 7.2k rpm SATA drives |

Note:

Microsoft also provides an Exchange 2010 Mailbox server role requirements calculator with some additional variables:

http://msexchangeteam.com/files/12/attachments/entry453145.aspx

IOPS calculation: Calculations are based on the targeted user profile as listed above and availability of 1 TB 7.2k rpm drives on the CLARiiON CX4-480. It is essential to calculate IOPS first, and then capacity.

On each Mailbox server, 4,000 users generate 720 IOPS (that is, 4,000 * 0.15 IOPS + 20% headroom). With the 3:2 read/write ratio (60 percent reads, 40 percent writes) on RAID 10, at least 19 spindles (18.3 rounded up) are required to serve this load:

((720 * 0.6) + 2 * (720 * 0.4)) / 55 = 18.3
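The spindle calculation above can be reproduced with a short script. The following is a minimal Python sketch, not part of the validated solution; the function and variable names are illustrative assumptions, while the input values (0.15 IOPS per user, 20 percent headroom, a 3:2 read/write ratio, a RAID 10 write penalty of 2, and 55 IOPS per 7.2k rpm SATA drive) are taken from this section.

```python
import math


def required_disks(host_iops, read_fraction, write_penalty, disk_iops):
    """Return the (fractional) number of spindles needed for a host IOPS load.

    Reads hit the disks once; each write costs `write_penalty` back-end I/Os.
    """
    write_fraction = 1.0 - read_fraction
    backend_iops = host_iops * read_fraction + write_penalty * host_iops * write_fraction
    return backend_iops / disk_iops


users_per_server = 4000
iops_per_user = 0.15
headroom = 0.20

# Host IOPS per Mailbox server, including headroom.
host_iops = users_per_server * iops_per_user * (1 + headroom)

raw_spindles = required_disks(host_iops, read_fraction=0.6, write_penalty=2, disk_iops=55)
spindles = math.ceil(raw_spindles)

print(f"host IOPS: {host_iops:.0f}")
print(f"spindles:  {raw_spindles:.1f} -> {spindles}")
```

Changing `disk_iops` to 155 or 130 gives a rough figure for the 15k rpm or 10k rpm FC drives listed in the table above.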

Capacity calculation: On each Mailbox server, at least 5,400 GB formatted capacity is required for 4,000 mailboxes (that is, 4,000 * 1 GB + 35% = 5,400 GB, where the 35 percent reservation is for deleted items retention).

Note: Because the LoadGen 2010 Beta version was used for testing, an additional 50 percent reservation (4,000 GB * 50% = 2,000 GB) is needed for indexing. As a result, at least 7,400 GB is required for each Mailbox server. This additional reservation is not necessary for a production Exchange server.

For log files, at least 800 GB formatted capacity is required (that is, 4,000 users * 29 logs per user per day * 7 days retention * 1 MB per log file = 793 GB).

So the total space required based on capacity is 8,200 GB (7,400 GB + 800 GB).

Four 1 TB Serial Advanced Technology Attachment (SATA) disks were grouped as one RAID 10 (2+2) group, which provides 1.8 TB formatted capacity. Therefore, providing the required 8,200 GB takes five RAID 10 groups, or 20 disks in total, on the CLARiiON CX4-480 (8,200 / 1024 / 1.8 = 4.4, rounded up to 5).
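The capacity figures can be checked the same way. This is a hedged sketch under stated assumptions: the constant names are illustrative, the 1 MB log file size is the Exchange 2010 default rather than a value stated in this paper, and the indexing reservation is applied to the 4,000 GB of mailbox data so that the result matches the 7,400 GB quoted above.

```python
import math

MAILBOXES = 4000
MAILBOX_GB = 1
DELETED_ITEMS_RESERVE = 0.35   # from the text
INDEX_RESERVE = 0.50           # LoadGen 2010 Beta test environment only
LOGS_PER_USER_PER_DAY = 29
LOG_RETENTION_DAYS = 7
LOG_FILE_MB = 1                # assumption: Exchange 2010 default log file size
RAID_GROUP_TB = 1.8            # formatted capacity of one RAID 10 (2+2) group
DISKS_PER_RAID_GROUP = 4

mailbox_gb = MAILBOXES * MAILBOX_GB

# Database capacity: mailbox data plus both reservations.
database_gb = round(mailbox_gb * (1 + DELETED_ITEMS_RESERVE + INDEX_RESERVE))

# Log capacity in GB, then rounded up to the next 100 GB as in the text.
log_gb = MAILBOXES * LOGS_PER_USER_PER_DAY * LOG_RETENTION_DAYS * LOG_FILE_MB / 1024
log_gb_rounded = math.ceil(log_gb / 100) * 100

total_gb = database_gb + log_gb_rounded
raid_groups = math.ceil(total_gb / 1024 / RAID_GROUP_TB)
disks = raid_groups * DISKS_PER_RAID_GROUP

print(f"database: {database_gb} GB, logs: {log_gb:.0f} GB (sized as {log_gb_rounded} GB)")
print(f"total: {total_gb} GB -> {raid_groups} RAID 10 groups, {disks} disks")
```

For a production server, setting `INDEX_RESERVE` to 0 removes the test-only indexing reservation.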

Number of disks required: Based on the calculations above, the capacity requirement (20 disks) exceeds the IOPS requirement (19 disks). In total, 20 x 1 TB SATA drives were grouped as five RAID 10 groups to fulfill both IOPS and capacity requirements.

Step 3: Identify Exchange server Mailbox design

The next step is to identify how many databases to configure per Exchange server. This involves determining how large the databases need to be. Based on the capacity of each RAID group in this solution, each database LUN and log LUN pair is configured to support 400 users, and each RAID group accommodates two database LUNs and two log LUNs. Each RAID group therefore hosts 800 users in total.

In summary, the resulting building-block meets the performance, capacity, and data protection requirements for 800 users. The table below summarizes the final building-block created for this configuration.

| Item | Description |
| --- | --- |
| Number of users supported | 800 |
| User profile supported | 0.15 |
| Mailbox size | 1 GB |
| Disk size and type | 1 TB 7.2k rpm SATA drives |
| RAID type | RAID 10 |
| Database LUN size | 780 GB |
| Log LUN size | 120 GB |
| Total disks required | 1 RAID 10 (2+2) group, 4 disks (per database copy) |

Step 4: Finalize Exchange server Mailbox storage configuration

Scaling the configuration up to the 4,000 users required per server takes five of these building-blocks, that is, 20 disks in five RAID 10 groups for each database copy.

To improve performance, it is recommended that each Exchange database and its corresponding log LUNs be placed on separate RAID groups.
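The scaling step can be sketched as follows. This is an illustrative Python example only; the names are assumptions, and the values come from the building-block table above. It also checks that the LUNs of one building-block fit within the 1.8 TB formatted capacity of a RAID 10 (2+2) group.

```python
import math

# Building-block definition, taken from the table above.
USERS_PER_BLOCK = 800
DISKS_PER_BLOCK = 4          # one RAID 10 (2+2) group
DATABASES_PER_BLOCK = 2      # 400 users per database
DB_LUN_GB = 780
LOG_LUN_GB = 120
RAID_GROUP_GB = 1.8 * 1024   # formatted capacity of one RAID group


def scale_building_block(users_per_server):
    """Return the resources needed per database copy for one Mailbox server."""
    blocks = math.ceil(users_per_server / USERS_PER_BLOCK)
    return {
        "blocks": blocks,
        "disks": blocks * DISKS_PER_BLOCK,
        "databases": blocks * DATABASES_PER_BLOCK,
    }


# Capacity consumed by the LUNs of a single building-block.
lun_gb_per_block = DATABASES_PER_BLOCK * (DB_LUN_GB + LOG_LUN_GB)

plan = scale_building_block(4000)
print(plan)
print(f"LUNs per block: {lun_gb_per_block} GB of {RAID_GROUP_GB:.0f} GB available")
```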

The following image illustrates the LUN layout in this solution:

Exchange DAG configuration

High availability is provided for this solution with the use of Microsoft Database Availability Groups (DAGs). Within a DAG, a set of Mailbox servers uses continuous replication to provide automatic recovery in the event of failures.

In this environment, each database has two DAG copies, with three Exchange 2010 Mailbox servers deployed. Exchange 2010 Mailbox servers 1 and 2 each host 10 active database copies. Exchange 2010 Mailbox server 3 hosts the 20 passive database copies.

In this way, the snapshots of the passive DAG copies are used for backup, which eliminates the performance impact on the active DAG copies (for detailed information about Exchange server backup, refer to the “Backup and recovery design for Exchange” section).
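The copy layout described here can be modeled as a simple mapping. The sketch below is illustrative only: the server names MBX01 and MBX02 follow the save set paths used later in this paper, while MBX03 and the database naming pattern are hypothetical labels chosen for the example.

```python
from collections import Counter

ACTIVE_SERVERS = ("MBX01", "MBX02")
PASSIVE_SERVER = "MBX03"            # hypothetical name for Mailbox server 3
DATABASES_PER_ACTIVE_SERVER = 10


def build_layout():
    """Map each database to the server holding its active and passive copy."""
    layout = {}
    for server in ACTIVE_SERVERS:
        for n in range(1, DATABASES_PER_ACTIVE_SERVER + 1):
            layout[f"{server}_DB{n}"] = {"active": server, "passive": PASSIVE_SERVER}
    return layout


layout = build_layout()
active_counts = Counter(copy["active"] for copy in layout.values())
passive_counts = Counter(copy["passive"] for copy in layout.values())

print(dict(active_counts))    # active copies per server
print(dict(passive_counts))   # passive copies per server
```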

The following image illustrates the database copy layout in this solution:

Exchange virtualization resource allocation

The Exchange servers were deployed on virtual machines. The virtualization allocation of this solution is detailed in the following table.

| Server role | vCPUs | Memory (GB) | Boot disk (GB) | Raw device mapping disk |
| --- | --- | --- | --- | --- |
| DC Server x 1 | 2 | 4 | 80 | N/A |
| Exchange 2010 HUB/CAS Server | 2 | 8 | 100 | N/A |
| Exchange 2010 Mailbox Server | 4 | 24 | 100 | 780 GB x 10 (Database LUNs); 120 GB x 10 (Log LUNs) |
| Replication Manager Server | 2 | 4 | 80 | N/A |

Note:

Microsoft provides detailed information on how to calculate memory and CPU requirements for Exchange Server 2010:

http://technet.microsoft.com/en-us/library/ee832793.aspx

http://technet.microsoft.com/en-us/library/ee712771.aspx

Backup and recovery design for Exchange

Overview

Replication Manager helps organizations to safeguard their business-critical Exchange 2010 data with point-in-time, disk-based replicas or continuous data protection sets that can be restored to any significant point within the protection window. With its awareness of the Exchange 2010 environment, Replication Manager wizards guide the process of linking Exchange intelligence with EMC replication software. Replication Manager supports Exchange 2010 in standalone or DAG environments.

Microsoft Volume Shadow Copy Service (VSS) coordinates Exchange 2010, Replication Manager, and CLARiiON to enable application-aware data management, which allows Replication Manager to create application-aware replicas. During replication, Replication Manager coordinates with the storage and Exchange 2010 to create a snapshot or clone, which is a point-in-time copy of the volumes that contain the data, logs, and system files for Exchange 2010 databases. Replication Manager works with VSS and Exchange 2010 to freeze the databases while the snapshot is created and then thaw them, so that the flow of data resumes once the replication is complete.

The EMC NetWorker client/server environment enables organizations to protect their enterprise from the loss of valuable data. In a network environment, where the amount of data grows rapidly when servers are added to the network, the need to protect data becomes crucial. EMC NetWorker products give organizations the power and flexibility to meet such a challenge.

LAN-free backup design for Exchange 2010

The following image illustrates the LAN-free configuration of Exchange 2010 backup used in this solution. The snapshots of the Exchange 2010 DAG passive copies are mounted on the Replication Manager mount host, which in this solution is also the NetWorker Storage Node and is connected to Data Domain through FC. The backup data therefore flows through the SAN.

This design avoids LAN traffic while rolling the snapshot data to Data Domain. It also minimizes the impact on the production environment during the backup.

The data flow of the LAN-free topology for Exchange 2010 backup is listed in the following table:

| Stage | Description |
| --- | --- |
| 1 | NetWorker Server initializes the backup request to the Exchange 2010 DAG passive copies. |
| 2 | Before the backup, the NetWorker pre-script function calls the Replication Manager CLI to create snapshots of the Exchange 2010 database storage volumes automatically. |
| 3 | After a quick snapshot, the replicas are mounted and made visible on the Replication Manager mount host. |
| 4 | The Replication Manager mount host, in this case the NetWorker Storage Node, uses the snapshot in primary storage to transfer the data into Data Domain devices through FC. |

NetWorker Server, the Replication Manager server, and the Exchange 2010 DAG passive node server communicate through the LAN. However, the data itself is not transferred across the LAN because the backup client (the Replication Manager mount host) is also the NetWorker Storage Node. Data Domain is attached directly to the NetWorker Storage Node and configured as a virtual tape library (VTL).

Storage design and consideration for Replication Manager

For Replication Manager to create a replica on CLARiiON, either CLARiiON SnapView™ clone technology or CLARiiON SnapView snapshot technology can be used. A CLARiiON SnapView clone is a full mirror copy whose size is the same as that of the source volume. CLARiiON SnapView snapshot technology uses copy-on-first-write to perform point-in-time snapshots.

The Exchange 2010 DAG feature makes it possible to use the CLARiiON SnapView snapshot feature in this solution. There is no impact on the production environment when backing up the snapshots of Exchange 2010 DAG passive copies. CLARiiON SnapView can create or destroy a snapshot in seconds, regardless of the LUN size, because it does not actually copy data. This significantly reduces the replication time and space requirements compared with a CLARiiON SnapView clone.

To configure a CLARiiON SnapView snapshot, a reserved LUN pool with the proper number and size of LUNs (also known as snapshot cache) should be allocated for the snapshot function. In this particular solution, 40 LUNs with a total of 36 TB volume capacity needed to be backed up, so 80 x 45 GB LUNs were created to form the snapshot cache, which is a total of 20 percent of production data.

For more information on how to calculate the snapshot cache size, refer to EMC SnapView for Navisphere Administrator’s Guide.

Automation scripts during backup

EMC NetWorker provides the savepnpc command, so that the pre-script and post-script of the backup can be easily customized. By using the Replication Manager CLI function, the whole backup procedure can be managed by NetWorker.

Refer to the “Additional information” section for more information about the scripts.

NetWorker save set configuration for Exchange 2010

To maximize the backup performance, consider:

• Raising the backup load by properly setting up parallelism in NetWorker

• Balancing the data flow during backup in parallel

Configure the Save Set attribute of the client resource to achieve this. For each Exchange 2010 database, back up the .edb database file and the log folder.

In this particular solution, 20 Exchange 2010 mailbox databases in total needed to be backed up, so the client parallelism value was set to 20. This means that 20 database files can be backed up simultaneously if 20 drives are assigned from Data Domain. This design ensured that there was sufficient backup load while keeping the data flow of each backup session properly balanced.

The following table lists the backup order and the content to be backed up:

Note Considering that log folders take much less backup time than db files, 20 database files will be backed up simultaneously within most of the backup window.

| Backup order | Backup content |
| --- | --- |
| 1 | MBX01 (10 DBs and 10 logs) |
| 2 | MBX02 (10 DBs and 10 logs) |

To achieve the backup order above, the following values were specified in the Save Set attribute of the client resource.

C:\MBX01\DB1\MBX01_DB1.edb

C:\MBX01\LOG1

C:\MBX01\DB2\MBX01_DB2.edb

C:\MBX01\LOG2

C:\MBX02\DB10\MBX02_DB10.edb

C:\MBX02\LOG10
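A list of this kind can be generated rather than typed by hand. The following is a minimal Python sketch that assumes the naming pattern shown above continues unchanged for every database on MBX01 and MBX02; the function name and the assumption of 10 databases per server are illustrative, and the output would still need to be pasted into the Save Set attribute of the client resource.

```python
def save_set_entries(servers=("MBX01", "MBX02"), databases_per_server=10):
    """Build Save Set entries: one .edb file and one log folder per database.

    Entries are ordered server by server so that MBX01 is listed before MBX02.
    """
    entries = []
    for server in servers:
        for n in range(1, databases_per_server + 1):
            # Database file, for example C:\MBX01\DB1\MBX01_DB1.edb
            entries.append("\\".join(["C:", server, f"DB{n}", f"{server}_DB{n}.edb"]))
            # Matching log folder, for example C:\MBX01\LOG1
            entries.append("\\".join(["C:", server, f"LOG{n}"]))
    return entries


entries = save_set_entries()
for entry in entries:
    print(entry)
```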

Microsoft SharePoint design

SharePoint 2007 design in a virtualized environment

Introduction

The following design factors should be considered for a virtualized SharePoint 2007 environment:

• SharePoint content database

• SharePoint farm search component

• SharePoint virtualization resource allocation

• SharePoint storage design

• Clone design

SharePoint content database consideration

The SharePoint farm is designed as a publishing/collaboration portal. It includes 1 TB of user content consisting of 10 SharePoint site collections, each populated with 100 GB of content data.

Microsoft recommends a soft limit of 100 GB for each content database. The storage design best practice is to use a 130-150 GB data volume and a 25-50 GB log volume. This solution uses:

• 100 GB content database data files on 150 GB LUNs

• 5 GB content database transaction log files on 30 GB LUNs
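The content database layout follows directly from these numbers. The short Python sketch below is an illustration under stated assumptions: the names are invented for the example, and 1 TB of user content is treated as 1,000 GB to match the 10 x 100 GB site collections described above.

```python
import math

USER_CONTENT_GB = 1000           # 10 site collections x 100 GB
CONTENT_DB_SOFT_LIMIT_GB = 100   # Microsoft soft limit per content database
DATA_LUN_GB = 150
LOG_LUN_GB = 30
DATA_FILE_GB = 100
LOG_FILE_GB = 5

# Number of content databases needed to stay within the soft limit.
databases = math.ceil(USER_CONTENT_GB / CONTENT_DB_SOFT_LIMIT_GB)

total_data_lun_gb = databases * DATA_LUN_GB
total_log_lun_gb = databases * LOG_LUN_GB

# Free space left on each LUN for growth.
data_headroom_gb = DATA_LUN_GB - DATA_FILE_GB
log_headroom_gb = LOG_LUN_GB - LOG_FILE_GB

print(f"{databases} content databases")
print(f"data LUNs: {total_data_lun_gb} GB, log LUNs: {total_log_lun_gb} GB")
print(f"headroom per LUN: data {data_headroom_gb} GB, log {log_headroom_gb} GB")
```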

SharePoint farm search component consideration

During the full farm backup, it is important to back up the SharePoint search database. Two types of search components are available in the SharePoint farm:

• Enterprise search engine: Office SharePoint Server Search Service (Osearch)

• SharePoint help information search engine: Windows SharePoint Services (WSS) Search Service (SPsearch). Its search index is very small (less than 100 MB) but still needs to be backed up.

Both the Osearch and SPsearch engines store their data in two places:

• Content Index (CI) files, which are stored on a dedicated LUN (exposed as a drive letter) on the index server and on the query servers. The WFEs are also configured as query servers, so that queries are load balanced for better query performance.

• The SSP search database, which stores metadata and crawler history information for the search system, and typically requires more disk space than the index.

For definitions about Office Search Engine and WSS Search terms, refer to Microsoft TechNet websites.

SharePoint virtualization resource allocation

The SharePoint farm uses two of three ESX servers. The virtualization allocation of this solution is detailed in the following table:

| Server role | vCPUs | Memory (GB) | Boot disk (GB) | Raw device mapping disk |
| --- | --- | --- | --- | --- |
| WFE Server x 4 | 4 | 4 | 40 | 40 GB x 1 (query volume) |
| Index Server | 2 | 4 | 40 | 150 GB x 1 (office search index volume); 30 GB x 1 (WSS help search index volume) |
| Application Excel Server | 2 | 2 | 40 | |
| SQL Server 2008 | 4 | 8 | 40 | 150 GB x 10 (content database data volume); 30 GB x 10 (content database log volume); 50 GB x 1 (configuration database volume); 200 GB + 50 GB (SharePoint SSP Search database data and log volumes); 80 GB x 5 (SQL temp database and log volumes) |

SharePoint storage design

This white paper uses the Microsoft recommended sizing of 100 GB for each SharePoint 2007 content database.

A total of 1 TB user data was split into 10 x 100 GB content databases on 150 GB volumes, which used RAID 5 protection.

The following table describes the detailed disk layouts, and the size and number of disks holding SharePoint data across the whole solution.

| Description | Size (GB) | Quantity | Total (GB) | Drive and RAID type | Number of disks |
| --- | --- | --- | --- | --- | --- |
| SQL MOSS content (Databases - Data) x 10 | 150 | 10 | 1,500 | 300 GB 15k RAID 5 (8+1) FC disks | 9 |
| SQL MOSS content (Databases - Log) x 10 | 30 | 10 | 300 | | |
| SQL MOSS configuration (Databases and Log) | 50 | 1 | 50 | 300 GB 15k RAID 5 (4+1) FC disks | 5 |
| Index volume - Index Server | 150 | 1 | 150 | | |
| Query volume - WFEs | 40 | 4 | 160 | 300 GB 15k RAID 5 (3+1) FC disks | 4 |
| WSS Index volume - Index Server | 40 | 1 | 40 | | |
| System volume - SQL | 40 | 1 | 40 | | |
| SQL TempDB Data & Log x 5 | 80 | 5 | 400 | 300 GB 15k RAID 10 (5+5) FC disks | 10 |
| SQL MOSS SSP Search Database | 200 | 1 | 200 | | |
| SQL MOSS SSP Search Database Log | 50 | 1 | 50 | | |

Backup and recovery design for SharePoint 2007

Introduction

EMC NetWorker provides disaster and granular backup and recovery for many applications:

• Full disaster backup and recovery: The entire volume or database for that application is backed up, and the entire volume or database is recovered as a whole. In disaster backup and recovery, individual items for backup and recovery cannot be selected. Incremental level backup is not supported by VSS but is supported in granular backups.

• Granular backup and recovery: In granular backup, individual items can be selected for backup, and in granular recovery, individual items can be selected for recovery.

Full disaster backup and recovery design

This solution uses a VSS framework to take consistent point-in-time application snapshots, which delivers quick recovery and off-host backup. This solution also demonstrates full recovery for a distributed SharePoint farm and for individual SharePoint content databases.

VSS providers overview

This solution uses two kinds of Volume Shadow Copy Service (VSS) providers for SharePoint full farm backup and recovery:

• Microsoft VSS Provider (software-based)

• CLARiiON VSS Provider (hardware-based)

The default VSS provider software on the Windows platform is Microsoft Software Shadow Copy Provider.

For more information on how Microsoft VSS works on Windows 2008, refer to the article on the Microsoft TechNet website: http://technet.microsoft.com/en-us/library/ee923636(WS.10).aspx

For a WFE server, NetWorker can back up one file of the system volume, so Microsoft VSS was used for WFEs.

VSS hardware providers (EMC VSS Provider in this solution), which are used to back up the SQL and Index servers, enable the creation of shadow copies at the hardware level without imposing a load on the production server. For the purposes of VSS, the snapshot/clone is referred to as a shadow. Furthermore, an option to make the shadow transportable is provided, which allows you to mount or import the shadow on another client. If a shadow is not marked as transportable, you will not be able to mount the shadow or perform rollback recovery.

VSS Writer overview

NetWorker and NMM integrate with Microsoft Office SharePoint Server 2007 by using the SharePoint Volume Shadow Copy Services (VSS) Writer. The Microsoft SharePoint Server 2007 VSS Writer is dependent on the Microsoft SQL Server 2005/2008 VSS Writer.

Using the SharePoint VSS Writer, EMC NetWorker takes VSS snapshots of the entire SharePoint farm for data protection. For more detailed information about VSS Writer and its configuration, refer to EMC NetWorker Module for Microsoft Applications Release 2.2 SP1 Application Technical Notes for SharePoint and Exchange.

LAN-free backup design for SharePoint full farm

The transportable technology of VSS hardware providers allows the data clone/snapshot to be mounted onto a non-production environment (proxy client) for backup tasks. The benefits of using the proxy client are listed as follows:

• The lifetime of the data can be controlled without affecting the performance of the existing servers.

• Hardware resources such as the processor, memory, and network can be optimized for serving the client or the user application. The hardware resources of the proxy host can be used for backing up the data to the storage node.

• Multiple independent copies of the data volumes can be managed across several machines.

The following image illustrates the LAN-free configuration of SharePoint used in this solution. This design avoids network traffic when rolling the clone data to the Data Domain. It also minimizes the impact on the production environment during the backup.

The data flow of the LAN-free topology in this solution is listed in the following table:

| Stage | Description |
| --- | --- |
| 1 | NetWorker Server initializes the request to the application servers (in this solution, the SQL servers and SharePoint Index servers) with the EMC VSS Provider installed. |
| 2 | The application servers use the EMC VSS Provider to create the clones/snapshots for the storage volumes. |
| 3 | Clones are mounted and visible in the NetWorker Proxy Server. |
| 4 | The proxy client, in this case the storage node, uses the clone/snapshot in primary storage to transfer the data into the Data Domain device through FC. |

Data Domain is attached directly to the NetWorker Storage Node as a virtual tape library.

CLARiiON SnapView clone technology is used in this solution because it minimizes the impact on the production environment. The backup program reads data from clone LUNs mounted on a non-production client, whereas a backup from a snapshot still reads data from the production database LUNs. For fast-changing databases, split-mirror technology such as CLARiiON clones or Symmetrix® business continuance volumes is suggested.

For more information about the CLARiiON SnapView recommendation, refer to EMC NetWorker Module for Microsoft Applications Release 2.2 SP1 Application Technical Notes for SharePoint and Exchange.

Clone group design

VSS providers for CLARiiON also require creating a local clone copy for LUNs. The total clone size that the database requires is 2,400 GB.

For better performance, 15 x 1 TB SATA drives were used for the clone LUNs. The 15 spindles were configured into three RAID 5 (4+1) groups yielding 12 TB. The database clone LUNs were distributed evenly among the three RAID 5 groups.
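The clone group sizing can be verified with a few lines of arithmetic. This Python sketch is illustrative only; the names are assumptions, and the capacity is the nominal figure of 1 TB per data disk used in the text, before formatting overhead.

```python
RAID_GROUPS = 3
DATA_DISKS_PER_GROUP = 4     # RAID 5 (4+1): four data disks
PARITY_DISKS_PER_GROUP = 1   # and one disk's worth of parity
DISK_TB = 1
CLONE_GB_REQUIRED = 2400

total_disks = RAID_GROUPS * (DATA_DISKS_PER_GROUP + PARITY_DISKS_PER_GROUP)
data_capacity_tb = RAID_GROUPS * DATA_DISKS_PER_GROUP * DISK_TB

# Clone LUNs are spread evenly over the three RAID 5 groups.
clone_gb_per_group = CLONE_GB_REQUIRED / RAID_GROUPS

print(f"{total_disks} disks, {data_capacity_tb} TB nominal data capacity")
print(f"{clone_gb_per_group:.0f} GB of clone LUNs per RAID group")
```

The nominal capacity is well above the 2,400 GB of clones, which is consistent with the spindle count being chosen for performance rather than for space.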

Snapshot policy consideration

EMC NetWorker provides two preconfigured policies that can be used with NMM:

• Serverless backup: A single snapshot is taken per day. The data is then backed up to the traditional tape and the snapshot is deleted.

• Daily: Eight snapshots are taken per day. The data in the first snapshot is backed up to the tape. Each snapshot expires after 24 hours.

In this solution, serverless backup is used for SharePoint full-farm backup.

Retaining the snapshot enables you to perform a snapshot restore for the databases. The snapshot restore is much faster than a conventional restore, which reads data from the backup media. The disadvantage of keeping a snapshot is that the disk space used by the snapshot grows rapidly during daytime.

Full farm conventional recovery design

A full recovery of a distributed SharePoint farm requires that each machine in the farm is configured as a Client resource in NetWorker.

Upon recovery, each machine will use a proxy client to read data from the backup target (Data Domain) and restore the entire farm over the LAN back to the production environment.

One or more content databases can be recovered after the configuration database has been restored. In previous releases of NMM, a user was unable to select only individual SharePoint content databases for restore. When any content database was selected, the corresponding configuration and generic databases were also selected for recovery. NMM 2.2 SP1 provides the ability to select individual SharePoint content databases for recovery.

Granular backup and recovery design for SharePoint

Introduction

In this solution, granular backup and recovery use a LAN-based topology.

The SharePoint 2007 backup utility does not support item-level recovery if the data is missing from the first-level and second-level recycle bin. One remedy is to restore the entire backup or snapshot to the secondary farm and get the item from the DR site. The backup or snapshot is not restored directly to the production server, which eliminates potential risks. However, it is time-consuming and expensive to set up a DR farm to do the recovery.

Another way is to restore whole sites to the production environment directly, which eliminates the need and expense of having a recovery server. However, restoring the entire database can take many hours and impact the business. In addition, restoring directly to the production server overwrites all of the content currently on it, which is not desirable. In this solution, after running a granular backup of a SharePoint site, an item-level granular recovery was completed using NMM 2.2 SP1.

The SharePoint granular backup does not use SQL or SharePoint VSS writers, so it is not necessary to register these writers prior to creating a client resource for SharePoint 2007 granular backup. NMM 2.2 SP1 offers granular backup for SharePoint 2007. Granular backup provides the finest granularity available with SharePoint backup down to the object level. It also provides the ability to back up incrementally. NMM 2.2 SP1 leverages content migration APIs (STSADM “export” command) of SharePoint 2007 by exporting every document and its metadata in the content site one by one.

Granular LAN-based backup and recovery design

In this solution, LAN-based topology is used for granular backup. The workflow is listed in the following table:

Stage Description

1 One or more SharePoint WFEs with NMM 2.2 SP1 installed are set up as clients for granular backup.

2 NetWorker Server sends requests to the WFEs and sets the proxy client as the data mover.

3 WFEs request backup data from SQL Server and objects are staged in the folder on the WFEs.

4 The WFE then ships this data to the NetWorker proxy client.

5 The NetWorker proxy client transfers the data into Data Domain device through FC.

As the following image illustrates, a public network is used to transfer data from the SQL server back end to the WFE during data extraction. The public network utilization can affect the backup performance.

The 15 SATA disks that contain the clone LUNs also hold five staging LUNs for SharePoint granular backup. Five 1 TB LUNs are used by the four WFEs and the SQL server as staging folders to store backup streams during granular backup.

Note The staging LUN must be large enough for staging to take place. If there is insufficient space for the temp folder, the granular backup will fail.

A capacity utilization of 70 percent of these SATA disks ensures better backup performance and I/O throughput during the backup. For more detailed information about staging folder size calculation, refer to EMC NetWorker Module for Microsoft Applications Release 2.2 SP1 Application Technical Notes for SharePoint and Exchange.

Save set configuration for SharePoint

For more information about save set settings for SharePoint full farm backup and granular backup, refer to EMC NetWorker Module for Microsoft Applications Release 2.2 SP1 Administration Guide.

Data Domain design and configuration

Data Domain system overview

Data Domain systems are disk-based deduplication appliances and gateways that provide data protection and disaster recovery (DR) for the enterprise. Data Domain operating system (DD OS) provides both a CLI for performing all system operations, and Enterprise Manager (a graphical user interface) for configuration, management, and monitoring.

Data integrity

The Data Domain Data Invulnerability Architecture protects against data loss from hardware and software failures. Storage in most Data Domain systems is set up in a double parity RAID 6 configuration (two parity drives). Additionally, most configurations include one or two hot spares in each enclosure.

Data compression

DD OS stores only unique data. Through Data Domain Global Compression technology, a Data Domain system pools redundant data from each backup image. Any duplicate data are stored only once. The storage of unique data is invisible to backup software, which sees the entire virtual file system. DD OS data compression is independent of data format. Data can be structured, such as databases, or unstructured, such as text files. Data can be from file systems or raw volumes.

Restore operations

With disk backup through the Data Domain system, incremental backups are always reliable and access time for files is measured in milliseconds. Furthermore, with a Data Domain system, full backups can be performed more frequently without the penalty of storing redundant data.

From a Data Domain system, file restores go quickly and create little contention with backup or other restore operations. Unlike tape drive backups, multiple processes can access a Data Domain system simultaneously. A Data Domain system allows your site to offer safe, user-driven, single-file restore operations.

Data Domain sizing considerations

Storage capacity needs to be sized to adequately handle the amount of data to be retained. Backups that are larger than expected or contain data that deduplicates poorly can require much more storage space.

Although there are many factors that might affect the deduplication ratio, it is possible to estimate Data Domain storage capacity required for particular backup scenarios. Typical compression ratios are about 20:1 on average over many weeks. A backup that includes many duplicate or similar files (files copied several times with minor changes) benefits the most from compression.

In this particular solution, 16 x 931 GB drives were configured as RAID 6 (12+2) with two hot spares, so the total available capacity displayed in the Data Domain console is about 10 TB. At a typical 20:1 ratio, this means the Data Domain system can accept almost 200 TB of backup data. The data that needed to be backed up in this solution was about 9.2 TB (8 TB of Exchange data and 1.2 TB of SharePoint data), so by estimation about three weeks of daily full backups can be retained.
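The estimate above can be reproduced with a few lines of Python. This is a rough sketch that assumes a flat 20:1 ratio applied to the 10 TB of displayed capacity; the function names are illustrative, and actual ratios vary with data type, change rate, and retention:

```python
def logical_capacity_tb(usable_tb: float, dedup_ratio: float) -> float:
    """Pre-deduplication data that fits on the system at a given ratio."""
    return usable_tb * dedup_ratio


def retention_days(logical_tb: float, daily_full_tb: float) -> int:
    """Whole days of daily full backups that fit in the logical capacity."""
    return int(logical_tb // daily_full_tb)


# About 10 TB displayed in the console, typical ratio of 20:1.
logical_tb = logical_capacity_tb(usable_tb=10.0, dedup_ratio=20.0)

# 8 TB of Exchange data plus 1.2 TB of SharePoint data per daily full backup.
days = retention_days(logical_tb, daily_full_tb=8.0 + 1.2)

print(f"logical capacity: {logical_tb:.0f} TB, retention: {days} days")
```

With these inputs the sketch yields 200 TB of logical capacity and 21 days of retention, which is consistent with the three-week estimate.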

EMC strongly recommends performing a sizing assessment when including replication in the backup environment. The sizing assessment can help to determine if replication can occur within the required timeframe, based on the replication network bandwidth and estimated amount of data to replicate on each day.

Data Domain deduplication ratio considerations

The term “deduplication ratio” refers to the ratio of data before deduplication to the amount of data after deduplication.
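As a small illustration of this definition, the following Python sketch computes the ratio and the equivalent space saving. The input figures are the ones reported later in this paper for the initial SharePoint farm full backup (1149 GB before and 740.48 GB after deduplication); the function names are hypothetical:

```python
def dedup_ratio(pre_gb: float, post_gb: float) -> float:
    """Ratio of data before deduplication to data after deduplication."""
    if post_gb <= 0:
        raise ValueError("post-deduplication size must be positive")
    return pre_gb / post_gb


def space_saved_percent(pre_gb: float, post_gb: float) -> float:
    """The same relationship expressed as a percentage of space saved."""
    return (1 - post_gb / pre_gb) * 100


ratio = dedup_ratio(1149, 740.48)
saved = space_saved_percent(1149, 740.48)

print(f"deduplication ratio: {ratio:.2f}:1, space saved: {saved:.1f}%")
```

Expressing the result both ways can help when comparing figures, because some results in this paper are quoted as ratios and others as percentages.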

There are many factors that affect the deduplication ratio. Some key factors are listed below:

• Retaining data for longer periods of time improves the chance that common data already exists in storage, resulting in greater storage savings and a better deduplication ratio.

• Backups of Exchange and SharePoint are known to contain redundant data and are good deduplication candidates.

• After the first full backup, the data change rate affects the deduplication ratio of subsequent backups.

• Data compression and encryption during backup adversely affect the deduplication ratio and are therefore not recommended.

• EMC recommends multiplexing be turned off when using the Data Domain storage system as a VTL with NetWorker.

Data Domain space management considerations

EMC recommends running space reclamation weekly as per the default. This feature can be scheduled or run manually.

If possible, schedule space reclamation to occur outside peak ingestion windows. This reduces the competition for resources and minimizes any impact on ingestion, deduplication, or replication.

Data Domain VTL with NetWorker

The following describes general EMC NetWorker settings and best practices for optimizing the backup environment when using Data Domain as a VTL:

• Avoid running disk-intensive applications such as virus scanning on the backup client when it is backing up or restoring files.

• Use parallelism on the client when backing up data to increase the backup load.

• Assign library and drives for the exclusive use of each backup host to ensure the best possible performance.

• Balance the backup start times rather than scheduling hundreds of backups to begin at the same time. Look at the savegroup and client completion times, or drive activity, to balance the load.

• To ensure steady-state load, examine the drive target sessions, and try to keep a certain number of sessions (more than 10 percent) running throughout the backup window. Too few sessions risk stalling target devices; too many place unnecessary load on the infrastructure.

• On large systems with more than several hundred gigabytes to protect, eliminate data travel through the network by configuring the client as a storage node (LAN-free topology).

• Increase the number of storage nodes and devices if possible, for better performance.

Data Domain configuration

In this particular solution, Data Domain is configured as a VTL and connected to the NetWorker Storage Node. The LAN-free environment is enabled. Configuration details are as follows:

• Exchange and SharePoint share the same NetWorker Storage Node. Adding more storage nodes improves the combined backup and recovery performance if required.

• Two libraries are configured, one for the exclusive use of Exchange 2010 and the other for SharePoint 2007.

• In total, 20 IBM LTO-3 drives are configured for Exchange 2010 so that all 20 database files can be backed up simultaneously. The tape size is 400 GB so that each database file (around 370 GB) can be backed up within one tape.

• In total, 16 IBM LTO-1 drives are configured for SharePoint 2007. The tape size is 100 GB.

• On the Storage Node Server, one dual-port 4 Gb HBA card is assigned to connect to the Data Domain system and the other to connect to the primary storage (backup source). This design ensures the best backup performance.

• On the Data Domain system, there are two 4 Gb HBA ports. Each port is assigned half of the drives for load balancing.

Testing and validation

Introduction

This section describes the design validation and performance results for this solution. The backup and restore features were validated under different scenarios, and performance was measured for both SharePoint 2007 and Exchange 2010. An EMC Data Domain DD690 appliance was used for data deduplication. Microsoft LoadGen 2010 was used to generate mailbox data and simulate a MAPI workload. The KnowledgeLake Document Loader was used to provide continual data population during testing to simulate SharePoint user data growth. The data grows on a daily basis. After that, a daily full backup test was performed to validate the solution design.

Exchange backup scenarios

Introduction

The following table lists the Exchange backup scenarios performed in this solution:

Test Scenario Description

1 Initial Exchange 2010 full backup

2 Daily full backup

Scenario 1: Initial Exchange 2010 full backup

This test scenario was to validate the initial Exchange 2010 full backup performance and deduplication ratio in Data Domain. This should be a one-time event.

The test results showed:

• It took 9 hours and 19 minutes to do an initial full Exchange 2010 backup of 6.5 TB of data.

• The deduplication ratio was 1.46:1.

Note It takes some time to seed the grid on Data Domain when performing the initial backup. What is important is the backup time and deduplication ratio for a daily full backup following the initial full backup (see the results in Exchange backup Scenario 2).

• The backup throughput to Data Domain was 214 MB/s on average.

The following graph shows the Data Domain statistics during the initial full Exchange 2010 backup. As the graph shows, the backup throughput is using the maximum available Data Domain CPU resources.

Scenario 2: Daily full backup

This scenario was to validate the daily full backup performance and deduplication ratio after running LoadGen for eight hours, which generated about 250 GB of log data. Since Data Domain contains the initial full backup data, the deduplication ratio increased greatly.

The test results showed that:

• It took 4 hours and 54 minutes to back up 7.4 TB of data into Data Domain. The backup rate is about 1.5 TB per hour.

• The deduplication ratio was 37:1. The total deduplication ratio across the two full Exchange 2010 backups was 2.84:1.

Note The test result of deduplication ratio is based on using LoadGen, which might be different from real-world data. LoadGen is not a tool to test deduplication ratio.

• The backup throughput to Data Domain was 468 MB/s on average and Data Domain CPU utilization was about 25 percent during the backup. Because Data Domain already contained the initial Exchange 2010 backup data, the backup throughput was higher and the CPU utilization was lower than during the initial backup.

• At this time, the bottleneck is the FC switch throughput. One testing variation achieved 650 MB/s with the FC switch ports in dedicated rate mode, a less common configuration (maximizing switch throughput in this way also pushes Data Domain CPU utilization to very high levels). The published test results were gathered with all of the 4 Gb ports set to shared rate mode, a more common configuration. This means that by adding more FC bandwidth, it is possible to scale to a backup rate of more than 1.5 TB per hour.
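The quoted backup rate and throughput can be cross-checked with simple arithmetic. The Python sketch below assumes that the 1.5 TB per hour rate is taken over the total 4 hour 54 minute window, and that the 468 MB/s average corresponds to the 4 hour 36 minute database and log backup time with binary units (1 TB = 1024 x 1024 MB). These pairings are assumptions made for illustration, not statements from the test report:

```python
def tb_per_hour(data_tb: float, hours: int, minutes: int) -> float:
    """Average backup rate in TB per hour over a window."""
    return data_tb / (hours + minutes / 60)


def mb_per_second(data_tb: float, hours: int, minutes: int) -> float:
    """Average throughput in MB/s, assuming 1 TB = 1024 * 1024 MB."""
    seconds = hours * 3600 + minutes * 60
    return data_tb * 1024 * 1024 / seconds


# 7.4 TB over the total backup window of 4 hours and 54 minutes.
rate = tb_per_hour(7.4, hours=4, minutes=54)

# 7.4 TB over the database and log backup time of 4 hours and 36 minutes.
throughput = mb_per_second(7.4, hours=4, minutes=36)

print(f"{rate:.2f} TB/hour, {throughput:.0f} MB/s")
```

Under these assumptions the results are close to the reported 1.5 TB per hour and 468 MB/s.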

The following table lists the daily full backup results after LoadGen simulation:

Amount of backup data: 7.4 TB
RM snapshot and mount replica time: 28 minutes for all 40 snaps
Backup time (database and log): 4 hours and 36 minutes
Total backup window: 4 hours and 54 minutes
Deduplication ratio: 37:1

Note The number of snaps affects the total RM snapshot and mount replica time. It is recommended to mount fewer than 20 volumes on each RM mount host, so adding more mount hosts (backup servers) improves the total backup window.

The following image shows the SAN switch bandwidth during the daily full backup after LoadGen simulation. The total throughput is about 68.5 MB/s.

The following graph shows the Data Domain statistics during the daily full backup after LoadGen simulation. The Data Domain CPU utilization is about 25 percent.

Exchange 2010 recovery scenarios

Introduction

The following table lists the Exchange recovery scenarios performed in this solution:

Test Scenario Description

1 Single database recovery

2 Single Exchange 2010 mailbox recovery

Scenario 1: Single database recovery

This test shows the RTO to recover a single Exchange 2010 database by using both Replication Manager snapshot and Data Domain backup data. Snapshot restore provides a quick way for Exchange 2010 database recovery and you can also recover old backup data from Data Domain by leveraging the NetWorker client.

The test results showed that:

• It took only 8 minutes to restore one Exchange 2010 database from an RM snapshot.

• It took 2 hours and 37 minutes to recover one Exchange 2010 database (386 GB of data) from Data Domain to Mailbox Server MBX01.

• For Data Domain recovery, the calculated recovery speed from Data Domain was about 41.96 MB/s. In our testing environment, the bottleneck was the network speed. The recovery window could have been shortened by increasing the network bandwidth, for example by using a 10 Gb network instead of the 1 Gb network in the testing environment.

• For Data Domain recovery, the average CPU usage of Data Domain was 10 percent and the maximum disk utilization of Data Domain was 23 percent, which means Data Domain resources were enough to support more network bandwidth or multiple recovery sessions.
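The calculated recovery speed can be reproduced from the figures above. This is a minimal Python sketch that assumes 1 GB = 1024 MB; the function name is illustrative:

```python
def recovery_speed_mb_s(data_gb: float, hours: int, minutes: int) -> float:
    """Average recovery speed in MB/s, assuming 1 GB = 1024 MB."""
    seconds = hours * 3600 + minutes * 60
    return data_gb * 1024 / seconds


# 386 GB recovered from Data Domain in 2 hours and 37 minutes.
speed = recovery_speed_mb_s(386, hours=2, minutes=37)

print(f"{speed:.2f} MB/s")
```

The same function can be used to estimate the recovery window for other database sizes once an achievable speed for a given network is known.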

The following table lists the performance counters captured on Mailbox Server MBX01 to measure the impact on the production environment during the Data Domain recovery.

Mailbox Server    CPU usage (%)    Network utilization (%)
MBX01             24               41

The following graph demonstrates the Data Domain performance during a full point-in-time recovery.

Scenario 2: Exchange 2010 mailbox recovery

Microsoft provides a mechanism to recover data at the mailbox level or item level, which is called a recovery database (RDB). An RDB is a special kind of mailbox database that allows you to mount a restored mailbox database and extract data from it as part of a recovery operation.

Perform the steps in the following table to recover data using RDB:

Step Action

1 Restore the database files and log files from the tape, and put them under predefined folders on the recovery Mailbox server.

2 Create the recovery database with the New-MailboxDatabase cmdlet and the -Recovery switch. Specify the database path and log file path as the predefined folders. For example: New-MailboxDatabase -Recovery -Name RDB1 -Server MBX03 -EdbFilePath "C:\Recovery\RDB1\DB\MBX02_DB1.EDB" -LogFolderPath "C:\Recovery\RDB1\Log"

3 Use the Restore-Mailbox cmdlet to recover mailbox-level data or item-level data.

For more details about recovery databases, refer to the following articles on the Microsoft TechNet website:

• http://technet.microsoft.com/en-us/library/ee332321.aspx

• http://technet.microsoft.com/en-us/library/ee332351.aspx

The advantage of this recovery method is that RDB is a built-in feature of Exchange 2010 so there is no need to buy additional software or licenses to create RDBs on existing Exchange servers. However, the recovery procedure consumes CPU, memory, and network resources on Exchange servers. To minimize the impact, organizations can have a dedicated Exchange Mailbox server for RDBs, which is not a cost-effective solution. Currently, there is no GUI for the RDB feature. All operations need to be run through cmdlets, which adds complexity to the recovery.

Ontrack PowerControls is mailbox recovery software that overcomes some of the disadvantages of RDB. Ontrack PowerControls for Exchange works with existing Exchange Server backup architecture and procedures, and enables the recovery of individual mailboxes, folders, messages, attachments, calendar items, notes, and tasks directly to the production Exchange Server or to any PST file. This powerful software also lets you search and create a copy of all archived e-mail that matches a given keyword or criteria.

In this solution, Ontrack PowerControls is installed on the NetWorker proxy client, which minimizes the impacts on the production environment.

Perform the steps listed in the following table to recover data using Ontrack PowerControls:

Step Action

1 Restore the database files and log files from the tape to the NetWorker Proxy node. In this way, there is no impact on the LAN and the Exchange servers.

2 Run Ontrack PowerControls for Exchange on the NetWorker Proxy node, and specify the location where the database files and log files are restored as shown in the following image.

3 Ontrack PowerControls lists all mailboxes contained in this database. Through the GUI, select the specific mailbox for restore as shown in the following image.

4 When restoring the data, select in which format to export the data as shown in the following image.

5 For item-level restore, expand the mailbox folder hierarchy, and select the e-mail items to be restored as shown in the following image.

6 Select the format to export to, as shown in the following image.

A significant advantage of this solution design is that when the data to be recovered is contained in the last snapshot, the snapshot can simply be mounted to the Proxy node for recovery. There is no need to wait for a restore from tape, which greatly reduces the restore time.

SharePoint backup using VSS framework

Introduction

The following table lists the SharePoint backup scenarios using the VSS framework performed in this solution:

Test Scenario Description

1 Initial SharePoint farm full backup

2 Daily full farm backup

3 Database-level granular full backup

Workload simulation

The KnowledgeLake Document Loader was used to provide continual data population during testing to simulate SharePoint user data growth. The intention was to measure the Data Domain deduplication ratio when SharePoint content data increases.

Content creation was accomplished with Knowledge Document Loader Lite software. The software can take a series of documents and modify copies of them to generate unique documents. It then takes the document copies and distributes them into document libraries in the SharePoint farm. The data population lasts for 8 hours per day.

Scenario 1: Initial SharePoint farm full backup

This test scenario was to validate the initial SharePoint farm full backup performance and deduplication ratio in Data Domain using VSS in LAN-free topology.

The test results are as follows:

• It took 3 hours and 41 minutes to complete a full backup of 1149 GB of data in total into the Data Domain.

• The deduplication ratio was 1.55:1 and the post-compression data was 740.48 GB.

• The average response time for the SATA disks of clone LUNs was 3 milliseconds during backup.

• The average write throughput to Data Domain was 152.1 MB/s and Data Domain CPU utilization was 43.9 percent during the backup.

The following table lists the detailed information about the whole backup duration:

Total backup duration: 3 hours and 41 minutes
Total size: 1149 GB
Deduplication ratio: 1.55:1

Prepare and import snapshots to proxy client: 1 hour and 15 minutes
Backup streams to Data Domain: 2 hours and 7 minutes
Backup the metadata file: 4 minutes
Deport the snapshot: 11 minutes
Backup the index and bootstrap: 4 minutes
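As a consistency check on the table above, the individual phase durations can be summed and compared with the total backup duration. The Python sketch below assumes that the three short durations map to the metadata, deport, and index phases in the order in which they are listed:

```python
from datetime import timedelta

# Phase durations as listed in the table above.
phases = {
    "Prepare and import snapshots to proxy client": timedelta(hours=1, minutes=15),
    "Backup streams to Data Domain": timedelta(hours=2, minutes=7),
    "Backup the metadata file": timedelta(minutes=4),
    "Deport the snapshot": timedelta(minutes=11),
    "Backup the index and bootstrap": timedelta(minutes=4),
}

total = sum(phases.values(), timedelta())

# Share of the total window spent in each phase.
for name, duration in phases.items():
    print(f"{name}: {duration / total:.0%}")

print(f"total: {total}")
```

The phases add up to the reported 3 hours and 41 minutes, and roughly a third of that window is spent preparing and importing snapshots rather than moving data to Data Domain.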

Note The test result of the deduplication ratio is based on using Knowledge Document Loader Lite, which might be totally different from the real-world data. Knowledge Document Loader Lite software is not a tool to test the deduplication ratio.

The following chart shows the Data Domain statistics during the initial full farm backup. The chart shows that the percentage of time that all CPUs are in use is stable. The Disk chart also shows the amount of data, in mebibytes per second, going to and from all disks in the Data Domain system.

Scenario 2: Daily full farm backup

After running KnowledgeLake Document Loader for eight hours, the total SharePoint farm data increased by 17 GB (from 1,149 GB to 1,166 GB).

The test results are as follows:

• It took 3 hours and 39 minutes to back up 1,166 GB of data into Data Domain.

• The deduplication ratio was 10.5:1 and the post-compression data was 114.2 GB. The total deduplication ratio was 2.74:1.

• The write throughput to Data Domain was 164.5 MB/s on average and Data Domain CPU utilization was 23.8 percent during the backup. The peak write throughput was 240 MB/s.

• By default, NMM can run four save sessions in parallel in NetWorker to increase backup performance.

Note The test result of the deduplication ratio is based on using Knowledge Document Loader Lite, which might be totally different from the real-world data. Knowledge Document Loader Lite software is not a tool to test the deduplication ratio.

Scenario 3: Database-level full backup

This test scenario performed a database-level granular backup using VSS. Save set entries in a format similar to the following specify the content databases to be backed up:

• APPLICATIONS:\SqlServerWriter\SQL\contentdb7

• APPLICATIONS:\SqlServerWriter\SQL\contentdb8

• APPLICATIONS:\SqlServerWriter\SQL\contentdb9
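When many content databases are involved, save set entries in this format can be generated rather than typed by hand. The following Python sketch only builds strings in the format shown above; the helper name is hypothetical, and the exact save set syntax for a given environment should be confirmed against the NMM administration guide:

```python
def content_db_save_sets(db_names, writer_path=r"APPLICATIONS:\SqlServerWriter\SQL"):
    """Build save set entries in the format shown above, one per database."""
    return [f"{writer_path}\\{name}" for name in db_names]


# The three content databases backed up in this scenario.
save_sets = content_db_save_sets(["contentdb7", "contentdb8", "contentdb9"])

for entry in save_sets:
    print(entry)
```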

The test results are as follows:

• It took 1 hour and 14 minutes to back up three content databases with 360 GB in total.

• The deduplication ratio was 1.47:1 with 252.3 GB post-compression data after initial full backup.

• The write throughput to Data Domain was 193 MB/s and the CPU utilization of Data Domain was 50 percent.

Note The test result of the deduplication ratio is based on using Knowledge Document Loader Lite, which might be totally different from the real-world data. Knowledge Document Loader Lite software is not a tool to test the deduplication ratio.

Granular backup for SharePoint scenarios

Introduction

The following table lists the granular backup SharePoint scenario performed in this solution:

Test Scenario Description

1 Granular backup

Scenario 1: Granular backup

This test scenario shows the performance of granular full and incremental backups, which allow item-level recovery in NMM 2.2 SP1, and the resulting deduplication ratio in Data Domain.

The test was to back up three site collections simultaneously from three WFEs.

The test results are as follows:

• It took 8 hours to do a full backup of three site collections with 291 GB of data in total. Each site collection was 97 GB, and the backups ran in parallel. The total compression ratio was 92.8 percent, with 21.56 GB of post-compression data in Data Domain.

• The write throughput to the Data Domain was 17.1 MB/s on average.

The following table lists the performance counters captured on the SQL server, the WFEs running backups, and the proxy client in order to measure the impact on the production environment during the granular backup of one site collection.

| Server role | CPU usage (%) | Memory usage | Network utilization |
|---|---|---|---|
| SQL Server | 36.76% | 7.5% | 8.6% |
| Web Front End 01 | 19.09% | 14.076% | 11.12% |
| Web Front End 02 | 23.56% | 40.67% | 13% |
| Web Front End 03 | 18.30% | 9.57% | 12.19% |
| Proxy client | 0.50% | 3.5% | 4.4% |

Subsequent incremental backup tests show that it took 1 hour and 10 minutes to back up a 30 GB increase across the three site collections into Data Domain, with 93 percent deduplication.

According to NMM best practices, a dedicated WFE should perform the granular backup. In the next test, we used a dedicated WFE with eight vCPUs to perform the granular backup. The backup performance improved to 55 GB/hr, on average, while running six site collections in parallel. It took 1 hour and 20 minutes to back up six site collections in parallel with 70 GB in total. During the test, the WFE CPU utilization was 75 percent because each site collection was staged to one processor for backup.

The following are some best practices to reduce the backup time:

• Start multiple save sets simultaneously to reduce the backup time. For more parallelism settings, refer to the EMC NetWorker Module for Microsoft Applications Release 2.2 SP1 Administration Guide.

• Use the -p option of nsr_moss_save for SharePoint granular backups. This parameter improves SharePoint backup performance and staging. Its value depends on the number of processors (CPUs) available on the WFE. For more information, refer to the EMC NetWorker Module for Microsoft Applications Release 2.2 SP1 Administration Guide.

• After running the full backup, schedule regular incremental backups to reduce the backup time for granular backups.

Note: The deduplication ratio in this test is based on data generated by Knowledge Document Loader Lite and might differ considerably from the ratio achieved with real-world data. Knowledge Document Loader Lite is not a tool for testing deduplication ratios.

SharePoint recovery using VSS

Introduction

The following table lists the SharePoint recovery scenarios using VSS performed in this solution:

| Test scenario | Description |
|---|---|
| 1 | Full farm recovery using NMM 2.2 SP1 |
| 2 | Database-level conventional recovery |

Scenario 1: Full farm recovery using NMM 2.2 SP1

This test scenario shows the RTO of restoring a whole SharePoint farm by using NMM 2.2 SP1.

The test results are as follows:

• It took 4 hours and 19 minutes to restore the entire SharePoint farm of 1,166 GB of data.

• The recovery speed from Data Domain can reach 77.652 MB/s.

• The average CPU usage of Data Domain was 10 percent and the maximum disk utilization of Data Domain was 21 percent.
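As a rough cross-check of the figures above, the average restore throughput can be derived from the data size and the elapsed time. The Python sketch below assumes 1 GB = 1,024 MB, and the function name is illustrative only. It yields an average of about 76.8 MB/s, close to the 77.652 MB/s that the recovery speed reached.

```python
def average_throughput_mb_s(size_gb, hours, minutes):
    """Average throughput in MB/s, assuming 1 GB = 1,024 MB."""
    seconds = hours * 3600 + minutes * 60
    return size_gb * 1024 / seconds


# Figures from the full farm recovery test: 1,166 GB in 4 hours 19 minutes.
farm_restore = average_throughput_mb_s(1166, 4, 19)
print(f"Average restore throughput: {farm_restore:.1f} MB/s")
```

The same calculation can be applied to the other scenarios to compare a planned recovery window against the measured rates.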

After the full recovery completes, the timer cache must be cleared on the SharePoint servers. For details, see http://support.microsoft.com/kb/939308/en-gb.

The following chart demonstrates the Data Domain performance when doing a full point-in-time recovery.

Scenario 2: Database-level conventional recovery

This test scenario shows the RTO and recovery performance when restoring one or multiple content databases using NMM 2.2 SP1.

Select one or more content databases in NMM to restore to a point in time, as shown in the following image. The configuration database must be restored before the individual content databases are recovered. VSS does not allow restoring content databases from an alternative location.

The test restored four content databases using NMM 2.2 SP1.

The test results are as follows:

• It took 1 hour and 21 minutes to restore four content databases of 307.3 GB in total from Data Domain.

• The recovery speed for four single databases can reach 68.2 MB/s.

• 62 percent of the network bandwidth was utilized for the restore on the production environment and proxy client.

• CPU usage on the production SQL Server was 46.3 percent.

• The average CPU usage of Data Domain was 9.5 percent and the maximum disk utilization of Data Domain was 30 percent. The memory usage was stable during the recovery and reached a plateau at 5 GB.

Parallel recovery sessions reduce the recovery time and increase the recovery throughput. At the same time, the CPU utilization of the production SQL Server increases with each additional parallel session.

Granular recovery for SharePoint

Introduction

The following table lists the SharePoint granular recovery scenarios performed in this solution:

| Test scenario | Description |
|---|---|
| 1 | Item-level granular recovery using NMM 2.2 SP1 |
| 2 | Item-level recovery using Kroll Ontrack 6.0 |
| 3 | Site-level recovery using Kroll Ontrack 6.0 |

Scenario 1: Item-level granular recovery using NMM 2.2 SP1

In the test, it took 12 minutes and 20 seconds to restore 80 documents, averaging 5.5 MB each, without recovering the entire site.

Select items to be restored to the original location or to an alternative location. Use the Search function to find the items to be restored as shown in the following images.

Scenario 2: Item-level recovery using Kroll Ontrack 6.0

Ontrack PowerControls for SharePoint works with the existing SharePoint database, and allows the restoration of content from MDF files, NDF files, and LDF files directly to a SharePoint target, or to a different SharePoint server.

In this solution, Ontrack PowerControls was installed on the NetWorker proxy client to minimize the impact on the production environment.

To recover data using Ontrack PowerControls, perform the following steps:

Step Action

1 Restore the database files and log files from the tape to the NetWorker Proxy node. In this way, there is no impact on the LAN and the SharePoint servers. It took 54 minutes and 27 seconds to restore a single database of 111 GB (98 GB data and 13 GB log file) from Data Domain.

2 Load Ontrack PowerControls for SharePoint on the NetWorker Proxy node, and specify both the location of the restored database and log files and the temporary file path, as shown in the following image.

3 Enter the SQL Server Name and the SharePoint Configuration Database name as shown in the following image to connect to the target server to complete the recovery.

Kroll Ontrack lists the source and target site collection structure as shown in the following image.

4 Select the item and choose Export to restore to a local folder.

It took one second to restore a 512 KB document into the local folder, as shown in the following image.

5 Select Copy and Paste to restore items into the destination site as shown in the following image.

A major advantage of this solution design is that, if the data to be recovered is contained in the last snapshot, the snapshot can simply be mounted to the Proxy node for recovery without waiting for a restore from tape. This can greatly reduce the restore time.

Scenario 3: Site-level recovery using Kroll Ontrack 6.0

Kroll Ontrack also allows restoring entire sites, lists, and document libraries from MDF files, NDF files, and LDF files.

To complete the restore, perform the following steps:

Step Action

1 Restore the database files and log files from the tape to the NetWorker Proxy node. In this way, there is no impact on the LAN and the SharePoint servers. It took 54 minutes and 27 seconds to restore a single database of 111 GB (98 GB data and 13 GB log file) from Data Domain.

2 Load Ontrack PowerControls for SharePoint on the NetWorker Proxy node, and complete the wizard by following Steps 2 to 3 in Scenario 2.

3 After showing the source and target site collections, right-click the site to be restored and select Export to export the whole site to the local folder as shown in the following image.

4 To restore it to a target site collection, click Copy and Paste as shown in the following image.

As shown in the test results, it took 3 hours to restore a whole site with 98 GB of user data. This also applies to a whole list and document library.

Combined backup scenarios

Introduction

The following table lists the combined backup scenarios performed in this solution:

| Test scenario | Description |
|---|---|
| 1 | Combined initial full backup to Data Domain |
| 2 | Combined daily full backup to Data Domain after new data generation |

Scenario 1: Combined initial full backup to Data Domain

This test scenario validates the combined backup performance and deduplication ratio in Data Domain. The Exchange 2010 backup and the SharePoint 2007 full farm backup run during the same period.

The test results are as follows:

• It took 10 hours and 28 minutes to do a full backup of 8.6 TB of data in total (7.5 TB of Exchange data and 1.1 TB of SharePoint data) into Data Domain.

• The deduplication ratio was 1.5:1.

• The backup throughput was 233.7 MB/s on average and the Data Domain CPU utilization was 90 percent during the backup.

The following table lists the combined initial full backup results:

| Amount of backup data | Total backup window | Deduplication ratio |
|---|---|---|
| 8.6 TB | 10 hours 28 minutes | 1.5:1 |

The following chart shows the Data Domain statistics when doing combined initial full backup.

Scenario 2: Combined daily full backup to Data Domain after new data generation

After data generation, the total data increases from 8.6 TB to 9.2 TB. Data Domain contains the initial full backup data, so the deduplication ratio increases greatly.

The test results are as follows:

• It took 9 hours and 37 minutes to back up 9.2 TB of combined data into Data Domain.

• The deduplication ratio of this backup was 26.3:1. The cumulative deduplication ratio across the two combined full backups was 2.84:1.

• The backup throughput was 278.6 MB/s on average and Data Domain CPU utilization was around 50% during the backup.
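The cumulative ratio is far lower than the 26.3:1 of the second backup because it divides all logical data written so far by all physical data stored so far, and the initial backup stored most of its data. The Python sketch below estimates this from the rounded figures reported for the two backups. It is an approximation, not the method Data Domain uses internally, and it yields roughly 2.9:1 rather than the measured 2.84:1, a gap that is plausibly explained by rounding in the reported per-backup ratios.

```python
def stored_tb(logical_tb, dedup_ratio):
    """Physical capacity consumed for a backup with the given deduplication ratio."""
    return logical_tb / dedup_ratio


# (logical size in TB, reported deduplication ratio) for the two full backups.
backups = [(8.6, 1.5), (9.2, 26.3)]

total_logical = sum(size for size, _ in backups)
total_stored = sum(stored_tb(size, ratio) for size, ratio in backups)
cumulative_ratio = total_logical / total_stored

print(f"Logical data written: {total_logical:.1f} TB")
print(f"Estimated data stored: {total_stored:.2f} TB")
print(f"Estimated cumulative deduplication ratio: {cumulative_ratio:.2f}:1")
```

Adding further daily full backups to the list shows how the cumulative ratio keeps rising as long as each new backup deduplicates well against the data already stored.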

The following table lists the combined daily full backup results:

| Amount of backup data | Total backup window | Deduplication ratio |
|---|---|---|
| 9.2 TB | 9 hours 37 minutes | 26.3:1 |

The following chart shows the Data Domain statistics when doing combined daily full backup.

Combined recovery scenario

Introduction

This test shows the RTO for recovering combined data, comprising two Exchange 2010 databases and the full SharePoint farm, at the same time using the NetWorker client.

The test results showed that:

• It took 6 hours and 35 minutes to recover the whole farm of 1,456 GB of user data. It took 5 hours and 34 minutes to recover two Exchange 2010 databases of 769 GB of user data from Data Domain. The recovery performance is almost the same as when doing the recoveries separately.

• During the combined recovery, the network utilization of the storage node was 79 percent. The read throughput of Data Domain was around 103 MB/s and its CPU utilization was 15 percent. These results indicate that adding more storage nodes and running more recovery sessions at the same time can improve the overall RTO.

Conclusion

Summary

This solution demonstrates the building of an Exchange Server 2010 and an Enterprise SharePoint Server 2007 environment on the CLARiiON platform that integrates EMC Replication Manager, the NetWorker family, Data Domain, and Kroll Ontrack PowerControls.

This solution also provides a full or granular-level backup of Exchange Server 2010 and Enterprise SharePoint Server 2007 with rapid VSS backup of terabytes of data while storing them efficiently on data deduplication storage.

Key findings

The table below summarizes the key points that this solution addresses.

| Key point | Solution objective |
|---|---|
| Reduced backup timeframes | Enable a daily full backup in a time window of less than 10 hours, compared to a four-day time window with a traditional tape-based backup solution |
| Reduced backup storage requirements | Require less capacity on Data Domain for daily full backups with a high data deduplication ratio, compared to the traditional 100 percent backup storage capacity requirement on tape |
| No impact to production environment | Cause little impact on the production environment for backup operations by using the Exchange DAG passive copy, CLARiiON SnapView clone technology, and the NetWorker proxy client for data rollover to Data Domain |
| Simple and efficient item-level recovery | Perform easy operations for both mailbox-level and item-level granular recovery |

Next steps

To learn more about this and other solutions, contact an EMC representative or visit www.emc.com.

References

White papers

For additional information, see the white papers listed below.

• SharePoint Backup and Recovery: Ensuring Complete Protection - A Detailed Review

• EMC Backup and Recovery for Microsoft Office SharePoint Server 2007 Enabled by EMC CLARiiON CX4, EMC Replication Manager, Kroll Ontrack, and Microsoft Hyper-V – A Detailed Review

Product documentation

For additional information, see the product documents listed below.

• Data Domain OS 4.8 Administration Guide

• Data Domain OS 4.8 Command Reference Guide

• EMC Data Domain Storage System with EMC NetWorker Best Practices Planning

• EMC NetWorker Module for Microsoft Applications Release 2.2 SP1 Application Technical Notes for SharePoint and Exchange

• EMC NetWorker Module for Microsoft Applications Release 2.2 SP1 Administration Guide

• EMC NetWorker Module for Microsoft Applications Release 2.2 SP1 Installation Guide

• EMC NetWorker Module for Microsoft Applications Release 2.2 SP1 Release Notes

• EMC NetWorker Module for Microsoft Applications and EMC CLARiiON: Implementing Proxy Node Backups Release 2.2 SP1 Technical Notes

• EMC NetWorker Release 7.6 Administration Guide

• EMC NetWorker Release 7.6 Error Message Guide

• EMC NetWorker Release 7.6 Installation Guide

• EMC NetWorker Release 7.6 Release Notes

• EMC Replication Manager Version 5.2.3 Administrator’s Guide

• EMC Replication Manager Version 5.2.3 Product Guide

• EMC Replication Manager Version 5.2.3 Language Pack Release Notes

Other documentation

For additional information, see documents at the website listed below:

• Microsoft TechNet

Additional information

Introduction

This appendix describes the sample scripts used in this white paper.

Automation scripts during backup

Assume that we have an Exchange 2010 backup group called Ex2010 in NetWorker. The following are some sample scripts using Replication Manager CLI.

***********************************************************************************************

C:\Program Files\Legato\nsr\res\Ex2010.res

type: savepnpc;
precmd: "C:\\scripts\\RM_start.cmd >> C:\\scripts\\RM_start.log 2>&1";
pstcmd: "echo Finished!";
# timeout: "12:00:00";
abort precmd with group: Yes;

C:\scripts\RM_start.cmd

cd "C:\Program Files (x86)\EMC\rm\gui"
rmcli file="C:\scripts\Batch_RM_start.txt"

C:\scripts\Batch_RM_start.txt

connect to host=RMSRV.tcesh.gsc.emc.com port=65432
login user=Administrator password=********
run-job name=MBX01_02 appset=MBX01_02
exit 0

***********************************************************************************************

For detailed information about savepnpc usage, refer to the EMC NetWorker Release 7.6 Administration Guide.

Automation script for reclaiming array storage

When an import of CLARiiON clones fails, or a backup session is accidentally stopped in NetWorker, the snapshot session must be destroyed, synchronized, or terminated, depending on the hardware type, prior to the next backup.

The following script automates resynchronizing all clone groups where the last status was “Administratively Fractured”.

The script can be saved as a .ps1 file and run in Windows PowerShell.

***********************************************************************************************

########## Constants ##########
# Please provide CX administration information here
$CX_Admin = "XXXX"        #### CLARiiON Administration account #####
$CX_Passcode = "XXXXXX"   #### CLARiiON Administration password #####
$CX_IP = "XXXXXX"         #### CLARiiON IP Address ######
$Cmd_Prefix = "naviseccli -h "+$CX_IP+" -user "+$CX_Admin+" -password "+$CX_Passcode+" -scope global"
$Max_LUN_ID = 8191
########## Constants ##########

Write-Host "Script is initializing... ..."
$GetSPNamecmd = $cmd_Prefix + " snapview -listclonegroup >C:\clonegroups.txt"
Write-Host $GetSPNamecmd
Invoke-Expression $GetSPNamecmd

$CloneStatus = Get-Content c:\clonegroups.txt | Select-String "CloneCondition:"
$a = 0
"{0:N0}" -f $a
foreach ($line in $CloneStatus)
{
    $cloneconditions = $line.tostring().split(':')
    Write-Host $cloneconditions[1].tostring().trim()
    if ($cloneconditions[1].tostring().trim() -like "Administratively Fractured")
    {
        $clonenames = Get-Content c:\clonegroups.txt | Select-String "Name: "
        $clonegroupids = Get-Content c:\clonegroups.txt | Select-String "CloneID: "
        $clonename = $clonenames[$a].tostring().split(':')
        $clonegroupid = $clonegroupids[$a].tostring().split(':')
        $cmd = $Cmd_Prefix+" snapview -syncclone -o -name "+$clonename[1] +" -CloneId"+$clonegroupid[1]
        Write-Host $cmd
        Invoke-Expression $cmd
    }
    $a = $a + 1
}
Write-Host "Script finish"

***********************************************************************************************
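The selection logic of the PowerShell script above can be tried out without access to an array. The Python sketch below mirrors it: it pairs the n-th "Name", "CloneID", and "CloneCondition" lines and keeps the clones reported as "Administratively Fractured". The sample text is invented for illustration and is not real naviseccli output. Like the script, the sketch assumes one clone per clone group so that the three fields line up by position. The function names are hypothetical.

```python
# Invented sample text containing only the three labels the script relies on.
# Real "snapview -listclonegroup" output has more fields and may differ.
SAMPLE_OUTPUT = """\
Name: MBX01_DB
CloneID: 0100000000000000
CloneCondition: Administratively Fractured
Name: MBX02_DB
CloneID: 0100000000000000
CloneCondition: Normal
"""


def field_values(text, label):
    """Return the value of every 'label: value' line, in order of appearance."""
    values = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith(label + ":"):
            values.append(stripped.split(":", 1)[1].strip())
    return values


def fractured_clones(text):
    """Return (group name, clone ID) pairs whose condition is Administratively Fractured."""
    names = field_values(text, "Name")
    clone_ids = field_values(text, "CloneID")
    conditions = field_values(text, "CloneCondition")
    return [
        (name, clone_id)
        for name, clone_id, condition in zip(names, clone_ids, conditions)
        if condition == "Administratively Fractured"
    ]


for name, clone_id in fractured_clones(SAMPLE_OUTPUT):
    print(f"Would resynchronize clone {clone_id} in group {name}")
```

Running the selection step against a saved copy of the real command output before enabling the resynchronization makes it easier to confirm which clones would be affected.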

For more information about reclaiming array storage before the next backup, refer to the EMC NetWorker Module for Microsoft Applications Release 2.2 SP1 Release Notes.