
© Copyright IBM Corporation 2012

Back up 1000 VMware guests with Tivoli Storage FlashCopy Manager for VMware

Poulin M. Kao ([email protected])
Senior Software Engineer
IBM

James E. Damgar ([email protected])
Staff Software Engineer
IBM

27 November 2012

A VMware environment with 1000 virtual machines was backed up in 36 minutes using IBM® Tivoli® Storage FlashCopy® Manager for VMware V3.1. This article discusses the program functions and parameters that achieved this result and suggests best practice guidelines.

Overview

With the explosive growth in server virtualization, the demand for quick, efficient backup of virtual machines residing in VMware datastores is becoming increasingly critical. IBM Tivoli Storage FlashCopy Manager for VMware Version 3.1, referred to as FlashCopy Manager for VMware in the rest of this article, is designed to handle that demand. As a virtual environment user, you want to quickly back up and restore your VMware datastores even as your virtual environment grows larger and the applications running on it become more critical to your business success.

To assess how well FlashCopy Manager for VMware meets the needs of VMware virtual environments, the Tivoli Storage Manager performance team conducted FlashCopy backup tests for VMware environments containing up to 1000 online virtual machines, with a total of 18TB of disk space.

The test results for up to 1000 virtual machines (the maximum tested) show that the FlashCopy backup elapsed time increases linearly with the number of virtual machines when the SNAPSHOT_EXCL_MEM snapshot mode and NUMBER_VM_CONCURRENT_TASKS=1 defaults are used. Tests that increased the value of the NUMBER_VM_CONCURRENT_TASKS parameter substantially improved the FlashCopy backup elapsed time. Highlights of the results are:

• 500 virtual machines can be backed up by FlashCopy Manager for VMware in 15 minutes when SNAPSHOT_EXCL_MEM mode is used and NUMBER_VM_CONCURRENT_TASKS is set to 256.


• 1000 virtual machines can be backed up by FlashCopy Manager for VMware in 36 minutes when SNAPSHOT_EXCL_MEM mode is used and NUMBER_VM_CONCURRENT_TASKS is set to 256.
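The two highlights can be converted into an average cost per virtual machine with simple arithmetic. The following sketch uses only the elapsed times quoted above; the function name is ours and is chosen purely for illustration.

```python
def seconds_per_vm(vm_count: int, elapsed_minutes: float) -> float:
    """Average backup time per virtual machine, in seconds."""
    return elapsed_minutes * 60 / vm_count

# Elapsed times reported above (SNAPSHOT_EXCL_MEM, NUMBER_VM_CONCURRENT_TASKS=256).
results = {500: 15, 1000: 36}

for vms, minutes in results.items():
    print(f"{vms} VMs: {seconds_per_vm(vms, minutes):.2f} s per VM")
```

At this setting the average works out to 1.80 seconds per VM for 500 VMs and 2.16 seconds per VM for 1000 VMs.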

However, increasing the value for the NUMBER_VM_CONCURRENT_TASKS parameter does not necessarily suit everyone's environment. We observed its positive and negative impact and formulated some practical recommendations, which are based on FlashCopy Manager for VMware and VMware operation behaviors in different VMware configurations.

FlashCopy backup test environment

FlashCopy backup tests were conducted in the following test environment. All the resources in this setup were dedicated to this particular test. We used scripts that utilized the FlashCopy Manager for VMware command-line interface (vmcli) so that we could have full control over test execution. Typically, you use the FlashCopy Manager for VMware Data Protection vCenter GUI plug-in, which is integrated with VMware vCenter, to perform FlashCopy Manager for VMware tasks, but the processing is similar.

Hardware and software components

We used five IBM x3850 M2 servers with very similar machine configurations. They differed only slightly in terms of the number of CPU cores and processor speed. Nonetheless, each one was powerful enough to handle the requirements of several hundred virtual machines. We allocated 512MB RAM for each virtual machine (VM), because each ESXi server has 128GB RAM and we wanted to build a minimum of 200 VMs in each ESXi server in order to have 1000 VMs in the VMware environment. For the 1000 VMs test, 200 VM guests were hosted on each of the five ESXi 5.0 servers in the test, whereas for the 100-500 VM guests tests, 100 VM guests were hosted on each ESXi server. The VMware environment that we tested consisted of five VMware ESXi version 5 servers, one VMware vCenter server, one Linux backup server (FlashCopy Manager for VMware backup server) and one IBM System Storage DS8000® storage system. Figure 1 illustrates the environment. Detailed lists of the hardware and software components are provided in Supplemental test environment information.
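The sizing reasoning above can be checked with a short calculation. This is a rough sketch that uses only the figures given in this section; it ignores hypervisor and per-VM memory overhead, which is an assumption made for simplicity, and the names are illustrative.

```python
HOST_RAM_GB = 128     # RAM per ESXi server, from the description above
VM_RAM_MB = 512       # RAM allocated to each VM
VMS_PER_HOST = 200    # VMs per host needed to reach 1000 VMs on five hosts

def allocated_ram_gb(vm_count: int, vm_ram_mb: int) -> float:
    """Total guest RAM allocation in GB (1 GB = 1024 MB)."""
    return vm_count * vm_ram_mb / 1024

used = allocated_ram_gb(VMS_PER_HOST, VM_RAM_MB)
print(f"{used:.0f} GB allocated to guests, {HOST_RAM_GB - used:.0f} GB not allocated")
```

With 512MB per guest, 200 VMs account for 100GB of the 128GB on each host, which is why a larger per-VM allocation would not have allowed 200 VMs per server without overcommitting memory.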


Figure 1. Test environment layout

Testing considerations

FlashCopy backup tests were performed in increments of 100 VMs up to 500 VMs and then at 1000 VMs. System metrics on the backup server and ESXi servers were collected using OS tools and VMware tools. Virtual machines were powered on and had sporadic file update activities. By default, FlashCopy Manager for VMware performs VMware resignature to a FlashCopied LUN to preserve the uniqueness of the LUN. This action applies only to DS8000 and does not affect other IBM storage such as XIV® Storage System, SAN Volume Controller, or Storwize® V7000. We disabled this action for this test because we were interested in doing FlashCopy backups of the VMware datastore LUNs and not in keeping more than one version of a FlashCopied LUN online. For the 1000 VMs test, we updated FlashCopy Manager for VMware to level 3.1.0.1, which includes an enhancement that allowed us to extend the TIMEOUT_FLASH default value from 120 to 300 seconds. The NOCOPY mode for FlashCopy on DS8000 was selected for the FLASHCOPY_TYPE parameter in the FlashCopy Manager for VMware profile.

FlashCopy backup test results

Elapsed time was measured between the start and end of the test script that ran vmcli commands to FlashCopy back up the VMware datastore LUNs. FlashCopy Manager for VMware provides two parameters, VM_BACKUP_MODE and NUMBER_VM_CONCURRENT_TASKS (VMware virtual machine snapshot concurrency), to control how the VMware snapshot is handled before the storage FlashCopy of the datastore LUNs. The default for NUMBER_VM_CONCURRENT_TASKS is 1 and the default for VM_BACKUP_MODE is SNAPSHOT_EXCL_MEM.
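Measuring elapsed time around a command can be sketched as follows. This is not the test script used for the article, which is not shown; it is a minimal illustration of wall-clock timing, and the command here is a harmless placeholder because the vmcli arguments are not given in this article.

```python
import subprocess
import sys
import time
from typing import List, Tuple

def timed_run(command: List[str]) -> Tuple[int, float]:
    """Run a command and return (exit code, elapsed wall-clock seconds)."""
    start = time.monotonic()
    completed = subprocess.run(command, check=False)
    return completed.returncode, time.monotonic() - start

# Placeholder command (assumption): replace it with the vmcli backup
# invocation used at your site. It only starts and exits a Python process.
placeholder = [sys.executable, "-c", "pass"]
code, elapsed = timed_run(placeholder)
print(f"exit code {code}, elapsed {elapsed:.1f} s")
```

Using a monotonic clock keeps the measurement unaffected by system clock adjustments during a long backup run.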

Using FlashCopy Manager for VMware defaults, FlashCopy backup elapsed time increased linearly with the number of VMs

Figure 2 shows the results from tests that used the default settings. These results indicate an almost linear increase in the FlashCopy backup elapsed time as the number of VM guests increased to 1000 VMs hosted in 100 VMware datastores. The time spent performing storage FlashCopy, represented by the top line, was sub-second and was not affected by the increase. The storage FlashCopy time was rounded up to 1 second in the graph. The results from the default settings tests indicate that FlashCopy of the VMware datastore LUNs is a time-saving approach for backing up a large VMware setup because hundreds of VMs can be safely backed up in a matter of hours rather than days.

Figure 2. FlashCopy backup elapsed time with default settings

Increasing VMware task concurrency can improve FlashCopy backup performance

In addition to the FlashCopy backup tests that used the defaults for VM_BACKUP_MODE and NUMBER_VM_CONCURRENT_TASKS, we also ran our FlashCopy backup tests on 500 VMs and 1000 VMs using various permutations of these two parameters.

• VM_BACKUP_MODE: controls which type of VMware snapshot is performed prior to the FlashCopy (or allows the VMware snapshot to be skipped). The settings are described in Table 1.

• NUMBER_VM_CONCURRENT_TASKS: specifies the number of concurrent snapshots of virtual machines that VMware performs at a time. It applies to all VM_BACKUP_MODE settings except ASIS mode. Tests were done with the value set to 1, 2, 4, 8, 16, 32, 64, 128 and 256.

Table 1. VM_BACKUP_MODE and associated VMware actions

Mode                VMware actions
SNAPSHOT_EXCL_MEM   Performs virtual machine snapshot without capturing virtual machine memory content. Default; recommended for most FlashCopy backup scenarios.
SNAPSHOT_INCL_MEM   Performs virtual machine snapshot and captures virtual machine memory content.
SUSPEND             Suspends the virtual machine and then resumes the virtual machine after storage FlashCopy.
ASIS                No snapshot of virtual machine. The storage FlashCopy is performed without any VMware action. The NUMBER_VM_CONCURRENT_TASKS parameter does not apply to this mode.
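The combinations of the two parameters described above can be enumerated as a small test matrix. The sketch below is an illustration, not the authors' test harness: it pairs each mode that involves a VMware action with each concurrency value, and adds a single ASIS run because the concurrency parameter does not apply to that mode. The function and variable names are ours.

```python
from itertools import product
from typing import List, Optional, Tuple

# Modes in which VMware performs an action before the storage FlashCopy.
MODES_WITH_VMWARE_ACTION = ["SNAPSHOT_EXCL_MEM", "SNAPSHOT_INCL_MEM", "SUSPEND"]
# Concurrency values tested: 1, 2, 4, ..., 256.
CONCURRENCY_VALUES = [2 ** n for n in range(9)]

def build_test_matrix() -> List[Tuple[str, Optional[int]]]:
    """Return (VM_BACKUP_MODE, NUMBER_VM_CONCURRENT_TASKS) pairs to run."""
    runs: List[Tuple[str, Optional[int]]] = list(
        product(MODES_WITH_VMWARE_ACTION, CONCURRENCY_VALUES)
    )
    # ASIS ignores the concurrency parameter, so it is run only once.
    runs.append(("ASIS", None))
    return runs

matrix = build_test_matrix()
print(len(matrix), "runs per VM count")
```

Three modes at nine concurrency values plus one ASIS run gives 28 runs for each VM count.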

In general, FlashCopy backup performance improved with increased NUMBER_VM_CONCURRENT_TASKS settings for all of the different VM_BACKUP_MODE settings (except ASIS). In our test environment, we observed a limit to the FlashCopy backup performance benefits when the NUMBER_VM_CONCURRENT_TASKS value reached the range of 32-64.

Figure 3 shows the effect of NUMBER_VM_CONCURRENT_TASKS on the FlashCopy backup of 500 VMs.

Figure 3. Elapsed time with various backup modes for 500 VMs backup

The performance gains flatten out at about the 32-64 range of VMware task concurrency. There is only one data point in the graph for the ASIS mode test because FlashCopy Manager for VMware does not invoke any VMware actions before the storage FlashCopy.

Figure 4 shows the same trend from the effect of VMware task concurrency on the FlashCopy backup of 1000 VMs.


Figure 4. Elapsed time with various backup modes for 1000 VMs backup

With the exception of ASIS mode, FlashCopy backup took up a lot of resources on the ESXi servers because most of the actions were with VMware. As VMware task concurrency increased, the resources (such as CPU and disk I/O) of the ESXi servers were saturated and the performance improvement flattened out at about a range of 32-64 concurrent VMware tasks in the environment we tested.
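One way to locate such a plateau in your own measurements is to look for the first concurrency level after which the next step yields only a small relative gain. The sketch below is a generic heuristic, not a method described in this article. The sample values are made up for illustration and are not the measurements behind Figures 3 and 4; the 5% threshold is likewise an arbitrary assumption.

```python
from typing import Dict, Optional

def find_plateau(elapsed_by_concurrency: Dict[int, float],
                 min_gain: float = 0.05) -> Optional[int]:
    """Return the first concurrency level after which moving to the next
    tested level improves elapsed time by less than min_gain (a fraction)."""
    levels = sorted(elapsed_by_concurrency)
    for current, following in zip(levels, levels[1:]):
        before = elapsed_by_concurrency[current]
        after = elapsed_by_concurrency[following]
        if (before - after) / before < min_gain:
            return current
    return None

# Hypothetical elapsed times in minutes (illustration only, not test results).
sample = {1: 100.0, 2: 60.0, 4: 40.0, 8: 30.0, 16: 25.0,
          32: 22.0, 64: 21.5, 128: 21.4, 256: 21.4}
print(find_plateau(sample))
```

For this made-up series the function returns 32, because going from 32 to 64 concurrent tasks saves only about 2% of the elapsed time.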

Scalability considerations

Our test results indicate that the default settings, SNAPSHOT_EXCL_MEM for VM_BACKUP_MODE and NUMBER_VM_CONCURRENT_TASKS=1, are capable of handling many VMware datastores that host hundreds or thousands of virtual machines. Assuming that the storage system in your setup is not over-saturated, we recommend you start with the default settings. When you use the SNAPSHOT_EXCL_MEM mode, you get VMFS-consistent backups and do not spend additional time capturing the working memory of virtual machines during the snapshot time. Some special virtual machines might require working memory in the snapshot, but that is a less common scenario. Setting NUMBER_VM_CONCURRENT_TASKS to 1 does not put a heavy burden on the ESX server. When you combine it with the SNAPSHOT_EXCL_MEM option, you can reap the benefit of fast FlashCopy backups without adding strain to your host ESX servers. However, if you need even faster FlashCopy backups and you have resources to spare on your ESX servers, you can increase the VMware task concurrency by adjusting the NUMBER_VM_CONCURRENT_TASKS setting.

Tip: Tuning the VMware task concurrency is a tradeoff between ESX server resources, such as CPU and disk I/O, and FlashCopy backup performance. The following notes describe how each VM_BACKUP_MODE setting may consume ESX server resources during FlashCopy backups as you adjust NUMBER_VM_CONCURRENT_TASKS to tune your VMware task concurrency.


• SUSPEND mode
  • ESX server disk I/O would be saturated with more concurrent writing of virtual machine data to datastores.
  • ESX host CPU utilization increases dramatically, particularly on virtual machine resumes after the storage FlashCopy.
  • As observed in our test environment, improvement to the FlashCopy backup elapsed time slowed substantially with 8 or more concurrent VMware tasks. You must perform preliminary testing with your actual environment to determine your optimal NUMBER_VM_CONCURRENT_TASKS setting.
  • Applications are impacted by the need for the virtual machines to temporarily suspend.

• SNAPSHOT_EXCL_MEM mode
  • As VMware task concurrency increases, CPU utilization of the ESX server is affected by snapshot creation and removal activity, although it is not saturated.
  • With higher VMware task concurrency, there is a tradeoff between faster FlashCopy backup performance and higher ESX server CPU utilization, which may impact the virtual machines running on the ESX servers.
  • In our test environment, FlashCopy backup performance improvements flattened out when NUMBER_VM_CONCURRENT_TASKS reached 32.
  • Due to resource constraints, we did not emulate virtual machines with close to real-life activity. Therefore, you should perform exploratory tests with your key virtual machines running to determine your optimal NUMBER_VM_CONCURRENT_TASKS setting for this mode.

• SNAPSHOT_INCL_MEM mode
  • Both ESX server CPU and disk utilization were increased with increased VMware task concurrency.
  • There was potential saturation of the disk system hosting the VMware datastores. Our testing used a smaller 512MB of RAM for each virtual machine. With a larger amount, more memory is available to write out to disk for each virtual machine.
  • In our test environment, ESX server CPU utilization would reach 100% in a short time when we tested with NUMBER_VM_CONCURRENT_TASKS=32.
  • Our virtual machines did not run production-like activity. Therefore, you should perform exploratory tests with your VMware setup to determine your optimal NUMBER_VM_CONCURRENT_TASKS setting for this mode.
  • We observed the most improvement in FlashCopy backup performance with increasing VMware task concurrency.
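The effect of guest RAM size in SNAPSHOT_INCL_MEM mode can be estimated with a rough upper bound. The sketch assumes that a memory-inclusive snapshot writes up to the guest's configured RAM to the datastore and that NUMBER_VM_CONCURRENT_TASKS snapshots are in progress at once. Both are simplifications, and the 4GB guest size is a hypothetical comparison point, not a tested configuration.

```python
def in_flight_memory_gb(concurrent_tasks: int, vm_ram_mb: int) -> float:
    """Rough upper bound, in GB, on guest memory being written at one time."""
    return concurrent_tasks * vm_ram_mb / 1024

# 512MB guests, as in the test environment, with 32 concurrent tasks.
print(in_flight_memory_gb(32, 512))
# Hypothetical 4GB guests at the same concurrency (assumption for comparison).
print(in_flight_memory_gb(32, 4096))
```

Under these assumptions, 32 concurrent tasks correspond to at most 16GB of memory content with 512MB guests and 128GB with 4GB guests, which illustrates why larger guests make disk saturation more likely at the same concurrency.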

Special notes for VMware datastores on DS8000 storage

If you have DS8000 storage in your VMware setup and intend to use FlashCopy Manager for VMware FlashCopy backup, keep the following points in mind:

• Resignature, which performs a forced-mount of a FlashCopied LUN, is a default action and it adds extra time to the FlashCopy backup time. The added time can become substantial when the LUNs are large and there are many of them. To reduce the overall FlashCopy backup time, you can disable the resignature if you do not need to keep more than one FlashCopy backup copy.

• You might encounter a TIMED-OUT error during FlashCopy backup operations when working with a large number of DS8000 datastores. You can apply patch level 3.1.0.1 for FlashCopy Manager for VMware, which extends the TIMEOUT_FLASH default from 120 to 300 seconds. You can also set the value that you want used for TIMEOUT_FLASH in the profile file after the patch is applied.

Supplemental test environment information

The following details supplement the test environment information provided in FlashCopy backup test environment.

Hardware components

ESXi 5.0 Hosts:

• ESX hosts: tsmcveh01 (Host 1), tsmcveh03 (Host 3)
  • IBM x3850 M2
  • 4 x 4-core (16 logical CPU) Intel Xeon X7350 @ 2.93GHz
  • 128GB RAM
  • 2 x 146GB 15K Internal SAS drives (mirrored)
  • 2 x 4Gbit dual port QLogic HBA cards

• ESX hosts: tsmcveh02 (Host 2), tsmcveh04 (Host 4), tsmcveh05 (Host 5)
  • IBM x3850 M2
  • 4 x 6-core (24 logical CPU) Intel Xeon X7460 @ 2.66GHz
  • 128GB RAM
  • 2 x 146GB 15K Internal SAS drives (mirrored)
  • 2 x 4Gbit dual port QLogic HBA cards

vCenter 5.0 server and FlashCopy Manager for VMware Linux proxy server, each on a separate x3650 M1:

• IBM x3650 M1
• 2 x 4-core (8 logical CPU) Intel Xeon E5345 @ 2.33GHz
• 24GB RAM
• 2 x 300GB 15K Internal SAS drives (mirrored)
• 1 x 4Gbit dual port QLogic HBA card

IBM 2005-B16 16-port fibre channel switches (2)

• Each ESXi Host connected to each switch via a 4Gbit link

DS8000 Model 2107-932 (2-frame)

• 384 x ~146GB fibre channel drives (15k rpm)
• 100 volumes, 202GB each, for VMware datastores
• 100 volumes, 202GB each, for FlashCopy targets
• A total of 24 6+1P RAID5 arrays and 24 7+1P RAID5 arrays (the smaller arrays are due to hot-spare coverage)
• Two controller units, each currently providing one 4Gbit connection to each of the above FC switches

Software components

• All x3850 ESXi servers were installed with VMware ESXi v5.0.0-469512 code.
• The vCenter x3650 server was installed with vCenter Server v5.0.0-455964 on Windows Server 2008 Enterprise (64-bit).
• One x3650 server was installed as the FCM/TSM4VE proxy (tsmcvefcm01) with Red Hat EL 6.1 (64-bit). It ran the FlashCopy Manager for VMware GA-level code, plus the latest patches.

Configuration details for hosts and virtual machines

• There was a 1-to-1 mapping between each 202GB DS8000 volume and each VMFS3 datastore.

• 100 DS8000 volumes were used for datastores and the same number of volumes were reserved for DS8000 FlashCopy targets.

• Datastores were distributed in a round-robin fashion to each of the 5 ESXi hosts for a total of 10 datastores per host.

• A total of 10 VM guests ran (powered-on, though idle) on each of the 100 total datastores, with a total of 100 VM guests running on each ESXi host for the 100-500 VM guests tests and 200 VM guests running on each host for the 1000 VM guests tests.

• The following four guest image types were used in this environment (250 total of each) with an approximately even distribution on each datastore:
  • SUSE Linux ES 11 SP1 (64-bit)
  • Red Hat Linux EL 5.6 (64-bit)
  • Windows Server 2003 (64-bit)
  • Windows Server 2008 (64-bit)

• Each VM guest was provisioned with the following:
  • 1 vCPU
  • 512MB RAM
  • 17GB of virtual hard disk space (thin)
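The datastore layout above can be sanity-checked with a few lines of arithmetic. The sketch uses only the figures listed in this section and treats the thin-provisioned disks as if they were fully grown, which is a worst-case assumption; the names are illustrative.

```python
DATASTORE_GB = 202        # size of each DS8000 volume and its datastore
VMS_PER_DATASTORE = 10    # VM guests per datastore
VIRTUAL_DISK_GB = 17      # thin-provisioned virtual disk per guest
DATASTORES = 100          # datastores in the 1000 VM configuration

def provisioned_gb(vms: int, disk_gb: int) -> int:
    """Space needed if every thin disk grew to its full provisioned size."""
    return vms * disk_gb

per_datastore = provisioned_gb(VMS_PER_DATASTORE, VIRTUAL_DISK_GB)
print(f"{per_datastore} GB provisioned of {DATASTORE_GB} GB per datastore")
print(f"{DATASTORES * VMS_PER_DATASTORE} VM guests in total")
```

Ten 17GB guests provision 170GB against a 202GB datastore, so even fully grown disks would leave 32GB per datastore for snapshot and other files.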


Resources

• Tivoli Storage Manager provides a wide range of storage management capabilities.
• Review Tivoli Storage FlashCopy Manager V3.1 Documentation.
• Find hints and tips for FlashCopy Manager in the Tivoli Storage FlashCopy Manager wiki.
• Find Tivoli Storage Manager hints and tips.
• Participate in the IBM Tivoli Storage Management forum.


About the authors

Poulin M. Kao

Poulin Kao is a Software Engineer in IBM Tivoli who works on performance evaluation for Tivoli Storage Manager and related products.

James E. Damgar

James Damgar is a Software Engineer in IBM Tivoli who works on performance evaluation for Tivoli Storage Manager and related products.

© Copyright IBM Corporation 2012 (www.ibm.com/legal/copytrade.shtml)
Trademarks (www.ibm.com/developerworks/ibm/trademarks/)