Achieving High Availability with SQL Server using EMC SRDF
Prem Mehra – SQL Server Development, Microsoft
Art Ullman – CSC
Topics Covered
Share experiences gained deploying SQL Server and SAN for a Highly Available Data Warehouse
Emphasis on the intersection of SAN and SQL Server technologies
Not on Large Database implementation or on Data Warehouse best practices
Project Overview
Best Practices in a SAN environment
Remote Site Fail-over using EMC SRDF and SQL Server Log Shipping
USDA GDW Project Overview
Project: Build a Geo-spatial Data Warehouse for two sites with remote fail-over
Client: USDA
Storage: EMC SAN (46 terabytes)
Database: SQL Server 2000
Implementation: USDA / CSC
Consultants: EMC / CSC / Microsoft / ESRI
Geo-spatial Software: ESRI Data Management Software
Application Requirements
A large (46 TB total storage) geo-spatial data warehouse for 2 USDA sites: Salt Lake City & Fort Worth
Provide database fail-over and fail-back between remote sites
Run data replication across the DS3 network between sites (45 Mb/sec)
Support read-only access at fail-over sites on an ongoing basis
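As a back-of-envelope check (our arithmetic, not from the deck), the DS3 link comfortably covers the 80 GB daily update volume:

```python
# Rough capacity check for the 45 Mb/s DS3 link against 80 GB/day of
# updates. The 80% protocol-efficiency figure is an assumption.
DS3_MBIT_PER_SEC = 45
DAILY_UPDATE_GB = 80

def ds3_capacity_gb_per_day(efficiency=0.8):
    """Usable GB/day over DS3 at the assumed protocol efficiency."""
    mb_per_sec = DS3_MBIT_PER_SEC / 8 * efficiency   # megabytes/second
    return mb_per_sec * 86_400 / 1024                # -> GB/day

capacity = ds3_capacity_gb_per_day()
print(f"~{capacity:.0f} GB/day usable; 80 GB of updates is "
      f"{DAILY_UPDATE_GB / capacity:.0%} of that")
```

Roughly 380 GB/day of usable bandwidth, so the daily updates use only about a fifth of the link, leaving headroom for resynchronization traffic.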
SAN Implementation 1
Understand your throughput, response time and availability requirements, and potential bottlenecks and issues
Work with your storage vendor
Get best practices
Get design advice on LUN size, sector alignment, etc.
Understand the available backend monitoring tools
Do not try to over-optimize; keep the LUN, filegroup and file design simple, if possible
SAN Implementation 2
Balance I/O across all HBAs when possible using balancing software (e.g., EMC's PowerPath)
Provides redundant data paths
Offers the most flexibility and is much easier to design than static mapping
Some vendors are now offering implementations which use Microsoft's MPIO (multi-path I/O), permitting more flexibility in heterogeneous storage environments
Managing growth
Some configurations offer dynamic growth of existing LUNs for added flexibility (e.g., Veritas Volume Manager or SAN vendor utilities)
Working with SAN vendor engineers is highly recommended
SAN Implementation 3 – Benchmarking the I/O System
Before implementing SQL Server, benchmark the SAN; shake out hardware/driver problems
Test a variety of I/O types and sizes
Combinations – read/write & sequential/random
Include I/O of at least 8K, 64K, 128K, and 256K
Ensure test files are significantly larger than the SAN cache – at least 2 to 4 times
Test each I/O path individually & in combination to cover all paths
Ideally – linear scale-up of throughput (MB/s) as paths are added
Save the benchmark data for comparison when SQL Server is being deployed
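The deck's tool of choice was SQLIO.exe; purely as an illustration of the idea (vary block size, measure MB/s), the same shape can be sketched in a few lines of Python. File name, size and duration here are ours, and a real test file must be 2–4× the SAN cache, not the tiny demo file below:

```python
import os
import random
import time

def random_read_mb_per_s(path, block_kb, seconds=0.5):
    """Issue random reads of block_kb KB for `seconds`; return MB/s."""
    size = os.path.getsize(path)
    block = block_kb * 1024
    fd = os.open(path, os.O_RDONLY)
    done = 0
    try:
        deadline = time.monotonic() + seconds
        while time.monotonic() < deadline:
            offset = random.randrange(0, size - block)
            done += len(os.pread(fd, block, offset))
    finally:
        os.close(fd)
    return done / (1024 * 1024) / seconds

# Tiny demo file -- a real benchmark file must far exceed the SAN cache.
with open("bench.dat", "wb") as f:
    f.write(os.urandom(4 * 1024 * 1024))

for kb in (8, 64, 128, 256):      # the I/O sizes the deck recommends
    print(f"{kb:>3}K random read: {random_read_mb_per_s('bench.dat', kb):8.1f} MB/s")
```

Running the same loop per path, then across combined paths, gives the scale-up curve the slide asks for.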
SAN Implementation 4 – Benchmarking the I/O System
Share results with vendor: is performance reasonable for the configuration?
SQLIO.exe is an internal Microsoft tool
On-going discussions to post it as an unsupported tool at http://www.microsoft.com/sql/techinfo/administration/2000/scalability.asp
SAN Implementation 5
Among other factors, parallelism is also influenced by:
Number of CPUs on the host
Number of LUNs
For optimizing Create Database and Backup/Restore performance, consider:
As many or more volumes as the number of CPUs; these could be volumes created by dividing a dynamic disk, or separate LUNs
Database and TempDB files: internal file structures require synchronization, so consider the number of processors on the server
Number of data files should be >= the number of processors
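To make the last guideline concrete, here is a small helper (our sketch; database name, path and sizes are illustrative) that emits a CREATE DATABASE statement with one data file per processor:

```python
import os

def create_db_sql(name, data_path, n_files=None, size_mb=1024):
    """Build a CREATE DATABASE statement with n_files data files
    (default: one per CPU, per the files >= processors guideline)."""
    n_files = n_files or os.cpu_count() or 1
    ext = ["mdf"] + ["ndf"] * (n_files - 1)       # first file is primary
    parts = [
        f"(NAME = {name}_{i}, "
        f"FILENAME = '{data_path}\\{name}_{i}.{ext[i]}', "
        f"SIZE = {size_mb}MB)"
        for i in range(n_files)
    ]
    return f"CREATE DATABASE {name} ON\n  " + ",\n  ".join(parts)

print(create_db_sql("GDW", r"E:\Data", n_files=4))
```

Spreading the files across separate volumes (or LUNs) is what lets Create Database and Backup/Restore run their work in parallel.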
Remote Site Fail-over with SQL Server and EMC SRDF
USDA Geo-spatial Database
Data Requirements
23 terabytes of EMC SAN storage per site (46 TB total storage)
2 primary SQL Servers and 2 fail-over servers per site
15 TB of image data in SQL Server at the Salt Lake City site with fail-over to Fort Worth
3 TB of vector data in SQL Server at the Fort Worth site with fail-over to Salt Lake City
80 GB of daily updates that need to be processed and pushed to the fail-over site
Solution
Combination of SRDF and SQL Server Log Shipping
Initial synchronization using SRDF
Push updates using SQL Server Log Shipping
Use SRDF incremental update to fail-back after a fail-over
Use SRDF to move log backups to the remote site
Hardware Infrastructure
Site Configuration (identical at each site)
Technical Overview – EMC Devices
The EMC SAN is partitioned into Hyper-Volumes and Meta-Volumes (collections of Hyper-Volumes) through BIN File configuration
All drives are either mirrored or RAID 7+1
Hypers and/or Metas are masked to hosts and are viewable as LUNs to the OS
EMC devices are identified by Sym Id
EMC devices are defined as R1, R2, Local or BCV devices in the BIN File
Technical Overview – Device Mapping
Windows Device Manager and SYMPD LIST Output
Technical Overview – SRDF 1
SRDF provides track-to-track data mirroring between remote EMC SAN devices.
BCVs are for local copies.
• Track-to-track replication (independent of host)
• R1 Device is the source
• R2 Device is the target
• R2 is read/write disabled until the mirror is split
Technical Overview – SRDF 2
Synchronous Mode
Semi-Synchronous – synchronous with some lag
Adaptive Copy Mode – asynchronous
Adaptive Copy A – asynchronous with guaranteed write sequence using buffered track copies
Note: only Adaptive Copy A requires additional storage space. All other SRDF replication modes simply keep a table of tracks that have changed.
Technical Overview – SRDF 3
• SRDF replicates by Sym Device (Hyper or Meta).
• SRDF Devices can be “Grouped” for synchronizing.
• SQL Server databases are replicated “by database” or “by groupings of databases” if TSIMSNAP2 is used.
[Diagram: Database 1 on R1 Group A and Database 2 on R1 Group B; each R1 device on the primary host is mirrored to an R2 device on the fail-over host]
Process Overview
Initial synchronization using SRDF in Adaptive Copy Mode (all database files)
Use TSIMSNAP(2) to split the SRDF group after synchronization is complete
Restore fail-over databases using TSIMSNAP(2) after splitting the SRDF mirror
Use SQL Server Log Shipping to push all updates to the fail-over server (after initial sync)
The fail-over database is up and running at all times, giving you confidence that the fail-over server is working
Planning
Install SQL Server and system databases on the Primary and Fail-over servers (on local, non-replicated devices)
Create user databases on R1 devices (MDF, NDF and LDF) on the Primary host
Don't share devices among databases if you need to keep databases independent for fail-over and fail-back (important)
Database volumes can be drive letters or mount points
Initial Step
Create databases on R1 devices
Load data
Synchronize to Fail-over Host 1
Create an SRDF group for the database on R1
Set the group to Adaptive Copy Mode
Establish the SRDF mirror to R2
Synchronize to Fail-over Host 2
Wait until Adaptive Copy is "synchronized"
Use the TSIMSNAP command to split the SRDF group after device synchronization is complete
Use TSIMSNAP2 for multiple databases
TSIMSNAP writes metadata about the databases to R1, which is used for recovering the databases on the R2 host
[Diagram: write metadata, then break the mirror]
Attach Database on Fail-over Host
Verify SQL Server is installed and running on the Fail-over host
Mount R2 volumes on the remote host
Run the TSIMSNAP RESTORE command on the Fail-over host, specifying either standby (read-only) or norecovery mode
The database is now available for log shipping on the fail-over host
The SRDF mirror is now broken, but track changes are still tracked (for incremental mirror and/or fail-back)
Log Shipping – at Primary Site
Log shipping volume on a separate R1 device (not the same as the database R1)
Log backup maintenance plan to back up logs to the log shipping volume, which is an R1 device
Set R1 to Adaptive Copy Mode
Establish the R1/R2 mirror; logs automatically get copied to R2
Log Shipping – at Fail-over Site
BCV (mirror) of R2
Schedule a script that splits and mounts the BCV, then restores logs to the SQL Server database(s)
Flush, un-mount and re-establish the BCV mirror after logs have been restored
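The fail-over-site script has to apply only the log backups that arrived since the last run, oldest first, leaving the database in STANDBY (read-only). A sketch of that bookkeeping (backup file names, the undo-file path, and the directory layout are illustrative assumptions, not the project's actual script):

```python
from pathlib import Path
import tempfile

def pending_logs(log_dir, last_applied):
    """Log backups newer than the last one applied, oldest first.
    Assumes sortable, timestamp-style backup file names."""
    return [p for p in sorted(Path(log_dir).glob("*.trn"))
            if p.name > last_applied]

def restore_statements(db, logs, undo_file):
    """RESTORE LOG ... WITH STANDBY keeps the database readable
    between restores, matching the read-only fail-over requirement."""
    return [f"RESTORE LOG [{db}] FROM DISK = N'{p}' "
            f"WITH STANDBY = N'{undo_file}'" for p in logs]

# Demo with fabricated backup names standing in for the mounted BCV:
bcv = tempfile.mkdtemp()
for name in ("GDW_0100.trn", "GDW_0200.trn", "GDW_0300.trn"):
    (Path(bcv) / name).touch()

todo = pending_logs(bcv, last_applied="GDW_0100.trn")
for stmt in restore_statements("GDW_Images", todo, r"E:\Undo\GDW.dat"):
    print(stmt)
```

The real script brackets this loop with the BCV split/mount before and the flush, un-mount and re-establish after.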
Process Overview Summary
Initial synchronization using SRDF in Adaptive Copy Mode
Use TSIMSNAP(2) to split the SRDF group after synchronization is complete
Use SQL Server Log Shipping to push updates to the fail-over server
The fail-over database is up and running at all times, giving you confidence that the fail-over server is working
Fail-over Process
Read-only: No server action. Clients would need to point to the fail-over server.
Full Update: SQL command: RESTORE DATABASE DBName WITH RECOVERY
Fail-back Process
From Read-only Fail-over: None required. Point clients to the Primary.
From Full Update Fail-over:
1. Run the SYMRDF UPDATE command to copy from R2 to R1 in Adaptive Copy Mode
2. Detach the database on R2 after the update is complete
3. Flush and un-mount volumes on R2
4. Run SYMRDF FAILBACK to replicate final changes back to R1 and write-enable R1
5. Mount R1 volumes
6. Attach the database on the Primary host
Closing Observations
So far SQL Server 2000 has met the High Availability objectives
Network traffic across the WAN was minimized by shipping only SQL Server log copies once the initial synchronization was completed
The dual Nishan fiber-to-IP switches allowed for data transfer at about 16 GB/hour, taking full advantage of the DS3. This transfer rate easily met USDA's needs for initial synchronization, daily log shipping, and the fail-back process.
The working read-only version of the fail-over database meant that the administrators always knew the status of their fail-over system
The USDA implementation did not require a large number of BCV volumes, as some other replication schemes require
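The 16 GB/hour figure can be sanity-checked against the DS3 line rate (our arithmetic, not the deck's):

```python
# 45 Mb/s expressed as GB/hour, versus the observed 16 GB/hour.
line_rate_gb_per_hr = 45 / 8 * 3600 / 1024     # ~19.8 GB/hour ceiling
utilization = 16 / line_rate_gb_per_hr
print(f"DS3 ceiling ~{line_rate_gb_per_hr:.1f} GB/h; "
      f"16 GB/h is {utilization:.0%} of line rate")
```

At roughly four-fifths of the theoretical line rate, the switches were indeed driving the DS3 close to saturation.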
Closing Observations
After the R1/R2 mirror has been split, SRDF continues to track updates to R1 (from normal processing) and to R2 (from the log restore process). SRDF is then able to ship only the modified tracks during fail-back or re-synchronization. This process is called an Incremental Establish, or an Incremental Fail-back, and is much more efficient than a Full Establish or Full Fail-back.
After fail-back, the R1 and R2 devices will be "in sync" and ready for log shipping startup with a minimal amount of effort.
Since SRDF (initial synchronization, fail-back, and log shipping) runs entirely in adaptive copy mode, performance on the primary server is not impacted.
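A toy model (ours, not EMC's implementation) of why an Incremental Establish is cheap: after a split, each side keeps a table of changed tracks, and resynchronization ships only the union of the two change sets instead of the whole device:

```python
TRACKS = 1_000_000                  # illustrative device size in tracks

def tracks_to_ship(r1_changed, r2_changed):
    """Incremental Establish ships only tracks changed on either side."""
    return r1_changed | r2_changed

r1_changed = set(range(0, 5_000))            # primary-side updates
r2_changed = set(range(100_000, 102_000))    # fail-over log restores
incremental = tracks_to_ship(r1_changed, r2_changed)
print(f"incremental: {len(incremental):,} of {TRACKS:,} tracks "
      f"({len(incremental) / TRACKS:.1%}); full establish: all tracks")
```

With change rates like these, resynchronization touches well under 1% of the device, which is what makes daily fail-back practical over a DS3.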
Software
SQL Server 2000 Enterprise Edition
Windows 2000 / Windows 2003 Server
EMC SYM Command Line Interface
EMC Resource Pack
Call To Action
For more information, please email [email protected]
You can download all presentations at www.microsoft.com/usa/southcentral/
Understand your HA requirements
Work with your SAN vendor to architect and design for the SQL Server deployment
Plan your device & database allocation before requesting a BIN File
Decide whether to share devices among databases (use TSIMSNAP or TSIMSNAP2); the decision affects convenience, space & flexibility of operations
Stress test the subsystem prior to deploying SQL Server
SQL Server Summit
Brought To You By:
© 2004 Microsoft Corporation. All rights reserved.
This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.