Data Deduplication in Virtualized Environments Marc Crespi, ExaGrid Systems http://blog.exagrid.com Twitter: @ExaGrid

About the speaker

Marc has over 20 years of software and hardware experience in the high technology sector.

He is part of the ExaGrid team that drives product strategy and execution and is responsible for managing product operations.

Prior to joining the company, Marc was director of product management for security management products at Altiris.

Objective of This Program

What is Deduplication?

Why Use Deduplication in Backup and Recovery?

Challenges of Deduplication in Virtualized Environments

Deduplication approaches (two camps)

Summary ‒ Deduplication’s Role in Data Protection and Disaster Recovery

Why Use Deduplication in Backup and Recovery?

Enhanced Speed/Performance
● Faster backup times due to lower volume of data to be backed up
● Data lands faster because it is targeted at disk

Dramatic Savings in Disk Costs
● 20:1 reduction in amount of disk space required to store backups

Scalability
● Back up higher data volumes while maintaining backup window

Offsite Disaster Recovery
● Efficient use of bandwidth via WAN-efficient replication

Eliminate Redundancies for More Efficient Virtual Server Backups

● Each virtual server image gets backed up in its entirety
● Large amount of storage consumed

Reduced storage footprint with deduplication
● Reduce total amount of storage by as much as 1000:1
● Store only the bytes that change in your VMware virtual servers
● Eliminate redundancy of typical VMware backups
● Restore quickly from most recent VMware backup
● Deduplicate backups to changed bytes
● Dramatic savings in disk and bandwidth
● Integrated replication
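The idea of storing only what changed between virtual server backups can be sketched as follows. This is a minimal illustration using fixed-size blocks and SHA-256 hashes, which is one common way to detect repeated data; it is an assumption chosen for clarity, not a description of how any particular product works. The names `backup`, `restore`, `store`, and the tiny `BLOCK_SIZE` are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size so the example is easy to follow (assumption)


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    """Split a backup image into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def backup(image: bytes, store: dict[str, bytes]) -> tuple[list[str], int]:
    """Write only blocks that `store` does not already hold.

    Returns the recipe (ordered block hashes) needed to rebuild the image
    and the number of new bytes actually written.
    """
    recipe: list[str] = []
    new_bytes = 0
    for block in split_blocks(image):
        digest = hashlib.sha256(block).hexdigest()
        if digest not in store:
            store[digest] = block
            new_bytes += len(block)
        recipe.append(digest)
    return recipe, new_bytes


def restore(recipe: list[str], store: dict[str, bytes]) -> bytes:
    """Rebuild an image from its recipe."""
    return b"".join(store[d] for d in recipe)


store: dict[str, bytes] = {}
monday = b"AAAABBBBCCCCDDDD"
tuesday = b"AAAABBBBXXXXDDDD"  # one block changed since Monday

recipe_mon, written_mon = backup(monday, store)
recipe_tue, written_tue = backup(tuesday, store)

print(written_mon)  # 16: the first backup stores every block
print(written_tue)  # 4: the second backup stores only the changed block
```

Both backups remain fully restorable from their recipes even though the second one added only a quarter of its size to the store.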

Specific Challenges of Backups/Restores in Virtualized Environments

Management of backups
● Growing number of virtual machines/sprawl
● Inability to monitor backups on individual virtual machines

Handling the volume of backup data efficiently
● More data to store as virtual machines proliferate
● Each change means entire virtual server is backed up

These challenges are driving a need for better tools to more reliably and easily back up and restore virtual machines.

Example: 10 guest OS instances x 50GB = 500GB of backed-up virtual images daily

How Dedupe Works: Store Only Changed Bytes

[Figure: the same set of ten virtual machine backups stored two ways. Standard Disk: ten 50GB backups, from Oldest Backup to Most Recent Backup, total 500GB. Data Deduplication: one 2.5GB entry plus nine 100MB entries, with the Most Recent Backup stored optimized for read, total 3.4GB.]
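The totals on this slide can be checked with a few lines of arithmetic. The sketch below assumes the 3.4GB figure is made up of one 2.5GB entry plus nine 100MB entries, and that decimal units (1GB = 1000MB) are intended; the slide does not state either point explicitly.

```python
MB_PER_GB = 1000  # decimal units, an assumption; the slide does not say

backups = 10
full_backup_mb = 50 * MB_PER_GB  # each full backup on standard disk
recent_dedup_mb = 2500           # the single 2.5GB entry in the figure
changed_mb = 100                 # each of the nine 100MB entries

standard_total_mb = backups * full_backup_mb
dedup_total_mb = recent_dedup_mb + (backups - 1) * changed_mb
ratio = standard_total_mb / dedup_total_mb

print(f"standard disk: {standard_total_mb / MB_PER_GB:.1f} GB")  # 500.0 GB
print(f"deduplicated:  {dedup_total_mb / MB_PER_GB:.1f} GB")     # 3.4 GB
print(f"reduction:     {ratio:.0f}:1")                           # 147:1
```

Under these assumptions the example works out to roughly a 147:1 reduction for this particular data set.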

2011 ExaGrid Systems, Inc.

Where to Deploy Deduplication

Source Based Data Reduction
Removes data redundancies before transmission to the backup target

PROS
● Reduces impact on VM
● Shortens backup window/less data
● Reduced bandwidth needed to the backup target
● Reduction in storage usage

CONS
● Can be slower for large (multiple TB) amounts of data
● Increased workload on servers

Target Based Data Reduction
Removes data redundancies after transmission to the backup target

PROS
● Shortens backup window/less data
● Reduced replication bandwidth
● Reduction in storage usage

CONS
● Must transfer the entire dataset to the device
● Don't get reduced bandwidth needed to the backup target
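The bandwidth trade-off between the two approaches can be illustrated with a small simulation. This is a simplified sketch: `BackupTarget`, `source_based_backup`, and `target_based_backup` are hypothetical names, chunks are fixed-size, and the cost of asking the target whether it already holds a chunk is ignored.

```python
import hashlib


def chunks(data: bytes, size: int = 4) -> list[bytes]:
    """Split data into fixed-size chunks (tiny size for readability)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def digest(chunk: bytes) -> str:
    return hashlib.sha256(chunk).hexdigest()


class BackupTarget:
    """Hypothetical backup target that keeps one copy of each unique chunk."""

    def __init__(self) -> None:
        self.store: dict[str, bytes] = {}

    def has(self, h: str) -> bool:
        return h in self.store

    def put(self, chunk: bytes) -> None:
        self.store.setdefault(digest(chunk), chunk)

    def stored_bytes(self) -> int:
        return sum(len(c) for c in self.store.values())


def target_based_backup(data: bytes, target: BackupTarget) -> int:
    """Send everything; the target removes duplicates after transmission."""
    sent = 0
    for c in chunks(data):
        sent += len(c)
        target.put(c)
    return sent


def source_based_backup(data: bytes, target: BackupTarget) -> int:
    """Hash on the source and send only chunks the target lacks."""
    sent = 0
    for c in chunks(data):
        if not target.has(digest(c)):  # extra work done on the source server
            sent += len(c)
            target.put(c)
    return sent


data = b"AAAABBBBAAAABBBBCCCC"
t1, t2 = BackupTarget(), BackupTarget()
sent_target = target_based_backup(data, t1)
sent_source = source_based_backup(data, t2)

print(sent_target, t1.stored_bytes())  # 20 12
print(sent_source, t2.stored_bytes())  # 12 12
```

Both targets end up storing the same 12 bytes, but only the source-based run reduces what crosses the network, at the price of hashing work on the source.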

Using Both Deduplication Techniques Provides Complementary Benefits

Source Based PLUS Target Based Data Deduplication
Removes data redundancies before and after transmission to the backup target

Achieves an additional 80% data reduction (98% total)
● Further reduction in bandwidth
● Further reduction in storage usage
● Further reduction in backup window

Integrated replication of virtual servers
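The "additional 80%" and "98% total" figures are consistent if the second stage acts on whatever the first stage leaves behind. The sketch below makes that assumption explicit; the 90% first-stage figure is not stated on the slide and is simply the value implied by the other two numbers.

```python
def combined_reduction(first_stage: float, second_stage: float) -> float:
    """Total reduction when the second stage works on the first stage's output."""
    remaining = (1 - first_stage) * (1 - second_stage)
    return 1 - remaining


source_stage = 0.90  # assumption: implied by 80% additional giving 98% total
target_stage = 0.80  # "additional 80% data reduction" from the slide

total = combined_reduction(source_stage, target_stage)
print(f"{total:.0%}")  # 98%
```

In other words, 10% of the data survives the first stage and 20% of that survives the second, leaving 2% of the original volume.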

Architectural Considerations

[Figure: backup window as data grows from 10 TB to 60 TB under two architectures. Legacy Architecture - Single Controller (One Deduplication Engine): only disks are added at each step and throughput stays at X TB/hr. Scalable GRID Architecture (Multiple Deduplication Engines): a deduplication engine is added at each step and throughput grows from X TB/hr to 6X TB/hr.]
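The effect on the backup window can be approximated as data volume divided by aggregate ingest rate. The sketch below assumes one deduplication engine per 10 TB in the GRID case, as the figure suggests, and uses a hypothetical per-engine rate of 2.5 TB/hr because the slide only gives "X TB/hr".

```python
def backup_window_hours(data_tb: float, engines: int, rate_per_engine: float) -> float:
    """Backup window = data volume / aggregate ingest rate."""
    return data_tb / (engines * rate_per_engine)


X = 2.5  # TB/hr per engine, hypothetical; the slide only says "X TB/hr"
sizes_tb = [10, 20, 30, 40, 50, 60]

# Legacy: a single engine no matter how many disks are added.
legacy = [backup_window_hours(tb, 1, X) for tb in sizes_tb]

# GRID: one more engine with each 10 TB step (assumption read from the figure).
grid = [backup_window_hours(tb, i + 1, X) for i, tb in enumerate(sizes_tb)]

for tb, leg, grd in zip(sizes_tb, legacy, grid):
    print(f"{tb:>2} TB  legacy {leg:4.1f} h   grid {grd:4.1f} h")
```

With these numbers the single-controller window grows from 4 to 24 hours while the GRID window stays at 4 hours, which is the "stable backup window" behavior the slides describe.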

Architectural Considerations

Scalable GRID Architecture
Multiple Deduplication Engines

Legacy Architecture – Single Controller
One Deduplication Engine

Legacy Architecture – Appliance Sprawl
Individual appliances

Scalable GRID Features
● Linear performance as data grows, stable backup window
● Capacity is virtualized across nodes
● Deduplication is shared across nodes
● Simplified management through single UI
● System can be right-sized to current data size
● Avoids forklift upgrades

Benefits
● One-time division of data during installation (15 to 30 minutes)
● GRID software manages placement of data
● Revisit only during expansion (additional 15 to 30 minutes)
● Eliminates the challenges of monolithic, primary-storage-like architectures

GRID Architecture for Deduplication Performance

[Figure: backup servers send backup jobs from groups of VMs at wire speed to two GRID nodes. Node 1 and Node 2 each provide system capacity on RAID6 with a Landing Zone; the diagram also labels a Repository and Deduplication Process Load Balancing.]

What We Covered

What is Deduplication?

Why Use Deduplication in Backup and Recovery?

Challenges of Deduplication in Virtualized Environments

Overview Diagram of Major Components

Deduplication approaches (two camps)

Summary ‒ Deduplication’s Role in Data Protection and Disaster Recovery

Enjoy and share this material

Feel free to promote this material

Encourage your peers to pass the certification

Blog, Tweet and share this material and your experience on Facebook

You’re an Expert? We will be happy to have you as Backup Academy contributor. Apply here.

Web: http://www.backupacademy.com
E-mail: [email protected]
Twitter: BckpAcademy
Facebook: backup.academy