
Mixed Workloads on EMC VNX Storage Arrays

Tony Pittman (@pittmantony) TPittman@Varrow.com

Martin Valencia (@ubergiek) MValencia@Varrow.com

Goals For This Session

Discuss:
• How VNX storage pools work
• How common workloads compare
• Which workloads are compatible
• How to monitor performance
• How to mitigate performance problems

Also check out the 2:55 EMC session: VNX Performance Optimization and Tuning (David Gadwah, EMC)

IOPS - Mixed Workloads

[Chart: mixed-workload IOPS (number of users vs. platform) comparing CX4 models with the corresponding VNX models (CX4-120/VNX5100/VNX5300, CX4-240/VNX5500, CX4-480/VNX5700, CX4-960/VNX7500); series shown for CX4, VNX with rotating drives, and VNX with Flash drives.]

VNX Basics

• VNX is EMC's mid-tier unified storage array
• VNX shines at mixed workloads
• FC, iSCSI or FCoE block connectivity
• NFS and CIFS file connectivity
• Multiple SAS buses on the backend
• Built for flash

VNX Architecture

[Diagram: VNX hardware layout showing the storage processors, standby power supplies (SPS), power supplies, link control cards (LCC), LAN connectivity, and Flash, SAS, and Near-Line SAS drives.]

VNX SP Failover

[Diagram: application servers connected through the SAN to both VNX storage processors, illustrating SP failover.]

VNX Unified Storage

[Diagram: clients plus Oracle, Exchange, application, and virtual servers accessing the array. Block access over FC, iSCSI, and FCoE lands on the storage processors (VNX OE Block); file access over 10 Gb Ethernet lands on the VNX X-Blades (VNX OE File), which fail over among multiple blades; object access is provided via Atmos VE.]

VNX Architecture

• Two Storage Processors with DRAM cache, frontend ports (FC, iSCSI, FCoE) and backend ports (6 Gb SAS)
• Each LUN is owned by one SP, and accessible by both
• Both SPs have active connections

VNX Architecture

• FAST Cache
  – Second layer of read/write cache, housed on solid state drives
  – Operates in 64 KB chunks
  – Reactive in nature
  – Great for random I/O
  – Don't use it for sequential I/O

VNX Architecture

• Storage Pools
  – Based on RAID
    • RAID 5, RAID 1/0, RAID 6
  – FAST VP: Fully Automated Storage Tiering
    • Pools with multiple drive types: EFD, SAS, NL-SAS
    • Sub-LUN tiering
    • Operates in 1 GB chunks
    • Adjusts over time, not immediately
      – FAST Cache is more immediate

VNX Architecture

When should I use traditional RAID Groups? As the exception:
• Very specific performance tuning (MetaLUNs)
• Internal array features (write intent logs, clone private LUNs)
• Maybe RecoverPoint journals
• Supportability (I'm looking at you, Meditech)

Remember the limitations:
• Maximum of 16 drives
• Expand via MetaLUNs
• No tiering

VNX Architecture

• IOPS per drive type (for sizing):
  – 3500 IOPS: EFD
  – 180 IOPS: 15k rpm drive
  – 140 IOPS: 10k rpm drive
  – 90 IOPS: 7200 rpm drive

• Effects of RAID:
  – Parity calculations (RAID 5 and RAID 6)
  – Effect on response times
  – Write penalty:
    • RAID 1/0 = 2x
    • RAID 5 = 4x
    • RAID 6 = 6x

VNX Architecture

• Real-world effect of the write penalty:
  – 10x 600 GB 15k SAS drives = 1800 read IOPS
    • With RAID 1/0, capable of 900 write IOPS
      – 1 write operation takes 2 I/O operations
    • With RAID 5, capable of 450 write IOPS
      – 1 write operation takes 4 I/O operations
    • With RAID 6, capable of 300 write IOPS
      – 1 write operation takes 6 I/O operations
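The arithmetic above can be sketched as a small Python helper (a rough sizing aid, not an EMC tool; the per-drive IOPS and penalty figures come from the preceding slides):

```python
# Back-of-the-envelope RAID sizing: each host write costs `penalty`
# back-end I/Os, each host read costs one.
RAID_WRITE_PENALTY = {"1/0": 2, "5": 4, "6": 6}

def host_iops(n_drives, iops_per_drive, raid, write_fraction):
    """Approximate host-visible IOPS for a given read/write mix."""
    raw = n_drives * iops_per_drive  # back-end IOPS the spindles deliver
    penalty = RAID_WRITE_PENALTY[raid]
    return raw / ((1 - write_fraction) + write_fraction * penalty)

# 10x 15k SAS drives at 180 IOPS each = 1800 raw IOPS:
print(host_iops(10, 180, "1/0", 1.0))  # 900.0 write IOPS
print(host_iops(10, 180, "5", 1.0))    # 450.0 write IOPS
print(host_iops(10, 180, "6", 1.0))    # 300.0 write IOPS
```

The same formula covers mixed loads: a 70/30 read/write mix on the RAID 5 pool yields roughly 1800 / (0.7 + 0.3 x 4) = about 947 host IOPS.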

Workloads

Common workloads seen in the field:
• Virtual disks / VMFS (RAID 5)
• DB data files (RAID 5)
• DB transaction files (RAID 1/0)
• Unstructured data, backups (RAID 6)

Benchmarking Real World Performance

Standard Performance Evaluation Corporation (SPEC):
• Non-profit
• Uses generic applications rather than specific applications
• SPEC benchmarks rely on a mix of I/O to simulate a generic application
• This balances the need for real-world performance with consistency over time

Ideal Scenario

• Array with a single application
• No budget constraints
• Separate storage pools for different sub-workloads

Ideal Scenario

• The ideal SQL layout:
  – PCIe flash and XtremSW Cache on the host
  – FAST Cache in the array
  – tempdb: data files on a separate RAID 5 storage pool
  – User DBs:
    • Each has transaction logs on a separate RAID 1/0 storage pool
    • Each has data files on one or more RAID 5 storage pools, with the appropriate drive configuration (EFD + FAST)
  – Backups / dump files: a separate RAID 6 storage pool, maybe a separate array

Reality: Can't Isolate Every Workload

Cost prohibitive, and do we have to?
• Business-critical application ... maybe
• Management and lower-tier applications ... probably not

Basic Storage Pool Layout

• One or two RAID 5 pools (e.g., Gold and Silver)
  – FAST with EFDs, SAS, NL-SAS according to skew or the 5/20/75 rule
• RAID 1/0 pool for transaction logs
  – 15k SAS drives
• RAID 6 pool for backup files and unstructured data
  – 7.2k NL-SAS drives
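The 5/20/75 rule mentioned above can be turned into a quick capacity split (a sketch, assuming the rule means 5% EFD, 20% SAS, and 75% NL-SAS of pool capacity):

```python
def tier_split(total_gb, ratios=(0.05, 0.20, 0.75)):
    """Split pool capacity across EFD / SAS / NL-SAS per the 5/20/75 rule."""
    tiers = ("EFD", "SAS", "NL-SAS")
    return {tier: total_gb * r for tier, r in zip(tiers, ratios)}

print(tier_split(10_000))  # {'EFD': 500.0, 'SAS': 2000.0, 'NL-SAS': 7500.0}
```

In practice the skew of the actual workload (how much of the capacity serves most of the I/O) should override the generic ratios when it is known.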

RAID 5 Pools

• VMFS
• DB data files
• Good for a random read/write mix
• Use FAST Cache

Drive Composition: Skew

Example:
• Gold pool: 5x EFD, 15x SAS, 16x NL-SAS
• Silver pool: 15x SAS, 16x NL-SAS
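As a rough check on those example pools, the per-drive IOPS figures from the sizing slide give each pool's aggregate back-end IOPS (before any RAID write penalty; a sketch that assumes the SAS drives are 15k):

```python
# Per-drive IOPS figures from the sizing slide.
DRIVE_IOPS = {"EFD": 3500, "SAS_15K": 180, "NL_SAS": 90}

def pool_backend_iops(composition):
    """composition: {drive_type: count} -> raw back-end IOPS (pre-RAID)."""
    return sum(DRIVE_IOPS[t] * n for t, n in composition.items())

gold = {"EFD": 5, "SAS_15K": 15, "NL_SAS": 16}
silver = {"SAS_15K": 15, "NL_SAS": 16}
print(pool_backend_iops(gold))    # 21640
print(pool_backend_iops(silver))  # 4140
```

The gap between the two pools is almost entirely the five EFDs, which is exactly why FAST VP tries to keep the hottest 1 GB slices on that tier.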

RAID 1/0 Pool

• Transaction logs for many applications
• Specifically for small sequential writes
• Do not use FAST Cache
  – It'll be wasted
  – It'll hurt performance

Example:
• 8x 15k SAS drives

RAID 6 Pool

• Unstructured data
  – Office files (.doc, .xls, etc.)
  – Images
• Backup files
  – Split into a separate pool if necessary
• Low I/O and high capacity
• Good for long sequential writes
• Do not use FAST Cache
  – It'll be wasted

Example:
• 16x 7.2k rpm NL-SAS drives

Pool Layout

Monitoring and Troubleshooting

There is no “set it and forget it.”

Workloads change over time:
• Users get added
• Transaction load increases
• Requirements change

Often no one tells us.

Problem Identification

• Proactive performance review
  – Admins wear too many hats
  – Low priority
• Reactive to user impact (too late)
  – Crisis management

Troubleshooting Metrics

Where do we start? What do we look at?
• Cache utilization
  – Exceeding a high water mark means the array must flush cache to disk
  – Forced flushes
• SP performance
  – Balance the SP load
• Pool LUN migration (metadata)
• Online LUN migration

The “Toolbox”

• Unisphere Analyzer (on array)
  – Proactively gathers data for review
  – Data logging must be enabled on the array
• VNX Monitoring and Reporting (off array)
  – Historical data collection
  – Streamlined application based on Watch4net
• EMC miTrend
  – Leverages NAR (Navisphere Analyzer data) that can be retrieved from the array
  – Needs EMC or a partner (us) to perform the analysis

Troubleshooting / Problem Mitigation

Several options for mitigating a performance problem:
• Add drives
  – OE 32 is required to rebalance existing data
  – Pre-OE 32, you must grow the pool by the original drive count, and existing data will not be rebalanced
• Migrate to a different pool
  – Live migration avoids the need for an outage
  – Performance throttling minimizes the performance impact
• Rebalance at the application layer
  – Storage vMotion
  – Host-based data migration (Open Migrator, etc.)
• Migrate data between arrays
  – SAN Copy
  – Replication (mirroring / RecoverPoint)
• Reduce workload
  – Reschedule for off-hours (backups, for example)
  – Decommission non-critical workloads

Questions

Thank You!