Join our guest, Vale Inco, a worldwide leading producer of nickel, and Scalar for an informative session providing you insight on how to:
• Automate data management tasks to free up IT resources and eliminate downtime
• Get better utilization out of your storage resources
• Utilize storage policies to better manage and optimize use of storage devices
• Easily add and manage storage policies for all devices from a single management console
• Reduce overall storage costs by 50 to 80%
• Cut migration times by up to 90% with zero impact to users during migration
• Reduce backup times and costs by up to 90%
DATA CENTRE GRADE VIRTUALIZATION
Managing Growth of Unstructured Data
Unstructured Data
Michael Traves, Chief Architect, Data [email protected]
DATA MANAGEMENT and VIRTUALIZATION EXPERTS
© Copyright 2010. scalar decisions inc. Not for redistribution outside of the intended audience.
Session Agenda
• Overview of Scalar Decisions
• Unstructured Data
– Challenges
– Approaches
– Solutions
• Case Study – Vale Inco
– Tom Morrier
• Next Steps
– Unstructured Data Assessment
• Activity: Demonstration @ ScalarLabs
• TGIF Session
• Questions & Answers
• Draw
Scalar Decisions – Who we are:
Toronto · Vancouver · Calgary · Ottawa · London · Kitchener · Guelph
Product and Solution delivery experts focussing on the most current technologies and complex business challenges
Technically led organization specializing in the design, deployment and management of complete IT Infrastructures
Key industry partnerships with leading technology solution vendors such as EMC and VMware
What we do:
Scalar Professional Services
• Real World Experience
– With our customers and at our own data centres using proven architectures and solutions
• End-to-end Consulting
– From up-front assessments to long-term architecture considerations
• Holistic Vision
– Scalar designs, deploys and manages the entire IT stack including eco considerations
Architecture and Solution Design
System Implementation
Capacity Planning
Health Checks
Storage and System Consolidation
Converged Network Infrastructure
Scalar Leadership in Managed Services
• Highly flexible, scalable and affordable managed services for customer IT environments
• Multiple data centre hosting facilities, plus full remote management offerings at customer sites
• Virtualized offerings include:
– Cloud computing for primary or dev/test environment
– Remote VMs / hosted DR at multiple sites
– Remote monitoring of ESX and hardware platform
the systems
the network
the storage
Unify your test environment @ 20+ vendor products available to platform test
The Data Management Challenge
The Traditional Infrastructure Problem
The Challenges with Unstructured Data
• Storage growth rates that average 40-120% CAGR
• Storage environments becoming increasingly complex and difficult to manage
• Inconsistent utilization of storage resources
• Skyrocketing storage and backup costs
• Lengthy data migrations and consolidations
• Backup times that exceed backup windows
• Costly downtime caused by disruptive data and capacity management
The Challenge: Data Growth
Growth increases complexity and administrative burden
Most companies are still managing growth reactively. Where do you put new data when your filesystems fill up?
If you aren’t able to dynamically increase the size of a file system (pooling, thin provisioning, etc), how do you move data between filesystems/servers without impacting users?
When you need to increase capacity, how long does it usually take to acquire, deploy and provision it? Do you play the data “shell” game until it’s ready?
What if the new storage isn’t the same type/brand/release as the current? How does this affect integration and manageability?
The Challenge: File Count Growth
More files means more metadata. What’s the impact?
In high file count environments, you have a metadata problem, not a data problem.
Lots of small files complicate management strategies
Archiving, while one strategy to address data growth, actually increases file counts (stubs), creating more of a problem
Backup and recovery of high file count filesystems are complex – “walking a filesystem” is usually an order of magnitude more time consuming than actually moving the data.
Using more, smaller filesystems to constrain file counts increases complexity and doesn’t really address the source of the problem
The Challenge: Backup Windows
Large data volumes are resource intensive
File system backups are sequential (one job per filesystem) and take time. Multiple filesystems create management headaches.
Full backups of large amounts of data take time and chew up resources (either D2D, Tape, or Dedupe).
Most data doesn’t change week to week (80%+ is aged, static)
Large file counts create disk I/O constraints
– A 72hr backup job can typically be 95% metadata processing and 5% data movement.
Solving the data problem with archiving can create the high file count problem
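To make the 95%/5% split concrete, here is a minimal Python sketch of the arithmetic for the 72-hour job mentioned on this slide. The function name `split_backup_time` is hypothetical, and the percentages are the slide's typical figures rather than measurements.

```python
def split_backup_time(total_hours, metadata_fraction):
    """Split a backup job's duration into metadata-processing and data-movement hours."""
    if not 0.0 <= metadata_fraction <= 1.0:
        raise ValueError("metadata_fraction must be between 0 and 1")
    metadata_hours = total_hours * metadata_fraction
    data_hours = total_hours - metadata_hours
    return metadata_hours, data_hours


# Figures from the slide: a 72hr job that is 95% metadata processing.
metadata_hours, data_hours = split_backup_time(72, 0.95)
print(f"metadata: {metadata_hours:.1f} h, data movement: {data_hours:.1f} h")
```

Under these assumptions only a few hours of the job actually move data, which is why reducing data volume alone (for example by archiving with stubs) does little for the backup window.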
The Challenge: Disruptive Migrations
Transitioning between new/old or different vendors
Storage is typically on a three year life cycle – which generally means four, if you account for migration in and migration out
How do you migrate large volumes of data between old and new storage platforms without impacting users?
How do you migrate between different types of technologies? For example, NetApp to EMC, EMC to BlueArc, Windows/UNIX to NAS?
When migrating between different NAS vendors, how do you leverage their proprietary, vendor-specific tools?
The Challenge: Disparate Storage Platforms
Multi-vendor, multi-protocol environments
Managing multiple solutions is typical with unstructured data – UNIX (NFS) and Windows (CIFS) typically coexist. NAS appliances or gateways come into play when UNIX/Windows can’t scale
Having multiple protocols across multiple fileshares, on multiple servers/NAS solutions creates management complexity. Ensuring that each platform can grow/scale to meet demand is difficult to predict, and requires different strategies for managing growth
Different generations/brands of technology support different features and protocols. How do you integrate NFS3 and NFS4 across two different storage solutions? And what happens when you have to move a share from one device to the other due to space constraints?
The Challenge: Scalability
Horizontal Scale-out, Vertical Scale-up, and Mobility
Scale-up strategies leverage the same server/NAS platform by adding capacity. This minimizes management overhead, assuming that filesystems can dynamically be scaled online.
This assumes that the existing system can sustain performance growth too
Scale-out strategies couple storage capacity with performance, ideally using the same building block for consistency. This is predictable, but creates allocation issues
Can a single fileshare span multiple devices? How are data and performance distributed?
How is data balanced across devices? Is this automated? Can data migrate between devices without impacting users?
The Challenge: Inefficient Resource Utilization
Having multiple server/NAS devices presenting unstructured data creates administrative challenges
How do you manage capacity, when data on different devices grows at different rates?
How do you manage performance, when access patterns are unpredictable?
Is it possible to redistribute content between filesystems and devices to “optimize” utilization? How does this impact users?
When you do move a directory or share from one device to another (out of space issues anyone?), how does that impact backups? Generally, it’s included in your incremental backups.
Approaches to Solving these Challenges
Approaches to Managing Unstructured Data
• Quota Management
• Archiving
• Bigger is Better
• Tiering
• Deduplication
• Replicate the Problem
Approach: Quota Management
Establish quotas to prevent users from storing “too much” data on home, project, etc. folders
Pros
• Limits the amount of data people can store in public folders
Cons
• People always find places to store their data (desktops/laptops, external drives, etc.) – usually outside the control and protection of IT
• Drives helpdesk complaints, and constant “exceptions”
• Does not address project/departmental folders
• Does not move static data out of day-to-day management processes (i.e., backup/recovery)
Approach: Archiving
Moves inactive, static content from primary storage to lower cost storage, reducing backup data volumes
Pros
• Reduces primary storage usage, and associated costs
• Reduces backup volumes, reducing backup tape/disk usage
Cons
• Requires stubs (for no user impact), which does not reduce file counts
• Increasing file counts while decreasing data does not solve the backup problem – millions of files/stubs still take hours/days to process
• The longer this strategy is employed, the more metadata/stubs you maintain, and the worse the problem becomes
Approach: Bigger is Better. More is Better
The philosophy of “buy more” to address growing storage requirements may address growth, but how does it address manageability?
Cons
• More devices mean more to manage. How do you organize it?
• When 80%+ of your data is static, how do you separate it from current/new data without impacting users?
• More primary storage creates more costs, and more backup/recovery pain
• Just because a new, larger NAS head is “faster” doesn’t mean you’ll be able to back up or restore it “faster”.
Approach: Tiering
By creating different tiers of storage (e.g., FC and SATA) in your environment, perhaps on different devices, you can put data with lower access/priority on lower cost/performing storage
Pros
• Helps manage cost by prioritizing data placement
Cons
• How do you decide what should go where?
• What if priority or access patterns change?
• At what level of granularity is this possible? Filesystem (LUN)? Directory? File? Block?
Approach: Deduplication
Deduplication, combined with compression, can reduce your storage footprint across all your unstructured data
Pros
• Deduplication can dramatically reduce the storage footprint for many types of data, promising lower storage costs long-term
Cons
• Not all data deduplication is created equal. Is it block level, file level, or variable block level? What is the performance impact?
• More efficient storage of static data is good, but if it’s still in the backup/recovery cycle, have you really addressed the problem?
• Most solutions today still rehydrate the data during backup. So are you really saving anything for backups? What performance impact does this imply during backup/recovery operations?
Approach: Replicate the Problem
When backup/recovery activities kill performance on your primary storage device(s), replicate the data (and delta changes) instead.
Pros
• Allows you to back up the replication target, instead of the source
• Gets you a DR solution while moving the backup issue offsite
Cons
• Active and static data are still mixed, with the same policies and retentions being applied to each
• Your storage costs have now doubled, and backup is still a (now remote site) problem. Snapshot history helps, but not forever.
Solving the Unstructured Data Challenge
Solving the Challenges
• Automate data management tasks to free up IT resources and eliminate downtime
• Get better utilization out of your storage resources
• Utilize storage policies to better manage and optimize use of storage devices
• Easily add and manage storage policies for all devices from a single management console
• Reduce overall storage costs by 50 to 80%
• Cut migration times by up to 90% with zero impact to users during migration
• Reduce backup times and costs by up to 90%
Solution: File Virtualization
• Capacity Balancing
– Balance data and I/O across multiple storage devices, making the most efficient use of your storage resources
• Data Migration
– Automatically migrate data between heterogeneous devices, without impacting user access – no downtime
• Storage Tiering
– Intelligently put data on the right type of storage based on metadata policies and aging criteria
The Global Namespace (Wikipedia)
A Global Namespace is a heterogeneous, enterprise-wide abstraction of all file information, open to dynamic customization based on user-defined parameters. This becomes of particular importance as multiple network based file systems proliferate within an organization; the challenge becomes one of effective file management.
A Global NameSpace (GNS) has the unique ability to aggregate disparate and remote network based file systems, providing a consolidated view that can greatly reduce complexities of localized file management and administration. For example, prior to file system namespace consolidation, two servers exist and each represent their own independent namespaces; e.g. \\server1\share1 & \\server2\share2. Various files exist within each share respectively, however users have to access each namespace independently. This becomes an obvious challenge as the number of namespaces grows within an organization.
With a GNS, an organization can access a virtualized file system namespace; e.g. files now exist under a unified structure, such as \\company.com\share1, share2, where the files exist in multiple physical server\share locations but appear to be part of a single namespace.
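The namespace idea above can be sketched in a few lines of Python. This is only an illustration of the lookup concept, not how any particular product implements it: the `NAMESPACE` table and the `resolve` function are hypothetical, and the paths are the ones used in the example text.

```python
# Hypothetical table mapping virtual (namespace) shares to physical shares.
NAMESPACE = {
    r"\\company.com\share1": r"\\server1\share1",
    r"\\company.com\share2": r"\\server2\share2",
}


def resolve(virtual_path):
    """Translate a path in the global namespace to its current physical location."""
    # Try the longest virtual roots first so nested entries would win.
    for virtual_root in sorted(NAMESPACE, key=len, reverse=True):
        if virtual_path == virtual_root or virtual_path.startswith(virtual_root + "\\"):
            return NAMESPACE[virtual_root] + virtual_path[len(virtual_root):]
    raise KeyError(f"no namespace entry covers {virtual_path!r}")


print(resolve(r"\\company.com\share1\reports\q1.xls"))
```

Because clients only ever see the `\\company.com\...` path, moving a share to another server in this sketch means changing one table entry rather than reconfiguring clients.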
Implementation of a Global Namespace
Capacity Balancing
• Automatically balance capacity across multiple file servers and NAS appliances
• Make the best use of your current and future storage capacity
• Eliminate the need to manually rebalance data – use automated, policy driven tools instead
• Reduce storage costs, management complexities, and eliminate downtime due to maintenance
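As a rough illustration of a policy-driven placement rule, the Python sketch below puts new data on the least-utilized device that still has room. The `Device` class, the `place` function, the device names, and the capacities are all hypothetical; real balancing policies would also weigh I/O load and other criteria.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    capacity_gb: float
    used_gb: float

    @property
    def utilization(self):
        """Fraction of the device's capacity currently in use."""
        return self.used_gb / self.capacity_gb


def place(devices, size_gb):
    """Put new data on the least-utilized device that has room for it."""
    candidates = [d for d in devices if d.capacity_gb - d.used_gb >= size_gb]
    if not candidates:
        raise RuntimeError("no device has enough free capacity")
    target = min(candidates, key=lambda d: d.utilization)
    target.used_gb += size_gb
    return target


# Illustrative devices only: filer-a is 80% full, filer-b is 30% full.
devices = [Device("filer-a", 1000, 800), Device("filer-b", 2000, 600)]
chosen = place(devices, 100)
print(chosen.name)
```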
Capacity Balancing
Data Migration
Move data between storage devices on your schedule, not your users’ – seamless access to data during migration means no scheduled downtime
Transitioning from one generation of technology to another is now a scheduled task, not a 6-month project
Keep your vendors competitive – without the pain of data migration projects, your choice of solution comes down to features and costs. Why pay more by being locked in?
Automatically, online, and without disrupting your business, migrate your file infrastructure with zero downtime and without complex administrative burden.
Data Migration
Storage Tiering
With multiple tiers of storage in your environment, you now have the power to cost effectively store data based on policies you establish – age, access, type, etc.
Keep current data on faster, regularly backed up storage, while segregating static, older content that isn’t changing to lower tiers
Eliminate backup of over 80% of your data by cycling it out of the regular backup scheme
Shrink your backups and related costs, improve recovery windows, and store data on the right tier - creating efficiencies and capital cost savings at multiple levels
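An age-based tiering policy of the kind described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `choose_tier` function, the 180-day threshold, the tier names, and the sample files are all hypothetical, and a real policy could also consider access patterns and file type.

```python
from datetime import datetime, timedelta


def choose_tier(last_modified, now, max_age_days=180):
    """Return 'tier1' for recently changed files and 'tier2' for aged ones."""
    age = now - last_modified
    return "tier2" if age > timedelta(days=max_age_days) else "tier1"


# Illustrative file list with last-modified dates (not real data).
now = datetime(2010, 6, 1)
files = {
    "budget-2010.xls": datetime(2010, 5, 20),
    "budget-2007.xls": datetime(2007, 3, 1),
}
placement = {name: choose_tier(mtime, now) for name, mtime in files.items()}
print(placement)
```

Files that land on the lower tier in this sketch are the ones that could be cycled out of the regular backup scheme.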
Storage Tiering
Storage Tiering – Granular Value-based Policy
The Benefits of File Virtualization: Capacity Balancing
Utilize your existing storage assets better
Optimize access performance and eliminate issues that impact user productivity (scheduled and unscheduled)
Pool the resources of servers and NAS appliances you already own, achieving better asset utilization and realized cost savings
Eliminate downtime and reconfiguration activities. Enable non-disruptive data management
Create process efficiencies in your organization through the elimination of administrator “shell-game” tasks.
The Benefits of File Virtualization: Data Migration
No client reconfiguration – with a virtualized, global name space, the location of data is policy and administrator controlled. Moving data around does not impact access to it.
Move entire file systems or individual files around without interrupting access to them.
Reduce the overhead of migration projects with a streamlined, consistent, automated solution.
No stubs. Ever. Leaving stubs or pointers around in the filesystem does not solve the backup problem, and long-term this can become a management headache!
The Benefits of File Virtualization: Storage Tiering
Reduce your storage costs by putting data on the right (cost) tier of storage – automated, policy driven.
Reduce your backup volumes dramatically by moving aged data out of the daily/weekly/monthly backup cycle. Back static data up once a quarter or less, with proper retention practices.
Tiering without Administrative overhead. Automate the challenge of what goes where, and save yourself the trouble.
Improve your storage utilization across all tiers and devices, automatically, as granular as the file level – without stubs!
Storage Tiering – Optimizing Backups
Case Study: Vale Inco
Challenges
• Backup Windows
• Impact to production during backups
• Too much data, high growth, archiving partially implemented
How we helped
• Information Life Cycle Management Assessment
– Reviewing all aspects of data in their environment
– Current State Analysis
– Future State Recommendations
– Technology and Design Recommendations
And now... Tom Morrier!
Introduction
Tom Morrier
Vale Inco Limited
Once Storage Administrator
Now Project Manager
Still Secretly the Storage Administrator
Killing 5 Birds with one Appliance
Our Problem(s)
• Extremely large volumes of data growing out of control
• Millions of files, many of them under 1k in size
• Aging End Of Life Data Archiving solution
• 5 day backup times
– Backups were running during business hours
• Small change windows to take outages in
– 24 hour operation that does not like down time.
The Solution
Two pairs of ARX 4000s
1 pair in our Primary DC
1 pair in our Largest Site
How We Used the ARX
[Diagram: volumes of 5 TB, 3 TB, 2.5 TB and 4 TB placed across Tier 1 and Tier 2]
The Results
• Backup Times
– 98 hours went to 28 hours
– 5 streams have been turned into 14 streams, 4 of which only happen once a month
– In primary DC backup times went from 110 hours for 1 full backup to 21 hours over 5 streams for the same full
• Archiving has been undone in one site and is under way in the other
• Re-archive based on change through tiering
• All data moves were done during business hours without impacting user data access
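For a sense of scale, the backup-time figures above can be turned into percentage reductions. The helper `percent_reduction` below is a hypothetical Python sketch; the hour values are the ones reported on the slide.

```python
def percent_reduction(before_hours, after_hours):
    """Percentage by which a duration shrank from before_hours to after_hours."""
    return (before_hours - after_hours) / before_hours * 100


# Figures reported on the slide.
overall = percent_reduction(98, 28)       # 98 hours went to 28 hours
primary_dc = percent_reduction(110, 21)   # 110 hours went to 21 hours in the primary DC
print(f"{overall:.1f}% and {primary_dc:.1f}% shorter")
```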
Some Bonus Results
• Tape usage has gone down thanks to tiering
• Data types can be isolated
• Old systems still accessing network storage surface
• Strange connections get identified
• MP3 library gets a boost!
Questions
?
How we can help you – two approaches
Information Life Cycle Assessment
• We’ll look at all aspects of your data environment (online, nearline, offline), processes, and applications, and provide guidance on how to get from “current state” to your desired “future state” given the challenges specific to your business.
Unstructured Data Targeted – “Quick Assessment”
• We’ll target your file servers and NAS appliances with tools specifically designed to capture and analyze your unstructured data environment, and provide recommendations on design and TCO/ROI, along with business justification showing how, where, and with what impact a File Virtualization solution would work for you.
Unstructured Data Assessment – what’s involved?
Discovery
• Through a ½ day workshop, we will gather information about your processes, policies, infrastructure and challenges.
• A data collection tool will be installed (non-invasive) to capture the metadata for the target filesystem shares (server/NAS)
Analysis
• The captured data will be analyzed to determine what efficiencies would be realized, and the best design case
Presentation of Results
• We will present the results of the analysis, along with recommendations on how to realize the benefits of file virtualization
• A mapping of benefits to your specific challenges will help build an ROI/TCO for business justification
• Specific design recommendations and costs will be presented.
The Tool – F5 Data Manager
Unstructured Data Assessment Results
Justification through soft and hard dollar cost savings will be presented to help establish a business case for deployment in your environment
Costs
Free to Session Attendees
Because we believe this solution can be proven out as a cost effective, highly impactful way of managing unstructured data growth, we are presenting this 2 ½ day engagement free of charge.
Next Steps
• Unstructured Data Assessment
– Learn how file virtualization can benefit your environment
– Free of charge for attendees who complete the survey
– Inquire for additional information (see handout)
• Complete your Survey
– Be sure to complete the survey for your chance to win a Netbook!
• Beer Tasting
– Join us for a sampling of Duggin’s Beer