89 Fifth Avenue, 7th Floor

New York, NY 10003

www.TheEdison.com

@EdisonGroupInc

212.367.7400

Solution Brief

Bridging the Infrastructure Gap for

Unstructured Data with Object Storage

Printed in the United States of America

Copyright 2016 Edison Group, Inc. New York.

Edison Group offers no warranty either expressed or implied on the information contained

herein and shall be held harmless for errors resulting from its use.

All products are trademarks of their respective owners.

First Publication: January 2016

Produced by: Brandon Moore, Analyst; Manny Frishberg, Editor; Barry Cohen, Editor-in-Chief

Table of Contents

Opportunities and Challenges of Unstructured Data

Object Storage: The Solution for the Unstructured Data Deluge

Key Advantages of Object Storage

EMC ECS and Object Storage

Edison: Bridging the Infrastructure Gap for Unstructured Data with Object Storage Page 1

Opportunities and Challenges of Unstructured Data

Whether you are a startup or have been in business for 100 years, the rules are the same.

Every business wants to get as close as possible to its customers. The closer you are to

them, the closer you are to revenue generation. The world is experiencing a data

revolution that can help put your company top of mind for all current and potential

customers. Using different methods, like surveys, banner ads, email campaigns, social

media, or targeted product recommendations, companies and other entities, such as

federal, state, and local governments, have been collecting this data for years. For

example, http://www.data.gov contains nearly 193,000 datasets from 422 publishers

across 78 agencies and sub-agencies of the U.S. Government, and these datasets change

monthly.

Access to these large data sets is just the tip of the proverbial iceberg. This data requires

new methods of storage. Collecting and managing data also presents additional non-IT

challenges. These include security concerns, regulation-driven data locality, privacy

laws, and the total cost of ownership (TCO) when scaling past the petabyte range.

Business leaders want to collect and analyze data from each part of their revenue chain

to deliver the best products and services, while mitigating risk, generating revenue, and

maximizing margins.

As an IT professional, you view this problem from a different perspective:

How much data will be generated and where can I store it cost effectively?

Where is the data coming from?

What kind of data is it? Structured, unstructured, or both?

Who or what application needs to use it?

How do I protect that data?

Data generation is a “one-time” event; do I have what I need to collect this data?

These questions reflect the deluge often associated with unstructured data. The

challenge is being prepared to deliver what the business asks of you.

As more and more systems, devices, and sensors in your company’s revenue chain

generate data that needs to be collected, you begin to see the gaps in your

infrastructure’s capabilities: a storage gap and an application gap. The solution requires the

flexibility to address different kinds of applications, development cycles, and

infrastructure.

Figure 1 illustrates the gaps often found in IT infrastructure trying to address this

increase in data.

Figure 1: Infrastructure Gap for Unstructured Data Storage

At the center are the core systems that run your business, often referred to as systems of

record. These systems are some of the most protected, regulated, and secure assets in the

company. As a result, they were not built to have the flexibility to interface with systems

that generate large amounts of unstructured data at different speeds and from different sources. The

data generated by the web, people, devices, sensors, and your revenue chain is

unstructured, unpredictable, and unending. Looking at Figure 1, the infrastructure gap

between the systems interacting with your revenue chain and those currently running

your business becomes clear: Storage technology is at the foundation of this challenge.

A keystone of data collection in IT, storage technology is experiencing a revolution

centered on addressing the challenges of the unstructured data gap. Object storage is a

solution for storing unstructured data and for the application systems that analyze and

transform that data.

In this solution brief, Edison Group explores the infrastructure gap and evaluates object

storage and EMC Elastic Cloud Storage (ECS) as the solution to fill that gap.

Object Storage: The Solution for the Unstructured Data Deluge

While not a new concept, object storage is one of the hottest terms in IT today. As a

result, startups are being acquired, and “born on the web” companies are innovating

heavily in this field. Object storage uses a flat namespace in which each piece of data is

stored with a globally unique address and metadata. This approach reduces the overhead of

storage management tasks such as:

LUN creation, expansion, or migration

Applying data protection schemes

Creating, extending, and managing filesystems

With object storage, data can be of any size and type. From documents and images to audio

and video, there is no need to apply special techniques to store these and other data

types. Along with a unique global ID, metadata is embedded with each object. Users and

applications can embed additional metadata to further increase and customize ease of

identification.
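
The flat namespace described above can be illustrated with a minimal in-memory sketch. It is not how any particular product is implemented; the class and method names (`FlatObjectStore`, `put`, `get`, `find`) and the metadata keys are hypothetical and chosen only for illustration. Each object is stored under a generated unique ID, with system metadata and any user-supplied metadata kept alongside the data.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """One object: opaque bytes plus descriptive metadata."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class FlatObjectStore:
    """Hypothetical in-memory store with a single flat namespace."""

    def __init__(self):
        # One flat mapping of ID -> object; no directories, LUNs, or filesystems.
        self._objects = {}

    def put(self, data: bytes, **user_metadata) -> str:
        """Store data of any size or type and return its unique ID."""
        object_id = str(uuid.uuid4())
        metadata = {"size": len(data), **user_metadata}
        self._objects[object_id] = StoredObject(data, metadata)
        return object_id

    def get(self, object_id: str) -> StoredObject:
        """Look an object up directly by its ID."""
        return self._objects[object_id]

    def find(self, **criteria) -> list:
        """Return IDs of objects whose metadata matches every criterion."""
        return [
            oid for oid, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]


store = FlatObjectStore()
report_id = store.put(b"%PDF-1.4 ...", content_type="application/pdf", department="finance")
video_id = store.put(b"\x00\x00\x00\x18ftyp", content_type="video/mp4", department="marketing")
finance_ids = store.find(department="finance")
```

A document and a video sit side by side in the same namespace, and the custom `department` tag is what makes the finance report easy to identify later.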

Key Advantages of Object Storage

Let’s explore the advantages of object storage for storing unstructured data. Traditional

storage systems are built to interact with operating systems and people. Object storage is

built to interact with applications and many different data sources. Block storage is

needed to provide a place for operating systems to live and for applications to run and

generate data. Some of the data that applications generate is not best suited for block storage.

Examples of that type of data are:

Backup files (database dumps, virtual machine level backups, and other

backup/recover applications)

Content repositories for content archival (document archival, compliance data,

email, databases)

This type of data is considered referential from an application perspective, meaning it

needs to be available for recall but not accessed frequently. Additionally, the volume of

this data far exceeds what is used in the application’s operation. As a result, the data

does not require the performance and availability characteristics of block storage. Object

storage excels at storing this type of data because applications can write data directly

using a TCP/IP connection to a programmable API on the object storage system. This

characteristic is what defines object storage systems as software defined storage.
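
As a rough illustration of an application writing directly to a programmable API over TCP/IP, the sketch below builds an HTTP PUT request using Python's standard library. The endpoint URL, bucket name, object key, and the `X-Object-Meta-` header prefix are all assumptions made for this example; an actual object storage system defines its own API, addressing, and authentication. The request is constructed but not sent, so the example runs without a storage system.

```python
import urllib.request


def build_put_request(endpoint, bucket, key, payload, metadata):
    """Build (but do not send) an HTTP PUT that would store one object."""
    url = f"{endpoint}/{bucket}/{key}"
    headers = {"Content-Type": "application/octet-stream"}
    for name, value in metadata.items():
        # Hypothetical header prefix for user metadata; real APIs define their own.
        headers[f"X-Object-Meta-{name}"] = value
    return urllib.request.Request(url, data=payload, headers=headers, method="PUT")


payload = b"-- nightly database dump --"
request = build_put_request(
    endpoint="http://objectstore.example.internal",  # assumed address
    bucket="backups",                                # assumed bucket name
    key="orders-db-2016-01-15.dump",                 # assumed object key
    payload=payload,
    metadata={"source": "orders-db", "retention": "7y"},
)

# Against a real endpoint, the write would be a single call:
#   urllib.request.urlopen(request)
print(request.get_method(), request.full_url)
```

The application needs only a network path to the storage system. No LUN, RAID group, or filesystem has to be prepared before the write.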

Access connections for object storage are delivered via TCP/IP and Ethernet. Since no

setup of LUNs, RAID, or filesystems is required for use, integration into existing

environments is seamless, providing excellent time to value. Object storage systems

deliver unlimited capacity expansion as a cost effective, high value solution for warm

archival of referential data. These characteristics of object storage also provide an

excellent solution for the following use cases:

Archiving files in place of local tapes and tape libraries

Offsite backup and archive storage for disaster recovery

Archive tiering for network attached storage (NAS)

Remote office and branch office (ROBO)

Extending or replacing capacity on current NAS devices with object storage systems can

improve TCO for your storage environment without needing to purchase identical or

similar equipment. Object storage is a viable option for meeting disk-based recovery time

objectives in disaster recovery plans. Replication, another feature of object storage, extends

recoverability and mitigates risk as it can be extended outside of the datacenter to other

locations under your security control.
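
The role replication plays in recoverability can be sketched as follows. This is a simplified model, not a description of any vendor's replication mechanism: the `Site` and `ReplicatedStore` classes are hypothetical, and writes are copied synchronously to every online site, whereas real systems may replicate asynchronously or use other protection schemes.

```python
class Site:
    """Hypothetical storage location, such as a datacenter or remote office."""

    def __init__(self, name):
        self.name = name
        self.objects = {}
        self.online = True


class ReplicatedStore:
    """Writes each object to every online site; reads from the first site that has it."""

    def __init__(self, sites):
        self.sites = sites

    def put(self, object_id, data):
        # Simplification: copy synchronously to all sites that are reachable.
        for site in self.sites:
            if site.online:
                site.objects[object_id] = data

    def get(self, object_id):
        # Fall back to the next site when one is unavailable.
        for site in self.sites:
            if site.online and object_id in site.objects:
                return site.objects[object_id], site.name
        raise KeyError(object_id)


primary = Site("primary-datacenter")
recovery = Site("recovery-site")
store = ReplicatedStore([primary, recovery])

store.put("backup-0001", b"archived data")
data_before, served_before = store.get("backup-0001")

primary.online = False  # simulate losing the primary datacenter
data_after, served_after = store.get("backup-0001")
```

After the simulated outage, the same object is still readable, now served from the second location.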

Object storage simplifies not only initial setup but also ongoing

management. Because of the flat namespace, capacity expansion and

upgrades can be executed with no downtime in most cases.

Monitoring and administrative tools are web based, allowing for use anywhere on your

secured network. This ultimately means less of a learning curve to achieve operational

efficiency as the systems continue to grow. Once operational efficiency is achieved, the

TCO of your environment can be further reduced when adding more workloads.

The flat namespace also allows for multiple types of data to be stored side by side.

Regardless of the data type, everything is viewed as objects, each with a globally unique ID and metadata.

This puts your company in excellent position to ingest and store data from the following

sources:

Large data sets: Financial, pharmaceutical, geospatial, biotech, and legal

Public data sets: Weather, government

Security, imagery, and social media: Images, videos, blogs

Revenue chain data: Sensors, devices, Internet of Things

The ability to store this data locally lets you apply your own IT security and reduces the

exposure your company would face if this data were stored in a public cloud. You also have

greater control to share data with your partners for support and service of your revenue

chain.

Being prepared for the data deluge is critical to your company’s future

success. Figure 2 highlights how object storage and its advantages fill the previously

identified infrastructure gap.

Figure 2: How Object Storage Fills the Unstructured Data Storage Infrastructure Gap

Edison Group believes object storage technology is best equipped to address the

infrastructure gap between your business, customers, and the unstructured data they

both generate. We recommend that organizations begin to investigate this technology

within the next 3-6 months and plan to implement it in their environment within the

next 6-18 months, as there are likely several areas of immediate need for object storage.

Some of them include:

Backup and disaster recovery

Archive data

Content management repositories

Compliance and regulated data archives (Sarbanes-Oxley (SOX), Basel, etc.)

NAS migration/modernization

Remote office storage solutions

Enterprise data warehouse (EDW) data offload

Object storage systems, with their short time to value and easy

petabyte-scale expansion, also enable your IT department to move quickly to address the

concerns of line-of-business (LoB) application development lifecycles. By providing “as a Service”

offerings and a foundational, in-house private cloud storage environment, object

storage allows your company to further big data analytics, data lakes, and Internet of

Things (IoT) development at a greatly reduced security risk and cost.

Based upon this evaluation, Edison recommends object storage solutions to meet the

demands of ever-increasing data. Now that we understand the challenge associated with

unstructured data and what technology is needed to bridge the infrastructure gap, let’s

get an overview of how an object storage solution, EMC ECS, can close your storage

infrastructure gap for unstructured data.

EMC ECS and Object Storage

EMC provides a cloud-scale object storage platform that meets the storage demands of

today and beyond through their ECS solution. ECS is a turnkey, on-site solution offering

all the advantages of commodity infrastructure with enterprise grade reliability,

availability, and serviceability. ECS can efficiently store petabytes of data, whether billions of

small files, large files, or both, in a cost-appropriate, state-of-the-art storage system.

EMC ECS Appliance features include:

Universal protocol support in a single platform with support for object, file (NFS),

and HDFS

Single management view across multiple types of infrastructures

Geo-federated, active-active architecture with a single global namespace, enabling

the management of a geographically distributed environment as a single logical

resource using metadata-driven policies to distribute and protect content

Multi-tenancy support, detailed metering, and an intuitive self-service portal, as well

as billing integration

These features allow customers to extend automation capabilities and deliver improved

efficiencies across their storage environments, providing better control of operating

expenses as data growth continues to rise at unprecedented rates — one of the key pain

points customers face in the current IT landscape.

To put this data growth in context, the Digital Universe is growing 40 percent

yearly into the next decade.¹ By 2020, it will contain as many digital bits as there are

stars in the universe. This volume makes storing, accessing, and managing

all this data difficult, not to mention expensive. The way customers distribute and

protect their data at scale today will play a very important role in how successful they

are in the future.

¹ http://www.emc.com/leadership/programs/digital-universe.htm