Integrating Big Data Technologies in Your IT Portfolio

Vineet Tyagi
VP Technology, Head of Innovation Labs
[email protected]
[email protected]
blogs.impetus.com
Impetus Technologies Inc.



Impetus Proprietary

Outline

• Big Data
• Big Data Technologies and the Ecosystem
• Transforming your Enterprise Data Warehouse to a Big Data Warehouse
• Cost considerations
• Cloud considerations
• Operational Support
• People Aspect of Big Data
• Use Case Selection: what, where
• Q&A


Big Data

• 2.5 quintillion bytes of data are produced every day
• $6 trillion: big data cost (IDC/EMC)
• $650 billion per year: cost of wasted productivity due to information overload
• 1 zettabyte: estimated Internet traffic by 2015
• 1,800 exabytes: size of the digital universe in 2011
• 90% of the data in the world today is less than two years old
• 18 months: estimated time for the digital universe to double
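The 18-month doubling estimate can be turned into a rough projection. The sketch below is illustrative only: it assumes smooth exponential growth with a constant doubling time, starting from the 1,800-exabyte figure quoted for 2011. The function name and its parameters are hypothetical and do not come from the slides.

```python
def projected_size_eb(months_after_2011, base_eb=1800.0, doubling_months=18.0):
    """Project the size of the digital universe in exabytes.

    Assumption (not from the slides): growth is a clean exponential,
    doubling every `doubling_months` months from `base_eb` in 2011.
    """
    return base_eb * 2 ** (months_after_2011 / doubling_months)


if __name__ == "__main__":
    # Print the projection at 0, 1, and 2 doubling periods.
    for months in (0, 18, 36):
        print(months, "months after 2011:", projected_size_eb(months), "EB")
```

Under these assumptions each 18-month step simply doubles the previous value; real growth is unlikely to be this regular.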


Big Data

Big data is not only about the original content stored or consumed, but also about the information around its consumption.

• An airline jet collects 10 terabytes of sensor data for every 30 minutes of flying time
• The New York Stock Exchange collects 1 terabyte of structured trading data per day

Big data is not the created content, nor is it even its consumption; it is the analysis of all the data surrounding or swirling around it.


Age of Data


Age of Software Age of Data

Data Rich and Information Poor


Big Data - Technologies

There will be more content, more devices, more applications, and more on-demand access.

Big data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high-velocity capture, discovery, and/or analysis.

Finding answers where there are yet to be questions


Big Data - Technologies

• Big SQL: VoltDB, Hadapt, Clustrix, Xeround
• NoSQL: Membase, Cassandra, MongoDB, Hypertable, CouchDB
• Programmable Hardware: innovation of Mixware architectures, NVIDIA GPU, CUDA library, GPGPU, OpenCL standards, SIMD-architecture-based parallel programming
• Grid Frameworks: Alchemi, ProActive, JPPF, GridGain, distributed computation, Microsoft HPC Server
• Hadoop Ecosystem: Hadoop, MapReduce; HBase; Hive, Sqoop, Flume, etc.; Pig, Pig Latin; MapR, Platform Computing, Pervasive DataRush
• Machine Learning: Apache Mahout, R, Weka, Orange
• Voldemort, Oozie, Datameer
• Storage Optimization: Rainstor, Teradata


EDW to Big Data Warehouse - Drivers

BUSINESS DRIVERS
• Better insight
• Faster turn-around
• Accuracy and timeliness

IT DRIVERS
• Reduce storage cost
• Reduce data movement
• Faster time-to-market
• Standardized toolset
• Ease of management and operation
• Security and governance


EDW to Big Data Warehouse

• Traditional enterprise data models for application, database, and storage resources have grown over the years.
• The cost and complexity of these models have increased along the way to meet the needs of larger data sets.
• New models are based on a scaled-out, shared-nothing architecture.
• One size no longer fits all.

Big Data Components
• Hadoop: provides storage capability through a distributed, shared-nothing file system, and analysis capability through MapReduce.
• NoSQL: provides the capability to capture, read, and update, in real time, the large influx of unstructured data and data without schemas; examples include click streams, social media, log files, event data, mobility trends, and sensor and machine data.
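To make the two components concrete, here is a minimal single-process sketch of the MapReduce programming model applied to schema-less click-stream records. It is not the Hadoop API: the event data, field names, and function names are all hypothetical. On a real cluster the map and reduce steps would run in parallel across nodes, with the framework handling the shuffle.

```python
from collections import defaultdict

# Hypothetical click-stream events. The records do not share a schema,
# which is the kind of data the NoSQL bullet describes.
EVENTS = [
    {"page": "/home", "user": "u1"},
    {"page": "/cart", "user": "u2", "device": "mobile"},
    {"page": "/home", "referrer": "search"},
    {"user": "u3"},  # no "page" field at all
]


def map_event(event):
    """Map step: emit (page, 1) for every event that records a page."""
    if "page" in event:
        yield event["page"], 1


def shuffle(pairs):
    """Shuffle step: group the emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce step: sum the values collected for one key."""
    return key, sum(values)


def page_views(events):
    """Run map, shuffle, and reduce in sequence and return page counts."""
    pairs = (pair for event in events for pair in map_event(event))
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(page_views(EVENTS))
```

Because each map call looks at one record and each reduce call looks at one key, neither step needs shared state, which is what lets the model scale out on a shared-nothing architecture.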


Cost Considerations of Big Data Warehouse

• Initial Entry Costs: cost of experimentation
• Cost of Integration and Moving Data: cost of ETL
• Query and analytics capability
• Manageability
• On-Going Maintenance: monitoring and tuning
• Changing Capacity: additional hardware
• Cost of Compliance


Lowering TCO of Big Data

• Initial Entry Costs: cost of experimentation. Use best-practice patterns; learn or hire.
• Cost of Integration and Moving Data: cost of ETL. Remove costly licensed tools; switch to MapReduce (MR) for ETL or ELT.
• Manageability: provisioning and management tools. You will have more than a single vendor, so look for multi-vendor management toolsets such as Ankush from Impetus.
• On-Going Maintenance: monitoring and tuning. Automate, automate, automate.
• Changing Capacity: additional hardware. Do you know the GPU?


Lowering TCO of Big Data

• Cost of Storage: compress data with Rainstor-type solutions
• Do More with Less: faster MR with MapR-type solutions; Acunu-type solutions for NoSQL
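The storage saving from compression is easy to estimate on your own data before evaluating a product. The sketch below uses Python's standard zlib module as a generic stand-in; it does not represent how Rainstor or any other vendor compresses data, and the sample log line is hypothetical.

```python
import zlib


def compression_ratio(raw: bytes, level: int = 6) -> float:
    """Return the raw size divided by the zlib-compressed size."""
    compressed = zlib.compress(raw, level)
    # Round-trip check: zlib compression is lossless.
    assert zlib.decompress(compressed) == raw
    return len(raw) / len(compressed)


if __name__ == "__main__":
    # Hypothetical, highly repetitive log data; real data will compress less.
    log = b"2012-01-01 INFO request served status=200\n" * 10_000
    print(f"compression ratio: {compression_ratio(log):.1f}x")
```

Machine-generated data such as logs and sensor readings tends to be repetitive, so it usually compresses well; the ratio you measure on a sample gives a first-order estimate of the storage cost you could avoid.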


Cloud Considerations in Big Data Warehouse

• More “virtual” servers than “physical” servers were shipped in 2011.
• By 2015, 20% of all information running through servers will be doing so on virtualized systems.

The challenges for cloud adoption include:
• Data preparation for conversion to cloud
• Integrated cloud / non-cloud management
• Service-level agreements and termination strategies
• Security, backup, archiving, and disaster control strategies
• Intercountry data transfer and compliance


Operational Support Considerations in Big Data Warehouse

Big Data solution architectures
• Technology churn: Hadoop is the only constant as a paradigm.
• Impedance mismatch: is your IT organization geared up to transition Big Data technologies into the enterprise?

Unsolved Challenges
• Rapid, automatic, or rule-based single-click provisioning of Big Data clusters
• Measuring the boost that clusters/grids provide to your business data processing capabilities
• Changing your choice of cluster software at any point when you feel it is not sufficiently delivering to your needs
• Managing the big data solution from a single cluster-management software umbrella

Multi-Vendor, Multi-Technology Cluster Management


People Aspect : Big Data Warehouse

• New skill sets needed: data scientists; developers who can think in parallel
• New definitions for older roles: Big Data Administrator?
• Invest in training


Use Case Selection : Big Data Scenarios

Big Input-Small Output

Small Input-Big Output

Big Input-Big Output
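One way to apply the three scenarios is to label a candidate workload by the size of its input and output. The sketch below is only an illustration: the slides give no numeric definition of "big", so the 1 TB threshold and the function name are hypothetical. The function can also return a "Small Input-Small Output" label, which the slide does not list.

```python
def classify_scenario(input_bytes, output_bytes, big_threshold=10**12):
    """Label a workload by the relative size of its input and output.

    `big_threshold` (1 TB here) is an arbitrary, hypothetical cut-off;
    choose a value that reflects your own infrastructure limits.
    """
    in_label = "Big" if input_bytes >= big_threshold else "Small"
    out_label = "Big" if output_bytes >= big_threshold else "Small"
    return f"{in_label} Input-{out_label} Output"


if __name__ == "__main__":
    # Example: 5 TB of input reduced to a 1 MB summary.
    print(classify_scenario(5 * 10**12, 10**6))
```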


Stage 1 Use Case : Introduce Big Data


Stage 2 Use Case : Get Bolder


Stage 3 Use Case : Sophisticated


Impetus

We offer innovative product engineering and technology R&D services and products.

• Eighteen years of experience, numerous award-winning products and success stories
• Innovation-based differentiated services
• 1300+ engineers, development centers in India
• Pioneers in Big Data consulting services: since 2008, 10+ active in-production use cases at large Fortune 100 companies
• Products and tools to help ease Big Data adoption: Ankush, Jumbune, iLaDaP


Impetus OpenSource Contributions

Kundera (http://code.google.com/p/kundera/): This is an annotation-based Java library for NoSQL databases like Cassandra, HBase, and MongoDB.

Hadoop (http://hadoop-toolkit.googlecode.com): The Hadoop Performance Monitoring tool provides an inbuilt solution with a Suggestion Engine for quickly finding performance bottlenecks. A visual representation helps suggest remediation.

Korus (http://code.google.com/p/korus/): This is a parallel and distributed programming framework, which improves the performance and scalability of Java applications.


Thank You