Big Data

DESCRIPTION

My recent presentation on Big Data: what it is, why there is so much hype now, startling facts, the opportunity, history, important research papers such as GFS and MapReduce, technology platforms and organizations, Hadoop and Cassandra, an introduction to Hadoop, the contributions of Indians working at Google, Cloudera, Hortonworks, Yahoo, and Facebook to various Big Data technologies, and Aadhaar. "All your answers lie in data" - @Sameer Sawhney

BIG DATA

WHY NOW?

World’s information totaled over

2 Zettabytes

That’s 2 Trillion Gigabytes

By 2020, this number will be

35 ZB (35 Trillion Gigabytes)

“world’s data is doubling every 1.2 years”

“80% of this data is unstructured”

5 V

Money

Timeline: 2003 to 2013 (Today)

Apache Cassandra

Apache Hadoop

Amazon Dynamo

Bigtable

MapReduce

Google File System

Dremel

Impala

Spanner ?

Analytics

(Hadoop)

Realtime

(“NoSQL”)

THE ECOSYSTEM

Hadoop Ecosystem

Apache Hadoop is an open-source software framework that supports running applications on large clusters of commodity hardware.

Replication

Fault Tolerant

Commodity Hardware

Map Reduce

Word Count
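
Word count is the usual first example of the Map Reduce model, so a small sketch may help here. The code below is plain Python that only imitates the three stages (map, shuffle, reduce) in a single process; it does not use the Hadoop API, and the function names map_phase, shuffle, reduce_phase and word_count are made up for illustration.

```python
import re
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in re.findall(r"[a-z0-9']+", line.lower()):
        yield word, 1


def shuffle(pairs):
    """Shuffle: group emitted values by key, the step a framework
    performs between the map and reduce stages."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    """Reduce: sum all the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle and reduce over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["big data is big", "data is everywhere"]
    print(word_count(sample))
```

On a real cluster the map and reduce functions would run in parallel on many machines, with the framework handling the shuffle, replication and recovery from failures; here everything runs in one process only to show the data flow.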

World's largest biometric identity platform

2,00,00,00,00,000 Biometric Matches

2 PB Data

Hadoop Stack

This is just the Beginning of “Big Data Revolution”

sameer.sawhney@gmail.com

@sameersaw on Twitter

Images

Raymond Bryson Marius B IntelFreePress License Pedro Moura Pinheiro
