mani-gandan
BIG DATA
THE NEXT BIG THING!!!
• OLTP: Online Transaction Processing (DBMSs)
• OLAP: Online Analytical Processing (Data Warehousing)
• RTAP: Real-Time Analytics Processing (Big Data Architecture & Technology)
Data Management Journey
Classification of Data
• Structured: relational tables
• Semi-structured: XML
• Unstructured: graphic images, videos, streaming instrument data, web pages, logs, tweets, etc.
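The three classes above can be illustrated with a minimal sketch. The sample records (names, cities, phone number, tweet text) are made up for illustration and do not come from the slides; only Python standard-library modules are used.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Structured: a relational-style table with a fixed schema (hypothetical rows).
table = io.StringIO("id,name,city\n1,Asha,Chennai\n2,Ravi,Mumbai\n")
rows = list(csv.DictReader(table))

# Semi-structured: XML is self-describing, but fields can vary per record.
doc = ET.fromstring(
    "<customers>"
    "<customer id='1'><name>Asha</name></customer>"
    "<customer id='2'><name>Ravi</name><phone>555-0100</phone></customer>"
    "</customers>"
)
names = [c.findtext("name") for c in doc.findall("customer")]
phones = [c.findtext("phone") for c in doc.findall("customer")]  # None if absent

# Unstructured: free text such as a tweet; structure must be extracted by us.
tweet = "Loving the new phone! #bigdata #analytics"
hashtags = [word for word in tweet.split() if word.startswith("#")]

print(rows[0]["city"])
print(names, phones)
print(hashtags)
```

The table can be queried by column name directly, the XML needs a lookup that tolerates missing fields, and the tweet yields nothing until some extraction rule is applied to it.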
• Customer
• Social Media
• Gaming
• Entertainment
• Banking/Finance
• Our Known History
• Purchase
Different Sources of Data
• 12+ TBs of tweet data every day
• 25+ TBs of log data every day
• ? TBs of data every day
• 2+ billion people on the Web by end of 2011
• 30 billion RFID tags today (1.3B in 2005)
• 4.6 billion camera phones worldwide
• 100s of millions of GPS-enabled devices sold annually
• 76 million smart meters in 2009… 200M by 2014
Data usage
[Chart: data usage in zettabytes for 2009, 2012, and 2020; axis from 0 to 35]
Data Usage as of 2012
[Chart: data usage by type on a 0% to 100% scale; legend: Unstructured, Semi-Structured, Structured; data labels: 0.216, 0.324, 2.16]
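As a rough check on the 2012 chart, the three data labels (0.216, 0.324, 2.16) can be converted into shares of the total. This is only a sketch: the unit is assumed to be zettabytes, carried over from the earlier data usage chart, and which label belongs to which data type is not recoverable from the slide text.

```python
# Data labels read from the 2012 chart; the unit (zettabytes) is an assumption.
labels = [0.216, 0.324, 2.16]

total = sum(labels)
# Share of each label as a percentage of the total, rounded to one decimal.
shares = [round(100 * value / total, 1) for value in labels]

print(f"total = {total:.1f}")
print(shares)
```

The labels sum to about 2.7 and split roughly 8% / 12% / 80%. If the largest label belongs to unstructured data (an assumption), then about four fifths of the data is unstructured.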
• Not all unstructured data is useful
• Breaking down data silos to access all the data an organization stores in different places, often in different systems
• Creating platforms that can pull in unstructured data as easily as structured data
Challenges
Need for Big Data
What is Big Data
Big Data = an enormous volume of information stored in data centers, data warehouses, databases, and transaction systems + new techniques to obtain and process it to produce the best value
• The process of collecting, organizing, and analyzing large sets of data to discover patterns and other useful information
• Helps organizations better understand the information contained within the data
• Helps identify the data that matters most to the business and to future business decisions
Big Data Analytics
Need Big Data to address the following…
• Sophisticated BI & analytics
• Leverage universal data
• Select proven technologies
• Agility for business change
• Easy to build and manage
…and to overcome the challenges
• Long timelines for infrastructure setup
• Time and cost uncertainties
• Limitations in on-boarding new data sources
Big Data: Volume, Velocity, Variety, Value
Traditional Data Warehouse
• Complete record from the transactional system
• All data is centralized
• New data is added every month/day
• Analytics designed against a stable environment
• Many reports run on a production basis
Big Data Environment
• Data from many sources inside and outside the organization
• Data is often physically distributed
• Need to iterate on the solution to test/improve the model
• Large-memory analytics is also part of the iteration
• Every iteration requires a complete reload of information
Big Data Vs Data Warehouse
• Researchers use Big Data to decode human DNA in minutes
• To predict where terrorists plan to attack
• To determine which gene is most likely to be responsible for certain diseases
To decide which ads you are most likely to respond to on
Big Data used today
Improves customer retention
Helps with product development and gaining a competitive advantage
Increases efficiency and optimizes operations
Improves speed and reduces complexity
Benefits
Hadoop, Cloudera, Hortonworks, MapR, and Amazon. There are also
other products such as HPCC and cloud-based services such as
Google BigQuery.
Tools
Change in Model
Old Model: A few companies generate data; all others consume it
New Model: All of us generate data, and all of us consume data
Thank You