Big Data Analytics – Challenge and Opportunity

IBM Smarter Business 2012 - Big Data Analytics

DESCRIPTION

There is enormous potential in the analysis, refinement, modeling, and visualization of the vast amounts of data that permeate both business and society. Realizing this potential takes more than the capacity to store, transfer, and search data. What is needed instead is Big Data Analytics: large-scale analysis of enormous data sets. Here we present the research agenda for Big Data Analytics that SICS is developing together with IBM and other partners from industry and academia. Speakers: Daniel Gillblad, Research Group Leader, SICS; Anders Holst, Senior Research Scientist, SICS; and Flemming Bagger, IBM. Visit http://smarterbusiness.se for more information.

Page 1: IBM Smarter Business 2012 - Big Data Analytics

Big Data Analytics – Challenge and Opportunity

Page 2: IBM Smarter Business 2012 - Big Data Analytics

6,000,000 users on Twitter pushing out 300,000 tweets per day

500,000,000 users on Twitter pushing out 400,000,000 tweets per day

Growth: 83x in users, 1333x in tweets per day
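
The two multipliers follow directly from the user and tweet figures on this slide. A minimal sketch of the arithmetic, assuming the ratios are truncated to whole numbers as printed; the variable and function names are chosen here for illustration, and the slide does not say which years the two snapshots refer to:

```python
# Figures as printed on the slide (the years they refer to are not given).
users_before, users_after = 6_000_000, 500_000_000
tweets_before, tweets_after = 300_000, 400_000_000


def growth_factor(before: int, after: int) -> int:
    """Return how many times larger `after` is than `before`, truncated to a whole number."""
    return after // before


user_growth = growth_factor(users_before, users_after)
tweet_growth = growth_factor(tweets_before, tweets_after)

print(f"users: {user_growth}x, tweets per day: {tweet_growth}x")
# prints "users: 83x, tweets per day: 1333x"
```

Tweet volume grew far faster than the user base, which is the point of the slide: data volume does not scale linearly with the number of sources.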

Page 3: IBM Smarter Business 2012 - Big Data Analytics

Where is big data coming from?

• 2+ billion people on the Web by end 2011

• 30 billion RFID tags today (1.3B in 2005)

• 4.6 billion camera phones worldwide

• 100s of millions of GPS-enabled devices sold annually

• 76 million smart meters in 2009… 200M by 2014

• 12+ TBs of tweet data every day

• 25+ TBs of log data every day

• ? TBs of data every day

Page 4: IBM Smarter Business 2012 - Big Data Analytics

The Characteristics of Big Data

Collectively analyzing the broadening Variety

Responding to the increasing Velocity

Cost efficiently processing the growing Volume

Establishing the Veracity of big data sources

By 2015, 80% of all available data will be uncertain

- The number of networked devices will be double the entire global population

- The total number of social media accounts exceeds the entire global population

50x growth, 2010 to 2020 (35 ZB)

30 billion RFID sensors and counting

80% of the world's data is unstructured

Page 5: IBM Smarter Business 2012 - Big Data Analytics

Big Data is a Hot topic - Because it is possible to Analyze ALL Available Data

• The percentage of available data an enterprise can analyze is decreasing in proportion to the data available to that enterprise

– Quite simply, this means that as enterprises, we are getting “more naive” about our business over time

• Just collecting and storing “Big Data” doesn’t drive a cent of value to an organization’s bottom line

• Cost-effectively manage and analyze ALL available data in its native form: unstructured, structured, real-time streaming, internal and external

Figure labels: Data AVAILABLE to an organization; Data an organization can PROCESS; Signals and Noise

Page 6: IBM Smarter Business 2012 - Big Data Analytics

Business-centric Big Data Platform

• “Big data” isn’t just a technology—it’s a business strategy for capitalizing on information resources

• Getting started is crucial

• Success at each entry point is accelerated by products within the Big Data platform

• Build the foundation for future requirements by expanding further into the big data platform

Page 7: IBM Smarter Business 2012 - Big Data Analytics

Different data workloads have different characteristics

• System for Transactions: Database services that handle large volumes of transactions with high availability, scalability and integrity

• System for Analytics (powered by Netezza technology): Data Warehouse services for complex analytics and reporting on data up to petabyte scale, with minimal administration

• System for Operational Analytics: Operational Warehouse services for continuous ingest of operational data, complex analytics, and a large volume of concurrent operational queries

Page 8: IBM Smarter Business 2012 - Big Data Analytics

Big Data Analytics – A national research initiative

Page 9: IBM Smarter Business 2012 - Big Data Analytics

Big Data Analytics – A national research initiative

Daniel Gillblad

Research Group Leader, Senior Research Scientist

SICS, Swedish Institute of Computer Science

Page 10: IBM Smarter Business 2012 - Big Data Analytics

Background

• There is a very large potential, both societal and commercial, in the analysis, refinement, modeling, and visualization of these data sets

• Capacity to store, transfer, and search is not enough - analytics is critical

Page 11: IBM Smarter Business 2012 - Big Data Analytics

Additional business value of Analytics

• Predict and optimize business outcomes

• New services and applications, both for end-users and industry

• New value chains, where different actors can create and exchange new analysis services

Page 12: IBM Smarter Business 2012 - Big Data Analytics

A national Big Data Analytics initiative

① A strategic nation-wide research and innovation agenda

– Input from several sectors and application areas

– Both new businesses built on analytics applications and traditional industry

– Input from academia, both as developers and as users

② A national Big Data Analytics network

– Open to all interested parties

– Industry and academia with an active interest in Big Data Analytics

Page 13: IBM Smarter Business 2012 - Big Data Analytics

Focus areas

Analytics

Computation

Storage

Collection

Visualization

Control and planning

Page 14: IBM Smarter Business 2012 - Big Data Analytics

Current constellation

Page 15: IBM Smarter Business 2012 - Big Data Analytics

Research and development challenges

• Huge businesses are built on Big Data Analytics today, but a large number of issues must be resolved to fully realize the potential

• Three examples

Page 16: IBM Smarter Business 2012 - Big Data Analytics

Example 1: Large-scale physics experimentation

• Challenges: Scale (storage, computation), scalable analytics

Page 17: IBM Smarter Business 2012 - Big Data Analytics

Example 2: Social network mining

• Challenges: Unstructured data, biased data, data access

Page 18: IBM Smarter Business 2012 - Big Data Analytics

Example 3: Access network pattern mining

• Challenges: Integrity issues, distributed mining, service frameworks

Page 19: IBM Smarter Business 2012 - Big Data Analytics

Long term trends

• The currently dominant approach will continue to be successful, but will be complemented due to

– Too much data, unstructured data, noisy data

– Limited access: security, integrity, legal, and business

– Fast data generation, situation awareness

• The consequences are

– Analysis closer to data generation / collection

– No storage: catching information on the fly

– Distributed analysis with incomplete data

– Real-time collection, real-time analytics
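
One way to picture analysis without storage is a single-pass summary that is updated as each value arrives and never keeps the raw stream. The sketch below uses Welford's online algorithm for a running mean and variance. The choice of algorithm, the `RunningStats` name, and the sample readings are illustrative assumptions, not something the slides specify.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Single-pass mean and variance (Welford's algorithm) with O(1) state."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, value: float) -> None:
        """Fold one new observation into the summary."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        """Sample variance; 0.0 until at least two values have been seen."""
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0


stats = RunningStats()
# Hypothetical sensor readings; each one is discarded after the update.
for reading in (2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0):
    stats.update(reading)

print(stats.count, stats.mean, stats.variance)
```

The state is three numbers regardless of how long the stream runs, so the same summary could sit close to the point of collection and be shipped onward instead of the raw data.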

Page 20: IBM Smarter Business 2012 - Big Data Analytics

Research challenges

• Research challenges on different levels:

– The sensor/collection level

– The algorithmic/analytical level

– The system level

– The organisational level

Page 21: IBM Smarter Business 2012 - Big Data Analytics

Technical challenges, examples

• Computational and storage framework development

• Analysis of unstructured data

• Distributed analysis

• Efficient analysis algorithms

• Stream mining

• Managing sample bias

• Managing uncertain and missing data
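
As an illustration of how stream mining and sampling interact, the sketch below draws a uniform random sample from a stream of unknown length using reservoir sampling (Algorithm R). This is one standard technique, chosen here as an assumption; the slides do not name a method. The function name, the stream, and the seed are hypothetical.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, rng: random.Random) -> List[T]:
    """Return k items chosen uniformly at random from a stream of unknown length.

    Only k items are held in memory at any time. If the stream has fewer
    than k items, all of them are returned.
    """
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Keep the new item with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical stream of one million event ids, sampled in a single pass.
sample = reservoir_sample(range(1_000_000), k=5, rng=random.Random(42))
print(sample)
```

Because every item ends up in the sample with equal probability, the sample does not over-represent early or late parts of the stream. It does not correct bias that is already present in how the stream was collected.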

Page 22: IBM Smarter Business 2012 - Big Data Analytics

Platform and organisational challenges, examples

• Service and analytics frameworks, exchanging models and data

• APIs and standards

• Privacy, integrity, security, and legal

• Business models

Page 23: IBM Smarter Business 2012 - Big Data Analytics

Contacts

• If you are interested in the Swedish Big Data Analytics Network, feel free to contact

Daniel Gillblad, [email protected], +46 8 633 15 68

Anders Holst, [email protected], +46 8 633 15 93

Page 24: IBM Smarter Business 2012 - Big Data Analytics