IBM Smarter Business 2012 - Big Data Analytics

DESCRIPTION

There is enormous potential in the analysis, refinement, modeling, and visualization of the vast amounts of data that pervade both business and society. Realizing this potential takes more than the capacity to store, transfer, and search data. It requires Big Data Analytics: large-scale analysis of enormous data sets. Here we present the research agenda for Big Data Analytics that SICS is developing together with IBM and other partners from industry and academia. Speakers: Daniel Gillblad, Research Group Leader, SICS; Anders Holst, Senior Research Scientist, SICS; and Flemming Bagger, IBM. Visit http://smarterbusiness.se for more information.

Big Data Analytics – Challenge and Opportunity

6,000,000 users on Twitter pushing out 300,000 tweets per day

500,000,000 users on Twitter pushing out 400,000,000 tweets per day

83x the users, 1333x the tweets
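The two growth factors follow directly from the Twitter figures on this slide. A minimal Python sketch of the arithmetic, using only the numbers printed above (the variable names are illustrative, and the slide does not say which dates the two snapshots refer to):

```python
# Figures as printed on the slide; the snapshot dates are not given there.
users_before, users_after = 6_000_000, 500_000_000
tweets_before, tweets_after = 300_000, 400_000_000

user_growth = users_after / users_before      # about 83.3
tweet_growth = tweets_after / tweets_before   # about 1333.3

# Tweets per user per day rose from 0.05 to 0.8, so activity per user
# grew as well, by tweet_growth / user_growth.
per_user_growth = tweet_growth / user_growth

print(f"users:           {user_growth:.0f}x")
print(f"tweets:          {tweet_growth:.0f}x")
print(f"tweets per user: {per_user_growth:.0f}x")
```

Tweet volume grew about 16 times faster than the user base, which is one way to read the slide: the data grows faster than the population producing it.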

Where is big data coming from?

• 2+ billion people on the Web by end of 2011

• 30 billion RFID tags today (1.3B in 2005)

• 4.6 billion camera phones worldwide

• 100s of millions of GPS-enabled devices sold annually

• 76 million smart meters in 2009… 200M by 2014

• 12+ TBs of tweet data every day

• 25+ TBs of log data every day

• ? TBs of data every day

The Characteristics of Big Data

Collectively analyzing the broadening Variety

Responding to the increasing Velocity

Cost efficiently processing the growing Volume

Establishing the Veracity of big data sources

By 2015, 80% of all available data will be uncertain

- The number of networked devices will be double the entire global population

- The total number of social media accounts exceeds the entire global population

50x growth in data volume from 2010 to 2020, to 35 ZB

30 Billion RFID sensors and counting

80% of the world's data is unstructured

Data AVAILABLE to an organization

Data an organization can PROCESS

Big Data is a Hot topic - Because it is possible to Analyze ALL Available Data

• The percentage of available data an enterprise can analyze is decreasing in proportion to the data available to that enterprise

– Quite simply, this means as enterprises, we are getting “more naive” about our business over time

• Just collecting and storing “Big Data” doesn’t drive a cent of value to an organization’s bottom line

• Cost effectively manage and analyze ALL available data in its native form: unstructured, structured, real-time streaming… internal and external

Signals and Noise

Business-centric Big Data Platform

• “Big data” isn’t just a technology—it’s a business strategy for capitalizing on information resources

• Getting started is crucial

• Success at each entry point is accelerated by products within the Big Data platform

• Build the foundation for future requirements by expanding further into the big data platform

Different data workloads have different characteristics

System for Transactions: Database services that handle large volumes of transactions with high availability, scalability and integrity

System for Analytics: Data Warehouse services for complex analytics and reporting on data up to petabyte scale, with minimal administration

System for Operational Analytics: Operational Warehouse services for continuous ingest of operational data, complex analytics, and a large volume of concurrent operational queries

powered by Netezza technology

Big Data Analytics – A national research initiative

Daniel Gillblad

Research Group Leader, Senior Research Scientist

SICS, Swedish Institute of Computer Science

Background

• There is a very large potential, both societal and commercial, in the analysis, refinement, modeling, and visualization of these data sets

• Capacity to store, transfer, and search is not enough - analytics is critical

Additional business value of Analytics

• Predict and optimize business outcomes

• New services and applications, both for end-users and industry

• New value chains, where different actors can create and exchange new analysis services

A national Big Data Analytics initiative

① A strategic nation-wide research and innovation agenda

– Input from several sectors and application areas

– Both new businesses built on analytics applications and traditional industry

– Input from academia, both as developers and as users

② A national Big Data Analytics network

– Open to all interested parties

– Industry and academia with an active interest in Big Data Analytics

Focus areas

Analytics

Computation

Storage

Collection

Visualization

Control and planning

Current constellation

Research and development challenges

• Huge businesses are built on Big Data Analytics today, but a large number of issues must be resolved to fully realize the potential

• Three examples

Example 1: Large-scale physics experimentation

• Challenges: Scale (storage, computation), scalable analytics

Example 2: Social network mining

• Challenges: Unstructured data, biased data, data access

Example 3: Access network pattern mining

• Challenges: Integrity issues, distributed mining, service frameworks

Long term trends

• The currently dominant approach will continue to be successful, but will be complemented due to

– Too much data, unstructured data, noisy data

– Limited access: security, integrity, legal, and business

– Fast data generation, situation awareness

• The consequences are

– Analysis closer to data generation / collection

– No storage: catching information on the fly

– Distributed analysis with incomplete data

– Real time collection, real time analytics
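To make "no storage" and "distributed analysis" concrete, here is a minimal Python sketch, not taken from the talk, of a summary that is updated one value at a time and never keeps the raw data. It uses Welford's single-pass method for mean and variance, plus the standard formula for combining two partial summaries. The class name `RunningStats` and the two example sites are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Single-pass mean/variance (Welford's method); stores no raw data."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the current mean

    def update(self, x: float) -> None:
        """Fold one new observation into the summary."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "RunningStats") -> "RunningStats":
        """Combine two partial summaries computed at different sites."""
        n = self.n + other.n
        if n == 0:
            return RunningStats()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return RunningStats(n, mean, m2)

    @property
    def variance(self) -> float:
        """Sample variance; 0.0 until at least two values have been seen."""
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


# Two hypothetical collection points, each seeing only part of the data.
site_a, site_b = RunningStats(), RunningStats()
for x in [2.0, 4.0, 4.0, 4.0]:
    site_a.update(x)
for x in [5.0, 5.0, 7.0, 9.0]:
    site_b.update(x)

combined = site_a.merge(site_b)
print(combined.n, combined.mean, combined.variance)
```

Each site ships three numbers instead of its raw measurements, which also speaks to the limited-access point: the data itself never has to leave the place where it was collected.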

Research challenges

• Research challenges on different levels:

– The sensor/collection level

– The algorithmic/analytical level

– The system level

– The organisational level

Technical challenges, examples

• Computational and storage framework development

• Analysis of unstructured data

• Distributed analysis

• Efficient analysis algorithms

• Stream mining

• Managing sample bias

• Managing uncertain and missing data
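As one small illustration of the stream mining and sample bias items, the following Python sketch shows reservoir sampling (Algorithm R), a standard technique that is not described in the talk itself. It keeps a uniform random sample of fixed size from a stream of unknown length, so that later analysis is not skewed toward the start or the end of the stream. The function name and parameters are illustrative.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int,
                     rng: Optional[random.Random] = None) -> List[T]:
    """Return k items chosen uniformly at random from a stream, in one pass.

    Only k items are held in memory at any time. If the stream has fewer
    than k items, all of them are returned.
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Sample 5 values from a stream of 10,000 without storing the stream.
sample = reservoir_sample(range(10_000), 5, random.Random(0))
print(sample)
```

Uniform sampling only removes bias introduced by the order of arrival. Bias in what gets collected in the first place, as in the social network example, needs to be handled separately.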

Platform and organisational challenges, examples

• Service and analytics frameworks, exchanging models and data

• APIs and standards

• Privacy, integrity, security, and legal

• Business models

Contacts

• If you are interested in the Swedish Big Data Analytics Network, feel free to contact

Daniel Gillblad, dgi@sics.se, +46 8 633 15 68

Anders Holst, aho@sics.se, +46 8 633 15 93
