Upload
ibm-switzerland
View
353
Download
1
Tags:
Embed Size (px)
Citation preview
© 2013 IBM Corporation
IBM Big Data Platform
Turning big data into smarter decisions
Corinne Fabre - Vincent Jeansoulin
© 2013 IBM Corporation© 2013 IBM Corporation
6,000,000 users on Twitter
pushing out 300,000 tweets per day
500,000,000 users on Twitter
pushing out 400,000,000tweets per day
83x
1333x
Twitter generates over 12 TB of tweet data every single day
07/06/2013 IBM Confidential2
© 2013 IBM Corporation© 2013 IBM Corporation
“At the World Economic
Forum last month in Davos,
Switzerland, Big Data was a
marquee topic. A report by the
forum, “Big Data, Big Impact,”
declared data a new class of
economic asset, like
currency or gold.
“Companies are being
inundated with data—from
information on customer-buying
habits to supply-chain efficiency.
But many managers struggle to
make sense of the numbers.”
“Increasingly, businesses are
applying analytics to social
media such as Facebook and
Twitter, as well as to product
review websites, to try to
“understand where customers are,
what makes them tick and what
they want”, says Deepak Advani,
who heads IBM’s predictive
analytics group.”
“Big Data has arrived at Seton
Health Care Family, fortunately
accompanied by an
analytics tool that will help
deal with the complexity of
more than two million
patient contacts a year�”
“Data is the new oil.”
Clive Humby
The Oscar Senti-meter — a tool
developed by the L.A. Times, IBM
and the USC Annenberg
Innovation Lab — analyzes
opinions about the Academy
Awards race shared in millions
of public messages on Twitter.”
“Data is the new Oil.”In its raw form, oil has little value. Once processed & refined, it helps power the world.
“0now Watson is being put to
work digesting millions of
pages of research,
incorporating the best clinical
practices and monitoring the
outcomes to assist physicians in
treating cancer patients.”
© 2013 IBM Corporation© 2013 IBM Corporation
Imagine the Possibilities of Harnessing your Data Resources
Retailer reduces time to
run queries by 80% to optimize inventory
Stock Exchange cuts queries from 26 hours to
2 minutes on 2 PB
Government cuts acoustic
analysis from hours to 70 Milliseconds
Utility avoids power failures by analyzing 10 PB of data in minutes
Telco analyses streaming network data to reduce
hardware costs by 90%
Hospital analyzes streaming
vitals to intervene 24 hours earlier
Big data challenges exist in every business today
© 2013 IBM Corporation© 2013 IBM Corporation
The characteristics of big data
Collectively Analyzing the
broadening Variety
Responding to the
increasing Velocity
Cost efficiently processing the
growing Volume
Establishing the
Veracity of big data sources
30 Billion RFID sensors and counting
1 in 3 business leaders don’t trust the information they use to make decisions
50x 35 ZB
2020
80% of the worlds data is unstructured
2010
© 2013 IBM Corporation© 2013 IBM Corporation
Big data is a hot topic because technology makes it
possible to analyze ALL available data
Cost effectively manage and analyze all available data,
in its native form – unstructured, structured, streaming
ERPCRM RFID
Website
Network Switches
Social Media
Billing
© 2013 IBM Corporation© 2013 IBM Corporation
In order to realize new opportunities, you need to think
beyond traditional sources of data
Transactional & Application Data
Machine Data Social Data
• Volume
• Structured
• Throughput
• Velocity
• Semi-structured
• Ingestion
• Variety
• Highly unstructured
• Veracity
Enterprise Content
• Variety
• Highly unstructured
• Volume
© 2013 IBM Corporation© 2013 IBM Corporation
8
Hadoop
Streaming
Data
New
Sources
Unstructured
Exploratory
Iterative
StructuredRepeatable
Linear
Data
Warehouse
Traditional
Sources
Traditional ApproachStructured, analytical, logical
New ApproachCreative, holistic thought, intuition
Enterprise
Wide
Integration
Web logs, URLs
Social data
Text Data: emails, chats
RFID, sensor data
Network data
Internal App Data
Transaction Data
ERP data
Mainframe Data
OLTP System Data
Analysis expanding from enterprise data to big data, creating
new cost-effective opportunities for competitive advantage
© 2013 IBM Corporation© 2013 IBM Corporation
Big Data Exploration
Find, visualize, understand all big data to improve business knowledge
Enhanced 360o Viewof the CustomerAchieve a true unified view, incorporating internal and external sources
Operations AnalysisAnalyze a variety of machinedata for improved business results
Data Warehouse AugmentationIntegrate big data and data warehouse capabilities to increase operational efficiency
Security/Intelligence ExtensionLower risk, detect fraud and monitor cyber security in real-time
The 5 High Value Big Data Use Cases
© 2013 IBM Corporation© 2013 IBM Corporation
Big Data Exploration: Needs
Struggling to manage
and extract value from
the growing 3 V’s of
data in the enterprise;
need to unify
information across
federated sources
Inability to relate “raw” data
collected from system logs,
sensors, clickstreams, etc.,
with customer and line-of-
business data managed in
enterprise systems
Risk of exposing unsecure
personally identifiable
information (PII) and/or
privileged data due to lack
of information awareness
Find, visualize, understand all big datato improve decision making
© 2013 IBM Corporation© 2013 IBM Corporation
What’s Possible – Integrated Service Solutions
11
© 2013 IBM Corporation© 2013 IBM Corporation
Enhanced 360º View of the Customer: Needs
Need a deeper
understanding of
customer sentiment
from both internal and
external sources
Extend existing customer views (MDM, CRM, etc.) by incorporating additional internal and external information sources
Desire to increase
customer loyalty and
satisfaction by
understanding what
meaningful actions
are needed
Challenged getting the
right information to the
right people to provide
customers what they
need to solve problems,
cross-sell & up-sell
© 2013 IBM Corporation© 2013 IBM Corporation
Enhanced 360º View of the Customer: Illustrated
MasterDataManagement
Unified View of Party’s Information
CRM
J Robertson
Pittsburgh, PA 15213
35 West 15th
Name:
Address:
Address:
ERP
Janet Robertson
Pittsburgh, PA 15213
35 West 15th St.
Name:
Address:
Address:
Legacy
Jan Robertson
Pittsburgh, PA 15213
36 West 15th St.
Name:
Address:
Address:
SOURCE SYSTEMS
Janet
35 West 15th St
Pittsburgh
Robertson
PA / 15213
F
48
1/4/64
First:
Last:
Address:
City:
State/Zip:
Gender:
Age:
DOB:
360° View of Party Identity
BigInsights Streams Warehouse
Unified View of Party’s Information
© 2013 IBM Corporation© 2013 IBM Corporation
Security/Intelligence Extension: Needs
Enhanced
Intelligence &
Surveillance Insight
Real-time Cyber
Attack Prediction &
Mitigation
Analyze network traffic to:
• Discover new threats early
• Detect known complex threats
• Take action in real-time
Analyze Telco & social data to:
• Gather criminal evidence
• Prevent criminal activities
• Proactively apprehend criminals
Crime prediction &
protection
Security/Intelligence Extension enhances traditional security solutions by analyzing all types and sources of under-leveraged data
Analyze data-in-motion & at rest to:
• Find associations
• Uncover patterns and facts
• Maintain currency of information
© 2013 IBM Corporation© 2013 IBM Corporation15
TerraEchos Turns to IBM
Big Data for Low Latency
Surveillance Data Analysis
• Deployed security surveillance system to detect,
classify, locate, and track potential threats at
highly sensitive national lab
• Stream computing collects and analyzes acoustic
data from fiber-optic sensor arrays
• Analyzed acoustic data fed into TerraEchos
intelligence platform for threat detection,
classification, prediction & communication
Results
• Enables Terraechos solution to analyze and
classify streaming acoustic data in real-time
• Provides lab & security staff with holistic view of
potential threats & non-issues
• Enables a faster and more intelligent response to
any threat
15
Capabilities Utilized
Stream Computing
Identifies and classifies potential security threats
miles away
Public & Safety
© 2013 IBM Corporation© 2013 IBM Corporation
Operations Analysis: Needs
• Gain real-time visibility into operations,
customer experience, transactions and
behavior
• Proactively plan to increase operational
efficiency
Analyze a variety of machinedata for improved business results
Because of the complexity and rapid growth of
machine data, many companies make decisions
on a small fraction of the information available to
them
The ability to analyze machine data and combine
it with enterprise data for a full view can enable
organizations to:
• Identify and investigate anomalies
• Monitor end-to-end infrastructure to
proactively avoid service degradation or
outages
© 2013 IBM Corporation© 2013 IBM Corporation17
KTH Swedish Royal Institute
of Technology Reducing
Traffic Congestion
• Deployed real-time Smarter Traffic system to
predict and improve traffic flow.
• Analyzes streaming real-time data gathered from
cameras at entry/exit to city, GPS data from taxis
and trucks, and weather information.
• Predicts best time and method to travel such as
when to leave to catch a flight at the airport
Results
• Enables ability to analyze and predict traffic
faster and more accurately than ever before
• Provides new insight into mechanisms that affect
a complex traffic system
• Smarter, more efficient, and more
environmentally friendly traffic
17
Capabilities Utilized
Stream Computing
Public & Transporation
© 2013 IBM Corporation© 2013 IBM Corporation
Integrate big data and data warehouse capabilities to increase operational efficiency
Data Warehouse Augmentation: Needs
Need to leverage variety of data Extend warehouse infrastructure
• Optimized storage, maintenance and licensing
costs by migrating rarely used data to Hadoop
• Reduced storage costs through smart
processing of streaming data
• Improved warehouse performance by
determining what data to feed into it
• Structured, unstructured, and streaming
data sources required for deep analysis
• Low latency requirements
(hours—not weeks or months)
• Required query access to data
© 2013 IBM Corporation© 2013 IBM Corporation19
Vestas optimizes capital
investments based on 2.5
Petabytes of information
Results
� Model the weather to optimize placement
of turbines, maximizing power generation
and longevity.
� Reduce time required to identify placement
of turbine from weeks to hours.
� Incorporate 2.5 PB of structured and semi-
structured information flows. Data volume
expected to grow to 6 PB.
19
Capabilities Utilized
Hadoop System
Reduced turbine placement identification from weeks to
hours
Utility
© 2013 IBM Corporation© 2013 IBM Corporation
An example of the big data platform in practice
Ingest
Landing and Analytics Sandbox Zone
Indexes,
facets
Hive/HBase
Col Stores
Documents
In Variety
of Formats
Analytics
MapReduce
Repository, Workbench
Ingestion and Real-time Analytic Zone
Data
Sinks
Filter, Transform
Ingest
Correlate, Classify
Extract, Annotate
Warehousing Zone
Enterprise
Warehouse
Data Marts
Query
Engines
Cubes
Descriptive,
Predictive
Models
Models
Widgets
Discovery,
Visualizer
Search
Analytics and
Reporting Zone
Metadata and Governance Zone
20
Connectors
© 2013 IBM Corporation© 2013 IBM Corporation
The IBM Big Data Platform
� Process any type of data
– Structured, unstructured, in-
motion, at-rest
� Built-for-purpose engines
– Designed to handle different
requirements
� Analyze data in motion
� Manage and govern data in the
ecosystem
� Enterprise data integration
� Grow and evolve on current
infrastructure
21
Solutions
IBM Big Data Platform
Analytics and Decision Management
Big Data Infrastructure
Accelerators
Information Integration & Governance
Hadoop
System
Stream
Computing
Data
Warehouse
Systems
Management
Application
Development
Visualization
& Discovery
© 2013 IBM Corporation© 2013 IBM Corporation
The IBM Big Data Platform
BI / Reporting
BI / Reporting
Exploration / Visualization
FunctionalApp
IndustryApp
Predictive Analytics
Content Analytics
Analytic Applications
Big Data Platform
Systems Management
Application Development
Visualization & Discovery
Accelerators
Information Integration & Governance
HadoopSystem
Stream Computing
Data Warehouse
2 – Analyze Raw Rata
InfoSphere BigInsights
5 – Analyze Streaming Data
InfoSphere Streams
1 – Unlock Big Data
IBM DataExplorer3 – Simplify your warehouse
Netezza/PureData
4 – Reduce costs with Hadoop
InfoSphere BigInsights
22
6 – Analyse your content
IBM CCI/SMAIBM Content Anayltique7 – Report
IBM Cognos8 – Predict IBM SPSS
© 2013 IBM Corporation© 2013 IBM Corporation
Get Started on Your Big Data Journey TodayGet Educated
• IBM Big Data: ibm.com/bigdata
• IBMBigDataHub.com
• BigDataUniversity.com
• IBV study on big data
• Books / analyst papers
Schedule a Big Data Workshop
• Free of charge
• Best practices
• Industry use cases
• Business uses
• Business value assessment
© 2013 IBM Corporation© 2013 IBM Corporation
Disclaimer
• No part of this document may be reproduced or transmitted in any form without written permission from IBM Corporation.
• Product data has been reviewed for accuracy as of the date of initial publication. Product data is subject to change without notice.
• This information could include technical inaccuracies or typographical errors. IBM may make improvements and/or changes in the
• product(s) and/or program(s) at any time without notice. Any statements regarding IBM's future direction and intent are subject to
• change or withdrawal without notice, and represent goals and objectives only.
• The performance data contained herein was obtained in a controlled, isolated environment. Actual results that may be obtained in
• other operating environments may vary significantly. While IBM has reviewed each item for accuracy in a specific situation, there is no
• guarantee that the same or similar results will be obtained elsewhere. Customer experiences described herein are based upon
• information and opinions provided by the customer. The same results may not be obtained by every user.
• Reference in this document to IBM products, programs, or services does not imply that IBM intends to make such products, programs
• or services available in all countries in which IBM operates or does business. Any reference to an IBM Program Product in this
• document is not intended to state or imply that only that program product may be used. Any functionally equivalent program, that does
• not infringe IBM's intellectual property rights, may be used instead. It is the user's responsibility to evaluate and verify the operation on
• any non-IBM product, program or service.
• THE INFORMATION PROVIDED IN THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT ANY WARRANTY, EITHER EXPRESS OR
• IMPLIED. IBM EXPRESSLY DISCLAIMS ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR
• INFRINGEMENT. IBM shall have no responsibility to update this information. IBM products are warranted according to the terms and
• conditions of the agreements (e.g. IBM Customer Agreement, Statement of Limited Warranty, International Program License Agreement,
• etc.) under which they are provided. IBM is not responsible for the performance or interoperability of any non-IBM products discussed
• herein.
• Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other
• publicly available sources. IBM has not tested those products in connection with this publication and cannot confirm the accuracy of
• performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should
• be addressed to the suppliers of those products.