IBM Smarter Analytics
Big Data Adoption

Adrian Turcu
Big Data Architect
IBM Client Innovation Centers RoCEB

© 2015 IBM Corporation



The Mega Trends

- Mobile
- Social
- Cloud
- Analytics


Big Data: More than just volume

- Volume: Terabytes to exabytes of existing data to process
- Velocity: Streaming data, milliseconds to seconds to respond
- Variety: Structured, unstructured, text & multimedia
- Veracity: Uncertainty from inconsistency, ambiguities, etc.


Big Data & Analytics Value Proposition

The primary value from big data and analytics comes not from the data in its raw form, but from the processing and analysis of it and the insights, decisions, products, and services that emerge from analysis.


IBM’s Commitment to Big Data

$16 Billion in Big Data acquisitions

35 new acquisitions in the last 5 years

More than 1000 developers focused on Big Data technology development

IBM joins Apple, Twitter, and the Weather Company in strategic partnerships

Largest patent portfolio in the industry

IBM has the largest commercial research organization on Earth

‒ 200+ mathematicians developing breakthrough analytics

IBM’s Big Data business grew over 150% in 2014

IBM CEO Says ‘Big Data’ Is Company’s Top Priority

Commitment to Big Data Means… Commitment to Hadoop and Spark


IBM is Committed to Open Source

Open source technologies are the base for IBM software and solutions

IBM’s long history of deep open source commitment
- Apache Software Foundation: Founding member in 1999
- Cloud Foundry: #1 contributor; Basis for Bluemix
- OpenStack: #4 contributor; Basis for IBM’s IaaS
- Linux: #3 contributor; IBM first enterprise backer of Linux
- Hadoop/Spark: Extensive investment in open source contribution; Integration with Analytics software

Infrastructure

Systems

Application


IBM Investing in Four Catalysts for Big Data Adoption

Familiar Interfaces & Integration with Established Tools

Technical Standards

New Analytics Capabilities

Open Source Innovation


Apache Hadoop Ecosystem: Rapid Innovation, Few Standards

Distributions include different projects at different version levels

“This proliferation of baskets [Hadoop distributions with different project versions] creates significant drag when it comes to building reliable applications ... makes it harder for customers to assess which basket of Hadoop that they need and harder for application developers to create solutions that work broadly.”

– Raymie Stata, CEO, Altiscale

Even though the project versions match, there are interface differences

“If the industry is truly committed to developing big data technologies and solutions …, it will require an ecosystem of providers … to create a consistent framework around which everyone can develop.”

– Siki Giunta, SVP, Verizon

The Hadoop ecosystem is evolving at a faster pace than is comfortable

“My personal speculation is that it comes from some who have been evaluating for a while seeing change occur so rapidly that they are dropping back for another look.”

– Merv Adrian, VP, Gartner


Certify a standard “ODP Core” set of open source Hadoop family projects with specific versions and patch levels

Develop tools and methods to help solution providers to test applications against the ODP Core.

Contribute changes and fixes in the ODP Core Hadoop family projects to the ASF using the ASF processes.

http://opendataplatform.org/
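As a rough illustration of what testing a distribution against a fixed set of project versions could involve, here is a minimal Python sketch. The manifest contents, the version numbers, and the names `ODP_CORE_EXAMPLE` and `check_against_core` are hypothetical placeholders for this example. They are not the actual ODP Core specification or any ODP tooling.

```python
from typing import Dict, List

# Hypothetical core manifest: project name -> required version.
# These version strings are illustrative only, not the real ODP Core list.
ODP_CORE_EXAMPLE: Dict[str, str] = {
    "hadoop": "2.6.0",
    "ambari": "2.0.0",
}


def check_against_core(distribution: Dict[str, str],
                       core: Dict[str, str]) -> List[str]:
    """Return a list of human-readable mismatches (empty if compliant)."""
    problems = []
    for project, required in sorted(core.items()):
        actual = distribution.get(project)
        if actual is None:
            problems.append(f"{project}: missing (core requires {required})")
        elif actual != required:
            problems.append(f"{project}: found {actual}, core requires {required}")
    return problems


if __name__ == "__main__":
    # A made-up distribution with one extra project and one version drift.
    my_distribution = {"hadoop": "2.6.0", "ambari": "1.7.0", "spark": "1.3.1"}
    for line in check_against_core(my_distribution, ODP_CORE_EXAMPLE):
        print(line)
```

Projects outside the core manifest (such as `spark` in the made-up distribution) are ignored by this check, on the assumption that a core set constrains only the projects it names.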


Open Data Platform Initiative

Representation across the Hadoop ecosystem…
- Hadoop distribution vendors
- Software application providers
- System integrators/consultants
- Hardware vendors
- Customers

… who all believe in the need for a community-based effort to standardize Hadoop, which will lead to improved adoption


IBM Open Platform with Apache Hadoop (IOP)

100% open source code Apache Hadoop distribution
- Commitment to currency: “days, not months”
- Includes Spark

Free for production use
- Decoupled Apache Hadoop from IBM analytics and data science technologies
- Production support offering available

Apache Open Source Components: HDFS, YARN, MapReduce, Ambari, HBase, Spark, Flume, Hive, Pig, Sqoop, HCatalog, Solr/Lucene


Overview of BigInsights (v4.x)

IBM BigInsights Analyst
- Industry standard SQL (Big SQL)
- Spreadsheet-style tool (BigSheets)

IBM BigInsights Data Scientist
- Text Analytics
- Machine Learning on Big R
- Big R (R support)

IBM BigInsights Enterprise Management
- POSIX Distributed Filesystem
- Multi-workload, multi-tenant scheduling

IBM Open Platform with Apache Hadoop (HDFS, YARN, MapReduce, Ambari, HBase, Hive, Oozie, Parquet, Parquet Format, Pig, Snappy, Solr, Spark, Sqoop, ZooKeeper, OpenJDK, Knox, Slider)


IBM Open Platform with Apache Hadoop adopts ODP Core

BigInsights will include ODP certified Apache packages
- ODP will initially target core packages of a Hadoop distribution
- Packages will expand over time
- First certification set expected this summer

Our goal for BigInsights on ODP
- Better compatibility and less testing against ecosystem software
- Enable IBM Hadoop capabilities to run on other ODP-certified Hadoop distributions

Apache Open Source Components: HDFS, YARN, MapReduce, Ambari, HBase, Spark, Flume, Hive, Pig, Sqoop, HCatalog, Solr/Lucene

* Candidate set of certified ODP modules – expected summer 2015


Apache Spark Overview

Spark is an open-source, in-memory compute engine that is highly versatile across environments, enabling you to quickly build models, iterate faster, and apply deep intelligence everywhere.

Apache Spark is ideal for:
- Machine Learning
- Interactive analytics
- Data Science

Apache Spark components: Spark SQL, Spark Streaming, GraphX, MLlib (machine learning), SparkR

http://spark.apache.org


IBM | Spark - The Start of Something Big in Data and Design

Together, creating the platform for Data Science

Understand Business Goal

Data Profiling and Exploration

Train Algorithms

Consult Experts

Prepare data

App Dev, Deploy, Validate

Go live. Refresh.


IBM Analytic Platform Capabilities

IBM Software Integrates and Extends Hadoop and Spark

- Data Warehousing: PureData for Analytics, Operational Analytics
- Entity Extraction and Matching: Big Match
- Security and Compliance: Optim, Guardium Audit and Encryption
- Data Integration and Governance: Information Server
- Enterprise Search: Watson Explorer
- Real-time Analytics: Streams
- Predictive Modeling and Descriptive Statistics: SPSS, Big R and Scalable Algorithms
- Analysis, Reporting, and Exploration: Watson Analytics, Cognos, BigSheets
- Fast, ANSI SQL 2011, and Secure SQL: Big SQL
- Enterprise File System: GPFS-FPO
- Cluster Resource and Workload Management: Platform Symphony
- Large Scale Text Extraction: Big Text

IBM Open Platform with Apache Hadoop


What is IBM’s perspective on Spark?

IBM opens Spark Technology Center in San Francisco to foster innovation in the heart of the Spark community

IBM is forging key partnerships and building relationships with the creators of Spark
- Big Data University
- Spark certification and Spark social badge
- Databricks partnership
- AMPLab partnership

IBM Analytics Platform will unify on and around Spark to ensure robust integration and ease of use for our clients
- BigInsights “Spark-Inside”
- Spark as a Service on IBM Bluemix (beta in June 2015)
- Streams and Spark integration


Free Quick Start (non production):
• IBM Open Platform
• BigInsights Analyst, Data Scientist features
• Community support

http://g01zcdwas002.ahe.pok.ibm.com/software/data/infosphere/hadoop/trials.html


IBM’s Investment in the Big Data Community

Over 250,000 benefit from free Big Data skills training

http://bigdatauniversity.com


Big Data ≠


Watson is creating a new partnership between people and computers that enhances, scales and accelerates human expertise.


Brief History of IBM Watson

Phases: R&D, Demonstration, Commercialization, Cross-industry Applications, Expansion

Milestones:
- IBM Research Project (2006 – )
- Jeopardy! Grand Challenge (Feb 2011)
- Watson for Healthcare (Aug 2011 – )
- Watson Industry Solutions (2012 – )
- Watson for Financial Services (Mar 2012 – )


IBM Watson is cognitive computing

Watson understands me.

Watson engages me.

Watson learns and improves over time.

Watson helps me discover.

Watson establishes trust.

Watson has endless capacity for insight.

Watson operates in a timely fashion.

…built on a massively parallel Big Data scalable architecture


Many industries have a “discovery” challenge

Existing Discovery is Slow, Expensive, Ad hoc and Manual

Healthcare and Life sciences
- Drug discovery: ~12-15 yrs, $B per drug, 90+% fallout rate
- Areas: Drug Discovery, New Biology Science, New Bio-medical Research

Chemical and Petroleum
- Lithium ion Battery: ~20 years development time
- Areas: Oil Reservoir Discovery, Crop Sciences, New Energy Materials

Consumer Goods and Products
- Product formation: based on ad hoc manual trial & error
- Areas: Product Innovation, New Market Identification, New Partnerships

Semi-Conductor and Materials
- Water filtration: Billions still do not have clean water today
- Areas: Nano Materials, Energy Storage, Water Filtration


Bringing IBM Watson to market

Offerings: Watson Engagement Advisor, Watson Discovery Advisor, Watson Policy Advisor, Watson Decision Advisor

Applications: Watson for Wealth Management, Watson for Oncology, Chef Watson

Products: Watson Explorer, Watson Analytics, Watson Curator

Platform: Watson Zone on Bluemix, Watson Developer Cloud, Watson Tooling


IBM Watson Platform

WATSON DEVELOPER CLOUD
Synopsis: Delivers the tools, methodologies, software development kits and API(s) for ISVs to build the next generation of cognitive applications
Offering:
• Cloud based sandbox
• Hosting Services
• Self-service portal – API / Tooling / SDK / Methodology

WATSON CONTENT STORE
Synopsis: Provides sources of free and fee based content including public, industry and enterprise content
Offering:
• Starter Content
• General content
• Domain content
• Taxonomies
• Third-Party Content

WATSON TALENT HUB
Synopsis: Bridges developer resource gaps by providing a marketplace for critical cognitive skills
Offering:
• IBM subject matter experts (500+)
• Third-party specialists
• Certification
• Individual and project work


IBM Watson Services on Bluemix

- User Modeling: Personality profiling to help engage users on their own terms.
- Language Identification: Identifies the language in which text is written.
- Machine Translation: Globalize on the fly. Translate text from one language to another.
- Concept Expansion: Maps euphemisms or colloquial terms to more commonly understood phrases.
- Message Resonance: Communicate with people in a style and with words that suit them.
- Question Answer: Direct responses to user inquiries, fueled by primary document sources.
- Relationship Extraction: Intelligently finds relationships between sentence components (nouns, verbs, subjects, objects, etc.).
- Visualization Rendering: Graphical representations of data analysis for easier understanding.


The Watson Experience Manager (WEM)

With Watson Experience Manager:

• Developers use APIs to access and test “Powered by Watson” apps

• Data scientists can manage the content used to enrich Watson

• User experience developers can customize or create user interaction models with Watson

• Domain experts can train and test their “Powered by Watson” apps

WEM provides a role-based set of tools for SMEs, Watson administrators, and domain experts


Building your “Powered by Watson” app

- Access Watson Developer Cloud using Watson Experience Manager
- Develop app “Powered by Watson” using APIs
- Enrich Watson with content
- Train Watson using tools and experts
- Test app: functional and non-functional
- Deploy application


Let’s Get Started with IBM Watson!

To partner with Watson, you need to:
- Be committed to training
- Have an accessible corpus of information
- Identify a clear problem to solve

Let’s Get Started
- Get a Bluemix account
- Try the Watson services free for 30 days
- Take the next step towards development or production deployment


Investing and Educating

www.ibmbigdatahub.com
www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/
www.ibm.com/cloud-computing/bluemix/


Questions?
