
Predictive Analytics: Extracting Big Value from Big Data

Page 1: Predictive Analytics: Extracting Big Value from Big Data

#1 Modern Platform to Turn Data into a Strategic Asset

©2016 RapidMiner, Inc. All rights reserved.

May 24, 2016

Featuring Howard Dresner

Predictive Analytics: Extracting Big Value from Big Data

Page 2: Predictive Analytics: Extracting Big Value from Big Data

Speakers

• Howard Dresner, Chief Research Officer, Dresner Advisory Services

• Lars Bauerle, Chief Product Officer, RapidMiner

Page 3: Predictive Analytics: Extracting Big Value from Big Data

Housekeeping

• A recording will be available within 1-2 business days; a link will be emailed to you

• You may type your questions in the Questions panel on the screen at any time

• We will leave time at the end for a Q&A session

Page 4: Predictive Analytics: Extracting Big Value from Big Data

Dresner Advisory Services

Advanced and Predictive Analytics and Big Data

Copyright 2016 Dresner Advisory Services, LLC

www.dresneradvisory.com

Page 5: Predictive Analytics: Extracting Big Value from Big Data

Definitions

Advanced and Predictive Analytics: includes statistics, modeling, machine learning, and data mining used to analyze facts and make predictions about future, or otherwise unknown, events.

We define big data analytics as systems that enable end-user access to and analysis of data contained and managed within the broader Hadoop ecosystem.

Page 6: Predictive Analytics: Extracting Big Value from Big Data

Technologies and Initiatives Strategic to Business Intelligence

[Stacked bar chart, 0% to 100%, rated Critical / Very important / Important / Somewhat important / Not important, for the following items:]

• Dashboards
• End-user "self service"
• Data warehousing
• Advanced visualization
• Integration with operational processes
• Data discovery
• Enterprise planning/budgeting
• Data mining, advanced algorithms, predictive
• Embedded BI (contained within an application, portal, etc.)
• Mobile device support
• End-user data "blending" (data mashups)
• Location intelligence/analytics
• Collaborative support for group-based analysis
• In-memory analysis
• Software-as-a-service and cloud computing
• Search-based interface
• Pre-packaged vertical/functional analytical applications
• Big data (e.g., Hadoop)
• Ability to write to transactional applications
• Text analytics
• Social media analysis (Social BI)
• Open source software
• Complex event processing (CEP)
• Internet of Things (IoT)
• Cognitive BI (e.g., artificial intelligence-based BI)

Page 7: Predictive Analytics: Extracting Big Value from Big Data

Importance of Advanced and Predictive Analytics and Big Data

[Stacked column chart, 0% to 100%. Columns: Advanced and predictive analytics; Big data. Legend: Not important, Somewhat important, Important, Very important, Critical. Data labels as extracted: 20%, 9%, 30%, 18%, 24%, 22%, 16%, 23%, 11%, 28%.]

Page 8: Predictive Analytics: Extracting Big Value from Big Data

Current Deployment of Advanced and Predictive Analytics

[Bar chart, axis 0% to 40%. Data labels as extracted: 27%, 17%, 37%, 19%.]

Page 9: Predictive Analytics: Extracting Big Value from Big Data

Deployment Plans for Advanced and Predictive Analytics

• Will adopt in 2015: 17%
• Will adopt in 2016: 23%
• Will adopt beyond 2016: 33%
• No plan: 27%

Page 10: Predictive Analytics: Extracting Big Value from Big Data

Current Deployment of Big Data

[Bar chart, axis 0% to 50%]

• Yes. We use big data today: 17%
• We may use big data in the future: 47%
• No. We have no plans to use big data at all: 36%

Page 11: Predictive Analytics: Extracting Big Value from Big Data

Deployment Plans for Big Data

• Will adopt in 2015: 4%
• Will adopt in 2016: 27%
• Will adopt beyond 2016: 69%

Page 12: Predictive Analytics: Extracting Big Value from Big Data

Features for Advanced and Predictive Analytics

[Stacked bar chart, 0% to 100%, rated Critical / Very important / Important / Somewhat important / Unimportant, for the following features:]

• Range of regression models, from linear, logistic to nonlinear
• Hierarchical clustering, expectation maximization, k-Means, and variants of self-organizing maps
• Textbook statistical functions for descriptive statistics
• Geospatial analysis
• Text analytic functions and sentiment analysis
• Bayesian methods, including Naïve Bayes and Bayesian Networks
• Recommendation engine included
• Automatic feature selection like principal component analysis (PCA)
• Support vector machine (SVM) approaches for classification and estimation
• Neural networks supported
• Various approaches to CART (e.g. ID3, C4.5, CHAID, MARS, random forests, gradient boosting)
• Ensemble learning

Page 13: Predictive Analytics: Extracting Big Value from Big Data

Big Data Use Cases

[Stacked bar chart, 0% to 100%, rated Critical / Very important / Important / Somewhat important / Not important, for the following use cases:]

• Data warehouse optimization
• Customer/social analysis
• Internet of Things
• Clickstream analytics
• Fraud detection

Page 14: Predictive Analytics: Extracting Big Value from Big Data

Usability for Advanced and Predictive Analytics

[Stacked bar chart, 0% to 100%, rated Critical / Very important / Important / Somewhat important / Unimportant, for the following items:]

• Fast cycle time for analysis with data preparation functions
• Access to advanced analytics for predictive and temporal analysis
• Simple process for continuous modification of models
• Support for easy iteration
• Support for entire process in a single application/user interface
• Support/guidance in preparing data analytical models
• Pre-built drag and drop macros and tools from R that require no scripting or programming
• Automatic creation of models from data
• A specialist *NOT* required to create analytical models, test and run them

Page 15: Predictive Analytics: Extracting Big Value from Big Data

Usability 2014 to 2015

[Bar chart comparing 2015 and 2014 scores, axis 2.9 to 3.7, for the following items:]

• A specialist *NOT* required to create analytical models, test and run them
• Automatic creation of models from data
• Support for entire process in a single application/user interface
• Support for easy iteration
• Pre-built drag and drop macros and tools from R that require no scripting or programming
• Simple process for continuous modification of models
• Support/guidance in preparing data analytical models
• Fast cycle time for analysis with data preparation functions
• Access to advanced analytics for predictive and temporal analysis

Page 16: Predictive Analytics: Extracting Big Value from Big Data

Data Preparation for Advanced and Predictive Analytics

[Stacked bar chart, 0% to 100%, rated Critical / Very important / Important / Somewhat important / Unimportant, for the following items:]

• Set operations e.g., joins, aggregations or pivot tables
• Detection of duplicates or outliers
• Cleansing and enrichment of source data
• Complex filtering
• Support for data type conversions
• Support for cutting, merging, and replacing of values

Page 17: Predictive Analytics: Extracting Big Value from Big Data

Data Preparation 2014 to 2015

[Bar chart comparing 2015 and 2014 scores, axis 3 to 4.2, for the following items:]

• Complex filtering
• Support for data type conversions
• Detection of duplicates or outliers
• Cleansing and enrichment of source data
• Support for cutting, merging, and replacing of values
• Set operations e.g., joins, aggregations or pivot tables

Page 18: Predictive Analytics: Extracting Big Value from Big Data

Advanced and Predictive Analytics Vendor Ratings

[Chart with scale ticks 2, 4, 8, 16, 32, 64. Series: Features, Data Prep, Usability, Scale & Integration, Total score. Vendors:]

• SAP (1st)
• SAS (1st)
• RapidMiner (2nd)
• Dell Software (3rd)
• IBM (4th)
• Birst (5th)
• Oracle (5th)
• TIBCO (5th)
• Information Builders
• Microsoft
• Pentaho
• OpenText
• MicroStrategy

Page 19: Predictive Analytics: Extracting Big Value from Big Data

Big Data Analytics Vendor Ratings

[Chart with scale ticks 1, 10, 100. Series: Infrastructure, Data Access, Search, Distributions, Machine Learning, Total. Vendors:]

• RapidMiner
• Datameer
• Clearstory
• MicroStrategy
• Pentaho
• Domo
• Platfora
• OpenText
• SAP
• Tableau
• Logi
• Qlik
• JinfoNet
• Dundas
• Information Builders
• Microsoft
• Gooddata
• IBM

Page 20: Predictive Analytics: Extracting Big Value from Big Data

Dresner Advisory Services

Advanced and Predictive Analytics and Big Data

www.dresneradvisory.com

Page 21: Predictive Analytics: Extracting Big Value from Big Data

#1 Modern Platform to Turn Data into a Strategic Asset

May 24, 2016

Lars Bauerle, Chief Product Officer

RapidMiner for Advanced/Predictive Analytics and Big Data

Page 22: Predictive Analytics: Extracting Big Value from Big Data

RapidMiner is #1 OPEN SOURCE

• Leader (2016, 2015 & 2014): Gartner Magic Quadrant for Advanced Analytics Platforms
• Strong Performer (2015): Forrester Wave on Big Data Predictive Analytics
• Innovation Winner (2015): Wisdom of Crowds for Advanced & Predictive Analytics, Big Data Analytics & End-User Data Preparation
• #1 Open-Source Platform (2015, 2014, 2013): Data Mining & Analytics Software Poll

Page 23: Predictive Analytics: Extracting Big Value from Big Data

RapidMiner is UNIQUE

• Open-Source Innovation: Cutting-edge data science platform designed for the Big Data era
• Frictionless Operationalization: Prescriptive analytics closes the loop between insight & action
• Lightning-Fast Data Science: Seamless orchestration accelerates predictive analytics lifecycle
• Self-Service Predictive Analytics: Effortless & guided design democratizes data science

Page 24: Predictive Analytics: Extracting Big Value from Big Data

ACCELERATES Time-to-Value

• DATA PREP: Speed & optimize ALL data exploration, blending & cleansing tasks
• MODEL & VALIDATE: Rapidly prototype and confidently validate predictive models
• OPERATIONALIZE: Easily deploy & maintain models and embed analytic results

CONNECT TO ANY DATA SOURCE, ANY FORMAT, AT ANY SCALE

SUPPORT FOR ALL MAJOR BI, DATA VISUALIZATION & BUSINESS APPLICATIONS

Page 25: Predictive Analytics: Extracting Big Value from Big Data

STREAMLINED Data Preparation

Speed & optimize ALL data exploration, blending & cleansing tasks

• A powerful chart engine offers statistical overviews, graphs & charts for data exploration
• Rapidly import, combine and transform structured & unstructured data for deeper predictive insights
• Accelerate advanced data blending tasks with powerful feature weighting, selection & generation
• Expertly cleanse data with anomaly & outlier detection, missing value handling and normalization
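
The cleansing steps named on this slide (missing value handling, outlier detection, normalization) can be illustrated with a minimal, generic Python sketch. It is not RapidMiner's implementation: the function names, the median fill, the modified z-score rule with a 3.5 cutoff, and the sample values are all assumptions chosen for illustration.

```python
from statistics import median

def impute_missing(values):
    """Replace None with the median of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def flag_outliers(values, threshold=3.5):
    """Flag values whose modified z-score (based on the median absolute
    deviation) exceeds the threshold. The 3.5 cutoff is a common convention."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [abs(0.6745 * (v - med) / mad) > threshold for v in values]

def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical column with one missing value and one obvious outlier.
raw = [10.0, 12.0, None, 11.0, 13.0, 500.0, 12.0]
filled = impute_missing(raw)
flags = flag_outliers(filled)
cleaned = [v for v, is_outlier in zip(filled, flags) if not is_outlier]
normalized = min_max_normalize(cleaned)
```

The median is used for both the fill value and the outlier rule so that a single extreme value does not distort the other steps.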

Page 26: Predictive Analytics: Extracting Big Value from Big Data

POWERFUL Modeling & Validation

Rapidly prototype and confidently validate predictive models

• Breadth of machine learning functions enhances supervised & unsupervised learning
• Automatic techniques for model building, selection & optimization simplify each step in the process
• Prescriptive algorithms, optimization loops & guided recommendations reveal optimal actions
• Modular cross-validation & honest performance calculations ensure that results will deliver the expected outcome
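
As an illustration of what cross-validation with "honest" performance calculation means, the sketch below scores a model only on rows it never saw during training. It is a generic Python example, not RapidMiner code: the nearest-centroid classifier, the function names, and the synthetic two-class data are assumptions made for the example.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle row indices and split them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def train_centroids(rows):
    """Toy model: the mean feature value per class label."""
    sums, counts = {}, {}
    for x, label in rows:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict(centroids, x):
    """Predict the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

def cross_validate(rows, k=5):
    """Return one accuracy per fold, each measured on held-out rows only."""
    scores = []
    for held_out in k_fold_indices(len(rows), k):
        held = set(held_out)
        train = [r for i, r in enumerate(rows) if i not in held]
        test = [rows[i] for i in held_out]
        model = train_centroids(train)  # the model never sees the test rows
        correct = sum(predict(model, x) == label for x, label in test)
        scores.append(correct / len(test))
    return scores

# Synthetic data: two classes drawn from well-separated normal distributions.
rng = random.Random(1)
rows = [(rng.gauss(0, 1), "a") for _ in range(100)]
rows += [(rng.gauss(4, 1), "b") for _ in range(100)]
scores = cross_validate(rows, k=5)
mean_accuracy = sum(scores) / len(scores)
```

Reporting the mean of the per-fold scores, rather than accuracy on the training data, is what keeps the estimate from being optimistic.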

Page 27: Predictive Analytics: Extracting Big Value from Big Data

FRICTIONLESS Operationalization: the details

Easily deploy & maintain models and embed analytic results

• Scheduled or event-driven model execution supports human decisions & automated actions
• Embed results into data visualizations, business applications & web services
• Dynamically manage models to ensure continued updates and accuracy including tuning, versioning & alerting
• Support for cloud, big data/Hadoop & server based infrastructure for separation of design and execution

Page 28: Predictive Analytics: Extracting Big Value from Big Data

Demo

Page 29: Predictive Analytics: Extracting Big Value from Big Data

Big Data - Hadoop

Page 30: Predictive Analytics: Extracting Big Value from Big Data

Big Data – Hadoop Challenges

• How to EXTRACT VALUE from Hadoop
  – What to actually do with all the data being collected
  – There is opportunity to improve the business in there
  – But, how do we do it?

• SKILLS GAP is a major adoption inhibitor
  – Lots of technology
  – Rapidly changing
  – Very technical - programming

Page 31: Predictive Analytics: Extracting Big Value from Big Data

Sampling

Grid Computing

Native Distributed Algorithms

Different Approaches to Big Data Analytics

Page 32: Predictive Analytics: Extracting Big Value from Big Data

Sampling

Grid Computing

Native Distributed Algorithms

Approach 1: Sampling

Page 33: Predictive Analytics: Extracting Big Value from Big Data

Approach 1: Sampling

• Data Movement & Processing
  • Pulls sample data from HDFS/Hive/Impala
  • In the analytics tool (DV, PA, programming)

• When to use it
  + Only data exploration / data understanding
  + Early prototyping on prepared and clean data
  + Machine Learning modeling with very few and basic patterns (e.g. only a handful of columns and binary prediction target)

• When NOT to use it
  − Large number of columns in the data
  − Need to blend large data sets (e.g. large-scale joins)
  − Complex Machine Learning models

[Diagram: pieces of data are pulled out of Hadoop into the analytics tool, which performs the calculations]
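
A minimal Python sketch of the sampling approach follows: a fixed-size uniform sample is drawn from a large row source, and all calculations then run locally on the sample. The generator `big_table` is a hypothetical stand-in for a Hadoop table, and reservoir sampling is one possible sampling technique chosen for illustration; the slides do not say how the sample is drawn.

```python
import random
from statistics import mean

def reservoir_sample(rows, k, seed=0):
    """Keep a uniform random sample of k rows from an iterable whose
    length is not known in advance (reservoir sampling, Algorithm R)."""
    rng = random.Random(seed)
    sample = []
    for i, row in enumerate(rows):
        if i < k:
            sample.append(row)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = row
    return sample

def big_table(n):
    """Hypothetical stand-in for a large table: yields (id, value) rows."""
    for i in range(n):
        yield (i, i % 100)

# Only 1,000 of the 100,000 rows ever reach the "analytics tool".
sample = reservoir_sample(big_table(100_000), k=1_000)
sample_mean = mean(value for _, value in sample)
```

The estimate is only as good as the sample, which is why the slide limits this approach to exploration, prototyping, and simple patterns.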

Page 34: Predictive Analytics: Extracting Big Value from Big Data

Sampling

Grid Computing

Native Distributed Algorithms

Approach 2: Grid Computing

Page 35: Predictive Analytics: Extracting Big Value from Big Data

Approach 2: Grid computing

• Data Movement and Processing
  • Only results are moved, data remains in Hadoop
  • Custom single-node application running on multiple Hadoop nodes

• When to use it
  + Task can be performed on smaller, independent data subsets
  + Complex data pre-processing

• When NOT to use it
  – Complex Machine Learning models
  – Lots of interdependencies between data subsets

[Diagram: copies of the application (App) run the calculations on the Hadoop nodes; the application results are returned to the analytics tool]
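
The grid computing pattern can be sketched in Python as the same single-node function applied to independent data subsets, with only the small per-subset results sent back and combined. In this sketch, threads stand in for Hadoop nodes and in-memory lists stand in for data blocks; both, along with the function names, are assumptions for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor

def summarize_partition(rows):
    """Single-node logic, run unchanged on each independent subset."""
    values = [value for _, value in rows]
    return {"count": len(values), "total": sum(values), "max": max(values)}

def combine(results):
    """Merge the small per-partition results in the analytics tool."""
    return {
        "count": sum(r["count"] for r in results),
        "total": sum(r["total"] for r in results),
        "max": max(r["max"] for r in results),
    }

# Four hypothetical partitions of 1,000 (id, value) rows each.
partitions = [
    [(i, i % 7) for i in range(start, start + 1000)]
    for start in range(0, 4000, 1000)
]

# Each worker sees only its own partition; no worker needs another's data.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_results = list(pool.map(summarize_partition, partitions))

overall = combine(partial_results)
```

This works because counts, totals, and maxima can be computed per subset and merged afterwards; tasks whose subsets depend on each other do not decompose this way.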

Page 36: Predictive Analytics: Extracting Big Value from Big Data

Sampling

Grid Computing

Native Distributed Algorithms

Approach 3: Native Distributed Algorithms

Page 37: Predictive Analytics: Extracting Big Value from Big Data

Approach 3: Native distributed algorithms

• Data Movement and Processing
  • Only results are moved, data remains in Hadoop
  • Executed by native Hadoop tools: Hive, Spark, H2O, Pig, MapReduce, etc.

• When to use it
  + Complex Machine Learning models needed
  + Lots of interdependencies inside the data (e.g. graph analytics)
  + Need to blend and cleanse large data sets (e.g. large-scale joins)

• When NOT to use it
  − Data is not that large
  − Sample would reveal all interesting patterns
  − You don’t want to do a lot of programming in multiple languages

[Diagram: the analytics tool pushes instructions to Hadoop, the calculations run in Hadoop, and the results are returned]
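
To show why some models need a distributed algorithm rather than independent per-subset jobs, the sketch below simulates distributed gradient descent in a single Python process. In every iteration each partition computes a partial gradient from the current global parameters, and only those small sums are aggregated. It does not use Spark or any Hadoop tool; the function names, the linear model, and the synthetic data are assumptions for illustration.

```python
def partial_gradient(rows, w, b):
    """Runs where the data lives: returns gradient sums, never the rows."""
    grad_w = grad_b = 0.0
    for x, y in rows:
        error = (w * x + b) - y
        grad_w += error * x
        grad_b += error
    return grad_w, grad_b, len(rows)

def fit(partitions, learning_rate=0.5, iterations=500):
    """Driver loop: broadcast parameters, aggregate partial gradients, update."""
    w = b = 0.0
    for _ in range(iterations):
        parts = [partial_gradient(p, w, b) for p in partitions]
        n = sum(count for _, _, count in parts)
        w -= learning_rate * sum(gw for gw, _, _ in parts) / n
        b -= learning_rate * sum(gb for _, gb, _ in parts) / n
    return w, b

# Synthetic data following y = 3x + 1, split across four partitions.
data = [(i / 100, 3 * (i / 100) + 1) for i in range(100)]
partitions = [data[i::4] for i in range(4)]
w, b = fit(partitions)
```

Unlike the grid pattern, every iteration depends on results from all partitions in the previous one, which is the kind of coordination the native engines are built to handle.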

Page 38: Predictive Analytics: Extracting Big Value from Big Data

Sampling

Grid Computing

Native Distributed Algorithms

Different Approaches to Big Data Analytics

Which one to use for a given use case?

Page 39: Predictive Analytics: Extracting Big Value from Big Data

Typical projects need all three to succeed

Sampling

Grid Computing

Native Distributed Algorithms

Page 40: Predictive Analytics: Extracting Big Value from Big Data

RapidMiner Predictive Analytics Platform

Page 41: Predictive Analytics: Extracting Big Value from Big Data

Single Analytics Platform to support all three

• Sampling: Pull data from Hive/Impala, use 1500+ operators
• Grid Computing: SparkRM, PySpark, SparkR
• Native Distributed Algorithms: Spark, Hive, Impala, custom UDFs, Mahout, Pig

RapidMiner
• Capabilities for all use cases
• In a GUI environment
• In a single platform

Page 42: Predictive Analytics: Extracting Big Value from Big Data

What it looks like

RapidMiner
• Capabilities for all use cases
• In a GUI environment
• In a single platform

Page 43: Predictive Analytics: Extracting Big Value from Big Data

RapidMiner for Big Data (Hadoop)

RapidMiner Radoop extends predictive analytics to Hadoop and Spark

• We speak Hadoop so you don’t have to
  Translates predictive analytics into native Hadoop – you concentrate on creating analytics, not Hadoop programming

• COMPLETE insights into your Big Data
  Pushes analytic instructions into Hadoop for computation, so you can analyze the full breadth and variety of your Big Data

• Use your favorite Hadoop scripts, too!
  Incorporates SparkR, PySpark, Pig and HiveQL

• Safe and sound
  Integrates with Kerberos authentication, supports data access authorization – seamless for users, easy administration for IT

Page 44: Predictive Analytics: Extracting Big Value from Big Data

TRANSFORMATIONAL Business Impact

• Build Better Predictive Models Faster: Accelerate the creation of high-value predictive analytics while streamlining low-value tasks
• Easily Use Predictive Analytics: Confidently extract the hidden value from your data using intuitive predictive analytics
• Operationalize Competitive Advantage: Leverage prescriptive analytics in all your decisions to achieve better outcomes
• Bridge the Data Science Skills Gap: Empower data scientists and citizen data scientists to feed the insatiable demand for predictive insights

Audiences shown: CHIEF ANALYTICS OFFICER, CHIEF EXECUTIVE OFFICER, DATA SCIENTIST, BUSINESS ANALYST

Page 45: Predictive Analytics: Extracting Big Value from Big Data

#1 Agile Predictive Analytics Platform for Today’s Modern Analysts

Q & A

Download RapidMiner Today @ www.rapidminer.com