Data Mining + Business Intelligence Integration, Design and Implementation

2015-07-29

ABOUT ME

Vijay Kotu

Data, Business, Technology, Statistics

Result

BUSINESS INTELLIGENCE

- Making data accessible
- Wider distribution
- Dimensional slicing
- Mostly as-is reporting

DATA MINING

- Finding useful patterns in data
- Limited distribution
- Algorithms
- Insights and predictions

DATA MINING

Diagram labels: Statistics; Computing; Machine Learning; Quantitative; Operations Research; Data Stores; Computation; Machine Learning, Optimization, Algorithms

Data mining, in simpler terms, is finding useful patterns in data.

“It is a non-trivial process of finding useful, valid, novel, understandable patterns or relationships in the data to make important decisions” (Fayyad et al., 1996)

DATA MINING: MODELS

Data Mining

Classification

Regression

Clustering

Association

Anomaly detection

Time Series

Feature Selection

Text Mining

DATA MINING: TYPES

Tasks and example applications:

- Classification: Assigning voters to known buckets used by political parties, e.g., soccer moms. Bucketing new customers into one of the known customer groups.
- Regression: Predicting the unemployment rate for next year. Estimating an insurance premium.
- Anomaly detection: Detecting fraudulent credit card transactions. Network intrusion detection.
- Time series: Sales forecasting, production forecasting, virtually any growth phenomenon that needs to be extrapolated.
- Clustering: Finding customer segments in a company based on transaction, web, and customer call data.
- Association analysis: Finding cross-selling opportunities for a retailer based on transaction purchase history.

DATA MINING: TYPES

Tasks and algorithms:

- Classification: Decision trees, neural networks, Bayesian models, rule induction, k-nearest neighbors
- Regression: Linear regression, logistic regression
- Anomaly detection: Distance-based, density-based, local outlier factor (LOF)
- Time series: Exponential smoothing, ARIMA, regression
- Clustering: k-means, density-based clustering (DBSCAN)
- Association analysis: FP-Growth, Apriori

DATA MINING: PROCESS

Data Mining Scoring

Data Mining + Business Intelligence

Business Intelligence

Data Mining

ISSUES

- People: Data mining and business intelligence skill sets are mutually exclusive

- Organization: The two functions live in different organizations within an enterprise

- Technology: Minimal overlap in tools, platforms, and technology

- Use cases: Historical reporting vs. predictions and insights

Data Mining

Business Intelligence

BENEFITS

- Distribution: Data mining insights get wider, real-time distribution

- Smarter Analytics: History + Predictions

- Visual discovery: Common link

- Security: Secure delivery of insights

CLASSIC BI ARCHITECTURE

Diagram components: Extraction, Transformation & Loading; Staging; Star Schema; OLAP; Security Layer; Dashboards, reports, alerts, ad hoc...

ANALYTICAL ARCHITECTURE #1

Data Mining Tool Scoring

Diagram components: Extraction, Transformation & Loading; Staging; Star Schema; OLAP; Data Mining Tool; Dashboards, reports, alerts, ad hoc...

- The data mining tool does the scoring.
- Robust modeling and scoring capabilities.
- The BI tool reports the scores like any other data points.
- Limitations: New records cannot be scored unless scoring is provided by the DM tool. Requires multiple analytical tools.
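The data mining tool scoring pattern can be pictured as a batch job that writes a score column, which the BI layer then aggregates like any other measure. The sketch below is only an illustration under stated assumptions: the rule-based `churn_score`, the field names, and both helper functions are hypothetical stand-ins, not the output of any particular data mining tool.

```python
def churn_score(record):
    """Toy rule-based score in [0, 1]; a hypothetical stand-in for a trained model."""
    score = 0.1
    if record["support_calls"] >= 3:
        score += 0.5
    if record["months_active"] < 6:
        score += 0.3
    return round(score, 2)


def score_batch(records):
    """Batch step run on the data mining side: add a score column to each record."""
    return [dict(r, churn_score=churn_score(r)) for r in records]


def average_score_by_region(scored):
    """BI-style report: the score is aggregated like any other data point."""
    totals = {}
    for r in scored:
        running_sum, count = totals.get(r["region"], (0.0, 0))
        totals[r["region"]] = (running_sum + r["churn_score"], count + 1)
    return {region: round(s / n, 2) for region, (s, n) in totals.items()}


# Hypothetical customer records (assumed columns, not from the slides).
customers = [
    {"id": 1, "region": "East", "support_calls": 4, "months_active": 3},
    {"id": 2, "region": "East", "support_calls": 0, "months_active": 24},
    {"id": 3, "region": "West", "support_calls": 3, "months_active": 12},
]
scored_table = score_batch(customers)
report = average_score_by_region(scored_table)
```

A record that arrives after `score_batch` has run carries no score until the batch is rerun, which mirrors the limitation that new records cannot be scored without the DM tool.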

ANALYTICAL ARCHITECTURE #2

Database Scoring

Diagram components: Extraction, Transformation & Loading; Staging; Star Schema; OLAP; Dashboards, reports, alerts, ad hoc...

- The database does the scoring.
- Can handle large data.
- Model, scoring, and data in one place.
- Limitations: DB vendors have to provide a full DM suite. Analysis skills.
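In-database scoring can be sketched by expressing a small model directly in SQL so that the model, the scoring, and the data stay in one place. The example below uses Python's built-in `sqlite3` module purely for illustration; the `customers` table, its columns, and the two-level decision rule are assumptions, and a production database would typically rely on its vendor's own mining functions instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER, support_calls INTEGER, months_active INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 4, 3), (2, 0, 24), (3, 3, 12)],
)

# Hypothetical two-level decision tree written as a SQL CASE expression,
# so scoring happens inside the database engine at query time.
SCORING_SQL = """
SELECT id,
       CASE
           WHEN support_calls >= 3 AND months_active < 6 THEN 'high'
           WHEN support_calls >= 3 THEN 'medium'
           ELSE 'low'
       END AS churn_risk
FROM customers
ORDER BY id
"""

scores = conn.execute(SCORING_SQL).fetchall()
conn.close()
```

Because the rule is evaluated per query, rows inserted later are scored the next time the query runs, with no separate export step.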

ANALYTICAL ARCHITECTURE #3

BI Scoring: Native Modeling

Diagram components: Extraction, Transformation & Loading; Staging; Star Schema; OLAP; Dashboards, reports, alerts, ad hoc...

- The BI platform does the scoring.
- Good integration between predictive metrics and BI metrics.
- Security.
- Distribution.
- Real-time scoring.
- Limitations: Performance. Limited functionality.

ANALYTICAL ARCHITECTURE #4

BI Scoring: Data Mining Tool Modeling

Diagram components: Extraction, Transformation & Loading; Staging; Star Schema; OLAP; Data Mining Tool; PMML Model; Dashboards, reports, alerts, ad hoc...

- The BI platform does the scoring.
- The model is built in the DM tool and imported into the BI platform.
- Real-time scoring.
- Supports a wide selection of algorithms.
- Limitations: Performance.
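The PMML hand-off can be illustrated with a small consumer that reads a model exported as PMML and scores records with it. This is a minimal sketch, assuming a linear `RegressionModel`; the PMML fragment is hand-written for illustration, the field names and coefficients are made up, and `load_regression` and `score` are hypothetical helpers rather than any BI platform's import API.

```python
import xml.etree.ElementTree as ET

# Illustrative PMML fragment (assumed content, not exported by a real tool).
PMML_DOC = """
<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1">
  <Header description="illustrative model"/>
  <DataDictionary numberOfFields="3">
    <DataField name="visits" optype="continuous" dataType="double"/>
    <DataField name="cart_adds" optype="continuous" dataType="double"/>
    <DataField name="spend" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="visits"/>
      <MiningField name="cart_adds"/>
      <MiningField name="spend" usageType="predicted"/>
    </MiningSchema>
    <RegressionTable intercept="5.0">
      <NumericPredictor name="visits" coefficient="0.5"/>
      <NumericPredictor name="cart_adds" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
""".strip()

NS = {"pmml": "http://www.dmg.org/PMML-4_1"}


def load_regression(pmml_text):
    """Read the intercept and numeric coefficients from the first RegressionTable."""
    root = ET.fromstring(pmml_text)
    table = root.find(".//pmml:RegressionTable", NS)
    intercept = float(table.get("intercept"))
    coefficients = {
        p.get("name"): float(p.get("coefficient"))
        for p in table.findall("pmml:NumericPredictor", NS)
    }
    return intercept, coefficients


def score(model, record):
    """Apply the linear model: intercept plus the sum of coefficient * value."""
    intercept, coefficients = model
    return intercept + sum(c * record[name] for name, c in coefficients.items())


model = load_regression(PMML_DOC)
predicted = score(model, {"visits": 10, "cart_adds": 2})
```

Since the model travels as a document rather than as code, the scoring side only needs to understand the PMML elements it supports; this sketch ignores everything except the intercept and first-order numeric predictors.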

ANALYTICAL ARCHITECTURE

Data Mining Tool Scoring

Database Scoring

BI Scoring

- Data Mining Tool Modeling

- Native Modeling

USE CASE

Association Analysis or Market Basket Analysis

CLICKSTREAM DATA

Can be generalized to transactions. Applies to any product purchases in an enterprise.

Creation of Association Rules
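Creating association rules from clickstream data comes down to counting how often items occur together and deriving support, confidence, and lift. The sketch below is a brute-force illustration over item pairs, not Apriori or FP-Growth; the sessions, page names, and thresholds are invented for the example.

```python
from itertools import combinations

# Hypothetical clickstream sessions: each set holds the pages viewed in one visit.
sessions = [
    {"home", "news", "sports"},
    {"home", "news"},
    {"home", "sports"},
    {"news", "finance"},
    {"home", "news", "finance"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def pair_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Enumerate single-item rules lhs -> rhs that pass both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        both = support({a, b}, transactions)
        if both < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # confidence = P(rhs | lhs); lift compares it with P(rhs) alone.
            confidence = both / support({lhs}, transactions)
            lift = confidence / support({rhs}, transactions)
            if confidence >= min_confidence:
                rules.append(
                    (lhs, rhs, round(both, 2), round(confidence, 2), round(lift, 2))
                )
    return rules


rules = pair_rules(sessions)
for lhs, rhs, sup, conf, lift in rules:
    print(f"{lhs} -> {rhs}: support={sup}, confidence={conf}, lift={lift}")
```

The same functions work on purchase baskets if each set holds the products in one transaction, which is the generalization from clickstream to transactions.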

DATA MINING USING BI SYSTEM

MicroStrategy Desktop > Data Mining Services

Model Building in BI

DATA MINING SERVICE

MODEL DETAILS

RESULTS

PMML

BI VS. DATA MINING THINKING

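- Business intelligence: Number of customers lost last month. Data mining: Who will most likely churn in the next 10 days?

- Business intelligence: Production downtime report. Data mining: Which part of the process will fail, and how to mitigate it?

- Business intelligence: ROI for marketing campaigns. Data mining: What next action will the prospect take?

- Business intelligence: Yesterday’s revenue. Data mining: Tomorrow’s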

RECOMMENDED READING

OPEN SOURCE DATA MINING TOOLS

Advanced Reporting Guide: Enhancing Your Business Intelligence

Data Mining + Business Intelligence: Appendix

CLUSTERING

Data Set

k-Means Clustering
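For readers without the slide images, k-means can be sketched as two alternating steps: assign each point to its nearest centroid, then move each centroid to the mean of its members. The version below is a minimal pure-Python sketch of Lloyd's algorithm on 2-D points; the data set and the choice of initial centroids are assumptions made for illustration.

```python
def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm for 2-D points with caller-supplied initial centroids."""
    centroids = list(centroids)
    assignment = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        assignment = [
            min(
                range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                + (p[1] - centroids[k][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                updated.append(
                    (
                        sum(x for x, _ in members) / len(members),
                        sum(y for _, y in members) / len(members),
                    )
                )
            else:
                updated.append(old)  # an empty cluster keeps its centroid
        if updated == centroids:
            break  # converged: centroids stopped moving
        centroids = updated
    return centroids, assignment


# Hypothetical data set with two visually separate groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centroids, assignment = kmeans(points, [points[0], points[3]])
```

The result depends on the initial centroids, so practical implementations usually run several random starts and keep the best one.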

DECISION TREES
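A decision tree is grown by repeatedly choosing the split that makes the resulting groups as pure as possible. The sketch below shows that single step for one numeric feature using Gini impurity; the slides do not say which impurity measure they use, so Gini is an assumption here, and the feature values and labels are invented.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Return (threshold, weighted impurity) for the best split on one feature.

    The threshold is None when no candidate split improves on the unsplit node.
    """
    best = (None, gini(labels))
    candidates = sorted(set(values))
    # Try the midpoint between each pair of adjacent distinct values.
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


# Hypothetical training data: support calls per customer and whether they churned.
support_calls = [0, 1, 1, 2, 4, 5, 6]
churned = ["no", "no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(support_calls, churned)
```

A full tree applies `best_split` recursively to the left and right groups, across all features, until a stopping rule such as a minimum node size is met.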
