
MSTRWorld2015 TT S2 How to Build MicroStrategy Projects …


#mstrworld

How to Build MicroStrategy Projects on Top of Big Data Sources in the Cloud

Jochen Demuth, Director, Partner Engineering


Use Cases for Big Data in the Cloud

Four broad categories and their value

Traditional sources moving online
• SOURCE: Company, Government, Financial sector, Business and consumer studies, Surveys, Polls
• VALUE: All business performance drivers: Operational efficiency, Revenue management, Strategic planning

Digital exhaust from interactions
• SOURCE: Online click-stream, Application logs, Call/service records, ID scans, Security cameras
• VALUE: New revenue sources, Consumer promotions, Risk management, Fraud detection

Web 2.0 phenomenon
• SOURCE: Content generated from social media posts, tweets, blogs, pictures, videos, ratings
• VALUE: Customer engagement, Customer service, Brand management, Viral marketing

Internet of things
• SOURCE: Machine generated sensor data and machine to machine communication
• VALUE: Operational efficiency, Cost control, Risk avoidance


Traditional sources moving online

How to take advantage of new technologies

Traditional relational data sources in the cloud
• RDBMS installed in the cloud (e.g. HP Vertica on Amazon EC2)
• Managed RDBMS in the cloud (e.g. Amazon RDS)

Relational database technology built for the cloud, e.g.
• Amazon AWS (EMR, Redshift, Aurora)
• Google BigQuery
• RDBMS vendor cloud services (e.g. Microsoft, Oracle, Teradata, HP, IBM, SAP, …)

→ Cloud services simplify and automate many aspects of data management, but there are still application-specific aspects that need conscious control


Some Database Features Require Conscious Design Choices

Query time is often dominated by data access, which has a significant performance impact

Data organization
• Columnar vs. row based

Minimize data access
• Partitioning key selection
• Data sorting
• (Index selection/strategy)
• Compression (on/off; algorithm)
• Approximate calculation (e.g. HyperLogLog)

Access and process data in parallel
• Data distribution in MPP databases to minimize data movement

→ Existing best practices for developing MicroStrategy applications apply
→ Make sure to take advantage of database features designed for analytical workloads
→ Look for best practices to take advantage of data source strengths in the MicroStrategy Community
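To make the "Approximate calculation (e.g. HyperLogLog)" bullet concrete, here is a minimal Python sketch of the HyperLogLog idea: estimate a distinct count from a small, fixed array of registers instead of storing every value. The class name, the choice of SHA-1 as hash, and the precision `p=12` are assumptions made for illustration; this is not how any particular database or MicroStrategy implements the feature.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog sketch (hypothetical helper, not a vendor API)."""

    def __init__(self, p=12):
        self.p = p                      # number of index bits
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m
        # Bias-correction constant, valid for m >= 128
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # Derive a 64-bit hash from the item's string form
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                    # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits (1-based)
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        # Harmonic-mean based raw estimate
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting)
            estimate = self.m * math.log(self.m / zeros)
        return estimate
```

With 4,096 registers the sketch stays at a fixed size no matter how many values are added, at the cost of a relative error of roughly 1.04 / sqrt(4096), or about 1.6%. That trade of exactness for reduced data access is the design choice the slide refers to.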


Identifying Value in Data Requires Utmost Flexibility

Static data models get in the way of analysis at the speed of thought

Digital exhaust from interactions
• SOURCE: Online click-stream, Application logs, Call/service records, ID scans, Security cameras
• VALUE: New revenue sources, Consumer promotions, Risk management, Fraud detection

Technical Characteristics:
• Unknown data sources are analyzed for potential new business value.
• Analysis is necessary to support the development of new business models.
• Data models don’t exist (yet).


MicroStrategy Supports All Analytic Needs

Some People Produce Analytics While Others Consume Analytics

Axes: Analytical Complexity, User Scale (Back Office to Front Line)

Data Scientists
• Trained in modeling and coding
• Use a variety of tools
• Want their favorite tools
• Look for the truth

Business Analysts
• Analytical amateurs
• Power users of BI tools
• Want to use the right tool
• Look for the business edge

Business Users
• Make the daily decisions
• Some may be power users
• Most need simple tools
• Look for actionable information


MicroStrategy Provides Flexible Data Modeling Options

Choose how to access and analyze data

Source data: ID scans, Online click-stream, Application logs, Call/service records

Modeled: Unified MicroStrategy Metadata
• Reusable Data
• Reusable Objects
• Reusable Design
Report, Visual Insight, Dashboard

Direct: Flexible data access
• Schema on read
• Supports quick iterations
• Reusable Objects
Report, Visual Insight, Dashboard
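The "Schema on read" option can be illustrated with a small, generic Python sketch: raw log lines are stored untouched, and a structure is applied only when they are read, so each analysis can ask for different fields without remodeling the data. The log format, the sample lines, and the function name `read_with_schema` are assumptions made for this example and do not describe a MicroStrategy interface.

```python
import re

# Hypothetical raw click-stream lines; the format is assumed for illustration.
RAW_LINES = [
    "2015-01-27T10:15:02 user=42 action=view page=/products/7",
    "2015-01-27T10:15:09 user=42 action=add_to_cart page=/products/7",
    "2015-01-27T10:16:30 user=77 action=view page=/home",
    "malformed line without fields",
]

# A timestamp followed by at least one key=value pair
LINE_PATTERN = re.compile(r"^(?P<ts>\S+)\s+(?P<fields>.+=.+)$")


def read_with_schema(lines, wanted):
    """Apply a schema at read time: yield only the requested fields per line."""
    for line in lines:
        match = LINE_PATTERN.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        pairs = (p.split("=", 1) for p in match.group("fields").split() if "=" in p)
        fields = dict(pairs)
        row = {"ts": match.group("ts")}
        # Missing fields become None instead of failing the read
        row.update({name: fields.get(name) for name in wanted})
        yield row


# Two different "schemas" over the same raw data, no reload required
by_action = list(read_with_schema(RAW_LINES, ["user", "action"]))
by_page = list(read_with_schema(RAW_LINES, ["page"]))
```

Because nothing is fixed at load time, a second iteration of the analysis only changes the `wanted` list, which is what makes quick iterations cheap.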


The Web 2.0 Phenomenon Introduces Specific Challenges

Data access, data structure, and data meshing

Web 2.0 phenomenon
• SOURCE: Content generated from social media posts, tweets, blogs, pictures, videos, ratings
• VALUE: Customer engagement, Customer service, Brand management, Viral marketing

Access data where it exists
• Web 2.0 data stored in relational data sources
• Online services that also provide data services
  • E.g. Salesforce.com
• Online services that provide data
  • Social
  • Government
  • Weather

→ MicroStrategy offers three ways to access Web 2.0 data
→ Data often requires structuring or flattening for analysis
→ For optimal value, data from multiple sources needs to be put in context
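As an illustration of "structuring or flattening", the following Python sketch turns nested JSON records, as a social media service might return them, into flat rows suitable for tabular analysis. The sample payload, the field names, and the helper names `flatten` and `explode` are hypothetical and chosen only for this example.

```python
import json

# Hypothetical payload; real services define their own structure.
POSTS_JSON = """
[
  {"id": 1, "user": {"name": "ann", "followers": 120},
   "text": "Love it", "tags": ["brand", "service"]},
  {"id": 2, "user": {"name": "bob", "followers": 15},
   "text": "Too slow", "tags": []}
]
"""


def flatten(record, parent_key="", sep="."):
    """Turn nested dictionaries into a single level with dotted column names."""
    flat = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value  # lists are kept as-is for explode()
    return flat


def explode(record, list_key):
    """Emit one row per element of record[list_key] (one row with None if empty)."""
    values = record.get(list_key) or [None]
    for value in values:
        row = {k: v for k, v in record.items() if k != list_key}
        row[list_key] = value
        yield row


rows = [
    row
    for post in json.loads(POSTS_JSON)
    for row in explode(flatten(post), "tags")
]
```

The first post yields two rows (one per tag) and the second yields one row with an empty tag, so every row has the same columns: `id`, `user.name`, `user.followers`, `text`, and `tags`.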


No Data Left Behind

Optimized connectors to your entire Big Data ecosystem

Bring All Relevant Data to Decision Makers

Data source categories
• User / Departmental Data
• Data Warehouse Appliances
• Big Data & NoSQL
• Relational Databases
• Multidimensional Databases
• Columnar Databases
• SaaS-Based App Data

Product labels shown on the slide: HANA, BigInsights, Parallel Data Warehouse, Elastic Map Reduce, Analysis Services, Redshift, Distribution


Three Ways to Query Multi-structured Data

DATA PROCESSING, ANALYTICS & DELIVERY

MicroStrategy Analytics Platform: Dashboards, Reports and Statements, Self-Service Analytics, OLAP Analysis

1. Direct connection to source
• Parse structure with lightweight “Schema-on-read” functions
• Import data or create a modeled environment

2. Using Web Services
• Requires data to be exposed as a Web Service
• Data will need to be structured prior to access

3. Offline “Process and Store”
• Data is processed using specialty analytics (text, streaming, image processing) and stored as structured data
• Text Analytics Module

DATA STORAGE

Semi-Structured Data and Unstructured Data: Web Logs, Social media posts, Surveys, Server Logs, Geo-spatial, E-mail, Image, Audio, Video, Sensor + Machine Data, Documents


MicroStrategy Offers Several Paths to Mesh Data For Analysis

Integrating Modeled BI and Self-Service BI

Structured BI Content Consumption
• Structured Data: Architect
• Structured Join: Multi-Source Model
• Multi-Source Pushdown Joins
• Corporate Data Sources
• Cubes from Model
• Dashboards and MicroApps

Self Service BI Content Creation
• Self Service Data: Data Import
• Self Service Join: Document Data Blending
• Join Datasets in Documents
• Local / Dept Data Sources
• Cubes from Import
• Ad Hoc / Visual Insight
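The general idea behind blending a modeled dataset with a self-service import can be sketched in plain Python: aggregate the primary dataset to a shared attribute and attach the secondary dataset's values with a left outer join. The sample data, the attribute `region`, and the function name `blend` are assumptions for illustration; the sketch shows the generic technique, not MicroStrategy's implementation of data blending.

```python
from collections import defaultdict

# Hypothetical rows from a modeled corporate source
warehouse_sales = [
    {"region": "East", "product": "A", "revenue": 100.0},
    {"region": "East", "product": "B", "revenue": 150.0},
    {"region": "West", "product": "A", "revenue": 80.0},
    {"region": "South", "product": "A", "revenue": 40.0},
]

# Hypothetical rows from a departmental spreadsheet import
imported_targets = [
    {"region": "East", "target": 250.0},
    {"region": "West", "target": 150.0},
    {"region": "North", "target": 90.0},
]


def blend(primary, secondary, key, primary_metric, secondary_metric):
    """Aggregate primary to the shared key, then left-join the secondary metric."""
    totals = defaultdict(float)
    for row in primary:
        totals[row[key]] += row[primary_metric]
    lookup = {row[key]: row[secondary_metric] for row in secondary}
    return [
        {key: value, primary_metric: total, secondary_metric: lookup.get(value)}
        for value, total in sorted(totals.items())
    ]


blended = blend(warehouse_sales, imported_targets, "region", "revenue", "target")
```

Keys that exist only in the primary dataset keep their rows with an empty secondary value (here `South`), while keys that exist only in the secondary dataset (here `North`) are dropped, which is the usual behavior of a left outer join.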


Find Insights in Vast Amounts of Machine Generated Data

Internet of things
• SOURCE: Machine generated sensor data and machine to machine communication
• VALUE: Operational efficiency, Cost control, Risk avoidance

Machine generated data often does not lend itself to traditional OLAP analysis

Apply the methods of predictive analytics and data mining to machine generated data
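One simple example of applying statistical methods to machine generated data is flagging sensor readings that deviate strongly from the recent past. The Python sketch below uses a moving window mean and standard deviation; the function name `flag_anomalies`, the window size, the threshold, and the sample readings are assumptions chosen for illustration, and real deployments would typically use more robust models.

```python
from collections import deque
from statistics import mean, pstdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indexes of readings more than `threshold` standard deviations
    away from the mean of the preceding `window` readings."""
    history = deque(maxlen=window)
    flagged = []
    for index, value in enumerate(readings):
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            # sigma == 0 means a perfectly flat window; skip to avoid false alarms
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                flagged.append(index)
        history.append(value)
    return flagged


# Hypothetical temperature readings with one spike
temperatures = [20.0, 20.1, 19.9, 20.2, 20.0, 20.1, 35.0, 20.0, 19.9, 20.1]
spikes = flag_anomalies(temperatures)
```

Note that the spike itself enters the window afterwards and widens the standard deviation for the next few readings, a known limitation of this simple approach.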


MicroStrategy Support for Predictive Analytics

All of the most commonly used techniques are supported

[Chart: Primary Work Horses of Data Mining, responses to “Which Techniques Do You Use Most”. Markers on the chart indicate how each technique is supported: MicroStrategy Native, via PMML, or via R.]

Source: 2013 Rexer Data Miner Surveys, www.RexerAnalytics.com. Over 1,250 Data Miners from 75 Countries


Predictive Analytics Are Part of MicroStrategy Function Library

Reporting
Average, Mean, Count, Sum, Maximum, Minimum, Median, Mode, Product, Rank, Percentile, “N”-Tile, N-tile by Step, N-tile by Value, N-tile by Step and Value

Date and Time
Add Days, Add Months, Current Date, Current Date & Time, Current Time, Day of Month, Day of Week, Day of Year, Days Between, Month Start Date, Month End Date, Months Between, Year Start Date, Year End Date

Statistical Aggregate
Standard Deviation, Standard Deviation of a Population, Variance, Variance of a Population, Geometric Mean, Average Deviation, Kurtosis, Skew

OLAP Functions
Running Total, Running Std Deviation, Running Std Deviation of Population, Running Minimum, Running Maximum, Running Count, Moving Difference, Moving Maximum, Moving Minimum, Moving Average, Moving Sum, Moving Count, Moving Std Deviation, Moving Std Deviation of Population, First or Last Value in Range, Exponential Weight Moving Avg, Exponential Weight Running Avg

Statistical
Beta Distribution, Beta Inverse, Binomial Distribution Probability, Chi Distribution, Chi Inverse, Confidence, Correlation Coefficient, Covariance, Critical Binomial Distribution, Chi Test (Independence), Cumulative Binomial Distribution, Exponent Distribution, F-Probability Distribution, F-Test, Fisher Transformation, Gamma Distribution, Gamma Inverse, Gamma Logarithm, Homoscedastic Ttest, Heteroscedastic Ttest, Hypergeometric Distribution, Intercept Point, Inverse of Lognormal Cumulative Distribution, Inverse of F Probability Distribution, Inverse of Fisher, Inverse of the Std Normal Cumulative Distribution, Inverse of the T-Distribution, Lognormal Cumulative Distribution, Mean T-Test, Negative Binomial Distribution, Normal Cumulative Distribution, Normal Distribution Inverse, Number of Permutations for a Given Object, Paired T-test, Poisson Distribution (Predict Number of Events), Pearson Product Moment Correlation Coefficient, RSQ (Square of Pearson), Slope of Linear Regression, STEYX (Standard Error of Predicted “y” Value), Standardize, Standard Normal Cumulative Distribution, T-Distribution, Variance Test, Weibull Distribution (Reliability Analysis)

Financial
Accrued Interest, Accrued Interest Maturity, Amount Received at Maturity, Bond-equivalent Yield for T-BILL, Convert Dollar Price from Fraction to Decimal, Convert Dollar Price from Decimal to Fraction, Cumulative Interest Paid on Loan, Cumulative Principal Paid on Loan, Depreciation for each Accounting Period, Days In Coupon Period to Settlement Date, Days In Coupon Period with Settlement Date, Days from Settlement Date to Next Coupon, Double-Declining Balance Method, Discount Rate For a Security, Effective Annual Interest Rate, Fixed-Declining Balance Method, Future Value, Future Value of Initial Principal with Compound Interest Rates, Interest Rate, Interest Payment, Internal Rate of Return, Interest Rate per Annuity, Macaulay Duration, Modified Duration, Modified Internal Rate of Return, Next Coupon Date After Settlement Date, No of Coupons Settlement and Maturity Date, Nominal Annual Interest Rate, No of Investment Periods, Net Present Value, Odd First period Yield, Odd Last Period, Prev Coupon Date Before Settlement Date, Price Per $100 Face Value w Odd First Period, Payment, Payment on Principal, Price, Price Discount, Price at Maturity, Present Value, Prorated Depreciation for each Period, Straight Line Depreciation, Sum-Of-Years' Digits Depreciation, T-BILL Price, T-BILL Yield, Variable Declining Balance, Yield, Yield for Discounted Security, Yield at Maturity

Math Functions
Absolute, A-cosine, Hyp A-cos, A-sine, Hyp A-sine, A-tan, A-tan2, Hyp A-tan, Ceiling, Combine, Cosine, Hyp Cosine, Degrees, Exponent, Factorial, Floor, Integer, Ln, Log, Log10, Mod, Power, Quotient, Radians, Randbetween, Round, Sine, Hyp Sine, Square Root, Tan, Hyp Tan, Truncate

Data Mining
Association Rules, Clustering, General Regression, Mining, Neural Network, Regression, Rule Set, Support Vector Machine, Time Series, Train Association, Train Clustering, Train Decision Tree, Train Regression, Train Time Series, Tree Model, Variants
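To show what a few of the listed statistical functions compute, the following Python sketch derives the quantities behind "Slope of Linear Regression", "Intercept Point", and "RSQ (Square of Pearson)" with ordinary least squares. The function name `linear_fit` and the sample values are hypothetical; this is the textbook calculation, not MicroStrategy's implementation or function syntax.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = slope * x + intercept.

    Returns (slope, intercept, rsq), where rsq is the squared
    Pearson correlation coefficient of xs and ys.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and of cross-products
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("inputs must not be constant")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rsq = (sxy * sxy) / (sxx * syy)
    return slope, intercept, rsq


# Hypothetical example: machine load (x) vs. energy use (y)
slope, intercept, rsq = linear_fit([1, 2, 3, 4], [3, 5, 7, 9])
```

An `rsq` close to 1 indicates that the linear model explains most of the variation in `ys`, which is a quick check before relying on the slope for prediction.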


Easy Integration with Third Party Analytical Models

Deploy Any of 5000+ Open Source R Analytics
• As a MicroStrategy metric, use models and functions in any report or dashboard
• MicroStrategy R Integration Pack

Create Your Own Custom Functions
• MicroStrategy Custom Function Plug-in

Import Predictive Models from Popular Packages
• PMML Model
• ƒ Apply(X)


The Full Range of Advanced Analytics from One Place

Analytical Maturity
• Optimization: What do we want to happen?
• Predictions: What is likely to happen based on past history?
• Relationship Analysis: What factors influence activity or behavior?
• Benchmarking: How are we doing versus comparables?
• Trend Analysis: What direction are we headed in?
• Data Summarization: What is happening in the aggregate?

• Industry’s most powerful SQL Engine and 300+ native analytical functions
• World’s most popular advanced analytics tool. Free, open source.
• More Specialty Tools


MicroStrategy Supports All Use Cases for Big Data in the Cloud

Analytical platform that provides the flexibility to enable modern analysis