Big Data, analytics and 4th generation data warehousing by Martyn Jones at Big Data Spain 2015

Big Data, Analytics and 4th Generation Data Warehousing

Martyn Jones

Big Data Spain 2015

agenda

∙ Imperatives.

∙ Data value chains.

∙ Resources.

∙ 4th Generation Data Warehousing.

∙ Analytics Data Store / Big Data.

∙ Information Supply Framework.

Friday 16th from 12:30 to 13:15

Room 25 - Technical

business background

the ages of data

B.C., Life of Brian, A.D.

Change, Insight

Potentially useful

Simplicity

Abundant

Volume, Velocity, Variety

framework

Obtain, Integrate, Analyse, Present

DATA

the road to Big Data success…

Strategic

Tactical

Operational

Analytics

Architected

Managed

Integration

Data

scope

BIZ, DATA, DW, BIG DATA, STATS, PRES

Business Imperatives

A good place to start

what’s important to business?

BE NOTICED

CASH FLOW

what else is important to business?

Market share

Differentiation

Ability to execute

Liquidity

Profitability

Time and place utility

React to competitive threats

Enhance service scope

Improving customer service

Respond to price pressure

Segmentation of n

Addressing short-term attention spans

Ability to respond to irrationality

Be noticed

Cash flow

Risk

Legislation

No press

Bad press

Customer centricity

Front office empowerment

Excellence

Channel excellence

Operational excellence

Product excellence

Cultures

IT business value

Base protection

Expansion

Diversification

Consolidation

Augmented Competitive Forces

Competition from within the industry: rivalry with existing competitors

Suppliers: bargaining power

Buyers: bargaining power

Replacements: threat of replacement product or service

Potential entrants: threat of new entrants

Pressure groups

Media

Government

Power to change the game

Sources: Michael Porter; Martyn R Jones and others

Exposure

McKinsey 7S Framework

Culture

differentiated capabilities

operating models

Customer segments

Channels

Products

Services

Organisational design

Processes

Data & information

Physical assets

Development

Deployment

Organisational design

Performance management

Information technology

Business

model

Operating

model

People

model

Customers

Systems People

Processes Organisation

objectives

1. Information awareness corresponding to areas of operation and spheres of control

2. Comprehensive data and information supply framework

3. Continually seek to maintain and then improve data’s contribution to business

Business data everywhere

Where, when, what, who, why... how?

Data: Internal, External, Shared; Past, Present, Future

Data: Operational, Big Data, Dark Data; Online, Archived, Unmanaged

Data: Archives, Documents, Media; Social Media, Machine Log, Sensor; Business Applications, Data Storage, Public Web

Activities, Abstractions and Relations

Velocity

Volume

Variety

Adequacy

Ambiguity

Small

Availability

Accuracy

Relevance

Persistence

Reliability

Value

Obtuseness

Smart

Complexity

Utility

Descriptiveness

Big

Velocity, Volume, Variety, Adequacy, Ambiguity, Precision, Availability, Accuracy, Relevance, Persistence, Reliability, Value, Obtuseness, Smart, Complexity, Utility, Descriptiveness, Big

Data

Facets of Big Data

Facets of Data

BIG DATA

Internet of Things

CLOUD

Statistics

Data Warehousing

Presentation

Data Supply Framework

Building Bill’s Data Warehouse

25 years of... sometimes getting it right

Enterprise Data Warehousing – AS IS

Subject oriented

Strategic decision making

Integrated

Time variant

Non-volatile

Operational Systems: Purchasing, HR, Credit, Order Processing, Marketing, Sales, Logistics, Billing

Data Warehouse: Arrangements, Products, Party, Time, Geography, Transactions

Subject oriented

Operational Systems:

Euro Account Customer: Customer: Village Bank GmbH; Country code: D

Mutual Fund Customer: Customer: Village Bankers; Region: Westphalia

NTIP Customer: Customer: Village Bank International; Country: Germany

Data Warehouse:

Account:
Number   Customer   Type
230956   441353     Euro
010555   441353     MF
291284   441353     NTIP

Party: Number: 100441353; Name: Village Bank GmbH; Country: Germany

Integrated
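
The integration example above shows three operational systems holding the same customer under different names and country encodings, resolved to a single Party row in the warehouse. A minimal sketch of that consolidation follows. The lookup tables (`COUNTRY_ALIASES`, `REGION_TO_COUNTRY`), the row layout and the `integrate` function are assumptions made for illustration and do not come from the talk.

```python
from dataclasses import dataclass

# Hypothetical lookup tables: how each source system encodes the country.
COUNTRY_ALIASES = {"D": "Germany", "Germany": "Germany"}
REGION_TO_COUNTRY = {"Westphalia": "Germany"}


@dataclass(frozen=True)
class Party:
    number: int
    name: str
    country: str


def integrate(party_number, source_rows, legal_name):
    """Collapse several source-system customer rows into one Party row."""
    countries = set()
    for row in source_rows:
        if "country_code" in row:
            countries.add(COUNTRY_ALIASES[row["country_code"]])
        elif "country" in row:
            countries.add(COUNTRY_ALIASES[row["country"]])
        elif "region" in row:
            countries.add(REGION_TO_COUNTRY[row["region"]])
    # Refuse to integrate if the sources disagree about the country.
    if len(countries) != 1:
        raise ValueError(f"conflicting countries: {sorted(countries)}")
    return Party(party_number, legal_name, countries.pop())


rows = [
    {"system": "Euro Account", "customer": "Village Bank GmbH", "country_code": "D"},
    {"system": "Mutual Fund", "customer": "Village Bankers", "region": "Westphalia"},
    {"system": "NTIP", "customer": "Village Bank International", "country": "Germany"},
]
party = integrate(100441353, rows, legal_name="Village Bank GmbH")
```

Choosing which name survives is a business rule; here it is simply passed in as `legal_name`.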

Operational Systems Data Warehouse

Trading Activity Snapshots:

Date Security Amount

2006.09.01 MartyBank 79.000.000

2006.09.02 MartyBank 92.000.000

2006.09.03 MartyBank 44.000.000

2006.09.04 MartyBank 39.000.000

2006.09.05 MartyBank 80.000.000

Trading Activity: MartyBank

Time variant
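
Time variance means the warehouse keeps dated snapshots rather than only the current value, so a question can be answered "as of" any date. The sketch below uses the snapshot rows from the slide and assumes the dots in the amounts are thousands separators. The `amount_as_of` helper is a hypothetical name introduced here.

```python
from bisect import bisect_right
from datetime import date

# Snapshot rows from the slide: (snapshot date, security, amount).
SNAPSHOTS = [
    (date(2006, 9, 1), "MartyBank", 79_000_000),
    (date(2006, 9, 2), "MartyBank", 92_000_000),
    (date(2006, 9, 3), "MartyBank", 44_000_000),
    (date(2006, 9, 4), "MartyBank", 39_000_000),
    (date(2006, 9, 5), "MartyBank", 80_000_000),
]


def amount_as_of(snapshots, security, as_of):
    """Return the latest snapshot amount on or before `as_of`, or None."""
    rows = sorted((d, a) for d, s, a in snapshots if s == security)
    dates = [d for d, _ in rows]
    i = bisect_right(dates, as_of)  # number of snapshots dated <= as_of
    return rows[i - 1][1] if i else None
```

A date before the first snapshot returns `None`; a date after the last snapshot returns the last known amount.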

Operational Systems: Order Processing applies Create, Replace, Update and Delete to Orders

Data Warehouse: Write, Read

Non-volatile
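
One way to picture non-volatility is an append-only store: an operational update arrives as a new row and earlier rows are never changed in place. The class below is a toy sketch under that assumption. `NonVolatileStore`, its methods and the `order_id` field are hypothetical names, not part of the talk.

```python
class NonVolatileStore:
    """Append-only store: rows can be loaded and read, never changed in place."""

    def __init__(self):
        self._rows = []

    def load(self, row):
        # Store a copy so later changes to the caller's dict do not leak in.
        self._rows.append(dict(row))

    def read(self, order_id):
        # Return copies so readers cannot mutate stored history.
        return [dict(r) for r in self._rows if r["order_id"] == order_id]


store = NonVolatileStore()
store.load({"order_id": 1, "status": "created"})
store.load({"order_id": 1, "status": "updated"})  # an update becomes a new row
history = store.read(1)
```

There is deliberately no update or delete method; the full history of order 1 stays readable.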

Data Warehousing 2.0

[Diagram labels: Data Sources (internal; structured data; unstructured data); ETL (Extract, Transform, Load); Staging (staged data); ODS; EDW (structured data; unstructured); ETL (Extract, Transform, Load); Data Marts; Report Repository (reports and extracts); Stats; Service (data selection and representation; data analytics; report set and extract creation; push / pull technology; visualisation; annotation); Users (internal; clients; other stakeholders); Metadata, Workflow/Process Control and CIW Management; Metadata, Process and EDW Management.]

Enterprise Data Warehousing – AS A BODGE

Get data

Wonder why it’s not meeting expectations

Dump data

Query data

Visualise data

Enterprise Data Warehousing – AS A BODGE

DW BODGER TEAM: We built a data dog house using Oracle and IBM technology and we called it a data warehouse

HADOOP TEAM: We can do data warehousing too and it will be cheaper, faster and smarter

Data Supply Framework

A data architecture for data sourcing, transformation, integration, storage, search, analysis and presentation

Data Supply Framework

[Diagram labels: All digital data; Data logistics; Operational applications; Operational Data Store; Data Warehouse; Business Intelligence; All data processing, enrichment and information creation; All information and data consumers; All information consumers.]

Published by goodstrat.com. Martyn Richard Jones 2015 – martynjones.eu. Cambriano Energy 2015 - http://www.cambriano.es

Data Supply Framework

[Diagram labels: Internal digital data; External digital data; Data logistics; Operational applications; Operational Data Store; Data Warehouse; Data Marts; Analytics Data Store; Statistical Analysis; Scenarios; Business Intelligence. Legend: primary data flow, secondary data flow.]

Data Supply Framework

[Diagram zones: Data Sources; 4th Generation Data Warehousing; Core Statistics. Labels: OLTP; Complex data; Event Data; Infrastructure Data; Event Appliance; Signal Appliance; Message Adapter; Message Queue; Staging; Staging & Reduction; ETL; T/ETL; TL; ET(A)L; ODS; EDW; DM (three data marts); ADS; Statistical analysis; Scenario 1, Scenario 2, Scenario 3; Write back.]

Core Data Sourcing

Comprehensive data acquisition and transformation

DW 3.0 Information Supply Framework

[Diagram zones: Data Sources; Core Statistics; Core Data Warehousing. Labels: Complex data; Event Data; Infrastructure Data; Event Appliance; Signal Appliance; Message Adapter; Message Queue; Staging & Reduction; ET(A)L; ADS; Statistical analysis; Scenario 1, Scenario 2, Scenario 3; Write back.]

4th Generation Data Warehousing

Providing a solid foundation for strategic, tactical and operational decision making

Enterprise Data Warehousing – 4 GEN

Subject oriented

Strategic, tactical & operational support

Integrated

Time variance & time perspectives

Constrained volatility

Classification schema

Rule based transformation

4th Generation EDW

Interpretation

Prediction

Diagnosis

Design

Planning

Monitoring

Debugging

Repairing

Instruction

Control

Strategy

Tactics

Operations

Using, applying and measuring

[Diagram labels: Big Data; Predictive Analytics; Outcomes; EDW 4.0; E(A)TL]

Using, applying and measuring

Big Data

Predictive analytics

Select predictions

Define trackable actions

Apply outcomes and actions to EDW 4

Accumulate campaign Big Data

Descriptive analytics

Select findings

Combine with trackable actions

Apply outcomes and actions to EDW 4

Run campaign

Analyse campaign and performance of Big Data analytics

Forecasts and results – from all perspectives

[Chart: Cambriano Big Data Campaign 2015-2016. Horizontal axis: months from 01/15 to 06/16. Vertical axis: -400 to 500. Series: Forecast, Actual, Strategy, BD Costs, Benefit.]

Values, Relativity, Dimensions, Hierarchies, Structures, Past, Future

Using, applying and measuring

• Combining Big Data analytics with Data Warehousing 4.0

• Planning and managing initiatives

• Measuring, analysing and reporting the effectiveness of business initiatives

• Measuring, analysing and reporting the tangible contribution of the Big Data analytics process to the creation of business value
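
The measurement points above can be sketched as a simple comparison of forecast, actual and Big Data cost series per period. This is only an illustration: the definition of net benefit as actual minus Big Data costs, the function name and the figures are all assumptions, not numbers from the Cambriano campaign chart.

```python
def campaign_contribution(forecast, actual, bd_costs):
    """Return per-period variance, per-period net benefit and total net benefit.

    Assumed definitions (not from the talk):
      variance    = actual - forecast
      net benefit = actual - Big Data costs
    """
    if not (len(forecast) == len(actual) == len(bd_costs)):
        raise ValueError("all series must cover the same periods")
    variance = [a - f for f, a in zip(forecast, actual)]
    net_benefit = [a - c for a, c in zip(actual, bd_costs)]
    return variance, net_benefit, sum(net_benefit)


# Purely illustrative figures for three periods.
variance, net_benefit, total = campaign_contribution(
    forecast=[100, 150, 200],
    actual=[80, 160, 230],
    bd_costs=[50, 50, 50],
)
```

Keeping forecast, actual and cost series side by side in the warehouse is what makes this kind of after-the-fact comparison possible.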

Big Data and Core Statistics

A multi-faceted data theatre for ad-hoc, speculative and immediate operational analytics

DSF 4.0 Data Value Chains

DATA               INFORMATION               KNOWLEDGE
Requires context   Requires interpretation   Requires wisdom
Relevant           Correct                   Usable
Irrelevant         Incorrect                 Useless
Meaningless        Misleading                Wrong
Value?             Value?                    Value?

DSF 4.0 Data Assets in MOSCOW

RISK

ASSET

SECURE

BAU

Assurance

Highest, High, Medium/Low, Very low/None

MUST SHOULD COULD WON’T

Yes Yes Maybe Maybe/No

Yes Yes Yes Maybe/No

Yes Yes Yes Maybe/No

DSF 4.0 Data Supply Framework

[Diagram labels: All digital data; Internal digital data; External digital data; Data logistics; OLTP Applications; Operational applications; Operational Data Store; Data Warehouse; Data Marts; Analytics Data Store; Statistical Analysis; Scenarios; Business Intelligence; ‘What if’ analysis; MIS / Reporting; Visualisation; Publication. Legend: primary data flow, secondary data flow.]

[Diagram labels: Statistics; Data Science; Big Data; Small Data; Smart Data; This Data; That Data; This department; That department; The other department; Messing with data; Map Fatten; Map Reduce; Retrospect; Reports; Alerts; Visualisation; Analytics.]

DSF 4.0 Data Supply Framework

Data Sources – This element covers all the current sources, varieties and volumes of data available which may be used to support processes of 'challenge identification', 'option definition' and decision making, including statistical analysis and scenario generation.

Core Data Warehousing – This is a suggested evolution path of the DW 2.0 model. It faithfully extends the Inmon paradigm to include not only unstructured and complex data but also the information and outcomes derived from statistical analysis performed outside of the Core Data Warehousing landscape.

Core Statistics – This element covers the core body of statistical competence, especially but not only with regard to evolving data volumes, data velocity and speed, data quality and data variety.

INTO THE ZONE!

Complex Data – This is unstructured data, or data with a highly complex structure, contained in documents and other complex data artefacts, such as multimedia documents.

Event Data – This is an aspect of Enterprise Process Data, typically at a fine-grained level of abstraction. It includes business process logs, internet web activity logs and other similar sources of event data. The volumes generated by these sources tend to be higher than other volumes of data, and are those currently associated with the term Big Data, covering as it does the masses of information generated by tracking even the most minor piece of 'behavioural data' from, for example, someone casually surfing a web site.

Infrastructure Data – This aspect includes data which could well be described as signal data: continuous high-velocity streams of potentially highly volatile data that might be processed through complex event correlation and analysis components.

Event Appliance – This puts the dynamic data collation, selection and reduction functionality as close to the point of event data generation as physically possible.
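
As a rough illustration of collation, selection and reduction close to the source, the sketch below filters raw events by type and forwards only per-page counts. The event shape (`type`, `page`) and the `event_appliance` function are hypothetical; a real appliance would sit next to the system that emits the events.

```python
from collections import Counter


def event_appliance(events, keep_types):
    """Select relevant events and reduce them to counts per (type, page)."""
    counts = Counter()
    for event in events:
        if event["type"] in keep_types:  # selection
            counts[(event["type"], event["page"])] += 1  # collation and reduction
    return [
        {"type": t, "page": p, "count": n}
        for (t, p), n in sorted(counts.items())
    ]


raw = [
    {"type": "click", "page": "/home"},
    {"type": "scroll", "page": "/home"},
    {"type": "click", "page": "/home"},
    {"type": "click", "page": "/pricing"},
]
reduced = event_appliance(raw, keep_types={"click"})
```

Four raw events become two summary rows, which is the point: less data has to travel downstream.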

Signal Appliance – This puts the dynamic data collation, selection and reduction functionality as close to the point of continuous streaming data generation as physically possible.

Distributed Inter Process Communication – Different forms of messaging allow high volumes of data to be transmitted in near real time.
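
The message adapter and message queue pairing can be pictured with a bounded in-process queue. This is a stand-in only: Python's `queue.Queue` and a thread replace whatever distributed messaging product would be used in practice, and the `adapter` and `consume` functions are hypothetical names.

```python
import queue
import threading


def adapter(source_rows, q):
    """Message adapter: publish each source row, then a None sentinel."""
    for row in source_rows:
        q.put(row)  # blocks while the bounded queue is full
    q.put(None)


def consume(q):
    """Read messages until the sentinel arrives."""
    received = []
    while (msg := q.get()) is not None:
        received.append(msg)
    return received


q = queue.Queue(maxsize=2)  # small bound so the producer feels back-pressure
producer = threading.Thread(target=adapter, args=([1, 2, 3, 4], q))
producer.start()
received = consume(q)
producer.join()
```

The bounded queue decouples producer and consumer speeds while preserving message order.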

Staging and Reduction – Traditional data staging combined with in-line data reduction.

ET(A)L – Extending ETL to include data analytics components tightly integrated into parallel ETL job streams.
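
One reading of ET(A)L is an ordinary extract, transform, load pipeline with an analytics step placed before the load. The sketch below is sequential rather than parallel and uses a simple z-score outlier flag as the analytics component. That choice, the threshold and all function names are assumptions made for illustration.

```python
from statistics import mean, pstdev


def extract(rows):
    return list(rows)


def transform(rows):
    # Normalise types: amounts arrive as strings in this toy source.
    return [{"id": r["id"], "amount": float(r["amount"])} for r in rows]


def analyse(rows, threshold=2.0):
    """The (A) step: flag rows whose amount is far from the batch mean."""
    amounts = [r["amount"] for r in rows]
    mu, sigma = mean(amounts), pstdev(amounts)
    for r in rows:
        z = 0.0 if sigma == 0 else (r["amount"] - mu) / sigma
        yield {**r, "outlier": abs(z) > threshold}


def load(rows, target):
    target.extend(rows)
    return target


amounts = ["10", "11", "9", "10", "12", "10", "11", "9", "10", "100"]
source = [{"id": i, "amount": a} for i, a in enumerate(amounts, start=1)]
warehouse = load(analyse(transform(extract(source))), [])
```

The analytic result travels with each row into the target, so downstream users do not have to recompute it.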

ADS – The Analytics Data Store.

1. Statistics oriented

2. Integrated by focus area

3. Variable volatility

4. Time variant

Statistical Analysis – Qualitative analysis. Diagnostic analysis, predictive analysis, speculative analysis, data mining, data exploration, modelling.

Scenarios and outcomes

1. Snapshots of the outcomes of scenario analysis, the process of analysing possible future events by generating alternative possible outcomes.

2. Captured outcomes of statistical analysis.

Write back – The ability to append data, update data and enrich data within the Analytics Data Store, and to provide scenario data to the Core Data Warehousing.
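
The write-back operations named above (append, update, enrich, and handing scenario data to Core Data Warehousing) can be sketched with a toy in-memory store. The class, its method names, the field names and the plain list standing in for the warehouse load interface are all hypothetical.

```python
class AnalyticsDataStore:
    """Toy in-memory ADS keyed by an analyst-chosen identifier."""

    def __init__(self):
        self._rows = {}

    def append(self, key, row):
        if key in self._rows:
            raise KeyError(f"{key!r} already exists; use update()")
        self._rows[key] = dict(row)

    def update(self, key, **changes):
        self._rows[key].update(changes)

    def enrich(self, key, field, derive):
        # derive() computes a new field from the existing row.
        self._rows[key][field] = derive(self._rows[key])

    def write_back(self, key, warehouse):
        # Hand a copy of the scenario to Core Data Warehousing.
        warehouse.append({"scenario": key, **self._rows[key]})


ads = AnalyticsDataStore()
warehouse = []  # stands in for the Core Data Warehousing load interface
ads.append("scenario-1", {"price_change": -0.05, "volume": 1000})
ads.update("scenario-1", volume=1200)
ads.enrich("scenario-1", "revenue_index",
           lambda r: (1 + r["price_change"]) * r["volume"])
ads.write_back("scenario-1", warehouse)
```

Unlike the non-volatile warehouse, the ADS allows in-place updates, which matches its "variable volatility" characteristic.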

DSF 4.0 – Core Statistics: Analytics Data Store

Distributed File System

Non-relational distributed file storage / NoSQL

DFS (including ‘refactoring’ of Unix primitives)

Unix File Store

POSIX compliant

Document DBMS

Graph DBMS

Key-Value DBMS

In-memory Column Oriented Relational DBMS

Relational DBMS (MPP/SMP/Hybrid)

Object DBMS

POSIX compliant Unix / Linux primitives

Relational DBMS

DSF 4.0 – What’s important?

Data Warehouse

Business Intelligence

Operational Data Store

Analytics Data Store

Statistical Analysis

Dark Data

Big Data

Internet of Things

Knowledge Management

Structured Intellectual Capital

Cloud

Summary

A good place to end, for now

Summary

• Consider everything

• Question everything

• Never stop hypothesising

• Never stop testing

• For every initiative have a business imperative

• Make continuous engagement and involvement a goal

Many thanks
