98
A company of Daimler AG LECTURE @DHBW: DATA WAREHOUSE PART IV: FRONTEND, METADATA, PROJECTS, ADVANCED TOPICS ANDREAS BUCKENHOFER, DAIMLER TSS

LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Embed Size (px)

Citation preview

Page 1: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

A company of Daimler AG

LECTURE @DHBW: DATA WAREHOUSE

PART IV: FRONTEND, METADATA, PROJECTS, ADVANCED TOPICSANDREAS BUCKENHOFER, DAIMLER TSS

Page 2: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

ABOUT ME

https://de.linkedin.com/in/buckenhofer

https://twitter.com/ABuckenhofer

https://www.doag.org/de/themen/datenbank/in-memory/

http://wwwlehre.dhbw-stuttgart.de/~buckenhofer/

https://www.xing.com/profile/Andreas_Buckenhofer2

Andreas BuckenhoferSenior DB [email protected]

Since 2009 at Daimler TSS Department: Big Data Business Unit: Analytics

Page 3: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

As a 100% Daimler subsidiary, we give

100 percent, always and never less.

We love IT and pull out all the stops to

aid Daimler's development with our

expertise on its journey into the future.

Our objective: We make Daimler the

most innovative and digital mobility

company.

NOT JUST AVERAGE: OUTSTANDING.

Daimler TSS

Page 4: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

INTERNAL IT PARTNER FOR DAIMLER

+ Holistic solutions according to the Daimler guidelines

+ IT strategy

+ Security

+ Architecture

+ Developing and securing know-how

+ TSS is a partner who can be trusted with sensitive data

As subsidiary: maximum added value for Daimler

+ Market closeness

+ Independence

+ Flexibility (short decision making process,

ability to react quickly)

Daimler TSS 4

Page 5: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Daimler TSS

LOCATIONS

Data Warehouse / DHBW

Daimler TSS China

Hub Beijing

10 employees

Daimler TSS Malaysia

Hub Kuala Lumpur

42 employeesDaimler TSS IndiaHub Bangalore22 employees

Daimler TSS Germany

7 locations

1000 employees*

Ulm (Headquarters)

Stuttgart

Berlin

Karlsruhe

* as of August 2017

5

Page 6: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• After the end of this lecture you will be able to

• Understand function of Frontend Tools and Information Design

• Understand the necessity for metadata

• Understand lifecycle of DWH projects

• Advanced topics like Operational BI, DWH Appliances, Cloud BI

WHAT YOU WILL LEARN TODAY

Data Warehouse / DHBWDaimler TSS 6

Page 7: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

FRONTEND

Page 8: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

LOGICAL STANDARD DATA WAREHOUSE ARCHITECTURE

Data Warehouse / DHBWDaimler TSS 8

Data Warehouse

FrontendBackend

External data sources

Internal data sources

Staging Layer(Input Layer)

OLTP

OLTP

Core Warehouse

Layer(Storage

Layer)

Mart Layer(Output Layer)

(Reporting Layer)

Integration Layer

(Cleansing Layer)

Aggregation Layer

Metadata Management

Security

DWH Manager incl. Monitor

Page 9: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

VISUALIZATION IN THE USUAL CASE OF LIFE

Data Warehouse / DHBWDaimler TSS 9

Page 10: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

RUSSIAN CAMPAIGN OF NAPOLEON

Data Warehouse / DHBWDaimler TSS 10

Source: https://de.wikipedia.org/wiki/Charles_Joseph_Minard

Page 11: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

MAPPING THE 1854 LONDON CHOLERA OUTBREAK

Data Warehouse / DHBWDaimler TSS 11

Source: https://www1.udel.edu/johnmack/frec682/cholera/

Page 12: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

MAPPING THE 1854 LONDON CHOLERA OUTBREAK

Data Warehouse / DHBWDaimler TSS 12

Page 13: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

EXCERCISE: VISUALIZE AS MUCH AS POSSIBLE

Data Warehouse / DHBWDaimler TSS 13

Umsatz in €

2014 2015 2016

Kanada 16.000 14.000 17.000

England 8.000 9.000 8.000

Frankreich 7.000 4.000 5.000

USA 60.000 85.000 90.000

Deutschland 4.000 10.000 15.000

Australien 10.000 8.000 15.000

Umsatz 105.000 130.000 150.000

Page 14: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

POSSIBLE SOLUTION 1

14Data Warehouse / DHBWDaimler TSS

Page 15: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

POSSIBLE SOLUTION 2

15Data Warehouse / DHBWDaimler TSS

Page 16: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Reporting (Standard, ad-hoc)

• OLAP

• Dashboards, Scorecards

• Advanced Analytics / Data Mining / Text Mining

• Search & Discovery

INTERFACE TO THE END USER

Data Warehouse / DHBWDaimler TSS 16

Page 17: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Standard Reports

• Prepared static reports that can be executed at request by end users

• Are executed at the end of an ETL process and e.g. send by email to end users

• Normally based on fact tables and its dimensions

• Reports are often lists similar to Excel-Sheets but can also contain graphics (e.g. line charts)

Ad-hoc Reports

• End users create their own reports („Self service“)

REPORTING (STANDARD, AD-HOC)

Data Warehouse / DHBWDaimler TSS 17

Page 18: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

ROLAP / MOLAP Client Frontend

• Prepared cubes (multidimensional or relational fact tables)

• User can perform interactive analysis of data

• Rollup / drill-down

• Pivot

• Slicing

• Dicing

OLAP

Data Warehouse / DHBWDaimler TSS 18

Page 19: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

„Progress reports“

Provide an overall view of KPIs (Key Performance Indicators)

Combination of several elements from Reporting and/or OLAP (e.g. line charts) into an overall view (like a „cockpit“)

Dashboard is more focused on operational goals

• High-level overview what is happening

Scorecard is more focused on strategic goals

• Plan a strategy and identify why something happens

DASHBOARDS, SCORECARDS

Data Warehouse / DHBWDaimler TSS 19

Page 20: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

See Mr. Bollinger‘s lecture

ADVANCED ANALYTICS / DATA MINING / TEXT MINING

Data Warehouse / DHBWDaimler TSS 20

Page 21: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Not just numerical data

Analysis of new data types gets more and more important

• Text

• GPS coordinates

• Pictures

• Videos

Data can be available in RDBMS (e.g. text modules/indexes available), Hadoop or SQL DBs

SEARCH & DISCOVERY

Data Warehouse / DHBWDaimler TSS 21

Page 22: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

MANY GRAPHICAL ELEMENTS TO USE IN REPORTS

Data Warehouse / DHBWDaimler TSS 22

Source: https://github.com/d3/d3/wiki/Gallery

Page 23: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

MANY GRAPHICAL ELEMENTS … CHAMBER OF HORROR

Data Warehouse / DHBWDaimler TSS 23

Source: Hichert / Faisst, http://www.backup-page.hichert.com/

Page 24: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

DO YOU USE 3D USUALLY ?

Data Warehouse / DHBWDaimler TSS 24

Page 25: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Some remarks about previous slide

• 3D elements introduce clutter and give not more information

• Pie chart most often does not make sense

• Line chart barely readable

• Labels are placed outside of the graphic

• Tachometer costs a lot of space and show

• Too much color in general

• Color without meaning, e.g. red should be used for alarms / errors

MANY GRAPHICAL ELEMENTS … CHAMBER OF HORROR

Data Warehouse / DHBWDaimler TSS 25

Page 26: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

STORY TELLING WITH APPROPRIATE VISUALIZATION

Famous example by Hans Rosling (watch 3:08 onwards)

https://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen?language=de

Data Warehouse / DHBWDaimler TSS 26

Page 27: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Information design is the practice of presenting information in a way that fosters efficient and effective understanding of it.(source: Wikipedia, https://en.wikipedia.org/wiki/Information_design )

Some authors are well known for their criticism of many graphical representations - they provide rules for good information design

• Edward Tufte

• Stephen Few

• Rolf Hichert

INFORMATION DESIGN

Data Warehouse / DHBWDaimler TSS 27

Page 28: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Define standards, e.g.

• use always the same colors and with care, e.g.

• red = negative

• green = positive

• pie charts are rarely useful and should be avoided

• better use bar chart or line chart

• No 3D elements as these elements don’t enhance information but introduce clutter

• Standardize abbreviations, e.g. PY = previous year

INFORMATION DESIGNREDUCE TO THE ESENTIALS

Data Warehouse / DHBWDaimler TSS 28

Page 29: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

TABLE WITH INTEGRATED BAR CHARTS

Data Warehouse / DHBWDaimler TSS 29

Source: Hichert, http://www.hichert.com/de/resource/table-template-02/

Page 30: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

WHICH PRODUCTGROUP HAS THE HIGHEST WIN IN JUNE?

30Data Warehouse / DHBWDaimler TSS

Page 31: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

WHICH PRODUCTGROUP HAS THE HIGHEST WIN IN JUNE?EYE TRACKING

31Data Warehouse / DHBWDaimler TSS

Page 32: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

WHICH PRODUCTGROUP HAS THE HIGHEST WIN IN JUNE?IMPROVED VERSION

32Data Warehouse / DHBWDaimler TSS

Page 33: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

WHICH PRODUCTGROUP HAS THE HIGHEST WIN IN JUNE?EYE TRACKING

33Data Warehouse / DHBWDaimler TSS

Page 34: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

EYE-TRACKING - BEFORE AND AFTER

34Data Warehouse / DHBWDaimler TSS

Page 35: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Consumers / BI Users

• use reports, OLAP and dashboards to obtain information

Power Users

• Use reports , OLAP and dashboards to obtain information

• Create new reports and dashboards

Data Scientists

• Statistical / mathematical geeks

• Analyze / explore data

• Need to analyze raw (non-cleansed, non-transformed) data

BI END USER ROLES

Data Warehouse / DHBWDaimler TSS 35

Page 36: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

META DATA MANAGEMENT

Page 37: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

WHAT IS METADATA?

Data Warehouse / DHBWDaimler TSS 37

Data about

other data

Page 38: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Business Metadata

• Definition of business vocabulary and relationships

• Definition of the value range

• Linkage to physical representation

TYPES OF METADATA

Data Warehouse / DHBWDaimler TSS 38

Page 39: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Report and ETL metadata

• Report definitions

• Data sources

• Column definitions

• Computations

Logical and physical metadata of data model

• Table structure

• Definition of columns

• Relationships between tables and columns

• Dimension hierarchy

TYPES OF METADATA

Data Warehouse / DHBWDaimler TSS 39

Page 40: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

THE AREAS OF METADATA

Data Warehouse / DHBWDaimler TSS 40

Page 41: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

THE AREAS OF METADATA CONNECTED

Data Warehouse / DHBWDaimler TSS 41

Page 42: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Components of a data warehouse system are interconnected

• BI report user has to know

• the meaning, definitions of the shown measures, „KPIs“ (key performance indicators)

• BI report designer has to know

• the table definitions

• the meaning of the column values

• ETL job designer has to know

• the table definitions or the exact definition of the measures

• Database administrator has to know

• Which tables are used by ETL jobs, reports

WHY A COMMON METADATA REPOSITORY?

Data Warehouse / DHBWDaimler TSS 42

Page 43: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Metadata driven ETL development

• Generate parts of ETL code

• increasing interest for Data Vault development projects

• Tools e.g. MID Innovator, Quipu, AnalytiX DS, Talend, Pentaho, Wherescape, andothers

Common metadata repository ensures consistency across all components

• Many tools involved (DB, ETL, Frontend, …)

Enables cross component metadata analysis

• Data Lineage

• Impact Analysis

WHY A COMMON METADATA REPOSITORY?

Data Warehouse / DHBWDaimler TSS 43

Page 44: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

“Data lineage”

• Import & Browse Full BI Report Metadata

• Navigate through report attributes

• Visually navigate through data lineage across tools

• Combines operational & design viewpoint

WHY A COMMON METADATA REPOSITORY?

Data Warehouse / DHBWDaimler TSS 44

Page 45: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

“Impact Analysis”

• Show complete change impact in graphical or list form

• Includes impact on reports in BI tools

• Visually navigate through impacted objects across tools

• Allows impact analysis on any object type

WHAT HAPPENS IF I CHANGE THIS COLUMN?

Data Warehouse / DHBWDaimler TSS 45

Page 46: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Show relationships between business terms, data model entities, and technical and report fields

• Requires cross-tool mapping of business terms

• Allows field meaning to be understood

• Allows business term relationships to be understood

WHAT DOES THIS FIELD MEAN?

Data Warehouse / DHBWDaimler TSS 46

Page 47: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Shows objects that user manages

• Shows stewardship relationships on business terms

• Shows user group associations

WHAT OBJECTS DOES THIS USER OWN?

Data Warehouse / DHBWDaimler TSS 47

Page 48: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Navigation through complete job details

• Navigation of complete operational metadata

WHAT HAPPENED ON THE LAST JOB RUN?

Data Warehouse / DHBWDaimler TSS 48

Page 49: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

DATA WAREHOUSING PROJECTS

Page 50: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

LOGICAL STANDARD DATA WAREHOUSE ARCHITECTURE

Data Warehouse / DHBWDaimler TSS 50

Data Warehouse

FrontendBackend

External data sources

Internal data sources

Staging Layer(Input Layer)

OLTP

OLTP

Core Warehouse

Layer(Storage

Layer)

Mart Layer(Output Layer)

(Reporting Layer)

Integration Layer

(Cleansing Layer)

Aggregation Layer

Metadata Management

Security

DWH Manager incl. Monitor

Top Down (Inmon)

Bottom Up (Kimball)

Page 51: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Top-Down (Inmon)

• Comprehensive approach regarding available data

• Design Core Warehouse Layer = integrated data model first considering all requirements

• Design data marts afterwards

Bottom-Up (Kimball)

• Approach focusing on fast delivery of first results

• Design one data mart first

• Next Marts are modeled afterwards usually using Kimball architecture

• conformed dimensions to integrate different data marts / fact tables

TOP-DOWN VS BOTTOM-UP APPROACH

Data Warehouse / DHBWDaimler TSS 51

Page 52: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

TOP-DOWN VS BOTTOM-UP APPROACHADVANTAGES AND DISADVANTAGES

Data Warehouse / DHBWDaimler TSS 52

Top-Down (Inmon) Bottom-Up (Kimball)

☺ Core Warehouse Layer is designed optimal ☺ Early involvement of end users

☺ Data from Core Warehouse Layer is reused in many Marts

☺ Fast results

Time-consuming approach with high preparatory effort

Focus on single Marts leads to risk that overall view is lost, esp. properly designed Core Warehouse Layer

High risk with changing requirements Data often not reused but inconsistently copied across Marts

Page 53: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Both approaches have their down-sides

• Top-Down takes enormous initial effort to build data model for Core Warehouse Layer

• Bottom-Up is risky as central / integrated focus is lost

Think big, start small

• Think Big: Design conceptual data model for Core Warehouse Layer covering whole enterprise

• Start small: Implement physical data model for Core and Mart Layer in iterations by each business department

THINK BIG, START LOCAL

Data Warehouse / DHBWDaimler TSS 53

Page 54: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• DWH is not a product

• DWH databases are more complex with different layers and data models

• Data first, code is secondary

• Data quality is a major concern

• Data integration is a challenging objective

• Business need difficult to justify quantitatively

WHAT’S DIFFERENT IN DWH PROJECTS?

Data Warehouse / DHBWDaimler TSS 54

Page 55: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

WHY DO DWH PROJECTS FAIL?

Data Warehouse / DHBWDaimler TSS 55

Page 56: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Feasibility study Analysis Design Implementation TestOperations and maintenance

PROJECT PHASESSMALL ITERATIONS INSTEAD OF LONG PHASES

Data Warehouse / DHBWDaimler TSS 56

Page 57: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Organizational team that coordinate and standardize DWH activities within an (end user) organization

• Define standards and create BI portfolio (e.g. which tools/products to use)

• Create DWH architecture and govern BI activities

• Establish processes for business and IT interaction

• Monitor DWH/BI market for new trends

• Determine skills and experience of Business users

BICC: BI CENTER OF EXCELLENCE

Data Warehouse / DHBWDaimler TSS 57

Page 58: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Define 3-5 criteria for the evaluation of an ETL tool

How does a relational DBMS (like Oracle, DB2, MS SQL Server) meet these requirements?

EXERCISE

Data Warehouse / DHBWDaimler TSS 58

Page 59: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Supplier profile

• Support

• HW/SW requirements

• License / maintenance Costs

• Usability

• Reliability

• Performance and scalability

• Multi-tenant

• Interfaces

• Scheduling

EXERCISE - DEFINE 5 CRITERIA FOR THE EVALUATION OF AN ETL TOOL

Data Warehouse / DHBWDaimler TSS 59

Page 60: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• RDBMS provide many of the functionalities but additional programming required

• RDBMS are often used for ETL/ELT by programming with SQL, PL/SQL, SQLT, etc

EXERCISE - HOW DOES A RELATIONAL DBMS MEET THESE REQUIREMENTS?

Data Warehouse / DHBWDaimler TSS 60

ETL Tool Manual ETL

Informatica, Talend, Oracle ODI, etc. SQL, PL/SQL, SQLT, etc.

Separate license No additional license

Workflow, error handling, and restart/recovery functionality included

Workflow, error handling, and restart/recovery functionality must be implemented manually

Impact analysis and where-used (lineage) functionality available

Impact analysis and where-used (lineage) functionality difficult

Faster development, easier maintenance Slower development, more difficult maintenance

Additional (Tool-) Know How required Know How often available

Page 61: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

NEWER / ADVANCED TOPICS

• OPERATIONAL DATA WAREHOUSING

• DATA WAREHOUSE APPLIANCES

• CLOUD BI

Page 62: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

OPERATIONAL DATA WAREHOUSING

Page 63: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Classical“ Data Warehouses

• Information in the warehouse used to support strategic business decisions

• Kept separate from operational systems

• Load of new data only in larger intervals (mostly weekly or monthly)

• Shorter intervals not required by users

• Huge system resources of the ETL process

Near Real Time Operational Data Warehousing

• Information in the warehouse used for tactical business decisions as well

• Low latency of information in data warehouse therefore needed

• Not only mathematical aggregations

OPERATIONAL DATA WAREHOUSING

Data Warehouse / DHBWDaimler TSS 63

Page 64: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

With classical data warehouses users have to access two types of systems to get a complete image of a customer (for instance for CRM applications or in call centers)

• the data warehouse to see what happened in the past

• the OLTP systems to get the most current information

With an operational data warehouse

• all this information is in one system

• tighter integration with operational systems is easier

• for instance personalized offers „closing the loop“

WHY OPERATIONAL DATA WAREHOUSING?

Data Warehouse / DHBWDaimler TSS 64

Page 65: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

SMARTFACTORY: OPERATIONAL BI SERVICE PLATFORM

Data Warehouse / DHBWDaimler TSS 65

Source: Gluchowski: Analytische Informationssysteme, 5.Aufl., p. 279

Workers getting alarms on their

watch

Containing and displaying

complex manuals, e.g. during repair

New data sources

sending lots of data with high speed

(sensor data, logs, etc.)

Right-Time data

required for automated actions, e.g.

cordless screwdriver knows and

adjusts torque

Page 66: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Near Real time /Right time ETL

• Executed asynchronously; triggered by business transactions in the OLTP

• Incremental real-time load

• Tighter integration of operational and data warehouse systems

DWHs become „mission critical“

• Higher requirements on availability and performance

Higher „transactional“ system load on data warehouse system

• DWH DB has to deal with typical DWH system load and transactional load

Data Quality mandatory

• Data is used for automated decisions

CHALLENGES FOR OPERATIONAL DATA WAREHOUSING

Data Warehouse / DHBWDaimler TSS 66

Page 67: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

COMPARISON CLASSICAL DWH – OPERATIONAL DWH

Data Warehouse / DHBWDaimler TSS 67

Classical DWH Operational DWH

Strategic• Passive• Historical trends

Tactical• Prediction• Automatic execution of decisions

Batch• E.g. daily batch

Near Real-Time / Right-Time• Up-to-date view

Lower Availability• System can be down for

maintenance and longer response times for some reports are accepted

Availability• System becomes critical and must

fulfill high availability and performance requirements

Page 68: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

DATA WAREHOUSE APPLIANCES

Page 69: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Setting up and configuring a data warehouse system is a complex task

• Hardware

• Servers + Storage + Network

• Connectivity to source systems

• Software

• Database management system

• ETL software

• Reporting and analytics software

• ...

An optimal performance of the whole system is difficult to achieve

DATA WAREHOUSE APPLIANCES

Data Warehouse / DHBWDaimler TSS 69

Page 70: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Data Warehouse Appliances are

• Pre-configured and pre-tested hard- and software configurations developed for running a data warehouse

• Optimized for data warehousing workload / Only suited for running OLAP• In contrast one size fits all: RDBMS are suited for OLTP, OLAP and mixed workloads

• Ready to be used after they are delivered to the customer

• Products, e.g. Teradata, HP Vertica, Exasol, Oracle Exadata, IBM Netezza(IBM PureData System for Analytics), MS Analytic Platform System

DATA WAREHOUSE APPLIANCES

Data Warehouse / DHBWDaimler TSS 70

Page 71: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

APPLIANCE SIMPLICITY (E.G. ORACLE EXADATA)

Data Warehouse / DHBWDaimler TSS 71

Page 72: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Up to 912 CPU core and 28.5TB memory per rack

• 2 to 19 DB servers per rack

• 3 to 18 Storage servers per rack

• Maximum of 920TB flash capacity

• 2.1PB of disk capacity

• 10TB size disk (10TB x 12 = 120TB RAW per storage server)

• About 4.8 million reads and about 4.3 million writes per second

ORACLE EXADATA X7 – SOME KEY FIGURES

Data Warehouse / DHBWDaimler TSS 72

Source: http://jaffardba.blogspot.com/2017/10/whats-new-in-exadata-x7.html

Page 73: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Move as many operations as possible to storage cell instead of moving data to the DB server

• E.g. filter data already at storage cell and not at DB server

• Avoid transferring unnecessary data

• Column-oriented In-memory storage with high compression

• Many appliances are based on shared nothing architecture• Each node is independent

• Each node has its own storage or memory

• Parallel processing simpler and faster as no overhead due to contention

TYPICAL ENHANCEMENTS

Data Warehouse / DHBWDaimler TSS 73

Page 74: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

CLOUD BI

Page 75: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• BI applications (database, ETL tools, Frontend) are hosted in a public cloud, e.g.

• AWS (Amazon Web Services)

• Microsoft Azure

• …

• Many tools nowadays are available in the cloud first • Vendors try to force customers to use clouds

• Or even available in the cloud only• E.g. Microsoft Power BI

CLOUD BI

Data Warehouse / DHBWDaimler TSS 75

Page 76: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

CLOUD BI ARCHITECTURE

Data Warehouse / DHBWDaimler TSS 76

Source: Lang: Business Intelligence erfolgreich umsetzen, 5.Aufl., p. 185

Page 77: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Analytics as a service

• Provide complete BI (Analytics) SW stack including

• data storage

• data integration (ETL)

• data visualization and/or data modeling (Frontend)

• Meta data management

• Data as a service

• Provide quality data for further usage

• Data marketplace

CLOUD BI ARCHITECTURE

Data Warehouse / DHBWDaimler TSS 77

Page 78: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

CLOUD BI – DATA WAREHOUSING SERVICES

Data Warehouse / DHBWDaimler TSS 78

Source: http://db-engines.com/en/system/Amazon+Redshift%3BSnowflake

Page 79: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

SNOWFLAKE ARCHITECTURE

Data Warehouse / DHBWDaimler TSS 79

Don‘t confuseSnowflake

product withSnowflake

dimensional model from

session 2

Page 80: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Snowflake Storage

• Snowflake loads data into its internal optimized, compressed, columnar format

• Snowflake itself uses (!) Amazon Web Service’s S3 (Simple Storage Service) cloud storage

Query Processing

• Each virtual warehouse is an MPP (Multi Parallel Processing) compute cluster composed of multiple compute nodes allocated by Snowflake from Amazon EC2

• Each virtual warehouse is an independent compute cluster that does not share compute resources with other virtual warehouses

SNOWFLAKE ARCHITECTURE

Data Warehouse / DHBWDaimler TSS 80

Page 81: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Frontend

• Meta Data Management

• DWH Project Management

• New / Advanced topics

• Operational Data Warehousing

• Data Warehouse Appliances

• Cloud BI

SUMMARY

Data Warehouse / DHBWDaimler TSS 81

Page 82: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

END OF DWH LECTUREGOOD TEXT BOOKS

• Kimball et al: The Data Warehouse Lifecycle Toolkit, Wiley 2008

• Bauer / Günzel: Data-Warehouse-Systeme, dpunkt, 2013

• Köppen et al: Data Warehouse Technologien, mitp, 2016

Data Warehouse / DHBWDaimler TSS 82

Source: http://dilbert.com/strip/2014-05-07

Page 83: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Daimler TSS GmbHWilhelm-Runge-Straße 11, 89081 Ulm / Telefon +49 731 505-06 / Fax +49 731 505-65 99

[email protected] / Internet: www.daimler-tss.com/ Intranet-Portal-Code: @TSSDomicile and Court of Registry: Ulm / HRB-Nr.: 3844 / Management: Christoph Röger (CEO), Steffen Bäuerle

Data Warehouse / DHBWDaimler TSS 83

THANK YOU

Page 84: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

How to document / identify requirements?

• Must be easy to understand from non-technical users during Analysis/Technical concept phase

• Must provide sufficient information for System Design phase

The following slides provide some example work products that are produced during Analysis/Technical concept phase and may be refine during System Design phase

POSSIBLE DWH ANALYSIS AND DESIGN WORK PRODUCTS

Data Warehouse / DHBWDaimler TSS 84

Page 85: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Answer most important questions of participating business units

• Provide high-quality data

• Introduction in time

• Usage of modern technology

• Business orientation

• Easy to use

• Executive sponsor

• Patience – user acceptance evolves over time

CRITICAL SUCCESS FACTORS FOR BUILDING A DATA WAREHOUSE

Data Warehouse / DHBWDaimler TSS 85

Page 86: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• New applications and data sources

• Increase demand for an

• Operational DWH, e.g.

• Industry 4.0 / Smart Factory

• Internet Of Things

• Internet of medical things

• Connected Cars

EXAMPLES OF OPERATIONAL DATA WAREHOUSING

Data Warehouse / DHBWDaimler TSS 86

Source: Gluchowski: Analytische Informationssysteme, 5.Aufl., p. 277

Replace pen & paper with electronic workflows

Decision support for each end user and not only management

Increasing demand to publish same content on different devices

Page 87: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Source: http://www.agiledata.org/essays/disciplinedAgileDW.html

DISCIPLINED AGILE DATA WAREHOUSING (SCOTT AMBLER) -PHASES

Data Warehouse / DHBWDaimler TSS 87

• Project initiation like initial scoping, initial architectural modeling

Inception

• Produce potentially consumable solution on an incremental basis

Construction

• Deploying the solution to their stakeholders

• Continuous deployment: Over time this phase becomes shorter and ideally disappears

Transition

Page 88: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

DISCIPLINED AGILE DATA WAREHOUSING (SCOTT AMBLER) -PHASES OVERVIEW

Data Warehouse / DHBWDaimler TSS 88

Source: http://www.agiledata.org/essays/disciplinedAgileDW.html

Page 89: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Take a usage-driven approach

• Keep your initial modelling light-weight

• User stories are only one option• As a Professor I would like to analyze the current grades of my students so that I can adjust the

difficulty of future tests and assignments

• As a Student I would like to know the drop out rates by course and professor from previous years to determine the likely difficulty of my course choices

• As a Registrar I would like to know the rate of enrollments within a class over time to determine the popularity of them

• As a Student I would like to know the estimated travel time between back-to-back classes so that I can determine whether I can make it to class on time

Source: http://www.disciplinedagiledelivery.com/user-stories-for-data-warehousebusiness-intelligence-a-disciplined-agile-approach/

DISCIPLINED AGILE DATA WAREHOUSING (SCOTT AMBLER) -USAGE-DRIVEN APPROACH

Data Warehouse / DHBWDaimler TSS 89

Page 90: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Classical

• Waterfall model

• Incremental model

• Agile

• Scrum

• Kanban

PROCESS MODEL

Data Warehouse / DHBWDaimler TSS 90

Page 91: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Benefits of DWH

• Cost-effectiveness

• SW selection

• HW selection

• Staff requirement including external Know-How

• Data protection and data security agreement, data classification

• Proof of Concept (PoC) to challenge different possible solutions

• Architectural concept

FEASIBILITY STUDY

Data Warehouse / DHBWDaimler TSS 91

Page 92: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Documentation of user requirements: specification sheet

• Backend including ETL

• Frontend

• Security

• Metadata, Business glossary

• Non-functional requirements

• Analysis of data sources

• Data quality

• Data models

• Data security

ANALYSIS

Data Warehouse / DHBWDaimler TSS 92

Page 93: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Technical description how to implement specifications

• Data model for different DWH layers

• Data integration design

• Frontend design

• Security concept

• Capacity planning

DESIGN

Data Warehouse / DHBWDaimler TSS 93

Page 94: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Installation of development, test, integration, production, maintenance environment

• Usage of Metadata repository for implementation of data model, etl, frontend, security

• Launch of DWH

• Release Management

IMPLEMENTATION

Data Warehouse / DHBWDaimler TSS 94

Page 95: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Functional

• Data quality / data validation

• Usability

• Performance

• Operational

• Security

TEST

Data Warehouse / DHBWDaimler TSS 95

Page 96: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Deployment of new features, changes or bug fixes

• End user training

• Monitoring

• Production concept

• Initial load and future delta loads

• Keep the system running

OPERATIONS AND MAINTENANCE

Data Warehouse / DHBWDaimler TSS 96

Page 97: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

• Security concerns for sensitive data

• But new data source coming from Internet. Storing the data in a (public) cloud can make sense, e.g.

• Connected Cars, IOT in general

CLOUD BI

Data Warehouse / DHBWDaimler TSS 97

Page 98: LECTURE @DHBW: DATA WAREHOUSE PART IV: …buckenhofer/20181DWH/Buckenhofer-D… · TOP-DOWN VS BOTTOM-UP APPROACH ... Top-Down (Inmon) Bottom-Up (Kimball) ☺Core Warehouse Layer

Cloud Services

• Authentication and access control

• Infrastructure management

• Metadata management

• Query parsing and optimization

• Security

SNOWFLAKE ARCHITECTURE

Data Warehouse / DHBWDaimler TSS 98