Business Intelligence Michael Lamont [email protected]

Business Intelligence: Data Warehouses

DESCRIPTION

A basic introduction to data warehouses, their uses, and their benefits.

Page 1: Business Intelligence: Data Warehouses

Business Intelligence

Michael Lamont

[email protected]

Page 2: Business Intelligence: Data Warehouses

Platforms

Implementing a BI platform requires many important choices:

Type of platform

Software tools & technologies

IT usually takes the lead on technology and platform decisions

Important for business managers to participate in decision making, since they’ll actually be using the platform

Page 3: Business Intelligence: Data Warehouses

Platforms

BI platforms capture raw operational data and convert it to useful info

Process used by a platform can be simple or complex

The data warehouse is the most common BI platform

Data warehouses have several distinct components that work together

Page 4: Business Intelligence: Data Warehouses

BI Platform

Data Sources

Page 5: Business Intelligence: Data Warehouses

Operational Systems

Organizations usually have dozens of operational systems that support day-to-day transactions

Line-of-business apps:

Human resources

Enterprise Resource Planning

Supply chain

Point-Of-Sale

Page 6: Business Intelligence: Data Warehouses

Operational Systems

Efficient at supporting transactional processes

Poorly suited to business analysis

Cannot readily combine data from multiple sources

Page 7: Business Intelligence: Data Warehouses

BI Platform

Data Sources

Data Warehouse

Page 8: Business Intelligence: Data Warehouses

Data Warehouse

Collective repository of data from a company’s operational systems

Data warehouse feeds data into a series of subject-specific databases called data marts

Some “data warehouse” platforms are really just a collection of data marts

Page 9: Business Intelligence: Data Warehouses

BI Platform

Data Sources

Data Warehouse

Data Marts: HR, Sales, Finance

Page 10: Business Intelligence: Data Warehouses

Data Marts

Data marts are subject-specific

HR

Sales

Finance

Marketing

Etc.

Definition of “subject” varies from company to company depending on needs

Page 11: Business Intelligence: Data Warehouses

Data Marts

Examples of data marts in a single company:

Support Sales dept’s analysis of performance and margins

Let HR dept analyze headcount and absence trends

Page 12: Business Intelligence: Data Warehouses

Data Sharing

Data warehouses shouldn’t be a collection of independent silos of data

Silos of data are what operational systems already give you

A good data warehouse makes it easy to normalize measures and dimensions

Ensures dimensions & measures have the same meanings across the company

Supports metric calculations across data feeds

Page 13: Business Intelligence: Data Warehouses

Data Sharing

Operational systems can’t calculate many useful metrics because they can’t integrate/share data

Calculating revenue per employee requires data from Sales and HR data silos

Easy to calculate these metrics in a data warehouse with shared data and dimensions

More shared data = more powerful analysis
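As a rough illustration of the revenue-per-employee point, the sketch below joins two small in-memory data sets on a shared region dimension. The region names, revenue figures, and headcounts are made up for the example, and `revenue_per_employee` is a hypothetical helper, not part of any BI product.

```python
# Hypothetical extracts from two separate operational silos.
sales_revenue = {"East": 1_200_000.0, "West": 900_000.0}  # from the Sales system
hr_headcount = {"East": 12, "West": 9}                    # from the HR system


def revenue_per_employee(revenue, headcount):
    """Combine both feeds on the shared 'region' dimension."""
    shared_regions = revenue.keys() & headcount.keys()
    return {
        region: revenue[region] / headcount[region]
        for region in sorted(shared_regions)
        if headcount[region] > 0  # skip regions with no employees
    }


metrics = revenue_per_employee(sales_revenue, hr_headcount)
for region, value in metrics.items():
    print(f"{region}: {value:,.0f} per employee")
```

The calculation only works because both feeds use the same region names, which is the role shared dimensions play in a warehouse.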

Page 14: Business Intelligence: Data Warehouses

Data Integration

Integrating data into a common warehouse is the hardest part of the BI process

Each operational system creates mountains of data in incompatible formats

Extract, Transform, Load (ETL) processes load data from operational systems into the data warehouse
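A minimal sketch of the extract-transform-load idea, assuming two hypothetical source systems that export the same kind of sale in incompatible formats (different field names, date formats, and currency units). All names and records are illustrative, and a plain list stands in for the warehouse table.

```python
from datetime import datetime

# Extract: hypothetical raw records from two operational systems.
pos_rows = [{"sale_date": "03/15/2024", "store": "Boston", "amount_cents": 125000}]
erp_rows = [{"date": "2024-03-16", "site": "BOSTON", "amount_usd": "980.50"}]


def transform_pos(row):
    """Convert a point-of-sale record to the common warehouse format."""
    return {
        "date": datetime.strptime(row["sale_date"], "%m/%d/%Y").date(),
        "store": row["store"].title(),
        "amount": row["amount_cents"] / 100,
    }


def transform_erp(row):
    """Convert an ERP record to the same common format."""
    return {
        "date": datetime.strptime(row["date"], "%Y-%m-%d").date(),
        "store": row["site"].title(),
        "amount": float(row["amount_usd"]),
    }


# Load: append everything to one warehouse table.
warehouse_sales = [transform_pos(r) for r in pos_rows]
warehouse_sales += [transform_erp(r) for r in erp_rows]

for record in warehouse_sales:
    print(record)
```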

Page 15: Business Intelligence: Data Warehouses

BI Platform

Data Sources

Data Warehouse

Data Marts: HR, Sales, Finance

ETL Processes

Page 16: Business Intelligence: Data Warehouses

Data Integration

Business managers/analysts aren’t usually involved in technical details of ETL

They do participate in defining business rules for how data is integrated

Data integration rules are determined by:

Type of analysis to be performed

How well data supports requirements

Page 17: Business Intelligence: Data Warehouses

Data Analysis

Analysis processes are responsible for assembling charts, graphs, etc. and delivering them to business users

Software packages used for these tasks are called front-end tools

Harvest info from data warehouse

Present to users in visual formats

Page 18: Business Intelligence: Data Warehouses

Data Analysis

More advanced analysis tools can be used to explain behavior or uncover hidden trends

Goal of analysis process is to help decision makers by giving them useful data

Page 19: Business Intelligence: Data Warehouses

Reporting & Analysis

Piece of BI that business users are most familiar with

Primary purpose: put data in hands of business users

Reporting & analysis processes need to assemble data into formats that hold meaning for business users

Page 20: Business Intelligence: Data Warehouses

Reporting & Analysis

Multidimensional analysis is designed to make data understandable/useful to business users

Tabular grids are an excellent way to consolidate & present data

Also important to graphically chart data

Graphs and tables work together to give business users different perspectives on data

Page 21: Business Intelligence: Data Warehouses

Graphics Example

      Dept 1               Dept 2               Dept 3               Dept 4
Tenure  Sick Days    Tenure  Sick Days    Tenure  Sick Days    Tenure  Sick Days
10      8.04         10      9.14         10      7.46         8       6.58
8       6.95         8       8.14         8       6.77         8       5.76
13      7.58         13      8.74         13      12.74        8       7.71
9       8.81         9       8.77         9       7.11         8       8.84
11      8.33         11      9.26         11      7.81         8       8.47
14      9.96         14      8.1          14      8.84         8       7.04
6       7.24         6       6.13         6       6.08         8       5.25
4       4.26         4       3.1          4       5.39         19      12.5
12      10.84        12      9.13         12      8.15         8       5.56
7       4.82         7       7.26         7       6.42         8       7.91
5       5.68         5       4.74         5       5.763        8       6.89

Avg Tenure: 9 years

Avg Sick Days: 7.5
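The averages above can be checked with a short script. This is only a sketch: the numbers are copied from the four department tables, the assignment of each table to a department follows the order in which they appear, and the variable names are illustrative.

```python
from statistics import mean

# (tenure, sick days) pairs copied from the four department tables.
departments = {
    "Dept 1": [(10, 8.04), (8, 6.95), (13, 7.58), (9, 8.81), (11, 8.33), (14, 9.96),
               (6, 7.24), (4, 4.26), (12, 10.84), (7, 4.82), (5, 5.68)],
    "Dept 2": [(10, 9.14), (8, 8.14), (13, 8.74), (9, 8.77), (11, 9.26), (14, 8.1),
               (6, 6.13), (4, 3.1), (12, 9.13), (7, 7.26), (5, 4.74)],
    "Dept 3": [(10, 7.46), (8, 6.77), (13, 12.74), (9, 7.11), (11, 7.81), (14, 8.84),
               (6, 6.08), (4, 5.39), (12, 8.15), (7, 6.42), (5, 5.763)],
    "Dept 4": [(8, 6.58), (8, 5.76), (8, 7.71), (8, 8.84), (8, 8.47), (8, 7.04),
               (8, 5.25), (19, 12.5), (8, 5.56), (8, 7.91), (8, 6.89)],
}

summary = {}
for name, rows in departments.items():
    avg_tenure = mean(tenure for tenure, _ in rows)
    avg_sick = mean(sick for _, sick in rows)
    # Round to one decimal place, matching the precision shown on the slide.
    summary[name] = (round(avg_tenure, 1), round(avg_sick, 1))
    print(f"{name}: avg tenure {avg_tenure:.1f} years, avg sick days {avg_sick:.1f}")
```

Every department produces the same two averages even though the individual rows differ widely, so the summary numbers alone cannot tell the departments apart.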

Page 22: Business Intelligence: Data Warehouses

Graphics Example

[Figure: four charts, one per department (Dept 1, Dept 2, Dept 3, Dept 4)]

Page 23: Business Intelligence: Data Warehouses

Business Users

Power Analysts

Information Consumers

Information Users

Page 24: Business Intelligence: Data Warehouses

Business Users

Information Users

Page 25: Business Intelligence: Data Warehouses

Information Users

Require standard reports

Can be short or extensive

Usually contain charts and tables

Want consistent report formats

No need to “slice and dice” data

Static or very simple dynamic reports

Printed

MS Office document formats (PPT, XLS)

Page 26: Business Intelligence: Data Warehouses

Business Users

Information Consumers

Page 27: Business Intelligence: Data Warehouses

Information Consumers

Want to perform dynamic data queries

Not experts in database design or query tools

Want to be able to pivot and nest data inside an intuitive interface

Interactive ad hoc tools can prompt info users to cross the line into info consumer territory

Page 28: Business Intelligence: Data Warehouses

Business Users

Power Analysts

Page 29: Business Intelligence: Data Warehouses

Power Analysts

Use the full analytical power of the system to do free-form ad hoc analysis

Know the details of database design and query tool software

Create reports for others

Smallest of the three groups of users

Page 30: Business Intelligence: Data Warehouses

Front-End Tools

Present data from warehouse to business users as reports and interactive data views

Can be grouped into two categories:

Reporting tools

Data exploration

Page 31: Business Intelligence: Data Warehouses

Front-End Tools

Reporting paradigm:

Excellent at producing tabular reports

Lots of mature and stable packages

Web interfaces for wide-scale deployment

Strong printing/scheduling capabilities

Multidimensional data exploration:

Excellent for dealing with OLAP cubes

Support interactive ad hoc analysis

Graphical charts and views

Page 32: Business Intelligence: Data Warehouses

Front-End Tools

Competitive market space

Wide range of available features and functionality

Page 33: Business Intelligence: Data Warehouses

Front-End Tools

Remember: features aren’t benefits

Advanced analysis features are useful to power analysts, but not to info users

Invest time to figure out broader BI objectives and needs of users

Select solution providers based on your objectives and needs

Page 34: Business Intelligence: Data Warehouses

Data Warehouses

Primary task: support reporting & analysis

Warehouse design & content driven by business needs

Business people determine what info they need to make better decisions faster

IT implements warehouse to fit business needs

Page 35: Business Intelligence: Data Warehouses

Data Warehouses

Business & IT need to be aligned on business requirements

Page 36: Business Intelligence: Data Warehouses

Subject Oriented

Data warehouses organize data into subject-specific data marts

Data marts are NOT silos of data

Data marts gather data from multiple operational systems to support analyses

Ex: product line profitability

Data in the warehouse is shared by the data marts

Page 37: Business Intelligence: Data Warehouses

Consistent Data

Warehouses provide consistent data by using the same dimensions and measures for all data

Consistent: data to be analyzed has the same definitions across the entire company

Achieving data consistency requires both integration and organizational decisions

Page 38: Business Intelligence: Data Warehouses

Consistent Data

Data from multiple operational systems has to be integrated into one common data set for analysis

Problem: Different systems may have subtly different definitions of “discount”

Solution: Data warehouse integrates/transforms data based on a consistent business rule
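To make the “discount” example concrete, here is a hedged sketch. It assumes one hypothetical system records discounts as a percentage of list price and another records only list and net prices. The rule chosen here (store the discount as an absolute currency amount) is just one possible business rule, not one stated in the deck.

```python
def discount_from_percent(list_price, percent_off):
    """System A (hypothetical) stores the discount as a percentage."""
    return round(list_price * percent_off / 100, 2)


def discount_from_net(list_price, net_price):
    """System B (hypothetical) stores only list price and net price."""
    return round(list_price - net_price, 2)


# After transformation both feeds carry the same measure:
# the discount amount in currency units.
order_a = {"list_price": 200.0, "percent_off": 15}
order_b = {"list_price": 200.0, "net_price": 170.0}

discount_a = discount_from_percent(order_a["list_price"], order_a["percent_off"])
discount_b = discount_from_net(order_b["list_price"], order_b["net_price"])
print(discount_a, discount_b)
```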

Page 39: Business Intelligence: Data Warehouses

Consistent Data

Problem: Source data has different dimension structures

Solution: Warehouse defines uniform dimension designs

Consistent data requires standardized measure & dimension definitions

Everyone in company needs to “speak the same language” for dimensions & measures

Page 40: Business Intelligence: Data Warehouses

Cleansed Data

Cleansed data: data that has been validated against business & structural rules

Storing cleansed data is a key priority for data warehouses

Data from operational systems is usually uncleansed “dirty data”

Page 41: Business Intelligence: Data Warehouses

Types of Dirty Data

Missing

Information not entered into an order tracking system

Incorrect

A single Walmart store reporting that it sold 50K razor blades in an hour

Data entry errors

“Booston, MA” entered instead of “Boston, MA”

Subtle issues like double-counting

Page 42: Business Intelligence: Data Warehouses

Cleansed Data

ETL processes use business rules to load valid data and cleanse/reject invalid data
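A small sketch of rule-based loading, assuming three illustrative rules that echo the dirty-data types listed earlier (missing values, data entry errors, implausible values). The field names, the list of valid states, and the threshold are assumptions made for the example.

```python
VALID_STATES = {"MA", "NY", "CA"}   # assumed reference list
MAX_UNITS_PER_HOUR = 1_000          # assumed plausibility threshold


def validate(row):
    """Return a list of rule violations; an empty list means the row is clean."""
    problems = []
    if not row.get("customer_id"):
        problems.append("missing customer_id")
    if row.get("state") not in VALID_STATES:
        problems.append("unknown state")
    if row.get("units", 0) > MAX_UNITS_PER_HOUR:
        problems.append("implausible unit count")
    return problems


incoming = [
    {"customer_id": "C1", "state": "MA", "units": 40},
    {"customer_id": "", "state": "MA", "units": 5},
    {"customer_id": "C3", "state": "MA", "units": 50_000},
]

loaded, rejected = [], []
for row in incoming:
    problems = validate(row)
    if problems:
        rejected.append((row, problems))  # keep the reasons for later review
    else:
        loaded.append(row)

print(len(loaded), "loaded;", len(rejected), "rejected")
```

Keeping the rejection reasons alongside each rejected row makes it easier to decide later whether a record can be cleansed and reloaded.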

Page 43: Business Intelligence: Data Warehouses

Historical Data

Warehouses let you analyze data over specific time periods

Provide users with “snapshots” of data from operational systems

Warehouse data is static, unlike data in operational systems

Warehouse data is refreshed at regular time intervals

Page 44: Business Intelligence: Data Warehouses

Historical Data

Data warehouses are non-volatile

Historical data lets analysts identify trends and exceptions

Ex: comparing year-over-year sales on a quarterly basis
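The year-over-year comparison can be sketched as follows. The quarterly totals are invented for illustration, and `year_over_year` is a hypothetical helper rather than a feature of any particular tool.

```python
# Hypothetical quarterly sales totals keyed by (year, quarter).
quarterly_sales = {
    (2023, 1): 100.0, (2023, 2): 120.0, (2023, 3): 110.0, (2023, 4): 150.0,
    (2024, 1): 110.0, (2024, 2): 114.0, (2024, 3): 132.0, (2024, 4): 165.0,
}


def year_over_year(sales, year):
    """Percent change of each quarter versus the same quarter one year earlier."""
    changes = {}
    for quarter in range(1, 5):
        current = sales.get((year, quarter))
        previous = sales.get((year - 1, quarter))
        if current is None or not previous:
            continue  # skip quarters without a usable comparison
        changes[quarter] = round((current - previous) / previous * 100, 1)
    return changes


yoy_2024 = year_over_year(quarterly_sales, 2024)
print(yoy_2024)
```

The comparison is only possible because the earlier year's figures are still stored, which is what the historical, non-volatile nature of the warehouse provides.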

Page 45: Business Intelligence: Data Warehouses

Fast Delivery of Data

Warehouse has to provide data to users quickly and efficiently

Database technology and structures need to be fast & efficient

Two types of databases in common usage:

OLAP (OnLine Analytical Processing)

RDBMS (Relational DataBase Management Systems)

Page 46: Business Intelligence: Data Warehouses

OLAP Databases

Benefits of OLAP:

Native support of multidimensional analysis

Fast data retrieval

Pre-process data as much as possible

Ideal for fast retrieval of aggregated data

OLAP is usually a good candidate for data marts

Page 47: Business Intelligence: Data Warehouses

OLAP Databases

Important recent developments:

Much easier to design OLAP databases

Acquisition costs are extremely low

SMBs can now use technology that was only available to large enterprises a few years ago

Page 48: Business Intelligence: Data Warehouses

OLAP & Relational Databases

Relational databases often store underlying data supplied to OLAP database

RDBMS stores detailed data, OLAP stores summarized data views

Example: Sales data mart

Relational stores daily sales data

OLAP stores and manages summarized sales data by customer, product, region, etc.
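The split between detailed and summarized data can be illustrated with a small sketch. Real OLAP engines are far more sophisticated; this only mirrors the idea of pre-aggregating a measure along chosen dimensions. The rows and the `summarize` helper are hypothetical.

```python
from collections import defaultdict

# Detailed rows, as a relational table might hold them (hypothetical data).
daily_sales = [
    {"day": "2024-03-01", "region": "East", "product": "Widget", "amount": 100.0},
    {"day": "2024-03-01", "region": "West", "product": "Widget", "amount": 80.0},
    {"day": "2024-03-02", "region": "East", "product": "Widget", "amount": 50.0},
    {"day": "2024-03-02", "region": "East", "product": "Gadget", "amount": 70.0},
]


def summarize(rows, dimensions):
    """Pre-aggregate the amount measure for one combination of dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row["amount"]
    return dict(totals)


by_region = summarize(daily_sales, ["region"])
by_region_product = summarize(daily_sales, ["region", "product"])
print(by_region)
print(by_region_product)
```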

Page 49: Business Intelligence: Data Warehouses

Relational Databases

Relational databases can host data marts without OLAP

Use their own set of dimensions & measures to support analysis

Requires sophisticated front-end tools that can quickly assemble relational data into multidimensional formats
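As a rough idea of what assembling relational data into a multidimensional format involves, the sketch below pivots flat query rows into a region-by-quarter cross-tab. The rows are invented, and `pivot` is a hypothetical helper, not the API of any front-end tool.

```python
# Hypothetical rows returned by a relational query: (region, quarter, amount).
rows = [
    ("East", "Q1", 100.0), ("East", "Q2", 120.0),
    ("West", "Q1", 90.0), ("West", "Q2", 95.0), ("East", "Q1", 10.0),
]


def pivot(records):
    """Arrange (row_dim, col_dim, measure) tuples into a nested cross-tab."""
    table = {}
    for row_key, col_key, value in records:
        cells = table.setdefault(row_key, {})
        cells[col_key] = cells.get(col_key, 0.0) + value
    return table


crosstab = pivot(rows)
columns = sorted({col for cells in crosstab.values() for col in cells})
print("Region", *columns, sep="\t")
for region in sorted(crosstab):
    print(region, *(crosstab[region].get(c, 0.0) for c in columns), sep="\t")
```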

Page 50: Business Intelligence: Data Warehouses

Conclusions

Data warehouse architecture is a flexible, effective decision support platform

Warehouse helps organize and deliver data to decision makers

Brings BI to life through data marts, DB technology, ETL tools, and analysis tools

Helps business managers make better decisions faster

Page 51: Business Intelligence: Data Warehouses

Michael Lamont

[email protected]