OLAP Concepts


OLAP – On Line Analytical Processing


People. Passion. Excellence

Session Objectives

At the end of this session, you will be able to:

> Define On Line Analytical Processing

> Understand the need for OLAP and applications of OLAP in BI

> Describe the various OLAP solutions and architectures

> Compare different OLAP architectures

> Identify the evaluation parameters to consider when selecting an OLAP tool


What is OLAP?

> OLAP (On Line Analytical Processing) applications are designed for online, ad hoc data access and analysis.

> Data is organized into multiple dimensions.

> Provides access to analytical content such as time series and trend analysis views and summary-level information.

> A set of functionality that facilitates multidimensional analysis.

> Offers drill-down, drill-across, and slice-and-dice capabilities.


OLAP - Fast Analysis

• On Line: No piles of paper, please!

• Analytical: Establish patterns

• Processing: Data-based

• Fast Analysis of Shared Multidimensional Information


Need for OLAP

• How many dimensions can we think in? E.g. analysis by branch, product, agent, year. Answer: 2 or 3

• How many types of values can we handle? E.g. Sales, Profit, Cost. Answer: 1 or 2

• How many levels can we handle? E.g. number of products we can analyze


Need for OLAP

Many parameters affect a measure (value), e.g. sales are influenced by product, region, time, distribution channel, etc.

Linear analysis = reports

Many totals are at one level

Difficult to identify the key parameters


OLAP in an Enterprise


Uses of OLAP

Departments:

Finance

Marketing

Sales

Manufacturing

Analytical Capabilities:

> Used by analysts and managers.

> Offers an aggregated view of the data, such as total revenues by customer profile, by product line, or by geographical region.


Functionality of OLAP Tools

> Provides the decision support front-end for data warehousing.

> Advanced statistical, financial, and analytical calculations.

> Appropriate tools to access data from a relational database.

> Appropriate tools to access or manage multidimensional data.


Features of OLAP Applications

OLAP analytical features

> Multi-dimensional views of data

> Calculation-intensive capabilities

> Time intelligence

The calculation engine in OLAP tools has a wide range of built-in calculations such as:

> Ratios

> Time calculations

> Statistics

> Ranking

> Custom formulas/algorithms

> Forecasting and modeling
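Three of the built-in calculations listed on this slide (ratios, time calculations, ranking) can be sketched in a few lines of Python. This is only an illustration: the monthly figures and the function names are hypothetical and do not come from the slides or from any particular OLAP tool.

```python
# Hypothetical monthly sales per product (illustrative values only).
sales = {
    "p1": {"Jan": 100.0, "Feb": 120.0},
    "p2": {"Jan": 50.0, "Feb": 45.0},
}


def share_of_total(month):
    """Ratio: each product's share of that month's total sales."""
    total = sum(by_month[month] for by_month in sales.values())
    return {prod: by_month[month] / total for prod, by_month in sales.items()}


def pct_change(prod, prev_month, month):
    """Time calculation: percentage change between two periods."""
    prev, cur = sales[prod][prev_month], sales[prod][month]
    return (cur - prev) / prev * 100.0


def ranking(month):
    """Ranking: products ordered by sales, highest first."""
    return sorted(sales, key=lambda prod: sales[prod][month], reverse=True)
```

For example, `pct_change("p1", "Jan", "Feb")` gives the month-over-month growth of product p1, and `ranking("Feb")` lists the products from best to worst seller.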

Evolution of OLAP


Star Schema

> A Star Schema is a dimensional model created by mapping data entities from operational systems

> It has a central table (fact table) that links all the other tables (dimension tables) together

> Dimension: A category of related information. For example, year, month, day, and week are all part of the Time dimension.

> Measure: A property that can be summed or averaged, often using precomputed aggregates.
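A star schema can be sketched with Python's built-in sqlite3 module: one fact table in the centre, linked to one table per dimension. The fact rows and the store-to-region mapping are the ones used in the aggregation slides of this deck; the table names, the brand values and the month column are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables (brand and month values are hypothetical).
CREATE TABLE dim_product (prodId TEXT PRIMARY KEY, brand TEXT);
CREATE TABLE dim_store   (storeId TEXT PRIMARY KEY, region TEXT);
CREATE TABLE dim_time    (date INTEGER PRIMARY KEY, month TEXT);

-- Central fact table: one key per dimension plus the measure (amt).
CREATE TABLE fact_sale (
    prodId  TEXT REFERENCES dim_product(prodId),
    storeId TEXT REFERENCES dim_store(storeId),
    date    INTEGER REFERENCES dim_time(date),
    amt     INTEGER
);

INSERT INTO dim_product VALUES ('p1', 'brandX'), ('p2', 'brandY');
INSERT INTO dim_store   VALUES ('s1', 'A'), ('s2', 'B'), ('s3', 'B');
INSERT INTO dim_time    VALUES (1, 'Jan'), (2, 'Jan');
INSERT INTO fact_sale VALUES
    ('p1', 's1', 1, 12), ('p2', 's1', 1, 11), ('p1', 's3', 1, 50),
    ('p2', 's2', 1, 8),  ('p1', 's1', 2, 44), ('p1', 's2', 2, 4);
""")

# Join the fact table to a dimension and aggregate the measure.
sales_by_region = conn.execute("""
    SELECT s.region, SUM(f.amt)
    FROM fact_sale AS f
    JOIN dim_store AS s ON f.storeId = s.storeId
    GROUP BY s.region
    ORDER BY s.region
""").fetchall()
print(sales_by_region)
```

Every analytical query follows the same shape: join the fact table to the dimensions of interest, group by a dimension attribute, and aggregate the measure.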


Facts and Measures

> Facts or Measures are the Key Performance Indicators of an enterprise

> Factual data about the subject area

> Numeric, summarized

Examples: Net Profit, Sales Revenue, Gross Margin, Profitability, Cost


Dimension

> Dimensions put measures in perspective

> What, when and where qualifiers to the measures

> Dimensions could be products, customers, time, geography, etc.

Example: Sales Revenue (measure)

– What was sold?

– Whom was it sold to?

– When was it sold?

– Where was it sold?


Star Schema


Star Schema Example


Star Schema with Sample Data


Cube

– Multidimensional databases store information in the form of cubes.

– A cube is a collection of facts and related dimensions stored together in arrays.

[Figure: a cube with the dimensions Geography, Time and Product; labels: Sales, HR]


Basic Terminology of a Cube

> Hierarchy: A hierarchy defines the navigating path for drilling up and drilling down. All attributes in a hierarchy belong to the same dimension.

> Levels: These are organized into one or more hierarchies, typically from a coarse-grained level (for example, Year) down to the most detailed one (for example, Day).

> Members: The individual category values (for example, 2002 or 21Jan2002).

> Measures: These are the data values that are summarized and analyzed. Examples of measures are sales figures or operational costs.

> Cells: These are the intersection of one member for every dimension and store the data for measures.


Basic Terminology of a Cube

> Dimensions consist of

– Dimension Name

– Level

– Hierarchy

– Member

Example: the Time dimension

Level of detail   Members
YEAR              1999, 2000, 2001
QUARTER           Q1, Q2, Q3, Q4 (under each year)


Aggregates

sale
prodId  storeId  date  amt
p1      s1       1     12
p2      s1       1     11
p1      s3       1     50
p2      s2       1     8
p1      s1       2     44
p1      s2       2     4

Add up amounts for day 1

In SQL: SELECT sum(amt) FROM SALE WHERE date = 1

Result: 81
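The query on this slide can be run as-is with Python's built-in sqlite3 module. The sketch below loads the six sale rows from the slide into an in-memory table; the column types and the variable names are assumptions.

```python
import sqlite3

# The six rows of the sale fact table shown on the slide.
SALE_ROWS = [
    ("p1", "s1", 1, 12),
    ("p2", "s1", 1, 11),
    ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),
    ("p1", "s1", 2, 44),
    ("p1", "s2", 2, 4),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)"
)
conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)", SALE_ROWS)

# The slide's query: add up amounts for day 1.
day1_total = conn.execute("SELECT sum(amt) FROM sale WHERE date = 1").fetchone()[0]
print(day1_total)  # 81, as on the slide (12 + 11 + 50 + 8)
```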


Aggregates

Add up amounts by day

In SQL: SELECT date, sum(amt) FROM SALE GROUP BY date

sale
prodId  storeId  date  amt
p1      s1       1     12
p2      s1       1     11
p1      s3       1     50
p2      s2       1     8
p1      s1       2     44
p1      s2       2     4

ans
date  sum
1     81
2     48
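The GROUP BY query on this slide can be checked in the same way. This sketch again uses sqlite3 with the slide's sale rows; the ORDER BY clause is an addition so that the result order is predictable.

```python
import sqlite3

SALE_ROWS = [
    ("p1", "s1", 1, 12),
    ("p2", "s1", 1, 11),
    ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),
    ("p1", "s1", 2, 44),
    ("p1", "s2", 2, 4),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)"
)
conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)", SALE_ROWS)

# One output row per distinct date, with the amounts of that date summed.
totals_by_day = conn.execute(
    "SELECT date, sum(amt) FROM sale GROUP BY date ORDER BY date"
).fetchall()
print(totals_by_day)  # [(1, 81), (2, 48)], the "ans" table on the slide
```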


Another Example

Add up amounts by day, product

In SQL: SELECT prodId, date, sum(amt) FROM SALE GROUP BY date, prodId

sale
prodId  storeId  date  amt
p1      s1       1     12
p2      s1       1     11
p1      s3       1     50
p2      s2       1     8
p1      s1       2     44
p1      s2       2     4

ans
prodId  date  amt
p1      1     62
p2      1     19
p1      2     48

rollup: from the sale table to the summary

drill-down: from the summary back to the sale table
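The relationship between drill-down and rollup can be illustrated in plain Python: grouping by (date, product) gives the finer view, and summing that view over product gives the coarser per-day view again. The helper name `aggregate` is hypothetical; the data is the slide's sale table.

```python
from collections import defaultdict

# Rows are (prodId, storeId, date, amt), as in the slide's sale table.
SALE_ROWS = [
    ("p1", "s1", 1, 12),
    ("p2", "s1", 1, 11),
    ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),
    ("p1", "s1", 2, 44),
    ("p1", "s2", 2, 4),
]


def aggregate(rows, key):
    """Sum amt per group; `key` extracts the grouping columns from a row."""
    totals = defaultdict(int)
    for row in rows:
        totals[key(row)] += row[3]
    return dict(totals)


# Finer level: one total per (date, product).
by_date_product = aggregate(SALE_ROWS, key=lambda r: (r[2], r[0]))

# Coarser level: one total per date.
by_date = aggregate(SALE_ROWS, key=lambda r: r[2])

# Rollup: summing the finer result over product reproduces the coarser one.
rolled_up = defaultdict(int)
for (date, _prod), amt in by_date_product.items():
    rolled_up[date] += amt
```

Drill-down is the opposite move: starting from `by_date` and asking for `by_date_product`, which requires going back to the more detailed data.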


Aggregates

> Operators: sum, count, max, min, median and avg

> "Having" clause

> Using dimension hierarchy

– average by region (within store)

– maximum by month (within date)
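The HAVING clause and an aggregate over a dimension hierarchy can be sketched with sqlite3. The sale rows are the ones used throughout these slides, and the store-to-region mapping (s1 in Region A; s2 and s3 in Region B) is the one given on the hierarchy slide of this deck. The `store` table name and the threshold of 20 are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sale (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER);
INSERT INTO sale VALUES
    ('p1', 's1', 1, 12), ('p2', 's1', 1, 11), ('p1', 's3', 1, 50),
    ('p2', 's2', 1, 8),  ('p1', 's1', 2, 44), ('p1', 's2', 2, 4);

-- Hypothetical dimension table carrying the store -> region hierarchy.
CREATE TABLE store (storeId TEXT, region TEXT);
INSERT INTO store VALUES ('s1', 'A'), ('s2', 'B'), ('s3', 'B');
""")

# HAVING filters whole groups after aggregation:
# keep only the stores whose total sales exceed 20.
big_stores = conn.execute("""
    SELECT storeId, sum(amt)
    FROM sale
    GROUP BY storeId
    HAVING sum(amt) > 20
    ORDER BY storeId
""").fetchall()

# Using the hierarchy: average sale amount by region.
avg_by_region = conn.execute("""
    SELECT st.region, avg(s.amt)
    FROM sale AS s
    JOIN store AS st ON s.storeId = st.storeId
    GROUP BY st.region
    ORDER BY st.region
""").fetchall()
```

WHERE would filter individual sale rows before grouping; HAVING is needed here because the condition is on the aggregated value.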


The MOLAP Cube

Fact table view:

sale
prodId  storeId  amt
p1      s1       12
p2      s1       11
p1      s3       50
p2      s2       8

Multi-dimensional cube (- = empty cell):

      s1  s2  s3
p1    12  -   50
p2    11  8   -

dimensions = 2
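Turning the fact table view into the two-dimensional cube is essentially a pivot. A minimal Python sketch, using a nested list as the array and `None` for empty cells (both are representation choices made here, not something the slide prescribes):

```python
# Fact table view from the slide: (prodId, storeId, amt).
SALE_2D = [
    ("p1", "s1", 12),
    ("p2", "s1", 11),
    ("p1", "s3", 50),
    ("p2", "s2", 8),
]

# The members of each dimension become the axes of the array.
products = sorted({prod for prod, _store, _amt in SALE_2D})
stores = sorted({store for _prod, store, _amt in SALE_2D})

# cube[i][j] holds amt for products[i] and stores[j]; None marks an empty cell.
cube = [[None] * len(stores) for _ in products]
for prod, store, amt in SALE_2D:
    cube[products.index(prod)][stores.index(store)] = amt

for prod, row in zip(products, cube):
    print(prod, row)
```

Note that the array reserves a slot for every product/store combination, including the two that have no sales.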


3-D Cube

Fact table view:

sale
prodId  storeId  date  amt
p1      s1       1     12
p2      s1       1     11
p1      s3       1     50
p2      s2       1     8
p1      s1       2     44
p1      s2       2     4

Multi-dimensional cube (- = empty cell):

day 1
      s1  s2  s3
p1    12  -   50
p2    11  8   -

day 2
      s1  s2  s3
p1    44  4   -
p2    -   -   -

dimensions = 3
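A cell of the 3-D cube is addressed by one member from each dimension. The sketch below stores the cube as a Python dictionary keyed by (prodId, storeId, date); this sparse representation is an assumption for illustration, since a MOLAP server would typically use arrays.

```python
# Rows are (prodId, storeId, date, amt), as in the slide's sale table.
SALE_ROWS = [
    ("p1", "s1", 1, 12),
    ("p2", "s1", 1, 11),
    ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),
    ("p1", "s1", 2, 44),
    ("p1", "s2", 2, 4),
]

# One entry per non-empty cell of the cube.
cube = {(prod, store, date): amt for prod, store, date, amt in SALE_ROWS}


def cell(prod, store, date):
    """Return the measure at this intersection, or None if the cell is empty."""
    return cube.get((prod, store, date))


# 2 products x 3 stores x 2 days = 12 possible cells, of which 6 are filled.
possible_cells = 2 * 3 * 2
filled_cells = len(cube)
```

For example, `cell("p1", "s1", 2)` returns 44, while `cell("p2", "s3", 1)` returns `None` because that product was not sold in that store on that day.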


Example

[Figure: a three-dimensional cube with the axes Store (NY, SF, LA), Product (Juice, Milk, Coke, Cream, Soap, Bread) and Time (M, T, W, Th, F, S, S). Example cell: 56 units of bread sold in LA on M.]

Dimensions: Time, Product, Store

Attributes: Product (upc, price, …), Store …

Hierarchies:

Product -> Brand -> …

Day -> Week -> Quarter

Store -> Region -> Country

Roll-ups shown: roll-up to week, roll-up to brand, roll-up to region


Cube Aggregation: Roll-up

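Example: computing sums

3-D cube (- = empty cell):

day 1
      s1  s2  s3
p1    12  -   50
p2    11  8   -

day 2
      s1  s2  s3
p1    44  4   -
p2    -   -   -

rollup over date:

      s1  s2  s3
p1    56  4   50
p2    11  8   -

rollup over product:

      s1  s2  s3
sum   67  12  50

rollup over store:

      sum
p1    110
p2    19

rollup over all dimensions: 129

. . .

drill-down is the reverse of rollup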
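The chain of sums on this slide can be reproduced with one small helper that keeps some dimensions and sums over the rest. The function name `roll_up` and its `keep` parameter are hypothetical; the data is the slide's sale table.

```python
from collections import defaultdict

# Rows are (prodId, storeId, date, amt), as in the slide's sale table.
SALE_ROWS = [
    ("p1", "s1", 1, 12),
    ("p2", "s1", 1, 11),
    ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),
    ("p1", "s1", 2, 44),
    ("p1", "s2", 2, 4),
]
DIM_NAMES = ("prodId", "storeId", "date")


def roll_up(rows, keep):
    """Sum amt, keeping only the dimensions named in `keep`."""
    totals = defaultdict(int)
    for *dims, amt in rows:
        key = tuple(v for name, v in zip(DIM_NAMES, dims) if name in keep)
        totals[key] += amt
    return dict(totals)


by_prod_store = roll_up(SALE_ROWS, {"prodId", "storeId"})  # date rolled up
by_store = roll_up(SALE_ROWS, {"storeId"})                 # date, product rolled up
by_prod = roll_up(SALE_ROWS, {"prodId"})                   # date, store rolled up
grand_total = roll_up(SALE_ROWS, set())                    # everything rolled up
```

Each call corresponds to one of the tables on the slide: the 2-D product-by-store table, the per-store sums, the per-product sums, and the grand total.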


Aggregation Using Hierarchies

Hierarchy: store -> region -> country

(store s1 in Region A; stores s2, s3 in Region B)

3-D cube (- = empty cell):

day 1
      s1  s2  s3
p1    12  -   50
p2    11  8   -

day 2
      s1  s2  s3
p1    44  4   -
p2    -   -   -

Aggregated to the region level:

      region A  region B
p1    56        54
p2    11        8
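Aggregating along a hierarchy means replacing each member by its parent before summing. A minimal Python sketch using the store-to-region mapping stated on this slide (the dictionary and variable names are assumptions):

```python
from collections import defaultdict

# Rows are (prodId, storeId, date, amt), as in the slide's sale table.
SALE_ROWS = [
    ("p1", "s1", 1, 12),
    ("p2", "s1", 1, 11),
    ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),
    ("p1", "s1", 2, 44),
    ("p1", "s2", 2, 4),
]

# Store -> region level of the hierarchy, as stated on the slide.
STORE_TO_REGION = {"s1": "A", "s2": "B", "s3": "B"}

# Map each store to its region, then sum per (product, region).
by_prod_region = defaultdict(int)
for prod, store, _date, amt in SALE_ROWS:
    by_prod_region[(prod, STORE_TO_REGION[store])] += amt

print(dict(by_prod_region))
```

The same pattern would apply one level higher (region to country) given a region-to-country mapping, which the slide does not provide.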

Slicing

3-D cube (- = empty cell):

day 1
      s1  s2  s3
p1    12  -   50
p2    11  8   -

day 2
      s1  s2  s3
p1    44  4   -
p2    -   -   -

Slice: TIME = day 1

      s1  s2  s3
p1    12  -   50
p2    11  8   -

In SQL: SELECT * FROM SALE WHERE date = 1
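On a cube held in memory, slicing fixes one member of one dimension and returns a cube with one dimension fewer. The sketch below does this for the Time dimension, with the cube stored as a dictionary keyed by (prodId, storeId, date); the representation and the function name are assumptions for illustration.

```python
# Rows are (prodId, storeId, date, amt), as in the slide's sale table.
SALE_ROWS = [
    ("p1", "s1", 1, 12),
    ("p2", "s1", 1, 11),
    ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),
    ("p1", "s1", 2, 44),
    ("p1", "s2", 2, 4),
]

cube = {(prod, store, date): amt for prod, store, date, amt in SALE_ROWS}


def slice_by_date(cube, date):
    """Fix the Time dimension at `date`; the result is a 2-D (product, store) cube."""
    return {(prod, store): amt for (prod, store, d), amt in cube.items() if d == date}


day1 = slice_by_date(cube, 1)
print(day1)
```

The result has the same four cells as the slide's "TIME = day 1" table and the same rows that the SQL WHERE clause selects.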

OLAP Solutions and Architecture


OLAP - Classification

Online Analytical Processing (OLAP) can be done on:

> Relational databases

> Multidimensional databases

OLAP products are grouped into three categories:

> Relational OLAP (ROLAP)

> Multidimensional OLAP (MOLAP)

> Hybrid OLAP (HOLAP)


MOLAP

Multi-dimensional OLAP

MOLAP is a technology that uses a multi-dimensional database, which stores data as an n-dimensional cube.

[Figure: a cube with the dimensions Geography, Age Group and Brand]


Architecture of MOLAP

[Figure: MOLAP architecture diagram. Components shown:]

Data Mart Server

• RDBMS

• Connectivity Middleware

Non-live connection between the Data Mart Server and the MOLAP Server

• Used for updating the MOLAP data cube only

MOLAP Server

• MDDBMS/Data Cube

• MOLAP Application

Desktop Systems

• MOLAP Client Tools

Thin Clients

• WWW Browser

Network elements: LAN, Intranet/Internet, Router, Firewall

Cube size is critical.

Issues:

• Size of Data Cube

• Cubes deployment

• Size of Update Data Set


MOLAP Products

Oracle Express Server (Oracle)

PowerPlay Transformer (Cognos)

Essbase (Hyperion Software)

Holos (Seagate Software)


Architecture of ROLAP

[Figure: ROLAP architecture diagram. Components shown:]

Data Mart Server

• RDBMS

• Connectivity Middleware

ROLAP Server

• ROLAP Application

Desktop Systems

• ROLAP Client Tools

Thin Clients

• WWW Browser

Network elements: LAN, Intranet/Internet, Router/Firewall

Issues:

• Aggregate Awareness

• Response Time

• Network Capacity


ROLAP Products

Brio Query Enterprise

Business Objects

Metacube

DSS Server

Information Advantage


Architecture of HOLAP

[Figure: HOLAP architecture diagram. Components shown:]

ROLAP Server

• ROLAP Application

MOLAP Server

• MDDBMS/Data Cube

• MOLAP Application

Desktop Systems

• HOLAP Client Tools

Network elements: LAN, Router/Firewall

Issues:

• Cube elements

• Integration with RDBMS


HOLAP Products

Holos (Seagate Software)

Microsoft SQL Server OLAP Services

Pilot Software's Pilot Decision Support Suite

SAS

MOLAP Vs ROLAP


Comparison of Architectures

Architectural Features              MOLAP                    ROLAP
Number of dimensions                Ten or less              Unlimited
Support for large number of users   Limited support          Good
Scalability                         Poor                     Good
Complex multidimensional analysis   Easier to achieve        Difficult to achieve
Volume of data storage              Up to 50 GB              Hundreds of gigabytes and terabytes
Storage of information              Through cubes            SQL result sets
User interface & functionality      Good                     Normal
Common access language              NA                       SQL
Nature of data                      Stores summarized data   Stores detailed as well as summarized data


Strength and Weakness of MOLAP/ROLAP

Parameter: Application design
  MOLAP: Essentially the definition of the dimensional model and calculation rules
  ROLAP: Uses two-dimensional tables that are stored in RDBMSs. (Data is stored in a Star schema or Snowflake schema.)

Parameter: Aggregation techniques
  MOLAP: Measures are pre-calculated and stored at each hierarchy summary level during load time
  ROLAP: Summary tables are implemented in the relational database

Parameter: Multidimensional analysis
  MOLAP: Drill down, Drill up, Drill across and Slicing/Dicing
  ROLAP: Drill down, Drill up, Slicing and Dicing

Parameter: Query performance
  MOLAP: Instant response
  ROLAP: Slower

Parameter: Value added functions
  MOLAP: Supports complex functions like % change, ranking, etc.
  ROLAP: Limited value added functions

Parameter: User-defined calculations
  MOLAP: Calculated from cubes
  ROLAP: Calculated (on the fly) from the database
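The difference in aggregation technique can be sketched in Python. This is a deliberately simplified model, not how any listed product works: `precompute` stands in for the MOLAP-style load-time step (here it pre-calculates totals for every subset of dimensions rather than for every hierarchy level), and `on_the_fly` stands in for a ROLAP-style query that scans the detail rows at query time. Both function names are hypothetical; the data is the sale table used in the earlier slides.

```python
from collections import defaultdict
from itertools import combinations

# Rows are (prodId, storeId, date, amt).
SALE_ROWS = [
    ("p1", "s1", 1, 12),
    ("p2", "s1", 1, 11),
    ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),
    ("p1", "s1", 2, 44),
    ("p1", "s2", 2, 4),
]
DIMS = ("prodId", "storeId", "date")


def precompute(rows):
    """Load-time step: total amt for every subset of the dimensions."""
    aggregates = {}
    for k in range(len(DIMS) + 1):
        for idx in combinations(range(len(DIMS)), k):
            totals = defaultdict(int)
            for row in rows:
                totals[tuple(row[i] for i in idx)] += row[3]
            aggregates[tuple(DIMS[i] for i in idx)] = dict(totals)
    return aggregates


def on_the_fly(rows, **filters):
    """Query-time step: scan the detail rows and sum the matching ones."""
    return sum(
        row[3]
        for row in rows
        if all(row[DIMS.index(dim)] == value for dim, value in filters.items())
    )


AGGREGATES = precompute(SALE_ROWS)

# Pre-calculated: answering a query is a dictionary lookup.
s1_lookup = AGGREGATES[("storeId",)][("s1",)]

# On the fly: the same answer, computed by scanning the fact rows.
s1_scan = on_the_fly(SALE_ROWS, storeId="s1")
```

The trade-off in the table follows from this: the pre-calculated structure answers instantly but must be rebuilt when the data changes and grows with the number of dimensions, while the on-the-fly query always reflects the current rows but pays the scan cost on every request.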


Parameter: Processing overhead for large input data sets
  MOLAP: High
  ROLAP: Low

Parameter: Support for frequent updates
  MOLAP: Cannot handle frequent updates of cubes
  ROLAP: Suitable for frequent updates

Parameter: Resource requirements
  MOLAP: High
  ROLAP: Low

Parameter: Industry standard
  MOLAP: No current standards
  ROLAP: SQL standard

Parameter: Access to the database through ODBC
  MOLAP: The databases have proprietary APIs and do not provide access through ODBC.
  ROLAP: Provides access through ODBC

Strength and Weakness of MOLAP/ROLAP


Session Summary

In this session, we have:

> Understood the need for OLAP and the significance of multidimensional analysis in a Data Warehouse.

> Discussed the evolution of OLAP.

> Explained the architectures and characteristics, as well as the merits and demerits, of various OLAP solutions.

Thank you
