Dimensional Modeling Overview

Dimensional Modeling Overview. Agenda ■ Dimensional Modeling Overview ■ Dimensional Modeling Steps ■ Dimensional Modeling Framework ■ Retail Sales - Case


Agenda

■ Dimensional Modeling Overview

■ Dimensional Modeling Steps

■ Dimensional Modeling Framework

■ Retail Sales - Case Study

■ Dimensional Modeling Tips


Dimensional Modeling Overview


From Requirement to Data Design

■ The requirements definition completely drives the data design for the data warehouse (DW).

■ Data design consists of putting together the data structures.

■ A group of data elements forms a data structure.

■ Logical data design includes determining the data elements that are needed, combining them into data structures, and establishing relationships among those structures.

Figure: Requirements Gathering produces the Requirements Definition Document (with Information Packages), which drives Data Design and the Dimensional Model.


Design Decisions

■ Choosing the process.

Selecting the subjects from the information packages for the first set of logical structures to be designed.

■ Choosing the grain.

Determining the level of detail for the data structures.

■ Identifying and conforming the dimensions.

Choosing the business dimensions (such as product, market, time, etc.) to be included in the first set of structures.

■ Choosing the facts.

Selecting the metrics or units of measurement (such as product sale units, dollar sales, dollar revenue, etc.) to be included in the first set of structures.

■ Choosing the duration of the database.

Determining how far back in time you should go for historical data.


Dimensional Modeling Overview

■ What is Dimensional Modeling?

♦ It is a logical design technique to structure the business dimensions and the metrics that are analyzed along these dimensions.

♦ A logical design technique that seeks to present the data in a standard framework that is intuitive and allows for high performance access.

♦ It is inherently dimensional and adheres to a discipline that uses the relational model with some important restrictions

♦ The fundamental idea of dimensional modeling is that nearly every type of business data can be represented as a kind of cube of data

♦ The model has also proved to provide high performance for queries and analysis


Dimensional Modeling Overview

■ Components of a Dimensional Model

♦ Fact Table

► The fact table contains facts or measurements of the business

♦ Dimension Table

► The dimension tables contain textual attributes that describe the facts


Dimensional Modeling Overview: Sample Report Translation

“Dimensions”: report, row and column headings

“Facts”: numeric report values

Sales Rep Performance Report

Central Region

                          Jul-00      Aug-00
                          (Dollars)   (Dollars)
Chicago District          879         878
  Adams                   345         456
  Brown                   564         565
  Frederickson            657         768

Minneapolis District      890         789
  Andersen Smith          909         978

Central Region Total      4244        4434


Dimensional Modeling Overview

■ Dimension Tables

♦ Dimension tables are the entry points into the data warehouse

♦ Dimension tables are designed especially for selection and grouping under a common head

♦ Determine contextual background for facts

♦ Parameters for OLAP

♦ Common Dimensions

► Date

► Product

► Location/Region

► Customers

For the dimensional model to represent the information contained in the information package, the data structure must include metrics, business dimensions, and attributes for each business dimension.


Dimensional Modeling Overview

■ Dimension Tables

♦ Dimension Table Characteristics

► Serve as report labels and query constraints

• “By” words

• “Where” clauses

► Provide Descriptive Information

• Minimal codes

• Embedded meaning as attributes

► Represent hierarchical relationships

Let us look at the Product business dimension as an example, for when we want to analyze the facts by product.

ITEM KEY

Item Description

Item Size

Package Type

Category


Dimensional Modeling Overview

Product Dimension Table with Sample Rows

Product Key   Product Desc         Brand Desc     Class    Size
0001          CHEERIOS 10 OZ       CHEERIOS       Family   10 OZ
0002          CHEERIOS 24 OZ       CHEERIOS       Family   24 OZ
0003          LUCKY CHARMS 10 OZ   LUCKY CHARMS   Kids     10 OZ
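As a minimal sketch of how dimension attributes act as "by" words and "where" clauses, the sample product rows can be grouped and filtered in plain Python. The product rows come from the sample table; the fact rows, their unit counts, and the helper name `units_by` are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Product dimension rows from the sample table, keyed by Product Key.
product_dim = {
    "0001": {"desc": "CHEERIOS 10 OZ", "brand": "CHEERIOS", "class": "Family", "size": "10 OZ"},
    "0002": {"desc": "CHEERIOS 24 OZ", "brand": "CHEERIOS", "class": "Family", "size": "24 OZ"},
    "0003": {"desc": "LUCKY CHARMS 10 OZ", "brand": "LUCKY CHARMS", "class": "Kids", "size": "10 OZ"},
}

# Hypothetical fact rows: (product_key, units_sold). The values are made up.
sales_facts = [("0001", 5), ("0002", 3), ("0003", 4), ("0001", 2)]


def units_by(attribute, where=None):
    """Sum units grouped by a dimension attribute (the "by" word),
    keeping only facts whose dimension row passes `where` (the constraint)."""
    totals = defaultdict(int)
    for product_key, units in sales_facts:
        row = product_dim[product_key]
        if where is None or where(row):
            totals[row[attribute]] += units
    return dict(totals)


# "Units by brand"
units_by_brand = units_by("brand")
# "Units by size where class is Family"
family_units_by_size = units_by("size", where=lambda r: r["class"] == "Family")
```

The fact rows carry only keys and numbers; every label used for grouping or filtering comes from the dimension row.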


Big Dimensions

Retail_Fact

time_Key

Store_Key

Customer_Key

Product_key

Sales_dollars

Units_sold

Customer Dimension

Customer_key

Customer_ID (natural key)

Customer_name

Customer_address

Date_of_birth

Age

Gender

Annual_Income

Number_of_children

Marital_status

Other_attributes...


Big Dimensions

Customer Demographics Dimension

Customer_demographics_key

Age_Band

Gender

Income_Band

Number_of_children

Marital_status

Customer Dimension

Customer_key

Customer_ID (natural key)

Customer_name

Customer_address

Date_of_birth

Age

Gender

Annual_Income

Number_of_children

Marital_status

Other_attributes...

Customer Dimension

Customer_key

Customer_ID

Customer_name

Customer_address

Date_of_birth
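One way to read the split shown here: the frequently changing demographic attributes are banded and moved into a small Customer Demographics dimension, so the large Customer dimension keeps only stable attributes. The sketch below assumes this reading. The band edges, labels, and function names are hypothetical, since the slides do not specify them.

```python
# Hypothetical band edges; the slides do not define them.
AGE_BANDS = [(0, 20, "under 21"), (21, 40, "21-40"), (41, 60, "41-60")]


def age_band(age):
    """Map an exact age to a coarse band label."""
    for low, high, label in AGE_BANDS:
        if low <= age <= high:
            return label
    return "over 60"


def income_band(income):
    """Map an exact annual income to a coarse band label."""
    if income < 30000:
        return "under 30K"
    if income < 80000:
        return "30K-80K"
    return "80K and over"


# Demographics dimension: one row per distinct profile, not per customer.
demographics_dim = {}  # profile tuple -> Customer_demographics_key


def demographics_key(age, gender, income, children, marital_status):
    """Return the key of the matching profile row, creating it if needed."""
    profile = (age_band(age), gender, income_band(income), children, marital_status)
    if profile not in demographics_dim:
        demographics_dim[profile] = len(demographics_dim) + 1
    return demographics_dim[profile]


k1 = demographics_key(34, "F", 52000, 2, "Married")
k2 = demographics_key(38, "F", 61000, 2, "Married")  # falls in the same bands as k1
k3 = demographics_key(45, "M", 61000, 0, "Single")
```

Because many customers share a profile, the demographics table stays small even when the customer table has millions of rows.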


Dirty dimensions

■ A dirty dimension is one in which data quality cannot be guaranteed

♦ Data about the same customer can appear multiple times

Customer Dimension

Customer_key

Customer_ID (natural key)

Customer_name

Customer_address

Date_of_birth

Age

Gender

Annual_Income

Number_of_children

Marital_status

Other_attributes...


Hierarchies in Dimensions

■ Multiple Hierarchies

♦ Dimension tables can represent multiple hierarchical roll-ups

♦ For example, the Store Dimension could have the following hierarchies

► Physical Geography

• Zip, City, County, State, Country

► Sales Organization

• District, Region, Zone

► Distribution Roll-up

• Distribution Center, Distribution Center Region

Store Dimension

Store_key

Store_description

Store_type

Zip

City

State

Sales_region

Sales_zone

Distribution Center

Distribution Center Region
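A minimal sketch of rolling the same facts up along different hierarchies of one Store dimension. The attribute names follow the Store Dimension listed above; the store rows, the unit counts, and the function name `roll_up` are hypothetical.

```python
from collections import Counter

# Hypothetical Store dimension rows carrying several independent hierarchies.
store_dim = {
    1: {"City": "Chicago", "State": "IL", "Sales_region": "Central", "Distribution Center": "DC-1"},
    2: {"City": "Minneapolis", "State": "MN", "Sales_region": "Central", "Distribution Center": "DC-2"},
    3: {"City": "Denver", "State": "CO", "Sales_region": "West", "Distribution Center": "DC-2"},
}

# Hypothetical fact rows: (store_key, units_sold).
facts = [(1, 10), (2, 7), (3, 5), (1, 3)]


def roll_up(level):
    """Aggregate units to the chosen level of any hierarchy in the dimension."""
    totals = Counter()
    for store_key, units in facts:
        totals[store_dim[store_key][level]] += units
    return dict(totals)


by_state = roll_up("State")                # physical geography
by_region = roll_up("Sales_region")        # sales organization
by_dc = roll_up("Distribution Center")     # distribution roll-up
```

The three roll-ups group the same stores differently, which is why the hierarchies are stored as separate attributes of one dimension row.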


Hierarchies in Dimensions

Dimension Tables can represent multiple hierarchical roll-ups

Figure: hierarchy levels shown for three dimensions

Date: Date, Week, Month, Quarter, Year

Location: City, District, State, Region, Sales Zone

Product: Product, Brand, Category


Dimensional Modeling Overview

■ Fact Table

♦ The fact table is where the numerical measurements of the business are stored

♦ Facts

► The detail information in a fact table

• For example: Sales Quantity, Unit Sales Price, Sales Amount, etc.

► Key performance indicators of the business

► Numeric in nature

► Analyzed across the dimensions

♦ Multi-part key

► Foreign keys to dimension tables

► Date is always a key

Sales Fact

DATE KEY

ITEM KEY

STORE KEY

PROMOTION KEY

POS TRXN#

Sales Quantity

Unit Sales Price

Sales $ Amount


Dimensional Modeling Overview

■ So far we have formed fact table and dimension tables.

■ How should these tables be arranged in the dimensional model?

■ What are the relationships and how should we mark the relationships in the model?

■ The dimensional model should primarily facilitate queries and analyses. What would be the types of queries and analyses?

■ Before combining these tables in the dimensional model, what are the requirements?

The model should provide the best data access.

The whole model must be query-centric.

It must be optimized for queries and analysis.

The model must show that the dimension tables interact with the fact table.

It should be structured in such a way that every dimension can interact equally with the fact table.

The model should allow drilling down or rolling up along dimension hierarchies.


■ Given these requirements, each of the dimension tables relates directly to the fact table in the middle.

■ Such an arrangement in the dimensional model looks like a star formation, with the fact table at the core of a star and the dimension tables along the spikes of the star.

■ The dimensional model is therefore called a STAR schema.

■ See figure:


Star-like Model

Figure: a central Fact table surrounded by Dimension 1, Dimension 2, Dimension 3, ..., Dimension n.


STAR schema for AUTO Sales

AUTO SALES

TIME

PRODUCT

DEALER

CUSTOMER DEMOGRAPHICS

PAYMENT METHOD


Typical Star model

Date Dimension

Date_key

Day_of_week

Month

Quarter

Year

Holiday_flag

Sales Fact

Date_key

Product_key

Store_key

Dollars_sold

Units_sold

Dollars_cost

Product Dimension

Product_key

Description

Brand

Category

Store Dimension

Store_key

Store_name

Address

Floor_plan_type


Star model (continued)

■ It consists of the sales fact table in the middle of the schema diagram, with three dimension tables: Date, Product and Store.

■ The user will analyze the sales using dollars sold, units sold and dollars cost.

■ The STAR schema structure intuitively answers questions such as: for a given amount of dollars, what was the product sold? Who was the customer? Which store sold the product? When was the order placed?

■ Constraints and filters of queries are easily understood by looking at the star.

■ A common type of analysis is drilling down from summary numbers to the details at lower levels by adding filters to the queries.
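The star join and drill-down described above can be sketched with Python's built-in `sqlite3` module. Table and column names follow the typical star model shown earlier (with some dimension attributes omitted for brevity); all row values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE date_dim (date_key INTEGER PRIMARY KEY, month TEXT, quarter TEXT, year INTEGER);
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, description TEXT, brand TEXT, category TEXT);
CREATE TABLE store_dim (store_key INTEGER PRIMARY KEY, store_name TEXT);
CREATE TABLE sales_fact (
    date_key INTEGER REFERENCES date_dim(date_key),
    product_key INTEGER REFERENCES product_dim(product_key),
    store_key INTEGER REFERENCES store_dim(store_key),
    dollars_sold REAL, units_sold INTEGER, dollars_cost REAL);
-- Hypothetical sample rows.
INSERT INTO date_dim VALUES (1, 'January', 'Q1', 1999), (2, 'February', 'Q1', 1999);
INSERT INTO product_dim VALUES (1, 'Cereal A 10 OZ', 'Brand A', 'Cereal'),
                               (2, 'Candy B 2 OZ', 'Brand B', 'Candy');
INSERT INTO store_dim VALUES (1, 'Store One'), (2, 'Store Two');
INSERT INTO sales_fact VALUES
    (1, 1, 1, 10.0, 4, 6.0),
    (1, 2, 1, 3.0, 3, 1.5),
    (2, 1, 2, 5.0, 2, 3.0),
    (2, 1, 1, 7.5, 3, 4.5);
""")

# Summary level: dollars sold by quarter and category.
summary = conn.execute("""
    SELECT d.quarter, p.category, SUM(f.dollars_sold)
    FROM sales_fact f
    JOIN date_dim d ON d.date_key = f.date_key
    JOIN product_dim p ON p.product_key = f.product_key
    GROUP BY d.quarter, p.category
    ORDER BY p.category
""").fetchall()

# Drill down: filter to one category and group at month and store level.
detail = conn.execute("""
    SELECT d.month, s.store_name, SUM(f.dollars_sold)
    FROM sales_fact f
    JOIN date_dim d ON d.date_key = f.date_key
    JOIN product_dim p ON p.product_key = f.product_key
    JOIN store_dim s ON s.store_key = f.store_key
    WHERE p.category = 'Cereal'
    GROUP BY d.month, s.store_name
    ORDER BY d.month, s.store_name
""").fetchall()
```

Every query has the same shape: the fact table joined to each dimension on its key, with filters and grouping expressed only on dimension attributes.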


Snowflake Design

Figure: a Fact Table surrounded by Dim Tables.

Low cardinality redundant attributes moved to “sub dimension” tables


Snowflake Design

■ Issues

♦ Only a few tools are optimized for the snowflake schema

♦ When the tool is not optimized for snowflake design

► Presentation more complex

► Browsing is slower

► Problems with multiple joins


Dimensional Modeling Steps


Dimensional Modeling Steps

■ Identify the Business Process

♦ A major operational process that is supported by some kind of legacy system(s) from which data can be collected for the purpose of the data warehouse

♦ Example: orders, invoices, shipments, inventory, sales

■ Identify the Grain

♦ The fundamental lowest level of data represented in a fact table for the business process

♦ Example: individual transactions, individual daily snapshots

■ Identify the Dimensions

♦ Choose the dimensions that will apply to each fact table record

■ Identify the Facts

♦ Choose the measured facts that will populate each fact table record


Dimensional Modeling Steps

Resist the temptation to model data by looking at copy books alone

Key Inputs: Business Requirements, Data Reality

Dimensional Modeling Steps:

1. Identify the Business Process

2. Identify the Grain

3. Identify the Dimensions

4. Identify the Facts

Output: Dimensional Model


Shared Dimensions Must Conform

■ Option 1: Identical dimensions with the same keys, labels, definitions and values

Sales Schema

Sales Fact: DATE KEY, ITEM KEY, STORE KEY, PROMO KEY

Conformed Dimension: Item Key, Item Desc., Brand Desc., Category, ..

Inventory Schema

Inventory Fact: DATE KEY, ITEM KEY, STORE KEY

Conformed Dimension: Item Key, Item Desc., Brand Desc., Category, ..


■ Option 2: “Subset” of base dimension

Sales Schema

Fact: DATE KEY, ITEM KEY, STORE KEY, PROMO KEY, Sales $

Date dimension: DATE KEY, Day-of-week, Week Desc, Month Desc

Conformed Dimension (item): Item Key, Item Desc., Brand Desc., Category Desc., ..

Item key   Item Desc       Brand Desc   Category Desc
0001       Cheerios 10oz   Cheerios     Cereal

Forecast Schema

Fact: Month Key, Brand Key, Estimate Sales $

Month dimension: Month KEY, Month Desc

Conformed Dimension (brand, a subset of the item dimension): Brand Key, Brand Desc., Category Desc., ..

Brand key   Brand Desc   Category Desc
1001        Cheerios     Cereal


Slowly Changing Dimensions

■ Dimension attributes evolve over time

♦ For example, customers change their names, move, have children, adjust their incomes

■ For every dimension attribute, need to identify a “Changes” strategy

♦ May use a combination of strategies within the same dimension table


Slowly Changing Dimensions

Type 1: Overwrite the changed attributes

Original record

Item Key   Item Desc       Dept
12345      Sim City 3000   Educational S/W

Updated record

Item Key   Item Desc       Dept
12345      Sim City 3000   Strategy S/W


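Slowly Changing Dimensions

Type 2: Add a New Dimension Record

Original record

Item Key   Item Desc       Dept
12345      Sim City 3000   Educational S/W

Additional record

Item Key   Item Desc       Dept
12345      Sim City 3000   Strategy S/W

The additional record is stored alongside the original; in practice it needs its own surrogate key so that the two rows can be told apart.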


Slowly Changing Dimensions

Type 3: Add a “Prior” Attribute

Original record

Item Key   Item Desc       Dept
12345      Sim City 3000   Educational S/W

Updated record

Item Key   Item Desc       Dept           Prior Dept
12345      Sim City 3000   Strategy S/W   Educational S/W
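The three strategies can be compared side by side in a short sketch using the Sim City 3000 record. The function names, the `item_id` natural-key column, the small integer surrogate `item_key`, and the `is_current` flag are assumptions introduced for illustration; the slides show only the Item Key, Item Desc and Dept columns.

```python
def original_rows():
    """One dimension row; item_key is a hypothetical surrogate key."""
    return [{"item_key": 1, "item_id": 12345, "item_desc": "Sim City 3000",
             "dept": "Educational S/W", "is_current": True}]


def scd_type1(rows, item_id, attribute, new_value):
    """Type 1: overwrite in place. The old value is lost."""
    for row in rows:
        if row["item_id"] == item_id:
            row[attribute] = new_value


def scd_type2(rows, item_id, attribute, new_value):
    """Type 2: expire the current row and add a row with a new surrogate key."""
    current = next(r for r in rows if r["item_id"] == item_id and r["is_current"])
    current["is_current"] = False
    new_row = dict(current)
    new_row[attribute] = new_value
    new_row["item_key"] = max(r["item_key"] for r in rows) + 1
    new_row["is_current"] = True
    rows.append(new_row)


def scd_type3(rows, item_id, attribute, new_value):
    """Type 3: keep the previous value in a "prior" column on the same row."""
    for row in rows:
        if row["item_id"] == item_id:
            row["prior_" + attribute] = row[attribute]
            row[attribute] = new_value


t1, t2, t3 = original_rows(), original_rows(), original_rows()
scd_type1(t1, 12345, "dept", "Strategy S/W")
scd_type2(t2, 12345, "dept", "Strategy S/W")
scd_type3(t3, 12345, "dept", "Strategy S/W")
```

Type 1 keeps one row and no history, Type 2 keeps full history at the cost of extra rows, and Type 3 keeps exactly one prior value.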


Slowly Changing Dimensions

■ Data Warehouse Keys, i.e., STAR schema keys

♦ All tables (facts and dimensions) should use Data Warehouse generated surrogate keys

♦ It is possible that the customer numbers of discontinued customers are reassigned to new customers. This causes a problem because the same customer number could relate both to the data for the newer customer and to the data of the retired customer. Therefore, do not use such keys as primary keys for dimension tables.

♦ The surrogate keys are simply system generated sequence numbers.

♦ Each row in a dimension table is identified by a unique value of an attribute designated as the primary key of the dimension.

♦ Each dimension table is in a 1:M relationship with the central fact table, so the primary key of each dimension table must be a foreign key in the fact table.


Additive Measures

■ Additivity is the ability of a measure to be added across all dimensions of the fact table.

■ Measures can be fully additive, semi-additive or non-additive

♦ Fully Additive - The values can be summed across every dimension; aggregation of a fully additive measure is done by simple addition. Examples: Sales Quantity, Revenue

♦ Semi-Additive - The values can be summed across some dimensions but not others (typically not across time). Examples: Account Balance, Inventory, number of customers (measures of intensity, head counts)

♦ Non-Additive - Ratios and percentages (for example 1:3, 30%) should not be added. Example: Profit margin
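The three kinds of additivity can be illustrated with a small sketch. All rows and figures below are hypothetical; the point is only which sums are meaningful.

```python
# Hypothetical daily account snapshots: (day, account, balance, deposits).
snapshots = [
    ("day 1", "A", 100, 20),
    ("day 1", "B", 50, 0),
    ("day 2", "A", 120, 20),
    ("day 2", "B", 50, 0),
]

# Fully additive: deposits can be summed across accounts and across days.
total_deposits = sum(deposits for _, _, _, deposits in snapshots)

# Semi-additive: balances can be summed across accounts for one day...
balance_by_day = {}
for day, _, balance, _ in snapshots:
    balance_by_day[day] = balance_by_day.get(day, 0) + balance

# ...but not across days. Use the latest (or an average) snapshot instead.
closing_balance = balance_by_day["day 2"]
misleading_total = sum(balance for _, _, balance, _ in snapshots)

# Non-additive: margins must be recomputed from additive components.
sales = [(100, 60), (300, 270)]  # hypothetical (revenue, cost) pairs
margins = [(revenue - cost) / revenue for revenue, cost in sales]
total_revenue = sum(revenue for revenue, _ in sales)
total_cost = sum(cost for _, cost in sales)
overall_margin = (total_revenue - total_cost) / total_revenue
```

Storing the additive components (revenue and cost) in the fact table lets any ratio be recomputed correctly at whatever level the query aggregates to.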


Factless Fact tables to track attendance of students

Student Attendance Fact

time_Key

Student_Key

Course_Key

Faculty_Key

Facility_key

Student Dimension

Student_key

Student attributes…..

Faculty Dimension

Faculty_key

Faculty attributes…..

Time Dimension

time_key

time attributes…..

Course Dimension

Course_key

Course attributes…..

Facility Dimension

Facility_key

Facility attributes…..
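A factless fact table holds only keys, so analysis is done by counting rows rather than summing a measure. A minimal sketch, with hypothetical key values, where each tuple records one attendance event:

```python
from collections import Counter

# Column order follows the Student Attendance Fact:
# (time_key, student_key, course_key, faculty_key, facility_key)
attendance_fact = [
    (1, 101, 10, 7, 3),
    (1, 102, 10, 7, 3),
    (2, 101, 10, 7, 3),
    (2, 101, 11, 8, 4),
]

# "How many attendances did each course have?" is a row count per course_key.
attendance_by_course = Counter(row[2] for row in attendance_fact)

# "How many classes did each student attend?" is a row count per student_key.
attendance_by_student = Counter(row[1] for row in attendance_fact)

# "Which facilities were used at time 2?" is a distinct list of facility_key.
facilities_at_time_2 = sorted({row[4] for row in attendance_fact if row[0] == 2})
```

The existence of a row is itself the fact: a student attended a course, taught by a faculty member, in a facility, at a time.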


Factless Fact tables – Coverage tables

Date Dimension

Date_key

Day_of_week

Month

Quarter

Year

Holiday_flag

Sales Fact

Date_key

Product_key

Promotion_key

Store_key

Dollars_sold

Units_sold

Dollars_cost

Product Dimension

Product_key

Description

Brand

Category

Store Dimension

Store_key

Store_name

Address

Floor_plan_type

Promotion Dimension

Promotion_key

Promotion_name

Discount

Event

Sales Fact (revisited)


Factless Fact tables – Coverage tables

Date Dimension

Date_key

Day_of_week

Month

Quarter

Year

Holiday_flag

Sales Fact

Date_key

Product_key

Promotion_key

Store_key

Product Dimension

Product_key

Description

Brand

Category

Store Dimension

Store_key

Store_name

Address

Floor_plan_type

Promotion Dimension

Promotion_key

Promotion_name

Discount

Event
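One common use of a coverage table, sketched here under the assumption that it records every product on promotion in a store on a date whether or not it sold: comparing it with the sales fact reveals promoted products that did not sell. All key values and amounts are hypothetical.

```python
# Coverage rows (keys only): (date_key, product_key, promotion_key, store_key)
coverage_fact = [
    (1, 1, 5, 1),
    (1, 2, 5, 1),
    (1, 3, 5, 1),
]

# Sales rows: (date_key, product_key, promotion_key, store_key, dollars_sold)
sales_fact = [
    (1, 1, 5, 1, 4.59),
    (1, 3, 5, 1, 2.56),
]

# Reduce both tables to the same key combination, then take the difference.
promoted = {(d, p, s) for d, p, _, s in coverage_fact}
sold = {(d, p, s) for d, p, _, s, _ in sales_fact}
promoted_but_not_sold = promoted - sold
```

The sales fact alone cannot answer this question, because an event that did not happen leaves no row in it.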


Dimensional Modeling Framework


Dimensional Modeling Framework

Figure: the level of detail increases from the conceptual level to the physical level.

Conceptual Level: Identify Subject Area, Grain; Identify Major Dimensions & Facts, Conform Dimensions across Facts

Logical Level: Detail Facts with Measures; Detail Dimensions with Hierarchies & Attributes; Slowly Changing Dimensions Policies; Source-Data Model Mapping

Physical Level: Pre-calculations, Aggregates, Indexes, Data Structures, Source-Physical Model Mapping


Dimensional Modeling Framework

■ STEP1: Choose the process

♦ Choose a process or subject area from the list of subject areas identified

♦ Examples of this could be Sales Analysis, Strategic Sourcing, Human Resources

■ STEP2: Choose the Grain

♦ Choose the level of detail

♦ Every data mart / warehouse should be based on the most granular (atomic) data that can possibly be collected and stored.


Dimensional Modeling Framework

■ STEP3 : Identifying Dimensions & Dimension Hierarchy

Figure: hierarchy levels shown for three dimensions

Date: Date, Week, Month, Quarter, Year

Location: City, District, State, Region

Product: Product, Brand, Category


Dimensional Modeling Framework

■ STEP4: Choose the Measures

♦ Identify all the measures for the fact table

■ STEP5: Conforming the dimensions

♦ Common dimensions across the facts / data marts have to be exactly the same as, or a subset of, the main dimension table


Dimensional Modeling Framework

■ STEP6: Adding Attributes to Dimension Tables

♦ Enhance the depth of analysis

♦ Examples: Customer Age, Address, Profession, Product color, flavor, product size, packaging type, etc.

Figure: the dimension hierarchies with attributes added

Date: Date, Week, Month, Quarter, Year

Location: City, District, State, Region

Product: Product, Brand, Category

Added attributes: Current Flag, Sequence, Regional Manager, Phone, Address, Size, Color


Dimensional Modeling Framework

■ STEP7: Storing Pre-calculations in the Fact table

♦ Calculated based on one or more base measures

■ STEP8: Choosing the Duration of the Database

♦ Need for analyzing the data over a period of time


Dimensional Modeling Framework

■ STEP9: Track Slowly Changing Dimensions

♦ Certain kinds of dimension attribute changes need to be handled differently in the Data Warehouse

► Type I – Overwrite

► Type II – History

► Type III – Add new column (example: organizational changes)


Retail Sales – Case Study


Retail Sales - Case Study

■ Background

♦ Chain consists of over 100 grocery stores in five states

♦ Stores average 60,000 SKUs in departments such as frozen foods, dairy, etc.

♦ Bar codes are scanned directly into the cash register's POS system

♦ Items are promoted via coupons, temporary price reductions, ads and in-store promotions

■ Analytical Requirements

♦ Need to know what is selling in each store each day in order to evaluate product movement, as well as to see how sales are impacted by promotions

♦ Need to understand the mix of items in a consumer market basket


Retail Sales - Case Study

■ Identify the Business Process: Sales

■ Identify the Grain: Transaction Item (Daily Sales)

■ Identify the Dimensions: Date, Location, Item, Promotion

■ Identify the Measures: Quantity, Price, Amount


Retail Sales - Case Study

Resultant Sales Schema

Sales Fact: DATE KEY, ITEM KEY, STORE KEY, PROMOTION KEY, POS TRXN#, Sales Quantity, Unit Sales Price, Sales $ Amount

Date Dimension: DATE KEY, Date Description, Week, Month, Quarter, Year

Store Dimension: STORE KEY, Store Name, City, District, Zone

Item Dimension: ITEM KEY, Item Description, Item Size, Package Type, Category

Promotion Dimension: PROMOTION KEY, Promotion Description, Discount


Retail Sales - Case Study

Time Dimension

Date Key   Date       Day of Week   Day Number in Month   Month     Quarter   Year   Holiday Indicator
1          1/1/1999   Friday        1                     January   Q1        1999   Holiday
2          1/2/1999   Saturday      2                     January   Q1        1999   Non-Holiday
3          1/3/1999   Sunday        3                     January   Q1        1999   Non-Holiday
4          1/4/1999   Monday        4                     January   Q1        1999   Non-Holiday

Item Dimension

Itemkey   Item Description    SKU Number    Dept      Size   Package Type   Brand          Category
1         Lasagna 6 OZ        90706287103   Grocery   6 OZ   Box            Cold Gourmet   Frozen Foods
2         Beef Stew 6 OZ      16005393282   Grocery   6 OZ   Box            Cold Gourmet   Frozen Foods
3         Extra Nougat 2 OZ   46817560065   Grocery   6 OZ   Can            Chewy          Candy


Retail Sales - Case Study

Promotion Dimension

Promo Key   Promo Name              Price Reduction   Ad Type           Media Type        Promo $   Begin Date   End Date
1           Blue Ribbon Discounts   Temporary         Daily Paper       Paper             2000      1/1/1999     1/15/1999
2           Red Carpet Closeout     Markdown          Sunday Paper      Paper             1000      1/3/1999     1/10/1999
3           Ad Blitz                None              Paper and Radio   Paper and Radio   7000      1/15/1999    1/30/1999

Sales Fact

Date Key   Item Key   Store Key   Promo Key   POS Trxn #   Sales Qty   Unit Sales Price   Sales $ Amt
1          1          1           15          763457893    1           4.59               4.59
1          2          1           1           763457893    2           0.89               1.78
1          5          11          19          763457894    1           2.56               2.56
2          13         5           8           763457923    1           0.33               0.33
2          5          11          12          763457998    1           1.29               1.29
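The sample Sales Fact rows can be used to sketch the two analytical requirements of the case study: daily sales and the market basket. The rows below are copied from the sample table; the variable names are illustrative, and `Decimal` is used so that currency arithmetic is exact.

```python
from collections import defaultdict
from decimal import Decimal

# (date_key, item_key, store_key, promo_key, pos_trxn, qty, unit_price, amount)
sales_fact = [
    (1, 1, 1, 15, 763457893, 1, Decimal("4.59"), Decimal("4.59")),
    (1, 2, 1, 1, 763457893, 2, Decimal("0.89"), Decimal("1.78")),
    (1, 5, 11, 19, 763457894, 1, Decimal("2.56"), Decimal("2.56")),
    (2, 13, 5, 8, 763457923, 1, Decimal("0.33"), Decimal("0.33")),
    (2, 5, 11, 12, 763457998, 1, Decimal("1.29"), Decimal("1.29")),
]

# Sales $ Amt is a pre-calculated fact: quantity times unit price.
amounts_consistent = all(qty * price == amount
                         for _, _, _, _, _, qty, price, amount in sales_fact)

# Daily sales: the additive amount rolled up by date key.
sales_by_date = defaultdict(Decimal)
for date_key, _, _, _, _, _, _, amount in sales_fact:
    sales_by_date[date_key] += amount

# Market basket: items that share a POS transaction number.
baskets = defaultdict(set)
for _, item_key, _, _, pos_trxn, _, _, _ in sales_fact:
    baskets[pos_trxn].add(item_key)
```

Keeping the POS transaction number in the fact table at the transaction-item grain is what makes the basket grouping possible.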


Dimensional Modeling Tips


Dimensional Modeling Tips

■ Carefully choose the labels to identify data marts, dimension, attributes and facts

■ An attribute can live in one and only one dimension, whereas a fact can be repeated in multiple fact tables

■ If a single dimension appears to reside in more than one place, several roles are probably being played. Name the roles uniquely and treat them as separate dimensions

■ A single field in the underlying source data can have one or more logical columns associated with it

♦ E.g., A product attribute field may translate to product code, product short description, and product long description

■ Every fact should have a default aggregation rule (sum, min, max, latest, semi-additive, special algorithm, and not aggregatable)

♦ This will serve as a requirements list when evaluating query and report-writer tools


Thank You

■ References

♦ Ralph Kimball

► The Data Warehouse Toolkit

► The Data Warehouse Lifecycle Toolkit