Kimball Dimensional Modeling

  • 8/14/2019 Kimball DimensionalModeling

    © 1996-2005 Kimball Group. All rights reserved.

    © 2004 Kimball University. All rights reserved.

    Contact information:

    www.kimballgroup.com

    Kimball Group

    14898 Boulder Pointe Road
    Eden Prairie, MN 55347

    Bob [email protected]

    952.974.0434


    Our goal is that you walk away from this class armed with a process and some basic techniques to begin to model your data dimensionally.


    Introductions and Fundamentals

    Basics Case Study
        4-step process
        Degenerate dimensions
        Surrogate keys
        Factless fact tables

    Beyond the 1st Data Mart Case Study
        Star vs. snowflake variations
        Data warehouse bus matrix
        Conformed dimensions

    More Dimensional Fundamentals
        Slowly changing dimensions
        Dimension and fact table indexing

    Transaction Detail Case Study
        Dimension table role playing
        Invoicing complications

    Design Workshop

    Completed Worksheets


    [Diagram: data warehouse architecture with dimensional data marts. Data flows left to right: Source Systems, Data Staging Area, Presentation Area, Data Access Tools.]

    Source Systems
        Extract into the Data Staging Area

    Data Staging Area
        Services: Transform from source-to-target; Maintain conformed dimensions; No user query support
        Data Store: Flat files or relational tables
        Design Goals: Staging throughput; Integrity / consistency
        Load into the Presentation Area

    Presentation Area
        Data Mart #1: Dimensional; Atomic AND summary data; Business-process centric
        Design Goals: Ease-of-use; Query performance
        Data Mart #2
        Data Mart #...
        Data Mart Bus: Conformed facts and dimensions
        Access by the Data Access Tools

    Data Access Tools
        Ad Hoc Query Tools
        Report Writers
        Analytic Applications
        Modeling: Forecasting; Scoring; Data mining

    Data Staging Area:

    Portion of the data warehouse restricted to EXTRACTING, CLEANING, and LOADING data from legacy systems into destination data marts.

    The data staging area is the back room and is explicitly off limits to the end users, akin to the kitchen area of a restaurant.

    The data staging area DOES NOT SUPPORT QUERY OR PRESENTATION SERVICES.

    The data staging area is dominated by sorting and sequential processing. The pre-cleaned data is almost always in a flat file format, and the post-cleaned data is either in a flat file format or third normal form.

    Presentation Area:

    Portion of the data warehouse restricted to PRESENTING data, not cleansing or transforming data.

    Organized into data marts, each focused on a specific BUSINESS PROCESS (not business function or business unit) subject area.

    A data mart often contains ATOMIC DATA, as well as more summarized data.
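The split between the back room and the front room can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: the flat-file extract, the `clean` and `load_mart` helpers, and the table names `sales_atomic` and `sales_daily_summary` do not come from the course material. The sketch cleans a flat-file extract in a staging step that serves no queries, then loads a small mart holding both atomic and summarized rows.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file extract from a source system (the pre-cleaned data).
RAW_EXTRACT = """sale_date,product,amount
2004-01-05, widget ,10.50
2004-01-05,WIDGET,4.50
2004-01-06,gadget,7.00
2004-01-06,gadget,not-a-number
"""


def clean(rows):
    """Staging step: standardize values and drop rows that fail validation."""
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # a real staging process would log or quarantine the row
        yield (row["sale_date"].strip(), row["product"].strip().lower(), amount)


def load_mart(conn, cleaned_rows):
    """Load step: publish atomic rows, then derive a summary table from them."""
    conn.execute(
        "CREATE TABLE sales_atomic (sale_date TEXT, product TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales_atomic VALUES (?, ?, ?)", cleaned_rows)
    conn.execute(
        "CREATE TABLE sales_daily_summary AS "
        "SELECT sale_date, product, SUM(amount) AS total_amount "
        "FROM sales_atomic GROUP BY sale_date, product"
    )
    conn.commit()


# Back room: extract and clean. End users never query these intermediate rows.
staged = list(clean(csv.DictReader(io.StringIO(RAW_EXTRACT))))

# Front room: only the loaded mart tables are exposed to query tools.
mart = sqlite3.connect(":memory:")
load_mart(mart, staged)
summary = mart.execute(
    "SELECT sale_date, product, total_amount FROM sales_daily_summary "
    "ORDER BY sale_date, product"
).fetchall()
print(summary)
```

Because the summary table is derived from the atomic table inside the mart, a user who needs more detail than the summary offers can drill down to the atomic rows without touching the staging area.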


    [Diagram: alternative architecture with a normalized enterprise data warehouse. Data flows left to right: Source Systems, Data Staging Area, Enterprise Data Warehouse, Presentation Area, Data Access Tools.]

    Source Systems
        Extract into the Data Staging Area

    Data Staging Area
        Load into the Enterprise Data Warehouse

    Enterprise Data Warehouse
        Normalized tables
        Atomic data
        User query support to atomic data
        Extract, transform & load into the Presentation Area

    Presentation Area
        Data Mart #1: Dimensional; Summary data; Departmental-centric

    Data Access Tools
        Access to the Enterprise Data Warehouse and to the data marts

    There are a number of similarities between the two approaches:

        Integrated data with common, conformed dimensions.
        Dimensional data marts.

    Unique characteristics of this approach include the following:

        Mandatory, persistent normalized data structures result from the staging process. Some ETL occurs to load the data warehouse, and more ETL occurs to load the marts.
        Dimensional structures are for summary data only.
        Users are given limited access to the detailed data in ER structures.


    Although we'll be focused on designing dimensional models for a relational database, much of what we'll be discussing is relevant to multidimensional databases, too.


    The star schema is a dimensional design for relational databases. Dimensional models implemented in multidimensional OLAP databases (such as Hyperion's Essbase) are referred to as data cubes.

    A star schema is characterized by a single data table, or fact table, surrounded by descriptive tables, or dimension tables. The data warehouse will ultimately consist of many star schemas.

    The retail industry, the first to adopt this design technique, typically had four or five dimensions, hence the term "star" was coined.
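A minimal sketch of a retail star schema may make the shape concrete. The table names, column names, and sample rows below are hypothetical and chosen only for illustration; they are not taken from the course case studies. One fact table holds the numeric measures and a foreign key to each surrounding dimension table, and a typical query joins the fact table to the dimensions it needs and groups by descriptive attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes, one row per member.
CREATE TABLE date_dim    (date_key INTEGER PRIMARY KEY, full_date TEXT, month_name TEXT);
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, product_desc TEXT, brand_desc TEXT);
CREATE TABLE store_dim   (store_key INTEGER PRIMARY KEY, store_name TEXT, region TEXT);

-- Fact table at the center: a key to each dimension plus numeric measures.
CREATE TABLE sales_fact (
    date_key     INTEGER REFERENCES date_dim(date_key),
    product_key  INTEGER REFERENCES product_dim(product_key),
    store_key    INTEGER REFERENCES store_dim(store_key),
    sales_qty    INTEGER,
    sales_amount REAL
);

INSERT INTO date_dim VALUES (1, '2004-01-05', 'January'), (2, '2004-02-02', 'February');
INSERT INTO product_dim VALUES (1, 'Cola 12oz', 'FizzCo'), (2, 'Chips 8oz', 'CrunchCo');
INSERT INTO store_dim VALUES (1, 'Store A', 'East'), (2, 'Store B', 'West');
INSERT INTO sales_fact VALUES
    (1, 1, 1, 3, 6.0),
    (1, 2, 1, 1, 2.5),
    (2, 1, 2, 4, 8.0);
""")

# Constrain and group by dimension attributes; aggregate the fact measures.
rows = conn.execute("""
    SELECT p.brand_desc, s.region, SUM(f.sales_amount) AS total_amount
    FROM sales_fact f
    JOIN product_dim p ON p.product_key = f.product_key
    JOIN store_dim s   ON s.store_key = f.store_key
    GROUP BY p.brand_desc, s.region
    ORDER BY p.brand_desc, s.region
""").fetchall()
print(rows)
```

The query never joins one dimension to another; every join goes through the fact table, which is what gives the schema its star shape.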


    PRODUCT KEY
    Product Desc.
    SKU #
    Size
    Brand Desc.
    Class Desc.


    Dimensions or
