Kishore Jaladi DW

  • 7/25/2019 Kishore Jaladi DW

    1/41

    Data Warehousing: Data Models and OLAP Operations

    By Kishore Jaladi
    [email protected]

    Topics Covered

    1. Understanding the term "Data Warehousing"
    2. Three-tier Decision Support Systems
    3. Approaches to OLAP servers
    4. Multi-dimensional data model
    5. ROLAP
    6. MOLAP
    7. HOLAP
    8. Which to choose: Compare and Contrast
    9. Conclusion

    Understanding the term Data Warehousing

    Data Warehouse:
    The term Data Warehouse was coined by Bill Inmon in 1990, which he defined in the following way: "A warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process". He defined the terms in the sentence as follows:

    Subject Oriented:
    Data that gives information about a particular subject instead of about a company's ongoing operations.

    Integrated:
    Data that is gathered into the data warehouse from a variety of sources and merged into a coherent whole.

    Time-variant:
    All data in the data warehouse is identified with a particular time period.

    Non-volatile:
    Data is stable in a data warehouse. More data is added but data is never removed. This enables management to gain a consistent picture of the business.

    Data Warehouse Architecture

    Other important terminology

    • Enterprise Data Warehouse: collects all information about subjects (customers, products, sales, assets, personnel) that span the entire organization
    • […] (executive, manager, analyst) make faster and better decisions
    • Online Analytical Processing (OLAP): an element of decision support systems (DSS)

    Three-Tier Decision Support Systems

    • Warehouse database server
      Almost always a relational DBMS, rarely flat files
    • OLAP servers
      Relational OLAP (ROLAP): extended relational DBMS that maps operations on multidimensional data to standard relational operators
      Multidimensional OLAP (MOLAP): special-purpose server that directly implements multidimensional data and operations
    • Clients
      Query and reporting tools
      Analysis tools
      Data mining tools

    The Complete Decision Support System

    [Figure: three-tier architecture. Information sources (operational DBs, semistructured sources) are extracted, transformed, loaded and refreshed into the data warehouse and data marts on the data warehouse server (Tier 1). The warehouse serves the OLAP servers (Tier 2, e.g., MOLAP and ROLAP), which in turn serve the clients (Tier 3): OLAP, query/reporting and data mining tools.]

    Approaches to OLAP Servers

    Three possibilities for OLAP servers

    1) Relational OLAP (ROLAP)
       Relational and specialized […]

    The Multi-Dimensional Data Model

    Sales by product line over the past six months
    Sales by store between 1990 and 1995

    [Figure: a fact table for measures with columns Prod Code, Time Code, Store Code, Sales and Qty. The key columns (Prod Code, Time Code, Store Code) join the fact table to the dimension tables (Product Info, Time Info, Store Info); Sales and Qty are the numerical measures.]

    ROLAP: Dimensional Modeling Using Relational DBMS

    • Special schema design: star, snowflake
    • Special indexes: bitmap, multi-table join
    • Proven technology (relational model, DBMS), tend to outperform specialized […]

    Star Schema (in RDBMS)

    Star Schema Example

    The "Classic" Star Schema

    • A single fact table, with detail and summary data
    • Fact table primary key has only one key column per dimension
    • Each key is generated
    • Each dimension is a single table, highly de-normalized
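The bullet points above can be made concrete with a small sketch. It uses Python's standard sqlite3 module and an in-memory database; every table name, column name and data row below is hypothetical and chosen only for illustration, not taken from the slides.

```python
import sqlite3

# Hypothetical miniature star schema: all names and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE store_dim   (store_key INTEGER PRIMARY KEY, city TEXT, region TEXT);
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, brand TEXT);
CREATE TABLE period_dim  (period_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
-- Fact table: one generated key column per dimension, plus numeric measures.
CREATE TABLE sales_fact (
    store_key   INTEGER REFERENCES store_dim(store_key),
    product_key INTEGER REFERENCES product_dim(product_key),
    period_key  INTEGER REFERENCES period_dim(period_key),
    dollars REAL, units INTEGER,
    PRIMARY KEY (store_key, product_key, period_key)
);
INSERT INTO store_dim   VALUES (1, 'Austin', 'South'), (2, 'Boston', 'East');
INSERT INTO product_dim VALUES (1, 'Acme'), (2, 'Zenith');
INSERT INTO period_dim  VALUES (1, 2019, 6), (2, 2019, 7);
INSERT INTO sales_fact  VALUES (1, 1, 1, 100.0, 10), (1, 2, 1, 40.0, 4),
                               (2, 1, 2, 70.0, 7);
""")

# A star join touches only the dimensions the query needs (here: store).
rows = conn.execute("""
    SELECT s.region, SUM(f.dollars)
    FROM sales_fact f JOIN store_dim s ON f.store_key = s.store_key
    GROUP BY s.region ORDER BY s.region
""").fetchall()
print(rows)  # [('East', 70.0), ('South', 140.0)]
```

Because the store dimension is a single de-normalized table, the region attribute is reached with one join.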

    Star Schema with Sample Data

    The "Snowflake" Schema

    [Figure: snowflake schema; the tables and their columns are listed below.]

    Store Dimension:   STORE KEY, Store Description, City, State, District ID, Region_ID, Regional Mgr.
    District:          District_ID, District Desc., Region_ID
    Region:            Region_ID, Region Desc., Regional Mgr.
    Store Fact Table:  STORE KEY, PRODUCT KEY, PERIOD KEY, Dollars, Units, Price
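A minimal sketch of querying a snowflaked store dimension follows, assuming Python's sqlite3 module and an in-memory database. The column names loosely follow the figure, but the table names, the reduced column set and all data rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized (snowflaked) store hierarchy: store -> district -> region.
CREATE TABLE region   (region_id INTEGER PRIMARY KEY, region_desc TEXT);
CREATE TABLE district (district_id INTEGER PRIMARY KEY, district_desc TEXT,
                       region_id INTEGER REFERENCES region(region_id));
CREATE TABLE store    (store_key INTEGER PRIMARY KEY, store_desc TEXT,
                       district_id INTEGER REFERENCES district(district_id));
CREATE TABLE store_fact (store_key INTEGER, product_key INTEGER,
                         period_key INTEGER, dollars REAL, units INTEGER);
-- Hypothetical sample rows.
INSERT INTO region   VALUES (1, 'West'), (2, 'East');
INSERT INTO district VALUES (10, 'D10', 1), (20, 'D20', 2);
INSERT INTO store    VALUES (100, 'Store 100', 10), (200, 'Store 200', 20),
                            (300, 'Store 300', 20);
INSERT INTO store_fact VALUES (100, 1, 1, 30.0, 3), (200, 1, 1, 20.0, 2),
                              (300, 1, 1, 50.0, 5);
""")

# Reaching the region level walks the normalized chain with extra joins.
region_totals = conn.execute("""
    SELECT r.region_desc, SUM(f.dollars)
    FROM store_fact f
    JOIN store s    ON f.store_key   = s.store_key
    JOIN district d ON s.district_id = d.district_id
    JOIN region r   ON d.region_id   = r.region_id
    GROUP BY r.region_desc ORDER BY r.region_desc
""").fetchall()
print(region_totals)  # [('East', 70.0), ('West', 30.0)]
```

Compared with a star schema, the same region total needs three joins instead of one, which is the usual trade-off for the reduced redundancy in the dimension tables.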

    Aggregation in a Single Fact Table

    Drawbacks: Summary data in the fact table yields poorer performance for summary levels, huge dimension tables a problem

    [Figure: star schema with a single fact table; the tables and their columns are listed below.]

    Store Dimension:    STORE KEY, Store Description, City, State, District ID, District Desc., Region_ID, Region Desc., Regional Mgr., Level
    Time Dimension:     PERIOD KEY, Period Desc, Year, Quarter, Month, Day, Current Flag, Resolution, Sequence
    Product Dimension:  PRODUCT KEY, Product Desc., Brand, Color, Size, Manufacturer, Level
    Fact Table:         STORE KEY, PRODUCT KEY, PERIOD KEY, Dollars, Units, Price

    The "Fact Constellation" Schema

    [Figure: fact constellation schema; the tables and their columns are listed below.]

    Store Dimension:      STORE KEY, Store Description, City, State, District ID, District Desc., Region_ID, Region Desc., Regional Mgr.
    Time Dimension:       PERIOD KEY, Period Desc, Year, Quarter, Month, Day, Current Flag, Sequence
    Product Dimension:    PRODUCT KEY, Product Desc., Brand, Color, Size, Manufacturer
    Fact Table:           STORE KEY, PRODUCT KEY, PERIOD KEY, Dollars, Units, Price
    District Fact Table:  District_ID, PRODUCT_KEY, PERIOD_KEY, Dollars, Units, Price
    Region Fact Table:    Region_ID, PRODUCT_KEY, PERIOD_KEY, Dollars, Units, Price

    Aggregations Using the "Snowflake" Schema and Multiple Fact Tables

    • No LEVEL in dimension tables
    • Dimension tables are normalized […]

    Aggregation Contd …

    Advantage: Best performance when queries involve aggregation

    Disadvantage: Complicated maintenance and metadata, explosion in the number of tables in the database

    [Figure: snowflake schema with multiple fact tables; the tables and their columns are listed below.]

    Store Dimension:      STORE KEY, Store Description, City, State, District ID, District Desc., Region_ID, Region Desc., Regional Mgr.
    District:             District_ID, District Desc., Region_ID
    Region:               Region_ID, Region Desc., Regional Mgr.
    Store Fact Table:     STORE KEY, PRODUCT KEY, PERIOD KEY, Dollars, Units, Price
    District Fact Table:  District_ID, PRODUCT_KEY, PERIOD_KEY, Dollars, Units, Price
    Region Fact Table:    Region_ID, PRODUCT_KEY, PERIOD_KEY, Dollars, Units, Price

    Aggregates

    Add up amounts for day 1
    In SQL: SELECT sum(amt) FROM SALE WHERE date = 1

    Result: 81

    Aggregates

    Add up amounts by day
    In SQL: SELECT date, sum(amt) FROM SALE GROUP BY date

    sale   prodId  storeId  date  amt
           p1      s1       1     12
           p2      s1       1     11
           p1      s3       1     50
           p2      s2       1     8
           p1      s1       2     44
           p1      s2       2     4

    ans    date  sum
           1     81
           2     48
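The two aggregate queries can be checked against the sale table with a short script. This is a sketch that assumes Python's sqlite3 module and a throwaway in-memory database; the column types and the ORDER BY clause are additions made here so the example runs deterministically.

```python
import sqlite3

# The six rows of the sale table (prodId, storeId, date, amt).
SALE_ROWS = [
    ("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),  ("p1", "s1", 2, 44), ("p1", "s2", 2, 4),
]

conn = sqlite3.connect(":memory:")  # in-memory database, an assumption
conn.execute("CREATE TABLE sale (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)")
conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)", SALE_ROWS)

# Add up amounts for day 1.
total_day1 = conn.execute("SELECT sum(amt) FROM sale WHERE date = 1").fetchone()[0]

# Add up amounts by day; ORDER BY only fixes the output order.
by_day = conn.execute(
    "SELECT date, sum(amt) FROM sale GROUP BY date ORDER BY date"
).fetchall()

print(total_day1)  # 81
print(by_day)      # [(1, 81), (2, 48)]
```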

    Another Example

    Add up amounts by day, product
    In SQL: SELECT prodId, date, sum(amt) FROM SALE GROUP BY date, prodId

    sale   prodId  storeId  date  amt
           p1      s1       1     12
           p2      s1       1     11
           p1      s3       1     50
           p2      s2       1     8
           p1      s1       2     44
           p1      s2       2     4

    sale   prodId  date  amt
           p1      1     62
           p2      1     19
           p1      2     48

    The two tables are related by rollup and drill-down.
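The relationship between the two groupings can be sketched as follows, assuming Python's sqlite3 module and an in-memory database. Summing the (product, day) totals over products reproduces the per-day totals, which is what a roll-up does; drill-down goes the other way by adding a grouping column. The ORDER BY clause and variable names are additions made for this example.

```python
import sqlite3
from collections import defaultdict

SALE_ROWS = [
    ("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),  ("p1", "s1", 2, 44), ("p1", "s2", 2, 4),
]
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)")
conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)", SALE_ROWS)

# Finer grouping: totals per (product, day).
by_prod_day = conn.execute(
    "SELECT prodId, date, sum(amt) FROM sale "
    "GROUP BY date, prodId ORDER BY date, prodId"
).fetchall()

# Roll-up: collapse the product column by summing within each day.
rolled_up = defaultdict(int)
for _prod, day, amt in by_prod_day:
    rolled_up[day] += amt

print(by_prod_day)      # [('p1', 1, 62), ('p2', 1, 19), ('p1', 2, 48)]
print(dict(rolled_up))  # {1: 81, 2: 48}
```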

    Points to be noticed about ROLAP

    • Defines complex, multi-dimensional data with a simple model
    • Reduces the number of joins a query has to process
    • Allows the data warehouse to evolve with relatively low maintenance
    • Can contain both detailed and summarized […]

    MOLAP: Dimensional Modeling Using the Multi-Dimensional Model

    • MDDB: a special-purpose data model
    • Facts stored in multi-dimensional arrays
    • Dimensions used to index array
    • Sometimes on top of relational DB
    • Products: Pilot, Arbor Essbase, Gentia

    The MOLAP Cube

    Fact table view:

    sale   prodId  storeId  amt
           p1      s1       12
           p2      s1       11
           p1      s3       50
           p2      s2       8

    Multi-dimensional cube:

           s1   s2   s3
    p1     12        50
    p2     11   8

    dimensions = 2
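A minimal sketch of turning the fact table view into the two-dimensional array view, in plain Python. The choice of nested lists, the axis ordering and the use of None for empty cells are assumptions made for this example, not something the slides prescribe.

```python
# Fact table view: (prodId, storeId, amt).
fact = [("p1", "s1", 12), ("p2", "s1", 11), ("p1", "s3", 50), ("p2", "s2", 8)]

# Dimension members; their positions become array indices.
products = ["p1", "p2"]
stores = ["s1", "s2", "s3"]

# cube[i][j] holds the amount for products[i] at stores[j]; None = empty cell.
cube = [[None] * len(stores) for _ in products]
for prod, store, amt in fact:
    cube[products.index(prod)][stores.index(store)] = amt

print(cube)  # [[12, None, 50], [11, 8, None]]
```

A cell is then addressed by position rather than by searching rows, for example `cube[products.index("p2")][stores.index("s2")]` for product p2 at store s2.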

    3-D Cube

    Fact table view:

    sale   prodId  storeId  date  amt
           p1      s1       1     12
           p2      s1       1     11
           p1      s3       1     50
           p2      s2       1     8
           p1      s1       2     44
           p1      s2       2     4

    Multi-dimensional cube:

    day 1  s1   s2   s3
    p1     12        50
    p2     11   8

    day 2  s1   s2   s3
    p1     44   4
    p2

    dimensions = 3
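The same idea extends to three dimensions. The sketch below, in plain Python, stores one product-by-store slice per day; the nested-list layout, the axis order (day, product, store), the helper name `cell` and the use of None for empty cells are all assumptions made for illustration.

```python
# Fact table view: (prodId, storeId, date, amt).
fact = [
    ("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),  ("p1", "s1", 2, 44), ("p1", "s2", 2, 4),
]
products, stores, days = ["p1", "p2"], ["s1", "s2", "s3"], [1, 2]

# cube[d][p][s]: one 2-D slice per day; None marks an empty cell.
cube = [[[None] * len(stores) for _ in products] for _ in days]
for prod, store, day, amt in fact:
    cube[days.index(day)][products.index(prod)][stores.index(store)] = amt

def cell(prod, store, day):
    """Look up a single cell by dimension values (hypothetical helper)."""
    return cube[days.index(day)][products.index(prod)][stores.index(store)]

print(cube[0])             # day 1: [[12, None, 50], [11, 8, None]]
print(cube[1])             # day 2: [[44, 4, None], [None, None, None]]
print(cell("p1", "s1", 2)) # 44
```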

    Example

    [Figure: a three-dimensional cube with axes Store (NY, SF, LA), Product (Juice, Milk, Coke, Cream, Soap, Bread) and Time (M, T, W, Th, F, S, S). Legible cell values include 32 and 12. Caption: "[…] units of bread sold in LA on M".]

    Dimensions: Time, Product, Store

    Attributes:
    Product (upc, price, …)
    Store …

    Hierarchies:
    Product → Brand → …
    Day → Week → Quarter
    Store → Region → Country

    Roll-ups shown: roll-up to week, roll-up to brand, roll-up to region

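    Cube Aggregation: Roll-up

    Example: computing sums

    day 1  s1   s2   s3
    p1     12        50
    p2     11   8

    day 2  s1   s2   s3
    p1     44   4
    p2

    Rolled up over days:

           s1   s2   s3
    p1     56   4    50
    p2     11   8

    Rolled up further over products:

           s1   s2   s3
    sum    67   12   50

    Rolled up further over stores:

           sum
    p1     110
    p2     19

    Grand total: 129

    Arrows on the slide: rollup (toward the sums), drill-down (back toward the detail)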
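The sums in this example can be reproduced with a small roll-up routine. This is a sketch in plain Python under the assumption that the cube is held as a dictionary keyed by (product, store, day); the function name `roll_up` and its `keep` parameter are hypothetical.

```python
from collections import defaultdict

# Non-empty cube cells keyed by (prodId, storeId, date).
cells = {
    ("p1", "s1", 1): 12, ("p2", "s1", 1): 11, ("p1", "s3", 1): 50,
    ("p2", "s2", 1): 8,  ("p1", "s1", 2): 44, ("p1", "s2", 2): 4,
}

def roll_up(cube, keep):
    """Sum out every dimension whose position is not listed in `keep`."""
    out = defaultdict(int)
    for key, amt in cube.items():
        out[tuple(key[i] for i in keep)] += amt
    return dict(out)

by_prod_store = roll_up(cells, keep=(0, 1))      # sum over days
by_store = roll_up(by_prod_store, keep=(1,))     # then sum over products
by_prod = roll_up(by_prod_store, keep=(0,))      # or sum over stores
grand_total = roll_up(by_prod_store, keep=())    # sum over everything

print(by_prod_store[("p1", "s1")])  # 56
print(by_store)                     # {('s1',): 67, ('s3',): 50, ('s2',): 12}
print(by_prod)                      # {('p1',): 110, ('p2',): 19}
print(grand_total)                  # {(): 129}
```

Each roll-up starts from an already aggregated cube rather than from the raw cells, which mirrors how the slide derives the coarser sums step by step.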

    Cube Operators for Roll-up

    day 1  s1   s2   s3
    p1     12        50
    p2     11   8

    day 2  s1   s2   s3
    p1     44   4
    p2

    Rolled up over days:

           s1   s2   s3
    p1     56   4    50
    p2     11   8

    Rolled up further over products:

           s1   s2   s3
    sum    67   12   50

    Rolled up further over stores:

           sum
    p1     110
    p2     19

    Grand total: 129

    Cube operators marked on the figure:
    sale(s1,*,*) = 67
    sale(*,*,*) = 129
    sale(s2,p2,*) = 8
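The operator notation can be read as "sum every cell that matches the fixed arguments, with * matching anything". A minimal sketch in plain Python, assuming the argument order (store, product, date) used in the notation above; the function `sale` and its default arguments are a hypothetical rendering of that notation.

```python
# Fact rows: (prodId, storeId, date, amt).
FACT = [
    ("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),  ("p1", "s1", 2, 44), ("p1", "s2", 2, 4),
]

def sale(store="*", prod="*", date="*"):
    """Sum amt over all rows matching the arguments; "*" matches any value."""
    return sum(
        amt
        for p, s, d, amt in FACT
        if store in ("*", s) and prod in ("*", p) and date in ("*", d)
    )

print(sale("s1", "*", "*"))   # 67
print(sale("*", "*", "*"))    # 129
print(sale("s2", "p2", "*"))  # 8
```

This version scans the fact rows on every call; a MOLAP server would typically answer such requests from precomputed aggregates instead.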

    Extended Cube

    day 1  s1   s2   s3   *
    p1     12        50   62
    p2     11   8         19
    *      23   8    50   81

    day 2  s1   s2   s3   *
    p1     44   4         48
    p2
    *      44   4         48

    *      s1   s2   s3   *
    p1     56   4    50   110
    p2     11   8         19
    *      67   12   50   129

    Cube operator marked on the figure: sale(*,p2,*) = 19
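One way to materialize the extended cube is to add each fact row to every cell it contributes to, replacing any subset of its dimension values with *. The sketch below is plain Python; the dictionary representation, the key order (product, store, date) and the function name `extended_cube` are assumptions made for this example.

```python
from collections import defaultdict
from itertools import product

# Fact rows: (prodId, storeId, date, amt).
FACT = [
    ("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),  ("p1", "s1", 2, 44), ("p1", "s2", 2, 4),
]

def extended_cube(rows):
    """Return {(prod, store, date): total}, where any position may be "*"."""
    cube = defaultdict(int)
    for p, s, d, amt in rows:
        # Each row feeds 2**3 cells: every way of starring out its dimensions.
        for mask in product((False, True), repeat=3):
            key = tuple("*" if star else v for star, v in zip(mask, (p, s, d)))
            cube[key] += amt
    return dict(cube)

cube = extended_cube(FACT)
print(cube[("*", "*", "*")])   # 129
print(cube[("p2", "*", "*")])  # 19
print(cube[("*", "s1", 1)])    # 23
print(cube[("p1", "*", 2)])    # 48
```

Once the extended cube is built, any roll-up request becomes a single dictionary lookup, at the cost of storing the extra aggregate cells.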

    Aggregation Using Hierarchies

           region A   region B
    p1     56         54
    p2     11         8

    Hierarchy: store → region → country

    (store s1 in Region A