DW 2.0 Workshop - Bill Inmon - Archival Sector

  • 7/28/2019 DW 2.0 Workshop - Bill Inmon - Archival Sector

    a presentation by W H Inmon

    ALTERNATE STORAGE

    Copyright Inmon Consulting Services, 2008


    data warehouses grow large rapidly


    data warehouses age over time


    Actively used data

    Dormant data

    Performance is greatly hurt by keeping a lot of dormant data on disk storage


    an artery with cholesterol: lots of cholesterol vs. not much cholesterol

    a data warehouse with lots of dormant data is like an artery with lots of cholesterol


    Lots of dormant data vs. not much dormant data

    Removing dormant data from the data warehouse is the single most important thing the designer can do to improve performance


    It is much less expensive to place data on different forms of storage


    There are three storage media that data can be sent to:

    - High performance disk storage

    - Near line storage

    - Archival storage


    - High performance disk storage: data that has a very high probability of access

    - Near line storage: data that has a low probability of access

    - Archival storage: data that needs to be kept regardless of probability of access


    - High performance disk storage: can be accessed in online time

    - Near line storage: can be accessed in near online time

    - Archival storage: cannot be accessed in online time
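
The placement rules on these slides can be sketched in code. This is only an illustration: the `Dataset` fields, the two probability cut-offs, and the rule that retention only sends low-probability data to archival storage are all assumptions, not figures or rules from the presentation.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    DISK = "high performance disk storage"
    NEAR_LINE = "near line storage"
    ARCHIVAL = "archival storage"


@dataclass
class Dataset:
    name: str
    access_probability: float  # estimated chance of access, 0.0 to 1.0 (assumed input)
    must_retain: bool = False  # kept regardless of access, e.g. for audit (assumed input)


# Hypothetical cut-offs, chosen only for illustration.
HIGH_PROBABILITY = 0.50
LOW_PROBABILITY = 0.05


def choose_tier(ds: Dataset) -> Tier:
    """Pick a storage medium from probability of access and retention need."""
    if ds.access_probability >= HIGH_PROBABILITY:
        return Tier.DISK
    if ds.must_retain and ds.access_probability < LOW_PROBABILITY:
        return Tier.ARCHIVAL
    return Tier.NEAR_LINE


placements = {
    ds.name: choose_tier(ds)
    for ds in [
        Dataset("current_orders", 0.90),
        Dataset("orders_3_years_old", 0.10),
        Dataset("orders_10_years_old", 0.01, must_retain=True),
    ]
}
```

In practice the probability of access would have to be estimated, for example from query logs; the sketch simply takes it as given.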


    Near line storage can be tightly coupled or loosely coupled with disk storage


    When the near line storage environment and the disk storage environment are not tightly coupled, they must be queried separately

    Of course, the result sets can be merged independently if desired

    (diagram: one query and one result set per environment)
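
A minimal sketch of the loosely coupled case: each environment is queried on its own and the two result sets are merged afterwards by the caller. Two in-memory SQLite databases stand in for the disk and near line environments; the `sales` table, its columns, and the sample rows are invented for illustration.

```python
import sqlite3


def make_store(rows):
    """Create an independent in-memory database holding a sales table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (sale_id INTEGER, year INTEGER, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    return conn


# Two separate environments with no knowledge of each other.
disk = make_store([(3, 2008, 40.0), (4, 2008, 15.5)])
near_line = make_store([(1, 2003, 10.0), (2, 2004, 22.0)])

QUERY = "SELECT sale_id, year, amount FROM sales WHERE amount >= ?"


def query_both(min_amount):
    """Run the query once per environment, then merge the result sets."""
    disk_rows = disk.execute(QUERY, (min_amount,)).fetchall()
    near_rows = near_line.execute(QUERY, (min_amount,)).fetchall()
    return sorted(disk_rows + near_rows)


merged = query_both(15.0)
```

The same schema is used on both sides only to keep the example short. In a loosely coupled setup the two designs may differ, in which case the merge step would also have to map one layout onto the other.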


    When the disk storage and the near line environment are not tightly coupled, the database design can be independent and the data can be managed separately


    When the disk storage environment and the near line storage environment are tightly coupled, a single query can access both sets of data without knowing where the data is
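
One way to picture tight coupling is a single connection that sees both environments behind one view. The sketch below uses SQLite's `ATTACH DATABASE` and a temporary view as a stand-in; it is not how any particular near line product works, and the `sales` table, the `all_sales` view, and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stands in for the disk environment
conn.execute("ATTACH DATABASE ':memory:' AS nearline")  # stands in for near line storage

# Both sides share one table design.
DDL = "CREATE TABLE {schema}.sales (sale_id INTEGER, year INTEGER, amount REAL)"
conn.execute(DDL.format(schema="main"))
conn.execute(DDL.format(schema="nearline"))

conn.executemany("INSERT INTO main.sales VALUES (?, ?, ?)", [(3, 2008, 40.0)])
conn.executemany(
    "INSERT INTO nearline.sales VALUES (?, ?, ?)",
    [(1, 2003, 10.0), (2, 2004, 22.0)],
)

# A temporary view may span attached databases, so it can hide the data's location.
conn.execute(
    "CREATE TEMP VIEW all_sales AS "
    "SELECT * FROM main.sales UNION ALL SELECT * FROM nearline.sales"
)


def total_sales_since(year):
    """A single query; the caller does not know which medium holds each row."""
    row = conn.execute(
        "SELECT SUM(amount) FROM all_sales WHERE year >= ?", (year,)
    ).fetchone()
    return row[0]
```

Note that the view only works because both tables have the same columns, which mirrors the requirement that a tightly coupled design be identical on both media.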


    When the disk storage and the near line storage environments are managed in a tightly coupled manner, the database design must be identical and the units of storage must be managed together


    Archival storage is always loosely coupled with other storage media


    archival assumptions over time

    - data will degrade

    - metadata not stored directly with data will be lost or otherwise corrupted

    - related data (key/foreign key) will be lost and/or corrupted
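
One possible response to these three assumptions is to make every archived record self-describing: store the metadata and the related (foreign key) data inside the record, and add a checksum so degradation can be detected. The sketch below is an assumption about how that could look, not a format from the presentation; the envelope layout and the function names are hypothetical.

```python
import hashlib
import json


def build_archive_record(order, customer, schema):
    """Bundle data, its metadata, and its related data into one checked envelope."""
    payload = {
        "metadata": schema,                 # column meanings travel with the data
        "data": order,
        "related": {"customer": customer},  # foreign-key target copied in
    }
    body = json.dumps(payload, sort_keys=True)
    checksum = hashlib.sha256(body.encode("utf-8")).hexdigest()
    return json.dumps({"checksum": checksum, "body": body})


def read_archive_record(text):
    """Verify the checksum before trusting an archived record."""
    envelope = json.loads(text)
    actual = hashlib.sha256(envelope["body"].encode("utf-8")).hexdigest()
    if actual != envelope["checksum"]:
        raise ValueError("archive record is corrupted")
    return json.loads(envelope["body"])


record = build_archive_record(
    order={"order_id": 17, "customer_id": 5, "amount": 99.5},
    customer={"customer_id": 5, "name": "Acme"},
    schema={"order_id": "integer key", "amount": "order total in USD"},
)
restored = read_archive_record(record)
```

The cost of this design is redundancy: the customer row is repeated in every order that refers to it. That is the trade being made so that a record can still be understood when nothing else around it has survived.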


    copy over

    occasionally it is a good practice to copy over archived data to ensure the longevity and integrity of the data
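
A copy-over step might be sketched as follows, assuming the archive is a set of files and that a SHA-256 checksum was recorded when each file was first archived. The function names and the checksum bookkeeping are assumptions for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large archives do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_over(source, target, expected_checksum):
    """Copy an archived file to fresh media, verifying it before and after."""
    if sha256_of(source) != expected_checksum:
        raise ValueError(f"{source} no longer matches its recorded checksum")
    shutil.copyfile(source, target)
    if sha256_of(target) != expected_checksum:
        raise IOError(f"copy to {target} failed verification")
    return Path(target)
```

Checking the source first matters: copying a file that has already degraded would only preserve the damage on newer media.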


    spare machine

    while archival data is sitting around waiting to be used, create passive indexes in anticipation of future needs


    cross media storage manager

    data can flow from the archival environment to the disk environment if needed
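
As a toy illustration of that flow, the class below moves a unit of data out to the archive and recalls it to disk on demand. Two dictionaries stand in for the two media, and the class and method names are hypothetical; a real cross media storage manager is a product-level component with far more to it.

```python
class CrossMediaStorageManager:
    """Toy model: two dictionaries stand in for disk and archival media."""

    def __init__(self):
        self.disk = {}
        self.archive = {}

    def send_to_archive(self, key):
        """Move a unit of data off disk and into the archival environment."""
        self.archive[key] = self.disk.pop(key)

    def recall_to_disk(self, key):
        """Copy archived data back to disk if needed; the archive keeps its copy."""
        if key not in self.disk:
            self.disk[key] = self.archive[key]
        return self.disk[key]


manager = CrossMediaStorageManager()
manager.disk["orders_2001"] = [("order", 17, 99.5)]
manager.send_to_archive("orders_2001")
rows = manager.recall_to_disk("orders_2001")
```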


    Archival and near line data can also flow into the data mining/exploration warehouse environment


    Granularity of data -

    In both the near line and the archival environment, data needs to be kept at the granular level. It is optional and sometimes useful to keep the data at the summary level.


    operating system/DBMS

    Near line storage needs to be kept current/compatible with the operating system/DBMS

    Archival data usually is not current with the operating system/DBMS


    One of the most important uses of archival and near line data is that of passive security

    In passive security we look at the records of events to determine the extent of the damage or to find out how to prevent a disaster next time


    Building passive indexes in the archival environment

    Don't just let archival data sit there and wait for activity. Take an idle processor and constantly build indexes in anticipation of future unknown needs.

    Then when it comes time to use the archival environment it will be fast and easy to access
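
A passive index can be as simple as a map from each value of a field to the positions of the records that contain it. The sketch below builds one such index per field, since the future need is unknown. It assumes archived records are dictionaries with hashable scalar values; the function names are hypothetical, and scheduling the work on an idle processor is left out.

```python
from collections import defaultdict


def build_passive_index(records, field):
    """Map each value of one field to the positions of the records holding it."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        if field in record:
            index[record[field]].append(position)
    return dict(index)


def build_all_indexes(records):
    """Index every field seen in the archive, not just the ones queried today."""
    fields = sorted({name for record in records for name in record})
    return {field: build_passive_index(records, field) for field in fields}


archived = [
    {"customer": "Acme", "year": 2001},
    {"customer": "Zenith", "year": 2001},
    {"customer": "Acme", "year": 2003},
]
indexes = build_all_indexes(archived)
acme_positions = indexes["customer"]["Acme"]
```

When a request finally arrives, a lookup in `indexes` replaces a full scan of the archive, which is the payoff the slide describes.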
