TD_ji1

  • 7/30/2019 TD_ji1

    Join Index

    EXPLANATION ON JOIN INDEX

    In Teradata, the concept of a materialized view is referred to as a Join Index (or an Aggregate Join Index, AJI, when the definition includes a GROUP BY clause).

    1. Creating an AJI effectively materializes the view results in a table.

    2. When a query is covered by the view, Teradata accesses that table to return the answer.

    3. This is beneficial because the join index can be used even for SQL that

    does not reference the view.

    4. This makes Teradata's implementation much more powerful than the materialized views offered by other vendors. Look at the attached EXPLAIN output (next slide).

    5. The first SQL statement references the view used to define the join index. The second SQL statement is similar to the view definition but not identical to it. However, Teradata still exploits the AJI. The view definition is shown at the end of the file (slide).

    6. An AJI works as follows. When the AJI is created, the view is materialized and saved in a table. When rows of the underlying base tables change, the affected rows of the AJI table must be recalculated and stored for future queries. If the AJI is based on a complex view, this can add significant overhead to INSERT and UPDATE operations on the base tables.

    7. Join Indexes work well for dimension tables such as the calendar and the geography hierarchy.

    8. With many updates, you might find it quicker to drop the AJI, update the tables, and then recreate the AJI.
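    The drop, update, and recreate sequence in item 8 can be sketched as follows. This is a minimal illustration, not the author's actual workload: the join index name Pos_Store_Day_AJI, the alias Total_Qty, and the UPDATE statement are hypothetical, and the Pos table with its columns is borrowed from the sample query elsewhere in this deck.

```sql
-- Hypothetical single-table AJI: quantity sold per store per day.

-- 1. Drop the join index so the bulk change does not pay for index maintenance.
DROP JOIN INDEX Pos_Store_Day_AJI;

-- 2. Apply the bulk changes to the base table (placeholder statement).
UPDATE Pos
SET Pos_Qty = 0
WHERE Pos_Qty IS NULL;

-- 3. Recreate the join index; it is rebuilt from the current base data.
--    Depending on the Teradata release, the SUM column may need an
--    explicit FLOAT type; check the documentation for your release.
CREATE JOIN INDEX Pos_Store_Day_AJI AS
SELECT Location_Id, Pos_Date, SUM(Pos_Qty) AS Total_Qty
FROM Pos
GROUP BY Location_Id, Pos_Date
PRIMARY INDEX (Location_Id);
```

    Whether this is faster than leaving the index in place depends on the volume of changes and the cost of rebuilding, so it is worth timing both approaches on representative data.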

    Summary: JI - A Fast Path To Fast Query

    A Join Index is an indexing structure that:

    stores and maintains the result of joining two or more tables;

    is useful when the index structure contains all of the columns referenced by one or more joins in a query.

    To improve performance during updates, consider collecting statistics on the base tables of a Join Index.

    ==================================

    Defined by you and maintained by the system

    Immediately available to the Optimizer

    If it is a covering index, considered by the Optimizer for a merge join

    Reported by the HELP INDEX and SHOW TABLE statements

    JI Affects the Following:

    Load Utilities: MultiLoad and FastLoad cannot be used on a base table that has a Join Index. Use TPump or BTEQ instead.

    Archive and Restore: Archiving is permitted on a base table or database. During a restore, the Join Index is marked as invalid.

    Permanent Journal Recovery: Using a permanent journal to recover a base table (i.e., ROLLBACK or ROLLFORWARD) is permitted, but the Join Index is marked as invalid.

    Collecting Statistics: Statistics should be collected on the primary index and secondary indexes of the Join Index to provide the Optimizer with baseline statistics.
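    As a rough sketch of the statistics advice above, a join index can be named in COLLECT STATISTICS in the same position as a table. The join index name Pos_Store_Day_AJI and its columns are hypothetical; the Pos base table is the one used in the sample query in this deck.

```sql
-- Hypothetical join index and column names, for illustration only.

-- Statistics on the primary index of the join index.
COLLECT STATISTICS ON Pos_Store_Day_AJI INDEX (Location_Id);

-- Statistics on another column of the join index that queries filter on.
COLLECT STATISTICS ON Pos_Store_Day_AJI COLUMN (Pos_Date);

-- Statistics on the join column of a base table.
COLLECT STATISTICS ON Pos COLUMN (Location_Id);
```

    Statistics go stale as data changes, so they would normally be recollected after significant loads.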

    JI: Case Study

    Reporting Requirements: Store by Day; Store by Product by Day; Store by Product by Month

    Reporting Frequency: Multiple times a day, following data loads

    Data Maintenance: POS data is loaded throughout the day. Inventory is loaded once a day. Full history of data is kept.

    Types of JI:

    Single Table: subset column selection on a large base table; aggregation of a single table

    Multi Table: pre-joining multiple tables, with and without aggregation

    Sparse: WHERE clause used to limit the data in the JI
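    To make the single-table and sparse variants concrete, here is a hedged sketch. The join index names, the choice of primary index columns, and the date cutoff are assumptions made for illustration; the Pos table and its columns come from the sample query in this deck.

```sql
-- Single-table JI: a narrow copy of a large base table, distributed on
-- a different primary index than the base table (hypothetical choice).
CREATE JOIN INDEX Pos_Narrow_JI AS
SELECT Location_Id, Product_Sku, Pos_Date, Pos_Qty
FROM Pos
PRIMARY INDEX (Product_Sku);

-- Sparse JI: the WHERE clause limits which rows are stored in the index.
-- The cutoff date is an arbitrary example value.
CREATE JOIN INDEX Pos_Recent_JI AS
SELECT Location_Id, Product_Sku, Pos_Date, Pos_Qty
FROM Pos
WHERE Pos_Date >= DATE '2019-01-01'
PRIMARY INDEX (Location_Id);
```

    A sparse index only helps queries whose own predicates fall within the index's WHERE condition, so the cutoff should match how the data is actually queried.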

    Sample Query

    Aggregation by store by product by day:

    Select L.Location_Name, Pr.Product_Name, P.Pos_Date, Sum(Pos_Qty)
    From Location L, Product Pr, Pos P
    Where P.Location_Id = L.Location_Id
      And P.Product_Sku = Pr.Product_Sku
    Group By L.Location_Name, Pr.Product_Name, P.Pos_Date
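    One way to pre-compute this query is a multi-table aggregate join index with the same joins and grouping. The sketch below is an assumption about how that might look, not the definition used in the original case study: the index name Pos_Store_Prod_Day_AJI, the alias Total_Qty, and the primary index columns are hypothetical, and Pos_Qty is assumed to belong to the Pos table.

```sql
-- Hypothetical AJI matching the store-by-product-by-day aggregation.
-- Depending on the Teradata release, the SUM column may need an
-- explicit FLOAT type; check the documentation for your release.
CREATE JOIN INDEX Pos_Store_Prod_Day_AJI AS
SELECT L.Location_Name, Pr.Product_Name, P.Pos_Date,
       SUM(P.Pos_Qty) AS Total_Qty
FROM Location L, Product Pr, Pos P
WHERE P.Location_Id = L.Location_Id
  AND P.Product_Sku = Pr.Product_Sku
GROUP BY L.Location_Name, Pr.Product_Name, P.Pos_Date
PRIMARY INDEX (Location_Name, Pos_Date);

-- EXPLAIN shows the plan without running the query, which is how to
-- check whether the Optimizer chose the join index.
EXPLAIN
SELECT L.Location_Name, Pr.Product_Name, P.Pos_Date, SUM(P.Pos_Qty)
FROM Location L, Product Pr, Pos P
WHERE P.Location_Id = L.Location_Id
  AND P.Product_Sku = Pr.Product_Sku
GROUP BY L.Location_Name, Pr.Product_Name, P.Pos_Date;
```

    The query itself never names the join index; the Optimizer decides whether to use it, so the EXPLAIN text is the place to confirm that it did.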

    Join Index Cost

    Space

    JI maintenance following data changes

    Initial creation

    The JI itself can't be backed up (or restored)

    Poor design might lead to minimal benefits

    Too many columns selected

    Wrap up

    Direct data access

    Less data to analyze

    Pre-joining of data can improve response times

    Costs/benefits must be analyzed before implementation