Relational optimization and consideration

  • 8/12/2019 Relational optimization and consideration

    1/23

    RELATIONAL OPTIMIZATION AND CONSIDERATIONS

    Relational Optimization

    The optimizer is the heart of a relational database management system.

    It is an inference engine for determining the best possible database navigation strategy for any given SQL request.

    Relational optimization is very powerful because it allows queries to adapt to a changing database environment. The optimizer can also react to changes by formulating new access paths without requiring application coding changes.

    Physical data independence is the separation of access criteria from physical storage characteristics.
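The idea that access paths can change while the SQL stays the same can be illustrated with a small sketch. It assumes SQLite (through Python's standard `sqlite3` module) as a stand-in DBMS, since the slides do not name a product; the `emp` table, its columns, and the index name are hypothetical. The exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

def plan(conn, sql, params=()):
    """Return the access-path descriptions SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable description is the last column of each row.
    return [row[-1] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, dept TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO emp (dept, salary) VALUES (?, ?)",
    [("D%02d" % (i % 20), 1000.0 + i) for i in range(2000)],
)

# The application's SQL text: it never mentions how the data is stored.
query = "SELECT id, salary FROM emp WHERE dept = ?"

before = plan(conn, query, ("D07",))      # no index on dept exists yet
rows_before = conn.execute(query, ("D07",)).fetchall()

# A purely physical change: the DBA adds an index.
conn.execute("CREATE INDEX idx_emp_dept ON emp (dept)")

after = plan(conn, query, ("D07",))       # same SQL text, new access path
rows_after = conn.execute(query, ("D07",)).fetchall()

print("before:", before)
print("after: ", after)
```

The query string is untouched between the two runs; only the plan chosen by the optimizer differs, while the result rows stay the same.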

    To optimize SQL

    The relational optimizer must analyze each SQL statement by parsing it to determine the tables and columns that must be accessed.

    The optimizer also accesses statistics stored by the RDBMS in either the system catalog or the database objects themselves.

    Every RDBMS has an embedded relational optimizer that renders SQL statements into executable access paths.

    Modern relational optimizers are cost-based, meaning that the optimizer attempts to formulate an access path for each query that reduces overall cost.

    CPU and I/O Costs

    The optimizer can arrive at a rough estimate of the CPU time required to run the query using each optimized access path it analyzes.
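A toy model can make the cost comparison concrete. The sketch below is an illustration only, not the formula of any real optimizer: the `AccessPath` class, the two weights, and the page and row estimates are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AccessPath:
    name: str
    pages_read: int      # estimated I/O: pages fetched from storage
    rows_examined: int   # estimated CPU: rows the engine must evaluate

# Hypothetical weights: one page read is assumed to cost as much
# as evaluating 100 rows.
IO_COST_PER_PAGE = 1.0
CPU_COST_PER_ROW = 0.01

def estimated_cost(path):
    """Combine the I/O and CPU estimates into a single comparable number."""
    return path.pages_read * IO_COST_PER_PAGE + path.rows_examined * CPU_COST_PER_ROW

def cheapest(paths):
    """Pick the candidate access path with the lowest estimated cost."""
    return min(paths, key=estimated_cost)

candidates = [
    AccessPath("full table scan", pages_read=5000, rows_examined=1_000_000),
    AccessPath("index lookup", pages_read=120, rows_examined=4_000),
]

best = cheapest(candidates)
print(best.name, estimated_cost(best))
```

The point of the sketch is only the shape of the decision: every candidate path gets an estimate, and the lowest estimate wins.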

    Database Statistics

    A relational optimizer is of little use without accurate statistics about the data stored in the database. The DBMS provides a utility program or command to gather statistics about database objects and to store them for use by the optimizer.

    The DBA should collect updated statistics whenever a significant volume of data has been added or modified. Failure to do so will result in the optimizer basing its cost estimates on inaccurate statistics, which may be detrimental to query performance.
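As one concrete instance of such a statistics command, SQLite provides `ANALYZE`, which stores its results in the `sqlite_stat1` catalog table. The sketch below assumes SQLite as a stand-in DBMS; the `orders` table and index name are hypothetical, and other products use their own commands and catalog layouts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("CREATE INDEX idx_orders_status ON orders (status)")
conn.executemany(
    "INSERT INTO orders (status) VALUES (?)",
    [("open" if i % 4 == 0 else "closed",) for i in range(1000)],
)

# Gather statistics after the bulk load. SQLite writes them to the
# sqlite_stat1 catalog table, which the optimizer consults later.
conn.execute("ANALYZE")

# Map each index name to its stored statistics string.
stats = dict(
    conn.execute(
        "SELECT idx, stat FROM sqlite_stat1 WHERE tbl = 'orders'"
    ).fetchall()
)
print(stats)
```

In SQLite the first number in each `stat` string is the row count of the index; rerunning `ANALYZE` after a large data change refreshes it.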

    The DBMS collects statistical information such as:

    Number of unique values stored in the column

    Most frequently occurring values for columns

    Index key density

    Details on the ratio of clustering for clustered tables

    Correlation of columns to other columns

    Structural state of the index or tablespace

    Amount of the storage used by the database object

    Query Analysis

    The optimizer scans the SQL statement to determine its overall complexity. The formulation of the SQL statement is a significant factor in determining the access paths chosen by the optimizer.

    The complexity of the query, the number and type of predicates, the presence of functions, and the presence of ordering clauses all enter into the estimated cost that is calculated by the optimizer.
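How the formulation of a predicate influences the chosen access path can be seen in a small experiment. The sketch assumes SQLite as a stand-in DBMS; the `emp` table and index are hypothetical, and the behavior shown (a function wrapped around an indexed column preventing an index search) is what SQLite does here, not a rule for every product.

```python
import sqlite3

def plan(conn, sql):
    """Return SQLite's access-path description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable description is the last column of each row.
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.execute("CREATE INDEX idx_emp_name ON emp (name)")

# Predicate on the bare column: the index on name can be searched.
direct = plan(conn, "SELECT * FROM emp WHERE name = 'SMITH'")

# Same column wrapped in a function: the plain index no longer matches
# the predicate, so the table is scanned instead.
wrapped = plan(conn, "SELECT * FROM emp WHERE upper(name) = 'SMITH'")

print("direct: ", direct)
print("wrapped:", wrapped)
```

Both statements ask a similar question, yet the way the predicate is written leads to different access paths.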

    Which tables in which database are required

    Whether any views are required to be broken down into underlying tables

    Whether table joins or subselects are required

    Which indexes, if any, can be used

    How many predicates must be satisfied

    Which functions must be executed

    Whether the SQL uses OR or AND

    How the DBMS processes each component of the SQL statement

    How much memory has been assigned to the data caches used by the tables in the SQL statement

    How much memory is available for sorting if the query requires a sort

    Joins

    Joining combines information from multiple tables.

    When multiple tables are accessed, the optimizer figures out how to combine the tables in the most efficient manner.

    When determining the access path for a join, the optimizer must determine the order in which the tables will be joined.

    The optimizer chooses one table to process first, called the OUTER TABLE.

    A series of operations is performed on the outer table to prepare it for joining.

    Rows from that table are then combined with rows from the second table, called the INNER TABLE.

    Two Common Join Methods

    Nested-loop join

    Merge-scan join
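The slides name the nested-loop join without describing it. As a minimal sketch of the usual textbook form, each row of the outer table is compared with every row of the inner table. The function name, the list-of-dicts row representation, and the sample data are assumptions made for illustration; a real DBMS would typically probe the inner table through an index rather than rescan it.

```python
def nested_loop_join(outer, inner, outer_key, inner_key):
    """Join two lists of dict rows by comparing every outer row to every inner row."""
    result = []
    for outer_row in outer:            # one pass over the outer table
        for inner_row in inner:        # a full pass over the inner table per outer row
            if outer_row[outer_key] == inner_row[inner_key]:
                # Merge the matching pair into one output row.
                result.append({**outer_row, **inner_row})
    return result

departments = [
    {"dept_id": 1, "dept": "Sales"},
    {"dept_id": 2, "dept": "IT"},
]
employees = [
    {"emp": "Ann", "emp_dept": 1},
    {"emp": "Bob", "emp_dept": 2},
    {"emp": "Cy", "emp_dept": 1},
]

joined = nested_loop_join(departments, employees, "dept_id", "emp_dept")
for row in joined:
    print(row["dept"], row["emp"])
```

Because the inner table is visited once per outer row, the choice of which table is the outer one has a direct effect on the amount of work.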

    Merge-scan Join

    The tables to be joined are ordered by the join keys. This ordering can be accomplished by a sort or by access via an index.
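A minimal sketch of the merge-scan idea follows, assuming in-memory lists of dict rows and using an explicit sort to establish the key order (the index-based alternative is not modeled). The function name and sample data are hypothetical.

```python
def merge_scan_join(left, right, left_key, right_key):
    """Join two lists of dict rows by ordering both on the join key and merging."""
    # Step 1: order both inputs by their join keys (here, by sorting).
    left = sorted(left, key=lambda r: r[left_key])
    right = sorted(right, key=lambda r: r[right_key])

    result = []
    i = j = 0
    # Step 2: advance through both inputs in key order.
    while i < len(left) and j < len(right):
        lk, rk = left[i][left_key], right[j][right_key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of rows sharing this key on each side.
            i_end = i
            while i_end < len(left) and left[i_end][left_key] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][right_key] == lk:
                j_end += 1
            # Emit every pairing within the two runs.
            for l_row in left[i:i_end]:
                for r_row in right[j:j_end]:
                    result.append({**l_row, **r_row})
            i, j = i_end, j_end
    return result

departments = [
    {"dept_id": 2, "dept": "IT"},
    {"dept_id": 3, "dept": "HR"},
    {"dept_id": 1, "dept": "Sales"},
]
employees = [
    {"emp": "Ann", "emp_dept": 1},
    {"emp": "Bob", "emp_dept": 2},
    {"emp": "Cy", "emp_dept": 1},
]

merged = merge_scan_join(departments, employees, "dept_id", "emp_dept")
for row in merged:
    print(row["dept"], row["emp"])
```

Once both inputs are ordered, each is read only once, which is why an existing index in key order can make this method attractive.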

    Join Order

    The optimizer reviews each join in a query and analyzes the appropriate statistics to determine the optimal order in which the tables should be accessed to complete the join.

    To find the optimal join access path, the optimizer uses built-in algorithms containing knowledge about joins and data volume.

    It matches this intelligence against the join predicates, database statistics, and available indexes to estimate which order is more efficient.
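The comparison of join orders can be sketched with a toy estimator. Everything here is an assumption made for illustration: the table names, row counts, selectivities, and the cost measure (total estimated rows produced by intermediate joins). Real optimizers use richer cost models and avoid enumerating every order.

```python
from itertools import permutations

# Hypothetical statistics: row counts per table and the selectivity
# of each join predicate (fraction of row pairs that match).
ROWS = {"customer": 1_000, "orders": 50_000, "lineitem": 200_000}
SELECTIVITY = {
    frozenset(("customer", "orders")): 1 / 1_000,
    frozenset(("orders", "lineitem")): 1 / 50_000,
}

def intermediate_rows(order):
    """Estimate the total rows produced by the joins of a left-to-right order."""
    joined = [order[0]]
    size = float(ROWS[order[0]])
    total = 0.0
    for table in order[1:]:
        size *= ROWS[table]
        for prev in joined:
            # A missing predicate means a cross product (selectivity 1).
            size *= SELECTIVITY.get(frozenset((prev, table)), 1.0)
        joined.append(table)
        total += size
    return total

# Evaluate every possible order and keep the cheapest estimate.
best_order = min(permutations(ROWS), key=intermediate_rows)
print(best_order, intermediate_rows(best_order))
```

In this toy data, any order that joins `customer` directly to `lineitem` first has no join predicate to apply and produces a huge intermediate result, so the estimator steers away from it.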