krishan-singh
DB2 for z/OS Version 8: an introduction for beginners to SQL coding strategies and guidelines, optimization, filter factors, and predicates.
DB2 z/OS v8 - SQL Tuning
Overview
– Understanding DB2 Optimizer
– SQL Coding Strategies & Guidelines
– Filter Factor
– Stage 1 & Stage 2 Predicates
– Explain table
– How to interpret the Explain Tables
– Using Monitoring Tools to understand the performance of SQLs
  • BMC Apptune
  • BMC SQL Explorer
SQL Coding Strategies & Guidelines
[Diagram: SQL → DB2 Optimizer (Cost-Based; uses Query Cost Formulas and the DB2 Catalog) → Optimized Access Path]
– Determines database navigation
– Parses SQL statements for tables and columns which must be accessed
– Queries statistics from DB2 Catalog (populated by RUNSTATS utility)
– Determines least expensive access path
– Checks Authorization
The DB2 Optimizer is Cost Based and chooses the least expensive access path
SQL Coding Strategies & Guidelines
– Avoid unnecessary execution of SQL.
– Consider accomplishing as much as possible with a single call, so as to minimize table access as far as possible.
– Limit the data selected (rows and columns) using SQL, and avoid filtering in application programs.
– As far as possible, code predicates on indexable columns.
– Use equivalent data types for comparison. This avoids the data type conversion overhead.
– JOIN tables on indexed columns.
– Avoid Cartesian products.
– The DISTINCT, ORDER BY, GROUP BY, and UNION clauses can involve a SORT operation. Use these clauses only if absolutely necessary.
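As an illustration of limiting rows and columns in SQL rather than in the program, here is a minimal sketch. The table EMP, its columns, and the host variable :HVDEPT are hypothetical names chosen for the example, not taken from the slides.

```sql
-- Hypothetical table EMP(EMPNO, LASTNAME, WORKDEPT, SALARY);
-- :HVDEPT is an illustrative host variable.

-- Avoid: returns every column of every row and leaves the
-- filtering to the application program.
SELECT *
  FROM EMP;

-- Prefer: DB2 filters the rows and returns only the columns needed.
SELECT EMPNO, LASTNAME
  FROM EMP
 WHERE WORKDEPT = :HVDEPT;
```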
Cursor Usage Tips
– Use singleton SELECT statements (SELECT … INTO :<host variables>) if you need to retrieve one row only. This gives far better performance than cursors.
– Cursors should be used when you have more than one row to be retrieved. Cursors have the overhead of OPEN, FETCH & CLOSE.
– To update rows using a cursor, use the FOR UPDATE OF clause.
– Use the FOR FETCH ONLY clause when the cursor is used for data retrieval only. The FOR READ ONLY clause provides the same functionality and is ODBC compliant.
– Use the WITH HOLD clause if you don't want DB2 to automatically close the cursor when the application issues a COMMIT statement.
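The cursor tips above might look like the following sketch. The table EMP, the cursor names, and the host variables are hypothetical. In a host program each statement would sit between EXEC SQL and the host language's terminator; semicolons are used here only for readability.

```sql
-- Singleton SELECT: exactly one row is expected, so no cursor is needed.
SELECT LASTNAME, SALARY
  INTO :HVNAME, :HVSALARY
  FROM EMP
 WHERE EMPNO = :HVEMPNO;

-- Read-only cursor for multi-row retrieval; WITH HOLD keeps it
-- open across COMMIT.
DECLARE C1 CURSOR WITH HOLD FOR
  SELECT EMPNO, LASTNAME
    FROM EMP
   WHERE WORKDEPT = :HVDEPT
     FOR FETCH ONLY;

-- Updatable cursor: name the columns that will be changed.
DECLARE C2 CURSOR FOR
  SELECT EMPNO, SALARY
    FROM EMP
   WHERE WORKDEPT = :HVDEPT
     FOR UPDATE OF SALARY;

-- After OPEN C2 and a FETCH, update the row the cursor is positioned on.
UPDATE EMP
   SET SALARY = :HVSALARY
 WHERE CURRENT OF C2;
```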
Static vs. Dynamic SQL
– The access paths for dynamic SQL are determined at run time, which results in additional overhead. Also, users need to have direct access to the tables.
– The access paths for static SQL are determined at bind time and reused at run time. Users need only the EXECUTE authority on the plan.
UNION and UNION ALL
– Predicates joined by OR can prevent matching index access and can require Stage 2 processing. Consider rewriting the query as the union of two SELECT statements, making index access possible.
– UNION ALL allows duplicates, and hence does not involve a SORT.
The BETWEEN clause
– BETWEEN is usually more efficient than using <= and >= operators, except when comparing a host variable to two columns:
  • Stage 2: WHERE :hostvar BETWEEN col1 AND col2
  • Stage 1: WHERE col1 <= :hostvar AND col2 >= :hostvar
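A hedged sketch of both rewrites follows. The tables EMP and PRICE_HIST, their columns, and the host variables are hypothetical, and WORKDEPT is assumed to be defined NOT NULL. Whether the rewrite actually yields a better access path should be confirmed with EXPLAIN.

```sql
-- Original: OR across two different columns.
SELECT EMPNO
  FROM EMP
 WHERE WORKDEPT = :HVDEPT
    OR JOB = :HVJOB;

-- Rewrite: each branch has a single predicate that can match an index.
-- The extra <> predicate keeps UNION ALL from returning a row twice
-- (this relies on the assumption that WORKDEPT is NOT NULL).
SELECT EMPNO
  FROM EMP
 WHERE WORKDEPT = :HVDEPT
UNION ALL
SELECT EMPNO
  FROM EMP
 WHERE JOB = :HVJOB
   AND WORKDEPT <> :HVDEPT;

-- Host variable compared to two columns: code two range predicates
-- instead of :HVDATE BETWEEN EFF_FROM AND EFF_TO.
SELECT PRICE
  FROM PRICE_HIST
 WHERE EFF_FROM <= :HVDATE
   AND EFF_TO   >= :HVDATE;
```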
Use IN Instead of LIKE
– If you know that only a certain number of values exist and can be put in a list, use IN or BETWEEN:
  • IN ('Value1', 'Value2', 'Value3')
  • BETWEEN :valuelow AND :valuehigh
– Rather than: LIKE 'Value_'
Use LIKE With Care
– Avoid the % or the _ at the beginning, because it prevents DB2 from using a matching index and may cause a scan
– Use the % or the _ at the end to encourage index usage
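The LIKE and IN guidance above can be sketched as follows. The table EMP, its columns, and the literal values are hypothetical.

```sql
-- Avoid: a leading wildcard prevents matching index access on LASTNAME.
SELECT EMPNO
  FROM EMP
 WHERE LASTNAME LIKE '%SON';

-- Prefer: a trailing wildcard lets DB2 consider a matching index scan.
SELECT EMPNO
  FROM EMP
 WHERE LASTNAME LIKE 'JOHN%';

-- When the possible values are known, list them explicitly
-- instead of coding WORKDEPT LIKE 'A0_'.
SELECT EMPNO
  FROM EMP
 WHERE WORKDEPT IN ('A00', 'A01', 'A02');
```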
Use NOT operator with care
– Predicates formed using NOT (except NOT EXISTS) are Stage 1, but are not indexable.
– For subqueries, when using negation logic:
  • Use NOT EXISTS instead of NOT IN
Code the Most Restrictive Predicate First
– After the indexes, place the predicate that will eliminate the greatest number of rows
Avoid Arithmetic in Predicates
– An index is not used for a column when the column is in an arithmetic expression.
– Used at Stage 1 but not indexable
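A minimal sketch of the negation and arithmetic guidelines follows. The tables DEPT and EMP, their columns, and the host variable are hypothetical. Note that NOT IN and NOT EXISTS return different results when the subquery column can be NULL, so the rewrite assumes WORKDEPT is NOT NULL.

```sql
-- Negation with NOT IN.
SELECT D.DEPTNO
  FROM DEPT D
 WHERE D.DEPTNO NOT IN (SELECT E.WORKDEPT FROM EMP E);

-- Same question with NOT EXISTS (equivalent only if WORKDEPT is NOT NULL).
SELECT D.DEPTNO
  FROM DEPT D
 WHERE NOT EXISTS (SELECT 1
                     FROM EMP E
                    WHERE E.WORKDEPT = D.DEPTNO);

-- Avoid: arithmetic on the column side of the comparison.
SELECT EMPNO
  FROM EMP
 WHERE SALARY * 1.10 > :HVLIMIT;

-- Prefer: leave the column alone and move the arithmetic to the other
-- side (rounding of the divided value may differ slightly).
SELECT EMPNO
  FROM EMP
 WHERE SALARY > :HVLIMIT / 1.10;
```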
Nested loop join is efficient when:
– The outer table is small. Predicates with a small filter factor reduce the number of qualifying rows in the outer table.
– The number of data pages accessed in the inner table is also small.
– A highly clustered index is available on the join columns of the inner table.
This join method is efficient when filtering for both tables (outer and inner) is high. This is the most common join method.
Merge scan join is used when:
– The numbers of qualifying rows in the inner and outer tables are large, and the join predicates also do not provide much filtering.
– The tables are large and have no indexes with matching columns.
Hybrid join is used when:
– A non-clustered index is available on the join column of the inner table, and there are duplicate qualifying rows in the outer table.
Join Types & Join Predicate Considerations
– Provide accurate JOIN predicates
– Avoid JOIN without a predicate (Cartesian Join)
– Join ON indexed columns
– Use Joins over sub-queries
– When the results of a join must be sorted:
  • Limiting the ORDER BY to columns of a single table can sometimes avoid a Sort
  • Specifying columns from multiple tables definitely involves a Sort
– Favor coding LEFT OUTER joins over RIGHT OUTER joins, as DB2 always converts RIGHT joins to LEFT joins before executing them.
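The join considerations above could be applied as in this sketch. The tables EMP and DEPT and their columns are hypothetical, and an index on DEPT.DEPTNO is assumed.

```sql
-- Explicit join predicate on an (assumed) indexed column, coded as a
-- LEFT OUTER JOIN, with the ORDER BY limited to columns of one table.
SELECT E.EMPNO, E.LASTNAME, D.DEPTNAME
  FROM EMP E
  LEFT OUTER JOIN DEPT D
    ON D.DEPTNO = E.WORKDEPT
 ORDER BY E.LASTNAME, E.EMPNO;
```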
Sub-Query Guidelines
– If there are efficient indexes available on the tables in the subquery, then a correlated subquery is likely to be the most efficient kind of subquery.
– If there are no efficient indexes available on the tables in the subquery, then a non-correlated subquery would likely perform better.
– If there are multiple subqueries in any parent query, make sure that the subqueries are ordered in the most efficient manner.
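To make the two subquery styles concrete, here is a sketch of the same question written both ways. The tables EMP and DEPT, their columns, and the host variable :HVLOC are hypothetical. Which form performs better depends on the available indexes, as the guidelines above describe.

```sql
-- Correlated subquery: evaluated per outer row, so it benefits from
-- an efficient index on DEPT (for example on DEPTNO).
SELECT E.EMPNO
  FROM EMP E
 WHERE EXISTS (SELECT 1
                 FROM DEPT D
                WHERE D.DEPTNO   = E.WORKDEPT
                  AND D.LOCATION = :HVLOC);

-- Non-correlated subquery: does not refer to the outer row, so its
-- result can be produced once and then used for the whole outer query.
SELECT E.EMPNO
  FROM EMP E
 WHERE E.WORKDEPT IN (SELECT D.DEPTNO
                        FROM DEPT D
                       WHERE D.LOCATION = :HVLOC);
```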
Techniques for Performance Improvement
Use OPTIMIZE FOR n ROWS
– DB2 assumes that only the stated number of rows will be retrieved by the query when choosing the access path. It is basically like giving a hint to the DB2 Optimizer.
– This does not stop the user from accessing the entire result set.
– This is not useful when DB2 has to gather the whole result set before returning the first n rows.
– With this clause, DB2 optimizes the query for quicker response.
Updating catalog tables
– If RUNSTATS is costly or cannot be executed, the catalog tables can be updated manually.
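A minimal sketch of the row-count hint described above, which DB2 SQL spells OPTIMIZE FOR n ROWS. The table EMP, its columns, and the host variable are hypothetical.

```sql
-- Tell the optimizer that the application expects to fetch about
-- 10 rows; the full result set can still be fetched if needed.
SELECT EMPNO, LASTNAME, SALARY
  FROM EMP
 WHERE WORKDEPT = :HVDEPT
 OPTIMIZE FOR 10 ROWS;
```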
Enhanced Techniques for Performance Improvement
Influencing access path – Add extra predicate
– DB2 evaluates the access path based on information available in the catalog tables.
– Wrong or unavailable catalog information may result in selection of a wrong access path.
– A wrong access path could result from a wrong index selection, or from selecting an index where a tablespace scan would be more effective.
– Code extra predicates or change the predicate to make DB2 use a different access path.
– Adding an extra predicate may also influence the selection of the join method.
– If you have an extra predicate, nested loop join may be selected, because DB2 assumes that more rows will be filtered out. The proper type of predicate to add is WHERE T1.C1 = T1.C1
– Hybrid join is a costlier method. Outer joins do not use hybrid join. So if hybrid join is used by DB2, convert the inner join to an outer join and add extra predicates to remove unneeded rows.
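The extra-predicate technique might be coded as in the sketch below. T1 and C1 follow the slide's example; T2, the columns C2 and C3, and the host variable are hypothetical. Any change in access path should be verified with EXPLAIN.

```sql
SELECT T1.C1, T1.C2, T2.C3
  FROM T1, T2
 WHERE T1.C1 = T2.C1
   AND T1.C2 = :HV
   -- Extra predicate: it does not change the result here (the join
   -- predicate already excludes NULL values of T1.C1), but it lowers
   -- the optimizer's estimate of the rows that qualify from T1.
   AND T1.C1 = T1.C1;
```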
General recommendations
Make sure that:
– The queries are as simple as possible
– Unused rows are not fetched. Filtering is to be done by DB2, not in the application program.
– Unused columns are not selected
– There is no unnecessary ORDER BY or GROUP BY clause
– Page-level locking is used, and lock duration is minimized
– Mass updates are avoided
– Indexable predicates are used wherever possible
– Redundant predicates are not coded
– The declared length of the host variable is not greater than the length attribute of the data column
– If there are efficient indexes available on the tables in the subquery, a correlated subquery will perform better. Otherwise a non-correlated subquery will perform better.
– If there are multiple subqueries, they are ordered in an efficient manner
Summary
Filter Factor (FF)
Optimizer assigns a "Filter Factor" (FF) to each predicate or predicate combination
– Number between 0 and 1 that provides the estimated filtering percentage
  FF of 0.25 means 25% of the rows are estimated to qualify
– Calculated using available statistics from catalog tables
  • Column cardinality (COLCARDF)
  • HIGH2KEY/LOW2KEY
  • Frequency statistics (FREQUENCYF in SYSCOLDIST)
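As a rough illustration of how column cardinality feeds the estimate: for an equality predicate on a uniformly distributed column, the filter factor is commonly approximated as 1 / COLCARDF. The query below is a sketch under that assumption. The schema and table names are hypothetical, and DB2 may use frequency statistics instead when they exist.

```sql
-- Approximate filter factor of COL = value for each column of a table,
-- assuming uniform distribution (COLCARDF is -1 when no statistics
-- have been gathered, hence the > 0 test).
SELECT NAME,
       COLCARDF,
       1 / COLCARDF AS EST_FF_EQUAL
  FROM SYSIBM.SYSCOLUMNS
 WHERE TBCREATOR = 'MYSCHEMA'
   AND TBNAME    = 'EMP'
   AND COLCARDF  > 0;
```

For example, a column with COLCARDF = 4 would give an estimate of 0.25, meaning 25% of the rows are expected to qualify.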
RUNSTATS
RUNSTATS is a DB2 utility which provides catalog statistics used by the optimizer and statistics related to the organization of an object (TS / TB / IX / CO)
Accurate Statistics are a critical factor for performance of the SQL.
Updates the DB2 catalog and reports the statistics.
Some catalog statistics updated by RUNSTATS for use by the optimizer can be manually updated with appropriate authorization (SYSADM).
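A manual statistics update might look like the sketch below. The schema, table, and column names and all the values are hypothetical and would need to reflect the real data. Static SQL only picks up changed statistics at the next BIND or REBIND.

```sql
-- Table-level statistics: row count and number of pages.
UPDATE SYSIBM.SYSTABLES
   SET CARDF  = 1000000,
       NPAGES = 50000
 WHERE CREATOR = 'MYSCHEMA'
   AND NAME    = 'EMP';

-- Column-level statistics: number of distinct values.
UPDATE SYSIBM.SYSCOLUMNS
   SET COLCARDF = 500
 WHERE TBCREATOR = 'MYSCHEMA'
   AND TBNAME    = 'EMP'
   AND NAME      = 'WORKDEPT';
```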
Stats Used for Access Path Determination
SYSCOLDIST
– COLVALUE
– FREQUENCYF
– TYPE
– CARDF
– COLGROUPCOLNO
– NUMCOLUMNS
SYSCOLUMNS
– COLCARDF
– HIGH2KEY
– LOW2KEY
SYSINDEXES
– CLUSTERING
– CLUSTERRATIOF
– FIRSTKEYCARDF
– FULLKEYCARDF
– NLEAF
– NLEVELS
SYSINDEXPART
– LIMITKEY
SYSTABLES
– CARDF
– EDPROC
– NPAGES
– PCTROWCOMP
Stage 1 vs. Stage 2 Predicates
Stage 1 predicates may use an available Index. Stage 2 predicates cannot use any Index.
Wherever possible, prefer to use Stage 1 (Sargable) predicates in the where clause. These are conditions that can be evaluated in the Data Manager of DB2, before the results are passed to Relational Data System (RDS). The more conditions that can be evaluated early on, the more efficient data retrieval is.
Stage 1 - Refers to DM (Data Manager)
– A suitable index must exist!
– Reduces I/O from disk and buffer pool activity
Stage 2 - Refers to RDS (Relational Data System)
How does the optimizer calculate Filter Factors?
The lower the filter factor, the lower the cost and, in general, the more efficient the query will be.
Explain
– A tool that shows the access path used by a query.
– Results of Explain are stored in the table PLAN_TABLE.
– Explain can be run for a query outside a program or for all queries in a program.
– For all queries in a program: use the EXPLAIN(YES) parameter during BIND.
Sample Explain Table Output
Explain can be run at bind time using parm value of EXPLAIN(YES)
A PLAN_TABLE must previously exist based on OWNER parm value on BIND or current SQLID for dynamic SQL
Explain can also be run against dynamic SQL:
DELETE FROM PLAN_TABLE WHERE QUERYNO = 999;
EXPLAIN PLAN SET QUERYNO = 999 FOR <SELECT STATEMENT GOES HERE - USE ? IN PLACE OF HOST VARIABLES>;
SELECT * FROM PLAN_TABLE WHERE QUERYNO = 999 ORDER BY QBLOCKNO, PLANNO;
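Filled in with a concrete statement, the sequence above could look like this sketch. The table EMP and its columns are hypothetical; the PLAN_TABLE columns selected are the ones discussed in the rest of this deck.

```sql
-- Clear any earlier rows for this query number.
DELETE FROM PLAN_TABLE WHERE QUERYNO = 999;

-- Explain the statement, with ? in place of the host variable.
EXPLAIN PLAN SET QUERYNO = 999 FOR
  SELECT EMPNO, LASTNAME
    FROM EMP
   WHERE WORKDEPT = ?;

-- Read back the access path, one row per step.
SELECT QBLOCKNO, PLANNO, METHOD, TNAME,
       ACCESSTYPE, MATCHCOLS, ACCESSNAME, INDEXONLY, PREFETCH
  FROM PLAN_TABLE
 WHERE QUERYNO = 999
 ORDER BY QBLOCKNO, PLANNO;
```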
Don’t forget to Explain everything
Plan_Table is where all the tuning starts
Interpreting the Plan Table/Analyzing Access Paths
Non-Matching Index Scan (ACCESSTYPE = I and MATCHCOLS = 0)
– Scans all leaf pages of the index selected by the optimizer, selecting one or more qualifying rows.
– Scan can be with or without data access.
– Predicate does not match the leading columns of the index.
SELECT COUNT(*) FROM TABLEA
SELECT MAX(COL1) FROM TABLEA
SELECT COL1 FROM TABLEA WHERE COL2 = :HV
[Diagram: Non-Matching Index Scan. Root Page; Non-Leaf Pages 1-2; Leaf Pages 1-4]
Matching Index scan (MATCHCOLS > 0)
– Scans one or more leaf pages of the index selected by the optimizer, selecting one or more qualifying rows.
– Index match is based on one or more key columns of the selected index.
– Scan can be with or without data access.
– Predicates match the leading columns of the index.
SELECT COL1 FROM TABLEA WHERE COL2 = :HV
SELECT COL2 FROM TABLEA WHERE COL1 = :HV (host variable length longer than COL1)
[Diagram: Matching Index Scan. Root Page; Non-Leaf Pages 1-2; Leaf Pages 1-4; Data Pages]
One Fetch Index Access (ACCESSTYPE = I1)
In certain circumstances this can be the most efficient access path in DB2. It may need to access only one leaf page, but may still need to traverse the index tree path.
Requires only one row be retrieved ( Min or Max column function)
SELECT MIN(COL1) FROM TABLEA
SELECT MIN(COL2) FROM TABLEA WHERE COL1 = :HV (will still be I1 BUT with matchcols = 1)
IN List Index Scan (ACCESSTYPE = N)
– Scans one or more leaf pages of the index selected by the optimizer, selecting one or more qualifying rows.
– Index match is based on one or more key columns of the selected index.
– At least one key column incorporates an IN list.
SELECT * FROM TABLEA WHERE COL1 = :HV AND COL2 IN ('A','B','C')
SELECT COL3 FROM TABLEA WHERE COL1 IN ('12345','56789') AND COL2 = :HV
Table-space scan (ACCESSTYPE = R)
– A scan against a partitioned tablespace, or a simple tablespace with one table, scans all pages, including pages which are empty or contain only deleted rows.
– A scan against a simple tablespace containing more than one table also scans tables within that tablespace that are not necessarily included in the query.
– A scan against a segmented tablespace includes only pages containing data.
SELECT * FROM TABLEA
SELECT * FROM TABLEA WHERE COL6 = 0
SELECT * FROM TABLEA WHERE COL1 <> :HV
[Diagram: Tablespace Scan. Data Page 1, Data Page 2, Data Page 3, Data Page 4]
DB2 I/O Assisted Mechanisms
Prefetch: Reads data ahead in anticipation of its use. Prefetch can read up to 32 4K pages for applications, and up to 64 4K pages for utilities.
Sequential Prefetch: In DB2 UDB for OS/390, a mechanism that triggers consecutive asynchronous I/O operations. Pages are fetched before they are required, and several pages are read with a single I/O operation. This action is determined at bind time and can be detected by a value of "S" in the PREFETCH column of the plan table. If both index and data are required for the SQL, prefetch can occur for both object types.
Dynamic Prefetch: Uses the same approach as sequential prefetch, but the mechanism is triggered at run time if DB2 detects that access to the index and/or data pages is sequential in nature but the pages are distributed in a nonconsecutive manner.
List Prefetch: An access method that takes advantage of prefetching even in queries that do not access data sequentially. This is done by scanning the index and collecting RIDs in advance of accessing any data pages. These RIDs are then sorted in page number order, and the data is prefetched using this list.
DB2 Explain Columns
QUERY Number (QUERYNO) –
Identifies the SQL statement in the PLAN_TABLE (any number you assign - the example uses the numeric part of the userid)
BLOCK (QBLOCKNO) –
Query block within the query number, where 1 is the top-level SELECT. Subselects, unions, materialized views, and nested table expressions will show multiple query blocks. Each QBLOCK has its own access path.
PLAN (PLANNO) –
Indicates the order in which the tables will be accessed
METHOD – Shows which JOIN technique was used:
00- First table accessed, continuation of previous table accessed, or not used.
01- Nested Loop Join. For each row of the present composite table, matching rows of a new table are found and joined
02- Merge Scan Join. The present composite table and the new table are scanned in the order of the join columns, and matching rows are joined.
03- Sorts needed by ORDER BY, GROUP BY, SELECT DISTINCT, UNION, a quantified predicate, or an IN predicate. This step does not access a new table.
04- Hybrid Join. The current composite table is scanned in the order of the join-column rows of the new table. The new table accessed using list prefetch.
TNAME –
name of the table whose access this row refers to. Either a table in the FROM clause, or a materialized VIEW name.
TYPE (ACCESS TYPE) –
indicates whether an index was chosen:
I = INDEX
R = TABLESPACE SCAN (reads every data page of the table once)
I1 = ONE-FETCH INDEX SCAN
N = INDEX USING IN LIST
M = MULTIPLE INDEX SCAN
MX = NAMES ONE OF INDEXES USED
MI = INTERSECT MULT. INDEXES
MU = UNION MULT. INDEXES
MC (MATCHCOLS) - number of matching columns for a matching index scan
ANAME (ACCESS NAME) - name of index
IO (INDEX ONLY) - Y = index alone satisfies data request, N = table must be accessed also
8 sort flags in two groups: each group has four indicators showing why the sort is necessary. Usually, a sort will cause the statement to run longer.
UNIQ - DISTINCT option or UNION was part of the query, or IN list for subselect
JOIN - sort for join
ORDERBY - ORDER BY option was part of the query
GROUPBY - GROUP BY option was part of the query
Sort flags for 'new' (inner) tables:
SNU - SORTN_UNIQ - Y = remove duplicates, N = no sort
SNJ - SORTN_JOIN - Y = sort table for join, N = no sort
SNO - SORTN_ORDERBY - Y = sort for order by, N = no sort
SNG - SORTN_GROUPBY - Y = sort for group by, N = no sort
Sort flags for 'composite' (outer) tables:
SCU - SORTC_UNIQ - Y = remove duplicates, N = no sort
SCJ - SORTC_JOIN - Y = sort table for join, N = no sort
SCO - SORTC_ORDERBY - Y = sort for order by, N = no sort
SCG - SORTC_GROUPBY - Y = sort for group by, N = no sort
PF - PREFETCH - Indicates whether data pages were read in advance by prefetch.
S = pure sequential PREFETCH
L = PREFETCH through a RID list
Blank = unknown, or not applicable
MIXOPSEQ: The sequence number of a step in a multiple index operation.
PAGE_RANGE: Whether the table qualifies for page range screening, so that plans scan only the partitions that are needed. Y = Yes; blank = No
COLUMN_FN_EVAL: When an SQL aggregate function is evaluated. R = while the data is being read from the table or index; S = while performing a sort to satisfy a GROUP BY clause; blank = after data retrieval and after any sorts.
QBLOCK_TYPE: For each query block, an indication of the type of SQL operation performed.
JOIN_TYPE: The type of join:
F = FULL OUTER JOIN
L = LEFT OUTER JOIN
S = STAR JOIN
blank = INNER JOIN or no join
RIGHT OUTER JOIN converts to a LEFT OUTER JOIN when you use it, so that JOIN_TYPE contains L.
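Once statements have been explained, the columns described above can be used to pick out steps worth a closer look. The query below is one possible sketch; which conditions count as suspicious is an assumption and depends on the application.

```sql
-- List plan steps that use a tablespace scan, a non-matching index
-- scan, or a sort of the new or composite table.
SELECT QUERYNO, QBLOCKNO, PLANNO, METHOD, TNAME,
       ACCESSTYPE, MATCHCOLS, ACCESSNAME, PREFETCH
  FROM PLAN_TABLE
 WHERE ACCESSTYPE = 'R'
    OR (ACCESSTYPE = 'I' AND MATCHCOLS = 0)
    OR SORTN_JOIN    = 'Y'
    OR SORTC_JOIN    = 'Y'
    OR SORTC_ORDERBY = 'Y'
    OR SORTC_GROUPBY = 'Y'
    OR SORTC_UNIQ    = 'Y'
 ORDER BY QUERYNO, QBLOCKNO, PLANNO;
```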
EXPLAIN Statements with examples.doc
Performance Tools Overview
BMC APPTUNE
BMC SQL EXPLORER
BMC APPTUNE
– Use Option 4 - Performance Products
– Use Option Q - Apptune and Index components
– Option 1 - SQL Workload
Setting Options in BMC APPTUNE
– Use Workload Analysis
– Choose 6. Data source and 5. Time interval
Viewing Reports in APPTUNE
– Use various options to generate reports
– Reports generated for programs
Viewing SQLs in APPTUNE
– Use Option S - To show SQLs
– Use Option X - To EXPLAIN SQLs
Example of EXPLAIN Result in BMC APPTUNE
– Cost calculated by Optimizer
– Matching Index scan performed
– Matching Columns used by index
– Table & Index names used by access path
BMC SQL EXPLORER
– Use Option S - SQL Explorer
– Use Option 1 - Explain
Setting Options in BMC SQL EXPLORER
– Plans, Packages, or DBRMs can be analyzed
– Package options
– Analysis run in Batch Mode
More references
\BMC SQL EXPLORER.doct
steps to get to Apptune.doc
Run-through of an Actual SQL Tuning Exercise
Example of the SQL Tuning Process - Development
Set up Development Environment
– Use Option 7 - Migrate Access Path Statistics
Step 1.3: Import Statistics From Production to Development
Step 2: Identification of Problem SQL – Identify problem SQL
– SQL statement being analysed.
– Tool warns that cardinality is missing.
– Predicate mismatch is also detected.
Step 2: Identification of Problem SQL – Check SQL Best Practices
– No tool is available for checking Best Practices. This needs to be manually checked using the SQL Best Practices document already published.
– A snippet of the related Best Practice from the SQL Guidelines document.
Step 3: SQL Optimization – SQL Rewrite
– No tool is available to automatically rewrite SQL statements. The SQL needs to be rewritten manually, and the subsequent steps for checking the new access path then need to be performed.
Step 3: SQL Optimization – Compare Access paths
– Access paths can be compared.
– Notice the change in the estimated indicative cost. A different index is being used now.
Bibliography
Redbooks at www.redbooks.ibm.com
DB2 UDB for z/OS V8 Everything you ever wanted to know… SG24-6079
DB2 UDB for z/OS V8 Performance Topics SG24-6465
DB2 for z/OS Application Design for High Performance and Availability SG24-7134 10/05
DB2 UDB for Z/OS V8 Application Programming and SQL Guide
SQL Tuning Best Practices & Guidelines Document
In the IM Project & Document Database Process Document section
1) Database 'IM Project and Document Database'
2) Select the ‘Process Document’ Section
3) Select ‘By Process Category’
4) Select ‘Best Practices’
5) View ‘Table of Contents’
6) Select document 'Database Access - SQL Tuning Best Practice & Guidelines’