Top tips for Oracle SQL tuning
Guy Harrison, Senior Software Architect, Quest Software
BUY QUEST PRODUCTS - BUY GUY'S BOOK - BUY QUEST PRODUCTS



  • Top 10 Oracle SQL tuning tips

    1. Design and develop with performance in mind
    2. Establish a tuning environment
    3. Index wisely
    4. Reduce parsing
    5. Take advantage of the Cost Based Optimizer
    6. Avoid accidental table scans
    7. Optimize necessary table scans
    8. Optimize joins
    9. Use array processing
    10. Consider PL/SQL for tricky SQL

  • Hint #1: Design and develop with performance in mind

    Explicitly identify performance targets
    Focus on critical transactions
    Test the SQL for these transactions against simulations of production data
    Measure performance as early as possible
    Consider prototyping critical portions of the application
    Consider de-normalization and other performance-by-design features early on

  • Hint #2: Establish a tuning and development environment

    A significant portion of SQL that performs poorly in production was originally crafted against empty or nearly empty tables.

    Make sure you establish a reasonable sub-set of production data for use during development and tuning of SQL.

    Make sure your developers understand EXPLAIN PLAN and tkprof, or equip them with commercial tuning tools.

  • Understanding SQL tuning tools

    The foundation tools for SQL tuning are:
    - The EXPLAIN PLAN command
    - The SQL Trace facility
    - The tkprof trace file formatter

    Effective SQL tuning requires either familiarity with these tools or the use of commercial alternatives such as SQLab.

  • EXPLAIN PLAN

    The EXPLAIN PLAN command reveals the execution plan for an SQL statement.

    The execution plan is the exact sequence of steps that the Oracle optimizer has chosen to employ to process the SQL.

    The execution plan is stored in an Oracle table called the plan table.

    Suitably formatted queries can be used to extract the execution plan from the plan table.

  • A simple EXPLAIN PLAN

    SQL> EXPLAIN PLAN FOR
      2  select count(*) from sales where product_id=1;
    Explained.

    SQL> SELECT RTRIM (LPAD (' ', 2 * LEVEL) || RTRIM (operation) || ' ' ||
      2         RTRIM (options) || ' ' || object_name) query_plan
      3    FROM plan_table
      4  CONNECT BY PRIOR id = parent_id
      5* START WITH id = 0

    QUERY_PLAN
    --------------------------------------------
    SELECT STATEMENT
      SORT AGGREGATE
        TABLE ACCESS FULL SALES

  • Interpreting EXPLAIN PLAN

    The more heavily indented an access path is, the earlier it is executed.

    If two steps are indented at the same level, the uppermost step is executed first.

    Some access paths are joined, such as an index access that is followed by a table lookup.

  • A more complex EXPLAIN PLAN

    SELECT STATEMENT
      VIEW SYS_DBA_SEGS
        UNION-ALL
          NESTED LOOPS
            NESTED LOOPS
              NESTED LOOPS
                NESTED LOOPS
                  NESTED LOOPS
                    VIEW SYS_OBJECTS
                      UNION-ALL
                        TABLE ACCESS FULL TAB$
                        TABLE ACCESS FULL TABPART$
                        TABLE ACCESS FULL CLU$
                        TABLE ACCESS FULL IND$
                        TABLE ACCESS FULL INDPART$
                        TABLE ACCESS FULL LOB$
                        TABLE ACCESS FULL TABSUBPART$
                        TABLE ACCESS FULL INDSUBPART$
                        TABLE ACCESS FULL LOBFRAG$
                    TABLE ACCESS BY INDEX ROWID OBJ$
                      INDEX UNIQUE SCAN I_OBJ1
                  TABLE ACCESS CLUSTER SEG$
                    INDEX UNIQUE SCAN I_FILE#_BLOCK#
                TABLE ACCESS BY INDEX ROWID FILE$
                  INDEX UNIQUE SCAN I_FILE2
              TABLE ACCESS CLUSTER TS$
                INDEX UNIQUE SCAN I_TS#
            TABLE ACCESS CLUSTER USER$
              INDEX UNIQUE SCAN I_USER#
          NESTED LOOPS
            NESTED LOOPS
              NESTED LOOPS
                NESTED LOOPS
                  TABLE ACCESS FULL UNDO$
                  TABLE ACCESS BY INDEX ROWID FILE$
                    INDEX UNIQUE SCAN I_FILE2
                TABLE ACCESS CLUSTER SEG$
                  INDEX UNIQUE SCAN I_FILE#_BLOCK#
              TABLE ACCESS CLUSTER TS$
                INDEX UNIQUE SCAN I_TS#
            TABLE ACCESS CLUSTER USER$
              INDEX UNIQUE SCAN I_USER#
          NESTED LOOPS
            NESTED LOOPS
              NESTED LOOPS
                TABLE ACCESS FULL FILE$
                TABLE ACCESS CLUSTER SEG$
                  INDEX RANGE SCAN I_FILE#_BLOCK#
              TABLE ACCESS CLUSTER TS$
                INDEX UNIQUE SCAN I_TS#
            TABLE ACCESS CLUSTER USER$
              INDEX UNIQUE SCAN I_USER#

  • SQL_TRACE and tkprof

    ALTER SESSION SET SQL_TRACE = TRUE causes a trace of SQL execution to be generated.

    The tkprof utility formats the resulting output.

    tkprof output contains a breakdown of execution statistics, the execution plan, and the rows returned for each step. These statistics are not available from any other source.

    tkprof is the most powerful tool, but requires a significant learning curve.

  • Tkprof output

    call      count    cpu  elapsed   disk    query  current   rows
    -------  ------ ------ -------- ------ -------- -------- ------
    Parse         1   0.02     0.01      0        0        0      0
    Execute       1   0.00     0.00      0        0        0      0
    Fetch        20 141.10   141.65   1237  1450011   386332     99
    -------  ------ ------ -------- ------ -------- -------- ------
    total        22 141.12   141.66   1237  1450011   386332     99

    Rows     Execution Plan
    -------  ---------------------------------------------------
          0  SELECT STATEMENT GOAL: CHOOSE
         99   FILTER
      96681    TABLE ACCESS GOAL: ANALYZED (FULL) OF 'CUSTOMERS'
      96582    TABLE ACCESS GOAL: ANALYZED (FULL) OF 'EMPLOYEES'

  • Using SQLab

    Because EXPLAIN PLAN and tkprof are unwieldy and hard to interpret, third-party tools that automate the process and provide expert advice improve SQL tuning efficiency.

    The Quest SQLab product:
    - Identifies SQL in your database that could benefit from tuning
    - Provides a sophisticated tuning environment to examine, compare and evaluate execution plans
    - Incorporates an expert system to advise on indexing and SQL statement changes

  • SQLab SQL tuning lab
    - Displays execution plans in a variety of intuitive ways
    - Provides easy access to statistics and other useful data
    - Models changes to SQL and immediately shows the results

  • SQLab Expert Advice

    SQLab provides specific advice on how to tune an SQL statement.

  • SQLab SQL trace integration

    SQLab can also retrieve the execution statistics that are otherwise only available through tkprof

  • Hint #3: Index wisely

    Index to support selective WHERE clauses and join conditions

    Use concatenated indexes where appropriate

    Consider overindexing to avoid table lookups

    Consider advanced indexing options:
    - Hash clusters
    - Bit-mapped indexes
    - Index-only tables

  • Effect of adding columns to a concatenated index

    Novice SQL programmers are often satisfied if the execution plan shows an index being used.
    Make sure the index has all the columns required.
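    The overindexing idea can be sketched as follows. The EMPLOYEES columns and the index name are illustrative, not from the deck:

    ```sql
    -- EMPLOYEE_ID is appended purely so that queries selecting it can be
    -- satisfied from the index alone, with no table lookup (overindexing).
    CREATE INDEX emp_name_ix ON employees (surname, firstname, employee_id);

    -- All referenced columns are in the index, so the table is never touched:
    SELECT employee_id
      FROM employees
     WHERE surname = 'SMITH' AND firstname = 'JOHN';
    ```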

  • Bit-map indexes

    Contrary to widespread belief, bit-map indexes can be effective even when there are many distinct column values.
    They are not suitable for OLTP, however.

  • Hint #4: Reduce parsing

    Use bind variables
    - Bind variables are key to application scalability
    - If necessary, in 8.1.6+ set CURSOR_SHARING to FORCE

    Reuse cursors in your application code
    - How to do this depends on your development language

    Use a cursor cache
    - Setting SESSION_CACHED_CURSORS (to 20 or so) can help applications that are not re-using cursors
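    A minimal sketch of the bind-variable and cursor-cache points; the table name and bind name are illustrative:

    ```sql
    -- Literal values make every statement textually unique, forcing a
    -- fresh parse each time:
    SELECT * FROM customers WHERE customer_id = 1234;

    -- A bind variable keeps the SQL text constant, so Oracle finds the
    -- already-parsed statement in the shared pool:
    SELECT * FROM customers WHERE customer_id = :cust_id;

    -- Per-session cursor cache for applications that do not re-use
    -- cursors (can also be set via the init.ora parameter):
    ALTER SESSION SET SESSION_CACHED_CURSORS = 20;
    ```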

  • Hint #5: Take advantage of the Cost Based Optimizer

    The older rule based optimizer is inferior in almost every respect to the modern cost based optimizer.

    Using the cost based optimizer effectively involves:
    - Regular collection of table statistics using the ANALYZE command or DBMS_STATS package
    - Understanding hints and how they can be used to influence SQL statement execution
    - Choosing the appropriate optimizer mode: FIRST_ROWS is best for OLTP applications; ALL_ROWS suits reporting and OLAP jobs
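    The three points above, as a sketch against the SALES table used elsewhere in the deck:

    ```sql
    -- Collect statistics so the cost based optimizer has data to work with:
    ANALYZE TABLE sales COMPUTE STATISTICS;

    -- Favor quick return of the first rows (OLTP-style) for this session:
    ALTER SESSION SET OPTIMIZER_GOAL = FIRST_ROWS;

    -- Or steer a single statement with a hint:
    SELECT /*+ FIRST_ROWS */ * FROM sales WHERE product_id = 1;
    ```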

  • Hint #6: Avoid accidental table scans

    Table scans that occur unintentionally are a major source of poorly performing SQL. Causes (and remedies) include:
    - Missing index
    - Using != or NOT: use inclusive range conditions or IN lists instead
    - Looking for values that are NULL: use NOT NULL columns with a default value
    - Using functions on indexed columns: use functional indexes in Oracle8i
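    The remedies above can be sketched as follows. The STATUS column and the index name are illustrative assumptions:

    ```sql
    -- A != condition cannot use an index...
    SELECT * FROM customers WHERE status != 'CLOSED';
    -- ...but an IN list (or a range condition) can:
    SELECT * FROM customers WHERE status IN ('OPEN', 'HOLD');

    -- A function on the indexed column disables the index...
    SELECT * FROM customers WHERE SUBSTR(surname, 1, 4) = 'SMIT';
    -- ...so manipulate the search value instead:
    SELECT * FROM customers WHERE surname LIKE 'SMIT%';

    -- Oracle8i only: index the function itself:
    CREATE INDEX cust_surname_upper_ix ON customers (UPPER(surname));
    ```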

  • Hint #7: Optimize necessary table scans

    There are many occasions where a table scan is the only option. If so:
    - Consider the parallel query option
    - Try to reduce the size of the table
      - Adjust PCTFREE and PCTUSED
      - Relocate infrequently used long columns or BLOBs
      - Rebuild when necessary to reduce the high water mark
    - Improve the caching of the table
      - Use the CACHE hint or table property
      - Implement KEEP and RECYCLE pools
    - Partition the table (if you really seek a large subset of data)
    - Consider the fast full index scan

  • Fast full index scan performance

    Use when you must read every row, but not every column.
    Counting the rows in a table is a perfect example.
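    A sketch of the row-counting case. The alias-based INDEX_FFS hint requests the fast full scan; the index name is an assumption:

    ```sql
    -- The primary-key index is far smaller than the table, so scanning it
    -- is cheaper than a full table scan when only COUNT(*) is needed:
    SELECT /*+ INDEX_FFS(c cust_pk_ix) */ COUNT(*)
      FROM customers c;
    ```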

  • Hint #8: Optimize joins

    Pick the best join method
    - Nested loops joins are best for indexed joins of subsets
    - Hash joins are usually the best choice for big joins

    Pick the best join order
    - Pick the best driving table
    - Eliminate rows as early as possible in the join order

    Optimize special joins when appropriate
    - STAR joins for data-warehousing applications
    - STAR_TRANSFORMATION if you have bitmap indexes
    - ANTI-JOIN methods for NOT IN sub-queries
    - SEMI-JOIN methods for EXISTS sub-queries
    - Properly index CONNECT BY hierarchical queries
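    Join method and order can both be steered with hints. A sketch using the deck's CUSTOMERS/EMPLOYEES tables (the join itself is illustrative):

    ```sql
    -- ORDERED: join in FROM-clause order, so CUSTOMERS drives the join;
    -- USE_HASH(e): use a hash join when joining to EMPLOYEES.
    SELECT /*+ ORDERED USE_HASH(e) */
           c.contact_surname, e.surname
      FROM customers c, employees e
     WHERE e.surname = c.contact_surname;
    ```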

  • Oracle 8 semi-joins

    Optimizes queries using EXISTS where there is no supporting index:

    select *
      from customers c
     where exists (select 1
                     from employees e
                    where e.surname       = c.contact_surname
                      and e.firstname     = c.contact_firstname
                      and e.date_of_birth = c.date_of_birth)

    (No index on employees)

  • Oracle 8 semi-joins

    Without the semi-join or a supporting index, queries like the one on the preceding slide perform very badly:
    - Oracle performs a table scan of the inner table for each row retrieved from the outer table
    - If customers has 100,000 rows and employees 800 rows, then 80 MILLION rows will be processed!

    In Oracle7, you should create the index or use an IN-based subquery.
    In Oracle8, the semi-join facility allows the query to be resolved by a sort-merge or hash join.

  • To use semi-joins

    Set ALWAYS_SEMI_JOIN = HASH or MERGE in INIT.ORA, or
    use a MERGE_SJ or HASH_SJ hint in the subquery of the SQL statement:

    SELECT *
      FROM customers c
     WHERE EXISTS (SELECT /*+ MERGE_SJ */ 1
                     FROM employees e
                    WHERE ...)

  • Oracle8 semi-joins

    The performance improvements are impressive (note the logarithmic scale).

  • Star Join improvements

    A STAR join involves a large FACT table being joined to a number of smaller dimension tables.

  • Star Join improvements

    The Oracle7 star join algorithm works well when there is a concatenated index on all the FACT table columns.
    But when there are a large number of dimensions, creating concatenated indexes for all possible queries is impossible.
    Oracle8's star transformation involves re-wording the query so that it can be supported by combinations of bitmap indexes.
    Since bitmap indexes can be efficiently combined, a single bitmap index on each column can support all possible queries.

  • To enable the star transformation

    - Create bitmap indexes on each of the FACT table columns which are used in star queries
    - Make sure that STAR_TRANSFORMATION_ENABLED is TRUE, either by changing init.ora or using an ALTER SESSION statement
    - Use the STAR_TRANSFORMATION hint if necessary
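    The three steps above, sketched against an assumed SALES fact table with PRODUCTS as one dimension (index names and the AMOUNT column are illustrative):

    ```sql
    -- One bitmap index per dimension-key column of the fact table:
    CREATE BITMAP INDEX sales_product_bmx  ON sales (product_id);
    CREATE BITMAP INDEX sales_customer_bmx ON sales (customer_id);

    -- Enable the transformation for this session:
    ALTER SESSION SET STAR_TRANSFORMATION_ENABLED = TRUE;

    -- Request it for a single statement if the optimizer declines:
    SELECT /*+ STAR_TRANSFORMATION */ p.product_name, SUM(s.amount)
      FROM sales s, products p
     WHERE s.product_id = p.product_id
     GROUP BY p.product_name;
    ```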

  • Drawback of Star transformation

    Bitmap indexes reduce concurrency (row-level locking may break down).

    But remember that a large number of distinct column values may not matter.

  • Star transformation performanceWhen there is no suitable concatenated index, the Star transformation results in a significant improvement

  • Hint #9: Use ARRAY processing

    Retrieve or insert rows in batches, rather than one at a time.
    Methods of doing this are language specific.
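    In PL/SQL (Oracle8i), array fetches are done with BULK COLLECT. A sketch against the deck's SALES table; the AMOUNT column is an assumption:

    ```sql
    DECLARE
       -- Collection type to receive a whole batch of values at once
       TYPE amount_tab IS TABLE OF sales.amount%TYPE;
       l_amounts amount_tab;
    BEGIN
       -- One fetch operation retrieves every matching row, instead of
       -- one round trip per row:
       SELECT amount BULK COLLECT INTO l_amounts
         FROM sales
        WHERE product_id = 1;
    END;
    /
    ```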

  • Hint #10: Consider PL/SQL for tricky SQL

    With SQL you specify the data you want, not how to get it. Sometimes you need to dictate your retrieval algorithm explicitly. For example:
    - Getting the second highest value
    - Doing lookups on a low-high lookup table
    - Correlated updates
    - SQL with multiple complex correlated subqueries
    - SQL that seems too hard to optimize unless it is broken into multiple queries linked in PL/SQL
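    The "second highest value" case can be sketched procedurally. The function name and the SALES.AMOUNT column are illustrative assumptions:

    ```sql
    CREATE OR REPLACE FUNCTION second_highest_sale
       RETURN NUMBER
    IS
       l_rank PLS_INTEGER := 0;
    BEGIN
       -- Walk the amounts in descending order and stop at the second
       -- distinct value, rather than writing a correlated subquery that
       -- re-scans the table:
       FOR r IN (SELECT DISTINCT amount FROM sales ORDER BY amount DESC)
       LOOP
          l_rank := l_rank + 1;
          IF l_rank = 2 THEN
             RETURN r.amount;
          END IF;
       END LOOP;
       RETURN NULL;   -- fewer than two distinct values
    END;
    /
    ```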

  • Oracle8i PL/SQL Improvements

    Array processing

    NOCOPY

    Temporary tables

    The profiler

    Dynamic SQL

  • Bonus hint: When your SQL is tuned, look to your Oracle configuration

    When SQL is inefficient there is limited benefit in investing in Oracle server or operating system tuning.

    However, once SQL is tuned, the limiting factor for performance will be Oracle and operating system configuration.

    In particular, check for internal Oracle contention, which typically shows up as latch contention or unusual wait conditions (buffer busy, free buffer, etc.).

    Third party tools such as Quest's Spotlight on Oracle product can be invaluable.

  • www.quest.com

    All too often, SQL tuning is performed on an application designed and developed with virtually no consideration given to performance requirements. SQL tuning is often the best option for tuning such applications, but it is both more efficient and more effective to design an application with performance in mind. Producing good performance by design requires that the following activities occur within the development life-cycle:

    Explicit specification of the performance characteristics of the system at an early stage of development.

    Focus on critical transactions and queries during logical and physical modeling. Simulation of these SQL statements on prototype designs can often reveal key performance weaknesses.

    Effective simulation of the target environment. This can include simulating concurrent user loads and/or acquiring representative data volumes.

    Continual measurement of critical transaction performance. If possible, this should be integrated with other Quality Assurance measures.

    It's not uncommon for SQL that seemed to be working well in the development environment to exhibit poor performance once it is released to a production system. A primary cause of these unpleasant surprises is an inadequate development or volume-testing environment. In particular, environments without realistic or representative data volumes are bound to lead to unrealistic SQL performance. The ideal tuning or development environment is one in which:

    Data volumes are realistic, or at least proportional. With today's increasingly large production databases it's often not possible to duplicate production data volumes exactly. However, it should always be possible to construct a reasonable sub-set of the data. For instance, a 20% sample of a 1GB database may be adequate for performance tuning. At all costs, avoid the situation in which SQL developers are constructing and testing code against empty or almost empty tables: even the proverbial SQL query from hell will appear efficient in these environments.

    Tuning facilities are available. Supply your volume and development environments with as many tuning tools as you have available. This could involve third-party SQL tuning tools, but will at least involve enabling the default Oracle tools. Make sure developers know how to use EXPLAIN PLAN, SQL trace and tkprof. Make sure that relevant database options are set to enable their effective use (see Chapter 7).

    Documentation is available. This documentation should include the database design, index details, performance requirements and volume metrics. The SQL programmer needs all this information in order to produce efficient SQL.


    Indexes exist primarily to improve the performance of SQL statements. In many cases, establishing the best indexes is the easiest path to high performance.

    Use concatenated indexes. Try not to use two indexes when one would do. If searching for SURNAME and FIRSTNAME, don't necessarily create separate indexes for each column. Instead, create a concatenated index on both SURNAME and FIRSTNAME. You can use the leading portions of a concatenated index on their own, so if you sometimes query on the SURNAME column without supplying the FIRSTNAME, SURNAME should come first in the index.

    Overindex to avoid a table lookup. You can sometimes improve SQL execution by overindexing. Overindexing involves concatenating columns which appear in the SELECT clause, but not in the WHERE clause, to the index. For instance, imagine we are searching on SURNAME and FIRSTNAME in order to find EMPLOYEE_ID. Our concatenated index on SURNAME and FIRSTNAME will allow us to quickly locate the row containing the appropriate EMPLOYEE_ID, but we will need to access both the index and the table. If there is an index on SURNAME, FIRSTNAME and EMPLOYEE_ID, then the query can be satisfied using the index alone. This technique can be particularly useful when optimizing joins, since intermediate tables in a join are sometimes queried merely to obtain the join key for the next table.

    Consider advanced indexing options. Oracle's default B*-tree indexes are flexible and efficient and are suitable for the majority of situations. However, Oracle offers a number of alternative indexing schemes that can improve performance in specific situations.

    Index clusters allow rows from one or more tables to be located in cluster key order. Clustering tables can result in a substantial improvement to join performance. However, table scans of an individual table in the cluster can be severely degraded. Index clusters are usually only recommended for tables which are always accessed together. Even then, alternatives such as denormalization should be considered.

    In hash clusters, the key values are translated mathematically to a hash value. Rows are stored in the hash cluster based on this hash value. Locating a row when the hash key is known may require only a single I/O, rather than the 2-3 I/Os required by an index lookup. However, range scans of the hash key cannot be performed. Furthermore, if the cluster is poorly configured, or if the size of the cluster changes, then overflows on the hash keys can occur or the cluster can become sparsely populated. In the first case, hash key retrieval can degrade; in the second, table scans will be less efficient.

    Bit-mapped indexes suit queries against multiple columns which each have relatively few distinct values. They are more compact than a corresponding concatenated index and, unlike the concatenated index, they can support queries in which the columns appear in any combination. However, bit-mapped indexes are not suitable for tables with high modification rates, since locking of bit-mapped indexes occurs at the block, rather than the row, level.

    In an index-organized table, all table data is stored within a B*-tree index structure. This can improve access to data via the primary key, and reduces the redundancy of storing key values both in the index and in the table. Index-organized tables can be configured with infrequently accessed columns located in an overflow segment. This keeps the index structure relatively small and efficient.

    Make sure your query uses the best index. Novice SQL programmers are often satisfied providing that the execution plan for their SQL statement uses any index. However, there will sometimes be a choice of indexed retrievals, and the Oracle optimizer (especially the older Rule Based Optimizer) will not always choose the best index. Make sure that the indexes being selected by Oracle are the most appropriate, and use hints (discussed below) to change the index if necessary.


    Parsing an SQL statement is the process of validating the SQL and determining the optimal execution plan. For SQL which has low I/O requirements but is frequently executed (for example, the SQL generated by OLTP-type applications), reducing the overhead of SQL parsing is very important. When an Oracle session needs to parse an SQL statement, it first looks for an identical SQL statement in the Oracle shared pool. If a matching statement cannot be found, Oracle will determine the optimal execution plan for the statement and store the parsed representation in the shared pool. The process of parsing SQL is CPU intensive; when I/O is well tuned, the overhead of parsing can be a significant portion of the total overhead of executing the SQL. There are a number of ways of reducing this overhead:

    Use bind variables. Bind variables allow the variable part of a query to be represented by pointers to program variables. If you use bind variables, the text of the SQL statement will not change from execution to execution and Oracle will usually find a match in the shared pool, dramatically reducing parse overhead.

    Re-use cursors. Cursors (or context areas) are areas of memory that store the parsed representation of SQL statements. If you re-execute the same SQL statement more than once, you can re-open an existing cursor and avoid issuing a parse call altogether. The mechanism for re-using cursors varies across development tools and programming languages. Appendix C contains some guidelines for specific tools.

    Use a cursor cache. If your development tool makes it hard or impossible to re-use cursors, you can instruct Oracle to create a cursor cache for each session using the SESSION_CACHED_CURSORS init.ora parameter. If SESSION_CACHED_CURSORS is greater than 0, Oracle will store that number of recently re-executed cursors in a cache. If an SQL statement is re-executed, it may be found in the cache and a parse call avoided.

    The component of the Oracle software that determines the execution plan for an SQL statement is known as the optimizer. Oracle supports two approaches to query optimization. The rule based optimizer determines the execution plan based on a set of rules which rank various access paths. For instance, an index-based retrieval has a lower rank than a full table scan, and so the rule based optimizer will use indexes wherever possible. The cost based optimizer determines the execution plan based on an estimate of the computer resources (the cost) required to satisfy various access methods. The cost based optimizer uses statistics, including the number of rows in a table and the number of distinct values in indexes, to determine this optimum plan.

    Early experiences with the cost based optimizer in Oracle7 were often disappointing and gave the cost based optimizer a poor reputation in some quarters. However, the cost based optimizer has been improving in each release, while the rule based optimizer is virtually unchanged since Oracle 7.0. Many advanced SQL access methods, such as star and hash joins, are only available when you use the cost based optimizer. The cost based optimizer is the best choice for almost all new projects, and converting from rule to cost based optimization will be worthwhile for many existing projects. Consider the following guidelines for getting the most from the cost based optimizer:

    Optimizer mode. The default mode of the cost based optimizer (OPTIMIZER_MODE=CHOOSE) will attempt to optimize the throughput (the time taken to retrieve all rows) of SQL statements and will often favor full table scans over index lookups. When converting to cost based optimization, many users are disappointed to find that previously well-tuned index lookups change to long-running table scans. To avoid this, set OPTIMIZER_MODE=FIRST_ROWS in init.ora or ALTER SESSION SET OPTIMIZER_GOAL=FIRST_ROWS in your code. This instructs the cost based optimizer to minimize the time taken to retrieve the first row in your result set and encourages the use of indexes.

    Hints. No matter how sophisticated the cost based optimizer becomes, there will still be occasions when you need to modify its execution plan. SQL hints are usually the best way of doing this. Using hints, you can instruct the optimizer to pursue your preferred access paths (such as a preferred index), use the parallel query option, select a join order, and so on. Hints are entered as comments following the first word in an SQL statement. The plus sign (+) in the comment lets Oracle know that the comment contains a hint. Hints are fully documented in Appendix A. In the following example, a hint instructs the optimizer to use the CUST_I2 index:

    SELECT /*+ INDEX(CUSTOMERS CUST_I2) */ *
      FROM CUSTOMERS
     WHERE NAME = :cust_name

    Analyze your tables. The Cost Based Optimizer's execution plans are calculated using table statistics collected by the ANALYZE command. Make sure you analyze your tables regularly, that you analyze all your tables, and that you analyze them at peak volumes (for instance, don't analyze a table just before it is about to be loaded by a batch job). For small to medium tables, use ANALYZE TABLE table_name COMPUTE STATISTICS; for larger tables, take a sample, such as ANALYZE TABLE table_name ESTIMATE STATISTICS SAMPLE 20 PERCENT.

    Use histograms. Prior to Oracle 7.3, the Cost Based Optimizer had available the number of distinct values in a column, but not the distribution of data within the column. This meant that it might decline to use an index on a column with only a few values, even if the particular value in question was very rare and would benefit from an index lookup. Histograms, introduced in Oracle 7.3, allow column distribution data to be collected and allow the Cost Based Optimizer to make better decisions. You create histograms with the FOR COLUMNS clause of the ANALYZE command (for instance, ANALYZE TABLE table_name COMPUTE STATISTICS FOR ALL INDEXED COLUMNS). Note that you can't take advantage of histograms if you are using bind variables (which we discussed earlier).

    One of the most fundamental SQL tuning problems is the accidental table scan. Accidental table scans usually occur when the SQL programmer tries to perform a search on an indexed column that can't be supported by an index. This can occur when:

    Using != (not equal to). Even if the not-equals condition satisfies only a small number of rows, Oracle will not use an index to satisfy such a condition. Often, you can re-code these queries using > or IN conditions, which can be supported by index lookups.

    Searching for NULLs. Oracle won't use an index to find null values, since null values are not usually stored in an index (the exception is a concatenated index entry where only some of the values are NULL). If you're planning to search for values which are logically missing, consider changing the column to NOT NULL with a DEFAULT clause. For instance, you could set a default value of 'UNKNOWN' and use the index to find these values.

    Using functions on indexed columns. Any function or operation on an indexed column will prevent Oracle from using an index on that column. For instance, Oracle can't use an index to find SUBSTR(SURNAME,1,4)='SMIT'. Instead of manipulating the column, try to manipulate the search condition. In the previous example, a better formulation would be SURNAME LIKE 'SMIT%'. In Oracle8i you can create functional indexes (indexes created on functions), providing that the function always returns the same result when provided with the same inputs. This allows you, for instance, to create an index on UPPER(surname).

    In many cases, avoiding a full table scan by using the best of all possible indexes is your aim. However, it's often the case that a full table scan cannot be avoided. In these situations, you could consider some of the following techniques to improve table scan performance.

    Use the Parallel Query Option. Oracle's Parallel Query Option is the most effective (although most resource intensive) way of improving the performance of full table scans. Parallel Query allocates multiple processes to an SQL statement that is based at least partially on a full table scan. The table is partitioned into distinct sets of blocks, and each process works on a different set of data. Further processes may be allocated, or the original processes recycled, to perform joins, sorts and other operations. This approach can reduce execution time dramatically if the hardware and database layout are suitable. In particular, the host computer should have multiple CPUs and/or the database should be spread across more than one disk device. You can enable the Parallel Query Option with a PARALLEL hint, or make it the default for a table with the PARALLEL table clause.

    Reduce the size of the table. The performance of a full table scan will generally be proportional to the size of the table to be scanned. There are ways of reducing the size of the table quite substantially, and thereby improving full table scan performance:

    Reduce PCTFREE. The PCTFREE table setting reserves a certain percentage of space in each block to allow for updates that increase the length of a row. By default, PCTFREE is set to 10%. If your table is rarely updated, or if the updates rarely increase the length of the row, you can reduce PCTFREE and hence reduce the overall size of the table.

    Increase PCTUSED. The PCTUSED table setting determines at what point blocks that have previously hit PCTFREE will again become eligible for inserts. The default value is 40%, which means that after hitting PCTFREE, the block will only become eligible for new rows when deletes reduce the amount of used space to 40%. If you increase PCTUSED, rows will be inserted into the table at an earlier time, blocks will be fuller on average, and the table will be smaller. There may be a negative effect on INSERT performance: you'll have to assess the trade-off between scan and insert performance.

    Relocate long or infrequently used columns. If you have LONG (or big VARCHAR2) columns in the table which are not frequently accessed and never accessed via a full table scan (perhaps a bitmap image or embedded document), you should consider relocating them to a separate table. By relocating these columns you can substantially reduce the table's size and hence improve full table scan performance. Note that Oracle8 LOB types will almost always be stored outside of the core table data anyway.

    The CACHE hint. Normally, rows retrieved by most full table scans are flushed almost immediately from Oracle's cache. This is sensible, since otherwise full table scans could completely saturate the cache and push out rows retrieved by index lookups. However, it does mean that subsequent table scans of the same table are unlikely to find a match in the cache and will therefore incur a high physical I/O rate. You can encourage Oracle to keep these rows within the cache by using the CACHE hint or the CACHE table setting. Oracle will then place the retrieved rows at the most recently used end of the LRU chain, and they will persist in the cache for a much longer period of time.

    Use partitioning. If the number of rows you want to retrieve from a table is greater than an index lookup could effectively retrieve, but still only a fraction of the table itself (say between 10% and 40% of the total), you could consider partitioning the table. For instance, suppose that a SALES table contains all sales records for the past 4 years and you frequently need to scan all sales records for the current financial year in order to calculate year-to-date totals. The proportion of rows scanned is far greater than an index lookup would comfortably support, but is still only a fraction of the total table. If you partition the table by financial year, you can restrict processing to only those records that match the appropriate financial year. This could potentially reduce scan time by 75% or more. Partitioning is discussed in detail in Chapter 13.

    Consider the fast full index scan. If a query needs to access all or most of the rows in a table, but only a subset of the columns, you can consider using a fast full index scan to retrieve the rows. To do this, you need to create an index which contains all the columns included in the SELECT and WHERE clauses of the query. If these columns comprise only a small subset of the entire table, the index will be substantially smaller and Oracle will be able to scan it more quickly than it could scan the table. There will, of course, be an overhead involved in maintaining the index, which will affect the performance of INSERT, UPDATE and DELETE statements.
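    The financial-year partitioning described above can be sketched in Oracle8 DDL. The table, column names and boundary dates are all illustrative:

    ```sql
    -- Hypothetical range-partitioned sales table, one partition per
    -- financial year:
    CREATE TABLE sales_history (
       sale_date  DATE,
       product_id NUMBER,
       amount     NUMBER
    )
    PARTITION BY RANGE (sale_date) (
       PARTITION fy1997 VALUES LESS THAN (TO_DATE('01-JUL-1997','DD-MON-YYYY')),
       PARTITION fy1998 VALUES LESS THAN (TO_DATE('01-JUL-1998','DD-MON-YYYY')),
       PARTITION fymax  VALUES LESS THAN (MAXVALUE)
    );

    -- A year-to-date scan now touches only the current-year partition:
    SELECT SUM(amount)
      FROM sales_history
     WHERE sale_date >= TO_DATE('01-JUL-1997', 'DD-MON-YYYY');
    ```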

Determining the optimal join order and method is a critical consideration when optimizing the performance of SQL that involves multi-table joins.

Join method

Oracle supports three join methods:

In a nested loops join, Oracle performs a search of the inner table for each row found in the outer table. This type of access is most often seen when there is an index on the inner table, since otherwise multiple nested table scans may result.

When performing a sort merge join, Oracle must sort each table (or result set) by the value of the join columns. Once sorted, the two sets of data are merged, much as you might merge two sorted piles of numbered pages.

When performing a hash join, Oracle builds a hash table for the smaller of the two tables. This hash table is then used to find matching rows in a somewhat similar fashion to the way an index is used in a nested loops join.

The nested loops method suits SQL that joins subsets of table data where there is an index to support the join. When larger amounts of data must be joined, or there is no supporting index, use the sort merge or hash join method. A hash join usually out-performs a sort merge, but it will only be used if cost based optimization is in effect or if a hint is used.
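The choice of join method can be expressed directly with hints. The following sketch uses hypothetical DEPARTMENTS and EMPLOYEES tables, and assumes an index exists on employees.dept_id for the nested loops case:

```sql
-- USE_NL requests a nested loops join with EMPLOYEES as the inner
-- table; an index on employees.dept_id is assumed to make the
-- repeated inner-table lookups efficient.
SELECT /*+ ORDERED USE_NL(e) */ d.dept_name, e.emp_name
  FROM departments d, employees e
 WHERE e.dept_id = d.dept_id;

-- USE_HASH requests a hash join, appropriate when most of both
-- tables must be joined and no supporting index exists.
SELECT /*+ USE_HASH(e) */ d.dept_name, e.emp_name
  FROM departments d, employees e
 WHERE e.dept_id = d.dept_id;
```

In both statements the hint names the table (by its alias) that should play the inner or probed role in the join.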

Join order

Determining the best join order can be a hit-and-miss affair. The Cost Based Optimizer will usually pick a good join order, but if it doesn't, you can use the ORDERED hint to force the tables to be joined in the exact order in which they appear in the FROM clause. In general, it is better to eliminate rows earlier rather than later in the join process, so if a table has a restrictive WHERE clause condition, you should favor it earlier in the join order.

Special joins

Oracle provides a number of optimizations for special join types. Some of these optimizations will be performed automatically if you are using the Cost Based Optimizer, but all can be invoked under either optimizer by use of hints:

The star join algorithm optimizes the join of a single massive fact table to multiple, smaller, dimension tables. The optimization can be invoked with the STAR hint. A further optimization rewrites the SQL to take advantage of bitmap indexes that might exist on the fact table; this optimization can be invoked with the STAR_TRANSFORMATION hint.

The anti-join is usually expressed as a subquery using the NOT IN clause. Queries of this type can perform badly under the rule based optimizer, but can run efficiently if the init.ora parameter ALWAYS_ANTI_JOIN is set to HASH or if a HASH_AJ hint is added to the subquery.

The semi-join is usually expressed as a subquery using the EXISTS clause. Such a query may perform poorly if there is no index supporting the subquery, but can run efficiently if the init.ora parameter ALWAYS_SEMI_JOIN is set to HASH or if a HASH_SJ hint is added to the subquery.

Hierarchical queries using the CONNECT BY operator will degrade rapidly as table volumes increase unless there is an index to support the CONNECT BY join condition.
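An anti-join hint can be sketched as follows, using hypothetical CUSTOMERS and ORDERS tables. The NOT NULL filter is included because the hash anti-join requires the subquery column to exclude nulls:

```sql
-- Hypothetical example: find customers with no orders.
-- The HASH_AJ hint in the NOT IN subquery asks Oracle to use a
-- hash anti-join instead of re-probing ORDERS once per customer row.
SELECT c.cust_id, c.cust_name
  FROM customers c
 WHERE c.cust_id NOT IN
       (SELECT /*+ HASH_AJ */ o.cust_id
          FROM orders o
         WHERE o.cust_id IS NOT NULL);
```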

Semi-joins allow Oracle to efficiently optimize queries which include an EXISTS subquery but which don't have an index to support efficient execution of the subquery. Typically, these statements would perform very badly in Oracle7, because the EXISTS clause causes the subquery to be executed once for each row returned by the parent query. If the subquery involves a table scan or an inefficient index scan, performance would be expected to be poor. In Oracle7, the solution was either to create a supporting index, or to re-code the statement as a join or as a subquery using the IN operator.
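A semi-join hint follows the same pattern; the table names here are again hypothetical:

```sql
-- Hypothetical example: customers that have at least one order.
-- The HASH_SJ hint in the EXISTS subquery requests a hash semi-join,
-- avoiding one execution of the subquery per parent row when no
-- index on orders.cust_id is available.
SELECT c.cust_id, c.cust_name
  FROM customers c
 WHERE EXISTS
       (SELECT /*+ HASH_SJ */ 1
          FROM orders o
         WHERE o.cust_id = c.cust_id);
```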

Array processing refers to Oracle's ability to insert or select more than one row in a single operation. For SQL which deals with multiple rows of data, array processing usually results in reductions of 50% or more in execution time (more if you're working across the network). In some application environments, array processing is implemented automatically and you won't have to do anything to enable this feature. In other environments, array processing is the responsibility of the programmer. Appendix C outlines how array processing can be activated in some popular development tools.

On the principle that if some is good, more must be better, many programmers implement huge arrays. This can be overkill and may even reduce performance by increasing the memory requirements of the program. Most of the benefit of array processing is obtained by increasing the array size from 1 to about 20. Further increases yield diminishing returns, and you won't normally see much improvement when increasing the array size beyond 100.
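In PL/SQL, one way array processing can be expressed is with BULK COLLECT and a LIMIT clause (a feature of Oracle 8i and later, so it may not apply to older releases discussed here). This is a minimal sketch against a hypothetical ORDERS table:

```sql
-- Hypothetical sketch: fetch rows in batches of 100 rather than
-- one at a time, reducing round trips between SQL and PL/SQL.
DECLARE
  TYPE id_tab IS TABLE OF orders.order_id%TYPE;
  l_ids id_tab;
  CURSOR c IS SELECT order_id FROM orders;
BEGIN
  OPEN c;
  LOOP
    FETCH c BULK COLLECT INTO l_ids LIMIT 100;  -- array fetch of 100 rows
    FOR i IN 1 .. l_ids.COUNT LOOP
      NULL;  -- process each fetched row here
    END LOOP;
    EXIT WHEN c%NOTFOUND;
  END LOOP;
  CLOSE c;
END;
/
```

The LIMIT of 100 reflects the guidance above: most of the gain comes well before that point, and larger arrays mainly cost memory.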

SQL is a non-procedural language that is admirably suited to most data retrieval and manipulation tasks. However, there are many circumstances in which a procedural approach will yield better results. In these circumstances, the PL/SQL language (or possibly Java) can be used in place of standard SQL. Although it's not possible to exhaustively categorize all the situations in which PL/SQL can be used in place of standard SQL, PL/SQL is a valid alternative when:

There is little or no requirement to return large quantities of data; for instance, UPDATE transactions, or when retrieving only a single value or row.

Standard SQL requires more resources than seems logically necessary and no combination of hints seems to work. This is particularly likely if there are implicit characteristics of the data which the optimizer cannot understand, or where the SQL is particularly complex.

You have a clear idea of how the data should be retrieved and processed, but can't implement your algorithm using standard SQL.

Some of the specific circumstances in which PL/SQL was found to improve performance within this book were:

Determining 2nd highest values.

Performing range lookups for tables that have a LOW_VALUE and HIGH_VALUE column.

Performing correlated updates, where the same table is referenced within the WHERE and SET clauses of an UPDATE statement with a subquery within the SET clause.

PL/SQL triggers can also be invaluable when implementing de-normalization.
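The "2nd highest value" case can be sketched in PL/SQL as two fetches from an ordered cursor, which avoids the self-join or correlated subquery a pure SQL solution would need. The EMP/SAL names are the classic demo schema, and this simple version does not account for ties:

```sql
-- Hypothetical sketch: find the second-highest salary with a single
-- ordered scan, fetching just two rows from the cursor.
DECLARE
  CURSOR c IS SELECT sal FROM emp ORDER BY sal DESC;
  l_sal emp.sal%TYPE;
BEGIN
  OPEN c;
  FETCH c INTO l_sal;   -- highest salary
  FETCH c INTO l_sal;   -- second-highest salary
  CLOSE c;
  DBMS_OUTPUT.PUT_LINE('Second highest: ' || l_sal);
END;
/
```

The procedural version stops reading after the second row, whereas an equivalent standard SQL formulation may force a comparison of every row against every other row.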