CBO: A Configuration Roadmap Christian Antognini Trivadis AG Hotsos Symposium 2005 6-10 March, Dallas (USA)


Page 1: Antognini - CBO Configuration Roadmap

CBO: A Configuration Roadmap

Christian Antognini, Trivadis AG

Hotsos Symposium 2005, 6-10 March, Dallas (USA)

Page 2: Antognini - CBO Configuration Roadmap

CBO: A Configuration Roadmap 2 © 2004

Who am I

> Christian Antognini

> Senior consultant and trainer at Trivadis AG in Zurich, Switzerland
  » [email protected]
  » www.trivadis.com

> Focus: get the most out of Oracle
  » Logical and physical database design
  » Application performance management
  » Integration with J2EE

> 10 years experience with Oracle

Be in the know.

Page 3: Antognini - CBO Configuration Roadmap


Disclaimer

> The CBO changes from release to release

> It’s difficult to give general advice about many releases

> Therefore, this presentation focuses on the most important versions at present
  » Oracle9i Release 2
  » Oracle 10g Release 1

> When information applies to only one version, one of the following icons will be used

Page 4: Antognini - CBO Configuration Roadmap


Correctly configuring the CBO (1)

> At least since Oracle9i the CBO works well, i.e. it generates good execution plans for most SQL statements

> This is only true when
  » It is correctly configured
  » The database has been designed to take advantage of all its features

> Correctly configuring the CBO is not an easy job

However, this is no good reason for not doing it!

Page 5: Antognini - CBO Configuration Roadmap


Correctly configuring the CBO (2)

> There is no single configuration that is good for every system
  » Each application has its own requirements
  » Each system has its own characteristics

> Therefore, I’m not able to provide you with the “magic configuration”…

> I can only show you how I proceed to do such a configuration

Page 6: Antognini - CBO Configuration Roadmap


Configuration = INIT.ORA parameters + statistics

> Statistics
  » Object statistics
  » System statistics

> Basic
  » OPTIMIZER_MODE
  » OPTIMIZER_FEATURES_ENABLE
  » OPTIMIZER_DYNAMIC_SAMPLING
  » OPTIMIZER_INDEX_COST_ADJ
  » OPTIMIZER_INDEX_CACHING

> Query transformation
  » QUERY_REWRITE_ENABLED
  » QUERY_REWRITE_INTEGRITY
  » STAR_TRANSFORMATION_ENABLED

> Memory
  » WORKAREA_SIZE_POLICY
  » PGA_AGGREGATE_TARGET
  » HASH_AREA_SIZE
  » SORT_AREA_SIZE
  » BITMAP_MERGE_AREA_SIZE

> Miscellaneous
  » DB_FILE_MULTIBLOCK_READ_COUNT
  » CURSOR_SHARING
  » SKIP_UNUSABLE_INDEXES
  » PARALLEL_*
  » Many undocumented parameters…
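Before changing anything, it helps to see how these parameters are currently set. The following is a minimal sketch, assuming a session that is allowed to query V$PARAMETER; the list of names simply mirrors the parameters named on this slide.

```sql
-- Show the current value of the main CBO-related parameters
-- and whether each one is still at its default value.
SELECT name, value, isdefault
FROM   v$parameter
WHERE  name IN ('optimizer_mode',
                'optimizer_features_enable',
                'optimizer_dynamic_sampling',
                'optimizer_index_cost_adj',
                'optimizer_index_caching',
                'query_rewrite_enabled',
                'query_rewrite_integrity',
                'star_transformation_enabled',
                'workarea_size_policy',
                'pga_aggregate_target',
                'hash_area_size',
                'sort_area_size',
                'bitmap_merge_area_size',
                'db_file_multiblock_read_count',
                'cursor_sharing',
                'skip_unusable_indexes')
   OR  name LIKE 'parallel%'
ORDER  BY name;
```

Rows with ISDEFAULT = 'TRUE' have never been set explicitly, which is a quick way to spot an unconfigured system.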

Page 7: Antognini - CBO Configuration Roadmap


Set the right parameter! (1)

> Each INIT.ORA parameter has been introduced for a specific reason!

> Understanding how it changes the behavior of the CBO is essential

> Don’t tweak the configuration randomly, instead
  » Understand the current situation
  » Define the goal to be achieved
  » Find out which parameter should be changed to achieve the goal

Page 8: Antognini - CBO Configuration Roadmap


Choose the right tool!

Page 9: Antognini - CBO Configuration Roadmap


Set the right parameter! (2)

[Figure labels: object statistics, system statistics, OPTIMIZER_MODE, OPTIMIZER_FEATURES_ENABLE, DB_FILE_MULTIBLOCK_READ_COUNT, OPTIMIZER_DYNAMIC_SAMPLING, PGA_AGGREGATE_TARGET, SORT_AREA_SIZE, HASH_AREA_SIZE, BITMAP_MERGE_AREA_SIZE, OPTIMIZER_INDEX_COST_ADJ, WORKAREA_SIZE_POLICY, QUERY_REWRITE_ENABLED, OPTIMIZER_INDEX_CACHING]

Page 10: Antognini - CBO Configuration Roadmap


Configuration roadmap

[Flowchart. Boxes: gather system and object statistics; OPTIMIZER_MODE, OPTIMIZER_FEATURES_ENABLE; DB_FILE_MULTIBLOCK_READ_COUNT, OPTIMIZER_DYNAMIC_SAMPLING; WORKAREA_SIZE_POLICY (branches "manual" and "auto"), leading to PGA_AGGREGATE_TARGET or to HASH_AREA_SIZE, SORT_AREA_SIZE, BITMAP_MERGE_AREA_SIZE; test the application; decision "most of the execution plans are good" (branches "yes" and "no"); OPTIMIZER_INDEX_COST_ADJ, OPTIMIZER_INDEX_CACHING; gather missing histograms]

Once set to their optimal value they should not be modified further; the only exception is when tweaking the index related parameters does not achieve good results!

Page 11: Antognini - CBO Configuration Roadmap


Object statistics

> The object statistics describe the data stored in the database
  » Table statistics: number of rows, number of blocks below the high water mark and average row length
  » Column statistics: number of distinct values, number of NULL values and data distribution (a.k.a. histograms)
  » Index statistics: number of distinct keys, height, number of leaf blocks and clustering factor

> The CBO can generate good execution plans only if the statistics are good as well, i.e. they must reflect existing data

> Missing, not up-to-date, or wrong statistics can lead to poor execution plans

> Object statistics are automatically gathered
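To check what the CBO actually sees, the statistics listed above can be read from the data dictionary. This is a minimal sketch, assuming you are connected as the owner of the objects; it only uses the standard USER_TABLES, USER_TAB_COLUMNS and USER_INDEXES views.

```sql
-- Table statistics: rows, blocks below the high water mark, average row length
SELECT table_name, num_rows, blocks, avg_row_len, last_analyzed
FROM   user_tables
ORDER  BY table_name;

-- Column statistics: distinct values, NULLs, number of histogram buckets
SELECT table_name, column_name, num_distinct, num_nulls, num_buckets
FROM   user_tab_columns
ORDER  BY table_name, column_name;

-- Index statistics: branch levels, leaf blocks, distinct keys, clustering factor
SELECT index_name, blevel, leaf_blocks, distinct_keys, clustering_factor, last_analyzed
FROM   user_indexes
ORDER  BY index_name;
```

A NULL in LAST_ANALYZED means that no statistics have been gathered for that object yet.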

Page 12: Antognini - CBO Configuration Roadmap


Gathering object statistics (1)

> Gather statistics for tables, columns and indexes at the same time
  » DBMS_STATS.GATHER_SCHEMA_STATS
  » DBMS_STATS.GATHER_TABLE_STATS

> Use small estimate percentage (<10%)
  » Oracle increases the specified value if necessary ☺
  » For very large tables reduce it to 1% or less

> Activate table monitoring to refresh the statistics when data is changed (by default a threshold of 10% is used)
  » DBMS_STATS.ALTER_SCHEMA_TAB_MONITORING
  » STATISTICS_LEVEL = TYPICAL (default)

> For partitioned tables gather statistics at all levels

> For large tables gather statistics in parallel
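The advice on this slide could be put together as follows. This is only a sketch: the table name SALES, the 1% sample and the degree of parallelism 4 are assumptions chosen for illustration, not recommendations.

```sql
BEGIN
  -- Enable monitoring for all tables of the current schema so that
  -- stale statistics can be detected later (Oracle9i style; in
  -- Oracle 10g monitoring is controlled by STATISTICS_LEVEL).
  dbms_stats.alter_schema_tab_monitoring(
    ownname    => user,
    monitoring => TRUE
  );

  -- Gather table, column and index statistics for one large
  -- partitioned table, at all levels and in parallel.
  dbms_stats.gather_table_stats(
    ownname          => user,
    tabname          => 'SALES',   -- hypothetical table
    estimate_percent => 1,         -- small sample for a very large table
    granularity      => 'ALL',     -- global, partition and subpartition level
    degree           => 4,         -- assumed degree of parallelism
    cascade          => TRUE       -- gather index statistics as well
  );
END;
/
```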

Page 13: Antognini - CBO Configuration Roadmap


Gathering object statistics (2)

> Histograms are essential; gather them for all columns containing skewed data that are referenced in WHERE clauses
  » Even for non-indexed columns!
  » For simplicity use SIZE SKEWONLY; if it takes too much time try SIZE AUTO; if it is still too slow or the chosen number of buckets is not good, manually specify the list of columns
  » To get the most out of them, bind variables must not be used

> The following command is a good starting point

dbms_stats.gather_schema_stats(
  ownname          => user,
  estimate_percent => 5,
  cascade          => TRUE,
  method_opt       => 'FOR ALL COLUMNS SIZE SKEWONLY',
  options          => 'GATHER STALE'
);

Page 14: Antognini - CBO Configuration Roadmap


Gathering object statistics (3)

> All procedures that gather optimizer statistics no longer have hardcoded default values; instead, the defaults are stored in the data dictionary
  » You can get and set them with the DBMS_STATS procedures GET_PARAM and SET_PARAM

> Check the definition of the scheduler window group MAINTENANCE_WINDOW_GROUP

SQL> SELECT sname, nvl(to_char(sval1),spare4) value
     FROM sys.optstat_hist_control$;

SNAME                VALUE
-------------------- ------------------------------
CASCADE              DBMS_STATS.AUTO_CASCADE
ESTIMATE_PERCENT     DBMS_STATS.AUTO_SAMPLE_SIZE
DEGREE               NULL
METHOD_OPT           FOR ALL COLUMNS SIZE AUTO
NO_INVALIDATE        DBMS_STATS.AUTO_INVALIDATE
GRANULARITY          AUTO
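Instead of querying the SYS table directly, the defaults can be read and changed through GET_PARAM and SET_PARAM. The sketch below assumes Oracle 10g and sufficient privileges; the value 'TRUE' for CASCADE is an illustrative choice, so check which parameter names and values your release accepts.

```sql
-- Read the current default of METHOD_OPT
SELECT dbms_stats.get_param('METHOD_OPT') AS method_opt
FROM   dual;

BEGIN
  -- Change the default so that index statistics are always
  -- gathered together with the table statistics (assumed value).
  dbms_stats.set_param(pname => 'CASCADE', pval => 'TRUE');
END;
/
```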

Page 15: Antognini - CBO Configuration Roadmap


System statistics (1)

> System statistics are essential for a successful configuration

> The CBO used to base its estimations on the number of I/O needed to execute a SQL statement (a.k.a. “I/O cost model”)

> With system statistics a new model, called “CPU cost model”, is enabled

> They give information about the performance of the system where Oracle runs
  » Performance of the I/O subsystem
  » Average I/O size for multi-block read operations
  » Performance of the CPU

> To use the “I/O cost model” when system statistics have been gathered, specify the hint NO_CPU_COSTING
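As an illustration of the hint mentioned above, a single statement can be costed with the “I/O cost model” while the rest of the system keeps using system statistics. The table T and its column ID are hypothetical.

```sql
-- Cost only this statement with the I/O cost model,
-- even though system statistics are available.
SELECT /*+ no_cpu_costing */ *
FROM   t            -- hypothetical table
WHERE  id = 42;     -- hypothetical predicate
```

Comparing the execution plan with and without the hint shows how much the system statistics influence the choice of access path.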

Page 16: Antognini - CBO Configuration Roadmap


System statistics (2)

> There are two kinds of system statistics

> Without workload
  » Not really useful…
  » They are automatically gathered at startup if they don’t exist

dbms_stats.gather_system_stats('noworkload');

> With workload
  » They should be gathered while the system is running a typical workload
  » The instance is “monitored” for a period of time (interval)

dbms_stats.gather_system_stats(
  gathering_mode => 'interval',
  interval       => 30
);

Statistics are gathered over the next 30 minutes
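When the length of the representative workload is not known in advance, the monitoring period can also be delimited by hand. This is a minimal sketch using the 'start' and 'stop' gathering modes of the same procedure; when to issue the two calls is up to you.

```sql
BEGIN
  -- Begin monitoring the instance
  dbms_stats.gather_system_stats(gathering_mode => 'start');
END;
/

-- ... let the typical workload run for a while ...

BEGIN
  -- End monitoring and store the workload statistics
  dbms_stats.gather_system_stats(gathering_mode => 'stop');
END;
/
```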

Page 17: Antognini - CBO Configuration Roadmap


System statistics (3)

> Without workload
  » CPUSPEEDNW: CPU speed in millions of operations per second
  » IOSEEKTIM: I/O seek time [ms]
  » IOTFRSPEED: I/O transfer speed [B/ms]

> With workload
  » CPUSPEED: CPU speed in millions of operations per second
  » SREADTIM: single-block read time [ms]
  » MREADTIM: multi-block read time [ms]
  » MBRC: average number of blocks read during a multi-block read operation
  » MAXTHR: max. throughput [B/s]
  » SLAVETHR: max. slave throughput [B/s]

SQL> SELECT pname, pval1
     FROM sys.aux_stats$
     WHERE sname = 'SYSSTATS_MAIN';

PNAME        PVAL1
------------ ----------
CPUSPEEDNW   886.233
IOSEEKTIM    10
IOTFRSPEED   4096
CPUSPEED     928
SREADTIM     8.138
MREADTIM     16.799
MBRC         9
MAXTHR       162238464
SLAVETHR     277504

Page 18: Antognini - CBO Configuration Roadmap


OPTIMIZER_MODE

> This is the most important parameter; unfortunately, too often it is not set, i.e. its default value is used
  » Oracle9i: default value is CHOOSE
    - ALL_ROWS is used if object statistics are available, otherwise RULE
    - In some situations ALL_ROWS has to be used even if no statistics are available (e.g. for partitioned tables)
  » Oracle 10g: default value is ALL_ROWS

> If fast delivery of the last row is important, ALL_ROWS should be used
  » Typical example: reporting systems and data warehouses

> If fast delivery of the first row(s) is important, FIRST_ROWS_n should be used
  » Where “n” is [ 1 | 10 | 100 | 1000 ] row(s)
  » Typical example: OLTP systems
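The mode can be set for the whole instance in INIT.ORA, for a session, or for a single statement with a hint. A minimal sketch follows; the value 10 and the table T are assumptions for illustration.

```sql
-- Session level: optimize for fast delivery of the first 10 rows
ALTER SESSION SET optimizer_mode = first_rows_10;

-- Statement level: override the session setting for one report-like query
SELECT /*+ all_rows */ count(*)
FROM   t;   -- hypothetical table
```

Setting the parameter explicitly, even to the value you would get by default, documents the decision and protects it from changing with the next release.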

Page 19: Antognini - CBO Configuration Roadmap


OPTIMIZER_FEATURES_ENABLE

> Each database version introduces new features in the optimizer

> OPTIMIZER_FEATURES_ENABLE can be used to set which “version” of the optimizer should be used
  » Valid values are database versions (e.g. 8.1.7, 9.0.1, 9.2.0, …)
  » A complete list of the features which are enabled/disabled by each version is available in the “Oracle Database Reference” manual
  » The default value is the current database version
  » Not all new features are enabled/disabled by this parameter; this means that even if you set it to 8.1.7 in, for example, Oracle9i you won’t get the 8.1.7 optimizer!

> It could be useful to set it when an application is upgraded to a new database version, otherwise leave it at the default value
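During an upgrade, a problematic statement can be tested with the previous optimizer behavior at session level before touching the instance-wide setting. This is a sketch under the assumption that the application was last tuned on 9.2.0.

```sql
-- Check the current setting
SELECT value
FROM   v$parameter
WHERE  name = 'optimizer_features_enable';

-- Fall back to the 9.2.0 feature set for this session only
ALTER SESSION SET optimizer_features_enable = '9.2.0';
```

If the execution plan becomes good again, the regression is related to a feature controlled by this parameter; if not, the cause lies elsewhere.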

Page 20: Antognini - CBO Configuration Roadmap


DB_FILE_MULTIBLOCK_READ_COUNT (1)

> DB_FILE_MULTIBLOCK_READ_COUNT specifies the maximum number of blocks that Oracle reads during a multi-block operation (e.g. full table scan)

> Three common situations lead to multi-block reads that are smaller than the specified value
  » Oracle reads segment headers with single-block reads
  » Oracle never does an I/O that spans multiple extents
  » Oracle never reads a block that is already in the buffer cache

> Multi-block reads are a performance feature; therefore, the parameter should be set to achieve the best performance
  » Higher values do not provide better performance in all cases!
  » It makes no sense to exceed the maximum physical I/O size

Page 21: Antognini - CBO Configuration Roadmap


DB_FILE_MULTIBLOCK_READ_COUNT (2)

> When system statistics with workload are not used, this parameter has a direct impact on the full table scan costs; therefore, values that are too high lead to excessive full scans
  » Without system statistics
    $\text{I/O Cost}_{FTS} \approx \dfrac{Blocks}{1.6765 \cdot DFMBRC^{0.6581}}$
  » System statistics without workload
    $\text{I/O Cost}_{FTS} \approx \dfrac{Blocks}{1 + A \cdot \ln(DFMBRC) + B \cdot \ln^{2}(DFMBRC) + \ldots}$

> When system statistics with workload are used, MBRC in SYS.AUX_STATS$ is used instead of DFMBRC to compute the cost of full table scans
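To get a feeling for the cost model without system statistics, here is a worked example of $\text{I/O Cost}_{FTS} \approx Blocks / (1.6765 \cdot DFMBRC^{0.6581})$. The table size of 1000 blocks and the two DFMBRC values are assumptions chosen for illustration.

```latex
\begin{align*}
DFMBRC = 8:  \quad & 1.6765 \cdot 8^{0.6581}  \approx 6.59,
             \quad \text{I/O Cost}_{FTS} \approx \frac{1000}{6.59}  \approx 152 \\
DFMBRC = 16: \quad & 1.6765 \cdot 16^{0.6581} \approx 10.40,
             \quad \text{I/O Cost}_{FTS} \approx \frac{1000}{10.40} \approx 96
\end{align*}
```

Doubling the parameter lowers the estimated cost of the full table scan by roughly 37%, whether or not the I/O subsystem really gets faster. This is why values that are too high push the CBO towards full scans.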

Page 22: Antognini - CBO Configuration Roadmap


DB_FILE_MULTIBLOCK_READ_COUNT (3)

> A simple full table scan with different values gives useful information about the impact of this parameter and, therefore, assists in finding the “best” value

[Chart: gain in % (y axis, -20 to 100) as a function of DFMBRC / MBRC (x axis, 1 to 29). Series: I/O cost without system statistics; I/O cost with def. system statistics; System 1 (max. 55 MB/s); System 2 (max. 40 MB/s); System 3 (max. 58 MB/s); System 4 (max. 230 MB/s)]
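Such a test could look like the following SQL*Plus sketch. The table T is hypothetical and should be large enough that most of its blocks are not in the buffer cache; otherwise the timings say little about the I/O subsystem. The chosen values are examples only.

```sql
SET TIMING ON

-- Repeat the same full table scan with increasing values
-- and compare the elapsed times.
ALTER SESSION SET db_file_multiblock_read_count = 4;
SELECT /*+ full(t) */ count(*) FROM t;

ALTER SESSION SET db_file_multiblock_read_count = 8;
SELECT /*+ full(t) */ count(*) FROM t;

ALTER SESSION SET db_file_multiblock_read_count = 16;
SELECT /*+ full(t) */ count(*) FROM t;

ALTER SESSION SET db_file_multiblock_read_count = 32;
SELECT /*+ full(t) */ count(*) FROM t;
```

The “best” value is the one after which the elapsed time stops improving noticeably.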

Page 23: Antognini - CBO Configuration Roadmap


OPTIMIZER_DYNAMIC_SAMPLING (1)

> The CBO used to base its estimations on statistics found in the data dictionary only

> With dynamic sampling some statistics can be gathered during the parse phase as well, i.e. some queries are executed against the referenced objects to gather additional information

> The statistics gathered by dynamic sampling are not stored in the data dictionary; they are simply shared at cursor level

> The value of the parameter OPTIMIZER_DYNAMIC_SAMPLING specifies how and when dynamic sampling is used
  » Range: from 0 to 10 (0 = dynamic sampling disabled)

Page 24: Antognini - CBO Configuration Roadmap


OPTIMIZER_DYNAMIC_SAMPLING (2)

       # Blocks
Level  Param  Hint   On which table?
1      32     32     All non-analyzed tables, if at least one table
                     » Is part of a join, subquery or non-mergeable view
                     » Has no index
                     » Has more blocks than the number of blocks used for the sampling
2      32     64     All non-analyzed tables
3      32     128    All tables which fulfill the level-2 criterion plus all tables for which a standard selectivity estimation is used
4      32     256    All tables which fulfill level-3 criteria plus all tables which have single table predicates referencing two or more attributes
5      64     512    (same as level 4)
6      128    1024   (same as level 4)
7      256    2048   (same as level 4)
8      1024   4096   (same as level 4)
9      4096   8192   (same as level 4)
10     All    All    (same as level 4)

Page 25: Antognini - CBO Configuration Roadmap


OPTIMIZER_DYNAMIC_SAMPLING (3)

> Default values
  » If OPTIMIZER_FEATURES_ENABLE ≥ 10.0.0: 2
  » If OPTIMIZER_FEATURES_ENABLE = 9.2.0: 1
  » If OPTIMIZER_FEATURES_ENABLE ≤ 9.0.1: 0

> Levels 1 and 2 are not very useful, since the tables should be analyzed on a regular basis anyway!
  » An exception is when temporary tables are used, since usually no statistics are available for them

> Level 3 and up are useful for improving selectivity estimations of predicates
  » If the CBO is not able to do correct estimations, start by setting it to 4
  » It is possible to do such corrections with SQL profiles as well
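Dynamic sampling can be raised for a session or for a single table in a single statement. This is a minimal sketch; the table T, its columns and the predicate values are hypothetical.

```sql
-- Session level: use level 4 for all statements parsed from now on
ALTER SESSION SET optimizer_dynamic_sampling = 4;

-- Statement level: sample only table T at level 4
SELECT /*+ dynamic_sampling(t 4) */ count(*)
FROM   t
WHERE  c1 = 1    -- two predicates on the same table, whose
AND    c2 = 2;   -- combined selectivity is hard to estimate
```

The hint form is convenient when only a few statements suffer from wrong estimations and the additional parse-time work should not be paid by every query.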

Page 26: Antognini - CBO Configuration Roadmap


PGA management

> It is possible to choose between two methods to manage the PGA
  » Manual: the DBA has full control over the size of the PGA
  » Automatic: the DBA delegates the management of the PGA to Oracle

> Except in the following situations it is recommended to use automatic PGA management
  » Very large PGA needed (>100MB, see Metalink note 147806.1)
    - Except if you want to set _PGA_MAX_SIZE
  » Fine tuning is needed
  » Shared server (former MTS) is used

> Usually a larger PGA makes merge/hash joins and sort operations faster; therefore, you should devote the “unused” memory that is available on the system to it

> The PGA size has an influence on the costs computed by the CBO

Page 27: Antognini - CBO Configuration Roadmap


Manual PGA management

> To enable manual PGA management the INIT.ORA parameter WORKAREA_SIZE_POLICY must be set to MANUAL

> Then the following INIT.ORA parameters should be set
  » HASH_AREA_SIZE
  » SORT_AREA_SIZE
  » BITMAP_MERGE_AREA_SIZE

> It is practically impossible to give advice about their values; anyway, here are some general rules
  » Usually a minimal value of 512KB to 1MB should be used
  » For small PGAs, to take advantage of hash joins, the HASH_AREA_SIZE should be at least 3-4 times the SORT_AREA_SIZE

> Each parameter specifies the maximum amount of memory that can be used by each server process
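For a single session that needs fine tuning, manual management could be enabled as sketched below. The sizes are assumptions that merely follow the rules of thumb above (1MB for sorts, four times that for hash joins). BITMAP_MERGE_AREA_SIZE is not included because it cannot be changed at session level; it has to be set in INIT.ORA.

```sql
-- Switch this session to manual PGA management
ALTER SESSION SET workarea_size_policy = manual;

-- Values are in bytes
ALTER SESSION SET sort_area_size = 1048576;   -- 1MB (assumed)
ALTER SESSION SET hash_area_size = 4194304;   -- 4MB, i.e. 4 x SORT_AREA_SIZE (assumed)
```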

Page 28: Antognini - CBO Configuration Roadmap


Automatic PGA management

> To enable auto PGA management the INIT.ORA parameter WORKAREA_SIZE_POLICY must be set to AUTO

> Contrary to manual PGA management, the INIT.ORA parameter PGA_AGGREGATE_TARGET specifies the total amount of memory used for all PGAs, i.e. not for a single server process

> In some situations the PGA_AGGREGATE_TARGET can be exceeded
  » It’s a target, not a maximum value!
  » This usually happens when a value which is too small is specified
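Whether the target is too small can be checked in V$PGASTAT. This is a minimal sketch; the 1G target is an assumption for illustration and has to be adapted to the memory actually available on the system.

```sql
-- Enable automatic PGA management with an assumed target of 1GB
ALTER SYSTEM SET workarea_size_policy = auto;
ALTER SYSTEM SET pga_aggregate_target = 1G;

-- Compare the target with what has really been allocated
SELECT name, value, unit
FROM   v$pgastat
WHERE  name IN ('aggregate PGA target parameter',
                'total PGA allocated',
                'maximum PGA allocated',
                'over allocation count');
```

An 'over allocation count' that keeps growing indicates that the target has been exceeded, i.e. that the specified value is too small for the workload.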

Page 29: Antognini - CBO Configuration Roadmap


OPTIMIZER_INDEX_COST_ADJ

> This parameter is used to change the cost of table accesses through index scans

> Range of values 1..10000 (default 100)
  » Values greater than 100 make index scans more expensive
  » Values lower than 100 make index scans less expensive

> Correction for an index range scan
  $\text{I/O Cost} \approx \left( BLevel + (LeafBlocks + ClustFactor) \cdot Selectivity \right) \cdot \dfrac{OICA}{100}$

> Correction for an index unique scan
  $\text{I/O Cost} \approx \left( BLevel + 1 \right) \cdot \dfrac{OICA}{100}$

> This parameter flattens costs and makes the clustering factor much less significant; therefore, small values should be carefully specified! With system statistics the default value is usually good
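The flattening effect can be seen with a worked example of the range scan correction, $\text{I/O Cost} \approx (BLevel + (LeafBlocks + ClustFactor) \cdot Selectivity) \cdot OICA / 100$. All numbers are assumptions for illustration: two indexes with $BLevel = 2$, $LeafBlocks = 1000$ and $Selectivity = 0.01$, where index A has a clustering factor of 5000 and index B of 50000.

```latex
\begin{align*}
OICA = 100: \quad & \text{Cost}_A \approx (2 + 6000 \cdot 0.01) \cdot 1 = 62,
            \quad \text{Cost}_B \approx (2 + 51000 \cdot 0.01) \cdot 1 = 512 \\
OICA = 10:  \quad & \text{Cost}_A \approx 62 \cdot 0.1 = 6.2,
            \quad \text{Cost}_B \approx 512 \cdot 0.1 = 51.2
\end{align*}
```

With the default value the two indexes are 450 cost units apart; with $OICA = 10$ the gap shrinks to 45, so the badly clustered index B suddenly looks competitive against other access paths such as a full table scan.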

Page 30: Antognini - CBO Configuration Roadmap


OPTIMIZER_INDEX_CACHING

> This parameter is used to specify the expected amount (in percent) of index blocks cached in the buffer cache during a nested loop join
  » This parameter has no influence on index range scans!

> Range of values 0..100 (default 0)
  » Values greater than 0 make nested loops less expensive

> Correction for an index range scan
  $\text{I/O Cost} \approx \left( BLevel + LeafBlocks \cdot Selectivity \right) \cdot \left( 1 - \dfrac{OIC}{100} \right) + ClustFactor \cdot Selectivity$

> Correction for an index unique scan
  $\text{I/O Cost} \approx BLevel \cdot \left( 1 - \dfrac{OIC}{100} \right) + 1$

> With system statistics the default value is usually good
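A worked example shows which part of the cost is affected. It uses the corrections $\text{I/O Cost} \approx (BLevel + LeafBlocks \cdot Selectivity) \cdot (1 - OIC/100) + ClustFactor \cdot Selectivity$ for the range scan and $\text{I/O Cost} \approx BLevel \cdot (1 - OIC/100) + 1$ for the unique scan. The numbers are assumptions for illustration: $BLevel = 2$, $LeafBlocks = 1000$, $ClustFactor = 5000$ and $Selectivity = 0.01$.

```latex
\begin{align*}
\text{Range scan, } OIC = 0:   \quad & (2 + 10) \cdot 1   + 50 = 62 \\
\text{Range scan, } OIC = 90:  \quad & (2 + 10) \cdot 0.1 + 50 = 51.2 \\
\text{Unique scan, } OIC = 0:  \quad & 2 \cdot 1   + 1 = 3 \\
\text{Unique scan, } OIC = 90: \quad & 2 \cdot 0.1 + 1 = 1.2
\end{align*}
```

Only the index part of the cost is reduced; the table access part ($ClustFactor \cdot Selectivity$ for the range scan, 1 for the unique scan) stays the same, so the clustering factor keeps its weight.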

Page 31: Antognini - CBO Configuration Roadmap



Core messages…

> The CBO works well if it is correctly configured!

> Correctly configuring the CBO is not an easy job
  This is no good reason for not doing it!

> Understanding how each parameter changes the behavior of the CBO is essential

At the core it's all about data.

Page 32: Antognini - CBO Configuration Roadmap

Thank you for your attention

Questions?