30
Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America [email protected] IOUG collaborate 08 Paper # 204

Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America [email protected] IOUG collaborate 08 Paper # 204

Embed Size (px)

Citation preview

Page 1: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Oracle 11g Data Warehousing

Vincent ChazhoorData Warehouse Architect

Lafarge North America

[email protected]

IOUG collaborate 08 Paper # 204

Page 2: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

The Pillars of Oracle Data Warehousing

• Partitioning• Parallel Processing• Data Compression• Summary Management

Page 3: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Data Compression

• Introduced in Oracle9i Release 2– Supports compression during bulk load operations only

• Data modified using conventional operations not compressed

• Improved performance for queries accessing large amounts of data– Fewer IOs

– Efficient IO operations

– Buffer Cache efficiency

• Data Can be compressed at the table level or partition level– Compressing unmodified partitions are common in DW environment

• Compressing data saves both storage space and CPU resources• Savings in storage space cascade to standby, backup and

development, test and other database copies

Page 4: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Data Compression – 11g New Features

• Oracle Database 11g extends compression for OLTP data– Support for conventional DML Operations

(INSERT, UPDATE, DELETE)

• New algorithm significantly reduces write overhead– Batched compression ensures no impact for most OLTP

transactions• Typical compression ratio of 2x to 3x

Page 5: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Using OLTP Table Compression

• Requires database compatibility level at 11.1 or greater• New Syntax extends the ‘COMPRESS’ keyword

– COMPRESS [FOR {ALL | DIRECT_LOAD} OPERATIONS]– DIRECT_LOAD (DEFAULT)

• Refers to Bulk load operations from 10g and prior releases

– ALL• OLTP + Direct loads

• Enable compression for a new table– CREATE TABLE (

…………………….COMPRESS FOR ALL OPERATIONS

• Enable only direct load compression on existing table– alter table <tablename> compress – only new rows are compressed, existing rows are uncompressed

Page 6: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Data Compression – 11g New Features

• Fast RMAN Compression– Compresses the backup contents before writing them to disk or tape

– No extra decompression is required during recovery

– Performance improvement

• 40% faster backup compression versus Oracle Database 10g

• Improved Data pump Compression – Metadata compression available in 10g

– ALL, DATA_ONLY, METADATA_ONLY, NONE options in 11g

– No need to decompress before import

– Single step compression of both data and metadata

– Complete Data pump functionality available on compressed file

– suitable for fast, incremental daily backups

Page 7: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Partitioning

• Partitioning is the Key functionality for managing large volumes of data in Oracle Database– One logical object and Many physical objects– Many features directly and indirectly depend on

partitioning– No bitmap indexes or star transformation without

partitioning!– Partition exchange is key in providing longer

availability of the data warehouse and fast ETL

Page 8: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Oracle Partitioning - 10g and Before

Oracle 8 Range Partitioning

Oracle 8i Hash PartitioningRange-Hash Composite Partitioning

Oracle 9 List Partitioning

Oracle 9i Range-List Composite partitioning

Oracle 10g Global Hash Indexes

Oracle 10g R2 1M Partitions per Table

Page 9: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

• Composite Partitioning enhancements– Data is partitioned along two dimensions

– A distinct value pair for the two dimensions uniquely determines the target partitioning

– In 11g More types of Composite partitioning options

• Enhancements to Partition Management– Virtual column based partitioning

– REF partitioning

– Interval partitioning

– Partition advisor

Oracle 11g PartitioningEnhancements

Page 10: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

• Composite partitioning enhancements

Range List Hash

Range New 11g 9i 8i

List New 11g New 11g New 11g

200801200802

200802

200801ORDERS

RANGE-RANGEOrder Date by GL Post Date

USACAN

200802

200801ORDERS

LIST-RANGECountry by

GL Post Date

USACAN

Cement

GypsumORDERS

LIST-LISTCountry by

Product Type

Oracle Partitioning – 11g Enhancements

Page 11: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

• Data is coming by Invoice date and Queries are by GL_date• ETL team want to partition by Invoice_date to use Partition exchange loading and Reporting team wants to Partition by GL_Date • For the records the Invoice_date will be in one moth and GL_date in another

Range- Range Composite Partitioning

Invoice_date

... ...

... ...

... ...

...

...

...

...

Jan 2008 Feb 2008 Jun 2008

Jan 2008

Feb 2008

May 2008

Mar 2008

GL_date

Page 12: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Range- Range Composite Partitioning

Jan 2008

... ...

Feb 2008 Mar 2006 Jun 2008

... ...

... ...

...

...

...

...

Jan 2008

Feb 2008

May 2008

Mar 2008

• All records with GL_DATE in March 2008

Invoice_date

GL_DATE

• In 10g and before Range partition on one date column and Hash on other• DBAs use other work around to address the issue• In 11g the DBA can make both ETL and Reporting Teams happy

Page 13: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Virtual Columns

Oracle Database 11g introduces virtual columns– purely virtual, meta-data only– Do not take storage space

• Virtual Columns are ‘Column Expressions’– Derived from other columns in the table– Deterministic functions– Constants

• Treated as real columns – Can compute statistics. – Can create Index– Eligible as partitioning key

• Improved performance

Page 14: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Virtual Columns

• Virtual Columns can be added using CREATE TABLE or ALTER TABLE commands– Eg

CREATE TABLE Transaction

(Accunt_number number(18) not null,

Account_name varchar2(50) not null,

....................

....................

Bank_branch number(2) generated always as (to_number(substr(to_char(Account_number),1,3)))

partition by list (Bank_branch)

Page 15: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Interval Partitioning

• Adding Range Partitions when needed is a challenge for DBAs– New partitions must be created on-time for new data– alter table <tablename> split partition MAX_PARTITION at

(TO_DATE(' 2008-05-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN')) into (partition P200804, partition MAX_PARTITION);

• “Managing Partitioned Tables While You Sleep” by David Burns is the most downloaded article from IOUG “knowledge library” and it is in “top 5 most downloaded articles” for years

• Automating partition management was a request from DBAs for years and Interval partitioning finally does that

Page 16: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Interval Partitioning

• Interval Partitioning• This is an extension to range partitioning

• Automatically create equi-sized range partitions

• Support Composite Partitioning

• Interval, Interval-list, Interval-hash, Interval-range

• Partitions are created as metadata information – Start partition is created

– Other partitions are created when data arrives

• Segments are allocated as soon as new data arrives– No need to add new partitions

– Local indexes are created and maintained as well

Page 17: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Interval Partitioning

Table SALES_ORDER

Jan 2008

... ...

Feb 2008 Mar 2008 Jun 2008 Oct 2009 Nov 2009

...

First segment is created

CREATE TABLE Sales_Order (order_date DATE, ...)

PARTITON BY RANGE (order_date)

INTERVAL(NUMTOYMINTERVAL(1,'month')

(PARTITION p_first VALUES LESS THAN ('01-FEB-2008');

New segment is automatically allocated when data arrives

INSERT INTO Sales_Order (order_date,...) VALUES ('02-JUN-2008',...);

Page 18: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

REF PartitioningPartitioning in “Relational Database” • If a Table is partitioned and a child table of that table is not

partitioned or partitioned by a different column, then that can create issues for maintenance– How do you truncate an old partition from the parent table when

associated records exist in the child table?– Deleting records from the child table will be a maintenance issue

Partitioned Parent Table Non-Partitioned Child Table

Child Table Partitioned by different partition strategy

Page 19: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Partitioning in “Relational Database”

• This issue is not very common in star-schema design following Kimball approach. The fact tables will not be associated with another fact table.

• This is an issue in “relational” data warehouse designs following Inmon’s approach

• DBAs generally try to solve this issue by partitioning the child table exactly like the parent table using the same partition key.– Many companies follow this as a standard or best practice– The child table will have the PK of the parent table as a Foreign

Key but the partition key may be missing• Redundant storage of Parent Table Partition Key• Redundant maintenance

Page 20: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

REF Partitioning

• Oracle Database 11g introduces REF Partitioning– There is no need for specifically following the

partition strategy of the parent table while designing child table

– Child table inherits the partitioning strategy of parent table through PK-FK relationship

– The partition key of the parent will not be stored redundantly in the child table

• Enhanced Performance and Manageability

Page 21: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Partitioning Advisor• Advises on Partitioning Strategy

– Advises on partitioning method, partitioning key and interval• Partition advise integrated with advises for indexes and MVs

SQL Advisor

SQL PlanTuning

SQLStructureAnalysis

AccessAnalysis

PartitionAnalysis

Oracle DB

Page 22: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Partition Merging – No change in Behavior• In Oracle 10g and before, when two partitions are merged,

data is actually moved. This is different from Partition exchange which is just a data dictionary update without any actual movement of data

• There is no change to the behavior of partition merging in 11g – still there is data movement

• SQL> drop table test_fct;Table dropped.

• SQL> create table test_fct 2 (col1 number) 3 partition by range (col1) 4 (partition P1 values less than (200), 5 partition P2 values less than (300), 6 partition P3 values less than (400));

Table created.

Page 23: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Partition Merging in 11g• SQL> insert into test_fct values (100);

1 row created.

• SQL> insert into test_fct values (201);

1 row created.

• SQL> insert into test_fct values (305);

1 row created.

• SQL> commit;

Commit complete.

Page 24: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Partition Merging in 11g

• SQL> select subobject_name,object_id, DATA_OBJECT_ID

2 from user_objects

3 where object_name = 'TEST_FCT'

4 and subobject_name is not null;

SUBOBJECT_NAME OBJECT_ID DATA_OBJECT_ID

------------------------------ ---------- --------------

P1 69855 69855

P2 69856 69856

P3 69857 69857

• SQL> alter table TEST_FCT merge partitions p1,p2 into partition p2;

Table altered.

Page 25: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Partition Merging

• SQL> select subobject_name,object_id, DATA_OBJECT_ID

2 from user_objects

3 where object_name = 'TEST_FCT'

4 and subobject_name is not null;

SUBOBJECT_NAME OBJECT_ID DATA_OBJECT_ID

------------------------------ ---------- --------------

P2 69856 69858

P3 69857 69857

• A new data object id is assigned to the new partition because a new segment is created for it

• No change from 10g behavior

Page 26: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Pivot•A great feature for data Warehousing

•Common Query requirement to display the data for current month, previous month and same month last year as one record

•These values will be stored as different records in the database

•The challenge of converting rows into columns in the report

•Developers used to do this using workarounds

•Pivot function will do the trick

•Pivot makes this coding lot easier

Page 27: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Optimizer Statistics• Oracle Database 11g enhancements improve Optimizer

Statistics– Oracle Database 11g makes the collection of optimizer statistics

faster – The computed optimizer statistics are more thorough, providing

better information to the CBO by correlating statistics, such as Number of Distinct Values (NDV) and histograms, on multiple columns.

– Oracle Database 11g makes gathering statistics safer. Newly generated statistics can remain in a private ‘statistics workspace’ until they can be validated and published for general use

– In Oracle 11g we can set individual thresholds on table level with set individual thresholds on table level with DBMS_STATS. DBMS_STATS. dba_tab_stat_prefsdba_tab_stat_prefs

– New data dictionary view New data dictionary view dba_tab_stat_prefsdba_tab_stat_prefs to view these these thresholds.thresholds.

– Extended statistics on expressions can be computed in 11gexec

DBMS_STATS.GATHER_TABLE_STATS(’schema’, ‘Tablename’, METHOD_OPT => ‘for all columns size 1 columns (upper(last_name)) size 5‘)

Page 28: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Cube Organized Materialized Views• The 11g introduces Cube Organized The 11g introduces Cube Organized

Materialized Views Materialized Views • This new feature permits standard SQL This new feature permits standard SQL

queries (such as those generated by non-queries (such as those generated by non-OLAP aware reporting tools) to access OLAP OLAP aware reporting tools) to access OLAP cube data through query rewrite cube data through query rewrite

• Cube Organized Materialized Views had Cube Organized Materialized Views had many advantages compared to conventional many advantages compared to conventional MVsMVs– A Cube can replace hundreds of MVsA Cube can replace hundreds of MVs

Page 29: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

AQ&

Page 30: Oracle 11g Data Warehousing Vincent Chazhoor Data Warehouse Architect Lafarge North America Vincent_chazhoor@hotmail.com IOUG collaborate 08 Paper # 204

Thank You

• Please complete the evaluation form

– Speaker Name : Vincent Chazhoor – Session name : Oracle 11g Data Warehousing – Session# : 204