The Best of Both Worlds: In-Memory Column Store and
Advanced Index Compression in Oracle Database 12.1.0.2
My Credentials
• 30+ years of database-centric IT experience
• Oracle DBA since 2001
• Oracle 9i, 10g, 11g OCP
• Oracle ACE Director
• 100+ articles on databasejournal.com and ioug.org
• Teach core Oracle DBA courses (G/I+RAC, Exadata, Performance Tuning, Data Guard)
• Regular speaker at Oracle OpenWorld, IOUG COLLABORATE, and OTN ACE Tours
• Oracle-centric blog (Generally, It Depends)
Coming Soon To a Bookstore Near You…
Coming in June 2015 from Oracle Press: Oracle Database Upgrade, Migration & Transformation Tips & Techniques
• Covers everything you need to know to upgrade, migrate, and transform any Oracle 10g or 11g database to Oracle 12c
• Discusses strategy and tactics of planning Oracle migration, transformation, and upgrade projects
• Explores latest transformation features:
  • Recovery Manager (RMAN)
  • Oracle GoldenGate
  • Cross-Platform Transportable Tablespaces
  • Cross-Platform Transport (CPT)
  • Full Transportable Export (FTE)
• Includes detailed sample code
Our Agenda
• In-Memory Column Store (IMCS)
  • Architecture and terminology
  • Object prioritization
  • Leveraging In-Memory Compression
  • Improving query performance
• In-Memory Joins (IMJ)
• In-Memory Aggregation (IMA)
  • Architecture, advantages and detection
  • Improving star schema query performance
• Advanced Index Compression (AIC)
• Q+A
The Changing Nature of IT
Three major trends are driving the IT “back office”:
1. Enterprise platforms are ubiquitous
   • Powerful machines
   • Huge core count
   • Parallelism
   • SIMD
2. Where are we storing our hottest data?
   • Disk? Much too slow!
   • Flash? Still too slow!
   • Ohhhh! In DRAM!
3. True in-memory computing
In-Memory Column Store
IMCS: Turning Sideways For Better Performance
ord# part# suppl# line# qty extdamt rtn sts shipdt comtdt rcptdt
12389 103 19 1 19 190.00 N A 2014-01-01 2014-01-02 2014-01-05
12389 987 22 2 48 960.00 N A 2014-01-01 2014-01-02 2014-01-04
12389 623 23 3 10 200.00 N A 2014-01-01 2014-01-02 2014-01-05
12389 103 19 4 5 100.00 N A 2014-01-02 2014-01-04 2014-01-05
12389 103 19 5 17 51.00 N A 2014-01-02 2014-01-04 2014-01-05
12389 623 23 6 5 190.00 Y I 2014-01-02 2014-01-03 2014-01-05
12389 623 23 7 1 190.00 N A 2014-01-05 2014-01-05 2014-01-14
12389 109 22 8 34 68.50 Y P 2014-01-05 2014-01-05 2014-01-08
ord# 12389 12389 12389 12389 12389 12389 12389 12389
part# 103 987 623 103 103 623 623 109
suppl# 19 22 23 19 19 23 23 22
line# 1 2 3 4 5 6 7 8
qty 19 48 10 5 17 5 1 34
extdamt 190.00 960.00 200.00 100.00 51.00 190.00 190.00 68.50
rtn N N N N N Y N Y
sts A A A A A I A P
shipdt 2014-01-01 2014-01-01 2014-01-01 2014-01-02 2014-01-02 2014-01-02 2014-01-05 2014-01-05
comtdt 2014-01-02 2014-01-02 2014-01-02 2014-01-04 2014-01-04 2014-01-03 2014-01-05 2014-01-05
rcptdt 2014-01-05 2014-01-04 2014-01-05 2014-01-05 2014-01-05 2014-01-05 2014-01-14 2014-01-08
Pre-12.1.0.2: Row-major storage (ORGANIZATION HEAP)
Row-major storage works great for single-row access, especially DML …
INSERT INTO tpch.h_lineitem VALUES (12389, 109, 22 ...);
UPDATE tpch.h_lineitem SET rtn = 'Y' WHERE ord# = 12389 AND ...;
… but can seriously reduce query performance when just a few column values must be accessed:
SELECT COUNT(*) FROM tpch.h_lineitem WHERE rtn <> 'N' AND sts <> 'A';
12.1.0.2: Columnar Storage
By simply turning the table structure sideways to the left, the same query now only has to scan two columns.
This columnar representation results in far fewer I/Os than row-major format. Reading only the needed columns and ignoring the rest is called columnar projection.
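The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration of columnar projection, not Oracle's implementation: the `rows` list holds four of the sample columns from the slide, and the function names and the "values read" counter are assumptions made for the example.

```python
# Illustrative sketch only: four of the slide's sample columns, kept row-major.
COLUMN_NAMES = ("ord", "line", "rtn", "sts")
rows = [
    (12389, 1, "N", "A"),
    (12389, 2, "N", "A"),
    (12389, 3, "N", "A"),
    (12389, 4, "N", "A"),
    (12389, 5, "N", "A"),
    (12389, 6, "Y", "I"),
    (12389, 7, "N", "A"),
    (12389, 8, "Y", "P"),
]

def count_row_major(rows):
    """Row-major scan: every value of every row is touched."""
    values_read = 0
    matches = 0
    for row in rows:
        values_read += len(row)
        _, _, rtn, sts = row
        if rtn != "N" and sts != "A":
            matches += 1
    return matches, values_read

def to_columns(rows):
    """Turn the table sideways: one list per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(COLUMN_NAMES)}

def count_columnar(columns):
    """Columnar scan: only the two columns in the predicate are read."""
    rtn, sts = columns["rtn"], columns["sts"]
    values_read = len(rtn) + len(sts)
    matches = sum(1 for r, s in zip(rtn, sts) if r != "N" and s != "A")
    return matches, values_read

row_result = count_row_major(rows)              # (matches, values read)
col_result = count_columnar(to_columns(rows))   # same matches, fewer values read
```

Both scans find the same two qualifying rows, but the columnar scan reads half as many values here; with the full eleven-column table the gap would be wider.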
In-Memory Column Store (IMCS)
IMCS candidates: Tables and table partitions
Objects can be:
• Stored in dual format, or only in memory
• Prioritized to ensure most benefit goes to worthiest objects
• Compressed to conserve IMCS resources
Many indexes can often be eliminated:
• Columnar format often much more efficient than indexed search
• Speedier OLTP processing
IMCS: A Simple Example of How It Works
[Diagram components: User; SGA; Database Buffer Cache (DBC); In-Memory Column Store; Transaction Journal; DBWn; IMCO; SMCO; Wnnn; Physical Storage (+WARM: ADO_WARM_DATA, ADO_WARM_IDX)]
1. User issues query against IMCS candidate table
2. If there’s enough space in IMCS, rows are populated in columnar format and compressed
3. Data is then uncompressed and retrieved from IMCS …
4. … else, data is retrieved from DBC …
5. … else, unpopulated data is retrieved from disk
IMCS: How is Columnar Data Stored?
Data in columnar format: TPCH.H_LINEITEM
12389:1; 12389:2; 12389:3; 12389:4; 12389:5; 12389:6; 12389:7; 12389:8 |
103:1; 987:2; 623:3; 103:4; 103:5; 623:6; 623:7; 109:8 |
19:1; 22:2; 23:3; 19:4; 19:5; 23:6; 23:7; 22:8 |
1:1; 2:2; 3:3; 4:4; 5:5; 6:6; 7:7; 8:8 |
19:1; 48:2; 10:3; 5:4; 17:5; 5:6; 1:7; 34:8 |
190:1; 960:2; 200:3; 100:4; 51:5; 190:6; 190:7; 68.5:8 |
N:1; N:2; N:3; N:4; N:5; Y:6; N:7; Y:8 |
A:1; A:2; A:3; A:4; A:5; I:6; A:7; P:8
In-Memory Column Units
[Diagram: two In-Memory Column Units, each holding columns C1 through C8]
12.1.0.2 Evaluation Specifications
Demonstrations performed against Oracle 12c Database Release 12.1.0.2 using a virtualized environment:
• Oracle VirtualBox 4.3.18
• 8 virtual CPU cores
• 10GB virtual memory
Sample database schema
Custom objects:
• AP.EST_SALES (680K rows, 200+ columns)
• SH.CUST_PREFS (50K rows, 40+ columns)
TPC-H database:
Entity            Rows
Regions           5
Nations           25
Suppliers         40,000
Customers         600,000
Parts             800,000
Part Suppliers    3,200,000
Orders            6,000,000
Item List         24,000,000
Activating IMCS
IMCS activation requires database instance “bounce”:
SQL> ALTER SYSTEM SET inmemory_size = 4G SCOPE=SPFILE;
SQL> ALTER SYSTEM SET inmemory_clause_default =
       'INMEMORY MEMCOMPRESS FOR QUERY LOW PRIORITY LOW' SCOPE=BOTH;
SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.
Total System Global Area 7516192768 bytes
Fixed Size                  3728304 bytes
Variable Size             838863952 bytes
Database Buffers         2365587456 bytes
Redo Buffers               13045760 bytes
In-Memory Area           4294967296 bytes
Database mounted.
Database opened.
SQL> show parameter inmemory
NAME                                         VALUE
-------------------------------------------- --------------------
inmemory_clause_default                      INMEMORY MEMCOMPRESS FOR QUERY LOW PRIORITY LOW
inmemory_force                               DEFAULT
inmemory_max_populate_servers                1
inmemory_query                               ENABLE
inmemory_size                                4G
inmemory_trickle_repopulate_servers_percent  1
optimizer_inmemory_aware                     TRUE
IMCS: Influencing Population Priority
Objects can be granted a population priority within the IMCS:
Priority    Populated As Soon As:
CRITICAL    The database instance starts
HIGH        All CRITICAL objects have been populated
MEDIUM      All CRITICAL and HIGH objects are populated
LOW         All CRITICAL, HIGH, and MEDIUM objects are populated
NONE        The object is first scanned (default)
This ensures judicious usage of IMCS resources
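As a rough illustration of the ordering in the table above, the sketch below queues objects by priority. It is a toy model, not Oracle's population logic: the priorities assigned to the objects are hypothetical, objects with PRIORITY NONE are simply left out of the startup queue, and ties within a priority are broken alphabetically purely to make the output deterministic.

```python
PRIORITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]

def population_queue(objects):
    """Order object names by priority; PRIORITY NONE objects are not queued."""
    rank = {priority: i for i, priority in enumerate(PRIORITY_ORDER)}
    queued = [(rank[prio], name) for name, prio in objects.items() if prio in rank]
    # Sorting the (rank, name) pairs orders by priority first, then by name.
    return [name for _, name in sorted(queued)]

# Hypothetical priority assignments for tables named elsewhere in this deck.
objects = {
    "ap.invoices": "HIGH",
    "ap.vendors": "CRITICAL",
    "sh.cust_prefs": "NONE",
    "ap.est_sales": "LOW",
    "tpch.h_lineitem": "CRITICAL",
}
queue = population_queue(objects)
```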
Implementing IMCS For Tablespaces
IMCS can be applied automatically for all objects in a tablespace:
During Tablespace Creation:
CREATE TABLESPACE hot_imcs_data DATAFILE '+DATA' SIZE 256M DEFAULT INMEMORY MEMCOMPRESS FOR QUERY HIGH;
Existing Tablespace:
ALTER TABLESPACE ado_hot_data DEFAULT INMEMORY MEMCOMPRESS FOR DML PRIORITY CRITICAL;
Implementing IMCS For Tables
A table can be added to IMCS with or without using predefined, default compression attributes:
Place table into IMCS using default defined compression:
ALTER TABLE ap.vendors INMEMORY;
Place table into IMCS with goal of reducing space used within memory (instead of query efficiency):
ALTER TABLE ap.invoices INMEMORY MEMCOMPRESS FOR CAPACITY LOW;
Implementing IMCS For Individual Columns
Only certain columns of a table can be added to IMCS, but the table itself must be marked INMEMORY first. Otherwise:
SQL> ALTER TABLE ap.randomized_sorted
       INMEMORY MEMCOMPRESS FOR CAPACITY LOW (key_id, key_date, key_sts)
       NO INMEMORY (key_desc);
*
ERROR at line 1:
ORA-64361: column INMEMORY clause may only be specified for an inmemory table
Once the table is in IMCS, the column-level clause succeeds:
SQL> ALTER TABLE ap.randomized_sorted INMEMORY;
Table altered.
SQL> ALTER TABLE ap.randomized_sorted
       INMEMORY MEMCOMPRESS FOR CAPACITY LOW (key_id, key_date, key_sts)
       NO INMEMORY (key_desc);
Table altered.
Specification rules of precedence: Columns beat tables … which beat tablespaces … which beat initialization parameters
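The precedence rule reads naturally as a "first setting found wins" lookup. The sketch below is a minimal model of that rule only; the function name and the use of `None` for "not specified at this level" are assumptions for illustration, not anything Oracle exposes.

```python
def effective_inmemory_clause(column=None, table=None, tablespace=None, init_param=None):
    """Return the most specific setting: column, then table, then tablespace,
    then initialization parameter. None means 'not specified at this level'."""
    for setting in (column, table, tablespace, init_param):
        if setting is not None:
            return setting
    return None

# A column-level clause wins over everything else.
col_wins = effective_inmemory_clause(
    column="NO INMEMORY",
    table="INMEMORY MEMCOMPRESS FOR CAPACITY LOW",
    init_param="INMEMORY MEMCOMPRESS FOR QUERY LOW PRIORITY LOW",
)

# With nothing set on the column or table, the tablespace default applies.
tbs_wins = effective_inmemory_clause(
    tablespace="INMEMORY MEMCOMPRESS FOR QUERY HIGH",
    init_param="INMEMORY MEMCOMPRESS FOR QUERY LOW PRIORITY LOW",
)
```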
In-Memory Compression
IMCS: In-Memory Compression
IMCS offers five different compression levels:
Compression Method    Works Best For
DML                   Heavily-updated data
QUERY LOW             Often-queried data
QUERY HIGH            Infrequently-queried data
CAPACITY LOW          Archived, but queried occasionally
CAPACITY HIGH         Archived, but rarely queried
… but actual compression can vary dramatically depending on datatypes, NULLs, and NDVs per column
Estimating In-Memory Compression
GET_COMPRESSION_RATIO procedure of package DBMS_COMPRESSION can provide estimates of potential object compression ratios:
. . .
DBMS_COMPRESSION.GET_COMPRESSION_RATIO (
   scratchtbsname => 'ADO_COLD_DATA'
  ,ownname        => 'AP'
  ,objname        => 'EST_SALES'
  ,subobjname     => NULL
  ,comptype       => DBMS_COMPRESSION.COMP_INMEMORY_QUERY_HIGH
  ,blkcnt_cmp     => comp_blks
  ,blkcnt_uncmp   => unco_blks
  ,row_cmp        => comp_rspb
  ,row_uncmp      => unco_rspb
  ,cmp_ratio      => comp_ratio
  ,comptype_str   => comp_type
  ,subset_numrows => DBMS_COMPRESSION.COMP_RATIO_ALLROWS);
. . .
Table: SH.CUST_PREFS (50,000 rows, 37 columns, mainly VARCHAR2)
Compression Method    Estimated Compression Ratio    Actual Compression Ratio
DML                   1.10 : 1                       1.32 : 1
QUERY LOW             1.40 : 1                       1.75 : 1
QUERY HIGH            1.90 : 1                       2.25 : 1
CAPACITY LOW          2.60 : 1                       3.12 : 1
CAPACITY HIGH         3.30 : 1                       3.88 : 1
Table: AP.EST_SALES (680K rows, 190+ columns, mainly NUMBER)
Compression Method    Estimated Compression Ratio    Actual Compression Ratio
DML                   3.00 : 1                       1.28 : 1
QUERY LOW             2.90 : 1                       1.86 : 1
QUERY HIGH            2.90 : 1                       1.91 : 1
CAPACITY LOW          3.70 : 1                       2.24 : 1
CAPACITY HIGH         4.30 : 1                       2.62 : 1
Table: TPCH.H_LINEITEM (24M rows, 8 columns, mixed)
Compression Method    Estimated Compression Ratio    Actual Compression Ratio
DML                   1.20 : 1                       1.23 : 1
QUERY LOW             1.90 : 1                       1.94 : 1
QUERY HIGH            2.30 : 1                       2.36 : 1
CAPACITY LOW          3.60 : 1                       3.75 : 1
CAPACITY HIGH         4.80 : 1                       4.90 : 1
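To see how far the estimates drift from what was actually achieved, the figures from the three tables can be compared directly. The sketch below is just arithmetic on the slide's numbers: the helper names are made up for the example, and the AP.EST_SALES DML estimate is read as 3.00, which is an assumption about a garbled value in the source.

```python
# (estimated, actual) compression ratios copied from the slide tables.
# Assumption: the AP.EST_SALES DML estimate, garbled in the source, is 3.00.
ratios = {
    "SH.CUST_PREFS": {
        "DML": (1.10, 1.32), "QUERY LOW": (1.40, 1.75), "QUERY HIGH": (1.90, 2.25),
        "CAPACITY LOW": (2.60, 3.12), "CAPACITY HIGH": (3.30, 3.88),
    },
    "AP.EST_SALES": {
        "DML": (3.00, 1.28), "QUERY LOW": (2.90, 1.86), "QUERY HIGH": (2.90, 1.91),
        "CAPACITY LOW": (3.70, 2.24), "CAPACITY HIGH": (4.30, 2.62),
    },
    "TPCH.H_LINEITEM": {
        "DML": (1.20, 1.23), "QUERY LOW": (1.90, 1.94), "QUERY HIGH": (2.30, 2.36),
        "CAPACITY LOW": (3.60, 3.75), "CAPACITY HIGH": (4.80, 4.90),
    },
}

def estimate_error_pct(estimated, actual):
    """Signed error of the estimate relative to the actual ratio, in percent."""
    return (estimated - actual) / actual * 100

def worst_estimate(methods):
    """Compression method whose estimate was furthest from the actual ratio."""
    return max(methods, key=lambda m: abs(estimate_error_pct(*methods[m])))

worst = {table: worst_estimate(methods) for table, methods in ratios.items()}
```

On these numbers the estimates were low for the VARCHAR2-heavy and mixed tables and high for the NUMBER-heavy table, which fits the caution that actual compression varies with datatypes, NULLs, and NDVs.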
IMCS and Improved Query Performance:Small Table
Simple query against SH.CUST_PREFS (16MB, 37 columns, mainly VARCHAR2):
SQL> SELECT active_ind
           ,COUNT(*)
           ,MIN(created_dtm)
           ,MAX(lstchgd_dtm)
       FROM sh.cust_prefs
      GROUP BY active_ind
      ORDER BY active_ind;
Baseline              Initial Population into DBC     After DBC Population
SH.CUST_PREFS         1.65s                           0.87s
Compression Method    Initial Population into IMCS    After IMCS Population
DML                   1.82s                           0.15s
QUERY LOW             1.44s                           0.15s
QUERY HIGH            1.34s                           0.10s
CAPACITY LOW          1.38s                           0.14s
CAPACITY HIGH         2.77s                           0.13s
Plan hash value: 2844705231
------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |     4 |    96 |    23   (9)| 00:00:01 |
|   1 |  SORT GROUP BY              |            |     4 |    96 |    23   (9)| 00:00:01 |
|   2 |   TABLE ACCESS INMEMORY FULL| CUST_PREFS | 50000 |  1171K|    21   (0)| 00:00:01 |
------------------------------------------------------------------------------------------
16.5X
IMCS and Improved Query Performance:Huge Fact Table, No Filtering
Unfiltered query against TPCH.H_LINEITEM (~ 24M rows, 12 columns, mixed datatypes):
SQL> SELECT l_linestatus
           ,l_returnflag
           ,SUM(l_quantity)
           ,SUM(l_extendedprice)
       FROM tpch.h_lineitem
      GROUP BY l_linestatus, l_returnflag
      ORDER BY l_linestatus, l_returnflag;
Baseline              Initial Population into DBC     After DBC Population
TPCH.H_LINEITEM       114.77s                         104.97s
Compression Method    Initial Population into IMCS    After IMCS Population
DML                   258.39s                         11.72s
QUERY LOW             224.91s                         9.64s
QUERY HIGH            206.56s                         12.87s
CAPACITY LOW          212.30s                         10.05s
CAPACITY HIGH         197.58s                         10.65s
Plan hash value: 2974235678
------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |     5 |    65 |  5441  (27)| 00:00:01 |
|   1 |  SORT GROUP BY              |            |     5 |    65 |  5441  (27)| 00:00:01 |
|   2 |   TABLE ACCESS INMEMORY FULL| H_LINEITEM |    23M|   297M|  4773  (17)| 00:00:01 |
------------------------------------------------------------------------------------------
10.9X
IMCS and Improved Query Performance:Huge Fact Table, In-Memory Filtering
Filtered query against TPCH.H_LINEITEM:
SQL> SELECT l_linestatus
           ,l_returnflag
           ,SUM(l_quantity)
           ,SUM(l_extendedprice)
       FROM tpch.h_lineitem
      WHERE l_linestatus <> 'O'
        AND l_returnflag IN ('A','R')
      GROUP BY l_linestatus, l_returnflag
      ORDER BY l_linestatus, l_returnflag;
Baseline              Initial Population into DBC     After DBC Population
TPCH.H_LINEITEM       130.78s                         112.45s
Compression Method    Initial Population into IMCS    After IMCS Population
DML                   175.80s                         10.10s
QUERY LOW             190.18s                         4.34s
QUERY HIGH            141.31s                         4.68s
CAPACITY LOW          226.45s                         7.60s
CAPACITY HIGH         203.09s                         6.07s
Plan hash value: 2974235678
------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |     3 |    39 |  4812  (16)| 00:00:01 |
|   1 |  SORT GROUP BY              |            |     3 |    39 |  4812  (16)| 00:00:01 |
|*  2 |   TABLE ACCESS INMEMORY FULL| H_LINEITEM |  5923K|    73M|  4660  (13)| 00:00:01 |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - inmemory("L_LINESTATUS"<>'O' AND ("L_RETURNFLAG"='A' OR "L_RETURNFLAG"='R'))
       filter("L_LINESTATUS"<>'O' AND ("L_RETURNFLAG"='A' OR "L_RETURNFLAG"='R'))
30.1X
IMCS and Improved Query Performance:Large Table, In-Memory Filtering
Filtered query against AP.EST_SALES (592MB, 190 columns, mostly NUMBER):
SQL> SELECT zipcode
           ,SUM(sls_2014q4) "2014_Q4"
           ,SUM(sls_2014q3) "2014_Q3"
           ,SUM(sls_1970q2) "1970_Q2"
           ,SUM(sls_1970q1) "1970_Q1"
       FROM ap.est_sales
      WHERE zipcode LIKE '606%'
      GROUP BY zipcode
      ORDER BY zipcode;
Baseline              Initial Population into DBC     After DBC Population
AP.EST_SALES          19.20s                          0.32s
Compression Method    Initial Population into IMCS    After IMCS Population
DML                   30.87s                          0.18s
QUERY LOW             19.26s                          0.17s
QUERY HIGH            41.68s                          0.14s
CAPACITY LOW          24.11s                          0.74s
CAPACITY HIGH         40.18s                          0.24s
Plan hash value: 2076845129
-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |    25 |   550 |   826   (6)| 00:00:01 |
|   1 |  SORT GROUP BY              |           |    25 |   550 |   826   (6)| 00:00:01 |
|*  2 |   TABLE ACCESS INMEMORY FULL| EST_SALES |   394 |  8668 |   825   (6)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - inmemory("ZIPCODE" LIKE '606%')
       filter("ZIPCODE" LIKE '606%')
137.1X
IMCS: What Happens When Data Changes?
[Diagram components: User; SGA; Database Buffer Cache (DBC); In-Memory Column Store; Transaction Journal; DBWn; IMCO; SMCO; Wnnn; Physical Storage (+WARM: ADO_WARM_DATA, ADO_WARM_IDX)]
1. User issues DML against table already in IMCS
2. Modified data retained in IMCS transaction journal
3. When it becomes stale, journal is flushed by IMCO, SMCO and Wnnn
4. DBW0 writes modified data from DBC …
5. … but queries use transaction journal entries for read consistency
IMCS: DML and Instance Statistics
IMCS-specific instance statistics show DML impact:
SQL> UPDATE ap.randomized_sorted
        SET key_sts = 50
      WHERE key_sts <> 50
        AND key_id BETWEEN 1100000 and 1300000;
60087 rows updated.
SQL> COMMIT;
Committed.
SQL> SELECT A.name
           ,B.value
       FROM v$sysstat A
           ,v$mystat B
      WHERE A.statistic# = B.statistic#
        AND (A.name LIKE '%IM%' OR A.name LIKE '%In memory%')
        AND B.value > 0
      ORDER BY A.name;
TTITLE OFF
IMCS: Session-Level Execution Statistics (from V$MYSTAT)
Statistic Name                                                          Value
---------------------------------------------------------------------- --------------
IM repopulate CUs requested                                                         1
IM scan CUs columns accessed                                                       12
IM scan CUs columns theoretical max                                                92
IM scan CUs memcompress for query low                                              23
. . .
IM space private journal bytes allocated                                     13107200
IM space private journal bytes freed                                         13107200
IM space private journal extents allocated                                        200
IM space private journal extents freed                                            200
IM space private journal segments allocated                                         1
IM space private journal segments freed                                             1
IM transactions                                                                     1
IM transactions rows invalidated                                                60087
IM transactions rows journaled                                                  60087
IMU Flushes                                                                         1
IMU Redo allocation size                                                        31852
IMU undo allocation size                                                        63992
session logical reads - IM                                                      63823
table scan disk non-IMC rows gotten                                                64
table scans (IM)                                                                    3
In-Memory Joins and In-Memory Aggregation
In-Memory Scan
In-Memory Scan is analogous to Exadata Smart Scan:
• Employs columnar projection
• Leverages In-Memory Storage Indexes for IMCUs
  • Work much like storage indexes for Exadata … but in IMCS
  • Avoids scanning IMCUs where data is not present
• Leverages SIMD vector processing
  • SIMD: Single Instruction processing Multiple Data values
  • Enables scanning rates of billions of rows per second
Hints affecting behavior:
• +INMEMORY (enables IMCS-resident object scans)
• +INMEMORY_PRUNING (forces use of storage indexes)
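The idea behind In-Memory Storage Indexes can be sketched as a minimum and maximum value kept for each column unit, so that units which cannot contain the searched value are skipped. This is a simplified model under stated assumptions: the `ColumnUnit` class, the three-unit split of the sample part# values, and the equality-only predicate are all illustrative, not Oracle's internal structures.

```python
from dataclasses import dataclass

@dataclass
class ColumnUnit:
    """Toy stand-in for one column's data within an IMCU."""
    values: list

    def __post_init__(self):
        # The "storage index": min and max of the values held in this unit.
        self.min_value = min(self.values)
        self.max_value = max(self.values)

def scan_equal(units, target):
    """Return matching values and the number of units actually scanned."""
    matches = []
    units_scanned = 0
    for unit in units:
        if target < unit.min_value or target > unit.max_value:
            continue  # pruned: the value cannot be present in this unit
        units_scanned += 1
        matches.extend(v for v in unit.values if v == target)
    return matches, units_scanned

# Sample part# values from the slide, split into three hypothetical units.
units = [
    ColumnUnit([103, 109, 103]),
    ColumnUnit([623, 623, 623]),
    ColumnUnit([987, 623]),
]
found, scanned = scan_equal(units, 623)
```

The first unit is skipped without being read because 623 lies outside its 103 to 109 range; the other two are scanned.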
Bloom Filters (BFs)
Bloom Filters (BFs) have been used “under the covers” for faster joins between row sets since Oracle Database 10gR2
• BFs guarantee that a false negative never occurs
• However, BFs cannot guarantee against false positives, so BF processing always requires a final hash join
For a better understanding:
• http://en.wikipedia.org/wiki/Bloom_filter
• http://www.jasondavies.com/bloomfilter
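A small Bloom filter makes both guarantees concrete: every key that was added always tests positive, while a key that was never added may occasionally test positive too, which is why a final exact join is still needed. The sketch below is a generic textbook Bloom filter, not Oracle's implementation; the bit-array size, the number of hash functions, the use of SHA-256, and the sample keys are all arbitrary choices for the example.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits=64, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # an int used as a bit array

    def _positions(self, key):
        # Derive several bit positions per key by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))

def bloom_join(dim_keys, fact_rows):
    """Filter fact rows through the Bloom filter, then verify with an exact join."""
    bf = BloomFilter()
    for key in dim_keys:
        bf.add(key)
    candidates = [row for row in fact_rows if bf.might_contain(row[0])]
    dim_set = set(dim_keys)  # the final exact check removes any false positives
    return bf, [row for row in candidates if row[0] in dim_set]

dim_keys = [113, 115, 116]                       # hypothetical dimension keys
fact_rows = [(113, 10), (114, 5), (115, 2), (116, 4), (200, 9)]
bf, joined = bloom_join(dim_keys, fact_rows)
```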
In-Memory Join (IMJ)
In-Memory Joins leverage Bloom Filters (BF) for faster joins
• Queries using complex, multi-table joins can now leverage BFs
• Tables accessed serially now also candidates for BFs
• Easy to identify in EXPLAIN PLANs:
  • :BFnnnnnn row sources
  • JOIN FILTER CREATE and JOIN FILTER USE operations
  • SYS_OP_BLOOM_FILTER in filter predicates
• +PX_JOIN_FILTER hint may force optimizer to choose BFs
• Not all tables need to be resident in IMCS!
In-Memory Joins and Bloom Filters
A typical DW query employing a star join:
SELECT R.promo_id
      ,P.prod_id
      ,SUM(S.quantity_sold)
      ,SUM(s.amount_sold)
  FROM sh.sales S
      ,sh.promotions R
      ,sh.products P
 WHERE S.prod_id = P.prod_id
   AND S.promo_id = R.promo_id
   AND P.prod_id BETWEEN 60 AND 120
   AND R.promo_id > 99
 GROUP BY R.promo_id, P.prod_id
 ORDER BY R.promo_id, P.prod_id;
----------------------------------------------------------------------------------------------
| Id  | Operation                       | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |            |    29 |   696 |   127  (15)| 00:00:01 |
|   1 |  SORT GROUP BY                  |            |    29 |   696 |   127  (15)| 00:00:01 |
|*  2 |   HASH JOIN                     |            | 61549 |  1442K|   125  (14)| 00:00:01 |
|*  3 |    TABLE ACCESS INMEMORY FULL   | PROMOTIONS |   469 |  1876 |     4   (0)| 00:00:01 |
|*  4 |    HASH JOIN                    |            | 61597 |  1203K|   121  (15)| 00:00:01 |
|   5 |     JOIN FILTER CREATE          | :BF0000    |    34 |   136 |     1   (0)| 00:00:01 |
|*  6 |      TABLE ACCESS INMEMORY FULL | PRODUCTS   |    34 |   136 |     1   (0)| 00:00:01 |
|   7 |     JOIN FILTER USE             | :BF0000    |   130K|  2038K|   120  (14)| 00:00:01 |
|   8 |      PARTITION RANGE ALL        |            |   130K|  2038K|   120  (14)| 00:00:01 |
|*  9 |       TABLE ACCESS INMEMORY FULL| SALES      |   130K|  2038K|   120  (14)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("S"."PROMO_ID"="R"."PROMO_ID")
   3 - inmemory("R"."PROMO_ID">99)
       filter("R"."PROMO_ID">99)
   4 - access("S"."PROD_ID"="P"."PROD_ID")
   6 - inmemory("P"."PROD_ID">=60 AND "P"."PROD_ID"<=120)
       filter("P"."PROD_ID">=60 AND "P"."PROD_ID"<=120)
   9 - inmemory("S"."PROD_ID">=60 AND "S"."PROD_ID"<=120 AND "S"."PROMO_ID">99 AND
                SYS_OP_BLOOM_FILTER(:BF0000,"S"."PROD_ID"))
       filter("S"."PROD_ID">=60 AND "S"."PROD_ID"<=120 AND "S"."PROMO_ID">99 AND
              SYS_OP_BLOOM_FILTER(:BF0000,"S"."PROD_ID"))
In-Memory Aggregation (IMA)
In-Memory Aggregation (IMA) may provide an alternative to BFs
• Often an alternative when dimension tables are IMCS-resident
• Biggest yield: Analytical queries performing aggregation
• Aggregation performed within an In-Memory Accumulator, a multi-dimensional array stored in the PGA
• IMA leverages key vector structures
  • Key vectors built from values found within dimensions
  • Temporary tables stored in memory hold dimensions’ “payload” columns
  • BFs require final hash join to verify against any false positives …
  • … but key vectors never return a false positive!
IMA: Architecture and Concepts
Dimensions
PRODUCTS
prod_id   product_name         . . .
…
113       CD-R Mini Discs      …
114       Music CD-R           …
115       CD-RW, Pack of 10    …
116       CD-RW, Pack of 5     …
…
PROMOTIONS
promo_id  promo_name           . . .
…
99        Newspaper #99        …
100       Internet #14-100     …
…
Fact Table
SALES
prod_id   promo_id   qty_sold   amt_sold   . . .
…
113       99         10         620.00
113       99         10         990.00
113       99         3          279.00
115       99         2          198.00
116       100        4          404.00
116       100        7          707.00
…
1. Key Vectors are created for every join column …
2. … and then they are used to perform in-memory aggregation
[Diagram: key vectors KV0 and KV1 map join-column values to dense positions (0, 1 and 0, 1, 2) that index the In-Memory Accumulator held in the PGA; accumulator cells shown: 23, 2, 11]
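The two steps can be sketched with the sample rows from the slide. This is a conceptual model only: the function names are invented for the example, the key vectors are plain dictionaries that assign a dense position to every dimension key in the order listed, and the accumulator is a small two-dimensional list summing qty_sold. Oracle's actual key vector and accumulator structures are not described in this deck.

```python
def build_key_vector(dimension_keys):
    """Step 1: map each join-column value to a dense array position."""
    return {key: dense_id for dense_id, key in enumerate(dimension_keys)}

def aggregate(sales, kv_prod, kv_promo):
    """Step 2: sum qty_sold into an accumulator indexed by [promo][product]."""
    accumulator = [[0] * len(kv_prod) for _ in kv_promo]
    for prod_id, promo_id, qty_sold, _amt_sold in sales:
        p = kv_prod.get(prod_id)
        r = kv_promo.get(promo_id)
        if p is None or r is None:
            continue  # row does not join to a dimension key, so it is skipped
        accumulator[r][p] += qty_sold
    return accumulator

# Sample rows from the slide.
kv_prod = build_key_vector([113, 114, 115, 116])
kv_promo = build_key_vector([99, 100])
sales = [
    (113, 99, 10, 620.00),
    (113, 99, 10, 990.00),
    (113, 99, 3, 279.00),
    (115, 99, 2, 198.00),
    (116, 100, 4, 404.00),
    (116, 100, 7, 707.00),
]
acc = aggregate(sales, kv_prod, kv_promo)
```

The non-zero cells come out as 23, 2, and 11, matching the accumulator values shown on the slide. Because each lookup is an exact key match, no verification join is needed afterwards.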
IMA: Prerequisites
Optimizer should choose IMA by default whenever:
• All tables are stored within IMCS
• Dimension tables are small (<10% of fact table size)
• Fact table surpasses default row count (10M)
IMA behavior can also be forcibly requested via:
• +VECTOR_TRANSFORM hint
• _always_vector_transformation = TRUE
IMA and Vector Transformation
Forcing vector transformation via an optimizer hint:
SQL> SELECT /*+ VECTOR_TRANSFORM */
            R.promo_id
           ,P.prod_id
           ,SUM(S.quantity_sold)
           ,SUM(s.amount_sold)
       FROM sh.sales S
           ,sh.promotions R
           ,sh.products P
      WHERE S.prod_id = P.prod_id
        AND S.promo_id = R.promo_id
        AND P.prod_id BETWEEN 60 AND 120
        AND R.promo_id > 99
      GROUP BY R.promo_id, P.prod_id
      ORDER BY R.promo_id, P.prod_id;
------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                           |  2654 |   176K|   483  (77)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION            |                           |       |       |            |          |
|   2 |   LOAD AS SELECT                      | SYS_TEMP_0FD9D67F2_BB592E |       |       |            |          |
|   3 |    VECTOR GROUP BY                    |                           |   469 |  1876 |     5  (20)| 00:00:01 |
|   4 |     KEY VECTOR CREATE BUFFERED        | :KV0000                   |       |       |            |          |
|*  5 |      TABLE ACCESS INMEMORY FULL       | PROMOTIONS                |   469 |  1876 |     4   (0)| 00:00:01 |
|   6 |   LOAD AS SELECT                      | SYS_TEMP_0FD9D67F3_BB592E |       |       |            |          |
|   7 |    VECTOR GROUP BY                    |                           |     8 |    32 |     2  (50)| 00:00:01 |
|   8 |     KEY VECTOR CREATE BUFFERED        | :KV0001                   |       |       |            |          |
|*  9 |      TABLE ACCESS INMEMORY FULL       | PRODUCTS                  |     8 |    32 |     1   (0)| 00:00:01 |
|  10 |   SORT GROUP BY                       |                           |  2654 |   176K|   476  (78)| 00:00:01 |
|* 11 |    HASH JOIN                          |                           |  2654 |   176K|   475  (78)| 00:00:01 |
|  12 |     TABLE ACCESS FULL                 | SYS_TEMP_0FD9D67F2_BB592E |   469 |  1876 |     2   (0)| 00:00:01 |
|* 13 |     HASH JOIN                         |                           |  2654 |   165K|   473  (78)| 00:00:01 |
|  14 |      TABLE ACCESS FULL                | SYS_TEMP_0FD9D67F3_BB592E |     8 |    32 |     2   (0)| 00:00:01 |
|  15 |      VIEW                             | VW_VT_0737CF93            |  2654 |   155K|   471  (78)| 00:00:01 |
|  16 |       VECTOR GROUP BY                 |                           |  2654 | 42464 |   471  (78)| 00:00:01 |
|  17 |        HASH GROUP BY                  |                           |  2654 | 42464 |   471  (78)| 00:00:01 |
|  18 |         KEY VECTOR USE                | :KV0000                   |       |       |            |          |
|  19 |          KEY VECTOR USE               | :KV0001                   |       |       |            |          |
|  20 |           PARTITION RANGE ALL         |                           |   130K|  2038K|   120  (14)| 00:00:01 |
|* 21 |            TABLE ACCESS INMEMORY FULL | SALES                     |   130K|  2038K|   120  (14)| 00:00:01 |
------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   5 - inmemory("R"."PROMO_ID">99)
       filter("R"."PROMO_ID">99)
   9 - inmemory("P"."PROD_ID">=60 AND "P"."PROD_ID"<=120)
       filter("P"."PROD_ID">=60 AND "P"."PROD_ID"<=120)
  11 - access("ITEM_7"=INTERNAL_FUNCTION("C0") AND "ITEM_8"="C2")
  13 - access("ITEM_9"=INTERNAL_FUNCTION("C0") AND "ITEM_10"="C2")
  21 - inmemory("S"."PROD_ID">=60 AND "S"."PROD_ID"<=120 AND "S"."PROMO_ID">99 AND
                SYS_OP_KEY_VECTOR_FILTER("S"."PROD_ID",:KV0001) AND
                SYS_OP_KEY_VECTOR_FILTER("S"."PROMO_ID",:KV0000))
       filter("S"."PROD_ID">=60 AND "S"."PROD_ID"<=120 AND "S"."PROMO_ID">99 AND
              SYS_OP_KEY_VECTOR_FILTER("S"."PROD_ID",:KV0001) AND
              SYS_OP_KEY_VECTOR_FILTER("S"."PROMO_ID",:KV0000))
Note
-----
   ...
   - vector transformation used for this statement
Influencing IMA Behaviors
Hints affecting IMA behavior:
• +VECTOR_TRANSFORM (forces IMA)
• +VECTOR_TRANSFORM_FACT (identifies fact table)
• +VECTOR_TRANSFORM_DIMS (identifies dimensions)
Parameters influencing IMA default behavior:
• _optimizer_vector_min_fact_rows (Default: 10M)
• _optimizer_vector_transformation (Default: TRUE)
• _optimizer_vector_cost_adj (Default: 100; Range: 1–10000)
See also: MOS #1935305.1, In-Memory Aggregation a New Feature in 12.1.0.2
Advanced Index Compression
Not All Indexes Are Evil!
Indexes still do have a purpose:
• Great for enforcing uniqueness and referential integrity
• Useful for returning data in sorted order
• Compressed multi-column indexes save space …
  • Repeating values in leading column saved as prefix in each index block
  • More row pieces stored in same disk space
… but there are disadvantages too:
• DML on indexes tends to “scatter” clustered values
• Bulk DELETEs can leave an index block almost empty
• Compressed indexes ineligible for Exadata Smart Scan
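The space saving from prefix compression can be sketched as storing each distinct leading-column value once per block and letting every row piece point at it. The model below is a simplification for illustration: the function names, the tuple layout, and the sample (zipcode, city, state) values are hypothetical, and a real index block stores considerably more than this.

```python
def compress_block(entries, prefix_cols=1):
    """Store each distinct prefix once; each row keeps a prefix slot plus its suffix."""
    prefixes = []
    slot_of = {}
    rows = []
    for entry in entries:
        prefix, suffix = entry[:prefix_cols], entry[prefix_cols:]
        if prefix not in slot_of:
            slot_of[prefix] = len(prefixes)
            prefixes.append(prefix)
        rows.append((slot_of[prefix], suffix))
    return prefixes, rows

def decompress_block(prefixes, rows):
    """Rebuild the original index entries from the prefix table and row pieces."""
    return [prefixes[slot] + suffix for slot, suffix in rows]

# Hypothetical entries for an index on (zipcode, city, state).
entries = [
    ("60601", "CHICAGO", "IL"),
    ("60601", "CHICAGO", "IL"),
    ("60601", "CHICAGO", "IL"),
    ("60602", "CHICAGO", "IL"),
    ("60602", "CHICAGO", "IL"),
]
prefixes, rows = compress_block(entries, prefix_cols=1)
restored = decompress_block(prefixes, rows)
```

Five entries share two distinct leading values, so the zipcode is stored twice rather than five times. Choosing `prefix_cols` well is the hard part in practice.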
Advanced Index Compression (AIC)
AIC increases efficiency of most indexes:
• Supports single-column as well as multi-column indexes
• Unnecessary block splits may be avoided when new index row pieces are added
• AIC may benefit frequently-updated partitioned indexes
AIC does not support:
• Bitmap indexes
• Index-organized tables (IOTs)
• Single-column unique indexes
AIC: Implementing ADVANCED LOW Compression
AIC can be implemented during index creation …
SQL> CREATE INDEX ap.est_sales_zipcityst_idx
         ON ap.est_sales(zipcode, city, state)
         TABLESPACE ado_cold_idx
         COMPRESS ADVANCED LOW;
Index created.
… or existing indexes can leverage AIC after a REBUILD ONLINE operation:
SQL> ALTER INDEX ap.est_sales_zipcityst_idx
         REBUILD ONLINE
         COMPRESS ADVANCED LOW;
Index altered.
AIC: Performance Impacts
A query leveraging 3-column index:
SQL> SELECT state, zipcode, COUNT(city)
       FROM ap.est_sales
      WHERE state IN ('CO','AZ','UT','NM')
      GROUP BY state, zipcode
      ORDER BY state, zipcode;
Without AIC:
Plan hash value: 2590263876
-------------------------------------------------------------------------------------------------
| Id  | Operation             | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                         |  1482 | 28158 |   768   (1)| 00:00:01 |
|   1 |  SORT GROUP BY        |                         |  1482 | 28158 |   768   (1)| 00:00:01 |
|*  2 |   INDEX FAST FULL SCAN| EST_SALES_ZIPCITYST_IDX | 32032 |   594K|   767   (1)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("STATE"='AZ' OR "STATE"='CO' OR "STATE"='NM' OR "STATE"='UT')
With AIC:
Plan hash value: 2590263876
-------------------------------------------------------------------------------------------------
| Id  | Operation             | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |                         |  1482 | 28158 |   332   (3)| 00:00:01 |
|   1 |  SORT GROUP BY        |                         |  1482 | 28158 |   332   (3)| 00:00:01 |
|*  2 |   INDEX FAST FULL SCAN| EST_SALES_ZIPCITYST_IDX | 32032 |   594K|   331   (2)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   2 - filter("STATE"='AZ' OR "STATE"='CO' OR "STATE"='NM' OR "STATE"='UT')