In-memory features are the most promising trend in high-performance data processing. Columnstore indexes are one such feature, and even with their restrictions they can speed up your queries many times over. How do you get more from this feature? In which situations should you use it? Which internal mechanisms make this possible? This session answers these questions.
Columnstore Indexes
Deep introduction into columnar storage and indexes in SQL Server 2012
Denis Reznik
Sponsors
About me
Denis Reznik
• Kiev, Ukraine
• Database Architect at The Frayman Group
• Microsoft MVP
• Community enthusiast
Agenda
• Columnar storage
• Creation of Columnstore index
• Usage scenarios and limitations
• Performance accelerators
• Columnstore Storage internals
• Columnstore Execution mode internals
• Columnstore index maintenance
• Columnstore Future (actually Present :)
Row Store and Column Store
• In a row store, data is stored tuple by tuple.
• In a column store, data is stored column by column.
Row Store and Column Store
Most queries do not process all the attributes of a particular relation.
SELECT c.Name, c.Address FROM Customers c WHERE c.City = 'Sofia'
[Figure: table columns id, name, city, state, age, address]
Creating a columnstore index
T-SQL
SSMS
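The T-SQL route can be sketched as follows, assuming SQL Server 2012. The dbo.Purchase table name and its Date column are taken from the segment elimination slide; the remaining columns, the data types, and the index name NCCI_Purchase are hypothetical and only serve the illustration.

```sql
-- Hypothetical fact table: only the table name and the Date column
-- appear in the slides, everything else is assumed for the example.
CREATE TABLE dbo.Purchase
(
    PurchaseId BIGINT         NOT NULL,
    [Date]     DATE           NOT NULL,
    CustomerId INT            NOT NULL,
    Amount     DECIMAL(18, 2) NOT NULL
);
GO

-- SQL Server 2012 offers only nonclustered columnstore indexes.
-- Include every column that analytical queries are expected to read.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Purchase
ON dbo.Purchase (PurchaseId, [Date], CustomerId, Amount);
GO
```

In a real data warehouse the index would normally be created after the table has been loaded, because the table becomes read-only once the index exists.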
Usage scenarios and limitations
Primary focus of Columnstore Indexes is DW databases
In SQL Server 2012 Columnstore Indexes are read-only
Supported operators and data types are limited
DEMO
Incredible Performance of Columnstore Indexes
How Are These Performance Gains Achieved?
Two complementary technologies:
Storage
• Data is stored in a compressed columnar data format (stored by column) instead of row store format (stored by row).
New “batch mode” execution
• Vector-based query execution capability
• Data can then be processed in batches rather than row by row
• Depending on filtering and other factors, a query may also benefit from “segment elimination”: bypassing million-row chunks (segments) of data, further reducing I/O
Compression
• Patented VertiPaq algorithms, so there is no public information about exactly how the data is compressed
• But some techniques are known:
  • Dictionary encoding
  • Run-length encoding
  • Bit-vector encoding
  • …
DEMO
Columnstore Indexes Internals
Columnar storage structure
[Figure: pages in a row store versus pages in a column store, for columns C1 to C6]
Column Segments and Dictionaries
[Figure: columns C1 to C6 divided into sets of about 1M rows; the part of one column within a set is a column segment (segment 1 … segment N), stored together with dictionaries]
DEMO
Columnstore Indexes – Segments and Dictionaries
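The demo itself is not part of the transcript. A minimal sketch of how segments and dictionaries can be inspected through the catalog views sys.column_store_segments and sys.column_store_dictionaries, assuming a table dbo.Purchase that already has a columnstore index, might look like this:

```sql
-- One row per column segment: row count, encoding and size on disk.
SELECT s.column_id,
       s.segment_id,
       s.row_count,
       s.encoding_type,
       s.on_disk_size
FROM sys.column_store_segments AS s
JOIN sys.partitions AS p
    ON p.hobt_id = s.hobt_id
WHERE p.object_id = OBJECT_ID(N'dbo.Purchase')   -- assumed table name
ORDER BY s.column_id, s.segment_id;

-- One row per dictionary used by the columnstore index.
SELECT d.column_id,
       d.dictionary_id,
       d.type,
       d.entry_count,
       d.on_disk_size
FROM sys.column_store_dictionaries AS d
JOIN sys.partitions AS p
    ON p.hobt_id = d.hobt_id
WHERE p.object_id = OBJECT_ID(N'dbo.Purchase')   -- assumed table name
ORDER BY d.column_id, d.dictionary_id;
```

Comparing on_disk_size per column is a quick way to see which columns compress well and which dominate the index size.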
Memory management
SELECT C2, SUM(C4) FROM T GROUP BY C2;
[Figure: column segments of T.C1, T.C2, T.C3 and T.C4; the query above needs only T.C2 and T.C4]
• Memory management is automatic
• Columnstore is persisted on disk
• Needed columns fetched into memory
• A columnstore segment is the unit of data transfer between disk and memory
Batch mode processing
Process ~1000 rows at a time
Vector operators implemented
Greatly reduced CPU time (7 to 40X)
[Figure: a batch object consisting of column vectors and a bitmap of qualifying rows]
Segment Elimination
column_id  segment_id  min_data_id  max_data_id
1          1           20120101     20120131     (skipped)
1          2           20120115     20120215
1          3           20120201     20120228
• Segment (rowgroup) = 1 million row chunk
• Min, Max kept for each column in a segment
• Scans can skip segments based on this info
select Date, count(*) from dbo.Purchase where Date >= '20120201' group by Date
DEMO
Segment Elimination
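The per-segment minimum and maximum values shown on the previous slide can be read from sys.column_store_segments. The query below is a sketch, assuming the dbo.Purchase table from the slide has a columnstore index. The stored values are internal encoded ids, so they do not always match the raw column values one to one.

```sql
-- Min/max metadata per segment; the scan uses these ranges to decide
-- which segments cannot contain qualifying rows.
SELECT s.column_id,
       s.segment_id,
       s.min_data_id,
       s.max_data_id,
       s.row_count
FROM sys.column_store_segments AS s
JOIN sys.partitions AS p
    ON p.hobt_id = s.hobt_id
WHERE p.object_id = OBJECT_ID(N'dbo.Purchase')   -- table name from the slide
ORDER BY s.column_id, s.segment_id;
```

Ranges that overlap heavily between segments, like segments 1 and 2 on the slide, leave less room for elimination. Loading the data sorted by the filter column generally makes the ranges narrower.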
Maintaining Data in a Columnstore Index
• Once built, the table becomes “read-only” and INSERT/UPDATE/DELETE/MERGE are no longer allowed
• ALTER INDEX REBUILD / REORGANIZE not allowed
How can I modify index data?
• Drop columnstore index / make modifications / add columnstore index
• UNION ALL (but be sure to validate performance)
• Partition switches (IN and OUT)
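The first two options can be sketched in T-SQL as follows. The dbo.Purchase columns, the index name NCCI_Purchase, and the dbo.PurchaseDelta table are hypothetical; the sketch assumes dbo.Purchase already exists with that columnstore index. Partition switching is not shown because it additionally needs a partition function, a partition scheme, and a matching staging table.

```sql
-- Option 1: drop the columnstore index, modify the data, recreate the index.
DROP INDEX NCCI_Purchase ON dbo.Purchase;
GO

INSERT INTO dbo.Purchase (PurchaseId, [Date], CustomerId, Amount)
VALUES (1, '20120201', 42, 10.00);
GO

CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Purchase
ON dbo.Purchase (PurchaseId, [Date], CustomerId, Amount);
GO

-- Option 2: keep new rows in a separate rowstore table (hypothetical
-- dbo.PurchaseDelta with the same columns) and combine both at query time.
CREATE TABLE dbo.PurchaseDelta
(
    PurchaseId BIGINT         NOT NULL,
    [Date]     DATE           NOT NULL,
    CustomerId INT            NOT NULL,
    Amount     DECIMAL(18, 2) NOT NULL
);
GO

SELECT p.[Date], COUNT(*) AS PurchaseCount
FROM (
    SELECT [Date] FROM dbo.Purchase        -- read-only, columnstore
    UNION ALL
    SELECT [Date] FROM dbo.PurchaseDelta   -- updatable, rowstore
) AS p
WHERE p.[Date] >= '20120201'
GROUP BY p.[Date];
```

As the slide warns, the UNION ALL plan should be checked: depending on the query shape, the optimizer may not keep batch mode for the columnstore branch.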
Columnstore Index Future
Actually, it is already the present
• Columnstore indexes can be clustered (in SQL Server 2014)
• Clustered columnstore indexes are updatable (in SQL Server 2014)
• Updated data (deltas) is stored in a rowstore until a segment can be created
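A minimal sketch of the SQL Server 2014 behaviour, using a hypothetical table dbo.PurchaseCci. It creates a clustered columnstore index, inserts a row directly, and then looks at the row groups through sys.column_store_row_groups, where freshly inserted rows are expected to show up in an open delta store.

```sql
-- Hypothetical table; requires SQL Server 2014 or later.
CREATE TABLE dbo.PurchaseCci
(
    [Date] DATE           NOT NULL,
    Amount DECIMAL(18, 2) NOT NULL
);
GO

-- A clustered columnstore index covers all columns, so no column list is given.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_PurchaseCci ON dbo.PurchaseCci;
GO

-- The table stays updatable: small inserts land in a rowstore delta store.
INSERT INTO dbo.PurchaseCci ([Date], Amount)
VALUES ('20120201', 10.00);
GO

-- Row groups and their state (for example OPEN, CLOSED, COMPRESSED).
SELECT row_group_id, state_description, total_rows
FROM sys.column_store_row_groups
WHERE object_id = OBJECT_ID(N'dbo.PurchaseCci');
```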
Summary
• Columnar storage
• Columnstore Performance Demo
• Creation of Columnstore index
• Usage scenarios and limitations
• Performance accelerators
• Columnstore Storage internals
• Columnstore Execution mode internals
• Columnstore index maintenance
• Columnstore Future (actually Present :)
Thank you!
Denis Reznik
• Twitter: @denisreznik
• Email: [email protected]
• Blog (in Russian): http://reznik.uneta.com.ua
• Facebook: https://www.facebook.com/denis.reznik.5
• LinkedIn: http://ua.linkedin.com/pub/denis-reznik/3/502/234