Slide deck of the April 20, 2010 presentation to IndyPASS on SQL Server IO internals for performance
Writing Efficient Queries – Part 1
Using SQL Server Internals to Improve Data Access
Eddie Wuerch, MCT, MCITP
Principal, Data Management, ExactTarget
Disk I/O – Key Points
Disk I/O (reads and writes) is usually the slowest component of a system
◦ Compare: memory and CPU speeds are reported in GHz, billions of actions per second
◦ Disk I/O rates are simply measured in IOPS: Input/output Operations Per Second
◦ Fast disks on mostly sequential workloads can get 100-150 IOPS
Because of disk access rates, our first tuning goal is to reduce overall I/O
…so let's look at those data pages, then examine how to process fewer of them
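To put the IOPS figure in perspective, here is a rough back-of-the-envelope sketch. It assumes the worst case of one I/O per page read and uses the 150 IOPS upper figure quoted above; the 1,000,000-page table is a hypothetical number chosen only for illustration. Real scans usually do better thanks to read-ahead and multi-page reads.

```sql
-- Hypothetical numbers for illustration only.
DECLARE @pages_to_read bigint;
DECLARE @disk_iops     int;

SET @pages_to_read = 1000000;  -- assumed number of pages a query must read from disk
SET @disk_iops     = 150;      -- upper IOPS figure from the slide

-- Integer division: whole seconds and whole minutes at one page per I/O.
SELECT @pages_to_read / @disk_iops      AS seconds_at_one_page_per_io,
       @pages_to_read / @disk_iops / 60 AS minutes_at_one_page_per_io;
```

Under these assumptions the read takes close to two hours, which is why reading fewer pages is the first tuning goal.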
Disk I/O – Key Points - II
SQL Server is not a black box
Most data is in structured storage
There are many ways to access the data; some ways are significantly faster, and cause less impact on other processes, than others
Understanding SQL Server data storage internals will guide you to the faster ways
Data Storage in SQL Server
The base unit of SQL Server data storage is the page
All data – system and user – is stored in pages
Each page is 8KB (8192 bytes)
Pages are allocated from files in 64KB (8-page) extents
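As a sketch of how page counts translate into size, the query below lists user tables in the current database with their used pages and the equivalent size in KB (pages × 8). It relies on the `sys.dm_db_partition_stats` DMV; the extent figure is only an approximation, because it simply divides reserved pages by 8.

```sql
-- Pages and approximate size per user table in the current database.
-- Each page is 8 KB, so pages * 8 = KB; 8 pages = one 64 KB extent.
SELECT OBJECT_SCHEMA_NAME(ps.object_id) AS schema_name,
       OBJECT_NAME(ps.object_id)        AS table_name,
       SUM(ps.used_page_count)          AS used_pages,
       SUM(ps.used_page_count) * 8      AS used_kb,
       SUM(ps.reserved_page_count) / 8  AS reserved_extents_approx
FROM sys.dm_db_partition_stats AS ps
JOIN sys.objects AS o
  ON o.object_id = ps.object_id
WHERE o.is_ms_shipped = 0            -- skip system objects
GROUP BY ps.object_id
ORDER BY used_pages DESC;
```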
Page Processing
Pages are read from disk and processed in memory as an entire 8KB unit
Extents are often read in from disk as a single block to reduce I/O
All data is processed in memory; if a page is not already in memory, it is pulled from disk first, and processing is put on hold until the read completes
Page Types
System Page Types
◦ Space Management: File Header, PFS, GAM, SGAM
◦ Change Management: DCM, BCM
Data Page Types
◦ In-row data
◦ Index
◦ LOB data and Row-overflow data
All pages are 8KB
In-Row Data Pages
[Diagram: layout of an in-row data page, top to bottom]
96-byte Page Header
Row Data: Row 1…, Row 2…, Row 3…, Row 4…
◦ Rows written serially
◦ Starts at 97th byte
Row-offset table: 4…, 3…, 2…, 1…
◦ Starts at end of page, moves backwards
◦ Records first-byte offset of each row
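The layout above implies a simple capacity estimate, sketched below. The 2-byte size of each row-offset entry is not stated on the slide; it comes from Microsoft's table-size estimation guidance, which uses rows per page = 8096 / (row size + 2). The 200-byte row size is a hypothetical value.

```sql
-- Rough rows-per-page estimate for fixed-size rows.
DECLARE @page_size int, @header_size int, @slot_size int, @row_size int;

SET @page_size   = 8192;  -- bytes per page
SET @header_size = 96;    -- page header
SET @slot_size   = 2;     -- one row-offset entry per row (assumption from Microsoft's sizing formula)
SET @row_size    = 200;   -- hypothetical row size in bytes, including row overhead

SELECT @page_size - @header_size                            AS bytes_after_header, -- 8096
       (@page_size - @header_size) / (@row_size + @slot_size) AS rows_per_page;     -- 40 (integer division)
```

Narrower rows mean more rows per page, and therefore fewer pages to read for the same data.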
Disk Access Methods
Think of a phone book, with each entry as a record
Ordered by Last Name, First Name, MI
Two ways to find a record:
◦ Use Last Name, First Name to find a number (Index Seek)
◦ Look through the entire phone book, one page at a time, scanning each row for data (Table Scan)
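The phone book analogy can be sketched in T-SQL as follows. The table, column, and index names are hypothetical, and the optimizer makes the final choice of access method (on a table this small the difference is not measurable), so check the actual execution plan rather than taking the comments as guarantees.

```sql
-- Hypothetical phone book table; all names are illustrative.
CREATE TABLE #PhoneBook (
    LastName  varchar(50) NOT NULL,
    FirstName varchar(50) NOT NULL,
    MI        char(1)     NOT NULL,
    Phone     varchar(20) NOT NULL
);

-- The book is ordered by Last Name, First Name, MI.
CREATE CLUSTERED INDEX CIX_PhoneBook
    ON #PhoneBook (LastName, FirstName, MI);

INSERT INTO #PhoneBook (LastName, FirstName, MI, Phone) VALUES ('Baker', 'Ann',  'L', '555-0100');
INSERT INTO #PhoneBook (LastName, FirstName, MI, Phone) VALUES ('Baker', 'Bob',  'T', '555-0101');
INSERT INTO #PhoneBook (LastName, FirstName, MI, Phone) VALUES ('Young', 'Carl', 'J', '555-0102');

-- Predicate on the leading index columns: eligible for an index seek.
SELECT Phone
FROM #PhoneBook
WHERE LastName = 'Baker' AND FirstName = 'Ann';

-- Predicate on a column that is not an index key: every row must be examined (scan).
SELECT LastName, FirstName
FROM #PhoneBook
WHERE Phone = '555-0101';

DROP TABLE #PhoneBook;
```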
Index Types
Clustered Index
◦ Represents the table itself
◦ Index specifies the physical ordering of that data
◦ Only 1 allowed per table
◦ May be unique, does not have to be the primary key
Non-clustered index
◦ Additional index of data
◦ Over 200 allowed
◦ May be unique
The phone book example
If a table has a clustered index, the pointer to each row in the table is the clustered index key
The leaf level of the nonclustered index contains the nonclustered keys and the clustered index keys
Nonclustered indexes may also include additional non-indexed columns, which are stored at the leaf level of the index
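A minimal sketch of these index types, using a hypothetical customer table (all object names are illustrative). It shows a clustered index that is not the primary key, and a unique nonclustered index with an included column.

```sql
-- Hypothetical table; the primary key is deliberately NONCLUSTERED.
CREATE TABLE #CustomerDemo (
    CustomerID int          NOT NULL PRIMARY KEY NONCLUSTERED,
    LastName   varchar(50)  NOT NULL,
    FirstName  varchar(50)  NOT NULL,
    Email      varchar(100) NOT NULL,
    City       varchar(50)  NOT NULL
);

-- The clustered index defines the ordering of the table itself.
-- Only one is allowed, and it does not have to be the primary key.
CREATE CLUSTERED INDEX CIX_CustomerDemo
    ON #CustomerDemo (LastName, FirstName);

-- A unique nonclustered index. Its leaf level holds the key (Email),
-- the clustered index key (LastName, FirstName) as the row pointer,
-- and the included non-key column (City).
CREATE UNIQUE NONCLUSTERED INDEX UX_CustomerDemo_Email
    ON #CustomerDemo (Email)
    INCLUDE (City);

DROP TABLE #CustomerDemo;
```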
Index Pages
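[Diagram: index pages arranged as a tree]
Root page: A-K : Page 4, L-U : Page 5, V-Z : Page 6
Next level (pages 4, 5, 6), e.g. page 4: A : Page 22, B : Page 23, C : Page 24, D…
Next level (pages 22-27), e.g. page 23: Baa : Page 276, Baba : Page 277, Base : Page 278, Ba…
Lowest level shown: pages 274-283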
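One way to see the levels of a real index is `sys.dm_db_index_physical_stats` in `DETAILED` mode, which returns one row per index level. The sketch below builds a hypothetical temp table so it can run anywhere; the row count and column sizes are arbitrary. `DETAILED` mode reads every page of the index, so use it with care on large tables.

```sql
-- Hypothetical demo table with enough rows to need more than one index level.
CREATE TABLE #IndexDemo (
    id     int       NOT NULL,
    filler char(200) NOT NULL
);
CREATE CLUSTERED INDEX CIX_IndexDemo ON #IndexDemo (id);

-- Generate 20,000 rows by cross joining a system view with itself.
INSERT INTO #IndexDemo (id, filler)
SELECT TOP (20000)
       ROW_NUMBER() OVER (ORDER BY a.object_id),
       'x'
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

-- One row per level: index_level 0 is the leaf; the highest level is the root.
SELECT ps.index_depth,
       ps.index_level,
       ps.page_count,
       ps.record_count
FROM sys.dm_db_index_physical_stats(
         DB_ID('tempdb'), OBJECT_ID('tempdb..#IndexDemo'), NULL, NULL, 'DETAILED') AS ps
ORDER BY ps.index_level DESC;

DROP TABLE #IndexDemo;
```

A seek reads one page per level, so `index_depth` is roughly the number of page reads needed to find a single key.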
Index Lookups
Index Lookups - revisited
[Diagram: entries in the nonclustered index pointing into the clustered index (table)]
Separate trip through the clustered index for each nonclustered entry!
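The sketch below illustrates the lookup cost and one common way to avoid it, a covering index. The table and index names are hypothetical, and whether the optimizer actually uses a given index depends on data volume and statistics.

```sql
-- Hypothetical orders table; all names are illustrative.
CREATE TABLE #OrdersDemo (
    OrderID    int      NOT NULL,
    CustomerID int      NOT NULL,
    OrderDate  datetime NOT NULL,
    Amount     money    NOT NULL
);
CREATE UNIQUE CLUSTERED INDEX CIX_OrdersDemo ON #OrdersDemo (OrderID);
CREATE NONCLUSTERED INDEX IX_OrdersDemo_Customer ON #OrdersDemo (CustomerID);

-- OrderDate and Amount are not in IX_OrdersDemo_Customer. If that index is used,
-- each matching entry requires a separate trip through the clustered index.
SELECT OrderDate, Amount
FROM #OrdersDemo
WHERE CustomerID = 42;

-- A covering index carries the extra columns at its leaf level,
-- so the same query can be answered without touching the clustered index.
CREATE NONCLUSTERED INDEX IX_OrdersDemo_Customer_Covering
    ON #OrdersDemo (CustomerID)
    INCLUDE (OrderDate, Amount);

SELECT OrderDate, Amount
FROM #OrdersDemo
WHERE CustomerID = 42;

DROP TABLE #OrdersDemo;
```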
Operational Join Types
Merge Joins
Hash Joins
Loop Joins
Join Type Comparison
Merge
◦ One trip through each table
◦ Requires indexes on both sides, at least one of them must be unique
◦ Usually the fastest join type
Hash
◦ Works well for very large joins
◦ Builds join data in tempdb
Loop
◦ When the other two can't be used
◦ One trip through one table
◦ One trip through the other table for each entry in the first table
◦ Generally the slowest of the three types
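To compare the three join types on your own data, one option is to force each with a query hint and compare the plans and I/O. This is a sketch for experimentation only, using hypothetical temp tables; hints like these are generally not something to leave in production code.

```sql
-- Hypothetical tables; all names are illustrative.
CREATE TABLE #JoinCustomer (
    CustomerID int         NOT NULL PRIMARY KEY,
    Name       varchar(50) NOT NULL
);
CREATE TABLE #JoinOrders (
    OrderID    int NOT NULL PRIMARY KEY,
    CustomerID int NOT NULL
);
CREATE NONCLUSTERED INDEX IX_JoinOrders_CustomerID ON #JoinOrders (CustomerID);

SET STATISTICS IO ON;

-- Same query three times, each forced to a different physical join.
SELECT c.Name, o.OrderID
FROM #JoinCustomer AS c
JOIN #JoinOrders AS o ON o.CustomerID = c.CustomerID
OPTION (MERGE JOIN);

SELECT c.Name, o.OrderID
FROM #JoinCustomer AS c
JOIN #JoinOrders AS o ON o.CustomerID = c.CustomerID
OPTION (HASH JOIN);

SELECT c.Name, o.OrderID
FROM #JoinCustomer AS c
JOIN #JoinOrders AS o ON o.CustomerID = c.CustomerID
OPTION (LOOP JOIN);

SET STATISTICS IO OFF;

DROP TABLE #JoinOrders;
DROP TABLE #JoinCustomer;
```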
Join and Indexing Tips
When defining an index, if the data is unique, then declare the index as unique
Join on keys
Provide arguments in WHERE clauses to match available indexes
Cluster tables on range scans
Look for covering indexes
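A small sketch of three of these tips together, using a hypothetical sales table. The `YEAR()` example is one interpretation of "match available indexes": a predicate that wraps the indexed column in a function generally cannot be used for a seek, while an equivalent range on the bare column can.

```sql
-- Hypothetical table; all names are illustrative.
CREATE TABLE #SalesDemo (
    SaleID   int      NOT NULL,
    SaleDate datetime NOT NULL,
    Amount   money    NOT NULL
);

-- Cluster on the column used for range scans.
CREATE CLUSTERED INDEX CIX_SalesDemo_SaleDate ON #SalesDemo (SaleDate);

-- The data is unique, so declare the index as unique.
CREATE UNIQUE NONCLUSTERED INDEX UX_SalesDemo_SaleID ON #SalesDemo (SaleID);

-- Function around the column: the predicate does not match the index, so expect a scan.
SELECT SUM(Amount)
FROM #SalesDemo
WHERE YEAR(SaleDate) = 2010;

-- Same logic as a range on the bare column: eligible for a range seek.
SELECT SUM(Amount)
FROM #SalesDemo
WHERE SaleDate >= '20100101' AND SaleDate < '20110101';

DROP TABLE #SalesDemo;
```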
So How Do I Know?
SET STATISTICS IO ON
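A minimal usage sketch, with a hypothetical temp table so it runs anywhere. With the setting on, each statement reports per-table figures such as scan count, logical reads, physical reads, and read-ahead reads in the Messages output. Comparing logical reads before and after an index change is a simple way to measure the effect.

```sql
-- Hypothetical demo table; names and sizes are illustrative.
CREATE TABLE #IoDemo (
    id     int       NOT NULL,
    filler char(500) NOT NULL
);

INSERT INTO #IoDemo (id, filler)
SELECT TOP (5000)
       ROW_NUMBER() OVER (ORDER BY a.object_id),
       'x'
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

-- No index yet: finding one row means reading every page of the table.
SET STATISTICS IO ON;
SELECT COUNT(*) FROM #IoDemo WHERE id = 100;
SET STATISTICS IO OFF;

CREATE CLUSTERED INDEX CIX_IoDemo ON #IoDemo (id);

-- With the index: compare the logical reads reported for this run to the first.
SET STATISTICS IO ON;
SELECT COUNT(*) FROM #IoDemo WHERE id = 100;
SET STATISTICS IO OFF;

DROP TABLE #IoDemo;
```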
So How Do I Know?
sys.dm_db_index_usage_stats
◦ user_seeks
◦ user_scans
◦ user_lookups
◦ user_updates
sys.dm_db_missing_index_*
◦ Not magic, has limitations
◦ Many similar index entries with different INCLUDE columns may indicate a need to revisit the clustered index design
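The two queries below are a sketch of how these DMVs might be read for the current database. Both sets of counters are cumulative since the last instance restart, so a recently restarted server will show little. The sort orders are just one reasonable choice.

```sql
-- How each index in the current database has been used.
SELECT OBJECT_NAME(us.object_id) AS table_name,
       i.name                    AS index_name,
       us.user_seeks,
       us.user_scans,
       us.user_lookups,
       us.user_updates
FROM sys.dm_db_index_usage_stats AS us
JOIN sys.indexes AS i
  ON i.object_id = us.object_id
 AND i.index_id  = us.index_id
WHERE us.database_id = DB_ID()
ORDER BY us.user_scans DESC;

-- Missing-index suggestions for the current database.
-- Look for many near-identical rows that differ only in included_columns.
SELECT d.statement AS table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       s.user_seeks,
       s.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
  ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS s
  ON s.group_handle = g.index_group_handle
WHERE d.database_id = DB_ID()
ORDER BY s.avg_user_impact DESC;
```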
So How Do I Know?
Scan-indicating waits
◦ Lots of PAGEIOLATCH_SH and PAGEIOLATCH_EX waits are generated by table scans that read from disk
◦ CXPACKET waits: related to parallelism, often caused by scanning large tables (don't reduce MAXDOP: fix the scan!)
◦ Other processes with SOS_SCHEDULER_YIELD waits or high signal wait times may be helped by reducing the CPU load of scans
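A sketch of how these wait types can be checked with `sys.dm_os_wait_stats`. The values are cumulative since the instance started (or since the stats were last cleared), so they are most useful when compared between two points in time.

```sql
-- Cumulative waits for the scan-related wait types discussed above.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms,                                   -- time spent waiting for CPU after the resource was available
       wait_time_ms - signal_wait_time_ms AS resource_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN ('PAGEIOLATCH_SH', 'PAGEIOLATCH_EX',
                    'CXPACKET', 'SOS_SCHEDULER_YIELD')
ORDER BY wait_time_ms DESC;
```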
So How Do I Know?
TempDB activity in instances without much use of temp tables or table variables
◦ SELECT * FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL)
◦ Must track over time, perform time-slice analysis
◦ May indicate additional worktable sort and hash-match activity
◦ Tracking this for all of your databases shows the amount of I/O your systems are performing, and if the disk systems are keeping up
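A minimal sketch of time-slice analysis: take a snapshot of the cumulative counters, wait, then subtract. The one-minute interval and the temp table name are arbitrary choices for illustration; in practice the snapshots would more likely be stored in a permanent table on a schedule.

```sql
-- Snapshot the cumulative tempdb file counters.
SELECT file_id,
       num_of_reads,
       num_of_bytes_read,
       io_stall_read_ms,
       num_of_writes,
       num_of_bytes_written,
       io_stall_write_ms
INTO #io_before
FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL);

-- Length of the time slice (arbitrary).
WAITFOR DELAY '00:01:00';

-- Activity during the slice = current counters minus the snapshot.
SELECT a.file_id,
       a.num_of_reads         - b.num_of_reads         AS reads_in_slice,
       a.num_of_bytes_read    - b.num_of_bytes_read    AS bytes_read_in_slice,
       a.io_stall_read_ms     - b.io_stall_read_ms     AS read_stall_ms_in_slice,
       a.num_of_writes        - b.num_of_writes        AS writes_in_slice,
       a.num_of_bytes_written - b.num_of_bytes_written AS bytes_written_in_slice,
       a.io_stall_write_ms    - b.io_stall_write_ms    AS write_stall_ms_in_slice
FROM sys.dm_io_virtual_file_stats(DB_ID('tempdb'), NULL) AS a
JOIN #io_before AS b
  ON b.file_id = a.file_id;

DROP TABLE #io_before;
```

Passing `NULL` as the first argument instead of `DB_ID('tempdb')` returns the same counters for every database, which supports the all-databases tracking mentioned in the last bullet (join on `database_id` as well in that case).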
Resources
Microsoft White Papers
◦ SQL Server 2000 I/O Basics (http://technet.microsoft.com/en-us/library/cc966500.aspx)
◦ SQL Server I/O Basics, Chapter 2 (http://technet.microsoft.com/en-us/library/cc917726.aspx)
◦ SQL Server Waits and Queues (download) (http://technet.microsoft.com/en-us/library/cc966413.aspx)
The Waits and Queues document is highly recommended when tuning or analyzing workloads
Resources
Inside SQL Server Book Series
◦ SQL 2005
  The Storage Engine (Kalen Delaney)
  Query Tuning and Optimization (Delaney et al.)
  T-SQL Querying (Ben-Gan, Kollar, Sarka)
◦ SQL 2008
  Microsoft SQL Server 2008 Internals (Delaney, Randal, Tripp, Cunningham)
  T-SQL Querying (Ben-Gan, Kollar, Sarka)
Questions?
Email: [email protected]