MICROSOFT SQL SERVER DATABASE ENGINE I/O
by Bob Dorr, Microsoft SQL Server Principal Escalation Engineer, 1994 – Present
Built: Jan 2008



Areas Covered

• Write Ahead Logging (WAL) Protocol
• Synchronous vs Asynchronous I/O
• Scatter / Gather I/O
• Sector Alignment, Block Alignment
• Latching and a page: A read walk-through
• SQL Server I/O Sizes
• Data cache maintenance
• PAE and AWE
• Read Ahead
• User Mode and Kernel Mode (SYSTRAP)
• Sparse Files and Copy On Write (COW) Pages
• Locked Pages
• Scribbler(s) and Bit flips
• Page Protection and Constant Pages
• Checksum vs Torn
• Stale Read
• Stalled I/O


WAL Protocol

• Write Ahead Logging
• ACID (Durability Property)
• Log records secured before data
• Hardened / Stable Media
• Log contains parity bit

• Commit
• Rollback
• Trigger Snapshot
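
The rule "log records secured before data" can be sketched in a few lines of Python. This is a toy illustration, not SQL Server's implementation: the class `TinyWal`, its method names, and the `toy.ldf` / `toy.mdf` file names are all hypothetical. It shows only the ordering constraint: a data page write is refused until the log record describing its last change has been flushed to stable media.

```python
import os
import tempfile


class TinyWal:
    """Toy write-ahead log (hypothetical): a page may be written only
    after the log record for its last change is hardened."""

    def __init__(self, directory):
        self.log = open(os.path.join(directory, "toy.ldf"), "ab")
        self.data_path = os.path.join(directory, "toy.mdf")
        self.next_lsn = 1       # LSN to assign to the next log record
        self.hardened_lsn = 0   # highest LSN known to be on stable media

    def log_change(self, payload: bytes) -> int:
        """Append a log record to the log buffer and return its LSN."""
        lsn = self.next_lsn
        self.next_lsn += 1
        self.log.write(lsn.to_bytes(8, "little") + payload)
        return lsn

    def harden(self) -> None:
        """Flush buffered log records to stable media."""
        self.log.flush()
        os.fsync(self.log.fileno())
        self.hardened_lsn = self.next_lsn - 1

    def write_page(self, page_lsn: int, page: bytes) -> None:
        """Write a data page; refuse if its log record is not hardened yet."""
        if page_lsn > self.hardened_lsn:
            raise RuntimeError("WAL violation: log record not hardened")
        with open(self.data_path, "wb") as f:
            f.write(page)

    def close(self) -> None:
        self.log.close()


workdir = tempfile.mkdtemp()
wal = TinyWal(workdir)
lsn = wal.log_change(b"update row 1")
try:
    wal.write_page(lsn, b"\x00" * 8192)
    wrote_early = True
except RuntimeError:
    wrote_early = False               # blocked: log not yet on stable media
wal.harden()
wal.write_page(lsn, b"\x00" * 8192)   # allowed once the log is hardened
wal.close()
```

The first `write_page` call fails because the log record is still in the buffer; after `harden()` the same call succeeds.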


Synchronous vs Asynchronous I/O

• Sync: Wait for Completion
• Async: Post and Continue
• Overlapped, Event, Completion Port
• SQL Server 98% Async Usage
• Overlapped and HasOverlappedIoCompleted
• Network Layers Use Completion Port
• Backup/Restore Use Sync – Sequential Patterns

• dm_io_pending_io_requests
• Overlapped Structure
• Async Processing ~= CPU
• Package vs Phone
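
The "post and continue" pattern can be illustrated with a small Python sketch. It is an analogy only: a thread pool stands in for the operating system's asynchronous I/O, and nothing here uses the Win32 overlapped APIs named on the slide. The helper `read_page` and the 4-page test file are hypothetical.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 8192


def read_page(path, page_no):
    """Blocking read of one page; run on a pool thread to model an async request."""
    with open(path, "rb") as f:
        f.seek(page_no * PAGE_SIZE)
        return f.read(PAGE_SIZE)


# Build a 4-page file in which every byte of page n has the value n.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    for n in range(4):
        f.write(bytes([n]) * PAGE_SIZE)

with ThreadPoolExecutor(max_workers=4) as pool:
    # Post: submit every request without waiting for any of them.
    pending = {n: pool.submit(read_page, path, n) for n in range(4)}

    # Continue: the caller keeps using the CPU while the reads are outstanding.
    work_done = sum(i * i for i in range(1000))

    # Check status later and collect the completed reads.
    pages = {n: fut.result() for n, fut in pending.items()}

os.remove(path)
```

A synchronous version would call `read_page` four times in a row and do no other work until each call returned.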


Scatter / Gather I/O

• Consolidates or Distributes
• APIs: ReadFileScatter, WriteFileGather
• Increases Efficiency
• Used by SQL I/O Paths
• Used by Windows Page File

• Old Design: 6.x Sorting
• AWE Availability
• WriteMultiple
• # of 8K Pages
• Forward and Backward
• Buffer Pool Ramp-up

[Diagram: Gather and Scatter between Memory and Disk]
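
A rough feel for scatter/gather can be had from the POSIX vectored I/O calls that Python exposes as `os.writev` and `os.readv`. These are an analogue chosen for illustration, not the Win32 `ReadFileScatter` / `WriteFileGather` APIs, and they are synchronous. The sketch assumes a Unix-like system; the buffer contents are made up.

```python
import os
import tempfile

PAGE_SIZE = 8192

# Three separate 8K buffers in memory (they need not be contiguous).
pages_out = [bytes([n]) * PAGE_SIZE for n in (1, 2, 3)]

fd, path = tempfile.mkstemp()
try:
    # Gather: one call writes all three buffers as one contiguous run on disk.
    written = os.writev(fd, pages_out)

    # Scatter: one call reads the contiguous run back into three separate buffers.
    os.lseek(fd, 0, os.SEEK_SET)
    pages_in = [bytearray(PAGE_SIZE) for _ in range(3)]
    read = os.readv(fd, pages_in)
finally:
    os.close(fd)
    os.remove(path)
```

One request moves 24 KB between a single disk range and three distinct memory buffers, which is the consolidation the slide refers to.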


Sector Alignment, Block Alignment

• Sector: Log Writes
• Block: Performance
• Avoid Crossovers
• DiskPart/DiskPar Utilities
• Discuss with your Vendor

• Double Touch
• Rewrites
• Defragment
• 4K Sectors

Alignment: http://support.microsoft.com/kb/929491

To verify that an existing partition is aligned, divide the partition's starting offset in bytes by the stripe unit size. The partition is aligned when the result is a whole number. Use the following syntax:

((Partition offset) * (Disk sector size)) / (Stripe unit size)

Example of alignment calculations in bytes for a 256-KB stripe unit size:

(63 * 512) / 262144 = 0.123046875
(64 * 512) / 262144 = 0.125
(128 * 512) / 262144 = 0.25
(256 * 512) / 262144 = 0.5
(512 * 512) / 262144 = 1

These examples show that the partition is not aligned correctly for a 256-KB stripe unit size until the partition is created by using an offset of 512 sectors (512 bytes per sector).
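
The same alignment check is easy to script. The helper names `alignment_ratio` and `is_aligned` are hypothetical; the defaults (512-byte sectors, 256-KB stripe unit) are taken from the example above, and the offset is expressed in sectors.

```python
from fractions import Fraction


def alignment_ratio(offset_sectors, sector_size=512, stripe_unit=256 * 1024):
    """(partition offset * sector size) / stripe unit size, as an exact fraction."""
    return Fraction(offset_sectors * sector_size, stripe_unit)


def is_aligned(offset_sectors, sector_size=512, stripe_unit=256 * 1024):
    """Aligned when the starting offset is a whole multiple of the stripe unit."""
    return alignment_ratio(offset_sectors, sector_size, stripe_unit).denominator == 1


for offset in (63, 64, 128, 256, 512):
    print(offset, float(alignment_ratio(offset)), is_aligned(offset))
```

Using `Fraction` avoids floating-point rounding when testing for a whole number.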


Latch

• Multiple Readers (SH)
• One Writer (EX)
• Protects In-Memory Data Page
• Latch = Physical Protection
• Lock = Logical Protection
• User Mode
• UMS/SQLOS Aware
• Optimized
• FIFO Ordering

• Flushed & Rollback
• Latch Timeout
• Sub-latch

[Diagram: BUF Array pointing to Memory (Data Pages). BUF fields: Status, Latch, Database*, PageId, Hash*, …]
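
The "multiple readers, one writer" behavior can be sketched as a small shared/exclusive latch in Python. The class `Latch` and its methods are hypothetical and deliberately simplified: the sketch does not model FIFO ordering, sub-latches, or scheduler awareness, only the SH/EX compatibility rule and a timeout on the exclusive request.

```python
import threading


class Latch:
    """Toy shared/exclusive latch: many SH holders or exactly one EX holder."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_sh(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_sh(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_ex(self, timeout=None):
        """Return True once exclusive access is held, False on timeout."""
        with self._cond:
            ok = self._cond.wait_for(
                lambda: not self._writer and self._readers == 0, timeout)
            if ok:
                self._writer = True
            return ok

    def release_ex(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


latch = Latch()
latch.acquire_sh()
latch.acquire_sh()                              # a second reader is admitted
blocked = not latch.acquire_ex(timeout=0.05)    # writer must wait for readers
latch.release_sh()
latch.release_sh()
granted = latch.acquire_ex(timeout=0.05)        # no readers left: writer gets in
latch.release_ex()
```

The timed-out exclusive request loosely mirrors the "Latch Timeout" note: the requester gives up waiting rather than blocking forever.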


Reading A Page

• Get Free Buffer for Read
• Acquire Exclusive (EX) Latch
• Is already in-memory/hashed?
• Add Entry to Page Hash
• Post and Record Asynchronous Read
• … Continue Processing ….
• Check Status (Scheduler Switch)
• Complete: Validate I/O and Release Latch

• Page Audits
• Read retry
• Stalled I/O Warnings
• Error raised at Acquire
• Shared (SH) waiters
• PAGE_IO* vs PAGE* Latch
• Writing A Page

Kernel transition – Stuck I/O?

ntdll!ZwWriteFile+0xa
kernel32!WriteFile+0xf6
sqlservr!DiskWriteAsync+0xee
…

0:000> uf ZwWriteFile
mov r10,rcx
mov eax,5
syscall        (Kernel Transition)
ret


Myth: Single Worker Per File
Truth: Each Worker Issues I/O

[Diagram: Workers #1 through #5 issuing I/O to dbTest.MDF, dbTest.NDF, and dbTest.LDF on Vol #1 and Vol #2]

Serial Plan
select * from dbTest.dbo.tblTest
insert into dbTest.dbo.tblTest

Parallel Plan
select * from dbTest.dbo.tblTest
insert into dbTest.dbo.tblTest

Create Database
Workers Assigned by Volume ID
Primary = dbTest.MDF
Secondary = dbTest.NDF
Log = dbTest.LDF


Data Cache Maintenance

Memory Pressure: LazyWriter
• Per NUMA Node
• Time Of Last Access (TLA)

Recovery Interval: Checkpoint
• Queue I/O Targets
• .LDF Usage Triggers
• Alternate Triggers (Backup, Restore, …)
• Scatter/Gather Usage (WriteMultiple)

• Checkpoint Assignments
• By Ordinal Sweep
• Stalled I/O – LW #0
• I/O Queue Depth > 2


PAE and AWE

Physical Address Extension (PAE)
• /PAE in Boot.ini
• Boots Kernel with 36 bit addressing
• Physical Memory > 4GB
• Virtual Address Unchanged (/2GB or /3GB)
• Automatic for Hot Add Memory Computers

Address Windowing Extensions (AWE)
• Windows APIs (AllocateUserPhysicalPages)
• Physical Memory Allocations
• Un/Mapped in or out of Virtual Address Range

• Data Pages-Only
• Locked Pages
• Windows Paging
• Windows 2000 Bugs

• 32 Bit Address = 4294967295 (0xFFFFFFFF) 4GB
• Interlocked Instruction: lock xadd dword ptr [ecx],eax
• 36 Bit Address = 68719476735 (0xFFFFFFFFF) 64GB
• Multiple Instructions
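
The address limits quoted above follow directly from the number of address bits. A short Python check, with the hypothetical helper `address_space`, reproduces them:

```python
GB = 1024 ** 3


def address_space(bits):
    """Return (highest address, hex form, addressable size in GB) for an address width."""
    highest = (1 << bits) - 1
    return highest, f"0x{highest:X}", (1 << bits) // GB


for bits in (32, 36):
    print(bits, *address_space(bits))
```

Each extra address bit doubles the range, so the four additional bits under PAE give 16 times the 4 GB limit.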


Read Ahead

• 128 Pages Standard SKU
• 1024 Pages Enterprise SKU
• Uses ReadFileScatter
• Plan Based Decisions
• Power of Asynchronous I/O

• Read Over Write
• Ramp-up


Sparse Files – Copy On Write

• Usage: Online DBCC, Snapshot Databases
• Buffer Pool: PrepareToDirty
• File Control Block (FCB) Chaining

• Sparse Allocation
• FCB Tracking
• Windows Limits
• New Page Allocations


Advanced Protection

• What is a Scribbler?
• Data Page Audits: None, Torn Bits, Checksum
• Log Block Checksum
• Constant Page
• Backup with Checksum

• DBCC Page Audit
• Stale Read Check
• SQLIOSim
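
How a page checksum catches a bit flip can be shown with a generic example. The sketch below uses CRC-32 from Python's `zlib` purely as a stand-in; it is an assumption for illustration and not the checksum algorithm SQL Server uses. The helper names `stamp` and `audit` are hypothetical.

```python
import zlib

PAGE_SIZE = 8192


def stamp(page: bytes) -> int:
    """Compute the value stored with the page when it is written."""
    return zlib.crc32(page)


def audit(page: bytes, stored: int) -> bool:
    """Recompute on read and compare with the stored value."""
    return zlib.crc32(page) == stored


page = bytes(range(256)) * (PAGE_SIZE // 256)   # an 8K page with known contents
stored = stamp(page)

damaged = bytearray(page)
damaged[5000] ^= 0x04                           # flip a single bit in the page

clean_ok = audit(page, stored)
damaged_ok = audit(bytes(damaged), stored)
```

The unchanged page passes the audit and the page with one flipped bit fails it, wherever in the page the flip occurs.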


REFERENCES


Fundamentals and Requirements

KB230785 - SQL Server 7.0, SQL Server 2000 and SQL Server 2005 logging and data storage algorithms extend data reliability

KB917047 - Microsoft SQL Server I/O subsystem requirements for the tempdb database

KB231347 - SQL Server databases not supported on compressed volumes (except 2005 read only files)


Subsystems

KB917043 - Key factors to consider when evaluating third-party file cache systems with SQL Server

KB234656 - Using disk drive caching with SQL Server

KB46091 - Using hard disk controller caching with SQL Server

KB86903 - Description of caching disk controls in SQL Server

KB304261 - Description of support for network database files in SQL Server

KB910716 (in progress) - Support for third-party Remote Mirroring solutions used with SQL Server 2000 and 2005

KB833770 - Support for SQL Server 2000 on iSCSI technology components (applies to SQL Server 2005)


Design and Configuration

White paper - Physical Database Layout and Design

KB298402 - Understanding How to Set the SQL Server I/O Affinity Option

KB78363 - When Dirty Cache Pages are Flushed to Disk

White paper - Database Mirroring in SQL Server 2005

White paper - Database Mirroring Best Practices and Performance Considerations

KB910378 - Scalable shared databases are supported by SQL Server 2005

MSDN article - Read-Only Filegroups

KB156932 - Asynchronous Disk I/O Appears as Synchronous on Windows NT, Windows 2000, and Windows XP


Diagnostics

KB826433 - Additional SQL Server Diagnostics Added to Detect Unreported I/O Problems

KB897284 - SQL Server 2000 SP4 diagnostics help detect stalled and stuck I/O operations (applies to SQL Server 2005)

KB828339 - Error message 823 may indicate hardware problems or system problems in SQL Server

KB167711 - Understanding Bufwait and Writelog Timeout Messages

KB815436 - Use Trace Flag 3505 to Control SQL Server Checkpoint Behavior

KB906121 - Checkpoint resumes behavior that it exhibited before you installed SQL Server 2000 SP3 when you enable trace flag 828

WebCast - Data Recovery in SQL Server 2005


Utilities

Download - SQLIO Disk Subsystem Benchmark Tool

Download - SQLIOStress utility to stress disk subsystem (applies to SQL Server 7.0, 2000, and 2005; replaced by SQLIOSim, which SQL Server 2008 installs in the BINN folder)


Blog Content

SQL Server Urban Legends Discussed http://blogs.msdn.com/psssql/archive/2007/02/21/sql-server-urban-legends-discussed.aspx

How It Works: SQL Server Checkpoint (FlushCache) Outstanding I/O Target http://blogs.msdn.com/psssql/archive/2008/04/11/how-it-works-sql-server-checkpoint-flushcache-outstanding-i-o-target.aspx

How It Works: SQL Server Page Allocations http://blogs.msdn.com/psssql/archive/2008/04/08/how-it-works-sql-server-page-allocations.aspx

How It Works: Shapshot Database (Replica) Dirty Page Copy Behavior (NewPage) http://blogs.msdn.com/psssql/archive/2008/03/24/how-it-works-shapshot-database-replica-dirty-page-copy-behavior-newpage.aspx

How It Works: SQL Server 2005 I/O Affinity and NUMA Don't Always Mix http://blogs.msdn.com/psssql/archive/2008/03/18/how-it-works-sql-server-2005-i-o-affinity-and-numa-don-t-always-mix.aspx

How It Works: Debugging SQL Server Stalled or Stuck I/O Problems - Root Cause http://blogs.msdn.com/psssql/archive/2008/03/03/how-it-works-debugging-sql-server-stalled-or-stuck-i-o-problems-root-cause.aspx

How It Works: SQL Server 2005 Database Snapshots (Replica) http://blogs.msdn.com/psssql/archive/2008/02/07/how-it-works-sql-server-2005-database-snapshots-replica.aspx

How It Works: File Stream the Before and After Image of a File http://blogs.msdn.com/psssql/archive/2008/01/15/how-it-works-file-stream-the-before-and-after-image-of-a-file.aspx

Using SQLIOSim to Diagnose SQL Server Reported Checksum (Error 824/823) Failures http://blogs.msdn.com/psssql/archive/2008/12/19/using-sqliosim-to-diagnose-sql-server-reported-checksum-error-824-823-failures.aspx

How to use the SQLIOSim utility to simulate SQL Server activity on a disk subsystem http://support.microsoft.com/kb/231619

Should I run SQLIOSim? - An e-mail follow-up from SQL PASS 2008 http://blogs.msdn.com/psssql/archive/2008/11/24/should-i-run-sqliosim-an-e-mail-follow-up-from-sql-pass-2008.aspx

What do I need to know about SQL Server database engine I/O? http://blogs.msdn.com/psssql/archive/2006/11/27/what-do-i-need-to-know-about-sql-server-database-engine-i-o.aspx

SQLIOSim is "NOT" an I/O Performance Tuning Tool http://blogs.msdn.com/psssql/archive/2008/04/05/sqliosim-is-not-an-i-o-performance-tuning-tool.aspx

How It Works: SQLIOSim - Running Average, Target Duration, Discarded Buffers ... http://blogs.msdn.com/psssql/archive/2008/11/12/how-it-works-sqliosim-running-average-target-duration-discarded-buffers.aspx

How It Works: SQLIOSim [Audit Users] and .INI Control File Sections with User Count Options http://blogs.msdn.com/psssql/archive/2008/08/19/how-it-works-sqliosim-audit-users-and-ini-control-file-sections-with-user-count-options.aspx

Understanding SQLIOSIM Output http://sqlblog.com/blogs/kevin_kline/archive/2007/06/28/understanding-sqliosim-output.aspx


Additional Learning Resources

Inside SQL Server 7.0 and Inside SQL Server 2000
• Written by Kalen Delaney
• Paul Randal wrote the core DBCC checks for SQL 7.0, 2000 and 2005

The Guru’s Guide to SQL Server Architecture and Internals – ISBN 0-201-70047-6
• Written by Ken Henderson after he joined Microsoft SQL Server Support
• Many chapters reviewed by developers and folks like myself

SQL Server 2005 Practical Troubleshooting – ISBN 0-321-44774-3 – Ken Henderson
Authors of this book were key developers or support team members:
• Cesar – QP developer and leader of the QP RedZone with Keithelm and Jackli
• Sameert – Developer of UMS and SQLOS Scheduler
• Santeriv – Developer of the lock manager
• Slavao – Developer of the SOS memory managers and engine architect
• Wei Xiao – Engine developer
• Bart Duncan – long time SQL EE and now developer of the Microsoft Data Warehouse – performance focused
• Bob Ward – SQL Server Support Senior EE

Advanced Windows Debugging – ISBN 0-321-37446
• Written by Microsoft developers – excellent resource

Applications for Windows – Jeffrey Richter
• Great details about Windows basics