Monitoring SQL Server

MONITORING SQL SERVER

Key Performance Metrics and How to Interpret Them

Tuning blog: http://www.sqlperformance.com/

E-mail ebooks@sqlsentry.com for free copies of our $10 e-books.

YOUR PRESENTER

• John Q Martin
  o Sales Engineer for SQL Sentry
  o Worked with SQL Server for ~10 years
  o Consultant, SQL DBA, Dev & BI Developer
  o Former Microsoft Premier Field Engineer

• Contact Information
  o Email: Jmartin@SQLSentry.com
  o Blog: http://blogs.sqlsentry.com/author/JohnMartin/
  o Twitter: @SQLDiplomat
  o LinkedIn: https://uk.linkedin.com/in/johnqmartin

AGENDA

• CPU Monitoring

• Memory Monitoring

• Storage Monitoring

• SQL Server Monitoring
  o Monitoring Counters
  o Wait Stats
  o DMVs
  o Events

MONITORING APPROACHES

MONITORING FUNDAMENTALS

• Monitor over time and keep the captured data, as it will be invaluable
  o Don’t just grab everything “just in case”

• Use historical data to create baselines
  o Baselines will allow for spotting when regular events or time periods are ‘out of band’

• Historical monitoring data can be used to perform trend analysis and capacity planning.
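As an illustration of the baseline idea above, here is a minimal sketch, assuming samples of a single metric have already been captured and grouped by a recurring time slot such as hour of day. The function names, the time-slot grouping, and the three-standard-deviation tolerance are all assumptions chosen for the example, not something prescribed by the presentation.

```python
from statistics import mean, stdev

def build_baseline(history):
    """Build a per-slot baseline from historical samples.

    history maps a time slot (for example hour of day) to a list of past
    values of one metric. Slots with fewer than two samples are skipped
    because a standard deviation cannot be computed for them.
    """
    return {
        slot: (mean(values), stdev(values))
        for slot, values in history.items()
        if len(values) >= 2
    }

def is_out_of_band(baseline, slot, value, tolerance=3.0):
    """Return True when value is more than `tolerance` standard
    deviations away from the baseline mean for that slot."""
    average, spread = baseline[slot]
    return abs(value - average) > tolerance * spread

# Hypothetical history: % Processor Time observed at 09:00 on five days.
history = {9: [40, 42, 38, 41, 39]}
baseline = build_baseline(history)
print(is_out_of_band(baseline, 9, 43))  # within the usual range
print(is_out_of_band(baseline, 9, 60))  # well outside the usual range
```

The same stored history can feed trend analysis and capacity planning, which is one reason to keep it rather than discard it after alerting.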

CPU METRICS

• Process
  o % Processor Time
  o % Privileged Time

• Processor
  o % Processor Time
  o % Privileged Time
  o % DPC Time

CPU METRICS

• Important to monitor each CPU as well as the total CPU usage.
  o Helps identify potential MAXDOP issues.
  o Lets you see whether there are possible misconfigurations in the system outside of SQL Server.

• Excessive DPC and Privileged time can indicate issues elsewhere in the system, such as networking or storage.

• Monitoring the SQL Server processes will allow you to see how much time is spent on SQL Server
  o Depending on storage you can capture more process instances

CPU METRICS

[Figures: real-time monitoring; tracking CPU over time]

MEMORY METRICS

• NUMA Node Memory
  o Available MBytes
  o Total MBytes

• Memory
  o Available MBytes
  o Page Faults/sec
  o Page Reads/sec
  o Page Writes/sec

MEMORY MONITORING

• Understand what volumes of data are being read into and out of memory.

• Tracking memory use by NUMA node can have benefits depending on the configuration of the system.
  o Differences in the amount of memory allocated to each NUMA node can affect processing in the CPUs within each node.

• Ensuring that there is sufficient free memory is important to maintaining a stable system.

MEMORY MONITORING

• Track memory by node

• Aggregate to overall

• What else is in use?

[Figure: memory use by node, Node 0 and Node 1]

STORAGE METRICS

• Logical Disk
  o Same counters as Physical Disk
  o Which to use depends on your disk configuration: with a 1:1 mapping between physical & logical disks, use the Physical Disk metrics.

• Physical Disk
  o Disk Read Bytes/sec
  o Disk Write Bytes/sec
  o Disk Reads/sec
  o Disk Writes/sec
  o Split IO/sec
  o Current Disk Queue Length

STORAGE MONITORING

• Key monitoring elements for storage
  o IOPS
  o Throughput
  o Latency

• Monitor amount of space used
  o Sample rate does not need to be frequent; it can be minutes or hours rather than seconds.

• Understand the configuration of the disks as to whether you need to use Logical and/or Physical Disk counters.

SQL SERVER STORAGE DMVS

• sys.dm_io_virtual_file_stats()
  o Gives depth to the reads & writes into each database file
  o Allows you to derive Read/Write balance for data files
  o IO operations and bytes written

• sys.dm_io_pending_io_requests
  o Shows outstanding file IOs for SQL Server database files.

• sys.dm_db_index_physical_stats()
  o Gather index fragmentation details
  o Can cause lots of IO, use sparingly on large databases
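The counters returned by sys.dm_io_virtual_file_stats() are cumulative since the instance started, so IOPS, throughput, latency and read/write balance are usually derived from the difference between two snapshots. The sketch below assumes the snapshots for one database file have already been fetched into Python. The field names mirror columns of the DMV, while the `FileStats` class and `interval_metrics` function are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass

@dataclass
class FileStats:
    """One snapshot of cumulative counters for a single database file."""
    num_of_reads: int
    num_of_bytes_read: int
    io_stall_read_ms: int
    num_of_writes: int
    num_of_bytes_written: int
    io_stall_write_ms: int

def interval_metrics(before, after, seconds):
    """Derive per-interval storage metrics from two cumulative snapshots."""
    reads = after.num_of_reads - before.num_of_reads
    writes = after.num_of_writes - before.num_of_writes
    bytes_read = after.num_of_bytes_read - before.num_of_bytes_read
    bytes_written = after.num_of_bytes_written - before.num_of_bytes_written
    read_stall = after.io_stall_read_ms - before.io_stall_read_ms
    write_stall = after.io_stall_write_ms - before.io_stall_write_ms
    total_ops = reads + writes
    return {
        "read_iops": reads / seconds,
        "write_iops": writes / seconds,
        "read_mb_per_sec": bytes_read / seconds / (1024 * 1024),
        "write_mb_per_sec": bytes_written / seconds / (1024 * 1024),
        # Average latency is stall time divided by operations in the interval.
        "avg_read_latency_ms": read_stall / reads if reads else 0.0,
        "avg_write_latency_ms": write_stall / writes if writes else 0.0,
        # Share of operations that were reads (the read/write balance).
        "read_pct": 100.0 * reads / total_ops if total_ops else 0.0,
    }

# Hypothetical snapshots taken 60 seconds apart.
before = FileStats(1000, 8_192_000, 5000, 500, 4_096_000, 1000)
after = FileStats(1600, 13_107_200, 8000, 700, 5_734_400, 1400)
print(interval_metrics(before, after, seconds=60))
```

Counters reset when the instance restarts, so a real collector would also need to discard an interval in which any counter goes backwards.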

SQL SERVER METRICS

• Buffer Manager
  o Buffer Cache Hit Ratio
  o Checkpoint pages/sec
  o Page Reads/sec
  o Page Writes/sec

• Access Methods
  o Forwarded Records/sec
  o FreeSpace Scans/sec
  o Page Splits/sec
  o Workfiles Created/sec
  o Worktables Created/sec

SQL SERVER METRICS

• Buffer Node
  o Page Life Expectancy
  o Local node page lookups/sec
  o Remote node page lookups/sec

• Databases
  o Active Transactions
  o Log Bytes Flushed/sec
  o Log Flush Wait Time
  o Log Flush Waits/sec
  o Log Flushes/sec
  o Percent Log Used

PAGE LIFE EXPECTANCY

• "The PLE value is meaningless." Discuss.

• Value needs to be given context
  o How large is the buffer pool?
  o What is my IO sub-system capability?
  o What % of the IO channel is used to maintain the PLE value?

• Investigate changes
  o What happened when PLE suddenly dropped?

• Monitor at the Buffer Node level
  o The global PLE value will not equal the arithmetic mean of the node values.
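One way to give a PLE value the context described above is to relate it to buffer pool size and IO capability. The sketch below is a rough estimate only: it assumes the whole buffer pool turns over once per PLE interval, and the function names and the sample figures are hypothetical.

```python
def buffer_churn_mb_per_sec(buffer_pool_mb, ple_seconds):
    """Approximate read rate needed to sustain a given PLE,
    assuming the buffer pool is replaced once every PLE seconds."""
    if ple_seconds <= 0:
        raise ValueError("ple_seconds must be positive")
    return buffer_pool_mb / ple_seconds

def io_channel_pct(buffer_pool_mb, ple_seconds, io_capacity_mb_per_sec):
    """Share of the IO channel consumed by that churn, as a percentage."""
    churn = buffer_churn_mb_per_sec(buffer_pool_mb, ple_seconds)
    return 100.0 * churn / io_capacity_mb_per_sec

# Hypothetical server: 60,000 MB buffer pool, PLE of 300 seconds,
# storage able to sustain about 800 MB/s.
print(buffer_churn_mb_per_sec(60_000, 300))
print(io_channel_pct(60_000, 300, 800))
```

The same PLE of 300 seconds on a much smaller buffer pool would imply far less IO, which is why the raw number alone says little.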

PAGE LIFE EXPECTANCY

• Look for changes and see what else was going on
  o Large batch job/report
  o Someone runs DBCC DROPCLEANBUFFERS

• Frequent monitoring required as changes can happen fast
  o Seconds to minutes for monitoring interval.

What happened?

SQL SERVER WAITS

• ASYNC_NETWORK_IO

• LCK_*

• PAGELATCH_*

• RESOURCE_SEMAPHORE

• WRITELOG

• LOGBUFFER

• CXPACKET

• THREADPOOL

SQL SERVER EVENTS

• Monitor SQL Agent for Failed Jobs

• Monitor for 823, 824, 825 errors
  o Can indicate storage or corruption issues
  o Make use of Agent Alerts or tools to scan the agent log

• Monitor and manage the dbo.suspect_pages table in MSDB
  o SQL Server will track incidences of corrupt pages here
  o Limited to 1000 records, so it needs to be managed if there is anything here

SUMMARY

• Identify base metrics that you should be capturing and a capture frequency
  o Understand why you are collecting them and how to use them effectively

• Identify specific business events and cycles and create baselines to allow for tracking performance over multiple iterations and time

• Look for correlation between performance metrics
  o Make use of CORREL function in Excel if needed

• Track changes to the environment, code, applications, etc. This will help supplement the monitoring data.
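For the correlation point in the summary, the same Pearson coefficient that Excel's CORREL computes can be calculated directly on captured samples. This is a minimal sketch, assuming two metrics sampled at the same timestamps. The function name `correl` and the sample values are illustrative only.

```python
from math import sqrt

def correl(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / sqrt(var_x * var_y)

# Hypothetical samples: Page Reads/sec against average read latency (ms).
page_reads = [120, 340, 560, 800, 950]
read_latency = [4, 7, 11, 15, 19]
print(correl(page_reads, read_latency))
```

A value near 1 or -1 suggests the two metrics move together, but it does not show which one drives the other, so tracked changes to the environment remain useful alongside it.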

QUESTIONS

THANK YOU!

• Slides will be available at http://blogs.sqlsentry.com

• More information at:
  o SQLSkills, et al

• E-mail ebooks@sqlsentry.com for free copies of our e-books:
  o Just tell them where you met me

• My contact info for other questions:
  o Email: Jmartin@SQLSentry.com
  o Twitter: @SQLDiplomat
