11/15/2013
1
Squeezing Top Performance from your Virtualized SQL Server
David Klee, Group Principal and Practice Lead
Chicago SQL Server Users Group, November 14 2014
www.linchpinpeople.com 1
Linchpin People® are database coaches and wellness experts for the
Microsoft SQL Server Ecosystem. Our team includes more than 15 of the
world’s top SQL Server experts, and we have presented over 175 combined
sessions at SQL Saturdays, PASS Summit, and other events in the past year.
Do you have specific questions about your database environment? Check
out www.LinchpinPeople.com and take a look at the range of services we
provide, or contact one of us today to learn how we can help improve your
business.
You know us individually, but do
you know Linchpin People®?
www.linchpinpeople.com 2
11/15/2013
2
www.linchpinpeople.com 3
David Klee
@kleegeek
davidklee.net
gplus.to/kleegeek
linked.com/a/davidaklee
Specialties / Focus Areas / Passions:
• Performance Tuning &
Troubleshooting
• Virtualization
• Cloud Enablement
• Infrastructure
Architecture
• High Availability
• Disaster Recovery
• Health Monitoring
• Capacity Management
• Data Processing
Group Principal and Practice Lead
Agenda
• What does Virtualization Mean for DBAs?
• Environmental Design & Performance Impact
• SQL Server Virtual Machine Construction
• Demo – Tuned versus Non-Tuned VM Performance
• Real-World Performance Killers (and How to Fix Them)
www.linchpinpeople.com 4
11/15/2013
3
VirtualizationVirtualizationVirtualizationVirtualization
www.linchpinpeople.com 5
What Is It? Why Should DBAs Care?
What is Virtualization?
• Added layer between hardware and operating system
• Hardware treated as physical resource queues
• Extension of the hardware, so is treated as part of the platform
• Not just P2Ving a single server…
• Most DBA’s virtualization experience…
• Virtualization layer is transparentto performance
• Expect at least equivalent rawperformance (when done right)
www.linchpinpeople.com 6
11/15/2013
4
Hypervisor Options (Q3 2013)
VMware vSphere 5.5 Microsoft Hyper-V 2012 R2
Cost of the hypervisor and management suite is
irrelevant when compared to SQL Server licensing.
www.linchpinpeople.com 7
EnvironmentEnvironmentEnvironmentEnvironment
8
Underneath Your Data
www.linchpinpeople.com
11/15/2013
5
Physical Environment Model
9
Local Disks (OS, Instance Home)
SQL Server A
SQL Server B
SQL Server C
www.linchpinpeople.com
10
Virtualization (Private Cloud) Model
VIR
TU
ALI
ZAT
ION
VM
VM
VM
www.linchpinpeople.com
11/15/2013
6
11
Logical Environment Model
150 GHz
CPU
4 TBMemory
4x10GbE
Network
20 TB
Tier 1
Storage
40 TB
Tier 2
Storage
VM
16 vCPU
128 GB vRAM
VM
8 vCPU
64 GB vRAM
VM2 vCPU
16 GB
vRAM
VM2 vCPU
16 GB
vRAM
VM2 vCPU
16 GB
vRAM
VM2 vCPU
16 GB
vRAM
VM2 vCPU
16 GB
vRAM
VM2 vCPU
16 GB
vRAM
V I R T U A L I Z A T I O N
www.linchpinpeople.com
Virtualization: Then and Now
• Virtualization’s historical draw is hardware consolidation
• Performance of virtualized critical systems is number one concern
• Consolidation ratios are secondary (but a close second)
• Resource contention levels determine consolidation ratio
• Resource contention leads to silent SQL Server performance penalties
12
You can improve all of this!www.linchpinpeople.com
11/15/2013
7
ConfigurationConfigurationConfigurationConfiguration
13
Top to Bottom Performance
www.linchpinpeople.com
Host - Physical Server
• Fastest CPU cores vs. core count
• Maximize host memory capacity
• But ensure maximum memory performance (i.e. http://tinyurl.com/hpmemoryperf)
• Configuration tips:
• Set OS-controlled CPU power management
• Enable CPU virtualization extensions
• Plan for no memory overcommitment
• Plan for no more than 80% resource consumption
14www.linchpinpeople.com
11/15/2013
8
Host Cluster
15
• Enable automatic cluster-wide load balancing
• VMware DRS / Hyper-V Performance and Resource Optimization
SAN
Disk
Pool
V I R T U A L I Z A T I O N
95% 35% 25% 30% 25% 65%
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
VM
30%
Host Resource Utilization
(Shared) Storage
• Types of storage
• Local Storage (not recommended)
• Fibre Channel
• IP-Based (SMB, iSCSI, NFS)
• Single points of failure
• Multipath all paths
• RAID Types / Controller cache
• VMware: PVSCSI
• Storage Presentation:
• Tiered storage
Configuration Performance
• Benchmark all your storage
• Microsoft SQLIO (http://tinyurl.com/mssqlio)
• IOMeter (http://iometer.org)
• CrystalDiskMark(http://crystalmark.info)
• Metrics to collect:
• Throughput (MB/s)
• I/Os Per Second (IOps)
• Latency (ms)
16
11/15/2013
9
One Lane Highways vs. Interstate
• Storage performance bottleneck potential
• Get fastest possible! Get most possible!
• Multi-pathing configuration & software
• Vendor-recommended HBA queue depth settings
• Verify and test end-to-end performance
• Use iperf to test network throughput
• bit.ly/1eDCYbi
17www.linchpinpeople.com
VM ConstructionVM ConstructionVM ConstructionVM Construction
18
Performance-Oriented From Day One
www.linchpinpeople.com
11/15/2013
10
Resource Allocations (Maximums)
• 128 vCPUs per VM
• 2TB vRAM per VM
• 120 virtual machine disks (VMDK / RDMv) @ 62TB each
• 120 raw device mapped disks (RDMp) @ 64TB each
• Guest NUMA extensions
VMware vSphere 5.5 Microsoft Hyper-V 2012 R2
• 64 vCPUs per VM
• 1TB vRAM per VM
• 256 virtual hard drives (VHDX) @ 64TB each
• 256 Passthrough disks @ 256TB+ each
• Guest NUMA extensions
19
Virtual Machine Configuration
• Select correct operating system
• Correct vCPU vNUMA configuration
• Appropriate vRAM allocation
• Full memory reservation (Dynamic Memory?)
• CPU & RAM OS-level hot-add -> No vNUMA
• Use virtual disks whenever possible
• “Right-Sized” virtual machine resources
20www.linchpinpeople.com
11/15/2013
11
“Right-Sizing” the Virtual Machine
• “Right” amount of vCPU and vRAM resources
• Physical world = Size for requirements at end of life
• Virtual world = Size for your requirements right now
• All resources must be scheduled by the hypervisor’s resource queues
• Idle vCPUs can actually slow application performance of all VMs on the host
• How to “right-size”?
• Analyze performance baselines and determine what you need
• Repeat “right-sizing” analysis quarterly
21www.linchpinpeople.com
22
CPU Scheduling
(Two sockets, eight cores, no HT)
V I R T U A L I Z A T I O N
CPU Ready to Run Scheduling QueueCPU Ready to Run Scheduling Queue
VM
16 vCPU
128 GB vRAM
VM
8 vCPU
64 GB vRAM
VM2 vCPU
16 GB
vRAM
VM2 vCPU
16 GB
vRAM
VM2 vCPU
16 GB
vRAM
VM2 vCPU
16 GB
vRAM
VM2 vCPU
16 GB
vRAM
VM2 vCPU
16 GB
vRAM
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
K
T
A
S
Kwww.linchpinpeople.com
11/15/2013
12
VMware vCPU Ready Time
• vCPU Ready Time = Delay between vCPU task queue entry andexecution of the taskon a pCPU
• Performance impact of high vCPU Ready Time at http://tinyurl.com/kleecpuovercommit
23www.linchpinpeople.com
Hyper-V CPU Wait Time
• Equivalent to VMware vCPU Ready Time
• Hyper-V 2012 and above
• Perfmon counter:
• Hyper-V Hypervisor\Virtual Processor\CPU Wait Time Per Dispatch
• Jonathan Kehayias writeup:http://tinyurl.com/cpureadyhyperv
www.linchpinpeople.com 24
11/15/2013
13
Virtual Machine Configuration - Disks
Disk Layout
• C: - Operating System
• D: - SQL Server Instance Home
• E: - System Databases (master, model, msdb) *
• F: - User Database Data (1 of X)
• G: - User Database Log (1 of Y)
• H: - TempDB
• Y: - Windows Page file **
• Z: - Backups
25
26
Virtual Disk Placement
Tier 1
Disk Pool
(SSD)
Big Fancy
&
Expensive
SAN
Tier 2
Disk Pool
(FC/SAS)
Tier 3
Disk Pool
(SATA)
VM User Data 1
User Data 2
User Data 3
Operating System
User Logs 1
TempDB
Backups
Instance Home, System DBs
www.linchpinpeople.com
11/15/2013
14
Windows Server OS• Set power settings to “High Performance” (CPU-Z)
• 64KB NTFS allocation unit size
• Set antivirus exclusions for SQL Server (http://tinyurl.com/sqlav)
• Background applications & processes
• Ongoing OS-level performance metric collection
• No greater than five minute interval
• 24 x 7, not just as-needed
• Windows Perfmon, Microsoft SCOM, or any other third-party utility
27www.linchpinpeople.com
Instance-Level Configuration
• Set Max and Min Server Memory• Leave enough headroom for OS & background processes
• Enable Lock Pages in Memory
• Enable Instant File Initialization
• Set MaxDOP = vNUMA node core count
• Cost threshold for parallelism = Not default
• Large Pages (understand pros and cons)
• TempDB data file count
• Optimize for Ad-hoc Workloads
28www.linchpinpeople.com
11/15/2013
15
BenchmarksBenchmarksBenchmarksBenchmarks
29
Demonstrate Raw Performance
www.linchpinpeople.com
Performance Benchmarks and Baselines
• The demonstration of performance is largely subjective• Until you make it objective through benchmark metrics
• Must benchmark (and then create baseline):• Each component of infrastructure underneath OS• SQL Server Instance itself• Long running tasks / jobs / queries• Anything else you can measure that matters to your users
• Tools for benchmark statistics collection• Perfmon / SCOM / VMware vCenter• SQL Server DMOs• LOADS of third-party tools
• Monitoring and alerting
www.linchpinpeople.com 30
11/15/2013
16
Instance Benchmark Utility - DVDStore
• Open source database server benchmarking tool
• Single number output for relative instance-level performance comparisons – Orders placed per test
• Available from http://linux.dell.com/dvdstore
• How-to at http://tinyurl.com/kleedvdstore
31www.linchpinpeople.com
PerformancePerformancePerformancePerformance
32
Demo: Tuned vs. Non-Tuned VMs
www.linchpinpeople.com
11/15/2013
17
DVDStore Performance Comparisons
• 2 vSocket x 3 vCPU Core (+ vNUMA)
• 32GB vRAM
• Multiple virtual disks with PVSCSI
• 50GB DVDStore database
• All best practices applied
• 128 worker threads (2 tests x 64 t)
• Approximately 52% CPU utilized
• 774,925 combined orders placed in 60 min
Tuned VM Not Tuned VM
• 6 vSocket x 1 vCore (default)
• 32 GB vRAM
• One large virtual disk
• Click-through SQL Server install
• Had to reduce to 64 worker threads (connection timeout failures)
• Approximately 41% CPU utilized
• 538,164 combined orders placed in 60 minutes
• 31% slower than tuned VM
33
TroubleshootingTroubleshootingTroubleshootingTroubleshooting
34
Top Four Common Issues
www.linchpinpeople.com
11/15/2013
18
Remediation – Storage Performance
• Poor throughput / IOPs
• Unreliable performance
• High latencies
• “Cap” on throughput
Symptoms Remediation Steps
• Configure host multipathing
• Add more / faster paths
• Validate round-trip throughput
• Adjust disk group / RAID configuration
• Add SSD caching / controller cache / more spindles
35
Remediation – Host Overcommitment
• Memory reclamation
• High vCPU Ready Times
• Path bottlenecks
• Perfmon shows resource contention
Symptoms Remediation Steps
• Re-balance cluster
• Resource pools
• Add more cluster nodes
• Add more host resources
• Add more interconnect paths / multipathing
36
11/15/2013
19
Remediation – VM Misconfiguration
• (Silently) poor performance
• High vCPU Ready Times
Symptoms Remediation Steps
• Right-size vCPU count
• Right-size memory
• Spread out disk load to more virtual disks
• Align vNUMA with physical NUMA
• vNUMA presentation
• VM shares / reservations / limit
37
Remediation – “Noisy Neighbors”
• One VM periodically overpowers the host’s resources
• Causes performance problems with other VMs on the host
Symptoms Remediation Steps
• Resource limitation
• Resource pools
• Isolate VM
• Resolve the source of the “noise”
38
11/15/2013
20
Conclusions
• Pick the hypervisor that best suits your environment
• Break down the silos around the infrastructure and application teams – and work with them
• Understand your SQL Server workloads• Individual• Aggregate
• Validate target virtualization environment performance
• Virtualize everything
• Make sure you have your benchmarks to demonstrate equivalent performance
• Routinely check ongoing performance
www.linchpinpeople.com 39
Questions?Questions?Questions?Questions?
40
@kleegeek
davidklee.net
gplus.to/kleegeek
linked.com/a/davidaklee
www.linchpinpeople.com
11/15/2013
21
Thank You!Thank You!Thank You!Thank You!
www.linchpinpeople.com 41