
PlovDev 2016: Application Performance in Virtualized Environments by Todor Tsankov


Application Performance in Virtualized Environments

Todor Tsankov, Cloud Service Engineer


Agenda

1 CPU Resources

2 Memory Resources

3 Storage Resources

4 Network Resources


Managing CPU Resources


CPU - Features

• Fair proportional scheduling:
  – Shares
  – Limit
  – Reservation
• vSMP / Co-scheduling
• Hyperthreading
• Intel VT-x / AMD-V
• NUMA / prefer-HT
• vNUMA


CPU - NUMA


CPU - vNUMA


CPU – NUMA and Hyperthreading


CPU - Monitoring

• Ready (%RDY)
  – Percentage of time a vCPU was ready to be scheduled on a physical processor but could not be, due to processor contention
  – Investigation Threshold: 10% per vCPU
• Co-Stop (%CSTP)
  – Percentage of time a vCPU in an SMP virtual machine is “stopped” from executing so that another vCPU in the same virtual machine can run to “catch up”, which keeps the skew between the two virtual processors from growing too large
  – Investigation Threshold: 3%
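The two thresholds above can be applied mechanically once the counters have been collected. The following is a minimal sketch, assuming the %RDY values are already available per vCPU and %CSTP per virtual machine; the function name and input shape are hypothetical and not part of any VMware tool.

```python
# Thresholds taken from the slide; everything else here is illustrative.
RDY_THRESHOLD_PCT = 10.0   # %RDY, per vCPU
CSTP_THRESHOLD_PCT = 3.0   # %CSTP


def cpu_findings(rdy_per_vcpu, cstp_pct):
    """Return the counters that reach their investigation threshold.

    rdy_per_vcpu: list of %RDY values, one per vCPU (hypothetical input shape).
    cstp_pct: %CSTP value for the virtual machine.
    """
    findings = []
    for vcpu, rdy in enumerate(rdy_per_vcpu):
        if rdy >= RDY_THRESHOLD_PCT:
            findings.append(("RDY", vcpu, rdy))
    if cstp_pct >= CSTP_THRESHOLD_PCT:
        findings.append(("CSTP", None, cstp_pct))
    return findings
```

Calling `cpu_findings([2.0, 12.5], 4.0)` would flag the second vCPU for ready time and the virtual machine for co-stop. An empty list means neither threshold was reached.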


CPU – Best Practices

• Do not over-allocate vCPUs

• Create single vCPU VMs whenever possible

• Enable Hyperthreading

• Right-size the VM
  – vCPU count should be less than or equal to the number of cores in a single physical CPU (single NUMA node)
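The right-sizing rule reduces to a one-line comparison. The sketch below is illustrative only: the helper names are hypothetical, and it assumes one NUMA node per physical CPU, as the slide does. It reports the minimum number of nodes a VM would have to span, not where the scheduler would actually place it.

```python
import math


def numa_nodes_spanned(vcpus, cores_per_node):
    """Minimum number of NUMA nodes needed to give each vCPU its own core."""
    if vcpus < 1 or cores_per_node < 1:
        raise ValueError("vcpus and cores_per_node must be positive")
    return math.ceil(vcpus / cores_per_node)


def is_right_sized(vcpus, cores_per_node):
    """True when the VM fits inside a single NUMA node."""
    return numa_nodes_spanned(vcpus, cores_per_node) == 1
```

For example, on a host with 8 cores per socket, an 8-vCPU VM fits in one node while a 10-vCPU VM has to span two.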


Managing Memory Resources


Memory - Features

• Allow memory over-commitment
• Fair proportional memory scheduling
  – Scheduling parameters:
    • Shares
    • Limit
    • Reservation
• Support for large pages
  – Performance increased by 10 to 30%
• Intel EPT / AMD RVI


Memory - Mapping

• Three types of memory address spaces
  – Virtual memory
  – Physical memory
  – Machine memory


Memory - Mapping

• Hardware accelerated virtualization (Intel EPT / AMD RVI)
  – Handle shadow mapping in the hardware
  – Tagged Translation Look-aside Buffers (TLB)
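To make the three address spaces concrete, here is a toy model of the two-stage translation from guest virtual memory to guest physical memory to machine memory. It is a simplified sketch: the page tables are plain dictionaries with made-up page numbers, and `build_shadow_table` only illustrates the idea of a combined (shadow) mapping that software would otherwise maintain and that EPT / RVI resolve in hardware.

```python
PAGE_SIZE = 4096

# Hypothetical mappings, keyed by page number.
guest_page_table = {0: 7, 1: 3}     # guest virtual page  -> guest physical page
host_page_table = {7: 42, 3: 19}    # guest physical page -> machine page


def translate(vaddr):
    """Walk both stages: virtual -> physical -> machine address."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    gpn = guest_page_table[vpn]     # stage 1, owned by the guest OS
    mpn = host_page_table[gpn]      # stage 2, owned by the hypervisor
    return mpn * PAGE_SIZE + offset


def build_shadow_table():
    """Compose both stages into one virtual -> machine mapping."""
    return {vpn: host_page_table[gpn] for vpn, gpn in guest_page_table.items()}
```

The shadow table has to be rebuilt whenever either stage changes, which is the bookkeeping that hardware-assisted translation takes over.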


Memory - Reclamation

• Transparent page sharing
  – Most efficient
• Memory Ballooning
  – Always install the latest version of VMware Tools
• Memory Compression
  – May sound strange, but this is much faster than swapping
• Virtual Machine Swap
  – Hypervisor swap, not to be confused with the OS swap file/partition


Memory - Reclamation - Transparent Page Sharing (TPS)

Background process for removing duplicate memory pages
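The idea of removing duplicate pages can be sketched as content-based deduplication: hash each page, and when two pages have the same hash and the same bytes, keep one copy and point both owners at it. This is a simplified illustration, not the hypervisor's implementation; the function name is hypothetical, and details such as copy-on-write when a shared page is modified are left out.

```python
import hashlib


def share_pages(pages):
    """Deduplicate identical pages.

    pages: list of bytes objects, one per page.
    Returns (store, mapping): store holds one copy per distinct page,
    and mapping[i] is the index in store backing pages[i].
    """
    store = []
    index_by_digest = {}
    mapping = []
    for page in pages:
        digest = hashlib.sha1(page).digest()
        idx = index_by_digest.get(digest)
        # A full byte comparison guards against hash collisions.
        if idx is not None and store[idx] == page:
            mapping.append(idx)
            continue
        store.append(page)
        new_idx = len(store) - 1
        index_by_digest.setdefault(digest, new_idx)
        mapping.append(new_idx)
    return store, mapping
```

With three input pages where the first and third are both zero-filled, only two pages need to be kept.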


Memory – Reclamation - Ballooning

“Pushes” memory pressure from ESX host into VM


Memory - Reclamation - Compression

Essentially “zips” memory instead of swapping it so that it uses less space in RAM
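A rough way to picture the "zip instead of swap" idea is to compress a page and keep it in RAM only if the result is small enough to be worth it. The sketch below uses Python's `zlib` purely for illustration. The 50% cut-off is an assumption made for this example, not a figure from the talk, and the function names are hypothetical.

```python
import zlib

PAGE_SIZE = 4096
# Assumed cut-off for this example: keep a page compressed only if it
# shrinks to half its size or less.
MAX_COMPRESSED_SIZE = PAGE_SIZE // 2


def compress_page(page):
    """Return the compressed page, or None if it does not shrink enough."""
    if len(page) != PAGE_SIZE:
        raise ValueError("expected exactly one page")
    compressed = zlib.compress(page)
    return compressed if len(compressed) <= MAX_COMPRESSED_SIZE else None


def decompress_page(blob):
    """Restore the original page contents."""
    return zlib.decompress(blob)
```

A page that returns `None` here would be a candidate for swapping instead. Decompressing from RAM avoids a disk read, which is why compression is preferred when it applies.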


Memory - Reclamation - Swapping

Writes VM memory from physical RAM out to disk


Memory - Monitoring

• Balloon driver size (MCTLSZ)
  – The total amount of guest physical memory reclaimed by the balloon driver
  – Investigation Threshold: 1
• Swapping (SWCUR)
  – The current amount of guest physical memory that is swapped out to the ESX kernel VM swap file
  – Investigation Threshold: 1
• Swap Reads/sec (SWR/s)
  – The rate at which machine memory is swapped in from disk
  – Investigation Threshold: 1
• Swap Writes/sec (SWW/s)
  – The rate at which machine memory is swapped out to disk
  – Investigation Threshold: 1


Memory – Best Practices

• Do not overcommit memory
• Configure swap in your Guest Operating System
  – Size it to be at least equal to the configured vRAM for the VM
  – Put the swap partition or swap file (for Windows) on a separate virtual disk
• Install VMware Tools
  – This enables the ballooning driver and enables the VMkernel to use the best memory reclamation technique
• Enable Intel EPT / AMD RVI in the ESX host BIOS
• Use large memory pages in the guest OS
  – Minimizes TLB misses


Managing Storage Resources


Storage - Overhead


Storage – Monitoring

• Kernel Latency Average (KAVG)
  – This counter tracks the latencies of I/O passing through the kernel
  – Investigation Threshold: 1ms
• Device Latency Average (DAVG)
  – This is the latency seen at the device driver level. It includes the round-trip time between the HBA and the storage
  – Investigation Threshold: 15-20ms, lower is better, some spikes are okay
• Abort (ABRT/s)
  – The number of commands aborted per second
  – Investigation Threshold: 1
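Because some latency spikes are acceptable, a check against these thresholds is more useful on a series of samples than on a single reading. The sketch below is one possible approach, not a method from the talk: it compares the median of the latency samples against the thresholds, so an isolated spike does not trigger a finding, and uses the lower end (15 ms) of the DAVG range. The function name and input shape are hypothetical.

```python
from statistics import median

# Thresholds from the slide; DAVG uses the lower end of the 15-20 ms range.
KAVG_THRESHOLD_MS = 1.0
DAVG_THRESHOLD_MS = 15.0
ABRT_THRESHOLD_PER_S = 1.0


def storage_findings(kavg_ms, davg_ms, abrt_per_s):
    """Return the names of counters worth investigating.

    Each argument is a non-empty list of samples for one device.
    Latencies are judged by their median; any sample with aborted
    commands at or above the threshold is reported.
    """
    findings = []
    if median(kavg_ms) >= KAVG_THRESHOLD_MS:
        findings.append("KAVG")
    if median(davg_ms) >= DAVG_THRESHOLD_MS:
        findings.append("DAVG")
    if max(abrt_per_s) >= ABRT_THRESHOLD_PER_S:
        findings.append("ABRT")
    return findings
```

A single 40 ms DAVG sample among otherwise low readings is ignored, while sustained latency above the threshold is reported.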


Storage – Best Practices

• Place VM disks on separate physical disks if needed
• Do not oversize VM disks
  – A VM disk can be expanded, but it is difficult to shrink
• Preprovision VM disks
  – Do not use thin-provisioned disks for mission-critical applications
• Install VMware Tools
  – Installs optimized, OS-specific drivers for the SCSI controllers
• Align guest OS disks
  – Most modern operating systems do this automatically


Managing Network Resources


Network - Components


Network - Monitoring

• Transmit Dropped Packets (%DRPTX)
  – The percentage of transmit packets dropped
  – Investigation Threshold: 1%
• Receive Dropped Packets (%DRPRX)
  – The percentage of received packets dropped
  – Investigation Threshold: 1%
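Both counters are percentages, so they can be reproduced from raw packet counts. The sketch below assumes the percentage is computed as dropped packets over all packets handled (dropped plus delivered). That definition is an assumption made for this example and may differ from how the monitoring tool derives it. The threshold value of 1 from the slide is applied here to a percentage.

```python
DROP_THRESHOLD_PCT = 1.0  # threshold value from the slide, read as a percentage


def drop_percentage(dropped, delivered):
    """Percentage of packets dropped out of all packets handled."""
    total = dropped + delivered
    if total == 0:
        return 0.0
    return 100.0 * dropped / total


def needs_investigation(dropped, delivered):
    """True when the drop percentage reaches the threshold."""
    return drop_percentage(dropped, delivered) >= DROP_THRESHOLD_PCT
```

The same helper works for either direction: pass transmit counts to approximate %DRPTX and receive counts to approximate %DRPRX.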


Network - Best Practices

• Load balance on vSwitch level, not inside VM
  – Allow the Hypervisor to do the network teaming
• Install VMware Tools
  – Installs optimized, OS-specific drivers for the NIC adapters
• Use VMXNET3 vNIC adapters when possible
  – Supported by most modern operating systems


Thank You

Questions & Answers