HA & DRS Gotchas that will Kill your Infrastructure
Greg Shields
Partner and Principal Technologist
Concentrated Technology
www.ConcentratedTech.com
This slide deck was used in one of our many conference presentations. We hope you enjoy it, and invite you to use it
within your own organization however you like.
For more information on our company, including information on private classes and upcoming conference appearances, please
visit our Web site, www.ConcentratedTech.com.
For links to newly-posted decks, follow us on Twitter: @concentrateddon or @concentratdgreg
This work is copyright ©Concentrated Technology, LLC
Class Discussion
Two Questions:
So… how often do you actually vMotion your virtual machines?
– Or, how often would you, if you haven’t yet deployed ESX?
Against which failure states does vMotion provide protection?
– Hint: They’re fewer than you’d think…
vMotion Solves Two Problems
Problem #1: Protection from Host Failures or Scheduled Host Downtime
(These are relatively rare)
(They will get even more rare as we migrate to ESXi)
vMotion Solves Two Problems
Problem #2: Load Balancing of VM & Host Resources
(Much more common, where turned on)
[Diagram: an overloaded virtual host and an underloaded virtual host, both connected to shared storage and the network; live migration to new host]
Costs vs. Benefits
High-availability adds dramatically greater uptime for virtual machines.
– Protection against host failures
– Protection against resource overuse
– Protection against scheduled/unscheduled downtime
High-availability also adds much greater cost…
– Shared storage between hosts
– Connectivity
– Higher (and more expensive) software editions
Not every environment needs HA!
– Does anyone want to argue this point?
Contrary to Popular Belief…
…seeing the actual vMotion process occur isn’t all that sexy.
DEMO: Watching a vMotion occur…
vMotion: What Really Happens?
Sexy is recognizing what’s going on under the hood.
Let’s Compare to Hyper-V’s First Release
– Remember those days? A Hyper-V VM could be relocated “with a minimum of downtime”.
– This downtime was directly related to the amount of memory assigned to the virtual machine and the connection speed between virtual hosts and shared storage.
– A VM with 2 GB of vRAM could take 32 seconds or longer to migrate!
– Virtual machines with more assigned virtual memory and/or slower networks take longer to complete.
– Those with less complete the migration faster.
vMotion: What Really Happens?
So… how does it work?
– As you invoked a Hyper-V “Quick Migration”, the virtual machine was immediately put into a saved state.
– That state was not a power down, nor was it the same as pausing.
– In the saved state, and unlike pausing, the virtual machine released its memory reservation on the host machine and stored the contents of its memory pages to disk.
– Once this completed, a target host could take over the ownership of the virtual machine and bring it back to operations.
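To put a rough number on that save-then-restore downtime, here is a back-of-the-envelope sketch. It assumes the memory image is written to shared storage once and read back once at the same throughput; the function name and the 128 MB/s figure are illustrative assumptions, chosen only because they land on the 32-second example quoted earlier.

```python
def quick_migration_downtime_s(vram_mb: float, throughput_mb_s: float) -> float:
    """Rough Quick Migration downtime under a save-then-restore model.

    Assumption: the source host writes all VM memory to shared storage,
    then the target host reads it all back, both at the same throughput.
    """
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    save_s = vram_mb / throughput_mb_s      # VM is already offline here
    restore_s = vram_mb / throughput_mb_s   # still offline until this ends
    return save_s + restore_s


# Hypothetical numbers: 2 GB of vRAM over a 128 MB/s path to storage.
print(quick_migration_downtime_s(2048, 128))
```

Double the memory or halve the throughput and the outage doubles, which is the whole problem with this approach.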
vMotion: What Really Happens?
This saving of virtual machine state and transferring of memory contents consumed downtime.
So, how is this different from vMotion (& Hyper-V today)?
– vMotion and today’s Hyper-V use a pre-copy mechanism.
– It transfers VM memory pages from source to target host prior to starting the migration, while the VM is still running.
– During the pre-copy, memory changes are logged.
– These tend to be relatively small in size.
– Once the initial copy has completed, vMotion then…
  …pauses the virtual machine
  …copies the memory deltas
  …notifies network switches (RARP) & fibre fabric
  …transfers ownership to the target host
The result: Effectively “zero” downtime.
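The pre-copy idea can be sketched with the same kind of arithmetic. This is a simplified single-pass model, not how any hypervisor is actually implemented: it assumes one full copy while the VM runs, a constant rate at which the guest dirties memory, and one final delta copy while the VM is paused. The function name and all rates are hypothetical.

```python
def precopy_pause_s(vram_mb: float, link_mb_s: float, dirty_mb_s: float) -> float:
    """Pause time for a single-pass pre-copy migration (simplified model).

    Assumptions: constant link throughput, constant dirty rate, and only
    one delta pass. Real implementations may iterate more than once.
    """
    if link_mb_s <= 0:
        raise ValueError("link throughput must be positive")
    initial_copy_s = vram_mb / link_mb_s                # VM keeps running
    # Memory changed during the initial copy is what was "logged".
    delta_mb = min(vram_mb, dirty_mb_s * initial_copy_s)
    return delta_mb / link_mb_s                         # VM is paused only for this


# Hypothetical numbers: 2 GB of vRAM, 128 MB/s link, guest dirties 8 MB/s.
print(precopy_pause_s(2048, 128, 8))
```

The VM is only offline for the delta, so the pause scales with how busy the guest is rather than with how much memory it has.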
vMotion: Gotchas
Successful vMotion requires similar processors.
– Processors must be from the same manufacturer. No Intel-to-AMD or AMD-to-Intel vMotioning.
– Processors must be from proximate families.
– This bites people a few years down the road all the time!
vMotion: Gotchas
Big problem: As a virtual environment ages, hardware is refreshed and new hardware is added.
– New servers sometimes create “islands” of vMotion capability.
How can we always vMotion between computers?
– You can always refresh all hardware at the same time (Har!)
– You can cold migrate, with the machine powered down. This always works, but ain’t all that friendly.
– You can use Enhanced vMotion Compatibility (EVC) mode to manage your vMotion-ability. Create islands as individual clusters.
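To see where your islands are before they bite you, a quick inventory grouping helps. The sketch below is only an illustration: host names and family labels are made up, and it treats "proximate families" as an exact match on the family label, which is stricter than what EVC baselines actually allow.

```python
from collections import defaultdict


def vmotion_islands(hosts):
    """Group hosts into candidate vMotion islands.

    hosts: iterable of (host_name, cpu_vendor, cpu_family) tuples.
    Assumption: two hosts are compatible only if vendor AND family label
    match exactly. This is a simplification for illustration.
    """
    islands = defaultdict(list)
    for name, vendor, family in hosts:
        islands[(vendor, family)].append(name)
    return dict(islands)


# Hypothetical inventory after a couple of hardware refreshes.
inventory = [
    ("esx01", "Intel", "Family A"),
    ("esx02", "Intel", "Family A"),
    ("esx03", "Intel", "Family B"),
    ("esx04", "AMD", "Family C"),
]
for island, members in vmotion_islands(inventory).items():
    print(island, members)
```

Each group that comes out is a candidate for its own cluster.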
DEMO: vMotion EVC
Class Discussion
Blades versus Servers has been a common battle in virtualization spaces.
– Even I hated them for years.
– Then, I saw the light. <Cue angels singing>
Why are blades and EVC perfect for each other?
How are blades and Private Clouds perfect for each other?
New Topic! vMotion vs. Storage vMotion
vMotion vs. Storage vMotion
Three options for migrating machines and disks.
– Online & Powered On: vMotion
– Online & Powered On: Storage vMotion
– Offline & Powered Off: vMotion + Storage vMotion at the same time.
vMotion vs. Storage vMotion
Requirements:
– Virtual machines with snapshots cannot be svMotioned.
– Virtual machine disks must be in persistent mode or be RDMs.
– The host must have sufficient resources to support two instances of the VM running concurrently for a brief time.
– The host must have a vMotion license, and be correctly configured for vMotion.
– The host must have access to both the source and target datastores.
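The requirements above read naturally as a preflight checklist. The sketch below encodes them against plain Python records; it does not call any VMware API, and every class, field, and datastore name is a hypothetical stand-in for whatever your inventory tooling exposes. "Sufficient resources" is reduced to a memory check for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    has_snapshots: bool
    disk_modes: list          # hypothetical labels, e.g. ["persistent", "rdm"]
    memory_mb: int


@dataclass
class Host:
    free_memory_mb: int
    vmotion_licensed: bool
    vmotion_configured: bool
    datastores: set = field(default_factory=set)


def svmotion_blockers(vm, host, source_ds, target_ds):
    """Return the list of requirements this VM/host pair fails (empty = go)."""
    blockers = []
    if vm.has_snapshots:
        blockers.append("VM has snapshots")
    if any(mode not in ("persistent", "rdm") for mode in vm.disk_modes):
        blockers.append("a disk is neither persistent mode nor an RDM")
    if host.free_memory_mb < vm.memory_mb:
        blockers.append("host lacks headroom for a second instance of the VM")
    if not (host.vmotion_licensed and host.vmotion_configured):
        blockers.append("host is not licensed and configured for vMotion")
    if not {source_ds, target_ds} <= host.datastores:
        blockers.append("host cannot see both source and target datastores")
    return blockers


vm = VM("sql01", has_snapshots=True, disk_modes=["persistent"], memory_mb=4096)
host = Host(8192, True, True, {"ds-old", "ds-new"})
print(svmotion_blockers(vm, host, "ds-old", "ds-new"))
```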
HA & DRS: Combining vMotion + Math
vMotion Itself Might Not be Sexy…
…but what-you-can-do-with-it-once-you-combine-it-with-monitoring-and-a-set-of-smart-calculations… is! Rawrrr…
HA
– A host goes down, to where do I relocate VMs?
DRS
– A host is overloaded, how should I re-balance VMs to optimize resource capacity to resource demand?
These aren’t trivial questions to answer!
vCenter Constructs
Datacenter
– The boundary of a virtual infrastructure (& vMotion)
Cluster
– Collection of ESX hosts for centralized management
Resource Pool
– Logical abstraction of processing and memory capacity, which can be distributed to VMs as necessary.
– Sub-collections of resource capacity for distribution to VMs based on business rules.
DEMO: Properties of a Cluster
Shares, Reservations, & Limits
Shares
– Identifies the ratio of resources a VM can consume
Reservations
– Identifies minimum resources a VM is guaranteed
– Ensures minimum service level during resource contention
Limits
– Identifies maximum resources a VM may have
– Protects against resource overuse (spiking)
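Here is one way to picture how the three settings interact under contention. This is a deliberately simplified model and not the actual ESX scheduler: it assumes reservations are granted first, the remainder is split by share ratio, and anything a VM cannot take because of its limit is re-offered to the others. All names and numbers are hypothetical.

```python
def allocate(capacity, vms):
    """Split `capacity` among VMs (simplified model, not the ESX scheduler).

    vms: dict of name -> (shares, reservation, limit), all in the same unit
    as capacity (e.g. MHz or MB).
    """
    # Step 1: every VM gets its reservation, no matter what.
    alloc = {name: float(res) for name, (_, res, _) in vms.items()}
    remaining = capacity - sum(alloc.values())
    if remaining < 0:
        raise ValueError("reservations exceed capacity")

    # Step 2: hand out the rest by share ratio, never past a VM's limit.
    active = {name for name, (shares, _, limit) in vms.items()
              if shares > 0 and alloc[name] < limit}
    while remaining > 1e-9 and active:
        total_shares = sum(vms[name][0] for name in active)
        handed_out = 0.0
        for name in list(active):
            shares, _, limit = vms[name]
            take = min(remaining * shares / total_shares, limit - alloc[name])
            alloc[name] += take
            handed_out += take
            if limit - alloc[name] <= 1e-9:
                active.discard(name)   # capped; its leftover is re-offered
        remaining -= handed_out
    return alloc


# Hypothetical VMs: (shares, reservation, limit) in MHz on a 3000 MHz host.
print(allocate(3000, {"A": (1000, 500, 1000), "B": (2000, 0, 4000)}))
```

In this example VM A hits its limit, so the capacity it cannot use flows to VM B instead of sitting idle.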
Understanding Shares & Resource Pools
[Diagram: TechMentor Cluster]
Resource pools:
– PowerShell Deep Dive Resource Pool: 1000 CPU & RAM Shares, 2048 RAM
– ESXpert Deep Dive Resource Pool: 2000 CPU & RAM Shares, 4096 RAM
Virtual machines:
– WMI Class: 200 CPU Shares, 512M RAM
– AD Class: 600 CPU Shares, 784M RAM
– Storage Class: 1200 CPU Shares, 3G RAM
– vMotion Class: 800 CPU Shares, 512M RAM
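Shares only mean something relative to their siblings, so it helps to work the ratios through. The sketch below uses the share values from the diagram, but the nesting is an assumption: the extracted slide does not say which class sits in which pool, so WMI and AD are placed in the PowerShell pool and Storage and vMotion in the ESXpert pool purely for illustration. It also assumes every VM is demanding CPU at once.

```python
from fractions import Fraction


def cpu_fractions(pools):
    """Fraction of cluster CPU each VM gets when everything is contending.

    pools: dict of pool_name -> {"shares": int, "vms": {vm_name: shares}}.
    Shares are compared among siblings only: pools against pools,
    then VMs against the other VMs in the same pool.
    """
    total_pool_shares = sum(pool["shares"] for pool in pools.values())
    result = {}
    for pool in pools.values():
        pool_fraction = Fraction(pool["shares"], total_pool_shares)
        total_vm_shares = sum(pool["vms"].values())
        for vm, shares in pool["vms"].items():
            result[vm] = pool_fraction * Fraction(shares, total_vm_shares)
    return result


# Share values from the diagram; the VM-to-pool placement is assumed.
cluster = {
    "PowerShell Deep Dive": {"shares": 1000, "vms": {"WMI Class": 200, "AD Class": 600}},
    "ESXpert Deep Dive": {"shares": 2000, "vms": {"Storage Class": 1200, "vMotion Class": 800}},
}
for vm, fraction in cpu_fractions(cluster).items():
    print(f"{vm}: {fraction} of cluster CPU")
```

Note that a VM with 800 shares in one pool and a VM with 600 in another cannot be compared directly; the pool's own share value decides how big the pie is first.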
Abstracting Resources
EXTENDED DEMO:
– Reservations
– Limits
– Shares
– Resource Pools
Remember: These all factor into HA/DRS cluster automation calculations
What Role Does DRS Play?
[Diagram: TechMentor Cluster spanning Chassis 1, Chassis 2, and Chassis 3, with a “HOT!” callout]
What Role Does DRS Play?
Automation Level      Initial VM Placement   Dynamic Balancing   Administrator Involvement
Manual                Manual                 Manual              High – All actions
Partially-Automated   Automatic              Manual              Moderate – Approval actions
Fully-Automated       Automatic              Automatic           Low – Monitoring
Administrator trust is often the deciding factor in choosing the automation level.
How Does DRS Do its Doo-Doo?
Five recommendation levels
– 1 = DO THIS NOW!
– 2 = DO THIS (SORT OF) NOW
– 3 = Do this very soon-ish
– 4 = You know, if you have the time, you might consider…
– 5 = Meh. Get around to it when you have time.
How Does DRS Do its Doo-Doo?
Great, but how is this calculated?
– Anybody wanna’ get super geeky?
Uh, duh...
[Formula slide; the only surviving callout is the “Ceiling” operator]
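The formula itself is not recoverable from this slide, so the sketch below is not taken from the deck. As an assumption, it uses the form VMware has published for DRS migration priority, 6 - ceil(LoadImbalanceMetric / (0.1 * sqrt(NumberOfHostsInCluster))), which would at least explain the “Ceiling” callout. Check it against VMware's documentation for your version before relying on it; the function and parameter names are illustrative.

```python
import math


def drs_priority(load_imbalance: float, hosts_in_cluster: int) -> int:
    """Priority level of a load-balancing recommendation.

    Assumption: uses the published form
        6 - ceil(imbalance / (0.1 * sqrt(hosts)))
    Results outside 2..5 are not meaningful in this sketch; level 1 is
    understood to be reserved for mandatory moves, not load balancing.
    """
    if hosts_in_cluster < 1:
        raise ValueError("cluster needs at least one host")
    return 6 - math.ceil(load_imbalance / (0.1 * math.sqrt(hosts_in_cluster)))


# Hypothetical cluster: 4 hosts at two different imbalance readings.
print(drs_priority(0.22, 4))
print(drs_priority(0.50, 4))
```

The square root means a given imbalance reading is treated as less urgent in a bigger cluster, and a bigger imbalance yields a lower (more urgent) number.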
What About HA?
[Diagram: TechMentor Cluster spanning Chassis 1, Chassis 2, and Chassis 3; one chassis is marked “CRASH!!” and the others are marked “RESTART”]
What About HA?
VMware HA is enabled on the cluster; restart behavior can then be set for each virtual machine.
HA works hand in hand with DRS. DRS will analyze the resource load of the system at an HA event and decide where to restart the failed server.
A crashed system will restart on a new chassis. It will incur an outage, but that outage will be short.
Both HA and DRS require DNS.
– This is exceptionally important!
Affinity / Anti-Affinity
Within a DRS cluster, certain machines should remain on the same chassis.
– E.g., an application server and its database server
Others should never share a chassis.
– E.g., two domain controllers
Use affinity rules and anti-affinity rules to ensure correct placement of systems during DRS load balancing.
– Ensure systems aren’t given conflicting rules
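Conflicting rules are easy to create by accident, because affinity is transitive: if A stays with B and B stays with C, then A and C can never be kept apart. The sketch below shows one way to catch that on paper before DRS does. It models rules as plain sets of VM names (hypothetical) and does not read rules from vCenter.

```python
from itertools import combinations


def find_conflicts(affinity_rules, anti_affinity_rules):
    """Return VM pairs that must be both together and apart.

    Both arguments are lists of sets of VM names. Affinity groups are
    merged transitively with a small union-find.
    """
    parent = {}

    def find(vm):
        parent.setdefault(vm, vm)
        while parent[vm] != vm:
            parent[vm] = parent[parent[vm]]   # path compression
            vm = parent[vm]
        return vm

    def union(a, b):
        parent[find(a)] = find(b)

    for rule in affinity_rules:
        members = sorted(rule)
        for other in members[1:]:
            union(members[0], other)

    conflicts = []
    for rule in anti_affinity_rules:
        for a, b in combinations(sorted(rule), 2):
            if find(a) == find(b):
                conflicts.append((a, b))
    return conflicts


# Hypothetical rules: app+db together, db+dc1 together, app and dc1 apart.
keep_together = [{"app", "db"}, {"db", "dc1"}]
keep_apart = [{"dc1", "dc2"}, {"app", "dc1"}]
print(find_conflicts(keep_together, keep_apart))
```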
DEMO: Affinity
“Serve the public trust. Protect the innocent. Uphold the law.”
Building in Cluster Reserve
All of these calculations highlight the notion that you’re always going to need somewhere to go.
– Hosts that die need to send VMs somewhere.
– DRS needs a resource buffer if it’s to do its job.
– You need expansion potential.
That’s why maintaining a cluster reserve is important.
LIVE DRAW: Cluster Reserve
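A quick sanity check on cluster reserve can be done with nothing more than addition. The sketch below assumes the worst case (the largest hosts are the ones that die) and compares what is left against total VM demand. It ignores fragmentation, so a "yes" here is necessary but not sufficient: one large VM may still not fit on any single surviving host. All figures are hypothetical.

```python
def survives_host_failures(host_capacity_gb, vm_demand_gb, failures=1):
    """True if surviving hosts can hold all VM demand in aggregate.

    Assumption: worst case, so the `failures` largest hosts are removed.
    Only total capacity is compared; per-host fit is not checked.
    """
    if failures >= len(host_capacity_gb):
        return False
    surviving = sorted(host_capacity_gb)[: len(host_capacity_gb) - failures]
    return sum(surviving) >= sum(vm_demand_gb)


# Hypothetical cluster: three 64 GB hosts running four 30 GB VMs.
hosts = [64, 64, 64]
vms = [30, 30, 30, 30]
print(survives_host_failures(hosts, vms, failures=1))
print(survives_host_failures(hosts, vms, failures=2))
```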
Easter Egg: Change DRS Invocation Frequency
You can customize how often DRS will automatically take its own advice.
– I wish my wife had this setting…
On your vCenter Server, locate C:\Users\All Users\Application Data\VMware\VMware VirtualCenter\vpxd.cfg
Add in the following lines (appropriately!):
Final Thoughts
For the love of gosh, turn on HA/DRS.
– But only if you have enough hardware!
– You’ve already paid for it.
– It is smarter than you.
Understand why your VMs move around.
– …and always make sure that you’ve got the correct connected resources that they need on every host!
Save some cluster resources in reserve.
– You’ll thank me for it!
Please fill out evaluations, or your karma will suffer!