Migration v2
Andrew Cooper
Citrix XenServer
17th August 2015
Andrew Cooper (Citrix XenServer) Migration v2 17th August 2015 1 / 12
Why Migration v2
- XenServer 6.2
  - 64bit Xen, 32bit Dom0
  - Inertia, More efficient to virtualise
- XenServer 6.5
  - 64bit Xen, 64bit Dom0
  - High MMIO regions above 2^44 bits
- Rolling Pool Upgrade tests
  - Migrate VM from XS6.2 to XS6.5
  - Error on the receiving side:
xc: detail: xc_domain_restore: starting restore of new domid 1
xc: detail: xc_domain_restore: p2m_size = ffffffff00010000
xc: error: Couldn’t allocate p2m_frame_list array: Internal error
Legacy Migration
int xc_domain_restore(xc_interface *xch,
...
    if ( RDEXACT(io_fd, &dinfo->p2m_size, sizeof(unsigned long)) )
    {
        PERROR("read: p2m_size");
        goto out;
    }
    DPRINTF("%s: p2m_size = %lx\n", __func__, dinfo->p2m_size);
Legacy Migration
- No format written down
  - Subsequently reverse engineered from existing code
- No header information at all
- Hard to extend
  - Written mostly as two monolithic functions
  - goto tangle
  - PV MSR support too complicated to implement
- Asymmetry with Qemu handling
  - Save side's caller puts Qemu blob into the stream
  - Restore side pulls Qemu blob out and saves in magic path
- Stream contents depend on compilation ABI
  - Different between 32bit and 64bit
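
The ABI dependence can be illustrated with a short sketch. A 32bit sender writes p2m_size as a 4-octet unsigned long, while a 64bit receiver reads 8 octets and swallows whatever follows. The value of the following field (0xffffffff) is an assumption chosen so that the result matches the logged p2m_size; the slides do not say what actually came next in the stream.

```python
import struct

# What a 32bit toolstack would emit: sizeof(unsigned long) == 4.
p2m_size = 0x00010000
next_field = 0xFFFFFFFF  # hypothetical: whatever happens to follow in the stream

stream = struct.pack("<I", p2m_size) + struct.pack("<I", next_field)

# What a 64bit toolstack reads: sizeof(unsigned long) == 8.
(misread_p2m_size,) = struct.unpack("<Q", stream[:8])

print(hex(misread_p2m_size))  # 0xffffffff00010000
```

A fixed-width, documented field would be read identically on both sides, which is what a written-down binary layout provides.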
VM Serialisation
Information (Currently x86 specific)
  Common: Page Data, TSC
  HVM: Params, Context (Xen serialised state)
  PV: Width, Levels, P2M, VCPU State, Shared Info
- Suspend
  - Pause VM
  - Copy all memory
- Migrate
  - Enable logdirty
  - Copy all memory
    - Query logdirty bitmap
    - Copy dirty memory
    - Loop
  - Pause VM
  - Copy remaining memory
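
The migrate steps can be sketched as a toy simulation. Everything here is hypothetical (ToyGuest, live_migrate, max_iters, threshold); it models only the control flow, not the libxc implementation, and the loop-exit policy is an assumption.

```python
class ToyGuest:
    """Toy stand-in for a VM: a page array plus a logdirty set."""

    def __init__(self, nr_pages, workload):
        self.pages = [0] * nr_pages
        self.workload = iter(workload)  # batches of (pfn, value) writes
        self.dirty = set()
        self.logdirty = False
        self.paused = False

    def _run(self):
        # Let the guest run for a while, unless it has been paused.
        if self.paused:
            return
        for pfn, value in next(self.workload, []):
            self.pages[pfn] = value
            if self.logdirty:
                self.dirty.add(pfn)

    def query_dirty(self):
        # Return the set of pages dirtied since the last query, and clear it.
        self._run()
        dirty, self.dirty = self.dirty, set()
        return dirty


def live_migrate(guest, send, max_iters=4, threshold=1):
    guest.logdirty = True                      # Enable logdirty
    for pfn in range(len(guest.pages)):        # Copy all memory
        send(pfn, guest.pages[pfn])
    pending = set()
    for _ in range(max_iters):                 # Loop
        pending = guest.query_dirty()          # Query logdirty bitmap
        if len(pending) <= threshold:
            break                              # few enough pages left to pause
        for pfn in sorted(pending):            # Copy dirty memory
            send(pfn, guest.pages[pfn])
        pending = set()
    guest.paused = True                        # Pause VM
    for pfn in sorted(pending | guest.query_dirty()):
        send(pfn, guest.pages[pfn])            # Copy remaining memory


workload = [[(0, 1), (1, 1), (2, 1)], [(1, 2), (2, 2)], [(2, 3)]]
guest = ToyGuest(4, workload)
received = {}
live_migrate(guest, received.__setitem__)
```

After the call, received holds the same contents as guest.pages. The choice of max_iters and threshold corresponds to the looping parameters that decide when to stop iterating and pause the VM.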
Solution for XenServer
- Redesigned completely from scratch
- Specification written down
  - docs/specs/libxc-migration-stream.pandoc
  - Describes exact binary layout
  - Extensible
- Reimplemented completely from scratch
  - Common save and restore algorithms
  - Per-guest-type hooks to implement
- Legacy conversion needed
  - tools/python/scripts/convert-legacy-stream
  - Reads in legacy stream
  - Writes out v2 stream
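
The split between a common algorithm and per-guest-type hooks might look like the following sketch. The hook names, the record names and their order are all hypothetical; they illustrate the structure only and are not the libxc interface.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SaveOps:
    """Hypothetical per-guest-type hooks called by the common save path."""
    start_of_stream: Callable[[List[str]], None]
    end_of_stream: Callable[[List[str]], None]


def common_save(ops, pages, out):
    """Common save algorithm: only the hooks differ between guest types."""
    out += ["image header", "domain header"]
    ops.start_of_stream(out)
    out += [f"page data {pfn}" for pfn in pages]
    ops.end_of_stream(out)
    out.append("end")
    return out


# Illustrative hook tables; which record each guest type emits where is assumed.
PV_OPS = SaveOps(
    start_of_stream=lambda out: out.extend(["width, levels", "p2m"]),
    end_of_stream=lambda out: out.extend(["vcpu state", "shared info"]),
)
HVM_OPS = SaveOps(
    start_of_stream=lambda out: None,
    end_of_stream=lambda out: out.extend(["context", "params"]),
)

hvm_stream = common_save(HVM_OPS, [0, 1], [])
pv_stream = common_save(PV_OPS, [0], [])
```

Adding a new guest type then means supplying another hook table, without touching common_save.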
Stream format – libxc
 0     1     2     3     4     5     6     7 octet
+-------------------------------------------------+  +
| marker (0xffffffffffffffff)                     |  |
+-----------------------+-------------------------+  | Image
| id ("XENF" in ASCII)  | version (2)             |  | Header
+-----------+-----------+-------------------------+  |
| options   | (reserved)                          |  |
+-----------+-------------------------------------+  +
+-----------------------+-----------+-------------+  +
| type (PV, HVM, etc)   | page_shift| (reserved)  |  | Domain
+-----------------------+-----------+-------------+  | Header
| xen_major (4)         | xen_minor (6)           |  |
+-----------------------+-------------------------+  +
+-----------------------+-------------------------+  +
| type                  | body_length             |  |
+-----------+-----------+-------------------------+  |
| body...                                         |  | Record
...                                                  |
|           | padding (0 to 7 octets)             |  |
+-----------+-------------------------------------+  +
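
A minimal sketch of packing and walking this layout with Python's struct module follows. Field widths are taken from the diagram. Byte order is not shown on the slide: the sketch assumes a big-endian image header and little-endian for everything after it, which should be checked against docs/specs/libxc-migration-stream.pandoc. The domain type, page_shift and record type values used below are placeholders, not values from the specification.

```python
import struct

IMAGE_HDR = struct.Struct(">QIIH6x")  # marker, id, version, options, 6 reserved octets
DOMAIN_HDR = struct.Struct("<IHHII")  # type, page_shift, reserved, xen_major, xen_minor
RECORD_HDR = struct.Struct("<II")     # type, body_length

MARKER = 0xFFFFFFFFFFFFFFFF
XENF = int.from_bytes(b"XENF", "big")


def pack_record(rec_type, body):
    # body_length counts the body only; padding brings it up to 8 octets.
    padding = -len(body) % 8
    return RECORD_HDR.pack(rec_type, len(body)) + body + b"\0" * padding


def build_stream(domain_type, records, page_shift=12):
    out = IMAGE_HDR.pack(MARKER, XENF, 2, 0)
    out += DOMAIN_HDR.pack(domain_type, page_shift, 0, 4, 6)
    for rec_type, body in records:
        out += pack_record(rec_type, body)
    return out


def parse_stream(data):
    marker, ident, version, _options = IMAGE_HDR.unpack_from(data, 0)
    if marker != MARKER or ident != XENF or version != 2:
        raise ValueError("not a v2 stream")
    off = IMAGE_HDR.size
    dom_type, page_shift, _rsvd, major, minor = DOMAIN_HDR.unpack_from(data, off)
    off += DOMAIN_HDR.size
    records = []
    while off < len(data):
        rec_type, length = RECORD_HDR.unpack_from(data, off)
        off += RECORD_HDR.size
        records.append((rec_type, data[off:off + length]))
        off += length + (-length % 8)  # skip body and its padding
    return (dom_type, page_shift, major, minor), records


# Placeholder type values, for the round trip only.
example = build_stream(1, [(0x10, b"abc"), (0x11, b"12345678")])
domain, recs = parse_stream(example)
```

Because every record carries its own type and length, a reader can step over record types it does not understand, which is what makes the format extensible. A legacy stream, having no header at all, fails the marker check immediately.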
Upstreaming
- Problems with libxl
  - No participation in stream
  - 'Toolstack Data' depends on compilation ABI
- Design from scratch
- Specification written down
  - docs/specs/libxl-migration-stream.pandoc
  - Describes exact binary layout
  - Extensible
- Write from scratch
- Compatibility script extended
  - Able to write libxl migration v2 streams
  - 'Qemu' and 'Toolstack data' layered appropriately
Framing
[Figure: stream framing, one column per format, elements in stream order]

Legacy Migration:
  header
  optional data
  ...
  toolstack
  ...
  qemu

Migration v2:
  header
  optional data
  header
  libxc content
  image header
  domain header
  ...
  end
  emulator xenstore
  emulator context
  end

Remus Migration v2:
  header
  optional data
  header
  libxc content
  image header
  domain header
  ...
  checkpoint
  ...
  checkpoint end
  ...
  checkpoint
  ...
  checkpoint end
  ...

Key: xl, libxl, libxc
General Notes
- Issues fixed
  - PV VCPU state corruption when racing with vcpu actions
  - PV guests with superpages abort on save, rather than failing to reconstruct pagetables on restore
  - More efficient handling of page data
- Issues still present
  - Guests which balloon
  - PV P2M structure changes
  - HVM guests with PoD pages
- Areas for further work
  - Live migrate looping parameters
  - Linear P2M support
Status – Xen 4.6
All committed
Fully enabled (and tested)
xl save/restore/migrate/remus function as before
Legacy migration removed
No noticeable difference to users
Migration v2
Any Questions?