WO2021023052A1 - Virtual machine hot migration method, apparatus, electronic device, and computer storage medium - Google Patents

Virtual machine hot migration method, apparatus, electronic device, and computer storage medium

Publication number
WO2021023052A1
WO2021023052A1 PCT/CN2020/105032 CN2020105032W WO2021023052A1 WO 2021023052 A1 WO2021023052 A1 WO 2021023052A1 CN 2020105032 W CN2020105032 W CN 2020105032W WO 2021023052 A1 WO2021023052 A1 WO 2021023052A1
Authority
WO
WIPO (PCT)
Prior art keywords
page table
page
virtual
memory
level
Prior art date
Application number
PCT/CN2020/105032
Other languages
English (en)
French (fr)
Inventor
张超
Original Assignee
阿里巴巴集团控股有限公司
Priority date
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2021023052A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0866 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches for peripheral storage systems, e.g. disk cache
    • G06F12/0868 Data transfer between cache memory and other subsystems, e.g. storage devices or host systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10 Address translation
    • G06F12/1009 Address translation using page tables, e.g. page table structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5022 Mechanisms to release resources
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/4557 Distribution of virtual machine instances; Migration and load balancing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45583 Memory management, e.g. access or allocation

Definitions

  • The embodiments of the present invention relate to the field of data processing technology, and in particular to a virtual machine hot migration method, apparatus, electronic device, and computer storage medium.
  • Virtual machine hot migration is a key technology in cloud computing operations. Through hot migration, a virtual guest can be migrated from one physical machine to another while it continues to run normally, which enables dynamic scheduling of computing resources, maintenance of failed physical machines, and so on.
  • The process of virtual machine hot migration copies the memory of the virtual guest from one physical machine to another iteratively, and the memory content copied in each iteration is determined according to the memory paging of the virtual guest.
  • Embodiments of the present invention provide a virtual machine hot migration method, apparatus, electronic device, and computer storage medium to solve the above-mentioned problems.
  • A virtual machine hot migration method is provided, including: according to a switching trigger instruction, switching a first page table in a first physical machine, which is used to indicate the mapping relationship between virtual guest memory addresses and host physical addresses, to a second page table; and, according to the second page table, hot migrating the virtual guest from the first physical machine to a second physical machine; wherein the size of the physical memory page indicated by the last-level page table of the second page table meets the set size, and the size of the physical memory page indicated by the last-level page table of the first page table is greater than the set size.
  • A virtual machine hot migration apparatus is provided, including: a switching module, configured to switch a first page table in a first physical machine, which is used to indicate the mapping relationship between virtual guest memory addresses and host physical addresses, to a second page table; and a hot migration module, configured to hot migrate the virtual guest from the first physical machine to a second physical machine according to the second page table; wherein the size of the physical memory page indicated by the last-level page table of the second page table meets the set size, and the size of the physical memory page indicated by the last-level page table of the first page table is greater than the set size.
  • An electronic device is provided, including: a processor, a memory, a communication interface, and a communication bus. The processor, the memory, and the communication interface communicate with one another through the communication bus; the memory is used to store at least one executable instruction, and the executable instruction causes the processor to perform the operations corresponding to the virtual machine hot migration method described above.
  • A computer storage medium is provided, having a computer program stored thereon; when the program is executed by a processor, the virtual machine hot migration method described above is implemented.
  • In the solution provided by the embodiments of the present invention, a first page table and a second page table are set in the first physical machine. Both are used to indicate the mapping relationship between virtual guest memory addresses and host physical addresses, so that the mapping relationship has a primary copy and a backup copy.
  • The difference is that the size of the physical memory page indicated by the last-level page table of the second page table meets the set size, such as the conventionally used 4K BYTES, while the size of the physical memory page indicated by the last-level page table of the first page table is larger than the set size, which is commonly referred to as a "large page".
  • Virtual machine hot migration can therefore be carried out according to the second page table.
  • Converting the physical memory page size to the page size required for hot migration does not suspend the virtual machine or otherwise degrade its performance, which greatly reduces the adverse effect on the running virtual guest caused by changing the page size for hot migration.
  • FIG. 1 is a flowchart of the steps of a method for hot migration of a virtual machine according to Embodiment 1 of the present invention
  • FIG. 2 is a flowchart of steps of a method for hot migration of a virtual machine according to Embodiment 2 of the present invention
  • FIG. 3 is a flowchart of steps of a method for hot migration of a virtual machine according to Embodiment 3 of the present invention.
  • FIG. 4 is a flowchart of steps of a method for hot migration of a virtual machine according to Embodiment 4 of the present invention.
  • FIG. 5 is a schematic diagram of the first page table and the second page table in the embodiment shown in FIG. 4;
  • FIG. 6 is a structural block diagram of a virtual machine hot migration device according to the fifth embodiment of the present invention.
  • FIG. 7 is a structural block diagram of a virtual machine hot migration device according to the sixth embodiment of the present invention.
  • FIG. 8 is a schematic structural diagram of an electronic device according to the seventh embodiment of the present invention.
  • Step S102 According to the switch trigger instruction, switch the first page table in the first physical machine that is used to indicate the mapping relationship between the memory address of the virtual guest and the physical address of the host machine to the second page table.
  • One physical machine can be virtualized into multiple virtual machines. These virtual machines are the virtual guests, and the physical machine is their host machine.
  • The virtual guests use the actual physical resources of the host machine through a page table that indicates the mapping relationship between virtual guest memory addresses and host physical addresses.
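  • Examples of virtualization platforms are KVM (Kernel-based Virtual Machine) and XEN; the page table may, for example, be an EPT (Extended Page Tables) page table.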
  • In this embodiment, a first page table and a second page table are set, and both are page tables that indicate the mapping relationship between virtual guest memory addresses and host physical addresses. The switching trigger instruction instructs the switch from the first page table to the second page table, and virtual machine hot migration is then performed on the basis of that switch.
  • the switching trigger instruction may be any appropriate instruction, or an instruction triggered by any appropriate trigger operation.
  • the first page table can be marked as read-only, and if a write error exception occurs during the hot migration, the second page table can also be used for processing.
  • The first page table and the second page table are both multi-level page tables. The size of the physical memory page indicated by the last-level page table of the second page table meets the set size, and the size of the physical memory page indicated by the last-level page table of the first page table is greater than the set size.
  • the set size can be set by those skilled in the art according to the memory page size required by the memory copy iteration of the hot migration, for example, it can be set to 4K BYTES.
  • Physical memory pages that meet the set size can be called "small pages"; correspondingly, physical memory pages that are larger than the set size, such as physical memory pages of 2M BYTES or 1G BYTES, can be called "large pages".
  • The second page table can be generated by further processing the information in the first page table, for example by dividing each "large page" in the first page table into "small pages" and creating the corresponding page table entries, which saves generation cost and improves generation efficiency. The generation method is not limited to this.
  • the second page table can also be generated in a conventional way of generating page tables, such as the same way of generating the first page table (for example, the way of generating EPT page tables).
  • Step S104 According to the second page table, the virtual guest is hot migrated from the first physical machine to the second physical machine.
  • The size of the physical memory page indicated by the last-level page table of the second page table meets the set size, that is, it meets the memory page size required by the memory copy iteration of virtual machine hot migration. Therefore, the hot migration of the virtual guest from the first physical machine to the second physical machine can be realized based on the second page table.
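The text only states that the memory copy iteration requires pages of the set size. One commonly cited reason, which is an assumption here and not a statement from the source, is that the amount of memory re-sent in each iteration depends on the page size used to track guest writes. The minimal Python sketch below compares the two granularities; the function name `dirty_bytes` and the write addresses are hypothetical.

```python
SMALL_PAGE = 4 * 1024          # the "set size" from the text (4K bytes)
LARGE_PAGE = 2 * 1024 * 1024   # a 2M "large page"

def dirty_bytes(write_addresses, page_size):
    """Bytes to re-copy if every written page is re-sent in full (assumed model)."""
    dirty_pages = {addr // page_size for addr in write_addresses}
    return len(dirty_pages) * page_size

# Three scattered one-byte guest writes inside the same 2M region (hypothetical).
writes = [0x000010, 0x100000, 0x1FFFFF]

small = dirty_bytes(writes, SMALL_PAGE)   # tracking at small-page granularity
large = dirty_bytes(writes, LARGE_PAGE)   # tracking at large-page granularity
print(small, large)
```

Under this model the three writes dirty three 4K pages but a whole 2M page, so small-page tracking re-sends far less data per iteration.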
  • In this embodiment, a first page table and a second page table are provided in the first physical machine, and both are used to indicate the mapping relationship between virtual guest memory addresses and host physical addresses.
  • The size of the physical memory page indicated by the last-level page table of the second page table meets the set size, such as the conventionally used 4K BYTES, while the size of the physical memory page indicated by the last-level page table of the first page table is larger than the set size, which is commonly referred to as a "large page".
  • Virtual machine hot migration can therefore be carried out according to the second page table.
  • Converting the physical memory page size to the page size required for hot migration does not suspend the virtual machine or otherwise degrade its performance, which greatly reduces the adverse effect on the running virtual guest caused by changing the page size for hot migration.
  • the virtual machine hot migration method of this embodiment can be executed by any appropriate electronic device with data processing capabilities, including but not limited to: servers, mobile terminals (such as tablet computers, mobile phones, etc.), and PCs.
  • Referring to FIG. 2, there is shown a flowchart of the steps of a virtual machine hot migration method according to the second embodiment of the present invention.
  • Step S202 According to the switch trigger instruction, switch the first page table used for indicating the mapping relationship between the memory address of the virtual guest and the physical address of the host computer in the first physical machine to the second page table.
  • the size of the physical memory page indicated by the last page table of the second page table meets the set size, and the size of the physical memory page indicated by the last page table of the first page table is greater than the set size.
  • the set size is as described in the first embodiment, and can be set by those skilled in the art according to the memory page size required by the memory copy iteration of the hot migration, for example, it can be set to 4K BYTES.
  • both the first page table and the second page table can adopt the form of EPT (Extended Page Tables) page tables.
  • This step can be implemented as follows: according to the switching trigger instruction, switch the currently used virtual machine active page table from the first page table to the second page table; and send, to all virtual guests in the first physical machine, an instruction signal for instructing them to reload the page table, so that each virtual guest switches the page table node in the first page table that it currently uses to the corresponding page table node in the second page table.
  • the virtual guest needs to use the memory resources of the host through the active page table.
  • the embodiment of the present invention provides a first page table and a second page table.
  • the first page table can be regarded as a page table pointing to "large pages”
  • the second page table can be regarded as a page table pointing to "small pages”.
  • Page table switching may be performed in units of virtual processors (VCPUs), which further reduces the impact of page table switching on the performance of the virtual guests.
  • In that case, sending the instruction signal for instructing to reload the page table to all virtual guests in the first physical machine, so that each virtual guest switches the page table node in the first page table that it currently uses to the corresponding page table node in the second page table, includes: sending the instruction signal to the virtual processors corresponding to all virtual guests in the first physical machine, so that each virtual processor switches the page table node in the first page table that it currently uses to the corresponding page table node in the second page table.
  • The second page table can be created according to the first page table.
  • The second page table can be created in advance, so that the switch can be made directly after the switching trigger instruction is received.
  • Alternatively, the second page table is created according to the first page table after the switching trigger instruction is received.
  • In the latter case the second page table is created according to the switching trigger instruction, that is, only after hot migration has been decided. Creating the second page table from the first page table improves creation efficiency and also ensures a smooth switch and stable virtual machine performance.
  • The embodiment of the present invention further provides a way to implement page table switching through variables: a first variable indicates the currently used virtual machine active page table, and resetting it to a set value switches the page table.
  • The set value can be chosen by those skilled in the art according to the actual situation, and can be a number, a letter, a symbol, or a combination of these. For example, taking active_page as the first variable, active_page equal to 1 indicates that the currently used virtual machine active page table is the first page table, and active_page equal to 0 indicates that it is the second page table. Changing the value of active_page thus realizes the page table switching instruction.
  • The first variable may be a memory management unit node (MMU_NODE) variable.
  • Because an MMU lock is used to protect MMU resources, the memory management unit (MMU) corresponding to the first variable can be locked before the first variable is reset to the set value, and unlocked after the second page table has been activated as the virtual machine active page table. In this way, exclusive use of MMU resources is guaranteed.
  • the use of the MMU_NODE variable is convenient for the subsequent rapid switching of the first page table and the second page table by changing the value of the NODE, and it can be compatible with the existing code logic as much as possible. But as mentioned earlier, other variable forms are also applicable.
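The variable-based switch can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the class `Mmu`, its fields, and the string root names are hypothetical, and a `threading.Lock` merely stands in for the MMU lock. The values 1 and 0 follow the active_page example in the text.

```python
import threading

FIRST, SECOND = 1, 0   # active_page values from the example in the text

class Mmu:
    """Hypothetical holder of both page table roots and the first variable."""

    def __init__(self, first_root, second_root):
        self.lock = threading.Lock()   # stands in for the MMU lock
        self.roots = {FIRST: first_root, SECOND: second_root}
        self.active_page = FIRST       # the "first variable"

    def switch_to(self, value):
        # Lock the MMU, reset the first variable, then unlock,
        # so that no other thread observes a half-finished switch.
        with self.lock:
            self.active_page = value
            return self.roots[value]

mmu = Mmu(first_root="large-page-root", second_root="small-page-root")
new_root = mmu.switch_to(SECOND)
print(mmu.active_page, new_root)
```

Keeping both roots reachable from one structure means the switch is a single variable update, which is also what makes switching back cheap.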
  • Each virtual processor switching the page table node in the first page table that it currently uses to the corresponding page table node in the second page table includes: according to the instruction signal, each virtual processor checks whether the value of the second variable, which points to the root page table currently used by that virtual processor, is consistent with the value of the first variable; if they are inconsistent, the value of the first variable replaces the value of the second variable. Replacing the variable value switches the page table node used by the virtual processor simply and quickly, which improves switching efficiency.
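The per-VCPU check described above can be sketched as a compare-and-replace on each virtual processor. The names `Vcpu`, `root_id`, `on_reload_signal`, and `broadcast_reload` are hypothetical, and delivery of the instruction signal is modeled as a plain method call.

```python
class Vcpu:
    """Hypothetical virtual processor; root_id plays the role of the second variable."""

    def __init__(self, root_id):
        self.root_id = root_id

    def on_reload_signal(self, active_page):
        """Compare the second variable with the first; replace it if they differ."""
        if self.root_id == active_page:
            return False           # already using the active page table
        self.root_id = active_page
        return True                # root page table was switched

def broadcast_reload(vcpus, active_page):
    """Send the reload signal to every VCPU; return how many actually switched."""
    return sum(v.on_reload_signal(active_page) for v in vcpus)

vcpus = [Vcpu(1), Vcpu(1), Vcpu(0)]
switched = broadcast_reload(vcpus, active_page=0)
print(switched, [v.root_id for v in vcpus])
```

Because the check is idempotent, a repeated signal leaves an already switched VCPU unchanged.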
  • After the page table switch, step S204 can be performed.
  • Step S204 According to the second page table, hot migrate the virtual guest from the first physical machine to the second physical machine.
  • the size of the physical memory page indicated by the last-level page table of the second page table meets the set size, such as 4K BYTES, which can effectively meet the needs of virtual machine hot migration for memory page size.
  • Step S206 Determine whether the hot migration is successful; if it is successful, perform step S208; otherwise, perform step S210.
  • the determination of whether the hot migration of the virtual machine is successful can be implemented in a conventional manner, such as whether the virtual client is running normally, whether the data is complete, etc., which is not limited in the embodiment of the present invention.
  • Step S208 If the hot migration is successful, release the first page table and the second page table in the first physical machine. End this process.
  • the memory data of the virtual guest is successfully copied from the first physical machine to the second physical machine.
  • the virtual guest can use the corresponding data and mechanism in the second physical machine to work normally.
  • the first page table and the second page table in the first physical machine can be released, including releasing data related to the page table and occupied resources.
  • Step S210 If the hot migration fails, perform failure processing.
  • If, after the hot migration operation is performed, the virtual guest cannot work normally on the second physical machine, then the hot migration was performed but did not succeed, and a rollback operation is triggered.
  • In this case, the second page table can be switched back to the first page table, and the second page table in the first physical machine is released. By switching back to the first page table, the virtual guest can work normally in the cloud according to the first page table, and the second page table is released.
  • The hot migration failure may be caused by reasons such as non-convergence of the migration. Switching back to the first page table and rolling back the hot migration ensures that the virtual guest works in its original state and continues to use the first page table, which improves the performance of the virtual machine and avoids affecting users.
  • If an abnormality occurs before the hot migration completes, the second page table can be released first; then, after the currently used virtual machine active page table is switched back to the first page table, the first page table is released. In this case the related data and occupied resources of both the first page table and the second page table need to be released.
  • the first and second page tables can also be released at the same time.
  • the method of releasing the second page table first, and then switching the active page table of the currently used virtual machine back to the first page table and releasing it is more in line with the actual code logic implemented by the solution.
  • Through steps S206-S210, unsuccessful hot migration is handled effectively, and the reliability and safety of hot migration are ensured.
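The outcome handling of steps S206-S210 can be summarized as a small decision function. This is a sketch of one reading of the text: the enum `Outcome`, the function `finish_migration`, and the action strings are hypothetical names, and each branch returns the ordered actions taken on the first physical machine.

```python
from enum import Enum

class Outcome(Enum):
    SUCCESS = "success"     # guest runs normally on the second physical machine
    ROLLBACK = "rollback"   # migration failed, guest keeps running on the first
    ABORTED = "aborted"     # abnormality before completion, both tables released

def finish_migration(outcome):
    """Return the ordered actions for the first physical machine (assumed reading)."""
    if outcome is Outcome.SUCCESS:
        return ["release first page table", "release second page table"]
    if outcome is Outcome.ROLLBACK:
        return ["switch back to first page table", "release second page table"]
    return [
        "release second page table",
        "switch back to first page table",
        "release first page table",
    ]

for outcome in Outcome:
    print(outcome.value, finish_migration(outcome))
```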
  • In this embodiment, a first page table and a second page table are provided in the first physical machine, and both are used to indicate the mapping relationship between virtual guest memory addresses and host physical addresses.
  • The size of the physical memory page indicated by the last-level page table of the second page table meets the set size, such as the conventionally used 4K BYTES, while the size of the physical memory page indicated by the last-level page table of the first page table is larger than the set size, which is commonly referred to as a "large page".
  • Virtual machine hot migration can therefore be carried out according to the second page table.
  • Converting the physical memory page size to the page size required for hot migration does not suspend the virtual machine or otherwise degrade its performance, which greatly reduces the adverse effect on the running virtual guest caused by changing the page size for hot migration.
  • the virtual machine hot migration method of this embodiment can be executed by any appropriate electronic device with data processing capabilities, including but not limited to: servers, mobile terminals (such as tablet computers, mobile phones, etc.), and PCs.
  • Referring to FIG. 3, there is shown a flowchart of the steps of a virtual machine hot migration method according to the third embodiment of the present invention.
  • This embodiment focuses on how to create a second page table to describe the virtual machine hot migration solution provided by the embodiment of the present invention.
  • Step S302 After receiving the switching trigger instruction, create a second page table according to the first page table.
  • the size of the physical memory page indicated by the last page table of the second page table meets the set size
  • the size of the physical memory page indicated by the last page table of the first page table is greater than the set size.
  • the set size is as described in the first embodiment, and can be set by those skilled in the art according to the memory page size required by the memory copy iteration of the hot migration, for example, it can be set to 4K BYTES.
  • the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table may be traversed, and the second page table may be created according to the traversal result.
  • the rapid creation of the second page table can be realized.
  • the memory reverse mapping table records the mapping relationship reflecting the physical memory address of the host machine and its corresponding virtual address. Whenever a physical memory page is mapped to a new virtual address space (here, the virtual guest physical address), the last-level page table entry corresponding to the physical memory page is recorded in the memory reverse mapping table.
  • the memory of the virtual guest is composed of multiple virtual memory slots memslot, and each memslot does not overlap each other. That is, by traversing the memory reverse mapping table corresponding to each virtual memory slot memslot, the mapping relationship between the memory address of the virtual client and the corresponding physical memory address can be obtained. According to this, the data and information used to create the second page table can be obtained.
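The traversal of the per-memslot reverse mapping tables can be sketched as below. The data layout is an assumption made for illustration: `Leaf`, `Memslot`, and `collect_mappings` are hypothetical names, and each reverse mapping table is modeled as a plain list of last-level entries.

```python
from dataclasses import dataclass, field

@dataclass
class Leaf:
    gpa: int    # virtual guest physical address mapped by this last-level entry
    hpa: int    # host physical address of the backing page
    size: int   # size of the physical memory page in bytes

@dataclass
class Memslot:
    base_gpa: int
    length: int
    rmap: list = field(default_factory=list)   # reverse mapping table (assumed layout)

def collect_mappings(memslots):
    """Walk every memslot's reverse map and gather guest-to-host mappings."""
    mappings = {}
    for slot in memslots:
        for leaf in slot.rmap:
            if not (slot.base_gpa <= leaf.gpa < slot.base_gpa + slot.length):
                raise ValueError("leaf entry lies outside its memslot")
            mappings[leaf.gpa] = (leaf.hpa, leaf.size)
    return mappings

slots = [
    Memslot(0x000000, 0x400000, [Leaf(0x000000, 0x10000000, 0x200000),
                                 Leaf(0x200000, 0x10200000, 0x200000)]),
    Memslot(0x400000, 0x001000, [Leaf(0x400000, 0x20000000, 0x001000)]),
]
print(collect_mappings(slots))
```

Since memslots do not overlap, each guest address appears in at most one slot, so the collected dictionary has no conflicting keys.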
  • After a traversed physical memory page whose size is greater than the set size is divided according to the set size, the corresponding entries in the second page table are generated.
  • Creating the second page table according to the traversal result may include: copying each traversed last-level page table entry to generate a last-level page table entry in the second page table, and marking the generated entry with a set identifier; determining whether the size of the physical memory page pointed to by the generated last-level page table entry is greater than the set size; and, if it is greater, deleting the set identifier of that entry and, according to the size of the physical memory page it points to, establishing at least one level of sub-page table entries under it, wherein the physical memory page pointed to by each last-level entry among the sub-page table entries has the set size.
  • the setting identifier can be appropriately set by a person skilled in the art according to actual needs, which is not limited in the embodiment of the present invention; the setting size is as described above and will not be repeated here.
  • If the physical memory page size is greater than the set size, the page is a "large page" and does not meet the requirements for a last-level page table entry of the second page table. The set identifier that marks the entry as a last-level page table entry needs to be deleted, and the "large page" needs to be processed into "small pages". Going from a "large page" to "small pages" may require one or more levels of processing. For example, a 2M "large page" can be processed into 512 "small pages" of 4K BYTES: next-level sub-page table entries are created under the current entry level of the second page table, and each sub-page table entry points to a 4K BYTES physical memory page.
  • If the "large page" is 1G, it is first processed into 512 pages of 2M, and each 2M page is then processed into 512 "small pages" of 4K BYTES. Two levels of sub-page table entries are created under the current entry level of the second page table: each first-level sub-page table entry points to second-level sub-page table entries, and each second-level sub-page table entry points to a 4K BYTES physical memory page.
  • the size of the physical memory page pointed to by the generated last-level page table entry is not greater than the set size, it indicates that it points to a "small page" and can be directly copied and used.
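  • The arithmetic behind the 2 MB and 1 GB examples can be checked with a short sketch. It assumes a fan-out of 512 entries per page table (as in the examples above) and a set size of 4 KB; the function name split_levels is hypothetical and is not part of the described method.

```python
PAGE_4K = 4 * 1024
PAGE_2M = 2 * 1024 * 1024
PAGE_1G = 1024 * 1024 * 1024
FANOUT = 512  # assumed number of entries per page table


def split_levels(page_size, set_size=PAGE_4K):
    """Return (levels of sub-page tables needed, number of set-size pages)."""
    levels = 0
    size = page_size
    while size > set_size:
        if size % FANOUT:
            raise ValueError("page size is not a multiple of the fan-out")
        size //= FANOUT          # one more level of sub-page table entries
        levels += 1
    if size != set_size:
        raise ValueError("page size cannot be reduced to the set size")
    return levels, page_size // set_size
```

  • Under these assumptions a 2 MB page needs one extra level and a 1 GB page needs two, matching the description above.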
  • In some cases, a physical memory page of the host may be mapped to multiple memory pages of a virtual guest.
  • In that case, the memory reverse mapping table stores a linked list of the multiple last-level page table entries that point to the physical memory page.
  • Traversing the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table, and creating the second page table according to the traversal result, then includes: obtaining the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table; traversing the entries of the memory reverse mapping table one by one and determining whether the current entry stores a linked list; if a linked list is stored, traversing each last-level page table entry in the linked list and copying the information about the multiple last-level page table entries corresponding to the linked list, together with the content of each traversed last-level page table entry, to the second page table; and if no linked list is stored, copying the content of the current entry to the second page table.
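  • The traversal just described can be sketched as follows. This is only an illustration under simplifying assumptions: the reverse mapping table is modeled as a Python dict, a "linked list" as a Python list, and a page table entry as a dict; copy_rmap_to_second_table and the field names are hypothetical.

```python
def copy_rmap_to_second_table(rmap):
    """Copy every last-level entry reachable from rmap into a new table.

    rmap maps a key to either one entry (a dict) or a list of entries,
    the list standing in for the linked list of last-level entries.
    """
    second = {}
    for key, slot in rmap.items():
        if isinstance(slot, list):
            # A linked list is stored: traverse and copy each entry in it.
            second[key] = [dict(entry) for entry in slot]
        else:
            # No linked list: copy the single entry directly.
            second[key] = dict(slot)
    return second
```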
  • Step S304: After the second page table is successfully created, switch the currently used virtual machine active page table from the first page table to the second page table.
  • For the specific implementation of switching the currently used virtual machine active page table from the first page table to the second page table, refer to the description of step S202 in the second embodiment; it is not repeated here.
  • Step S306: According to the second page table, hot migrate the virtual guest from the first physical machine to the second physical machine.
  • In this embodiment, a first page table and a second page table are provided in the first physical machine, and both indicate the mapping between virtual guest memory addresses and host physical addresses.
  • The physical memory pages indicated by the last-level page table of the second page table satisfy the set size, such as the conventionally used 4 KB, while the physical memory pages indicated by the last-level page table of the first page table are larger than the set size and are commonly referred to as "large pages".
  • After the switch from the first page table to the second page table, the virtual machine hot migration can be carried out according to the second page table.
  • Converting the physical memory page size to the page size required for hot migration therefore does not suspend the virtual machine or otherwise degrade its operation, which greatly reduces the adverse effect on the running virtual guest of changing the page size for hot migration.
  • The virtual machine hot migration method of this embodiment can be executed by any appropriate electronic device with data processing capabilities, including but not limited to servers, mobile terminals (such as tablet computers and mobile phones), and PCs.
  • Referring to FIG. 4, a flowchart of the steps of a virtual machine hot migration method according to the fourth embodiment of the present invention is shown.
  • KVM is a virtualization technology based on CPU hardware support and can be implemented as a Linux module, namely the KVM module. After Linux loads the KVM module, virtual machines can be created with other tools. With the KVM module alone, however, users cannot directly control the operating system kernel; a corresponding user-space tool such as Qemu is needed to control KVM from user space, that is, Qemu works in the user mode of the operating system.
  • In KVM, to achieve memory virtualization and let the virtual guest use an isolated, contiguous memory space starting from zero, the guest physical address space (Guest Physical Address, GPA) is introduced. The GPA is not a real physical address space; it is only a mapping of the host virtual address space (Host Virtual Address, HVA) into the address space of the virtual guest.
  • For the virtual guest, the GPA is a contiguous address space starting from zero, but for the host, the physical address space of the virtual guest is not necessarily contiguous.
  • The physical address space of the virtual guest may be mapped onto several discontiguous host address ranges.
  • VMM: Virtual Machine Monitor.
  • Memory virtualization converts a virtual address of the virtual guest (Guest Virtual Address, GVA) into a physical address of the host (Host Physical Address, HPA), passing through the guest physical address (Guest Physical Address, GPA) and the host virtual address (Host Virtual Address, HVA), namely: GVA → GPA → HVA → HPA.
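  • The GVA → GPA → HVA → HPA chain can be illustrated with a toy model. The sketch assumes 4 KB pages and represents each mapping stage as a dict from page number to page number; the names translate, gva_to_hpa, guest_pt, memslot_map and host_pt are hypothetical and do not correspond to KVM or Qemu APIs.

```python
PAGE_SHIFT = 12                      # 4 KB pages
PAGE_MASK = (1 << PAGE_SHIFT) - 1


def translate(addr, page_map):
    """Map the page number through page_map and keep the in-page offset."""
    return (page_map[addr >> PAGE_SHIFT] << PAGE_SHIFT) | (addr & PAGE_MASK)


def gva_to_hpa(gva, guest_pt, memslot_map, host_pt):
    gpa = translate(gva, guest_pt)       # GVA -> GPA
    hva = translate(gpa, memslot_map)    # GPA -> HVA
    return translate(hva, host_pt)       # HVA -> HPA
```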
  • The virtual machine hot migration method of this embodiment includes the following steps:
  • Step S402: Before performing the hot migration of the virtual machine, the Qemu thread generates a switching trigger instruction through an IOCTL call, indicating that it is ready to switch between the main page table and the backup page table.
  • The main page table is the aforementioned first page table.
  • The backup page table (also called the standby page table) is the aforementioned second page table.
  • Step S404: Create a backup page table based on the main page table according to the switching trigger instruction.
  • In KVM, the physical memory of the virtual guest is divided into several memslots that do not overlap one another.
  • The backup page table is created by traversing all memslots in turn and, for each memslot, copying and rebuilding the page table entries of the physical memory corresponding to it.
  • For the current memslot, the backup page table is created by traversing the entries of the memslot's memory reverse mapping (rmap) table.
  • The memory reverse mapping (rmap) is a data structure that records the correspondence between a physical memory address and its virtual addresses, that is, the relationship between each physical memory page and its page table entries.
  • The EPT page table entry associated with a physical memory page can be found through the page frame number (gfn) of the guest physical memory page, namely rmap[gfn].
  • When a host physical memory page is mapped to a new virtual address space (here, the guest physical address space), the address of the last-level page table entry corresponding to that physical memory page is recorded in rmap[gfn].
  • When a host physical memory page is mapped to multiple new virtual address spaces at the same time, rmap[gfn] records the head address of a pte linked list, and all the last-level page table entries associated with the physical memory page are recorded in that pte linked list.
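  • The two cases for rmap[gfn] (a single entry address versus a pte linked list) can be sketched as below. The model is an assumption made for illustration: rmap is a dict, entry addresses are plain integers, and a Python list stands in for the pte linked list; rmap_add is a hypothetical name.

```python
def rmap_add(rmap, gfn, pte_addr):
    """Record that the last-level entry at pte_addr maps the page gfn."""
    current = rmap.get(gfn)
    if current is None:
        rmap[gfn] = pte_addr                  # first mapping: store it directly
    elif isinstance(current, list):
        current.append(pte_addr)              # already a list: append to it
    else:
        rmap[gfn] = [current, pte_addr]       # second mapping: start a list
```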
  • The multiple physical memory pages included in the physical memory of the virtual guest may be determined according to the memory allocation information of the virtual machine, and each determined physical memory page may be represented by its page frame number (gfn).
  • The physical memory of the virtual guest is composed of multiple memslots. Each memslot has a basegfn, which records the start offset of the memslot in the physical address space of the entire virtual guest; the gfn of each physical memory page in the memslot is calculated from the basegfn of the memslot and the offset of the page inside the memslot. To traverse all gfns, every memslot must be traversed once.
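  • A minimal sketch of this gfn calculation follows. The Memslot class and its npages field are assumptions introduced for the example; only the idea "gfn = basegfn + offset inside the memslot" comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class Memslot:
    base_gfn: int   # start offset of the slot in guest physical space, in pages
    npages: int     # assumed field: number of pages in the slot


def iter_gfns(memslots):
    """Yield every gfn by walking each memslot exactly once."""
    for slot in memslots:
        for offset in range(slot.npages):
            yield slot.base_gfn + offset
```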
  • Each last-level page table entry obtained by the traversal is copied to the backup page table, that is, the backup EPT page table. Specifically, this may include: (a) if the upper-level page table corresponding to the last-level page table entry currently being copied does not exist in the backup page table, creating the corresponding upper-level page table for that entry and adding it to the backup page table (in the same manner as for the first page table, for example EPT); (b) determining whether the page pointed to by the last-level page table entry currently being copied is a "large page"; if it is not a "large page", then, once the copy is finished, returning to (a) to continue with the next page table entry; if it is a "large page", deleting the flag indicating that the page table entry is a last-level page table entry, and then establishing in turn all sub-page table entries corresponding to the page table entry that points to the "large page".
  • Establishing all sub-page table entries corresponding to the page table entry pointing to the "large page" includes: first querying, from the page table entry pointing to the "large page", the guest physical page frame number gfn associated with the entry and its actual physical page frame number pfn in the host; determining whether the memory buffer pool resources required to create page table entries are sufficient and, if they are not, adding cache resources to the buffer pool; and, if they are sufficient, creating in turn the corresponding multi-level "small page" page tables for all the "small page" physical memory covered by the "large page" physical memory (for example, for a 2 MB "large page", 2 MB-level and 4 KB-level page tables need to be created; for a 1 GB "large page", 1 GB-level, 2 MB-level, and 4 KB-level page tables need to be created; with EPT page tables, second and third page tables need to be created for 1 GB of physical memory).
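  • How one "large page" entry expands into sub-page table entries can be sketched recursively. The sketch assumes 4 KB "small pages", 512 entries per table, and frame numbers counted in 4 KB units; it does not model the memory buffer pool check. The name split_large_entry and the dict fields are hypothetical.

```python
SET_SIZE = 4 * 1024
FANOUT = 512


def split_large_entry(gfn, pfn, size):
    """Build the (possibly nested) sub-page table entries for one mapping.

    size is assumed to be SET_SIZE multiplied by a power of FANOUT.
    """
    if size <= SET_SIZE:
        return {"gfn": gfn, "pfn": pfn, "last_level": True}
    child_size = size // FANOUT
    step = child_size // SET_SIZE        # 4 KB frames covered by each child
    return {
        "gfn": gfn,
        "pfn": pfn,
        "last_level": False,             # the last-level flag is removed
        "children": [
            split_large_entry(gfn + i * step, pfn + i * step, child_size)
            for i in range(FANOUT)
        ],
    }
```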
  • Filling in the entries of the backup page table includes: for a non-last-level page table (a page table other than the fourth-level page table), querying the page table at the current level corresponding to the gfn to check whether its next-level page table exists; if it does not exist, allocating a page table for the next level and filling the physical address of the newly allocated page table memory into the current-level page table. This process is repeated until the last-level page table, that is, the fourth-level page table, is reached. For the last-level page table (the fourth-level page table), the entry is filled in according to the memory attributes and the corresponding gfn and pfn information.
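  • The level-by-level filling can be sketched with nested dicts standing in for page tables. The sketch assumes a four-level layout with 9 index bits (512 entries) per level, with level 1 as the root and level 4 as the last level; index_at, map_gfn and the entry fields are hypothetical names, and dict allocation stands in for allocating page table memory.

```python
LEVELS = 4
BITS = 9  # assumed index width per level (512 entries)


def index_at(gfn, level):
    """Index of gfn in the page table at `level` (1 = root, 4 = last level)."""
    return (gfn >> (BITS * (LEVELS - level))) & ((1 << BITS) - 1)


def map_gfn(root, gfn, pfn, attrs):
    table = root
    for level in range(1, LEVELS):
        # Create the next-level table if it does not exist yet.
        table = table.setdefault(index_at(gfn, level), {})
    # Fill the last-level entry from the attributes and gfn/pfn information.
    table[index_at(gfn, LEVELS)] = {"pfn": pfn, "attrs": attrs}
```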
  • The flag bit is set to 0 for subsequent tracking using PML.
  • The last-level page table entry is added to the rmap structure corresponding to the gfn. If there are multiple gfns, the last-level page table entry needs to be added to the corresponding rmap linked list. Once all physical memory has been traversed, the creation of the backup page table is complete.
  • A comparison between a created backup page table (i.e., the second page table) and the main page table (i.e., the first page table) is shown in FIG. 5.
  • In FIG. 5, both the main page table and the backup page table take the EPT page table as an example (main EPT page table, backup EPT page table), and each node has the same meaning as in a regular EPT page table.
  • In the main EPT page table, higher-level page table entries such as PDE or PDPTE are used directly to point to a physical "large page".
  • PDE: third-level page table.
  • PDPTE: second-level page table.
  • The backup EPT page table is created from the main EPT page table.
  • The backup EPT page table uses PTEs (the fourth-level page table) to point to 4 KB physical memory pages (shown by the dashed box in FIG. 5), and the PTEs serve as the last-level page table of the backup EPT page table to complete the mapping of all physical memory.
  • Step S406: Switch the KVM active page table node, which indicates the page table of the current virtual guest, to the backup page table node.
  • In this embodiment, to prevent a wholesale switch of the page table from affecting the performance of the virtual guest, the switching of the main and backup page tables is broken down to vcpu granularity.
  • This embodiment introduces two new variables, kvm->mmu_node and vcpu->mmu_node, where kvm->mmu_node represents the root page table that the current KVM points to and vcpu->mmu_node represents the root page table used by the current vcpu.
  • The switch can be achieved with kvm->mmu_node = kvm->mmu_node ^ 1, where "^ 1" denotes the 1->0 (from 1 to 0) or 0->1 (from 0 to 1) operation.
  • mmu_node is protected by an mmu lock, so the mmu lock must be acquired before mmu_node is changed.
  • The process of creating and switching the backup page table is mutually exclusive with memory hot-plug operations.
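  • The toggle of mmu_node under the mmu lock can be sketched as follows. This is a user-space illustration only: it assumes the toggle is an exclusive-or with 1 (consistent with the 1->0 / 0->1 description), uses threading.Lock in place of the kernel's mmu lock, and labels 0 as the main page table and 1 as the backup page table purely for the example. The Kvm class and switch_active_page_table are hypothetical names.

```python
import threading


class Kvm:
    def __init__(self):
        self.mmu_node = 0                  # assumed: 0 = main, 1 = backup
        self.mmu_lock = threading.Lock()   # stand-in for the mmu lock

    def switch_active_page_table(self):
        # The lock is held for the whole update of mmu_node.
        with self.mmu_lock:
            self.mmu_node ^= 1             # 0 -> 1 or 1 -> 0
            return self.mmu_node
```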
  • Step S408: Send the mmu_reload signal to all vcpus.
  • Each vcpu handles the mmu_reload signal before it next enters guest mode, so that the page table is reloaded.
  • Step S410: The vcpu detects the mmu_reload signal and checks whether the mmu_node it currently uses is consistent with the active mmu_node specified by KVM. If they are consistent, the mmu_reload signal was triggered for another reason and is not processed. If they are inconsistent, step S412 is executed.
  • Step S412: If the vcpu finds that the mmu_node it uses is inconsistent with the active mmu_node specified by KVM, the mmu_node used by the current vcpu is replaced with the mmu_node specified by KVM.
  • Step S414: When the mmu_node of every vcpu is consistent with the mmu_node specified by KVM, the switch is complete.
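  • Steps S410 to S414 amount to a compare-and-replace per vcpu followed by a global check. A minimal sketch, with the hypothetical names Vcpu, handle_mmu_reload and switch_complete, and with mmu_node modeled as a plain integer:

```python
class Vcpu:
    def __init__(self, mmu_node):
        self.mmu_node = mmu_node


def handle_mmu_reload(vcpu, kvm_mmu_node):
    """Return True if the vcpu had to switch its root page table."""
    if vcpu.mmu_node == kvm_mmu_node:
        return False                  # signal raised for another reason
    vcpu.mmu_node = kvm_mmu_node      # S412: adopt the node specified by KVM
    return True


def switch_complete(vcpus, kvm_mmu_node):
    """S414: the switch is done once every vcpu uses KVM's mmu_node."""
    return all(v.mmu_node == kvm_mmu_node for v in vcpus)
```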
  • Step S416: Flush the TLB cache corresponding to the vm so that the newly loaded page table takes effect; at the same time, enable PML to monitor memory changes.
  • TLB: Translation Lookaside Buffer.
  • PTE: Page Table Entry.
  • PML (Page Modification Logging) is a CPU feature. When it is enabled, the CPU records information about the physical memory pages that the CPU has written. Hot migration needs this write record to ensure that the vm data at the source and the destination are consistent.
  • Step S418: Perform the hot migration of the virtual machine.
  • The virtual guest on the first physical machine is hot migrated to the second physical machine.
  • Step S420: Perform post-processing of the virtual machine hot migration.
  • If the virtual machine hot migration succeeds, the main page table and the backup page table can be released. If the virtual machine hot migration fails, the virtual guest needs to continue working on the first physical machine, and steps S406-S416 can be repeated to switch the backup EPT page table back to the main EPT page table.
  • This ensures that the virtual guest can use the main EPT page table, thereby improving the performance of the virtual guest. After the switch back to the main EPT page table, the backup EPT page table can be released.
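  • The two outcomes of the post-processing can be summarized in a small sketch. The state dict, its keys, and post_migration are hypothetical and only model which page tables remain on the first physical machine.

```python
def post_migration(succeeded, state):
    """Update state after migration; state has 'active' and 'tables' keys."""
    if succeeded:
        state["tables"].clear()            # release main and backup tables
        state["active"] = None
    else:
        state["active"] = "main"           # switch back to the main table
        state["tables"].discard("backup")  # then release the backup table
    return state
```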
  • active_mmu_pages records the addresses of all page table entries used by the current virtual guest. The main page table and the backup page table each have their own active_mmu_pages. After the virtual machine has been hot migrated, a page table can be released by traversing the active_mmu_pages that corresponds to it.
  • The release process of the backup page table includes: traversing each item in active_mmu_pages one by one; determining whether the current item is active and whether it has page table child nodes; if it has page table child nodes, traversing all the child nodes and recording them in the invalid list; and releasing all page table entries in the invalid list, thereby releasing all page table memory.
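  • The release traversal can be sketched as below. The PageTablePage class is a hypothetical stand-in for a page table page, and the sketch assumes, beyond what is stated above, that each active item is itself placed on the invalid list along with its child nodes and that a node is recorded only once.

```python
from dataclasses import dataclass, field


@dataclass
class PageTablePage:
    name: str
    active: bool = True
    children: list = field(default_factory=list)


def release_page_table(active_mmu_pages):
    """Collect pages on an invalid list, release them, return their names."""
    invalid, seen = [], set()

    def record(page):
        if id(page) not in seen:
            seen.add(id(page))
            invalid.append(page)

    for page in active_mmu_pages:
        if not page.active:
            continue
        record(page)                     # assumption: the item is released too
        stack = list(page.children)
        while stack:                     # traverse all child nodes
            child = stack.pop()
            record(child)
            stack.extend(child.children)

    freed = [p.name for p in invalid]
    for p in invalid:                    # "release" every recorded page
        p.active = False
        p.children = []
    active_mmu_pages.clear()
    return freed
```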
  • The page table release process is described above for the backup page table.
  • The main page table is released in the same way as the backup page table; those skilled in the art can implement the release of the main page table according to the above description, which is not repeated here.
  • With the solution provided in this embodiment, before the virtual machine hot migration, the memory mapping relationships in the virtual guest's rmap are actively analyzed and the virtual guest's EPT page table is backed up and rebuilt; the root table (root directory entry) of the backup EPT page table is then reloaded on each vcpu in turn, switching the page table of the virtual guest from the main EPT page table to the backup EPT page table in one pass and thereby switching the memory of the virtual guest to the standard 4 KB mode. This effectively improves the performance of the virtual machine and reduces the impact of virtual machine hot migration on users.
  • The virtual machine hot migration method of this embodiment can be executed by any appropriate electronic device with data processing capabilities, including but not limited to servers, mobile terminals (such as tablet computers and mobile phones), and PCs.
  • Referring to FIG. 6, a structural block diagram of a virtual machine hot migration apparatus according to the fifth embodiment of the present invention is shown.
  • The virtual machine hot migration apparatus of this embodiment includes: a switching module 502, configured to switch, according to a switching trigger instruction, the first page table in the first physical machine, which indicates the mapping between virtual guest memory addresses and host physical addresses, to a second page table; and a hot migration module 504, configured to hot migrate the virtual guest from the first physical machine to a second physical machine according to the second page table; wherein the size of the physical memory pages indicated by the last-level page table of the second page table satisfies the set size, and the size of the physical memory pages indicated by the last-level page table of the first page table is greater than the set size.
  • The virtual machine hot migration apparatus of this embodiment is used to implement the corresponding virtual machine hot migration method in the foregoing method embodiments and has the beneficial effects of the corresponding method embodiments, which are not repeated here.
  • For the functional implementation of each module in the virtual machine hot migration apparatus of this embodiment, refer to the description of the corresponding part in the foregoing method embodiments; it is not repeated here.
  • Referring to FIG. 7, a structural block diagram of a virtual machine hot migration apparatus according to the sixth embodiment of the present invention is shown.
  • The virtual machine hot migration apparatus of this embodiment includes: a switching module 602, configured to switch, according to a switching trigger instruction, the first page table in the first physical machine, which indicates the mapping between virtual guest memory addresses and host physical addresses, to a second page table; and a hot migration module 604, configured to hot migrate the virtual guest from the first physical machine to a second physical machine according to the second page table; wherein the size of the physical memory pages indicated by the last-level page table of the second page table satisfies the set size, and the size of the physical memory pages indicated by the last-level page table of the first page table is greater than the set size.
  • The switching module 602 includes: an active page table switching submodule 6022, configured to switch the currently used virtual machine active page table from the first page table to the second page table according to the switching trigger instruction; and an instruction submodule 6024, configured to send an instruction signal instructing a page table reload to all virtual guests in the first physical machine, so as to instruct each virtual guest to switch the page table node in the first page table that it currently uses to the corresponding page table node in the second page table.
  • The active page table switching submodule 6022 includes: a creation unit 60222, configured to create the second page table according to the first page table after the switching trigger instruction is received; and a post-creation switching unit 60224, configured to switch the currently used virtual machine active page table from the first page table to the second page table after the second page table is successfully created.
  • The post-creation switching unit 60224 is configured to, after the second page table is successfully created, determine the variable value of the first variable that points to the root page table of the virtual machine active page table, and determine from that value that the currently used virtual machine active page table is the first page table; and to re-assign the first variable to a set value, deactivating the first page table and activating the second page table as the virtual machine active page table according to the set value.
  • The first variable is a memory management unit node (MMU_NODE) variable; the post-creation switching unit 60224 is further configured to lock the memory management unit (MMU) corresponding to the first variable before the first variable is re-assigned to the set value, and to unlock the MMU after the second page table is activated as the virtual machine active page table.
  • The instruction submodule 6024 is configured to send an instruction signal instructing a page table reload to the virtual processors corresponding to all virtual guests in the first physical machine, so as to instruct each virtual processor to switch the page table node in the first page table that it currently uses to the corresponding page table node in the second page table.
  • Each virtual processor switches the currently used page table node in the first page table to the corresponding page table node in the second page table as follows: each virtual processor checks whether the variable value of the second variable, which points to the root page table currently used by that virtual processor, is consistent with the variable value of the first variable; if not, the variable value of the second variable is replaced with the variable value of the first variable.
  • The virtual machine hot migration apparatus of this embodiment further includes: a first hot migration processing module 606, configured to determine whether the hot migration is successful; if the hot migration is successful, release the first page table and the second page table in the first physical machine; and if the hot migration fails, switch the second page table back to the first page table and release the second page table in the first physical machine.
  • The creation unit 60222 is configured to, after the switching trigger instruction is received, traverse the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table, and create the second page table according to the traversal result.
  • The creation unit 60222 is configured to, after the switching trigger instruction is received, traverse the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table; copy each traversed last-level page table entry to generate a last-level page table entry in the second page table, and mark the generated last-level page table entry with a set identifier; determine whether the size of the physical memory page pointed to by the generated last-level page table entry is greater than the set size; and, if it is greater, delete the set identifier corresponding to that last-level page table entry and, according to the size of the physical memory page it points to, establish at least one level of sub-page table entries for it, wherein the physical memory page pointed to by each last-level entry among the sub-page table entries has the set size.
  • The creation unit 60222 is further configured to determine whether the upper-level page table corresponding to the generated last-level page table entry exists and, if it does not exist, generate the corresponding upper-level page table for the last-level page table entry and save it to the second page table.
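  • When a physical memory page is mapped to multiple memory pages of a virtual guest, the memory reverse mapping table stores a linked list of the corresponding multiple last-level page table entries that point to the physical memory page.
  • In that case, the creation unit 60222 traverses the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table and creates the second page table according to the traversal result by: obtaining the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table; traversing the entries of the memory reverse mapping table one by one and determining whether the current entry stores a linked list; if a linked list is stored, traversing each last-level page table entry in the linked list and copying it to the second page table; and if no linked list is stored, copying the content of the current entry to the second page table.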
  • The virtual machine hot migration apparatus of this embodiment further includes: a second hot migration processing module 608, configured to, if a hot migration exception occurs during the hot migration, first release the second page table and then, after the currently used virtual machine active page table has been switched back to the first page table, release the first page table.
  • The virtual machine hot migration apparatus of this embodiment is used to implement the corresponding virtual machine hot migration method in the foregoing method embodiments and has the beneficial effects of the corresponding method embodiments, which are not repeated here.
  • For the functional implementation of each module in the virtual machine hot migration apparatus of this embodiment, refer to the description of the corresponding part in the foregoing method embodiments; it is not repeated here.
  • An electronic device comprises a processor, a memory, a communication interface, and a communication bus.
  • The processor, the memory, and the communication interface communicate with each other through the communication bus; the memory is used to store at least one executable instruction that causes the processor to perform the operations corresponding to the virtual machine hot migration method described above.
  • Referring to FIG. 8, a schematic structural diagram of an electronic device according to the seventh embodiment of the present invention is shown.
  • The specific embodiments of the present invention do not limit the specific implementation of the electronic device.
  • The electronic device may include: a processor 702, a communication interface 704, a memory 706, and a communication bus 708.
  • The processor 702, the communication interface 704, and the memory 706 communicate with each other through the communication bus 708.
  • The communication interface 704 is used to communicate with other electronic devices or servers.
  • The processor 702 is configured to execute a program 710 and, specifically, can execute the relevant steps in the above virtual machine hot migration method embodiments.
  • The program 710 may include program code, and the program code includes computer operation instructions.
  • The processor 702 may be a central processing unit (CPU), an ASIC (Application Specific Integrated Circuit), or one or more integrated circuits configured to implement the embodiments of the present invention.
  • The one or more processors included in the electronic device may be processors of the same type, such as one or more CPUs, or processors of different types, such as one or more CPUs and one or more ASICs.
  • The memory 706 is used to store the program 710.
  • The memory 706 may include high-speed RAM and may also include non-volatile memory, for example at least one disk memory.
  • The program 710 may specifically be used to cause the processor 702 to perform the following operations: according to a switching trigger instruction, switch the first page table in the first physical machine, which indicates the mapping between virtual guest memory addresses and host physical addresses, to a second page table; and according to the second page table, hot migrate the virtual guest from the first physical machine to a second physical machine; wherein the size of the physical memory pages indicated by the last-level page table of the second page table satisfies the set size, and the size of the physical memory pages indicated by the last-level page table of the first page table is greater than the set size.
  • The program 710 is further configured to cause the processor 702, when switching, according to the switching trigger instruction, the first page table in the first physical machine that indicates the mapping between virtual guest memory addresses and host physical addresses to the second page table, to: switch the currently used virtual machine active page table from the first page table to the second page table according to the switching trigger instruction; and send to all virtual guests in the first physical machine an instruction signal instructing a page table reload, so as to instruct each virtual guest to switch the currently used page table node in the first page table to the corresponding page table node in the second page table.
  • The program 710 is further configured to cause the processor 702, when switching the currently used virtual machine active page table from the first page table to the second page table according to the switching trigger instruction, to: create the second page table according to the first page table after the switching trigger instruction is received; and, after the second page table is successfully created, switch the currently used virtual machine active page table from the first page table to the second page table.
  • The program 710 is further configured to cause the processor 702, when switching the currently used virtual machine active page table from the first page table to the second page table, to: determine the variable value of the first variable that points to the root page table of the virtual machine active page table, and determine from that value that the currently used virtual machine active page table is the first page table; and re-assign the first variable to a set value, deactivating the first page table and activating the second page table as the virtual machine active page table according to the set value.
  • The first variable is a memory management unit node (MMU_NODE) variable; the program 710 is further configured to cause the processor 702 to lock the memory management unit (MMU) corresponding to the first variable before the first variable is re-assigned to the set value, and to unlock the MMU after the second page table is activated as the virtual machine active page table.
  • the program 710 is further configured to cause the processor 702, when sending to all guests in the first physical machine an indication signal to reload the page table so that each guest switches the page table node it currently uses in the first page table to the corresponding page table node in the second page table, to: send the indication signal to the virtual processors of all guests in the first physical machine, so that each virtual processor switches the page table node it currently uses in the first page table to the corresponding page table node in the second page table.
  • each virtual processor switches the page table node it currently uses in the first page table to the corresponding page table node in the second page table as follows: according to the indication signal, the virtual processor checks whether the value of a second variable, which points to the root page table it currently uses, matches the value of the first variable; if the values do not match, the value of the second variable is replaced with the value of the first variable.
  • the program 710 is further configured to cause the processor 702 to determine whether the live migration succeeded; if it succeeded, release the first page table and the second page table in the first physical machine; if it failed, switch the second page table back to the first page table and release the second page table in the first physical machine.
  • the program 710 is further configured to cause the processor 702, when creating the second page table from the first page table, to: traverse the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table, and create the second page table from the traversal result.
  • the program 710 is further configured to cause the processor 702, when creating the second page table from the traversal result, to: copy each traversed last-level page table entry to generate a last-level page table entry in the second page table, and mark the generated entry with a set identifier; determine whether the physical memory page pointed to by the generated entry is larger than the set size; and, if it is larger, remove the set identifier from that entry and, according to the size of the physical memory page it points to, build at least one level of child page table entries for it, where the physical memory page pointed to by the last level of those child entries has the set size.
  • the program 710 is further configured to cause the processor 702 to determine whether the upper-level page table corresponding to the generated last-level page table entry exists; if it does not exist, the corresponding upper-level page table is generated for that entry and saved to the second page table.
  • when one physical memory page of the host is mapped to multiple guest memory pages, the memory reverse mapping table stores a linked list of the multiple last-level page table entries that point to that physical memory page.
  • the program 710 is further configured to cause the processor 702, when traversing the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table and creating the second page table from the traversal result, to: obtain the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table; traverse the entries of the memory reverse mapping table one by one and determine whether the current entry stores the linked list; if the linked list is stored, traverse every last-level page table entry in the linked list and copy to the second page table both the information about the multiple last-level page table entries associated with the linked list and the content of each traversed entry; if the linked list is not stored, copy the content of the current entry to the second page table.
  • the program 710 is further configured to cause the processor 702, if a live migration exception occurs during the live migration, to release the second page table first, and to release the first page table after the currently used virtual machine active page table has been switched back to the first page table.
  • a first page table and a second page table are provided in the first physical machine; both indicate the mapping between guest memory addresses and host physical addresses, so that together they form a primary/backup arrangement for that mapping.
  • the two page tables differ in that the physical memory pages indicated by the last-level page table of the second page table satisfy the set size, such as the conventional 4 KB, whereas the physical memory pages indicated by the last-level page table of the first page table are larger than the set size, that is, they are what are commonly called "large pages".
  • live migration of the virtual machine can therefore be carried out according to the second page table.
  • converting the physical memory page size to the page size required for live migration does not pause the virtual machine or otherwise degrade its performance, which greatly reduces the adverse effect of the page-size change on the running guest.
  • each component/step described in the embodiments of the present invention can be split into more components/steps, and two or more components/steps, or partial operations of components/steps, can be combined into new components/steps to achieve the purpose of the embodiments.
  • the above method according to the embodiments of the present invention can be implemented in hardware or firmware, or as software or computer code that can be stored in a recording medium (such as a CD-ROM, RAM, floppy disk, hard disk, or magneto-optical disk), or as computer code that is downloaded over a network, originally stored on a remote recording medium or a non-transitory machine-readable medium, and then stored on a local recording medium, so that the method described here can be processed by such software stored on a recording medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware (such as an ASIC or FPGA).
  • a computer, processor, microprocessor controller, or programmable hardware includes storage components (for example, RAM, ROM, or flash memory) that can store or receive software or computer code; when the software or computer code is accessed and executed by the computer, processor, or hardware, the virtual machine live migration method described here is implemented.
  • when a general-purpose computer accesses code for implementing the virtual machine live migration method shown here, execution of the code converts the general-purpose computer into a special-purpose computer for executing that method.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

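Embodiments of the present invention provide a virtual machine live migration method and apparatus, an electronic device, and a computer storage medium. The method includes: switching, according to a switch trigger instruction, a first page table in a first physical machine, which indicates the mapping between guest memory addresses and host physical addresses, to a second page table; and live-migrating a guest virtual machine from the first physical machine to a second physical machine according to the second page table. The size of the physical memory pages indicated by the last-level page table of the second page table satisfies a set size, and the size of the physical memory pages indicated by the last-level page table of the first page table is larger than the set size. The embodiments improve the performance of the guest virtual machine and of its live migration.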

Description

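Virtual Machine Live Migration Method and Apparatus, Electronic Device, and Computer Storage Medium
This application claims priority to Chinese patent application No. 201910715884.1, filed on August 5, 2019 and entitled "Virtual Machine Live Migration Method and Apparatus, Electronic Device, and Computer Storage Medium", the entire contents of which are incorporated herein by reference.
Technical Field
Embodiments of the present invention relate to the field of data processing, and in particular to a virtual machine live migration method and apparatus, an electronic device, and a computer storage medium.
Background
Virtual machine live migration is a key technology in cloud computing operations. It moves a guest virtual machine from one physical machine to another while the guest keeps running normally, which enables dynamic scheduling of computing resources, maintenance of faulty physical machines, and so on.
Specifically, live migration copies the guest's memory from one physical machine to another iteratively, and the memory copied in each iteration is determined by the guest's memory paging.
Traditionally, guests use 4 KB memory pages, that is, the last-level page table points to 4 KB pages. Today, to improve guest performance, "large pages" such as 2 MB or 1 GB pages are generally used instead. This change in page size increases the bandwidth consumed by live migration and makes live migration harder. To reduce bandwidth consumption, the prior art deletes the original 2 MB or 1 GB "large pages" directly before live migration and then rebuilds 4 KB page tables through page faults. This approach, however, causes a large number of page faults in the guest and degrades guest performance.
In view of this, a technical problem that urgently needs to be solved is how to provide another virtual machine live migration method that is less difficult.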
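Summary
In view of this, embodiments of the present invention provide a virtual machine live migration method and apparatus, an electronic device, and a computer storage medium to solve the above problem.
According to a first aspect of the embodiments of the present invention, a virtual machine live migration method is provided, including: switching, according to a switch trigger instruction, a first page table in a first physical machine, which indicates the mapping between guest memory addresses and host physical addresses, to a second page table; and live-migrating a guest from the first physical machine to a second physical machine according to the second page table; where the size of the physical memory pages indicated by the last-level page table of the second page table satisfies a set size, and the size of the physical memory pages indicated by the last-level page table of the first page table is larger than the set size.
According to a second aspect of the embodiments of the present invention, a virtual machine live migration apparatus is provided, including: a switching module, configured to switch, according to a switch trigger instruction, a first page table in a first physical machine, which indicates the mapping between guest memory addresses and host physical addresses, to a second page table; and a live migration module, configured to live-migrate a guest from the first physical machine to a second physical machine according to the second page table; where the size of the physical memory pages indicated by the last-level page table of the second page table satisfies a set size, and the size of the physical memory pages indicated by the last-level page table of the first page table is larger than the set size.
According to a third aspect of the embodiments of the present invention, an electronic device is provided, including a processor, a memory, a communication interface, and a communication bus, where the processor, the memory, and the communication interface communicate with one another over the communication bus; the memory stores at least one executable instruction, and the executable instruction causes the processor to perform the operations of the virtual machine live migration method described above.
According to a fourth aspect of the embodiments of the present invention, a computer storage medium is provided, on which a computer program is stored; when executed by a processor, the program implements the virtual machine live migration method described above.
According to the solution provided by the embodiments of the present invention, a first page table and a second page table are provided in the first physical machine. Both indicate the mapping between guest memory addresses and host physical addresses, so that together they form a primary/backup arrangement for that mapping. They differ in that the physical memory pages indicated by the last-level page table of the second page table satisfy the set size, such as the conventional 4 KB, whereas the physical memory pages indicated by the last-level page table of the first page table are larger than the set size, that is, they are what are commonly called "large pages". Live migration of the virtual machine can therefore be carried out according to the second page table. On the one hand, because the physical memory pages of the second page table satisfy the set size, the page-size requirement of live migration is met, which improves the success rate and performance of the guest and of its live migration. On the other hand, the primary/backup arrangement and the controlled switch between the two page tables mean that converting the physical page size to the size required for live migration does not pause the virtual machine or otherwise degrade its performance, which greatly reduces the adverse effect that the page-size change would otherwise have on the running guest.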
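Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for that description are briefly introduced below. The drawings described below are merely some of the embodiments recorded in the present invention, and a person of ordinary skill in the art can derive other drawings from them.
FIG. 1 is a flowchart of the steps of a virtual machine live migration method according to Embodiment 1 of the present invention;
FIG. 2 is a flowchart of the steps of a virtual machine live migration method according to Embodiment 2 of the present invention;
FIG. 3 is a flowchart of the steps of a virtual machine live migration method according to Embodiment 3 of the present invention;
FIG. 4 is a flowchart of the steps of a virtual machine live migration method according to Embodiment 4 of the present invention;
FIG. 5 is a schematic diagram of the first page table and the second page table in the embodiment shown in FIG. 4;
FIG. 6 is a structural block diagram of a virtual machine live migration apparatus according to Embodiment 5 of the present invention;
FIG. 7 is a structural block diagram of a virtual machine live migration apparatus according to Embodiment 6 of the present invention;
FIG. 8 is a schematic structural diagram of an electronic device according to Embodiment 7 of the present invention.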
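Detailed Description
To help a person skilled in the art better understand the technical solutions of the embodiments of the present invention, those solutions are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments shall fall within the scope of protection of the embodiments of the present invention.
The specific implementation of the embodiments of the present invention is further described below with reference to the drawings.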
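Embodiment 1
Referring to FIG. 1, a flowchart of the steps of a virtual machine live migration method according to Embodiment 1 of the present invention is shown.
The virtual machine live migration method of this embodiment includes the following steps:
Step S102: according to a switch trigger instruction, switch a first page table in a first physical machine, which indicates the mapping between guest memory addresses and host physical addresses, to a second page table.
In a virtualization system, one physical machine can be virtualized into multiple virtual machines; these are the guest virtual machines, and the physical machine can be regarded as their host. The guests use the host's actual physical resources through a page table that indicates the mapping between guest memory addresses and host physical addresses. For example, in virtualization systems such as KVM (Kernel-based Virtual Machine) and XEN, EPT (Extended Page Tables) page tables implement the mapping from guest memory addresses to host physical addresses.
In the embodiments of the present invention, a first page table and a second page table are provided for live migration in such a system. Both are page tables indicating the mapping between guest memory addresses and host physical addresses. The switch trigger instruction instructs a switch between the first and second page tables, and live migration is then performed on the basis of that switch. The switch trigger instruction may be any suitable instruction, or an instruction triggered by any suitable trigger operation. After the switch, the first page table may be marked read-only, and if a write fault occurs during live migration, it may also be handled using the second page table.
In addition, in this embodiment both page tables are multi-level page tables. The size of the physical memory pages indicated by the last-level page table of the second page table satisfies a set size, and the size of the physical memory pages indicated by the last-level page table of the first page table is larger than the set size. The set size may be chosen by a person skilled in the art according to the page size required by the iterative memory copy of live migration, for example 4 KB. Physical memory pages that satisfy the set size may be called "small pages"; correspondingly, pages larger than the set size, such as 2 MB or 1 GB pages, may be called "large pages".
The second page table may be generated by further processing the information in the first page table, for example by splitting the "large pages" in the first page table into "small pages" and building the corresponding page table entries, which reduces generation cost and improves efficiency. It is not limited to this, however: the second page table may also be generated in the conventional way, for example in the same way as the first page table (such as the way EPT page tables are generated).
Note that, unless otherwise specified, quantities such as "multi-level" and "multiple" mean two or more in the embodiments of the present invention. In addition, "first" and "second" merely distinguish different page tables and do not imply any necessary order or temporal relationship between them.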
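Step S104: live-migrate the guest from the first physical machine to the second physical machine according to the second page table.
Because the physical memory pages indicated by the last-level page table of the second page table satisfy the set size, that is, the page size required by the iterative memory copy of live migration, the guest can be live-migrated from the first physical machine to the second on the basis of the second page table.
With this embodiment, a first page table and a second page table are provided in the first physical machine. Both indicate the mapping between guest memory addresses and host physical addresses, so that together they form a primary/backup arrangement for that mapping. They differ in that the physical memory pages indicated by the last-level page table of the second page table satisfy the set size, such as the conventional 4 KB, whereas the physical memory pages indicated by the last-level page table of the first page table are larger than the set size, that is, they are what are commonly called "large pages". Live migration of the virtual machine can therefore be carried out according to the second page table. On the one hand, because the physical memory pages of the second page table satisfy the set size, the page-size requirement of live migration is met, which improves the success rate and performance of the guest and of its live migration. On the other hand, the primary/backup arrangement and the controlled switch between the two page tables mean that converting the physical page size to the size required for live migration does not pause the virtual machine or otherwise degrade its performance, which greatly reduces the adverse effect that the page-size change would otherwise have on the running guest.
The virtual machine live migration method of this embodiment may be executed by any suitable electronic device with data processing capability, including but not limited to servers, mobile terminals (such as tablets and mobile phones), and PCs.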
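Embodiment 2
Referring to FIG. 2, a flowchart of the steps of a virtual machine live migration method according to Embodiment 2 of the present invention is shown.
The virtual machine live migration method of this embodiment includes the following steps:
Step S202: according to a switch trigger instruction, switch a first page table in a first physical machine, which indicates the mapping between guest memory addresses and host physical addresses, to a second page table.
The size of the physical memory pages indicated by the last-level page table of the second page table satisfies a set size, and the size of the physical memory pages indicated by the last-level page table of the first page table is larger than the set size. As described in Embodiment 1, the set size may be chosen by a person skilled in the art according to the page size required by the iterative memory copy of live migration, for example 4 KB. In virtualization systems such as KVM and XEN, both the first and second page tables may take the form of EPT (Extended Page Tables) page tables.
In one feasible implementation, this step may be implemented as follows: switch the currently used virtual machine active page table from the first page table to the second page table according to the switch trigger instruction; and send to all guests in the first physical machine an indication signal instructing them to reload the page table, so that each guest switches the page table node it currently uses in the first page table to the corresponding page table node in the second page table.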
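A guest uses the host's memory resources through the page table that is in the active state. To support live migration when "large pages" are in use, the embodiments provide a first page table, which can be regarded as the page table pointing to "large pages", and a second page table, which can be regarded as the page table pointing to "small pages". On this basis, live migration requires switching the currently used active page table from the first page table to the second, and then sending all guests in the first physical machine an indication signal to reload the page table, so that the switch is carried out guest by guest, which reduces its impact on guest performance.
Optionally, the switch may be carried out per virtual processor (VCPU), which further reduces the impact of the switch on guest performance. In this case, sending to all guests in the first physical machine the indication signal to reload the page table, so that each guest switches the page table node it currently uses in the first page table to the corresponding page table node in the second page table, includes: sending the indication signal to the virtual processors of all guests in the first physical machine, so that each virtual processor switches the page table node it currently uses in the first page table to the corresponding page table node in the second page table.
When the currently used active page table is switched from the first page table to the second according to the switch trigger instruction, one feasible scheme is to create the second page table from the first page table after the switch trigger instruction is received, and to switch the active page table once the second page table has been created successfully. It is not limited to this: the second page table may also be created in advance, in which case the switch is performed directly when the trigger instruction is received. Creating the second page table from the first only after the trigger instruction is received has two advantages. First, the second page table is created only once live migration has been decided on, which guarantees that it will be used and avoids the wasted data and resources of a table created in advance that might never be used. Second, creating the second page table from the first improves creation efficiency and ensures a smooth transition and good virtual machine performance.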
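To reduce the cost and improve the efficiency of the switch, the embodiments further provide a way of switching page tables through a variable. In this approach, when the currently used active page table is switched from the first page table to the second, the value of a first variable that points to the root page table of the active page table is determined first, and that value shows that the currently used active page table is the first page table. The first variable is then reassigned to a set value; according to the set value, the first page table is deactivated and the second page table is activated as the active page table. The set value may be chosen by a person skilled in the art as appropriate, and may be a number, a letter, a symbol, or a combination of these. For example, with active_page as the first variable, active_page equal to 1 indicates that the active page table is the first page table, and active_page equal to 0 indicates that it is the second page table. Changing the value of active_page then indicates the page table switch.
In KVM, the first variable may be a memory management unit node variable, MMU_NODE. If an MMU lock is used to protect MMU resources, the memory management unit (MMU) corresponding to the first variable may be locked before the first variable is reassigned to the set value, and unlocked after the second page table has been activated as the active page table. This ensures exclusive use of the MMU resources. Using an MMU_NODE variable makes it easy to switch quickly between the first and second page tables later by changing the value of NODE, and is as compatible as possible with existing code logic. As noted above, however, other forms of variable are equally applicable.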
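The locked switch described above can be sketched as follows. This is a minimal Python model rather than KVM code: the class `VmMmu`, the constants `FIRST` and `SECOND`, and the method names are assumptions made for illustration. Only the idea of a variable that selects the active root page table, and of an MMU lock held while that variable is reassigned, comes from the text.

```python
import threading

# Hypothetical encodings of the "first variable": which root table is active.
FIRST, SECOND = 1, 0


class VmMmu:
    """Toy model of a VM whose active root page table is chosen by one variable."""

    def __init__(self, first_root, second_root):
        self.roots = {FIRST: first_root, SECOND: second_root}
        self.mmu_node = FIRST              # the first page table starts out active
        self.mmu_lock = threading.Lock()   # stands in for the MMU lock

    def active_root(self):
        return self.roots[self.mmu_node]

    def switch_active(self, set_value):
        """Reassign the variable under the lock; return the previous value."""
        if set_value not in self.roots:
            raise ValueError("unknown page table node")
        with self.mmu_lock:                # lock before reassigning
            previous = self.mmu_node
            self.mmu_node = set_value      # deactivates one table, activates the other
        return previous                    # the lock is released once the switch is done


mmu = VmMmu(first_root="large-page root", second_root="small-page root")
previous = mmu.switch_active(SECOND)
```

Because only one small variable changes, the switch itself is cheap; the expensive work of building the second page table happens beforehand.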
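In addition, given the first variable, when the indication signal to reload the page table is sent to the virtual processors of all guests in the first physical machine as described above, each virtual processor may switch the page table node it currently uses in the first page table to the corresponding page table node in the second page table as follows: according to the indication signal, the virtual processor checks whether the value of a second variable, which points to the root page table it currently uses, matches the value of the first variable; if the values do not match, the value of the second variable is replaced with the value of the first variable. Replacing the variable value switches the page table node used by the virtual processor simply and quickly, which improves switching efficiency.
The process above thus switches between the first page table and the second page table. On this basis, step S204 can be performed.
Step S204: live-migrate the guest from the first physical machine to the second physical machine according to the second page table.
The physical memory pages indicated by the last-level page table of the second page table satisfy the set size, such as 4 KB, which meets the page-size requirement of live migration. With the second page table, all guest memory page tables can be updated in one pass, reducing the impact of live migration on virtual machine performance and on users.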
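Step S206: determine whether the live migration succeeded; if it succeeded, perform step S208; otherwise, perform step S210.
Whether the live migration succeeded can be determined in a conventional way, for example by checking whether the guest runs normally and whether its data is complete; the embodiments of the present invention do not limit this.
Step S208: if the live migration succeeded, release the first page table and the second page table in the first physical machine. The procedure then ends.
After a successful live migration, the guest's memory data has been copied from the first physical machine to the second, and the migrated guest can work normally using the corresponding data and mechanisms of the second physical machine. In this case the first and second page tables in the first physical machine can be released, including the data related to the page tables and the resources they occupy.
Step S210: if the live migration failed, perform failure handling.
In one case, after the live migration operation the guest cannot work normally on the second physical machine; that is, live migration was performed but did not succeed, and a rollback was triggered. For this kind of failure, the second page table can be switched back to the first page table and the second page table in the first physical machine released. In other words, switching back lets the guest continue to work normally in the cloud according to the first page table, and the second page table is freed. Because failure in this case may be caused by, for example, a migration that does not converge, switching back to the first page table and rolling back the migration keep the guest working in its original condition with the first page table, which improves virtual machine performance and avoids affecting users.
In another case, if a live migration exception occurs during the migration (for example, the guest exits abnormally or shuts down), the second page table can be released first, and the first page table released after the currently used active page table has been switched back to it. That is, the migration has not completed and the exception occurs midway, so the data and resources of both page tables need to be released.
Note that in practice the first and second page tables may also be released at the same time. Releasing the second page table first, and releasing the first only after switching the active page table back to it, is closer to the actual code logic of an implementation.
Steps S206-S210 handle unsuccessful live migration effectively and ensure the reliability and safety of live migration.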
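The three outcomes of steps S206-S210 can be summarized in a short sketch. The function name `finish_migration`, the outcome strings, and the `state` dictionary are all hypothetical; the sketch only records which page tables are released and in what order, as the text describes.

```python
def finish_migration(state, outcome):
    """Apply post-migration handling on the source host.

    state: {"active": "first" | "second" | None, "allocated": set of table names}
    outcome: "success", "failure" (rolled back), or "exception" (guest exited).
    Returns the order in which page tables were released.
    """
    released = []

    def release(name):
        state["allocated"].discard(name)
        released.append(name)

    if outcome == "success":
        # The guest now runs on the destination; free both tables here.
        state["active"] = None
        release("first")
        release("second")
    elif outcome == "failure":
        # Keep the guest running on the source with the large-page table.
        state["active"] = "first"
        release("second")
    elif outcome == "exception":
        # Release the second table, switch back, then release the first.
        release("second")
        state["active"] = "first"
        release("first")
    else:
        raise ValueError(f"unknown outcome: {outcome}")
    return released


state = {"active": "second", "allocated": {"first", "second"}}
order = finish_migration(state, "failure")
```

After a rollback the guest is back on the first page table, so it regains the benefit of large pages.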
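With this embodiment, a first page table and a second page table are provided in the first physical machine. Both indicate the mapping between guest memory addresses and host physical addresses, so that together they form a primary/backup arrangement for that mapping. They differ in that the physical memory pages indicated by the last-level page table of the second page table satisfy the set size, such as the conventional 4 KB, whereas the physical memory pages indicated by the last-level page table of the first page table are larger than the set size, that is, they are what are commonly called "large pages". Live migration of the virtual machine can therefore be carried out according to the second page table. On the one hand, because the physical memory pages of the second page table satisfy the set size, the page-size requirement of live migration is met, which improves the success rate and performance of the guest and of its live migration. On the other hand, the primary/backup arrangement and the controlled switch between the two page tables mean that converting the physical page size to the size required for live migration does not pause the virtual machine or otherwise degrade its performance, which greatly reduces the adverse effect that the page-size change would otherwise have on the running guest.
The virtual machine live migration method of this embodiment may be executed by any suitable electronic device with data processing capability, including but not limited to servers, mobile terminals (such as tablets and mobile phones), and PCs.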
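Embodiment 3
Referring to FIG. 3, a flowchart of the steps of a virtual machine live migration method according to Embodiment 3 of the present invention is shown.
This embodiment describes the live migration solution of the embodiments of the present invention with a focus on how the second page table is created.
The virtual machine live migration method of this embodiment includes the following steps:
Step S302: after receiving the switch trigger instruction, create the second page table from the first page table.
The size of the physical memory pages indicated by the last-level page table of the second page table satisfies a set size, and the size of the physical memory pages indicated by the last-level page table of the first page table is larger than the set size. As described in Embodiment 1, the set size may be chosen by a person skilled in the art according to the page size required by the iterative memory copy of live migration, for example 4 KB.
In one feasible implementation, the memory reverse mapping table corresponding to each virtual memory slot pointed to by the first page table is traversed, and the second page table is created from the traversal result, which allows the second page table to be created quickly. In another feasible implementation, the first page table may be traversed directly, and the second page table created from the traversal result and the corresponding "small page split".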
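The memory reverse mapping table records the mapping between the host's physical memory addresses and the corresponding virtual addresses. Whenever a physical memory page is mapped into a new virtual address space (here, guest physical addresses), the last-level page table entry for that physical page is recorded in the reverse mapping table. A guest's memory consists of multiple virtual memory slots (memslots) that do not overlap one another. By traversing the reverse mapping table of each memslot, the mapping between the guest's memory addresses and the corresponding physical memory addresses can be obtained, which yields the data and information needed to create the second page table. Because the physical memory pages indicated by the last-level page table of the second page table must satisfy the set size, the traversal result needs further processing: for example, each traversed physical memory page larger than the set size is split into pieces of the set size, and the corresponding entries of the second page table are generated.
For example, creating the second page table from the traversal result may include: copying each traversed last-level page table entry to generate a last-level page table entry in the second page table, and marking the generated entry with a set identifier; determining whether the physical memory page pointed to by the generated entry is larger than the set size; and, if it is larger, removing the set identifier from that entry and, according to the size of the physical memory page it points to, building at least one level of child page table entries for it, where the physical memory page pointed to by the last level of those child entries has the set size. The set identifier may be chosen by a person skilled in the art as needed and is not limited here; the set size is as described above.
A physical memory page larger than the set size is a "large page" and does not meet the requirement for a last-level entry of the second page table, so the identifier marking the entry as last-level is removed and the "large page" is converted into "small pages". This may take one or more levels. For example, a 2 MB "large page" can be converted into 512 4 KB "small pages", and one level of child page table entries is built below the current entry level of the second page table, each child entry pointing to one 4 KB physical memory page. A 1 GB "large page" is first converted into 512 2 MB pages, and each 2 MB page is then converted into 512 4 KB "small pages"; two levels of child entries are built below the current entry level, with the first-level child entries pointing to the second-level child entries and each second-level child entry pointing to one 4 KB physical memory page.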
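The arithmetic behind the split can be checked with a short sketch. The function and constant names are illustrative assumptions; the 4 KB set size and the 512-way split at each level are the figures given in the text.

```python
SMALL_PAGE = 4 * 1024   # the "set size" used as the example in the text
FANOUT = 512            # each split in the text produces 512 children


def split_levels(size, small=SMALL_PAGE, fanout=FANOUT):
    """Number of child levels needed to get from a page of `size` down to `small`."""
    levels = 0
    while size > small:
        if size % fanout:
            raise ValueError("size is not a multiple of the fan-out")
        size //= fanout
        levels += 1
    if size != small:
        raise ValueError("size does not split evenly into small pages")
    return levels


def small_page_addresses(base, size, small=SMALL_PAGE):
    """Physical addresses of the small pages that cover one large page."""
    return range(base, base + size, small)


two_mb = 2 * 1024 * 1024
one_gb = 1024 * 1024 * 1024
```

A 2 MB page needs one extra level and yields 512 small pages; a 1 GB page needs two extra levels and yields 512 × 512 = 262,144 small pages.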
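If the physical memory page pointed to by the generated last-level page table entry is not larger than the set size, the entry points to a "small page" and the copy can be used directly.
In addition, for each generated last-level page table entry, it may be determined whether the corresponding upper-level page table exists; if it does not, the upper-level page table is generated for that entry and saved to the second page table. In this way the second page table is completed.
In some cases, one physical memory page of the host may be mapped to multiple memory pages of a guest, in which case the reverse mapping table stores a linked list of the multiple last-level page table entries pointing to that physical page. Traversing the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table, and creating the second page table from the traversal result, then includes: obtaining the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table; traversing its entries one by one and determining whether the current entry stores such a linked list; if it does, traversing every last-level page table entry in the list and copying to the second page table both the information about the multiple last-level entries associated with the list and the content of each traversed entry; if it does not, copying the content of the current entry to the second page table. This solves the problem of generating page table entries when one physical memory page is mapped to multiple guest memory pages.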
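The two branches of this traversal, a single entry versus a linked list of entries, can be sketched as below. The data layout is an assumption made for illustration: the reverse map is a Python dict, a list stands in for the linked list, and the `chain_length` field stands in for "the information about the multiple last-level entries".

```python
def copy_from_rmap(rmap):
    """Copy last-level entries out of a reverse map.

    rmap maps a key to either one entry (a dict) or a list of entries;
    the list stands in for the linked list described above.
    """
    copied = []
    for key in sorted(rmap):
        slot = rmap[key]
        if isinstance(slot, list):          # a chain: several entries, one page
            for entry in slot:
                copied.append({**entry, "chain_length": len(slot)})
        else:                               # a single entry stored directly
            copied.append({**slot, "chain_length": 1})
    return copied


rmap = {
    0x10: {"gfn": 0x10, "pfn": 0xA0},
    0x11: [{"gfn": 0x11, "pfn": 0xB0}, {"gfn": 0x20, "pfn": 0xB0}],
}
second_table_entries = copy_from_rmap(rmap)
```

Here the host page with `pfn` 0xB0 is mapped at two guest frames, so both of its entries are copied and each remembers that it belongs to a chain of two.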
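Step S304: after the second page table has been created successfully, switch the currently used virtual machine active page table from the first page table to the second page table.
For the specific implementation of this switch, refer to the description of step S202 in Embodiment 2, which is not repeated here.
Step S306: live-migrate the guest from the first physical machine to the second physical machine according to the second page table.
With this embodiment, a first page table and a second page table are provided in the first physical machine. Both indicate the mapping between guest memory addresses and host physical addresses, so that together they form a primary/backup arrangement for that mapping. They differ in that the physical memory pages indicated by the last-level page table of the second page table satisfy the set size, such as the conventional 4 KB, whereas the physical memory pages indicated by the last-level page table of the first page table are larger than the set size, that is, they are what are commonly called "large pages". Live migration of the virtual machine can therefore be carried out according to the second page table. On the one hand, because the physical memory pages of the second page table satisfy the set size, the page-size requirement of live migration is met, which improves the success rate and performance of the guest and of its live migration. On the other hand, the primary/backup arrangement and the controlled switch between the two page tables mean that converting the physical page size to the size required for live migration does not pause the virtual machine or otherwise degrade its performance, which greatly reduces the adverse effect that the page-size change would otherwise have on the running guest.
The virtual machine live migration method of this embodiment may be executed by any suitable electronic device with data processing capability, including but not limited to servers, mobile terminals (such as tablets and mobile phones), and PCs.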
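Embodiment 4
Referring to FIG. 4, a flowchart of the steps of a virtual machine live migration method according to Embodiment 4 of the present invention is shown.
This embodiment uses KVM as an example to describe the live migration method of the embodiments of the present invention. KVM is a virtualization technology built on CPU hardware support and can be implemented as a Linux module, the KVM module. Linux can create virtual machines through other tools only after the KVM module has been loaded. The KVM module alone, however, does not let a user directly control the operating system kernel; a user-space tool such as Qemu is also needed to control KVM from user space, that is, Qemu works in the operating system's user mode.
In addition, to virtualize memory so that a guest sees an isolated, contiguous memory space starting from zero, KVM introduces the guest physical address space (Guest Physical Address, GPA). The GPA space is not a real physical address space; it is only a mapping of the host's virtual address space (HVA) into the guest's address space. To the guest, the GPA space is a contiguous address space starting from zero, but to the host the guest's physical address space is not necessarily contiguous and may be mapped onto several discontiguous host address ranges.
Because a guest is essentially a process on the host, in virtualization mode the guest runs in non-root mode and cannot directly access the host's memory, which belongs to root mode. The VMM (virtual machine monitor) must therefore intervene: it intercepts the guest's memory access instructions and then virtualizes the host's memory, which amounts to the VMM adding a layer, the GPA, between the guest virtual address space (GVA) and the host virtual address space (HVA).
Memory virtualization therefore translates a guest virtual address (Guest Virtual Address, GVA) into a host physical address (Host Physical Address, HPA) by way of the guest physical address (Guest Physical Address, GPA) and the host virtual address (Host Virtual Address, HVA), that is, GVA→GPA→HVA→HPA. This translation establishes the memory mapping between guest and host, through which the guest uses the host's physical memory. Live migration also relies on this mapping for page table generation, migration, and related operations.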
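As a conceptual illustration of the GVA→GPA→HVA→HPA chain, the sketch below models each stage as a page-granular lookup table. The three dictionaries and the function names are hypothetical, and real hardware walks multi-level tables rather than flat maps; the point is only that the page offset is preserved while the page number is translated at each stage.

```python
PAGE = 4096  # translate at 4 KB granularity in this sketch


def translate(addr, table):
    """Map an address through one page-granular table, keeping the offset."""
    page, offset = divmod(addr, PAGE)
    return table[page] * PAGE + offset


def gva_to_hpa(gva, guest_pt, memslot_map, host_pt):
    gpa = translate(gva, guest_pt)      # guest page table: GVA -> GPA
    hva = translate(gpa, memslot_map)   # memslot layout:   GPA -> HVA
    return translate(hva, host_pt)      # host page table:  HVA -> HPA


guest_pt = {0x10: 0x2}          # guest virtual page 0x10 -> guest physical page 0x2
memslot_map = {0x2: 0x7F000}    # guest physical page 0x2 -> host virtual page 0x7F000
host_pt = {0x7F000: 0x1234}     # host virtual page 0x7F000 -> host physical page 0x1234

hpa = gva_to_hpa(0x10ABC, guest_pt, memslot_map, host_pt)
```

The offset 0xABC survives all three stages, so the result is host physical page 0x1234 plus 0xABC.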
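On this basis, the virtual machine live migration method of this embodiment includes the following steps:
Step S402: before live migration, the Qemu thread triggers generation of the switch trigger instruction through an IOCTL call, indicating that the switch between the primary and backup page tables is about to be performed.
In this embodiment, the primary page table is the first page table described above, and the backup page table (also called the standby page table) is the second page table.
Note that this embodiment uses generation of the switch trigger instruction by an IOCTL call only as an example. In practice, a person skilled in the art may set other suitable trigger operations or trigger conditions as needed to generate the switch trigger instruction.
Step S404: create the backup page table from the primary page table according to the switch trigger instruction.
In KVM, the guest's physical memory is divided into several non-overlapping memslots. This embodiment creates the backup page table by traversing all memslots in turn and, for each, copying and rebuilding the page table entries for the physical memory of that memslot. Specifically, the backup page table is created by traversing the entries of the memory reverse mapping (rmap) table of the current memslot.
The memory reverse mapping, rmap, is a data structure that records the correspondence between physical memory addresses and their virtual addresses; it records the association between each physical memory page and its page table. In practice, the EPT page table associated with a guest physical memory page can be found from the page frame number (gfn) of that page, as rmap[gfn].
Whenever a host physical memory page is mapped into a new virtual address space (here, guest physical addresses), the address of the last-level page table entry for that physical page is recorded in rmap[gfn]. When a host physical memory page is mapped into several new virtual address spaces at once, rmap[gfn] records the head address of a pte linked list, and all last-level page table entries associated with that physical page are recorded in the list.
The physical memory pages that make up a guest's physical memory can be determined from the virtual machine's memory allocation information, and each such page can be identified by its page frame number (gfn). The guest's physical memory consists of multiple memslots. Each memslot has a basegfn, which records the starting offset of that memslot in the guest's whole physical address space; the gfn of each physical page inside a memslot is computed from the memslot's basegfn and the page's offset within the memslot. To traverse all gfns, each memslot must be traversed in turn.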
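The gfn calculation can be written down directly. The `Memslot` class below is a simplified stand-in with only the two fields the text relies on; the field names and the `iter_gfns` helper are assumptions for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Memslot:
    base_gfn: int   # first guest frame number covered by this slot
    npages: int     # number of pages in the slot


def iter_gfns(memslots):
    """Yield every gfn of every memslot: gfn = base_gfn + offset within the slot."""
    for slot in sorted(memslots, key=lambda s: s.base_gfn):
        for offset in range(slot.npages):
            yield slot.base_gfn + offset


slots = [Memslot(base_gfn=0x100, npages=3), Memslot(base_gfn=0x0, npages=2)]
all_gfns = list(iter_gfns(slots))
```

Because memslots do not overlap, each gfn is produced exactly once, even though the slots leave a gap between frame 0x1 and frame 0x100.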
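For each rmap[gfn] that corresponds to multiple virtual address spaces, every last-level page table entry in the pte list of rmap[gfn] is traversed. Once all its last-level entries have been traversed, traversal continues with the next gfn.
Each last-level page table entry obtained by the traversal is copied into the backup page table, that is, the backup EPT page table. Specifically, this may include: (a) if the upper-level page table for the entry being copied does not exist in the backup page table, creating that upper-level page table for the entry and adding it to the backup page table (for example, in the same way as for the first page table, such as EPT); (b) determining whether the page pointed to by the entry being copied is a "large page"; if it is not, returning to (a) after the copy to continue with the next entry; if it is, removing the flag that marks the entry as a last-level entry and then building, in turn, all child page table entries for the entry that points to the "large page".
Building all child page table entries for an entry that points to a "large page" includes: first looking up, from that entry, the guest physical page frame number gfn associated with it and the real physical page frame number pfn in the host; determining whether the memory cache pool needed to create page table entries has enough resources and, if not, adding cache resources to the pool; and, once resources are sufficient, creating the multi-level "small page" page tables for all "small pages" of physical memory covered by the "large page". (For example, a 2 MB "large page" requires page tables at the 2 MB and 4 KB levels; a 1 GB "large page" requires three levels of page tables, at the 1 GB, 2 MB, and 4 KB levels. For EPT specifically, 1 GB of physical memory requires second-, third-, and fourth-level page tables, and 2 MB of physical memory requires third- and fourth-level page tables.)
Next, the entries of the backup page table are filled in. For non-last-level page tables (page tables other than the fourth level), the lookup starts from the first-level page table for the gfn and checks whether the corresponding next-level page table exists; if it does not, a page is allocated for the next-level page table and information such as the physical address of the newly allocated page-table memory is written into the first-level page table. This is repeated down to the last level, the fourth-level page table. For the last-level (fourth-level) page table, the memory attributes and the corresponding gfn and pfn information are written into the last-level page table. The dirty bit among the memory attributes is set to 0 so that PML can track later writes. The last-level entry is also added to the rmap structure for the gfn; if there are multiple gfns, the entry is added to the corresponding rmap list. When all physical memory has been traversed, creation of the backup page table is complete.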
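A minimal model of this fill-in step is shown below: walk four levels from the root, allocate any missing intermediate table, and write a leaf whose dirty bit starts at 0. Nested Python dicts stand in for page-table pages, and the names `map_small_page`, `lookup`, and `LEVEL_SHIFTS` are assumptions for this sketch; the 9 index bits per level match 4-level paging with 512 entries per table.

```python
LEVEL_SHIFTS = (27, 18, 9, 0)   # 9 index bits per level, highest level first


def indices(gfn):
    """Split a guest frame number into its four per-level table indices."""
    return [(gfn >> shift) & 0x1FF for shift in LEVEL_SHIFTS]


def map_small_page(root, gfn, pfn, attrs=0):
    """Install a 4 KB mapping, creating missing intermediate tables on the way."""
    *upper, last = indices(gfn)
    table = root
    for idx in upper:
        table = table.setdefault(idx, {})   # allocate the next level if absent
    table[last] = {"pfn": pfn, "attrs": attrs, "dirty": 0}
    return table[last]


def lookup(root, gfn):
    """Return the leaf entry for gfn, or None if any level is missing."""
    node = root
    for idx in indices(gfn):
        node = node.get(idx)
        if node is None:
            return None
    return node


root = {}
map_small_page(root, 0x12345, 0x999)
map_small_page(root, 0x12346, 0x99A)
```

The two neighbouring frames share all three upper-level tables, so only the first call allocates them; the second call just adds one more leaf.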
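A comparison between a completed backup page table (the second page table) and the primary page table (the first page table) is shown in FIG. 5. In FIG. 5 both are EPT page tables (primary EPT page table and backup EPT page table), and each node has the same meaning as in a conventional EPT page table. In the primary EPT page table shown in FIG. 5, a higher-level entry such as a PDE or PDPTE points directly to a physical "large page". For example, a PDE (third-level page table) points to a 2 MB physical memory page (solid box in FIG. 5), and a PDPTE (second-level page table) points to a 1 GB physical memory page (not shown). The backup EPT page table is created from the primary EPT page table. As FIG. 5 shows, the backup EPT page table uses PTEs (fourth-level page table) to point to 4 KB physical memory pages (dashed boxes in FIG. 5); the PTE level serves as the last-level page table of the backup EPT page table and completes the mapping of all physical memory.
Step S406: switch the KVM page table node that indicates the current guest's active page table to the backup page table node.
In this embodiment, to keep a wholesale page table switch from affecting guest performance, the switch between primary and backup page tables is broken down to vcpu granularity. Two new variables are introduced for this purpose, kvm->mmu_node and vcpu->mmu_node, where kvm->mmu_node indicates the root page table that KVM currently points to and vcpu->mmu_node indicates the root page table that the vcpu currently uses.
With these variables, the switch between the primary and backup EPT page tables can be performed by kvm->mmu_node=kvm->mmu_node^1, where "^1" is the operation that turns 1 into 0 or 0 into 1. For example, if "1" denotes the primary EPT page table and "0" the backup EPT page table, a switch from the primary to the backup EPT page table can also be performed by, for example, kvm->mmu_node=kvm->mmu_node-1.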
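A two-line check makes the difference between the two forms concrete. The constant names are illustrative; the encodings 1 and 0 are the ones used in the example above.

```python
PRIMARY, BACKUP = 1, 0   # encodings from the example in the text


def toggle(node):
    """XOR with 1 flips between the two nodes in either direction."""
    return node ^ 1


to_backup = toggle(PRIMARY)
back_again = toggle(to_backup)
```

The XOR form works for both the forward switch and the switch back after a failed migration, whereas subtracting 1 only works in the primary-to-backup direction (0 - 1 would give -1, which is not a valid node).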
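In addition, in this embodiment mmu_node is protected by the mmu lock, so the mmu lock must be acquired before mmu_node is changed. This makes creation and switching of the backup page table mutually exclusive with memory hot-plug operations.
Step S408: send an mmu_reload signal to all vcpus.
The mmu_reload signal is handled by every vcpu before it next enters guest mode, so that the page table is reloaded.
Step S410: a vcpu detects the mmu_reload signal and checks whether the mmu_node it currently uses matches the active mmu_node specified by KVM. If they match, the mmu_reload signal was triggered for some other reason and is not handled here. If they do not match, step S412 is performed.
Step S412: if the vcpu finds that the mmu_node it uses does not match the active mmu_node specified by KVM, it replaces the mmu_node it currently uses with the one specified by KVM.
At the same time, the ept root table used by the vcpu is switched to the ept root table of the backup page table.
Step S414: when the mmu_node of every vcpu matches the mmu_node specified by KVM, the switch is complete.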
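Steps S408-S414 can be modelled as a small state machine. Everything here is a simplified stand-in: the `Kvm` and `Vcpu` classes, the `reload_pending` flag, and the helper names are assumptions, not the kernel's data structures. The sketch shows a per-vcpu switch that happens lazily, just before each vcpu re-enters guest mode.

```python
class Vcpu:
    def __init__(self, node, root):
        self.mmu_node = node          # node this vcpu currently uses
        self.root = root              # root table this vcpu currently uses
        self.reload_pending = False


class Kvm:
    def __init__(self, roots, active, nr_vcpus):
        self.roots = roots            # node value -> root table
        self.mmu_node = active        # node KVM designates as active
        self.vcpus = [Vcpu(active, roots[active]) for _ in range(nr_vcpus)]


def request_reload(kvm):
    """S408: flag every vcpu so it reloads before its next guest entry."""
    for vcpu in kvm.vcpus:
        vcpu.reload_pending = True


def before_guest_entry(kvm, vcpu):
    """S410/S412: return True if this vcpu switched root tables."""
    if not vcpu.reload_pending:
        return False
    vcpu.reload_pending = False
    if vcpu.mmu_node == kvm.mmu_node:
        return False                  # reload was requested for another reason
    vcpu.mmu_node = kvm.mmu_node
    vcpu.root = kvm.roots[kvm.mmu_node]
    return True


def switch_complete(kvm):
    """S414: the switch is done once every vcpu agrees with KVM."""
    return all(v.mmu_node == kvm.mmu_node for v in kvm.vcpus)


kvm = Kvm(roots={1: "primary root", 0: "backup root"}, active=1, nr_vcpus=2)
kvm.mmu_node ^= 1                     # S406: designate the backup node
request_reload(kvm)
first_done = before_guest_entry(kvm, kvm.vcpus[0])
halfway = switch_complete(kvm)
second_done = before_guest_entry(kvm, kvm.vcpus[1])
```

No vcpu is stopped on behalf of another: each one picks up the new root on its own next guest entry, which is why the switch does not require pausing the whole guest.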
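Step S416: flush the TLB entries for the VM so that the newly loaded page table takes effect; at the same time, enable PML to start monitoring memory modifications.
The TLB (Translation Lookaside Buffer) is a cache in the memory management unit that speeds up translation of virtual addresses to physical addresses. Each line of the TLB holds a block consisting of a single PTE (Page Table Entry). With the TLB, reading data does not require two memory accesses each time (one to look up the page table for the physical address and one to read the data); the translation is read directly from the TLB. The CPU provides a TLB flush instruction, and the flush is performed using that instruction together with the VPID information of the VM.
PML (Page Modification Logging) is a CPU feature. When it is enabled, the CPU records which physical memory pages have been written by the CPU. Live migration needs this record of writes to keep the VM data on the source and the destination consistent.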
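To show why a record of written pages is needed, here is a toy model of iterative copying driven by a dirty set. It is not the migration code of any real hypervisor: `precopy`, `guest_writes`, and the round limit are assumptions for this sketch, and the dirty set simply plays the role that the write log plays in the text.

```python
def precopy(memory, guest_writes, max_rounds=10):
    """Iteratively copy pages to a destination while the guest keeps writing.

    memory: dict page number -> contents on the source (mutated by guest writes).
    guest_writes: list of dicts; guest_writes[i] is applied during round i.
    Returns (destination memory, number of page transfers).
    """
    dest, sent = {}, 0
    dirty = set(memory)                      # the first round sends every page
    for round_no in range(max_rounds):
        if not dirty:
            break
        batch, dirty = sorted(dirty), set()
        for page in batch:
            dest[page] = memory[page]
        sent += len(batch)
        writes = guest_writes[round_no] if round_no < len(guest_writes) else {}
        for page, data in writes.items():    # writes are logged as dirty pages
            memory[page] = data
            dirty.add(page)
    for page in sorted(dirty):               # final copy of whatever is still dirty
        dest[page] = memory[page]
    sent += len(dirty)
    return dest, sent


source = {0: b"a", 1: b"b", 2: b"c"}
dest, transfers = precopy(source, guest_writes=[{1: b"B"}])
```

Each rewritten page costs one extra page transfer, so the page size sets the price of a single write: 4 KB with small pages, but 2 MB or 1 GB if dirtiness were tracked per large page. This is the bandwidth concern raised in the background section.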
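Step S418: perform the virtual machine live migration.
That is, the guest on the first physical machine is live-migrated to the second physical machine according to the backup EPT page table that is now active.
Step S420: perform post-migration processing.
If the live migration succeeds, the primary and backup page tables can be released. If it fails, the guest must keep working on the first physical machine; steps S406-S416 can be repeated to switch from the backup EPT page table back to the primary EPT page table, so that the guest uses the primary EPT page table again and regains its performance. After the switch back, the backup EPT page table can be released.
In KVM, active_mmu_pages records the addresses of all page table entries used by the current guest, and the primary and backup page tables each have their own active_mmu_pages. After live migration, a page table can be released by traversing its active_mmu_pages.
Taking the backup page table as an example, its release procedure is: traverse each item in active_mmu_pages one by one; determine whether the current item is active and whether it still has child page table nodes; if it has child nodes, traverse all of them and record them in an invalid list; then release all page table entries in the invalid list, thereby releasing all page table memory.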
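The release procedure can be sketched as a two-phase walk: collect everything reachable into an invalid list, then free the list. The dict-based page records and the function name are hypothetical; only the list names `active_mmu_pages` and `invalid` come from the text.

```python
def release_page_table(active_mmu_pages):
    """Collect active pages and their children into an invalid list, then free them.

    Each page is a dict: {"name": str, "active": bool, "children": [pages]}.
    Returns the names in the order they were queued for release.
    """
    invalid, seen = [], set()

    def collect(page):
        if id(page) in seen:                 # already queued via its parent
            return
        seen.add(id(page))
        for child in page["children"]:
            collect(child)
        invalid.append(page)                 # children are queued before the parent

    for page in active_mmu_pages:
        if page["active"]:
            collect(page)

    for page in invalid:                     # "free" every queued page
        page["active"] = False
        page["children"] = []
    active_mmu_pages.clear()
    return [page["name"] for page in invalid]


leaf1 = {"name": "pt-1", "active": True, "children": []}
leaf2 = {"name": "pt-2", "active": True, "children": []}
mid = {"name": "pd", "active": True, "children": [leaf1, leaf2]}
top = {"name": "root", "active": True, "children": [mid]}
pages = [top, mid, leaf1, leaf2]
freed = release_page_table(pages)
```

Queuing first and freeing afterwards means no page is released while the walk still needs to follow its child pointers.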
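The release procedure has been described above using the backup page table as an example. The primary page table is released in the same way, which a person skilled in the art can implement from the description above, so it is not repeated here.
This embodiment addresses the problem that, when page tables are organized with "large pages", the page faults caused by live migration in the prior art make virtual machine performance fluctuate heavily. Before live migration, the solution of this embodiment proactively analyzes the memory mappings in the guest's rmap to rebuild a backup of the guest's EPT page table, then has each vcpu in turn reload the root table of the backup EPT page table, switching the guest's page table from the primary EPT page table to the backup EPT page table in one pass. The guest's memory is thereby switched to the standard 4 KB mode, which improves virtual machine performance and reduces the impact of live migration on users.
The virtual machine live migration method of this embodiment may be executed by any suitable electronic device with data processing capability, including but not limited to servers, mobile terminals (such as tablets and mobile phones), and PCs.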
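Embodiment 5
Referring to FIG. 6, a structural block diagram of a virtual machine live migration apparatus according to Embodiment 5 of the present invention is shown.
The apparatus of this embodiment includes: a switching module 502, configured to switch, according to a switch trigger instruction, a first page table in a first physical machine, which indicates the mapping between guest memory addresses and host physical addresses, to a second page table; and a live migration module 504, configured to live-migrate a guest from the first physical machine to a second physical machine according to the second page table; where the size of the physical memory pages indicated by the last-level page table of the second page table satisfies a set size, and the size of the physical memory pages indicated by the last-level page table of the first page table is larger than the set size.
The apparatus of this embodiment implements the corresponding live migration methods of the foregoing method embodiments and has the beneficial effects of those embodiments, which are not repeated here. For the functions of the modules of the apparatus, refer to the descriptions of the corresponding parts of the foregoing method embodiments, which are likewise not repeated here.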
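Embodiment 6
Referring to FIG. 7, a structural block diagram of a virtual machine live migration apparatus according to Embodiment 6 of the present invention is shown.
The apparatus of this embodiment includes: a switching module 602, configured to switch, according to a switch trigger instruction, a first page table in a first physical machine, which indicates the mapping between guest memory addresses and host physical addresses, to a second page table; and a live migration module 604, configured to live-migrate a guest from the first physical machine to a second physical machine according to the second page table; where the size of the physical memory pages indicated by the last-level page table of the second page table satisfies a set size, and the size of the physical memory pages indicated by the last-level page table of the first page table is larger than the set size.
Optionally, the switching module 602 includes: an active page table switching submodule 6022, configured to switch the currently used virtual machine active page table from the first page table to the second page table according to the switch trigger instruction; and an indication submodule 6024, configured to send to all guests in the first physical machine an indication signal to reload the page table, so that each guest switches the page table node it currently uses in the first page table to the corresponding page table node in the second page table.
Optionally, the active page table switching submodule 6022 includes: a creation unit 60222, configured to create the second page table from the first page table after the switch trigger instruction is received; and a post-creation switching unit 60224, configured to switch the currently used virtual machine active page table from the first page table to the second page table after the second page table has been created successfully.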
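Optionally, the post-creation switching unit 60224 is configured to, after the second page table has been created successfully: determine the value of a first variable that points to the root page table of the virtual machine active page table, and determine from that value that the currently used active page table is the first page table; reassign the first variable to a set value, deactivate the first page table according to the set value, and activate the second page table as the virtual machine active page table.
Optionally, the first variable is a memory management unit node variable, MMU_NODE; the post-creation switching unit 60224 is further configured to lock the memory management unit (MMU) corresponding to the first variable before the first variable is reassigned to the set value, and to unlock the MMU after the second page table has been activated as the virtual machine active page table.
Optionally, the indication submodule 6024 is configured to send to the virtual processors of all guests in the first physical machine an indication signal to reload the page table, so that each virtual processor switches the page table node it currently uses in the first page table to the corresponding page table node in the second page table.
Optionally, each virtual processor switches the page table node it currently uses in the first page table to the corresponding page table node in the second page table as follows: according to the indication signal, the virtual processor checks whether the value of a second variable, which points to the root page table it currently uses, matches the value of the first variable; if the values do not match, the value of the second variable is replaced with the value of the first variable.
Optionally, the apparatus of this embodiment further includes a first live migration processing module 606, configured to determine whether the live migration succeeded; if it succeeded, release the first page table and the second page table in the first physical machine; if it failed, switch the second page table back to the first page table and release the second page table in the first physical machine.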
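Optionally, the creation unit 60222 is configured to, after the switch trigger instruction is received, traverse the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table, and create the second page table from the traversal result.
Optionally, the creation unit 60222 is configured to, after the switch trigger instruction is received: traverse the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table; copy each traversed last-level page table entry to generate a last-level page table entry in the second page table, and mark the generated entry with a set identifier; determine whether the physical memory page pointed to by the generated entry is larger than the set size; and, if it is larger, remove the set identifier from that entry and, according to the size of the physical memory page it points to, build at least one level of child page table entries for it, where the physical memory page pointed to by the last level of those child entries has the set size.
Optionally, the creation unit 60222 is further configured to determine whether the upper-level page table corresponding to the generated last-level page table entry exists; if it does not exist, the corresponding upper-level page table is generated for that entry and saved to the second page table.
Optionally, when one physical memory page of the host is mapped to multiple guest memory pages, the memory reverse mapping table stores a linked list of the multiple last-level page table entries that point to that physical memory page.
Optionally, when traversing the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table and creating the second page table from the traversal result, the creation unit 60222: obtains the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table; traverses the entries of the memory reverse mapping table one by one and determines whether the current entry stores the linked list; if the linked list is stored, traverses every last-level page table entry in the linked list and copies to the second page table both the information about the multiple last-level page table entries associated with the linked list and the content of each traversed entry; if the linked list is not stored, copies the content of the current entry to the second page table.
Optionally, the apparatus of this embodiment further includes a second live migration processing module 608, configured to, if a live migration exception occurs during the live migration, release the second page table first, and release the first page table after the currently used virtual machine active page table has been switched back to the first page table.
The apparatus of this embodiment implements the corresponding live migration methods of the foregoing method embodiments and has the beneficial effects of those embodiments, which are not repeated here. For the functions of the modules of the apparatus, refer to the descriptions of the corresponding parts of the foregoing method embodiments, which are likewise not repeated here.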
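Embodiment 7
An electronic device includes a processor, a memory, a communication interface, and a communication bus, where the processor, the memory, and the communication interface communicate with one another over the communication bus; the memory stores at least one executable instruction, and the executable instruction causes the processor to perform the operations of the virtual machine live migration method described above.
Specifically, FIG. 8 shows a schematic structural diagram of an electronic device according to Embodiment 7 of the present invention. The specific embodiments of the present invention do not limit the specific implementation of the electronic device.
As shown in FIG. 8, the electronic device may include a processor 702, a communications interface 704, a memory 706, and a communication bus 708.
Where:
The processor 702, the communication interface 704, and the memory 706 communicate with one another over the communication bus 708.
The communication interface 704 is used to communicate with other electronic devices or servers.
The processor 702 executes a program 710 and may specifically perform the relevant steps of the virtual machine live migration method embodiments above.
Specifically, the program 710 may include program code, and the program code includes computer operation instructions.
The processor 702 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention. The one or more processors included in the electronic device may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 706 stores the program 710. The memory 706 may include high-speed RAM and may also include non-volatile memory, for example at least one disk memory.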
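The program 710 may specifically cause the processor 702 to perform the following operations: switching, according to a switch trigger instruction, a first page table in a first physical machine, which indicates the mapping between guest memory addresses and host physical addresses, to a second page table; and live-migrating a guest from the first physical machine to a second physical machine according to the second page table; where the size of the physical memory pages indicated by the last-level page table of the second page table satisfies a set size, and the size of the physical memory pages indicated by the last-level page table of the first page table is larger than the set size.
In an optional implementation, the program 710 further causes the processor 702, when switching the first page table in the first physical machine to the second page table according to the switch trigger instruction, to: switch the currently used virtual machine active page table from the first page table to the second page table according to the switch trigger instruction; and send to all guests in the first physical machine an indication signal to reload the page table, so that each guest switches the page table node it currently uses in the first page table to the corresponding page table node in the second page table.
In an optional implementation, the program 710 further causes the processor 702, when switching the currently used virtual machine active page table from the first page table to the second page table according to the switch trigger instruction, to: create the second page table from the first page table after receiving the switch trigger instruction; and, after the second page table has been created successfully, switch the currently used active page table from the first page table to the second page table.
In an optional implementation, the program 710 further causes the processor 702, when switching the currently used virtual machine active page table from the first page table to the second page table, to: determine the value of a first variable that points to the root page table of the active page table, and determine from that value that the currently used active page table is the first page table; reassign the first variable to a set value, deactivate the first page table according to the set value, and activate the second page table as the virtual machine active page table.
In an optional implementation, the first variable is a memory management unit node variable, MMU_NODE; the program 710 further causes the processor 702 to lock the memory management unit (MMU) corresponding to the first variable before the first variable is reassigned to the set value, and to unlock the MMU after the second page table has been activated as the virtual machine active page table.
In an optional implementation, the program 710 further causes the processor 702, when sending to all guests in the first physical machine the indication signal to reload the page table, to: send the indication signal to the virtual processors of all guests in the first physical machine, so that each virtual processor switches the page table node it currently uses in the first page table to the corresponding page table node in the second page table.
In an optional implementation, each virtual processor switches the page table node it currently uses in the first page table to the corresponding page table node in the second page table as follows: according to the indication signal, the virtual processor checks whether the value of a second variable, which points to the root page table it currently uses, matches the value of the first variable; if the values do not match, the value of the second variable is replaced with the value of the first variable.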
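In an optional implementation, the program 710 further causes the processor 702 to determine whether the live migration succeeded; if it succeeded, release the first page table and the second page table in the first physical machine; if it failed, switch the second page table back to the first page table and release the second page table in the first physical machine.
In an optional implementation, the program 710 further causes the processor 702, when creating the second page table from the first page table, to: traverse the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table, and create the second page table from the traversal result.
In an optional implementation, the program 710 further causes the processor 702, when creating the second page table from the traversal result, to: copy each traversed last-level page table entry to generate a last-level page table entry in the second page table, and mark the generated entry with a set identifier; determine whether the physical memory page pointed to by the generated entry is larger than the set size; and, if it is larger, remove the set identifier from that entry and, according to the size of the physical memory page it points to, build at least one level of child page table entries for it, where the physical memory page pointed to by the last level of those child entries has the set size.
In an optional implementation, the program 710 further causes the processor 702 to determine whether the upper-level page table corresponding to the generated last-level page table entry exists; if it does not exist, the corresponding upper-level page table is generated for that entry and saved to the second page table.
In an optional implementation, when one physical memory page of the host is mapped to multiple guest memory pages, the memory reverse mapping table stores a linked list of the multiple last-level page table entries that point to that physical memory page.
In an optional implementation, the program 710 further causes the processor 702, when traversing the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table and creating the second page table from the traversal result, to: obtain the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table; traverse the entries of the memory reverse mapping table one by one and determine whether the current entry stores the linked list; if the linked list is stored, traverse every last-level page table entry in the linked list and copy to the second page table both the information about the multiple last-level page table entries associated with the linked list and the content of each traversed entry; if the linked list is not stored, copy the content of the current entry to the second page table.
In an optional implementation, the program 710 further causes the processor 702, if a live migration exception occurs during the live migration, to release the second page table first, and to release the first page table after the currently used virtual machine active page table has been switched back to the first page table.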
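For the specific implementation of each step of the program 710, refer to the corresponding descriptions of the steps and units in the virtual machine live migration method embodiments above, which are not repeated here. A person skilled in the art will understand that, for convenience and brevity, the specific working processes of the devices and modules described above can be found in the corresponding process descriptions of the foregoing method embodiments and are not repeated here.
In the electronic device of this embodiment, a first page table and a second page table are provided in the first physical machine. Both indicate the mapping between guest memory addresses and host physical addresses, so that together they form a primary/backup arrangement for that mapping. They differ in that the physical memory pages indicated by the last-level page table of the second page table satisfy the set size, such as the conventional 4 KB, whereas the physical memory pages indicated by the last-level page table of the first page table are larger than the set size, that is, they are what are commonly called "large pages". Live migration of the virtual machine can therefore be carried out according to the second page table. On the one hand, because the physical memory pages of the second page table satisfy the set size, the page-size requirement of live migration is met, which improves the success rate and performance of the guest and of its live migration. On the other hand, the primary/backup arrangement and the controlled switch between the two page tables mean that converting the physical page size to the size required for live migration does not pause the virtual machine or otherwise degrade its performance, which greatly reduces the adverse effect that the page-size change would otherwise have on the running guest.
Note that, as implementation requires, each component/step described in the embodiments of the present invention can be split into more components/steps, and two or more components/steps, or partial operations of components/steps, can be combined into new components/steps to achieve the purpose of the embodiments.
The above method according to the embodiments of the present invention can be implemented in hardware or firmware, or as software or computer code that can be stored in a recording medium (such as a CD-ROM, RAM, floppy disk, hard disk, or magneto-optical disk), or as computer code that is downloaded over a network, originally stored on a remote recording medium or a non-transitory machine-readable medium, and then stored on a local recording medium, so that the method described here can be processed by such software stored on a recording medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware (such as an ASIC or FPGA). It will be understood that a computer, processor, microprocessor controller, or programmable hardware includes storage components (for example, RAM, ROM, or flash memory) that can store or receive software or computer code; when the software or computer code is accessed and executed by the computer, processor, or hardware, the virtual machine live migration method described here is implemented. Furthermore, when a general-purpose computer accesses code for implementing the virtual machine live migration method shown here, execution of the code converts the general-purpose computer into a special-purpose computer for executing that method.
A person of ordinary skill in the art will appreciate that the units and method steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are executed in hardware or software depends on the specific application and design constraints of the technical solution. A skilled person may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the embodiments of the present invention.
The above implementations merely illustrate the embodiments of the present invention and do not limit them. A person of ordinary skill in the relevant technical field can make various changes and modifications without departing from the spirit and scope of the embodiments, so all equivalent technical solutions also fall within the scope of the embodiments of the present invention, and the scope of patent protection of the embodiments shall be defined by the claims.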

Claims (30)

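  1. A virtual machine live migration method, comprising:
    switching, according to a switch trigger instruction, a first page table in a first physical machine, which indicates the mapping between guest memory addresses and host physical addresses, to a second page table;
    live-migrating a guest from the first physical machine to a second physical machine according to the second page table;
    wherein the size of the physical memory pages indicated by the last-level page table of the second page table satisfies a set size, and the size of the physical memory pages indicated by the last-level page table of the first page table is larger than the set size.
  2. The method according to claim 1, wherein switching, according to the switch trigger instruction, the first page table in the first physical machine, which indicates the mapping between guest memory addresses and host physical addresses, to the second page table comprises:
    switching the currently used virtual machine active page table from the first page table to the second page table according to the switch trigger instruction;
    and sending to all guests in the first physical machine an indication signal to reload the page table, so that each guest switches the page table node it currently uses in the first page table to the corresponding page table node in the second page table.
  3. The method according to claim 2, wherein switching the currently used virtual machine active page table from the first page table to the second page table according to the switch trigger instruction comprises:
    creating the second page table from the first page table after the switch trigger instruction is received;
    after the second page table has been created successfully, switching the currently used virtual machine active page table from the first page table to the second page table.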
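  4. The method according to claim 3, wherein switching the currently used virtual machine active page table from the first page table to the second page table comprises:
    determining the value of a first variable that points to the root page table of the virtual machine active page table, and determining from that value that the currently used virtual machine active page table is the first page table;
    reassigning the first variable to a set value, deactivating the first page table according to the set value, and activating the second page table as the virtual machine active page table.
  5. The method according to claim 4, wherein the first variable is a memory management unit node variable, MMU_NODE;
    before the first variable is reassigned to the set value, the method further comprises: locking the memory management unit (MMU) corresponding to the first variable;
    after the second page table is activated as the virtual machine active page table, the method further comprises: unlocking the MMU.
  6. The method according to claim 4, wherein sending to all guests in the first physical machine the indication signal to reload the page table, so that each guest switches the page table node it currently uses in the first page table to the corresponding page table node in the second page table, comprises:
    sending to the virtual processors of all guests in the first physical machine the indication signal to reload the page table, so that each virtual processor switches the page table node it currently uses in the first page table to the corresponding page table node in the second page table.
  7. The method according to claim 6, wherein each virtual processor switches the page table node it currently uses in the first page table to the corresponding page table node in the second page table as follows:
    according to the indication signal, each virtual processor checks whether the value of a second variable, which points to the root page table currently used by that virtual processor, matches the value of the first variable;
    if the values do not match, the value of the second variable is replaced with the value of the first variable.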
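  8. The method according to any one of claims 1-7, further comprising:
    determining whether the live migration succeeded;
    if the live migration succeeded, releasing the first page table and the second page table in the first physical machine;
    if the live migration failed, switching the second page table back to the first page table and releasing the second page table in the first physical machine.
  9. The method according to any one of claims 3-7, wherein creating the second page table from the first page table comprises:
    traversing the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table, and creating the second page table from the traversal result.
  10. The method according to claim 9, wherein creating the second page table from the traversal result comprises:
    copying each traversed last-level page table entry to generate a last-level page table entry in the second page table, and marking the generated last-level page table entry with a set identifier;
    determining whether the physical memory page pointed to by the generated last-level page table entry is larger than the set size;
    if it is larger, removing the set identifier from the last-level page table entry and, according to the size of the physical memory page pointed to by the last-level page table entry, building at least one level of child page table entries for it, wherein the physical memory page pointed to by the last level of the at least one level of child page table entries has the set size.
  11. The method according to claim 9, further comprising:
    determining whether the upper-level page table corresponding to the generated last-level page table entry exists;
    if it does not exist, generating the corresponding upper-level page table for the last-level page table entry and saving it to the second page table.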
  12. 根据权利要求9所述的方法,其特征在于,
    当宿主机的一个物理内存页被映射至多个虚拟客户机内存页时,则所述内存反向映射表中存储有对应的指向所述物理内存页的多个末级页表项的链表。
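  13. The method according to claim 12, characterized in that traversing the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table, and creating the second page table according to the traversal result, comprises:
    obtaining the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table;
    traversing the entries of the memory reverse mapping table one by one, and determining whether the current entry stores the linked list;
    if the linked list is stored, traversing each last-level page table entry in the linked list, and copying the information on the multiple last-level page table entries corresponding to the linked list, together with the content of each traversed last-level page table entry, to the second page table;
    if the linked list is not stored, copying the content of the current entry to the second page table.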
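The traversal in claim 13 distinguishes reverse-map entries that hold a single last-level entry from those that hold a linked list. In the sketch below the reverse mapping table is modeled as a Python list whose elements are either a dict (one entry), a list of dicts (the linked list), or `None` for an empty slot. All of these representations are assumptions, and the `shared` count is a hypothetical way of carrying over "the information on the multiple last-level page table entries".

```python
def collect_leaf_entries(rmap):
    """Walk a reverse mapping table and return copies of all last-level entries."""
    copied = []
    for slot in rmap:
        if slot is None:            # assumption: empty slots are skipped
            continue
        if isinstance(slot, list):  # the slot stores a linked list of entries
            for pte in slot:
                # Copy each entry together with the sharing information.
                copied.append(dict(pte, shared=len(slot)))
        else:                       # the slot stores a single entry
            copied.append(dict(slot, shared=1))
    return copied
```

Each returned dict is a fresh copy, so building the second page table from it leaves the first page table's entries untouched.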
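  14. The method according to any one of claims 1-7, characterized in that the method further comprises:
    if a hot migration exception occurs during the hot migration, first releasing the second page table, then switching the currently used virtual machine active page table back to the first page table, and afterwards releasing the first page table.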
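  15. A virtual machine hot migration apparatus, characterized by comprising:
    a switching module, configured to switch, according to a switching trigger instruction, a first page table in a first physical machine, which indicates the mapping between virtual guest memory addresses and host physical addresses, to a second page table;
    a hot migration module, configured to hot-migrate a virtual guest from the first physical machine to a second physical machine according to the second page table;
    wherein the size of the physical memory pages indicated by the last-level page table of the second page table satisfies a set size, and the size of the physical memory pages indicated by the last-level page table of the first page table is larger than the set size.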
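  16. The apparatus according to claim 15, characterized in that the switching module comprises:
    an active page table switching submodule, configured to switch, according to the switching trigger instruction, the currently used virtual machine active page table from the first page table to the second page table;
    an indication submodule, configured to send, to all virtual guests in the first physical machine, an indication signal for instructing a page table reload, so as to instruct each virtual guest to switch from the page table node it currently uses in the first page table to the corresponding page table node in the second page table.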
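  17. The apparatus according to claim 16, characterized in that the active page table switching submodule comprises:
    a creation unit, configured to create the second page table from the first page table after the switching trigger instruction is received;
    a post-creation switching unit, configured to switch the currently used virtual machine active page table from the first page table to the second page table after the second page table has been created successfully.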
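  18. The apparatus according to claim 17, characterized in that the post-creation switching unit is configured to: after the second page table has been created successfully, determine the value of a first variable that points to the root page table of the virtual machine active page table, and determine from that value that the currently used virtual machine active page table is the first page table; and reassign the first variable to a set value, deactivate the first page table according to the set value, and activate the second page table as the virtual machine active page table.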
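  19. The apparatus according to claim 18, characterized in that the first variable is a memory management unit node (MMU_NODE) variable;
    the post-creation switching unit is further configured to lock the memory management unit (MMU) corresponding to the first variable before the first variable is reassigned to the set value, and to unlock the MMU after the second page table is activated as the virtual machine active page table.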
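  20. The apparatus according to claim 18, characterized in that the indication submodule is configured to send the indication signal for instructing a page table reload to the virtual processors corresponding to all virtual guests in the first physical machine, so as to instruct each virtual processor to switch from the page table node it currently uses in the first page table to the corresponding page table node in the second page table.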
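  21. The apparatus according to claim 20, characterized in that each virtual processor switches from the page table node it currently uses in the first page table to the corresponding page table node in the second page table as follows:
    each virtual processor checks, according to the indication signal, whether the value of a second variable, which points to the root page table currently used by that virtual processor, is consistent with the value of the first variable;
    if they are inconsistent, the value of the second variable is replaced with the value of the first variable.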
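  22. The apparatus according to any one of claims 15-21, characterized in that the apparatus further comprises:
    a first hot migration processing module, configured to determine whether the hot migration succeeded; if the hot migration succeeded, release the first page table and the second page table in the first physical machine; and if the hot migration failed, switch from the second page table back to the first page table and release the second page table in the first physical machine.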
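  23. The apparatus according to any one of claims 17-21, characterized in that the creation unit is configured to, after the switching trigger instruction is received, traverse the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table, and create the second page table according to the traversal result.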
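  24. The apparatus according to claim 23, characterized in that the creation unit is configured to: after the switching trigger instruction is received, traverse the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table; generate last-level page table entries in the second page table by copying each traversed last-level page table entry, and mark each generated last-level page table entry with a set identifier; determine whether the size of the physical memory page pointed to by the generated last-level page table entry is larger than the set size; and if it is larger, delete the set identifier of the last-level page table entry and, according to the size of the physical memory page pointed to by the last-level page table entry, build at least one level of child page table entries for the last-level page table entry, wherein the size of the physical memory page pointed to by the last level of the at least one level of child page table entries is the set size.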
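  25. The apparatus according to claim 23, characterized in that the creation unit is further configured to determine whether an upper-level page table corresponding to the generated last-level page table entry exists, and if it does not exist, generate the corresponding upper-level page table for the last-level page table entry and save it to the second page table.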
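  26. The apparatus according to claim 23, characterized in that,
    when one physical memory page of the host machine is mapped to multiple virtual guest memory pages, the memory reverse mapping table stores a corresponding linked list of the multiple last-level page table entries that point to the physical memory page.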
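  27. The apparatus according to claim 26, characterized in that, when traversing the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table and creating the second page table according to the traversal result, the creation unit:
    obtains the memory reverse mapping table corresponding to the virtual memory slot pointed to by the first page table;
    traverses the entries of the memory reverse mapping table one by one, and determines whether the current entry stores the linked list;
    if the linked list is stored, traverses each last-level page table entry in the linked list, and copies the information on the multiple last-level page table entries corresponding to the linked list, together with the content of each traversed last-level page table entry, to the second page table;
    if the linked list is not stored, copies the content of the current entry to the second page table.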
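  28. The apparatus according to any one of claims 15-21, characterized in that the apparatus further comprises:
    a second hot migration processing module, configured to, if a hot migration exception occurs during the hot migration, first release the second page table, then switch the currently used virtual machine active page table back to the first page table, and afterwards release the first page table.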
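  29. An electronic device, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another via the communication bus;
    the memory is configured to store at least one executable instruction, the executable instruction causing the processor to perform operations corresponding to the virtual machine hot migration method according to any one of claims 1-14.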
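  30. A computer storage medium storing a computer program which, when executed by a processor, implements the virtual machine hot migration method according to any one of claims 1-14.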
PCT/CN2020/105032 2019-08-05 2020-07-28 Virtual machine hot migration method, apparatus, electronic device, and computer storage medium WO2021023052A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910715884.1A CN112328354A (zh) 2019-08-05 2019-08-05 Virtual machine hot migration method, apparatus, electronic device, and computer storage medium
CN201910715884.1 2019-08-05

Publications (1)

Publication Number Publication Date
WO2021023052A1 (zh) 2021-02-11

Family

ID=74319266

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/105032 WO2021023052A1 (zh) 2019-08-05 2020-07-28 Virtual machine hot migration method, apparatus, electronic device, and computer storage medium

Country Status (2)

Country Link
CN (1) CN112328354A (zh)
WO (1) WO2021023052A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114154163A (zh) * 2021-10-19 2022-03-08 荣耀终端有限公司 Vulnerability detection method and apparatus
WO2022200962A1 (en) * 2021-03-24 2022-09-29 Ati Technologies Ulc Migrating pages of memory accessible by input-output devices
CN116185902A (zh) * 2023-04-13 2023-05-30 阿里云计算有限公司 Table splitting method, system, electronic device, and readable medium

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
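CN113342711B (zh) * 2021-06-28 2024-02-09 海光信息技术股份有限公司 Page table update method, apparatus, and related device
CN113515502B (zh) * 2021-07-14 2023-11-21 重庆度小满优扬科技有限公司 Data migration method, apparatus, device, and storage medium
CN116701248B (zh) * 2022-02-24 2024-04-30 象帝先计算技术(重庆)有限公司 Page table management method, unit, SoC, electronic device, and readable storage medium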

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080256327A1 (en) * 2007-04-16 2008-10-16 Stuart Zachary Jacobs System and Method for Maintaining Page Tables Used During a Logical Partition Migration
CN104598303A (zh) * 2013-10-31 2015-05-06 中国电信股份有限公司 KVM-based method and apparatus for online migration between virtual machines
US20170329718A1 (en) * 2016-05-10 2017-11-16 Oracle International Corporation Virtual memory page mapping overlays
CN108804350A (zh) * 2017-04-27 2018-11-13 华为技术有限公司 Memory access method and computer system

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022200962A1 (en) * 2021-03-24 2022-09-29 Ati Technologies Ulc Migrating pages of memory accessible by input-output devices
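CN114154163A (zh) * 2021-10-19 2022-03-08 荣耀终端有限公司 Vulnerability detection method and apparatus
CN114154163B (zh) * 2021-10-19 2023-01-10 北京荣耀终端有限公司 Vulnerability detection method and apparatus
CN116185902A (zh) * 2023-04-13 2023-05-30 阿里云计算有限公司 Table splitting method, system, electronic device, and readable medium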

Also Published As

Publication number Publication date
CN112328354A (zh) 2021-02-05

Similar Documents

Publication Publication Date Title
WO2021023052A1 (zh) Virtual machine hot migration method, apparatus, electronic device, and computer storage medium
RU2751551C1 (ru) Method and device for restoring impaired node operability, electronic device, and data medium
US7814287B2 (en) Using writeable page tables for memory address translation in a hypervisor environment
US11487675B1 (en) Collecting statistics for persistent memory
US20200371700A1 (en) Coordinated allocation of external memory
EP3073384B1 (en) Fork-safe memory allocation from memory-mapped files with anonymous memory behavior
US20180136838A1 (en) Management of block storage devices based on access frequency
JP5484117B2 (ja) Hypervisor and server apparatus
US9678818B2 (en) Direct IO access from a CPU's instruction stream
US20170357592A1 (en) Enhanced-security page sharing in a virtualized computer system
US10031858B2 (en) Efficient translation reloads for page faults with host accelerator directly accessing process address space without setting up DMA with driver and kernel by process inheriting hardware context from the host accelerator
US20080177974A1 (en) System and method for reducing memory overhead of a page table in a dynamic logical partitioning environment
US8943296B2 (en) Virtual address mapping using rule based aliasing to achieve fine grained page translation
US20160231929A1 (en) Zero copy memory reclaim using copy-on-write
US20190138405A1 (en) Data Loading Method and Apparatus
US9990237B2 (en) Lockless write tracking
US20220156106A1 (en) Virtual Machine Hot Migration Method, Apparatus, Electronic Device, and Computer Storage Medium
CN109710317B (zh) System startup method, apparatus, electronic device, and storage medium
WO2018214850A1 (zh) Method, apparatus, and system for accessing the secure world
US11620233B1 (en) Memory data migration hardware
US10884790B1 (en) Eliding redundant copying for virtual machine migration
RU2580016C1 (ru) Method for transferring control between memory areas
US10754783B2 (en) Techniques to manage cache resource allocations for a processor cache
CN112241310A (zh) Page table management and information acquisition methods, processor, chip, device, and medium
TW201531862A (zh) Memory data versioning technique

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20851105

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20851105

Country of ref document: EP

Kind code of ref document: A1