CN115629840A - Method and system for hot migration of RDMA virtual machine and corresponding physical machine - Google Patents

Method and system for hot migration of RDMA virtual machine and corresponding physical machine

Publication number
CN115629840A
Authority
CN
China
Prior art keywords
physical machine
network card
rdma network
machine
rdma
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211282764.5A
Other languages
Chinese (zh)
Inventor
韦奋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Yunbao Intelligent Co ltd
Original Assignee
Shenzhen Yunbao Intelligent Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Yunbao Intelligent Co ltd
Priority to CN202211282764.5A
Publication of CN115629840A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/4557 Distribution of virtual machine instances; Migration and load balancing
    • G06F2009/45575 Starting, stopping, suspending or resuming virtual machine instances
    • G06F2009/45583 Memory management, e.g. access or allocation

Abstract

The invention discloses a method for live migration of an RDMA virtual machine, comprising the following steps: a Host of a source physical machine issues a virtual machine live migration synchronization command to its RDMA network card; the RDMA network card of the source physical machine stops sending new data messages to the RDMA network card of a peer physical machine and, at the same time, sends a first extended CNP (Congestion Notification Packet) message to the RDMA network card of the peer physical machine; after the RDMA network card of the peer physical machine receives the first extended CNP message, a congestion control unit in that network card reduces the data transmission rate to zero, and the network card returns a first extended CNP response message to the source physical machine; after receiving the first extended CNP response message, the RDMA network card of the source physical machine keeps the table entry data cached in it stable and synchronizes the cached table entry data to the memory area of the source physical machine; and the source physical machine copies the memory page table corresponding to the virtual machine to the memory area of a target physical machine. The invention also discloses a corresponding system and corresponding physical machines. Implementing the invention can improve the efficiency of live migration, shorten its duration, and improve its success rate.

Description

Method and system for hot migration of RDMA virtual machine and corresponding physical machine
Technical Field
The invention relates to the technical field of data storage, in particular to a method and a system for hot migration of an RDMA virtual machine and a corresponding physical machine.
Background
Virtual machine live migration refers to migrating a running virtual machine from a source physical machine to a target physical machine with the goal of minimizing service interruption time and system performance loss. In virtual machine live migration, copying the memory pages of the virtual machine over multiple iterative rounds is the main bottleneck limiting migration efficiency. The copy between source memory and target memory is usually carried out by RDMA (Remote Direct Memory Access) network cards.
During live migration of a virtual machine, the system copies all pages from the source physical machine to the target physical machine. If the source physical machine writes to a memory region that is being copied, the corresponding page is marked as a dirty page. If dirty pages keep appearing during copying and transmission, they are copied again in further iterations. When the amount of memory data copied per iteration converges or falls below a certain threshold, the source physical machine pauses the virtual machine and copies the last round of dirty pages and the state of the virtual machine to the target physical machine. The virtual machine on the source physical machine is then shut down and the virtual machine on the target physical machine is started, which completes the live migration of the virtual machine from the source physical machine to the target physical machine.
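The iterative copy described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the function name `precopy_rounds`, the `dirty_per_round` callback, and all page counts are hypothetical and do not come from the patent.

```python
def precopy_rounds(total_pages, dirty_per_round, threshold, max_rounds=30):
    """Return (pages sent per round, converged?) for a toy pre-copy loop.

    dirty_per_round(sent) is a hypothetical callback giving how many pages
    are dirtied while `sent` pages are being transferred.
    """
    rounds = []
    to_send = total_pages                 # the first round copies every page
    for _ in range(max_rounds):
        rounds.append(to_send)
        dirtied = min(total_pages, dirty_per_round(to_send))
        if dirtied <= threshold:
            # Converged: pause the VM, send the last dirty pages and VM state.
            rounds.append(dirtied)
            return rounds, True
        to_send = dirtied                 # copy the new dirty pages next round
    return rounds, False                  # did not converge within max_rounds


# Dirtying shrinks with each round (10% of what was just sent): converges.
converging, ok = precopy_rounds(10_000, lambda sent: sent // 10, threshold=50)

# Incoming traffic dirties 2,000 pages every round: the loop never converges.
stuck, ok2 = precopy_rounds(10_000, lambda sent: 2_000, threshold=50, max_rounds=5)
```

The second call models the situation the following paragraphs complain about: as long as something keeps dirtying memory at a steady rate, each round is as large as the previous one.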
Existing schemes that implement virtual machine live migration with RDMA network cards have the following defects:
Under RDMA communication, protocol processing is mainly performed by hardware, which maintains a number of memory data table entries. The hardware usually contains a cache that holds updated table entry data, so the cached data can be inconsistent with the data in the memory area. While the system copies memory data from the source physical machine to the target physical machine, the data cached in the hardware must first be synchronized to the memory area before it can be copied, which lowers the efficiency of iterative copying. The memory data table entries include: QPC (Queue Pair Context), CQC (Completion Queue Context), SRQC (Shared Receive Queue Context), and EQC (Event Queue Context).
Under RDMA communication, while the source physical machine is performing iterative copying, a peer physical machine communicating with it cannot sense that the source physical machine is migrating and continues to send data messages to it. When the RDMA network card of the source physical machine processes these data messages, new dirty data is generated in hardware, and the system must keep synchronizing the dirty data in the network card to the memory area, which again lowers the efficiency of iterative copying.
Meanwhile, under RDMA communication, when the source physical machine pauses the migrated virtual machine, that virtual machine cannot process data messages from the peer physical machine, and those data messages are discarded. If the migrated virtual machine stays paused for too long, the peer physical machine receives no response to the data messages it has sent and retransmits them on timeout. Once the timeout retransmissions exceed the retry limit, the RDMA connection on the network card of the peer physical machine is broken, and even if the migrated virtual machine eventually completes its memory data transfer, communication with the peer cannot be recovered on the target physical machine.
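The broken-connection defect can be sketched as a retry counter on the peer side. This is an illustrative model only: the function `peer_qp_state`, the fixed acknowledgement timeout, and the retry limit are assumptions, not values from the patent or from any particular network card.

```python
def peer_qp_state(pause_ms, ack_timeout_ms, retry_limit):
    """Return 'connected' or 'error' for a peer connection after a VM pause.

    Assumes a fixed ack timeout and a hard retry limit (hypothetical values).
    """
    retries = 0
    waited = ack_timeout_ms               # first timeout fires after one interval
    while waited <= pause_ms:
        retries += 1                      # timeout expired: retransmit the request
        if retries > retry_limit:
            return "error"                # retries exhausted: connection is broken
        waited += ack_timeout_ms
    return "connected"                    # the VM resumed before retries ran out


short_pause = peer_qp_state(pause_ms=100, ack_timeout_ms=50, retry_limit=7)
long_pause = peer_qp_state(pause_ms=1000, ack_timeout_ms=50, retry_limit=7)
```

With these example numbers a 100 ms pause survives, while a 1000 ms pause exhausts the retries and leaves the connection in an error state that migration alone cannot repair.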
Disclosure of Invention
The technical problem to be solved by the present invention is to provide a method and an apparatus for RDMA virtual machine live migration, which can quickly and smoothly implement RDMA virtual machine live migration, and improve live migration efficiency and success rate.
To solve the above technical problem, as an aspect of the present invention, a method for RDMA virtual machine live migration is provided, which at least includes the following steps:
acquiring a migration request for migrating the memory data of a migrated virtual machine in a source physical machine, whereupon a Host of the source physical machine issues a virtual machine live migration synchronization command to its RDMA (Remote Direct Memory Access) network card;
the RDMA network card of the source physical machine stops sending subsequent data messages of the current data transmission to the RDMA network card of a peer physical machine that communicates with the migrated virtual machine, and at the same time sends a first extended CNP (Congestion Notification Packet) message to the RDMA network card of the peer physical machine;
after the RDMA network card of the peer physical machine receives the first extended CNP message, a congestion control unit in that network card reduces the data transmission rate to zero;
after the RDMA network card of the peer physical machine has reduced the data transmission rate to zero, it returns a first extended CNP response message to the source physical machine;
after receiving the first extended CNP response message, the RDMA network card of the source physical machine keeps the table entry data cached in the RDMA network card stable and synchronizes the cached table entry data to the memory area of the source physical machine;
and the source physical machine copies the memory page table corresponding to the virtual machine to the memory area of a target physical machine.
The method further includes:
after the source physical machine has copied all memory page tables of the virtual machine to the memory area of the target physical machine, closing the corresponding virtual machine on the source physical machine and starting the corresponding virtual machine on the target physical machine;
the RDMA network card of the target physical machine sends a second extended CNP message to the RDMA network card of the peer physical machine;
after the RDMA network card of the peer physical machine receives the second extended CNP message, its congestion control unit restores the sending rate of the network card, and the network card sends subsequent data messages to the RDMA network card of the target physical machine.
As another aspect of the present invention, a system for RDMA virtual machine live migration is further provided, which includes at least a source physical machine, a peer physical machine, and a target physical machine, where:
the source physical machine is used for acquiring a migration request for migrating the memory data of the migrated virtual machine in the source physical machine and controlling its Host to issue a virtual machine live migration synchronization command to the RDMA network card; controlling the RDMA network card to stop sending new data messages to the RDMA network card of the peer physical machine and, at the same time, to send a first extended CNP message to the RDMA network card of the peer physical machine; after receiving a first extended CNP response message from the peer physical machine, controlling its RDMA network card to keep the table entry data cached in the RDMA network card stable and to synchronize the cached table entry data to the memory area of the source physical machine; and copying the memory page table corresponding to the virtual machine to the memory area of the target physical machine;
the peer physical machine is used for having the congestion control unit in its RDMA network card reduce the data transmission rate to zero after the RDMA network card receives the first extended CNP message from the source physical machine; and, after the second extended CNP message is received from the target physical machine, for having the congestion control unit restore the sending rate of the RDMA network card and sending subsequent data messages to the RDMA network card of the target physical machine;
the target physical machine is used for starting the corresponding virtual machine in the target physical machine after the memory page table copied by the source physical machine has been received in the memory area of the target physical machine; controlling the RDMA network card of the target physical machine to send a second extended CNP message to the RDMA network card of the peer physical machine; and controlling data transmission between the target physical machine and the peer physical machine.
Further, in the system:
the source physical machine is further used for closing the corresponding virtual machine after the memory page table of the virtual machine has been completely copied to the memory area of the target physical machine.
As another aspect of the present invention, a source physical machine is further provided, which includes a hardware layer, on which a Host runs, on which at least one virtual machine runs, and the hardware layer further includes an RDMA network card and a memory area, where the RDMA network card includes a congestion control unit;
wherein the Host is configured to:
acquiring a migration request for migrating the memory data of a migrated virtual machine in a source physical machine, and issuing a virtual machine live migration synchronization command to an RDMA (remote direct memory access) network card;
controlling the RDMA network card of the source physical machine to stop sending new data messages to the RDMA network card of the peer physical machine and, at the same time, to send a first extended CNP message to the RDMA network card of the peer physical machine;
after receiving a first extended CNP response message from the peer physical machine, controlling the RDMA network card of the source physical machine to keep the table entry data cached in the RDMA network card stable and to synchronize the cached table entry data to the memory area of the source physical machine;
controlling the source physical machine to copy a memory page table of the virtual machine to a memory area of the target physical machine;
and after the copying is finished, controlling to close the corresponding virtual machine on the source physical machine.
As another aspect of the present invention, a peer physical machine is further provided, which includes a hardware layer, on which a Host runs, on which at least one virtual machine runs, and the hardware layer further includes an RDMA network card and a memory area, where the RDMA network card includes a congestion control unit;
wherein the RDMA network card is configured to:
after receiving a first extended CNP message from a source physical machine, control the congestion control unit in the RDMA network card to reduce the data transmission rate to zero;
and after receiving a second extended CNP message from a target physical machine, control the congestion control unit of the RDMA network card to restore the sending rate of the RDMA network card of the peer physical machine, and send subsequent data messages to the RDMA network card of the target physical machine.
As another aspect of the present invention, there is also provided a target physical machine, which includes a hardware layer, on which a Host runs, on which at least one virtual machine runs, the hardware layer further including an RDMA network card and a memory area;
wherein the Host is configured to:
after a memory page table copied by a source physical machine is received in a memory area of a target physical machine, starting a virtual machine of the target physical machine;
controlling the RDMA network card of the target physical machine to send a second extended CNP message to the RDMA network card of the peer physical machine;
and controlling data transmission between the target physical machine and the peer physical machine.
As another aspect of the present invention, there is also provided a system for RDMA virtual machine live migration, comprising: a plurality of physical machines, the plurality of physical machines comprising at least: a source physical machine as described above, a peer physical machine as described above; and a target physical machine as previously described.
The embodiment of the invention has the following beneficial effects:
The invention provides a method and a system for live migration of an RDMA virtual machine and corresponding physical machines, in which the definition of the CNP message is extended by using a reserved field in the CNP message. When the virtual machine of the source physical machine starts live migration, a first extended CNP message notifies the RDMA network card of the peer physical machine communicating with the migrated virtual machine to reduce its sending traffic to zero by means of its congestion control unit, so that the RDMA network card of the peer physical machine is suspended from sending new data messages. This improves the efficiency of live migration, shortens its duration, and improves its success rate.
During live migration, the network traffic between the source physical machine and the peer physical machine can be suspended, so the RDMA network card of the source physical machine no longer generates new dirty pages. The system of the source physical machine needs to synchronize data between the RDMA hardware cache and system memory only once before executing the last round of copying; repeated iterative synchronization and copying are not needed, which greatly improves migration efficiency.
Meanwhile, when the last round of copying is finished, the virtual machine on the target physical machine is started, and the RDMA network card of the target physical machine sends a second extended CNP message to the peer physical machine that was communicating with the migrated virtual machine of the source physical machine. This notifies the RDMA network card of the peer physical machine to restore its data sending rate by means of its congestion control unit and to restart sending data messages, which prevents the risk of a broken connection found in the prior art.
In summary, the present invention adds a control information channel between the two ends of an RDMA communication link by means of extended CNP messages. When the virtual machine of the source physical machine starts live migration, the peer communicating with the migrated virtual machine is notified of the state of the migrated virtual machine and is told to suspend sending new data messages, so that the efficiency of live migration is improved, its duration is shortened, and its success rate is improved.
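The claim that a single cache-to-memory synchronization suffices once peer traffic is suspended can be illustrated with a toy write-back cache. The class `NicContextCache`, the function `syncs_needed`, and the packet tuples are hypothetical names chosen for this sketch; a real network card's table entry cache is far more involved.

```python
class NicContextCache:
    """Toy write-back cache of table entries (QPC, CQC, ...) inside the NIC."""

    def __init__(self):
        self.entries = {}
        self.dirty = set()

    def on_packet(self, qp, psn):
        self.entries[qp] = psn            # hardware updates the cached context
        self.dirty.add(qp)

    def sync_to_memory(self, memory):
        for qp in self.dirty:
            memory[qp] = self.entries[qp]
        self.dirty.clear()


def syncs_needed(packet_batches):
    """Count sync passes; each batch holds packets arriving between two passes."""
    cache, memory, passes = NicContextCache(), {}, 0
    for batch in packet_batches:
        for qp, psn in batch:
            cache.on_packet(qp, psn)
        if cache.dirty:                   # only sync when something changed
            cache.sync_to_memory(memory)
            passes += 1
    return passes, memory


# Peer keeps sending: every pass finds the cache dirty again.
busy = syncs_needed([[(1, 10)], [(1, 11)], [(1, 12)]])
# Peer paused after the first batch: one pass is enough.
quiet = syncs_needed([[(1, 10)], [], []])
```

In the paused case the cache stays clean after the first pass, which corresponds to the single synchronization before the last round of copying.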
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing them are briefly introduced below. The drawings described below show only some embodiments of the present invention; other drawings that those skilled in the art obtain from them without inventive effort also fall within the scope of the present invention.
FIG. 1 is a schematic main flow diagram illustrating an embodiment of a method for RDMA virtual machine live migration according to the present invention;
FIG. 2 is a block diagram illustrating an embodiment of a system for RDMA virtual machine live migration to which the method of FIG. 1 is applied.
Detailed Description
To make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings.
FIG. 1 shows a main flow diagram of an embodiment of a method for live migration of an RDMA virtual machine according to the present invention. With reference also to FIG. 2, in this embodiment the method is applied to the application architecture shown in FIG. 2, where the sequence numbers 1 to 10 in FIG. 2 indicate the general order of the flow of the method. In this embodiment, the method includes at least the following steps:
step S10, acquiring a migration request for migrating the memory data of a migrated virtual machine in a source physical machine, whereupon the Host of the source physical machine issues a virtual machine live migration synchronization command to its RDMA network card; it can be understood that, in a specific example, the command may carry content such as identification information of the virtual machine to be migrated and identification information of the virtual machine on the target physical machine;
step S11, the RDMA network card of the source physical machine stops sending new data messages to the RDMA network card of the peer physical machine communicating with the migrated virtual machine, and at the same time sends a first extended CNP message (such as a CNP_pause message) to the RDMA network card of the peer physical machine; in this step, the RDMA network card of the source physical machine uses its congestion control unit to stop sending subsequent data messages of the current data transmission to the RDMA network card of the peer physical machine;
step S12, after the RDMA network card of the peer physical machine receives the first extended CNP message, a congestion control unit in that network card reduces the data transmission rate to zero, and the network card returns a first extended CNP response message (such as a CNP_ack message) to the source physical machine;
step S13, after the RDMA network card of the source physical machine receives the first extended CNP response message, the traffic of the RDMA network card of the source physical machine is suspended, the table entry data cached in the RDMA network card is kept stable, and the cached table entry data is synchronized to the memory area of the source physical machine;
step S14, the source physical machine copies the memory page table corresponding to the virtual machine to the memory area of the target physical machine.
Wherein the method further comprises the steps of:
step S15, after the source physical machine copies all the memory page tables of the virtual machines to the memory area of the target physical machine, closing the corresponding virtual machine of the source physical machine and starting the corresponding virtual machine of the target physical machine;
step S16, the RDMA network card of the target physical machine sends a second extended CNP message (such as a CNP_Resume message) to the RDMA network card of the peer physical machine;
step S17, after the RDMA network card of the peer physical machine receives the second extended CNP message, its congestion control unit restores the sending rate of the network card, and the network card restarts the current data transmission and sends subsequent data messages to the RDMA network card of the target physical machine.
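Steps S10 to S17 can be walked through with a small in-process simulation. This is a sketch under stated assumptions, not the patented implementation: the `Nic` class, the `live_migrate` function, the dictionaries standing in for memory areas, and the rate units are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Nic:
    name: str
    rate: float = 100.0                          # current send rate (arbitrary units)
    saved_rate: float = 100.0
    cache: dict = field(default_factory=dict)    # cached table entries, e.g. QPC
    log: list = field(default_factory=list)

    def on_cnp_pause(self):                      # S12: drop the rate to zero, then ack
        self.saved_rate, self.rate = self.rate, 0.0
        self.log.append("rx CNP_pause")
        return "CNP_ack"

    def on_cnp_resume(self):                     # S17: restore the send rate
        self.rate = self.saved_rate
        self.log.append("rx CNP_Resume")


def live_migrate(source, peer, target, source_mem, target_mem):
    trace = ["S10 host->nic: live migration sync command"]
    source.rate = 0.0                            # S11: stop sending new data
    trace.append("S11 source->peer: CNP_pause")
    ack = peer.on_cnp_pause()
    trace.append(f"S12 peer->source: {ack}")
    source_mem.update(source.cache)              # S13: sync cached entries to memory
    trace.append("S13 source NIC cache synced to memory")
    target_mem.update(source_mem)                # S14: copy memory to the target
    trace.append("S14 memory copied to target")
    trace.append("S15 source VM closed, target VM started")
    trace.append(f"S16 {target.name}->peer: CNP_Resume")
    peer.on_cnp_resume()
    trace.append("S17 peer resumes sending to target")
    return trace


src = Nic("source", cache={"QPC[7]": "psn=42"})
peer = Nic("peer")
dst = Nic("target")
src_mem, dst_mem = {"page0": "data"}, {}
trace = live_migrate(src, peer, dst, src_mem, dst_mem)
```

The simulation keeps the ordering constraint of the method: the cache is synchronized only after the peer has acknowledged the pause, so no new table entry updates can arrive between S13 and S14.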
It can be understood that the main innovation of the embodiment of the present invention is to extend the function of the CNP message by using its reserved field, so that the source physical machine and the target physical machine can notify the RDMA network card of the peer physical machine, through CNP messages, to suspend and to resume its data message transmission flow respectively, thereby improving RDMA live migration efficiency.
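One way to picture the use of the reserved field is to encode the message type in its first byte. The layout below is purely an assumption for illustration: the patent does not state which reserved bits are used, and the 16-byte length, the `CnpExt` codes, and the helper names are hypothetical.

```python
import struct
from enum import IntEnum


class CnpExt(IntEnum):
    NORMAL = 0   # all-zero reserved field: ordinary congestion notification
    PAUSE = 1    # first extended CNP message (CNP_pause)
    ACK = 2      # first extended CNP response message (CNP_ack)
    RESUME = 3   # second extended CNP message (CNP_Resume)


RESERVED_LEN = 16  # assumed size of the reserved field, in bytes


def pack_reserved(ext: CnpExt) -> bytes:
    """Put the extension code in byte 0 and zero-fill the rest."""
    return struct.pack(f"!B{RESERVED_LEN - 1}x", ext)


def unpack_reserved(payload: bytes) -> CnpExt:
    if len(payload) != RESERVED_LEN:
        raise ValueError("unexpected reserved-field length")
    (code,) = struct.unpack_from("!B", payload)
    return CnpExt(code)


pause_field = pack_reserved(CnpExt.PAUSE)
decoded = unpack_reserved(pause_field)
```

Under this assumed encoding an ordinary CNP message keeps an all-zero reserved field, so a receiver that does not know the extension would see nothing unusual in it.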
In a common application scenario, when congestion in the network is detected through a data message, the network card that receives the data message generates a CNP message and notifies the network card that sent the data message to reduce its sending rate. When the sending network card receives the CNP message, it reduces the sending rate using the quantization algorithm specified by DCQCN. As shown in the following drawing, suppose the network card of the peer physical machine sends a data message to the network card of the source physical machine and congestion is detected on the network; the network card of the source physical machine then generates a CNP message and sends it to the network card of the peer physical machine, which reduces its data message sending rate after receiving the CNP message.
To distinguish it from the extended CNP message of the embodiment of the present invention, the CNP message of the common application scenario is referred to as CNP.normal, and the extended CNP message as CNP.extend. CNP.normal is triggered by receiving a data message carrying a congestion mark in the normal application scenario. CNP.extend is actively generated by the network card of the source physical machine and does not depend on a received data message. When the network card of the peer physical machine receives CNP.normal, it reduces the sending rate using the quantization algorithm specified by DCQCN, but it can still send data messages to the network card of the source physical machine until the rate has dropped to zero. In the embodiment of the present invention, when the network card of the peer physical machine receives CNP.extend, the sending rate is immediately reduced to zero (or restored to the initial sending rate), so that the network card of the peer physical machine sends no new data messages after receiving a CNP.extend that requests a pause.
In principle, CNP.normal and CNP.extend use the same mechanism: both control the data message sending rate through the quantization algorithm of DCQCN, and CNP.extend can be supported merely by modifying that algorithm. The generation of CNP.extend can also reuse the path of CNP.normal. Therefore, the method does not affect the data message processing flow of the network cards at the two ends of the RDMA communication, does not require the upper-layer application software at either end to be aware of it, and has very little impact on existing RDMA network cards.
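The difference between CNP.normal and CNP.extend at the congestion control unit can be sketched as follows. The class and method names are hypothetical, and the CNP.normal handler uses a simplified form of the commonly cited DCQCN rate decrease (timers and rate recovery are omitted), so it should be read as an assumption rather than as the algorithm of any specific network card.

```python
class CongestionControlUnit:
    """Toy sender-side rate controller; g and the rates are illustrative."""

    def __init__(self, line_rate=100.0, g=1 / 256):
        self.line_rate = line_rate
        self.rate = line_rate             # current sending rate
        self.alpha = 1.0                  # congestion estimate
        self.g = g

    def on_cnp_normal(self):
        # Simplified DCQCN-style decrease: cut the rate, then update alpha.
        self.rate *= 1 - self.alpha / 2
        self.alpha = (1 - self.g) * self.alpha + self.g

    def on_cnp_pause(self):
        self.rate = 0.0                   # CNP.extend (pause): stop at once

    def on_cnp_resume(self):
        self.rate = self.line_rate        # CNP.extend (resume): initial rate


ccu = CongestionControlUnit()
for _ in range(3):
    ccu.on_cnp_normal()
rate_after_normal = ccu.rate              # reduced, but still above zero
ccu.on_cnp_pause()
rate_after_pause = ccu.rate
ccu.on_cnp_resume()
rate_after_resume = ccu.rate
```

Repeated CNP.normal messages only shrink the rate step by step, whereas the two extended handlers jump directly to zero or back to the initial rate, which is the behaviour the description attributes to CNP.extend.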
Referring back to fig. 2, another aspect of the embodiment of the present invention further provides a system for RDMA virtual machine live migration, where the system includes at least multiple physical machines, where the multiple physical machines include at least a source physical machine, a peer physical machine, and a target physical machine, where:
the source physical machine is used for acquiring a migration request for migrating the memory data of a migrated virtual machine in the source physical machine and controlling the Host of the source physical machine to issue a virtual machine live migration synchronization command to the RDMA network card; controlling the RDMA network card to stop sending new data messages to the RDMA network card of the peer physical machine and, at the same time, to send a first extended CNP message to the RDMA network card of the peer physical machine; after receiving a first extended CNP response message from the peer physical machine, controlling its RDMA network card to keep the table entry data cached in the RDMA network card stable and to synchronize the cached table entry data to the memory area of the source physical machine; and copying the memory page table corresponding to the virtual machine to the memory area of the target physical machine; the source physical machine is further used for closing the corresponding virtual machine after the memory page table of the virtual machine has been completely copied to the memory area of the target physical machine.
More specifically, the source physical machine comprises a hardware layer, a Host runs on the hardware layer, at least one Virtual Machine (VM) runs on the Host, the hardware layer further comprises an RDMA network card and a memory area, and the RDMA network card comprises a congestion control unit and a sending unit; in a specific example, the above functions of the source physical machine are implemented by a Host therein.
Specifically, in one example, the Host in the source physical machine is configured to:
acquiring a migration request for migrating the memory data of a migrated virtual machine in a source physical machine, and issuing a virtual machine live migration synchronization command to an RDMA network card;
controlling the RDMA network card of the source physical machine to stop sending new data messages to the RDMA network card of the peer physical machine and, at the same time, to send a first extended CNP message to the RDMA network card of the peer physical machine;
after receiving a first extended CNP response message from the peer physical machine, controlling the RDMA network card of the source physical machine to keep the table entry data cached in the RDMA network card stable and to synchronize the cached table entry data to the memory area of the source physical machine;
controlling the source physical machine to copy a memory page table of the virtual machine to a memory area of the target physical machine;
and, after the copying is finished, shutting down the corresponding virtual machine on the source physical machine.
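The Host-side sequence listed above can be sketched as follows. This is a minimal illustration only, not the implementation of the embodiment: the `SourceNic` class, its method names, and the dictionaries standing in for memory areas are all hypothetical, and the CNP exchange is simulated in-process rather than sent over a network.

```python
from dataclasses import dataclass, field


@dataclass
class SourceNic:
    """Hypothetical stand-in for the RDMA network card of the source physical machine."""
    sending: bool = True
    cached_entries: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def stop_new_sends(self) -> None:
        self.sending = False
        self.log.append("stop_new_sends")

    def send_first_extended_cnp(self) -> None:
        self.log.append("send_first_extended_cnp")

    def wait_first_extended_cnp_response(self) -> bool:
        # Simulated: a real card would wait until the peer's response arrives.
        self.log.append("recv_first_extended_cnp_response")
        return True

    def sync_cache_to(self, memory: dict) -> None:
        # With the peer paused, the cached entries no longer change,
        # so a single synchronization is enough.
        memory.update(self.cached_entries)
        self.log.append("sync_cache")


def migrate_from_source(nic: SourceNic, source_memory: dict, target_memory: dict) -> str:
    """Run the source-side steps in the order the description lists them."""
    nic.stop_new_sends()
    nic.send_first_extended_cnp()
    if not nic.wait_first_extended_cnp_response():
        raise RuntimeError("peer did not acknowledge the pause request")
    nic.sync_cache_to(source_memory)
    target_memory.update(source_memory)  # copy the VM's memory to the target
    return "vm_closed"                   # the source VM is shut down last


nic = SourceNic(cached_entries={"qp_7_context": "psn=1042"})
source_mem = {"page_0": b"guest data"}
target_mem: dict = {}
state = migrate_from_source(nic, source_mem, target_mem)
```

The point of the ordering is that the cache synchronization happens only after the response message confirms that the peer has stopped sending, so nothing can modify the cached entries between the synchronization and the copy.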
The opposite-end physical machine is configured to: after its RDMA network card receives the first extended CNP message from the source physical machine, control the congestion control unit in the RDMA network card to reduce the data transmission rate to zero; and, after receiving the second extended CNP message from the target physical machine, control the congestion control unit of the RDMA network card to restore the sending rate of the RDMA network card of the opposite-end physical machine, restart the current data transmission, and send subsequent data messages to the RDMA network card of the target physical machine.
More specifically, the peer physical machine comprises a hardware layer, a Host runs on the hardware layer, at least one Virtual Machine (VM) runs on the Host, the hardware layer further comprises an RDMA network card and a memory area, and the RDMA network card comprises a congestion control unit and a sending unit; in a specific example, the above functions of the peer physical machine may be implemented automatically by the RDMA network card therein.
Specifically, in one example, the RDMA network card in the peer physical machine is configured to:
after receiving a first extended CNP message from a source physical machine, controlling a congestion control unit in the RDMA network card to reduce the data transmission rate to zero;
and, after receiving the second extended CNP message from the target physical machine, controlling the congestion control unit of the RDMA network card to restore the sending rate of the RDMA network card of the opposite-end physical machine, and sending subsequent data messages to the RDMA network card of the target physical machine.
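The two reactions of the peer's congestion control unit can be pictured as a small state machine. The sketch below is illustrative only: the class name, the string labels for the message kinds, and the choice to restore exactly the rate that was in effect before the pause are assumptions, not details stated in the description.

```python
from typing import Optional

# Hypothetical labels for the extended CNP message kinds.
PAUSE = "first_extended_cnp"
PAUSE_ACK = "first_extended_cnp_response"
RESUME = "second_extended_cnp"


class PeerCongestionControl:
    """Toy model of the congestion control unit in the peer's RDMA network card."""

    def __init__(self, rate_gbps: float, destination: str) -> None:
        self.rate_gbps = rate_gbps
        self.destination = destination
        self._saved_rate: Optional[float] = None

    def on_extended_cnp(self, kind: str, sender: str) -> Optional[str]:
        if kind == PAUSE:
            # Remember the current rate once, then stop sending entirely.
            if self._saved_rate is None:
                self._saved_rate = self.rate_gbps
            self.rate_gbps = 0.0
            return PAUSE_ACK
        if kind == RESUME and self._saved_rate is not None:
            # Restore the rate and redirect subsequent data to the sender
            # of the resume message (the target physical machine).
            self.rate_gbps = self._saved_rate
            self._saved_rate = None
            self.destination = sender
        return None


cc = PeerCongestionControl(rate_gbps=100.0, destination="source")
ack = cc.on_extended_cnp(PAUSE, sender="source")
paused_rate = cc.rate_gbps
cc.on_extended_cnp(RESUME, sender="target")
```

Because the pause is expressed as a rate of zero rather than a torn-down connection, the queue pair state on the peer stays intact and transmission can continue toward the target once the rate is restored.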
The target physical machine is configured to: start the corresponding virtual machine on the target physical machine after the memory page table copied by the source physical machine has been received in the memory area of the target physical machine; control the RDMA network card of the target physical machine to send a second extended CNP message to the RDMA network card of the opposite-end physical machine; and control data transmission between the target physical machine and the peer physical machine.
More specifically, the target physical machine comprises a hardware layer, a Host runs on the hardware layer, at least one Virtual Machine (VM) runs on the Host, the hardware layer further comprises an RDMA network card and a memory area, and the RDMA network card comprises a congestion control unit and a sending unit; in a specific example, the above-mentioned functions of the target physical machine are implemented by a Host therein.
In one example, the Host in the target physical machine is configured to:
after a memory page table copied by a source physical machine is received in a memory area of a target physical machine, starting a virtual machine of the target physical machine;
controlling the RDMA network card of the target physical machine to send a second extended CNP message to the RDMA network card of the opposite-end physical machine;
and controlling data transmission between the target physical machine and the opposite-end physical machine.
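The target-side steps can be sketched in the same spirit. This is a hedged illustration: the function name, the `expected_page_count` parameter used to decide that the copy is complete, and the `send_cnp` callback are all hypothetical, since the description does not say how the target detects completion.

```python
def bring_up_on_target(received_pages: dict, expected_page_count: int, send_cnp) -> list:
    """Return the ordered actions the target Host takes once the copy is complete."""
    if len(received_pages) < expected_page_count:
        return []                       # copy not finished; do nothing yet
    actions = ["start_vm"]
    send_cnp("second_extended_cnp")     # asks the peer to restore its sending rate
    actions.append("sent_second_extended_cnp")
    actions.append("data_transfer_with_peer")
    return actions


sent: list = []
early = bring_up_on_target({"p0": b"a"}, 2, sent.append)
done = bring_up_on_target({"p0": b"a", "p1": b"b"}, 2, sent.append)
```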
For more details, reference may be made to the foregoing description of Fig. 1, which is not repeated here.
The embodiment of the invention has the following beneficial effects:
The embodiments of the invention provide a method and a system for live migration of an RDMA virtual machine, and corresponding physical machines. The definition of the CNP message is extended by using a reserved field in the CNP message. When a virtual machine on the source physical machine starts live migration, a first extended CNP message notifies the RDMA network card of the opposite-end physical machine communicating with the migrated virtual machine to reduce its sending rate to zero by means of its congestion control unit, thereby pausing the sending of new data messages by the RDMA network card of the opposite-end physical machine. This improves the efficiency of live migration, shortens its duration, and raises its success rate.
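One way to picture the use of a reserved field is shown below. The layout is an assumption made purely for illustration: the description does not say which reserved bits carry the extension, so the 16-byte reserved area, the choice of its first byte, and the code values `EXT_PAUSE`, `EXT_PAUSE_ACK`, and `EXT_RESUME` are all hypothetical.

```python
RESERVED_LEN = 16  # assumed size of the CNP reserved area, in bytes

# Hypothetical extension codes; 0 keeps the meaning of an ordinary CNP.
EXT_NONE, EXT_PAUSE, EXT_PAUSE_ACK, EXT_RESUME = 0, 1, 2, 3


def build_reserved_field(ext_code: int) -> bytes:
    """Place the extension code in the first reserved byte; leave the rest zero."""
    if not 0 <= ext_code <= 0xFF:
        raise ValueError("extension code must fit in one byte")
    return bytes([ext_code]) + bytes(RESERVED_LEN - 1)


def parse_reserved_field(reserved: bytes) -> int:
    """Recover the extension code from a received reserved area."""
    if len(reserved) != RESERVED_LEN:
        raise ValueError("unexpected reserved-field length")
    return reserved[0]


pause = build_reserved_field(EXT_PAUSE)
ordinary = build_reserved_field(EXT_NONE)
```

Under this assumed layout an ordinary CNP still carries an all-zero reserved area, so a network card that does not understand the extension would see a well-formed congestion notification.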
During live migration, network traffic between the source physical machine and the opposite-end physical machine can be suspended, so the RDMA network card of the source physical machine no longer generates new dirty pages. The source physical machine therefore needs to synchronize data between the RDMA hardware cache and system memory only once before performing the final round of copying; repeated rounds of synchronization and copying are not needed, which greatly improves migration efficiency.
Meanwhile, when the final round of copying is finished, the virtual machine on the target physical machine is started, and the RDMA network card of the target physical machine sends a second extended CNP message to the opposite-end physical machine that was communicating with the migrated virtual machine of the source physical machine. This message notifies the RDMA network card of the opposite-end physical machine to restore its data sending rate by means of its congestion control unit and to restart the sending of data packets, which prevents the risk of link disconnection present in the prior art.
In summary, the embodiments of the invention add a control information channel between the two ends of an RDMA communication link. When a virtual machine on the source physical machine starts live migration, the extended CNP message notifies the opposite end communicating with the migrated virtual machine of the state of that virtual machine and instructs it to pause sending new data packets, thereby improving the efficiency of live migration, shortening its duration, and raising its success rate.
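The effect of pausing inbound traffic on the number of copy rounds can be illustrated with a toy model. It is a simplification under stated assumptions, not a measurement: `redirty_fraction` (the share of just-copied pages that incoming RDMA writes dirty again during a round), `stop_threshold`, and `max_rounds` are hypothetical parameters.

```python
def precopy_rounds(total_pages: int, redirty_fraction: float,
                   stop_threshold: int, max_rounds: int = 30):
    """Count copy rounds until the remaining dirty set is small enough to finish."""
    remaining = total_pages
    rounds = 0
    while rounds < max_rounds:
        rounds += 1
        # This round copies `remaining` pages; meanwhile a fraction of them
        # is dirtied again and must be copied in the next round.
        remaining = int(remaining * redirty_fraction)
        if remaining <= stop_threshold:
            break
    return rounds, remaining


with_traffic = precopy_rounds(1_000_000, 0.5, 1_000)  # peer keeps writing
paused = precopy_rounds(1_000_000, 0.0, 1_000)        # peer rate reduced to zero
```

In this model, traffic that re-dirties half of the copied pages each round needs ten rounds to fall below the threshold, whereas a paused peer leaves nothing dirty after the first round.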
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While the invention has been described in connection with what is presently considered to be the most practical and preferred embodiment, it is to be understood that the invention is not to be limited to the disclosed embodiment, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (8)

1. A method of RDMA virtual machine live migration, comprising at least the steps of:
acquiring a migration request for migrating the memory data of a migrated virtual machine in a source physical machine, and issuing, by a Host of the source physical machine, a virtual machine live migration synchronization command to an RDMA (Remote Direct Memory Access) network card;
the RDMA network card of the source physical machine stops sending subsequent data messages of current data transmission to the RDMA network card of an opposite-end physical machine which is communicated with the migrated virtual machine, and simultaneously sends a first extended CNP message to the RDMA network card of the opposite-end physical machine;
after the RDMA network card of the opposite-end physical machine receives the first extended CNP message, a congestion control unit in the RDMA network card of the opposite-end physical machine reduces the data transmission rate to zero;
the opposite-end physical machine returns a first extended CNP response message to the source physical machine;
after receiving the first extended CNP response message, the RDMA network card of the source physical machine keeps the table entry data cached in the RDMA network card stable and synchronizes the cached table entry data to the memory area of the source physical machine;
and the source physical machine copies the memory page table corresponding to the virtual machine to the memory area of the target physical machine.
2. The method of claim 1, further comprising:
after the source physical machine copies all the memory page tables of the virtual machines to the memory area of the target physical machine, closing the corresponding virtual machine of the source physical machine and starting the corresponding virtual machine of the target physical machine;
the RDMA network card of the target physical machine sends a second extended CNP message to the RDMA network card of the opposite-end physical machine;
after the RDMA network card of the opposite-end physical machine receives the second extended CNP message, the congestion control unit of the RDMA network card recovers the sending rate of the RDMA network card of the opposite-end physical machine and sends subsequent data messages to the RDMA network card of the target physical machine.
3. A system for RDMA virtual machine live migration comprising at least a source physical machine, a peer physical machine, and a target physical machine, wherein:
the source physical machine is used for acquiring a migration request for migrating the memory data of the migrated virtual machine in the source physical machine and controlling a Host to issue a virtual machine live migration synchronization command to the RDMA network card; controlling the RDMA network card to stop sending new data messages to the RDMA network card of the opposite-end physical machine, and simultaneously sending a first extended CNP message to the RDMA network card of the opposite-end physical machine; after receiving a first extended CNP response message from the opposite-end physical machine, controlling its RDMA network card to keep the table entry data cached in the RDMA network card stable, and synchronizing the cached table entry data to a memory area of the source physical machine; and copying a memory page table corresponding to the virtual machine to a memory area of the target physical machine;
the opposite-end physical machine is used for reducing the data transmission rate to zero by means of a congestion control unit in its RDMA network card after the RDMA network card receives the first extended CNP message from the source physical machine; and for restoring, by means of the congestion control unit of the RDMA network card, the sending rate of the RDMA network card of the opposite-end physical machine after receiving the second extended CNP message from the target physical machine, and sending subsequent data messages to the RDMA network card of the target physical machine;
the target physical machine is used for starting a corresponding virtual machine in the target physical machine after receiving the memory page table copied by the source physical machine in the memory area of the target physical machine; controlling the RDMA network card of the target physical machine to send a second extended CNP message to the RDMA network card of the opposite-end physical machine; and controlling data transfer between the target physical machine and the peer physical machine.
4. The system of claim 3, further comprising:
the source physical machine is further used for closing the corresponding virtual machine after the memory page table of the virtual machine is completely copied to the memory area of the target physical machine.
5. A source physical machine, comprising a hardware layer, wherein a Host runs on the hardware layer, at least one virtual machine runs on the Host, the hardware layer further comprises an RDMA network card and a memory area, and the RDMA network card at least comprises a congestion control unit;
wherein the Host is configured to:
acquiring a migration request for migrating the memory data of a migrated virtual machine in a source physical machine, and issuing a virtual machine live migration synchronization command to an RDMA network card;
controlling the RDMA network card of the source physical machine to stop sending new data messages to the RDMA network card of the opposite-end physical machine and, at the same time, to send a first extended CNP message to the RDMA network card of the opposite-end physical machine;
after receiving a first extended CNP response message from an opposite-end physical machine, controlling an RDMA network card of a source physical machine to keep the table entry data cached in the RDMA network card stable, and synchronizing the cached table entry data to a memory area of the source physical machine;
controlling the source physical machine to copy a memory page table of the virtual machine to a memory area of the target physical machine;
and, after the copying is finished, shutting down the corresponding virtual machine on the source physical machine.
6. A peer physical machine, comprising a hardware layer, on which a Host runs, and at least one virtual machine runs on the Host, wherein the hardware layer further comprises an RDMA network card and a memory area, and the RDMA network card at least contains a congestion control unit;
wherein the RDMA network card is configured to:
after receiving a first extended CNP message from a source physical machine, controlling a congestion control unit in the RDMA network card to reduce the data transmission rate to zero;
and after receiving the second extended CNP message from the target physical machine, controlling a congestion control unit of the RDMA network card to recover the sending rate of the RDMA network card of the opposite-end physical machine, and sending a subsequent data message to the RDMA network card of the target physical machine.
7. A target physical machine, comprising a hardware layer, a Host running on the hardware layer, at least one virtual machine running on the Host, the hardware layer further comprising an RDMA network card and a memory area;
wherein the Host is configured to:
after a memory page table copied by a source physical machine is received in a memory area of a target physical machine, starting a virtual machine of the target physical machine;
controlling the RDMA network card of the target physical machine to send a second extended CNP message to the RDMA network card of the opposite-end physical machine;
and controlling data transmission between the target physical machine and the opposite-end physical machine.
8. A system for RDMA virtual machine live migration, comprising: a plurality of physical machines, the plurality of physical machines comprising at least:
the source physical machine of claim 5;
the peer physical machine of claim 6; and
the target physical machine of claim 7.
CN202211282764.5A 2022-10-19 2022-10-19 Method and system for hot migration of RDMA virtual machine and corresponding physical machine Pending CN115629840A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211282764.5A CN115629840A (en) 2022-10-19 2022-10-19 Method and system for hot migration of RDMA virtual machine and corresponding physical machine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211282764.5A CN115629840A (en) 2022-10-19 2022-10-19 Method and system for hot migration of RDMA virtual machine and corresponding physical machine

Publications (1)

Publication Number Publication Date
CN115629840A true CN115629840A (en) 2023-01-20

Family

ID=84907326

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211282764.5A Pending CN115629840A (en) 2022-10-19 2022-10-19 Method and system for hot migration of RDMA virtual machine and corresponding physical machine

Country Status (1)

Country Link
CN (1) CN115629840A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116303173A (en) * 2023-05-19 2023-06-23 深圳云豹智能有限公司 Method, device and system for reducing RDMA engine on-chip cache and chip
CN116303173B (en) * 2023-05-19 2023-08-08 深圳云豹智能有限公司 Method, device and system for reducing RDMA engine on-chip cache and chip

Similar Documents

Publication Publication Date Title
US7318133B2 (en) Method and apparatus for replicating volumes
US6061714A (en) Persistent cache synchronization and start up system
US7082506B2 (en) Remote copy control method, storage sub-system with the method, and large area data storage system using them
US9401958B2 (en) Method, apparatus, and system for migrating user service
US7779291B2 (en) Four site triangular asynchronous replication
US9576040B1 (en) N-site asynchronous replication
US8862843B2 (en) Storage system, backup storage apparatus, and backup control method
WO2013178082A1 (en) Image uploading method, system, client terminal, network server and computer storage medium
CN106469085B (en) The online migration method, apparatus and system of virtual machine
CN110023912B (en) Asynchronous local and remote generation of consistent point-in-time snapshot copies
US6061807A (en) Methods systems and computer products for error recovery of endpoint nodes
WO2017124917A1 (en) Data processing method and apparatus
US7752404B2 (en) Toggling between concurrent and cascaded triangular asynchronous replication
US7213114B2 (en) Remote copy for a storage controller in a heterogeneous environment
US7734884B1 (en) Simultaneous concurrent and cascaded triangular asynchronous replication
CN115629840A (en) Method and system for hot migration of RDMA virtual machine and corresponding physical machine
US7533289B1 (en) System, method, and computer program product for performing live cloning
CN109450676B (en) Switch upgrading method and device, electronic equipment and computer readable medium
WO2018049567A1 (en) Application migration method, device, and system
US7680997B1 (en) Data recovery simulation
CN114443364A (en) Distributed block storage data processing method, device, equipment and storage medium
US20080270832A1 (en) Efficiently re-starting and recovering synchronization operations between a client and server
CN115437750A (en) Method and system for thermal migration of Remote Direct Memory Access (RDMA) virtual machine and corresponding physical machine
CN103888283A (en) SCTP communication method and device
CN109992447B (en) Data copying method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination