CN112380068A - Virtual machine fault-tolerant system and fault-tolerant method thereof

Info

Publication number
CN112380068A
Authority
CN
China
Prior art keywords
virtual machine
network
standby
network card
main
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011415534.2A
Other languages
Chinese (zh)
Inventor
藏洪永
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Haiguang Information Technology Co Ltd
Original Assignee
Haiguang Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Haiguang Information Technology Co Ltd
Priority to CN202011415534.2A
Publication of CN112380068A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/4557 Distribution of virtual machine instances; Migration and load balancing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45575 Starting, stopping, suspending or resuming virtual machine instances

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention provides a virtual machine fault-tolerant system and a fault-tolerant method thereof. A primary virtual machine monitor is installed on the primary physical machine, and a primary virtual machine is installed on the primary virtual machine monitor. A standby virtual machine runs on the standby physical machine. The primary physical machine is provided with a network card, and a network packet management device that issues network packet processing rules to the network card is deployed on the primary virtual machine monitor. The network packet processing rules include: instructing the network card to forward to the standby virtual machine the network request packets that a client sends to the primary virtual machine. Because the network card, rather than software, forwards these request packets to the standby virtual machine, the CPU overhead introduced by software processing is reduced. The scheme is also compatible with the efficient kernel-mode Vhost-net network back-end driver, which solves the problem that this driver cannot be used.

Description

Virtual machine fault-tolerant system and fault-tolerant method thereof
Technical Field
The invention relates to the technical field of virtual machines, in particular to a virtual machine fault-tolerant system and a fault-tolerant method thereof.
Background
With the development of cloud computing, virtualization technology has become widely used, and virtual machine fault-tolerance technology can provide reliability guarantees for critical applications. In early virtual machine fault-tolerance technologies (such as MicroCheckpointing and Kemari), the standby virtual machine always remains paused while the state changes of the primary virtual machine are synchronized to it continuously at high frequency; if the primary virtual machine fails, the standby virtual machine is activated and takes over. A checkpoint (a mechanism for synchronizing state data) is performed periodically and at high frequency between the primary and standby virtual machines, and each checkpoint requires pausing the source (primary) virtual machine, so the fault-tolerance overhead is large.
To address the large overhead of early virtual machine fault tolerance, Intel proposed COarse-grained LOck-stepping (COLO for short). In this method, both the primary and standby virtual machines are running. A network request that an external client sends to the primary virtual machine is simultaneously forwarded to the standby virtual machine for processing, and whether the states of the two virtual machines need to be synchronized is determined by comparing their network responses to the same request. If the response packets that the primary and standby virtual machines generate for the client's request are the same, no checkpoint is needed; otherwise, the states of the primary and standby virtual machines are synchronized immediately. This reduces the checkpoint frequency and the fault-tolerance overhead. The method is already in commercial use in cloud computing products (such as ZStack), and VMware also adopts a fault-tolerance technology similar to COLO.
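The COLO decision described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the `"release"`/`"checkpoint"` labels are hypothetical, and packets are modelled as plain byte strings rather than real network frames.

```python
def needs_checkpoint(primary_response: bytes, standby_response: bytes) -> bool:
    """Return True when the two virtual machines answered the same request differently."""
    return primary_response != standby_response


def handle_responses(primary_response: bytes, standby_response: bytes) -> str:
    # Identical responses: the client cannot observe any divergence, so no sync is needed.
    if not needs_checkpoint(primary_response, standby_response):
        return "release"
    # Divergent responses: the states must be synchronized immediately.
    return "checkpoint"
```

The saving comes from the first branch: as long as the responses match, no checkpoint (and no pause of the primary virtual machine) is triggered.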
In the COLO virtual machine fault-tolerance technology, after the primary virtual machine receives a client network request packet, the user-space Qemu process (Qemu is an open-source machine emulator and virtualizer written by Fabrice Bellard and distributed under the GPL license) intercepts the packet through the Virtio-net network back-end driver and sends it to the standby virtual machine through proxy software or a proxy server. After the standby virtual machine generates a network response packet, that packet is likewise sent to the primary virtual machine through the user-space Qemu proxy. The user-space Qemu process on the primary side receives the network response packet sent by the standby virtual machine, and the proxy compares the network response packets generated by the primary and standby virtual machines to determine whether a checkpoint is needed to synchronize state. In this scheme, receiving, sending, and comparing network packets are all handled by user-mode Qemu software. On the one hand, this software processing introduces CPU (central processing unit) overhead; on the other hand, the virtual machine network can only use the user-mode Virtio-net network back-end driver and cannot use the more efficient kernel-mode Vhost-net network back-end driver.
Disclosure of Invention
The invention provides a virtual machine fault-tolerant system and a fault-tolerant method that reduce the CPU overhead introduced by software processing and are compatible with the efficient kernel-mode Vhost-net network back-end driver, thereby solving the problem that the Vhost-net kernel-mode network back-end driver cannot be used.
In a first aspect, the present invention provides a virtual machine fault-tolerant system, which includes a primary physical machine and a standby physical machine. A primary virtual machine monitor is installed on the primary physical machine, and a primary virtual machine is installed on the primary virtual machine monitor. A standby virtual machine runs on the standby physical machine. The primary physical machine is provided with a network card, and a network packet management device for issuing network packet processing rules to the network card is deployed on the primary virtual machine monitor. The network packet processing rules include: instructing the network card to forward to the standby virtual machine the network request packets that the client sends to the primary virtual machine.
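As a rough illustration of the arrangement in the first aspect, the following Python sketch models a network packet management device that issues a forwarding rule to a network card. All class, field, and address names (`ForwardRule`, `match_dst`, `mirror_to`, `"primary-vm"`, `"standby-vm"`) are hypothetical; the patent does not specify a rule format, and a real network card would apply such a rule in hardware rather than in Python.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ForwardRule:
    """Hypothetical rule: copy requests addressed to `match_dst` to `mirror_to`."""
    match_dst: str
    mirror_to: str


@dataclass
class NetworkCard:
    """Toy stand-in for the network card."""
    rules: list = field(default_factory=list)

    def install(self, rule: ForwardRule) -> None:
        self.rules.append(rule)

    def receive(self, dst: str, payload: bytes) -> list:
        # The packet always goes to its original destination...
        deliveries = [(dst, payload)]
        # ...and a copy goes to the mirror target of every matching rule.
        for rule in self.rules:
            if rule.match_dst == dst:
                deliveries.append((rule.mirror_to, payload))
        return deliveries


class PacketManagementDevice:
    """Runs on the primary virtual machine monitor and issues rules to the card."""

    def __init__(self, card: NetworkCard) -> None:
        self.card = card

    def issue_forward_rule(self, primary: str, standby: str) -> None:
        self.card.install(ForwardRule(match_dst=primary, mirror_to=standby))


card = NetworkCard()
PacketManagementDevice(card).issue_forward_rule("primary-vm", "standby-vm")
deliveries = card.receive("primary-vm", b"GET /")
```

After the rule is installed, `deliveries` holds the request once for the primary virtual machine and once for the standby virtual machine, while packets for any other destination pass through unchanged.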
In this scheme, the network card is provided on the primary physical machine, and the network packet management device that issues network packet processing rules to the network card is deployed on the primary virtual machine monitor, so that the network card forwards to the standby virtual machine the network request packets that the client sends to the primary virtual machine. In the prior art, the COLO virtual machine fault-tolerance technology receives, sends, and compares network packets in user-mode Qemu software, so the efficient kernel-mode Vhost-net network back-end driver cannot be used, and the packet-related operations in Qemu introduce extra CPU overhead. Compared with the prior art, the scheme of the present application forwards the network request packets from the client in network card hardware, which reduces the CPU overhead introduced by software processing, is compatible with the efficient kernel-mode Vhost-net network back-end driver, and solves the problem that this driver cannot be used.
In a specific embodiment, the network packet processing rules further include: instructing the network card to receive and compare the network response packets that the primary and standby virtual machines generate after processing the network request packet, so as to determine whether the two network response packets are the same. The network card thus not only forwards network request packets from the client to the standby virtual machine, but also receives and compares the network response packets generated by the primary and standby virtual machines. This further reduces the CPU overhead introduced by software processing while remaining compatible with the efficient kernel-mode Vhost-net network back-end driver.
In a specific embodiment, the network packet processing rules further include: when the network card determines that the network response packets generated by the primary and standby virtual machines are the same, instructing the network card to send the network response packet generated by the primary virtual machine to the client. This further reduces the CPU overhead introduced by software processing.
In a specific embodiment, the network packet processing rules further include: when the network response packet comparison module in the network card determines that the network response packets generated by the primary and standby virtual machines are different, instructing the network card to send the network response packet generated by the primary virtual machine to the client, and to send to the primary physical machine a signal indicating that the two network response packets differ. The network card thus reports the mismatch to the primary physical machine promptly for further processing.
In a specific embodiment, instructing the network card to send to the primary physical machine the signal indicating that the network response packets generated by the primary and standby virtual machines differ specifically means: after the network response packet generated by the primary virtual machine has been sent to the client, instructing the network card to send to the primary physical machine an interrupt signal indicating that the two network response packets differ. This delivers the signal to the primary physical machine.
In a specific embodiment, after receiving the signal, sent by the network card, indicating that the network response packets generated by the primary and standby virtual machines differ, the primary physical machine forwards the signal to the network packet management device. After receiving this signal, the network packet management device sends a state synchronization instruction to the primary and standby virtual machines, so that their states are synchronized promptly.
In a specific embodiment, a cache module is provided in the network card. The cache module caches whichever network response packet, from the primary or the standby virtual machine, arrives first, so that it can be compared with the network response packet that arrives later.
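The role of the cache module can be illustrated with a minimal Python sketch. It assumes that responses belonging to the same request can be paired by a `request_id`; this identifier, the class name `ResponseCache`, and the method `offer` are hypothetical, since the text does not say how the two responses are matched.

```python
from typing import Dict, Optional, Tuple


class ResponseCache:
    """Holds the first-arriving response for each request until its peer arrives."""

    def __init__(self) -> None:
        self._pending: Dict[int, Tuple[str, bytes]] = {}

    def offer(self, request_id: int, source: str, payload: bytes) -> Optional[bool]:
        """Return None while waiting for the peer, else True (same) or False (different)."""
        if request_id not in self._pending:
            # First response for this request: cache it and wait.
            self._pending[request_id] = (source, payload)
            return None
        first_source, first_payload = self._pending[request_id]
        if first_source == source:
            # The same virtual machine sent again; keep waiting for the other one.
            return None
        # Second response arrived: release the cache entry and compare.
        del self._pending[request_id]
        return first_payload == payload
```

Either virtual machine may respond first; the cache only cares that the two responses come from different sources.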
In a specific embodiment, the network card is a smart network card, which makes it convenient to configure.
In a second aspect, the present invention further provides a fault-tolerant method based on the above virtual machine fault-tolerant system. The fault-tolerant method includes: the client sends a network request packet to the primary virtual machine; and the network packet management device instructs the network card to forward to the standby virtual machine the network request packet that the client sends to the primary virtual machine.
In this scheme, the network request packet that the client sends to the primary virtual machine is forwarded to the standby virtual machine by the network card. In the prior art, the COLO virtual machine fault-tolerance technology receives, sends, and compares network packets in user-mode Qemu software, so the efficient kernel-mode Vhost-net network back-end driver cannot be used, and the packet-related operations in Qemu introduce extra CPU overhead. Compared with the prior art, the scheme of the present application forwards the network request packets from the client in network card hardware, which reduces the CPU overhead introduced by software processing, is compatible with the efficient kernel-mode Vhost-net network back-end driver, and solves the problem that this driver cannot be used.
In a specific embodiment, the fault-tolerant method further includes: the primary virtual machine sends to the network card the network response packet it generates after processing the network request packet; the standby virtual machine sends to the network card the network response packet it generates after processing the network request packet; the network card receives and compares the two network response packets to determine whether they are the same; if they are the same, the network card forwards the network response packet generated by the primary virtual machine to the client; if they are different, the network card forwards the network response packet generated by the primary virtual machine to the client, and the states of the primary and standby virtual machines are synchronized. The network card thus not only forwards network request packets from the client to the standby virtual machine, but also receives and compares the network response packets generated by the primary and standby virtual machines. This further reduces the CPU overhead introduced by software processing while remaining compatible with the efficient kernel-mode Vhost-net network back-end driver.
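The method of the second aspect can be summarized as a single function. The following Python sketch is an assumption-laden model, not the patented implementation: each virtual machine is represented by a hypothetical callable that maps a request to a response, and the function name `process_request` is illustrative only.

```python
from typing import Callable, Tuple

Handler = Callable[[bytes], bytes]


def process_request(request: bytes, primary: Handler, standby: Handler) -> Tuple[bytes, bool]:
    """Return (response sent to the client, whether a state sync is required)."""
    # The card delivers the request to the primary and forwards a copy to the standby.
    primary_response = primary(request)
    standby_response = standby(request)
    # The card compares the two responses.
    same = primary_response == standby_response
    # The client always receives the primary's response; a mismatch triggers a sync.
    return primary_response, not same


matching = process_request(b"ping", lambda r: b"pong", lambda r: b"pong")
diverged = process_request(b"ping", lambda r: b"pong-1", lambda r: b"pong-2")
```

In both branches the client sees only the primary virtual machine's response; the comparison result decides solely whether the states are synchronized afterwards.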
Drawings
FIG. 1 is a schematic block diagram of a fault tolerant system for virtual machines according to an embodiment of the present invention;
FIG. 2 is a flow chart of a fault tolerance method according to an embodiment of the present invention;
FIG. 3 is another flowchart of a fault tolerance method according to an embodiment of the present invention.
Reference numerals:
10-primary virtual machine monitor
11-network packet management device
20-primary virtual machine
30-standby virtual machine
40-network card
41-network request packet forwarding module
42-network response packet comparison module
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
To facilitate understanding of the virtual machine fault-tolerant system provided by the embodiments of the present invention, its application scenario is described first: the system is applied to a virtual machine system that has a primary virtual machine and a standby virtual machine. The virtual machine fault-tolerant system is described in detail below with reference to the accompanying drawings.
Referring to FIG. 1 and FIG. 2, the virtual machine fault-tolerant system provided by an embodiment of the present invention includes a primary physical machine and a standby physical machine. A primary virtual machine monitor 10 is installed on the primary physical machine, and a primary virtual machine 20 is installed on the primary virtual machine monitor 10. A standby virtual machine 30 runs on the standby physical machine. A network card 40 is provided on the primary physical machine, and a network packet management device 11 for issuing network packet processing rules to the network card 40 is deployed on the primary virtual machine monitor 10. The network packet processing rules include: instructing the network card 40 to forward to the standby virtual machine 30 the network request packets that the client sends to the primary virtual machine 20. Because the network card 40 is provided on the primary physical machine and the network packet management device 11 is deployed on the primary virtual machine monitor 10, the network card 40 forwards to the standby virtual machine 30 the network request packets that the client sends to the primary virtual machine 20. This reduces the CPU overhead introduced by software processing, is compatible with the efficient kernel-mode Vhost-net network back-end driver, and solves the problem that this driver cannot be used.
In the prior art, the COLO virtual machine fault-tolerance technology receives and sends network packets in user-mode Qemu software, so the efficient kernel-mode Vhost-net network back-end driver cannot be used, and the packet-related operations in Qemu introduce extra CPU overhead. Compared with the prior art, the scheme of the present application forwards the network request packets from the client in the hardware of the network card 40, which reduces the CPU overhead introduced by software processing, is compatible with the efficient kernel-mode Vhost-net network back-end driver, and solves the problem that this driver cannot be used.
To implement the forwarding of network request packets by the network card 40, a network request packet forwarding module 41 may be provided in the network card 40. The network packet management device 11 in the primary virtual machine monitor 10 instructs the network request packet forwarding module 41 to forward to the standby virtual machine 30 the network request packets that the client sends to the primary virtual machine 20, thereby implementing the forwarding function of the network card 40.
The network card 40 may be a smart network card, which makes it convenient to configure. Of course, the network card 40 may also be another network card that has functional modules such as a processor and a storage module. Any network card to which the network packet management device 11 in the primary virtual machine monitor 10 can issue processing rules, and which can forward network request packets, falls within the protection scope of the embodiments of the present invention.
Referring to FIG. 1 and FIG. 3, for the network response packets that the primary virtual machine 20 and the standby virtual machine 30 generate after processing the network request packet, the network packet processing rules may further include: instructing the network card 40 to receive and compare the network response packets generated by the primary virtual machine 20 and the standby virtual machine 30, so as to determine whether the two network response packets are the same. The network card 40 thus not only forwards network request packets from the client to the standby virtual machine 30, but also receives and compares the network response packets generated by the primary virtual machine 20 and the standby virtual machine 30. This further reduces the CPU overhead introduced by software processing while remaining compatible with the efficient kernel-mode Vhost-net network back-end driver.
Specifically, to have the network card 40 compare the network response packets of the primary virtual machine 20 and the standby virtual machine 30, a network response packet comparison module 42 may be provided in the network card 40. The network packet management device 11 in the primary virtual machine monitor 10 instructs the network response packet comparison module 42 to receive the network response packets generated by the primary virtual machine 20 and the standby virtual machine 30 and to compare whether they are the same, thereby implementing the comparison function of the network card 40.
In addition, a cache module may be provided in the network card 40. The cache module caches whichever network response packet, from the primary virtual machine 20 or the standby virtual machine 30, arrives first, so that it can be compared with the network response packet that arrives later.
Referring to FIG. 3, the network packet processing rules may further include: when the network card 40 determines that the network response packets generated by the primary virtual machine 20 and the standby virtual machine 30 are the same, instructing the network card 40 to send the network response packet generated by the primary virtual machine 20 to the client. The network card 40 thus not only forwards the network request packets sent by the client but also sends the network response packets to the client, so that it handles both the receiving of request packets from and the sending of response packets to the client, which further reduces the CPU overhead introduced by software processing.
In addition, referring to FIG. 3, the network packet processing rules may further include: when the network response packet comparison module 42 determines that the network response packets generated by the primary virtual machine 20 and the standby virtual machine 30 are different, instructing the network card 40 to send the network response packet generated by the primary virtual machine 20 to the client, and to send to the primary physical machine a signal indicating that the network response packets generated by the primary virtual machine 20 and the standby virtual machine 30 differ. The network card 40 thus reports the mismatch to the primary physical machine promptly for further processing.
Specifically, to have the network card 40 send this signal to the primary physical machine, the network packet management device 11 may instruct the network card 40 to send, after the network response packet generated by the primary virtual machine 20 has been sent to the client, an interrupt signal to the primary physical machine indicating that the network response packets generated by the primary virtual machine 20 and the standby virtual machine 30 differ. This delivers the signal to the primary physical machine. Of course, besides an interrupt signal, other kinds of signal may be used to indicate that the network response packets generated by the primary virtual machine 20 and the standby virtual machine 30 are inconsistent.
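The ordering described here, in which the response goes to the client first and the interrupt is raised afterwards, can be made explicit with a short Python sketch. The event names `"send_to_client"` and `"interrupt"` and the reason string `"response_mismatch"` are hypothetical labels chosen for illustration.

```python
def on_compare_result(same: bool, primary_response: bytes) -> list:
    """Return the ordered actions the card takes once the comparison is done."""
    # The primary's response is sent to the client in both cases.
    events = [("send_to_client", primary_response)]
    if not same:
        # Only after the response has gone out is the mismatch signalled to the host.
        events.append(("interrupt", "response_mismatch"))
    return events
```

Sending the response before raising the interrupt means the client is not kept waiting while the mismatch is reported to the primary physical machine.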
In addition, after receiving the signal, sent by the network card 40, indicating that the network response packets generated by the primary virtual machine 20 and the standby virtual machine 30 differ, the primary physical machine may forward the signal to the network packet management device 11. After receiving this signal, the network packet management device 11 may send a state synchronization instruction to the primary virtual machine 20 and the standby virtual machine 30, so that their states are synchronized promptly. Specifically, to synchronize the states of the primary virtual machine 20 and the standby virtual machine 30, the network packet management device 11 may send them an instruction to perform checkpoint state synchronization.
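The signal path from the network card through the primary physical machine to the network packet management device can be sketched as follows. This Python model makes several simplifying assumptions that are not in the original: a virtual machine's state is a single dictionary, a checkpoint is modelled as copying the primary's state to the standby, and all class and method names are hypothetical.

```python
class VirtualMachine:
    """Toy virtual machine whose whole state is one dictionary (an assumption)."""

    def __init__(self, name: str, state: dict) -> None:
        self.name = name
        self.state = dict(state)


class PacketManagementDevice:
    """Stand-in for the network packet management device 11."""

    def __init__(self, primary: VirtualMachine, standby: VirtualMachine) -> None:
        self.primary = primary
        self.standby = standby
        self.checkpoints = 0

    def on_mismatch_signal(self) -> None:
        # Checkpoint modelled as copying the primary's state to the standby.
        self.standby.state = dict(self.primary.state)
        self.checkpoints += 1


class PrimaryHost:
    """Stand-in for the primary physical machine that receives the card's signal."""

    def __init__(self, manager: PacketManagementDevice) -> None:
        self.manager = manager

    def on_interrupt(self, reason: str) -> None:
        # Only the mismatch signal is forwarded to the management device.
        if reason == "response_mismatch":
            self.manager.on_mismatch_signal()


primary = VirtualMachine("primary", {"counter": 2})
standby = VirtualMachine("standby", {"counter": 1})
manager = PacketManagementDevice(primary, standby)
host = PrimaryHost(manager)
host.on_interrupt("response_mismatch")
```

After the interrupt is handled, the standby's state matches the primary's, and exactly one checkpoint has been performed.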
Because the network card 40 is provided on the primary physical machine and the network packet management device 11 for issuing network packet processing rules to the network card 40 is deployed on the primary virtual machine monitor 10, the network card 40 forwards to the standby virtual machine 30 the network request packets that the client sends to the primary virtual machine 20. In the prior art, the COLO virtual machine fault-tolerance technology receives, sends, and compares network packets in user-mode Qemu software, so the efficient kernel-mode Vhost-net network back-end driver cannot be used, and the packet-related operations in Qemu introduce extra CPU overhead. Compared with the prior art, the scheme of the present application forwards the network request packets from the client in the hardware of the network card 40, which reduces the CPU overhead introduced by software processing, is compatible with the efficient kernel-mode Vhost-net network back-end driver, and solves the problem that this driver cannot be used.
In addition, referring to fig. 1 and fig. 2, an embodiment of the present invention further provides a fault tolerance method based on the above virtual machine fault tolerance system, where the fault tolerance method includes:
the client sends a network request packet to the primary virtual machine 20;
the network packet management device 11 instructs the network card 40 to forward the network request packet sent from the client to the primary virtual machine 20 to the standby virtual machine 30.
In the above scheme, the network request packet sent by the client to the primary virtual machine 20 is forwarded to the standby virtual machine 30 by the network card 40. This reduces the CPU overhead introduced by software operations and remains compatible with the efficient kernel-mode Vhost-net network back-end driver, which solves the problem that the Vhost-net kernel-mode network back-end driver cannot be used. In the prior art, the COLO virtual machine fault-tolerance technology sends, receives, and compares network data packets in Qemu software, so the efficient kernel-mode Vhost-net network back-end driver cannot be used, and the packet-related operations performed by the Qemu software introduce extra CPU overhead. By contrast, the scheme of the present application forwards the network request packet from the client in the hardware of the network card 40, which reduces the software-induced CPU overhead and allows the Vhost-net kernel-mode network back-end driver to be used.
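The two method steps above can be sketched as follows, purely for illustration. The sketch models the network card 40 in software, whereas the scheme performs the forwarding in hardware; the rule table, the function names, and the identifiers "primary-20", "standby-30", and "other-vm" are hypothetical and do not come from the application.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NetworkCard:
    """Hypothetical software model of the network card 40."""
    # Maps a destination virtual machine to the one that should also receive a copy.
    mirror_rules: Dict[str, str] = field(default_factory=dict)
    # Packets delivered to each virtual machine, recorded here for inspection.
    delivered: Dict[str, List[bytes]] = field(default_factory=dict)

    def install_mirror_rule(self, destination: str, mirror_to: str) -> None:
        self.mirror_rules[destination] = mirror_to

    def receive_from_client(self, destination: str, packet: bytes) -> None:
        # The packet always reaches its original destination.
        self.delivered.setdefault(destination, []).append(packet)
        # If a rule matches, the card itself copies the packet to the mirror target.
        mirror_to = self.mirror_rules.get(destination)
        if mirror_to is not None:
            self.delivered.setdefault(mirror_to, []).append(packet)


def issue_forwarding_rule(card: NetworkCard) -> None:
    """Stand-in for the network packet management device 11 issuing its rule."""
    card.install_mirror_rule("primary-20", "standby-30")


card = NetworkCard()
issue_forwarding_rule(card)
card.receive_from_client("primary-20", b"request-1")  # mirrored to the standby
card.receive_from_client("other-vm", b"request-2")    # no rule, not mirrored
```

The point of the sketch is that the management device acts only once, when it installs the rule; every later request is duplicated by the card without further involvement of host software.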
Referring to fig. 3, after the primary virtual machine 20 and the standby virtual machine 30 process the network request packet and generate the network response packet, the fault tolerance method further includes:
the primary virtual machine 20 sends a network response packet generated after processing the network request packet to the network card 40;
the standby virtual machine 30 sends a network response packet generated after processing the network request packet to the network card 40;
the network card 40 receives and compares the network response packets sent by the primary virtual machine 20 and the standby virtual machine 30 to determine whether the network response packets generated by the primary virtual machine 20 and the standby virtual machine 30 are the same;
if the network response packets are the same, forwarding the network response packet generated by the primary virtual machine 20 to the client;
and if not, forwarding the network response packet generated by the primary virtual machine 20 to the client, and performing state synchronization on the primary virtual machine 20 and the standby virtual machine 30.
The network card 40 can thus not only forward a network request packet from a client to the standby virtual machine 30, but also receive and compare the network response packets generated by the primary virtual machine 20 and the standby virtual machine 30. This further reduces the CPU overhead introduced by software operations while remaining compatible with the efficient kernel-mode Vhost-net network back-end driver, which solves the problem that the Vhost-net kernel-mode network back-end driver cannot be used.
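The comparison steps above can be sketched as follows, under stated assumptions. The application does not specify how a response from the primary virtual machine 20 is paired with the corresponding response from the standby virtual machine 30; the `request_id` key used here is a hypothetical pairing mechanism, and the `pending` dictionary loosely plays the role of a cache holding whichever response arrives first. All names are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ResponseComparator:
    """Hypothetical software model of the compare logic in the network card 40."""
    # Holds the first response seen for each request until its peer arrives.
    pending: Dict[int, Tuple[str, bytes]] = field(default_factory=dict)
    sent_to_client: List[bytes] = field(default_factory=list)
    sync_requests: int = 0

    def on_response(self, source: str, request_id: int, payload: bytes) -> None:
        # source is "primary" or "standby"; request_id is an assumed pairing key.
        first = self.pending.get(request_id)
        if first is None or first[0] == source:
            # Nothing to compare against yet, so buffer this response.
            self.pending[request_id] = (source, payload)
            return
        del self.pending[request_id]
        by_source = {first[0]: first[1], source: payload}
        # The client always receives the response of the primary virtual machine.
        self.sent_to_client.append(by_source["primary"])
        if by_source["primary"] != by_source["standby"]:
            # A mismatch means the two virtual machines need state synchronization.
            self.sync_requests += 1


nic = ResponseComparator()
nic.on_response("standby", 1, b"OK")  # buffered until the primary answers
nic.on_response("primary", 1, b"OK")  # same content: forward only
nic.on_response("primary", 2, b"A")
nic.on_response("standby", 2, b"B")   # different content: forward and request sync
```

Note that both branches forward the same packet; the only difference a mismatch makes is the additional request for state synchronization.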
The above describes only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto; any changes or substitutions that those skilled in the art could readily conceive within the technical scope disclosed by the present invention fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.

Claims (10)

1. A virtual machine fault tolerance system, comprising:
a primary physical machine end and a standby physical machine end;
a primary virtual machine monitor installed on the primary physical machine end, the primary virtual machine monitor having a primary virtual machine installed thereon;
a standby virtual machine running on the standby physical machine end;
wherein a network card is arranged at the primary physical machine end, and a network packet management device for issuing network packet processing rules to the network card is deployed on the primary virtual machine monitor; the network packet processing rules comprise: instructing the network card to forward a network request packet, sent by a client to the primary virtual machine, to the standby virtual machine.
2. The virtual machine fault tolerance system of claim 1, wherein the network packet processing rules further comprise: instructing the network card to receive and compare the network response packets generated by the primary virtual machine and the standby virtual machine after processing the network request packet, so as to determine whether the network response packets generated by the primary virtual machine and the standby virtual machine are the same.
3. The virtual machine fault tolerance system of claim 2, wherein the network packet processing rules further comprise:
when the network card determines that the network response packets generated by the primary virtual machine and the standby virtual machine are the same, instructing the network card to send the network response packet generated by the primary virtual machine to the client.
4. The virtual machine fault tolerance system of claim 2, wherein the network packet processing rules further comprise:
when the network card determines that the network response packets generated by the primary virtual machine and the standby virtual machine are different, instructing the network card to send the network response packet generated by the primary virtual machine to the client, and to send, to the primary physical machine end, a signal indicating that the network response packets generated by the primary virtual machine and the standby virtual machine are different.
5. The virtual machine fault tolerance system of claim 4, wherein instructing the network card to send, to the primary physical machine end, a signal indicating that the network response packets generated by the primary virtual machine and the standby virtual machine are different specifically comprises:
after the network card sends the network response packet generated by the primary virtual machine to the client, instructing the network card to send, to the primary physical machine end, an interrupt signal indicating that the network response packets generated by the primary virtual machine and the standby virtual machine are different.
6. The virtual machine fault tolerance system of claim 4, wherein, after receiving the signal sent by the network card indicating that the network response packets generated by the primary virtual machine and the standby virtual machine are different, the primary physical machine end forwards the signal to the network packet management device;
after receiving the signal indicating that the network response packets generated by the primary virtual machine and the standby virtual machine are different, the network packet management device sends a state synchronization instruction to the primary virtual machine and the standby virtual machine.
7. The virtual machine fault tolerance system of claim 2, wherein a cache module is arranged in the network card, and the cache module is configured to cache whichever of the network response packets from the primary virtual machine and the standby virtual machine is sent first.
8. The virtual machine fault tolerance system of claim 1, wherein the network card is an intelligent network card.
9. A fault-tolerant method based on the virtual machine fault-tolerant system of claim 1, comprising:
the client sends a network request packet to the primary virtual machine;
the network packet management device instructs the network card to forward the network request packet, sent by the client to the primary virtual machine, to the standby virtual machine.
10. The fault tolerant method of claim 9 further comprising:
the primary virtual machine sends a network response packet generated after the network request packet is processed to the network card;
the standby virtual machine sends a network response packet generated after the network request packet is processed to the network card;
the network card receives and compares the network response packets sent by the primary virtual machine and the standby virtual machine to determine whether the network response packets generated by the primary virtual machine and the standby virtual machine are the same;
if the network response packets are the same, forwarding the network response packet generated by the primary virtual machine to the client;
and if not, forwarding the network response packet generated by the primary virtual machine to the client, and performing state synchronization on the primary virtual machine and the standby virtual machine.
CN202011415534.2A 2020-12-04 2020-12-04 Virtual machine fault-tolerant system and fault-tolerant method thereof Pending CN112380068A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011415534.2A CN112380068A (en) 2020-12-04 2020-12-04 Virtual machine fault-tolerant system and fault-tolerant method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011415534.2A CN112380068A (en) 2020-12-04 2020-12-04 Virtual machine fault-tolerant system and fault-tolerant method thereof

Publications (1)

Publication Number Publication Date
CN112380068A true CN112380068A (en) 2021-02-19

Family

ID=74590585

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011415534.2A Pending CN112380068A (en) 2020-12-04 2020-12-04 Virtual machine fault-tolerant system and fault-tolerant method thereof

Country Status (1)

Country Link
CN (1) CN112380068A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104618155A (en) * 2015-01-23 2015-05-13 华为技术有限公司 Virtual machine fault tolerant method, device and system
CN104883302A (en) * 2015-03-18 2015-09-02 华为技术有限公司 Method, device and system for forwarding data packet
CN106250166A (en) * 2015-05-21 2016-12-21 阿里巴巴集团控股有限公司 A kind of half virtualization network interface card kernel accelerating module upgrade method and device
KR20180134219A (en) * 2017-06-08 2018-12-18 정기웅 The method for processing virtual packets and apparatus therefore
CN111241201A (en) * 2020-01-14 2020-06-05 厦门网宿有限公司 Distributed data processing method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
崔晓红等: "《局域网组建与维护》", 北京:中国计划出版社 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination