CN115202803A - Fault processing method and device - Google Patents

Fault processing method and device

Info

Publication number
CN115202803A
Authority
CN
China
Prior art keywords
host
storage node
cluster
vms
fault
Legal status
Pending
Application number
CN202110396996.2A
Other languages
Chinese (zh)
Inventor
肖磊
李秀桥
孙宏伟
阮涵
Current Assignee
XFusion Digital Technologies Co Ltd
Original Assignee
XFusion Digital Technologies Co Ltd
Application filed by XFusion Digital Technologies Co Ltd
Priority to CN202110396996.2A
Priority to PCT/CN2022/086626 (published as WO2022218346A1)
Publication of CN115202803A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533 Hypervisors; Virtual machine monitors
    • G06F9/45558 Hypervisor-specific management and integration aspects
    • G06F2009/45562 Creating, deleting, cloning virtual machine instances
    • G06F2009/45591 Monitoring or debugging support
    • G06F2009/45595 Network integration; Enabling network access in virtual machine instances

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)

Abstract

The application provides a fault handling method and apparatus, and relates to the field of cluster fault handling. The VM cluster to which the method applies includes a management node, a first storage node, and a plurality of VMs, and the method includes the following steps: the management node acquires the states of the plurality of VMs from the first storage node, which saves the state of each VM; if at least one of the plurality of VMs is in a failure state, the management node instructs a host in the VM cluster to restart the at least one failed VM, where the hosts in the VM cluster include the host bearing the failed VM and the other hosts in the VM cluster. In the method, a part of the storage space in the VM cluster is used as a unified address space, the first storage node comprising this address space saves the state of each VM, and the management node acquires the states of the plurality of VMs from the first storage node. This avoids communication between the management node and each host in the VM cluster and reduces the time taken by the management node to acquire the VM states.

Description

Fault processing method and device
Technical Field
The present application relates to the field of cluster fault handling, and in particular, to a fault handling method and apparatus.
Background
A virtual machine (VM), also called a logical computer, is a complete computer system that is simulated by software, has full hardware system functions, and runs in a completely isolated environment. A virtual machine cluster is a system that includes a management node and a plurality of virtual machines deployed on different hosts, where the management node is used to monitor the running state of the VMs.
Currently, a host can monitor only the VMs running on it, so a management node needs to communicate with each host to obtain the running states of the VMs from that host. If a VM fails, the management node instructs the host to restart the failed VM. If a host fails, it can no longer run its VMs normally, and the management node restarts the VMs borne by the failed host on other healthy hosts to ensure that the VMs keep running and to achieve high availability (HA) of the VM cluster. Because the management node needs to communicate with each host, the fault detection time of the VM cluster is long and the recovery time of the VMs is long. Therefore, how to quickly detect VM faults is a problem that needs to be solved.
Disclosure of Invention
The present application provides a fault handling method and apparatus, which solve the prior-art problem that fault detection for VMs is slow.
To achieve this, the following technical solutions are adopted in this application.
In a first aspect, an embodiment of the present application provides a fault handling method. The method is applied to a management node of a VM cluster, or to a communication device that can support implementing the method (for example, the communication device includes a chip system). In one possible design, the VM cluster further includes a first storage node and a plurality of VMs, and the method includes: the management node acquires the states of the plurality of VMs from the first storage node, which saves the state of each VM in the VM cluster, and instructs a host in the VM cluster to restart at least one failed VM when at least one of the plurality of VMs is in a failure state, where the hosts in the VM cluster include the host bearing the failed VM and the other hosts in the VM cluster. In the fault handling method provided by this embodiment of the application, a part of the storage space in the VM cluster is used as a unified address space, the first storage node comprising this address space saves the state of each VM in the VM cluster, and the management node can acquire the states of the plurality of VMs from the address space of the first storage node. This avoids communication between the management node and each host in the VM cluster, reduces the time taken by the management node to acquire the states of all VMs in the VM cluster, and improves the fault recovery efficiency of the VM cluster.
In an optional implementation, the acquiring, by the management node, of the states of the plurality of VMs from the first storage node includes: the management node sends a first request to the first storage node, where the first request instructs the first storage node to report the states of the plurality of VMs; and the management node receives the states of the plurality of VMs sent by the first storage node. Compared with the prior art, in which the management node needs to communicate with each host to acquire the states of all VMs in the VM cluster, the fault handling method provided in this embodiment requires the management node to communicate only with the first storage node, which reduces the number of communications needed for fault detection, shortens the time spent on network communication, and improves the fault detection efficiency of the VM cluster.
In another optional implementation, the acquiring, by the management node, of the states of the plurality of VMs from the first storage node includes: the management node receives the states of the plurality of VMs periodically sent by the first storage node. The management node can thus periodically acquire the state of each VM in the VM cluster from the first storage node, which reduces the number of communications between the management node and the hosts and shortens the fault detection time of the VM cluster.
In another optional implementation, the first storage node is further configured to save the hardware device address of each host in the VM cluster, and the method further includes: the management node acquires, from the first storage node, the hardware device address of the host bearing the failed VM; and if the host bearing the failed VM has failed, the management node determines, according to the hardware device address, another host in the VM cluster other than the host bearing the failed VM. The hardware device address may include an identifier of the failed host in the first storage node, the address of the failed host, and the addresses of the devices within the failed host.
Because a part of the storage space in the VM cluster is used as a unified address space, and the first storage node comprising this address space saves the state of each VM in the VM cluster, the management node can acquire the states of the plurality of VMs from the address space of the first storage node by using Direct Memory Access (DMA) when the management node and the first storage node are deployed on the same host; when the management node and the first storage node are deployed on different hosts, the management node can acquire the states of the plurality of VMs from the address space of the first storage node by using Remote Direct Memory Access (RDMA). This avoids communication between the management node and each host in the VM cluster, reduces the time taken by the management node to acquire the states of all VMs in the VM cluster, and improves the fault recovery efficiency of the VM cluster.
In another optional implementation, the first storage node is further configured to store the service data of the at least one failed VM, the VM cluster further includes a second storage node configured to store the system data of the plurality of VMs, and the instructing, by the management node, of a host in the VM cluster to restart the at least one failed VM includes: the management node sends a restart instruction to the host in the VM cluster, where the restart instruction instructs the host to read the service data of the failed VM from the first storage node, read the system data of the failed VM from the second storage node, and restart the failed VM according to the service data and the system data of the failed VM. Because the management node can read the hardware device address of the failed host from the first storage node by using DMA, the management node does not need to communicate over the network with the failed host in the VM cluster, which reduces the fault confirmation time of the host and improves the fault detection efficiency of the VM cluster.
In another optional implementation manner, before the management node sends the restart instruction to the hosts in the VM cluster, the method further includes: the management node isolates the failed VM from the second storage node.
In another optional implementation, if at least one of the plurality of VMs is in a failure state, the method further includes: the management node sends a second request to the first storage node and receives a request response sent by the first storage node. The second request instructs the first storage node to snapshot the service data of the failed VM that is stored in the memory of the host bearing the failed VM; the request response indicates that the first storage node has written the service data of the failed VM into the memory space of the first storage node. During fault recovery of the VM, the first storage node can snapshot the service data stored in the memory of the host bearing the failed VM to obtain a VM memory snapshot and write the VM memory snapshot into the memory space of the first storage node, which removes the need for the management node to copy the service data of the failed VM, frees the computing resources of the management node, and allows the management node to use those computing resources for other management actions of the VM cluster. In addition, while the failed VM is being restarted, the healthy host can read the VM memory snapshot from the memory space of the first storage node by using DMA or RDMA, which avoids transmitting the service data over the network, shortens the time the healthy host needs to read the service data, and thereby reduces the time the management node spends on fault recovery and the service interruption time, improving the fault recovery efficiency and high availability of the VM cluster.
In a second aspect, the present application provides another fault handling method, where the method is applied to a first storage node of a VM cluster, or the method is applicable to a communication device that can support implementing the method, where the communication device includes a chip system, for example. In one possible design, the VM cluster further includes a management node and a plurality of VMs, the first storage node is to save a state of each VM, the method includes: the first storage node receives the states of the plurality of VMs sent by the plurality of hosts of the VM cluster and sends the states of the plurality of VMs to the management node.
In an optional implementation, the sending, by the first storage node, of the states of the plurality of VMs to the management node includes: the first storage node receives a first request sent by the management node, and sends the states of the plurality of VMs to the management node according to the first request.
In another optional implementation, the sending, by the first storage node, of the states of the plurality of VMs to the management node includes: the first storage node periodically sends the states of the plurality of VMs to the management node. For example, the first storage node may set a monitoring period (e.g., 0.3 second or 0.5 second); if the time since the first storage node last reported the states of the plurality of VMs reaches the monitoring period, the first storage node sends the states of the plurality of VMs to the management node, so that after receiving them the management node determines the at least one failed VM in the VM cluster.
In another optional implementation, the fault handling method further includes: the first storage node receives a second request sent by the management node, snapshots, according to the second request, the service data stored in the memory of the failed host to obtain a VM memory snapshot, writes the VM memory snapshot into the memory space of the first storage node, and then sends a request response to the management node. The second request includes the address of the at least one failed VM among the plurality of VMs; the failed host is the host bearing the failed VM in the VM cluster; and the request response indicates that the first storage node has written the VM memory snapshot into the memory space.
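As an illustration of this flow, the following is a minimal Python sketch of how the first storage node might handle the second request. The class and method names (FirstStorageNode, handle_second_request, and so on) are assumptions made for illustration only, and the read of the failed host's memory is a placeholder for what a real distributed memory system would perform over DMA or RDMA.

# Minimal sketch (assumed names): the first storage node handles a "second request"
# by snapshotting the failed VM's service data and answering with a request response.

from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SecondRequest:
    failed_vm_addresses: List[str]     # addresses of the failed VM(s) among the plurality of VMs


@dataclass
class RequestResponse:
    written: bool                      # True once the VM memory snapshot is in the memory space


@dataclass
class FirstStorageNode:
    # memory space of the first storage node, keyed by VM address
    memory_space: Dict[str, bytes] = field(default_factory=dict)

    def _read_host_memory(self, vm_address: str) -> bytes:
        # Placeholder for reading the failed VM's service data out of the
        # failed host's memory (done via DMA/RDMA in a real system).
        return b"service-data-of-" + vm_address.encode()

    def handle_second_request(self, request: SecondRequest) -> RequestResponse:
        for vm_address in request.failed_vm_addresses:
            snapshot = self._read_host_memory(vm_address)    # VM memory snapshot
            self.memory_space[vm_address] = snapshot          # write into the memory space
        return RequestResponse(written=True)                  # request response to the management node


if __name__ == "__main__":
    node = FirstStorageNode()
    resp = node.handle_second_request(SecondRequest(failed_vm_addresses=["10.0.0.23"]))
    print(resp.written, list(node.memory_space))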
In another alternative implementation, the first storage node is further configured to maintain a hardware device address for each host in the VM cluster.
In another optional implementation, the hardware device address is stored in the address space of the first storage node, and the method further includes: the first storage node receives an address read request sent by the management node, and reads the hardware device address of the failed host from the address space according to the address read request, where the failed host is the host bearing the failed VM in the VM cluster; the first storage node then sends the hardware device address of the failed host to the management node. Because a part of the storage space in the VM cluster is used as a unified address space, and the first storage node comprising this address space saves the state of each VM and the hardware device address of each host in the VM cluster, the management node can acquire the states of the plurality of VMs and the hardware device address of the local host from the address space of the first storage node by using DMA when the management node and the first storage node are deployed on the same host; when they are deployed on different hosts, the management node can acquire the states of the plurality of VMs and the hardware device address of the host bearing the failed VM from the address space of the first storage node by using RDMA. This avoids communication between the management node and each host in the VM cluster, reduces the time taken by the management node to acquire the hardware device address of the host bearing the failed VM, and improves the fault recovery efficiency of the VM cluster.
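The address-read exchange can be sketched in the same hypothetical style. The host identifier, the dictionary layout of the address space, and the device address values below are illustrative assumptions, not details taken from the patent.

# Minimal sketch (assumed names): the address space keeps a hardware device
# address record per host; an address read request for the failed host is
# answered from that address space.

ADDRESS_SPACE = {
    # host identifier -> hardware device addresses of that host (illustrative values)
    "host-220": {"host_address": "192.168.1.22",
                 "network_card": "0000:3b:00.0",
                 "processor": "cpu-socket-0",
                 "memory": "dimm-0..dimm-7"},
}


def handle_address_read_request(failed_host_id: str) -> dict:
    """Return the hardware device address of the failed host to the management node."""
    return ADDRESS_SPACE[failed_host_id]


print(handle_address_read_request("host-220"))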
In another optional implementation, at least one memory in the VM cluster is used to implement the storage function of the first storage node. For example, the memory may be persistent memory (PMEM). Compared with the shared-storage medium in the prior art, PMEM has a higher data read/write speed, which helps reduce the data read/write time of the first storage node and improves the fault handling speed of the VM cluster.
In a third aspect, an embodiment of the present application provides a fault handling apparatus; for its beneficial effects, refer to the description of the first aspect, and details are not repeated here. The fault handling apparatus has the functionality to implement the actions in the method instances of any one of the first aspect. The functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions. In one possible design, the fault handling apparatus is applied to a management node of a VM cluster, the VM cluster further including a first storage node and a plurality of VMs, and the fault handling apparatus includes: a transceiver module, configured to acquire the states of the plurality of VMs from the first storage node; and a processing module, configured to instruct a host in the VM cluster to restart at least one failed VM if at least one of the plurality of VMs is in a failure state, where the hosts include the host bearing the failed VM and the other hosts in the VM cluster.
In an optional implementation manner, the transceiver module is specifically configured to send a first request to the first storage node, where the first request is used to instruct the first storage node to report states of the multiple VMs; the transceiver module is specifically configured to receive states of the plurality of VMs, which are sent by the first storage node.
In another optional implementation manner, the transceiver module is specifically configured to receive states of the plurality of VMs, which are periodically transmitted by the first storage node.
In another optional implementation manner, the first storage node is further configured to save a hardware device address of each host in the VM cluster; the receiving and sending module is also used for acquiring the hardware equipment address of the host bearing the fault VM from the first storage node; the processing module is further configured to determine, if the host bearing the failed VM fails, other hosts in the VM cluster except the host bearing the failed VM according to the hardware device address.
In another optional implementation manner, the first storage node is further configured to store service data of at least one failed VM, and the VM cluster further includes a second storage node, where the second storage node is configured to store system data of multiple VMs; the processing module is specifically configured to send a restart instruction to a host in the VM cluster, where the restart instruction is used to instruct the host to read service data of a failed VM from the first storage node, read system data of the failed VM from the second storage node, and restart the failed VM according to the service data and the system data of the failed VM.
In another optional implementation, the processing module is further configured to isolate the failed VM from the second storage node.
In another optional implementation manner, if the state of at least one VM of the multiple VMs is a failure state, the transceiver module is further configured to send a second request to the first storage node, where the second request is used to instruct the first storage node to snapshot service data of the failure VM stored in a memory in a host bearing the failure VM; the transceiver module is further configured to receive a second request response sent by the first storage node, where the second request response indicates that the first storage node has written the service data of the failed VM into the memory space of the first storage node.
In a fourth aspect, an embodiment of the present application provides another fault handling apparatus; for its beneficial effects, refer to the description of the second aspect, and details are not repeated here. The fault handling apparatus has the functionality to implement the actions in the method instances of any one of the second aspect. The functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions. In one possible design, the fault handling apparatus is applied to a first storage node of a VM cluster, the VM cluster further including a management node and a plurality of VMs, and the fault handling apparatus includes: a transceiver module, configured to receive the states of the plurality of VMs sent by the plurality of hosts of the VM cluster, where the transceiver module is further configured to send the states of the plurality of VMs to the management node.
In an optional implementation manner, the transceiver module is specifically configured to receive a first request sent by a management node; the transceiver module is specifically configured to send the states of the plurality of VMs to the management node according to the first request.
In another optional implementation manner, the transceiver module is specifically configured to periodically send the states of the plurality of VMs to the management node.
In another optional implementation manner, the transceiver module is further configured to receive a second request sent by the management node, where the second request includes an address of at least one failed VM in the multiple VMs; the fault handling device further comprises: a processing module; the processing module is used for carrying out snapshot on service data stored in the memory of the fault host according to the second request to obtain VM memory snapshot, and the fault host is a host bearing a fault VM in the VM cluster; the processing module is further used for writing the VM memory snapshot into the memory space of the first storage node; the transceiver module is further configured to send a second request response to the management node, where the second request response indicates that the first storage node has written the VM memory snapshot into the memory space.
In another alternative implementation, the first storage node is further configured to maintain a hardware device address for each host in the VM cluster.
In another alternative implementation, the hardware device address is stored in an address space of the first storage node; the receiving and sending module is also used for receiving an address reading request sent by the management node; the processing module is further used for reading the hardware equipment address of the fault host from the address space according to the address reading request, wherein the fault host is a host bearing a fault VM in the VM cluster; the transceiver module is further configured to send the hardware device address of the failed host to the management node.
In another optional implementation manner, at least one memory in the VM cluster is used for implementing the storage function of the first storage node. Illustratively, the memory may be PMEM.
In a fifth aspect, an embodiment of the present application provides a communication device, including a processor and an interface circuit. The processor receives or sends data through the interface circuit, and the processor is configured to implement, through logic circuits or by executing code instructions, the operation steps of the method in the first aspect or any possible implementation of the first aspect, or of the method in the second aspect or any possible implementation of the second aspect.
In a sixth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program or instructions that, when executed by a communication device, implement the operation steps of the method in the first aspect or any possible implementation of the first aspect, or of the method in the second aspect or any possible implementation of the second aspect.
In a seventh aspect, an embodiment of the present application provides a computer program product including instructions that, when run on a communication device or a processor, cause the communication device or the processor to implement the method in the first aspect or any possible implementation of the first aspect, or the method in the second aspect or any possible implementation of the second aspect.
In an eighth aspect, an embodiment of the present application provides a chip, including a memory and a processor, where the memory is configured to store computer instructions and the processor is configured to call and run the computer instructions from the memory to perform the operation steps of the method in the first aspect or any possible implementation of the first aspect, or of the method in the second aspect or any possible implementation of the second aspect.
On the basis of the implementations provided in the above aspects, the implementations of the present application may be further combined to provide more implementations.
Drawings
FIG. 1 is a schematic diagram of a virtual machine cluster provided in the prior art;
fig. 2 is a first schematic diagram of a virtual machine cluster provided in the present application;
fig. 3 is a first schematic flowchart of a fault handling method provided in the present application;
fig. 4 is a second schematic flowchart of a fault handling method provided in the present application;
fig. 5 is a third schematic flowchart of a fault handling method provided in the present application;
fig. 6 is a second schematic diagram of a virtual machine cluster provided in the present application;
FIG. 7 is a block diagram of a fault handling apparatus provided herein;
fig. 8 is a schematic structural diagram of a communication device provided in the present application.
Detailed Description
The terms "first," "second," and "third," etc. in the description and claims of this application and the above-described drawings are used for distinguishing between different objects and not for limiting a particular order.
In the embodiments of the present application, the words "exemplary" or "such as" are used herein to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "e.g.," is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, use of the word "exemplary" or "such as" is intended to present concepts related in a concrete fashion.
"plurality" means two or more, and other terms are analogous. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone. Furthermore, for elements (elements) that appear in the singular form "a," an, "and" the, "they are not intended to mean" one or only one "unless the context clearly dictates otherwise, but rather" one or more than one. For example, "a device" means for one or more such devices. Still further, at least one (at least one of).
For clarity and conciseness of the description of the embodiments described below, a brief introduction to the related art is first given.
A VM is a virtual computer, also known as a logical computer, that is emulated on the hardware resources of a physical computer. To improve data processing capacity beyond what a single VM provides, the VM cluster came into being. Fig. 1 is a schematic diagram of a virtual machine cluster provided in the prior art; the VM cluster includes a management node 110 and a plurality of VMs (such as the virtual machines 121A to 121B and the virtual machines 122A to 122B shown in fig. 1).
Management node 110 may run on any host in the VM cluster; as shown in FIG. 1, management node 110 may run on host 121 or host 122.
One or more VMs may run on each host. As shown in fig. 1, host 121 can run virtual machine 121A and virtual machine 121B, and host 122 can run virtual machine 122A and virtual machine 122B.
The management node 110 is used to implement management functions of the entire virtual machine cluster, including but not limited to: the management node 110 deploys and configures a highly available agent (HA agent) on the host, issues configuration change information of the VM cluster, configures protection information of the VM, and the like.
The host is used for deploying a plurality of VMs of the virtual machine cluster, as shown in fig. 1, a virtual machine 121A and a virtual machine 121B are deployed in the host 121, and a virtual machine 122A and a virtual machine 122B are deployed in the host 122.
The host is further configured to deploy the agent nodes issued by the management node 110. As shown in fig. 1, a host agent 121C and a high-availability agent 121D are deployed in the host 121, and a host agent 122C and a high-availability agent 122D are deployed in the host 122. The high-availability agent 121D may be configured to monitor the running state of the virtual machines on the host 121, handle high-availability communication, process heartbeats, and the like; in some possible examples, the high-availability agent 121D may also deploy a virtual machine on a host (such as the host 121 or the host 122), restart a deployed virtual machine on the host, record the logs of all virtual machines on the host, and so on. The high-availability agent 121D may also be configured to obtain the hardware device addresses of the devices in the host 121 (for example, the address information stored for the network card, the processor, and the memory in the host). The host agent 121C may be configured to perform the VM management actions issued by the management node 110 to manage the virtual machine 121A and the virtual machine 121B on the host 121.
For example, in vSphere HA, management node 110 may be vCenter, host agent 121C may be HostD, and high-availability agent 121D may be the FDM (Fault Domain Manager); the FDM relies on HostD to provide virtual machine information and performs virtual machine management actions through HostD.
To achieve high availability of the virtual machine cluster, vCenter in vSphere HA manages the entire VM cluster and collects the running states and heartbeat information of the VMs through the FDM. A host can monitor its running VMs with the FDM, and vCenter needs to communicate with the FDM in each host to acquire the running states of the VMs. If a VM fails, vCenter instructs the host to restart the failed VM. If a host fails, it can no longer run its VMs normally, and vCenter restarts the VMs carried by the failed host on other healthy hosts to ensure that the VMs keep running and to achieve high availability of the VM cluster. Since vCenter needs to communicate with each FDM, the fault detection time of the VM cluster is long, and the recovery time of the VMs is long.
To solve the above problem, an embodiment of the present application provides a fault handling method. The VM cluster to which the method applies includes a management node, a first storage node, and a plurality of VMs, and the fault handling method includes: the management node acquires the states of the plurality of VMs from the first storage node, which saves the state of each VM in the VM cluster, and instructs a host in the VM cluster to restart at least one failed VM when at least one of the plurality of VMs is in a failure state, where the hosts in the VM cluster include the host bearing the failed VM and the other hosts in the VM cluster. In the fault handling method provided by this embodiment of the application, a part of the storage space in the VM cluster is used as a unified address space, the first storage node comprising this address space saves the state of each VM in the VM cluster, and the management node can acquire the states of the plurality of VMs from the address space of the first storage node, which avoids communication between the management node and each host in the VM cluster, shortens the time taken by the management node to acquire the states of all VMs in the VM cluster, and improves the fault recovery efficiency of the VM cluster.
Embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Fig. 2 is a first schematic diagram of a virtual machine cluster provided in the present application, where the virtual machine cluster includes a management node 211, a first storage node 212, a second storage node 213, and a plurality of virtual machines (such as the virtual machine 214, the virtual machine 223, and the virtual machine 224 shown in fig. 2).
A management node and a virtual machine in the VM cluster may be deployed on the same host; as shown in fig. 2, the management node 211 and the virtual machine 214 are deployed on the host 210.
A management node and a virtual machine in the VM cluster may also be deployed on different hosts; as shown in fig. 2, the management node 211 is deployed on the host 210, and the virtual machine 223 and the virtual machine 224 are deployed on the host 220.
Clients may access the VM cluster through a network, which may include a switch, such as switch 230 shown in fig. 2. The communication between the management node and the host in the VM cluster may be realized through a network.
The management node 211 (VM manager) is used for managing all VMs in the VM cluster, for example, the management node 211 may deploy a virtual machine agent (VM agent) on a host, and the management node 211 and the VM agent on the host where each VM is located cooperate to implement a virtual machine management function of the VM cluster. For example, the management node 211 may further set a high availability agent (HA agent) in the VM agent, and the HA agent may be used to implement high availability communication of the virtual machine 214.
A first storage node 212 (Memory manager) may be used to store the state of each VM in the VM cluster. The storage function of the first storage node 212 may be implemented by a distributed memory system, where the distributed memory system is a technology for pooling multiple memories, a storage area of the first storage node 212 may also be referred to as a "unified memory space", "unified address space", "distributed memory platform", or a "unified memory pool", and the like, and a storage area of the first storage node 212 may also be referred to as a cross-node unified virtual memory space (unified virtual memory), and the application does not limit the name of the storage area of the first storage node 212.
In one possible example, as shown in FIG. 2, the first storage node 212 includes an address space and a memory space. The address space is used to store the state of each VM in the VM cluster. The memory space is used for storing service data of any one or more VMs in the VM cluster, the service data includes intermediate data generated by the VMs in the running process, and the service data may also be referred to as running data.
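One possible, purely illustrative way to model these two regions is sketched below. The class name UnifiedMemoryPool and its fields are assumptions that stand in for the distributed memory system described above; they are not part of the patent.

# Minimal sketch (assumed names): the first storage node's two regions, i.e. an
# address space for per-VM states and a memory space for per-VM service
# (running) data.

from dataclasses import dataclass, field
from typing import Dict


@dataclass
class UnifiedMemoryPool:
    address_space: Dict[str, str] = field(default_factory=dict)   # vm_id -> state
    memory_space: Dict[str, bytes] = field(default_factory=dict)  # vm_id -> service data

    def write_state(self, vm_id: str, state: str) -> None:
        self.address_space[vm_id] = state

    def write_service_data(self, vm_id: str, data: bytes) -> None:
        self.memory_space[vm_id] = data


pool = UnifiedMemoryPool()
pool.write_state("vm-214", "running")          # state of the VM in the address space
pool.write_service_data("vm-214", b"...")      # intermediate data produced while running
print(pool.address_space, len(pool.memory_space))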
The second storage node 213 may be used to save system data for each VM in the VM cluster, including data saved by the operating system in the VM. In one possible example, communication between VMs may also be accomplished by accessing a shared store. For example, the functionality of the second storage node 213 may be implemented by a shared storage of a VM cluster, which refers to a parallel architecture in which two or more virtual processors share a main memory.
In one possible example, the data read and write speed of the first storage node 212 is faster than the data read and write speed of the second storage node 213.
In order to implement the functions of each node in the above embodiments, the host may include a hardware structure corresponding to each function, and as shown in fig. 2, the host 210 may include a processor 210A, a network card 210B, a memory 210C, a storage 210D, and a communication interface 210E. Processor 210A, network card 210B, and communication interface 210E are coupled to one another. It is understood that the communication interface 210E may be a transceiver or an input-output interface.
The storage 210D may be used to store software programs and modules, such as the program instructions/modules corresponding to the fault handling method provided in the embodiments of this application, and the processor 210A executes the software programs and modules stored in the storage 210D to perform various functional applications and data processing. The communication interface 210E may be used for signaling or data communication with other devices. The host 210 may have a plurality of communication interfaces 210E in this application.
In one possible example, the storage 210D may be, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or the like. The memory 210C may be, but is not limited to, a PMEM or a dynamic random access memory (DRAM).
The processor 210A may include one or more processing units, for example, but not limited to, an Application Processor (AP), a modem processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a memory, a video codec, a Digital Signal Processor (DSP), a baseband processor, and/or a neural-Network Processing Unit (NPU), and the like. The different processing units may be separate devices or may be integrated into one or more processors. The controller may be, among other things, a neural center and a command center of the host 210. The controller can generate an operation control signal according to the instruction operation code and the timing signal to complete the control of instruction fetching and instruction execution.
A memory may also be provided in processor 210A for storing instructions and data. In some embodiments, the memory in processor 210A is a cache memory. The memory may hold instructions or data that have just been used or recycled by processor 210A. If processor 210A needs to use the instruction or data again, it may be called directly from the memory. Avoiding repeated accesses reduces the latency of the processor 210A, thereby increasing the efficiency of the system.
In some embodiments, processor 210A may include one or more interfaces. The interface may include an integrated circuit (I2C) interface, an integrated circuit built-in audio (I2S) interface, a Pulse Code Modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (mobile industry processor interface, MIPI), a general-purpose-input/output (GPIO) interface, a Subscriber Identity Module (SIM) interface, and/or a Universal Serial Bus (USB) interface, etc.
The core of network card 210B is a link layer controller, which is typically a single special purpose chip that implements many link layer services including framing, link access, flow control or error detection, etc. Network card 210B is a device that networks the host, and is commonly referred to as a network adapter that connects the host to a local area network. The network card 210B may be inserted into a motherboard slot of the host 210, and the network card 210B is responsible for converting data to be transmitted by the user into a format that can be recognized by other devices on the network, and transmitting the data through a network medium. For example, the network card 210B can be used to transmit or receive instructions and data transmitted by the host 220.
The host 220 may be a communication device having the same hardware structure as the host 210, and for the specific implementation of the host 220, and the processor 220A, the network card 220B, the memory 220C, the storage 220D, and the communication interface 220E included in the host 220, reference may be made to the above description about the host 210, which is not described herein again. It is to be understood that the illustrated structure of the embodiments of the present application does not constitute a specific limitation on the host. In other embodiments of the present application, the host may include more or fewer components than shown, or some components may be combined, some components may be split, or a different arrangement of components may be used. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
In this embodiment of this application, the storage function of the first storage node 212 may be implemented by at least one memory of the VM cluster. As shown in fig. 2, the storage function of the first storage node 212 may be implemented by the memory 210C and the memory 220C. If both the memory 210C and the memory 220C are PMEM, then compared with the shared-storage medium in the prior art, the data read/write speed of PMEM is higher, which helps reduce the data read/write time of the first storage node and improve the fault handling speed of the VM cluster. It should be noted that although fig. 2 illustrates the storage function of the first storage node 212 as being implemented by the host 210 and the host 220, in some possible examples, the storage function of the first storage node 212 may also be implemented by hosts or storage devices in the VM cluster other than the host 210 and the host 220.
In this embodiment of this application, as shown in fig. 2, the storage function of the second storage node 213 may be implemented by at least one of the storage 210D and the storage 220D. It should be noted that although fig. 2 illustrates the storage function of the second storage node 213 as being implemented by the host 210 and the host 220, in some possible embodiments, the storage function of the second storage node 213 may also be implemented by hosts or storage devices in the VM cluster other than the host 210 and the host 220.
To reduce the fault detection time of the VM cluster and improve its fault detection efficiency, the following description takes the management node 211, the first storage node 212, the second storage node 213, and any host (host 210 or host 220) of the VM cluster shown in fig. 2 as an example. Fig. 3 is a first schematic flowchart of a fault handling method provided in this application; the fault handling method includes the following steps.
S310, the first storage node 212 obtains states of a plurality of VMs from a plurality of hosts of the VM cluster.
As an optional implementation manner, the foregoing S310 may specifically include: the host writes the state of the VM deployed locally to the host into the first storage node 212.
Since the storage function of the first storage node 212 is implemented by a plurality of memories in the VM cluster, in the process of writing the state of the VM by the host, the host may write the state of the VM into a memory local to the host, and the host may also write the state of the VM into other memories of the distributed memory system implementing the first storage node.
The host may utilize a virtual machine agent to gather the state of the locally deployed VMs. As shown in fig. 2, the management node 211 deploys a virtual machine agent (VM agent) at the host 210, and the VM agent is used to collect the state of the virtual machine 214. In one possible example, a highly available agent (HA agent) may be included in the VM agent, which is used to collect highly available communication states for virtual machine 214.
The host may also utilize the virtual machine agent to gather the state of VMs deployed on other hosts in the VM cluster. As shown in fig. 2, the management node 211 deploys a VM agent for collecting the states of the virtual machine 223 and the virtual machine 224 at the host 220. In one possible example, the VM agent may include an HA agent for collecting the high available communication state of the virtual machines 223 and 224.
In one possible implementation, the state of a VM may be described by state information, which may indicate the address and the running state of the VM. For example, the address of the VM may be an Internet Protocol (IP) address, and the running state indicates whether the VM is faulty, running normally, waiting to be started, or the like. The description field corresponding to the running state may describe the virtual devices that support the running of the VM, and these virtual devices may include a processor, a memory, and a network card, so the description field carries the state of these 3 virtual devices of the virtual machine. For example, if the description field in the first storage node 212 that indicates the running state of a VM is "11X", the processor and the memory of the VM are both in a normal running state and the network card is faulty.
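A small sketch of how such a description field might be encoded and decoded is given below. The device order, the use of "1" for a healthy device and "X" for a faulty one, and the helper names are assumptions based only on the "11X" example above.

# Minimal sketch: decoding the 3-character running-state description field,
# one character per virtual device (processor, memory, network card).

DEVICES = ("processor", "memory", "network_card")


def decode_state_field(state_field: str) -> dict:
    """Map a description field such as '11X' to per-device health."""
    return {device: (flag == "1") for device, flag in zip(DEVICES, state_field)}


def vm_is_faulty(state_field: str) -> bool:
    """Treat a VM as faulty if any of its virtual devices is not healthy."""
    return not all(decode_state_field(state_field).values())


print(decode_state_field("11X"))   # {'processor': True, 'memory': True, 'network_card': False}
print(vm_is_faulty("11X"))         # True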
S320, the management node 211 acquires the states of the plurality of VMs from the first storage node 212.
In the prior art, a host can monitor only the VMs running locally, so the management node needs to communicate with each host in the VM cluster to acquire the states of all VMs in the cluster before it can handle the failed VMs. With the fault handling method provided in this embodiment of the application, the management node can instead acquire the states of all VMs in the VM cluster from the first storage node and then restart the failed VM, which avoids communication between the management node and each host in the VM cluster, shortens the time taken by the management node to acquire the states of all VMs, and improves the fault recovery efficiency of the VM cluster.
As an optional implementation, the acquisition of the states of the plurality of VMs by the management node 211 may be an "active" behavior. As shown in fig. 4, which is a second schematic flowchart of a fault handling method provided in this application, the above S320 may specifically include the following steps S320A and S320B.
S320A, the management node 211 sends a first request to the first storage node 212.
The first request is used for indicating the first storage node to report the states of the plurality of VMs.
S320B, the management node 211 receives the statuses of the plurality of VMs sent by the first storage node 212.
Compared with the prior art in which the management node needs to communicate with each host to acquire the states of all the VMs in the VM cluster, in the fault processing method provided in the embodiment of the present application, the management node only needs to communicate with the first storage node, so that the number of communications required for the management node to perform fault detection is reduced, the time required for network communications is reduced, and the fault detection efficiency of the VM cluster is improved.
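The request/response exchange of S320A and S320B can be sketched as follows. This is an in-process Python model for illustration only (a real system would transfer the data over the network, DMA, or RDMA), and the class names and the "X"-based fault convention are assumptions.

# Minimal sketch (assumed names): S320A/S320B, in which the management node sends
# a first request and the first storage node answers with the recorded VM states.

class FirstStorageNode:
    def __init__(self, vm_states):
        self.vm_states = vm_states                      # vm_id -> description field

    def handle_first_request(self):
        # Report the states of the plurality of VMs to the management node.
        return dict(self.vm_states)


class ManagementNode:
    def acquire_states(self, storage_node):
        # S320A: send the first request; S320B: receive the reported states.
        return storage_node.handle_first_request()

    def find_failed_vms(self, states):
        return [vm for vm, state in states.items() if "X" in state]


storage = FirstStorageNode({"vm-214": "111", "vm-223": "1X1", "vm-224": "111"})
manager = ManagementNode()
print(manager.find_failed_vms(manager.acquire_states(storage)))   # ['vm-223']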
In addition, in the prior art, to acquire the state of a VM deployed on a host, the VM agent on the host collects the state of the VM and reports it to the management node, and the management node then determines whether the VM deployed on the host is in a failure state. However, in the VM cluster, the management node and each VM agent have Virtual Local Area Network (VLAN) addresses and communicate with each other using these VLAN addresses, so the management node has to communicate with a VM agent over the network regardless of whether they are deployed on the same host, which results in a long communication time between the management node and the VM agent on each host during VM fault detection in the VM cluster.
In this embodiment of the application, as shown in fig. 2, the management node 211 and the first storage node 212 may communicate by DMA. A DMA transfer copies data directly from one address space to another, so the management node incurs no extra processing delay, which increases the speed at which the management node acquires the states of all VMs in the VM cluster, shortens the fault detection time of the VM cluster, and improves its fault detection efficiency. In some possible embodiments, if the management node and the first storage node are deployed on different hosts, they may communicate using RDMA, which lets the management node access the states of all VMs in the VM cluster at a remote address, reduces the delay introduced by the host processor handling data during network transmission, and improves the fault detection efficiency of the VM cluster. For descriptions of DMA and RDMA, refer to the explanations in the related art; details are not repeated here.
In addition, the VM agent on a host can also report the states of the VMs on that host by using DMA or RDMA, which reduces the number of network communications of the host and further shortens the fault detection time of the VM cluster.
As another optional implementation, the acquisition of the states of the plurality of VMs by the management node 211 may be a "passive" behavior. As shown in fig. 4, the above S320 may specifically include the following step S320C.
S320C, the first storage node 212 periodically sends the state of the plurality of VMs to the management node 211.
For example, the first storage node 212 may set a monitoring period (e.g., 0.3 second or 0.5 second); if the time since the first storage node 212 last reported the states of the plurality of VMs reaches the monitoring period, the first storage node 212 sends the states of the plurality of VMs to the management node 211, so that after receiving them the management node 211 determines the at least one failed VM in the VM cluster.
In the prior art, the management node has to communicate over the network with each host on which the VM cluster runs to acquire the state of each VM in the VM cluster. With the fault handling method provided in this embodiment of the application, the management node can instead periodically acquire the state of each VM in the VM cluster from the first storage node, which reduces the number of communications between the management node and the hosts and shortens the fault detection time of the VM cluster.
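A minimal sketch of the periodic ("passive") reporting loop of S320C is shown below. The 0.5-second monitoring period, the callback-style delivery, and the thread-based implementation are illustrative assumptions.

# Minimal sketch: the first storage node pushes the VM states to the management
# node once per monitoring period.

import threading
import time

MONITORING_PERIOD_S = 0.5


def periodic_report(get_vm_states, deliver_to_management_node, stop_event):
    """Send the states of the plurality of VMs every monitoring period."""
    while not stop_event.is_set():
        deliver_to_management_node(get_vm_states())
        time.sleep(MONITORING_PERIOD_S)


if __name__ == "__main__":
    stop = threading.Event()
    states = {"vm-223": "1X1"}
    reporter = threading.Thread(
        target=periodic_report,
        args=(lambda: dict(states), lambda s: print("reported:", s), stop),
        daemon=True,
    )
    reporter.start()
    time.sleep(1.2)   # let a couple of reports go out
    stop.set()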
With continuing reference to fig. 3, the method for handling a fault according to the embodiment of the present application may further include the following step S330.
S330, the management node 211 instructs the hosts in the VM cluster to restart the at least one failed VM.
The hosts in the VM cluster include the host carrying the failed VM and other hosts in the VM cluster other than the host carrying the failed VM.
If the host carrying the failed VM has not failed, the management node 211 may instruct the host carrying the failed VM to restart the failed VM. As shown in fig. 2, if the failed VM is the virtual machine 214, the host carrying the failed VM is the host 210, and when the host 210 has not failed, the management node 211 may instruct the host 210 to restart the virtual machine 214.
If the host bearing the failed VM has failed, the management node 211 may instruct another host in the VM cluster, other than the host bearing the failed VM, to restart the failed VM. As shown in fig. 2, if the failed VM is the virtual machine 223, the host bearing the failed VM is the host 220, and the other host in the VM cluster apart from the host bearing the failed VM may be the host 210; if the host 220 has failed, the management node 211 may instruct the host 210 to restart the virtual machine 223.
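The decision between the two cases above can be summarized in a small sketch. The hosts mapping (host id to health flag and hosted VM ids) is an assumed data layout used only for illustration.

```python
def pick_restart_host(failed_vm: int, hosts: dict) -> str:
    """Choose the host that should restart a failed VM."""
    # Find the host currently bearing the failed VM.
    carrier = next(h for h, info in hosts.items() if failed_vm in info["vms"])
    if hosts[carrier]["healthy"]:
        return carrier                           # restart in place on the carrying host
    for h, info in hosts.items():                # otherwise pick any other healthy host
        if h != carrier and info["healthy"]:
            return h
    raise RuntimeError("no healthy host available in the VM cluster")

hosts = {
    "host210": {"healthy": True, "vms": {214}},
    "host220": {"healthy": False, "vms": {223, 224}},
}
print(pick_restart_host(223, hosts))             # -> host210
```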
In the embodiment provided by the application, the management node can acquire the states of all the VMs in the VM cluster from the first storage node and then restart the faulty VM. This avoids communication between the management node and each host in the VM cluster, reduces the time the management node needs to acquire the states of all the VMs in the VM cluster, improves the fault recovery efficiency of the VM cluster, and thereby improves the high availability of the VM cluster.
As an alternative implementation, please continue to refer to fig. 4, in order to achieve high availability of the VM cluster, the above S330 specifically includes the following steps S330A and S330B.
S330A, the management node 211 sends a restart instruction to the hosts in the VM cluster.
The restart instruction instructs the host to read the service data of the failed VM from the first storage node, read the system data of the failed VM from the second storage node, and restart the failed VM according to that service data and system data. The system data refers to information such as the VM's operating system and saved process data. In one possible example, the restart instruction may include the memory address of the failed VM's service data in the first storage node and the storage address of the failed VM's system data in the second storage node.
S330B, the host reads the service data of the failed VM from the first storage node 212, and reads the system data of the failed VM from the second storage node 213, and restarts the failed VM according to the service data and the system data of the failed VM.
In a possible example, the host that restarts the failed VM may be the host bearing the failed VM. As shown in fig. 2, if the virtual machine 224 is in a failure state and the host 220 bearing the virtual machine 224 is in a normal state, the management node 211 may send a restart instruction to the virtual machine agent on the host 220. The virtual machine agent interacts with hardware devices such as the processor 220A of the host 220, so that the host 220 reads the service data of the virtual machine 224 from the first storage node 212 according to the memory address and reads the system data of the virtual machine 224 from the second storage node 213 according to the storage address; the host 220 then restarts the virtual machine 224 according to that service data and system data.
In another possible example, the host that restarts the failed VM may be another host in the VM cluster other than the host bearing the failed VM. As shown in fig. 2, if the virtual machine 224 is in a failure state and the host 220 bearing the virtual machine 224 is also in a failure state, the management node 211 may send a restart instruction to the virtual machine agent on the host 210. The virtual machine agent interacts with hardware devices such as the processor 210A of the host 210, so that the host 210 reads the service data of the virtual machine 224 from the first storage node 212 according to the memory address and reads the system data of the virtual machine 224 from the second storage node 213 according to the storage address; the host 210 then restarts the virtual machine 224 according to that service data and system data.
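Steps S330A and S330B can be sketched as follows. The RestartInstruction fields and the first_storage, second_storage and hypervisor interfaces are assumptions chosen to mirror the description above, not definitions taken from the patent.

```python
from dataclasses import dataclass

@dataclass
class RestartInstruction:
    vm_id: int
    service_data_addr: int   # memory address of the service data in the first storage node
    system_data_addr: str    # storage address of the system data in the second storage node

def handle_restart(instr: RestartInstruction, first_storage, second_storage, hypervisor):
    """Host-side handling of a restart instruction (illustrative sketch)."""
    service_data = first_storage.read(instr.service_data_addr)   # e.g. a DMA/RDMA read of the memory snapshot
    system_data = second_storage.read(instr.system_data_addr)    # operating system and saved process data
    hypervisor.restart_vm(instr.vm_id, system_data, service_data)
```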
In the prior art, the host restarting the failed VM obtains the service data and the system data of the failed VM from the shared storage of the VM cluster; because the shared storage holds a large amount of data, that host has difficulty quickly locating the service data of the failed VM, so the restart of the failed VM takes a long time. In the fault processing method provided by the embodiment of the application, the host restarting the failed VM reads the service data of the failed VM from the first storage node and the system data of the failed VM from the second storage node. It therefore does not need to search for several different kinds of data in the same storage area, which reduces the time needed to acquire the service data and the system data and improves the fault recovery efficiency of the VM cluster.
As an optional implementation, in order to determine the host (a healthy host) in the VM cluster that will restart the failed VM, refer to fig. 5, which is a flowchart of a failure processing method provided by the present application. When the management node 211 confirms that the state of at least one of the plurality of VMs in the VM cluster is a failure state, after S320 described above, the failure processing method provided by the embodiment of the present application may further include the following steps S321 to S323.
S321, the management node 211 sends an address read request to the first storage node 212.
For example, the address read request may include an address of a failed VM of the plurality of VMs in the VM cluster, such that the first storage node may query the host carrying the failed VM according to the address of the failed VM.
S322, the first storage node 212 reads the hardware device address of the host bearing the failed VM from the address space according to the address read request.
As shown in fig. 2, the hardware device addresses may be stored in the address space of the first storage node 212. A hardware device address records, for each hardware device in a host, the address of that device within the host. For example, the hardware device address may include an identification of the failed host in the first storage node, an address of the failed host, and the addresses of the failed host's devices within the failed host. As shown in fig. 2, the hardware device address may include an identification of the failed host (e.g., host 220) in the first storage node, the identification being unique across the VM cluster; an address of the failed host, for example an IP address or a media access control (MAC) address; and the addresses of the failed host's devices, such as the processor 210A address "001", the network card 210B address "002", the memory 210C address "003", the memory 210D address "005", and the communication interface 210E address "005".
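One possible way to lay out such an entry is sketched below. The field names mirror the example above (host identification, host address, per-device addresses), while the concrete types and values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class HardwareDeviceAddress:
    """One host's entry in the first storage node's address space."""
    host_id: str                                             # identification unique across the VM cluster
    host_addr: str                                           # IP or MAC address of the host
    devices: Dict[str, str] = field(default_factory=dict)    # device name -> address within the host

entry = HardwareDeviceAddress(
    host_id="host-220",
    host_addr="192.0.2.20",
    devices={"processor": "001", "network card": "002", "memory": "003"},
)
```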
S323, the management node 211 acquires the hardware device address of the host bearing the failed VM from the first storage node 212.
The process of the management node 211 acquiring the hardware device address from the first storage node 212 may use a DMA or RDMA technology, which may reduce network communication between the management node 211 and the host, thereby reducing acquisition time of the hardware device address and improving failure detection efficiency of the VM cluster.
In the fault processing method provided by the embodiment of the application, a part of the storage space in the VM cluster is used as a unified address space, and the first storage node containing that address space stores the state of each VM in the VM cluster and the hardware device address of each host. When the management node and the first storage node are deployed on the same host, the management node can acquire the states of the plurality of VMs and the hardware device addresses local to the host from the address space of the first storage node by using DMA. When the management node and the first storage node are deployed on different hosts, the management node can acquire the states of the plurality of VMs and the hardware device address of the host bearing the failed VM from the address space of the first storage node by using RDMA (remote direct memory access). In both cases the management node is prevented from having to communicate with each host in the VM cluster, the time for the management node to acquire the hardware device address of the host bearing the failed VM is reduced, and the failure recovery efficiency of the VM cluster is improved.
In a possible implementation manner, if the management node 211 determines that the host bearing the failed VM fails, please refer to fig. 5, after the above step S323, the method for handling a failure provided in the embodiment of the present application may further include the following step S324.
S324, the management node 211 determines other hosts in the VM cluster except the host carrying the failed VM according to the hardware device address.
As shown in fig. 2, when the management node 211 obtains the address of the network card 220B in the host 220 according to the state (11X) of the virtual machine 223 and confirms, by ping or another fault confirmation method, that the network card 220B is in a failure state, the management node 211 confirms that the host 220 is a failed host, and the management node 211 then regards the host 210 as the "other host" in S324, which may also be referred to as a "healthy host".
The management node 211 may determine a failed host of the VM cluster using the hardware device addresses of the devices in that host, and during failure recovery of the VM cluster the management node 211 selects a non-failed host (a healthy host) in the VM cluster to restart the failed VM. In the fault processing method provided in the embodiment of the present application, because the management node 211 may read the hardware device address of the faulty host from the first storage node 212 by using DMA, the management node 211 does not need to perform network communication with the faulty host in the VM cluster, which reduces the time needed to confirm the host fault and improves the fault detection efficiency of the VM cluster.
A client communicates with the VM cluster through a network, and the interface between each VM in the VM cluster and the client is unique. When the management node confirms that a VM in the VM cluster has failed, if the management node restarts the failed VM on a healthy host in the cluster without shutting down the original failed VM, two VMs with the same VLAN address exist in the VM cluster at the same time. The client then has difficulty determining from the VM address which VM is in a normal state, may treat the original failed VM as the healthy one, and the service the client requests from the VM cluster may fail to run because it lands on the failed VM. In order to solve this problem, please continue to refer to fig. 5: the fault handling method provided by the embodiment of the present application may further include the following step S325.
S325, the management node 211 isolates the failed VM from the second storage node 213.
In one possible example, "isolating" above may be understood as: the management node 211 revokes the failed VM's access rights to the second storage node 213 and revokes the failed VM's permission to communicate with clients outside the VM cluster.
Here, taking the failed VM being the virtual machine 223 and the host bearing it being the host 220 as an example, when the management node 211 confirms that the host 220 is a failed host, the management node 211 may isolate the virtual machine 223 from the second storage node 213, so that the virtual machine 223 on the host 220 can neither communicate with clients connected to the VM cluster nor access the second storage node 213. After the management node 211 restarts the virtual machine 223 on the host 210, the virtual machine 223 on the host 210 communicates with the clients connected to the VM cluster. This prevents the failed VM from presenting two external interfaces while the management node 211 restarts the failed VM (virtual machine 223) on a healthy host (such as the host 210); consequently, while high availability of the VM cluster is achieved, erroneous interaction between a client and the failed VM caused by duplicate external interfaces is avoided, which further improves the high availability of the VM cluster.
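A minimal sketch of step S325 is given below; second_storage and network are assumed management interfaces for the second storage node and the cluster's external network, and the method names are illustrative only.

```python
def isolate_failed_vm(vm_id: int, second_storage, network):
    """Isolate a failed VM before it is restarted on another host."""
    second_storage.revoke_access(vm_id)       # failed VM can no longer access the second storage node
    network.disable_client_interface(vm_id)   # clients outside the cluster can no longer reach the stale VM
```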
In the failed-VM recovery process of the prior art, the management node copies the service data of the failed VM stored in the memory of the host bearing the failed VM and sends the copy to a healthy host. Copying the service data consumes a large amount of the management node's computing resources, forcing the management node to cut back other management actions of the VM cluster and lowering its management performance. Moreover, the host bearing the failed VM transmits the service data over the network to the management node and then from the management node to the host restarting the failed VM, which consumes a large amount of the VM cluster's transmission resources, lengthens the fault recovery time of the VM cluster, increases the service interruption time on the failed VM, and reduces the high availability of the VM cluster.
In order to solve the above problem and increase the speed at which the failed VM is pulled up again, please continue to refer to fig. 5; the fault handling method provided in the embodiment of the present application may further include the following steps S326 to S328.
S326, the management node 211 sends a second request to the first storage node.
The second request instructs the first storage node to take a snapshot of the service data of the failed VM stored in the memory of the host bearing the failed VM.
As an optional implementation, the second request may include the address of at least one failed VM. After the first storage node receives the second request, it looks up, according to the address of the failed VM in the second request, the service data of the failed VM stored in the memory of the host, and takes a snapshot of that service data to obtain a VM memory snapshot, where the VM memory snapshot indicates the service data of the failed VM.
S327, the first storage node 212 performs a snapshot on the service data stored in the memory of the failed host according to the second request.
As an optional implementation, S327 above may specifically be: the first storage node 212 looks up the service data of the faulty VM stored in the memory of the host according to the address of the faulty VM included in the second request, takes a snapshot of that service data to obtain a VM memory snapshot, and then writes the VM memory snapshot into the memory space of the first storage node 212. For the snapshot technique, refer to the related explanation in the prior art; it is not repeated here.
In a possible example, as shown in fig. 2, take the failed VM being the virtual machine 223 and the host bearing it being the host 220 as an example, with the memory 220C storing the service data of the virtual machine 223. After the first storage node 212 receives the second request sent by the management node 211, the first storage node 212 locates the service data of the virtual machine 223 in the memory 220C of the host 220 by using the address of the virtual machine 223 and takes a snapshot of that service data to obtain a VM memory snapshot of the virtual machine 223, where the VM memory snapshot indicates the service data of the virtual machine 223 in the memory 220C. It should be noted that, when the host 220 is a failed host, if the virtual machine 223 and the virtual machine 224 are both virtual machines configured with high availability agents, the first storage node 212 may snapshot not only the service data of the virtual machine 223 but also the service data of the virtual machine 224 in the memory 220C, so that the management node 211 can restart the virtual machine 223 and the virtual machine 224 on other hosts of the VM cluster. This prevents the virtual machine 224 from continuing to run on the failed host 220 and improves the reliability of each VM in the VM cluster.
S328, the first storage node 212 sends a request response to the management node 211.
The request response indicates that the first storage node has written the service data of the failed VM into the memory space of the first storage node. Where the service data of the failed VM has been snapshotted, the request response indicates that the first storage node has written the VM memory snapshot into its memory space.
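Steps S326 to S328 can be sketched on the first storage node side as follows. The request and response field names and the host_memory, first_storage and management_node interfaces are assumptions made to mirror the steps above, not definitions from the patent.

```python
def handle_second_request(second_request: dict, host_memory, first_storage, management_node):
    """First-storage-node-side handling of the snapshot request (illustrative sketch)."""
    entries = []
    for vm_addr in second_request["failed_vm_addresses"]:
        service_data = host_memory.read_service_data(vm_addr)       # locate data in the carrying host's memory
        snapshot = bytes(service_data)                               # VM memory snapshot of that service data
        mem_addr = first_storage.write_snapshot(vm_addr, snapshot)   # write into the first storage node's memory space
        entries.append({"vm": vm_addr, "memory_address": mem_addr})
    management_node.send_request_response(entries)                   # S328: confirm the snapshot has been written
```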
In one possible implementation, the first storage node 212 may be configured to store the service data of at least one failed VM in the VM cluster, where the service data refers to the running data of the failed VM that is held in the memory of a host while the failed VM runs on that host. The storage function of the first storage node 212 may be implemented by a distributed memory system: for example, the memories (e.g., DRAM and PMEM) in the hosts are pooled using distributed technology to form a logical memory pool, and the function of the first storage node 212 is implemented by that logical memory pool; within the VM cluster, the data of the first storage node 212 may be shared, uniformly managed, and uniformly accessed through the first storage node 212. As shown in fig. 2, the storage function of the first storage node 212 may be implemented by the memory 210C and the memory 220D, and the memory 210C and the memory 220D may be DRAM and/or PMEM.
In a possible implementation, the request response may include the memory address of the failed VM's service data in the memory space of the first storage node. After the management node receives the request response, it may obtain that memory address and send a restart instruction containing the memory address to a host of the VM cluster (the host bearing the VM, or another host in the VM cluster). That host then reads the service data of the failed VM from the first storage node according to the memory address, reads the system data of the failed VM from the second storage node, and restarts the failed VM according to the service data and the system data.
In the fault recovery process of a VM, the first storage node may snapshot the service data held in the memory of the host bearing the faulty VM to obtain a VM memory snapshot and write that snapshot into its own memory space. This removes the step in which the management node copies the service data of the faulty VM and frees the management node's computing resources for other management actions of the VM cluster. In addition, while restarting the failed VM, the healthy host can read the VM memory snapshot from the memory space of the first storage node by using DMA or RDMA, which avoids transmitting the service data over the network, reduces the time the healthy host needs to read the service data, and therefore reduces both the fault recovery time of the management node and the service interruption time, improving the fault recovery efficiency and the high availability of the VM cluster.
In the fault processing method provided in this embodiment of the present application, the function of the first storage node may be implemented by a distributed memory system that includes a plurality of PMEMs. During the recovery of a failed VM, the first storage node may snapshot the original service data of the failed VM to obtain a memory snapshot of the failed VM. After the management node issues a restart instruction to the healthy host, the healthy host may read the service data of the failed VM from the first storage node according to the memory address, carried in the restart instruction, of that service data in the first storage node, and read the system data of the failed VM from the second storage node according to the storage address of that system data in the second storage node. In the prior art, by contrast, both the system data and the service data of the failed VM have to be read from the shared storage, which takes longer.
With respect to the fault handling method provided by the foregoing embodiment of the present application, a possible specific implementation manner is provided in the embodiment of the present application, as shown in fig. 6, fig. 6 is a schematic diagram of a virtual machine cluster provided by the present application, where the virtual machine cluster includes a management node 630, a first storage node 631, a second storage node 632, a virtual machine 611 deployed on a host 610, and a virtual machine 621 and a virtual machine 622 deployed on the host 620. The management node 630 may implement the function of the management node 211, the first storage node 631 may implement the function of the first storage node 212, the second storage node 632 may implement the function of the second storage node 213, and for the hardware implementation of the host 610 and the host 620, reference may be made to the description related to the host 210, which is not described herein again.
The management node 630 (VM manager) may include a high availability management unit 630A (HA manager), and the HA manager may be used to configure the high availability attributes of the VM cluster. For example, the VM manager may deploy a virtual machine agent (VM agent) on each host of the VM cluster (e.g., the host 610 and the host 620 shown in fig. 6); the VM agent may be used to monitor the state of each VM on the host hosting the VM agent and to obtain the hardware device address of that host, such as the virtual machine agent 613 and the virtual machine agent 623 shown in fig. 6. The HA manager may configure a high availability agent (HA agent) in each VM agent, and the HA agent may be used to monitor the high availability attributes of each VM on the host hosting the HA agent, such as the high availability agent 613A and the high availability agent 623A shown in fig. 6.
The first storage node 631 (Memory manager) may deploy a memory agent (Memory agent) on each host of the VM cluster. The Memory agent may be used to manage the memory of the host where it is located, such as the Memory agent 614 and the Memory agent 624 shown in fig. 6; for example, the Memory agent 614 may be responsible for collecting the memory information of the host 610 and performing memory allocation and reclamation.
In the virtual machine cluster provided in fig. 6, each VM may access the second storage node 632 to execute the services of the VM cluster; for example, service data interaction between the VMs may be realized through the second storage node 632.
In the virtual machine cluster provided in fig. 6, the VM manager, the VM agents, and the virtual processors of the VMs may implement the virtual machine management function of the VM cluster, while the Memory manager and the Memory agents implement the unified memory management function of the VM cluster. In some possible examples, the Memory manager may also be placed in the VM manager as a functional component, and likewise the Memory agent may be placed in the VM agent as a functional component, which reduces the number of independent processing nodes in the VM cluster and thereby reduces its redundancy.
In the process of realizing high availability of the VM cluster with the fault handling method provided in the embodiment of the present application, the VM agent on each host collects information such as the states of the VMs and the hardware device addresses borne by that host; the HA agent within the VM agent determines the high availability attribute of each VM on the host bearing the HA agent and adds that attribute to the state of the VM; and the VM agent then writes information such as the VM states and the hardware device addresses into the storage area of the Memory manager through the Memory agent of the same host.
In this embodiment of the present application, as shown in fig. 6, the first storage node 631 (Memory manager) includes a memory space and an address space. The address space may be used to store the state of each VM in the VM cluster and the hardware device address of each host of the VM cluster, and the memory space may be used to store the service data of any one or more VMs in the VM cluster. For example, the Memory agent 624 may snapshot the service data of the virtual machine 621 stored in the memory of the host 620 to obtain a VM memory snapshot and write that snapshot into the memory space of the first storage node 631.
The HA manager in the VM manager may read the states of the VMs from the address space of the first storage node 631 to determine the faulty VM in the VM cluster, so that the VM manager can perform the management operations of the VM cluster according to the fault information of the faulty VM and thereby achieve high availability of the VM cluster. For example, the VM manager may instruct a healthy host to pull the failed VM up again as follows: the HA manager issues a restart instruction to the HA agent on the healthy host; after receiving it, the HA agent instructs the VM agent on the healthy host to restart the failed VM; while restarting the failed VM, the VM agent issues a data reading request to the Memory agent (for example, carrying the memory address of the failed VM's service data in the memory space of the first storage node); and the healthy host then reads the service data of the failed VM from the memory space of the first storage node 631 according to the memory address in the data reading request, where the service data may be stored in that memory space as a memory snapshot. The healthy host also reads the system data of the failed VM from the second storage node 632, and restarts the failed VM according to the service data and the system data. For the recovery process and beneficial effects of the failed VM, reference may be made to the above descriptions of S310 to S330 and their possible sub-steps, which are not repeated here.
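The roles above can be tied together in a short orchestration sketch. The ha_manager and memory_manager objects stand in for the HA manager inside the VM manager and for the Memory manager (first storage node), and their method names are assumptions chosen to mirror the steps just described.

```python
def recover_failed_vm(ha_manager, memory_manager, vm_id: int, healthy_host: str):
    """End-to-end recovery flow for one failed VM (illustrative sketch)."""
    mem_addr = memory_manager.snapshot_service_data(vm_id)   # Memory agent snapshots the carrying host's memory
    ha_manager.isolate(vm_id)                                # stale VM loses storage access and its client interface
    ha_manager.send_restart(healthy_host, vm_id, mem_addr)   # HA agent -> VM agent pulls the VM up on the healthy host
```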
In the prior art, because the management node needs to communicate with the hosts multiple times to obtain the state of each VM in the VM cluster and the hardware device address of the host carrying the failed VM, confirming the fault of a VM takes at least 15 seconds. In the process of pulling the VM up again, the service data stored in the memory of the host bearing the failed VM has to be copied and transmitted to the shared storage, which takes at least another 15 seconds. As a result, HA of the VM cluster takes at least 30 seconds when the host on which a VM is located fails.
With the fault processing method provided by the application, the management node can acquire the states of all VMs and the hardware device address of the host bearing the faulty VM from the first storage node of the VM cluster by using DMA or RDMA, so confirming the fault of a VM takes less than 1 second. In the process of pulling the VM up again, the memory agent (Memory agent) on the host bearing the faulty VM may receive a snapshot request from the first storage node and snapshot the service data stored in that host's memory; the first storage node may then write a VM memory snapshot indicating the service data of the faulty VM into its memory space, and the healthy host can obtain that snapshot from the first storage node of the VM cluster, so pulling the faulty VM up again also takes less than 1 second. Therefore, with the fault processing method provided by the application, the overall HA time of the VM cluster is less than 2 seconds, and in practice is generally about 1 second, which greatly reduces the service interruption time in the VM cluster and improves its high availability.
It should be noted that, although the above embodiments of the present application take a virtual machine cluster as an example, in some possible cases the fault processing method provided in the present application may also be applied to a cluster containing multiple distributed processes, such as Kubernetes (k8s) with multiple pods, or to other high availability scenarios involving processes or containers lighter than pods, so as to implement fast fault detection and recovery of the cluster and reduce its service interruption time.
It is understood that, in order to implement the functions of the above embodiments, the host includes corresponding hardware structures and/or software modules for performing the respective functions. Those of skill in the art will readily appreciate that the various illustrative elements and method steps described in connection with the embodiments disclosed herein may be implemented as hardware or combinations of hardware and computer software. Whether a function is performed in hardware or computer software driven hardware depends on the specific application scenario and design constraints of the solution.
Fig. 7 and fig. 8 are schematic structural diagrams of a possible fault handling apparatus and a communication device provided in an embodiment of the present application. The fault handling apparatus and the communication device may be configured to implement the functions of the host in the above method embodiment, and therefore, the advantageous effects of the above method embodiment can also be achieved. In the embodiment of the present application, the communication device may be the host 210 or the host 220 shown in fig. 2, a module (e.g., a chip) applied to the host, or a storage device (e.g., a disk array) with processing capability.
Fig. 7 is a schematic block diagram of a fault handling apparatus provided in this application, where the fault handling apparatus 700 includes a transceiver module 710 and a processing module 720, the fault handling apparatus 700 may implement the functions of the management node 211 and the first storage node 212 shown in fig. 2 to fig. 5, and the fault handling apparatus 700 may also implement the functions of the management node 630 and the first storage node 631 shown in fig. 6, it should be understood that the present embodiment only performs an exemplary division on the structure and the functional modules of the fault handling apparatus 700, and this application does not limit any specific division thereof.
When the fault handling apparatus 700 is used to implement the function of the management node 211 in the method embodiment shown in fig. 3, the transceiver module 710 is configured to perform S320, and the transceiver module 710 and the processing module 720 are further configured to implement S330 in cooperation.
When the failure handling apparatus 700 is used to implement the function of the first storage node 212 in the method embodiment shown in fig. 3, the transceiver module 710 is used to execute S310.
When the fault handling apparatus 700 is used to implement the function of the management node 211 in the method embodiment shown in fig. 4, the transceiver module 710 is used to execute S320A, S320B, S320C and S330A.
When the failure processing apparatus 700 is used to implement the functions of the first storage node 212 in the method embodiment shown in fig. 4, the transceiver module 710 is configured to execute S310, S320A, S320B, and S320C, and the processing module 720 is configured to implement S330B in cooperation with the host and the second storage node 213.
When the fault handling apparatus 700 is used to implement the function of the management node 211 in the method embodiment shown in fig. 5, the transceiver module 710 is configured to execute S320, S321, S323, S326 and S328, and the processing module 720 is configured to execute S324, S325 and S330.
When the fault handling apparatus 700 is used to implement the functions of the first storage node 212 in the method embodiment shown in fig. 5, the transceiver module 710 is configured to execute S310, S320, S321, S323, S326 and S328, and the processing module 720 is configured to execute S322, S327 and S330.
In an alternative implementation, the failure processing apparatus 700 may further include a storage module 730, where the storage module 730 may be configured to store the states of the plurality of VMs in the VM cluster, the hardware device address of the host that carries the VM, the service data and the system data of the memory snapshot or the failed VM of the VM, and the like.
More detailed descriptions about the fault handling apparatus 700 can be directly obtained by referring to the related descriptions in the embodiments shown in fig. 2 to fig. 6, which are not repeated herein.
Fig. 8 is a schematic structural diagram of a communication device provided in the present application, where the communication device 800 includes a processor 810 and a communication interface 820. Processor 810 and communication interface 820 are coupled to one another. It is to be appreciated that the communication interface 820 can be a transceiver or an input-output interface. Optionally, the communication device 800 may further include a memory 830 for storing instructions to be executed by the processor 810 or for storing input data required by the processor 810 to execute the instructions or for storing data generated by the processor 810 after executing the instructions.
When communications device 800 is used to implement the embodiments shown in fig. 2-6, processor 810, communications interface 820, and memory 830 may also cooperate to implement the various operational steps in the fault handling methods performed by the nodes and hosts in a VM cluster. The communication device 800 may also perform the functions of the fault handling apparatus 700 shown in fig. 7, which are not described herein.
The specific connection medium among the communication interface 820, the processor 810 and the memory 830 is not limited in the embodiments of the present application. In fig. 8, the communication interface 820, the processor 810 and the memory 830 are connected by a bus 840, the bus is represented by a thick line in fig. 8, and the connection manner among other components is only schematically illustrated and is not limited. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 8, but this is not intended to represent only one bus or type of bus.
The memory 830 can be used for storing software programs and modules, such as program instructions/modules corresponding to the fault handling method provided in the embodiment of the present application, and the processor 810 executes various functional applications and data processing by executing the software programs and modules stored in the memory 830. The communication interface 820 may be used for communicating signaling or data with other devices. The communication device 800 may have multiple communication interfaces 820 in this application.
In one possible example, the above-mentioned memory 830 may be, but is not limited to, RAM, ROM, PROM, EPROM, EEPROM, DRAM, a hard disk drive (HDD), a solid state drive (SSD), a disk array (RAID), or the like.
It is understood that the processor in the embodiments of the present application may be a Central Processing Unit (CPU), other general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The general purpose processor may be a microprocessor, but may be any conventional processor.
The method steps in the embodiments of the present application may be implemented by hardware, or by software instructions executed by a processor. The software instructions may consist of corresponding software modules, which may be stored in RAM, flash memory, ROM, PROM, EPROM, EEPROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an ASIC. Additionally, the ASIC may reside in a network device or a communication device. Of course, the processor and the storage medium may also reside as discrete components in a network device or a communication device.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer programs or instructions. When the computer program or instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are performed in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, a network appliance, a user device, or other programmable apparatus. The computer program or instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another computer readable storage medium, for example, the computer program or instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire or wirelessly. The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that integrates one or more available media. The usable medium may be a magnetic medium, such as a floppy disk, a hard disk, a magnetic tape; or optical media such as Digital Video Disks (DVDs); but may also be a semiconductor medium, such as an SSD.
In various embodiments of the present application, unless otherwise specified or conflicting, terms and/or descriptions between different embodiments have consistency and may be mutually referenced, and technical features in different embodiments may be combined to form a new embodiment according to their inherent logical relationships. In this application, "at least one" means one or more, "a plurality" means two or more. "and/or" describes the association relationship of the associated objects, meaning that there may be three relationships, e.g., a and/or B, which may mean: a exists alone, A and B exist simultaneously, and B exists alone, wherein A and B can be singular or plural. In the description of the text of the present application, the character "/" generally indicates that the former and latter associated objects are in an "or" relationship; in the formula of the present application, the character "/" indicates that the preceding and following associated objects are in a "division" relationship.
It is to be understood that the various numerical references referred to in the embodiments of the present application are merely for descriptive convenience and are not intended to limit the scope of the embodiments of the present application. The sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of the processes should be determined by their functions and inherent logic.

Claims (32)

1. A fault handling method, wherein a VM cluster of a virtual machine includes a management node, a first storage node, and a plurality of VMs, the first storage node being configured to save a state of each VM, the method being performed by the management node, the method comprising:
obtaining states of the plurality of VMs from the first storage node;
and if the state of at least one VM in the plurality of VMs is a fault state, indicating a host in the VM cluster to restart the at least one fault VM, wherein the host comprises a host bearing the fault VM and other hosts except the host bearing the fault VM in the VM cluster.
2. The method of claim 1, wherein obtaining the state of the plurality of VMs from the first storage node comprises:
sending a first request to the first storage node, wherein the first request is used for indicating the first storage node to report the states of the plurality of VMs;
and receiving the states of the plurality of VMs sent by the first storage node.
3. The method of claim 1, wherein obtaining the state of the plurality of VMs from the first storage node comprises:
receiving the states of the plurality of VMs periodically transmitted by the first storage node.
4. A method according to any of claims 1-3, wherein the first storage node is further configured to maintain a hardware device address for each host in the VM cluster, the method further comprising:
acquiring a hardware device address of the host bearing the fault VM from the first storage node;
and if the host bearing the fault VM has a fault, determining other hosts in the VM cluster except the host bearing the fault VM according to the hardware equipment address.
5. The method of any of claims 1-4, wherein the first storage node is further configured to maintain traffic data for the at least one failed VM, wherein the VM cluster further comprises a second storage node configured to maintain system data for the plurality of VMs, wherein instructing a host in the VM cluster to restart the at least one failed VM comprises:
and sending a restart instruction to a host in the VM cluster, wherein the restart instruction is used for instructing the host to read the service data of the fault VM from the first storage node and read the system data of the fault VM from the second storage node, and restarting the fault VM according to the service data and the system data of the fault VM.
6. The method of claim 5, wherein prior to sending a reboot instruction to the hosts in the VM cluster, the method further comprises:
isolating the failed VM from the second storage node.
7. The method of any of claims 1-6, wherein if the state of at least one of the plurality of VMs is a failure state, the method further comprises:
sending a second request to the first storage node, where the second request is used to instruct the first storage node to snapshot service data of the failed VM stored in the memory of the host bearing the failed VM;
receiving a request response sent by the first storage node, where the request response indicates that the first storage node has written the service data of the failed VM into a memory space of the first storage node.
8. A fault handling method, wherein a VM cluster of a virtual machine includes a management node, a first storage node, and a plurality of VMs, the first storage node being configured to save a state of each VM, the method being performed by the first storage node, the method comprising:
receiving states of the plurality of VMs sent by a plurality of hosts of the VM cluster;
sending the state of the plurality of VMs to the management node.
9. The method of claim 8, wherein sending the state of the plurality of VMs to the management node comprises:
receiving a first request sent by the management node;
and sending the states of the plurality of VMs to the management node according to the first request.
10. The method of claim 8, wherein sending the state of the plurality of VMs to the management node comprises:
periodically sending the state of the plurality of VMs to the management node.
11. The method according to any one of claims 8-10, further comprising:
receiving a second request sent by the management node, wherein the second request comprises an address of at least one failed VM in the plurality of VMs;
performing snapshot on service data stored in a memory of a fault host according to the second request to obtain a VM memory snapshot, wherein the fault host is a host bearing a fault VM in the VM cluster;
writing the VM memory snapshot into the memory space of the first storage node;
sending a request response to the management node, wherein the request response indicates that the first storage node writes the VM memory snapshot into the memory space.
12. The method of any of claims 8-11, wherein the first storage node is further configured to maintain a hardware device address for each host in the VM cluster.
13. The method of claim 12, wherein the hardware device address is stored in an address space of the first storage node, the method further comprising:
receiving an address reading request sent by the management node;
reading a hardware device address of a fault host from the address space according to the address reading request, wherein the fault host is a host bearing a fault VM in the VM cluster;
and sending the hardware equipment address of the fault host to the management node.
14. The method of any of claims 8-13, wherein at least one memory in the VM cluster is used to implement the storage function of the first storage node.
15. The method of claim 14, wherein the memory is persistent memory (PMEM).
16. A fault handling apparatus, wherein a VM cluster of a virtual machine includes a management node, a first storage node, and a plurality of VMs, the first storage node being configured to save a state of each of the VMs, the fault handling apparatus being applied to the management node, the apparatus comprising:
a transceiver module, configured to obtain the states of the plurality of VMs from the first storage node;
and the processing module is used for indicating a host in the VM cluster to restart at least one failed VM if the state of at least one VM in the VMs is a failure state, wherein the host comprises a host bearing the failed VM and other hosts except the host bearing the failed VM in the VM cluster.
17. The apparatus according to claim 16, wherein the transceiver module is specifically configured to send a first request to the first storage node, where the first request is used to instruct the first storage node to report the states of the VMs;
the transceiver module is specifically configured to receive the statuses of the VMs, which are sent by the first storage node.
18. The apparatus of claim 16, wherein the transceiver module is specifically configured to receive the status of the VMs of the plurality of VMs periodically transmitted by the first storage node.
19. The apparatus of any of claims 16-18, wherein the first storage node is further configured to maintain a hardware device address for each host in the VM cluster;
the transceiver module is further configured to acquire, from the first storage node, a hardware device address of the host bearing the failed VM;
and the processing module is further configured to determine, if the host bearing the faulty VM fails, other hosts in the VM cluster except for the host bearing the faulty VM according to the hardware device address.
20. The apparatus of any of claims 16-19, wherein the first storage node is further configured to maintain traffic data for the at least one failed VM, and wherein the VM cluster further comprises a second storage node configured to maintain system data for the plurality of VMs;
the processing module is specifically configured to send a restart instruction to a host in the VM cluster, where the restart instruction is used to instruct the host to read service data of the failed VM from the first storage node, read system data of the failed VM from the second storage node, and restart the failed VM according to the service data and the system data of the failed VM.
21. The apparatus of claim 20, wherein the processing module is further configured to isolate the failed VM from the second storage node.
22. The apparatus according to any of claims 16-21, wherein if the status of at least one VM of the plurality of VMs is a failure status, the transceiver module is further configured to send a second request to the first storage node, where the second request is used to instruct the first storage node to snapshot the traffic data of the failed VM stored in the memory of the host that carries the failed VM;
the transceiver module is further configured to receive a request response sent by the first storage node, where the request response indicates that the first storage node has written the service data of the faulty VM into a memory space of the first storage node.
23. A fault handling apparatus, wherein a VM cluster of a virtual machine includes a management node, a first storage node, and a plurality of VMs, the first storage node is configured to save a state of each VM, the apparatus is applied to the first storage node, and the apparatus includes:
a transceiver module, configured to receive states of the VMs, sent by the plurality of hosts of the VM cluster;
the transceiver module is further configured to send the states of the plurality of VMs to the management node.
24. The apparatus according to claim 23, wherein the transceiver module is specifically configured to receive a first request sent by the management node;
the transceiver module is specifically configured to send the states of the VMs to the management node according to the first request.
25. The apparatus of claim 23, wherein the transceiver module is specifically configured to periodically send the state of the plurality of VMs to the management node.
26. The apparatus according to any of claims 23-25, wherein the transceiver module is further configured to receive a second request sent by the management node, the second request comprising an address of at least one failed VM of the plurality of VMs;
the device further comprises: the processing module is used for snapshotting service data stored in a memory of a fault host according to the second request to obtain a VM memory snapshot, wherein the fault host is a host bearing a fault VM in the VM cluster;
the processing module is further configured to write the VM memory snapshot into a memory space of the first storage node;
the transceiver module is further configured to send a second request response to the management node, where the second request response indicates that the first storage node has written the VM memory snapshot into the memory space.
27. The apparatus of any of claims 23-26, wherein the first storage node is further configured to maintain a hardware device address for each host in the VM cluster.
28. The apparatus of claim 27, wherein the hardware device address is stored in an address space of the first storage node;
the transceiver module is further configured to receive an address reading request sent by the management node;
the processing module is further configured to read a hardware device address of a faulty host from the address space according to the address reading request, where the faulty host is a host that carries a faulty VM in the VM cluster;
the transceiver module is further configured to send the hardware device address of the faulty host to the management node.
29. The apparatus of any of claims 23-28, wherein at least one memory in the VM cluster is configured to implement the storage function of the first storage node.
30. The apparatus of claim 29, wherein the memory is a persistent memory (PMEM).
31. A communications device comprising a processor and interface circuitry, the processor receiving or transmitting data via the interface circuitry, the processor being configured to implement the method of any of claims 1-7 or the method of any of claims 8-15 via logic circuitry or executing code instructions.
32. A computer-readable storage medium, in which a computer program or instructions is stored, which, when executed by a communication device, implements the method of any of claims 1-7, or the method of any of claims 8-15.
CN202110396996.2A 2021-04-13 2021-04-13 Fault processing method and device Pending CN115202803A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110396996.2A CN115202803A (en) 2021-04-13 2021-04-13 Fault processing method and device
PCT/CN2022/086626 WO2022218346A1 (en) 2021-04-13 2022-04-13 Fault processing method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110396996.2A CN115202803A (en) 2021-04-13 2021-04-13 Fault processing method and device

Publications (1)

Publication Number Publication Date
CN115202803A true CN115202803A (en) 2022-10-18

Family

ID=83571603

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110396996.2A Pending CN115202803A (en) 2021-04-13 2021-04-13 Fault processing method and device

Country Status (2)

Country Link
CN (1) CN115202803A (en)
WO (1) WO2022218346A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117155938B (en) * 2023-10-30 2024-01-12 北京腾达泰源科技有限公司 Cluster node fault reporting method, device, equipment and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9703652B2 (en) * 2014-06-07 2017-07-11 Vmware, Inc. VM and host management function availability during management network failure in host computing systems in a failover cluster
CN104253860B (en) * 2014-09-11 2017-08-08 武汉噢易云计算股份有限公司 A kind of virtual machine high availability implementation method based on shared storage message queue
CN105024879B (en) * 2015-07-15 2018-03-23 中国船舶重工集团公司第七0九研究所 Virtual-machine fail detection, recovery system and virtual machine testing, recovery, start method
CN113127137A (en) * 2019-12-30 2021-07-16 中标软件有限公司 Cloud computing management platform using self-hosting virtual machine and creation implementation method thereof
CN111176792B (en) * 2019-12-31 2023-11-17 华为技术有限公司 Resource scheduling method and device and related equipment
CN112148485A (en) * 2020-09-16 2020-12-29 杭州安恒信息技术股份有限公司 Fault recovery method and device for super-fusion platform, electronic device and storage medium

Also Published As

Publication number Publication date
WO2022218346A1 (en) 2022-10-20


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination