WO2017054626A1 - Fault recovery method and device for virtual machine - Google Patents

Publication number
WO2017054626A1
Authority
WO
WIPO (PCT)
Prior art keywords
virtual machine
data disk
original
new virtual
original virtual
Prior art date
Application number
PCT/CN2016/098341
Other languages
French (fr)
Chinese (zh)
Inventor
谢军勇
阳代平
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2017054626A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance

Definitions

  • the present invention relates to the field of virtual technologies, and in particular, to a virtual machine fault repair method and apparatus.
  • VMs virtual machines
  • a VM may suffer a system disk failure while it is running.
  • for example, the system disk may fail because of the guest operating system (Guest OS).
  • when the VM system disk fails, the VM cannot work normally, and services are impaired.
  • the current solution for handling VM failures is mainly through the following methods:
  • after the VM fails, the VM is set to boot from the network and requests an IP address from a Dynamic Host Configuration Protocol (DHCP) server.
  • the VM uses the obtained IP address to connect to a Trivial File Transfer Protocol (TFTP) server and download a micro operating system (micro OS).
  • the VM boots from the micro OS and reinstalls the production operating system (OS); after the production OS is installed, the VM is restarted and installation of the APP can continue.
  • OS operating system
  • the IP of the VM is required to be allocated by the cloud's infrastructure layer.
  • to avoid conflicts, the tenant's VM cannot enable the DHCP service. Therefore, the tenant's VM can no longer rely on the DHCP service for VM failure recovery. It can be seen that the above VM fault repair method cannot repair a VM in a scenario without DHCP service.
  • the embodiment of the invention provides a virtual machine fault repairing method and device, which can perform fault repair on a VM in a DHCP-free service scenario.
  • an embodiment of the present invention provides a virtual machine fault repairing method, including:
  • MAC Media Access Control
  • the new virtual machine is started, wherein when the new virtual machine partitions the data disk, the reserved area where the service data in the data disk is located is not formatted.
  • the method further includes:
  • the creating a new virtual machine by using the image template of the source virtual machine includes:
  • the data disk includes a partition table, and the partition table indicates the reserved area where the service data of the data disk is located.
  • the reserved area where the service data in the data disk is located is not formatted.
  • the area of the data disk other than the reserved area where the service data is located is formatted.
  • the mounting of the data disk of the original virtual machine to the new virtual machine includes:
  • the method further includes:
  • an embodiment of the present invention provides a virtual machine fault repair apparatus, including: a creating unit, a setting unit, and a starting unit, where:
  • the creating unit is configured to create a new virtual machine by using an image template when it is detected that the original virtual machine is faulty;
  • the setting unit is configured to set a MAC address of the original virtual machine to a MAC address of the new virtual machine, and mount the data disk of the original virtual machine to the new virtual machine;
  • the startup unit is configured to start the new virtual machine, where the new virtual machine does not format a reserved area of the service data in the data disk when the data disk is partitioned.
  • the device further includes:
  • a detecting unit, configured to: when it is detected that the time for which the original virtual machine and the high availability (HA) arbitration module have stopped transmitting message packets exceeds a first time threshold, reset the original virtual machine, and detect whether, after the reset, the time for which the original virtual machine and the HA arbitration module have stopped transmitting message packets exceeds a second time threshold; and if so, determine that the original virtual machine is faulty.
  • the creating unit is configured to create a new virtual machine that includes only the system disk by using the image template.
  • the data disk includes a partition table, and the partition table indicates the reserved area where the service data of the data disk is located; when partitioning the data disk, the new virtual machine, according to the partition table, does not format the reserved area where the service data in the data disk is located, and formats the area of the data disk other than that reserved area.
  • the setting unit is further configured to detach the data disk from the original virtual machine and mount the detached data disk to the new virtual machine;
  • the device further includes:
  • a deleting unit, configured to delete the original virtual machine.
  • when it is detected that the original virtual machine is faulty, a new virtual machine is created by using the image template; the media access control (MAC) address of the original virtual machine is set as the MAC address of the new virtual machine, and the data disk of the original virtual machine is mounted to the new virtual machine; the new virtual machine is started, where, when partitioning the data disk, the new virtual machine does not format the reserved area where the service data in the data disk is located.
  • FIG. 1 is a schematic flowchart of a virtual machine fault repairing method according to an embodiment of the present invention
  • FIG. 2 is a schematic flowchart of another virtual machine fault repairing method according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of another virtual machine fault repairing method according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a virtual machine fault repairing apparatus according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of another virtual machine fault repair apparatus according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of another virtual machine fault repair apparatus according to an embodiment of the present invention.
  • FIG. 7 is a schematic structural diagram of another virtual machine fault repair apparatus according to an embodiment of the present invention.
  • FIG. 1 is a schematic flowchart of a method for repairing a fault of a virtual machine according to an embodiment of the present invention. As shown in FIG. 1 , the method includes the following steps:
  • the image template may be the image template that was used to create the original virtual machine, and it may be stored in advance. In this way, step 101 can create the new virtual machine directly from the image template.
  • creating here may specifically mean creating the system disk of the new virtual machine.
  • the MAC address of the new virtual machine is the same as the MAC address of the original virtual machine, and the data disk of the new virtual machine is the data disk of the original virtual machine, so that the created new virtual machine and the original virtual machine are effectively the same virtual machine.
  • the new virtual machine is started, where the new virtual machine does not format a reserved area of the service data in the data disk when the data disk is partitioned.
  • the data disk can be partitioned. Because the reserved area holding the service data in the data disk is not formatted, the service data of the original virtual machine is preserved and its loss is avoided. In this way, the new virtual machine can use the service data of the original virtual machine when it runs, which can be understood as repairing the original virtual machine.
  • the foregoing method may be applied to a network function virtualization (NFV) distributed architecture; that is, the foregoing method may be implemented by one or more network devices in the NFV distributed architecture, for example, servers, computers, laptop computers, in-vehicle devices, network televisions, and other network devices.
  • NFV network function virtualization
  • when it is detected that the original virtual machine is faulty, a new virtual machine is created by using the image template; the media access control (MAC) address of the original virtual machine is set as the MAC address of the new virtual machine, and the data disk of the original virtual machine is mounted to the new virtual machine; the new virtual machine is started, where, when partitioning the data disk, the new virtual machine does not format the reserved area where the service data in the data disk is located.
  • FIG. 2 is a schematic flowchart of another virtual machine fault repairing method according to an embodiment of the present invention. As shown in FIG. 2, the method includes the following steps:
  • when it is detected that the time for which the original virtual machine and the HA arbitration module have stopped transmitting message packets exceeds the first time threshold, the original virtual machine is reset, and it is detected whether, after the reset, the time for which the original virtual machine and the HA arbitration module have stopped transmitting message packets exceeds the second time threshold; if so, it is determined that the original virtual machine has failed. The flow may be ended.
  • the original virtual machine and the HA arbitration module stopping transmission of message packets may be understood as an interruption of the heartbeat between the original virtual machine and the HA arbitration module.
  • the message packet may be any message packet transmitted by the original virtual machine and the HA arbitration module.
  • the arbitration module can be an HA arbitration module in the NFV distributed architecture.
  • detecting whether, after the original virtual machine is reset, the time for which the original virtual machine and the HA arbitration module have stopped transmitting message packets exceeds the second time threshold may be understood as follows: the reset of the original virtual machine is taken as the start time of a timer, and if the original virtual machine still has not transmitted a message packet with the HA arbitration module when the timer reaches the second time threshold, it is determined that the original virtual machine has failed.
  • the first time threshold and the second time threshold may be preset time thresholds.
  • through step 201, faults of the virtual machine are detected automatically, manual detection is avoided, and the timeliness of fault alarms is improved.
  • the HA arbitration module may detect that the original virtual machine is faulty and send a message requesting a rebuild of the virtual machine to the Management and Orchestration (MANO) module of the NFV distributed architecture.
  • the MANO module can notify the infrastructure layer (I layer) of the NFV distributed architecture to create a new virtual machine, and the I layer creates the new virtual machine by using the image template.
  • the creating a new virtual machine by using the image template of the source virtual machine may include:
  • the data disk may be all data disks except the system disk in the original virtual machine.
  • mounting the data disk of the original virtual machine to the new virtual machine may be understood as using the data disk of the original virtual machine as the data disk of the new virtual machine.
  • the mounting may be performed by the I layer through an Application Programming Interface (API). Specifically, the I layer finds the data disk of the original virtual machine through the site identifier and the description file of the original virtual machine, and then uses the API to mount the data disk of the original virtual machine to the new virtual machine.
  • API Application Programming Interface
  • the step of mounting the data disk of the original virtual machine to the new virtual machine may include:
  • the foregoing method may further include:
  • the virtual machine can be automatically deleted during the virtual machine repair process, thus improving efficiency.
  • the new virtual machine is started, where the new virtual machine does not format a reserved area of service data in the data disk when partitioning the data disk.
  • the data disk may include a partition table, and the reserved area in which the service data of the data disk is located may be indicated in the partition table.
  • the partition description module indicates, in the partition table, an area in which the service data of the data disk is located as a reserved area.
  • according to the partition table, the new virtual machine does not format the reserved area where the service data in the data disk is located, and formats the area of the data disk other than that reserved area.
  • the new virtual machine can re-install the application (Application, APP) on the data disk, thereby restoring the entire virtual machine.
  • APP Application
  • the installed APP can also retain the data recorded on the original virtual machine.
  • file system check and repair can also be performed on the reserved area. It can be formatted when the check and repair fails.
  • for virtual machines deployed in a service active/standby configuration, the virtual machine fault repair process does not introduce a process of redistributing services to the new virtual machine.
  • in this embodiment, a plurality of optional implementations are added on the basis of the embodiment shown in FIG. 1, and each of them can repair a VM in a scenario without DHCP service.
  • FIG. 3 is a schematic diagram of another virtual machine fault repairing method according to an embodiment of the present invention. As shown in FIG. 3, the method includes the following steps:
  • the APP VM detects whether the heartbeat interruption between the original VM and the HA arbitration module exceeds T1 seconds. If yes, step 302 is performed; the APP VM may be understood as a function module for installing the VM.
  • the APP VM detects whether the heartbeat interruption between the original VM and the HA arbitration module exceeds T2 seconds, and if yes, step 304 is performed;
  • the APP VM sends a message to the MANO to rebuild the VM system disk.
  • the step may be that the HA arbitration module notifies the MANO module to send a message for rebuilding the VM system disk.
  • the MANO module notifies the I layer to use the image to create the VM.
  • the I layer module creates a new VM with only the system disk
  • the MANO module notifies the I layer module to change the MAC address of the new VM to that of the original VM.
  • the I layer module changes the MAC address of the new VM to be the same as that of the original VM.
  • the MANO module notifies the I layer module to detach the data disk from the original VM and mount the data disk to the new VM.
  • the I layer module detaches the data disk from the original VM and mounts the data disk to the new VM;
  • the MANO module notifies the I layer module to delete the original VM
  • the I layer module deletes the original VM
  • the MANO module notifies the APP VM that the reconstruction is successful
  • the APP VM starts a new VM from the system disk.
  • the APP VM installs the APP on the data disk of the new VM.
  • the APP VM, the MANO module, and the I layer module may be functional modules located in the same network device or in different network devices in the NFV distributed architecture.
  • the following are apparatus embodiments of the present invention, which are used to perform the methods of method Embodiments 1 and 2 of the present invention.
  • for ease of description, only the parts related to the embodiments of the present invention are shown; for specific technical details that are not disclosed, refer to Embodiment 1 and Embodiment 2 of the present invention.
  • FIG. 4 is a schematic structural diagram of a virtual machine fault repair apparatus according to an embodiment of the present invention. As shown in FIG. 4, the apparatus includes a creating unit 41, a setting unit 42, and a starting unit 43, where:
  • the creating unit 41 is configured to create a new virtual machine by using an image template when it is detected that the original virtual machine is faulty.
  • the image template may be the image template that was used to create the original virtual machine, and it may be stored in advance. In this way, the creating unit 41 can create the new virtual machine directly from the image template.
  • creating here may specifically mean creating the system disk of the new virtual machine.
  • the setting unit 42 is configured to set the media access control (MAC) address of the original virtual machine as the MAC address of the new virtual machine, and mount the data disk of the original virtual machine to the new virtual machine.
  • through the setting unit 42, the MAC address of the new virtual machine is the same as the MAC address of the original virtual machine, and the data disk of the new virtual machine is the data disk of the original virtual machine, so that the created new virtual machine and the original virtual machine are effectively the same virtual machine.
  • the startup unit 43 is configured to start the new virtual machine, where the new virtual machine does not format the reserved area of the service data in the data disk when the data disk is partitioned.
  • the data disk can be partitioned. Because the reserved area holding the service data in the data disk is not formatted, the service data of the original virtual machine is preserved and its loss is avoided. In this way, the new virtual machine can use the service data of the original virtual machine when it runs, which can be understood as repairing the original virtual machine.
  • the foregoing apparatus may be applied to a network function virtualization (NFV) distributed architecture; that is, the foregoing apparatus may be implemented by one or more network devices in the NFV distributed architecture, for example, servers, computers, laptop computers, in-vehicle devices, network televisions, and other network devices.
  • NFV network function virtualization
  • when it is detected that the original virtual machine is faulty, a new virtual machine is created by using the image template; the media access control (MAC) address of the original virtual machine is set as the MAC address of the new virtual machine, and the data disk of the original virtual machine is mounted to the new virtual machine; the new virtual machine is started, where, when partitioning the data disk, the new virtual machine does not format the reserved area where the service data in the data disk is located.
  • FIG. 5 is a schematic structural diagram of another virtual machine fault repair apparatus according to an embodiment of the present invention. As shown in FIG. 5, the apparatus includes a detecting unit 51, a creating unit 52, a setting unit 53, and a starting unit 54, where:
  • the detecting unit 51 is configured to: when it is detected that the time for which the original virtual machine and the high availability (HA) arbitration module have stopped transmitting message packets exceeds a first time threshold, reset the original virtual machine, and detect whether, after the reset, the time for which the original virtual machine and the HA arbitration module have stopped transmitting message packets exceeds the second time threshold; and if so, determine that the original virtual machine is faulty.
  • the original virtual machine and the HA arbitration module stopping transmission of message packets may be understood as an interruption of the heartbeat between the original virtual machine and the HA arbitration module.
  • the message packet may be any message packet transmitted by the original virtual machine and the HA arbitration module.
  • the arbitration module can be an HA arbitration module in the NFV distributed architecture.
  • the check against the second time threshold can be understood as follows: the reset of the original virtual machine is taken as the start time of a timer.
  • if the original virtual machine still has not transmitted a message packet with the HA arbitration module when the timer reaches the second time threshold, it is determined that the original virtual machine has failed.
  • the first time threshold and the second time threshold may be preset time thresholds.
  • through the detecting unit 51, faults of the virtual machine are detected automatically, manual detection is avoided, and the timeliness of fault alarms is improved.
  • the creating unit 52 is configured to create a new virtual machine by using an image template when it is detected that the original virtual machine is faulty.
  • the HA arbitration module may detect that the original virtual machine is faulty and send a message requesting a rebuild of the virtual machine to the MANO module in the NFV distributed architecture. After receiving the message, the MANO module can notify the infrastructure layer (I layer) of the NFV distributed architecture to create a new virtual machine, and the I layer creates the new virtual machine by using the image template.
  • I layer infrastructure layer
  • the creation unit 52 can be used to create a new virtual machine that includes only system disks using a mirror template.
  • the setting unit 53 is configured to set a media access control MAC address of the original virtual machine to a MAC address of the new virtual machine, and mount the data disk of the original virtual machine to the new virtual machine.
  • the data disk may be all data disks except the system disk in the original virtual machine.
  • mounting the data disk of the original virtual machine to the new virtual machine may be understood as using the data disk of the original virtual machine as the data disk of the new virtual machine.
  • the mounting may be performed by the I layer through an API. Specifically, the I layer finds the data disk of the original virtual machine through the site identifier and the description file of the original virtual machine, and then uses the API to mount the data disk of the original virtual machine to the new virtual machine.
  • the setting unit 53 is further configured to detach the data disk from the original virtual machine and mount the detached data disk to the new virtual machine;
  • the foregoing apparatus may further include:
  • the deleting unit 55 is configured to delete the original virtual machine.
  • the virtual machine can be automatically deleted during the virtual machine repair process, thus improving efficiency.
  • the startup unit 54 is configured to start the new virtual machine, where the new virtual machine does not format the reserved area of the service data in the data disk when the data disk is partitioned.
  • the data disk may include a partition table, and the reserved area in which the service data of the data disk is located may be indicated in the partition table.
  • the partition description module indicates, in the partition table, an area in which the service data of the data disk is located as a reserved area.
  • according to the partition table, the new virtual machine does not format the reserved area where the service data in the data disk is located, and formats the area of the data disk other than that reserved area.
  • the new virtual machine can re-install the APP on the data disk, thereby restoring the entire virtual machine. For example, use the business data in the reserved area to install the APP installed on the original virtual machine. In addition, since the business data is reserved, the installed APP can also retain the data recorded on the original virtual machine.
  • file system check and repair can also be performed on the reserved area. It can be formatted when the check and repair fails.
  • for virtual machines deployed in a service active/standby configuration, the virtual machine fault repair process does not introduce a process of redistributing services to the new virtual machine.
  • in this embodiment, a plurality of optional implementations are added on the basis of the embodiment shown in FIG. 4, and each of them can repair a VM in a scenario without DHCP service.
  • FIG. 7 is a schematic structural diagram of another virtual machine fault repair apparatus according to an embodiment of the present invention.
  • the apparatus includes a processor 71, a network interface 72, a memory 73, and a communication bus 74.
  • the communication bus 74 is configured to implement connection communication between the processor 71, the network interface 72, and the memory 73, and the processor 71 executes a program stored in the memory 73 for implementing the following method:
  • the new virtual machine is started, wherein when the new virtual machine partitions the data disk, the reserved area where the service data in the data disk is located is not formatted.
  • the program executed by the processor 71 may further include:
  • when it is detected that the time for which the original virtual machine and the high availability (HA) arbitration module have stopped transmitting message packets exceeds the first time threshold, resetting the original virtual machine, and detecting whether, after the reset, the time for which the original virtual machine and the HA arbitration module have stopped transmitting message packets exceeds a second time threshold; and if so, determining that the original virtual machine has failed.
  • the processor 71 executes a program for creating a new virtual machine by using the image template of the source virtual machine, and may include:
  • the data disk may include a partition table, where the partition table indicates the reserved area where the service data of the data disk is located; when partitioning the data disk, the new virtual machine, according to the partition table, does not format the reserved area where the service data in the data disk is located, and formats the area of the data disk other than that reserved area.
  • the program executed by the processor 71 to mount the data disk of the original virtual machine to the new virtual machine may include:
  • the program executed by the processor 71 may further include:
  • when it is detected that the original virtual machine is faulty, a new virtual machine is created by using the image template; the media access control (MAC) address of the original virtual machine is set as the MAC address of the new virtual machine, and the data disk of the original virtual machine is mounted to the new virtual machine; the new virtual machine is started, where, when partitioning the data disk, the new virtual machine does not format the reserved area where the service data in the data disk is located.
  • repairing the faulty virtual machine in this way avoids relying on a DHCP network, so that a VM in a scenario without DHCP service can be repaired.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
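The partition-table-driven handling of the data disk described in the embodiments above can be illustrated with a minimal Python sketch. It models only the decision logic: partitions marked as reserved are left alone, everything else is formatted, and a reserved partition is formatted only when its file system check and repair fails. The `Partition` class, the `prepare_data_disk` function, and the `fsck` callback are hypothetical names introduced for illustration; the patent does not specify any data structure or API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Partition:
    """One entry of a (hypothetical) in-memory partition table."""
    name: str
    reserved: bool          # True if the partition table marks it as holding service data
    formatted: bool = False


def prepare_data_disk(
    table: List[Partition],
    fsck: Callable[[Partition], bool] = lambda part: True,
) -> List[str]:
    """Format every non-reserved partition and return the names that were formatted.

    A reserved partition is skipped when its file system check and repair
    succeeds (fsck returns True); it is formatted only as a fallback when
    the check and repair fails.
    """
    formatted_names = []
    for part in table:
        if part.reserved and fsck(part):
            continue  # service data is preserved untouched
        part.formatted = True
        formatted_names.append(part.name)
    return formatted_names


table = [
    Partition("service-data", reserved=True),
    Partition("app", reserved=False),
    Partition("logs", reserved=False),
]
print(prepare_data_disk(table))
```

In this sketch the default `fsck` always succeeds, so only the non-reserved partitions are formatted; passing a callback that returns `False` models the fallback case in which the reserved area cannot be repaired.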

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Hardware Redundancy (AREA)
  • Stored Programmes (AREA)

Abstract

A fault recovery method and device for a virtual machine, the method comprising: when it is detected that an original virtual machine has failed, using an image template to create a new virtual machine (101); setting the MAC address of the original virtual machine as the MAC address of the new virtual machine, and mounting the data disk of the original virtual machine to the new virtual machine (102); and starting the new virtual machine (103), wherein when the data disk is partitioned in the new virtual machine, the reserved area where the service data in the data disk is located is not formatted. The method can be used to repair a faulty VM in a scenario without DHCP service.

Description

Virtual machine fault repair method and apparatus
This application claims priority to Chinese Patent Application No. 201510638436.8, entitled "Virtual machine fault repair method and apparatus" and filed with the Chinese Patent Office on September 30, 2015, which is incorporated herein by reference in its entirety.
Technical Field
The present invention relates to the field of virtualization technologies, and in particular, to a virtual machine fault repair method and apparatus.
Background
With the development of virtualization technologies, virtual machines (VMs) are used more and more widely. However, a VM may suffer a system disk failure while it is running; for example, the system disk may fail because of the guest operating system (Guest OS). After the VM system disk fails, the VM cannot work normally, and services are impaired. Currently, VM failures are mainly handled as follows:
After the VM fails, the VM is set to boot from the network and requests an IP address from a Dynamic Host Configuration Protocol (DHCP) server. The VM uses the obtained IP address to connect to a Trivial File Transfer Protocol (TFTP) server and download a micro operating system (micro OS). The VM boots from the micro OS and reinstalls the production operating system (OS). After the production OS is installed, the VM is restarted and installation of the APP can continue. Through these steps, the entire VM can be reinstalled and services are restored automatically, so that the VM failure is recovered.
However, in some real production environments (such as a cloud built by a carrier), security considerations require that the VM's IP address be allocated by the cloud's infrastructure layer. To avoid conflicts, the tenant's VM cannot enable the DHCP service. Therefore, the tenant's VM can no longer rely on the DHCP service for VM failure recovery. It can be seen that the above VM fault repair method cannot repair a VM in a scenario without DHCP service.
Summary of the Invention
Embodiments of the present invention provide a virtual machine fault repair method and apparatus that can repair a faulty VM in a scenario without a DHCP service.
According to a first aspect, an embodiment of the present invention provides a virtual machine fault repair method, including:
when a fault of an original virtual machine is detected, creating a new virtual machine by using an image template;
setting the Media Access Control (MAC) address of the original virtual machine as the MAC address of the new virtual machine, and mounting the data disk of the original virtual machine to the new virtual machine; and
starting the new virtual machine, where, when partitioning the data disk, the new virtual machine does not format the reserved area of the data disk in which service data is located.
In a first possible implementation of the first aspect, the method further includes:
when it is detected that the time for which the original virtual machine and a high availability (HA) arbitration module have stopped exchanging message packets exceeds a first time threshold, resetting the original virtual machine, and detecting whether, after the original virtual machine is reset, the time for which the original virtual machine and the HA arbitration module have stopped exchanging message packets exceeds a second time threshold; and if so, determining that the original virtual machine is faulty.
With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the creating a new virtual machine by using an image template includes:
creating, by using the image template, a new virtual machine that includes only a system disk.
With reference to the first aspect or the first possible implementation of the first aspect, in a third possible implementation of the first aspect, the data disk includes a partition table, and the partition table indicates the reserved area in which the service data of the data disk is located. When partitioning the data disk, the new virtual machine, according to the partition table, does not format the reserved area in which the service data is located, and formats the areas of the data disk other than that reserved area.
With reference to the first aspect or the first possible implementation of the first aspect, in a fourth possible implementation of the first aspect, the mounting the data disk of the original virtual machine to the new virtual machine includes:
unmounting the data disk from the original virtual machine, and mounting the unmounted data disk to the new virtual machine;
and the method further includes:
deleting the original virtual machine.
According to a second aspect, an embodiment of the present invention provides a virtual machine fault repair apparatus, including a creating unit, a setting unit, and a starting unit, where:
the creating unit is configured to, when a fault of an original virtual machine is detected, create a new virtual machine by using an image template;
the setting unit is configured to set the MAC address of the original virtual machine as the MAC address of the new virtual machine, and mount the data disk of the original virtual machine to the new virtual machine; and
the starting unit is configured to start the new virtual machine, where, when partitioning the data disk, the new virtual machine does not format the reserved area of the data disk in which service data is located.
In a first possible implementation of the second aspect, the apparatus further includes:
a detecting unit, configured to: when it is detected that the time for which the original virtual machine and a high availability (HA) arbitration module have stopped exchanging message packets exceeds a first time threshold, reset the original virtual machine, and detect whether, after the original virtual machine is reset, the time for which the original virtual machine and the HA arbitration module have stopped exchanging message packets exceeds a second time threshold; and if so, determine that the original virtual machine is faulty.
With reference to the second aspect or the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the creating unit is configured to create, by using the image template, a new virtual machine that includes only a system disk.
With reference to the second aspect or the first possible implementation of the second aspect, in a third possible implementation of the second aspect, the data disk includes a partition table, and the partition table indicates the reserved area in which the service data of the data disk is located. When partitioning the data disk, the new virtual machine, according to the partition table, does not format the reserved area in which the service data is located, and formats the areas of the data disk other than that reserved area.
With reference to the second aspect or the first possible implementation of the second aspect, in a fourth possible implementation of the second aspect, the setting unit is further configured to unmount the data disk from the original virtual machine, and mount the unmounted data disk to the new virtual machine;
and the apparatus further includes:
a deleting unit, configured to delete the original virtual machine.
In the foregoing technical solution, when a fault of the original virtual machine is detected, a new virtual machine is created by using an image template; the MAC address of the original virtual machine is set as the MAC address of the new virtual machine, and the data disk of the original virtual machine is mounted to the new virtual machine; and the new virtual machine is started, where, when partitioning the data disk, the new virtual machine does not format the reserved area of the data disk in which service data is located. The repair process therefore does not rely on DHCP network boot to repair the faulty virtual machine, so a VM can be repaired in a scenario without a DHCP service.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the following briefly introduces the accompanying drawings needed for describing the embodiments or the prior art. Apparently, the accompanying drawings in the following description show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from these accompanying drawings without creative effort.
FIG. 1 is a schematic flowchart of a virtual machine fault repair method according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of another virtual machine fault repair method according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of another virtual machine fault repair method according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a virtual machine fault repair apparatus according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of another virtual machine fault repair apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of another virtual machine fault repair apparatus according to an embodiment of the present invention; and
FIG. 7 is a schematic structural diagram of another virtual machine fault repair apparatus according to an embodiment of the present invention.
Detailed Description
The following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a virtual machine fault repair method according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
101. When a fault of the original virtual machine is detected, create a new virtual machine by using an image template.
In this embodiment, the image template may be the image template that was used to create the original virtual machine, and the image template may be stored in advance. Step 101 can then create the new virtual machine directly from this image template. In addition, the creation here may specifically be the creation of the system disk of the new virtual machine.
102. Set the MAC address of the original virtual machine as the MAC address of the new virtual machine, and mount the data disk of the original virtual machine to the new virtual machine.
After step 102, the new virtual machine has the same MAC address as the original virtual machine, and the data disk of the new virtual machine is the data disk of the original virtual machine, so that the created new virtual machine is in effect the same virtual machine as the original virtual machine.
103. Start the new virtual machine, where, when partitioning the data disk, the new virtual machine does not format the reserved area of the data disk in which service data is located.
After the new virtual machine is started, it can partition the data disk. Because the reserved area in which the service data is located is not formatted, the service data of the original virtual machine is preserved and is not lost. When the new virtual machine runs, it can use the service data of the original virtual machine, which can be understood as repairing the original virtual machine.
In this embodiment, the method may be applied in a Network Function Virtualization (NFV) distributed architecture; that is, the method may be implemented by one or more network devices in the NFV distributed architecture, for example, a server, a computer, a laptop, an in-vehicle device, or a network television.
In this embodiment, when a fault of the original virtual machine is detected, a new virtual machine is created by using an image template; the MAC address of the original virtual machine is set as the MAC address of the new virtual machine, and the data disk of the original virtual machine is mounted to the new virtual machine; and the new virtual machine is started, where, when partitioning the data disk, the new virtual machine does not format the reserved area of the data disk in which service data is located. The repair process therefore does not rely on DHCP network boot to repair the faulty virtual machine, so a VM can be repaired in a scenario without a DHCP service.
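Steps 101 to 103 can be illustrated with a minimal in-memory sketch. The `Disk` and `VM` classes and the `repair_vm` function below are hypothetical names introduced only for illustration; they model the state changes and do not correspond to any real hypervisor or NFV interface. Detaching the data disk from the original virtual machine before attaching it is an assumption borrowed from the optional implementation described for FIG. 2.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Disk:
    name: str
    attached_to: Optional[str] = None  # name of the VM the disk is attached to


@dataclass
class VM:
    name: str
    mac: str
    system_disk: Disk
    data_disks: List[Disk] = field(default_factory=list)
    running: bool = False


def repair_vm(original: VM, image_template: str) -> VM:
    """Hypothetical sketch of steps 101-103 (not a real hypervisor API)."""
    # Step 101: create a new VM from the image template (system disk only).
    new_name = original.name + "-new"
    system_disk = Disk(name=f"{new_name}-sys-{image_template}", attached_to=new_name)
    new_vm = VM(name=new_name, mac="", system_disk=system_disk)

    # Step 102: give the new VM the original MAC address and move the data disks.
    new_vm.mac = original.mac
    for disk in list(original.data_disks):
        original.data_disks.remove(disk)   # assumed: detach from the original VM first
        disk.attached_to = new_vm.name
        new_vm.data_disks.append(disk)

    # Step 103: start the new VM; the reserved area is left untouched at boot.
    new_vm.running = True
    return new_vm
```

Because the new virtual machine keeps the original MAC address, an infrastructure layer that allocates IP addresses per MAC address could, under that assumption, hand out the same address without any DHCP service inside the tenant VM.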
Referring to FIG. 2, FIG. 2 is a schematic flowchart of another virtual machine fault repair method according to an embodiment of the present invention. As shown in FIG. 2, the method includes the following steps:
201. When it is detected that the time for which the original virtual machine and the HA arbitration module have stopped exchanging message packets exceeds a first time threshold, reset the original virtual machine, and detect whether, after the original virtual machine is reset, the time for which the original virtual machine and the HA arbitration module have stopped exchanging message packets exceeds a second time threshold; if so, it is determined that the original virtual machine is faulty, and the detection procedure may then end.
In this embodiment, the original virtual machine and the HA arbitration module stopping the exchange of message packets can be understood as an interruption of the heartbeat between the original virtual machine and the HA arbitration module. In addition, the message packet may be any message packet exchanged between the original virtual machine and the HA arbitration module, and the HA arbitration module may be an HA arbitration module in the NFV distributed architecture.
Detecting whether the time for which the original virtual machine and the HA arbitration module have stopped exchanging message packets after the reset exceeds the second time threshold can be understood as follows: the reset of the original virtual machine is the start time of a timer, and if the original virtual machine still has not exchanged a message packet with the HA arbitration module when the timer reaches the second time threshold, it is determined that the original virtual machine is faulty.
The first time threshold and the second time threshold may be preset time thresholds.
Step 201 detects virtual machine faults automatically, which avoids manual detection and makes fault alarms more timely.
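The two-stage check in step 201 can be sketched as follows. The function name `detect_fault` and the `reset_and_wait` callback are assumptions made for illustration; the callback stands for "reset the original VM, start a timer, and report whether a heartbeat arrived before the timer reached the second threshold".

```python
from typing import Callable


def detect_fault(
    silence_seconds: float,
    t1: float,
    reset_and_wait: Callable[[float], bool],
    t2: float,
) -> bool:
    """Return True if the original VM should be declared faulty.

    silence_seconds: how long no message packet has been exchanged so far.
    reset_and_wait(t2): hypothetical hook that resets the VM and returns True
    if a message packet is exchanged within t2 seconds after the reset.
    """
    # Stage 1: silence has not exceeded the first threshold, so do nothing.
    if silence_seconds <= t1:
        return False
    # Stage 2: reset the VM; it is faulty only if it stays silent past t2.
    heartbeat_resumed = reset_and_wait(t2)
    return not heartbeat_resumed
```

The reset acts as a cheap first recovery attempt, so a rebuild is triggered only when the virtual machine does not come back on its own.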
202. When a fault of the original virtual machine is detected, create a new virtual machine by using an image template.
In this embodiment, the fault of the original virtual machine may be detected by the HA arbitration module, and the HA arbitration module may send a message requesting that the virtual machine be rebuilt to the Management and Orchestration (MANO) module in the NFV distributed architecture. After receiving the message, the MANO module can notify the infrastructure layer (I layer) of the NFV distributed architecture to create a new virtual machine, and the I layer then creates the new virtual machine by using the image template.
In this embodiment, the creating a new virtual machine by using the image template may include:
creating, by using the image template, a new virtual machine that includes only a system disk.
In this implementation, a new virtual machine that includes only a system disk can be created.
203. Set the MAC address of the original virtual machine as the MAC address of the new virtual machine, and mount the data disk of the original virtual machine to the new virtual machine.
In this embodiment, the data disk may be all disks of the original virtual machine other than the system disk. In addition, mounting the data disk of the original virtual machine to the new virtual machine can be understood as using the data disk of the original virtual machine as the data disk of the new virtual machine. The mounting may be performed by the I layer through an application programming interface (API). Specifically, the I layer may find the data disk of the original virtual machine by using the site identifier and the description file of the original virtual machine, and then use the API to mount the data disk of the original virtual machine to the new virtual machine.
In this embodiment, the step of mounting the data disk of the original virtual machine to the new virtual machine may include:
unmounting the data disk from the original virtual machine, and mounting the unmounted data disk to the new virtual machine.
In addition, in this implementation, the method may further include:
deleting the original virtual machine.
In this way, the faulty virtual machine is deleted automatically during the repair process, which improves efficiency.
204. Start the new virtual machine, where, when partitioning the data disk, the new virtual machine does not format the reserved area of the data disk in which service data is located.
In this embodiment, the data disk may include a partition table, and the partition table may indicate the reserved area in which the service data of the data disk is located. For example, a partition description module marks, in the partition table, the area in which the service data of the data disk is located as the reserved area.
In addition, when partitioning the data disk, the new virtual machine, according to the partition table, does not format the reserved area in which the service data is located, and formats the areas of the data disk other than that reserved area.
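The selective formatting described above can be sketched as a filter over the partition table. The `Partition` record and its `reserved` flag are an assumed, simplified representation; the source does not specify the actual partition table layout or how the reserved area is marked.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Partition:
    name: str
    reserved: bool  # True if the partition table marks this area as holding service data


def partitions_to_format(partition_table: List[Partition]) -> List[str]:
    """Return the names of the areas that may be formatted at boot.

    Areas marked as reserved are skipped so that service data survives.
    """
    return [p.name for p in partition_table if not p.reserved]


if __name__ == "__main__":
    table = [
        Partition("app", reserved=False),
        Partition("service-data", reserved=True),
        Partition("log", reserved=False),
    ]
    print(partitions_to_format(table))
```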
In addition, after the necessary formatting of the data disk is complete, the new virtual machine can reinstall the application (APP) on the data disk, thereby restoring the entire virtual machine. For example, the APP that was installed on the original virtual machine is installed by using the service data in the reserved area. Because the service data is preserved, the installed APP can also retain the data recorded on the original virtual machine.
In addition, in this implementation, a file system check and repair can also be performed on the reserved area, and the reserved area can be formatted if the check and repair fail.
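The check-then-fall-back behavior for the reserved area can be sketched as below. Both callbacks are hypothetical hooks: `check_and_repair` stands for whatever file system check the guest uses, and `format_area` for the formatting routine; neither is named in the source.

```python
from typing import Callable


def handle_reserved_area(
    area: str,
    check_and_repair: Callable[[str], bool],
    format_area: Callable[[str], None],
) -> str:
    """Keep the reserved area if it checks out; format it only as a last resort."""
    if check_and_repair(area):
        return "kept"        # file system is usable, service data is preserved
    format_area(area)        # check and repair failed, so the data cannot be kept
    return "formatted"
```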
In addition, because the virtual machine fault recovery process in the foregoing implementations does not involve redistributing services to the new virtual machine, the implementations also apply to virtual machines whose services are deployed in active/standby mode.
This embodiment adds a plurality of optional implementations on the basis of the embodiment shown in FIG. 1, and all of them can repair a VM in a scenario without a DHCP service.
Referring to FIG. 3, FIG. 3 is a schematic diagram of another virtual machine fault repair method according to an embodiment of the present invention. As shown in FIG. 3, the method includes the following steps:
301. The APP VM detects whether the heartbeat between the original VM and the HA arbitration module has been interrupted for more than T1 seconds; if so, step 302 is performed. The APP VM can be understood as a functional module used to install the VM.
302. The original VM is reset.
303. The APP VM detects whether the heartbeat between the original VM and the HA arbitration module has been interrupted for more than T2 seconds; if so, step 304 is performed.
304. The APP VM sends a message for rebuilding the VM system disk to the MANO module. Specifically, in this step the HA arbitration module may notify the MANO module by sending the message for rebuilding the VM system disk.
305. The MANO module notifies the I layer to create a VM by using the image.
306. The I layer module creates a new VM that has only a system disk.
307. The MANO module notifies the I layer module to change the MAC address of the new VM to that of the original VM.
308. The I layer module changes the MAC address of the new VM to that of the original VM.
309. The MANO module notifies the I layer module to unmount the data disk from the original VM and mount it to the new VM.
3010. The I layer module unmounts the data disk from the original VM and mounts it to the new VM.
3011. The MANO module notifies the I layer module to delete the original VM.
3012. The I layer module deletes the original VM.
3013. The MANO module notifies the APP VM that the rebuild succeeded.
3014. The APP VM starts the new VM from the system disk.
3015. The APP VM installs the APP on the data disk of the new VM.
It should be noted that the APP VM, the MANO module, and the I layer module may be functional modules located in the same network device or in different network devices in the NFV distributed architecture.
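The ordering of steps 305 to 3013 between the MANO module and the I layer module can be traced with a small simulation. The `ILayer` and `Mano` classes and their method names are hypothetical stand-ins, not interfaces defined by the source or by any NFV specification; the sketch only records the order in which the I layer is asked to act.

```python
from typing import List


class ILayer:
    """Hypothetical stand-in for the infrastructure layer (I layer) module."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def create_vm_from_image(self) -> None:   # step 306
        self.log.append("create new VM with system disk only")

    def copy_mac(self) -> None:               # step 308
        self.log.append("set new VM MAC to original VM MAC")

    def move_data_disk(self) -> None:         # step 3010
        self.log.append("unmount data disk from original VM and mount to new VM")

    def delete_original_vm(self) -> None:     # step 3012
        self.log.append("delete original VM")


class Mano:
    """Hypothetical stand-in for the MANO module."""

    def __init__(self, i_layer: ILayer) -> None:
        self.i_layer = i_layer

    def rebuild(self) -> str:
        # Steps 305, 307, 309 and 3011: notify the I layer in this fixed order.
        self.i_layer.create_vm_from_image()
        self.i_layer.copy_mac()
        self.i_layer.move_data_disk()
        self.i_layer.delete_original_vm()
        # Step 3013: report the result back to the APP VM.
        return "rebuild succeeded"
```

In this sketch the original VM is deleted only after its data disk has been moved, which matches the order of steps 3010 and 3012.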
The following are apparatus embodiments of the present invention. The apparatus embodiments are used to perform the methods implemented in method Embodiments 1 and 2 of the present invention. For ease of description, only the parts related to the embodiments of the present invention are shown; for specific technical details that are not disclosed, refer to Embodiment 1 and Embodiment 2 of the present invention.
Referring to FIG. 4, FIG. 4 is a schematic structural diagram of a virtual machine fault repair apparatus according to an embodiment of the present invention. As shown in FIG. 4, the apparatus includes a creating unit 41, a setting unit 42, and a starting unit 43, where:
The creating unit 41 is configured to, when a fault of the original virtual machine is detected, create a new virtual machine by using an image template.
In this embodiment, the image template may be the image template that was used to create the original virtual machine, and the image template may be stored in advance. The creating unit 41 can then create the new virtual machine directly from this image template. In addition, the creation here may specifically be the creation of the system disk of the new virtual machine.
The setting unit 42 is configured to set the MAC address of the original virtual machine as the MAC address of the new virtual machine, and mount the data disk of the original virtual machine to the new virtual machine.
With the setting unit 42, the new virtual machine has the same MAC address as the original virtual machine, and the data disk of the new virtual machine is the data disk of the original virtual machine, so that the created new virtual machine is in effect the same virtual machine as the original virtual machine.
The starting unit 43 is configured to start the new virtual machine, where, when partitioning the data disk, the new virtual machine does not format the reserved area of the data disk in which service data is located.
After the new virtual machine is started, it can partition the data disk. Because the reserved area in which the service data is located is not formatted, the service data of the original virtual machine is preserved and is not lost. When the new virtual machine runs, it can use the service data of the original virtual machine, which can be understood as repairing the original virtual machine.
In this embodiment, the apparatus may be applied in a Network Function Virtualization (NFV) distributed architecture; that is, the apparatus may be implemented by one or more network devices in the NFV distributed architecture, for example, a server, a computer, a laptop, an in-vehicle device, or a network television.
In this embodiment, when a fault of the original virtual machine is detected, a new virtual machine is created by using an image template; the MAC address of the original virtual machine is set as the MAC address of the new virtual machine, and the data disk of the original virtual machine is mounted to the new virtual machine; and the new virtual machine is started, where, when partitioning the data disk, the new virtual machine does not format the reserved area of the data disk in which service data is located. The repair process therefore does not rely on DHCP network boot to repair the faulty virtual machine, so a VM can be repaired in a scenario without a DHCP service.
Referring to FIG. 5, FIG. 5 is a schematic structural diagram of another virtual machine fault repair apparatus according to an embodiment of the present invention. As shown in FIG. 5, the apparatus includes a detecting unit 51, a creating unit 52, a setting unit 53, and a starting unit 54, where:
The detecting unit 51 is configured to: when it is detected that the time for which the original virtual machine and the high availability (HA) arbitration module have stopped exchanging message packets exceeds a first time threshold, reset the original virtual machine, and detect whether, after the original virtual machine is reset, the time for which the original virtual machine and the HA arbitration module have stopped exchanging message packets exceeds a second time threshold; and if so, determine that the original virtual machine is faulty.
In this embodiment, the original virtual machine and the HA arbitration module stopping the exchange of message packets can be understood as an interruption of the heartbeat between the original virtual machine and the HA arbitration module. In addition, the message packet may be any message packet exchanged between the original virtual machine and the HA arbitration module, and the HA arbitration module may be an HA arbitration module in the NFV distributed architecture.
Detecting whether the time for which the original virtual machine and the HA arbitration module have stopped exchanging message packets after the reset exceeds the second time threshold can be understood as follows: the reset of the original virtual machine is the start time of a timer, and if the original virtual machine still has not exchanged a message packet with the HA arbitration module when the timer reaches the second time threshold, it is determined that the original virtual machine is faulty.
The first time threshold and the second time threshold may be preset time thresholds.
The detecting unit 51 detects virtual machine faults automatically, which avoids manual detection and makes fault alarms more timely.
The creating unit 52 is configured to create a new virtual machine by using an image template when a fault of the original virtual machine is detected.
In this embodiment, the fault of the original virtual machine may be detected by the HA arbitration module. The HA arbitration module may then send a message requesting virtual machine reconstruction to the MANO module in the NFV distributed architecture. After receiving the message, the MANO module may notify the infrastructure layer (I layer) of the NFV distributed architecture to create a new virtual machine, and the I layer then creates the new virtual machine by using the image template.
In addition, the creating unit 52 may be configured to use the image template to create a new virtual machine that includes only a system disk.
In this implementation, the new virtual machine that is created includes only a system disk.
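The message chain from the HA arbitration module through the MANO module to the I layer can be illustrated as follows. This is a minimal sketch under stated assumptions: the classes `HaArbiter`, `Mano`, and `InfrastructureLayer`, their method names, and the naming scheme for the new virtual machine are all hypothetical and do not correspond to any real NFV API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VirtualMachine:
    name: str
    system_disk: str
    data_disks: List[str] = field(default_factory=list)


class InfrastructureLayer:
    """Stands in for the I layer, which owns the image templates."""

    def create_from_template(self, name: str, template: str) -> VirtualMachine:
        # The new VM gets a system disk from the template and no data disks.
        return VirtualMachine(name=name, system_disk=f"{template}-system")


class Mano:
    """Stands in for the MANO module, which forwards rebuild requests."""

    def __init__(self, infra: InfrastructureLayer) -> None:
        self.infra = infra

    def on_rebuild_request(self, failed_vm: str, template: str) -> VirtualMachine:
        return self.infra.create_from_template(f"{failed_vm}-new", template)


class HaArbiter:
    """Stands in for the HA arbitration module, which reports the fault."""

    def __init__(self, mano: Mano) -> None:
        self.mano = mano

    def on_fault(self, failed_vm: str, template: str) -> VirtualMachine:
        # Detecting the fault results in a rebuild request to MANO.
        return self.mano.on_rebuild_request(failed_vm, template)
```

For example, `HaArbiter(Mano(InfrastructureLayer())).on_fault("vm1", "base-image")` yields a new virtual machine with a system disk and an empty data-disk list, matching the "system disk only" option.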
The setting unit 53 is configured to set the media access control (MAC) address of the original virtual machine as the MAC address of the new virtual machine, and to mount the data disk of the original virtual machine to the new virtual machine.
In this embodiment, the data disk may be all disks of the original virtual machine other than the system disk. Mounting the data disk of the original virtual machine to the new virtual machine can be understood as using the data disk of the original virtual machine as the data disk of the new virtual machine. The mounting may be performed by the I layer through an API. Specifically, the I layer may locate the data disk of the original virtual machine by using the site identifier and the description file of the original virtual machine, and then use the API to mount that data disk to the new virtual machine.
In this embodiment, the setting unit 53 may be further configured to unmount the data disk of the original virtual machine and mount the unmounted data disk to the new virtual machine.
As shown in FIG. 6, the apparatus may further include:
The deleting unit 55, configured to delete the original virtual machine.
In this way, the faulty virtual machine is deleted automatically during virtual machine repair, which improves efficiency.
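The lookup, unmount, mount, and delete steps described above can be sketched as follows. The layout of the description file (site identifier, then virtual machine name, then a list of disks with a `role` key) and both function names are assumptions made for illustration; the source does not specify the file format or the API used by the I layer.

```python
from typing import Dict, List


def find_data_disks(description: dict, site_id: str, vm_name: str) -> List[str]:
    """Return the IDs of all non-system disks of a virtual machine.

    `description` stands in for the parsed description file; its layout
    is hypothetical.
    """
    disks = description[site_id][vm_name]
    return [d["id"] for d in disks if d["role"] != "system"]


def move_data_disks(attached: Dict[str, List[str]], original: str, new: str,
                    data_disks: List[str]) -> None:
    """Unmount each data disk from the original VM, mount it to the new VM,
    then delete the original VM record."""
    for disk in data_disks:
        attached[original].remove(disk)            # unmount from the original VM
        attached.setdefault(new, []).append(disk)  # mount to the new VM
    del attached[original]                         # delete the faulty original VM
```

Here `attached` is a plain dictionary from virtual machine name to attached disk IDs; in a real deployment these steps would be calls to the infrastructure API rather than dictionary updates.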
The starting unit 54 is configured to start the new virtual machine, where the new virtual machine, when partitioning the data disk, does not format the reserved area in which the service data of the data disk is located.
In this embodiment, the data disk may include a partition table, and the partition table may mark the reserved area in which the service data of the data disk is located. For example, a partition description module marks, in the partition table, the area in which the service data of the data disk is located as the reserved area.
In addition, when partitioning the data disk, the new virtual machine, according to the partition table, does not format the reserved area in which the service data is located, and formats the areas of the data disk other than that reserved area.
In addition, after the necessary formatting of the data disk, the new virtual machine may reinstall the APPs on the data disk, thereby restoring the entire virtual machine. For example, the APPs that were installed on the original virtual machine are installed by using the service data in the reserved area. Moreover, because the service data is preserved, the installed APPs can still retain the data recorded on the original virtual machine.
In addition, in this implementation, a file system check and repair may also be performed on the reserved area, and the reserved area may be formatted if the check and repair fails.
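The partition handling described above can be sketched as a single pass over the partition table. The `Partition` record and the two callbacks are hypothetical stand-ins: `check_and_repair` represents the file system check and repair (returning `True` on success) and `format_partition` represents formatting; neither corresponds to a specific tool named in the source.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Partition:
    name: str
    reserved: bool  # True if the partition table marks it as holding service data


def prepare_data_disk(table: List[Partition],
                      check_and_repair: Callable[[Partition], bool],
                      format_partition: Callable[[Partition], None]) -> List[str]:
    """Format non-reserved areas; keep reserved areas unless repair fails."""
    log = []
    for part in table:
        if not part.reserved:
            # Areas outside the reserved area are formatted.
            format_partition(part)
            log.append(f"formatted {part.name}")
        elif check_and_repair(part):
            # Reserved area passed the check, so the service data is kept.
            log.append(f"kept {part.name}")
        else:
            # Optional fallback from the text: format when check and repair fails.
            format_partition(part)
            log.append(f"formatted {part.name} after failed repair")
    return log
```

The returned log makes the decision for each area visible, which is convenient when verifying that service data was left untouched.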
In addition, in the foregoing implementation, the virtual machine fault recovery process does not involve redistributing services to the new virtual machine, so it is also applicable to virtual machines deployed in service active/standby mode.
In this embodiment, multiple optional implementations are added on the basis of the embodiment shown in FIG. 4, and all of them can repair faults of VMs in scenarios without a DHCP service.
Referring to FIG. 7, FIG. 7 is a schematic structural diagram of another virtual machine fault repair apparatus according to an embodiment of the present invention. As shown in FIG. 7, the apparatus includes a processor 71, a network interface 72, a memory 73, and a communication bus 74, where the communication bus 74 is configured to implement connection and communication among the processor 71, the network interface 72, and the memory 73, and the processor 71 executes a program stored in the memory 73 to implement the following method:
when a fault of the original virtual machine is detected, creating a new virtual machine by using an image template;
setting the media access control (MAC) address of the original virtual machine as the MAC address of the new virtual machine, and mounting the data disk of the original virtual machine to the new virtual machine; and
starting the new virtual machine, where the new virtual machine, when partitioning the data disk, does not format the reserved area in which the service data of the data disk is located.
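The order of the three steps listed above can be illustrated with a short sketch. The `Vm` record, the `repair` function, and the placeholder MAC address assigned at creation are assumptions introduced for illustration only; the point is that the new virtual machine ends up with the original MAC address and data disks before it is started.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Vm:
    name: str
    mac: str
    data_disks: List[str] = field(default_factory=list)
    running: bool = False


def repair(original: Vm, new_name: str,
           template_mac: str = "02:00:00:00:00:99") -> Vm:
    # Step 1: create the new VM from the image template (placeholder MAC).
    new_vm = Vm(name=new_name, mac=template_mac)
    # Step 2a: set the original VM's MAC address as the new VM's MAC address.
    new_vm.mac = original.mac
    # Step 2b: move the data disks from the original VM to the new VM.
    new_vm.data_disks, original.data_disks = original.data_disks, []
    # Step 3: start the new VM only after MAC address and disks are in place.
    new_vm.running = True
    return new_vm
```

Starting the virtual machine last means it boots from its own system disk with the data disks already attached, so no network boot is involved in this sketch.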
In this embodiment, the program executed by the processor 71 may further include:
when it is detected that the time for which the original virtual machine and the high availability (HA) arbitration module have stopped exchanging message packets exceeds a first time threshold, resetting the original virtual machine, and detecting whether, after the original virtual machine is reset, the time for which the original virtual machine and the HA arbitration module have stopped exchanging message packets exceeds a second time threshold; and if so, determining that the original virtual machine is faulty.
In this embodiment, the program executed by the processor 71 for creating a new virtual machine by using the image template of the source virtual machine may include:
creating, by using the image template, a new virtual machine that includes only a system disk.
In this embodiment, the data disk may include a partition table, and the partition table marks the reserved area in which the service data of the data disk is located. When partitioning the data disk, the new virtual machine, according to the partition table, does not format the reserved area in which the service data is located, and formats the areas of the data disk other than that reserved area.
In this embodiment, the program executed by the processor 71 for mounting the data disk of the original virtual machine to the new virtual machine may include:
unmounting the data disk of the original virtual machine, and mounting the unmounted data disk to the new virtual machine.
In addition, the program executed by the processor 71 may further include:
deleting the original virtual machine.
In this embodiment, when a fault of the original virtual machine is detected, a new virtual machine is created by using an image template; the media access control (MAC) address of the original virtual machine is set as the MAC address of the new virtual machine, and the data disk of the original virtual machine is mounted to the new virtual machine; and the new virtual machine is started, where the new virtual machine, when partitioning the data disk, does not format the reserved area in which the service data of the data disk is located. In this way, the repair process does not rely on DHCP network boot to repair the faulty virtual machine, so that faults of VMs can be repaired in scenarios without a DHCP service.
A person of ordinary skill in the art can understand that all or part of the processes of the methods in the foregoing embodiments may be completed by a computer program instructing related hardware. The program may be stored in a computer-readable storage medium, and when executed, the program may include the processes of the embodiments of the foregoing methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The foregoing discloses only preferred embodiments of the present invention and certainly cannot be used to limit the scope of the claims of the present invention. Therefore, equivalent changes made according to the claims of the present invention still fall within the scope covered by the present invention.

Claims (10)

  1. A virtual machine fault repair method, comprising:
    when a fault of an original virtual machine is detected, creating a new virtual machine by using an image template;
    setting a media access control (MAC) address of the original virtual machine as the MAC address of the new virtual machine, and mounting a data disk of the original virtual machine to the new virtual machine; and
    starting the new virtual machine, wherein the new virtual machine, when partitioning the data disk, does not format a reserved area in which service data of the data disk is located.
  2. The method according to claim 1, wherein the method further comprises:
    when it is detected that the time for which the original virtual machine and a high availability (HA) arbitration module have stopped exchanging message packets exceeds a first time threshold, resetting the original virtual machine, and detecting whether, after the original virtual machine is reset, the time for which the original virtual machine and the HA arbitration module have stopped exchanging message packets exceeds a second time threshold; and if so, determining that the original virtual machine is faulty.
  3. The method according to claim 1 or 2, wherein the creating a new virtual machine by using the image template of the source virtual machine comprises:
    creating, by using the image template, a new virtual machine that includes only a system disk.
  4. The method according to claim 1 or 2, wherein the data disk comprises a partition table, the partition table marks the reserved area in which the service data of the data disk is located, and the new virtual machine, when partitioning the data disk, according to the partition table does not format the reserved area in which the service data is located and formats the areas of the data disk other than the reserved area in which the service data is located.
  5. The method according to claim 1 or 2, wherein the mounting the data disk of the original virtual machine to the new virtual machine comprises:
    unmounting the data disk of the original virtual machine, and mounting the unmounted data disk to the new virtual machine;
    and the method further comprises:
    deleting the original virtual machine.
  6. A virtual machine fault repair apparatus, comprising a creating unit, a setting unit, and a starting unit, wherein:
    the creating unit is configured to create a new virtual machine by using an image template when a fault of an original virtual machine is detected;
    the setting unit is configured to set a media access control (MAC) address of the original virtual machine as the MAC address of the new virtual machine, and mount a data disk of the original virtual machine to the new virtual machine; and
    the starting unit is configured to start the new virtual machine, wherein the new virtual machine, when partitioning the data disk, does not format a reserved area in which service data of the data disk is located.
  7. The apparatus according to claim 6, wherein the apparatus further comprises:
    a detection unit, configured to: when it is detected that the time for which the original virtual machine and a high availability (HA) arbitration module have stopped exchanging message packets exceeds a first time threshold, reset the original virtual machine, and detect whether, after the original virtual machine is reset, the time for which the original virtual machine and the HA arbitration module have stopped exchanging message packets exceeds a second time threshold; and if so, determine that the original virtual machine is faulty.
  8. The apparatus according to claim 6 or 7, wherein the creating unit is configured to create, by using the image template, a new virtual machine that includes only a system disk.
  9. The apparatus according to claim 6 or 7, wherein the data disk comprises a partition table, the partition table marks the reserved area in which the service data of the data disk is located, and the new virtual machine, when partitioning the data disk, according to the partition table does not format the reserved area in which the service data is located and formats the areas of the data disk other than the reserved area in which the service data is located.
  10. The apparatus according to claim 6 or 7, wherein the setting unit is further configured to unmount the data disk of the original virtual machine and mount the unmounted data disk to the new virtual machine;
    and the apparatus further comprises:
    a deleting unit, configured to delete the original virtual machine.
PCT/CN2016/098341 2015-09-30 2016-09-07 Fault recovery method and device for virtual machine WO2017054626A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510638436.8A CN105204955B (en) 2015-09-30 2015-09-30 A kind of virtual-machine fail restorative procedure and device
CN201510638436.8 2015-09-30

Publications (1)

Publication Number Publication Date
WO2017054626A1 true WO2017054626A1 (en) 2017-04-06

Family

ID=54952650

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/098341 WO2017054626A1 (en) 2015-09-30 2016-09-07 Fault recovery method and device for virtual machine

Country Status (2)

Country Link
CN (1) CN105204955B (en)
WO (1) WO2017054626A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105204955B (en) * 2015-09-30 2018-05-29 华为技术有限公司 A kind of virtual-machine fail restorative procedure and device
CN106921508B (en) * 2015-12-25 2021-02-19 中兴通讯股份有限公司 Virtualized network element fault self-healing method and device
CN107515725B (en) * 2016-06-16 2022-12-09 中兴通讯股份有限公司 Method and device for sharing disk by core network virtualization system and network management MANO system
CN106201654A (en) * 2016-06-30 2016-12-07 国云科技股份有限公司 A kind of rescue method of dummy machine system
WO2018023217A1 (en) * 2016-07-30 2018-02-08 华为技术有限公司 Method, device and system for establishing virtual machine
CN107122229A (en) * 2017-04-21 2017-09-01 紫光华山信息技术有限公司 A kind of virtual machine restoration methods and device
CN112231063A (en) * 2020-10-23 2021-01-15 新华三信息安全技术有限公司 Fault processing method and device
CN116527494B (en) * 2023-07-05 2023-09-12 南京赛宁信息技术有限公司 Shooting range virtual machine network initialization method and system based on virtual network card cloning

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101605084A (en) * 2009-06-29 2009-12-16 北京航空航天大学 Virtual network message processing method and system based on virtual machine
US20100257269A1 (en) * 2009-04-01 2010-10-07 Vmware, Inc. Method and System for Migrating Processes Between Virtual Machines
CN101876883A (en) * 2009-11-30 2010-11-03 英业达股份有限公司 Method for keeping remote operation of virtual machine uninterrupted
CN103618627A (en) * 2013-11-27 2014-03-05 华为技术有限公司 Method, device and system for managing virtual machines
CN104243265A (en) * 2014-09-05 2014-12-24 华为技术有限公司 Gateway control method, device and system based on virtual machine migration
CN104536842A (en) * 2014-12-17 2015-04-22 中电科华云信息技术有限公司 Virtual machine fault-tolerant method based on KVM virtualization
CN105204955A (en) * 2015-09-30 2015-12-30 华为技术有限公司 Method and device for correcting faults of virtual machines

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100257269A1 (en) * 2009-04-01 2010-10-07 Vmware, Inc. Method and System for Migrating Processes Between Virtual Machines
CN101605084A (en) * 2009-06-29 2009-12-16 北京航空航天大学 Virtual network message processing method and system based on virtual machine
CN101876883A (en) * 2009-11-30 2010-11-03 英业达股份有限公司 Method for keeping remote operation of virtual machine uninterrupted
CN103618627A (en) * 2013-11-27 2014-03-05 华为技术有限公司 Method, device and system for managing virtual machines
CN104243265A (en) * 2014-09-05 2014-12-24 华为技术有限公司 Gateway control method, device and system based on virtual machine migration
CN104536842A (en) * 2014-12-17 2015-04-22 中电科华云信息技术有限公司 Virtual machine fault-tolerant method based on KVM virtualization
CN105204955A (en) * 2015-09-30 2015-12-30 华为技术有限公司 Method and device for correcting faults of virtual machines

Also Published As

Publication number Publication date
CN105204955B (en) 2018-05-29
CN105204955A (en) 2015-12-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16850250

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16850250

Country of ref document: EP

Kind code of ref document: A1