JP2016062140A

JP2016062140A - Virtual equipment management device, virtual equipment management method, and virtual equipment management program

Info

Publication number: JP2016062140A
Application number: JP2014187489A
Authority: JP
Inventors: 庸次山登; Yoji Yamato; 伸二長尾; Shinji Nagao; 賢一佐藤; Kenichi Sato
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2014-09-16
Filing date: 2014-09-16
Publication date: 2016-04-25
Anticipated expiration: 2034-09-16
Also published as: JP5855724B1

Abstract

PROBLEM TO BE SOLVED: To recover a virtual machine safely. SOLUTION: A virtual device management apparatus has a determination unit, a deletion unit, and a re-creation unit. When a failure of a virtual machine or a failure of the physical device running the virtual machine is detected, the determination unit determines whether a recovery process is already in progress for that virtual machine. When no recovery process is in progress, the deletion unit, as a first recovery process, deletes the association between the virtual machine and the storage area used by the virtual machine, or deletes both that association and the virtual machine instance of the virtual machine. The re-creation unit selects a physical device to run the virtual machine and, after the association (or the association and the virtual machine instance) has been deleted, causes the selected physical device to re-create the virtual machine as a second recovery process. SELECTED DRAWING: Figure 1

Description

The present invention relates to a virtual device management apparatus, a virtual device management method, and a virtual device management program.

Examples of IaaS (Infrastructure as a Service) cloud services include Amazon Elastic Compute Cloud (web site, http://aws.amazon.com/ec2) and Rackspace Cloud Server (web site, http://www.rackspacecloud.com/cloud-hosting-products/servers/).

As the foundation of its IaaS cloud service, Amazon uses a proprietary platform, whereas Rackspace uses the open-source OpenStack (web site, http://www.openstack.org/).

However, IaaS platforms such as OpenStack are aimed mainly at providing primitive APIs (Application Programming Interfaces) for managing virtual devices, and the management of physical devices is out of scope. Service providers therefore need to take this into account when offering a cloud service.

Specifically, OpenStack does not itself support recovery when the physical device on which a virtual device runs fails, so the service provider must provide its own countermeasure. One method in common use is to build an HA configuration with high-availability cluster software such as Pacemaker and to fail over when a physical device fails.

In addition, a physical device that runs virtual machines can detect that a virtual machine is down by using Ping or by monitoring a virtual machine control library such as Libvirt. When such a physical device detects that a virtual machine is down, it executes a recovery process that re-creates the virtual machine.

Pacemaker web site, [searched August 27, 2014], Internet (URL: http://www.linux-ha.org/wiki/Pacemaker/)

Libvirt web site, [searched August 27, 2014], Internet (URL: http://libvirt.org/)

D. Mane, "Building A High Availability - Openstack," International Journal of Engineering Research and Applications, Vol. 3, Issue 4, pp. 269-277, July 2013.

However, the conventional techniques described above have the problem that a virtual machine cannot always be recovered safely.

Specifically, recovery from a physical device failure and recovery from a virtual machine going down are handled independently, and there is no unified recovery procedure for the case where a physical device failure and a virtual machine failure overlap. As a result, if, for example, virtual machine recovery triggered by a virtual machine going down is carried out before the failed node has been forcibly terminated during a failover triggered by a physical device failure, the virtual machine may be started twice and its volume may be corrupted.

The disclosed technology has been made in view of the above, and its object is to recover a virtual machine safely.

The virtual device management apparatus disclosed in the present application includes a determination unit, a deletion unit, and a re-creation unit. When a failure of a virtual machine or a failure of the physical device running the virtual machine is detected, the determination unit determines whether a recovery process is in progress for the virtual machine. When no recovery process is in progress, the deletion unit, as a first recovery process, deletes the association between the virtual machine and the storage area used by the virtual machine, or deletes the association and the virtual machine instance of the virtual machine. The re-creation unit selects a physical device to run the virtual machine and, after the association has been deleted, or after the association and the virtual machine instance have been deleted, causes the selected physical device to re-create the virtual machine as a second recovery process.

The virtual device management method disclosed in the present application is executed by a virtual device management apparatus and includes a determination step, a deletion step, and a re-creation step. In the determination step, when a failure of a virtual machine or a failure of the physical device running the virtual machine is detected, it is determined whether a recovery process is in progress for the virtual machine. In the deletion step, when no recovery process is in progress, the association between the virtual machine and the storage area used by the virtual machine is deleted, or the association and the virtual machine instance of the virtual machine are deleted, as a first recovery process. In the re-creation step, a physical device to run the virtual machine is selected and, after the association has been deleted, or after the association and the virtual machine instance have been deleted, the selected physical device is caused to re-create the virtual machine as a second recovery process.

The virtual device management program disclosed in the present application causes a computer to execute a determination procedure, a deletion procedure, and a re-creation procedure. In the determination procedure, when a failure of a virtual machine or a failure of the physical device running the virtual machine is detected, it is determined whether a recovery process is in progress for the virtual machine. In the deletion procedure, when no recovery process is in progress, the association between the virtual machine and the storage area used by the virtual machine is deleted, or the association and the virtual machine instance of the virtual machine are deleted, as a first recovery process. In the re-creation procedure, a physical device to run the virtual machine is selected and, after the association has been deleted, or after the association and the virtual machine instance have been deleted, the selected physical device is caused to re-create the virtual machine as a second recovery process.

According to one aspect of the disclosed virtual device management apparatus, a virtual machine can be recovered safely.
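The determination, deletion, and re-creation units summarized above can be read as one guarded routine. The following is a minimal sketch of that reading, not the implementation described in this application: `FakeCloud`, `RecoveryCoordinator`, and all method names are hypothetical, and host selection is reduced to taking the first candidate.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCloud:
    """Hypothetical stand-in for the cloud controller; it only records calls."""
    calls: list = field(default_factory=list)

    def detach_volumes(self, vm_id):
        self.calls.append(("detach", vm_id))

    def create_vm(self, vm_id, host):
        self.calls.append(("create", vm_id, host))


class RecoveryCoordinator:
    def __init__(self, cloud, hosts):
        self.cloud = cloud
        self.hosts = hosts          # candidate physical devices
        self.in_progress = set()    # VMs that already have a recovery under way

    def on_failure(self, vm_id):
        # Determination: do nothing if a recovery is already running for this VM.
        if vm_id in self.in_progress:
            return False
        self.in_progress.add(vm_id)
        try:
            # First recovery process: delete the VM-to-storage association.
            self.cloud.detach_volumes(vm_id)
            # Second recovery process: select a host (assumed: first candidate),
            # then re-create the VM there.
            self.cloud.create_vm(vm_id, self.hosts[0])
        finally:
            self.in_progress.discard(vm_id)
        return True
```

Because both the physical-device failure path and the virtual-machine-down path would call the same `on_failure` entry point in this sketch, a second notification for a VM that is already being recovered is ignored rather than starting the VM a second time.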

FIG. 1 is a diagram illustrating an example of the configuration of a virtual device management system according to the first embodiment.
FIG. 2 is a diagram for explaining virtual device creation processing by the virtual device placement scheduler function unit.
FIG. 3 is a diagram for explaining virtual device relocation processing by the virtual device placement scheduler function unit.
FIG. 4 is a diagram for explaining virtual device relocation processing by the virtual device placement scheduler function unit.
FIG. 5 is a diagram for explaining a processing operation for creating a virtual device in the virtual device management system.
FIG. 6 is a diagram for explaining a processing operation for relocating virtual devices in the virtual device management system.
FIG. 7 is a diagram for explaining a processing operation for relocating virtual devices in the virtual device management system.
FIG. 8 is a diagram for explaining the virtual device placement scheduler DB and the virtual device placement scheduler function unit realized by the virtual device management apparatus.
FIG. 9 is a diagram illustrating an example of the data structure of the virtual device placement information table.
FIG. 10 is a diagram illustrating an example of the data structure of the physical resource information table.
FIG. 11 is a diagram for explaining the processing operation of the placement destination selection unit.
FIG. 12 is a diagram for explaining a processing operation in which the failure determination unit detects a failure of a physical device.
FIG. 13 is a diagram for explaining the processing operation of the re-creation unit.
FIG. 14 is a flowchart illustrating a processing procedure performed by the virtual device placement scheduler function unit when a physical device failure notification is received.
FIG. 15 is a flowchart illustrating a processing procedure performed by the virtual device placement scheduler function unit when a virtual machine down notification is received.
FIG. 16 is a diagram illustrating a computer that executes the virtual device management program.

Hereinafter, embodiments of the disclosed virtual device management apparatus, virtual device management method, and virtual device management program will be described in detail with reference to the drawings. Note that the disclosed invention is not limited by these embodiments.

(First embodiment)
FIG. 1 is a diagram illustrating an example of the configuration of a virtual device management system according to the first embodiment. As illustrated in FIG. 1, the virtual device management system includes a user terminal 101, a physical device 103a, a physical device 103b, a physical device 103c, a cloud controller 108, and a virtual device management apparatus 109. When the physical devices 103a, 103b, and 103c need not be distinguished, they are referred to as the physical device 103. A "physical device" here may be a physical server, a storage device, a network device, or the like on which virtual devices can be created; for convenience, the following description assumes that the physical device 103 is a physical server capable of running virtual machines. The number of physical devices 103 in the virtual device management system is not limited to the number shown in FIG. 1 and can be changed arbitrarily.

The user terminal 101 is a terminal used by a user and, in response to the user's instruction, requests the virtual device management apparatus 109 to create a virtual device.

The physical device 103 receives requests from the cloud controller 108 to create or delete virtual devices, and actually creates or deletes them. For example, the physical device 103 accepts an instruction to create a virtual device from the cloud controller 108 and creates the virtual device.

For example, the physical device 103a has a virtual machine control unit (not shown) and creates a virtual machine 104a and a virtual machine 105a. Likewise, the physical device 103b has a virtual machine control unit (not shown) and creates a virtual machine 104b and a virtual machine 105b, and the physical device 103c has a virtual machine control unit (not shown) and creates a virtual machine 104c and a virtual machine 105c. In the physical devices 103a to 103c, the virtual machine control unit is realized by, for example, the "Nova" function in the case of OpenStack.

When the physical device 103 is a storage device, the physical device 103 has, for example, a virtual volume control unit and creates virtual volumes. When the physical device 103 is a network device, the physical device 103 has a virtual network control unit and creates virtual L2 networks, virtual routers, virtual load balancers, and the like. The virtual network control unit is realized by, for example, the "Neutron" function in the case of OpenStack.

The physical device 103 has four operating states: "in operation", "standby", "failed (under maintenance)", and "failed (recovery in progress)". "In operation" indicates that the physical device is running. "Standby" indicates that the physical device is provided as a standby system and is not running. "Failed (under maintenance)" indicates that recovery processing for the failed physical device has not yet started; in this state, a complete stop of the failed physical device is not guaranteed. "Failed (recovery in progress)" indicates that recovery processing for the failed physical device has started; in this state, a complete stop of the failed physical device is guaranteed. Note that the virtual device management system need not include any "standby" physical device.
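The four operating states, and which of them guarantees a complete stop, can be captured in a small enumeration. This is an illustrative sketch only; the names `DeviceState` and `full_stop_guaranteed` are hypothetical and do not appear in the application.

```python
from enum import Enum


class DeviceState(Enum):
    IN_OPERATION = "in operation"
    STANDBY = "standby"
    FAILED_MAINTENANCE = "failed (under maintenance)"
    FAILED_RECOVERING = "failed (recovery in progress)"


def full_stop_guaranteed(state: DeviceState) -> bool:
    """Only 'failed (recovery in progress)' guarantees the failed device is fully stopped."""
    return state is DeviceState.FAILED_RECOVERING
```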

For each physical device 103, the capacity of physical resources available for placing virtual devices is defined according to the capacity of its physical resources. Physical resources here include, for example, physical memory, CPUs (Central Processing Units), and network ports. Because the memory size of a virtual machine differs by flavor (the virtual machine's specification), the amount of physical resources used differs depending on the virtual machine being created. For convenience of explanation, however, it is assumed below that every virtual device uses the same amount of physical resources. The amount of physical resources used to place one virtual device is treated as one unit and is called "one space". In other words, one virtual device can be placed in one space, and creating one virtual device consumes one space on one of the physical devices.

The state of each space in a physical device is managed as one of three types: "free", "in use", and "failure buffer". "Free" indicates a space in which no virtual device is placed. "In use" indicates a space in which a virtual device is placed. "Failure buffer" indicates a space reserved for failure recovery.
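As an illustration of the space model, the sketch below consumes one space per placed virtual device. It assumes, as a labeled simplification, that "failure buffer" spaces are consumed only when a placement is made for recovery; the names `SpaceState` and `place_one` are hypothetical.

```python
from enum import Enum


class SpaceState(Enum):
    FREE = "free"
    IN_USE = "in use"
    FAILURE_BUFFER = "failure buffer"


def place_one(spaces, for_recovery=False):
    """Mark one usable space as in use and return its index, or None if none fits.

    Assumption: failure-buffer spaces may be used only when for_recovery is True.
    """
    allowed = {SpaceState.FREE}
    if for_recovery:
        allowed.add(SpaceState.FAILURE_BUFFER)
    for i, state in enumerate(spaces):
        if state in allowed:
            spaces[i] = SpaceState.IN_USE
            return i
    return None
```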

The physical device 103a includes a virtual machine monitoring module 106a. Similarly, the physical device 103b includes a virtual machine monitoring module 106b, and the physical device 103c includes a virtual machine monitoring module 106c. When the virtual machine monitoring modules 106a to 106c need not be distinguished, they are referred to as the virtual machine monitoring module 106. The virtual machine monitoring module 106 runs on the physical device 103. For example, when the virtual machine monitoring module 106 detects a virtual machine failure by means of a module that monitors events from Libvirt or the like, it notifies the virtual device placement scheduler function unit 111, described later, of the virtual machine failure.

The physical device 103a also includes high availability software 107a. Similarly, the physical device 103b includes high availability software 107b, and the physical device 103c includes high availability software 107c. When the high availability software 107a to 107c need not be distinguished, it is referred to as the high availability software 107. "Pacemaker", for example, can be used as the high availability software 107. The high availability software 107 detects a failure of the physical device 103 and notifies the virtual device management apparatus 109 of the physical device failure. In that case, the physical device 103 accepts from the cloud controller 108 an instruction to relocate virtual devices, and relocates the virtual devices that were placed on the failed physical device 103.

Note that "Pacemaker" has a highly reliable failure detection mechanism and established countermeasures against split brain. "Pacemaker" detects a split-brain state (isolated state) by majority rule, using a Quorum module or the like.
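The majority-rule idea can be illustrated in a few lines. This is a generic sketch of the quorum principle under the assumption of one vote per node; it is not Pacemaker's code or configuration, and `has_quorum` is a hypothetical name.

```python
def has_quorum(reachable_nodes: int, total_nodes: int) -> bool:
    """A partition keeps quorum only if it can see a strict majority of the cluster."""
    return reachable_nodes > total_nodes // 2
```

In a three-node cluster split into partitions of two and one, only the two-node side satisfies the check, so the isolated node can recognize that it is the one cut off.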

The cloud controller 108 is connected to the physical device 103 and the virtual device management apparatus 109. The cloud controller 108 is a device having a CPU (Central Processing Unit), a memory, a data holding area, and a network communication function. The cloud controller 108 accepts virtual device creation and deletion requests from the virtual device management apparatus 109 via an API (Application Programming Interface) and, based on the accepted request, instructs the physical device 103 to create or delete the virtual device. The cloud controller 108 is, for example, OpenStack.

The virtual device management apparatus 109 is connected to the user terminal 101, the physical device 103, and the cloud controller 108 via a network (not shown). The virtual device management apparatus 109 is a device having a CPU, a memory, a data holding area, and a network communication function. For example, as shown in FIG. 1, the virtual device management apparatus 109 has a virtual device placement scheduler DB (Data Base) 110 and a virtual device placement scheduler function unit 111. When the virtual device management apparatus 109 accepts a virtual device creation request from the user terminal, it determines the placement destination of the virtual device and requests the cloud controller to create the virtual machine. When the virtual device management apparatus 109 accepts a physical device failure notification from the high availability software 107 or a virtual machine down notification from the virtual machine monitoring module 106, it uses the information in the virtual device placement scheduler DB 110 to determine the type of recovery process. The virtual device management apparatus 109 then determines the placement destination of the virtual device and performs recovery processing, such as requesting the cloud controller to create the virtual machine.

The virtual device placement scheduler DB 110 is, for example, a semiconductor memory element such as a RAM (Random Access Memory) or a flash memory, or a storage device such as a hard disk or an optical disk. The virtual device placement scheduler DB 110 stores virtual device placement information and physical resource information. The virtual device placement information indicates on which physical device each virtual machine is placed and the processing progress state of each virtual machine. The physical resource information indicates the operating state of each physical device and the free capacity of the physical resources that the physical device has. The operating state of a physical device is, for example, in operation, standby, failed (under maintenance), or failed (recovery in progress); the free capacity of physical resources is expressed, for example, as free, in use, or failure buffer. The virtual device placement information is described in detail later with reference to FIG. 9, and the physical resource information is described in detail later with reference to FIG. 10.
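The two kinds of information held in the scheduler DB can be pictured as two record types. The sketch below is only a guess at a minimal shape based on the prose above; the actual tables are those of FIG. 9 and FIG. 10, and every field name and example value here is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PlacementRecord:
    vm_id: str
    device_id: str    # physical device on which the VM is placed
    progress: str     # processing progress state of the VM (values are hypothetical)


@dataclass
class ResourceRecord:
    device_id: str
    state: str        # e.g. "in operation", "standby"
    free_spaces: int  # number of spaces still free on the device


def vms_on(placements, device_id):
    """Return the IDs of the VMs placed on the given physical device."""
    return [p.vm_id for p in placements if p.device_id == device_id]
```

A lookup such as `vms_on` is what a recovery path would need first when a physical device fails: the list of virtual machines that must be relocated.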

The virtual device placement scheduler function unit 111 places virtual devices according to business requirements by referring to the operating state of each physical device and the usage state of its physical resources. For example, when the user terminal 101 requests creation of a virtual device, the virtual device placement scheduler function unit 111 mediates the creation of the virtual device using the creation request and the information in the virtual device placement scheduler DB 110. In a normal operation that newly creates a virtual device such as a virtual machine or a virtual router, the virtual device placement scheduler function unit 111 decides on which physical device to place the virtual device and requests the cloud controller 108 to create the virtual device on the designated physical device.

In the virtual device management system configured in this way, the virtual device management apparatus 109 operates in cooperation with the virtual machine monitoring module 106 and the high availability software 107. Processing in the virtual device management system will be described with reference to FIGS. 2 to 4.

FIG. 2 is a diagram for explaining virtual device creation processing by the virtual device placement scheduler function unit 111. FIG. 2 shows a case where virtual machines VM#11 to VM#16, VM#21 to VM#26, and VM#31 to VM#36 are created as virtual devices on three physical devices 103a to 103c. All three physical devices 103a to 103c are assumed to be in operation. Although the example of FIG. 2 shows virtual machines, the virtual devices may be other virtual devices such as virtual routers.

As illustrated in FIG. 2, the virtual device placement scheduler function unit 111 selects the physical device 103a as the placement destination of the virtual machines VM#11 to VM#16, the physical device 103b as the placement destination of the virtual machines VM#21 to VM#26, and the physical device 103c as the placement destination of the virtual machines VM#31 to VM#36. The virtual device placement scheduler function unit 111 then requests the cloud controller 108 to perform the placement. That is, it requests the cloud controller 108 to create the virtual machines VM#11 to VM#16 on the physical device 103a, the virtual machines VM#21 to VM#26 on the physical device 103b, and the virtual machines VM#31 to VM#36 on the physical device 103c. Each operator may set this placement logic according to its business requirements: virtual devices may be spread out as much as possible or concentrated as much as possible. Spreading them out favors the performance of the users' virtual devices, while concentrating them reduces the number of physical devices actually in use and so helps lower the operator's operating costs.
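The two placement tendencies described above, spreading and concentrating, can be sketched as a single selection function. This is an illustrative example, not the placement logic of the application: the function name `choose_host`, the policy labels "spread" and "pack", and the tie-breaking by host name are all assumptions.

```python
def choose_host(free_spaces: dict, policy: str = "spread"):
    """Pick a host that has at least one free space, or None if there is none.

    'spread' prefers the host with the most free spaces (favours VM performance);
    'pack' prefers the host with the fewest free spaces (keeps other hosts idle).
    Ties are broken by host name so the result is deterministic.
    """
    candidates = {h: n for h, n in free_spaces.items() if n > 0}
    if not candidates:
        return None
    if policy == "spread":
        return min(candidates, key=lambda h: (-candidates[h], h))
    if policy == "pack":
        return min(candidates, key=lambda h: (candidates[h], h))
    raise ValueError(f"unknown policy: {policy}")
```

For example, with two free spaces on 103a, five on 103b, and none on 103c, the spread policy selects 103b and the pack policy selects 103a.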

In addition, when a failure occurs in one of the physical devices 103, for example, the virtual device placement scheduler function unit 111 mediates the relocation of virtual devices using the information in the virtual device placement scheduler DB 110. Here, the virtual device placement scheduler function unit 111 relocates virtual devices during failure recovery in cooperation with the high availability software 107 and the cloud controller 108. FIG. 3 is a diagram for explaining virtual device relocation processing by the virtual device placement scheduler function unit 111. FIG. 3 shows a case where, among the three physical devices 103a to 103c in operation, a failure occurs in the physical device 103b, on which VM#21 to VM#26 are placed as virtual devices.

As shown in FIG. 3, the virtual device placement scheduler function unit 111 detects that a failure has occurred in the physical device 103b. In this case, the virtual device placement scheduler function unit 111 requests the cloud controller 108 to clear the virtual machines, and the cloud controller 108 clears them. Here, clearing a virtual machine means deleting the association between the virtual machine and the storage area used by the virtual machine. In other words, clearing a virtual machine means deleting the link between the virtual machine and its storage.

Subsequently, the virtual device placement scheduler function unit 111 determines the relocation destinations of the virtual machines VM#21 to VM#26 using the information in the virtual device placement scheduler DB 110. In the example shown in FIG. 3, the virtual device placement scheduler function unit 111 selects the physical device 103a as the relocation destination of VM#21, VM#23, and VM#25, and selects the physical device 103c as the relocation destination of VM#22, VM#24, and VM#26.

The virtual device placement scheduler function unit 111 then requests the cloud controller 108 to perform the relocation. That is, it requests the cloud controller 108 to create the virtual machines VM#21, VM#23, and VM#25 on the physical device 103a, and to create the virtual machines VM#22, VM#24, and VM#26 on the physical device 103c.

As a result, the physical device 103a rebuilds the virtual machines VM#21, VM#23, and VM#25 and hosts them in addition to VM#11 to VM#16. Likewise, the physical device 103c rebuilds the virtual machines VM#22, VM#24, and VM#26 and hosts them in addition to VM#31 to VM#36.

In this way, when a failure occurs in the physical device 103b, the virtual device placement scheduler function unit 111 relocates the virtual machines VM#21 to VM#26 that were placed on the physical device 103b to the physical device 103a and the physical device 103c. Because the virtual device placement scheduler function unit 111 uses a plurality of physical devices as recovery destinations, the virtual device recovery time after a physical device failure can be shortened. The operator may also configure logic that places virtual machines on as few physical devices as possible even at failure time. In that case, the logic may fill nodes starting from the one with the least free capacity, or alternatively starting from the one with the most free capacity.
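The choice between distributing and concentrating virtual machines can be illustrated with a minimal sketch. It assumes that capacity is counted in whole "slots" per physical device, which the description does not specify. The function `place_vms` and its parameters are hypothetical names chosen for illustration and are not part of the disclosed apparatus.

```python
from typing import Dict, List


def place_vms(vm_ids: List[str], free_slots: Dict[str, int],
              spread: bool = True) -> Dict[str, str]:
    """Assign each VM to a physical device that still has a free slot.

    Assumption: capacity is modeled as an integer slot count per device.
    """
    free = dict(free_slots)  # copy so the caller's data is not modified
    placement: Dict[str, str] = {}
    for vm in vm_ids:
        candidates = [h for h, n in free.items() if n > 0]
        if not candidates:
            raise RuntimeError("no physical device has free capacity")
        if spread:
            # Prefer the device with the most free slots; ties by name.
            host = min(candidates, key=lambda h: (-free[h], h))
        else:
            # Prefer the device with the fewest free slots; ties by name.
            host = min(candidates, key=lambda h: (free[h], h))
        placement[vm] = host
        free[host] -= 1
    return placement


# VMs that were running on the failed physical device 103b (FIG. 3).
failed_vms = [f"VM#{n}" for n in range(21, 27)]
print(place_vms(failed_vms, {"103a": 3, "103c": 3}))
```

With three free slots on each surviving device, the spreading policy alternates between 103a and 103c, which reproduces the assignment of FIG. 3. Passing `spread=False` instead fills the device with the least free capacity first.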

Further, when a failure occurs in any of the virtual machines 104, for example, the virtual device placement scheduler function unit 111 mediates the relocation of the virtual machine using the information in the virtual device placement scheduler DB 110. Here, the virtual device placement scheduler function unit 111 relocates the virtual machine during failure recovery in cooperation with the virtual machine monitoring module 106 and the cloud controller 108. FIG. 4 is a diagram for explaining the virtual machine relocation processing performed by the virtual device placement scheduler function unit 111. FIG. 4 shows a case where the virtual machine VM#21 running on the physical device 103b has gone down. All three physical devices 103a to 103c are assumed to be operating.

As shown in FIG. 4, the virtual device placement scheduler function unit 111 detects that a failure has occurred in the virtual machine VM#21 running on the physical device 103b. In this case, the virtual device placement scheduler function unit 111 requests the cloud controller 108 to delete the virtual machine instance, and the cloud controller 108 deletes it. Note that with a cloud controller such as OpenStack, executing the virtual machine instance deletion API also releases the link between the virtual machine and its storage. Therefore, when the virtual device placement scheduler function unit 111 executes the virtual machine instance deletion API, both the deletion of the virtual machine instance and the release of the link between the virtual machine and the storage are requested of the cloud controller 108.

Subsequently, the virtual device placement scheduler function unit 111 determines the relocation destination of the virtual machine VM#21 using the information in the virtual device placement scheduler DB 110. In the example shown in FIG. 4, it selects the physical device 103b as the relocation destination of VM#21. The virtual device placement scheduler function unit 111 then requests the cloud controller 108 to perform the relocation, that is, to create the virtual machine VM#21 on the physical device 103b. As a result, the physical device 103b rebuilds the virtual machine VM#21.

In this way, when a failure occurs in the virtual machine VM#21 running on the physical device 103b, the virtual device placement scheduler function unit 111 deletes the virtual machine instance and then relocates the virtual machine VM#21 to the physical device 103b.

Thus, the virtual device management system can recover virtual machines not only when a physical device fails but also when a virtual machine goes down. In addition, so that virtual machines are recovered safely when multiple failure notifications for a physical device failure or a virtual machine down are received in duplicate, the virtual device management system first performs a cleanup step on the virtual machine information in the DB of the cloud controller 108 and on the virtual machine instances on the physical device, and only then places the virtual machine anew.

Next, processing operations in this virtual device management system will be described with reference to FIGS. 5 to 7. FIG. 5 is a diagram for explaining the processing operation for creating a virtual device in the virtual device management system.

As shown in FIG. 5, the user terminal 101 transmits a virtual device creation request to the virtual device placement scheduler function unit 111 (step S1). The virtual device placement scheduler function unit 111 refers to the virtual device placement scheduler DB 110 (step S2) and checks the physical resource information (step S3). That is, it refers to the virtual device placement scheduler DB 110 to obtain the free-capacity information of the physical devices. Based on the business requirements, the virtual device placement scheduler function unit 111 then determines the physical device 103 on which the virtual device is to be created and prepares the API parameters (step S4).

Next, the virtual device placement scheduler function unit 111 requests the cloud controller 108 to have the determined physical device 103 create the virtual device (step S5). For example, it calls the cloud controller API with the determined placement destination specified. The cloud controller 108 then requests the specified physical device 103 to create the virtual device (step S6).

The physical device 103 creates the virtual device (step S7) and notifies the cloud controller 108 that the creation is complete (step S8). The cloud controller 108 then notifies the virtual device placement scheduler function unit 111 that the creation is complete (step S9), and the virtual device placement scheduler function unit 111 notifies the user terminal 101 that the creation is complete (step S10).
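The creation sequence of steps S1 to S10 can be sketched as follows. This is only an illustration under stated assumptions: `StubCloudController` is a hypothetical stand-in for the cloud controller 108 that merely records requests, the "most free capacity" rule stands in for the operator's business requirements, and updating the bookkeeping after creation is an assumption that the description does not state explicitly.

```python
from typing import Dict, List


class StubCloudController:
    """Hypothetical stand-in for cloud controller 108; records requests."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def create_vm(self, vm_id: str, host: str) -> bool:
        # S6 to S8: ask the host to create the VM and wait for completion.
        self.calls.append(f"create {vm_id} on {host}")
        return True


def handle_create_request(vm_id: str, free_slots: Dict[str, int],
                          placement: Dict[str, str],
                          controller: StubCloudController) -> str:
    # S2, S3: consult the scheduler DB (here: plain dicts) for capacity.
    candidates = [h for h, n in free_slots.items() if n > 0]
    if not candidates:
        raise RuntimeError("no physical device has free capacity")
    # S4: pick a destination (assumed rule: the most free capacity).
    host = max(candidates, key=lambda h: free_slots[h])
    # S5 to S9: call the controller API with the destination specified.
    if not controller.create_vm(vm_id, host):
        raise RuntimeError(f"creation of {vm_id} failed")
    # Assumption: the scheduler records the new placement afterwards.
    free_slots[host] -= 1
    placement[vm_id] = host
    return host  # S10: the result reported back to the user terminal


controller = StubCloudController()
slots = {"103a": 1, "103b": 3, "103c": 2}
table: Dict[str, str] = {}
print(handle_create_request("VM#11", slots, table, controller))
```

The scheduler, not the cloud controller, chooses the destination here, which mirrors the division of roles between the function unit 111 and the cloud controller 108 in FIG. 5.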

FIG. 6 is a diagram for explaining the processing operation for relocating virtual devices in the virtual device management system. FIG. 6 illustrates the operation in which the virtual device placement scheduler function unit 111 mediates the relocation of virtual devices when a failure occurs in one of the physical devices 103. As shown in FIG. 6, in the virtual device management system, the physical device 103a, the physical device 103b, and the physical device 103c monitor one another's device states (steps S21 and S22). For example, the high availability software 107 on the physical devices 103a, 103b, and 103c exchanges device states via Heartbeat. The following describes the case where a failure occurs in the physical device 103a.

When a failure occurs in the physical device 103a, the high availability software 107a on the physical device 103a stops the processes on the physical device 103a and notifies the virtual device placement scheduler function unit 111 of the failure. The physical device 103b and the physical device 103c likewise notify the virtual device placement scheduler function unit 111 of the failure of the physical device 103a. If the physical device 103a has failed completely, it cannot notify the virtual device placement scheduler function unit 111 of the failure itself, but the physical device 103b and the physical device 103c can. The virtual device placement scheduler function unit 111 can therefore still learn of the failure of the physical device 103a. The example shown in FIG. 6 illustrates the case where the physical device 103a has failed completely and cannot notify the virtual device placement scheduler function unit 111 of the failure.

A notification in which the failed physical device itself reports the failure to the virtual device placement scheduler function unit 111 is referred to as an "own-node failure notification", and a notification in which a physical device other than the failed one reports the failure is referred to as an "other-node failure notification". Because an own-node failure notification has low reliability, when the virtual device placement scheduler function unit 111 receives one, it changes the "operating state" of the failed physical device to "failed (under maintenance)" if that state is "operating" or "standby", and leaves the "operating state" unchanged otherwise. Furthermore, an own-node failure notification does not guarantee that the failed physical device has stopped completely, so a virtual machine could end up started twice. For this reason, the virtual device placement scheduler function unit 111 does not start the recovery process on receiving an own-node failure notification. By changing the "operating state" to "failed (under maintenance)", however, it prevents new virtual machines from being placed on that device. On the other hand, when the virtual device placement scheduler function unit 111 receives an other-node failure notification, it changes the "operating state" of the failed physical device to "failed (recovery in progress)" if that state is "operating", "standby", or "failed (under maintenance)". An other-node failure notification guarantees that the failed node has been stopped, by means of STONITH or the like described later. The virtual device placement scheduler function unit 111 therefore enters the recovery process on receiving an other-node failure notification.
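The handling of the two notification types amounts to a small state-transition rule, sketched below. The enum and function names are hypothetical. The behavior for an other-node notification that arrives while the device is already "failed (recovery in progress)" (no change, no new recovery) follows the retransmission handling described for FIG. 6.

```python
from enum import Enum
from typing import Tuple


class HostState(Enum):
    OPERATING = "operating"
    STANDBY = "standby"
    FAILED_MAINTENANCE = "failed (under maintenance)"
    FAILED_RECOVERING = "failed (recovery in progress)"


def on_failure_notification(state: HostState,
                            from_self: bool) -> Tuple[HostState, bool]:
    """Return (new operating state, whether recovery should start)."""
    if from_self:
        # Own-node notification: unreliable, so never start recovery,
        # but block new placements on the device.
        if state in (HostState.OPERATING, HostState.STANDBY):
            return HostState.FAILED_MAINTENANCE, False
        return state, False
    # Other-node notification: the failed node is guaranteed stopped.
    if state in (HostState.OPERATING, HostState.STANDBY,
                 HostState.FAILED_MAINTENANCE):
        return HostState.FAILED_RECOVERING, True
    # Already recovering: leave the state alone.
    return state, False


print(on_failure_notification(HostState.OPERATING, from_self=True))
print(on_failure_notification(HostState.FAILED_MAINTENANCE, from_self=False))
```

Returning the "start recovery" decision together with the new state keeps the check and the update in one place, which matters because several notifications for the same failure can arrive.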

As shown in FIG. 6, the physical device 103b notifies the virtual device placement scheduler function unit 111 that a failure has occurred in the physical device 103a (step S23). The virtual device placement scheduler function unit 111 receives this other-node failure notification from the physical device 103b. It refers to the virtual device placement scheduler DB 110 to check the operating state of the reported physical device 103a and, because that state is "operating", changes it to "failed (recovery in progress)" (step S24). The virtual device placement scheduler function unit 111 then returns an ACK (acknowledgement) to the physical device 103b (step S25) and starts the recovery process. Similarly, the physical device 103c notifies the virtual device placement scheduler function unit 111 that a failure has occurred in the physical device 103a (step S26). The virtual device placement scheduler function unit 111 refers to the virtual device placement scheduler DB 110 to check the operating state of the reported physical device 103a and, because that state is now "failed (recovery in progress)", judges the notification to be a retransmission (step S27). It then returns an ACK to the physical device 103c (step S28). In this way, the virtual device placement scheduler function unit 111 starts the virtual device recovery process in response to the first notification it receives, but also returns an ACK to the second and subsequent notifications.
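Steps S23 to S28 describe an idempotent handler: every other-node notification is acknowledged, but only the first one triggers recovery. A minimal sketch follows, in which the class name, the plain dictionary standing in for the virtual device placement scheduler DB 110, and the list of started recoveries are illustrative assumptions.

```python
from typing import Dict, List

RECOVERING = "failed (recovery in progress)"


class FailureNotificationHandler:
    """Hypothetical sketch of duplicate-notification handling (FIG. 6)."""

    def __init__(self, host_states: Dict[str, str]) -> None:
        self.host_states = host_states          # stands in for DB 110
        self.recoveries_started: List[str] = []

    def on_other_node_notification(self, failed_host: str,
                                   reporter: str) -> str:
        # `reporter` is only the destination of the ACK in this sketch.
        if self.host_states[failed_host] != RECOVERING:
            # S24: first notification, so mark the device and recover.
            self.host_states[failed_host] = RECOVERING
            self.recoveries_started.append(failed_host)
        # Otherwise S27: treated as a retransmission, nothing to redo.
        return "ACK"  # S25 / S28: every notification is acknowledged


handler = FailureNotificationHandler({"103a": "operating"})
print(handler.on_other_node_notification("103a", reporter="103b"))
print(handler.on_other_node_notification("103a", reporter="103c"))
print(handler.recoveries_started)
```

Using the stored operating state as the deduplication key means no separate record of received notifications is needed.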

In the failure recovery process, the virtual device placement scheduler function unit 111 first requests the cloud controller 108 to clear all virtual machines that were running on the physical device 103a (step S29). The cloud controller 108 then deletes the links between the virtual machines and the storage in its own DB (step S30) and notifies the virtual device placement scheduler function unit 111 that the virtual machine clearing is complete (step S31).

The virtual device placement scheduler function unit 111 refers to the virtual device placement scheduler DB 110 (step S32) and checks the physical resource information (step S33). It thereby obtains the free capacity of the physical resources of the physical device 103b and the physical device 103c, determines the physical devices 103 to which the virtual devices are to be relocated, and prepares the API parameters (step S34). By selecting a plurality of physical devices 103 as recovery destinations, the virtual device placement scheduler function unit 111 enables fast recovery.

Next, the virtual device placement scheduler function unit 111 requests the cloud controller 108 to have the physical device 103b create the virtual devices to be relocated (step S35), and the cloud controller 108 requests the physical device 103b to create them (step S36). Similarly, the virtual device placement scheduler function unit 111 requests the cloud controller 108 to have the physical device 103c create the virtual devices to be relocated (step S37), and the cloud controller 108 requests the physical device 103c to create them (step S38). In each case, the virtual device placement scheduler function unit 111 calls the API of the cloud controller 108 with the selected placement destination specified, and the cloud controller 108 in turn requests the specified physical device 103 to create the virtual devices.

The physical device 103b creates the virtual devices (step S39) and notifies the cloud controller 108 that the creation is complete (step S40). The cloud controller 108 then notifies the virtual device placement scheduler function unit 111 that the creation is complete (step S41). Similarly, the physical device 103c creates the virtual devices (step S42) and notifies the cloud controller 108 that the creation is complete (step S43), and the cloud controller 108 notifies the virtual device placement scheduler function unit 111 that the creation is complete (step S44). Although the example in FIG. 6 shows the virtual device placement scheduler function unit 111 relocating the virtual devices to both the physical device 103b and the physical device 103c, they may instead be relocated to only one of the two.
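The ordering in steps S29 to S44, in which all virtual machines of the failed device are cleared before any of them is created elsewhere, can be sketched as below. The function name and the `log` list that stands in for requests to the cloud controller 108 are hypothetical. The round-robin choice of destination is a simplifying assumption made only for this sketch, since the description selects destinations from free-capacity information.

```python
from itertools import cycle
from typing import Dict, List


def recover_from_host_failure(failed_host: str,
                              placement: Dict[str, str],
                              survivors: List[str],
                              log: List[str]) -> Dict[str, str]:
    """Clear every VM of the failed device, then re-create them."""
    victims = sorted(vm for vm, h in placement.items() if h == failed_host)
    # S29 to S31: the clear step finishes before any creation starts.
    for vm in victims:
        log.append(f"clear {vm}")
    # S32 to S44: re-create on several devices (assumed: round robin).
    for vm, host in zip(victims, cycle(survivors)):
        log.append(f"create {vm} on {host}")
        placement[vm] = host
    return placement


requests: List[str] = []
table = {"VM#11": "103a", "VM#12": "103a", "VM#21": "103b"}
recover_from_host_failure("103a", table, ["103b", "103c"], requests)
print(requests)
print(table)
```

Separating the two loops makes the safety property visible: no virtual machine is created anywhere while its old link to storage still exists.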

FIG. 7 is a diagram for explaining the processing operation for relocating a virtual device in the virtual device management system. FIG. 7 illustrates the operation in which the virtual device placement scheduler function unit 111 mediates the relocation of a virtual machine when a virtual machine running on the physical device 103a goes down.

As shown in FIG. 7, the physical device 103a detects, by means of the virtual machine monitoring module 106 on the physical device 103a, that a virtual machine on the physical device 103a has gone down (step S51). The physical device 103a then notifies the virtual device placement scheduler function unit 111 of the virtual machine down (step S52).

The virtual device placement scheduler function unit 111 requests the cloud controller 108 to delete the virtual machine instance (step S53). The cloud controller 108 requests the physical device 103a to delete the virtual machine instance (step S54). The physical device 103a deletes the virtual machine instance (step S55) and notifies the cloud controller 108 that the deletion is complete (step S56). The cloud controller 108 then notifies the virtual device placement scheduler function unit 111 that the deletion of the virtual machine instance is complete (step S57).

On receiving the notification that the virtual machine instance has been deleted, the virtual device placement scheduler function unit 111 refers to the virtual device placement scheduler DB 110 in order to relocate the virtual machine (step S58) and checks the physical resource information (step S59). It thereby obtains the free capacity of the physical resources of the physical device 103a, the physical device 103b, and the physical device 103c, determines the physical device 103 to which the virtual machine is to be relocated, and prepares the API parameters (step S60). FIG. 7 shows the case where the virtual device placement scheduler function unit 111 re-creates the virtual machine on the physical device 103a.

The virtual device placement scheduler function unit 111 requests the cloud controller 108 to create the virtual machine (step S61). The cloud controller 108 then requests the physical device 103a to create the virtual machine (step S62). The physical device 103a creates the virtual machine (step S63) and notifies the cloud controller 108 that the creation is complete (step S64). The cloud controller 108 then notifies the virtual device placement scheduler function unit 111 that the virtual machine creation is complete (step S65).

Next, the virtual device placement scheduler DB 110 and the virtual device placement scheduler function unit 111 implemented by the virtual device management apparatus 109 will be described with reference to FIG. 8. FIG. 8 is a diagram for explaining the virtual device placement scheduler DB 110 and the virtual device placement scheduler function unit 111 implemented by the virtual device management apparatus 109.

As shown in FIG. 8, the virtual device placement scheduler DB 110 stores a virtual device placement information table 110a and a physical resource information table 110b. The virtual device placement information table 110a stores virtual device placement information indicating on which physical device each virtual device is placed.

FIG. 9 is a diagram showing an example of the data structure of the virtual device placement information table 110a. As shown in FIG. 9, the virtual device placement information table 110a stores virtual device placement information in which a "virtual machine ID", a "physical device ID", and a "progress status" are associated with one another. The "virtual machine ID" is an identifier that uniquely identifies a virtual machine created on a physical device 103. For example, data values such as "virtual machine #11" and "virtual machine #21" are stored as the "virtual machine ID".

The "physical device ID" stored in the virtual device placement information table 110a is an identifier that uniquely identifies a physical device 103. For example, data values such as "physical device #1" and "physical device #2" are stored as the "physical device ID".

The "progress status" stored in the virtual device placement information table 110a is information indicating how far the recovery process of the virtual machine has progressed. The "progress status" is updated by the virtual machine deletion unit 111d and the re-creation unit 111e described later. The values stored as the "progress status" include "operating", "deleting (virtual machine down)", "deleting (physical device failure)", "creating (virtual machine down)", and "creating (physical device failure)". Here, "operating" indicates that the virtual machine is running, that is, that no recovery process has been started for it.

"Deleting (virtual machine down)" indicates that a recovery process triggered by a failure of the virtual machine is in progress. "Deleting (physical device failure)" indicates that a recovery process triggered by a failure of the physical device is in progress. While the progress status is one of the "deleting" values, the recovery process being performed is the deletion of the association between the virtual machine and the storage area it uses, or the deletion of the virtual machine instance. The recovery process performed in the "deleting" state is also referred to as the "first recovery process".

"Creating (virtual machine down)" indicates that a recovery process triggered by a failure of the virtual machine is in progress. "Creating (physical device failure)" indicates that a recovery process triggered by a failure of the physical device is in progress. While the progress status is one of the "creating" values, the recovery process being performed is the selection of a physical device to run the virtual machine and the re-creation of the virtual machine on the selected physical device. The recovery process performed in the "creating" state is also referred to as the "second recovery process". The second recovery process is performed after the first recovery process has finished.
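The five progress status values and their mapping to the first and second recovery processes can be written down as a small enumeration. This is a sketch with hypothetical names. The transition from a "creating" value back to "operating" once re-creation finishes is an assumption, since the description only states that "operating" means no recovery has started.

```python
from enum import Enum


class Progress(Enum):
    OPERATING = "operating"
    DELETING_VM_DOWN = "deleting (virtual machine down)"
    DELETING_DEVICE_FAILURE = "deleting (physical device failure)"
    CREATING_VM_DOWN = "creating (virtual machine down)"
    CREATING_DEVICE_FAILURE = "creating (physical device failure)"


def recovery_phase(p: Progress) -> int:
    """0 = no recovery, 1 = first (delete), 2 = second (re-create)."""
    if p in (Progress.DELETING_VM_DOWN, Progress.DELETING_DEVICE_FAILURE):
        return 1
    if p in (Progress.CREATING_VM_DOWN, Progress.CREATING_DEVICE_FAILURE):
        return 2
    return 0


def next_progress(p: Progress) -> Progress:
    """Advance one step; the cause of the failure is preserved."""
    if p is Progress.DELETING_VM_DOWN:
        return Progress.CREATING_VM_DOWN
    if p is Progress.DELETING_DEVICE_FAILURE:
        return Progress.CREATING_DEVICE_FAILURE
    # Assumption: a finished re-creation returns the VM to "operating".
    return Progress.OPERATING


p = Progress.DELETING_DEVICE_FAILURE
print(recovery_phase(p), next_progress(p).value)
```

A nonzero `recovery_phase` corresponds to "a recovery process is being performed" for that virtual machine, which is the condition checked before a new recovery is started.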

一例をあげると、図９に示す仮想機器配置情報テーブル１１０ａは、識別子が「物理機器＃１」である物理機器１０３には、仮想機器「仮想マシン＃１１」及び「仮想マシン＃１２」が配置されており、仮想マシン＃１１及び仮想マシン＃１２が仮想マシンのダウンによる復旧処理中であることを示す。ここで、仮想マシン＃１１は、削除中であり、仮想マシン＃１２は、作成中である。また、図９に示す仮想機器配置情報テーブル１１０ａは、識別子が「物理機器＃２」である物理機器１０３には、仮想機器「仮想マシン＃２１」及び「仮想マシン＃２２」が配置されており、仮想マシン＃２１及び仮想マシン＃２２が物理機器＃２の障害による復旧処理中であることを示す。ここで、仮想マシン＃２１及び仮想マシン＃２２は、削除中である。また、図９に示す仮想機器配置情報テーブル１１０ａは、識別子が「物理機器＃３」である物理機器１０３には、仮想機器「仮想マシン＃２３」、「仮想マシン＃３１」及び「仮想マシン＃３２」が配置されており、仮想マシン＃３１及び仮想マシン＃３２が稼働中であり、仮想マシン＃２３が物理機器＃３以外の他の物理機器の障害による復旧処理中であることを示す。なお、図９に示す例では、元々、物理機器＃２では、仮想マシン＃２１、＃２２、及び＃２３が動作しており、物理機器＃２が故障した場合を示す。このため、仮想マシン＃２１、＃２２、及び＃２３のクリーンアップ処理を行うが、仮想マシン＃２１、及び＃２２はまだクリーンアップ中のため物理機器＃２に紐付いており削除中と表示される。仮想マシン＃２３は、クリーンアップが既に終わり、仮想マシン作成を物理機器＃３上で既に始めているため、物理機器＃３に紐付いて作成中と表示されている。 For example, the virtual device arrangement information table 110a shown in FIG. 9 indicates that the virtual devices “virtual machine #11” and “virtual machine #12” are placed on the physical device 103 whose identifier is “physical device #1”, and that both are under recovery triggered by a virtual machine going down. Virtual machine #11 is being deleted, and virtual machine #12 is being created. The table also indicates that the virtual devices “virtual machine #21” and “virtual machine #22” are placed on the physical device 103 whose identifier is “physical device #2”, and that both are under recovery triggered by the failure of physical device #2. Virtual machines #21 and #22 are being deleted. The table further indicates that the virtual devices “virtual machine #23”, “virtual machine #31”, and “virtual machine #32” are placed on the physical device 103 whose identifier is “physical device #3”, that virtual machines #31 and #32 are in operation, and that virtual machine #23 is under recovery triggered by the failure of a physical device other than physical device #3. In the example of FIG. 9, virtual machines #21, #22, and #23 were originally running on physical device #2, and physical device #2 has failed. Cleanup is therefore performed for virtual machines #21, #22, and #23. Virtual machines #21 and #22 are still being cleaned up, so they remain associated with physical device #2 and are shown as being deleted. Cleanup of virtual machine #23 has already finished and its creation has already started on physical device #3, so it is associated with physical device #3 and shown as being created.
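The table of FIG. 9 and its five progress values can be sketched as a small data structure. This is only an illustration: the names `Progress`, `Placement`, `FIG9`, and `recovery_phase` are hypothetical and do not appear in the specification; the rows are the ones described above.

```python
from dataclasses import dataclass
from enum import Enum


class Progress(Enum):
    """The five "progress status" values named in the description."""
    RUNNING = "in operation"
    DELETING_VM_DOWN = "deleting (virtual machine down)"
    DELETING_HOST_FAILURE = "deleting (physical device failure)"
    CREATING_VM_DOWN = "creating (virtual machine down)"
    CREATING_HOST_FAILURE = "creating (physical device failure)"


@dataclass
class Placement:
    """One row of the virtual device arrangement information table (hypothetical layout)."""
    physical_device: str
    virtual_machine: str
    progress: Progress


# Rows as described for FIG. 9.
FIG9 = [
    Placement("physical device #1", "virtual machine #11", Progress.DELETING_VM_DOWN),
    Placement("physical device #1", "virtual machine #12", Progress.CREATING_VM_DOWN),
    Placement("physical device #2", "virtual machine #21", Progress.DELETING_HOST_FAILURE),
    Placement("physical device #2", "virtual machine #22", Progress.DELETING_HOST_FAILURE),
    Placement("physical device #3", "virtual machine #23", Progress.CREATING_HOST_FAILURE),
    Placement("physical device #3", "virtual machine #31", Progress.RUNNING),
    Placement("physical device #3", "virtual machine #32", Progress.RUNNING),
]


def recovery_phase(progress: Progress) -> int:
    """Return 0 for no recovery, 1 for the first recovery process (deleting),
    and 2 for the second recovery process (creating)."""
    if progress is Progress.RUNNING:
        return 0
    return 1 if progress.name.startswith("DELETING") else 2
```

Keeping the cause of the recovery inside the status value, as the table does, lets a later failure notification be compared against both the phase and the cause of the recovery already under way.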

図８に戻る。物理資源情報テーブル１１０ｂは、各物理機器の稼働状態と物理機器が有する物理資源の空き容量とを示す物理資源情報を記憶する。図１０は、物理資源情報テーブル１１０ｂのデータ構造の一例を示す図である。図１０に示すように、物理資源情報テーブル１１０ｂは、「物理機器ＩＤ」と「稼働状態」と「空き」と「使用中」と「障害用」とを対応付けた物理資源情報を記憶する。 Returning to FIG. 8, the physical resource information table 110b stores physical resource information indicating the operating state of each physical device and the free capacity of the physical resources that the device has. FIG. 10 is a diagram illustrating an example of the data structure of the physical resource information table 110b. As illustrated in FIG. 10, the physical resource information table 110b stores physical resource information in which “physical device ID”, “operating state”, “free”, “in use”, and “for failure” are associated with one another.

ここで、物理資源情報テーブル１１０ｂが記憶する「物理機器ＩＤ」は、物理機器１０３を一意に識別する識別子を示す。例えば、「物理機器ＩＤ」には、「物理機器＃１」、「物理機器＃２」等のデータ値が格納される。 Here, the “physical device ID” stored in the physical resource information table 110 b indicates an identifier for uniquely identifying the physical device 103. For example, data values such as “physical device # 1” and “physical device # 2” are stored in “physical device ID”.

また、物理資源情報テーブル１１０ｂが記憶する「稼働状態」は、物理機器が稼働中であるか否かを示す。この「稼働状態」は、後述する障害判定部１１１ｃにより更新される。例えば、物理機器が稼働中である場合、「稼働状態」には「稼働中」が格納される。なお、図８では図示していないが、物理機器が予備系として設けられ稼働中ではない場合、「稼働状態」には「予備」が格納される。また、障害が発生した物理機器の復旧処理が開始されていない場合、「稼働状態」には「故障中（メンテナンス中）」が格納される。また、障害が発生した物理機器の復旧処理が開始されている場合、「稼働状態」には「故障中（復旧処理中）」が格納される。 The “operating state” stored in the physical resource information table 110b indicates whether the physical device is operating. This “operation state” is updated by a failure determination unit 111c described later. For example, when the physical device is in operation, “in operation” is stored in “operation state”. Although not shown in FIG. 8, when the physical device is provided as a standby system and is not in operation, “standby” is stored in the “operation state”. In addition, when recovery processing of a physical device in which a failure has occurred has not started, “failed (maintenance)” is stored in the “operation state”. Further, when the recovery process of the physical device in which the failure has occurred has been started, “failed (during recovery process)” is stored in the “operation state”.

また、「空き」は、物理機器が有する物理資源の容量のうち空き容量を示す。例えば、「空き」には、「３」、「５」、「４」等の値が格納される。また、「使用中」は、物理機器が有する物理資源の容量のうち使用中の容量を示す。例えば、「使用中」には、「１」、「３」、「２」等の値が格納される。また、「障害用」は、物理機器が有する物理資源の容量のうち復旧用に確保された容量を示す。例えば、「障害用」には、「２」等の値が格納される。 “Free” indicates the free capacity of the physical resource capacity of the physical device. For example, values such as “3”, “5”, and “4” are stored in “empty”. Further, “in use” indicates a capacity in use among the capacity of physical resources of the physical device. For example, “in use” stores values such as “1”, “3”, and “2”. Further, “for failure” indicates the capacity reserved for recovery out of the physical resource capacity of the physical device. For example, a value such as “2” is stored in “for failure”.

一例をあげると、図１０に示す物理資源情報テーブル１１０ｂは、物理機器＃１は、稼働中であり、物理資源の空き容量が「３」であり、使用中の容量が「１」であり、復旧用に確保された容量が「２」であることを示す。また、図１０に示す物理資源情報テーブル１１０ｂは、物理機器＃２は、稼働中であり、物理資源の空き容量が「５」であり、使用中の容量が「３」であり、復旧用に確保された容量が「２」であることを示す。同様に、図１０に示す物理資源情報テーブル１１０ｂは、物理機器＃３は、稼働中であり、物理資源の空き容量が「４」であり、使用中の容量が「２」であり、復旧用に確保された容量が「２」であることを示す。 For example, the physical resource information table 110b shown in FIG. 10 indicates that physical device #1 is in operation, with a free capacity of “3”, a capacity in use of “1”, and a capacity reserved for recovery of “2”. It also indicates that physical device #2 is in operation, with a free capacity of “5”, a capacity in use of “3”, and a capacity reserved for recovery of “2”. Similarly, it indicates that physical device #3 is in operation, with a free capacity of “4”, a capacity in use of “2”, and a capacity reserved for recovery of “2”.

図８に戻る。仮想機器配置スケジューラ機能部１１１は、作成依頼受付部１１１ａと、配置先選択部１１１ｂと、障害判定部１１１ｃと、仮想マシン削除部１１１ｄと、再作成部１１１ｅとを有する。 Returning to FIG. 8, the virtual device arrangement scheduler function unit 111 includes a creation request reception unit 111a, a placement destination selection unit 111b, a failure determination unit 111c, a virtual machine deletion unit 111d, and a re-creation unit 111e.

作成依頼受付部１１１ａは、仮想機器の作成要求をユーザ端末１０１から受付ける。作成依頼受付部１１１ａは、受付けた仮想機器の作成要求を配置先選択部１１１ｂに受け渡す。 The creation request reception unit 111 a receives a virtual device creation request from the user terminal 101. The creation request accepting unit 111a delivers the accepted virtual device creation request to the placement destination selecting unit 111b.

配置先選択部１１１ｂは、仮想機器を新規に作成する際に、仮想機器を配置する物理機器を選択する。ここで、配置先選択部１１１ｂは、仮想機器を出来るだけ分散して配置するように物理機器１０３を選択する。言い換えると、配置先選択部１１１ｂは、「稼働中」の「空き」スペースの数が平準化するように仮想機器を配置する。 The placement destination selection unit 111b selects a physical device on which a virtual device is placed when a virtual device is newly created. Here, the placement destination selection unit 111b selects the physical device 103 so that virtual devices are distributed as much as possible. In other words, the placement destination selection unit 111b arranges the virtual devices so that the number of “operating” “free” spaces is leveled.

図１１は、配置先選択部１１１ｂによる処理動作を説明するための図である。図１１では、物理機器＃１〜物理機器＃６の６台の物理機器を有する仮想機器管理システムにおいて、仮想機器を新規に作成する場合について説明する。ここで、物理機器＃１〜物理機器＃５の稼働状態は「稼働中」であり、物理機器＃６の稼働状態は「予備」である。また、物理機器＃１のスペースの状態は、「空き」２、「使用中」１、「障害用バッファ」２であり、物理機器＃２のスペースの状態は、「空き」０、「使用中」３、「障害用バッファ」２であり、物理機器＃３のスペースの状態は、「空き」４、「使用中」０、「障害用バッファ」２である。また、物理機器＃４のスペースの状態は、「空き」０、「使用中」５、「障害用バッファ」２であり、物理機器＃５のスペースの状態は、「空き」２、「使用中」０、「障害用バッファ」２であり、物理機器＃６のスペースの状態は、「空き」３、「使用中」０、「障害用バッファ」２である。 FIG. 11 is a diagram for explaining the processing operation by the placement destination selection unit 111b. FIG. 11 illustrates a case where a virtual device is newly created in a virtual device management system having six physical devices # 1 to # 6. Here, the operating state of the physical device # 1 to the physical device # 5 is “operating”, and the operating state of the physical device # 6 is “standby”. The space status of the physical device # 1 is “free” 2, “in use” 1, and “failure buffer” 2. The space status of the physical device # 2 is “free” 0, “in use” 3 and “failure buffer” 2, and the space status of the physical device # 3 is “free” 4, “in use” 0, and “failure buffer” 2. The space status of the physical device # 4 is “free” 0, “in use” 5, and “failure buffer” 2. The space status of the physical device # 5 is “free” 2, “in use”. "0", "Fault buffer" 2, and the space status of the physical device # 6 is "Free" 3, "In use" 0, "Fault buffer" 2.

例えば、配置先選択部１１１ｂは、配置先選択時に、稼働状態が「稼働中」である物理機器の空きスペースの量をチェックし、最も空きスペースが多い稼働中の物理機器を特定する。より具体的には、配置先選択部１１１ｂは、作成する仮想機器のうち１つの仮想機器（例えば、仮想機器＃１）を選択する。そして、図１１に示すスペースの状態である場合には、「空き」が４である物理機器＃３を、最も空きスペースが多い稼働中の物理機器に特定する。なお、配置先選択部１１１ｂは、「障害用バッファ」を通常オペレーション時には利用しない。そして、配置先選択部１１１ｂは、特定した物理機器＃３を選択した仮想機器＃１の配置先として選択する。 For example, when selecting an arrangement destination, the arrangement destination selection unit 111b checks the amount of free space of a physical device whose operation state is “in operation” and identifies an active physical device with the largest available space. More specifically, the placement destination selection unit 111b selects one virtual device (for example, virtual device # 1) from the virtual devices to be created. Then, in the case of the space state shown in FIG. 11, the physical device # 3 whose “empty” is 4 is identified as an active physical device with the most free space. The placement destination selection unit 111b does not use the “failure buffer” during normal operation. Then, the placement destination selection unit 111b selects the specified physical device # 3 as the placement destination of the selected virtual device # 1.

続いて、配置先選択部１１１ｂは、配置先として選択する処理を、作成を依頼された全ての仮想機器の配置先を選択するまで繰り返す。一例をあげると、配置先選択部１１１ｂは、図１１に示す数字順に仮想機器を配置するように物理機器を選択する。このように、配置先選択部１１１ｂは、最も空きスペースが多い稼働中の物理機器のスペースの一部を選択することで「空き」スペースの数を平準化する。 Subsequently, the placement destination selection unit 111b repeats the process of selecting the placement destination until the placement destinations of all virtual devices requested to be created are selected. For example, the placement destination selection unit 111b selects physical devices so that virtual devices are placed in the numerical order shown in FIG. In this way, the placement destination selection unit 111b equalizes the number of “free” spaces by selecting a part of the space of the operating physical device that has the most free space.

また、配置先選択部１１１ｂは、稼働状態が「稼働中」である物理機器のスペースが全て埋まった場合に、稼働状態が「予備」である物理機器に仮想機器を配置する。このため、配置先選択部１１１ｂは、図１１に示す８番のスペースまで仮想機器を配置したら、予備の物理機器に仮想機器を配置する。すなわち、配置先選択部１１１ｂは、図１１に示す例において、仮想機器を９台以上作成する場合には、稼働状態が「予備」である物理機器に仮想機器を配置する。 Further, the arrangement destination selection unit 111b arranges the virtual device in the physical device whose operation state is “standby” when all the space of the physical device whose operation state is “in operation” is filled. For this reason, when the placement destination selecting unit 111b places virtual devices up to the eighth space shown in FIG. 11, the placement destination selecting unit 111b places the virtual devices in the spare physical device. That is, in the example illustrated in FIG. 11, the placement destination selection unit 111 b places virtual devices on physical devices whose operation state is “standby” when nine or more virtual devices are created.
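The selection rule described above (pick the operating device with the most free space, never touch the failure buffer, and fall back to a standby device only when every operating device is full) can be sketched as follows. The names `Host`, `select_host`, `place`, and `fig11_hosts` are hypothetical, and the tie-breaking rule (first device in list order) is an assumption, since the numeric order drawn in FIG. 11 is not reproduced in the text.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    state: str           # "in operation" or "standby"
    free: int            # free spaces usable in normal operation
    in_use: int
    failure_buffer: int  # reserved for recovery, never used for new placements


def select_host(hosts):
    """Prefer operating hosts; use standby hosts only when no operating host has free space."""
    for state in ("in operation", "standby"):
        candidates = [h for h in hosts if h.state == state and h.free > 0]
        if candidates:
            # max() returns the first maximal element, so ties go to list order (assumption).
            return max(candidates, key=lambda h: h.free)
    return None


def place(hosts, count):
    """Place `count` virtual devices one by one and return the chosen host names in order."""
    order = []
    for _ in range(count):
        host = select_host(hosts)
        if host is None:
            raise RuntimeError("no free space left on any physical device")
        host.free -= 1
        host.in_use += 1
        order.append(host.name)
    return order


def fig11_hosts():
    """Space states as listed for FIG. 11."""
    return [
        Host("physical device #1", "in operation", 2, 1, 2),
        Host("physical device #2", "in operation", 0, 3, 2),
        Host("physical device #3", "in operation", 4, 0, 2),
        Host("physical device #4", "in operation", 0, 5, 2),
        Host("physical device #5", "in operation", 2, 0, 2),
        Host("physical device #6", "standby", 3, 0, 2),
    ]
```

With these values the operating devices offer eight free spaces in total, so `place(fig11_hosts(), 9)` sends the first eight virtual devices to operating devices and only the ninth to the standby physical device #6, matching the behavior described for FIG. 11.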

そして、配置先選択部１１１ｂは、選択した物理機器１０３に、仮想機器を作成するようにクラウドコントローラ１０８に依頼する。 The placement destination selection unit 111b then requests the cloud controller 108 to create the virtual device on the selected physical device 103.

図８に戻る。障害判定部１１１ｃは、仮想マシンの障害又は仮想マシンを稼働させる物理機器の障害を検出する。例えば、障害判定部１１１ｃは、各物理機器１０３が有する仮想マシン監視モジュール１０６及び高可用ソフトウェア１０７と連携することで、ダウンした仮想マシン１０４や障害の生じた物理機器１０３を検出する。そして、障害判定部１１１ｃは、仮想マシンの障害又は仮想マシンを稼働させる物理機器の障害を検出した場合に、仮想マシンに対して先行して実施されている復旧処理が存在するか否かを判定する。 Returning to FIG. 8, the failure determination unit 111c detects a failure of a virtual machine or a failure of the physical device that operates the virtual machine. For example, the failure determination unit 111c detects a virtual machine 104 that has gone down or a physical device 103 that has failed by cooperating with the virtual machine monitoring module 106 and the high availability software 107 included in each physical device 103. When the failure determination unit 111c detects a failure of a virtual machine or of the physical device that operates it, it determines whether a recovery process is already being performed for that virtual machine.

ここではまず、障害判定部１１１ｃによる仮想マシンを稼働させる物理機器の障害を検出する処理について説明する。図１２は、障害判定部１１１ｃによる物理機器の障害を検出する処理動作を説明するための図である。 Here, first, processing for detecting a failure of a physical device that operates a virtual machine by the failure determination unit 111c will be described. FIG. 12 is a diagram for explaining a processing operation for detecting a failure of a physical device by the failure determination unit 111c.

図１２では、物理機器１０３ａ〜１０３ｃを図示しており、物理機器１０３ａに障害が発生した場合について説明する。また、図１２では、物理機器１０３が有する機能のうち、物理機器１０３ａには、自装置の障害発生時に機能する構成部を示し、物理機器１０３ｂ及び物理機器１０３ｃには、他装置の障害を検出した場合に機能する構成部を示す。 FIG. 12 illustrates the physical devices 103a to 103c and is used to describe the case where a failure occurs in the physical device 103a. Of the functions of the physical device 103, FIG. 12 shows for the physical device 103a the components that act when the device itself fails, and for the physical devices 103b and 103c the components that act when a failure of another device is detected.

図１２に示すように、物理機器１０３の障害の検出には、高可用ソフトウェア１０７ａ〜１０７ｃが用いられる。全ての物理機器１０３は、ＣＩＢ（Cluster Information Base）に、クラスタ内の全物理機器１０３の状態を保持する。高可用ソフトウェア１０７は、ＲＡ（Resource Agent）を用いて自物理機器の状態を確認する。なお、ＲＡとは、例えば、仮想ボリューム制御部や仮想マシン制御部に相当する。 As illustrated in FIG. 12, high availability software 107 a to 107 c is used for detecting a failure of the physical device 103. All physical devices 103 hold the status of all physical devices 103 in the cluster in a CIB (Cluster Information Base). The high availability software 107 confirms the state of the own physical device using an RA (Resource Agent). Note that RA corresponds to, for example, a virtual volume control unit or a virtual machine control unit.

また、高可用ソフトウェア１０７は、Ｈｅａｒｔｂｅａｔにより、クラスタ内のどの物理機器も他の物理機器の状態を知り得る。このため、高可用ソフトウェア１０７は、ｈｅａｒｔｂｅａｔパケットを使ってクラスタ内に状態を通知する。この仕組みにより、各物理機器は他の物理機器の状態を知る。高可用ソフトウェア１０７は、ある物理機器からのＨｅａｒｔｂｅａｔパケットが継続的にロストすると、他の物理機器は当該物理機器がダウンしたとみなす。 With Heartbeat, the high availability software 107 lets every physical device in the cluster learn the state of the other physical devices. To this end, the high availability software 107 announces its state within the cluster using heartbeat packets, and each physical device learns the state of the others through this mechanism. When heartbeat packets from a given physical device are lost continuously, the other physical devices regard that physical device as down.
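The idea of treating a peer as down after heartbeat packets are lost continuously can be sketched as a simple counter per peer. This is not how the high availability software is implemented; the class `HeartbeatMonitor` and the threshold `max_missed` are hypothetical, and the specification does not state how many consecutive losses are required.

```python
class HeartbeatMonitor:
    """Counts consecutive missed heartbeats per peer (illustrative only)."""

    def __init__(self, peers, max_missed=3):
        self.max_missed = max_missed  # assumed threshold, not given in the text
        self.missed = {peer: 0 for peer in peers}

    def tick(self, received):
        """Record one heartbeat interval.

        `received` is the set of peers whose heartbeat arrived in this interval.
        Returns the sorted list of peers currently regarded as down.
        """
        for peer in self.missed:
            if peer in received:
                self.missed[peer] = 0   # any heartbeat resets the count
            else:
                self.missed[peer] += 1  # another consecutive loss
        return sorted(p for p, n in self.missed.items() if n >= self.max_missed)
```

Resetting the counter on every received packet means that only an unbroken run of losses marks a device as down, which corresponds to the "continuously lost" condition in the text.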

物理機器１０３は、仮想機器配置スケジューラ機能部１１１に物理機器に生じた障害を通知するため、通知ＲＡと通知プロセスとを備える。例えば、Ｐａｃｅｍａｋｅｒが自物理機器の故障を検出した場合、通知ＲＡを使用して仮想機器配置スケジューラ機能部１１１に自物理機器の故障を通知する。一方、通知プロセスは、常駐プロセスとして設定され、ＣＩＢの状態を定期的に確認する。そして、通知プロセスは、他物理機器の故障を検出すると、後述のＳＴＯＮＩＴＨにより確実に故障ノードが落ちていることを確認した後、他物理機器に障害が生じたことを仮想機器配置スケジューラ機能部１１１に通知する。通知ＲＡによる通知及び通知プロセスによる通知は、ＡＣＫが仮想機器配置スケジューラ機能部１１１から返るまで一定回数繰り返される。 The physical device 103 includes a notification RA and a notification process in order to notify the virtual device arrangement scheduler function unit 111 of a failure that has occurred in a physical device. For example, when Pacemaker detects a failure of its own physical device, it uses the notification RA to notify the virtual device arrangement scheduler function unit 111 of that failure. The notification process, on the other hand, is set up as a resident process and periodically checks the state of the CIB. When the notification process detects a failure of another physical device, it first confirms through STONITH, described later, that the failed node has reliably been brought down, and then notifies the virtual device arrangement scheduler function unit 111 that a failure has occurred in the other physical device. The notification by the notification RA and the notification by the notification process are each repeated up to a fixed number of times until an ACK is returned from the virtual device arrangement scheduler function unit 111.

続いて、仮想機器配置スケジューラ機能部１１１において、障害判定部１１１ｃは、通知を受信したらＡＣＫを応答し、仮想マシン削除部１１１ｄ及び再作成部１１１ｅに仮想機器の復旧処理を実行させる。なお、障害判定部１１１ｃは、同内容の通知に関しては２通目以降の通知を無視してＡＣＫを応答する。ただし、自ノード故障通知と他ノード故障通知は別物として扱う。これにより、複数の物理機器から通知を受けることで冗長化対策をとることができるとともに、復旧処理を繰り返さないようにする。 Subsequently, in the virtual device arrangement scheduler function unit 111, the failure determination unit 111c responds with an ACK when receiving the notification, and causes the virtual machine deletion unit 111d and the re-creation unit 111e to execute a virtual device recovery process. The failure determination unit 111c ignores the second and subsequent notifications and returns an ACK for the same content notification. However, the self-node failure notification and the other-node failure notification are handled separately. As a result, it is possible to take a redundancy measure by receiving notifications from a plurality of physical devices and not to repeat the recovery process.
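The notification exchange described above (the sender retries until it gets an ACK, and the receiver always answers ACK but acts only on the first notification with a given content) can be sketched as below. The names `FailureNotificationReceiver`, `notify_until_acked`, and the `"self"`/`"other"` labels are hypothetical, and `max_attempts=3` is an assumed value for the unspecified retry count.

```python
class FailureNotificationReceiver:
    """Receiver side: ACK everything, start handling only once per distinct notification."""

    def __init__(self, on_new_failure):
        self._seen = set()
        self._on_new_failure = on_new_failure

    def receive(self, kind, failed_device):
        # kind is "self" (own-node failure) or "other" (other-node failure).
        # The two kinds are tracked separately, as the description requires;
        # the sender is deliberately not part of the key, so the same failure
        # reported by several devices is handled only once.
        key = (kind, failed_device)
        if key not in self._seen:
            self._seen.add(key)
            self._on_new_failure(kind, failed_device)
        return "ACK"


def notify_until_acked(send, max_attempts=3):
    """Sender side: call `send()` until it returns "ACK" or attempts run out.

    Returns the number of attempts used, or None if no ACK was received.
    """
    for attempt in range(1, max_attempts + 1):
        if send() == "ACK":
            return attempt
    return None
```

For example, if physical devices 103b and 103c both report the failure of 103a as an other-node failure, both calls return "ACK" but the handler runs only for the first one.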

なお、高可用ソフトウェア１０７は、自物理機器の停止に失敗する場合がある。仮想マシンの場合、復旧により、複数の仮想マシンが同時に存在してしまい、データ領域への同時アクセスによりデータ破壊の可能性が出てしまう。そこで、高可用ソフトウェア１０７が「Ｐａｃｅｍａｋｅｒ」である場合、ＳＴＯＮＩＴＨモジュールを用いて、確実に故障物理機器を落とす。ＳＴＯＮＩＴＨは、ＩＰＭＩ（Intelligent Platform Management Interface）経由で、故障物理機器を停止することで、故障物理機器が動作し続けないことを保証する。Ｑｕｏｒｕｍで過半数を形成した多数派の物理機器が、ＳＴＯＮＩＴＨを起動することで、誤発動を防止する。なお、Ｑｕｏｒｕｍは過半数で判断するため、クラスタの物理機器数が少ない場合に、ある物理機器が故障したら、正常な物理機器が過半数を確保できなくなる。このため、クラスタから故障物理機器を切り離す減設作業が必要である。また、図１２では、高可用ソフトウェア１０７が、Ｐａｃｅｍａｋｅｒである場合を示しているが、他の高可用ソフトウェアでも同様のメカニズムで障害の発生を検知したり、障害の発生を通知したりすることが可能である。なお、障害判定部１１１ｃは、障害の生じた物理機器自ら物理機器の障害の発生を通知された場合、物理資源情報テーブル１１０ｂにおいて障害の生じた物理機器に対応する「稼働状態」を「故障中（メンテナンス中）」に更新する。一方、障害判定部１１１ｃは、例えば、他の物理機器に生じた障害を物理機器１０３ｂから通知された場合、物理資源情報テーブル１１０ｂにおいて障害の生じた物理機器に対応する「稼働状態」を「故障中（復旧処理中）」に更新して、仮想マシン削除部１１１ｄ及び再作成部１１１ｅに仮想機器の復旧処理を実行させる。かかる場合、障害判定部１１１ｃは、復旧処理が仮想マシンを稼働させる物理機器の障害に起因することを仮想マシン削除部１１１ｄ及び再作成部１１１ｅに通知する。なお、障害判定部１１１ｃは、物理機器１０３ｂからの通知に続いて他の物理機器に生じた障害を物理機器１０３ｃから通知された場合、既に復旧処理を開始しているので無視する。 Note that the high availability software 107 may fail to stop its own physical device. In the case of virtual machines, recovery could then leave more than one instance of a virtual machine existing at the same time, and simultaneous access to the data area could destroy data. Therefore, when the high availability software 107 is “Pacemaker”, the STONITH module is used to bring the failed physical device down reliably. STONITH stops the failed physical device via IPMI (Intelligent Platform Management Interface), which guarantees that the failed physical device does not keep running. False triggering is prevented by having STONITH invoked only by the physical devices that form the majority under Quorum. Because Quorum decides by majority, when the cluster contains only a few physical devices and one of them fails, the healthy physical devices may no longer hold a majority. In that case the failed physical device has to be detached from the cluster by a reduction operation. FIG. 12 shows the case where the high availability software 107 is Pacemaker, but other high availability software can detect and report failures through a similar mechanism. When the failure determination unit 111c is notified of a failure by the failed physical device itself, it updates the “operating state” of that physical device in the physical resource information table 110b to “failed (under maintenance)”. When, for example, the failure determination unit 111c is notified by the physical device 103b of a failure that has occurred in another physical device, it updates the “operating state” of the failed physical device in the physical resource information table 110b to “failed (recovery in progress)” and causes the virtual machine deletion unit 111d and the re-creation unit 111e to execute the recovery process for the virtual devices. In this case, the failure determination unit 111c notifies the virtual machine deletion unit 111d and the re-creation unit 111e that the recovery process is caused by a failure of the physical device that operates the virtual machines. If, after the notification from the physical device 103b, the failure determination unit 111c is notified of the same failure by the physical device 103c, it ignores that notification because the recovery process has already started.
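The remark about Quorum in small clusters can be checked with a one-line majority test. This is a deliberately simplified model with a hypothetical function name `has_quorum`; actual quorum handling in high availability software has more options than a strict majority, so treat it only as arithmetic for the statement above.

```python
def has_quorum(alive: int, cluster_size: int) -> bool:
    """Strict majority: more than half of the cluster members must be alive."""
    return 2 * alive > cluster_size


# Three-device cluster, one device fails: 2 of 3 is still a majority.
three_node_ok = has_quorum(alive=2, cluster_size=3)

# Two-device cluster, one device fails: 1 of 2 is not a majority,
# so the surviving device alone cannot invoke STONITH.
two_node_ok = has_quorum(alive=1, cluster_size=2)

# After the failed device is detached (cluster reduced to one member),
# the survivor holds a majority again.
after_reduction_ok = has_quorum(alive=1, cluster_size=1)
```

The second and third cases show why the reduction operation is needed: detaching the failed device shrinks the cluster size against which the majority is computed.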

次に、障害判定部１１１ｃによる仮想マシンの障害を検出する処理について説明する。物理機器障害で、その上で動作する仮想機器が全て障害になる場合以外に、仮想機器がプロセス障害などでダウンする場合がある。特に仮想マシン障害は、即サービス断につながるため、仮想マシンのユーザにとって影響が大きい。そこで、仮想機器管理システムでは、Ｐａｃｅｍａｋｅｒ等の高可用ソフトウェアによる物理機器監視だけでなく、Ｌｉｂｖｉｒｔ等の仮想マシン制御ライブラリによって仮想マシンを監視する。そして、仮想機器配置スケジューラ機能部１１１は、仮想マシン障害時に、仮想マシンの復旧処理を実施する。 Next, processing for detecting a failure in the virtual machine by the failure determination unit 111c will be described. There are cases where a virtual device goes down due to a process failure or the like in addition to a case where all of the virtual devices operating on the physical device fail due to a physical device failure. In particular, a virtual machine failure has a great influence on the user of the virtual machine because it immediately leads to service interruption. Therefore, in the virtual device management system, not only physical device monitoring by high availability software such as Pacemaker, but also virtual machines are monitored by a virtual machine control library such as Libvirt. Then, the virtual device arrangement scheduler function unit 111 performs virtual machine recovery processing when a virtual machine fails.

例えば、ＯｐｅｎＳｔａｃｋコミュニティでは、Ｑｅｍｕ−ＫＶＭが数多く利用されているハイパーバイザ―であり、仮想マシンの制御ライブラリとしてＬｉｂｖｉｒｔが数多く利用されている。そこで、仮想マシン監視モジュール１０６は、Ｌｉｂｖｉｒｔ等の仮想マシン制御ライブラリからイベントを取得することで仮想マシンがダウンしたことを検知する。そして、仮想マシン監視モジュール１０６は、仮想マシンがダウンしたことを検知した場合、仮想機器配置スケジューラ機能部１１１に障害を通知する。Ｌｉｂｖｉｒｔは、仮想マシン障害だけでなく、仮想マシンゲストＯＳのシャットダウン、ホストＯＳのシャットダウン等のイベントも取得可能である。このため、仮想マシン監視モジュール１０６は、仮想機器配置スケジューラ機能部１１１に、仮想マシンゲストＯＳのシャットダウンやホストＯＳのシャットダウン等のイベントを通知する。また、障害判定部１１１ｃは、仮想マシン削除部１１１ｄ及び再作成部１１１ｅに仮想機器の復旧処理を実行させる。かかる場合、障害判定部１１１ｃは、復旧処理が仮想マシンのダウンに起因することを仮想マシン削除部１１１ｄ及び再作成部１１１ｅに通知する。 For example, in the OpenStack community, Qemu-KVM is a widely used hypervisor and Libvirt is widely used as the virtual machine control library. The virtual machine monitoring module 106 therefore detects that a virtual machine has gone down by acquiring events from a virtual machine control library such as Libvirt. When the virtual machine monitoring module 106 detects that a virtual machine has gone down, it notifies the virtual device arrangement scheduler function unit 111 of the failure. Libvirt can report not only virtual machine failures but also events such as a shutdown of the virtual machine guest OS or a shutdown of the host OS. The virtual machine monitoring module 106 therefore also notifies the virtual device arrangement scheduler function unit 111 of events such as a guest OS shutdown or a host OS shutdown. The failure determination unit 111c then causes the virtual machine deletion unit 111d and the re-creation unit 111e to execute the recovery process for the virtual device. In this case, the failure determination unit 111c notifies the virtual machine deletion unit 111d and the re-creation unit 111e that the recovery process is caused by the virtual machine going down.

続いて、障害判定部１１１ｃによる、仮想マシンに対して先行して実施されている復旧処理が存在するか否かを判定する処理について説明する。障害判定部１１１ｃは、複数の故障通知を受信する可能性がある。言い換えると、故障通知が競合する場合がある。例えば、仮想マシンでは、復旧タイミングを誤り仮想マシンが２重起動されてしまうと、同じデータ領域への２つの仮想マシンからのアクセスによりデータ破壊の可能性が有る。このため、現在の復旧処理の進捗状態に応じて適切な復旧処理を行う必要がある。そこで、障害判定部１１１ｃは、Ｐａｃｅｍａｋｅｒ等の高可用ソフトウェア１０７とＬｉｂｖｉｒｔ監視モジュール等の仮想マシン監視モジュール１０６からの故障通知とを統一的に管理し、通知受信時の復旧処理の進捗状況に応じて、実施する復旧処理を判定する。 Next, the process by which the failure determination unit 111c determines whether a recovery process is already being performed for a virtual machine will be described. The failure determination unit 111c may receive more than one failure notification. In other words, failure notifications may conflict. For example, if the recovery timing is wrong and a virtual machine is started twice, the two instances may access the same data area and destroy data. The recovery process therefore has to be chosen according to the current progress of any recovery already under way. For this reason, the failure determination unit 111c manages the failure notifications from the high availability software 107 such as Pacemaker and from the virtual machine monitoring module 106 such as a Libvirt monitoring module in a unified way, and determines which recovery process to perform according to the progress of recovery at the time the notification is received.

例えば、障害判定部１１１ｃは、仮想マシンの障害又は仮想マシンを稼働させる物理機器の障害が検出された場合に、仮想マシンに対して実施中の復旧処理が存在するか否かを判定する。例えば、障害判定部１１１ｃは、図９に示した仮想機器配置情報テーブル１１０ａを参照して、障害が通知された物理機器で稼働する仮想マシンの進捗状態が削除中や作成中であるか否かを判定する。そして、障害判定部１１１ｃは、障害が通知された物理機器で稼働する仮想マシンの進捗状態が削除中や作成中である場合、仮想マシンに対して実施中の復旧処理が存在すると判定する。一方、障害判定部１１１ｃは、障害が通知された物理機器で稼働する仮想マシンの進捗状態が削除中や作成中でない場合、仮想マシンに対して実施中の復旧処理が存在しないと判定する。以下では、故障通知が競合する場合をパターン１からパターン４の４つの状況にわけて説明する。 For example, when a failure of a virtual machine or a failure of a physical device that operates the virtual machine is detected, the failure determination unit 111c determines whether there is a recovery process being performed on the virtual machine. For example, the failure determination unit 111c refers to the virtual device arrangement information table 110a illustrated in FIG. 9 to determine whether the progress status of the virtual machine operating on the physical device notified of the failure is being deleted or created. Determine. Then, the failure determination unit 111c determines that there is a recovery process being performed on the virtual machine when the progress state of the virtual machine operating on the physical device notified of the failure is being deleted or created. On the other hand, the failure determination unit 111c determines that there is no recovery process being performed on the virtual machine when the progress state of the virtual machine operating on the physical device notified of the failure is not being deleted or created. Below, the case where a failure notification competes will be described in four situations from pattern 1 to pattern 4.
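The check described above amounts to a lookup in the arrangement table of FIG. 9. A minimal sketch, assuming the table is held as a list of dictionaries with hypothetical keys `"host"`, `"vm"`, and `"progress"`, and a hypothetical function name `vms_under_recovery`:

```python
# A few rows of FIG. 9, with the progress status kept as plain text.
TABLE = [
    {"host": "physical device #1", "vm": "virtual machine #11",
     "progress": "deleting (virtual machine down)"},
    {"host": "physical device #1", "vm": "virtual machine #12",
     "progress": "creating (virtual machine down)"},
    {"host": "physical device #3", "vm": "virtual machine #23",
     "progress": "creating (physical device failure)"},
    {"host": "physical device #3", "vm": "virtual machine #31",
     "progress": "in operation"},
]


def vms_under_recovery(table, host):
    """Return the virtual machines on `host` whose progress status is
    "deleting ..." or "creating ...", i.e. that already have a recovery
    process in progress."""
    return [
        row["vm"]
        for row in table
        if row["host"] == host
        and row["progress"].startswith(("deleting", "creating"))
    ]
```

An empty result corresponds to the case in which no recovery process is in progress, so a newly reported failure starts the first recovery process from scratch.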

パターン１として、仮想マシンを稼働させる物理機器の障害に起因する復旧処理が実施中であるときに物理機器の障害が新たに検出された場合について説明する。言い換えると、障害が新たに検出された物理機器で稼働する仮想マシンの「進捗状態」が「削除中（物理機器障害）」や「作成中（物理機器障害）」である場合を示す。 As pattern 1, a case will be described in which a physical device failure is newly detected when recovery processing due to a failure of a physical device that operates a virtual machine is being performed. In other words, a case where the “progress status” of a virtual machine operating on a physical device in which a failure is newly detected is “deleting (physical device failure)” or “creating (physical device failure)” is shown.

障害判定部１１１ｃは、物理機器上の仮想マシンの削除が終わる前であれば、受信した通知を再送として破棄する。より具体的には、障害判定部１１１ｃは、仮想機器配置情報テーブル１１０ａにおいて、障害が生じた物理機器で稼働する仮想マシンの「進捗状態」が「削除中（物理機器障害）」である場合、通知が再送であると判定する。すなわち、障害判定部１１１ｃは、仮想マシンを稼働させる物理機器の障害に起因する第１の復旧処理が実施中であるときに物理機器の障害が新たに検出された場合、新たに検出された障害に起因する第１の復旧処理を仮想マシン削除部１１１ｄに実行させず、新たに検出された障害に起因する第２の復旧処理を再作成部１１１ｅに実行させない。 The failure determination unit 111c discards the received notification as a retransmission before the deletion of the virtual machine on the physical device is completed. More specifically, the failure determination unit 111c, in the virtual device arrangement information table 110a, when the “progress status” of the virtual machine operating on the physical device in which the failure has occurred is “deleting (physical device failure)” It is determined that the notification is a retransmission. In other words, the failure determination unit 111c, when a failure of the physical device is newly detected when the first recovery process due to the failure of the physical device that operates the virtual machine is being performed, the newly detected failure The virtual machine deletion unit 111d is not allowed to execute the first recovery process due to the failure, and the re-creation unit 111e is not allowed to execute the second recovery process due to the newly detected failure.

また、障害判定部１１１ｃは、既に仮想マシンを他の物理機器で仮想マシンインスタンス作成中の場合は、新たに受信した通知を新たなイベントとして物理機器の復旧処理を行う。より具体的には、障害判定部１１１ｃは、仮想機器配置情報テーブル１１０ａにおいて、障害が生じた物理機器で稼働する仮想マシンの「進捗状態」が「作成中（物理機器障害）」である場合、新たなイベントとして物理機器の復旧処理を行う。すなわち、障害判定部１１１ｃは、仮想マシンを稼働させる物理機器の障害に起因する第２の復旧処理が実施中であるときに物理機器の障害が新たに検出された場合、実施中の第２の復旧処理を中止させる。そして、障害判定部１１１ｃは、新たに検出された障害に起因する第１の復旧処理を仮想マシン削除部１１１ｄに実行させ、新たに検出された障害に起因する第２の復旧処理を再作成部１１１ｅに実行させる。なお、同じ物理機器に対する通知であり、仮想マシンの削除が既に実施されている場合、仮想マシンを新たに削除する処理を省略可能である。 When a virtual machine instance is already being created on another physical device, the failure determination unit 111c treats the newly received notification as a new event and performs recovery for the physical device failure. More specifically, when the “progress status” of a virtual machine operating on the failed physical device is “creating (physical device failure)” in the virtual device arrangement information table 110a, the failure determination unit 111c performs recovery as a new event. That is, when a physical device failure is newly detected while the second recovery process caused by a failure of the physical device that operates the virtual machine is in progress, the failure determination unit 111c stops the second recovery process in progress. The failure determination unit 111c then causes the virtual machine deletion unit 111d to execute the first recovery process for the newly detected failure and causes the re-creation unit 111e to execute the second recovery process for the newly detected failure. If the notification concerns the same physical device and the virtual machine has already been deleted, the step of deleting the virtual machine again can be omitted.

パターン２として、仮想マシンを稼働させる物理機器の障害に起因する復旧処理が実施中であるときに仮想マシンの障害が新たに検出された場合について説明する。言い換えると、ダウンが新たに検出された仮想マシンの「進捗状態」が「削除中（物理機器障害）」や「作成中（物理機器障害）」である場合を示す。 As a pattern 2, a case will be described in which a failure of a virtual machine is newly detected when recovery processing due to a failure of a physical device that operates the virtual machine is being performed. In other words, the “progress status” of a virtual machine in which a down is newly detected is “deleting (physical device failure)” or “creating (physical device failure)”.

障害判定部１１１ｃは、第１の復旧処理が実施中であるときに仮想マシンの障害が新たに検出された場合、新たに検出された障害に起因する第１の復旧処理を仮想マシン削除部１１１ｄに実行させず、新たに検出された障害に起因する第２の復旧処理を再作成部１１１ｅに実行させない。例えば、物理機器上の仮想マシンクリーンアップはＳＴＯＮＩＴＨで停止後に行われるため基本的には仮想マシンダウン通知が来ることはない。しかしながら、通知が遅延した場合等では、仮想マシンの削除中に仮想マシンダウン通知を受信する場合がある。この場合、ダウンが新たに検出された仮想マシンの「進捗状態」が「削除中（物理機器障害）」である。かかる場合、障害判定部１１１ｃは、実施中である仮想マシンの削除処理を継続すると判定する。 If the failure of the virtual machine is newly detected while the first recovery process is being performed, the failure determination unit 111c performs the first recovery process due to the newly detected failure in the virtual machine deletion unit 111d. And the re-creation unit 111e is not allowed to execute the second recovery process caused by the newly detected failure. For example, since virtual machine cleanup on a physical device is performed after stopping at STONITH, a virtual machine down notification is not basically received. However, when the notification is delayed, a virtual machine down notification may be received during deletion of the virtual machine. In this case, the “progress status” of the virtual machine in which down is newly detected is “deleting (physical device failure)”. In such a case, the failure determination unit 111c determines to continue the virtual machine deletion process being performed.

また、障害判定部１１１ｃは、第２の復旧処理が実施中であるときに仮想マシンの障害が新たに検出された場合、実施中である第２の復旧処理を再作成部１１１ｅに中止させる。そして、障害判定部１１１ｃは、新たに検出された障害に起因する第１の復旧処理を仮想マシン削除部１１１ｄに実行させ、新たに検出された障害に起因する第２の復旧処理を再作成部１１１ｅに実行させる。例えば、障害判定部１１１ｃは、既に仮想マシンを他の物理機器で仮想マシンインスタンス作成中の場合は、新たに受信した通知を新たな故障通知として扱う。この場合、ダウンが新たに検出された仮想マシンの「進捗状態」が「作成中（物理機器障害）」である。これにより、仮想マシン削除部１１１ｄは、第１の復旧処理として、対応付けを削除することに加えて、仮想マシンのインスタンスを削除させる。そして、再作成部１１１ｅは、対応付けの削除及び仮想マシンのインタンスの削除後に、第２の復旧処理として、選択した物理機器に仮想マシンを再作成させる。 Further, when a virtual machine failure is newly detected while the second recovery process is in progress, the failure determination unit 111c causes the re-creation unit 111e to stop the second recovery process in progress. The failure determination unit 111c then causes the virtual machine deletion unit 111d to execute the first recovery process for the newly detected failure, and causes the re-creation unit 111e to execute the second recovery process for the newly detected failure. For example, when a virtual machine instance is already being created on another physical device, the failure determination unit 111c treats the newly received notification as a new failure notification. In this case, the “progress status” of the virtual machine newly detected as down is “creating (physical device failure)”. As a result, the virtual machine deletion unit 111d, as the first recovery process, deletes the virtual machine instance in addition to deleting the association. After the association and the virtual machine instance have been deleted, the re-creation unit 111e, as the second recovery process, causes the selected physical device to re-create the virtual machine.

パターン３として、仮想マシンの障害に起因する復旧処理が実施中であるときに物理機器の障害が新たに検出された場合について説明する。言い換えると、障害が新たに検出された物理機器で稼働する仮想マシンの「進捗状態」が「削除中（仮想マシンダウン）」や「作成中（仮想マシンダウン）」である場合を示す。かかる場合、障害判定部１１１ｃは、仮想マシンの障害に起因する実施中の復旧処理が存在するときに仮想マシンを稼働させる物理機器の障害が新たに検出された場合、実施中の復旧処理を中止させる。そして、障害判定部１１１ｃは、新たに検出された障害に起因する第１の復旧処理を仮想マシン削除部１１１ｄに実行させ、新たに検出された障害に起因する第２の復旧処理を再作成部１１１ｅに実行させる。 As pattern 3, a case will be described in which a physical device failure is newly detected while a recovery process caused by a virtual machine failure is in progress. In other words, this is the case where the “progress status” of a virtual machine running on the physical device whose failure is newly detected is “deleting (virtual machine down)” or “creating (virtual machine down)”. In this case, when a failure of the physical device running the virtual machine is newly detected while a recovery process caused by a virtual machine failure is in progress, the failure determination unit 111c stops the recovery process in progress. The failure determination unit 111c then causes the virtual machine deletion unit 111d to execute the first recovery process for the newly detected failure, and causes the re-creation unit 111e to execute the second recovery process for the newly detected failure.

例えば、障害が新たに検出された物理機器で稼働する仮想マシンの「進捗状態」が「削除中（仮想マシンダウン）」である場合、物理機器障害により仮想マシンインスタンスの削除が終了しない可能性がある。このため、障害判定部１１１ｃは、実施中である仮想マシンインスタンス削除を中断するように仮想マシン削除部１１１ｄに指示する。そして、障害判定部１１１ｃは、新たに通知された物理機器の復旧処理を仮想マシン削除部１１１ｄ及び再作成部１１１ｅに実行させる。 For example, when the “progress status” of a virtual machine running on the physical device whose failure is newly detected is “deleting (virtual machine down)”, the deletion of the virtual machine instance may never complete because of the physical device failure. The failure determination unit 111c therefore instructs the virtual machine deletion unit 111d to interrupt the virtual machine instance deletion in progress. The failure determination unit 111c then causes the virtual machine deletion unit 111d and the re-creation unit 111e to execute the recovery processing for the newly notified physical device failure.

また、例えば、障害が新たに検出された物理機器で稼働する仮想マシンの「進捗状態」が「作成中（仮想マシンダウン）」である場合も、物理機器障害により仮想マシンインスタンスの削除が終了しない可能性がある。このため、障害判定部１１１ｃは、実施中である仮想マシンインスタンス削除を中断するように仮想マシン削除部１１１ｄに指示する。そして、障害判定部１１１ｃは、新たに通知された物理機器の復旧処理を仮想マシン削除部１１１ｄ及び再作成部１１１ｅに実行させる。 Likewise, when the “progress status” of a virtual machine running on the physical device whose failure is newly detected is “creating (virtual machine down)”, the deletion of the virtual machine instance may never complete because of the physical device failure. The failure determination unit 111c therefore instructs the virtual machine deletion unit 111d to interrupt the virtual machine instance deletion in progress. The failure determination unit 111c then causes the virtual machine deletion unit 111d and the re-creation unit 111e to execute the recovery processing for the newly notified physical device failure.

パターン４として、仮想マシンの障害に起因する復旧処理が実施中であるときに仮想マシンの障害が新たに検出された場合について説明する。言い換えると、ダウンが新たに検出された仮想マシンの「進捗状態」が「削除中（仮想マシンダウン）」や「作成中（仮想マシンダウン）」である場合を示す。 As pattern 4, a case will be described in which a failure of a virtual machine is newly detected when recovery processing due to a failure of the virtual machine is being performed. In other words, the “progress status” of a virtual machine in which a down is newly detected is “deleting (virtual machine down)” or “creating (virtual machine down)”.

障害判定部１１１ｃは、第１の復旧処理が実施中であるときに仮想マシンの障害が新たに検出された場合、新たに検出された障害に起因する第１の復旧処理を仮想マシン削除部１１１ｄに実行させず、新たに検出された障害に起因する第２の復旧処理を再作成部１１１ｅに実行させない。例えば、ダウンが新たに検出された仮想マシンの「進捗状態」が「削除中（仮想マシンダウン）」である場合は、障害判定部１１１ｃは、新たに受信した通知が再送であるとして破棄し、実施中である仮想マシンインスタンスの削除処理を継続する。 When a virtual machine failure is newly detected while the first recovery process is in progress, the failure determination unit 111c does not cause the virtual machine deletion unit 111d to execute a first recovery process for the newly detected failure, and does not cause the re-creation unit 111e to execute a second recovery process for it. For example, when the “progress status” of the virtual machine newly detected as down is “deleting (virtual machine down)”, the failure determination unit 111c regards the newly received notification as a retransmission, discards it, and lets the virtual machine instance deletion in progress continue.

また、障害判定部１１１ｃは、第２の復旧処理が実施中であるときに仮想マシンの障害が新たに検出された場合、実施中である第２の復旧処理を再作成部１１１ｅに中止させる。そして、障害判定部１１１ｃは、新たに検出された障害に起因する第１の復旧処理を仮想マシン削除部１１１ｄに実行させ、新たに検出された障害に起因する第２の復旧処理を再作成部１１１ｅに実行させる。これにより、仮想マシン削除部１１１ｄは、第１の復旧処理として、対応付けを削除することに加えて、仮想マシンのインスタンスを削除させる。例えば、ダウンが新たに検出された仮想マシンの「進捗状態」が「作成中（仮想マシンダウン）」である場合、障害判定部１１１ｃは、実施中である仮想マシンインスタンス作成処理を中断するように再作成部１１１ｅに指示する。そして、再作成部１１１ｅは、対応付けの削除及び仮想マシンのインタンスの削除後に、第２の復旧処理として、選択した物理機器に仮想マシンを再作成させる。 Further, when a virtual machine failure is newly detected while the second recovery process is in progress, the failure determination unit 111c causes the re-creation unit 111e to stop the second recovery process in progress. The failure determination unit 111c then causes the virtual machine deletion unit 111d to execute the first recovery process for the newly detected failure, and causes the re-creation unit 111e to execute the second recovery process for the newly detected failure. As a result, the virtual machine deletion unit 111d, as the first recovery process, deletes the virtual machine instance in addition to deleting the association. For example, when the “progress status” of the virtual machine newly detected as down is “creating (virtual machine down)”, the failure determination unit 111c instructs the re-creation unit 111e to interrupt the virtual machine instance creation in progress. After the association and the virtual machine instance have been deleted, the re-creation unit 111e, as the second recovery process, causes the selected physical device to re-create the virtual machine.

このように、障害判定部１１１ｃは、従来は独立に行われていた物理機器の障害に起因する復旧処理とダウンした仮想マシンの復旧処理とを、統一的に管理することで、仮想マシンの２重起動を防いだ信頼性のある復旧処理を実施することを可能とする。 In this way, the failure determination unit 111c manages in a unified manner the recovery processing caused by a physical device failure and the recovery processing of a virtual machine that has gone down, which were conventionally performed independently. This makes it possible to perform reliable recovery processing that prevents a virtual machine from being started twice.

図８に戻る。仮想マシン削除部１１１ｄは、障害判定部１１１ｃの指示に応じて、第１の復旧処理として、仮想マシンと仮想マシンにより使用される記憶領域との対応付けを削除する。ここで、仮想マシン削除部１１１ｄは、仮想マシンの障害に起因する第１の復旧処理である場合、仮想マシンと仮想マシンにより使用される記憶領域との対応付けを削除することに加えて、仮想マシンのインスタンスを削除する。 Returning to FIG. 8. In response to an instruction from the failure determination unit 111c, the virtual machine deletion unit 111d, as the first recovery process, deletes the association between the virtual machine and the storage area used by the virtual machine. Here, when the first recovery process is caused by a virtual machine failure, the virtual machine deletion unit 111d deletes the virtual machine instance in addition to deleting the association between the virtual machine and the storage area used by the virtual machine.

例えば、仮想マシン削除部１１１ｄは、実施中の復旧処理が存在しない場合に、第１の復旧処理として、仮想マシンと仮想マシンにより使用される記憶領域との対応付けを削除する。また、仮想マシン削除部１１１ｄは、障害判定部１１１ｃの指示に応じて、実施中である第１の復旧処理を中止する。実施中である第１の復旧処理を中止した場合、仮想マシン削除部１１１ｄは、障害判定部１１１ｃの指示に応じて、新たに第１の復旧処理を実施する。 For example, when there is no recovery process in progress, the virtual machine deletion unit 111d deletes the association between the virtual machine and the storage area used by the virtual machine as the first recovery process. Further, the virtual machine deletion unit 111d stops the first recovery process that is being performed in response to an instruction from the failure determination unit 111c. When the first recovery process being performed is stopped, the virtual machine deletion unit 111d newly executes the first recovery process in response to an instruction from the failure determination unit 111c.

また、仮想マシン削除部１１１ｄは、第１の復旧処理を開始する場合、仮想機器配置情報テーブル１１０ａにおいて、障害の発生した仮想マシンに対応する「進捗状態」を「削除中（仮想マシンダウン）」或いは「削除中（物理機器障害）」に更新する。 Further, when starting the first recovery process, the virtual machine deleting unit 111d sets “deleting (virtual machine down)” as the “progress state” corresponding to the virtual machine in which a failure has occurred in the virtual device arrangement information table 110a. Alternatively, it is updated to “Deleting (physical device failure)”.

第１の復旧処理が終了後、仮想マシン削除部１１１ｄは、再作成部１１１ｅに第１の復旧処理が終了したことを通知する。 After the first recovery process ends, the virtual machine deletion unit 111d notifies the re-creation unit 111e that the first recovery process has ended.

再作成部１１１ｅは、障害判定部１１１ｃの指示を受信した後、仮想マシン削除部１１１ｄから第１の復旧処理が終了したことを通知された場合に、第２の復旧処理として、仮想マシンを稼働させる物理機器を選択し、選択した物理機器に仮想マシンを再作成させる。 After receiving the instruction from the failure determination unit 111c, when the re-creation unit 111e is notified by the virtual machine deletion unit 111d that the first recovery process has ended, it performs the second recovery process: it selects a physical device on which to run the virtual machine and causes the selected physical device to re-create the virtual machine.

例えば、再作成部１１１ｅは、物理機器に生じた障害を通知された場合に、仮想機器を再作成する物理機器を選択する。また、再作成部１１１ｅは、仮想マシンのダウンを通知された場合に、ダウンした仮想マシンを再作成する物理機器を選択する。ここで、再作成部１１１ｅは、物理機器に生じた障害を通知された場合に、対応付けの削除後に、第２の復旧処理として、選択した物理機器に仮想マシンを再作成させる。また、再作成部１１１ｅは、仮想マシンのダウンを通知された場合に、対応付けの削除及び仮想マシンのインタンスの削除後に、第２の復旧処理として、選択した物理機器に仮想マシンを再作成させる。 For example, the re-creation unit 111e selects a physical device for re-creating a virtual device when notified of a failure that has occurred in the physical device. Further, when the re-creation unit 111e is notified of the down of the virtual machine, the re-creation unit 111e selects a physical device for re-creating the down virtual machine. Here, when notified of the failure that has occurred in the physical device, the re-creation unit 111e causes the selected physical device to re-create the virtual machine as a second recovery process after deleting the association. In addition, when the virtual machine is notified that the virtual machine is down, the re-creation unit 111e causes the selected physical device to re-create the virtual machine as a second recovery process after deleting the association and deleting the virtual machine instance. .

以下では、物理機器に生じた障害を通知された場合の再作成部１１１ｅの処理について説明する。ここで、再作成部１１１ｅは、障害の生じた物理機器以外の物理機器にできるだけ順番に割り振られるように物理機器を選択する。例えば、再作成部１１１ｅは、障害の生じた物理機器以外の物理機器のうち物理資源の空き容量のある物理機器を複数特定する。そして、再作成部１１１ｅは、特定した複数の物理機器の物理資源を、障害の生じた物理機器に配置された仮想機器の再配置先として選択する。 Hereinafter, processing of the re-creation unit 111e when a failure that has occurred in a physical device is notified will be described. Here, the re-creation unit 111e selects physical devices so that they are allocated in order as much as possible to physical devices other than the failed physical device. For example, the re-creation unit 111e specifies a plurality of physical devices having a physical resource free capacity among physical devices other than the failed physical device. Then, the recreating unit 111e selects the physical resources of the identified plurality of physical devices as the relocation destination of the virtual device that is allocated to the physical device in which the failure has occurred.

図１３は、再作成部１１１ｅによる処理動作を説明するための図である。図１３では、物理機器＃１〜物理機器＃６の６台の物理機器を有する仮想機器管理システムにおいて、物理機器＃４が故障した際の復旧について説明する。ここで、物理機器＃１〜物理機器＃５の稼働状態は「稼働中」であり、物理機器＃６の稼働状態は「予備」である。また、物理機器＃１のスペースの状態は、「空き」２、「使用中」１、「障害用バッファ」２であり、物理機器＃２のスペースの状態は、「空き」０、「使用中」３、「障害用バッファ」２であり、物理機器＃３のスペースの状態は、「空き」４、「使用中」０、「障害用バッファ」２である。また、物理機器＃４のスペースの状態は、「空き」０、「使用中」１０、「障害用バッファ」２であり、物理機器＃５のスペースの状態は、「空き」２、「使用中」０、「障害用バッファ」２であり、物理機器＃６のスペースの状態は、「空き」３、「使用中」０、「障害用バッファ」２である。 FIG. 13 is a diagram for explaining the processing operation by the re-creation unit 111e. FIG. 13 illustrates recovery when the physical device # 4 fails in the virtual device management system having six physical devices # 1 to # 6. Here, the operating state of the physical device # 1 to the physical device # 5 is “operating”, and the operating state of the physical device # 6 is “standby”. The space status of the physical device # 1 is “free” 2, “in use” 1, and “failure buffer” 2. The space status of the physical device # 2 is “free” 0, “in use” 3 and “failure buffer” 2, and the space status of the physical device # 3 is “free” 4, “in use” 0, and “failure buffer” 2. The space status of the physical device # 4 is “free” 0, “in use” 10, and “failure buffer” 2. The space status of the physical device # 5 is “free” 2, “in use”. "0", "Fault buffer" 2, and the space status of the physical device # 6 is "Free" 3, "In use" 0, "Fault buffer" 2.

例えば、再作成部１１１ｅは、障害の生じた物理機器以外の物理機器のうち物理資源の空き容量のある物理機器を複数特定する。ここで、再作成部１１１ｅは、物理機器の障害発生時には、「空き」のスペースに加えて、「障害用バッファ」のスペースも使用する。これにより、再作成部１１１ｅは、「空き」が０の物理機器も含めて、より多くの物理機器が復旧処理を分担できるようにする。図１３に示す例では、再作成部１１１ｅが、稼働中である物理機器＃１〜物理機器＃３及び物理機器＃５を特定した場合を示す。 For example, the re-creation unit 111e specifies a plurality of physical devices having a physical resource free capacity among physical devices other than the failed physical device. Here, the re-creation unit 111e uses the space of the “failure buffer” in addition to the “free” space when a failure of the physical device occurs. As a result, the re-creation unit 111e allows more physical devices to share the recovery process, including physical devices whose “empty” is zero. The example illustrated in FIG. 13 illustrates a case where the re-creation unit 111e identifies physical devices # 1 to # 3 and physical devices # 5 that are operating.

そして、再作成部１１１ｅは、特定した複数の物理機器の物理資源を、障害の生じた物理機器＃４に配置された仮想機器の再配置先として選択する。例えば、再作成部１１１ｅは、特定した複数の物理機器に順序付けを行う。ここで、再作成部１１１ｅは、特定した複数の物理機器の物理資源の空き容量が多い順に、特定した複数の物理機器に順序付けを行う。例えば、再作成部１１１ｅは、「空き」のスペースと「障害用バッファ」のスペースとの合計スペースを物理資源の空き容量とし、合計スペースが多い順に物理機器に順序付けを行う。図１３の例では、再作成部１１１ｅが、合計スペースが６である物理機器＃３、合計スペースが４である物理機器＃１、合計スペースが４である物理機器＃５、そして、合計スペースが２である物理機器＃２の順で順序付けした場合を示す。 Then, the re-creation unit 111e selects the physical resources of the identified physical devices as the relocation destinations of the virtual devices placed on the failed physical device #4. For example, the re-creation unit 111e orders the identified physical devices. Here, the re-creation unit 111e orders them in descending order of free physical-resource capacity. For example, the re-creation unit 111e treats the sum of the “free” space and the “failure buffer” space as the free capacity of the physical resources, and orders the physical devices in descending order of this total space. In the example of FIG. 13, the re-creation unit 111e orders the devices as physical device #3 (total space 6), physical device #1 (total space 4), physical device #5 (total space 4), and physical device #2 (total space 2).
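
The ordering step above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Host` class, its field names, and `order_candidates` are hypothetical, and the tie between physical devices #1 and #5 (both with total space 4) is resolved here simply by keeping the input order, which the text does not specify. The space values are those given for FIG. 13.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical record for one physical device (field names are assumptions)."""
    name: str
    state: str   # "operating" or "standby"
    free: int    # "free" space
    in_use: int  # "in use" space
    buffer: int  # "failure buffer" space

    @property
    def total_space(self) -> int:
        # At failure time the "failure buffer" space is usable as well.
        return self.free + self.buffer


def order_candidates(hosts, failed_name):
    """Operating hosts other than the failed one, largest total space first."""
    candidates = [
        h for h in hosts
        if h.name != failed_name and h.state == "operating" and h.total_space > 0
    ]
    # sorted() is stable, so hosts with equal total space keep their input order.
    return sorted(candidates, key=lambda h: h.total_space, reverse=True)


hosts = [
    Host("#1", "operating", free=2, in_use=1, buffer=2),
    Host("#2", "operating", free=0, in_use=3, buffer=2),
    Host("#3", "operating", free=4, in_use=0, buffer=2),
    Host("#4", "operating", free=0, in_use=10, buffer=2),  # the failed device
    Host("#5", "operating", free=2, in_use=0, buffer=2),
    Host("#6", "standby", free=3, in_use=0, buffer=2),
]

ordered = order_candidates(hosts, failed_name="#4")
print([(h.name, h.total_space) for h in ordered])
```

Physical device #2 has no “free” space but still appears as a candidate because its “failure buffer” space counts toward the total.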

続いて、再作成部１１１ｅは、障害の生じた物理機器に配置された仮想機器それぞれの再配置先として、順序に基づいて選択した物理機器の物理資源を選択する処理を繰り返す。一例をあげると、再作成部１１１ｅは、図１３に示す数字順に仮想機器を再配置するように物理機器を選択する。ここで、再配置時は、空き容量が多い順に入れていくことも出来れば、空き容量がなくなるまで出来るだけ順々に別物理機器に入れていくこともできる。前者の場合は再配置後の空き容量が均等になるようにすることが目的で、後者の場合は出来るだけ多くの物理機器に再配置し速く復旧するが目的である。図１３では、後者のロジックを想定した例である。より具体的には、再作成部１１１ｅは、物理機器＃３、物理機器＃１、物理機器＃５、そして、物理機器＃２の順で選択した物理機器の物理資源を仮想機器の再配置先として選択する処理を繰り返す。ここで、再作成部１１１ｅは、「空き」のスペースや「障害用バッファ」のスペースが無くなるまでは、各物理機器に仮想機器を順番に配置する。また、再作成部１１１ｅは、スペースが無くなった物理機器は飛ばすようにする。 Subsequently, the re-creation unit 111e repeatedly selects, as the relocation destination of each virtual device placed on the failed physical device, the physical resources of a physical device chosen according to the ordering. As one example, the re-creation unit 111e selects physical devices so that the virtual devices are relocated in the numerical order shown in FIG. 13. At relocation time, virtual devices can either be placed on physical devices in descending order of free capacity, or be placed on different physical devices in turn, as far as possible, until the free capacity runs out. The aim of the former is to even out the free capacity after relocation; the aim of the latter is to relocate onto as many physical devices as possible and so recover quickly. FIG. 13 is an example that assumes the latter logic. More specifically, the re-creation unit 111e repeatedly selects the physical resources of the physical devices, taken in the order physical device #3, physical device #1, physical device #5, and physical device #2, as relocation destinations for the virtual devices. Here, the re-creation unit 111e places virtual devices on the physical devices in turn until the “free” space and the “failure buffer” space run out, and skips any physical device that has no space left.

なお、再作成部１１１ｅは、稼働状態が「稼働中」である全ての物理機器の「空き」のスペース及び「障害用バッファ」のスペースが満たされるまで、稼働状態が「予備」である物理機器を選択しない。このように、仮想機器配置スケジューラ機能部１１１は、「空き」のスペースに加えて、仮想機器の作成時には利用されない「障害用バッファ」のスペースを予め準備しておき、障害時に多くの物理機器に仮想機器を再配置することで、高速の復旧を可能とする。また、再作成部１１１ｅは、障害が発生した物理機器に配置された仮想機器の全てを再配置可能ではない場合には、特定した物理機器の「空き」のスペースと「障害用バッファ」のスペースとに再配置可能な範囲で、仮想機器ごとに再配置先を選択する。 Note that the re-creation unit 111e does not select a physical device whose operating state is “standby” until the “free” space and the “failure buffer” space of all physical devices whose operating state is “operating” have been filled. In this way, in addition to the “free” space, the virtual device arrangement scheduler function unit 111 prepares in advance “failure buffer” space that is not used when virtual devices are created, and relocates virtual devices onto many physical devices at the time of a failure, which enables fast recovery. Further, when not all of the virtual devices placed on the failed physical device can be relocated, the re-creation unit 111e selects a relocation destination for each virtual device within the range that fits into the “free” space and “failure buffer” space of the identified physical devices.
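
One way to read the round-robin relocation described above is the sketch below. It is an illustration under stated assumptions rather than the scheduler's actual code: `place_round_robin` and the virtual device names `vm1` to `vm10` are hypothetical, the capacities are the total spaces from FIG. 13, and treating the standby device #6 as having capacity 5 (its “free” 3 plus “failure buffer” 2) is an assumption. Whether the resulting assignment matches the numbers drawn in FIG. 13 cannot be confirmed from the text alone.

```python
def place_round_robin(vms, active, standby=()):
    """Assign each VM to a host, visiting hosts in the given order in turn.

    `active` and `standby` are ordered lists of (host_name, capacity) pairs.
    Standby hosts are used only after every active host is full.
    Returns (placement, unplaced): a dict mapping VM to host, and the VMs
    that did not fit anywhere.
    """
    placement = {}
    pending = list(vms)
    for group in (active, standby):
        remaining = dict(group)  # dicts keep insertion order
        while pending and any(space > 0 for space in remaining.values()):
            for host, space in list(remaining.items()):
                if not pending:
                    break
                if space == 0:
                    continue  # skip hosts that have run out of space
                placement[pending.pop(0)] = host
                remaining[host] = space - 1
    return placement, pending


active = [("#3", 6), ("#1", 4), ("#5", 4), ("#2", 2)]  # ordering from FIG. 13
standby = [("#6", 5)]                                  # assumed capacity
vms = [f"vm{i}" for i in range(1, 11)]                 # ten VMs on failed device #4

placement, unplaced = place_round_robin(vms, active, standby)
print(placement)
print(unplaced)
```

With ten virtual devices the active hosts absorb everything, so the standby host is never touched. If more virtual devices are pending than the total space available, the surplus is returned in `unplaced`, which corresponds to relocating only within the range that fits.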

また、Ｐａｃｅｍａｋｅｒのクラスタ構成は、最大８台程度で組み、障害の検知を行う。また、仮想機器配置スケジューラ機能部１１１は、クラスタを跨いで別物理機器に仮想機器を作成してもよいため、再作成が依頼される物理機器はクラスタのサイズ以上でも良い。また、全てが埋まった際に利用される予備機は存在してもしなくてもよい。クラスタ構成上はＮ−Ａｃｔ、０−Ｓｂｙで、Ｓｔａｎｄｂｙ機を準備する必要はないため、物理機器の利用効率を高めることも出来る。 The Pacemaker cluster is configured with up to about eight machines and performs failure detection. Because the virtual device arrangement scheduler function unit 111 may create virtual devices on other physical devices across cluster boundaries, the number of physical devices asked to re-create virtual devices may be equal to or larger than the cluster size. A spare machine for use when all space has been filled may or may not be present. Since the cluster configuration is N-Act, 0-Sby and no standby machine needs to be prepared, the utilization efficiency of the physical devices can also be improved.

そして、再作成部１１１ｅは、選択した物理機器１０３に、仮想機器を作成するようにクラウドコントローラ１０８に依頼する。 Then, the re-creation unit 111e requests the cloud controller 108 to create the virtual device on the selected physical device 103.

まお、再作成部１１１ｅは、仮想マシンがダウンした場合には、物理機器に生じた障害を通知された場合と同様に仮想マシンを再作成させる物理機器を選択してもよく、或いは、ダウンした仮想マシンが稼働していた物理機器にダウンした仮想マシンを再作成させてもよい。 Note that, when a virtual machine has gone down, the re-creation unit 111e may select the physical device on which to re-create the virtual machine in the same manner as when it is notified of a physical device failure, or it may have the virtual machine re-created on the physical device on which it was running before it went down.

また、再作成部１１１ｅは、障害判定部１１１ｃの指示に応じて、実施中である第２の復旧処理を中止する。実施中である第２の復旧処理を中止した場合、再作成部１１１ｅは、障害判定部１１１ｃの指示に応じて、新たに第２の復旧処理を実施する。かかる場合も、再作成部１１１ｅは、障害判定部１１１ｃの指示を受信した後、仮想マシン削除部１１１ｄから第１の復旧処理が終了したことを通知された場合に、第２の復旧処理を実施する。 Further, the re-creation unit 111e stops the second recovery process in progress in response to an instruction from the failure determination unit 111c. When the second recovery process in progress has been stopped, the re-creation unit 111e newly performs a second recovery process in response to an instruction from the failure determination unit 111c. In this case as well, the re-creation unit 111e performs the second recovery process when, after receiving the instruction from the failure determination unit 111c, it is notified by the virtual machine deletion unit 111d that the first recovery process has ended.

また、再作成部１１１ｅは、第２の復旧処理を開始する場合、仮想機器配置情報テーブル１１０ａにおいて、障害の発生した仮想マシンに対応する「進捗状態」を「作成中（仮想マシンダウン）」或いは「作成中（物理機器障害）」に更新する。また、再作成部１１１ｅは、第２の復旧処理が終了した場合、仮想機器配置情報テーブル１１０ａにおいて、障害の発生した仮想マシンに対応する「進捗状態」を「稼働中」に更新する。 Further, when starting the second recovery process, the re-creation unit 111e updates the “progress status” corresponding to the failed virtual machine in the virtual device arrangement information table 110a to “creating (virtual machine down)” or “creating (physical device failure)”. When the second recovery process is completed, the re-creation unit 111e updates the “progress status” corresponding to the failed virtual machine in the virtual device arrangement information table 110a to “in operation”.
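
The two-phase recovery and its progress-status updates can be summarized in a short sketch. Everything named here is hypothetical: `recover`, the callback parameters, and the plain dictionary standing in for the virtual device arrangement information table 110a are assumptions made for illustration. The sketch runs the two phases synchronously and omits the interruption of a phase in progress.

```python
PHYSICAL = "physical device failure"
VM_DOWN = "virtual machine down"


def recover(table, vm_id, cause, detach_storage, delete_instance, recreate):
    """Run the first (deletion) then the second (re-creation) recovery process.

    `table` maps a VM id to its progress status string. The three callbacks
    stand in for the real operations, which the text does not spell out.
    """
    if cause not in (PHYSICAL, VM_DOWN):
        raise ValueError(f"unknown cause: {cause}")

    # First recovery process: status becomes "deleting (<cause>)".
    table[vm_id] = f"deleting ({cause})"
    detach_storage(vm_id)        # delete the VM-to-storage association
    if cause == VM_DOWN:
        delete_instance(vm_id)   # for a VM down, also delete the instance

    # Second recovery process starts only after the first has finished.
    table[vm_id] = f"creating ({cause})"
    recreate(vm_id)
    table[vm_id] = "in operation"
```

A caller would supply callbacks that talk to the storage and cloud controller layers. Because the status is written before each phase starts, a later notification can be classified by reading the table.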

なお、図１３に示す例では、再作成部１１１ｅが、特定した複数の物理機器の物理資源の空き容量が多い順に、特定した複数の物理機器に順序付けを行う場合について説明したが、実施形態はこれに限定されるものではない。例えば、再作成部１１１ｅは、物理資源の空き容量とは関係なく、特定した複数の物理機器に任意に順序付けを行うようにしてもよい。 In the example illustrated in FIG. 13, a case has been described in which the re-creation unit 111e orders the plurality of identified physical devices in descending order of physical resource capacity of the identified plurality of physical devices. It is not limited to this. For example, the re-creation unit 111e may arbitrarily order the specified plurality of physical devices regardless of the free capacity of the physical resource.

図１４は、物理機器障害通知を受信した場合の仮想機器配置スケジューラ機能部１１１による処理手順を示すフローチャートである。図１４に示すように、障害判定部１１１ｃは、物理機器障害通知を受信したか否かを判定する（ステップＳ１０１）。ここで、障害判定部１１１ｃは、物理機器障害通知を受信したと判定した場合（ステップＳ１０１、Ｙｅｓ）、仮想マシンに対して実施中の復旧処理が存在するか否かを判定する（ステップＳ１０２）。なお、障害判定部１１１ｃは、物理機器障害通知を受信しなかったと判定した場合（ステップＳ１０１、Ｎｏ）、物理機器障害通知を受信したか否かを判定する。 FIG. 14 is a flowchart illustrating a processing procedure performed by the virtual device arrangement scheduler function unit 111 when a physical device failure notification is received. As illustrated in FIG. 14, the failure determination unit 111c determines whether a physical device failure notification has been received (step S101). Here, when the failure determination unit 111c determines that a physical device failure notification has been received (step S101, Yes), the failure determination unit 111c determines whether there is a recovery process being performed on the virtual machine (step S102). . If the failure determination unit 111c determines that a physical device failure notification has not been received (No in step S101), the failure determination unit 111c determines whether a physical device failure notification has been received.

障害判定部１１１ｃは、仮想マシンに対して実施中の復旧処理が存在すると判定しなかった場合（ステップＳ１０２、Ｎｏ）、ステップＳ１０７に移行する。一方、障害判定部１１１ｃは、仮想マシンに対して実施中の復旧処理が存在すると判定した場合（ステップＳ１０２、Ｙｅｓ）、仮想マシンに対して実施中の復旧処理が物理機器の障害に起因するか否かを判定する（ステップＳ１０３）。ここで、障害判定部１１１ｃは、仮想マシンに対して実施中の復旧処理が物理機器の障害に起因すると判定しなかった場合（ステップＳ１０３、Ｎｏ）、ステップＳ１０６に移行する。 When the failure determination unit 111c does not determine that a recovery process is in progress for the virtual machine (step S102, No), it proceeds to step S107. On the other hand, when the failure determination unit 111c determines that a recovery process is in progress for the virtual machine (step S102, Yes), it determines whether the recovery process in progress is caused by a physical device failure (step S103). Here, when the failure determination unit 111c does not determine that the recovery process in progress for the virtual machine is caused by a physical device failure (step S103, No), it proceeds to step S106.

障害判定部１１１ｃは、仮想マシンに対して実施中の復旧処理が物理機器の障害に起因すると判定した場合（ステップＳ１０３、Ｙｅｓ）、物理機器障害通知が再送か否かを判定する（ステップＳ１０４）。例えば、障害判定部１１１ｃは、障害が通知された物理機器で稼働する仮想マシンの進捗状態が削除中である場合、物理機器障害通知が再送であると判定する。一方、障害判定部１１１ｃは、障害が通知された物理機器で稼働する仮想マシンの進捗状態が削除中でない場合、物理機器障害通知が再送ではないと判定する。すなわち、障害判定部１１１ｃは、障害が通知された物理機器で稼働する仮想マシンの進捗状態が稼働中又は作成中である場合、物理機器障害通知が再送ではないと判定する。 When the failure determination unit 111c determines that the recovery process in progress for the virtual machine is caused by a physical device failure (step S103, Yes), it determines whether the physical device failure notification is a retransmission (step S104). For example, when the progress status of a virtual machine running on the physical device whose failure was notified is “deleting”, the failure determination unit 111c determines that the physical device failure notification is a retransmission. Conversely, when that progress status is not “deleting”, the failure determination unit 111c determines that the notification is not a retransmission. That is, when the progress status of a virtual machine running on the notified physical device is in operation or being created, the failure determination unit 111c determines that the physical device failure notification is not a retransmission.

障害判定部１１１ｃは、物理機器障害通知が再送であると判定した場合（ステップＳ１０４、Ｙｅｓ）、通知を破棄して（ステップＳ１０５）、処理を終了する。一方、障害判定部１１１ｃは、物理機器障害通知が再送であると判定しなかった場合（ステップＳ１０４、Ｎｏ）、実施中の処理を中止させる（ステップＳ１０６）。そして、障害判定部１１１ｃは、仮想マシン削除部１１１ｄ及び再作成部１１１ｅに復旧処理を実施させる（ステップＳ１０７）。 If the failure determination unit 111c determines that the physical device failure notification is retransmission (Yes in step S104), the failure determination unit 111c discards the notification (step S105) and ends the processing. On the other hand, when the failure determination unit 111c does not determine that the physical device failure notification is retransmission (No in step S104), the failure determination unit 111c stops the process being executed (step S106). Then, the failure determination unit 111c causes the virtual machine deletion unit 111d and the re-creation unit 111e to perform recovery processing (step S107).
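
The branch structure of FIG. 14 (steps S102 to S107) can be written as a small decision function. This is a sketch under assumptions: the `Progress` enum, its member names, and the action strings returned by `on_physical_failure` are hypothetical labels, not identifiers from the patent.

```python
from enum import Enum


class Progress(Enum):
    """Hypothetical encoding of the "progress status" column."""
    IN_OPERATION = "in operation"
    DELETING_PHYSICAL = "deleting (physical device failure)"
    CREATING_PHYSICAL = "creating (physical device failure)"
    DELETING_VM_DOWN = "deleting (virtual machine down)"
    CREATING_VM_DOWN = "creating (virtual machine down)"


def on_physical_failure(progress: Progress) -> list:
    """Return the actions taken when a physical device failure is notified."""
    # S102: is any recovery process in progress for this VM?
    if progress is Progress.IN_OPERATION:
        return ["recover"]                      # S107
    # S103: was the recovery in progress caused by a physical device failure?
    caused_by_physical = progress in (
        Progress.DELETING_PHYSICAL,
        Progress.CREATING_PHYSICAL,
    )
    # S104: a notification that arrives during deletion is a retransmission.
    if caused_by_physical and progress is Progress.DELETING_PHYSICAL:
        return ["discard"]                      # S105
    return ["abort", "recover"]                 # S106, then S107
```

Only the “deleting (physical device failure)” status leads to the notification being discarded. Every other in-progress status leads to aborting the current work and starting recovery afresh.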

図１５は、仮想マシンダウン通知を受信した場合の仮想機器配置スケジューラ機能部１１１による処理手順を示すフローチャートである。図１５に示すように、障害判定部１１１ｃは、仮想マシンダウン通知を受信したか否かを判定する（ステップＳ２０１）。ここで、障害判定部１１１ｃは、仮想マシンダウン通知を受信したと判定した場合（ステップＳ２０１、Ｙｅｓ）、仮想マシンに対して実施中の復旧処理が存在するか否かを判定する（ステップＳ２０２）。なお、障害判定部１１１ｃは、仮想マシンダウン通知を受信しなかったと判定した場合（ステップＳ２０１、Ｎｏ）、仮想マシンダウン通知を受信したか否かを判定する。 FIG. 15 is a flowchart illustrating a processing procedure performed by the virtual device arrangement scheduler function unit 111 when a virtual machine down notification is received. As illustrated in FIG. 15, the failure determination unit 111c determines whether a virtual machine down notification is received (step S201). If the failure determination unit 111c determines that a virtual machine down notification has been received (step S201, Yes), the failure determination unit 111c determines whether there is a recovery process in progress for the virtual machine (step S202). . If the failure determination unit 111c determines that a virtual machine down notification has not been received (No in step S201), the failure determination unit 111c determines whether a virtual machine down notification has been received.

障害判定部１１１ｃは、仮想マシンに対して実施中の復旧処理が存在すると判定しなかった場合（ステップＳ２０２、Ｎｏ）、ステップＳ２０６に移行する。一方、障害判定部１１１ｃは、仮想マシンに対して実施中の復旧処理が存在すると判定した場合（ステップＳ２０２、Ｙｅｓ）、仮想マシンに対して実施中の復旧処理が第１の復旧処理であるか否かを判定する（ステップＳ２０３）。ここで、障害判定部１１１ｃは、仮想マシンに対して実施中の復旧処理が第１の復旧処理であると判定しなかった場合（ステップＳ２０３、Ｎｏ）、実施中の処理を中止させる（ステップＳ２０４）。そして、障害判定部１１１ｃは、仮想マシン削除部１１１ｄ及び再作成部１１１ｅに復旧処理を実施させる（ステップＳ２０６）。 When the failure determination unit 111c does not determine that a recovery process is in progress for the virtual machine (step S202, No), it proceeds to step S206. On the other hand, when the failure determination unit 111c determines that a recovery process is in progress for the virtual machine (step S202, Yes), it determines whether the recovery process in progress is the first recovery process (step S203). Here, when the failure determination unit 111c does not determine that the recovery process in progress for the virtual machine is the first recovery process (step S203, No), it stops the process in progress (step S204). The failure determination unit 111c then causes the virtual machine deletion unit 111d and the re-creation unit 111e to perform the recovery processing (step S206).

一方、障害判定部１１１ｃは、仮想マシンに対して実施中の復旧処理が第１の復旧処理であると判定した場合（ステップＳ２０３、Ｙｅｓ）、通知を破棄して（ステップＳ２０５）、処理を終了する。 On the other hand, when the failure determination unit 111c determines that the recovery process in progress for the virtual machine is the first recovery process (step S203, Yes), it discards the notification (step S205) and ends the processing.
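
The FIG. 15 flow (steps S202 to S206) depends only on which recovery phase is in progress, not on what caused it. A minimal sketch, with the hypothetical function `on_vm_down` and the phase labels `"deleting"` and `"creating"` chosen purely for illustration:

```python
from typing import Optional


def on_vm_down(phase: Optional[str]) -> list:
    """Return the actions taken when a virtual machine down is notified.

    `phase` is None when no recovery is in progress, "deleting" while the
    first recovery process runs, and "creating" while the second one runs.
    """
    if phase is None:            # S202: No
        return ["recover"]       # S206
    if phase == "deleting":      # S203: Yes, first recovery process in progress
        return ["discard"]       # S205
    if phase == "creating":      # S203: No
        return ["abort", "recover"]  # S204, then S206
    raise ValueError(f"unknown phase: {phase}")
```

Compared with the physical device failure flow of FIG. 14, there is no check on the cause of the recovery in progress: a virtual machine down received during any deletion is discarded, and one received during any re-creation restarts recovery.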

上述したように、第１の実施形態に係る仮想機器管理装置１０９は、物理機器故障時と仮想マシンダウン時の復旧方法として統一的手段で行い、物理機器故障と仮想マシンダウンが重複しても、仮想マシンのステータスを管理し、仮想マシンが２重起動しないように復旧を行う。 As described above, the virtual device management apparatus 109 according to the first embodiment handles recovery from physical device failures and from virtual machine downs by a unified means. Even when a physical device failure and a virtual machine down overlap, it manages the status of the virtual machine and performs recovery so that the virtual machine is not started twice.

例えば、第１の実施形態に係る仮想機器管理装置１０９は、仮想マシンを稼働させる物理機器の障害に起因する第１の復旧処理が実施中であるときに物理機器の障害が新たに検出された場合、通知が再送であると判定し、新たに検出された障害に起因する第１の復旧処理及び第２の復旧処理を実施しない。 For example, the virtual device management apparatus 109 according to the first embodiment newly detects a physical device failure when the first recovery process due to the failure of the physical device that operates the virtual machine is being performed. In this case, it is determined that the notification is retransmission, and the first recovery process and the second recovery process due to the newly detected failure are not performed.

また、第１の実施形態に係る仮想機器管理装置１０９は、仮想マシンを稼働させる物理機器の障害に起因する第２の復旧処理が実施中であるときに物理機器の障害が新たに検出された場合、実施中の第２の復旧処理を中止させる。そして、第１の実施形態に係る仮想機器管理装置１０９は、新たなイベントとして物理機器の復旧処理を実施する。 In addition, the virtual device management apparatus 109 according to the first embodiment newly detects a physical device failure when the second recovery process due to the failure of the physical device that operates the virtual machine is being performed. In such a case, the second recovery process being performed is stopped. Then, the virtual device management apparatus 109 according to the first embodiment performs physical device recovery processing as a new event.

When a failure of the physical device that runs the virtual machine is newly detected while a recovery process caused by a failure of the virtual machine is in progress, the virtual device management apparatus 109 according to the first embodiment aborts the recovery process in progress. It then performs the first recovery process and the second recovery process for the newly detected failure.

When a failure of the virtual machine is newly detected while the first recovery process is in progress, the virtual device management apparatus 109 according to the first embodiment performs neither the first recovery process nor the second recovery process for the newly detected failure.

When a failure of the virtual machine is newly detected while the second recovery process is in progress, the virtual device management apparatus 109 according to the first embodiment aborts the second recovery process in progress and performs the first recovery process and the second recovery process for the newly detected failure.

In this way, even when a physical device failure and a virtual machine outage overlap, the virtual device management apparatus 109 according to the first embodiment manages the status of the virtual machine and performs recovery so that the virtual machine is not started twice. As a result, the virtual device management apparatus 109 according to the first embodiment can recover the virtual machine safely.
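The overlap rules in the preceding paragraphs can be summarized as a small decision function. The following is a minimal sketch, not the implementation of the failure determination unit 111c: the names `Failure`, `Phase`, `Action`, `Ongoing`, and `decide` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Failure(Enum):
    PHYSICAL = "physical"  # failure of the physical device running the VM
    VM = "vm"              # failure of the virtual machine itself


class Phase(Enum):
    FIRST = 1   # first recovery process (deletion)
    SECOND = 2  # second recovery process (re-creation)


class Action(Enum):
    START = "start"                          # run first, then second recovery
    DISCARD = "discard"                      # drop the new notification
    ABORT_AND_RESTART = "abort_and_restart"  # abort ongoing, then run both


@dataclass(frozen=True)
class Ongoing:
    cause: Failure  # failure that triggered the recovery in progress
    phase: Phase    # phase the recovery in progress has reached


def decide(ongoing: Optional[Ongoing], new: Failure) -> Action:
    """Hypothetical helper: choose what to do with a new failure notification."""
    if ongoing is None:
        # No recovery in progress: recover normally.
        return Action.START
    if new is Failure.PHYSICAL and ongoing.cause is Failure.VM:
        # A physical failure supersedes a recovery triggered by a VM failure.
        return Action.ABORT_AND_RESTART
    if ongoing.phase is Phase.FIRST:
        # Deletion is still running: the new notification is not acted on.
        return Action.DISCARD
    # Re-creation is running: abort it and recover from scratch.
    return Action.ABORT_AND_RESTART
```

For instance, `decide(Ongoing(Failure.PHYSICAL, Phase.FIRST), Failure.PHYSICAL)` returns `Action.DISCARD`, which corresponds to the retransmission case described above.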

The virtual device management apparatus 109 according to the first embodiment also identifies, among the physical devices other than the failed one, multiple physical devices that have free physical resource capacity. It then selects the physical resources of the identified physical devices as relocation destinations for the virtual devices that were placed on the failed physical device. In other words, the virtual device management apparatus 109 according to the first embodiment uses multiple physical devices as recovery destinations for the virtual devices. This allows the virtual device management apparatus 109 according to the first embodiment to shorten the time needed to recover the virtual devices.
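The candidate selection described above can be illustrated with a short filter. This is a sketch under stated assumptions: the `Host` record, its `free_vcpus` field, and the `candidate_hosts` function are hypothetical, and free capacity is reduced to a single vCPU count for simplicity, whereas the physical resource information table 110b may track other resources.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Host:
    name: str
    free_vcpus: int       # assumed stand-in for "free physical resource capacity"
    failed: bool = False  # True for the physical device that has failed


def candidate_hosts(hosts: List[Host], required_vcpus: int = 1) -> List[Host]:
    """Return the non-failed hosts that still have enough free capacity."""
    return [h for h in hosts if not h.failed and h.free_vcpus >= required_vcpus]
```

With hosts `a` (failed), `b` (4 free vCPUs), and `c` (0 free vCPUs), only `b` is returned as a relocation destination.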

More specifically, in the conventional method, a cluster is configured with N active and M standby machines (N-Act, M-Sby). When a physical device fails, high-availability software such as Pacemaker fails over to a standby machine, and the virtual devices are rebuilt from the database of a cloud controller such as OpenStack. Because failover with HA cluster software rebuilds the virtual devices on a single standby machine, the conventional method has the problem that recovering all the virtual devices takes a long time.

In contrast, with the virtual device management apparatus 109 according to the first embodiment, the cluster is configured as N-Act, 0-Sby. When a physical device fails, the high-availability software detects the failure but, instead of failing over, notifies the virtual device management apparatus 109 of the physical device failure. The virtual device management apparatus 109 determines the multiple physical devices on which the virtual devices are to be relocated and, for each virtual device, requests the cloud controller 108 to re-create it on a specified physical device. The cloud controller 108 then creates the virtual device on the specified physical device.

In this way, the virtual device management apparatus 109 according to the first embodiment recovers quickly by re-creating the virtual devices that were running on the failed physical device across multiple physical devices. In other words, because the virtual device management apparatus 109 uses multiple physical devices as recovery destinations, the virtual device recovery time after a physical device failure is shortened. For example, with three destination physical devices, the recovery processing time can be reduced to 1/3.
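The 1/3 figure follows if the destination devices re-create their share of virtual devices in parallel. The sketch below illustrates that arithmetic under assumptions not stated in the text: virtual devices are spread round-robin, each destination re-creates its share one at a time, and every re-creation takes the same time. The function names are hypothetical.

```python
from math import ceil
from typing import Dict, List


def assign_round_robin(vms: List[str], hosts: List[str]) -> Dict[str, List[str]]:
    """Spread the virtual devices of a failed host over the destination hosts."""
    if not hosts:
        raise ValueError("at least one destination host is required")
    plan: Dict[str, List[str]] = {h: [] for h in hosts}
    for i, vm in enumerate(vms):
        plan[hosts[i % len(hosts)]].append(vm)
    return plan


def estimated_recovery_time(num_vms: int, num_hosts: int, seconds_per_vm: float) -> float:
    """Time until the busiest host finishes, assuming hosts work in parallel."""
    return ceil(num_vms / num_hosts) * seconds_per_vm


single = estimated_recovery_time(9, 1, 60.0)  # one standby machine
triple = estimated_recovery_time(9, 3, 60.0)  # three destination machines
```

Under these assumptions, nine virtual devices at 60 seconds each take 540 seconds on one destination and 180 seconds on three, which matches the 1/3 reduction. When the number of virtual devices is not a multiple of the number of destinations, the reduction is slightly smaller because the busiest destination determines the total time.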

The virtual device management apparatus 109 according to the first embodiment also performs failure detection with high-availability software such as Pacemaker, using a cluster configured as N-Act, 0-Sby. Because the virtual device management apparatus 109 according to the first embodiment can perform failure recovery across cluster boundaries, the number of destination physical devices can exceed the cluster size. This shortens the recovery time further.

Furthermore, in the virtual device management system, the cluster used for failure detection is N-Act, 0-Sby, so no standby physical devices need to be prepared, which limits the increase in the number of physical devices.

In the embodiment described above, the placement destination selection unit 111b selects the physical device 103 so that virtual devices are distributed as widely as possible during the normal operation of creating a new virtual device, but the embodiment is not limited to this. For example, during the normal operation of creating a new virtual device, the placement destination selection unit 111b may select the physical device 103 so that virtual devices are placed without being distributed.

(Second Embodiment)
Although embodiments of the present invention have been described so far, the present invention may also be implemented in embodiments other than those described above. Other embodiments are therefore described below.

(System configuration)
Among the processes described in the embodiments, all or part of the processes described as being performed automatically can be performed manually, and all or part of the processes described as being performed manually can be performed automatically by a known method. In addition, the processing procedures, control procedures, specific names, and information including various data and parameters shown in the above description and drawings can be changed arbitrarily unless otherwise specified.

Each component of each illustrated apparatus is a functional concept and does not necessarily need to be physically configured as illustrated. That is, the specific form of distribution and integration of each apparatus is not limited to the one illustrated, and all or part of each apparatus can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like.

(Program)
It is also possible to create a virtual device management program in which the processing executed by the virtual device management apparatus 109 according to the above embodiment is described in a computer-executable language. In this case, a computer executing the virtual device management program obtains the same effects as the above embodiment. Furthermore, the same processing as in the above embodiment may be realized by recording the virtual device management program on a computer-readable recording medium and having a computer read and execute the program recorded on that medium. An example of a computer that executes a virtual device management program realizing the same functions as the virtual device management apparatus 109 shown in FIG. 1 and other figures is described below.

FIG. 16 is a diagram illustrating a computer 1000 that executes the virtual device management program. As illustrated in FIG. 16, the computer 1000 includes, for example, a memory 1010, a CPU 1020, a hard disk drive interface 1030, a disk drive interface 1040, a serial port interface 1050, a video adapter 1060, and a network interface 1070. These units are connected by a bus 1080.

The memory 1010 includes a ROM (Read Only Memory) 1011 and a RAM (Random Access Memory) 1012. The ROM 1011 stores, for example, a boot program such as a BIOS (Basic Input Output System). The hard disk drive interface 1030 is connected to a hard disk drive 1031. The disk drive interface 1040 is connected to a disk drive 1041. A removable storage medium such as a magnetic disk or an optical disk is inserted into the disk drive 1041. A mouse 1051 and a keyboard 1052, for example, are connected to the serial port interface 1050. A display 1061, for example, is connected to the video adapter 1060.

As shown in FIG. 16, the hard disk drive 1031 stores, for example, an OS 1091, an application program 1092, a program module 1093, and program data 1094. The virtual device management program described in the above embodiment is stored in, for example, the hard disk drive 1031 or the memory 1010.

The virtual device management program is stored in, for example, the hard disk drive 1031 as a program module describing instructions to be executed by the computer 1000. Specifically, the hard disk drive 1031 stores a program module 1093 that describes a determination procedure that performs the same information processing as the failure determination unit 111c described in the above embodiment, a deletion procedure that performs the same information processing as the virtual machine deletion unit 111d, and a re-creation procedure that performs the same information processing as the re-creation unit 111e.

Data used for information processing by the virtual device management program is stored as program data 1094 in, for example, the hard disk drive 1031. The CPU 1020 reads the program module 1093 and the program data 1094 stored in the hard disk drive 1031 into the RAM 1012 as necessary and executes each of the procedures described above.

The program module 1093 and the program data 1094 of the virtual device management program are not limited to being stored in the hard disk drive 1031. For example, they may be stored on a removable storage medium and read by the CPU 1020 via the disk drive 1041 or the like. Alternatively, they may be stored on another computer connected via a network such as a LAN (Local Area Network) or a WAN (Wide Area Network) and read by the CPU 1020 via the network interface 1070.

(Other)
The virtual device management program described in this embodiment can be distributed via a network such as the Internet. The program can also be recorded on a computer-readable recording medium such as a hard disk, a flexible disk (FD), a CD-ROM, an MO, or a DVD, and executed by being read from the recording medium by a computer.

109 Virtual device management apparatus
110 Virtual device placement scheduler DB
110a Virtual device placement information table
110b Physical resource information table
111 Virtual device placement scheduler function unit
111a Creation request reception unit
111b Placement destination selection unit
111c Failure determination unit
111d Virtual machine deletion unit
111e Re-creation unit
1000 Computer
1010 Memory
1011 ROM
1012 RAM
1020 CPU
1030 Hard disk drive interface
1031 Hard disk drive
1040 Disk drive interface
1041 Disk drive
1050 Serial port interface
1051 Mouse
1052 Keyboard
1060 Video adapter
1061 Display
1070 Network interface
1080 Bus
1091 OS
1092 Application program
1093 Program module
1094 Program data

Claims

A virtual device management apparatus comprising:
a determination unit that, when a failure of a virtual machine or a failure of a physical device that runs the virtual machine is detected, determines whether there is a recovery process in progress for the virtual machine;
a deletion unit that, when there is no recovery process in progress, deletes, as a first recovery process, the association between the virtual machine and a storage area used by the virtual machine, or deletes the association and a virtual machine instance of the virtual machine; and
a re-creation unit that selects a physical device to run the virtual machine and, after the association has been deleted or after the association and the virtual machine instance have been deleted, causes the selected physical device to re-create the virtual machine as a second recovery process.

The virtual device management apparatus according to claim 1, wherein, when a failure of the physical device that runs the virtual machine is newly detected while there is a recovery process in progress caused by a failure of the virtual machine, the determination unit aborts the recovery process in progress, causes the deletion unit to execute the first recovery process for the newly detected failure, and causes the re-creation unit to execute the second recovery process for the newly detected failure.

The virtual device management apparatus according to claim 1, wherein, when a failure of the physical device is newly detected while the second recovery process caused by a failure of the physical device that runs the virtual machine is in progress, the determination unit aborts the second recovery process in progress, causes the deletion unit to execute the first recovery process for the newly detected failure, and causes the re-creation unit to execute the second recovery process for the newly detected failure.

The virtual device management apparatus as described, wherein, when a failure of the physical device is newly detected while the first recovery process caused by a failure of the physical device that runs the virtual machine is in progress, the determination unit does not cause the deletion unit to execute the first recovery process for the newly detected failure and does not cause the re-creation unit to execute the second recovery process for the newly detected failure.

The virtual device management apparatus, wherein:
when a failure of the virtual machine is newly detected while the second recovery process is in progress, the determination unit aborts the second recovery process in progress, causes the deletion unit to execute the first recovery process for the newly detected failure, and causes the re-creation unit to execute the second recovery process for the newly detected failure;
the deletion unit deletes, as the first recovery process, the virtual machine instance in addition to the association; and
the re-creation unit, after the association and the virtual machine instance have been deleted, causes the selected physical device to re-create the virtual machine as the second recovery process.

The virtual device management apparatus according to claim 1, wherein, when a failure of the virtual machine is newly detected while the first recovery process is in progress, the determination unit does not cause the deletion unit to execute the first recovery process for the newly detected failure and does not cause the re-creation unit to execute the second recovery process for the newly detected failure.

A virtual device management method executed by a virtual device management apparatus, the method comprising:
a determination step of determining, when a failure of a virtual machine or a failure of a physical device that runs the virtual machine is detected, whether there is a recovery process in progress for the virtual machine;
a deletion step of deleting, when there is no recovery process in progress, as a first recovery process, the association between the virtual machine and a storage area used by the virtual machine, or the association and a virtual machine instance of the virtual machine; and
a re-creation step of selecting a physical device to run the virtual machine and, after the association has been deleted or after the association and the virtual machine instance have been deleted, causing the selected physical device to re-create the virtual machine as a second recovery process.

A virtual device management program that causes a computer to execute:
a determination procedure of determining, when a failure of a virtual machine or a failure of a physical device that runs the virtual machine is detected, whether there is a recovery process in progress for the virtual machine;
a deletion procedure of deleting, when there is no recovery process in progress, as a first recovery process, the association between the virtual machine and a storage area used by the virtual machine, or the association and a virtual machine instance of the virtual machine; and
a re-creation procedure of selecting a physical device to run the virtual machine and, after the association has been deleted or after the association and the virtual machine instance have been deleted, causing the selected physical device to re-create the virtual machine as a second recovery process.