JP2019185511A

JP2019185511A - Cluster system, autoscale server monitoring device, autoscale server monitoring program, and autoscale server monitoring method

Info

Publication number: JP2019185511A
Application number: JP2018077371A
Authority: JP
Inventors: 雅彦谷川; Masahiko Tanigawa; 健一郎下川; Kenichiro Shimokawa
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2018-04-13
Filing date: 2018-04-13
Publication date: 2019-10-24
Anticipated expiration: 2038-04-13
Also published as: JP7044971B2

Abstract

To enable an autoscale server that has stopped and been restored to recover on the basis of the latest information on virtual machines. SOLUTION: Physical servers 30, 40 are capable of executing a plurality of virtual machines. An autoscale server 20 performs scale-in and scale-out of virtual machines in the physical servers 30, 40. An autoscale server monitoring device 10 periodically communicates with the autoscale server 20, and stores information on the virtual machines managed by the autoscale server 20. Upon detecting that the autoscale server 20 has stopped, the autoscale server monitoring device 10 transmits the information on the virtual machines in accordance with a request from the autoscale server 20. SELECTED DRAWING: Figure 1

Description

The present invention relates to a cluster system, an autoscale server monitoring device, an autoscale server monitoring program, and an autoscale server monitoring method.

In the field of information processing, virtualization technology is used to run a plurality of virtual computers (sometimes called virtual machines or virtual hosts) on a physical computer (sometimes called a physical machine or a physical host). Software such as an OS (Operating System) can be executed on each virtual machine. A physical machine that uses virtualization technology executes software for managing the plurality of virtual machines. For example, software called a hypervisor may allocate the processing capacity of a CPU (Central Processing Unit) and the storage area of a RAM (Random Access Memory) to the plurality of virtual machines as computing resources.

In an information processing system, the number of machines (virtual machines or physical machines) that perform computation may be increased or decreased. Increasing the number of machines is called scale-out, and reducing the number of machines is called scale-in. Techniques for supporting the operation of systems that perform scale-out and scale-in have been considered.

For example, a failure monitoring device has been proposed that prevents the normal operation of standby servers used for scaling, under automatic scale-out and automatic scale-in, from being erroneously reported as a failure. The failure monitoring device stores server usage information, which indicates whether each monitored server operates at all times or only at scale-out, and operating state information, which indicates whether each server is on standby or in operation. For an event detected by the monitoring system, the failure monitoring device checks the server usage information and the operating state information of the server that generated the event, and thereby determines whether the event was caused by a failure or by automatic scale-out or automatic scale-in.

An infrastructure operation management system has also been proposed that, in an information processing system built in a cloud environment from virtual servers whose number is automatically increased or decreased by an autoscale function, avoids the loss of logs and enables them to be monitored. In the infrastructure operation management system, a virtual server that is subject to the autoscale function transfers predetermined logs, among the logs of that virtual server, that require real-time monitoring to a virtual server that is not subject to the autoscale function.

Japanese Laid-open Patent Publication No. 2011-253231
Japanese Laid-open Patent Publication No. 2015-184879

As described above, an autoscale server having an autoscale function may be used to automatically increase or decrease the number of virtual machines belonging to a system. However, the autoscale server may stop because of a failure or the like. In this case, the running state of a virtual machine may change while the autoscale server is stopped (for example, a virtual machine that was running stops because of a failure). Then, when the autoscale server is restored, the information it holds on the operating state of the virtual machines may be inconsistent with their actual operating state. Such an inconsistency can prevent the autoscale server from properly executing the autoscale function after recovery.

In one aspect, an object of the present invention is to provide a cluster system, an autoscale server monitoring device, an autoscale server monitoring program, and an autoscale server monitoring method that enable an autoscale server that has stopped to recover, when it is restored, on the basis of the latest information on virtual machines.

In one aspect, a cluster system is provided. The cluster system includes a physical server, an autoscale server, and an autoscale server monitoring device. The physical server can execute a plurality of virtual machines. The autoscale server performs scale-in and scale-out of the virtual machines in the physical server. The autoscale server monitoring device periodically communicates with the autoscale server, stores information on the virtual machines managed by the autoscale server, and, upon detecting that the autoscale server has stopped, transmits the information on the virtual machines in response to a request from the autoscale server.

In one aspect, an autoscale server monitoring device is provided. The autoscale server monitoring device includes a storage unit and a processing unit. The storage unit stores information on the virtual machines managed by an autoscale server. The processing unit periodically communicates with the autoscale server and, upon detecting that the autoscale server has stopped, transmits the information on the virtual machines in response to a request from the autoscale server.

In one aspect, an autoscale server monitoring program is provided.
In one aspect, an autoscale server monitoring method is provided.

In one aspect, when an autoscale server stops and is restored, it can recover on the basis of the latest information on virtual machines.

A diagram showing the cluster system of the first embodiment.
A diagram showing an example of the cluster system of the second embodiment.
A block diagram showing a hardware example of the monitoring server.
A diagram showing an example of scale-out and scale-in.
A block diagram showing a functional example of the cluster system.
A diagram showing an example of the autoscale server management table.
A diagram showing an example of the VM management table.
A diagram showing an example of the autoscale group table.
A diagram showing an example of the VM table.
A diagram showing an example of the autoscale policy table.
A flowchart showing an example of VM monitoring.
A flowchart showing an example of autoscale server monitoring.
A diagram showing an example of monitoring by the monitoring server.
A diagram showing a comparative example of monitoring.
A diagram showing an example of the virtual machines of the third embodiment.
A diagram showing an example of the VM management table.
A flowchart showing an example of VM monitoring.

Hereinafter, the present embodiments will be described with reference to the drawings.
[First Embodiment]
A first embodiment will be described.

FIG. 1 is a diagram illustrating the cluster system of the first embodiment.
The cluster system 1 includes an autoscale server monitoring device 10, an autoscale server 20, and physical servers 30 and 40. The autoscale server monitoring device 10, the autoscale server 20, and the physical servers 30 and 40 are connected to a network 50.

The physical servers 30 and 40 can each execute a plurality of virtual machines. For example, the physical server 30 can execute the virtual machines 31 and 32, and the physical server 40 can execute the virtual machines 41 and 42. The virtual machines can be scaled out and scaled in.

The autoscale server 20 collects the load of each virtual machine and, based on those loads, controls the scale-out and scale-in of the virtual machines. For example, if the load of the virtual machine 31 remains above a first threshold while the virtual machine 32 is stopped, the autoscale server 20 starts the virtual machine 32 on the physical server 30 so that the load is distributed not only to the virtual machine 31 but also to the virtual machine 32. Conversely, if the load of the virtual machines 31 and 32 (the average load or the load of either one) falls below a second threshold (second threshold < first threshold) while both are operating, the autoscale server 20 stops the virtual machine 32 to reduce resource usage. The autoscale server 20 controls the scale-out and scale-in of the virtual machines 41 and 42 on the physical server 40 in the same way. The group of virtual machines whose load is evaluated is determined according to the operation (for example, the virtual machine 32 may be started according to the load of the virtual machines 31, 41, and 42).
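The two-threshold rule described above can be sketched as follows. This is only an illustrative Python sketch, not the implementation of the autoscale server 20: the function name `decide_scaling`, the numeric threshold values, the use of the group's average load, and the use of three consecutive samples to model a load that "remains" above the first threshold are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical values; the text only requires second threshold < first threshold.
SCALE_OUT_THRESHOLD = 0.8  # "first threshold"
SCALE_IN_THRESHOLD = 0.3   # "second threshold"
SUSTAINED_SAMPLES = 3      # assumed number of consecutive samples for "remains above"


def decide_scaling(load_history, standby_available):
    """Return "scale-out", "scale-in", or "none" for one group of VMs.

    load_history: list of samples, oldest first; each sample is a non-empty
    list of per-VM loads (0.0 to 1.0) for the VMs running in the group.
    standby_available: True if a stopped VM can still be started.
    """
    if not load_history:
        return "none"
    averages = [mean(sample) for sample in load_history]
    recent = averages[-SUSTAINED_SAMPLES:]
    sustained_high = (
        len(recent) == SUSTAINED_SAMPLES
        and all(a > SCALE_OUT_THRESHOLD for a in recent)
    )
    if sustained_high and standby_available:
        return "scale-out"
    running = len(load_history[-1])
    # Keep at least one VM running; scale in only when the latest load is low.
    if running > 1 and averages[-1] < SCALE_IN_THRESHOLD:
        return "scale-in"
    return "none"
```

Because the second threshold is lower than the first, a load between the two thresholds triggers neither action, which keeps the group from oscillating between scale-out and scale-in.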

The autoscale server monitoring device 10 monitors the autoscale server 20. It also monitors the virtual machines 31, 32, 41, and 42, which are the autoscale targets. Specifically, the autoscale server monitoring device 10 performs alive monitoring of each operating virtual machine by communicating with it periodically. When the autoscale server monitoring device 10 detects an abnormality in any of the virtual machines, it notifies the user that the abnormality has been detected.

However, an autoscale-target virtual machine may be started or stopped by the autoscale control of the autoscale server 20. For this reason, when the autoscale server monitoring device 10 detects an interruption of the periodic communication with any of the autoscale-target virtual machines, it asks the autoscale server 20 whether that virtual machine was stopped by scale-in. If the interruption is due to scale-in, it is not an abnormality. If the interruption is not due to scale-in, it is regarded as an abnormality. The autoscale server 20 itself, however, may also stop because of an abnormality or the like. The autoscale server monitoring device 10 therefore monitors the operating state of the autoscale server 20 and provides a function that supports its recovery.
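The decision made when periodic communication with an autoscale-target virtual machine is lost can be sketched as follows. This is a hedged Python illustration only: the names `Verdict` and `classify_interruption`, the callable `stopped_by_scale_in`, and the use of `ConnectionError` to model an unreachable autoscale server are assumptions for the example and do not appear in the original text.

```python
from enum import Enum


class Verdict(Enum):
    NOT_ABNORMAL = "not abnormal"  # interruption explained by scale-in
    ABNORMAL = "abnormal"          # interruption not explained by scale-in
    DEFERRED = "deferred"          # autoscale server unreachable; record and report later


def classify_interruption(vm_id, stopped_by_scale_in):
    """Classify a lost periodic communication with an autoscale-target VM.

    stopped_by_scale_in: hypothetical callable standing in for the inquiry to
    the autoscale server. It takes a VM id and returns True if the VM was
    stopped by scale-in, and raises ConnectionError (an assumption of this
    sketch) when the autoscale server itself cannot be reached.
    """
    try:
        if stopped_by_scale_in(vm_id):
            return Verdict.NOT_ABNORMAL
        return Verdict.ABNORMAL
    except ConnectionError:
        # The monitoring device keeps the observation so that it can be
        # handed to the autoscale server when that server recovers.
        return Verdict.DEFERRED
```

The `DEFERRED` case corresponds to the situation in which the autoscale server is stopped and cannot answer the inquiry.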

The autoscale server monitoring device 10 includes a storage unit 11 and a processing unit 12. The autoscale server 20 includes a storage unit 21 and a processing unit 22.
The storage units 11 and 21 may be volatile storage devices such as a RAM, or may be nonvolatile storage devices such as an HDD (Hard Disk Drive) or a flash memory. The processing units 12 and 22 may include a CPU, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array), or the like. The processing units 12 and 22 may be processors that execute programs. The term “processor” as used here may also include a set of multiple processors (a multiprocessor).

The storage unit 11 stores information on the virtual machines 31, 32, 41, and 42 managed by the autoscale server 20. For example, the storage unit 11 stores a table 61. The table 61 shows the status of the alive monitoring (success or failure of the periodic communication) performed by the autoscale server monitoring device 10 for each of the virtual machines 31, 32, 41, and 42. Here, the identification information of the virtual machine 31 is “VM1” (VM stands for Virtual Machine). The identification information of the virtual machine 32 is “VM2”. The identification information of the virtual machine 41 is “VM3”. The identification information of the virtual machine 42 is “VM4”. In the table 61, “ON” indicates that the periodic communication with the corresponding virtual machine succeeded (for example, at the latest periodic communication timing), and “OFF” indicates that it did not.

The storage unit 21 also stores information indicating the states of the virtual machines 31, 32, 41, and 42. For example, the storage unit 21 stores a table 71. The table 71 is stored in a nonvolatile storage area of the storage unit 21. The table 71 is information used for autoscale control, and is state information indicating the state of each of the virtual machines 31, 32, 41, and 42. For example, “normal” indicates that the virtual machine is operating normally. “scale-in” indicates that the virtual machine has been stopped by scale-in. “error” indicates that the virtual machine has stopped because of an abnormality. The processing unit 22 updates the state of each virtual machine in the table 71 according to the collected operating states of the virtual machines 31, 32, 41, and 42 and the results of autoscaling.

The processing unit 12 may acquire, from the autoscale server 20, information indicating the autoscale-target virtual machines, generate the table 61 from it, and thereby determine the virtual machines to be subjected to alive monitoring by the autoscale server monitoring device 10.

The processing unit 12 periodically communicates with the autoscale server 20 and, upon detecting that the autoscale server 20 has stopped, transmits the virtual machine information to the autoscale server 20 in response to a request from the autoscale server 20.

First, consider the case where the autoscale server 20 is in operation (step ST1). Assume that the processing unit 12 was able to perform the periodic communication with the virtual machines 31, 41, and 42 but not with the virtual machine 32 (communication became impossible). The processing unit 12 records “ON” for the virtual machines 31, 41, and 42 (“VM1”, “VM3”, and “VM4”) and “OFF” for the virtual machine 32 (“VM2”) in the table 61. The processing unit 12 then asks the autoscale server 20 about the autoscale status of the virtual machine 32.

At this time, as shown in the table 71, the autoscale server 20 manages the virtual machines 31, 41, and 42 as “normal” and the virtual machine 32 as “scale-in”. That is, the virtual machine 32 has been stopped by scale-in. The processing unit 22 therefore responds to the autoscale server monitoring device 10 that the virtual machine 32 has been stopped by scale-in.

The processing unit 12 receives the response from the autoscale server 20 and learns from it that the virtual machine 32 has been stopped by scale-in. The processing unit 12 therefore does not regard the loss of communication with the virtual machine 32 (the interruption of the periodic communication) as an abnormality. The processing unit 12 continues the alive monitoring of the virtual machines 31, 41, and 42.

Next, consider the case where the autoscale server 20 is stopped because of an abnormality or the like (step ST2). When the periodic communication with the autoscale server 20 cannot be performed normally, the processing unit 12 detects that the autoscale server 20 is stopped. Even while the autoscale server 20 is stopped, the processing unit 12 periodically communicates with the operating virtual machines 31, 41, and 42 and continues their alive monitoring. Suppose the processing unit 12 then detects a loss of communication with the virtual machine 42 (“VM4”), that is, an interruption of the periodic communication. The processing unit 12 updates the table 61 to the table 62. Specifically, the processing unit 12 changes “VM4” from “ON” to “OFF”.
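The update from the table 61 to the table 62 during step ST2 can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name `poll_running_vms`, the callable `probe` standing in for the periodic communication, and the dictionary representation of the tables are hypothetical; only the “ON”/“OFF” values and the VM identifiers come from the description.

```python
def poll_running_vms(liveness, probe):
    """Return an updated copy of the liveness table after one polling round.

    liveness: dict mapping a VM id to "ON" or "OFF" (as in table 61).
    probe: hypothetical callable; returns True if the periodic communication
    with the given VM succeeded. Only VMs recorded as "ON" are probed,
    mirroring the monitoring of operating virtual machines.
    """
    updated = dict(liveness)
    for vm_id, status in liveness.items():
        if status == "ON" and not probe(vm_id):
            updated[vm_id] = "OFF"
    return updated


# State at the end of step ST1: VM2 was already stopped by scale-in.
table_61 = {"VM1": "ON", "VM2": "OFF", "VM3": "ON", "VM4": "ON"}

# During step ST2, communication with VM4 fails while the autoscale server is down.
table_62 = poll_running_vms(table_61, lambda vm_id: vm_id != "VM4")
# table_62 == {"VM1": "ON", "VM2": "OFF", "VM3": "ON", "VM4": "OFF"}
```

Returning a copy rather than updating in place keeps the earlier table available for comparison, which is a design choice of this sketch and not something the description requires.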

Next, consider the case where the autoscale server 20 recovers from the stopped state (step ST3). When the processing unit 12 receives a request from the autoscale server 20, it detects that the autoscale server 20 has started. The request from the autoscale server 20 may be a request for virtual machine information, or may be a predetermined request (or response) relating to the periodic communication with the autoscale server monitoring device 10. Then, based on the table 62, the processing unit 12 notifies the autoscale server 20 that an interruption of the periodic communication with the virtual machine 42 was detected while the autoscale server 20 was stopped. This interruption occurred while the autoscale server 20 was stopped, so it was not caused by a scale-in of the virtual machine 42. Accordingly, on receiving the notification of the interruption from the autoscale server monitoring device 10, the processing unit 22 updates the table 71 to the table 72. Specifically, the processing unit 22 changes “VM4” from “normal” to “error”.

The processing unit 12 may learn from the autoscale server 20 that the virtual machine 42 is managed as “error” and notify the user of the abnormality in the virtual machine 42.
In step ST3, the processing unit 12 may instead transmit the information on every virtual machine in the table 62 to the autoscale server 20. The processing unit 22 can determine which virtual machine has an abnormality by comparing the virtual machine information in the table 62 with the virtual machine information in the table 72. For example, the processing unit 22 may determine that a virtual machine that is “OFF” in the table 62 and “normal” in the table 71 is abnormal (“error”), and that the other virtual machines have no abnormality (“normal”, “scale-in”, or the like).
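The comparison rule in the preceding paragraph (“OFF” in the table 62 and “normal” in the table 71 means “error”) can be sketched as follows. This is an illustrative Python sketch only: the function name `reconcile` and the dictionary representation are assumptions, while the table values are those given in steps ST1 to ST3.

```python
def reconcile(liveness, as_states):
    """Return the autoscale state table updated from the monitor's liveness table.

    liveness: dict of VM id -> "ON" / "OFF" (table 62, from the monitoring device).
    as_states: dict of VM id -> "normal" / "scale-in" / "error" (table 71).
    A VM that is "OFF" but still recorded as "normal" stopped while the
    autoscale server was down, so it is marked "error"; all other entries
    are left unchanged.
    """
    updated = dict(as_states)
    for vm_id, status in liveness.items():
        if status == "OFF" and as_states.get(vm_id) == "normal":
            updated[vm_id] = "error"
    return updated


table_62 = {"VM1": "ON", "VM2": "OFF", "VM3": "ON", "VM4": "OFF"}
table_71 = {"VM1": "normal", "VM2": "scale-in", "VM3": "normal", "VM4": "normal"}

table_72 = reconcile(table_62, table_71)
# table_72 == {"VM1": "normal", "VM2": "scale-in", "VM3": "normal", "VM4": "error"}
```

Note that “VM2” stays “scale-in”: it is “OFF” in the table 62, but its stop is already explained by the autoscale server's own record.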

According to the autoscale server monitoring device 10, periodic communication is performed with the autoscale server 20, and information on the virtual machines managed by the autoscale server 20 is stored. When it is detected that the autoscale server 20 has stopped, the information on the virtual machines is transmitted in response to a request from the autoscale server 20.

As a result, when the autoscale server 20 stops and is restored, it can recover on the basis of the latest information on the virtual machines.
Now consider the case where the function of the autoscale server monitoring device 10 is not used. In this case, if the virtual machine 41 stops because of an abnormality or the like while the autoscale server 20 is stopped, the autoscale server 20 is unaware of that stop after it starts up again. The autoscale server 20 then performs autoscale control of each virtual machine using the table 71. In other words, the virtual machine information managed by the autoscale server 20 is inconsistent with the actual operating state of the virtual machines. In this state, the autoscale server 20 cannot perform appropriate autoscale control for the virtual machines 41 and 42. In addition, it may take a relatively long time (for example, from ten minutes to several tens of minutes) before the autoscale server 20 detects the stop of the virtual machine 42. If the load on the virtual machine 41 increases in the meantime, autoscale control may not be performed properly, which may affect the processing of applications and the like executed on the virtual machine 41.

Therefore, the autoscale server monitoring device 10 acquires information on the virtual machines while the autoscale server 20 is stopped and provides that information to the autoscale server 20 when the autoscale server 20 is restored. As a result, the autoscale server 20 can be restored in a state in which the inconsistency between the virtual machine information it manages and the actual operating state of the virtual machines has been resolved. The autoscale server 20 can thus resume autoscale control normally immediately after recovery. Consequently, the load of each virtual machine can be distributed appropriately by autoscale control, and the effect on the processing of applications and the like executed on each virtual machine can be suppressed.

In the example of the cluster system 1, the targets monitored by the autoscale server monitoring device 10 are the autoscale server 20 and the virtual machines subject to autoscale by the autoscale server 20 (the virtual machines 31, 32, 41, and 42). However, the virtual machines monitored by the autoscale server monitoring device 10 are not limited to these. The autoscale server monitoring device 10 may perform alive monitoring of both autoscale-target virtual machines and virtual machines that are not autoscale targets. When the autoscale server monitoring device 10 detects an interruption of the periodic communication with a virtual machine that is not an autoscale target, it can skip the inquiry to the autoscale server 20 and notify the user that an abnormality has occurred in that virtual machine.

[Second Embodiment]
Next, a second embodiment will be described.
FIG. 2 is a diagram illustrating an example of the cluster system of the second embodiment.

The cluster system of the second embodiment is an information processing system that provides users with an environment for using virtual machines. The cluster system of the second embodiment includes a monitoring server 100, an autoscale server 200, and physical servers 300 and 400.

The monitoring server 100, the autoscale server 200, and the physical servers 300 and 400 are connected to a network 60. The network 60 is, for example, a LAN (Local Area Network) laid in a data center or the like. The network 60 is connected to a network 70. The network 70 is, for example, the Internet or a WAN (Wide Area Network). User terminals 500 and 600 are connected to the network 70.

The monitoring server 100 is a server computer that monitors the autoscale server 200. The monitoring server 100 also monitors the virtual machines that operate on the physical servers 300 and 400. The monitoring server 100 is an example of the autoscale server monitoring device 10 of the first embodiment.

The autoscale server 200 is a server computer that performs autoscale (automatic scaling) control of the virtual machines operating on the physical servers 300 and 400. The autoscale server 200 is an example of the autoscale server 20 of the first embodiment.

The physical servers 300 and 400 are server computers that can each execute a plurality of virtual machines. For example, the physical server 300 executes software called a hypervisor and allocates hardware resources of the physical server 300, such as the CPU and RAM, to the virtual machines on the physical server 300. Similarly, the physical server 400 executes a hypervisor and allocates hardware resources of the physical server 400, such as the CPU and RAM, to the virtual machines on the physical server 400. The physical servers 300 and 400 are examples of the physical servers 30 and 40 of the first embodiment.

The user terminals 500 and 600 are client computers used by users. The user terminals 500 and 600 transmit processing requests to applications executed on the virtual machines on the physical servers 300 and 400, and receive the processing results from the virtual machines.

In the cluster system of the second embodiment, the autoscale server 200 performs autoscale control of the virtual machines so that users can use the virtual machines more smoothly. However, the autoscale server 200 may stop because of an abnormality or the like. The monitoring server 100 therefore provides a function that reduces the effect on autoscale control even when the autoscale server 200 has stopped. In the following description, a virtual machine may be abbreviated as VM, and autoscale may be abbreviated as AS (Auto Scaling).

FIG. 3 is a block diagram illustrating a hardware example of the monitoring server.
The monitoring server 100 includes a CPU 101, a RAM 102, an HDD 103, an image signal processing unit 104, an input signal processing unit 105, a medium reader 106, and a NIC (Network Interface Card) 107. The CPU 101 corresponds to the processing unit 12 of the first embodiment. The RAM 102 or the HDD 103 corresponds to the storage unit 11 of the first embodiment.

The CPU 101 is a processor that executes program instructions. The CPU 101 loads at least part of the programs and data stored in the HDD 103 into the RAM 102 and executes the programs. The CPU 101 may include a plurality of processor cores, and the monitoring server 100 may have a plurality of processors. The processes described below may be executed in parallel using a plurality of processors or processor cores. A set of processors may be referred to as a "multiprocessor" or simply a "processor".

The RAM 102 is a volatile semiconductor memory that temporarily stores the programs executed by the CPU 101 and the data the CPU 101 uses for computation. The monitoring server 100 may include a type of memory other than RAM, and may include a plurality of memories.

The HDD 103 is a non-volatile storage device that stores data and software programs such as an OS, middleware, and application software. The monitoring server 100 may include other types of storage devices, such as flash memory or an SSD (Solid State Drive), and may include a plurality of non-volatile storage devices.

The image signal processing unit 104 outputs images to the display 111 connected to the monitoring server 100 in accordance with instructions from the CPU 101. Any type of display can be used as the display 111, such as a CRT (Cathode Ray Tube) display, a liquid crystal display (LCD: Liquid Crystal Display), a plasma display, or an organic EL (OEL: Organic Electro-Luminescence) display.

The input signal processing unit 105 acquires input signals from the input device 112 connected to the monitoring server 100 and outputs them to the CPU 101. The input device 112 may be a pointing device (such as a mouse, touch panel, touch pad, or trackball), a keyboard, a remote controller, a button switch, or the like. A plurality of types of input devices may be connected to the monitoring server 100.

The medium reader 106 is a reading device that reads programs and data recorded on the recording medium 113. The recording medium 113 may be, for example, a magnetic disk, an optical disc, a magneto-optical disk (MO: Magneto-Optical disk), or a semiconductor memory. Magnetic disks include flexible disks (FD: Flexible Disk) and HDDs. Optical discs include CDs (Compact Disc) and DVDs (Digital Versatile Disc).

The medium reader 106 copies, for example, a program or data read from the recording medium 113 to another recording medium such as the RAM 102 or the HDD 103. The program that has been read is executed by, for example, the CPU 101. The recording medium 113 may be a portable recording medium and may be used to distribute programs and data. The recording medium 113 and the HDD 103 may be referred to as computer-readable recording media.

The NIC 107 is an interface that is connected to the network 60 and communicates with other computers via the network 60. The NIC 107 is connected by cable to a communication device such as a switch or a router.

FIG. 4 is a diagram illustrating examples of scale-out and scale-in.
FIG. 4(A) illustrates scale-out. Consider a case where the physical server 300 executes the virtual machines 310 and 320 and the physical server 400 executes the virtual machines 410 and 420. For example, the virtual machines 310, 320, 410, and 420 belong to one group of virtual machines subject to autoscaling and execute, in a distributed manner, the processing of an application (or group of applications) used by users.

The autoscale server 200 periodically collects the loads of the virtual machines 310, 320, 410, and 420. For example, when the average load of the virtual machines 310, 320, 410, and 420 stays above a first threshold for a predetermined period, the autoscale server 200 determines that the load on these virtual machines has risen and scales out. For example, the autoscale server 200 causes the physical server 300 to start a virtual machine 330 and distributes part of the load of the virtual machines 310, 320, 410, and 420 to the virtual machine 330.

FIG. 4(B) illustrates scale-in. Consider a case where the physical server 300 executes the virtual machines 310 and 320 and the physical server 400 executes the virtual machines 410 and 420. The autoscale server 200 periodically collects the loads of the virtual machines 310, 320, 410, and 420. For example, when the average load of the virtual machines 310, 320, 410, and 420 stays below a second threshold for a predetermined period, the autoscale server 200 determines that the load on these virtual machines has fallen and scales in. Here, the second threshold is smaller than the first threshold. For example, the autoscale server 200 stops the virtual machine 420 on the physical server 400 and releases the resources that were allocated to the virtual machine 420.
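The threshold logic of FIG. 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the patent: the class name `ScaleDecider`, the method `observe`, and the representation of the "predetermined period" as a fixed number of consecutive collection rounds are all hypothetical.

```python
from collections import deque
from statistics import mean


class ScaleDecider:
    """Decide scale-out / scale-in from a sustained group-average load (illustrative only)."""

    def __init__(self, first_threshold, second_threshold, periods):
        # The text requires the second (scale-in) threshold to be the smaller one.
        if second_threshold >= first_threshold:
            raise ValueError("second threshold must be smaller than the first")
        self.first_threshold = first_threshold
        self.second_threshold = second_threshold
        # Assumption: the "predetermined period" is `periods` consecutive rounds.
        self.history = deque(maxlen=periods)

    def observe(self, vm_loads):
        """Record one collection round; return 'scale_out', 'scale_in' or 'none'."""
        self.history.append(mean(vm_loads))
        if len(self.history) < self.history.maxlen:
            return "none"  # not enough rounds yet to call the load sustained
        if all(avg > self.first_threshold for avg in self.history):
            self.history.clear()  # start a fresh observation window after acting
            return "scale_out"
        if all(avg < self.second_threshold for avg in self.history):
            self.history.clear()
            return "scale_in"
        return "none"
```

Keeping the two thresholds apart, as the description does, leaves a band of loads in which neither action fires, which avoids alternating between scale-out and scale-in when the load hovers near a single limit.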

FIG. 5 is a block diagram illustrating a functional example of the cluster system.
The monitoring server 100 includes a storage unit 120 and a monitoring unit 130. The storage unit 120 is implemented using a storage area of the RAM 102 or the HDD 103. The monitoring unit 130 is implemented by the CPU 101 executing a program stored in the RAM 102.

The storage unit 120 stores an autoscale server management table and a VM management table. The autoscale server management table holds information indicating the operating state of the autoscale server 200. The VM management table holds information indicating whether the periodic communication with each virtual machine succeeded.

The monitoring unit 130 monitors each virtual machine on the physical servers 300 and 400 as well as the autoscale server 200. The monitoring unit 130 includes a VM monitoring unit 131 and an AS server cooperation unit 132.

The VM monitoring unit 131 periodically communicates with each virtual machine on the physical servers 300 and 400 to confirm connectivity with it. For example, the VM monitoring unit 131 confirms connectivity by receiving a connectivity-check packet from each virtual machine. The connectivity-check packet may be, for example, an ICMP (Internet Control Message Protocol) echo request, or an echo reply sent by the virtual machine in response to an echo request transmitted by the VM monitoring unit 131. Alternatively, the VM monitoring unit 131 may confirm connectivity using another protocol such as SNMP (Simple Network Management Protocol). Every virtual machine monitored by the VM monitoring unit 131 is a virtual machine subject to autoscale control. The VM monitoring unit 131 may query the autoscale server 200 for the virtual machines to be monitored.

The AS server cooperation unit 132 cooperates with the autoscale server (AS server) 200. The AS server cooperation unit 132 periodically communicates with the autoscale server 200 and performs alive monitoring of it. For example, the AS server cooperation unit 132 may perform alive monitoring by periodically querying the autoscale server 200 for the state of the virtual machines. If the autoscale server 200 responds to the query, it is operating. If it does not respond, it has stopped.

When the autoscale server 200 is operating, the AS server cooperation unit 132 asks the autoscale server 200 whether a monitored virtual machine whose connectivity could not be confirmed was stopped by scale-in. If the virtual machine was stopped by scale-in, the AS server cooperation unit 132 does not treat the failed connectivity check as an abnormality. If the virtual machine was not stopped by scale-in, the AS server cooperation unit 132 determines that the virtual machine is abnormal and notifies the system administrator. For example, the AS server cooperation unit 132 may cause the display 111 to show a screen indicating that an abnormality has occurred in the virtual machine. Alternatively, the AS server cooperation unit 132 may transmit a message indicating the abnormality to a terminal device (not shown) that is connected to the network 50 and used by the system administrator.

When the autoscale server 200 is stopped, the AS server cooperation unit 132 cannot ask the autoscale server 200 whether a virtual machine whose connectivity could not be confirmed was stopped by scale-in. The AS server cooperation unit 132 therefore holds the inquiry. When the autoscale server 200 later starts again, the AS server cooperation unit 132 detects the startup because its periodic communication with the autoscale server 200 resumes. If there is a virtual machine whose connectivity was lost while the autoscale server 200 was stopped, the AS server cooperation unit 132 then transmits information on that virtual machine to the autoscale server 200.
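The decision flow of the AS server cooperation unit 132 for an unreachable virtual machine can be sketched as follows. This is a hedged illustration: the class `AsServerLink`, its fields, and the return strings are hypothetical names, and the set `stopped_by_scale_in` merely stands in for the answer the autoscale server would give to the inquiry.

```python
from dataclasses import dataclass, field


@dataclass
class AsServerLink:
    """Hypothetical stand-in for the AS server cooperation unit 132."""

    as_server_up: bool = True
    # Stand-in for the autoscale server's answer to "was this VM scaled in?"
    stopped_by_scale_in: set = field(default_factory=set)
    pending: list = field(default_factory=list)   # inquiries held during an outage
    alerts: list = field(default_factory=list)    # VMs reported to the administrator

    def on_vm_unreachable(self, vm_name):
        if not self.as_server_up:
            # The autoscale server cannot be asked, so hold the inquiry.
            self.pending.append(vm_name)
            return "deferred"
        if vm_name in self.stopped_by_scale_in:
            # Stopped on purpose by scale-in: not an abnormality.
            return "expected"
        self.alerts.append(vm_name)
        return "abnormal"

    def on_as_server_recovered(self):
        """Return the VMs whose information should be sent to the restarted server."""
        self.as_server_up = True
        report, self.pending = self.pending, []
        return report
```

For example, with `stopped_by_scale_in={"Grp1_VM3"}`, an unreachable `Grp1_VM3` yields `"expected"`, while any other unreachable VM yields `"abnormal"` when the autoscale server is up and `"deferred"` when it is down.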

The autoscale server 200 includes a storage unit 210 and an AS control unit 220. The storage unit 210 is implemented using a storage area of the RAM or HDD of the autoscale server 200. The AS control unit 220 is implemented by the CPU of the autoscale server 200 executing a program stored in the RAM of the autoscale server 200.

The storage unit 210 stores information used for autoscale control. Specifically, the storage unit 210 stores an autoscale group table, a VM table, and an autoscale policy table.

The autoscale group table holds information indicating autoscale groups. An autoscale group is a group of virtual machines subject to autoscale control. Autoscale control of the virtual machines belonging to an autoscale group is performed according to the load of the virtual machines in that group. The VM table holds information indicating the virtual machines subject to autoscale control, including the state of each virtual machine. Possible states include: (1) the virtual machine is operating normally; (2) the virtual machine has an abnormality; (3) the virtual machine has been removed by scale-in (stopped for scale-in); and (4) the virtual machine is starting up for scale-out. The autoscale policy table holds information indicating the autoscale control policies (the conditions for performing scale-in and scale-out).

The autoscale group table, the VM table, and the autoscale policy table are stored in a non-volatile storage area of the storage unit 210 (for example, a storage area of the HDD). When used for autoscale control, these tables may also be copied and held temporarily in a volatile storage area of the storage unit 210 (for example, a storage area of the RAM). In that case, the AS control unit 220 also reflects any update made to a table held in the volatile storage area in the original table stored in the non-volatile storage area.

The AS control unit 220 performs autoscale control (AS control) of the virtual machines on the physical servers 300 and 400 (for example, a plurality of virtual machines including the virtual machines 310, 320, 410, and 420). The AS control unit 220 periodically collects load information from the virtual machines, for example by using a protocol such as SNMP. Based on the collected load and the autoscale policy of the autoscale group to which the virtual machine belongs, the AS control unit 220 instructs the physical servers 300 and 400 to scale in or scale out.

The autoscale server 200 may stop because of a failure or the like. While the autoscale server 200 is stopped, autoscale control by the AS control unit 220 is also stopped. When the autoscale server 200 starts again after having stopped, the AS control unit 220 acquires from the monitoring server 100 the virtual machine information that reflects the connectivity checks performed while the autoscale server 200 was stopped. Based on the acquired information, the AS control unit 220 updates the virtual machine states in the VM table stored in the storage unit 210. The AS control unit 220 then resumes collecting virtual machine loads, and resumes autoscale control, based on the updated VM table.
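The recovery step can be sketched as follows. This is only an illustration: the enum `VmState`, the function `apply_monitor_report`, and the shape of the report (VM name mapped to a reachability flag) are hypothetical. In particular, mapping an unreachable VM to an error state and a reachable VM to the normal state is an assumption made for the example; this passage does not specify the exact state transitions.

```python
from enum import Enum


class VmState(Enum):
    NORMAL = "normal"
    ERROR = "ERROR"
    SCALED_IN = "scale-in degeneration"
    SCALING_OUT = "scaling out"


def apply_monitor_report(vm_table, report):
    """Update the VM table from the monitoring server's report after a restart.

    vm_table: dict mapping VM name -> VmState (hypothetical in-memory VM table)
    report:   dict mapping VM name -> bool, True if the VM answered the last check
    Returns the names of VMs for which load collection would resume.
    """
    for name, reachable in report.items():
        if name not in vm_table:
            continue  # ignore VMs that are not under autoscale control
        # Assumption: unreachable means ERROR, reachable means NORMAL.
        vm_table[name] = VmState.NORMAL if reachable else VmState.ERROR
    return [name for name, state in vm_table.items() if state is VmState.NORMAL]
```

Updating the table before load collection resumes keeps the restarted server from basing scaling decisions on virtual machines that went down during the outage.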

FIG. 6 is a diagram illustrating an example of the autoscale server management table.
The autoscale server management table 121 is stored in the storage unit 120. The autoscale server management table 121 includes the items autoscale server ID (IDentifier) and operating flag.

The autoscale server ID item holds the identification information (autoscale server ID) of the autoscale server 200. The autoscale server ID of the autoscale server 200 is, for example, "device A". The operating flag item holds a flag indicating whether the autoscale server 200 is operating. The operating flag "True" indicates that the server is operating. The operating flag "False" indicates that it is not operating (that is, it has stopped). For example, the autoscale server management table 121 contains a record whose autoscale server ID is "device A" and whose operating flag is "True".

FIG. 7 is a diagram illustrating an example of the VM management table.
The VM management table 122 is stored in the storage unit 120. The VM management table 122 includes the items VM name, communication IP (Internet Protocol) address, autoscale VM operating flag, and autoscale information update flag.

The VM name item holds the name of the virtual machine (the virtual machine ID). The communication IP address item holds the IP address of the virtual machine. The autoscale VM operating flag item holds a flag indicating the result of alive monitoring (that is, whether the virtual machine is operating). The autoscale VM operating flag "True" indicates that the periodic communication with the virtual machine succeeded (the virtual machine is operating). The autoscale VM operating flag "False" indicates that the periodic communication with the virtual machine did not succeed (the virtual machine has stopped). The autoscale information update flag item holds a flag indicating whether the autoscale VM operating flag of the virtual machine was updated while the autoscale server 200 was stopped. The autoscale information update flag "True" indicates that such an update occurred, and "False" indicates that it did not. The initial value of the autoscale information update flag is "False".

For example, the VM management table 122 contains a record whose VM name is "Grp1_VM1", communication IP address is "100.10.99.1", autoscale VM operating flag is "True", and autoscale information update flag is "False". This record indicates that the communication IP address of the virtual machine named "Grp1_VM1" is "100.10.99.1", that the virtual machine is operating, and that its autoscale VM operating flag was not updated while the autoscale server 200 was stopped.

As another example, the VM management table 122 contains a record whose VM name is "SampleVM", communication IP address is "200.200.200.2", autoscale VM operating flag is "False", and autoscale information update flag is "True". This record indicates that the communication IP address of the virtual machine named "SampleVM" is "200.200.200.2", that the virtual machine has stopped, and that its autoscale VM operating flag was updated while the autoscale server 200 was stopped.
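The two flags of the VM management table can be modeled as in the sketch below. The names `VmRecord`, `record_check`, and `records_to_report` are hypothetical, and the rule that the update flag is set only when the operating flag changes during an autoscale server outage is a direct reading of the flag definitions above rather than a confirmed implementation detail.

```python
from dataclasses import dataclass


@dataclass
class VmRecord:
    vm_name: str
    ip_address: str
    operating: bool = True       # autoscale VM operating flag
    info_updated: bool = False   # autoscale information update flag (initially False)


def record_check(record, reachable, as_server_running):
    """Store the result of one periodic connectivity check for a VM."""
    if record.operating != reachable:
        record.operating = reachable
        # Mark the record only if the change happened during an outage.
        if not as_server_running:
            record.info_updated = True


def records_to_report(table):
    """Records whose operating flag changed while the autoscale server was stopped."""
    return [rec for rec in table if rec.info_updated]


# Records mirroring the two examples of FIG. 7.
vm1 = VmRecord("Grp1_VM1", "100.10.99.1")
sample = VmRecord("SampleVM", "200.200.200.2")
record_check(vm1, reachable=True, as_server_running=False)      # no change
record_check(sample, reachable=False, as_server_running=False)  # change during outage
```

After these two checks, `records_to_report([vm1, sample])` contains only the `SampleVM` record, matching the second example record of the table.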

FIG. 8 is a diagram illustrating an example of the autoscale group table.
The autoscale group table 211 is stored in the storage unit 210. The autoscale group table 211 includes the items autoscale group ID, available CIDR (Classless Inter-Domain Routing), autoscale policy ID, minimum number, and maximum number.

The autoscale group ID item holds the identification information (autoscale group ID) of the autoscale group. The available CIDR item holds the CIDR block that the group may use. The autoscale policy ID item holds the identification information (autoscale policy ID) of the autoscale policy applied to the autoscale group. The specific content of the autoscale policy corresponding to an autoscale policy ID is registered in advance in the autoscale policy table described later. The minimum number item holds the minimum number of virtual machines in the autoscale group. The maximum number item holds the maximum number of virtual machines in the autoscale group.

For example, the autoscale group table 211 contains a record whose autoscale group ID is "group 1", available CIDR is "100.10.99.0/24", autoscale policy ID is "rule 1, 3", minimum number is "1", and maximum number is "10". This record indicates that, for the autoscale group with the autoscale group ID "group 1", the available CIDR is "100.10.99.0/24", the autoscale policies with the autoscale policy IDs "rule 1, 3" are applied, and the number of virtual machines is at least 1 and at most 10.
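A record of the autoscale group table could be represented as in the following sketch, which uses Python's standard `ipaddress` module to test whether an address falls inside the available CIDR. The class `AutoscaleGroup` and its method names are hypothetical, and treating the minimum and maximum numbers as hard bounds on scale-in and scale-out is an assumption made for the example.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoscaleGroup:
    group_id: str
    cidr: str            # available CIDR, e.g. "100.10.99.0/24"
    policy_ids: tuple    # autoscale policy IDs applied to the group
    min_vms: int
    max_vms: int

    def contains(self, ip):
        """True if the address lies inside the group's available CIDR."""
        return ipaddress.ip_address(ip) in ipaddress.ip_network(self.cidr)

    def can_scale_out(self, current_vms):
        # Assumption: scale-out must not exceed the maximum number.
        return current_vms < self.max_vms

    def can_scale_in(self, current_vms):
        # Assumption: scale-in must not go below the minimum number.
        return current_vms > self.min_vms


# The example record of FIG. 8.
group1 = AutoscaleGroup("group 1", "100.10.99.0/24", ("rule 1", "rule 3"), 1, 10)
```

With this record, `group1.contains("100.10.99.1")` holds while `group1.contains("100.11.0.32")` does not, because only the former lies in the block 100.10.99.0/24.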

FIG. 9 is a diagram illustrating an example of the VM table.
The VM table 212 is stored in the storage unit 210. The VM table 212 includes the items VM name, autoscale group ID, communication IP address, and VM state.

The VM name item holds the VM name of the virtual machine. The autoscale group ID item holds the autoscale group ID of the autoscale group to which the virtual machine belongs. The communication IP address item holds the IP address of the virtual machine. The VM state item holds the state of the virtual machine. As described above, possible states include: the virtual machine is operating normally; the virtual machine has an abnormality (ERROR); the virtual machine has been removed by scale-in (stopped for scale-in); and the virtual machine is starting up for scale-out.

For example, the VM table 212 contains a record whose VM name is "Grp1_VM1", autoscale group ID is "group 1", communication IP address is "100.10.99.1", and VM state is "normal". This record indicates that the virtual machine named "Grp1_VM1" belongs to the autoscale group with the autoscale group ID "group 1", that its IP address is "100.10.99.1", and that it is operating normally.

As another example, the VM table 212 contains a record whose VM name is "Grp1_VM2", autoscale group ID is "group 1", communication IP address is "100.10.99.2", and VM state is "ERROR". This record indicates that the virtual machine named "Grp1_VM2" belongs to the autoscale group with the autoscale group ID "group 1", that its IP address is "100.10.99.2", and that an abnormality has occurred in it.

As another example, the VM table 212 contains a record whose VM name is "Grp1_VM3", autoscale group ID is "group 1", communication IP address is "100.10.99.3", and VM state is "scale-in degeneration". This record indicates that the virtual machine named "Grp1_VM3" belongs to the autoscale group with the autoscale group ID "group 1", that its IP address is "100.10.99.3", and that it has been stopped by scale-in.

As another example, the VM table 212 contains a record whose VM name is "Grp2_VM3", autoscale group ID is "group 2", communication IP address is "100.11.0.32", and VM state is "scaling out". This record indicates that the virtual machine named "Grp2_VM3" belongs to the autoscale group with the autoscale group ID "group 2", that its IP address is "100.11.0.32", and that it is starting up for scale-out.

FIG. 10 is a diagram illustrating an example of the autoscale policy table. The autoscale policy table 213 is stored in the storage unit 210. The autoscale policy table 213 includes the items autoscale policy ID, trigger, and trigger details.

The autoscale policy ID item holds the identification information (autoscale policy ID) of the autoscale policy. The trigger item holds the monitored resource that triggers autoscale control (this may be a logical resource as seen by a virtual machine). The trigger details item holds the conditions on the trigger for autoscale control.

例えば、オートスケールポリシーテーブル２１３には、オートスケールポリシーＩＤが「ルール１」、トリガーが「ＣＰＵ使用率」、トリガー詳細が「１分毎のＣＰＵ平均使用率を取得し、連続５回８０％を上回るとスケールアウト」というレコードが登録される。 For example, the autoscale policy table 213 registers a record in which the autoscale policy ID is “Rule 1”, the trigger is “CPU usage rate”, and the trigger details are “acquire the average CPU usage rate every minute, and scale out when it exceeds 80% five consecutive times”.

このレコードは、オートスケールポリシーＩＤ「ルール１」のオートスケールポリシーでは、仮想マシンのＣＰＵ使用率をトリガーとしており、１分毎のＣＰＵ平均使用率が連続５回８０％を上回った場合に、スケールアウトを行うことを示す。ここで、「１分毎のＣＰＵ平均使用率」は、該当のオートスケールグループに属する複数の仮想マシンに関する平均でもよいし、該当のオートスケールグループに属する仮想マシン単位の平均でもよい。後者の場合、該当のオートスケールグループに属する少なくとも何れかの仮想マシンにおいて、１分毎のＣＰＵ平均使用率が連続５回８０％を上回るとスケールアウトを行う。なお、所定時間毎の「ＣＰＵ平均使用率」（あるいは、「メモリ平均使用率」）の考え方は、他のオートスケールポリシーについても同様である。 This record indicates that the autoscale policy with the autoscale policy ID “Rule 1” uses the CPU usage rate of the virtual machines as its trigger, and that scale-out is performed when the average CPU usage rate per minute exceeds 80% five consecutive times. Here, the “average CPU usage rate per minute” may be an average over the plurality of virtual machines belonging to the corresponding autoscale group, or may be an average computed for each individual virtual machine belonging to that group. In the latter case, scale-out is performed when the average CPU usage rate per minute exceeds 80% five consecutive times in at least one of the virtual machines belonging to the corresponding autoscale group. The same interpretation of the “average CPU usage rate” (or “average memory usage rate”) per predetermined time applies to the other autoscale policies.

また、例えば、オートスケールポリシーテーブル２１３には、オートスケールポリシーＩＤが「ルール２」、トリガーが「メモリ使用率」、トリガー詳細が「５分毎のメモリ平均使用率を取得し、連続３回９５％を上回るとスケールアウト」というレコードが登録される。このレコードは、オートスケールポリシーＩＤ「ルール２」のオートスケールポリシーでは、仮想マシンのメモリ使用率をトリガーとしており、５分毎のメモリ平均使用率が連続３回９５％を上回った場合に、スケールアウトを行うことを示す。 Further, for example, the autoscale policy table 213 registers a record in which the autoscale policy ID is “Rule 2”, the trigger is “memory usage rate”, and the trigger details are “acquire the average memory usage rate every 5 minutes, and scale out when it exceeds 95% three consecutive times”. This record indicates that the autoscale policy with the autoscale policy ID “Rule 2” uses the memory usage rate of the virtual machines as its trigger, and that scale-out is performed when the average memory usage rate per 5 minutes exceeds 95% three consecutive times.

また、例えば、オートスケールポリシーテーブル２１３には、オートスケールポリシーＩＤが「ルール３」、トリガーが「ＣＰＵ使用率」、トリガー詳細が「１分毎のＣＰＵ平均使用率を取得し、連続５回１０％を下回るとスケールイン」というレコードが登録される。このレコードは、オートスケールポリシーＩＤ「ルール３」のオートスケールポリシーでは、仮想マシンのＣＰＵ使用率をトリガーとしており、１分毎のＣＰＵ平均使用率が連続５回１０％を下回った場合に、スケールインを行うことを示す。 Further, for example, the autoscale policy table 213 registers a record in which the autoscale policy ID is “Rule 3”, the trigger is “CPU usage rate”, and the trigger details are “acquire the average CPU usage rate every minute, and scale in when it falls below 10% five consecutive times”. This record indicates that the autoscale policy with the autoscale policy ID “Rule 3” uses the CPU usage rate of the virtual machines as its trigger, and that scale-in is performed when the average CPU usage rate per minute falls below 10% five consecutive times.

また、例えば、オートスケールポリシーテーブル２１３には、オートスケールポリシーＩＤが「ルール４」、トリガーが「メモリ使用率」、トリガー詳細が「５分毎のメモリ平均使用率を取得し、連続５回３０％を下回るとスケールイン」というレコードが登録される。このレコードは、オートスケールポリシーＩＤ「ルール４」のオートスケールポリシーでは、仮想マシンのメモリ使用率をトリガーとしており、５分毎のメモリ平均使用率が連続５回３０％を下回った場合に、スケールインを行うことを示す。 Further, for example, the autoscale policy table 213 registers a record in which the autoscale policy ID is “Rule 4”, the trigger is “memory usage rate”, and the trigger details are “acquire the average memory usage rate every 5 minutes, and scale in when it falls below 30% five consecutive times”. This record indicates that the autoscale policy with the autoscale policy ID “Rule 4” uses the memory usage rate of the virtual machines as its trigger, and that scale-in is performed when the average memory usage rate per 5 minutes falls below 30% five consecutive times.
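The four example records of the autoscale policy table 213 can be read as "threshold breached N consecutive times" rules. The following Python sketch illustrates one possible evaluation of such rules. The names `Policy`, `POLICIES`, and `PolicyEvaluator` are hypothetical and do not appear in the specification, and clearing the sample history after a rule fires is an assumption made only for this illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    policy_id: str
    trigger: str       # monitored resource: "cpu" or "memory"
    interval_min: int  # averaging interval in minutes
    threshold: float   # usage threshold in percent
    consecutive: int   # number of consecutive breaches required
    above: bool        # True: breach when above threshold, False: when below
    action: str        # "scale_out" or "scale_in"


# The four example records of the autoscale policy table 213.
POLICIES = [
    Policy("Rule 1", "cpu", 1, 80.0, 5, True, "scale_out"),
    Policy("Rule 2", "memory", 5, 95.0, 3, True, "scale_out"),
    Policy("Rule 3", "cpu", 1, 10.0, 5, False, "scale_in"),
    Policy("Rule 4", "memory", 5, 30.0, 5, False, "scale_in"),
]


class PolicyEvaluator:
    """Tracks the most recent samples for one policy (hypothetical helper)."""

    def __init__(self, policy):
        self.policy = policy
        self.recent = deque(maxlen=policy.consecutive)

    def observe(self, average_usage):
        """Record one interval average; return the action if the rule fires."""
        p = self.policy
        if p.above:
            breached = average_usage > p.threshold
        else:
            breached = average_usage < p.threshold
        self.recent.append(breached)
        if len(self.recent) == p.consecutive and all(self.recent):
            self.recent.clear()  # assumption: start counting again after firing
            return p.action
        return None
```

When the per-virtual-machine interpretation of the average is used, one evaluator per virtual machine would be kept and the group would scale out as soon as any of them fires.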

次に、上記のクラスタシステムにおける監視サーバ１００の処理手順を説明する。
図１１は、ＶＭ監視の例を示すフローチャートである。
ＶＭ監視部１３１は下記の処理を定期的に実行する。実行の周期は、運用に応じて定められる。周期は、数秒から数十秒程度でもよいし、１分から数分程度でもよい。 Next, a processing procedure of the monitoring server 100 in the cluster system will be described.
FIG. 11 is a flowchart illustrating an example of VM monitoring.
The VM monitoring unit 131 periodically executes the following processing. The execution cycle is determined according to the operation. The period may be about several seconds to several tens of seconds, or about 1 minute to several minutes.

（Ｓ１０）ＶＭ監視部１３１は、監視対象の仮想マシン（監視対象ＶＭ）の監視情報を収集する。例えば、ＶＭ監視部１３１は、監視対象の仮想マシンから死活監視用の所定のパケットを受信することで、監視情報を収集する。 (S10) The VM monitoring unit 131 collects monitoring information of the virtual machine to be monitored (monitoring target VM). For example, the VM monitoring unit 131 collects monitoring information by receiving a predetermined packet for alive monitoring from a virtual machine to be monitored.

（Ｓ１１）ＶＭ監視部１３１は、監視対象ＶＭの動作状況を更新する。具体的には、ＶＭ監視部１３１は、ステップＳ１０の監視情報の収集結果に基づいて、ＶＭ管理テーブル１２２を更新する。すなわち、ＶＭ監視部１３１は、監視情報を収集できた（死活監視用のパケットを受信できた）仮想マシンのオートスケールＶＭ動作中フラグを「Ｔｒｕｅ」に設定する。なお、元々「Ｔｒｕｅ」の場合はそのままでよい。 (S11) The VM monitoring unit 131 updates the operation status of the monitoring target VM. Specifically, the VM monitoring unit 131 updates the VM management table 122 based on the monitoring information collection result in step S10. In other words, the VM monitoring unit 131 sets the auto-scale VM operating flag of the virtual machine for which the monitoring information has been collected (the alive monitoring packet has been received) to “True”. In the case of “True” originally, it can be left as it is.

（Ｓ１２）ＶＭ監視部１３１は、前回の監視情報の収集時から所定時間内に監視情報が届いていない監視対象ＶＭがあるか否かを判定する。所定時間とは、当該監視の周期、または、当該周期に比較的短い時間（当該周期よりも短い時間）を加算した時間である。所定時間内に監視情報が届いていない監視対象ＶＭがある場合、ステップＳ１３に処理が進む。所定時間内に監視情報が届いていない監視対象ＶＭがない場合、ステップＳ１６に処理が進む。 (S12) The VM monitoring unit 131 determines whether there is a monitoring target VM from which monitoring information has not arrived within a predetermined time since the previous collection of monitoring information. The predetermined time is either the monitoring period itself or the monitoring period plus a relatively short margin (a time shorter than the period). If there is a monitoring target VM from which monitoring information has not arrived within the predetermined time, the process proceeds to step S13. If there is no such monitoring target VM, the process proceeds to step S16.

（Ｓ１３）ＶＭ監視部１３１は、ステップＳ１２で前回の監視情報の収集時から所定時間内に監視情報が届いていないと判断された監視対象ＶＭについて、ＶＭ管理テーブル１２２のオートスケールＶＭ動作中フラグを「Ｆａｌｓｅ」に設定する。なお、元々「Ｆａｌｓｅ」の場合はそのままでよい。 (S13) The VM monitoring unit 131 sets an auto-scale VM operating flag in the VM management table 122 for the monitoring target VM that is determined in step S12 that the monitoring information has not arrived within a predetermined time since the last monitoring information collection. Is set to “False”. In the case of “False” originally, it can be left as it is.

（Ｓ１４）ＶＭ監視部１３１は、オートスケールサーバ管理テーブル１２１を参照して、稼働中フラグが「Ｔｒｕｅ」であるか否かを判定する。稼働中フラグが「Ｔｒｕｅ」の場合、ステップＳ１６に処理が進む。稼働中フラグが「Ｆａｌｓｅ」の場合、ステップＳ１５に処理が進む。 (S14) The VM monitoring unit 131 refers to the autoscale server management table 121 and determines whether or not the operating flag is “True”. If the operating flag is “True”, the process proceeds to step S16. If the operating flag is “False”, the process proceeds to step S15.

（Ｓ１５）ＶＭ監視部１３１は、ステップＳ１３でオートスケールＶＭ動作中フラグを「Ｆａｌｓｅ」に設定した監視対象ＶＭについて、ＶＭ管理テーブル１２２のオートスケール情報更新フラグを「Ｔｒｕｅ」に設定する。 (S15) The VM monitoring unit 131 sets the autoscale information update flag of the VM management table 122 to “True” for the monitoring target VM for which the autoscale VM operation flag is set to “False” in step S13.

（Ｓ１６）ＶＭ監視部１３１は、監視を継続するか否を判定する。監視を継続する場合、監視の周期の分だけ待機して、ステップＳ１０に処理が進む。監視を継続しない場合、ＶＭ監視の処理が終了する。例えば、ＶＭ監視部１３１は、システム管理者による監視の終了の入力を受け付けた場合、監視を継続しないと判定し、それ以外の場合に監視を継続すると判定する。 (S16) The VM monitoring unit 131 determines whether or not to continue monitoring. When monitoring is continued, the process waits for the monitoring period and the process proceeds to step S10. If the monitoring is not continued, the VM monitoring process ends. For example, the VM monitoring unit 131 determines that the monitoring is not continued when an input of the monitoring end by the system administrator is received, and determines that the monitoring is continued in other cases.
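As a rough illustration of steps S10 to S15, the following Python sketch performs a single monitoring pass over an in-memory stand-in for the VM management table 122. The names `VmRecord` and `vm_monitor_pass`, the `last_seen` field, and the way heartbeats are passed in as a set of VM names are all assumptions made for this sketch. The specification states only that an alive-monitoring packet is received from each virtual machine.

```python
from dataclasses import dataclass


@dataclass
class VmRecord:
    name: str
    running: bool = True        # autoscale VM operating flag
    needs_update: bool = False  # autoscale information update flag
    last_seen: float = 0.0      # time of last heartbeat (hypothetical field)


def vm_monitor_pass(vm_table, heartbeats, as_server_running, now, timeout):
    """One pass of S10 to S15.

    vm_table: dict mapping VM name to VmRecord
    heartbeats: set of VM names whose alive-monitoring packet arrived (S10)
    as_server_running: operating flag of the autoscale server management table
    timeout: monitoring period, optionally plus a short margin (S12)
    """
    # S11: mark every VM that delivered monitoring information as operating.
    for name in heartbeats:
        rec = vm_table[name]
        rec.running = True
        rec.last_seen = now

    # S12, S13: mark VMs that stayed silent longer than the timeout as stopped.
    for rec in vm_table.values():
        if now - rec.last_seen > timeout:
            rec.running = False
            # S14, S15: remember the change if the autoscale server is down,
            # so it can be reported once the server comes back.
            if not as_server_running:
                rec.needs_update = True
```

The caller would repeat this pass once per monitoring period until the administrator ends monitoring (S16).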

図１２は、オートスケールサーバ監視の例を示すフローチャートである。
ＡＳサーバ連携部１３２は下記の処理を定期的に実行する。実行の周期は、運用に応じて定められる。周期は、数秒から数十秒程度でもよいし、１分から数分程度でもよい。 FIG. 12 is a flowchart illustrating an example of autoscale server monitoring.
The AS server cooperation unit 132 periodically executes the following processing. The execution cycle is determined according to the operation. The period may be about several seconds to several tens of seconds, or about 1 minute to several minutes.

（Ｓ２０）ＡＳサーバ連携部１３２は、オートスケールサーバ２００のＶＭ状態を参照する。具体的には、ＡＳサーバ連携部１３２は、オートスケールサーバ２００に、ＶＭテーブル２１２における各仮想マシンのＶＭ状態を問い合わせる。 (S20) The AS server cooperation unit 132 refers to the VM state of the autoscale server 200. Specifically, the AS server cooperation unit 132 inquires of the autoscale server 200 about the VM state of each virtual machine in the VM table 212.

（Ｓ２１）ＡＳサーバ連携部１３２は、オートスケールサーバ２００が動作中であるか否かを判定する。オートスケールサーバ２００が動作中である場合、ステップＳ２３に処理が進む。オートスケールサーバ２００が動作中でない、すなわち、停止している場合、ステップＳ２２に処理が進む。例えば、ＡＳサーバ連携部１３２は、ステップＳ２０の問い合わせに対するオートスケールサーバ２００の応答がある場合、オートスケールサーバ２００が動作中であると判定する。また、ＡＳサーバ連携部１３２は、ステップＳ２０の問い合わせに対するオートスケールサーバ２００の応答がない場合、オートスケールサーバ２００が停止していると判定する。 (S21) The AS server cooperation unit 132 determines whether or not the autoscale server 200 is operating. If the autoscale server 200 is operating, the process proceeds to step S23. If the autoscale server 200 is not operating, that is, if it is stopped, the process proceeds to step S22. For example, if the autoscale server 200 responds to the inquiry in step S20, the AS server cooperation unit 132 determines that the autoscale server 200 is operating. If the autoscale server 200 does not respond to the inquiry in step S20, the AS server cooperation unit 132 determines that the autoscale server 200 is stopped.

（Ｓ２２）ＡＳサーバ連携部１３２は、オートスケールサーバ管理テーブル１２１の稼働中フラグを「Ｆａｌｓｅ」に設定する。元々「Ｆａｌｓｅ」の場合はそのままでよい。そして、ステップＳ２７に処理が進む。 (S22) The AS server cooperation unit 132 sets the operating flag of the autoscale server management table 121 to “False”. In the case of “False” originally, it can be left as it is. Then, the process proceeds to step S27.

（Ｓ２３）ＡＳサーバ連携部１３２は、オートスケールサーバ管理テーブル１２１の稼働中フラグを「Ｔｒｕｅ」に設定する。元々「Ｔｒｕｅ」の場合はそのままでよい。
（Ｓ２４）ＡＳサーバ連携部１３２は、ＶＭ管理テーブル１２２のオートスケール情報更新フラグが「Ｔｒｕｅ」である監視対象ＶＭがあるか否かを判定する。オートスケール情報更新フラグが「Ｔｒｕｅ」である監視対象ＶＭがある場合、ステップＳ２５に処理が進む。オートスケール情報更新フラグが「Ｔｒｕｅ」である監視対象ＶＭがない場合、ステップＳ２７に処理が進む。 (S23) The AS server cooperation unit 132 sets the operating flag of the autoscale server management table 121 to “True”. In the case of “True” originally, it can be left as it is.
(S24) The AS server cooperation unit 132 determines whether there is a monitoring target VM whose autoscale information update flag in the VM management table 122 is “True”. If there is a monitoring target VM whose autoscale information update flag is “True”, the process proceeds to step S25. If there is no monitoring target VM whose autoscale information update flag is “True”, the process proceeds to step S27.

（Ｓ２５）ＡＳサーバ連携部１３２は、オートスケールサーバ２００が管理するＶＭ状態を、監視サーバ１００のＶＭ管理テーブル１２２におけるオートスケールＶＭ動作中フラグを基に更新する。具体的には、ＡＳサーバ連携部１３２は、オートスケール情報更新フラグが「Ｔｒｕｅ」である監視対象ＶＭの情報（オートスケールＶＭ動作中フラグ「Ｆａｌｓｅ」を示す情報）を、オートスケールサーバ２００に送信する。 (S25) The AS server cooperation unit 132 updates the VM state managed by the autoscale server 200 based on the autoscale VM operating flag in the VM management table 122 of the monitoring server 100. Specifically, the AS server cooperation unit 132 transmits, to the autoscale server 200, information on each monitoring target VM whose autoscale information update flag is “True” (information indicating that its autoscale VM operating flag is “False”).

（Ｓ２６）ＡＳサーバ連携部１３２は、ＶＭ管理テーブル１２２におけるオートスケール情報更新フラグを「Ｆａｌｓｅ」に設定する。具体的には、ＡＳサーバ連携部１３２は、ステップＳ２４でオートスケール情報更新フラグが「Ｔｒｕｅ」であった箇所を、「Ｆａｌｓｅ」に変更する。 (S26) The AS server cooperation unit 132 sets the autoscale information update flag in the VM management table 122 to “False”. Specifically, the AS server cooperation unit 132 changes the part where the autoscale information update flag is “True” in Step S24 to “False”.

（Ｓ２７）ＡＳサーバ連携部１３２は、オートスケールサーバ２００から取得した各監視対象ＶＭのＶＭ状態に応じて異常の発生を検知し、システム管理者に異常を通知する。例えば、ＡＳサーバ連携部１３２は、ＶＭ管理テーブル１２２においてオートスケールＶＭ動作中フラグが「Ｆａｌｓｅ」で、かつ、オートスケールサーバ２００に確認したＶＭ状態が「スケールインによる停止」でない仮想マシンを異常と判定する。例えば、ＡＳサーバ連携部１３２は、ディスプレイ１１１に異常を示す画像を表示させてもよい。あるいは、ＡＳサーバ連携部１３２は、システム管理者が利用する端末装置に、異常を示すメッセージを送信してもよい。なお、ステップＳ２２を経由してステップＳ２７が実行される場合、ＡＳサーバ連携部１３２はオートスケールサーバ２００からＶＭ状態を取得できないことになる。この場合、ＡＳサーバ連携部１３２は、ステップＳ２７をスキップしてステップＳ２８を実行してもよい。あるいは、ＡＳサーバ連携部１３２は、例外的にオートスケールサーバ２００への確認なしに、オートスケールＶＭ動作中フラグが「Ｆａｌｓｅ」の仮想マシンを異常とみなして、システム管理者に当該仮想マシンの異常を通知してもよい。 (S27) The AS server cooperation unit 132 detects the occurrence of an abnormality according to the VM state of each monitoring target VM acquired from the autoscale server 200, and notifies the system administrator of the abnormality. For example, the AS server cooperation unit 132 determines that a virtual machine is abnormal if its autoscale VM operating flag in the VM management table 122 is “False” and the VM state confirmed with the autoscale server 200 is not “stopped by scale-in”. For example, the AS server cooperation unit 132 may display an image indicating the abnormality on the display 111. Alternatively, the AS server cooperation unit 132 may transmit a message indicating the abnormality to a terminal device used by the system administrator. When step S27 is reached via step S22, the AS server cooperation unit 132 cannot acquire the VM state from the autoscale server 200. In this case, the AS server cooperation unit 132 may skip step S27 and execute step S28. Alternatively, as an exception, the AS server cooperation unit 132 may regard a virtual machine whose autoscale VM operating flag is “False” as abnormal without confirming with the autoscale server 200, and notify the system administrator of the abnormality of that virtual machine.

（Ｓ２８）ＡＳサーバ連携部１３２は、監視を継続するか否かを判定する。監視を継続する場合、監視の周期の分だけ待機して、ステップＳ２０に処理が進む。監視を継続しない場合、オートスケールサーバ監視の処理が終了する。例えば、ＡＳサーバ連携部１３２は、システム管理者による監視の終了の入力を受け付けた場合、監視を継続しないと判定し、それ以外の場合に監視を継続すると判定する。 (S28) The AS server cooperation unit 132 determines whether or not to continue monitoring. When monitoring is continued, the process waits for the monitoring period and the process proceeds to step S20. If the monitoring is not continued, the autoscale server monitoring process ends. For example, the AS server cooperation unit 132 determines that the monitoring is not continued when an input of the monitoring end by the system administrator is received, and determines that the monitoring is continued in other cases.
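Steps S20 to S27 can be sketched as a single function, shown below in Python. This is only an illustration under several assumptions: the tables are plain dictionaries, the three callbacks `query_vm_states`, `push_updates`, and `notify` are hypothetical stand-ins for the communication with the autoscale server 200 and the system administrator, a failed inquiry is modelled as a `ConnectionError`, and step S27 is skipped when the autoscale server is down (one of the two options the description allows).

```python
SCALE_IN_STOPPED = "scale-in degenerate"  # VM state meaning "stopped by scale-in"


def as_server_pass(vm_table, server_state, query_vm_states, push_updates, notify):
    """One pass of S20 to S27 over dict-based tables.

    vm_table: dict of name -> {"name", "running", "needs_update"}
    server_state: {"running": bool}, the autoscale server management table
    query_vm_states: callable returning {vm_name: state}; raises
        ConnectionError when the autoscale server does not respond
    push_updates: callable taking a list of VM names found stopped
    notify: callable taking the name of an abnormal VM
    """
    try:
        states = query_vm_states()            # S20
    except ConnectionError:
        states = None

    if states is None:                        # S21: no response
        server_state["running"] = False       # S22
        return                                # S27 skipped in this sketch

    server_state["running"] = True            # S23
    pending = [r for r in vm_table.values() if r["needs_update"]]  # S24
    if pending:
        push_updates([r["name"] for r in pending])                 # S25
        for r in pending:
            r["needs_update"] = False                              # S26

    # S27: a VM that is down without having been scaled in is abnormal.
    for r in vm_table.values():
        if not r["running"] and states.get(r["name"]) != SCALE_IN_STOPPED:
            notify(r["name"])
```

Because the update flags are cleared only after `push_updates` returns, a pass that fails to reach the autoscale server leaves them set, and the pending information is sent on a later pass.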

次に、監視サーバ１００による監視の例を説明する。
図１３は、監視サーバによる監視の例を示す図である。
説明を簡単にするため、ＶＭ管理テーブル１２２の各項目のうち、ＶＭ名とＶＭ動作中フラグ（オートスケールＶＭ動作中フラグに相当）とを図示し、他の項目の図示を省略する。また、ＶＭテーブル２１２の各項目のうち、ＶＭ名とＶＭ状態とを図示し、他の項目の図示を省略する。また、ＶＭ名「ＶＭ１」の仮想マシンを、仮想マシンＶＭ１のように表記する（他のＶＭ名についても同様に表記する）。 Next, an example of monitoring by the monitoring server 100 will be described.
FIG. 13 is a diagram illustrating an example of monitoring by the monitoring server.
In order to simplify the description, among the items in the VM management table 122, the VM name and the VM operating flag (corresponding to the auto-scale VM operating flag) are shown, and the other items are not shown. In addition, among the items of the VM table 212, the VM name and the VM state are illustrated, and illustration of other items is omitted. In addition, the virtual machine with the VM name “VM1” is represented as a virtual machine VM1 (other VM names are also represented in the same manner).

まず、オートスケールサーバ２００が稼働中の場合を考える（ステップＳＴ１１）。ＶＭ管理テーブル１２２によれば、この段階において、仮想マシンＶＭ１のＶＭ動作中フラグは「Ｔｒｕｅ」である。仮想マシンＶＭ２のＶＭ動作中フラグは「Ｆａｌｓｅ」である。仮想マシンＶＭ３のＶＭ動作中フラグは「Ｔｒｕｅ」である。仮想マシンＶＭ４のＶＭ動作中フラグは「Ｔｒｕｅ」である。一方、ＶＭテーブル２１２によれば、仮想マシンＶＭ１のＶＭ状態は「正常」である。仮想マシンＶＭ２のＶＭ状態は「スケールイン縮退」である。仮想マシンＶＭ３のＶＭ状態は「正常」である。仮想マシンＶＭ４のＶＭ状態は「正常」である。ＶＭ管理テーブル１２２で、仮想マシンＶＭ２のＶＭ動作中フラグが「Ｆａｌｓｅ」なので、監視サーバ１００は、オートスケールサーバ２００に仮想マシンＶＭ２のＶＭ状態を問い合わせる。オートスケールサーバ２００は、ＶＭテーブル２１２に基づいて、仮想マシンＶＭ２のＶＭ状態「スケールイン縮退」を監視サーバ１００に応答する。この場合、監視サーバ１００は、仮想マシンＶＭ２から監視情報を取得できなかったことを異常とみなさない。 First, consider the case where the autoscale server 200 is in operation (step ST11). According to the VM management table 122, the VM operating flag of the virtual machine VM1 at this stage is “True”. The VM operating flag of the virtual machine VM2 is “False”. The VM operating flag of the virtual machine VM3 is “True”. The VM operating flag of the virtual machine VM4 is “True”. On the other hand, according to the VM table 212, the VM state of the virtual machine VM1 is “normal”. The VM state of the virtual machine VM2 is “scale-in degeneration”. The VM state of the virtual machine VM3 is “normal”. The VM state of the virtual machine VM4 is “normal”. Since the VM operating flag of the virtual machine VM2 is “False” in the VM management table 122, the monitoring server 100 inquires of the autoscale server 200 about the VM state of the virtual machine VM2. Based on the VM table 212, the autoscale server 200 responds to the monitoring server 100 with the VM state “scale-in degeneration” of the virtual machine VM2. In this case, the monitoring server 100 does not regard the failure to acquire the monitoring information from the virtual machine VM2 as an abnormality.

その後、オートスケールサーバ２００が停止した場合を考える（ステップＳＴ１２）。ＶＭテーブル２１２は、オートスケールサーバ２００が停止している間も、オートスケールサーバ２００の不揮発性の記憶装置（例えば、ＨＤＤ）に保持されている。監視サーバ１００は、オートスケールサーバ２００に対するＶＭ状態の定期的な問い合わせに対して、オートスケールサーバ２００からの応答がないことを検知することで、オートスケールサーバ２００が停止したことを検知する。 Then, consider a case where the autoscale server 200 is stopped (step ST12). The VM table 212 is held in a non-volatile storage device (for example, HDD) of the autoscale server 200 even while the autoscale server 200 is stopped. The monitoring server 100 detects that the autoscale server 200 has stopped by detecting that there is no response from the autoscale server 200 in response to a periodic inquiry about the VM state to the autoscale server 200.

監視サーバ１００は、仮想マシンＶＭ４との通信不可を検知する。すると、監視サーバ１００は、ＶＭ管理テーブル１２２において、仮想マシンＶＭ４のＶＭ動作中フラグを「Ｆａｌｓｅ」に変更することで、ＶＭ管理テーブル１２２をＶＭ管理テーブル１２３に更新する。監視サーバ１００は、仮想マシンＶＭ４について、オートスケールサーバ２００が停止している間にＶＭ動作中フラグを「Ｔｒｕｅ」から「Ｆａｌｓｅ」に変更したので、オートスケール情報更新フラグ（図１３では図示を省略している）を「Ｔｒｕｅ」に設定する。 The monitoring server 100 detects that communication with the virtual machine VM4 is not possible. The monitoring server 100 then changes the VM operating flag of the virtual machine VM4 to “False” in the VM management table 122, thereby updating the VM management table 122 to the VM management table 123. Because the monitoring server 100 changed the VM operating flag of the virtual machine VM4 from “True” to “False” while the autoscale server 200 was stopped, it sets the autoscale information update flag (not shown in FIG. 13) to “True”.

更にその後、オートスケールサーバ２００が復旧した場合を考える（ステップＳＴ１３）。例えば、監視サーバ１００は、オートスケールサーバ２００に対するＶＭ状態の定期的な問い合わせに対してオートスケールサーバ２００からの応答が再開されたことを検知することで、オートスケールサーバ２００の起動を検知する。当該応答は、仮想マシンＶＭ４が「正常」（ただし、実際の状態とは異なる）である旨を含む。監視サーバ１００は、オートスケールサーバ２００からＶＭ状態の応答を受け付けると、仮想マシンＶＭ４の停止がスケールインによる停止ではないことを検知し、仮想マシンＶＭ４の異常をシステム管理者に通知する。 After that, consider the case where the autoscale server 200 is restored (step ST13). For example, the monitoring server 100 detects activation of the autoscale server 200 by detecting that a response from the autoscale server 200 has been resumed in response to a periodic inquiry about the VM state to the autoscale server 200. The response includes that the virtual machine VM4 is “normal” (however, different from the actual state). When the monitoring server 100 receives the response of the VM state from the autoscale server 200, the monitoring server 100 detects that the stop of the virtual machine VM4 is not stop due to the scale-in, and notifies the system administrator of the abnormality of the virtual machine VM4.

そして、監視サーバ１００は、ＶＭ管理テーブル１２３に基づいて、オートスケールサーバ２００が停止している間に仮想マシンＶＭ４との通信不可を検知したことを、オートスケールサーバ２００に通知する。オートスケールサーバ２００は、当該通知に応じて、ＶＭテーブル２１２の仮想マシンＶＭ４のＶＭ状態を「ＥＲＲＯＲ」に変更することで、ＶＭテーブル２１２をＶＭテーブル２１４に更新する。そして、オートスケールサーバ２００は、ＶＭテーブル２１４により各仮想マシンのオートスケール制御を再開する。 Then, based on the VM management table 123, the monitoring server 100 notifies the autoscale server 200 that it detected the inability to communicate with the virtual machine VM4 while the autoscale server 200 was stopped. In response to the notification, the autoscale server 200 changes the VM state of the virtual machine VM4 in the VM table 212 to “ERROR”, thereby updating the VM table 212 to the VM table 214. The autoscale server 200 then resumes autoscale control of each virtual machine using the VM table 214.

なお、監視サーバ１００は、オートスケールサーバ２００の起動を検知したタイミングではなく、オートスケールサーバ２００から仮想マシンＶＭ４のＶＭ状態として「ＥＲＲＯＲ」を取得したタイミングで仮想マシンＶＭ４の異常を検知し、システム管理者に通知してもよい。 Note that the monitoring server 100 may detect the abnormality of the virtual machine VM4 and notify the system administrator not at the timing when it detects the activation of the autoscale server 200, but at the timing when it acquires “ERROR” as the VM state of the virtual machine VM4 from the autoscale server 200.
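The recovery in step ST13 amounts to the autoscale server merging the monitoring server's notification into its stale VM table before resuming control. A minimal Python sketch of that merge follows. The function name `apply_monitor_notification` and the dictionary representation of the VM table are assumptions for illustration only. The state strings follow the example of FIG. 13.

```python
def apply_monitor_notification(vm_states, unreachable_vms):
    """Return a new VM table with every reported VM marked "ERROR".

    vm_states: dict of VM name -> VM state (stand-in for VM table 212)
    unreachable_vms: names reported by the monitoring server as having
        become unreachable while the autoscale server was stopped
    """
    updated = dict(vm_states)  # leave the original table untouched
    for name in unreachable_vms:
        if name in updated:    # ignore VMs the autoscale server does not manage
            updated[name] = "ERROR"
    return updated


# State persisted by the autoscale server before it stopped (ST11, ST12).
vm_table_212 = {
    "VM1": "normal",
    "VM2": "scale-in degenerate",
    "VM3": "normal",
    "VM4": "normal",  # stale: VM4 actually went down during the outage
}

# ST13: the monitoring server reports VM4, yielding the updated table.
vm_table_214 = apply_monitor_notification(vm_table_212, ["VM4"])
```

With the merged table, the autoscale server sees VM4 as failed rather than healthy when it resumes autoscale control.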

次に、監視の比較例を説明する。
図１４は、監視の比較例を示す図である。
比較例では、仮想マシンを監視する監視サーバ７００と、仮想マシンに対するオートスケール制御を行うオートスケールサーバ８００とを含むシステムを考える。ただし、監視サーバ７００は、オートスケールサーバ８００と連携する機能を有していない。 Next, a comparative example of monitoring will be described.
FIG. 14 is a diagram illustrating a comparative example of monitoring.
In the comparative example, a system including a monitoring server 700 that monitors a virtual machine and an autoscale server 800 that performs autoscale control on the virtual machine is considered. However, the monitoring server 700 does not have a function to cooperate with the autoscale server 800.

監視サーバ７００は、各仮想マシンの死活監視の状況を管理するＶＭ監視テーブル７０１を記憶する。ＶＭ監視テーブル７０１には、ＶＭ名とＶＭ動作フラグとが記録される。ＶＭ動作フラグは、「Ｔｒｕｅ」が動作中、「Ｆａｌｓｅ」が停止を示す。 The monitoring server 700 stores a VM monitoring table 701 that manages the alive monitoring status of each virtual machine. In the VM monitoring table 701, a VM name and a VM operation flag are recorded. The VM operation flag indicates that “True” is operating and “False” is stopped.

オートスケールサーバ８００は、各仮想マシンの状態を管理するＶＭ状態テーブル８０１を記憶する。ＶＭ状態テーブル８０１には、ＶＭ名とＶＭ状態とが記録される。
まず、オートスケールサーバ８００が稼働中の場合を考える（ステップＳＴ２１）。ＶＭ監視テーブル７０１によれば、この段階において、仮想マシンＶＭ１のＶＭ動作中フラグは「Ｔｒｕｅ」である。仮想マシンＶＭ２のＶＭ動作中フラグは「Ｆａｌｓｅ」である。仮想マシンＶＭ３のＶＭ動作中フラグは「Ｔｒｕｅ」である。仮想マシンＶＭ４のＶＭ動作中フラグは「Ｔｒｕｅ」である。一方、ＶＭ状態テーブル８０１によれば、仮想マシンＶＭ１のＶＭ状態は「正常」である。仮想マシンＶＭ２のＶＭ状態は「スケールイン縮退」である。仮想マシンＶＭ３のＶＭ状態は「正常」である。仮想マシンＶＭ４のＶＭ状態は「正常」である。ＶＭ監視テーブル７０１で、仮想マシンＶＭ２のＶＭ動作中フラグが「Ｆａｌｓｅ」なので、監視サーバ７００は、オートスケールサーバ８００に仮想マシンＶＭ２のＶＭ状態を問い合わせる。オートスケールサーバ８００は、ＶＭ状態テーブル８０１に基づいて、仮想マシンＶＭ２のＶＭ状態「スケールイン縮退」を監視サーバ７００に応答する。この場合、監視サーバ７００は、仮想マシンＶＭ２から監視情報を取得できなかったことを異常とみなさない。 The autoscale server 800 stores a VM state table 801 that manages the state of each virtual machine. In the VM state table 801, a VM name and a VM state are recorded.
First, consider the case where the autoscale server 800 is in operation (step ST21). According to the VM monitoring table 701, the VM operating flag of the virtual machine VM1 is “True” at this stage. The VM operating flag of the virtual machine VM2 is “False”. The VM operating flag of the virtual machine VM3 is “True”. The VM operating flag of the virtual machine VM4 is “True”. On the other hand, according to the VM state table 801, the VM state of the virtual machine VM1 is “normal”. The VM state of the virtual machine VM2 is “scale-in degeneration”. The VM state of the virtual machine VM3 is “normal”. The VM state of the virtual machine VM4 is “normal”. Since the VM operating flag of the virtual machine VM2 is “False” in the VM monitoring table 701, the monitoring server 700 inquires of the autoscale server 800 about the VM state of the virtual machine VM2. The autoscale server 800 responds to the monitoring server 700 with the VM state “scale-in degeneration” of the virtual machine VM2 based on the VM state table 801. In this case, the monitoring server 700 does not regard the failure to acquire monitoring information from the virtual machine VM2 as an abnormality.

その後、オートスケールサーバ８００が停止した場合を考える（ステップＳＴ２２）。ＶＭ状態テーブル８０１は、オートスケールサーバ８００が停止している間も、オートスケールサーバ８００の不揮発性の記憶装置（例えば、ＨＤＤ）に保持されている。監視サーバ７００は、オートスケールサーバ８００に対するＶＭ状態の定期的な問い合わせに対して、オートスケールサーバ８００からの応答がないことを検知することで、オートスケールサーバ８００が停止したことを検知する。 Then, consider a case where the autoscale server 800 is stopped (step ST22). The VM state table 801 is held in a non-volatile storage device (for example, HDD) of the autoscale server 800 even while the autoscale server 800 is stopped. The monitoring server 700 detects that the autoscale server 800 has stopped by detecting that there is no response from the autoscale server 800 in response to a periodic inquiry about the VM state to the autoscale server 800.

監視サーバ７００は、仮想マシンＶＭ４との通信不可を検知する。すると、監視サーバ７００は、ＶＭ監視テーブル７０１において、仮想マシンＶＭ４のＶＭ動作中フラグを「Ｆａｌｓｅ」に変更することで、ＶＭ監視テーブル７０１をＶＭ監視テーブル７０２に更新する。 The monitoring server 700 detects that communication with the virtual machine VM4 is not possible. Then, the monitoring server 700 updates the VM monitoring table 701 to the VM monitoring table 702 by changing the VM operating flag of the virtual machine VM4 to “False” in the VM monitoring table 701.

更にその後、オートスケールサーバ８００が復旧した場合を考える（ステップＳＴ２３）。例えば、監視サーバ７００は、オートスケールサーバ８００に対するＶＭ状態の定期的な問い合わせに対してオートスケールサーバ８００からの応答が再開されたことを検知することで、オートスケールサーバ８００の起動を検知する。監視サーバ７００は、ＶＭ状態の応答に基づいて、仮想マシンＶＭ４のＶＭ状態が「正常」（ただし、実際の状態とは異なる）であり、スケールインによる停止ではないことを検知すると、システム管理者に仮想マシンＶＭ４の異常を通知する。 After that, consider the case where the autoscale server 800 is restored (step ST23). For example, the monitoring server 700 detects the activation of the autoscale server 800 by detecting that the autoscale server 800 has resumed responding to the periodic inquiries about the VM state. When the monitoring server 700 detects, based on the VM state response, that the VM state of the virtual machine VM4 is “normal” (which differs from the actual state) and that the virtual machine was therefore not stopped by scale-in, it notifies the system administrator of the abnormality of the virtual machine VM4.

オートスケールサーバ８００は、ＶＭ状態テーブル８０１によりオートスケール制御を再開する。ＶＭ状態テーブル８０１は、仮想マシンＶＭ４が「正常」として管理されている。このため、オートスケールサーバ８００は、仮想マシンＶＭ４が属するオートスケールグループに関してオートスケール制御を適切に行うことができない。また、オートスケールサーバ８００が仮想マシンＶＭ４の異常を検知するまでに、１０分から数十分かかることもある。この間、ユーザが利用するアプリケーションなどの処理負荷が高まると、適切なスケールアウトを行えず、当該処理に遅延が生じるおそれがある。 The autoscale server 800 resumes autoscale control using the VM state table 801. In the VM state table 801, the virtual machine VM4 is managed as “normal”. For this reason, the autoscale server 800 cannot appropriately perform autoscale control for the autoscale group to which the virtual machine VM4 belongs. In addition, it may take 10 minutes to several tens of minutes for the autoscale server 800 to detect an abnormality in the virtual machine VM4. During this time, if the processing load of the application used by the user increases, appropriate scale-out cannot be performed, and there is a risk that the processing will be delayed.

一方、第２の実施の形態のクラスタシステムによれば、監視サーバ１００とオートスケールサーバ２００とを連携させ、オートスケールサーバ２００が起動すると、監視サーバ１００により最新の仮想マシンの情報をオートスケールサーバ２００に提供する。このため、オートスケールサーバ２００は、最新の仮想マシンの情報で復旧し、オートスケール制御を再開することができる。このため、オートスケールサーバ２００が停止している間に停止した仮想マシンを、オートスケールサーバ２００に適切に把握させ、オートスケール制御を適切に再開させることができる。その結果、ユーザが利用するアプリケーションの処理への影響を抑えられる。 On the other hand, according to the cluster system of the second embodiment, the monitoring server 100 and the autoscale server 200 cooperate, and when the autoscale server 200 starts up, the monitoring server 100 provides it with the latest virtual machine information. The autoscale server 200 can therefore recover with the latest virtual machine information and resume autoscale control. As a result, the autoscale server 200 can correctly recognize any virtual machine that stopped while the autoscale server 200 itself was stopped, and autoscale control can be resumed appropriately. This suppresses the influence on the processing of the applications used by users.

［第３の実施の形態］
次に、第３の実施の形態を説明する。前述の第２の実施の形態と相違する事項を主に説明し、共通する事項の説明を省略する。 [Third Embodiment]
Next, a third embodiment will be described. Items that differ from the second embodiment described above will be mainly described, and descriptions of common items will be omitted.

第２の実施の形態の例では、オートスケールサーバ２００によるオートスケールの制御対象の仮想マシンと、監視サーバ１００による監視対象の仮想マシンとが一致していたが、監視サーバ１００は、オートスケールの制御対象以外の仮想マシンの監視も行える。 In the example of the second embodiment, the virtual machines subject to autoscale control by the autoscale server 200 coincided with the virtual machines monitored by the monitoring server 100. However, the monitoring server 100 can also monitor virtual machines that are not subject to autoscale control.

図１５は、第３の実施の形態の仮想マシンの例を示す図である。
例えば、物理サーバ３００が仮想マシン３１０，３２０を実行し、物理サーバ４００が仮想マシン４１０，４２０，４３０を実行することを考える。このうち、オートスケールサーバ２００によるオートスケールの制御対象は、仮想マシン３１０，３２０，４１０，４２０である。仮想マシン４３０は、オートスケールサーバ２００によるオートスケールの制御の対象外である。一方、監視サーバ１００による監視対象は、仮想マシン３１０，３２０，４１０，４２０，４３０である。 FIG. 15 is a diagram illustrating an example of a virtual machine according to the third embodiment.
For example, consider that the physical server 300 executes the virtual machines 310 and 320, and the physical server 400 executes the virtual machines 410, 420, and 430. Among these, virtual machines 310, 320, 410, and 420 are objects to be controlled by the autoscale server 200. The virtual machine 430 is not subject to autoscale control by the autoscale server 200. On the other hand, the monitoring targets by the monitoring server 100 are virtual machines 310, 320, 410, 420, and 430.

このように、オートスケールサーバ２００によるオートスケールの制御対象の仮想マシンの範囲と、監視サーバ１００による監視対象の仮想マシンの範囲とは一致していなくてもよい。監視サーバ１００は、オートスケールの制御対象でない仮想マシンに対する死活監視により、当該仮想マシンとの通信不可を検知すると、当該仮想マシンについてのオートスケール状況の確認を行わずに、当該仮想マシンの異常を検知し、システム管理者に通知する。監視サーバ１００は、監視対象の仮想マシンがオートスケールの制御対象であるか否かをＶＭ管理テーブルにより管理する。 As described above, the range of virtual machines subject to autoscale control by the autoscale server 200 need not match the range of virtual machines monitored by the monitoring server 100. When the monitoring server 100 detects, through alive monitoring of a virtual machine that is not subject to autoscale control, that communication with the virtual machine is not possible, it detects an abnormality of that virtual machine and notifies the system administrator without checking the autoscale status of the virtual machine. The monitoring server 100 uses the VM management table to manage whether or not each monitored virtual machine is subject to autoscale control.

図１６は、ＶＭ管理テーブルの例を示す図である。
ＶＭ管理テーブル１２４は、記憶部１２０に格納される。ＶＭ管理テーブル１２４は、オートスケール可否フラグ、ＶＭ名、通信用ＩＰアドレス、オートスケールＶＭ動作中フラグおよびオートスケール情報更新フラグの項目を含む。 FIG. 16 is a diagram illustrating an example of the VM management table.
The VM management table 124 is stored in the storage unit 120. The VM management table 124 includes items of an auto scale availability flag, a VM name, a communication IP address, an auto scale VM operating flag, and an auto scale information update flag.

オートスケール可否フラグの項目には、該当の仮想マシンがオートスケール制御の対象であるか否かを示す情報が登録される。該当の仮想マシンがオートスケール制御の対象の場合、オートスケール可否フラグは「対象」である。該当の仮想マシンがオートスケール制御の対象外の場合、オートスケール可否フラグは「対象外」である。 In the autoscale availability flag item, information indicating whether or not the corresponding virtual machine is subject to autoscale control is registered. When the corresponding virtual machine is subject to autoscale control, the autoscale availability flag is “target”. When the corresponding virtual machine is not subject to autoscale control, the autoscale availability flag is “not applicable”.

ＶＭ名、通信用ＩＰアドレス、オートスケールＶＭ動作中フラグおよびオートスケール情報更新フラグの項目に登録される情報は、ＶＭ管理テーブル１２２における同名の項目に登録される情報と同様である。ただし、オートスケールＶＭ動作中フラグおよびオートスケール情報更新フラグの項目は、オートスケール可否フラグが「対象外」の場合、設定なし（図では設定なしをハイフン記号「−」で示す）となる。 The information registered in the items of the VM name, the communication IP address, the autoscale VM operating flag, and the autoscale information update flag is the same as the information registered in the item of the same name in the VM management table 122. However, the items of the autoscale VM operating flag and the autoscale information update flag are not set when the autoscale availability flag is “not applicable” (in the figure, “not set” is indicated by a hyphen symbol “−”).

For example, the VM management table 124 contains a record in which the autoscale availability flag is “non-target”, the VM name is “VMnormal”, the communication IP address is “110.10.1.1”, the autoscale VM operating flag is not set (“−”), and the autoscale information update flag is not set (“−”). This record indicates that the virtual machine with the VM name “VMnormal” is not subject to autoscale control and that the IP address of the virtual machine is “110.10.1.1”.

As another example, the VM management table 124 contains a record in which the autoscale availability flag is “target”, the VM name is “Grp1_VM1”, the communication IP address is “100.10.99.1”, the autoscale VM operating flag is “True”, and the autoscale information update flag is “False”. This record indicates that the virtual machine with the VM name “Grp1_VM1” is subject to autoscale control and that its communication IP address is “100.10.99.1”. It further indicates that the virtual machine is running and that the autoscale VM operating flag has not been updated while the autoscale server 200 was stopped.
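The two example records above can be sketched as a small data structure. This is only an illustrative sketch: the class name `VMRecord`, its field names, and the use of `None` to represent “not set” (“−”) are assumptions made for the example and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VMRecord:
    """One row of a table shaped like the VM management table 124.

    All names here are hypothetical; None stands for "not set" (the hyphen in FIG. 16).
    """
    autoscale_target: bool                 # autoscale availability flag: True = "target"
    vm_name: str                           # VM name
    ip_address: str                        # communication IP address
    running: Optional[bool] = None         # autoscale VM operating flag
    info_updated: Optional[bool] = None    # autoscale information update flag

    def __post_init__(self) -> None:
        # For a non-target VM, both autoscale flags stay unset.
        if not self.autoscale_target and (
            self.running is not None or self.info_updated is not None
        ):
            raise ValueError("autoscale flags must be unset for a non-target VM")


# The two example records described in the text, keyed by VM name.
vm_table = {
    record.vm_name: record
    for record in (
        VMRecord(False, "VMnormal", "110.10.1.1"),
        VMRecord(True, "Grp1_VM1", "100.10.99.1", running=True, info_updated=False),
    )
}
```

Keying the table by VM name is a convenience of the sketch; the specification only states which items each record contains.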

Next, the VM monitoring procedure performed by the VM monitoring unit 131 using the VM management table 124 will be described. In the third embodiment, the VM monitoring unit 131 executes the following procedure instead of the VM monitoring procedure described with reference to FIG. 11.

FIG. 17 is a flowchart illustrating an example of VM monitoring.
The VM monitoring unit 131 periodically executes the following processing. The execution period is determined according to operational requirements. The period may be from several seconds to several tens of seconds, or from one minute to several minutes.

(S30) The VM monitoring unit 131 collects monitoring information of the virtual machines to be monitored (monitoring target VMs). For example, the VM monitoring unit 131 collects monitoring information by receiving a predetermined alive monitoring packet from each monitoring target VM.

(S31) The VM monitoring unit 131 updates the operation status of the monitoring target VMs. Specifically, the VM monitoring unit 131 updates the VM management table 124 based on the result of collecting monitoring information in step S30. That is, the VM monitoring unit 131 sets the autoscale VM operating flag to “True” for each virtual machine from which monitoring information was collected (from which an alive monitoring packet was received). If the flag is already “True”, it is left unchanged.

(S32) The VM monitoring unit 131 determines whether there is a monitoring target VM whose monitoring information has not arrived within a predetermined time since the previous collection of monitoring information. The predetermined time is the monitoring period, or the monitoring period plus a relatively short time (a time shorter than the period). If there is a monitoring target VM whose monitoring information has not arrived within the predetermined time, the process proceeds to step S33. If there is no such monitoring target VM, the process proceeds to step S38.

(S33) The VM monitoring unit 131 refers to the VM management table 124 and determines whether the autoscale availability flag of the monitoring target VM whose monitoring information has not arrived within the predetermined time since the previous collection is “target”. If the flag is “target”, the process proceeds to step S35. If the flag is not “target” (that is, if it is “non-target”), the process proceeds to step S34.

(S34) The VM monitoring unit 131 notifies the system administrator of the abnormality of the virtual machine. For example, the VM monitoring unit 131 may display an image indicating the abnormality on the display 111. Alternatively, the VM monitoring unit 131 may transmit a message indicating the abnormality to a terminal device used by the system administrator. The process then proceeds to step S38.

(S35) For the monitoring target VM determined in step S32 not to have delivered monitoring information within the predetermined time since the previous collection, the VM monitoring unit 131 sets the autoscale VM operating flag in the VM management table 124 to “False”. If the flag is already “False”, it is left unchanged.

(S36) The VM monitoring unit 131 refers to the autoscale server management table 121 and determines whether the operating flag is “True”. If the operating flag is “True”, the process proceeds to step S38. If the operating flag is “False”, the process proceeds to step S37.

(S37) For the monitoring target VM whose autoscale VM operating flag was set to “False” in step S35, the VM monitoring unit 131 sets the autoscale information update flag in the VM management table 124 to “True”.

(S38) The VM monitoring unit 131 determines whether to continue monitoring. If monitoring is to continue, the VM monitoring unit 131 waits for one monitoring period and the process returns to step S30. If monitoring is not to continue, the VM monitoring process ends. For example, the VM monitoring unit 131 determines not to continue monitoring when it receives an input from the system administrator to end monitoring, and determines to continue monitoring otherwise.
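One pass through steps S30 to S37 can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: the names `VM`, `monitor_once`, `responded`, `autoscale_server_up`, and `notify_admin` are hypothetical, packet reception (S30) is abstracted into the set of VM names whose alive monitoring packet arrived within the predetermined time, and the flowchart's branches are applied to each VM in turn.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set


@dataclass
class VM:
    """Hypothetical per-VM state; None means the flag is not set."""
    autoscale_target: bool               # autoscale availability flag
    running: Optional[bool] = None       # autoscale VM operating flag
    info_updated: Optional[bool] = None  # autoscale information update flag


def monitor_once(
    table: Dict[str, VM],
    responded: Set[str],
    autoscale_server_up: bool,
    notify_admin: Callable[[str], None],
) -> None:
    """Apply S31 to S37 once, given the result of S30 in `responded`."""
    for name, vm in table.items():
        if name in responded:
            # S31: monitoring information was collected, so mark the VM as running.
            # Assumption: non-target VMs keep their flags unset.
            if vm.autoscale_target:
                vm.running = True
            continue
        # S32: no monitoring information arrived within the predetermined time.
        if not vm.autoscale_target:
            # S33 -> S34: report the abnormality without asking the autoscale server.
            notify_admin(name)
            continue
        # S35: record that the autoscale target VM is not running.
        vm.running = False
        # S36 -> S37: if the autoscale server is stopped, flag the change so that
        # it can be handed over when the autoscale server recovers.
        if not autoscale_server_up:
            vm.info_updated = True
```

A caller would invoke `monitor_once` once per monitoring period and stop when the administrator requests it, which corresponds to step S38. The value of `autoscale_server_up` stands in for the operating flag in the autoscale server management table 121.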

In the third embodiment as well, the AS server cooperation unit 132 cooperates with the autoscale server 200 according to the autoscale server monitoring procedure of FIG. 12.
As a result, when the autoscale server 200 stops and then recovers, the autoscale server 200 can be restored based on the latest information on the virtual machines.

Furthermore, based on the autoscale availability flag in the VM management table 124, the monitoring server 100 can monitor virtual machines that are subject to autoscale control separately from virtual machines that are not. For a virtual machine that is not subject to autoscale control, the monitoring server 100 can skip the autoscale inquiry to the autoscale server 200 and promptly report the abnormality of the virtual machine.

The information processing of the first embodiment can be realized by causing the processing unit 12 to execute a program. The information processing of the second and third embodiments can be realized by causing the CPU 101 to execute a program. The program can be recorded on a computer-readable recording medium 113.

For example, the program can be distributed by distributing the recording medium 113 on which the program is recorded. Alternatively, the program may be stored on another computer and distributed via a network. A computer may, for example, store (install) the program recorded on the recording medium 113, or the program received from another computer, in a storage device such as the RAM 102 or the HDD 103, and then read the program from the storage device and execute it.

DESCRIPTION OF SYMBOLS
1 Cluster system
10 Autoscale server monitoring device
11, 21 Storage unit
12, 22 Processing unit
20 Autoscale server
30, 40 Physical server
31, 32, 41, 42 Virtual machine
50 Network
61, 62, 71, 72 Table

Claims

A cluster system comprising:
a physical server capable of executing a plurality of virtual machines;
an autoscale server that performs scale-in and scale-out of the virtual machines on the physical server; and
an autoscale server monitoring device that periodically communicates with the autoscale server, stores information on the virtual machines managed by the autoscale server, and, upon detecting that the autoscale server has stopped, transmits the information on the virtual machines in response to a request from the autoscale server.

The cluster system according to claim 1, wherein the autoscale server monitoring device detects, while the autoscale server is stopped, that communication with a virtual machine is not possible, and, when the autoscale server starts, transmits to the autoscale server information indicating that communication with the virtual machine is not possible.

The cluster system according to claim 1, wherein the autoscale server stores state information indicating the state of the virtual machines, updates the state information in accordance with the information on the virtual machines transmitted by the autoscale server monitoring device, and resumes control of the scale-in and the scale-out based on the updated state information.

The cluster system according to claim 1, wherein, upon detecting that communication with a virtual machine is not possible, the autoscale server monitoring device detects an abnormality of the virtual machine according to the response to an inquiry to the autoscale server as to whether the virtual machine has been stopped by the scale-in.

The cluster system according to claim 4, wherein the plurality of virtual machines include another virtual machine that is outside the scope of the scale-in and scale-out control by the autoscale server, and
wherein, upon detecting that communication with the other virtual machine is not possible, the autoscale server monitoring device detects an abnormality of the other virtual machine without making the inquiry to the autoscale server.

An autoscale server monitoring device comprising:
a storage unit that stores information on virtual machines managed by an autoscale server; and
a processing unit that periodically communicates with the autoscale server and, upon detecting that the autoscale server has stopped, transmits the information on the virtual machines in response to a request from the autoscale server.

An autoscale server monitoring program that causes a computer to execute a process comprising:
periodically communicating with an autoscale server and storing information on virtual machines managed by the autoscale server; and
upon detecting that the autoscale server has stopped, transmitting the information on the virtual machines in response to a request from the autoscale server.

An autoscale server monitoring method in which a computer:
periodically communicates with an autoscale server and stores information on virtual machines managed by the autoscale server; and
upon detecting that the autoscale server has stopped, transmits the information on the virtual machines in response to a request from the autoscale server.