JP7044971B2

JP7044971B2 - Cluster system, autoscale server monitoring device, autoscale server monitoring program and autoscale server monitoring method

Info

Publication number: JP7044971B2
Application number: JP2018077371A
Authority: JP
Inventors: 雅彦谷川; 健一郎下川
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2018-04-13
Filing date: 2018-04-13
Publication date: 2022-03-31
Anticipated expiration: 2038-04-13
Also published as: JP2019185511A

Description

The present invention relates to a cluster system, an autoscale server monitoring device, an autoscale server monitoring program, and an autoscale server monitoring method.

In the field of information processing, virtualization technology is used to run a plurality of virtual computers (sometimes called virtual machines or virtual hosts) on a physical computer (sometimes called a physical machine or physical host). Software such as an OS (Operating System) can be executed on each virtual machine. A physical machine that uses virtualization technology runs software for managing the plurality of virtual machines. For example, software called a hypervisor may allocate the processing power of a CPU (Central Processing Unit) and the storage area of a RAM (Random Access Memory) to the virtual machines as computing resources.

In an information processing system, the number of machines (virtual machines or physical machines) that perform computation may be increased or decreased. Increasing the number of machines is called scale-out, and reducing it is called scale-in. Technologies that support the operation of systems performing scale-out and scale-in have been considered.

For example, a failure monitoring device has been proposed that prevents the normal operation of a standby server used for scaling, under automatic scale-out and automatic scale-in, from being erroneously reported as a failure. The failure monitoring device stores server usage information, which indicates whether each monitored server operates at all times or only at scale-out, and operating status information, which indicates whether each server is on standby or in operation. For an event detected by the monitoring system, the failure monitoring device checks the server usage information and the operating status information of the server from which the event originated, and thereby determines whether the event was caused by a failure or by automatic scale-out or automatic scale-in.

There is also a proposal for an infrastructure operation management system that avoids the loss of logs and makes them available for monitoring in an information processing system built from virtual servers whose number is automatically increased or decreased by an autoscale function in a cloud environment. In this infrastructure operation management system, a virtual server subject to the autoscale function transfers those of its logs that require real-time monitoring to a virtual server that is not subject to the autoscale function.

Japanese Unexamined Patent Publication No. 2011-253231
Japanese Unexamined Patent Publication No. 2015-184879

As described above, an autoscale server having an autoscale function can be used to automatically increase or decrease the number of virtual machines belonging to a system. However, the autoscale server may stop because of a failure or the like. While the autoscale server is stopped, the running state of a virtual machine may change (for example, a virtual machine that was running may stop because of a failure). Then, when the autoscale server is restored, the information it holds about the operating state of the virtual machines may be inconsistent with their actual operating state. Such an inconsistency can prevent the autoscale server from executing the autoscale function properly after recovery.

In one aspect, an object of the present invention is to provide a cluster system, an autoscale server monitoring device, an autoscale server monitoring program, and an autoscale server monitoring method that allow an autoscale server that has stopped to be restored based on the latest information on the virtual machines.

In one aspect, a cluster system is provided. The cluster system has a physical server, an autoscale server, and an autoscale server monitoring device. The physical server can run a plurality of virtual machines. The autoscale server performs scale-in and scale-out of the virtual machines on the physical server. The autoscale server monitoring device periodically communicates with the autoscale server, stores information on the virtual machines managed by the autoscale server, and, after detecting that the autoscale server has stopped, transmits the virtual machine information in response to a request from the autoscale server that has started from the stopped state.

In another aspect, an autoscale server monitoring device is provided. The autoscale server monitoring device has a storage unit and a processing unit. The storage unit stores information on the virtual machines managed by the autoscale server. The processing unit periodically communicates with the autoscale server and, after detecting that the autoscale server has stopped, transmits the virtual machine information in response to a request from the autoscale server that has started from the stopped state.

In another aspect, an autoscale server monitoring program is provided.
In another aspect, an autoscale server monitoring method is provided.

In one aspect, when the autoscale server stops and is then restored, it can be restored based on the latest information on the virtual machines.

A diagram showing the cluster system of the first embodiment.
A diagram showing an example of the cluster system of the second embodiment.
A block diagram showing a hardware example of the monitoring server.
A diagram showing an example of scale-out and scale-in.
A block diagram showing a functional example of the cluster system.
A diagram showing an example of the autoscale server management table.
A diagram showing an example of the VM management table.
A diagram showing an example of the autoscale group table.
A diagram showing an example of the VM table.
A diagram showing an example of the autoscale policy table.
A flowchart showing an example of VM monitoring.
A flowchart showing an example of autoscale server monitoring.
A diagram showing an example of monitoring by the monitoring server.
A diagram showing a comparative example of monitoring.
A diagram showing an example of the virtual machines of the third embodiment.
A diagram showing an example of the VM management table.
A flowchart showing an example of VM monitoring.

Hereinafter, the embodiments will be described with reference to the drawings.
[First Embodiment]
The first embodiment will be described.

FIG. 1 is a diagram showing the cluster system of the first embodiment.
The cluster system 1 includes an autoscale server monitoring device 10, an autoscale server 20, and physical servers 30 and 40. The autoscale server monitoring device 10, the autoscale server 20, and the physical servers 30 and 40 are connected to a network 50.

The physical servers 30 and 40 can each run a plurality of virtual machines. For example, the physical server 30 can run virtual machines 31 and 32, and the physical server 40 can run virtual machines 41 and 42. The virtual machines can be scaled out and scaled in.

The autoscale server 20 collects the load of each virtual machine and controls scale-out and scale-in of the virtual machines based on those loads. For example, if the load of the virtual machine 31 remains above a first threshold while the virtual machine 32 is stopped, the autoscale server 20 starts the virtual machine 32 on the physical server 30 so that the load is distributed to the virtual machine 32 as well as the virtual machine 31. Conversely, if the load of the virtual machines 31 and 32 (the average load or the load of either one) falls below a second threshold (second threshold < first threshold) while both are running, the autoscale server 20 stops the virtual machine 32 to reduce resource usage. The autoscale server 20 controls scale-out and scale-in of the virtual machines 41 and 42 on the physical server 40 in the same way. The group of virtual machines whose load is evaluated is determined according to the operation (for example, the virtual machine 32 may be started according to the load of the virtual machines 31, 41, and 42).
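The threshold control described above can be illustrated with a minimal sketch. It is not taken from the specification: the names `ScalePolicy` and `decide_scaling`, the representation of load as a fraction between 0 and 1, and the use of a count of consecutive samples to express "remains above the first threshold" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ScalePolicy:
    """Hypothetical policy; field names are illustrative assumptions."""
    scale_out_threshold: float  # the "first threshold"
    scale_in_threshold: float   # the "second threshold" (must be lower)
    sustain_samples: int        # assumed meaning of "remains above"

    def __post_init__(self) -> None:
        # The text requires second threshold < first threshold.
        if self.scale_in_threshold >= self.scale_out_threshold:
            raise ValueError("scale-in threshold must be below scale-out threshold")


def decide_scaling(recent_loads: Sequence[float],
                   policy: ScalePolicy,
                   standby_stopped: bool) -> str:
    """Return "scale-out", "scale-in", or "none" for one group of VMs.

    recent_loads holds the group's load samples, oldest first.
    standby_stopped is True when the extra VM (e.g. VM 32) is stopped.
    """
    if not recent_loads:
        return "none"
    if standby_stopped:
        window = recent_loads[-policy.sustain_samples:]
        # Scale out only when the load stayed above the first threshold
        # for the whole window.
        if (len(window) == policy.sustain_samples
                and all(load > policy.scale_out_threshold for load in window)):
            return "scale-out"
        return "none"
    # Extra VM is running: scale in when the latest load drops below
    # the second threshold.
    if recent_loads[-1] < policy.scale_in_threshold:
        return "scale-in"
    return "none"
```

Keeping the second threshold below the first leaves a band in which no action is taken, which avoids starting and stopping the extra virtual machine repeatedly when the load hovers near a single value.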

The autoscale server monitoring device 10 monitors the autoscale server 20. It also monitors the virtual machines 31, 32, 41, and 42, which are subject to autoscaling. Specifically, the autoscale server monitoring device 10 performs liveness monitoring of each running virtual machine by communicating with it periodically. When the autoscale server monitoring device 10 detects an abnormality in any of the virtual machines, it notifies the user of the detection.

However, a virtual machine subject to autoscaling is started and stopped under the autoscale control of the autoscale server 20. Therefore, when the autoscale server monitoring device 10 detects an interruption of periodic communication with any of the virtual machines subject to autoscaling, it asks the autoscale server 20 whether that virtual machine was stopped by scale-in. If the interruption is due to scale-in, it is not an abnormality. If the interruption is not due to scale-in, it is regarded as an abnormality. The autoscale server 20 itself may also stop because of an abnormality or the like. The autoscale server monitoring device 10 therefore monitors the operating state of the autoscale server 20 and provides a function that supports its recovery.

The autoscale server monitoring device 10 has a storage unit 11 and a processing unit 12. The autoscale server 20 has a storage unit 21 and a processing unit 22.
The storage units 11 and 21 may be volatile storage devices such as a RAM, or non-volatile storage devices such as an HDD (Hard Disk Drive) or a flash memory. The processing units 12 and 22 may include a CPU, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array), and the like. The processing units 12 and 22 may be processors that execute programs. The term "processor" here may also cover a set of a plurality of processors (a multiprocessor).

The storage unit 11 stores information on the virtual machines 31, 32, 41, and 42 managed by the autoscale server 20. For example, the storage unit 11 stores a table 61. The table 61 shows the status of liveness monitoring (success or failure of periodic communication) performed by the autoscale server monitoring device 10 for each of the virtual machines 31, 32, 41, and 42. Here, the identification information of the virtual machine 31 is "VM1" (VM: Virtual Machine), that of the virtual machine 32 is "VM2", that of the virtual machine 41 is "VM3", and that of the virtual machine 42 is "VM4". In the table 61, "ON" indicates that periodic communication with the virtual machine succeeded (for example, at the most recent periodic communication timing), and "OFF" indicates that it did not.

The storage unit 21 also stores information indicating the states of the virtual machines 31, 32, 41, and 42. For example, the storage unit 21 stores a table 71. The table 71 is stored in a non-volatile storage area of the storage unit 21. The table 71 is information used for autoscale control and holds state information indicating the state of each of the virtual machines 31, 32, 41, and 42. For example, "normal" indicates that the virtual machine is operating normally, "scale-in" indicates that it has been stopped by scale-in, and "error" indicates that it has stopped because of an abnormality. The processing unit 22 updates the state of each virtual machine in the table 71 according to the collected operating states of the virtual machines 31, 32, 41, and 42 and the results of autoscaling.

The processing unit 12 may acquire information indicating the virtual machines subject to autoscaling from the autoscale server 20, generate the table 61 from it, and determine the virtual machines to be subjected to liveness monitoring by the autoscale server monitoring device 10.

The processing unit 12 periodically communicates with the autoscale server 20 and, after detecting that the autoscale server 20 has stopped, transmits the virtual machine information to the autoscale server 20 in response to a request from the autoscale server 20.

First, consider the case where the autoscale server 20 is running (step ST1). Suppose that the processing unit 12 was able to perform periodic communication with the virtual machines 31, 41, and 42 but not with the virtual machine 32 (communication became impossible). The processing unit 12 records "ON" for the virtual machines 31, 41, and 42 ("VM1", "VM3", "VM4") and "OFF" for the virtual machine 32 ("VM2") in the table 61. The processing unit 12 then asks the autoscale server 20 about the autoscale status of the virtual machine 32.

At this time, as shown in the table 71, the autoscale server 20 manages the virtual machines 31, 41, and 42 as "normal" and the virtual machine 32 as "scale-in". That is, the virtual machine 32 has been stopped by scale-in. The processing unit 22 therefore responds to the autoscale server monitoring device 10 that the virtual machine 32 has been stopped by scale-in.

The processing unit 12 receives the response from the autoscale server 20 and learns from it that the virtual machine 32 has been stopped by scale-in. The processing unit 12 therefore does not regard the loss of communication with the virtual machine 32 (the interruption of periodic communication) as an abnormality. The processing unit 12 continues liveness monitoring of the virtual machines 31, 41, and 42.
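The inquiry in step ST1 can be sketched as follows. This is an illustrative sketch under assumptions, not the claimed implementation: the function name `classify_interruption`, the callback `query_autoscale_state`, and the return labels are hypothetical, and returning "deferred" when the autoscale server gives no answer is an assumption about how an unanswered inquiry might be represented.

```python
from typing import Callable, Optional


def classify_interruption(
    vm_id: str,
    query_autoscale_state: Callable[[str], Optional[str]],
) -> str:
    """Classify an interruption of periodic communication with one VM.

    query_autoscale_state stands in for the inquiry to the autoscale
    server; it returns the state held for the VM ("normal", "scale-in",
    "error") or None when no answer is available (assumption).
    """
    state = query_autoscale_state(vm_id)
    if state is None:
        # No answer from the autoscale server: the judgement cannot be
        # made now (hypothetical label).
        return "deferred"
    if state == "scale-in":
        # The VM was stopped on purpose, so this is not an abnormality.
        return "expected"
    # Any other state means the stop was not caused by scale-in.
    return "anomaly"


# Example mirroring table 71 in step ST1.
table_71 = {"VM1": "normal", "VM2": "scale-in", "VM3": "normal", "VM4": "normal"}
result = classify_interruption("VM2", table_71.get)
```

In this example `result` is "expected", matching the text: the interruption for "VM2" is attributed to scale-in and is not reported to the user.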

Next, consider the case where the autoscale server 20 is stopped because of an abnormality or the like (step ST2). When periodic communication with the autoscale server 20 cannot be performed normally, the processing unit 12 detects that the autoscale server 20 is stopped. Even while the autoscale server 20 is stopped, the processing unit 12 periodically communicates with the running virtual machines 31, 41, and 42 and continues their liveness monitoring. Suppose the processing unit 12 then detects a loss of communication with the virtual machine 42 ("VM4") (an interruption of periodic communication). The processing unit 12 updates the table 61 to a table 62. Specifically, the processing unit 12 changes "VM4" from "ON" to "OFF".

Next, consider the case where the autoscale server 20 has recovered from the stopped state (step ST3). When the processing unit 12 receives a request from the autoscale server 20, it detects that the autoscale server 20 has started. The request from the autoscale server 20 may be a request for virtual machine information, or a predetermined request (or response) relating to periodic communication with the autoscale server monitoring device 10. Based on the table 62, the processing unit 12 then notifies the autoscale server 20 that an interruption of periodic communication with the virtual machine 42 was detected while the autoscale server 20 was stopped. This interruption occurred while the autoscale server 20 was stopped, so it cannot be due to scale-in of the virtual machine 42. Therefore, on receiving the notification of the interruption of periodic communication with the virtual machine 42 from the autoscale server monitoring device 10, the processing unit 22 updates the table 71 to a table 72. Specifically, the processing unit 22 changes "VM4" from "normal" to "error".
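Steps ST2 and ST3 on the monitoring side can be sketched as a small state holder. The class `Monitor`, its method names, and the shape of the report are hypothetical and chosen only for illustration; the specification does not prescribe a data format for the transmitted information.

```python
from typing import Dict, Iterable, List


class Monitor:
    """Illustrative stand-in for the autoscale server monitoring device."""

    def __init__(self, vm_ids: Iterable[str]) -> None:
        # Liveness table (tables 61 and 62 in the text): "ON" or "OFF" per VM.
        self.liveness: Dict[str, str] = {vm: "ON" for vm in vm_ids}
        self.autoscale_down = False
        # VMs whose periodic communication broke during the outage.
        self.lost_during_outage: List[str] = []

    def mark_autoscale_unreachable(self) -> None:
        """Periodic communication with the autoscale server failed (ST2)."""
        self.autoscale_down = True

    def record_vm_heartbeat(self, vm_id: str, ok: bool) -> None:
        """Record the result of one periodic communication with a VM."""
        previous = self.liveness[vm_id]
        self.liveness[vm_id] = "ON" if ok else "OFF"
        if self.autoscale_down and previous == "ON" and not ok:
            self.lost_during_outage.append(vm_id)

    def on_autoscale_request(self) -> dict:
        """Handle a request from the restarted autoscale server (ST3)."""
        self.autoscale_down = False
        report = {
            "liveness": dict(self.liveness),
            "lost_during_outage": list(self.lost_during_outage),
        }
        self.lost_during_outage.clear()
        return report


# Replay of the example: VM2 is already scaled in, then the autoscale
# server stops, then VM4 stops responding.
monitor = Monitor(["VM1", "VM2", "VM3", "VM4"])
monitor.record_vm_heartbeat("VM2", ok=False)   # ST1
monitor.mark_autoscale_unreachable()           # ST2
monitor.record_vm_heartbeat("VM4", ok=False)   # ST2
report = monitor.on_autoscale_request()        # ST3
```

Here `report["lost_during_outage"]` contains only "VM4", because "VM2" went "OFF" before the outage began and was already explained by scale-in.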

The processing unit 12 may learn from the autoscale server 20 that the virtual machine 42 is managed as "error" and notify the user of the abnormality of the virtual machine 42.
Note that in step ST3, the processing unit 12 may transmit the information on each virtual machine in the table 62 to the autoscale server 20. By collating the virtual machine information in the table 62 with the virtual machine information in the table 72, the processing unit 22 can determine which virtual machine has an abnormality. For example, the processing unit 22 may determine that a virtual machine that is "OFF" in the table 62 and "normal" in the table 71 is abnormal ("error"), and that the other virtual machines have no abnormality ("normal", "scale-in", and so on).
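The collation rule in the example above ("OFF" in the liveness table and "normal" in the state table means "error") can be written as a short sketch. The function name `reconcile` and the use of plain dictionaries for the tables are assumptions for illustration only.

```python
from typing import Dict


def reconcile(liveness: Dict[str, str], states: Dict[str, str]) -> Dict[str, str]:
    """Return an updated copy of the autoscale server's state table.

    liveness: VM id -> "ON" / "OFF" (as reported by the monitoring device)
    states:   VM id -> "normal" / "scale-in" / "error" (held before the stop)
    """
    updated = dict(states)  # leave the caller's table untouched
    for vm_id, live in liveness.items():
        # A VM believed to be running normally that no longer answers
        # must have stopped abnormally while the autoscale server was down.
        if live == "OFF" and states.get(vm_id) == "normal":
            updated[vm_id] = "error"
    return updated


table_62 = {"VM1": "ON", "VM2": "OFF", "VM3": "ON", "VM4": "OFF"}
table_71 = {"VM1": "normal", "VM2": "scale-in", "VM3": "normal", "VM4": "normal"}
new_states = reconcile(table_62, table_71)
```

With these inputs only "VM4" changes to "error"; "VM2" stays "scale-in" because its "OFF" status is already explained by the earlier scale-in.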

With the autoscale server monitoring device 10, periodic communication with the autoscale server 20 is performed and information on the virtual machines managed by the autoscale server 20 is stored. After it is detected that the autoscale server 20 has stopped, the virtual machine information is transmitted in response to a request from the autoscale server 20.

As a result, when the autoscale server 20 stops and is then restored, it can be restored based on the latest information on the virtual machines.
Consider the case where the function of the autoscale server monitoring device 10 is not used. In this case, even if the virtual machine 41 stops because of an abnormality or the like while the autoscale server 20 is stopped, the autoscale server 20 is unaware of that stop after it starts again. The autoscale server 20 then performs autoscale control of each virtual machine based on the table 71. That is, the virtual machine information managed by the autoscale server 20 is inconsistent with the actual operating status of the virtual machines. In this state, the autoscale server 20 cannot perform appropriate autoscale control for the virtual machines 41 and 42. It may also take a relatively long time (for example, from 10 minutes to several tens of minutes) for the autoscale server 20 to detect the stop of the virtual machine 42. If the load on the virtual machine 41 increases during this period, autoscale control may not be performed properly, which may affect the processing of applications and the like executed on the virtual machine 41.

Therefore, the autoscale server monitoring device 10 acquires information on the virtual machines while the autoscale server 20 is stopped and provides that information to the autoscale server 20 when the autoscale server 20 is restored. This allows the autoscale server 20 to be restored with the inconsistency between the virtual machine information it manages and the actual operating status of the virtual machines resolved. The autoscale server 20 can thus resume autoscale control normally immediately after recovery. As a result, the load of each virtual machine can be distributed appropriately by autoscale control, and the influence on the processing of applications and the like executed on each virtual machine can be suppressed.

In the example of the cluster system 1, the targets monitored by the autoscale server monitoring device 10 are the autoscale server 20 and the virtual machines subject to autoscaling by the autoscale server 20 (the virtual machines 31, 32, 41, and 42). However, the virtual machines monitored by the autoscale server monitoring device 10 are not limited to these. The autoscale server monitoring device 10 may perform liveness monitoring of both virtual machines that are subject to autoscaling and virtual machines that are not. When the autoscale server monitoring device 10 detects an interruption of periodic communication with a virtual machine that is not subject to autoscaling, it can skip the inquiry to the autoscale server 20 and notify the user that an abnormality has occurred in that virtual machine.

[Second Embodiment]
Next, the second embodiment will be described.
FIG. 2 is a diagram showing an example of the cluster system of the second embodiment.

The cluster system of the second embodiment is an information processing system that provides users with an environment for using virtual machines. The cluster system of the second embodiment has a monitoring server 100, an autoscale server 200, and physical servers 300 and 400.

The monitoring server 100, the autoscale server 200, and the physical servers 300 and 400 are connected to a network 60. The network 60 is, for example, a LAN (Local Area Network) installed in a data center or the like. The network 60 is connected to a network 70. The network 70 is, for example, the Internet or a WAN (Wide Area Network). User terminals 500 and 600 are connected to the network 70.

The monitoring server 100 is a server computer that monitors the autoscale server 200. The monitoring server 100 also monitors the virtual machines running on the physical servers 300 and 400. The monitoring server 100 is an example of the autoscale server monitoring device 10 of the first embodiment.

The autoscale server 200 is a server computer that performs autoscale (automatic scaling) control of the virtual machines running on the physical servers 300 and 400. The autoscale server 200 is an example of the autoscale server 20 of the first embodiment.

The physical servers 300 and 400 are server computers capable of running a plurality of virtual machines. For example, the physical server 300 runs software called a hypervisor and allocates its hardware resources, such as the CPU and RAM, to the virtual machines on the physical server 300. Similarly, the physical server 400 runs a hypervisor and allocates its hardware resources, such as the CPU and RAM, to the virtual machines on the physical server 400. The physical servers 300 and 400 are examples of the physical servers 30 and 40 of the first embodiment.

The user terminals 500 and 600 are client computers used by users. The user terminals 500 and 600 transmit processing requests to applications executed by the virtual machines on the physical servers 300 and 400, and receive the processing results from the virtual machines.

第２の実施の形態のクラスタシステムでは、ユーザにより円滑に仮想マシンを利用できるように、オートスケールサーバ２００による仮想マシンのオートスケール制御が行われる。ただし、オートスケールサーバ２００が、異常などによって停止することもある。そこで、監視サーバ１００により、オートスケールサーバ２００が停止した場合でも、オートスケール制御への影響を低減する機能を提供する。以下の説明では、仮想マシンを、ＶＭと略記することがある。また、オートスケールを、ＡＳ（Auto Scaling）と略記することがある。 In the cluster system of the second embodiment, the autoscale control of the virtual machine is performed by the autoscale server 200 so that the user can use the virtual machine smoothly. However, the autoscale server 200 may stop due to an abnormality or the like. Therefore, the monitoring server 100 provides a function of reducing the influence on the autoscale control even when the autoscale server 200 is stopped. In the following description, the virtual machine may be abbreviated as VM. Further, the auto scale may be abbreviated as AS (Auto Scaling).

図３は、監視サーバのハードウェア例を示すブロック図である。 FIG. 3 is a block diagram showing a hardware example of the monitoring server.
監視サーバ１００は、ＣＰＵ１０１、ＲＡＭ１０２、ＨＤＤ１０３、画像信号処理部１０４、入力信号処理部１０５、媒体リーダ１０６およびＮＩＣ（Network Interface Card）１０７を有する。なお、ＣＰＵ１０１は、第１の実施の形態の処理部１２に対応する。ＲＡＭ１０２またはＨＤＤ１０３は、第１の実施の形態の記憶部１１に対応する。 The monitoring server 100 includes a CPU 101, a RAM 102, an HDD 103, an image signal processing unit 104, an input signal processing unit 105, a medium reader 106, and a NIC (Network Interface Card) 107. The CPU 101 corresponds to the processing unit 12 of the first embodiment. The RAM 102 or the HDD 103 corresponds to the storage unit 11 of the first embodiment.

ＣＰＵ１０１は、プログラムの命令を実行するプロセッサである。ＣＰＵ１０１は、ＨＤＤ１０３に記憶されたプログラムやデータの少なくとも一部をＲＡＭ１０２にロードし、プログラムを実行する。なお、ＣＰＵ１０１は複数のプロセッサコアを含んでもよい。また、監視サーバ１００は複数のプロセッサを有してもよい。以下で説明する処理は複数のプロセッサまたはプロセッサコアを用いて並列に実行されてもよい。また、複数のプロセッサの集合を「マルチプロセッサ」または単に「プロセッサ」と言うことがある。 The CPU 101 is a processor that executes a program instruction. The CPU 101 loads at least a part of the programs and data stored in the HDD 103 into the RAM 102 and executes the program. The CPU 101 may include a plurality of processor cores. Further, the monitoring server 100 may have a plurality of processors. The processes described below may be performed in parallel using multiple processors or processor cores. Also, a set of multiple processors may be referred to as a "multiprocessor" or simply a "processor".

ＲＡＭ１０２は、ＣＰＵ１０１が実行するプログラムやＣＰＵ１０１が演算に用いるデータを一時的に記憶する揮発性の半導体メモリである。なお、監視サーバ１００は、ＲＡＭ以外の種類のメモリを備えてもよく、複数個のメモリを備えてもよい。 The RAM 102 is a volatile semiconductor memory that temporarily stores a program executed by the CPU 101 and data used by the CPU 101 for calculation. The monitoring server 100 may include a type of memory other than the RAM, or may include a plurality of memories.

ＨＤＤ１０３は、ＯＳやミドルウェアやアプリケーションソフトウェアなどのソフトウェアのプログラム、および、データを記憶する不揮発性の記憶装置である。なお、監視サーバ１００は、フラッシュメモリやＳＳＤ（Solid State Drive）などの他の種類の記憶装置を備えてもよく、複数の不揮発性の記憶装置を備えてもよい。 The HDD 103 is a non-volatile storage device that stores software programs such as an OS, middleware, and application software, and data. The monitoring server 100 may be provided with other types of storage devices such as a flash memory and an SSD (Solid State Drive), or may be provided with a plurality of non-volatile storage devices.

画像信号処理部１０４は、ＣＰＵ１０１からの命令に従って、監視サーバ１００に接続されたディスプレイ１１１に画像を出力する。ディスプレイ１１１としては、ＣＲＴ（Cathode Ray Tube）ディスプレイ、液晶ディスプレイ（ＬＣＤ：Liquid Crystal Display）、プラズマディスプレイ、有機ＥＬ（ＯＥＬ：Organic Electro-Luminescence）ディスプレイなど、任意の種類のディスプレイを用いることができる。 The image signal processing unit 104 outputs an image to the display 111 connected to the monitoring server 100 in accordance with a command from the CPU 101. As the display 111, any kind of display such as a CRT (Cathode Ray Tube) display, a liquid crystal display (LCD: Liquid Crystal Display), a plasma display, and an organic EL (OEL: Organic Electro-Luminescence) display can be used.

入力信号処理部１０５は、監視サーバ１００に接続された入力デバイス１１２から入力信号を取得し、ＣＰＵ１０１に出力する。入力デバイス１１２としては、マウス・タッチパネル・タッチパッド・トラックボールなどのポインティングデバイス、キーボード、リモートコントローラ、ボタンスイッチなどを用いることができる。また、監視サーバ１００に、複数の種類の入力デバイスが接続されていてもよい。 The input signal processing unit 105 acquires an input signal from the input device 112 connected to the monitoring server 100 and outputs the input signal to the CPU 101. As the input device 112, a pointing device such as a mouse, a touch panel, a touch pad, or a trackball, a keyboard, a remote controller, a button switch, or the like can be used. Further, a plurality of types of input devices may be connected to the monitoring server 100.

媒体リーダ１０６は、記録媒体１１３に記録されたプログラムやデータを読み取る読み取り装置である。記録媒体１１３として、例えば、磁気ディスク、光ディスク、光磁気ディスク（ＭＯ：Magneto-Optical disk）、半導体メモリなどを使用できる。磁気ディスクには、フレキシブルディスク（ＦＤ：Flexible Disk）やＨＤＤが含まれる。光ディスクには、ＣＤ（Compact Disc）やＤＶＤ（Digital Versatile Disc）が含まれる。 The medium reader 106 is a reading device that reads programs and data recorded on the recording medium 113. As the recording medium 113, for example, a magnetic disk, an optical disk, a magneto-optical disk (MO: Magneto-Optical disk), a semiconductor memory, or the like can be used. The magnetic disk includes a flexible disk (FD) and an HDD. Optical discs include CDs (Compact Discs) and DVDs (Digital Versatile Discs).

媒体リーダ１０６は、例えば、記録媒体１１３から読み取ったプログラムやデータを、ＲＡＭ１０２やＨＤＤ１０３などの他の記録媒体にコピーする。読み取られたプログラムは、例えば、ＣＰＵ１０１によって実行される。なお、記録媒体１１３は可搬型記録媒体であってもよく、プログラムやデータの配布に用いられることがある。また、記録媒体１１３やＨＤＤ１０３を、コンピュータ読み取り可能な記録媒体と言うことがある。 The medium reader 106, for example, copies a program or data read from the recording medium 113 to another recording medium such as the RAM 102 or the HDD 103. The read program is executed by, for example, the CPU 101. The recording medium 113 may be a portable recording medium and may be used for distribution of programs and data. Further, the recording medium 113 and the HDD 103 may be referred to as a computer-readable recording medium.

ＮＩＣ１０７は、ネットワーク６０に接続され、ネットワーク６０を介して他のコンピュータと通信を行うインタフェースである。ＮＩＣ１０７は、例えば、スイッチやルータなどの通信装置とケーブルで接続される。 The NIC 107 is an interface that is connected to the network 60 and communicates with another computer via the network 60. The NIC 107 is connected to a communication device such as a switch or a router by a cable.

図４は、スケールアウトおよびスケールインの例を示す図である。 FIG. 4 is a diagram showing examples of scale-out and scale-in.
図４（Ａ）は、スケールアウトを例示する。物理サーバ３００が仮想マシン３１０，３２０を実行し、物理サーバ４００が仮想マシン４１０，４２０を実行している場合を考える。例えば、仮想マシン３１０，３２０，４１０，４２０は、オートスケールの対象となる仮想マシンの１つのグループに属し、ユーザが利用するアプリケーション（あるいはアプリケーション群）の処理を分散して実行する。 FIG. 4(A) illustrates scale-out. Consider a case where the physical server 300 is executing the virtual machines 310 and 320, and the physical server 400 is executing the virtual machines 410 and 420. For example, the virtual machines 310, 320, 410, and 420 belong to one group of virtual machines subject to autoscaling, and execute the processing of the application (or application group) used by the user in a distributed manner.

オートスケールサーバ２００は、仮想マシン３１０，３２０，４１０，４２０の負荷を定期的に収集する。例えば、オートスケールサーバ２００は、仮想マシン３１０，３２０，４１０，４２０の平均の負荷が所定期間継続して第１閾値を上回った場合、仮想マシン３１０，３２０，４１０，４２０の負荷が高まっていると判断し、仮想マシンのスケールアウトを行う。例えば、オートスケールサーバ２００は、物理サーバ３００により仮想マシン３３０を起動させ、仮想マシン３１０，３２０，４１０，４２０の負荷の一部を、仮想マシン３３０に分散させる。 The autoscale server 200 periodically collects the loads of the virtual machines 310, 320, 410, and 420. For example, when the average load of the virtual machines 310, 320, 410, and 420 continuously exceeds a first threshold for a predetermined period, the autoscale server 200 determines that the load on these virtual machines has increased and scales out the virtual machines. For example, the autoscale server 200 causes the physical server 300 to start the virtual machine 330 and distributes part of the load of the virtual machines 310, 320, 410, and 420 to the virtual machine 330.

図４（Ｂ）は、スケールインを例示する。物理サーバ３００が仮想マシン３１０，３２０を実行し、物理サーバ４００が仮想マシン４１０，４２０を実行している場合を考える。オートスケールサーバ２００は、仮想マシン３１０，３２０，４１０，４２０の負荷を定期的に収集する。例えば、オートスケールサーバ２００は、仮想マシン３１０，３２０，４１０，４２０の平均の負荷が所定期間継続して第２閾値を下回った場合、仮想マシン３１０，３２０，４１０，４２０の負荷が低くなっていると判断し、仮想マシンのスケールインを行う。ここで、第２閾値は、第１閾値よりも小さい。例えば、オートスケールサーバ２００は、物理サーバ４００における仮想マシン４２０を停止させ、仮想マシン４２０に割り当てていたリソースを解放する。 FIG. 4(B) illustrates scale-in. Consider a case where the physical server 300 is executing the virtual machines 310 and 320, and the physical server 400 is executing the virtual machines 410 and 420. The autoscale server 200 periodically collects the loads of the virtual machines 310, 320, 410, and 420. For example, when the average load of the virtual machines 310, 320, 410, and 420 continuously falls below a second threshold for a predetermined period, the autoscale server 200 determines that the load on these virtual machines has decreased and scales in the virtual machines. Here, the second threshold is smaller than the first threshold. For example, the autoscale server 200 stops the virtual machine 420 on the physical server 400 and releases the resources that were allocated to the virtual machine 420.
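
The two-threshold decision described for FIG. 4 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `decide_scaling`, the representation of the "predetermined period" as a number of consecutive collection intervals, and the return strings are all assumptions made for this example.

```python
from statistics import mean


def decide_scaling(history, first_threshold, second_threshold, periods):
    """Return "scale_out", "scale_in", or "none".

    history: list of collection intervals (oldest first); each interval is a
             list with one load value per virtual machine in the group.
    periods: hypothetical stand-in for the "predetermined period", expressed
             as a number of consecutive collection intervals.
    """
    if second_threshold >= first_threshold:
        # The text states that the second threshold is smaller than the first.
        raise ValueError("second threshold must be smaller than first threshold")
    if len(history) < periods:
        return "none"  # not enough samples to cover the period yet

    # Average load across the group for each of the most recent intervals.
    recent = [mean(interval) for interval in history[-periods:]]

    if all(avg > first_threshold for avg in recent):
        return "scale_out"
    if all(avg < second_threshold for avg in recent):
        return "scale_in"
    return "none"
```

Keeping the second threshold below the first leaves a band between them in which neither action is taken, so a group whose load sits near one threshold does not alternate between scale-out and scale-in.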

図５は、クラスタシステムの機能例を示すブロック図である。 FIG. 5 is a block diagram showing a functional example of the cluster system.
監視サーバ１００は、記憶部１２０および監視部１３０を有する。記憶部１２０は、ＲＡＭ１０２やＨＤＤ１０３の記憶領域により実現される。監視部１３０は、ＣＰＵ１０１がＲＡＭ１０２に記憶されたプログラムを実行することで実現される。 The monitoring server 100 has a storage unit 120 and a monitoring unit 130. The storage unit 120 is realized by the storage area of the RAM 102 or the HDD 103. The monitoring unit 130 is realized by the CPU 101 executing a program stored in the RAM 102.

記憶部１２０は、オートスケールサーバ管理テーブルおよびＶＭ管理テーブルを記憶する。オートスケールサーバ管理テーブルは、オートスケールサーバ２００の稼動状態を示す情報である。ＶＭ管理テーブルは、各仮想マシンに対する定期通信の成否を示す情報である。 The storage unit 120 stores the autoscale server management table and the VM management table. The autoscale server management table is information indicating the operating state of the autoscale server 200. The VM management table is information indicating the success or failure of periodic communication for each virtual machine.

監視部１３０は、物理サーバ３００，４００上の各仮想マシンおよびオートスケールサーバ２００の監視を行う。監視部１３０は、ＶＭ監視部１３１およびＡＳサーバ連携部１３２を有する。 The monitoring unit 130 monitors each virtual machine on the physical servers 300 and 400 and the autoscale server 200. The monitoring unit 130 has a VM monitoring unit 131 and an AS server cooperation unit 132.

ＶＭ監視部１３１は、物理サーバ３００，４００上の各仮想マシンと定期的に通信し、各仮想マシンとの疎通確認を行う。例えば、ＶＭ監視部１３１は、各仮想マシンから疎通確認用のパケットを受信することで、疎通確認を行う。疎通確認用のパケットは、例えば、ＩＣＭＰ（Internet Control Message Protocol）のエコー要求でもよいし、ＶＭ監視部１３１により送信されたエコー要求に対する仮想マシンからのエコー応答でもよい。あるいは、ＶＭ監視部１３１は、ＳＮＭＰ（Simple Network Management Protocol）などのその他のプロトコルを用いて疎通確認を行ってもよい。ＶＭ監視部１３１による監視対象の仮想マシンは、何れもオートスケールの制御対象の仮想マシンである。ＶＭ監視部１３１は、監視対象とする仮想マシンを、オートスケールサーバ２００に問い合わせてもよい。 The VM monitoring unit 131 periodically communicates with each virtual machine on the physical servers 300 and 400, and confirms communication with each virtual machine. For example, the VM monitoring unit 131 confirms communication by receiving a communication confirmation packet from each virtual machine. The communication confirmation packet may be, for example, an ICMP (Internet Control Message Protocol) echo request or an echo response from the virtual machine to the echo request transmitted by the VM monitoring unit 131. Alternatively, the VM monitoring unit 131 may perform communication confirmation using another protocol such as SNMP (Simple Network Management Protocol). The virtual machines to be monitored by the VM monitoring unit 131 are all virtual machines to be controlled by the autoscale. The VM monitoring unit 131 may inquire of the autoscale server 200 about the virtual machine to be monitored.

ＡＳサーバ連携部１３２は、オートスケールサーバ（ＡＳサーバ）２００と連携する。ＡＳサーバ連携部１３２は、オートスケールサーバ２００と定期的に通信し、オートスケールサーバ２００の死活監視を行う。例えば、ＡＳサーバ連携部１３２は、オートスケールサーバ２００に対して、定期的に仮想マシンの状態を問い合わせることで、オートスケールサーバ２００の死活監視を行ってもよい。問い合わせに対してオートスケールサーバ２００から応答があれば、オートスケールサーバ２００は稼動している。一方、問い合わせに対してオートスケールサーバ２００から応答がなければ、オートスケールサーバ２００は停止している。 The AS server cooperation unit 132 cooperates with the autoscale server (AS server) 200. The AS server cooperation unit 132 periodically communicates with the autoscale server 200 and performs alive monitoring of the autoscale server 200. For example, the AS server cooperation unit 132 may perform alive monitoring of the autoscale server 200 by periodically querying the autoscale server 200 for the state of the virtual machines. If the autoscale server 200 responds to the query, the autoscale server 200 is operating. If the autoscale server 200 does not respond to the query, the autoscale server 200 is stopped.

オートスケールサーバ２００が稼動している場合、ＡＳサーバ連携部１３２は、監視対象の仮想マシンのうち、疎通確認を行えなかった仮想マシンがスケールインにより停止されたか否かを、オートスケールサーバ２００に問い合わせる。該当の仮想マシンがスケールインにより停止された場合、ＡＳサーバ連携部１３２は、疎通確認を行えなかったことを異常としない。該当の仮想マシンがスケールインにより停止されていない場合、ＡＳサーバ連携部１３２は、疎通確認を行えなかった仮想マシンを異常と判断し、システム管理者に通知する。例えば、ＡＳサーバ連携部１３２は、該当の仮想マシンの異常発生を示す画面をディスプレイ１１１に表示させてもよい。または、ＡＳサーバ連携部１３２は、該当の仮想マシンの異常発生を示すメッセージを、ネットワーク５０に接続された、システム管理者が使用する端末装置（図示を省略している）に送信してもよい。 When the autoscale server 200 is operating, the AS server cooperation unit 132 asks the autoscale server 200 whether a monitored virtual machine for which communication could not be confirmed was stopped by scale-in. If the virtual machine was stopped by scale-in, the AS server cooperation unit 132 does not treat the failed communication confirmation as an abnormality. If the virtual machine was not stopped by scale-in, the AS server cooperation unit 132 determines that the virtual machine is abnormal and notifies the system administrator. For example, the AS server cooperation unit 132 may display a screen indicating the abnormality of the virtual machine on the display 111. Alternatively, the AS server cooperation unit 132 may send a message indicating the abnormality of the virtual machine to a terminal device (not shown) that is connected to the network 50 and used by the system administrator.

オートスケールサーバ２００が停止している場合、ＡＳサーバ連携部１３２は、疎通確認を行えなかった仮想マシンがスケールインにより停止されたか否かを、オートスケールサーバ２００に問い合わせることはできない。このため、ＡＳサーバ連携部１３２は、問い合わせを保留する。その後、オートスケールサーバ２００が起動すると、疎通確認の再開により、ＡＳサーバ連携部１３２は、オートスケールサーバ２００の起動を検知する。そして、ＡＳサーバ連携部１３２は、オートスケールサーバ２００の停止中に、疎通確認が途絶えた仮想マシンが存在する場合、当該仮想マシンの情報を、オートスケールサーバ２００に送信する。 When the autoscale server 200 is stopped, the AS server cooperation unit 132 cannot inquire from the autoscale server 200 whether or not the virtual machine for which communication confirmation could not be confirmed has been stopped due to scale-in. Therefore, the AS server cooperation unit 132 suspends the inquiry. After that, when the autoscale server 200 is started, the AS server cooperation unit 132 detects the start of the autoscale server 200 by restarting the communication confirmation. Then, if there is a virtual machine whose communication confirmation is interrupted while the autoscale server 200 is stopped, the AS server cooperation unit 132 transmits the information of the virtual machine to the autoscale server 200.
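
The behavior of the AS server cooperation unit 132 in the two cases (autoscale server operating or stopped) can be sketched roughly as below. The function names, the `pending` list, and the callback parameters `was_scaled_in` and `send_to_as_server` are hypothetical stand-ins for the query and notification exchanged with the autoscale server 200; the description does not define a concrete interface.

```python
def handle_unreachable_vm(vm_name, as_server_running, was_scaled_in, pending):
    """Classify a VM whose communication confirmation failed.

    was_scaled_in: hypothetical callable that asks the autoscale server
                   whether the VM was stopped by scale-in.
    pending:       list of VM names whose query is suspended while the
                   autoscale server is stopped.
    """
    if not as_server_running:
        # The query cannot be made now, so it is suspended.
        pending.append(vm_name)
        return "deferred"
    if was_scaled_in(vm_name):
        # Stopped deliberately by scale-in: not treated as an abnormality.
        return "ignore"
    # Unreachable for another reason: notify the system administrator.
    return "alert"


def flush_pending(pending, send_to_as_server):
    """After the autoscale server restarts, send the suspended VM information."""
    sent = list(pending)
    for vm_name in sent:
        send_to_as_server(vm_name)
    pending.clear()
    return sent
```

For example, a VM that stops responding while the autoscale server is down is first classified as `"deferred"`, and its name is passed to `send_to_as_server` once the restart is detected.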

オートスケールサーバ２００は、記憶部２１０およびＡＳ制御部２２０を有する。記憶部２１０は、オートスケールサーバ２００のＲＡＭやＨＤＤの記憶領域を用いて実現される。ＡＳ制御部２２０は、オートスケールサーバ２００のＣＰＵがオートスケールサーバ２００のＲＡＭに記憶されたプログラムを実行することで実現される。 The autoscale server 200 has a storage unit 210 and an AS control unit 220. The storage unit 210 is realized by using the storage area of the RAM or HDD of the autoscale server 200. The AS control unit 220 is realized by the CPU of the autoscale server 200 executing a program stored in the RAM of the autoscale server 200.

記憶部２１０は、オートスケール制御に用いられる情報を記憶する。具体的には、記憶部２１０は、オートスケールグループテーブル、ＶＭテーブルおよびオートスケールポリシーテーブルを記憶する。 The storage unit 210 stores information used for autoscale control. Specifically, the storage unit 210 stores an autoscale group table, a VM table, and an autoscale policy table.

オートスケールグループテーブルは、オートスケールグループを示す情報である。オートスケールグループは、オートスケール制御の対象となる仮想マシンのグループである。１つのオートスケールグループに属する仮想マシンの負荷に応じて、当該オートスケールグループに属する仮想マシンのオートスケール制御が行われる。ＶＭテーブルは、オートスケール制御の対象の仮想マシンを示す情報である。ＶＭテーブルは、仮想マシンの状態を含む。仮想マシンの状態には、（１）仮想マシンが正常に稼動している、（２）仮想マシンに異常あり、（３）スケールインにより縮退している（スケールインのために停止している）、（４）スケールアウトのために起動中、などが考えられる。オートスケールポリシーテーブルは、オートスケール制御のポリシー（スケールインやスケールアウトを行うための条件）を示す情報である。 The autoscale group table is information indicating an autoscale group. An autoscale group is a group of virtual machines that are subject to autoscale control. According to the load of the virtual machines belonging to one autoscale group, the autoscale control of the virtual machines belonging to the autoscale group is performed. The VM table is information indicating a virtual machine subject to autoscale control. The VM table contains the state of the virtual machine. The state of the virtual machine is (1) the virtual machine is operating normally, (2) the virtual machine is abnormal, and (3) it is degraded due to scale-in (stopped due to scale-in). , (4) Starting up for scale-out, etc. are conceivable. The autoscale policy table is information indicating the autoscale control policy (conditions for scaling in and scale out).

ここで、オートスケールグループテーブル、ＶＭテーブルおよびオートスケールポリシーテーブルは、記憶部２１０のうち、不揮発性の記憶領域（例えば、ＨＤＤの記憶領域）に格納される。また、オートスケールグループテーブル、ＶＭテーブルおよびオートスケールポリシーテーブルは、オートスケール制御に用いられる場合、複製されて、記憶部２１０のうち、揮発性の記憶領域（例えば、ＲＡＭの記憶領域）に一時的に格納されることもある。この場合、揮発性の記憶領域に保持されているときの各テーブルの更新内容は、ＡＳ制御部２２０により不揮発性の記憶領域に格納された複製元の各テーブルにも反映される。 Here, the autoscale group table, the VM table, and the autoscale policy table are stored in the non-volatile storage area (for example, the storage area of the HDD) in the storage unit 210. Further, when the autoscale group table, the VM table, and the autoscale policy table are used for autoscale control, they are duplicated and temporarily stored in a volatile storage area (for example, a RAM storage area) in the storage unit 210. It may be stored in. In this case, the updated content of each table when it is held in the volatile storage area is also reflected in each copy source table stored in the non-volatile storage area by the AS control unit 220.

ＡＳ制御部２２０は、物理サーバ３００，４００上の仮想マシン（例えば、仮想マシン３１０，３２０，４１０，４２０を含む複数の仮想マシン）のオートスケール制御（ＡＳ制御）を行う。ＡＳ制御部２２０は、仮想マシンの負荷の情報を定期的に収集する。例えば、ＡＳ制御部２２０は、ＳＮＭＰなどのプロトコルを用いて仮想マシンの負荷を収集してもよい。ＡＳ制御部２２０は、収集した負荷と、当該仮想マシンが属するオートスケールグループのオートスケールポリシーとに基づいて、スケールインやスケールアウトを物理サーバ３００，４００に指示する。 The AS control unit 220 performs autoscale control (AS control) of virtual machines (for example, a plurality of virtual machines including virtual machines 310, 320, 410, 420) on the physical servers 300 and 400. The AS control unit 220 periodically collects information on the load of the virtual machine. For example, the AS control unit 220 may collect the load of the virtual machine by using a protocol such as SNMP. The AS control unit 220 instructs the physical servers 300 and 400 to scale in and scale out based on the collected load and the autoscale policy of the autoscale group to which the virtual machine belongs.

ここで、障害などによりオートスケールサーバ２００が停止することがある。オートスケールサーバ２００の停止中は、ＡＳ制御部２２０によるオートスケール制御も停止する。ＡＳ制御部２２０は、オートスケールサーバ２００の停止後、オートスケールサーバ２００が起動した際に、オートスケールサーバ２００が停止していた間の疎通確認に応じた仮想マシンの情報を監視サーバ１００から取得する。ＡＳ制御部２２０は、取得した仮想マシンの情報に基づいて、記憶部２１０に記憶されたＶＭテーブルにおける仮想マシンの状態を更新する。ＡＳ制御部２２０は、更新後のＶＭテーブルに基づいて、仮想マシンの負荷の収集を再開し、オートスケール制御を再開する。 Here, the autoscale server 200 may stop due to a failure or the like. While the autoscale server 200 is stopped, the autoscale control by the AS control unit 220 is also stopped. When the autoscale server 200 starts again after stopping, the AS control unit 220 acquires, from the monitoring server 100, virtual machine information that reflects the communication confirmation performed while the autoscale server 200 was stopped. The AS control unit 220 updates the state of the virtual machines in the VM table stored in the storage unit 210 based on the acquired virtual machine information. Based on the updated VM table, the AS control unit 220 resumes collecting the loads of the virtual machines and resumes autoscale control.

図６は、オートスケールサーバ管理テーブルの例を示す図である。 FIG. 6 is a diagram showing an example of an autoscale server management table.
オートスケールサーバ管理テーブル１２１は、記憶部１２０に格納される。オートスケールサーバ管理テーブル１２１は、オートスケールサーバＩＤ（IDentifier）および稼働中フラグの項目を含む。 The autoscale server management table 121 is stored in the storage unit 120. The autoscale server management table 121 includes items of an autoscale server ID (IDentifier) and an operating flag.

オートスケールサーバＩＤの項目には、オートスケールサーバ２００の識別情報（オートスケールサーバＩＤ）が登録される。オートスケールサーバ２００のオートスケールサーバＩＤは、例えば、「装置Ａ」である。稼働中フラグの項目には、オートスケールサーバ２００が稼働しているか否かを示す稼働中フラグが登録される。稼働中フラグ「Ｔｒｕｅ」は稼働していることを示す。稼働中フラグ「Ｆａｌｓｅ」は稼動していない（すなわち、停止している）ことを示す。例えば、オートスケールサーバ管理テーブル１２１には、オートスケールサーバＩＤが「装置Ａ」、稼働中フラグが「Ｔｒｕｅ」というレコードが登録される。 The identification information (autoscale server ID) of the autoscale server 200 is registered in the item of the autoscale server ID. The autoscale server ID of the autoscale server 200 is, for example, "device A". In the item of the operating flag, an operating flag indicating whether or not the autoscale server 200 is operating is registered. The running flag "True" indicates that it is running. The running flag "False" indicates that it is not running (ie, stopped). For example, a record in which the autoscale server ID is "device A" and the operating flag is "True" is registered in the autoscale server management table 121.

図７は、ＶＭ管理テーブルの例を示す図である。 FIG. 7 is a diagram showing an example of a VM management table.
ＶＭ管理テーブル１２２は、記憶部１２０に格納される。ＶＭ管理テーブル１２２は、ＶＭ名、通信用ＩＰ（Internet Protocol）アドレス、オートスケールＶＭ動作中フラグおよびオートスケール情報更新フラグの項目を含む。 The VM management table 122 is stored in the storage unit 120. The VM management table 122 includes items of a VM name, a communication IP (Internet Protocol) address, an autoscale VM operating flag, and an autoscale information update flag.

ＶＭ名の項目には、仮想マシンの名称（仮想マシンのＩＤ）が登録される。通信用ＩＰアドレスの項目には、仮想マシンのＩＰアドレスが登録される。オートスケールＶＭ動作中フラグの項目には、死活監視の成否（すなわち、該当の仮想マシンが動作しているか否か）を示すオートスケールＶＭ動作中フラグが登録される。オートスケールＶＭ動作中フラグ「Ｔｒｕｅ」は、該当の仮想マシンとの定期通信が正常に行われた（すなわち、該当の仮想マシンが動作している）ことを示す。オートスケールＶＭ動作中フラグ「Ｆａｌｓｅ」は、該当の仮想マシンとの定期通信が正常に行われなかった（すなわち、該当の仮想マシンが停止している）ことを示す。オートスケール情報更新フラグの項目には、オートスケールサーバ２００の停止中に、該当の仮想マシンに関してオートスケールＶＭ動作中フラグの更新が発生したか否かを示すオートスケール情報更新フラグが登録される。オートスケール情報更新フラグ「Ｔｒｕｅ」は、当該更新が発生したことを示す。オートスケール情報更新フラグ「Ｆａｌｓｅ」は、当該更新が発生しなかったことを示す。オートスケール情報更新フラグの初期値は「Ｆａｌｓｅ」である。 The name of the virtual machine (ID of the virtual machine) is registered in the item of VM name. The IP address of the virtual machine is registered in the item of IP address for communication. In the item of the autoscale VM operating flag, the autoscale VM operating flag indicating the success or failure of the alive monitoring (that is, whether or not the corresponding virtual machine is operating) is registered. The autoscale VM operating flag "True" indicates that periodic communication with the corresponding virtual machine has been normally performed (that is, the corresponding virtual machine is operating). The autoscale VM operating flag "False" indicates that the periodic communication with the corresponding virtual machine has not been performed normally (that is, the corresponding virtual machine is stopped). In the item of the autoscale information update flag, an autoscale information update flag indicating whether or not an update of the autoscale VM operating flag has occurred for the corresponding virtual machine is registered while the autoscale server 200 is stopped. The autoscale information update flag "True" indicates that the update has occurred. The autoscale information update flag "False" indicates that the update did not occur. The initial value of the autoscale information update flag is "False".

例えば、ＶＭ管理テーブル１２２には、ＶＭ名が「Ｇｒｐ１＿ＶＭ１」、通信用ＩＰアドレスが「１００．１０．９９．１」、オートスケールＶＭ動作中フラグが「Ｔｒｕｅ」、オートスケール情報更新フラグが「Ｆａｌｓｅ」というレコードが登録される。このレコードは、ＶＭ名「Ｇｒｐ１＿ＶＭ１」の仮想マシンの通信用ＩＰアドレスが「１００．１０．９９．１」であることを示す。また、当該仮想マシンが稼動しており、オートスケールサーバ２００の停止中におけるオートスケールＶＭ動作中フラグの更新が発生していないことを示す。 For example, a record in which the VM name is "Grp1_VM1", the communication IP address is "100.10.99.1", the autoscale VM operating flag is "True", and the autoscale information update flag is "False" is registered in the VM management table 122. This record indicates that the communication IP address of the virtual machine with the VM name "Grp1_VM1" is "100.10.99.1". It also indicates that the virtual machine is operating and that the autoscale VM operating flag has not been updated while the autoscale server 200 was stopped.

また、例えば、ＶＭ管理テーブル１２２には、ＶＭ名が「ＳａｍｐｌｅＶＭ」、通信用ＩＰアドレスが「２００．２００．２００．２」、オートスケールＶＭ動作中フラグが「Ｆａｌｓｅ」、オートスケール情報更新フラグが「Ｔｒｕｅ」というレコードが登録される。このレコードは、ＶＭ名「ＳａｍｐｌｅＶＭ」の仮想マシンの通信用ＩＰアドレスが「２００．２００．２００．２」であることを示す。また、当該仮想マシンが停止しており、オートスケールサーバ２００の停止中におけるオートスケールＶＭ動作中フラグの更新が発生したことを示す。 Further, for example, a record in which the VM name is "SampleVM", the communication IP address is "200.200.200.2", the autoscale VM operating flag is "False", and the autoscale information update flag is "True" is registered in the VM management table 122. This record indicates that the communication IP address of the virtual machine with the VM name "SampleVM" is "200.200.200.2". It also indicates that the virtual machine is stopped and that the autoscale VM operating flag was updated while the autoscale server 200 was stopped.
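
The relationship between the two flags in the VM management table 122 can be illustrated with a small sketch. The class name `VmRecord`, its field names, and the function `record_probe_result` are assumptions introduced for this example; only the four items and the flag semantics come from the description of FIG. 7.

```python
from dataclasses import dataclass


@dataclass
class VmRecord:
    vm_name: str
    ip_address: str            # communication IP address
    vm_running: bool = True    # autoscale VM operating flag
    info_updated: bool = False # autoscale information update flag (initially False)


def record_probe_result(record, reachable, as_server_running):
    """Apply one periodic communication result to a VM management record.

    The update flag is set only when the operating flag changes while the
    autoscale server is stopped, as described for FIG. 7.
    """
    if record.vm_running != reachable:
        record.vm_running = reachable
        if not as_server_running:
            record.info_updated = True
    return record
```

Applying a failed probe to a running "SampleVM" record while the autoscale server is stopped yields the second example record above: operating flag False, update flag True.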

図８は、オートスケールグループテーブルの例を示す図である。 FIG. 8 is a diagram showing an example of an autoscale group table.
オートスケールグループテーブル２１１は、記憶部２１０に格納される。オートスケールグループテーブル２１１は、オートスケールグループＩＤ、利用可能ＣＩＤＲ（Classless Inter-Domain Routing）、オートスケールポリシーＩＤ、最小台数および最大台数の項目を含む。 The autoscale group table 211 is stored in the storage unit 210. The autoscale group table 211 includes items of an autoscale group ID, an available CIDR (Classless Inter-Domain Routing), an autoscale policy ID, a minimum number, and a maximum number.

オートスケールグループＩＤの項目には、オートスケールグループの識別情報（オートスケールグループＩＤ）が登録される。利用可能ＣＩＤＲの項目には、利用可能なＣＩＤＲが登録される。オートスケールポリシーＩＤの項目には、該当のオートスケールグループに対して適用されるオートスケールポリシーの識別情報（オートスケールポリシーＩＤ）が登録される。ここで、オートスケールポリシーＩＤに対応するオートスケールポリシーの具体的な内容は、後述するオートスケールポリシーテーブルに予め登録されている。最小台数の項目には、該当のオートスケールグループにおける仮想マシンの最小数が登録される。最大台数の項目には、該当のオートスケールグループにおける仮想マシンの最大数が登録される。 Identification information (autoscale group ID) of the autoscale group is registered in the item of the autoscale group ID. Available CIDRs are registered in the Available CIDR items. In the autoscale policy ID item, the identification information (autoscale policy ID) of the autoscale policy applied to the corresponding autoscale group is registered. Here, the specific contents of the autoscale policy corresponding to the autoscale policy ID are registered in advance in the autoscale policy table described later. In the item of minimum number, the minimum number of virtual machines in the corresponding autoscale group is registered. In the item of maximum number, the maximum number of virtual machines in the corresponding autoscale group is registered.

例えば、オートスケールグループテーブル２１１には、オートスケールグループＩＤが「グループ１」、利用可能ＣＩＤＲが「１００．１０．９９．０／２４」、オートスケールポリシーＩＤが「ルール１，３」、最小台数が「１」、最大台数が「１０」というレコードが登録される。このレコードは、オートスケールグループＩＤ「グループ１」のオートスケールグループでは、利用可能ＣＩＤＲが「１００．１０．９９．０／２４」であり、オートスケールポリシーＩＤ「ルール１，３」のオートスケールポリシーが適用され、仮想マシンの最小数が１個、最大数が１０個であることを示す。 For example, a record in which the autoscale group ID is "group 1", the available CIDR is "100.10.99.0/24", the autoscale policy ID is "rules 1 and 3", the minimum number is "1", and the maximum number is "10" is registered in the autoscale group table 211. This record indicates that, for the autoscale group with the autoscale group ID "group 1", the available CIDR is "100.10.99.0/24", the autoscale policies with the autoscale policy IDs "rules 1 and 3" are applied, and the minimum number of virtual machines is 1 and the maximum number is 10.
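
As an illustration of how such a record might be used, the sketch below models one row of the autoscale group table 211 with Python's standard `ipaddress` module. The class and method names are hypothetical. Two interpretations are assumptions of this example, not statements from the description: that a VM's communication IP address is drawn from the group's available CIDR, and that the minimum and maximum numbers act as bounds on scale-in and scale-out.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoscaleGroup:
    group_id: str
    available_cidr: str
    policy_ids: tuple
    min_vms: int
    max_vms: int

    def contains(self, ip):
        """True if the address lies inside the group's available CIDR."""
        return ipaddress.ip_address(ip) in ipaddress.ip_network(self.available_cidr)

    def can_scale_out(self, current_vms):
        # Assumed reading: never exceed the maximum number.
        return current_vms < self.max_vms

    def can_scale_in(self, current_vms):
        # Assumed reading: never drop below the minimum number.
        return current_vms > self.min_vms


# The example record from the text.
GROUP1 = AutoscaleGroup("group 1", "100.10.99.0/24", ("rule 1", "rule 3"), 1, 10)
```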

図９は、ＶＭテーブルの例を示す図である。 FIG. 9 is a diagram showing an example of a VM table.
ＶＭテーブル２１２は、記憶部２１０に格納される。ＶＭテーブル２１２は、ＶＭ名、オートスケールグループＩＤ、通信用ＩＰアドレスおよびＶＭ状態の項目を含む。 The VM table 212 is stored in the storage unit 210. The VM table 212 includes items of a VM name, an autoscale group ID, a communication IP address, and a VM state.

ＶＭ名の項目には、仮想マシンのＶＭ名が登録される。オートスケールグループＩＤの項目には、当該仮想マシンが属するオートスケールグループのオートスケールグループＩＤが登録される。通信用ＩＰアドレスの項目には、仮想マシンのＩＰアドレスが登録される。ＶＭ状態の項目には、仮想マシンの状態が登録される。前述のように、仮想マシンの状態には、仮想マシンが正常に稼動している、仮想マシンに異常あり（ＥＲＲＯＲ）、スケールインにより縮退している（スケールインのために停止している）、スケールアウトのために起動中、などが考えられる。 The VM name of the virtual machine is registered in the VM name item. The autoscale group ID of the autoscale group to which the virtual machine belongs is registered in the autoscale group ID item. The IP address of the virtual machine is registered in the communication IP address item. The state of the virtual machine is registered in the VM state item. As described above, possible states of a virtual machine include: the virtual machine is operating normally; the virtual machine has an abnormality (ERROR); the virtual machine is degenerated by scale-in (stopped for scale-in); and the virtual machine is starting up for scale-out.

例えば、ＶＭテーブル２１２には、ＶＭ名が「Ｇｒｐ１＿ＶＭ１」、オートスケールグループＩＤが「グループ１」、通信用ＩＰアドレスが「１００．１０．９９．１」、ＶＭ状態が「正常」というレコードが登録される。このレコードは、ＶＭ名「Ｇｒｐ１＿ＶＭ１」の仮想マシンがオートスケールグループＩＤ「グループ１」のオートスケールグループに属し、当該仮想マシンのＩＰアドレスが「１００．１０．９９．１」であり、当該仮想マシンが正常に稼動していることを示す。 For example, in the VM table 212, a record having a VM name of "Grp1_VM1", an autoscale group ID of "group 1", a communication IP address of "100.10.99.1", and a VM status of "normal" is registered. Will be done. In this record, the virtual machine with the VM name "Grp1_VM1" belongs to the autoscale group with the autoscale group ID "group1", the IP address of the virtual machine is "100.10.99.1", and the virtual machine is concerned. Indicates that is operating normally.

また、例えば、ＶＭテーブル２１２には、ＶＭ名が「Ｇｒｐ１＿ＶＭ２」、オートスケールグループＩＤが「グループ１」、通信用ＩＰアドレスが「１００．１０．９９．２」、ＶＭ状態が「ＥＲＲＯＲ」というレコードが登録される。このレコードは、ＶＭ名「Ｇｒｐ１＿ＶＭ２」の仮想マシンがオートスケールグループＩＤ「グループ１」のオートスケールグループに属し、当該仮想マシンのＩＰアドレスが「１００．１０．９９．２」であり、当該仮想マシンで異常が発生していることを示す。 Further, for example, a record in which the VM name is "Grp1_VM2", the autoscale group ID is "group 1", the communication IP address is "100.10.99.2", and the VM state is "ERROR" is registered in the VM table 212. This record indicates that the virtual machine with the VM name "Grp1_VM2" belongs to the autoscale group with the autoscale group ID "group 1", that the IP address of the virtual machine is "100.10.99.2", and that an abnormality has occurred in the virtual machine.

また、例えば、ＶＭテーブル２１２には、ＶＭ名が「Ｇｒｐ１＿ＶＭ３」、オートスケールグループＩＤが「グループ１」、通信用ＩＰアドレスが「１００．１０．９９．３」、ＶＭ状態が「スケールイン縮退」というレコードが登録される。このレコードは、ＶＭ名「Ｇｒｐ１＿ＶＭ３」の仮想マシンがオートスケールグループＩＤ「グループ１」のオートスケールグループに属し、当該仮想マシンのＩＰアドレスが「１００．１０．９９．３」であり、スケールインにより停止していることを示す。 Further, for example, in the VM table 212, the VM name is "Grp1_VM3", the autoscale group ID is "Group1", the communication IP address is "100.10.99.3", and the VM state is "scale-in degenerate". Record is registered. In this record, the virtual machine with the VM name "Grp1_VM3" belongs to the autoscale group with the autoscale group ID "Group1", the IP address of the virtual machine is "100.10.99.3", and by scale-in. Indicates that it is stopped.

また、例えば、ＶＭテーブル２１２には、ＶＭ名が「Ｇｒｐ２＿ＶＭ３」、オートスケールグループＩＤが「グループ２」、通信用ＩＰアドレスが「１００．１１．０．３２」、ＶＭ状態が「スケールアウト中」というレコードが登録される。このレコードは、ＶＭ名「Ｇｒｐ２＿ＶＭ３」の仮想マシンがオートスケールグループＩＤ「グループ２」のオートスケールグループに属し、当該仮想マシンのＩＰアドレスが「１００．１１．０．３２」であり、スケールアウトのため起動中であることを示す。 Further, for example, in the VM table 212, the VM name is "Grp2_VM3", the autoscale group ID is "Group2", the communication IP address is "100.11.0.32", and the VM state is "scaling out". Record is registered. In this record, the virtual machine with the VM name "Grp2_VM3" belongs to the autoscale group with the autoscale group ID "Group2", the IP address of the virtual machine is "100.11.0.32", and the scale is out. Therefore, it indicates that it is starting up.
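
The four example records of the VM table 212 can be written down as a small data structure. The sketch below is illustrative only: the enum `VmState`, the class `VmEntry`, and the helper `stopped_by_scale_in` are names invented for this example. The helper shows one way the autoscale server could answer the monitoring server's query about whether an unreachable VM was stopped by scale-in.

```python
from dataclasses import dataclass
from enum import Enum


class VmState(Enum):
    NORMAL = "normal"
    ERROR = "ERROR"
    SCALED_IN = "scale-in degenerate"  # stopped for scale-in
    SCALING_OUT = "scaling out"        # starting up for scale-out


@dataclass
class VmEntry:
    vm_name: str
    group_id: str
    ip_address: str
    state: VmState


# The example records given for FIG. 9.
VM_TABLE = [
    VmEntry("Grp1_VM1", "group 1", "100.10.99.1", VmState.NORMAL),
    VmEntry("Grp1_VM2", "group 1", "100.10.99.2", VmState.ERROR),
    VmEntry("Grp1_VM3", "group 1", "100.10.99.3", VmState.SCALED_IN),
    VmEntry("Grp2_VM3", "group 2", "100.11.0.32", VmState.SCALING_OUT),
]


def stopped_by_scale_in(table, vm_name):
    """True if the named VM is recorded as stopped by scale-in."""
    for entry in table:
        if entry.vm_name == vm_name:
            return entry.state is VmState.SCALED_IN
    raise KeyError(vm_name)
```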

図１０は、オートスケールポリシーテーブルの例を示す図である。オートスケールポリシーテーブル２１３は、記憶部２１０に格納される。オートスケールポリシーテーブル２１３は、オートスケールポリシーＩＤ、トリガーおよびトリガー詳細の項目を含む。 FIG. 10 is a diagram showing an example of an autoscale policy table. The autoscale policy table 213 is stored in the storage unit 210. The autoscale policy table 213 includes items for the autoscale policy ID, trigger and trigger details.

オートスケールポリシーＩＤの項目には、オートスケールポリシーの識別情報（オートスケールポリシーＩＤ）が登録される。トリガーの項目には、オートスケール制御のトリガーとなる監視対象のリソース（仮想マシンにより認識される論理的なリソースでもよい）が登録される。トリガー詳細の項目には、オートスケール制御のトリガーに関する条件が登録される。 The identification information (autoscale policy ID) of the autoscale policy is registered in the item of the autoscale policy ID. In the trigger item, the monitored resource (which may be a logical resource recognized by the virtual machine) that triggers the autoscale control is registered. Conditions related to triggers for autoscale control are registered in the trigger details item.

例えば、オートスケールポリシーテーブル２１３には、オートスケールポリシーＩＤが「ルール１」、トリガーが「ＣＰＵ使用率」、トリガー詳細が「１分毎のＣＰＵ平均使用率を取得し、連続５回８０％を上回るとスケールアウト」というレコードが登録される。 For example, the autoscale policy table 213 contains a record in which the autoscale policy ID is "Rule 1", the trigger is "CPU usage rate", and the trigger details are "Acquire the average CPU usage rate every minute, and scale out when it exceeds 80% five times in a row".

このレコードは、オートスケールポリシーＩＤ「ルール１」のオートスケールポリシーでは、仮想マシンのＣＰＵ使用率をトリガーとしており、１分毎のＣＰＵ平均使用率が連続５回８０％を上回った場合に、スケールアウトを行うことを示す。ここで、「１分毎のＣＰＵ平均使用率」は、該当のオートスケールグループに属する複数の仮想マシンに関する平均でもよいし、該当のオートスケールグループに属する仮想マシン単位の平均でもよい。後者の場合、該当のオートスケールグループに属する少なくとも何れかの仮想マシンにおいて、１分毎のＣＰＵ平均使用率が連続５回８０％を上回るとスケールアウトを行う。なお、所定時間毎の「ＣＰＵ平均使用率」（あるいは、「メモリ平均使用率」）の考え方は、他のオートスケールポリシーについても同様である。 This record indicates that the autoscale policy with the autoscale policy ID "Rule 1" uses the CPU usage rate of the virtual machines as its trigger, and that scale-out is performed when the per-minute average CPU usage rate exceeds 80% five times in a row. Here, the "per-minute average CPU usage rate" may be an average over the plurality of virtual machines belonging to the corresponding autoscale group, or may be an average computed for each virtual machine belonging to that group. In the latter case, scale-out is performed when the per-minute average CPU usage rate exceeds 80% five times in a row on at least one of the virtual machines belonging to the corresponding autoscale group. The same interpretation of the "average CPU usage rate" (or "average memory usage rate") per predetermined time applies to the other autoscale policies.

また、例えば、オートスケールポリシーテーブル２１３には、オートスケールポリシーＩＤが「ルール２」、トリガーが「メモリ使用率」、トリガー詳細が「５分毎のメモリ平均使用率を取得し、連続３回９５％を上回るとスケールアウト」というレコードが登録される。このレコードは、オートスケールポリシーＩＤ「ルール２」のオートスケールポリシーでは、仮想マシンのメモリ使用率をトリガーとしており、５分毎のメモリ平均使用率が連続３回９５％を上回った場合に、スケールアウトを行うことを示す。 Further, for example, the autoscale policy table 213 contains a record in which the autoscale policy ID is "Rule 2", the trigger is "memory usage rate", and the trigger details are "Acquire the average memory usage rate every 5 minutes, and scale out when it exceeds 95% three times in a row". This record indicates that the autoscale policy with the autoscale policy ID "Rule 2" uses the memory usage rate of the virtual machines as its trigger, and that scale-out is performed when the 5-minute average memory usage rate exceeds 95% three times in a row.

また、例えば、オートスケールポリシーテーブル２１３には、オートスケールポリシーＩＤが「ルール３」、トリガーが「ＣＰＵ使用率」、トリガー詳細が「１分毎のＣＰＵ平均使用率を取得し、連続５回１０％を下回るとスケールイン」というレコードが登録される。このレコードは、オートスケールポリシーＩＤ「ルール３」のオートスケールポリシーでは、仮想マシンのＣＰＵ使用率をトリガーとしており、１分毎のＣＰＵ平均使用率が連続５回１０％を下回った場合に、スケールインを行うことを示す。 Further, for example, the autoscale policy table 213 contains a record in which the autoscale policy ID is "Rule 3", the trigger is "CPU usage rate", and the trigger details are "Acquire the average CPU usage rate every minute, and scale in when it falls below 10% five times in a row". This record indicates that the autoscale policy with the autoscale policy ID "Rule 3" uses the CPU usage rate of the virtual machines as its trigger, and that scale-in is performed when the per-minute average CPU usage rate falls below 10% five times in a row.

また、例えば、オートスケールポリシーテーブル２１３には、オートスケールポリシーＩＤが「ルール４」、トリガーが「メモリ使用率」、トリガー詳細が「５分毎のメモリ平均使用率を取得し、連続５回３０％を下回るとスケールイン」というレコードが登録される。このレコードは、オートスケールポリシーＩＤ「ルール４」のオートスケールポリシーでは、仮想マシンのメモリ使用率をトリガーとしており、５分毎のメモリ平均使用率が連続５回３０％を下回った場合に、スケールインを行うことを示す。 Further, for example, the autoscale policy table 213 contains a record in which the autoscale policy ID is "Rule 4", the trigger is "memory usage rate", and the trigger details are "Acquire the average memory usage rate every 5 minutes, and scale in when it falls below 30% five times in a row". This record indicates that the autoscale policy with the autoscale policy ID "Rule 4" uses the memory usage rate of the virtual machines as its trigger, and that scale-in is performed when the 5-minute average memory usage rate falls below 30% five times in a row.
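
The four example records of the autoscale policy table 213 can be read as "N consecutive samples beyond a threshold" rules. The following Python sketch illustrates that reading only; the class `AutoscalePolicy`, its field names, and the function `is_triggered` are hypothetical names chosen for illustration and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class AutoscalePolicy:
    """One record of the policy table (field names are illustrative)."""
    policy_id: str
    trigger: str         # monitored resource: "cpu" or "memory"
    interval_min: int    # sampling interval in minutes
    consecutive: int     # number of consecutive samples required
    threshold: float     # usage rate in percent
    action: str          # "scale_out" or "scale_in"


# The four example records described for the table.
POLICIES: List[AutoscalePolicy] = [
    AutoscalePolicy("Rule 1", "cpu", 1, 5, 80.0, "scale_out"),
    AutoscalePolicy("Rule 2", "memory", 5, 3, 95.0, "scale_out"),
    AutoscalePolicy("Rule 3", "cpu", 1, 5, 10.0, "scale_in"),
    AutoscalePolicy("Rule 4", "memory", 5, 5, 30.0, "scale_in"),
]


def is_triggered(policy: AutoscalePolicy, samples: Sequence[float]) -> bool:
    """Check the most recent samples (oldest first) against one policy.

    Scale-out fires when the last `consecutive` averages all exceed the
    threshold; scale-in fires when they all fall below it.
    """
    if len(samples) < policy.consecutive:
        return False
    recent = samples[-policy.consecutive:]
    if policy.action == "scale_out":
        return all(s > policy.threshold for s in recent)
    return all(s < policy.threshold for s in recent)
```

Whether `samples` holds group-wide averages or per-VM averages is left to the caller, matching the two interpretations the description allows.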

次に、上記のクラスタシステムにおける監視サーバ１００の処理手順を説明する。 Next, the processing procedure of the monitoring server 100 in the above cluster system will be described.
図１１は、ＶＭ監視の例を示すフローチャートである。 FIG. 11 is a flowchart showing an example of VM monitoring.
ＶＭ監視部１３１は下記の処理を定期的に実行する。実行の周期は、運用に応じて定められる。周期は、数秒から数十秒程度でもよいし、１分から数分程度でもよい。 The VM monitoring unit 131 periodically executes the following processing. The execution cycle is determined according to the operation. The cycle may be about several seconds to several tens of seconds, or may be about one minute to several minutes.

（Ｓ１０）ＶＭ監視部１３１は、監視対象の仮想マシン（監視対象ＶＭ）の監視情報を収集する。例えば、ＶＭ監視部１３１は、監視対象の仮想マシンから死活監視用の所定のパケットを受信することで、監視情報を収集する。 (S10) The VM monitoring unit 131 collects monitoring information of the virtual machines to be monitored (monitored VMs). For example, the VM monitoring unit 131 collects the monitoring information by receiving a predetermined packet for alive monitoring from each virtual machine to be monitored.

（Ｓ１１）ＶＭ監視部１３１は、監視対象ＶＭの動作状況を更新する。具体的には、ＶＭ監視部１３１は、ステップＳ１０の監視情報の収集結果に基づいて、ＶＭ管理テーブル１２２を更新する。すなわち、ＶＭ監視部１３１は、監視情報を収集できた（死活監視用のパケットを受信できた）仮想マシンのオートスケールＶＭ動作中フラグを「Ｔｒｕｅ」に設定する。なお、元々「Ｔｒｕｅ」の場合はそのままでよい。 (S11) The VM monitoring unit 131 updates the operating status of the monitored VMs. Specifically, the VM monitoring unit 131 updates the VM management table 122 based on the result of collecting the monitoring information in step S10. That is, the VM monitoring unit 131 sets the autoscale VM operating flag to "True" for each virtual machine from which monitoring information could be collected (that is, from which the alive monitoring packet was received). If the flag is already "True", it is left unchanged.

（Ｓ１２）ＶＭ監視部１３１は、前回の監視情報の収集時から所定時間内に監視情報が届いていない監視対象ＶＭがあるか否かを判定する。所定時間とは、当該監視の周期、または、当該周期に比較的短い時間（当該周期よりも短い時間）を加算した時間である。所定時間内に監視情報が届いていない監視対象ＶＭがある場合、ステップＳ１３に処理が進む。所定時間内に監視情報が届いていない監視対象ＶＭがない場合、ステップＳ１６に処理が進む。 (S12) The VM monitoring unit 131 determines whether or not there is a monitored VM for which the monitoring information has not arrived within a predetermined time from the time of the previous collection of the monitoring information. The predetermined time is the monitoring cycle or the time obtained by adding a relatively short time (a time shorter than the cycle) to the cycle. If there is a monitored VM for which monitoring information has not arrived within a predetermined time, the process proceeds to step S13. If there is no monitored VM for which monitoring information has not arrived within a predetermined time, the process proceeds to step S16.

（Ｓ１３）ＶＭ監視部１３１は、ステップＳ１２で前回の監視情報の収集時から所定時間内に監視情報が届いていないと判断された監視対象ＶＭについて、ＶＭ管理テーブル１２２のオートスケールＶＭ動作中フラグを「Ｆａｌｓｅ」に設定する。なお、元々「Ｆａｌｓｅ」の場合はそのままでよい。 (S13) For each monitored VM that was determined in step S12 not to have delivered monitoring information within the predetermined time from the previous collection, the VM monitoring unit 131 sets the autoscale VM operating flag in the VM management table 122 to "False". If the flag is already "False", it is left unchanged.

（Ｓ１４）ＶＭ監視部１３１は、オートスケールサーバ管理テーブル１２１を参照して、稼働中フラグが「Ｔｒｕｅ」であるか否かを判定する。稼働中フラグが「Ｔｒｕｅ」の場合、ステップＳ１６に処理が進む。稼働中フラグが「Ｆａｌｓｅ」の場合、ステップＳ１５に処理が進む。 (S14) The VM monitoring unit 131 refers to the autoscale server management table 121 and determines whether or not the operating flag is “True”. If the operating flag is "True", the process proceeds to step S16. If the operating flag is "False", the process proceeds to step S15.

（Ｓ１５）ＶＭ監視部１３１は、ステップＳ１３でオートスケールＶＭ動作中フラグを「Ｆａｌｓｅ」に設定した監視対象ＶＭについて、ＶＭ管理テーブル１２２のオートスケール情報更新フラグを「Ｔｒｕｅ」に設定する。 (S15) The VM monitoring unit 131 sets the autoscale information update flag of the VM management table 122 to "True" for the monitored VM for which the autoscale VM operating flag is set to "False" in step S13.

（Ｓ１６）ＶＭ監視部１３１は、監視を継続するか否を判定する。監視を継続する場合、監視の周期の分だけ待機して、ステップＳ１０に処理が進む。監視を継続しない場合、ＶＭ監視の処理が終了する。例えば、ＶＭ監視部１３１は、システム管理者による監視の終了の入力を受け付けた場合、監視を継続しないと判定し、それ以外の場合に監視を継続すると判定する。 (S16) The VM monitoring unit 131 determines whether or not to continue monitoring. When continuing monitoring, the process proceeds to step S10 after waiting for the monitoring cycle. If the monitoring is not continued, the VM monitoring process ends. For example, when the VM monitoring unit 131 receives the input of the end of monitoring by the system administrator, it determines that the monitoring is not continued, and determines that the monitoring is continued in other cases.
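
One pass of steps S10 to S15 can be sketched in Python as follows. This is a minimal illustration, not the implementation in the specification: `VmRecord`, `vm_monitor_cycle`, and the parameter names are hypothetical, heartbeat reception is abstracted into a set of VM names, and the waiting and termination check of step S16 are left to the caller. As an assumption drawn from the FIG. 13 example, the update flag is raised only when the operating flag actually changes from "True" to "False" while the autoscale server is stopped.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class VmRecord:
    """One row of the VM management table (field names are illustrative)."""
    name: str
    operating: bool = True       # autoscale VM operating flag
    needs_update: bool = False   # autoscale information update flag


def vm_monitor_cycle(vm_table: Dict[str, VmRecord],
                     received: Set[str],
                     autoscale_server_running: bool) -> None:
    """Apply one monitoring pass.

    `received` holds the names of VMs whose alive-monitoring packet
    arrived within the predetermined time (S10, S12).
    """
    for vm in vm_table.values():
        if vm.name in received:
            vm.operating = True              # S11
            continue
        was_operating = vm.operating
        vm.operating = False                 # S13
        # S14/S15: remember the change only while the autoscale server
        # is stopped, so it can be reported after the server recovers.
        if was_operating and not autoscale_server_running:
            vm.needs_update = True
```

A caller would invoke `vm_monitor_cycle` once per monitoring period and pass in the operating flag that the autoscale server monitoring process maintains.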

図１２は、オートスケールサーバ監視の例を示すフローチャートである。 FIG. 12 is a flowchart showing an example of autoscale server monitoring.
ＡＳサーバ連携部１３２は下記の処理を定期的に実行する。実行の周期は、運用に応じて定められる。周期は、数秒から数十秒程度でもよいし、１分から数分程度でもよい。 The AS server cooperation unit 132 periodically executes the following processing. The execution cycle is determined according to the operation. The cycle may be about several seconds to several tens of seconds, or may be about one minute to several minutes.

（Ｓ２０）ＡＳサーバ連携部１３２は、オートスケールサーバ２００のＶＭ状態を参照する。具体的には、ＡＳサーバ連携部１３２は、オートスケールサーバ２００に、ＶＭテーブル２１２における各仮想マシンのＶＭ状態を問い合わせる。 (S20) The AS server cooperation unit 132 refers to the VM state of the autoscale server 200. Specifically, the AS server cooperation unit 132 inquires the autoscale server 200 about the VM state of each virtual machine in the VM table 212.

（Ｓ２１）ＡＳサーバ連携部１３２は、オートスケールサーバ２００が動作中であるか否かを判定する。オートスケールサーバ２００が動作中である場合、ステップＳ２３に処理が進む。オートスケールサーバ２００が動作中でない、すなわち、停止している場合、ステップＳ２２に処理が進む。例えば、ＡＳサーバ連携部１３２は、ステップＳ２０の問い合わせに対するオートスケールサーバ２００の応答がある場合、オートスケールサーバ２００が動作中であると判定する。また、ＡＳサーバ連携部１３２は、ステップＳ２０の問い合わせに対するオートスケールサーバ２００の応答がない場合、オートスケールサーバ２００が停止していると判定する。 (S21) The AS server cooperation unit 132 determines whether or not the autoscale server 200 is in operation. If the autoscale server 200 is in operation, the process proceeds to step S23. If the autoscale server 200 is not in operation, that is, it is stopped, the process proceeds to step S22. For example, the AS server cooperation unit 132 determines that the autoscale server 200 is in operation when there is a response from the autoscale server 200 to the inquiry in step S20. Further, the AS server cooperation unit 132 determines that the autoscale server 200 is stopped when there is no response from the autoscale server 200 to the inquiry in step S20.

（Ｓ２２）ＡＳサーバ連携部１３２は、オートスケールサーバ管理テーブル１２１の稼働中フラグを「Ｆａｌｓｅ」に設定する。元々「Ｆａｌｓｅ」の場合はそのままでよい。そして、ステップＳ２７に処理が進む。 (S22) The AS server cooperation unit 132 sets the operating flag of the autoscale server management table 121 to "False". Originally, in the case of "False", it can be left as it is. Then, the process proceeds to step S27.

（Ｓ２３）ＡＳサーバ連携部１３２は、オートスケールサーバ管理テーブル１２１の稼働中フラグを「Ｔｒｕｅ」に設定する。元々「Ｔｒｕｅ」の場合はそのままでよい。 (S23) The AS server cooperation unit 132 sets the operating flag of the autoscale server management table 121 to "True". If the flag is already "True", it is left unchanged.
（Ｓ２４）ＡＳサーバ連携部１３２は、ＶＭ管理テーブル１２２のオートスケール情報更新フラグが「Ｔｒｕｅ」である監視対象ＶＭがあるか否かを判定する。オートスケール情報更新フラグが「Ｔｒｕｅ」である監視対象ＶＭがある場合、ステップＳ２５に処理が進む。オートスケール情報更新フラグが「Ｔｒｕｅ」である監視対象ＶＭがない場合、ステップＳ２７に処理が進む。 (S24) The AS server cooperation unit 132 determines whether or not there is a monitored VM whose autoscale information update flag in the VM management table 122 is "True". If there is a monitored VM whose autoscale information update flag is "True", the process proceeds to step S25. If there is no monitored VM whose autoscale information update flag is "True", the process proceeds to step S27.

（Ｓ２５）ＡＳサーバ連携部１３２は、オートスケールサーバ２００が管理するＶＭ状態を、監視サーバ１００のＶＭ管理テーブル１２２におけるオートスケールＶＭ動作中フラグを基に更新する。具体的には、ＡＳサーバ連携部１３２は、オートスケール情報更新フラグが「Ｔｒｕｅ」である監視対象ＶＭの情報（オートスケールＶＭ動作中フラグ「Ｆａｌｓｅ」を示す情報）を、オートスケールサーバ２００に送信する。 (S25) The AS server cooperation unit 132 updates the VM state managed by the autoscale server 200 based on the autoscale VM operating flag in the VM management table 122 of the monitoring server 100. Specifically, the AS server cooperation unit 132 transmits, to the autoscale server 200, the information on each monitored VM whose autoscale information update flag is "True" (information indicating that the autoscale VM operating flag is "False").

（Ｓ２６）ＡＳサーバ連携部１３２は、ＶＭ管理テーブル１２２におけるオートスケール情報更新フラグを「Ｆａｌｓｅ」に設定する。具体的には、ＡＳサーバ連携部１３２は、ステップＳ２４でオートスケール情報更新フラグが「Ｔｒｕｅ」であった箇所を、「Ｆａｌｓｅ」に変更する。 (S26) The AS server linkage unit 132 sets the autoscale information update flag in the VM management table 122 to “False”. Specifically, the AS server cooperation unit 132 changes the place where the autoscale information update flag is "True" in step S24 to "False".

（Ｓ２７）ＡＳサーバ連携部１３２は、オートスケールサーバ２００から取得した各監視対象ＶＭのＶＭ状態に応じて異常の発生を検知し、システム管理者に異常を通知する。例えば、ＡＳサーバ連携部１３２は、ＶＭ管理テーブル１２２においてオートスケールＶＭ動作中フラグが「Ｆａｌｓｅ」で、かつ、オートスケールサーバ２００に確認したＶＭ状態が「スケールインによる停止」でない仮想マシンを異常と判定する。例えば、ＡＳサーバ連携部１３２は、ディスプレイ１１１に異常を示す画像を表示させてもよい。あるいは、ＡＳサーバ連携部１３２は、システム管理者が利用する端末装置に、異常を示すメッセージを送信してもよい。なお、ステップＳ２２を経由してステップＳ２７が実行される場合、ＡＳサーバ連携部１３２はオートスケールサーバ２００からＶＭ状態を取得できないことになる。この場合、ＡＳサーバ連携部１３２は、ステップＳ２７をスキップしてステップＳ２８を実行してもよい。あるいは、ＡＳサーバ連携部１３２は、例外的にオートスケールサーバ２００への確認なしに、オートスケールＶＭ動作中フラグが「Ｆａｌｓｅ」の仮想マシンを異常とみなして、システム管理者に当該仮想マシンの異常を通知してもよい。 (S27) The AS server cooperation unit 132 detects the occurrence of an abnormality according to the VM state of each monitored VM acquired from the autoscale server 200, and notifies the system administrator of the abnormality. For example, the AS server cooperation unit 132 determines that a virtual machine is abnormal when its autoscale VM operating flag in the VM management table 122 is "False" and the VM state confirmed with the autoscale server 200 is not "stopped by scale-in". For example, the AS server cooperation unit 132 may display an image indicating the abnormality on the display 111. Alternatively, the AS server cooperation unit 132 may send a message indicating the abnormality to the terminal device used by the system administrator. When step S27 is reached via step S22, the AS server cooperation unit 132 cannot acquire the VM state from the autoscale server 200. In this case, the AS server cooperation unit 132 may skip step S27 and execute step S28. Alternatively, as an exception, the AS server cooperation unit 132 may regard a virtual machine whose autoscale VM operating flag is "False" as abnormal without confirming with the autoscale server 200, and notify the system administrator of the abnormality of that virtual machine.

（Ｓ２８）ＡＳサーバ連携部１３２は、監視を継続するか否かを判定する。監視を継続する場合、監視の周期の分だけ待機して、ステップＳ２０に処理が進む。監視を継続しない場合、オートスケールサーバ監視の処理が終了する。例えば、ＡＳサーバ連携部１３２は、システム管理者による監視の終了の入力を受け付けた場合、監視を継続しないと判定し、それ以外の場合に監視を継続すると判定する。 (S28) The AS server cooperation unit 132 determines whether or not to continue monitoring. When continuing monitoring, the process proceeds to step S20 after waiting for the monitoring cycle. If monitoring is not continued, the autoscale server monitoring process ends. For example, when the AS server cooperation unit 132 receives the input of the end of monitoring by the system administrator, it determines that the monitoring is not continued, and determines that the monitoring is continued in other cases.
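
Steps S20 to S27 can be sketched in Python as follows. The names `VmRecord`, `as_monitor_cycle`, `query_states`, `push_update`, and the state string in `SCALE_IN_STOPPED` are assumptions made for illustration; the specification does not define a concrete interface between the monitoring server and the autoscale server. Of the two options described for the case where the autoscale server does not respond, this sketch takes the first one and skips the abnormality check.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

SCALE_IN_STOPPED = "scale-in degenerate"  # assumed label of the VM state


@dataclass
class VmRecord:
    """One row of the VM management table (field names are illustrative)."""
    name: str
    operating: bool = True       # autoscale VM operating flag
    needs_update: bool = False   # autoscale information update flag


def as_monitor_cycle(
    vm_table: Dict[str, VmRecord],
    query_states: Callable[[], Optional[Dict[str, str]]],
    push_update: Callable[[List[str]], None],
) -> Tuple[bool, List[str]]:
    """Run one pass and return (server_running, abnormal_vm_names).

    `query_states` returns the VM states held by the autoscale server,
    or None when the server does not respond.
    """
    states = query_states()                       # S20
    server_running = states is not None           # S21, then S22 or S23
    abnormal: List[str] = []
    if server_running:
        pending = [n for n, vm in vm_table.items() if vm.needs_update]  # S24
        if pending:
            push_update(pending)                  # S25: report stopped VMs
            for name in pending:
                vm_table[name].needs_update = False  # S26
        # S27: stopped VMs that the autoscale server did not scale in
        abnormal = [
            n for n, vm in vm_table.items()
            if not vm.operating and states.get(n) != SCALE_IN_STOPPED
        ]
    return server_running, abnormal
```

The returned `server_running` value corresponds to the operating flag of the autoscale server management table 121, and the caller would notify the system administrator of every name in the returned list.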

次に、監視サーバ１００による監視の例を説明する。 Next, an example of monitoring by the monitoring server 100 will be described.
図１３は、監視サーバによる監視の例を示す図である。 FIG. 13 is a diagram showing an example of monitoring by the monitoring server.
説明を簡単にするため、ＶＭ管理テーブル１２２の各項目のうち、ＶＭ名とＶＭ動作中フラグ（オートスケールＶＭ動作中フラグに相当）とを図示し、他の項目の図示を省略する。また、ＶＭテーブル２１２の各項目のうち、ＶＭ名とＶＭ状態とを図示し、他の項目の図示を省略する。また、ＶＭ名「ＶＭ１」の仮想マシンを、仮想マシンＶＭ１のように表記する（他のＶＭ名についても同様に表記する）。 For the sake of simplicity, among the items of the VM management table 122, the VM name and the VM operating flag (corresponding to the autoscale VM operating flag) are shown, and the other items are omitted. Further, among the items of the VM table 212, the VM name and the VM state are shown, and the illustration of other items is omitted. Further, the virtual machine having the VM name "VM1" is described as the virtual machine VM1 (the other VM names are also described in the same manner).

まず、オートスケールサーバ２００が稼働中の場合を考える（ステップＳＴ１１）。ＶＭ管理テーブル１２２によれば、この段階において、仮想マシンＶＭ１のＶＭ動作中フラグは「Ｔｒｕｅ」である。仮想マシンＶＭ２のＶＭ動作中フラグは「Ｆａｌｓｅ」である。仮想マシンＶＭ３のＶＭ動作中フラグは「Ｔｒｕｅ」である。仮想マシンＶＭ４のＶＭ動作中フラグは「Ｔｒｕｅ」である。一方、ＶＭテーブル２１２によれば、仮想マシンＶＭ１のＶＭ状態は「正常」である。仮想マシンＶＭ２のＶＭ状態は「スケールイン縮退」である。仮想マシンＶＭ３のＶＭ状態は「正常」である。仮想マシンＶＭ４のＶＭ状態は「正常」である。ＶＭ管理テーブル１２２で、仮想マシンＶＭ２のＶＭ動作中フラグが「Ｆａｌｓｅ」なので、監視サーバ１００は、オートスケールサーバ２００に仮想マシンＶＭ２のＶＭ状態を問い合わせる。オートスケールサーバ２００は、ＶＭテーブル２１２に基づいて、仮想マシンＶＭ２のＶＭ状態「スケールイン縮退」を監視サーバ１００に応答する。この場合、監視サーバ１００は、仮想マシンＶＭ２から監視情報を取得できなかったことを異常とみなさない。 First, consider the case where the autoscale server 200 is in operation (step ST11). According to the VM management table 122, the VM operating flag of the virtual machine VM1 is "True" at this stage. The VM operating flag of the virtual machine VM2 is "False". The VM operating flag of the virtual machine VM3 is "True". The VM operating flag of the virtual machine VM4 is "True". On the other hand, according to the VM table 212, the VM state of the virtual machine VM1 is "normal". The VM state of the virtual machine VM2 is "scale-in degenerate". The VM state of the virtual machine VM3 is "normal". The VM state of the virtual machine VM4 is "normal". Since the VM operating flag of the virtual machine VM2 is "False" in the VM management table 122, the monitoring server 100 inquires the autoscale server 200 of the VM status of the virtual machine VM2. The autoscale server 200 responds to the monitoring server 100 with the VM state "scale-in degenerate" of the virtual machine VM2 based on the VM table 212. In this case, the monitoring server 100 does not consider that the monitoring information could not be acquired from the virtual machine VM2 as an abnormality.

その後、オートスケールサーバ２００が停止した場合を考える（ステップＳＴ１２）。ＶＭテーブル２１２は、オートスケールサーバ２００が停止している間も、オートスケールサーバ２００の不揮発性の記憶装置（例えば、ＨＤＤ）に保持されている。監視サーバ１００は、オートスケールサーバ２００に対するＶＭ状態の定期的な問い合わせに対して、オートスケールサーバ２００からの応答がないことを検知することで、オートスケールサーバ２００が停止したことを検知する。 After that, consider the case where the autoscale server 200 is stopped (step ST12). The VM table 212 is held in the non-volatile storage device (for example, HDD) of the autoscale server 200 even while the autoscale server 200 is stopped. The monitoring server 100 detects that the autoscale server 200 has stopped by detecting that there is no response from the autoscale server 200 in response to a periodic inquiry of the VM state to the autoscale server 200.

監視サーバ１００は、仮想マシンＶＭ４との通信不可を検知する。すると、監視サーバ１００は、ＶＭ管理テーブル１２２において、仮想マシンＶＭ４のＶＭ動作中フラグを「Ｆａｌｓｅ」に変更することで、ＶＭ管理テーブル１２２をＶＭ管理テーブル１２３に更新する。監視サーバ１００は、仮想マシンＶＭ４について、オートスケールサーバ２００が停止している間にＶＭ動作中フラグを「Ｔｒｕｅ」から「Ｆａｌｓｅ」に変更したので、オートスケール情報更新フラグ（図１３では図示を省略している）を「Ｔｒｕｅ」に設定する。 The monitoring server 100 detects that communication with the virtual machine VM4 is not possible. The monitoring server 100 then changes the VM operating flag of the virtual machine VM4 to "False" in the VM management table 122, thereby updating the VM management table 122 to the VM management table 123. Because the monitoring server 100 changed the VM operating flag of the virtual machine VM4 from "True" to "False" while the autoscale server 200 was stopped, it sets the autoscale information update flag (not shown in FIG. 13) to "True".

更にその後、オートスケールサーバ２００が復旧した場合を考える（ステップＳＴ１３）。例えば、監視サーバ１００は、オートスケールサーバ２００に対するＶＭ状態の定期的な問い合わせに対してオートスケールサーバ２００からの応答が再開されたことを検知することで、オートスケールサーバ２００の起動を検知する。当該応答は、仮想マシンＶＭ４が「正常」（ただし、実際の状態とは異なる）である旨を含む。監視サーバ１００は、オートスケールサーバ２００からＶＭ状態の応答を受け付けると、仮想マシンＶＭ４の停止がスケールインによる停止ではないことを検知し、仮想マシンＶＭ４の異常をシステム管理者に通知する。 Further, after that, consider the case where the autoscale server 200 is restored (step ST13). For example, the monitoring server 100 detects the start of the autoscale server 200 by detecting that the response from the autoscale server 200 is resumed in response to the periodic inquiry of the VM state to the autoscale server 200. The response includes that the virtual machine VM4 is "normal" (but not in the actual state). When the monitoring server 100 receives the response of the VM state from the autoscale server 200, it detects that the stop of the virtual machine VM4 is not the stop due to the scale-in, and notifies the system administrator of the abnormality of the virtual machine VM4.

そして、監視サーバ１００は、ＶＭ管理テーブル１２３に基づいて、オートスケールサーバ２００が停止している間に仮想マシンＶＭ４との通信不可を検知したことを、オートスケールサーバ２００に通知する。オートスケールサーバ２００は、当該通知に応じて、ＶＭテーブル２１２の仮想マシンＶＭ４のＶＭ状態を「ＥＲＲＯＲ」に変更することで、ＶＭテーブル２１２をＶＭテーブル２１４に更新する。そして、オートスケールサーバ２００は、ＶＭテーブル２１４により各仮想マシンのオートスケール制御を再開する。 Then, the monitoring server 100 notifies the autoscale server 200 that it has detected that communication with the virtual machine VM4 is not possible while the autoscale server 200 is stopped, based on the VM management table 123. The autoscale server 200 updates the VM table 212 to the VM table 214 by changing the VM state of the virtual machine VM4 of the VM table 212 to "ERROR" in response to the notification. Then, the autoscale server 200 restarts the autoscale control of each virtual machine by the VM table 214.

なお、監視サーバ１００は、オートスケールサーバ２００の起動を検知したタイミングではなく、オートスケールサーバ２００から仮想マシンＶＭ４のＶＭ状態として「ＥＲＲＯＲ」を取得したタイミングで仮想マシンＶＭ４の異常を検知し、システム管理者に通知してもよい。 Note that the monitoring server 100 may detect the abnormality of the virtual machine VM4 and notify the system administrator not at the timing when it detects the startup of the autoscale server 200, but at the timing when it acquires "ERROR" as the VM state of the virtual machine VM4 from the autoscale server 200.
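
On the autoscale server side, the update from the VM table 212 to the VM table 214 after recovery can be pictured with the short Python sketch below. The function name `apply_stop_notification` and the dictionary representation of the VM table are hypothetical; the specification only states that the VM state of the notified virtual machine is changed to "ERROR". Names that are not in the table are ignored here, which is an assumption of this sketch.

```python
from typing import Dict, Iterable


def apply_stop_notification(vm_states: Dict[str, str],
                            stopped_vms: Iterable[str]) -> Dict[str, str]:
    """Return a new VM table with the notified VMs marked as "ERROR".

    `vm_states` maps VM name to VM state as held before the autoscale
    server stopped; the input mapping is not modified.
    """
    updated = dict(vm_states)
    for name in stopped_vms:
        if name in updated:
            updated[name] = "ERROR"
    return updated


# State of the table in step ST11 of the FIG. 13 example.
table_212 = {
    "VM1": "normal",
    "VM2": "scale-in degenerate",
    "VM3": "normal",
    "VM4": "normal",
}
# Notification sent by the monitoring server in step ST13.
table_214 = apply_stop_notification(table_212, ["VM4"])
```

After this update, autoscale control resumes from `table_214`, in which only VM4 differs from the state saved before the outage.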

次に、監視の比較例を説明する。 Next, a comparative example of monitoring will be described.
図１４は、監視の比較例を示す図である。 FIG. 14 is a diagram showing a comparative example of monitoring.
比較例では、仮想マシンを監視する監視サーバ７００と、仮想マシンに対するオートスケール制御を行うオートスケールサーバ８００とを含むシステムを考える。ただし、監視サーバ７００は、オートスケールサーバ８００と連携する機能を有していない。 In the comparative example, consider a system including a monitoring server 700 that monitors virtual machines and an autoscale server 800 that performs autoscale control for the virtual machines. However, the monitoring server 700 does not have a function of cooperating with the autoscale server 800.

監視サーバ７００は、各仮想マシンの死活監視の状況を管理するＶＭ監視テーブル７０１を記憶する。ＶＭ監視テーブル７０１には、ＶＭ名とＶＭ動作フラグとが記録される。ＶＭ動作フラグは、「Ｔｒｕｅ」が動作中、「Ｆａｌｓｅ」が停止を示す。 The monitoring server 700 stores a VM monitoring table 701 that manages the alive monitoring status of each virtual machine. The VM name and the VM operation flag are recorded in the VM monitoring table 701. For the VM operation flag, "True" indicates that the virtual machine is operating, and "False" indicates that it is stopped.

オートスケールサーバ８００は、各仮想マシンの状態を管理するＶＭ状態テーブル８０１を記憶する。ＶＭ状態テーブル８０１には、ＶＭ名とＶＭ状態とが記録される。 The autoscale server 800 stores a VM state table 801 that manages the state of each virtual machine. The VM name and the VM state are recorded in the VM state table 801.
まず、オートスケールサーバ８００が稼働中の場合を考える（ステップＳＴ２１）。ＶＭ監視テーブル７０１によれば、この段階において、仮想マシンＶＭ１のＶＭ動作中フラグは「Ｔｒｕｅ」である。仮想マシンＶＭ２のＶＭ動作中フラグは「Ｆａｌｓｅ」である。仮想マシンＶＭ３のＶＭ動作中フラグは「Ｔｒｕｅ」である。仮想マシンＶＭ４のＶＭ動作中フラグは「Ｔｒｕｅ」である。一方、ＶＭ状態テーブル８０１によれば、仮想マシンＶＭ１のＶＭ状態は「正常」である。仮想マシンＶＭ２のＶＭ状態は「スケールイン縮退」である。仮想マシンＶＭ３のＶＭ状態は「正常」である。仮想マシンＶＭ４のＶＭ状態は「正常」である。ＶＭ監視テーブル７０１で、仮想マシンＶＭ２のＶＭ動作中フラグが「Ｆａｌｓｅ」なので、監視サーバ７００は、オートスケールサーバ８００に仮想マシンＶＭ２のＶＭ状態を問い合わせる。オートスケールサーバ８００は、ＶＭ状態テーブル８０１に基づいて、仮想マシンＶＭ２のＶＭ状態「スケールイン縮退」を監視サーバ７００に応答する。この場合、監視サーバ７００は、仮想マシンＶＭ２から監視情報を取得できなかったことを異常とみなさない。 First, consider the case where the autoscale server 800 is in operation (step ST21). According to the VM monitoring table 701, the VM operating flag of the virtual machine VM1 is "True" at this stage. The VM operating flag of the virtual machine VM2 is "False". The VM operating flag of the virtual machine VM3 is "True". The VM operating flag of the virtual machine VM4 is "True". On the other hand, according to the VM state table 801, the VM state of the virtual machine VM1 is "normal". The VM state of the virtual machine VM2 is "scale-in degenerate". The VM state of the virtual machine VM3 is "normal". The VM state of the virtual machine VM4 is "normal". Since the VM operating flag of the virtual machine VM2 is "False" in the VM monitoring table 701, the monitoring server 700 inquires the autoscale server 800 about the VM state of the virtual machine VM2. The autoscale server 800 responds to the monitoring server 700 with the VM state "scale-in degenerate" of the virtual machine VM2 based on the VM state table 801. In this case, the monitoring server 700 does not consider the failure to acquire monitoring information from the virtual machine VM2 to be an abnormality.

その後、オートスケールサーバ８００が停止した場合を考える（ステップＳＴ２２）。ＶＭ状態テーブル８０１は、オートスケールサーバ８００が停止している間も、オートスケールサーバ８００の不揮発性の記憶装置（例えば、ＨＤＤ）に保持されている。監視サーバ７００は、オートスケールサーバ８００に対するＶＭ状態の定期的な問い合わせに対して、オートスケールサーバ８００からの応答がないことを検知することで、オートスケールサーバ８００が停止したことを検知する。 After that, consider the case where the autoscale server 800 is stopped (step ST22). The VM state table 801 is held in the non-volatile storage device (for example, HDD) of the autoscale server 800 even while the autoscale server 800 is stopped. The monitoring server 700 detects that the autoscale server 800 has stopped by detecting that there is no response from the autoscale server 800 in response to a periodic inquiry of the VM state to the autoscale server 800.

監視サーバ７００は、仮想マシンＶＭ４との通信不可を検知する。すると、監視サーバ７００は、ＶＭ監視テーブル７０１において、仮想マシンＶＭ４のＶＭ動作中フラグを「Ｆａｌｓｅ」に変更することで、ＶＭ監視テーブル７０１をＶＭ監視テーブル７０２に更新する。 The monitoring server 700 detects that communication with the virtual machine VM4 is not possible. Then, the monitoring server 700 updates the VM monitoring table 701 to the VM monitoring table 702 by changing the VM operating flag of the virtual machine VM4 to "False" in the VM monitoring table 701.

更にその後、オートスケールサーバ８００が復旧した場合を考える（ステップＳＴ２３）。例えば、監視サーバ７００は、オートスケールサーバ８００に対するＶＭ状態の定期的な問い合わせに対してオートスケールサーバ８００からの応答が再開されたことを検知することで、オートスケールサーバ８００の起動を検知する。監視サーバ７００は、ＶＭ状態の応答に基づいて、仮想マシンＶＭ４のＶＭ状態が「正常」（ただし、実際の状態とは異なる）であり、スケールインによる停止ではないことを検知すると、システム管理者に仮想マシンＶＭ４の異常を通知する。 Further, after that, consider the case where the autoscale server 800 is restored (step ST23). For example, the monitoring server 700 detects the startup of the autoscale server 800 by detecting that the autoscale server 800 has resumed responding to the periodic inquiries about the VM state. When the monitoring server 700 detects, based on the VM state response, that the VM state of the virtual machine VM4 is "normal" (which differs from the actual state) and that the stop is therefore not due to scale-in, it notifies the system administrator of the abnormality of the virtual machine VM4.

オートスケールサーバ８００は、ＶＭ状態テーブル８０１によりオートスケール制御を再開する。ＶＭ状態テーブル８０１は、仮想マシンＶＭ４が「正常」として管理されている。このため、オートスケールサーバ８００は、仮想マシンＶＭ４が属するオートスケールグループに関してオートスケール制御を適切に行うことができない。また、オートスケールサーバ８００が仮想マシンＶＭ４の異常を検知するまでに、１０分から数十分かかることもある。この間、ユーザが利用するアプリケーションなどの処理負荷が高まると、適切なスケールアウトを行えず、当該処理に遅延が生じるおそれがある。 The autoscale server 800 restarts the autoscale control according to the VM status table 801. In the VM status table 801 the virtual machine VM4 is managed as "normal". Therefore, the autoscale server 800 cannot properly perform autoscale control with respect to the autoscale group to which the virtual machine VM4 belongs. In addition, it may take 10 minutes to several tens of minutes for the autoscale server 800 to detect an abnormality in the virtual machine VM4. During this period, if the processing load of the application used by the user increases, appropriate scale-out cannot be performed, and the processing may be delayed.

一方、第２の実施の形態のクラスタシステムによれば、監視サーバ１００とオートスケールサーバ２００とを連携させ、オートスケールサーバ２００が起動すると、監視サーバ１００により最新の仮想マシンの情報をオートスケールサーバ２００に提供する。このため、オートスケールサーバ２００は、最新の仮想マシンの情報で復旧し、オートスケール制御を再開することができる。このため、オートスケールサーバ２００が停止している間に停止した仮想マシンを、オートスケールサーバ２００に適切に把握させ、オートスケール制御を適切に再開させることができる。その結果、ユーザが利用するアプリケーションの処理への影響を抑えられる。 On the other hand, according to the cluster system of the second embodiment, when the monitoring server 100 and the autoscale server 200 are linked and the autoscale server 200 is started, the monitoring server 100 outputs the latest virtual machine information to the autoscale server. Provide to 200. Therefore, the autoscale server 200 can recover with the latest virtual machine information and restart the autoscale control. Therefore, the virtual machine stopped while the autoscale server 200 is stopped can be appropriately grasped by the autoscale server 200, and the autoscale control can be appropriately restarted. As a result, the influence on the processing of the application used by the user can be suppressed.

［第３の実施の形態］ [Third Embodiment]
次に、第３の実施の形態を説明する。前述の第２の実施の形態と相違する事項を主に説明し、共通する事項の説明を省略する。 Next, a third embodiment will be described. Matters that differ from the second embodiment described above will be mainly described, and explanations of common matters will be omitted.

第２の実施の形態の例では、オートスケールサーバ２００によるオートスケールの制御対象の仮想マシンと、監視サーバ１００による監視対象の仮想マシンとが一致していたが、監視サーバ１００は、オートスケールの制御対象以外の仮想マシンの監視も行える。 In the example of the second embodiment, the virtual machine to be controlled by the autoscale server 200 and the virtual machine to be monitored by the monitoring server 100 are the same, but the monitoring server 100 is the autoscale. You can also monitor virtual machines that are not controlled.

図１５は、第３の実施の形態の仮想マシンの例を示す図である。 FIG. 15 is a diagram showing an example of virtual machines according to the third embodiment.
例えば、物理サーバ３００が仮想マシン３１０，３２０を実行し、物理サーバ４００が仮想マシン４１０，４２０，４３０を実行することを考える。このうち、オートスケールサーバ２００によるオートスケールの制御対象は、仮想マシン３１０，３２０，４１０，４２０である。仮想マシン４３０は、オートスケールサーバ２００によるオートスケールの制御の対象外である。一方、監視サーバ１００による監視対象は、仮想マシン３１０，３２０，４１０，４２０，４３０である。 For example, consider that the physical server 300 executes the virtual machines 310 and 320, and the physical server 400 executes the virtual machines 410, 420 and 430. Of these, the virtual machines 310, 320, 410, and 420 are the targets of autoscale control by the autoscale server 200. The virtual machine 430 is not subject to autoscale control by the autoscale server 200. On the other hand, the monitoring targets of the monitoring server 100 are the virtual machines 310, 320, 410, 420, and 430.

このように、オートスケールサーバ２００によるオートスケールの制御対象の仮想マシンの範囲と、監視サーバ１００による監視対象の仮想マシンの範囲とは一致していなくてもよい。監視サーバ１００は、オートスケールの制御対象でない仮想マシンに対する死活監視により、当該仮想マシンとの通信不可を検知すると、当該仮想マシンについてのオートスケール状況の確認を行わずに、当該仮想マシンの異常を検知し、システム管理者に通知する。監視サーバ１００は、監視対象の仮想マシンがオートスケールの制御対象であるか否かをＶＭ管理テーブルにより管理する。 As described above, the range of the virtual machines to be controlled by the autoscale server 200 and the range of the virtual machines to be monitored by the monitoring server 100 do not have to match. When the monitoring server 100 detects that communication with the virtual machine is impossible by alive monitoring of the virtual machine that is not the control target of the autoscale, the monitoring server 100 does not check the autoscale status of the virtual machine and causes an abnormality in the virtual machine. Detect and notify the system administrator. The monitoring server 100 manages whether or not the virtual machine to be monitored is an autoscale control target by using the VM management table.

図１６は、ＶＭ管理テーブルの例を示す図である。 FIG. 16 is a diagram showing an example of a VM management table.
ＶＭ管理テーブル１２４は、記憶部１２０に格納される。ＶＭ管理テーブル１２４は、オートスケール可否フラグ、ＶＭ名、通信用ＩＰアドレス、オートスケールＶＭ動作中フラグおよびオートスケール情報更新フラグの項目を含む。 The VM management table 124 is stored in the storage unit 120. The VM management table 124 includes the items of an autoscale enable/disable flag, a VM name, a communication IP address, an autoscale VM operating flag, and an autoscale information update flag.

オートスケール可否フラグの項目には、該当の仮想マシンがオートスケール制御の対象であるか否かを示す情報が登録される。該当の仮想マシンがオートスケール制御の対象の場合、オートスケール可否フラグは「対象」である。該当の仮想マシンがオートスケール制御の対象外の場合、オートスケール可否フラグは「対象外」である。 Information indicating whether or not the corresponding virtual machine is subject to autoscale control is registered in the item of the autoscale enable / disable flag. When the corresponding virtual machine is the target of autoscale control, the autoscale enable / disable flag is "target". If the virtual machine is not subject to autoscale control, the autoscale enable / disable flag is "not subject".

ＶＭ名、通信用ＩＰアドレス、オートスケールＶＭ動作中フラグおよびオートスケール情報更新フラグの項目に登録される情報は、ＶＭ管理テーブル１２２における同名の項目に登録される情報と同様である。ただし、オートスケールＶＭ動作中フラグおよびオートスケール情報更新フラグの項目は、オートスケール可否フラグが「対象外」の場合、設定なし（図では設定なしをハイフン記号「－」で示す）となる。 The information registered in the items of the VM name, the communication IP address, the autoscale VM operating flag, and the autoscale information update flag is the same as the information registered in the items of the same name in the VM management table 122. However, when the autoscale enable/disable flag is "not subject", the autoscale VM operating flag and the autoscale information update flag are not set (in the figure, "not set" is indicated by a hyphen "-").

For example, the VM management table 124 contains a record in which the autoscale enable/disable flag is "non-target", the VM name is "VMnormal", the communication IP address is "110.10.1.1", and neither the autoscale VM operating flag nor the autoscale information update flag is set ("-"). This record indicates that the virtual machine named "VMnormal" is not subject to autoscale control and that its IP address is "110.10.1.1".

As another example, the VM management table 124 contains a record in which the autoscale enable/disable flag is "target", the VM name is "Grp1_VM1", the communication IP address is "100.10.99.1", the autoscale VM operating flag is "True", and the autoscale information update flag is "False". This record indicates that the virtual machine named "Grp1_VM1" is subject to autoscale control and that its communication IP address is "100.10.99.1". It further indicates that the virtual machine is running and that the autoscale VM operating flag has not been updated while the autoscale server 200 was stopped.
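The record layout described above can be pictured as a small data structure. The following is a minimal sketch only: the class name, field names, and the use of None for the unset ("-") entries are illustrative assumptions and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VmRecord:
    """One row shaped like the VM management table 124 (names are hypothetical)."""
    autoscale_target: bool               # autoscale enable/disable flag: True = "target"
    vm_name: str
    comm_ip: str                         # communication IP address
    running: Optional[bool] = None       # autoscale VM operating flag; None = not set ("-")
    info_updated: Optional[bool] = None  # autoscale information update flag; None = not set ("-")

    def __post_init__(self) -> None:
        # Records for VMs outside autoscale control carry no autoscale flags,
        # mirroring the "-" entries of the example table.
        if not self.autoscale_target:
            self.running = None
            self.info_updated = None


# The two example records from the description, keyed by VM name.
vm_table = {
    r.vm_name: r
    for r in (
        VmRecord(False, "VMnormal", "110.10.1.1"),
        VmRecord(True, "Grp1_VM1", "100.10.99.1", running=True, info_updated=False),
    )
}
```

Keying the table by VM name is a convenience of this sketch; the specification does not state how records are looked up.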

Next, the VM monitoring procedure performed by the VM monitoring unit 131 using the VM management table 124 will be described. In the third embodiment, the VM monitoring unit 131 executes the following procedure in place of the VM monitoring procedure described with reference to FIG. 11.

FIG. 17 is a flowchart showing an example of VM monitoring.
The VM monitoring unit 131 executes the following process periodically. The execution cycle is determined according to operational requirements. The cycle may be from several seconds to several tens of seconds, or from one minute to several minutes.

(S30) The VM monitoring unit 131 collects monitoring information from the virtual machines to be monitored (monitored VMs). For example, the VM monitoring unit 131 collects monitoring information by receiving a predetermined alive-monitoring packet from each monitored virtual machine.

(S31) The VM monitoring unit 131 updates the operating status of the monitored VMs. Specifically, the VM monitoring unit 131 updates the VM management table 124 based on the result of collecting monitoring information in step S30. That is, the VM monitoring unit 131 sets the autoscale VM operating flag to "True" for each virtual machine from which monitoring information was collected (that is, from which an alive-monitoring packet was received). If the flag is already "True", it is left unchanged.

(S32) The VM monitoring unit 131 determines whether there is a monitored VM whose monitoring information has not arrived within a predetermined time since the previous collection of monitoring information. The predetermined time is either the monitoring cycle or the monitoring cycle plus a relatively short margin (shorter than the cycle). If there is such a monitored VM, the process proceeds to step S33. If there is none, the process proceeds to step S38.

(S33) The VM monitoring unit 131 refers to the VM management table 124 and determines whether the autoscale enable/disable flag of the monitored VM whose monitoring information has not arrived within the predetermined time is "target". If the flag is "target", the process proceeds to step S35. If it is not (that is, if it is "non-target"), the process proceeds to step S34.

(S34) The VM monitoring unit 131 notifies the system administrator of the abnormality of the virtual machine. For example, the VM monitoring unit 131 may cause the display 111 to show an image indicating the abnormality. Alternatively, the VM monitoring unit 131 may send a message indicating the abnormality to a terminal device used by the system administrator. The process then proceeds to step S38.

(S35) For the monitored VM determined in step S32 not to have delivered monitoring information within the predetermined time, the VM monitoring unit 131 sets the autoscale VM operating flag in the VM management table 124 to "False". If the flag is already "False", it is left unchanged.

(S36) The VM monitoring unit 131 refers to the autoscale server management table 121 and determines whether the operating flag is "True". If the operating flag is "True", the process proceeds to step S38. If the operating flag is "False", the process proceeds to step S37.

(S37) For the monitored VM whose autoscale VM operating flag was set to "False" in step S35, the VM monitoring unit 131 sets the autoscale information update flag in the VM management table 124 to "True".

(S38) The VM monitoring unit 131 determines whether to continue monitoring. To continue, it waits for one monitoring cycle and the process returns to step S30. Otherwise, the VM monitoring process ends. For example, the VM monitoring unit 131 determines not to continue when it receives an input from the system administrator ending the monitoring, and determines to continue otherwise.
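Steps S31 to S37 above can be sketched as a single monitoring pass. This is an illustration under stated assumptions, not the claimed implementation: the names Vm, monitor_once, last_seen, and notify are hypothetical, step S30 is assumed to have already recorded the arrival time of each alive-monitoring packet in last_seen, and the operating flag is updated only for VMs under autoscale control because the flag is unset for the others.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Vm:
    autoscale_target: bool               # autoscale enable/disable flag
    running: Optional[bool] = None       # autoscale VM operating flag
    info_updated: Optional[bool] = None  # autoscale information update flag


def monitor_once(
    table: Dict[str, Vm],
    last_seen: Dict[str, float],
    now: float,
    timeout: float,
    autoscale_server_up: bool,
    notify: Callable[[str], None],
) -> None:
    """One pass over S31 to S37; S30 is assumed to have filled last_seen."""
    for name, vm in table.items():
        arrived = now - last_seen.get(name, float("-inf")) <= timeout
        if arrived:
            # S31: monitoring information arrived, so mark the VM as running.
            # Assumption: only VMs under autoscale control carry this flag.
            if vm.autoscale_target:
                vm.running = True
            continue
        # S32: no monitoring information within the predetermined time.
        if not vm.autoscale_target:
            # S33 -> S34: report the abnormality without any autoscale check.
            notify(name)
            continue
        # S35: record that the autoscale-controlled VM is not running.
        vm.running = False
        # S36 -> S37: if the autoscale server is stopped, mark the change.
        if not autoscale_server_up:
            vm.info_updated = True


# Usage: a 30-second cycle plus a 5-second margin as the predetermined time.
table = {
    "VMnormal": Vm(False),
    "Grp1_VM1": Vm(True, True, False),
    "Grp1_VM2": Vm(True, False, False),
}
alerts: list = []
monitor_once(
    table,
    last_seen={"VMnormal": 0.0, "Grp1_VM1": 0.0, "Grp1_VM2": 90.0},
    now=100.0,
    timeout=35.0,
    autoscale_server_up=False,
    notify=alerts.append,
)
```

In this run only "VMnormal" is reported to the administrator, while "Grp1_VM1" is marked as not running with its update flag set because the autoscale server is assumed to be stopped.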

In the third embodiment as well, the AS server cooperation unit 132 cooperates with the autoscale server 200 according to the autoscale server monitoring procedure of FIG. 12.
As a result, when the autoscale server 200 stops and is then restored, it can be restored based on the latest information on the virtual machines.
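The procedure of FIG. 12 is not reproduced in this section, so the following is only a rough sketch of how the recorded flags could be handed to a restarted autoscale server. It assumes that records whose autoscale information update flag is "True" are the ones to send, and that the flag is cleared afterwards. The function name, payload shape, and the clearing step are all hypothetical.

```python
from typing import Dict, List, TypedDict


class VmState(TypedDict):
    running: bool        # autoscale VM operating flag
    info_updated: bool   # autoscale information update flag


def build_resync_payload(table: Dict[str, VmState]) -> List[dict]:
    """Collect VMs whose state changed while the autoscale server was stopped."""
    payload = [
        {"vm_name": name, "running": state["running"]}
        for name, state in table.items()
        if state["info_updated"]
    ]
    # Hypothetical choice: clear the flags once the changes are handed over.
    for state in table.values():
        state["info_updated"] = False
    return payload


states: Dict[str, VmState] = {
    "Grp1_VM1": {"running": False, "info_updated": True},
    "Grp1_VM2": {"running": True, "info_updated": False},
}
first = build_resync_payload(states)
second = build_resync_payload(states)
```

Here first contains only the entry for "Grp1_VM1", and second is empty because the flags were cleared by the first call.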

Furthermore, based on the autoscale enable/disable flag in the VM management table 124, the monitoring server 100 can monitor virtual machines subject to autoscale control separately from those that are not. For a virtual machine not subject to autoscale control, the monitoring server 100 can skip the autoscale inquiry to the autoscale server 200 and promptly report the abnormality of that virtual machine.

The information processing of the first embodiment can be realized by causing the processing unit 12 to execute a program. The information processing of the second and third embodiments can be realized by causing the CPU 101 to execute a program. The program can be recorded on a computer-readable recording medium 113.

For example, the program can be distributed by distributing the recording medium 113 on which it is recorded. Alternatively, the program may be stored on another computer and distributed via a network. A computer may, for example, store (install) the program recorded on the recording medium 113, or a program received from another computer, in a storage device such as the RAM 102 or the HDD 103, and then read the program from that storage device and execute it.

1 Cluster system
10 Autoscale server monitoring device
11, 21 Storage unit
12, 22 Processing unit
20 Autoscale server
30, 40 Physical server
31, 32, 41, 42 Virtual machine
50 Network
61, 62, 71, 72 Table

Claims

1. A cluster system comprising:
a physical server capable of running a plurality of virtual machines;
an autoscale server that performs scale-in and scale-out of the virtual machines on the physical server; and
an autoscale server monitoring device that periodically communicates with the autoscale server, stores information on the virtual machines managed by the autoscale server, and, after detecting that the autoscale server has stopped, transmits the information on the virtual machines in response to a request from the autoscale server started from the stopped state.

2. The cluster system according to claim 1, wherein the autoscale server monitoring device detects, while the autoscale server is stopped, that communication with a virtual machine is impossible, and, when the autoscale server starts, transmits information indicating that communication with the virtual machine is impossible to the autoscale server.

3. The cluster system according to claim 1, wherein the autoscale server stores state information indicating a state of the virtual machines, updates the state information according to the information on the virtual machines transmitted by the autoscale server monitoring device, and resumes control of the scale-in and the scale-out based on the updated state information.

4. The cluster system according to claim 1, wherein, upon detecting that communication with a virtual machine is impossible, the autoscale server monitoring device detects an abnormality of the virtual machine based on an inquiry to the autoscale server as to whether the virtual machine has been stopped by the scale-in.

5. The cluster system according to claim 4, wherein the plurality of virtual machines include another virtual machine that is not subject to control of the scale-in and the scale-out by the autoscale server, and, upon detecting that communication with the other virtual machine is impossible, the autoscale server monitoring device omits the inquiry to the autoscale server and detects an abnormality of the other virtual machine.

6. An autoscale server monitoring device comprising:
a storage unit that stores information on virtual machines managed by an autoscale server; and
a processing unit that periodically communicates with the autoscale server and, after detecting that the autoscale server has stopped, transmits the information on the virtual machines in response to a request from the autoscale server started from the stopped state.

7. An autoscale server monitoring program that causes a computer to execute a process comprising:
periodically communicating with an autoscale server and storing information on virtual machines managed by the autoscale server; and
after detecting that the autoscale server has stopped, transmitting the information on the virtual machines in response to a request from the autoscale server started from the stopped state.

8. An autoscale server monitoring method in which a computer:
periodically communicates with an autoscale server and stores information on virtual machines managed by the autoscale server; and
after detecting that the autoscale server has stopped, transmits the information on the virtual machines in response to a request from the autoscale server started from the stopped state.