JP5387761B2 - Cluster reconstruction method, cluster reconstruction device, and cluster reconstruction program

Info

Publication number: JP5387761B2
Application number: JP2012512539A
Authority: JP
Inventors: 智宏花田
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2010-04-28
Filing date: 2010-04-28
Publication date: 2014-01-15
Anticipated expiration: 2030-04-28
Also published as: JPWO2011135628A1; WO2011135628A1

Description

The present invention relates to a cluster reconstruction method, a cluster reconstruction device, and a cluster program for a cluster system, and is particularly well suited to cluster systems that perform system switchover control when a failure occurs.

In a typical cluster system, the cluster program on each node determines whether the other nodes are alive by communicating with them. Consequently, when a failure occurs in the network used for inter-node communication, nodes that can no longer reach each other each conclude that the other has failed. This state is called a network split. In a network split, the same application may run on multiple clusters at once, contend for shared resources, and corrupt data. To prevent this, general cluster programs reconstruct the cluster configuration when a network split occurs.

One such reconstruction method, used in conventional cluster systems, works as follows. When a network split occurs, each node issues resets after a delay determined by a reset priority assigned to it in advance. Nodes that cannot communicate with the node having the highest reset priority are thereby reset, and the group of nodes that can communicate with that node is reconstructed as the cluster (hereinafter, the "first method"; see Patent Document 1).

In another conventional method, when a network split occurs, each group of mutually communicable nodes writes its node count to a hard disk called a quorum disk. Each group thereby learns the node counts of the other groups, and the group with the largest node count is reconstructed as the cluster (hereinafter, the "second method"; see Patent Document 2).

Patent Document 1: JP 2009-181597 A
Patent Document 2: JP 2008-192139 A

In the first method, however, the reset priority is fixed in advance, so the cluster cannot be reconstructed according to what each group is processing at the moment the network split occurs. Suppose that, when the split occurs, an application is executing transaction processing, such as deposit and withdrawal processing in a financial institution's system, on a node in a group that does not contain the node with the highest reset priority. That node may be reset by the node with the highest reset priority, interrupting the transaction.

The second method reconstructs the group with the largest node count as the cluster. The operating rate, however, is not necessarily the same for every node, so if nodes with low operating rates happen to gather in the largest group, that group is not necessarily the most highly available one. The second method may therefore fail to reconstruct the cluster from the most highly available group.

The present invention has been made in view of the above, and proposes a cluster reconstruction method and a cluster system that, when a network split occurs, can select the group best suited to be constructed as the cluster on the basis of what each group is processing.

To solve this problem, the present invention provides a cluster reconstruction method for a cluster system including a plurality of nodes connected to one another by a monitoring path, the method comprising: an own-node information generation step in which a given node among the plurality of nodes holds a predetermined condition and generates own-node information corresponding to the predetermined condition; an other-node information collection step in which the given node collects, from other nodes with which it can communicate over the monitoring path, other-node information corresponding to the predetermined condition; a priority generation step in which, when the own-node information or the other-node information exists, the given node generates, based on the other-node information and the own-node information, a priority for an own group that includes at least the own node and consists of nodes able to communicate over the monitoring path; and a group determination step in which the given node determines the group to be reconstructed as the cluster, based on the priority of the own group and the priority of another group consisting of nodes with which the own node cannot communicate over the monitoring path.

The present invention also provides a cluster configuration reconstruction device for a cluster system including a plurality of nodes connected to one another by a monitoring path, wherein each of the plurality of nodes comprises: a node information collection unit that holds a predetermined condition, generates own-node information corresponding to the predetermined condition, and collects, from other nodes with which it can communicate over the monitoring path, other-node information corresponding to the predetermined condition; and a group score creation unit that, when the own-node information or the other-node information exists, generates, based on the other-node information and the own-node information, a priority for an own group that includes at least the own node and consists of nodes able to communicate over the monitoring path, and determines the group to be reconstructed as the cluster based on the priority of the own group and the priority of another group consisting of nodes with which the own node cannot communicate over the monitoring path.

The present invention further provides a cluster reconstruction program for a cluster reconstruction device in a cluster system including a plurality of nodes connected to one another by a monitoring path, the program causing a processor of the cluster reconstruction device to execute: an own-node information generation step of generating own-node information corresponding to a predetermined condition held by a given node among the plurality of nodes; an other-node information collection step of collecting, from other nodes with which communication over the monitoring path is possible, other-node information corresponding to the predetermined condition; a priority generation step of generating, when the own-node information or the other-node information exists and based on the other-node information and the own-node information, a priority for an own group that includes at least the own node and consists of nodes able to communicate over the monitoring path; and a group determination step of determining the group to be reconstructed as the cluster, based on the priority of the own group and the priority of another group consisting of nodes with which the own node cannot communicate over the monitoring path.

According to the present invention, when a network split occurs, the group best suited to be constructed as the cluster can be selected on the basis of what each group is processing.

A block diagram showing a configuration example of a cluster system according to an embodiment to which the present invention is applied.
The structure of the priority definition managed by the group score creation unit of the cluster system.
The structure of the node information management table managed by the group score creation unit of the cluster system.
The structure of the group score management table managed by the group score creation unit of the cluster system.
The flow of processing performed by the monitoring unit of the cluster system when monitoring other nodes.
The flow of processing performed by the group score creation unit of the cluster system when it receives a failed-node notification, a node information notification, or a notification that a reset has been executed.
The flow of processing performed by the reset unit of the cluster system when it receives a group score management table notification or a reset request.
The flow of processing performed by the system switching unit of the cluster system when it receives a reset completion notification or a notification of an application failure.

An embodiment of the present invention is described in detail below with reference to the drawings.

(1) Configuration of the cluster system according to this embodiment
FIG. 1 shows a configuration example of the cluster system in this embodiment. The cluster system includes node A 1101, node B 1201, and node C 1301, which are communicably connected to one another via a network 1001, a reset path 1002, and a monitoring path 1003. Because the nodes have almost the same configuration, the description below focuses on node A 1101.

Node A 1101 includes a CPU 1102 (processor), a network adapter (NIC) 1103, a memory 1104, and a reset device 1105. The memory 1104 of node A 1101 holds a program (the cluster reconstruction program) that causes the CPU 1102 to execute the procedures described later. The network adapter (NIC) 1103 is a device for communicating with the outside.

The memory 1104 is a storage device in which the cluster monitoring unit 1112, described later, operates. The memory 1104 holds an operating system (hereinafter, OS) 1110, an application 1111, and the cluster monitoring unit 1112. The cluster monitoring unit 1112 corresponds to the cluster reconstruction program mentioned above. The reset device 1105 is a device that stops the node on which it is mounted (hereinafter, the own node) when it receives a reset request from any other node (hereinafter, another node).

The NIC 1103 is used for external communication arising from the business processing of the application 1111, for the communication by which the cluster monitoring unit 1112 monitors other nodes, and for communication with the reset unit when the cluster monitoring unit 1112 resets another node. Although FIG. 1 shows node A 1101 equipped with multiple NICs 1103, a single NIC may carry all of this traffic.

The cluster monitoring unit 1112 runs the following modules through the cooperation of software and hardware. It corresponds to the cluster reconstruction program mentioned above and includes a monitoring unit 1120, a communication unit 1121, a node information collection unit 1122, a group score creation unit 1123, and a system switching unit 1124.

The monitoring unit 1120 has the following functions: monitoring whether the application 1111 on the own node is operating normally; when a failure occurs in the application 1111 on the own node, notifying the cluster monitoring units of other nodes, such as cluster monitoring unit 1212, of the failure via the communication unit 1121; monitoring the state of the cluster monitoring units of other nodes via the communication unit 1121; notifying the node information collection unit 1122 of the applications that are operating normally; and, on detecting a failure in the cluster monitoring unit of another node, such as cluster monitoring unit 1212, notifying the group score creation unit 1123 of the node in which the failure has occurred (hereinafter also called a failed node).

The communication unit 1121 has a function of communicating, via the NIC 1103, with the cluster monitoring units of other nodes, such as cluster monitoring unit 1212 of node B 1201, and a function of notifying the reset unit 1113 of the own node, or communicating with the reset units of other nodes, as instructed by the group score creation unit 1123.

The node information collection unit 1122 has a function of collecting information about the node from the application 1111, the OS 1110, and the monitoring unit 1120, and a function of passing that node information to the group score creation unit 1123 when the group score creation unit 1123 requests it.

The node information collection unit 1122 holds a predetermined condition and generates own-node information corresponding to that condition. It also collects other-node information corresponding to the condition from other nodes with which it can communicate over the monitoring path 1003.

The predetermined condition here is a set of multiple conditions that have a priority order. As described later, in each node such as node 1101, the group score creation unit 1123 generates own-node information for each of the conditions. When generating the priority of the own group, it generates a priority for each of the conditions. When determining the group to be reconstructed as the cluster, it compares these priorities with those of the other group in accordance with the priority order.
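One way to read "comparing in accordance with the priority order" is as a lexicographic comparison: the scores for the highest-ranked condition are compared first, and lower-ranked conditions are consulted only to break ties. The following Python sketch illustrates that reading. The function name `compare_groups`, the assumption that a larger score is always better, and the handling of a complete tie are illustrative assumptions, not details taken from the text above.

```python
from typing import Sequence


def compare_groups(own: Sequence[float], other: Sequence[float]) -> int:
    """Compare two groups' per-condition scores, listed in priority order.

    Assumption: for every condition, the larger score is the better one.
    Returns 1 if the own group wins, -1 if the other group wins,
    and 0 if the groups tie on every condition.
    """
    for own_score, other_score in zip(own, other):
        if own_score > other_score:
            return 1
        if own_score < other_score:
            return -1
    return 0


# Hypothetical scores: the groups tie on the first condition,
# so the second condition decides the outcome.
result = compare_groups([98.5, 1, 0, 50], [98.5, 0, 1, 70])
```

In this sketch a lower-ranked condition can never override a higher-ranked one, which matches the idea of conditions having a priority order.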

The predetermined condition may include, for example, the hardware usage rate (operating rate) of each node such as node 1101, or the number of specific processes being executed by the applications running on each node. The specific processes here include, for example, data write processing or data read processing by an application. The predetermined condition may also include, for example, the type of the application 1111 running on each node 1101, or the name of that type.

When own-node information or other-node information exists, the group score creation unit 1123 generates, based on the other-node information and the own-node information, the priority of the own group, which includes at least the own node and consists of nodes that can communicate over the monitoring path. It then determines the group to be reconstructed as the cluster, based on the priority of the own group and the priority of another group consisting of nodes with which the own node cannot communicate over the monitoring path 1003.

The group score creation unit 1123 has, for example, the following seven functions.
(a) Acquiring node information from the node information collection unit 1122, based on the priority information in the priority definition 1130
(b) Sending that node information, via the communication unit 1121, to the cluster monitoring units of the other nodes, excluding failed nodes
(c) Creating group scores under the priority conditions of the priority definition 1130, from the node information acquired from the node information collection unit 1122 and the node information sent by the cluster monitoring units of the other nodes, excluding failed nodes
(d) Sending the group score management table 1132 to the reset unit 1113 of the own node via the communication unit 1121
(e) Sending a reset request carrying the group score management table 1132, via the communication unit 1121, to each failed node reported by the monitoring unit 1120
(f) When a reset succeeds, using the communication unit 1121 to notify the system switching unit 1124 and the cluster monitoring units of the other nodes, excluding failed nodes, that the failed node was reset successfully
(g) When all failed nodes have been reset, notifying the reset unit 1113 of the own node that all failed nodes have been reset

The group score creation unit 1123 also holds the priority definition 1130, which contains the information and conditions needed to determine the superior group; the node information management table 1131, which holds the alive/dead status and node information of each node; and the group score management table 1132, which holds information about the group that includes the own node.

When the system switching unit 1124 receives a reset completion notification from the group score creation unit 1123 or from the cluster monitoring unit 1212 or 1312 of another node other than the failed node, or receives a notification from the cluster monitoring unit 1212 or 1312 of such a node that a failure has occurred in an application, it determines whether the own node needs to take over the application terminated by the failure. If so, the system switching unit 1124 takes over that application.

When the own group is determined to be the group to be reconstructed as the cluster, the reset unit 1113 of the reset device 1105 resets the nodes belonging to the own group. When resetting those nodes, the reset unit 1113 sends the priority of the own group, together with the reset request, to the nodes constituting the cluster. When it receives such a notification from another node and the group priority in the notification is lower than the priority of the own group, it prevents the own node from being reset.

(2) Structure of the tables
Next, the information held in the priority definition 1130, the node information management table 1131, and the group score management table 1132 is described in detail with reference to FIGS. 2, 3, and 4.

FIG. 2 shows the information held in the priority definition 1130. The priority definition 1130 holds the following three items.

(a) Priority number 21, which uniquely identifies a priority condition
(b) Priority information 22, which indicates the information needed to decide which group is superior under the priority condition
(c) Priority condition 23, which states what makes a group best suited to form the cluster
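The three items above can be pictured as the columns of a small table. The sketch below models one row as a Python dataclass and fills the table with the four entries that the embodiment uses as its example. The class name `PriorityRule`, the field names, and the constant `PRIORITY_DEFINITION` are hypothetical. Only the numbers and the wording of the information and conditions come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PriorityRule:
    number: int       # priority number 21
    information: str  # priority information 22: what to collect from each node
    condition: str    # priority condition 23: which group is superior


# Example contents following the embodiment's four priority numbers.
PRIORITY_DEFINITION = [
    PriorityRule(1, "node operating rate",
                 "group with the higher operating rate is superior"),
    PriorityRule(2, "number of applications performing data writes",
                 "group with more applications performing data writes is superior"),
    PriorityRule(3, "names of running applications",
                 "group in which application 1311 is running is superior"),
    PriorityRule(4, "network usage rate",
                 "group with the higher average network usage rate is superior"),
]

# The priority numbers double as the order in which conditions are consulted.
ordered_numbers = [rule.number for rule in PRIORITY_DEFINITION]
```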

FIG. 3 shows the information held in the node information management table 1131. The node information management table 1131 holds the following three items.

(a) Node name 31, which uniquely identifies each node
(b) Failure occurrence flag 32, which indicates that the node is a failed node
(c) First to fourth node information 33 to 36, which hold the node information of each node acquired according to the priority information 22 of the priority definition 1130

The node information management table 1131 has one node information column for each priority number 21.

The failure occurrence flag 32 is set by the group score creation unit 1123 for each failed node reported by the monitoring unit 1120. For ease of explanation, this embodiment uses an example in which a network split occurs: a failure in node C 1301 cuts communication between node B 1201 and node C 1301. The monitoring unit 1120 of node A 1101 detects the failure of node C 1301 and reports it to the group score creation unit 1123. As a result, the failure occurrence flag 32 for node name 31 "C" changes from off (shown as blank in the figure) to on (shown as a circle).

The first to fourth node information 33 to 36 store the node information that the group score creation unit of each node (for example, 1123) acquired from the node information collection unit of each node (for example, 1122) according to the priority information 22 of the priority definition 1130. In this embodiment, as an example, the monitoring unit 1120 of node A 1101 has detected the failure of node C 1301, so no node information is stored for node C 1301. The node information of the other nodes is described below.

The priority information 22 for priority number 21 "1" is "node operating rate", so the first node information 33 holds "95.0" and "90.0", the operating rates of node A 1101 and node B 1201 respectively. These operating rates may be values specified to the node information collection unit in advance, or values that the node information collection unit 1122 calculates from statistics it gathers.

The priority information 22 for priority number 21 "2" is "number of applications performing data writes", so the second node information 34 holds the number of applications performing data writes on node A 1101 and on node B 1201. For example, if the application 1111 is performing a data write and the application 1211 is not, the second node information 34 holds "1" and "0" respectively. The node information collection unit 1122 tracks this number as follows: each application runs, at the start and at the end of its data write processing, a program that can notify the node information collection unit 1122 of the start and end of processing.
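The start and end notifications described above amount to a counter of data writes currently in progress. The following Python sketch shows one possible shape for it. `WriteCounter`, `notify_start`, `notify_end`, and the `data_write` helper are hypothetical names, and the lock is an added assumption so that several application threads could report concurrently.

```python
import threading
from contextlib import contextmanager


class WriteCounter:
    """Stand-in for the bookkeeping done by the node information collection unit."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._active = 0

    def notify_start(self) -> None:
        # Called when an application begins a data write.
        with self._lock:
            self._active += 1

    def notify_end(self) -> None:
        # Called when an application finishes a data write.
        with self._lock:
            self._active -= 1

    @property
    def active(self) -> int:
        # Number of data writes currently in progress on this node.
        with self._lock:
            return self._active


@contextmanager
def data_write(counter: WriteCounter):
    """Wrap a data write so the end notification is sent even on error."""
    counter.notify_start()
    try:
        yield
    finally:
        counter.notify_end()


counter = WriteCounter()
with data_write(counter):
    during = counter.active  # one write in progress
after = counter.active       # no writes in progress
```

Wrapping the write in a context manager is a design choice of this sketch: it keeps the count from drifting if the write raises an exception before the end notification would otherwise be sent.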

The priority information 22 for priority number 21 "3" is "names of running applications", so the third node information 35 holds "application 1111" and "application 1211", the names of the applications running on node A 1101 and node B 1201 respectively.

The priority information 22 for priority number 21 "4" is "network usage rate", so the fourth node information 36 holds "40" and "60", the network usage rates of node A 1101 and node B 1201 respectively.
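Putting the example values together, the node information management table for this scenario can be sketched as follows. The `NodeRow` class and its field names are hypothetical. The values for nodes A and B are the ones given in the description, and node C carries only the failure flag because no node information is stored for it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeRow:
    name: str                              # node name 31
    failed: bool = False                   # failure occurrence flag 32
    operating_rate: Optional[float] = None  # first node information 33
    writing_apps: Optional[int] = None      # second node information 34
    running_app: Optional[str] = None       # third node information 35
    network_usage: Optional[float] = None   # fourth node information 36


NODE_TABLE = [
    NodeRow("A", False, 95.0, 1, "application 1111", 40),
    NodeRow("B", False, 90.0, 0, "application 1211", 60),
    NodeRow("C", failed=True),  # failed node: no node information stored
]


def own_group(table):
    """Names of the nodes that are not flagged as failed."""
    return [row.name for row in table if not row.failed]


members = own_group(NODE_TABLE)
```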

FIG. 4 shows the information held in the group score management table 1132. The group score management table 1132 holds the following two items.

(a) Priority number 21, which indicates the priority condition under which the group score was created
(b) Group score 41, which indicates, under the priority condition 23, the superiority of the group formed by the own node and the nodes that can communicate with one another over the monitoring path

The priority number 21 takes the same value as the priority number 21 in the priority definition 1130. The group score 41 is a score created from the node information in the node information management table 1131. In this embodiment, as an example, node A 1101 and node B 1201 form a group, so the group score 41 stores the scores created from the node information of node A 1101 and node B 1201 in the node information management table 1131.

The priority condition 23 for priority number 21 "1" is "the group with the higher operating rate is superior", so the group score 41 for priority number 21 "1" stores "98.5", the operating rate of the group calculated from the first node information 33.

The priority condition 23 for priority number 21 "2" is "the group with more applications performing data writes is superior", so the group score 41 for priority number 21 "2" stores "1", the sum of the values in the second node information 34.

The priority condition 22 for priority number 21 "3" is "the group in which the application 1311 is running is superior". The group score 41 for priority number 21 "3" is therefore set to "0", indicating that the name of the application 1311 does not appear in the third node information 35. If the name of the application 1311 does appear in the third node information 35, "1" is set instead.

The priority condition 22 for priority number 21 "4" is "the group with the higher average network usage rate is superior". The group score 41 for priority number 21 "4" therefore stores "50", the average network usage rate of the group calculated from the fourth node information 36.
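The four group scores described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the field names, the per-node values, and the use of a simple mean for the group operating rate are assumptions (the text only says the rate is "calculated" from the first node information). The inputs are made up so that the result matches the example values 98.5, 1, 0 and 50.

```python
from statistics import mean

# Hypothetical per-node records for node A and node B; field names and
# values are illustrative only and do not come from the specification.
nodes = [
    {"name": "A", "operating_rate": 99.0, "writing_apps": 1, "apps": ["1111"], "net_usage": 60},
    {"name": "B", "operating_rate": 98.0, "writing_apps": 0, "apps": ["1211"], "net_usage": 40},
]

def build_group_scores(group, preferred_app="1311"):
    """Return {priority number: group score} for one group of reachable nodes."""
    return {
        # Priority 1: group operating rate (assumed here to be the mean of node rates).
        1: mean(n["operating_rate"] for n in group),
        # Priority 2: total number of applications performing data writes.
        2: sum(n["writing_apps"] for n in group),
        # Priority 3: 1 if the preferred application runs somewhere in the group, else 0.
        3: int(any(preferred_app in n["apps"] for n in group)),
        # Priority 4: average network usage rate of the group.
        4: mean(n["net_usage"] for n in group),
    }

scores = build_group_scores(nodes)
```

Keying the result by priority number keeps the later comparison between two groups a simple walk over matching keys.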

In this embodiment, for ease of explanation, the node information collection unit 1122 and the group score creation unit 1123 are described as program modules within the cluster monitoring unit 1112, but they may instead be modules in a program separate from the cluster monitoring unit 1112. Likewise, the priority information 22 is described as information held in the priority definition 1130 managed by the group score creation unit 1123, but it may instead be held by the node information collection unit 1122.

The reset device 1105 has a reset unit 1113, which resets its own node in response to a request from the cluster monitoring unit 1112. More specifically, the reset unit 1113 has a function of holding the group score management table 1132 notified by the cluster monitoring unit 1112 of its own node; a function of comparing the group scores it holds with the group scores conveyed in a reset request from the cluster monitoring unit of another node, and deciding whether to execute or prevent the reset of its own node; a function of stopping its own node; and a function of notifying the cluster monitoring unit that the reset has been executed. Stopping the own node only needs to end that node's use of the shared resources, so it may be, for example, a power-off or an OS shutdown.

In this embodiment, for ease of explanation, the reset unit 1113 is described as a module in the reset device 1105, but it may be a module in a device other than the reset device 1105 and need not exist in every node. Nor does the reset unit 1113 have to be a module in a dedicated reset device that is independent of the other devices in node A 1101; it may be a program running in the memory 1104.

(3) Example of Cluster Reconfiguration Method
(3-1) Concept
In this embodiment, as an example of the cluster reconfiguration method, the cluster reconstruction program causes the CPU 1102 to execute the following steps. First, in the own node information generation step, the CPU 1102 of node A 1101 causes the node information collection unit 1122 to generate own-node information corresponding to a predetermined condition held by the predetermined node 1101, which is one of the plurality of nodes 1101, 1201, and 1301. In the other node information collection step, the CPU 1102 of node A 1101 collects other-node information corresponding to the predetermined condition from the other nodes 1201 and 1301 with which communication over the monitoring path 1003 is possible.

Next, in the priority generation step, the CPU 1102 of node A 1101 causes the group score creation unit 1123 to generate the priority of the own group, based on the other-node information and the own-node information, when the own-node information or the other-node information exists. The own group includes at least the own node and consists of nodes that can communicate over the monitoring path 1003. Then, in the group determination step, the CPU 1102 of node A 1101 causes the group score creation unit 1123 to determine the group to be reconstructed as the cluster, based on the priority of the own group and the priority of the other group, which consists of nodes with which the own node cannot communicate over the monitoring path.

(3-2) Specific Example
FIGS. 5 to 8 are flowcharts illustrating the operation of the cluster monitoring unit 1112 and of the reset unit 1113 in this embodiment. FIG. 5 is a flowchart of the processing that the monitoring unit 1120 performs when monitoring other nodes.

The monitoring unit 1120 first determines whether it is connected to the cluster monitoring units of the other nodes via the monitoring path 1003 (SP501). If it determines in SP501 that it is not connected, it ends without doing anything. If it determines that it is connected, it transmits a heartbeat message to the cluster monitoring units 1112, 1212, and 1312 of all the nodes (SP502).

After transmitting the heartbeat message, the monitoring unit 1120 checks whether heartbeats have arrived from the cluster monitoring units of the other nodes (SP503) and determines whether there is any node from which no heartbeat has arrived for a certain period (SP504). If it determines in SP504 that there is no such node, it ends the processing without doing anything (and waits until the next heartbeat check). If it determines that there is such a node, it regards that node as having failed and notifies the group score creation unit 1123 of the failed node (SP505).
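The timeout check in SP503 and SP504 might be sketched as below. The function name, the timestamp dictionary and the timeout value are hypothetical; the specification only speaks of "a certain period" without giving a concrete value.

```python
def find_failed_nodes(last_heartbeat, now, timeout):
    """Return the nodes whose last heartbeat is older than `timeout` seconds.

    `last_heartbeat` maps a node name to the time its latest heartbeat
    arrived. This mirrors the check in SP503 and SP504.
    """
    return sorted(node for node, seen in last_heartbeat.items() if now - seen > timeout)

# Hypothetical receive timestamps in seconds, as seen from node A.
last_heartbeat = {"B": 100.0, "C": 92.0}

failed = find_failed_nodes(last_heartbeat, now=101.0, timeout=5.0)
# In the flow of FIG. 5, `failed` would then be reported to the
# group score creation unit (SP505).
```

An empty result corresponds to the branch where the monitoring unit simply waits for the next check.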

FIG. 6 shows an example of the processing performed when the group score creation unit 1123 receives a failed-node notification, a node information notification, or a notification indicating that a reset has been executed.

First, the group score creation unit 1123 determines whether the source of the received notification is the monitoring unit 1120 of its own node (SP601). If the source is the monitoring unit 1120, the group score creation unit 1123 sets the failure occurrence flag 32 of the notified failed node in the node information management table 1131 (SP602). Next, based on the information set in the priority information 22 of the priority definition 1130, the group score creation unit 1123 acquires the own-node information from the node information collection unit 1122 and registers it in the node information management table 1131 (SP603).

Next, the cluster monitoring unit 1112 notifies the acquired own-node information to the cluster monitoring units of the nodes for which the failure occurrence flag is not set, that is, the nodes operating normally (SP604). The cluster monitoring unit 1112 then registers the node information notified by the cluster monitoring units of those nodes in the node information management table 1131 (SP605), and determines whether node information notifications have been received from all nodes for which the failure occurrence flag is not set (SP606).

If the cluster monitoring unit 1112 determines in SP606 that not all node information has arrived, it returns to the processing of receiving node information notifications from other nodes (SP605). If it determines that all node information has arrived, it creates, from the own-node information and the received other-node information, the group score 41 representing the group's superiority under the priority condition 23 (SP607).
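The completeness check in SP606 can be illustrated as follows. The table layout, with a `failed` flag and an `info` entry per node, is an assumed simplification of the node information management table 1131.

```python
def all_node_info_received(node_table):
    """SP606: True once every node without a failure flag has reported its info."""
    return all(
        entry["info"] is not None
        for entry in node_table.values()
        if not entry["failed"]
    )

# Hypothetical table held by node A: node C is flagged as failed,
# node B has not reported yet.
node_table = {
    "A": {"failed": False, "info": {"operating_rate": 99.0}},
    "B": {"failed": False, "info": None},
    "C": {"failed": True, "info": None},
}

ready_before = all_node_info_received(node_table)   # B is still missing
node_table["B"]["info"] = {"operating_rate": 98.0}  # SP605: register B's info
ready_after = all_node_info_received(node_table)    # group score can now be built (SP607)
```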

The group score creation unit 1123 notifies the reset unit 1113 of its own node of the created group score management table 1132 (SP608), and then issues a reset request carrying the group score management table 1132 to the reset unit 1313 of the failed node (SP609).

If, on the other hand, the notification source in SP601 is a reset unit, the received notification indicates that a reset has been executed. The group score creation unit 1123 therefore notifies the monitoring unit 1120 and the cluster monitoring units of the other nodes, excluding the failed node, that the reset is complete (SP611).

Next, the group score creation unit 1123 deletes the node whose reset has completed from the node information management table 1131 (SP612) and checks whether any node with the failure occurrence flag 32 set remains (SP613). If it determines in SP613 that such a node remains, it ends the processing without doing anything. If it determines that no such node remains, it notifies the reset unit 1113 of its own node that all failed nodes have been reset (SP614). The group score creation unit 1123 then clears all node information in the node information management table 1131 (SP615) and displays the result of rebuilding the cluster on the screen of the management computer 1004.
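A rough sketch of the reset-completion handling in SP612 to SP615 is given below. The function name, the table layout and the callback used to reach the local reset unit are all hypothetical stand-ins.

```python
def on_reset_completed(node_table, reset_node, notify_reset_unit):
    """Handle a 'reset executed' notification for `reset_node`.

    Returns True when no failed node remains, False otherwise.
    """
    node_table.pop(reset_node, None)                    # SP612: drop the reset node
    if any(e["failed"] for e in node_table.values()):   # SP613: failed nodes left?
        return False                                    # wait for further notifications
    notify_reset_unit("all failed nodes reset")         # SP614: tell the local reset unit
    node_table.clear()                                  # SP615: clear all node information
    return True

# Hypothetical state on node A with two failed nodes, C and D.
table = {
    "A": {"failed": False},
    "C": {"failed": True},
    "D": {"failed": True},
}
messages = []
first = on_reset_completed(table, "C", messages.append)
second = on_reset_completed(table, "D", messages.append)
```

Passing the notification as a callback keeps the sketch independent of how the cluster monitoring unit actually talks to the reset unit.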

FIG. 7 is a flowchart of the processing performed when the reset unit 1113 receives a notification from a cluster monitoring unit. The reset unit 1113 determines whether the source of the received notification is the cluster monitoring unit 1112 of its own node (SP701). If it is, the reset unit 1113 further determines whether the notification conveys the group score management table 1132 (SP702).

If the reset unit 1113 determines in SP702 that the notification conveys the group score management table 1132, it stores the table (SP703) and ends the processing. If the notification does not convey the group score management table 1132, it is a notification that all failed nodes have been reset, so the reset unit 1113 deletes the group score management table 1132 stored in SP703 (SP711).

If, on the other hand, the source of the received notification is not the cluster monitoring unit 1112 of its own node, the notification is a reset request from the cluster monitoring unit of another node. The reset unit 1113 therefore compares the group scores with the same priority number in the group score management table 1132 that it holds and in the group score management table conveyed by the reset request (SP722).

If the group score in the reset request is smaller, the reset unit 1113 ends the processing without doing anything. If the two group scores are equal, it returns to SP722 and compares the group scores of the next priority number. If there is no next priority number, it decides whether to execute the reset using a value that is unique within the system, such as the IP address of the notification source. If the group score in the reset request is larger, the reset unit 1113 resets its own node (SP731).

In this embodiment, the reset unit 1113 compares the group scores of the group formed by node A 1101 and node B 1201 with those of the group formed by node C 1301 alone. If the group score for priority number 21 "1" of the group formed by node C 1301 alone is greater than 98.5, the reset from node A 1101 to node C 1301 is prevented and the reset from node C 1301 to node A 1101 is executed. If that group score is smaller than 98.5, the reset from node A 1101 to node C 1301 is executed and the reset from node C 1301 to node A 1101 is prevented. If the scores are equal, the group scores of the next priority number are compared.
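The comparison from SP722 onward amounts to a lexicographic comparison of group scores in priority-number order. The sketch below assumes scores are held in dictionaries keyed by priority number. The text does not say which side wins the final tie-break on a system-unique value, so "the smaller identifier wins" is an assumption made only for this illustration.

```python
def should_reset_self(own_scores, requester_scores, own_id, requester_id):
    """Decide whether this node accepts a reset request.

    Scores are dicts keyed by priority number; a larger score is superior.
    """
    for priority in sorted(own_scores):
        own, other = own_scores[priority], requester_scores[priority]
        if other > own:
            return True    # requester's group is superior: reset this node (SP731)
        if other < own:
            return False   # own group is superior: the reset is prevented
        # Equal: move on to the next priority number (back to SP722).
    # Every score tied. Fall back to a system-unique value such as an IP
    # address; letting the smaller identifier win is an assumption.
    return requester_id < own_id

# Group scores of the group formed by node A and node B, from the example above.
own = {1: 98.5, 2: 1, 3: 0, 4: 50}

higher = should_reset_self(own, {1: 99.0, 2: 0, 3: 0, 4: 10}, "10.0.0.1", "10.0.0.3")
lower = should_reset_self(own, {1: 97.0, 2: 5, 3: 1, 4: 90}, "10.0.0.1", "10.0.0.3")
tie_then_higher = should_reset_self(own, {1: 98.5, 2: 2, 3: 0, 4: 50}, "10.0.0.1", "10.0.0.3")
```

Because both sides run the same rule on the same pair of tables, exactly one of the two groups ends up accepting the reset as long as the tie-break value is unique.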

After executing the reset in SP731, the reset unit 1113 notifies the cluster monitoring unit that requested the reset that the reset is complete (SP732) and deletes the group score management table it was holding (SP733).

In this embodiment, the priority conditions are ranked, so the group scores are compared in order of priority number, and the group score of the next priority number is compared only when the scores are equal. If, however, the group that satisfies the larger number of priority conditions should be rebuilt as the cluster, all group scores may be compared first and the reset decided by the number of times each group's score was judged larger. Alternatively, each priority condition may be weighted, and the reset decided so that a group satisfying the more heavily weighted priority conditions is more likely to be rebuilt as the cluster.
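The two variations mentioned here, deciding by the number of conditions won or by weighted conditions, could look like the sketch below. The function names, the weights, and the choice to keep the own node running when the totals tie are assumptions; the text leaves these details open.

```python
def reset_by_win_count(own, other):
    """Reset this node if the requester's group wins more priority conditions."""
    other_wins = sum(other[p] > own[p] for p in own)
    own_wins = sum(own[p] > other[p] for p in own)
    return other_wins > own_wins   # a tie keeps the own node running (assumption)

def reset_by_weight(own, other, weights):
    """Reset this node if the requester wins the larger total weight of conditions."""
    other_total = sum(w for p, w in weights.items() if other[p] > own[p])
    own_total = sum(w for p, w in weights.items() if own[p] > other[p])
    return other_total > own_total

own = {1: 98.5, 2: 1, 3: 0, 4: 50}
other = {1: 99.0, 2: 0, 3: 0, 4: 40}

# The requester wins only condition 1; the own group wins conditions 2 and 4.
by_count = reset_by_win_count(own, other)
# Hypothetical weights that make the operating rate dominate.
by_weight = reset_by_weight(own, other, {1: 5, 2: 1, 3: 1, 4: 1})
```

With these inputs the two rules disagree, which shows how the choice of rule changes which group survives.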

FIG. 8 shows an example of the processing performed when the system switching unit 1124 receives a reset completion notification or a notification indicating an application failure. On receiving such a notification, the system switching unit 1124 determines whether it needs to take over the processing of the failed node that was reset (SP801). If it determines in SP801 that no takeover is needed, it ends the processing without doing anything. If it determines that a takeover is needed, it performs the system switching processing (SP802).

(4) Effects of this Embodiment
As described above, according to this embodiment, when a network split occurs, the group best suited to be rebuilt as the cluster can be selected on the basis of what each group is processing.

(5) Other Embodiments
The above embodiment is an example for explaining the present invention and is not intended to limit the invention to this embodiment. The present invention can be implemented in various forms without departing from its spirit. For example, the above embodiment describes the processing of the various programs sequentially, but the invention is not limited to this. As long as the processing results remain consistent, the order of the processing may be changed, or the processing may be performed in parallel.

DESCRIPTION OF SYMBOLS
1001 ... network, 1002 ... reset path, 1003 ... monitoring path, 1004 ... management computer, 1101, 1201, 1301 ... node, 1102 ... CPU, 1103 ... NIC, 1104 ... memory, 1105, 1205, 1305 ... reset device, 1110 ... OS, 1111, 1211, 1311 ... application, 1112, 1212, 1312 ... cluster monitoring unit, 1113, 1213, 1313 ... reset unit, 1120 ... monitoring unit, 1121 ... communication unit, 1122, 1222, 1322 ... node information collection unit, 1123, 1223, 1323 ... group score creation unit, 1124, 1224, 1324 ... system switching unit, 1130 ... priority definition, 1131, 1231, 1331 ... node information management table, 1132, 1232, 1332 ... group score management table.

Claims

A cluster reconstruction method for a cluster system including a plurality of nodes connected to each other via a monitoring path, the method comprising:
an own node information generation step in which a predetermined node, being any one of the plurality of nodes, holds a predetermined condition and generates own-node information corresponding to the predetermined condition;
an other node information collection step in which the predetermined node collects other-node information corresponding to the predetermined condition from another node with which communication over the monitoring path is possible;
a priority generation step in which the predetermined node, when the own-node information or the other-node information exists, generates, based on the other-node information and the own-node information, a priority of an own group that includes at least the own node and consists of nodes able to communicate over the monitoring path; and
a group determination step in which the predetermined node determines a group to be reconstructed as a cluster, based on the priority of the own group and a priority of another group consisting of nodes with which the own node cannot communicate over the monitoring path.

The cluster reconstruction method according to claim 1, wherein
the predetermined condition is a plurality of conditions having a priority order, and
the predetermined node:
generates own-node information corresponding to each of the plurality of conditions;
generates, when generating the priority of the own group, a priority for each of the plurality of conditions; and
determines, when determining the group to be reconstructed as the cluster, the group to be reconstructed by comparing those priorities with the priorities of the other group in accordance with the priority order.

The cluster reconstruction method according to claim 1, wherein the predetermined condition includes a hardware usage rate (operating rate) of the predetermined node.

The cluster reconstruction method according to any one of claims 1 to 3, wherein the predetermined condition includes the number of specific processes being executed by an application running on the predetermined node.

The cluster reconstruction method according to claim 4, wherein the specific process includes data write processing or data read processing by the application.

The cluster reconstruction method according to any one of claims 1 to 5, wherein the predetermined condition includes a type of the application running on the predetermined node or a name of that type.

The cluster reconstruction method according to any one of claims 1 to 6, wherein, in the group determination step, when the predetermined node determines that the own group is the group to be reconstructed as the cluster, the node belonging to the own group is reset.

The cluster reconstruction method according to claim 7, wherein, in the group determination step, the predetermined node:
when resetting a node belonging to the own group, notifies the nodes constituting the cluster of the priority of the own group together with a reset request; and
when it receives such a notification from another node and the group priority included in the notification is lower than the priority of the own group, prevents the reset of the own node.

A cluster reconstruction device for a cluster system including a plurality of nodes connected to each other via a monitoring path, wherein each of the plurality of nodes comprises:
a node information collection unit that holds a predetermined condition, generates own-node information corresponding to the predetermined condition, and collects other-node information corresponding to the predetermined condition from another node with which communication over the monitoring path is possible; and
a group score creation unit that, when the own-node information or the other-node information exists, generates, based on the other-node information and the own-node information, a priority of an own group that includes at least the own node and consists of nodes able to communicate over the monitoring path, and that determines a group to be reconstructed as a cluster based on the priority of the own group and a priority of another group consisting of nodes with which the own node cannot communicate over the monitoring path.

A cluster reconstruction program for a cluster reconstruction device in a cluster system including a plurality of nodes connected to each other via a monitoring path, the program causing a processor of the cluster reconstruction device to execute:
an own node information generation step of generating own-node information corresponding to a predetermined condition held by a predetermined node, being any one of the plurality of nodes;
an other node information collection step of collecting other-node information corresponding to the predetermined condition from another node with which communication over the monitoring path is possible;
a priority generation step of generating, when the own-node information or the other-node information exists and based on the other-node information and the own-node information, a priority of an own group that includes at least the own node and consists of nodes able to communicate over the monitoring path; and
a group determination step of determining a group to be reconstructed as a cluster, based on the priority of the own group and a priority of another group consisting of nodes with which the own node cannot communicate over the monitoring path.