JP2017220846A

JP2017220846A - Maintenance reduction system, node and maintenance reduction method

Info

Publication number: JP2017220846A
Application number: JP2016115035A
Authority: JP
Inventors: 敬子栗生; Keiko Kuriu; 篤史外山; Atsushi Toyama
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2016-06-09
Filing date: 2016-06-09
Publication date: 2017-12-14
Anticipated expiration: 2036-06-09
Also published as: JP6564349B2

Abstract

PROBLEM TO BE SOLVED: To shorten the processing time required for maintenance reduction, in which a node is detached from a cluster. SOLUTION: Each node 1 of a maintenance reduction system 1000 constituting a cluster comprises: a reduction node determination part 12 that determines a reduction node, that is, the node 1 to be removed, from among the multiple nodes 1; a distribution destination change processing part 151 that generates temporary distribution ID information 200 indicating which data each node 1 other than the reduction node is in charge of; a limited synchronization processing part 152 that uses the temporary distribution ID information 200 to extract original promotion target data and executes synchronization processing between the original promotion target data and the original data; an original promotion processing part 153 that promotes the original promotion target data to original data and updates the distribution ID information using the temporary distribution ID information; and a data rearrangement processing part 154 that executes duplication and rearrangement of data. SELECTED DRAWING: Figure 2

Description

The present invention relates to a maintenance reduction system, a node, and a maintenance reduction method that, in a high-availability system (distributed processing system) in which multiple nodes on a network form a cluster, detach (remove) some of the nodes from the cluster, triggered by, for example, an instruction from a maintenance operator.

In recent years, with the rise of cloud computing, there has been a demand to process and store large amounts of data efficiently. Distributed processing technology, which achieves efficient processing by having multiple servers operate cooperatively, has developed in response.

In distributed processing, the data handled by each server (hereinafter, "node") of the clustered distributed processing system must be determined. To raise the processing capacity of the system as a whole, the amount of data handled by each node should be roughly equal.

A typical data management method assigns each data item to the node whose number equals the remainder obtained by dividing the hash of the item's key (hereinafter, "hash(key)") by the number of nodes N, that is, "hash(key) mod N". This presupposes that the nodes have been numbered "0" through "N-1" in advance. With this method, adding or removing a node changes the value of N, so the node responsible for storing the data changes for much of the data, and the affected data must be rearranged.
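As a rough illustration of why modulo placement forces so much data to move, the following sketch counts how many keys change their responsible node when a five-node cluster shrinks to four. The choice of SHA-256 as the hash function, the key names, and the helper names `node_for` and `moved_fraction` are assumptions made for this example; they do not come from the described system.

```python
import hashlib


def node_for(key: str, n: int) -> int:
    """Return the node number hash(key) mod N for a cluster of n nodes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n


def moved_fraction(keys, n_before: int, n_after: int) -> float:
    """Fraction of keys whose responsible node changes when N changes."""
    moved = sum(1 for k in keys if node_for(k, n_before) != node_for(k, n_after))
    return moved / len(keys)


# Hypothetical workload: 10,000 keys, cluster shrinks from 5 nodes to 4.
keys = [f"key-{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 5, 4)
print(f"fraction of keys that change node: {fraction:.2f}")
```

A key keeps its node only when hash(key) mod 5 and hash(key) mod 4 happen to coincide, so with a well-mixed hash roughly four out of five keys move in this case.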

One way to limit the amount of data whose responsible node changes on node addition or removal to about 1/N is data management based on consistent hashing (see Non-Patent Document 1).

In data management based on consistent hashing, IDs (identifiers) are assigned to both nodes and data. The node in charge of a data item is the first node encountered when the closed ID space is traversed clockwise from the ID of that item. One example of a node ID is the value obtained by applying a hash function to the node's IP address (hash(IP address)). In this method, the region of the ID space for which each node is responsible is managed using an ID table (the "distribution ID information" described later).
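The clockwise lookup can be sketched as follows. This is a minimal example under stated assumptions: the 32-bit ID space, the SHA-256 hash, the IP addresses, and the names `ring_id` and `Ring` are all hypothetical, and virtual nodes and replication are left out.

```python
import bisect
import hashlib


def ring_id(value: str) -> int:
    """Map a string (node IP address or data key) onto a 32-bit closed ID space."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class Ring:
    """Toy ID table: node IDs sorted along the ring, each paired with its node."""

    def __init__(self, node_addresses):
        points = sorted((ring_id(addr), addr) for addr in node_addresses)
        self._ids = [p[0] for p in points]
        self._nodes = [p[1] for p in points]

    def owner(self, key: str) -> str:
        """Return the first node reached going clockwise from the key's ID."""
        i = bisect.bisect_left(self._ids, ring_id(key))
        # Past the largest node ID, the ring wraps around to the first node.
        return self._nodes[i % len(self._nodes)]


full = Ring(["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"])
reduced = Ring(["10.0.0.1", "10.0.0.2", "10.0.0.4"])  # 10.0.0.3 removed
sample_keys = [f"key-{i}" for i in range(1000)]
changed = [k for k in sample_keys if full.owner(k) != reduced.owner(k)]
```

In this sketch, every key in `changed` was previously owned by the removed node; keys owned by the other nodes keep their owner, which is the property that limits data movement when a node leaves.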

A clustered distributed processing system such as the one above has a function that, triggered by an instruction from a maintenance operator (network administrator) or the like, safely detaches a given node from the cluster before that node fails, for example so that a server (node) can undergo periodic inspection or a node suspected of failure can be isolated. This function is called "maintenance reduction".

Conventionally, maintenance reduction is executed by the following procedure.
(1) The start of reduction is decided based on an instruction from the maintenance operator or on predetermined logic that decides the removal of nodes constituting the cluster.
(2) Signal distribution to the node to be removed (hereinafter, "reduction node") is stopped using a temporary ID table from which the reduction node has been deleted (hereinafter, "distribution destination change processing").
(3) Copies of the original data (replicated data) are rearranged so that redundancy is maintained (hereinafter, "data rearrangement").
(4) Replicated data is promoted to original data (hereinafter, "original promotion").
(5) The ID table is updated to one from which the reduction node has been deleted (hereinafter, "ID table update").
(6) The reduction node is detached from the cluster.

The above procedure is described in detail below.
FIG. 7 is a diagram showing the maintenance reduction processing sequence in a conventional distributed processing system (high-availability system).
Here, the network management device 5 is a device that manages all of the nodes 1 constituting the distributed processing system. The nodes 1 constituting the cluster comprise a reduction node 1b, which is the node to be removed; existing nodes 1a, which are the nodes 1 of the cluster other than the reduction node 1b; and a representative node 1A, which is one of the existing nodes 1a that manages the ID table described above and acts as the representative of the nodes 1.

First, the representative node 1A determines the reduction node 1b as the node to be removed by maintenance reduction (step S1). The node 1 to be removed can be determined, for example, on receipt of reduction instruction information from the network management device 5, from the state of each node 1 (the content and number of alarm notifications), or from a failure probability received from an external device (described in detail later).

Next, the representative node 1A transmits a distribution destination change request to the network management device 5 so that signals are no longer distributed to the determined reduction node 1b (step S2). The network management device 5 then changes, for example, the distribution destination settings of the load balancer that is the distribution source, so that signals are not distributed to the reduction node 1b among the nodes 1 constituting the distributed processing system (step S3). The network management device 5 then transmits a notification that the distribution destination change is complete to the representative node 1A (step S4).

On receiving the distribution destination change completion notification, the representative node 1A generates a temporary ID table from which the reduction node 1b has been deleted (step S5). A reduction flag "+1" is set on this temporary ID table to indicate that it is the current ID table with the reduction node 1b deleted. The representative node 1A then notifies the existing nodes 1a and the reduction node 1b of the generated temporary ID table (step S6). The existing nodes 1a and the reduction node 1b each acquire the temporary ID table and store it in their storage units (step S7). Next, the existing nodes 1a and the reduction node 1b transmit to the representative node 1A a completion notification indicating that they have acquired the temporary ID table (step S8).
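A minimal sketch of how a temporary ID table with a reduction flag might be represented is shown below. The layout of `IdTable` (a tuple of ring ID and node name pairs plus a boolean flag) and the function names are assumptions for illustration only; the actual data structure of the ID table is not specified in this passage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IdTable:
    """Hypothetical ID table: (ring ID, node name) pairs plus a reduction flag."""
    entries: tuple
    reduction_flag: bool = False


def make_temporary_table(current: IdTable, reduction_node: str) -> IdTable:
    """Derive the temporary table: current table minus the reduction node, flag raised."""
    remaining = tuple(e for e in current.entries if e[1] != reduction_node)
    if len(remaining) == len(current.entries):
        raise ValueError(f"{reduction_node!r} is not in the current ID table")
    return IdTable(entries=remaining, reduction_flag=True)


def commit(temporary: IdTable) -> IdTable:
    """Lower the flag so that the temporary table becomes the current table."""
    return IdTable(entries=temporary.entries, reduction_flag=False)


current = IdTable(entries=((100, "node-a"), (200, "node-b"), (300, "node-c")))
temporary = make_temporary_table(current, "node-b")
updated = commit(temporary)
```

Keeping the tables immutable means the current table stays available unchanged while the temporary one is in use, which is one simple way to model the period between generating the temporary table and the final ID table update.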

On receiving completion notifications indicating acquisition of the temporary ID table from all existing nodes 1a and the reduction node 1b, the representative node 1A switches from the current ID table to the temporary ID table (step S9). The representative node 1A then transmits a notification of the switch to the temporary ID table to the existing nodes 1a and the reduction node 1b (step S10). The existing nodes 1a and the reduction node 1b then switch from the current ID table to the temporary ID table (step S11) and transmit to the representative node 1A a completion notification indicating that the switch is complete (step S12).

On receiving completion notifications indicating the switch to the temporary ID table from all existing nodes 1a and the reduction node 1b, the representative node 1A executes data rearrangement processing based on the temporary ID table (step S13). Specifically, the representative node 1A checks, based on the temporary ID table, whether replicated data must be rearranged to maintain redundancy, and rearranges the replicated data if so. This processing is performed, for example, by the method described in Non-Patent Document 2. The representative node 1A then executes processing to promote replicated data to original data.

The representative node 1A also transmits to the existing nodes 1a and the reduction node 1b a rearrangement start notification indicating that data rearrangement processing is to be executed (step S14). An existing node 1a that receives the rearrangement start notification executes data rearrangement processing (rearrangement of replicated data and promotion to original data) (step S15). The reduction node 1b, on receiving the rearrangement start notification, executes data rearrangement processing (rearrangement of replicated data) (step S15).

Next, when rearrangement of the replicated data is finished, the existing nodes 1a and the reduction node 1b transmit a replicated data rearrangement completion notification (labeled "replication completion notification" in FIG. 7) to the representative node 1A (step S16). On receiving the replicated data rearrangement completion notification from all existing nodes 1a and the reduction node 1b, the representative node 1A transmits a replication completion confirmation notification to the existing nodes 1a and the reduction node 1b (step S17).
Next, each existing node 1a, once it has received the replication completion confirmation notification and finished promotion to original data, transmits a data rearrangement processing completion notification to the representative node 1A (step S18). The reduction node 1b, on receiving the replication completion confirmation notification, transmits a data rearrangement processing completion notification to the representative node 1A (step S18).

On receiving data rearrangement completion notifications from all existing nodes 1a and the reduction node 1b, the representative node 1A lowers the reduction flag "+1" of the temporary ID table and updates the current ID table with the temporary ID table (step S19). The representative node 1A then transmits an ID table update notification to the existing nodes 1a (step S20). Each existing node 1a accordingly lowers the reduction flag "+1" of its temporary ID table and updates its current ID table with the temporary ID table (step S21), and then transmits to the representative node 1A a completion notification indicating that it has updated the ID table (step S22). On receiving this completion notification, the representative node 1A transmits to the network management device 5 a cluster member update notification indicating the end of the processing associated with the removal of the node 1 (step S23).

Meanwhile, once the representative node 1A has updated its own ID table in step S19, it transmits a reduction notification to the reduction node 1b (step S24). The reduction node 1b then terminates all processing (step S25) and is detached from the cluster.

As shown in FIG. 7, the transitional period in this conventional maintenance reduction runs from the point at which the representative node 1A, each existing node 1a, and the reduction node 1b store the temporary ID table with the reduction flag "1" set (steps S5 and S7) until data rearrangement processing finishes and the ID table is finally updated (steps S19, S21, and S25).

Non-Patent Document 1: David Karger, et al., "Consistent Hashing and Random Trees: Distributed Caching Protocols for Relieving Hot Spots on the World Wide Web", [online], 1997, ACM, [retrieved April 26, 2016], Internet <URL: http://www.akamai.com/dl/technical_publications/ConsistenHashingandRandomTreesDistributedCachingprotocolsforrelievingHotSpotsontheworldwideweb.pdf>
Non-Patent Document 2: Eriko Iwasa and two others, "Study on autonomous data relocation method in high-availability server cluster", The Institute of Electronics, Information and Communication Engineers (IEICE), Proceedings of the 2012 IEICE Communications Society Conference, B-6-71, 2012-08-28

Given that maintenance reduction is used to detach nodes suspected of failure from the cluster, it needs to be processed more quickly. In other words, the transitional period should be as short as possible. With the above procedure, however, maintenance reduction does not complete until rearrangement and original promotion have completed on all nodes constituting the cluster. Moreover, if a node fails during maintenance reduction, the maintenance reduction processing is invalidated, and rollback and similar recovery incur network cost and processing delay from extra data replication and rearrangement between nodes.

The present invention was made in view of this background, and its object is to provide a maintenance reduction system, a node, and a maintenance reduction method that shorten the processing time required for maintenance reduction, in which a node is detached from a cluster.

To solve the above problem, the invention according to claim 1 is a maintenance reduction system that is provided in a high-availability system, in which messages are distributed to and processed by a plurality of nodes constituting a cluster and in which original data, which is the original of the data processed for a message, and replicated data, which is a copy of it, are stored on different nodes, and that removes some of the plurality of nodes constituting the cluster, wherein each node comprises: a storage unit that stores distribution ID information indicating, in association with an ID serving as identification information for each node, the data each node is in charge of processing; a reduction node determination unit that determines, from among the plurality of nodes, a reduction node, which is the node to be removed; a distribution destination change processing unit that stops distribution of messages to the reduction node and generates temporary distribution ID information indicating the data each node other than the reduction node is in charge of; a limited synchronization processing unit that uses the temporary distribution ID information to extract, as original promotion target data, the replicated data that will be promoted to original data when the reduction node is removed, and executes synchronization processing between the extracted original promotion target data and the original data stored in the reduction node; an original promotion processing unit that, after the synchronization processing by the limited synchronization processing unit, promotes the original promotion target data to original data and then updates the current distribution ID information using the temporary distribution ID information; and a data rearrangement processing unit that executes duplication and rearrangement of the data so as to maintain the redundancy defined for the high-availability system.

The invention according to claim 2 is a node of a maintenance reduction system that is provided in a high-availability system, in which messages are distributed to and processed by a plurality of nodes constituting a cluster and in which original data, which is the original of the data processed for a message, and replicated data, which is a copy of it, are stored on different nodes, and that removes some of the plurality of nodes constituting the cluster, the node comprising: a storage unit that stores distribution ID information indicating, in association with an ID serving as identification information for each node, the data each node is in charge of processing; a reduction node determination unit that determines, from among the plurality of nodes, a reduction node, which is the node to be removed; a distribution destination change processing unit that stops distribution of messages to the reduction node and generates temporary distribution ID information indicating the data each node other than the reduction node is in charge of; a limited synchronization processing unit that uses the temporary distribution ID information to extract, as original promotion target data, the replicated data that will be promoted to original data when the reduction node is removed, and executes synchronization processing between the extracted original promotion target data and the original data stored in the reduction node; an original promotion processing unit that, after the synchronization processing by the limited synchronization processing unit, promotes the original promotion target data to original data and then updates the current distribution ID information using the temporary distribution ID information; and a data rearrangement processing unit that executes duplication and rearrangement of the data so as to maintain the redundancy defined for the high-availability system.

The invention according to claim 3 is a maintenance reduction method for a maintenance reduction system that is provided in a high-availability system, in which messages are distributed to and processed by a plurality of nodes constituting a cluster and in which original data, which is the original of the data processed for a message, and replicated data, which is a copy of it, are stored on different nodes, and that removes some of the plurality of nodes constituting the cluster, wherein each node stores, in a storage unit, distribution ID information indicating, in association with an ID serving as identification information for each node, the data each node is in charge of processing, and executes: a step of determining, from among the plurality of nodes, a reduction node, which is the node to be removed; a step of stopping distribution of messages to the reduction node and generating temporary distribution ID information indicating the data each node other than the reduction node is in charge of; a step of using the temporary distribution ID information to extract, as original promotion target data, the replicated data that will be promoted to original data when the reduction node is removed, and executing synchronization processing between the extracted original promotion target data and the original data stored in the reduction node; a step of, after the synchronization processing, promoting the original promotion target data to original data and then updating the current distribution ID information using the temporary distribution ID information; and a step of executing duplication and rearrangement of the data so as to maintain the redundancy defined for the high-availability system.

Thus, in the maintenance reduction system, after distribution destination change processing, synchronization processing (limited synchronization processing) is executed between the original promotion target data and the original data. Original promotion is then performed, the ID table is updated, and finally data rearrangement is performed. Through this limited synchronization processing, the latest state of the original data is synchronized to the replicated data that will be promoted once that original data disappears (the original promotion target data), so the latest data can be retained during maintenance reduction even in an asynchronous system (one that prioritizes processing efficiency and in which replicated data is not always up to date). In the maintenance reduction system, performing original promotion and the ID table update after the limited synchronization processing ends the transitional state associated with maintenance reduction processing. Data rearrangement can therefore be performed separately from maintenance reduction, which shortens the transitional state.
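The limited synchronization followed by original promotion can be sketched roughly as follows. Everything in this example is a simplifying assumption: the `Node` class, the in-memory dictionaries, the function name `limited_sync_and_promote`, and the `owner_after_removal` callback, which stands in for a lookup against the temporary distribution ID information. It also assumes that the node that becomes responsible for a key is the one holding the replica to be promoted.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    originals: dict = field(default_factory=dict)  # key -> value held as original data
    replicas: dict = field(default_factory=dict)   # key -> value held as replicated data


def limited_sync_and_promote(reduction_node, nodes_by_name, owner_after_removal):
    """Synchronize only the promotion-target replicas, then promote them.

    owner_after_removal(key) returns the name of the node that is in charge
    of the key once the reduction node is gone (hypothetical helper).
    """
    targets = []
    # Limited synchronization: only data whose original lives on the reduction
    # node is brought up to date; other replicas are left untouched.
    for key, latest in reduction_node.originals.items():
        target = nodes_by_name[owner_after_removal(key)]
        target.replicas[key] = latest
        targets.append((target, key))
    # Original promotion happens only after the synchronization has finished.
    for target, key in targets:
        target.originals[key] = target.replicas.pop(key)
    return [key for _, key in targets]


a = Node("a", replicas={"k1": "stale"})
b = Node("b", originals={"k1": "latest"})  # reduction node
c = Node("c", replicas={"k2": "other"})
promoted = limited_sync_and_promote(b, {"a": a, "c": c}, lambda key: "a")
```

In this toy run, node `a` ends up holding `k1` as original data with the latest value, while the replica on node `c` is not touched; restoring redundancy for `k1` is left to the later data rearrangement.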

According to the present invention, it is possible to provide a maintenance reduction system, a node, and a maintenance reduction method that shorten the processing time required for maintenance reduction, in which a node is detached from a cluster.

FIG. 1 is a diagram showing the overall configuration of a distributed processing system including the maintenance reduction system according to the present embodiment.
FIG. 2 is a functional block diagram showing a configuration example of a node according to the present embodiment.
FIG. 3 is a diagram showing a data configuration example of node identifier management information according to the present embodiment.
FIG. 4 is a diagram showing a data configuration example of distribution ID information (ID table) according to the present embodiment.
FIG. 5 is a diagram showing a data configuration example of failure probability information according to the present embodiment.
FIG. 6 is a sequence diagram showing the maintenance reduction processing executed by the maintenance reduction system according to the present embodiment.
FIG. 7 is a diagram showing the maintenance reduction processing sequence in a conventional distributed processing system (high-availability system).

<Overall configuration>
First, a distributed processing system (high-availability system) S that includes a maintenance reduction system 1000 according to a mode for carrying out the present invention (hereinafter, "the present embodiment") is described.
FIG. 1 is a diagram showing the overall configuration of the distributed processing system S including the maintenance reduction system 1000 according to the present embodiment.

The maintenance reduction system 1000 is composed of multiple nodes 1. Each node 1 is a physical device such as a computer or a logical device such as a virtual machine. The load balancer 3 distributes messages received from the client 2 to the nodes 1 by simple round robin or the like. The distribution unit 13 of a node 1 then distributes each message from the client 2 to the node 1 in charge of that message based on, for example, consistent hashing. In the node 1 in charge of the message, the signal processing unit 14 performs signal processing and provides the service to the client 2.

The load balancer 3 may be omitted, in which case the client 2 sends messages to an arbitrary node 1 (distribution unit 13). The distribution unit 13 and the signal processing unit 14 may reside on the same node 1 or on separate nodes 1.

The nodes 1 constituting the cluster in the maintenance reduction system 1000 consist of a representative node 1A, which manages the ID table (such as the distribution ID information 200 described later), and existing nodes 1a, which are the nodes 1 other than the representative node 1A. The representative node 1A exchanges information with external devices such as the network management device and controls the maintenance reduction system 1000 as a whole. The network management device is communicatively connected to the maintenance reduction system 1000 and the load balancer 3 and manages the distributed processing system S as a whole.

<Overview>
As shown in FIG. 7, conventional maintenance reduction is performed in the order: distribution destination change → data relocation → original promotion → ID table update. In this procedure, the period from the switch to the temporary ID table (which excludes the reduction node) during the distribution destination change until the ID table is finally updated is a transient period. In other words, the maintenance reduction is not complete until all data relocation and original promotion have finished.
In contrast, the maintenance reduction system 1000 according to the present embodiment executes, after the distribution destination change, a limited synchronization process specific to the present invention (details are described later). It then performs the original promotion, updates the ID table, and finally relocates the data. The limited synchronization process matches the data contents between the original data held by the reduction node and the replica data that will be newly promoted to original data once the reduction node is removed. Through this process, the latest state of the original data is synchronized to the replica data that will be promoted after that original data disappears. By performing the original promotion and the ID table update after the limited synchronization, the maintenance reduction system 1000 ends the transient state associated with maintenance reduction. Because data relocation can thus be decoupled from the maintenance reduction itself, the transient state can be made shorter.
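The difference between the two orderings can be made concrete with a small sketch. The step names paraphrase the description, and the list encoding and the helper `transient_steps` are illustrative assumptions, not part of the embodiment.

```python
# Step names paraphrase the description; this encoding is illustrative only.
CONVENTIONAL = [
    "distribution destination change",
    "data relocation",
    "original promotion",
    "ID table update",
]
PROPOSED = [
    "distribution destination change",
    "limited synchronization",
    "original promotion",
    "ID table update",
    "data relocation",
]


def transient_steps(order):
    """Return the steps that run while the temporary ID table is in effect.

    The transient period is taken to start once the distribution destination
    has been changed and to end when the ID table update completes.
    """
    start = order.index("distribution destination change") + 1
    end = order.index("ID table update") + 1
    return order[start:end]


print(transient_steps(CONVENTIONAL))
print(transient_steps(PROPOSED))
```

In the conventional order the data relocation falls inside the transient period, whereas in the proposed order it runs only after the ID table has been updated.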
The maintenance reduction system 1000, including the node 1 according to the present embodiment, is described in detail below.

≪Node≫
First, the node 1 constituting the maintenance reduction system 1000 according to the present embodiment is described in detail. Among the plurality of nodes 1 constituting the maintenance reduction system 1000, a node 1 may act either as the representative node 1A, which manages the node identifier management information 100 (see FIG. 3, described later) and the ID table (distribution ID information 200, see FIG. 4), or as an existing node 1a (non-representative node), which receives the node identifier management information 100 and the distribution ID information 200 from the representative node 1A and updates and stores them. The processing performed by the representative node 1A is described later. In the following, among the existing nodes 1a constituting the cluster, the node 1 to be removed is referred to as the reduction node 1b. As an example of the information that the representative node 1A uses to determine the reduction node 1b, this description assumes that the representative node 1A receives failure probability information 300 (see FIG. 5) from an external device and determines the reduction node 1b based on the failure probability of each node 1.

FIG. 2 is a functional block diagram showing a configuration example of the node 1 according to the present embodiment.
As shown in FIG. 2, the node 1 includes a control unit 10, an input/output unit 20, and a storage unit 30.

The input/output unit 20 inputs and outputs information to and from the load balancer 3, the other nodes 1, the network management device (not shown), other external devices (not shown), and the like. The input/output unit 20 consists of a communication interface (not shown) that transmits and receives information via a communication line, and an input/output interface (not shown) that handles input and output with input means such as a keyboard and output means such as a monitor.

The storage unit 30 consists of storage means such as a hard disk, flash memory, or RAM (Random Access Memory). It stores the data 400 to be processed, the node identifier management information 100 (see FIG. 3), the distribution ID information 200 (see FIG. 4), the failure probability information 300 (FIG. 5) used to determine the reduction node 1b, and the like. The data 400 comprises original data, which the node 1 itself stores as the original, and replica data, which are copies of original data stored by other nodes 1. Details of each piece of information stored in the storage unit 30 are described later.

The control unit 10 governs overall control of the node 1 and includes a node identifier management unit 11, a reduction node determination unit 12, a distribution unit 13, a signal processing unit 14, and a maintenance reduction processing unit 15. The control unit 10 is realized, for example, by a CPU (not shown) loading a program stored in the storage unit 30 into RAM (not shown) and executing it.

The node identifier management unit 11 manages the node information (IP address and the like) of each node 1 constituting the cluster in the maintenance reduction system 1000, and the ID space handled by each node 1.
Specifically, when a node 1 leaves (is removed from) the maintenance reduction system 1000 to which the unit belongs, the node identifier management unit 11 updates the node identifier management information 100 (FIG. 3), which stores the identification information and the like of the nodes 1 constituting the maintenance reduction system 1000.

FIG. 3 is a diagram showing a data configuration example of the node identifier management information 100 according to the present embodiment.
As shown in FIG. 3, the node identifier management information 100 stores, in association with each other, the node identifier 101 and the address 102 (for example, an IP address) of each node 1 constituting the maintenance reduction system 1000.

The node identifier 101 is assigned, for example, by the node identifier management unit 11 of a specific node set in advance within the maintenance reduction system 1000 (for example, the representative node 1A, set in ascending order of the node identifier 101), and is distributed to each node 1 in the maintenance reduction system 1000.

The node identifier management unit 11 updates the node identifier management information 100 (reflecting, for example, the removal of a node 1) based on node ID change information received from outside.
When it belongs to the representative node 1A, the node identifier management unit 11 additionally does the following. Once the reduction node determination unit 12 has determined the node to be removed (reduction node 1b) and the distribution destination change processing unit 151 of the maintenance reduction processing unit 15 (described later) has stopped distribution to the reduction node 1b, it generates temporary distribution ID information 200k (temporary ID table) from which the reduction node 1b is deleted, and transmits it to the existing nodes 1a other than the reduction node 1b.
When it belongs to an existing node 1a, the node identifier management unit 11 receives the temporary distribution ID information 200k (temporary ID table) from the representative node 1A and stores it in its own storage unit 30. On receiving an update notification for the distribution ID information 200 from the representative node 1A, it updates the current distribution ID information 200 using the temporary distribution ID information 200k (temporary ID table).

FIG. 4 is a diagram showing a data configuration example of the distribution ID information 200 (ID table) according to the present embodiment. FIG. 4(a) shows the distribution ID information 200 before maintenance reduction (current state), and FIG. 4(b) shows the temporary distribution ID information 200k and the distribution ID information 200g after maintenance reduction (after update).

As shown in FIG. 4(a), the distribution ID information 200 stores, in association with each node identifier 201, the ID space 202 (assigned area) handled by that node 1. The node identifier 201 is the same information as the node identifier 101 of FIG. 3. In the example of FIG. 4(a), the ID space contains 1000 IDs in total, from "0" to "999", and the node 1 whose node identifier 201 is "A", for example, handles "0 to 199" as its ID space 202. In this distribution ID information 200, the node ID of node "A" in the ID space is "199". Likewise, the node IDs of nodes "B", "C", "D", and "E" in the ID space are "399", "599", "799", and "999", respectively. The node identifier management unit 11 sorts the node IDs of the nodes 1 in ascending order in the distribution ID information 200 and manages them as a contiguous ID space 202.

FIG. 4(b) shows the temporary distribution ID information 200k (200) that the node identifier management unit 11 generates when node "D" is selected as the node 1 to be removed from the ID space of FIG. 4(a). As a result of the maintenance reduction process, this temporary distribution ID information 200k (temporary ID table) becomes the distribution ID information 200g (200) after removal of the reduction node 1b.

In FIG. 4(b), node "D" is deleted as the reduction node 1b, and the area that node "D" handled is split equally between node "C" and node "E". The area handled by node "C" takes over half of the area of the deleted node "D" and becomes "400" to "699". The area handled by node "E" takes over the other half and becomes "700" to "999". The logic for how the ID space area handled by the reduction node 1b is shared among the existing nodes 1a is assumed to be set in advance by the network administrator. When the reduction node determination unit 12 determines the reduction node 1b, the node identifier management unit 11 generates the temporary distribution ID information 200k (temporary ID table) based on this preset logic (predetermined logic).
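The equal split illustrated in FIG. 4(b) can be sketched as follows. This is only one possible "predetermined logic"; the tuple layout `(node, first_id, last_id)` and the function name `build_temporary_table` are assumptions made for illustration, and the sketch handles only a removed node that has a neighbor on both sides in the table.

```python
# ID table of FIG. 4(a), written as (node identifier, first ID, last ID).
CURRENT_TABLE = [
    ("A", 0, 199),
    ("B", 200, 399),
    ("C", 400, 599),
    ("D", 600, 799),
    ("E", 800, 999),
]


def build_temporary_table(table, removed):
    """Return a new table in which the removed node's range is split in half
    between its predecessor (lower half) and its successor (upper half)."""
    idx = next(i for i, (node, _, _) in enumerate(table) if node == removed)
    if idx == 0 or idx == len(table) - 1:
        # Assumption of this sketch: wrap-around at the ends is not handled.
        raise ValueError("removed node must have neighbors on both sides")
    _, first, last = table[idx]
    mid = first + (last - first + 1) // 2  # first ID handed to the successor
    pred, pred_first, _ = table[idx - 1]
    succ, _, succ_last = table[idx + 1]
    return (
        table[: idx - 1]
        + [(pred, pred_first, mid - 1), (succ, mid, succ_last)]
        + table[idx + 2 :]
    )


temporary_table = build_temporary_table(CURRENT_TABLE, "D")
print(temporary_table)
```

With node "D" removed, node "C" ends up with 400 to 699 and node "E" with 700 to 999, matching the ranges given for FIG. 4(b).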

The node identifier management unit 11 of the representative node 1A in the maintenance reduction system 1000 transmits the temporary distribution ID information 200k (temporary ID table) to each node 1. After the temporary distribution ID information 200k has been applied and has become the distribution ID information 200g after maintenance reduction (see FIG. 4(b)), the node identifier management unit 11 of the representative node 1A transmits the node identifier management information 100 (see FIG. 3), from which the reduction node 1b has been deleted, to each existing node 1a. As a result, every node 1 in the maintenance reduction system 1000 holds the same node identifier management information 100 and the same distribution ID information 200.

The representative node 1A is set, for example, so that nodes 1 take the role in order, starting from the node 1 in the top row of the node identifier management information 100 (FIG. 3). When a node 1 newly becomes the representative node 1A, it transmits information indicating that it is the representative node 1A to each existing node 1a and the like.

Returning to FIG. 2, the reduction node determination unit 12 determines the node to be removed (reduction node 1b) from among the nodes 1 constituting the maintenance reduction system 1000. The reduction node 1b may be determined, for example, from designation information acquired from the network management device (not shown) for a periodic inspection. Alternatively, the representative node 1A may monitor the state of each existing node 1a and determine a node 1 suspected of failure, for example one for which more alarm notifications than a predetermined number have been received, as the reduction node 1b.
In the present embodiment, the description assumes that failure probability information for the (near) future due to a disaster such as an earthquake is acquired from an external device, and that the reduction node determination unit 12 determines a node 1 whose failure probability is at or above a predetermined value (a node likely to be damaged and fail) as the reduction node 1b.

This failure probability information is obtained, for example, by the technique described in Non-Patent Document 3 (Hiroshi Saito, et al., "Proposal of Disaster Avoidance Control," Proc. of Telecommunications Network Strategy and Planning Symposium (Networks), 2014 16th International.). That technique calculates the failure probability from the positional relationship between the location of a disaster (an earthquake or the like) and each node 1 on the network. The reduction node determination unit 12 of the representative node 1A according to the present embodiment acquires the failure probability information of each node 1 constituting the maintenance reduction system 1000 from an external device, for example via the network management device (not shown).

FIG. 5 is a diagram showing a data configuration example of the failure probability information 300 according to the present embodiment.
The failure probability information 300 indicates the probability that each node 1 (server) at each site will fail because of a large-scale disaster or the like that is predicted to occur. Large-scale disasters include, for example, typhoons, storms, tornadoes, lightning strikes, heavy rain, river flooding, tsunamis, and earthquakes. The information gives the predicted probability that, several seconds to several hours or several days later, a server in the area where the site is located will become unusable because of physical damage from the disaster (including network disconnection), a power outage, or a fault such as the server administrator being unable to reach the server facility.

The failure probability information 300 consists of the data items node identifier 301, time 302, and failure probability 303.
The node identifier 301 represents the identifier of each node 1 constituting the cluster and is the same information as the node identifiers 101 and 201 of FIGS. 3 and 4.
The time 302 and the failure probability 303 are stored in association with the node identifier 301. The failure probability 303 holds the failure probability (%) of the server at the time 302 (predetermined time). FIG. 5 shows an example in which the time 302 is set in one-hour intervals.
For example, the first row of the failure probability information 300 indicates that, for node "A" of the node identifier 301, the failure probability 303 on June 1, 2015 (time 302) is "10 (%)" between 13:00 and 14:00, "30 (%)" between 14:00 and 15:00, and "40 (%)" between 15:00 and 16:00.

The reduction node determination unit 12 refers to the failure probability information 300 and determines a node 1 whose failure probability is at or above a predetermined value (a node likely to be damaged and fail) as the reduction node 1b. For example, the reduction node determination unit 12 determines node "D", whose failure probability 303 has reached the predetermined failure probability (for example, 50% or more), as the reduction node 1b.
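A minimal sketch of this threshold check follows. The values for node "A" are the ones given for FIG. 5; the values for node "D", the dictionary layout, and the function name `select_reduction_nodes` are hypothetical and chosen only so that "D" reaches the 50% example threshold.

```python
# Failure probability (%) per node and hourly slot on June 1, 2015.
# Node "A" follows the FIG. 5 description; node "D" values are hypothetical.
FAILURE_PROBABILITY = {
    "A": {"13:00": 10, "14:00": 30, "15:00": 40},
    "D": {"13:00": 20, "14:00": 45, "15:00": 60},
}


def select_reduction_nodes(info, threshold=50):
    """Return nodes whose failure probability reaches the threshold
    in at least one time slot (an interpretation of 'at or above')."""
    return sorted(
        node for node, slots in info.items()
        if max(slots.values()) >= threshold
    )


print(select_reduction_nodes(FAILURE_PROBABILITY))
```

Whether the decision looks at any slot, the next slot only, or some aggregate is not specified in the description; the "any slot" rule here is one simple choice.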

Returning to FIG. 2, the distribution unit 13 calculates "hash(key)" from information (the "distribution key") in a message (signal) received from the client 2 via the load balancer 3 (FIG. 1) or the like, and refers to the distribution ID information 200 (FIG. 4) to identify the node 1 in charge of processing that message. The distribution unit 13 then obtains the address of the identified node 1 from the node identifier management information 100 (FIG. 3) and distributes (transmits) the message to that node 1.
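The lookup performed by the distribution unit 13 can be sketched as below. The hash function (SHA-256 reduced modulo the ID space size), the addresses, and the function names are assumptions for illustration; the description does not specify them.

```python
import bisect
import hashlib

ID_SPACE_SIZE = 1000
# Node IDs (upper bound of each assigned range) from FIG. 4(a), in ascending order.
ID_TABLE = [(199, "A"), (399, "B"), (599, "C"), (799, "D"), (999, "E")]
# Hypothetical addresses standing in for node identifier management information 100.
ADDRESSES = {
    "A": "192.0.2.1",
    "B": "192.0.2.2",
    "C": "192.0.2.3",
    "D": "192.0.2.4",
    "E": "192.0.2.5",
}


def key_to_id(key: str) -> int:
    """Map a distribution key to an ID in the ID space (assumed hash choice)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % ID_SPACE_SIZE


def responsible_node(id_table, data_id: int) -> str:
    """Return the first node whose node ID is greater than or equal to data_id."""
    bounds = [bound for bound, _ in id_table]
    idx = bisect.bisect_left(bounds, data_id)
    return id_table[idx % len(id_table)][1]


def route(key: str):
    """Return (node identifier, address) of the node in charge of the key."""
    node = responsible_node(ID_TABLE, key_to_id(key))
    return node, ADDRESSES[node]


print(route("call-0001"))
```

Keeping the node IDs sorted lets the lookup use a binary search, which fits the description's note that node IDs are managed in ascending order as a contiguous ID space.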

The signal processing unit 14 executes signal processing for messages concerning the data handled by its own node 1. The processing triggered by such a message is, for example, registration, update, search, or deletion of data. When it receives a message such as a data registration or update, the signal processing unit 14 refers to the distribution ID information 200 and, according to the redundancy, determines the nodes that replicate the data (replica nodes) by moving clockwise through the ID space from its own node 1 to the next node, and so on (for a redundancy of "3", two replica nodes are determined). The signal processing unit 14 then transmits replica data, copied from the original data, to the determined replica nodes and has them store it.
The signal processing unit 14 may also embed, in the message it sends after signal processing, a hash value to be used as the distribution key. In SIP (Session Initiation Protocol), for example, the hash value is calculated from the "Call-id" and written in the tag of the To/From header. When the distribution unit 13 later receives a subsequent call for that message, it can use the embedded hash value to look up the distribution ID information 200 (FIG. 4) and identify the node 1 in charge of the subsequent call.
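The clockwise selection of replica nodes described above can be sketched as follows, assuming the nodes are listed in ascending order of node ID and treated as a ring. The function name `replica_nodes` is hypothetical.

```python
def replica_nodes(ring, owner, redundancy):
    """Return the nodes that hold replicas of data whose original is on owner.

    ring is the list of node identifiers in ascending node-ID order; the
    successors of owner are taken clockwise, wrapping around at the end.
    A redundancy of 3 means one original plus two replicas.
    """
    if redundancy - 1 >= len(ring):
        raise ValueError("not enough nodes for the requested redundancy")
    start = ring.index(owner)
    return [ring[(start + k) % len(ring)] for k in range(1, redundancy)]


ring = ["A", "B", "C", "D", "E"]
print(replica_nodes(ring, "D", 3))
```

For the five-node ring of FIG. 4(a) and a redundancy of 3, data whose original is on node "D" is replicated to nodes "E" and "A".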

The maintenance reduction processing unit 15 executes the maintenance reduction process when the reduction node determination unit 12 has determined the node to be removed from the cluster (reduction node 1b).
The maintenance reduction processing unit 15 includes a distribution destination change processing unit 151, a limited synchronization processing unit 152, an original promotion processing unit 153, and a data relocation processing unit 154.

When the reduction node determination unit 12 has determined the reduction node 1b, the distribution destination change processing unit 151 transmits a distribution destination change request to the network management device (not shown) so that signals are no longer distributed to the node 1 to be removed (reduction node 1b). In response, the network management device changes the settings of the load balancer 3 and the like so that messages (signals) are not distributed to the reduction node 1b.
The distribution destination change processing unit 151 also notifies the node identifier management unit 11 that signal distribution to the reduction node 1b has been stopped. This causes the node identifier management unit 11 to generate, based on the predetermined logic, the temporary distribution ID information 200k (temporary ID table), which treats the reduction node 1b as already removed.
The distribution destination change processing unit 151 then notifies all of the existing nodes 1a and the reduction node 1b of the generated temporary distribution ID information 200k (temporary ID table).

When it belongs to an existing node 1a, the limited synchronization processing unit 152 refers to the temporary distribution ID information 200k (temporary ID table) and extracts, from the replica data it holds, the replica data that is to be promoted to original data (hereinafter, "original promotion target data"). The limited synchronization processing unit 152 then executes synchronization between the original data held by the reduction node 1b and the original promotion target data.
Specifically, the limited synchronization processing unit 152 of the existing node 1a holding the original promotion target data transmits a synchronization request to the limited synchronization processing unit 152 of the reduction node 1b. The limited synchronization processing unit 152 of the reduction node 1b refers to the current distribution ID information 200, extracts the data it holds as the original (original data), and transmits that original data as synchronization information to the existing node 1a that sent the request. At this time, the limited synchronization processing unit 152 of the reduction node 1b may transmit only the difference between the original data and the replica data.
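The extraction and the difference-only synchronization can be sketched as follows. Several things here are assumptions and not taken from the description: the rule that a replica is a promotion target when the current table assigns its ID to the reduction node and the temporary table assigns it to the holding node, the per-entry version numbers used to detect differences, and all function and variable names.

```python
# Tables as (node, first ID, last ID), following FIG. 4(a) and FIG. 4(b).
CURRENT = [("A", 0, 199), ("B", 200, 399), ("C", 400, 599),
           ("D", 600, 799), ("E", 800, 999)]
TEMPORARY = [("A", 0, 199), ("B", 200, 399), ("C", 400, 699), ("E", 700, 999)]


def owner(table, data_id):
    """Return the node whose assigned range contains data_id."""
    for node, first, last in table:
        if first <= data_id <= last:
            return node
    raise KeyError(data_id)


def promotion_targets(replicas, self_node, reduction_node):
    """Existing-node side: pick replicas whose original is on the reduction
    node now and that this node will own under the temporary ID table."""
    return sorted(
        data_id for data_id in replicas
        if owner(CURRENT, data_id) == reduction_node
        and owner(TEMPORARY, data_id) == self_node
    )


def sync_response(originals, requested_versions):
    """Reduction-node side: return only entries newer than the requester's copy."""
    return {
        data_id: originals[data_id]
        for data_id, version in requested_versions.items()
        if data_id in originals and originals[data_id][0] > version
    }


# Hypothetical contents, each entry stored as (version, value).
replicas_e = {450: (1, "c"), 720: (1, "old"), 750: (2, "same")}  # replicas on node E
originals_d = {720: (2, "new"), 750: (2, "same")}                # originals on node D

targets = promotion_targets(replicas_e, "E", "D")
request = {data_id: replicas_e[data_id][0] for data_id in targets}
replicas_e.update(sync_response(originals_d, request))
print(targets, replicas_e)
```

Only the stale entry (ID 720) is transferred; the entry that already matches (ID 750) and the replica of another node's original (ID 450) are left untouched, which is what keeps the synchronization "limited".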

When it belongs to an existing node 1a that holds replica data to be promoted (original promotion target data), the original promotion processing unit 153 promotes that replica data to original data once the limited synchronization processing unit 152 has finished synchronizing the original promotion target data with the original data.
The original promotion processing unit 153 then transmits, to the representative node 1A, an original promotion completion notification indicating that the replica data has been promoted to original data.

On receiving the original promotion completion notification, the original promotion processing unit 153 of the representative node 1A instructs the node identifier management unit 11 to update the current distribution ID information 200 using the temporary distribution ID information 200k (temporary ID table).
Once its own distribution ID information 200 has been updated, the original promotion processing unit 153 of the representative node 1A transmits to the existing nodes 1a an ID table update notification, which instructs them to update the current distribution ID information 200 using the temporary distribution ID information 200k (temporary ID table), and thereby has them update it.
The original promotion processing unit 153 of the representative node 1A also transmits an original promotion completion notification, indicating that the original promotion process has finished, to the network management device.

When it belongs to the representative node 1A, the data relocation processing unit 154 transmits a data relocation start notification to the existing nodes 1a once the original promotion processing unit 153 has finished updating the distribution ID information 200. Each existing node 1a then extracts the data that needs to be made redundant again and relocates it by migrating or replicating data based on the updated distribution ID information 200.
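One way to find the data that needs to be made redundant again is sketched below. It assumes the clockwise replica placement described for the signal processing unit 14 and a redundancy of 3; the bookkeeping of which nodes currently hold each item, and the function name `relocation_plan`, are hypothetical.

```python
def relocation_plan(ring, owners, holders, redundancy):
    """Return {data ID: nodes that still need a copy} under the updated ring.

    ring     : node identifiers in ascending node-ID order after the update
    owners   : data ID -> node holding the original after promotion
    holders  : data ID -> set of nodes currently holding the original or a replica
    """
    plan = {}
    for data_id, owner in owners.items():
        start = ring.index(owner)
        wanted = {ring[(start + k) % len(ring)] for k in range(redundancy)}
        missing = sorted(wanted - holders[data_id])
        if missing:
            plan[data_id] = missing
    return plan


ring_after = ["A", "B", "C", "E"]            # node "D" has been removed
owners = {100: "A", 450: "C", 720: "E"}      # 720 was promoted on node "E"
holders = {
    100: {"A", "B", "C"},                    # unaffected by the removal
    450: {"C", "E"},                         # lost its replica on node "D"
    720: {"E", "A"},                         # lost its original on node "D"
}
plan = relocation_plan(ring_after, owners, holders, redundancy=3)
print(plan)
```

Because this step runs after the ID table update, the cluster is already serving requests from the updated table while the missing copies are being created.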

<Process flow>
Next, the flow of the maintenance reduction process executed by the maintenance reduction system 1000 according to the present embodiment is described.
FIG. 6 is a sequence diagram showing the maintenance reduction process executed by the maintenance reduction system 1000 according to the present embodiment.

First, the reduction node determination unit 12 of the representative node 1A determines the node to be removed (reduction node 1b) (step S101). Here, the reduction node determination unit 12, for example, receives failure probability information 300 (see FIG. 5) from an external device, refers to the failure probability of each node 1, and determines any node 1 whose failure probability is equal to or higher than a predetermined value as the node to be removed (reduction node 1b).
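The threshold-based selection in step S101 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `select_reduction_nodes`, the dictionary layout standing in for the failure probability information 300, and the sample values are all assumptions made for this example.

```python
from typing import Dict, List


def select_reduction_nodes(failure_probability: Dict[str, float],
                           threshold: float) -> List[str]:
    """Return the names of nodes whose failure probability is at or above the threshold.

    `failure_probability` is a hypothetical stand-in for the failure
    probability information 300 (node name -> probability).
    """
    return sorted(name for name, p in failure_probability.items() if p >= threshold)


# Hypothetical sample values, for illustration only.
failure_probability_info = {"node1": 0.02, "node2": 0.35, "node3": 0.10}
reduction_nodes = select_reduction_nodes(failure_probability_info, threshold=0.30)
print(reduction_nodes)
```

Nodes exactly at the threshold are selected, matching the "equal to or higher than" wording of the step.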

Subsequently, the distribution destination change processing unit 151 of the representative node 1A transmits a distribution destination change request to the network management device 5 so that messages (signals) are no longer distributed to the determined reduction node 1b (step S102). The network management device 5 then changes, for example, the distribution destination settings of the load balancer 3, which is the distribution source, so that signals are not distributed to the reduction node 1b among the nodes 1 constituting the maintenance reduction system 1000 (step S103). Subsequently, the network management device 5 transmits a distribution destination change completion notification to the representative node 1A (step S104).

Next, the distribution destination change processing unit 151 of the representative node 1A notifies the node identifier management unit 11 that signal distribution to the reduction node 1b has been stopped, and causes it to generate temporary distribution ID information 200k (temporary ID table) that treats the reduction node 1b as already removed (step S105). A reduction flag "+1" is set in this temporary distribution ID information 200k to indicate that it is the current distribution ID information 200 with the reduction node 1b deleted.
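The generation of the temporary ID table in step S105 can be sketched as below. The data layout is an assumption made for this example: the distribution ID information is modeled as a mapping from node name to an integer ID, and the reduction flag "+1" is modeled as a boolean. The names `DistributionIdTable` and `make_temporary_table` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class DistributionIdTable:
    node_ids: Dict[str, int]      # node name -> ID (assumed layout)
    reduction_flag: bool = False  # stands in for the reduction flag "+1"


def make_temporary_table(current: DistributionIdTable,
                         reduction_node: str) -> DistributionIdTable:
    """Build a temporary table that treats the reduction node as already removed."""
    remaining = {name: node_id
                 for name, node_id in current.node_ids.items()
                 if name != reduction_node}
    return DistributionIdTable(node_ids=remaining, reduction_flag=True)


current_table = DistributionIdTable({"node1": 100, "node2": 200, "node3": 300})
temporary_table = make_temporary_table(current_table, "node2")
print(temporary_table)
```

The current table is left untouched, so both tables can coexist until the original promotion has finished.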

Subsequently, the distribution destination change processing unit 151 of the representative node 1A notifies the existing node 1a of the generated temporary distribution ID information 200k (temporary ID table) (step S106). The existing node 1a acquires the temporary distribution ID information 200k (temporary ID table) and stores it in the storage unit 30 (step S107).

Next, limited synchronization processing is executed between the existing node 1a and the reduction node 1b.
Specifically, the limited synchronization processing unit 152 of the existing node 1a refers to the temporary distribution ID information 200k (temporary ID table) and extracts, from the replicated data it holds, the replicated data that is to be promoted to original data (original promotion target data) (step S108).
Subsequently, the limited synchronization processing unit 152 of the existing node 1a that holds original promotion target data transmits a synchronization request to the limited synchronization processing unit 152 of the reduction node 1b (step S109). The limited synchronization processing unit 152 of the reduction node 1b then refers to the current distribution ID information 200, extracts the data it holds as originals (original data), and transmits the information of that original data, as synchronization information, to the existing node 1a that sent the synchronization request (step S110). In this way, synchronization processing is executed between the original data held by the reduction node 1b and the original promotion target data.
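The extraction in step S108 can be sketched as follows. The ownership rule used here is an assumption for illustration (a consistent-hashing style ring in which a key belongs to the first node whose ID is at or after the key's hashed ID); this passage does not specify the rule, and the names `key_id`, `owner` and `promotion_targets` are hypothetical. A replica counts as original promotion target data when its original is on the reduction node under the current table and on the extracting node under the temporary table.

```python
import bisect
import hashlib
from typing import Dict, Iterable, List


def key_id(key: str) -> int:
    """Hash a key onto a 16-bit ID space (assumed for this sketch)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "big")


def owner(table: Dict[str, int], key: str) -> str:
    """Return the node in charge of the original: first node ID >= key ID, wrapping."""
    ring = sorted((node_id, name) for name, node_id in table.items())
    ids = [node_id for node_id, _ in ring]
    index = bisect.bisect_left(ids, key_id(key)) % len(ring)
    return ring[index][1]


def promotion_targets(self_name: str,
                      replica_keys: Iterable[str],
                      current: Dict[str, int],
                      temporary: Dict[str, int],
                      reduction_node: str) -> List[str]:
    """Replicas whose original moves from the reduction node to this node."""
    return [key for key in replica_keys
            if owner(current, key) == reduction_node
            and owner(temporary, key) == self_name]


current = {"n1": 20000, "n2": 40000, "n3": 65535}
temporary = {name: node_id for name, node_id in current.items() if name != "n2"}
keys = [f"key{i}" for i in range(200)]
targets_n3 = promotion_targets("n3", keys, current, temporary, "n2")
targets_n1 = promotion_targets("n1", keys, current, temporary, "n2")
```

Only the keys returned here need to be synchronized with the reduction node before promotion, which is what keeps the synchronization "limited".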

When the limited synchronization processing unit 152 has finished synchronizing the original data with the original promotion target data, the original promotion processing unit 153 of the existing node 1a that holds the original promotion target data executes original promotion processing, which promotes that replicated data to original data (step S111).
Subsequently, the original promotion processing unit 153 of the existing node 1a that executed the original promotion processing transmits an original promotion completion notification, indicating that the promotion of the replicated data has been completed, to the representative node 1A (step S112).

Next, upon receiving the original promotion completion notification, the original promotion processing unit 153 of the representative node 1A instructs the node identifier management unit 11 to lower the reduction flag "+1" of the temporary distribution ID information 200k (temporary ID table) and to update the current distribution ID information 200 using the temporary distribution ID information 200k (step S113).
Then, once its own distribution ID information 200 has been updated, the original promotion processing unit 153 of the representative node 1A transmits to the existing node 1a an ID table update notification, which is instruction information for updating the current distribution ID information 200 using the temporary distribution ID information 200k (temporary ID table) (step S114). The existing node 1a thereby lowers the reduction flag "+1" of the temporary distribution ID information 200k and updates its distribution ID information 200 (step S115).
The original promotion processing unit 153 of the representative node 1A also transmits an original promotion completion notification, indicating that the original promotion processing has been completed, to the network management device 5 (step S116).
Because the original promotion and the update of the distribution ID information 200 are already complete when this original promotion completion notification is transmitted to the network management device 5, each node 1 can end the transient state. The network administrator can therefore disconnect the reduction node 1b from the cluster at this point.

Subsequently, the data relocation processing unit 154 of the representative node 1A transmits a data relocation start notification to the existing node 1a (step S117). In addition, the maintenance reduction processing unit 15 of the representative node 1A transmits a maintenance reduction processing completion notification to the network management device 5 (step S118).

In the existing node 1a that has received the data relocation start notification from the representative node 1A, the data relocation processing unit 154 extracts the data that requires re-redundancy based on the updated distribution ID information 200 and relocates that data by migrating or replicating it (step S119). When the data relocation is finished, the data relocation processing unit 154 of the existing node 1a transmits a completion notification to that effect to the representative node 1A (step S120).
The maintenance reduction processing unit 15 of the representative node 1A transmits a cluster member update completion notification, indicating that all cluster member updates have been completed, to the network management device 5 (step S121), and ends the processing.
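The extraction of data requiring re-redundancy in step S119 can be sketched as follows. This is an illustrative assumption rather than the method of the embodiment: each key is mapped to the set of nodes holding a copy of it (original or replica), and a key needs re-replication when fewer than the required number of copies remain once the reduction node is excluded. The name `keys_needing_replication` and the sample data are hypothetical.

```python
from typing import Dict, List, Set


def keys_needing_replication(holders: Dict[str, Set[str]],
                             removed_node: str,
                             redundancy: int) -> List[str]:
    """Return keys left with fewer than `redundancy` copies after the removal."""
    return sorted(key for key, nodes in holders.items()
                  if len(nodes - {removed_node}) < redundancy)


# Hypothetical placement: key -> nodes holding a copy (original or replica).
holders = {
    "k1": {"n1", "n2"},
    "k2": {"n1", "n3"},
    "k3": {"n2", "n3"},
}
to_replicate = keys_needing_replication(holders, removed_node="n2", redundancy=2)
print(to_replicate)
```

Keys that never had a copy on the removed node keep their full redundancy and are skipped, so only the affected data is migrated or replicated.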

As described above, according to the maintenance reduction system 1000, the node 1, and the maintenance reduction method of the present embodiment, the limited synchronization processing specific to the present invention is executed after the distribution destination change processing; the original promotion is then performed, the ID table is updated, and finally the data relocation is performed (see FIG. 6). This shortens the period of the transient state compared with the conventional procedure shown in FIG. 7. Shortening the transient state makes it possible to disconnect the reduction node 1b from the cluster sooner. Furthermore, even if processing is interrupted for some reason during the relocation of replicated data, which takes place after the original promotion and the ID table update, it is sufficient to roll back to the point immediately after the original promotion and the ID table update, so the MTTR (mean time to repair) can be shortened.

1 Node
1A Representative node
1a Existing node
1b Reduction node
2 Client
3 Load balancer
10 Control unit
11 Node identifier management unit
12 Reduction node determination unit
13 Distribution unit
14 Signal processing unit
15 Maintenance reduction processing unit
20 Input/output unit
30 Storage unit
100 Node identifier management information
151 Distribution destination change processing unit
152 Limited synchronization processing unit
153 Original promotion processing unit
154 Data relocation processing unit
200 Distribution ID information
300 Failure probability information
400 Data
1000 Maintenance reduction system
S Distributed processing system (high availability system)

Claims

A maintenance reduction system provided in a high availability system that distributes messages to each of a plurality of nodes constituting a cluster for processing, and that stores original data, which is the original of the data processed as the messages, and replicated data, which is a duplicate thereof, in different nodes, the maintenance reduction system removing some of the plurality of nodes constituting the cluster,
wherein each of the nodes comprises:
a storage unit that stores distribution ID information indicating, in association with an ID serving as identification information of each node, the data that each node is in charge of processing;
a reduction node determination unit that determines, from among the plurality of nodes, a reduction node, which is a node to be removed;
a distribution destination change processing unit that stops the distribution of the messages to the reduction node and generates temporary distribution ID information indicating the data that each node is in charge of with the reduction node excluded;
a limited synchronization processing unit that uses the temporary distribution ID information to extract, as original promotion target data, the replicated data to be promoted to original data when the reduction node is removed, and that executes synchronization processing between the extracted original promotion target data and the original data stored in the reduction node;
an original promotion processing unit that, after the synchronization processing by the limited synchronization processing unit, promotes the original promotion target data to original data and then updates the current distribution ID information using the temporary distribution ID information; and
a data relocation processing unit that replicates and relocates the data so as to maintain the redundancy defined for the high availability system.

A node of a maintenance reduction system provided in a high availability system that distributes messages to each of a plurality of nodes constituting a cluster for processing, and that stores original data, which is the original of the data processed as the messages, and replicated data, which is a duplicate thereof, in different nodes, the maintenance reduction system removing some of the plurality of nodes constituting the cluster, the node comprising:
a storage unit that stores distribution ID information indicating, in association with an ID serving as identification information of each node, the data that each node is in charge of processing;
a reduction node determination unit that determines, from among the plurality of nodes, a reduction node, which is a node to be removed;
a distribution destination change processing unit that stops the distribution of the messages to the reduction node and generates temporary distribution ID information indicating the data that each node is in charge of with the reduction node excluded;
a limited synchronization processing unit that uses the temporary distribution ID information to extract, as original promotion target data, the replicated data to be promoted to original data when the reduction node is removed, and that executes synchronization processing between the extracted original promotion target data and the original data stored in the reduction node;
an original promotion processing unit that, after the synchronization processing by the limited synchronization processing unit, promotes the original promotion target data to original data and then updates the current distribution ID information using the temporary distribution ID information; and
a data relocation processing unit that replicates and relocates the data so as to maintain the redundancy defined for the high availability system.

A maintenance reduction method for a maintenance reduction system provided in a high availability system that distributes messages to each of a plurality of nodes constituting a cluster for processing, and that stores original data, which is the original of the data processed as the messages, and replicated data, which is a duplicate thereof, in different nodes, the maintenance reduction system removing some of the plurality of nodes constituting the cluster,
wherein each of the nodes
stores, in a storage unit, distribution ID information indicating, in association with an ID serving as identification information of each node, the data that each node is in charge of processing, and executes:
a step of determining, from among the plurality of nodes, a reduction node, which is a node to be removed;
a step of stopping the distribution of the messages to the reduction node and generating temporary distribution ID information indicating the data that each node is in charge of with the reduction node excluded;
a step of using the temporary distribution ID information to extract, as original promotion target data, the replicated data to be promoted to original data when the reduction node is removed, and executing synchronization processing between the extracted original promotion target data and the original data stored in the reduction node;
a step of, after the synchronization processing, promoting the original promotion target data to original data and then updating the current distribution ID information using the temporary distribution ID information; and
a step of replicating and relocating the data so as to maintain the redundancy defined for the high availability system.