JP2015095200A - Cluster system - Google Patents

Publication number
JP2015095200A
Authority
JP
Japan
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2013235597A
Other languages
Japanese (ja)
Other versions
JP5848743B2 (en)
Inventor
Hiroshi Noguchi (野口 博史)
Eriko Iwasa (岩佐 絵里子)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Application filed by Nippon Telegraph and Telephone Corp
Priority to JP2013235597A (patent JP5848743B2)
Publication of JP2015095200A
Application granted
Publication of JP5848743B2
Legal status: Active

Abstract

PROBLEM TO BE SOLVED: To enable a new representative member to be selected at high speed in a cluster system, even when the representative member, which updates and distributes the member management table, and another member fail at the same time.
SOLUTION: The present invention is a cluster system in which a plurality of cluster members perform distributed processing. Each of the cluster members (representative member 100, non-representative member 200) includes: a storage unit 120 (220) that stores a monitoring destination member setting rule 123 (223), which stipulates that, when the alive-monitoring targets among the cluster members are determined, the cluster member that alive-monitors the next representative member candidate is selected from members other than the current representative member; and a processing unit 110 (210) that selects the member's own monitoring target in accordance with the monitoring destination member setting rule 123 (223) when the alive-monitoring targets among the cluster members are determined.

Description

The present invention relates to a technique for alive monitoring between cluster members in a cluster system in which a plurality of cluster members (servers and the like; hereinafter sometimes simply referred to as “members”) perform distributed processing.

Recent Web systems that require large-capacity data retention, high-speed access, and high availability often use cluster systems, which improve the processing capacity of the whole system by coordinating a plurality of servers. In distributed processing by a cluster system, each cluster member constituting the cluster must be associated with the data it is in charge of. One method of associating cluster members with data is the consistent hashing method (Non-patent Document 1).

The consistent hashing method determines the data each cluster member is in charge of by mapping the hash values of the data and the addresses assigned to the cluster members onto the same ID (identifier) space. Even when the number of cluster members increases or decreases, only 1/N (N: the number of cluster members) of the correspondence between cluster members and data needs to change, so the load of data relocation is kept low.
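As an illustration only, the mapping described above can be sketched in Python as follows. The class name HashRing, the helper ring_position, the choice of SHA-256, and the member names are assumptions made for this sketch and do not come from the cited documents; data replication and virtual nodes are omitted.

```python
import bisect
import hashlib


def ring_position(value: str) -> int:
    """Map a string onto a 64-bit ID space (SHA-256 is an assumption of this sketch)."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hashing ring: a key belongs to the first member
    whose position is at or after the key's position, wrapping around the ring."""

    def __init__(self, members):
        self._points = sorted((ring_position(m), m) for m in members)
        self._positions = [p for p, _ in self._points]

    def owner(self, key: str) -> str:
        idx = bisect.bisect_left(self._positions, ring_position(key))
        return self._points[idx % len(self._points)][1]


keys = [f"key-{i}" for i in range(1000)]
before = HashRing(["member-1", "member-2", "member-3"])
after = HashRing(["member-1", "member-2", "member-3", "member-4"])

# Keys whose owner changed when member-4 joined the cluster.
moved = [k for k in keys if before.owner(k) != after.owner(k)]
# Every key that changed owner moved to the newly added member.
print(len(moved), all(after.owner(k) == "member-4" for k in moved))
```

In this simplified form the share of keys that moves depends on where the new member lands in the ID space, so the 1/N figure holds only on average.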

To improve the availability of the cluster system, this method is sometimes combined with a method in which servers other than the server in charge also hold replicas of the data. In that case, each cluster member must hold a list of all members (a member management table) and be able to identify the replication destinations from the member management table according to a specific rule. As shown in FIG. 1, the member management table contains a member ID, which serves as a key that uniquely designates each member, and member-specific additional information such as an IP (Internet Protocol) address. To ensure the consistency of the member management table, it is desirable that the authority to update it be given only to a specific server inside or outside the cluster.

One method of achieving this consistency of the member management table is to make one of the cluster members the representative member, which updates the member management table and distributes it to the other members (Non-patent Document 2). In this method, the representative member is selected from among the cluster members according to an arbitrarily defined representative member selection rule. Each cluster member knows (stores) the representative member selection rule and can identify the representative member, which may be itself, from the information in the member management table. A change of representative member caused by an increase or decrease in cluster members is also carried out by the autonomous judgment of the cluster members.

(Alive monitoring between cluster members and replacement of the representative member)
One method of detecting a cluster member failure (abnormality) is for the cluster members to monitor one another. A monitoring target is defined for each cluster member, and a failure is detected within a predetermined time by, for example, sending periodic heartbeats (at predetermined intervals) and checking whether a response arrives. When a failure is detected, the cluster member that performed the monitoring notifies all cluster members of the failure, and the failed member is removed from the cluster system. The representative member therefore updates the member management table and distributes the updated table to all members. If the failed member was the representative member, the next representative member emerges autonomously according to the representative member selection rule.

The functions (configuration of the processing unit) and information held by the representative member and the other members are described with reference to FIG. 2.
As shown in FIG. 2(a), the representative member includes a processing unit, a storage unit, and an input/output unit.
The processing unit includes a member management table distribution unit, a member management table update unit, a representative member specifying unit, and an other-member alive monitoring unit.
The storage unit stores a member management table, a representative member selection rule, and an old monitoring destination member setting rule.

As shown in FIG. 2(b), the non-representative member includes a processing unit, a storage unit, and an input/output unit.
The processing unit includes a representative member specifying unit and an other-member alive monitoring unit.
The storage unit stores a member management table, a representative member selection rule, and an old monitoring destination member setting rule.

Although not illustrated, the representative member and the non-representative members also have the functions of a cluster member, such as a message processing function and functions for transmitting and managing replicated data.

(Implementation example and problems)
Consider an implementation example of member management using the member management table described above (see also FIG. 1) together with alive monitoring between cluster members. For example, the representative member selection rule makes the first member listed in the member management table the initial representative member. The members are listed in the member management table in the order in which they joined the cluster. Furthermore, each cluster member's alive-monitoring target is the member listed immediately after it (one row below) in the member management table, and the last member monitors the first member. FIG. 3 shows the case of N = 4 cluster members as an example. In this case, the cluster members behave as follows when one member fails.

<(a) When a member other than the representative member fails (see FIG. 4)>
As an example, consider the case where member 3 fails. Member 2, which alive-monitors member 3, detects the failure of member 3 (S41) and notifies all members of the failure (S42). The representative member, having received the notification, updates the member management table (S43) and distributes the updated table to all members (S44). The same applies when another non-representative member (member 2 or member 4) fails.

<(b) When the representative member fails (see FIG. 5)>
Member 4, which alive-monitors member 1 (the representative member), detects the failure of member 1 (S51) and notifies all members of the failure (S52). Member 2 (the next representative member candidate), having received the notification, recognizes itself as the new representative member (S53), updates the member management table (S54), and distributes the updated table to all members (S55). From then on, member 2 operates as the representative member.
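The implementation example above, together with cases (a) and (b), can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the member management table is reduced to a list of member IDs in joining order, and the function names (representative, monitoring_target, on_failure_notice) are hypothetical and do not appear in the original text.

```python
def representative(table):
    """Selection rule of the example: the first member in the table is the representative."""
    return table[0]


def monitoring_target(table, member):
    """Each member monitors the next row; the last member monitors the first."""
    i = table.index(member)
    return table[(i + 1) % len(table)]


def on_failure_notice(table, me, failed):
    """Handle a failure notification as seen by member `me`.

    Returns the table with the failed member removed and a flag telling
    whether `me` is the member that must update and distribute the table.
    """
    new_table = [m for m in table if m != failed]
    return new_table, representative(new_table) == me


table = [1, 2, 3, 4]
# Case (a): member 3 fails; member 1 stays representative and updates the table.
print(on_failure_notice(table, me=1, failed=3))
# Case (b): member 1 fails; member 2 recognizes itself as the new representative.
print(on_failure_notice(table, me=2, failed=1))
```

The sketch assumes that the failure notification actually reaches the members; the cases that follow concern what happens when it does not.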

Next, consider the following cases, in which two members fail simultaneously.

<(c) When the representative member fails together with a member that is neither the member monitoring the next representative member candidate (the second member from the top) nor the member monitoring the representative member (see FIG. 6)>
Consider the case where member 1 (the representative member) and member 3 fail. First, member 4, which alive-monitors member 1, detects the failure of member 1 (S61) and notifies all members of the failure (S62). In parallel, member 2, which alive-monitors member 3, detects the failure of member 3 (S63) and notifies all members of the failure (S64). Member 2, the next representative member candidate, receives the failure notifications for members 1 and 3, recognizes itself as the new representative member (S65), updates the member management table (S66), and distributes the updated table to all members (S67). From then on, member 2 operates as the representative member.

<(d) When the representative member and the next representative member candidate (the second member from the top) both fail (see FIG. 7)>
Consider the case where member 1 (the representative member) and member 2 (the next representative member candidate) fail. First, member 4, which alive-monitors member 1, detects the failure of member 1 (S71) and notifies all members of the failure (S72). Member 2 has also failed, but its failure is not detected because its monitor, member 1, has itself failed. Because members 1 and 2 have failed simultaneously, member 3, the candidate after next, needs to become the representative member. Member 2 is the next representative member candidate, but because it has failed it does not update the member management table (first half of S73). Member 3, however, is not notified of the failure of member 2 and therefore recognizes member 2 as the new representative member (second half of S73). Member 3 then keeps waiting for member 2 to update and distribute the member management table as the new representative member, and the processing of the cluster system stops.

<(e) When the representative member and the member monitoring the representative member (the last member) both fail (see FIG. 8)>
Consider the case where member 1 (the representative member) and member 4 (the member monitoring the representative member) fail. First, member 3, which alive-monitors member 4, detects the failure of member 4 (S81) and notifies all members of the failure (S82). Member 1 has also failed, but its failure is not detected because its monitor, member 4, has itself failed. Because member 1 has failed, member 2, the next representative member candidate, needs to become the representative member. Member 1 does not update the member management table because it has failed (first half of S83). Member 2, however, is not notified of the failure of member 1 and therefore still recognizes member 1 as the representative member (second half of S83). Member 2 then keeps waiting for member 1 to update and distribute the member management table, and the processing of the cluster system stops.

In case (d) (FIG. 7), where the representative member and the next representative member candidate fail simultaneously, and in case (e) (FIG. 8), where the representative member and the member monitoring the representative member fail simultaneously, no new representative member emerges and the member management table is not updated, so the system stops.
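Cases (c) to (e) can be reproduced with a small Python simulation of the N = 4 example. This is a sketch for illustration, not an implementation from the original text: the function names are hypothetical, and it models only which failures are detected and which member the survivors consequently regard as representative.

```python
def ring_targets(table):
    """Prior-art rule: each member monitors the next row; the last monitors the first."""
    return {m: table[(i + 1) % len(table)] for i, m in enumerate(table)}


def detected_failures(targets, failed):
    """A failure is detected only if the member monitoring the failed member is alive."""
    return {target for monitor, target in targets.items()
            if monitor not in failed and target in failed}


def believed_representative(table, known_failed):
    """Survivors treat the first member not known to have failed as representative."""
    return next(m for m in table if m not in known_failed)


def stalls(table, failed):
    """True if the survivors end up waiting for a member that has actually failed."""
    known = detected_failures(ring_targets(table), failed)
    return believed_representative(table, known) in failed


table = [1, 2, 3, 4]
for failed in ({1, 3}, {1, 2}, {1, 4}):  # cases (c), (d), (e)
    print(sorted(failed), "stalls" if stalls(table, failed) else "recovers")
```

Under these assumptions only case (c) recovers; in cases (d) and (e) one of the two failures goes undetected because its monitor is the other failed member.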

FIG. 9 shows one conventional solution to this problem. As in FIG. 7, consider the case where member 1 (the representative member) and member 2 (the next representative member candidate) fail. In this solution, if the representative member does not update the member management table within a predetermined time after a cluster member failure notification is issued, a member other than the regular alive-monitoring member performs failure detection on the representative member or the next representative member candidate.

First, member 4, which alive-monitors member 1 (the representative member), detects the failure of member 1 (S91) and notifies all members of the failure (S92). Member 2 is the next representative member candidate, but because it has failed it does not update the member management table (first half of S93). Member 3, however, is not notified of the failure of member 2 and therefore recognizes member 2 as the new representative member (second half of S93).

If the member management table has not been updated within the predetermined time after the failure notification, member 4 asks member 3 to perform failure detection on member 2 (the next representative member candidate) (S94). Member 3 detects the failure of member 2 (S95) and notifies all members of the failure (S96). Member 3 then recognizes itself as the new representative member (S97), updates the member management table (S98), and distributes the updated table to all members (S99).

Non-patent Document 1: Fujio Maruyama and Kazuyuki Shudo, “Technology of scale-out”, in 雲の世界の向こうをつかむ クラウドの技術 (Cloud Technology), ASCII Media Works, Inc., November 6, 2009, ISBN 978-4-04-868064-6 C3004, pp. 88-99.
Non-patent Document 2: Eriko Iwasa and five others, “Dynamic configuration change load reduction method in scale-out type session control server”, IEICE Technical Report, The Institute of Electronics, Information and Communication Engineers, 2012, NS2011-156 (2012-01), pp. 65-70.

The method shown in FIG. 9, in which a member other than the regular alive-monitoring member performs failure detection on the representative member or the next representative member candidate when the representative member has not updated the member management table within a predetermined time after a failure notification, is effective. However, because it takes time to decide that this additional failure detection is needed, a comparatively long time passes before the member management table is finally updated. While the member management table is not correctly updated, data replication is not managed properly. A highly available system in which service interruption is not permitted therefore needs a method that reflects the state of the cluster members more quickly.

The present invention has been made in view of the above circumstances, and its object is to select a new representative member at high speed in a cluster system even when the representative member, which updates and distributes the member management table, and another member fail at the same time.

To solve the above problem, the present invention is a cluster system in which a plurality of cluster members perform distributed processing, wherein: each of the plurality of cluster members holds a member management table including the identifiers of the plurality of cluster members; a representative member among the plurality of cluster members updates the member management table and distributes the updated member management table to each of the plurality of cluster members; a next representative member candidate, which becomes the next representative member if the representative member fails, is selected in advance from among the plurality of cluster members; and each of the plurality of cluster members includes a storage unit that stores a monitoring destination member setting rule stipulating that, when the alive-monitoring targets among the plurality of cluster members are determined, the cluster member that alive-monitors the next representative member candidate is selected from members other than the representative member at that time, and a processing unit that selects the member's own monitoring target according to the monitoring destination member setting rule when the alive-monitoring targets among the plurality of cluster members are determined.

By using this monitoring destination member setting rule, the cluster system can select a new representative member at high speed even when the representative member, which updates and distributes the member management table, and another member fail at the same time.

In the present invention, the monitoring destination member setting rule is desirably set so that, when the alive-monitoring relationships among the plurality of cluster members are followed in order, starting from an arbitrary cluster member, then the cluster member it monitors, then the cluster member that member monitors, and so on, all cluster members are visited before the chain returns to the starting cluster member.

Configuring the monitoring destinations of the cluster members so that they form a single chain spanning all members reduces the number of members whose failure cannot be detected when several members fail simultaneously.
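The property described above, that following the monitoring destinations from any member visits every member before returning to the start, can be checked with a short Python sketch. The function name and the dictionary representation of the monitoring relationships are assumptions made for illustration; the second mapping is a made-up example of relationships that do not span all members.

```python
def forms_single_cycle(targets):
    """Return True if following `targets` (monitor -> monitored member) from any
    member visits every member exactly once before returning to the start."""
    start = next(iter(targets))
    seen = set()
    current = start
    while current not in seen:
        seen.add(current)
        current = targets[current]
    # The walk must close at the start and must have covered all members.
    return current == start and len(seen) == len(targets)


spanning = {1: 3, 3: 2, 2: 4, 4: 1}   # one chain through all four members
split = {1: 2, 2: 1, 3: 4, 4: 3}      # two separate pairs monitoring each other
print(forms_single_cycle(spanning), forms_single_cycle(split))
```

In the split mapping, a simultaneous failure of members 1 and 2 would leave both failures undetected, because each is the other's only monitor.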

In the present invention, when the number of cluster members (N) is odd, the monitoring destination member setting rule is desirably set so that each cluster member monitors the cluster member reached by skipping one member going down the order of the member management table (that is, the member two positions after itself), the (N−1)-th cluster member, the one before the last, monitors the first cluster member, and the last, N-th cluster member monitors the second cluster member from the top.

With this rule, when the number of cluster members (N) is odd, the monitoring destinations of the cluster members can easily be configured to form a single chain spanning all members.

In the present invention, when the number of cluster members (N) is even, the monitoring destination member setting rule is desirably set so that each cluster member from the first to the (N−2)-th monitors the cluster member reached by skipping one member going down the order of the member management table (that is, the member two positions after itself), the (N−1)-th cluster member, the one before the last, monitors the second cluster member from the top, and the last, N-th cluster member monitors the first cluster member.

With this rule, when the number of cluster members (N) is even, the monitoring destinations of the cluster members can easily be configured to form a single chain spanning all members.
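The rules for odd and even N described above can be written as one small Python function. This is a sketch under stated assumptions: positions are 1-based rows of the member management table, "skipping one member" is read as "the member two positions later", and the function name monitoring_target is hypothetical.

```python
def monitoring_target(i: int, n: int) -> int:
    """Return the 1-based table position monitored by the member at position i
    in a cluster of n members, following the odd/even rules described above."""
    if n < 2 or not 1 <= i <= n:
        raise ValueError("need n >= 2 and 1 <= i <= n")
    if n % 2 == 1:
        # Odd N: always two positions later, wrapping around, so that
        # position N-1 monitors position 1 and position N monitors position 2.
        return (i + 1) % n + 1
    if i <= n - 2:
        # Even N: positions 1 .. N-2 monitor the member two positions later.
        return i + 2
    # Even N: position N-1 monitors position 2, position N monitors position 1.
    return 2 if i == n - 1 else 1


for n in (5, 4):
    print(n, {i: monitoring_target(i, n) for i in range(1, n + 1)})
```

For N of 3 or more, the member at position 2 (the next representative member candidate in the example where the first member is representative) is monitored by position N when N is odd and by position N−1 when N is even, and in neither case by position 1.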

According to the present invention, a new representative member can be selected at high speed in a cluster system even when the representative member, which updates and distributes the member management table, and another member fail at the same time.

FIG. 1 is a data structure diagram of the member management table of the prior art.
FIG. 2(a) is a configuration diagram of the representative member of the prior art, and FIG. 2(b) is a configuration diagram of a non-representative member of the prior art.
FIG. 3 is a diagram showing the alive-monitoring relationships between cluster members in the prior art when the number of cluster members is N = 4.
FIG. 4 relates to the case where member 3 fails in the example of FIG. 3 in the prior art; (a) is a schematic diagram showing the state of each member, and (b) is a flowchart showing the processing by each member.
FIG. 5 relates to the case where member 1 fails in the example of FIG. 3 in the prior art; (a) is a schematic diagram showing the state of each member, and (b) is a flowchart showing the processing by each member.
FIG. 6 relates to the case where members 1 and 3 fail simultaneously in the example of FIG. 3 in the prior art; (a) is a schematic diagram showing the state of each member, and (b) is a flowchart showing the processing by each member.
FIG. 7 relates to the case where members 1 and 2 fail simultaneously in the example of FIG. 3 in the prior art; (a) is a schematic diagram showing the state of each member, and (b) is a flowchart showing the processing by each member.
FIG. 8 relates to the case where members 1 and 4 fail simultaneously in the example of FIG. 3 in the prior art; (a) is a schematic diagram showing the state of each member, and (b) is a flowchart showing the processing by each member.
FIG. 9 relates to the case where members 1 and 2 fail simultaneously in the example of FIG. 3 in another prior art; (a) is a schematic diagram showing the state of each member, and (b) is a flowchart showing the processing by each member.
FIG. 10(a) is a configuration diagram of the representative member of the present embodiment, and FIG. 10(b) is a configuration diagram of a non-representative member of the present embodiment.
FIG. 11(a) is a diagram showing the monitoring relationships between members when the method of the present embodiment is applied to an odd number of members (N), and FIG. 11(b) is a diagram showing the monitoring relationships when it is applied to an even number of members (N).
FIG. 12 relates to the case where members 1 and 2 fail simultaneously in the present embodiment, as in FIG. 7; (a) is a schematic diagram showing the state of each member, and (b) is a flowchart showing the processing by each member.
FIG. 13(a) is a diagram showing the monitoring relationships between members in the present embodiment when the monitoring destinations are configured to form a chain spanning all members, and FIG. 13(b) is a diagram showing, as a comparative example, the monitoring relationships when the monitoring destinations do not span all members.
FIG. 14 is an explanatory diagram of the procedure for setting the monitoring destinations of the cluster members in the present embodiment so that the monitoring destinations form a chain spanning all members; (a) is a diagram showing the arrangement of the members in the member management table, and (b) is a flowchart showing the setting procedure.

Hereinafter, a mode for carrying out the present invention (hereinafter referred to as an embodiment) will be described with reference to the drawings (including, as appropriate, drawings other than those explicitly cited).
This embodiment describes a cluster system in which a plurality of cluster members perform distributed processing. The embodiment uses consistent hashing to associate cluster members with data, but the present invention is not limited to consistent hashing.

First, the functions (the configuration of the processing unit) and the information held by the representative member and the other members of this embodiment will be described with reference to FIG. 10.
As shown in FIG. 10(a), the representative member 100 includes a processing unit 110, a storage unit 120, and an input/output unit 130.
The processing unit 110 includes a member management table distribution unit 111, a member management table update unit 112, a representative member specifying unit 113, and an other-member alive monitoring unit 114.

The member management table distribution unit 111 distributes the member management table 121 to the other members.
The member management table update unit 112 updates the member management table 121 on a predetermined trigger (for example, when the number of members increases or decreases).
The representative member specifying unit 113 specifies the representative member by using the representative member selection rule 122 on a predetermined trigger (for example, when the representative member fails).
The other-member alive monitoring unit 114 performs alive monitoring on the other member determined by the monitoring destination member setting rule 123.

The storage unit 120 stores the member management table 121, the representative member selection rule 122, and the monitoring destination member setting rule 123.
The input/output unit 130 includes an information input interface, an output interface, and a communication interface.

Compared with the representative member shown in FIG. 2(a), the representative member 100 is technically characterized in that the monitoring destination member setting rule 123 differs from the old monitoring destination member setting rule (details are given later). The other components are the same as the corresponding components of the representative member shown in FIG. 2(a), so a detailed description is omitted.

Although not shown in FIG. 10(a), the representative member 100 also stores, in the storage unit 120, a selection rule for the next representative member candidate. Using this rule, the representative member 100 can newly select the next representative member candidate whenever necessary. Alternatively, the representative member 100 may always keep the identifier of the next representative member candidate in a predetermined storage area and, whenever the representative member changes, newly select the next representative member candidate based on the selection rule and update the stored identifier. The selection rule is, for example, a rule that makes the member listed immediately after (one row below) the representative member in the member management table 121 the next representative member.
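The example selection rule above can be sketched in a few lines of Python. This is only an illustration: the function name `next_representative_candidate` and the representation of the member management table as a list of identifiers are assumptions, and wrapping from the last row back to the top is borrowed from how the table is treated for monitoring, not stated for this rule.

```python
def next_representative_candidate(member_table, representative):
    """Return the member listed one row below the representative.

    member_table is a list of member identifiers in table order
    (a hypothetical stand-in for the member management table 121).
    """
    position = member_table.index(representative)
    # Assumption: after the last row, the table wraps around to the top.
    return member_table[(position + 1) % len(member_table)]


table = [1, 2, 3, 4]
print(next_representative_candidate(table, 1))  # 2
```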

As shown in FIG. 10(b), the non-representative member 200 includes a processing unit 210, a storage unit 220, and an input/output unit 230.
The processing unit 210 includes a representative member specifying unit 211 and an other-member alive monitoring unit 212.
The representative member specifying unit 211 specifies the representative member by using the representative member selection rule 222 on a predetermined trigger (for example, when the representative member fails).
The other-member alive monitoring unit 212 performs alive monitoring on the other member determined by the monitoring destination member setting rule 223.
The storage unit 220 stores the member management table 221, the representative member selection rule 222, and the monitoring destination member setting rule 223.
The input/output unit 230 includes an information input interface, an output interface, and a communication interface.

Compared with the non-representative member shown in FIG. 2(b), the non-representative member 200 is technically characterized in that the monitoring destination member setting rule 223 differs from the old monitoring destination member setting rule (details are given later). The other components are the same as the corresponding components of the non-representative member shown in FIG. 2(b), so a detailed description is omitted.

Although not shown in FIG. 10(b), the non-representative member 200 also stores, in the storage unit 220, the selection rule for the next representative member candidate (the same rule as that of the representative member 100).

Although not illustrated, the representative member 100 and the non-representative member 200 also have the functions of a cluster member, such as a message processing function and functions for transmitting and managing replicated data.

In FIG. 10(b), the processing unit 210 of the non-representative member 200 is shown without components corresponding to the member management table distribution unit 111 and the member management table update unit 112 of the processing unit 110 of the representative member 100. Needless to say, when the non-representative member 200 becomes the representative member 100, it also has those components.

This embodiment proposes a monitoring scheme among cluster members that solves the problems (d) (FIG. 7) and (e) (FIG. 8) described above. To that end, it is effective to keep the next representative member candidate from being the monitoring target of the representative member. That is, the monitoring destination member setting rules 123 and 223 stipulate that, when the alive-monitoring targets among the plurality of cluster members are determined, the cluster member that performs alive monitoring on the next representative member candidate is selected from members other than the current representative member. Specifically, the rules are as follows.

For example, when the number of cluster members (N) is odd, each cluster member monitors the member reached by skipping one entry (that is, the member two rows below) in the member management table 121 (the same table as in FIG. 1). In the member management table 121, the entry after the last (bottom) row is treated as wrapping around to the top.

That is, taking N = 5 as shown in FIG. 11(a), member 1 (the representative member) monitors member 3, member 2 monitors member 4, and member 3 monitors member 5. The (N-1)-th member (member 4), one before the last (member 5), monitors the first member (member 1), and the last, N-th member (member 5) monitors the second member from the top (member 2). In this way, the monitoring destinations of the cluster members can easily be arranged to form a system that spans all members.
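As a minimal sketch of the odd-N rule, the following Python function maps every member to the member two rows below it, wrapping at the end of the table. The name `skip_one_targets` and the list-of-identifiers table are illustrative assumptions, not part of the patent.

```python
def skip_one_targets(member_table):
    """Map each member to its monitoring destination for odd N.

    Each member monitors the member two rows below it in the table;
    the row after the last one wraps around to the top.
    """
    n = len(member_table)
    if n < 3 or n % 2 == 0:
        raise ValueError("this rule is intended for an odd N of at least 3")
    return {member_table[i]: member_table[(i + 2) % n] for i in range(n)}


print(skip_one_targets([1, 2, 3, 4, 5]))
# {1: 3, 2: 4, 3: 5, 4: 1, 5: 2}
```

With N = 5 this reproduces the relationships described for FIG. 11(a): the next representative member candidate (member 2) is watched by member 5, not by the representative member.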

Here, a "system in which the monitoring destinations span all members" means that the alive-monitoring relationships among the cluster members are such that, starting from any cluster member and following in turn that member's monitoring destination, then that destination's monitoring destination, and so on, every cluster member is visited before the path returns to the starting member.
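This definition amounts to requiring that the monitoring relationships form one closed loop through every member. A small check along those lines, assuming the relationships are held in a dictionary from each member to its monitoring destination (the name `spans_all_members` is hypothetical), might look like this:

```python
def spans_all_members(targets):
    """Return True if following monitoring destinations from one member
    visits every member exactly once before returning to the start.

    targets maps each member to the member it monitors; every
    destination is assumed to be a key of the dictionary.
    """
    start = next(iter(targets))
    seen = set()
    current = start
    while current not in seen:
        seen.add(current)
        current = targets[current]
    # The walk must close at the starting member and cover everyone.
    return current == start and len(seen) == len(targets)


print(spans_all_members({1: 3, 2: 4, 3: 5, 4: 1, 5: 2}))  # True
print(spans_all_members({1: 2, 2: 3, 3: 1, 4: 5, 5: 4}))  # False
```

The second call returns False because members 4 and 5 only watch each other and are never reached from member 1.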

As shown in FIG. 11(b), when the number of members (N) is even, the N-th member (member 4) monitors the first member (member 1), and the (N-1)-th member monitors the second member from the top. In this way, the monitoring destinations of the cluster members can easily be arranged to form a system that spans all members. When N = 4, this also prevents the situation in which member 2 and member 4 monitor each other, so that neither failure would be detected if both failed simultaneously.
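A sketch of the even-N variant follows, under the same assumptions as before (a list of identifiers standing in for the member management table, and a hypothetical function name `even_targets`). The first N-2 members skip one entry; only the last two rows are treated specially.

```python
def even_targets(member_table):
    """Map each member to its monitoring destination for even N."""
    n = len(member_table)
    if n < 4 or n % 2 != 0:
        raise ValueError("this rule is intended for an even N of at least 4")
    # Members 1 .. N-2 monitor the member two rows below.
    targets = {member_table[i]: member_table[i + 2] for i in range(n - 2)}
    # The (N-1)-th member monitors the second member from the top.
    targets[member_table[n - 2]] = member_table[1]
    # The N-th member monitors the first member.
    targets[member_table[n - 1]] = member_table[0]
    return targets


print(even_targets([1, 2, 3, 4]))
# {1: 3, 2: 4, 3: 2, 4: 1}
```

For N = 4 the loop is 1, 3, 2, 4 and back to 1, so no two members monitor each other. A plain skip-one rule with wrap-around would instead pair member 1 with member 3 and member 2 with member 4.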

FIG. 12 shows the operation (for N = 4) when member 1 (the representative member) and member 2 (the next representative member candidate) fail simultaneously under the above scheme. Strictly speaking, the operations are performed by the units in the processing unit 110 and the processing unit 210, but for brevity each member is described as the actor.

First, member 4, which performs alive monitoring on member 1 (the representative member), detects the failure of member 1 (S121) and notifies all members of that failure (S122). In parallel, member 3, which performs alive monitoring on member 2, detects the failure of member 2 (S123) and notifies all members of that failure (S124). Member 3, the representative member candidate after next, receives the failure notifications for member 1 (the representative member) and member 2 (the next representative member candidate), recognizes itself as the new representative member (S125), updates the member management table 121 (S126), and distributes the updated member management table 121 to all members (S127). From then on, member 3 operates as the representative member.
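The flow of FIG. 12 can be traced with a short simulation. The sketch below is illustrative only: it assumes the member table lists the representative first and that succession follows table order, so the first surviving member becomes the new representative. The function name and the data layout are hypothetical, and message passing (S122, S124, S127) is not modeled.

```python
def simulate_simultaneous_failure(member_table, targets, failed):
    """Return (detected_by, new_representative) for a set of failures.

    targets maps each member to the member it monitors.
    detected_by maps each failed member to the surviving member that
    detects its failure directly; failures whose watcher has also
    failed are left out.
    """
    watcher_of = {target: watcher for watcher, target in targets.items()}
    detected_by = {
        member: watcher_of[member]
        for member in sorted(failed)
        if watcher_of[member] not in failed
    }
    # Assumption: succession follows table order among survivors.
    survivors = [m for m in member_table if m not in failed]
    new_representative = survivors[0] if survivors else None
    return detected_by, new_representative


table = [1, 2, 3, 4]
targets = {1: 3, 2: 4, 3: 2, 4: 1}  # relationships of FIG. 11(b)
detected_by, new_rep = simulate_simultaneous_failure(table, targets, {1, 2})
print(detected_by)  # {1: 4, 2: 3}
print(new_rep)      # 3
```

Both failures have a live watcher, so they are detected in parallel, matching steps S121 and S123.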

As described above, with this scheme the failures of the representative member and the next representative member are detected in parallel, so a new representative member can be selected quickly. In other words, even when the representative member, which updates and distributes the member management table 121, and another member fail at the same time in the cluster system, a new representative member can be selected at high speed.

Here, the scheme is presented in generalized form. As a premise, in order to minimize the number of members whose failure cannot be detected when several members fail simultaneously, the scheme not only avoids creating pairs that monitor each other but also uses monitoring destination member setting rules 123 and 223 under which the monitoring destinations of the cluster members form a system that spans all members, as shown in FIG. 13(a).

As a comparative example, consider a configuration in which the monitoring destinations of the cluster members do not span all members. In the example of FIG. 13(b), failure detection is impossible when members 1 to 3 fail simultaneously, and it is likewise impossible when members 4 to N fail simultaneously.
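The difference between the two configurations can be checked mechanically: a group of simultaneous failures goes entirely unnoticed when no surviving member monitors any member of the group. The sketch below assumes N = 6 and a particular direction for the two separate loops of the comparative example, neither of which is specified in the text. The helper name `has_live_watcher` is hypothetical.

```python
def has_live_watcher(targets, failed):
    """Return True if at least one surviving member monitors a failed one."""
    return any(
        watcher not in failed and target in failed
        for watcher, target in targets.items()
    )


# Comparative example (assumed shape): two closed groups, 1-3 and 4-6.
split = {1: 2, 2: 3, 3: 1, 4: 5, 5: 6, 6: 4}
# Single loop spanning all six members.
ring = {1: 3, 2: 4, 3: 5, 4: 6, 5: 2, 6: 1}

print(has_live_watcher(split, {1, 2, 3}))  # False: nobody notices
print(has_live_watcher(ring, {1, 2, 3}))   # True: member 6 watches member 1
```

In a single loop, any failed group that leaves at least one survivor has a live watcher, because the loop must enter the group from outside somewhere.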

In contrast, in this scheme, which adopts a "system in which the monitoring destinations span all members", the monitoring destination of each cluster member is determined by the procedure shown in FIG. 14(b), where member k is the representative member and member j is the next representative member candidate. FIG. 14(a) shows that, of the N members, the k-th is the representative member and the j-th is the next representative member candidate.

First, the monitoring destination member x of member k (the representative member) is determined according to condition 1 (x ≠ k, x ≠ j) (S141). That is, the monitoring destination member x of member k is selected from the (N-2) members that remain after excluding member j (the next representative member candidate) and member k itself.

Next, the monitoring destination member y of member x is determined according to condition 2 (y ≠ k, y ≠ x) (S142). That is, the monitoring destination member y of member x is selected from the (N-2) members that remain after excluding the representative member k, which is the starting point of the system, and member x, whose monitoring source has already been determined.

Next, it is determined whether the number of members whose monitoring destination has not yet been set is "2 or more" or "1" (S143). If it is "2 or more", the procedure proceeds to S145. If it is "1", the monitoring destination of the last member is set to member k (the representative member) (S144), and the procedure ends.

In S145, the monitoring destination member z of member y is determined according to condition 3 (z ≠ k, z ≠ x, z ≠ y). That is, the monitoring destination member z of member y is selected from the (N-3) members that remain after excluding the representative member k, which is the starting point of the system, and members x and y, whose monitoring sources have already been determined.

Next, it is determined whether the number of members whose monitoring destination has not yet been set is "2 or more" or "1" (S146). If it is "2 or more", the same processing as in {S142, S143} and {S145, S146} is continued (monitoring destination members are determined using conditions 4, 5, and so on). If it is "1", the monitoring destination of the last member is set to member k (the representative member) (S147), and the procedure ends.
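The generalized procedure can be sketched as a loop that grows a chain from the representative member and closes it at the end. The text leaves the choice among eligible members open; the sketch below simply takes the first eligible member in table order, which is an assumption, as are the function name `build_monitoring_ring` and the list-based table.

```python
def build_monitoring_ring(member_table, representative, candidate):
    """Assign monitoring destinations so that they span all members
    and the representative does not monitor the candidate."""
    if len(member_table) < 3:
        raise ValueError("the procedure needs at least three members")
    # Members that do not yet have a monitoring source.
    unwatched = [m for m in member_table if m != representative]
    targets = {}
    current = representative
    while unwatched:
        # Condition 1: only the representative must avoid the candidate.
        # Later conditions just exclude members that are already watched.
        eligible = [
            m for m in unwatched
            if not (current == representative and m == candidate)
        ]
        chosen = eligible[0]  # assumption: first eligible in table order
        targets[current] = chosen
        unwatched.remove(chosen)
        current = chosen
    # The last member closes the loop by monitoring the representative.
    targets[current] = representative
    return targets


print(build_monitoring_ring([1, 2, 3, 4, 5], representative=1, candidate=2))
# {1: 3, 3: 2, 2: 4, 4: 5, 5: 1}
```

Here k = 1 and j = 2: member 1 picks member 3, the chain continues through members 2, 4, and 5, and member 5 finally monitors member 1.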

In this way, the representative member is kept from monitoring the next representative member, and the monitoring destinations of the cluster members form a system that spans all members, which minimizes the number of members whose failure cannot be detected when several members fail simultaneously. When N = 2, the representative member and the next representative member candidate would monitor each other, so this embodiment requires N ≥ 3.

Because this scheme can be implemented without major changes to the software of the cluster system, it is well suited to general-purpose use.

This concludes the description of the embodiment, but aspects of the present invention are not limited to it. Specific configurations and processing may be modified as appropriate without departing from the gist of the present invention.

DESCRIPTION OF SYMBOLS
100 Representative member
110 Processing unit
111 Member management table distribution unit
112 Member management table update unit
113 Representative member specifying unit
114 Other-member alive monitoring unit
120 Storage unit
121 Member management table
122 Representative member selection rule
123 Monitoring destination member setting rule
130 Input/output unit
200 Non-representative member
210 Processing unit
211 Representative member specifying unit
212 Other-member alive monitoring unit
220 Storage unit
222 Representative member selection rule
223 Monitoring destination member setting rule
230 Input/output unit

Claims (4)

1. A cluster system in which a plurality of cluster members perform distributed processing, wherein
each of the plurality of cluster members holds a member management table containing identifiers of the plurality of cluster members,
a representative member among the plurality of cluster members updates the member management table and distributes the updated member management table to each of the plurality of cluster members,
a next representative member candidate, which becomes the next representative member if the representative member fails, is selected in advance from among the plurality of cluster members, and
each of the plurality of cluster members comprises:
a storage unit that stores a monitoring destination member setting rule stipulating that, when alive-monitoring targets among the plurality of cluster members are determined, the cluster member that performs alive monitoring on the next representative member candidate is selected from members other than the representative member at that time; and
a processing unit that, when the alive-monitoring targets among the plurality of cluster members are determined, selects its own monitoring target in accordance with the monitoring destination member setting rule.
2. The cluster system according to claim 1, wherein the monitoring destination member setting rule is a rule set so that the alive-monitoring relationships among the plurality of cluster members are such that, starting from any cluster member and following in turn that cluster member's monitoring destination cluster member, then that cluster member's monitoring destination cluster member, and so on, all of the plurality of cluster members are visited before the path returns to the starting cluster member.
3. The cluster system according to claim 1 or claim 2, wherein the monitoring destination member setting rule is a rule set so that, when the number of cluster members (N) is odd, the monitoring target of each cluster member is the cluster member reached by skipping one entry going down the order of the member management table, the (N-1)-th cluster member, one before the last, monitors the first cluster member, and the last, N-th cluster member monitors the second cluster member from the top.
4. The cluster system according to claim 1 or claim 2, wherein the monitoring destination member setting rule is a rule set so that, when the number of cluster members (N) is even,
the monitoring target of each cluster member from the first through the (N-2)-th in the order of the member management table is the cluster member reached by skipping one entry going down the order of the member management table,
the (N-1)-th cluster member, one before the last, monitors the second cluster member from the top, and
the last, N-th cluster member monitors the first cluster member.
JP2013235597A 2013-11-14 2013-11-14 Cluster system Active JP5848743B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2013235597A JP5848743B2 (en) 2013-11-14 2013-11-14 Cluster system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2013235597A JP5848743B2 (en) 2013-11-14 2013-11-14 Cluster system

Publications (2)

Publication Number Publication Date
JP2015095200A true JP2015095200A (en) 2015-05-18
JP5848743B2 JP5848743B2 (en) 2016-01-27

Family

ID=53197532

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2013235597A Active JP5848743B2 (en) 2013-11-14 2013-11-14 Cluster system

Country Status (1)

Country Link
JP (1) JP5848743B2 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011164751A (en) * 2010-02-05 2011-08-25 Nec Corp System and method for resource management

Also Published As

Publication number Publication date
JP5848743B2 (en) 2016-01-27

Similar Documents

Publication Publication Date Title
US10291696B2 (en) Peer-to-peer architecture for processing big data
US7849178B2 (en) Grid computing implementation
JP6607783B2 (en) Distributed cache cluster management
US8990176B2 (en) Managing a search index
CN103067433B (en) A kind of data migration method of distributed memory system, equipment and system
US9367261B2 (en) Computer system, data management method and data management program
EP3015998B1 (en) Zoning balance subtask delivering method, apparatus and system
JP6434131B2 (en) Distributed processing system, task processing method, storage medium
KR20120018178A (en) Swarm-based synchronization over a network of object stores
US20190155922A1 (en) Server for torus network-based distributed file system and method using the same
JP2017507415A (en) Method and apparatus for IT infrastructure management in cloud environment
JP5969315B2 (en) Data migration processing system and data migration processing method
EP3186720B1 (en) Organizing a computing system having multiple computers, distributing computing tasks among the computers, and maintaining data integrity and redundancy in the computing system
JP2013182546A (en) Management device and program
JP2016177324A (en) Information processing apparatus, information processing system, information processing method, and program
JP5848743B2 (en) Cluster system
JP6644902B2 (en) Neighbor monitoring in a hyperscale environment
KR102476271B1 (en) Method for configuration of semi-managed dht based on ndn and system therefor
JP2024514467A (en) Geographically distributed hybrid cloud cluster
JP5745445B2 (en) Management device and program
US9798633B2 (en) Access point controller failover system
CN110868340B (en) Testing method and device, reconfigurable tester and controller
JP5956940B2 (en) Redundant system and working machine determination method
JP6093320B2 (en) Distributed processing system
JP5711771B2 (en) Node leave processing system

Legal Events

Date Code Title Description
A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20150602

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20150616

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20151124

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20151127

R150 Certificate of patent or registration of utility model

Ref document number: 5848743

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150