JP4818349B2 - Distributed system and multiplexing control method for the same

Info

Publication number: JP4818349B2
Application number: JP2008328750A
Authority: JP
Inventors: 雅田中; 茂夫大道; 敬介伊藤; 大士中山; 卓也熊谷
Original assignee: Toshiba Corp; Toshiba Solutions Corp
Current assignee: Toshiba Corp; Toshiba Digital Solutions Corp
Priority date: 2008-12-24
Filing date: 2008-12-24
Publication date: 2011-11-16
Anticipated expiration: 2028-12-24
Also published as: JP2010152559A

Description

The present invention relates to a distributed system in which n computers (n is an integer satisfying 4 < n ≤ m) out of m computers (m is an integer greater than 4) connected via a network are operated synchronously, and in which multiplexing on at least (n−f) computers (f is the largest integer satisfying 3f < n) is guaranteed. In particular, it relates to a distributed system, and a multiplexing control method for that system, suited to the case where a failure occurs in any of the n computers.

In recent years, computer and network technology have advanced remarkably, and business operations have been widely computerized as a result. Many of these operations cannot tolerate interruption caused by failures, so it is becoming common to build distributed systems in which multiple computers are connected via a network. One technique for operating such a distributed system is to multiplex the execution of a deterministic program using ordered multicast.

Ordered multicast is a mechanism that delivers every input to the distributed system to all computers and guarantees that the data arrive in the same order at every computer.

As a multiplexing technique for distributed systems, Patent Document 1, for example, discloses a technique for a distributed system in which n computers (n is an integer of 4 or more, that is, four or more computers) are connected via a network. The technique in principle never produces split brain and never interrupts processing because of a timeout when a failure occurs.

In the conventional multiplexing technique (prior art) disclosed in Patent Document 1, take as an example a distributed system in which n computers (n is an integer of 4 or more) make up the multiplexed configuration. To operate the n computers synchronously and guarantee multiplexing on at least (n−f) of them (f, called the maximum allowable number of failures, is the largest integer satisfying 3f < n), each of the n computers includes input candidate collection means and input candidate selection control means (first input candidate selection control means). The input candidate collection means collects, via the network, the input data that each of the n computers has selected as its candidate to process next. When (n−f) or more items of input data have been collected, the input candidate selection control means determines whether (n−f) or more of them have identical content and, if so, confirms that input data as the next item to process. The input data is thereby delivered by ordered multicast. Determining that (n−f) or more of the collected items have identical content amounts to reaching agreement on the input data among (n−f) or more computers. In other words, the input candidate selection control means functions as agreement means.

Patent Document 1: Japanese Patent No. 3655263
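
The agreement check described above can be sketched roughly as follows. This is only an illustration, not the implementation of Patent Document 1: the function name `confirm_next_input`, the representation of input data as hashable values, and the choice to pass n and f as arguments are all assumptions made for the example.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def confirm_next_input(
    candidates: Sequence[Hashable], n: int, f: int
) -> Optional[Hashable]:
    """Return the input data agreed on by at least (n - f) computers, else None.

    `candidates` holds the input data collected so far, one item per computer
    that has reported its candidate (hypothetical representation). The sketch
    assumes n - f is positive, which holds whenever 3f < n.
    """
    quorum = n - f
    # Nothing can be confirmed until (n - f) or more candidates are collected.
    if len(candidates) < quorum:
        return None
    # Find the most common content and check whether it reaches the quorum.
    content, count = Counter(candidates).most_common(1)[0]
    return content if count >= quorum else None
```

With n = 7 and f = 2, five identical candidates confirm the input, while a 4-to-2 split confirms nothing and the protocol would have to proceed to another step.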

According to the above prior art, a distributed system multiplexed across n computers (n is an integer of 4 or more) tolerates failures (fail-stops) of up to f computers (f is the largest integer satisfying 3f < n). That is, multiplexing on (n−f) or more computers is guaranteed (continues). Conversely, in the prior art, once the number of failed computers reaches (f+1) or more, multiplexing cannot continue and the distributed system fail-stops. For example, in a distributed system multiplexed across seven computers (n = 7), f is 2, so the failure of up to two computers is tolerated, but if one more computer fails, that is, if three computers have failed, multiplexing cannot continue.
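
The arithmetic behind the seven-computer example can be written out as below. The closed form with the floor function is not stated in the source; it follows from the definition of f as the largest integer satisfying 3f < n, because for integers 3f < n is equivalent to 3f ≤ n − 1.

```latex
\[
  f \;=\; \max\{\,k \in \mathbb{Z} : 3k < n\,\}
    \;=\; \left\lfloor \frac{n-1}{3} \right\rfloor ,
  \qquad
  n = 7 \;\Rightarrow\; f = \left\lfloor \tfrac{6}{3} \right\rfloor = 2,
  \quad n - f = 5 .
\]
```

So with n = 7 at least five computers must remain in agreement, and a third failure leaves only four.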

The present invention has been made in view of the above circumstances. Its object is to provide a distributed system, and a multiplexing control method for that system, that can continue multiplexing even when the number of failed computers reaches (f+1) or more, where f is the value at the start of operation, as long as three or more computers are operating normally.

According to one aspect of the present invention, there is provided a distributed system in which n computers (n is an integer satisfying 4 < n ≤ m) out of m computers (m is an integer greater than 4) connected via a network are operated synchronously. In this distributed system, each of the m computers includes configuration storage means, agreement means, failure detection means, and configuration determination means. The configuration storage means stores information identifying the n computers to be operated synchronously as information identifying the computers that constitute the distributed system. The agreement means synchronously operates the n computers constituting the distributed system, as identified by the information stored in the configuration storage means. To guarantee multiplexing on at least (n−f) of those n computers (f is the largest integer satisfying 3f < n), it delivers input data by ordered multicast by reaching agreement on the input data among (n−f) or more computers. The failure detection means detects a failed computer from the states of the n computers constituting the distributed system, as identified by the information stored in the configuration storage means.
When the failure detection means detects a failed computer, the configuration determination means determines that a condition requiring a change to the configuration of the distributed system has arisen. It decides whether to change the configuration according to whether (n−f) or more computers have made that determination. If it decides to change the configuration, it updates the information stored in the configuration storage means so that the information indicates the configuration after the change.
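
A minimal sketch of this reconfiguration decision is shown below. The function name, the use of sets of computer IDs, and in particular the assumption that the new configuration is the old one minus the failed computer are illustrative choices rather than details stated in this passage.

```python
from typing import FrozenSet


def decide_reconfiguration(
    config_ids: FrozenSet[int], voters: FrozenSet[int], failed_id: int
) -> FrozenSet[int]:
    """Return the configuration to store after the decision.

    config_ids: IDs of the n computers currently constituting the system.
    voters:     IDs of computers that judged a configuration change necessary
                (hypothetical representation of the agreed votes).
    failed_id:  ID of the computer detected as failed.
    """
    n = len(config_ids)
    f = (n - 1) // 3  # largest integer f satisfying 3f < n
    # Only votes from current members count toward the (n - f) threshold.
    if len(voters & config_ids) >= n - f:
        # Assumption: reconfiguring means dropping the failed computer.
        return config_ids - {failed_id}
    return config_ids
```

For seven computers the threshold is five votes, so five computers reporting the failure of computer #7 shrink the configuration to six members, while four votes leave it unchanged.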

According to the present invention, when a failure in any of the n computers currently constituting the distributed system leads computers to determine that a condition requiring a configuration change has arisen, the configuration of the distributed system is changed once agreement is reached that (n−f) or more computers have made that determination. As a result, even if the number of failed computers reaches (f+1) or more, where f is the value at the start of operation, multiplexing can continue as long as three or more computers are operating normally.

Embodiments of the present invention will be described below with reference to the drawings.
FIG. 1 is a block diagram showing the configuration of a distributed system according to an embodiment of the present invention. In FIG. 1, the distributed system 1000 is assumed to be multiplexed across, for example, seven computers 100-1 (#1) to 100-7 (#7). In this embodiment, however, the number of computers constituting the distributed system 1000 changes dynamically according to the state of the system.

Accordingly, the number of computers constituting the distributed system 1000 at the start of operation may be denoted m, and the number of computers currently constituting it may be denoted n. Unlike in Patent Document 1, n is an integer greater than 4. In Patent Document 1, the number of computers constituting the distributed system does not change. That is, the configuration of the distributed system is never changed according to the state of the system.

Here m is an integer greater than 4 (that is, 5 or more). The number m of computers constituting the distributed system 1000 at the start of operation is therefore not limited to 7 and need only be greater than 4. Clearly, n is an integer satisfying 4 < n ≤ m. In the example of FIG. 1, m = n = 7, but in this embodiment n may change from 7 to 6 and from 6 to 5.

The computers 100-1 to 100-7 are connected to a client device 2000 via network A. They are also assumed to be connected via network A to client devices (not shown) other than the client device 2000. In this embodiment, network A is a public network. The computers 100-1 to 100-7 are connected to one another via network B, which in this embodiment is a private network.

When the computers 100-1 to 100-7 constitute the distributed system 1000, each of them processes the input packets (inputs) received from the client device 2000 via network A in the same order as the other computers, just like the computers in the distributed system described in Patent Document 1. An input packet from the client device 2000 is input to one of the computers 100-1 to 100-7.

The computers 100-1 to 100-7 each have the same application program 3 (see FIG. 2) and start from the same initial state. Thereafter, data input to the distributed system from the client device 2000 is always delivered to the computers 100-1 to 100-7 in the same order through ordered multicast. Each of the computers 100-1 to 100-7 then executes its application program 3 on that data.

Because of ordered multicast, the sequence of input data to the application program 3 is the same on each of the computers 100-1 to 100-7. Owing to the properties of deterministic programs described in Patent Document 1, the states of the computers 100-1 to 100-7 therefore remain identical, and their output data sequences are all the same. In other words, execution of the program is multiplexed.

In the distributed system described in Patent Document 1, when the number of computers making up the multiplexed configuration is n (n is an integer of 4 or more), fail-stops of up to f computers are tolerated, where f is the largest integer satisfying 3f < n (the maximum allowable number of failures). That is, in a distributed system composed of n computers, multiplexing on (n−f) or more computers is guaranteed, and the multiplexed program runs on at least (n−f) computers. The same guarantee therefore holds in this embodiment, where n is an integer greater than 4.

FIG. 2 is a block diagram showing the configuration of the computer 100-i (i = 1, 2, ..., 7). In FIG. 2, an input packet sent from the client device 2000 to the computer 100-i via network A is accepted by the input reception queue unit 1 of the computer 100-i and queued there. The ordered multicast unit 2 (specifically, the agreement unit 262 in the input packet confirmation determination unit 26, described later) delivers each queued input packet either to the application program 3 or to the configuration determination unit 210 in the ordered multicast unit 2, also described later. Besides input packets from the client device 2000, the input reception queue unit 1 also queues input packets sent from the configuration determination unit 210 (specific input packets).

The application program 3 receives a delivered input packet, processes it according to the state held in the program state management unit 4, and generates an output packet. The output packet is filtered by the output filter unit 5 and then returned to the client device 2000 via network A (output).

Next, the configuration of the ordered multicast unit 2 of the computer 100-i will be described. Like the ordered multicast unit described in Patent Document 1, the ordered multicast unit 2 has the following well-known components: an input sequence number storage unit 21, an input packet journal storage unit 22, a protocol data transmission/reception unit 23, a step number storage unit 24, a candidate packet storage unit 25, an input packet confirmation determination unit 26, a maximum confirmed input sequence number storage unit 27, a delay storage unit 28, and a skip determination unit 29.

The input sequence number storage unit 21 stores the sequence number of the input packet to be delivered next to the computer 100-i by ordered multicast (that is, the latest of the sequence numbers assigned serially by the ordered multicast). The input packet journal storage unit 22 stores a fixed amount of the sequence of input packets whose delivery to the computer 100-i by ordered multicast has been confirmed, starting from the most recent. The protocol data transmission/reception unit 23 exchanges protocol data with the protocol data transmission/reception units 23 of the other computers via network B.

In this embodiment, different networks are used for data exchange between the client device 2000 and the computer 100-i and for data exchange among the computers 100-i, which reduces the network load. However, both kinds of exchange may instead be carried out over a single network, for example network A. Nor does network A have to be a public network.

The step number storage unit 24, the candidate packet storage unit 25, and the input packet confirmation determination unit 26 are used by the algorithm that determines the input packet to be delivered next to the computer 100-i by ordered multicast. The step number storage unit 24 stores a step number indicating the step of the protocol. The candidate packet storage unit 25 stores a total of n input packets, namely the "input candidate" of each computer at that step.

The input packet confirmation determination unit 26 uses the information in the candidate packet storage unit 25 to judge whether an input packet is confirmed and to determine the "input candidate" for the next step. Unlike the input packet confirmation determination unit described in Patent Document 1, it also decides whether to pass an input packet to the application program 3 or to the configuration determination unit 210. To enable this decision, information indicating the processing type (processing type information) is attached to each input packet queued in the input reception queue unit 1. The input packet confirmation determination unit 26 includes an input candidate collection unit 261 and an agreement unit 262.

FIG. 3 shows an example of the data structure of the data (input packet data) queued in the input reception queue unit 1. As shown in FIG. 3, input packet data comprises a processing type field and an input packet field. The input packet field holds the input packet, and the processing type field holds the processing type information.

In this embodiment, the processing type information indicates the processing type that the input packet confirmation determination unit 26 (specifically, its agreement unit 262) uses to decide whether to pass the input packet in the input packet field to the application program 3 or to the configuration determination unit 210. The processing type is therefore one of two values: (1) application and (2) configuration. The processing type "application" also indicates that the input packet was input from the external client device 2000. The processing type "configuration" also indicates that the input packet was input from the configuration determination unit 210 of one of the computers constituting the distributed system 1000 to the input reception queue unit 1 of that computer.
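
The two-field input packet data of FIG. 3 and the routing decision based on the processing type might look like the sketch below. The class names, the enum values, and the string labels returned by `route` are hypothetical; the source specifies only the two fields and the two processing types.

```python
from dataclasses import dataclass
from enum import Enum


class ProcessingType(Enum):
    APPLICATION = "application"      # packet came from an external client device
    CONFIGURATION = "configuration"  # packet came from a configuration determination unit


@dataclass(frozen=True)
class InputPacketData:
    """One entry in the input reception queue: processing type plus payload."""
    processing_type: ProcessingType
    input_packet: bytes


def route(entry: InputPacketData) -> str:
    """Return a label for the component that should receive the confirmed packet."""
    if entry.processing_type is ProcessingType.APPLICATION:
        return "application_program_3"
    return "configuration_determination_unit_210"
```

Keeping the processing type alongside the payload lets both kinds of packet pass through the same agreement step and be separated only at delivery time.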

Referring again to FIG. 2, the maximum confirmed input sequence number storage unit 27 stores the largest input sequence number whose delivery is known to have been confirmed on any computer, including the other computers. The delay storage unit 28 stores (n−1) delay flags (six flags in this embodiment, where n = 7), which indicate whether the computer is lagging behind each of the other (n−1) computers. The skip determination unit 29 uses the information in the delay storage unit 28 to judge whether a skip operation is needed and performs the skip operation.

FIG. 4 shows an example of the data structure of the maximum confirmed input sequence number storage unit 27. In this example, "50000" is stored as the maximum confirmed input sequence number. The storage unit thus indicates that the largest input sequence number whose delivery is known to have been confirmed, on any computer including the others, is "50000".

In the following description, the input sequence number stored in the input sequence number storage unit 21 is called the corresponding input sequence number, and the step number stored in the step number storage unit 24 is called the corresponding step number. Of the n "input candidates" stored in the candidate packet storage unit 25 of the ordered multicast unit 2 of the computer 100-i, the one corresponding to the computer 100-i itself (the own computer) is called the own candidate, and the others are called other candidates.

In this embodiment, unlike the ordered multicast unit described in Patent Document 1, the ordered multicast unit 2 includes new components for dynamically changing the computers that constitute the distributed system 1000 according to the state of the system 1000. Specifically, as shown in FIG. 2, the ordered multicast unit 2 further includes a configuration determination unit 210, a configuration storage unit 211, a maximum sequence number delay tolerance storage unit 212, and an other-computer maximum confirmed input sequence number storage unit 213.

The configuration determination unit 210 determines the computers that constitute the distributed system 1000 according to the state of the system 1000 (more precisely, the state of the computers 100-1 to 100-7 that currently constitute it). The configuration storage unit 211 stores information indicating the computers that the configuration determination unit 210 has determined to constitute the distributed system 1000. In this embodiment, an identifier (ID) that uniquely identifies each computer is used as this information. From the information stored in the configuration storage unit 211, the configuration determination unit 210 determines the number n of computers making up the multiplexed configuration in the distributed system 1000 and the maximum allowable number of failures f derived from n. The configuration determination unit 210 includes a failure detection unit 210a that detects failures of other computers.

FIG. 5 shows an example of the data structure of the configuration storage unit 211. In this example, the IDs of computers #1 to #7 (ID = 1 to ID = 7) are stored in the configuration storage unit 211. The configuration storage unit 211 thus indicates that the distributed system 1000 consists of computers #1 to #7, that n is 7, and that f is 2.
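
Deriving n and f from the stored IDs could be sketched as follows. Representing the configuration storage as a set of integer IDs and the function name `derive_n_and_f` are assumptions made for the example.

```python
from typing import Set, Tuple


def derive_n_and_f(config_ids: Set[int]) -> Tuple[int, int]:
    """Derive (n, f) from the IDs held in the configuration storage.

    n is the number of stored IDs; f is the largest integer with 3f < n,
    computed here as (n - 1) // 3.
    """
    n = len(config_ids)
    f = (n - 1) // 3
    return n, f


# The FIG. 5 example: computers #1 to #7 are stored.
n, f = derive_n_and_f({1, 2, 3, 4, 5, 6, 7})
```

Because n and f are recomputed from the stored IDs, shrinking the stored set from seven IDs to six or five changes f from 2 to 1 without any separate bookkeeping.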

The maximum sequence number delay tolerance storage unit 212 stores a value L, the maximum sequence number delay tolerance. L is the permitted deviation (delay) of the maximum confirmed input sequence number of a computer other than the computer 100-i (another computer), as stored in the other-computer maximum confirmed input sequence number storage unit 213 described later, from the corresponding input sequence number stored in the input sequence number storage unit 21 (that is, the latest input sequence number managed by the computer 100-i). The value L is set freely, for example by the user, but must be an integer greater than 1.

FIG. 6 shows an example of the data structure of the maximum sequence number delay tolerance storage unit 212. In this example, "3000" is stored as the maximum sequence number delay tolerance L. The maximum confirmed input sequence number of another computer may then lag the corresponding input sequence number by up to "3000", measured as a difference in sequence numbers. That is, as long as the deviation does not exceed "3000", the configuration determination unit 210 judges the other computer to be normal. Conversely, once the deviation exceeds "3000", the configuration determination unit 210 judges (detects) the other computer to be abnormal.
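
The threshold test can be illustrated with a short sketch. The function name, the dictionary keyed by computer ID, and measuring the deviation as the own sequence number minus the other computer's value are assumptions; the source states only that a deviation exceeding L marks the other computer as abnormal.

```python
from typing import Dict, Set


def detect_abnormal(
    own_sequence_number: int,
    others_max_confirmed: Dict[int, int],
    tolerance: int,
) -> Set[int]:
    """Return IDs of other computers whose lag exceeds the tolerance L.

    own_sequence_number:  the corresponding input sequence number of this computer.
    others_max_confirmed: maximum confirmed input sequence number per other computer ID.
    tolerance:            the maximum sequence number delay tolerance L.
    """
    return {
        computer_id
        for computer_id, confirmed in others_max_confirmed.items()
        # A lag of exactly L is still normal; only a larger lag is abnormal.
        if own_sequence_number - confirmed > tolerance
    }
```

With L = 3000 and an own sequence number of 53001, a computer whose maximum confirmed number is 50000 lags by 3001 and is reported, while one at 50001 lags by exactly 3000 and is still treated as normal.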

In this embodiment, the distributed system 1000 is reconfigured when an abnormality of a computer 100-i (that is, a change in the state of the distributed system 1000) is detected using the maximum confirmed input sequence numbers described above (those stored in the other-computer maximum confirmed input sequence number storage unit 213).

The other-computer maximum confirmed input sequence number storage unit 213 stores the maximum confirmed input sequence number of each computer in the set constituting the distributed system 1000 other than the computer 100-i itself (each other computer).

FIG. 7 shows an example of the data structure of the other-computer maximum confirmed input sequence number storage unit 213 of the computer 100-1 (#1), that is, for the case where the computer 100-i is the computer 100-1 (#1). In this example, the maximum confirmed input sequence number (50000) of each of the computers 100-2 (#2) to 100-7 (#7) is stored in association with, for example, the ID of that computer.

Next, the protocol data transmitted and received by the protocol data transmission/reception unit 23 will be described.
FIG. 8 shows the layout of the protocol data. As shown in FIG. 8, the protocol data transmitted and received by the protocol data transmission/reception unit 23 comprises the following fields: type, sender, input sequence number, step number, maximum confirmed input sequence number, processing type, and input packet. The protocol data of FIG. 8 differs from that described in Patent Document 1 in that the processing type field described above has been added.

周知のように、プロトコルデータは先頭の種類フィールドによって、次の３つに使い分けられる。 As is well known, the protocol data is divided into the following three types depending on the type field at the head.

(1) Candidate type: the input sequence number field, step number field, input packet field, and processing type field store, respectively, the sender's (sending computer's) corresponding input sequence number at the time of transmission, its corresponding step number at that time, its own candidate, and the processing type (processing type information) attached to that candidate.

（２）確定種類：その入力順序番号（入力順序番号フィールドに格納されている入力順序番号）に対応する入力パケットが、送信者の送信時における入力パケットジャーナル記憶部２２にあることを示し、入力パケットフィールド、処理種別フィールドには、それぞれ、その入力パケット、その入力パケットに付されていた処理種別（処理種別情報）が格納される。この場合、ステップ番号フィールドは使用されない。 (2) Determined type: Indicates that the input packet corresponding to the input sequence number (the input sequence number stored in the input sequence number field) is in the input packet journal storage unit 22 at the time of transmission by the sender. The packet field and the processing type field store the input packet and the processing type (processing type information) attached to the input packet, respectively. In this case, the step number field is not used.

（３）遅延種類：その入力順序番号に対応する入力パケットが、送信者の送信時における入力パケットジャーナル記憶部２２にないことを示す。この場合、ステップ番号フィールド、入力パケットフィールドおよび処理種別フィールドは使用されない。 (3) Delay type: Indicates that there is no input packet corresponding to the input sequence number in the input packet journal storage unit 22 at the time of transmission by the sender. In this case, the step number field, the input packet field, and the process type field are not used.

In every type of protocol data, the maximum confirmed input sequence number field stores the sender's (sending computer's) corresponding maximum confirmed input sequence number at the time the protocol data is transmitted. The corresponding maximum confirmed input sequence number at a computer receiving protocol data is updated to the largest of the sequence numbers of the input packets confirmed at that receiving computer and the maximum confirmed input sequence numbers contained in the protocol data it has received.
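The field layout and the update rule above can be illustrated with a short sketch. The class and function names are hypothetical, the Python types are assumptions (the patent does not specify encodings), and folding the stored value into the `max` is an assumption made so that the number never decreases.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    CANDIDATE = "candidate"
    CONFIRMED = "confirmed"
    DELAY = "delay"


@dataclass
class ProtocolData:
    kind: Kind
    sender: int
    input_seq: int
    max_confirmed_seq: int                 # set in every type
    step: Optional[int] = None             # used by CANDIDATE only
    processing_type: Optional[str] = None  # unused by DELAY
    packet: Optional[bytes] = None         # unused by DELAY


def updated_max_confirmed(stored: int, locally_confirmed_seq: int,
                          received_max: int) -> int:
    """Largest of the stored value, a locally confirmed number, and a received one."""
    return max(stored, locally_confirmed_seq, received_max)
```

Optional fields default to `None` so that a delay-type message can be built from the four fields it actually uses.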

Next, the operation procedure of the ordered multicast unit 2 of the computer 100-i will be described with reference to the flowcharts of FIGS. 9 to 14.
In the initial state, the input sequence number storage unit 21 stores an initial input sequence number (for example, 1). The input packet journal storage unit 22 is empty, and the step number storage unit 24 stores an initial step number (for example, 1). The candidate packet storage unit 25 is empty, the maximum confirmed input sequence number storage unit 27 stores the initial input sequence number, and all the flags in the delay storage unit 28 are reset.

FIGS. 9 and 10 are flowcharts showing the operation procedure of the basic part that performs one delivery of the ordered multicast.
The input candidate collection unit 261, which is included in the input packet confirmation determination unit 26 in the ordered multicast unit 2, executes a candidate list creation process for collecting the input packets (input data) that the computers 100-1 to 100-7 (n = 7) have each selected as the candidate (input candidate) to be processed next (step A1 in FIG. 9).

In the candidate list creation process, when the corresponding step number stored in the step number storage unit 24 is the initial value (YES in step B1 in FIG. 10), the input candidate collection unit 261 determines whether an input packet (more specifically, input packet data consisting of an input packet and processing type information) exists in the input reception queue unit 1 (step B2 in FIG. 10).

もし、入力パケットが存在するならば（図１０のステップＢ２のＹＥＳ）、入力候補収集部２６１はステップ番号記憶部２４に格納されている該当ステップ番号を次に進める（図１０のステップＢ３）。そして入力候補収集部２６１は、入力パケットを自候補として候補パケット記憶部２５に格納し、かつ、この自候補が入力パケットフィールドに設定され、この自候補に付されている処理種別情報が処理種別フィールドに設定された候補種類のプロトコルデータをプロトコルデータ送受信部２３によりネットワークＢを介して他のすべてのコンピュータに送信させる（図１０のステップＢ４）。このステップＢ４において、入力候補収集部２６１は、候補パケット記憶部２５内の全ての他候補を空にする。 If there is an input packet (YES in step B2 in FIG. 10), the input candidate collection unit 261 advances the corresponding step number stored in the step number storage unit 24 (step B3 in FIG. 10). Then, the input candidate collection unit 261 stores the input packet as a self-candidate in the candidate packet storage unit 25, and the self-candidate is set in the input packet field, and the processing type information attached to the self-candidate is the processing type. The protocol data of the candidate type set in the field is transmitted to all other computers via the network B by the protocol data transmitting / receiving unit 23 (step B4 in FIG. 10). In step B4, the input candidate collection unit 261 empties all other candidates in the candidate packet storage unit 25.

On the other hand, when the corresponding step number is not the initial value (NO in step B1 in FIG. 10) or there is no input packet in the input reception queue unit 1 (NO in step B2 in FIG. 10), the input candidate collection unit 261 determines whether the protocol data transmitting/receiving unit 23 has received candidate-type protocol data whose input sequence number (the value set in its input sequence number field) matches the corresponding input sequence number stored in the input sequence number storage unit 21 (step B5 in FIG. 10). If such protocol data has been received (YES in step B5 in FIG. 10), the input candidate collection unit 261 determines whether the step number in the received protocol data (the value set in its step number field) is larger than the corresponding step number (step B6 in FIG. 10).

If the step number in the received protocol data is larger than the corresponding step number (YES in step B6 in FIG. 10), the input candidate collection unit 261 updates the corresponding step number stored in the step number storage unit 24 to the step number in the received protocol data (step B7 in FIG. 10). In step B7, if the maximum confirmed input sequence number in the received protocol data (the value set in its maximum confirmed input sequence number field) is larger than the corresponding maximum confirmed input sequence number stored in the maximum confirmed input sequence number storage unit 27, the corresponding maximum confirmed input sequence number is updated to the value in the received protocol data. In the following description, this update is referred to as the corresponding maximum confirmed input sequence number update process. Also in step B7, the maximum confirmed input sequence number stored in the other computer maximum confirmed input sequence number storage unit 213 for the sender of the received protocol data is updated to the maximum confirmed input sequence number in the received protocol data.

次に入力候補収集部２６１は、受信プロトコルデータ内の（入力パケットフィールドに設定されている）入力パケットを自候補として候補パケット記憶部２５に格納し、かつ、この自候補が入力パケットフィールドに設定され、この自候補に付されている処理種別情報が処理種別フィールドに設定された（つまり受信プロトコルデータ内の入力パケットフィールドおよび処理種別フィールドのコピーを含む）候補種類のプロトコルデータをプロトコルデータ送受信部２３によりネットワークＢを介して他のすべてのコンピュータに送信させる（図１０のステップＢ８）。このステップＢ８において入力候補収集部２６１は、受信プロトコルデータ内の入力パケット（ここでは自候補とされた入力パケット）を、当該受信プロトコルデータの（送信者フィールドの示す）送信者に対応する他候補として、候補パケット記憶部２５に格納する。つまりステップＢ８では、受信プロトコルデータ中の入力パケットが、自候補として設定されるとともに、当該受信プロトコルデータの送信者に対応する他候補としても設定される。このとき入力候補収集部２６１は、候補パケット記憶部２５内の、受信プロトコルデータの送信者に対応する他候補以外の他候補を全て破棄する（空にする）。 Next, the input candidate collection unit 261 stores the input packet (set in the input packet field) in the reception protocol data as a self-candidate in the candidate packet storage unit 25, and sets the self-candidate in the input packet field. Protocol data transmission / reception unit for transmitting the protocol data of the candidate type in which the processing type information attached to the candidate is set in the processing type field (that is, including the input packet field and the copy of the processing type field in the received protocol data) 23, it transmits to all other computers via the network B (step B8 in FIG. 10). In this step B8, the input candidate collection unit 261 uses the input packet in the reception protocol data (in this case, the input packet determined as its own candidate) as another candidate corresponding to the sender of the reception protocol data (indicated by the sender field). Is stored in the candidate packet storage unit 25. That is, in step B8, the input packet in the reception protocol data is set as its own candidate and also set as another candidate corresponding to the sender of the reception protocol data. At this time, the input candidate collection unit 261 discards (empties) all other candidates in the candidate packet storage unit 25 other than the other candidates corresponding to the sender of the received protocol data.

In contrast, if the step number in the received protocol data is equal to the corresponding step number (NO in step B6 and YES in step B9 in FIG. 10), the input candidate collection unit 261 stores the input packet in the received protocol data in the candidate packet storage unit 25 as the other candidate corresponding to the sender of the received protocol data (step B10 in FIG. 10). In step B10, if the maximum confirmed input sequence number in the received protocol data is larger than the corresponding maximum confirmed input sequence number stored in the maximum confirmed input sequence number storage unit 27, the corresponding maximum confirmed input sequence number update process is performed. Also in step B10, the maximum confirmed input sequence number stored in the other computer maximum confirmed input sequence number storage unit 213 for the sender of the received protocol data is updated to the maximum confirmed input sequence number in the received protocol data.

入力候補収集部２６１は、ステップＢ６またはステップＢ１０を実行すると、候補パケット記憶部２５に格納された候補数が（ｎ−ｆ）個以上になったかを判定する（図１０のステップＢ１１）。 When executing step B6 or step B10, the input candidate collection unit 261 determines whether the number of candidates stored in the candidate packet storage unit 25 is (n−f) or more (step B11 in FIG. 10).

入力候補収集部２６１は、候補数が（ｎ−ｆ）個以上になっていないならば（図１０のステップＢ１１のＮＯ）、ステップＢ１からの処理を再び実行する。これに対し、候補数が（ｎ−ｆ）個以上になっているならば（図１０のステップＢ１１のＹＥＳ）、入力候補収集部２６１は候補一覧作成処理を終了する。なお、ステップＢ５またはステップＢ９の判定がＮＯの場合にも、ステップＢ１からの処理が再び実行される。 If the number of candidates does not reach (n−f) or more (NO in step B11 in FIG. 10), the input candidate collection unit 261 executes the processing from step B1 again. On the other hand, if the number of candidates is (n−f) or more (YES in step B11 in FIG. 10), the input candidate collection unit 261 ends the candidate list creation process. Even when the determination in step B5 or step B9 is NO, the processing from step B1 is executed again.
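The handling of a received candidate-type message in steps B6 to B11 can be sketched as below. This is an illustrative model only: `CandidateState` and its attribute names are hypothetical, network transmission is reduced to a boolean return value, and counting the own candidate together with the other candidates in step B11 is an assumption about how the candidate packet storage unit 25 is tallied.

```python
from typing import Dict, Hashable, Optional


class CandidateState:
    """Hypothetical model of storage units 24, 25, 27 and 213 on one computer."""

    def __init__(self, n: int, f: int, step: int = 1):
        self.n, self.f = n, f
        self.step = step                        # step number storage unit 24
        self.own: Optional[Hashable] = None     # own candidate (unit 25)
        self.others: Dict[int, Hashable] = {}   # other candidates by sender (unit 25)
        self.max_confirmed = 1                  # unit 27
        self.peer_max: Dict[int, int] = {}      # unit 213

    def on_candidate(self, sender: int, step: int, packet: Hashable,
                     max_confirmed: int) -> bool:
        """Apply steps B6 to B10; return True if the own candidate must be re-sent (B8)."""
        if step < self.step:
            return False                        # B9 NO: ignore, retry from B1
        self.max_confirmed = max(self.max_confirmed, max_confirmed)
        self.peer_max[sender] = max_confirmed
        if step > self.step:                    # B6 YES
            self.step = step                    # B7
            self.own = packet                   # B8: adopt as own candidate
            self.others = {sender: packet}      # B8: keep only this sender's candidate
            return True
        self.others[sender] = packet            # B10: same step
        return False

    def enough(self) -> bool:
        """Step B11: have at least (n - f) candidates been collected?"""
        count = len(self.others) + (self.own is not None)
        return count >= self.n - self.f
```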

When the candidate list creation process (step A1 in FIG. 9) ends, that is, when the number of candidates (input candidates) stored in the candidate packet storage unit 25 (the number of non-empty candidates) has reached (n−f) or more, the agreement unit 262 in the input packet confirmation determination unit 26 functions as first input candidate selection control means. It determines whether (n−f) or more identical candidates exist in the candidate packet storage unit 25, that is, whether there is a candidate on which at least (n−f) computers have agreed (step A2 in FIG. 9). If (n−f) or more identical candidates (candidates with the same content) exist (YES in step A2 in FIG. 9), the agreement unit 262 confirms that candidate as the input packet for the corresponding input sequence number (step A3 in FIG. 9). In step A3, if the confirmed input packet exists in the input reception queue unit 1, the agreement unit 262 deletes it from the input reception queue unit 1.

このように、該当入力順序番号における入力パケットが確定すると、つまり該当入力順序番号における入力パケットに関して合意がとられて、整列マルチキャストが確定すると、合意部２６２は、次工程へ移行すべく、入力順序番号記憶部２１に格納されている該当入力順序番号を次に進め（１インクリメントし）、ステップ番号記憶部２４に格納されている該当ステップ番号を初期化する（図９のステップＡ４）。このステップＡ４において合意部２６２は、候補パケット記憶部２５に格納されているすべての候補を破棄し（空にし）、遅延記憶部２８に格納されている（ｎ−１）個の遅延フラグをすべてリセットする。 As described above, when the input packet at the corresponding input sequence number is determined, that is, when the input packet at the corresponding input sequence number is agreed and the ordered multicast is determined, the agreement unit 262 determines the input sequence to proceed to the next process. The corresponding input sequence number stored in the number storage unit 21 is advanced (incremented by 1), and the corresponding step number stored in the step number storage unit 24 is initialized (step A4 in FIG. 9). In step A4, the agreement unit 262 discards (empties) all candidates stored in the candidate packet storage unit 25, and sets all (n−1) delay flags stored in the delay storage unit 28. Reset.

Next, the agreement unit 262 functions as candidate output destination switching means and checks whether the processing type indicated by the processing type information attached to the confirmed input packet is "application" (or, alternatively, "configuration") (step A8 in FIG. 9). If it is "application" (YES in step A8 in FIG. 9), the agreement unit 262 passes the confirmed input packet to the application program 3 and stores it in the input packet journal storage unit 22 (step A9 in FIG. 9). The confirmed input packet is appended to the end of the sequence of input packets currently stored in the input packet journal storage unit 22. If that sequence has reached a fixed size, the packet at its head (that is, the oldest input packet) is discarded.

これに対し、「アプリケーション」でないならば、つまり「構成」であるならば（図９のステップＡ８のＮＯ）、合意部２６２は、確定された入力パケットを構成決定部２１０に渡すとともに、当該入力パケットを入力パケットジャーナル記憶部２２に格納する（図９のステップＡ１０）。 On the other hand, if it is not “application”, that is, if it is “configuration” (NO in step A8 in FIG. 9), the agreement unit 262 passes the determined input packet to the configuration determination unit 210 and also performs the input. The packet is stored in the input packet journal storage unit 22 (step A10 in FIG. 9).
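The output switching and the bounded journal can be sketched as follows. The names `Journal` and `deliver`, the callback parameters, and the string values "application" and "configuration" as Python literals are assumptions made for illustration; the journal capacity is not given in the text.

```python
from collections import deque
from typing import Any, Callable, Optional


class Journal:
    """Hypothetical bounded journal: the oldest entry is dropped when full."""

    def __init__(self, capacity: int):
        self._entries = deque(maxlen=capacity)

    def append(self, seq: int, packet: Any) -> None:
        self._entries.append((seq, packet))  # appended at the tail

    def get(self, seq: int) -> Optional[Any]:
        for stored_seq, packet in self._entries:
            if stored_seq == seq:
                return packet
        return None


def deliver(seq: int, packet: Any, processing_type: str, journal: Journal,
            to_application: Callable[[Any], None],
            to_configuration: Callable[[Any], None]) -> None:
    """Route a confirmed packet by processing type, then journal it (steps A8 to A10)."""
    if processing_type == "application":
        to_application(packet)
    else:  # treated as "configuration"
        to_configuration(packet)
    journal.append(seq, packet)
```

Using `deque(maxlen=...)` makes the discard of the oldest packet implicit in the append.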

On the other hand, if (n−f) or more identical candidates do not exist (NO in step A2 in FIG. 9), the agreement unit 262 functions as second input candidate selection control means and determines whether identical candidates forming a majority exist in the candidate packet storage unit 25 (step A5 in FIG. 9). If they do (YES in step A5 in FIG. 9), the agreement unit 262 selects that candidate and stores it in the candidate packet storage unit 25 as its own candidate. It then causes the protocol data transmitting/receiving unit 23 to transmit, via the network B to all other computers, candidate-type protocol data in which this own candidate is set in the input packet field and the processing type information attached to it is set in the processing type field (step A6 in FIG. 9). In step A6, the agreement unit 262 discards all other candidates stored in the candidate packet storage unit 25.

これに対し、過半数以上の同一の候補が存在しないならば（図９のステップＡ５のＮＯ）、合意部２６２は第３の入力候補選定制御手段として機能して、候補パケット記憶部２５に格納されている入力候補の中からランダムに候補（入力パケット）を自候補として選択し、かつ、この自候補が入力パケットフィールドに設定され、この自候補に付されている処理種別情報が処理種別フィールドに設定された候補種類のプロトコルデータをプロトコルデータ送受信部２３によりネットワークＢを介して他のすべてのコンピュータに送信させる（図９のステップＡ７）。このステップＡ７において合意部２６２は、候補パケット記憶部２５に格納されているすべての他候補を破棄する。 On the other hand, if more than a majority of the same candidates do not exist (NO in step A5 in FIG. 9), the agreement unit 262 functions as a third input candidate selection control unit and is stored in the candidate packet storage unit 25. A candidate (input packet) is randomly selected from among the input candidates, and the self candidate is set in the input packet field, and the processing type information attached to the self candidate is displayed in the processing type field. The set candidate type protocol data is transmitted to all other computers via the network B by the protocol data transmission / reception unit 23 (step A7 in FIG. 9). In step A7, the agreement unit 262 discards all other candidates stored in the candidate packet storage unit 25.

合意部２６２は、ステップＡ６（第２の入力候補選定制御手段としての動作）またはステップＡ７（第３の入力候補選定制御手段としての動作）が終了すると、ステップＡ１からの処理を再び実行する。一方、ステップＡ９またはステップＡ１０（候補出力先切り替え手段としての動作）が終了すると、整列マルチキャストの１回の配送処理を終了する。 When step A6 (operation as the second input candidate selection control unit) or step A7 (operation as the third input candidate selection control unit) is completed, the agreement unit 262 executes the processing from step A1 again. On the other hand, when step A9 or step A10 (operation as candidate output destination switching means) is completed, one delivery process of the ordered multicast is ended.
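The three-way decision in steps A2, A5, and A7 can be sketched as below. The function names and return labels are hypothetical. Two assumptions are made and should be checked against the full specification: f is computed as the largest integer with 3f < n, and "majority" is read as more than n/2 identical candidates.

```python
import random
from collections import Counter
from typing import Hashable, List, Tuple


def max_faulty(n: int) -> int:
    """Largest integer f satisfying 3f < n."""
    return (n - 1) // 3


def decide(candidates: List[Hashable], n: int,
           rng: random.Random = random.Random()) -> Tuple[str, Hashable]:
    """Return ("confirm", p), ("adopt_majority", p) or ("adopt_random", p)."""
    f = max_faulty(n)
    packet, count = Counter(candidates).most_common(1)[0]
    if count >= n - f:       # step A2 YES: agreement reached, confirm (A3)
        return "confirm", packet
    if count > n / 2:        # step A5 YES: adopt the majority candidate (A6)
        return "adopt_majority", packet
    return "adopt_random", rng.choice(candidates)  # step A7
```

For n = 7 this gives f = 2, so five identical candidates confirm a packet, while four identical candidates only cause the computer to adopt that packet and start another round.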

以上の手順で、各コンピュータ１００-iは、（ｎ−ｆ）台以上のコンピュータでの入力パケットの一致を確認しながら処理を進めていく。 With the above procedure, each computer 100-i proceeds with processing while confirming the match of input packets in (n−f) or more computers.

Next, the operation procedure for eliminating a delay in multiplexed execution will be described.
FIGS. 11 to 14 are flowcharts showing the operation procedure for eliminating a delay in multiplexed execution. FIGS. 11 to 14 correspond to FIGS. 7 to 10 of Patent Document 1.

整列マルチキャスト部２内の合意部２６２は、該当入力順序番号より小さい入力順序番号を持つ候補種類のプロトコルデータがプロトコルデータ送受信部２３によって受信された場合に、その入力順序番号に対応する入力パケットが入力パケットジャーナル記憶部２２に存在するかを判定する（図１１のステップＣ１）。 When the protocol data transmitter / receiver 23 receives candidate type protocol data having an input sequence number smaller than the corresponding input sequence number, the agreement unit 262 in the ordered multicast unit 2 receives an input packet corresponding to the input sequence number. It is determined whether or not the packet exists in the input packet journal storage unit 22 (step C1 in FIG. 11).

If an input packet corresponding to the input sequence number smaller than the corresponding input sequence number exists in the input packet journal storage unit 22 (YES in step C1 in FIG. 11), that is, if the delay in multiplexed execution is short enough that the packet is still retained in the input packet journal storage unit 22, the agreement unit 262 causes the protocol data transmitting/receiving unit 23 to return confirmed-type protocol data via the network B to the sender of the received protocol data. In this protocol data, that input packet is set in the input packet field and the processing type information attached to that input packet is set in the processing type field (step C2 in FIG. 11).

On the other hand, if no input packet corresponding to the input sequence number smaller than the corresponding input sequence number exists in the input packet journal storage unit 22 (NO in step C1 in FIG. 11), that is, if the delay in multiplexed execution is so long that the packet has already been discarded from the input packet journal storage unit 22, the agreement unit 262 causes the protocol data transmitting/receiving unit 23 to return delay-type protocol data via the network B to the sender of the received protocol data (step C3 in FIG. 11).

また、ステップＣ２およびＣ３では、受信プロトコルデータ内の最大確定入力順序番号が、最大確定入力順序番号記憶部２７に格納されている該当最大確定入力順序番号よりも大きいならば、該当最大確定入力順序番号更新処理が行われる。またステップＣ２およびＣ３では、受信プロトコルデータの送信者に対応して他コンピュータ最大確定入力順序番号記憶部２１３に格納されている最大確定入力順序番号が、受信プロトコルデータ内の最大確定入力順序番号に更新される。 In steps C2 and C3, if the maximum confirmed input sequence number in the reception protocol data is greater than the corresponding maximum confirmed input sequence number stored in the maximum confirmed input sequence number storage unit 27, the corresponding maximum confirmed input sequence number. Number update processing is performed. In steps C2 and C3, the maximum determined input sequence number stored in the other computer maximum determined input sequence number storage unit 213 corresponding to the sender of the received protocol data is changed to the maximum determined input sequence number in the received protocol data. Updated.
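The choice between a confirmed-type and a delay-type reply in steps C1 to C3 can be sketched as follows. The journal is modelled here as a plain dictionary from input sequence number to a (packet, processing type) pair, which is an assumption for illustration; `reply_for_stale_candidate` is a hypothetical name.

```python
from typing import Any, Dict, Optional, Tuple

JournalEntry = Tuple[Any, str]  # (input packet, processing type)


def reply_for_stale_candidate(
    journal: Dict[int, JournalEntry], requested_seq: int
) -> Tuple[str, Optional[Any], Optional[str]]:
    """Return (kind, packet, processing_type) for the reply to a lagging sender."""
    entry = journal.get(requested_seq)   # step C1
    if entry is not None:
        packet, processing_type = entry
        return "confirmed", packet, processing_type   # step C2
    return "delay", None, None                        # step C3
```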

When the protocol data transmitting/receiving unit 23 receives confirmed-type protocol data whose input sequence number matches the corresponding input sequence number, the agreement unit 262 confirms the input packet in the received protocol data as the input packet (step D1 in FIG. 12). In step D1, if the confirmed input packet exists in the input reception queue unit 1, the agreement unit 262 deletes it from the input reception queue unit 1. Also in step D1, if the maximum confirmed input sequence number in the received protocol data is larger than the corresponding maximum confirmed input sequence number stored in the maximum confirmed input sequence number storage unit 27, the corresponding maximum confirmed input sequence number update process is performed. Furthermore, in step D1, the maximum confirmed input sequence number stored in the other computer maximum confirmed input sequence number storage unit 213 for the sender of the received protocol data is updated to the maximum confirmed input sequence number in the received protocol data.

Next, in order to move on to the next process, the agreement unit 262 advances the corresponding input sequence number stored in the input sequence number storage unit 21 (increments it by 1) and initializes the corresponding step number stored in the step number storage unit 24 (step D2 in FIG. 12). In step D2, the agreement unit 262 discards all candidates stored in the candidate packet storage unit 25 and resets all (n−1) delay flags stored in the delay storage unit 28.

次に合意部２６２は、候補出力先切り替え手段として機能して、確定された入力パケットに付されている処理種別情報の示す処理種別が「アプリケーション」であるかを判定する（図１２のステップＤ３）。「アプリケーション」であるならば（図１２のステップＤ３のＹＥＳ）、合意部２６２は、確定された入力パケットをアプリケーションプログラム３に渡すとともに、当該入力パケットを入力パケットジャーナル記憶部２２に格納する（図１２のステップＤ４）。 Next, the agreement unit 262 functions as a candidate output destination switching unit, and determines whether the process type indicated by the process type information attached to the confirmed input packet is “application” (step D3 in FIG. 12). ). If it is “application” (YES in step D3 in FIG. 12), the agreement unit 262 passes the determined input packet to the application program 3 and stores the input packet in the input packet journal storage unit 22 (FIG. 12). 12 step D4).

これに対し、「アプリケーション」でないならば（図１２のステップＤ３のＮＯ）、合意部２６２は、確定された入力パケットを構成決定部２１０に渡すとともに、当該入力パケットを入力パケットジャーナル記憶部２２に格納する（図１２のステップＤ５）。 On the other hand, if it is not “application” (NO in step D3 in FIG. 12), the agreement unit 262 passes the determined input packet to the configuration determination unit 210 and also sends the input packet to the input packet journal storage unit 22. Store (step D5 in FIG. 12).

Meanwhile, when the protocol data transmitting/receiving unit 23 receives delay-type protocol data whose input sequence number matches the corresponding input sequence number, the skip determination unit 29 in the ordered multicast unit 2 sets, among the (n−1) delay flags stored in the delay storage unit 28, the delay flag corresponding to the sender of that protocol data (step E1 in FIG. 13). In step E1, if the maximum confirmed input sequence number in the received protocol data is larger than the corresponding maximum confirmed input sequence number stored in the maximum confirmed input sequence number storage unit 27, the corresponding maximum confirmed input sequence number update process is performed. Also in step E1, the maximum confirmed input sequence number stored in the other computer maximum confirmed input sequence number storage unit 213 for the sender of the received protocol data is updated to the maximum confirmed input sequence number in the received protocol data.

また、スキップ判定部２９は、遅延記憶部２８に格納されている（ｎ−１）個の遅延フラグのうちのセットされた遅延フラグの数と、候補パケット記憶部２５に格納されている入力候補の数との和が、（ｎ−ｆ）個以上に達したかどうかを監視する（図１４のステップＦ１）。もし、（ｎ−ｆ）個以上に達しているならば（図１４のステップＦ１のＹＥＳ）、スキップ判定部２９は、候補パケット記憶部２５に格納されている入力候補の数が（ｎ−ｆ）個未満かを判定する（図１４のステップＦ２）。もし、（ｎ−ｆ）個未満であるならば（図１４のステップＦ２のＹＥＳ）、スキップ判定部２９はスキップ動作を行う（図１４のステップＦ３）。即ちスキップ判定部２９は、入力順序番号記憶部２１に格納されている該当入力順序番号を最大確定入力順序番号記憶部２７に格納されている該当最大確定入力順序番号に更新し、ステップ番号記憶部２４に格納されている該当ステップ番号を初期ステップ番号にする。またスキップ判定部２９は、候補パケット記憶部２５を空にし、遅延記憶部２８のすべての遅延フラグをリセットした上で、プログラム状態管理部４にスキップを通知する。プログラム状態管理部４は、スキップが通知されると、該当入力順序番号の直前の状態を他のコンピュータのプログラム状態管理部４からコピーする。このために、プログラム状態管理部４は、各入力順序番号の直前の状態を最近のものから一定の量だけ保持している。 The skip determination unit 29 also sets the number of set delay flags among the (n−1) delay flags stored in the delay storage unit 28 and the input candidates stored in the candidate packet storage unit 25. It is monitored whether or not the sum with the number has reached (n−f) or more (step F1 in FIG. 14). If the number reaches (n−f) or more (YES in step F1 in FIG. 14), the skip determination unit 29 determines that the number of input candidates stored in the candidate packet storage unit 25 is (n−f). ) It is determined whether it is less (step F2 in FIG. 14). If the number is less than (n−f) (YES in step F2 in FIG. 14), the skip determination unit 29 performs a skip operation (step F3 in FIG. 14). That is, the skip determination unit 29 updates the corresponding input sequence number stored in the input sequence number storage unit 21 to the corresponding maximum determined input sequence number stored in the maximum determined input sequence number storage unit 27, and the step number storage unit The corresponding step number stored in 24 is set as the initial step number. The skip determination unit 29 empties the candidate packet storage unit 25, resets all delay flags in the delay storage unit 28, and notifies the program state management unit 4 of the skip. 
When the skip is notified, the program state management unit 4 copies the state immediately before the corresponding input sequence number from the program state management unit 4 of another computer. For this reason, the program state management unit 4 holds the state immediately before each input sequence number by a certain amount from the latest one.
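The skip condition checked in steps F1 and F2 can be sketched as a single predicate. The function and parameter names are hypothetical; the sketch covers only the decision, not the state copy performed by the program state management unit 4.

```python
def should_skip(delay_flags_set: int, candidates_held: int, n: int, f: int) -> bool:
    """Steps F1 and F2: skip when replies total (n - f) but candidates alone do not."""
    threshold = n - f
    enough_replies = delay_flags_set + candidates_held >= threshold  # step F1
    too_few_candidates = candidates_held < threshold                 # step F2
    return enough_replies and too_few_candidates
```

With n = 7 and f = 2, three delay flags plus two candidates trigger a skip, whereas five candidates on their own do not, since the normal agreement path can then proceed.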

以上の手順で、各コンピュータ１００-iは、スプリットブレインを起こさないよう、遅延された実行が追い付く仕組みを実現する。 With the above procedure, each computer 100-i realizes a mechanism in which delayed execution catches up so as not to cause split brain.

Next, the operation procedure for determining the configuration of the distributed system 1000 (the configuration determination process), which is executed by the configuration determination unit 210 of each computer 100-i, will be described with reference to the flowchart of FIG. 15. In this embodiment, the configuration determination process is executed at fixed intervals (that is, periodically), but it may instead be executed at random timing.

First, based on the information stored in the configuration storage unit 211, the configuration determination unit 210 determines whether the number n of computers currently constituting the distributed system 1000 exceeds 4 (step G1 in FIG. 15). If n exceeds 4 (that is, n is 5 or more) (YES in step G1 in FIG. 15), the configuration determination unit 210 uses the failure detection unit 210a to execute the following processing (step G2 in FIG. 15) for every computer (other computer) 100-j (here, j is 1 to 7, excluding i) among the n computers currently constituting the distributed system 1000 (here, the computers 100-1 to 100-7), other than the computer 100-i itself (the computer 100-i that contains this configuration storage unit 211).

First, the failure detection unit 210a obtains the difference between the input sequence number stored in the input sequence number storage unit 21 and the maximum confirmed input sequence number of the computer 100-j currently under examination (the target computer 100-j), which is stored in the other-computer maximum confirmed input sequence number storage unit 213. It then determines whether this difference is larger than the maximum sequence number delay allowable value L stored in the maximum sequence number delay allowable value storage unit 212 (step G3 in FIG. 15). This difference indicates how far the target computer 100-j lags in processing the input packet sequence, measured relative to the input sequence number of the computer 100-i.

If the difference is larger than the maximum sequence number delay allowable value L (YES in step G3 in FIG. 15), the failure detection unit 210a determines that the target computer 100-j is abnormal. In other words, the failure detection unit 210a detects an abnormality (failure) in the target computer 100-j when the difference exceeds the maximum sequence number delay allowable value L. In this case, the configuration determination unit 210 determines that the configuration of the distributed system 1000 needs to be changed; more specifically, that the target computer 100-j (the computer 100-j in which the abnormality was detected) needs to be removed (detached) from the distributed system 1000 (that is, from the set of computers constituting it).
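The lag test of step G3 can be illustrated with a short Python sketch. The function name `detect_lagging` and the dictionary layout are assumptions made for illustration; the specification defines only the comparison of the sequence number difference against L.

```python
def detect_lagging(own_input_seq: int,
                   others_max_confirmed: dict,
                   max_lag: int) -> list:
    """Return the IDs of other computers whose lag exceeds max_lag (L).

    own_input_seq        -- input sequence number of this computer (unit 21)
    others_max_confirmed -- maps each other computer's ID to its maximum
                            confirmed input sequence number (unit 213)
    max_lag              -- maximum sequence number delay allowable value L
    """
    return sorted(
        computer_id
        for computer_id, confirmed in others_max_confirmed.items()
        if own_input_seq - confirmed > max_lag  # step G3: strictly greater than L
    )


if __name__ == "__main__":
    others = {"100-2": 118, "100-3": 95, "100-4": 110}
    print(detect_lagging(120, others, max_lag=10))  # prints ['100-3']
```

In this example the computer "100-4" lags by exactly L and is therefore not reported, because step G3 requires the difference to be larger than L.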

The configuration determination unit 210 therefore uses the ordered multicast unit 2 (specifically, the agreement unit 262 included in the input packet confirmation determination unit 26) to carry out processing for reaching agreement on removing (detaching) the target computer 100-j (the computer 100-j in which the abnormality was detected) from the distributed system 1000, that is, agreement that the configuration of the distributed system 1000 should be changed (step G4 in FIG. 15). In other words, the configuration determination unit 210 uses the ordered multicast unit 2 so that agreement on removing the target computer 100-j from the distributed system 1000 is reached in the same way as in ordered multicast of input packets to be delivered to the application program 3. As in ordered multicast, whether to remove the target computer 100-j from the distributed system 1000 is decided by agreement among (n−f) or more computers.

In this embodiment, removing the target computer 100-j from the distributed system 1000 means deleting the information (ID) indicating the computer 100-j from the configuration storage unit 211, as described later. That is, the computer 100-j is logically removed from the set of computers constituting the distributed system 1000; it is not physically detached from the distributed system 1000.

In step G4, the configuration determination unit 210 sends to the input reception queue unit 1 a specific packet requesting that the target computer 100-j be removed from the distributed system 1000. Processing type information indicating that the processing type is "configuration" is attached to this specific packet. The specific packet sent from the configuration determination unit 210 is accepted by the input reception queue unit 1 and queued there.

As is clear from the operation of the ordered multicast unit 2 (the agreement unit 262 therein) described with reference to the flowcharts of FIGS. 9 and 10, the specific packet queued in the input reception queue unit 1 (hereinafter referred to as the specific input packet) becomes the subject of ordered multicast by the ordered multicast units 2 of the n (n = 7) computers (the computers 100-1 to 100-7), including the target computer 100-j.

If agreement on the specific input packet is reached among (n−f) or more of the n (n = 7) computers (the computers 100-1 to 100-7) (steps A2 and A3 in FIG. 9), then in the ordered multicast unit 2 of each of those (n−f) or more computers, the specific input packet is passed from the agreement unit 262 to the configuration determination unit 210 and is also stored in the input packet journal storage unit 22 (step A10 in FIG. 9).

When the specific input packet is passed to it from the agreement unit 262, the configuration determination unit 210 determines that agreement to remove the target computer 100-j from the distributed system 1000 has been reached among (n−f) or more computers (YES in step G5 in FIG. 15). In this case, the configuration determination unit 210 removes the target computer from the distributed system 1000 by deleting the information (ID) indicating the target computer 100-j from the configuration storage unit 211 (step G6 in FIG. 15).
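Steps G4 to G6 can be sketched in Python as follows. The sketch is a simplification under stated assumptions: the ordered multicast agreement itself is abstracted into a set of computer IDs that agreed on the specific input packet, and the function name `apply_removal` is hypothetical.

```python
def max_faults(n: int) -> int:
    """Largest integer f satisfying 3f < n."""
    return (n - 1) // 3


def apply_removal(members: set, target: str, agreeing: set) -> set:
    """Return the membership after trying to remove `target`.

    members  -- IDs held in the configuration storage unit 211
    target   -- ID of the computer whose removal was requested (step G4)
    agreeing -- IDs of computers that agreed on the specific input packet
                (stands in for the ordered multicast agreement)
    """
    n = len(members)
    quorum = n - max_faults(n)
    votes = agreeing & members          # only current members count
    if target in members and len(votes) >= quorum:
        return members - {target}       # step G6: delete the ID (logical removal)
    return set(members)                 # step G5 NO: configuration unchanged
```

With seven members, removal needs agreement among five computers; four agreeing computers leave the configuration unchanged.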

The configuration determination unit 210 (of the computer 100-i) repeats the processing from step G3 onward for each of the (n−1) other computers, that is, the n computers currently constituting the distributed system 1000 other than the computer 100-i (step G2 in FIG. 15).

According to this embodiment, the processing of the configuration determination unit 210 described above makes it possible to scale down n and f. This scale-down of n and f is described below.

First, in this embodiment, in which the distributed system 1000 consists of seven computers 100-1 to 100-7 (n = 7, f = 2) at the start of operation, suppose that a failure of a certain computer (hereinafter referred to as the first computer) is detected and that agreement to remove the first computer from the distributed system 1000 is reached among (n−f) computers (here, five computers). In this case, the configuration of the distributed system 1000 is changed so that it consists of six computers (n = 6, f = 1).

Next, in the distributed system 1000 consisting of six computers (n = 6, f = 1), suppose that a failure of a certain computer (hereinafter referred to as the second computer) is detected and that agreement to remove the second computer from the distributed system 1000 is reached among (n−f) computers (here, five computers). In this case, the configuration of the distributed system 1000 is changed so that it consists of five computers (n = 5, f = 1).

Next, in the distributed system 1000 consisting of five computers (n = 5, f = 1), suppose that a failure of a certain computer (hereinafter referred to as the third computer) is detected and that agreement to remove the third computer from the distributed system 1000 is reached among (n−f) computers (here, four computers). In this case, the configuration of the distributed system 1000 is changed so that it consists of four computers (n = 4, f = 1).

Thus, according to this embodiment, the number of computers making up the multiplexing in the distributed system 1000 can be reduced from seven (n = 7) to four (n = 4); that is, n and f can be scaled down from 7 and 2 to 4 and 1. In this state f = 1 holds, so multiplexing can continue as long as three or more of the four (n = 4) computers operate normally. In other words, even if the number of failed computers reaches (f+1) or more (here, 3), where f is the value at the start of operation (here, 2), multiplexing can continue provided that at least three computers operate normally. Split brain cannot occur in principle, and processing is not interrupted by a timeout when a failure occurs.
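The scale-down arithmetic can be checked with a few lines of Python. The function names are hypothetical; the only rule taken from the specification is that f is the largest integer satisfying 3f < n and that agreement requires (n−f) computers.

```python
def max_faults(n: int) -> int:
    """Largest integer f satisfying 3f < n."""
    return (n - 1) // 3


def scale_down_trace(n_start: int, n_min: int = 4) -> list:
    """List (n, f, n - f) for each configuration from n_start down to n_min."""
    trace = []
    for n in range(n_start, n_min - 1, -1):
        f = max_faults(n)
        trace.append((n, f, n - f))
    return trace


if __name__ == "__main__":
    for n, f, quorum in scale_down_trace(7):
        print(f"n={n} f={f} agreement needs {quorum} computers")
    # n=7 f=2 agreement needs 5 computers
    # n=6 f=1 agreement needs 5 computers
    # n=5 f=1 agreement needs 4 computers
    # n=4 f=1 agreement needs 3 computers
```

The trace stops at n = 4 because the configuration determination process proceeds past step G1 only while n exceeds 4.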

Now suppose that the abnormality of the computer 100-j removed from the distributed system 1000 (a delay exceeding the maximum sequence number delay allowable value L) was caused by a slowdown due to a temporary high load on that computer. In this case, the computer 100-j may recover from the slowdown and attempt to operate as one of the n computers on the basis of the information in its configuration storage unit 211.

However, the contents of the configuration storage unit 211 of the computer 100-j are not those of the configuration storage units 211 of the computers currently constituting the distributed system 1000, but the contents as of the time the computer 100-j was removed (detached) from the distributed system 1000. Let n_old and f_old denote n and f at the time the computer 100-j was detached from the distributed system 1000, and let n_new and f_new denote the current n and f. The computer 100-j would then have to reach agreement among (n_old − f_old) computers rather than (n_new − f_new) computers. However, n_new of the n_old computers now constitute the current distributed system 1000. The computer 100-j therefore cannot reach agreement among (n_old − f_old) computers including itself, and so it does not adversely affect the operation of the n_new computers constituting the current distributed system 1000.

The present invention is not limited to the above embodiment as it stands; at the implementation stage, the constituent elements can be modified and embodied without departing from the gist of the invention. For example, in the above embodiment, the difference between the input sequence number and the maximum confirmed input sequence number of the computer 100-j is used to detect an abnormality (failure) of the computer 100-j. However, other techniques can also be used to detect an abnormality of the computer 100-j. For example, it is possible to use a heartbeat timeout algorithm such as that described as prior art in Patent Document 1, that is, a technique that judges a computer to be abnormal (failed) when the heartbeat that each computer sends periodically cannot be confirmed for a certain period or longer. With this technique, as pointed out in Patent Document 1 as a problem of the prior art, a failure may be falsely detected, leading to split brain. However, when a failure of, for example, the computer 100-j is detected by the heartbeat timeout algorithm, split brain can be prevented in principle by confirming that (n−f) or more computers have made the same failure detection, that is, by reaching agreement on the detection of the failure of the computer 100-j among (n−f) or more computers.
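The heartbeat variant can be sketched in Python as two steps: local suspicion by timeout, then confirmation that (n−f) or more computers hold the same suspicion. The function names and data layout are assumptions for illustration, and the exchange of suspicions between computers is abstracted into a dictionary rather than carried over ordered multicast.

```python
from collections import Counter


def max_faults(n: int) -> int:
    """Largest integer f satisfying 3f < n."""
    return (n - 1) // 3


def timed_out(last_heartbeat: dict, now: float, timeout: float) -> set:
    """Computers whose last heartbeat is older than `timeout` (local suspicion)."""
    return {cid for cid, seen in last_heartbeat.items() if now - seen > timeout}


def agreed_failures(members: set, suspicions: dict) -> set:
    """Computers suspected by (n - f) or more members.

    suspicions maps an observer's ID to the set of IDs it suspects.
    """
    quorum = len(members) - max_faults(len(members))
    counts = Counter()
    for observer, suspects in suspicions.items():
        if observer in members:              # ignore non-members' reports
            counts.update(suspects & members)
    return {cid for cid, count in counts.items() if count >= quorum}
```

A suspicion held by only a minority of computers, such as one caused by a local network delay, does not reach the (n−f) threshold and therefore does not change the configuration.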

In addition, various inventions can be formed by appropriately combining the constituent elements disclosed in the above embodiment. For example, some constituent elements may be deleted from the full set of constituent elements shown in the embodiment.

FIG. 1 is a block diagram showing the configuration of a distributed system according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the functional configuration of a computer constituting the distributed system of the embodiment.
FIG. 3 is a diagram showing an example of the data structure of the data (input packets) queued in the input reception queue unit of the embodiment.
FIG. 4 is a diagram showing an example of the data structure of the maximum confirmed input sequence number storage unit in the embodiment.
FIG. 5 is a diagram showing an example of the data structure of the configuration storage unit in the embodiment.
FIG. 6 is a diagram showing an example of the data structure of the maximum sequence number delay allowable value storage unit in the embodiment.
FIG. 7 is a diagram showing an example of the data structure of the other-computer maximum confirmed input sequence number storage unit in the embodiment.
FIG. 8 is a diagram showing the layout of the protocol data exchanged between the computers constituting the distributed system of the embodiment.
FIG. 9 is a first flowchart showing the operation procedure of the basic part that performs one delivery of the ordered multicast executed by each computer in the embodiment.
FIG. 10 is a second flowchart showing the operation procedure of the basic part that performs one delivery of the ordered multicast executed by each computer in the embodiment.
FIG. 11 is a first flowchart showing the operation procedure, executed by each computer in the embodiment, for eliminating a delay in multiplexed execution.
FIG. 12 is a second flowchart showing the operation procedure, executed by each computer in the embodiment, for eliminating a delay in multiplexed execution.
FIG. 13 is a third flowchart showing the operation procedure, executed by each computer in the embodiment, for eliminating a delay in multiplexed execution.
FIG. 14 is a fourth flowchart showing the operation procedure, executed by each computer in the embodiment, for eliminating a delay in multiplexed execution.
FIG. 15 is a flowchart showing the operation procedure, executed by each computer in the embodiment, for determining the configuration of the distributed system.

Explanation of symbols

1 … input reception queue unit, 2 … ordered multicast unit, 3 … application program, 4 … program state management unit, 5 … output filter unit, 21 … input sequence number storage unit, 22 … input packet journal storage unit, 23 … protocol data transmission/reception unit, 24 … step number storage unit, 25 … candidate packet storage unit, 26 … input packet confirmation determination unit, 27 … maximum confirmed input sequence number storage unit, 28 … delay storage unit, 29 … skip determination unit, 100-1 to 100-7, 100-i … computer, 210 … configuration determination unit, 210a … failure detection unit, 211 … configuration storage unit, 212 … maximum sequence number delay allowable value storage unit, 213 … other-computer maximum confirmed input sequence number storage unit, 261 … input candidate collection unit, 262 … agreement unit, 1000 … distributed system, 2000 … client device, A … network, B … network.

Claims

A distributed system that synchronously operates n computers (n is an integer satisfying 4 < n ≦ m at the start of operation) among m computers (m is an integer greater than 4) connected via a network, wherein each of the m computers comprises:
configuration storage means for storing information identifying the n computers to be operated synchronously as information identifying the computers constituting the distributed system;
agreement means for performing ordered multicast of input data by reaching agreement on the input data among (n−f) or more computers, in order to synchronously operate the n computers constituting the distributed system identified by the information stored in the configuration storage means and to guarantee multiplexing on (n−f) or more of the n computers (f is the maximum allowable number of failures, which is the largest integer satisfying 3f < n);
failure detection means for detecting a computer in which a failure has occurred from the states of the n computers constituting the distributed system identified by the information stored in the configuration storage means; and
configuration determining means for determining, when the failure detection means detects a computer in which a failure has occurred, that a matter requiring a change in the configuration of the distributed system has arisen, deciding whether to change the configuration of the distributed system according to whether (n−f) or more computers have determined that such a matter has arisen, and, when it is decided to change the configuration of the distributed system, updating the information stored in the configuration storage means so as to indicate the configuration of the distributed system after the change and determining, based on the number n of computers constituting the distributed system after the change, the maximum allowable number of failures f appropriate to the distributed system after the change.

The distributed system according to claim 1, wherein the configuration determining means, when it determines that a matter requiring a change in the configuration of the distributed system has arisen, supplies specific input data indicating this to the agreement means and causes the specific input data to be ordered-multicast, thereby determining whether (n−f) or more computers have determined that such a matter has arisen, and, when it is decided to change the configuration of the distributed system, updates the information stored in the configuration storage means so that the computer in which the failure was detected is detached from the distributed system.

The distributed system according to claim 1, wherein:
the agreement means manages each ordered multicast subject to agreement by a sequence number assigned serially to the ordered multicast;
each of the m computers further comprises other-computer maximum sequence number storage means for storing, for each other computer, the maximum sequence number of the ordered multicasts on which that computer has agreed; and
the failure detection means detects the occurrence of a failure in another computer according to whether the difference between the sequence number of the latest ordered multicast managed by the agreement means and the maximum ordered multicast sequence number stored for that other computer in the other-computer maximum sequence number storage means exceeds a threshold.

A multiplexing control method applied to a distributed system in which m computers (m is an integer greater than 4) are connected via a network, the method synchronously operating n of the m computers (n is an integer satisfying 4 < n ≦ m at the start of operation) and guaranteeing multiplexing on (n−f) or more computers (f is the largest integer satisfying 3f < n), wherein:
each of the m computers comprises configuration storage means for storing information identifying the n computers to be operated synchronously as information identifying the computers constituting the distributed system, agreement means, failure detection means, and configuration determining means; and
the multiplexing control method comprises:
a step in which the agreement means of each of the m computers performs ordered multicast of input data by reaching agreement on the input data among (n−f) or more computers, in order to synchronously operate the n computers constituting the distributed system identified by the information stored in the configuration storage means and to guarantee multiplexing on (n−f) or more of the n computers (f is the maximum allowable number of failures, which is the largest integer satisfying 3f < n);
a step in which the failure detection means detects a computer in which a failure has occurred from the states of the n computers constituting the distributed system identified by the information stored in the configuration storage means;
a step in which, when a computer in which a failure has occurred is detected, the configuration determining means determines that a matter requiring a change in the configuration of the distributed system has arisen and decides whether to change the configuration of the distributed system according to whether (n−f) or more computers have determined that such a matter has arisen; and
a step in which, when it is decided to change the configuration of the distributed system, the configuration determining means determines, based on the number n of computers constituting the distributed system after the change, the maximum allowable number of failures f appropriate to the distributed system after the change, and updates the information stored in the configuration storage means so as to indicate the configuration of the distributed system after the change.

The multiplexing control method according to claim 4, wherein:
each of the m computers further comprises other-computer maximum sequence number storage means for storing, for each other computer, the maximum sequence number of the ordered multicasts on which that computer has agreed;
the agreement means manages each ordered multicast subject to agreement by a sequence number assigned serially to the ordered multicast; and
the failure detection means detects the occurrence of a failure in another computer according to whether the difference between the sequence number of the latest ordered multicast managed by the agreement means and the maximum ordered multicast sequence number stored for that other computer in the other-computer maximum sequence number storage means exceeds a threshold.