JP6535304B2 - Distributed synchronous processing system and distributed synchronous processing method

Info

Publication number: JP6535304B2
Application number: JP2016166182A
Authority: JP
Inventors: 小林 弘明; 北野 雄大; 岡本 光浩; 福元 健; 米森 力; 堤田 恭太; 矢実 貴志; 大谷 智洋; 南 司
Original assignee: NTT Data Corp; Nippon Telegraph and Telephone Corp
Current assignee: NTT Data Corp; Nippon Telegraph and Telephone Corp
Priority date: 2016-08-26
Filing date: 2016-08-26
Publication date: 2019-06-26
Anticipated expiration: 2036-08-26
Also published as: JP2018032344A

Description

The present invention relates to a distributed synchronous processing system and a distributed synchronous processing method that execute processing while synchronizing a plurality of servers arranged in a distributed manner.

Non-Patent Document 1 discloses MapReduce as a framework for distributed processing systems in which a plurality of servers are distributed over a network. However, MapReduce must read input data from an external data store and write its results back every time a process runs, so it is poorly suited to iterative processing in which the result of one process is used by the next. BSP (Bulk Synchronous Parallel), disclosed in Non-Patent Document 2, is suited to this kind of processing.

BSP executes data processing in a distributed environment by repeatedly executing a unit of processing called a "superstep" (SS). FIG. 1 is a diagram for explaining the BSP computation model.

As shown in FIG. 1, one superstep consists of the following three phases (PH): "local computation" (LC; phase PH1), "data exchange" (Com: communication; phase PH2), and "synchronization" (Sync; phase PH3).
Specifically, when any one of a plurality of nodes (node 1 to node 4) receives data, that node (for example, node 1) executes computation on that data (local computation (LC)) in phase PH1. Next, in phase PH2, the nodes exchange the data that each node holds as the result of its local computation. Then, in phase PH3, synchronization is performed; more precisely, the nodes wait until data exchange among all nodes has finished.
When this series of superstep processing (PH1 to PH3) finishes as superstep SS1, each node retains its computation result and proceeds to the next series of processing, superstep SS2.
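The three phases described above can be sketched as a small threaded program. This is a minimal illustration only, not the implementation of any framework cited here: the function name `run_bsp`, the ring topology (each node sends to the next node), and the update rule (add the received values to the node's own value) are all assumptions made for the example.

```python
import threading

def run_bsp(num_nodes, num_supersteps):
    """Toy BSP run on a ring of nodes; returns each node's final value."""
    values = [i + 1 for i in range(num_nodes)]  # hypothetical per-node state
    # Two message buffers, selected by superstep parity, so that messages
    # written for superstep n+1 never mix with those read in superstep n.
    buffers = [[[] for _ in range(num_nodes)] for _ in range(2)]
    lock = threading.Lock()
    barrier = threading.Barrier(num_nodes)

    def node(i):
        for step in range(num_supersteps):
            cur, nxt = buffers[step % 2], buffers[(step + 1) % 2]
            # PH1 (local computation): use data received in the previous superstep.
            values[i] += sum(cur[i])
            cur[i].clear()
            # PH2 (data exchange): send the result to the next node on the ring.
            with lock:
                nxt[(i + 1) % num_nodes].append(values[i])
            # PH3 (synchronization): wait until every node has finished exchanging.
            barrier.wait()

    threads = [threading.Thread(target=node, args=(i,)) for i in range(num_nodes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return values
```

The single `barrier.wait()` per superstep plays the role of phase PH3: no node starts the next local computation until all nodes have finished sending.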

Non-Patent Document 3 discloses Pregel as a distributed processing framework that adopts BSP. Frameworks such as Pregel express the entire computation as a graph G = (V, E) and execute it by applying BSP to that graph. Here, V denotes the set of vertices and E denotes the set of edges.

An example in which BSP is applied to a traffic simulation is now described with reference to FIG. 2.
In FIG. 2, each intersection (v) is mapped to a vertex (v_1 to v_4 in FIG. 2), and each road (e) connecting intersections is mapped to an edge (e_1 to e_6 in FIG. 2). An edge is one-way, so a two-way road is mapped to two edges. Seen from a given vertex, an edge along which vehicles leave is called an "output edge" (outgoing edge), and an edge along which vehicles arrive is called an "input edge" (incoming edge). For example, in FIG. 2, seen from vertex v_2, edge e_1 is an input edge and edge e_2 is an output edge. Conversely, seen from vertex v_1, edge e_1 is an output edge and edge e_2 is an input edge.
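The output-edge and input-edge terminology can be illustrated with a short sketch. Only the two edges whose direction is stated in the text (e_1 from v_1 to v_2, and e_2 from v_2 to v_1) are included; the function name `classify_edges` and the dictionary layout are assumptions made for this example.

```python
from collections import defaultdict

def classify_edges(edges):
    """Group directed edges by the vertex they leave and the vertex they enter.

    edges maps an edge name to a (source vertex, destination vertex) pair.
    """
    outgoing = defaultdict(list)  # vertex -> its output edges
    incoming = defaultdict(list)  # vertex -> its input edges
    for name, (src, dst) in edges.items():
        outgoing[src].append(name)
        incoming[dst].append(name)
    return dict(outgoing), dict(incoming)

# A two-way road between v_1 and v_2 becomes two one-way edges.
edges = {"e_1": ("v_1", "v_2"), "e_2": ("v_2", "v_1")}
outgoing, incoming = classify_edges(edges)
```

With this input, `outgoing["v_1"]` contains e_1 and `incoming["v_1"]` contains e_2, matching the description from the viewpoint of vertex v_1.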

In the superstep shown in FIG. 1, phase PH1 (local computation) simulates, for each vertex, the state over the elapsed time (Δt) of the intersection associated with each of vertices v_1 to v_4 (for example, the signal color (green, yellow, red) and the movement of vehicles within the intersection), together with the state of the roads that are its output edges (vehicle movement, such as the number of vehicles and their average speed). In phase PH2 (data exchange), a vertex sends information on the vehicles leaving via each output edge (such as the number of vehicles) to the vertex adjoining that output edge, and receives information on the vehicles arriving via its input edges. In phase PH3 (synchronization), the simulation time t is synchronized among the vertices; that is, the vertices wait for data exchange to complete among all vertices.
In this traffic simulation, processing the intersections (vertices) in parallel in this way makes it possible to shorten the computation time.

Non-Patent Document 1: Dean, J., et al., "MapReduce: Simplified Data Processing on Large Clusters," OSDI '04, 2004, p. 137-149.
Non-Patent Document 2: Valiant, L., et al., "A bridging model for parallel computation," Communications of the ACM, 1990, vol. 33, No. 8, p. 103-111.
Non-Patent Document 3: Malewicz, G., et al., "Pregel: A System for Large-Scale Graph Processing," Proc. of ACM SIGMOD, 2010, p. 136-145.

A master/worker configuration is adopted as the architecture for realizing a distributed processing framework based on BSP as described above. As shown in FIG. 3, the master/worker configuration consists of a plurality of workers (processing servers 30a), each hosting a plurality of vertices 20a that serve as the units of processing, and a single master (management server 10a) that, among other things, manages the progress of the workers' processing.

The roles of the master (management server 10a) are to allocate processing (vertices 20a) to the workers (processing servers 30a), that is, to partition the graph G; to manage the progress of the workers' processing; to manage the overall superstep that is common to all workers; and to manage the graph topology as vertices and edges are added or deleted.
The roles of a worker (processing server 30a) are to perform the local computation of phase PH1 in each superstep, to send and receive data to and from adjacent vertices in phase PH2, and to report to the master.

Most architectures in existing frameworks adopt this master/worker configuration. When BSP is applied, each worker reports to the master once all of the vertices it hosts have completed their processing (phases PH1 and PH2). When the master has received reports from all workers, it increments the superstep by one and instructs each worker to move to the next superstep.

In the configuration described above, however, all vertices are synchronized at every superstep, so the pace is set by the slowest vertex. If even a single vertex is markedly slower than the rest, its effect spreads to the whole system: overall processing is significantly delayed to match the slowest vertex.
In addition, when a large graph G is processed, that is, when the computation involves a large number of vertices and edges, the single master in the master/worker configuration manages the entire graph, so the master becomes a bottleneck as the graph G grows.
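The cost of synchronizing every vertex at every superstep can be made concrete with a little arithmetic. In the sketch below, the function names and the per-vertex processing times are hypothetical values chosen for illustration; they are not taken from the patent.

```python
def global_sync_makespan(durations):
    """Total time when every superstep lasts as long as its slowest vertex.

    durations[s][v] is the time vertex v needs in superstep s.
    """
    return sum(max(step) for step in durations)

def idle_time(durations):
    """Total time vertices spend waiting at the global synchronization points."""
    return sum(max(step) - d for step in durations for d in step)

# Hypothetical timings: one slow vertex in each of two supersteps.
durations = [
    [1, 5, 1, 1],  # superstep 1: the second vertex is slow
    [1, 1, 1, 4],  # superstep 2: the fourth vertex is slow
]
makespan = global_sync_makespan(durations)  # 5 + 4
waiting = idle_time(durations)              # (4 + 0 + 4 + 4) + (3 + 3 + 3 + 0)
```

Under these assumed timings the run takes 9 time units although no vertex computes for more than 6, and the vertices spend 21 time units in total waiting at the synchronization points.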

An object of the present invention is therefore to solve the problems described above and to provide a distributed synchronous processing system and a distributed synchronous processing method that can reduce the processing delay of the entire system caused by synchronization.

To solve the problems described above, the invention according to claim 1 is a distributed synchronous processing system comprising a plurality of processing servers that perform processing in parallel, a plurality of distributed processing units that operate on the processing servers, and a management server that allocates the plurality of distributed processing units required for a target computation to the plurality of processing servers. Each processing server includes a distributed processing management unit that detects completion of the calculation/transmission processing by a distributed processing unit in a given calculation step, that is, the calculation processing and the transmission processing to the distributed processing units connected as output destinations of the calculation result; generates a completion report indicating that the calculation/transmission processing is complete and transmits it to the management server; and receives from the management server a next-step transition instruction, which is an instruction to move to the next calculation step, and outputs it to the distributed processing unit that completed the calculation/transmission processing. The management server includes an adjacent synchronization processing unit that receives the completion report; determines whether the distributed processing unit that completed the calculation/transmission processing has finished acquiring the calculation results required in the next calculation step, based on whether completion reports have been received from the distributed processing units connected as input sources of those calculation results; and, when acquisition of the calculation results is complete, transmits the next-step transition instruction to the processing server that sent the completion report.

The invention according to claim 3 is a distributed synchronous processing method for a distributed synchronous processing system comprising a plurality of processing servers that perform processing in parallel, a plurality of distributed processing units that operate on the processing servers, and a management server that allocates the plurality of distributed processing units required for a target computation to the plurality of processing servers. In this method, each processing server executes a procedure of detecting completion of the calculation/transmission processing by a distributed processing unit in a given calculation step, that is, the calculation processing and the transmission processing to the distributed processing units connected as output destinations of the calculation result, generating a completion report indicating that the calculation/transmission processing is complete, and transmitting it to the management server; and a procedure of receiving from the management server a next-step transition instruction, which is an instruction to move to the next calculation step, and outputting it to the distributed processing unit that completed the calculation/transmission processing. The management server executes a procedure of receiving the completion report; determining whether the distributed processing unit that completed the calculation/transmission processing has finished acquiring the calculation results required in the next calculation step, based on whether completion reports have been received from the distributed processing units connected as input sources of those calculation results; and, when acquisition of the calculation results is complete, transmitting the next-step transition instruction to the processing server that sent the completion report.

In this way, the management server of the distributed synchronous processing system can determine, for each distributed processing unit individually, whether that unit may move to the next calculation step. Because there is no need to wait until every distributed processing unit has finished its calculation/transmission processing, the processing delay of the entire system caused by synchronization can be reduced.
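The per-unit decision made by the management server can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the class name `AdjacentSyncManager`, its method name, the bookkeeping dictionaries, and the choice to re-check the output destinations of the reporting unit whenever a report arrives are all hypothetical.

```python
class AdjacentSyncManager:
    """Toy model of the adjacent synchronization decision on the management server."""

    def __init__(self, inputs):
        # inputs[v] is the set of units connected to v as input sources.
        self.inputs = inputs
        self.completed = {v: -1 for v in inputs}  # last step with a completion report
        self.released = {v: -1 for v in inputs}   # last step already allowed to advance

    def on_completion_report(self, v, step):
        """Record a report and return the units that may now move to the next step."""
        self.completed[v] = step
        # Assumption: re-check the reporter and every unit that takes v as an input.
        dependents = sorted(u for u, srcs in self.inputs.items() if v in srcs and u != v)
        ready = []
        for u in [v] + dependents:
            n = self.completed[u]
            if n > self.released[u] and all(self.completed[s] >= n for s in self.inputs[u]):
                self.released[u] = n
                ready.append(u)
        return ready

# Hypothetical chain topology: v_1 feeds v_2, and v_2 feeds v_3.
mgr = AdjacentSyncManager({"v_1": set(), "v_2": {"v_1"}, "v_3": {"v_2"}})
r1 = mgr.on_completion_report("v_1", 0)  # v_1 has no input sources
r2 = mgr.on_completion_report("v_3", 0)  # v_3 must wait for v_2
r3 = mgr.on_completion_report("v_1", 1)  # v_1 advances again without waiting for v_3
r4 = mgr.on_completion_report("v_2", 0)  # releases v_2 and the waiting v_3
```

In this run v_1 finishes two steps before v_2 has reported once, which is the behavior a global synchronization point would not allow.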

The invention according to claim 2 is a distributed synchronous processing system comprising a plurality of processing servers that perform processing in parallel and a plurality of distributed processing units that operate on the processing servers. Each processing server includes an adjacent synchronization distributed management unit that detects completion of the calculation/transmission processing by a distributed processing unit in a given calculation step, that is, the calculation processing and the transmission processing to the distributed processing units connected as output destinations of the calculation result; determines whether the distributed processing unit that completed the calculation/transmission processing has finished acquiring, from the distributed processing units connected as input sources of calculation results, the calculation results required in the next calculation step; and, when acquisition of the calculation results is complete, outputs a next-step transition instruction, which is an instruction to move to the next calculation step, to the distributed processing unit that completed the calculation/transmission processing.

The invention according to claim 4 is a distributed synchronous processing method for a distributed synchronous processing system comprising a plurality of processing servers that perform processing in parallel and a plurality of distributed processing units that operate on the processing servers. In this method, each processing server executes a procedure of detecting completion of the calculation/transmission processing by a distributed processing unit in a given calculation step, that is, the calculation processing and the transmission processing to the distributed processing units connected as output destinations of the calculation result; and a procedure of determining whether the distributed processing unit that completed the calculation/transmission processing has finished acquiring, from the distributed processing units connected as input sources of calculation results, the calculation results required in the next calculation step, and, when acquisition of the calculation results is complete, outputting a next-step transition instruction, which is an instruction to move to the next calculation step, to the distributed processing unit that completed the calculation/transmission processing.

In this way, each processing server of the distributed synchronous processing system can determine, for each distributed processing unit individually, whether that unit may move to the next calculation step. Because there is no need to wait until every distributed processing unit has finished its calculation/transmission processing, the processing delay of the entire system caused by synchronization can be reduced.
Furthermore, because each processing server decides on the move to the next calculation step in an autonomous, distributed manner, the processing delay of the entire system can be reduced even in a large-scale system with many processing servers and distributed processing units.
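The autonomous variant, in which a processing server decides locally without a management server, can be sketched in a similar way. Again this is only an illustration: the class name `LocalAdjacentSync`, its methods, and the representation of received results as (source, step) pairs are assumptions made for the example.

```python
class LocalAdjacentSync:
    """Toy model of a per-server decision on moving a unit to the next step."""

    def __init__(self, inputs):
        # inputs[v] is the set of units connected to the local unit v as input sources.
        self.inputs = inputs
        self.received = {v: set() for v in inputs}  # (source, step) pairs received so far
        self.finished = {v: -1 for v in inputs}     # last step whose calc/send finished
        self.released = {v: -1 for v in inputs}     # last step already allowed to advance

    def _try_release(self, v):
        n = self.finished[v]
        have_all = all((s, n) in self.received[v] for s in self.inputs[v])
        if n > self.released[v] and have_all:
            self.released[v] = n
            return True   # corresponds to issuing the next-step transition instruction
        return False

    def on_result_received(self, v, source, step):
        """A calculation result for unit v arrived from one of its input sources."""
        self.received[v].add((source, step))
        return self._try_release(v)

    def on_calc_send_complete(self, v, step):
        """Unit v finished its own calculation/transmission processing for this step."""
        self.finished[v] = step
        return self._try_release(v)

sync = LocalAdjacentSync({"v_1": {"v_2", "v_3"}})
a = sync.on_calc_send_complete("v_1", 0)     # own work done, inputs still missing
b = sync.on_result_received("v_1", "v_2", 0)  # one of two inputs has arrived
c = sync.on_result_received("v_1", "v_3", 0)  # all inputs present: may advance
```

The decision for v_1 depends only on v_1's own input sources, so no server needs a view of the whole graph.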

According to the present invention, it is possible to provide a distributed synchronous processing system and a distributed synchronous processing method that reduce the processing delay of the entire system caused by synchronization.

FIG. 1 is a diagram for explaining the BSP computation model.
FIG. 2 is a diagram for explaining an example in which the BSP computation model is applied to a traffic simulation.
FIG. 3 is a diagram for explaining the master/worker configuration of a distributed synchronous processing system according to a comparative example.
FIG. 4 is a diagram for explaining the definitions of the components of a vertex.
FIG. 5 is a diagram illustrating the components of one vertex.
FIG. 6 is a diagram illustrating a graph to be computed in the BSP computation model.
FIG. 7 is a diagram for explaining the flow of processing in the distributed synchronous processing system of the comparative example.
FIG. 8 is a diagram for explaining the flow of processing of the distributed synchronous processing system according to the comparative example (FIG. 8(a)) and the flow of processing of the distributed synchronous processing system according to the present embodiment (FIG. 8(b)).
FIG. 9 is a diagram showing the overall configuration of the distributed synchronous processing system according to the present embodiment.
FIG. 10 is a sequence diagram showing the flow of processing of the distributed synchronous processing system according to the present embodiment.
FIG. 11 is a diagram showing the overall configuration of a distributed synchronous processing system according to a modification of the present embodiment.
FIG. 12 is a flowchart showing the flow of processing of the distributed synchronous processing system according to the modification of the present embodiment.

<Detailed description of the distributed processing method of the comparative example and its problems>
First, in order to explain the characteristic configuration of the distributed synchronous processing system 1 and the distributed synchronous processing method according to the present embodiment, the distributed synchronous processing system 1a and the distributed synchronous processing method of the related art are described in detail as a comparative example.

The distributed synchronous processing system 1a of the comparative example adopts the master/worker configuration shown in FIG. 3, in which each of a plurality of workers hosts a plurality of vertices. When BSP is applied to this master/worker configuration, each worker reports to the master once all of the vertices it hosts have completed their processing (phases PH1 and PH2), and the master, on receiving reports from all workers, advances to the next superstep.

Focusing on the vertices, each vertex executes the following processing.
In phase PH1 of BSP, a vertex performs a calculation whose parameters are the current state of the vertex, the state of its output edges, and the information obtained from the input messages of the previous superstep (the state of its input edges), and updates the state of the vertex and the state of its output edges. A superstep is hereinafter sometimes referred to simply as a "step". Then, in phase PH2, the vertex sends the updated state of each output edge as an output message to the vertex adjoining that output edge. Here, "the vertex adjoining the output edge" means "the vertex connected as an output destination of the calculation result".

The above processing (calculation/transmission processing) can be expressed by the following equation (1).
f(v_{vid,n}, E_{out,n}, M_{in,n-1}) = (v_{vid,n+1}, E_{out,n+1}, M_{out,n})  ... (1)
The definitions of the components of a vertex are shown in FIG. 4.

As shown in FIG. 4, "vid" denotes the vertex ID, "v_{vid,n}" denotes the state of the vertex, and "n" denotes the current step (superstep). "E_{out,n}" denotes the set of output edge states. "M_{in,n}" denotes the information stored in the buffer of input messages indicating the input edge states (for the current step), and "M_{in,n-1}" denotes the information stored in the buffer of input messages (for the previous step). "s_n" denotes the status flag (active/inactive) of the current step, and "s_{n+1}" denotes the status flag (active/inactive) of the next step. f(v_{vid,n}, E_{out,n}, M_{in,n-1}) = (v_{vid,n+1}, E_{out,n+1}, M_{out,n}) denotes the calculation/transmission processing, as shown in equation (1). Hereinafter, the calculation/transmission processing in step (superstep) n is written as "calculation/transmission processing f_n".
The status flag "s_n" is set to "active" while the vertex is executing phases PH1 and PH2 of BSP, and to "inactive" while the vertex is waiting for other vertices in the synchronization processing of phase PH3. The flag "s_{n+1}" is set to "active" when the vertex is set to move on to the processing of the next step, and to "inactive" when the vertex is set not to execute processing in the next step, for example because the configured duration of the simulation has elapsed.
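The shape of the calculation/transmission processing f_n in equation (1) can be sketched as a pure function. The update rule used here (pool the vehicles at the vertex with those that arrived, then split them evenly over the output edges) is a hypothetical toy rule chosen only to make the inputs and outputs concrete; the patent text does not specify the rule.

```python
def f(v, e_out, m_in_prev):
    """Toy instance of equation (1).

    v          -- vertex state v_{vid,n}: vehicles currently at the vertex
    e_out      -- output edge states E_{out,n}: destination vertex -> vehicles
    m_in_prev  -- input messages M_{in,n-1}: source vertex -> vehicles arriving
    Returns (v_{vid,n+1}, E_{out,n+1}, M_{out,n}).
    """
    total = v + sum(m_in_prev.values())
    # Hypothetical rule: send an equal share down every output edge.
    # It uses only the destinations in e_out and overwrites the previous edge states.
    share = total // len(e_out) if e_out else 0
    e_out_next = {dst: share for dst in e_out}
    v_next = total - share * len(e_out)  # the remainder stays at the vertex
    m_out = dict(e_out_next)             # messages carry the updated output edge states
    return v_next, e_out_next, m_out

# A vertex with output edges to vertices 3 and 4 and input messages from 2 and 3.
v_next, e_out_next, m_out = f(3, {3: 0, 4: 0}, {2: 4, 3: 2})
```

The function takes the previous step's input messages and produces the next vertex state, the next output edge states, and the messages to send, mirroring the three terms on each side of equation (1).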

FIG. 5 is a diagram illustrating the components of a vertex when attention is focused on a single vertex.
As shown in FIG. 5, vertex "1", whose vertex ID is "1", holds the vertex state "v_{1,n}" at the current step "n". As its output edge states, vertex "1" outputs "e_{1,3,n}" to vertex "3" and "e_{1,4,n}" to vertex "4". As the information of its input messages (the input edge states), vertex "1" receives "m_{2,1,n}" from vertex "2" and "m_{3,1,n}" from vertex "3".

For each vertex it hosts, a worker (see FIG. 3) manages the status flag (active/inactive) of the current step (superstep) and the status flag (active/inactive) of the next step (superstep). When a worker sends output edge states as output messages from its own vertices to vertices belonging to another worker, it may buffer the messages addressed to vertices on the same destination worker and send them together. This reduces communication cost.
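The buffering of messages bound for the same destination worker can be sketched as below. The function name `batch_by_worker`, the message tuples, and the particular edges used in the example are assumptions made for illustration; only the assignment of v_1 to v_3 to worker1 and v_4 to v_6 to worker2 follows the comparative example.

```python
from collections import defaultdict

def batch_by_worker(messages, owner, local_worker):
    """Split output messages into local deliveries and per-worker batches.

    messages -- list of (source vertex, destination vertex, payload)
    owner    -- mapping from vertex to the worker that hosts it
    """
    local = []
    remote = defaultdict(list)  # destination worker -> messages sent in one transfer
    for src, dst, payload in messages:
        if owner[dst] == local_worker:
            local.append((src, dst, payload))
        else:
            remote[owner[dst]].append((src, dst, payload))
    return local, dict(remote)

owner = {"v_1": "worker1", "v_2": "worker1", "v_3": "worker1",
         "v_4": "worker2", "v_5": "worker2", "v_6": "worker2"}
# Hypothetical output messages produced on worker1 in one superstep.
messages = [("v_1", "v_2", "a"), ("v_1", "v_4", "b"), ("v_3", "v_5", "c")]
local, remote = batch_by_worker(messages, owner, "worker1")
```

Here the two messages bound for vertices on worker2 end up in a single batch, so they can be sent in one transfer instead of two.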

Next, the flow of processing executed by the distributed synchronous processing system 1a of the comparative example is described with reference to FIG. 7. The description assumes that the graph G to be computed has the graph topology shown in FIG. 6. As shown in FIG. 7, the system consists of one master and two workers (worker1 and worker2); of vertices v_1 to v_6, worker1 is in charge of vertices v_1 to v_3 and worker2 is in charge of vertices v_4 to v_6. The overall flow of processing is described below.

First, the master sets the vertices of the graph G shown in FIG. 6 (vertices v_1 to v_6) as the processing targets and allocates them to the workers (step S101); that is, it partitions the graph G.
Here, as shown in FIG. 6, of vertices v_1 to v_6, vertices v_1 to v_3 are allocated to worker1 and vertices v_4 to v_6 are allocated to worker2.

Next, each worker (worker1, worker2) executes the superstep for the vertices it is in charge of (step S102). Specifically, it starts the superstep by executing the local computation of phase PH1.

Each worker then monitors the progress of the vertices it is in charge of and determines whether each vertex has completed processing up to the data exchange of phase PH2. When a worker has confirmed that all of its vertices have completed processing up to phase PH2, it reports (sends) to the master the status flag of each vertex for the next superstep (step S103). Here, the worker reports "active" as the status flag of each vertex for the next superstep, meaning that the vertex is set to move on to the processing of the next superstep.

The master then checks whether it has received status flag reports indicating completion of processing from all workers (worker1, worker2). When it has received reports from all workers, it increments the superstep by one (step S104).
If the graph topology has changed, for example because vertices or edges have been added or deleted, the master notifies each worker of the change.

Subsequently, the master instructs all workers (worker1, worker2) to proceed to the next superstep (step S105). Each worker then repeats steps S102 to S105.
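The master-side barrier of the comparative example (steps S103 to S105) can be sketched as follows. This is an illustrative model only: the class name `GlobalSyncMaster` and its method are hypothetical, and message transport between master and workers is omitted.

```python
from typing import List, Set


class GlobalSyncMaster:
    """Comparative-example master: advances only after every worker has reported."""

    def __init__(self, workers: List[str]) -> None:
        self.workers: Set[str] = set(workers)
        self.superstep = 1
        self.reported: Set[str] = set()

    def on_report(self, worker: str) -> bool:
        """Record a worker's status report (step S103).

        Returns True when all workers have reported; the superstep is then
        incremented (step S104) and the workers may be told to proceed (step S105).
        """
        if worker not in self.workers:
            raise KeyError(worker)
        self.reported.add(worker)
        if self.reported == self.workers:
            self.superstep += 1
            self.reported.clear()
            return True
        return False


master = GlobalSyncMaster(["worker1", "worker2"])
first = master.on_report("worker1")   # worker2 has not reported yet: no transition
second = master.on_report("worker2")  # all workers reported: next superstep
print(first, second, master.superstep)
```

The sketch makes the weakness visible: no vertex can advance until the last worker has reported.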

In the distributed synchronous processing system 1a of the comparative example, all vertices subject to computation are synchronized at every superstep, specifically at the global synchronization points shown in FIG. 7, so every superstep is paced by its slowest vertex. For example, in superstep SS1 of FIG. 7, processing is paced by vertex v_2, the slowest of vertices v_1 to v_6. In superstep SS2, it is paced by vertex v_6, the slowest in that step. Consequently, if one vertex is markedly slow, the processing of all vertices is markedly delayed to match it.
In addition, in the master/worker configuration a single master manages the entire system, so when the graph G becomes large, that is, when the number of vertices or workers increases, the master becomes a bottleneck.

To solve the problems described above, namely the delay in overall processing speed and the large amount of time spent waiting for synchronization in phase PH3 without doing any processing (processing efficiency) (hereinafter referred to as the "processing speed/efficiency" problem), asynchronous distributed processing frameworks have been proposed (see, for example, Non-Patent Document 4).
Here, Non-Patent Document 4 is "Low, Y., et al., "Distributed GraphLab", Proc. of the VLDB Endowment, 2012.".

However, in the asynchronous distributed processing framework described in Non-Patent Document 4, processing speed/efficiency and calculation accuracy are in a trade-off relationship, which increases the burden on the programmer when designing the processing (program complexity).
Specifically, the asynchronous type does not guarantee that all vertices are executing the same superstep, so the programmer must take into account overtaking and overwriting of processing between vertices. An iteration that has been overtaken becomes invalid, which lowers accuracy. Moreover, because the number of supersteps by which one vertex can overtake another can grow without bound, it becomes difficult to guarantee accuracy theoretically.

To address these problems, the distributed synchronous processing system 1 (see FIG. 9) and the distributed synchronous processing method according to the present embodiment aim to improve the processing speed/efficiency that has been a problem in the synchronous type, while providing a simple, synchronous, programmer-friendly framework (that is, one in which overtaking and overwriting of processing need not be considered).
A further aim is to prevent the master from becoming a bottleneck and to secure processing speed/efficiency even for a large graph G.

Of the graph topology management originally performed by the master, the "dynamic addition of elements (vertices and edges)" requires changes to the system configuration and the like, and is therefore outside the scope of the present invention; the present invention covers cases in which dynamic addition of elements is not required.

<Overview of this embodiment>
Next, an outline of the processing executed by the distributed synchronous processing system 1 according to the present embodiment will be described.
The distributed synchronous processing system 1 according to the present embodiment (FIG. 9, described later) is characterized in that the master (the "management server 10" described later) does not synchronize all vertices (the "distributed processing units 20" described later); instead, the transition to the next superstep is determined for each vertex. The distributed synchronous processing system 1 thereby reduces the influence of markedly slow vertices.

Specifically, in the distributed synchronous processing system 1, the condition for moving to the next superstep is set as "the calculation/transmission processing f_n of the vertex itself and of all vertices adjacent to it via input edges has been completed". Here, "all vertices adjacent via input edges" means all vertices connected as input sources of calculation results. Hereinafter, this condition for moving to the next superstep is referred to as "adjacent synchronization". The details of adjacent synchronization are described with reference to FIG. 8.
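The adjacent synchronization condition can be expressed as a simple predicate. The sketch below is illustrative: the function name `can_advance`, the `in_neighbors` dictionary, and the `completed` set of (vertex, step) pairs are assumed representations, not structures defined in the disclosure.

```python
from typing import Dict, Set, Tuple


def can_advance(vertex: str, step: int,
                in_neighbors: Dict[str, Set[str]],
                completed: Set[Tuple[str, int]]) -> bool:
    """True if the vertex and all of its input-edge neighbors have finished f_n for `step`."""
    required = {vertex} | in_neighbors.get(vertex, set())
    return all((u, step) in completed for u in required)


# Input-edge neighbors of v1, v2, and v3 as described for FIG. 8(b).
in_neighbors = {"v1": set(), "v2": {"v1", "v3", "v4"}, "v3": {"v2", "v4"}}
completed = {("v1", 1), ("v3", 1), ("v4", 1)}

v1_ready = can_advance("v1", 1, in_neighbors, completed)         # no input neighbors
v3_before = can_advance("v3", 1, in_neighbors, completed)        # still waiting for v2
completed.add(("v2", 1))
v3_after = can_advance("v3", 1, in_neighbors, completed)         # v2 has now finished
print(v1_ready, v3_before, v3_after)
```

Note that the predicate looks only at the vertex and its input sources, never at the whole graph, which is what distinguishes it from a global barrier.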

FIG. 8 shows the processing executed by the distributed synchronous processing system 1a of the comparative example shown in FIG. 7 (see FIG. 8(a)) and the processing executed by the distributed synchronous processing system 1 according to the present embodiment (see FIG. 8(b)).
As described above, the distributed synchronous processing system 1 according to the present embodiment moves to the next superstep when "the calculation/transmission processing f_n of the vertex itself and of all vertices adjacent to it via input edges has been completed" ("adjacent synchronization").

For example, consider vertex v_2 in FIG. 8(b). The adjacent synchronization point of vertex v_2 is the time at which the calculation/transmission processing f_n of vertices v_1, v_3, and v_4, which are adjacent to it via input edges, and its own calculation/transmission processing f_n have all finished. In superstep SS1, vertex v_2 finished its own calculation/transmission processing f_1 later than vertices v_1, v_3, and v_4, so the time at which v_2 itself finished is its adjacent synchronization point.
Next, consider vertex v_3. Its adjacent synchronization point is the time at which the calculation/transmission processing f_n of vertices v_2 and v_4, which are adjacent to it via input edges, and its own calculation/transmission processing f_n have all finished. In superstep SS1, when vertex v_3 finishes its own calculation/transmission processing f_1, vertex v_4 has already finished f_1 but vertex v_2 has not. Vertex v_3 therefore waits in the "inactive" state (symbol α in FIG. 8(b)), and the time at which vertex v_2 finishes f_1 is the adjacent synchronization point of v_3.
Finally, consider vertex v_1. No vertex is adjacent to vertex v_1 via an input edge, so in superstep SS1 the time at which v_1 finishes its own calculation/transmission processing f_1 is its adjacent synchronization point.

As shown in FIG. 8(b), at any given point in the overall processing, different vertices may be in different supersteps. For this reason, messages exchanged between vertices are not overwritten but are managed per superstep; that is, each message is stored together with its superstep information (step number). Accordingly, in addition to the vertex elements shown in FIG. 4, each vertex in the present embodiment stores "M_in,n+m" in its input message buffer. Here, "M_in,n+m" denotes the information stored in the buffer as the state of the input edges at step number n+m ("m" is a positive integer). When a vertex receives an input edge state from a vertex that has already moved on to a later superstep than its own (for example, step number "n", the current step), it stores the state of the input message together with the corresponding step number n+1, n+2, ..., n+m.
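A buffer that keeps input messages separate per step number might look like the following. This is a minimal sketch under assumed names (`InputBuffer`, `store`, `take`); the disclosure specifies only that messages are stored together with their step number and are not overwritten.

```python
from collections import defaultdict
from typing import Any, Dict


class InputBuffer:
    """Keeps input messages per step number so a later step never overwrites an earlier one."""

    def __init__(self) -> None:
        # step number -> {source vertex -> input edge state}
        self._by_step: Dict[int, Dict[str, Any]] = defaultdict(dict)

    def store(self, step: int, source: str, state: Any) -> None:
        self._by_step[step][source] = state

    def take(self, step: int) -> Dict[str, Any]:
        """Remove and return all messages recorded for the given step."""
        return self._by_step.pop(step, {})


buf = InputBuffer()
buf.store(1, "v2", 0.25)  # message tagged with the receiver's current step n = 1
buf.store(2, "v4", 0.40)  # v4 is already one step ahead (n + 1)
buf.store(1, "v4", 0.10)  # v4's step-1 message is kept alongside its step-2 message
step1 = buf.take(1)
step2 = buf.take(2)
print(step1, step2)
```

Keying the buffer by step number is what lets a fast neighbor deliver its step n+1 state without disturbing the state that the receiver still needs for step n.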

By moving to the next superstep on the basis of adjacent synchronization in this way, each vertex, viewed individually, is logically synchronized within the same superstep. The programmer therefore does not need to consider the trade-off between processing speed/efficiency and calculation accuracy that arises in the asynchronous type.
In addition, compared with the comparative example shown in FIG. 8(a), the time spent waiting for synchronization in the inactive state is greatly reduced (symbol β in FIG. 8(b)), so processing speed/efficiency can be improved. In other words, the problems of delayed overall processing speed and of excessive waiting for synchronization in phase PH3 without processing (processing efficiency) can be solved.
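The reduction in waiting time can be illustrated with a simplified timing model. The sketch below compares global synchronization with adjacent synchronization over two supersteps. The per-step durations, the input edges of v4 to v6, and both function names are assumptions made for illustration; communication and master latency are ignored.

```python
from typing import Dict, List, Set


def total_time_global(durations: Dict[str, List[int]]) -> int:
    """Global synchronization: each superstep lasts as long as its slowest vertex."""
    steps = len(next(iter(durations.values())))
    return sum(max(d[n] for d in durations.values()) for n in range(steps))


def total_time_adjacent(durations: Dict[str, List[int]],
                        in_neighbors: Dict[str, Set[str]]) -> int:
    """Adjacent synchronization: a vertex starts step n+1 once it and its input neighbors finished step n."""
    steps = len(next(iter(durations.values())))
    finish = {v: 0 for v in durations}  # finish time of the previous step
    for n in range(steps):
        start = {v: max(finish[u] for u in {v} | in_neighbors[v]) for v in durations}
        finish = {v: start[v] + durations[v][n] for v in durations}
    return max(finish.values())


# Hypothetical durations: v2 is slow in step 1, v6 is slow in step 2.
durations = {"v1": [1, 1], "v2": [4, 1], "v3": [2, 1],
             "v4": [1, 1], "v5": [1, 1], "v6": [1, 4]}
# Input edges for v1 to v3 follow the description of FIG. 8(b); the rest are assumed.
in_neighbors = {"v1": set(), "v2": {"v1", "v3", "v4"}, "v3": {"v2", "v4"},
                "v4": {"v5"}, "v5": {"v6"}, "v6": {"v5"}}

global_total = total_time_global(durations)
adjacent_total = total_time_adjacent(durations, in_neighbors)
print(global_total, adjacent_total)
```

Under these assumed numbers the global scheme needs 8 time units and the adjacent scheme 5, because v6 does not depend on the slow vertex v2 and can start its long second step early.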

<< Configuration of the distributed synchronous processing system >>
Next, the configuration of the distributed synchronous processing system 1 according to the present embodiment will be described in detail.
As shown in FIG. 9, the distributed synchronous processing system 1 includes a management server 10 (master), a plurality of processing servers 30 (workers) that are each connected to the management server 10 and perform processing in parallel, and a plurality of distributed processing units 20 (vertices) that run on the processing servers 30.

The management server 10 and the processing servers 30 each include the hardware of a general-purpose computer, such as a CPU (Central Processing Unit), RAM (Random Access Memory), ROM (Read Only Memory), and an HDD (Hard Disk Drive). The HDD stores an OS (Operating System), application programs, various data, and the like. The OS and the application programs are loaded into the RAM and executed by the CPU. In FIG. 9, the functions (characteristic components) implemented inside the management server 10, the distributed processing units 20, and the processing servers 30 by the application programs and the like loaded into the RAM are shown as blocks.

The management server 10 functions as the master that manages the entire system. The management server 10 subdivides the overall target calculation into a plurality of calculation processes of a predetermined unit and allocates them to the processing servers 30, which function as workers. Each calculation process includes data input, calculation, message transmission and reception, and the like. On the plurality of processing servers 30 (workers), which perform processing in parallel, a plurality of distributed processing units 20 operate, each corresponding to one of the individual calculation processes. When the target calculation is expressed as a graph G = (V, E), each of the individual calculation processes it requires is expressed as a vertex of the graph G. In other words, each distributed processing unit 20 functions as a vertex.
Each device constituting the distributed synchronous processing system 1 is described in detail below.

<Management server (master)>
The management server 10 sets up the individual calculation processes (vertices) required for the target calculation and allocates them to the processing servers 30 (workers). The management server 10 also manages the target calculation as a whole by determining, for each vertex set up in the system, whether to move to the next superstep in BSP.
The difference from the master of the conventional distributed synchronous processing system 1a shown in FIG. 3 is that the management server 10 (master) according to the present embodiment does not decide the transition to the next superstep on the basis of all vertices having finished processing; instead, it makes the decision for each vertex on the basis of the "adjacent synchronization" described above.

The management server 10 includes an adjacent synchronization processing unit 11 as its characteristic component.
For each distributed processing unit 20 (vertex), when the calculation/transmission processing f_n is completed, that is, when phase PH1 (local computation) and phase PH2 (data exchange) are completed, the adjacent synchronization processing unit 11 receives a completion report for the calculation/transmission processing f_n (hereinafter referred to as a "calculation/transmission processing completion report") from the processing server 30 (worker).
The adjacent synchronization processing unit 11 then decides, on the basis of the "adjacent synchronization" condition described above, whether the distributed processing unit 20 (vertex) indicated by the received calculation/transmission processing completion report, that is, the distributed processing unit 20 (vertex) whose calculation/transmission processing has been completed, may move to the next superstep. In other words, the adjacent synchronization processing unit 11 determines whether the condition "the calculation/transmission processing f_n of the vertex itself and of all vertices adjacent to it via input edges has been completed" (adjacent synchronization) is satisfied. This determination amounts to judging whether the calculation results needed in the next superstep have all been obtained, on the basis of whether calculation/transmission processing completion reports have been received for the adjacent distributed processing units 20 (vertices).

When the distributed processing unit 20 (vertex) indicated by the received calculation/transmission processing completion report satisfies the adjacent synchronization condition, the adjacent synchronization processing unit 11 sends an instruction to the processing server 30 (worker) in charge of that distributed processing unit 20 (vertex) so that the unit moves to the next superstep (that is, its superstep is incremented by one). Hereinafter, this instruction from the adjacent synchronization processing unit 11 to move to the next superstep is referred to as a "next-step shift instruction".

In addition, when the adjacent synchronization processing unit 11 receives a calculation/transmission processing completion report for a certain distributed processing unit 20 (vertex), it checks the distributed processing units 20 (vertices) to which the reported unit is connected by output edges. If any of them is waiting in the inactive state solely because it is waiting for an input message (that is, for the state of an input edge) from the reported distributed processing unit 20 (vertex), the adjacent synchronization processing unit 11 sends a next-step shift instruction so that the waiting distributed processing unit 20 (vertex) moves to the next superstep.

This is explained concretely with reference to FIG. 8(b). Consider vertex v_3 in superstep SS1. The adjacent synchronization point of vertex v_3 is the time at which vertices v_2 and v_4, which are adjacent to it via input edges, and v_3 itself have all finished the calculation/transmission processing f_1. When vertex v_3 finishes its own calculation/transmission processing f_1, vertex v_4 has already finished f_1 but vertex v_2 has not, so v_3 waits in the "inactive" state. In this state, when the adjacent synchronization processing unit 11 of the management server 10 receives from the processing server 30 (worker1) a calculation/transmission processing completion report indicating that vertex v_2 has finished f_1, it sends a next-step shift instruction for vertex v_3, which had been waiting only for the input message (the state of the input edge) from vertex v_2.
In this way, a distributed processing unit 20 (vertex) that has finished its own calculation/transmission processing f_n and has been waiting in the inactive state can be moved to the next step.
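The decision logic of the adjacent synchronization processing unit 11 can be sketched as follows. This is one possible reading, not the disclosed implementation: the class `AdjacentSyncMaster`, its bookkeeping sets, and the assumption that v4 has no input edges are all hypothetical. On each completion report, the sketch checks the reporting vertex and then the vertices connected to it by output edges, which covers the case of a waiting vertex such as v_3.

```python
from typing import Dict, List, Set, Tuple


class AdjacentSyncMaster:
    """Hypothetical model of the per-vertex transition decision on the master."""

    def __init__(self, in_neighbors: Dict[str, Set[str]]) -> None:
        self.in_neighbors = in_neighbors
        # Output neighbors are derived by reversing the input edges.
        self.out_neighbors: Dict[str, Set[str]] = {v: set() for v in in_neighbors}
        for v, sources in in_neighbors.items():
            for u in sources:
                self.out_neighbors.setdefault(u, set()).add(v)
        self.completed: Set[Tuple[str, int]] = set()   # (vertex, step) with f_n done
        self.instructed: Set[Tuple[str, int]] = set()  # instructions already issued

    def _satisfied(self, vertex: str, step: int) -> bool:
        required = {vertex} | self.in_neighbors.get(vertex, set())
        return all((u, step) in self.completed for u in required)

    def on_completion_report(self, vertex: str, step: int) -> List[str]:
        """Return the vertices that should now receive a next-step shift instruction."""
        self.completed.add((vertex, step))
        to_instruct: List[str] = []
        # Check the reporting vertex, then any output neighbors that were waiting on it.
        for candidate in [vertex] + sorted(self.out_neighbors.get(vertex, set())):
            key = (candidate, step)
            if key not in self.instructed and self._satisfied(candidate, step):
                self.instructed.add(key)
                to_instruct.append(candidate)
        return to_instruct


master = AdjacentSyncMaster({"v1": set(), "v2": {"v1", "v3", "v4"},
                             "v3": {"v2", "v4"}, "v4": set()})
results = [master.on_completion_report(v, 1) for v in ["v1", "v4", "v3", "v2"]]
print(results)
```

In this run the report for v3 produces no instruction because v2 is still running; the later report for v2 releases both v2 and the waiting v3.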

<Distributed processing unit (vertex)>
Returning to FIG. 9, the distributed processing unit 20 (vertex) executes a calculation process of the predetermined unit and includes a numerical calculation unit 21 and a message transmission/reception unit 22.

The numerical calculation unit 21 executes the processing of phase PH1 (local computation) in BSP. The numerical calculation unit 21 moves to the next superstep in accordance with a next-step shift instruction received via the message transmission/reception unit 22. After its own calculation/transmission processing f_n is completed, the numerical calculation unit 21 waits in the inactive state until it receives a next-step shift instruction.
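The active/inactive behavior of the numerical calculation unit 21 can be modeled as a small state holder. The class and method names below are hypothetical, and the actual local computation is omitted.

```python
class NumericalCalculationUnit:
    """Hypothetical state holder for one vertex's superstep progress."""

    def __init__(self) -> None:
        self.step = 1
        self.state = "active"

    def complete_calculation_and_transmission(self) -> None:
        # f_n is done: wait in the inactive state for the master's decision.
        self.state = "inactive"

    def on_next_step_shift_instruction(self) -> None:
        if self.state != "inactive":
            raise RuntimeError("instruction received before f_n completed")
        self.step += 1
        self.state = "active"


unit = NumericalCalculationUnit()
unit.complete_calculation_and_transmission()
waiting_state = unit.state
unit.on_next_step_shift_instruction()
print(waiting_state, unit.step, unit.state)
```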

The message transmission/reception unit 22 exchanges information with other distributed processing units 20 and with the processing server 30 (worker). Specifically, in phase PH2 (data exchange) in BSP, the message transmission/reception unit 22 sends the state of each of its output edges as an output message to the vertex connected by that output edge. The step number of the superstep at that time is attached to the output message in association with the state of the output edge. The message transmission/reception unit 22 also receives the state of each input edge as an input message from the vertex connected by that input edge. Likewise, the step number of the superstep at that time is attached to the input message in association with the state of the input edge. The message transmission/reception unit 22 sends and receives these output and input messages via the message processing unit 32 of the processing server 30 (worker).
The message transmission/reception unit 22 also receives next-step shift instructions from the processing server 30 (worker) to which it belongs and passes them to the numerical calculation unit 21.

<Processing server (worker)>
The processing server 30 (worker) (see FIG. 9) is connected to the management server 10 (master) and to other processing servers 30 (workers). The processing server 30 (worker) hosts a plurality of distributed processing units 20 (vertices), each of which is a unit of processing; it manages the progress and other status of the distributed processing units 20 (vertices) it hosts, and exchanges information with other processing servers 30 (workers) and with the management server 10 (master). The processing server 30 (worker) includes a virtualization control unit 31, a message processing unit 32, and a vertex management unit 33 (distributed processing management unit).

The virtualization control unit 31 builds a virtualization platform on the processing server 30 using virtualization technology and controls the placement of the plurality of distributed processing units 20 (virtual machines).

In phase PH2 (data exchange) in BSP, the message processing unit 32 receives, from each distributed processing unit 20 (vertex) belonging to its own processing server 30, an output message indicating the state of an output edge. Based on the graph topology of the graph G to be computed, it delivers the received output message to the vertex connected by that output edge as an input message indicating the state of the input edge. Hereinafter, output messages and input messages are referred to simply as "messages" when there is no particular need to distinguish them.

The message processing unit 32 can send a message to a distributed processing unit 20 (vertex) adjacent via an output edge at, for example, either of the following two timings.
(Timing 1)
The message is sent immediately after the calculation of the local distributed processing unit 20 (vertex) ends.
Specifically, when the message processing unit 32 receives an output message from a local distributed processing unit 20 (vertex), it immediately sends the message to the distributed processing unit 20 (vertex) connected by the output edge, regardless of whether that unit belongs to the same processing server 30 or to another processing server 30.
This reduces the influence of communication delay.

(Timing 2)
The message is buffered until immediately before the distributed processing unit 20 (vertex) connected by the output edge (the adjacent vertex) moves to the next superstep.
Specifically, when the message processing unit 32 receives an output message from a local distributed processing unit 20 (vertex) and the distributed processing unit 20 (vertex) connected by the output edge belongs to another processing server 30, it buffers the message. When that destination distributed processing unit 20 (vertex) becomes ready to receive a next-step shift instruction, the message processing unit 32 obtains advance notice from the management server 10 that the instruction is about to be issued. Then, immediately before the transition, it sends the buffered messages together to the distributed processing unit 20 (vertex) connected by the output edge.
This reduces the number of transmissions (communications) to distributed processing units 20 (vertices) belonging to other processing servers 30.
When the distributed processing unit 20 (vertex) connected by the output edge belongs to the same processing server 30, no such reduction in the number of communications is obtained, so the message processing unit 32 sends the message immediately without buffering.
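The two timings can be contrasted in a small sketch. Everything named here is an assumption for illustration: the `Message` tuple, the `MessageProcessor` class, the `send` transport callback, and the `on_destination_about_to_advance` hook, which stands in for the advance notice obtained from the management server 10.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, NamedTuple


class Message(NamedTuple):
    source: str       # sending vertex
    destination: str  # vertex connected by the output edge
    step: int         # superstep number attached to the message
    state: float      # output edge state


class MessageProcessor:
    """Hypothetical worker-side message handling for timing 1 and timing 2."""

    def __init__(self, local_vertices: Iterable[str],
                 send: Callable[[List[Message]], None],
                 buffer_remote: bool) -> None:
        self.local_vertices = set(local_vertices)
        self.send = send                    # assumed transport callback
        self.buffer_remote = buffer_remote  # False: timing 1, True: timing 2
        self.pending: Dict[str, List[Message]] = defaultdict(list)

    def on_output_message(self, msg: Message) -> None:
        is_local = msg.destination in self.local_vertices
        if is_local or not self.buffer_remote:
            self.send([msg])                           # send immediately
        else:
            self.pending[msg.destination].append(msg)  # hold for a later batch

    def on_destination_about_to_advance(self, destination: str) -> None:
        """Flush buffered messages just before the destination moves on."""
        batch = self.pending.pop(destination, [])
        if batch:
            self.send(batch)


sent: List[List[Message]] = []
processor = MessageProcessor(["v1", "v2", "v3"], sent.append, buffer_remote=True)
processor.on_output_message(Message("v2", "v3", 1, 0.5))  # local: sent at once
processor.on_output_message(Message("v2", "v4", 1, 0.5))  # remote: buffered
processor.on_output_message(Message("v3", "v4", 1, 0.7))  # remote: buffered
processor.on_destination_about_to_advance("v4")           # one batched transmission
print(len(sent), [len(batch) for batch in sent])
```

With `buffer_remote=True` the two messages bound for the remote vertex v4 travel in a single transmission; with `buffer_remote=False` (timing 1) each would be sent separately but with less delay.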

The vertex management unit 33 (distributed processing management unit) monitors the distributed processing units 20 (vertices) belonging to its own processing server 30. When a distributed processing unit 20 (vertex) completes the calculation/transmission processing f_n, that is, when phase PH1 (local computation) and phase PH2 (data exchange) are completed, the vertex management unit 33 generates a completion report for the calculation/transmission processing f_n (calculation/transmission processing completion report) and sends it to the management server 10 (master).
When the vertex management unit 33 receives a next-step shift instruction from the management server 10 (master) in response to a calculation/transmission processing completion report, it passes the instruction to the target distributed processing unit 20 (vertex).

<< Operation of the distributed synchronous processing system >>
Next, the operation of the distributed synchronous processing system 1 will be described.
FIG. 10 is a sequence diagram showing the flow of processing in the distributed synchronous processing system 1 according to the present embodiment.
The description here assumes that the management server 10 (master) has already set up the individual calculation processes (vertices) required for the target calculation and allocated them to the processing servers 30 (workers).

First, the vertex management unit 33 (distributed processing management unit) of a processing server 30 (worker), which monitors the distributed processing units 20 (vertices) belonging to that server, detects that a certain distributed processing unit 20 (vertex) has completed the calculation/transmission processing f_n (step S10). The vertex management unit 33 then sends to the management server 10 a calculation/transmission processing completion report that carries the identification number of that distributed processing unit 20 (vertex) and the step number (n) of the superstep at that time (step S11).
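Steps S10 and S11 can be sketched as follows. The report carries the two fields named in the text (vertex identification number and step number); the class names, field names, and the `send_to_master` callback are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class CompletionReport:
    vertex_id: str  # identification number of the distributed processing unit
    step: int       # superstep number n at the time f_n completed


class VertexManager:
    """Hypothetical worker-side reporter for completed vertices."""

    def __init__(self, send_to_master: Callable[[CompletionReport], None]) -> None:
        self.send_to_master = send_to_master  # assumed transport callback

    def on_vertex_completed(self, vertex_id: str, step: int) -> None:
        """Steps S10 and S11: detect completion, then report it to the master."""
        self.send_to_master(CompletionReport(vertex_id, step))


outbox: List[CompletionReport] = []
manager = VertexManager(outbox.append)
manager.on_vertex_completed("v2", 1)
print(outbox)
```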

Next, in the management server 10 (master) that has received the calculation/transmission processing completion report, the adjacent synchronization processing unit 11 determines whether the distributed processing unit 20 (vertex) indicated by the report satisfies the adjacent synchronization condition (step S12). Specifically, the adjacent synchronization processing unit 11 determines whether the condition "the calculation/transmission processing f_n of the own vertex and of all vertices connected to it by input edges has been completed" is satisfied.

If the adjacent synchronization condition is not satisfied (step S12 → No), the adjacent synchronization processing unit 11 of the management server 10 (master) waits until it receives the next calculation/transmission processing completion report from a processing server 30 (worker).

If, on the other hand, the adjacent synchronization condition is satisfied in step S12 (step S12 → Yes), the adjacent synchronization processing unit 11 of the management server 10 (master) transmits a next step shift instruction to the processing server 30 (worker) that sent the calculation/transmission processing completion report, so that the distributed processing unit 20 (vertex) in question shifts to the next superstep (step S13).

Also in step S12, the adjacent synchronization processing unit 11 of the management server 10 (master) determines whether, among the distributed processing units 20 (vertex) connected by output edges to the distributed processing unit 20 (vertex) indicated by the received completion report, there is any that is waiting in the inactive state solely because it is "waiting for an input message from that distributed processing unit 20 (vertex) (waiting to acquire the state of the input edge)". If there is such a distributed processing unit 20 (vertex), the adjacent synchronization processing unit 11 also transmits a next step shift instruction to the processing server 30 (worker) to which that distributed processing unit 20 (vertex) belongs.

The vertex management unit 33 of a processing server 30 (worker) that has received the next step shift instruction outputs it to the distributed processing unit 20 (vertex) that has completed the calculation/transmission processing f_n and to any distributed processing unit 20 (vertex) that was determined in step S12 to have been waiting in the inactive state (step S14). The numerical calculation unit 21 of each distributed processing unit 20 (vertex) that receives the instruction then executes the calculation/transmission processing f_{n+1} of the next superstep (n+1).
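The master-side check in steps S12 and S13 can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name `AdjacentSyncMaster`, its fields, and the choice to return vertex identifiers (rather than sending an instruction to the owning worker) are all assumptions made for illustration. The sketch also assumes every vertex appears as a key of `in_edges`, and that a neighbor that has run ahead to a later superstep counts as having completed f_n.

```python
from collections import defaultdict


class AdjacentSyncMaster:
    """Hypothetical model of the adjacent synchronization processing unit 11."""

    def __init__(self, in_edges):
        # in_edges: vertex id -> set of vertex ids connected by input edges.
        self.in_edges = {v: set(srcs) for v, srcs in in_edges.items()}
        # Derive output edges so waiting downstream vertices can be re-checked.
        self.out_edges = defaultdict(set)
        for v, srcs in self.in_edges.items():
            for u in srcs:
                self.out_edges[u].add(v)
        # Highest superstep whose f_n each vertex has reported complete.
        self.completed = {v: -1 for v in self.in_edges}
        # Highest superstep for which a next step shift instruction was issued.
        self.released = {v: -1 for v in self.in_edges}

    def _condition_met(self, v):
        n = self.completed[v]
        if n <= self.released[v]:
            return False  # nothing new to release for this vertex
        # Own vertex and all input-edge neighbors have completed f_n.
        return all(self.completed[u] >= n for u in self.in_edges[v])

    def on_completion_report(self, v, n):
        """Handle a report (vertex id, superstep n); return vertices to advance."""
        self.completed[v] = n
        instructed = []
        # Check the reporting vertex, then vertices on its output edges that
        # may have been waiting only for this report.
        for w in [v, *sorted(self.out_edges[v])]:
            if self._condition_met(w):
                self.released[w] = self.completed[w]
                instructed.append(w)
        return instructed
```

For a chain a → b → c, a report from b alone releases nothing; once a reports, both a and b are released in the same call, which mirrors the handling of inactive vertices described for step S12.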

As described above, the distributed synchronous processing system 1 and the distributed synchronous processing method according to the present embodiment solve the problems of the distributed synchronous processing system of the comparative example, namely the delay in processing speed of the system as a whole and the large amount of time spent waiting for synchronization without processing in phase PH3, that is, the problems of processing speed and efficiency. They also reduce the influence of a distributed processing unit 20 (vertex) whose processing is markedly slow.
In addition, the programmer does not need to consider overtaking or overwriting of processing between vertices, and can write programs for this system against a simple framework.

[Modification of the present embodiment]
Next, a modification of the distributed synchronous processing system 1 according to the present embodiment will be described.
FIG. 11 is a diagram showing the overall configuration of a distributed synchronous processing system 1A according to the modification of the present embodiment.
In the distributed synchronous processing system 1 according to the present embodiment shown in FIG. 9, the management server 10 (master) determines, for each distributed processing unit 20 (vertex), whether the adjacent synchronization condition is satisfied. In other words, it is a "master centralized type" in which the management server 10 makes the adjacent synchronization determination. In contrast, the distributed synchronous processing system 1A shown in FIG. 11 has no management server 10 (master); the determination of whether each distributed processing unit 20 (vertex) satisfies the adjacent synchronization condition is made autonomously and in a distributed manner in each processing server 30 (worker). That is, the distributed synchronous processing system 1A is characterized by performing adjacent synchronization in an autonomous distributed (master-less) manner.
Specifically, the distributed synchronous processing system 1A according to the modification omits the management server 10 (master) of the distributed synchronous processing system 1 shown in FIG. 9 and also omits the vertex management unit 33 from each processing server 30A (worker). Instead, as shown in FIG. 11, each processing server 30A includes an adjacent synchronization vertex management unit 34 (adjacent synchronization distribution management unit).
Components having the same functions as those shown in FIG. 9 are given the same names and reference signs, and their description is omitted.

The adjacent synchronization vertex management unit 34 (adjacent synchronization distribution management unit) monitors the distributed processing units 20 (vertex) belonging to it. When a distributed processing unit 20 (vertex) completes the calculation/transmission processing f_n, that is, when phase PH1 (local computation) and phase PH2 (data exchange) are complete, the unit determines whether the input messages (states of the input edges) from all distributed processing units 20 (vertex) connected by input edges have arrived. To make this determination, the adjacent synchronization vertex management unit 34 refers to the input message (incoming message) buffer of each distributed processing unit 20 (vertex) ("M_{in,n}" in FIG. 4, for the current step).
In the modification, as in the present embodiment, the supersteps of different vertices may differ at any given point in the overall processing. Messages exchanged between vertices are therefore not overwritten but managed per superstep. Accordingly, when a vertex connected by an input edge has shifted to a later superstep ahead of the receiving vertex, the receiving vertex stores the input edge state obtained from it in the input message buffer as "M_{in,n+m}".
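The per-superstep buffering described above can be sketched as a small data structure. This is an illustrative assumption, not the structure defined in FIG. 4: the class name `InputBuffer` and its methods are hypothetical, and each message is assumed to be tagged with the superstep it belongs to so that a message for step n+m never overwrites one for step n.

```python
from collections import defaultdict


class InputBuffer:
    """Hypothetical per-vertex buffer holding M_{in,n}, M_{in,n+1}, ... separately."""

    def __init__(self, in_neighbors):
        # Vertices connected to this vertex by input edges.
        self.in_neighbors = frozenset(in_neighbors)
        # superstep -> {sender id: payload}
        self._slots = defaultdict(dict)

    def store(self, step, sender, payload):
        # Messages from a neighbor that has run ahead land in a later slot.
        self._slots[step][sender] = payload

    def is_complete(self, step):
        # True once every input-edge neighbor has delivered its message for `step`.
        return self.in_neighbors.issubset(self._slots.get(step, {}))

    def take(self, step):
        # Hand the messages for `step` to the vertex and free the slot.
        return self._slots.pop(step, {})
```

Keeping one slot per superstep is what lets a fast neighbor deliver its step n+1 message before the receiving vertex has consumed step n, without any coordination between the two.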

When the input messages (states of the input edges) are all present, that is, when acquisition of the calculation results needed in the next calculation step from the adjacent distributed processing units 20 (vertex) is complete, the adjacent synchronization vertex management unit 34 treats the adjacent synchronization condition of the present embodiment, "the calculation/transmission processing f_n of the own vertex and of all vertices connected to it by input edges has been completed", as satisfied. In this case, the adjacent synchronization vertex management unit 34 outputs a "next step shift instruction" to the distributed processing unit 20 (vertex) that has completed the calculation/transmission processing f_n, so that it shifts to the next superstep (the superstep is incremented by 1).

When a distributed processing unit 20 (vertex) waiting in the inactive state receives an input message (state of the input edge) from any distributed processing unit 20 (vertex) connected to it by an input edge, the adjacent synchronization vertex management unit 34 uses that reception as a trigger to determine again whether the input messages (states of the input edges) are all present, that is, whether the adjacent synchronization condition is satisfied.

To support the adjacent synchronization determination made by the adjacent synchronization vertex management unit 34, the message processing unit 32 of the processing server 30A (worker) sends an input message to the distributed processing units 20 (vertex) connected by output edges when phase PH1 (local computation) ends, even if there is no data to send as the state of the input edge (for example, the data is "0"). For the same reason, the message processing unit 32 of the processing server 30A (worker) transmits output messages immediately, without buffering them.

<< Operation of the distributed synchronous processing system of the modification >>
Next, the operation of the distributed synchronous processing system 1A according to the modification will be described.
FIG. 12 is a flowchart showing the flow of processing in the distributed synchronous processing system 1A according to the modification of the present embodiment.
The description here assumes that the individual calculation processes (vertex) required for the target calculation processing have been set up and assigned to the processing servers 30A (worker) in advance. This setup and assignment may be carried out beforehand by, for example, providing these functions in a management server for the whole system or in a representative server among the processing servers 30A.

First, the adjacent synchronization vertex management unit 34 (adjacent synchronization distribution management unit) of a processing server 30A (worker) monitors the distributed processing units 20 (vertex) belonging to it and thereby detects that a given distributed processing unit 20 (vertex) has completed the calculation/transmission processing f_n (step S20).

Next, the adjacent synchronization vertex management unit 34 determines, for that distributed processing unit 20 (vertex), whether the input messages (states of the input edges) from all distributed processing units 20 (vertex) connected by input edges have arrived (step S21).

If the adjacent synchronization vertex management unit 34 determines that the input messages (states of the input edges) are all present (step S21 → Yes), it regards the adjacent synchronization condition "the calculation/transmission processing f_n of the own vertex and of all vertices connected to it by input edges has been completed" as satisfied and proceeds to step S24 (output of the "next step shift instruction"), described below.

If, on the other hand, the adjacent synchronization vertex management unit 34 determines that the input messages (states of the input edges) are not all present (step S21 → No), it does not output the "next step shift instruction". The distributed processing unit 20 (vertex) that has completed the calculation/transmission processing f_n therefore waits in the inactive state (step S22).

Next, the adjacent synchronization vertex management unit 34 determines whether the distributed processing unit 20 (vertex) waiting in the inactive state has received an input message (state of the input edge) from any distributed processing unit 20 (vertex) connected to it by an input edge (step S23). If the waiting distributed processing unit 20 (vertex) has not received an input message (step S23 → No), the adjacent synchronization vertex management unit 34 waits until one is received. If the waiting distributed processing unit 20 (vertex) has received an input message (step S23 → Yes), the adjacent synchronization vertex management unit 34 takes this as a trigger and returns to step S21.

When the adjacent synchronization condition is satisfied (step S21 → Yes), the adjacent synchronization vertex management unit 34 outputs, in step S24, the "next step shift instruction" so that the distributed processing unit 20 (vertex) shifts to the next superstep. The numerical calculation unit 21 of the distributed processing unit 20 (vertex) that receives the "next step shift instruction" then executes the calculation/transmission processing f_{n+1} of the next superstep (n+1).
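The flow of steps S20 to S24 can be sketched as an event-driven state machine in Python. This is a simplified illustration under stated assumptions, not the patented implementation: the class `AdjacentSyncVertexManager`, its method names, and the convention that each message carries the sender's superstep number are hypothetical, and a single manager object stands in for the managers of all workers.

```python
class AdjacentSyncVertexManager:
    """Hypothetical model of the adjacent synchronization vertex management unit 34."""

    def __init__(self, in_edges):
        # in_edges: vertex id -> set of vertex ids connected by input edges.
        self.in_edges = {v: set(srcs) for v, srcs in in_edges.items()}
        self.step = {v: 0 for v in self.in_edges}       # current superstep n
        self.received = {v: {} for v in self.in_edges}  # v -> {step: senders}
        self.inactive = set()                           # vertices waiting (S22)

    def on_computation_done(self, v):
        # S20: f_n of vertex v has completed; go to the S21 check.
        return self._check(v)

    def on_message(self, v, sender, step):
        # Record the input edge state without overwriting other supersteps.
        self.received[v].setdefault(step, set()).add(sender)
        # S23: a message arriving at a waiting vertex triggers a re-check.
        if v in self.inactive:
            return self._check(v)
        return False

    def _check(self, v):
        # S21: have all input-edge neighbors delivered their step-n message?
        n = self.step[v]
        if self.in_edges[v] <= self.received[v].get(n, set()):
            # S24: issue the next step shift instruction (superstep + 1).
            self.inactive.discard(v)
            self.received[v].pop(n, None)
            self.step[v] = n + 1
            return True
        # S22: wait in the inactive state.
        self.inactive.add(v)
        return False
```

Each call returns `True` when the vertex is advanced. With two vertices a and b that feed each other, a finishing first leaves it inactive, and the later arrival of b's step-0 message is what advances a, with no master involved.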

As described above, the distributed synchronous processing system 1A and the distributed synchronous processing method according to the modification of the present embodiment provide, in addition to the effects of the distributed synchronous processing system 1 according to the present embodiment, the benefit of the autonomous distributed type: the bottleneck of the management server 10 (master) is avoided, and processing speed and efficiency can be maintained even for a large graph G.

1, 1A Distributed synchronous processing system
10 Management server (master)
11 Adjacent synchronization processing unit
20 Distributed processing unit (vertex)
21 Numerical calculation unit
22 Message transmission/reception unit
30, 30A Processing server (worker)
31 Virtualization control unit
32 Message processing unit
33 Vertex management unit (distributed processing management unit)
34 Adjacent synchronization vertex management unit (adjacent synchronization distribution management unit)

Claims

A distributed synchronous processing system comprising: a plurality of processing servers that perform processing in parallel; a plurality of distributed processing units that operate on the processing servers; and a management server that assigns the plurality of distributed processing units required for a target calculation processing to the plurality of processing servers, wherein
each processing server comprises
a distributed processing management unit that detects completion of calculation/transmission processing, which denotes the calculation processing in a predetermined calculation step by a distributed processing unit and the transmission processing of the calculation result to the distributed processing units connected as output destinations, generates a completion report indicating that the calculation/transmission processing has been completed, and transmits the completion report to the management server, and
that receives from the management server a next step shift instruction, which is an instruction to shift to the next calculation step, and outputs it to the distributed processing unit that has completed the calculation/transmission processing, and
the management server comprises
an adjacent synchronization processing unit that receives the completion report, determines whether the distributed processing unit that has completed the calculation/transmission processing has finished acquiring the calculation results needed in the next calculation step, based on whether the completion reports from the distributed processing units connected as input sources of the calculation results have been received, and, when acquisition of the calculation results is complete, transmits the next step shift instruction to the processing server that sent the completion report.

A distributed synchronous processing system comprising: a plurality of processing servers that perform processing in parallel; and a plurality of distributed processing units that operate on the processing servers, wherein
each processing server comprises
an adjacent synchronization distribution management unit that detects completion of calculation/transmission processing, which denotes the calculation processing in a predetermined calculation step by a distributed processing unit and the transmission processing of the calculation result to the distributed processing units connected as output destinations,
determines whether the distributed processing unit that has completed the calculation/transmission processing has finished acquiring, from the distributed processing units connected as input sources of the calculation results, the calculation results needed in the next calculation step, and, when acquisition of the calculation results is complete, outputs a next step shift instruction, which is an instruction to shift to the next calculation step, to the distributed processing unit that has completed the calculation/transmission processing.

A distributed synchronous processing method for a distributed synchronous processing system comprising: a plurality of processing servers that perform processing in parallel; a plurality of distributed processing units that operate on the processing servers; and a management server that assigns the plurality of distributed processing units required for a target calculation processing to the plurality of processing servers, wherein
each processing server executes
a procedure of detecting completion of calculation/transmission processing, which denotes the calculation processing in a predetermined calculation step by a distributed processing unit and the transmission processing of the calculation result to the distributed processing units connected as output destinations, generating a completion report indicating that the calculation/transmission processing has been completed, and transmitting the completion report to the management server, and
a procedure of receiving from the management server a next step shift instruction, which is an instruction to shift to the next calculation step, and outputting it to the distributed processing unit that has completed the calculation/transmission processing, and
the management server executes
a procedure of receiving the completion report, determining whether the distributed processing unit that has completed the calculation/transmission processing has finished acquiring the calculation results needed in the next calculation step, based on whether the completion reports from the distributed processing units connected as input sources of the calculation results have been received, and, when acquisition of the calculation results is complete, transmitting the next step shift instruction to the processing server that sent the completion report.

A distributed synchronous processing method for a distributed synchronous processing system comprising: a plurality of processing servers that perform processing in parallel; and a plurality of distributed processing units that operate on the processing servers, wherein
each processing server executes
a procedure of detecting completion of calculation/transmission processing, which denotes the calculation processing in a predetermined calculation step by a distributed processing unit and the transmission processing of the calculation result to the distributed processing units connected as output destinations, and
a procedure of determining whether the distributed processing unit that has completed the calculation/transmission processing has finished acquiring, from the distributed processing units connected as input sources of the calculation results, the calculation results needed in the next calculation step, and, when acquisition of the calculation results is complete, outputting a next step shift instruction, which is an instruction to shift to the next calculation step, to the distributed processing unit that has completed the calculation/transmission processing.