JP5826390B2

JP5826390B2 - Transfer method and graph processing system

Info

Publication number: JP5826390B2
Application number: JP2014523509A
Authority: JP
Inventors: 雅士高田; 泰幸工藤; 加藤　猛; 猛加藤
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2012-07-06
Filing date: 2012-07-06
Publication date: 2015-12-02
Anticipated expiration: 2032-07-06
Also published as: WO2014006735A1; JPWO2014006735A1

Description

The present invention relates to graph processing using a plurality of computation nodes, and more particularly to the transfer of graph information between computation nodes.

One information compression technique for graph processing is described in Non-Patent Document 1. It compresses information by representing graph structure data with three vectors: a vector A storing the non-zero values, a vector B storing the column numbers of the non-zero entries, and a vector C storing, for each row, the position in vector A of that row's first non-zero entry.

Richard Barrett and nine others, "Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods," Society for Industrial and Applied Mathematics (USA), 1994, pp. 64-65

The technique described in Non-Patent Document 1 can compress graph structure data, but it cannot compress the information that propagates between graph vertices, which is the intermediate data produced during graph processing. The inventors found the following problem in graph processing using a plurality of computation nodes: the larger the graph to be processed, the more intermediate data is transferred between computation nodes, so the time required to transfer the intermediate data grows, which in turn slows the graph processing as a whole.

Accordingly, an object of the present invention is to transfer the intermediate data of graph processing efficiently.

The present invention solves the above problem as follows. In graph processing by a plurality of computation nodes, intermediate data comprising a pair of source graph vertex information and destination graph vertex information is transferred by providing each computation node with a transmission buffer that accumulates the intermediate data; sorting the order of the intermediate data within the accumulated group based on whichever of the source graph vertex information and the destination graph vertex information yields the higher compression ratio after sorting; compressing the accumulated group of intermediate data after the sort; and transferring the group of intermediate data.

According to the present invention, intermediate data can be transferred efficiently when graph processing is performed by a plurality of computation nodes.

FIG. 1 is a diagram showing a graph processing system according to an embodiment of the present invention.
FIG. 2 is a diagram showing the system configuration of the Ap server of FIG. 1.
FIG. 3 is a block diagram showing the internal configuration of the server device of FIG. 2.
FIG. 4 is a diagram showing the transmission buffer, the reception buffer, and other areas prepared on the memory device of FIG. 3.
FIG. 5 is a diagram showing an example of a graph to be processed.
FIG. 6 is a diagram expressing the graph of FIG. 5 in matrix format.
FIG. 7 is a diagram expressing the graph of FIG. 5 in CSR format.
FIG. 8 is a diagram showing the structure of the information transferred between vertices.
FIG. 9 is a diagram showing an example of vertex assignment information.
FIG. 10 is a flowchart showing an example of graph processing.
FIG. 11 is a flowchart showing details of the compression processing of FIG. 10.
FIG. 12 is a flowchart showing details of the decompression processing of FIG. 10.
FIG. 13 is an explanatory diagram showing an example of which server processes each vertex of the graph of FIG. 5.
FIG. 14 is a diagram showing the propagation of information between vertices for FIG. 13.
FIG. 15(a) is an explanatory diagram showing an example, before sorting, of the information transferred from the server device 420 to the server device 430.
FIG. 15(b) is an explanatory diagram showing an example, after sorting, of the information transferred from the server device 420 to the server device 430.
FIG. 15(c) is an explanatory diagram showing an example, after compression, of the information transferred from the server device 420 to the server device 430.

FIG. 1 schematically shows the configuration of a graph processing system 10 according to an embodiment of the present invention. As shown in FIG. 1, the graph processing system 10 includes a client 100, such as a PC or a mobile terminal, that can communicate via the Internet 200; a web server (Web server) 300 that accepts requests from the client 100 via the Internet 200; an application server (Ap server) 400 that performs graph analysis in response to requests from the Web server 300; and a database server (DB server) 500 that accesses a database. The network connecting the client 100 and the Web server 300 is not limited to the Internet 200 and may be, for example, a LAN. This embodiment describes a three-tier web system comprising the Web server 300, the Ap server 400, and the DB server 500, but the present invention is not limited to this configuration. For example, a two-tier configuration consisting of the DB server 500 and a Web server 300 that also performs the graph analysis may be used.

FIG. 2 shows the system configuration of the Ap server 400 of FIG. 1. As shown in FIG. 2, the Ap server 400 includes server devices 420, 430, 440, and 450 and a network device 410 that connects them. Each of the server devices 420 to 450 serves as a computation node in the graph processing. Because the number of server devices is determined by their performance and by the task load, the configuration is not limited to four devices and may use any other number. The server devices 420 to 450 run in parallel and communicate with one another while executing graph processing such as solving a shortest path problem. Each of the server devices 420 to 450 is given server identification information (a server ID). In this embodiment, three server devices are used for the graph processing: the server ID of the server device 420 is "1", that of the server device 430 is "2", and that of the server device 440 is "3".

FIG. 3 is a block diagram showing the internal configuration of the server device 420 shown in FIG. 2. The internal configurations of the server devices 430, 440, and 450 need only provide functions equivalent to those of the server device 420; for example, the servers may come from different manufacturers or differ in performance. The following description takes the server device 420 as the representative.

The server device 420 includes a central processing unit (CPU) 600, a memory device 610, a storage device 620, an input device 630, an output device 640, a network interface (I/F) 650, and a bus 660. Data transfer between the components of the server device 420 takes place mainly over the bus 660. The CPU 600 uses the memory device 610 to execute the graph analysis program and controls the overall operation of the server device 420. The memory device 610 is a primary storage device such as SDRAM and holds the instructions and data the CPU 600 needs when executing a program. The storage device 620 is a secondary storage device such as an HDD or SSD; it holds programs and data over the long term and is also used as the swap area of the memory device 610. The input device 630 is a mouse, a keyboard, or the like, and the output device 640 is a display device, a speaker, or the like. The network interface 650 is used for communication with other server devices; InfiniBand or the like can be used.

FIG. 4 shows the areas secured in the memory region of the memory device 610: a transmission buffer 611, a reception buffer 612, a processing queue 613, a graph calculation module 614, vertex assignment information 615, a Next queue 616 that accumulates the processing for the next round, a compression module 617, and a decompression module 618. The transmission buffer 611 is a memory area that temporarily holds data to be sent when server devices communicate. The reception buffer 612 is a memory area that temporarily holds data received when server devices communicate. The intermediate data of graph processing transferred between server devices is held temporarily in the transmission buffer 611 and the reception buffer 612. Data accumulated in the transmission buffer 611 is transferred to the network interface 650 on a transmission instruction from the CPU 600 and then sent to another server device via the network device 410. The transmitted data is received by the network interface 650 of the other server device and then stored in that server device's reception buffer 612. The processing queue 613 is a memory area that temporarily holds the data that can be computed in a given phase of the graph processing, and the Next queue 616 is a memory area that temporarily holds the data that can be computed in the following phase. The graph calculation module 614 is a program module that performs graph processing computations such as shortest path search. The compression module 617, described later, is a program module that compresses the intermediate data of graph processing transferred between server devices. The decompression module 618, described later, is a program module that decompresses that intermediate data. The vertex assignment information 615, described later, is information indicating how the graph processing is assigned to the plurality of server devices.

FIG. 5 shows a graph used as a concrete example to explain the operation of the graph processing system 10 of this embodiment. The circles in FIG. 5 represent the vertices of the graph, and the number inside each circle is the vertex identification number (hereinafter, vertex number). The lines between vertices are the edges of the graph. Graphs are well suited to expressing relationships among many kinds of things. For example, if the vertices are regarded as stations or intersections, the edges represent railway lines or roads; if the vertices are regarded as people or companies, the edges represent the relationships between those people or companies. Each edge carries a weight describing the relationship between its vertices: time or distance in the former example, and the strength of the tie in the latter.

FIGS. 6 and 7 express the graph of FIG. 5 as graph structure data; both show the same graph structure as FIG. 5. FIG. 6 expresses the graph of FIG. 5 in matrix format, and FIG. 7 expresses it in CSR (Compressed Sparse Row) format. The values in the table of FIG. 6 are edge weights, and an entry of 0 means that no edge exists (for some problems, such as the shortest path problem, it is more convenient to express this as ∞). The CSR format of FIG. 7 consists of values, which stores the non-zero values; columns, which stores the column numbers of the non-zero entries; and rowindex, which stores, for each row, the position in values of that row's first non-zero entry. The CSR format is well suited to expressing a sparse matrix such as the one in this figure. The matrix format needs n × m storage to represent an n × m matrix, whereas the CSR format needs only 2 × l + n + 1, where l is the number of non-zero entries. Graphs with the scale-free property, of which large-scale "big graphs" are typical, yield sparse matrices, so graph structure data is generally stored in a compressed storage format such as the CSR format or the CSC (Compressed Sparse Column) format, which is the CSR format with rows and columns interchanged.
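The CSR layout described above can be sketched in a few lines. This is only an illustration under stated assumptions: the 4 × 4 matrix is a made-up example rather than the matrix of FIG. 6, indices are 0-based, and the function names `dense_to_csr` and `neighbors` are hypothetical.

```python
def dense_to_csr(matrix):
    """Convert a dense weight matrix (list of rows) into CSR arrays.

    Returns (values, columns, rowindex), where rowindex[i] is the position
    in values of the first non-zero entry of row i, and the final element
    is the total number of non-zero entries.
    """
    values, columns, rowindex = [], [], [0]
    for row in matrix:
        for col, weight in enumerate(row):
            if weight != 0:          # 0 means "no edge"
                values.append(weight)
                columns.append(col)
        rowindex.append(len(values))
    return values, columns, rowindex


def neighbors(csr, row):
    """Return the (column, weight) pairs stored for one row."""
    values, columns, rowindex = csr
    start, end = rowindex[row], rowindex[row + 1]
    return list(zip(columns[start:end], values[start:end]))


# Hypothetical 4 x 4 weight matrix with l = 4 non-zero entries.
example = [
    [0, 5, 0, 0],
    [0, 0, 3, 2],
    [0, 0, 0, 0],
    [7, 0, 0, 0],
]
csr = dense_to_csr(example)
print(csr)                # ([5, 3, 2, 7], [1, 2, 3, 0], [0, 1, 3, 3, 4])
print(neighbors(csr, 1))  # [(2, 3), (3, 2)]
```

For this example the three arrays hold 4 + 4 + 5 = 13 elements, which matches 2 × l + n + 1 with l = 4 and n = 4, against 16 elements for the dense form.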

FIG. 8 shows the information transferred between vertices. Each row is a tuple consisting of the vertex number (target) of the destination vertex (target vertex), the vertex number (source) of the origin vertex (source vertex), and the data being sent (data). The figure shows, as an example, the information transferred from vertex 3 to vertices 7, 8, and 9. The relationship between target and the destination server device, and between source and the originating server device, is stored in the memory device 610 of each server device as the vertex assignment information 615.

FIG. 9 shows the vertex assignment information 615 of this embodiment. Each entry of the vertex assignment information 615 is a pair consisting of a vertex number and the server ID of the server device to which the computation for that vertex is assigned. The vertex assignment information 615 can be generated when the vertices are assigned to the server devices. In this embodiment, groups of vertices are placed in order of increasing vertex number onto servers in ascending order of server ID, but the placement is not limited to this; the present invention can be applied as long as the vertices are distributed across the servers.
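As a minimal sketch, the vertex assignment information can be modeled as a lookup table from vertex number to server ID. The vertex ranges below follow the grouping this embodiment describes for FIG. 13 (vertices 1 to 10 on server ID 1, 11 to 15 on server ID 2, 16 to 25 on server ID 3); the names `build_vertex_assignment` and `server_of` are hypothetical, and the dictionary is only one possible representation.

```python
def build_vertex_assignment(groups):
    """Build {vertex number: server ID} entries from per-server vertex groups."""
    assignment = {}
    for server_id, vertices in groups.items():
        for vertex in vertices:
            assignment[vertex] = server_id
    return assignment


# Ranges taken from the embodiment's grouping; the table form is an assumption.
VERTEX_ASSIGNMENT = build_vertex_assignment({
    1: range(1, 11),    # server device 420
    2: range(11, 16),   # server device 430
    3: range(16, 26),   # server device 440
})


def server_of(vertex):
    """Return the server ID that computes the given vertex."""
    return VERTEX_ASSIGNMENT[vertex]


print(server_of(3), server_of(11), server_of(25))  # 1 2 3
```

Because any placement that distributes the vertices is acceptable, a table lookup like this works equally well for non-contiguous assignments.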

FIG. 10 is a flowchart showing an example of graph processing by the graph processing system 10 of this embodiment. The description assumes that the graph structure data of FIG. 7 is stored in the storage device 620 as the processing target, that it has been transferred to the memory device 610 of each server device, and that the vertices each server device computes have been assigned as shown in FIG. 9. The assignment of vertices to each server device is stored in the memory device 610 of each server device as the vertex assignment information 615. The description also assumes that the vertex serving as the starting point of the graph analysis is assigned to the server device 420.

First, the graph calculation module 614 of the server device 420 selects the vertex that is the starting point of the graph analysis as the target vertex (steps S100, S101). Next, the graph calculation module 614 of the server device 420 creates a tuple of source vertex and data for the selected target vertex and puts it into the processing queue 613 (step S102). Because the target vertex is the starting point, the source vertex and data are dummy values (for example, source = target, data = Z).

In step S103, the graph calculation module 614 of each server device determines whether that server device's processing queue 613 is empty. If it is empty, the server device proceeds to step S108; otherwise it proceeds to step S104. In step S104, the graph calculation module 614 of each server device takes a (target, source, data) tuple from its processing queue 613 and performs the computation for the target vertex.

Next, in step S105, the graph calculation module 614 of each server device selects the target vertex taken from the processing queue 613 as the next source vertex, refers to the graph structure data of FIG. 7 in the memory device 610, and generates the intermediate data of the graph processing: tuples of the source vertex, a target vertex read from the graph structure data, and data. In step S106, the graph calculation module 614 of each server device puts each tuple generated in step S105 into the Next queue 616 if the server device itself processes the tuple's target vertex, and into the transmission buffer 611 if another server processes it. The transmission buffer 611 is preferably divided per server device that processes the target vertices.
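The routing decision of step S106 can be sketched as follows. This is an illustrative sketch, not the patented implementation: the function name `route`, the list and dictionary containers, and the inline `server_of` lookup are assumptions, and the sample tuples mirror the ones the embodiment later generates for vertices 7, 8, and 9.

```python
from collections import defaultdict


def route(tuples, my_server_id, server_of):
    """Split generated (target, source, data) tuples by who processes the target.

    Tuples whose target is local go to the Next queue; the others go to a
    transmission buffer kept separately for each destination server.
    """
    next_queue = []
    send_buffers = defaultdict(list)
    for target, source, data in tuples:
        destination = server_of(target)
        if destination == my_server_id:
            next_queue.append((target, source, data))
        else:
            send_buffers[destination].append((target, source, data))
    return next_queue, dict(send_buffers)


def server_of(vertex):
    # Assumed assignment: vertices 1-10 -> ID 1, 11-15 -> ID 2, 16-25 -> ID 3.
    return 1 if vertex <= 10 else 2 if vertex <= 15 else 3


generated = [
    (1, 7, "a"), (2, 7, "b"), (11, 7, "i"),
    (11, 8, "j"),
    (4, 9, "f"), (12, 9, "l"), (11, 9, "k"),
]
next_queue, send_buffers = route(generated, my_server_id=1, server_of=server_of)
print(next_queue)    # [(1, 7, 'a'), (2, 7, 'b'), (4, 9, 'f')]
print(send_buffers)  # {2: [(11, 7, 'i'), (11, 8, 'j'), (12, 9, 'l'), (11, 9, 'k')]}
```

Keeping one buffer per destination means each buffer can later be sorted and compressed as a unit before it is sent to a single server.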

In step S107, so that the information to be sent can be batched and transferred efficiently, each server device determines whether the amount of information stored in the transmission buffer 611 has exceeded a preset transmission reference size. If the reference size has been exceeded, the server device proceeds to step S200 to start transmission; if not, it returns to step S103. In step S108, each server device determines whether unsent information remains in the transmission buffer 611 now that the processing queue 613 is empty. If the transmission buffer 611 is not empty, the server device proceeds to step S200 to send the unsent information; if the transmission buffer is empty, there is nothing left to send, so the server device proceeds to step S300 to start reception.

FIG. 11 shows steps S201 to S205, which are the detailed steps of step S200. First, in step S200, the compression module 617 of each server device counts, for each of the target vertex, source vertex, and data columns in the transmission buffer 611, the number of values that are shared (steps S201, S202). The compression module 617 then sorts the data in the transmission buffer 611 using as the key the column that would give the highest compression ratio if run-length encoding were applied with the shared values of each column made consecutive (step S203). The sort preserves the correspondence within each (target vertex, source vertex, data) tuple. Sorting the order of the tuples within the group of (target vertex, source vertex, data) tuples accumulated in the transmission buffer 611 as intermediate data makes efficient compression of the transmitted data possible, which enables efficient data transfer and in turn speeds up the graph processing. Using as the key whichever of the target vertex, source vertex, and data columns gives the highest compression ratio makes the compression more efficient still. Alternatively, the shared values may be counted only for the target vertex and source vertex columns, and the sort may use whichever of those two columns gives the higher compression ratio as the key. For example, in graph processing where the edges carry no weights, data need not be transmitted.
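The key selection and sort of steps S201 to S203 might look like the sketch below. The cost model is an assumption made for illustration: a value that repeats is charged two words (value plus repeat count) and a unique value one word, and the column with the smallest total is chosen, with ties going to the earlier column. The names `encoded_words`, `choose_key`, and `sort_by_key` are hypothetical.

```python
from collections import Counter

COLUMNS = ("target", "source", "data")


def encoded_words(column_values):
    """Words needed for one column once equal values are made consecutive.

    Assumed cost model: a repeated value costs 2 words (value + count),
    a value that occurs once costs 1 word.
    """
    counts = Counter(column_values)
    return sum(2 if count > 1 else 1 for count in counts.values())


def choose_key(tuples):
    """Return the index of the column that compresses best under the model."""
    sizes = [encoded_words([t[i] for t in tuples]) for i in range(len(COLUMNS))]
    return min(range(len(COLUMNS)), key=lambda i: sizes[i])


def sort_by_key(tuples, key_index):
    """Sort whole tuples by the key column, keeping each tuple intact."""
    return sorted(tuples, key=lambda t: t[key_index])


send_buffer = [(11, 7, "i"), (11, 8, "j"), (12, 9, "l"), (11, 9, "k")]
key_index = choose_key(send_buffer)
print(COLUMNS[key_index])                   # target
print(sort_by_key(send_buffer, key_index))
# [(11, 7, 'i'), (11, 8, 'j'), (11, 9, 'k'), (12, 9, 'l')]
```

Sorting whole tuples rather than individual columns is what preserves the target, source, data correspondence that the text requires.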

After the sort, the compression module 617 of each server device applies run-length encoding to the column used as the key (step S204). Because the data has been sorted beforehand, equal values are consecutive, which raises the compression ratio of the run-length encoding. In addition, so that the decompression module 618 can tell at decoding time which column was used as the key for compression, the compression module 617 of each server device prepends information identifying the key column to the transmitted information. In step S109, which follows step S200, each server device refers to the vertex assignment information 615 and sends the data processed in step S200 to the server device to which the computation of the target vertices is assigned.
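A minimal sketch of step S204 follows, assuming the tuples are already sorted by the key column. The message layout (a dictionary with `key`, `runs`, and `rest` fields) is a hypothetical stand-in for the wire format, which the text does not specify beyond a leading indication of the key column; `rle_encode` and `compress` are illustrative names.

```python
def rle_encode(values):
    """Run-length encode a sequence into (value, run length) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [(value, length) for value, length in runs]


def compress(sorted_tuples, key_index):
    """Run-length encode only the key column; other columns stay as they are.

    The "key" field plays the role of the information prepended to the
    transmitted data so the receiver knows which column to decode.
    """
    key_runs = rle_encode([t[key_index] for t in sorted_tuples])
    rest = [tuple(x for i, x in enumerate(t) if i != key_index)
            for t in sorted_tuples]
    return {"key": key_index, "runs": key_runs, "rest": rest}


sorted_buffer = [(11, 7, "i"), (11, 8, "j"), (11, 9, "k"), (12, 9, "l")]
message = compress(sorted_buffer, key_index=0)
print(message["runs"])  # [(11, 3), (12, 1)]
print(message["rest"])  # [(7, 'i'), (8, 'j'), (9, 'k'), (9, 'l')]
```

Without the preceding sort, the three occurrences of vertex 11 would be split into two runs and the encoding would save nothing.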

FIG. 12 shows steps S301 to S306, which are the detailed steps of step S300. In the decompression processing of step S300, the decompression module 618 of each server device first takes information from the reception buffer 612 and identifies, from the information at its head, which column must be decoded (steps S301, S302). The decompression module 618 applies run-length decoding to the identified column and restores the original information (step S303). In step S304, the decompression module 618 determines whether the identified column is the target vertex column. If it is not, the decompression module 618 sorts the data using the target vertex as the key (step S305). This sort also preserves the correspondence within each (target vertex, source vertex, data) tuple. In this way, the order of the intermediate data is sorted within the group of intermediate data that arrived in the reception buffer 612. If the identified column is the target vertex column, the data was already sorted before transmission, so no sort is performed and the processing ends (S306).
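The receiving side (steps S301 to S305) can be sketched as below. The message layout is a hypothetical one chosen for illustration: `key` is the index of the run-length encoded column (0 = target, 1 = source, 2 = data), `runs` holds (value, run length) pairs for that column, and `rest` holds the remaining fields of each tuple in order.

```python
def decompress(message):
    """Rebuild (target, source, data) tuples and order them by target."""
    key_index = message["key"]
    # Run-length decoding of the key column.
    keys = [value for value, length in message["runs"] for _ in range(length)]
    tuples = []
    for key, rest in zip(keys, message["rest"]):
        fields = list(rest)
        fields.insert(key_index, key)   # put the key back in its column
        tuples.append(tuple(fields))
    if key_index != 0:
        # Key was not target, so the data is not yet in target order.
        tuples.sort(key=lambda t: t[0])
    return tuples


by_target = {"key": 0, "runs": [(11, 3), (12, 1)],
             "rest": [(7, "i"), (8, "j"), (9, "k"), (9, "l")]}
print(decompress(by_target))
# [(11, 7, 'i'), (11, 8, 'j'), (11, 9, 'k'), (12, 9, 'l')]

by_source = {"key": 1, "runs": [(9, 2), (7, 1)],
             "rest": [(12, "l"), (11, "k"), (11, "i")]}
print(decompress(by_source))
# [(11, 9, 'k'), (11, 7, 'i'), (12, 9, 'l')]
```

When the sender already used target as the key, the extra sort is skipped, which matches the early exit described for the flowchart.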

In step S110, each server device takes the information sent from other server devices out of the reception buffer 612 and puts it into the Next queue 616. If there is no information to receive, the server device does nothing in step S110. Although not described here, the server devices communicate with one another separately to determine whether there is information to be received. Then, in step S111, each server device determines whether the Next queue 616 is empty on all server devices performing the graph processing. Each server device can make this determination because every server device, when its own Next queue 616 becomes empty, notifies the other server devices of that fact. If the Next queue 616 is not empty, each server device moves the information from the Next queue 616 to the processing queue 613 (step S112) and returns to step S103. If the Next queue 616 is empty on all server devices, the processing for all vertices is complete (step S113).

As a result of the compression and decompression processing of steps S200 and S300, received information enters the processing queue 613 sorted with the target vertex as the key. Because the order of storage in the processing queue 613 is the order in which each server device computes the vertices, keeping the entries in target vertex order means that the processing for the same target vertex runs consecutively. Consecutive processing lowers the probability that intermediate data is evicted from the cache of the CPU 600 to the memory device 610 or swapped from the memory device 610 to the storage device 620. The compression and decompression operations are described below using a concrete example.

FIG. 13 shows an example of which server device processes each vertex of the graph of FIG. 5. In this figure, vertices 1 to 10 form group 700, vertices 11 to 15 form group 710, and vertices 16 to 25 form group 720. In this embodiment, group 700 is processed by the server device 420, group 710 by the server device 430, and group 720 by the server device 440.

FIG. 14 uses arrows to show the flow of information between vertices when graph analysis is performed on the graph of FIG. 13 with vertex 3 as the starting point. The processing flow of this analysis is as follows. First, the graph operation module 614 of server device 420 selects vertex 3 as the target vertex and puts (target, source, data) = (3, 3, Z) into the processing queue 613; here source and data are dummy values. The graph operation module 614 of server device 420 takes (target, source, data) = (3, 3, Z) out of the processing queue 613 and executes the operation on vertex 3. It then refers to the graph structure data with vertex 3 as the source vertex and finds that the target vertices to be processed next are vertices 7, 8, and 9. Because vertices 7, 8, and 9 all belong to group 700, the graph operation module 614 of server device 420 puts (target, source, data) = (7, 3, c), (8, 3, d), (9, 3, e) into its own Next queue 616. Since there is no information to send or receive at this point, the graph operation module 614 of server device 420 moves the contents of the Next queue 616 to the processing queue 613, executes the operations on vertices 7, 8, and 9, and refers to the graph structure data. According to the group to which each next vertex belongs, the graph operation module 614 of server device 420 puts (target, source, data) = (1, 7, a), (2, 7, b) into the Next queue 616 and (11, 7, i) into the transmission buffer 611 for vertex 7; (11, 8, j) into the transmission buffer 611 for vertex 8; and (4, 9, f) into the Next queue 616 and (12, 9, l), (11, 9, k) into the transmission buffer 611 for vertex 9. Because the processing queue 613 is now empty, the compression module 617 of server device 420 compresses the contents of the transmission buffer 611 and transfers them.
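
The routing between the Next queue and the transmission buffer in this walkthrough can be traced with a short simulation. This is a sketch under stated assumptions, not the patented implementation: the edge table `EDGES` contains only the edges named in the walkthrough, and `run_step`, `LOCAL_VERTICES`, and the variable names are hypothetical.

```python
from collections import deque

LOCAL_VERTICES = set(range(1, 11))  # group 700, handled by server device 420

# Only the edges named in the walkthrough: source -> {target: data label}.
EDGES = {
    3: {7: "c", 8: "d", 9: "e"},
    7: {1: "a", 2: "b", 11: "i"},
    8: {11: "j"},
    9: {4: "f", 12: "l", 11: "k"},
}


def run_step(processing_queue, next_queue, send_buffer):
    """Drain the processing queue and route every outgoing tuple."""
    while processing_queue:
        vertex, _source, _data = processing_queue.popleft()
        # The vertex just processed becomes the source of its outgoing tuples.
        for target, data in EDGES.get(vertex, {}).items():
            item = (target, vertex, data)
            if target in LOCAL_VERTICES:
                next_queue.append(item)   # same server: Next queue
            else:
                send_buffer.append(item)  # other server: transmission buffer


processing = deque([(3, 3, "Z")])  # start vertex with dummy source and data
next_q, send_buf = deque(), []
run_step(processing, next_q, send_buf)   # processes vertex 3
processing, next_q = next_q, deque()     # move Next queue to processing queue
run_step(processing, next_q, send_buf)   # processes vertices 7, 8, 9
```

After the second step, `send_buf` holds the four tuples destined for server device 430, in the order they were generated, and `next_q` holds the three tuples that stay on server device 420.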

FIG. 15(a) shows the contents of the transmission buffer 611 before sorting, FIG. 15(b) shows them after sorting, and FIG. 15(c) shows them after sorting and compression. Before transmission starts, the compression module 617 of server device 420 considers each column of (target, source, data) in turn, sorting on that column and applying run-length encoding, and selects the column that gives the highest compression rate. Here, target has three entries in common (vertex 11), source has two entries in common (vertex 9), and data has no common values, so compressing target in the row direction is selected. The compression module 617 of server device 420 then sorts the rows (FIG. 15(b)) and applies run-length encoding to target (FIG. 15(c)). FIG. 15(c) writes the three entries sharing vertex 11 as 11x3; the repetition count can be represented, for example, as a number whose most significant bit is 1. When target, source, data, and the repetition count are each represented by a 4-byte variable, the size is 48 bytes before compression (four tuples of three 4-byte values) and 44 bytes after compression (eleven 4-byte values). In addition, 1 byte indicating that target was compressed is added at the head of the transfer information, and the resulting 45 bytes are sent to server device 430. Server device 430 decompresses the received information and puts the decompressed data into its Next queue 616. In server devices 420 and 430, the (target, source, data) information is moved from the Next queue 616 to the processing queue 613, and processing proceeds to further vertices. The rest is a repetition of the processing described above for each vertex, so its description is omitted.
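
The column selection and the byte counts above can be checked with a minimal sketch. It assumes that every value occupies one 4-byte word, that a run of equal values is stored as the value followed by a repeat count with its most significant bit set, and that vertex IDs are smaller than 2^31; the function names `rle` and `compress` and the in-memory layout are hypothetical, not the format defined by the specification.

```python
from itertools import groupby

WORD_BYTES = 4            # size of target, source, data and repeat count in the text
REPEAT_FLAG = 0x80000000  # most significant bit set marks a repeat count


def rle(values):
    """Run-length encode an already sorted column."""
    out = []
    for value, run in groupby(values):
        count = len(list(run))
        out.append(value)
        if count > 1:
            out.append(REPEAT_FLAG | count)
    return out


def compress(rows):
    """Sort rows on the column that encodes shortest, then encode that column."""
    def cost(col):
        return len(rle(sorted(row[col] for row in rows)))

    key_col = min(range(3), key=cost)  # 0 = target, 1 = source, 2 = data
    ordered = sorted(rows, key=lambda row: row[key_col])
    encoded = rle([row[key_col] for row in ordered])
    others = [[row[c] for row in ordered] for c in range(3) if c != key_col]
    words = len(encoded) + sum(len(col) for col in others)
    size = words * WORD_BYTES + 1  # +1 byte header naming the compressed column
    return key_col, encoded, others, size


send_buffer = [(11, 7, "i"), (11, 8, "j"), (12, 9, "l"), (11, 9, "k")]
key_col, encoded, others, size = compress(send_buffer)
```

For the four tuples of FIG. 15(a), the sketch selects the target column, stores it as three words, and arrives at the same 45 bytes as the text, against 48 bytes for the uncompressed tuples.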

As described above, the operation on each vertex is performed in the order in which target vertices are taken out of the processing queue 613. If operations on the same target vertex are not performed consecutively, operations on other target vertices can cause intermediate data to be evicted from the cache to the memory device, or swapped out from the memory device to the storage device, which degrades processing performance. For this reason, when compression was performed on source or data, the receiving side decompresses the data and then sorts it by target.
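
The receiving-side flow can be sketched as follows. The sketch assumes a hypothetical message layout consisting of the index of the compressed column, that column in run-length encoded form (repeat counts flagged by the most significant bit), and the two remaining columns uncompressed; `unrle` and `receive` are illustrative names, not part of the specification.

```python
REPEAT_FLAG = 0x80000000  # most significant bit set marks a repeat count


def unrle(encoded):
    """Expand a run-length encoded column back to one value per row."""
    out = []
    for word in encoded:
        if isinstance(word, int) and word & REPEAT_FLAG:
            repeat = word & ~REPEAT_FLAG
            out.extend([out[-1]] * (repeat - 1))  # first copy is already in out
        else:
            out.append(word)
    return out


def receive(key_col, encoded, others):
    """Rebuild (target, source, data) rows and re-sort by target if needed."""
    columns = list(others)
    columns.insert(key_col, unrle(encoded))  # put the expanded column back in place
    rows = list(zip(*columns))
    if key_col != 0:
        # Compressed on source or data: group equal targets together again.
        rows.sort(key=lambda row: row[0])
    return rows


# A message that was compressed on the source column (index 1).
rows = receive(1, [7, 8, 9, REPEAT_FLAG | 2],
               [[11, 11, 12, 11], ["i", "j", "l", "k"]])
```

Because the sort is stable, tuples for the same target keep their relative order while becoming adjacent, which is what lets the operations on one target vertex run consecutively.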

As described above, in the transfer method of this embodiment, the arrangement of the intermediate data is sorted within the group of intermediate data transferred between graph vertices. This enables efficient compression of the transmitted data and efficient data transfer, and consequently speeds up graph processing. In addition, comparing the compression rates obtained when the rows are sorted on each column and compressed reduces the amount of information transferred between servers still more efficiently. Furthermore, when the data is sorted with source or data as the key, compressed, and transferred, re-sorting by target after decompression on the receiving side reduces the amount of intermediate data saved to the memory device or storage device, so graph analysis can be performed even faster.

100: client, 200: Internet, 300: Web server, 400: Ap server, 500: DB server, 410: network device, 420 to 450: server devices, 600: CPU, 610: memory device, 611: transmission buffer, 612: reception buffer, 613: processing queue, 614: graph operation module, 615: vertex assignment information, 616: Next queue, 617: compression module, 618: decompression module, 620: storage device, 630: input device, 640: output device, 650: network interface, 660: bus

Claims

A method for transferring intermediate data in graph processing performed by a plurality of computation nodes, wherein
each computation node has a transmission buffer,
each computation node is assigned graph vertices to be processed, and
the intermediate data includes a pair of information on a transmission-source graph vertex and information on a transmission-destination graph vertex,
the method comprising:
accumulating the intermediate data in the transmission buffer;
sorting the arrangement of the intermediate data within the accumulated group of intermediate data based on whichever of the information on the transmission-source graph vertex and the information on the transmission-destination graph vertex gives the higher compression rate after sorting;
compressing the accumulated group of intermediate data after the sorting; and
transferring the group of intermediate data.

The method for transferring intermediate data in graph processing according to claim 1, wherein
each computation node has a reception buffer, and
when the transferred group of intermediate data was sorted based on the information on the transmission-source graph vertex,
each computation node sorts the arrangement of the intermediate data within the group of intermediate data that has arrived at the reception buffer based on the information on the transmission-destination graph vertex.

The method for transferring intermediate data in graph processing according to claim 1,
wherein each computation node is a server device.

A method for transferring intermediate data in graph processing performed by a plurality of computation nodes, wherein
each computation node has a transmission buffer,
each computation node is assigned graph vertices to be processed, and
the intermediate data includes a set of information on a transmission-source graph vertex, information on a transmission-destination graph vertex, and information on a dependency relationship between the transmission-source graph vertex and the transmission-destination graph vertex,
the method comprising:
accumulating the intermediate data in the transmission buffer;
sorting the arrangement of the intermediate data within the accumulated group of intermediate data based on whichever of the information on the transmission-source graph vertex, the information on the transmission-destination graph vertex, and the information on the dependency relationship gives the highest compression rate after sorting;
compressing the accumulated group of intermediate data after the sorting; and
transferring the group of intermediate data.

The method for transferring intermediate data in graph processing according to claim 4, wherein
each computation node has a reception buffer, and
when the transferred group of intermediate data was sorted based on the information on the transmission-source graph vertex or the information on the dependency relationship,
each computation node sorts the arrangement of the intermediate data within the group of intermediate data that has arrived at the reception buffer based on the information on the transmission-destination graph vertex.

The method for transferring intermediate data in graph processing according to claim 4,
wherein each computation node is a server device.

A graph processing system having a plurality of computation nodes,
wherein each computation node comprises:
a transmission buffer that accumulates intermediate data of graph processing, the intermediate data including a pair of information on a transmission-source graph vertex and information on a transmission-destination graph vertex;
a module that sorts the arrangement of the intermediate data within the group of intermediate data accumulated in the transmission buffer, based on whichever of the information on the transmission-source graph vertex and the information on the transmission-destination graph vertex gives the higher compression rate after sorting;
a module that compresses the accumulated group of intermediate data after the sorting; and
a module that transfers the group of intermediate data.

The graph processing system according to claim 7,
wherein each computation node further comprises:
a reception buffer; and
a module that sorts the arrangement of the intermediate data within the group of intermediate data that has arrived at the reception buffer based on the information on the transmission-destination graph vertex.

The graph processing system according to claim 7,
wherein each computation node is a server device.