JP2020126487A

JP2020126487A - Parallel processor apparatus, data transfer destination determining method, and data transfer destination determining program

Info

Publication number: JP2020126487A
Application number: JP2019019107A
Authority: JP
Inventors: 貴史野瀬; Takashi Nose; 剛橋本; Takeshi Hashimoto
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2019-02-05
Filing date: 2019-02-05
Publication date: 2020-08-20
Anticipated expiration: 2039-02-05
Also published as: JP7180424B2

Abstract

To reduce the time required to complete data broadcast communication in a parallel processor apparatus. SOLUTION: The present invention is directed to a parallel processor apparatus having a plurality of nodes connected to one another through a network. Each of the plurality of nodes has: a calculating unit that, based on configuration information of the network, positional information of each node in the network, and origin node information indicating the origin node of the broadcast communication, calculates a transfer destination node, which is the destination of data in the broadcast communication, such that the transfer distance gradually becomes smaller as the transfer number increases; a storing unit that stores the positional information of the transfer destination node for every transfer number calculated by the calculating unit; and a communication unit that, upon receiving the data from another node in the broadcast communication, determines the transfer destination node based on the information stored in the storing unit and transfers the received data to the determined transfer destination node. SELECTED DRAWING: Figure 1

Description

The present invention relates to a parallel processing device, a data transfer destination determination method, and a data transfer destination determination program.

When a plurality of data items are broadcast in a network including a plurality of nodes, the transfer time is reduced by executing a predetermined number of data transfers from one node to another node and then having two nodes transfer data to each other (see, for example, Patent Document 1). In addition, when one node transmits individual data to each of the other nodes, efficient communication is realized by dividing the network into a plurality of equal regions and equalizing the number of data transmissions to each region (see, for example, Patent Document 2).

Patent Document 1: Japanese Unexamined Patent Application Publication No. H11-345220
Patent Document 2: International Publication No. 2008/114440

In a parallel processing device such as a distributed-memory HPC (High Performance Computing) system, operations in which a plurality of nodes transfer data simultaneously are needed at many points during execution of a job that performs parallel computation. One example of such simultaneous data transfer is broadcast communication.

In broadcast communication, it is desirable that as many nodes as possible operate as transmitting nodes as early as possible and for as long as possible, and that no link sharing, which lowers the transfer rate, occurs. However, in a broadcast in which data is transferred sequentially to adjacent nodes, for example, if a node that has already transmitted the data continues to participate in the broadcast, its second and subsequent transfers pass through other nodes that have already transmitted. As a result, the second and subsequent transfers are more likely to share links with other nodes, the data transfer efficiency of the broadcast decreases, and the broadcast takes longer to complete than when no link sharing occurs.

In one aspect, an object of the present invention is to reduce the time taken to complete a data broadcast in a parallel processing device.

According to one aspect, in a parallel processing device including a plurality of nodes connected to one another via a network, each of the plurality of nodes includes: a calculation unit that, based on configuration information of the network, position information of each node on the network, and origin node information indicating the origin node of a broadcast, determines transfer destination nodes, which are the destinations of data in the broadcast, such that the transfer distance gradually decreases as the transfer count increases; a storage unit that stores the position information of the transfer destination node for each transfer count calculated by the calculation unit; and a communication unit that, when data is received from another node during the broadcast, determines the transfer destination node based on the information stored in the storage unit and transfers the received data to the determined transfer destination node.

In one aspect, the present invention can reduce the time taken to complete a data broadcast in a parallel computer.

FIG. 1 is a diagram showing an example of a parallel processing device in one embodiment.
FIG. 2 is a diagram showing an example of a sub-network that includes the nodes on which a job is executed in the network of FIG. 1.
FIG. 3 is a diagram showing an example of the network coordinate table of the sub-network of FIG. 2.
FIG. 4 is a diagram showing an example of the reception stage number table of FIG. 1.
FIG. 5 is a diagram showing an example of the transfer destination node table of FIG. 1.
FIG. 6 is a flowchart showing an example of the operation of each node in the first phase when the parallel processing device of FIG. 1 executes a broadcast.
FIG. 7 is a diagram showing an example of a broadcast in the parallel processing device of FIG. 1.
FIG. 8 is a flowchart showing an example of the operation of each node in the second phase when the parallel processing device of FIG. 1 executes a broadcast.
FIG. 9 is a flowchart showing an example of the process, executed by each node of FIG. 1, of determining the transfer destination of data.
FIG. 10 is a diagram showing an example of a broadcast in another parallel processing device (comparative example).

Embodiments are described below with reference to the drawings.

FIG. 1 shows an example of a parallel processing device in one embodiment. The parallel processing device 100 shown in FIG. 1 includes a network NW having a plurality of nodes ND, and a management node 50 that manages the nodes ND. Each node ND includes a calculation unit 10, a storage unit 20, and a communication unit 30. Although FIG. 1 shows an example in which the network NW is a two-dimensional mesh network, the network NW may have another topology, and its number of dimensions may be other than two.

For example, the parallel processing device 100 operates as a large-scale distributed-memory HPC system. A job that executes parallel computation using a plurality of nodes ND requires, at many points, communication that the nodes ND execute simultaneously according to a specific communication pattern. Such communication is called collective communication. In the following, broadcast communication is described as an example of collective communication.

The storage unit 20 has storage areas that hold a network coordinate table 22, a reception stage number table 24, and a transfer destination node table 26. The storage area holding the reception stage number table 24 is an example of a reception condition holding area, and the storage area holding the transfer destination node table 26 is an example of a transfer condition holding area.

The network coordinate table 22 contains configuration information (such as coordinate information indicating network coordinates) of a predetermined number of nodes ND that execute a job in parallel, among the nodes ND included in the network NW. In other words, the network coordinate table 22 contains the configuration information of the nodes ND targeted by the broadcast.

The network coordinate table 22 may be distributed in advance from the management node 50 to each node ND when the configuration of the network NW is determined, or before a job is executed. By referring to the network coordinate table 22 that it holds locally, each node ND can obtain the coordinate information (that is, the position information) of itself and of all nodes ND targeted by the broadcast without communicating with other nodes ND. Each node ND is notified of its own coordinate information in advance by the management node 50 and therefore knows its own position in the network NW.

For example, the nodes ND included in the sub-network whose range is indicated by the coordinate information in the network coordinate table 22 form the group of nodes ND participating in the same job. An example of the network coordinate table 22 is shown in FIG. 3. In the following, a partial network including a plurality of nodes ND that execute a job in parallel is also referred to as a sub-network SNW (FIG. 2). All nodes ND included in the sub-network SNW are targets of the broadcast.

For a broadcast executed through a plurality of transfers, the reception stage number table 24 contains information indicating at which transfer count each of the nodes ND targeted by the broadcast receives the data. That is, the reception stage number table 24 in the storage unit 20 of each node ND contains this information not only for the own node ND but for all nodes ND targeted by the broadcast. An example of the reception stage number table 24 is shown in FIG. 4.

The transfer destination node table 26 contains, for each transfer count of a broadcast executed through a plurality of transfers, information indicating the transfer destination node ND to which a given node ND transfers the data. That is, the transfer destination node table 26 in the storage unit 20 of each node ND indicates the transfer destination node ND for each transfer count not only for the own node ND but for all nodes ND targeted by the broadcast. An example of the transfer destination node table 26 is shown in FIG. 5. The transfer count of the broadcast does not count the transfers of an individual node ND; it indicates which round of transfers is taking place across the sub-network SNW as a whole. Because data is transferred by relaying through nodes ND and each relay indicates the state of the broadcast, the transfer count of the broadcast is hereinafter also referred to as the relay stage number.

Note that the reception stage number table 24 may contain only the relay stage number at which the own node ND receives the data in the broadcast. Similarly, the transfer destination node table 26 may contain only the information indicating the transfer destination nodes ND of data transferred from the own node ND. However, the information stored in the reception stage number table 24 and the transfer destination node table 26 is generated, for example, by a data transfer destination determination program executed by each node ND. Therefore, when the tables are generated so as to contain the information of all nodes ND targeted by the broadcast, a common data transfer destination determination program can be used on all of those nodes ND. The management node 50 then only needs to distribute a single data transfer destination determination program to each node ND and have it executed, which simplifies the management of the nodes ND by the management node 50.

Based on the network coordinate table 22 held in the storage unit 20 of the own node ND and origin node information indicating the origin node ND of the broadcast, the calculation unit 10 calculates the transfer destination nodes ND, which are the destinations of data in the broadcast, such that the transfer distance gradually decreases. For example, the distance along the route between the nodes ND between which the data is transferred (the Manhattan distance) may be used as the transfer distance. The network coordinate table 22 contains the configuration of the network targeted by the broadcast and the position information of the nodes ND included in that network.
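As a minimal sketch of the transfer-distance measure mentioned above, the Manhattan distance between two nodes of a mesh can be computed from their network coordinates. The function name `manhattan_distance` and the candidate list are illustrative assumptions, and the sketch assumes a mesh without wraparound (torus) links.

```python
def manhattan_distance(a, b):
    """Hop count between two mesh nodes given as coordinate tuples (no wraparound)."""
    return sum(abs(p - q) for p, q in zip(a, b))


root = (0, 0)  # origin node of the broadcast (assumed for this example)
candidates = [(0, 4), (11, 4), (4, 2)]

# Order the candidates so that the node farthest from the root comes first.
by_distance = sorted(
    candidates, key=lambda c: manhattan_distance(root, c), reverse=True
)
```

Because the function sums over all coordinate pairs, it works unchanged for meshes of three or more dimensions.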

The calculation unit 10 then creates the reception stage number table 24 and the transfer destination node table 26 based on the calculated transfer destination nodes ND, and stores the created tables in the storage unit 20. The reception stage number table 24 and the transfer destination node table 26 are an example of a message transfer order database indicating the order in which messages such as data are transferred. The origin node information may be notified in advance from the management node 50 to each node ND, or may be included in the network coordinate table 22.

In this way, before the broadcast starts, the calculation unit 10 creates in advance the reception stage number table 24 and the transfer destination node table 26, which contain the information of a transfer pattern for "transferring the message (data) to nodes ND that are as far away as possible in the early stages of the broadcast". In doing so, the calculation unit 10 creates the two tables based on the nodes ND included in the sub-network SNW (FIG. 2), the origin node ND (start position) at which the transfers of the broadcast begin, and the links usable in the sub-network SNW.

The function of the calculation unit 10 may be realized by a data transfer destination determination program executed by a processor (not shown), such as a CPU (Central Processing Unit), included in each node ND. That is, the reception stage number table 24 and the transfer destination node table 26 may be generated by the processor executing the data transfer destination determination program. The calculation unit 10 and the processor included in each node ND are examples of a computer.

In this case, the storage unit 20 is provided so as to be accessible by the processor and may have a storage area that stores a data transfer destination determination program 28, as indicated by the broken-line frame. The processor of each node ND executes the data transfer destination determination program 28, thereby realizing a data transfer destination determination method that creates the reception stage number table 24 and the transfer destination node table 26. The function of the calculation unit 10 may instead be realized by hardware such as an FPGA (Field Programmable Gate Array).

In this embodiment, the calculation unit 10 provided in each node ND determines the transfer destination nodes ND of data in the broadcast and stores information indicating the determined transfer destination nodes ND in the reception stage number table 24 and the transfer destination node table 26 of the own node ND. Because the determined transfer destination nodes ND need not be notified to other nodes ND, an increase in the communication load on the network NW can be suppressed. In contrast, if, for example, the management node 50 determined the transfer destination nodes ND, it would have to send the determined transfer destination nodes ND to each node ND, which would increase the communication load on the network NW.

When the communication unit 30 receives data from another node ND in a broadcast, it determines the transfer destination node ND based on the reception stage number table 24 and the transfer destination node table 26 held in the storage unit 20, and transfers the received data to the determined transfer destination node ND. In communication other than broadcast, the communication unit 30 stores the received data in the storage unit 20 or the like when the destination of the received data is the own node ND. When the destination of the received data is another node ND, the communication unit 30 acts as a relay and forwards the data toward the destination node ND.

The management node 50 is individually connected to each node ND via a management network MNW used for managing the nodes ND, and manages each node ND. Although FIG. 1 shows the management node 50 connected to only some of the nodes ND via the management network MNW, it is in fact connected to all nodes ND included in the network NW.

For example, the management node 50 may be a job scheduler node that manages process startup on each node ND. The job scheduler node assigns a job submitted to the parallel processing device 100 to nodes ND, which are compute nodes, and requests the assigned nodes ND to start the program described in the job. The job scheduler node passes to each node ND information such as the network coordinates of all nodes ND participating in the same job (for example, the network coordinate table 22).

FIG. 2 shows an example of a sub-network SNW that includes the nodes ND on which a job is executed in the network NW of FIG. 1. In FIG. 2 the sub-network SNW is a sub-mesh network, but the topology of the sub-network SNW is not limited to a mesh network. For example, the sub-network SNW includes all nodes ND that execute data processing and the like upon startup of the program described in the job. Information indicating the nodes ND included in the sub-network SNW and their network coordinates is contained in the network coordinate table 22 transferred from the management node 50 to each node ND.

In the example shown in FIG. 2, the sub-network SNW contains 60 nodes ND, arranged as 12 nodes ND in the X-axis direction by 5 nodes ND in the Y-axis direction. The label such as (0, 0) at the upper left of each node ND in the sub-network SNW indicates its network coordinates. As with the network NW, it is preferable to control the assignment of nodes ND to jobs so that the sub-network SNW also forms a mesh network or a torus network. This prevents the links used for inter-process communication of different jobs from overlapping.

In the network NW, the extent (size) along the coordinate axis of each dimension that contains the nodes ND targeted by the broadcast, as defined by the network coordinate table 22, is called a shape parameter. That is, the sub-network SNW is represented by shape parameters. For example, the sub-network SNW is given in Cartesian coordinates, and the coordinate range on each of the coordinate axes X and Y is determined in advance. The entire network NW may also be used as the sub-network SNW.

FIG. 3 shows an example of the network coordinate table 22 of the sub-network SNW of FIG. 2. The network coordinate table 22 has a plurality of entries, each storing the rank number RANK assigned to a node ND in the sub-network SNW and its network coordinates (X, Y). In the following, network coordinates are also referred to simply as coordinates.

The rank number RANK is a serial number assigned to each node ND in the sub-network SNW. In the example shown in FIG. 3, rank numbers RANK are assigned sequentially to the nodes ND at coordinates (0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 0), (1, 1), ..., (11, 3), (11, 4). The assignment of rank numbers RANK is not limited to the example shown in FIG. 3. By referring to the network coordinate table 22 stored in its own storage unit 20, each node ND can identify the network coordinates (X, Y) of the nodes ND in the sub-network SNW.
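The rank-to-coordinate mapping described above can be sketched as a small lookup table. This is an illustration only: it assumes ranks start at 0, that Y runs from 0 to 4 as the 12 × 5 shape of FIG. 2 implies, and that ranks advance with Y varying fastest. The names `build_coordinate_table` and `rank_of` are hypothetical.

```python
X_SIZE, Y_SIZE = 12, 5  # shape of the sub-network in the FIG. 2 example


def build_coordinate_table(x_size, y_size):
    """Return {rank: (x, y)}, assigning ranks with Y varying fastest (assumed order)."""
    return {
        x * y_size + y: (x, y)
        for x in range(x_size)
        for y in range(y_size)
    }


table = build_coordinate_table(X_SIZE, Y_SIZE)

# Reverse mapping, so that a node can find the rank for a given coordinate.
rank_of = {coord: rank for rank, coord in table.items()}
```

Holding both directions of the mapping locally lets a node resolve any peer's position without network traffic, which is the property the text attributes to the network coordinate table 22.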

When one process is assigned to each node ND, a rank number RANK is assigned per node ND. When a plurality of processes are assigned to each node ND, a rank number RANK is assigned per process on each node ND. Even when a plurality of rank numbers RANK are assigned to a node ND, the network coordinate table 22 shown in FIG. 3 can be used as is by registering a representative rank number RANK in it.

The network coordinate table 22 may instead store the network coordinates of all nodes ND included in the network NW shown in FIG. 1. In that case the network coordinate table 22 need not be updated each time a sub-network SNW is created, which reduces the amount of communication between the management node 50 and the nodes ND.

FIG. 4 shows an example of the reception stage number table 24 of FIG. 1. The reception stage number table 24 has a plurality of entries, each storing the coordinates (X, Y) of a node ND targeted by the broadcast, the rank number RANK identifying that node ND, and the relay stage number at which it receives the data in the broadcast. A node ND with relay stage number "0" is the origin node ND that starts the broadcast; in FIG. 4, the node ND assigned the coordinates (0, 0) is the origin node ND. In the following description, the origin node ND of the broadcast is also referred to as "Root".

For example, the table indicates that the nodes ND assigned the coordinates (0, 4) and (11, 4) receive the data at relay stage number "1", and that the nodes ND assigned the coordinates (0, 1), (0, 3), (4, 2), (7, 2), (11, 0), and (11, 2) receive the data at relay stage number "2". In a broadcast the same data is transferred to all nodes ND, so each node ND needs to receive the data only once. The relay stage number column of each entry therefore stores a single relay stage number. On receiving data, each node ND refers to its own entry in the reception stage number table 24 and thereby detects the relay stage number at which it received the data. As described later, it can then detect the transfer destination nodes ND to which it should transfer the data by referring to the transfer destination node table 26 based on the detected relay stage number.

FIG. 5 shows an example of the transfer destination node table 26 of FIG. 1. The transfer destination node table 26 has a plurality of entries, each storing the rank number RANK identifying the node ND that transfers the data, the relay stage number at which the data is transferred in the broadcast, and the coordinates (X, Y) of the nodes ND to which the data is transferred. In this embodiment, a given node ND transfers data at a plurality of relay stage numbers. A plurality of entries corresponding to one node ND (for example, the node ND with rank number RANK "0") are therefore allocated in the transfer destination node table 26. Also in this embodiment, each node ND transfers data to two nodes ND at each relay stage number at which it performs a broadcast transfer. The transfer destination node ND column of each entry therefore stores two coordinates.
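The two-table lookup described above can be sketched as follows. The reception stages are the values quoted in the text for FIG. 4; the stage-1 destinations of the Root follow from them, but the stage-2 sender/destination pairings are hypothetical placeholders, since the text does not list them. The tables are keyed by coordinates here for readability, whereas FIG. 5 is described as keyed by rank number; the names `receive_stage`, `forward_table`, and `destinations_after_receive` are likewise assumptions.

```python
# Reception stage table: coordinates -> relay stage at which the node receives the data.
receive_stage = {
    (0, 0): 0,  # Root
    (0, 4): 1, (11, 4): 1,
    (0, 1): 2, (0, 3): 2, (4, 2): 2, (7, 2): 2, (11, 0): 2, (11, 2): 2,
}

# Transfer destination table: (sender, relay stage) -> two destination coordinates.
forward_table = {
    ((0, 0), 1): [(0, 4), (11, 4)],
    # Hypothetical pairings for stage 2 (not given in the text).
    ((0, 0), 2): [(0, 1), (4, 2)],
    ((0, 4), 2): [(0, 3), (7, 2)],
    ((11, 4), 2): [(11, 0), (11, 2)],
}


def destinations_after_receive(node, last_stage):
    """List (stage, destinations) for every stage after `node` received the data."""
    first = receive_stage[node] + 1  # a node can only send once it holds the data
    return [
        (stage, forward_table[(node, stage)])
        for stage in range(first, last_stage + 1)
        if (node, stage) in forward_table
    ]
```

For instance, `destinations_after_receive((11, 4), 2)` yields only the stage-2 entry, because that node does not hold the data until stage 1 has completed.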

FIG. 6 is a flowchart showing an example of the operation of each node ND in the first phase when the parallel processing device 100 of FIG. 1 executes a broadcast. The operation shown in FIG. 6 is started on each node ND based on, for example, a broadcast start instruction from the management node 50. The relay stage number at the start of the flow of FIG. 6 is "0". Although not shown in the flow of FIG. 6, the "Root" node ND receives the data to be broadcast from the management node 50. Alternatively, the "Root" node ND may already hold the data to be broadcast before the flow of FIG. 6 starts.

第１フェーズは、同報通信の早い段階で実行される動作であり、できるだけ遠くのノードＮＤにデータを転送するための動作である。できるだけ遠くのノードＮＤにデータを転送することで、データを受信するノードＮＤをサブネットワークＳＮＷ内で分散させることができる。また、データを受信するノードＮＤをサブネットワークＳＮＷ内で分散させることで、より多くのノードＮＤで、リンクを共有することなく、より多くの中継段数を使って、データを他のノードＮＤに転送することができる。 The first phase is executed at an early stage of the broadcast communication and transfers data to nodes ND that are as far away as possible. Transferring the data to distant nodes ND distributes the nodes ND that receive the data within the sub-network SNW. Distributing the receiving nodes ND in this way allows more nodes ND to transfer the data to other nodes ND over more relay stages without sharing links.

例えば、第１フェーズでは、各ノードＮＤが各中継段数においてｋ個のノードＮＤにデータを転送し、転送後にデータを保持しているノードＮＤの数がｋ＋１倍になる状態が続く期間である。データを受信したノードＮＤが増加し、データの転送先が重複する状況になった場合、転送後にデータを保持しているノードＮＤの数は、ｋ＋１倍以下になる。この実施形態では、ノードＮＤの数がｋ＋１倍以下になってからの転送は、第１フェーズではなく、第２フェーズに移行して実行される。 For example, the first phase is the period during which each node ND transfers data to k nodes ND at each relay stage and the number of nodes ND holding the data keeps growing by a factor of k+1 with each transfer. When the number of nodes ND that have received the data increases to the point where data transfer destinations overlap, the number of nodes ND holding the data after a transfer no longer reaches k+1 times the previous number. In this embodiment, transfers performed after the growth factor drops in this way are executed in the second phase rather than in the first phase.

まず、ステップＳ１０において、ノードＮＤは、自ノードＮＤが”Ｒｏｏｔ”である場合、処理をステップＳ１４に移行し、自ノードＮＤが”Ｒｏｏｔ”でない場合、処理をステップＳ１２に移行する。ステップＳ１２において、”Ｒｏｏｔ”以外のノードＮＤは、データを受信するまで待ち、データを受信した場合、処理をステップＳ１４に移行する。 First, in step S10, the node ND shifts the processing to step S14 when the own node ND is “Root”, and shifts the processing to step S12 when the own node ND is not “Root”. In step S12, the node ND other than "Root" waits until data is received, and when the data is received, the process proceeds to step S14.

ステップＳ１４において、ノードＮＤは、受信時段数表２４を検索し、自ノードＮＤに割り当てられたランク番号ＲＡＮＫまたは自ノードＮＤの座標（Ｘ，Ｙ）を含むエントリから中継段数を取得する。例えば、ノードＮＤが同報通信の起点ノードＮＤである”Ｒｏｏｔ”の場合、図６のフローの開始時の中継段数は”０”であり、ステップＳ１４の実行時の中継段数は”０”である。 In step S14, the node ND searches the reception stage number table 24 and acquires the number of relay stages from the entry that includes the rank number RANK assigned to the own node ND or the coordinates (X, Y) of the own node ND. For example, when the node ND is “Root”, the origin node ND of the broadcast communication, the number of relay stages at the start of the flow of FIG. 6 is “0”, and the number of relay stages at the time of executing step S14 is also “0”.

ノードＮＤが”Ｒｏｏｔ”以外の場合、ステップＳ１４の実行時の中継段数は、ステップＳ１２においてデータを受信した中継段数である。すなわち、受信時段数表２４から取得する中継段数は、現在の中継段数である。ノードＮＤは、取得した中継段数に”１”を加えた値をカウンタ値ｉとして保持する。 When the node ND is other than “Root”, the number of relay stages at the time of executing step S14 is the number of relay stages that received the data in step S12. That is, the number of relay stages acquired from the reception stage number table 24 is the current number of relay stages. The node ND holds the value obtained by adding “1” to the acquired number of relay stages as the counter value i.

次に、ステップＳ１６において、ノードＮＤは、転送先ノード表２６を検索し、自ノードＮＤに割り当てられたランク番号ＲＡＮＫとカウンタ値ｉが示す中継段数とを含むエントリから転送先ノードＮＤの座標（Ｘ，Ｙ）を取得する。受信時段数表２４と転送先ノード表２６とは、中継段数を介して相互に対応付けすることができる。このため、ノードＮＤは、受信時段数表２４と転送先ノード表２６とを検索して転送先ノードＮＤを取得する場合にも、中継段数を介して１つの表として検索することができる。 Next, in step S16, the node ND searches the transfer destination node table 26 and acquires the coordinates (X, Y) of the transfer destination node ND from the entry that includes the rank number RANK assigned to the own node ND and the number of relay stages indicated by the counter value i. The reception stage number table 24 and the transfer destination node table 26 can be associated with each other via the number of relay stages. Therefore, even when the node ND searches both the reception stage number table 24 and the transfer destination node table 26 to acquire the transfer destination node ND, it can search them as a single table keyed by the number of relay stages.

次に、ステップＳ１８において、ノードＮＤは、ステップＳ１６で取得した転送先ノードＮＤにデータを転送する。次に、ステップＳ２０において、ノードＮＤは、カウンタ値ｉに”１”を加える。 Next, in step S18, the node ND transfers the data to the transfer destination node ND acquired in step S16. Next, in step S20, the node ND adds "1" to the counter value i.

次に、ステップＳ２２において、ノードＮＤは、自ノードＮＤに割り当てられたランク番号ＲＡＮＫと、ステップＳ２０で更新されたカウンタ値ｉが示す中継段数とを含むエントリが、転送先ノード表２６に存在するか否かを判定する。条件に合致するエントリが存在する場合、データを転送する転送先ノードＮＤがあるため、処理はステップＳ１６に移行され、同報通信におけるデータの転送動作が継続して実行される。条件に合致するエントリが存在しない場合、第１フェーズでデータを転送するノードＮＤがなくなったため、第１フェーズの動作が終了する。 Next, in step S22, the node ND determines whether the transfer destination node table 26 contains an entry that includes the rank number RANK assigned to the own node ND and the number of relay stages indicated by the counter value i updated in step S20. If such an entry exists, there is a transfer destination node ND to which the data is to be transferred, so the process returns to step S16 and the data transfer operation of the broadcast communication continues. If no such entry exists, no node ND remains to which the data is to be transferred in the first phase, so the first-phase operation ends.

各ノードＮＤは、第１フェーズを他のノードＮＤに対して独立に実行するが、全てのノードＮＤに共通の受信時段数表２４および転送先ノード表２６に基づいて転送動作を実行する。このため、複数のノードＮＤから転送されるデータが１つのノードＮＤに重複して転送されることを抑止することができる。 Each node ND independently executes the first phase with respect to the other nodes ND, but executes the transfer operation based on the reception stage number table 24 and the transfer destination node table 26 common to all the nodes ND. Therefore, it is possible to prevent the data transferred from the plurality of nodes ND from being transferred to one node ND in a duplicated manner.

なお、各ノードＮＤは、図６の動作を開始する前に、受信時段数表２４および転送先ノード表２６から自ノードＮＤに対応する情報を取得してもよい。これにより、例えば、ステップＳ１６において、転送先ノード表２６から転送先ノードＮＤを毎回取得する処理を省略することができる。 Note that each node ND may acquire the information corresponding to its own node ND from the reception stage number table 24 and the transfer destination node table 26 before starting the operation of FIG. 6. Accordingly, for example, in step S16, the process of acquiring the transfer destination node ND from the transfer destination node table 26 each time can be omitted.
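
The loop of steps S14 to S22 can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: the dictionaries `recv_stage` and `dest_table` are hypothetical stand-ins for the reception stage number table 24 and the transfer destination node table 26, the `send` callback is a hypothetical placeholder for the communication unit, and the table contents are limited to the entries quoted in the description of FIG. 7. Unlike the flowchart, the sketch checks for an entry before the first lookup, so a node with no entry simply transfers nothing.

```python
from typing import Callable, Dict, List, Tuple

Coord = Tuple[int, int]

def first_phase(
    rank: int,
    is_root: bool,
    recv_stage: Dict[int, int],                      # hypothetical: rank -> stage at which data arrives
    dest_table: Dict[Tuple[int, int], List[Coord]],  # hypothetical: (rank, stage) -> destination coords
    send: Callable[[Coord, int], None],              # hypothetical transfer callback (coord, stage)
) -> int:
    """Run steps S14 to S22 for one node that already holds the data.

    Returns the number of transfers issued by this node.
    """
    # S14: the current stage is 0 for Root, otherwise the stage of reception.
    current = 0 if is_root else recv_stage[rank]
    i = current + 1
    sent = 0
    # S16 / S22: keep going while an entry exists for (rank, stage i).
    while (rank, i) in dest_table:
        for coord in dest_table[(rank, i)]:  # S18: transfer to each destination
            send(coord, i)
            sent += 1
        i += 1                               # S20: advance the counter
    return sent

# Entries taken from the description of FIG. 7 (RANK 0 is Root).
dest_table = {
    (0, 1): [(0, 4), (11, 4)],
    (0, 2): [(0, 1), (11, 0)],
    (4, 2): [(0, 3), (4, 2)],
    (59, 2): [(7, 2), (11, 2)],
}
recv_stage = {4: 1, 59: 1}

log: List[Tuple[int, Coord]] = []
first_phase(0, True, recv_stage, dest_table, lambda c, s: log.append((s, c)))
print(log)
```

Because every node consults the same two tables, each node can run this loop independently of the others.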

図７は、図１の並列処理装置１００における同報通信の一例を示す。図７に示す同報通信は、各ノードＮＤが、図４に示す受信時段数表２４および図５に示す転送先ノード表２６を参照し、図６に示すフローを実行する場合の例である。図７において、黒丸で示すノードＮＤは、同報通信されるデータを受信したことを示し、白丸で示すノードＮＤは、同報通信されるデータを受信していないことを示す。 FIG. 7 shows an example of broadcast communication in the parallel processing device 100 of FIG. 1. The broadcast communication shown in FIG. 7 is an example in which each node ND executes the flow shown in FIG. 6 with reference to the reception stage number table 24 shown in FIG. 4 and the transfer destination node table 26 shown in FIG. 5. In FIG. 7, a node ND indicated by a black circle has received the broadcast data, and a node ND indicated by a white circle has not received the broadcast data.

まず、中継段数＝”０”では、”Ｒｏｏｔ”である座標（０，０）のノードＮＤのみが同報通信のデータを受信済みである。中継段数＝”０”での転送済みノード数は”１”である。 First, when the number of relay stages=“0”, only the node ND at the coordinate (0,0) which is “Root” has already received the data of the broadcast communication. The number of transferred nodes is “1” when the number of relay stages=“0”.

次に、中継段数＝”１”では、”Ｒｏｏｔ”のノードＮＤは、転送先ノード表２６を参照し、座標（０，４）、（１１，４）のノードＮＤにデータを転送する。データを受信した座標（０，４）、（１１，４）のノードＮＤは、受信時段数表２４を参照し、現在の中継段数が”１”であることを検出する。中継段数＝”１”での転送済みノード数は”３”である。 Next, when the number of relay stages=“1”, the “Root” node ND refers to the transfer destination node table 26 and transfers data to the nodes ND at coordinates (0, 4) and (11, 4). The nodes ND at coordinates (0, 4) and (11, 4) that have received the data refer to the reception stage number table 24 and detect that the current number of relay stages is “1”. The number of nodes that have received the data when the number of relay stages=“1” is “3”.

中継段数＝”２”では、”Ｒｏｏｔ”のノードＮＤは、転送先ノード表２６を参照し、中継段数＝”２”に対応して転送先ノードＮＤ（座標（０，１）、（１１，０））が存在することを検出する。このため、”Ｒｏｏｔ”のノードＮＤは、座標（０，１）、（１１，０）にデータを転送する。 When the number of relay stages=“2”, the “Root” node ND refers to the transfer destination node table 26 and detects that transfer destination nodes ND (coordinates (0, 1) and (11, 0)) exist for the number of relay stages=“2”. Therefore, the “Root” node ND transfers the data to coordinates (0, 1) and (11, 0).

座標（０，４）のノードＮＤ（ＲＡＮＫ＝４）は、転送先ノード表２６を参照し、中継段数＝”２”に対応して転送先ノードＮＤ（座標（０，３）、（４，２））が存在することを検出する。このため、ノードＮＤ（ＲＡＮＫ＝４）は、座標（０，３）、（４，２）にデータを転送する。 The node ND (RANK=4) at the coordinate (0,4) refers to the transfer destination node table 26, and the transfer destination node ND (coordinates (0,3), (4, 2)) is detected. Therefore, the node ND (RANK=4) transfers the data to the coordinates (0,3) and (4,2).

座標（１１，４）のノードＮＤ（ＲＡＮＫ＝５９）は、転送先ノード表２６を参照し、中継段数＝”２”に対応して転送先ノードＮＤ（座標（７，２）、（１１，２））が存在することを検出する。このため、ノードＮＤ（ＲＡＮＫ＝５９）は、座標（７，２）、（１１，２）にデータを転送する。中継段数＝”２”での転送済みノード数は”９”である。 The node ND (RANK=59) at coordinates (11, 4) refers to the transfer destination node table 26 and detects that transfer destination nodes ND (coordinates (7, 2) and (11, 2)) exist for the number of relay stages=“2”. Therefore, the node ND (RANK=59) transfers the data to coordinates (7, 2) and (11, 2). The number of nodes that have received the data when the number of relay stages=“2” is “9”.

図７に示すように、各ノードＮＤは、転送先ノード表２６に基づいて、中継段数が小さい場合に転送距離が相対的に大きいノードＮＤにデータを転送し、中継段数が増えるにしたがい、転送距離が相対的に小さいノードＮＤにデータを転送する。これにより、データを受信するノードＮＤをサブネットワークＳＮＷ内に分散させることができ、データを受信したノードＮＤが以後の同報通信の転送に参加し続ける中継段数を増やすことができる。また、データを受信するノードＮＤを分散させることで、データを送受信する複数のノードＮＤ対の通信でリンクの共有が発生する可能性を下げることができる。 As shown in FIG. 7, based on the transfer destination node table 26, each node ND transfers data to the node ND having a relatively large transfer distance when the number of relay stages is small, and the transfer is performed as the number of relay stages increases. Data is transferred to the node ND having a relatively small distance. As a result, the nodes ND that receive data can be distributed in the sub-network SNW, and the number of relay stages in which the nodes ND that receive data continue to participate in the transfer of the subsequent broadcast communication can be increased. Further, by distributing the nodes ND that receive data, it is possible to reduce the possibility that link sharing will occur in the communication of a plurality of node ND pairs that transmit and receive data.

第１フェーズでは、例えば、同報通信の各中継段数において、各ノードＮＤが２つのノードＮＤにデータを転送する場合、ｍ段目では、”３”のｍ乗個のノードＮＤにデータを保持させることができる。同報通信の各中継段数において、各ノードＮＤがデータを転送できるノードＮＤの数を”ｋ”とする場合、ｍ段においてデータを受信済みのノードＮＤの数は、”（ｋ＋１）^ｍ”で示される。 In the first phase, for example, when each node ND transfers data to two nodes ND at each relay stage of the broadcast communication, 3^m nodes ND hold the data at the m-th stage. When the number of nodes ND to which each node ND can transfer data at each relay stage of the broadcast communication is “k”, the number of nodes ND that have received the data at the m-th stage is given by “(k+1)^m”.
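
The growth described above can be checked with a few lines of code. The sketch below is illustrative only; the function names are hypothetical, and `max_full_stages` simply finds the largest number of stages for which (k+1)^m does not exceed the number of nodes N, which is the point at which the factor-of-(k+1) growth can no longer continue.

```python
def holders_after(m: int, k: int) -> int:
    """Number of nodes holding the data after m stages when every holder
    forwards to k new nodes at each stage: (k + 1) ** m."""
    return (k + 1) ** m

def max_full_stages(n_nodes: int, k: int) -> int:
    """Largest m such that (k + 1) ** m <= n_nodes."""
    m = 0
    while (k + 1) ** (m + 1) <= n_nodes:
        m += 1
    return m

# k = 2 reproduces the counts 1, 3, 9 given for FIG. 7 at stages 0, 1 and 2.
print([holders_after(m, 2) for m in range(3)])
# N = 2000 and k = 3 are the values used in the timing estimate of this description.
print(max_full_stages(2000, 3))
```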

図８は、図１の並列処理装置１００が同報通信を実行する場合の第２フェーズでの各ノードＮＤの動作の一例を示すフローチャートである。第２フェーズは、図６のステップＳ２２でデータを転送するノードＮＤがなくなり、第１フェーズを終了したノードＮＤが開始する。すなわち、図８は、ノードＮＤ毎に実行される。 FIG. 8 is a flowchart showing an example of the operation of each node ND in the second phase when the parallel processing device 100 of FIG. 1 executes broadcast communication. The second phase is started by a node ND that has finished the first phase because, in step S22 of FIG. 6, no node ND remained to which it should transfer the data. That is, the flow of FIG. 8 is executed for each node ND.

まず、ステップＳ３０において、ノードＮＤは、自ノードＮＤの転送先ノード表２６の全エントリに格納された転送先ノードＮＤを検索し、自ノードＮＤに隣接するノードＮＤのうち、転送先ノードＮＤに該当しない隣接ノードＮＤを検出する。隣接ノードＮＤか否かは、ネットワーク座標表２２に基づいて判定可能である。例えば、自ノードＮＤの座標（Ｘ，Ｙ）に対して、Ｘ軸またはＹ軸が”１”だけずれたノードＮＤが隣接ノードＮＤである。転送先ノードＮＤに含まれない隣接ノードＮＤは、第１フェーズではデータが転送されていないノードＮＤであり、第２フェーズでデータを転送する必要がある。 First, in step S30, the node ND searches the transfer destination node ND stored in all the entries of the transfer destination node table 26 of the own node ND, and selects the transfer destination node ND among the nodes ND adjacent to the own node ND. The adjacent node ND which does not correspond is detected. Whether or not it is the adjacent node ND can be determined based on the network coordinate table 22. For example, the node ND in which the X axis or the Y axis is deviated by “1” with respect to the coordinates (X, Y) of the own node ND is the adjacent node ND. The adjacent node ND that is not included in the transfer destination node ND is a node ND to which data is not transferred in the first phase, and it is necessary to transfer the data in the second phase.

次に、ステップＳ３２において、ノードＮＤは、データを転送していない隣接ノードＮＤを検出した場合、処理をステップＳ３４に移行する。一方、ノードＮＤは、データを転送していない隣接ノードＮＤを検出しない場合、全ての隣接ノードＮＤにデータが転送されているため、処理を終了する。 Next, in step S32, when the node ND detects an adjacent node ND that has not transferred data, the process proceeds to step S34. On the other hand, when the node ND does not detect the adjacent node ND which has not transferred the data, the data is transferred to all the adjacent nodes ND, and thus the process ends.

ステップＳ３４において、ノードＮＤは、データが転送されていない隣接ノードＮＤのうちｋ個を上限としてデータを転送する隣接ノードＮＤを決定する。ｋ個は、同報通信の各中継段数において、各ノードＮＤがデータを転送できるノードＮＤの数であり、例えば、２個である。換言すれば、ｋ個は、あるノードＮＤからのデータ転送バンド幅の合計が最大になる同時転送動作の数である。ｋ個は、メッセージ長、各ノードＮＤに接続される通信リンクの数、ネットワーク装置のＤＭＡ（Direct Memory Access）転送エンジンの数、ネットワーク装置が接続されているシステムバスのバンド幅、ネットワーク装置のコマンドキューの動作並列度等に基づいて決められる。ネットワーク装置は、各ノードＮＤに含まれ、ノードＮＤ間での通信を制御する機能を有する。 In step S34, the node ND selects, from the adjacent nodes ND to which the data has not been transferred, up to k adjacent nodes ND as transfer targets. Here, k is the number of nodes ND to which each node ND can transfer data at each relay stage of the broadcast communication, and is, for example, two. In other words, k is the number of simultaneous transfer operations that maximizes the total data transfer bandwidth from a given node ND. The value of k is determined based on the message length, the number of communication links connected to each node ND, the number of DMA (Direct Memory Access) transfer engines of the network device, the bandwidth of the system bus to which the network device is connected, the degree of parallelism of the command queues of the network device, and the like. The network device is included in each node ND and has a function of controlling communication between the nodes ND.

次に、ステップＳ３６において、ノードＮＤは、ステップＳ３４で決定した隣接ノードＮＤにデータを転送する。次に、ステップＳ３８において、ノードＮＤは、データの転送を決定した隣接ノードＮＤのうち、データを転送していない隣接ノードＮＤがある場合、処理をステップＳ３４に戻し、データの転送処理を実行する。一方、ノードＮＤは、全ての隣接ノードＮＤにデータを転送済みの場合、処理を終了する。以上の動作を各ノードＮＤで実行することで、全てのノードＮＤにデータが転送され、同報通信が終了する。 Next, in step S36, the node ND transfers the data to the adjacent nodes ND selected in step S34. Next, in step S38, if, among the adjacent nodes ND to which the data is to be transferred, there remains one to which the data has not yet been transferred, the node ND returns the process to step S34 and executes the data transfer process again. On the other hand, if the data has been transferred to all the adjacent nodes ND, the node ND ends the process. By executing the above operation in each node ND, the data is transferred to all the nodes ND, and the broadcast communication ends.

なお、任意のノードＮＤに隣接する４つのノードＮＤは、任意のノードＮＤを隣接ノードＮＤと判断する。このため、隣接ノードＮＤは、第２フェーズのある中継段数において、周囲の複数のノードＮＤからデータを受信する可能性がある。この場合、隣接ノードＮＤは、先に受信したデータを有効とし、後で受信したデータを破棄してもよい。 The four nodes ND adjacent to the arbitrary node ND determine the arbitrary node ND as the adjacent node ND. Therefore, the adjacent node ND may receive data from a plurality of surrounding nodes ND in a certain number of relay stages in the second phase. In this case, the adjacent node ND may validate the data received earlier and discard the data received later.
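
Steps S30 to S38 can be sketched as follows. The sketch is illustrative and rests on stated assumptions: the network is taken to be a two-dimensional mesh without wrap-around links, the grid size is passed in as hypothetical parameters `width` and `height` (standing in for the network coordinate table 22), and the function returns the batches of neighbours instead of performing real transfers.

```python
from typing import List, Set, Tuple

Coord = Tuple[int, int]

def mesh_neighbors(node: Coord, width: int, height: int) -> List[Coord]:
    """Nodes whose X or Y coordinate differs from `node` by exactly 1
    (assumption: 2D mesh without wrap-around)."""
    x, y = node
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [(cx, cy) for cx, cy in candidates
            if 0 <= cx < width and 0 <= cy < height]

def second_phase_batches(node: Coord, first_phase_dests: Set[Coord],
                         width: int, height: int, k: int) -> List[List[Coord]]:
    """Return the neighbour batches a node would send to in the second phase."""
    # S30: neighbours that were not first-phase destinations of this node.
    pending = [n for n in mesh_neighbors(node, width, height)
               if n not in first_phase_dests]
    batches: List[List[Coord]] = []
    while pending:                                 # S32 / S38: anything left?
        batch, pending = pending[:k], pending[k:]  # S34: pick at most k
        batches.append(batch)                      # S36: transfer would happen here
    return batches

# Node (0, 4) already sent to (0, 3) and (4, 2) in the first phase (see FIG. 7).
# The 12 x 5 grid size is an assumption inferred from the coordinates in FIG. 7.
print(second_phase_batches((0, 4), {(0, 3), (4, 2)}, 12, 5, k=2))
```

A receiver that gets the same data from several neighbours can keep the first copy and discard the rest, as noted above, so the sender side needs no coordination.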

図９は、図１の各ノードＮＤが実行するデータの転送先を決定する処理の一例を示す。換言すれば、図９は、上述した第１フェーズで使用する受信時段数表２４および転送先ノード表２６を作成する処理を示す。図９に示す処理は、図１に示す算出部１０により実行される。なお、図９に示す処理は、各ノードＮＤに含まれるＣＰＵ等のプロセッサが実行するデータ転送先決定プログラムにより実現される算出部１０により実行されてもよい。すなわち、図９は、データ転送先決定方法の一例およびデータ転送先決定プログラムの一例を示す。 FIG. 9 shows an example of a process executed by each node ND of FIG. 1 for determining a data transfer destination. In other words, FIG. 9 shows the process of creating the reception stage number table 24 and the transfer destination node table 26 used in the above-described first phase. The process shown in FIG. 9 is executed by the calculation unit 10 shown in FIG. Note that the processing illustrated in FIG. 9 may be executed by the calculation unit 10 realized by a data transfer destination determination program executed by a processor such as a CPU included in each node ND. That is, FIG. 9 shows an example of a data transfer destination determination method and an example of a data transfer destination determination program.

まず、ステップＳ４０において、算出部１０は、中継段数ｍを”１”に設定する。次に、ステップＳ４２において、算出部１０は、転送数ｋと中継段数ｍとを用いて、第ｍ段でのデータの転送先のノードＮＤの総数を求める。転送数ｋは、各中継段数ｍにおいて各ノードＮＤからデータが転送されるノードＮＤの数である。例えば、転送数ｋが”２”の場合、第２段目では６個のノードＮＤにデータが転送され、第３段目では１８個のノードＮＤにデータが転送される。 First, in step S40, the calculation unit 10 sets the relay stage number m to “1”. Next, in step S42, the calculation unit 10 obtains the total number of nodes ND to which the data is transferred at the m-th stage by using the transfer number k and the relay stage number m. The transfer number k is the number of nodes ND to which data is transferred from each node ND in each relay stage number m. For example, when the transfer number k is “2”, the data is transferred to 6 nodes ND in the second stage, and the data is transferred to 18 nodes ND in the third stage.

次に、ステップＳ４４において、算出部１０は、データの転送先のノードＮＤがサブネットワークＳＮＷ内で分散するように、データの転送元のノードＮＤ毎に、データを受信していないノードＮＤの中からデータの転送先である転送先ノードＮＤを決定する。例えば、データを受信済みのノードＮＤのネットワーク座標の分散の最大化を目的関数とする最適化問題を解くことで、データの転送先のノードＮＤを算出することができる。算出部１０は、データを受信していないノードＮＤを、例えば、後述するステップＳ５０で更新される転送先ノード表２６を参照することで判断する。 Next, in step S44, the calculation unit 10 determines, for each data transfer source node ND, a transfer destination node ND from among the nodes ND that have not received the data, so that the transfer destination nodes ND are distributed within the sub-network SNW. For example, the transfer destination nodes ND can be calculated by solving an optimization problem whose objective function maximizes the variance of the network coordinates of the nodes ND that have received the data. The calculation unit 10 identifies the nodes ND that have not received the data by, for example, referring to the transfer destination node table 26 updated in step S50 described below.

次に、ステップＳ４６において、算出部１０は、ステップＳ４４で決定したデータの転送先ノードＮＤが重複するか否かを判定する。転送先ノードＮＤが重複する場合、転送先ノードＮＤをこれ以上分散させることが困難であると判断され、処理は終了する。なお、処理を終了する場合、直前のステップＳ４４で決定した転送先ノードＮＤを示す情報は破棄される。 Next, in step S46, the calculation unit 10 determines whether the transfer destination nodes ND of the data determined in step S44 overlap. When the transfer destination nodes ND overlap, it is determined that it is difficult to further distribute the transfer destination nodes ND, and the process ends. When the process is terminated, the information indicating the transfer destination node ND determined in step S44 immediately before is discarded.

転送先ノードＮＤとして割り当られていないノードＮＤは、図８に示す第２フェーズの動作により、隣接ノードＮＤとしてデータが転送される。転送先ノードＮＤが重複する場合、第２フェーズにおいて、隣接ノードＮＤを転送先ノードＮＤとして割り当てることで、１つの転送先ノードＮＤにデータが重複して転送される可能性を低くすることができる。この結果、同報通信におけるデータの転送効率が低下することを抑止することができる。 A node ND that is not assigned as a transfer destination node ND receives the data as an adjacent node ND through the second-phase operation shown in FIG. 8. When the transfer destination nodes ND would overlap, assigning adjacent nodes ND as transfer destinations in the second phase reduces the possibility that the data is transferred redundantly to a single transfer destination node ND. As a result, a decrease in the data transfer efficiency of the broadcast communication can be suppressed.

一方、転送先ノードＮＤが重複しない場合、転送先ノードＮＤにデータを転送するノードＮＤを決めるため、処理はステップＳ４８に移行される。ステップＳ４８において、算出部１０は、データを受信済みのノードＮＤを転送元ノードＮＤとして、ステップＳ４４で決定した転送先ノードＮＤのうち、どの転送先ノードＮＤに各転送元ノードＮＤからデータを転送するかを決める。すなわち、算出部１０は、データの転送元ノードＮＤとデータの転送先ノードＮＤとの組合せを決定する。なお、データを受信済みのノードＮＤは、データ転送先決定プログラム上で決められる仮想的なノードＮＤである。 On the other hand, if the transfer destination nodes ND do not overlap, the process proceeds to step S48 to determine which node ND transfers the data to each transfer destination node ND. In step S48, the calculation unit 10 treats the nodes ND that have received the data as transfer source nodes ND and decides, for each transfer source node ND, to which of the transfer destination nodes ND determined in step S44 it transfers the data. That is, the calculation unit 10 determines the combinations of data transfer source nodes ND and data transfer destination nodes ND. Note that a node ND that has received the data is here a virtual node ND determined within the data transfer destination determination program.

各ノードＮＤが各中継段数においてｋ個のノードＮＤにデータを転送する場合（転送数＝ｋ）、算出部１０は、１つの転送元ノードＮＤとｋ個の転送先ノードＮＤとの組合せ（割り当て）を決定する。ここで、算出部１０は、データの転送経路が交差しないように組合せを決定する。これにより、複数の転送先ノードＮＤへのデータの転送に、共通のリンクが使用される可能性を低くすることができる。 When each node ND transfers data to k nodes ND at each relay stage (transfer number=k), the calculation unit 10 determines the combination (allocation) of one transfer source node ND and k transfer destination nodes ND. Here, the calculation unit 10 determines the combinations so that the data transfer paths do not intersect. This reduces the possibility that a common link is used to transfer the data to a plurality of transfer destination nodes ND.

なお、決定した組合せでのデータの転送において、共通のリンクが使用される場合（使用するリンクが重複する場合）、算出部１０は、転送元ノードＮＤと転送先ノードＮＤとの割り当てを変更することで、共通のリンクを使用しない転送経路の設定を試みる。これにより、共通のリンクを使用する可能性を下げることができる。共通のリンクを使用する転送経路を完全になくすことができない場合、算出部１０は、共通のリンクを使用する転送経路の比率が最も低くなるように、転送先のノードＮＤの割り当てを変更してもよい。 When a common link is used in the data transfers of the determined combinations (that is, when the links to be used overlap), the calculation unit 10 changes the allocation of transfer source nodes ND and transfer destination nodes ND in an attempt to set transfer paths that do not use a common link. This reduces the possibility of using a common link. When transfer paths that use a common link cannot be eliminated completely, the calculation unit 10 may change the allocation of the transfer destination nodes ND so that the proportion of transfer paths using a common link is minimized.

次に、ステップＳ５０において、算出部１０は、ステップＳ４４で決定した各転送先ノードＮＤの座標（Ｘ，Ｙ）、ランク番号ＲＡＮＫおよび中継段数ｍを受信時段数表２４に格納することで、受信時段数表２４を更新する。次に、ステップＳ５２において、算出部１０は、ステップＳ４４で決定した各転送先ノードＮＤの座標（Ｘ，Ｙ）を、データの転送元のノードＮＤを示すランク番号ＲＡＮＫと中継段数ｍとに対応付けて転送先ノード表２６に格納する。これにより、転送先ノード表２６が更新される。 Next, in step S50, the calculation unit 10 updates the reception stage number table 24 by storing in it the coordinates (X, Y), the rank number RANK, and the number of relay stages m of each transfer destination node ND determined in step S44. Next, in step S52, the calculation unit 10 stores the coordinates (X, Y) of each transfer destination node ND determined in step S44 in the transfer destination node table 26 in association with the rank number RANK of the data transfer source node ND and the number of relay stages m. The transfer destination node table 26 is thereby updated.

次に、ステップＳ５４において、算出部１０は、中継段数ｍを”１”増加し、処理をステップＳ４２に戻し、次の中継段数ｍでのデータの転送先のノードＮＤを決定する処理を実行する。データの転送先のノードＮＤを決定する処理は、上述したように、転送先ノードＮＤが重複するまで繰り返し実行される。なお、ステップＳ５０、Ｓ５２、Ｓ５４の順序は、入れ替えられてもよい。 Next, in step S54, the calculation unit 10 increases the number of relay stages m by “1”, returns the process to step S42, and executes the process of determining the data transfer destination nodes ND for the next number of relay stages m. As described above, the process of determining the data transfer destination nodes ND is repeated until the transfer destination nodes ND overlap. Note that the order of steps S50, S52, and S54 may be interchanged.
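
The overall loop of FIG. 9 (steps S40 to S54) can be sketched as below. This is a hedged illustration, not the method of the embodiment: step S44 is replaced by a simple greedy farthest-point heuristic using Manhattan distance instead of solving the optimization problem, step S46 is simplified to "stop when too few unreceived nodes remain", step S48 assigns each destination to the nearest source with spare capacity without checking for crossing paths, and nodes are keyed by coordinates rather than by rank number RANK. All function and variable names are hypothetical, and the output is not expected to reproduce the tables of FIG. 4 and FIG. 5.

```python
from itertools import product
from typing import Dict, List, Tuple

Coord = Tuple[int, int]

def manhattan(a: Coord, b: Coord) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def build_tables(width: int, height: int, root: Coord, k: int):
    """Build stand-ins for the reception stage table and the destination table."""
    nodes = list(product(range(width), range(height)))
    recv_stage: Dict[Coord, int] = {root: 0}
    dest_table: Dict[Tuple[Coord, int], List[Coord]] = {}
    m = 1                                             # S40
    while True:
        holders = list(recv_stage)                    # nodes holding data before stage m
        need = k * len(holders)                       # S42: destinations for stage m
        free = [n for n in nodes if n not in recv_stage]
        if len(free) < need:                          # S46 (simplified stop condition)
            break
        chosen: List[Coord] = []
        for _ in range(need):                         # S44 (greedy stand-in)
            # Pick the unreceived node farthest from every holder and prior pick.
            best = max(free, key=lambda n: min(manhattan(n, h)
                                               for h in holders + chosen))
            chosen.append(best)
            free.remove(best)
        load = {h: 0 for h in holders}                # S48: pair sources and destinations
        for d in chosen:
            src = min((h for h in holders if load[h] < k),
                      key=lambda h: manhattan(h, d))
            load[src] += 1
            recv_stage[d] = m                         # S50
            dest_table.setdefault((src, m), []).append(d)  # S52
        m += 1                                        # S54
    return recv_stage, dest_table

recv_stage, dest_table = build_tables(12, 5, (0, 0), k=2)
print(len(recv_stage), max(recv_stage.values()))
```

With a 12 x 5 grid (an assumption inferred from the coordinates in FIG. 7) and k=2, the loop stops after three stages because a fourth stage would need 54 new destinations while only 33 nodes remain; those remaining nodes are left to the second phase.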

図１０は、他の並列処理装置における同報通信の一例（比較例）を示す。図１０に示す同報通信では、”Ｒｏｏｔ”である座標（０，０）が割り当てられたノードＮＤが、中継段数＝”１”において、自ノードＮＤに隣接する隣接ノードＮＤにデータを転送する。データを受信したノードＮＤは、中継段数＝”２”において、自ノードＮＤに隣接する隣接ノードＮＤにデータを転送する。この後も、各中継段数において、データを受信したノードＮＤは、自ノードＮＤに隣接する隣接ノードＮＤにデータを転送する。 FIG. 10 shows an example (comparative example) of broadcast communication in another parallel processing device. In the broadcast communication illustrated in FIG. 10, the “Root” node ND, to which the coordinates (0, 0) are assigned, transfers the data to the adjacent nodes ND adjacent to itself when the number of relay stages=“1”. Each node ND that has received the data transfers the data to the adjacent nodes ND adjacent to itself when the number of relay stages=“2”. Thereafter, at each relay stage, each node ND that has received the data likewise transfers the data to the adjacent nodes ND adjacent to itself.

隣接するノードＮＤにデータを順次転送する同報通信では、データの転送方向は、”Ｒｏｏｔ”から離れる方向に限られる。図１０に示す例では、”Ｒｏｏｔ”から離れる方向は、Ｘ座標が増加する方向またはＹ座標が増加する方向である。このため、ある中継段数でデータを転送したノードＮＤは、その後の中継段数でデータを転送できない場合がある。例えば、”Ｒｏｏｔ”のノードＮＤは、中継段数＝”２”では同報通信に参加できない。厳密には、”Ｒｏｏｔ”のノードＮＤは、他のノードＮＤとリンクを共有することで、同報通信に参加することができるが、この場合、データ転送の帯域が小さくなってしまう。 In broadcast communication in which data is transferred sequentially to adjacent nodes ND, the data transfer direction is limited to the direction away from “Root”. In the example shown in FIG. 10, the direction away from “Root” is the direction in which the X coordinate increases or the direction in which the Y coordinate increases. Therefore, a node ND that has transferred data at a certain relay stage may be unable to transfer data at subsequent relay stages. For example, the “Root” node ND cannot participate in the broadcast communication when the number of relay stages=“2”. Strictly speaking, the “Root” node ND could participate in the broadcast communication by sharing a link with another node ND, but in that case the data transfer bandwidth is reduced.

したがって、図１０に示す同報通信では、図７に示す同報通信に比べて、データの転送効率が低下する。換言すれば、図７に示す同報通信では、受信したデータを他のノードＮＤに転送したノードＮＤは、それ以降の中継段数においてもデータを他のノードＮＤに転送することができる。この結果、図７に示す同報通信では、図１０に示す同報通信に比べて、同報通信に掛かる時間（中継段数）を削減することができ、同報通信の効率を向上することができる。 Therefore, the data transfer efficiency of the broadcast communication shown in FIG. 10 is lower than that of the broadcast communication shown in FIG. 7. In other words, in the broadcast communication shown in FIG. 7, a node ND that has transferred the received data to other nodes ND can continue to transfer the data to other nodes ND at subsequent relay stages. As a result, the broadcast communication shown in FIG. 7 can reduce the time (number of relay stages) required for the broadcast communication compared with the broadcast communication shown in FIG. 10, and the efficiency of the broadcast communication can be improved.
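
To make the comparison concrete, the following sketch counts the relay stages needed by the neighbour-only scheme of the comparative example. It assumes a mesh with “Root” at the corner (0, 0), forwarding only in the +X and +Y directions as described for FIG. 10; the function name and the 12 x 5 grid size (inferred from the coordinates in FIG. 7) are assumptions made for illustration.

```python
def neighbor_only_stages(width: int, height: int) -> int:
    """Stages needed when every holder forwards only to its +X and +Y
    neighbours, starting from Root at (0, 0)."""
    holders = {(0, 0)}
    stage = 0
    while len(holders) < width * height:
        new = set()
        for x, y in holders:
            for nx, ny in ((x + 1, y), (x, y + 1)):
                if nx < width and ny < height:
                    new.add((nx, ny))
        holders |= new   # only the newest frontier adds nodes; older holders sit idle
        stage += 1
    return stage

print(neighbor_only_stages(12, 5))
```

Under these assumptions the wavefront advances one hop per stage, so the stage count equals the hop distance from “Root” to the farthest corner, whereas in the first phase of FIG. 7 the number of holders triples at every stage.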

ところで、データサイズ（メッセージサイズ）が大きく、１回でデータを転送できない場合、データを分割してパイプライン転送を行うことで、データの転送効率は向上する。一方、データ量が小さく、１回でデータ転送可能な場合、同報通信の完了までに必要な時間は転送の中継段数に比例する。この場合、”ｋ”を２以上の整数として、ｋ分木による同報通信アルゴリズムにおいて、中継段数がｍ段（ｍは正の整数）の転送までにデータを受信済になるノード数は、式（１）に示される。 Incidentally, when the data size (message size) is large and the data cannot be transferred in a single transfer, dividing the data and performing pipeline transfer improves the data transfer efficiency. On the other hand, when the amount of data is small and the data can be transferred in a single transfer, the time required to complete the broadcast communication is proportional to the number of relay stages. In this case, with “k” being an integer of 2 or more, the number of nodes that have received the data by the transfer at relay stage m (m is a positive integer) in a broadcast algorithm using a k-ary tree is given by equation (1).

なお、”木”は、グラフ理論における”閉路を持たないグラフないし部分グラフ”という意味の用語であり、対応する計算機ネットワークの全体ないし一部の接続関係を表現するために使用可能である。 The "tree" is a term in the graph theory meaning "graph or subgraph having no closed circuit", and can be used to represent the connection relation of the whole or a part of the corresponding computer network.

ノード数が”Ｎ”のサブネットワークＳＮＷにおいて、同報通信により全ノードＮＤにデータを転送する場合に必要な中継段数は、式（２）に示される。 In the sub-network SNW in which the number of nodes is “N”, the number of relay stages required when data is transferred to all the nodes ND by broadcast communication is represented by the equation (2).

例えば、図２に示すサブネットワークＳＮＷの同報通信において、サブネットワークＳＮＷに含まれるＮ個の全てのノードＮＤにｋ分木でデータを転送する場合、転送回数（すなわち、中継段数）は”ｌｏｇ_ｋ＋１Ｎ”程度になる。転送データ量をＤ、中継一回あたりのバンド幅をＢ、転送一回当たりのオーバヘッド＋通信遅延時間をＬとすると、同報通信全体での通信時間の概算は、式（３）で示され、通信遅延時間の概算は、式（４）で示される。 For example, in the broadcast communication of the sub-network SNW shown in FIG. 2, when data is transferred to all N nodes ND included in the sub-network SNW using a k-ary tree, the number of transfers (that is, the number of relay stages) is about “log_{k+1} N”. Letting D be the amount of transfer data, B the bandwidth per relay, and L the sum of the overhead and the communication delay time per transfer, the approximate communication time for the entire broadcast communication is given by equation (3), and the approximate communication delay time is given by equation (4).

一方、データサイズが大きく、データＤを３つに分割して２分木でパイプライン転送を行う他の並列処理装置における同報通信全体での通信時間の概算は、式（５）に示される。 On the other hand, for another parallel processing device that handles a large data size by dividing the data D into three parts and performing pipeline transfer with a binary tree, the approximate communication time for the entire broadcast communication is given by equation (5).

式（５）において、転送データ量をＤ／３、中継一回あたりのバンド幅をＢ、転送一回当たりの転送オーバヘッドと通信遅延時間の和をＬとする。 In Expression (5), the transfer data amount is D/3, the bandwidth per relay is B, and the sum of the transfer overhead and communication delay time per transfer is L.

例えば、Ｎ＝２０００、ｋ＝３、Ｂ＝１２．５ＧｉＢ／ｓｅｃ、Ｄ＝１．２５ＭｉＢ、Ｌ＝１μｓｅｃ（１０^−６ｓｅｃ）とすると、並列処理装置１００での通信時間の概算は、式（３）のＬが無視できるとした場合の式（６）より１７９μｓｅｃ程度になる。一方、通信時間の概算が式（５）で示される他の並列処理装置での通信時間の概算は、式（５）のＬが無視できるとした場合の式（７）より３３３μｓｅｃ程度になる。式（６）および式（７）中の符号＊は、乗算を示す。式（６）で示される通信時間は、式（７）で示される通信時間の４０％程度である。 For example, assuming that N=2000, k=3, B=12.5 GiB/sec, D=1.25 MiB, and L=1 μsec (10^-6 sec), the approximate communication time in the parallel processing device 100 is about 179 μsec according to equation (6), which is equation (3) with L assumed to be negligible. On the other hand, the approximate communication time in the other parallel processing device, whose communication time is given by equation (5), is about 333 μsec according to equation (7), which is equation (5) with L assumed to be negligible. The symbol * in equations (6) and (7) indicates multiplication. The communication time represented by equation (6) is about 40% of the communication time represented by equation (7).

式（６）は、図７に示すように、同報通信においてデータの転送先ノードＮＤを分散させる場合の通信時間を示している。式（７）は、同報通信においてデータを隣接ノードＮＤに転送する場合の通信時間を示している。このため、図７に示した同報通信に掛かる通信時間を、図１０に示す同報通信に掛かる通信時間に比べて短縮することができる。 Expression (6) represents the communication time when the data transfer destination nodes ND are distributed in the broadcast communication as shown in FIG. 7. Expression (7) indicates the communication time when data is transferred to the adjacent node ND in the broadcast communication. Therefore, the communication time required for the broadcast communication shown in FIG. 7 can be shortened as compared with the communication time required for the broadcast communication shown in FIG.

一方、Ｌが通信時間の主要因子となる場合、３分木を使用すると、並列処理装置１００での通信時間の概算は、式（４）に基づき５．５μｓｅｃ程度となり、他の並列処理装置での通信時間の概算は、式（２）にＬを乗じて６．５μｓｅｃ程度となる。Ｌが通信時間の主要因子となる場合の通信時間についても、並列処理装置１００が有利である。 On the other hand, when L is the dominant factor of the communication time and a ternary tree is used, the approximate communication time in the parallel processing device 100 is about 5.5 μsec based on equation (4), whereas the approximate communication time in the other parallel processing device is about 6.5 μsec, obtained by multiplying equation (2) by L. The parallel processing device 100 therefore also has the advantage when L is the dominant factor of the communication time.
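
The two latency-dominated figures can be reproduced numerically under stated assumptions. Equations (2) and (4) are not reproduced in this text, so the sketch below assumes that equation (4) amounts to L multiplied by log_{k+1} N, and that equation (2) is the stage count of a complete k-ary tree, which holds (k^(m+1) − 1)/(k − 1) nodes after m stages. Both forms are assumptions chosen because they match the quoted values; the function names are hypothetical.

```python
import math

def stages_distributed(n_nodes: int, k: int) -> float:
    """Assumed form behind equation (4): log base (k+1) of N."""
    return math.log(n_nodes, k + 1)

def stages_k_ary_tree(n_nodes: int, k: int) -> float:
    """Assumed form of equation (2): solve (k**(m+1) - 1)/(k - 1) = N for m."""
    return math.log(n_nodes * (k - 1) + 1, k) - 1

N, k = 2000, 3
L_usec = 1.0  # overhead plus delay per transfer, in microseconds

t_distributed = L_usec * stages_distributed(N, k)
t_tree = L_usec * stages_k_ary_tree(N, k)
print(f"{t_distributed:.2f} usec vs {t_tree:.2f} usec")
```

With N=2000 and k=3 the two values come out close to the 5.5 μsec and 6.5 μsec quoted above, which supports (but does not prove) the assumed forms.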

以下では、受信時段数表２４と転送先ノード表２６を作成する実施例が示される。すなわち、以下では、同報通信を実行するサブネットワークＳＮＷにおいて、上述した第１フェーズでのデータの転送順の算出方法が説明される。 In the following, an example of creating the reception stage number table 24 and the transfer destination node table 26 is shown. That is, a method of calculating the data transfer order in the above-described first phase in the sub-network SNW that executes broadcast communication will be described below.

The first phase is the period in which each node ND transfers data to k nodes ND at each relay stage, so that the number of nodes ND holding the data is multiplied by k+1 after each transfer. The first phase continues until the number of relay stages reaches the upper limit n. The upper limit n is the largest integer that satisfies (k+1)^n ≤ N. Writing [x] (the Gauss symbol) for the largest integer that does not exceed a real number x, the upper limit n is given by equation (8).
$n = [\log_{k+1} N]$ ‥ (8)
The following describes a method of determining the data transfer destination nodes ND at each relay stage by using quadratic programming. To keep the description simple, the network topology is assumed to be a mesh network. To realize the condition that "in the early stages of the broadcast communication, messages (data) are transferred to nodes ND as far away as possible", the objective function is chosen to maximize "the variance of the network coordinates in each dimension of the mesh network".
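As a quick check of equation (8), the following Python sketch computes the upper limit n of the first phase with integer arithmetic, which avoids rounding problems that a floating-point logarithm can cause near exact powers of k+1. The function name `first_phase_limit` is hypothetical and does not appear in the source.

```python
def first_phase_limit(num_nodes: int, k: int) -> int:
    """Return the largest n with (k + 1) ** n <= num_nodes (equation (8)).

    num_nodes corresponds to N and k to the number of nodes each node can
    transfer data to at one relay stage. The name is illustrative only.
    """
    if num_nodes < 1 or k < 1:
        raise ValueError("num_nodes and k must be positive")
    n = 0
    holders = 1  # only the origin node holds the data before stage 1
    while holders * (k + 1) <= num_nodes:
        holders *= k + 1  # each holder sends to k nodes, so holders grow (k+1)-fold
        n += 1
    return n


if __name__ == "__main__":
    # Values from the numerical example in the description: N = 2000, k = 3.
    print(first_phase_limit(2000, 3))
```

With N = 2000 and k = 3, (k+1)^5 = 1024 ≤ 2000 < 4096 = (k+1)^6, so the first phase would run for five relay stages under this reading of equation (8).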
<Example 1>
The results obtained by a data transfer destination determination program that executes the following procedure are registered in the reception stage number table 24 and the transfer destination node table 26. The data transfer destination determination program executed in Example 1 includes steps A and B below and differs from the processing of the data transfer destination determination program shown in FIG. 9. Hereinafter, the data transfer destination determination program executed in Example 1 is simply referred to as the program.

In step A, the program determines the coordinates of the data transfer destination nodes ND (that is, the nodes ND that receive the data) at each relay stage. Step A includes the following sub-steps A1, A2, and A3 and is executed, for example, for each relay stage. The output of the "quadratic programming subroutine for integer variables" used in step A need not be an exact solution. For example, it may be an approximate solution obtained by taking the integer part of the solution of the "relaxed problem", in which the integrality constraint is removed.

In sub-step A1, the program allocates an array that stores the nodes ND that receive the data at each relay stage and their coordinates.

Next, in sub-step A2, the program takes as input the coordinates of the (k+1)^(m−1) nodes ND that have already received the data by stage m−1 and accepts an objective function that maximizes "the variance of the coordinates of the nodes ND that receive the data by stage m".

Next, in sub-step A3, the program passes the objective function of sub-step A2 to the quadratic programming subroutine for integer variables and stores the coordinates of the receiving nodes ND output by the subroutine in the array. In sub-step A3, the program derives k×(k+1)^(m−1) network coordinates (that is, nodes ND that receive the data) for each relay stage. Here, k is the number of nodes ND to which each node ND can transfer data.
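Sub-steps A2 and A3 can be illustrated with a minimal Python sketch for a very small two-dimensional mesh. It is not the quadratic programming subroutine of the source: an exhaustive search over candidate node sets is swapped in as a stand-in, which is only feasible for tiny meshes. The function names `coordinate_variance` and `choose_receivers`, and the choice to sum the per-dimension variances into one objective value, are assumptions made for illustration.

```python
from itertools import combinations


def coordinate_variance(points):
    """Sum over dimensions of the population variance of the coordinates."""
    count = len(points)
    total = 0.0
    for d in range(len(points[0])):
        mean = sum(p[d] for p in points) / count
        total += sum((p[d] - mean) ** 2 for p in points) / count
    return total


def choose_receivers(mesh_shape, holders, k):
    """Pick k * len(holders) new receivers that maximize the variance of
    the coordinates of all nodes holding the data after this stage.

    Exhaustive search stands in for the integer quadratic programming
    subroutine; it is an assumption for illustration, not the source's method.
    """
    width, height = mesh_shape
    free = [(x, y) for x in range(width) for y in range(height)
            if (x, y) not in holders]
    needed = k * len(holders)
    best = max(combinations(free, needed),
               key=lambda new: coordinate_variance(list(holders) + list(new)))
    return list(best)


if __name__ == "__main__":
    # Stage m = 1 on a 4x4 mesh: the origin (0, 0) holds the data and k = 3.
    print(choose_receivers((4, 4), [(0, 0)], 3))
```

On a 4×4 mesh with the origin at (0, 0) and k = 3, the search selects the other three corners, which matches the intent of sending data as far away as possible in the first stage.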

After step A is completed, in step B the program associates the nodes ND that have received the data by stage m−1 with the nodes ND that receive the data at stage m. Step B includes the following sub-steps B1, B2, and B3 and is executed, for example, for each relay stage.

In sub-step B1, for each of the (k+1)^(m−1) nodes ND that have received the data by stage m−1, the program assigns k of the k×(k+1)^(m−1) nodes ND that are to receive the data at stage m. That is, the program determines the correspondence between the nodes ND that transfer the data and the nodes ND that receive the data at stage m.

Next, in sub-step B2, the program adds to the reception stage number table 24 an entry for each of the k×(k+1)^(m−1) nodes ND that receive the data at stage m.

In sub-step B3, for each of the nodes ND that have received the data by stage m−1, the program adds to the transfer destination node table 26 an entry that lists the k assigned nodes ND as its transfer destination nodes ND. Sub-steps B2 and B3 may be executed in the reverse order.
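Step B can be sketched in Python as follows. The sketch takes the receivers already chosen for each stage (the output of step A) and fills two dictionaries that play the role of the reception stage number table 24 and the transfer destination node table 26. The dictionary layout, the function name `build_tables`, the use of stage 0 for the origin node, and the simple in-order assignment in sub-step B1 are all assumptions for illustration; the source does not fix how the k receivers are matched to each holder.

```python
def build_tables(origin, receivers_per_stage, k):
    """Build illustrative versions of tables 24 and 26.

    receivers_per_stage[m - 1] lists the nodes that receive data at stage m.
    Returns (reception_stage, transfer_dest) where
      reception_stage[node]  -> stage at which the node receives the data
      transfer_dest[(node, m)] -> list of k destinations of node at stage m
    """
    reception_stage = {origin: 0}  # assumption: the origin is recorded as stage 0
    transfer_dest = {}
    holders = [origin]
    for m, receivers in enumerate(receivers_per_stage, start=1):
        if len(receivers) != k * len(holders):
            raise ValueError(f"stage {m}: expected {k * len(holders)} receivers")
        for i, src in enumerate(holders):
            # Sub-step B1: give each holder k of the stage-m receivers (in order).
            dests = list(receivers[i * k:(i + 1) * k])
            # Sub-step B3: register the destinations of this holder for stage m.
            transfer_dest[(src, m)] = dests
            # Sub-step B2: register the stage at which each destination receives.
            for dst in dests:
                reception_stage[dst] = m
        holders = holders + list(receivers)
    return reception_stage, transfer_dest


if __name__ == "__main__":
    stages, dests = build_tables(0, [[4], [2, 6]], k=1)
    print(stages)
    print(dests)
```

A node that receives the data could then look up its own reception stage and forward the data to the destinations registered for each later stage, which corresponds to the table lookup performed by the communication unit.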

By executing steps A and B, it is determined to which nodes ND each node ND transfers the data at each relay stage. Solving the quadratic programming problem achieves the condition that "in the early stages of the broadcast communication, messages (data) are transferred to nodes ND as far away as possible". As a result, packet (data) collisions are unlikely to occur, and a communication procedure is obtained in which the transfer capability is unlikely to be impaired even in the latter half of the first phase described above.
<Example 2>
In Example 1, the transfer destination nodes ND are determined without considering the use of common links. Consequently, when data is transferred over a common link, the transfer efficiency decreases. In Example 2, the transfer destination nodes ND are determined so as to reduce the likelihood of using a common link.

Example 2 executes the same processing as Example 1 except that sub-step B1 differs. In sub-step B1 of Example 2, after assigning k transfer destination nodes ND to each transfer source node ND, the program determines whether any of the data transfer routes, obtained for example by dimension-order routing, use a common link. If there are routes that use a common link, the program changes the assignment of the transfer destination nodes ND so that no routes use a common link.

If routes that use a common link cannot be eliminated completely, the program may change the assignment of the transfer destination nodes ND so that the proportion of routes that use a common link is minimized at each relay stage. Furthermore, the program may change the assignment of the transfer destination nodes ND so that the proportion of routes that use a common link is minimized across a plurality of relay stages.
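The common-link check of Example 2 can be sketched in Python as follows. The sketch assumes a mesh without wraparound, dimension-order routing that corrects the lowest dimension first, and directed links (so traffic in opposite directions between two nodes does not conflict). These choices and the function names `dimension_order_route` and `shared_links` are assumptions for illustration and are not specified in the source.

```python
from collections import Counter


def dimension_order_route(src, dst):
    """Return the directed links used when routing from src to dst,
    correcting one coordinate at a time, lowest dimension first."""
    links = []
    cur = list(src)
    for d in range(len(src)):
        step = 1 if dst[d] > cur[d] else -1
        while cur[d] != dst[d]:
            nxt = cur.copy()
            nxt[d] += step
            links.append((tuple(cur), tuple(nxt)))
            cur = nxt
    return links


def shared_links(transfers):
    """Return the set of links used by more than one transfer of a stage.

    transfers is a list of (source, destination) coordinate pairs that
    are assumed to be distinct and to take place at the same relay stage.
    """
    usage = Counter()
    for src, dst in transfers:
        usage.update(dimension_order_route(src, dst))
    return {link for link, count in usage.items() if count > 1}


if __name__ == "__main__":
    stage = [((0, 0), (3, 0)), ((1, 0), (2, 0))]
    print(shared_links(stage))
```

An empty result means that the assignment for that stage needs no change. A non-empty result identifies the links for which the program would try a different pairing of transfer source and transfer destination nodes.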

In Example 2, the probability of using a common link during data transfer can be reduced, and the resulting improvement in data transfer efficiency shortens the broadcast communication time.

As described above, in the present embodiment, a node ND that has received the data can continue to participate in the data transfer of the broadcast communication, and each node ND can participate in the data transfer for more relay stages than when data is transferred to adjacent nodes ND. As a result, the number of nodes ND that transfer data at each relay stage can be increased, and the time required to complete the broadcast communication can be reduced.

Furthermore, transferring the data to nodes ND as far away as possible distributes the nodes ND that receive the data within the sub-network SNW. With the receiving nodes ND distributed within the sub-network SNW, more nodes ND can transfer the data to other nodes ND over more relay stages without sharing links.

Because the calculation unit 10 of each node ND executes the common data transfer destination determination program, every node ND can create the same reception stage number table 24 and transfer destination node table 26. Each node ND therefore does not need to notify the other nodes ND of the transfer destination nodes ND it has calculated, which suppresses an increase in the communication load on the network NW.

The reception stage number table 24 and the transfer destination node table 26 contain the reception information and transfer information of all the nodes ND targeted by the broadcast communication. Each node ND can therefore generate the reception stage number table 24 and the transfer destination node table 26 by executing the common data transfer destination determination program. Consequently, the management node 50 only has to distribute a single data transfer destination determination program to the nodes ND and have them execute it, which simplifies the management of the nodes ND by the management node 50.

If transfer destination nodes ND overlap when the transfer destination nodes ND for the first phase are determined, adjacent nodes ND are assigned as transfer destination nodes ND in the second phase, which reduces the likelihood that data is transferred redundantly to a single transfer destination node ND. As a result, a decrease in the data transfer efficiency of the broadcast communication can be suppressed.

Packets can be transferred with a reduced likelihood of using a common link simultaneously, which improves the packet transfer efficiency compared with transferring packets over a common link simultaneously.

The common data transfer destination determination program is executed for each group of nodes ND included in a sub-network SNW, and each node ND creates the common reception stage number table 24 and transfer destination node table 26. That is, the reception stage number table 24 and the transfer destination node table 26 are generated according to the shape parameters of the sub-network SNW and the number of nodes ND included in the sub-network SNW. The data transfer efficiency of the broadcast communication can therefore be set optimally for the size of the sub-network SNW.

The features and advantages of the embodiments will be apparent from the above detailed description. The claims are intended to cover the features and advantages of the embodiments described above without departing from their spirit and scope. Furthermore, a person having ordinary skill in the art should be able to readily conceive of any improvements and modifications. Accordingly, there is no intention to limit the scope of the inventive embodiments to those described above, and appropriate improvements and equivalents that fall within the scope disclosed in the embodiments may be relied upon.

10 calculation unit
20 storage unit
22 network coordinate table
24 reception stage number table
26 transfer destination node table
28 data transfer destination determination program
30 communication unit
50 management node
100 parallel processing device
MNW management network
ND node
NW network
RANK rank number
SNW sub-network

Claims

1. A parallel processing device including a plurality of nodes connected to each other via a network, wherein each of the plurality of nodes comprises:
a calculation unit that, based on configuration information of the network, position information of each node on the network, and origin node information indicating the origin node of a broadcast communication, calculates a transfer destination node, which is a transfer destination of data in the broadcast communication, such that the transfer distance gradually decreases as the number of transfers increases;
a storage unit that stores the position information of the transfer destination node for each transfer count calculated by the calculation unit; and
a communication unit that, when data is received from another node during the broadcast communication, determines a transfer destination node based on the information stored in the storage unit and transfers the received data to the determined transfer destination node.

2. The parallel processing device, wherein
the calculation unit calculates, for each of the plurality of nodes, transfer destination nodes such that the transfer distance to the transfer destination node gradually decreases as the number of transfers increases,
the storage unit stores the transfer destination nodes calculated by the calculation unit for each of the plurality of nodes in association with the number of transfers, and
the communication unit, when receiving data from another node, transfers the data to the transfer destination node stored in the storage unit for each transfer count corresponding to its own node.

3. The parallel processing device according to claim 1 or 2, wherein
when the transfer destination nodes that receive data from transfer source nodes, which are sources of the data, overlap, the calculation unit stores in the storage unit the transfer destination nodes up to the transfer count immediately preceding the transfer count at which the transfer destination nodes overlap, and
when the communication unit completes the transfer of the data to the transfer destination nodes stored in the storage unit, the communication unit transfers the data to a node that has not received the data among the adjacent nodes adjacent to its own node.

4. The parallel processing device according to claim 1, wherein
the storage unit has a reception condition holding area that holds, for each of the plurality of nodes, the transfer count at which the node receives data, and a transfer condition holding area that holds, for each of the plurality of nodes, the correspondence between transfer counts and transfer destination nodes, and
the communication unit of each of the plurality of nodes, upon receiving data, refers to the reception condition holding area to detect the transfer count of the data and, based on the detected transfer count, refers to the transfer condition holding area to determine the transfer destination node to which the data is transferred.

5. The parallel processing device according to claim 4, wherein the calculation unit
assigns, for each transfer count, a predetermined number of nodes that have not received the data as transfer destination nodes, and stores the correspondence between the assigned transfer destination nodes and the transfer count in the reception condition holding area, and
associates, for each transfer count, transfer source nodes, which are sources of the data, with transfer destination nodes, and stores the association between the transfer source nodes and the transfer destination nodes in the transfer condition holding area together with the transfer count.

6. The parallel processing device, wherein, when data would be transferred to a plurality of transfer destination nodes over a shared link, the calculation unit changes the assignment to a combination of transfer source nodes and transfer destination nodes that does not share a link.

7. The parallel processing device according to claim 1, wherein the calculation unit assigns, for each transfer count, a predetermined number of distributed nodes from among the nodes that have not received the data as transfer destination nodes.

8. The parallel processing device according to claim 1, wherein the network targeted for broadcast communication is a sub-network that is a part of the entire network.

9. A data transfer destination determining method for determining a data transfer destination during broadcast communication in a parallel processing device including a plurality of nodes connected to each other via a network, wherein each of the plurality of nodes:
calculates, based on configuration information of the network, position information of each node on the network, and origin node information indicating the origin node of the broadcast communication, a transfer destination node, which is a transfer destination of data in the broadcast communication, such that the transfer distance gradually decreases as the number of transfers increases; and
stores the calculated position information of the transfer destination node for each transfer count in a storage unit in the node.

10. A data transfer destination determination program for determining a data transfer destination during broadcast communication in a parallel processing device including a plurality of nodes connected to each other via a network, the program causing a computer included in each of the plurality of nodes to:
calculate, based on configuration information of the network, position information of each node on the network, and origin node information indicating the origin node of the broadcast communication, a transfer destination node, which is a transfer destination of data in the broadcast communication, such that the transfer distance gradually decreases as the number of transfers increases; and
store the calculated position information of the transfer destination node for each transfer count in a storage unit in the node.