JP2000112912A

JP2000112912A - Processing system for test and copy against remote memory in distributed memory-type parallel computer

Info

Publication number: JP2000112912A
Application number: JP10281112A
Authority: JP
Inventors: Masanobu Inaba; 政信稲葉
Original assignee: NEC Computertechno Ltd
Current assignee: NEC Computertechno Ltd
Priority date: 1998-10-02
Filing date: 1998-10-02
Publication date: 2000-04-21

Abstract

PROBLEM TO BE SOLVED: To speed up test-and-copy processing in a distributed-memory parallel computer. SOLUTION: Node 1 transmits a test instruction, a copy instruction, and copy data in succession. On receiving them, node 2 repeatedly executes the test instruction against the remote memory within its own node until the test succeeds. Once the test has succeeded, the copy instruction copies the copy data to the remote memory.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a test-and-copy processing method for a remote memory in a distributed-memory parallel computer, and more particularly to a high-speed test-and-copy processing method.

[0002]

2. Description of the Related Art

When computing on a distributed-memory parallel computer, it is desirable to write programs that stay within a single compute node so that inter-node data transfers occur as rarely as possible. This is because the compute nodes are connected through a network, and the distance between compute nodes is greater than the distance within a compute node. For large-scale scientific and engineering problems, however, programming confined to each compute node is virtually impossible, and programs end up having the compute nodes work in cooperation. One example is a large problem array mapped across multiple nodes (mapped as a global memory space).

[0003] FIG. 1 is a block diagram of a distributed-memory parallel computer.

[0004] Referring to FIG. 1, the distributed-memory parallel computer has a plurality of compute nodes 1 and 2 (two nodes are used as an example, but the number of nodes is not limited to two) and a Network 3 connecting them. Compute node 1 comprises a CPU (Central Processing Unit) 11, an MMU (Main Memory Unit) 12, and an RCU (Remote Control Unit) 13. Compute node 2 likewise comprises a CPU 21, an MMU 22, and an RCU 23.

[0005] In compute node 1, the CPU 11 can load data from and store data to the MMU 12; the CPU 11 operates on the loaded data and stores the result in the MMU 12. The RCU 13 accepts from the CPU 11 instructions for data transfer between MMUs across compute nodes, and carries out inter-node data transfer in cooperation with the RCU of the other node. For example, when the CPU 11 instructs the RCU 13 to transfer data in the MMU 12 to the MMU 22, the RCU 13 loads the data from the MMU 12 and transfers it to the RCU 23 via Network 3, and the RCU 23 writes the transferred data to the MMU 22. This is called an inter-node write transfer. When the CPU 11 instructs the RCU 13 to transfer data in the MMU 22 to the MMU 12, the RCU 13 sends a data transfer request to the RCU 23, the RCU 23 loads the data from the MMU 22 and transfers it to the RCU 13 via Network 3, and the RCU 13 writes the transferred data to the MMU 12. This is called an inter-node load transfer. Here, compute node 1, which issued the instruction, is called the local node, and compute node 2, which acts on it, is called the remote node.
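The two transfer types described above can be sketched functionally as follows. This is only an illustrative model, not the patented hardware: the `Node` class, the dictionary standing in for each MMU, and the packet tuples are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One compute node; `mmu` stands in for its main memory (hypothetical model)."""
    name: str
    mmu: dict = field(default_factory=dict)


def write_transfer(local: Node, remote: Node, src_addr: int, dst_addr: int) -> None:
    """Inter-node write transfer: local MMU -> local RCU -> network -> remote RCU -> remote MMU."""
    data = local.mmu[src_addr]           # local RCU loads the data from the local MMU
    packet = ("WRITE", dst_addr, data)   # command, address, and data cross the network
    _, addr, payload = packet            # remote RCU receives the packet
    remote.mmu[addr] = payload           # remote RCU writes the data to the remote MMU


def load_transfer(local: Node, remote: Node, src_addr: int, dst_addr: int) -> None:
    """Inter-node load transfer: request goes out, data comes back, local RCU writes it."""
    request = ("LOAD", src_addr)         # local RCU sends a data transfer request
    _, addr = request                    # remote RCU receives the request
    data = remote.mmu[addr]              # remote RCU loads the data from the remote MMU
    local.mmu[dst_addr] = data           # data returns and is written to the local MMU


node1 = Node("node1", {0: 42})
node2 = Node("node2", {8: 7})
write_transfer(node1, node2, src_addr=0, dst_addr=16)
load_transfer(node1, node2, src_addr=8, dst_addr=1)
```

In both cases the node whose CPU started the transfer plays the local role and the other node plays the remote role.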

[0006] In FIG. 1, consider as a program example a case where a conditional copy operation spans nodes. The program below tests the flag NODE2_FLAG and, if its value is "1", copies array NODE1(I) to array NODE2(I+J). Here, array NODE1(I) is mapped to compute node 1, and array NODE2(I+J) and the flag NODE2_FLAG are mapped to compute node 2. The parent process is assumed to run on the CPU of compute node 1.

[0007]

    DO I = M, N
      IF NODE2_FLAG THEN NODE2(I+J) = NODE1(I)
    END DO

In this case, the CPU of compute node 1 issues two instructions, a test instruction and a copy instruction, to the RCU of compute node 1. The first of these, the test instruction, travels back and forth between compute node 1 and compute node 2 until the test on the memory of compute node 2 succeeds. Moreover, the copy instruction is issued only after the test succeeds and the CPU of compute node 1 recognizes that the test is complete (that is, after it receives the test end reply from compute node 2), so the overhead from issuing the copy instruction until the copy data reaches the memory of compute node 2 is large.
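For readers less familiar with the Fortran-style notation, the program example can be rendered literally in Python as below. The function name, the use of Python lists for the arrays, and the callable used to read the flag are assumptions made for illustration; Python indices are 0-based, whereas the original loop bounds M and N are simply passed through.

```python
def conditional_copy(node1, node2, node2_flag, m, n, j):
    """Copy node1[i] to node2[i + j] for i in m..n (inclusive) whenever the flag is set."""
    for i in range(m, n + 1):        # DO I = M, N includes the upper bound N
        if node2_flag():             # test: the flag resides on compute node 2
            node2[i + j] = node1[i]  # copy: the destination resides on compute node 2


src = [10, 11, 12, 13]
dst = [0] * 6
conditional_copy(src, dst, lambda: True, m=1, n=2, j=3)

untouched = [0] * 6
conditional_copy(src, untouched, lambda: False, m=1, n=2, j=3)
```

Every iteration touches compute node 2 twice (once for the flag, once for the destination element), which is why the text treats the test and the copy as remote-memory operations.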

[0008] FIG. 6 is a block diagram showing a configuration example of the conventional RCU 23 (and 13), and FIG. 7 is a time chart for explaining the operation of the conventional technique.

[0009] The test-and-copy operation in a conventional distributed-memory parallel computer is described with reference to FIGS. 1, 6, and 7.

[0010] First, in local node 1, the CPU 11 issues to the RCU 13 a test instruction for the remote memory (MMU 22). In the RCU 13, the request receiving unit 1301 accepts the test instruction; the contention arbitration unit 1303 arbitrates it against requests from Network 3; the address conversion unit 1304 then performs physical node number conversion and remote JOB number conversion; and the request/data sending unit 1305 sends the instruction to the RCU 23 (via Network 3).

[0011] Next, in remote node 2, the RCU 23 accepts the test instruction from the RCU 13 (via Network 3) at the request receiving unit 2301. After contention arbitration in the contention arbitration unit 2303 and physical address conversion in the address conversion unit 2304, it issues the test instruction to the MMU 22. The test end reply from the MMU 22 is returned to the RCU 13 via Network 3 by the request/data sending unit 2305.

[0012] The RCU 13 judges the result and, if the test failed, repeats the same test sequence. In FIG. 7, the test sequence succeeds at the third test end reply, so the third test end reply is returned to the CPU 11 and the test instruction completes.

[0013] Next, in local node 1, the CPU 11 issues to the RCU 13 a copy instruction for the remote memory (MMU 22). In the RCU 13, the request receiving unit 1301 accepts the copy instruction; after contention arbitration in the contention arbitration unit 1303, the address conversion unit 1304 performs physical node number conversion, remote JOB number conversion, and physical address conversion, and the MMU 12 is accessed. The copy instruction and the load (copy) data from the MMU 12 are then sent together from the request/data sending unit 1305 to the RCU 23 (via Network 3).

[0014] Next, the RCU 23 accepts the copy instruction and the copy data at the request receiving unit 2301 and the data receiving unit 2302, respectively. After the contention arbitration unit 2303 and the address conversion unit 2304 perform contention arbitration and address conversion for the copy instruction, the copy instruction (command and address) and the copy data are sent to the MMU 22 and the write (copy) is performed. The RCU 23 sends a notification that the data transfer from the RCU 13 and the write to the MMU 22 completed normally (the copy end reply) from the request/data sending unit 2305 to the RCU 13 via Network 3, and the RCU 13 returns this reply to the CPU 11, completing the sequence.

[0015] In the configuration example of FIG. 6, test-and-copy processing (from the CPU 11 issuing the request until it receives the reply) completes in 80T.
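The cost of the conventional sequence comes mainly from how often a message has to cross the network. The sketch below counts test attempts and network crossings for the locally driven retry loop. It is a simplified model under stated assumptions: the latency values and the flag changing at 24T are taken from the convention given later in the description, the CPU-to-RCU latency of 1T is an assumption (the text does not state it), and the model does not attempt to reproduce the 80T total, which depends on details of FIG. 7 that are not available in the text.

```python
NET = 6      # network latency between RCU 13 and RCU 23 (value stated in the description)
MEM = 1      # latency between an RCU and its MMU (value stated in the description)
CPU_RCU = 1  # assumption: the CPU-to-RCU latency is not stated in the text


def conventional_sequence(flag_set_at=24):
    """Return (tests_issued, network_crossings) for the locally driven retry loop."""
    crossings = 0
    tests = 0
    t = CPU_RCU                       # test instruction reaches RCU 13
    while True:
        tests += 1
        t += NET + MEM                # test travels to RCU 23, then to MMU 22
        succeeded = t >= flag_set_at  # the flag is sampled when the test reaches MMU 22
        t += MEM + NET                # test end reply travels back to RCU 13
        crossings += 2                # one crossing out, one crossing back
        if succeeded:
            break
    crossings += 2                    # copy instruction and data out, copy end reply back
    return tests, crossings


tests, crossings = conventional_sequence()
print(tests, crossings)  # prints: 3 8
```

Under these assumptions the third test is the first to reach the MMU 22 after the flag changes, which agrees with the three test sequences described for FIG. 7, and the whole operation needs eight network crossings.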

[0016]

PROBLEMS TO BE SOLVED BY THE INVENTION

In the conventional test-and-copy processing in a distributed-memory parallel computer described above, the CPU of compute node 1 issues two instructions, a test instruction and a copy instruction, to the RCU of compute node 1. The test instruction travels back and forth between compute node 1 and compute node 2 until the test on the memory of compute node 2 succeeds, and the copy instruction is issued only after the test succeeds and the CPU of compute node 1 recognizes that the test is complete (after it receives the test end reply from compute node 2). As a result, there is the problem that the overhead from when the test instruction is issued at compute node 1 until execution of the copy instruction finishes at compute node 2 is large.

[0017] An object of the present invention is to provide a method for speeding up test-and-copy processing on a remote memory in a distributed-memory parallel computer.

[0018]

MEANS FOR SOLVING THE PROBLEMS

The first invention of the present application is a test-and-copy processing method for a remote memory in a distributed-memory parallel computer comprising a plurality of nodes on a network, characterized by comprising: a first node that sends a test instruction, a copy instruction, and copy data in succession; and a second node that, on receiving the test instruction, the copy instruction, and the copy data, executes the test instruction against the remote memory within its own node and, after executing the test instruction, copies the copy data to the remote memory in accordance with the copy instruction.

[0019] The second invention of the present application is characterized in that, in the first invention, the second node repeatedly executes the test instruction against the remote memory within its own node until the instruction succeeds.

[0020] The third invention of the present application is a test-and-copy processing method for a remote memory in a distributed-memory parallel computer comprising a plurality of nodes on a network, in which each node comprises a CPU (central processing unit), an MMU (main memory unit), and an RCU (remote control unit), characterized in that: the first RCU in the first node comprises sending means which, on receiving a test-and-copy instruction from the first CPU, decomposes the instruction into a test instruction and a copy instruction, sends the test instruction to the second RCU in the second node, and then sends the copy instruction together with copy data from the first MMU to the second RCU; and the second RCU in the second node comprises a command/address save buffer that stores the received test instruction and copy instruction, a data save buffer that stores the received copy data, and a repetition control unit that repeatedly takes the test instruction out of the command/address save buffer and issues it to the second MMU until the test instruction succeeds and, when the test instruction succeeds, takes the copy instruction out of the command/address save buffer and the copy data out of the data save buffer and issues them to the second MMU.

[0021] The fourth invention of the present application is a test-and-copy processing method for a remote memory in a distributed-memory parallel computer comprising a plurality of nodes on a network, characterized by comprising: a first node that sends a test instruction and, after receiving a test end reply, sends a copy instruction and copy data in succession; and a second node that, on receiving the test instruction, executes the test instruction against the remote memory within its own node, sends the test end reply after executing the test instruction, receives the copy instruction and copy data sent in response to that test end reply, and copies the copy data to the remote memory in accordance with the copy instruction.

[0022] The fifth invention of the present application is characterized in that, in the fourth invention, the second node repeatedly executes the test instruction against the remote memory within its own node until the instruction succeeds.

[0023] The sixth invention of the present application is a test-and-copy processing method for a remote memory in a distributed-memory parallel computer comprising a plurality of nodes on a network, in which each node comprises a CPU (central processing unit), an MMU (main memory unit), and an RCU (remote control unit), characterized in that: the first RCU in the first node comprises sending means which, on receiving a test-and-copy instruction from the first CPU, decomposes the instruction into a test instruction and a copy instruction, sends the test instruction to the second RCU in the second node, and, after receiving a test end reply, sends the copy instruction together with copy data from the first MMU to the second RCU; and the second RCU in the second node comprises a command/address save buffer that stores the received test instruction, a repetition control unit that repeatedly takes the test instruction out of the command/address save buffer and issues it to the second MMU until the test instruction succeeds and sends the test end reply when the test instruction succeeds, and control means that receives the copy instruction and copy data sent in response to the test end reply and issues them to the second MMU.

[0024] [Operation] To speed up test-and-copy processing on a remote memory in a distributed-memory parallel computer, the following two measures are adopted.

[0025] 1. A newly provided test-and-copy instruction lets the CPU of compute node 1 issue the test instruction and the copy instruction at the same time. The test instruction, the copy instruction, and the copy data are transferred to compute node 2 simultaneously (meaning in immediate succession; the same applies below), so that compute node 2 can execute the copy instruction immediately after the test instruction finishes executing.

[0026] 2. The retry sequence of the test instruction is executed entirely within compute node 2 (the test end reply is not returned to compute node 1 until the test succeeds).

[0027] With measure 1, the transfer time of the copy instruction and copy data is hidden behind the latency of the test instruction. With measure 2, the turnaround time caused by test retries travelling back and forth between compute nodes is greatly reduced. As a result, test-and-copy processing in a distributed-memory parallel computer is sped up.

[0028]

EMBODIMENTS OF THE INVENTION

Next, embodiments of the present invention are described in detail with reference to the drawings.

[0029] FIG. 1 is a block diagram of a distributed-memory parallel computer, FIG. 2 is a block diagram showing a configuration example of the RCU 23 (and 13) in one embodiment of the present invention, and FIG. 3 is a time chart for explaining the operation of that embodiment.

[0030] First, the configuration of this embodiment is described, focusing on the RCU 23 of remote node 2. The mapping of arrays to compute nodes is the same as described under the related art.

[0031] Referring to FIG. 2, the request receiving unit 2301 accepts and holds instructions from the CPU 21 or from the RCU 13 (via Network 3). The data receiving unit 2302 accepts and holds the data portion transferred from the RCU 13 (via Network 3). The contention arbitration unit 2303 selects the requests held in the request receiving unit 2301 one at a time (contention arbitration). The address conversion unit 2304 converts a logical node number to a physical node number, a local JOB number to a remote JOB number, and an intra-node logical address to an intra-node physical address. Physical node number conversion and remote JOB number conversion are needed for instructions that access another node via Network 3, whereas intra-node physical address conversion is needed for instructions that access the node's own memory (here, the MMU 22). The request/data sending unit 2305 sends the address-converted instruction (command and address) and load data from the MMU 22 to another node (the RCU 13, via Network 3). The data held in the data receiving unit 2302 is needed when another node (the RCU 13, via Network 3) writes to the MMU 22.

[0032] Next, the command/address save buffer 2311, the data save buffer 2314, the selectors 2312 and 2315, and the repetition control unit 2313, which are the features of the present invention, are described. The command/address save buffer 2311 saves the command and address used to access the MMU 22 for a test-and-copy instruction, which makes it possible to reissue the test retry (the check of NODE2_FLAG) to the MMU 22 repeatedly. The data save buffer 2314 holds the copy data from local node 1 until the test succeeds (NODE2_FLAG = 1); the copy data is sent to the MMU 22 only once the test has succeeded. The selector 2312 normally selects the address conversion unit 2304, and selects the command/address save buffer 2311 only during test retry processing and data copying for a test-and-copy instruction. The selector 2315 normally selects the data receiving unit 2302, and selects the data save buffer 2314 only during data copy processing for a test-and-copy instruction. The repetition control unit 2313 checks the result returned from the MMU 22 during test retry processing for a test-and-copy instruction and switches the selection of the selectors 2312 and 2315. The switching logic follows the descriptions of the selectors 2312 and 2315 above.
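The interplay of the two save buffers and the repetition control unit can be sketched as a small state machine. This is an illustrative software model only: the class name `RemoteRCU`, its method names, and the dictionary standing in for the MMU 22 are assumptions, and the selectors 2312 and 2315 are not modelled separately (their effect is folded into which saved item `step` reads).

```python
class RemoteRCU:
    """Hypothetical functional model of the save buffers and repetition control in RCU 23."""

    def __init__(self, mmu):
        self.mmu = mmu            # stands in for MMU 22
        self.test_cmd = None      # saved test command and address (buffer 2311)
        self.copy_cmd = None      # saved copy command and address (buffer 2311)
        self.copy_data = None     # saved copy data (buffer 2314)
        self.test_passed = False

    def receive_test(self, flag_addr):
        self.test_cmd = ("TEST", flag_addr)

    def receive_copy(self, dst_addr, data):
        self.copy_cmd = ("COPY", dst_addr)
        self.copy_data = data

    def step(self):
        """One pass of the repetition control; returns True once the copy has been written."""
        if self.test_cmd is None:
            return False
        if not self.test_passed:
            _, flag_addr = self.test_cmd
            self.test_passed = self.mmu.get(flag_addr, 0) == 1
            if not self.test_passed:
                return False      # test failed: it stays saved and is reissued next pass
        if self.copy_cmd is None:
            return False          # test succeeded but the copy has not arrived yet
        _, dst_addr = self.copy_cmd
        self.mmu[dst_addr] = self.copy_data   # write (copy) to the memory
        self.test_cmd = self.copy_cmd = self.copy_data = None
        self.test_passed = False
        return True


mmu22 = {"NODE2_FLAG": 0}
rcu23 = RemoteRCU(mmu22)
rcu23.receive_test("NODE2_FLAG")
first = rcu23.step()              # flag is still 0, so the test fails
rcu23.receive_copy("NODE2[5]", 99)
second = rcu23.step()             # copy is buffered, test still fails
mmu22["NODE2_FLAG"] = 1
third = rcu23.step()              # test succeeds and the buffered copy is written
```

The point of the buffers is visible in the usage: the copy arrives while the test is still failing, waits in the node, and is written without any further message from the local node once the flag changes.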

[0033] Next, the test-and-copy operation in the distributed-memory parallel computer of this embodiment is described with reference to FIGS. 1, 2, and 3.

[0034] First, in local node 1, the CPU 11 issues to the RCU 13 a test-and-copy instruction for the remote memory (MMU 22). In the RCU 13, the request receiving unit 1301 accepts the test-and-copy instruction and decomposes it into a test instruction and a copy instruction. After contention arbitration in the contention arbitration unit 1303, the test instruction undergoes physical node number conversion and remote JOB number conversion in the address conversion unit 1304 and is sent from the request/data sending unit 1305 to the RCU 23 (via Network 3). Next, after contention arbitration in the contention arbitration unit 1303, the copy instruction undergoes physical address conversion in the address conversion unit 1304 and accesses the MMU 12. The copy instruction and the load (copy) data from the MMU 12 are then sent together from the request/data sending unit 1305 to the RCU 23 (via Network 3).
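The decomposition performed on the local side can be illustrated with a short sketch. The dictionary-based instruction and packet layouts, the field names, and the function name are all hypothetical; the text specifies only that the test instruction is sent first and that the copy instruction follows together with the data loaded from the local memory.

```python
def decompose_test_and_copy(instr, local_mmu):
    """Split one test-and-copy instruction into the two packets sent back to back."""
    test_packet = {
        "cmd": "TEST",
        "node": instr["remote_node"],
        "addr": instr["flag_addr"],
    }
    copy_packet = {
        "cmd": "COPY",
        "node": instr["remote_node"],
        "addr": instr["dst_addr"],
        "data": local_mmu[instr["src_addr"]],  # copy data loaded from the local memory
    }
    return [test_packet, copy_packet]          # order matters: test first, then copy


mmu12 = {100: 3.14}
packets = decompose_test_and_copy(
    {"remote_node": 2, "flag_addr": 0, "src_addr": 100, "dst_addr": 200},
    mmu12,
)
```

Because the copy packet already carries its data, the remote node needs nothing further from the local node once the test succeeds.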

[0035] Next, in remote node 2, the RCU 23 accepts the test instruction from the RCU 13 (via Network 3) at the request receiving unit 2301. After contention arbitration in the contention arbitration unit 2303, the address conversion unit 2304 performs physical address conversion, and the instruction is stored in the command/address save buffer 2311 and, at the same time, issued to the MMU 22. The MMU 22 returns the result to the repetition control unit 2313, and if the test failed (NODE2_FLAG = 0), the test instruction stored in the command/address save buffer 2311 is reissued to the MMU 22. In FIG. 3, NODE2_FLAG becomes 1 at 24T, so the test fails eight times and succeeds on the ninth attempt. Meanwhile, the copy instruction arrives from the RCU 13 (via Network 3) while the test is being retried. Its command and address are accepted by the request receiving unit 2301, arbitrated by the contention arbitration unit 2303, converted to a physical address by the address conversion unit 2304, and stored in the command/address save buffer 2311. Its data is accepted by the data receiving unit 2302 and stored in the data save buffer 2314. The copy instruction is held in the save buffers (2311 and 2314) until the test succeeds. When the test succeeds on the ninth attempt, the command and address in the command/address save buffer 2311 and the data in the data save buffer 2314 are sent to the MMU 22 and the write (copy) is performed. The RCU 23 sends a notification that the data transfer from the RCU 13 and the write to the MMU 22 completed normally (the test-and-copy end reply) from the request/data sending unit 2305 to the RCU 13 via Network 3, and the RCU 13 returns this reply to the CPU 11, completing the sequence.

[0036] Here, for convenience, the latencies between and within units are defined as follows. NODE2_FLAG is assumed to change from "0" to "1" at 24T, where T corresponds to one machine clock of this distributed-memory parallel computer system.

[0037]
1. Latency between CPU (11, 21) and MMU (12, 22): 1T
2. Latency between MMU (12, 22) and RCU (13, 23): 1T
3. Network latency (between RCU 13 and RCU 23): 6T
4. Transit latency within each unit: 0T

Under these conditions, in the configuration example of the present invention, test-and-copy processing (from the CPU 11 issuing the request until it receives the reply) completes in 50T. The following remarks apply to the above description of this embodiment.

1. Array NODE1(I) was mapped to compute node 1, and array NODE2(I+J) and the flag NODE2_FLAG were mapped to compute node 2, but there is no restriction on the compute nodes to which they are mapped.

[0038] 2. The parent process was described as running on the CPU of compute node 1, but there is no restriction on where the parent process is assigned.

[0039] 3. Two compute nodes were described as being connected to the network, but there is no restriction on this number.

[0040] 4. Each compute node was described as containing one CPU, but there is no restriction on this number. That is, a node may internally be a shared-memory system with multiple CPUs.

[0041] 5. The latencies between units and within units were described using fixed values, but there is no restriction on these values.

[0042] 6. There is no restriction on the capacity of the command/address save buffer.

[0043] 7. There is no restriction on the capacity of the data save buffer.
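The retry count stated for FIG. 3 (eight failures, success on the ninth attempt) can be checked against the latencies listed above with a short timeline sketch. The CPU-to-RCU latency of 1T is an assumption, since the text lists only the CPU/MMU, MMU/RCU, and network latencies; the function name is hypothetical. The sketch deliberately stops at the moment the test succeeds and does not try to reproduce the 50T total, which depends on details of FIG. 3 that are not available in the text.

```python
NET = 6      # Network 3 latency between RCU 13 and RCU 23
MEM = 1      # latency between an RCU and its MMU
CPU_RCU = 1  # assumption: the CPU-to-RCU latency is not stated in the text


def remote_retry_timeline(flag_set_at=24):
    """Return (times at which tests reach MMU 22, arrival time of the copy at RCU 23)."""
    t_rcu13 = CPU_RCU                         # test-and-copy instruction reaches RCU 13
    # Copy path: RCU 13 reads MMU 12 (there and back), then crosses the network.
    copy_arrival = t_rcu13 + MEM + MEM + NET
    # Test path: crosses the network, then is issued to MMU 22.
    t = t_rcu13 + NET + MEM
    test_times = [t]
    while t < flag_set_at:                    # test failed: result returns, test is reissued
        t += MEM + MEM                        # MMU 22 -> RCU 23 -> MMU 22
        test_times.append(t)
    return test_times, copy_arrival


test_times, copy_arrival = remote_retry_timeline()
print(len(test_times), test_times[-1], copy_arrival)  # prints: 9 24 9
```

Under these assumptions the tests reach the MMU 22 at 8T, 10T, and so on up to 24T, so the ninth attempt is the first to see the flag set, and the copy instruction has already arrived at 9T, well before it is needed. Each retry costs 2T inside the node instead of a full network round trip.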

[0044] FIG. 4 is a diagram showing a configuration example of the RCU 23 (and 13) in another embodiment of the present invention, and FIG. 5 is a time chart for explaining the operation of that embodiment.

[0045] The only difference from the first embodiment of the present invention is this RCU 23 (and 13). Since the basic configuration of the RCU is unchanged, only the command/address save buffer 2311, the selector 2312, and the repetition control unit 2313, which are the features of the invention, are described.

[0046] The command/address saving buffer 2311 is a buffer for saving the command and address used when a test instruction accesses the MMU 22. It makes it possible to repeatedly issue the test retry processing (the check of NODE2_FLAG) to the MMU 22. The selector 2312 normally selects the address conversion unit 2304, and selects the command/address saving buffer 2311 only during the test retry processing of a test instruction. The repetition control unit 2313 checks the result returned from the MMU 22 during the test retry processing of a test instruction and switches the selection direction of the selector 2312. The switching logic of the selector 2312 follows the description of the selector 2312 given above.
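The interplay of the saving buffer, the selector, and the repetition control unit can be sketched in software as follows. This is only an illustrative model, not the circuit of FIG. 4: the class name `RetryPath`, the `(command, address)` tuple encoding, and the method names are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical encoding of a request: (command, physical address).
Request = Tuple[str, int]


@dataclass
class RetryPath:
    """Toy model of buffer 2311, selector 2312 and repetition control 2313."""

    saved: Optional[Request] = None  # contents of the saving buffer 2311
    retrying: bool = False           # True: selector 2312 points at the buffer

    def issue(self, from_converter: Optional[Request]) -> Optional[Request]:
        """Return the request the selector forwards to the MMU."""
        if self.retrying:
            return self.saved  # re-issue the saved test instruction
        if from_converter is not None and from_converter[0] == "TEST":
            self.saved = from_converter  # save the test while issuing it
        return from_converter

    def on_mmu_result(self, test_succeeded: bool) -> None:
        """Repetition control: stay on the buffer until the test succeeds."""
        self.retrying = not test_succeeded
        if test_succeeded:
            self.saved = None


path = RetryPath()
first = path.issue(("TEST", 0x100))  # normal path via the address converter
path.on_mmu_result(False)            # NODE2_FLAG was still 0
retry = path.issue(None)             # selector now picks the saved test
```

In this model the selector only leaves the address-conversion path while a test is outstanding and unsuccessful, which mirrors the switching rule stated in the paragraph above.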

[0047] Next, the test and copy operation of the distributed memory type parallel computer in the other embodiment of the invention is described with reference to FIG. 5 (see also FIGS. 1 and 4).

[0048] First, in the local node 1, the CPU 11 issues to the RCU 13 a test and copy instruction for the remote memory (MMU 22). In the RCU 13, the request receiving unit 1301 accepts the test and copy instruction and decomposes it into a test instruction and a copy instruction. After contention arbitration in the contention arbitration unit 1303, the test instruction undergoes physical node number conversion / remote JOB number conversion in the address conversion unit 1304 and is sent from the request/data sending unit 1305 to the RCU 23 (Network 3). The copy instruction is held in the request receiving unit 1301 until the test end reply is returned from the RCU 23.

[0049] Next, in the remote node 2, the RCU 23 accepts the test instruction from the RCU 13 (via Network 3) in the request receiving unit 2301. After contention arbitration in the contention arbitration unit 2303, the address conversion unit 2304 converts the address to a physical address, and the instruction is stored in the command/address saving buffer 2311 and, at the same time, issued to the MMU 22. The MMU 22 returns the result to the repetition control unit 2313. If the test failed (NODE2_FLAG = 0), the test instruction stored in the command/address saving buffer 2311 is repeatedly issued to the MMU 22. In FIG. 5, NODE2_FLAG becomes 1 at 24T, so the test succeeds on the ninth attempt after eight failures. At this timing, the request/data sending unit 2305 returns the test end reply to the RCU 13 via Network 3.
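The node-local retry described above amounts to a polling loop that never leaves the remote node. The following is a minimal sketch under stated assumptions: `run_test_retries`, `flag_after`, and the `max_attempts` safety bound are hypothetical names introduced for illustration (the text itself repeats until success without a bound), and the model counts attempts only, not cycle timing.

```python
from typing import Callable


def run_test_retries(read_flag: Callable[[], int], max_attempts: int = 1000) -> int:
    """Re-issue the saved test until the flag reads 1; return the attempt count."""
    for attempt in range(1, max_attempts + 1):
        if read_flag() == 1:  # test succeeded: the test end reply may be sent
            return attempt
    raise TimeoutError("flag was never set within max_attempts")


def flag_after(failures: int) -> Callable[[], int]:
    """Build a flag reader that returns 0 `failures` times and 1 afterwards."""
    remaining = [failures]

    def read() -> int:
        if remaining[0] > 0:
            remaining[0] -= 1
            return 0
        return 1

    return read


# Eight failed tests followed by a success, as in the FIG. 5 scenario.
attempts = run_test_retries(flag_after(8))
```

With eight failing reads the loop returns on the ninth attempt, matching the count given for FIG. 5. Only the single test end reply crosses the network afterwards.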

[0050] Next, the RCU 13 receives the test end reply in the request receiving unit 1301. The held copy instruction undergoes contention arbitration in the contention arbitration unit 1303, is converted to a physical address in the address conversion unit 1304, and is sent to the MMU 12. The copy instruction and the load (copy) data from the MMU 12 are then sent together from the request/data sending unit 1305 to the RCU 23 (Network 3).

[0051] In the RCU 23, the request receiving unit 2301 and the data receiving unit 2302 accept the copy instruction and the copy data. After the contention arbitration unit 2303 and the address conversion unit 2304 perform contention arbitration and physical address conversion for the copy instruction, the copy instruction (command and address) and the copy data are sent to the MMU 22 and the write (copy) is performed. The RCU 23 sends a notification that the data transfer from the RCU 13 and the write operation to the MMU 22 completed normally (the copy end reply) from the request/data sending unit 2305 to the RCU 13 via Network 3, and the RCU 13 returns this reply to the CPU 11, which completes the series of operations.

[0052] In the configuration example of FIG. 4, the test and copy processing (from when the CPU 11 issues the request until it receives the reply) completes in 64T.

[0053] In this other embodiment, the test and copy processing is slower than in the first embodiment, but in exchange the data saving buffer 2314 (data saving buffer 1314) becomes unnecessary. Which embodiment to choose depends on the operating conditions of the system.

[0054] Needless to say, the present invention can be applied not only to the test and copy operation of a distributed memory type parallel computer but also to the general logic of exclusive control. For example, it can be applied to speeding up write operations to a disk.

[0055]

[Effect of the Invention] As described above, the present invention: 1. by means of a newly provided test and copy instruction, issues the test instruction and the copy instruction simultaneously from the CPU of computation node 1, transfers the test instruction, the copy instruction, and the copy data simultaneously to computation node 2, and has computation node 2 execute the copy instruction immediately after execution of the test instruction finishes; and
The test instruction, the copy instruction, and the copy data are transferred to the calculation node 2 at the same time.

[0056] 2. executes the retry sequence of the test instruction entirely within computation node 2 (the test end reply is not returned to computation node 1 until the test succeeds).

[0057] As a result, the transfer time of the copy instruction and the copy data is hidden behind the latency of the test instruction, and the turnaround time caused by retries of the test instruction between computation nodes is greatly shortened. Consequently, test and copy processing in a distributed memory type parallel computer is greatly accelerated.
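The saving from closing the retries inside computation node 2 can be illustrated with a rough count of network latency alone. The sketch below assumes the one-way network latency of 6T used in the embodiment and assumes that each attempt which crosses the network costs one request and one reply. The function name and parameters are hypothetical, and all other latencies (CPU, MMU, RCU) are deliberately ignored, so the figures are not the 50T or 64T totals of the time charts.

```python
def network_time(failures: int, retries_cross_network: bool,
                 network_latency: int = 6) -> int:
    """Network latency (in T) spent on the test phase only."""
    round_trip = 2 * network_latency       # test request plus its reply
    if retries_cross_network:
        # Every attempt, failed or successful, is a separate round trip.
        return round_trip * (failures + 1)
    # Retries stay inside node 2: one request out, one test end reply back.
    return round_trip


inter_node = network_time(8, retries_cross_network=True)
node_local = network_time(8, retries_cross_network=False)
```

For eight failed tests this simplified count gives 108T of network latency when every retry crosses the network and 12T when the retries stay in the remote node. The gap grows linearly with the number of failures.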

[Brief description of the drawings]

FIG. 1 is a block diagram of a distributed memory type parallel computer.

FIG. 2 is a block diagram showing a configuration example of the RCU unit 23 (and 13) in one embodiment of the present invention.

FIG. 3 is a time chart for explaining the operation of the one embodiment of the present invention.

FIG. 4 is a diagram showing a configuration example of the RCU unit 23 (and 13) in another embodiment of the present invention.

FIG. 5 is a time chart for explaining the operation of the other embodiment.

FIG. 6 is a block diagram showing a configuration example of the RCU unit 23 (and 13) of the conventional technique.

FIG. 7 is a time chart for explaining the operation of the conventional technique.

[Explanation of symbols]

1, 2 Node
3 Network
11, 21 CPU
12, 22 MMU
13, 23 RCU
1301, 2301 Request receiving unit
1302, 2302 Data receiving unit
1303, 2303 Contention arbitration unit
1304, 2304 Address conversion unit
1305, 2305 Request/data sending unit
1311, 2311 Command/address saving buffer
1312, 2312 Selector
1313, 2313 Repetition control unit
1314, 2314 Data saving buffer
1315, 2315 Selector

Claims

[Claims]

1. A test and copy processing method for a remote memory in a distributed memory type parallel computer in which a plurality of nodes are configured on a network, comprising: a first node that transmits a test instruction, a copy instruction, and copy data in succession; and a second node that, upon receiving the test instruction, the copy instruction, and the copy data, executes the test instruction against the remote memory in its own node and, after executing the test instruction, copies the copy data to the remote memory in accordance with the copy instruction.

2. The test and copy processing method for a remote memory in a distributed memory type parallel computer according to claim 1, wherein the second node repeatedly executes the test instruction against the remote memory in its own node until the test instruction succeeds.

3. A test and copy processing method for a remote memory in a distributed memory type parallel computer in which a plurality of nodes are configured on a network, wherein each of the nodes includes a CPU (Central Processing Unit), an MMU (Main Storage Unit), and an RCU; the first RCU in the first node receives a test and copy instruction from the first CPU, decomposes it into a test instruction and a copy instruction, and sends the test instruction, the copy instruction, and copy data from the first MMU together to the second RCU in the second node; and the second RCU in the second node includes a command/address saving buffer that stores the received test instruction and copy instruction, a data saving buffer that stores the received copy data, and a repetition control unit that repeatedly takes the test instruction out of the command/address saving buffer and issues it to the second MMU until the test instruction succeeds and, when the test instruction succeeds, takes the copy instruction out of the command/address saving buffer and the copy data out of the data saving buffer and issues them.

4. A test and copy processing method for a remote memory in a distributed memory type parallel computer in which a plurality of nodes are configured on a network, comprising: a first node that transmits a test instruction and, after receiving a test end reply, transmits a copy instruction and copy data in succession; and a second node that, upon receiving the test instruction, executes the test instruction against the remote memory in its own node, transmits the test end reply after executing the test instruction, receives the copy instruction and the copy data sent in response to the test end reply, and copies the copy data to the remote memory in accordance with the copy instruction.

5. The test and copy processing method for a remote memory in a distributed memory type parallel computer according to claim 4, wherein the second node repeatedly executes the test instruction against the remote memory in its own node until the test instruction succeeds.

6. A test and copy processing method for a remote memory in a distributed memory type parallel computer in which a plurality of nodes are configured on a network, wherein each of the nodes includes a CPU (Central Processing Unit), an MMU (Main Storage Unit), and an RCU; the first RCU in the first node includes sending means that receives a test and copy instruction from the first CPU, decomposes it into a test instruction and a copy instruction, sends the test instruction to the second RCU in the second node, and, after receiving a test end reply, sends the copy instruction together with copy data from the first MMU to the second RCU; and the second RCU in the second node includes a command/address saving buffer that stores the received test instruction, a repetition control unit that repeatedly takes the test instruction out of the command/address saving buffer and issues it to the second MMU until the test instruction succeeds and sends the test end reply when the test instruction succeeds, and control means that receives the copy instruction and the copy data sent in response to the reply and issues them to the second MMU.