JPH09212474A

JPH09212474A - Method for data transfer between processors and computer system suitable to same

Info

Publication number: JPH09212474A
Application number: JP8315848A
Authority: JP
Inventors: Naonobu Sukegawa; 直伸助川; Masanao Ito; 昌尚伊藤; Yoshiko Tamaoki; 由子玉置
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1995-11-27
Filing date: 1996-11-27
Publication date: 1997-08-15

Abstract

PROBLEM TO BE SOLVED: To allow a reception area that is not pinned in main memory to be used as a transfer destination, and to reduce synchronization processing. SOLUTION: A transfer source node 10 transmits the data together with the virtual address of a transfer destination area that is not pinned in main memory, the address of the transfer control flag to be used, a comparison value, and a comparison method. At a receiving node 210, a network adapter 30 determines whether a condition is met from the semaphore in the transfer control flag 330 identified by the received transfer control flag address, the received comparison value, and the received comparison method. It also uses the received virtual address of the transfer destination area and an address translation table 140 to check whether the transfer destination area 340 indicated by that virtual address is in the main memory 50. When the condition is not met, or when the transfer destination area 340 is not in the main memory 50, the received data are stored in a transfer buffer 370 in an area reserved for the OS. When the receiving program issues a specific system call, or when it issues a read instruction for data in the transfer destination area and a page fault occurs, the OS transfers the received data from the transfer buffer 370 to the transfer destination area 340.

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to a data transfer method for a computer system in which a plurality of nodes, each having a processor, are connected by a data transfer network, and to a computer system suited to that method.

[0002]

[Prior Art] In a parallel computer, speeding up data transfer between processors speeds up the system as a whole. Two factors determine the performance of interprocessor data transfer: (1) the data transfer rate and (2) the transfer start-up delay. The transfer start-up delay is the overhead of the hardware and software processing needed to begin a data transfer.

[0003] When a large amount of data is transferred at once, raising the data transfer rate is enough to transfer the data quickly. When short data are transferred repeatedly, however, performance suffers even at a high data transfer rate unless the transfer start-up delay is small. The data transfer rate is determined by physical factors such as the transfer data width, whereas the size of the start-up delay depends mainly on the transfer method, so for short transfers the transfer method itself becomes important.
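The trade-off between transfer rate and start-up delay can be illustrated with a simple model. The model itself is an assumption made here for illustration, not something stated in the specification: the time to move a message is taken to be the start-up delay plus the message size divided by the transfer rate. The 10 microsecond delay and 100 MB/s rate are hypothetical figures.

```python
def transfer_time(size_bytes: float, startup_delay_s: float, rate_bytes_per_s: float) -> float:
    """Time for one transfer under the assumed model: delay + size / rate."""
    return startup_delay_s + size_bytes / rate_bytes_per_s


def effective_rate(size_bytes: float, startup_delay_s: float, rate_bytes_per_s: float) -> float:
    """Bytes per second actually achieved once the start-up delay is included."""
    return size_bytes / transfer_time(size_bytes, startup_delay_s, rate_bytes_per_s)


if __name__ == "__main__":
    DELAY = 10e-6  # hypothetical start-up delay: 10 microseconds
    RATE = 100e6   # hypothetical transfer rate: 100 MB/s
    for size in (64, 4096, 1_000_000):
        fraction = effective_rate(size, DELAY, RATE) / RATE
        print(f"{size:>9} bytes: {fraction:.1%} of the nominal rate")
```

Under these assumed figures a short message achieves only a small fraction of the nominal rate, while a large message comes close to it, which is why the start-up delay dominates for short transfers.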

[0004] Current parallel computers generally adopt a style of data transfer called SEND/RECEIVE. In a SEND/RECEIVE transfer, when the processor on the sending side executes a send request instruction (SEND), the transfer data are sent to the receiving side. The processor on the receiving side accepts the transfer data by executing a receive request instruction (RECEIVE). The area from which the transfer data are read (the transfer source area) is specified in the SEND instruction, and the area into which they are written (the transfer destination area) is specified in the RECEIVE instruction. An area is usually specified by its starting logical address and the transfer data size.

[0005] SEND/RECEIVE transfer has two advantages: (1) the sender can execute a SEND instruction regardless of when the receiver executes its RECEIVE instruction (asynchronous transfer), which makes programs easy to write; and (2) data are not written until the receiver specifies the write area with a RECEIVE instruction, so there is little risk that a bug in the sender's program will corrupt the receiver's data. For these reasons it is widely used, from workstation clusters to large-scale parallel computers.

[0006] However, SEND/RECEIVE transfer has the problem that data sent from the sending side cannot be written to the transfer destination area until the receiving side executes the RECEIVE instruction. For this reason, SEND/RECEIVE transfer generally buffers the transfer data temporarily within the receiving node.

[0007] A simple way to manage this buffering is for the receiving processor always to buffer the transfer data first, and then to copy from the buffer to the transfer destination area those data for which a RECEIVE instruction has been executed. With this method, however, every data transfer incurs a memory copy, and performance suffers.

[0008] Japanese Unexamined Patent Publication No. 6-324998 discloses means for writing data that arrive at the receiving side after the RECEIVE instruction has been executed directly into the transfer destination area, and for buffering only data that arrive before the RECEIVE instruction is executed. This reduces memory copies and allows high performance.

[0009] However, in the parallel computer of Japanese Unexamined Patent Publication No. 6-324998, when the receiving node receives transfer data it must check whether the RECEIVE instruction corresponding to the data has already been issued and, if it has, look up the information specified by that RECEIVE instruction and write the transfer data to the transfer destination area that the information designates. Thus the SEND/RECEIVE type requires a large amount of hardware and software processing to achieve asynchronous transfer, and the transfer start-up delay does not become small.
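The receive-side work described above, where arriving data are written directly if a RECEIVE is already posted and buffered otherwise, can be sketched as follows. This is a minimal illustration, not the mechanism of the cited publication: the class name, the `tag` used to match a SEND with its RECEIVE, and the return strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Toy receive side for SEND/RECEIVE with buffering of early arrivals."""
    posted: dict = field(default_factory=dict)    # tag -> destination bytearray
    buffered: dict = field(default_factory=dict)  # tag -> data that arrived early

    def on_arrival(self, tag: str, data: bytes) -> str:
        # Every arrival must first look up whether a RECEIVE was already issued.
        dest = self.posted.pop(tag, None)
        if dest is not None:
            dest[: len(data)] = data  # RECEIVE already posted: write directly
            return "direct"
        self.buffered[tag] = data     # no RECEIVE yet: hold the data
        return "buffered"

    def receive(self, tag: str, dest: bytearray) -> str:
        data = self.buffered.pop(tag, None)
        if data is not None:
            dest[: len(data)] = data  # data arrived early: copy out of the buffer
            return "copied"
        self.posted[tag] = dest       # remember the destination for a later arrival
        return "posted"
```

The lookup performed in `on_arrival` on every message is the kind of per-transfer processing that keeps the start-up delay from shrinking.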

[0010] PUT-type data transfer is a solution to this problem. In PUT-type data transfer, the send request instruction (PUT) executed by the sending processor specifies not only the transfer source area but also the transfer destination area. An area is usually specified by its starting logical address and the transfer data size. Once the transfer has been started, the data are written unconditionally to the transfer destination area, so the receiving side does not need to issue a receive request instruction, and no buffer for received data is needed.

[0011] PUT-type data transfer has two problems: (1) the user is responsible for ensuring that the sender executes the PUT instruction only after the receiver is ready to accept the transfer data (synchronous transfer); and (2) a bug in the sender's program can easily destroy the receiver's data. However, because the receiving side only has to write the data to the transfer destination area according to the header information of the transfer data, the transfer start-up delay can be made small, and for short transfers the PUT type outperforms the SEND/RECEIVE type.

[0012] "Efficient Execution of Parallel Programs by Hardware Support of PUT/GET Interface" (Parallel Processing Symposium JSPP '94 Proceedings, pp. 233-240, May 1994) discloses a method of implementing PUT-type data transfer. In its PUT instruction, the transfer source area and the transfer destination area are specified by virtual addresses.

[0013] Consider now interprocessor data transfer on a computer that manages virtual addresses. When a virtual address space larger than the physical main memory is provided, or when the memory required by multiple concurrently running processes exceeds the size of main memory, the transfer source or destination area may reside not in main memory but on an external storage device such as a hard disk.

[0014] On the sending side this is easily handled in software or hardware: if any part of the transfer source area is on the external storage device, the sender does not start the send operation until the whole area has been loaded into main memory. The data can then be sent without interruption. On the receiving side, however, if part or all of the receive area is on the external storage device when the data arrive, the receive operation cannot be performed. If data are consequently held up on the interprocessor network, the network becomes congested, which is a performance problem.

[0015] As disclosed in Japanese Unexamined Patent Publication No. 6-110845, there is a means by which the OS guarantees that the transfer destination area always resides in main memory (pinning in main memory). Data can then be received without interrupting the transfer, which prevents network congestion.

[0016] In PUT-type data transfer, however, the receiving side cannot tell where the transfer destination area is until the transfer data arrive. A data area that may receive large data must therefore be pinned in main memory in its entirety. For example, even if only one column of a matrix is actually received, when it cannot be known in advance which column the received data belong to, the whole data reception area that may receive matrix data has to be pinned in main memory. This severely restricts the freedom of virtual memory management.

[0017] A technique that does not pin every data area that may receive data is described in Japanese Unexamined Patent Publication No. 4-291660. There, when data arrive whose destination data area has been swapped out and is not in main memory, the data are first buffered in an OS-managed receive buffer, and the hardware raises an interrupt once buffering is complete. In response, the OS suspends the running program and then executes interrupt handling within the OS. In that handling, the receive area for the data is paged in and, as soon as the page-in completes, the received data in the receive buffer are transferred to that data receive area.

[0018] Apart from the swap-out techniques for the receive area described above, basic programming with PUT-type communication is as follows. In a system that performs PUT-type data transfer, before the receiving processor reads an area into which the sending processor will write data, the sending program and the receiving program must synchronize to guarantee that the write has finished. Otherwise inconsistencies arise, such as the sending processor writing into the data receive area while the receiving processor is still reading it.

For example, to exchange the contents of data M in node A with the contents of data N in node B, the processors of nodes A and B execute a program such as that shown in FIG. 21. Nodes A and B first execute instructions 3000A and 3000B, which take copies of data M and data N respectively. This is because the PUT instructions 3020A and 3020B that perform the exchange are assumed to write directly into data M and N. Having secured the write destination areas in this way, the nodes must then execute instructions 3010A and 3010B, which perform barrier synchronization between node A and node B. Otherwise the other node might execute PUT instruction 3020B (or 3020A) and write data before the area to be written by PUT instruction 3020A (or 3020B) has been secured. Particularly when large data are transferred, the area to be written often cannot be secured in advance, and in some cases the area must be secured and synchronization performed immediately before every PUT transfer. Next, each node executes PUT instruction 3020A or 3020B. Before reading the data written by the other node, each node must again execute a barrier synchronization instruction, 3030A or 3030B, because there is no means of knowing whether the data write by the peer's PUT instruction has finished. After this synchronization, completion of the PUT instructions on both nodes is guaranteed, so both nodes can read the written (exchanged) data.
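The conventional exchange described above, with a copy, a barrier, a PUT, and a second barrier on each node, can be sketched with two threads standing in for nodes A and B. This is only an illustration under assumptions: the dictionary `memory`, the `put` helper, and the string payloads are hypothetical, and FIG. 21 itself is not reproduced here. The comment numbers refer to the instruction numbers in the text.

```python
import threading


def exchange() -> dict:
    """Swap data M on node A with data N on node B using PUT plus two barriers."""
    memory = {"A": {"M": "m-data"}, "B": {"N": "n-data"}}  # toy per-node memories
    barrier = threading.Barrier(2)
    result = {}

    def put(dest_node: str, name: str, value: str) -> None:
        # Stand-in for PUT: writes unconditionally into the peer's memory.
        memory[dest_node][name] = value

    def node(me: str, mine: str, peer: str, theirs: str) -> None:
        copy = memory[me][mine]        # 3000: copy own data before it can be overwritten
        barrier.wait()                 # 3010: both nodes have secured their copies
        put(peer, theirs, copy)        # 3020: PUT directly into the peer's data
        barrier.wait()                 # 3030: both PUTs are now known to be complete
        result[me] = memory[me][mine]  # safe to read the exchanged data

    threads = [
        threading.Thread(target=node, args=("A", "M", "B", "N")),
        threading.Thread(target=node, args=("B", "N", "A", "M")),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

Removing either `barrier.wait()` call reintroduces the races the paragraph describes: a peer could overwrite data before it has been copied, or a node could read its data before the peer's PUT has landed.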

[0019]

[Problems to Be Solved by the Invention] In the method disclosed in Japanese Unexamined Patent Publication No. 4-291660 above, the CPU is interrupted and must execute interrupt handling every time data arrive that are to be written to a swapped-out data area. As a result, execution of the program running at that time is suspended. In particular, swapping the swapped-out page back in requires access to the external storage device, which lengthens the suspension of the running program. At the time of the suspension, the running program may not even be at a point where it uses the received data. Suspending the running program in such a state is therefore undesirable from the viewpoint of execution efficiency. Furthermore, if the running program does not access the page holding the received data for some time after the page has been swapped in, the page may be swapped out again, in which case the swap-in performed immediately after reception is wasted.

[0020] Moreover, conventional PUT-type data transfer requires frequent interprocessor synchronization with large overhead, such as barrier synchronization. Synchronizing processors frequently for the sake of PUT-type data transfer squanders the low overhead that is the advantage of PUT-type data transfer.

[0021] Accordingly, an object of the present invention is to provide a PUT-type data transfer method that can transfer data to a data receive area not pinned in main memory and that does not greatly affect the program running when the data are received, and a computer system suited to that method.

[0022] A further object of the present invention is to provide a data transfer method that has less interprocessor synchronization overhead and is suited to PUT-type data transfer, and a computer system suited to that method.

[0023]

[Means for Solving the Problems] To achieve the above objects, the data transfer method according to a first mode of operation of the present invention executes the following steps.

[0024] Data and the virtual address of the transfer destination area in which the data are to be stored are transferred from a transfer source node to a transfer destination node. At the transfer destination node, if the area to which the received virtual address is assigned resides in the main memory of the transfer destination node when the data are received, the received data are stored in that area. If the transfer destination area does not reside in the main memory of the transfer destination node, the received data are stored in a transfer buffer area provided in the transfer destination node for temporarily holding received data. In the latter case, when a predetermined instruction subsequently issued by the program running on the transfer destination node is executed, the received data stored in the transfer buffer area are transferred to the transfer destination area. This transfer is performed after the transfer destination area has been allocated in main memory.
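The steps of the first mode of operation can be sketched as a small model of the receiving node. This is a minimal sketch under assumptions, not the embodiment's hardware: the class and method names are hypothetical, main memory is modelled as a dictionary keyed by virtual address, and allocating the area is modelled by creating an empty bytearray rather than by a real page-in.

```python
from dataclasses import dataclass, field


@dataclass
class ReceivingNode:
    """Toy model: store directly if the destination is resident, otherwise buffer."""
    resident: dict = field(default_factory=dict)         # virtual address -> bytearray in main memory
    transfer_buffer: list = field(default_factory=list)  # (virtual address, data) held temporarily

    def on_receive(self, vaddr: int, data: bytes) -> str:
        if vaddr in self.resident:
            self.resident[vaddr][: len(data)] = data      # destination is in main memory
            return "stored"
        self.transfer_buffer.append((vaddr, data))        # destination is not in main memory
        return "buffered"

    def on_predetermined_instruction(self) -> int:
        """Runs when the receiving program issues the predetermined instruction."""
        moved = 0
        for vaddr, data in self.transfer_buffer:
            if vaddr not in self.resident:
                # Allocate the destination area in main memory before copying.
                self.resident[vaddr] = bytearray(len(data))
            self.resident[vaddr][: len(data)] = data
            moved += 1
        self.transfer_buffer.clear()
        return moved
```

In this model the running program is never interrupted on arrival; buffered data move to the destination only when the program itself asks for them.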

[0025] The data transfer method according to a second mode of operation of the present invention executes the following steps.

[0026] Data, the virtual address of the transfer destination area in which the data are to be stored, and a comparison value are transferred from a transfer source node to a transfer destination node. The transfer destination node stores a semaphore value specified by the program running on that node. The transfer destination node determines whether the comparison value and the stored semaphore value satisfy a condition predetermined for writing the data to the transfer destination area. If the condition is judged to be satisfied, the received data are stored in the transfer destination area to which the virtual address is assigned in the transfer destination node. If the condition is not satisfied, the received data are stored in a buffer area provided in the transfer destination node for temporarily holding data, and the data are transferred from the buffer area to the transfer destination area in response to a transfer request subsequently issued by the program.

[0027]

[Embodiments of the Invention] The data transfer method and computer system according to the present invention are described below in more detail with reference to the embodiments shown in the drawings.

[0028] <1. Overview of the Apparatus and Its Operation>

<1.1 Overview of the Apparatus>

FIG. 1 shows the schematic configuration of a parallel computer system that employs the data transfer method according to the present invention. In this system, a plurality of nodes 10, each having a CPU 20, a main memory 50, and a network adapter 30, are connected by an interprocessor network 1000. Each node has its own OS. The network adapter 30 is the unit responsible for data transfer between nodes; when started by the CPU 20, it transfers data to another node, for example node 210. It also has the function of writing data sent from another node, for example node 210, into the main memory 50. The nodes, for example node 10 and node 210, may be identical hardware or different hardware, but in this embodiment every node has the same structure. In FIG. 1, however, the network adapter 30 and main memory 50 of node 10 show the elements needed to explain the sending-side operation when data stored in the transfer source area 120 of node 10 are transferred to the transfer destination area 340 of node 210. Likewise, the network adapter 230 and main memory 50 of the receiving node 210 show the elements needed to explain the operation of the receiving node.

Each node, for example node 10, contains a CPU 20, a network adapter 30, a main memory control unit 40, and a main memory 50. In addition, an external storage device 60 is provided for each node. The main memory 50 contains a user control area 90 (pinned in main memory) and a user data area 100 (managed as virtual memory). Data in the user data area 100 are not permanently resident and are swapped out to the external storage device 60 as necessary. Consequently, the transfer source area 120 and the transfer destination area 340 may be in the main memory 50 or may be swapped out to the external storage device 60. For the OS, an OS area 130 is provided separately from the user areas and is managed so as to be pinned in main memory.

[0029] <1.2 Details of the Node>

The following seven elements are provided in the main memory 50.

[0030] (1) Transfer source area 120: The area that stores the transfer data on the sending node 10. It is provided in the user data area 100 and managed as virtual memory.

[0031] (2) Transfer destination area 340: The area on the receiving node 210 into which the transfer data are to be written. It is provided in the user data area 100 and managed as virtual memory.

[0032] (3) Transfer buffer 370: The area on the receiving node 210 in which transfer data are recorded temporarily. It is provided in the OS area 130 and pinned in main memory.

[0033] (4) Address translation table 140: The area in which address translation information is recorded, on both sending and receiving nodes. It is provided in the OS area 130 and pinned in main memory.

[0034] (5) Transfer source flag 110: A flag that records the status of the data transfer operation on the sending node 10 (sending, finished, and so on). The values recorded are shown in FIG. 12, and details are given in <2.2>. It is provided in the user control area 90 and pinned in main memory. By reading this flag, the user program on the transfer source node 10 can determine the current state of the transfer operation.

[0035] (6) Transfer destination flag 320: A flag that records the status of the data transfer operation on the receiving node 210 (receiving, buffered, and so on). The values recorded are shown in FIG. 13, and details are given in <2.2>. It is provided in the user control area 90 and pinned in main memory. By reading this flag, the user program on the receiving node 210 can determine the current state of the transfer operation.

[0036] (7) Transfer control flag 330: A flag containing, among other things, a semaphore for synchronizing data reception at the receiving node with the progress of the program on that node, and a buffering count. Its structure is shown in FIG. 9, and details are given in <2.2>. It is provided in the user control area 90 and pinned in main memory. Specifically, the semaphore 2010 is used to determine, at the time data are received, whether the receiving node 210 has not yet finished using the transfer destination area 340, so that the data cannot be transferred to it. In this embodiment, when the transfer destination area 340 cannot accept a data transfer, for example because the receiving node 210 has not finished using it, the transfer data are forcibly stored temporarily in the transfer buffer 370.
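The placement of the seven elements can be summarised in a small table-like structure. The reference numerals, regions, and pinned status are taken from the list above; the `Element` type, its field names, and the helper function are hypothetical and serve only as a compact summary.

```python
from typing import NamedTuple


class Element(NamedTuple):
    numeral: int  # reference numeral used in the description
    name: str
    region: str   # where in main memory 50 the element is provided
    pinned: bool  # True if pinned in main memory, False if managed as virtual memory


ELEMENTS = [
    Element(120, "transfer source area", "user data area 100", False),
    Element(340, "transfer destination area", "user data area 100", False),
    Element(370, "transfer buffer", "OS area 130", True),
    Element(140, "address translation table", "OS area 130", True),
    Element(110, "transfer source flag", "user control area 90", True),
    Element(320, "transfer destination flag", "user control area 90", True),
    Element(330, "transfer control flag", "user control area 90", True),
]


def may_be_swapped_out() -> list:
    """Names of the elements that can be absent from main memory when data arrive."""
    return [e.name for e in ELEMENTS if not e.pinned]
```

Only the two data areas can be swapped out; the buffer, the table, and all three flags stay resident, so the network adapter can always reach them.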

[0037] FIG. 2 shows the configuration of the network adapter 30. Its main elements are as follows.

[0038] (1) Send control register 70: Records send requests from the CPU 20. Its internal structure is shown in FIG. 3.

[0039] (2) Send control unit 410: Performs the send operation based on the information in the send control register 70.

[0040] (3) Network send unit 430 and network receive unit 450: Handle the connection to the interprocessor network 1000.

[0041] (4) Receive control register 170: Records transfer control information when transfer data arrive from another node. Its internal structure is shown in FIG. 4.

[0042] (5) Receive control unit 440: Performs the receive operation based on the information in the receive control register 170.

[0043] (6) Main memory access unit 400: Accesses the main memory in response to requests to access the main memory 50 from the send control unit 410, the receive control unit 440, and other units.

[0044] (7) Table management register 80: Holds real address information for the address translation table 140, specifically the table's start address 1510. Its internal structure is shown in FIG. 5.

[0045] (8) Address translation unit 480: Performs virtual-to-real address translation for access requests from the main memory access unit 400, based on the information in the address translation table 140.

[0046] (9) Buffer management register 190: Holds address information for the transfer buffer 370. Its internal structure is shown in FIG. 6.

[0047] (10) Buffer control unit 470: Stores transfer data in the transfer buffer 370 based on the information in the buffer management register 190.

[0048] The operation of the above components is described in <2.2>.

[0049] <1.3> Data transfer instruction
FIG. 20 shows the format of the data transfer instruction (PUT instruction) used in this embodiment. The instruction specifies the following nine parameters.

[0050] (1) Transfer data length 2710: specifies the number of bytes of data to transfer. The maximum transfer length supported by the hardware is 1 MB (details are given later).

[0051] (2) Transfer source area start address 2720: specifies the virtual address of the transfer source area 120 in the transmitting node 10.

[0052] (3) Transfer source flag address 2730: specifies the virtual address of the transfer source flag 110 in the transmitting node 10.

[0053] (4) Transfer destination processor number 2740: specifies the number of the data transfer destination node (the number of the node 210).

[0054] (5) Transfer control flag address 2750: specifies the virtual address of the transfer control flag 330 in the receiving node 210.

[0055] (6) Comparison value 2760: information newly introduced in this embodiment. It specifies the value to be compared with the semaphore value (2010 in FIG. 9) in the transfer control flag 330. It is unsigned 32-bit data.

[0056] (7) Comparison method 2770: information newly introduced in this embodiment. It indicates how the semaphore 2010 is compared with the comparison value 2760. The relationship between the value of the comparison method 2770 and the comparison that is actually performed is shown in FIG. 10. By combining the comparison method 2770 with the comparison value 2760, a variety of conditions can be specified, so that data reception can easily be synchronized with the progress of the receiving program. Specifically, this information is used to judge, at the time the data is received, whether the receiving node 210 has not yet finished using the transfer destination area 340 and the data therefore cannot be transferred to that area. To this end, as explained later, the receiving process is configured to rewrite the semaphore 2010 as its program progresses. If the condition specified by the comparison method 2770 does not hold when the data is received, the transfer data is forcibly stored in the transfer buffer 370, whichever comparison method 2770 specifies. The transfer data is thus sent to the transfer buffer 370 instead of the transfer destination area 340, and is later moved from the transfer buffer 370 to the transfer destination area 340. This reduces the overhead of the interprocess synchronization required for data transfer.

[0057] (8) Transfer destination area start address 2780: specifies the virtual address of the transfer destination area 340 in the receiving node 210.

[0058] (9) Transfer destination flag address 2790: specifies the virtual address of the transfer destination flag 320 in the receiving node 210.

[0059] In this embodiment, a data transfer may not cross a page boundary (1 MB boundary). Therefore, if at least one of the transfer source area and the transfer destination area spans a page boundary, the transfer must be carried out as multiple data transfers. Likewise, data longer than 1 MB is handled by multiple data transfers.
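The page-boundary rule above can be illustrated with a minimal sketch of how a caller might break one logical transfer into several PUT-sized pieces. The function name `split_transfer` and the tuple layout are hypothetical and do not come from the patent; only the 1 MB page size and the rule that neither the source nor the destination range may cross a page boundary are taken from the text.

```python
PAGE_SIZE = 1 << 20  # 1 MB page size, as stated in the text


def split_transfer(src, dst, length):
    """Split one logical transfer into (src, dst, length) chunks.

    Hypothetical helper: no chunk crosses a 1 MB page boundary in either
    the source or the destination address range, so no chunk exceeds 1 MB.
    """
    chunks = []
    while length > 0:
        # Bytes left before the next page boundary on each side.
        src_room = PAGE_SIZE - (src % PAGE_SIZE)
        dst_room = PAGE_SIZE - (dst % PAGE_SIZE)
        n = min(length, src_room, dst_room)
        chunks.append((src, dst, n))
        src += n
        dst += n
        length -= n
    return chunks
```

Each returned chunk would correspond to one PUT instruction with its own transfer data length, source start address, and destination start address.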

[0060] The PUT instruction shown in FIG. 20 represents a function call to a communication library provided within the user program of the data transfer source. The parameters shown in the figure represent the arguments of this function call. For simplicity, however, the terms "instruction" and "parameter" are used below. When called from the user program, the communication library stores each of these parameters in the appropriate register and then activates the network adapter 30 for the PUT operation. Once activated, the adapter 30 executes the PUT operation using these parameters, as described later.

[0061] <1.4> Concrete usage example of the transfer control flag
Next, interprocessor data transfer using the transfer control flag 330, the comparison value 2760, and the comparison method 2770 is explained using the concrete program example shown in FIG. 22. Like the prior-art program shown in FIG. 21, this program swaps the contents of data M in node A with the contents of data N in node B, but unlike FIG. 21 it uses the PUT instruction shown in FIG. 20. In FIG. 22, however, the notation of the PUT instruction is simplified compared with FIG. 20. In the PUT instruction 3120A or 3120B, dataM, flag(sa), and nodeB denote the transfer source area start address 2720, the transfer source flag address 2730, and the transfer destination processor number 2740 of the PUT instruction of FIG. 20. "when flag(b) > 0" means that the transfer control flag b is specified by the transfer control flag address 2750, 0 is specified as the comparison value 2760, and ">" is specified as the comparison method 2770. dataN and flag(rb) denote the transfer destination area start address 2780 and the transfer destination flag address 2790. The transfer data length 2710 of FIG. 20 is omitted in FIG. 22.

[0062] The instruction sequence 3100A at the head of the transfer source user program and the instruction sequence 3100B at the head of the transfer destination user program initialize the transfer control flags (a, b), the transfer source flags (sa, sb), and the transfer destination flags (ra, rb) used by these programs. To guarantee that initialization has completed on both sides, the instructions 3110A and 3110B, which implement barrier synchronization between the processors, are executed. Because the flags are small, they can easily be initialized at the head of the program, and the barrier synchronization by the instructions 3110A and 3110B can accordingly also be performed near the head of these programs. Nodes A and B then execute various program instructions and, at an appropriate point, execute the PUT instructions 3120A and 3120B.

[0063] The PUT instruction 3120A executed by node A means: "PUT the transfer data in data area M to the transfer destination area N of node B, using flag sa as the transfer source flag. If the semaphore value in the transfer control flag b in node B is not greater than 0, however, store the transfer data temporarily in the transfer buffer. Record the completion of the write in the transfer destination flag rb in node B." Similarly, the PUT instruction 3120B executed by node B means: "PUT the transfer data in data area N to data area M of node A, using flag(sb) as the transfer source flag. If the semaphore value in the transfer control flag a in node A is not greater than 0, however, store the transfer data temporarily in the transfer buffer. Record the completion of the write in the transfer destination flag ra in node A."

[0064] As described in detail later, when the PUT instruction 3120A of node A is executed, the value of the transfer source flag sa is updated in accordance with FIG. 12 to reflect the transfer state of the transfer data in data area M. In addition, the value of the transfer destination flag rb in node B is updated in accordance with FIG. 13 to reflect the transfer state of this transfer data. Similarly, when the PUT instruction 3120B of node B is executed, the value of the transfer source flag sb in node B is updated in accordance with FIG. 12 to reflect the transfer state of the transfer data in data area N, and the value of the transfer destination flag ra in node A is updated in accordance with FIG. 13 to reflect the transfer state of this transfer data.

[0065] As explained in detail later in the description of the receiving node's operation, when the data in data area M in node A is transferred to node B by executing the PUT instruction 3120A in node A, the semaphore value in the transfer control flag b in node B is compared with the value 0 specified by the instruction, using the comparison method (>) specified by the PUT instruction. If the semaphore value is not greater than 0, or if it is greater than 0 but the data area N specified by the PUT instruction is not present in main memory, the transfer data is stored in the transfer buffer of node B. If the semaphore value is greater than 0 and the data area N specified by the PUT instruction is present in main memory, the transfer data is stored in data area N immediately. The same applies when the data in data area N in node B is transferred to node A by executing the PUT instruction 3120B in node B.
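The decision the receiving network adapter makes can be summarized in a small sketch. This is an illustration under stated assumptions, not the adapter's actual logic: the function name `choose_destination`, the string encoding of the comparison method, and every operator other than ">" are hypothetical (the real encoding is defined in FIG. 10, which is not reproduced here).

```python
import operator

# Hypothetical encoding of the comparison method 2770. Only ">" appears in
# the example program; the other entries are assumed for illustration.
COMPARE = {
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
}


def choose_destination(semaphore, compare_value, method, area_resident):
    """Return where received data goes: the destination area or the buffer.

    The data is written directly only if the semaphore condition holds AND
    the destination area is present in main memory; otherwise it is buffered.
    """
    condition_met = COMPARE[method](semaphore, compare_value)
    if condition_met and area_resident:
        return "destination_area"
    return "transfer_buffer"
```

With the example's condition `when flag(b) > 0`, a semaphore of 0 always leads to buffering, and a semaphore of 1 leads to a direct write only when the destination page is resident.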

[0066] The semaphore in the transfer control flag a at node A and the semaphore in the transfer control flag b at node B are updated by the instruction sequences 3130A and 3130B as follows. The instruction sequence 3130A is executed after node A has executed appropriate instructions following the PUT instruction 3120A. The WAIT instruction in the instruction sequence 3130A waits until the transfer of the data specified by the PUT instruction 3120A executed in node A has completed. Specifically, it waits until the transfer state of this data becomes either the "transmission complete, buffered" state or the "transmission complete, arrival complete" state shown in FIG. 12. That is, it waits until the value of the transfer source flag sa becomes at least 0010, which is the smaller of the flag value 0010 (binary) representing the "transmission complete, buffered" state and the flag value 1010 representing the "transmission complete, arrival complete" state shown in FIG. 12. SFIN in the WAIT instruction denotes this particular flag value 0010. The SET instruction sets the semaphore value in the transfer control flag a to 1 once the WAIT instruction's wait has succeeded.
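The WAIT/SET pair can be sketched as a polling loop followed by a store. The sketch below is only an illustration: `run_wait_then_set`, the callable used to read the flag, and the dictionary standing in for the transfer control flag are hypothetical, and it assumes that flag values observed before completion are smaller than SFIN. The two flag values are the ones quoted in the text from FIG. 12.

```python
SFIN = 0b0010           # "transmission complete, buffered" (quoted from FIG. 12)
SEND_ARRIVED = 0b1010   # "transmission complete, arrival complete"


def wait_satisfied(flag_value, threshold):
    """WAIT succeeds once the flag value is at least the threshold."""
    return flag_value >= threshold


def run_wait_then_set(read_source_flag, control_flag):
    """Sketch of an instruction sequence such as 3130A.

    Polls the transfer source flag until WAIT is satisfied, then sets the
    semaphore in the (hypothetical) control flag record to 1.
    """
    while not wait_satisfied(read_source_flag(), SFIN):
        pass  # busy-wait; a real library might yield or sleep here
    control_flag["semaphore"] = 1
```

Comparing with "at least SFIN" lets a single test cover both completion states, since 1010 is also greater than 0010.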

[0067] Similarly, the instruction sequence 3130B is executed after node B has executed appropriate instructions following the PUT instruction 3120B. The WAIT instruction in this instruction sequence waits until the transfer of the data specified by the PUT instruction 3120B executed in node B has completed. Specifically, it waits until the value of the transfer source flag sb becomes at least the particular flag value 0010 described above. The SET instruction sets the semaphore value in the transfer control flag b to 1 once the WAIT instruction's wait has succeeded.

[0068] Consequently, if the instruction sequence 3130A has already been executed in node A by the time the data of data area N of node B, transferred by the PUT instruction 3120B, is received at node A, the semaphore value in the transfer control flag a is 1, so the transfer data satisfies the condition specified by the PUT instruction 3120B executed at node B. The transfer data is therefore written either to data area M of node A or to the transfer buffer 370, depending on whether data area M in node A is present in main memory. In this way, the network adapter 30 of node A prevents data transferred from another node from being written into data area M before the program on node A has finished using that area. The same holds for node B.

[0069] For a data transfer that swaps data M in node A with data N in node B, the program of FIG. 21, which uses the conventional PUT-type data transfer, had to execute the barrier synchronization instructions 3010 and 3030, whereas the program of FIG. 22 needs no such instructions. This embodiment thus reduces the number of interprocessor synchronization instructions required for interprocessor data transfer. The reason, as shown in FIG. 22, is as follows. When the program of the receiving node finishes using the data area, it sets a semaphore indicating the end of use in the transfer control flag. The network adapter of the receiving node judges from the value of that semaphore whether the receiving node's program has finished using the data area into which the transfer data is to be written, and prevents the transferred data from being written into that area before the program has finished using it. In this embodiment, the PUT instruction executed by the transmitting node's program specifies the transfer control flag used to decide whether to write the transfer data to the transfer destination area, and also specifies the condition for that decision and the comparison value used in it, which allows more flexible control.

[0070] Furthermore, the program of FIG. 21 had to execute an instruction that explicitly copies the data, such as the instructions 3000A and 3000B, before executing the data transfer instruction; this embodiment needs no such instruction. This too is because, as described above, the network adapter 30 on the receiving side uses the transfer control flag to control whether transferred data is written into the data area.

[0071] Node A checks whether data has been PUT into its data area M by examining the value of the transfer destination flag that node A itself holds. That is, the subsequent WAIT instruction 3140A of node A waits for completion of the data transfer to node A by the PUT instruction 3120B of node B. Specifically, this instruction waits until the transfer state of the data specified by the PUT instruction 3120B becomes either the "reception complete, buffered" state or the "reception complete" state shown in FIG. 13. In other words, it waits until the value of the flag ra, which the PUT instruction of node B designates as the transfer destination flag for the data of data area M, becomes at least 0100, which is the smaller of the flag value 0110 representing the "reception complete, buffered" state and the flag value 0100 representing the "reception complete, write complete" state shown in FIG. 13. RFIN in the WAIT instruction denotes this particular flag value 0100. The next IF instruction 3150A contains a system call that is issued when the value of the transfer destination flag ra is not equal to RFIN, that is, when the value is 0110, meaning that the data to be transferred from node B to data area M has been written to the transfer buffer of node A but not yet to data area M. The system call requests the OS to pull this transfer data from the transfer buffer 370 into data area M, and the OS does so in response. If data area M has been swapped out of main memory at that time, the OS allocates main memory to data area M before pulling the data. When the value of the transfer destination flag ra is 0100, the system call of this IF instruction is not executed.
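The receiver-side check described above (WAIT on the transfer destination flag, then a conditional system call) can be sketched as follows. The function `receive_step` and the callback `drain_buffer`, which stands in for the system call that asks the OS to move buffered data into the data area, are hypothetical names; the flag values 0100 and 0110 are the ones quoted in the text from FIG. 13.

```python
RFIN = 0b0100           # "reception complete, write complete" (quoted from FIG. 13)
RECV_BUFFERED = 0b0110  # "reception complete, buffered"


def receive_step(dest_flag, drain_buffer):
    """One evaluation of the WAIT / IF logic on the receiving node.

    Returns "waiting" while the transfer has not completed, "drained" when
    the data was sitting in the transfer buffer and the (hypothetical)
    drain_buffer callback was invoked, or "ready" when the data was already
    written to the destination area.
    """
    if dest_flag < RFIN:
        return "waiting"   # WAIT condition (flag >= RFIN) not yet met
    if dest_flag != RFIN:
        drain_buffer()     # IF branch: data is still in the transfer buffer
        return "drained"
    return "ready"         # data already in the destination area
```

After either "drained" or "ready", a subsequent read of the data area can proceed without any further synchronization with the sending node.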

[0072] After the IF instruction 3150A, node A issues the READ instruction 3160A, which requests reading of the data in data area M. By the time the READ instruction 3160A is executed in node A, the transfer destination data area M is already present in main memory and the transfer data has already been written into it, so the READ instruction executes immediately.

[0073] Similarly, node B checks whether data has been PUT into data area N by examining the value of the transfer destination flag that node B itself holds. That is, the subsequent WAIT instruction 3140B of node B waits for completion of the data transfer to node B by the PUT instruction 3120A of node A. Specifically, this instruction waits until the transfer state of the data to be transferred becomes the "reception complete, buffered" state or the "reception complete" state, that is, until the value of the flag rb, which the PUT instruction 3120A of node A designates as the transfer destination flag for the data of data area N, becomes at least the particular flag value 0100 described above. The next IF instruction 3150B is a system call issued when the value of the transfer destination flag rb is not equal to RFIN, that is, when the data to be transferred from node A to data area N has been written to the transfer buffer of node B but not yet to data area N. The system call requests the OS to pull this transfer data from the transfer buffer into the data area. After the IF instruction 3150B, node B likewise issues the READ instruction 3160B, which requests reading of the data in data area N.

[0074] As the execution of the instructions 3140A, 3150A, 3160A, 3140B, 3150B, and 3160B shows, even when a node reads transfer data sent from another node, the arrival of the data can be confirmed from the transfer destination flag, so the programs of these nodes do not need to synchronize with each other.

[0075] <2. Details of the apparatus>
This section briefly describes the memory space management method, the apparatus configuration, and the user interface of this computer system.
<2.1> Memory management method
FIG. 14 shows the real address space in each node in this embodiment. The range in which main memory exists contains an OS area 2110 and a user area 2120. The I/O areas consist of an OS-dedicated I/O area 2140, which only the OS can access, and a user I/O area 2150, which can also be accessed in user mode. Accesses from user mode to the OS-dedicated I/O area 2140 are blocked by raising an interrupt during address conversion. The details are described later.

[0076] FIG. 15 shows how the virtual address space is mapped to the real address space in this embodiment. The mapping from virtual address space to real address space is performed separately for each node. FIG. 15 shows the mapping when process A and process B are running on one node. For each process A (or B), a control area 2290 (or 2310) and a data area 2300 (or 2320) are provided in the real address space 2330. Of these, the OS pins the control areas 2290 and 2310 in the real address space; that is, the OS guarantees that they always reside in main memory. The transfer control areas 2210 and 2240 of the respective processes are allocated to the control areas 2290 and 2310. The data areas 2300 and 2320, in contrast, are under virtual memory management; that is, they do not necessarily reside in the real address space and may be swapped out to an external storage device. The transfer destination area 2220 and similar areas are allocated to these data areas 2300 and 2320. In this embodiment, the OS area 2280 used by the OS is pinned in the real address space; that is, OS management ensures that it is always in main memory.

[0077] In this embodiment, the unit size for swap-in and swap-out in virtual memory management (which equals the page size) is 1 MB. That is, the data areas 2300 and 2320 of each process are swapped in and out between the external storage device and main memory in units of the page size (1 MB). Each virtual address space has 32-bit addresses, that is, a size of 4 GB. Each virtual address space therefore consists of 4K pages of 1 MB each.
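The page arithmetic in this paragraph can be checked with a few lines. The names below are illustrative only; the 32-bit address width and 1 MB page size are taken from the text, which together imply a 12-bit page number and a 20-bit offset within the page.

```python
ADDRESS_BITS = 32   # virtual address width stated in the text
PAGE_SHIFT = 20     # 1 MB pages: 2**20 bytes per page

NUM_PAGES = (1 << ADDRESS_BITS) >> PAGE_SHIFT  # pages per virtual address space


def split_virtual_address(va):
    """Split a 32-bit virtual address into (page number, offset in page)."""
    page_number = va >> PAGE_SHIFT
    offset = va & ((1 << PAGE_SHIFT) - 1)
    return page_number, offset
```

For example, the address 0x12345678 falls in page 0x123 at offset 0x45678.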

[0078] <2.2> Details of operation
This section describes, in order, the operation of the OS, the operation of the transmitting node, and the operation of the receiving node.

[0079] <2.2.1> Operation of the OS
One of the important operations performed by the OS is the creation of the address conversion table 140. Before address conversion is described, the process number, which is a management number the OS assigns to each process, is explained. Process numbers 0 to 15 are provided. Number 0 is always reserved for the OS itself, so up to 15 other processes, numbered 1 to 15, can run concurrently.

[0080] First, the address conversion method in this embodiment is described with reference to FIG. 16. This address conversion is performed by the address conversion unit 480 (FIG. 2) of the network adapter 30. The operation of the address conversion unit 480 itself is described in <2.2.2> and <2.2.3>.

[0081] Because this embodiment manages a 32-bit virtual space in 1 MB pages, the part of a virtual address that must be converted is its upper 12 bits 2430. Using these 12 bits, the address conversion table address 1510 (FIG. 5) held in the table management register 80 (FIG. 2), and the process number 2420 of the process that requested the address conversion, the address conversion table 140 (FIG. 1) is looked up, and the real page number 2460 obtained from it is used to derive the real address 2470.
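As a rough illustration of the last step, the sketch below forms a real address from a real page number and the original virtual address. It assumes the usual scheme in which the low 20 bits (the offset within the 1 MB page) pass through unchanged and the real page number replaces the upper bits; the text states only that the real page number is used to derive the real address, so this composition and the name `to_real_address` are assumptions.

```python
PAGE_SHIFT = 20                       # 1 MB pages
OFFSET_MASK = (1 << PAGE_SHIFT) - 1   # low 20 bits: offset within the page


def to_real_address(real_page_number, virtual_address):
    """Combine a real page number with the page offset of a virtual address.

    Assumed composition: real address = real page number followed by the
    unchanged 20-bit offset.
    """
    return (real_page_number << PAGE_SHIFT) | (virtual_address & OFFSET_MASK)
```

Under this assumption, a virtual address 0x12345678 whose page maps to real page 0x0A7 would become the real address 0x0A745678.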

[0082] The method of calculating the real address used to look up an entry in the address conversion table 140 from the virtual address upper bits 2430, the table management register information 2410, and the process number information 2420 is now described with reference to FIG. 17. The least significant bit is automatically set to 0, because each entry of the address conversion table 140 is 16 bits wide. Above it are stacked, in order, the virtual address upper bits 2430, the process number 2420, and the table management register information 2410, and the result is used as the real address for accessing the address conversion table 140. The address conversion table 140 thus consists of 16 tables, one per process, each of 8 KB (= 16 bits × 4K entries).
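The bit stacking described above can be written out as a short sketch. The field widths follow from the text (a 12-bit virtual page number, 16 process numbers and hence 4 bits, 2-byte entries), but the exact layout of FIG. 17 is not reproduced here, so treating the table management register information as the bits above bit 16 is an assumption, as is the name `entry_address`.

```python
ENTRY_SHIFT = 1      # entries are 16 bits (2 bytes), so bit 0 is always 0
VPN_BITS = 12        # upper 12 bits of the 32-bit virtual address
PROCESS_BITS = 4     # process numbers 0 to 15


def entry_address(table_base_bits, process_number, vpn):
    """Real address of the address conversion table entry.

    Layout (low to high): [0][vpn: 12 bits][process: 4 bits][table base bits].
    table_base_bits is assumed to be the value derived from the table
    management register that supplies the remaining upper bits.
    """
    if not 0 <= vpn < (1 << VPN_BITS):
        raise ValueError("vpn must fit in 12 bits")
    if not 0 <= process_number < (1 << PROCESS_BITS):
        raise ValueError("process number must be 0..15")
    process_shift = ENTRY_SHIFT + VPN_BITS        # bit 13
    base_shift = process_shift + PROCESS_BITS     # bit 17
    return ((table_base_bits << base_shift)
            | (process_number << process_shift)
            | (vpn << ENTRY_SHIFT))
```

With this layout, consecutive process numbers are 8 KB apart, which matches the stated size of each per-process table, and the 16 tables together occupy 128 KB.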

[0083] FIG. 18 shows the contents of each entry of the address conversion table 140. In addition to the real page number 2460 used for address conversion, each entry has a V bit 2610, an R bit 2620, and an M bit 2630. The meaning of each bit is shown in FIG. 19.

[0084] Another important operation in this embodiment is to set the process number and the node number in the transmission control register 70 in the network adapter 30 in advance, before data transfer. The configuration of the transmission control register 70 is shown in FIG. 3. It consists of multiple registers: a register group 70B mapped to the user I/O area 2150, and a register group 70A mapped to the OS-dedicated I/O area 2140. Each register is mapped into the real address space by the so-called I/O mapping method. Whether a user process can access a register is determined by whether the address being accessed is mapped to the OS-dedicated I/O area 2140 or to the user I/O area 2150. In other words, from the viewpoint of a user process, either the address conversion table 140 contains no entry for accessing a register whose address belongs to the OS-dedicated I/O area, or, even if such an entry exists, its V bit is 0, so the access is rejected at address conversion time.

【００８５】 In the transmission control register 70 shown in FIG. 3, the register group 70B, which is mapped to addresses in the OS-dedicated I/O area 2140, contains a field 1210 that records the number of the currently running process and a field 1220 that records the number of the node on which the network adapter 30 is mounted. In the example of FIG. 1, the node number assigned to node 10 is held in field 1220 of the transmission control register 70. The node number is set at system startup, and the OS sets the process number at each process switch.

【００８６】 The OS also sets values in the table management register 80 and the buffer management register 190 of FIG. 2. As shown in FIG. 5, the table management register 80 stores the real address of the start of the address translation table 140. The OS reserves an area for the transfer buffer 370 inside the OS area 130; the area is a multiple of 1 MB in size and is aligned to a page boundary. When reserving this area, the OS sets the current buffer address 1610 (FIG. 6) in the buffer management register 190 to the start address of the area, and sets the buffer end address 1620 (FIG. 6) to the end address of the area minus 1 MB + 28 B. Each time the OS stores transfer data in the transfer buffer 370, it increases the current buffer address 1610 by the length of the stored data; this operation is described in <2.2.3>. If the current buffer address 1610 then exceeds the buffer end address 1620, an interrupt request is issued to the CPU 20. As noted earlier, the transfer data length is at most 1 MB, so even together with the packet header described later, at most 1 MB + 28 B is stored in the buffer at one time. This margin guarantees that the last transfer always fits completely in the transfer buffer 370.
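
The margin arithmetic above can be illustrated with a small model. This is a sketch under stated assumptions: the class and method names are hypothetical, and the "end address of the area" is taken to be the address one past the last byte, which the text does not specify.

```python
MAX_DATA = 1 << 20           # maximum transfer data length: 1 MB
HEADER = 28                  # packet header length: 28 B
MARGIN = MAX_DATA + HEADER   # largest amount stored at one time


class BufferManagementRegister:
    """Toy model of current buffer address 1610 and buffer end address 1620."""

    def __init__(self, start: int, size: int):
        if size <= 0 or size % MAX_DATA != 0:
            raise ValueError("area must be a positive multiple of 1 MB")
        self.area_end = start + size          # assumed: one past the last byte
        self.current = start                  # field 1610
        self.end = self.area_end - MARGIN     # field 1620

    def store(self, length: int) -> bool:
        """Advance field 1610; return True if an interrupt must be raised."""
        if not 0 < length <= MARGIN:
            raise ValueError("length exceeds one packet")
        self.current += length
        return self.current > self.end
```

Because the interrupt threshold sits one full packet below the real end of the area, the store that trips the threshold still lands inside the area, which is the guarantee the paragraph states.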

【００８７】 On receiving this transfer-buffer interrupt, the OS prepares a new buffer area in main memory, uses the address translation table 140 to map it so that it continues from the end address of the transfer buffer 370, and updates the buffer end address 1620 accordingly. This operation is also described in <2.2.3>.

【００８８】 The table management register 80 and the buffer management register 190 are likewise mapped into the OS-dedicated I/O area 2140 described with FIG. 14 and in <2.1>, so only the OS can access them.

【００８９】 The signal flow that occurs when the CPU 20 accesses the transmission control register 70, the table management register 80, and the buffer management register 190, and the procedure by which the CPU 20 accesses these registers in the network adapter 30, are described below with reference to FIGS. 1 and 2.

【００９０】 The CPU 20 issues, on line a10, a request carrying the address of one of the registers located in the OS-dedicated I/O area. Because this address designates one of the transmission control register 70, the table management register 80, or the buffer management register 190 placed in the OS-dedicated I/O area 2140 shown in FIG. 14, the main memory control unit 40 forwards the request to the network adapter 30 over line a20. The request reaches the selector 420 in the network adapter 30. From the address value, the selector 420 determines whether the request targets the transmission control register 70, the table management register 80, or the buffer management register 190, and activates the corresponding one of lines c30, c55, and c75. As a result, the CPU 20 can exchange data with the transmission control register 70, the table management register 80, and the buffer management register 190 through line d10, the main memory control unit 40, and line d20.

【００９１】 <2.2.2> Operation of the Transmitting Node
The operation of the transmitting side is described step by step, mainly with reference to node 10 of FIG. 1 and to FIG. 2. In the following description, numbers enclosed in quotation marks are binary values.
[1] Transmission start
When the data transfer instruction PUT (FIG. 20) is issued from a user program, the CPU 20 begins transmission start processing. The CPU 20 issues, on line a10, a request carrying an address that corresponds to the register group 70A, the part of the transmission control register 70 located in the user I/O area. Because this address, which lies in the user I/O area 2150 shown in FIG. 14, designates where information is to be stored in the transmission control register 70, the main memory control unit 40 forwards the request to the network adapter 30 over line a20. The request reaches the selector 420 in the network adapter 30. From the address value, the selector 420 determines that this is an access request from the CPU 20 to the transmission control register 70 and activates line c30. As a result, the CPU 20 can exchange data with the transmission control register 70 through line d10, the main memory control unit 40, and line d20.

【００９２】 The CPU 20 accesses the transmission control register 70 as follows. First, the CPU 20 reads the control register status 1010. This field holds 1 while the transmission control register 70 is still in use by a previously started data transfer, and 0 once that use has ended. If the value read shows that the register is still in use, the CPU 20 polls the control register status 1010 and waits until the transmission control register 70 becomes available. When the register is available, the CPU sets in the transmission control register 70 the nine parameters specified by the data transfer instruction PUT (FIG. 20) in the user program. Finally, the CPU 20 writes "00", which indicates the transfer start state, to the transfer start field 1020 of the transmission control register 70. This activates the transmission control unit 410, and the network adapter 30 begins the data transmission operation described next.
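
The poll, set, and start sequence above can be sketched against a simulated register. Everything here is illustrative: the class, its methods, and the poll limit are hypothetical, and the nine PUT parameters are treated as opaque values because their definitions are in FIG. 20, which is not reproduced here.

```python
class SimulatedSendRegister:
    """Stand-in for transmission control register 70 (hypothetical model)."""

    def __init__(self, busy_reads: int = 0):
        self._busy_reads = busy_reads   # how many reads still report "in use"
        self.params = None              # the nine PUT parameters
        self.start_field = None         # transfer start field 1020

    def read_status(self) -> int:
        """Read control register status 1010: 1 = in use, 0 = free."""
        if self._busy_reads > 0:
            self._busy_reads -= 1
            return 1
        return 0

    def write_params(self, params) -> None:
        self.params = tuple(params)

    def write_start(self, value: str) -> None:
        self.start_field = value


def start_put(reg: SimulatedSendRegister, params, max_polls: int = 1000) -> int:
    """Poll until the register is free, set the parameters, then start."""
    if len(params) != 9:
        raise ValueError("PUT specifies nine parameters")
    polls = 0
    while reg.read_status() == 1:       # still used by an earlier transfer
        polls += 1
        if polls >= max_polls:          # assumed safety limit, not in the text
            raise TimeoutError("register never became free")
    reg.write_params(params)
    reg.write_start("00")               # "00" indicates the transfer start state
    return polls
```

The start write comes last so that the transmission control unit never sees a start request with partially written parameters.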

【００９３】 [2] Update of the transfer source flag (1)
Once activated by the transfer start processing, the transmission control unit 410 issues a write request to the main memory access unit 400 over line a50. The request asks that the value "0001" (see FIG. 12), which indicates the start of the transmission operation, be written to the transfer source flag 110 in main memory 50 designated by the transfer source flag address 1040 (FIG. 3) of the transmission control register 70. The write request specifies, as the main memory address, the pair consisting of the transfer source flag address in field 1040 and the process number 1210 that the OS has already set in the transmission control register 70. The main memory access unit 400 passes the virtual address of the transfer source flag 110 and the process number to the address translation unit 480 over line a70. The address translation unit 480 calculates the address of the entry in the address translation table 140 by the method described above and returns the resulting real address to the main memory access unit 400 over line a75. The main memory access unit 400 issues a read request for that real address to the address translation table 140 via line a55, mixer 490, line a30, the main memory control unit 40, and line a40. At the same time, it switches switch 500 to the address translation unit 480 via line c45. The address translation information (see FIG. 18) read from the address translation table 140 enters the address translation unit 480 through line d50, the main memory control unit 40, line d30, switch 500, and line d80. If the V bit 2610 of this information is 0, the address translation unit 480 interrupts the CPU 20 via line c10, because a V bit of 0 means that the transfer request specified by the user was invalid. Control passes to the OS, which takes appropriate action such as killing the process. The address translation unit 480 also interrupts the CPU 20 via line c10 when the R bit 2620 and the M bit 2630 are both 0. Because the transfer source flag 110 must reside in the user control area 90, that is, in an area fixed in main memory, the OS treats this interrupt too as an invalid transfer request and takes appropriate action such as killing the process.

【００９４】 When the address translation table information has been accessed validly, the address translation unit 480 calculates the real address of the transfer source flag 110 from the real page number 2460 (FIG. 18) in that information and asks the main memory access unit 400, over line a75, to write the value "0001" (see FIG. 12), which indicates the start of transmission, to that address. The main memory access unit 400 issues a write request to the transfer source flag 110 in main memory via line a55, mixer 490, line a30, the main memory control unit 40, and line a40, and sends the value "0001" to the transfer source flag 110 via line d85, mixer 460, line d40, the main memory control unit 40, and line d50.

【００９５】 [3] Start of data transfer
Next, the transmission control unit 410 sends the information held in the transmission control register 70 that is needed to build the data transfer packet to the network transmission unit 430 over line c40. Specifically, this is all of the information shown in FIG. 3 except the control register status 1010 and the transfer start bit 1020.

【００９６】 At the same time, the transmission control unit 410 issues a read request for the transfer source area 120 to the main memory access unit 400. The main memory access unit 400 and the address translation unit 480 read the address translation information from the address translation table 140 by the same procedure as above. If the V bit 2610 or the M bit 2630 is 0, the address translation unit 480 interrupts the CPU 20 via line c10. When the V bit 2610 is 0, the OS takes appropriate action such as killing the process. When the M bit is 0, the OS swaps in the page containing the transfer source area 120 from the external storage device 60 and restarts the transfer operation.
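
The two entry checks on the transmitting side, one for the transfer source flag 110 and one for the transfer source area 120, can be summarized as below. This sketch encodes only the outcomes stated in the text; the function and enum names are hypothetical, and the precise meanings of the R and M bits are defined in FIG. 19, which is not reproduced here.

```python
from enum import Enum


class Action(Enum):
    PROCEED = "proceed"            # translation is valid, continue the transfer
    KILL_PROCESS = "kill_process"  # invalid request; OS kills the process
    SWAP_IN = "swap_in"            # OS swaps the page in, then restarts


def check_source_flag_entry(v: int, r: int, m: int) -> Action:
    """Check applied when updating transfer source flag 110."""
    if v == 0:
        return Action.KILL_PROCESS   # invalid transfer request
    if r == 0 and m == 0:
        return Action.KILL_PROCESS   # flag must be fixed in main memory
    return Action.PROCEED


def check_source_area_entry(v: int, m: int) -> Action:
    """Check applied when reading transfer source area 120."""
    if v == 0:
        return Action.KILL_PROCESS   # invalid transfer request
    if m == 0:
        return Action.SWAP_IN        # page is fetched from external storage 60
    return Action.PROCEED
```

The asymmetry is the point: a non-resident flag is treated as a user error, whereas a non-resident data area is a normal paging event.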

【００９７】 When the address translation table information has been accessed validly, the address translation unit 480 can calculate the real address of the transfer source area 120 from it. As stated earlier, the hardware prohibits data transfers that cross a page boundary, so the address translation table 140 needs to be accessed only once per transfer.

【００９８】 In this state, to read the transfer data from the transfer source area 120, the main memory access unit 400 issues read requests covering the transfer data to the transfer source area 120 via line a70, the address translation unit 480, line a65, mixer 490, line a30, the main memory control unit 40, and line a40, and switches the output of switch 500 to the network transmission unit 430 via line c45. The transfer data read from the transfer source area 120 is delivered to the network transmission unit 430 through line d50, the main memory control unit 40, line d30, switch 500, and line d70. From this data and the contents of the transmission control register 70 sent earlier by the transmission control unit 410, the network transmission unit 430 generates the packet shown in FIG. 7 and sends it to the interprocessor network 1000 over line c10. In the packet format of FIG. 7, the valid bit 1710 is set to 1 (valid), and the packet type 1720 is set to "00", which indicates data transmission. FIG. 11 lists the values of the packet type 1720 and their meanings. Reply packets are described later in [5].

【００９９】 Based on the information at the head of the packet, the interprocessor network 1000 delivers the packet over line n20 to the network adapter 230 of the destination node 210. The subsequent operation on the node 210 side is described in <2.2.3>.

【０１００】 [4] Update of the transfer source flag (2)
By the same procedure as in [2], the transmission control unit 410 updates the transfer source flag 110 to "0010", which indicates that transmission is complete.

【０１０１】 [5] Reception of the write completion notice and update of the transfer source flag (3)
When the receiving node 210 finishes writing the transfer data to main memory 50 (see <2.2.3>), its reception control unit 440 returns a reply packet (see FIG. 8). There are two kinds of reply packet: (1) a packet whose packet type 1910 is "01" (see FIG. 11), returned when the receiving node 210 wrote the transfer data to the transfer destination area 340, and (2) a packet whose packet type 1910 is "10", returned when the receiving node 210 wrote the transfer data to the transfer buffer 370. How the receiving node 210 sends these packets is described in <2.2.3>.

【０１０２】 When the network reception unit 450 in the network adapter 30 of the transmitting node 10 receives one of these packets from the interprocessor network 1000 over line n20, the packet's information is first stored in the reception control register 170. FIG. 4 shows the structure of the reception control register. Fields for information that the packet does not carry, such as the transfer data length 1350, are filled with "0". Once the packet information is in the reception control register 170, the reception control unit 440 uses it to ask the main memory access unit 400, over line a60, to update the transfer source flag 110. The transfer source flag 110 is then updated in the same way as in [3]. For a packet whose packet type 1910 is "01", the transfer source flag 110 is updated to "1010"; for a packet whose packet type 1910 is "10", it is updated to "0110".
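
The transfer source flag values mentioned in steps [2], [4], and [5] can be collected in one place. The binary values below are taken from the text; the dictionary keys and the function name are hypothetical labels added for illustration.

```python
# Transfer source flag 110 values stated in the text (see FIG. 12).
SOURCE_FLAG = {
    "transmission_started": "0001",   # step [2]
    "transmission_complete": "0010",  # step [4]
}

# Reply packet type 1910 -> new value of transfer source flag 110 (step [5]).
FLAG_AFTER_REPLY = {
    "01": "1010",  # receiver wrote the data to transfer destination area 340
    "10": "0110",  # receiver stored the data in transfer buffer 370
}


def source_flag_after_reply(packet_type: str) -> str:
    """Return the flag value written when a reply packet arrives."""
    try:
        return FLAG_AFTER_REPLY[packet_type]
    except KeyError:
        raise ValueError(f"not a reply packet type: {packet_type!r}") from None
```

A sending program that reads the flag can therefore tell not only that the data arrived but also whether it reached the destination area directly or is still waiting in the receiver's transfer buffer.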

【０１０３】 <2.2.3> Operation of the Receiving Node
[1] Reception start
In the network adapter 230 of the receiving node, the reception operation starts when transfer data arrives at the network reception unit 450 from the interprocessor network 1000. The information at the head of the received transfer data packet (see FIG. 7) is recorded in the reception control register 170 via line c65.

【０１０４】 [2] Flag check
The reception control unit 440 is activated according to the information in the reception control register 170. It issues an access request to the transfer control flag 330 in the same way as in <2.2.2> [5], except that this access is a read, whereas the operation in <2.2.2> [5] is a write. The data read passes through switch 500 in the network adapter 230 and is stored in the reception control unit 440. The reception control unit 440 compares the semaphore value 2010 in the transfer control flag 330 with the comparison value 1080 in the arriving packet, using the method indicated by the comparison method 1090 in the arriving packet (see FIG. 10).
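
The flag check can be pictured as a small dispatch on the comparison method. This is only a sketch: the actual set of comparison methods and their encodings are defined in FIG. 10, which is not reproduced here, so the method names below and the operand order (semaphore on the left) are assumptions made for illustration.

```python
import operator

# Hypothetical method names; the real encodings are those of FIG. 10.
COMPARISONS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "lt": operator.lt,
    "le": operator.le,
    "gt": operator.gt,
    "ge": operator.ge,
}


def condition_holds(semaphore_value: int, comparison_value: int, method: str) -> bool:
    """Compare semaphore 2010 with comparison value 1080 using method 1090.

    Assumed operand order: semaphore <method> comparison value.
    """
    try:
        compare = COMPARISONS[method]
    except KeyError:
        raise ValueError(f"unknown comparison method: {method!r}") from None
    return compare(semaphore_value, comparison_value)
```

Since the sender supplies both the comparison value and the method in the packet, the receiving adapter can decide without involving the receiving CPU.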

【０１０５】 [3] Confirming that the transfer destination area 340 is in main memory 50
If the comparison shows that the condition indicated by the comparison method 1090 holds, the reception control unit 440 checks whether the transfer destination area 340 is in main memory 50. Specifically, it uses the address translation unit 480 for this check in the same way as when, at the start of data transfer in <2.2.2> [3], the presence of the transfer source area 120 in main memory 50 is checked. The difference is as follows: when the transfer source area 120 is not in main memory 50, the address translation unit 480 interrupts the CPU 20 via line c10, whereas when the transfer destination area 340 is not in main memory 50, the address translation unit 480 reports that fact to the reception control unit 440 via line c50.

【０１０６】 If the comparison shows that the condition indicated by the comparison method 1090 does not hold, the reception control unit 440 does not check whether the transfer destination area 340 is in main memory 50.

【０１０７】 [4] Update of the transfer destination flag (1)
As described next, if the reception control unit 440 has determined in [2] that the condition indicated by the comparison method 1090 holds and has confirmed in [3] that the transfer destination area 340 is in main memory 50, it starts the operation of writing the transfer data to the transfer destination area 340 in main memory 50. Otherwise, it starts the operation of storing the transfer data in the transfer buffer 370. Before either operation, the reception control unit 440 updates the transfer destination flag 320 according to which operation will be performed. This is done in the same way as the update of the transfer source flag in <2.2.2>. As shown in FIG. 13, the updated flag value is 0001 or 0011, indicating the "receiving, writing" state or the "receiving, buffering" state.

【０１０８】 In addition, when the comparison with the transfer control flag 330 results in the data being stored in the transfer buffer 370, the buffering count field of the transfer control flag 330 (2020 in FIG. 9) is incremented. When a program updates the semaphore value 2010, it uses this count as an indication of whether data has accumulated in the transfer buffer 370. The increment requires the same procedure as the other flag updates.

【０１０９】 [5] Writing the transfer data to main memory 50
If the reception control unit 440 determined in [2] that the condition indicated by the comparison method 1090 holds but found in [3] that the transfer destination area 340 is not in main memory 50, or if it determined in [2] that the condition indicated by the comparison method does not hold, it starts the operation of writing the transfer data to the transfer buffer 370 in main memory 50.

【０１１０】 That is, when the transfer destination area 340 is not in main memory 50, writing the transfer data to the transfer buffer 370 keeps the received data usable even though the transfer destination area 340 has been swapped out of main memory.

【０１１１】 The condition indicated by the comparison method 1090 in [2] fails to hold when the transfer destination area 340 cannot be used for writing, for example because the destination process is still using it. In this embodiment, as already illustrated, the receiving process is designed to rewrite the semaphore 2010 as its program progresses. If the condition specified by the comparison method does not hold when the data is received, the transfer data is forcibly stored in the transfer buffer 370, regardless of the value of the comparison method 2770, and is not transferred to the transfer destination area 340. The data is moved from the transfer buffer 370 to the transfer destination area 340 later. This reduces the overhead of the interprocess synchronization needed for data transfer.

【０１１２】 To start writing the transfer data to the transfer buffer 370, the reception control unit 440 issues a buffer write request to the buffer control unit 470 over line c70. On receiving the request, the buffer control unit 470 checks the buffer management register 190 and obtains the current write start address (a virtual address) from the current buffer address 1610. This address is passed to the main memory access unit 400 over line a80. The main memory access unit 400 converts the virtual address to a real address; because the transfer buffer resides in the OS area 130, the conversion uses "0000" as the process number. The conversion itself (for example, how the translation information is fetched into the address translation unit 480) follows the same procedure as described above. The converted address is output toward the transfer buffer 370 in main memory through line a70, the address translation unit 480, line a65, mixer 490, and so on. At the same time, the reception control unit 440 controls the network reception unit 450 via line c60 so that the received packet itself is written from the network reception unit 450 into the transfer buffer 370 through line d90 and mixer 460.

【０１１３】 When transfer data has been written to the transfer buffer 370, the buffer control unit 470 updates the current buffer address in the buffer management register 190 accordingly. If the current buffer address 1610 then becomes larger than the buffer end address 1620 (which holds a value 1 MB + α below the true end of the buffer), the buffer control unit 470 interrupts the CPU via line c10. The CPU reserves new pages for the buffer, links them to the existing transfer buffer 370, and thereby enlarges the buffer area.

【０１１４】When the reception control unit 440 determines in [2] above that the condition indicated by the comparison method is satisfied, and confirms in [3] above that the transfer destination area 340 exists in the main memory 50, it starts the operation of writing the transfer data into the transfer destination area 340 in the main memory 50. In other words, this is the case in which the transfer destination area 340 has been swapped into the main memory 50 and the transfer destination process has finished using it; the transfer data is then written to the transfer destination area 340 immediately. This write is performed in the same way as the operation, described above, of storing the transfer data in the transfer buffer 370. However, the process number used for address conversion is the number of the transfer destination process attached to the received packet. Also, unlike a write to the transfer buffer 370, the control information of the received packet is stripped off and only the data is written to the transfer destination area 340.
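The choice between writing to the transfer destination area and buffering can be sketched as below. The `Packet` fields, the dictionary standing in for main memory, and the list standing in for the transfer buffer are all hypothetical; the sketch only illustrates that a direct write stores the payload alone, while a buffered packet keeps its control information.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    header: bytes        # control information (hypothetical layout)
    payload: bytes       # transfer data
    dest_vaddr: int      # virtual address of the transfer destination area
    dest_process: int    # process number used for address conversion


def receive(packet, condition_met, resident, main_memory, transfer_buffer):
    """Write directly only if the condition holds AND the area is resident."""
    if condition_met and resident:
        # Control information is stripped; only the data is written.
        main_memory[(packet.dest_process, packet.dest_vaddr)] = packet.payload
        return "written"
    # Otherwise keep the whole packet, header included, for the OS to find later.
    transfer_buffer.append(packet)
    return "buffered"
```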

【０１１５】[6] Update of the transfer destination flag (2)
After the write to the main memory 50 finishes, the transfer destination flag 320 is updated. This is done by the same operation as the update of the transfer source flag in <2.2.2>. As shown in FIG. 13, the updated value is 0110 or 0100, indicating the "reception complete, buffered" state or the "reception complete, write complete" state, respectively.

【０１１６】[7] Return of the reply packet
To notify the sending node 10 that the write to the main memory 50 is complete, a reply packet (FIG. 8) is returned. The reception control unit 440 does this by sending the information needed to generate the reply packet to the transmission control register 70 over line d65 and triggering transmission over line c35. The triggering must follow the same procedure that the CPU 20 performed in <2.2.2>.

【０１１７】[8] Movement of transfer data from the transfer buffer 370 to the transfer destination area 340
If the transfer data could be written to the transfer destination area 340 in operation [5] above, the operation of the receiving node ends here. If, however, the transfer data was stored in the transfer buffer 370, it must be drawn out of the transfer buffer 370 into the transfer destination area 340 at an appropriate time. Because the transfer buffer resides in the OS-managed area, only the OS can draw it out. What triggers the draw-out depends on the circumstances under which the transfer data was placed in the transfer buffer 370. Each trigger is described below.

【０１１８】(1) When the transfer destination area 340 did not exist in the main memory 50 at the time the transfer data was received
In this case, the transfer data is moved from the transfer buffer 370 to the transfer destination area 340 when the transfer destination process later issues a data read instruction (for example, a load instruction) to the page containing the transfer destination area 340. That is, when the CPU 20 executes this instruction, a page fault occurs while the virtual address specified by the instruction is being converted to a real address. The transfer data is moved from the transfer buffer 370 to the transfer destination area 340 as part of handling this page fault.

【０１１９】This page fault passes control to the OS. In this embodiment, a routine that performs page fault processing is provided in the OS, as shown in FIG. 23. That is, after a free page is prepared in response to the page fault, all transfer buffers 370 are examined to determine whether data for the page requested by the instruction exists in the transfer buffer 370 (step 4020). The data accumulated in the transfer buffer 370 remains there together with its packet header information. The OS can use this information to find out which page each piece of accumulated data belongs to.

【０１２０】If no data for this page exists in the transfer buffer 370 at all, the page is swapped in from the external storage device 60 (step 4030). If all of the data for the requested page exists in the transfer buffer 370, no swap-in is performed, and the data for the page in the transfer buffer 370 is memory-copied to the transfer destination area 340 (4060 in the figure). For data that has been drawn out, the data corresponding to the valid bit 1710 of the packet header is set to 0 (invalid), so that software can ignore it in subsequent searches. Where possible, the memory copy may be skipped and only the virtual address remapped. If only part of the requested page exists in the transfer buffer 370, the data of the page is swapped in from the external storage device 60 (step 4040), and then the partial data held in the transfer buffer 370 is memory-copied (step 4050). After the above processing ends, the address conversion table 140 is updated so that the M bit 2630 (FIG. 18) for the page is set to 1 (step 4070). In this way, transfer data can be drawn out of the transfer buffer 370 at the time of a page fault.
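The page fault routine of steps 4020 to 4070 can be sketched roughly as follows. This is a simplified model rather than the OS routine of FIG. 23: the `Buffered` record, the 4096-byte page size, the `swap_in` callback, and the dictionary used as the address conversion table are hypothetical, and buffered entries are assumed not to overlap or to cross a page boundary.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size


@dataclass
class Buffered:
    valid: bool    # models the valid bit 1710 in the packet header
    vaddr: int     # destination virtual address taken from the header
    data: bytes


def handle_page_fault(page_base, buffers, swap_in, page_table):
    """Fill a free page from the transfer buffer and/or external storage."""
    page = bytearray(PAGE_SIZE)                        # free page prepared
    hits = [b for b in buffers                         # step 4020: search buffers
            if b.valid and page_base <= b.vaddr < page_base + PAGE_SIZE]
    covered = sum(len(b.data) for b in hits)
    if covered < PAGE_SIZE:                            # none or only part buffered
        page[:] = swap_in(page_base)                   # steps 4030 / 4040
    for b in hits:                                     # steps 4050 / 4060: copy
        off = b.vaddr - page_base
        page[off:off + len(b.data)] = b.data
        b.valid = False                                # ignored by later searches
    page_table[page_base] = {"frame": page, "M": 1}    # step 4070: set M bit
    return page
```

When the whole page is already in the buffer, `swap_in` is never called, which corresponds to the case where the swap-in from the external storage device 60 is skipped.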

【０１２１】In conventional SEND/RECEIVE data transfer, data is drawn out of the receive buffer only when a RECEIVE instruction is executed, that is, only when the user process explicitly asks the OS to draw it out. In this embodiment, which uses PUT-type data transfer, the draw-out of transfer data can begin merely because the receiving-side user process issues a data read instruction (for example, a load instruction) for data in main memory. This is because, in a system that adopts PUT-type data transfer, the receiving process is able to access the transfer destination area without knowing whether the transfer data has been written there. Furthermore, in conventional SEND/RECEIVE data transfer, when the user process asks the OS to draw out transfer data, the OS searches the receive buffer to detect whether the requested transfer data exists there and, if it does, detects whether the page to which the data should be transferred exists in main memory. In this embodiment, by contrast, when the user process issues a data read instruction and a page fault occurs, it is determined at that point whether the page to which the requested data belongs exists in the transfer buffer. In this respect too, the procedure for using transferred data in this embodiment differs from that of conventional SEND/RECEIVE data transfer.

【０１２２】(2) When the semaphore 2010 did not satisfy the condition specified by the transfer instruction at the time the transfer data was received
This case arises when the receiving-side user process was using the transfer destination area 340 at the time the transfer data was received. In this case, after the receiving-side process has finished using the transfer destination area 340, and at the point where it is about to use the transfer data that should be stored there, the receiving-side user process examines the transfer destination flag 320 to determine whether the data has already been transferred to the transfer buffer 370. Once it detects that the transfer data has been transferred to the transfer buffer 370, it issues a system call requesting that the transfer data be drawn out of the transfer buffer 370 into the transfer destination area 340. To determine the reception state of the transfer data, the user process can specify the transfer destination flag 320 and read it. The OS performs the draw-out in response to this system call. This processing is as described earlier using the concrete program example shown in FIG. 22. In that program, the system call was the GETBUF instruction. The instruction that reads the transfer destination flag is generated from the IF statement contained in instruction sequence 3150A or 3150B.
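The receiving-side sequence (wait for reception, test the transfer destination flag, issue GETBUF only if the data was buffered, then read) might look like the sketch below. The three callbacks and the name `RBUF_DONE` are hypothetical stand-ins; only the values 0110 and 0100 and the name RFIN come from the description, and the busy-wait loop is a simplification of the wait instruction.

```python
RBUF_DONE = 0b0110   # reception complete, buffered (name is hypothetical)
RFIN = 0b0100        # reception complete, written to the destination area


def use_received_data(read_flag, getbuf, read_area):
    """Wait for reception, pull data out of the buffer if needed, then read it."""
    flag = read_flag()
    while flag not in (RBUF_DONE, RFIN):   # stands in for the wait instruction
        flag = read_flag()
    if flag == RBUF_DONE:                  # the IF statement before GETBUF
        getbuf()                           # system call: OS moves the data
    return read_area()                     # the READ of the destination area
```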

【０１２３】Note that if, at the time the transfer data was received, the transfer destination area did not exist in main memory and the semaphore 2010 also did not satisfy the above condition, the transfer data is moved when either the system call or the memory access instruction described above is executed. When the system call is issued, a page-in operation must be performed.

【０１２４】(3) The transfer data can also be moved in cases other than those above. For example, when the user process updates the semaphore value 2010 of the transfer control flag 330 and wants, on that occasion, to move all at once to the transfer destination area 340 the transfer data that had accumulated in the transfer buffer 370 because of the old semaphore value, the user process issues a system call and the OS moves the transfer data.

【０１２５】[9] Return of the reply packet corresponding to arrival completion
When [8] above has been executed, the reply packet is returned by OS software processing, using the same operation as the transmission operation of <2.2.2>. Unlike when a data transfer packet is sent, writing "01" to the transfer activation field 1020 of the transmission control register 70 sends a reply packet (arrival confirmation). Although it should not be needed, writing "10" sends a buffering confirmation packet under software control.

【０１２６】As described above, the program illustrated in FIG. 20 is executed in the manner explained earlier. As a result, as explained earlier, the PUT instruction can be executed without keeping the transfer destination area resident in main memory. In addition, synchronization between the sending-side user program and the receiving-side user program is simple.

【０１２７】<3. Modification>
The present invention is not limited to the embodiment above and can cover various modifications. One of them is described next. The modification described next can be realized without changing the hardware configuration. Only the parts that differ from the preceding embodiment are described below.

【０１２８】<3.1> Basic idea of the modification
The value of the transfer control flag is controlled by the user program. It is therefore essential that the user can see whether transfer data was buffered in the transfer buffer 370 because of the value of the transfer control flag. Whether the transfer destination area 340 exists in main memory, however, is normally not something the user can manage; it depends on the main memory management of the OS. Consequently, the fact that data was buffered because the transfer destination area 340 did not exist in main memory need not be shown to the user through the transfer source flag and the transfer destination flag.

【０１２９】<3.2> Key points of the modification
In the concrete usage example of the transfer control flag shown in <1.4>, the transfer destination flag rb was set to 0011, indicating that a write to the transfer buffer 370 is in progress, or to 0110 (FIG. 13), indicating that the write to that buffer is complete, in both of the following cases: (1) the received data is written to the transfer buffer 370 because the comparison between the comparison value specified by the PUT instruction 3120A (FIG. 22) and the transfer control flag b does not satisfy the comparison condition specified by the PUT instruction, and (2) the transfer destination area 340 had been swapped out.

【０１３０】This is changed so that the transfer destination flag rb is set to 0011 (write to the transfer buffer 370 in progress) and 0110 (write to that buffer complete) only in the first case. When the comparison value specified by the PUT instruction satisfied the above condition but the transfer destination area was not in main memory, the situation is handled in the same way as the first case. In the second case, the transfer destination flag rb is set to 0001, indicating that a write to the transfer destination area 340 is in progress, or to 0100, indicating that the write there is complete.

【０１３１】In the second case, the value of the transfer source flag sa is likewise changed from 0110 (FIG. 12), which indicates that the received data was buffered in the transfer buffer 370, to 1010, which indicates completion of the write to the transfer destination area 340.

【０１３２】The change to the value set in the transfer source flag is realized by changing the value of the reply packet returned from the receiving node. In other words, this modification is realized by changing only the operation of the receiving node. The above applies in exactly the same way to the PUT instruction 3120B (FIG. 22).

【０１３３】<3.3> Changes to program operation at the receiving node
Suppose the program shown in FIG. 22 is executed in this modification and the received data was written to the transfer buffer 370 because the transfer destination area 340 had been swapped out, even though the value of the transfer control flag b or a satisfied the condition specified by the PUT instruction 3120A or 3120B, respectively. Then, after the wait instruction 3140A or 3140B that waits for reception completes, the value of the transfer control flag b or a is 0100 = RFIN, so the IF statement at 3150A or 3150B does not invoke the GETBUF instruction. However, a page fault occurs when the READ statement at 3160A or 3160B reads the received data, and this triggers the OS to draw the received data out of the transfer buffer 370.

【０１３４】Moving the data in the transfer buffer to the transfer destination area is not limited to the case where a data read instruction is executed; the same applies when an instruction that writes data to the transfer destination area is executed. This also holds for the embodiment described earlier.

【０１３５】Specifically, the operation of the receiving node is changed in the following four respects.

【０１３６】A) In the update of the transfer destination flag (1) shown in <2.2.3> [4], the value 0001, indicating that a write to the transfer destination area 340 is in progress, is written to the transfer destination flag ra or rb even when the received data was written to the transfer buffer 370 because the transfer destination area 340 had been swapped out.

【０１３７】B) In the update of the transfer destination flag (2) shown in <2.2.3> [6], the value 0100, indicating completion of the write to the transfer destination area 340, is written to the transfer destination flag ra or rb even when the received data was written to the transfer buffer 370 because the transfer destination area 340 had been swapped out.

【０１３８】C) In the return of the reply packet shown in <2.2.3> [7], the reply packet carries the value 01, indicating arrival confirmation, even when the received data was written to the transfer buffer 370 because the transfer destination area 340 had been swapped out.

【０１３９】D) For the return of the reply packet indicating arrival completion shown in <2.2.3> [8], the OS returns a reply packet when the received data is forcibly drawn out of the transfer buffer 370 by the GETBUF function, but no reply packet needs to be returned when the draw-out is triggered by a READ instruction.
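The difference between the embodiment and this modification in items A) and B) can be summarized in a small sketch. The function and parameter names are hypothetical; the flag values are the ones quoted in the text (0011/0110 for the buffer, 0001/0100 for the transfer destination area). Treating the case where the condition fails and the area is also absent as "buffered" is an assumed reading, taken from the wording of claim 17.

```python
def destination_flag_values(condition_met, resident, variation=False):
    """Return (stored_in_buffer, in_progress_value, completed_value)."""
    # Data is physically buffered unless both checks pass.
    stored_in_buffer = not (condition_met and resident)
    if variation:
        # Modification: buffering caused only by swap-out is hidden from the user.
        shown_as_buffered = not condition_met
    else:
        shown_as_buffered = stored_in_buffer
    if shown_as_buffered:
        return stored_in_buffer, 0b0011, 0b0110
    return stored_in_buffer, 0b0001, 0b0100
```

In the modification, a swapped-out destination with a satisfied condition therefore reports 0001 and then 0100 even though the data actually sits in the transfer buffer until a page fault moves it.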

【０１４０】[0140]

[Effects of the Invention] As is clear from the embodiment and modification described above, according to the present invention, PUT-type data transfer can be performed without fixing the receive data area in main memory.

【０１４１】In addition, PUT-type data transfer can be executed with little overhead for synchronization between processors.

[Brief description of drawings]

FIG. 1 is a schematic configuration diagram of a parallel computer according to the present invention.

FIG. 2 is a schematic configuration diagram of the network adapter 30 used in the apparatus of FIG. 1.

FIG. 3 is a diagram showing the information in the transmission control register 70 in the network adapter 30.

FIG. 4 is a diagram showing the information in the reception control register 170 in the network adapter 30.

FIG. 5 is a diagram showing the information in the table management register 80 in the network adapter 30.

FIG. 6 is a diagram showing the information in the buffer management register in the network adapter 30.

FIG. 7 is a diagram showing the information in a data transmission packet for the interprocessor network 1000.

FIG. 8 is a diagram showing the information in a reply packet for the interprocessor network 1000.

FIG. 9 is a diagram showing the information in the transfer control flag 330 provided in the main memory 50.

FIG. 10 is a diagram for explaining the values and meanings of the comparison method 1090 in the transmission control register 70.

FIG. 11 is a diagram for explaining the values and meanings of the packet type 1700.

FIG. 12 is a diagram for explaining the values and meanings of the transfer source flag 110 provided in the main memory 50.

FIG. 13 is a diagram for explaining the values and meanings of the transfer destination flag 320 provided in the main memory 50.

FIG. 14 is a diagram for explaining the real address space used in the apparatus of FIG. 1.

FIG. 15 is a diagram for explaining the method of mapping from the virtual address space to the real address space used in the apparatus of FIG. 1.

FIG. 16 is a diagram for explaining the address conversion method used in the apparatus of FIG. 1.

FIG. 17 is a diagram for explaining the method of generating an address for accessing the address conversion table 140.

FIG. 18 is a diagram showing the information prepared in the address conversion table 140.

FIG. 19 is a diagram for explaining the meaning of the information shown in FIG. 18.

FIG. 20 is a diagram showing the format of the data transfer instruction PUT used in the apparatus of FIG. 1.

FIG. 21 is a diagram showing an example of a program that uses a conventional PUT instruction.

FIG. 22 is a diagram showing an example of a program used in the apparatus of FIG. 1.

FIG. 23 is a flowchart of the page fault processing executed by the apparatus of FIG. 1.

Claims

[Claims]

1. A data transfer method in a computer system having a plurality of nodes interconnected by a data transfer network, each node having at least one processor and a main memory, wherein: data and a virtual address of a transfer destination area in which the data is to be stored are sent from a transfer source node to a transfer destination node; when the transfer destination area to which the received virtual address is allocated is in the main memory of the transfer destination node at the time the data is received, the data is stored in the transfer destination area to which the virtual address is allocated; when the transfer destination area is not in the main memory of the transfer destination node at the time the data is received, the data is stored in a buffer area for temporary holding provided in the transfer destination node; and when a predetermined instruction is subsequently issued by a user program running on the transfer destination node after the data has been stored in the buffer area, the data stored in the buffer area is transferred to the transfer destination area, the transfer being executed after the transfer destination area has been allocated in the main memory.

2. The data transfer method according to claim 1, wherein the predetermined instruction is a memory access instruction requesting access to the transfer destination area.

3. The data transfer method according to claim 2, wherein the transfer step comprises the steps of: executing page fault processing in response to a page fault generated as a result of executing the access instruction when the transfer destination area does not exist in the main memory; allocating the transfer destination area in the main memory by the page fault processing; and moving the data from the buffer area to the transfer destination area allocated in the main memory.

4. The data transfer method according to claim 1, wherein the predetermined instruction is a predetermined system call to an operating system controlling the transfer destination node.

5. The data transfer method according to claim 1, wherein the transfer step is executed by the operating system (OS) controlling the transfer destination node, and the steps other than the transfer step are executed by a circuit connected to the processor of the transfer destination node, the main memory, and the network.

6. The data transfer method according to claim 1, further comprising the step of storing a transfer destination flag in the main memory of the transfer destination node, wherein the transfer destination flag indicates the reception state of the data at the transfer destination node and has a first value when the data is stored in the buffer area and a second value when the data is stored in the transfer destination area.

7. A data transfer method in a computer system having a plurality of nodes interconnected by a data transfer network, each node having at least one processor and a main memory, wherein: data, a virtual address of a transfer destination area in which the data is to be stored, and a comparison value are sent from a transfer source node to a transfer destination node; the transfer destination node stores a semaphore value specified by a user program being executed there; it is determined whether the comparison value sent from the transfer source node and the semaphore value stored in the transfer destination node satisfy a predetermined condition for allowing the data to be stored in the transfer destination area to which the virtual address is allocated; when the determination step determines that the condition is not satisfied, the data is stored in a buffer area for temporary holding provided in the transfer destination node; and the data is transferred from the buffer area to the transfer destination area in response to a transfer request subsequently issued by the program.

8. The data transfer method according to claim 7, wherein the semaphore is held at a position accessible by a user program, and the address of the position is sent together with the data from the transfer source node.

9. The data transfer method according to claim 7, further comprising the step of storing a transfer destination flag in an area accessible by a user program in the main memory of the transfer destination node, wherein the transfer destination flag indicates the reception state of the data at the transfer destination node and has a first value when the data is stored in the buffer area and a second value when the data is stored in the transfer destination area.

10. The data transfer method according to claim 9, wherein the address at the position where the transfer destination flag is held is sent together with the data from the transfer source node.

11. The data transfer method according to claim 9, further comprising the steps of: notifying, from the transfer destination node to the transfer source node, reception state information indicating the progress of reception of the data at the transfer destination node; and storing, by the transfer source node in response to the reception state information notified from the transfer destination node, a transfer source flag in an area accessible by a user program in the main memory of the transfer source node, wherein the transfer source flag has a first value when the data is stored in the buffer and a second value when the data is stored in the transfer destination area.

12. The data transfer method according to claim 11, wherein the semaphore is held in the main memory of the transfer destination node at a position accessible by a user program; the address of the position where the semaphore is held, the address of the position where the transfer destination flag is held, and the address of the position in the main memory of the transfer source node where the transfer source flag is held are sent from the transfer source node together with the data; and the notifying step further comprises notifying the transfer source node of the address, received from the transfer source node, of the position where the transfer source flag is held.

13. The data transfer method according to claim 9, further comprising the steps of: notifying, from the transfer destination node to the transfer source node, reception state information indicating the progress of reception of the data at the transfer destination node; and storing, by the transfer source node in response to the reception state information notified from the transfer destination node, a transfer source flag in an area accessible by a user program in the main memory of the transfer source node, wherein the transfer source flag has a first value when the data is stored in the buffer and a second value when the data is stored in the transfer destination area.

14. The data transfer method according to claim 7, further comprising the steps of: detecting, at the transfer destination node, whether the transfer destination area to which the virtual address is allocated exists in the main memory of the transfer destination node when the data is received; and storing the data in the buffer area when the detection step detects that the transfer destination area does not exist in the main memory of the transfer destination node or when the determination step determines that the condition is not satisfied, wherein the step of storing the data in the transfer destination area is executed when the transfer destination area exists in the main memory of the transfer destination node at the time the data is received and the determination step determines that the condition is satisfied.

15. The data transfer method according to claim 14, further comprising the step of storing a transfer destination flag in an area accessible by a user program in the main memory of the transfer destination node, wherein the transfer destination flag indicates the reception state of the data at the transfer destination node.

16. The data transfer method according to claim 15, wherein said transfer destination flag has a first value and a second value when said data is stored in said buffer and said transfer destination area, respectively.

17. The data transfer method according to claim 15, wherein the transfer destination flag has a first value when the data is stored in the buffer because the condition is not satisfied, whether or not the transfer destination area exists in the main memory of the transfer destination node, and has a second value when the data is stored in the transfer destination area because the transfer destination area exists in the main memory of the transfer destination node and the condition is satisfied, or when the condition is satisfied but the transfer destination area does not exist in the main memory of the transfer destination node.

18. The data transfer method according to claim 15, wherein the address at the position where the transfer destination flag is held is sent from the transfer source node together with the data.

19. The data transfer method according to claim 15, further comprising the steps of: the transfer destination node notifying the transfer source node of reception status information indicating the progress of reception of the data at the transfer destination node, the reception status information indicating at least whether the data has been stored in the buffer area or in the transfer destination area; and the transfer source node storing a transfer source flag in an area accessible by the user program in the main memory of the transfer source node in response to the reception status information notified from the transfer destination node.

20. The data transfer method according to claim 19, wherein said transfer source flag has a first value and a second value, respectively, when said data is stored in said buffer and said transfer destination area.

21. The data transfer method according to claim 19, wherein the transfer source flag has a first value when the data is stored in the buffer because the condition is not satisfied, whether or not the transfer destination area exists in the main memory of the transfer destination node, and has a second value when the data is stored in the transfer destination area because the transfer destination area exists in the main memory of the transfer destination node and the condition is satisfied, or when the condition is satisfied but the transfer destination area does not exist in the main memory of the transfer destination node.

22. The semaphore is stored at a position accessible by a user program in the main memory of the transfer destination node; the address of the position where the semaphore is held, the address of the position where the transfer destination flag is held, and the address of the position in the main memory of the transfer source node where the transfer source flag is held are sent from the transfer source node together with the data; and the step of notifying includes notifying the transfer source node of the address, previously notified from the transfer source node, of the position where the transfer source flag is held. 20. Data transfer method described.
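
As an illustration of the receive-side behavior recited in claims 14 to 16 and the flag notification of claims 19 and 20, the following is a minimal Python sketch, not the claimed implementation. The class names DestinationNode and SourceNode, the constants FIRST_VALUE and SECOND_VALUE, and the use of a dictionary to model which transfer destination areas are in main memory are all assumptions made for illustration; the source does not specify any flag encoding. The outcome of the condition check is passed in as a boolean.

```python
from dataclasses import dataclass, field

FIRST_VALUE = 1   # hypothetical encoding: data was held in the buffer
SECOND_VALUE = 2  # hypothetical encoding: data reached the transfer destination area


@dataclass
class DestinationNode:
    # Virtual addresses whose transfer destination area is currently in main memory.
    resident: dict = field(default_factory=dict)
    # Buffer area for data that cannot be delivered yet.
    buffer: list = field(default_factory=list)
    # Transfer destination flag, readable by the user program (claim 15).
    dest_flag: int = 0

    def receive(self, vaddr, data, condition_met):
        """Store one received message and return the reception status."""
        if condition_met and vaddr in self.resident:
            self.resident[vaddr] = data        # claim 14: both checks passed
            self.dest_flag = SECOND_VALUE      # claim 16
        else:
            self.buffer.append((vaddr, data))  # claim 14: either check failed
            self.dest_flag = FIRST_VALUE       # claim 16
        return self.dest_flag


@dataclass
class SourceNode:
    # Transfer source flag, readable by the user program (claim 19).
    source_flag: int = 0

    def on_status(self, status):
        """Record the reception status notified by the destination (claims 19, 20)."""
        self.source_flag = status
```

In this sketch a user program on either node only has to read dest_flag or source_flag to learn whether the data was delivered or held back, without a separate synchronization step.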

23. A computer system comprising a plurality of nodes and a data transfer network connecting the plurality of nodes to each other, wherein each node has at least one processor, a main memory connected to the processor, and a network adapter connected to the main memory, the processor, and the network; the network adapter has: a transmitter, connected to the processor and the network, for sending data and a virtual address of a transfer destination area in which the data is to be stored via the network; a receiver, connected to the network, for receiving via the network data transferred from one of the nodes and the virtual address of the transfer destination area; an address translation unit, connected to the main memory of the node, for translating the virtual address received by the receiver into the real address of the real page allocated to the virtual address; and a control circuit, connected to the receiver, the address translation unit, and the main memory of the node, which writes the data received by the receiver into the real page, in response to the real address obtained by the address translation unit, when the real page is present in the main memory, and stores the received data in a predetermined buffer area in the main memory of the node when the real page is not present in the main memory; and the processor is programmed, when no real page was allocated to the virtual address at the time the data was received, to allocate a real page in the main memory to the virtual address in response to a request issued by a program executing on the processor, and to transfer the data from the buffer area to the real page after the allocation.

24. The computer system according to claim 23, wherein the request is a memory access request, and the processor is further programmed to allocate a real page to the virtual address, and to transfer the data from the buffer area to the real page allocated to the virtual address, during execution of page fault processing for a page fault generated by the address translation unit when the memory access request is executed.

25. The computer system according to claim 23, wherein the request is a system call to the operating system of each node, and the processor is further programmed to allocate a real page to the virtual address, and to transfer the data from the buffer area to the real page allocated to the virtual address, while the operating system processes the system call.
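
The deferred delivery recited in claims 23 to 25 can be sketched as follows. This is an illustrative Python model under stated assumptions, not the claimed hardware or operating system: the class Node, the methods adapter_receive, read, and sys_receive, and the constant PAGE_SIZE are hypothetical names, each virtual address is treated as the start of one page, and received data is assumed to fit within a page.

```python
PAGE_SIZE = 4096  # assumed page size for the sketch


class Node:
    def __init__(self):
        self.page_table = {}   # virtual address -> real page; missing key = not allocated
        self.buffer_area = {}  # virtual address -> data held back by the adapter

    def adapter_receive(self, vaddr, data):
        """Control circuit: write to the real page if present, else use the buffer."""
        page = self.page_table.get(vaddr)
        if page is not None:
            page[:len(data)] = data
        else:
            self.buffer_area[vaddr] = bytes(data)

    def _allocate_and_transfer(self, vaddr):
        """Allocate a real page if needed, then move buffered data into it."""
        page = self.page_table.setdefault(vaddr, bytearray(PAGE_SIZE))
        data = self.buffer_area.pop(vaddr, None)
        if data is not None:
            page[:len(data)] = data
        return page

    def read(self, vaddr, length):
        """Memory access request; a missing page models the page fault of claim 24."""
        page = self.page_table.get(vaddr)
        if page is None:
            page = self._allocate_and_transfer(vaddr)
        return bytes(page[:length])

    def sys_receive(self, vaddr):
        """System call path of claim 25."""
        self._allocate_and_transfer(vaddr)
```

Both paths end in the same allocate-then-transfer routine, so the receiving program sees the data at its virtual address whether it issues the system call first or simply reads the area.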

26. A computer system comprising a plurality of nodes and a data transfer network connecting the plurality of nodes to each other, wherein each node has at least one processor, a main memory connected to the processor, and a network adapter connected to the main memory, the processor, and the network; the network adapter has: a transmission unit, connected to the processor and the network, for sending, to one of the plurality of nodes via the network, the data, the virtual address of the transfer destination area in which the data is to be stored, and a comparison value for the data, as designated by a program running on the node; a receiving unit, connected to the network, for receiving via the network data transferred from one of the nodes together with the virtual address of the transfer destination area and the comparison value for the data; an address translation unit, connected to the main memory of the node, which translates the virtual address received by the receiving unit into the real address of the real page to which the virtual address is assigned; and a reception control circuit, connected to the receiving unit, the address translation unit, and the main memory of the node, which determines whether the value of a semaphore specified by the program running on the node and the received comparison value satisfy a predetermined condition, writes the data received by the receiving unit into the transfer destination area having the real address obtained by the address translation unit for the received virtual address when the condition is satisfied, and stores the data in a predetermined buffer area in the main memory of the node when the condition is not satisfied; and the processor is further programmed to transfer the data from the buffer area to the real page in response to a request issued by the program executing on the processor.

27. The computer system according to claim 26, wherein the reception control circuit stores a transfer destination flag in an area of the main memory of the node that is accessible by the program executing on the node, the transfer destination flag indicates a reception state of the data at the node, and the transfer destination flag has a first value and a second value when the data is stored in the buffer area and in the transfer destination area, respectively.
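
The condition check performed by the reception control circuit of claim 26 can be illustrated with the following Python sketch. It is a simplified model under stated assumptions: the message fields follow the items listed in the abstract (data, destination virtual address, transfer control flag address, comparison value, comparison method), but the names Message, METHODS, condition_met, and reception_control, as well as the method labels "eq", "ne", "lt", and "ge", are hypothetical because the source does not enumerate the comparison methods. Main memory is modeled as a dictionary keyed by address, and address translation is omitted.

```python
import operator
from dataclasses import dataclass

# Hypothetical labels for comparison methods; not taken from the source.
METHODS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "lt": operator.lt,
    "ge": operator.ge,
}


@dataclass
class Message:
    data: bytes
    dest_vaddr: int        # virtual address of the transfer destination area
    flag_addr: int         # address of the transfer control flag holding the semaphore
    comparison_value: int
    method: str            # key into METHODS


def condition_met(main_memory, msg):
    """Compare the semaphore at flag_addr with the received comparison value."""
    semaphore = main_memory[msg.flag_addr]
    return METHODS[msg.method](semaphore, msg.comparison_value)


def reception_control(main_memory, buffer_area, msg):
    """Write to the destination area if the condition holds, else hold in the buffer."""
    if condition_met(main_memory, msg):
        main_memory[msg.dest_vaddr] = msg.data
        return "destination"
    buffer_area.append(msg)
    return "buffer"
```

Because the sender supplies both the comparison value and the comparison method, the receiving program only has to maintain the semaphore; data that arrives before the semaphore reaches the expected state stays in the buffer area until the processor transfers it on request.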