JP3791463B2 - Arithmetic unit and data transfer system

Info

Publication number: JP3791463B2
Application number: JP2002163160A
Authority: JP
Inventors: 修三和田崎
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2002-06-04
Filing date: 2002-06-04
Publication date: 2006-06-28
Anticipated expiration: 2022-06-04
Also published as: JP2004013324A

Description

[0001]
[Technical field of the invention]
The present invention relates to an arithmetic device and a data transfer system that reduce the load on each arithmetic device of a so-called multiprocessor computer system, in which the processing to be performed is distributed among a plurality of arithmetic devices and executed in parallel, and that thereby improve the arithmetic performance of the system.
[0002]
[Prior art]
In recent years, the performance of a single information processing device has improved dramatically owing to the miniaturization of LSIs. Nevertheless, for large-scale scientific computing and similar workloads, the required performance can currently be reached only by using several such devices in a multi-node configuration. In multi-node information processing, the arithmetic processing is subdivided among the arithmetic devices and processed in parallel, but a program contains more than a few parts that cannot be parallelized. For these parts, the intermediate results of the computation must be collected on one node, which then decides on or executes the subsequent processing. For this reason, inter-node transfer is generally given a wider bandwidth than I/O in order to improve performance. When a single program is processed by many nodes, each arithmetic device reads the program and data into its own memory at the start of program execution and then performs its computation. Consequently, every arithmetic device reads the same common program parts and initial data, the load concentrates on the disk, and the throughput of the system as a whole is reduced.
Even a multi-port disk generally serializes requests internally as a matter of course, so when m nodes issue requests, the time required is m times as long.
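The m-fold cost described above can be written down as a minimal sketch. The function name and the parameter t_single (the time for one node to read the data alone) are illustrative assumptions and do not appear in the specification; the model also assumes every request takes the same time and is served strictly one after another.

```python
def serialized_read_time(m: int, t_single: float) -> float:
    """Time until all m nodes hold the data when the disk serves
    their identical read requests strictly one after another.

    Assumption: each request costs the same t_single seconds.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    return m * t_single


# Four nodes, 2.0 s per read: the last node finishes after 8.0 s.
total = serialized_read_time(4, 2.0)
```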
[0003]
[Problems to be solved by the invention]
When this disk contention is known to reduce throughput significantly, one conceivable approach is to read the data into the memory of one node and distribute it to the other nodes by inter-node transfer. However, this approach has drawbacks, for example that transfer can be performed only in the units in which the data is arranged in memory.
[0004]
The present invention has been made in view of the above drawbacks. Its object is to provide an arithmetic device and a data transfer system that, in a so-called multiprocessor computer system which distributes the processing to be performed among a plurality of control means and executes it in parallel, reduce the load incurred when processing a single program and prevent a decrease in throughput.
[0005]
[Means for Solving the Problems]
The arithmetic device according to claim 1 comprises: receiving means capable of receiving a program and data transmitted from an external storage device by being connected to a plurality of external storage devices via an external storage device changeover switch; internal storage means for storing the program and data received by the receiving means; inter-node connection means having, by being connected to a plurality of arithmetic devices via an inter-node connection changeover switch, a function of transferring the program and data to other arithmetic devices and a function of receiving a program and data transmitted from other arithmetic devices; and control means capable of issuing a transfer command to the inter-node connection means. It is characterized in that a path through which the program and data can be transmitted is provided between the receiving means and the inter-node connection means.
[0006]
The receiving means in the present invention is a bus that accepts data from outside, that is, a data transmission path connecting the storage device with the parts inside the arithmetic device. A PCI bus is particularly preferable. The arithmetic device provides it as an expansion slot, and it has the function of receiving the program and data transmitted from the storage device. The receiving means also has the function of passing the received program and data to both the internal storage means and the inter-node connection means.
The internal storage means in the present invention temporarily stores, at a predetermined address, the program and data received by the receiving means; examples include a memory and an HDD.
The inter-node connection means in the present invention connects arithmetic devices (nodes) to one another via the inter-node connection changeover switch. When the arithmetic device has received a read command, the inter-node connection means receives, directly from the receiving means, the program and data that the receiving means received from the external storage device, and transfers them to other arithmetic devices. When the arithmetic device has received a reception command, the inter-node connection means receives the program and data sent from the arithmetic device that received the read command and places them at a predetermined location (a memory address or the like) in the internal storage means.
The control means in the present invention has the function of instructing the inter-node connection means to transfer the program and data to other arithmetic devices when the arithmetic device has received a read command. A CPU is normally used for this purpose.
[0007]
The arithmetic device of the present invention is characterized in that a path capable of carrying data is provided between the receiving means and the inter-node connection means. When the receiving means receives a program and data, it not only stores them in the device's own internal storage means but also sends them to the inter-node connection means so that they can be transferred to other arithmetic devices. Storage in the internal storage means and transfer of the data can therefore proceed at the same time. Each of the other arithmetic devices receives the transmitted program and data at its inter-node connection means and places them at the location (a memory address or the like) where they would have been stored had that device read them itself. As a result, the program and data are shared by the arithmetic device that received the read command and all the arithmetic devices that received a reception command.
[0008]
This removes the need for each arithmetic device to read the program and data from the external storage device individually, reduces accesses to the external storage device, and prevents the time spent reading the program and data from lowering system throughput. In addition, because the program and data delivered to the receiving arithmetic devices do not first have to be read into the internal storage means, the transfer finishes in approximately the same time as it takes an arithmetic device to read the program and data from the external storage device. Furthermore, in the conventional approach, where the data was first read into the internal storage means, transfer was possible only in the units in which the data had been arranged in the internal storage means when read. In the present invention, the transfer takes place without reading the program and data into the internal storage means, so the program and data can be transferred without that restriction.
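The timing claim in this paragraph can be illustrated with a rough model. Everything in it is an assumption made for illustration and is not taken from the specification: the function names, the use of a data size and two bandwidths, the assumption that the conventional approach starts the inter-node transfer only after the read has finished, and the neglect of latency and switching overhead.

```python
def store_then_forward_time(size: float, disk_bw: float, link_bw: float) -> float:
    """Conventional approach (assumed model): read everything into the
    memory of one node first, then send it over the inter-node link."""
    return size / disk_bw + size / link_bw


def streamed_time(size: float, disk_bw: float, link_bw: float) -> float:
    """Bypass-path approach (assumed model): data is forwarded while it
    is still being read, so the slower of the two stages sets the pace."""
    return size / min(disk_bw, link_bw)


# 100 units of data, disk at 10 units/s, inter-node link at 50 units/s.
conventional = store_then_forward_time(100, 10, 50)
streamed = streamed_time(100, 10, 50)
```

Under these assumptions, and with the inter-node link faster than the disk as the background section describes, the streamed transfer takes as long as the read alone, which matches the "approximately the same time" statement above.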
[0009]
The arithmetic device according to claim 2 is characterized in that the arithmetic devices are connected by crossbar connection and the control means issues a command to transfer the program and data to a plurality of other arithmetic devices.
[0010]
In the present invention, because the nodes are connected by crossbar connection, when the program and data are transferred from the arithmetic device that received the read command to the arithmetic devices that received a reception command, they can be transferred almost simultaneously even to a plurality of destinations, and they can also be transferred from the arithmetic device that received the read command to all the other arithmetic devices.
Thus, whereas an inter-node connection that supports only 1:1 connection saves only one access to the external storage device, the present invention, with its crossbar inter-node connection, can save up to m-1 accesses when m arithmetic devices are connected, greatly reducing the load on the arithmetic devices.
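The access counts stated here can be tabulated with a small sketch. The function name and the three mode labels are hypothetical, and the crossbar case shows the best case only, in which one reading node forwards to all other nodes.

```python
def storage_accesses(m: int, interconnect: str) -> int:
    """External-storage reads needed so that all m nodes hold the same data.

    interconnect (hypothetical labels):
      "none"       - every node reads for itself
      "one_to_one" - the reading node can forward to a single other node
      "crossbar"   - the reading node can forward to all other nodes
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    if interconnect == "none":
        return m
    if interconnect == "one_to_one":
        return max(m - 1, 1)  # one access saved
    if interconnect == "crossbar":
        return 1              # up to m-1 accesses saved
    raise ValueError(f"unknown interconnect: {interconnect}")


# With m = 4 nodes: 4 reads, 3 reads, and 1 read respectively.
counts = [storage_accesses(4, mode) for mode in ("none", "one_to_one", "crossbar")]
```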
[0011]
The data transfer system according to claim 3 is a data transfer system in which a plurality of arithmetic devices are connected via an inter-node connection changeover switch and each arithmetic device is connected to a plurality of external storage devices via an external storage device changeover switch. Each arithmetic device comprises: receiving means that is connected to the plurality of external storage devices via the external storage device changeover switch and can receive a program and data transmitted from an external storage device; internal storage means for storing the program and data received by the receiving means; inter-node connection means that is connected to the plurality of arithmetic devices via the inter-node connection changeover switch and has a function of transferring the program and data to other arithmetic devices and a function of receiving a program and data transmitted from other arithmetic devices; and control means capable of issuing a transfer command to the inter-node connection means. The system is characterized in that a path through which the program and data can be transmitted is provided between the receiving means and the inter-node connection means.
[0012]
The arithmetic device in the data transfer system of the present invention is the same as the arithmetic device according to claim 1.
The inter-node connection changeover switch in the present invention connects arithmetic devices to one another, specifically the arithmetic device that received the read command and the arithmetic devices that received a reception command. The program and data are transmitted through this inter-node connection changeover switch.
The external storage devices are provided outside the arithmetic devices and hold the program and data used for multi-node processing. A plurality of external storage devices are used because of the amount of information the multi-node system handles. They are connected to the arithmetic devices via the external storage device changeover switch; since in the present invention only one arithmetic device receives the read command, it is sufficient that this arithmetic device can be connected to the external storage device holding the program and data to be read.
As with claim 1, this prevents system throughput from falling, lets the transfer finish in approximately the same time as it takes an arithmetic device to read the program and data from the external storage device, and allows the program and data to be transferred without the restriction that transmission is possible only in the units in which data is arranged in the internal storage means.
[0013]
The data transfer system according to claim 4 is characterized in that the inter-node connection means connects the arithmetic devices by crossbar connection.
[0014]
In the present invention, because the nodes are connected by crossbar connection, when the program and data are transferred from the arithmetic device that received the read command to the arithmetic devices that received a reception command, they can be transferred almost simultaneously even to a plurality of destinations, and they can also be transferred from the arithmetic device that received the read command to all the other arithmetic devices.
Accordingly, as with claim 2, up to m-1 accesses can be saved when m arithmetic devices are connected.
[0019]
[Embodiments of the invention]
The present invention is described below with reference to the drawings.
FIG. 1 is a block diagram showing the configuration of the data transfer system of the present invention.
In FIG. 1, the multi-node data transfer system of the present invention consists of arithmetic devices (nodes) 10, 20, 30, and 40; an inter-node connection changeover switch 50 that switches the destination for data transfer between those nodes; an external storage device changeover switch 60 that assigns a port of an external storage device in response to a request from an arithmetic device; and external storage devices 70, 80, and 90. The arithmetic device 10 of the present invention consists of control means 11, internal storage means 12, inter-node connection means 13, and receiving means 14, and is characterized in that a path 15 is provided between the inter-node connection means 13 and the receiving means 14. The other arithmetic devices have the same configuration.
[0020]
In the present invention, the path 15 need not be as wide in bandwidth as the path connecting the inter-node connection means to the inter-node connection changeover switch; bandwidth equal to the peak performance of the external storage devices 70, 80, and 90 is sufficient. A dedicated control signal line 16 runs between the control means 11 and the inter-node connection means 13, and transfer commands addressed to other nodes are sent to the inter-node connection means 13 over this line. In this way, the control means 11 controls whether the data sent from the receiving means 14 through the path 15 to the inter-node connection means 13 is transferred to other nodes.
[0021]
The operation of the embodiment is described below with reference to the drawings.
When many arithmetic devices 10, 20, 30, and 40 are to read the same data from the external storage device 70, a read command is issued to just one arithmetic device, and an inter-node transfer reception command is issued to the remaining arithmetic devices. The case in which the read command is issued to the arithmetic device 10 is described here.
The arithmetic device 10, to which the read command has been issued, reads the data as usual and at the same time instructs, over the control signal line 16, transfer to the other arithmetic devices. In this transfer, the data received from the external storage device 70 is sent on to the other arithmetic devices as it is, in real time. Each of the other arithmetic devices, having received a reception command, receives the inter-node data transfer from the arithmetic device 10 at its inter-node connection means and places the data at the memory address where it would have been stored had the device read it from the external storage device itself. The result is the same as if each arithmetic device had accessed the external storage device. If the inter-node connection changeover switch 50 supports only 1:1 connection, only one access to the external storage device is saved, but if crossbar connection is possible, up to m-1 accesses can be saved.
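The read/receive sequence described above can be mimicked in a few lines. This is a software sketch under stated assumptions, not the hardware of FIG. 1: the class names, method names, and the block-addressed storage are hypothetical, and the loop forwards each block immediately after it is read rather than truly concurrently.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class ExternalStorage:
    """Hypothetical stand-in for external storage device 70."""
    blocks: Dict[int, bytes]
    reads: int = 0  # counts accesses, to show how many are needed

    def read(self, address: int) -> bytes:
        self.reads += 1
        return self.blocks[address]


@dataclass
class Node:
    """Hypothetical stand-in for one arithmetic device."""
    name: str
    memory: Dict[int, bytes] = field(default_factory=dict)

    def receive(self, address: int, block: bytes) -> None:
        # Reception command: place the block at the address a direct
        # read from external storage would have used.
        self.memory[address] = block

    def read_and_forward(self, storage: ExternalStorage,
                         addresses: Iterable[int],
                         receivers: List["Node"]) -> None:
        # Read command: store each block locally and pass the same
        # block straight on to every receiving node.
        for address in addresses:
            block = storage.read(address)
            self.memory[address] = block
            for node in receivers:
                node.receive(address, block)


storage = ExternalStorage(blocks={0: b"program", 1: b"initial data"})
nodes = [Node(f"node{n}") for n in (10, 20, 30, 40)]
nodes[0].read_and_forward(storage, sorted(storage.blocks), nodes[1:])
```

After the call, all four nodes hold identical memory contents, while the storage object has been read once per block instead of once per block per node.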
[0022]
[Effects of the invention]
The data transfer system of the present invention is characterized in that, in each arithmetic device of a multi-node configuration, a bypass path is provided between the receiving means, which receives data from the external storage device, and the inter-node connection means, which performs inter-node data transfer.
This reduces accesses to the external storage device and prevents read time from lowering system throughput.
In addition, because the data does not first have to be read into the internal storage means, the transfer finishes in approximately the same time as it takes one arithmetic device to read the program and data from the external storage device.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a data transfer system of the present invention.
[Explanation of symbols]
10, 20, 30, 40 Arithmetic devices
11, 21, 31, 41 Control means
12, 22, 32, 42 Internal storage means
13, 23, 33, 43 Inter-node connection means
14, 24, 34, 44 Receiving means
15, 25, 35, 45 Paths
16, 26, 36, 46 Control signal lines
50 Inter-node connection changeover switch
60 External storage device changeover switch
70, 80, 90 External storage devices

Claims

1. An arithmetic device comprising: receiving means capable of receiving a program and data transmitted from an external storage device by being connected to a plurality of external storage devices via an external storage device changeover switch; inter-node connection means that, by being connected to a plurality of arithmetic devices via an inter-node connection changeover switch, has a function of transferring a program and data to the plurality of arithmetic devices and a function of receiving a program and data transmitted from another arithmetic device; and a connection path provided between the receiving means and the inter-node connection means for transmitting the program and data received by the receiving means to the inter-node connection means, wherein the receiving means receives the program and data from the plurality of external storage devices via the external storage device changeover switch and transmits the received program and data to the inter-node connection means via the connection path.

2. The arithmetic device according to claim 1, wherein the connection with the plurality of arithmetic devices is made by crossbar connection, the arithmetic device further comprising control means capable of issuing a transfer command to the inter-node connection means, wherein the control means instructs the inter-node connection means to transfer the program and data, which the receiving means received from an external storage device and transmitted to the inter-node connection means via the connection path, to any one or more of the plurality of arithmetic devices via the inter-node connection changeover switch.

3. A data transfer system in which a plurality of arithmetic devices are connected via an inter-node connection changeover switch and each of the plurality of arithmetic devices is connected to a plurality of external storage devices via an external storage device changeover switch, wherein each of the plurality of arithmetic devices comprises: receiving means capable of receiving a program and data transmitted from the external storage devices by being connected to the plurality of external storage devices via the external storage device changeover switch; inter-node connection means that, by being connected to the plurality of arithmetic devices via the inter-node connection changeover switch, has a function of transferring a program and data to the plurality of arithmetic devices and a function of receiving a program and data transmitted from another arithmetic device; and a connection path provided between the receiving means and the inter-node connection means for transmitting the program and data received by the receiving means to the inter-node connection means, and wherein the receiving means receives the program and data from the plurality of external storage devices via the external storage device changeover switch and transmits the received program and data to the inter-node connection means via the connection path.

4. The data transfer system according to claim 3, wherein the plurality of arithmetic devices are connected to one another by crossbar connection, each of the plurality of arithmetic devices further comprises control means capable of issuing a transfer command to the inter-node connection means, and the control means instructs the inter-node connection means to transfer the program and data, which the receiving means received from the plurality of external storage devices and transmitted to the inter-node connection means via the connection path, to any one or more of the plurality of arithmetic devices via the inter-node connection changeover switch.