JP2975722B2 - Computer data communication system

Info

Publication number: JP2975722B2
Application number: JP3161781A
Authority: JP
Inventors: 三浦宏喜
Original assignee: Sanyo Denki Co Ltd
Current assignee: Sanyo Denki Co Ltd
Priority date: 1991-07-02
Filing date: 1991-07-02
Publication date: 1999-11-10
Anticipated expiration: 2014-11-10
Also published as: JPH0512231A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Industrial Field of Application] The present invention relates to a data communication system for a computer, and more particularly to a data communication system for a parallel computer in which data is communicated among a plurality of processors.

[0002]

[Prior Art] In recent years, advances in semiconductor technology have made it easy to arrange large numbers of identical hardware units, and research has been active on parallel computers in which a plurality of element processors are interconnected to perform arithmetic processing in parallel.

[0003] For example, as disclosed in paper 2T-2 of the 38th (first half of 1989) proceedings of the Information Processing Society of Japan, the inventor of the present application has been developing a large-scale data-driven computer (Enhanced Data Driven Engine) in which up to 1024 one-chip element processor LSIs are connected. All communication between element processors in such a large-scale data-driven computer is bidirectional, so that data can be communicated between any two processors over the shortest distance.

[0004] Furthermore, for the communication scheme of this kind of large-scale data-driven computer, it has been proposed to connect an input/output interface, used for transmitting to the outside of the processor array formed by the processor interconnection network, to a specific processor, and to have outbound data hold the number (row number, column number) of the processor to which this input/output interface is connected, so that the outbound data can be sent toward the processor equipped with the output interface.

[0005] However, as described above, the number of the processor nearest the input/output interface must be fully specified as the destination processor number of data bound for the outside of the processor interconnection network. Because both the row number and the column number are specified as the destination processor number, if, for example, the destination processor number field held by the data is 10 bits, a system of at most 2¹⁰ = 1024 processors can be built.

[0006] In such a data-driven computer, the bit length of the processor number must be increased in order to increase the number of processors further. Increasing this bit length, however, increases the data word length of the communication data and, as a result, increases the hardware of the entire computer system, which has been a drawback.

[0007]

[Problem to Be Solved by the Invention] The present invention has been made in view of the above points. Its object is to provide a data communication system for a computer in which data bound for the outside of the processor interconnection network in a parallel processing computer system can be output to the outside merely by having it hold part of the destination processor number leading to the output interface, thereby making it possible to greatly increase the number of processors in the system without increasing the data length of the destination processor number held by the communication data.

[0008]

[Means for Solving the Problem] In the data communication system for a computer according to the present invention, a plurality of processors each having at least four communication ports (east, west, north, and south) are arranged in a matrix to form a processor array in which each processor is identified by a processor number consisting of a row number identifying its row and a column number identifying its column. The system includes east-west communication lines that connect the processors in each row of the array through their east and west communication ports, and north-south communication lines that connect the processors in each column of the array through their north and south communication ports, so that data can be communicated between any processors in the processor array over these communication lines. Data communicated over the communication lines holds a destination processor number and an external flag indicating whether the data is to be output to the outside of the processor array. When the external flag contained in communication data that has arrived at, or been generated by, a processor indicates that the data is not bound for the outside of the processor array, the processor outputs the data by selectively using one of the communication ports so as to send it toward the processor with the destination row number and column number contained in the data. When the external flag indicates that the data is bound for the outside of the processor array, the processor outputs the data by selectively using one of the communication ports so as to send it toward the processor with the destination row number contained in the data and a specific column number determined in advance independently of the data. The processor array is treated as one processor group, and a plurality of processor groups are interconnected to enable data communication among all processors. Each processor group has an input/output interface coupled to at least one of the processors in the specific column of the processor array. The destination column number field contained in data to be output to the outside of a processor group is used as a field storing the group number of the destination processor group, and data is communicated among the plurality of processor groups through the input/output interfaces according to the group number stored in that field.

[0009] Furthermore, in the data communication system of the present invention, the processor array described above is treated as one processor group, and a plurality of processor groups are interconnected for data communication between processors. Each processor group has an input/output interface coupled to at least one of the processors in the specific column of the processor array mentioned above. The destination column number field contained in data to be output to the outside of a processor group is used as a field storing the group number of the destination processor group, and data is communicated among the plurality of processor groups through the input/output interfaces according to the group number stored in that group number field.

[0010]

[Operation] According to the data communication system of the present invention, communication control is directed toward a predetermined specific column regardless of the value of the column number field in the destination processor number held by data bound for the outside of the processor array. Therefore, communication of outbound data is controlled correctly as long as only the row number of the destination processor is specified in that data. In other words, the destination column number field of outbound data can be used freely for another purpose.

[0011] Furthermore, according to the data communication system of the present invention, the column number field of the destination processor number held by data bound for the outside of one processor group, which consists of a processor array, is used as the destination processor group number. The number of processors making up the system can thus be greatly increased without adding a new field for specifying the group number.

[0012]

[Embodiment] FIG. 1 shows a parallel data-driven computer system as an embodiment of the present invention, and FIG. 2 shows the configuration of an element processor, the basic component of the parallel data-driven computer system.

[0013] First, in FIG. 1, PE denotes an element processor and NIF denotes a network interface. In this embodiment, the element processors (PEs) are connected in a torus. That is, a plurality of element processors (PEs), each having four communication ports (east, west, north, and south), are arranged in a matrix, and the element processors in each row (east-west direction) and in each column (north-south direction) are each connected cyclically in a ring by a plurality of communication links. A network interface (NIF) is inserted in each row-direction ring. A host interface is connected to the network interfaces (NIFs), and a host computer is in turn connected to the host interface. With this configuration, a program and initial data are first loaded from the host computer through the host interface and the network interfaces (NIFs). Each element processor then performs operations in parallel according to this program while exchanging data with the other processors.

[0014] The element processor (PE) of FIG. 2 basically has a configuration in which a program storage (PS), a firing control and color management unit (FCCM), an instruction execution unit (EXE), and a queue memory (Q) are connected in a circular pipeline (ring) structure. The program storage (PS) updates node numbers, attaches constants, and copies results. The firing control and color management unit (FCCM) performs firing control and manages the acquisition and release of colors using a two-stage waiting storage scheme. The instruction execution unit (EXE) executes instructions such as floating-point operations, integer operations, condition tests, branches, and simple constant generation, as well as compound instructions combining them.

[0015] The queue memory (Q) is a buffer memory that absorbs all fluctuations in data flow on the ring. Buffering becomes necessary when (1) copying, (2) forced input to the ring, (3) output delay from the ring, (4) a search of the waiting list in the FCCM, or the like occurs. This element processor (PE) has an added function that dynamically changes the operation modes of (1) to (3) according to the amount of data held in the queue (Q), and thereby controls the degree of parallelism. If the queue (Q) unavoidably overflows, an external queue is formed in the external data memory (EDM) to absorb the overflow so that program execution can continue.

[0016] The network control unit (NC) has four communication ports (1), (2), (3), and (4) for east, west, north, and south, and performs routing control based on a torus interconnection network of up to 1024 processors (PEs). It also has an input port (5) for feeding data packets that have arrived addressed to the processor itself into the circular pipeline, and an output port (6) for outputting data packets from the circular pipeline to the network control unit.

[0017] The vector operation control unit (VC) controls the execution of vector-operation instructions and ordinary memory access instructions. A vector input bus (7) for inputting vector data is provided between the vector operation control unit (VC) and the input control unit (IC). A vector output bus (8) for outputting vector data is likewise provided between the vector operation control unit (VC) and the input control unit (IC).

[0018] FIG. 4 shows the structure of the communication data packets used for inter-processor communication in the data-driven computer configured as described above. Communication data packets are broadly divided into scalar packets, shown in FIG. 4(a), and structure packets, shown in FIG. 4(b).

[0019] In FIG. 4, PE-X and PE-Y are the column number and row number of the destination PE, respectively. CTL is control information such as a color or a code indicating a load/dump destination. OT is an external flag that identifies packets bound for the outside of the processor interconnection network. The NODE number is a node number that identifies an instruction in the data flow graph. DATA is the data value: a scalar packet holds a single 32-bit data item split into upper and lower parts, and a structure packet holds multiple (n) words of 32-bit data. The upper 2 bits of each word distinguish the header word, the tail word, and the other words of a packet. The most significant flag also serves as a transfer control bit that signals the presence of data by inverting its value on every word.
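As an illustration of the fields named above, the following Python sketch models a scalar packet. The class name, the method name, and the 16/16 split of the 32-bit data value are assumptions made for illustration; the text does not give field widths or a bit layout.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ScalarPacket:
    """Hypothetical model of the scalar packet fields named for FIG. 4(a)."""
    pe_x: int   # destination column number (PE-X)
    pe_y: int   # destination row number (PE-Y)
    ctl: int    # control information (color, load/dump code, ...)
    ot: int     # external flag: 1 = bound for outside the processor network
    node: int   # node number identifying an instruction in the data flow graph
    data: int   # single 32-bit data value

    def data_halves(self) -> Tuple[int, int]:
        """Split the 32-bit value into (upper, lower); a 16/16 split is assumed."""
        return (self.data >> 16) & 0xFFFF, self.data & 0xFFFF


packet = ScalarPacket(pe_x=3, pe_y=2, ctl=0, ot=0, node=7, data=0x12345678)
upper, lower = packet.data_halves()
print(hex(upper), hex(lower))
```

A structure packet could be modelled the same way with a list of n 32-bit words in place of the single data value.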

[0020] The input control unit (IC) on the pipeline ring has a processor number register for storing the processor's own number. FIG. 5 shows the structure of the processor number register. PE-X is the processor number in the east-west direction (column number), and PE-Y is the processor number in the north-south direction (row number). Together they form the processor number that uniquely identifies each processor.

[0021] The flag bit called PEACT shown in FIG. 5 indicates whether the processor number has already been set. The PEACT flag becomes "0" in response to the system initialization signal (hardware reset).

[0022] The communication control unit (NC) receives packets such as those shown in FIG. 4(a) and (b) through the communication ports.

[0023] When PEACT = 0, the communication control unit regards every data packet input from any of the east, west, north, and south ports as addressed to itself and feeds it into the pipeline ring for processing. When, with PEACT = 1, a packet that loads data into the processor number register arrives, the communication control unit passes it to the input control unit (IC), where the specified processor number is loaded into the processor number register and the PEACT flag is set to "1".

[0024] When PEACT = 1, the communication control unit determines, according to a predetermined algorithm, whether to output an arriving packet to one of the communication ports or to feed it into the pipeline ring.
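The flag handling described in the preceding paragraphs can be sketched as a small state holder. This is only an illustrative model: the class and method names are hypothetical, and clearing the number fields on reset is an assumption (the text only says that PEACT becomes "0").

```python
class ProcessorNumberRegister:
    """Hypothetical model of the register of FIG. 5 (PE-X, PE-Y, PEACT)."""

    def __init__(self) -> None:
        self.reset()

    def reset(self) -> None:
        # A hardware reset clears PEACT. Clearing the number fields as well
        # is an assumption made here for tidiness.
        self.pe_x = 0
        self.pe_y = 0
        self.peact = 0

    def load(self, pe_x: int, pe_y: int) -> None:
        # Loading a processor number also sets PEACT to 1.
        self.pe_x = pe_x
        self.pe_y = pe_y
        self.peact = 1

    def accepts_everything(self) -> bool:
        # With PEACT = 0, every arriving packet is taken into the pipeline ring.
        return self.peact == 0


reg = ProcessorNumberRegister()
print(reg.accepts_everything())   # no number set yet
reg.load(pe_x=3, pe_y=2)
print(reg.peact, reg.accepts_everything())
```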

[0025] When a packet with OT = 1 arrives, the communication control unit controls communication so that the packet is transferred to the outside of the processor interconnection network through a network interface (NIF).

[0026] Although not shown, each element processor of this embodiment receives from outside a mode signal TAN that selects whether data communication between processors is performed in bidirectional mode or in unidirectional mode.

[0027] The main feature of the present invention lies in how each processor controls communication when a packet with OT = 1 arrives, as mentioned above. That is, if OT = 1, communication control is performed as though all bits of the destination processor's column number (PE-X) held by the packet were "1", whatever value that field actually holds.
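The rule in this paragraph amounts to a one-line substitution before routing. The sketch below shows it in Python; the function name and the 5-bit width of the PE-X field are assumptions, since the text does not state the field width.

```python
COL_BITS = 5  # assumed width of the PE-X field; not stated in the text


def effective_column(pe_x: int, ot: int, col_bits: int = COL_BITS) -> int:
    """Return the column number that routing should use for a packet."""
    all_ones = (1 << col_bits) - 1
    # For outbound packets (OT = 1) the stored PE-X value is ignored.
    return all_ones if ot == 1 else pe_x


print(effective_column(pe_x=3, ot=0))
print(effective_column(pe_x=3, ot=1))
```

Because the stored value is ignored when OT = 1, the field remains free to carry other information on outbound packets.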

[0028] To explain this, the operation of the communication control unit is described in more detail. FIG. 3 schematically shows the configuration of the communication control unit. In the figure, (RWI) and (RWO) are self-synchronous input shift registers that form the west input/output port; similarly, (REI) and (REO) form the east input/output port, (RNI) and (RNO) the north input/output port, and (RSI) and (RSO) the south input/output port. The symbol ○ denotes a merging circuit and ◎ denotes a branch circuit.

[0029] The routing algorithm in the communication control unit is described with reference to FIG. 3.

[0030] M1 to M5 are packet merging circuits, each of which merges packets according to the priorities indicated by the numerals in the figure (1 is the highest priority). R1 to R5 are packet branch circuits, which operate according to the following algorithm.

1. Let the processor's own number (row number, column number) = (x, y), the packet's destination processor number = (X, Y), and the network array size = p × q (p rows by q columns), and define
   Δx ≡ (X − x) mod q, |Δx| ≤ q/2
   Δy ≡ (Y − y) mod p, |Δy| ≤ p/2
   (mod denotes the modulo operation).
2. Processor numbers are assigned as y = 0, 1, 2, ..., p in order from N to S, and as x = 0, 1, 2, ..., q in order from W to E.
3. OT denotes the value of the OT bit of the packet; OT = 1 indicates a packet bound for the outside of the processor interconnection network.
4. When the packet has OT = 1, Δx is calculated by regarding the column number X of the packet's destination processor number in item 1 above as having all bits set to "1".
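The offsets Δx and Δy defined in item 1 are signed distances on a ring. A minimal Python sketch of one way to compute them follows; the function name is hypothetical, and resolving the tie at exactly half the ring size toward the positive direction is an assumption, since the text only requires |Δ| ≤ size/2.

```python
def torus_delta(dest: int, own: int, size: int) -> int:
    """Signed offset from own to dest on a ring of `size` nodes, |result| <= size/2."""
    d = (dest - own) % size      # Python's % yields 0 <= d < size for size > 0
    if d > size // 2:            # the other way round the ring is shorter
        d -= size
    return d


print(torus_delta(dest=1, own=6, size=8))   # wraps forward past the end
print(torus_delta(dest=6, own=1, size=8))   # shorter going backward
```

With a 5-bit column field (an assumption) and q = 4, an all-ones X of 31 gives the same Δx as column 3, because 31 ≡ 3 (mod 4).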

[0031] The processing in the branch circuits is described below.

1. When PEACT = 0:
   Packets arriving from any of W, E, N, and S are input to P.
2. When PEACT = 1:

(1) R3:
   If Δx = 0 and Δy > 0, output to S.
   [0032] If Δx = 0 and Δy < 0 and TAN = 1, output to S.
   [0033] If Δx = 0 and Δy < 0 and TAN = 0, output to N.
   [0034] If Δx = 0 and Δy = 0 and OT = 0, input to P.
   [0035] Otherwise, output to E.
(2) R2:
   If Δx = 0 and Δy > 0, output to S.
   [0036] If Δx = 0 and Δy < 0 and TAN = 1, output to S.
   [0037] If Δx = 0 and Δy < 0 and TAN = 0, output to N.
   [0038] If Δx = 0 and Δy = 0 and OT = 0, input to P.
   [0039] Otherwise, output to W.
(3) R1:
   If Δy = 0 and OT = 0, input to P.
   [0040] If Δy = 0 and OT = 1, output to E.
   [0041] Otherwise, output to S.
(4) R4:
   If Δy = 0 and OT = 0, input to P.
   [0042] If Δy = 0 and OT = 1, output to E.
   [0043] Otherwise, output to N.
(5) R5:
   If Δx = 0 and Δy > 0, output to S.
   [0044] If Δx = 0 and Δy < 0 and TAN = 1, output to S.
   [0045] If Δx = 0 and Δy < 0 and TAN = 0, output to N.
   [0046] If Δx < 0 and TAN = 0, output to W.
   [0047] Otherwise, output to E.
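The branch rules above can be written as a single decision function. The Python sketch below transcribes them in the order listed; the function name and the string labels are hypothetical, and Δx is assumed to have been computed already (with the all-ones substitution applied when OT = 1).

```python
def route(branch: str, dx: int, dy: int, ot: int, tan: int, peact: int = 1) -> str:
    """Return 'N', 'S', 'E' or 'W' (output port) or 'P' (input to the pipeline)."""
    if branch not in ("R1", "R2", "R3", "R4", "R5"):
        raise ValueError(f"unknown branch circuit: {branch}")
    if peact == 0:
        return "P"                      # every arriving packet goes to P
    if branch in ("R1", "R4"):
        if dy == 0:
            return "P" if ot == 0 else "E"
        return "S" if branch == "R1" else "N"
    # R2, R3 and R5 share their first three rules.
    if dx == 0 and dy > 0:
        return "S"
    if dx == 0 and dy < 0:
        return "S" if tan == 1 else "N"
    if branch == "R5":
        return "W" if dx < 0 and tan == 0 else "E"
    if dx == 0 and dy == 0 and ot == 0:
        return "P"
    return "E" if branch == "R3" else "W"


print(route("R3", dx=0, dy=0, ot=0, tan=0))   # internal packet at its destination
print(route("R3", dx=0, dy=0, ot=1, tan=0))   # outbound packet keeps moving east
```

Judging from the rules, TAN = 1 appears to restrict travel to the S and E directions, which would correspond to the unidirectional mode, although the text does not state the encoding of TAN explicitly.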

[0048] The above describes the routing algorithm in detail, but the invention is not limited to it.

[0049] As the above description shows, when PEACT = 1, a packet communicated between processors is transferred along the east-west (E⇔W) communication lines until the column number matches, moves onto the north-south (S⇔N) communication lines at that point, and is then transferred along the north-south lines until the row number matches, whereupon it reaches the target processor.

[0050] Moreover, a packet bound for the outside of the processor interconnection network (OT = 1) is routed as though its destination were the processor in the easternmost (E) column, whatever its destination column number may be. When such an outbound packet has been transferred to the easternmost processor, it is output from that processor's east (E) port and leaves the network through the network interface.
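To make the path of an outbound packet concrete, the following Python sketch traces one packet hop by hop on a small torus using the branch rules listed earlier. Several things here are assumptions rather than facts from the text: the 5-bit PE-X width, the mapping of each input direction to a branch circuit (R3 for packets moving east, R2 west, R1 south, R4 north, R5 for locally generated packets), and the simple modulo wrap-around, which does not model where the NIF sits in each row ring.

```python
COL_BITS = 5  # assumed width of the PE-X field


def delta(dest, own, size):
    """Signed shortest offset on a ring of `size` nodes."""
    d = (dest - own) % size
    return d - size if d > size // 2 else d


def route(branch, dx, dy, ot, tan):
    """Branch rules for R1..R5 with PEACT = 1, as listed in the text."""
    if branch in ("R1", "R4"):
        if dy == 0:
            return "P" if ot == 0 else "E"
        return "S" if branch == "R1" else "N"
    if dx == 0 and dy > 0:
        return "S"
    if dx == 0 and dy < 0:
        return "S" if tan == 1 else "N"
    if branch == "R5":
        return "W" if dx < 0 and tan == 0 else "E"
    if dx == 0 and dy == 0 and ot == 0:
        return "P"
    return "E" if branch == "R3" else "W"


# Assumed wiring: output direction -> (x step, y step, branch circuit that
# handles the packet at the neighbouring processor).
NEXT = {"E": (1, 0, "R3"), "W": (-1, 0, "R2"),
        "S": (0, 1, "R1"), "N": (0, -1, "R4")}


def trace(x, y, pe_x, pe_y, ot, p, q, tan=0):
    """Follow one packet generated at column x, row y; return its hops."""
    if ot == 1:
        pe_x = (1 << COL_BITS) - 1          # column field treated as all ones
    branch, hops = "R5", []
    for _ in range(4 * (p + q)):            # guard against endless loops
        out = route(branch, delta(pe_x, x, q), delta(pe_y, y, p), ot, tan)
        hops.append((x, y, out))
        if out == "P" or (out == "E" and ot == 1 and x == q - 1):
            return hops                     # delivered, or handed to the NIF
        step_x, step_y, branch = NEXT[out]
        x, y = (x + step_x) % q, (y + step_y) % p
    raise RuntimeError("packet did not terminate")


for hop in trace(x=1, y=0, pe_x=7, pe_y=2, ot=1, p=4, q=4):
    print(hop)
```

Under these assumptions, a packet generated at column 1, row 0 with PE-Y = 2 and OT = 1 moves east to column 3, then south to row 2, and leaves through the east port there; the value 7 stored in PE-X plays no part in the route.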

[0051] It follows that, with the data communication system of the present invention, the destination column number field of a packet bound for the outside of the processor interconnection network can be used as a field for storing other information.

[0052] Accordingly, in the embodiment shown in FIGS. 6 and 7, this column number field is used to hold the destination processor group number.

[0053] FIG. 6 shows a configuration example of one processor group (PG), in which a group interface (61) for exchanging data with other processor groups is coupled to the processor interconnection network (60) described so far. GI is the group input line and GO is the group output line.

[0054] FIG. 7 shows a configuration example of a highly parallel computer system in which a plurality of the processor groups shown in FIG. 6 are connected. In the figure, PG0, PG1, PG2, PG3, ..., PG30 are processor groups, and the host computer is identified as the 32nd processor group, PG32.

[0055] CMU is a data communication unit, which controls data communication between the processor groups.

[0056] As described above, the column number field (PE-X) of the destination processor number of a data packet communicated among the host computer and the processor groups stores the destination processor group number, that is, one of "0", "1", "2", "3", ..., "31". Each processor group outputs outbound packets by the mechanism described with reference to FIGS. 1 to 5. The data communication unit CMU outputs each input packet to the destination processor group according to the packet's destination processor group number. In this way, the destination column number field of a packet in each processor group can be used as the processor group number, and a high-performance parallel processing computer system of more than 30,000 processors can be built using a destination processor number field of at most 10 bits.
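The capacity figure in this paragraph can be checked with a short calculation. The Python sketch below assumes that the 10-bit processor number splits into a 5-bit row field and a 5-bit column field, which fits the group numbers 0 to 31 mentioned above but is not stated in the text; `cmu_forward` is a hypothetical stand-in for the CMU's lookup of the destination group.

```python
ROW_BITS = 5  # assumed split of the 10-bit processor number field
COL_BITS = 5  # assumed; matches the group numbers 0..31 in the text

pes_per_group = 2 ** (ROW_BITS + COL_BITS)   # processors addressable inside one group
group_numbers = 2 ** COL_BITS                # values that fit in the reused PE-X field
host_groups = 1                              # one group number identifies the host
max_processors = (group_numbers - host_groups) * pes_per_group


def cmu_forward(pe_x_field: int, ot: int) -> int:
    """Hypothetical CMU step: read the destination group from the PE-X field."""
    if ot != 1:
        raise ValueError("only outbound packets (OT = 1) reach the CMU")
    return pe_x_field & (group_numbers - 1)


print(pes_per_group, group_numbers, max_processors)
print(cmu_forward(pe_x_field=7, ot=1))
```

Under these assumptions the total is 31 × 1024 = 31,744 processors, which is consistent with the figure of more than 30,000 given in the text.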

[0057]

[Effect of the Invention] As is clear from the above description, a data communication system for a computer can be provided in which data bound for the outside of the processor interconnection network can be output to the outside merely by having it hold part of the destination processor number. Consequently, the number of processors making up a parallel processing computer system can be greatly increased without increasing the data length of the destination processor number held by the communication data.

[Brief description of the drawings]

FIG. 1 is a configuration diagram showing the parallel processing computer of the present invention.

FIG. 2 is a configuration diagram showing an element processor, the basic component of the parallel processing computer of the present invention.

FIG. 3 is a schematic diagram showing the main part of the element processor of the present invention.

FIG. 4 is a diagram showing the structure of a data packet.

FIG. 5 is a diagram showing the structure of the processor number register inside the element processor of the present invention.

FIG. 6 is a configuration diagram of one processor group in the parallel processing computer of the present invention.

FIG. 7 is a configuration diagram showing the highly parallel computer system of the present invention.

[Explanation of symbols]

(PE): element processor, (NIF): network interface, (NC): communication control unit, (PS): program storage unit, (EXE): instruction execution unit, (PG): processor group, (CMU): data communication unit.

Claims

(57) [Claims]

A data communication system for a computer, in which a plurality of processors each having at least four communication ports, namely east, west, north, and south, are arranged in a matrix to form a processor array in which each processor is identified by a processor number consisting of a row number identifying a row of the matrix and a column number identifying a column, the system comprising east-west communication lines that connect the processors in each row of the array using their east and west communication ports, and north-south communication lines that connect the processors in each column of the array using their north and south communication ports, so that data can be communicated between any processors in the processor array over these communication lines, wherein:
data communicated over the communication lines holds a destination processor number and an external flag indicating whether the data is to be output to the outside of the processor array;
when the external flag contained in communication data that has arrived at, or been generated by, a processor indicates that the communication data is not bound for the outside of the processor array, the data is output by selectively using one of the communication ports so as to send it toward the processor with the destination row number and column number contained in the communication data;
when the external flag contained in communication data that has arrived at, or been generated by, a processor indicates that the communication data is bound for the outside of the processor array, the data is output by selectively using one of the communication ports so as to send it toward the processor with the destination row number contained in the data and a specific column number determined in advance independently of the data;
the processor array is treated as one processor group, and a plurality of processor groups are interconnected to enable data communication among all processors;
each processor group has an input/output interface coupled to at least one of the processors in the specific column of the processor array; and
the destination column number field contained in data to be output to the outside of each processor group is used as a field storing the group number of the destination processor group, and data is communicated among the plurality of processor groups through the input/output interfaces according to the group number stored in that field.