JPS63316254A - Parallel processor

Info

Publication number: JPS63316254A
Application number: JP62151381A
Authority: JP
Inventors: Takashi Kimura (木村 隆); Tomoo Fukazawa (深沢 友雄)
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1987-06-19
Filing date: 1987-06-19
Publication date: 1988-12-23
Anticipated expiration: 2014-10-25
Also published as: JP2967928B2

Abstract

PURPOSE: To increase the efficiency with which a transfer channel is used by providing, in a one-dimensional transfer channel, a plurality of switch circuits each having means for electrically disconnecting or connecting the channel. CONSTITUTION: One data transfer channel shared by a plurality of processor elements PE1-PE3 is divided into transfer channels P1-P3 of arbitrary number and arbitrary length by switch circuits SW1-SW4 inserted at distributed points along it. Switch circuits SW1-SW3 consist of circuits for the rightward and leftward transfer channels. A transfer channel switching circuit SC electrically connects or disconnects the input-side transfer channel DRi and the output-side transfer channel D'Ri according to a switching control signal S from a transfer route determination circuit PD. Transfer routes are thus formed in response to transfer requests generated asynchronously by the processor elements PE1-PE3, regardless of how near or far the transfer distance is.

Description

[Detailed Description of the Invention] [Field of Industrial Application] The present invention relates to a parallel processor that requires little hardware, is compact, uses its transfer paths efficiently, and exchanges data efficiently between arbitrary arithmetic processors without contention for the transfer paths.

[Conventional technology]

Among devices for mutual data transfer between a plurality of processors, the bus structure shown in FIG. 12, in which all processors share a common transfer path, requires the least hardware and allows the parallel processor to be made compact. However, in a parallel processor using this bus, only two processors can use the transfer path at a time, and the many other processors must suspend processing and wait until the transfer path becomes free.

Consequently, as the number of processors increases, the total effective processing time, combining computation and transfer, grows substantially and processing speed falls. In FIG. 12, PE1 to PE6 are processor elements, BS is a bus, and AB is a bus arbiter.

[Problem that the invention seeks to solve]

On the other hand, a parallel processor can be equipped with a dedicated transfer device, such as the crossbar switch CS shown in FIG. 13 or a multistage switch, in order to eliminate data contention on the transfer path. In such a processor every processor can establish a transfer path to one other processor at the same time, which makes transfers more efficient and faster. The drawback of such high-speed transfer devices is that signal lines from every processor converge independently on the transfer device, so the number of connection cables becomes enormous and, as the number of processors grows, reaches an impractical scale.

The present invention was made in view of these points. Its object is to provide a parallel processor that realizes many transfer routes at the same time, raises the utilization of each transfer path, improves the effective processing speed of the parallel processor, and combines a compact transfer device with high speed.

[Means for solving problems]

To achieve this object, the present invention provides a parallel processor comprising a plurality of processor elements and a transfer device that forms a transfer route between any two of the processor elements to transfer data and exchange information. The parallel processor has a one-dimensional transfer path and, within that path, a plurality of switch circuits each having means for electrically disconnecting or connecting the one-dimensional transfer path. A processor element is connected to each section of the transfer path separated by the switch circuits.

[Effect]

The parallel processor according to the present invention is based on a bus structure, which needs little hardware, yet switch circuits inserted at distributed points allow a single signal transfer path to be divided into any number of transfer paths of any length. In addition, the transfer path itself can manage which sections are free and control their division and use by a simple method, transfer routes can be changed dynamically as requests arise from the processors, and many two-processor transfer routes can be secured at the same time.
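As an illustration of the segmentation idea, the following minimal sketch checks whether several point-to-point routes can be held at once on a single one-directional line. It assumes, as a simplification not spelled out in the text, that a route starting at segment `start` and spanning `sections` switch-separated sections occupies segments `start` through `start + sections`, and that two routes can coexist only if they share no segment. The function name and the tuple representation are hypothetical.

```python
def can_coexist(routes):
    """Return True if no two routes occupy the same bus segment.

    Each route is (start, sections): it joins segments start..start+sections
    by closing the `sections` switches between them (an assumed model,
    not the patent's circuit).
    """
    occupied = set()
    for start, sections in routes:
        if sections < 1:
            raise ValueError("a route must span at least one section")
        segments = set(range(start, start + sections + 1))
        if occupied & segments:
            return False  # some segment would be driven by two routes
        occupied |= segments
    return True


# Three short routes share one line without contention ...
print(can_coexist([(0, 1), (2, 3), (6, 1)]))
# ... whereas routes 0->3 and 2->4 both need segments 2 and 3.
print(can_coexist([(0, 3), (2, 2)]))
```

A shared bus corresponds to the degenerate case in which every route occupies the whole line, so only one route can ever be held at a time.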

[Example]

The first feature of the present invention, as shown in FIG. 2, is that although a single data transfer line is shared by a plurality of processor elements PE1 to PE3, switch circuits SW1 to SW4 are inserted at distributed points along that line, and the processor elements PE1, PE2, and PE3 are connected to the transfer paths P1, P2, and P3 between the switch circuits, each of which serves as a bus.

The second feature of the present invention is that a single transfer path can be divided arbitrarily and each part used as an independent transfer path, and that each processor element has means for carrying out the transfer reservation procedure automatically, based on a schedule of transfer path use, so that the transfer device is free of contention.

The third feature of the present invention is that it has control means capable of setting up a transfer route dynamically at the moment when three things are known: which routes on the transfer path are free and usable, that a processor element has issued a use request, and the transfer distance to the processor with which data must be exchanged.

This feature is achieved efficiently and automatically by the exchange of simple signals. The transfer path consists of transfer lines with independent rightward and leftward directions. For each transfer direction, each switch circuit arranged along the one dimension determines the starting point of a transfer route from the request signal arriving from the preceding stage and the request signal from the processor element it manages. As a result, transfer routes are formed in response to transfer requests that the processor elements generate completely asynchronously, independently of how near or far the processor elements involved are from each other.

The fourth feature of the present invention is that the transfer path consists of a plurality of signal lines, and each switch circuit includes a circuit that connects the signal lines of the preceding transfer path to those of the following transfer path independently, based on information about which signal lines of the following transfer path are free, so that a transfer can change signal lines at a switch circuit.

FIG. 1 shows the configuration of a parallel processor as an embodiment of the invention described in claims 1 and 2. FIG. 3 shows the switch circuit, FIG. 4 shows the processing flow of the switch circuit, FIG. 5 shows how the operating states of the switch circuits of the parallel processor change, and FIG. 6 shows the transfer path interface of a processor element.

In FIG. 1, SW1 to SW3 are switch circuits inserted in the transfer path, and PE1 to PE3 are processor elements connected to the transfer paths between the switch circuits. DR and DL are the data transfer paths for the left-to-right and right-to-left directions, respectively. QDR and ADR carry the transfer route reservation information QD (Query Distance) and the corresponding transfer route confirmation information AD (Acquired Distance) for the left-to-right direction, and QDL and ADL carry the same information for the right-to-left direction. LReqR and LReqL are the transfer request signals (Local ReqR, L) from a processor element for the DR and DL transfer paths, respectively.

In FIG. 1, each of the switch circuits SW1 to SW3 consists of a circuit for the rightward transfer path and a circuit for the leftward transfer path.

FIG. 3 shows the part of switch circuit SWj (j = 1, 2, ...) that handles the rightward transfer path. It consists of a transfer path switching circuit SC, a transfer request processing circuit RP, and a transfer route determination circuit PD. The circuit for the leftward transfer path is the same. Here the transfer path switching circuit SC is built from buffers BUF that take three states, including high impedance. Based on the switching control signal S from the transfer route determination circuit PD, it electrically connects or disconnects the input-side transfer paths DRi (i = 0, 1, ..., m) and the output-side transfer paths D'Ri (i = 0, 1, ..., m). In what follows, the right and left subscripts R and L are omitted.

Next, the operation of the circuit of FIG. 3 is described.

1.1) The transfer request processing circuit RP receives RReqin from the preceding switch and LReqin from the attached processor element.

1.2) RReqin is stored in flip-flop FF1 as the remote path state (RPS), and LReqin is stored in flip-flop FF2 as the local path state (LPS). When a request arrives from the preceding switch or from the attached processor element, RPS or LPS, respectively, becomes 1.

1.3) The control logic PCON (priority controller) monitors RPS and LPS. RPS has the higher priority. When RPS = 1, the origin flag ORG is set to 0 regardless of LPS, the request from the attached processor element is ignored, and RReqin is passed through to RReqout.

When RPS = 0, LPS is examined. When LPS = 1, there is a request from the attached processor element, so ORG is set to 1 (steps 1 and 2 in FIG. 4).

When LPS = 0, there is no request, so ORG is set to 0 (steps 1 and 3 in FIG. 4) and RReqout = 0 is output.
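The priority rule of steps 1.1 to 1.3 can be sketched as a small truth-table function. This is an illustrative model only: the function name is hypothetical, signals are modeled as the integers 0 and 1, and the value of RReqout when the switch itself is the origin (ORG = 1) is an assumption, since the text states it explicitly only for the other two cases.

```python
def priority_control(rps, lps):
    """Return (org, rreq_out) for one switch, one transfer direction.

    rps: 1 if the preceding switch is requesting (remote path state).
    lps: 1 if the attached processor element is requesting (local path state).
    """
    if rps == 1:
        # Remote request has priority: the local request is ignored and
        # the incoming request is passed on to the following switch.
        return 0, 1
    if lps == 1:
        # This switch is the origin of a new route. Forwarding the request
        # (rreq_out = 1) is assumed here; the text only gives ORG = 1.
        return 1, 1
    # No request at all.
    return 0, 0


for rps in (0, 1):
    for lps in (0, 1):
        print(rps, lps, priority_control(rps, lps))
```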

Next, the operation of the circuit of FIG. 3 is described with reference to FIG. 4.

FIG. 4 shows the basic processing of the switch circuit. Depending on whether a transfer path use request has arrived from the processor element or the preceding switch, and on whether the transfer route reservation information and the transfer route confirmation information at its inputs match, the switch outputs the transfer route reservation information QDout and the transfer route confirmation information ADout to the adjacent switch or processor element.

2.1) The transfer route determination circuit PD receives QDin from the preceding switch and ADin from the following switch. QDout is output to the following switch, and ADout and ACKR (the match signal) are output to the preceding switch.

2.2) The control logic SD (status detector) monitors ORG and RReqout. When RPS and LPS are both 0 (the transition from step 4 to step 5), it adds 1 to the ADin received from the following stage and outputs ADout = ADin + 1 to the preceding switch circuit (step 5). Since the switch has no request of its own, it adds its own section to the number of free sections downstream and reports the total upstream. INC in FIG. 3 is the incrementer that adds 1.

2.3) When either RPS or LPS is 1, processing moves to step 6 (steps 1 to 4). If ORG = 0, 1 is subtracted from the QDin received from the preceding stage (step 7), and QDout = QDin - 1 is output to the following stage (steps 6 to 10). QDout is the number of sections requested; because the switch assigns itself as one of those sections, it subtracts 1 before passing the value on. DEC is the decrementer that subtracts 1. If ORG = 1, the QDin requested by the attached processor element is output to the following stage (steps 9 and 10).

2.4) ADin and QDout are also compared. While they do not match and a transfer route request is present, ADout = 0 and ACK = 0 are output to the preceding stage regardless of ORG (steps 12 and 13). When they match, the switch is connected (turned on), and ADout = ADin + 1 and ACK = 1 are output to the preceding stage (step 14).

2.5) When either LPS or RPS is 1 and QDout becomes 0, the switch judges that it is at the end of the requested transfer route. It then disconnects (turns off) the switch and outputs ADout = 0 and ACK = 1 to the preceding stage and QDout = 0 and RReq = 0 to the following stage (step 15).
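Steps 2.2 to 2.5 can be collected into a single decision function for one switch in one direction. The sketch below is an interpretation rather than the circuit itself. Step 2.4 is worded as a comparison of ADin with QDout, whereas the FIG. 5 walk-through has a switch turn on when QDout equals ADin + 1; the code follows the walk-through. The saturation of ADout at FF is taken from the release example (FF + 1 = FF). The function name and argument names are hypothetical.

```python
AD_MAX = 0xFF  # "all following sections free" value used in the FIG. 5 example


def route_decision(rps, lps, qd_in, local_qd, ad_in):
    """Return (qd_out, ad_out, ack, switch_on) for one rightward switch.

    qd_in:    sections still requested by the preceding switch.
    local_qd: sections requested by the attached processor element.
    ad_in:    confirmation value reported by the following switch.
    """
    if rps == 0 and lps == 0:
        # Step 5: an idle switch adds its own section to the free count.
        return 0, min(ad_in + 1, AD_MAX), 0, False
    org = 1 if rps == 0 else 0  # a remote request has priority
    qd_out = local_qd if org else qd_in - 1
    if qd_out == 0:
        # Step 15: this switch is the end of the requested route.
        return 0, 0, 1, False
    if ad_in + 1 == qd_out:
        # Step 14 (as read from FIG. 5): the rest of the route is secured.
        return qd_out, ad_in + 1, 1, True
    # Steps 12 and 13: the route is not secured yet.
    return qd_out, 0, 0, False


# Origin switch asking for 3 sections while everything downstream is idle.
print(route_decision(rps=0, lps=1, qd_in=0, local_qd=3, ad_in=AD_MAX))
# A middle switch once the switch after it has reported ADout = 0.
print(route_decision(rps=1, lps=0, qd_in=2, local_qd=0, ad_in=0))
```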

3.1) FIG. 5 shows how the QDout and ADout outputs of eight switch circuits change over time when they run the basic processing of FIG. 4. Initially, ADout = FF (the maximum value) at every switch SWi, which means that all downstream sections are available. QDout = 0 because no transfer path has been requested.

3.2) Suppose that at clock 1 an origin arises at switch SW1 and a route of 3 transfer path sections is requested. As an example, a transfer is requested from processor element PE0, connected to switch SW1, to processor element PE3, connected to switch SW4. In this case switch SW1 outputs QDout = 3 and ADout = 0.

Here ADout = 0 indicates that a transfer route cannot be formed from switch SW1 to the right. The transfer route reservation procedure begins.

3.3) At clock 2, switch SW2 receives the request from switch SW1 and outputs QDout = 2 and ADout = 0 (steps 9, 10, 12, 13).

At clock 3, switch SW3 outputs QDout = 1 and ADout = 0 (steps 9, 10, 12, 13).

At clock 4, switch SW4 receives RReq = 1 and its QDout becomes 0, so it outputs ADout = 0 and ACK = 1. ACK = 1 is passed to the preceding stage (steps 8, 15).

3.4) At clock 5, ACK = 1 arrives. Switch SW3 adds 1 to the ADout = 0 of switch SW4, giving QDout = ADout = 1, turns its switch on, and sets ACK = 1 (steps 6, 7, 9, 10, 12, 14).

At clock 6, switch SW2 outputs QDout = ADout = 2, turns its switch on, and outputs ACK = 1 to the preceding stage (steps 6, 7, 9, 10, 12, 14).

At clock 7, switch SW1 outputs QDout = ADout = 3, turns its switch on, and outputs ACK = 1 to the preceding stage (steps 6, 7, 9, 11, 12, 14).

At this point the transfer route from switch SW1 to SW4, and thus from processor element PE1 to PE4, is secured.
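The clock-by-clock reservation above can be tabulated with a short sketch. It assumes, as in the example, that exactly one switch changes its outputs per clock and that no other request interferes; the function name and the tuple layout are hypothetical. For a route of p sections the request travels forward for p clocks, the end switch acknowledges on the next clock, and the acknowledgement travels back for p clocks, so setup takes 2p + 1 clocks.

```python
def reservation_timeline(sections, first_clock=1):
    """List (clock, switch, qd_out, ad_out, ack, switch_on) events."""
    events = []
    clock = first_clock
    # Forward pass: each switch passes on the remaining section count.
    for k in range(sections):
        events.append((clock, k + 1, sections - k, 0, 0, False))
        clock += 1
    # The switch where QDout reaches 0 ends the route and acknowledges.
    events.append((clock, sections + 1, 0, 0, 1, False))
    clock += 1
    # Backward pass: acknowledged switches turn on, nearest the end first.
    for k in range(sections - 1, -1, -1):
        events.append((clock, k + 1, sections - k, sections - k, 1, True))
        clock += 1
    return events


for event in reservation_timeline(3):
    print(event)
```

With `sections=3` the last event is switch 1 turning on at clock 7 with QDout = ADout = 3, as in the walk-through.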

4.1) By holding LReq at 1, processor element PE1 keeps the QDout and ADout values and the switch-on state of switches SW1 to SW4 unchanged (steps 6, 16), and data is transferred to PE4 during this time.

5.1) When the data transfer ends, processor element PE1 sets LReq to 0. That is, at clock 101 switch SW1 has LPS = 0, ORG = 0, and RReqout = 0 (steps 4, 5). The control logic SD of FIG. 3 detects this, immediately turns the switch off, and sets QDout = 0 (steps 4, 5). Release of the transfer reservation begins.

5.2) RReq then propagates through switches SW1 to SW4. At clock 102, switch SW2 receives RReqout = 0 from SW1, sets QDout = 0 and ADout = 2, and turns its switch off. At clock 103, switch SW3 sets QDout = 0 and ADout = 1 and turns its switch off. At clock 104, switch SW4 sets QDout = 0 and ADout = FF (steps 4, 5).

5.3) If ORG = 0 at switch SW3, then at clock 105, because RReqin = 0, ADout is set to the number of transfer path sections available through the following switches. That is, ADout = FF + 1 = FF is output, FF being the maximum number of usable sections (steps 4, 5).

At clock 106, switch SW3 adds 1 to the ADout of SW4 and outputs the result as its ADout.

At clock 107, switch SW2 adds 1 to the ADout of SW3 and outputs the result as its ADout.

At clock 108, switch SW1 outputs the ADout that has been incremented in turn, and release of the transfer route is complete (steps 4, 5).

FIG. 6 shows the transfer path interface circuit of a processor element. The processor element is connected to the transfer path DR through tri-state I/O buffer B1, to the transfer path check signal QD through tri-state input buffer B2, and to the transfer path confirmation signal AD through output buffer B3, which is also tri-state. The transfer request signal LReq and the ACK signal are connected directly to the switch circuit. The serial-parallel shift register SPR in FIG. 6 is used when the bit length must be converted. In the figure, IDB is the internal data bus, CDB is the control data bus, and 20 is the interface control circuit.

Next, the concrete effects of this embodiment are described.

Taking as an example the parallel processor of FIG. 9, which consists of nine switch circuits and eight arithmetic processors (processor elements) PE0 to PE7, the embodiment is compared with a conventional bus configuration and a crossbar switch configuration.

In the parallel execution of Monte Carlo simulation, which is considered especially hard to speed up in applications such as device simulation, all processors send data to all other processors at almost the same time. The effect of this embodiment is explained for this problem. When each of eight processors sends one data item to the other seven at almost the same time, a bus-structured parallel processor needs 28 transfers, as shown in FIG. 7. A parallel processor with a crossbar switch, the most ideal transfer device, needs only 4 transfers, as shown in FIG. 8. The parallel processor of this embodiment, with one pair of rightward and leftward transfer paths, needs 20 transfers, as shown in FIG. 9, and a configuration with a second pair of transfer paths needs 10. The dotted lines in FIG. 9 mark transfers that are scheduled in another round but could be moved into the round shown.
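The bus figure of 28 is consistent with counting one bus transaction per unordered pair of processors, that is n(n - 1)/2 for n = 8. The sketch below checks only that arithmetic. The pairing interpretation is an assumption, the function name is hypothetical, and the counts of 4, 20, and 10 depend on the schedules drawn in FIGS. 8 and 9, which are not reproduced here.

```python
from math import comb


def bus_transfers(n):
    """Bus transactions needed if each unordered processor pair takes one."""
    return comb(n, 2)


print(bus_transfers(8))  # 28, the bus count quoted for eight processors
```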

The table gives the general relations for the number of transfers and the number of cables in this embodiment for n processor elements and 2″ transfer paths. In the table, 2″ is the number of processor elements and b is the bit width of one transfer path. FIGS. 10 and 11 show the number of transfers and the number of cables as the number of processor elements is varied, with the number of transfer paths as a parameter. The limiting number of transfer paths is 1/4 of the number a crossbar switch requires, and the transfer speed at that point is nearly comparable to that of a crossbar switch. As FIGS. 10 and 11 show, a major advantage of this embodiment is that the number of transfer paths can be chosen freely according to the scale of the device and the required transfer speed.

As the table shows, several combinations of transfer count and cable count can be selected. In FIGS. 10 and 11, S1 and S3 are the characteristic lines of the crossbar switch, S2 and S4 are the characteristic lines of an ordinary bus, and 40 and 50 indicate the range of choices available in this embodiment.

[Effect of the invention]

As described above, by placing switch circuits at distributed points along a single transfer path, the present invention can realize several transfer routes of arbitrary section length on that one path at the same time and so increase the utilization of the transfer path. A processor element connected to the transfer path reads the section-length information that the switch circuit outputs as free-path information, sends the required section length together with a transfer request signal, and, once the route is secured, receives an ACK and can use the route. With this simple control, each processor element can dynamically change the route it needs between two processors in the middle of processing, which achieves high speed as well as efficient use.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing an embodiment of a parallel processor according to the present invention. FIG. 2 is a block diagram of a parallel processor used to explain the present invention in detail. FIG. 3 is a block diagram showing the switch circuit. FIG. 4 is a flowchart showing the processing flow of the switch circuit. FIG. 5 is an explanatory diagram showing how the QDout and ADout outputs of eight switch circuits change over time under the basic processing of FIG. 4. FIG. 6 is a block diagram showing the transfer path interface circuit of a processor element. FIG. 7 is an explanatory diagram of the number of transfers in a bus-based parallel processor. FIG. 8 is an explanatory diagram of the number of transfers in a parallel processor with a crossbar switch having bidirectional transfer paths. FIG. 9 is an explanatory diagram of the number of transfers in a parallel processor according to the present invention. FIGS. 10 and 11 are graphs showing the number of transfer paths against the requirements for device scale and transfer speed. FIG. 12 is a block diagram showing a conventional bus-based parallel processor. FIG. 13 is a block diagram showing a conventional crossbar-switch-based parallel processor.

PE1 to PE3: processor elements; SW1 to SW4: switch circuits; DR, DL: signal lines; P1 to P3: transfer paths; SC: transfer path switching circuit; PD: transfer route determination circuit; RP: transfer request processing circuit; BUF: buffer circuit; SD, PCON: control logic means; INC: incrementer; DEC: decrementer; FF1, FF2: flip-flops.

Claims

[Claims]

(1) A parallel processor comprising a plurality of processor elements and a transfer device that forms a transfer route between any two of the plurality of processor elements to transfer data and exchange information, characterized by comprising a one-dimensional transfer path and a plurality of switch circuits in the transfer path, each having means for electrically disconnecting or connecting the one-dimensional transfer path, wherein a processor element is connected to each transfer path separated by the switch circuits.

(2) The parallel processor according to claim 1, characterized in that the transfer path divided by the plurality of switch circuits comprises two sets of unidirectional transfer paths, one carrying data from left to right and one from right to left, and each switch circuit has means for arbitrarily electrically connecting or disconnecting the rightward or leftward transfer path entering the switch circuit from the left or right and the transfer path leaving it to the right or left.

(3) The processor element PEi connected to the transfer path Pi sandwiched between the i-th and i+1-th switch circuits is connected to the p section of the rightward or leftward transfer path up to the i+p or i-pth switch circuit. The switch circuit SWi has means for outputting transfer route reservation information for the transfer route in the right direction, and the switch circuit SWi outputs the transfer route reservation information for the transfer route in the right direction.
A request signal indicating whether or not a transfer path request is generated from the transfer path Pi is generated by inputting the transfer path use request signal of transfer path 1 and the transfer path request signal of transfer path Pi+1 from the arithmetic processor PEi connected to the transfer path Pi. It has a means for generating an original confirmation signal, outputs a transfer path use request signal of the transfer path Pi+1 to the i+1th switch circuit SWi+1 in the next stage, and transfers from the i+1th switch circuit to the leftward transfer path. Transfer path use request signal of path Pi-1 and transfer path Pi
A transfer request processing circuit inputs a transfer path request signal from the processor element PEi connected to the processor element PEi from the right and outputs a transfer path use request signal for the transfer path Pi-1; inputting transfer route confirmation information regarding the transfer route at that point from the switch circuit of the next stage, and using this to determine whether the transfer route requested from the previous stage has been secured to the next switch circuit;
a transfer route determination circuit that outputs a transfer route use permission signal (rightward) to a processor element connected to the previous transport route, and similarly outputs a transfer route use permission signal (leftward) for the leftward direction; Based on the transfer path request generation confirmation signal (right direction and left direction) generated in the request processing circuit and the transfer path usage permission signal (right direction and left direction) generated in the transfer path determination circuit, the previous stage request transfer path and the subsequent stage are determined. 2. The parallel processor according to claim 1, further comprising a transfer path switching circuit that electrically connects or disconnects the free transfer path.