JPS6195477A

JPS6195477A - Vector processing device

Info

Publication number: JPS6195477A
Application number: JP21613084A
Authority: JP
Inventors: ▲吉▼田　八穂子; Yahoko Yoshida
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 1984-10-17
Filing date: 1984-10-17
Publication date: 1986-05-14
Anticipated expiration: 2009-03-30
Also published as: JPH0623977B2

Abstract

PURPOSE: To speed up processing by raising the utilization of the vector arithmetic units and data transfer circuits, which is achieved by dividing a single vector instruction among a plurality of vector arithmetic processing units. CONSTITUTION: Each of vector arithmetic processing units 4 to 7 includes a plurality of vector registers 9, one or more vector arithmetic units 10, data transfer circuits 11 to 13 that transfer data to and from main memory device 1 through storage control unit 2, connection path selection circuits 24 and 25 that form the data paths between vector registers 9 and vector arithmetic unit 10 or data transfer circuits 11 to 13, and an instruction execution control section 15 that is connected to these elements and controls the operation of the whole vector arithmetic processing unit. When scalar arithmetic processing unit 3, in the course of processing a task, has to perform vector processing, that processing is carried out by vector arithmetic processing units 4 to 7 through vector arithmetic control unit 8.

Description

[Detailed description of the invention]

[Field of application of the invention]

The present invention relates to a vector processing device.

[Background of the invention]

To increase processing speed, conventional vector processing devices have a plurality of vector arithmetic units and a plurality of data transfer circuits, the latter managing data transfer between the main memory device and the vector registers.

However, in the vector instruction sequences that make up actual vector processing, only a small number of vector instructions can be executed simultaneously. These multiple vector arithmetic units and data transfer circuits therefore cannot all be used at the same time, the utilization of the vector arithmetic units is low, and processing speed suffers.

The object of the present invention is to provide a vector processing device that raises the utilization of the vector arithmetic units and data transfer circuits and thereby speeds up processing.

[Summary of the invention]

In the present invention, a vector processing device having a plurality of vector registers, a plurality of vector arithmetic units, and a plurality of data transfer circuits is provided with a plurality of vector arithmetic processing units, each containing a plurality of vector registers, at least one vector arithmetic unit, and at least one data transfer circuit. When one vector instruction is executed, the number of vector elements to be processed is specified for each vector arithmetic processing unit, so that, taken together, the units process exactly the number of elements the instruction calls for.

[Embodiments of the invention]

An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is an overall configuration diagram showing an embodiment of the vector processing device of the present invention. In the figure, 1 is a main memory device, 2 is a storage control unit, 3 is a scalar arithmetic processing unit, 4 to 7 are vector arithmetic processing units, and 8 is a vector arithmetic control unit that controls the operation of vector arithmetic processing units 4 to 7. Scalar arithmetic processing unit 3 has the ordinary arithmetic processing functions well known in this field. Each of vector arithmetic processing units 4 to 7 includes a plurality of vector registers 9; one or more vector arithmetic units 10; data transfer circuits 11 to 13, which transfer data to and from main memory device 1 via storage control unit 2; connection path selection circuits 24 and 25, which form the data paths between vector registers 9 and vector arithmetic unit 10 or data transfer circuits 11 to 13; and an instruction execution control section 15, which is connected to these elements and controls the operation of the whole vector arithmetic processing unit.

The figure shows only vector arithmetic processing unit 4 in detail, but the other vector arithmetic processing units 5 to 7 have the same configuration.

Data transfer circuits 11 and 12 are for fetching, and data transfer circuit 13 is for storing. In the figure, connection path selection circuits 24 and 25 are independent for each vector arithmetic processing unit, but they may instead interconnect all of the vector arithmetic processing units.

In the system of FIG. 1, when scalar arithmetic processing unit 3 is processing a task and vector processing becomes necessary partway through, that processing is handed to vector arithmetic processing units 4 to 7 through vector arithmetic control unit 8.

Consider the case where the following program is executed by scalar arithmetic processing unit 3.

      DO 10 I = 1, L
   10 A(I) = B(I) + C(I)

In machine language, this is expanded into one LIN (Load Increment) instruction, three LMA (Load Multiple Address) instructions, and one EXVP (Execute Vector Processor) instruction, as listed below, each of which is executed by scalar arithmetic processing unit 3.

LIN INR0, INR2, INR4: Commands that constants be set in increment registers INR0, INR2, and INR4, respectively. (As described later, the increment registers are provided in vector arithmetic control unit 8.)

LMA VAR0: Commands that the start address of array A be set in vector address register VAR0. (As described later, the vector address registers are provided in vector arithmetic control unit 8.)

LMA VAR2: Commands that the start address of array B be set in vector address register VAR2.

LMA VAR4: Commands that the start address of array C be set in vector address register VAR4.

EXVP X: Commands that the number of vector elements to be processed is L, and that a vector instruction sequence be read out starting at address X of main memory device 1 and sent to vector arithmetic control unit 8.

The LIN and LMA instructions above set the address control data for arrays A, B, and C in the vector address registers and increment registers in vector arithmetic control unit 8.

The EXVP instruction causes the vector instruction sequence to be read out. This sequence consists of two LVR (Load Vector Register) instructions, one VEA (Vector Elementwise Add) instruction, and one STVR (Store Vector Register) instruction, as follows.

LVR VR2, VAR2, INR2: Commands that addresses in main memory device 1 be generated from the start address of array B and the constant set in vector address register VAR2 and increment register INR2, respectively, that the data of array B be read from those addresses, and that the data be set in vector register VR2. The constant is used as the address increment; the same applies below.

LVR VR4, VAR4, INR4: Commands that addresses in main memory device 1 be generated from the start address of array C and the constant set in vector address register VAR4 and increment register INR4, respectively, that the data of array C be read from those addresses, and that the data be set in vector register VR4.

VEA VR6, VR2, VR4: Commands that arrays B and C be read from vector registers VR2 and VR4, respectively, that the two be added, and that the result be set in vector register VR6.

STVR VR6, VAR0, INR0: Commands that data be read from vector register VR6 and written to the addresses in main memory device 1 generated from the start address of array A and the constant set in vector address register VAR0 and increment register INR0, respectively.
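The effect of this four-instruction sequence can be sketched in Python. This is only an illustrative model, not part of the patent: main memory device 1 is assumed to be a flat list, the vector address registers and increment registers are assumed to be dictionaries keyed by register number, and the name `run_vector_sequence` is hypothetical.

```python
def run_vector_sequence(memory, var, inr, length):
    """Model LVR, LVR, VEA, STVR on a flat memory (illustrative only)."""
    vr = {}
    # LVR into VR2: read `length` elements of B, stepping by the increment.
    vr[2] = [memory[var[2] + i * inr[2]] for i in range(length)]
    # LVR into VR4: read `length` elements of C in the same way.
    vr[4] = [memory[var[4] + i * inr[4]] for i in range(length)]
    # VEA: elementwise add of VR2 and VR4, result placed in VR6.
    vr[6] = [b + c for b, c in zip(vr[2], vr[4])]
    # STVR from VR6: write the result back to the addresses of A.
    for i, value in enumerate(vr[6]):
        memory[var[0] + i * inr[0]] = value
    return memory


# A occupies addresses 0..2, B addresses 3..5, C addresses 6..8.
memory = [0, 0, 0, 1, 2, 3, 10, 20, 30]
run_vector_sequence(memory,
                    var={0: 0, 2: 3, 4: 6},
                    inr={0: 1, 2: 1, 4: 1},
                    length=3)
```

After the call, the first three entries of `memory` hold the elementwise sums of B and C.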

These vector instructions are each sent to vector instruction buffer 16 in vector arithmetic control unit 8. The vector element count L is sent to vector length register 17 in vector arithmetic control unit 8.

FIG. 2 is a block diagram showing details of vector arithmetic control unit 8 and of the data transfer circuits in the vector arithmetic processing units. The operation of vector arithmetic control unit 8 is described below with reference to FIGS. 1 and 2.

From vector instruction buffer 16, into which the vector instructions have been loaded from scalar arithmetic processing unit 3 as described above, instruction execution determination circuit 18 takes one vector instruction from the head position and determines whether it can be executed.

Display circuit 19 is provided in common for vector arithmetic processing units 4 to 7. It has one indicator for each of the vector registers 9, vector arithmetic units 10, and data transfer circuits 11 to 13 in those units, and each indicator shows whether the corresponding resource is in use.

For example, there is not a separate indicator for vector register VR1 in each of vector arithmetic processing units 4 to 7; only one such indicator is provided. The same applies to the other resources.

By referring to these indicators, instruction execution determination circuit 18 checks whether the vector registers 9 designated by the fetched vector instruction, a vector arithmetic unit 10 capable of performing the operation designated by that instruction, and so on are free. When it finds that everything required is free, it determines that the vector instruction can be executed. In that case it sets the indicators for the vector registers 9, vector arithmetic unit 10, and data transfer circuits 11 to 13 used by the instruction to show that they are in use, sends the vector instruction to instruction register 20, and sends start signal 22 to start control circuit 21.
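The check-and-reserve behaviour of the indicators can be sketched as follows. This is a minimal Python model under stated assumptions: resources are identified by made-up string names such as `"VR2"` or `"transfer11"`, and the class name `DisplayCircuit` and its methods are hypothetical labels for illustration, not terms defined by the patent.

```python
class DisplayCircuit:
    """One busy indicator per resource, shared by all four units."""

    def __init__(self):
        self.busy = set()

    def try_issue(self, resources):
        """Reserve every resource and return True only if all are free."""
        needed = set(resources)
        if self.busy & needed:
            return False  # at least one indicator shows "in use"
        self.busy |= needed  # set the indicators to "in use"
        return True

    def release(self, resource):
        """Reset one indicator to "free"."""
        self.busy.discard(resource)


display = DisplayCircuit()
issued_lvr = display.try_issue({"VR2", "transfer11"})
# The add needs VR2, which the load above still holds, so it must wait.
issued_vea = display.try_issue({"VR2", "VR4", "VR6", "alu10"})
```

In this model an instruction is issued only when every resource it names is free, which mirrors the all-or-nothing determination described in the text.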

In FIG. 2, the box for vector instruction buffer 16 shows the format of one vector instruction sent from scalar arithmetic processing unit 3, and the box for instruction register 20 shows the format of one vector instruction sent from instruction execution determination circuit 18.

In these formats, OP is the operation code indicating the type of operation, VRN1 to VRN3 are vector register designation fields that designate vector registers, VARN is a vector address register designation field that designates a vector address register, and INRN is an increment register designation field that designates an increment register.

Some vector instructions, such as the VEA instruction above, do not use a vector address register or the like; in that case the corresponding field is absent.

For convenience, the description below treats VRN1 to VRN3 as all present unless otherwise noted.

In instruction register 20, the fields OP, VRN1 to VRN3, VARN, and INRN are those sent from vector instruction buffer 16 and output unchanged by instruction execution determination circuit 18. ALN and TRN are both newly added by instruction execution determination circuit 18. They are the arithmetic unit designation field and the data transfer circuit designation field, and they designate the vector arithmetic unit and data transfer circuit whose indicators that circuit has newly set to in use.

The vector processing device described here divides one vector instruction among the four vector arithmetic processing units according to vector element number. Letting i be the vector element number, the elements are shared out as follows.

Vector element number        Vector arithmetic processing unit
i = 0, 4, 8, ...             4
i = 1, 5, 9, ...             5
i = 2, 6, 10, ...            6
i = 3, 7, 11, ...            7
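The assignment above is a round-robin split on the element number. A short Python sketch, assuming zero-based element numbers as in the text and using the hypothetical helper names `unit_for_element` and `elements_for_unit`:

```python
def unit_for_element(i, first_unit=4, n_units=4):
    """Return the reference numeral (4 to 7) of the unit handling element i."""
    return first_unit + (i % n_units)


def elements_for_unit(unit, length, first_unit=4, n_units=4):
    """List the element numbers below `length` that a given unit handles."""
    return list(range(unit - first_unit, length, n_units))


# Unit 6 takes every fourth element starting at element 2.
unit6_elements = elements_for_unit(6, length=11)
```

For a vector of 11 elements, `unit6_elements` lists elements 2, 6, and 10, matching the third row of the table.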

The fields of instruction register 20 other than VARN and INRN are sent to the instruction execution control section 15 in each of vector arithmetic processing units 4 to 7. When an instruction execution control section 15 receives unit start signal 23 from start control circuit 21, it causes its vector arithmetic processing unit to perform the vector processing operation, based on the information received from instruction register 20.

When the vector instruction to be executed uses a vector register 9 and a data transfer circuit, as the LVR and STVR instructions do, each instruction execution control section 15 sends one of VRN1 to VRN3 and TRN to connection path selection circuit 24 or 25. (An LVR or STVR instruction uses only one vector register, which is assumed here to be designated by VRN1.) Connection path selection circuit 24 or 25 then selects and activates the connection path between the vector register 9 designated by VRN1 and the data transfer circuit designated by TRN. Each instruction execution control section 15 also causes the contents of one of the plurality of vector address registers 27 and one of the plurality of increment registers 28 to be read out, based on VARN and INRN in instruction register 20.

In vector arithmetic processing unit 4, the data transfer circuit designated by TRN (described below as data transfer circuit 11) sends the contents read from vector address register 27 to storage control unit 2 as the access address, via selector 29 and register 30.

Meanwhile, the contents read from increment register 28 are input to adder circuit 33 via quadrupling circuit 31 and register 32, and added to the contents of register 30. The result is set in register 30 via selector 29.

This new value is sent to storage control unit 2 as the access address in the same way as before. The same operation is repeated thereafter.

In data transfer circuit 11 of vector arithmetic processing unit 5, adder circuit 34 computes the sum of the contents read from vector address register 27 and the contents read from increment register 28, and this sum is sent to storage control unit 2 as the access address via selector 29 and register 30.

Meanwhile, the contents read from increment register 28 are input to adder circuit 33 via quadrupling circuit 31 and register 32, and added to the contents of register 30.

The result is set in register 30 via selector 29. This new value is sent to storage control unit 2 as the access address in the same way as before. The same operation is repeated thereafter.

Data transfer circuit 11 of vector arithmetic processing unit 6 differs from the above only in that adder circuit 35, which corresponds to adder circuit 34, receives the contents read from increment register 28 through doubling circuit 36.

Data transfer circuit 11 of vector arithmetic processing unit 7 differs from the above only in that adder circuit 37, which corresponds to adder circuit 34, receives the contents read from increment register 28 through tripling circuit 38.

The figure shows only the one address calculation circuit 26 provided for data transfer circuit 11, made up of adder circuits 34, 35, and 37, doubling circuit 36, tripling circuit 38, and quadrupling circuit 31, but the same circuit is assumed to be provided for the other data transfer circuits 12 and 13 as well.
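The address generation described for the four units can be summarized in a short Python sketch. It is an illustrative model only: the function name `access_addresses` and the zero-based `unit_index` (0 for unit 4 through 3 for unit 7) are assumptions introduced here, and the adders and multiplier circuits are reduced to plain arithmetic.

```python
def access_addresses(base, inc, unit_index, count, n_units=4):
    """Access addresses issued by one unit's data transfer circuit.

    base       -- contents of the selected vector address register
    inc        -- contents of the selected increment register
    unit_index -- 0 for unit 4, 1 for unit 5, 2 for unit 6, 3 for unit 7
    count      -- number of elements this unit processes
    """
    # First address: base, base + inc, base + 2*inc, or base + 3*inc,
    # standing in for the adder and doubling/tripling circuits.
    address = base + unit_index * inc
    # Every later address advances by four increments, standing in for
    # the quadrupling circuit feeding the adder.
    stride = n_units * inc
    addresses = []
    for _ in range(count):
        addresses.append(address)
        address += stride
    return addresses


unit4 = access_addresses(base=1000, inc=8, unit_index=0, count=3)
unit7 = access_addresses(base=1000, inc=8, unit_index=3, count=2)
```

Taken together, the four units touch exactly the addresses `base + i * inc` that a single unit stepping by one increment would have generated.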

Accordingly, when the data transfer circuit designated by TRN is 12 or 13, instruction execution control section 15 sends a signal to the corresponding address calculation circuit and operates it.

The access address sent to storage control unit 2 from data transfer circuit 11 in each of vector arithmetic processing units 4 to 7 is supplied to main memory device 1. If the data transfer circuit designated by TRN is fetch circuit 11 or 12, the data read from main memory device 1 is sent to data transfer circuit 11 or 12 over signal line 39 and then loaded, through connection path selection circuit 24, into the vector register 9 designated by VRN1. If the data transfer circuit designated by TRN is store circuit 13, the data read from the vector register 9 designated by VRN1 is sent to data transfer circuit 13 through connection path selection circuit 25 and then written to main memory device 1 over signal line 40 and through storage control unit 2.

When the vector instruction to be executed uses vector registers 9 and a vector arithmetic unit 10, as the VEA instruction does, each instruction execution control section 15 sends VRN1 to VRN3 and ALN to connection path selection circuits 24 and 25. Connection path selection circuits 24 and 25 then select and activate the connection paths between the three vector registers 9 designated by VRN1 to VRN3 and the one vector arithmetic unit 10 designated by ALN. After this, data is read from the two selected source vector registers 9, the selected vector arithmetic unit 10 performs the operation, and the result is written to the one selected destination vector register 9.

In this way, one vector instruction is divided among the four vector arithmetic processing units 4 to 7 and processed.

That is, of the L elements, vector arithmetic processing units 4 to 7 handle the elements for which i MOD 4 = 0, i MOD 4 = 1, i MOD 4 = 2, and i MOD 4 = 3, respectively.
, iM(JDa=2.1M0Da=s).

Connection path selection circuits 24 and 25 can each activate a plurality of connection paths at the same time. As long as the designated vector registers 9, vector arithmetic unit 10, and data transfer circuits 11 to 13 are free, instruction execution control section 15 can therefore start executing the vector instructions supplied from instruction register 20 one after another and execute a plurality of vector instructions simultaneously.

Next, control of the number of vector elements processed by each of vector arithmetic processing units 4 to 7 is described with reference to FIG. 3.

The instruction execution control section 15 in each of vector arithmetic processing units 4 to 7 has one counter for each of the vector registers, vector arithmetic units, and data transfer circuits in that unit.

When the vector instruction to be executed is an LVR or STVR instruction, which uses a vector register and a data transfer circuit, the counters concerned are those for the vector register and data transfer circuit designated by VRN1 and TRN. When it is a VEA instruction, which uses vector registers and a vector arithmetic unit, they are those for the vector registers and vector arithmetic unit designated by VRN1 to VRN3 and ALN. These counters operate as follows.

FIG. 3 shows one such counter as counter 41; the other counters are the same.

The data representing the vector element count L, supplied from scalar arithmetic processing unit 3 and set in vector length register 17, is set in counter 41 through correction circuit 42 unchanged, except for its lower two bits. If the lower two bits of vector length register 17 are '00', correction circuit 42 produces no output. If they are '01', OR gate 43 in correction circuit 42 issues an output to counter 41 of vector arithmetic processing unit 4.

If they are '10', OR gate 43 and output line 44 in correction circuit 42 issue outputs to the counters 41 of vector arithmetic processing units 4 and 5.

If they are '11', OR gate 43, output line 44, and AND gate 45 in correction circuit 42 issue outputs to the counters 41 of vector arithmetic processing units 4, 5, and 6. Counter 41 in each of vector arithmetic processing units 4 to 7 adds 1 to its value when it receives an output from correction circuit 42.
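The initial counter values that result from this correction can be sketched in Python. The gate-level wiring below (the OR of the two low bits to unit 4, the upper of the two bits to unit 5, their AND to unit 6) is an assumption inferred from the three cases described above, and the function names are hypothetical.

```python
def correction_outputs(length):
    """Correction pulses for units 4, 5, 6, 7 from the lower two bits of L."""
    bit0 = length & 1
    bit1 = (length >> 1) & 1
    or_gate = bit0 | bit1    # active for '01', '10', '11' -> unit 4
    output_line = bit1       # active for '10', '11'       -> unit 5
    and_gate = bit0 & bit1   # active for '11' only        -> unit 6
    return [or_gate, output_line, and_gate, 0]  # unit 7 never gets a pulse


def initial_counts(length):
    """Initial value of counter 41 in units 4, 5, 6, 7 for element count L."""
    base = length >> 2  # L with its lower two bits removed
    return [base + pulse for pulse in correction_outputs(length)]


counts_for_10 = initial_counts(10)  # lower two bits are '10'
```

For every L the four counts add up to L, so the units together process exactly the requested number of elements.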

When a vector instruction is executed in each of vector arithmetic processing units 4 to 7, the value of counter 41, set as described above, is decremented by 1 each time one vector element is processed. When it reaches 0, an output is issued on signal line 46.

The output of each signal line 46 is sent to counter 48 through priority circuit 47 in vector arithmetic control unit 8. Priority circuit 47, counter 48, and final value register 49 are provided in common for vector arithmetic processing units 4 to 7, with one set for each of the vector registers 9, vector arithmetic units 10, and data transfer circuits 11 to 13 in those units.

Final value register 49 is set to 4 when start signal 22 is output. When outputs do not appear on more than one signal line 46 at the same time, priority circuit 47 passes each output to counter 48 as is; when outputs appear simultaneously, it staggers them by one clock period each before passing them to counter 48. Counter 48 counts the outputs from priority circuit 47, and when its count equals the value in final value register 49, comparison circuit 50 issues an output. Based on the output from comparison circuit 50, display circuit 19 resets the indicator of the corresponding vector register, vector arithmetic unit, or data transfer circuit so that it shows free.
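The completion detection can be sketched as a small timing model in Python. This is a simplification under stated assumptions: time is counted in whole clock cycles, a pulse that collides with an earlier one is simply delayed to the next free cycle, and the function name `release_cycle` is hypothetical.

```python
def release_cycle(zero_cycles, final_value):
    """Cycle at which the resource's indicator is reset to free.

    zero_cycles -- cycle at which each started unit's counter reaches 0
    final_value -- contents of the final value register (units started)
    Returns None if fewer than `final_value` pulses ever arrive.
    """
    counted = 0
    last = None
    for cycle in sorted(zero_cycles):
        # Priority circuit: pulses that would coincide are passed on
        # one clock apart, so each is counted separately.
        last = cycle if last is None or cycle > last else last + 1
        counted += 1
        if counted == final_value:
            return last  # comparison circuit fires here
    return None


# Three units finish together at cycle 5 and the fourth at cycle 6.
freed_at = release_cycle([5, 5, 5, 6], final_value=4)
```

The staggering matters because a plain counter would otherwise register simultaneous pulses as a single count and the resource would never be released.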

The vector element count L set in vector length register 17 may be less than 4. For this reason, start control circuit 21, which starts only as many vector arithmetic processing units as are needed, is provided in vector arithmetic control unit 8.

This is described below with reference to FIG. 3.

When the vector element count L is 4 or more, start control circuit 21 outputs unit start signal 23 to all of vector arithmetic processing units 4 to 7 when instruction execution determination circuit 18 outputs start signal 22, and final value register 49 is set to 4 as described above. When L is 1, 1 detection circuit 51 in start control circuit 21 issues an output, so OR gates 52 and 53 and AND gates 54, 55, and 56 block unit start signal 23 to vector arithmetic processing units 5, 6, and 7. When L is 2, 2 detection circuit 57 issues an output, so OR gates 52 and 53 and AND gates 55 and 56 block unit start signal 23 to vector arithmetic processing units 6 and 7. When L is 3, 3 detection circuit 58 issues an output, so OR gate 53 and AND gate 56 block unit start signal 23 to vector arithmetic processing unit 7.
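The start gating for short vectors can be sketched in Python as below. The chaining of the two OR gates is an assumption that is merely consistent with the cases listed above, the model assumes L is at least 1, and the function name `units_started` is hypothetical.

```python
def units_started(length):
    """Reference numerals of the units that receive the unit start signal."""
    detect_1 = length == 1
    detect_2 = length == 2
    detect_3 = length == 3
    or_a = detect_1 or detect_2  # blocks unit 6 when L is 1 or 2
    or_b = or_a or detect_3      # blocks unit 7 when L is 1, 2, or 3
    start = {
        4: True,          # unit 4 is always started
        5: not detect_1,  # AND gate for unit 5
        6: not or_a,      # AND gate for unit 6
        7: not or_b,      # AND gate for unit 7
    }
    return [unit for unit, enabled in start.items() if enabled]


started_for_2 = units_started(2)
```

The number of units returned is also the value that has to end up in the final value register, so that completion is detected after exactly that many counters reach 0.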

When L is less than 4, the number of vector arithmetic processing units started is set in final value register 49. In this case, the data to be set in final value register 49 can be obtained, for example, by counting the unit start signals 23 sent to the vector arithmetic processing units through a circuit identical to priority circuit 47.

The above is an example in which one vector instruction is divided among four vector arithmetic processing units, but the device may be configured so that processing can be switched freely among one to four vector arithmetic processing units as required.

For example, a vector instruction that computes an inner product or a sum, or one that performs a first-order recurrence operation, must be processed by a single vector arithmetic processing unit. A configuration that allows processing to be switched freely among one to four vector arithmetic processing units can be obtained, for example, as follows.

In FIG. 5, a gate is provided at the output of each of the AND gates 54 to 56 in the start control circuit 21. When only the vector arithmetic processing unit 4 is to perform the processing, the outputs of the AND gates 54 to 56 are inhibited; when only the vector arithmetic processing units 4 and 5 are to perform the processing, the outputs of the AND gates 55 and 56 are inhibited.
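As a rough illustration of these additional gates, the following Python sketch masks a list of unit start signals so that only the first few units are started. The function name `apply_inhibit_gates` and the parameter `max_units` are hypothetical names introduced here, and the list layout (units 4, 5, 6, 7 in order) is an assumption for the example.

```python
def apply_inhibit_gates(signals, max_units):
    """Inhibit the start signals of all units beyond the first `max_units`.

    `signals` holds the start signals for units 4, 5, 6, 7 in that order.
    Index 0 (unit 4) has no AND gate and is therefore never inhibited;
    indices 1..3 correspond to the outputs of AND gates 54, 55, 56.
    """
    if not 1 <= max_units <= len(signals):
        raise ValueError("max_units must be between 1 and the number of units")
    return [s and (i < max_units) for i, s in enumerate(signals)]


if __name__ == "__main__":
    all_on = [True, True, True, True]
    print(apply_inhibit_gates(all_on, 1))  # only unit 4 processes
    print(apply_inhibit_gates(all_on, 2))  # only units 4 and 5 process
```

With `max_units` set to 1 the outputs corresponding to AND gates 54 to 56 are forced off, and with 2 only those of AND gates 55 and 56, matching the two cases given in the text.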

As for the correction circuit 42, paths are provided such as a path that gives all bits of the vector length register 17 to the vector arithmetic processing unit 4 alone, and a path that gives the portion excluding the lowest bit to the vector arithmetic processing units 4 and 5 and, when the lowest bit is '1', causes the counter 41 of the vector arithmetic processing unit 4 to be incremented by 1. The circuit is configured so that these paths can be selected.

When the former path is selected, all vector elements are processed by the vector arithmetic processing unit 4; when the latter is selected, the vector elements are divided between the vector arithmetic processing units 4 and 5.
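The two selectable paths of the correction circuit can be sketched as follows. This is a minimal model under stated assumptions, not the circuit itself: the function name `split_vector_length` is hypothetical, and only the one-unit and two-unit paths described in the text are modelled.

```python
def split_vector_length(L, units):
    """Return the element counts loaded into the counters of the active units.

    L models the contents of the vector length register 17.
    units == 1: all bits go to unit 4.
    units == 2: the bits above the lowest bit go to units 4 and 5, and the
                lowest bit, when 1, adds one more element to unit 4.
    """
    if L < 0:
        raise ValueError("vector length must be non-negative")
    if units == 1:
        return [L]
    if units == 2:
        half = L >> 1   # portion excluding the lowest bit
        extra = L & 1   # lowest bit: +1 on the counter of unit 4
        return [half + extra, half]
    raise ValueError("only the 1-unit and 2-unit paths are modelled here")


if __name__ == "__main__":
    print(split_vector_length(7, 1))
    print(split_vector_length(7, 2))
    print(split_vector_length(8, 2))
```

Whichever path is chosen, the per-unit counts add up to the original vector length, so no element is dropped or processed twice.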

Further, the address calculation circuit 26 in FIG. 2 is configured as follows. When only the vector arithmetic processing unit 4 is to perform the processing, the register 52 in the data transfer circuit of the vector arithmetic processing unit 4 is initialized with the value read from the increment register 28 as it is. When only the vector arithmetic processing units 4 and 5 are to perform the processing, the register 52 in each of their data transfer circuits is initialized with twice the value read from the increment register 28.
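The effect of scaling the increment by the number of active units can be seen in a small address model. The sketch below assumes that elements are assigned to units in an interleaved fashion and that unit k starts at the base address plus k times the increment; both points, as well as the function name `unit_addresses`, are assumptions for illustration and are not stated in this passage.

```python
def unit_addresses(base, increment, L, units):
    """List the main-storage addresses accessed by each active unit.

    base      : address of the first vector element
    increment : value read from the increment register
    L         : total number of vector elements
    units     : number of active vector arithmetic processing units
    """
    # Value initialized in each unit's data transfer circuit register:
    # the increment itself for one unit, twice the increment for two units.
    step = increment * units
    result = []
    for k in range(units):
        count = len(range(k, L, units))  # elements k, k+units, k+2*units, ...
        start = base + k * increment     # assumed starting address of unit k
        result.append([start + i * step for i in range(count)])
    return result


if __name__ == "__main__":
    print(unit_addresses(1000, 8, 5, 1))
    print(unit_addresses(1000, 8, 5, 2))
```

Under these assumptions the two units together touch exactly the same addresses that a single unit would touch alone, each stepping by the doubled increment.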

Needless to say, in the former case the value set in the final value register 49 must be equal to the number of vector arithmetic processing units to be started.

[Effect of the invention]

According to the present invention, one vector instruction is divided among and processed by a plurality of vector arithmetic processing units, so the utilization of the vector arithmetic units and the data transfer circuits can be increased and processing can be speeded up.

[Brief explanation of drawings]

FIG. 1 is an overall configuration diagram showing an embodiment of the vector processing device of the present invention. FIG. 2 is a block diagram showing details of the vector arithmetic control unit in FIG. 1 and of the data transfer circuits in the vector arithmetic processing units. FIG. 5 is a block diagram showing details of the vector arithmetic control unit in FIG. 1 and of the instruction execution control section in the vector arithmetic processing units.

In the figures: 1, main storage device; 2, storage control unit; 3, scalar arithmetic processing unit; 4 to 7, vector arithmetic processing units; 8, vector arithmetic control unit; 9, vector register; 10, vector arithmetic unit; 11 to 13, data transfer circuits; 15, instruction execution control section; 17, vector length register; 18, instruction execution determination circuit; 21, start control circuit; 26, address calculation circuit; 41, 48, counters; 42, correction circuit; 49, final value register; 50, comparison circuit.

Claims

[Claims]

(1) A vector processing device having a plurality of vector registers, a plurality of vector arithmetic units that perform arithmetic processing on vector data received from the vector registers and send the results to the vector registers, and a plurality of data transfer circuits that control data transfer between a main storage device and the vector registers, the vector processing device comprising: a plurality of vector arithmetic processing units, each including a plurality of vector registers, at least one vector arithmetic unit, and at least one data transfer circuit; first storage means in which the total number of vector elements to be processed is set; second storage means, provided corresponding to each of the vector arithmetic processing units, in which the number of vector elements to be processed by that vector arithmetic processing unit is set; and vector element number conversion means for determining the contents to be set in the second storage means based on the contents set in the first storage means.

(2) The vector processing device according to claim 1, wherein the second storage means includes means for updating its contents by one each time one vector element is processed by the corresponding vector arithmetic processing unit, the vector processing device further comprising means for detecting that the contents of the plurality of second storage means have reached a predetermined value.

(3) The vector processing device according to claim 1, wherein the second storage means is provided corresponding to the vector arithmetic unit provided in the corresponding vector arithmetic processing unit.

(4) The vector processing device according to claim 1, wherein the second storage means is provided corresponding to the data transfer circuit provided in the corresponding vector arithmetic processing unit.