JPH02100175A - Vector processor

Info

Publication number: JPH02100175A
Application number: JP25352388A
Authority: JP
Inventors: Takeshi Nishikawa (西川 岳); Kazuaki Furusawa (古澤 一昭)
Original assignee: NEC Corp; NEC Computertechno Ltd
Current assignee: NEC Corp; NEC Computertechno Ltd
Priority date: 1988-10-06
Filing date: 1988-10-06
Publication date: 1990-04-12
Anticipated expiration: 2010-12-18
Also published as: JPH07117955B2

Abstract

PURPOSE: To shorten the total computing time when arithmetic results are stored consecutively in the same vector register, by selecting and checking a flag whose timing is controlled according to the number of pipeline stages of each arithmetic unit. CONSTITUTION: When the arithmetic unit to be used by the instruction decoded in an instruction control unit 2 is an adder 11, a short write busy A flag 33 is selected. Likewise, a short write busy L flag 32 is selected for a logical operation unit 12, a short write busy M flag 34 for a multiplier 13, and a write busy flag 31 for an arithmetic unit 14 that starts storing into the designated vector register immediately after instruction execution is directed. The selected flag is checked by a write busy check circuit 23, the check result is sent to an execution directive circuit 24, and an execution directive is then sent to an instruction execution unit 1. The arithmetic results are thus stored in the vector register without interruption, and the total computing time is shortened.

Description

[Detailed description of the invention]

[Industrial Application Field]

The present invention relates to a vector processing device comprising an instruction execution unit, composed of a plurality of vector registers and a plurality of arithmetic units with different numbers of pipeline stages, and an instruction control unit that decodes the instructions to be executed, generates the information needed to execute them, and issues execution directives while managing the state of resources such as the vector registers and arithmetic units of the instruction execution unit.

[Prior Art]

Conventionally, when this type of vector processing device sends an execution directive to the instruction execution unit, it simultaneously sets a flag indicating that the vector register supplying the instruction's operands is being read (hereinafter, the read busy flag) and a flag indicating that the operation result is being stored in the designated vector register (hereinafter, the write busy flag). Before sending a subsequent execution directive, the device checks these flags to determine the state of the instruction execution unit. If the unit is not busy, the directive is sent; if it is busy, the device waits until the busy state is released and then sends the directive.

The arithmetic units of the instruction execution unit provide functions such as addition and subtraction, multiplication, logical operations, and shift operations. Each is pipelined and can process one element of a vector register per clock, and the number of pipeline stages generally differs from function to function. Consequently, the more pipeline stages the arithmetic unit used by an instruction has, the longer the write busy period measured from the execution directive.

In particular, as shown in Fig. 3, suppose the vector register in which the instruction execution unit is currently storing a result is the same as the destination vector register of the instruction being checked in the instruction control unit. The execution directive for the subsequent instruction is sent only after the write busy state of that register is released, and the subsequent instruction begins storing its result only after a number of clocks equal to the pipeline stages of the arithmetic unit it uses. Storage into the vector register is therefore interrupted and idle time arises.

Accordingly, as shown in Fig. 4, when the subsequent instruction uses the arithmetic unit with the fewest pipeline stages, a flag that is shorter than the write busy flag by the number of clocks equal to that minimum stage count (hereinafter, the short write busy flag) can be set from each instruction's execution directive. If the reset of this flag is checked before the subsequent execution directive is sent, the results can be stored without interruption.

However, as shown in Fig. 5, when the subsequent instruction uses an arithmetic unit with more pipeline stages, sending the execution directive on reset of the short write busy flag still interrupts the storage of results, and idle time arises.

[Problems to be Solved by the Invention]

The conventional vector processing device described above has the following drawback. When consecutive arithmetic instructions that designate the same destination vector register are executed, the more pipeline stages the arithmetic unit used by the subsequent instruction has, the longer the idle time, during which no result is available, between storage of the preceding result and storage of the subsequent result. The overall computing time becomes correspondingly longer.

[Means for Solving the Problems]

The vector processing device of the present invention has:

a plurality of short write busy flags, provided for each of the plurality of vector registers in correspondence with the plurality of arithmetic units, each indicating that an operation result is being stored in the designated vector register and each reset earlier than the write busy flag by a number of clocks equal to the pipeline stages of the corresponding arithmetic unit;

a selector that selects one of the short write busy flags according to the arithmetic unit to be used by the instruction being decoded in the instruction control unit;

a check circuit that checks the selected short write busy flag; and

execution directive means that sends the execution directive for the subsequent arithmetic instruction to the instruction execution unit at the time that short write busy flag is reset.
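
The idle time in the prior-art cases above can be stated as a small timing relation. This is a simplified sketch, not taken from the source: the symbols $n$ (vector length), $p_1$ and $p_2$ (pipeline stages of the units used by the preceding and subsequent instructions), $p_{\min}$ (the smallest stage count among the units), and $k$ (how many clocks before the write busy flag the checked flag resets) are assumptions introduced here, and operand read timing and the crossbar switch are ignored.

```latex
% Requires amsmath. The preceding instruction is issued at clock 0 and its
% n results are stored during clocks p_1, ..., p_1 + n - 1, so the write
% busy flag resets at
\[ T_W = p_1 + n . \]
% A flag that resets k clocks earlier releases the subsequent instruction at
% T_W - k, and that instruction stores its first result p_2 clocks later:
\[ T_{\mathrm{first}} = T_W - k + p_2 . \]
% Idle clocks on the destination register between the two result streams:
\[ g = T_{\mathrm{first}} - T_W = p_2 - k . \]
\begin{align*}
k = 0 &: \quad g = p_2 && \text{(write busy flag, Fig. 3)} \\
k = p_{\min},\ p_2 = p_{\min} &: \quad g = 0 && \text{(short write busy flag, Fig. 4)} \\
k = p_{\min},\ p_2 > p_{\min} &: \quad g = p_2 - p_{\min} > 0 && \text{(short write busy flag, Fig. 5)}
\end{align*}
```

Under this model the gap vanishes only when $k = p_2$, which is the condition that a flag provided per arithmetic unit is meant to satisfy.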

[Effect]

Accordingly, the results can be stored continuously even in the following case: the vector register in which the instruction execution unit is storing the result of the instruction being executed matches the destination vector register of the instruction being decoded in the instruction control unit, and the arithmetic unit to be used by the instruction being decoded has more pipeline stages than the arithmetic unit used by the instruction being executed. Storage of the result of the executing instruction and storage of the result of the subsequent, decoded instruction then follow one another without a gap.

[Embodiment]

Next, an embodiment of the present invention is described with reference to the drawings.

Fig. 1 is a block diagram of an embodiment of the vector processing device of the present invention. Fig. 2 illustrates the flags during execution of an add instruction. Fig. 3 shows the case where the write busy flag 31 of the preceding instruction is referenced when checking whether the execution directive of the subsequent instruction may be sent. Fig. 4 shows the case where the subsequent instruction uses the logical operation unit and the short write busy L flag 32 is checked. Fig. 5 shows the case where the subsequent instruction uses the multiplier and the short write busy L flag 32 is checked. Fig. 6 illustrates all of the flags during execution of an add instruction. Fig. 7 shows the case where the subsequent instruction uses the multiplier and the short write busy M flag 34 is checked.

This embodiment consists of an instruction execution unit 1 and an instruction control unit 2. The instruction execution unit 1 is composed of vector registers R0 to R7, each having a plurality of entries, a crossbar switch 10, an adder 11, a logical operation unit 12, a multiplier 13, and an arithmetic unit 14. The number of pipeline stages of the adder 11, logical operation unit 12, multiplier 13, and arithmetic unit 14 is specific to each unit. Here it is assumed that the multiplier 13 has the most stages, followed by the adder 11 and then the logical operation unit 12. The arithmetic unit 14 is assumed to begin storing into the designated vector register immediately after the execution directive is given.

The instruction control unit 2 is composed of an instruction register 20, a decode circuit 21, a read busy flag 30, a write busy flag 31, a short write busy L flag 32, a short write busy A flag 33, a short write busy M flag 34, a selector circuit 25, a read busy check circuit 22, a write busy check circuit 23, and an execution directive circuit 24. The read busy flag 30 indicates that the vector register supplying the operands of the executing instruction is being read, and is reset when the read completes. The write busy flag 31 indicates that the operation result is being stored in the designated vector register, and is reset when that storage completes.

As shown in Fig. 2, the short write busy L flag 32 is reset earlier than the write busy flag 31 by the number of clocks equal to the pipeline stages of the logical operation unit 12, which has the fewest stages of all the arithmetic units. The short write busy A flag 33 is reset earlier than the write busy flag 31 by the number of clocks equal to the pipeline stages of the adder 11. The short write busy M flag 34 is reset earlier than the write busy flag 31 by the number of clocks equal to the pipeline stages of the multiplier 13.

Conventionally, a match between the operand vector register designated by the instruction being decoded in the instruction control unit 2 and a vector register being read by an executing instruction was detected by the read busy check circuit 22 using the read busy flag 30. Likewise, a match between the destination vector register of the instruction being decoded and a vector register in which an executing instruction is storing its result was detected by the write busy check circuit 23 using the write busy flag 31. Based on these results, the execution directive circuit 24 sent the execution directive to the instruction execution unit 1.

Consider the case where these destination designations match. In Fig. 3, the instruction execution unit 1 is storing the sum of vector registers R0 and R1 into vector register R7, and the subsequent instruction stores the result of a logical operation on vector registers R2 and R3 into vector register R7. The execution directive for the subsequent logical operation instruction can be sent to the instruction execution unit 1 only when the write busy flag 31 of vector register R7 is reset. Storage of results into vector register R7 therefore cannot be continuous, and idle time arises.

If the write busy check circuit 23 checks the short write busy L flag 32 instead of the write busy flag 31, then, as shown in Fig. 4, the same operations as in Fig. 3 store their results into vector register R7 continuously.

However, as shown in Fig. 5, suppose the instruction execution unit 1 is storing the sum of vector registers R0 and R1 into vector register R7 and the subsequent instruction stores the product of vector registers R4 and R5 into vector register R7. If the write busy check circuit 23 checks the short write busy L flag 32, storage into vector register R7 is not continuous, because the multiplier 13 has more pipeline stages than the logical operation unit 12, and idle time arises again.

To solve these problems, as shown in Fig. 6, the short write busy A flag 33 and the short write busy M flag 34 are additionally set when the execution directive is sent. Each is reset earlier than the write busy flag 31 by the number of clocks equal to the pipeline stages of its arithmetic unit, and this set of flags is provided for every vector register.

As shown in Fig. 1, the selector circuit 25 selects among these flags according to the number of pipeline stages of the arithmetic unit to be used by the instruction being decoded in the instruction control unit 2. The write busy check circuit 23 checks the selected flag and sends the result to the execution directive circuit 24, which sends the execution directive to the instruction execution unit 1. Specifically, the selection is as follows:

adder 11: short write busy A flag 33
logical operation unit 12: short write busy L flag 32
multiplier 13: short write busy M flag 34
arithmetic unit 14 (starts storing into the designated vector register immediately after the execution directive): write busy flag 31

For example, as shown in Fig. 7, suppose the sum of vector registers R0 and R1 is being stored into vector register R7 and the subsequent instruction stores the product of vector registers R4 and R5 into vector register R7. The short write busy M flag 34 is selected on the basis of the pipeline stages of the multiplier 13, the write busy check circuit 23 checks that flag, and the execution directive circuit 24 sends the execution directive to the instruction execution unit 1. As a result, the results are stored into vector register R7 continuously.

[Effects of the Invention]

As described above, the present invention selects and checks, according to the number of pipeline stages of the arithmetic unit to be used by the instruction being decoded, a flag that is provided per arithmetic unit and controlled at a timing corresponding to that unit's pipeline stages. Even when consecutive instructions designate the same destination vector register, the results can therefore be stored without interruption. This holds even when the arithmetic unit to be used by the subsequent instruction has more pipeline stages than the unit used by the preceding instruction. Consequently, when consecutive instructions store their results in the same vector register, the overall computing time is shortened.
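
The flag selection in the embodiment can be sketched as a small timing model. This is an illustrative sketch under stated assumptions, not the patented circuit: the stage counts in `STAGES` are hypothetical (the source gives only the ordering multiplier > adder > logical operation unit, and says unit 14 stores immediately), the names `FLAG_FOR_UNIT`, `EARLY_RESET`, `flag_reset_clocks`, and `storage_gap` are invented for this example, and read busy checks and the crossbar switch are ignored. The vector length is assumed to be long enough that no reset clock is negative.

```python
# Hypothetical pipeline depths in clocks. The source states only the ordering
# multiplier 13 > adder 11 > logical unit 12, and that unit 14 stores at once.
STAGES = {"logical": 2, "adder": 4, "multiplier": 7, "immediate": 0}

# Role of selector circuit 25: unit used by the decoded instruction -> flag.
FLAG_FOR_UNIT = {
    "logical": "short_write_busy_L",     # flag 32
    "adder": "short_write_busy_A",       # flag 33
    "multiplier": "short_write_busy_M",  # flag 34
    "immediate": "write_busy",           # flag 31
}

# Clocks by which each flag resets ahead of the write busy flag.
EARLY_RESET = {
    "write_busy": 0,
    "short_write_busy_L": STAGES["logical"],
    "short_write_busy_A": STAGES["adder"],
    "short_write_busy_M": STAGES["multiplier"],
}


def flag_reset_clocks(leader_unit, vector_length, issue_clock=0):
    """Clock at which each flag of the destination register resets."""
    # The leader's results are stored for vector_length clocks, starting
    # STAGES[leader_unit] clocks after it is issued.
    write_done = issue_clock + STAGES[leader_unit] + vector_length
    return {flag: write_done - early for flag, early in EARLY_RESET.items()}


def storage_gap(leader_unit, follower_unit, vector_length, checked_flag=None):
    """Idle clocks between the leader's last store and the follower's first.

    If checked_flag is None, the flag is chosen by the follower's unit,
    as the selector circuit does; otherwise the given flag is checked.
    """
    resets = flag_reset_clocks(leader_unit, vector_length)
    flag = checked_flag or FLAG_FOR_UNIT[follower_unit]
    follower_issue = resets[flag]  # issue as soon as the checked flag resets
    first_store = follower_issue + STAGES[follower_unit]
    return first_store - resets["write_busy"]


if __name__ == "__main__":
    n = 64  # assumed vector length
    for flag in ("write_busy", "short_write_busy_L", None):
        print(flag or "selected by unit", storage_gap("adder", "multiplier", n, flag))
```

With these assumed depths, an add followed by a multiply into the same register leaves 7 idle clocks when the write busy flag is checked, 5 when the short write busy L flag is checked, and none when the flag is selected by the multiplier's depth.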

[Brief explanation of the drawing]

Fig. 1 is a block diagram of an embodiment of the vector processing device of the present invention.
Fig. 2 illustrates the flags during execution of an add instruction.
Fig. 3 shows the case where the write busy flag 31 of the preceding instruction is referenced when checking whether the execution directive of the subsequent instruction may be sent.
Fig. 4 shows the case where the subsequent instruction uses the logical operation unit and the short write busy L flag 32 is checked.
Fig. 5 shows the case where the subsequent instruction uses the multiplier and the short write busy L flag 32 is checked.
Fig. 6 illustrates all of the flags during execution of an add instruction.
Fig. 7 shows the case where the subsequent instruction uses the multiplier and the short write busy M flag 34 is checked.

1: instruction execution unit
2: instruction control unit
R0 to R7: vector registers
10: crossbar switch
11: adder
12: logical operation unit
13: multiplier
14: arithmetic unit
20: instruction register
21: decode circuit
22: read busy check circuit
23: write busy check circuit
24: execution directive circuit
30: read busy flag
31: write busy flag
32: short write busy L flag
33: short write busy A flag
34: short write busy M flag

Instruction labels legible in the figures: R7 ← R0 + R1; R7 ← R4 * R5.

Claims

[Claims]

1. A vector processing device comprising: an instruction execution unit composed of a plurality of vector registers and a plurality of arithmetic units with different numbers of pipeline stages; and an instruction control unit that includes a read busy flag indicating that a vector register supplying an operand of an instruction is being read and a write busy flag indicating that an operation result is being stored in a designated vector register, that decodes the instruction to be executed and generates information for instruction execution, and that issues execution directives while managing the state of resources such as the vector registers and arithmetic units of the instruction execution unit, the vector processing device being characterized by having: a plurality of short write busy flags, provided for each of the plurality of vector registers in correspondence with the plurality of arithmetic units, each indicating that an operation result is being stored in the designated vector register and each reset earlier than the write busy flag by a number of clocks equal to the pipeline stages of the corresponding arithmetic unit; a selector that selects one of the short write busy flags according to the arithmetic unit to be used by the instruction being decoded in the instruction control unit; a check circuit that checks the selected short write busy flag; and execution directive means that sends the execution directive for a subsequent arithmetic instruction to the instruction execution unit at the time the short write busy flag is reset.