JPH0690712B2

JPH0690712B2 - Vector processor

Info

Publication number: JPH0690712B2
Application number: JP11590588A
Authority: JP
Inventors: 政人西田; 一昭古澤
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1988-05-11
Filing date: 1988-05-11
Publication date: 1994-11-14
Anticipated expiration: 2009-11-14
Also published as: JPH01284968A

Description

DETAILED DESCRIPTION OF THE INVENTION

[Industrial field of use]

The present invention relates to a vector processing device comprising an instruction processing unit, composed of a plurality of vector registers and a plurality of arithmetic units, and an instruction decoding instruction unit that decodes the instruction to be executed, generates the information needed to execute it, and issues instruction execution instructions while managing the states of the vector registers and arithmetic units of the instruction processing unit.

[Conventional technology]

Conventionally, a vector processing device of this type sets three kinds of flags at the same time as it sends an instruction execution instruction to the instruction processing unit: a flag indicating that a vector register supplying an operand of the instruction is being read (hereinafter, the read busy flag); a flag indicating which arithmetic unit is supplying the operation result to the vector register that is storing it (hereinafter, the function flag); and a flag indicating that the operation result is being stored in the designated vector register (hereinafter, the write busy flag). Before sending a subsequent instruction execution instruction, the device checks these flags to determine the state of the instruction processing unit. If nothing is busy, it sends the instruction execution instruction; if something is busy, it waits until the busy state is released and then sends it.

The arithmetic units of the instruction processing unit each provide a function such as addition/subtraction, multiplication, logical operation, or shift operation. Each is pipelined and can process one element of a vector register per clock, and the number of pipeline stages generally differs with the operation. Consequently, the more pipeline stages the arithmetic unit used by an instruction has, the longer the write busy period that follows the instruction execution instruction.

In particular, consider the case where the vector register currently storing an operation result in the instruction processing unit is the same register that the instruction being checked for issue in the instruction decoding instruction unit designates for its own result. Except when the arithmetic unit in use in the instruction processing unit and the arithmetic unit to be used by the instruction being checked are the same, the instruction execution instruction could not be sent until the write busy of the vector register storing the result was released, even if the arithmetic unit to be used and the vector registers to be read by the instruction awaiting issue had already been released from busy.
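The conventional flag check described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the circuit of the prior-art device: the `Flags` class, the function names, and the exact set of conditions tested in `can_issue_conventional` are hypothetical, chosen only to match the three flag types named in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Flags:
    """Hypothetical model of the three flag sets kept by the decode unit."""
    read_busy: set = field(default_factory=set)    # registers being read
    write_busy: set = field(default_factory=set)   # registers being written
    function: dict = field(default_factory=dict)   # arithmetic unit -> destination register


def issue(flags, unit, sources, dest):
    """Set all three flags when an execution instruction is sent."""
    flags.read_busy.update(sources)
    flags.function[unit] = dest
    flags.write_busy.add(dest)


def can_issue_conventional(flags, unit, sources, dest):
    """Assumed conventional rule: wait while anything the instruction needs is busy."""
    if unit in flags.function:                        # arithmetic unit still in use
        return False
    if any(r in flags.write_busy for r in sources):   # an operand is still being written
        return False
    if dest in flags.write_busy or dest in flags.read_busy:
        return False                                  # destination still in use
    return True
```

Under this sketch, a multiply into R5 issued behind an add into R5 stays blocked until the write busy of R5 clears, even after the add's reads have finished and the multiplier is free, which is the situation the passage describes.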

[Problems to be Solved by the Invention]

In the conventional vector processing device described above, when consecutive operation instructions that designate the same vector register for their results but use different arithmetic units are executed, the execution instruction for the subsequent instruction is sent to the instruction processing unit only after the result of the preceding operation has been stored and the write busy of the vector register has been released. This leaves an idle period between the preceding and subsequent operations during which no operation result is obtained, so the overall operation time is lengthened.

[Means for Solving the Problems]

The vector processing device of the present invention comprises: means for detecting that the designation of the vector register currently storing the result of the instruction being executed in the instruction processing unit matches the designation of the vector register that is to store the operation result of the instruction being decoded in the instruction decoding instruction unit; means for comparing the number of pipeline stages of the arithmetic unit that is outputting the result of the instruction being executed in the instruction processing unit with the number of pipeline stages of the arithmetic unit to be used by the instruction being decoded in the instruction decoding instruction unit; means for detecting the end of reading of the vector data being supplied to the arithmetic unit by the instruction being executed in the instruction processing unit; and means for sending the execution instruction for the instruction being decoded to the instruction processing unit at the point when reading of the vector registers supplying data to the arithmetic unit for the instruction being executed has finished, provided that the match of vector register designations has been detected and the number of pipeline stages of the arithmetic unit used by the subsequent instruction is equal to or greater than that of the arithmetic unit used by the preceding instruction.

[Action]

Because the execution instruction for the instruction being decoded is sent to the instruction processing unit as soon as reading of the vector registers supplying data to the arithmetic unit for the instruction being executed has finished, the operation time can be shortened.

[Embodiment]

Next, an embodiment of the present invention will be described with reference to the drawings.

FIG. 1 is a block diagram of an embodiment of the vector processing device of the present invention. FIG. 2 shows the states of the busy flags 22, 23, and 24 when an add instruction is executed. FIG. 3 shows the states of the flags 22, 23, and 24 under the conventional method when a multiply instruction is executed following an add instruction. FIG. 4 shows the states of the flags 22, 23, and 24 in the present embodiment when a multiply instruction is executed following an add instruction.

This embodiment comprises an instruction processing unit 1 and an instruction decoding instruction unit 2.

The instruction processing unit 1 is composed of vector registers R₀ to R₅, arithmetic units 10 to 12, and a crossbar switch 13. In this case the arithmetic units 10, 11, and 12 are an adder, a multiplier, and a logical operation unit, respectively, and each has its own number of pipeline stages. Here it is assumed that the number of pipeline stages of the multiplier 11 is equal to or greater than that of the adder 10.

The instruction decoding instruction unit 2 is composed of an instruction register 20, a decoder circuit 21, a function flag 22, a write busy flag 23, a read busy flag 24, a pipeline stage number comparison circuit 25, a result storage register designation match detection circuit 26, a write busy check circuit 27, a read busy check circuit 28, a preceding instruction read busy reset detection circuit 30, a NAND circuit 31, an AND circuit 32, and an instruction execution instruction circuit 29.

The decoder circuit 21 decodes the instruction in the instruction register 20. The function flag 22 consists of an adder flag 22₁, a multiplier flag 22₂, and a logical operation unit flag 22₃. When an operation is executed, the flag is set in the bit corresponding to the number of the vector register in which that arithmetic unit is storing its result, and it is reset when the last entry of the vector has been supplied. The write busy flag 23 indicates that an operation result is being stored in the designated vector register, and is reset when storage of the result in that register is complete. The read busy flag 24 indicates that a vector register supplying an operand of the instruction being executed is being read, and is reset when the reading is complete.

The pipeline stage number comparison circuit 25 receives from the function flag 22 the information on which arithmetic unit the preceding instruction is using, and reads that unit's number of pipeline stages from a table or the like. It obtains the number of pipeline stages for the subsequent instruction from the decode information of the decoder circuit 21, compares the two, and outputs "1" when the number of pipeline stages for the subsequent instruction is equal to or greater than that for the preceding instruction. The result storage register designation match detection circuit 26 outputs "1" when it detects, from the write busy flag 23 and the decode information, that the preceding and subsequent instructions designate the same vector register for their operation results. The preceding instruction read busy reset detection circuit 30 receives from the function flag 22 the information on which arithmetic unit the preceding instruction is using and, when the read busy flag 24 is reset, outputs "1" because reading from the vector registers for the preceding instruction has finished.

The write busy check circuit 27 and the read busy check circuit 28 check, from the information in the write busy flag 23 and the read busy flag 24 respectively, that there is no conflict with the designation of the vector registers that the subsequent instruction is to read. The NAND circuit 31 takes the NAND of the outputs of the pipeline stage number comparison circuit 25, the result storage register designation match detection circuit 26, and the preceding instruction read busy reset detection circuit 30. When the output of the NAND circuit 31 is "0", the AND circuit 32 causes the write busy reported by the write busy check circuit 27 to be ignored. The instruction execution instruction circuit 29 receives the output of the decoder circuit 21, the output of the AND circuit 32, and the output of the read busy check circuit 28, and sends an instruction execution instruction to the instruction processing unit 1 when the outputs of the AND circuit 32 and the read busy check circuit 28 are both "0".
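The gating formed by the NAND circuit 31, the AND circuit 32, and the instruction execution instruction circuit 29 can be sketched as plain Boolean logic. This is an illustrative reading of the description, assuming each circuit output is a single bit (1 or 0); the function and parameter names are hypothetical and do not appear in the patent.

```python
def nand3(a, b, c):
    """Three-input NAND on 0/1 values."""
    return 0 if (a and b and c) else 1


def issue_blocked(stage_ok, dest_match, prev_read_done, write_busy, read_check_busy):
    """Return 1 while issue must wait, 0 when the execution instruction may be sent.

    stage_ok        -- output of comparison circuit 25 (1: subsequent >= preceding stages)
    dest_match      -- output of match detection circuit 26 (1: same result register)
    prev_read_done  -- output of reset detection circuit 30 (1: preceding reads finished)
    write_busy      -- output of write busy check circuit 27 (1: busy)
    read_check_busy -- output of read busy check circuit 28 (1: busy)
    """
    nand31 = nand3(stage_ok, dest_match, prev_read_done)  # NAND circuit 31
    and32 = nand31 & write_busy                           # AND circuit 32 masks write busy
    return and32 | read_check_busy                        # circuit 29 sends only when both are 0
```

With all three conditions satisfied, `nand31` is 0 and the write busy bit no longer blocks issue; if any one condition fails, `nand31` is 1 and the write busy bit passes through unchanged, so the device falls back to the conventional wait.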

Next, the operation of this embodiment will be described.

For example, suppose that the instruction decoding instruction unit 2 instructs the instruction processing unit 1 to add the contents of vector registers R₀ and R₁ and store the result in vector register R₅. As shown in FIG. 2, it then sets the read busy flags 24 of vector registers R₀ and R₁, the function flag 22₁ of the adder 10, and the write busy flag 23 of vector register R₅. The instruction processing unit 1 adds the contents of R₀ and R₁ and stores the result in R₅.

Now suppose that, while this addition is executing in the instruction processing unit 1, an instruction to multiply the contents of vector registers R₂ and R₃ and store the result in vector register R₅ enters the instruction decoding instruction unit 2. With the conventional method, as shown in FIG. 3, storing into R₅ for the subsequent instruction (R₅←R₂×R₃) does not resume until a time T₁ after storage of the preceding result in R₅ has finished. This is because the execution instruction for the subsequent instruction is sent only after the write busy flag 23 of R₅ set by the preceding instruction has been reset and checked, so no new data is delivered to R₅ for at least the number of clocks corresponding to the pipeline stages of the arithmetic unit used by the subsequent instruction. As a result, the overall operation time was lengthened. The object of the present invention is to shorten T₁.

In this embodiment, the preceding instruction read busy reset detection circuit 30 determines the point at which reading from the vector registers for the preceding instruction has finished. This is possible because, once the arithmetic unit whose function flag 22 is set has been identified, the vector registers supplying vector data to that unit can be determined unambiguously from the structure of the instruction processing unit 1. Here the bit for vector register R₅ is set in the adder flag 22₁, and the vector registers supplying vector data to the adder 10 for the preceding instruction are found to be R₀ and R₁. The preceding instruction read busy reset detection circuit 30 therefore detects the reset of the read busy of R₀ and R₁ and outputs "1".

At this point the outputs of the pipeline stage number comparison circuit 25 and the result storage register designation match detection circuit 26 are both "1", so the output of the NAND circuit 31 becomes "0" and the write busy of vector register R₅ reported by the write busy check circuit 27 is ignored by the AND circuit 32. As a result, as shown in FIG. 4, the instruction execution instruction circuit 29 uses the read busy check circuit 28 to confirm that vector registers R₂ and R₃ of the subsequent instruction R₅←R₂×R₃ are not write busy, and then sends the instruction execution instruction to the instruction processing unit 1.

Therefore, according to this embodiment, the time T₁ shown in FIG. 4 is shorter than the time T₁ shown in FIG. 3, and the overall operation time is shortened accordingly.

It also follows readily that, when the number of pipeline stages of the arithmetic unit used by the preceding instruction equals that of the arithmetic unit used by the subsequent instruction and both instructions designate the same vector register for their results, the time T₁ disappears entirely.
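The effect on T₁ can be illustrated with a simple cycle-count model. The model is an assumption made for illustration and is not given in the patent: an instruction issued at cycle t reads element i at cycle t + i and writes its result at cycle t + i + d, where d is the number of pipeline stages. The function name, the vector length, and the stage counts used in the usage note are all hypothetical.

```python
def idle_cycles(n, d_prev, d_next, early_issue):
    """Cycles with no result written to the shared register between two instructions.

    n           -- vector length (elements)
    d_prev      -- pipeline stages of the unit used by the preceding instruction
    d_next      -- pipeline stages of the unit used by the subsequent instruction
    early_issue -- True for the scheme of this embodiment, False for the conventional one
    """
    t_prev = 0                                   # preceding instruction issued at cycle 0
    last_prev_write = t_prev + (n - 1) + d_prev  # cycle of its final write
    if early_issue and d_next >= d_prev:
        t_next = t_prev + n                      # issue as soon as the preceding reads finish
    else:
        t_next = last_prev_write + 1             # wait until write busy is released
    first_next_write = t_next + d_next
    return first_next_write - last_prev_write - 1
```

Under this model, with 64 elements, a 4-stage adder, and a 7-stage multiplier, the conventional scheme leaves 7 idle cycles (the full depth of the multiplier) while early issue leaves 3 (the difference in depth). With equal depths the gap is zero, consistent with the statement that T₁ disappears. The condition `d_next >= d_prev` is what keeps the subsequent writes from overtaking the preceding ones.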

[Effect of the Invention]

As described above, the present invention compares the number of pipeline stages of the arithmetic unit that is outputting the result of the instruction being executed in the instruction processing unit with the number of pipeline stages of the arithmetic unit to be used by the instruction being decoded in the instruction decoding instruction unit. When the latter is equal to or greater than the former, the execution instruction for the instruction being decoded is sent to the instruction processing unit at the point when reading of the vector registers supplying data to the arithmetic unit for the instruction being executed has finished. Consequently, when consecutive instructions that store their results in the same vector register are executed and the number of pipeline stages of the arithmetic unit used by the subsequent instruction is equal to or greater than that used by the preceding instruction, the operation time can be shortened.

[Brief description of drawings]

FIG. 1 is a block diagram showing an embodiment of the vector processing device of the present invention. FIG. 2 shows the states of the busy flags 22, 23, and 24 when an add instruction is executed. FIG. 3 shows the states of the flags 22, 23, and 24 under the conventional method when a multiply instruction is executed following an add instruction. FIG. 4 shows the states of the flags 22, 23, and 24 in the present embodiment when a multiply instruction is executed following an add instruction.

1: instruction processing unit
2: instruction decoding instruction unit
R₀ to R₅: vector registers
10: arithmetic unit (adder)
11: arithmetic unit (multiplier)
12: arithmetic unit (logical operation unit)
13: crossbar switch
20: instruction register
21: decoder circuit
22: function flag
22₁: adder flag
22₂: multiplier flag
22₃: logical operation unit flag
23: write busy flag
24: read busy flag
25: pipeline stage number comparison circuit
26: result storage register designation match detection circuit
27: write busy check circuit
28: read busy check circuit
29: instruction execution instruction circuit
30: preceding instruction read busy reset detection circuit
31: NAND circuit
32: AND circuit

Claims

[Claims]

1. A vector processing device having an instruction processing unit composed of a plurality of vector registers and a plurality of arithmetic units, and an instruction decoding instruction unit that decodes an instruction to be executed, generates information for executing the instruction, and issues instruction execution instructions while managing the states of the vector registers and arithmetic units of the instruction processing unit, the vector processing device being characterized by comprising: means for detecting that the designation of the vector register storing the result of the instruction being executed in the instruction processing unit matches the designation of the vector register that is to store the operation result of the instruction being decoded in the instruction decoding instruction unit; means for comparing the number of pipeline stages of the arithmetic unit outputting the result of the instruction being executed in the instruction processing unit with the number of pipeline stages of the arithmetic unit used by the instruction being decoded in the instruction decoding instruction unit; means for detecting the end of reading of the vector data being supplied to the arithmetic unit by the instruction being executed in the instruction processing unit; and means for sending the execution instruction for the instruction being decoded to the instruction processing unit at the point when reading of the vector registers supplying data to the arithmetic unit for the instruction being executed in the instruction processing unit has finished, when the match of the vector register designations is detected and the number of pipeline stages of the arithmetic unit used by the subsequent instruction is equal to or greater than the number of pipeline stages of the arithmetic unit used by the preceding instruction.