JP3726092B2

JP3726092B2 - Vector processing apparatus and vector loading method

Info

Publication number: JP3726092B2
Application number: JP2003270882A
Authority: JP
Inventors: 秀之佐藤
Original assignee: NEC Computertechno Ltd
Current assignee: NEC Computertechno Ltd
Priority date: 2003-07-04
Filing date: 2003-07-04
Publication date: 2005-12-14
Anticipated expiration: 2023-07-04
Also published as: JP2005025693A

Description

The present invention relates to a vector processing apparatus, and more particularly to a method of processing a vector load instruction that loads vector data from a main storage device into a vector register.

In general, a vector processing apparatus includes a plurality of vector registers that hold vector data loaded from a main storage device, intermediate results of vector operations, and the like, and a vector arithmetic unit that operates on the vector data held in the vector registers, so that large amounts of data can be processed at high speed. Because the access speed of the main storage device is low compared with the speed of vector operations, a technique has been proposed to speed up the loading of vector data into the vector registers. In this technique, a load buffer that temporarily stores vector data is provided between the main storage device and the vector registers, reading of vector data from the main storage device into the load buffer is started when the vector load instruction is decoded, and the transfer from the load buffer to the vector register is performed in instruction issue order, that is, at the issue stage of the vector load instruction (see, for example, Patent Document 1).
Patent Document 1: Japanese Patent Laid-Open No. 2-101576

However, in a configuration in which the transfer from the load buffer to the vector register is started at the issue stage of the vector load instruction, the transfer cannot be started as long as a vector instruction preceding that vector load instruction remains in the issue stage for a reason such as a busy resource, so vector load processing cannot be sped up. In addition, in such a configuration multiple vector load instructions are processed in instruction issue order, so overtaking control, in which a subsequent vector load instruction is processed earlier than a preceding one, is not possible.

An object of the present invention is therefore to provide a vector processing apparatus that can execute transfers from the load buffer to the vector register not in instruction issue order, but in the order in which the following condition is satisfied: all elements are present in the load buffer and the destination vector register is not busy.

The vector processing apparatus of the present invention includes, between a main storage device and vector registers, a load buffer that temporarily stores vector data read from the main storage device. It also includes resource check means, located between the instruction decode stage and the instruction issue stage of a vector control unit that processes vector instructions read from the main storage device sequentially in a plurality of stages, for detecting that the vector register used by an in-flight vector load instruction no longer conflicts with a preceding vector instruction. When a vector load instruction read from the main storage device is decoded, the apparatus starts reading vector data from the main storage device into the load buffer. As soon as all elements of the vector data have been stored in the load buffer, it starts transferring the vector data from the load buffer to the vector register, on the condition that the resource check means has detected that the vector register used by the vector load instruction does not conflict with a vector register used by a preceding vector instruction.

More specifically, the apparatus includes a main storage device and a processor connected to it. The processor includes a memory access processing unit that starts reading vector data from the main storage device when a vector instruction read from the main storage device is a vector load instruction, a vector control unit that processes vector instructions read from the main storage device sequentially in a plurality of stages, and a vector processing unit that has a plurality of vector registers, one or more vector arithmetic units, and a plurality of load buffers. Between its instruction decode stage and instruction issue stage, the vector control unit has resource check means that detects that the vector register used by an in-flight vector load instruction no longer conflicts with a preceding vector instruction and notifies the vector processing unit of this. The vector processing unit has a vector load management unit that stores the vector data read from the main storage device in the load buffer allocated by the memory access processing unit and, as soon as all elements of the vector data have been stored in the load buffer, starts transferring the vector data from the load buffer to the vector register, on the condition that the vector control unit has notified it that the vector register used by the vector load instruction does not conflict with a vector register used by a preceding vector instruction.

The vector loading method of the present invention includes the steps of: a) when a vector load instruction read from a main storage device is decoded, allocating a load buffer that temporarily stores the vector data to be read from the main storage device by the vector load instruction, and starting the reading of the vector data from the main storage device into the load buffer; b) detecting a first condition, namely that all elements of the vector data have been stored in the load buffer; c) detecting, between the instruction decode stage and the instruction issue stage, a second condition, namely that the vector register used by the vector load instruction does not conflict with a vector register used by a preceding vector instruction; and d) starting the transfer of the vector data from the load buffer to the vector register as soon as the first and second conditions are both satisfied.

According to the present invention, transfers from the load buffer to the vector register can be executed not in instruction issue order, but in the order in which the condition that all elements are present in the load buffer and the destination vector register is not busy is satisfied. Consequently, even if a vector instruction preceding a vector load instruction remains in the issue stage for a reason such as a busy resource, the transfer from the load buffer to the vector register can be started as soon as the transfer condition is satisfied, which speeds up vector load processing. In addition, multiple vector load instructions can be processed in the order in which their transfer conditions are satisfied rather than in instruction issue order, which enables overtaking control in which a subsequent vector load instruction is processed earlier than a preceding one.

Referring to FIG. 1, a vector processing apparatus according to an embodiment of the present invention includes a processor 1 and a main storage device 7. The processor 1 and the main storage device 7 are connected to each other by a signal line 101, through which the processor 1 reads instructions stored in the main storage device 7, and a signal line 102, through which the processor 1 reads vector data and the like stored in the main storage device 7 and, conversely, writes vector data and the like generated by the processor 1 to the main storage device 7. The signal line 102 is multiplexed so that vector data for a plurality of vector load instructions can be read from the main storage device 7 in parallel.

The processor 1 includes an instruction control unit 2, a memory access processing unit 3, a processor network unit 4, a vector control unit 5, and a vector processing unit 6.

The instruction control unit 2 is connected to the main storage device 7 by the signal line 101, to the memory access processing unit 3 by a signal line 103, and to the vector control unit 5 by a signal line 104. It has an instruction decoding unit 21 that decodes instructions read from the main storage device 7 through the signal line 101, and a scalar processing unit 22 that executes the processing for a decoded instruction when that instruction is a scalar instruction. When the decoded instruction is a vector instruction, the instruction decoding unit 21 outputs it to the vector control unit 5 through the signal line 104, and when the vector instruction is a vector load instruction, it also outputs the vector load instruction to the memory access processing unit 3 through the signal line 103. A vector load instruction contains information that specifies the main storage addresses of the vector data to be loaded (for example, a start address and the interval between vector data elements), the number of elements of the vector data, and the number of the vector register into which the vector data is loaded.
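As an illustration of the address information carried by a vector load instruction, the following minimal Python sketch derives the main storage address of each element from a start address and an interval. The class name, the field names, and the assumption that the interval is a constant byte stride are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VectorLoadInstruction:
    """Hypothetical model of the fields a vector load instruction carries."""
    start_address: int    # address of the first element in main storage
    stride: int           # interval between elements (assumed constant, in bytes)
    element_count: int    # number of elements of the vector data
    vector_register: int  # number of the destination vector register


def element_addresses(inst: VectorLoadInstruction) -> list[int]:
    """Return the main storage address of every element to be loaded."""
    return [inst.start_address + i * inst.stride
            for i in range(inst.element_count)]


inst = VectorLoadInstruction(start_address=0x1000, stride=8,
                             element_count=4, vector_register=3)
addresses = element_addresses(inst)
```

With these example values, the four elements are read from consecutive 8-byte slots starting at 0x1000.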

The memory access processing unit 3 controls access to the main storage device 7. It is connected to the instruction control unit 2 by the signal line 103, to the processor network unit 4 by a signal line 105, to the vector control unit 5 by a signal line 106, and to the vector processing unit 6 by a signal line 107. The memory access processing unit 3 decodes memory access instructions sent from the instruction control unit 2 through the signal line 103, manages the state of the processor network unit 4, and sends signals that control memory access requests to the processor network unit 4 through the signal line 105, thereby controlling the movement of data between the signal line 102 to the main storage device 7 and a signal line 108 to the vector processing unit 6. With regard to vector load instructions in particular, the memory access processing unit 3 manages which of the plurality of load buffers provided in the vector processing unit 6 are free. When it receives a vector load instruction from the instruction control unit 2 through the signal line 103, it allocates one free load buffer to that vector load instruction and manages it as in use, issues a memory access request accompanied by a buffer number that uniquely identifies the allocated load buffer to the processor network unit 4 through the signal line 105, and at the same time notifies the vector control unit 5 through the signal line 106 of which buffer number has been allocated to which vector load instruction. When it receives a buffer release notification specifying a buffer number from the vector processing unit 6 through the signal line 107, the memory access processing unit 3 again manages the load buffer with that buffer number as free.
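The free/in-use bookkeeping for load buffers described above can be sketched as follows. This is only an illustrative Python model with hypothetical names. The specification does not say what happens when no load buffer is free, so returning `None` (leaving the caller to wait) is an assumption.

```python
class LoadBufferPool:
    """Hypothetical model of load buffer free/in-use management."""

    def __init__(self, count: int) -> None:
        self.free = list(range(count))  # buffer numbers currently free
        self.in_use: set[int] = set()   # buffer numbers currently allocated

    def allocate(self):
        """Allocate one free buffer and return its number, or None if none is free."""
        if not self.free:
            return None  # assumption: the caller waits until a buffer is released
        number = self.free.pop(0)
        self.in_use.add(number)
        return number

    def release(self, number: int) -> None:
        """Handle a buffer release notification for the given buffer number."""
        if number not in self.in_use:
            raise ValueError(f"buffer {number} is not in use")
        self.in_use.remove(number)
        self.free.append(number)


pool = LoadBufferPool(2)
first = pool.allocate()
second = pool.allocate()
third = pool.allocate()   # no free buffer left at this point
pool.release(first)       # release notification for the first buffer
fourth = pool.allocate()  # the released buffer can be allocated again
```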

The processor network unit 4 is connected to the main storage device 7 by the signal line 102, to the memory access processing unit 3 by the signal line 105, and to the vector processing unit 6 by the signal line 108. It exchanges vector data between the main storage device 7 and the vector processing unit 6 in response to memory access requests from the memory access processing unit 3. For a memory access request arising from a vector load instruction, the processor network unit 4 attaches the buffer number that accompanies the memory access request to each element of the vector data read from the main storage device 7 and sends the element to the vector processing unit 6 through the signal line 108. The signal line 108 is multiplexed so that vector data for a plurality of memory access requests can be supplied to the vector processing unit 6 in parallel.

The vector processing unit 6 is connected to the memory access processing unit 3 by the signal line 107, to the processor network unit 4 by the signal line 108, and to the vector control unit 5 by signal lines 109 and 110, and has the function of executing vector operations on vector data. The vector processing unit 6 includes at least one vector pipeline arithmetic unit 61. The vector pipeline arithmetic unit 61 includes a plurality of vector registers 62 that store vector data, one or more vector arithmetic units 63 that perform vector operations on the vector data stored in the vector registers 62, a plurality of load buffers 64 that temporarily store vector data read from the main storage device 7, a vector load management unit 65, and a crossbar switch circuit 66 that distributes the vector data output from the vector arithmetic units 63 and the load buffers 64 to the vector registers 62. Components related to stores, such as a store buffer for storing the results of vector operations in the main storage device 7, are not directly related to the present invention and are omitted from the drawing.

The vector load management unit 65 is connected to the memory access processing unit 3 by the signal line 107, to the processor network unit 4 by the signal line 108, and to the vector control unit 5 by signal lines 109 to 111. It first stores each vector data element received through the signal line 108 in the load buffer 64 identified by the buffer number attached to the element, and later transfers the vector data stored in the load buffer 64 to a vector register 62. FIG. 2 shows a configuration example of the vector load management unit 65.

Referring to FIG. 2, one example of the vector load management unit 65 includes a register group 651, a register setting unit 652, a write unit 653, and a read unit 654.

The register group 651 is a set of registers 651-0 to 651-n that correspond one-to-one to the load buffers 64. Each register 651-i (i = 0 to n) has a load buffer number field 6511, a vector register (VAR) number field 6512, a resource check flag field 6513, a write element count field 6514, and a read element count field 6515. The load buffer number field 6511 holds the fixed buffer number of the corresponding load buffer 64. The vector register number field 6512 is set to the number of the vector register 62 that is to store the vector data loaded by the vector load instruction. The resource check flag field 6513 is set to a resource check flag that indicates whether the vector register 62 used by the vector load instruction conflicts with a preceding vector instruction. The write element count field 6514 and the read element count field 6515 are initially set to the number of elements of the vector data loaded by the vector load instruction. The write element count in field 6514 is decremented as vector data elements are written to the load buffer 64 and reaches 0 when all elements have been written to the load buffer 64. The read element count in field 6515 is decremented as elements are read from the load buffer 64 to the vector register 62 and reaches 0 when all elements have been read from the load buffer 64.

The register setting unit 652 is connected to the vector control unit 5 by the signal line 109 and performs initial setting and similar operations on the register group 651. When vector load instruction information containing a load buffer number, a vector register number, and an element count is sent from the vector control unit 5 through the signal line 109, the register setting unit 652 selects the register 651-i whose load buffer number field 6511 holds that load buffer number. It sets the vector register number from the vector load instruction information in the vector register number field 6512, sets the element count from the vector load instruction information in the write element count field 6514 and the read element count field 6515, and sets the resource check flag field 6513 to the flag value 1, which indicates that a resource conflict exists. When a resource check OK signal specifying a load buffer number is sent from the vector control unit 5 through the signal line 109, the register setting unit 652 rewrites the flag in the resource check flag field 6513 of the register 651-i whose load buffer number field 6511 holds that load buffer number to the value 0, which indicates that no resource conflict exists.
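The register fields and the two operations of the register setting unit can be modeled in software roughly as follows. This is a hedged Python sketch, not a description of the actual hardware. The class and method names are hypothetical, and `None` stands in for the uninitialized state of fields 6512 to 6515.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoadBufferEntry:
    """Hypothetical software model of one register 651-i."""
    buffer_number: int                         # field 6511 (fixed)
    vector_register: Optional[int] = None      # field 6512
    resource_check_flag: Optional[int] = None  # field 6513: 1 = conflict, 0 = none
    write_count: Optional[int] = None          # field 6514
    read_count: Optional[int] = None           # field 6515

    def set_load_info(self, vector_register: int, element_count: int) -> None:
        """Initial setting on receipt of vector load instruction information."""
        self.vector_register = vector_register
        self.write_count = element_count
        self.read_count = element_count
        self.resource_check_flag = 1  # start out as "conflict exists"

    def resource_check_ok(self) -> None:
        """Handle a resource check OK signal for this load buffer."""
        self.resource_check_flag = 0


def find_entry(entries, buffer_number):
    """Select the entry whose load buffer number field matches."""
    return next(e for e in entries if e.buffer_number == buffer_number)


entries = [LoadBufferEntry(buffer_number=i) for i in range(4)]
entry = find_entry(entries, 2)
entry.set_load_info(vector_register=5, element_count=64)
flag_after_setup = entry.resource_check_flag
entry.resource_check_ok()
```

Starting the flag at 1 means a transfer cannot begin until the vector control unit has explicitly reported that no conflict remains.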

The write unit 653 is connected to the processor network unit 4 by the signal line 108 and writes vector data to the load buffers 64. When the write unit 653 receives a vector data element from the processor network unit 4 through the signal line 108, it writes the element to the load buffer 64 identified by the load buffer number attached to the element, and decrements the value of the write element count field 6514 of the register 651-i whose load buffer number field 6511 holds that load buffer number by the number of elements written.

The read unit 654 is connected to the vector control unit 5 by the signal lines 109 and 110 and transfers vector data from the load buffers 64 to the vector registers 62. For each register 651-i whose write element count field 6514 has been initialized to the total element count by the register setting unit 652, the read unit 654 monitors whether the conditions are satisfied for transferring the data in the load buffer 64 identified by the buffer number in the load buffer number field 6511 of that register to the vector register 62 identified by the number in its vector register number field 6512. The read unit 654 determines that the transfer is possible when two requirements are met: (1) all elements of the vector data have been stored in the load buffer 64 (the value of the write element count field 6514 is 0), and (2) the vector register does not conflict with a preceding vector instruction (the flag value in the resource check flag field 6513 is 0).
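The two-part transfer condition can be expressed as a small predicate. The Python sketch below is illustrative only, and the dictionary keys and example values are hypothetical. It also shows how a later load can become ready before an earlier one.

```python
def can_transfer(write_count, resource_check_flag) -> bool:
    """True when (1) all elements are in the load buffer and (2) no conflict remains."""
    return write_count == 0 and resource_check_flag == 0


# Entries listed in instruction order (hypothetical example state).
entries = [
    {"buffer": 0, "write_count": 3, "flag": 0},  # still waiting for 3 elements
    {"buffer": 1, "write_count": 0, "flag": 1},  # data complete, register conflict
    {"buffer": 2, "write_count": 0, "flag": 0},  # both requirements met
]

ready = [e["buffer"] for e in entries
         if can_transfer(e["write_count"], e["flag"])]
```

Here only the load using buffer 2 is ready, so it can start its transfer ahead of the two earlier loads.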

When the read unit 654 determines that the transfer is possible for a register 651-i, it sends a transfer start notification, specifying the buffer number in the load buffer number field 6511 and the vector register number in the vector register number field 6512 of that register, to the vector control unit 5 through the signal line 110. It then sequentially reads the data stored in the load buffer 64 identified by the buffer number in the load buffer number field 6511 and writes it through the crossbar switch circuit 66 to the vector register 62 identified by the number in the vector register number field 6512. Each time one element of the vector data is read from the load buffer 64, the value of the read element count field 6515 is decremented by 1. When the value of the read element count field 6515 reaches 0, the transfer is complete. The read unit 654 then sends a transfer end notification, specifying the buffer number in the load buffer number field 6511 and the vector register number in the vector register number field 6512 of that register, to the vector control unit 5 through the signal line 110. At the same time, it sends a buffer release notification specifying the buffer number in the load buffer number field 6511 to the memory access processing unit 3 through the signal line 107, and initializes fields 6512 to 6515 of the register 651-i to, for example, NULL.
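The transfer sequence (start notification, element-by-element copy with a decrementing read count, end notification, and buffer release) can be sketched as follows. This Python model is a simplification under stated assumptions: notifications are recorded as tuples in a list rather than sent over signal lines, the load buffer is a plain list, and all names are hypothetical.

```python
def transfer(buffer_number, vector_register, load_buffer, vector_registers):
    """Copy a load buffer into a vector register and return the notifications sent."""
    events = [("transfer_start", buffer_number, vector_register)]
    read_count = len(load_buffer)  # assumption: initialized to the element count
    destination = vector_registers[vector_register]
    index = 0
    while read_count > 0:
        destination[index] = load_buffer[index]  # one element per step
        index += 1
        read_count -= 1                          # decrement per element read
    # The read count has reached 0, so the transfer is complete.
    events.append(("transfer_end", buffer_number, vector_register))
    events.append(("buffer_release", buffer_number))
    return events


vector_registers = {5: [0, 0, 0]}
events = transfer(buffer_number=1, vector_register=5,
                  load_buffer=[10, 20, 30],
                  vector_registers=vector_registers)
```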

Referring again to FIG. 1, the vector control unit 5 is connected to the instruction control unit 2 by the signal line 104, to the memory access processing unit 3 by the signal line 106, and to the vector processing unit 6 by the signal lines 109 to 111, and controls the issuing of vector instructions. The vector control unit 5 divides the execution cycle of a vector instruction into a plurality of stages, such as instruction decoding, operand calculation and fetch, and instruction issue, and processes a plurality of vector instructions in parallel by having independent hardware perform the processing of each stage. For this purpose, between an instruction decode unit 51, which decodes vector instructions input from the instruction control unit 2 through the signal line 104, and a register 52 for the instruction issue stage, which is the final stage, the vector control unit 5 has several instruction registers 53, 54, and 55 that hold decode information and the like for the instructions in each intermediate stage, and decoders 56 and 57 that execute the processing of the intermediate stages. The number of stages depends on the processor architecture. An instruction issuing unit 58 checks whether the instruction stored in the register 52 of the instruction issue stage can be issued. If it cannot be issued, the instruction issuing unit 58 waits until it can. If it can be issued, the instruction issuing unit 58 notifies the vector processing unit 6 of the instruction issue through the signal line 111 and at the same time passes along the accompanying information needed for processing. A busy flag group 59 holds, for each resource in the vector pipeline arithmetic unit 61 of the vector processing unit 6, such as the vector registers 62 and the vector arithmetic units 63, a busy flag that indicates whether that resource is busy. Whether the instruction stored in the register 52 can be issued is checked on the basis of whether the busy flags of the resources used by that instruction indicate that they are not busy.
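The issue check against the busy flags amounts to testing that none of the resources an instruction needs is busy. The following Python sketch illustrates this idea. The resource names and the dictionary representation of the busy flag group are hypothetical.

```python
def can_issue(required_resources, busy_flags) -> bool:
    """An instruction can be issued only if none of its resources is busy."""
    return not any(busy_flags[resource] for resource in required_resources)


# Hypothetical busy flag state: one flag per vector register or arithmetic unit.
busy_flags = {"VR0": False, "VR1": True, "ALU0": False}

issue_a = can_issue(["VR0", "ALU0"], busy_flags)  # all required resources idle
issue_b = can_issue(["VR1", "ALU0"], busy_flags)  # VR1 is busy, so it must wait
```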

In this embodiment, for a vector load instruction, once all elements of the vector data read from the main storage device 7 have been stored in the load buffer 64 and the resources used by the vector load instruction do not conflict with the resources used by a preceding vector instruction, the transfer from the load buffer 64 to the vector register 62 is started promptly, without waiting for the vector load instruction to reach the register 52 of the instruction issue stage. The write/load counter unit 5A, the read counter unit 5B, and the load instruction information management unit 5C of the vector control unit 5 are provided for this control.

The write/load counter unit 5A manages which vector registers 62 are used for writing or loading by vector instructions in flight in the vector control unit 5, and how many instructions use each of them. A configuration example is shown in the block of the write/load counter unit 5A in FIG. 3. The write/load counter unit 5A in this example has a counter 5AC made up of count units 5A-0 to 5A-m that correspond one-to-one to the vector registers 62 in the vector processing unit 6. The initial value of every count unit 5A-0 to 5A-m of the counter 5AC is 0. When the instruction decoded by the instruction decode unit 51 is an instruction that writes to a vector register 62 or an instruction that loads data into a vector register 62, the counter 5AC increments the value of the count unit 5A-j (j = 0 to m) corresponding to that vector register 62 by 1, on the basis of the information output from the decode unit 51 about the vector register 62 used by the instruction. When processing of the instruction in the register 52 of the instruction issue stage finishes and that instruction is one that writes to a vector register 62 or loads data into a vector register 62, the instruction issuing unit 58 notifies the write/load counter unit 5A of the number of the vector register 62 at the time it resets the busy flag of the vector register 62 used by the instruction, and the counter 5AC decrements the value of the count unit 5A-j corresponding to the notified vector register 62 by 1.

The read counter unit 5B tracks which vector registers 62 are read by the vector instructions in flight in the vector control unit 5, and how many such instructions read each register. A configuration example is shown in the block of the read counter unit 5B in FIG. 3. In this example, the read counter unit 5B has a counter 5BC made up of count units 5B-0 to 5B-m, which correspond one-to-one to the vector registers 62 in the vector processing unit 6. The initial value of every count unit 5B-0 to 5B-m is 0. When the instruction decoded by the instruction decode unit 51 reads a vector register 62, the counter 5BC increments by 1 the count unit 5B-j corresponding to that vector register 62, based on the information output by the decode unit 51 about the vector register 62 the instruction uses. When processing of the instruction in the register 52 of the instruction issue stage finishes, and that instruction was one that reads a vector register 62, the instruction issue unit 58 notifies the read counter unit 5B of the number of that vector register 62 at the time it resets the register's busy flag. The counter 5BC then decrements by 1 the count unit 5B-j corresponding to the notified vector register 62.
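The behavior of the two counter units can be sketched in software as follows. This is only an illustrative model, not the circuit of FIG. 3: the class name `RegisterUseCounters` and its method names are hypothetical, and one instance stands in for counter 5AC and another for counter 5BC.

```python
class RegisterUseCounters:
    """Illustrative model of counter 5AC or 5BC: one count unit per vector register."""

    def __init__(self, num_registers):
        # Every count unit starts at 0, as stated in the description.
        self.counts = [0] * num_registers

    def on_decode(self, reg):
        # Called when a decoded instruction uses vector register `reg`.
        self.counts[reg] += 1

    def on_complete(self, reg):
        # Called when the issue unit resets the busy flag of `reg`.
        if self.counts[reg] == 0:
            raise ValueError("completion without a matching decode")
        self.counts[reg] -= 1


# Hypothetical instruction stream: a write to V1, a read of V2, then a load into V1.
write_load = RegisterUseCounters(8)  # stands in for counter 5AC
read = RegisterUseCounters(8)        # stands in for counter 5BC
write_load.on_decode(1)
read.on_decode(2)
write_load.on_decode(1)
```

After these three decodes, the write/load count for V1 is 2 and the read count for V2 is 1; each later completion notification lowers the matching count by one.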

The load instruction information management unit 5C manages the vector load instructions in flight in the vector control unit 5 and passes the necessary information to the vector load management unit 65 of the vector processing unit 6. A configuration example is shown in the block of the load instruction information management unit 5C in FIG. 3. In this example, the load instruction information management unit 5C includes a register group 5C1, a resource information registration unit 5C2, and a resource information check unit 5C3.

The register group 5C1 is a set of registers 5C1-0 to 5C1-n that correspond one-to-one to the load buffers 64 in the vector processing unit 6. Each register 5C1-i (i = 0 to n) has a load buffer number field 5C11, a vector register number field 5C12, an element count field 5C13, a write/load counter value field 5C14, and a read counter value field 5C15. The load buffer number field 5C11 holds the buffer number of a load buffer 64 and is fixed. The vector register number field 5C12 is set to the number of the vector register 62 that will receive the vector data loaded by the vector load instruction. The element count field 5C13 is set to the number of elements of the vector data loaded by the vector load instruction. The write/load counter value field 5C14 is set to the number of vector instructions that use the vector register 62 named in field 5C12 for writing or loading. The read counter value field 5C15 is set to the number of vector instructions that use the vector register 62 named in field 5C12 for reading.

The resource information registration unit 5C2 takes as inputs the output of the instruction register 53, the output of the write/load counter unit 5A, and the output of the read counter unit 5B. When the vector instruction decoded by the instruction decode unit 51 is a vector load instruction, the instruction register 53 holds, in addition to the decode information from the instruction decode unit 51, the number of the load buffer 64 assigned to that vector load instruction, as notified by the memory access processing unit 3 through the signal line 106. When the resource information registration unit 5C2 receives a load buffer number from the instruction register 53, it selects the register 5C1-i whose load buffer number field 5C11 holds that number and sets its vector register number field 5C12 and element count field 5C13 to the vector register number and element count output from the instruction register 53. It also copies into the write/load counter value field 5C14 the content of the count unit, among 5A-0 to 5A-m of the counter 5AC, that corresponds to the vector register number set in field 5C12, and copies into the read counter value field 5C15 the content of the count unit, among 5B-0 to 5B-m of the counter 5BC, that corresponds to the same vector register number. It then outputs vector load instruction information to the vector load management unit 65 through the signal line 109. This information contains the load buffer number in field 5C11, the vector register number set in field 5C12, and the element count set in field 5C13.

The resource information check unit 5C3 updates the write/load counter value field 5C14 and the read counter value field 5C15 of the register group 5C1 in response to notifications from the instruction issue unit 58. At the point where the vector register no longer conflicts with preceding vector instructions, it sends a resource check OK signal specifying the load buffer number to the vector load management unit 65 through the signal line 109. Specifically, when the instruction issue unit 58 finishes processing the instruction in the register 52 of the instruction issue stage, and that instruction was one that writes to or loads data into a vector register 62, it notifies the load instruction information management unit 5C of the number of that vector register 62 at the time it resets the register's busy flag. The resource information check unit 5C3 then subtracts 1 from the write/load counter value field 5C14 of the register whose vector register number field 5C12 holds the notified number. Likewise, when the finished instruction was one that reads a vector register 62, the instruction issue unit 58 notifies the load instruction information management unit 5C of the number of that vector register 62 at the time it resets the register's busy flag, and the resource information check unit 5C3 subtracts 1 from the read counter value field 5C15 of the register whose vector register number field 5C12 holds the notified number. When the write/load counter value field 5C14 is 1 and the read counter value field 5C15 is 0, the resource information check unit 5C3 sends a resource check OK signal, specifying the load buffer number held in field 5C11 of that register, to the vector load management unit 65 through the signal line 109.
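The registration and check steps can be sketched as a small software model. This is an assumption-laden illustration rather than the hardware described: the names `LoadEntry`, `LoadInfoManager`, `register`, `on_complete`, and the `ok_sent` flag are hypothetical, and the list `ok_signals` stands in for the resource check OK signals sent on signal line 109.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LoadEntry:
    """Illustrative model of one register 5C1-i."""
    load_buffer: int             # field 5C11 (fixed)
    vreg: Optional[int] = None   # field 5C12
    elements: int = 0            # field 5C13
    wl_count: int = 0            # field 5C14
    rd_count: int = 0            # field 5C15
    ok_sent: bool = True         # hypothetical: no load pending until registered


class LoadInfoManager:
    """Illustrative model of units 5C2 (register) and 5C3 (check)."""

    def __init__(self, num_buffers):
        self.entries: List[LoadEntry] = [LoadEntry(i) for i in range(num_buffers)]
        self.ok_signals: List[int] = []  # load buffer numbers reported as OK

    def register(self, load_buffer, vreg, elements, wl_snapshot, rd_snapshot):
        # Copy the decode information and the current counter values into the entry.
        e = self.entries[load_buffer]
        e.vreg, e.elements = vreg, elements
        e.wl_count, e.rd_count = wl_snapshot, rd_snapshot
        e.ok_sent = False
        self._check(e)

    def on_complete(self, vreg, wrote):
        # A preceding instruction using `vreg` finished; wrote=True for write/load.
        for e in self.entries:
            if e.vreg == vreg and not e.ok_sent:
                if wrote:
                    e.wl_count -= 1
                else:
                    e.rd_count -= 1
                self._check(e)

    def _check(self, e):
        # Only the load itself still counts as a writer, and no reader remains.
        if not e.ok_sent and e.wl_count == 1 and e.rd_count == 0:
            e.ok_sent = True
            self.ok_signals.append(e.load_buffer)


mgr = LoadInfoManager(num_buffers=4)
mgr.register(load_buffer=0, vreg=1, elements=100, wl_snapshot=2, rd_snapshot=1)
mgr.on_complete(vreg=1, wrote=True)    # the earlier writer of V1 finishes
mgr.on_complete(vreg=1, wrote=False)   # the earlier reader of V1 finishes
```

In this run the OK signal for load buffer 0 is produced only after both preceding users of V1 have completed, which is the condition 5C14 = 1 and 5C15 = 0.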

Referring again to FIG. 1, when the load buffer number management unit 5D receives a transfer start notification from the vector load management unit 65 through the signal line 110, it holds the number of the load buffer 64 contained in that notification. The instruction issue unit 58 consults the load buffer number management unit 5D when it processes a vector load instruction stored in the instruction register 52 of the instruction issue stage. A load buffer number stored in the load buffer number management unit 5D is deleted by the instruction issue unit 58 when the vector load instruction that uses that load buffer is stored in the instruction register 52 of the instruction issue stage.

As described above, the busy flag group 59 records, for each resource such as the vector arithmetic unit 63 and the vector registers 62, whether that resource is busy. Basically, when the instruction issue unit 58 issues the instruction stored in the instruction register 52 of the instruction issue stage, it marks the resources used by that instruction as busy, and it clears the busy state when execution of the instruction ends. In this embodiment, however, the transfer of the contents of a load buffer 64 to a vector register 62 is started at the discretion of the vector load management unit 65, so the busy flag group 59 is also updated by the transfer start and transfer end notifications sent from the vector load management unit 65 through the signal line 110. Specifically, on receiving a transfer start notification through the signal line 110, the busy flag group 59 marks as busy the vector register identified by the vector register number in the notification. On receiving a transfer end notification through the signal line 110, it clears the busy state of the vector register identified by the vector register number in the notification.

Next, the operation of the vector processing apparatus according to this embodiment is described, focusing on the processing of vector load instructions.

When the instruction read from the main storage device 7 through the signal line 101 is a vector load instruction, the instruction decoding unit 21 of the instruction control unit 2 sends that vector load instruction to the memory access processing unit 3 through the signal line 103 and, at the same time, to the vector control unit 5 through the signal line 104. For convenience in the following description, this vector load instruction is called VLD1 and has the following parameters: start address X in the main storage device 7, interval D between the vector data elements to be accessed, element count 100, and vector register number 1 as the destination of the load.

The memory access processing unit 3 reserves one free load buffer from among the plurality of load buffers 64. Specifying the number of the reserved load buffer, it issues the memory access request for the vector load instruction VLD1 received from the instruction control unit 2 to the processor network unit 4 through the signal line 105, and at the same time notifies the vector control unit 5 of the reserved load buffer number through the signal line 106. Here the reserved load buffer number is assumed to be 0.

The instruction decode unit 51 of the vector control unit 5 decodes the vector load instruction VLD1 and stores in the instruction register 53 the decode information, such as vector register number 1 and element count 100, together with load buffer number 0 as notified by the memory access processing unit 3 through the signal line 106. Because the instruction is a vector load instruction, the instruction decode unit 51 also increments by 1 the count unit 5A-1, which corresponds to vector register number 1 in the counter 5AC of the write/load counter unit 5A. Next, the content of the instruction register 53 is stored in the instruction register 54 and, at the same time, output to the load instruction information management unit 5C. Because load buffer number 0 has been output from the instruction register 53, the resource information registration unit 5C2 of the load instruction information management unit 5C selects the register 5C1-0, whose load buffer number field 5C11 holds 0. It sets the vector register number field 5C12 and the element count field 5C13 of that register to vector register number 1 and element count 100 as output from the instruction register 53. It sets the write/load counter value field 5C14 to the value of the count unit 5A-1, which corresponds to vector register number 1 in the counter 5AC of the write/load counter unit 5A, and sets the read counter value field 5C15 to the value of the count unit 5B-1, which corresponds to vector register number 1 in the counter 5BC of the read counter unit 5B. The resource information registration unit 5C2 then sends vector load instruction information containing load buffer number 0, vector register number 1, and element count 100 to the vector load management unit 65 through the signal line 109. In the register 651-0, whose load buffer number field 6511 holds 0, the register setting unit 652 of the vector load management unit 65 sets the vector register number field 6512 to 1, sets the write element count field 6514 and the read element count field 6515 to 100, and sets the flag in the resource check flag field 6513 to 1.

The values that the write/load counter value field 5C14 and the read counter value field 5C15 of the register 5C1-0 take at this point depend on the kind and number of in-flight vector instructions that precede the vector load instruction VLD1. They fall broadly into the following two cases.

(1) Case 1: The vector register used by the vector load instruction VLD1 does not conflict with any vector register used by a preceding instruction. In this case the write/load counter value field 5C14 is 1, which accounts for VLD1 itself as counted at decode, and the read counter value field 5C15 is 0.
(2) Case 2: The vector register used by the vector load instruction VLD1 conflicts with a vector register used by a preceding instruction. In this case, if a is the number of in-flight vector instructions that write to the vector register used by VLD1, the write/load counter value field 5C14 is a + 1, and if b is the number of in-flight vector instructions that read that vector register, the read counter value field 5C15 is b. Because the memory access processing unit 3 never assigns the same load buffer 64 to more than one vector load instruction at the same time, the value of the write/load counter value field 5C14 is not affected even if a vector load instruction preceding VLD1 is in flight.
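The two cases reduce to simple arithmetic on a and b. The sketch below is illustrative only; the function names `initial_fields` and `classify` are hypothetical and do not appear in the description.

```python
def initial_fields(a, b):
    """Return (field 5C14, field 5C15) at registration time.

    a: in-flight preceding instructions that write the load's vector register
    b: in-flight preceding instructions that read the load's vector register
    The extra 1 in the first value is the vector load instruction itself.
    """
    return a + 1, b


def classify(wl_value, rd_value):
    # Case 1 is exactly the no-conflict state; anything else is case 2.
    return "case 1" if (wl_value, rd_value) == (1, 0) else "case 2"


no_conflict = initial_fields(a=0, b=0)
with_conflict = initial_fields(a=2, b=1)
```

With no preceding users the fields are (1, 0), so the load is immediately free of conflicts; with two writers and one reader ahead of it the fields are (3, 1) and the load has to wait for three completions.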

In case 1, the write/load counter value field 5C14 of the register 5C1-0 registered by the resource information registration unit 5C2 is 1 and the read counter value field 5C15 is 0. The resource information check unit 5C3 of the load instruction information management unit 5C therefore determines that the resources of the vector load instruction VLD1 do not conflict with any other in-flight vector instruction, and it sends a resource check OK signal specifying load buffer number 0 to the vector load management unit 65 through the signal line 109. In case 2, the resource information check unit 5C3 watches for the write/load counter value field 5C14 of the register 5C1-0 to become 1 and the read counter value field 5C15 to become 0. Each time an in-flight vector instruction preceding VLD1 finishes executing in the vector control unit 5, the resource information check unit 5C3 decrements the write/load counter value field 5C14 or the read counter value field 5C15 in response to the notification from the instruction issue unit 58. When the write/load counter value field 5C14 of the register 5C1-0 becomes 1 and the read counter value field 5C15 becomes 0, the resource information check unit 5C3 determines that the resources of VLD1 no longer conflict with any other in-flight vector instruction, and it sends a resource check OK signal specifying load buffer number 0 to the vector load management unit 65 through the signal line 109.

The register setting unit 652 of the vector load management unit 65 changes from 1 to 0 the flag in the resource check flag field 6513 of the register 651-0, whose load buffer number field 6511 holds the notified load buffer number 0.

In response to the memory access request issued by the memory access processing unit 3, the processor network unit 4 reads in sequence the element at start address X in the main storage device 7, the element at address X + D, and so on up to the last element, and sends each one, tagged with load buffer number 0, to the vector load management unit 65 over the signal line 108. The write unit 653 of the vector load management unit 65 stores each received element in the load buffer 64 numbered 0 and decrements the write element count field 6514 of the register 651-0, whose load buffer number field 6511 holds 0. When the data of all 100 elements has been stored in the load buffer 64 numbered 0, the write element count field 6514 is 0.
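The strided read and the countdown of the write element count can be sketched as follows. This is a minimal model under stated assumptions: the function `strided_addresses`, the class `BufferFill`, and the example values 0x1000 and 8 for X and D are all hypothetical, and addresses are treated as plain integers.

```python
def strided_addresses(start, interval, count):
    """Addresses X, X + D, X + 2D, ... for `count` elements."""
    return [start + i * interval for i in range(count)]


class BufferFill:
    """Illustrative model of one load buffer plus write element count field 6514."""

    def __init__(self, elements):
        self.remaining = elements  # counts down to 0 as elements arrive
        self.data = []

    def store(self, value):
        self.data.append(value)
        self.remaining -= 1

    @property
    def complete(self):
        # All elements are in the buffer once the count reaches 0.
        return self.remaining == 0


addrs = strided_addresses(start=0x1000, interval=8, count=100)
fill = BufferFill(elements=100)
for addr in addrs:
    fill.store(addr)  # the address stands in for the element read from memory
```

The buffer reports completion only after the hundredth element has been stored, matching the point at which field 6514 reaches 0.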

When the write element count field 6514 of the register 651-0 becomes 0, the read unit 654 recognizes that all elements have been stored in the load buffer 64 and checks whether the flag in the resource check flag field 6513 is 0 or 1. If it is 1, the read unit waits for it to become 0. Once the flag is 0, the read unit sends a transfer start notification specifying load buffer number 0 and vector register number 1 to the vector control unit 5 through the signal line 110, immediately if the path to the crossbar switch circuit 66 is free, or as soon as the path becomes free otherwise. The load buffer number management unit 5D of the vector control unit 5 holds load buffer number 0, and the busy flag group 59 marks the vector register numbered 1 as busy. At the same time as it sends the transfer start notification, the read unit 654 begins transferring the contents of the load buffer 64 numbered 0 to the vector register 62 numbered 1 through the crossbar switch circuit 66.

Each time it transfers one element, the read unit 654 decrements by 1 the read element count field 6515 of the register 651-0. When the read element count field 6515 reaches 0, it ends the transfer and sends a transfer end notification specifying load buffer number 0 and vector register number 1 to the vector control unit 5 through the signal line 110. At the same time, it sends a buffer release notification specifying load buffer number 0 to the memory access processing unit 3 through the signal line 107. The busy flag group 59 of the vector control unit 5 clears the busy state of the vector register numbered 1, and the memory access processing unit 3 marks the notified load buffer numbered 0 as free.
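The start condition and the element-by-element transfer can be sketched as follows. The sketch is illustrative and makes several assumptions: `TransferEntry`, `can_start`, and `transfer` are hypothetical names, vector registers are modeled as Python lists, and the notifications on signal lines 110 and 107 are modeled as tuples in a returned event list.

```python
from dataclasses import dataclass


@dataclass
class TransferEntry:
    """Illustrative model of one register 651-i."""
    load_buffer: int      # field 6511
    vreg: int             # field 6512
    check_pending: int    # field 6513: 1 until the resource check OK arrives
    write_remaining: int  # field 6514
    read_remaining: int   # field 6515


def can_start(entry, path_free):
    # All elements buffered, resource check passed, and crossbar path available.
    return entry.write_remaining == 0 and entry.check_pending == 0 and path_free


def transfer(entry, buffer, vregs):
    """Move the buffered elements into the vector register, one per step."""
    events = [("start", entry.load_buffer, entry.vreg)]
    index = 0
    while entry.read_remaining > 0:
        vregs[entry.vreg].append(buffer[index])
        index += 1
        entry.read_remaining -= 1
    events.append(("end", entry.load_buffer, entry.vreg))
    events.append(("release", entry.load_buffer))
    return events


entry = TransferEntry(load_buffer=0, vreg=1, check_pending=0,
                      write_remaining=0, read_remaining=3)
vregs = {1: []}
events = transfer(entry, [10, 20, 30], vregs) if can_start(entry, True) else []
```

The event list mirrors the order of the notifications in the text: transfer start, then transfer end together with the buffer release.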

In the vector control unit 5, the vector load instruction VLD1 is stored in the instruction register 54, processed in each subsequent stage, and finally stored in the instruction register 52 of the instruction issue stage. The instruction issue unit 58 compares load buffer number 0, which is used by the vector load instruction VLD1 in the instruction register 52 (the number has been carried along from the instruction register 53 and is held in the instruction register 52), with the load buffer numbers stored in the load buffer number management unit 5D. If load buffer number 0 is not yet stored in the load buffer number management unit 5D, the instruction issue unit waits until it is. Once the number is stored, whether it was already there or arrived during the wait, the instruction issue unit ends processing of VLD1 without issuing it and deletes load buffer number 0 from the load buffer number management unit 5D. The instruction issue unit 58 does not issue VLD1 because the vector load management unit 65 starts the transfer to the vector register as soon as the resources used by VLD1 no longer conflict with those of preceding instructions and all elements of the vector data are stored in the load buffer. There is therefore no need for the instruction issue unit 58 to perform the issue processing for VLD1.
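The handling of a vector load instruction at the issue stage can be sketched as a single step function. This is an illustration under assumptions: the function name `issue_stage_step` is hypothetical, and the load buffer number management unit 5D is modeled as a Python set of buffer numbers whose transfers have started.

```python
def issue_stage_step(load_buffer, started_buffers):
    """Process a vector load at the issue stage.

    Returns False while the instruction must keep waiting, and True once it
    can be retired without being issued. `started_buffers` models unit 5D.
    """
    if load_buffer not in started_buffers:
        return False  # transfer has not started yet, so stall at the issue stage
    # Transfer already started by the vector load management unit: retire the
    # instruction and delete its load buffer number from unit 5D.
    started_buffers.discard(load_buffer)
    return True


started = set()
first_try = issue_stage_step(0, started)   # transfer not yet started
started.add(0)                             # transfer start notification arrives
second_try = issue_stage_step(0, started)  # now the load can retire
```

The first call stalls and the second retires the instruction, leaving the set empty so that load buffer 0 can later be tracked for a different load.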

Thus, according to this embodiment, the transfer from the load buffer 64 numbered 0 to the vector register 62 numbered 1 can start without waiting for the vector load instruction VLD1 to arrive at the instruction register 52 of the instruction issue stage. This allows subsequent vector instructions that use the vector register 62 numbered 1 to be processed sooner, and it releases the load buffer 64 numbered 0 sooner, which improves processing efficiency.

The description above focused on a single vector load instruction VLD1, but the same operation takes place when several vector load instructions follow one another. In that case, in this embodiment, transfers to the vector registers are started not in the order in which the vector load instructions arrive at the instruction issue stage, but in the order in which each load has all of its elements in the load buffer and is free of resource conflicts. The vector data of a later vector load instruction can therefore be transferred to its vector register before that of an earlier one. For example, suppose the earlier vector load instruction VLD1 is waiting for a preceding in-flight vector instruction that uses its vector register to finish, or is waiting for all of its elements to be stored in the load buffer because its element count is large. If, in the meantime, all elements of the later vector load instruction VLD2 are stored in the load buffer and the vector register used by VLD2 does not conflict with the resources of preceding in-flight vector instructions, the read unit 654 of the vector load management unit 65 starts the transfer from the load buffer to the vector register for VLD2 before it does so for VLD1.
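The resulting transfer order can be illustrated with a small calculation. The sketch below is a simplification and not part of the described apparatus: the function `transfer_order` is hypothetical, the cycle numbers are invented for illustration, and contention for the crossbar path is ignored. Each load becomes ready at the later of two times, when its buffer is full and when its register conflict clears.

```python
def transfer_order(loads):
    """Order in which transfers would start.

    loads: list of (name, fill_done_cycle, conflict_clear_cycle) in program order.
    A load is ready at max(fill_done_cycle, conflict_clear_cycle); ties keep
    program order.
    """
    ready = [
        (max(fill_done, conflict_clear), position, name)
        for position, (name, fill_done, conflict_clear) in enumerate(loads)
    ]
    return [name for _, _, name in sorted(ready)]


# VLD1 comes first in program order but has many elements (buffer full at cycle 50);
# VLD2 is short and conflict-free (ready at cycle 20).
order = transfer_order([("VLD1", 50, 10), ("VLD2", 20, 0)])
```

Here VLD2 is transferred before VLD1 even though it is later in program order, which is the reordering the embodiment permits.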

FIG. 1 is a block diagram of an embodiment of the present invention.
FIG. 2 is a block diagram showing a configuration example of the vector load management unit.
FIG. 3 is a block diagram showing a configuration example of the write/load counter unit, the read counter unit, and the load instruction information management unit.

Explanation of symbols

1 … Processor
2 … Instruction control unit
3 … Memory access processing unit
4 … Processor network unit
5 … Vector control unit
6 … Vector processing unit
7 … Main storage device

Claims

A vector processing apparatus comprising: a load buffer, provided between a main storage device and a vector register, for temporarily storing vector data read from the main storage device; and resource check means, provided between the instruction decoding stage and the instruction issuing stage of a vector control unit that sequentially processes vector instructions read from the main storage device in a plurality of stages, for detecting that the vector register used by an in-flight vector load instruction no longer conflicts with a preceding vector instruction; wherein reading of vector data from the main storage device into the load buffer is started when a vector load instruction read from the main storage device is decoded, and, as soon as all elements of the vector data have been stored in the load buffer, transfer of the vector data from the load buffer to the vector register is started on condition that the resource check means has detected that the vector register used by the vector load instruction no longer conflicts with the vector register used by the preceding vector instruction.

A vector processing apparatus comprising a main storage device and a processor connected to the main storage device, wherein the processor comprises: a memory access processing unit that starts reading of vector data from the main storage device when a vector instruction read from the main storage device is a vector load instruction; a vector control unit that sequentially processes vector instructions read from the main storage device in a plurality of stages; and a vector processing unit comprising a plurality of vector registers, one or more vector arithmetic units, and a plurality of load buffers; wherein resource check means is provided between the instruction decoding stage and the instruction issuing stage of the vector control unit, the resource check means detecting that the vector register used by an in-flight vector load instruction no longer conflicts with a preceding vector instruction and notifying the vector processing unit accordingly; and wherein the vector processing unit comprises a vector load management unit that stores the vector data read from the main storage device in the load buffer allocated by the memory access processing unit and, as soon as all elements of the vector data have been stored in the load buffer, starts transfer of the vector data from the load buffer to the vector register on condition that the vector control unit has notified it that the vector register used by the vector load instruction no longer conflicts with the vector register used by the preceding vector instruction.

A vector loading method comprising the steps of:
a) when a vector load instruction read from the main storage device is decoded, allocating a load buffer for temporarily storing the vector data to be read from the main storage device by the vector load instruction, and starting the reading of the vector data from the main storage device into the load buffer;
b) detecting a first condition that all elements of the vector data have been stored in the load buffer;
c) detecting, between the instruction decoding stage and the instruction issuing stage, a second condition that the vector register used by the vector load instruction does not conflict with the vector register used by a preceding vector instruction; and
d) as soon as the first and second conditions are both satisfied, starting the transfer of the vector data from the load buffer to the vector register.