JPS61221966A

JPS61221966A - Vector instruction processor

Info

Publication number: JPS61221966A
Application number: JP6453485A
Authority: JP
Inventors: Hajime Fukuzawa; 福澤　一
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1985-03-28
Filing date: 1985-03-28
Publication date: 1986-10-02
Also published as: JPH0357499B2

Abstract

PURPOSE: To minimize idle time caused by waiting for data, by prefetching mask vector data and supplying it at high speed to a processing part from a mask vector data buffer. CONSTITUTION: A memory 10 stores a mask vector made up of plural mask vector data. A mask vector data buffer 11 successively prefetches these data from the memory 10, and a mask vector data reading part 30a reads them out. A mask vector data processing part 40 processes the mask vector data collectively. An arithmetic unit 60 is connected to the memory 10, and an address producing part 50 is connected to the processing part 40. A controller 20 controls the operation of these parts.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application]

The present invention relates to a vector instruction processing device, and more particularly to a vector instruction processing device that controls the execution of vector operations by means of a mask bit for each vector element.

[Conventional technology]

To raise the processing capability of a vector processor further, a vector instruction processing device must handle a wider range of processing types at high speed. In particular, how quickly a DO loop containing an IF statement in a FORTRAN program can be processed has been a challenge, and methods of controlling and processing operations by means of a mask vector have long been devised.

The concept of processing a DO loop by means of a mask vector is briefly explained with reference to FIG. 2. This DO loop compares the corresponding vector elements of vector operand C(I) and vector operand D(I), and replaces each vector element of C(I) that is smaller than the corresponding value of D(I) with the sum of the corresponding vector elements of vector operands A(I) and B(I). The example repeats this processing for N sets of vector elements.

First, the corresponding vector elements of vector operands C(I) and D(I) are compared, and a bit string is created whose bit is "0" when C(I) > D(I) and "1" when C(I) < D(I). This bit string is the mask vector, and each of its bits is called a mask bit. Then, according to the mask vector so created, the vector elements of vector operands A(I) and B(I) that correspond to a mask bit of value "1" are added together, and the result is written to the corresponding vector element of C(I). The other vector elements of C(I), that is, those corresponding to a mask bit of value "0", are left unchanged.
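The two steps described here (build the mask vector, then apply the operation only where the mask bit is "1") can be sketched in Python. This is an illustrative sketch, not the patent's hardware: the function names `build_mask` and `masked_add` are hypothetical, vectors are modeled as plain lists, and the case C(I) = D(I), which the text does not spell out, is assumed to produce a mask bit of "0".

```python
def build_mask(c, d):
    """Return one mask bit per element: 1 where c[i] < d[i], else 0.

    Treating the equal case as 0 is an assumption of this sketch.
    """
    return [1 if ci < di else 0 for ci, di in zip(c, d)]


def masked_add(a, b, c, mask):
    """Return a new C vector: a[i] + b[i] where mask[i] is 1, c[i] otherwise."""
    return [ai + bi if m else ci for ai, bi, ci, m in zip(a, b, c, mask)]


# Example vectors (hypothetical values chosen for illustration).
a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
c = [5, 50, 7, 70]
d = [6, 60, 6, 80]

mask = build_mask(c, d)
new_c = masked_add(a, b, c, mask)
```

Only the third element keeps its old value in this example, because it is the only one for which C(I) is not smaller than D(I).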

Operations under a mask vector are processed as described above. When the mask vector is created, providing multiple sets of identical comparators allows it to be generated in batches, one batch after another, at high speed. Therefore, to process vector operations under a mask vector at high speed, what matters is how quickly the mask vector generated in these batches can be processed and the operations executed.

For this purpose, there is a conventional vector instruction processing device, shown in FIG. 3, that skips invalid operations: at the time vector element read requests are issued, it reads only the valid vector elements indicated by the mask bits and suppresses the reading of vector elements whose operation is inhibited.

In this conventional vector instruction processing device, the mask vector, which consists of a string of as many mask bits as there are elements in the vector operands, is divided into fixed processing units. Each such unit is called mask vector data and is handled as one processing unit. So that these units can be processed one after another at high speed, the mask vector, which is created in batches ahead of the masked arithmetic processing, is first stored in a storage device. When the mask vector has been stored, or in parallel with the storing operation, an instruction to start arithmetic processing under the mask vector is issued to the vector instruction processing device. When the device accepts this instruction, the vector operand addresses (VH1, VH2 and VH3) corresponding to vector operands 1, 2 and 3 are first set in vector address registers 51, 52 and 53, respectively, and the vector inter-element distances (VI1, VI2 and VI3) corresponding to vector operands 1, 2 and 3 are set in vector inter-element distance registers 54, 55 and 56, respectively.

The control device 20 then issues a command to the mask vector data reading section 30, causing it to issue a mask vector data read request to the storage device 10. On receiving the read request, the storage device 10 reads the corresponding mask vector data and sends it to the mask vector data processing unit 40. The mask vector data processing unit 40 carries out its processing on the basis of this mask vector data.

FIG. 4 shows how mask vector data is processed and how the addresses of the vector elements corresponding to operation-instructing bits are generated. Within one vector instruction, adding the constant vector inter-element distance (VIi) to a vector operand address (VHi) gives the vector operand address of the next vector element. FIG. 4(a) shows the case where the mask vector data that has been read, of bit length (n+1), has operation-instructing bits "1" at bit 3, bit 8, ..., bit (n-3) (let their number be m), and all the other (n-m+1) bits are operation-inhibiting bits "0". Let the vector operand address (VHi,0) be the address of the vector element corresponding to bit 0 of the mask vector data. The address (VHi,3) of the vector element corresponding to bit 3 of the mask vector data is then obtained as

VHi,3 = VHi,0 + 3 × VIi

Here the multiplier 3 of the inter-element distance (VIi) is obtained by counting the operation-inhibiting bits "0" from the head (bit 0) of the mask vector data of FIG. 4(a) until the first operation-instructing bit "1" is found.

Next, to obtain the address (VHi,8) of the vector element corresponding to bit 8, the next element to be operated on, the mask vector data of FIG. 4(a) is shifted left by 3 bits, the amount of the multiplier, as shown in FIG. 4(b), and the operation-instructing bit "1" that was at bit 3 is inverted; the result is the mask vector data used next. During the shift, logic level "1" is fed to the shift-in. On the mask vector data of FIG. 4(b) obtained in this way, the operation-inhibiting bits "0" are again counted from the head until an operation-instructing bit "1" is found. This gives the new multiplier 5, and the address (VHi,8) of the vector element corresponding to bit 8 of FIG. 4(a) is obtained as

VHi,8 = VHi,3 + 5 × VIi

Repeating the above operation m times yields the mask vector data shown in FIG. 4(c), in which the operation-instructing bit "1" that was at bit (n-3) of FIG. 4(a) has been inverted and shifted to the head bit position. Performing the same operation once more on the mask vector data of FIG. 4(c) gives the multiplier 4, and the vector element address (VHi,0') obtained with it,

VHi,0' = VHi,n-3 + 4 × VIi

is the address of the vector element corresponding to the head bit of the next consecutive mask vector data.

The multiplier obtained in each operation is added to a running sum, and the operations are repeated until that sum equals the bit length (n+1) of the mask vector data that was read. When the match is detected, processing of that one mask vector data ends.
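The count, shift and invert procedure of FIG. 4 can be modeled in a few lines of Python. This is a software sketch under stated assumptions, not the circuit itself: the name `scan_mask_data` is hypothetical, the mask vector data is modeled as a list with bit 0 first, and an all-zero block (a case the text does not discuss) is assumed to yield a single multiplier equal to the block length.

```python
def scan_mask_data(mask_bits, head_addr, stride):
    """Walk one block of mask vector data.

    mask_bits: list of 0/1 values, bit 0 first (length n+1).
    head_addr: address of the element matching bit 0 (VHi,0).
    stride:    vector inter-element distance (VIi).
    Returns (addresses of elements whose mask bit is 1,
             address matching bit 0 of the next block).
    """
    width = len(mask_bits)
    window = list(mask_bits)
    addr = head_addr
    consumed = 0          # running sum of the multipliers
    addresses = []
    while True:
        # Count inhibit bits "0" from the head up to the first "1".
        # Assumption: an all-zero window counts as the full width.
        multiplier = window.index(1) if 1 in window else width
        consumed += multiplier
        addr += multiplier * stride
        if consumed == width:
            # The "1" that was found is a shifted-in bit: block finished,
            # addr now points at the head element of the next block.
            return addresses, addr
        addresses.append(addr)
        # Shift left by the multiplier, feeding "1" into the shift-in,
        # then invert the operation-instructing bit now at the head.
        window = window[multiplier:] + [1] * multiplier
        window[0] = 0


# 16-bit block (n = 15) with "1" bits at positions 3, 8 and 12 (= n - 3).
bits = [0] * 16
for pos in (3, 8, 12):
    bits[pos] = 1
element_addrs, next_head = scan_mask_data(bits, head_addr=1000, stride=4)
```

With these example values the multipliers come out as 3, 5, 4 and 4, matching the sequence described for FIG. 4, and their sum equals the block length of 16.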

Refer again to FIG. 3. On the mask vector data supplied to it, the mask vector data processing unit 40 performs the operation described above, counting the operation-inhibiting bits "0" until an operation-instructing bit "1" is found, and sends the resulting value to the address generation unit 50 as the multiplier of the vector inter-element distance VIi. In the multiplier 57, this value is multiplied by the vector inter-element distance VIi of the vector operand selected by the selector S50, producing an address difference that is sent to the address adder 58. In the address adder 58, the address difference produced by the multiplier 57 is added to the vector operand address VHi of the corresponding vector operand selected by the selector S54. This gives the address of the vector element to be operated on, and an operand request for that vector element is sent to the storage device 10.

At the same time, the value of the vector address register (51, 52 or 53) of the corresponding vector operand is updated through one of the selectors S51, S52 and S53. On receiving an operand request for a vector element, the storage device 10, in the case of a read request, reads the corresponding vector element and sends it to the arithmetic unit 60. Once the operand requests for the vector elements corresponding to operands 1, 2 and 3 have all been sent to the storage device 10 for one multiplier of the vector inter-element distance VIi, a new multiplier is supplied from the mask vector data processing unit 40 to the address generation unit 50.

When it is detected that the multiplier generated by the mask vector data processing unit 40 corresponds to the distance to the head bit position of the next consecutive mask vector data, this is reported to the control device 20 and the above series of operations ends. The vector element address generated from this final multiplier is, by command of the control device 20, not used as an operand request to the storage device 10. It is used only to update the vector address register (51, 52 or 53) to the address of the vector element corresponding to the head bit position of the next consecutive mask vector data.
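The role of the address generation unit 50 for the three operands can be sketched as follows. This is a minimal model under assumptions: `OperandRegs` and `issue_requests` are hypothetical names, each operand's address register and inter-element distance register are modeled as two integer fields, and the `is_block_end` flag stands in for the control device's decision that a multiplier only advances to the next mask vector data.

```python
from dataclasses import dataclass


@dataclass
class OperandRegs:
    address: int  # vector address register (51, 52 or 53), holds VHi
    stride: int   # inter-element distance register (54, 55 or 56), holds VIi


def issue_requests(operands, multiplier, is_block_end):
    """Apply one multiplier to every operand.

    Each address register is advanced by multiplier * stride (multiplier 57
    followed by address adder 58). The new addresses are returned as operand
    requests unless the multiplier only leads to the head of the next mask
    vector data, in which case the registers are updated and nothing is sent.
    """
    requests = []
    for op in operands:
        op.address += multiplier * op.stride
        if not is_block_end:
            requests.append(op.address)
    return requests


# Hypothetical base addresses and strides for operands 1, 2 and 3.
ops = [OperandRegs(100, 4), OperandRegs(200, 8), OperandRegs(300, 4)]
first = issue_requests(ops, multiplier=3, is_block_end=False)
last = issue_requests(ops, multiplier=5, is_block_end=True)
```

After the second call no request is produced, but the three registers already hold the addresses at which the next mask vector data will start.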

In parallel with this operation, the control device 20 sends the mask vector data reading section 30 a command to read out the next consecutive mask vector data so that sequential processing can proceed, and processing continues.

[Problem that the invention seeks to solve]

In the conventional vector instruction processing device described above, the mask vector data processing unit waits until the processing of one mask vector data has finished before the new mask vector data to be processed next is fetched from the storage device. The mask vector data processing unit therefore cannot do any work until that mask vector data has been taken out of the storage device and delivered to it. The arithmetic unit is left idle as a result, and the access latency of the storage device causes a marked loss of performance.

[Means for solving problems]

According to the present invention, there is provided a vector instruction processing device of the following kind. It is a data processing device that includes a storage device storing a mask vector composed of a plurality of mask vector data, a mask vector data processing unit that processes the mask vector data collectively, and a mask vector data reading section that reads out the mask vector data, and that controls the execution of vector operations by means of a mask bit for each vector element. The device is characterized in that:

the mask vector data reading section has a mask vector data address register that holds the address of the mask vector data being sequentially processed, a mask vector data read data width register that holds the read data width value of the mask vector data, and an address adder that creates, from the mask vector data address and the read data width value, the address in the storage device of the mask vector data that lies ahead and is to be processed next in sequence;

the device includes a mask vector data buffer that prefetches the mask vector data from the storage device one after another and stores it; and

when the mask vector data processing unit requires mask vector data, the corresponding mask vector data is supplied to the mask vector data processing unit from the mask vector data buffer.

[Example]

Next, an embodiment of the present invention is described with reference to the drawings.

FIG. 1 shows a vector instruction processing device according to one embodiment of the present invention. In FIG. 1, this embodiment includes a storage device 10 that stores a mask vector composed of a plurality of mask vector data; a mask vector data buffer 11 that prefetches the mask vector data from the storage device 10 one after another and stores it; a mask vector data reading section 30a that reads out the mask vector data; and a mask vector data processing unit 40 that processes the mask vector data collectively.

The storage device 10 holds the mask vector composed of a plurality of mask vector data, the vector elements of the vector operands, and so on. The mask vector data buffer 11 is a dedicated data buffer that holds a plurality of mask vector data. The mask vector data reading section 30a reads out, ahead of time, the mask vector data that is to be processed in sequence. It consists of a selector S30, a mask vector data address register 31 that holds the address of mask vector data, a mask vector data read data width register 32 (hereinafter "read data width register") that holds the read data width value of the mask vector data, and an address adder 33 that adds the output signal of the address register 31 and the output signal of the read data width register 32.

This embodiment further has an arithmetic unit 60 connected to the storage device 10, an address generation unit 50 connected to the mask vector data processing unit 40, a read pointer 70 for the mask vector data buffer 11 (hereinafter "mask vector data pointer"), and a control device 20 that controls these and the other parts.

When this embodiment accepts an instruction to start arithmetic processing under a mask vector, it sets the initial values. The address of the first mask vector data to be processed is set in the mask vector data address register 31 through the selector S30, and an initial addend of "0" is set in the read data width register 32. As soon as the initial values are set, on command of the control device 20, the initial value of the mask vector data address register 31 and the initial addend "0" of the read data width register 32 are added in the address adder 33. This produces the address of the first mask vector data to be processed, and a read request for that mask vector data is issued to the storage device 10.

At the same time, the control device 20 sets the read data width value of the mask vector data in the read data width register 32. From then on, the read data width register 32 holds this value. The read data width value can be chosen freely for each vector instruction processing device so that it matches the length of one unit of data read from the storage device 10.

On receiving the mask vector data read request, the storage device 10 reads the target mask vector data from the given address and sends it to the mask vector data buffer 11. The mask vector data buffer 11 holds the mask vector data until the mask vector data processing unit 40 uses it.

In this way, the embodiment completes the fetch of the first mask vector data to be processed.

The mask vector is stored contiguously in the storage device 10. Mask vector data is that mask vector divided into fixed processing units, and the size of a processing unit can be made to match the read data width value of the mask vector data. Therefore, adding the read data width value repeatedly to the address of one mask vector data generates the addresses of the mask vector data that lie ahead.

Once the read request for the first mask vector data has been issued, the control device 20 goes on to read the next consecutive mask vector data from the storage device 10. The value of the mask vector data address register 31 and the read data width value held in the read data width register 32 are added in the address adder 33 to produce the address of the next consecutive mask vector data, and a read request for it is sent to the storage device 10. At the same time, the control device 20 updates the value of the mask vector data address register 31 through the selector S30.
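The prefetch address sequence produced by the address register 31, the read data width register 32 and the address adder 33 can be sketched as below. The function name `prefetch_addresses` and the example address and width are hypothetical; the sketch only illustrates the register update order described above, with the width register starting at the initial addend "0".

```python
def prefetch_addresses(first_addr, read_width, count):
    """Return the first `count` mask vector data addresses that are requested.

    addr_reg models mask vector data address register 31 and width_reg models
    read data width register 32. The first request adds the initial addend 0,
    so it targets first_addr itself; every later request advances by
    read_width.
    """
    addr_reg = first_addr
    width_reg = 0                    # initial addend "0"
    requested = []
    for _ in range(count):
        addr = addr_reg + width_reg  # address adder 33
        requested.append(addr)
        addr_reg = addr              # register 31 updated through selector S30
        width_reg = read_width       # register 32 now holds the data width
    return requested


# Hypothetical example: blocks of 8 bytes starting at address 0x1000.
addrs = prefetch_addresses(0x1000, 8, 4)
```

Because the mask vector is stored contiguously, no information other than the start address and the block width is needed to run ahead of the processing unit.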

This operation is repeated to fetch the mask vector data that is to be processed in sequence, so that a plurality of mask vector data are already stored in the mask vector data buffer 11 before the mask vector data processing unit 40 asks to use them. When the mask vector data processing unit 40 needs one mask vector data for its processing, it issues a request for mask vector data to the control device 20. On receiving this request, the control device 20 supplies the stored mask vector data to the mask vector data processing unit 40 from the position in the mask vector data buffer 11 indicated by the mask vector data pointer 70.

Once the mask vector data has been supplied, the control device 20 updates the value of the mask vector data pointer 70 so that it points to the mask vector data stored in the mask vector data buffer 11 that is to be used next.
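The interplay between the buffer 11 and its read pointer 70 can be modeled as a small queue. The text does not say how the buffer is organized, so the circular layout, the write pointer, the capacity and the class name `MaskVectorDataBuffer` are all assumptions made for this sketch.

```python
class MaskVectorDataBuffer:
    """Toy model of mask vector data buffer 11 with read pointer 70.

    A circular buffer is assumed here; the source only states that a
    pointer indicates the entry to be supplied next.
    """

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.read_ptr = 0    # models mask vector data pointer 70
        self.write_ptr = 0   # hypothetical: where the next prefetch lands
        self.count = 0

    def store(self, data):
        """Store one prefetched block; return False if the buffer is full."""
        if self.count == len(self.slots):
            return False
        self.slots[self.write_ptr] = data
        self.write_ptr = (self.write_ptr + 1) % len(self.slots)
        self.count += 1
        return True

    def supply(self):
        """Hand the block at the read pointer to the processing unit.

        Returns None when nothing has been prefetched yet. After supplying,
        the pointer is advanced to the block to be used next.
        """
        if self.count == 0:
            return None
        data = self.slots[self.read_ptr]
        self.read_ptr = (self.read_ptr + 1) % len(self.slots)
        self.count -= 1
        return data


buf = MaskVectorDataBuffer(capacity=2)
stored = [buf.store(0b1010), buf.store(0b0001), buf.store(0b0011)]
first_block = buf.supply()
```

In this model the third `store` is refused because both slots are occupied; it succeeds once a block has been supplied and the read pointer has moved on.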

In the mask vector data processing unit 40, the supplied mask vector data is processed by counting the operation-inhibiting bits "0" to compute the multiplier of the vector inter-element distance VIi, so that addresses are generated only for the vector elements whose operation is instructed by an operation-instructing bit "1". The operation from this point on is the same as that described in the section on the conventional technology and is not the gist of the present invention, so it is explained only briefly.

The multiplier of the vector inter-element distance VIi computed by the mask vector data processing unit 40 is sent to the address generation unit 50, which generates the address of the target vector element, and an operand request is sent to the storage device 10. The storage device 10 reads the corresponding vector element and sends it to the arithmetic unit 60. The arithmetic unit 60 stores the operation result at the location in the storage device 10 that corresponds to the operand request issued beforehand.

In the above operation, the control device 20 controls three activities so that they proceed independently of one another and in parallel: the prefetch of mask vector data, the supply of mask vector data to the mask vector data processing unit 40, and the sending to the storage device 10 of operand requests for the vector elements whose operation is instructed. It also controls the storing of operation results into the storage device 10 so that this too is performed independently of those three activities.

[Effect of the invention]

As explained above, the present invention creates, from the address of the mask vector data and the read data width value of the mask vector data, the address of the mask vector data that lies ahead and is to be processed next in sequence. It prefetches mask vector data one block after another before the mask vector data processing unit needs it, and stores the data in a dedicated mask vector data buffer. When the mask vector data processing unit needs mask vector data, the corresponding data can be supplied to it at high speed from the mask vector data buffer. This keeps the idle time that the mask vector data processing unit spends waiting for mask vector data to a minimum and makes high-speed continuous processing of mask vector data possible. It also greatly reduces the idle time that the arithmetic unit spends waiting for vector elements, so that the arithmetic unit is used more efficiently.
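The difference between fetching on demand and prefetching can be illustrated with a toy timing model. Everything in it is an assumption made for illustration: the function names, the cycle counts, a memory that accepts one new request per `issue_interval` cycles, and a buffer deep enough never to fill. It says nothing about the actual timing of the device.

```python
def idle_cycles_on_demand(num_blocks, fetch_latency):
    """Idle cycles when each fetch starts only after the previous block is done.

    The processing unit waits the full fetch latency for every block.
    """
    return num_blocks * fetch_latency


def idle_cycles_prefetch(num_blocks, fetch_latency, process_cycles,
                         issue_interval=1):
    """Idle cycles when read requests are issued back to back in advance.

    Block k is requested at cycle k * issue_interval and arrives
    fetch_latency cycles later; the processing unit stalls only if it is
    ready before the block has arrived.
    """
    now = 0
    idle = 0
    for k in range(num_blocks):
        arrival = k * issue_interval + fetch_latency
        if arrival > now:
            idle += arrival - now
            now = arrival
        now += process_cycles
    return idle


# Hypothetical numbers: 4 blocks, 10-cycle memory latency, 16 cycles per block.
on_demand = idle_cycles_on_demand(4, 10)
prefetched = idle_cycles_prefetch(4, 10, 16)
```

Under these assumed numbers only the first block's latency is exposed when prefetching, whereas the on-demand scheme pays it once per block.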

[Brief explanation of drawings]

FIG. 1 is a block diagram showing one embodiment of the present invention; FIG. 2 shows an example of a FORTRAN program whose operations are controlled and processed by a mask vector; FIG. 3 is a block diagram of a conventional device; and FIG. 4 shows the processing of mask vector data and the procedure for generating the addresses of the vector elements corresponding to operation-instructing bits.

10: storage device; 11: mask vector data buffer; 20: control device; 30a: mask vector data reading section; 40: mask vector data processing unit; 50: address generation unit; 60: arithmetic unit; 70: mask vector data read pointer; 31: mask vector data address register; 32: mask vector data read data width register; 51, 52, 53: vector address registers; 54, 55, 56: vector inter-element distance registers; 33, 58: address adders; 57: multiplier; S30, S50, S51, S52, S53, S54: selectors.

Agent: Patent Attorney Susumu Uchihara

[FIG. 2: FORTRAN program example (drawing text not legible)]

Written amendment (voluntary)

Claims

[Claims]

A vector instruction processing device, being a data processing device that includes a storage device that stores a mask vector composed of a plurality of mask vector data, a mask vector data processing unit that processes the mask vector data collectively, and a mask vector data reading section that reads out the mask vector data, and that controls the execution of vector operations by means of a mask bit for each vector element, characterized in that: the mask vector data reading section has a mask vector data address register that holds the address of the mask vector data to be sequentially processed, a mask vector data read data width register that holds the read data width value of the mask vector data, and an address adder that creates, from the mask vector data address and the read data width value of the mask vector data, the address in the storage device of the mask vector data that lies ahead and is to be sequentially processed; the device includes a mask vector data buffer that prefetches the mask vector data from the storage device one after another and stores the data; and when the mask vector data processing unit requires mask vector data, the corresponding mask vector data is supplied from the mask vector data buffer to the mask vector data processing unit.