JPH0277940A

JPH0277940A - Data processor

Info

Publication number: JPH0277940A
Application number: JP22877788A
Authority: JP
Inventors: Hideo Wada; 英夫和田; Tomoo Aoyama; 青山　智夫
Original assignee: Hitachi Ltd; Hitachi Computer Engineering Co Ltd
Current assignee: Hitachi Ltd; Hitachi Computer Engineering Co Ltd
Priority date: 1988-09-14
Filing date: 1988-09-14
Publication date: 1990-03-19

Abstract

PURPOSE: To speed up branch processing in a scalar processor that decodes and executes plural instructions simultaneously in each machine cycle, by providing a mask register group that holds branch conditions. CONSTITUTION: A mask register group 103 holds the branch conditions. A mask field 201 of the instruction format includes a subfield (f), which selects, according to the contents of the group 103, whether an instruction is executed or treated as an invalid instruction, and a subfield (a), which indicates the processing to be performed after the instruction is executed. A logic part 107 invalidates instructions based on the value of subfield (f) and the value of the group 103 and, according to the value of subfield (a), holds executed instructions in plural instruction registers 100 until the value of the group 103 indicates completion, taking them out at the proper timing. This speeds up branch processing in a scalar processor that decodes and executes plural instructions simultaneously in each machine cycle, as in a multiple instruction pipeline.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Field of industrial application]

The present invention relates to a data processing device, and particularly to a data processing device that speeds up branch processing in scalar processing.

[Conventional technology]

In recent years, it has become important in supercomputers to speed up not only vector processing but also scalar processing. Branch processing is thought to account for 1/4 to 1/3 of scalar processing, so speeding up branch processing is extremely important.

On the other hand, as the pitch of the instruction pipeline in a scalar processing processor (hereinafter, scalar processor) is made finer, it eventually becomes impossible to generate instruction addresses one instruction at a time. In this limiting case, the instruction word length is restricted to a single size and control is such that plural instructions are executed together as a unit called a block. In other words, the result is instruction-parallel pipeline control in which plural instructions are decoded simultaneously in one machine cycle.

As one example of such an instruction-lookahead control method, the "multiple instruction pipeline method" has been proposed (Murakami, Fukuda, Sueyoshi, Tomita, Information Processing Society of Japan Research Report, 88-CA-69, pp. 25-32).

This method is effective in speeding up scalar processing, but its benefit diminishes as branch instructions appear more frequently. A multiple instruction pipeline needs a branch processing method that matches its instruction processing method.

[Problem to be solved by the invention]

When an instruction sequence is divided into blocks, each block consisting of plural instructions and executed in one machine cycle, branch processing falls into two cases: branching out of the block and branching within the block. Branching out of a block can be controlled in the same way as in the conventional case where one instruction is processed per machine cycle. Branching within a block, however, if controlled by the conventional method, allows only one instruction to be placed in a block, which reduces the benefit of the multiple instruction pipeline. To remove this drawback, branch processing is not simply executed on the multiple instruction pipeline as it is; instead, a new control method that yields the same result as branch processing is implemented on the multiple instruction pipeline.

A branch can result in (1) skipping execution of an instruction sequence or (2) re-executing an instruction sequence.

It suffices to control the processor so that these two operations can be performed within the instruction sequence of one block. To show the essence of branch control within a block simply, rewriting of instructions by instructions within the same block is prohibited.

An object of the present invention is to provide a data processing device that speeds up branch processing in a scalar processor that, like a multiple instruction pipeline, decodes and executes plural instructions simultaneously in each machine cycle.

[Means to solve the problem]

To achieve the above object, mask registers that hold branch conditions are provided. These registers correspond to the conventional condition code, made independent as registers and provided in plural.

The instruction is given a field (hereinafter, field f) that selects, according to the contents of the mask register, whether the instruction is executed or treated as an invalid instruction, and a field (hereinafter, field a) that indicates the processing to be performed after the instruction is executed.

The instruction word length is restricted to a single size.

In addition to the above architectural measures, there are provided a logic section that invalidates an instruction according to the value of field f and the value of the mask register, and a logic section that, according to the value of field a, holds completed instructions in the plural instruction registers until the value of the mask register indicates completion and takes them out at the appropriate timing.

[Operation]

To explain the above architectural measures and the hardware operation, concrete instruction notation is given below.

The format shown in FIG. 2 is one example of an instruction format suited to a multiple instruction pipeline.

Fields 200, 202, and 201 represent the operation, the operands, and the mask field, respectively. The mask field is divided into subfields f, a, and y. When subfield f is '1', the instruction is executed if the value of the mask register specified by the y subfield is '1' and is treated as an invalid instruction if that value is '0'. When subfield f is '0', the instruction is executed whatever the value of the mask register. When subfield a is '1', the instruction is executed if the value of the mask register specified by the y subfield is '0' and is treated as an invalid instruction if that value is '1'. When subfields f and a are both '1', a specification exception is raised (when both are '0', the instruction is an ordinary instruction that does not reference a mask register).
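The rules for the f and a subfields can be summarized as a small decision function. The following Python sketch is illustrative only: the names `decide`, `Action`, and `SpecificationException` are hypothetical and do not appear in the patent, and the function models the architectural rules stated above rather than any particular circuit.

```python
from enum import Enum


class Action(Enum):
    EXECUTE = "execute"
    NOP = "nop"  # the instruction is treated as an invalid (no-operation) instruction


class SpecificationException(Exception):
    """Raised when subfields f and a are both 1 (hypothetical name)."""


def decide(f: int, a: int, mask: int) -> Action:
    """Decide the fate of one instruction from its f and a subfields and the
    value of the mask register selected by its y subfield."""
    if f == 1 and a == 1:
        raise SpecificationException("f and a must not both be 1")
    if f == 1:
        # f = 1: execute only when the mask register holds 1.
        return Action.EXECUTE if mask == 1 else Action.NOP
    if a == 1:
        # a = 1: execute only when the mask register holds 0.
        return Action.EXECUTE if mask == 0 else Action.NOP
    # f = a = 0: ordinary instruction, the mask register is not consulted.
    return Action.EXECUTE
```

For instance, `decide(1, 0, 0)` yields `Action.NOP`, which corresponds to skipping an instruction whose branch condition did not hold.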

The instruction sequence is divided into blocks, but a block boundary is not merely a unit of instruction fetch. It marks the unit over which the completion condition is judged: the instructions in the block whose a subfield is '1' are executed until the mask register designated by the y subfield of those instructions becomes '1'. The block is also the unit of interruption as seen from the OS. To simplify the description of the operation, only one mask register may be designated by the instructions with a subfield '1' placed in one block.

Also, when a block contains no instruction that defines the value of the mask register designated through the a subfield, a specification exception is raised. An example using the a subfield follows.

    add  (a='1', MR0)  FR0 ← FR1 + A (index=GR10)   ... ①
    test (a='1', MR0)  MR0 ← '1' (FR0 ≧ 0)          ... ②
    add  (a='1', MR0)  GR10 ← GR10 + '1'            ... ③
    nop                                             ... ④

① adds the contents of area A indicated by index register GR10 to the contents of the register named FR1 and assigns the result to register FR0.

② checks whether the contents of FR0 are 0 or more and, if so, sets mask register MR0 to '1'.

③ updates index register GR10.

④ is an instruction inserted to fill out the number of instructions in one block (it is unnecessary if one block holds three instructions).

The instruction sequence ① to ④ is executed repeatedly until mask register MR0 becomes '1', performing the same data processing as a DO loop built with branch instructions.
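The repeated execution of a block can be imitated in software. The sketch below is a rough model, not the patent's hardware: the function name `run_loop_block`, the `max_passes` guard, the index step of 1, and the assumption that instructions within one pass take effect in program order are all illustrative choices that the source does not specify.

```python
def run_loop_block(A, fr1, gr10=0, max_passes=1000):
    """Repeat one block until mask register MR0 becomes 1.

    All a=1 instructions of a pass are assumed to see the value of MR0 at
    the start of that pass, so the index update still runs in the final pass.
    """
    mr0 = 0
    fr0 = None
    passes = 0
    while mr0 == 0:
        if passes >= max_passes:
            raise RuntimeError("mask register never became 1")
        fr0 = fr1 + A[gr10]  # add:  FR0 <- FR1 + A(index=GR10)
        if fr0 >= 0:         # test: MR0 <- 1 when FR0 >= 0
            mr0 = 1
        gr10 += 1            # add:  GR10 <- GR10 + 1 (step size assumed)
        passes += 1          # the nop only pads the block
    return fr0, gr10, passes
```

The `while` loop plays the role of holding the block in the instruction registers; no branch instruction appears in the block itself.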

An example using the f subfield is as follows.

    test     (f='0')       MR1 ← '1' (FR0 ≧ 0)          ... ①
    add      (f='1', MR1)  FR1 ← FR1 + A (index=GR10)   ... ②
    subtract (f='1', MR1)  FR2 ← FR2 − A (index=GR10)   ... ③
    add      (f='1', MR1)  FR3 ← FR3 + A (index=GR10)   ... ④

When the first instruction ① sets mask register MR1 to '0', instructions ② to ④ are treated as invalid instructions; when MR1 is set to '1', they are executed. In other words, the processing is the same as if a branch instruction had been placed immediately after ①.
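The skip behavior obtained with the f subfield can be sketched in the same way. The function `run_predicated_block` below is hypothetical, and it assumes that the mask value written by the test instruction is visible to the following three instructions, which is the behavior the text describes but whose timing the source does not detail.

```python
def run_predicated_block(A, fr0, fr1, fr2, fr3, gr10):
    """Model a test instruction (f=0) followed by three f=1 instructions
    that are guarded by mask register MR1."""
    mr1 = 1 if fr0 >= 0 else 0  # test (f=0): always executed
    if mr1 == 1:
        # f=1 instructions execute only when MR1 holds 1 ...
        fr1 = fr1 + A[gr10]
        fr2 = fr2 - A[gr10]
        fr3 = fr3 + A[gr10]
    # ... otherwise they are treated as invalid instructions (no effect).
    return mr1, fr1, fr2, fr3
```

When `fr0` is negative the three guarded operations leave their registers untouched, which matches the effect of a conditional branch around them.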

[Example]

An embodiment of the present invention is described below in detail with reference to the drawings. FIG. 1 is a schematic block diagram of the data processing device of the present invention. In FIG. 1, 100 is an instruction register, 101 a decoder section, 102 a register group, 103 a mask register group, 104 a switching circuit, 105 an instruction execution determination section, and 106 an arithmetic unit. The logic section enclosed by dotted line 107 is replicated as many times as the multiplicity of the multiple instruction pipeline. When the instruction designates a register on the register group 102 side as the write register, arithmetic unit 106 outputs the result onto path 170. When the write register is a mask register 103, it outputs the result onto path 171. The write register number is attached to the output result, and according to this information switching circuits 110 and 111 write the result to the target register.

The instruction pipeline has four stages in total: 'F' up to instruction register 100, 'D' for registers 112 to 116, 'E' for registers 117 to 120, and 'W' up to the write to register group 102 (or 103). Any number of stages may be used.

The signal set in instruction register 100 is decoded by decoder 101. Path 150 carries the order that defines the arithmetic operation, the f field information, and the write register number for the result. Path 162 carries the a field information, and path 163 carries the operand data read from register group 102 or mask register 103. The signal on path 150 is processed by instruction execution determination circuit 105, which sends onto path 155 a control signal for arithmetic unit 106 or for address adder 400 of FIG. 4. The signal sent onto path 163 acts on switching circuit 104, which sends the operand data onto paths 157 and 158. Data read from mask register 103 is sent onto path 160 and delivered to instruction execution determination circuit 105.

To simplify the drawings, several kinds of signals are sometimes represented by a single signal line. When a line that is single in the schematic block diagram is split into plural signal lines in a detailed block diagram, a suffix such as a, b, c, ... is appended to the number of the signal line.

FIG. 3 is a block diagram of the logic section (instruction read logic section) that sets instructions in instruction register 100 of FIG. 1. In FIG. 3, register 300 holds n times the instruction word length, where n is the multiplicity of the multiple instruction pipeline.

First, the start address of the user program is set in register 302 from instruction decoder 101 of FIG. 1 via path 154. The program is started by the OS, which performs this step by issuing a privileged instruction that sets data from main storage into register 302. The start address of the program is sent to the buffer storage through paths 360 and 361. Instructions read from the buffer storage are set in instruction register 100 via path 175 of FIG. 1. An advance signal corresponding to the instructions read is sent from the buffer storage on path 176. The advance signal is set successively in registers 310 to 313, one for each of the F, D, E, and W stages of the instruction pipeline. Logic circuit 314 generates the set signal for register 302 and instruction register 100. When the set signal is sent onto path 159, register 302 is loaded with the next instruction address, which adder 301 produces on path 154 by adding the instruction address on path 361 to the data in register 300.
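The address update performed by adder 301 amounts to advancing by one whole block. As a minimal sketch, assuming a byte-addressed memory and using hypothetical names (`next_block_address`, `word_length`, `multiplicity`) that are not in the patent:

```python
def next_block_address(current_address: int, word_length: int, multiplicity: int) -> int:
    """Return the address of the next block.

    Register 300 is described as holding (instruction word length) x n,
    where n is the pipeline multiplicity; adder 301 adds that constant to
    the current instruction address.
    """
    block_size = word_length * multiplicity  # contents of register 300
    return current_address + block_size
```

With a 4-byte instruction word and a multiplicity of 4 (both assumed values), each fetch advances the address by 16 bytes.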

Register 303 stores the start address of the interrupt processing routine used when an exception occurs. When a signal indicating that a specification exception or the like has been detected is issued on path 550 or 551, OR circuit 304 collects these detection signals and applies the result to selector 305. Selector 305 connects the output of register 303 to path 361, and processing shifts from the user program to the interrupt processing routine. At this time the address in register 302 is saved in register 306.

When the interrupt processing routine has finished and control is returned to the user program, the signal on path 150a, obtained by decoder 101 of FIG. 1 decoding the opcode, is applied to selector 305 to connect path 360 to path 361. The address saved in register 306 is first sent to the buffer storage via path 367 and from there is set in register 302 via path 154.

This method is only one example; the address in register 302 may instead be sent to register group 102 of FIG. 1 for the save and restore processing. In that case the sink of path 367 is switching circuit 110 of FIG. 1, and path 157 is extended to register 302.

FIG. 4 is a block diagram of the address adder section, which issues the address specified by the operands to the buffer storage. The address adder operates in the E stage of the instruction pipeline. The data read from register group 102 and sent onto paths 157 and 158 is set in registers 401 to 404 and input to adder 400, which generates the operand address. This address is sent to the buffer storage via paths 450 and 451.

FIG. 5 is a block diagram of decoder 101 of FIG. 1.

The instruction set in instruction register 100 is divided into the OP, f, a, y, and R1 to R3 fields: the opcode, the f, a, and y subfields, and three operand fields, respectively. The opcode is used via path 555 to look up RAM 500, which generates the order information for the arithmetic unit or the address adder. This information is set in register 501. Part b of register 501 is set to '1' for an instruction that sets a value in a mask register. The output of part b of register 501 is ANDed with the value of the a subfield in AND circuit 502. An AND circuit 502 is provided for each decoder 101 of FIG. 1. The outputs of the plural AND circuits 502 are ORed by OR circuit 503, inverted by inverter 504, and sent onto path 550. The signal on path 550 becomes '1' when a block contains an instruction whose a subfield is '1' (that is, an instruction to be executed repeatedly) but no instruction that sets a value in a mask register. This signal detects one condition of the specification exception.

The mask register number in the y subfield is compared by comparison circuit 506 with the mask register number in the y subfield of each of the other instruction registers. When the two match, the output is '1'. This output is inverted and then ANDed in AND circuit 507, and the results are ORed by OR circuit 508. AND circuit 507 also receives information on whether the a subfield of the instruction is '1'. The output of OR circuit 508 becomes '1' when instructions with a subfield '1' in one block reference different mask registers. The signal on path 551 thus detects another condition of the specification exception.
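The two block-level checks made by the decoder can be expressed as predicates over the instructions of one block. The sketch below follows the conditions as stated in the text rather than the gate-level wiring; the `Instr` record and both function names are hypothetical, and the first check is simplified to "any mask-setting instruction" without matching the register number.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Instr:
    a: int                   # a subfield (1 = instruction is repeat-controlled)
    y: Optional[int] = None  # mask register number named by the y subfield
    sets_mask: bool = False  # True if the instruction writes a mask register


def missing_mask_setter(block: List[Instr]) -> bool:
    """Condition 1: the block has an a=1 instruction but no instruction
    that sets a mask register, so the loop could never complete."""
    uses_a = any(i.a == 1 for i in block)
    has_setter = any(i.sets_mask for i in block)
    return uses_a and not has_setter


def mixed_mask_registers(block: List[Instr]) -> bool:
    """Condition 2: a=1 instructions in one block name different mask registers."""
    regs = {i.y for i in block if i.a == 1}
    return len(regs) > 1
```

Either predicate returning `True` corresponds to raising a specification exception for the block.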

The data in the f, a, and y subfields and in the R1 to R3 fields is latched once and then sent onto paths 150a, 162, 163c, 150c, and 163a, b, respectively.

FIG. 6 is a block diagram of instruction execution determination circuit 105 of FIG. 1. In FIG. 6, when the value of the mask register is read out onto path 160a, it is ANDed in AND circuit 600 with the f subfield value read out on path 150a. The result indicates whether an instruction whose f subfield is '1' is to be executed. Next, the output of register 601 is inverted by inverter 602 and ANDed in AND circuit 603 with the a subfield data on path 162. The result indicates whether an instruction whose a subfield is '1' is to be executed. When the f subfield is '0', the instruction is executed regardless of the value of the mask register. For this purpose the output of register 604 is inverted by inverter 605 and input to OR circuit 606. The same circuit receives, through paths 651 and 652 respectively, the execution condition for the case where the f subfield is '1' and the execution condition for the case where the a subfield is '1'. An order signal for executing the instruction is sent onto path 155a. When this signal is '0', the instruction is processed as an invalid instruction; that is, path 155 directs the arithmetic unit of FIG. 1 and the address adder of FIG. 4 to perform no operation.

When the outputs of registers 604 and 607 are both '1', that is, when the f and a subfields are both '1', a specification exception signal is sent onto path 650. This signal is sent to OR circuit 304 of FIG. 3.

FIG. 7 is a block diagram of switching circuit 104 of FIG. 1. In FIG. 7, register 700 is one of the mask registers. The output of register 700 is sent to selectors 701 and 702, where the signals on paths 163 and 164 select the output of the mask register designated by the instruction and send it out onto paths 160a, b.

FIG. 8 is a block diagram of logic circuit 314 of FIG. 3.

The a subfield data arrives from decoder 101 of FIG. 1 via paths 162 and 172. The mask register data arrives from arithmetic unit 106 of FIG. 1 via paths 152 and 153. This logic circuit is assumed to operate in the W stage. For an instruction that does not use the a subfield, the mask-side output (path 171 side) of the arithmetic unit is set to '0'.

In FIG. 8, AND circuits 800 and 801 detect the condition that an instruction using the a subfield (that is, with the subfield set to '1') completes because the value of the mask register has become '1'. AND circuits 800 and 801 are provided in correspondence with the instruction registers 100 of FIG. 1. Their outputs are ORed by OR circuit 802, and the result is sent to AND circuit 803.

OR circuit 804 determines whether the block consists only of instructions that do not use the a subfield. The output of OR circuit 804 is input to AND circuit 803 and, at the same time, inverted and input to OR circuit 805.

A '1' on path 850 indicates that the a subfield is '1' and the mask register has become '1'. A '1' on path 851 indicates that the a subfield is '0' for every instruction in the block. The two are ORed to generate, on path 852, the signal that directs reading of the instruction sequence of the next block. Path 350 carries a signal indicating that the instructions have entered the W stage. AND circuit 806 is provided to align this logic with the stages of the instruction pipeline.
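The condition for fetching the next block can be written as a Boolean expression over the instructions of the current block. The sketch below is an interpretation of the signals described for paths 850, 851, and 852; the function name `fetch_next_block` and the list-based interface are hypothetical, and the W-stage timing handled by AND circuit 806 is not modeled.

```python
def fetch_next_block(a_fields, mask_outputs):
    """Return True when the next block should be read.

    a_fields:     a subfield (0 or 1) of each instruction in the block.
    mask_outputs: mask-side result (0 or 1) produced for each instruction;
                  assumed to be 0 for instructions that do not use the a subfield.
    """
    # Path 850: some a=1 instruction has driven the mask register to 1.
    loop_done = any(a == 1 and m == 1 for a, m in zip(a_fields, mask_outputs))
    # Path 851: no instruction in the block uses the a subfield.
    no_loop = all(a == 0 for a in a_fields)
    # Path 852: advance when either condition holds.
    return loop_done or no_loop
```

While the function returns `False`, the same block stays in the instruction registers and is executed again without a new instruction fetch.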

The signal on path 159 is used to set instruction register 100 of FIG. 1 and register 302 of FIG. 3.

[Effect of the invention]

According to the present invention, in a data processing device that adopts a control method in which plural instructions are executed in one machine cycle, and where the instruction sequence executed in one machine cycle is called a block, branch processing within a block can be performed at high speed by means of mask data that acts on instruction execution. Furthermore, by holding the mask data in plural program-accessible registers, called mask registers, the setting of a branch condition can be placed freely apart from the branch itself.

In addition, a program with a loop structure that executes repeatedly within one block can be executed while its instruction sequence is held in the instruction registers. Processing can therefore proceed without instruction fetches from main storage or buffer storage, giving markedly faster loop processing than conventional branch acceleration methods.

[Brief description of the drawings]

FIG. 1 is a schematic block diagram of the data processing device of the present invention, FIG. 2 is an instruction format diagram, FIG. 3 is a block diagram of the instruction read logic section, FIG. 4 is a block diagram of the address adder section, FIG. 5 is a block diagram of the decoder section, FIG. 6 is a block diagram of the instruction execution determination circuit, FIG. 7 is a block diagram of the switching circuit, and FIG. 8 is a block diagram of logic circuit 314 of FIG. 3.

100: instruction register, 101: decoder, 102: (general-purpose/floating-point) register group, 103: mask register group, 105: instruction execution determination circuit, 106: arithmetic unit, 301: adder, 506: comparison circuit.


Claims

[Claims]

1. A data processing device that divides the execution of an instruction into a plurality of stages, characterized in that logically identical stages of a plurality of instructions are executed at the same timing.

2. The data processing device according to claim 1, characterized in that a plurality of registers that act on the manner of instruction execution are provided, and the instruction has a field for selecting, according to the contents of the register, whether or not to execute the instruction, or whether to execute the instruction again after it has been executed.

3. The data processing device according to claim 2, wherein the completion condition for a sequence of plural instructions executed at one timing of the data processing device is generated by ANDing the contents of the register with a specific field of the instruction.

4. The data processing device according to claim 1, wherein exceptions in the sequence of instructions executed at one timing of the data processing device are detected in parallel, and control is passed to the interrupt processing section upon such detection.