JPH0844563A - Microprocessor

Info

Publication number: JPH0844563A
Application number: JP6178129A
Authority: JP
Inventors: Hajime Kubosawa; 元久保沢
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1994-07-29
Filing date: 1994-07-29
Publication date: 1996-02-16

Abstract

PURPOSE: To increase the operating speed of a microprocessor by holding each instruction in the instruction holding part corresponding to the operation part that processes it, so that no slot is selected in the decode stage. CONSTITUTION: A flag generation part 4 decodes an instruction input from an external memory to determine the classification of the instruction. An instruction is read out from an instruction cache 2 into an instruction buffer 5 and held there. An instruction that is input from the external memory because it is absent from the instruction cache 2 is input to the instruction buffer 5 after a flag is added by the flag generation part 4, and is also held in the instruction cache 2. At this time, an instruction holding part selection part 6 examines the flag: if the flag is '0', it selects an instruction holding part A 7 and the instruction is held in part A 7; if the flag is '1', it selects an instruction holding part B 8 and the instruction is held in part B 8.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a microprocessor, and in particular to a microprocessor that has a plurality of arithmetic units and can execute a plurality of instructions simultaneously in one clock cycle.

[0002] Executing a plurality of instructions simultaneously in a pipeline allows a program to be processed faster than executing instructions one at a time in sequence. To execute a plurality of instructions simultaneously by pipeline processing, it is necessary to determine which arithmetic unit should execute each instruction and to transfer the instruction to the appropriate pipeline (slot).

[0003] The present invention provides a microprocessor that can efficiently determine to which pipeline slot an instruction is issued and efficiently transfer the instruction to that slot.

[0004]

[Prior Art] FIG. 8 shows the configuration of a conventional microprocessor that performs pipeline processing. Reference numeral 200 denotes the processor.

[0005] Reference numeral 210 denotes an input buffer, which temporarily holds instructions input from an external memory. Reference numeral 211 denotes an instruction cache, which holds instructions.

[0006] Reference numeral 212 denotes a data cache, which holds data. Reference numeral 213 denotes an instruction buffer, which feeds instructions into the pipeline slots (slot 0 and slot 1).

[0007] Reference numeral 214 denotes slot 0, which performs pipeline processing. Reference numeral 215 denotes slot 1, which also performs pipeline processing. Reference numeral 216 denotes a register file, composed of a plurality of registers that temporarily hold the input data (operands) of the arithmetic units, their results, and the like.

[0008] Reference numerals 220 and 221 denote decoders, which decode instructions. Reference numeral 230 denotes arithmetic unit 1, which performs arithmetic processing for slot 0. Reference numeral 231 denotes arithmetic unit 2, which also performs arithmetic processing for slot 0.

[0009] Reference numeral 232 denotes arithmetic unit 3, which performs arithmetic processing for slot 1. Reference numeral 233 denotes arithmetic unit 4, which also performs arithmetic processing for slot 1. The operation of the conventional microprocessor of FIG. 8 is described below.

[0010] Instructions are supplied from the instruction buffer 213 to the decoders 220 and 221 two at a time per cycle. When the instruction buffer holds instructions, they are read from it and interpreted by the decoders 220 and 221 of slot 0 and slot 1. Assuming an instruction word length of 4 bytes, writes to and reads from the instruction buffer 213 are performed in 8-byte units. Writes to and reads from the instruction cache 211 are also performed in 8-byte units.

[0011] When the instruction buffer 213 holds no instruction, instructions are read from the instruction cache 211 and written to the instruction buffer 213 (instruction fetch). The instructions in the instruction buffer 213 are then input to slot 0 or slot 1 and interpreted by the decoders 220 and 221. In the decode stage, data dependencies between instructions and conflicts over resources such as arithmetic units are checked; if there is a conflict, the start of the operation is held until the conflict is resolved. Each decoded instruction is executed by the arithmetic unit appropriate to it (arithmetic unit 1, 2, 3, or 4). The input operands for the operation are read from the register file 216 in parallel with decoding.

[0012] When an instruction is not present in the instruction cache 211, the instruction read from the memory external to the microprocessor 200 is written to the instruction buffer 213 and also to the instruction cache 211. The flow after an instruction moves to the execution stage (arithmetic units 1 to 4) depends on the instruction. For an arithmetic or logical operation, the operation completes in one cycle and the result is written to the register file 216 three cycles later. For a load instruction, the address for the memory access is calculated in the execution stage and the data cache is read in the following cache stage. Exception processing (for example, detecting whether an access-prohibited address has been requested) is performed in the next stage, and the register contents are updated in the stage after that.

[0013] In FIG. 8, arithmetic units 1, 2, 3, and 4 are all assumed to perform operations of different functions. That is, two instructions of the same type cannot be executed simultaneously. In addition, an instruction in slot 0 cannot use the arithmetic units on the slot 1 side; likewise, an instruction in slot 1 cannot use the arithmetic units on the slot 0 side. These restrictions avoid the increase in inter-slot wiring and multiplexers, and the more complex control, that would otherwise result.

[0014] Normally, at the point when instructions have been transferred from the instruction cache 211 to the instruction buffer 213, slot 0 holds the lower instruction of the fetched 8 bytes and slot 1 holds the upper instruction. When the arithmetic units (230, 231, 232, 233) are subject to the restrictions above, it is necessary to detect which arithmetic unit each instruction uses and to select the appropriate slot. If an instruction is in a slot other than that of the arithmetic unit it uses, the instruction slots must be swapped. This swap must be performed in the interval between reading the instructions from the instruction buffer and passing them, decoded, to the arithmetic units (the decode stage).
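The decode-stage slot selection described above can be sketched roughly as follows. This is only an illustration, not the circuit of FIG. 8: the function name `assign_slots`, the labels "lsw" and "msw", and the choice to issue the lower-address instruction first when both instructions need the same slot are assumptions made for the example.

```python
# Which slot owns each arithmetic unit, per the description of FIG. 8:
# units 1 and 2 belong to slot 0, units 3 and 4 belong to slot 1.
SLOT_OF_UNIT = {1: 0, 2: 0, 3: 1, 4: 1}

def assign_slots(lsw_unit, msw_unit):
    """Return (slot 0 occupant, slot 1 occupant) for a fetched pair.

    lsw_unit / msw_unit are the arithmetic units required by the lower and
    upper instruction of the 8-byte fetch. An occupant is "lsw", "msw",
    or None when that slot stays idle in this cycle.
    """
    lsw_slot = SLOT_OF_UNIT[lsw_unit]
    msw_slot = SLOT_OF_UNIT[msw_unit]
    if lsw_slot != msw_slot:
        # The pair can issue together; swap if the default placement is wrong.
        placed = {lsw_slot: "lsw", msw_slot: "msw"}
        return placed[0], placed[1]
    # Assumption: both need the same slot, so only the lower instruction issues.
    placed = {0: None, 1: None}
    placed[lsw_slot] = "lsw"
    return placed[0], placed[1]
```

In the conventional design this decision sits between the instruction buffer read and the hand-off to the arithmetic units, which is why it lengthens the decode stage.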

[0015]

[Problems to Be Solved by the Invention] A conventional pipelined microprocessor swaps instruction slots in the decode stage. Because the decode stage decodes instructions, checks dependencies, reads the register file, and determines whether instructions can be issued, it is a stage that is likely to become the critical path (the path with the largest delay) of the entire processor. Determining in the decode stage which arithmetic unit an instruction uses and selecting the appropriate slot is therefore a major obstacle to high-speed operation of the microprocessor.

[0016] An object of the present invention is to speed up the microprocessor by eliminating slot selection from the decode stage.

[0017]

[Means for Solving the Problems] In the present invention, when an instruction is stored in the instruction cache from outside the chip, a flag indicating which slot the instruction uses is attached, and the instruction is written to the instruction cache together with the flag. Likewise, when an instruction read from the external memory is written to the instruction buffer, the flag is attached and the instruction is written to both the instruction buffer and the instruction cache. When writing to the instruction buffer, the appropriate slot is selected by examining the flag from the instruction cache, and transfers within the instruction buffer are controlled by this slot flag. The slot can thus be selected without performing slot selection in the decode stage.

[0018] FIG. 1 shows the basic configuration of the present invention, taking as an example a case with two slots. In FIG. 1, reference numeral 1 denotes a microprocessor.

[0019] Reference numeral 2 denotes an instruction cache. Reference numeral 3 denotes a flag holding unit of the instruction cache, which holds flags identifying the operation unit (slot) that executes each instruction.

[0020] Reference numeral 4 denotes a flag generation unit, which generates a flag according to the type of instruction. Reference numeral 5 denotes an instruction buffer, which holds instructions read from the instruction cache or the external memory.

[0021] Reference numeral 6 denotes an instruction holding unit selection unit, which examines the flag of an instruction and determines whether the instruction should be held in instruction holding unit A or instruction holding unit B. Reference numeral 7 denotes instruction holding unit A, which holds instructions to be processed by operation unit A (slot 0).

[0022] Reference numeral 8 denotes instruction holding unit B, which holds instructions to be processed by operation unit B (slot 1). Reference numeral 9 denotes operation unit A (slot 0), which decodes the instruction input from instruction holding unit A, reads operands from the register file 12 according to the decoding result, performs the operation, and transfers the result to the register file 12.

[0023] Reference numeral 10 denotes operation unit B (slot 1), which decodes the instruction input from instruction holding unit B, reads operands from the register file 12 according to the decoding result, performs the operation, and transfers the result to the register file 12.

[0024] Reference numeral 12 denotes a register file, which holds data input from the data cache 13 or the results of operation units A and B. Reference numeral 13 denotes a data cache, which holds data.

[0025]

[Operation] The operation of the basic configuration of FIG. 1 is as follows. The flag generation unit 4 decodes an instruction input from the external memory and determines its type. It then generates a flag identifying whether the instruction is to be processed by operation unit A or by operation unit B: for example, "0" for an instruction processed by operation unit A and "1" for an instruction processed by operation unit B.
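As a minimal sketch of this flag generation, the mapping from instruction type to operation unit can be written as a lookup. The table `UNIT_FOR_CLASS` and the function name `generate_flag` are hypothetical; the assignment of add to unit A and load to unit B is an assumption chosen only for illustration.

```python
# Hypothetical classification table: which operation unit handles each
# instruction type. The entries are assumptions for illustration.
UNIT_FOR_CLASS = {"add": "A", "load": "B"}

def generate_flag(instr_class):
    """Return the slot flag: 0 for operation unit A, 1 for operation unit B."""
    return 0 if UNIT_FOR_CLASS[instr_class] == "A" else 1
```

The point of the scheme is that this classification runs once, when the instruction arrives from the external memory, rather than every time the instruction is decoded.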

[0026] Instructions are read from the instruction cache 2 and held in the instruction buffer 5. Alternatively, when the instruction is not in the instruction cache 2, the instruction input from the external memory is given a flag by the flag generation unit 4, input to the instruction buffer 5, and also stored in the instruction cache 2. At that time, the instruction holding unit selection unit 6 examines the flag; for example, if the flag is "0", it selects instruction holding unit A and holds the instruction there, and if the flag is "1", it selects instruction holding unit B and holds the instruction there. The register file 12 fetches operand data from the data cache 13 and holds it.
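The fetch path just described might be modeled as below. This is a sketch under assumptions, not the patented circuit: the cache and external memory are plain dictionaries keyed by address, the holding units are lists, and `fetch`, `route`, and `classify` are hypothetical names.

```python
def fetch(address, cache, external_memory, classify):
    """Return (instruction, flag), filling the cache on a miss."""
    if address in cache:
        # Hit: the flag was stored with the instruction, so nothing is re-decoded.
        return cache[address]
    instruction = external_memory[address]
    flag = classify(instruction)          # role of the flag generation unit
    cache[address] = (instruction, flag)  # instruction and flag stored together
    return instruction, flag

def route(instruction, flag, holding_a, holding_b):
    """Role of the selection unit: flag 0 goes to unit A, flag 1 to unit B."""
    (holding_a if flag == 0 else holding_b).append(instruction)
```

For example, with `classify = lambda i: 1 if i.startswith("load") else 0`, fetching the same address twice classifies the instruction only on the first access; the second access reads the stored flag from the cache.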

[0027] The instruction in instruction holding unit A is input to operation unit A, which decodes it. Operation unit A then reads the required operands from the register file 12, performs the operation, and stores the result in the register file 12. Similarly, the instruction in instruction holding unit B is input to operation unit B, which decodes it. Operation unit B then reads the required operands from the register file 12, performs the operation, and stores the result in the register file 12. Results held in the register file 12 are transferred to the data cache 13 and output to the external memory.

[0028] According to the present invention, slot selection, which strongly affects the processing speed of the microprocessor, is not performed in the decode stages of operation units A and B. Because processing is not delayed in the decode stage, pipeline processing can be performed efficiently and the operation of the microprocessor as a whole is sped up. In addition, instruction handling in the instruction buffer can be controlled simply by means of the flags.

[0029]

[Embodiment] FIG. 2 shows an embodiment of the present invention. In FIG. 2, reference numeral 51 denotes a microprocessor.

[0030] Reference numeral 52 denotes an instruction cache. Reference numeral 53 denotes a flag for identifying the slot; flag "0" identifies slot 0 and flag "1" identifies slot 1.

[0031] Reference numeral 54 denotes a predecoder, which decodes an instruction from the external memory, identifies whether it is to be executed in slot 0 or slot 1, and generates the flag (it corresponds to the flag generation unit of FIG. 1).

[0032] Reference numeral 55 denotes an instruction buffer (its details are described with reference to FIG. 4). Reference numeral 56 denotes an instruction holding unit selection unit. Reference numeral 57 denotes an instruction holding unit.

[0033] Reference numeral 59 denotes slot 0, which is composed of a decoder and arithmetic units 1 and 2. Reference numeral 60 denotes slot 1, which is composed of a decoder and arithmetic units 3 and 4.

[0034] Reference numeral 62 denotes a register file. Reference numeral 63 denotes a data cache. Reference numeral 76 denotes an input buffer (external input register), which temporarily holds instructions input from the external memory.

[0035] In slot 0, reference numeral 70 denotes a decoder, which decodes the instruction input to slot 0.

[0036] Reference numeral 72 denotes arithmetic unit 1, which reads operands from the register file 62 and performs the operation according to the instruction decoded by the decoder 70. Reference numeral 73 denotes arithmetic unit 2, which likewise reads operands from the register file 62 and performs the operation according to the instruction decoded by the decoder 70.

[0037] In slot 1, reference numeral 71 denotes a decoder, which decodes the instruction input to slot 1.

[0038] Reference numeral 74 denotes arithmetic unit 3, which reads operands from the register file 62 and performs the operation according to the instruction decoded by the decoder 71. Reference numeral 75 denotes arithmetic unit 4, which likewise reads operands from the register file 62 and performs the operation according to the instruction decoded by the decoder 71.

[0039] Reference numeral 80 denotes an example of the data structure of flagged instructions. Instruction 0 is processed in slot 0 and carries a flag identifying slot 0. Instruction 1 is processed in slot 1 and carries a flag identifying slot 1.
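One way to picture a flagged instruction is as a 32-bit instruction word with one extra flag bit. The layout below is purely an assumption for illustration: the text does not say where the flag is stored relative to the instruction word, and `FLAG_BIT`, `attach_flag`, and `split_flag` are hypothetical names.

```python
FLAG_BIT = 32  # assumed position: one bit just above the 32-bit instruction word

def attach_flag(word, flag):
    """Combine a 4-byte instruction word and its slot flag into one entry."""
    return (flag << FLAG_BIT) | (word & 0xFFFFFFFF)

def split_flag(entry):
    """Recover (instruction word, slot flag) from a flagged entry."""
    return entry & 0xFFFFFFFF, entry >> FLAG_BIT
```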

[0040] If one line of the instruction cache 52 is 32 bytes long, it can store eight instructions. The flag is written when an instruction is fetched from the external memory. An instruction read from the external memory is first written to the input buffer 76 (external input register) before being written to the instruction cache 52. The flag is generated by the predecoder 54 between reading the input buffer 76 and writing to the instruction cache 52. The data to be written to the instruction cache 52 is not needed until the second half of the cycle, and flag generation finishes earlier than that, so decoding by the predecoder 54 does not add delay (this point is described in detail with reference to FIG. 3). Reads from the instruction cache 52 into the instruction buffer 55 take place in the instruction fetch stage: the instruction cache 52 is read and the result is written to the instruction buffer 55. In the instruction buffer 55, the instruction holding unit selection unit 56 examines the flag to identify whether the instruction is to be processed in slot 0 or slot 1; an instruction to be executed in slot 0 is held on the slot 0 side of the instruction holding unit 57, and an instruction to be executed in slot 1 is held on the slot 1 side.

[0041] FIG. 3 is a timing chart of flag generation in the present invention. An instruction read from the external memory is first held in the external input register (input buffer 76).

[0042] In the first cycle, the instruction is read from the input register 76 at the rising edge of the clock. It is then transferred within the chip (inside the microprocessor 51) and predecoded by the predecoder 54; in the second half of the first cycle, it is transferred within the chip and written to the instruction cache input register at the entrance of the instruction cache 52 (a register, not shown, that temporarily holds an instruction for input to the instruction cache 52). At this point it is also written to the instruction buffer 55.

[0043] In the second cycle, at the rising edge of the clock, the instruction is read from the instruction cache input register and written to the instruction cache 52.

[0044] FIG. 4 shows an embodiment of the instruction buffer of the present invention. In FIG. 4, reference numeral 52 denotes the instruction cache. The instruction cache holds 4-byte instructions in 8-byte units. The instruction on the LSW (lower) side (instruction 0) and the instruction on the MSW (upper) side (instruction 1) of each 8-byte unit each have a flag identifying the slot.

[0045] Reference numeral 55 denotes the instruction buffer, which has seven instruction holding units (T, P0, P1, F0, F1, D0, D1). Instructions are selectively held in the instruction holding units according to selection rules that take into account instruction continuity and the flags. The selection rules are described later.

[0046] Reference numeral 85 denotes instruction holding unit selection unit 1, which refers to the flag of an instruction taken from the instruction cache 52 or instruction holding unit T and selects instruction holding unit P0 or P1 according to the selection rules.

[0047] Reference numeral 86 denotes instruction holding unit selection unit 2, which refers to the flag of an instruction taken from the instruction cache 52, instruction holding unit P0, or instruction holding unit P1 and selects instruction holding unit F0 or F1 according to the selection rules.

[0048] Reference numeral 87 denotes instruction holding unit selection unit 3, which refers to the flag of an instruction taken from the instruction cache 52, instruction holding unit F0, or instruction holding unit F1 and selects instruction holding unit D0 or D1 according to the selection rules.

[0049] Reference numeral 90 denotes selector 0, which selects the instruction to be held in P0. Reference numeral 91 denotes selector 1, which selects the instruction to be held in P1. Reference numeral 92 denotes selector 2, which selects the instruction to be held in F0.

[0050] Reference numeral 93 denotes selector 3, which selects the instruction to be held in F1. Reference numeral 94 denotes selector 4, which selects the instruction to be held in D0. Reference numeral 95 denotes selector 5, which selects the instruction to be held in D1.

[0051] T denotes an instruction holding unit that holds the upper 4-byte instruction of the instructions from the instruction cache 52. P0 denotes an instruction holding unit that holds the instruction selected by selector 0 (90).

[0052] P1 denotes an instruction holding unit that holds the instruction selected by selector 1 (91). F0 denotes an instruction holding unit that holds the instruction selected by selector 2 (92).

[0053] F1 denotes an instruction holding unit that holds the instruction selected by selector 3 (93). D0 denotes an instruction holding unit that holds the instruction selected by selector 4 (94).

[0054] D1 denotes an instruction holding unit that holds the instruction selected by selector 5 (95). In the configuration of FIG. 4, selector 0 (90) receives the contents and flags of instruction holding unit T and of the LSW (instruction 0) and MSW (instruction 1) of the instruction cache, and selects one of these instructions according to the flags and the selection rules. The selected instruction is then held in P0. Similarly, selector 1 (91) receives the contents and flags of instruction holding unit T and of the LSW and MSW of the instruction cache 52; an instruction is selected according to the flags and the selection rules and held in P1.

[0055] Selector 2 (92) receives the contents and flags of instruction holding unit P0 and of the LSW and MSW of the instruction cache 52, selects one of them according to the flags and the selection rules, and holds it in instruction holding unit F0. Similarly, selector 3 (93) receives the contents and flags of instruction holding unit P1 and of the LSW and MSW of the instruction cache 52; an instruction is selected according to the flags and the selection rules and held in instruction holding unit F1.

[0056] Selector 4 (94) receives the contents and flags of instruction holding unit F0 and of the LSW and MSW of the instruction cache 52; an instruction is selected according to the flags and the selection rules and held in instruction holding unit D0. Similarly, selector 5 (95) receives the contents and flags of instruction holding unit F1 and of the LSW and MSW of the instruction cache 52; an instruction is selected according to the flags and the selection rules and held in instruction holding unit D1.

[0057] The contents of D0 are transferred to slot 0, and the contents of D1 are transferred to slot 1. The instruction transfer rules in the configuration of FIG. 4 are described next. In the configuration of FIG. 4, each selection unit selects an instruction and transfers it to the selected holding unit according to the following rules.

【００５８】(1) キャッシュからの命令の読み出しは２
命令単位で行う。 (2) Ｄ０，Ｆ０，Ｐ０にはスロット０の演算器のグルー
プで実行される命令が入る。(1) Two instructions are read from the cache
Do by instruction. (2) D0, F0, and P0 contain instructions executed by the group of arithmetic units in slot 0.

【００５９】(3) Ｄ１，Ｆ１，Ｐ１にはスロット１の演
算器のグループで実行される命令が入る。 (4) 読み出した命令がスロット０で実行する命令とス
ロット１で実行する命令のペアの場合において，(4) −
１命令バッファが空の時はＤ０，Ｄ１に書き込む。(3) D1, F1 and P1 contain instructions executed by the group of arithmetic units in slot 1. (4) If the read instruction is a pair of an instruction executed in slot 0 and an instruction executed in slot 1, (4)-
1 If the instruction buffer is empty, write to D0 and D1.

【００６０】(4) −２Ｄ０，Ｄ１に命令が存在してい
る時はＦ０，Ｆ１に書き込む。 (4) −３Ｄ０，Ｄ１，Ｆ０，Ｆ１に命令が存在してい
る時はＰ０，Ｔに書き込む。(4) -2 If an instruction exists in D0 and D1, write it in F0 and F1. (4) -3 Write to P0, T when an instruction exists in D0, D1, F0, F1.

【００６１】(5) 読み出した命令が両方ともスロット
０で実行する命令の場合，(5) −１命令バッファが空
の時はＤ０，Ｆ０に書き込む。(5) −２Ｄ０，Ｄ１に
命令が存在している時はＦ０，Ｐ０に書き込む。(5) If both the read instructions are instructions to be executed in slot 0, (5) −1 If the instruction buffer is empty, write to D0 and F0. (5) -2 If an instruction exists in D0 and D1, write it in F0 and P0.

【００６２】(5)-3 If instructions are present in D0, D1, F0, and F1, they are written to P0 and T.
(6) When both instructions read are to be executed in slot 1:
(6)-1 If the instruction buffer is empty, they are written to D1 and F1.

【００６３】(6)-2 If instructions are present in D0 and D1, they are written to F1 and P1.
(6)-3 If instructions are present in D0, D1, F0, and F1, they are written to P1 and T.

【００６４】(7) When instructions are issued from D0 and D1, the instructions present in F0, F1, P0, P1, and T move through the buffer so as to pack toward D0 and D1.
(8) Instructions that enter D0 and D1 at the same time must be at consecutive addresses.

【００６５】(9) Instructions that enter F0 and F1 at the same time must be at consecutive addresses.
(10) Instructions that enter P0 and P1 at the same time must be at consecutive addresses.
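Rules (4) to (6) can be read as a lookup from the kind of pair fetched and the current buffer occupancy to the two holding units that receive the pair. The following is a minimal sketch of that reading only. The labels `mixed`, `both0`, `both1`, `empty`, `D_full`, and `DF_full`, and the function names, are invented for illustration and do not appear in the specification. The sketch covers only the three occupancy states that the rules name explicitly.

```python
# Target holding units for a fetched pair, transcribed from rules (4)-(6).
# Key: (pair kind, buffer state). All labels are hypothetical names for this sketch.
RULES = {
    ("mixed", "empty"): ("D0", "D1"),    # rule (4)-1
    ("mixed", "D_full"): ("F0", "F1"),   # rule (4)-2
    ("mixed", "DF_full"): ("P0", "T"),   # rule (4)-3
    ("both0", "empty"): ("D0", "F0"),    # rule (5)-1
    ("both0", "D_full"): ("F0", "P0"),   # rule (5)-2
    ("both0", "DF_full"): ("P0", "T"),   # rule (5)-3
    ("both1", "empty"): ("D1", "F1"),    # rule (6)-1
    ("both1", "D_full"): ("F1", "P1"),   # rule (6)-2
    ("both1", "DF_full"): ("P1", "T"),   # rule (6)-3
}


def pair_kind(flag_a, flag_b):
    """Classify a fetched pair by its two slot flags (0 = slot 0, 1 = slot 1)."""
    if flag_a != flag_b:
        return "mixed"
    return "both0" if flag_a == 0 else "both1"


def buffer_state(occupied):
    """Map the set of occupied holding units to one of the states named in the rules."""
    occupied = set(occupied)
    if not occupied:
        return "empty"
    if occupied == {"D0", "D1"}:
        return "D_full"
    if occupied == {"D0", "D1", "F0", "F1"}:
        return "DF_full"
    raise ValueError("occupancy not covered by rules (4)-(6)")


def targets(flag_a, flag_b, occupied):
    """Return the two holding units that receive the fetched pair."""
    return RULES[(pair_kind(flag_a, flag_b), buffer_state(occupied))]
```

For instance, `targets(0, 0, ["D0", "D1", "F0", "F1"])` returns `("P0", "T")`, which corresponds to rule (5)-3. Partially filled states, which do occur in the operation examples, are deliberately left out because the rules as written do not enumerate them.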

【００６６】The operation of the instruction buffer of FIG. 4 is described below using operation examples (see FIGS. 5, 6, and 7). FIG. 5 shows an example of an instruction sequence held in the instruction cache 52. Two 4-byte instructions form the unit processed in one cycle: of each 8 bytes, the upper 4 bytes (MSW) and the lower 4 bytes (LSW) each hold one instruction. Each instruction carries a flag identifying its slot. Flag "0" marks an instruction processed in slot 0, and flag "1" marks an instruction processed in slot 1 (add instructions are assumed to execute in slot 0 and load instructions in slot 1). In each clock cycle (hereinafter simply "cycle"), either two instructions or only the upper bytes (MSW) are read from the instruction cache 52. Instructions written to the buffer (instruction holding units) in the same cycle can be executed simultaneously. Instructions written to the buffer in different cycles can be executed simultaneously if their addresses are consecutive across a 4-byte boundary (for example, load2 and add3, or add4 and load3, among the instructions above). Instructions that do not satisfy this condition (for example, load2 and load4) cannot be executed simultaneously.
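The simultaneous-execution condition described above can be sketched as a small predicate. This is an illustrative reading, not the circuit of the embodiment. The `Instr` type, the function name, and the byte addresses used in the usage note are assumptions, since the layout of FIG. 5 is not reproduced in the text. The requirement that the two instructions carry different flags is also an inference from the one-arithmetic-group-per-slot structure rather than a quoted rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    name: str
    address: int      # byte address (values in the usage note are hypothetical)
    flag: int         # 0 = slot 0, 1 = slot 1
    fetch_cycle: int  # cycle in which the instruction was written to the buffer


def can_execute_together(a: Instr, b: Instr) -> bool:
    """Return True if a and b may be issued in the same cycle under this reading."""
    # Assumption: each slot accepts one instruction per cycle,
    # so two instructions with the same flag cannot be paired.
    if a.flag == b.flag:
        return False
    same_cycle = a.fetch_cycle == b.fetch_cycle
    # "Consecutive on a 4-byte boundary": the addresses differ by one 4-byte word.
    adjacent = abs(a.address - b.address) == 4
    return same_cycle or adjacent
```

With hypothetical addresses chosen so that the pairs named in the text are adjacent (load2 at 12, add3 at 16, add4 at 20, load3 at 24, load4 at 28), the predicate accepts load2 with add3 and add4 with load3, and rejects load2 with load4.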

【００６７】FIG. 6 shows operation example 1. In cycle 1, add1 and load1 are read from the instruction cache 52, selected by selector 4 (94) and selector 5 (95), and written to D0 and D1. At this point, a command to execute two instructions is issued (two-instruction issue).

【００６８】In cycle 2, following the two-instruction issue, add1 and load1 are output from D0 and D1 to slot 0 and slot 1, respectively. load2 and add2 are then newly read from the instruction cache 52, selected by selector 5 (95) and selector 4 (94), and written to D1 and D0, respectively. At this point, two-instruction execution is issued.

【００６９】In cycle 3, following the two-instruction issue, load2 and add2 are output to slot 1 and slot 0, respectively. add3 and add4 are then newly read from the instruction cache 52. Because both are slot 0 instructions, they are selected by selector 4 (94) and selector 2 (92) and written to D0 and F0. At this point, execution of one instruction is issued (one-instruction issue).

【００７０】In cycle 4, following the one-instruction issue, add3 is output from D0 to slot 0, and add4 is selected by selector 4 (94) and moves from F0 to D0. The next instructions, load3 and load4, are read from the instruction cache 52. Because load3 and add4 are contiguous across the boundary (their program counter values are consecutive), they can be executed simultaneously. load3 is therefore selected by selector 5 (95) and held in D1, and load4 is selected by selector 3 (93) and held in F1. At this point, two-instruction execution is issued.

【００７１】In cycle 5, because of the two-instruction issue, load3 and add4 are output to slot 1 and slot 0, respectively. load4 is selected by selector 5 (95) and moves from F1 to D1, and load5 and add5 are newly read from the instruction cache 52. Because load4 in D1 and add5 cannot be executed simultaneously, add5 is selected by selector 2 (92) and written to F0, and load5 is selected by selector 3 (93) and written to F1. At this point, one-instruction execution is issued.

【００７２】In cycle 6, because of the one-instruction issue, load4 is output from D1 to slot 1, and load5 and add5 are selected by selector 5 (95) and selector 4 (94), respectively, and move to D1 and D0. load6 and add6 are newly read from the instruction cache 52, selected by selector 3 (93) and selector 2 (92), respectively, and written to F1 and F0. At this point, two-instruction execution is issued, and load5 and add5 are output in the next cycle (not shown).
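The issue counts of operation example 1 (two, two, one, two, one, two over cycles 1 to 6) can be reproduced with a deliberately simplified model. The sketch below replaces the D/F/P/T holding units and their selectors with a single in-order queue, and decides each cycle whether the two oldest instructions carry different flags. This is a substitute model chosen for brevity, not the structure of FIG. 4. It ignores buffer capacity and the zero-issue stalls of operation example 2. The program list follows the order in which the text names the instructions, and `issue_counts` is a hypothetical name.

```python
from collections import deque

# (name, flag) in the order the text reads them; flag 0 = slot 0, flag 1 = slot 1.
PROGRAM = [
    ("add1", 0), ("load1", 1),
    ("load2", 1), ("add2", 0),
    ("add3", 0), ("add4", 0),
    ("load3", 1), ("load4", 1),
    ("load5", 1), ("add5", 0),
    ("load6", 1), ("add6", 0),
]


def issue_counts(program, cycles):
    """Return the number of instructions marked for issue in each cycle."""
    queue = deque()
    fetch = iter(program)
    pending = 0  # instructions marked for issue in the previous cycle
    counts = []
    for _ in range(cycles):
        # Instructions marked in the previous cycle leave for their slots.
        for _ in range(pending):
            queue.popleft()
        # Fetch one pair (two instructions) from the cache.
        for _ in range(2):
            instr = next(fetch, None)
            if instr is not None:
                queue.append(instr)
        # Mark two for issue only if the two oldest instructions use different slots.
        if not queue:
            pending = 0
        elif len(queue) >= 2 and queue[0][1] != queue[1][1]:
            pending = 2
        else:
            pending = 1
        counts.append(pending)
    return counts
```

Calling `issue_counts(PROGRAM, 6)` gives `[2, 2, 1, 2, 1, 2]`, which matches the sequence of two-instruction and one-instruction issues described for cycles 1 to 6.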

【００７３】FIG. 7 shows operation example 2. Description of the selection performed by the selectors is omitted below. In cycle 1, add1 and load1 are read from the instruction cache and written to D0 and D1. Here, assume a zero-instruction issue, so that neither of the two instructions is executed in the next cycle.

【００７４】In cycle 2, add1 and load1 remain held in D0 and D1. The next instructions (load2 and add2) are read from the instruction cache 52 and held in F1 and F0. Assume a zero-instruction issue again, so that no instruction is executed in cycle 3 either.

【００７５】In cycle 3, add3 and add4 are read from the instruction cache 52. add4, in the upper bytes, is held in T, and add3, in the lower bytes, is held in P0. A two-instruction issue is made in cycle 3.

【００７６】In cycle 4, load1 and add1 are output from D1 and D0 to slot 1 and slot 0, respectively. load2 and add2 move to D1 and D0, respectively, and add3 and add4 move to F0 and P0, respectively. load3 and load4 are newly read from the instruction cache 52. Because load3 is contiguous with add4 across the boundary, it is written to P1, and load4 is written to T. At this point, a two-instruction issue is made.

【００７７】In cycle 5, following the two-instruction issue, load2 and add2 are output from D1 and D0 to slot 1 and slot 0, respectively. add3 and add4 move to D0 and P0, respectively, and load3 and load4 move to F1 and P1, respectively. Because there is no room to read the next instructions, load5 and add5, no new instructions are read. At this point, a one-instruction issue is made.

【００７８】In cycle 6, following the one-instruction issue, add3 is output from D0 and add4 moves to D0. load3 and load4 move to D1 and F1, respectively. load5 and add5 are newly read from the instruction cache 52 and held in P1 and P0, respectively. At this point, a two-instruction issue is made.

【００７９】In cycle 7, load3 and add4 are output from D1 and D0 to slot 1 and slot 0, respectively. load4 moves to D1, and load5 and add5 move to F1 and F0, respectively. load6 and add6 are read from the instruction cache 52 and held in P1 and P0, respectively. At this point, a one-instruction issue is made.

【００８０】[0080]

[Effects of the Invention] According to the present invention, slot selection, which greatly affects the processing speed of the microprocessor, is not performed in the decode stages of arithmetic unit A and arithmetic unit B. Processing is therefore not delayed in the decode stage, pipeline processing can be carried out efficiently, and the operation of the microprocessor as a whole is made faster. In addition, instruction handling in the instruction buffer can be controlled simply by means of the flags.
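The effect described above follows from generating the flag once, when an instruction arrives from external memory, and storing it alongside the instruction in the cache, so that later reads only need to inspect the flag. The sketch below illustrates that idea under stated assumptions: the class and function names are hypothetical, classification by the mnemonic prefix `load` stands in for the real instruction-type decoding, and the mapping of flag "0" to instruction holding unit A and flag "1" to instruction holding unit B follows the abstract.

```python
class FlaggedInstructionCache:
    """Toy cache that stores a slot flag next to each instruction (illustrative only)."""

    def __init__(self, memory):
        self.memory = memory        # external memory: address -> mnemonic
        self.lines = {}             # cache contents: address -> (mnemonic, flag)
        self.flag_generations = 0   # how many times the flag generator has run

    def _generate_flag(self, mnemonic):
        # Stand-in for the flag generation unit: classify the instruction type.
        # Assumption: loads go to slot 1 (flag 1), everything else to slot 0 (flag 0).
        self.flag_generations += 1
        return 1 if mnemonic.startswith("load") else 0

    def fetch(self, address):
        """Return (mnemonic, flag); the flag is generated only on a cache miss."""
        if address not in self.lines:
            mnemonic = self.memory[address]
            self.lines[address] = (mnemonic, self._generate_flag(mnemonic))
        return self.lines[address]


def select_holding_unit(flag):
    """Stand-in for the instruction holding unit selection unit: flag 0 -> A, flag 1 -> B."""
    return "B" if flag == 1 else "A"
```

Fetching the same address twice runs the flag generator only once, so the classification cost is paid at cache fill time rather than on every pass through the decode stage.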

[Brief description of drawings]

FIG. 1 is a diagram showing the basic configuration of the present invention.

FIG. 2 is a diagram showing an embodiment of the present invention.

FIG. 3 is a time chart of flag generation according to the present invention.

FIG. 4 is a diagram showing an embodiment of the instruction buffer of the present invention.

FIG. 5 is a diagram showing an example of instructions held in the instruction cache of the present invention.

FIG. 6 is a diagram showing operation example 1 of the present invention.

FIG. 7 is a diagram showing operation example 2 of the present invention.

FIG. 8 is a diagram showing the configuration of a conventional microprocessor.

[Explanation of symbols]

1: Microprocessor
2: Instruction cache
3: Flag holding unit
4: Flag generation unit
5: Instruction buffer
6: Instruction holding unit selection unit
7: Instruction holding unit A
8: Instruction holding unit B
9: Arithmetic unit A
10: Arithmetic unit B
12: Register file
13: Data cache

Claims

[Claims]

1. A microprocessor comprising an instruction buffer that holds instructions to be processed and a plurality of arithmetic units that receive instructions from the instruction buffer and perform arithmetic processing, the microprocessor being capable of executing a plurality of instructions simultaneously, wherein the arithmetic unit that processes each instruction is predetermined; the instruction buffer comprises an instruction holding unit corresponding to each arithmetic unit; and the microprocessor further comprises a flag generation unit that generates a flag specifying the arithmetic unit used by an instruction and adds the flag to the instruction, and an instruction holding unit selection unit that selects an instruction holding unit of the instruction buffer according to the flag, such that each instruction is held in the instruction holding unit corresponding to the arithmetic unit that processes it.