JP2006127080A - Pipeline processor

Info

Publication number: JP2006127080A
Application number: JP2004313400A
Authority: JP
Inventors: Takashi Mimura; 隆三村
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 2004-10-28
Filing date: 2004-10-28
Publication date: 2006-05-18

Abstract

PROBLEM TO BE SOLVED: To provide a pipeline processor that can prevent the loss of processing efficiency caused by cache miss hits as well as the loss of processing efficiency caused by data hazards.

SOLUTION: A pipeline processor 3 executes a plurality of instructions by dividing each of them into multiple operation stages. For the fetch stage, in which instructions are fetched, the processor is provided with an instruction fetch register 119 that takes in the instructions and an instruction fetch register control unit 109. The instruction fetch register control unit 109 detects a cache miss hit, in which an instruction cannot be fetched in the fetch stage. Even when a miss hit is detected, the instruction fetch register control unit 109 keeps the instruction fetch register 119 operating.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a pipeline processor.

Pipeline processors perform arithmetic processing by the so-called pipeline method, in which a plurality of instructions are executed in parallel. A pipeline processor executes each instruction in units of stages such as fetch and decode. Some pipeline processors have, in addition to an external memory that can store a large amount of data, a cache memory that has a smaller capacity but can be accessed faster.

In a pipeline processor with a cache memory, the data needed to execute instructions is read from the external memory into the cache memory. The processor then accesses the cache memory, takes out the data, and executes the operation, so that operations can be processed faster.
In such a pipeline processor, an error can occur in which the processor accesses an area of the cache memory to which the required information has not been written. Such an error is generally called a cache miss (referred to in this specification as a cache miss hit or simply a miss hit).

When a cache miss hit occurs, the operation of the entire pipeline processor stops until the information has been read from the external memory. The occurrence of cache miss hits therefore hinders the operation of the pipeline processor and is one cause of reduced processing efficiency. Conventional examples that prevent the loss of processing efficiency caused by cache miss hits include those described in Patent Document 1 and Patent Document 2.

The conventional example described in Patent Document 1 provides a shadow register file in addition to the register file. When a cache miss hit occurs in an operand fetch, information such as the address that missed is stored in the shadow register file. When the next cache miss hit occurs, the stored address is used to predict the address at which data should be accessed in the future.

In the invention described in Patent Document 2, a first program counter and a second program counter are provided in advance. During normal operation, the instruction data indicated by the address in the first program counter is supplied to the CPU. When a cache miss hit occurs, the instruction data indicated by the address in the second program counter is supplied to the CPU, which prevents the CPU from stopping when a cache miss hit occurs.
Patent Document 1: JP-A-6-51982
Patent Document 2: JP 2004-38600 A

Cache miss hits occur not only for data relating to operands used by the processor (operand data) but also when the processor fetches instruction data relating to processor instructions (opcodes). However, most of the prior art, including the examples above, was devised to avoid processor stalls caused by cache miss hits on operand data. One likely reason is that operand data has a higher cache miss hit rate than instruction data.

Nevertheless, although cache miss hits on instruction data are less frequent than those on operand data, they occur relatively often depending on the program and the system. When an instruction cache miss hit occurs, the pipeline processor must be stopped until the instruction has been read from the external memory. Cache miss hits on instruction data therefore reduce the processing efficiency of the pipeline processor just as cache miss hits on operand data do.

Among the instructions processed by a pipeline processor, an instruction whose processing starts later (say, instruction 2) may use in its operation the result obtained by an instruction whose processing started earlier (say, instruction 1).
For such instructions 1 and 2, a data hazard occurs if the result of instruction 1 has not yet been written at the time instruction 2 needs it. When a data hazard occurs, the pipeline processor applies an interlock and stops processing. The stall caused by the interlock that accompanies a data hazard is therefore, like a cache miss hit, one cause of reduced processing efficiency in a pipeline processor.

In conventional pipeline processors, such data hazards, together with cache miss hits, are one cause of reduced processing efficiency.
The present invention has been made in view of the above points, and its object is to provide a pipeline processor that can prevent the loss of processing efficiency caused by data hazards as well as the loss of processing efficiency caused by cache miss hits.

To solve the above problems, the pipeline processor of the present invention is a pipeline processor that executes a plurality of instructions by dividing each of them into multi-stage operation steps, and comprises: instruction fetching means for fetching an instruction in the fetch operation step, which is the operation step in which instructions are fetched; miss hit detection means for detecting, in the fetch operation step, a miss hit in which the instruction fetching means cannot fetch an instruction; and operation continuation means for keeping the instruction fetching means operating even when a miss hit is detected by the miss hit detection means.

According to this invention, in a pipeline processor that executes a plurality of instructions by dividing each of them into multi-stage operation steps, the instruction fetching means fetches instructions in the fetch operation step. The miss hit detection means detects a miss hit in this fetch operation step, and the instruction fetching means can be kept operating even when a miss hit is detected.
As a result, even when a miss hit occurs, the processing of other instructions in the pipeline is not halted by a stop of the instruction fetching means, and the processing of those other instructions proceeds ahead of the instruction that caused the miss hit. This prevents a data hazard that the program would otherwise cause later, and provides a pipeline processor that can execute pipeline processing more smoothly and efficiently.

In the pipeline processor of the present invention, when a miss hit is detected by the miss hit detection means, the operation continuation means keeps the instruction fetching means operating by instructing it not to fetch an instruction.
According to this invention, the instruction fetching means can continue operating without taking in the invalid instruction that is input to the pipeline processor when a miss hit is detected. The processing of other instructions in the pipeline is therefore not halted by a stop of the instruction fetching means, and the processing of those other instructions proceeds ahead of the instruction that caused the miss hit. This prevents a data hazard that the program would otherwise cause later, and provides a pipeline processor that can execute pipeline processing more smoothly and efficiently.

In the pipeline processor of the present invention, the operation continuation means instructs the instruction fetching means not to fetch an instruction by supplying a NOP signal to it.
According to this invention, the instruction fetching means can be kept from stopping in a relatively simple way.

The pipeline processor of the present invention further comprises hazard detection means for detecting a hazard that occurs between the operation steps, and the operation continuation means supplies the NOP signal to the instruction fetching means when the miss hit detection means has detected a miss hit and the hazard detection means has not detected a hazard.
According to this invention, the NOP signal can be supplied to the instruction fetching means at a time when a miss hit has occurred and no hazard has occurred. Because the NOP signal is supplied efficiently at the appropriate time, a data hazard that the program would otherwise cause later can be prevented, and a pipeline processor with still higher processing efficiency can be provided.

The pipeline processor of the present invention is also a pipeline processor that executes a plurality of instructions by dividing them into multi-stage operation steps that include a fetch operation step for fetching an instruction and a decode operation step for decoding the fetched instruction, and comprises: instruction fetching means for fetching an instruction in the fetch operation step; instruction decoding means for decoding, in the decode operation step, the instruction fetched by the instruction fetching means; miss hit detection means for detecting a miss hit in which the instruction fetching means cannot fetch an instruction; interlock detection means for detecting that an interlock has been applied because of a hazard in the decode operation step; and other operation continuation means for, when the miss hit detection means detects that a miss hit has occurred and the interlock detection means detects that an interlock has been applied, stopping the operation of the decoding means until the interlock is released while keeping at least some of the operation steps other than the decode operation step operating.

According to this invention, an instruction is fetched in the fetch operation step among the multi-stage operation steps. A miss hit in which the instruction fetching means could not fetch an instruction is detected, and it is also detected that an interlock has been applied because of a hazard in the decode operation step. When both a miss hit and an interlock are detected, the operation of the decoding means is stopped until the interlock is released while at least some of the operation steps other than the decode operation step are kept operating.

According to this invention, while the decode stage is stalled with both an interlock and a miss hit present, the other operation steps can continue to execute, and the operation of other instructions is not affected. The hazard can thus be dealt with while the fetch of the instruction that caused the miss hit is suspended, and the total stall time of the pipeline processor can be made shorter than when the miss hit and the interlock are handled separately. A pipeline processor that can execute pipeline processing more smoothly and efficiently can therefore be provided.

In the present invention, when the miss hit detection means detects that a miss hit has occurred and the interlock detection means detects that an interlock has been applied at the same time as, or immediately before, the miss hit, the other operation continuation means stops the operation of the decoding means until the interlock is released while keeping at least some of the operation steps other than the decode operation step operating.

According to this invention, the process of stopping the decoding means while keeping at least some of the operation steps other than the decode operation step operating can be executed at the optimal time. A pipeline processor that can execute pipeline processing more smoothly and efficiently can therefore be provided.
The pipeline processor of the present invention is also a pipeline processor that executes a plurality of instructions by dividing each of them into multi-stage operation steps, and comprises: instruction fetching means for fetching an instruction in the fetch operation step, which is the operation step in which instructions are fetched; a plurality of instruction storage means for storing the instructions fetched by the instruction fetching means; and instruction decoding means for taking out and decoding the instructions stored in the instruction storage means. While a hazard has been detected by the hazard detection means and an interlock is applied, the instruction storage means store, one after another, the instructions fetched by the instruction fetching means; after the interlock is released, the instruction decoding means decodes, in order, the instructions stored in the instruction storage means.

According to this invention, while a hazard has been detected by the hazard detection means and an interlock is applied, the instructions fetched by the instruction fetching means are stored one after another in the plurality of instruction storage means, and after the interlock is released the instructions stored in the instruction storage means are decoded in order. Because new instructions continue to be fetched even after the interlock is released, new instructions keep entering the instruction storage means, and processing proceeds with instructions present in the instruction storage means. If a miss hit occurs in this state, operation can continue using the stored instructions. As a result, for example, the operation continuation means need not supply the NOP signal, or the number of times it is supplied can be reduced, and a pipeline processor that can execute pipeline processing more smoothly and efficiently can be provided.
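
The buffering behaviour described above can be illustrated with a small queue model in Python. This is only a sketch under stated assumptions, not the circuit of the invention: the class name `InstructionStore`, the capacity of two entries, the instruction names, and the rule that the decoder receives a NOP when nothing is stored are all hypothetical choices made for the example.

```python
from collections import deque

NOP = "NOP"  # assumed placeholder for the no-operation instruction


class InstructionStore:
    """Toy model of a plurality of instruction storage means (hypothetical)."""

    def __init__(self, capacity=2):  # the capacity is an assumption
        self.capacity = capacity
        self.slots = deque()

    def cycle(self, fetched, interlock):
        """Advance one clock cycle.

        fetched   -- instruction delivered by the fetch step, or None on a miss hit
        interlock -- True while the decode step is stalled by a hazard
        Returns the instruction handed to the decoder, or None while stalled.
        """
        # Fetching continues during the interlock as long as a slot is free.
        # (A real design would stall the fetch when full; this sketch ignores that case.)
        if fetched is not None and len(self.slots) < self.capacity:
            self.slots.append(fetched)
        if interlock:
            return None  # decoder is stopped; instructions accumulate
        if self.slots:
            return self.slots.popleft()  # decode stored instructions in order
        return NOP  # nothing stored and nothing fetched: fall back to a NOP


store = InstructionStore()
trace = [
    store.cycle("INST_A", interlock=True),  # stored, decoder stalled
    store.cycle("INST_B", interlock=True),  # stored, decoder stalled
    store.cycle(None, interlock=False),     # miss hit, but INST_A is available
    store.cycle(None, interlock=False),     # miss hit, but INST_B is available
    store.cycle(None, interlock=False),     # store is empty: a NOP is needed
]
print(trace)
```

In this model the two miss hit cycles that follow the interlock are covered by stored instructions, and a NOP is needed only once the store has drained, which is the reduction in NOP supply that the passage describes.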

Embodiments 1 and 2 of the pipeline processor according to the present invention are described below with reference to the drawings.
(Embodiment 1)
FIG. 1 is a block diagram of the pipeline processor in Embodiment 1 of the present invention. The illustrated configuration consists of an external memory 1 that stores the instructions given to the pipeline processor, an instruction cache 2 used to cache the instructions stored in the external memory 1, and a pipeline processor 3.

The pipeline processor 3 executes a plurality of instructions by dividing each of them into multi-stage operation steps. In Embodiment 1, it takes a plurality of instructions out of the instruction cache 2 and executes each of them. In Embodiment 1, the multi-stage operation steps for executing each instruction are the following four operation stages: a fetch stage that fetches an instruction from the instruction cache, a decode stage that decodes the fetched instruction, an execution stage that executes the instruction, and a write-back stage that writes the result obtained in the execution stage to a register.

To execute instructions divided into these operation stages, the pipeline processor 3 includes an instruction fetch register 119 that takes in instructions in the fetch stage, an instruction fetch register control unit 109 that controls it, an instruction decode unit 111 that handles the decode stage, an execution unit 113 that performs the execution stage (the operation), and a register file 115 to which the results of operations by the execution unit 113 are written. The instruction fetch register control unit 109 contains a spare register 121 along with the instruction fetch register 119. The processor also includes an instruction fetch request control unit 105 for issuing instruction read requests to the instruction cache 2. The instruction fetch request control unit 105 outputs to the instruction cache 2 an instruction read request signal (labeled PBUS_RD_EN in the figure) and a read address signal (labeled PBUS_ADRS in the figure). The read request and read address are output in the cycle before the fetch stage.

In the above configuration, in the fetch stage the instruction fetch register control unit 109 takes an instruction from the instruction cache 2 into the instruction fetch register 119. If the instruction fetch request control unit 105 has issued the read request to an area of the instruction cache 2 in which the instruction is not cached, the instruction cannot be taken in immediately. This is generally called a cache miss hit.

When a cache miss hit is detected and the interlock described later is not detected, the instruction fetch register control unit 109 supplies a NOP (no-operation instruction) signal to the instruction fetch register 119. NOP is an instruction that directs that no processing be performed.
Consequently, when an instruction cannot be taken in because of a miss hit, the instruction fetch register 119 takes in the NOP signal from the instruction fetch register control unit 109 and so keeps operating without stopping. Here, to keep operating means to behave as if the fetch had completed normally even when no instruction can be obtained at the current operation timing (clock cycle). When the instruction fetch register 119 keeps operating despite a cache miss hit, the subsequent operation stages such as decode and execution also continue.
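
The selection performed by the instruction fetch register control unit 109 can be sketched as a small Python function. This is a minimal illustration rather than the actual control logic: the function name, the string `"NOP"` as the encoding of the no-operation instruction, and the behaviour under an interlock (holding the current value) are assumptions, since this passage specifies only the case of a miss hit without an interlock.

```python
NOP = "NOP"  # assumed encoding of the no-operation instruction


def fetch_register_input(miss_hit, interlock, cache_data, held):
    """Value loaded into the instruction fetch register this cycle (sketch).

    miss_hit   -- True when a cache miss hit is detected
    interlock  -- True when an interlock is detected
    cache_data -- instruction word offered by the instruction cache
    held       -- value currently in the fetch register
    """
    if interlock:
        # Assumption: the interlock case is not specified in this passage,
        # so the sketch simply holds the current value.
        return held
    if miss_hit:
        # Miss hit without interlock: take in a NOP so the stage keeps moving.
        return NOP
    return cache_data  # normal fetch


print(fetch_register_input(True, False, "INVALID", "INST3"))
```

Because the register receives a NOP instead of waiting, the decode and execution stages behind it see an ordinary instruction stream and continue to advance.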

With this operation, the instruction that caused the cache miss hit does not affect other instructions, and the other instructions complete just as they would if no cache miss hit had occurred.
The pipeline processor of Embodiment 1 also includes a detection unit 107 for detecting data hazards and the like. The detection unit 107 outputs an interlock signal (labeled INTERLOCK in the figure) to the instruction fetch request control unit 105 and the instruction fetch register control unit 109. The interlock signal takes different values depending on whether the detection unit 107 has detected a hazard.

The pipeline processor of Embodiment 1 further includes a branch address calculation unit 103 that calculates the branch target address when a branch instruction is decoded in the instruction decode unit 111, and a program counter 101 from which the instruction read address signal output by the instruction fetch request control unit 105 is derived. The program counter 101 is normally incremented every cycle, but when a branch instruction is decoded it is updated to the branch target address output by the branch address calculation unit 103. When the instruction read request signal output by the instruction fetch request control unit 105 is disabled, the program counter 101 is controlled so that it is not updated. In Embodiment 1, the instruction read request signal output by the instruction fetch request control unit 105 is controlled so that it is disabled in any cycle in which a cache miss hit or an interlock occurs.
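
The update rule for the program counter 101 can be written out as a short Python sketch. The function names, the increment step of 1, and the choice to let a disabled read request take priority over a decoded branch are assumptions made for illustration; the text states only that the counter is not updated while the read request is disabled.

```python
def read_request_enabled(miss_hit, interlock):
    """The read request is disabled in a cycle with a miss hit or an interlock."""
    return not (miss_hit or interlock)


def next_program_counter(pc, miss_hit, interlock, branch_decoded, branch_target):
    """Next value of the program counter (sketch)."""
    if not read_request_enabled(miss_hit, interlock):
        return pc  # read request disabled: the counter is not updated
    if branch_decoded:
        return branch_target  # address from the branch address calculation unit
    return pc + 1  # normal case; a step of 1 is an assumption


print(next_program_counter(4, miss_hit=True, interlock=False,
                           branch_decoded=False, branch_target=0))
```

Holding the counter during a miss hit means the same address is requested again, so the instruction that missed is fetched once it becomes available.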

The configuration described above basically operates as follows. The instruction cache 2 shown in FIG. 1 outputs, together with the instruction data, a miss hit signal (labeled PBUS_WAIT in the figure) to the instruction fetch register control unit 109. The value of the miss hit signal switches depending on whether a cache miss hit has occurred, and the instruction fetch register control unit 109 detects a cache miss hit from the value of the signal.

When a hazard occurs, the detection unit 107 switches the value of the interlock signal to notify the instruction fetch register control unit 109 that an interlock has been applied.
In Embodiment 1, the miss hit signal and the interlock signal are both digital signals that take the value 1 or 0. The interlock signal is 0 when no interlock is applied and 1 when an interlock is applied. The miss hit signal is 0 when no cache miss hit has occurred and 1 when a cache miss hit has occurred.

With the interlock signal and the miss hit signal defined in this way, the instruction fetch register control unit 109 of Embodiment 1 can recognize four states from the combination of the two signal values.
The four states are: neither a cache miss hit nor an interlock has occurred; both a cache miss hit and an interlock have occurred; only a cache miss hit has occurred; and only an interlock has occurred.
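
The four combinations can be enumerated in a short Python sketch. The enumeration and member names are hypothetical labels chosen for the example; the signal names PBUS_WAIT and INTERLOCK and their 0/1 encoding follow the definitions given for Embodiment 1.

```python
from enum import Enum


class PipelineState(Enum):
    """States keyed by (PBUS_WAIT, INTERLOCK); the names are illustrative."""
    NORMAL = (0, 0)                  # neither a miss hit nor an interlock
    MISS_HIT_ONLY = (1, 0)           # only a cache miss hit
    INTERLOCK_ONLY = (0, 1)          # only an interlock
    MISS_HIT_AND_INTERLOCK = (1, 1)  # both at once


def classify(pbus_wait, interlock):
    """Map the two 1-bit signals to one of the four states."""
    return PipelineState((pbus_wait, interlock))


for wait in (0, 1):
    for lock in (0, 1):
        print(wait, lock, classify(wait, lock).name)
```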

Next, the operation of the pipeline processor is described separately for the case where only a cache miss hit occurs (no interlock) and the case where a cache miss hit and an interlock occur at the same time. In Embodiment 1, each description first shows the operation of the conventional configuration, before the operation of the pipeline processor of Embodiment 1, so that the two can be compared.

1. When only a cache miss hit occurs (no interlock)
First, the operation of the pipeline processor when only a cache miss hit occurs (no interlock) is described below. The description assumes that the pipeline processor executes Program 1 shown below.
Program 1
(INST1)
(INST2)
(INST3) ADD %SR2, %SR0, %SR1
(INST4) ADD %SR4, %SR2, %SR3
(INST5)
(INST6)
In Program 1 above, INST (INSTruction) 1 to INST6 each denote an instruction code. INST3 and INST4 are both instruction codes of ADD instructions. INST3 adds the value read from SR0 (register 0) to the value read from SR1 and writes the sum to SR2. INST4 adds the value computed by the ADD instruction of INST3 and written to SR2 to the value read from SR3 and writes the sum to SR4. INST1, INST2, INST5, and INST6 may be arbitrary instructions.
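
The dependency between INST3 and INST4 can be checked mechanically. The following Python sketch is an illustration only: the helper names are hypothetical, and the operand syntax (destination register first, operands separated by commas, registers prefixed with `%`) is an assumption based on the description of the two ADD instructions above.

```python
def parse_instruction(text):
    """Split 'ADD %SR2, %SR0, %SR1' into (mnemonic, destination, sources)."""
    mnemonic, operand_text = text.split(None, 1)
    registers = [op.strip().lstrip("%") for op in operand_text.split(",")]
    return mnemonic, registers[0], registers[1:]


def reads_result_of(later, earlier):
    """True if `later` uses the destination register of `earlier` as a source."""
    _, destination, _ = parse_instruction(earlier)
    _, _, sources = parse_instruction(later)
    return destination in sources


INST3 = "ADD %SR2, %SR0, %SR1"
INST4 = "ADD %SR4, %SR2, %SR3"
print(reads_result_of(INST4, INST3))
```

INST4 names SR2 as a source and INST3 names SR2 as its destination, so the check reports a dependency; this is the situation in which a data hazard can arise between the two instructions.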

In Program 1, INST4 cannot be decoded until INST3 has finished writing its result. In other words, INST4 cannot execute its decode stage until the write-back stage of INST3 has completed.
FIGS. 2 and 3 are timing charts for executing Program 1. FIG. 2 is the timing chart of a conventional pipeline processor, and FIG. 3 is the timing chart of the pipeline processor of Embodiment 1. In the timing charts, F denotes the fetch stage, D the decode stage, E the execution stage, and WB the write-back stage. The cache miss signal is labeled PBUS_WAIT and the interlock signal is labeled INTERLOCK.

In both FIG. 2 and FIG. 3, the fetch of INST4 misses in the cache in clock cycle 4. In the conventional configuration, INST1, INST2, and INST3 do not advance to their next stages (they stall) until clock cycle 6, when the fetch completes.
After normal operation resumes at clock cycle 6, the conventional configuration encounters a data hazard between INST3 and INST4 in clock cycle 7. INST4 can therefore execute its decode stage only after clock cycle 8, in which the write-back stage of INST3 is performed. In the example of FIG. 2, the decode stage of INST4 is executed in clock cycle 9, followed by the execution and write-back stages, and INST4 completes in clock cycle 11.

In the pipeline processor of Embodiment 1, INST4 likewise misses in clock cycle 4. At this time, the active state of the miss signal, which indicates that a cache miss has occurred, is input to the instruction fetch register control unit 109. When the miss signal is active and the interlock signal is disabled, the instruction fetch register control unit 109 generates a NOP signal and has the instruction fetch register 119 store it.

The NOP signal is stored, for example, as follows. The instruction fetch register control unit 109 includes a storage device that holds the NOP signal in advance. When the control unit detects a cache miss from the miss signal and no interlock is in effect, it outputs the NOP signal from this storage device to the instruction fetch register 119, which stores it.

In FIG. 3, the NOP signal from the instruction fetch register control unit 109 is stored in the instruction fetch register 119 in clock cycles 4 and 5, in which no instruction can be fetched. The stage in which the control unit stores the NOP signal is defined as the hard fetch stage and is labeled "Hard F" in the figure.
As described above, the pipeline processor of Embodiment 1 executes the hard fetch stage in clock cycles 4 and 5, so no stage of the pipeline stops. The stages of INST1, INST2, and INST3 are therefore executed in sequence without stalling, and the write-back stage of INST3 completes earlier than in the conventional configuration.

In the example of FIG. 3, the result of INST3 has already been written by the time the decode stage of INST4 is performed in clock cycle 7. INST4 can therefore use the result of INST3 in clock cycle 7, and the pipeline processor is not stopped by an interlock. In this way, when a cache miss occurs, the pipeline processor of Embodiment 1 avoids a hazard that would otherwise arise and improves the efficiency of pipeline processing.
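The cycle counts quoted for FIGS. 2 and 3 can be reproduced with a small timing model. The following Python sketch is an illustration under simplifying assumptions that are not taken from the patent: every stage takes one cycle, an instruction may decode only in a cycle after its producer's write-back, and the conventional whole-pipeline stall is modeled as a set of frozen cycles in which no stage completes. The function name `schedule` and its parameters are hypothetical.

```python
from typing import Dict, List, Set


def schedule(n: int, miss: Dict[int, int], deps: Dict[int, int],
             frozen: Set[int]) -> List[Dict[str, int]]:
    """Return, for each instruction, the cycle in which each stage completes.

    n      -- number of instructions (index 0 is INST1)
    miss   -- {instruction index: cycle in which its delayed fetch completes}
    deps   -- {consumer index: producer index} (read-after-write)
    frozen -- cycles in which the whole pipeline is stalled
    """
    def first_free(cycle: int) -> int:
        # Push a stage past any cycle in which the pipeline is frozen.
        while cycle in frozen:
            cycle += 1
        return cycle

    out: List[Dict[str, int]] = []
    for i in range(n):
        prev = out[-1] if out else {"F": 0, "D": 0, "E": 0, "WB": 0}
        t: Dict[str, int] = {}
        t["F"] = first_free(max(prev["F"] + 1, miss.get(i, 0)))
        # A dependent instruction decodes only after the producer's write-back.
        ready = out[deps[i]]["WB"] + 1 if i in deps else 0
        t["D"] = first_free(max(t["F"] + 1, prev["D"] + 1, ready))
        t["E"] = first_free(max(t["D"] + 1, prev["E"] + 1))
        t["WB"] = first_free(max(t["E"] + 1, prev["WB"] + 1))
        out.append(t)
    return out


# Program 1: the fetch of INST4 (index 3) completes in cycle 6,
# and INST4 reads SR2, which INST3 (index 2) writes.
conventional = schedule(4, miss={3: 6}, deps={3: 2}, frozen={4, 5})
embodiment1 = schedule(4, miss={3: 6}, deps={3: 2}, frozen=set())
print(conventional[3])  # stage completion cycles of INST4, conventional
print(embodiment1[3])   # stage completion cycles of INST4, Embodiment 1
```

Under these assumptions the model places the decode stage of INST4 in cycle 9 and its completion in cycle 11 for the conventional case, as described for FIG. 2, and in cycles 7 and 9 respectively when NOPs are injected instead of stalling.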

2. When a cache miss and an interlock occur at the same time
Next, the operation of the pipeline processor when a cache miss and an interlock occur at the same timing is described. The description assumes that the pipeline processor executes Program 2 shown below.
Program 2
(INST1)
(INST2) ADD %SR2, %SR0, %SR1
(INST3) ADD %SR4, %SR2, %SR3
(INST4)
(INST5)
(INST6)
In Program 2, INST3 cannot be decoded until INST2 has finished writing its result. In other words, INST3 cannot execute its decode stage until the write-back stage of INST2 has completed.

When a cache miss and an interlock occur at the same timing, the detection unit 107 of the pipeline processor shown in FIG. 1 functions as interlock detection means, which detects that an interlock is applied because of a hazard in the decode stage.
When a miss is detected and an interlock is also detected, the instruction fetch register control unit 109 functions as other-operation continuation means: it stops the instruction decode unit 111 until the interlock is released while keeping at least some of the stages other than the decode stage running.

In Embodiment 1, the case in which "a miss is detected and an interlock is detected" means the case in which a miss is detected and an interlock is detected either at the same time as the miss or immediately before it. The following description assumes that the miss and the interlock are detected at the same time, which here means that the miss detection signal and the interlock detection signal become active within the same cycle.

FIGS. 4 and 5 are timing charts illustrating the operation of a pipeline processor when a cache miss and an interlock occur at the same timing. FIG. 4 is the timing chart of a conventional pipeline processor, and FIG. 5 is the timing chart of the pipeline processor of Embodiment 1 handling the same situation.

As shown in FIG. 4, an interlock and a miss occur simultaneously in clock cycle 4. The conventional pipeline processor first deals with the miss by stopping the stages of INST1, INST2, and INST3. After the miss is resolved, it deals with the hazard behind the interlock: the decode stage of INST3 remains stopped in clock cycles 6 and 7 until the write-back stage of INST2 completes.

In the pipeline processor of Embodiment 1, by contrast, when an interlock and a cache miss on INST4 occur in clock cycle 4, the decode stage of INST3 is stopped while INST1, in the write-back stage, and INST2, in the execution stage, continue to execute. Clock cycle 5 proceeds in the same way. By keeping the execution and write-back stages running, the processor deals with the interlock and with the miss on INST4 at the same time.

Conventionally, all stages stop during a cache miss, so the interlock is resolved only after the miss has been resolved. Compared with that operation, Embodiment 1 stalls for fewer cycles, which improves the efficiency of pipeline processing.
In the configuration of FIG. 1, this operation is realized by having the instruction fetch register control unit 109 stop only the instruction decode unit 111 when it detects a cache miss and an interlock at the same time. That is, the contents of the instruction fetch register 119 are not updated, while the execution unit 113 and the like continue to operate.

FIG. 6 is a table showing how the contents of the instruction fetch register 119 change in Embodiment 1 according to the states of the cache miss signal (PBUS_WAIT in the figure) and the interlock signal (INTERLOCK in the figure). It covers both cases described above: a cache miss alone (no interlock), and a cache miss and an interlock at the same time. In the table, 1 means active and 0 means disabled. The circled numbers identify states of the pipeline processor. State ① is the state in which neither a cache miss nor an interlock has occurred. State ② is the state in which only an interlock has occurred. State ③ is the state in which only a cache miss has occurred, and state ④ is the state in which both have occurred.

In state ①, neither a cache miss nor an interlock has occurred, so the contents of the instruction fetch register 119 are replaced with the newly fetched instruction.
State ③ corresponds to the case described above in which only a cache miss occurs (no interlock). Here the instruction fetch register control unit 109 has the instruction fetch register 119 store a NOP instruction. The pipeline processor therefore operates as if no cache miss had occurred and does not stop. This prevents the interlock that would otherwise occur after a conventional processor resumes from a cache-miss stall.

State ④ corresponds to the case described above in which a cache miss and an interlock occur at the same time. Because an interlock is in effect, the decode stage must be stopped, so the instruction fetch register 119 keeps its current instruction in the next cycle. A conventional processor would also stop the instructions in the execution and write-back stages, because a cache miss is occurring at the same time. In the present invention, the instructions in the execution and write-back stages are not stopped, which improves performance.

State ② is the state of the pipeline processor in which only an interlock is applied. Because an interlock is in effect, the decode stage must be stopped, so the instruction fetch register 119 keeps its current instruction in the next cycle. If a new instruction has been fetched at this time, it must be held in a separate register. An example of such a newly fetched instruction is INST4, which is fetched in cycle 4 of FIG. 9, a timing chart of Embodiment 1 described later. The spare register 121 shown in FIG. 1 is provided to hold such an instruction.
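The four states of the FIG. 6 table described above can be summarized as a next-state function. The following Python sketch is an illustration only: the function name `next_fetch_state`, the string `"NOP"`, and the tuple return value are assumptions introduced here, and the way the spare register 121 is later drained is not modeled because it is not described in this passage.

```python
from typing import Optional, Tuple

NOP = "NOP"  # stand-in for the NOP code held by control unit 109


def next_fetch_state(pbus_wait: int, interlock: int, ifr: str,
                     fetched: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (next IFR contents, instruction parked in the spare register).

    pbus_wait -- cache miss signal (1 = active)
    interlock -- interlock signal (1 = active)
    ifr       -- current contents of instruction fetch register 119
    fetched   -- instruction delivered this cycle, or None
    """
    if interlock:
        # Interlock only, or interlock with a miss: decode stops, so the
        # register holds. Without a miss, a newly fetched instruction is
        # parked in the spare register.
        spare = fetched if not pbus_wait else None
        return ifr, spare
    if pbus_wait:
        # Miss only: inject a NOP so the later stages keep moving.
        return NOP, None
    # Neither signal active: take the fetched instruction.
    if fetched is None:
        raise ValueError("no miss is active, so an instruction is expected")
    return fetched, None


print(next_fetch_state(1, 0, "INST3", None))     # miss only
print(next_fetch_state(0, 1, "INST3", "INST4"))  # interlock only
```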

In Embodiment 1, when the interlock signal becomes 1 (active), the instruction read request signal output from the instruction fetch request control unit 105 is disabled, so no further instructions are fetched. A single spare register 121 is therefore sufficient in Embodiment 1.
When only a cache miss occurs (no interlock), the pipeline processor of Embodiment 1 described above does not stop the instruction fetch register 119. Pipeline processing therefore continues, and the execution and write-back stages make progress. As a result, Embodiment 1 prevents a hazard that would otherwise occur and avoids stopping the pipeline processor on an interlock.

When a cache miss and an interlock occur at the same time, the pipeline processor of Embodiment 1 stops only the decode stage and keeps the other stages running. Embodiment 1 can thereby deal with the penalty (processing stall) of the cache miss and the penalty of the hazard simultaneously. In effect, the interlock penalty is hidden behind the cache miss penalty.

Embodiment 1 of the present invention is not limited to the configuration described above. Embodiment 1 has been described only for the case in which an instruction cache miss and a hazard-induced interlock occur at the same time. However, the same operation also applies when the interlock occurs immediately before the cache miss: in that case too, the cache miss penalty and the interlock penalty are dealt with simultaneously, and the pipeline processor operates more efficiently.
(Embodiment 2)
Next, Embodiment 2 of the present invention is described. FIG. 7 is a block diagram of the pipeline processor of Embodiment 2. Components in FIG. 7 that are the same as those described for the pipeline processor of Embodiment 1 are given the same reference numerals, and their description is partly omitted.

Like the pipeline processor of FIG. 1, the pipeline processor of FIG. 7 executes each instruction in four stages: fetch, decode, execution, and write-back. The pipeline processor of Embodiment 2 therefore also includes an instruction fetch register control unit 709, an instruction decode unit 111, an execution unit 113, and a register file 115.

The pipeline processor of FIG. 7 also includes a detection unit 107, which detects data hazards and the like, and an instruction fetch request control unit 105.
The instruction fetch register control unit 709 of Embodiment 2 is the same as that of Embodiment 1 in that it includes the instruction fetch register 119, but differs in that it further includes a plurality of instruction storage means: instruction buffer 0, instruction buffer 1, ..., instruction buffer N. In Embodiment 2, these instruction buffers are collectively referred to as the instruction buffer group 719.

In normal operation, as in Embodiment 1, the instruction fetch register 119 takes in instructions from the instruction cache 2 under the control of the instruction fetch register control unit 709. A fetched instruction is decoded by the instruction decode unit 111 and executed (computed) by the execution unit 113, and its result is written to the register file 115 in the write-back stage.

The pipeline processor of Embodiment 2, however, continues to fetch instructions even while a hazard detected by the detection unit 107 keeps an interlock applied, and stores the fetched instructions one after another in the instruction buffers of the instruction buffer group 719. After the interlock is released, the instructions stored in the instruction buffers are shifted, one at a time, into the instruction fetch register 119 and processed.

To keep fetching instructions during an interlock, the instruction fetch request control unit 105 of Embodiment 2, unlike that of Embodiment 1, keeps the instruction read request signal (PBUS_RD_EN in the figure) active even while the interlock is active.
Next, the operation of this pipeline processor is described. In Embodiment 2, the pipeline processor is assumed to execute Program 2 shown in Embodiment 1. As before, the operation of the conventional configuration is shown first and then contrasted with that of Embodiment 2.

FIGS. 8, 9, and 10 are timing charts for executing this program. FIG. 8 is the timing chart of a conventional pipeline processor. FIGS. 9 and 10 are the timing charts of the pipeline processors of Embodiment 1 and Embodiment 2, respectively.
In each of FIGS. 8, 9, and 10, INST3 causes a hazard in clock cycle 4, and an interlock is applied until the write-back stage of INST2 completes. After that, a cache miss occurs on INST8.

FIG. 9 is a timing chart of the case in which Embodiment 1 described above is applied. As shown, pipeline processing is kept running by storing NOPs in the instruction fetch register 119 while the miss is outstanding. In the conventional operation of FIG. 8, an interlock occurs in clock cycle 14 if INST8 uses the result of INST6 or INST7. In FIG. 9, because pipeline processing has continued, the write-back stages of INST6 and INST7 have completed by clock cycle 14, so no interlock occurs. The pipeline processor of Embodiment 1 therefore has higher processing efficiency than the conventional pipeline processor.

FIG. 10 is a timing chart of the pipeline processor of Embodiment 2 executing the same Program 2. As FIG. 10 shows, the pipeline processor of Embodiment 2 continues to fetch instructions even while an interlock is applied. When the interlock occurs, it fetches INST4 into instruction buffer 0 (B0 in the figure) in clock cycle 4, and INST5 into instruction buffer 1 (B1 in the figure) in clock cycle 5.

After the interlock is released, the instructions held in the instruction buffers are processed in the order in which they were fetched. In the example of FIG. 10, in clock cycle 6, after the interlock is released, INST4 in instruction buffer 0 (B0 in the figure) is stored in the instruction fetch register 119. At the same time, INST5 in instruction buffer 1 is moved to instruction buffer 0, and INST6, fetched from the instruction cache 2, is stored in instruction buffer 1.

Similarly, in clock cycle 7, INST5 in instruction buffer 0 is moved to the instruction fetch register 119, and INST6 in instruction buffer 1 is moved to instruction buffer 0. INST8, fetched from the instruction cache 2, is stored in instruction buffer 1.
In this way, the instructions are fetched one after another into the instruction buffers of the instruction buffer group 719, and execution of each instruction begins when it is moved to the instruction fetch register 119.
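The buffering and shifting described for clock cycles 4 to 6 of FIG. 10 can be sketched as a one-cycle update function. This Python sketch is an illustration under assumptions introduced here: the instruction buffer group is modeled as an unbounded queue rather than N fixed buffers, and the names `step`, `buffers`, and `"NOP"` are hypothetical.

```python
from collections import deque
from typing import Deque, Optional

NOP = "NOP"


def step(ifr: str, buffers: Deque[str], interlock: bool,
         fetched: Optional[str]) -> str:
    """Advance one cycle; mutate `buffers` and return the next IFR contents.

    buffers[0] plays the role of instruction buffer 0 (B0), buffers[1] of B1.
    fetched is the instruction delivered by the cache this cycle, or None.
    """
    if interlock:
        next_ifr = ifr                 # decode is stopped: hold
    elif buffers:
        next_ifr = buffers.popleft()   # B0 -> IFR, B1 -> B0, ...
    else:
        # Nothing buffered: take the fetched instruction, or a NOP on a miss.
        next_ifr = fetched if fetched is not None else NOP
        fetched = None
    if fetched is not None:
        buffers.append(fetched)        # store behind the buffered instructions
    return next_ifr


buffers: Deque[str] = deque()
ifr = "INST3"
ifr = step(ifr, buffers, interlock=True, fetched="INST4")   # cycle 4
ifr = step(ifr, buffers, interlock=True, fetched="INST5")   # cycle 5
ifr = step(ifr, buffers, interlock=False, fetched="INST6")  # cycle 6
print(ifr, list(buffers))
```

After the third call the register holds INST4 and the queue holds INST5 followed by INST6, which corresponds to the state described for clock cycle 6.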

In FIG. 9 (Embodiment 1), the cache miss on INST8 occurs in clock cycle 10. In FIG. 10 (Embodiment 2), because instruction fetching continued during the interlock as described above, the cache miss on INST8 occurs in clock cycle 8, two cycles earlier. In other words, processing in FIG. 10 is two cycles ahead of FIG. 9, and the efficiency of pipeline processing is improved accordingly.

FIGS. 9 and 10 can also be compared as follows. During the three cycles of the cache miss, three NOPs are taken in in FIG. 9, whereas in FIG. 10 only one NOP is needed because instructions have already been fetched into instruction buffers 0 and 1. In FIG. 10, the pipeline thus operates as if the cache miss had lasted only one cycle, which improves the efficiency of pipeline processing.
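The NOP counts in this comparison follow from simple arithmetic: each buffered instruction fills one cycle of the miss that would otherwise need a NOP. A minimal Python sketch, assuming that no new instruction arrives while the miss is outstanding (the helper name `nops_injected` is hypothetical):

```python
def nops_injected(miss_cycles: int, buffered: int) -> int:
    """NOPs needed to cover a miss when `buffered` instructions are on hand."""
    return max(0, miss_cycles - buffered)


print(nops_injected(3, 0))  # FIG. 9 case: nothing buffered
print(nops_injected(3, 2))  # FIG. 10 case: B0 and B1 already filled
```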

When a branching instruction such as a branch instruction or a jump instruction is decoded while instructions are present in the instruction buffers, those buffered instructions were fetched on the assumption that the branch is not taken, so executing them as they are would make the program behave incorrectly. The pipeline processor of Embodiment 2 therefore discards the instructions stored in the instruction buffers when a branching instruction is decoded. Discarding here means putting the instruction buffers into the same state as if no instructions were stored in them.

This operation prevents the program from breaking down. The same operation is also performed when the program branches because of exception handling or the like.
FIG. 11 is a timing chart illustrating the operation when the pipeline processor of Embodiment 2 decodes a branch instruction. In FIG. 11, INST7 is a branch instruction and is decoded in clock cycle 10. In Embodiment 2, in clock cycle 10, in which the branch instruction is decoded, control is performed so that instruction buffer 0 (B0 in the figure) and instruction buffer 1 (B1 in the figure) become invalid in clock cycle 11. The not-taken-path instructions stored in the instruction buffers are thereby discarded, and in clock cycle 11 the instruction at the branch target is stored in the instruction fetch register 119. To realize these operations, a control signal indicating whether a branch occurs is output from the instruction decode unit 111 to the instruction fetch register control unit 709.
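Because discarding is defined as making the buffers look empty, it can be modeled by clearing the valid flags without touching the stored contents. The following Python sketch is an illustration only: the class `InstructionBuffers`, the function `on_decode`, and the buffered instruction names in the usage lines are hypothetical and are not taken from FIG. 11.

```python
from typing import List


class InstructionBuffers:
    """Instruction buffers with one valid flag per buffer."""

    def __init__(self, depth: int) -> None:
        self.slots: List[str] = [""] * depth
        self.valid: List[bool] = [False] * depth

    def push(self, instr: str) -> None:
        # Store into the first buffer that holds no instruction.
        i = self.valid.index(False)  # raises ValueError if all are full
        self.slots[i] = instr
        self.valid[i] = True

    def flush(self) -> None:
        # Discard: clear the flags so the buffers look empty.
        self.valid = [False] * len(self.valid)

    def pending(self) -> List[str]:
        return [s for s, v in zip(self.slots, self.valid) if v]


def on_decode(is_branch: bool, buffers: InstructionBuffers) -> None:
    """Model of the control signal from decode unit 111 to control unit 709."""
    if is_branch:
        buffers.flush()


b = InstructionBuffers(depth=2)
b.push("INST_A")  # hypothetical not-taken-path instructions
b.push("INST_B")
on_decode(is_branch=True, buffers=b)
print(b.pending())
```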

FIG. 12 is a table showing how the contents of the instruction fetch register 119 change in Embodiment 2 according to the cache miss signal (PBUS_WAIT in the figure), the interlock signal (INTERLOCK in the figure), and the valid flag of instruction buffer 0 (B(0)_f in the figure).
In the table, 1 means active, 0 means disabled, and X means don't care. The valid flag B(0)_f of instruction buffer 0 is valid if an instruction is stored in instruction buffer 0 and invalid otherwise. The contents of instruction buffer 0 are written B(0). The circled numbers in the table identify states of the pipeline processor.

States ① and ② are states in which neither a cache miss nor an interlock has occurred. If instruction buffer 0 is invalid, that is, holds no instruction (state ①), the instruction from the instruction cache 2 is stored in the instruction fetch register 119. If instruction buffer 0 is valid, that is, holds an instruction (state ②), the instruction held in instruction buffer 0 is moved to the instruction fetch register 119.

Circle 3 indicates a state in which only an interlock has occurred. In this state, the decode stage stops; that is, the instruction fetch register 119 keeps holding its stored instruction in the next cycle.
Circles 4 and 5 indicate states in which only a cache miss hit has occurred. In these states, if the instruction buffer 0 is invalid (circle 4), the instruction fetch register control unit 709 causes the instruction fetch register 119 to store a NOP. If the instruction buffer 0 is valid (circle 5), the instruction held in the instruction buffer 0 is stored in (moved to) the instruction fetch register 119.

Circle 6 indicates a state in which both a cache miss hit and an interlock have occurred. Because the interlock has occurred, the decode stage stops. That is, the instruction fetch register 119 keeps holding its stored instruction in the next cycle.
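The six states of FIG. 12 described above can be summarized as a small decision function. The following is only a minimal sketch in Python, not the circuit of the embodiment: the function name `next_fetch_register` and its parameters are hypothetical, and instructions are modeled as plain strings. It assumes, as the description of circles 3 and 6 suggests, that the valid flag B(0)_f is a don't-care whenever INTERLOCK is active.

```python
NOP = "NOP"  # stand-in for the NOP stored in the instruction fetch register


def next_fetch_register(pbus_wait, interlock, b0_valid, held, from_cache, b0):
    """Return (state, next contents of the instruction fetch register).

    pbus_wait  -- cache miss hit signal (PBUS_WAIT), truthy when active
    interlock  -- interlock signal (INTERLOCK), truthy when active
    b0_valid   -- valid flag B(0)_f of instruction buffer 0
    held       -- instruction currently held in the fetch register
    from_cache -- instruction delivered by the instruction cache this cycle
    b0         -- contents B(0) of instruction buffer 0
    All names are hypothetical; the state numbers follow the circled numbers.
    """
    if interlock:
        # Circles 3 and 6: the decode stage stops, so the register holds
        # its instruction whether or not a cache miss hit has occurred.
        return (6 if pbus_wait else 3), held
    if b0_valid:
        # Circles 2 and 5: a buffered instruction takes priority and is moved
        # from instruction buffer 0 into the fetch register.
        return (5 if pbus_wait else 2), b0
    if pbus_wait:
        # Circle 4: cache miss hit with an empty buffer, so a NOP is stored.
        return 4, NOP
    # Circle 1: normal fetch from the instruction cache.
    return 1, from_cache
```

For example, `next_fetch_register(1, 0, 1, "INST3", None, "INST4")` corresponds to circle 5: the cache miss hit is hidden because "INST4" is taken from the instruction buffer 0 instead of a NOP.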
FIG. 13 is a table collectively showing the transitions of the contents stored in each instruction buffer of the instruction buffer group 719 in the second embodiment described above. Each instruction buffer has a 1-bit valid flag; in the table, the valid flag of the i-th instruction buffer is denoted as B(i)_f, and the contents stored in each instruction buffer are denoted as B(i). Note that B(i−1)_f for i = 0 refers to the instruction fetch register 119 and is always valid. Here, a valid flag being valid indicates that the instruction buffer holds an instruction, and invalid indicates that it holds none.

As shown in FIG. 13, the transitions of the valid flag B(i)_f and the stored contents B(i) of the i-th instruction buffer are determined by the signal indicating the occurrence of a branch, the cache miss hit signal (denoted as PBUS_WAIT in the figure), the interlock signal (denoted as INTERLOCK in the figure), B(i)_f, B(i+1)_f, and B(i−1)_f.
By controlling the instruction fetch register 119 as shown in FIG. 12 and controlling each instruction buffer of the instruction buffer group 719 as shown in FIG. 13, the pipeline processor of the second embodiment can operate as in the timing charts shown in FIGS. 10 and 11, operates without failure even when the program contains a branch, and can realize pipeline processing with high processing efficiency.

The second embodiment described above includes a plurality of instruction buffers that can store fetched instructions. When an interlock occurs in the decode stage, instructions continue to be fetched in the fetch stage and are stored sequentially in the instruction buffers, so that instructions can be accumulated in the instruction buffers while the interlock is in effect.
After the interlock is released, the instructions stored in the instruction buffers are shifted sequentially into the instruction fetch register and decoded. At the same time, newly fetched instructions are taken into the instruction buffers that have become empty. When a cache miss hit occurs, the instructions stored in the instruction buffers are used in preference to measures such as supplying a NOP signal, which prevents the operation stages from stopping.
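The buffering behavior described above can be illustrated with a small cycle-by-cycle model. This is a rough sketch under stated assumptions, not the embodiment itself: the function `run`, the buffer capacity of 2, and the encoding of a cache miss hit as `None` are all hypothetical, and the model simply raises an error if an instruction arrives while all buffers are full, a case the description above does not address.

```python
from collections import deque

NOP = "NOP"  # stand-in for the NOP stored when nothing can be supplied


def run(cycles, capacity=2):
    """Simulate the instruction fetch register over a list of cycles.

    cycles   -- list of (fetched, interlock) pairs; fetched is the instruction
                delivered by the cache, or None on a cache miss hit
    capacity -- number of instruction buffers (an assumed value)
    Returns the contents of the instruction fetch register after each cycle.
    """
    fetch_register = NOP
    buffers = deque()  # index 0 plays the role of instruction buffer 0
    trace = []
    for fetched, interlock in cycles:
        if interlock:
            # Decode is stalled: the register holds, fetched instructions
            # are stored sequentially in the instruction buffers.
            if fetched is not None:
                if len(buffers) >= capacity:
                    raise ValueError("all instruction buffers are full")
                buffers.append(fetched)
        elif buffers:
            # Buffered instructions are shifted into the register first;
            # a newly fetched instruction refills the freed buffer.
            fetch_register = buffers.popleft()
            if fetched is not None:
                buffers.append(fetched)
        else:
            # No buffered instruction: use the cache output, or a NOP
            # when a cache miss hit has occurred.
            fetch_register = fetched if fetched is not None else NOP
        trace.append(fetch_register)
    return trace


# Two interlock cycles fill the buffers; two later cache miss hits are hidden.
example = [("I1", False), ("I2", True), ("I3", True),
           (None, False), (None, False), ("I4", False)]
print(run(example))
```

In this example the two cache miss hits in the fourth and fifth cycles are covered by the instructions buffered during the interlock, so no NOP enters the instruction fetch register.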

The pipeline processor of the second embodiment thus requires fewer executions of measures such as supplying the NOP signal than the first embodiment, and the processing of the pipeline processor can be made more efficient.

FIG. 1 is a block diagram of the pipeline processor of the first embodiment of the present invention.
FIG. 2 is a timing chart of a conventional pipeline processor, shown for comparison with the pipeline processor of the first embodiment.
FIG. 3 is a timing chart of the pipeline processor of the first embodiment.
FIG. 4 is a timing chart showing a state in which an interlock and a miss hit occur simultaneously in the conventional pipeline processor, shown for comparison with the pipeline processor of the first embodiment.
FIG. 5 is a timing chart showing a state in which an interlock and a miss hit occur simultaneously in the pipeline processor of the first embodiment.
FIG. 6 is a table showing the transition of the instruction fetch register in the first embodiment of the present invention.
FIG. 7 is a block diagram of the pipeline processor of the second embodiment of the present invention.
FIG. 8 is a timing chart when program 2 is executed by the conventional pipeline processor.
FIG. 9 is a timing chart when program 2 is executed by the pipeline processor of the first embodiment.
FIG. 10 is a timing chart when program 2 is executed by the pipeline processor of the second embodiment.
FIG. 11 is a timing chart for explaining the operation when the pipeline processor of the second embodiment decodes a branch instruction.
FIG. 12 is a table showing the transition of the instruction fetch register in the second embodiment of the present invention.
FIG. 13 is a table collectively showing the transitions of the contents stored in each instruction buffer of the instruction buffer group in the second embodiment of the present invention.

Explanation of symbols

1 external memory, 2 instruction cache, 3 pipeline processor, 101 program counter, 103 branch address calculation unit, 105 instruction fetch request control unit, 107 detection unit, 109, 709 instruction fetch register control unit, 111 instruction decode unit, 113 execution unit, 115 register file, 119 instruction fetch register, 121 spare register, 719 instruction buffer group

Claims

1. A pipeline processor that divides a plurality of instructions into multi-stage operation steps and executes the instructions, the pipeline processor comprising:
instruction fetch means for fetching an instruction in a fetch operation step, among the operation steps, in which instructions are fetched;
miss hit detection means for detecting, in the fetch operation step, a miss hit in which the instruction fetch means cannot fetch an instruction; and
operation continuation means for continuously operating the instruction fetch means even when a miss hit is detected by the miss hit detection means.

2. The pipeline processor according to claim 1, wherein, when a miss hit is detected by the miss hit detection means, the operation continuation means operates the instruction fetch means by instructing the instruction fetch means not to fetch an instruction.

3. The pipeline processor according to claim 2, wherein the operation continuation means instructs the instruction fetch means not to fetch an instruction by supplying a NOP signal to the instruction fetch means.

4. The pipeline processor according to claim 3, further comprising hazard detection means for detecting a hazard occurring between the operation steps,
wherein the operation continuation means supplies the NOP signal to the instruction fetch means when the miss hit detection means detects a miss hit and the hazard detection means does not detect a hazard.

5. A pipeline processor that divides a plurality of instructions into multi-stage operation steps, including a fetch operation step for fetching an instruction and a decode operation step for decoding the fetched instruction, and executes the instructions, the pipeline processor comprising:
instruction fetch means for fetching an instruction in the fetch operation step;
instruction decode means for decoding, in the decode operation step, the instruction fetched by the instruction fetch means;
miss hit detection means for detecting a miss hit in which the instruction fetch means cannot fetch an instruction;
interlock detection means for detecting that an interlock has been applied due to the occurrence of a hazard in the decode operation step; and
other operation continuation means for, when the miss hit detection means detects that a miss hit has occurred and the interlock detection means detects that an interlock has been applied, continuously operating at least a part of the operation steps other than the decode operation step while stopping the operation of the instruction decode means until the interlock is released.

6. The pipeline processor according to claim 5, wherein, when the miss hit detection means detects that a miss hit has occurred and the interlock detection means has detected that an interlock was applied immediately before the occurrence of the miss hit, the other operation continuation means continuously operates at least a part of the operation steps other than the decode operation step while stopping the operation of the instruction decode means until the interlock is released.

7. A pipeline processor that divides a plurality of instructions into multi-stage operation steps and executes the instructions, the pipeline processor comprising:
instruction fetch means for fetching an instruction in a fetch operation step, among the operation steps, in which instructions are fetched;
a plurality of instruction storage means for storing instructions fetched by the instruction fetch means; and
instruction decode means for taking out and decoding the instructions stored in the instruction storage means,
wherein, while a hazard is detected by hazard detection means and an interlock is applied, each of the instruction storage means sequentially stores the instructions fetched by the instruction fetch means, and after the interlock is released, the instruction decode means sequentially decodes the instructions stored in each of the instruction storage means.