JP2011107888A

JP2011107888A - Arithmetic processor and method for controlling arithmetic processor

Info

Publication number: JP2011107888A
Application number: JP2009260950A
Authority: JP
Inventors: Yuji Shirohige; 祐治白髭; Ryuichi Sunayama; 竜一砂山
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2009-11-16
Filing date: 2009-11-16
Publication date: 2011-06-02
Anticipated expiration: 2029-11-16
Also published as: JP5625329B2; US20110119535A1; US8621309B2; EP2323040A1

Abstract

PROBLEM TO BE SOLVED: To speed up a load operation in an arithmetic processor and a method for controlling the arithmetic processor. SOLUTION: The arithmetic processor includes: a first storage part that stores data; an error detection part that detects the occurrence of an error in data read out from the first storage part; a second storage part that stores data read out from the first storage part based on a load request; a rerun request generation part that, when the error detection part detects an error in data read out from the first storage part by the load request, generates a rerun request for the load request to the first storage part in the same cycle as the cycle in which the data error is detected; and an instruction execution part that retransmits the load request to the first storage part when the data in which the error is detected and the rerun request are given. COPYRIGHT: (C)2011,JPO&INPIT

Description

The present invention relates to an arithmetic processing device and a control method for the arithmetic processing device.

In recent years, processors using a pipeline system have been used to increase the processing speed of the processor serving as an arithmetic processing device. In the pipeline system, the processor has a plurality of pipelines (an instruction control pipeline, an arithmetic pipeline, a branch control pipeline, and so on) that realize its functions. Each pipeline is divided into a plurality of stages. Each stage includes a circuit unit that realizes a predetermined process, and operates so as to finish the process assigned to it within a period called the cycle time, which is the reciprocal of the operating frequency. The output signal of the stage for an earlier process is then used, for example, as the input signal of the stage for the subsequent process.

As a technique for increasing the processing speed of a processor using a pipeline system, a cache memory that operates so as to access a tag RAM (Random Access Memory) and a data RAM in one cycle has been proposed.

JP 2004-171177 A

The process of checking the data read from the data RAM for errors and the process of determining, according to the error check result, whether to permit the processor to use the read data and notifying it of the result are consecutive processes, so these two processes cannot both be completed within one cycle. The two processes therefore take at least two cycles in total.

An object of the disclosed arithmetic processing device is to speed up the load operation from the data RAM.

The disclosed arithmetic processing device includes: a first storage unit that stores data; an error detection unit that detects the occurrence of an error in data read from the first storage unit; a second storage unit that stores data read from the first storage unit based on a load request; a retransmission request generation unit that, when the error detection unit detects an error in the data read from the first storage unit by the load request, generates a retransmission request for the load request to the first storage unit in the same cycle as the cycle in which the data error is detected; and an instruction execution unit that retransmits the load request to the first storage unit when it is given the data in which the error was detected and the retransmission request.
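The interaction between the error detection unit, the retransmission request generation unit, and the instruction execution unit can be sketched in software. This is a minimal behavioral model, not the circuit of the disclosure: the even-parity check, the `repair` callback (standing in for whatever refreshes the faulty data before the retry), and all class and function names are assumptions made for illustration.

```python
from dataclasses import dataclass


def parity(word: int) -> int:
    """Parity of the set bits in word (assumed error check, for illustration only)."""
    return bin(word).count("1") & 1


@dataclass
class Entry:
    data: int
    parity_bit: int


class FirstStorage:
    """Hypothetical stand-in for the first storage unit plus its error detection."""

    def __init__(self):
        self.cells = {}

    def store(self, addr, data):
        self.cells[addr] = Entry(data, parity(data))

    def read(self, addr):
        entry = self.cells[addr]
        error = parity(entry.data) != entry.parity_bit
        # The retransmission (rerun) request is produced together with the
        # error result, modeling "in the same cycle as the error detection".
        rerun = error
        return entry.data, error, rerun


def load(storage, addr, repair, max_retries=1):
    """Model of the instruction execution unit: resend the load when rerun is given."""
    data, error, rerun = storage.read(addr)
    attempts = 1
    while rerun and attempts <= max_retries:
        repair(storage, addr)  # hypothetical: the faulty data is refreshed before the retry
        data, error, rerun = storage.read(addr)
        attempts += 1
    return data, attempts, error
```

The point of the sketch is that the requester receives the data and the rerun indication together and decides by itself to resend, rather than waiting for a separate use-permission decision.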

The disclosed arithmetic processing device has the effect of speeding up the load operation from the data RAM.

FIG. 1 is a diagram showing an example of the configuration of the arithmetic processing device.
FIG. 2 is a diagram showing an example of the instruction unit and the L1 cache.
FIG. 3 is a diagram showing an example of the transition of data requests to the pipeline.
FIG. 4 is a diagram showing an example of a truth table that defines priority control.
FIG. 5 is a diagram showing an example of the execution procedure of address control processing in the L1 cache.
FIG. 6 is a diagram showing an example of the execution procedure of address control processing in the L1 cache.
FIG. 7 is a diagram showing an example of the execution procedure of address control processing in the L1 cache.
FIG. 8 is a diagram showing an example of the execution procedure of flag control processing in the L1 cache.
FIG. 9 is a diagram showing an example of the flag signal used for pipeline control.
FIG. 10 is a diagram showing an example of a truth table that defines the priority for selecting the flag signals held in the holding circuits.
FIG. 11 is a diagram showing an example of the TLB.
FIG. 12 is a diagram showing an example of the tag RAM.
FIG. 13 is a diagram showing an example of the data RAM.
FIG. 14 is a diagram showing an example of the clock control unit.
FIG. 15 is a diagram showing an example of the error check circuit.
FIG. 16 is a diagram showing an example of the retransmission request generation unit.
FIG. 17 is a diagram showing an example of the control signal generation unit.
FIG. 18 is a diagram showing an example of the VALID signal processing of the pipeline.
FIG. 19 is a diagram showing an example of the circuit arrangement of the arithmetic processing device.
FIG. 20 is a time chart showing an example of pipeline processing when a retransmission request is issued.
FIG. 21 is a time chart showing an example of pipeline processing when a retransmission request is issued.
FIG. 22 is a time chart showing an example of pipeline processing when a retransmission request is issued.

Hereinafter, an embodiment of a processor as an arithmetic processing device will be described with reference to the drawings. FIG. 1 is a diagram showing an example of the configuration of the arithmetic processing device. The arithmetic processing device 10 shown in FIG. 1 includes an instruction execution unit 4 and an L1 cache 20. The instruction execution unit 4 includes a decoding unit 2 and an execution unit 3. An example of the L1 cache 20 will be described later with reference to FIG. 2.

The decoding unit 2 supplies a “data request signal” to the L1 cache 20 and reads an “instruction”. The decoding unit 2 decodes the “instruction (opcode)” read from the L1 cache 20 and supplies to the execution unit 3, as an “operation control signal”, the decoding result of the instruction and the register addresses that store the operands on which the instruction operates. Instructions to be decoded include, for example, load instructions and store instructions for the L1 cache 20.

The execution unit 3 retrieves the operand data from the register specified by the register address in the register file inside the execution unit 3, and operates on the data according to the decoded instruction. By executing the decoded instruction, the execution unit 3 supplies a “data request signal” to the L1 cache 20. This “data request signal” is hereinafter called an “EXT request”. EXT requests include load instructions, store instructions, prefetch instructions, and the like.

The L1 cache 20 supplies the requested data to the execution unit 3 in accordance with, for example, a load instruction. When the execution unit 3 finishes executing the instruction, it supplies an “operation completion signal” to the decoding unit 2 in order to receive the next operation control signal.

The L1 cache 20 is a memory in the layer above the L2 cache 400 and caches part of the data held by the L2 cache 400. That is, the L2 cache 400 holds a superset of the data cached by the L1 cache 20. The L2 cache 400 is a memory in the layer above the main storage device 500 and caches part of the data held by the main storage device 500. That is, the main storage device 500 holds a superset of the data cached by both the L2 cache 400 and the L1 cache 20.

The case where the instruction or data accessed by the decoding unit 2 or the execution unit 3 exists in the L1 cache 20 is hereinafter called a “cache hit”. The case where the instruction or data accessed by the decoding unit 2 or the execution unit 3 does not exist in the L1 cache 20 is hereinafter called a “cache miss”. When a cache miss occurs, control is performed to read the missed data into the L1 cache 20 from the L2 cache 400 or the main storage device 500, which are in the layers below the L1 cache 20.
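The hit and miss behavior and the inclusion relationship between the layers can be illustrated with a short sketch. It models each layer as a dictionary keyed by address; the function name `load_through_hierarchy` and the dictionary representation are assumptions for illustration, and replacement and line granularity are ignored.

```python
def load_through_hierarchy(addr, l1, l2, main):
    """Return (value, "hit" or "miss") as seen from the L1 cache."""
    if addr in l1:
        return l1[addr], "hit"
    # Cache miss: fetch from the L2 cache, or from main storage if L2 lacks it.
    value = l2[addr] if addr in l2 else main[addr]
    l2.setdefault(addr, value)  # keep L2 a superset of L1 (inclusion)
    l1[addr] = value            # read the missed data into L1
    return value, "miss"
```

A second access to the same address then reports a hit, because the first access filled the L1 layer.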

FIG. 2 is a diagram showing an example of the L1 cache. The instruction execution unit 4 has a data buffer 5. The data buffer 5 is a buffer that holds instructions read from the L1 cache 20.

The L1 cache 20 includes a cache controller 200, a clock control unit 110, a data RAM 120, an error check circuit 130, a selection circuit 140, a retransmission request generation unit 150, a control unit 180, an L2 order holding unit 190, and an L2 request holding unit 195. The cache controller 200 includes a pipeline 100 and the control unit 180.

The pipeline 100 includes a Translation Look-aside Buffer (TLB) 35, a tag RAM 30, a comparison circuit 40, and a control signal generation unit 50.

The instruction execution unit 4 has the data buffer 5. The data buffer 5 holds the data supplied from the selection circuit 140.

The pipeline 100 includes a priority circuit 25, the tag RAM 30, the TLB 35, and the comparison circuit 40. These components of the pipeline 100 are each assigned to one of a plurality of stages. For example, the priority circuit 25 is assigned to the “P (Priority) stage”, the tag RAM 30 and the TLB 35 are assigned to the “T (Tag) stage”, and the comparison circuit 40 is assigned to the “M (Match) stage”. An example of the pipeline 100 will be described later with reference to FIGS. 5, 7, and 8.

The clock control unit 110 supplies a clock to the data RAM 120 when the data RAM 120 needs one, for example when there is an access request for data held by the data RAM 120. Details of the clock control unit 110 will be described later with reference to FIG. 14.

Details of the TLB 35, the tag RAM 30, the data RAM 120, and the error check circuit 130 will be described later with reference to FIGS. 11, 12, 13, and 15, respectively.

The comparison circuit 40 compares the absolute address supplied from the TLB 35 with the absolute address supplied from the tag RAM 30 and determines whether the two tags match. When the tag supplied from the TLB 35 and the tag supplied from the tag RAM 30 match, the comparison circuit 40 supplies the selection circuit 140 with a tag hit way signal that identifies the way in which the cache hit occurred.
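The comparison and way selection can be sketched as follows. This is an illustrative model only: the number of ways, the representation of an invalid way as `None`, and the names `tag_hit_way` and `select_data` are assumptions, and the real structures are those of FIGS. 11 to 13.

```python
def tag_hit_way(tlb_address, tag_ram_ways):
    """One flag per way: True where the tag RAM address equals the TLB address."""
    return [tag is not None and tag == tlb_address for tag in tag_ram_ways]


def select_data(hit_way, data_ways):
    """Model of the selection circuit: pick the data of the way that hit."""
    for hit, data in zip(hit_way, data_ways):
        if hit:
            return data
    return None  # no way matched: cache miss
```

For example, with four ways holding the tags `[0x03, 0x1A, None, 0x7F]`, a TLB address of `0x1A` marks only the second way, and the selection step returns that way's data.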

The L2 request holding unit 195 holds requests used when loading data from the L2 cache 400.

The L2 order holding unit 190 holds a request to delete the corresponding cache line entry in the L1 cache 20 when a cache line has been deleted from the L2 cache 400. This deletion request is hereinafter called an “L2 order”. An L2 order dequeued from the L2 order holding unit 190 is held in a P (Priority) cycle order address register (PSXR), described later.

The retransmission request generation unit 150 and the control signal generation unit 50 will be described later with reference to FIGS. 16 and 17, respectively. When a cache error occurs, the control unit 180 loads data from the L2 cache 400 and stores the loaded data in the L1 cache 20.

FIG. 3 is a diagram showing an example of the transition of data requests to the pipeline. State S101 is the state in which the priority circuit 25 is accepting a request signal. State S103 is the state in which the priority circuit 25 is waiting to issue the request to the next-stage TLB 35, tag RAM 30, and clock control unit 110.

When the priority circuit 25 receives the four types of requests (EXT, BIS, MI, and INT requests) (T102) and enters the request input waiting state (S103), it selects the request to be supplied to the next stage according to the truth table shown in FIG. 4. The states S101 and S103 shown in FIG. 3 are implemented by latch circuits provided for each request type. The transition T102 from state S101 to state S103 occurs when a request is set in the priority circuit 25. The transition T104 from state S103 to state S101 occurs when the request is supplied to the next stage.
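The two states and two transitions of FIG. 3 form a small state machine that exists once per request type. The following sketch models one such latch; the class and method names are hypothetical, and only the state labels S101, S103, T102, and T104 come from the description.

```python
from enum import Enum


class State(Enum):
    ACCEPTING = "S101"  # request accepting state
    WAITING = "S103"    # request input waiting state


class RequestLatch:
    """One latch per request type (EXT, BIS, MI, INT)."""

    def __init__(self):
        self.state = State.ACCEPTING
        self.request = None

    def set_request(self, request):
        """Transition T102: a request is set in the priority circuit."""
        if self.state is not State.ACCEPTING:
            raise RuntimeError("a request is already waiting")
        self.request = request
        self.state = State.WAITING

    def issue(self):
        """Transition T104: the request is supplied to the next stage."""
        if self.state is not State.WAITING:
            raise RuntimeError("no request is waiting")
        request, self.request = self.request, None
        self.state = State.ACCEPTING
        return request
```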

The priority circuit 25 shown in FIG. 2 receives EXT requests, BIS requests, MI requests, and INT requests, selects among them based on a predetermined priority order, and supplies the selected request to the next-stage TLB 35, tag RAM 30, and clock control unit 110.

The EXT request is a request given from the instruction execution unit 4. It includes memory access requests such as load requests, store requests, and prefetch requests.

The BIS request is a request, given from the L2 cache 400, to delete a line of the L1 cache 20. A BIS request is issued when an error is detected in the data RAM 120 and the line containing the error is to be deleted.

The MI request is a request, given from the cache controller 200, to write data loaded from the L2 cache 400 into the data RAM. An MI request is made for the target line after the BIS request has been made.

The INT request is a request given from the pipeline 100, after an EXT request has stopped the pipeline 100, in order to execute predetermined processing using the data from before the stop.

FIG. 4 is a diagram showing an example of the priority truth table. A “0” in the truth table 1000 indicates that the request is in the request accepting state (S101). A “1” in the truth table 1000 indicates that the request is in the request input waiting state (S103). A “*” in the truth table 1000 indicates a don't-care: whatever state that request is in, it has no bearing on the priority decision.

Row R101 shown in FIG. 4 indicates that if the MI request is in the request input waiting state (S103), the MI request is issued to the next stage regardless of the other requests.

Row R102 shown in FIG. 4 indicates that when the BIS request is in the request input waiting state (S103) and the MI request is in the request accepting state (S101), the BIS request is issued to the next stage.

Row R103 shown in FIG. 4 indicates that when the INT request is in the request input waiting state (S103) and the MI request and the BIS request are in the request accepting state (S101), the INT request is issued to the next stage.

Row R104 shown in FIG. 4 indicates that when the MI request, the BIS request, and the INT request are in the request accepting state (S101), the EXT request is issued to the next stage. In this way, the priority circuit 25 issues requests to the next stage according to the priority order MI request > BIS request > INT request > EXT request (the request on the greater side of each inequality sign has the higher priority).
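Rows R101 to R104 amount to a fixed-priority selection, which can be written compactly. The sketch below is an illustration, not the circuit of FIG. 4: the function name and the dictionary encoding (1 for the request input waiting state S103, 0 for the request accepting state S101) are assumptions, and the EXT request is taken to be selected only when it is itself waiting.

```python
from typing import Dict, Optional

# Priority order stated in the description: MI > BIS > INT > EXT.
PRIORITY = ("MI", "BIS", "INT", "EXT")


def select_request(waiting: Dict[str, int]) -> Optional[str]:
    """Return the request type to issue to the next stage, or None if none waits."""
    for kind in PRIORITY:
        if waiting.get(kind, 0) == 1:
            return kind  # lower-priority entries are don't-cares from here on
    return None
```

For instance, `select_request({"MI": 0, "BIS": 1, "INT": 1, "EXT": 1})` selects the BIS request, which corresponds to row R102.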

FIG. 5 is a diagram showing an example of pipeline address control, that is, an address control pipeline. In FIG. 5, the components of the L1 cache 20 shown in FIG. 2 are divided among the pipeline stages “P (Priority)”, “T (Tag)”, “M (Match)”, “B (Branch)”, “R (Result)”, and “R1”. The clock cycle is set based on the longest of the processing times required by the stages of the pipeline. As a result, all stages operate with the same cycle time in synchronization with the clock.
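The rule that the slowest stage sets the clock cycle can be shown with a short calculation. The stage delays below are made-up numbers in picoseconds chosen only to illustrate the rule; they are not taken from the disclosure, and the function names are hypothetical.

```python
def cycle_time_ps(stage_delays_ps):
    """The clock cycle must cover the longest stage processing time."""
    return max(stage_delays_ps.values())


def operating_frequency_ghz(cycle_ps):
    """The operating frequency is the reciprocal of the cycle time."""
    return 1000.0 / cycle_ps


# Hypothetical per-stage processing times (picoseconds).
example_delays = {"P": 300, "T": 400, "M": 380, "B": 350, "R": 250}
example_cycle = cycle_time_ps(example_delays)
example_frequency = operating_frequency_ghz(example_cycle)
```

With these assumed values the T stage is the longest, so shortening any other stage would not raise the operating frequency.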

The “P (Priority)”, “T (Tag)”, “M (Match)”, “B (Branch)”, “R (Register)”, and “R1” stages respectively have a PLR (Priority stage Logical Register), a TLR (Tag stage Logical Register), an MLR (Match stage Logical Register), a BLR (Branch stage Logical Register), an RLR (Register stage Logical Register), and an R1LR (R1 stage Logical Register), each of which is both a staging latch and a logical address register. These logical address registers, acting as staging latches, are synchronized with an externally supplied clock, hold the logical address for a fixed period, and then supply the logical address to the pipeline address register of the next stage.

The port register holds EXT requests. An EXT request held in the port register is used as an INT request after the pipeline has stopped.

MILAR (Move In Logical Address Register) is a register that holds the logical address of data to be written to the tag RAM 30. MIAAR (Move In Absolute Address Register) is a register that holds the physical address of data to be written to the tag RAM 30 when a cache miss occurs. The cache controller 200 sends, via the L2 request holding unit 195, an MI (Move In) request that asks the L2 cache 400 for the data at the physical address held in MIAAR. The data acquired from the L2 cache 400 is written to the tag RAM 30 by the MI request.

BAAR (Branch cycle Absolute Address Register) is a register that holds the physical address to be input to MIAAR.

ERAR (Error Address Register) is a register that holds the virtual address at which an error occurred during a memory access. When a cache miss occurs in the comparison circuit 40, the cache controller 200 reports an error to the L2 cache 400. When the error is reported, the L2 cache 400 issues a request to delete the entry of the cache line in which the error occurred. This request is called an “L2 order”. When the L1 cache 20 receives the L2 order via the L2 order holding unit 190, it deletes the line by means of a BIS request. The control unit 180 notifies the L2 cache 400 that the line has been deleted.

The T stage includes the TLB 35 and the tag RAM 30. The clock control unit 110 is not included in the pipeline 100, but performs its processing in the cycle of the P or T stage.

The M stage includes the selection circuit 140. The data RAM 120 is not included in the pipeline 100, but operates in the cycle of the M stage. The B stage includes the error check circuit 130, the priority circuit 25, and the retransmission request generation unit 150. The R stage includes a circuit that supplies requests to the data buffer 5.

In the P stage, the priority circuit 25 selects one of the EXT, BIS, INT, and MI requests according to the priority order of the truth table shown in FIG. 4, and the selected request is supplied to the TLB 35 and the tag RAM 30.

FIG. 6 is a diagram showing an example of address control by the pipeline when the error check circuit is used to generate an STV (STore Valid) signal, which is a use permission signal. Of the components of the L1 cache 20 shown in FIG. 2, the clock control unit 110 and the retransmission request generation unit 150 are not shown in FIG. 6.

In a configuration where the control signal generation unit 50 receives the error detection signal from the error check circuit 130 and outputs the STV signal, the control signal generation unit 50 is placed after the error check circuit 130. As described later with reference to FIG. 19, the control signal generation unit 50 is located inside the pipeline 100 and not near the data RAM 120. Because the transmission path between the data RAM 120 and the control signal generation unit 50 is therefore long, the control signal generation unit 50 is placed in the B stage rather than the M stage. The error check circuit 130, in turn, is placed in the M stage.

Consequently, the data RAM 120, which precedes the error check circuit 130, is placed in the T stage rather than in the M stage as shown in FIG. 5. As a result, the clock control unit 110 can no longer be placed between the priority circuit 25 and the data RAM 120.

As described above, by providing the retransmission request generation unit 150, rather than the control signal generation unit 50, after the error check circuit 130, the arithmetic processing device 10 can include the clock control unit 110 that controls the clock to the data RAM 120. The clock control unit 110 does not supply a clock to the data RAM 120 when there is no access to the data it holds, so providing the clock control unit 110 allows the arithmetic processing device 10 to reduce its power consumption.
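The power saving comes from withholding clock pulses in cycles without an access request. A minimal sketch of that gating, assuming a per-cycle list of 0/1 values for the clock and for the access request, could look like this; the function name and the list representation are illustrative assumptions, and the actual circuit is that of FIG. 14.

```python
def gated_clock(clock, access_request):
    """Pass a clock pulse to the data RAM only in cycles with an access request."""
    return [c & a for c, a in zip(clock, access_request)]
```

With a free-running clock `[1, 1, 1, 1]` and requests in the second and third cycles only, the data RAM receives two pulses instead of four.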

FIG. 7 is a diagram showing an example of pipeline address control with the clock control unit shown in FIG. 5 removed. With the clock control unit shown in FIG. 5 removed from the T stage, the data RAM 120 operates in the T stage. In addition, because the error detected by the error check circuit 130 is received by the retransmission request generation unit 150 outside the pipeline 100, which generates the retransmission request, the retransmission request generation unit 150 can be placed in the M stage. The error check circuit 130 and the retransmission request generation unit 150, which follow the data RAM 120, can thus be placed in the M stage, and the STV signal and the RERUN signal can be transmitted in the B stage. The R stage for transmitting the STV signal or the RERUN signal shown in FIGS. 5 and 6 can therefore be eliminated.

In this way, unlike the pipeline control shown in FIG. 6, the pipeline control shown in FIG. 7 allows the data load operation to be performed with a shorter cycle time. That is, if the data load operation under the pipeline control shown in FIG. 6 is the bottleneck for improving the cycle time (operating frequency), the arithmetic processing device 10 can improve the cycle time (operating frequency) by changing to the pipeline control shown in FIG. 7.

FIG. 8 is a diagram showing an example of flag control in the pipeline. The pipeline 100b shown in FIG. 8 shows the circuits in the pipeline related to flag signal control. The pipeline 100b includes an INVERTER 101, flag signal latches TFLAG (Tag FLAG), MFLAG (Match FLAG), BFLAG (Branch FLAG), and RFLAG (Register FLAG), and a priority circuit 102.

The “P”, “T”, “M”, “B”, “R”, and “R1” stages have the flag signal latches TFLAG, MFLAG, BFLAG, and RFLAG, which are staging latches that hold flag signals. A flag signal is a control signal carrying status information, such as attribute information and identification information, generated from the result of the pipeline processing a request. The flag signal will be described later with reference to FIG. 9. Each flag signal latch has a D (Data) terminal, which is the data input terminal that receives the flag signal, and an IH (Inhibit) terminal, which is the control terminal that receives the WAIT signal. When the WAIT signal input to the IH terminal is at the signal level “low”, the flag signal input at the D terminal is written to the flag signal latch; when the WAIT signal input to the IH terminal is at the signal level “high”, writing of the flag signal input at the D terminal is inhibited.
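The write and inhibit behavior of a flag signal latch can be modeled in a few lines. This sketch covers only the D and IH terminals as described above; the class name, the `clock` method, and the use of plain integers for signal levels (0 for low, 1 for high) are assumptions made for illustration.

```python
class FlagLatch:
    """Staging latch with a D (data) input and an IH (inhibit) control input."""

    def __init__(self, value=0):
        self.q = value  # currently held flag signal

    def clock(self, d, ih):
        """On a clock edge, capture d unless the IH input is high."""
        if not ih:
            self.q = d
        return self.q
```

Clocking the latch with `ih=0` stores the new value, while clocking it with `ih=1` leaves the previously held value in place.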

In the following, the signal level “low” is referred to as “L”, and the signal level “high” as “H”.

The input signal of the IH terminal is the WAIT signal inverted by the INVERTER 101, which is a NOT circuit. The WAIT signal is a signal that stops the operation of the pipeline 100 and is generated by the control signal generation unit 50 shown in FIG. 2. Therefore, when the WAIT signal that stops the pipeline operation becomes “H”, the flag signal is written to the flag signal latch. The conditions under which the control signal generation unit 50 generates the WAIT signal are described later with reference to FIG. 17.
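
The write-inhibit behavior of a flag signal latch can be illustrated with a small behavioral model. The following Python sketch is illustrative only and is not part of the disclosed circuit. The names `FlagLatch` and `ih_from_wait` are hypothetical, and the model simply follows the text as written: the latch is written when its IH input is “L”, and the IH input is the inverted WAIT signal.

```python
class FlagLatch:
    """Behavioral model of a staging latch with a D input and an IH (inhibit) input."""

    def __init__(self):
        self.value = None  # flag signal currently held

    def clock(self, d, ih):
        # IH = "L" (False): the D input is written.
        # IH = "H" (True): writing is inhibited and the old value is kept.
        if not ih:
            self.value = d
        return self.value


def ih_from_wait(wait):
    # INVERTER 101: the IH terminal receives the inverted WAIT signal.
    return not wait
```

In this model a latch clocked with IH at “H” keeps its previous contents, which is the hold behavior the staging latches rely on.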

When the priority circuit 25 inputs one of the received requests into the pipeline 100, the flag signal moves through TFLAG, MFLAG, BFLAG, and RFLAG in that order with each input of the clock signal. However, when the WAIT signal, which is the stop signal for pipeline processing, is supplied to the INVERTER 101, each flag signal latch stores its flag signal, on input of the clock signal, in the holding circuit TW, MW, BW, or RW corresponding to its stage.

Once the pipeline has entered the stopped state, when the pipeline is resumed, the flag signals are output from the holding circuits TW, MW, BW, and RW and are input into the pipeline again as INT requests. Because requests are input starting from the oldest, they are input in the order RW, BW, MW, TW.

FIG. 9 is a diagram illustrating an example of the flag signals used for pipeline control. The flag signal 1101 is the signal held in TFLAG, MFLAG, BFLAG, and RFLAG. The flag signal 1101 includes a “VALID” signal, a “port ID” signal, a “pipe ID” signal, an “instruction unit ID” signal, a “retransmission request instruction” signal, and a “retransmission request second time” signal.

A “VALID” signal of “H” indicates that a valid request is flowing through the pipeline stage. The “port ID (port register ID)” signal identifies the port register shown in FIG. 5. As described later with reference to FIG. 17, when the WAIT signal that stops the pipeline operation becomes “H”, a “VALID” signal of “H” flows through the pipeline.

The “pipe ID (PIPE-ID)” signal indicates the type of request. For example, “pipe ID” signals with the hexadecimal values “0x3”, “0x5”, “0xD”, and “0xF” indicate an MI request, a BIS request, an INT request, and an EXT request, respectively. The “instruction ID (IBR-ID)” signal indicates the number of the instruction execution unit 4 to which the request is returned. The “instruction ID” signal accompanies the EXT request when the instruction execution unit 4 supplies the EXT request to the L1 cache 20.
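
As an illustrative sketch only, the example pipe ID encodings given above can be written as a lookup table in Python. The names `PIPE_ID_TO_REQUEST` and `request_type` are hypothetical, and values that the text does not list are simply reported as unknown rather than guessed.

```python
# Pipe ID encodings taken from the example values in the text (hexadecimal).
PIPE_ID_TO_REQUEST = {0x3: "MI", 0x5: "BIS", 0xD: "INT", 0xF: "EXT"}


def request_type(pipe_id):
    """Return the request type for a pipe ID, or None for a value the text does not list."""
    return PIPE_ID_TO_REQUEST.get(pipe_id)
```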

The RERUN-REQ signal, which is the retransmission request instruction, is a signal that the instruction execution unit 4 supplies to the pipeline 100 after receiving the RERUN signal, which is the retransmission request.

The RERUN-2nd signal, which is the second-time retransmission request instruction, is obtained by decoding the value 71 when the WID signal, a flow ID described later, indicates 71. “WID = 71” identifies that the preceding flow was caused by a retransmission request and was a flow that was WAITed with a cache hit and no error. In other words, “WID = 71” indicates that data was already written to the data buffer 5 in the preceding flow and that the current flow is the flow that returns the STV signal. Accordingly, the pipeline 100 decodes the WID signal received together with the INT request, sets the RERUN-2nd signal to “H”, and passes the RERUN-2nd signal through each stage of the pipeline as one of the flag signals.

The flag signal 1102 is the flag signal held in TW, MW, and BW. The attribute or identification information included in the flag signal 1102 is the VALID, port ID, pipe ID, and instruction unit ID described above. The flag signal 1103 is the flag signal held in RW. The attribute or identification information included in the flag signal 1103 is the VALID, port ID, pipe ID, and instruction unit ID described above, together with the WID.

The WID identifies the reason the pipeline stopped and takes the following values.
“WID = 10” indicates that the pipeline was interrupted by a cache miss; the pipeline 100 waits until the data is loaded from the L2 cache 400 into the L1 cache 20. “WID = 60” indicates that the pipeline was interrupted by a TLB miss. “WID = 70” indicates that the pipeline was interrupted by a cache error. “WID = 71” indicates that the first flow after receiving the retransmission request ended with a cache hit and no error.
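
The WID values and the decoding of the RERUN-2nd signal can be sketched as follows. This Python fragment is illustrative only; the names `WID_REASONS`, `WID_RERUN_DONE`, and `rerun_2nd` are hypothetical, and because the text does not state the radix of the WID values, they are treated here as opaque labels.

```python
# WID values as written in the text. Their radix is not stated, so they are
# treated as opaque labels (assumption).
WID_REASONS = {
    10: "cache miss",
    60: "TLB miss",
    70: "cache error",
    71: "first retransmitted flow ended with a cache hit and no error",
}
WID_RERUN_DONE = 71


def rerun_2nd(wid):
    """RERUN-2nd is 'H' (True) only when the WID received with the INT request is 71."""
    return wid == WID_RERUN_DONE
```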

FIG. 10 is a truth table that defines the selection priority of the flag signals held in the holding circuits RW, BW, MW, and TW. The priority circuit 102 selects a flag signal held in the holding circuits according to the truth table 1200 shown in FIG. 10. An “*” in the truth table 1200 indicates a don't-care, meaning that the entry is irrelevant to the priority decision. A “0” in the truth table 1200 indicates that a flag signal is held in the holding circuit. A “1” in the truth table 1200 indicates that no flag signal is held in the holding circuit.

The flag signal held in RW is always selected by the priority circuit 102, as shown in row L1201. The flag signal held in BW is selected by the priority circuit 102 when RW holds no flag signal, as shown in row L1202. The flag signal held in MW is selected by the priority circuit 102 when BW and RW hold no flag signal, as shown in row L1203. The flag signal held in TW is selected by the priority circuit 102 when MW, BW, and RW hold no flag signal, as shown in row L1203.
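
The selection order described above amounts to a fixed-priority choice among the holding circuits. The following Python sketch is illustrative only and does not reproduce the encoding of the truth table of FIG. 10; the names `REPLAY_ORDER` and `select_held_flag` are hypothetical, and an empty holding circuit is represented by `None` as a modeling assumption.

```python
REPLAY_ORDER = ("RW", "BW", "MW", "TW")  # oldest request first


def select_held_flag(holding):
    """Pick the flag signal to re-input.

    `holding` maps a holding-circuit name to its flag signal, or to None
    when that holding circuit is empty.
    """
    for name in REPLAY_ORDER:
        flag = holding.get(name)
        if flag is not None:
            return name, flag
    return None
```

For example, when RW is empty and BW and MW both hold a flag signal, the function returns the contents of BW, matching the rule that BW is selected when RW holds no flag signal.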

FIG. 11 is a diagram illustrating an example of the TLB. The TLB 35 has M entries (M is an integer), and each entry includes a valid bit (Valid) indicating whether the entry is valid, a virtual address (VA), and an absolute address (AA). In the TLB 35, the comparison unit 36 matches the virtual address used for the access against the stored virtual addresses and outputs an entry selection signal that selects the matching entry. The selection unit 37 outputs the absolute address held in the entry selected by the entry selection signal. The output absolute address is supplied to the selection circuit 140.

Depending on the page size, a predetermined lower portion of the virtual address is not actually used for the tag match. For example, with an 8 KB page size, the virtual address used for the tag match is the 50 bits <63:14> of the 64-bit virtual address. When the virtual address is present in the TLB 35, the TLB 35 supplies a TLB hit signal to the control signal generation unit 50 and the retransmission request generation unit 150.
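
A TLB lookup of this kind can be sketched in Python as follows. The sketch is illustrative only: the names `TlbEntry`, `TAG_SHIFT`, and `tlb_lookup` are hypothetical, the match uses bits <63:14> as in the 8 KB example, and the sequential loop stands in for the parallel comparison that the comparison unit 36 would perform in hardware.

```python
from dataclasses import dataclass

TAG_SHIFT = 14  # bits <13:0> are excluded from the match in the 8 KB example


@dataclass
class TlbEntry:
    valid: bool
    va: int  # virtual address
    aa: int  # absolute address


def tlb_lookup(entries, va):
    """Return (tlb_hit, absolute_address) for a 64-bit virtual address."""
    key = va >> TAG_SHIFT
    for entry in entries:
        if entry.valid and (entry.va >> TAG_SHIFT) == key:
            return True, entry.aa
    return False, None
```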

FIG. 12 is a diagram illustrating an example of the tag RAM. The tag RAM 30 has N entries (N is an integer), and each entry includes a valid bit (Valid) indicating whether the entry is valid and an absolute address (AA). The decoder 31 of the tag RAM 30 decodes an access address (for example, <13:7>) that is part of the 64-bit virtual address and selects an entry. The tag RAM 30 outputs the absolute address from the selected entry. In the case of a set-associative cache memory in which the tag RAM has a plurality of ways and an entry of each way is selected for one index, as many absolute addresses as there are ways are output. The output absolute addresses are supplied to the selection circuit 140.
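
Extracting the access address from the virtual address is a plain bit-field operation. The following Python sketch is illustrative only; the helper names `bit_field` and `tag_ram_index` are hypothetical, and the field <13:7> is the example range given in the text.

```python
def bit_field(value, hi, lo):
    """Return bits <hi:lo> of value, inclusive at both ends."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def tag_ram_index(va):
    # Example access address <13:7>: seven bits, so 128 possible index values.
    return bit_field(va, 13, 7)
```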

FIG. 13 is a diagram illustrating an example of the data RAM. The data RAM 120 has the same number N of entries as the tag RAM (N is a positive integer), and each entry includes data and parity bits for the data. The decoder 41 of the data RAM 120 decodes the same access address as the one supplied to the tag RAM 30 and selects an entry. The data RAM 120 outputs the data from the selected entry. The output data is supplied to the error check circuit 130 and the selection circuit 140.

When a single RAM is not sufficient to provide the data width of one line, the data RAM 120 may use a plurality of RAMs to form one way. For example, when four RAMs form one way and there are two ways, 4 x 2 = 8 RAMs are required.

FIG. 14 is a diagram illustrating an example of the clock control unit. The clock control unit 110 includes an OR circuit 111, a latch circuit 112, and a clock buffer 113.

The OR circuit 111 receives the REQ-VAL (REQuest-VALid) signal, the MI-HLD (MoveIn-HoLD) signal, and the INT-HLD (INTerrupt-HoLD) signal and, if any of these signals is “H”, supplies an Enable signal to the downstream clock buffer 113 via the latch circuit 112. When all of the signals are “L”, the Enable signal is not supplied to the clock buffer 113.

The REQ-VAL signal becomes “H” when an EXT request is supplied from the instruction execution unit 4. The MI-HLD signal becomes “H” while an MI request is in the request input waiting state (S103) of FIG. 3. The INT-HLD signal becomes “H” while an INT request is in the request input waiting state (S103) of FIG. 3.

The clock buffer 113 applies the clock to the data RAM 120 when the output of the AND circuit (logical product circuit) of the input Enable signal and the clock becomes “H”.

In this way, the clock control unit 110 applies the clock to the data RAM 120 when any of the REQ-VAL signal, the MI-HLD signal, and the INT-HLD signal is “H”, and does not apply the clock when all of these signals are “L”. Thus, when any of an EXT request, an INT request, or an MI request is supplied to the priority circuit 25, the clock control unit 110 supplies the clock to the data RAM 120; when none of these requests is supplied, it does not supply the clock to the data RAM 120.

By controlling the clock so that it is not supplied to the data RAM when there is no access to the data held in the data RAM 120, the clock control unit 110 can reduce the power consumption of the data RAM.
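
The clock gating condition can be expressed as simple Boolean logic. The following Python sketch is illustrative only; the function names `ram_clock_enable` and `gated_clock` are hypothetical, and the one-cycle timing of the latch circuit 112 is deliberately not modeled.

```python
def ram_clock_enable(req_val, mi_hld, int_hld):
    # OR circuit 111: Enable is "H" when any request-related signal is "H".
    return req_val or mi_hld or int_hld


def gated_clock(clock, enable):
    # Clock buffer 113: the clock reaches the data RAM only while Enable is "H".
    return clock and enable
```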

FIG. 15 is a diagram illustrating an example of the error check circuit. As shown in FIG. 15, the error check circuit 130 includes ExOR circuits (negated exclusive-OR circuits) 131, an OR circuit (logical sum circuit) 132, and a selection circuit 133.

When the data read from the data RAM 120 at one time is J bytes, the ExOR circuits 131 perform, for each byte, a parity check using the parity bit to determine whether the parity is odd. An ExOR circuit 131 outputs a data parity error signal of “H” when a parity error occurs.

The OR circuit 132 outputs the logical OR of the per-byte data parity error signals received from the plurality of ExOR circuits 131 to the selection circuit 133 as a data error way signal. If even one of the data parity error signals received by the OR circuit 132 is “H”, the data error way signal becomes “H”.

The selection circuit 133 receives a tag hit way signal that identifies the way in which the cache hit occurred, and selects the data error way signal identified by the tag hit way signal. When the selected data error way signal is “H”, it indicates that the data read from the data RAM 120 contains an error.
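
The three steps of the error check (per-byte parity check, OR across bytes, selection by hit way) can be sketched as follows. This Python fragment is illustrative only; the function names are hypothetical, and it assumes that odd parity means the data bits plus the parity bit contain an odd number of ones, so that an even count is reported as an error.

```python
def byte_parity_error(data_byte, parity_bit):
    # Assumption: odd parity, so an even number of ones is a parity error.
    ones = bin(data_byte & 0xFF).count("1") + (parity_bit & 1)
    return ones % 2 == 0


def data_error_way(data, parity_bits):
    # OR circuit 132: the way is in error if any byte fails its parity check.
    return any(byte_parity_error(b, p) for b, p in zip(data, parity_bits))


def selected_error(ways, hit_way):
    # Selection circuit 133: keep only the error signal of the way that hit.
    # `ways` is a list of (data_bytes, parity_bits) pairs, one per way.
    errors = [data_error_way(data, parity) for data, parity in ways]
    return errors[hit_way]
```

An error in a way that did not hit is ignored by this selection, which is why the text keys the final error signal to the tag hit way signal.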

FIG. 16 is a diagram illustrating an example of the retransmission request generation unit. One example of the retransmission request generation unit 150 is an AND circuit 150a. The AND circuit 150a receives VALID, an EXT request or INT request, the retransmission request instruction (RERUN-REQ), a tag hit (TAG-HIT), a TLB hit (TLB-HIT), and an error (ERROR). When VALID, the EXT request or INT request, the tag hit, the TLB hit, and the error are all “H” and the retransmission request instruction is “L”, the AND circuit 150a outputs a RERUN signal of “H”. The input signals of the AND circuit 150a are all generated in the B stage, and the RERUN signal is output via the latch circuit 151 in the R stage, which is the stage following the B stage.

In this way, the retransmission request generation unit 150 generates the RERUN signal when an error has occurred and the RERUN-REQ signal has not been supplied from the instruction execution unit 4.
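
The condition implemented by the AND circuit 150a can be written as a single Boolean expression. The following Python sketch is illustrative only; the function name `rerun` and its parameter names are hypothetical, and the one-stage delay introduced by the latch circuit 151 is not modeled.

```python
def rerun(valid, ext_or_int, rerun_req, tag_hit, tlb_hit, error):
    # AND circuit 150a: all inputs "H" except RERUN-REQ, which must be "L".
    return valid and ext_or_int and tag_hit and tlb_hit and error and not rerun_req
```

Because RERUN-REQ enters inverted, a load that is itself a retransmission does not raise RERUN again in this model.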

FIG. 17 is a diagram illustrating an example of the control signal generation unit. The control signal generation unit 50 includes AND circuits 51, 52, 53, and 54 and an OR circuit 55.

The AND circuit 51 receives the RERUN-REQ signal, the RERUN-2nd signal, the TAG-HIT signal, and the TLB-HIT signal. The AND circuit 51 outputs a signal S51 of “H” when the RERUN-REQ signal, the TAG-HIT signal, and the TLB-HIT signal are all “H” and the RERUN-2nd signal is “L”.

The OR circuit 55 outputs a signal S55 of “H” when it receives any of a signal S51 of “H”, a tag hit (TAG-HIT) signal of “L”, or a TLB-HIT signal of “L”.

The AND circuit 52 outputs a signal S52 of “H” when it receives a VALID signal of “H”, an EXT request or INT request of “H”, and a WAIT signal of “L”.

The AND circuit 53 outputs a WAIT signal of “H” when it receives a signal S55 of “H” and a signal S52 of “H”.

The AND circuit 54 outputs an STV signal of “H” when it receives a signal S55 of “L” and a signal S52 of “H”.

In this way, when the control signal generation unit 50 receives a RERUN-REQ signal of “H”, it outputs the WAIT signal and suppresses output of the STV signal. When the control signal generation unit 50 receives a RERUN-REQ signal of “L” or a RERUN-2nd signal of “H”, it outputs the STV signal, suppresses output of the WAIT signal, and resumes the pipeline operation.
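
The gate-level description of FIG. 17 can be checked with a small Boolean model. The following Python sketch is illustrative only; the function name `control_signals` and its parameter names are hypothetical, and `wait_in` is an assumed name for the WAIT signal that the AND circuit 52 receives as an input.

```python
def control_signals(valid, ext_or_int, wait_in, rerun_req, rerun_2nd, tag_hit, tlb_hit):
    """Return (WAIT, STV) for one evaluation of the gates in FIG. 17."""
    s51 = rerun_req and tag_hit and tlb_hit and not rerun_2nd  # AND circuit 51
    s55 = s51 or not tag_hit or not tlb_hit                    # OR circuit 55
    s52 = valid and ext_or_int and not wait_in                 # AND circuit 52
    wait = s55 and s52                                         # AND circuit 53
    stv = (not s55) and s52                                    # AND circuit 54
    return wait, stv
```

With both hits asserted, the model returns STV for an ordinary load, WAIT for the first retransmitted flow, and STV again once RERUN-2nd is “H”; a tag miss or TLB miss yields WAIT regardless of the retransmission signals.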

As shown in FIG. 8, when the WAIT signal is supplied to the pipeline 100, the INT signal is output. Then, as shown in FIG. 2, the WAIT signal is input to the pipeline 100.

FIG. 18 is a diagram illustrating an example of VALID signal processing in the pipeline. The pipeline shown in FIG. 18 includes AND circuits 171, 172, and 173. When a WAIT signal of “H” is supplied to the AND circuits 171, 172, and 173, the outputs of the AND circuits 171, 172, and 173 become “L”. Therefore, when the WAIT signal is “H”, propagation of the VALID signal through the pipeline can be suppressed.

FIG. 19 is a diagram illustrating an example of the circuit layout of the arithmetic processing device. As shown in FIG. 19, the data RAM 120 occupies a large area of the arithmetic processing device 10, so the wiring distance between the region where the data RAM 120 is placed and the region where the pipeline 100 is placed is inevitably long. For this reason, the retransmission request generation unit 150 is placed near the data RAM 120 rather than inside the cache controller 200. The retransmission request is thereby generated within the B-stage cycle while being transmitted to the data RAM 120, so that the retransmission request can be transmitted to the data buffer 5 within the R-stage cycle.

FIGS. 20 and 21 are time charts illustrating an example of pipeline processing when a retransmission request is issued. FIGS. 20 and 21 show the changes in the signal levels of the RERUN-REQ signal, the STV signal, the WID signal, the SBE signal indicating a data RAM error, the IBR-CE signal indicating a data buffer write, and the RERUN signal.

In the P stage, the pipeline 100 receives an EXT request.

In the B stage, the SBE signal becomes “H” and the IBR-CE signal also becomes “H”. That is, erroneous data is supplied to the data buffer 5.

As described with reference to FIG. 17, when the RERUN-REQ signal is “L”, the control signal generation unit 50 outputs an STV signal of “H”. The STV signal therefore becomes “H” in the R stage.

As described with reference to FIG. 16, when the error signal among the input signals of the retransmission request generation unit 150 becomes “H”, the RERUN signal becomes “H” in the B stage. Accordingly, even though the instruction execution unit 4 receives the STV signal, it receives the RERUN signal at the same time and therefore discards the STV signal, which avoids the problem of the instruction execution unit 4 using the data it received in the B stage.

When the RERUN signal is supplied, the instruction execution unit 4 instructs a reset of the holding circuits TW, MW, BW, and RW and of the VALID signals of the pipeline, so that the requests held by the pipeline 100 are cleared and the pipeline operation stops.

In cycles 13 to 21, the pipeline 100 receives a BIS request from the L2 cache 400, and the line of the L1 cache 20 in which the error occurred is invalidated. Cycles 13 to 21 are shown as taking two flows, that is, two passes through the pipeline, for this processing. This is because the first flow checks that the tag RAM contains a line requiring invalidation, and the second flow invalidates that line in the tag RAM.

In the P stage at the 21st clock cycle in the timing chart of FIG. 21, the instruction execution unit 4 supplies an EXT request to the pipeline. Because the RERUN-REQ signal is supplied together with this EXT request, the RERUN-REQ signal becomes “H”. As described with reference to FIG. 8, the RERUN-REQ signal propagates through the flag signal latches in each stage, so the RERUN-REQ signal remains “H” in cycles 21 to 25.

In the B stage at the 24th clock cycle in the timing chart of FIG. 21, the data RAM error (SBE) signal is “L”. The WID signal becomes 10, indicating that the pipeline was interrupted by a cache miss. This is because the target line was invalidated in the second flow.

During clock cycles 30 to 37 in the timing chart of FIG. 21, the pipeline 100 receives an MI request from the cache controller 200, and the data loaded from the L2 cache 400 is written to the target line.

At the 40th clock cycle in the timing chart of FIG. 21, the pipeline 100 receives an INT request from the cache controller 200. As described with reference to FIG. 8, the INT request re-inputs the flag signal that was held in the flag signal latch. As described with reference to FIG. 9, the RERUN-REQ signal is included in the INT request and therefore propagates through the flag signal latches in each stage. Accordingly, the RERUN-REQ signal remains “H” during clock cycles 21 to 25 in the timing chart of FIG. 21.

At the 43rd clock cycle in the timing chart of FIG. 21, the SBE signal is “L”. In this cycle, the IBR-CE signal transitions to “H”, and the data loaded from the L2 cache 400 is supplied to the data buffer 5. The condition is therefore satisfied that the preceding flow, the MI-2nd flow, was caused by a retransmission request and was a flow that was WAITed with a cache hit and no error, and in cycle 44 the WID signal becomes “71”.

During clock cycles 47 to 51 in the timing chart of FIG. 21, the RERUN-REQ signal is included in the INT request and propagates through the flag signal latches in each stage. The RERUN-REQ signal therefore remains “H” during cycles 47 to 51.

In addition, as described with reference to FIG. 9, “WID = 71” is held in the holding circuit RW. The input INT request therefore includes “WID = 71”, and decoding “WID = 71” sets the RERUN-2nd signal, which is not shown in FIG. 21, to “H”.

When the RERUN-2nd signal becomes “H”, in the R stage at the 51st clock cycle in the timing chart of FIG. 21, the WAIT signal becomes “L” and the STV signal becomes “H”, as described with reference to FIG. 17. The pipeline 100 is thereby resumed, and the instruction execution unit 4 can use the data held in the data buffer 5.

In the P stage at the 47th clock cycle in the timing chart of FIG. 21, on receiving the RERUN-REQ signal of “H”, the control signal generation unit 50 outputs the STV signal, so that the instruction execution unit 4 can use the data it received in the B stage at the 43rd clock cycle in the timing chart of FIG. 21.

In this way, once the error has been avoided by loading the data from the L2 cache, the STV signal is generated at the 51st clock cycle in the timing chart of FIG. 21, and the instruction execution unit 4 can use the data it received at the 43rd clock cycle. As described above, by supplying the STV signal and the RERUN signal to the instruction execution unit 4, the L1 cache 20 can maintain the function of the STV signal even without having the error detection of the error check circuit 130 as an input signal.

FIG. 22 is a time chart illustrating an example of pipeline processing when a retransmission request is issued. Although not shown in FIG. 22, the operations described with reference to FIGS. 20 and 21 are performed during clock cycles 0 to 38 in the timing chart of FIG. 22.

In the B stage at the 50th clock cycle in the timing chart of FIG. 22, the SBE signal is “H”, but because the IBR-CE signal is “L”, no data is written to the data buffer 5. This is because the correct data has already been sent to the instruction execution unit 4 at cycle 43. Thus, even if an error occurs in the target line after the data has been sent correctly to the instruction execution unit 4, the correct data has already been transmitted, so processing can continue without invalidation by a BIS request or a data load from the L2 cache 400 by an MI request.

DESCRIPTION OF SYMBOLS
2 Instruction unit
3 Execution unit
4 Instruction execution unit
5 Data buffer
10 Arithmetic processing device
20 L1 cache
25 Priority circuit
30 Tag RAM
35 TLB
40 Comparison circuit
50 Control signal generation unit
100 Pipeline
110 Clock control unit
120 Data RAM
130 Error check circuit
140 Selection circuit
150 Retransmission request generation unit
180 Control unit
190 L2 order holding unit
195 L2 request holding unit
200 Cache controller
400 L2 cache
500 Main storage device

Claims

1. An arithmetic processing device comprising:
a first storage unit that stores data;
an error detection unit that detects the occurrence of an error in data read from the first storage unit;
a second storage unit that stores data read from the first storage unit based on a load request;
a retransmission request generation unit that, when the error detection unit detects the occurrence of an error in the data read from the first storage unit by the load request, generates a retransmission request for the load request to the first storage unit in the same cycle as the cycle in which the data error is detected; and
an instruction execution unit that retransmits the load request to the first storage unit when given the data in which the error was detected and the retransmission request.

2. The arithmetic processing device according to claim 1, further comprising a clock control unit that supplies a clock to the first storage unit when a load request to the first storage unit is given,
wherein the first storage unit outputs data when the clock is supplied.

3. The arithmetic processing device according to claim 1, further comprising a use permission control unit that, when the error detection unit detects no error in data read from the first storage unit by a load request based on the retransmission request after the load request has been retransmitted, outputs to the instruction execution unit a use permission signal for the data read by the load request based on the retransmission request.

4. A control method for an arithmetic processing device having a first storage unit and a second storage unit, the method comprising:
a step of reading data from the first storage unit in response to a load request;
a step in which an error detection unit of the arithmetic processing device detects an error in the data read from the first storage unit;
a step of generating, when the error detection unit detects the occurrence of an error in the data read from the first storage unit by the load request, a retransmission request for the load request to the storage unit in the same cycle as the cycle in which the data error is detected; and
a step in which an instruction execution unit of the arithmetic processing device retransmits the load request to the first storage unit when given the data in which the error was detected and the retransmission request.

5. The control method for an arithmetic processing device according to claim 4,
wherein the arithmetic processing device further includes a clock control unit, and
the method includes a step in which the clock control unit supplies a clock to the first storage unit when a load request to the first storage unit is given.

6. The control method for an arithmetic processing device according to claim 4, further comprising
a step of outputting to the instruction execution unit, when the error detection unit detects no error in data read from the first storage unit by a load request based on the retransmission request after the load request has been retransmitted, a use permission signal for the data read by the load request based on the retransmission request.