JP2015210655A - Arithmetic processing unit and control method of arithmetic processing unit

Info

Publication number: JP2015210655A
Application number: JP2014091636A
Authority: JP
Inventors: 直宏清田; Naohiro Kiyota
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2014-04-25
Filing date: 2014-04-25
Publication date: 2015-11-24
Anticipated expiration: 2034-04-25
Also published as: JP6287544B2

Abstract

PROBLEM TO BE SOLVED: To provide an arithmetic processing unit, and a control method for the arithmetic processing unit, with improved arithmetic processing capability. SOLUTION: An instruction control section 11 acquires an arithmetic processing instruction that includes a plurality of cache access requests and sequentially transmits the cache access requests included in that instruction. A cache control section 13 receives the cache access requests transmitted from the instruction control section 11 and sequentially executes the cache access processing specified by each request. An arithmetic control section 12 executes arithmetic processing based on the results of the access processing performed by the cache control section 13.

Description

The present invention relates to an arithmetic processing device and a control method for the arithmetic processing device.

In recent years, arithmetic processing devices have sometimes used a technique called SIMD (Single Instruction Multiple Data) to improve computational performance. SIMD processes operations on multiple pieces of data together with a single instruction. To improve the performance of SIMD-based arithmetic processing, it is preferable that load/store processing also be extended to match SIMD.

A cache access request issued by a general load/store instruction specifies one address per request. The cache control unit of the arithmetic processing device performs load/store processing on the data at the address specified by the cache access request.

In contrast, a cache access request issued by a SIMD-extended load/store instruction specifies, per request, one address and an element count that indicates the number of data items to be operated on. The cache control unit then performs load/store processing collectively on the consecutive data, equal in number to the element count, starting from the specified address.
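The difference between the two request types can be sketched as an address expansion. This is only an illustration: the function name `request_addresses` and the 8-byte element size are assumptions, since the text does not state an element width.

```python
def request_addresses(base, num_elements=1, element_size=8):
    """List the addresses touched by one cache access request.

    A general load/store request covers a single address; a SIMD-extended
    request covers `num_elements` consecutive items starting at `base`.
    `element_size` (in bytes) is an assumed value, not taken from the text.
    """
    return [base + i * element_size for i in range(num_elements)]


# General load/store: one address per request.
print([hex(a) for a in request_addresses(0x1000)])
# SIMD-extended load/store: one address plus an element count of 4.
print([hex(a) for a in request_addresses(0x1000, num_elements=4)])
```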

For both general and SIMD-extended load/store instructions, one request has been assigned to one control port of the cache control unit, and the request has been processed while the port holds various control signals such as the instruction type, the address, and the number of SIMD elements.

A cache access request is accompanied by an IID (Instruction Identification), which is information indicating the order of instructions in the program. The cache control unit may have multiple cache control pipelines, each of which is a path for determining cache hits. In an arithmetic processing device with multiple cache control pipelines, multiple cache access requests may flow through different cache control pipelines at the same time. In this case, the cache control unit determines the order of the requests by comparing the magnitudes of the IIDs attached to them. For example, when acquiring a single resource shared between different cache control pipelines, the cache control unit uses the result of this ordering to give acquisition priority to the older request.

Support for indirect load and indirect store processing is a further way to improve the performance of SIMD arithmetic processing. In conventional SIMD load/store processing, the data to be processed is arranged contiguously in memory. In contrast, indirect load and indirect store processing make it possible to perform load/store processing collectively even when the data is arranged non-contiguously.
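The contrast between contiguous and indirect access can be illustrated with a toy memory model. The helper names `contiguous_load` and `indirect_load`, and the use of a Python list as word-addressed memory, are assumptions made for this sketch only.

```python
def contiguous_load(memory, base, num_elements):
    """Conventional SIMD load: elements sit next to each other in memory."""
    return [memory[base + i] for i in range(num_elements)]


def indirect_load(memory, addresses):
    """Indirect load: each element has its own, possibly scattered, address."""
    return [memory[addr] for addr in addresses]


memory = list(range(100, 120))  # toy memory: word i holds the value 100 + i
print(contiguous_load(memory, base=2, num_elements=3))
print(indirect_load(memory, addresses=[7, 0, 13]))
```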

For example, one conventional technique issues as many load/store instructions as there are elements, one for each element. Another conventional technique defines new indirect load and indirect store instructions, and a single instruction transmits multiple cache access requests simultaneously and in parallel, one for the address of each element.

JP 2009-163442 A
Japanese Examined Patent Publication No. 04-79026

However, with the conventional technique that issues as many load/store instructions as there are elements, the number of load/store instructions grows as the element count increases. These instructions occupy the resources of the arithmetic processing device, which may hinder performance improvement.

With the conventional technique that transmits multiple cache access requests simultaneously and in parallel, the address bus and data bus used to send requests to the control ports must be wide enough for the full number of elements, which increases the amount of circuitry in the arithmetic processing device. As a result, the number of elements that can be processed simultaneously is limited by the amount of circuitry available to the arithmetic processing device, which may hinder performance improvement.

The disclosed technology has been made in view of the above, and its object is to provide an arithmetic processing device, and a control method for the arithmetic processing device, with improved arithmetic processing capability.

The arithmetic processing device and the control method for the arithmetic processing device disclosed in the present application include: an instruction control unit that acquires an arithmetic processing instruction including a plurality of cache access requests and sequentially transmits the cache access requests included in the arithmetic processing instruction; a cache control unit that receives the cache access requests transmitted from the instruction control unit and sequentially executes the cache access processing specified by each cache access request; and an arithmetic control unit that performs arithmetic processing based on the results of the access processing performed by the cache control unit.

According to one aspect of the arithmetic processing device and the control method for the arithmetic processing device disclosed in the present application, arithmetic processing capability can be improved.

FIG. 1 is a block diagram of the arithmetic processing device.
FIG. 2 is a block diagram showing details of the cache control unit.
FIG. 3 is a diagram showing an example of the format of a request sent from the instruction control unit.
FIG. 4 is a diagram showing an example of the entries held by a control port.
FIG. 5 is a diagram for explaining arbitration between requests.
FIG. 6 is a time chart of arbitration of data transfer requests between pipelines.
FIG. 7 is a flowchart of instruction processing by the arithmetic processing device according to the first embodiment.
FIG. 8 is a flowchart of comparison processing by the magnitude comparator.
FIG. 9 is a flowchart of data transfer arbitration between pipelines.
FIG. 10 is a flowchart of instruction processing by the arithmetic processing device according to the second embodiment.

Embodiments of the arithmetic processing device and the control method for the arithmetic processing device disclosed in the present application are described in detail below with reference to the drawings. The following embodiments do not limit the arithmetic processing device or the control method disclosed in the present application.

FIG. 1 is a block diagram of the arithmetic processing device, and FIG. 2 is a block diagram showing details of the cache control unit. In this embodiment, a CPU (Central Processing Unit) is described as an example of the arithmetic processing device.

The CPU 1 includes an instruction control unit 11, an arithmetic control unit 12, a cache control unit 13, a secondary cache control unit 14, and a memory control unit 15. The memory control unit 15 is connected to a memory 2.

The instruction control unit 11 acquires an instruction from a program and determines whether the acquired instruction is a cache access request. A cache access request is, for example, a load/store request that asks for data to be read from memory to the arithmetic control unit 12 via the cache, or for data to be updated from the arithmetic control unit 12 via the cache. In this embodiment, an instruction is defined so that it can contain a plurality of cache access requests.

If the instruction is a cache access request, the instruction control unit 11 transmits the cache access request to the cache control unit 13. Hereinafter, a cache access request is simply called a “request”.

The instruction control unit 11 can issue requests simultaneously to the two pipelines 132 and 133 of the cache control unit 13, which are described later. For example, when one instruction contains four requests, the instruction control unit 11 sends two of the requests, one to each of the pipelines 132 and 133, and then sends the remaining two requests, again one to each of the pipelines 132 and 133.
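The issue order in the four-request example can be sketched as follows. The function `issue_schedule` is hypothetical, and mapping requests to pipelines by their position within each step is an assumption; the text only states that two requests are sent at a time, one to each pipeline.

```python
PIPELINES = (132, 133)  # reference numerals of the two pipelines


def issue_schedule(requests, pipelines=PIPELINES):
    """Split the requests of one instruction into successive issue steps.

    Each step sends at most one request to each pipeline. Assigning requests
    to pipelines by position is an assumption made for this sketch.
    """
    width = len(pipelines)
    steps = []
    for start in range(0, len(requests), width):
        group = requests[start:start + width]
        steps.append(dict(zip(pipelines, group)))
    return steps


for step, assignment in enumerate(issue_schedule(["req0", "req1", "req2", "req3"])):
    print(step, assignment)
```

Because the requests are sent in successive steps over the same pipelines, the number of buses does not grow with the number of elements in the instruction.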

In this way, the instruction control unit 11 according to this embodiment sequentially sends the requests of one indirect instruction to the cache control unit 13. Because the instruction control unit 11 can process a plurality of requests with a single indirect instruction, the increase in instruction count caused by indirect load and indirect store processing is suppressed, and performance degradation is avoided. In addition, the buses between the instruction control unit 11 and the cache control unit 13 are the same in number and width as those used for conventional SIMD processing, so the increase in circuit amount is also suppressed.

Thereafter, the instruction control unit 11 receives a request completion response notification from the cache control unit 13. On receiving the notification, the instruction control unit 11 finalizes the completion of the cache access request.

The instruction control unit 11 also sends requests such as execution requests for arithmetic instructions to the arithmetic control unit 12.

Next, the cache control unit 13 is described with reference to FIG. 2. The cache control unit 13 includes a control port management unit 131, pipelines 132 and 133, a cache RAM (Random Access Memory) 134, hit determination circuits 135 and 136, a magnitude comparator 137, and a memory access control unit 138.

The control port management unit 131 has a plurality of control ports 30. Each control port 30 is connected to the cache RAM 134 via each of the pipelines 132 and 133. Hereinafter, when the pipelines 132 and 133 are not distinguished from each other, they are referred to as “pipeline 130”. In this embodiment there are two pipelines 130, but there may be only one. Where feasible, the CPU 1 may also have three or more pipelines 130.

The control port 30 receives a request sent from the instruction control unit 11 and inputs the request to the pipeline 130 designated by the instruction control unit 11. If another request has already been input to the destination pipeline 130 and its processing has not finished, the control port 30 holds the request until the destination pipeline 130 completes that processing and can accept a new request. For example, the control port 30 accumulates requests using a queue or the like.

The format of the request received by the control port 30 is described with reference to FIG. 3, which shows an example of the format of a request sent from the instruction control unit. As shown in FIG. 3, the request 200 has the fields request, address, IID, SIMD, indirect, and element. The request field indicates the type of access, such as load or store. The address field indicates the address to be accessed. The IID field is the identifier of the instruction that contains the request; in this embodiment, IIDs are assigned to instructions as consecutive numbers in ascending order and indicate the instruction order. The SIMD field indicates the number of SIMD elements. The indirect field is a flag indicating whether the access is an indirect access: in this embodiment, a value of “0” means that the request is not subject to indirect load or indirect store processing, and a value of “1” means that it is. The element field indicates the SIMD element number of the request; in this embodiment, element numbers are assigned as consecutive numbers in ascending order.
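The request format of FIG. 3 can be modeled as a simple record. The sketch below is illustrative: the field names follow the text, but the Python types, the class name `Request`, and the helper `expand_indirect` are assumptions, and no field widths are implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    request: str   # access type, e.g. "load" or "store"
    address: int   # address to be accessed
    iid: int       # identifier of the instruction containing the request
    simd: int      # number of SIMD elements
    indirect: int  # 1 for indirect load/store, 0 otherwise
    element: int   # SIMD element number, consecutive and ascending


def expand_indirect(kind, addresses, iid):
    """Build one request per element address, all sharing the same IID."""
    count = len(addresses)
    return [
        Request(kind, addr, iid, simd=count, indirect=1, element=number)
        for number, addr in enumerate(addresses)
    ]


for req in expand_indirect("load", [0x40, 0x10, 0x88], iid=5):
    print(req)
```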

Further, as shown in FIG. 4, the control port 30 stores, for each request, an entry that includes a status flag indicating the current state of the cache access, such as a load or store, specified by the request. FIG. 4 shows an example of the entries held by a control port. The request column of FIG. 4 stores requests in the format shown in FIG. 3, and the status column holds the status flag. A status flag of “1” indicates that the request has been received and has not yet been processed. A status flag of “2” indicates that processing of the request has completed. A status flag of “0” indicates that the request completion response notification for the request has already been sent to the instruction control unit 11 and the entry for the request has been released.

That is, when processing a request with INDIRECT = 0, the control port 30, on receiving the request from the instruction control unit 11, creates one entry for the received request with the status flag set to “1”.

Next, the control port 30 inputs the request to the pipeline 130. When it receives a processing completion notification from the hit determination circuit 135 or 136, it changes the status flag of the entry corresponding to the request to “2”. If, instead, it receives a re-input notification from the hit determination circuit 135 or 136 after inputting the request, the control port 30 inputs the request to the pipeline 130 again and keeps the status flag at “1”.

Thereafter, when the control port management unit 131 transmits a request completion response to the instruction control unit 11, the control port 30 releases the entry of the request corresponding to that response and changes its status to “0”.
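The status flag transitions described above for a single entry can be sketched as a small state holder. The class `Entry`, its method names, and the constant names are hypothetical; only the flag values 0, 1, and 2 and their meanings come from the text.

```python
RELEASED, PENDING, DONE = 0, 1, 2  # status flag values from the description


class Entry:
    """One control port entry holding a request and its status flag."""

    def __init__(self, request):
        self.request = request
        self.status = PENDING  # created with the flag set to 1 on receipt

    def on_pipeline_result(self, completed):
        """Handle the notification from a hit determination circuit."""
        if completed:
            self.status = DONE  # processing completion notification
        # On a re-input notification the flag stays at 1 and the request
        # is input to the pipeline again.

    def on_completion_response_sent(self):
        """Release the entry once the completion response has been sent."""
        self.status = RELEASED


entry = Entry("load @0x40")
entry.on_pipeline_result(completed=False)  # re-input: flag remains 1
entry.on_pipeline_result(completed=True)   # completion: flag becomes 2
entry.on_completion_response_sent()        # release: flag becomes 0
print(entry.status)
```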

When processing requests with INDIRECT = 1, the instruction control unit 11 issues to the cache control unit 13 as many requests as the SIMD value, all with the same IID, and the control ports 30 create, for the received requests, as many entries as the SIMD value, each with the status flag set to “1”. The control port management unit 131 monitors the entry of each request in each control port 30. When the status flag of any entry becomes “2”, the control port management unit 131 acquires the IID of the request stored in that entry. Next, it acquires the number of elements of the instruction having that IID. It then extracts the requests having the same IID from the entries held by the control ports 30 and checks the status flag of each extracted entry.

If the status flags of all the extracted entries are “2”, the control port management unit 131 transmits a request completion response to the instruction control unit 11. If the extracted entries include an entry whose status flag is “1”, the control port management unit 131 waits until a status flag is next changed to “2”.
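The completion check for an indirect instruction can be sketched as a filter over the entries of all control ports. Representing entries as `(iid, status)` pairs and the function name `completion_ready` are assumptions of this sketch.

```python
PENDING, DONE = 1, 2  # status flag values from the description


def completion_ready(entries, iid):
    """Return True when every entry carrying `iid` has status flag 2.

    `entries` is a list of (iid, status) pairs gathered from all control
    ports; this flat representation is an assumption of the sketch.
    """
    statuses = [status for entry_iid, status in entries if entry_iid == iid]
    return bool(statuses) and all(status == DONE for status in statuses)


entries = [(5, DONE), (5, PENDING), (6, DONE)]
print(completion_ready(entries, iid=5))  # one element of IID 5 is still pending
print(completion_ready(entries, iid=6))  # every entry of IID 6 is done
```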

In this embodiment, the control port management unit 131 transmits the request completion response when the status flags of all entries with the same IID are “2”, but it may instead perform the check using the number of requests contained in each instruction, that is, the number of elements. This can be realized, for example, as follows. When a request is an indirect access, the control port management unit 131 receives, together with the request, the number of elements of the instruction containing the request from the instruction control unit 11, and stores the number of elements in association with the IID. Later, when it extracts the entries with the same IID, the control port management unit 131 acquires the number of elements corresponding to that IID and determines whether the number of extracted entries matches the acquired number of elements.
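A minimal sketch of the element-count variant follows. The mapping `element_counts` from IID to number of elements, the function name, and the choice to combine the count check with the status flag check are all assumptions; the text only states that the number of extracted entries is compared with the stored number of elements.

```python
DONE = 2  # status flag value for a completed request


def completion_ready_with_count(entries, iid, element_counts):
    """Check completion using the element count stored for `iid`.

    `entries` is a list of (iid, status) pairs and `element_counts` maps an
    IID to the number of elements of that instruction. Both representations
    are hypothetical.
    """
    statuses = [status for entry_iid, status in entries if entry_iid == iid]
    if len(statuses) != element_counts[iid]:
        return False  # the number of extracted entries does not match
    return all(status == DONE for status in statuses)


counts = {7: 4}
print(completion_ready_with_count([(7, DONE), (7, DONE)], 7, counts))
print(completion_ready_with_count([(7, DONE)] * 4, 7, counts))
```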

After transmitting the request completion response, the control port management unit 131 instructs each control port 30 to change to “0” the status flag of each entry that was extracted for the status flag check.

The pipelines 132 and 133 receive requests input from the control ports 30 and send them to the cache RAM 134. The pipeline 132 also sends its request to the hit determination circuit 135, and the pipeline 133 sends its request to the hit determination circuit 136. In addition, the pipelines 132 and 133 send their requests to the magnitude comparator 137.

When data exists at the address specified by a request received from the pipeline 132, the cache RAM 134 reads the data from the specified address and outputs it to the hit determination circuit 135. Likewise, the cache RAM 134 reads data from the address specified by a request received from the pipeline 133 and outputs it to the hit determination circuit 136.

The hit determination circuit 135 receives, from the pipeline 132, the request input from the control port 30. When data exists at the address specified by the request, the hit determination circuit 135 acquires the data from the cache RAM 134. The hit determination circuit 135 also receives, from the magnitude comparator 137, the result of comparing the requests input to the pipelines 132 and 133.

Next, the hit determination circuit 135 uses the tag of the request received from the pipeline 132 to determine whether the data acquired from the cache RAM 134 contains data that hits.

If there is hit data, the hit determination circuit 135 transmits the hit data to the arithmetic control unit 12. Hereinafter, the case where the hit determination circuit 135 or 136 determines that data has hit is called a “cache hit”. The hit determination circuit 135 also transmits a processing completion notification to the control port 30 that input the request.

On the other hand, when no data produces a cache hit, that is, when no data is sent from the cache RAM 134 or when the tag-based determination finds no hit, the hit determination circuit 135 performs the following processing.

The hit determination circuit 135 refers to the comparison result input from the magnitude comparator 137. If the comparison result indicates that the request input to the pipeline 132 is to be processed, the hit determination circuit 135 requests the memory access control unit 138 to transfer the data corresponding to the request. The hit determination circuit 135 then receives, from the memory access control unit 138, a notification that the response data has been stored in the cache RAM 134, and notifies the control port 30 to re-input the request.

Conversely, if the comparison result indicates that the request input to the pipeline 133 is to be processed, the hit determination circuit 135 transmits a request re-input notification to the control port 30 that sent the request.
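The three outcomes handled by the hit determination circuit 135 can be summarized as a decision function. This is a behavioral sketch only: the function name and the action strings are hypothetical labels, and the comparator result is passed in as a boolean rather than derived here.

```python
def hit_circuit_actions(hit, own_request_selected):
    """Return the ordered actions of a hit determination circuit.

    `hit` is the outcome of the tag-based hit determination, and
    `own_request_selected` is True when the comparison result indicates
    that the request on this circuit's own pipeline is to be processed.
    """
    if hit:
        return ["send_data_to_arithmetic_control_unit", "notify_completion"]
    if own_request_selected:
        return [
            "request_data_transfer_from_memory_access_control_unit",
            "receive_store_notification",
            "notify_reinput",
        ]
    # Miss, and the request on the other pipeline was selected.
    return ["notify_reinput"]


print(hit_circuit_actions(hit=True, own_request_selected=False))
print(hit_circuit_actions(hit=False, own_request_selected=True))
print(hit_circuit_actions(hit=False, own_request_selected=False))
```

In both miss cases the request is eventually input to the pipeline again; the comparison result only decides which circuit issues the data transfer request.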

The hit determination circuit 136 operates in the same way as the hit determination circuit 135, so its description is omitted. In the hit determination circuit 136, however, the roles of the pipeline 132 and the pipeline 133 are reversed.

The magnitude comparator 137 receives requests from each of the pipelines 132 and 133 and acquires the element number of each received request. It then compares the acquired element numbers and transmits, to the hit determination circuits 135 and 136, a comparison result indicating that the request with the larger element number is to be processed. For example, when the element number of the request input from the pipeline 132 is larger than that of the request input from the pipeline 133, the magnitude comparator 137 outputs a comparison result indicating that the request input from the pipeline 132 is to be processed.

When a cache miss occurs, the memory access control unit 138 receives a data transfer request from the hit determination circuit 135 or 136. The memory access control unit 138 then transmits to the secondary cache control unit 14 a data transfer request for the request whose hit determination was performed by the circuit that sent the data transfer request.

Thereafter, the memory access control unit 138 receives response data for the data transfer request from the secondary cache control unit 14 and stores the received response data in the cache RAM 134. It then notifies whichever of the hit determination circuits 135 and 136 sent the data transfer request that the response data has been stored.

Arbitration between requests by the hit determination circuits 135 and 136 and the magnitude comparator 137, for the case where neither of the requests input from the pipelines 132 and 133 hits, is now described in detail with reference to FIG. 5. FIG. 5 is a diagram for explaining arbitration between requests. The hit determination circuits 135 and 136 shown in FIG. 5 are one detailed example of those circuits. In this example, the hit determination circuit 135 includes a hit determination unit 351 and AND circuits 352 and 353, and the hit determination circuit 136 includes a hit determination unit 361 and AND circuits 362 and 363. The circles on the input side of the AND circuits 352 and 362 indicate inverted inputs. In the following description, processing that is not directly related to arbitration between requests, such as transmission of requests to the cache RAM 134, is omitted.

The instruction control unit 11 extracts a request A and a request B from an instruction XXX whose IID is “0”. It assigns “0” as the element number of the request A and “1” as the element number of the request B. In FIG. 5, the number following “element #” is the element number.

The instruction control unit 11 sends the request A with element number “0” to the control port 31 and the request B with element number “1” to the control port 32. The control ports 31 and 32 are each one of the control ports 30 in FIG. 2.

制御ポート３１は、リクエストＡをヒット判定部３５１及び大小比較器１３７へ送信する。また、制御ポート３２は、リクエストＢをヒット判定部３６１及び大小比較器１３７へ送信する。 The control port 31 transmits the request A to the hit determination unit 351 and the size comparator 137. In addition, the control port 32 transmits the request B to the hit determination unit 361 and the magnitude comparator 137.

ヒット判定部３５１は、リクエストＡのヒット判定を行う。リクエストがヒットした場合、ヒット判定部３５１は、「０」をＡＮＤ回路３５２及び３５３へ出力する。リクエストがヒットしない場合、ヒット判定部３５１は、「１」をＡＮＤ回路３５２及び３５３へ出力する。ここでは、リクエストＡがヒットしない場合であるので、ヒット判定部３５１は、「１」をＡＮＤ回路３５２及び３５３へ出力する。 The hit determination unit 351 performs a hit determination for the request A. When the request is hit, the hit determination unit 351 outputs “0” to the AND circuits 352 and 353. If the request does not hit, the hit determination unit 351 outputs “1” to the AND circuits 352 and 353. Here, since the request A does not hit, the hit determination unit 351 outputs “1” to the AND circuits 352 and 353.

ヒット判定部３６１は、リクエストＢのヒット判定を行う。リクエストがヒットした場合、ヒット判定部３６１は、「０」をＡＮＤ回路３６２及び３６３へ出力する。リクエストがヒットしない場合、ヒット判定部３６１は、「１」をＡＮＤ回路３６２及び３６３へ出力する。ここでは、リクエストＢがヒットしない場合であるので、ヒット判定部３６１は、「１」をＡＮＤ回路３６２及び３６３へ出力する。 The hit determination unit 361 performs hit determination for the request B. When the request hits, the hit determination unit 361 outputs “0” to the AND circuits 362 and 363. When the request does not hit, the hit determination unit 361 outputs “1” to the AND circuits 362 and 363. Here, since the request B does not hit, the hit determination unit 361 outputs “1” to the AND circuits 362 and 363.

大小比較器１３７は、リクエストＡ及びＢを制御ポート３１及び３２から受信する。そして、大小比較器１３７は、リクエストＡから要素番号「０」を取得する。また、大小比較器１３７は、リクエストＢから要素番号「１」を取得する。次に、大小比較器１３７は、リクエストＡの要素番号「０」とリクエストＢの要素番号「１」を比較する。この場合、大小比較器１３７は、リクエストＡの要素番号がリクエストＢの要素番号より小さいと判定する。 The large / small comparator 137 receives requests A and B from the control ports 31 and 32. Then, the large / small comparator 137 acquires the element number “0” from the request A. The large / small comparator 137 acquires the element number “1” from the request B. Next, the size comparator 137 compares the element number “0” of the request A with the element number “1” of the request B. In this case, the size comparator 137 determines that the element number of the request A is smaller than the element number of the request B.

そして、大小比較器１３７は、要素番号が小さい方のリクエストを、データ転送要求を実行するリクエストとする。さらに、大小比較器１３７は、ヒット判定回路１３５及び１３６のうち、データ転送要求を実行しないリクエストのヒット判定を行った方に「０」を出力し、他方に「１」を出力する。ここでは、大小比較器１３７は、リクエストＡについてのデータ転送要求の実行を表す判定結果として「１」をＡＮＤ回路３５２及び３５３へ出力する。また、大小比較器１３７は、リクエストＡについてのデータ転送要求の実行を表す判定結果、すなわちリクエストＢについてはデータ転送要求を実行しないことを表す判定結果として「０」をＡＮＤ回路３６２及び３６３へ出力する。 Then, the magnitude comparator 137 selects the request with the smaller element number as the request that executes the data transfer request. Further, the magnitude comparator 137 outputs “0” to whichever of the hit determination circuits 135 and 136 performed the hit determination for the request that does not execute the data transfer request, and outputs “1” to the other. Here, the magnitude comparator 137 outputs “1” to the AND circuits 352 and 353 as a determination result indicating that the data transfer request is executed for the request A. The magnitude comparator 137 also outputs “0” to the AND circuits 362 and 363 as a determination result indicating that the data transfer request is executed for the request A, that is, that the data transfer request is not executed for the request B.

ＡＮＤ回路３５２は、ヒット判定部３５１からの入力と大小比較器１３７からの入力を反転させた値との論理積を制御ポート３１へ出力する。ここでは、ＡＮＤ回路３５２は、ヒット判定部３５１から入力された「１」と、大小比較器１３７からの入力を反転させた値である「０」の論理積である「０」を制御ポート３１へ出力する。ここで、制御ポート３１又は３２への「０」の出力は、リクエストの処理継続を表す。 The AND circuit 352 outputs, to the control port 31, the logical product of the input from the hit determination unit 351 and the inverted value of the input from the magnitude comparator 137. Here, the AND circuit 352 outputs to the control port 31 the value “0”, which is the logical product of “1” input from the hit determination unit 351 and “0”, the inverted value of the input from the magnitude comparator 137. An output of “0” to the control port 31 or 32 represents continuation of request processing.

ＡＮＤ回路３５３は、ヒット判定部３５１からの入力と大小比較器１３７からの入力との論理積をメモリアクセス制御部１３８へ出力する。ここでは、ヒット判定部３５１及び大小比較器１３７のいずれからも「１」が入力されるので、ＡＮＤ回路３５３は、「１」をメモリアクセス制御部１３８へ出力する。 The AND circuit 353 outputs the logical product of the input from the hit determination unit 351 and the input from the magnitude comparator 137 to the memory access control unit 138. Here, since “1” is input from both the hit determination unit 351 and the magnitude comparator 137, the AND circuit 353 outputs “1” to the memory access control unit 138.

ＡＮＤ回路３６２は、ヒット判定部３６１からの入力と大小比較器１３７からの入力を反転させた値との論理積を制御ポート３２へ出力する。ここでは、ＡＮＤ回路３６２は、ヒット判定部３６１から入力された「１」と、大小比較器１３７からの入力を反転させた値である「１」の論理積である「１」を制御ポート３２へ出力する。制御ポート３１又は３２への「１」の出力は、リクエストの再投入を表す。 The AND circuit 362 outputs, to the control port 32, the logical product of the input from the hit determination unit 361 and the inverted value of the input from the magnitude comparator 137. Here, the AND circuit 362 outputs to the control port 32 the value “1”, which is the logical product of “1” input from the hit determination unit 361 and “1”, the inverted value of the input from the magnitude comparator 137. An output of “1” to the control port 31 or 32 represents re-input of the request.

ＡＮＤ回路３６３は、ヒット判定部３６１からの入力と大小比較器１３７からの入力との論理積をメモリアクセス制御部１３８へ出力する。ここでは、ヒット判定部３６１から「１」の入力を受け、大小比較器１３７から「０」が入力されるので、ＡＮＤ回路３６３は、「０」をメモリアクセス制御部１３８へ出力する。 The AND circuit 363 outputs the logical product of the input from the hit determination unit 361 and the input from the magnitude comparator 137 to the memory access control unit 138. Here, since “1” is input from the hit determination unit 361 and “0” is input from the magnitude comparator 137, the AND circuit 363 outputs “0” to the memory access control unit 138.

制御ポート３１は、ＡＮＤ回路３５２から「０」の入力を受けると、リクエストＡの処理が継続されていることを確認する。また、制御ポート３２は、ＡＮＤ回路３６２から「１」の入力を受けると、リクエストＢをパイプライン１３３へ再投入する。 When receiving “0” input from the AND circuit 352, the control port 31 confirms that the processing of the request A is continued. Further, upon receiving “1” input from the AND circuit 362, the control port 32 re-injects the request B into the pipeline 133.

ここでは、双方がキャッシュミスした場合の調停処理についての説明であり、ヒット判定回路１３５及び１３６による調停結果を制御ポート３０へ通知するまでの説明である。ただし、最終的には、ヒット判定回路１３５及び１３６は、以下の処理を行う。すなわち、リクエストの処理が継続する状態で、キャッシュにリクエストが指定するデータが登録された場合に、ヒット判定回路１３５及び１３６は、リクエストの送出元の制御ポート３０にリクエストの再投入を指示する。また、リクエストがキャッシュヒットした場合、ヒット判定回路１３５及び１３６は、リクエストの送出元の制御ポート３０にリクエストの処理完了の通知を行う。 Here, the description is about the arbitration process when both of them make a cache miss, and it is the explanation until the control port 30 is notified of the arbitration results by the hit determination circuits 135 and 136. However, finally, the hit determination circuits 135 and 136 perform the following processing. That is, when the data designated by the request is registered in the cache while the request processing is continued, the hit determination circuits 135 and 136 instruct the control port 30 of the request transmission source to reinject the request. When the request has a cache hit, the hit determination circuits 135 and 136 notify the request transmission source control port 30 of the completion of the request processing.

メモリアクセス制御部１３８は、ＡＮＤ回路３５３及び３６３から論理積の入力を受ける。そして、メモリアクセス制御部１３８は、「１」の値を入力した側のリクエストについてのデータ転送要求を二次キャッシュ制御部１４へ出力する。ここでは、ＡＮＤ回路３５３から「１」の入力を受け、ＡＮＤ回路３６３から「０」の入力を受けるので、メモリアクセス制御部１３８は、リクエストＡについてのデータ転送要求を二次キャッシュ制御部１４へ出力する。 The memory access control unit 138 receives logical product inputs from the AND circuits 353 and 363. Then, the memory access control unit 138 outputs a data transfer request for the request on the side where the value “1” is input to the secondary cache control unit 14. Here, since the input of “1” is received from the AND circuit 353 and the input of “0” is received from the AND circuit 363, the memory access control unit 138 sends the data transfer request for the request A to the secondary cache control unit 14. Output.
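The gate-level arbitration of FIG. 5 can be summarized in a few lines of Python. This is only an illustrative sketch of the logic described above, not the circuit itself: the function name `arbitrate`, its parameters, and the dictionary keys are hypothetical, and the comparator output toward the hit determination circuit 136 is assumed to be the complement of the output toward the hit determination circuit 135.

```python
def arbitrate(miss_a: int, miss_b: int, select_a: int) -> dict:
    """Model the AND circuits of FIG. 5 for two requests A and B.

    miss_a, miss_b: outputs of hit determination units 351 and 361
                    (1 = miss, 0 = hit).
    select_a:       output of the magnitude comparator toward circuit 135
                    (1 = request A executes the data transfer request).
    """
    # Assumption: the comparator sends the opposite value to circuit 136.
    select_b = 1 - select_a
    return {
        # AND 352 / 362: miss AND (NOT select); 1 means "re-input the request".
        "reinject_a": miss_a & (1 - select_a),
        "reinject_b": miss_b & (1 - select_b),
        # AND 353 / 363: miss AND select; 1 means "issue a data transfer request".
        "transfer_a": miss_a & select_a,
        "transfer_b": miss_b & select_b,
    }


# Both requests miss and request A has the smaller element number.
result = arbitrate(miss_a=1, miss_b=1, select_a=1)
```

For the case walked through above (both requests miss, request A selected), `result` marks request A for a data transfer request and request B for re-input, matching the outputs of the AND circuits 353 and 362.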

次に、図６を参照して、パイプライン１３２と１３３との間のデータ転送要求の調停における各処理のタイミングについて説明する。図６は、パイプライン間のデータ転送要求の調停のタイムチャートである。 Next, with reference to FIG. 6, the timing of each process in the arbitration of the data transfer request between the pipelines 132 and 133 will be described. FIG. 6 is a time chart of arbitration of a data transfer request between pipelines.

図６は、左端に処理対象を記載する。一番下の欄はリクエストＡ及びＢの双方を用いた大小比較を表す。また、図６は、右に進むにしたがい時間が時刻Ｔ１〜Ｔ９へ経過することを表す。また、ここでは、各パイプライン１３２及び１３３に投入されたリクエストに対して、Ｐ１〜Ｐ３の３サイクルでヒット判定の処理が行われる場合で説明する。 FIG. 6 lists the processing targets at the left end. The bottom row represents the magnitude comparison using both requests A and B. FIG. 6 shows time elapsing from time T1 to time T9 toward the right. Here, a case is described in which hit determination is performed in three cycles, P1 to P3, for the requests input to the pipelines 132 and 133.

時刻Ｔ１で、リクエストＡがパイプライン１３２に投入され、リクエストＢがパイプライン１３３に投入される。 At time T1, request A is input to the pipeline 132, and request B is input to the pipeline 133.

そして、時刻Ｔ３にあたるサイクルＰ２でリクエストＡ及びリクエストＢのキャッシュヒットの判定が行われる。 Then, the cache hit of request A and request B is determined in cycle P2 corresponding to time T3.

その後、時刻Ｔ５で、リクエストＡとリクエストＢとの大小判定がなされる。ここでは、リクエストＡがリクエストＢより小さいと判定された場合で説明する。そこで、リクエストＡについて、データ転送要求により二次キャッシュ又はメモリ２からのデータ取得であるメモリリクエストが行われる。また、リクエストＢは、パイプライン１３３へ再投入される。 Thereafter, at time T5, the magnitude comparison between the request A and the request B is performed. Here, the case where the request A is determined to be smaller than the request B is described. Accordingly, for the request A, a memory request, that is, acquisition of data from the secondary cache or the memory 2 by means of a data transfer request, is performed. The request B is re-input into the pipeline 133.

その後、時刻Ｔ７における次のタイミングのサイクルＰ２でリクエストＢのキャッシュミスが再度発生する。ここでは、リクエストＡ及びＢ以外のリクエストはない場合であるので、時刻Ｔ９で、リクエストＢについて、データ転送要求により二次キャッシュ又はメモリ２からのデータ取得であるメモリリクエストが行われる。 Thereafter, the cache miss of the request B occurs again in the cycle P2 of the next timing at the time T7. Here, since there is no request other than the requests A and B, at time T9, a memory request for data acquisition from the secondary cache or the memory 2 is performed for the request B by a data transfer request.

このように、パイプライン１３２及び１３３間でキャッシュミスが競合した場合でも、若い番号が与えられたリクエストから順次データ転送要求が実行されることで、最終的に全てのリクエストのデータ転送が実行される。 In this way, even when cache misses conflict between the pipelines 132 and 133, data transfer requests are executed sequentially starting from the request with the lowest number, so that the data transfers for all requests are eventually executed.

図１に戻って説明を続ける。二次キャッシュ制御部１４は、二次キャッシュを有する。二次キャッシュ制御部１４は、データ転送要求をメモリアクセス制御部１３８から受信する。そして、二次キャッシュ制御部１４は、自己が有する二次キャッシュにデータ転送要求で指定されたデータが格納されているか否かを判定する。格納されている場合、二次キャッシュ制御部１４は、データ転送要求で指定されたデータを二次キャッシュから取得し、取得したデータを応答データとしてメモリアクセス制御部１３８へ送信する。 Returning to FIG. 1, the description will be continued. The secondary cache control unit 14 has a secondary cache. The secondary cache control unit 14 receives a data transfer request from the memory access control unit 138. Then, the secondary cache control unit 14 determines whether the data designated by the data transfer request is stored in the secondary cache that the secondary cache control unit 14 has. If stored, the secondary cache control unit 14 acquires the data specified by the data transfer request from the secondary cache, and transmits the acquired data to the memory access control unit 138 as response data.

これに対して、データ転送要求で指定されたデータが二次キャッシュに格納されていない場合、二次キャッシュ制御部１４は、データ転送要求をメモリ制御部１５に送信する。その後、二次キャッシュ制御部１４は、応答データをメモリ制御部１５から受信する。そして、二次キャッシュ制御部１４は、受信した応答データをメモリアクセス制御部１３８へ送信する。 On the other hand, when the data specified by the data transfer request is not stored in the secondary cache, the secondary cache control unit 14 transmits the data transfer request to the memory control unit 15. Thereafter, the secondary cache control unit 14 receives the response data from the memory control unit 15. Then, the secondary cache control unit 14 transmits the received response data to the memory access control unit 138.

メモリ制御部１５は、データ転送要求を二次キャッシュ制御部１４から受ける。そして、メモリ制御部１５は、データ転送要求で指定されたデータをメモリ２から取得する。そして、メモリ制御部１５は、取得したデータを二次キャッシュ制御部１４へ送信する。 The memory control unit 15 receives a data transfer request from the secondary cache control unit 14. Then, the memory control unit 15 acquires the data designated by the data transfer request from the memory 2. Then, the memory control unit 15 transmits the acquired data to the secondary cache control unit 14.
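The lookup order described above (secondary cache first, then the memory 2 through the memory control unit 15) can be sketched as follows. The function `fetch` and the use of dictionaries to stand in for the secondary cache and the memory are illustrative assumptions; whether the secondary cache also registers the fetched data is not stated here, so the sketch does not model it.

```python
def fetch(address, l2_cache: dict, memory: dict):
    """Return (data, source) for a data transfer request.

    l2_cache and memory are plain dicts standing in for the secondary
    cache and the memory 2 (an assumption made for illustration).
    """
    if address in l2_cache:
        # Data is held in the secondary cache: answer directly.
        return l2_cache[address], "secondary cache"
    # Otherwise the request is forwarded to the memory control unit,
    # and its response is relayed back to the requester.
    return memory[address], "memory"


memory = {0x10: "a", 0x20: "b"}
l2_cache = {0x10: "a"}
```

With these example contents, a request for address `0x10` is answered from the secondary cache, while a request for `0x20` is answered from memory.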

演算制御部１２は、演算処理の実行要求などを命令制御部１１から受信する。また、演算制御部１２は、データをキャッシュ制御部１３から受信する。そして、演算制御部１２は、キャッシュ制御部１３から受信したデータを用いて演算処理などを実行する。ただし、実行する処理にキャッシュデータを用いない場合など、演算制御部１２は、キャッシュ制御部１３からのデータの受信を行わずに、処理を実行する場合もある。 The arithmetic control unit 12 receives execution requests for arithmetic processing and the like from the instruction control unit 11. The arithmetic control unit 12 also receives data from the cache control unit 13. Then, the arithmetic control unit 12 executes arithmetic processing and the like using the data received from the cache control unit 13. However, in cases such as when the processing to be executed does not use cache data, the arithmetic control unit 12 may execute the processing without receiving data from the cache control unit 13.

次に、図７を参照して、本実施例に係る演算処理装置による命令処理の流れを説明する。図７は、実施例１に係る演算処理装置による命令処理のフローチャートである。以下では、ヒット判定回路１３５とヒット判定回路１３６とを区別せずに、「ヒット判定回路１４０」という。 Next, with reference to FIG. 7, the flow of instruction processing by the arithmetic processing unit according to this embodiment will be described. FIG. 7 is a flowchart of command processing performed by the arithmetic processing apparatus according to the first embodiment. Hereinafter, the hit determination circuit 135 and the hit determination circuit 136 are referred to as “hit determination circuit 140” without being distinguished from each other.

命令制御部１１は、命令からリクエストを取得し、取得したリクエストを制御ポート３０に発行する（ステップＳ１０１）。 The instruction control unit 11 acquires a request from an instruction and issues the acquired request to the control port 30 (step S101).

制御ポート３０は、リクエストを受信する（ステップＳ１０２）。そして、制御ポート３０は、受信したリクエストのエントリのステータスフラグを「１」にする（ｓｔａｔｕｓ＝１）（ステップＳ１０３）。 The control port 30 receives the request (step S102). Then, the control port 30 sets the status flag of the received request entry to “1” (status = 1) (step S103).

次に、制御ポート３０は、リクエストをパイプライン１３０に投入する（ステップＳ１０４）。 Next, the control port 30 inputs the request to the pipeline 130 (step S104).

ヒット判定回路１４０は、リクエストがキャッシュヒットしたか否かを判定する（ステップＳ１０５）。キャッシュヒットしない場合（ステップＳ１０５：否定）、メモリアクセス制御部１３８は、データ転送要求を二次キャッシュ制御部１４へ送信する（ステップＳ１０６）。そして、メモリアクセス制御部１３８は、二次キャッシュ制御部１４から応答データを受信し、受信した応答データをキャッシュＲＡＭ１３４に登録する（ステップＳ１０７）。その後、ヒット判定回路１４０は、リクエストの再投入を制御ポート３０へ通知する。制御ポート３０は、ステップＳ１０４へ戻る。 The hit determination circuit 140 determines whether or not the request has a cache hit (step S105). If there is no cache hit (No at Step S105), the memory access control unit 138 transmits a data transfer request to the secondary cache control unit 14 (Step S106). Then, the memory access control unit 138 receives the response data from the secondary cache control unit 14, and registers the received response data in the cache RAM 134 (step S107). Thereafter, the hit determination circuit 140 notifies the control port 30 of the request re-injection. The control port 30 returns to step S104.

これに対して、キャッシュヒットした場合（ステップＳ１０５：肯定）、ヒット判定回路１４０は、ロード・ストア処理を実行する（ステップＳ１０８）。 On the other hand, when a cache hit occurs (step S105: Yes), the hit determination circuit 140 executes a load / store process (step S108).

そして、ヒット判定回路１４０は、リクエストの処理完了を制御ポート３０へ通知する。制御ポート３０は、リクエストの処理完了の通知を受けて、該当するリクエストのエントリのステータスフラグを「２」に変更する（ｓｔａｔｕｓ＝２）（ステップＳ１０９）。 The hit determination circuit 140 notifies the control port 30 of the completion of the request processing. In response to the notification of the completion of the request processing, the control port 30 changes the status flag of the entry of the corresponding request to “2” (status = 2) (step S109).

制御ポート管理部１３１は、ステータスフラグが「２」に変更されると、そのリクエストがインダイレクトアクセスでないか否か、すなわち、インダイレクトアクセスか否かを示すフラグが「０」（ｉｎｄｉｒｅｃｔ＝０）か否かを判定する（ステップＳ１１０）。リクエストがインダイレクトアクセスでない場合（ステップＳ１１０：肯定）、制御ポート管理部１３１は、ステップＳ１１３へ進む。 When the status flag is changed to “2”, the control port management unit 131 determines whether the request is not an indirect access, that is, whether the flag indicating indirect access is “0” (indirect = 0) (step S110). If the request is not an indirect access (step S110: Yes), the control port management unit 131 proceeds to step S113.

これに対して、リクエストがインダイレクトアクセスである場合（ステップＳ１１０：否定）、制御ポート管理部１３１は、そのリクエストを含む命令のＩＩＤを取得する。次に、制御ポート管理部１３１は、全エントリを検索する（ステップＳ１１１）。そして、制御ポート管理部１３１は、取得したＩＩＤと同じＩＩＤを有する命令に含まれるリクエストの全エントリのステータスフラグが「２」（ｓｔａｔｕｓ＝２）か否かを判定する（ステップＳ１１２）。 On the other hand, when the request is indirect access (No at Step S110), the control port management unit 131 acquires the IID of the instruction including the request. Next, the control port management unit 131 searches all entries (step S111). Then, the control port management unit 131 determines whether or not the status flags of all the entries of the requests included in the instruction having the same IID as the acquired IID are “2” (status = 2) (step S112).

ステータスフラグが「２」以外のエントリが含まれていた場合（ステップＳ１１２：否定）、制御ポート管理部１３１は、ステップＳ１１１に戻る。ここで、並行して他のリクエストについてもステップＳ１０１〜Ｓ１０９の処理が行われるため、ステップＳ１１１及び１１２を繰り返す間に、全エントリのステータスフラグが「２」に変わる。一方、全てのエントリのステータスフラグが「２」の場合（ステップＳ１１２：肯定）、制御ポート管理部１３１は、ステップＳ１１３へ進む。 If any entry has a status flag other than “2” (step S112: No), the control port management unit 131 returns to step S111. Because steps S101 to S109 are performed in parallel for the other requests, the status flags of all the entries change to “2” while steps S111 and S112 are repeated. On the other hand, if the status flags of all the entries are “2” (step S112: Yes), the control port management unit 131 proceeds to step S113.

そして、制御ポート管理部１３１は、リクエスト完了応答を命令制御部１１へ送信する（ステップＳ１１３）。 Then, the control port management unit 131 transmits a request completion response to the instruction control unit 11 (step S113).

そして、制御ポート３０は、リクエスト完了応答が送信されたリクエストのエントリのステータスフラグを「０」に変える（ｓｔａｔｕｓ＝０）（ステップＳ１１４）。 Then, the control port 30 changes the status flag of the entry of the request to which the request completion response is transmitted to “0” (status = 0) (step S114).
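The completion check of steps S110 to S113 can be illustrated with a small Python sketch. The `Entry` class, its field names, and the function `can_send_completion` are hypothetical names introduced for illustration; the meaning attached to status value 0 (entry reset after the completion response) is inferred from step S114.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    iid: int       # identifier of the instruction that contains the request
    element: int   # element number of the request
    indirect: int  # 1 = indirect access, 0 = otherwise
    status: int    # 1 = received, 2 = processing complete, 0 = reset (assumed)


def can_send_completion(entry: Entry, entries: list) -> bool:
    """Decide whether the request completion response may be sent (S110 to S113)."""
    if entry.status != 2:
        return False
    if entry.indirect == 0:  # step S110: not an indirect access
        return True
    # Steps S111 and S112: every entry of the same instruction must be complete.
    return all(e.status == 2 for e in entries if e.iid == entry.iid)


entries = [
    Entry(iid=0, element=0, indirect=1, status=2),
    Entry(iid=0, element=1, indirect=1, status=1),  # still in progress
    Entry(iid=1, element=0, indirect=0, status=2),
]
```

In this example the first entry cannot yet trigger a completion response because a sibling request of the same instruction is still in progress, whereas the non-indirect entry of instruction 1 can.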

次に、図８を参照して、大小比較器による比較処理の流れを説明する。図８は、大小比較器による比較処理のフローチャートである。ここでは、リクエストＡがパイプライン１３２に投入され、リクエストＢがパイプライン１３３に投入された場合で説明する。図８では、リクエストＡを単に「Ａ」と表し、リクエストＢを単に「Ｂ」と表す。 Next, with reference to FIG. 8, the flow of comparison processing by the size comparator will be described. FIG. 8 is a flowchart of the comparison process by the magnitude comparator. Here, a case where request A is input to the pipeline 132 and request B is input to the pipeline 133 will be described. In FIG. 8, the request A is simply expressed as “A”, and the request B is simply expressed as “B”.

大小比較器１３７は、リクエストＡを含む命令のＩＩＤ（以下では、「ＡのＩＩＤ」という。）とリクエストＢを含む命令のＩＩＤ（以下では、「ＢのＩＩＤ」という。）とが異なるか否かを判定する（ステップＳ２０１）。 The magnitude comparator 137 determines whether or not the IID of the instruction including the request A (hereinafter referred to as “A's IID”) is different from the IID of the instruction including the request B (hereinafter referred to as “B's IID”). Is determined (step S201).

ＩＩＤが異なる場合（ステップＳ２０１：肯定）、大小比較器１３７は、リクエストＡのＩＩＤがリクエストＢのＩＩＤよりも小さいか否かを判定する（ステップＳ２０２）。 When the IIDs are different (step S201: Yes), the size comparator 137 determines whether the IID of the request A is smaller than the IID of the request B (step S202).

ＡのＩＩＤがＢのＩＩＤよりも小さい場合（ステップＳ２０２：肯定）、大小比較器１３７は、リクエストデータ転送要求をヒット判定回路１３５へ指示する（ステップＳ２０４）。例えば、図５のような構成の場合、大小比較器１３７は、ＡＮＤ回路３５２及び３５３に「１」を出力する。この場合、大小比較器１３７は、ＡＮＤ回路３６２及び３６３に「０」を出力する。 When A's IID is smaller than B's IID (step S202: affirmative), the magnitude comparator 137 instructs the hit determination circuit 135 to make a request data transfer request (step S204). For example, in the case of the configuration as shown in FIG. 5, the magnitude comparator 137 outputs “1” to the AND circuits 352 and 353. In this case, the magnitude comparator 137 outputs “0” to the AND circuits 362 and 363.

これに対して、ＢのＩＩＤがＡのＩＩＤよりも小さい場合（ステップＳ２０２：否定）、大小比較器１３７は、リクエストデータ転送要求をヒット判定回路１３６へ指示する（ステップＳ２０５）。例えば、図５のような構成の場合、大小比較器１３７は、ＡＮＤ回路３６２及び３６３に「１」を出力する。この場合、大小比較器１３７は、ＡＮＤ回路３５２及び３５３に「０」を出力する。 On the other hand, when the IID of B is smaller than the IID of A (No at Step S202), the magnitude comparator 137 instructs the hit determination circuit 136 to make a request data transfer request (Step S205). For example, in the case of the configuration shown in FIG. 5, the magnitude comparator 137 outputs “1” to the AND circuits 362 and 363. In this case, the magnitude comparator 137 outputs “0” to the AND circuits 352 and 353.

一方、ＩＩＤが同じ場合（ステップＳ２０１：否定）、大小比較器１３７は、リクエストＡの要素番号がリクエストＢの要素番号より小さいか否かを判定する（ステップＳ２０３）。 On the other hand, when the IID is the same (No at Step S201), the magnitude comparator 137 determines whether the element number of the request A is smaller than the element number of the request B (Step S203).

リクエストＡの要素番号がリクエストＢの要素番号より小さい場合（ステップＳ２０３：肯定）、大小比較器１３７は、リクエストデータ転送要求をヒット判定回路１３５へ指示する（ステップＳ２０４）。例えば、図５のような構成の場合、大小比較器１３７は、ＡＮＤ回路３５２及び３５３に「１」を出力する。この場合、大小比較器１３７は、ＡＮＤ回路３６２及び３６３に「０」を出力する。 When the element number of request A is smaller than the element number of request B (step S203: Yes), the magnitude comparator 137 instructs the hit determination circuit 135 to make a request data transfer request (step S204). For example, in the case of the configuration as shown in FIG. 5, the magnitude comparator 137 outputs “1” to the AND circuits 352 and 353. In this case, the magnitude comparator 137 outputs “0” to the AND circuits 362 and 363.

これに対して、リクエストＢの要素番号がリクエストＡの要素番号より小さい場合（ステップＳ２０３：否定）、大小比較器１３７は、リクエストデータ転送要求をヒット判定回路１３６へ指示する（ステップＳ２０５）。例えば、図５のような構成の場合、大小比較器１３７は、ＡＮＤ回路３６２及び３６３に「１」を出力する。この場合、大小比較器１３７は、ＡＮＤ回路３５２及び３５３に「０」を出力する。 On the other hand, when the element number of the request B is smaller than the element number of the request A (No at Step S203), the magnitude comparator 137 instructs the hit determination circuit 136 to make a request data transfer request (Step S205). For example, in the case of the configuration shown in FIG. 5, the magnitude comparator 137 outputs “1” to the AND circuits 362 and 363. In this case, the magnitude comparator 137 outputs “0” to the AND circuits 352 and 353.
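The comparison flow of FIG. 8 amounts to ordering requests first by IID and then by element number. The following Python sketch illustrates it; the `Request` type and the function `select_for_transfer` are hypothetical names, and IIDs are assumed to be plain integers compared numerically (any wrap-around handling of IIDs is outside this sketch).

```python
from typing import NamedTuple


class Request(NamedTuple):
    iid: int      # IID of the instruction that contains the request
    element: int  # element number within that instruction


def select_for_transfer(a: Request, b: Request) -> str:
    """Return "A" or "B": the request that executes the data transfer request."""
    if a.iid != b.iid:                            # step S201
        return "A" if a.iid < b.iid else "B"      # step S202
    return "A" if a.element < b.element else "B"  # step S203


winner = select_for_transfer(Request(iid=0, element=0), Request(iid=0, element=1))
```

Returning "A" corresponds to step S204 (the comparator outputs “1” toward the hit determination circuit 135), and returning "B" corresponds to step S205.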

次に、図９を参照して、パイプライン間のデータ転送調停処理の流れについて説明する。図９は、パイプライン間のデータ転送調停処理のフローチャートである。ここでは、リクエストＡ及びＢという２つのリクエストがあり、且つ、制御ポート３０として、制御ポート＃ａ及び＃ｂという２つの制御ポートがある場合で説明する。 Next, the flow of data transfer arbitration between pipelines will be described with reference to FIG. FIG. 9 is a flowchart of a data transfer arbitration process between pipelines. Here, there will be described a case where there are two requests A and B, and there are two control ports #a and #b as the control port 30.

命令制御部１１は、リクエストＡを制御ポート＃ａに投入し、リクエストＢを制御ポート＃ｂに投入する（ステップＳ３０１）。 The instruction control unit 11 inputs the request A to the control port #a and inputs the request B to the control port #b (step S301).

制御ポート＃ａは、リクエストＡのエントリのステータスフラグを「１」にする（ｓｔａｔｕｓ＝１）。また、制御ポート＃ｂは、リクエストＢのエントリのステータスフラグを「１」にする（ｓｔａｔｕｓ＝１）（ステップＳ３０２）。 The control port #a sets the status flag of the entry of request A to “1” (status = 1). Further, the control port #b sets the status flag of the entry of the request B to “1” (status = 1) (step S302).

制御ポート＃ａは、リクエストＡをパイプライン１３２に投入し、制御ポート＃ｂは、リクエストＢをパイプライン１３３に投入する（ステップＳ３０３）。 The control port #a inputs the request A into the pipeline 132, and the control port #b inputs the request B into the pipeline 133 (step S303).

ヒット判定回路１３５は、リクエストＡがキャッシュヒットしたか否かを判定する（ステップＳ３０４）。リクエストＡがキャッシュヒットしない場合（ステップＳ３０４：否定）、ヒット判定回路１３６は、リクエストＢがキャッシュヒットしたか否かを判定する（ステップＳ３０５）。リクエストＢがキャッシュヒットした場合（ステップＳ３０５：肯定）、制御ポート＃ｂは、リクエストＢのエントリのステータスフラグを「２」にする（ｓｔａｔｕｓ＝２）（ステップＳ３０６）。 The hit determination circuit 135 determines whether or not the request A has a cache hit (step S304). When the request A does not have a cache hit (No at Step S304), the hit determination circuit 136 determines whether or not the request B has a cache hit (Step S305). When the request B has a cache hit (step S305: Yes), the control port #b sets the status flag of the entry of the request B to “2” (status = 2) (step S306).

メモリアクセス制御回路１３８は、リクエストＡについてデータ転送要求を二次キャッシュ制御部１４へ送信する（ステップＳ３０７）。 The memory access control circuit 138 transmits a data transfer request for the request A to the secondary cache control unit 14 (step S307).

その後、メモリアクセス制御回路１３８は、リクエストＡについてのデータ転送要求に対する応答データをキャッシュＲＡＭ１３４に登録する（ステップＳ３０８）。その後、処理は、ステップＳ３０３に戻る。 Thereafter, the memory access control circuit 138 registers response data for the data transfer request for the request A in the cache RAM 134 (step S308). Thereafter, the process returns to step S303.

これに対して、リクエストＢがキャッシュヒットしない場合（ステップＳ３０５：否定）、大小比較器１３７は、リクエストＡの要素番号がリクエストＢの要素番号より小さいか否か判定する（ステップＳ３０９）。 On the other hand, when the request B does not hit the cache (No at Step S305), the large / small comparator 137 determines whether the element number of the request A is smaller than the element number of the request B (Step S309).

リクエストＡの要素番号がリクエストＢの要素番号よりも小さい場合（ステップＳ３０９：肯定）、メモリアクセス制御回路１３８は、リクエストＡについてデータ転送要求を二次キャッシュ制御部１４へ送信する（ステップＳ３１０）。 When the element number of the request A is smaller than the element number of the request B (step S309: Yes), the memory access control circuit 138 transmits a data transfer request for the request A to the secondary cache control unit 14 (step S310).

その後、メモリアクセス制御回路１３８は、リクエストＡについてのデータ転送要求に対する応答データをキャッシュＲＡＭ１３４に登録する（ステップＳ３１１）。その後、処理は、ステップＳ３０３に戻る。 Thereafter, the memory access control circuit 138 registers response data for the data transfer request for the request A in the cache RAM 134 (step S311). Thereafter, the process returns to step S303.

これに対して、リクエストＢの要素番号がリクエストＡの要素番号よりも小さい場合（ステップＳ３０９：否定）、メモリアクセス制御回路１３８は、リクエストＢについてデータ転送要求を二次キャッシュ制御部１４へ送信する（ステップＳ３１２）。 On the other hand, when the element number of request B is smaller than the element number of request A (step S309: No), the memory access control circuit 138 transmits a data transfer request for the request B to the secondary cache control unit 14. (Step S312).

その後、メモリアクセス制御回路１３８は、リクエストＢについてのデータ転送要求に対する応答データをキャッシュＲＡＭ１３４に登録する（ステップＳ３１３）。その後、処理は、ステップＳ３０３に戻る。 Thereafter, the memory access control circuit 138 registers response data for the data transfer request for the request B in the cache RAM 134 (step S313). Thereafter, the process returns to step S303.

一方、リクエストＡがキャッシュヒットした場合（ステップＳ３０４：肯定）、ヒット判定回路１３６は、リクエストＢがキャッシュヒットしたか否かを判定する（ステップＳ３１４）。リクエストＢがキャッシュヒットしない場合（ステップＳ３１４：否定）、制御ポート＃ａは、リクエストＡのエントリのステータスフラグを「２」にする（ｓｔａｔｕｓ＝２）（ステップＳ３１５）。 On the other hand, when the request A has a cache hit (Yes at Step S304), the hit determination circuit 136 determines whether or not the request B has a cache hit (Step S314). When the request B does not hit the cache (Step S314: No), the control port #a sets the status flag of the entry of the request A to “2” (status = 2) (Step S315).

メモリアクセス制御回路１３８は、リクエストＢについてデータ転送要求を二次キャッシュ制御部１４へ送信する（ステップＳ３１６）。 The memory access control circuit 138 transmits a data transfer request for the request B to the secondary cache control unit 14 (step S316).

その後、メモリアクセス制御回路１３８は、リクエストＢについてのデータ転送要求に対する応答データをキャッシュＲＡＭ１３４に登録する（ステップＳ３１７）。その後、処理は、ステップＳ３０３に戻る。 Thereafter, the memory access control circuit 138 registers response data for the data transfer request for the request B in the cache RAM 134 (step S317). Thereafter, the process returns to step S303.

これに対して、リクエストＢがキャッシュヒットした場合（ステップＳ３１４：肯定）、制御ポート＃ａは、リクエストＡのエントリのステータスフラグを「２」にする（ｓｔａｔｕｓ＝２）。また、制御ポート＃ｂは、リクエストＢのエントリのステータスフラグを「２」にする（ｓｔａｔｕｓ＝２）（ステップＳ３１８）。 On the other hand, when the request B has a cache hit (step S314: affirmative), the control port #a sets the status flag of the entry of the request A to “2” (status = 2). Further, the control port #b sets the status flag of the entry of request B to “2” (status = 2) (step S318).
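The loop of FIG. 9 can be traced with a short simulation. This is a simplified sketch under stated assumptions: the cache is modeled as a Python set of addresses, both requests are checked again on every pass through step S303, and the function name `run_arbitration` and its parameters are hypothetical.

```python
def run_arbitration(cache: set, addr_a, addr_b, elem_a: int, elem_b: int) -> list:
    """Return the order in which data transfer requests are issued."""
    transfers = []
    while True:
        # Steps S304, S305 and S314: hit determination for both requests.
        hit_a, hit_b = addr_a in cache, addr_b in cache
        if hit_a and hit_b:
            return transfers  # step S318: both requests are complete
        if not hit_a and not hit_b:
            # Step S309: the smaller element number wins the arbitration.
            target = "A" if elem_a < elem_b else "B"
        else:
            # Only one request missed, so it issues the transfer request.
            target = "B" if hit_a else "A"
        transfers.append(target)
        # Register the response data in the cache, then return to step S303.
        cache.add(addr_a if target == "A" else addr_b)


order = run_arbitration(set(), addr_a=0x100, addr_b=0x200, elem_a=0, elem_b=1)
```

Starting from an empty cache, request A (element number 0) issues its data transfer request first, and request B follows on the next pass, which mirrors the sequence in the time chart of FIG. 6.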

以上に説明したように、本実施例に係る演算処理装置は、インダイレクトアクセス処理において、１つの命令から複数のキャッシュアクセスリクエストを取り出し、パイプライン数ずつ順に取り出したキャッシュアクセスリクエストを制御ポートに送る。これにより、インダイレクトアクセスを行う場合の命令数を抑えることができ、命令制御部の処理負荷を軽減することができる。 As described above, in indirect access processing, the arithmetic processing unit according to the present embodiment extracts a plurality of cache access requests from one instruction and sends the extracted cache access requests to the control ports in order, as many at a time as there are pipelines. This reduces the number of instructions needed for indirect access and lightens the processing load on the instruction control unit.

また、命令制御部とキャッシュ制御部との間のバスなどの増加を抑えることができ、回路規模を小さく抑えることができる。また、回路規模の制限による並列に処理するキャッシュアクセスリクエストの数の制限を回避することができる。 In addition, an increase in the bus between the instruction control unit and the cache control unit can be suppressed, and the circuit scale can be reduced. Further, it is possible to avoid the limitation on the number of cache access requests processed in parallel due to the limitation on the circuit scale.

さらに、各リクエスト番号の要素番号の大小によりデータ転送要求の処理順の調停を行うので、命令番号が同じでも、適切にキャッシュアクセスリクエストを処理していくことができる。また、１つの命令に含まれるキャッシュアクセスリクエストの全ての処理が完了した後に、キャッシュ制御部から命令制御部へとリクエスト完了通知が送られるので、１つの命令についてのキャッシュアクセス処理の完了を適切に通知することができる。 Furthermore, since the processing order of data transfer requests is arbitrated according to the magnitude of the element number of each request, cache access requests can be processed appropriately even when their instruction numbers are the same. In addition, a request completion notification is sent from the cache control unit to the instruction control unit only after all the cache access requests included in one instruction have been processed, so the completion of cache access processing for one instruction can be notified appropriately.

次に、実施例２について説明する。本実施例に係る演算処理装置は、キャッシュアクセスリクエストを要素番号の若い順に処理していくことが実施例１と異なる。本実施例に係る演算処理装置及びキャッシュ制御部も図１及び図２で表される。以下では、実施例１と同様の各部の機能については説明を省略する。 Next, a second embodiment will be described. The arithmetic processing unit according to this embodiment differs from the first embodiment in that cache access requests are processed in ascending order of element number. The arithmetic processing unit and the cache control unit according to this embodiment are also represented by FIGS. 1 and 2. In the following, description of functions that are the same as in the first embodiment is omitted.

まず、ヒット判定回路１３５及び１３６について説明するが、いずれも同じ機能を有するので、ここでは、ヒット判定回路１３５を例に説明する。 First, the hit determination circuits 135 and 136 will be described. Since both have the same function, the hit determination circuit 135 will be described as an example here.

ヒット判定回路１３５は、パイプライン１３２に投入されたリクエストについてキャッシュヒットの判定を行う。ヒットしない場合、ヒット判定回路１３５は、実施例１と同様のデータ転送要求の調停処理の下にデータ転送要求を行う。そして、メモリアクセス制御部１３８により応答データがキャッシュＲＡＭ１３４に登録されると、ヒット判定回路１３５は、リクエストの再投入をリクエストの投入元の制御ポート３０に指示する。 The hit determination circuit 135 determines a cache hit for the request input to the pipeline 132. If there is no hit, the hit determination circuit 135 makes a data transfer request under the same data transfer request arbitration process as in the first embodiment. When the response data is registered in the cache RAM 134 by the memory access control unit 138, the hit determination circuit 135 instructs the control port 30 that is the request source to re-inject the request.

一方、キャッシュヒットした場合、ヒット判定回路１３５は、キャッシュヒットしたリクエストに対応する制御ポート３０が有するエントリから、そのリクエストがインダイレクトアクセスか否かを判定する。インダイレクトアクセスでなければ、ヒット判定回路１３５は、そのままキャッシュヒットしたリクエストのロード・ストア処理を行う。そして、ヒット判定回路１３５は、処理したリクエストの処理完了をリクエストの投入元の制御ポート３０に通知する。 On the other hand, when a cache hit occurs, the hit determination circuit 135 determines whether or not the request is indirect access from the entry of the control port 30 corresponding to the cache hit request. If it is not indirect access, the hit determination circuit 135 performs the load / store processing of the request having a cache hit as it is. Then, the hit determination circuit 135 notifies the completion of processing of the processed request to the control port 30 that is the request source.

これに対して、インダイレクトアクセスの場合、ヒット判定回路１３５は、キャッシュヒットしたリクエストに対応する制御ポート３０が有するエントリから、そのリクエストの要素番号が「０」か否かを判定する。要素番号が「０」でない場合、ヒット判定回路１３５は、そのリクエストの要素番号から１つ前の番号のリクエストのステータスフラグを参照する。 On the other hand, in the case of indirect access, the hit determination circuit 135 determines, from the entry held by the control port 30 corresponding to the cache-hit request, whether the element number of the request is “0”. When the element number is not “0”, the hit determination circuit 135 refers to the status flag of the request whose element number immediately precedes that of the request.

ステータスフラグが「２」の場合、１つ前の番号のリクエストの処理が完了済みであるので、ヒット判定回路１３５は、キャッシュヒットしたリクエストのロード・ストア処理を行う。そして、ヒット判定回路１３５は、処理したリクエストの処理完了をリクエストの投入元の制御ポート３０に通知する。 When the status flag is “2”, since the processing of the request with the previous number has been completed, the hit determination circuit 135 performs the load / store processing of the cache hit request. Then, the hit determination circuit 135 notifies the completion of processing of the processed request to the control port 30 that is the request source.

ステータスフラグが「２」でない場合、１つ前の番号のリクエストの処理が完了していないので、ヒット判定回路１３５は、キャッシュヒットしたリクエストのロード・ストア処理は行わずにそのリクエストの再投入をリクエストの投入元の制御ポート３０へ指示する。 If the status flag is not “2”, the processing of the request with the preceding number has not been completed, so the hit determination circuit 135 does not perform the load / store processing of the cache-hit request and instead instructs the control port 30 that is the request source to re-inject the request.
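
The decision that the hit determination circuit makes for a request entering the pipeline can be sketched as a small function. This is only an illustrative model, not the circuit itself: the names `Action`, `decide`, `STATUS_RECEIVED` and `STATUS_DONE` are assumptions introduced here, and element "0" is treated as having no predecessor, following step S409 of FIG. 10.

```python
from enum import Enum

# Status flag values taken from the description: 1 = received by the control
# port, 2 = load/store completed. The constant names are illustrative only.
STATUS_RECEIVED = 1
STATUS_DONE = 2


class Action(Enum):
    FETCH_AND_REINJECT = "miss: request data transfer, then re-inject"
    EXECUTE = "perform load/store and report completion"
    REINJECT = "predecessor not done: re-inject without executing"


def decide(cache_hit, indirect, element_number, prev_status=None):
    """Return the action for one request that has entered the pipeline.

    prev_status is the status flag of the request numbered element_number - 1.
    It is consulted only for indirect accesses with element_number > 0.
    """
    if not cache_hit:
        return Action.FETCH_AND_REINJECT
    if not indirect:
        return Action.EXECUTE
    if element_number == 0:
        # Element 0 has no predecessor to wait for.
        return Action.EXECUTE
    if prev_status == STATUS_DONE:
        return Action.EXECUTE
    return Action.REINJECT
```

For example, `decide(True, True, 2, STATUS_RECEIVED)` yields `Action.REINJECT`, because element 1 has not yet completed.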

制御ポート管理部１３１は、リクエストの処理完了の通知をヒット判定回路１３５又は１３６から受けると、通知されたリクエストのエントリからインダイレクトアクセスか否かを判定する。そのリクエストがインダイレクトアクセスでなければ、そのまま、リクエスト完了応答を命令制御部１１に通知する。その後、制御ポート管理部１３１は、そのリクエストのエントリのステータスフラグを「０」に変更させる。 Upon receiving a request processing completion notification from the hit determination circuit 135 or 136, the control port management unit 131 determines, from the entry of the notified request, whether the request is indirect access. If the request is not indirect access, the control port management unit 131 notifies the instruction control unit 11 of a request completion response as it is. Thereafter, the control port management unit 131 changes the status flag of the entry of the request to “0”.

一方、そのリクエストがインダイレクトアクセスの場合、制御ポート管理部１３１は、そのリクエストの要素番号が、そのリクエストが含まれる命令に含まれるキャッシュアクセスリクエストの要素数と同じか否かを判定する。以下では、リクエストが含まれる命令を、単に「命令」という。要素数と同じ、すなわち、そのリクエストの要素番号が命令が有するリクエストの要素番号の中で一番大きい番号の場合、制御ポート管理部１３１は、リクエスト完了応答を命令制御部１１に通知する。 On the other hand, when the request is indirect access, the control port management unit 131 determines whether the element number of the request is the same as the number of elements of the cache access requests included in the instruction that includes the request. Hereinafter, the instruction that includes the request is simply referred to as the “instruction”. When the element number equals the number of elements, that is, when the element number of the request is the largest among the element numbers of the requests included in the instruction, the control port management unit 131 notifies the instruction control unit 11 of a request completion response.

これに対して、そのリクエストの要素番号が命令の要素数と異なる場合、制御ポート管理部１３１は、その命令に含まれるリクエストで要素番号が要素数と同じリクエストの処理が完了するまで、そのリクエストについての処理を一時停止する。そして、要素番号が要素数と同じリクエストの処理の完了通知をヒット判定回路１３５又は１３６から受信すると、制御ポート管理部１３１は、リクエスト完了通知を命令制御部１１へ送信する。その後、制御ポート管理部１３１は、その命令に含まれる全リクエストのエントリのステータスフラグを「０」に変更させる。 On the other hand, if the element number of the request differs from the number of elements of the instruction, the control port management unit 131 suspends processing for that request until processing is completed for the request in the instruction whose element number equals the number of elements. When the control port management unit 131 receives, from the hit determination circuit 135 or 136, the completion notification for the request whose element number equals the number of elements, it transmits a request completion notification to the instruction control unit 11. Thereafter, the control port management unit 131 changes the status flags of the entries of all requests included in the instruction to “0”.
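
The completion handling of the control port management unit described above can be sketched as follows. This is a simplified model under stated assumptions: `Entry`, `ControlPortManager`, `instruction_id` and `last_element` are hypothetical names, `last_element` stands for the value the text calls the number of elements (the largest element number in the instruction), and the change of the status flag to "2" is folded into the same method for brevity.

```python
from dataclasses import dataclass

STATUS_FREE, STATUS_RECEIVED, STATUS_DONE = 0, 1, 2


@dataclass
class Entry:
    instruction_id: int   # hypothetical tag grouping the requests of one instruction
    indirect: bool
    element_number: int
    last_element: int     # value the element number is compared against
    status: int = STATUS_RECEIVED


class ControlPortManager:
    def __init__(self, entries):
        self.entries = entries
        self.responses = []  # instruction ids reported to the instruction control unit

    def on_completion(self, entry):
        """Handle a completion notice; return True if a response was sent."""
        entry.status = STATUS_DONE
        if entry.indirect and entry.element_number != entry.last_element:
            return False  # hold until the last element of the instruction completes
        self.responses.append(entry.instruction_id)
        # Release the entries covered by this response: only the entry itself
        # for a normal access, every entry of the instruction for indirect access.
        for e in self.entries:
            same_instruction = e.instruction_id == entry.instruction_id
            if e is entry or (entry.indirect and same_instruction):
                e.status = STATUS_FREE
        return True
```

With three indirect requests numbered 0 to 2, the first two calls to `on_completion` return `False`, and only the call for element 2 sends the response and resets all three status flags.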

次に、図１０を参照して、本実施例に係る演算処理装置による命令処理の流れを説明する。図１０は、実施例２に係る演算処理装置による命令処理のフローチャートである。以下では、ヒット判定回路１３５とヒット判定回路１３６とを区別せずに、「ヒット判定回路１４０」という。 Next, with reference to FIG. 10, the flow of instruction processing by the arithmetic processing unit according to this embodiment will be described. FIG. 10 is a flowchart of instruction processing performed by the arithmetic processing unit according to the second embodiment. Hereinafter, the hit determination circuit 135 and the hit determination circuit 136 are referred to as the “hit determination circuit 140” without distinguishing between them.

命令制御部１１は、命令からリクエストを取得し、取得したリクエストを制御ポート３０に発行する（ステップＳ４０１）。 The instruction control unit 11 acquires a request from the instruction and issues the acquired request to the control port 30 (step S401).

制御ポート３０は、リクエストを受信する（ステップＳ４０２）。そして、制御ポート３０は、受信したリクエストのエントリのステータスフラグを「１」にする（ｓｔａｔｕｓ＝１）（ステップＳ４０３）。 The control port 30 receives the request (step S402). Then, the control port 30 sets the status flag of the received request entry to “1” (status = 1) (step S403).

次に、制御ポート３０は、リクエストをパイプライン１３０に投入する（ステップＳ４０４）。 Next, the control port 30 inputs the request to the pipeline 130 (step S404).

ヒット判定回路１４０は、リクエストがキャッシュヒットしたか否かを判定する（ステップＳ４０５）。キャッシュヒットしない場合（ステップＳ４０５：否定）、メモリアクセス制御部１３８は、データ転送要求を二次キャッシュ制御部１４へ送信する（ステップＳ４０６）。そして、メモリアクセス制御部１３８は、二次キャッシュ制御部１４から応答データを受信し、受信した応答データをキャッシュＲＡＭ１３４に登録する（ステップＳ４０７）。その後、ヒット判定回路１４０は、リクエストの再投入を制御ポート３０へ通知する。制御ポート３０は、ステップＳ４０４へ戻る。 The hit determination circuit 140 determines whether or not the request has a cache hit (step S405). If there is no cache hit (No at Step S405), the memory access control unit 138 transmits a data transfer request to the secondary cache control unit 14 (Step S406). Then, the memory access control unit 138 receives the response data from the secondary cache control unit 14, and registers the received response data in the cache RAM 134 (step S407). Thereafter, the hit determination circuit 140 notifies the control port 30 of the request re-injection. The control port 30 returns to step S404.

これに対して、キャッシュヒットした場合（ステップＳ４０５：肯定）、ヒット判定回路１４０は、リクエストがインダイレクトアクセスでない（ｉｎｄｉｒｅｃｔ＝０）か否かを判定する（ステップＳ４０８）。インダイレクトアクセスでない場合（ステップＳ４０８：肯定）、ヒット判定回路１４０は、ステップＳ４１２へ進む。 On the other hand, when a cache hit occurs (step S405: affirmative), the hit determination circuit 140 determines whether or not the request is not indirect access (indirect = 0) (step S408). If it is not indirect access (step S408: Yes), the hit determination circuit 140 proceeds to step S412.

一方、インダイレクトアクセスの場合（ステップＳ４０８：否定）、ヒット判定回路１４０は、リクエストの要素番号が「０」か否かを判定する（ステップＳ４０９）。リクエストの要素番号が「０」の場合（ステップＳ４０９：肯定）、ヒット判定回路１４０は、ステップＳ４１２へ進む。 On the other hand, in the case of indirect access (No at Step S408), the hit determination circuit 140 determines whether or not the element number of the request is “0” (Step S409). When the element number of the request is “0” (step S409: Yes), the hit determination circuit 140 proceeds to step S412.

これに対して、リクエストの要素番号が「０」でない場合（ステップＳ４０９：否定）、ヒット判定回路１４０は、リクエストの要素番号から１を減算した値（要素番号−１）を有する要素番号のステータスフラグを参照する（ステップＳ４１０）。そして、ヒット判定回路１４０は、ステータスフラグが「２」（ｓｔａｔｕｓ＝２）か否かを判定する（ステップＳ４１１）。ステータスフラグが「２」の場合（ステップＳ４１１：肯定）、ヒット判定回路１４０は、ステップＳ４１２に進む。一方、ステータスフラグが「１」の場合（ステップＳ４１１：否定）、ヒット判定回路１４０は、リクエストの再投入を制御ポート３０へ通知する。制御ポート３０は、ステップＳ４０４へ戻る。 On the other hand, when the element number of the request is not “0” (step S409: No), the hit determination circuit 140 refers to the status flag of the request whose element number is the value obtained by subtracting 1 from the element number of the request (element number − 1) (step S410). Then, the hit determination circuit 140 determines whether or not that status flag is “2” (status = 2) (step S411). When the status flag is “2” (step S411: Yes), the hit determination circuit 140 proceeds to step S412. On the other hand, when the status flag is “1” (step S411: No), the hit determination circuit 140 notifies the control port 30 of request re-injection. The control port 30 returns to step S404.

そして、ヒット判定回路１４０は、ロード・ストア処理を実行する（ステップＳ４１２）。 Then, the hit determination circuit 140 executes load / store processing (step S412).

そして、ヒット判定回路１４０は、リクエストの処理完了を制御ポート３０へ通知する。制御ポート３０は、リクエストの処理完了の通知を受けて、該当するリクエストのエントリのステータスフラグを「２」に変更する（ｓｔａｔｕｓ＝２）（ステップＳ４１３）。 The hit determination circuit 140 notifies the control port 30 of the completion of the request processing. The control port 30 receives the notification of the request processing completion and changes the status flag of the entry of the corresponding request to “2” (status = 2) (step S413).

制御ポート管理部１３１は、ステータスフラグが「２」に変更されると、そのリクエストがインダイレクトアクセスでないか否か、すなわち、インダイレクトアクセスか否かを示すフラグが「０」（ｉｎｄｉｒｅｃｔ＝０）か否かを判定する（ステップＳ４１４）。リクエストがインダイレクトアクセスでない場合（ステップＳ４１４：肯定）、制御ポート管理部１３１は、ステップＳ４１７へ進む。 When the status flag is changed to “2”, the control port management unit 131 determines whether or not the request is not indirect access, that is, whether or not the flag indicating indirect access is “0” (indirect = 0) (step S414). When the request is not indirect access (step S414: Yes), the control port management unit 131 proceeds to step S417.

これに対して、リクエストがインダイレクトアクセスである場合（ステップＳ４１４：否定）、制御ポート管理部１３１は、そのリクエストの要素番号が、そのリクエストを含む命令のキャッシュアクセスリクエストの要素数と同じか否かを判定する（ステップＳ４１５）。リクエストの要素番号が要素数と同じ場合（ステップＳ４１５：肯定）、制御ポート管理部１３１は、そのリクエストに関するリクエスト完了応答を命令制御部１１へ送信する（ステップＳ４１７）。 On the other hand, when the request is indirect access (step S414: No), the control port management unit 131 determines whether the element number of the request is the same as the number of elements of the cache access requests of the instruction that includes the request (step S415). If the element number of the request is the same as the number of elements (step S415: Yes), the control port management unit 131 transmits a request completion response for the request to the instruction control unit 11 (step S417).

一方、リクエストの要素番号が要素数と異なる場合（ステップＳ４１５：否定）、制御ポート管理部１３１は、要素数が要素番号と一致するリクエスト、すなわち、最も要素番号が大きいリクエストについてのリクエスト完了応答を出力するまで待機する（ステップＳ４１６）。 On the other hand, when the element number of the request differs from the number of elements (step S415: No), the control port management unit 131 waits until it outputs the request completion response for the request whose element number matches the number of elements, that is, the request with the largest element number (step S416).

その後、制御ポート管理部１３１は、リクエスト完了応答に対応するリクエストのエントリのステータスフラグを「０」にする指示を制御ポート３０へ送信する。制御ポート３０は、制御ポート管理部１３１からの指示を受けて、指示されたリクエストのステータスフラグを「０」にする（ｓｔａｔｕｓ＝０）（ステップＳ４１８）。 Thereafter, the control port management unit 131 transmits to the control port 30 an instruction to set the status flag of the request entry corresponding to the request completion response to “0”. Upon receiving an instruction from the control port management unit 131, the control port 30 sets the status flag of the instructed request to “0” (status = 0) (step S418).
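
The flow of steps S401 to S418 for a single indirect-access instruction can be traced with a small simulation. It is a sketch under assumptions not stated in the text: requests are re-injected round-robin through a queue (`arrival_order` is a hypothetical parameter giving the initial injection order), a cache miss is resolved in one step, and element numbers run from 0 so that the last request carries number n - 1, which the text describes as matching the number of elements.

```python
from collections import deque

FREE, RECEIVED, DONE = 0, 1, 2


def run_indirect_instruction(arrival_order, miss_elements=()):
    """Simulate one indirect-access instruction.

    Returns (execution order, whether a completion response was sent,
    final status flags).
    """
    n = len(arrival_order)
    status = [RECEIVED] * n                    # S402-S403: status = 1
    cached = [i not in miss_elements for i in range(n)]
    pending = deque(arrival_order)             # requests waiting for the pipeline
    executed = []
    response_sent = False

    while pending:
        elem = pending.popleft()               # S404: inject into the pipeline
        if not cached[elem]:                   # S405: No
            cached[elem] = True                # S406-S407: data fetched and registered
            pending.append(elem)               # re-inject
            continue
        if elem != 0 and status[elem - 1] != DONE:  # S409-S411
            pending.append(elem)               # predecessor not done: re-inject
            continue
        executed.append(elem)                  # S412: load/store
        status[elem] = DONE                    # S413: status = 2
        if elem == n - 1:                      # S415: last element completed
            response_sent = True               # S417: request completion response
            status = [FREE] * n                # S418: status = 0

    return executed, response_sent, status
```

Calling `run_indirect_instruction([3, 1, 0, 2], miss_elements={1})` executes the requests in the order 0, 1, 2, 3 even though they were injected out of order and element 1 initially missed the cache.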

以上に説明したように、本実施例に係る演算処理装置は、要素番号が若いリクエストから処理していく。これにより、要素番号が命令の要素数と一致するリクエストの処理が完了した時点で、その命令に含まれる全てのリクエストの処理が完了したことが確定される。そのため、演算処理装置は、命令に含まれる全てのリクエストのステータスを確認を行わずにリクエスト完了応答を送信するか否かを判定でき、処理の簡略化及び回路の簡素化を実現できる。 As described above, the arithmetic processing unit according to the present embodiment processes requests in ascending order of element number. Consequently, at the point when processing of the request whose element number matches the number of elements of the instruction is completed, it is certain that processing of all the requests included in the instruction has been completed. The arithmetic processing unit can therefore determine whether or not to transmit a request completion response without checking the statuses of all the requests included in the instruction, which simplifies both the processing and the circuit.
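
The reasoning behind this simplification can be checked with a short sketch: when requests complete strictly in ascending order, testing only the last status flag gives the same answer as scanning every flag. The function names below are illustrative assumptions, not part of the described circuit.

```python
RECEIVED, DONE = 1, 2


def all_done_by_scan(status):
    """Check every status flag of the instruction."""
    return all(s == DONE for s in status)


def all_done_by_last(status):
    """Check only the flag of the request with the largest element number."""
    return status[-1] == DONE


def in_order_states(n):
    """Yield every status vector reachable when requests finish in ascending order."""
    for completed in range(n + 1):
        yield [DONE] * completed + [RECEIVED] * (n - completed)


def checks_agree(n):
    """True if both checks agree on every in-order state of n requests."""
    return all(all_done_by_scan(s) == all_done_by_last(s)
               for s in in_order_states(n))
```

The two checks differ only for states that in-order processing cannot reach, such as `[RECEIVED, DONE]`, which is why the ordering constraint is what makes the single-flag check sufficient.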

１ＣＰＵ
２メモリ
１１命令制御部
１２演算制御部
１３キャッシュ制御部
１４二次キャッシュ制御部
１５メモリ制御部
３０制御ポート
１３０，１３２，１３３パイプライン
１３１制御ポート管理部
１３４キャッシュＲＡＭ
１３５，１３６，１４０ヒット判定回路
１３７大小比較器
１３８メモリアクセス制御回路

1 CPU
2 Memory
11 Instruction control unit
12 Operation control unit
13 Cache control unit
14 Secondary cache control unit
15 Memory control unit
30 Control port
130, 132, 133 Pipeline
131 Control port management unit
134 Cache RAM
135, 136, 140 Hit determination circuit
137 Size comparator
138 Memory access control circuit

Claims

An arithmetic processing unit comprising:
an instruction control unit that acquires an arithmetic processing instruction including a plurality of cache access requests and sequentially transmits the cache access requests included in the arithmetic processing instruction;
a cache control unit that receives the cache access requests transmitted from the instruction control unit and sequentially executes access processing on the cache as instructed by each cache access request; and
an arithmetic control unit that performs arithmetic processing based on a processing result of the access processing by the cache control unit.

The arithmetic processing unit according to claim 1, wherein, when the cache control unit completes processing of all the cache access requests included in the arithmetic processing instruction transmitted from the instruction control unit, the cache control unit notifies the instruction control unit of the completion of processing, and
the instruction control unit determines completion of the cache access requests in the arithmetic processing instruction upon receiving the processing completion notification from the cache control unit.

The arithmetic processing unit according to claim 2, wherein the cache access requests included in the arithmetic processing instruction are numbered sequentially, and
the cache control unit processes the cache access requests in ascending order of number, and notifies the instruction control unit of completion of processing when processing of the cache access request having the last number is completed.

The arithmetic processing unit according to claim 1, wherein the cache control unit has a plurality of processing paths for cache access requests, and
the instruction control unit simultaneously transmits to the cache control unit as many cache access requests as there are processing paths.

The arithmetic processing unit according to claim 1, wherein the instruction control unit transmits each cache access request with identification information of the arithmetic processing instruction and identification information of the cache access request added to it, and
the cache control unit determines the processing order of the cache access requests based on the identification information of the arithmetic processing instruction and the identification information of the cache access request.

The arithmetic processing unit according to claim 1, wherein the cache control unit determines the processing order between the cache access requests based on the identification information of the arithmetic processing instructions when the identification information of the arithmetic processing instructions added to the cache access requests differs, and determines the processing order between the cache access requests based on the identification information of the cache access requests when the identification information of the arithmetic processing instruction added to the cache access requests is the same.

A control method for an arithmetic processing unit, the method comprising:
obtaining an arithmetic processing instruction including a plurality of cache access requests;
sequentially obtaining the cache access requests included in the arithmetic processing instruction;
sequentially executing access processing on the cache as instructed by the obtained cache access requests; and
performing arithmetic processing based on the processing result of the access processing.