JPWO2013084315A1

JPWO2013084315A1 - Arithmetic processing device and control method of arithmetic processing device

Info

Publication number: JPWO2013084315A1
Application number: JP2013548003A
Authority: JP
Inventors: 石井 寛之; 小島 広行
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2011-12-07
Filing date: 2011-12-07
Publication date: 2015-04-27

Abstract

An arithmetic processing device according to one aspect of the present invention includes: a plurality of arithmetic processing units, each including a first cache memory; a second cache memory that holds data computed by each of the plurality of arithmetic processing units; an acquisition unit that acquires attribute information about the first cache memory of the access request source to be controlled, the attribute information including way information indicating the way of the cache block of the first cache memory to be controlled; a holding unit that holds address information and the attribute information; and a control unit. Based on the replacement target address included in an access request relating to a replace request issued by any of the plurality of arithmetic processing units for the first cache memory, and on way information indicating the way of the cache block to be replaced in the first cache memory, the control unit controls the access request relating to the replace request for the cache block of the second cache memory identified by the address information and the attribute information to be controlled.

Description

The present invention relates to an arithmetic processing device and a method of controlling an arithmetic processing device.

Cache memory has conventionally been used to bridge the gap between the execution speed of a processor core and the access speed of the main storage device. Because access speed trades off against memory capacity, most cache memories are organized into two or more levels. The levels are called, in order of proximity to the processor core, the primary (L1) cache memory, the secondary (L2) cache memory, and so on. Hereinafter, a processor core is also referred to simply as a “core”, the main storage device as “memory” or “main memory”, and a cache memory as a “cache”.

Data in the main memory is mapped to the cache memory in units of blocks. The set associative method is one known way of mapping cache memory blocks to main memory blocks. In the following, to distinguish the two, a block of the main memory is called a “memory block”, and a block of the cache memory is called a “cache block” or a “line”.

In the set associative method, the main memory and the cache memory are each divided into a number of sets, and main memory is mapped to cache memory within each set. A set is also called a column. The method fixes the number of cache blocks that each set of the cache memory can hold. This number is called the number of rows, the number of levels, or the number of ways.

In the set associative method, a cache block is identified by an index and way information. The index identifies the set that holds the cache block, and the way information identifies the cache block among those held in that set. The way information is, for example, a way number used to identify the cache block.

The address of a memory block is used to allocate that memory block to a cache block. In the set associative method, a memory block is allocated to one of the cache blocks held in the set whose index matches a part of the memory block's address. In other words, a part of the address specifies the index within the cache memory.

The address used for this allocation may be either a physical address (real address) or a logical address (virtual address). The part of the address used to specify the index within the cache memory is also called the set address. These addresses are expressed as bits. In the main memory, the memory blocks belonging to the same set are those that have the same set address.

The main memory has a larger capacity than the cache memory, so a set in the main memory holds more memory blocks than the corresponding set in the cache memory holds cache blocks. Consequently, not every memory block in the main memory can be allocated to a cache block at the same time. The memory blocks in each set of the main memory therefore divide into those currently allocated to a cache block in the corresponding cache set and those that are not.

Consider, for example, the case where data is obtained not from the memory block at the address specified by the processor core but from the cache block to which that memory block is allocated. In this case, the cache memory is searched for the cache block to which the memory block at the specified address is allocated.

If the search hits a cache block, the data specified by the processor core can be obtained from that cache block. If the search hits no cache block, the data does not exist in the cache memory and is obtained from the main memory instead. This situation is called a cache miss.

The search for the cache block to which the memory block at the address specified by the processor core is allocated uses an index and a cache tag. As described above, the index indicates the set that holds the cache block. The cache tag is used to find, within a set, the cache block that corresponds to a memory block.

A cache tag is provided for each cache block. When a memory block is allocated to a cache block, a part of the memory block's address is stored in the cache tag of that cache block. This part differs from the set address: the cache tag stores an address of suitable bit length taken from the portion of the memory block's address that remains after the set address is removed. The address stored in a cache tag is hereinafter called the tag address.
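The split of an address into a set address and a tag address can be sketched as follows. This is a minimal illustration only: the line size, the number of sets, and the function name `split_address` are assumptions chosen for the example and do not appear in the source.

```python
# Hypothetical geometry (not from the source): 64-byte lines, 128 sets.
LINE_BYTES = 64
NUM_SETS = 128

OFFSET_BITS = LINE_BYTES.bit_length() - 1  # 6 bits select a byte within a line
INDEX_BITS = NUM_SETS.bit_length() - 1     # 7 bits form the set address


def split_address(addr):
    """Return (tag_address, set_address, byte_offset) for an address."""
    byte_offset = addr & (LINE_BYTES - 1)
    set_address = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    # Everything above the set address is kept as the tag address.
    tag_address = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag_address, set_address, byte_offset
```

Two addresses that differ by a multiple of `LINE_BYTES * NUM_SETS` share a set address and differ only in their tag address, which is why the tag must be stored alongside each cache block.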

The index and the cache tag are used together to search for the cache block to which the memory block at the address specified by the processor core is allocated.

First, the cache memory is searched for the set that may hold that cache block. Specifically, the cache memory is searched for the index that matches the set-address portion of the address specified by the processor core. The set indicated by the index found in this way is the set that may hold the cache block to which the memory block at the specified address is allocated.

Next, the cache blocks held in the set indicated by that index are searched for the cache block to which the memory block at the specified address is allocated. Specifically, among the cache tags associated with the cache blocks in the set, the search looks for a cache tag that stores the tag-address portion of the address specified by the processor core. The cache block associated with the cache tag found in this way is the cache block to which the memory block at the specified address is allocated.

If this search finds no cache tag that stores the tag-address portion of the address specified by the processor core, a cache miss has occurred. In this case, no cache block allocated to the memory block at the specified address exists in the cache memory, so the data specified by the processor core is obtained from the main memory.

The cache block to which the memory block at the address specified by the processor core is allocated is found in this way. If the data stored in the memory block is also stored in the cache memory, the data is obtained from the cache memory; otherwise, it is obtained from the memory block.
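The two-step search described above (select a set by index, then compare tag addresses within the set) can be sketched as a toy model. The class name `SetAssociativeCache`, its default sizes, and the dictionary standing in for main memory are all assumptions made for illustration, not structures defined in the source.

```python
class SetAssociativeCache:
    """Toy model: each set holds `num_ways` lines, each with a tag address."""

    def __init__(self, num_sets=4, num_ways=2, line_bytes=16):
        self.num_sets = num_sets
        self.line_bytes = line_bytes
        # tags[index][way] is the stored tag address, or None for an empty line.
        self.tags = [[None] * num_ways for _ in range(num_sets)]
        self.data = [[None] * num_ways for _ in range(num_sets)]

    def split(self, addr):
        """Return (tag_address, index) for an address."""
        block = addr // self.line_bytes
        return block // self.num_sets, block % self.num_sets

    def lookup(self, addr):
        """Return the way that hits, or None on a cache miss."""
        tag, index = self.split(addr)            # step 1: select the set
        for way, stored in enumerate(self.tags[index]):
            if stored == tag:                    # step 2: compare tag addresses
                return way
        return None

    def fill(self, addr, way, line):
        """Allocate the memory block at `addr` to the given way of its set."""
        tag, index = self.split(addr)
        self.tags[index][way] = tag
        self.data[index][way] = line

    def read(self, addr, memory):
        """Return (line, hit). On a miss the line comes from `memory`."""
        _, index = self.split(addr)
        way = self.lookup(addr)
        if way is not None:
            return self.data[index][way], True
        return memory[addr // self.line_bytes], False
```

In this sketch `memory` is simply a mapping from block number to block contents; a real cache would also fill a line on a miss, which is omitted here to keep the lookup itself in focus.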

Besides the set associative method, the direct mapping method and the fully associative method are also known as ways of mapping cache memory blocks to main memory blocks. In the direct mapping method, the address of a main memory block determines the single cache memory block to which it is mapped; this corresponds to the set associative method with one way. In the fully associative method, any block of the main memory can be mapped to any block of the cache memory.
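The relationship between the three mapping methods can be illustrated by listing the cache lines a given memory block is allowed to occupy. The function `candidate_lines` and the modulo-based index are assumptions for this sketch; the source does not prescribe how the index is derived from the address.

```python
def candidate_lines(block_number, num_lines, num_ways):
    """Return the (set_index, way) slots that a memory block may occupy.

    num_ways == 1          -> direct mapping (exactly one slot)
    num_ways == num_lines  -> fully associative (every slot)
    otherwise              -> set associative
    """
    if num_lines % num_ways != 0:
        raise ValueError("num_lines must be a multiple of num_ways")
    num_sets = num_lines // num_ways
    set_index = block_number % num_sets  # assumed index function
    return [(set_index, way) for way in range(num_ways)]
```

For a cache of 8 lines, memory block 13 has one candidate slot with 1 way, two candidate slots with 2 ways, and all eight slots with 8 ways.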

Meanwhile, multi-core processor systems, which have a plurality of processor cores, have become mainstream in recent years because they improve performance per chip and reduce power consumption. One known configuration is a multi-core processor system in which each processor core has its own primary cache memory and a plurality of processor cores share a secondary cache memory.

In this configuration, the secondary cache memory has a mechanism for maintaining cache coherency, that is, consistency between cache memories, between itself and the primary cache memories of the processor cores that share it. To maintain cache coherency, when a processor core requests data, the secondary cache memory checks whether the requested data is stored in the primary cache memory of each processor core.

One known way of checking whether the data requested by a processor core is stored in each primary cache memory is for the secondary cache memory to snoop the primary cache memories of all processor cores every time a data request arrives. With this method, however, the machine cycles spent waiting for the primary cache memories to return their results lengthen the latency of the response to the data request.

Patent Document 1 discloses a method for improving this latency: a copy of the cache tags of the primary cache memory is stored in the cache tags of the secondary cache memory, which makes snooping the primary cache memory unnecessary.

When a copy of the cache tags of the primary cache memory is stored in the cache tags of the secondary cache memory, the secondary cache memory can refer to the state of the primary cache memory in its own cache tags. It can therefore check whether the data requested by a processor core is stored in each primary cache memory without snooping the primary cache memories. This is how Patent Document 1 improves latency. Hereinafter, a cache tag of the primary cache memory is called an L1 tag, and a cache tag of the secondary cache memory is called an L2 tag.

However, the larger the difference between the capacity of the secondary cache memory and that of the primary cache memory, the larger the share of data in the secondary cache memory that is not stored in the primary cache memory. If an area for a copy of the L1 tag is provided in the L2 tag, most of that area therefore goes unused and is wasted. Because this is undesirable in terms of hardware amount and power consumption, an improvement has been sought.
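The scale of the waste can be estimated with simple arithmetic. The sketch below assumes one L1-tag-copy field per L2 line and uses hypothetical cache sizes; neither the function name nor the numbers come from the source.

```python
def l1_copy_utilization(l1_lines_per_core, num_cores, l2_lines):
    """Upper bound on the fraction of L2-tag entries whose L1-copy field is in use.

    At most l1_lines_per_core * num_cores distinct lines can reside in the
    primary caches at once, so no more than that many L2 entries can hold
    a meaningful L1 tag copy.
    """
    return min(1.0, l1_lines_per_core * num_cores / l2_lines)


# Hypothetical sizes: 4 cores with 512 L1 lines each, and 65536 L2 lines.
utilization = l1_copy_utilization(512, 4, 65536)
```

With these assumed sizes at most about 3% of the L1-copy fields can be in use at any moment, which is the kind of imbalance the paragraph above describes.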

Patent Documents 2 and 3 disclose a method for reducing this wasted area. Instead of a copy of the L1 tag, information indicating the sharing state of the corresponding line in each primary cache memory is stored in the L2 tag, and the copy of the L1 tag is stored in an area separate from the L2 tag.

Patent Document 1: JP 2006-40175 A
Patent Document 2: Japanese Patent No. 4297968
Patent Document 3: JP 2011-65574 A
Patent Document 4: JP-A-5-342101

FIG. 14 shows an example of how primary caches and a secondary cache are connected in a multi-core processor. In this example, each core (700, 710, ..., 7n0) has a primary cache (701, 711, ..., 7n1), where n is a natural number. The secondary cache 800 is shared by the cores (700, 710, ..., 7n0) and is located between the cores and the memory 900.

L1 shared information 811, which stores information indicating the sharing state of the corresponding line in each primary cache (701, 711, ..., 7n1), is provided in an area within the L2 tag 810.

In addition, the secondary cache 800 includes an L1 tag copy 820, which stores copies of the L1 tags, in an area separate from the L2 tag 810. In FIG. 14, the L1 tag copy 821 is a copy of the L1 tag 702, the primary cache tag of the core 700. The L1 tag copy 822 is a copy of the L1 tag 712, the primary cache tag of the core 710. The L1 tag copy 82n is a copy of the L1 tag 7n2, the primary cache tag of the core 7n0. The primary caches (701, 711, ..., 7n1) and the secondary cache 800 are assumed to use the set associative method for their data storage structure.
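The arrangement in FIG. 14 (sharing information kept with the L2 tag, per-core L1 tag copies kept separately) can be modeled roughly as below. This is a sketch under stated assumptions: the class `L1TagCopyDirectory`, the bitmask encoding of the sharing state, and the modulo-based L1 index are illustrative choices, and the source does not specify these formats.

```python
class L1TagCopyDirectory:
    """Toy model of L1 shared information plus per-core L1 tag copies."""

    def __init__(self, num_cores, l1_sets, l1_ways):
        self.l1_sets = l1_sets
        # tag_copy[core][l1_index][l1_way] -> block number held there, or None.
        self.tag_copy = [[[None] * l1_ways for _ in range(l1_sets)]
                         for _ in range(num_cores)]
        # sharing[block] -> bitmask of cores whose primary cache holds the block.
        self.sharing = {}

    def record_l1_fill(self, core, l1_way, block):
        """Register `block` in the given way; return the block it displaced."""
        l1_index = block % self.l1_sets  # assumed index function
        old = self.tag_copy[core][l1_index][l1_way]
        if old is not None:
            self.sharing[old] &= ~(1 << core)  # core no longer holds `old`
        self.tag_copy[core][l1_index][l1_way] = block
        self.sharing[block] = self.sharing.get(block, 0) | (1 << core)
        return old

    def sharers(self, block):
        """List the cores holding `block`, without snooping any primary cache."""
        mask = self.sharing.get(block, 0)
        return [c for c in range(len(self.tag_copy)) if (mask >> c) & 1]
```

The point of the model is that `sharers` answers the coherency question from state held entirely on the secondary-cache side.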

In response to an access request from a core (700, 710, ..., 7n0), the secondary cache 800 executes the processing requested by that access request. Because the secondary cache 800 has one or more pipelines, it can process in parallel the processing executed in response to access requests from the cores. These access requests include access requests based on L1 replacement. Hereinafter, an access request from a core is also referred to as a “request”.

L1 replacement is processing that occurs, for example, on a cache miss in the primary cache. When the data requested by the core is not stored in the primary cache, the data is obtained from the secondary cache or the memory and stored in the primary cache. In the primary cache, the data is stored in one of the lines in the set identified by the index that matches a part of the data's address.

When data is written to the primary cache in this way, every line in the target set may already hold data, leaving no free line in the set. In that case, to write the data requested by the core to the primary cache, the data in a line selected by a replacement algorithm such as LRU (Least Recently Used) is replaced. This replacement processing is L1 replacement.

Hereinafter, in L1 replacement, the data written to the line being replaced and the address corresponding to that data are called the replacement request data and the replacement request address, respectively. As described above, the replacement request data is the data targeted by the access request from the core. The data that L1 replacement replaces with the replacement request data and the address corresponding to that data are called the replacement target data and the replacement target address, respectively.
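An L1 replacement within a single set, using LRU as the replacement algorithm, can be sketched as follows. The class `LruSet` and its counter-based bookkeeping are assumptions for illustration; the source names LRU only as one example algorithm and does not describe its implementation.

```python
class LruSet:
    """One set of a primary cache with LRU replacement (toy model)."""

    def __init__(self, num_ways):
        self.tags = [None] * num_ways    # tag stored in each way, None if empty
        self.last_used = [0] * num_ways  # logical time of the last access
        self.clock = 0

    def access(self, tag):
        """Return (way, replaced_tag).

        replaced_tag is None on a hit or when an empty way was available;
        otherwise it identifies the replacement target.
        """
        self.clock += 1
        if tag in self.tags:                       # hit
            way = self.tags.index(tag)
            self.last_used[way] = self.clock
            return way, None
        if None in self.tags:                      # miss, free way available
            way = self.tags.index(None)
        else:                                      # miss, set full: L1 replacement
            way = min(range(len(self.tags)), key=self.last_used.__getitem__)
        replaced = self.tags[way]
        self.tags[way] = tag
        self.last_used[way] = self.clock
        return way, replaced
```

In the terms defined above, the `tag` passed to `access` on a replacing miss stands for the replacement request address, `replaced_tag` stands for the replacement target address, and `way` is the L1 way in which the replacement takes place.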

When L1 replacement occurs, an access request based on the L1 replacement is issued. On receiving this access request, the secondary cache 800 erases the replacement target data, for example by overwriting, in the L1 tag copy 820, the line that held the replacement target data with the replacement request data.

The secondary cache 800 processes in parallel the processing requested by access requests from the cores, including such access requests based on L1 replacement. When processing runs in parallel in this way, then depending on the content of the processing requested by a preceding access request flowing through the pipeline, the processing requested by a subsequent access request is canceled and retried until the preceding processing completes.

For example, the secondary cache 800 uses an L2 cache control unit 830 to cancel and retry such subsequent processing. In the example shown in FIG. 14, the L2 cache control unit 830 includes an address lock control unit 831 and a retry control unit 832.

The address lock control unit 831 holds the addresses locked by the processing requested by preceding access requests. It compares the address targeted by a subsequent access request with the addresses held in the address holding unit, and thereby determines whether the address targeted by the processing requested by the subsequent access request is locked. Based on the determination result (match information) from the address lock control unit 831, the retry control unit 832 identifies whether the processing requested by the subsequent access request is to be canceled, and controls the cancellation and retry of that processing. The match information is, for example, one bit indicating whether the lock is present. Hereinafter, this determination by the address lock control unit 831 is called a “lock check”.
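The interplay between the lock check and the retry control can be sketched as a small simulation. Everything named here (`AddressLockControl`, `run_pipeline`, the `unlock_at` release schedule, one request per cycle) is a simplifying assumption made for the example; the source describes the behavior but not an implementation.

```python
from collections import deque


class AddressLockControl:
    """Holds locked addresses and answers lock checks (toy model)."""

    def __init__(self):
        self.locked = set()

    def lock(self, addr):
        self.locked.add(addr)

    def unlock(self, addr):
        self.locked.discard(addr)

    def lock_check(self, addr):
        """Return the 1-bit match information: True if addr is locked."""
        return addr in self.locked


def run_pipeline(requests, lock_ctl, unlock_at, max_cycles=1000):
    """Issue one request per cycle; canceled requests are queued for retry.

    unlock_at maps a cycle number to the address whose preceding processing
    completes in that cycle (a hypothetical release schedule).
    Returns a log of (cycle, address, outcome) tuples.
    """
    queue = deque(requests)
    log = []
    cycle = 0
    while queue and cycle < max_cycles:
        if cycle in unlock_at:
            lock_ctl.unlock(unlock_at[cycle])
        addr = queue.popleft()
        if lock_ctl.lock_check(addr):
            log.append((cycle, addr, "retry"))   # cancel and try again later
            queue.append(addr)
        else:
            log.append((cycle, addr, "execute"))
        cycle += 1
    return log
```

A request to a locked address keeps being canceled and requeued until the preceding processing releases the lock, while requests to other addresses proceed unaffected.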

There are two kinds of locked addresses. The first is the address of the data targeted by the access request. The second is the replacement target address, which applies when the access request from the core is based on L1 replacement. In that case, the processing requested by the access request handles the replacement target data, so the replacement target address is locked in addition to, or instead of, the replacement request address targeted by the requested processing. As the processing requested by access requests from the cores is executed, the address lock control unit 831 holds these two kinds of addresses as locked addresses.

One way to represent the addresses held in the address lock control unit 831, that is, the locked addresses, is to use the address to be locked as it is, that is, the full-size address.

Patent Documents 3 and 4, on the other hand, disclose a method of representing a full-size address by an L2 index and an L2 way. If the secondary cache line corresponding to a primary cache line can be uniquely identified, the primary cache line can be found using cache hit information from the secondary cache. Patent Documents 3 and 4 disclose a method of associating primary cache lines with secondary cache lines.

Specifically, when the L1 tag copy is searched for a line, the difference between the L2 index and the L1 index, together with the L2 way, is used in place of the comparison address used in the search processing of the primary cache. That is, the L1 tag copy registers the difference between the L2 index and the L1 index, and the L2 way, instead of the comparison address. According to Patent Documents 3 and 4, this associates primary cache lines with secondary cache lines and reduces the hardware amount of the L1 tag copy.

With this method, the hit information for a cache hit in the secondary cache (the L2 index and the L2 way) uniquely identifies a primary cache line. The L2 index and the L2 way therefore carry information equivalent to a full-size address, and the address to be locked can be represented by the L2 index and the L2 way instead of the full-size address.
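The saving from this representation can be estimated by counting bits per lock entry. The sketch below is an illustration under assumptions: it treats the full-size address as a line-granular physical address, and the address width and cache geometry in the example are hypothetical values, not figures from the source.

```python
def bits_needed(count):
    """Number of bits required to distinguish `count` values."""
    return (count - 1).bit_length()


def lock_entry_bits(phys_addr_bits, line_bytes, l2_sets, l2_ways):
    """Return (full_size_bits, index_way_bits) for one lock entry.

    full_size_bits : line-granular address (byte offset within a line dropped)
    index_way_bits : L2 index plus L2 way
    """
    full_size_bits = phys_addr_bits - bits_needed(line_bytes)
    index_way_bits = bits_needed(l2_sets) + bits_needed(l2_ways)
    return full_size_bits, index_way_bits


# Hypothetical geometry: 48-bit addresses, 128-byte lines, 4096 sets, 16 ways.
full_bits, compact_bits = lock_entry_bits(48, 128, 4096, 16)
```

With these assumed parameters a lock entry shrinks from 41 bits to 16 bits, which also narrows the comparators used in each lock check.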

In short, conventional methods represent the address to be locked either by a full-size address or by a pair of an L2 index and an L2 way. On this premise, conventional lock check operations are described with reference to FIGS. 15 to 17. In this operation example, an access request from a core is assumed to contain the address of the data targeted by the requested processing and the L1 way of the primary cache line related to the access request. The primary cache line related to an access request is, for example, the line in which the loaded data will be stored if the access request is a data load request.

FIG. 15 illustrates the lock check operation in the method that represents locked addresses by full-size addresses as they are.

When a lock check is executed on the address of the data targeted by the processing requested by an access request, the address lock control unit 831 obtains the address contained in the access request from the core. Address (A) in FIG. 15 is this address. The address lock control unit 831 then compares the obtained address with the locked addresses it holds, and thereby determines whether the obtained address is locked. Based on this determination result, the retry control unit 832 identifies whether execution of the processing requested by the access request is to be canceled, and controls the cancellation and retry of that processing.

When a lock check is executed on the replacement target address of an access request based on L1 replacement, on the other hand, the address lock control unit 831 obtains the replacement target address from the L1 tag copy 820. Address (B) in FIG. 15 is the replacement target address. Specifically, the address lock control unit 831 searches the L1 tag copy 820 for the line that matches the information contained in the access request based on L1 replacement. Address (A) and L1 way (A) in FIG. 15 are this information. The address lock control unit 831 then obtains the replacement target address from the line indicated by the search result.

The address lock control unit 831 then compares the acquired replacement target address with the lock target addresses it holds, thereby determining whether the replacement target address is locked. Based on this determination result, the retry control unit 832 identifies whether execution of the processing requested by the access request based on L1 replacement is to be canceled, and controls the cancellation and retry of that processing.
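The full-address lock check of FIG. 15 described above can be sketched as follows. This is only an illustrative model, not the circuit of the publication: the class `L1TagCopy`, the functions `lock_check_normal` and `lock_check_replace`, the per-core key `(core, L1 index, L1 way)`, and the bit widths are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field

LINE_BITS = 6      # assumed 64-byte cache lines
L1_INDEX_BITS = 7  # assumed 128 sets in each L1 cache


def l1_index(addr: int) -> int:
    """Extract the L1 index from a physical address (assumed bit layout)."""
    return (addr >> LINE_BITS) & ((1 << L1_INDEX_BITS) - 1)


@dataclass
class L1TagCopy:
    """Model of the L1 tag copy: (core, L1 index, L1 way) -> line address."""
    lines: dict = field(default_factory=dict)

    def lookup(self, core: int, addr: int, l1_way: int):
        # Address (A) supplies the L1 index; L1 way (A) selects the line.
        return self.lines.get((core, l1_index(addr), l1_way))


def lock_check_normal(locked: set, req_addr: int) -> bool:
    """Normal request: compare address (A) directly with the locked addresses."""
    return req_addr in locked


def lock_check_replace(locked: set, tag_copy: L1TagCopy, core: int,
                       req_addr: int, l1_way: int) -> bool:
    """L1 replacement: search the tag copy for address (B), then compare it."""
    victim = tag_copy.lookup(core, req_addr, l1_way)  # the extra search step
    return victim is not None and victim in locked
```

In this model the replacement path needs one more lookup than the normal path before the comparison can start, which corresponds to the additional search of the L1 tag copy 820 in the text.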

FIG. 16 illustrates the lock check operation in the method that expresses a locked address using an L2 index and an L2 way.

When a lock check is executed on the address of the data targeted by the processing requested by an access request, the address lock control unit 831 acquires, from the L2 tag 810, the L2 index and L2 way corresponding to the address to be processed. Specifically, the address lock control unit 831 searches the L2 tag 810 for a line that hits the information included in the access request. Address (A) in FIG. 16 is the information included in the access request. The address lock control unit 831 then acquires the L2 index and L2 way corresponding to the address to be processed from the line indicated by the search result. L2 index (A) and L2 way (A) in FIG. 16 are the L2 index and L2 way corresponding to the address to be processed.

The address lock control unit 831 then compares the acquired L2 index and L2 way with the lock target L2 indexes and L2 ways it holds, thereby determining whether the address to be processed is locked. Based on this determination result, the retry control unit 832 identifies whether execution of the processing requested by the access request from the core is to be canceled, and controls the cancellation and retry of that access request.

On the other hand, when a lock check is executed on the replacement target address of an access request based on L1 replacement, the address lock control unit 831 acquires, from the L1 tag copy 820, the L2 index and L2 way corresponding to the replacement target address. Specifically, the address lock control unit 831 searches the L1 tag copy 820 for a line that matches the information included in the access request based on L1 replacement. Address (A) and L1 way (A) in FIG. 15 are the information included in that access request. The address lock control unit 831 then acquires the L2 index and L2 way corresponding to the replacement target address from the line indicated by the search result. L2 index (B) and L2 way (B) in FIG. 16 are the L2 index and L2 way corresponding to the replacement target address.

The address lock control unit 831 then compares the acquired L2 index and L2 way with the lock target L2 indexes and L2 ways it holds, thereby determining whether the replacement target address is locked. Based on this determination result, the retry control unit 832 identifies whether execution of the processing requested by the access request based on L1 replacement is to be canceled, and controls the cancellation and retry of that processing.
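The L2 index and L2 way variant of FIG. 16 can be sketched in the same spirit. Here the L2 tag and the L1 tag copy are modeled as plain dictionaries, and the names `l2_slot_for_request`, `l2_slot_for_victim`, and `is_locked` are hypothetical helpers, not names from the publication.

```python
from typing import Optional, Tuple

Slot = Tuple[int, int]  # (L2 index, L2 way)


def l2_slot_for_request(l2_tag: dict, addr: int) -> Optional[Slot]:
    """Normal request: find the L2 line that hits address (A).

    l2_tag is modeled as {line address: (L2 index, L2 way)} (an assumption).
    """
    return l2_tag.get(addr)


def l2_slot_for_victim(l1_tag_copy: dict, core: int,
                       l1_index: int, l1_way: int) -> Optional[Slot]:
    """L1 replacement: read L2 index (B) and L2 way (B) from the L1 tag copy.

    l1_tag_copy is modeled as {(core, L1 index, L1 way): (L2 index, L2 way)}.
    """
    return l1_tag_copy.get((core, l1_index, l1_way))


def is_locked(locked_slots: set, slot: Optional[Slot]) -> bool:
    """Compare the acquired pair with the held lock target pairs."""
    return slot is not None and slot in locked_slots
```

Both paths end in the same comparison; they differ only in which tag structure is searched to obtain the pair.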

Thus, in the conventional method, in the lock check for an access request based on L1 replacement, the address lock control unit 831 searches the L1 tag copy 820 to acquire the replacement target address. FIG. 17 summarizes the lock check operations shown in FIGS. 15 and 16. FIG. 17 illustrates the operation for an access request based on L1 replacement.

As shown in FIG. 17, when an access request based on L1 replacement is issued from a core, the secondary cache 800 acquires, from the L1 tag copy, either the replacement target address or the L2 index and L2 way corresponding to the replacement target address. A lock check based on the replacement target address, or on the L2 index and L2 way, is then executed. If the target is not locked, the processing for the access request based on L1 replacement is executed. When this processing is executed, for example, the replacement request data is written over the line of the L1 tag copy 820 in which the replacement target data was stored. The replacement target data is thereby erased from the L1 tag copy 820.
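The sequence of FIG. 17 described above (search, lock check, overwrite) can be summarized as a small sketch. The function `handle_l1_replace`, the dictionary model of the L1 tag copy, and the string results are assumptions made for illustration only.

```python
def handle_l1_replace(tag_copy: dict, locked: set, key, new_addr: int) -> str:
    """Model of the conventional handling of an L1 replacement request.

    tag_copy : {(core, L1 index, L1 way): line address} (hypothetical layout)
    locked   : set of locked line addresses
    key      : the (core, L1 index, L1 way) taken from the request
    new_addr : address of the data that replaces the victim line
    """
    # Step 1: search the L1 tag copy for the replacement target address.
    victim = tag_copy.get(key)
    # Step 2: lock check on the replacement target address.
    if victim is not None and victim in locked:
        return "retry"  # execution is canceled and retried later
    # Step 3: overwrite the line, which erases the victim from the tag copy.
    tag_copy[key] = new_addr
    return "done"
```

Step 2 cannot begin until step 1 has returned, which is the serialization that the publication identifies as the source of the delay.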

In the conventional method, when a lock check is performed in the processing requested by an access request based on L1 replacement, a search of the L1 tag copy is executed to acquire the replacement target address or the like. Specifically, the second cache memory is shared by a plurality of processor cores each having a first cache memory (L1). In this configuration, when a replace request occurs in any of the first cache memories, the second cache memory searches its copy of the tags of the first cache memory (the L1 tag copy) to identify the target address or the like of the replace request. The lock check on the replacement target address is therefore delayed by the time needed to search the copy of the tags of the first cache memory for that address.

In one aspect, an object of the present invention is to reduce the delay that occurs in the lock check of an address targeted by a replace request.

One aspect of the disclosed technology is exemplified by the following arithmetic processing device. The arithmetic processing device includes: a plurality of arithmetic processing units, each including a first cache memory, that each perform arithmetic operations and output access requests; a second cache memory, shared by the plurality of arithmetic processing units, that holds data computed by each of the plurality of arithmetic processing units; an acquisition unit that acquires attribute information regarding the first cache memory of the source of a controlled access request, the attribute information including way information indicating the way of the cache block of the first cache memory to be controlled in the arithmetic processing unit that is the source of the controlled access request; a holding unit that holds the acquired attribute information together with address information regarding a control target address that identifies the cache block in the second cache memory targeted by the controlled access request; and a control unit that, based on a replacement target address included in an access request related to a replace request issued by any of the plurality of arithmetic processing units with respect to the first cache memory and on way information indicating the way of the cache block to be replaced in the first cache memory, controls the access request related to the replace request for the cache block of the second cache memory identified by the control target address information and the attribute information.

According to this arithmetic processing device, the delay that occurs in the lock check of an address targeted by a replace request can be reduced.

FIG. 1 illustrates an apparatus according to the embodiment.
FIG. 2 illustrates the data format of a cache tag according to the embodiment.
FIG. 3A illustrates lock processing according to the embodiment for a normal order.
FIG. 3B illustrates lock processing according to the embodiment for a replace order.
FIG. 4 illustrates lock processing of the address lock control unit according to the embodiment.
FIG. 5 illustrates the operation of acquiring a replacement target address in the embodiment.
FIG. 6A illustrates a lock check according to the embodiment for a normal order.
FIG. 6B illustrates a lock check according to the embodiment for a replace order.
FIG. 7 illustrates a lock check of the address lock control unit according to the embodiment.
FIG. 8 illustrates lock processing and a lock check according to the embodiment.
FIG. 9 illustrates a circuit of the address lock control unit according to the embodiment.
FIG. 10 illustrates a specific operation of the lock check according to the embodiment.
FIG. 11 illustrates a specific operation of the lock check according to the embodiment.
FIG. 12A illustrates the lock range in the conventional method.
FIG. 12B illustrates the lock range in the embodiment.
FIG. 13A illustrates the effect of the lock range of the conventional method.
FIG. 13B illustrates the effect of the lock range of the embodiment.
FIG. 14 illustrates a conventional multi-core processor system.
FIG. 15 illustrates the lock check operation when the lock target is expressed by a full-size address.
FIG. 16 illustrates the lock check operation when the lock target is expressed by an L2 index and an L2 way.
FIG. 17 illustrates the operation of a conventional multi-core processor system in response to an access request based on L1 replacement.

An embodiment according to one aspect of the present invention is described below with reference to the drawings. The embodiment described below is, however, merely an example of the present invention in every respect and is not intended to limit its scope. Various improvements and modifications can of course be made without departing from the scope of the present invention. That is, in implementing the present invention, specific configurations suited to the embodiment may be adopted as appropriate. Hereinafter, the embodiment according to one aspect of the present invention is also referred to as "the present embodiment".

The present embodiment described below uses a two-level cache memory as an example. The present invention may, however, be applied to cache memories with other numbers of levels. To allow for application to cache memories with three or more levels, the primary cache in the following embodiment may be referred to as a "first cache memory", and the secondary cache may be referred to as a "second cache memory".

The data appearing in the present embodiment is described in natural language (Japanese or the like). In practice, however, this data is specified in a pseudo-language, commands, parameters, machine language, or the like that a computer can recognize.

§1 Device Example
First, an example of a device according to the present embodiment is described with reference to FIG. 1. FIG. 1 illustrates a multi-core processor system according to the present embodiment. As shown in FIG. 1, the multi-core processor system according to the present embodiment includes m+1 processor cores (100, 110, ..., 1m0), a secondary cache 200, a memory controller 300, and a main memory 400, where m is a natural number. In the present embodiment, the units other than the main memory 400 are provided on a single semiconductor chip. The multi-core processor system according to the present embodiment does not, however, have to be formed in this way. The relationship between semiconductor chips and units is determined as appropriate.

Each processor core (100, 110, ..., 1m0) includes an instruction control unit (101, 111, ..., 1m1), an operation execution unit (102, 112, ..., 1m2), and a primary cache (103, 113, ..., 1m3). As shown in FIG. 1, the processor cores (100, 110, ..., 1m0) are also referred to as the "first core", the "second core", ..., and the "(m+1)-th core", respectively. Each processor core (100, 110, ..., 1m0) corresponds to an arithmetic processing unit.

The instruction control unit (101, 111, ..., 1m1) is a control unit that decodes instructions and controls the processing order in each processor core (100, 110, ..., 1m0). Specifically, the instruction control unit (101, 111, ..., 1m1) fetches an instruction (machine instruction) from a storage device. The storage devices that store machine instructions are, for example, the main memory 400, the secondary cache 200, and the primary cache (103, 113, ..., 1m3). The instruction control unit (101, 111, ..., 1m1) then interprets (decodes) the fetched instruction. The instruction control unit (101, 111, ..., 1m1) also acquires the data to be processed by the instruction from a storage device and stores it in a register or the like provided in the processor core (100, 110, ..., 1m0). The instruction control unit (101, 111, ..., 1m1) then controls execution of the instruction on the acquired data.

The operation execution unit (102, 112, ..., 1m2) performs arithmetic processing. Specifically, each operation execution unit (102, 112, ..., 1m2) executes, on the data read into a register or the like, the operation corresponding to the instruction interpreted by the corresponding instruction control unit (101, 111, ..., 1m1).

The primary cache (103, 113, ..., 1m3) and the secondary cache 200 are cache memories that temporarily hold the data processed by the instruction control unit (101, 111, ..., 1m1) and the operation execution unit (102, 112, ..., 1m2).

Each primary cache (103, 113, ..., 1m3) is a cache memory dedicated to its processor core (100, 110, ..., 1m0). The primary cache (103, 113, ..., 1m3) is a split cache memory in which an instruction (IF) cache and an operand (OP) cache are separated. The instruction cache stores data requested by instruction accesses. The operand cache stores data requested by data accesses. The operand cache is also called a data cache. Separating the cache by the type of data stored in this way makes it possible to increase the processing speed of the cache compared with a unified cache that is not separated. The structure of the cache memory used in the present invention is not, however, limited to a split cache memory.

The secondary cache 200, on the other hand, is a cache memory shared by the processor cores (100, 110, ..., 1m0). The secondary cache 200 is a unified cache memory that stores instructions and operands without distinguishing between them. The cache may be divided into banks to improve processing capability.

The primary cache (103, 113, ..., 1m3) can process data faster than the secondary cache 200 but has a smaller storage capacity. Each processor core (100, 110, ..., 1m0) bridges the difference in processing speed between itself and the main memory 400 by using the primary cache (103, 113, ..., 1m3) and the secondary cache 200, which differ in processing speed and capacity.

In the present embodiment, data stored in the primary cache (103, 113, ..., 1m3) is also stored in the secondary cache 200. That is, the cache used in the present embodiment is an inclusive cache, in which the data stored in an upper-level cache memory close to the processor core is always contained in the lower-level cache memory.

For example, when the secondary cache 200 acquires data (an address block) requested by a processor core from memory, it transfers the acquired data to the primary cache and at the same time registers the data in itself. The secondary cache 200 writes data registered in itself back to memory only after the corresponding data registered in the primary cache has been invalidated or written back to the secondary cache 200. Through these operations, the data stored in the primary cache is always contained in the secondary cache 200.
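The inclusion property described above can be illustrated with a minimal model that tracks only which line addresses are present at each level. The class `InclusiveHierarchy` and its method names are hypothetical, and a single L1 is modeled for brevity.

```python
class InclusiveHierarchy:
    """Tracks which line addresses are present in L1 and in L2 (presence only)."""

    def __init__(self):
        self.l1 = set()
        self.l2 = set()

    def fill_from_memory(self, addr: int) -> None:
        # Data fetched from memory is registered in L2 and transferred to L1.
        self.l2.add(addr)
        self.l1.add(addr)

    def evict_from_l1(self, addr: int) -> None:
        # An L1 replacement removes the line from L1 only; L2 keeps it.
        self.l1.discard(addr)

    def evict_from_l2(self, addr: int) -> None:
        # The L1 copy is invalidated (or written back) before L2 gives up the line.
        self.l1.discard(addr)
        self.l2.discard(addr)

    def is_inclusive(self) -> bool:
        # Every line present in L1 must also be present in L2.
        return self.l1 <= self.l2
```

With these three operations the subset check `is_inclusive()` holds after every step, which is the invariant the paragraph describes.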

An inclusive cache has the advantage that the structure and control of the cache tags are simpler than in other cache organizations. The cache memory used in the present invention is not, however, limited to an inclusive cache.

Furthermore, a set-associative scheme is adopted for the data storage structure of the primary cache (103, 113, ..., 1m3) and the secondary cache 200. As described above, a line of the primary cache (103, 113, ..., 1m3) is identified by an L1 index and an L1 way, and a line of the secondary cache 200 is identified by an L2 index and an L2 way. The line size of the primary cache (103, 113, ..., 1m3) and that of the secondary cache 200 are assumed to be the same.
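As a rough illustration of how a set-associative cache derives an index from an address, the following sketch splits an address into tag, index, and offset. The function `split_address` is hypothetical, and the line size and set counts used with it are assumed values, not figures from the publication; both arguments must be powers of two.

```python
def split_address(addr: int, line_bytes: int, num_sets: int):
    """Split an address into (tag, index, offset) for a set-associative cache.

    line_bytes and num_sets are assumed to be powers of two.
    """
    offset_bits = line_bytes.bit_length() - 1
    index_bits = num_sets.bit_length() - 1
    offset = addr & (line_bytes - 1)
    index = (addr >> offset_bits) & (num_sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset
```

For instance, `split_address(0x12345, 64, 128)` gives `(9, 13, 5)` for an assumed 128-set L1, while `split_address(0x12345, 64, 2048)` gives `(0, 1165, 5)` for an assumed 2048-set L2. Because the line size is the same at both levels, the offset is identical and only the index and tag widths differ. The way is not derived from the address; it is chosen when the line is registered.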

As shown in FIG. 1, the primary cache 103 includes an L1 cache control unit 104, an L1 instruction cache 105, and an L1 operand cache 106. The L1 cache control unit 104 includes an address conversion unit 104a and a request processing unit 104b. The L1 instruction cache 105 and the L1 operand cache 106 include L1 tags (105a, 106a) and L1 data (105b, 106b), respectively. In the present embodiment, each of the other primary caches (113, ..., 1m3) is configured in the same way as the primary cache 103.

The address conversion unit 104a converts the logical address specified by an instruction fetched by the instruction control unit 101 into a physical address. A TLB (Translation Lookaside Buffer), a hash table, or the like may be used for this address conversion.

The request processing unit 104b processes cache data operations based on the instructions controlled by the instruction control unit 101. For example, the request processing unit 104b searches the L1 instruction cache 105 or the L1 operand cache 106 for the data corresponding to a data request from the instruction control unit 101. If the data is found, the request processing unit 104b returns it to the instruction control unit 101. If the data is not found, the request processing unit 104b returns a cache miss result to the instruction control unit 101. The request processing unit 104b also processes the data operations in the primary cache that arise from the L1 replacement described above. In addition, in response to a request from the secondary cache 200, the request processing unit 104b writes the data specified by that request back to the secondary cache 200.

The L1 instruction cache 105 and the L1 operand cache 106 are storage units that store the data of the primary cache 103. The L1 instruction cache 105 stores the machine instructions accessed at instruction fetch. The L1 operand cache 106 stores the data specified by the operand part of a machine instruction.

The L1 tags (105a, 106a) store the cache tags of the L1 instruction cache 105 and the L1 operand cache 106, respectively. The cache tag and the L1 index together identify the address of the data stored in a line. The L1 data (105b, 106b) stores the data corresponding to the address identified by the L1 index and the L1 tag (105a, 106a).

As shown in FIG. 1, the secondary cache 200 includes an L2 cache control unit 210 and an L2 cache data unit 220. The L2 cache control unit 210 controls an access request related to a replace request for the cache block of the second cache memory identified by the above-described control target address information and attribute information. It does so based on the replacement target address included in the access request related to the replace request issued by any of the plurality of arithmetic processing units with respect to the first cache memory, and on the way information indicating the way of the cache block to be replaced in the first cache memory. In the present embodiment, the L2 cache control unit 210 includes a request processing unit 211, a retry control unit 212, and an address lock control unit 213. The L2 cache data unit 220 includes an L2 tag 221, L2 data 222, and an L1 tag copy 223.

The request processing unit 211 executes the processing requested by access requests from the processor cores (100, 110, ..., 1m0). In the present embodiment, an access request from the processor core 100 is issued by the instruction control unit 101.

Such access requests include, for example, a data request issued when a cache miss occurs in the primary cache. For example, the instruction control unit 101 attempts to acquire data from the primary cache 103 at instruction fetch or data access. If the target data is not stored in the primary cache 103, a cache miss occurs. When a cache miss occurs, the instruction control unit 101 attempts to acquire the target data from the secondary cache 200. The data request mentioned above is the access request issued at this point from the core to the secondary cache 200.

Access requests also include, for example, an access request based on L1 replacement. For example, the instruction control unit 101 acquires data that is not stored in the primary cache 103 from the secondary cache 200 or the main memory 400. The instruction control unit 101 then requests the request processing unit 104b to store the acquired data in the primary cache 103. Suppose that L1 replacement occurs at this point. As described above, when L1 replacement occurs, data processing is executed both in the primary cache in which the L1 replacement occurred and in the secondary cache. The access request based on L1 replacement is the access request issued by the instruction control unit 101 at this point so that the data processing in the secondary cache is executed.

When these operations are executed for an instruction fetch, for example, the acquired data is stored in the L1 instruction cache 105. When they are executed for an access to data specified by the operand part of a machine instruction, for example, the acquired data is stored in the L1 operand cache 106.

The request processing unit 211 executes these access requests while maintaining coherency between the primary cache (103, 113, ..., 1m3) and the secondary cache 200. That is, the request processing unit 211 controls cache coherency between the primary cache (103, 113, ..., 1m3) and the secondary cache 200. In doing so, the request processing unit 211 requests the processor cores (100, 110, ..., 1m0) to invalidate or write back the relevant line in the primary cache in order to maintain cache coherency. An example is described later.

Regarding access requests from the cores, hereinafter, a processing instruction related to an access request based on L1 replacement is also referred to as an "L1 replace order", and a processing instruction related to any other access request is referred to as a "normal order". An access request, together with related processing such as acquisition of the data targeted by the access request, may also be referred to as a processing request.

The retry control unit 212 cancels the execution of an access request for an address locked by the address lock control unit 213 and retries it. In this embodiment, the secondary cache 200 includes one or more pipelines (not shown). As described above, when processes are executed in parallel, a subsequent process may, depending on the preceding process flowing through the pipeline, be cancelled and retried until the preceding process completes. The retry control unit 212 controls this cancellation and retry of subsequent processes.
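The cancel-and-retry behaviour can be pictured with a small simulation. This is only a sketch under stated assumptions, not the circuit of the embodiment: the function name `run_pipeline`, the one-request-per-cycle model, and the `locked_until` table (address mapped to the cycle at which its lock is released) are all hypothetical.

```python
from collections import deque
from typing import Dict, List, Tuple


def run_pipeline(requests: List[Tuple[str, int]],
                 locked_until: Dict[int, int]) -> List[Tuple[int, str, str]]:
    """Process one request per cycle; cancel and requeue requests to locked addresses."""
    queue = deque(requests)
    log = []
    cycle = 0
    while queue:
        name, address = queue.popleft()
        if cycle < locked_until.get(address, 0):
            # A preceding process still holds the address: cancel and retry later.
            log.append((cycle, name, "retry"))
            queue.append((name, address))
        else:
            log.append((cycle, name, "done"))
        cycle += 1
    return log
```

In this model, a request to a locked address keeps re-entering the queue until the lock has been released, while requests to other addresses proceed.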

The address lock control unit 213 includes an L1 attribute information acquisition unit 214, an address holding unit 215, and a lock determination unit 216. The L1 attribute information acquisition unit 214 and the address holding unit 215 correspond to the acquisition unit and the holding unit, respectively.

The L1 attribute information acquisition unit 214 acquires attribute information on the primary caches (103, 113, ..., 1m3). It also acquires address information on the target of the processing requested by an access request from a core. The address information relates to the control target address, which identifies the cache block in the second cache memory targeted by the access request to be controlled; it is, for example, the full-size address described above or a pair of an L2 index and an L2 way. In this embodiment, the address information is a pair of an L2 index and an L2 way. The attribute information includes way information for identifying the L1 way of the primary cache line related to the access request. The way information is, for example, a way number indicating the L1 way of that line. The attribute information corresponds to the attribute information on the first cache memory of the access request source to be controlled. The way information corresponds to the way information indicating the way of the cache block of the first cache memory that is to be controlled in the arithmetic processing unit of the access request source.

In this embodiment, the L1 attribute information acquisition unit 214 further acquires attribute information that includes information indicating the core that issued the access request and information indicating the data type. The information indicating the core corresponds to information indicating the arithmetic processing unit that is the access request source to be controlled; it is, for example, a core number. The information indicating the data type corresponds to information indicating the type of the data targeted by the access request: it indicates whether the data is machine-instruction data or data specified by the operand part of a machine instruction. Machine-instruction data is stored in the instruction cache, and data specified by the operand part of a machine instruction is stored in the operand cache. While the processing requested by an access request issued by a processor core is being executed, these pieces of information are used to indicate the target that is locked for that processing. They are also used to determine whether the target of processing requested by an access request from a core is locked.

While processing for an access request issued by a processor core is being executed, the address holding unit 215 holds the address information and attribute information acquired by the L1 attribute information acquisition unit 214 as information indicating the target locked for that processing. Hereinafter, the information held by the address holding unit 215 is also referred to as lock target information.

In this embodiment, while the processing requested by an access request issued by a processor core is being executed, the address holding unit 215 holds the lock target information corresponding to that processing in a valid state. After the processing completes, the address holding unit 215 invalidates the corresponding lock target information. Invalidation may be realized by deleting the data or by a flag indicating that the data is invalid. This embodiment adopts the latter method (see FIG. 9, described later).

The address holding unit 215 holds the lock target information in a form in which the core and the data type locked by the processing requested by an access request can each be identified. The locked core and data type may be identified by including identifying information in the lock target information itself. Alternatively, they may be identified by preparing separate lock target information for each core and each data type. In this embodiment, the latter form is adopted for the locked core and the former form for the locked data type (see FIG. 9, described later).
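The holding scheme described above (flag-based invalidation, one group of entries per core, data type stored inside the entry) can be sketched as follows. This is a minimal illustration under assumptions: the names `LockEntry` and `AddressHolder`, the field layout, and the first-free-slot allocation are hypothetical and are not taken from FIG. 9.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LockEntry:
    valid: bool = False      # invalidation by flag, not by deletion
    data_type: str = ""      # "instruction" or "operand", kept inside the entry
    l2_index: int = 0        # address information
    l2_way: int = 0          # address information
    l1_way: int = 0          # way information from the attribute information


class AddressHolder:
    """Hypothetical model of a holding unit with one entry group per core."""

    def __init__(self, num_cores: int, entries_per_core: int) -> None:
        # The locked core is identified by which group an entry lives in.
        self.entries: List[List[LockEntry]] = [
            [LockEntry() for _ in range(entries_per_core)]
            for _ in range(num_cores)
        ]

    def register(self, core: int, data_type: str,
                 l2_index: int, l2_way: int, l1_way: int) -> Optional[int]:
        """Store lock target information in a free slot; None if the group is full."""
        for slot, entry in enumerate(self.entries[core]):
            if not entry.valid:
                self.entries[core][slot] = LockEntry(
                    True, data_type, l2_index, l2_way, l1_way)
                return slot
        return None

    def invalidate(self, core: int, slot: int) -> None:
        """Clear only the valid flag once the processing has completed."""
        self.entries[core][slot].valid = False
```

Because the core is implied by the entry group, the entry itself needs no core number, whereas the data type is carried as a field.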

The lock determination unit 216 determines whether the target of the processing requested by an access request issued by a processor core is locked. Specifically, the lock determination unit 216 compares the address information and the attribute information (including the L1 way) corresponding to the access request with the corresponding information held by the address holding unit 215. In this way, it determines whether the target of the requested processing is locked. The access requests subject to this determination include access requests based on an L1 replacement.
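The comparison can be sketched as below. The detailed matching rules are described later in the document and may differ by order type; this sketch simply assumes that a request is locked when a valid held entry has the same L2 index, L2 way, and L1 way. The names `HeldInfo` and `is_locked` are hypothetical.

```python
from typing import Iterable, NamedTuple


class HeldInfo(NamedTuple):
    valid: bool
    l2_index: int
    l2_way: int
    l1_way: int


def is_locked(l2_index: int, l2_way: int, l1_way: int,
              held: Iterable[HeldInfo]) -> bool:
    """Return True if any valid held entry equals the request's information."""
    key = (l2_index, l2_way, l1_way)
    return any(e.valid and (e.l2_index, e.l2_way, e.l1_way) == key
               for e in held)
```

An invalidated entry never matches, so a completed process no longer blocks later requests.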

Using these units, the address lock control unit 213 controls the processing requested by access requests issued by the processor cores. By controlling address locking with the address lock control unit 213, the system according to this embodiment reduces the delay that arises in the lock check of an address targeted by a replace request. The detailed operation of the address lock control unit 213 is described later.

The L2 cache data unit 220 is a storage unit that stores the data of the secondary cache 200. The L2 tag 221 stores the cache tags of the secondary cache 200. The cache tag and the L2 index together identify the address of the data stored in a line of the secondary cache 200. The L2 data 222 stores the data corresponding to the address identified by the L2 index and the L2 tag 221. The L1 tag copy 223 stores copies of the cache tags of the primary caches (103, 113, ..., 1m3); for example, it stores copies of the L1 tags (105a, 106a).

As shown in FIG. 1, the multi-core processor system according to this embodiment includes a memory controller 300 and a main memory 400. The memory controller 300 handles writing data to and reading data from the main memory 400. For example, in response to a data write-back executed by the request processing unit 211, the memory controller 300 writes the write-back data to the main memory 400. In response to a data request from the request processing unit 211, the memory controller 300 reads the requested data from the main memory 400. The main memory 400 is the main storage device used in the multi-core processor system according to this embodiment.

§2 Data format
Next, the data format of the cache tags handled in this embodiment is described with reference to FIG. 2. FIG. 2 illustrates the data format of the cache tags stored in the primary caches (103, 113, ..., 1m3) and the secondary cache 200. FIG. 2 shows the cache tag format for a single line.

In the example shown in FIG. 2, each entry of the L1 tags (105a, 106a) has an area for storing the physical address upper bits B1 and a status 500. The physical address upper bits B1 are used to search for the relevant line. The status 500 is information indicating, for example, whether the data stored in the line corresponding to the cache tag is valid and whether the data has been updated. Data stored in the primary cache is searched for using these L1 tags.

Specifically, the request processing unit 104b first searches the L1 instruction cache 105 or the L1 operand cache 106 for the set whose L1 index matches the lower bits of the logical address given by the instruction control unit 101. For an operation in the instruction fetch stage, the request processing unit 104b searches the L1 instruction cache 105 for the set. For an operation that acquires data specified by the operand part of a machine instruction, the request processing unit 104b searches the L1 operand cache 106 for the set.

Next, the request processing unit 104b searches that set for the line storing the data specified by the logical address given by the instruction control unit 101. This search uses the physical address. Therefore, before the search, the logical address given by the instruction control unit 101 is translated into a physical address by the address conversion unit 104a.

In other words, in the primary cache 103, the index is given by the logical address (virtual address) and the cache tag is given by the real address (physical address). This scheme is called the VIPT (Virtually Indexed Physically Tagged) scheme.

The address given by a core is a logical address. In the PIPT (Physically Indexed Physically Tagged) scheme, where the index is given by the physical address, the line search can therefore start only after the logical address has been translated into a physical address. In the VIPT scheme, by contrast, identifying the index and translating the logical address into a physical address can proceed in parallel. The VIPT scheme thus has lower latency than the PIPT scheme.

In the VIVT (Virtually Indexed Virtually Tagged) scheme, where the cache tag is also given by the logical address, the problem arises that different physical addresses are assigned to the same virtual address (the homonym problem). In the VIPT scheme, the physical address is used for the cache tag, so the homonym problem can be detected.

Because of these advantages, the VIPT scheme is adopted for the cache used in this embodiment. However, the cache used in the present invention is not limited to the VIPT scheme.

The request processing unit 104b compares the upper bits of the physical address translated by the address conversion unit 104a with the physical address upper bits B1 of each L1 tag entry. The line corresponding to the L1 tag entry whose physical address upper bits B1 match the upper bits of the translated physical address is the line that stores the data specified by the logical address given by the instruction control unit 101. The request processing unit 104b therefore searches the L1 tags corresponding to the lines in the located set for an entry whose physical address upper bits B1 match the upper bits of the given physical address.

Finally, if the search finds a matching L1 tag entry, the request processing unit 104b acquires the data stored in the line corresponding to that entry and passes it to the instruction control unit 101. If no matching L1 tag entry is found, the request processing unit 104b determines that a cache miss has occurred and notifies the instruction control unit 101 that the specified data is not stored in the primary cache 103. In this embodiment, data stored in the primary cache is searched for in this way.
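The three lookup steps above (select the set by the logical address, translate, compare physical upper bits) can be sketched as follows. This is an illustration under assumptions, not the embodiment's hardware: the line size, the number of sets, the `translate` callback, and the simplification that B1 is everything above the index and line offset are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple

LINE_BITS = 6       # assumed 64-byte line
L1_INDEX_BITS = 7   # assumed 128 sets

# One L1 tag entry per way: (valid flag from the status, upper bits B1).
TagEntry = Tuple[bool, int]


def l1_lookup(logical_addr: int,
              translate: Callable[[int], int],
              l1_tags: List[List[TagEntry]]) -> Optional[int]:
    """Return the hit way, or None on a cache miss."""
    # Step 1: the set is selected by lower bits of the logical address.
    index = (logical_addr >> LINE_BITS) & ((1 << L1_INDEX_BITS) - 1)
    # Step 2: translation; in hardware this can overlap with step 1.
    physical_addr = translate(logical_addr)
    upper = physical_addr >> (LINE_BITS + L1_INDEX_BITS)
    # Step 3: compare the physical upper bits with B1 of each way in the set.
    for way, (valid, b1) in enumerate(l1_tags[index]):
        if valid and b1 == upper:
            return way
    return None
```

A return value of `None` corresponds to the cache-miss notification to the instruction control unit.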

L1 tag entries are prepared for each core, each data type, each index, and each way. FIG. 1 shows that L1 tag entries are prepared for each core, each index, and each data type. In addition, because the data storage structure of the primary caches (103, 113, ..., 1m3) in this embodiment is set-associative, L1 tag entries are prepared for each way (see FIG. 9, described later).

In the example shown in FIG. 2, each entry of the L2 tag 221 has an area for storing the physical address upper bits B2, a status 501, the logical address lower bits A1, and L1 shared information 502.

The physical address upper bits B2 are used to search for the relevant line in the secondary cache 200. The status 501 is information indicating, for example, whether the data stored in the line of the secondary cache 200 corresponding to the cache tag is valid and whether the data has been updated.

The logical address lower bits A1 are used, for example, to resolve the synonym problem. In this embodiment, the primary cache adopts the VIPT scheme, so the synonym problem, in which different logical addresses are assigned to the same physical address, may occur. In this embodiment, whether the synonym problem has occurred can be detected by referring to the logical address lower bits A1.
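A small sketch may make the synonym check more concrete. It assumes that A1 equals the L1 index, that the L1 index extends above the page offset, and that detection is a plain comparison between the A1 stored in the L2 tag and the L1 index of the new request; the constants and the names `l1_index` and `is_synonym` are hypothetical, and the actual mechanism of the embodiment may differ.

```python
LINE_BITS = 7    # assumed 128-byte line
INDEX_BITS = 8   # assumed 256 L1 sets (index uses address bits 7..14)
PAGE_BITS = 13   # assumed 8 KiB page (bits 13 and 14 are translated)


def l1_index(logical_addr: int) -> int:
    """L1 index taken from the logical address, as in a VIPT cache."""
    return (logical_addr >> LINE_BITS) & ((1 << INDEX_BITS) - 1)


def is_synonym(stored_a1: int, request_logical_addr: int) -> bool:
    """True if the line was registered under different logical lower bits."""
    return stored_a1 != l1_index(request_logical_addr)


# Two logical addresses with the same page offset but different page numbers,
# assumed here to map to the same physical address.
first_logical = 0x2080
second_logical = 0x4080
stored_a1 = l1_index(first_logical)
print(is_synonym(stored_a1, second_logical))
```

With these assumed sizes, the two logical addresses select different L1 sets even though they refer to the same physical line, which is the situation the stored A1 lets the secondary cache notice.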

The L1 shared information 502 is information indicating how the data stored in the line corresponding to the cache tag is shared among the primary caches (103, 113, ..., 1m3) (see, for example, Patent Documents 2 and 3). The area for the L1 shared information 502 is provided in place of storing the L1 tag copy 223 in the L2 tag 221, in order to reduce the amount of hardware of the L2 tag 221. Data stored in the secondary cache 200 is searched for using this L2 tag 221.

The data search can be explained in almost the same way as the search in the primary cache 103. Specifically, the request processing unit 211 first searches for the set whose L2 index matches the lower bits of the physical address included in the access request from the core. Hereinafter, the address to be processed that is included in an access request from a core is also referred to as the "request address".

Next, the request processing unit 211 searches that set for the line storing the data specified by the request address given by the core. Specifically, the request processing unit 211 searches the L2 tags 221 corresponding to the lines in the located set for an entry whose physical address upper bits B2 match the upper bits of the request address.

Finally, if the search finds a matching entry of the L2 tag 221, the request processing unit 211 acquires the data stored in the line corresponding to that entry and passes it to the core that issued the access request. If no matching entry of the L2 tag 221 is found, the request processing unit 211 determines that a cache miss has occurred and requests the data specified by the request address from the memory controller 300. In response, the memory controller 300 acquires the data from the main memory 400 and passes it to the secondary cache 200.
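The hit and miss paths of the secondary cache can be sketched in the same spirit. The sizes, the dictionary-based line representation, and the `main_memory` mapping keyed by line address are assumptions made for illustration; the sketch does not model how the fetched data is installed into a way.

```python
from typing import Dict, List, Optional, Tuple

LINE_BITS = 7       # assumed 128-byte line
L2_INDEX_BITS = 4   # assumed 16 sets, kept small for illustration

Line = Optional[Dict[str, object]]  # {"b2": upper bits, "data": payload} or None


def l2_access(request_addr: int,
              l2_sets: List[List[Line]],
              main_memory: Dict[int, object]) -> Tuple[str, object]:
    """Return ("hit", data) from the secondary cache or ("miss", data) from memory."""
    index = (request_addr >> LINE_BITS) & ((1 << L2_INDEX_BITS) - 1)
    upper = request_addr >> (LINE_BITS + L2_INDEX_BITS)
    for line in l2_sets[index]:
        if line is not None and line["b2"] == upper:
            return "hit", line["data"]
    # Cache miss: ask the (modelled) memory controller for the line.
    return "miss", main_memory[request_addr >> LINE_BITS]
```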

In this embodiment, the data storage structure of the secondary cache is set-associative, so entries of the L2 tag 221 are prepared for each index and each way (see FIG. 9, described later).

The data storage capacity of the secondary cache 200 is larger than that of the primary cache 103, and in this embodiment the two caches have the same line size. The secondary cache 200 therefore normally has more sets than the primary cache 103. In that case the L2 index is longer than the L1 index, so the physical address upper bits B2 are shorter than the physical address upper bits B1. Depending on the cache capacities, the numbers of ways, and so on, however, B1 may instead be shorter than B2, or the two may have the same bit length. These relationships are chosen as appropriate.
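The relationship between capacity, number of ways, and bit lengths can be checked with a short calculation. The cache sizes and the 48-bit physical address below are illustrative assumptions, not values from the embodiment, and the model simply treats the tag as whatever remains above the index and line offset (power-of-two sizes assumed).

```python
from typing import Tuple


def tag_layout(capacity_bytes: int, line_bytes: int, ways: int,
               phys_addr_bits: int) -> Tuple[int, int, int]:
    """Return (number of sets, index bits, upper/tag bits) for a set-associative cache."""
    sets = capacity_bytes // (line_bytes * ways)
    offset_bits = line_bytes.bit_length() - 1
    index_bits = sets.bit_length() - 1
    tag_bits = phys_addr_bits - index_bits - offset_bits
    return sets, index_bits, tag_bits


# Assumed example: 64 KiB 2-way L1 and 8 MiB 16-way L2, 128-byte lines.
print(tag_layout(64 * 1024, 128, 2, 48))         # L1: more tag bits (B1)
print(tag_layout(8 * 1024 * 1024, 128, 16, 48))  # L2: more index bits, fewer tag bits (B2)
# Assumed counter-example: a 256 KiB L2 with 64 ways has fewer sets than the L1.
print(tag_layout(256 * 1024, 128, 64, 48))
```

The third call illustrates the exception noted above: with enough ways, a larger cache can still have fewer sets, so its upper bits become longer rather than shorter.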

In the example shown in FIG. 2, each entry of the L1 tag copy 223 has an area for storing an index difference 503, an L2 way 504, and a status 505.

The index difference 503 is the difference between the logical address lower bits A1 and the physical address lower bits B3. The L2 way 504 is the L2 way of the secondary cache line corresponding to the L1 tag copy 223 entry. In this embodiment, data stored in the primary cache is also stored in the secondary cache, so the L2 way 504 can be identified. The index difference 503 and the L2 way 504 associate an entry of the L1 tag copy 223 with an entry of the L2 tag 221 (see Patent Documents 3 and 4). The status 505 is information indicating, for example, whether the data stored in the line of the primary cache (103, 113, ..., 1m3) corresponding to the cache tag is valid and whether the data has been updated.

With such an L1 tag copy 223, the secondary cache 200 can search the L1 tag copy 223 using the result of searching the L2 tag 221.

Specifically, the request processing unit 211 refers to the L2 tag 221 to search the secondary cache 200 for the relevant data. If the data exists in the secondary cache 200, searching the L2 tag 221 with the physical address of the data finds the L2 tag 221 entry corresponding to the line that stores the data. This search identifies the L2 index and the L2 way of the data being searched for. The L1 index of the data is identified from part of the L2 index or from the logical address lower bits A1 in the L2 tag 221. The search of the L2 tag 221 therefore identifies the L1 index, the L2 index, and the L2 way of the data being searched for.

An entry of the L1 tag copy 223 can be identified by the L1 index, the index difference 503, and the L2 way 504. The index difference 503 is the difference between the L1 index (logical address lower bits A1) and the L2 index (physical address lower bits B3). The secondary cache 200 can therefore identify the L1 tag copy 223 entry corresponding to the data being searched for from the L1 index, L2 index, and L2 way identified by the search of the L2 tag 221.
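The following sketch shows one way such a search could work for a single core and data type. It assumes the simple case in which the L1 index is contained in the low bits of the L2 index, so that the index difference is just the remaining high bits of the L2 index; the name `find_l1_ways` and the tuple representation of an entry are hypothetical.

```python
from typing import List, Optional, Tuple

# (index difference, L2 way) for a valid entry, None for an invalid one.
CopyEntry = Optional[Tuple[int, int]]


def find_l1_ways(l1_tag_copy: List[List[CopyEntry]],
                 l2_index: int, l2_way: int,
                 l1_index_bits: int) -> List[int]:
    """Return the L1 ways whose tag-copy entry points at the given L2 line."""
    # Assumption: the L1 index is the low part of the L2 index.
    l1_index = l2_index & ((1 << l1_index_bits) - 1)
    index_diff = l2_index >> l1_index_bits
    return [way for way, entry in enumerate(l1_tag_copy[l1_index])
            if entry == (index_diff, l2_way)]
```

Only the entries of one L1 set are examined, and each comparison involves the short (index difference, L2 way) pair rather than a full physical tag.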

The L1 index of the data being searched for may be contained in the L2 index. In that case, the L1 index may be identified from the L2 index. An access request from a core may also include information on the L1 index. In that case, the L1 index may be identified from the information included in the access request. The information on the L1 index included in an access request from a core is, for example, the logical address itself.

Because the cache memory of this embodiment is an inclusive cache, data that does not exist in the secondary cache 200 does not exist in the primary caches either. Consequently, the L1 tag copy 223 is never searched for data that does not exist in the secondary cache 200.

Depending on the correspondence between logical and physical addresses, the logical address lower bits A1 may be contained in the physical address lower bits B3. In that case, the bit length of the index difference 503 equals the bit length of the physical address lower bits B3 minus the bit length of the logical address lower bits A1.

If the combined bit length of the index difference 503 and the L2 way 504 is shorter than the bit length of the physical address upper bits B1, the amount of hardware of the L1 tag copy 223 is smaller than when the L1 tags are copied as they are.
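As a rough worked example of this saving, the per-entry bit counts can be compared directly. The numbers are illustrative assumptions (33-bit B1, 8-bit L1 index, 12-bit L2 index, 16 L2 ways), the case where A1 is contained in B3 is assumed, and the status bits are left out because both formats carry a status.

```python
def l1_tag_copy_saving(b1_bits: int, l1_index_bits: int,
                       l2_index_bits: int, l2_ways: int) -> int:
    """Bits saved per entry by storing (index difference, L2 way) instead of B1."""
    index_diff_bits = l2_index_bits - l1_index_bits   # assumes A1 is contained in B3
    l2_way_bits = (l2_ways - 1).bit_length()          # bits needed to number the ways
    return b1_bits - (index_diff_bits + l2_way_bits)


# 33 - (4 + 4) = 25 bits saved per entry under the assumed sizes.
print(l1_tag_copy_saving(b1_bits=33, l1_index_bits=8, l2_index_bits=12, l2_ways=16))
```

A positive result corresponds to the condition stated above; a zero or negative result would mean the compact format brings no reduction.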

The L1 tag copy 223 is mainly used to maintain coherency between the primary caches (103, 113, ..., 1m3) and the secondary cache 200 (see Patent Documents 1 to 3). For the same reasons as the L1 tags, the L1 tag copy 223 is prepared for each core, each data type, and each way.

§3 Address lock control unit
<Operation example>
Next, an operation example of the address lock control unit 213 according to this embodiment is described with reference to FIGS. 3A, 3B, 4, 5, 6A, 6B, 7, and 8.

<Address registration>
First, the process by which the address lock control unit 213 according to this embodiment registers an address to be locked is described with reference to FIGS. 3A, 3B, 4, and 5. FIG. 3A illustrates the registration process when the processing instruction for an access request from a core is a normal order. FIG. 3B illustrates the registration process when the processing instruction for an access request from a core is a replace order. FIG. 4 summarizes the registration processes shown in FIGS. 3A and 3B; it illustrates the registration of an address to be locked by the address lock control unit 213 according to this embodiment.

The address lock control unit 213 uses the information identified from an access request from a core to acquire the lock target information that identifies the lock target. The address lock control unit 213 then holds the acquired lock target information in the address holding unit 215. In this embodiment, while lock target information is held in the address holding unit 215, the target it indicates is in the state of being registered as a lock target.

As shown in FIGS. 3A and 3B, six kinds of information can be identified from the lock target information held in this embodiment: the locked core, the data type, the L2 index, the L2 way, the L1 index, and the L1 way. The lock target information need not make all six kinds identifiable; for example, it may make only some of them identifiable.

The lock target information shown in FIGS. 3A and 3B contains these six kinds of information for identifying the lock target: the locked core, the data type, the L2 index, the L2 way, the L1 index, and the L1 way. The locked core, data type, L2 index, L2 way, L1 index, and L1 way can therefore be identified from the lock target information shown in FIGS. 3A and 3B.

However, even when a given piece of information indicating the lock target is not contained in the lock target information, the lock target indicated by that missing information can still be identified. For example, when the entries that store lock target information are provided separately for each core, the core to be locked can be identified even if the lock target information contains no information indicating the core.

For this reason, the following description of the address registration process with reference to FIGS. 3A and 3B assumes that the information for identifying the lock target is contained in the lock target information, but that information need not be contained in it. As long as the lock target can be identified, the information contained in the lock target information may be selected as appropriate.

First, as regards the process of registering lock target information, the information that can be identified from an access request from a core is described. As shown in FIGS. 3A and 3B, the secondary cache 200 can identify the request address, the request core, the request opcode, and the request way from an access request from a core.

As described above, the request address is the physical address in the main memory 400 that stores the data to be processed in response to an access request issued by a core. The request address may include the logical address corresponding to that physical address. The request address may also include, instead of the logical address itself, the lower bits A1 of the logical address (the L1 index) that were used to look up the line in the primary cache. An access request from a core contains this information, and the secondary cache 200 can identify the request address from it.

The request core indicates the core that issued the access request. The secondary cache 200 can identify the request core by identifying the core from which the access request arrived. Alternatively, the request core may be identified from information, contained in the access request itself, that indicates the issuing core.

The request opcode indicates the type of data specified by the access request from the core. As described above, there are two types of data: data for a machine instruction, which is obtained by an instruction fetch, and data obtained from the address specified by the operand part of a machine instruction.

In executing a machine instruction, each core basically reads the machine instruction (instruction fetch), decodes it (decode), reads the operand data on which it operates (operand fetch), executes the operation, and stores the result. In the instruction fetch phase, each core requests data for a machine instruction. In the operand fetch phase, each core requests the operand data specified by the operand part of the machine instruction. That is, the instruction control unit of each core requests one of these two types of data depending on the processing phase at the time it issues the access request (data request).

This difference is reflected in the opcode that identifies the access request issued by the instruction control unit. When the instruction control unit requests data to be stored in the instruction cache of the primary cache, it issues an access request containing an opcode such as "0x00 IF-MI-SH". When it requests data to be stored in the data cache of the primary cache, it issues an access request containing an opcode such as "0x01 OP-MI-SH" or "0x02 OP-MI-EX". The secondary cache 200 can identify the request opcode by decoding these opcodes.
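As a non-limiting illustration, the decoding of a request opcode into a data type may be sketched as follows. The three opcode values and mnemonics are those quoted above; the table is not exhaustive, and the names `DataType`, `OPCODES`, and `decode_data_type`, as well as the rule of classifying by the "IF-" or "OP-" mnemonic prefix, are assumptions made only for this sketch.

```python
from enum import Enum


class DataType(Enum):
    INSTRUCTION = "IF"  # data destined for the L1 instruction cache
    OPERAND = "OP"      # data destined for the L1 data cache


# Opcodes quoted in the description; other opcodes may exist.
OPCODES = {
    0x00: "IF-MI-SH",
    0x01: "OP-MI-SH",
    0x02: "OP-MI-EX",
}


def decode_data_type(opcode: int) -> DataType:
    """Classify a request opcode by its mnemonic prefix (assumed rule)."""
    mnemonic = OPCODES[opcode]  # raises KeyError for opcodes not listed here
    if mnemonic.startswith("IF-"):
        return DataType.INSTRUCTION
    return DataType.OPERAND
```

In this sketch, an opcode that is not in the table is rejected rather than guessed at, because the description does not enumerate the full opcode set.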

The request way indicates the L1 way of the line in the primary cache in which the data specified by the access request from the core is to be stored. An access request from a core contains information indicating the L1 way, and the secondary cache 200 can identify the request way from that information. Because replacement substitutes data within the same line, the line that will store the data specified by the replacement request address and the line that stored the data specified by the replacement target address are in the same way. Therefore, when the processing instruction for the access request from the core is an L1 replacement order, the request way indicates both the L1 way corresponding to the replacement request address and the L1 way corresponding to the replacement target address.

Next, the process of acquiring the information for identifying the lock target, using the information that can be identified from an access request from a core, is described. This process differs depending on whether the processing instruction for the access request from the core is a normal order or a replacement order. The process is described with reference to FIGS. 3A and 3B.

When the processing instruction for the access request from the core is a normal order, the information for identifying the lock target is acquired as shown in FIG. 3A. For example, for the L2 index and the L2 way, the L2 tag 221 is searched for the corresponding entry on the basis of the request address identified from the information contained in the access request. When the data specified by the request address is stored in the secondary cache 200, the search finds the entry of the L2 tag 221 that corresponds to the request address. In this case, the L1 attribute information acquisition unit 214 acquires, on the basis of the search of the L2 tag 221, the L2 index and the L2 way of the line that stores the data specified by the request address.

When no entry is found in the L2 tag 221, the data being searched for is not stored in the secondary cache 200. In this case, the request processing unit 211 acquires the data from the main memory 400 and stores it in the secondary cache 200. The storage destination for the data acquired from the main memory 400 is determined before the data is acquired.

Specifically, the search of the secondary cache 200 that is executed before the data is acquired from the main memory 400 identifies the set that contains the line in which the data should be stored. If the set has free lines, one of the free lines is chosen as the storage destination. If the set has no free line, a replacement process is executed, and the storage destination is determined by identifying the line to be replaced. Hereinafter, this replacement process is referred to as "L2 replacement".
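The choice of a storage destination within a set may be sketched as follows, purely for illustration. The description does not say which free line is chosen or how the line to be replaced is selected, so this sketch takes the first free way and delegates victim selection to a caller-supplied function; `choose_l2_destination` and `pick_victim` are hypothetical names.

```python
from typing import Callable, List, Optional, Tuple


def choose_l2_destination(
    set_lines: List[Optional[int]],
    pick_victim: Callable[[List[Optional[int]]], int],
) -> Tuple[int, bool]:
    """Return (way, replaced) for the set that should hold the new data.

    set_lines[w] is the tag held in way w, or None if that way is free.
    """
    for way, tag in enumerate(set_lines):
        if tag is None:
            return way, False  # a free line exists, so no replacement is needed
    # The set is full: an L2 replacement selects the line to overwrite.
    return pick_victim(set_lines), True
```

Passing the victim policy in as a function keeps the sketch neutral about the replacement algorithm, which this part of the description leaves open.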

The storage destination determined in this way is the target of the processing requested by the access request from the core. The L1 attribute information acquisition unit 214 therefore acquires, as information for identifying the lock target, the L2 index and the L2 way that specify the determined storage destination. This acquisition may be performed either before or after the data is acquired from the main memory 400.

In searching for the entry of the L2 tag 221, the L1 attribute information acquisition unit 214 may also acquire the L1 index as information for identifying the lock target. When the L1 index is contained in the L2 index, the L1 attribute information acquisition unit 214 may derive the L1 index from the acquired L2 index. Alternatively, it may derive the L1 index from the logical address lower bits A1 contained in the entry of the L2 tag 221.

As described above, the secondary cache 200 can search for an entry of the L1 tag copy 223 by using the result of the search of the L2 tag 221. When the corresponding entry is found in the L1 tag copy 223, the L1 attribute information acquisition unit 214 acquires, as information for identifying the lock target, the L1 way of the line corresponding to that entry.

The L1 tag copy 223 is provided for each core and for each data type. The L1 attribute information acquisition unit 214 may therefore acquire from the L1 tag copy 223, as information for identifying the lock target, information indicating the core that issued the access request and information indicating the data type.

The L1 attribute information acquisition unit 214 can also identify the type of data for the access request from the request opcode, which can be identified from the information contained in the access request from the core. Furthermore, it can identify the L1 index from the request address, which can likewise be identified from the information contained in the access request. By identifying these, the L1 attribute information acquisition unit 214 may acquire the information indicating the data type and the L1 index as information for identifying the lock target.

On the other hand, when the processing instruction for the access request from the core is a replacement order, the information for identifying the lock target is acquired as shown in FIG. 3B. For example, the L1 index of the line to be replaced is identified from the request address or from the logical address lower bits A1 stored in the entry of the L2 tag 221. The L1 way of the line to be replaced is identified from the request way. In this way, the L1 attribute information acquisition unit 214 can acquire the L1 index and the L1 way that indicate the lock target.

The L1 attribute information acquisition unit 214 also refers to the entry of the L1 tag copy 223 indicated by this L1 index and L1 way. From the referenced entry, it can acquire the L2 index and the L2 way that indicate the line in the secondary cache 200 corresponding to the line to be replaced in the primary cache.

The L1 tag copy 223 has entries for each core and for each data type. By referring to the L1 tag copy 223, the L1 attribute information acquisition unit 214 may therefore acquire, as information indicating the lock target, information indicating the core to be locked and information indicating the data type. The L1 attribute information acquisition unit 214 may also acquire the L1 index from the L1 tag copy 223 as information for identifying the lock target.
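The lookup of the L1 tag copy for a replacement order may be pictured with the following sketch. Modeling the L1 tag copy as a dictionary keyed by (core, data type, L1 index, L1 way) is an assumption made for illustration and is not the disclosed circuit; `l2_location_of_l1_line` is a hypothetical name.

```python
from typing import Dict, Optional, Tuple

# Key: (core, data type, L1 index, L1 way); value: (L2 index, L2 way).
L1TagCopy = Dict[Tuple[int, str, int, int], Tuple[int, int]]


def l2_location_of_l1_line(
    l1_tag_copy: L1TagCopy,
    core: int,
    data_type: str,
    l1_index: int,
    l1_way: int,
) -> Optional[Tuple[int, int]]:
    """Return the (L2 index, L2 way) backing an L1 line, or None if absent."""
    return l1_tag_copy.get((core, data_type, l1_index, l1_way))
```

For example, with `copy = {(0, "OP", 5, 1): (37, 9)}`, the call `l2_location_of_l1_line(copy, 0, "OP", 5, 1)` yields the L2 index and L2 way of the line that backs the L1 line being replaced.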

Furthermore, the L1 attribute information acquisition unit 214 may acquire the information indicating the core to be locked from the request core, the information indicating the data type from the request opcode, and the L1 index from the request address.

Finally, the lock target information acquired in this way, which contains the core to be locked, the data type, the L2 index, the L2 way, the L1 index, and the L1 way, is held by the address holding unit 215. In the present embodiment, the target indicated by the lock target information is registered as a lock target in the address lock control unit 213 for as long as that information is held in the address holding unit 215.
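As a minimal sketch of this registration, the lock target information can be pictured as a record with the six items listed above, held in a container for the lifetime of the lock. The class and method names (`LockTarget`, `AddressHoldingUnit`, `register`, `release`) are hypothetical, and the `release` operation is an assumption added only to show that registration lasts while the record is held.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LockTarget:
    core: int        # core to be locked
    data_type: str   # "IF" (instruction) or "OP" (operand)
    l2_index: int
    l2_way: int
    l1_index: int
    l1_way: int


class AddressHoldingUnit:
    """Holds lock target information; a target is locked while it is held."""

    def __init__(self) -> None:
        self.entries: List[LockTarget] = []

    def register(self, target: LockTarget) -> None:
        self.entries.append(target)

    def release(self, target: LockTarget) -> None:
        self.entries.remove(target)  # assumed unlock step, not described here
```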

FIG. 5 illustrates the operation at this time. As shown in FIG. 5, the data specified by the replacement request address differs from the data specified by the replacement target address. Therefore, the L1 tag copy 223 is searched for the corresponding entry in order to identify the line in the secondary cache that stores the data specified by the replacement target address. In FIG. 5, the entry of the L2 tag 221 labeled "hit" corresponds to the line that stores the data specified by the replacement request address, and the entry of the L2 tag 221 labeled "victim" corresponds to the line that stores the data specified by the replacement target address.

In the operation example shown in FIG. 3B, the L2 tag 221 is searched before the L1 tag copy 223 in order to resolve the synonym problem for the replacement request address (see Patent Document 2). The rest of the operation is the same as for a normal order, and its description is omitted. In the present embodiment, the core and the data type are additionally taken into account when identifying the entry in the L1 tag copy 223. FIG. 4 is a simplified diagram that combines the operation examples of FIGS. 3A and 3B into one. In this way, the address to be locked is registered in the address lock control unit 213.

<Lock check>
Next, the lock check performed by the address lock control unit 213 according to the present embodiment is described with reference to FIGS. 6A, 6B, and 7. FIG. 6A illustrates the lock check when the processing instruction for the access request from the core is a normal order. FIG. 6B illustrates the lock check when the processing instruction for the access request from the core is a replacement order. FIG. 7 summarizes the lock checks shown in FIGS. 6A and 6B.

The address lock control unit 213 performs the lock check by determining whether the target of the processing for the access request from the core is registered as a lock target. The lock check differs depending on whether the processing instruction for the access request is a normal order or a replacement order.

When the processing instruction for the access request from the core is a normal order, the lock check is performed as shown in FIG. 6A. Specifically, the L2 tag 221 is searched for the corresponding entry on the basis of the request address. When the data specified by the request address is stored in the secondary cache 200, the search finds the entry of the L2 tag 221 that corresponds to the request address. In this case, the lock determination unit 216 refers to the lock target information held in the address holding unit 215 and searches for lock target information (an entry) that matches the L2 index and the L2 way identified by that entry of the L2 tag 221. When no entry of the L2 tag 221 corresponding to the request address is found, the search may instead look for lock target information that matches the L2 index and the L2 way of the line determined as the storage destination as described above.

When the search finds matching lock target information, the lock determination unit 216 determines that the target of the access request from the core is locked. When no matching lock target information is found, the lock determination unit 216 determines that the target of the access request is not locked. The lock determination unit 216 then passes the determination result to the retry control unit 212, which controls the retry of the processing for the access request according to that result.
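The normal-order lock check can be sketched as below, under simplifying assumptions: the L2 tag is modeled as a dictionary from request address to (L2 index, L2 way), the registered lock targets are modeled as a set of such pairs, and an L2 hit is assumed. The name `is_locked_normal_order` is hypothetical.

```python
from typing import Dict, Set, Tuple

L2Location = Tuple[int, int]  # (L2 index, L2 way)


def is_locked_normal_order(
    request_address: int,
    l2_tag: Dict[int, L2Location],
    locked_l2: Set[L2Location],
) -> bool:
    """Lock check for a normal order: L2 tag search, then compare L2 keys."""
    location = l2_tag[request_address]  # L2 tag search (a hit is assumed)
    return location in locked_l2        # True means the request must be retried
```

In this sketch, a True result stands for the determination that would be passed to the retry control unit 212.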

When the processing instruction for the access request from the core is a replacement order, the lock check is performed as shown in FIG. 6B. Specifically, the lock determination unit 216 searches for lock target information that matches the L1 index identified from the request address and the L1 way identified from the request way. The L1 index and the L1 way used in this lock check are those of the line to be replaced in the primary cache. When the request address does not contain the L1 index, the L1 index identified from the entry found in the L2 tag 221 may be used instead.

Thus, in the present embodiment, unlike the conventional method, the lock check for a replacement order requires no search of the cache tags and can be performed using only the information contained in the access request from the core. The present embodiment therefore reduces the latency of the lock check for a replacement order compared with the conventional method.
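For illustration, the replacement-order check reduces to a comparison on values carried by the request itself. In the sketch below, the L1 index is taken as an input because the bit positions that derive it from the request address are not given here; `is_locked_replacement_order` is a hypothetical name and the set of (L1 index, L1 way) pairs is an assumed model of the registered lock targets.

```python
from typing import Set, Tuple

L1Location = Tuple[int, int]  # (L1 index, L1 way)


def is_locked_replacement_order(
    l1_index: int,
    request_way: int,
    locked_l1: Set[L1Location],
) -> bool:
    """Lock check for a replacement order.

    No L2 tag or L1 tag copy search is performed: both keys come directly
    from the access request.
    """
    return (l1_index, request_way) in locked_l1
```

Because the function takes no tag structure as an argument, the sketch makes the absence of a tag search visible in its signature.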

The present embodiment illustrates lock checks that use either the pair of the L2 index and the L2 way or the pair of the L1 index and the L1 way. These are only examples; any combination of the information contained in the lock target information may be used for the lock check. For example, the information indicating the core or the information indicating the data type, neither of which is used for the lock check in the present embodiment, may be used. The information used for the lock check may be selected as appropriate according to the nature of the instruction being processed.

FIG. 7 is a simplified diagram that combines the operation examples of FIGS. 6A and 6B into one. The lock determination unit 216 performs the lock check in this way. FIG. 8 is a simplified diagram that combines the registration of an address to be locked and the lock check described so far. The address lock control unit 213 according to the present embodiment registers lock targets and performs lock checks in this way.

<Circuit example>
Next, a circuit example of the address lock control unit 213 according to the present embodiment is described with reference to FIG. 9, which illustrates the circuit of the address lock control unit 213. As shown in FIG. 9, the address lock control unit 213 according to the present embodiment includes a match circuit 230, an entry selection circuit 231, a set/reset circuit 232, an input selection circuit 233, and a holding circuit 234. The match circuit 230, the entry selection circuit 231, the set/reset circuit 232, and the holding circuit 234 are provided for each core.

The match circuit 230 determines whether the information indicating the lock target stored in each of the four entries held by the holding circuit 234, described later, matches the information indicating the processing target obtained from the information flowing through the pipeline and from the L2 tag 221. The match circuit 230 then sends the determination result down the pipeline. The information indicating the lock target is, as described above, the information indicating the core, the information indicating the data type, the L2 index, the L2 way, the L1 index, the L1 way, and so on. When the processing instruction for the access request is a normal order, the information indicating the processing target is the pair of the L2 index and the L2 way. When the processing instruction for the access request is a replacement order, the information indicating the processing target is the pair of the L1 index and the L1 way. The match circuit 230 corresponds to the lock determination unit 216 of the present embodiment.

The entry selection circuit 231 indicates, from among the free entries held by the holding circuit 234, the entry in which data will be registered next. When there are multiple free entries, the entry selection circuit 231 chooses the next entry in any suitable manner. It then outputs a signal indicating the chosen entry.

The set/reset circuit 232 updates the status of each entry held by the holding circuit 234. The status expresses, for example, "valid" as "1" and "invalid" as "0".
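The interplay between entry selection and the status bits may be sketched as follows. The description leaves the selection policy open, so choosing the lowest-numbered free entry is an assumption; the function names are hypothetical, and the four-entry status list mirrors the four entries illustrated in FIG. 9.

```python
from typing import List, Optional

VALID, INVALID = 1, 0


def select_free_entry(status: List[int]) -> Optional[int]:
    """Return the index of a free (invalid) entry, or None if all are in use."""
    for index, bit in enumerate(status):
        if bit == INVALID:
            return index  # lowest free index: an assumed policy
    return None


def set_entry(status: List[int], index: int) -> None:
    status[index] = VALID    # entry now holds lock target information


def reset_entry(status: List[int], index: int) -> None:
    status[index] = INVALID  # entry is free again
```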

The input selection circuit 233 selects, from the information flowing through the pipeline, the information to be stored in the holding circuit 234, described later. The information stored in the holding circuit 234 is the lock target information. The input selection circuit 233 selects the information that makes up the lock target information from the information flowing through the pipeline and passes the selected information to the holding circuit 234. In this way, the input selection circuit 233 acquires the information indicating the lock target. The input selection circuit 233 corresponds to the L1 attribute information acquisition unit 214 of the present embodiment.

The information flowing through the pipeline includes the request core, the request address, the request opcode, and the request way described above, as well as the search result of the L2 tag 221 and the search result of the L1 tag copy 223. From this information, the input selection circuit 233 selects the information for identifying the lock target as described with reference to FIGS. 3A and 3B and passes the selected information to the holding circuit 234.

As described later, the entries that store lock target information in the holding circuit 234 shown in FIG. 9 have no field for information indicating the core to be locked. The core to be locked is identified by the fact that the entries are provided separately for each core. Therefore, in the circuit example shown in FIG. 9, unlike the examples shown in FIGS. 3A and 3B, the input selection circuit 233 does not acquire information indicating the core to be locked.

The input selection circuit 233 may be placed at any suitable position on the pipeline. For example, it is placed at a position on the pipeline where the search result of the L2 tag 221 and the search result of the L1 tag copy 223 are available.

The holding circuit 234 stores the lock target information obtained from the input selection circuit 233 in an entry. FIG. 9 illustrates four entries. As shown in FIG. 9, each entry held by the holding circuit 234 includes a status field, an L2 index field, an IF status field, an IF way field, an OP status field, and an OP way field.

The status field stores, for example, one bit of information indicating whether the entry is valid. For example, "1" is stored in the status field when the entry is valid, and "0" is stored when the entry is invalid.

The L2 index field stores, for example, an 11-bit L2 index. The L2 way field stores, for example, a 5-bit L2 way. The IF way field and the OP way field each store, for example, a 2-bit L1 way.

The IF status field stores, for example, one bit of information indicating whether the IF way field of the entry is valid. When an L1 way is stored in the IF way field, "1", indicating valid, is stored in the IF status field.

Similarly, the OP status field stores, for example, one bit of information indicating whether the OP way field of the entry is valid. When an L1 way is stored in the OP way field, "1", indicating valid, is stored in the OP status field.
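Using the example widths given above (1-bit status, 11-bit L2 index, 5-bit L2 way, 1-bit IF status, 2-bit IF way, 1-bit OP status, 2-bit OP way), one entry occupies 23 bits. The following sketch packs and unpacks such an entry; the field order within the word and the names `FIELDS`, `pack_entry`, and `unpack_entry` are assumptions, since the description gives only the widths.

```python
from typing import Dict

# (field name, width in bits); the ordering is assumed for illustration.
FIELDS = [
    ("status", 1),
    ("l2_index", 11),
    ("l2_way", 5),
    ("if_status", 1),
    ("if_way", 2),
    ("op_status", 1),
    ("op_way", 2),
]
ENTRY_BITS = sum(width for _, width in FIELDS)


def pack_entry(values: Dict[str, int]) -> int:
    """Pack field values into one integer, first field in the top bits."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_entry(word: int) -> Dict[str, int]:
    """Inverse of pack_entry."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values
```

The range check in `pack_entry` reflects the stated widths: for instance, an L2 way of 32 would not fit in the 5-bit L2 way field.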

The entries held in the holding circuit 234 correspond to the address holding unit 215 of the present embodiment. The holding circuit 234 obtains the lock target information from the pipeline, the L2 tag 221, and the L1 tag copy 223. The circuitry involved in this acquisition corresponds to the L1 attribute information acquisition unit 214 of the present embodiment.

<Specific example>
Next, an operation example of the address lock control unit 213 according to the present embodiment in a specific scenario will be described with reference to FIGS. 10 and 11. In FIGS. 10 and 11, "step" is abbreviated as "S".

In the specific scenario illustrated in FIGS. 10 and 11, it is assumed for simplicity that no cache miss occurs in the secondary cache. However, the present embodiment is not limited to such a scenario, and a cache miss may occur in the secondary cache. When a cache miss occurs, data is fetched from the main memory 400, and if a replacement is required, processing related to L2 replacement is executed.

図１０は、第２コアがアドレス(Ａ)により指定されるデータを保持している場合における動作例を示す。なお、図１０により示される場面では、ロック対象が登録されていない状態から処理が開始している。 FIG. 10 shows an operation example when the second core holds data designated by the address (A). In the scene shown in FIG. 10, the process starts from a state in which the lock target is not registered.

ステップ１０００では、第１コアによって、アドレス(A)により指定されるデータのストア命令が発行される（ST(A)）。ストア命令が発行されると、２次キャッシュのアドレスロック制御部は、当該発行されたストア命令が実行可能か否か、ロックチェックする。当該ロックチェックは、図６Ａにより例示される。この時点においては、ロック対象は登録されていないので、第１コアによって発行されたストア命令はロックされない。よって、２次キャッシュでは、当該ストア命令は実行に移される。 In step 1000, the first core issues a store instruction for data specified by the address (A) (ST (A)). When a store instruction is issued, the address lock control unit of the secondary cache performs a lock check to determine whether or not the issued store instruction can be executed. The lock check is illustrated by FIG. 6A. At this time, since the lock target is not registered, the store instruction issued by the first core is not locked. Therefore, in the secondary cache, the store instruction is executed.

ステップ１００１では、実行に移されたストア命令に基づいたロック対象を登録する。当該登録処理は、図３Ａにより例示される。この時、少なくとも、ロック対象を示す情報として、Ｌ２インデックス、Ｌ２ウェイ、及び、Ｌ１ウェイが登録される。 In step 1001, a lock target is registered based on the store instruction transferred to execution. The registration process is illustrated by FIG. 3A. At this time, at least the L2 index, the L2 way, and the L1 way are registered as information indicating the lock target.

In step 1002, to maintain coherency between the primary cache and the secondary cache, the request processing unit of the secondary cache requests the primary cache of the second core to invalidate or write back the data specified by address (A). The request processing unit of the secondary cache detects that the second core holds the data by referring to the L1 tag copy in the secondary cache. In the second core, the L1 cache control unit refers to the status in the cache tag corresponding to the line that stores the data specified by address (A), and thereby determines whether the data has been updated (dirty). If the data has been updated, the second core writes the data back in response to the access request from the request processing unit of the secondary cache. If the data has not been updated, the second core invalidates the data.
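The dirty check in step 1002 can be sketched as follows. This is a minimal Python illustration, not the circuit described in the specification: the `L1Line` type and the `handle_coherence_request` function are hypothetical names, and invalidating the line after a write-back is an assumption of this sketch that the text does not state.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class L1Line:
    tag: int
    data: bytes
    valid: bool = True
    dirty: bool = False  # True when the line has been updated in the L1 cache


def handle_coherence_request(line: L1Line) -> Tuple[str, Optional[bytes]]:
    """Respond to an invalidate-or-write-back request from the secondary cache.

    Returns the action taken and, for a write-back, the data sent to L2.
    """
    if not line.valid:
        return ("none", None)
    if line.dirty:
        data = line.data
        line.dirty = False
        line.valid = False  # assumption: the line is dropped after write-back
        return ("write_back", data)
    line.valid = False
    return ("invalidate", None)
```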

ステップ１００３では、第３コアによって、アドレス(A)により指定されるデータのロード命令が発行される（LD(A)）。この時点では、ステップ１００１において登録されたロックは未だ解除されていないものとする。つまり、ステップ１００３は、第１コアが発行したストア命令が実行されている間に発生した処理である。ロード命令が発行されると、２次キャッシュのアドレスロック制御部は、当該発行されたロード命令が実行可能か否か、ロックチェックする。当該ロックチェックは、図６Ａにより示される。図６Ａに示されるように、この場合、Ｌ２インデックス及びＬ２ウェイを用いたロックチェックが行われる。 In step 1003, the third core issues a data load instruction designated by the address (A) (LD (A)). At this point, it is assumed that the lock registered in step 1001 has not yet been released. That is, step 1003 is a process that occurs while the store instruction issued by the first core is being executed. When a load instruction is issued, the address lock control unit of the secondary cache performs a lock check to determine whether or not the issued load instruction can be executed. The lock check is illustrated by FIG. 6A. As shown in FIG. 6A, in this case, a lock check using the L2 index and the L2 way is performed.

Here, the store instruction issued by the first core in step 1000 and this load instruction both target the same address (A). The L2 index and L2 way specified by the load instruction therefore match the L2 index and L2 way registered as the lock target. Consequently, execution of the load instruction is canceled by the retry control unit and retried. The retry is repeated until the lock registered in step 1001 is released.
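The sequence of FIG. 10 can be traced with a toy lock table keyed by the L2 index and L2 way. This Python sketch is only an illustration under assumptions: the `AddressLockTable` class, its method names, and the concrete index and way values are hypothetical, and the real lock check is performed by hardware in the address lock control unit.

```python
from typing import List, Set, Tuple


class AddressLockTable:
    """Toy lock table keyed by (L2 index, L2 way)."""

    def __init__(self) -> None:
        self._locked: Set[Tuple[int, int]] = set()

    def register(self, l2_index: int, l2_way: int) -> None:
        self._locked.add((l2_index, l2_way))

    def release(self, l2_index: int, l2_way: int) -> None:
        self._locked.discard((l2_index, l2_way))

    def check(self, l2_index: int, l2_way: int) -> str:
        """Return "retry" if the target is locked, otherwise "execute"."""
        return "retry" if (l2_index, l2_way) in self._locked else "execute"


def fig10_scenario() -> List[str]:
    table = AddressLockTable()
    addr_a = (0x123, 4)  # hypothetical L2 index and L2 way for address (A)
    results = []
    results.append(table.check(*addr_a))  # S1000: ST(A) from the first core
    table.register(*addr_a)               # S1001: register the lock target
    results.append(table.check(*addr_a))  # S1003: LD(A) from the third core
    table.release(*addr_a)                # store completes, lock is released
    results.append(table.check(*addr_a))  # retried LD(A)
    return results
```

Calling `fig10_scenario()` yields `["execute", "retry", "execute"]`: the store passes the lock check, the load is retried while the lock is held, and the retried load proceeds once the lock is released.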

図１１は、第２コアがアドレス(B)により指定されるデータを保持している場合における動作例を示す。なお、図１１により示される場面においても、ロック対象が登録されていない状態から処理が開始している。 FIG. 11 shows an operation example when the second core holds data designated by the address (B). In the scene shown in FIG. 11 as well, the process starts from a state in which no lock target is registered.

ステップ２０００では、第１コアによって、アドレス(B)により指定されるデータのストア命令が発行される（ST(B)）。ストア命令が発行されると、２次キャッシュのアドレスロック制御部は、当該発行されたストア命令が実行可能か否か、ロックチェックする。当該ロックチェックは、図６Ａにより例示される。この時点においては、ロック対象は登録されていないので、第１コアによって発行されたストア命令はロックされない。よって、当該ストア命令は実行に移される。 In step 2000, the first core issues a data store instruction designated by the address (B) (ST (B)). When a store instruction is issued, the address lock control unit of the secondary cache performs a lock check to determine whether or not the issued store instruction can be executed. The lock check is illustrated by FIG. 6A. At this time, since the lock target is not registered, the store instruction issued by the first core is not locked. Therefore, the store instruction is executed.

次に、ステップ２００１では、実行に移されたストア命令に基づいたロック対象を登録する。当該登録処理は、図３Ａにより例示される。この時、少なくとも、ロック対象を示す情報として、Ｌ２インデックス、Ｌ２ウェイ、及び、Ｌ１ウェイが登録される。 Next, in step 2001, the lock target based on the store instruction transferred to execution is registered. The registration process is illustrated by FIG. 3A. At this time, at least the L2 index, the L2 way, and the L1 way are registered as information indicating the lock target.

In step 2002, to maintain coherency between the primary cache and the secondary cache, the request processing unit of the secondary cache requests the primary cache of the second core to invalidate or write back the data specified by address (B). This is the same as the operation shown in FIG. 10, so its description is omitted.

In step 2003, while the store instruction (ST(B)) issued by the first core is being executed, the second core issues a load instruction for the data specified by address (A) (LD(A)). Here, address (A) and address (B) are assumed to have the same L1 index. It is further assumed that the set specified by that L1 index has no free line, so an L1 replacement occurs, and that the line storing the data specified by address (B) is selected as the line to be replaced.

この場合、当該ロード命令は、Ｌ１リプレースに基づくアクセス要求に係る処理の命令である。当該ロード命令が発行されると、２次キャッシュのアドレスロック制御部は、Ｌ１リプレースを伴う当該ロード命令が実行可能か否か、ロックチェックする。当該ロックチェックは、図６Ｂにより示される。図６Ｂにより示されるように、この場合、コアからのリクエストに含まれるＬ１インデックス及びＬ１ウェイを用いたロックチェックが行われる。 In this case, the load instruction is a process instruction related to the access request based on the L1 replacement. When the load instruction is issued, the address lock control unit of the secondary cache performs a lock check to determine whether or not the load instruction with L1 replacement can be executed. The lock check is illustrated by FIG. 6B. As shown in FIG. 6B, in this case, a lock check is performed using the L1 index and the L1 way included in the request from the core.

ここで、ステップ２００１において、ロック対象を示す情報として、アドレス(B)により指定されるデータを格納する第２コアのラインのＬ１インデックス及びＬ１ウェイが登録されている。そして、アドレス(B)により指定されるデータを格納する第２コアのラインは、ステップ２００３で発生したリプレース対象のラインである。そのため、ロック対象を示す情報として登録されたＬ１インデックス及びＬ１ウェイと、ステップ２００３で発生したリプレース対象のＬ１インデックス及びＬ１ウェイとは一致する。よって、当該Ｌ１リプレースを伴うロード命令は、リトライ制御部によって、取り消され、再試行させられる。このようなリトライは、ステップ２００１において登録されたロックが解錠されるまで続く。 Here, in step 2001, the L1 index and the L1 way of the second core line storing the data specified by the address (B) are registered as information indicating the lock target. The second core line that stores the data specified by the address (B) is the line to be replaced generated in step 2003. For this reason, the L1 index and L1 way registered as information indicating the lock target coincide with the L1 index and L1 way to be replaced generated in step 2003. Therefore, the load instruction accompanied with the L1 replacement is canceled and retried by the retry control unit. Such retry continues until the lock registered in step 2001 is unlocked.
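The lock check for an L1 replacement in FIG. 11 compares the L1 index and L1 way carried in the request against the registered lock target. The Python sketch below illustrates that comparison under assumptions: the `LockTarget` type and `replacement_lock_check` function are hypothetical, the numeric values are made up, and recording the core number in the entry follows the additional information mentioned in §4 rather than FIG. 6B itself.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LockTarget:
    l2_index: int
    l2_way: int
    core: int      # core whose L1 cache holds the line (assumed to be recorded)
    l1_index: int
    l1_way: int


def replacement_lock_check(locks: List[LockTarget], core: int,
                           l1_index: int, l1_way: int) -> str:
    """Check an L1 replacement request using only L1-side information.

    Returns "retry" if the line to be replaced is a registered lock target,
    otherwise "execute". No L1 tag copy lookup is needed for the comparison.
    """
    for target in locks:
        if (target.core, target.l1_index, target.l1_way) == (core, l1_index, l1_way):
            return "retry"
    return "execute"
```

For example, if step 2001 registered the second core's line at L1 index `0x15`, way 1, a replacement request from the second core for that same index and way returns `"retry"`, while a request that replaces a different way of the same set returns `"execute"`.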

§4 Actions and effects according to this embodiment
Finally, the actions and effects of this embodiment will be described with reference to FIGS. 12A, 12B, 13A, and 13B.

図１２Ａは、従来の方法によりロックしていた範囲を例示する。図１２Ｂは、本実施形態によりロックされる範囲を例示する。 FIG. 12A illustrates the range locked by the conventional method. FIG. 12B illustrates the range locked by this embodiment.

図１２Ａにより示されるとおり、フルサイズのアドレス、又は、Ｌ２インデックス及びＬ２ウェイの組でロック対象を指定した場合、２次キャッシュ上の該当ラインがロックされる。よって、２次キャッシュ上の当該ラインに対するアクセスは全てロックされる。これに対して、図１２Ｂにより示されるとおり、従来のアドレス情報に加えて、Ｌ１ウェイ、コアを示す情報、及び、データの種別情報等を更に追加して、ロック対象を指定した場合、例えば、１次キャッシュ上の該当ラインがロックされる。よって、その他のコアにおけるアクセスについてはロックされない。 As shown in FIG. 12A, when a lock target is specified by a full-size address or a set of L2 index and L2 way, the corresponding line on the secondary cache is locked. Therefore, all accesses to the line on the secondary cache are locked. On the other hand, as shown in FIG. 12B, in addition to the conventional address information, information indicating the L1 way, core, data type information, and the like are further added, and the lock target is designated. The corresponding line on the primary cache is locked. Therefore, access in other cores is not locked.

Therefore, compared with the conventional method, the present embodiment can set the lock target range at a finer granularity. This increases the opportunity for parallel processing compared with the conventional method.

図１３Ａ及び１３Ｂを用いて、並列処理の可能性が高まることを説明する。図１３Ａ及び図１３Ｂは、第１コア及び第２コアにおいて、アドレスＡにより指定されるデータを格納するラインについてＬ１リプレースが発生した場合におけるロックとリプレースとの関係を例示する。図１３Ａは、従来の方法におけるロックとリプレースとの関係を例示する。また、図１３Ｂは、本実施形態におけるロックとリプレースとの関係を例示する。 The increase in the possibility of parallel processing will be described with reference to FIGS. 13A and 13B. FIG. 13A and FIG. 13B illustrate the relationship between lock and replacement when L1 replacement occurs in the first core and the second core for the line storing the data specified by the address A. FIG. 13A illustrates the relationship between lock and replacement in the conventional method. FIG. 13B illustrates the relationship between lock and replacement in the present embodiment.

従来の方法では、フルサイズのアドレス、又は、Ｌ２インデックス及びＬ２ウェイの組によりロック対象が表現される。ここで、第１コア及び第２コアで生じるリプレースの対象は共にアドレスＡである。よって、例えば、第２コアにおけるＬ１リプレースが先に処理されたとすると、当該第２コアにおけるＬ１リプレースに基づいて登録されたロックによって、第１コアにおけるＬ１リプレースが処理できなくなる。 In the conventional method, a lock target is expressed by a full-size address or a set of an L2 index and an L2 way. Here, both the replacement targets generated in the first core and the second core are the address A. Therefore, for example, if the L1 replacement in the second core is processed first, the L1 replacement in the first core cannot be processed due to the lock registered based on the L1 replacement in the second core.

In contrast, in the present embodiment, the lock target is expressed by the L1 index, the L1 way, information indicating the core, information indicating the data type, and the like. The L1 replacement in the first core can therefore be excluded from the lock registered for the L1 replacement in the second core. Thus, in the present embodiment, processing related to the L1 replacement in the first core can be executed even while processing related to the L1 replacement in the second core is in progress. Compared with the conventional method, the present embodiment therefore increases the opportunity for parallel processing.
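The difference between FIGS. 13A and 13B comes down to which fields are compared. The following Python sketch contrasts the two comparisons. The `Replacement` type, the function names, and the field values are assumptions chosen for illustration, and the `kind` field stands in for the data type information (instruction or operand).

```python
from typing import NamedTuple


class Replacement(NamedTuple):
    core: int
    l1_index: int
    l1_way: int
    kind: str      # "IF" (instruction) or "OP" (operand); assumed encoding
    l2_index: int
    l2_way: int


def conventional_conflict(locked: Replacement, req: Replacement) -> bool:
    """Conventional method: compare the L2 index and L2 way only."""
    return (locked.l2_index, locked.l2_way) == (req.l2_index, req.l2_way)


def fine_grained_conflict(locked: Replacement, req: Replacement) -> bool:
    """Present embodiment: compare core, L1 index, L1 way, and data type."""
    return (locked.core, locked.l1_index, locked.l1_way, locked.kind) == (
        req.core, req.l1_index, req.l1_way, req.kind)


# Both cores replace a line holding address A, so the L2 side is identical.
core2 = Replacement(core=2, l1_index=0x15, l1_way=1, kind="OP", l2_index=0x2A, l2_way=3)
core1 = Replacement(core=1, l1_index=0x15, l1_way=2, kind="OP", l2_index=0x2A, l2_way=3)
```

With the second core's replacement registered first, `conventional_conflict(core2, core1)` is true, so the first core must wait, whereas `fine_grained_conflict(core2, core1)` is false and the first core's replacement can proceed in parallel.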

Furthermore, in the conventional method, the lock check for an address targeted by L1 replacement was performed only after searching the L1 tag copy. In contrast, in the present embodiment, the lock check for such an address can be performed using the L1 index and L1 way included in the request. In other words, the search of the L1 tag copy can be omitted from the lock check. The present embodiment therefore reduces the delay incurred by the lock check for an address targeted by L1 replacement.
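The saved lookup can be made visible by counting searches of the L1 tag copy in each approach. This Python sketch is a rough model under assumptions: the `L1TagCopy` class, the key layout `(core, L1 index, L1 way)`, and both check functions are hypothetical and do not model timing, only the number of tag copy searches on the lock check path.

```python
from typing import Dict, Set, Tuple

L1Key = Tuple[int, int, int]  # (core, L1 index, L1 way)
L2Key = Tuple[int, int]       # (L2 index, L2 way)


class L1TagCopy:
    """Toy L1 tag copy that maps an L1 line to its L2 index and L2 way."""

    def __init__(self, mapping: Dict[L1Key, L2Key]) -> None:
        self._mapping = dict(mapping)
        self.lookups = 0  # number of searches performed so far

    def search(self, key: L1Key) -> L2Key:
        self.lookups += 1
        return self._mapping[key]


def conventional_check(tag_copy: L1TagCopy, locked_l2: Set[L2Key], req: L1Key) -> bool:
    """Conventional path: search the L1 tag copy first, then compare L2 keys."""
    return tag_copy.search(req) in locked_l2


def direct_check(locked_l1: Set[L1Key], req: L1Key) -> bool:
    """Present embodiment: compare the request's L1 information directly."""
    return req in locked_l1
```

In this model the conventional path performs one tag copy search per lock check, while the direct path performs none, which corresponds to the delay reduction described above.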

なお、図３Ａに示されるとおり、通常のオーダーにおいてロック対象のアドレスを登録する場合、Ｌ１タグコピーを検索する分遅延が生じる。しかしながら、ロック対象のアドレスを登録する処理は、ロックに係るアクセス要求の実行と並列で行われる処理である。そのため、当該遅延は、処理全体の処理速度にはほとんど影響を及ぼさない。これに対して、ロックチェックの処理は、ロックに係るアクセス要求の実行前に、実行される。そのため、ロックチェックの遅延は、処理全体の処理速度にそのまま影響を及ぼす。よって、本実施形態によれば、処理全体の処理速度も改善される。 As shown in FIG. 3A, when registering an address to be locked in a normal order, a delay is caused by searching for an L1 tag copy. However, the process of registering the lock target address is a process performed in parallel with the execution of the access request related to the lock. For this reason, the delay hardly affects the processing speed of the entire processing. On the other hand, the lock check process is executed before the execution of the access request related to the lock. For this reason, the delay of the lock check directly affects the processing speed of the entire process. Therefore, according to the present embodiment, the processing speed of the entire process is also improved.

100, 110, 1m0 Processor core
101, 111, 1m1 Instruction control unit
102, 112, 1m2 Operation execution unit
103, 113, 1m3 Primary cache
104 L1 cache control unit
104a Address conversion unit
104b Request processing unit
105 L1 instruction cache
105a L1 tag
105b L1 data
105 L1 operand cache
106a L1 tag
106b L1 data
200 Secondary cache
210 L2 cache control unit
211 Request processing unit
212 Retry control unit
213 Address lock control unit
214 L1 attribute information acquisition unit
215 Address holding unit
216 Lock determination unit
220 L2 cache data part
221 L2 tag
222 L2 data
223 L1 tag copy
230 Match circuit
231 Entry selection circuit
232 Set/reset circuit
233 Input selection circuit
234 Holding circuit

On the other hand, when a lock check is executed for the replacement target address in an access request based on L1 replacement, the address lock control unit 831 acquires the L2 index and L2 way corresponding to the replacement target address from the L1 tag copy 820. Specifically, the address lock control unit 831 searches the L1 tag copy 820 for a line that matches the information included in the access request based on L1 replacement. In FIG. 16, the address (A) and the L1 way (A) are the information included in the access request based on L1 replacement. The address lock control unit 831 then acquires the L2 index and L2 way corresponding to the replacement target address from the line indicated by the search result. In FIG. 16, the L2 index (B) and the L2 way (B) are the L2 index and L2 way corresponding to the replacement target address.

Claims

An arithmetic processing device comprising:
a plurality of arithmetic processing units, each including a first cache memory, performing operations, and outputting access requests;
a second cache memory that is shared by the plurality of arithmetic processing units and holds data computed by each of the plurality of arithmetic processing units;
an acquisition unit that acquires attribute information related to the first cache memory of the access request source to be controlled, the attribute information including way information indicating the way of the cache block of the first cache memory to be controlled in the arithmetic processing unit that is the source of the access request to be controlled;
a holding unit that holds address information related to a control target address that specifies the cache block in the second cache memory targeted by the access request to be controlled, together with the acquired attribute information; and
a control unit that, based on a replacement target address included in an access request related to a replacement request issued by any of the plurality of arithmetic processing units with respect to the first cache memory and on way information indicating the way of the cache block to be replaced in the first cache memory, controls the access request related to the replacement request for the cache block of the second cache memory specified by the address information to be controlled and the attribute information.

The arithmetic processing device according to claim 1, wherein the acquisition unit further acquires at least one of information indicating the arithmetic processing unit that is the source of the access request to be controlled and information indicating the type of data targeted by the access request.

A control method for an arithmetic processing device that comprises a plurality of arithmetic processing units, each including a first cache memory, performing operations, and outputting access requests, and a second cache memory that holds data computed by each of the plurality of arithmetic processing units, the method comprising:
acquiring, by an acquisition unit of the arithmetic processing device, attribute information related to the first cache memory of the access request source to be controlled, the attribute information including way information indicating the way of the cache block of the first cache memory to be controlled in the arithmetic processing unit that is the source of the access request to be controlled;
holding, by a holding unit of the arithmetic processing device, address information related to a control target address that specifies the cache block in the second cache memory targeted by the access request to be controlled, together with the acquired attribute information; and
controlling, by a control unit of the arithmetic processing device, based on a replacement target address included in an access request related to a replacement request issued by any of the plurality of arithmetic processing units with respect to the first cache memory and on way information indicating the way of the cache block to be replaced in the first cache memory, the access request related to the replacement request for the cache block of the second cache memory specified by the address information to be controlled and the attribute information.

The control method for the arithmetic processing device according to claim 3, wherein the acquisition unit further acquires at least one of information indicating the arithmetic processing unit that is the source of the access request to be controlled and information indicating the type of data targeted by the access request.