JP5104139B2

JP5104139B2 - Cash system

Info

Publication number: JP5104139B2
Application number: JP2007232606A
Authority: JP
Inventors: 毅 ▲葛▼; 真一郎多湖
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2007-09-07
Filing date: 2007-09-07
Publication date: 2012-12-19
Anticipated expiration: 2027-09-07
Also published as: JP2009064308A

Description

本発明は、キャッシュシステムに関する。 The present invention relates to a cache system.

コンピュータシステムにおいては一般に、主記憶とは別に小容量で高速なキャッシュメモリが設けられる。主記憶に記憶される情報の一部をキャッシュにコピーしておくことで、この情報をアクセスする場合には主記憶からではなくキャッシュから読み出すことで、高速な情報の読み出しが可能となる。 In general, in a computer system, a small-capacity and high-speed cache memory is provided separately from the main memory. By copying a part of the information stored in the main memory to the cache, when accessing this information, it is possible to read the information at a high speed by reading from the cache instead of the main memory.

キャシュは複数のキャッシュラインを含み、主記憶からキャッシュへの情報のコピーはキャッシュライン単位で実行される。主記憶のメモリ空間はキャッシュライン単位で分割され、分割されたメモリ領域を順番にキャッシュラインに割当てておく。キャッシュの容量は主記憶の容量よりも小さいので、主記憶のメモリ領域を繰り返して同一のキャッシュラインに割当てることになる。 The cache includes a plurality of cache lines, and information is copied from the main memory to the cache in units of cache lines. The memory space of the main memory is divided in units of cache lines, and the divided memory areas are sequentially assigned to the cache lines. Since the capacity of the cache is smaller than the capacity of the main memory, the memory area of the main memory is repeatedly assigned to the same cache line.

メモリ空間上のあるアドレスに最初のアクセスが実行されると、そのアドレスの情報（データやプログラム）をキャシュ内の対応するキャッシュラインにコピーする。同一アドレスに対して次のアクセスを実行する場合にはキャシュから直接に情報を読み出す。一般に、アドレスの全ビットのうちで、所定数の下位ビットがキャッシュのインデックスとなり、それより上位に位置する残りのビットがキャッシュのタグとなる。 When the first access is executed to an address in the memory space, the information (data or program) at that address is copied to the corresponding cache line in the cache. When executing the next access to the same address, information is read directly from the cache. Generally, out of all bits of the address, a predetermined number of lower bits serve as a cache index, and the remaining bits positioned higher than that serve as a cache tag.

データをアクセスする場合には、アクセス先を示すアドレス中のインデックス部分を用いて、キャッシュ中の対応するインデックスのタグを読み出す。読み出したタグと、アドレス中のタグ部分のビットパターンが一致するか否かを判断する。一致しない場合にはキャッシュミスとなる。一致する場合には、キャッシュヒットとなり、当該インデックスに対応するキャッシュデータ（１キャッシュライン分の所定ビット数のデータ）がアクセスされる。 When accessing the data, the tag of the corresponding index in the cache is read using the index portion in the address indicating the access destination. It is determined whether or not the read tag matches the bit pattern of the tag portion in the address. If they do not match, a cache miss occurs. If they match, a cache hit occurs, and the cache data corresponding to the index (data of a predetermined number of bits for one cache line) is accessed.

各キャッシュラインに対して１つだけタグを設けたキャッシュの構成を、ダイレクトマッピング方式と呼ぶ。各キャッシュラインに対してＮ個のタグを設けたキャッシュの構成をＮウェイセットアソシアティブと呼ぶ。ダイレクトマッピング方式は１ウェイセットアソシアティブとみなすことができる。 A cache configuration in which only one tag is provided for each cache line is called a direct mapping method. A cache configuration in which N tags are provided for each cache line is called an N-way set associative. The direct mapping method can be regarded as a one-way set associative.

キャッシュミスが発生した場合に、主記憶にアクセスすることによるペナルティーを軽減するために、キャッシュメモリを多階層化したシステムが用いられる。例えば、１次キャッシュと主記憶との間に、主記憶よりは高速にアクセスできる２次キャッシュを設けることにより、１次キャッシュにおいてキャッシュミスが発生した場合に、主記憶にアクセスが必要になる頻度を低くして、キャッシュミス・ペナルティーを軽減することができる。 In order to reduce the penalty for accessing the main memory when a cache miss occurs, a system in which the cache memory is hierarchized is used. For example, by providing a secondary cache that can be accessed faster than the main memory between the primary cache and the main memory, the frequency at which access to the main memory is required when a cache miss occurs in the primary cache. Can be reduced to reduce the cache miss penalty.

従来、プロセッサにおいては動作周波数の向上やアーキテクチャの改良により処理速度を向上させてきた。しかし近年、周波数をこれ以上高くすることには技術的な限界が見え始めており、複数のプロセッサを用いたマルチプロセッサ構成により処理速度の向上を目指す動きが強くなっている。 Conventionally, in a processor, the processing speed has been improved by improving the operating frequency and improving the architecture. However, in recent years, a technical limit has begun to appear when the frequency is further increased, and there is a strong movement toward improving the processing speed by a multiprocessor configuration using a plurality of processors.

複数のプロセッサが存在するシステムは、各々がキャッシュを有する既存のシングルプロセッサコアを複数個設け、それらを単純に繋げることで実現可能である。このような構成は、設計コストを低く抑えることができるが、キャッシュの使用効率やキャッシュの一貫性に関して問題がある。 A system having a plurality of processors can be realized by providing a plurality of existing single processor cores each having a cache and simply connecting them. Such a configuration can keep the design cost low, but there are problems with cache usage efficiency and cache consistency.

下記の特許文献１には、ネットワークで相互に接続された複数のプロセッサエレメントの各々が、処理装置と、その処理装置で実行される命令列とその命令列で使用されるデータを保持するローカルメモリと、該ネットワークを介して他のプロセッサエレメントへデータを送信する送信ユニットと、該ネットワークを介して他のプロセッサエレメントから転送されたデータを受信する受信ユニットと、該処理装置が要求した、そのローカルメモリに保持されているデータまたはそのローカルメモリに記憶されるべきデータを一時的に保持するための第１のキャッシュメモリと、該受信ユニットにより受信され、該ローカルメモリに記憶されるべきデータを一時的に保持するための第２のキャッシュメモリと、該受信ユニットによりデータが受信されたことに応答して、そのデータを該第２のキャッシュメモリに書き込み、該処理装置からの、ローカルメモリ内データの読み出し要求に応答して、該第１、第２のキャッシュメモリの内、そのいずれか一方にその要求の対象とされるデータがあるか否かを検出し、いずれか一方にそのデータがあるときには、そのデータをその一方のキャッシュメモリから読み出すキャッシュアクセス制御手段とを有する並列計算機が記載されている。 In Patent Document 1 below, each of a plurality of processor elements connected to each other via a network stores a processing device, an instruction sequence executed by the processing device, and data used in the instruction sequence A transmitting unit for transmitting data to other processor elements via the network, a receiving unit for receiving data transferred from other processor elements via the network, and a local unit requested by the processing device. A first cache memory for temporarily holding data held in the memory or data to be stored in the local memory, and data received by the receiving unit and stored in the local memory temporarily The second cache memory for holding the data and the receiving unit In response to this, the data is written to the second cache memory, and in response to a read request for data in the local memory from the processing device, the first and second cache memories are It is detected whether or not there is data subject to the request in either one of them, and when there is the data in either one, it has a cache access control means for reading the data from the one cache memory The calculator is listed.

また、下記の特許文献２には、プロセッサとキャッシュメモリとを有する一つ以上のプロセッサモジュールと、前記プロセッサモジュールにより共有されるメインメモリを有するメモリモジュールと、前記一つ以上のプロセッサモジュールと前記メモリモジュールとを相互に接続する結合手段とを備えるマルチプロセッサシステムにおいて、前記プロセッサモジュール内のキャッシュメモリ間のデータのコヒーレンシを維持するキャッシュコヒーレンシ制御の際のデータの単位であるラインサイズが、前記結合手段を介してメインメモリのデータを転送する際のデータの単位である転送サイズのｎ倍（ｎは２以上の整数）であって、前記プロセッサモジュールは、前記ラインサイズの値を前記キャッシュコヒーレンシ制御の際のデータの単位としてキャッシュメモリを制御するキャッシュ制御手段を有し、前記プロセッサモジュールは、その中のプロセッサが発行したデータ要求が自己のキャッシュメモリでキャッシュミスを起こしたときには、前記プロセッサモジュールと前記メモリモジュールとにデータを要求するコヒーレント・リード要求を発行し、前記メモリモジュールのみにデータを要求するノン・コヒーレント・リード要求を、（ｎ−１）回だけ発行することを特徴とするデータ要求発行の制御手段を有することを特徴とするマルチプロセッサシステムが記載されている。 Patent Document 2 below discloses one or more processor modules having a processor and a cache memory, a memory module having a main memory shared by the processor modules, the one or more processor modules, and the memory. In a multiprocessor system comprising coupling means for interconnecting modules, a line size as a unit of data in cache coherency control for maintaining data coherency between cache memories in the processor module is the coupling means. N times the transfer size (n is an integer equal to or greater than 2), which is a unit of data when transferring data in the main memory via the processor, and the processor module sets the value of the line size to the value of the cache coherency control. Data unit Cache processor for controlling the cache memory, and when the data request issued by the processor in the processor module causes a cache miss in its own cache memory, the processor module and the memory module A data request issuing control means for issuing a coherent read request for requesting data and issuing a non-coherent read request for requesting data only to the memory module only (n-1) times; A multiprocessor system characterized by having a multiprocessor system is described.

また、下記の特許文献３には、データの転送を行う共有バスと、この共有バスに接続され、データを保持する共有メモリと、前記共有バスに接続され、データを保持するデータ保持部、及び、このデータ保持部に保持されたデータの情報を保持するタグ保持部を備えた複数のキャッシュメモリと、この複数のキャッシュメモリのうち一のキャッシュメモリに接続され、必要なデータを前記キャッシュメモリに要求して所定の処理を行う複数のプロセッサ（ＰＥ）と、を備えたマルチプロセッサを制御する方法において、前記複数のキャッシュメモリのうちの所定のキャッシュメモリで、他のキャッシュメモリと共有状態にあってかつ共有メモリに対して更新されているデータがリプレイスされるとき、前記データを共有している前記他のキャッシュメモリのうちの１つのキャッシュメモリのタグ保持部に、共有メモリに対して更新されている旨の情報を保持させ、前記共有メモリへの前記更新されているデータの書き戻しを行わないことを特徴とするマルチプロセッサの制御方法が記載されている。 Patent Document 3 below discloses a shared bus that transfers data, a shared memory that is connected to the shared bus and holds data, a data holding unit that is connected to the shared bus and holds data, and A plurality of cache memories having a tag holding unit for holding information of data held in the data holding unit, and one cache memory of the plurality of cache memories, and necessary data is stored in the cache memory. In a method for controlling a multiprocessor including a plurality of processors (PE) that perform predetermined processing upon request, a predetermined cache memory of the plurality of cache memories is in a shared state with another cache memory. When the data updated to the shared memory is replaced, the other cache sharing the data is replaced. A tag holding unit of one cache memory in the cache memory holds information indicating that the shared memory has been updated, and the updated data is not written back to the shared memory. A multiprocessor control method is described.

特開平６−１０３２４４号公報JP-A-6-103244 特開２００３−３００４８号公報Japanese Patent Laid-Open No. 2003-30048 特開平９−１８５５４７号公報JP-A-9-185547

本発明の目的は、複数の処理装置に一対一に複数のキャッシュを接続した場合に、複数のキャッシュ間で効率的にデータ転送を行うことができるキャッシュシステムを提供することである。 An object of the present invention is to provide a cache system capable of efficiently transferring data between a plurality of caches when a plurality of caches are connected to a plurality of processing devices on a one-to-one basis.

本発明のキャッシュシステムは、複数の処理装置と、前記複数の処理装置に一対一に接続された複数のキャッシュと、前記複数のキャッシュに接続され、前記複数のキャッシュ間のデータ通信を行うネットワークと、前記キャッシュ毎に設けられ、前記キャッシュが前記ネットワークに対してデータ通信するための複数のバッファとを有し、前記処理装置が要求したデータが、一のキャッシュでミスした場合、前記複数のバッファの状態に応じて、前記キャッシュが前記ネットワークを介してデータ通信し、前記処理装置が要求したデータが、一のキャッシュでミスし、他のキャッシュでヒットした場合に、前記ヒットした他のキャッシュから前記一のキャッシュへデータを送信するための前記バッファに所定量データが存在せず、かつ、前記一のキャッシュから前記ヒットした他のキャッシュへデータを送信するための前記バッファに所定量データが存在し、かつ、前記複数のキャッシュの中で前記一のキャッシュが最も最近使用されていないものでなく、かつ、前記一のキャッシュから前記複数のキャッシュの中で最も最近使用されていないキャッシュへデータを送信するための前記バッファに所定量データが存在しないときには、前記一のキャッシュのエントリを前記最も最近使用されていないキャッシュのエントリに上書きし、前記ヒットした他のキャッシュのエントリを前記一のキャッシュのエントリに上書きすることを特徴とする。 The cache system of the present invention includes a plurality of processing devices, a plurality of caches connected to the plurality of processing devices on a one-to-one basis, and a network connected to the plurality of caches and performing data communication between the plurality of caches. A plurality of buffers provided for each cache, wherein the cache has a plurality of buffers for data communication with the network, and when the data requested by the processing device misses in one cache, the plurality of buffers Depending on the state of the cache, when the cache performs data communication via the network and the data requested by the processing device misses in one cache and hits in another cache, There is no predetermined amount of data in the buffer for transmitting data to the one cache, and the previous There is a predetermined amount of data in the buffer for transmitting data from one cache to the other hit cache, and the one cache is not the least recently used among the plurality of caches And when there is no predetermined amount of data in the buffer for transmitting data from the one cache to the least recently used cache among the plurality of caches, the entry of the one cache is changed to the most recent The cache entry that is not used is overwritten, and the entry in the other cache that has been hit is overwritten in the entry in the one cache.

バッファの状態に応じて、複数のキャッシュ間で効率的にデータ転送を行うことができ、トラフィックの増大を防止することができる。 Data can be efficiently transferred between a plurality of caches according to the state of the buffer, and an increase in traffic can be prevented.

図１は、共有分散キャッシュシステムの構成例を示す図である。この共有分散キャッシュシステムは、３個のコア（プロセッサ：処理装置）１０１〜１０３、３個のコア１０１〜１０３に一対一に対応する３個の３ウェイ１次キャッシュ１１１〜１１３、キャッシュ１１１〜１１３に接続されるキャッシュ間接続コントローラ（ＩＣＣ）１２０、及びメインメモリ（又は２次キャッシュ）１３０を含む。コア１０１〜１０３はそれぞれ、自己に直接に接続されるキャッシュ（自コアキャッシュ）を上位階層キャッシュとしてアクセス可能である。この共有分散キャッシュシステムでは、さらに、他のコアのキャッシュ（他コアキャッシュ）を下位階層キャッシュとしてアクセス可能なように構成される。すなわち、例えばコア１０１から見たときに、キャッシュ１１１を上位階層キャッシュとしてアクセス可能であるとともに、さらに他コアキャッシュ１１２及び１１３を下位階層キャッシュとしてアクセス可能なように構成される。このような下位階層キャッシュをアクセスする経路は、キャッシュ間接続コントローラ１２０を介して提供される。 FIG. 1 is a diagram illustrating a configuration example of a shared distributed cache system. This shared distributed cache system has three cores (processors: processing devices) 101 to 103, three three-way primary caches 111 to 113 corresponding to the three cores 101 to 103, and caches 111 to 113. And an inter-cache connection controller (ICC) 120 connected to the main memory (or secondary cache) 130. Each of the cores 101 to 103 can access a cache (own core cache) directly connected to itself as an upper layer cache. This shared distributed cache system is further configured so that a cache of another core (another core cache) can be accessed as a lower hierarchy cache. That is, for example, when viewed from the core 101, the cache 111 can be accessed as an upper hierarchy cache, and the other core caches 112 and 113 can be accessed as lower hierarchy caches. A path for accessing such a lower hierarchy cache is provided via the inter-cache connection controller 120.

キャッシュ間接続コントローラ１２０は、第１のＬＲＵ情報１２１〜１２３及び第２のＬＲＵ情報１２９を記憶する情報メモリを有する。情報メモリは、キャッシュ間接続コントローラ１２０の外にあってもよい。第１のＬＲＵ情報１２１は、キャッシュ１１１のエントリａ，ｂ，ｃ及び他コアキャッシュ１１２及び１１３のグループｘの最古順を示し、「０」が最古、「２」が最新を示す。すなわち、第１のＬＲＵ情報１２１は、自己のキャッシュ１１１の３個のエントリａ，ｂ，ｃに、ｎ個（＝１個）の他コアキャッシュグループｘを追加したＬＲＵ情報である。 The inter-cache connection controller 120 includes an information memory that stores the first LRU information 121 to 123 and the second LRU information 129. The information memory may be outside the inter-cache connection controller 120. The first LRU information 121 indicates the oldest order of the entries a, b, and c of the cache 111 and the group x of the other core caches 112 and 113, with “0” indicating the oldest and “2” indicating the latest. That is, the first LRU information 121 is LRU information obtained by adding n (= 1) other core cache groups x to the three entries a, b, and c of the own cache 111.

図２は、図１の第１のＬＲＵ情報１２１〜１２３及び第２のＬＲＵ情報１２９の詳細を示す図である。第１のＬＲＵ情報１２２は、第１のＬＲＵ情報１２１と同様に、キャッシュ１１２のエントリａ，ｂ，ｃ及び他コアキャッシュ１１１及び１１３のグループｘの最古順を示す。第１のＬＲＵ情報１２３も、第１のＬＲＵ情報１２１と同様に、キャッシュ１１３のエントリａ，ｂ，ｃ及び他コアキャッシュ１１１及び１１２のグループｘの最古順を示す。第１のＬＲＵ情報１２１〜１２３を２項関係方式によりエンコードすると、それぞれ６ビットになる。 FIG. 2 is a diagram showing details of the first LRU information 121 to 123 and the second LRU information 129 of FIG. Similar to the first LRU information 121, the first LRU information 122 indicates the oldest order of the entries a, b, and c of the cache 112 and the group x of the other core caches 111 and 113. Similarly to the first LRU information 121, the first LRU information 123 also indicates the oldest order of the entries a, b, and c of the cache 113 and the group x of the other core caches 111 and 112. When the first LRU information 121 to 123 is encoded by the binary relation method, each becomes 6 bits.

コア１０１はコアＡであり、キャッシュ１１１はキャッシュＡである。コア１０２はコアＢであり、キャッシュ１１２はキャッシュＢである。コア１０３はコアＣであり、キャッシュ１１３はキャッシュＣである。 The core 101 is the core A, and the cache 111 is the cache A. The core 102 is the core B, and the cache 112 is the cache B. The core 103 is the core C, and the cache 113 is the cache C.

第２のＬＲＵ情報１２９は、３個のコア１０１〜１０３間での自己のキャッシュ１１１〜１１３に対する使用古さ順を示し、ＡがキャッシュＡ（１１１）のＬＲＵ情報、ＢがキャッシュＢ（１１２）のＬＲＵ情報、ＣがキャッシュＣ（１１３）のＬＲＵ情報である。第２のＬＲＵ情報１２９を２項関係方式によりエンコードすると、３ビットになる。したがって、ＬＲＵ情報１２１〜１２３、１２９の合計は、６＋６＋６＋３＝２１ビットになり、図１５のシステムのビット数より少なくなる。第１のＬＲＵ情報１２１〜１２３の情報メモリは、各キャッシュ１１１〜１１３のローカルなＬＲＵ情報に、他コアキャッシュグループが１ウェイ増えたとみて、各キャッシュ毎に第１のＬＲＵ情報１２１〜１２３を保持する。 The second LRU information 129 indicates the usage order of the caches 111 to 113 among the three cores 101 to 103, A is the LRU information of the cache A (111), and B is the cache B (112). LRU information, and C is the LRU information of the cache C (113). When the second LRU information 129 is encoded by the binary relation method, it becomes 3 bits. Therefore, the sum of the LRU information 121 to 123 and 129 is 6 + 6 + 6 + 3 = 21 bits, which is smaller than the number of bits of the system of FIG. The information memory of the first LRU information 121 to 123 holds the first LRU information 121 to 123 for each cache on the assumption that the other core cache group is increased by one way in the local LRU information of each of the caches 111 to 113. To do.

図３は、共有分散キャッシュシステムの他の構成例を示す図である。以下、図３が図１と異なる点を説明する。この共有分散キャッシュシステムは、６個のコア１０１〜１０６、６個のコア１０１〜１０６に一対一に対応する６個の１次キャッシュ１１１〜１１６、キャッシュ１１１〜１１６に接続されるキャッシュ間接続コントローラ（ＩＣＣ）１２０、及びメインメモリ（又は２次キャッシュ）１３０を含む。キャッシュ１１１〜１１４及び１１６は３ウェイであり、キャッシュ１１５は２ウェイである。コア１０１〜１０６はそれぞれ、自コアキャッシュ１１１〜１１６を上位階層キャッシュとしてアクセス可能である。この共有分散キャッシュシステムでは、さらに、他コアキャッシュを下位階層キャッシュとしてアクセス可能なように構成される。すなわち、例えばコア１０１から見たときに、キャッシュ１１１を上位階層キャッシュとしてアクセス可能であるとともに、さらに他コアキャッシュ１１２及び１１３のグループｘ、他コアキャッシュ１１４及び１１５のグループｙ、他コアキャッシュ１１６のグループｚを下位階層キャッシュとしてアクセス可能なように構成される。 FIG. 3 is a diagram illustrating another configuration example of the shared distributed cache system. Hereinafter, the points of FIG. 3 different from FIG. 1 will be described. The shared distributed cache system includes six cores 101 to 106, six primary caches 111 to 116 corresponding to the six cores 101 to 106, and an inter-cache connection controller connected to the caches 111 to 116. (ICC) 120 and main memory (or secondary cache) 130 are included. The caches 111 to 114 and 116 are 3 ways, and the cache 115 is 2 ways. Each of the cores 101 to 106 can access the own core caches 111 to 116 as an upper hierarchy cache. This shared distributed cache system is further configured so that the other core cache can be accessed as a lower hierarchy cache. That is, for example, when viewed from the core 101, the cache 111 can be accessed as a higher-level cache, and the group x of the other core caches 112 and 113, the group y of the other core caches 114 and 115, and the other core cache 116 The group z is configured to be accessible as a lower hierarchy cache.

キャッシュ間接続コントローラ１２０は、第１のＬＲＵ情報１２１〜１２６及び第２のＬＲＵ情報１２９を記憶する情報メモリを有する。第１のＬＲＵ情報１２１は、キャッシュ１１１のエントリａ，ｂ，ｃ、他コアキャッシュ１１２及び１１３のグループｘ、他コアキャッシュ１１４及び１１５のグループｙ、他コアキャッシュ１１６グループｚの最古順を示し、「０」が最古、「５」が最新を示す。すなわち、第１のＬＲＵ情報１２１は、自己のキャッシュ１１１の３個のエントリａ，ｂ，ｃに、ｎ個（＝３個）の他コアキャッシュグループｘ，ｙ，ｚを追加したＬＲＵ情報である。 The inter-cache connection controller 120 includes an information memory that stores the first LRU information 121 to 126 and the second LRU information 129. The first LRU information 121 indicates the earliest order of the entries a, b, and c of the cache 111, the group x of the other core caches 112 and 113, the group y of the other core caches 114 and 115, and the other core cache 116 group z. , “0” indicates the oldest and “5” indicates the latest. That is, the first LRU information 121 is LRU information obtained by adding n (= 3) other core cache groups x, y, and z to the three entries a, b, and c of the own cache 111. .

図４は、図３の第１のＬＲＵ情報１２１〜１２６及び第２のＬＲＵ情報１２９の詳細を示す図である。第１のＬＲＵ情報１２２は、第１のＬＲＵ情報１２１と同様に、キャッシュ１１２のエントリａ，ｂ，ｃ、ｎ個（＝３個）の他コアキャッシュ１１１及び１１３のグループｘ、他コアキャッシュ１１４及び１１５のグループｙ、他コアキャッシュ１１６のグループｚの最古順を示す。第１のＬＲＵ情報１２３は、キャッシュ１１３のエントリａ，ｂ，ｃ、ｎ個（＝３個）の他コアキャッシュ１１１及び１１２のグループｘ、他コアキャッシュ１１４及び１１５のグループｙ、他コアキャッシュ１１６のグループｚの最古順を示す。第１のＬＲＵ情報１２４は、キャッシュ１１４のエントリａ，ｂ，ｃ、ｎ個（＝２個）の他コアキャッシュ１１１及び１１２のグループｘ、他コアキャッシュ１１３、１１５及び１１６のグループｙの最古順を示す。第１のＬＲＵ情報１２５は、キャッシュ１１５の２個のエントリａ，ｂ、ｎ個（＝３個）の他コアキャッシュ１１１及び１１２のグループｘ、他コアキャッシュ１１３及び１１４のグループｙ、他コアキャッシュ１１６のグループｚの最古順を示す。第１のＬＲＵ情報１２６は、キャッシュ１１６のエントリａ，ｂ，ｃ、ｎ個（＝３個）の他コアキャッシュ１１１及び１１２のグループｘ、他コアキャッシュ１１３及び１１４のグループｙ、他コアキャッシュ１１５のグループｚの最古順を示す。グループの組み合わせは、任意に決めることができる。 FIG. 4 is a diagram showing details of the first LRU information 121 to 126 and the second LRU information 129 of FIG. Similarly to the first LRU information 121, the first LRU information 122 includes the entries a, b, c, n (= 3) of the other core caches 111 and 113 of the cache 112, the group x of the other core caches 114, and the other core cache 114. , 115 of group y, and the oldest order of group z of other core cache 116. The first LRU information 123 includes entries a, b, c and n (= 3) of the cache 113, the group x of the other core caches 111 and 112, the group y of the other core caches 114 and 115, and the other core cache 116. Indicates the oldest order of group z. The first LRU information 124 includes the entries a, b, c and n (= 2) of the cache 114, the group x of the other core caches 111 and 112, and the oldest group y of the other core caches 113, 115 and 116. Indicates the order. The first LRU information 125 includes two entries a and b in the cache 115, group x of n (= 3) other core caches 111 and 112, group y of other core caches 113 and 114, and other core caches. 116 shows the oldest order of 116 groups z. The first LRU information 126 includes entries a, b, c and n (= 3) of the cache 116, the group x of the other core caches 111 and 112, the group y of the other core caches 113 and 114, and the other core cache 115. Indicates the oldest order of group z. The combination of groups can be determined arbitrarily.

第１のＬＲＵ情報１２１〜１２３，１２６を２項関係方式によりエンコードすると、それぞれ１５ビットになる。第１のＬＲＵ情報１２４は、２個の他コアキャッシュグループｘ及びｙの情報を有するので、２項関係方式によりエンコードすると１０ビットになる。第１のＬＲＵ情報１２５は、キャッシュ１１５の２個のエントリａ及びｂの情報を有するので、２項関係方式によりエンコードすると１０ビットになる。 When the first LRU information 121 to 123, 126 is encoded by the binary relation method, each becomes 15 bits. Since the first LRU information 124 includes information on two other core cache groups x and y, the first LRU information 124 becomes 10 bits when encoded by the binary relational method. Since the first LRU information 125 includes information on two entries a and b in the cache 115, the first LRU information 125 becomes 10 bits when encoded by the binary relation method.

コア１０１はコアＡであり、キャッシュ１１１はキャッシュＡである。コア１０２はコアＢであり、キャッシュ１１２はキャッシュＢである。コア１０３はコアＣであり、キャッシュ１１３はキャッシュＣである。コア１０４はコアＤであり、キャッシュ１１４はキャッシュＤである。コア１０５はコアＥであり、キャッシュ１１５はキャッシュＥである。コア１０６はコアＦであり、キャッシュ１１６はキャッシュＦである。 The core 101 is the core A, and the cache 111 is the cache A. The core 102 is the core B, and the cache 112 is the cache B. The core 103 is the core C, and the cache 113 is the cache C. The core 104 is the core D, and the cache 114 is the cache D. The core 105 is the core E, and the cache 115 is the cache E. The core 106 is the core F, and the cache 116 is the cache F.

第２のＬＲＵ情報１２９は、６個のコア１０１〜１０６間での自己のキャッシュ１１１〜１１６に対する使用古さ順を示し、ＡがキャッシュＡ（１１１）のＬＲＵ情報、ＢがキャッシュＢ（１１２）のＬＲＵ情報、ＣがキャッシュＣ（１１３）のＬＲＵ情報、ＤがキャッシュＤ（１１４）のＬＲＵ情報、ＥがキャッシュＥ（１１５）のＬＲＵ情報、ＦがキャッシュＦ（１１６）のＬＲＵ情報である。第２のＬＲＵ情報１２９を２項関係方式によりエンコードすると、１５ビットになる。したがって、ＬＲＵ情報１２１〜１２６、１２９の合計は、１５＋１５＋１５＋１０＋１０＋１５＋１５＝９５ビットになり、コア数が多いときにはビット数を少なくできる効果が大きい。 The second LRU information 129 indicates the used order of the caches 111 to 116 among the six cores 101 to 106, A is the LRU information of the cache A (111), and B is the cache B (112). LRU information of the cache C (113), D is LRU information of the cache D (114), E is LRU information of the cache E (115), and F is LRU information of the cache F (116). When the second LRU information 129 is encoded by the binary relation method, it becomes 15 bits. Therefore, the sum of the LRU information 121 to 126, 129 is 15 + 15 + 15 + 10 + 10 + 15 + 15 = 95 bits, and when the number of cores is large, the effect of reducing the number of bits is great.

図５は、共有分散キャッシュシステムにおけるデータロードアクセス動作を示すフローチャートである。以下、図１のシステムを例に説明する。ステップＳ１において、まず、複数のコア１０１〜１０３の何れか１つのコアが、自己に直接に接続されたキャッシュ（自コアキャッシュ）へロード要求を発行する。 FIG. 5 is a flowchart showing a data load access operation in the shared distributed cache system. Hereinafter, the system of FIG. 1 will be described as an example. In step S1, first, any one of the plurality of cores 101 to 103 issues a load request to a cache (own core cache) directly connected to itself.

ステップＳ２において、ロード要求を受け取ったキャッシュは、要求対象のデータがキャッシュ内に存在するか否か、すなわちキャッシュヒットであるか否かを判定する。キャッシュヒットである場合には、ステップＳ３において、自コアキャッシュから要求対象のデータを読み出して、ロード要求を発行したコアにデータを返送する。 In step S2, the cache that has received the load request determines whether or not the requested data exists in the cache, that is, whether or not it is a cache hit. If it is a cache hit, in step S3, the requested data is read from the own core cache, and the data is returned to the core that issued the load request.

ステップＳ２においてキャッシュミスである場合には、ステップＳ４に進む。このステップＳ４において、キャッシュミスが検出されたキャッシュの下に、下位階層キャッシュが存在するか否かを判定する。例えばステップＳ２においてキャッシュミスが検出されてステップＳ４に進んだ場合には、キャッシュミスが検出されたキャッシュはキャッシュ１１１〜１１３の何れか１つであるので、この場合、他の２つのキャッシュが下位階層キャッシュとして存在する。下位階層キャッシュが存在する場合には、ステップＳ５に進む。 If there is a cache miss in step S2, the process proceeds to step S4. In step S4, it is determined whether or not a lower hierarchy cache exists under the cache in which a cache miss is detected. For example, when a cache miss is detected in step S2 and the process proceeds to step S4, the cache in which the cache miss is detected is any one of the caches 111 to 113. In this case, the other two caches are in the lower level. It exists as a hierarchical cache. If a lower hierarchy cache exists, the process proceeds to step S5.

ステップＳ５において、キャッシュ階層が１つ下の他コアキャッシュにアクセスする。ステップＳ６において、アクセス要求を受け取ったキャッシュは、要求対象のデータがキャッシュ内に存在するか否か、すなわちキャッシュヒットであるか否かを判定する。キャッシュミスである場合には、ステップＳ４に戻り、上記の処理を繰り返す。 In step S5, the cache hierarchy accesses the other core cache one level below. In step S6, the cache that has received the access request determines whether or not the requested data exists in the cache, that is, whether or not it is a cache hit. If it is a cache miss, the process returns to step S4 and the above processing is repeated.

ステップＳ６においてキャッシュヒットである場合には、ステップＳ７に進む。このステップＳ７において、キャッシュデータの交換処理を行う。すなわち、キャッシュヒットしたキャッシュからアクセス対象のキャッシュライン（アクセス対象のデータ）を自コアキャッシュに移動し、コアへデータを返送する。この際に、キャッシュラインの自コアキャッシュへの移動に伴い、自コアキャッシュから追い出されるキャッシュラインを他コアに移動する。ステップＳ７の詳細は、後に図６を参照しながら説明する。 If it is a cache hit in step S6, the process proceeds to step S7. In step S7, cache data exchange processing is performed. That is, the cache line to be accessed (data to be accessed) is moved from the cache hit to the own core cache, and the data is returned to the core. At this time, as the cache line moves to the own core cache, the cache line evicted from the own core cache is moved to another core. Details of step S7 will be described later with reference to FIG.

またステップＳ４において、下位階層キャッシュが存在しないと判定された場合には、ステップＳ８に進む。例えばステップＳ６においてキャッシュミスが検出されてステップＳ４に進んだ際に、このキャッシュミスが検出されたキャッシュが既に最下層のキャッシュであった場合、その下層にはメインメモリ１３０しか存在しない。このような場合には、ステップＳ８で、メインメモリ１３０から要求対象のデータを読み出し、自コアキャッシュにアロケート（１キャッシュライン分の要求対象のデータを自コアキャッシュにコピー）するとともに、ロード要求を発行したコアにデータを返送する。またこの動作に伴い自コアキャッシュから追い出されたキャッシュラインは、例えば下位階層キャッシュに移動される。ステップＳ８の詳細は、後に図１０を参照しながら説明する。 If it is determined in step S4 that there is no lower hierarchy cache, the process proceeds to step S8. For example, when a cache miss is detected in step S6 and the process proceeds to step S4, if the cache in which this cache miss is detected is already the lowermost cache, only the main memory 130 exists in the lower layer. In such a case, in step S8, the requested data is read from the main memory 130, allocated to the own core cache (the requested data for one cache line is copied to the own core cache), and a load request is issued. Return data to the issued core. Further, the cache line evicted from the own core cache as a result of this operation is moved to, for example, a lower hierarchy cache. Details of step S8 will be described later with reference to FIG.

上記の動作フローにおいて、ステップＳ７はキャッシュとキャッシュとの間のデータ転送に関連する動作であり、ステップＳ８はメインメモリ１３０とキャッシュとの間のデータ転送に関連する動作である。 In the above operation flow, step S7 is an operation related to data transfer between the cache and the cache, and step S8 is an operation related to data transfer between the main memory 130 and the cache.

図６は、図５のステップＳ７の交換処理を示すフローチャートである。まず、ステップＡ１では、図５のステップＳ６の判断により、下位階層キャッシュとして利用可能な他コアキャッシュにヒットしたと判断される。 FIG. 6 is a flowchart showing the exchange process in step S7 of FIG. First, in step A1, it is determined that another core cache that can be used as a lower hierarchy cache has been hit by the determination in step S6 of FIG.

次に、ステップＡ２では、キャッシュ間接続コントローラ１２０は判定処理を行う。すなわち、ヒットした他コアが所属するグループの自コアローカルの第１のＬＲＵ情報が最新であり、かつ、ヒットした他コアキャッシュの第１のＬＲＵ情報のエントリが最新でないことの条件を満たすか否かを判定する。条件を満たす場合にはステップＡ４に進み、条件を満たさない場合にはステップＡ７に進む。 Next, in step A2, the inter-cache connection controller 120 performs a determination process. That is, whether or not the condition that the first LRU information of the own core local of the group to which the hit other core belongs is the latest and the entry of the first LRU information of the hit other core cache is not the latest is satisfied Determine whether. If the condition is satisfied, the process proceeds to step A4. If the condition is not satisfied, the process proceeds to step A7.

ステップＡ４では、キャッシュ間接続コントローラ１２０は自コアキャッシュ及び他コアキャッシュ間で交換処理を行う。すなわち、自コアキャッシュの最古エントリと他コアキャッシュのヒットしたエントリを交換する。 In step A4, the inter-cache connection controller 120 performs exchange processing between the own core cache and the other core cache. That is, the oldest entry in the own core cache and the hit entry in the other core cache are exchanged.

次に、ステップＡ５では、キャッシュ間接続コントローラ１２０はＬＲＵ更新処理を行う。すなわち、自コアの第１のＬＲＵ情報については自コアキャッシュの交換されたエントリを最新に更新し、他コアの第１のＬＲＵ情報については他コアキャッシュの交換されたエントリを最古に更新し、第２のＬＲＵ情報については自コアキャッシュが最新となるように更新する。その後、ステップＡ６に進み、処理を終了する。 Next, in step A5, the inter-cache connection controller 120 performs LRU update processing. That is, for the first LRU information of the own core, the exchanged entry of the own core cache is updated to the latest, and for the first LRU information of the other core, the exchanged entry of the other core cache is updated the oldest. The second LRU information is updated so that the own core cache becomes the latest. Then, it progresses to step A6 and complete | finishes a process.

ステップＡ７では、自コアはキャッシュ間接続コントローラ１２０を介して参照処理を行う。すなわち、他コアキャッシュのヒットしたエントリを直接参照する（読み出す）。 In step A7, the self-core performs a reference process via the inter-cache connection controller 120. That is, it directly refers to (reads out) the entry hit in the other core cache.

次に、ステップＡ３では、キャッシュ間接続コントローラ１２０はＬＲＵ更新処理を行う。すなわち、自コアの第１のＬＲＵ情報についてはヒットした他コアキャッシュが所属するグループを最新に更新し、他コアの第１のＬＲＵ情報についてはヒットした他コアキャッシュのエントリが最新であれば２番目に新しいものと交換するように更新し、第２のＬＲＵ情報についてはヒットした他コアキャッシュが最新となるように更新する。その後、ステップＡ６に進み、処理を終了する。 Next, in step A3, the inter-cache connection controller 120 performs LRU update processing. That is, for the first LRU information of the own core, the group to which the hit other core cache belongs is updated to the latest, and for the first LRU information of the other core, 2 if the entry of the hit other core cache is the latest. The second LRU information is updated so that the hit other core cache becomes the latest. Then, it progresses to step A6 and complete | finishes a process.

図７〜図９は、図１及び図２のシステムにおける図６の交換処理の例を示す図である。図７は、初期状態を示す。第１のＬＲＵ情報１２１〜１２３及び第２のＬＲＵ情報１２９は、０が最古であり、値が大きくなるほど新しいことを示す。 7 to 9 are diagrams illustrating an example of the exchange process of FIG. 6 in the system of FIGS. 1 and 2. FIG. 7 shows an initial state. The first LRU information 121 to 123 and the second LRU information 129 indicate that 0 is the oldest and newer as the value increases.

まず、ステップＡ１において、自コアＡ（１０１）から他コアＢ（１０２）のキャッシュＢ（１１２）のエントリｃにヒットした場合を説明する。 First, the case where the entry c of the cache B (112) of the other core B (102) is hit from the own core A (101) in step A1 will be described.

次に、ステップＡ２では、ヒットした他コアキャッシュＢ（１１２）が所属するグループｘの自コアローカルの第１のＬＲＵ情報１２１が１であり、最新でないので、ステップＡ７に進む。ステップＡ７では、他コアキャッシュＢ（１１２）のヒットしたエントリｃを直接参照する。 Next, in step A2, the first LRU information 121 of the own core local of the group x to which the hit other core cache B (112) belongs is 1 and is not the latest, so the process proceeds to step A7. In step A7, the entry c hit in the other core cache B (112) is directly referenced.

次に、ステップＡ３では、図８に示すように、自コアＡ（１０１）の第１のＬＲＵ情報１２１についてはヒットした他コアキャッシュＢ（１１２）が所属するグループｘを最新（値「３」）に更新し、他コアＢ（１０２）の第１のＬＲＵ情報１２２についてはヒットした他コアキャッシュＢ（１１２）のエントリｃが最新（図７の値「３」）であるので、２番目に新しいエントリｘと交換するように更新し、第２のＬＲＵ情報１２９についてはヒットした他コアキャッシュＢ（１１２）が最新となるように更新する。その後、ステップＡ６に進み、処理を終了する。 Next, in step A3, as shown in FIG. 8, for the first LRU information 121 of the own core A (101), the group x to which the hit other core cache B (112) belongs is updated (value “3”). ), The entry c of the other core cache B (112) that has been hit is the latest (value “3” in FIG. 7) for the first LRU information 122 of the other core B (102). The new entry x is updated to be exchanged, and the second LRU information 129 is updated so that the hit other core cache B (112) becomes the latest. Then, it progresses to step A6 and complete | finishes a process.

次に、ステップＡ１において、再度、自コアＡ（１０１）から他コアＢ（１０２）のキャッシュＢ（１１２）のエントリｃにヒットした場合を説明する。 Next, the case where the entry c of the cache B (112) of the other core B (102) is hit again from the own core A (101) in step A1 will be described.

次に、ステップＡ２では、ヒットした他コアキャッシュＢ（１１２）が所属するグループｘの自コアローカルの第１のＬＲＵ情報１２１が最新（値「３」）であり、かつ、ヒットした他コアキャッシュＢ（１１２）の第１のＬＲＵ情報１２２のエントリｃが最新でない（値「２」）ので、ステップＡ４に進む。ステップＡ４では、自コアキャッシュＡ（１１１）の最古エントリｂと他コアキャッシュＢ（１１２）のヒットしたエントリｃを交換する。 Next, in Step A2, the first LRU information 121 of the own core local of the group x to which the hit other core cache B (112) belongs is the latest (value “3”), and the hit other core cache Since the entry c of the first LRU information 122 of B (112) is not the latest (value “2”), the process proceeds to step A4. In step A4, the oldest entry b in the own core cache A (111) and the hit entry c in the other core cache B (112) are exchanged.

次に、ステップＡ５では、図９に示すように、自コアＡ（１０１）の第１のＬＲＵ情報１２１については自コアキャッシュＡ（１１１）の交換されたエントリｂを最新に更新し、他コアＢ（１０２）の第１のＬＲＵ情報１２２については他コアキャッシュＢ（１１２）の交換されたエントリｃを最古に更新し、第２のＬＲＵ情報１２９については自コアキャッシュＡ（１１１）が最新となるように更新する。その後、ステップＡ６に進み、処理を終了する。 Next, in step A5, as shown in FIG. 9, for the first LRU information 121 of the own core A (101), the exchanged entry b of the own core cache A (111) is updated to the latest, and the other cores For the first LRU information 122 of B (102), the exchanged entry c of the other core cache B (112) is updated to the oldest, and for the second LRU information 129, the own core cache A (111) is the latest Update to be Then, it progresses to step A6 and complete | finishes a process.

以上のように、第１回目に他コアキャッシュＢのエントリｃにヒットした場合には、交換処理を行わず、第２回目に他コアキャッシュＢのエントリｃにヒットした場合に交換処理行うことにより、無駄な交換処理をなくし、トラフィックを減少させることができる。すなわち、自コアＡが連続してアクセスし、かつ、ヒットした他コアキャッシュＢで現在あまり使っていないエントリｃのみ交換することにより、効率的な交換処理を行うことができる。 As described above, when the entry c of the other core cache B is hit at the first time, the replacement process is not performed, and when the entry c of the other core cache B is hit at the second time, the replacement process is performed. , Useless exchange processing can be eliminated and traffic can be reduced. In other words, efficient exchange processing can be performed by exchanging only the entry c that is continuously accessed by the own core A and that is currently not frequently used in the hit other core cache B.

図１０は、図５のステップＳ８の追い出し処理を示すフローチャートである。まず、ステップＢ１では、図５のステップＳ４の判断により、下位階層キャッシュとして利用可能な他コアキャッシュに全てミスしたと判断される。 FIG. 10 is a flowchart showing the eviction process in step S8 of FIG. First, in step B1, it is determined that all other core caches that can be used as the lower hierarchy cache have missed by the determination in step S4 of FIG.

次に、ステップＢ２では、キャッシュ間接続コントローラ１２０は判定処理を行う。すなわち、第２のＬＲＵ情報を参照し、自コアが最近そのインデックスのエントリを最も使っていないコアでないことの条件を満たすか否かを判定する。条件を満たす場合にはステップＢ４に進み、条件を満たさない場合にはステップＢ７に進む。 Next, in step B2, the inter-cache connection controller 120 performs a determination process. That is, referring to the second LRU information, it is determined whether or not the self-core satisfies the condition that it is not the most recently used core of the index. If the condition is satisfied, the process proceeds to step B4. If the condition is not satisfied, the process proceeds to step B7.

ステップＢ４では、キャッシュ間接続コントローラ１２０は自コアキャッシュから他コアキャッシュへ追い出し処理を行う。すなわち、最近最も使われていない他コアキャッシュの最古エントリに自コアキャッシュの最古エントリ（他コアグループ用エントリを含まないものの中で最古エントリ）を移動する。 In step B4, the inter-cache connection controller 120 performs an eviction process from the own core cache to another core cache. That is, the oldest entry of the own core cache (the oldest entry among those not including the other core group entry) is moved to the oldest entry of the other core cache that has not been used most recently.

次に、ステップＢ５では、キャッシュ間接続コントローラ１２０はＬＲＵ更新処理を行う。すなわち、自コアの第１のＬＲＵ情報については自コアキャッシュの最古エントリを最新に更新し、他コアの第１のＬＲＵ情報については追い出し先の他コアキャッシュの最古エントリを最新に更新し、第２のＬＲＵ情報については自コアキャッシュが最新に、追い出し先の他コアキャッシュを２番目に新しいものとなるように更新する。その後、ステップＢ６に進む。 Next, in step B5, the inter-cache connection controller 120 performs LRU update processing. That is, the oldest entry of the own core cache is updated to the latest for the first LRU information of the own core, and the oldest entry of the other core cache to be evicted is updated to the latest for the first LRU information of the other core. The second LRU information is updated so that its own core cache is the latest and the other core cache of the eviction destination is the second most recent. Then, it progresses to step B6.

ステップＢ７では、キャッシュ間接続コントローラ１２０は破棄処理を行う。すなわち、自コアキャッシュの最古エントリを破棄する。具体的には、このステップでは何も処理せず、後のステップＢ６の上書き処理により自コアキャッシュの最古エントリが破棄される。 In step B7, the inter-cache connection controller 120 performs a discard process. That is, the oldest entry in the own core cache is discarded. Specifically, nothing is processed in this step, and the oldest entry in the own core cache is discarded by the overwrite process in the subsequent step B6.

次に、ステップＢ３では、キャッシュ間接続コントローラ１２０はＬＲＵ更新処理を行う。すなわち、自コアの第１のＬＲＵ情報については自コアキャッシュの最古エントリを最新に更新し、第２のＬＲＵ情報については自コアキャッシュが最新になるように更新する。その後、ステップＢ６に進む。 Next, in step B3, the inter-cache connection controller 120 performs LRU update processing. That is, for the first LRU information of the own core, the oldest entry in the own core cache is updated to the latest, and the second LRU information is updated to be the latest. Then, it progresses to step B6.

ステップＢ６では、キャッシュ間接続コントローラ１２０は自コアキャッシュの最古エントリ（ＬＲＵ情報更新後の最新エントリ）にメインメモリ１３０から読み出したデータを上書きし、自コアはそのデータを取得する。以上で、処理を終了する。 In Step B6, the inter-cache connection controller 120 overwrites the data read from the main memory 130 on the oldest entry (the latest entry after the LRU information update) of the own core cache, and the own core acquires the data. Thus, the process ends.

図１１〜図１３は、図１及び図２のシステムにおける図１０の追い出し処理の例を示す図である。図１１は、初期状態を示す。第１のＬＲＵ情報１２１〜１２３及び第２のＬＲＵ情報１２９は、０が最古であり、値が大きくなるほど新しいことを示す。 11 to 13 are diagrams illustrating an example of the eviction process of FIG. 10 in the system of FIGS. 1 and 2. FIG. 11 shows an initial state. The first LRU information 121 to 123 and the second LRU information 129 indicate that 0 is the oldest and newer as the value increases.

まず、ステップＢ１において、自コアＡ（１０１）のアクセスが全ての他コアキャッシュでミスした場合を説明する。 First, in step B1, the case where the access of the own core A (101) misses in all other core caches will be described.

次に、ステップＢ２では、第２のＬＲＵ情報１２９を参照すると、自コアキャッシュＡ（１１１）が最近最も使っていないので、ステップＢ７を介してステップＢ３に進む。 Next, in step B2, when referring to the second LRU information 129, since the own core cache A (111) has not been used most recently, the process proceeds to step B3 via step B7.

ステップＢ３では、図１２に示すように、自コアＡ（１０１）の第１のＬＲＵ情報１２１については自コアキャッシュＡ（１１１）の最古エントリｂを最新に更新し、第２のＬＲＵ情報１２９については自コアキャッシュＡが最新になるように更新する。 In step B3, as shown in FIG. 12, for the first LRU information 121 of the own core A (101), the oldest entry b of the own core cache A (111) is updated to the latest, and the second LRU information 129 is updated. Is updated so that its own core cache A becomes the latest.

次に、ステップＢ６では、自コアキャッシュＡ（１１１）の最古エントリ（ＬＲＵ情報更新後の最新エントリ）ｂにメインメモリ１３０から読み出したデータを上書きする。以上で処理を終了する。 Next, in step B6, the data read from the main memory 130 is overwritten on the oldest entry (latest entry after updating the LRU information) b of the own core cache A (111). The process ends here.

次に、ステップＢ１において、再度、自コアＡ（１０１）のアクセスが全ての他コアキャッシュでミスした場合を説明する。 Next, the case where the access of the own core A (101) misses in all other core caches again in step B1 will be described.

次に、ステップＢ２では、第２のＬＲＵ情報１２９を参照すると、他コアキャッシュＣ（１１３）が最近最も使っていないので、ステップＢ４に進む。 Next, in step B2, referring to the second LRU information 129, the other core cache C (113) has not been used most recently, so the process proceeds to step B4.

ステップＢ４では、最近最も使われていない他コアキャッシュＣ（１１３）の最古エントリｂに自コアキャッシュＡ（１１１）の最古エントリ（他コアグループ用エントリを含まないものの中で最古エントリ）ａを移動する。 In step B4, the oldest entry b of the own core cache A (111) is included in the oldest entry b of the other core cache C (113) that has not been used most recently (the oldest entry among those not including the other core group entry). Move a.

次に、ステップＢ５では、図１３に示すように、自コアＡ（１０１）の第１のＬＲＵ情報１２１については自コアキャッシュＡ（１１１）の最古エントリａを最新に更新し、他コアＣ（１０３）の第１のＬＲＵ情報１２３については追い出し先の他コアキャッシュＣ（１１３）の最古エントリｂを最新に更新し、第２のＬＲＵ情報１２９については自コアキャッシュＡ（１１１）が最新に、追い出し先の他コアキャッシュＣ（１１３）を２番目に新しいものとなるように更新する。 Next, in step B5, as shown in FIG. 13, for the first LRU information 121 of the own core A (101), the oldest entry a of the own core cache A (111) is updated to the latest, and the other core C For the first LRU information 123 of (103), the oldest entry b of the other core cache C (113) to be evicted is updated to the latest, and for the second LRU information 129, the own core cache A (111) is the latest Then, the other core cache C (113) of the eviction destination is updated so as to be the second newest.

次に、ステップＢ６では、自コアキャッシュＡ（１１１）の最古エントリ（ＬＲＵ情報更新後の最新エントリ）ａにメインメモリ１３０から読み出したデータを上書きする。以上で処理を終了する。 Next, in step B6, the data read from the main memory 130 is overwritten on the oldest entry (latest entry after updating the LRU information) a of the own core cache A (111). The process ends here.

以上のように、第１回目に全ての他コアキャッシュでミスした場合には、追い出し処理を行わず、第２回目に全ての他コアキャッシュでミスした場合に追い出し処理行うことにより、無駄な追い出し処理をなくし、トラフィックを減少させることができる。すなわち、自コアＡが連続してキャッシュミスし、かつ、他コアキャッシュが最古であるときのみ追い出し処理することにより、効率的な追い出し処理を行うことができる。 As described above, if all other core caches miss in the first time, the eviction process is not performed, and if the other core cache misses in the second time, the eviction process is performed, so that unnecessary eviction is performed. Processing can be eliminated and traffic can be reduced. In other words, efficient eviction processing can be performed by performing eviction processing only when the own core A continuously misses the cache and the other core cache is the oldest.

第２のＬＲＵ情報１２９は最新のアクセスのみしか保持していないが、完全な最古順を保持しても構わない。最新のアクセスのみしか保持しない理由は、ビット数削減のためである。また、第２のＬＲＵ情報１２９の更新時に追い出し先コアを２番目に新しいものに更新して、追い出し先を持ち回りすることで、完全な最古順を保持した場合に比べた性能低下を抑えられる。 The second LRU information 129 holds only the latest access, but may hold the complete oldest order. The reason for retaining only the latest access is to reduce the number of bits. In addition, when the second LRU information 129 is updated, the eviction destination core is updated to the second most recent one, and the eviction destination is carried around, so that it is possible to suppress the performance degradation compared to the case where the complete oldest order is maintained. .

図１４は、本発明の実施形態による共有分散キャッシュシステムの構成例を示す図である。以下、図１４が図１と異なる点を説明する。情報メモリ１４２１及び判定回路１４２４及びキャッシュ間接続ネットワーク（ＩＣＮ）１４２５は、図１のキャッシュ間接続コントローラ１２０に対応する。情報メモリ１４２１は、ＬＲＵ情報１４２２及び送受信バッファ情報１４２３を記憶する。ＬＲＵ情報１４２２は、図１のＬＲＵ情報１２１〜１２３及び１２９に対応する。キャッシュ間接続ネットワーク１４２５は、メインメモリ１３０及びキャッシュ１１１〜１１３に接続される。判定回路１４２４は、情報メモリ１４２１内のＬＲＵ情報１４２２又は送受信バッファ情報１４２３に応じてキャッシュ間接続ネットワーク１４２５の通信を制御する。 FIG. 14 is a diagram showing a configuration example of the shared distributed cache system according to the embodiment of the present invention. Hereinafter, the points of FIG. 14 different from FIG. 1 will be described. The information memory 1421, the determination circuit 1424, and the inter-cache connection network (ICN) 1425 correspond to the inter-cache connection controller 120 of FIG. The information memory 1421 stores LRU information 1422 and transmission / reception buffer information 1423. The LRU information 1422 corresponds to the LRU information 121 to 123 and 129 in FIG. The inter-cache connection network 1425 is connected to the main memory 130 and the caches 111 to 113. The determination circuit 1424 controls communication of the inter-cache connection network 1425 according to the LRU information 1422 or the transmission / reception buffer information 1423 in the information memory 1421.

送信バッファＡ（１４０１）及び受信バッファＡ（１４１１）は、キャッシュＡ（１１１）内に設けられる。送信バッファ１４０１は、キャッシュ１１１内のデータをキャッシュ間接続ネットワーク１４２５を介してキャッシュ１１２又は１１３に送信するためのバッファである。受信バッファ１４１１は、キャッシュ１１２又は１１３内のデータをキャッシュ間接続ネットワーク１４２５を介してキャッシュ１１１に受信するためのバッファである。 The transmission buffer A (1401) and the reception buffer A (1411) are provided in the cache A (111). The transmission buffer 1401 is a buffer for transmitting data in the cache 111 to the cache 112 or 113 via the inter-cache connection network 1425. The reception buffer 1411 is a buffer for receiving data in the cache 112 or 113 to the cache 111 via the inter-cache connection network 1425.

送信バッファＢ（１４０２）及び受信バッファＢ（１４１２）は、キャッシュＢ（１１２）内に設けられる。送信バッファ１４０２は、キャッシュ１１２内のデータをキャッシュ間接続ネットワーク１４２５を介してキャッシュ１１１又は１１３に送信するためのバッファである。受信バッファ１４１２は、キャッシュ１１１又は１１３内のデータをキャッシュ間接続ネットワーク１４２５を介してキャッシュ１１２に受信するためのバッファである。 The transmission buffer B (1402) and the reception buffer B (1412) are provided in the cache B (112). The transmission buffer 1402 is a buffer for transmitting data in the cache 112 to the cache 111 or 113 via the inter-cache connection network 1425. The reception buffer 1412 is a buffer for receiving data in the cache 111 or 113 to the cache 112 via the inter-cache connection network 1425.

送信バッファＣ（１４０３）及び受信バッファＣ（１４１３）は、キャッシュＣ（１１３）内に設けられる。送信バッファ１４０３は、キャッシュ１１３内のデータをキャッシュ間接続ネットワーク１４２５を介してキャッシュ１１１又は１１２に送信するためのバッファである。受信バッファ１４１３は、キャッシュ１１１又は１１２内のデータをキャッシュ間接続ネットワーク１４２５を介してキャッシュ１１３に受信するためのバッファである。 The transmission buffer C (1403) and the reception buffer C (1413) are provided in the cache C (113). The transmission buffer 1403 is a buffer for transmitting data in the cache 113 to the cache 111 or 112 via the inter-cache connection network 1425. The reception buffer 1413 is a buffer for receiving data in the cache 111 or 112 to the cache 113 via the inter-cache connection network 1425.

送受信バッファ情報１４２３は、送信バッファ１４０１、受信バッファ１４１１、送信バッファ１４０２、受信バッファ１４１２、送信バッファ１４０３及び受信バッファ１４１３の各バッファが閾値を超えて埋まっているか否かを示す使用量情報である。 The transmission / reception buffer information 1423 is usage amount information indicating whether each of the transmission buffer 1401, the reception buffer 1411, the transmission buffer 1402, the reception buffer 1412, the transmission buffer 1403, and the reception buffer 1413 is filled beyond a threshold.

図１５は図１４のキャッシュシステムにおける図５のステップＳ７の交換処理を示すフローチャートであり、図１６は図１５のステップＣ１３の玉突き処理の例を示す図であり、図１７は図１５のステップＣ１１の移動処理の例を示す図である。 FIG. 15 is a flowchart showing the exchange process of step S7 of FIG. 5 in the cache system of FIG. 14, FIG. 16 is a diagram showing an example of the ball hitting process of step C13 of FIG. 15, and FIG. 17 is step C11 of FIG. It is a figure which shows the example of this movement process.

まず、ステップＣ１では、図５のステップＳ６の判断により、下位階層キャッシュとして利用可能な他コアキャッシュにヒットしたと判断される。 First, in step C1, it is determined that another core cache that can be used as a lower hierarchy cache has been hit by the determination in step S6 of FIG.

次に、ステップＣ２では、図６のステップＡ２と同様に、判定回路１４２４はＬＲＵ判定処理を行う。すなわち、ヒットした他コアが所属するグループの自コアローカルの第１のＬＲＵ情報が最新であり、かつ、ヒットした他コアキャッシュの第１のＬＲＵ情報のエントリが最新でないことの条件を満たすか否かを判定する。条件を満たす場合にはステップＣ８に進み、条件を満たさない場合にはステップＣ７に進む。 Next, in step C2, as in step A2 of FIG. 6, the determination circuit 1424 performs LRU determination processing. That is, whether or not the condition that the first LRU information of the own core local of the group to which the hit other core belongs is the latest and the entry of the first LRU information of the hit other core cache is not the latest is satisfied Determine whether. If the condition is satisfied, the process proceeds to step C8. If the condition is not satisfied, the process proceeds to step C7.

ステップＣ８では、判定回路１４２４は第１のバッファ判定処理を行う。すなわち、ヒットした他コアキャッシュから自コアキャッシュへのバッファが閾値を越えて埋まっているか否かを判定する。例えば、ヒットした他コアバッファ１１２から自コアバッファ１１１へのデータ転送を行うため、送受信バッファ情報１４２３を参照し、他コアバッファ１１２の送信バッファ１４０２又は自コアバッファ１１１の受信バッファ１４１１が閾値を越えて埋まっているか否かを判定する。送信バッファ１４０２又は受信バッファ１４１１の少なくともいずれかが埋まっていればステップＣ７に進み、送信バッファ１４０２及び受信バッファ１４１１の両方が埋まっていなければステップＣ９へ進む。 In step C8, the determination circuit 1424 performs a first buffer determination process. That is, it is determined whether or not the buffer from the hit other core cache to the own core cache is filled beyond the threshold. For example, in order to perform data transfer from the hit other core buffer 112 to the own core buffer 111, the transmission / reception buffer information 1423 is referred to, and the transmission buffer 1402 of the other core buffer 112 or the reception buffer 1411 of the own core buffer 111 exceeds the threshold. To determine whether it is buried. If at least one of the transmission buffer 1402 or the reception buffer 1411 is filled, the process proceeds to Step C7, and if both the transmission buffer 1402 and the reception buffer 1411 are not filled, the process proceeds to Step C9.

ステップＣ７では、図６のステップＡ７と同様に、自コアはキャッシュ間接続ネットワーク１４２５を介して参照処理を行う。すなわち、他コアキャッシュのヒットしたエントリを直接参照する（読み出す）。 In step C7, the own core performs a reference process via the inter-cache connection network 1425 as in step A7 of FIG. That is, it directly refers to (reads out) the entry hit in the other core cache.

次に、ステップＣ３では、図６のステップＡ３と同様に、キャッシュシステムはＬＲＵ更新処理を行う。すなわち、自コアの第１のＬＲＵ情報についてはヒットした他コアが所属するグループを最新に更新し、他コアの第１のＬＲＵ情報についてはヒットした他コアのエントリが最新であれば２番目に新しいものと交換するように更新し、第２のＬＲＵ情報についてはヒットした他コアキャッシュが最新となるように更新する。その後、ステップＣ６に進み、処理を終了する。 Next, in step C3, the cache system performs LRU update processing as in step A3 of FIG. That is, for the first LRU information of the own core, the group to which the hit other core belongs is updated to the latest, and for the first LRU information of the other core, the second is the second if the entry of the hit other core is the latest. The second LRU information is updated so that it is replaced with a new one, and the other core cache that has been hit is updated so as to be the latest. Then, it progresses to step C6 and complete | finishes a process.

ステップＣ９では、判定回路１４２４は第２のバッファ判定処理を行う。すなわち、自コアキャッシュからヒットした他コアキャッシュへのバッファが閾値を越えて埋まっているか否かを判定する。例えば、自コアバッファ１１１からヒットした他コアバッファ１１２へのデータ転送を行うため、送受信バッファ情報１４２３を参照し、自コアバッファ１１１の送信バッファ１４０１又は他コアバッファ１１２の受信バッファ１４１２が閾値を越えて埋まっているか否かを判定する。送信バッファ１４０１又は受信バッファ１４１２の少なくともいずれかが埋まっていれば交換処理ができないためステップＣ１０に進み、送信バッファ１４０１及び受信バッファ１４１２の両方が埋まっていなければ交換処理可能であるためステップＣ４へ進む。 In step C9, the determination circuit 1424 performs a second buffer determination process. That is, it is determined whether or not the buffer to the other core cache hit from the own core cache is filled beyond the threshold. For example, in order to perform data transfer from the own core buffer 111 to the other core buffer 112 hit, the transmission / reception buffer information 1423 is referred to, and the transmission buffer 1401 of the own core buffer 111 or the reception buffer 1412 of the other core buffer 112 exceeds the threshold. To determine whether it is buried. If at least one of the transmission buffer 1401 and the reception buffer 1412 is filled, the exchange process cannot be performed, so the process proceeds to Step C10. If both the transmission buffer 1401 and the reception buffer 1412 are not filled, the exchange process is possible, and therefore, the process proceeds to Step C4. .

ステップＣ４では、図６のステップＡ４と同様に、キャッシュシステムは自コアキャッシュ及び他コアキャッシュ間で交換処理を行う。例えば、図１６及び図１７に示すように、自コアキャッシュ１１１の最古エントリｂと他コアキャッシュ１１２のヒットしたエントリｃを交換する。すなわち、データ転送処理１６１１により、他コアキャッシュ１１２のヒットエントリｃのデータは自コアキャッシュ１１１の最古エントリｂに移動され、データ転送処理１６１２により、自コアキャッシュ１１１の最古エントリｂのデータは他コアキャッシュ１１２のヒットエントリｃに移動される。 In step C4, as in step A4 of FIG. 6, the cache system performs an exchange process between the own core cache and the other core cache. For example, as shown in FIGS. 16 and 17, the oldest entry b in the own core cache 111 and the hit entry c in the other core cache 112 are exchanged. In other words, the data of the hit entry c in the other core cache 112 is moved to the oldest entry b of the own core cache 111 by the data transfer process 1611, and the data of the oldest entry b of the own core cache 111 is moved by the data transfer process 1612. It is moved to hit entry c in the other core cache 112.

次に、ステップＣ５では、図６のステップＡ５と同様に、キャッシュシステムはＬＲＵ更新処理を行う。すなわち、自コアの第１のＬＲＵ情報については自コアの交換されたエントリを最新に更新し、他コアの第１のＬＲＵ情報については他コアの交換されたエントリを最古に更新し、第２のＬＲＵ情報については自コアキャッシュが最新となるように更新する。その後、ステップＣ６に進み、処理を終了する。 Next, in step C5, as in step A5 of FIG. 6, the cache system performs LRU update processing. That is, for the first LRU information of the own core, the exchanged entry of the own core is updated to the latest, for the first LRU information of the other core, the exchanged entry of the other core is updated to the oldest, The LRU information of 2 is updated so that the own core cache becomes the latest. Then, it progresses to step C6 and complete | finishes a process.

ステップＣ１０では、判定回路１４２４は第３のバッファ判定処理を行う。すなわち、ＬＲＵ情報１４２２及び送受信バッファ情報１４２３を参照することにより、自コアのバッファが最古であり、又は、自コアのバッファから最古コアのバッファへの送受信バッファが閾値を越えて埋まっているか否かを判定する。自コアのバッファが最古であるときには、玉突き処理（ステップＣ１３）を行う必要がないので、ステップＣ１１へ進む。自コアのバッファから最古コアのバッファへの送受信バッファが閾値を越えて埋まっているときには、玉突き処理（ステップＣ１３）を行うことができないので、ステップＣ１１に進む。自コアのバッファが最古でなく、かつ、自コアのバッファから最古コアのバッファへの送受信バッファが閾値を越えて埋まっていないときには、玉突き処理を行うためにステップＣ１３に進む。 In step C10, the determination circuit 1424 performs a third buffer determination process. That is, by referring to the LRU information 1422 and the transmission / reception buffer information 1423, whether the buffer of the own core is the oldest or the transmission / reception buffer from the buffer of the own core to the buffer of the oldest core is filled beyond the threshold. Determine whether or not. When the buffer of the own core is the oldest, it is not necessary to perform the ball hitting process (step C13), and the process proceeds to step C11. When the transmission / reception buffer from the buffer of the own core to the buffer of the oldest core is filled beyond the threshold, the ball hitting process (step C13) cannot be performed, and the process proceeds to step C11. When the buffer of the own core is not the oldest and the transmission / reception buffer from the buffer of the own core to the buffer of the oldest core is not filled beyond the threshold, the process proceeds to step C13 to perform a ball hitting process.

ステップＣ１３では、キャッシュシステムは玉突き処理を行う。すなわち、図１６に示すように、データ転送処理１６１３により、最古コア１０３のキャッシュ１１３の最古エントリｃに自コア１０１のキャッシュ１１１の最古エントリｂを上書きし、データ転送処理１６１１により、自コア１０１のキャッシュ１１１の最古エントリｂに他コア１０２のキャッシュ１１２のヒットしたエントリｃを上書きし、該他コア１０２のキャッシュ１１２のヒットしたエントリｃを無効化する。最古コアは第２のＬＲＵ情報１２９を基に判断し、最古エントリは第１のＬＲＵ情報１２１〜１２３を基に判断することができる。 In step C13, the cache system performs a ball hitting process. That is, as shown in FIG. 16, the data transfer process 1613 overwrites the oldest entry c of the cache 111 of the own core 101 with the oldest entry c of the cache 113 of the oldest core 103, and the data transfer process 1611 The oldest entry b of the cache 111 of the core 101 is overwritten with the hit entry c of the cache 112 of the other core 102, and the hit entry c of the cache 112 of the other core 102 is invalidated. The oldest core can be determined based on the second LRU information 129, and the oldest entry can be determined based on the first LRU information 121-123.

図１６に示すように、ヒットした他コアキャッシュ１１２から自コアキャッシュ１１１への通信パス１６０１の送信バッファ１４０２及び受信バッファ１４１１が閾値を越えて埋まっておらず、かつ、自コアキャッシュ１１１からヒットした他コアキャッシュ１１２への通信パス１６０２の送信バッファ１４０１及び受信バッファ１４１２が閾値を越えて埋まっており、かつ、自コアキャッシュ１１１から最古コアキャッシュ１１３への通信パス１６０３の送信バッファ１４０１及び受信バッファ１４１３が閾値を越えて埋まっていないときには、交換処理（ステップＣ４）１６１１及び１６１２ができないので、上記の玉突き処理（ステップＣ１３）１６１１及び１６１３を行う。 As shown in FIG. 16, the transmission buffer 1402 and the reception buffer 1411 of the communication path 1601 from the hit other core cache 112 to the own core cache 111 are not filled beyond the threshold value, and the own core cache 111 is hit. The transmission buffer 1401 and the reception buffer 1412 of the communication path 1602 to the other core cache 112 are filled beyond the threshold, and the transmission buffer 1401 and the reception buffer of the communication path 1603 from the own core cache 111 to the oldest core cache 113 When 1413 is not filled exceeding the threshold value, the replacement process (step C4) 1611 and 1612 cannot be performed, so the above-described ball hitting processes (step C13) 1611 and 1613 are performed.

次に、ステップＣ１４では、キャッシュシステムはＬＲＵ更新処理を行う。すなわち、自コアの第１のＬＲＵ情報については自コアキャッシュの最古エントリを最新に更新し、最古の他コアの第１のＬＲＵ情報については最古エントリを最新に更新し、第２のＬＲＵ情報については自コアキャッシュが最新、最古の他コアが２番目に新しい順番となるように更新する。その後、ステップＣ６に進み、処理を終了する。 Next, in step C14, the cache system performs LRU update processing. That is, the oldest entry of the own core cache is updated to the latest for the first LRU information of the own core, the oldest entry is updated to the latest for the first LRU information of the oldest other core, and the second The LRU information is updated so that the own core cache is the latest and the oldest other core is the second most recent. Then, it progresses to step C6 and complete | finishes a process.

ステップＣ１１では、キャッシュシステムは移動処理を行う。すなわち、図１７に示すように、データ転送処理１６１１により、自コア１０１のキャッシュ１１１の最古エントリｂに他コア１０２のキャッシュ１１２のヒットしたエントリｃを上書きし、該他コア１０２のキャッシュ１１２のヒットしたエントリｃを無効化する。上記の上書きにより、自コアキャッシュ１１１の最古エントリｂの元のキャッシュラインは破棄処理１６１４される。 In step C11, the cache system performs migration processing. That is, as shown in FIG. 17, the data transfer process 1611 overwrites the entry c hit in the cache 112 of the other core 102 with the oldest entry b of the cache 111 of the own core 101, and The hit entry c is invalidated. By the overwriting, the original cache line of the oldest entry b in the own core cache 111 is discarded 1614.

以上のように、ヒットした他コアキャッシュ１１２から自コアキャッシュ１１１への通信パス１６０１の送信バッファ１４０２及び受信バッファ１４１１が閾値を越えて埋まっておらず、かつ、自コアキャッシュ１１１からヒットした他コアキャッシュ１１２への通信パス１６０２の送信バッファ１４０１及び受信バッファ１４１２が閾値を越えて埋まっており、かつ、自コアキャッシュ１１１から最古コアキャッシュ１１３への通信パス１６０３の送信バッファ１４０１及び受信バッファ１４１３が閾値を越えて埋まっているときには、交換処理（ステップＣ４）１６１１，１６１２及び玉突き処理（ステップＣ１３）ができないので、上記の移動処理（ステップＣ１１）１６１１及び１６１４を行う。 As described above, the transmission buffer 1402 and the reception buffer 1411 of the communication path 1601 from the hit other core cache 112 to the own core cache 111 are not filled beyond the threshold and the other core hit from the own core cache 111 The transmission buffer 1401 and the reception buffer 1412 of the communication path 1602 to the cache 112 are filled beyond the threshold, and the transmission buffer 1401 and the reception buffer 1413 of the communication path 1603 from the own core cache 111 to the oldest core cache 113 are Since the replacement process (step C4) 1611 and 1612 and the ball hitting process (step C13) cannot be performed when the threshold is exceeded, the movement processes (step C11) 1611 and 1614 are performed.

次に、ステップＣ１２では、キャッシュシステムはＬＲＵ更新処理を行う。すなわち、自コアの第１のＬＲＵ情報については自コアキャッシュの最古エントリを最新に更新し、第２のＬＲＵ情報については自コアのキャッシュを最新の順番に更新する。その後、ステップＣ６に進み、処理を終了する。 Next, in step C12, the cache system performs LRU update processing. That is, the oldest entry of the own core cache is updated to the latest for the first LRU information of the own core, and the cache of the own core is updated to the latest order for the second LRU information. Then, it progresses to step C6 and complete | finishes a process.

なお、ステップＣ１０において、自コアキャッシュが最古でなく、かつ、自コアキャッシュから最古コアキャッシュへの送受信バッファが閾値を越えて埋まっていると判断された場合には、自コアキャッシュから２番目に古いキャッシュへの送受信バッファが閾値を越えて埋まっているか否かを判定してもよい。埋まっていない場合には、自コアのキャッシュの最古エントリを２番目に古いコアキャッシュの最古エントリに上書きし、自コアキャッシュの最古エントリに他コアキャッシュのヒットしたエントリを上書きし、該他コアキャッシュのヒットエントリを無効化する。埋まっている場合には、さらに、自コアキャッシュから３番目に古いキャッシュへの送受信バッファが閾値を越えて埋まっているか否かを判定し、上記と同様に、３、４、・・・番目に古いコアキャッシュを順次玉突き処理の対象にしてもよい。 In step C10, if it is determined that the own core cache is not the oldest and the transmission / reception buffer from the own core cache to the oldest core cache is filled beyond the threshold, 2 from the own core cache. It may be determined whether the transmission / reception buffer for the second oldest cache is filled beyond the threshold. If not, the oldest entry in the own core cache is overwritten with the oldest entry in the second oldest core cache, the oldest entry in the own core cache is overwritten with the hit entry in the other core cache, Invalidate hit entries in other core caches. If it is filled, it is further determined whether the transmission / reception buffer from the own core cache to the third oldest cache is filled beyond the threshold, and the third, fourth,... Old core caches may be sequentially subjected to the ball hitting process.

以上のように、本実施形態は、交換処理を行う際には、ＬＲＵ判定処理（ステップＣ２）の他に、送受信バッファ判定処理（ステップＣ８〜Ｃ１０）を行う。図１５では、図６に対して、玉突き処理（ステップＣ１３）及び移動処理（ステップＣ１１）が追加されている。ＬＲＵ判定処理（ステップＣ２）でステップＣ８に進む場合、送受信バッファ（例えば送信バッファ１４０１及び受信バッファ１４１２）の埋まり具合に応じて、参照処理（ステップＣ７）、交換処理（ステップＣ４）、玉突き処理（ステップＣ１３）又は移動処理（ステップＣ１１）を行う。これにより、トラフィックを低減することができる。 As described above, according to the present embodiment, when performing the exchange process, in addition to the LRU determination process (step C2), the transmission / reception buffer determination process (steps C8 to C10) is performed. In FIG. 15, a ball hitting process (step C13) and a moving process (step C11) are added to FIG. When the process proceeds to step C8 in the LRU determination process (step C2), the reference process (step C7), the exchange process (step C4), and the ball hitting process (step S4) are performed depending on how the transmission / reception buffers (for example, the transmission buffer 1401 and the reception buffer 1412) are filled. Step C13) or movement processing (step C11) is performed. Thereby, traffic can be reduced.

図１８は図１４のキャッシュシステムにおける図５のステップＳ８の追い出し処理を示すフローチャートであり、図１９は図１８のステップＤ１０の次候補への追い出し処理の例を示す図である。 18 is a flowchart showing the eviction process in step S8 of FIG. 5 in the cache system of FIG. 14, and FIG. 19 is a diagram showing an example of the eviction process to the next candidate in step D10 of FIG.

まず、ステップＤ１では、図５のステップＳ４の判断により、下位階層キャッシュとして利用可能な他コアキャッシュに全てミスしたと判断される。 First, in step D1, it is determined that all other core caches that can be used as the lower hierarchy cache have missed by the determination in step S4 of FIG.

次に、ステップＤ２では、図１０のステップＢ２と同様に、判定回路１４２４はＬＲＵ判定処理を行う。すなわち、第２のＬＲＵ情報を参照し、自コアが最近そのインデックスのエントリを最も使っていないコアでないことの条件を満たすか否かを判定する。条件を満たす場合にはステップＤ８に進み、条件を満たさない場合にはステップＤ７に進む。 Next, in step D2, as in step B2 of FIG. 10, the determination circuit 1424 performs LRU determination processing. That is, referring to the second LRU information, it is determined whether or not the self-core satisfies the condition that it is not the most recently used core of the index. If the condition is satisfied, the process proceeds to step D8. If the condition is not satisfied, the process proceeds to step D7.

ステップＤ８では、判定回路１４２４は第１のバッファ判定処理を行う。すなわち、自コアのバッファから最古コアのバッファへの送受信バッファが閾値を越えて埋まっているか否かを判定する。例えば、図１９において、自コア１０１のバッファ１１１から最古コア１０２のバッファ１１２への通信パス１９０１の送信バッファ１４０１又は受信バッファ１４１２が閾値を越えて埋まっているか否かを判定する。送信バッファ１４０１又は受信バッファ１４１２が閾値を越えて埋まっているときにはステップＤ９へ進み、送信バッファ１４０１及び受信バッファ１４１２が閾値を越えて埋まっていないときにはステップＤ４へ進む。 In step D8, the determination circuit 1424 performs a first buffer determination process. That is, it is determined whether the transmission / reception buffer from the buffer of the own core to the buffer of the oldest core is filled beyond the threshold. For example, in FIG. 19, it is determined whether or not the transmission buffer 1401 or the reception buffer 1412 of the communication path 1901 from the buffer 111 of the own core 101 to the buffer 112 of the oldest core 102 is filled beyond the threshold. When the transmission buffer 1401 or the reception buffer 1412 is filled beyond the threshold, the process proceeds to step D9. When the transmission buffer 1401 and the reception buffer 1412 are not filled beyond the threshold, the process proceeds to step D4.

ステップＤ４では、図１０のステップＢ４と同様に、キャッシュシステムは自コアキャッシュから他コアキャッシュへ追い出し処理を行う。すなわち、図１９に示すように、データ転送処理１９１２により、最近最も使われていない他コアキャッシュ１１２の最古エントリｃに自コアキャッシュ１１１の最古エントリ（他コアグループ用エントリを含まないものの中で最古エントリ）ｂを移動する。 In step D4, as in step B4 of FIG. 10, the cache system performs eviction processing from its own core cache to another core cache. That is, as shown in FIG. 19, the data transfer processing 1912 causes the oldest entry c of the other core cache 112 that has not been used most recently to be the oldest entry of the own core cache 111 (not including the entry for the other core group). To move the oldest entry b).

次に、ステップＤ５では、図１０のステップＢ５と同様に、キャッシュシステムはＬＲＵ更新処理を行う。すなわち、自コアの第１のＬＲＵ情報については自コアキャッシュの最古エントリを最新に更新し、他コアの第１のＬＲＵ情報については追い出し先の他コアキャッシュの最古エントリを最新に更新し、第２のＬＲＵ情報については自コアキャッシュが最新に、追い出し先の他コアキャッシュが２番目に新しい順番となるように更新する。その後、ステップＤ６に進む。 Next, in step D5, the cache system performs LRU update processing as in step B5 of FIG. That is, the oldest entry of the own core cache is updated to the latest for the first LRU information of the own core, and the oldest entry of the other core cache to be evicted is updated to the latest for the first LRU information of the other core. The second LRU information is updated so that the own core cache is in the latest order and the other core cache in the eviction destination is in the second most recent order. Thereafter, the process proceeds to Step D6.

ステップＤ９では、判定回路１４２４は第２のバッファ判定処理を行う。すなわち、自コアのキャッシュから２番目に古いコアのキャッシュへの送受信バッファが閾値を越えて埋まっているか否かを判定する。例えば、図１９において、自コア１０１のキャッシュ１１１から２番目に古いコア１０３のキャッシュ１１３への通信パス１９０２の送信バッファ１４０１又は受信バッファ１４１３が閾値を越えて埋まっているか否かを判定する。送信バッファ１４０１又は受信バッファ１４１３が閾値を越えて埋まっているときにはステップＤ７へ進み、送信バッファ１４０１及び受信バッファ１４１３が閾値を越えて埋まっていないときにはステップＤ１０へ進む。 In step D9, the determination circuit 1424 performs a second buffer determination process. That is, it is determined whether the transmission / reception buffer from the cache of the own core to the cache of the second oldest core is filled beyond the threshold. For example, in FIG. 19, it is determined whether the transmission buffer 1401 or the reception buffer 1413 of the communication path 1902 from the cache 111 of the own core 101 to the cache 113 of the second oldest core 103 is filled beyond the threshold. When the transmission buffer 1401 or the reception buffer 1413 is filled beyond the threshold, the process proceeds to step D7. When the transmission buffer 1401 and the reception buffer 1413 are not filled beyond the threshold, the process proceeds to step D10.

ステップＤ７では、図１０のステップＢ７と同様に、キャッシュシステムは破棄処理を行う。すなわち、自コアキャッシュの最古エントリを破棄する。具体的には、このステップでは何も処理せず、後のステップＤ６の上書き処理により自コアキャッシュの最古エントリが破棄される。 In step D7, as in step B7 of FIG. 10, the cache system performs a discard process. That is, the oldest entry in the own core cache is discarded. Specifically, nothing is processed in this step, and the oldest entry in the own core cache is discarded by the overwrite process in the subsequent step D6.

次に、ステップＤ３では、図１０のステップＢ３と同様に、キャッシュシステムはＬＲＵ更新処理を行う。すなわち、自コアの第１のＬＲＵ情報については自コアキャッシュの最古エントリを最新に更新し、第２のＬＲＵ情報については自コアキャッシュが最新になるように更新する。その後、ステップＤ６に進む。 Next, in step D3, the cache system performs LRU update processing as in step B3 of FIG. That is, for the first LRU information of the own core, the oldest entry in the own core cache is updated to the latest, and the second LRU information is updated to be the latest. Thereafter, the process proceeds to Step D6.

ステップＤ１０では、キャッシュシステムは次候補への追い出し処理を行う。すなわち、図１９に示すように、データ転送処理１９１３により、２番目に古いコアキャッシュ１１３の最古エントリｃに自コアキャッシュ１１１の最古エントリ（他コアグループ用エントリを含まないものの中で最古）ｂを移動する。 In step D10, the cache system performs an eviction process for the next candidate. That is, as shown in FIG. 19, the data transfer process 1913 causes the oldest entry c of the own core cache 111 to be the oldest entry (not including the entry for other core groups) in the oldest entry c of the second oldest core cache 113. ) Move b.

次に、ステップＤ１１では、キャッシュシステムはＬＲＵ更新処理を行う。すなわち、自コアの第１のＬＲＵ情報については自コアキャッシュの最古エントリを最新に更新し、２番目に古いコア（追い出し先コア）の第１のＬＲＵ情報については最古エントリを最新に更新し、第２のＬＲＵ情報については自コアキャッシュを最新に、２番目に古いコア（追い出し先コア）を２番目に新しい順番に更新する。その後、ステップＤ６に進む。 Next, in step D11, the cache system performs LRU update processing. That is, the oldest entry of the own core cache is updated to the latest for the first LRU information of the own core, and the oldest entry is updated to the latest for the first LRU information of the second oldest core (ejecting destination core). For the second LRU information, the local core cache is updated to the latest, and the second oldest core (ejecting destination core) is updated to the second newest. Thereafter, the process proceeds to Step D6.

ステップＤ６では、図１０のステップＢ６と同様に、図１９に示すように、自コア１０１は、データ読み出し処理１９１１により、自コアキャッシュ１１１の最古エントリ（ＬＲＵ情報更新後の最新エントリ）ｂにメインメモリ（又は２次キャッシュ）１３０から読み出したデータを上書きすると共に、そのデータを取得する。以上で、処理を終了する。 In step D6, as in step B6 of FIG. 10, as shown in FIG. 19, the own core 101 sets the oldest entry (latest entry after updating the LRU information) b in the own core cache 111 by the data read processing 1911. The data read from the main memory (or secondary cache) 130 is overwritten and the data is acquired. Thus, the process ends.

なお、ステップＤ９において、自コアキャッシュから２番目に古いコアキャッシュへの送受信バッファが閾値を越えて埋まっていると判断された場合には、自コアキャッシュから３番目に古いコアキャッシュへの送受信バッファが閾値を越えて埋まっているか否かを判定してもよい。埋まっていない場合には、自コアのキャッシュの最古エントリを３番目に古いコアキャッシュの最古エントリに上書きし、自コアキャッシュの最古エントリに他コアキャッシュのヒットしたエントリを上書きし、該他コアキャッシュのヒットエントリを無効化する。埋まっている場合には、さらに、自コアキャッシュから４番目に古いコアキャッシュへの送受信バッファが閾値を越えて埋まっているか否かを判定し、上記と同様に、４、５、・・・番目に古いコアキャッシュを順次追い出し処理の対象にしてもよい。 When it is determined in step D9 that the transmission / reception buffer from the own core cache to the second oldest core cache is filled beyond the threshold, the transmission / reception buffer from the own core cache to the third oldest core cache is determined. It may be determined whether or not is buried beyond the threshold. If it is not filled, the oldest entry in the own core cache is overwritten with the oldest entry in the third oldest core cache, the oldest entry in the own core cache is overwritten with the hit entry in the other core cache, Invalidate hit entries in other core caches. If it is filled, it is further determined whether or not the transmission / reception buffer from the own core cache to the fourth oldest core cache exceeds the threshold, and the fourth, fifth,... Alternatively, older core caches may be sequentially targeted for eviction processing.

以上のように、本実施形態は、追い出し処理を行う際には、ＬＲＵ判定処理（ステップＤ２）の他に、送受信バッファ判定処理（ステップＤ８及びＤ９）を行う。図１８では、図１０に対して、次候補への追い出し処理（ステップＤ１０）が追加されている。ＬＲＵ判定処理（ステップＤ２）でステップＤ８に進む場合、送受信バッファ（例えば送信バッファ１４０１及び受信バッファ１４１２）の埋まり具合に応じて、破棄処理（ステップＤ７）、追い出し処理（ステップＤ４）又は次候補への追い出し処理（ステップＤ１０）を行う。これにより、トラフィックを低減することができる。 As described above, in the present embodiment, when the eviction process is performed, the transmission / reception buffer determination process (steps D8 and D9) is performed in addition to the LRU determination process (step D2). In FIG. 18, a process of evicting to the next candidate (step D10) is added to FIG. When the process proceeds to step D8 in the LRU determination process (step D2), the discard process (step D7), the eviction process (step D4), or the next candidate is selected depending on how the transmission / reception buffers (for example, the transmission buffer 1401 and the reception buffer 1412) are filled. The eviction process (step D10) is performed. Thereby, traffic can be reduced.

図２０は、本発明の他の実施形態による共有分散キャッシュシステムの構成例を示す図である。以下、図２０が図１４と異なる点を説明する。図１４では、共通の１個の情報メモリ１４２１内に全キャッシュ１１１〜１１３のＬＲＵ情報１４２２及び送受信バッファ情報１４２３を記憶させる。図２０の情報メモリ１４２１ａ，１４２１ｂ，１４２１ｃ、ＬＲＵ情報１４２２ａ，１４２２ｂ，１４２２ｃ、及び送受信バッファ１４２３ａ，１４２３ｂ，１４２３ｃは、それぞれ図１４の情報メモリ１４２１、ＬＲＵ情報１４２２及び送受信バッファ１４２３を分散させたものである。 FIG. 20 is a diagram showing a configuration example of a shared distributed cache system according to another embodiment of the present invention. Hereinafter, the points of FIG. 20 different from FIG. 14 will be described. In FIG. 14, the LRU information 1422 and the transmission / reception buffer information 1423 of all the caches 111 to 113 are stored in one common information memory 1421. The information memories 1421a, 1421b, 1421c, LRU information 1422a, 1422b, 1422c, and transmission / reception buffers 1423a, 1423b, 1423c in FIG. 20 are obtained by distributing the information memory 1421, LRU information 1422, and transmission / reception buffer 1423 in FIG. is there.

情報メモリ１４２１ａは、キャッシュ１１１内に設けられ、ＬＲＵ情報１４２２ａ及び送受信バッファ情報１４２３ａを記憶する。ＬＲＵ情報１４２２ａは、図１の第１のＬＲＵ情報１２１に対応する。送受信バッファ情報１４２３ａは、送信バッファ１４０１及び受信バッファ１４１１の各バッファが閾値を越えて埋まっているか否かの情報である。 The information memory 1421a is provided in the cache 111, and stores LRU information 1422a and transmission / reception buffer information 1423a. The LRU information 1422a corresponds to the first LRU information 121 of FIG. The transmission / reception buffer information 1423a is information indicating whether each of the transmission buffer 1401 and the reception buffer 1411 is filled beyond a threshold.

情報メモリ１４２１ｂは、キャッシュ１１２内に設けられ、ＬＲＵ情報１４２２ｂ及び送受信バッファ情報１４２３ｂを記憶する。ＬＲＵ情報１４２２ｂは、図１の第１のＬＲＵ情報１２２に対応する。送受信バッファ情報１４２３ｂは、送信バッファ１４０２及び受信バッファ１４１２の各バッファが閾値を越えて埋まっているか否かの情報である。 The information memory 1421b is provided in the cache 112, and stores LRU information 1422b and transmission / reception buffer information 1423b. The LRU information 1422b corresponds to the first LRU information 122 in FIG. The transmission / reception buffer information 1423b is information indicating whether or not each of the transmission buffer 1402 and the reception buffer 1412 is filled beyond a threshold.

情報メモリ１４２１ｃは、キャッシュ１１３内に設けられ、ＬＲＵ情報１４２２ｃ及び送受信バッファ情報１４２３ｃを記憶する。ＬＲＵ情報１４２２ｃは、図１の第１のＬＲＵ情報１２３に対応する。送受信バッファ情報１４２３ｃは、送信バッファ１４０３及び受信バッファ１４１３の各バッファが閾値を越えて埋まっているか否かの情報である。 The information memory 1421c is provided in the cache 113, and stores LRU information 1422c and transmission / reception buffer information 1423c. The LRU information 1422c corresponds to the first LRU information 123 of FIG. The transmission / reception buffer information 1423c is information indicating whether each of the transmission buffer 1403 and the reception buffer 1413 is filled beyond a threshold.

図１の第２のＬＲＵ情報１２９は、判定回路１４２４内に記憶させてもよいし、キャッシュ１１１〜１１３内に記憶させてもよい。図２０のキャッシュシステムの動作は、図１４のキャッシュシステムと同様である。 The second LRU information 129 of FIG. 1 may be stored in the determination circuit 1424, or may be stored in the caches 111 to 113. The operation of the cache system of FIG. 20 is the same as that of the cache system of FIG.

以上のように、本実施形態は、エントリ（キャッシュライン）移動の判定をＬＲＵ情報の他、トラフィックの状態を直接知ることができる送受信バッファ情報を基にエントリの移動判定を行うことにより、トラフィックを直接的に低減することができる。 As described above, in this embodiment, the determination of entry (cache line) movement is performed by performing entry movement determination based on transmission / reception buffer information that can directly know the traffic state in addition to LRU information. It can be reduced directly.

本実施形態は、複数の処理装置（コア）１０１〜１０３と、前記複数の処理装置１０１〜１０３に一対一に接続された複数のキャッシュ１１１〜１１３と、前記複数のキャッシュ１１１〜１１３に接続され、前記複数のキャッシュ１１１〜１１３間のデータ通信を行うネットワーク１４２５と、前記キャッシュ１１１〜１１３毎に設けられ、前記キャッシュ１１１〜１１３が前記ネットワーク１４２５に対してデータ通信するための複数のバッファ１４０１〜１４０３，１４１１〜１４１３とを有する。前記処理装置が要求したデータが、一のキャッシュでミスした場合、前記複数のバッファの状態（送受信バッファ情報）１４２３に応じて、前記キャッシュが前記ネットワークを介してデータ通信する。 In the present embodiment, a plurality of processing devices (cores) 101 to 103, a plurality of caches 111 to 113 connected to the plurality of processing devices 101 to 103 on a one-to-one basis, and a plurality of caches 111 to 113 are connected. , A network 1425 that performs data communication between the plurality of caches 111 to 113, and a plurality of buffers 1401 that are provided for each of the caches 111 to 113 and that allow the caches 111 to 113 to perform data communication with the network 1425. 1403, 1411-1413. When the data requested by the processing device misses in one cache, the cache performs data communication via the network according to the plurality of buffer states (transmission / reception buffer information) 1423.

前記処理装置が要求したデータが、一のキャッシュでミスし（図５のステップＳ２）、他のキャッシュでヒットした場合（図５のステップＳ６）に、前記ヒットした他のキャッシュから前記一のキャッシュへデータを送信するための前記バッファに所定量データが存在せず（図１５のステップＣ８）、かつ、前記一のキャッシュから前記ヒットした他のキャッシュへデータを送信するための前記バッファに所定量データが存在しないとき（図１５のステップＣ９）には、前記一のキャッシュのエントリ及び前記ヒットした他のキャッシュのエントリを交換する（図１５のステップＣ４）。 When the data requested by the processing device misses in one cache (step S2 in FIG. 5) and hits in another cache (step S6 in FIG. 5), the one cache is transferred from the other hit cache. There is no predetermined amount of data in the buffer for transmitting data to (step C8 in FIG. 15), and the predetermined amount of data in the buffer for transmitting data from the one cache to the other hit cache When there is no data (step C9 in FIG. 15), the entry in the one cache and the entry in the other hit cache are exchanged (step C4 in FIG. 15).

また、前記処理装置が要求したデータが、一のキャッシュでミスし（図５のステップＳ２）、他のキャッシュでヒットした場合（図５のステップＳ６）に、前記ヒットした他のキャッシュから前記一のキャッシュへデータを送信するための前記バッファに所定量データが存在するとき（図１５のステップＣ８）には、前記要求した処理装置は前記ヒットした他のキャッシュのデータを参照する（図１５のステップＣ７）。 In addition, when the data requested by the processing device misses in one cache (step S2 in FIG. 5) and hits in another cache (step S6 in FIG. 5), the data from the other cache hit becomes the one. When a predetermined amount of data exists in the buffer for transmitting data to the cache (step C8 in FIG. 15), the requested processing device refers to the data in the other cache hit (in FIG. 15). Step C7).

また、前記処理装置が要求したデータが、一のキャッシュでミスし（図５のステップＳ２）、他のキャッシュでヒットした場合（図５のステップＳ６）に、前記ヒットした他のキャッシュから前記一のキャッシュへデータを送信するための前記バッファに所定量データが存在せず（図１５のステップＣ８）、かつ、前記一のキャッシュから前記ヒットした他のキャッシュへデータを送信するための前記バッファに所定量データが存在し（図１５のステップＣ９）、かつ、前記複数のキャッシュの中で前記一のキャッシュが最も最近使用されていないとき（図１５のステップＣ１０）には、前記ヒットした他のキャッシュのエントリを前記一のキャッシュのエントリに上書きする（図１５のステップＣ１１）。 In addition, when the data requested by the processing device misses in one cache (step S2 in FIG. 5) and hits in another cache (step S6 in FIG. 5), the data from the other cache hit becomes the one. There is no predetermined amount of data in the buffer for transmitting data to the cache (step C8 in FIG. 15), and the buffer for transmitting data from the one cache to the other cache hit. When a predetermined amount of data exists (step C9 in FIG. 15) and the one cache is least recently used among the plurality of caches (step C10 in FIG. 15), the other hits The cache entry is overwritten with the one cache entry (step C11 in FIG. 15).

また、前記処理装置が要求したデータが、一のキャッシュでミスし（図５のステップＳ２）、他のキャッシュでヒットした場合（図５のステップＳ６）に、前記ヒットした他のキャッシュから前記一のキャッシュへデータを送信するための前記バッファに所定量データが存在せず（図１５のステップＣ８）、かつ、前記一のキャッシュから前記ヒットした他のキャッシュへデータを送信するための前記バッファに所定量データが存在し（図１５のステップＣ９）、かつ、前記複数のキャッシュの中で前記一のキャッシュが最も最近使用されていないものでなく（図１５のステップＣ１０）、かつ、前記一のキャッシュから前記複数のキャッシュの中で最も最近使用されていないキャッシュへデータを送信するための前記バッファに所定量データが存在しないとき（図１５のステップＣ１０）には、前記一のキャッシュのエントリを前記最も最近使用されていないキャッシュのエントリに上書きし、前記ヒットした他のキャッシュのエントリを前記一のキャッシュのエントリに上書きする（図１５のステップＣ１３）。 In addition, when the data requested by the processing device misses in one cache (step S2 in FIG. 5) and hits in another cache (step S6 in FIG. 5), the data from the other cache hit becomes the one. There is no predetermined amount of data in the buffer for transmitting data to the cache (step C8 in FIG. 15), and the buffer for transmitting data from the one cache to the other cache hit. There is a predetermined amount of data (step C9 in FIG. 15), and the one of the plurality of caches is not the least recently used (step C10 in FIG. 15). A predetermined amount of data is stored in the buffer for transmitting data from the cache to the least recently used cache of the plurality of caches. Is not present (step C10 in FIG. 15), the entry of the one cache is overwritten with the entry of the least recently used cache, and the entry of the other cache hit is the entry of the one cache. (Step C13 in FIG. 15).

また、前記処理装置が要求したデータが、一のキャッシュでミスし（図５のステップＳ２）、他のキャッシュでヒットした場合（図５のステップＳ６）に、前記ヒットした他のキャッシュから前記一のキャッシュへデータを送信するための前記バッファに所定量データが存在せず（図１５のステップＣ８）、かつ、前記一のキャッシュから前記ヒットした他のキャッシュへデータを送信するための前記バッファに所定量データが存在し（図１５のステップＣ９）、かつ、前記複数のキャッシュの中で前記一のキャッシュが最も最近使用されていないものでなく（図１５のステップＣ１０）、かつ、前記一のキャッシュから前記複数のキャッシュの中で最も最近使用されていないキャッシュへデータを送信するための前記バッファに所定量データが存在するとき（図１５のステップＣ１０）には、前記ヒットした他のキャッシュのエントリを前記一のキャッシュのエントリに上書きする（図１５のステップＣ１１）。 In addition, when the data requested by the processing device misses in one cache (step S2 in FIG. 5) and hits in another cache (step S6 in FIG. 5), the data from the other cache hit becomes the one. There is no predetermined amount of data in the buffer for transmitting data to the cache (step C8 in FIG. 15), and the buffer for transmitting data from the one cache to the other cache hit. There is a predetermined amount of data (step C9 in FIG. 15), and the one of the plurality of caches is not the least recently used (step C10 in FIG. 15). A predetermined amount of data is stored in the buffer for transmitting data from the cache to the least recently used cache of the plurality of caches. When but present in (step C10 in FIG. 15) overwrites the entry of other caches that the hit entry of said one cache (step C11 in FIG. 15).

また、前記処理装置が要求したデータが、一のキャッシュでミスし（図５のステップＳ２）、他のキャッシュでヒットした場合（図５のステップＳ６）に、前記ヒットした他のキャッシュから前記一のキャッシュへデータを送信するための前記バッファに所定量データが存在せず（図１５のステップＣ８）、かつ、前記一のキャッシュから前記ヒットした他のキャッシュへデータを送信するための前記バッファに所定量データが存在し（図１５のステップＣ９）、かつ、前記複数のキャッシュの中で前記一のキャッシュが最も最近使用されていないものでなく（図１５のステップＣ１０）、かつ、前記一のキャッシュから前記複数のキャッシュの中で最も最近使用されていないキャッシュへデータを送信するための前記バッファに所定量データが存在し（図１５のステップＣ１０）、かつ、前記一のキャッシュから前記複数のキャッシュの中で２番目に最近使用されていないキャッシュへデータを送信するための前記バッファに所定量データが存在しないときには、前記一のキャッシュのエントリを前記２番目に最近使用されていないキャッシュのエントリに上書きし、前記ヒットした他のキャッシュのエントリを前記一のキャッシュのエントリに上書きする。 In addition, when the data requested by the processing device misses in one cache (step S2 in FIG. 5) and hits in another cache (step S6 in FIG. 5), the data from the other cache hit becomes the one. There is no predetermined amount of data in the buffer for transmitting data to the cache (step C8 in FIG. 15), and the buffer for transmitting data from the one cache to the other cache hit. There is a predetermined amount of data (step C9 in FIG. 15), and the one of the plurality of caches is not the least recently used (step C10 in FIG. 15). A predetermined amount of data is stored in the buffer for transmitting data from the cache to the least recently used cache of the plurality of caches. (Step C10 in FIG. 15), and there is no predetermined amount of data in the buffer for transmitting data from the one cache to the second least recently used cache among the plurality of caches Sometimes, the one cache entry is overwritten with the second least recently used cache entry, and the other cache entry with the hit is overwritten with the one cache entry.

また、前記処理装置が要求したデータが、前記複数のキャッシュすべてでミスした場合（図５のステップＳ２、Ｓ４及びＳ６）に、前記要求した処理装置の一のキャッシュから前記複数のキャッシュの中で最も最近使用していないキャッシュへデータを送信するための前記バッファに所定量データが存在しないとき（図１８のステップＤ８）には、前記一のキャッシュのエントリを前記最も最近使用していないキャッシュのエントリに移動し（図１８のステップＤ４）、主記憶装置（メインメモリ）又は前記複数のキャッシュより下位階層のキャッシュ（２次キャッシュ）１３０から読み出したデータを前記移動した一のキャッシュのエントリに上書きする（図１８のステップＤ６）。 In addition, when the data requested by the processing device misses in all of the plurality of caches (steps S2, S4, and S6 in FIG. 5), from one cache of the requested processing device to the plurality of caches. When the predetermined amount of data does not exist in the buffer for transmitting data to the least recently used cache (step D8 in FIG. 18), the entry of the one cache is stored in the least recently used cache. Move to the entry (step D4 in FIG. 18), and overwrite the data read from the main storage device (main memory) or the cache (secondary cache) 130 lower than the plurality of caches on the entry of the moved one cache. (Step D6 in FIG. 18).

また、前記処理装置が要求したデータが、前記複数のキャッシュすべてでミスした場合（図５のステップＳ２、Ｓ４及びＳ６）に、前記要求した処理装置の一のキャッシュから前記複数のキャッシュの中で最も最近使用していないキャッシュへデータを送信するための前記バッファに所定量データが存在し（図１８のステップＤ８）、かつ、前記一のキャッシュから前記複数のキャッシュの中で２番目に最近使用していないキャッシュへデータを送信するための前記バッファに所定量データが存在しないとき（図１８のステップＤ９）には、前記一のキャッシュのエントリを前記２番目に最近使用していないキャッシュのエントリに移動し（図１８のステップＤ１０）、主記憶装置又は前記複数のキャッシュより下位階層のキャッシュから読み出したデータを前記移動した一のキャッシュのエントリに上書きする（図１８のステップＤ６）。 In addition, when the data requested by the processing device misses in all of the plurality of caches (steps S2, S4, and S6 in FIG. 5), from one cache of the requested processing device to the plurality of caches. A predetermined amount of data exists in the buffer for transmitting data to the least recently used cache (step D8 in FIG. 18), and the second most recently used from the one cache to the plurality of caches When a predetermined amount of data does not exist in the buffer for transmitting data to a cache that has not been processed (step D9 in FIG. 18), the cache entry that has not been used the second most recently is entered. (Step D10 in FIG. 18) and read from the main storage device or a cache lower than the plurality of caches. Overwrite out data to the entry of one cache and the moved (step D6 in Fig. 18).

また、前記処理装置が要求したデータが、前記複数のキャッシュすべてでミスした場合（図５のステップＳ２、Ｓ４及びＳ６）に、前記要求した処理装置の一のキャッシュから前記複数のキャッシュの中で最も最近使用していないキャッシュへデータを送信するための前記バッファに所定量データが存在し（図１８のステップＤ８）、かつ、前記一のキャッシュから前記複数のキャッシュの中で２番目に最近使用していないキャッシュへデータを送信するための前記バッファに所定量データが存在するとき（図１８のステップＤ９）には、主記憶装置又は前記複数のキャッシュより下位階層のキャッシュから読み出したデータを前記一のキャッシュのエントリに上書きする（図１８のステップＤ６）。 In addition, when the data requested by the processing device misses in all of the plurality of caches (steps S2, S4, and S6 in FIG. 5), from one cache of the requested processing device to the plurality of caches. A predetermined amount of data exists in the buffer for transmitting data to the least recently used cache (step D8 in FIG. 18), and the second most recently used from the one cache to the plurality of caches When there is a predetermined amount of data in the buffer for transmitting data to a cache that has not been processed (step D9 in FIG. 18), the data read from the main storage device or a cache in a lower hierarchy than the plurality of caches is stored in the buffer. The entry in one cache is overwritten (step D6 in FIG. 18).

また、前記処理装置が要求したデータが、前記複数のキャッシュすべてでミスした場合（図５のステップＳ２、Ｓ４及びＳ６）に、前記要求した処理装置の一のキャッシュから前記複数のキャッシュの中で最も最近使用していないキャッシュへデータを送信するための前記バッファに所定量データが存在し（図１８のステップＤ８）、かつ、前記一のキャッシュから前記複数のキャッシュの中で２番目に最近使用していないキャッシュへデータを送信するための前記バッファに所定量データが存在し（図１８のステップＤ９）、かつ、前記一のキャッシュから前記複数のキャッシュの中で３番目に最近使用していないキャッシュへデータを送信するための前記バッファに所定量データが存在しないときには、前記一のキャッシュのエントリを前記３番目に最近使用していないキャッシュのエントリに移動し、主記憶装置又は前記複数のキャッシュより下位階層のキャッシュから読み出したデータを前記移動した一のキャッシュのエントリに上書きする。 In addition, when the data requested by the processing device misses in all of the plurality of caches (steps S2, S4, and S6 in FIG. 5), from one cache of the requested processing device to the plurality of caches. A predetermined amount of data exists in the buffer for transmitting data to the least recently used cache (step D8 in FIG. 18), and the second most recently used from the one cache to the plurality of caches There is a predetermined amount of data in the buffer for sending data to a cache that has not been performed (step D9 in FIG. 18), and the third most recently used from the one cache to the plurality of caches When there is no predetermined amount of data in the buffer for sending data to the cache, the entry in the one cache is Go to the entry of the cache serial third not recently used, overwrites the data read from the cache of the lower hierarchical than the main storage device or the plurality of cache entries of one cache that the mobile.

図１４のキャッシュシステムは、前記複数のバッファ１１１〜１１３の状態（送受信バッファ情報）１４２３を集中して記憶する１個の情報メモリ１４２１を有する。これに対し、図２０のキャッシュシステムは、前記複数のバッファ１１１〜１１３の状態１４２３ａ，１４２３ｂ，１４２３ｃを前記バッファ毎に分散して記憶する複数の情報メモリ１４２１ａ，１４２１ｂ，１４２１ｃを有する。 The cache system shown in FIG. 14 has one information memory 1421 for storing the states (transmission / reception buffer information) 1423 of the plurality of buffers 111 to 113 in a centralized manner. On the other hand, the cache system of FIG. 20 includes a plurality of information memories 1421a, 1421b, and 1421c that store the states 1423a, 1423b, and 1423c of the plurality of buffers 111 to 113 in a distributed manner for each buffer.

なお、上記実施形態は、何れも本発明を実施するにあたっての具体化の例を示したものに過ぎず、これらによって本発明の技術的範囲が限定的に解釈されてはならないものである。すなわち、本発明はその技術思想、又はその主要な特徴から逸脱することなく、様々な形で実施することができる。 The above-described embodiments are merely examples of implementation in carrying out the present invention, and the technical scope of the present invention should not be construed in a limited manner. That is, the present invention can be implemented in various forms without departing from the technical idea or the main features thereof.

本発明の実施形態は、例えば以下のように種々の適用が可能である。 The embodiment of the present invention can be applied in various ways as follows, for example.

（付記１）
複数の処理装置と、
前記複数の処理装置に一対一に接続された複数のキャッシュと、
前記複数のキャッシュに接続され、前記複数のキャッシュ間のデータ通信を行うネットワークと、
前記キャッシュ毎に設けられ、前記キャッシュが前記ネットワークに対してデータ通信するための複数のバッファとを有し、
前記処理装置が要求したデータが、一のキャッシュでミスした場合、前記複数のバッファの状態に応じて、前記キャッシュが前記ネットワークを介してデータ通信することを特徴とするキャッシュシステム。
（付記２）
前記処理装置が要求したデータが、一のキャッシュでミスし、他のキャッシュでヒットした場合に、前記ヒットした他のキャッシュから前記一のキャッシュへデータを送信するための前記バッファに所定量データが存在せず、かつ、前記一のキャッシュから前記ヒットした他のキャッシュへデータを送信するための前記バッファに所定量データが存在しないときには、前記一のキャッシュのエントリ及び前記ヒットした他のキャッシュのエントリを交換することを特徴とする付記１記載のキャッシュシステム。
（付記３）
前記処理装置が要求したデータが、一のキャッシュでミスし、他のキャッシュでヒットした場合に、前記ヒットした他のキャッシュから前記一のキャッシュへデータを送信するための前記バッファに所定量データが存在するときには、前記要求した処理装置は前記ヒットした他のキャッシュのデータを参照することを特徴とする付記１記載のキャッシュシステム。
（付記４）
前記処理装置が要求したデータが、一のキャッシュでミスし、他のキャッシュでヒットした場合に、前記ヒットした他のキャッシュから前記一のキャッシュへデータを送信するための前記バッファに所定量データが存在せず、かつ、前記一のキャッシュから前記ヒットした他のキャッシュへデータを送信するための前記バッファに所定量データが存在し、かつ、前記複数のキャッシュの中で前記一のキャッシュが最も最近使用されていないときには、前記ヒットした他のキャッシュのエントリを前記一のキャッシュのエントリに上書きすることを特徴とする付記１記載のキャッシュシステム。
（付記５）
前記処理装置が要求したデータが、一のキャッシュでミスし、他のキャッシュでヒットした場合に、前記ヒットした他のキャッシュから前記一のキャッシュへデータを送信するための前記バッファに所定量データが存在せず、かつ、前記一のキャッシュから前記ヒットした他のキャッシュへデータを送信するための前記バッファに所定量データが存在し、かつ、前記複数のキャッシュの中で前記一のキャッシュが最も最近使用されていないものでなく、かつ、前記一のキャッシュから前記複数のキャッシュの中で最も最近使用されていないキャッシュへデータを送信するための前記バッファに所定量データが存在しないときには、前記一のキャッシュのエントリを前記最も最近使用されていないキャッシュのエントリに上書きし、前記ヒットした他のキャッシュのエントリを前記一のキャッシュのエントリに上書きすることを特徴とする付記１記載のキャッシュシステム。
（付記６）
前記処理装置が要求したデータが、一のキャッシュでミスし、他のキャッシュでヒットした場合に、前記ヒットした他のキャッシュから前記一のキャッシュへデータを送信するための前記バッファに所定量データが存在せず、かつ、前記一のキャッシュから前記ヒットした他のキャッシュへデータを送信するための前記バッファに所定量データが存在し、かつ、前記複数のキャッシュの中で前記一のキャッシュが最も最近使用されていないものでなく、かつ、前記一のキャッシュから前記複数のキャッシュの中で最も最近使用されていないキャッシュへデータを送信するための前記バッファに所定量データが存在するときには、前記ヒットした他のキャッシュのエントリを前記一のキャッシュのエントリに上書きすることを特徴とする付記１記載のキャッシュシステム。
（付記７）
前記処理装置が要求したデータが、一のキャッシュでミスし、他のキャッシュでヒットした場合に、前記ヒットした他のキャッシュから前記一のキャッシュへデータを送信するための前記バッファに所定量データが存在せず、かつ、前記一のキャッシュから前記ヒットした他のキャッシュへデータを送信するための前記バッファに所定量データが存在し、かつ、前記複数のキャッシュの中で前記一のキャッシュが最も最近使用されていないものでなく、かつ、前記一のキャッシュから前記複数のキャッシュの中で最も最近使用されていないキャッシュへデータを送信するための前記バッファに所定量データが存在し、かつ、前記一のキャッシュから前記複数のキャッシュの中で２番目に最近使用されていないキャッシュへデータを送信するための前記バッファに所定量データが存在しないときには、前記一のキャッシュのエントリを前記２番目に最近使用されていないキャッシュのエントリに上書きし、前記ヒットした他のキャッシュのエントリを前記一のキャッシュのエントリに上書きすることを特徴とする付記１記載のキャッシュシステム。
（付記８）
前記処理装置が要求したデータが、前記複数のキャッシュすべてでミスした場合に、前記要求した処理装置の一のキャッシュから前記複数のキャッシュの中で最も最近使用していないキャッシュへデータを送信するための前記バッファに所定量データが存在しないときには、前記一のキャッシュのエントリを前記最も最近使用していないキャッシュのエントリに移動し、主記憶装置又は前記複数のキャッシュより下位階層のキャッシュから読み出したデータを前記移動した一のキャッシュのエントリに上書きすることを特徴とする付記１記載のキャッシュシステム。
（付記９）
前記処理装置が要求したデータが、前記複数のキャッシュすべてでミスした場合に、前記要求した処理装置の一のキャッシュから前記複数のキャッシュの中で最も最近使用していないキャッシュへデータを送信するための前記バッファに所定量データが存在し、かつ、前記一のキャッシュから前記複数のキャッシュの中で２番目に最近使用していないキャッシュへデータを送信するための前記バッファに所定量データが存在しないときには、前記一のキャッシュのエントリを前記２番目に最近使用していないキャッシュのエントリに移動し、主記憶装置又は前記複数のキャッシュより下位階層のキャッシュから読み出したデータを前記移動した一のキャッシュのエントリに上書きすることを特徴とする付記１記載のキャッシュシステム。
（付記１０）
前記処理装置が要求したデータが、前記複数のキャッシュすべてでミスした場合に、前記要求した処理装置の一のキャッシュから前記複数のキャッシュの中で最も最近使用していないキャッシュへデータを送信するための前記バッファに所定量データが存在し、かつ、前記一のキャッシュから前記複数のキャッシュの中で２番目に最近使用していないキャッシュへデータを送信するための前記バッファに所定量データが存在するときには、主記憶装置又は前記複数のキャッシュより下位階層のキャッシュから読み出したデータを前記一のキャッシュのエントリに上書きすることを特徴とする付記１記載のキャッシュシステム。
（付記１１）
前記処理装置が要求したデータが、前記複数のキャッシュすべてでミスした場合に、前記要求した処理装置の一のキャッシュから前記複数のキャッシュの中で最も最近使用していないキャッシュへデータを送信するための前記バッファに所定量データが存在し、かつ、前記一のキャッシュから前記複数のキャッシュの中で２番目に最近使用していないキャッシュへデータを送信するための前記バッファに所定量データが存在し、かつ、前記一のキャッシュから前記複数のキャッシュの中で３番目に最近使用していないキャッシュへデータを送信するための前記バッファに所定量データが存在しないときには、前記一のキャッシュのエントリを前記３番目に最近使用していないキャッシュのエントリに移動し、主記憶装置又は前記複数のキャッシュより下位階層のキャッシュから読み出したデータを前記移動した一のキャッシュのエントリに上書きすることを特徴とする付記１記載のキャッシュシステム。
（付記１２）
さらに、前記複数のバッファの状態を集中して記憶する１個の情報メモリを有する付記１記載のキャッシュシステム。
（付記１３）
さらに、前記複数のバッファの状態を前記バッファ毎に分散して記憶する複数の情報メモリを有する付記１記載のキャッシュシステム。 (Appendix 1)
A plurality of processing devices;
A plurality of caches connected one-to-one to the plurality of processing devices;
A network connected to the plurality of caches and performing data communication between the plurality of caches;
A plurality of buffers provided for each cache, the cache having data communication with the network;
When the data requested by the processing device misses in one cache, the cache performs data communication via the network according to the states of the plurality of buffers.
(Appendix 2)
When the data requested by the processing device misses in one cache and hits in another cache, a predetermined amount of data is stored in the buffer for transmitting data from the hit other cache to the one cache. When there is no predetermined amount of data in the buffer for transmitting data from the one cache to the other hit cache, the one cache entry and the other hit cache entry The cache system according to supplementary note 1, characterized in that:
(Appendix 3)
When the data requested by the processing device misses in one cache and hits in another cache, a predetermined amount of data is stored in the buffer for transmitting data from the hit other cache to the one cache. 2. The cache system according to claim 1, wherein when the request exists, the requested processing device refers to the data of the other hit cache.
(Appendix 4)
When the data requested by the processing device misses in one cache and hits in another cache, a predetermined amount of data is stored in the buffer for transmitting data from the hit other cache to the one cache. Does not exist, and a predetermined amount of data exists in the buffer for transmitting data from the one cache to the other hit cache, and the one cache is the most recent among the plurality of caches The cache system according to claim 1, wherein when the cache is not used, the entry of the other cache hit is overwritten on the entry of the one cache.
(Appendix 5)
When the data requested by the processing device misses in one cache and hits in another cache, a predetermined amount of data is stored in the buffer for transmitting data from the hit other cache to the one cache. Does not exist, and a predetermined amount of data exists in the buffer for transmitting data from the one cache to the other hit cache, and the one cache is the most recent among the plurality of caches When the predetermined amount of data does not exist in the buffer for transmitting data from the one cache to the least recently used cache among the plurality of caches when not being used, Overwrite cache entry with the least recently used cache entry and hit Cache system according to Supplementary Note 1, wherein the overwriting entries in other cache entries of the one caches.
(Appendix 6)
When the data requested by the processing device misses in one cache and hits in another cache, a predetermined amount of data is stored in the buffer for transmitting data from the hit other cache to the one cache. Does not exist, and a predetermined amount of data exists in the buffer for transmitting data from the one cache to the other hit cache, and the one cache is the most recent among the plurality of caches The hit is found when there is a predetermined amount of data in the buffer for transmitting data from the one cache to the least recently used cache among the plurality of caches that are not used Supplementary note 1 overwriting another cache entry with the one cache entry Mounting cache system.
(Appendix 7)
When the data requested by the processing device misses in one cache and hits in another cache, a predetermined amount of data is stored in the buffer for transmitting data from the hit other cache to the one cache. Does not exist, and a predetermined amount of data exists in the buffer for transmitting data from the one cache to the other hit cache, and the one cache is the most recent among the plurality of caches A predetermined amount of data exists in the buffer for transmitting data from the one cache to the least recently used cache among the plurality of caches, and is not used, and the one cache Sending data from one cache to the second least recently used cache among the plurality of caches When a predetermined amount of data does not exist in the buffer, the one cache entry is overwritten with the second least recently used cache entry, and the hit other cache entry is overwritten with the one cache entry. The cache system according to appendix 1, wherein the cache system is overwritten.
(Appendix 8)
When data requested by the processing device misses in all of the plurality of caches, the data is transmitted from one cache of the requested processing device to the least recently used cache among the plurality of caches. When a predetermined amount of data does not exist in the buffer, the entry of the one cache is moved to the entry of the least recently used cache, and the data read from the main storage device or a cache lower than the plurality of caches 2. The cache system according to appendix 1, wherein the entry of the moved one cache is overwritten.
(Appendix 9)
When data requested by the processing device misses in all of the plurality of caches, the data is transmitted from one cache of the requested processing device to the least recently used cache among the plurality of caches. There is a predetermined amount of data in the buffer, and there is no predetermined amount of data in the buffer for transmitting data from the one cache to the second least recently used cache among the plurality of caches. Sometimes the entry of the one cache is moved to the entry of the second least recently used cache, and the data read from the main storage device or the cache lower than the plurality of caches is moved to the cache of the moved one cache. The cache system according to appendix 1, wherein the entry is overwritten.
(Appendix 10)
When data requested by the processing device misses in all of the plurality of caches, the data is transmitted from one cache of the requested processing device to the least recently used cache among the plurality of caches. There is a predetermined amount of data in the buffer, and there is a predetermined amount of data in the buffer for transmitting data from the one cache to the second least recently used cache among the plurality of caches. 2. The cache system according to appendix 1, wherein data read from a main storage device or a cache lower than the plurality of caches is overwritten on an entry of the one cache.
(Appendix 11)
When data requested by the processing device misses in all of the plurality of caches, the data is transmitted from one cache of the requested processing device to the least recently used cache among the plurality of caches. A predetermined amount of data exists in the buffer, and there is a predetermined amount of data in the buffer for transmitting data from the one cache to the second least recently used cache among the plurality of caches. And when there is no predetermined amount of data in the buffer for transmitting data from the one cache to the third least recently used cache among the plurality of caches, the entry of the one cache is Move to the third least recently used cache entry and either main storage or the caches Cache system according to Supplementary Note 1, wherein the overwriting data read from the cache of the lower hierarchy than the Interview entry of one cache that the mobile.
(Appendix 12)
The cache system according to appendix 1, further comprising one information memory for storing the states of the plurality of buffers in a concentrated manner.
(Appendix 13)
The cache system according to appendix 1, further comprising a plurality of information memories for storing the states of the plurality of buffers in a distributed manner for each of the buffers.

共有分散キャッシュシステムの構成例を示す図である。It is a figure which shows the structural example of a shared distributed cache system. 図１の第１のＬＲＵ情報及び第２のＬＲＵ情報の詳細を示す図である。It is a figure which shows the detail of the 1st LRU information and 2nd LRU information of FIG. 共有分散キャッシュシステムの他の構成例を示す図である。It is a figure which shows the other structural example of a shared distributed cache system. 図３の第１のＬＲＵ情報及び第２のＬＲＵ情報の詳細を示す図である。It is a figure which shows the detail of the 1st LRU information and 2nd LRU information of FIG. 共有分散キャッシュシステムにおけるデータロードアクセス動作を示すフローチャートである。It is a flowchart which shows the data load access operation | movement in a shared distributed cache system. 図５のステップＳ７の交換処理を示すフローチャートである。It is a flowchart which shows the exchange process of FIG.5 S7. 図１及び図２のシステムにおける図６の交換処理の例を示す図である。It is a figure which shows the example of the exchange process of FIG. 6 in the system of FIG.1 and FIG.2. 図１及び図２のシステムにおける図６の交換処理の例を示す図である。It is a figure which shows the example of the exchange process of FIG. 6 in the system of FIG.1 and FIG.2. 図１及び図２のシステムにおける図６の交換処理の例を示す図である。It is a figure which shows the example of the exchange process of FIG. 6 in the system of FIG.1 and FIG.2. 図５のステップＳ８の追い出し処理を示すフローチャートである。It is a flowchart which shows the eviction process of step S8 of FIG. 図１及び図２のシステムにおける図１０の追い出し処理の例を示す図である。It is a figure which shows the example of the eviction process of FIG. 10 in the system of FIG.1 and FIG.2. 図１及び図２のシステムにおける図１０の追い出し処理の例を示す図である。It is a figure which shows the example of the eviction process of FIG. 10 in the system of FIG.1 and FIG.2. 図１及び図２のシステムにおける図１０の追い出し処理の例を示す図である。It is a figure which shows the example of the eviction process of FIG. 10 in the system of FIG.1 and FIG.2. 本発明の実施形態による共有分散キャッシュシステムの構成例を示す図である。It is a figure which shows the structural example of the shared distributed cache system by embodiment of this invention. 図１４のキャッシュシステムにおける図５のステップＳ７の交換処理を示すフローチャートである。It is a flowchart which shows the exchange process of FIG.5 S7 in the cache system of FIG. 図１５のステップＣ１３の玉突き処理の例を示す図である。It is a figure which shows the example of the ball hit process of step C13 of FIG. 図１５のステップＣ１１の移動処理の例を示す図である。It is a figure which shows the example of the movement process of step C11 of FIG. 図１４のキャッシュシステムにおける図５のステップＳ８の追い出し処理を示すフローチャートである。FIG. 15 is a flowchart showing eviction processing in step S8 of FIG. 5 in the cache system of FIG. 図１８のステップＤ１０の次候補への追い出し処理の例を示す図である。It is a figure which shows the example of the eviction process to the next candidate of step D10 of FIG. 本発明の他の実施形態による共有分散キャッシュシステムの構成例を示す図である。It is a figure which shows the structural example of the shared distributed cache system by other embodiment of this invention.

Explanation of symbols

１０１〜１０３コア（処理装置）
１１１〜１１３キャッシュ
１３０メインメモリ
１４０１〜１４０３送信バッファ
１４１１〜１４１３受信バッファ
１４２１情報メモリ
１４２２ＬＲＵ情報
１４２３送受信バッファ情報
１４２４判定回路
１４２５キャッシュ間接続ネットワーク 101-103 core (processing equipment)
111 to 113 cache 130 main memory 1401 to 1403 transmission buffer 1411 to 1413 reception buffer 1421 information memory 1422 LRU information 1423 transmission / reception buffer information 1424 determination circuit 1425 inter-cache connection network

Claims

A plurality of processing devices;
A plurality of caches connected one-to-one to the plurality of processing devices;
A network connected to the plurality of caches and performing data communication between the plurality of caches;
A plurality of buffers provided for each cache, the cache having data communication with the network;
When the data requested by the processing device misses in one cache, the cache performs data communication via the network according to the state of the plurality of buffers,
When the data requested by the processing device misses in one cache and hits in another cache, a predetermined amount of data is stored in the buffer for transmitting data from the hit other cache to the one cache. Does not exist, and a predetermined amount of data exists in the buffer for transmitting data from the one cache to the other hit cache, and the one cache is the most recent among the plurality of caches When the predetermined amount of data does not exist in the buffer for transmitting data from the one cache to the least recently used cache among the plurality of caches when not being used, Overwrite cache entry with the least recently used cache entry and hit Features and to Ruki catcher Tsu shoe system to overwrite the entry of another cache entry of the one of the cache.

A plurality of processing devices;
A plurality of caches connected one-to-one to the plurality of processing devices;
A network connected to the plurality of caches and performing data communication between the plurality of caches;
A plurality of buffers provided for each cache, the cache having data communication with the network;
When the data requested by the processing device misses in one cache, the cache performs data communication via the network according to the state of the plurality of buffers,
When data requested by the processing device misses in all of the plurality of caches, the data is transmitted from one cache of the requested processing device to the least recently used cache among the plurality of caches. There is a predetermined amount of data in the buffer, and there is no predetermined amount of data in the buffer for transmitting data from the one cache to the second least recently used cache among the plurality of caches. Sometimes the entry of the one cache is moved to the entry of the second least recently used cache, and the data read from the main storage device or the cache lower than the plurality of caches is moved to the cache of the moved one cache. features and be Ruki turbocharger Tsu Gerhard system to overwrite the entry.

When the data requested by the processing device misses in one cache and hits in another cache, a predetermined amount of data is stored in the buffer for transmitting data from the hit other cache to the one cache. When there is no predetermined amount of data in the buffer for transmitting data from the one cache to the other hit cache, the one cache entry and the other hit cache entry cache system according to claim 1 or 2, wherein the exchanging.

When data requested by the processing device misses in all of the plurality of caches, the data is transmitted from one cache of the requested processing device to the least recently used cache among the plurality of caches. When the predetermined amount of data does not exist in the buffer, the entry of the one cache is moved to the entry of the least recently used cache, and the data read from the main storage device or the cache at a lower hierarchy than the plurality of caches The cache system according to any one of claims 1 to 3, wherein an entry of the moved one cache is overwritten.