JP2012164209A

JP2012164209A - Cache control method, cache control device, and program for cache control

Info

Publication number: JP2012164209A
Application number: JP2011025245A
Authority: JP
Inventors: Ryota Mibu; 亮太壬生; Tomoyoshi Sugawara; 智義菅原
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2011-02-08
Filing date: 2011-02-08
Publication date: 2012-08-30

Abstract

PROBLEM TO BE SOLVED: To provide a cache control method capable of suppressing deterioration of processing performance even when a large amount of data is received, in the case where received data is written to a cache memory and processed. SOLUTION: Area setting means 81 sets, in a cache memory, a writing area to which a predetermined amount of received data is writable. For each processing of part or all of the received data written in the writing area, area deletion means 82 deletes the area in which the processed received data is written. For each processing of the received data, the area setting means 81 newly sets an area of a size corresponding to the deleted area, as an area to which received data is writable, at the position where received data arriving after the predetermined amount of received data is to be written.

Description

The present invention relates to a cache control method, a cache control device, and a cache control program for controlling a cache memory using cache injection.

First, the reception processing performed when a packet is received via a communication network will be described. General reception processing uses a reception buffer / reception queue (Receive Queue, hereinafter RQ) that the device driver prepares in advance in main memory (hereinafter, memory).

FIG. 13 is an explanatory diagram showing general reception processing and the state of the RQ during that processing. When the NIC (Network Interface Card) receives a packet (step S601 in FIG. 13(a)), it transfers the received data to the memory by DMA (Direct Memory Access) (step S602). In the memory, as shown in step B1 in FIG. 13(b), the transferred data is written to the reception buffer / reception queue (RQ) that the driver has prepared in the memory (step S603).

Thereafter, when the NIC notifies the CPU of a hardware interrupt (hereinafter, H/W interrupt) (step S604), the CPU starts processing the received data (step S605).

At this time, the received data is not registered in the cache memory (hereinafter also simply called the cache). Therefore, as shown in step B2 in FIG. 13(b), when the CPU refers to the data in the cache (step S606), a cache miss occurs (step S607). As a result, the data is read from the reception buffer in the memory into the cache (step S608). While the data is being read from the memory, the CPU suspends processing and waits for the data (step S609). When the data read from the memory is registered in the cache (step S610), the CPU reads the data and resumes processing (step S611). Steps S601 to S611 are performed in this way, and the reception processing ends (step S612).

Next, prefetching, which is widely used to address the delay caused by memory access, will be described. Prefetching is a cache control method that addresses this delay by reading data from the memory into the cache before the CPU refers to that data.

Next, cache injection, a cache control method that addresses the delay caused by memory access from the standpoint of writing data, will be described. Cache injection writes data directly to the cache instead of writing it to the memory.

Non-Patent Document 1 describes a method of implementing cache injection. In that method, the address of the data to be injected is registered in a table attached to the cache, and writes to that address are snooped so that the data is written to the cache. Non-Patent Document 1 also describes using the open_window instruction (opwd) as the instruction for registering a cache injection target address.

Patent Document 1 describes a method of implementing cache injection that targets data that is already cached. In that method, a cached address area is treated as the target of cache injection. A cache injection target can therefore be set by creating a cache for the area.

Non-Patent Document 2 describes a method of creating a cache on the Power architecture. In that method, a cache is created with the Data Cache Block Set to Zero (dcbz) instruction, which creates a zero-initialized cache without accessing the memory.

Patent Document 2 describes an information processing device that communicates with the outside via a network. After using data stored in the cache memory, the device erases the cache line corresponding to the used data.

Patent Document 3 describes a packet transfer device having a buffer memory. The device releases a buffer area when the area is no longer needed after packet processing is complete, and allocates multiple buffers when the capacity required for the header area is larger than the capacity of one buffer in the buffer array.

Patent Document 4 describes a method of transferring data between processors. In that method, during a series of packet transfers, a reception-end interrupt is issued to the CPU when the last packet data of the first packet sequence has been written to the main memory. The range just written to the main memory is then recognized from the address range information notified with the interrupt, which indicates the write range in the main memory, and cache processing is executed for that portion.

Patent Document 1: U.S. Pat. No. 7,836,254
Patent Document 2: JP 2008-35202 A
Patent Document 3: JP 2009-88622 A
Patent Document 4: JP-A-8-287031

Non-Patent Document 1: A. Milenkovic and V. Milutinovic, "Cache injection on bus based multiprocessors", SRDS '98: Proceedings of the 17th IEEE Symposium on Reliable Distributed Systems, pp. 341-346, 1998.
Non-Patent Document 2: IBM, "Power ISA Version 2.06 Revision B", pp. 685, 965, July 23, 2010.

In the general reception processing shown in FIG. 13(a), CPU processing is suspended while data is being read from the memory into the cache, so the reception processing is delayed.

Prefetching presupposes that the data to be read already exists in the memory. That is, the CPU can prefetch only after the data has been written to the network reception buffer in the memory and the CPU has detected that write (that is, after the CPU has detected the interrupt).

Consequently, if the CPU tries to refer to the packet immediately, the prefetch does not complete in time, CPU processing is suspended, and a delay occurs. Specifically, when the CPU prefetches the various headers of a received packet or the beginning of the received data, that data is referred to immediately after reception processing starts or at a relatively early stage, so the prefetch does not complete in time. CPU processing is therefore suspended, and the delay caused by memory access is not eliminated. In particular, prefetching yields no benefit in the reception processing of packets with a small data size.
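The timing argument above can be illustrated with a rough model. The sketch below is not taken from the patent: the function name and the cycle counts are hypothetical, and it assumes that a prefetch hides exactly as much memory latency as the CPU spends on other work before touching the data.

```python
def stall_cycles(memory_latency, work_before_use):
    """Cycles the CPU waits for data whose prefetch was issued
    `work_before_use` cycles before the data is referenced.

    Simplified model (an assumption, not the patent's): the prefetch
    completes `memory_latency` cycles after it is issued, and the CPU
    stalls only for the part of that latency not covered by other work.
    """
    return max(0, memory_latency - work_before_use)


# Hypothetical numbers: 200-cycle memory latency.
header_stall = stall_cycles(200, 10)    # header is read almost immediately
payload_stall = stall_cycles(200, 500)  # later data is read much later
```

With these illustrative numbers the header access still stalls for most of the memory latency, while data referenced much later does not stall, which is consistent with the observation that small packets gain little from prefetching.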

Furthermore, prefetching does not reduce memory accesses. For example, when a multi-core processor performs packet reception processing via a communication network in parallel with other processing that accesses memory, concentrated memory accesses strain the memory access bandwidth. As a result, processing performance does not improve (does not scale out).

With cache injection, on the other hand, data is written directly to the cache, so the CPU only has to read the data from the cache and need not read it from the memory; no delay due to memory access occurs. This method is useful when data that does not exist in the memory is received from the NIC or another core, as in reception processing that receives packets via a communication network or in multi-core parallel processing.

Moreover, cache injection generally does not write data out to the memory, so memory accesses can be reduced. Using cache injection for packet reception processing via a communication network and for data transfer between cores can also resolve the problem that processing performance does not improve (does not scale out) because the memory access bandwidth is strained.

However, when cache injection is simply applied to packet reception processing or multi-core parallel processing, its benefits cannot be fully obtained under high load. Specifically, under high load, neither the elimination of processing delay due to memory access nor the removal of the performance limit imposed by memory access bandwidth is fully achieved.

Because cache capacity is generally small, writing a large amount of data to the cache evicts other data from it. That is, when packet reception processing is under high load and more data than the reception processing can keep up with is sent to the cache by cache injection, the cache overflows, and data that the CPU will refer to is evicted from the cache to the memory before it is processed.

When unprocessed data is evicted to the memory, a cache miss occurs when the CPU tries to refer to it. The CPU then reads the data from the memory, just as when cache injection is not used, and a processing delay due to memory access occurs.
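The overflow described above can be reproduced with a toy model. The following is a minimal sketch under stated assumptions, not the patent's mechanism: the cache is modeled as fully associative with LRU replacement, each packet occupies one entry, and a miss does not refill the cache. The names `LRUCache` and `count_misses` are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Toy cache holding `capacity` packet-sized entries with LRU eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # packet id -> True, oldest first

    def inject(self, pkt):
        # Cache injection: place the packet directly in the cache.
        self.lines[pkt] = True
        self.lines.move_to_end(pkt)
        if len(self.lines) > self.capacity:
            # Overflow: the least recently used entry is evicted to memory.
            self.lines.popitem(last=False)

    def read(self, pkt):
        # True on a hit; False on a miss (data would come from memory).
        hit = pkt in self.lines
        if hit:
            self.lines.move_to_end(pkt)
        return hit


def count_misses(capacity, burst):
    """Inject `burst` packets before the CPU processes any of them,
    then read them in arrival order and count the cache misses."""
    cache = LRUCache(capacity)
    for pkt in range(burst):
        cache.inject(pkt)
    return sum(0 if cache.read(pkt) else 1 for pkt in range(burst))
```

In this model a burst that fits in the cache causes no misses, whereas a burst of 10 packets into a 4-entry cache leaves only the last 4 packets cached, so the first 6 reads miss.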

Likewise, when multiple cores sharing a cache memory perform parallel processing, reception processing on one core may use a large amount of the cache and evict data being processed by other cores to the memory. Referring to the evicted data then incurs a delay due to memory access, which lowers overall processing performance.

Therefore, an object of the present invention is to provide a cache control method, a cache control device, and a cache control program that can suppress a decrease in processing performance even when a large amount of data is received, in the case where received data is written to a cache memory and processed.

The cache control method according to the present invention is characterized by: setting, in a cache memory, a write area to which a predetermined amount of received data can be written; deleting, for each processing of part or all of the received data written in the write area, the area in which the processed received data is written; and newly setting, for each processing of the received data, an area of a size corresponding to the deleted area, as an area to which received data can be written, at the position where received data arriving after the predetermined amount of received data is to be written.

The cache control device according to the present invention includes: an area setting unit that sets, in a cache memory, a write area to which a predetermined amount of received data can be written; and a deletion unit that deletes, for each processing of part or all of the received data written in the write area, the area in which the processed received data is written. The device is characterized in that, for each processing of the received data, the area setting unit newly sets an area of a size corresponding to the deleted area, as an area to which received data can be written, at the position where received data arriving after the predetermined amount of received data is to be written.

The cache control program according to the present invention causes a computer to execute: an area setting process of setting, in a cache memory, a write area to which a predetermined amount of received data can be written; and a deletion process of deleting, for each processing of part or all of the received data written in the write area, the area in which the processed received data is written. The program is characterized in that, in the area setting process, for each processing of the received data, an area of a size corresponding to the deleted area is newly set, as an area to which received data can be written, at the position where received data arriving after the predetermined amount of received data is to be written.

According to the present invention, a decrease in processing performance can be suppressed even when a large amount of data is received, in the case where received data is written to a cache memory and processed.

FIG. 1 is a block diagram showing an example of the cache control device in the first embodiment of the present invention.
FIG. 2 is a block diagram showing an example of the CPU in the first embodiment.
FIG. 3 is a flowchart showing an operation example of the cache control device in the first embodiment.
FIG. 4 is a flowchart showing an operation example of the cache control device and an example of the RQ state in the first embodiment.
FIG. 5 is a block diagram showing an example of the cache control device in the second embodiment of the present invention.
FIG. 6 is a block diagram showing an example of the CPU in the third embodiment.
FIG. 7 is an explanatory diagram showing an example of the data structure set in the reception buffer queue.
FIG. 8 is a flowchart showing an operation example of the cache control device in the third embodiment.
FIG. 9 is an explanatory diagram showing another example of the data structure set in the reception buffer queue.
FIG. 10 is a block diagram showing an example of the CPU in the fifth embodiment.
FIG. 11 is a flowchart showing an operation example of the cache control device in the fifth embodiment.
FIG. 12 is a block diagram showing an example of the minimum configuration of the cache control device according to the present invention.
FIG. 13 is an explanatory diagram showing general reception processing and the state of the RQ during that processing.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

Embodiment 1.
FIG. 1 is a block diagram showing an example of the cache control device according to the first embodiment of the present invention. The cache control device in this embodiment includes a processor 1, a NIC 2, and a memory 3.

The processor 1 includes a CPU 11, a cache memory 12 (hereinafter, cache 12), a memory controller 13 (hereinafter, MC 13), and an I/O controller 15 (hereinafter, IOC 15). The NIC 2 is connected to the IOC 15, and the memory 3 is connected to the MC 13.

Inside the processor 1, the CPU 11 is connected to the cache 12, and the cache 12, the MC 13, and the IOC 15 are connected to one another via an internal bus 14.

A reception buffer (not shown) is allocated in the memory 3, and part of the reception buffer is temporarily stored in the cache 12 as a result of access by the CPU 11 or cache injection.

FIG. 2 is a block diagram showing an example of the CPU 11 in this embodiment. The CPU 11 includes an area setting unit 21 and an area deleting unit 22.

The area setting unit 21 sets, in the cache 12, an area to which a predetermined amount of received data can be written (hereinafter, write area) as a target of cache injection. Specifically, at initialization, the area setting unit 21 sets the area to which an integer number (n) of packets from the head of the reception queue (RQ) are written as a cache injection target in the cache 12. For example, as described in Patent Document 1 and Non-Patent Document 2, the area setting unit 21 may create a zero-initialized cache and thereby make that cache area a cache injection target. However, the method of setting a cache injection target area in the cache 12 is not limited to this.

In addition, after the area deleting unit 22 (described later) deletes part or all of the write area, the area setting unit 21 newly sets an area of a size corresponding to the deleted area in the cache 12 as a cache injection target. At that time, the area setting unit 21 sets the new write area at the position where received data arriving after the predetermined amount of received data is to be written.

The area deleting unit 22 deletes, from among the received data written in the write area of the cache 12, the area holding received data that has undergone predetermined processing. Examples of the predetermined processing include analyzing the header of a received packet, discarding a packet, and invoking an upper-layer handler. However, the processing is not limited to these. In the following description, deleting an area of the cache 12 in which data is written may also be referred to as deleting the cache.

The IOC 15 selects either the memory 3 or the cache 12 and writes the received data there. Specifically, the IOC 15 writes received data to the write area of the cache 12. When received data cannot be written to the cache 12, the IOC 15 writes that data to the memory 3.
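The routing decision made by the IOC 15 can be sketched as follows. This is an illustrative model only: the function name `ioc_write`, the use of RQ slot indices as keys, and the dictionaries standing in for the cache 12 and the memory 3 are all assumptions, and the patent does not specify how the hardware decides that a write to the cache is impossible.

```python
def ioc_write(slot, data, injection_targets, cache, memory):
    """Write received data for RQ slot `slot`.

    If the slot is currently set as a cache injection target (the write
    area), the data goes to the cache; otherwise it falls back to memory.
    Returns the name of the destination for illustration.
    """
    if slot in injection_targets:
        cache[slot] = data
        return "cache"
    memory[slot] = data
    return "memory"


# Hypothetical state: slots 0 to 2 form the write area (n = 3).
targets = {0, 1, 2}
cache, memory = {}, {}
first = ioc_write(1, b"pkt-a", targets, cache, memory)
second = ioc_write(5, b"pkt-b", targets, cache, memory)
```

Here the packet for slot 1 lands in the cache, and the packet for slot 5, which is outside the write area, is written to memory instead of displacing cached data.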

The CPU 11 may read a program (cache control program) stored in a storage unit (not shown) of the cache control device and operate as the area setting unit 21 and the area deleting unit 22 in accordance with that program.

Next, the operation within the processor 1 will be described. FIG. 3 is a flowchart showing an operation example of the cache control device in this embodiment. First, at initialization, the area setting unit 21 sets the area to which an integer number (n) of packets from the head of the reception queue (RQ) are written as a cache injection target. Thereafter, each time a packet is received, steps S301 to S303 illustrated in FIG. 3 are performed.

When the cache control device receives a packet, general packet processing is performed (step S301). For example, the CPU 11 may discard the packet when the destination MAC address in the Ethernet header differs from its own. The CPU 11 may also invoke an upper-layer handler according to the protocol information in the Ethernet header. The packet processing performed here is not limited to these examples, and other processing may be performed.

After the packet processing is complete, the area deleting unit 22 deletes the cache of the processed data (step S302). The area setting unit 21 then sets, as a cache injection target, the area of the cache 12 to which the packet received n packets after the processed packet (that is, the packet n ahead) will be written (step S303).
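Steps S302 and S303 amount to sliding a window of n injection targets along the RQ. The sketch below models this with slot indices; treating the RQ as a ring of `rq_size` slots with wraparound is an assumption made for illustration, and the class and method names are hypothetical.

```python
class InjectionWindow:
    """Tracks which RQ slots are currently cache injection targets."""

    def __init__(self, n, rq_size):
        if not 0 < n <= rq_size:
            raise ValueError("n must be between 1 and rq_size")
        self.n = n
        self.rq_size = rq_size
        # Initialization: the first n slots from the head of the RQ.
        self.targets = set(range(n))

    def on_packet_processed(self, slot):
        # Step S302: delete the cache area of the packet just processed.
        self.targets.discard(slot)
        # Step S303: set the area for the packet received n packets later.
        new_slot = (slot + self.n) % self.rq_size
        self.targets.add(new_slot)
        return new_slot


window = InjectionWindow(n=3, rq_size=8)
next_slot = window.on_packet_processed(0)  # slot 0 done, slot 3 becomes a target
```

Because one area is removed and one is added per processed packet, the number of injection targets stays at n throughout.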

Next, the operation of the cache control device as a whole will be described. FIG. 4 is a flowchart showing an operation example of the cache control device and an example of the RQ state in this embodiment. The RQ state example in FIG. 4(b) shows the case of n = 3.

When the NIC 2 receives a packet (step S701 in FIG. 4(a)), the IOC 15 transfers the received data to the cache 12 by cache injection (step S702). As illustrated in step B3 in FIG. 4(b), the received data is thereby written to the cache 12 as a cached copy of the reception buffer / reception queue (RQ) in the memory 3 (step S703).

Thereafter, when the NIC 2 notifies the CPU 11 of an H/W interrupt (step S704), the CPU 11 starts reception processing (step S705). Because the target data has already been written to the cache 12 by cache injection, when the CPU 11 refers to the received data during the reception processing (step S706), it can refer to the data immediately without a cache miss, as illustrated in step B4 in FIG. 4(b) (step S707).

Thereafter, as illustrated in step B5 in FIG. 4(b), the area deleting unit 22 operates on the cache 12 at the end of the reception processing and deletes the cache of the processed data (step S711). The area setting unit 21 then newly sets, as a cache injection target, the area of the cache 12 to which the packet n ahead will be written. As a result, the contents of the cache 12 are updated (step S710). Steps S701 to S711 are performed in this way, and the reception processing ends (step S712).

As described above, according to this embodiment, the area setting unit 21 sets a write area in the cache 12, and the area deleting unit 22 deletes, for each processing of part or all of the received data written in the cache 12, the area in which the processed received data is written. Then, for each processing of the received data, the area setting unit 21 newly sets an area of a size corresponding to the deleted area at the position where received data arriving after the predetermined amount of received data is to be written.

With this configuration, a decrease in processing performance can be suppressed even when a large amount of data is received, in the case where received data is written to the cache memory and processed. That is, by reducing memory accesses and limiting cache usage, the benefits of cache injection can be obtained to the fullest.

For example, compared with the general reception processing shown in FIG. 13(a), the following memory accesses are reduced: the write from the NIC 2 to the memory 3 shown in step S603; the read by the CPU 11 from the memory 3 shown in step S608; and the write-back to the memory 3 of data that is evicted from the cache 12 as a result of these operations. Reducing these memory accesses eliminates the associated processing delay, so processing performance can be improved.

In addition, according to this embodiment, the portion of the cache 12 that is used is limited to n packets' worth. The cache usage can therefore be controlled by setting n to an appropriate value. This prevents the cache from being used in large quantities and prevents the received data to be processed next from being evicted from the cache to the memory. In other words, the drop in processing performance caused by re-reading evicted data from the memory can be suppressed.
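As a rough illustration of how n bounds cache usage, the sketch below computes the cache footprint of the write area and the largest n that fits a given budget. None of the numbers come from the patent: the 1518-byte RQ entry size, the 128-byte cache line, and the 64 KiB budget are hypothetical, and the model assumes each RQ entry occupies a whole number of cache lines.

```python
def window_cache_bytes(n, entry_bytes, line_bytes):
    """Cache bytes occupied by a write area of n RQ entries,
    rounding each entry up to whole cache lines."""
    lines_per_entry = -(-entry_bytes // line_bytes)  # ceiling division
    return n * lines_per_entry * line_bytes


def max_window_packets(budget_bytes, entry_bytes, line_bytes):
    """Largest n whose write area fits within `budget_bytes` of cache."""
    return budget_bytes // window_cache_bytes(1, entry_bytes, line_bytes)


# Hypothetical sizes: 1518-byte entries, 128-byte lines, 64 KiB budget.
usage_n3 = window_cache_bytes(3, 1518, 128)
largest_n = max_window_packets(64 * 1024, 1518, 128)
```

Under these assumptions, n = 3 occupies 4608 bytes of cache, and at most 42 entries fit in the 64 KiB budget.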

Embodiment 2.
Next, a second embodiment will be described. FIG. 5 is a block diagram showing an example of the cache control device according to the second embodiment of the present invention. Components that are the same as in the first embodiment are given the same reference numerals as in FIG. 1, and their description is omitted. The cache control device in this embodiment also includes a processor 1, a NIC 2, and a memory 3. The NIC 2 and the memory 3 are the same as in the first embodiment.

The processor 1 includes two CPUs 11, a cache 12, an MC 13, and an IOC 15. Each of the two CPUs 11 is connected to the cache 12. That is, the cache control device according to the present embodiment is a multi-core processor in which a plurality of CPUs 11 share the cache 12, as illustrated in FIG. 5. The CPU 11, the cache 12, the MC 13, and the IOC 15 are the same as those in the first embodiment.

Thus, even when the cache control device has the configuration illustrated in FIG. 5, processing performance can be improved by reducing memory accesses and limiting cache usage, as in the first embodiment.

Furthermore, according to the present embodiment, heavy use of the shared cache can be suppressed. Data handled by processing on other cores is therefore evicted from the cache to memory less often, which limits the performance degradation of that other processing. That is, the present embodiment enables scale-out through parallel processing.

Embodiment 3.
Next, a third embodiment will be described. The configuration of the cache control device in this embodiment is the same as that in the first embodiment. That is, the cache control device according to the present embodiment also includes the processor 1, the NIC 2, and the memory 3. The processor 1 includes a CPU 11, a cache 12, an MC 13, and an IOC 15. However, as illustrated in FIG. 5, the cache control device may include a plurality of CPUs 11.

The third embodiment describes the processing performed when the cache control device newly receives n or more packets while the CPU 11 is processing a received packet.

FIG. 6 is a block diagram illustrating an example of the CPU 11 in the present embodiment. The CPU 11 in this embodiment includes an area setting means 21, an area deletion means 22, a received data determination means 23, and a data reading means 24. The area setting means 21 and the area deletion means 22 are the same as those in the first embodiment. The CPU 11 may read a program (cache control program) stored in a storage unit (not shown) of the cache control device and, in accordance with that program, operate as the area setting means 21, the area deletion means 22, the received data determination means 23, and the data reading means 24.

The received data determination means 23 determines whether data has been received in excess of the amount that can be written to the write area, that is, the area into which a predetermined amount of received data can be written. Specifically, the received data determination means 23 checks whether the received data n positions ahead has been received. The received data determination means 23 may make this determination by, for example, checking whether the cache injection targets set in the cache 12 have reached the predetermined amount. When data exceeding the amount that can be set has been received, the received data determination means 23 instructs the IOC 15 to write the excess data to the memory 3.

FIG. 7 is an explanatory diagram showing an example of the data structure set in the receive buffer queue (RQ). In the example shown in FIG. 7, the receive buffer queue includes a receive data area, which stores received data, and a management data area, which holds information for managing the received data. A management data area and a receive data area are thus reserved in the receive buffer queue for each packet. FIG. 7 illustrates the case where the receive buffer queue is reserved as a contiguous area, but the data structure is the same when the receive buffer queue is managed as a list. Either the receive data area alone, or both the receive data area and the management data area, may be made the target of cache injection.

Received data is written to the receive data area starting from the beginning of that area. The management data area contains a time stamp taken at reception, the received data size, a reception completion flag indicating that reception has finished, and so on. When instructing that data be written to the memory 3, the received data determination means 23 instructs that the information to be set in the management data area be written together with the received data.
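The per-packet layout described above can be pictured with a small model. The following is only a minimal sketch: the class name `RqEntry`, its field names, and the helper `write_to_memory` are hypothetical and do not appear in the original text, and the model ignores alignment and sizes.

```python
from dataclasses import dataclass


@dataclass
class RqEntry:
    """One per-packet slot of the receive buffer queue (illustrative names)."""
    timestamp: int = 0         # management data: time stamp taken at reception
    data_size: int = 0         # management data: received data size in bytes
    rx_complete: bool = False  # management data: reception completion flag
    data: bytes = b""          # receive data area, filled from its beginning


def write_to_memory(entry: RqEntry, payload: bytes, now: int) -> None:
    """Model of writing a packet to memory together with its management data."""
    entry.data = payload
    entry.data_size = len(payload)
    entry.timestamp = now
    entry.rx_complete = True   # a reader can now tell the data is in memory
```

In this model, a reader that finds `rx_complete` set knows the slot was filled in memory, which is the condition the later flow uses to choose prefetching.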

The data reading means 24 selects either the memory 3 or the cache 12 and reads the written data from it. For example, when the memory 3 holds the data structure illustrated in FIG. 7, a reception completion flag that indicates completion for a given piece of received data means that the data has been written to memory. In this case, the data reading means 24 performs a prefetch. On the other hand, when the received data is not stored in the memory 3, the area setting means 21 newly sets the area that will store the received data in the cache 12 as a target of cache injection.

Thus, when the received data determination means 23 determines that data exceeding the amount that can be set in the write area has been received (that is, that the data n positions ahead has been received), the data reading means 24 prefetches the received data from the memory 3.

Next, the operation inside the processor 1 will be described. FIG. 8 is a flowchart illustrating an operation example of the cache control device according to the present embodiment. The process of setting the write area as a target for cache injection at initialization is the same as in the first embodiment. Thereafter, each time a packet is received, the processing of steps S401 to S405 illustrated in FIG. 8 is performed.

When the cache control device receives a packet, packet processing is performed on the received packet (step S401). The processing in step S401 is the same as the processing in step S301 in the first embodiment. After the packet processing is completed, the area deletion means 22 deletes the cache of the target data that has been processed (step S402).

After the cache is deleted, the received data determination means 23 checks whether the packet n positions ahead has already been received (step S403). The received data determination means 23 makes this check based on, for example, the reception completion flag contained in the management data area illustrated in FIG. 7.

If the packet n positions ahead has already been received (YES in step S403), its data has been written to memory, so the data reading means 24 performs a prefetch (step S404). If the packet n positions ahead has not yet been received (NO in step S403), the area setting means 21 sets the area to which that packet data will be written as a target for cache injection (step S405). In this case, the data reading means 24 will read the target data from the cache 12.
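Steps S402 to S405 can be sketched as a single function over slot indices. This is a simplified model under stated assumptions, not the patented implementation: the function and parameter names are hypothetical, the queue is assumed to wrap around as a ring, and cache operations are modeled as set membership.

```python
def after_packet_processed(i, n, rx_complete, injection_targets, prefetched):
    """Model steps S402-S405 for the packet in queue slot i.

    rx_complete       -- list of bools, reception completion flag per slot
    injection_targets -- set of slots currently set as cache injection targets
    prefetched        -- list collecting slots that were prefetched from memory
    """
    injection_targets.discard(i)         # S402: delete cache of processed data
    ahead = (i + n) % len(rx_complete)   # slot n positions ahead (ring assumed)
    if rx_complete[ahead]:               # S403: already received?
        prefetched.append(ahead)         # S404: data is in memory, prefetch it
        return "prefetch"
    injection_targets.add(ahead)         # S405: set as cache injection target
    return "inject"
```

For example, with `n = 2` and slots 0 to 2 already received, finishing slot 0 leads to a prefetch of slot 2, while finishing slot 1 registers slot 3 for cache injection.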

As described above, according to the present embodiment, the received data determination means 23 determines whether data exceeding the amount that can be written to the write area has been received. When such data has been received, the IOC 15 writes it to the memory 3, and the data reading means 24 prefetches it from the memory 3. Therefore, in addition to the effects of the first embodiment, even when the cache control device temporarily receives packets at a rate exceeding its reception processing performance, it can process the received data without dropping packets and while suppressing the delay caused by memory access.

For example, when a cache line initialized to zero is established with the dcbz instruction, the data in memory is not referenced. Consequently, if the cache is created with the dcbz instruction after the n-th and subsequent packet data has been written to memory by DMA, the CPU reads the zero-initialized data in the cache. That is, the CPU refers to data that differs from the data actually received.

In the present embodiment, however, the cache 12 is controlled according to the processing illustrated in FIG. 8, and packet data written to the memory 3 is also processed. This avoids the problem that, depending on how the cache is registered, the n-th and subsequent packet data is not processed correctly.

Embodiment 4.
Next, a fourth embodiment will be described. The configuration of the cache control device in this embodiment is the same as that in the first embodiment. However, as illustrated in FIG. 5, the cache control device may include a plurality of CPUs 11.

The first to third embodiments described the case where the whole of each receive data area is made the target of cache injection. The present embodiment differs from the first to third embodiments in that, to reduce cache usage further, only the leading portion of the received data, such as the header, is made the target of cache injection. In all other respects it is the same as the first to third embodiments.

FIG. 9 is an explanatory diagram showing another example of the data structure set in the receive buffer queue (RQ). Like the RQ illustrated in FIG. 7, the RQ illustrated in FIG. 9 includes a receive data area and a management data area. The shaded portions of the RQ illustrated in FIG. 9 correspond to the areas targeted for cache injection in the present embodiment.

The area setting means 21 sets at least a partial area of each piece of received data in the cache 12 as a target of cache injection. Hereinafter, the area corresponding to this part is referred to as a reduced area.

Specifically, the area setting means 21 sets the first k cache lines of each receive data area as the target for cache injection. Here, k is a predetermined integer. This reduces the amount of cache consumed by the parts of the receive data area to which no received data is written.

For example, when the received data is no larger than k cache lines, cache usage is reduced by (receive data area size) − (size of k cache lines) per packet compared with making the whole receive data area the target. When the received data is larger than k cache lines, the CPU 11 can access the memory 3 to refer to the remaining portion of the received data.
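The per-packet saving stated above is simple arithmetic, sketched below. The function name and the example sizes (a 2048-byte receive data area and 128-byte cache lines) are assumptions chosen for illustration only.

```python
def cache_saving_per_packet(data_area_size, k, cache_line_size):
    """Bytes of cache saved per packet when only the first k cache lines
    of the receive data area are made cache injection targets."""
    reduced_area = k * cache_line_size
    if reduced_area > data_area_size:
        raise ValueError("reduced area cannot exceed the receive data area")
    # (receive data area size) - (size of k cache lines)
    return data_area_size - reduced_area
```

With the assumed sizes and `k = 2`, the reduced area is 256 bytes, so each packet occupies 1792 fewer bytes of cache than when the whole receive data area is registered.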

Although FIG. 9 illustrates the case where information in the receive data area is the target of cache injection, information in the management data area may also be included in the target.

As described above, according to the present embodiment, the area setting means 21 sets at least a partial area of each piece of received data in the cache 12 as the cache injection target area. Therefore, in addition to the effects of the first to third embodiments, cache usage can be reduced.

That is, when packets are received using the receive buffer queue illustrated in FIG. 9 and the entire receive data area is set in the cache 12 as the target of cache injection, the cache assigned to the parts of the receive data area to which no received data is written is wasted. In the present embodiment, the parts to which no received data is written are reduced, so cache usage can be suppressed further.

Embodiment 5.
Next, a fifth embodiment will be described. The fourth embodiment described the case where part of the receive data area is set in the cache 12 as a target for cache injection. In addition to the processing of the fourth embodiment, the cache control device in the present embodiment prefetches the remaining portion of the received data that was not cache-injected, according to the size of the received data.

The configuration of the cache control device in this embodiment is also the same as that in the first embodiment. That is, the cache control device according to the present embodiment also includes the processor 1, the NIC 2, and the memory 3. The processor 1 includes a CPU 11, a cache 12, an MC 13, and an IOC 15. However, as illustrated in FIG. 5, the cache control device may include a plurality of CPUs 11.

FIG. 10 is a block diagram illustrating an example of the CPU 11 in the present embodiment. The CPU 11 in this embodiment includes an area setting means 21, an area deletion means 22, a received data determination means 23, a data reading means 24, and a size comparison means 25. The area setting means 21, the area deletion means 22, and the received data determination means 23 are the same as those in the fourth embodiment. The CPU 11 may read a program (cache control program) stored in a storage unit (not shown) of the cache control device and, in accordance with that program, operate as the area setting means 21, the area deletion means 22, the received data determination means 23, the data reading means 24, and the size comparison means 25.

The size comparison means 25 compares the size of the area corresponding to the leading part of each piece of received data (that is, the reduced area) with the size of the received data. The size comparison means 25 may determine the size of the received data based on, for example, the received data size contained in the management data area illustrated in FIG. 9.

If the size of the received data is less than or equal to the size of the reduced area, the received data is cache-injected. If the received data is larger than the reduced area, part of it has not been cache-injected. In this case, the data reading means 24 prefetches the data in that part from the memory 3. This reduces cache usage while also reducing the delay caused by memory access.

Next, the operation will be described. FIG. 11 is a flowchart illustrating an operation example of the cache control device according to the present embodiment. When the cache control device receives a packet, the size comparison means 25 compares the size of the received data with the size of the k cache lines registered as the cache injection target (step S501). The size comparison means 25 can determine the size of the received data by, for example, referring to the management data area in the receive buffer queue.

If the received data size is less than or equal to the size of the reduced area (NO in step S501), the received data is cache-injected. If the received data size is larger than the size of the reduced area (YES in step S501), the data reading means 24 prefetches the data in the remaining area from the memory 3 (step S502).
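The decision in steps S501 and S502 can be sketched as a function that returns which cache lines of the receive data area would need prefetching. The function name and the representation of lines as zero-based indices within the receive data area are assumptions made for this illustration.

```python
def lines_to_prefetch(data_size, k, cache_line_size):
    """Return indices of the cache lines (relative to the start of the
    receive data area) that were not cache-injected and need prefetching."""
    reduced_area = k * cache_line_size
    if data_size <= reduced_area:        # S501: NO, everything was injected
        return []
    # Number of cache lines the received data occupies (ceiling division).
    total_lines = -(-data_size // cache_line_size)
    return list(range(k, total_lines))   # S502: prefetch the remainder
```

With `k = 2` and 128-byte lines, a 200-byte packet needs no prefetch, while a 300-byte packet spans three lines and only the third (index 2) is prefetched from memory.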

Thereafter, packet processing is performed on the received packet (step S503). The processing in step S503 is the same as the processing in step S301 in the first embodiment. After the packet processing is completed, the area deletion means 22 deletes the cache of the target data that has been processed (step S504). Then, the area setting means 21 sets the area to which the packet n positions ahead will be written as a target for cache injection (step S405).

As described above, according to the present embodiment, the size comparison means 25 compares the size of the reduced area with the size of the received data. If the received data is larger than the reduced area, the IOC 15 writes the portion of the received data that exceeds the size of the reduced area to the memory 3, and the data reading means 24 prefetches that portion from the memory 3. Therefore, in addition to the effects of the fourth embodiment, the delay caused by memory access can be reduced further.

Hereinafter, the present invention will be described with reference to a specific example, but the scope of the present invention is not limited to what is described below. In this example, cache injection is assumed to be realized using the implementation method described in Patent Document 1. That is, as described in Patent Document 1, the processor in this example treats data that it currently holds in the cache as the target of cache injection.

The processor in this example is also assumed to be able to operate on the cache with the dcbz instruction and the Data Cache Block Invalidate (dcbi) instruction described in Non-Patent Document 2. The dcbi instruction described in Non-Patent Document 2 deletes a cache entry without writing it back to memory.

Next, the data layout of the receive buffer used in this example will be described. The receive buffer used in this example is assumed to be a queue in which data is stored in order, at equal intervals, in a contiguous area, as illustrated in FIG. 9. The receive buffer has a management data area and a receive data area for each packet. The management data area contains a time stamp taken at reception, the received data size, a reception completion flag, and so on. The sizes of the management data area and the receive data area are each an integral multiple of the cache line size.

At initialization, the area setting means 21 sets the areas to which the first n packets from the head of the queue will be written in the cache 12 as targets of cache injection. Here, n is set in advance to an optimal value based on the cache size and the performance of the reception processing. For example, when two CPUs perform packet processing in parallel as in the second embodiment, choosing n such that "cache size / 2 > receive data area size × n" suppresses eviction of cached received data to memory. Likewise, as in the third embodiment, choosing n such that "time required to complete a prefetch < processing time for n packets" eliminates the delay caused by memory access. The value of the integer n is not changed during program execution.
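The two conditions on n can be turned into bounds as sketched below. This is an illustration only: the function names are hypothetical, sizes are in bytes, times are integer nanoseconds, and the second function assumes every packet takes the same processing time, which the original text does not state.

```python
def max_n_for_cache(cache_size, data_area_size, num_cpus=2):
    """Largest n satisfying cache_size / num_cpus > data_area_size * n."""
    # Strict inequality: n * data_area_size * num_cpus < cache_size.
    return (cache_size - 1) // (data_area_size * num_cpus)


def min_n_for_prefetch(prefetch_ns, per_packet_ns):
    """Smallest n satisfying prefetch_ns < n * per_packet_ns
    (assumes a uniform per-packet processing time)."""
    return prefetch_ns // per_packet_ns + 1
```

Any n between `min_n_for_prefetch(...)` and `max_n_for_cache(...)` satisfies both conditions in this model; if the lower bound exceeds the upper bound, no such n exists for the assumed parameters.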

Next, the operation of the cache control device according to this example will be described. First, all receive data areas are made targets of DMA. Then, to make the receive data areas for the first n packets to be received the targets of cache injection, the processor (for example, the area setting means 21) creates cache entries for them with the dcbz instruction.

Thereafter, following the general reception process performed for each packet, the processor (for example, the area deletion means 22 and the area setting means 21) operates on the cache as illustrated in step S711. Specifically, the processor (for example, the area deletion means 22) deletes the cache of the receive data area corresponding to the packet that was processed, using the dcbi instruction. The processor (for example, the area setting means 21) then creates a cache entry with the dcbz instruction so that the receive data area to which the packet n positions ahead will be written becomes a target of cache injection. As a result, the contents of the cache 12 are updated as illustrated in step S710.

The area to which the received data (packet) n positions ahead will be written can be calculated by adding (management data area size + receive data area size) × n to the address of the receive data area of the packet that was just processed. The processor (for example, the area deletion means 22 and the area setting means 21) performs each cache operation on all cache lines of the receive data area.
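The address calculation and the per-cache-line iteration can be sketched as follows. The function name and the example values are hypothetical; the sketch only computes addresses and does not issue any cache instructions.

```python
def next_target_lines(current_data_addr, mgmt_size, data_size, n, cache_line_size):
    """Return the address of every cache line in the receive data area of the
    packet n positions ahead of the one that was just processed."""
    # Each queue slot occupies (management data area + receive data area) bytes.
    base = current_data_addr + (mgmt_size + data_size) * n
    # Both area sizes are integral multiples of the cache line size.
    return [base + offset for offset in range(0, data_size, cache_line_size)]
```

For instance, with a current receive data area at 0x1000, a 128-byte management data area, a 256-byte receive data area, `n = 2`, and 128-byte lines, the target lines start at 0x1300 and 0x1380.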

With this processing, only n packets' worth of data is cached at any time, as in the receive buffer illustrated in FIG. 4(b).
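The invariant that only n packets' worth of data is cached can be checked with a toy simulation. This sketch models the cache as a set of slot indices and assumes a linear, non-wrapping queue; the comments map each step to the dcbz and dcbi usage described in the example, but no real cache instructions are involved.

```python
def simulate_window(num_packets, n):
    """Return the cached slot indices after each packet has been processed."""
    cached = set(range(n))       # initialization: first n slots (dcbz)
    history = []
    for i in range(num_packets):
        # Packet i is processed here using data already in the cache.
        cached.discard(i)        # delete the processed slot (dcbi)
        cached.add(i + n)        # create the slot n positions ahead (dcbz)
        history.append(sorted(cached))
    return history
```

Running `simulate_window(3, 2)` gives the windows `[1, 2]`, `[2, 3]`, and `[3, 4]`: the window slides by one slot per packet and its size stays at n.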

Next, the minimum configuration of the present invention will be described. FIG. 12 is a block diagram showing an example of the minimum configuration of the cache control device according to the present invention. The cache control device according to the present invention includes: an area setting means 81 (for example, the area setting means 21) that sets, in a cache memory (for example, the cache 12), a write area (for example, a cache injection target area), which is an area into which a predetermined amount of received data (for example, n packets) can be written; and an area deletion means 82 (for example, the area deletion means 22) that, for each process (for example, packet processing) performed on part or all of the received data written in the write area, deletes the area in which the received data subjected to that process was written (for example, deletes the cache).

For each process performed on received data, the area setting means 81 newly sets an area of an amount corresponding to the deleted area, as an area into which received data can be written, at the position where the received data received after the predetermined amount of received data (for example, n packets), that is, for example, the packet n positions ahead, will be written.

With this configuration, when received data is written to the cache memory and processed, degradation of processing performance can be suppressed even when a large amount of data is received.

The cache control device may also include: a received data writing means (for example, the IOC 15) that writes received data to the write area; a data reading means (for example, the data reading means 24) that selects either the main memory (for example, the memory 3) or the cache memory and reads the written received data; and a received data determination means (for example, the received data determination means 23) that determines whether data exceeding the amount that can be written to the write area has been received.

The received data writing means may then write received data exceeding the amount that can be written to the write area to the main memory, and the data reading means may prefetch that received data from the main memory.

The area setting means 81 may also set a reduced area, which is at least a partial area of each piece of received data (for example, the area of the first k cache lines), in the cache memory as the writable area.

The cache control device may also include a size comparison means (for example, the size comparison means 25) that compares the size of the reduced area with the size of the received data. When the received data is larger than the reduced area, the received data writing means may write the portion of the received data that exceeds the size of the reduced area to the main memory, and the data reading means may prefetch that portion of the received data from the main memory.

The present invention has been described above with reference to the embodiments and the example, but the present invention is not limited to them. Various changes and adjustments are possible within the scope of the embodiments and the example of the present invention.

The present invention is suitably applied to a cache control device that controls a cache memory using cache injection. The present invention is also applicable to other devices that perform processing using a buffer queue. For example, in an architecture that supports memory writes and cache injection from an accelerator, the present invention can be applied to processing that uses the accelerator.

Specifically, when multiple jobs are submitted to an accelerator and processed while their results are received in sequence, each job produces output data. Applying the method of the present invention to control the area into which this output data is written keeps cache usage at a fixed amount, so the benefits of cache injection, such as fewer memory accesses, can be fully realized.
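The fixed cache usage described above can be illustrated with a small Python model. This is a sketch under stated assumptions, not the implementation in this specification: the class `WriteAreaWindow` and its methods are hypothetical, output slots are numbered with ever-increasing indices instead of wrapping around a ring, and the model tracks only where each output lands, not the data itself.

```python
class WriteAreaWindow:
    """Hypothetical model of a fixed-size write area over a queue of output slots."""

    def __init__(self, window_slots: int) -> None:
        if window_slots <= 0:
            raise ValueError("window_slots must be positive")
        self.start = 0            # oldest slot still inside the write area
        self.end = window_slots   # one past the newest slot in the write area
        self.next_write = 0       # slot that the next output will occupy

    def write(self) -> str:
        """Record one arriving output and report where it is written."""
        slot = self.next_write
        self.next_write += 1
        # Outputs beyond the write area go to main memory instead of the cache.
        return "cache" if slot < self.end else "memory"

    def process_one(self) -> None:
        """Process the oldest output: delete its area, then set a new one."""
        if self.start >= self.next_write:
            raise RuntimeError("no output is waiting to be processed")
        self.start += 1  # delete the area of the processed output
        self.end += 1    # newly set an equal-sized area past the old end
```

Because `process_one` advances both ends together, `end - start` never changes, which corresponds to holding cache usage at a constant amount. With a window of two slots, a third output that arrives before any processing is reported as going to main memory.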

1 Processor
2 NIC
3 Memory
11 CPU
12 Cache memory
13 Memory controller
14 Internal bus
15 I/O controller
21 Area setting means
22 Area deletion means
23 Received data determination means
24 Data reading means
25 Size comparison means

Claims

A cache control method comprising:
setting, in a cache memory, a write area, which is an area to which a predetermined amount of received data can be written;
for each process performed on part or all of the received data written in the write area, deleting the area in which the received data targeted by that process is written; and
for each process performed on the received data, newly setting an area of an amount corresponding to the deleted area as an area to which received data can be written, at the position where received data received after the predetermined amount of received data is to be written.

The cache control method according to claim 1, further comprising:
writing received data to the write area;
selecting either the main memory or the cache memory and reading the written received data;
determining whether received data exceeding the amount that can be written to the write area has been received;
when writing received data to the write area, writing the received data that exceeds the amount that can be written to the write area to the main memory; and
prefetching that received data from the main memory.

The cache control method according to claim 1 or 2, wherein, when the write area is set, a reduced area covering at least part of each piece of received data is set in the cache memory as a writable area.

The cache control method according to claim 3, further comprising:
comparing the size of the reduced area with the size of the received data; and
when the size of the received data is equal to or smaller than the size of the reduced area, writing the portion of the received data that exceeds the size of the reduced area to the main memory, and prefetching that portion from the main memory.

The cache control method according to any one of claims 1 to 4, wherein a write area is set in a cache memory as a target for cache injection.

A cache control device comprising:
area setting means for setting, in a cache memory, a write area, which is an area to which a predetermined amount of received data can be written; and
area deletion means for deleting, for each process performed on part or all of the received data written in the write area, the area in which the received data targeted by that process is written,
wherein, for each process performed on the received data, the area setting means newly sets an area of an amount corresponding to the deleted area as an area to which received data can be written, at the position where received data received after the predetermined amount of received data is to be written.

The cache control device according to claim 6, further comprising:
received data writing means for writing received data to the write area;
data reading means for selecting either the main memory or the cache memory and reading the written received data; and
received data determination means for determining whether received data exceeding the amount that can be written to the write area has been received,
wherein the received data writing means writes the received data that exceeds the amount that can be written to the write area to the main memory, and
the data reading means prefetches that received data from the main memory.

The cache control device according to claim 6 or 7, wherein the area setting means sets, in the cache memory, a reduced area covering at least part of each piece of received data as a writable area.

The cache control device according to claim 8, further comprising:
received data writing means for writing received data to the write area;
data reading means for selecting either the main memory or the cache memory and reading the written received data; and
size comparison means for comparing the size of the reduced area with the size of the received data,
wherein, when the size of the received data is equal to or smaller than the size of the reduced area, the received data writing means writes the portion of the received data that exceeds the size of the reduced area to the main memory, and
the data reading means prefetches that portion of the received data from the main memory.

A cache control program causing a computer to execute:
an area setting process of setting, in a cache memory, a write area, which is an area to which a predetermined amount of received data can be written; and
a deletion process of deleting, for each process performed on part or all of the received data written in the write area, the area in which the received data targeted by that process is written,
wherein, in the area setting process, for each process performed on the received data, an area of an amount corresponding to the deleted area is newly set as an area to which received data can be written, at the position where received data received after the predetermined amount of received data is to be written.