CN117609314A - Cache data processing method, cache controller, chip and electronic equipment - Google Patents


Info

Publication number: CN117609314A
Application number: CN202410085799.2A (priority application)
Authority: CN (China)
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 杜倩倩, 吴峰
Assignee (current and original): Beijing Xiangdixian Computing Technology Co Ltd

Abstract

The disclosure provides a cache data processing method, a cache controller, a chip and electronic equipment. The cache comprises a plurality of cache lines, and any cache line stores: a plurality of data blocks, a data block valid identifier for identifying whether each data block is valid, and a data block dirty identifier for identifying whether each data block is dirty data. The method comprises the following steps: when writing the data in any cache line back to a downstream memory, traversing the data block valid identifiers and the data block dirty identifiers of the cache line to obtain the target data blocks that are both valid and dirty; and writing only the obtained target data blocks back to the downstream memory.

Description

Cache data processing method, cache controller, chip and electronic equipment
Technical Field
The disclosure relates to the technical field of computers, and in particular relates to a cache data processing method, a cache controller, a chip and electronic equipment.
Background
A Cache is a high-speed buffer memory whose stored content is a subset of the downstream memory. When an access device such as a CPU or GPU accesses the downstream memory, the cache is searched first; if the cache hits, the read or write operation is performed directly on the cache, and if the cache misses, the data is read from the downstream memory. In this way, the number of accesses to the downstream memory is reduced, thereby improving transfer performance.
Currently, when data in a Cache needs to be written back to the downstream memory, the write-back is usually performed with the cache line as the minimum unit; that is, whenever a write-back is needed, the entire cache line is written back. This causes redundant writes, resulting in wasted bandwidth.
Disclosure of Invention
The disclosure aims to provide a cache data processing method, a cache controller, a chip and electronic equipment.
According to a first aspect of the present disclosure, there is provided a method for processing cache data, where the cache includes a plurality of cache lines, and any one of the cache lines stores: a plurality of data blocks, a data block valid identifier for identifying whether each data block is valid, and a data block dirty identifier for identifying whether each data block is dirty data; the method comprises the following steps:
when writing the data in any cache line back to a downstream memory, traversing the data block valid identifiers and the data block dirty identifiers of the cache line to obtain the target data blocks that are both valid and dirty;
and writing the obtained target data blocks back to the downstream memory.
In one embodiment, the writing the data in any cache line back to the downstream memory includes:
Upon detecting that the downstream memory is free, determining to write data in each cache line in the cache back to the downstream memory.
In one embodiment, the writing the data in any cache line back to the downstream memory includes:
after receiving a write request, if the write request does not hit any cache line in the cache and there is no free cache line in the cache, determining a target cache line from the cache, and writing the data in the target cache line back to the downstream memory.
In one embodiment, after writing the data in the target cache line back to the downstream memory, the method further comprises:
and writing at least one data block carried by the write request into the target cache line, and modifying the data block valid identifier and the data block dirty identifier to identify the at least one data block as valid and dirty data.
In one embodiment, the mapping mode of the cache and the downstream memory is set associative mapping, wherein the cache comprises a plurality of sets, and any set comprises a plurality of cache lines; the determining a target cache line from the cache includes:
determining the set hit by the write request, traversing the data block valid identifiers and the data block dirty identifiers of each cache line in the set, and determining, as the target cache line, the cache line in the set with the most target data blocks, wherein a target data block is a data block that is both valid and dirty.
In one embodiment, the method further comprises:
after receiving a write request, if the write request hits any cache line in the cache, or the write request does not hit any cache line in the cache but there is a free cache line in the cache, writing at least one data block carried by the write request into that cache line, and modifying the data block valid identifier and the data block dirty identifier to identify the written data block as valid and dirty data.
In one embodiment, the method further comprises:
after receiving a read request, in the case that the read request hits any cache line in the cache, determining, according to the data block valid identifier and the data block dirty identifier, whether the data blocks to be read by the read request are all in the cache line; if so, reading the data blocks corresponding to the read request from the hit cache line and returning them to the initiator of the read request;
if not, generating a read command for the data blocks which correspond to the read request but are not in the cache line, sending the read command to the downstream memory, reading the data blocks which correspond to the read request and are in the cache line from the hit cache line, and returning them to the initiator of the read request.
In one embodiment, the method further comprises:
after receiving a read request, if the read request does not hit any cache line in a cache, the read request is sent to a downstream memory, so that the downstream memory returns a data block corresponding to the read request to an initiator of the read request.
In one embodiment, the any cache line is configured to store a number of data blocks, a data block valid identification for identifying whether each data block is valid, and a data block dirty identification for identifying whether each data block is dirty data, comprising:
a plurality of data blocks are stored in any cache line, and a data block valid identifier and a data block dirty identifier are respectively configured for each data block.
According to a second aspect of the present disclosure, there is provided a cache controller configured to control a cache, where the cache includes a plurality of cache lines, and any cache line stores: a plurality of data blocks, a data block valid identifier for identifying whether each data block is valid, and a data block dirty identifier for identifying whether each data block is dirty data; the cache controller includes:
the traversal module is configured to, when the data in any cache line is written back to the downstream memory, traverse the data block valid identifiers and the data block dirty identifiers of the cache line to obtain the target data blocks that are both valid and dirty;
and the write-back module is configured to write the obtained target data blocks back to the downstream memory.
In one embodiment, the write-back module is specifically configured to determine to write back data in each cache line in the cache to the downstream memory when the downstream memory is detected to be idle.
In one embodiment, the write-back module is specifically configured to, after receiving a write request, if the write request does not hit any cache line in the cache and there is no free cache line in the cache, determine a target cache line from the cache and write the data in the target cache line back to the downstream memory.
In one embodiment, the cache controller further comprises:
and the writing module is configured to write at least one data block carried by the write request into the target cache line, and modify the data block valid identifier and the data block dirty identifier to identify the at least one data block as valid and dirty data.
In one embodiment, the mapping mode of the cache and the downstream memory is set associative mapping, wherein the cache comprises a plurality of sets, and any set comprises a plurality of cache lines;
The write-back module is specifically configured to determine the set hit by the write request, traverse the data block valid identifiers and the data block dirty identifiers of each cache line in the set, and determine, as the target cache line, the cache line in the set with the most target data blocks, where a target data block is a data block that is both valid and dirty.
In one embodiment, the cache controller further comprises:
and the writing module is configured to, after receiving a write request, if the write request hits any cache line in the cache, or the write request does not hit any cache line in the cache but there is a free cache line in the cache capable of processing the write request, write at least one data block carried by the write request into that cache line, and modify the data block valid identifier and the data block dirty identifier to identify the written data block as valid and dirty data.
In one embodiment, the cache controller further comprises:
the reading module is configured to, after receiving a read request, in the case that the read request hits any cache line in the cache, determine, according to the data block valid identifier and the data block dirty identifier, whether the data blocks to be read by the read request are all in the cache line; if so, read the data blocks corresponding to the read request from the hit cache line and return them to the initiator of the read request;
if not, generate a read command for the data blocks which correspond to the read request but are not in the cache line, send the read command to the downstream memory, read the data blocks which correspond to the read request and are in the cache line from the hit cache line, and return them to the initiator of the read request.
In one embodiment, the reading module is further configured to send, after receiving the read request, the read request to a downstream memory if the read request misses any cache line in the cache, so that the downstream memory returns a data block corresponding to the read request to an initiator of the read request.
According to a third aspect of the present disclosure, there is provided a chip comprising the cache controller in any one of the embodiments of the second aspect.
According to a fourth aspect of the present disclosure, there is provided an electronic device comprising the chip of the third aspect.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
FIG. 1 is a schematic diagram of address mapping in a cache according to one embodiment of the present disclosure;
fig. 2 is a schematic structural diagram of a Tag array according to an embodiment of the present disclosure;
FIG. 3 is a flow chart illustrating a method for processing cache data according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a cache line according to one embodiment of the present disclosure;
FIG. 5 is a schematic diagram of another cache line according to one embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a specific address mapping provided in one embodiment of the present disclosure;
FIG. 7 is a schematic diagram of a specific Tag array according to one embodiment of the present disclosure;
FIG. 8 is a schematic diagram of a first cache line identification provided by an embodiment of the present disclosure;
FIG. 9 is a diagram illustrating a second cache line identification provided by one embodiment of the present disclosure;
FIG. 10 is a diagram illustrating a third exemplary cache line identification provided by one embodiment of the present disclosure;
FIG. 11 is a diagram illustrating a fourth cache line identification provided by an embodiment of the present disclosure;
FIG. 12 is a diagram illustrating a fifth cache line identification provided by an embodiment of the present disclosure;
fig. 13 is a schematic structural diagram of a cache controller according to an embodiment of the present disclosure.
Detailed Description
Before describing embodiments of the present disclosure, it should be noted that:
some embodiments of the present disclosure are described as process flows, in which the various operational steps of the flows may be numbered sequentially, but the operational steps may be performed in parallel, concurrently, or simultaneously.
The terms "first," "second," and the like may be used in embodiments of the present disclosure to describe various features, but these features should not be limited by these terms. These terms are only used to distinguish one feature from another.
The term "and/or" may be used in embodiments of the present disclosure to include any and all combinations of one or more of the associated listed features.
It will be understood that when two elements are described in a connected or communicating relationship, unless a direct connection or direct communication between the two elements is explicitly stated, connection or communication between the two elements may be understood as direct connection or communication, as well as indirect connection or communication via intermediate elements.
In order to make the technical solutions and advantages of the embodiments of the present disclosure more apparent, the following detailed description of exemplary embodiments of the present disclosure is given in conjunction with the accompanying drawings, and it is apparent that the described embodiments are only some of the embodiments of the present disclosure, not exhaustive of all the embodiments. In addition, embodiments of the present disclosure and features of the embodiments may be combined with each other without conflict.
The cache is located between the access device and the downstream memory, the stored content is a subset of the downstream memory, the cache can be a multi-level cache or a single-level cache, if a multi-level cache structure is adopted, the downstream memory can be a lower-level cache, and if a single-level cache structure is adopted, the downstream memory can be a memory, such as a DRAM.
Mapping data in the downstream memory to a cache line in the cache requires a corresponding mapping method, and common mapping methods include direct mapping, set associative mapping and full associative mapping.
Direct mapping performs one-to-one mapping by taking the address modulo the number of cache lines, so cache misses occur easily. In set associative mapping, more information can be stored in each set, which increases the probability of a cache hit relative to direct mapping. Fully associative mapping is the extreme case of set associative mapping, namely a cache with only one set; it is complex to implement. Set associative mapping is currently the most commonly used mapping method in the industry.
In the direct mapping and set associative mapping approaches, the address that the processor sends to the Cache is divided into 4 segments, as shown in FIG. 1: Tag, Index, line offset, and byte offset. The line offset indicates the offset of the address within the cache line; the Index indicates which set (under set associative mapping) or which line (under direct mapping) the address maps to; the Tag is used to determine whether a cache line is hit; and the byte offset identifies the size of the data block for one access. For the implementation of the various mapping manners, reference may be made to the related art, which this disclosure does not describe in detail. As shown in FIG. 2, in addition to the cache line itself, a dirty data flag (D in the figure), a valid flag (V in the figure), and a Tag identifier are recorded for each cache line. The dirty data flag identifies whether the data in the corresponding cache line is consistent with the data in the downstream memory; for example, it is set to 1 if inconsistent and to 0 if consistent. The valid flag identifies whether the corresponding cache line is valid. The Tag identifier is used, after an access request is received, to determine whether the access request hits the cache line corresponding to that Tag. The storage space holding the dirty data flags, valid flags, and Tag identifiers may be referred to as the Tag array, and the storage space holding the cache lines may be referred to as the Data array. Since the Tag array occupies little space, it may also be located in the cache controller and exist in the form of registers.
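The address split described above can be sketched as follows. The bit widths are illustrative assumptions for a hypothetical cache with 64-byte lines and 128 sets; the byte offset field is omitted for brevity.

```python
# Illustrative sketch of splitting an address into Tag / Index / line offset
# fields as described above; field widths are assumptions, not taken from
# the disclosure.
LINE_OFFSET_BITS = 6   # 64-byte cache line
INDEX_BITS = 7         # 128 sets (set associative) or 128 lines (direct mapped)

def split_address(addr: int):
    """Return (tag, index, line_offset) for a physical address."""
    line_offset = addr & ((1 << LINE_OFFSET_BITS) - 1)
    index = (addr >> LINE_OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (LINE_OFFSET_BITS + INDEX_BITS)
    return tag, index, line_offset
```

Under set associative mapping the index selects a set and the tag is compared against every line in that set; under direct mapping the index selects a single line.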
The cache update policy refers to how the write operation should update the data when a cache hit occurs. Cache update policies fall into two categories: write through and write back.
Write through, also known as write-through, updates both the data in the cache and the data in the downstream memory when a write address hits in the cache. It follows that the write-through policy does not reduce the volume of write accesses from the device to the downstream memory.
Under the write-back policy, when a write address hits in the cache, only the data in the cache is updated, not the data in the downstream memory, which can effectively reduce the volume of write accesses to the downstream memory. However, since the write-back policy only updates the data in the cache, the data in the cache and the data in the downstream memory may become inconsistent. Therefore, each cache line may contain one bit, called the dirty bit, to record whether the data in the cache line has been modified, that is, whether it still matches the data in the downstream memory.
In order to reduce the access to the downstream memory, a write-back strategy is generally used in the implementation, and in addition, in order to ensure the consistency of the data in the cache and the downstream memory, writing the data in the cache back to the downstream memory is triggered under certain conditions.
Currently, when data in a cache needs to be written back to the downstream memory, the write-back is usually performed with the cache line as the minimum unit, that is, the entire cache line is written back. However, at the time of the write-back, not all data in the cache line is necessarily dirty (dirty data refers to data inconsistent with the data in the downstream memory). Writing back the entire cache line therefore also writes back data that is not dirty and is identical to the data already in the downstream memory. Such write operations are redundant, and bandwidth is wasted.
In order to solve the above-mentioned problems, as shown in fig. 3, the disclosure proposes a method for processing cache data, which is applied to a cache controller, where the cache includes a plurality of cache lines, and any cache line stores: a plurality of data blocks, a data block valid identifier for identifying whether each data block is valid, and a data block dirty identifier for identifying whether each data block is dirty data; the method comprises the following steps:
s301, when data in any cache line is written back to a downstream memory, traversing the effective identification of the data block in the cache line and the dirty identification of the data block to obtain a target data block which is effective and dirty data;
S302, writing the obtained target data block back to a downstream memory.
In this manner, the data in a cache line is divided into a plurality of data blocks, and the data block valid identifiers and data block dirty identifiers respectively indicate whether each data block is valid and whether each data block is dirty. When the data in a certain cache line needs to be written back to the downstream memory, not all the data in the cache line is written back; instead, the target data blocks that are both valid and dirty are screened out through the data block valid identifiers and data block dirty identifiers, and only those target data blocks are written back to the downstream memory. Redundant write operations, and hence wasted bandwidth, are thereby avoided.
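Steps S301 and S302 can be sketched as follows, assuming a hypothetical representation in which the valid and dirty identifiers of a cache line are bitmaps with one bit per data block (the function and field names are illustrative, not taken from the disclosure):

```python
def collect_writeback_blocks(blocks, valid_mask, dirty_mask):
    """Return the (index, data) pairs of blocks that are both valid and dirty.

    blocks      : list of data blocks in one cache line
    valid_mask  : bit i set means block i is valid
    dirty_mask  : bit i set means block i is dirty
    """
    target = []
    for i, block in enumerate(blocks):
        # S301: traverse the valid and dirty identifiers to find target blocks
        if (valid_mask >> i) & 1 and (dirty_mask >> i) & 1:
            target.append((i, block))
    return target
```

In S302, only the pairs returned here would be sent to the downstream memory, rather than the whole cache line.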
The structure of the cache line proposed in the present disclosure is described below.
Each cache line in the cache may be organized as shown in FIG. 4; that is, any cache line stores a plurality of data blocks, with a data block valid identifier and a data block dirty identifier configured for each data block. Specifically, each data block (Data in the figure) is stored contiguously in the cache together with its valid identifier (each_valid in the figure) and dirty identifier (each_dirty in the figure). The valid identifier may occupy 1 bit, where 1 identifies the data block as valid and 0 as invalid, or vice versa. The dirty identifier may likewise occupy 1 bit, where 1 identifies the data block as dirty and 0 as not dirty, or vice versa.
Alternatively, a single data block valid identifier may identify whether each data block is valid, and a single data block dirty identifier may identify whether each data block is dirty. As shown in FIG. 5, a data block valid identifier (valid in the figure) and a data block dirty identifier (dirty in the figure) may be stored at the start position of the cache line, each occupying N bits (for N data blocks in the cache line); each bit of the valid identifier identifies whether one data block is valid, and each bit of the dirty identifier identifies whether one data block is dirty.
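The per-block flags of FIG. 4 and the packed N-bit identifiers of FIG. 5 carry the same information. The following hypothetical helpers (names are illustrative) convert between the two layouts:

```python
def pack_flags(per_block_flags):
    """Pack a list of booleans (one flag per data block, FIG. 4 style)
    into an N-bit bitmap (FIG. 5 style), bit i for block i."""
    mask = 0
    for i, flag in enumerate(per_block_flags):
        if flag:
            mask |= 1 << i
    return mask

def unpack_flags(mask, n_blocks):
    """Expand an N-bit bitmap back into one boolean per data block."""
    return [bool((mask >> i) & 1) for i in range(n_blocks)]
```

Either layout supports the same traversal; the packed form simply concentrates the identifiers at the start of the line.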
In the manner shown in fig. 4 and 5, a valid identification of data blocks may be used to identify whether each data block in a cache line is valid, and a dirty identification of data blocks may be used to identify whether each data block in a cache line is dirty, and then the valid identification and dirty identification may be traversed to determine a target data block in a cache line that is valid and dirty when data in the cache line needs to be written back to downstream memory.
The timing of writing a cache line in the cache back to the downstream memory is described below.
In one embodiment, the data in each cache line in the cache may be written back to the downstream memory when a write-back instruction sent by an upstream access device is received. After the upstream access device has used the cache to read and write the data in the downstream memory, and once it determines that there is temporarily no further read or write requirement for the data in the cache, it can send a write-back instruction to the cache controller, and the cache controller determines, based on the write-back instruction, that the data in each cache line in the cache currently needs to be written back to the downstream memory.
In another embodiment, the cache controller may actively monitor the working state of the downstream memory, in addition to performing write-back after receiving the write-back instruction sent by the upstream device, and specifically may monitor the number of commands to be sent to the downstream memory, and determine to write back the data in each cache line in the cache to the downstream memory when the downstream memory is monitored to be in an idle state, for example, when the number of commands to be sent to the downstream memory is monitored to be less than a preset threshold value.
In addition to the cases above in which the data in a cache line needs to be written back to the downstream memory, processing a write request may also trigger writing the data in a cache line back to the downstream memory.
Specifically, after receiving a write request, if the write request does not hit any cache line in the cache and there is no free cache line in the cache, determining a target cache line from the cache, and writing data in the target cache line back to the downstream memory. The free cache line described herein refers to a free cache line that may be used to process the write request, i.e., in the event that there is no free cache line available to process the write request, a target cache line needs to be determined from the cache, and the data in the target cache line is written back to the downstream memory to process the write request with the target cache line in the free state after the data is written back.
After the data in the target cache line is written back to the downstream memory, at least one data block carried by the write request is written into the target cache line, and the data block valid identifier and the data block dirty identifier are modified into data blocks for identifying that the at least one data block is valid and dirty data.
That is, after the data in the target cache line is written back to the downstream memory, the target cache line is in a free state, and the write request can be processed using it. The disclosure proposes that, in addition to writing the data blocks carried by the write request into the target cache line, the data block valid identifier and the data block dirty identifier need to be modified to identify the written data blocks as valid and dirty. A read or write request generally carries a burst length, which can be understood as the number of data blocks the request covers; for example, a burst length of 2 indicates that the write request carries two data blocks. Each write request can therefore carry at least one data block according to the burst length information.
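Handling such a write can be sketched as follows, again assuming the hypothetical bitmap representation of a cache line (a dict with 'blocks', 'valid', and 'dirty' fields; names are illustrative). The burst length is simply the number of data blocks carried by the request:

```python
def handle_write(line, start_block, write_blocks):
    """Write the data blocks carried by a write request into a cache line
    and mark each written block as valid and dirty."""
    for offset, data in enumerate(write_blocks):
        i = start_block + offset
        line['blocks'][i] = data
        line['valid'] |= 1 << i   # mark block i valid
        line['dirty'] |= 1 << i   # mark block i dirty
    return line
```

A later write-back of this line would then pick up exactly the blocks whose valid and dirty bits were set here.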
The above mentioned ways of mapping the cache and the downstream memory may be direct mapping, set associative mapping, and fully associative mapping.
In the case where the cache is mapped to the downstream memory in a set associative manner, the cache typically includes a number of sets, and any one set includes a number of cache lines. In this mapping manner, after the cache controller receives the write request, in the case that the write request misses any cache line in the cache and there is no free cache line in the set hit by the write request, the following manner may be specifically adopted to determine the target cache line from the cache.
Determining the set hit by the write request, traversing the data block valid identifiers and the data block dirty identifiers of each cache line in the set, and determining, as the target cache line, the cache line in the set with the most target data blocks, wherein a target data block is a data block that is both valid and dirty.
In the set associative mapping manner, the data of an access request may be stored in any cache line within one set. Therefore, when a target cache line needs to be determined from the cache, it is determined from the set hit by the access request: the hit set is determined according to the Index field of the access request address, the data block valid identifiers and data block dirty identifiers of each cache line in the set are traversed, and the cache line with the most target data blocks in the set is determined as the target cache line and written back to the downstream memory. Determining the target cache line in this way allows the cache line most in need of write-back to be written back preferentially. In addition, since the data in one cache line is stored in adjacent positions in the downstream memory (for example, if the downstream memory is a DRAM, the data in one cache line is written into one memory page of the DRAM), more data can be written once the relevant DRAM page is opened, which helps make full use of the DRAM bandwidth.
In addition, under set associative mapping, if a plurality of cache lines in a set share the same largest number of target data blocks, one cache line may be randomly selected from them as the target cache line, or one may be selected according to the least recently used (LRU) principle.
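The victim selection described above can be sketched as follows, reusing the hypothetical bitmap representation of a cache line. Ties here fall to the first such line in the set; as noted, random or LRU selection would also be valid tie-breakers:

```python
def choose_victim(set_lines):
    """Return the index of the cache line in a set with the most target
    (valid-and-dirty) data blocks, to be written back and reused."""
    def target_count(line):
        # A target block has both its valid bit and its dirty bit set.
        return bin(line['valid'] & line['dirty']).count('1')
    return max(range(len(set_lines)), key=lambda i: target_count(set_lines[i]))
```

Writing back the line with the most dirty blocks favors the line that most needs a write-back, and groups the most data into one burst toward the downstream memory.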
In the fully-associative mapping manner, the storage position of the data of the access request in the cache can be any cache line in the cache, so that when a target cache line needs to be determined from the cache, the valid identification of the data block and the dirty identification of the data block in each cache line in the cache can be traversed, and the target cache line of the cache with the most target data block is determined, wherein the target data block is the valid and dirty data block.
That is, in the fully-associative mapping manner, the cache line having the most target data blocks in the current cache may be written back to the downstream memory as the target cache line.
In the direct mapping mode, since the data of the access request can only be stored in one cache line in the cache, if the cache line is not hit, the cache line is directly determined as the target cache line and written back to the downstream memory.
Since the cache line provided by the disclosure includes both the data block and the data block dirty identifier and the data block valid identifier, the data block dirty identifier and the data block valid identifier in the cache line need to be updated when the data block in the cache line is updated.
Specifically, the cache line is updated for the write request in the following manner.
After a write request is received, if the write request hits any cache line in the cache, or the write request misses every cache line in the cache and there is a free cache line in the cache that can serve the write request, the at least one data block carried by the write request is written into the hit cache line (or the free cache line), and the corresponding data block valid identifiers and data block dirty identifiers are modified to identify that the at least one data block is valid and dirty.
After the data blocks carried by the write request are written into the cache line, the data block valid identifiers and data block dirty identifiers must be modified to identify the written data blocks as valid and dirty, so that the valid and dirty data blocks in the cache line can later be determined by traversing these identifiers.
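The bookkeeping just described can be sketched as follows. This is a minimal Python sketch, not the patent's implementation; it assumes AXI-style burst semantics (awlen+1 beats per burst, as stated in the embodiment below), and the function and parameter names are hypothetical.

```python
# Writing a burst of data blocks into a cache line sets the matching
# per-block valid and dirty bits, starting at block_offset.

def apply_write(valid_bits, dirty_bits, block_offset, awlen):
    burst_len = awlen + 1                 # AXI: awlen+1 beats in the burst
    for i in range(block_offset, block_offset + burst_len):
        valid_bits |= 1 << i              # block i now holds valid data
        dirty_bits |= 1 << i              # ...that has not reached memory
    return valid_bits, dirty_bits

# Block offset 0, awlen=2 -> blocks 0..2 become valid and dirty
print(apply_write(0, 0, 0, 2))  # (7, 7), i.e. bits 0-2 set in both masks
```

A later traversal of these two bit vectors is then sufficient to find every valid and dirty block in the line.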
The cache line is updated for the read request in the following manner.
After a read request is received, if the read request hits any cache line in the cache, it is determined according to the data block valid identifiers and data block dirty identifiers whether all of the data blocks to be read by the read request are in the cache line. If so, the data blocks corresponding to the read request are read from the hit cache line and returned to the initiator of the read request;
if not, a read command is generated for the data blocks that correspond to the read request but are not in the cache line and is sent to the downstream memory, so that the downstream memory returns those data blocks to the initiator of the read request, while the data blocks that correspond to the read request and are in the cache line are read from the hit cache line and returned to the initiator of the read request.
In addition, after receiving the read request, if the read request does not hit any cache line in the cache, the read request is directly sent to the downstream memory, so that the downstream memory returns the data block corresponding to the read request to the initiator of the read request.
With this method for processing read requests, when a read request misses every cache line in the cache, the read request is sent directly to the downstream memory and no cache line is allocated for it, which prevents unnecessary data from being prefetched into a cache line and thus avoids wasting bandwidth. In addition, when a write request is processed, only the data blocks carried in the write request are stored in the cache line, and no other data is fetched from the downstream memory to fill the rest of the line. Therefore, when a read request is processed, the data block valid identifiers and data block dirty identifiers indicate which data blocks in the cache line are valid and dirty. If the data blocks required by the read request are among them, that is, the required data is currently in the cache line, the data blocks are returned to the initiator of the read request directly from the cache. If a required data block is not a valid and dirty data block in the cache line, that is, the required data is not currently in the cache line, the corresponding data is returned to the initiator of the read request from the downstream memory.
The cache data processing method proposed in the present disclosure is described in a specific embodiment below.
In a specific embodiment, the upstream bus interface is an AXI bus with a data bit width of 256 bits, that is, each data block read or written over the upstream bus by the cache controller is 256 bits, and the size of a cache line is 256 bytes, so each cache line can store 8 data blocks. The bus bit width of the downstream memory is 256 bits, the same as the data bit width of the upstream bus interface, and the upstream and downstream bus addresses are 32 bits wide. The mapping mode of the cache is set associative mapping, and the cache comprises 8 sets and 8 ways.
Since the data bit width of the bus interface is 256 bits, the minimum unit for accessing the cache is a 256-bit (32-byte) data block, which requires 5 address bits, i.e., a 5-bit Byte Offset corresponding to address bits 0-4. The size of a cache line is 256 bytes, so each cache line can store 8 data blocks (Blocks), and a 3-bit Block Offset is needed to index a data block. Since the mapping mode is set associative mapping with 8 sets and 8 ways, a 3-bit Index is required as the set index. As shown in fig. 6, the address sent by the upstream device to the cache controller is divided into: a 5-bit Byte Offset, address bits 0-4; a 3-bit Block Offset, address bits 5-7, used to index a data block within the cache line; a 3-bit Index, address bits 8-10, used to select which set the access request hits; and a 21-bit Tag, address bits 11-31, the remaining address bits, used to determine whether a cache line is hit. The format of the cache line may be as shown in fig. 4, where N is specifically 7.
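The address split just described can be reproduced with a short Python sketch. This is illustrative only; the function name is hypothetical, while the bit positions follow the embodiment (5-bit Byte Offset, 3-bit Block Offset, 3-bit Index, 21-bit Tag).

```python
# Split a 32-bit address into the four fields of fig. 6.

def split_address(addr: int):
    byte_offset  = addr & 0x1F          # bits 0-4: offset within a 32-byte block
    block_offset = (addr >> 5) & 0x7    # bits 5-7: which of the 8 blocks in a line
    index        = (addr >> 8) & 0x7    # bits 8-10: which of the 8 sets
    tag          = addr >> 11           # bits 11-31: compared to detect a hit
    return byte_offset, block_offset, index, tag

# Reproducing the example addresses used in the embodiment below:
print(split_address(0x00001000))  # (0, 0, 0, 0x2)
print(split_address(0x00008080))  # (0, 4, 0, 0x10)
```

These two results match the Block Offset, Index and Tag values derived for awaddr=0x00001000 and awaddr=0x00008080 in the worked examples.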
If a write request misses the cache and the cache has a free cache line, the cache controller directly selects a free cache line for allocation without any write-back, and updates the tag array and the data array. For example, assume the cache is initially empty and the received write request is awaddr=0x00001000 (the write request address, in hexadecimal) with awlen=2 (awlen+1 is the burst length, i.e., the number of data blocks carried by the write request). Dividing the write request address gives: Block offset=0x0, Index=0x0, Tag=0x2. The request misses the cache, and since there is a free cache line in the Index=0x0 set, the corresponding Tag array and Data array are updated; specifically, the Tag array and Data array of set0 way0 are updated:
as shown in fig. 7, tag array is updated to valid=1, dirty=1, and tag=0x2. Since Block offset=0 and awlen=2, burst_length=3 of write, each_valid 0=1, each_valid 1=1, each_valid 2=1, each_dirty0=1, each_dirty1=1, each_dirty2=1 are updated. The updated Bata array is shown in fig. 8. I.e. writing the data block into the first three positions in the cache line, and updating the corresponding valid data block identification and dirty data block identification.
For another example, when a received write request hits a cache line, the corresponding Tag array and Data array are updated directly.
Assume the write request awaddr=0x00001060 with awlen=2 is received; dividing the address gives Block offset=0x3, Index=0x0, Tag=0x2. The write hits, and the corresponding Tag array and Data array of set0 way0 are updated: the Tag array is updated to valid=1, dirty=1, tag=0x2, as shown in fig. 7;
the Data array update is shown in FIG. 9.
Since Block offset=3 and awlen=2, the burst length of the write request is 3, so each_valid3=1, each_dirty3=1, each_valid4=1, each_dirty4=1, each_valid5=1 and each_dirty5=1 are updated.
For another example, assume the write request awaddr=0x00008080 with awlen=2; dividing the address gives Block offset=0x4, Index=0x0, Tag=0x10. The write request misses, and assuming there are no free cache lines in set0 at this time, the each_dirty values of all the cache lines in set0 are read out, and the cache line with the largest number of set each_dirty bits (each_dirty_cnt) is selected for write-back; if several cache lines have the same number of set each_dirty bits, the cache line to write back is selected according to the least recently used (LRU) principle.
each_dirty_cnt = each_dirty0 + each_dirty1 + each_dirty2 + each_dirty3 + each_dirty4 + each_dirty5 + each_dirty6 + each_dirty7
Assume that the cache line of set0 way0 at this time is as shown in fig. 10; its each_dirty_cnt=7 indicates that this cache line contains the most target data blocks, so it is selected for write-back.
The tag=0x2 in the Tag array is then read out, together with the corresponding each_valid and each_dirty values of the Data array.
Two write commands are generated, command 1: awaddr=0x00001000, awlen=5; command 2: awaddr=0x000010e0, awlen=0; and they are written back to the memory controller.
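The two write commands above can be derived mechanically from the each_dirty bits: each contiguous run of dirty blocks becomes one burst. The following Python sketch is illustrative only (the patent does not give this algorithm explicitly); it assumes 32-byte blocks, 8 blocks per line, and AXI-style awlen (burst length minus one), and all names are hypothetical.

```python
BLOCK_BYTES = 32   # one 256-bit data block

def writeback_commands(line_base, each_dirty):
    """Turn the 8-bit each_dirty mask of a line into (awaddr, awlen) bursts."""
    cmds, i = [], 0
    while i < 8:
        if (each_dirty >> i) & 1:
            j = i
            while j < 8 and (each_dirty >> j) & 1:
                j += 1                     # extend the run of dirty blocks
            # AXI-style: awlen is the burst length minus one
            cmds.append((line_base + i * BLOCK_BYTES, (j - i) - 1))
            i = j
        else:
            i += 1
    return cmds

# Blocks 0-5 and 7 dirty (each_dirty_cnt = 7), line base address 0x1000:
print(writeback_commands(0x1000, 0b10111111))
# → [(4096, 5), (4320, 0)], i.e. awaddr=0x00001000/awlen=5 and awaddr=0x000010e0/awlen=0
```

With blocks 0-5 and block 7 dirty, the clean block 6 splits the write-back into exactly the two commands of the example.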
After writing back to the memory controller, the cache line is cleared 0 and the data in the new write request is updated into the cache line. The corresponding Tag array and Data array are updated, at which time Tag array and Data array of set0 and way0 are updated.
The Tag array is updated to valid=1, dirty=1, tag=0x10.
The Data array update is shown in FIG. 11.
Since Block offset=4 and awlen=2, the burst length of the write request is 3, so each_valid4=1, each_dirty4=1, each_valid5=1, each_dirty5=1, each_valid6=1 and each_dirty6=1 are updated.
When processing a read request, assume the read request araddr=0x00008080 with arlen=3 is received and hits set0 way0, and that the Data array in way0 at this time is as shown in fig. 12: the data at 0x00008080, 0x000080a0 and 0x000080c0 required by the read is in the cache line, while the data at 0x000080e0 is not. The required data at 0x00008080, 0x000080a0 and 0x000080c0 is read from the cache line, then the command araddr=0x000080e0, arlen=0 is sent to the downstream memory, and the remaining data is read from the downstream memory and returned to the bus.
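The partial-hit read split above can be sketched as follows. This is an illustrative Python sketch under the same assumptions as the embodiment (8 blocks per line, a requested block is present only if its valid and dirty bits are both set); the function name and return format are hypothetical.

```python
# Decide, block by block, which requested blocks are served from the
# hit cache line and which must be fetched from downstream memory.

def split_read(block_offset, arlen, valid_bits, dirty_bits):
    hits, misses = [], []
    for i in range(block_offset, block_offset + arlen + 1):
        if (valid_bits >> i) & (dirty_bits >> i) & 1:
            hits.append(i)      # serve this block from the cache line
        else:
            misses.append(i)    # read this block from downstream memory
    return hits, misses

# Embodiment example: read blocks 4-7; blocks 4-6 are in the line,
# block 7 (address 0x000080e0) is not.
print(split_read(4, 3, 0b01110000, 0b01110000))  # ([4, 5, 6], [7])
```

The single missing block then yields exactly one downstream read command (araddr=0x000080e0, arlen=0), as in the example.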
As shown in fig. 13, based on the same inventive concept, the embodiments of the present disclosure further provide a cache controller, configured to control a cache, where the cache includes a plurality of cache lines, and any cache line stores: a plurality of data blocks, a data block valid identifier for identifying whether each data block is valid, and a data block dirty identifier for identifying whether each data block is dirty data; the cache controller includes:
a traversing module 110, configured to, when writing the data in any cache line back to the downstream memory, traverse the data block valid identifiers and data block dirty identifiers in the cache line to obtain the target data blocks that are valid and dirty;
a write back module 120 for writing back the resulting target data block to the downstream memory.
In one embodiment, the write-back module is specifically configured to determine to write back data in each cache line in the cache to the downstream memory when the downstream memory is detected to be idle.
In one embodiment, the write-back module 120 is specifically configured to, after a write request is received, if the write request misses every cache line in the cache and there is no free cache line in the cache, determine a target cache line from the cache and write the data in the target cache line back to the downstream memory.
In one embodiment, the cache controller further comprises:
and the writing module 130 is configured to write at least one data block carried by the write request into the target cache line, and modify the valid identifier of the data block and the dirty identifier of the data block into a data block for identifying that the at least one data block is valid and dirty data.
In one embodiment, the mapping mode of the cache and the downstream memory is set associative mapping, wherein the cache comprises a plurality of sets, and any set comprises a plurality of cache lines;
the write-back module 120 is specifically configured to determine the set hit by the write request, traverse the data block valid identifiers and data block dirty identifiers of each cache line in the set, and determine the cache line containing the most target data blocks in the set as the target cache line, where a target data block is a valid and dirty data block.
In one embodiment, the cache controller further comprises:
and the writing module 130 is configured to, after a write request is received, if the write request hits any cache line in the cache, or the write request misses every cache line in the cache and there is a free cache line in the cache that can serve the write request, write the at least one data block carried by the write request into that cache line, and modify the data block valid identifiers and data block dirty identifiers to identify the written data blocks as valid and dirty.
In one embodiment, the cache controller further comprises:
the reading module 140 is configured to determine, when a read request hits any cache line in a cache, whether data blocks to be read by the read request are in the cache line according to a data block valid identifier and a data block dirty identifier, and if yes, read a data block corresponding to the read request from the hit cache line and return the read request to an initiator of the read request;
if not, generating a read command for the data block which corresponds to the read request and is not in the cache line, sending the read command to a downstream memory, reading the data block which corresponds to the read request and is in the cache line from the hit cache line, and returning the data block to an initiator of the read request.
In one embodiment, the reading module 140 is further configured to send, after receiving a read request, the read request to a downstream memory if the read request misses any cache line in the cache, so that the downstream memory returns a data block corresponding to the read request to an initiator of the read request.
The disclosure also proposes a chip including the above cache controller. The chip may be a GPU, a TPU, a CPU, or the like; the disclosure does not limit this.
The embodiment of the disclosure also provides electronic equipment, which comprises the chip. In some use scenarios, the product form of the electronic device is a portable electronic device, such as a smart phone, a tablet computer, a VR device, etc.; in some use cases, the electronic device is in the form of a personal computer, a game console, or the like.
While preferred embodiments of the present disclosure have been described above, those skilled in the art may make additional variations and modifications to these embodiments once the basic inventive concept is known. It is therefore intended that the appended claims be interpreted as including the preferred embodiments and all alterations and modifications that fall within the scope of this disclosure.

Claims (19)

1. A method for processing cache data, the cache includes a plurality of cache lines, any cache line stores: a plurality of data blocks, a data block valid identifier for identifying whether each data block is valid, and a data block dirty identifier for identifying whether each data block is dirty data; the method comprises the following steps:
When writing the data in any cache line back to a downstream memory, traversing the effective identification of the data block in the cache line and the dirty identification of the data block to obtain a target data block which is effective and dirty data;
and writing the obtained target data block back to the downstream memory.
2. The method of claim 1, the writing data in any cache line back to downstream memory, comprising:
upon detecting that the downstream memory is free, determining to write data in each cache line in the cache back to the downstream memory.
3. The method of claim 1, the writing data in any cache line back to downstream memory, comprising:
after receiving a write request, if the write request misses any cache line in the cache and there is no free cache line in the cache, determining a target cache line from the cache, and writing data in the target cache line back to a downstream memory.
4. The method of claim 3, further comprising, after writing the data in the target cache line back to downstream memory:
and writing at least one data block carried by the write request into the target cache line, and modifying the data block effective identifier and the data block dirty identifier into data blocks which are used for identifying the at least one data block as effective and dirty data.
5. A method according to claim 3, wherein the cache and the downstream memory are mapped in a set associative mapping, the cache comprising a plurality of sets, any one of the sets comprising a plurality of cache lines; the determining a target cache line from the cache includes:
determining a set hit by the write request, traversing the data block valid identifiers and data block dirty identifiers of each cache line in the set, and determining the cache line containing the most target data blocks in the set as the target cache line, wherein a target data block is a valid and dirty data block.
6. The method of claim 1, further comprising:
after receiving a write request, if the write request hits any cache line in a cache, or the write request misses any cache line in the cache and there is an idle cache line in the cache, writing at least one data block carried by the write request into the cache line, and modifying a data block valid identifier and a data block dirty identifier into a data block for identifying that the written data block is valid and dirty data.
7. The method of claim 1, further comprising:
after receiving a read request, under the condition that the read request hits any cache line in a cache, determining whether data blocks to be read by the read request are in the cache line according to a data block effective identifier and a data block dirty identifier, if so, reading the data blocks corresponding to the read request from the hit cache line and returning the data blocks to an initiator of the read request;
If not, generating a read command for the data block which corresponds to the read request and is not in the cache line, sending the read command to a downstream memory, reading the data block which corresponds to the read request and is in the cache line from the hit cache line, and returning the data block to an initiator of the read request.
8. The method of claim 7, further comprising:
after receiving a read request, if the read request does not hit any cache line in a cache, the read request is sent to a downstream memory, so that the downstream memory returns a data block corresponding to the read request to an initiator of the read request.
9. The method of claim 1, the any cache line configured to store a number of data blocks, a data block valid identification for identifying whether each data block is valid, and a data block dirty identification for identifying whether each data block is dirty data, comprising:
a plurality of data blocks are stored in any cache line, and a data block valid identifier and a data block dirty identifier are respectively configured for each data block.
10. A cache controller, configured to control a cache, where the cache includes a plurality of cache lines, and any cache line stores: a plurality of data blocks, a data block valid identifier for identifying whether each data block is valid, and a data block dirty identifier for identifying whether each data block is dirty data; the cache controller includes:
the traversal module, configured to, when writing the data in any cache line back to the downstream memory, traverse the data block valid identifiers and data block dirty identifiers in the cache line to obtain the target data blocks that are valid and dirty;
and the write-back module is used for writing the obtained target data block back to the downstream memory.
11. The cache controller of claim 10,
the write-back module is specifically configured to determine to write back data in each cache line in the cache to the downstream memory when the downstream memory is detected to be idle.
12. The cache controller of claim 10,
the write-back module is specifically configured to, after a write request is received, if the write request misses every cache line in the cache and there is no free cache line in the cache, determine a target cache line from the cache and write the data in the target cache line back to a downstream memory.
13. The cache controller of claim 12, the cache controller further comprising:
and the writing module is used for writing at least one data block carried by the writing request into the target cache line, and modifying the data block effective identifier and the data block dirty identifier into data blocks which are used for identifying that the at least one data block is effective and dirty data.
14. The cache controller of claim 12, wherein the cache and the downstream memory are mapped in a set associative mapping, the cache including a plurality of sets, any one of the sets including a plurality of cache lines;
the write-back module is specifically configured to determine the set hit by the write request, traverse the data block valid identifiers and data block dirty identifiers of each cache line in the set, and determine the cache line containing the most target data blocks in the set as the target cache line, where a target data block is a valid and dirty data block.
15. The cache controller of claim 10, the cache controller further comprising:
and the writing module, configured to, after a write request is received, if the write request hits any cache line in the cache, or the write request misses every cache line in the cache and there is a free cache line in the cache that can serve the write request, write the at least one data block carried by the write request into that cache line, and modify the data block valid identifiers and data block dirty identifiers to identify the written data blocks as valid and dirty.
16. The cache controller of claim 10, the cache controller further comprising:
The reading module is used for determining whether the data blocks to be read by the read request are in the cache lines according to the data block effective identification and the data block dirty identification when the read request hits any cache line in the cache after receiving the read request, and if yes, reading the data blocks corresponding to the read request from the hit cache lines and returning the data blocks to the initiator of the read request;
if not, generating a read command for the data block which corresponds to the read request and is not in the cache line, sending the read command to a downstream memory, reading the data block which corresponds to the read request and is in the cache line from the hit cache line, and returning the data block to an initiator of the read request.
17. The cache controller of claim 16,
and the reading module is further configured to send the read request to a downstream memory if the read request misses any cache line in the cache after receiving the read request, so that the downstream memory returns a data block corresponding to the read request to an initiator of the read request.
18. A chip comprising a cache controller as claimed in any one of claims 10 to 17.
19. An electronic device comprising the chip of claim 18.
CN202410085799.2A 2024-01-22 2024-01-22 Cache data processing method, cache controller, chip and electronic equipment Pending CN117609314A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410085799.2A CN117609314A (en) 2024-01-22 2024-01-22 Cache data processing method, cache controller, chip and electronic equipment


Publications (1)

Publication Number Publication Date
CN117609314A true CN117609314A (en) 2024-02-27

Family

ID=89956484

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410085799.2A Pending CN117609314A (en) 2024-01-22 2024-01-22 Cache data processing method, cache controller, chip and electronic equipment

Country Status (1)

Country Link
CN (1) CN117609314A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170242794A1 (en) * 2016-02-19 2017-08-24 Seagate Technology Llc Associative and atomic write-back caching system and method for storage subsystem
CN111052096A (en) * 2017-08-30 2020-04-21 美光科技公司 Buffer line data
CN115422604A (en) * 2022-08-16 2022-12-02 华中科技大学 Data security processing method for nonvolatile memory, memory controller and system
CN115858417A (en) * 2023-02-01 2023-03-28 南京砺算科技有限公司 Cache data processing method, device, equipment and storage medium



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination