WO2022021178A1 - Caching method, system and chip - Google Patents
Caching method, system and chip
- Publication number
- WO2022021178A1 (PCT/CN2020/105696)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- cache
- data
- page
- memory
- access
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/12—Replacement control
Definitions
- the embodiments of the present application relate to the technical field of caching, and in particular, to a caching method, system, and chip.
- Die-stacked DRAM, connected by through-silicon vias (TSVs), can serve as on-chip memory.
- On-chip memory can be used as ordinary memory or as a cache of the off-chip memory (DDR).
- the cache space in the cache is allocated at the granularity of pages.
- the cache space of the cache can be divided into multiple pages, and similarly, the memory space can also be divided into multiple pages.
- a cache page can be selected in the cache for storage based on the pre-established mapping relationship between the memory page and the cache page.
- A least recently used (LRU) replacement policy or a first-in-first-out (FIFO) replacement policy is usually adopted to select a page from the cache, and the previously saved data of the selected page is replaced with the newly acquired data.
- The pages selected in the traditional way may store more dirty data; dirty data is data stored in the cache that has been rewritten by the processor.
- When a page replacement occurs, this dirty data needs to be written back to memory.
- As a result, the cache bandwidth is heavily occupied, causing data congestion, reducing data transmission efficiency, and causing data access delay.
- the caching method, system and chip provided by the present application can improve data caching efficiency.
- An embodiment of the present application provides a caching method. The caching method includes: receiving a data read request, and determining, based on the data read request, the data that needs to be written into the cache from the memory; acquiring the number of data access requests received within a unit time; selecting a first page from the cache based on the number of data access requests; and storing the data that needs to be written into the cache from the memory in the first page.
- In the caching method provided by this application, data is stored in cache pages.
- The usage of the cache bandwidth can be determined by counting the number of access requests sent by the processor within a unit time, and a cache page can then be selected based on the cache bandwidth usage, so that the data to be cached is stored in the selected cache page.
- In this way, a cache page whose stored data contains little dirty data can be selected during periods of high data transmission volume, thereby reducing the occupancy of the memory access bandwidth, which is beneficial for improving the data caching efficiency and the data access efficiency of the processor.
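The two-branch selection just described can be sketched as follows; this is an illustrative model with assumed names (`select_first_page`, `dirty_units`, `lru_rank`) and an arbitrary threshold, not the patented implementation:

```python
# Illustrative sketch of the claimed selection policy: when the number of
# data access requests per unit time is high, prefer the page with the
# least dirty data (cheapest to evict); otherwise fall back to a
# priority-based policy such as LRU. All names and values are assumptions.

def select_first_page(pages, access_requests_per_unit_time, first_threshold):
    """pages: list of dicts with 'dirty_units' (count of dirty data units)
    and 'lru_rank' (larger rank = less recently used)."""
    if access_requests_per_unit_time >= first_threshold:
        # Bandwidth is under pressure: minimize dirty write-back traffic.
        return min(pages, key=lambda p: p["dirty_units"])
    # Bandwidth is ample: use the ordinary priority policy (LRU here).
    return max(pages, key=lambda p: p["lru_rank"])

pages = [
    {"name": "A", "dirty_units": 4, "lru_rank": 2},
    {"name": "B", "dirty_units": 20, "lru_rank": 3},  # the LRU victim
]
busy = select_first_page(pages, access_requests_per_unit_time=1000, first_threshold=500)
idle = select_first_page(pages, access_requests_per_unit_time=100, first_threshold=500)
```

Under load the sketch picks page A (fewest dirty units to write back); when idle it picks page B (least recently used), matching the two claimed branches.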
- Selecting the first page from the cache based on the number of data access requests includes: in response to the number of data access requests being greater than or equal to a first threshold, selecting the page with the least dirty data stored in the cache as the first page.
- Selecting the first page from the cache based on the number of data access requests includes: in response to the number of data access requests being less than a first threshold, selecting the first page from the cache based on the priority level information of the data stored in each page in the cache; the priority level includes one of the following: least recently used information, first-in-first-out information, or access frequency information.
- Selecting the first page from the cache based on the number of data access requests includes: in response to the number of data access requests being greater than or equal to the first threshold, selecting the first page based on the cache access amount per unit time and the memory access amount per unit time.
- the cache hit rate and the bandwidth occupancy rate can be further considered, thereby further improving the cache efficiency.
- The cache access amount includes one of the following: the number of cache hits or the data transfer amount between the cache and the processor; the memory access amount includes one of the following: the number of memory accesses or the data transfer amount between the memory and the processor.
- Selecting the first page based on the cache access amount per unit time and the memory access amount per unit time includes: determining the ratio between the cache access amount and the memory access amount; and selecting the first page from the cache based on the ratio between the cache access amount and the memory access amount.
- Selecting the first page from the cache based on the ratio between the cache access amount and the memory access amount includes: in response to the ratio between the cache access amount and the memory access amount being greater than or equal to a second threshold, selecting the first page from the cache based on the location information of the pages in the cache occupied by the data that needs to be written into the cache from the memory and the location information of the dirty data stored in each page in the cache.
- Selecting the first page from the cache based on the ratio between the cache access amount and the memory access amount includes: in response to the ratio between the cache access amount and the memory access amount being less than the second threshold, selecting the first page from the cache based on the priority level information of the data stored in each page in the cache;
- the priority level includes one of the following: least recently used information, first-in-first-out information, or access frequency information.
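The ratio comparison in the claims above can be sketched as follows; the function name, counter values, and threshold are illustrative assumptions:

```python
# Sketch of the second-stage decision: compare the ratio of cache
# accesses to memory accesses against a second threshold to decide
# between a dirty-data-aware policy and a priority-based policy.
# This is a model of the claim language, not the patent's implementation.

def choose_policy(cache_accesses, memory_accesses, second_threshold):
    ratio = cache_accesses / memory_accesses
    if ratio >= second_threshold:
        # High hit ratio: memory write-back bandwidth is the scarce
        # resource, so select the victim from dirty-data locations.
        return "dirty-aware"
    # Low hit ratio: fall back to LRU / FIFO / access-frequency priority.
    return "priority"

policy_hot = choose_policy(cache_accesses=900, memory_accesses=100, second_threshold=4)
policy_cold = choose_policy(cache_accesses=300, memory_accesses=300, second_threshold=4)
```

With 900 cache accesses against 100 memory accesses the ratio (9) clears the assumed threshold and the dirty-data-aware branch is taken; an even split falls back to the priority branch.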
- The method further includes: updating first index information stored in the cache, where the first index information is used to index the data, stored in the first page, that needs to be written into the cache from the memory.
- The method further includes: obtaining the location information of an idle data unit in the first page, and updating second index information stored in the cache according to the location information of the idle data unit, where the second index information is used to index the original data in the data unit corresponding to the location information in the first page.
- An embodiment of the present application provides a cache system. The cache system includes a cache for storing data from a memory and index information for indexing the data stored in the cache; and a storage controller for receiving a data read request, determining, based on the data read request, the data that needs to be written into the cache from the memory, obtaining the number of data access requests received per unit time, selecting a first page based on the number of data access requests, and saving the data that needs to be written into the cache from the memory in the first page.
- The storage controller is further configured to: in response to the number of data access requests being greater than or equal to a first threshold, select the first page from the cache based on the dirty data stored in each page in the cache.
- The storage controller is further configured to: in response to the number of data access requests being less than the first threshold, select the first page from the cache based on the priority level information of the data stored in each page in the cache; the priority level includes one of the following: least recently used information, first-in-first-out information, or access frequency information.
- The storage controller is further configured to: in response to the number of data access requests being greater than or equal to the first threshold, select the first page based on the cache access amount per unit time and the memory access amount per unit time.
- the cache access amount includes one of the following: the number of cache hits or the data transmission amount between the cache and the processor; the memory access amount includes one of the following: memory The number of accesses or the amount of data transferred between memory and the processor.
- The storage controller is further configured to: determine the ratio between the cache access amount and the memory access amount, and select the first page from the cache based on the ratio between the cache access amount and the memory access amount.
- The storage controller is further configured to: in response to the ratio between the cache access amount and the memory access amount being greater than or equal to a second threshold, select the first page from the cache according to the location information of the pages in the cache occupied by the data written into the cache from the memory and the location information of the dirty data stored in each page in the cache.
- The storage controller is further configured to: in response to the ratio between the cache access amount and the memory access amount being less than the second threshold, select the first page from the cache based on the priority level information of the data stored in each page in the cache; the priority level includes one of the following: least recently used information, first-in-first-out information, or access frequency information.
- the cache system further includes a first counter, where the first counter is used to count the number of data access requests received by the storage controller in a unit time .
- the cache system further includes a second counter; the second counter is used to count the cache access amount of the storage controller per unit time; wherein, the Cache access volume includes one of the following: the number of cache hits or the amount of data transferred between the cache and the processor.
- the cache system further includes a third counter; the third counter is used to count the memory access amount of the storage controller in a unit time; wherein, the The amount of memory access includes one of the following: the number of memory accesses or the amount of data transferred between memory and the processor.
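The three counters enumerated above can be modeled as follows; the class and field names are illustrative assumptions, and a real implementation would use hardware counters reset at each unit-time boundary:

```python
# Minimal model of the three per-unit-time counters described in the
# cache system: first counter (data access requests), second counter
# (cache accesses), third counter (memory accesses). Names are assumed.

class BandwidthCounters:
    def __init__(self):
        self.access_requests = 0  # first counter
        self.cache_accesses = 0   # second counter (hits, or bytes moved)
        self.memory_accesses = 0  # third counter (accesses, or bytes moved)

    def record(self, cache_hit):
        # Every request increments the first counter; hits and misses
        # increment the cache and memory counters respectively.
        self.access_requests += 1
        if cache_hit:
            self.cache_accesses += 1
        else:
            self.memory_accesses += 1

    def reset(self):
        # Invoked at each unit-time boundary (e.g. every clock cycle or 1 s).
        self.access_requests = self.cache_accesses = self.memory_accesses = 0

c = BandwidthCounters()
for hit in [True, True, False, True]:
    c.record(hit)
```

After four requests with three hits, the first counter reads 4, the second 3, and the third 1, giving the storage controller the quantities it compares against the first and second thresholds.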
- an embodiment of the present application provides a chip, where the chip includes the cache system described in the second aspect.
- the chip further includes a processor, configured to access the data stored in the cache system, and store the processed data in the cache system.
- FIG. 1 is a schematic structural diagram of a cache system provided by an embodiment of the present application.
- FIG. 2 is a schematic diagram of a mapping relationship between a memory page and a cache page provided by an embodiment of the present application
- FIG. 3 is a schematic structural diagram of a cache provided by an embodiment of the present application.
- FIG. 4 is another schematic structural diagram of a cache provided by an embodiment of the present application.
- FIG. 5 is another schematic structural diagram of a cache provided by an embodiment of the present application.
- FIG. 6 is a schematic diagram of data units occupied by dirty data stored in cache page B as shown in FIG. 3 provided by an embodiment of the present application;
- FIG. 7 is a schematic diagram of data units occupied by dirty data stored in cache page A as shown in FIG. 3 provided by an embodiment of the present application;
- FIG. 8 is a schematic diagram of a data unit in a cache page occupied by data in a memory page to be stored provided by an embodiment of the present application;
- FIG. 9 is a flowchart of a caching method provided by an embodiment of the present application.
- FIG. 11 is a schematic structural diagram of a cache device provided by an embodiment of the present application.
- References herein to "first", "second", and similar terms do not denote any order, quantity, or importance, but are merely used to distinguish the various components. Likewise, words such as "a" or "an" do not denote a quantitative limitation, but rather denote the presence of at least one.
- module mentioned in this document generally refers to a functional structure divided according to logic, and the “module” can be realized by pure hardware, or realized by a combination of software and hardware.
- words such as “exemplary” or “for example” are used to indicate an example, illustration or illustration. Any embodiments or designs described in the embodiments of the present application as “exemplary” or “such as” should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as “exemplary” or “such as” is intended to present the related concepts in a specific manner.
- The meaning of "plurality" is two or more. For example, multiple pages refers to two or more pages; multiple index information refers to two or more pieces of index information.
- FIG. 1 shows a schematic structural diagram of a cache system applied to the present application.
- A cache system 100 includes a processor, a storage controller, a cache, and a memory.
- the data required for the operation of the processor is stored in the memory.
- Part of the data stored in memory is stored in the cache.
- the processor can initiate data access requests and perform data processing.
- The storage controller controls the data interaction between the processor and the cache and between the cache and the memory based on the data access request initiated by the processor. Under the control of the storage controller, data in the memory can be written to the cache, provided to the processor, or written back to the memory.
- The storage controller can detect, based on the data access request, whether the data exists in the cache. If the data accessed by the processor is stored in the cache, the storage controller controls the cache to provide the data to the processor through the bus; if the data accessed by the processor is not stored in the cache, the storage controller can control the data to be fetched from the memory and provided to the processor. In addition, the data can also be written into the cache after being fetched from the memory, so that the processor can obtain the data directly from the cache next time.
- the cache as shown in FIG. 1 may include a multi-level cache structure, such as L1 level, L2 level and L3 level.
- When the processor accesses data, it can first access the L1-level cache.
- If the L1-level cache misses, it can continue to access the L2-level cache.
- If the L2-level cache misses, it can continue to access the L3-level cache.
- If the L3-level cache misses, the data can be retrieved from memory. That is, for the L1-level cache, the L2-level cache and the L3-level cache are the next-level caches; for the L2-level cache, the L3-level cache is the next-level cache.
- When data needs to be written back, for example, when data stored at the L1 level needs to be written back, it can be written back to the L2-level cache, the L3-level cache, or the memory; when data stored at the L3 level needs to be written back, it can only be written back to the memory.
- the L1-level cache, the L2-level cache, and the L3-level cache may be caches with the same cache structure but different data capacities.
- the caches shown in Figure 1 do not distinguish between L1, L2, and L3 caches.
- the cache space in each level of cache is allocated with page granularity. Specifically, the cache space of each level of cache can be divided into multiple pages.
- the pages in the cache are collectively referred to as cache pages in the following description.
- a cache page can also be understood as a cache line.
- the memory storage space can also be divided into multiple pages.
- A page in the memory is referred to as a memory page in the following description.
- the storage capacity of a memory page can be the same as the storage capacity of a cache page.
- the data stored in the same cache page can come from the same memory page or from different memory pages.
- The data saved in multiple memory pages can be stored simultaneously in the cache pages corresponding to the same set (Set) in the cache, and the pages cached in the same set have a competitive relationship.
- Each set can be provided with multiple ways (Way).
- When cache pages corresponding to the multiple ways have no data stored, the data stored in the memory pages mapped to the same set can be stored in that set.
- The number of ways in the cache determines the set associativity; for example, the cache may be, but is not limited to, 2-way set associative, 4-way set associative, or 8-way set associative.
- FIG. 2 schematically shows the correspondence between the cache space in the cache and the storage space in the memory.
- FIG. 2 exemplarily shows a situation where the cache is 2-way set associative.
- the cache includes two ways Way0 and Way1, and each way can store two sets of page data.
- the memory storage space can be divided into eight memory pages.
- The data in memory page 01, memory page 11, memory page 21, and memory page 31 can be respectively stored in the cache pages corresponding to cache set Set0, and the data in memory page 02, memory page 12, memory page 22, and memory page 32 can be respectively stored in the cache pages corresponding to cache set Set1.
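The FIG. 2 arrangement can be sketched as follows. The patent labels the four pages competing for Set0 as memory pages 01, 11, 21, and 31; here the eight pages are simply numbered 0 to 7, and the modulo placement rule is an assumed convention for illustration, not taken from the patent text:

```python
# Sketch of a 2-way set-associative cache with two sets, as in FIG. 2.
# Eight memory pages map onto the sets by page number modulo the number
# of sets; four pages compete for the two ways of each set.

NUM_SETS = 2
NUM_WAYS = 2  # Way0 and Way1

def set_index(page_number):
    # Assumed placement rule: low-order bits of the page number pick the set.
    return page_number % NUM_SETS

set0_pages = [p for p in range(8) if set_index(p) == 0]
set1_pages = [p for p in range(8) if set_index(p) == 1]
```

Each set receives four candidate memory pages but holds only two cache pages at a time, which is the competitive relationship described above.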
- a page can be divided into multiple data units.
- data access is usually performed in units of data blocks.
- A cache page may store the data of all data units in a memory page, or may store the data of only some data units in a memory page. That is to say, when the data in a certain memory page is written into the cache, only the data of some data units in that memory page may be cached. For example, if a page in the cache or memory can save 4KB of data and is divided into 32 data units, each data unit can save 128B of data.
- the cache page may only store data stored in some data units (for example, 5 data units) of a certain memory page.
- FIG. 3 to FIG. 5 three schematic diagrams of cache structures of the cache shown in FIG. 1 are respectively shown.
- the cache may include a tag array Tag Array and a data array Data Array.
- the data obtained from the memory is stored in the cache page of the data array Data Array, and the index information used to index the data stored in the cache page is stored in the Tag Array.
- The tag array and the data array are both m*n arrays.
- each row represents a set of Set
- each column represents a Way.
- Each element in the tag array Tag Array is an index information
- each element in the data array Data Array is a cache page.
- The elements in the tag array Tag Array are in one-to-one correspondence with the elements in the data array Data Array, and each piece of index information in the tag array Tag Array is the index information of the data stored in the corresponding cache page in the data array Data Array.
- the data stored in each cache page in the data array Data Array comes from the same memory page.
- Each index information may include Tag (Tag) information, Valid Bits (Valid Bits) information, Dirty Bits (Dirty Bits) information and priority information.
- Tag information is used to indicate the physical address information, in the memory, of the memory page from which the data stored in the cache page comes, and the set information corresponding to the cache page in which the data is stored. Data from the same memory page has the same Tag information.
- Dirty Bits information is used to indicate whether the data in each data unit stored in the cache page is dirty data. If a Dirty Bits bit is set to 0, it indicates that the data stored in the corresponding data unit is clean data, which can be directly invalidated without being written back to the off-chip memory when a replacement occurs. Conversely, if a Dirty Bits bit is set to 1, all the data in the data unit where the corresponding dirty data is located needs to be written back to the off-chip memory when a replacement occurs.
- Dirty data here specifically refers to data stored in the cache that has been rewritten by the processor; the rewritten data is not yet stored in the memory. If the dirty data in the cache is overwritten by other data without being written back to the memory, the data will be lost.
- the valid bits (Valid Bits) information is used to indicate whether each data unit in the cache page stores valid data.
- the cache page includes several data units, and the valid bit information is represented by several bits. For example, if the cache page includes 32 data units, it is represented by 32 bits. Additionally, each bit may include a "0" state and a "1" state. When a certain bit is "1", it indicates that the data stored in the corresponding data unit is valid; when a certain bit is "0", it indicates that the data stored in the corresponding data unit is invalid.
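The valid-bit and dirty-bit bookkeeping described above can be sketched as follows, using the 4KB page divided into 32 data units of 128B from the earlier example. The bit convention follows the text (1 = valid or dirty, 0 = not); the helper names are illustrative assumptions:

```python
# Sketch of the per-page metadata: a 4 KB page split into 32 data units
# of 128 B each, tracked by 32-bit valid and dirty masks.

PAGE_SIZE = 4096
NUM_UNITS = 32
UNIT_SIZE = PAGE_SIZE // NUM_UNITS  # 128 bytes per data unit

def mark_valid(valid_bits, unit):
    return valid_bits | (1 << unit)

def mark_dirty(dirty_bits, unit):
    return dirty_bits | (1 << unit)

def needs_writeback(valid_bits, dirty_bits, unit):
    # A data unit must be written back only if it holds data that is
    # both valid and dirty, as the description below states.
    mask = 1 << unit
    return bool(valid_bits & dirty_bits & mask)

valid = mark_valid(0, 5)   # unit 5 now holds valid data
dirty = mark_dirty(0, 5)   # unit 5 has been rewritten by the processor
```

Here unit 5 is valid and dirty, so it would need write-back on replacement; a valid but clean unit would simply be invalidated.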
- The priority level information is used to indicate whether a page is to be replaced preferentially when a page replacement occurs.
- The priority level information PRI includes one of the following: least recently used (LRU) information, which is used to indicate the least recently used page among the pages currently stored in the cache; when this priority level is used, the least recently used page is replaced preferentially. Frequency based replacement (FBR) information is used to indicate how frequently the pages currently stored in the cache are accessed.
- LRU least recently used
- FBR Frequency Based Replacement
- FIFO First In First Out
- FIG. 4 shows another schematic structural diagram of the cache as shown in FIG. 1 .
- the cache may include a tag array Tag Array and a data array Data Array.
- the tag array and the data array can both be m*n arrays.
- the data stored in each cache page in the data array Data Array can come from different memory pages.
- A plurality of index information entries are stored at the Tag Array position that has a mapping relationship with each cache page.
- index information Index01 is used to index the data stored in the cache page A from the memory page 01
- index information Index11 is used to index the data stored in the cache page A from the memory page 11.
- FIG. 5 shows another schematic structural diagram of the cache as shown in FIG. 1 .
- the cache may include a tag array Tag Array and a data array Data Array.
- the tag array is an m*n array
- the data array can be an m*s array, where s is less than or equal to n.
- the tag array Tag Array and the data array Data Array have the same number of sets (Set), and the number of ways (Way) in the tag array Tag Array is not less than the number of ways (Way) in the data array Data Array.
- FIG. 5 schematically shows that the tag array Tag Array is a 2*4 array and the data array Data Array is a 2*3 array.
- the data stored in a cache page may come from the same memory page, or may come from different memory pages.
- The index information stored in the tag array Tag Array shown in FIG. 5 includes, in addition to the information included in the index information stored in the cache shown in FIG. 3, location information, which is used to indicate the position, in the data array Data Array, of the cache page where the data indexed by the index information is located.
- The location information may be way (Way) information, or may be (set Set, way Way) information. Which location information to use is selected according to the needs of the application scenario.
- the group set in the tag array Tag Array and the group set in the data array Data Array have a preset mapping relationship at this time.
- the index information stored in the group Set0 in the tag array Tag Array is respectively used to index the data stored in each cache page in the group set0 in the data array.
- the index information of the data stored in each cache page in the data array can be stored at any position in the Tag Array.
- Traditionally, the cache page whose data is to be overwritten is usually selected based on priority information, such as LRU information or FIFO information, of the data stored in each cache page.
- Before storing the data to be cached in the selected cache page, the dirty data in the overwritten data needs to be written back (for example, to the memory or to the next-level cache). As the cache capacity increases, the amount of valid data and dirty data stored in a cache page may be larger. Selecting the overwritten data using LRU information or FIFO information does not consider the amount of dirty data: when the amount of dirty data in the overwritten data is large, a large amount of dirty data needs to be written back to the memory or the next-level cache, which seriously occupies the cache bandwidth and reduces the data caching efficiency. However, the data stored in some unselected cache pages may include less dirty data, and if such a page is selected, only that smaller amount of dirty data needs to be written back.
- Taking the cache shown in FIG. 3 as an example, assume that the original data currently stored in cache page A includes a large amount of dirty data, while the original data currently stored in the other cache pages includes less dirty data.
- If, based on the LRU information of the data stored in each cache page, the data stored in cache page A is determined to be the least recently used and cache page A is selected, a large amount of dirty data in cache page A needs to be written back.
- If the data to be cached is instead stored in cache page B, it is possible that only less dirty data needs to be written back, which can reduce the occupation of the memory access bandwidth and improve the data caching efficiency.
- In the present application, the number of access requests sent by the processor within a unit time can be counted.
- The usage of the cache bandwidth is determined from the number of requests, and a cache page is then selected based on the usage of the cache bandwidth, so that the data to be cached is stored in the selected cache page.
- In this way, a cache page whose stored data contains little dirty data can be selected during periods of high data transmission volume, thereby reducing the occupancy of the memory access bandwidth, which is beneficial for improving the data caching efficiency and the data access efficiency of the processor.
- a first counter is also included.
- the first counter is used to count the number of access requests sent by the processor within a unit time.
- the unit time here may be, for example, 1s, or may be one clock cycle in the cache system.
- the following takes the cache structure shown in FIG. 3 as an example to describe the caching method of the cache system 100 shown in the embodiment of the present application in detail.
- the storage controller can query whether the cache page corresponding to the group Set0 in the data array Data Array is in a full storage state.
- the storage controller can query whether each storage location in the Tag Array has index information stored, so as to determine whether the cache is in a full storage state. In some other implementation manners, the storage controller may also query whether each cache page in the data array Data Array stores data, so as to determine whether the cache is in a full storage state. It is assumed that index information is stored in each storage location in the current tag array Tag Array, as shown in Figure 3. That is, the cache is in a full storage state at this time.
- the storage controller may acquire the number of access requests sent by the processor within a unit time from the first counter.
- the access request sent by the processor may include reading data from the cache or the memory, or may include writing data to the cache or the memory.
- The storage controller may select a cache page from the cache shown in FIG. 3, based on the number of access requests sent by the processor in a unit time, to save the data to be cached.
- The storage controller may select a cache page for storing the data to be cached based on the priority level information.
- the storage controller may query the priority level information of each index information Index stored at the storage location corresponding to the group Set0 in the tag array Tag Array to determine the priority level of the data that can be indexed by each index information.
- the priority information may include one of the following: LRU information, FIFO information or FBR information.
- The LRU information in each piece of index information is pre-calculated by the storage controller through the LRU algorithm. For example, the data usage within the current preset time period can be sorted (usually, the data usage is reflected by the number of times the processor accesses the cache page storing the data), and the corresponding LRU information can be set, based on the sorting, for the data stored in each cache page.
- The FIFO information and the FBR information may likewise be determined in advance for the data stored in each cache page using their respective algorithms, which will not be repeated here. Assume that, by querying the index information Index of each page, it is determined that the data stored in the cache page B shown in FIG. 3 has the lowest priority. At this time, the storage controller can store the data in the memory page 21 to be cached into cache page B of the data array Data Array, and then update the index information in the tag array Tag Array used to index the data stored in cache page B. That is, the previously saved index information Index11 is updated to the index information Index21.
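The priority-based branch just described can be sketched as follows, with LRU ages standing in for the priority level information; the field and function names are illustrative assumptions:

```python
# Sketch of the priority-based victim choice used when the request rate
# is below the first threshold: each tag-array entry carries an LRU age,
# and the entry with the greatest age (least recently used) is replaced.

def select_by_lru(index_entries):
    """index_entries: dict mapping cache page name -> LRU age,
    where a larger age means less recently used."""
    return max(index_entries, key=index_entries.get)

entries = {"A": 1, "B": 3, "C": 2}  # page B is least recently used
victim = select_by_lru(entries)
```

In this sketch page B has the greatest LRU age and is selected, matching the example in which the data in cache page B has the lowest priority.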
- the dirty data in the data previously stored in the cache page B also needs to be written back to the memory or to the next level cache.
- The selected overwritten data is the data in cache page B. Assume that cache page B includes 32 data units, of which 25 data units currently store valid data, and among those 25 data units, 20 data units store dirty data, as shown in FIG. 6. Before storing the data in the memory page 21 to be cached into the cache, all the dirty data stored in the above 20 data units must be written back. In this case, writing the dirty data back takes up excessive bandwidth resources.
- the data unit occupied by the data stored in the page is represented by the valid bit information in the index information.
- Each bit in the valid bit information represents a data unit: 1 represents that the data stored in the data unit is valid, and 0 represents that the data stored in the data unit is invalid.
- both the valid bit information and the dirty bit information in the index information can be represented by 32 bits.
- the bit indicating the data unit in the valid bit information can be set to be valid (for example, set to "1"); when a data unit is invalid, the corresponding valid bit of the data unit can be Set to invalid (eg set to "0").
- the data held in a data unit needs to be written back only if the data held in that data unit is valid and has dirty data.
- the storage controller may select a page with the least dirty data for data storage. Specifically, the dirty bit information in each index information stored in the Tag Array can be queried, and the cache page with the least dirty data can be selected.
- the dirty bit information in the index information Index01 used for indexing the data in the cache page A shown in FIG. 3 is: 00000000000000001100110000000000
- the dirty bit information in the index information Index11 used for indexing the data in the cache page B shown in FIG. 3 is: 00000001111110001100110000000000.
- each bit represents a data unit, "0" represents that no dirty data is stored in the data unit, and "1" represents that dirty data is stored in the data unit.
- the storage controller may store the data stored in the memory page 21 to be cached in the cache page A, and then update the index information in the Tag Array for indexing the data stored in the cache page A. That is, the previously saved index information Index01 is updated to the index information Index21. It should be noted that, before storing the data stored in the memory page 21 in the cache page A, the dirty data previously stored in the cache page A also needs to be written back to the memory or to the next level cache.
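Using the two dirty-bit strings quoted above (Index01 for cache page A, Index11 for cache page B), the "least dirty data" choice reduces to comparing population counts. A minimal sketch; the helper name is an assumption:

```python
# Dirty-bit strings from the example above (one bit per data unit,
# "1" = dirty). Index01 indexes page A, Index11 indexes page B.
dirty_a = int("00000000000000001100110000000000", 2)  # 4 dirty units
dirty_b = int("00000001111110001100110000000000", 2)  # 10 dirty units

def least_dirty_page(pages: dict) -> str:
    """Pick the cache page whose dirty mask has the fewest set bits."""
    return min(pages, key=lambda name: bin(pages[name]).count("1"))

# Page A has 4 dirty units vs. page B's 10, so A is selected.
assert least_dirty_page({"A": dirty_a, "B": dirty_b}) == "A"
```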
- the first threshold of the number of access requests sent by the processor per unit time shown in the embodiment of the present application may be determined based on the maximum number of accesses that the cache can undertake per unit time. When the maximum number of accesses that the cache can undertake is relatively high, the first threshold can be increased; when it is low, the first threshold can be lowered.
- the storage controller may further select the cache page for saving the data based on the cache access volume per unit time and the memory access volume per unit time.
- the cache system 100 may further include a second counter and a third counter, where the second counter is used to count the amount of cache access within a unit time; the third counter is used to count the amount of memory access within a unit time.
- the cache access amount may be the number of cache hits or the data transfer amount between the cache and the processor; the memory access amount may be the number of times the processor accesses the memory or the data transfer amount between the memory and the processor.
- when the cache access volume and the memory access volume are respectively the number of cache hits and the number of times the processor accesses the memory, only the second counter may be provided, without the third counter: the second counter counts the number of cache hits, and the number of times the processor accesses the memory can be determined by subtracting the number of cache hits from the number of access requests sent by the processor.
- when the cache access volume is the number of cache hits and the memory access volume is the number of times the processor accesses the memory, the second threshold may be the ratio of the maximum number of accesses that the cache can undertake per unit time to the maximum number of accesses that the memory can undertake per unit time; when the cache access volume is the data transfer volume between the cache and the processor and the memory access volume is the data transfer volume between the memory and the processor, the second threshold may be the ratio of the maximum data transfer rate of the cache per unit time to the maximum data transfer rate of the memory.
- the storage controller may acquire the cache access volume per unit time from the second counter and the memory access volume per unit time from the third counter, and then determine the ratio of the cache access volume to the memory access volume. When the ratio is less than or equal to the second threshold, the hit rate of the processor accessing the cache is low and a large amount of data needs to be obtained from the memory; at this time, the cache page can be selected based on the priority information, thereby improving the cache hit rate. When the ratio is greater than the second threshold, the cache bandwidth is overloaded, and the cache page with the least dirty data can be selected for data storage.
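The decision just described amounts to comparing the per-unit-time cache-to-memory access ratio against the second threshold and switching the selection policy accordingly. A minimal sketch; the counter values and threshold below are chosen for illustration, and the zero-division guard is an added assumption:

```python
def choose_policy(cache_accesses: int, memory_accesses: int,
                  second_threshold: float) -> str:
    """Return which page-selection policy applies for this interval."""
    if memory_accesses == 0:
        # No memory traffic at all: treat the cache as heavily loaded.
        return "least_dirty"
    ratio = cache_accesses / memory_accesses
    if ratio <= second_threshold:
        # Low hit rate: favor hit rate, select by priority (LRU/FIFO/FBR).
        return "priority"
    # Cache bandwidth overloaded: pick the page with the least dirty data.
    return "least_dirty"

assert choose_policy(100, 400, 2.0) == "priority"     # ratio 0.25
assert choose_policy(900, 100, 2.0) == "least_dirty"  # ratio 9.0
```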
- the storage controller may acquire, from the first counter, the number of access requests sent by the processor within a unit time.
- the cache page may be selected to store the data to be stored by querying the priority level information in the index information stored in the cache.
- the storage controller may obtain the cache access amount per unit time from the second counter, obtain the memory access amount per unit time from the third counter, and determine Ratio between cache accesses and memory accesses.
- when the ratio is less than the second threshold, the cache page can be selected to save the data to be stored by querying the priority information in the index information stored in the cache; when the ratio is greater than or equal to the second threshold, the page with the least dirty data may be selected for data storage.
- the above describes how the cache system according to the embodiments of the present application selects a cache page when every cache page in the cache stores data, so as to store the data to be stored from the memory in the selected cache page.
- the above implementation manner may be applicable to any cache structure shown in FIG. 3 to FIG. 5 .
- the valid bit information of the data to be cached, together with the valid bit information and dirty bit information of the data stored in the cache pages, may also be used to select the cache page for storing the data to be cached.
- the memory page 01 has a mapping relationship with the cache page corresponding to the group Set0 in the cache.
- the data saved in the memory page 01 needs to be stored in the cache page corresponding to the group Set0 in the data array Data Array.
- the storage controller can query the cache pages corresponding to the group Set0 in the data array Data Array and the storage locations corresponding to the group Set0 in the tag array Tag Array, and find that the cache pages corresponding to the group Set0 in the Data Array all store data, while there is still a free location among the storage locations corresponding to the group Set0 in the Tag Array.
- the storage controller may determine the valid dirty bit information of the data stored in each cache page based on the valid bit information and dirty bit information of the data stored in the cache pages corresponding to the group Set0 (only valid dirty data needs to be written back; dirty data that is invalid does not need to be written back). Then, based on the valid bit information of the data stored in the memory page 01 to be cached and the valid dirty bit information of the data stored in the cache pages corresponding to the group Set0, the storage controller may select the page with the fewest conflicts with the memory page 01.
- a conflict here means that, when the data saved in the memory page 01 to be cached is stored in a cache page, a data unit occupied in that cache page is the same data unit as one occupied by the valid dirty data currently stored in the cache page.
- for example, the data units occupied by the data stored in the memory page 01 to be cached are shown in FIG. 8, and correspondingly, its valid bit information is: 0xFF0E410.
- the valid bit information of the data stored in the cache page B is 0xC975A450, its dirty bit information is 0x00700060, and its valid dirty bit information is 0x00700040; the valid bit information of the data stored in the cache page C is 0xFF8DAC20, its dirty bit information is 0x06980020, and its valid dirty bit information is 0x06880020. Therefore, it can be determined that the data units occupied by the data stored in the cache page B conflict least with the data units occupied by the data stored in the memory page 01 to be cached, so the data stored in the memory page 01 to be cached can be stored in the cache page B.
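The conflict count described above is the overlap between the incoming page's valid mask and each candidate's valid-dirty mask. A sketch using 32-bit masks; note the incoming valid mask is quoted in truncated form in the example, so the value below (padded to 0xFF0E4100) and the function names are assumptions for illustration only:

```python
def conflicts(incoming_valid: int, page_valid_dirty: int) -> int:
    """Count data units the incoming page would occupy that currently
    hold valid dirty data (each such overlap forces a write-back)."""
    return bin(incoming_valid & page_valid_dirty).count("1")

def least_conflicting_page(incoming_valid: int, candidates: dict) -> str:
    """Pick the candidate page whose valid-dirty units overlap least
    with the incoming data's units."""
    return min(candidates,
               key=lambda name: conflicts(incoming_valid, candidates[name]))

incoming = 0xFF0E4100                       # assumed padding of 0xFF0E410
cand = {"B": 0x00700040, "C": 0x06880020}   # valid-dirty masks from the example
assert least_conflicting_page(incoming, cand) == "B"
```

With these masks, page B has zero overlapping units while page C has three, matching the example's choice of page B.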
- the overwritten valid dirty data in the cache page B also needs to be written back to the memory or the next level cache.
- the index information Index02 stored in the cache also needs to be updated; specifically, the valid bit information and the dirty bit information in the index information Index02 are updated.
- the cache, the storage controller, and the processor may be integrated on the same chip to form a system on chip (SoC).
- the processor and cache can be integrated on the same chip, and the memory controller can be integrated on another chip.
- the cache can also be integrated with the processor on a different chip.
- if an off-chip cache adopts the same storage structure design as the on-chip cache provided by the embodiments herein and implements the same functions as the on-chip cache provided by the embodiments herein, the off-chip cache should also be deemed to fall within the protection scope of the embodiments herein.
- FIG. 9 shows a process 900 of the caching method provided by the embodiment of the present application.
- the process 900 of the caching method includes the following steps:
- Step 901 Receive a data read request, and determine, based on the data read request, data to be written into the cache from the memory.
- the data read request usually carries address information of the data to be read, and the address information includes tag information Tag, group information Set, and the like.
- the cache controller uses the group information Set to retrieve the Tag Array, and finds multiple index information in the Set group.
- the cache controller may continue to find out whether one of the plurality of index information includes the Tag information carried in the data read request. If none of the index information includes the Tag information carried in the data read request, the Tag is not hit. That is, the data to be read by the processor is not stored in the cache. At this point, the memory controller needs to obtain the data to be read by the processor from the memory.
- the storage controller may further determine the data to be written into the cache from the memory based on the Tag information carried in the data read request. That is, the location information in the memory of the memory page for saving the data to be written is determined. Then, based on the mapping relationship between the memory page and the cache page as shown in FIG. 2 , a plurality of cache pages for storing the data to be written are determined.
- it may further be detected whether all of the determined multiple cache pages store data.
- Step 902 Acquire the number of data access requests received within a unit time.
- the storage controller can determine the number of data access requests received per unit time.
- the storage controller may be provided with a first counter as shown in FIG. 1 .
- the first counter is used to count the number of data access requests received within a unit time.
- the unit time here can also be referred to as a clock period.
- the unit time may be, for example, 1 s or 30 ms, etc., which is not specifically limited.
- the storage controller may acquire the number of times of data access requests received in a unit time from the first counter.
- the data access request is initiated by the processor, and may be a data read request or a data write request.
- each time the processor sends a data access request to the storage controller, the first counter can be incremented by one.
- Step 903 Select the first page from the cache based on the number of data access requests received per unit time.
- a first threshold may be preset in the storage controller; the first threshold may also be referred to as the maximum number of accesses per unit time.
- the value of the maximum number of accesses per unit time may be set based on the cache bandwidth, the cache capacity, and the like.
- when the number of data access requests received per unit time is less than the first threshold, the memory controller may select a cache page for storing the data to be read based on the determined priority information of the data stored in each cache page.
- the priority level information may include one of the following: LRU information, FIFO information or FBR information.
- when the number of data access requests received per unit time is greater than or equal to the first threshold, the frequency at which the processor accesses the memory or the cache is high, and the occupancy of the cache bandwidth and of the interface used for data transmission with the processor is high. If the priority-information selection method is adopted, the selected overwritten data may contain a large amount of valid data, much of which is dirty data, and writing back the dirty data also consumes substantial bandwidth resources. Since the cache bandwidth and the amount of data the interface can transmit per unit time are limited, this may cause data congestion, reduce the data access efficiency of the processor and the storage efficiency of the cache, and further reduce the operating rate of the device or system.
- the storage controller can select the page with the least dirty data for data storage.
- a cache page for storing the data to be read may be selected based on the determined data dirty bit information stored in each cache page.
- Step 904 Save the data that needs to be written into the cache from the memory in the selected first page.
- the number of access requests sent by the processor per unit time can be counted to determine the usage of the cache bandwidth, and a cache page can then be selected based on the cache bandwidth usage, so as to store the data to be cached in the selected cache page.
- in a specific implementation, during periods of high data transmission volume, a cache page whose stored data contains less dirty data can be selected, thereby reducing the occupancy of the memory access bandwidth, which is beneficial to improving the data caching efficiency and the data access efficiency of the processor.
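Steps 901 to 904 can be condensed into one selection routine: count the requests in the window, then switch between priority-based and least-dirty selection at the first threshold. A minimal sketch; the page names, ranks, and masks below are hypothetical:

```python
def select_first_page(num_requests: int, first_threshold: int,
                      priority_rank: dict, dirty_masks: dict) -> str:
    """Steps 902-903: choose the page to overwrite for this interval."""
    if num_requests >= first_threshold:
        # Bandwidth is tight: minimize write-back traffic.
        return min(dirty_masks, key=lambda p: bin(dirty_masks[p]).count("1"))
    # Bandwidth is fine: evict the lowest-priority page (e.g. oldest LRU rank).
    return min(priority_rank, key=priority_rank.get)

rank = {"A": 3, "B": 1, "C": 2}            # hypothetical LRU ranks (1 = oldest)
dirt = {"A": 0b1, "B": 0b1111, "C": 0b11}  # hypothetical dirty masks
assert select_first_page(10, 100, rank, dirt) == "B"   # below threshold: LRU
assert select_first_page(500, 100, rank, dirt) == "A"  # at/above: least dirty
```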
- FIG. 10 shows a flowchart of still another embodiment of the caching method provided by the present application.
- the process 1000 of the caching method includes:
- Step 1001 Receive a data read request, and determine, based on the data read request, data to be written into the cache from the memory.
- Step 1002 Acquire the number of data access requests received within a unit time.
- for the specific implementation of step 1001 and step 1002, reference may be made to the relevant description of step 901 and step 902 shown in FIG. 9, which will not be repeated here.
- Step 1003 Determine whether the number of data access requests is greater than or equal to a first threshold. When it is determined that the number of data access requests is less than the first threshold, step 1004 is performed; when it is determined that the number of data access requests is greater than or equal to the first threshold, step 1005 is performed.
- Step 1004: Based on the determined priority information of the data stored in each cache page, select a first page for storing the data to be written.
- Step 1005 Determine whether the ratio between the amount of cache access per unit time and the amount of memory access per unit time is greater than or equal to a second threshold.
- when the number of data access requests received per unit time is greater than or equal to the first threshold, the storage controller further selects the cache page for saving the data to be read based on the cache access volume per unit time and the memory access volume per unit time.
- the cache access amount includes one of the following: the number of cache hits or the data transfer amount between the cache and the processor; the memory access amount includes one of the following: the number of memory accesses or the data transfer amount between the memory and the processor.
- the cache system 100 shown in FIG. 1 further includes a second counter and a third counter, the second counter is used to count the amount of cache access within a unit time; the third counter is used to count the amount of memory access within a unit time.
- when the cache access volume and the memory access volume are respectively the number of cache hits and the number of times the processor accesses the memory, only the second counter may be provided, without the third counter: the second counter counts the number of cache hits, and the number of times the processor accesses the memory can be determined by subtracting the number of cache hits from the number of access requests sent by the processor.
- the storage controller may acquire the cache access volume per unit time from the second counter, and acquire the memory access volume per unit time from the third counter. Then determine the ratio of cache access to memory access.
- when the cache access volume is the number of cache hits and the memory access volume is the number of memory accesses, the second threshold may be the ratio of the maximum number of accesses that the cache can undertake per unit time to the maximum number of accesses that the memory can undertake per unit time; when the cache access volume is the data transfer volume between the cache and the processor and the memory access volume is the data transfer volume between the memory and the processor, the second threshold may be the ratio of the maximum data transfer rate of the cache per unit time to the maximum data transfer rate of the memory.
- when the ratio between the cache access volume and the memory access volume is less than the second threshold, step 1004 is performed; when the ratio is greater than or equal to the second threshold, step 1006 is performed.
- when the ratio between the cache access volume and the memory access volume is less than the second threshold, the hit rate of the processor accessing the cache is low, and a large amount of data needs to be obtained from the memory. At this time, the replacement page is determined based on the priority information, thereby improving the cache hit rate.
- Step 1006: Based on the location information of the pages in the cache occupied by the data that needs to be written into the cache from the memory and the location information of the dirty data saved in each page of the cache, select the first page for saving the data to be written.
- when the ratio is greater than or equal to the second threshold, the cache bandwidth is overloaded. At this time, one of the multiple cache pages can be selected based on the determined valid bit information and dirty bit information of the data stored in the multiple cache pages and the location information of the data units in the cache pages occupied by the data to be read, and the data to be read is then stored in the selected cache page.
- in the cache structure shown in FIG. 5, when all cache pages store data and there is still a free location in the tag array Tag Array shown in FIG. 5 in which no index information is saved, the data to be read is stored in one of the cache pages, and the index information of the data to be read is stored in the free position in the Tag Array; the location information of the free data units in that cache page can also be obtained. Then, according to the location information of the free data units, the second index information stored in the cache is updated. The second index information is used to index the original data in the free data units.
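Putting steps 1001 to 1006 together, the full decision of FIG. 10 uses both thresholds in sequence. A sketch with hypothetical counter values; the policy labels are illustrative names, not the patent's terminology:

```python
def select_policy(num_requests: int, first_threshold: int,
                  cache_accesses: int, memory_accesses: int,
                  second_threshold: float) -> str:
    """Decision tree of FIG. 10 (steps 1003 and 1005): which
    page-selection rule applies for this interval."""
    if num_requests < first_threshold:
        return "priority"                 # step 1004
    ratio = (cache_accesses / memory_accesses
             if memory_accesses else float("inf"))
    if ratio < second_threshold:
        return "priority"                 # step 1004 via step 1005
    return "least_conflict"               # step 1006: position-based selection

assert select_policy(50, 100, 0, 0, 1.5) == "priority"            # below 1st
assert select_policy(200, 100, 30, 100, 1.5) == "priority"        # ratio 0.3
assert select_policy(200, 100, 300, 100, 1.5) == "least_conflict" # ratio 3.0
```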
- the storage controller includes corresponding hardware and/or software modules for executing each function.
- the present application can be implemented in the form of hardware, or a combination of hardware and computer software, in conjunction with the algorithm steps of each example described in the embodiments disclosed herein. Whether a function is performed by hardware or by computer-software-driven hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functionality for each particular application, but such implementations should not be considered beyond the scope of this application.
- the storage controller may be divided into functional modules according to the foregoing method examples.
- each functional module may be divided corresponding to each function, or two or more functions may be integrated into one cache control module.
- the above-mentioned integrated modules can be implemented in the form of hardware. It should be noted that, the division of modules in this embodiment is schematic, and is only a logical function division, and there may be other division manners in actual implementation.
- FIG. 11 shows a possible schematic diagram of the composition of the storage controller 1100 involved in the above embodiment.
- the storage controller 1100 may include: A receiving module 1101 , an obtaining module 1102 , a selection module 1103 and a saving module 1104 .
- the receiving module 1101 is used to receive a data read request, and based on the data read request, determine the data that needs to be written into the cache from the memory; the obtaining module 1102 is used to obtain the number of data access requests received per unit time; the selection module 1103, configured to select a first page from the cache based on the number of times of the data access request; and a saving module 1104, configured to save the data that needs to be written into the cache from the memory in the first page.
- the selection module 1103 is further configured to: in response to the number of data access requests being greater than or equal to a first threshold, select the first page from the cache based on the dirty data stored in the pages in the cache.
- the selection module 1103 is further configured to: in response to the number of data access requests being greater than or equal to the first threshold, based on the cache access volume per unit time and the memory access volume per unit time, select out the first page.
- the cache access amount includes one of the following: the number of cache hits or the data transfer amount between the cache and the processor; the memory access amount includes one of the following: the number of memory accesses or the number of memory The amount of data transferred between processors.
- the selection module 1103 is further configured to: determine a ratio between the cache access amount and the memory access amount; based on the ratio between the cache access amount and the memory access amount, The first page is selected from the cache.
- the selection module 1103 is further configured to: in response to the ratio between the cache access amount and the memory access amount being greater than or equal to a second threshold, select the first page from the cache based on the location information of the pages in the cache occupied by the data that needs to be written into the cache from the memory and the location information of the dirty data stored in each page in the cache.
- the selection module 1103 is further configured to: in response to the ratio between the cache access amount and the memory access amount being smaller than the second threshold, select the first page from the cache based on the priority information of the data stored in each page in the cache; the priority includes one of the following: least recently used information, first-in-first-out information, or access frequency information.
- the storage controller 1100 further includes a first update module (not shown in the figure), where the first update module is configured to update the first index information saved in the cache, and the first index information is used to index the data to be read saved in the first page.
- the storage controller 1100 further includes a second update module (not shown in the figure): the second update module is configured to obtain the location information of the free data unit in the first page, The second index information stored in the cache is updated according to the location information, where the second index information is used to index the original data in the data unit corresponding to the location information in the first page.
- the storage controller 1100 provided in this embodiment is configured to execute the caching method executed by the storage controller shown in the caching system 100, and can achieve the same effects as the foregoing implementation method.
- the memory controller may implement or execute various exemplary logic modules described in connection with the present disclosure.
- the memory controller can also be a combination that implements computing functions, including, for example, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.
- the disclosed caching apparatus and method may be implemented in other manners.
- the device embodiments described above are only illustrative.
- the division of modules is only a logical function division; in actual implementation there may be other division manners, for example, multiple modules or components may be combined or integrated.
- the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of the devices may be in electrical, mechanical or other forms.
- Units described as separate components may or may not be physically separated, and components shown as units may be one physical unit or multiple physical units, that is, may be located in one place, or may be distributed in multiple different places. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
- each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
- the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
- the integrated unit if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a readable storage medium.
- a readable storage medium includes several instructions to enable a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods in the various embodiments of the present application.
- the aforementioned readable storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Abstract
Embodiments of the present application provide a caching system, method, and chip. The caching method includes: receiving a data read request, and determining, based on the data read request, the data that needs to be written into the cache from the memory; acquiring the number of data access requests received per unit time; selecting a first page from the cache based on the number of data access requests; and saving the data that needs to be written into the cache from the memory in the first page. The caching method shown in the embodiments of the present application can improve data caching efficiency.
Description
The embodiments of the present application relate to the technical field of caching, and in particular, to a caching method, system, and chip.
With the development of chip process technology, the media for implementing memory have become increasingly diverse. On-chip memory (die-stacked DRAM) is a new technology proposed to solve the memory access bandwidth problem. Through-silicon via (TSV) technology can be used to package a large-capacity memory chip and a processor on the same system-on-chip (SoC) to realize large-capacity on-chip memory. Taking 2.5D- or 3D-packaged DRAM as an example, data show that the bandwidth of on-chip memory can reach 4 to 8 times that of off-chip double data rate DRAM (DDR DRAM).
In current technology, on-chip memory can be used as ordinary memory or as a cache for off-chip memory (DDR). When on-chip memory is used as a cache, the cache space is allocated at page granularity. The cache space can be divided into multiple pages, and likewise the memory space can also be divided into multiple pages. When data in off-chip memory is saved to a cache page, a cache page can be selected for storage based on a pre-established mapping relationship between memory pages and cache pages. When all cache pages in the cache store data and data in the memory needs to be saved in the cache, a least-recently-used or first-in-first-out replacement policy is usually adopted to select a page from the cache, so that the newly acquired data replaces the data previously saved in the selected page. A page selected in the traditional way may store a large amount of dirty data, that is, data stored in the cache that has been rewritten by the processor. When page replacement occurs, the dirty data needs to be written back to memory. When a large amount of dirty data needs to be written back, the cache bandwidth is heavily occupied, causing data congestion, reducing data transmission efficiency, and incurring data access latency.
Therefore, how to improve data caching efficiency in large-capacity cache scenarios has become a problem to be solved.
Summary of the Invention
The caching method, system, and chip provided by the present application can improve data caching efficiency.
To achieve the above objective, the present application adopts the following technical solutions:
In a first aspect, an embodiment of the present application provides a caching method, including: receiving a data read request, and determining, based on the data read request, the data that needs to be written into the cache from the memory; acquiring the number of data access requests received per unit time; selecting a first page from the cache based on the number of data access requests; and saving the data that needs to be written into the cache from the memory in the first page.
In the caching method provided by the present application, when all cache pages store data and the data to be cached needs to overwrite original data stored in the cache, the usage of the cache bandwidth can be determined by counting the number of access requests sent by the processor per unit time, and a cache page can then be selected based on the cache bandwidth usage, so that the data to be cached is stored in the selected cache page. In a specific implementation, during periods of high data transmission volume, a cache page whose stored data contains less dirty data can be selected, thereby reducing the occupancy of the memory access bandwidth, which is beneficial to improving data caching efficiency and the data access efficiency of the processor.
Based on the first aspect, in a possible implementation, selecting the first page from the cache based on the number of data access requests includes: in response to the number of data access requests being greater than or equal to a first threshold, selecting the page with the least dirty data saved in the cache as the first page.
Based on the first aspect, in a possible implementation, selecting the first page from the cache based on the number of data access requests includes: in response to the number of data access requests being less than the first threshold, selecting the first page from the cache based on the priority information of the data saved in each page of the cache; the priority includes one of the following: least-recently-used information, first-in-first-out information, or access frequency information.
Based on the first aspect, in a possible implementation, selecting the first page from the cache based on the number of data access requests includes: in response to the number of data access requests being greater than or equal to the first threshold, selecting the first page based on the cache access volume per unit time and the memory access volume per unit time.
By further introducing the ratio between the cache access volume and the memory access volume to select the first page for saving data, both the cache hit rate and the bandwidth occupancy can be taken into account, thereby further improving caching efficiency.
Based on the first aspect, in a possible implementation, the cache access volume includes one of the following: the number of cache hits or the data transfer volume between the cache and the processor; the memory access volume includes one of the following: the number of memory accesses or the data transfer volume between the memory and the processor.
Based on the first aspect, in a possible implementation, selecting the first page based on the cache access volume per unit time and the memory access volume per unit time includes: determining the ratio between the cache access volume and the memory access volume; and selecting the first page from the cache based on the ratio between the cache access volume and the memory access volume.
Based on the first aspect, in a possible implementation, selecting the first page from the cache based on the ratio between the cache access volume and the memory access volume includes: in response to the ratio between the cache access volume and the memory access volume being greater than or equal to a second threshold, selecting the first page from the cache based on the location information of the cache pages occupied by the data that needs to be written into the cache from the memory and the location information of the dirty data saved in each page of the cache.
Based on the first aspect, in a possible implementation, selecting the first page from the cache based on the ratio between the cache access volume and the memory access volume includes: in response to the ratio between the cache access volume and the memory access volume being less than the second threshold, selecting the first page from the cache based on the priority information of the data saved in each page of the cache; the priority includes one of the following: least-recently-used information, first-in-first-out information, or access frequency information.
Based on the first aspect, in a possible implementation, the method further includes: updating first index information saved in the cache, where the first index information is used to index the data saved in the first page that needs to be written into the cache from the memory.
Based on the first aspect, in a possible implementation, the method further includes: obtaining location information of a free data unit in the first page, and updating second index information saved in the cache according to the location information of the free data unit, where the second index information is used to index the original data in the data unit corresponding to the location information in the first page.
In a second aspect, an embodiment of the present application provides a caching system, including: a cache, configured to save data from the memory and index information for indexing the data saved in the cache; and a storage controller, configured to receive a data read request, determine, based on the data read request, the data that needs to be written into the cache from the memory, acquire the number of data access requests received per unit time, select a first page from the cache based on the number of data access requests, and save the data that needs to be written into the cache from the memory in the first page.
Based on the second aspect, in a possible implementation, the storage controller is further configured to: in response to the number of data access requests being greater than or equal to a first threshold, select the first page from the cache based on the dirty data stored in the pages in the cache.
Based on the second aspect, in a possible implementation, the storage controller is further configured to: in response to the number of data access requests being less than the first threshold, select the first page from the cache based on the priority information of the data saved in each page of the cache; the priority includes one of the following: least-recently-used information, first-in-first-out information, or access frequency information.
Based on the second aspect, in a possible implementation, the storage controller is further configured to: in response to the number of data access requests being greater than or equal to the first threshold, select the first page based on the cache access volume per unit time and the memory access volume per unit time.
Based on the second aspect, in a possible implementation, the cache access volume includes one of the following: the number of cache hits or the data transfer volume between the cache and the processor; the memory access volume includes one of the following: the number of memory accesses or the data transfer volume between the memory and the processor.
Based on the second aspect, in a possible implementation, the storage controller is further configured to: determine the ratio between the cache access volume and the memory access volume; and select the first page from the cache based on the ratio between the cache access volume and the memory access volume.
Based on the second aspect, in a possible implementation, the storage controller is further configured to: in response to the ratio between the cache access volume and the memory access volume being greater than or equal to a second threshold, select the first page from the cache based on the location information of the cache pages occupied by the data that needs to be written into the cache from the memory and the location information of the dirty data saved in each page of the cache.
Based on the second aspect, in a possible implementation, the storage controller is further configured to: in response to the ratio between the cache access volume and the memory access volume being less than the second threshold, select the first page from the cache based on the priority information of the data saved in each page of the cache; the priority includes one of the following: least-recently-used information, first-in-first-out information, or access frequency information.
Based on the second aspect, in a possible implementation, the caching system further includes a first counter, configured to count the number of data access requests received by the storage controller per unit time.
Based on the second aspect, in a possible implementation, the caching system further includes a second counter, configured to count the cache access volume of the storage controller per unit time, where the cache access volume includes one of the following: the number of cache hits or the data transfer volume between the cache and the processor.
Based on the second aspect, in a possible implementation, the caching system further includes a third counter, configured to count the memory access volume of the storage controller per unit time, where the memory access volume includes one of the following: the number of memory accesses or the data transfer volume between the memory and the processor.
In a third aspect, an embodiment of the present application provides a chip, which includes the caching system described in the second aspect.
Based on the third aspect, in a possible implementation, the chip further includes a processor, configured to access the data stored in the caching system and to store processed data in the caching system.
To describe the technical solutions in the embodiments of the present application more clearly, the accompanying drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained based on these drawings without creative effort.
FIG. 1 is a schematic structural diagram of a caching system provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of the mapping relationship between memory pages and cache pages provided by an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a cache provided by an embodiment of the present application;
FIG. 4 is another schematic structural diagram of a cache provided by an embodiment of the present application;
FIG. 5 is yet another schematic structural diagram of a cache provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of the data units occupied by the dirty data stored in the cache page B shown in FIG. 3, provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of the data units occupied by the dirty data stored in the cache page A shown in FIG. 3, provided by an embodiment of the present application;
FIG. 8 is a schematic diagram of the data units in a cache page occupied by the data in a memory page to be stored, provided by an embodiment of the present application;
FIG. 9 is a flowchart of a caching method provided by an embodiment of the present application;
FIG. 10 is another flowchart of a caching method provided by an embodiment of the present application;
FIG. 11 is a schematic structural diagram of a caching apparatus provided by an embodiment of the present application.
To make the objectives, technical solutions, and advantages of the present application clearer, the technical solutions in the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present application. Based on the embodiments of the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present application.
Terms such as "first" and "second" and similar words herein do not denote any order, quantity, or importance, but are only used to distinguish different components. Likewise, words such as "a" or "an" do not denote a quantity limitation, but denote the existence of at least one.
A "module" mentioned herein generally refers to a functional structure divided by logic; the "module" may be implemented by pure hardware, or by a combination of software and hardware.
In the embodiments of the present application, words such as "exemplary" or "for example" are used to indicate an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" in the embodiments of the present application should not be construed as being more preferred or advantageous than other embodiments or designs. Rather, the use of such words is intended to present related concepts in a specific manner. In the description of the embodiments of the present application, unless otherwise stated, "multiple" means two or more. For example, multiple pages means two or more pages; multiple pieces of index information means two or more pieces of index information.
Refer to FIG. 1, which shows a schematic structural diagram of a cache system to which the present application applies.
In FIG. 1, the cache system 100 includes a processor, a storage controller, a cache, and a memory. The memory stores the data required for the processor to run, and the cache stores part of the data held in the memory. The processor can initiate data access requests and perform data processing. Based on the data access requests initiated by the processor, the storage controller controls the data exchange between the processor and the cache and between the cache and the memory. Under the control of the storage controller, data in the memory can be written into the cache or provided to the processor, and data can be written into the memory.
After the processor initiates a data access request, the storage controller can check, based on the request, whether the data is present in the cache. If the data accessed by the processor is stored in the cache, the storage controller controls the cache to provide the data to the processor over the bus; if the data accessed by the processor is not stored in the cache, the storage controller can have the data fetched from the memory and provided to the processor. In addition, after being fetched from the memory, the data can also be written into the cache, so that the processor can obtain it directly from the cache next time.
The cache shown in FIG. 1 may include a multi-level cache structure, for example levels L1, L2, and L3. When the processor accesses data, it first accesses the L1 cache; on an L1 miss it accesses the L2 cache; on an L2 miss it accesses the L3 cache; and on an L3 miss it fetches the data from the memory. That is, for the L1 cache, the L2 and L3 caches are its lower-level caches, and for the L2 cache, the L3 cache is its lower-level cache. When data needs to be written back, for example when data stored at the L1 level needs to be written back, it can be written back to the L2 cache, the L3 cache, or the memory; when data stored at the L3 level needs to be written back, it can only be written back to the memory. The L1, L2, and L3 caches may have the same cache structure but different data capacities. The cache shown in FIG. 1 does not distinguish among the L1, L2, and L3 levels.
In the cache shown in FIG. 1, the cache space of each cache level is allocated at page granularity. Specifically, the cache space of each level can be divided into multiple pages, which are uniformly referred to as cache pages in the following description. In some implementations, a cache page can also be understood as a cache line. With a logical structure similar to that of the cache space, the storage space of the memory can also be divided into multiple pages, uniformly referred to as memory pages in the following description. The storage capacity of a memory page may be the same as that of a cache page. The data stored in one cache page may all come from the same memory page, or may come from different memory pages. In addition, there is a mapping relationship between memory pages and cache pages, which are associated through sets (Set). That is, the data of multiple memory pages can be stored in the cache pages corresponding to the same set of the cache, and the pages cached in the same set compete with one another. To ease this competition, each set can be provided with multiple ways (Way); when none of the cache pages corresponding to these ways holds data, the data of a memory page mapped to the set can be stored in the cache page of any way of that set. The number of ways in the cache is the degree of set associativity, which may include, but is not limited to, 2-way, 4-way, or 8-way set associativity.
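The set mapping described above can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the modulo mapping and the parameter values (2 sets, 2 ways, matching the example of FIG. 2) are assumptions chosen for clarity.

```python
# Illustrative set-associative mapping: memory pages that share a set index
# compete for the ways of that set (parameters are assumed, not from the patent).
NUM_SETS = 2
NUM_WAYS = 2

def set_index(memory_page_number: int) -> int:
    """Map a memory page number to the cache set it competes for."""
    return memory_page_number % NUM_SETS

# With 2 sets: memory pages 0, 2, 4, 6 map to set 0; pages 1, 3, 5, 7 map to set 1.
# Each set can hold NUM_WAYS pages at once before a replacement is needed.
```

Under this sketch, the data of up to `NUM_WAYS` memory pages mapped to the same set can reside in the cache simultaneously; a further page mapped to that set forces a page replacement.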
Specifically, FIG. 2 schematically shows the correspondence between the cache space of the cache and the storage space of the memory, taking a 2-way set-associative cache as an example. In FIG. 2, the cache includes two ways, Way0 and Way1, and each way can store two sets of page data. With a logical structure similar to that of the cache space, the storage space of the memory can be divided into eight memory pages. The data of memory page 01, memory page 11, memory page 21, and memory page 31 can be stored in the cache pages corresponding to set Set0 of the cache, and the data of memory page 02, memory page 12, memory page 22, and memory page 32 can be stored in the cache pages corresponding to set Set1 of the cache.
In addition, a page can be further divided into multiple data units. When data held in the memory is stored into the cache, the data is usually accessed in units of data blocks. In the cache, a cache page may store the data of all data units of a memory page, or only the data of some of its data units. That is, when the data of a memory page is written into the cache, only part of the data units of that memory page may be cached. For example, if a page in the cache or memory holds 4 KB of data and is divided into 32 data units, each data unit holds 128 B of data. A cache page may then hold the data of only some of the data units (for example, 5 data units) of a memory page.
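Using the 4 KB page and 128 B data-unit sizes from the example above, the page and unit that a byte address falls into can be computed as follows; the function name and the flat-address model are illustrative assumptions.

```python
# Address breakdown for the example geometry: 4 KB pages, 128 B data units.
PAGE_SIZE = 4096                           # 4 KB per page
UNIT_SIZE = 128                            # 128 B per data unit
UNITS_PER_PAGE = PAGE_SIZE // UNIT_SIZE    # 32 data units per page

def locate(address: int):
    """Split a flat byte address into (page number, data-unit index, offset in unit)."""
    page = address // PAGE_SIZE
    unit = (address % PAGE_SIZE) // UNIT_SIZE
    offset = address % UNIT_SIZE
    return page, unit, offset
```

For example, an address 7 bytes into the sixth data unit of page 1 resolves to `(1, 5, 7)`.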
FIG. 3 to FIG. 5 show three cache structures of the cache shown in FIG. 1.
In FIG. 3, the cache may include a tag array (Tag Array) and a data array (Data Array).
The data obtained from the memory is stored in the cache pages of the data array, and the index information used to index the data held in the cache pages is stored in the tag array. Both the tag array and the data array are m*n arrays. There is a pre-established mapping relationship between the storage positions in the tag array and the pages in the data array; that is, once the storage position of data in the data array is fixed, the position of the corresponding index information in the tag array is fixed. In both the tag array and the data array, each row represents a set and each column represents a way. Each element of the tag array is a piece of index information, and each element of the data array is a cache page. As can be seen from FIG. 3, the elements of the tag array correspond one-to-one to the elements of the data array, and the index information in the tag array indexes the data stored in the corresponding cache page of the data array. In the cache shown in FIG. 3, the data held in each cache page of the data array comes from the same memory page.
The index information stored in the tag array is introduced below. Each piece of index information may include tag information (Tag), valid-bit information (Valid Bits), dirty-bit information (Dirty Bits), and priority information. Specifically, the tag information indicates the physical address, in the memory, of the memory page from which the data held in the cache page comes, and the set information corresponding to the cache page into which the data is stored; data from the same memory page has the same tag information. The dirty-bit information indicates whether the data in the data units held in the cache page is dirty. If certain bits of the dirty bits are set to 0, the data held in the corresponding data units is clean and, on replacement, can be invalidated directly without being written back to the off-chip memory; conversely, if certain bits of the dirty bits are set to 1, all the data in the data units where the corresponding dirty data resides must be written back to the off-chip memory on replacement. Dirty data here specifically refers to data held in the cache that has been rewritten by the processor. The memory does not hold this data, so if the dirty data in the cache is overwritten by other data without being written back to the memory, the data is lost; therefore, dirty data must be written back when it is overwritten. The valid-bit information indicates whether each data unit in the cache page stores valid data; generally, the valid-bit information uses as many bits as there are data units in the cache page. For example, if a cache page includes 32 data units, 32 bits are used. Each bit can be in a "0" state or a "1" state: when a bit is "1", the data held in the corresponding data unit is valid; when a bit is "0", the data held in the corresponding data unit is invalid. The priority information indicates whether a page is replaced preferentially when a page replacement occurs. The priority information PRI includes one of the following. Least recently used (LRU) information indicates the least recently used page among the pages currently stored in the cache; when this priority is used, the least recently used page is replaced first. Frequency-based replacement (FBR) information indicates how frequently the pages currently stored in the cache are accessed; when this priority is used, the least frequently used page is replaced first. First in first out (FIFO) information records the order in which pages were stored in the cache; when this priority is used, the page stored earliest in the cache is replaced first. Which replacement scheme to use can be chosen according to the needs of the application scenario.
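An index entry with these four fields, and the write-back rule that only units which are both valid and dirty need flushing, can be sketched as follows. The class and function names are illustrative; the patent does not prescribe a software representation.

```python
from dataclasses import dataclass

@dataclass
class IndexEntry:
    """Illustrative model of one tag-array entry described above."""
    tag: int        # physical-address / set information of the source memory page
    valid: int      # bitmask: bit i = 1 means data unit i holds valid data
    dirty: int      # bitmask: bit i = 1 means data unit i holds dirty data
    priority: int   # LRU / FIFO / FBR rank, interpretation chosen per scheme

def units_to_write_back(entry: IndexEntry) -> int:
    """On replacement, only units that are both valid and dirty must be written
    back; dirty bits on invalid units can be discarded."""
    return entry.valid & entry.dirty
```

For instance, a unit marked dirty but invalid contributes nothing to the write-back mask, matching the rule that invalid data is never flushed.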
Refer to FIG. 4, which shows another schematic structural diagram of the cache shown in FIG. 1.
In FIG. 4, the cache may include a tag array and a data array, both of which may be m*n arrays. There is a pre-established position mapping between the storage positions in the tag array and the pages in the data array: once the storage position of data in the data array is fixed, the position of the corresponding index information in the tag array is fixed. Unlike the cache shown in FIG. 3, in the cache shown in FIG. 4 the data held in each cache page of the data array can come from different memory pages. Correspondingly, multiple pieces of index information are stored at the tag-array position mapped to each cache page. Taking FIG. 4 as an example, suppose the data held in cache page A comes from memory page 01 and memory page 11 shown in FIG. 2. Then the storage position of the tag array mapped to cache page A stores index information Index01 and index information Index11, where Index01 indexes the data in cache page A that comes from memory page 01, and Index11 indexes the data in cache page A that comes from memory page 11.
Refer to FIG. 5, which shows yet another schematic structural diagram of the cache shown in FIG. 1.
In FIG. 5, the cache may include a tag array and a data array. Unlike the cache structures shown in FIG. 3 and FIG. 4, in the cache shown in FIG. 5 the tag array is an m*n array while the data array may be an m*s array, where s is less than or equal to n. That is, the tag array and the data array have the same number of sets, and the number of ways in the tag array is not less than the number of ways in the data array. FIG. 5 schematically shows a tag array that is a 2*4 array and a data array that is a 2*3 array. In the cache shown in FIG. 5, the data held in one cache page may come from the same memory page or from different memory pages. In addition to the information included in the index information stored in the cache shown in FIG. 3, the index information stored in the tag array of FIG. 5 also includes position information, which indicates the position, in the data array, of the cache page holding the data that the index information indexes. The position information may be way (Way) information, or (Set, Way) information; which to use can be chosen according to the needs of the application scenario. It should be noted that when the position information is way information, the sets of the tag array have a preset mapping relationship with the sets of the data array; for example, the index information stored in set Set0 of the tag array indexes the data held in the cache pages of set Set0 of the data array. When the position information is (Set, Way) information, the index information of the data held in each cache page of the data array can be stored at any position in the tag array; when retrieving tag information, the tag information of the data to be accessed must be compared one by one with the tag information in all the index information in the tag array.
Based on the cache structures shown in FIG. 3 to FIG. 5, suppose the cache is full. If the data held in a memory page is to be stored into the cache, and the data to be cached comes from a memory page different from those of the data currently held in the cache pages, a cache page must be selected from the data array so that the data to be cached can be stored into it, and storing the data to be cached into the selected cache page usually overwrites the original data previously held in that page. In traditional caching techniques, the cache page whose data is to be overwritten is usually selected based on priority information such as the LRU information or FIFO information of the data held in each cache page. Before the data to be cached is stored into the cache, the dirty data among the overwritten data must be written back (for example, to the memory or to the lower-level cache). As cache capacity grows, the amounts of valid data and dirty data stored in a cache page can be large, and selecting the data to be overwritten by LRU or FIFO information does not take the amount of dirty data into account. When the overwritten data contains a large amount of dirty data, that dirty data must be written back to the memory or the lower-level cache, which heavily occupies the cache bandwidth and thus reduces the data caching efficiency. However, the data stored in some unselected cache pages may contain less dirty data; if such a page were selected, only a small amount of dirty data would need to be written back. Taking the cache shown in FIG. 3 as an example, suppose the original data currently stored in cache page A contains a large amount of dirty data while the original data in the other cache pages contains little dirty data. If the LRU information of the data held in each cache page indicates that the data in cache page A is the least recently used, a large amount of dirty data in cache page A must be written back, heavily occupying the memory access bandwidth. If the data to be cached were instead stored into cache page B, possibly only a small amount of dirty data would need to be written back, reducing the occupation of the memory access bandwidth and improving the data caching efficiency.
Based on this, when every cache page holds data and the data to be cached must overwrite original data stored in the cache, the cache system shown in the embodiments of the present application can determine the usage of the cache bandwidth by counting the number of access requests sent by the processor per unit time, and then select a cache page based on the cache bandwidth usage, so as to store the data to be cached into the selected cache page. In a specific implementation, during periods of high data traffic, a cache page whose stored data contains less dirty data can be selected, which reduces the occupancy of the memory access bandwidth and helps improve the data caching efficiency and the processor's data access efficiency.
The cache system shown in FIG. 1 further includes a first counter, which counts the number of access requests sent by the processor per unit time. The unit time here may be, for example, 1 s, or one clock cycle of the cache system.
The caching method of the cache system 100 according to the embodiments of the present application is described in detail below, taking the cache structure shown in FIG. 3 as an example.
When the data held in memory page 21 shown in FIG. 2 needs to be stored into a cache page, suppose that, based on the mapping between memory pages and cache pages shown in FIG. 2, it needs to be stored into a cache page corresponding to set Set0 of the data array. The storage controller can then query whether the cache pages corresponding to set Set0 of the data array are full.
In a specific implementation, the storage controller can query whether every storage position in the tag array stores index information, to determine whether the cache is full. In some other implementations, the storage controller can instead query whether every cache page in the data array stores data, to determine whether the cache is full. Suppose that every storage position in the tag array currently holds index information, as shown in FIG. 3; that is, the cache is full.
The storage controller can then obtain, from the first counter, the number of access requests sent by the processor per unit time. Here, an access request sent by the processor may be a request to read data from the cache or the memory, or a request to write data into the cache or the memory. Based on the number of access requests sent by the processor per unit time, the storage controller can select, from the cache shown in FIG. 3, a cache page to hold the data to be cached.
When the number of access requests sent by the processor per unit time is less than the first threshold, the processor accesses the memory or the cache infrequently; the occupancy of the cache bandwidth used for data transfer between the cache and the processor is low, and the amount of data transferred over the cache bandwidth is within an affordable range. In this case, to improve the processor's access hit rate, the storage controller can select the cache page used to hold the data to be cached based on the priority information.
Specifically, the storage controller can query the priority information of each piece of index information (Index) stored at the positions of the tag array corresponding to set Set0, to determine the priority of the data that each piece of index information indexes. The priority information may include one of the following: LRU information, FIFO information, or FBR information. Specifically, the LRU information in each piece of index information is pre-computed by the storage controller through an LRU algorithm; for example, the usage of the data within a preset period up to the present can be sorted (the usage of data is usually reflected by the number of times the processor accesses the cache page storing the data), and the LRU information of the data held in each cache page is set based on the sorting result. Likewise, the FIFO information and the FBR information can be pre-determined for the data held in each cache page using their respective algorithms, and the details are not repeated here. Suppose that, by querying the index information of each page, it is determined that the data held in cache page B shown in FIG. 3 has the lowest priority. The storage controller can then store the data of memory page 21 to be cached into cache page B of the data array, and update the index information in the tag array that indexes the data held in cache page B, that is, update the previously stored index information Index11 to index information Index21.
It should be noted that, before the data held in memory page 21 is stored into cache page B, the dirty data among the data previously held in cache page B must be written back to the memory or to the lower-level cache.
When the number of access requests sent by the processor per unit time is greater than or equal to the first threshold, the processor accesses the memory or the cache frequently, and the occupancy of the cache bandwidth and interfaces used for data transfer between the cache and the processor is high. If the priority-based selection scheme is used, the data selected to be overwritten is the data in cache page B. Suppose cache page B includes 32 data units, of which 25 currently hold valid data, and among those 25 units, 20 hold dirty data, as shown in FIG. 6. Before the data of memory page 21 to be cached is stored into the cache, all the dirty data held in those 20 data units must be written back. In this case, writing the dirty data back also occupies excessive bandwidth resources; since the amount of data the cache bandwidth and interfaces can transfer per unit time is limited, data congestion may occur, reducing the processor's data access efficiency and the cache's storage efficiency, and in turn reducing the operating speed of the device or system. Suppose 20 data units of cache page A currently hold valid data, of which 5 hold dirty data; the data storage situation of the data units in cache page A is shown in FIG. 7. If the data to be cached is stored into cache page A, only the dirty data held in 5 data units needs to be written back, which greatly relieves the bandwidth pressure.
It should be noted that the data units occupied by the data held in a page are reflected by the valid-bit information in the index information. Each bit of the valid-bit information represents one data unit: 1 indicates that the data held in that data unit is valid, and 0 indicates that it is invalid. For example, when a cache page includes 32 data units, both the valid-bit information and the dirty-bit information in the index information can be represented with 32 bits. When the data stored in a data unit is valid, the bit of the valid-bit information indicating that data unit can be set to valid (for example, set to "1"); when a data unit is invalid, its corresponding valid bit can be set to invalid (for example, set to "0"). In addition, the data held in a data unit needs to be written back only when that data is both valid and dirty.
Therefore, when the processor accesses the memory or the cache frequently, to avoid the data congestion caused by writing back too much dirty data, the storage controller can select the page with the least dirty data to hold the data. Specifically, the dirty-bit information in each piece of index information stored in the tag array can be queried, and the cache page with the least dirty data selected.
Suppose the dirty-bit information in index information Index01, which indexes the data in cache page A shown in FIG. 3, is 00000000000000001100110000000000, and the dirty-bit information in index information Index11, which indexes the data in cache page B shown in FIG. 3, is 00000001111110001100110000000000. In the dirty-bit information, each bit represents one data unit: "0" indicates that no dirty data is held in that data unit, and "1" indicates that dirty data is held. By comparing the dirty-bit information in Index01 with that in Index11, it can be determined that cache page A, indexed by Index01, stores the least dirty data. The storage controller can then store the data of memory page 21 to be cached into cache page A, and update the index information in the tag array that indexes the data held in cache page A, that is, update the previously stored index information Index01 to index information Index21. It should be noted that, before the data held in memory page 21 is stored into cache page A, the dirty data previously stored in cache page A must be written back to the memory or to the lower-level cache.
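The comparison above amounts to counting the set bits of each page's dirty-bit mask and picking the page with the fewest. A minimal sketch, using the two dirty-bit strings from the example (the function names and the dict representation are illustrative):

```python
def count_dirty(mask: int) -> int:
    """Number of data units flagged dirty (population count of the mask)."""
    return bin(mask).count("1")

def pick_least_dirty(dirty_masks: dict) -> str:
    """Return the page whose dirty-bit mask has the fewest set bits."""
    return min(dirty_masks, key=lambda page: count_dirty(dirty_masks[page]))

# Dirty-bit strings from the example, written as binary literals:
dirty_masks = {
    "A": 0b00000000000000001100110000000000,  # Index01: 4 dirty units
    "B": 0b00000001111110001100110000000000,  # Index11: 10 dirty units
}
```

Here `pick_least_dirty(dirty_masks)` selects page A, matching the example: replacing A requires 4 write-backs instead of 10.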
It should be noted that the first threshold for the number of access requests sent by the processor per unit time in the embodiments of the present application can be determined based on the maximum number of accesses the cache can sustain per unit time: the first threshold can be raised when the cache can sustain more accesses, and lowered when it can sustain fewer.
In another possible implementation, when the number of access requests sent by the processor per unit time is greater than or equal to the first threshold, the storage controller can further select the cache page to hold the data based on the cache access volume per unit time and the memory access volume per unit time.
The cache system 100 can further include a second counter and a third counter: the second counter counts the cache access volume per unit time, and the third counter counts the memory access volume per unit time. The cache access volume may be the number of cache hits or the amount of data transferred between the cache and the processor; the memory access volume may be the number of times the processor accesses the memory or the amount of data transferred between the memory and the processor. In addition, when the cache access volume and the memory access volume are the number of cache hits and the number of processor memory accesses respectively, only the second counter may be provided, without the third counter: the second counter counts the cache hits, and the number of processor memory accesses can be determined by subtracting the number of cache hits from the number of access requests sent by the processor.
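The two-counter variant derives the memory access count arithmetically instead of counting it. A one-line sketch (the function name is illustrative):

```python
def memory_access_count(total_requests: int, cache_hits: int) -> int:
    """With only a request counter and a hit counter, memory accesses per unit
    time are the requests that missed the cache, so no third counter is needed."""
    assert cache_hits <= total_requests
    return total_requests - cache_hits
```

For example, 1000 requests with 850 cache hits imply 150 memory accesses in that unit time.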
In a specific implementation, when the cache access volume is the number of cache hits and the memory access volume is the number of processor memory accesses, the second threshold can be the ratio of the maximum number of accesses the cache can sustain per unit time to the maximum number of accesses the memory can sustain per unit time; when the cache access volume is the amount of data transferred between the cache and the processor and the memory access volume is the amount of data transferred between the memory and the processor, the second threshold can be the ratio of the maximum data transfer rate of the cache to the maximum data transfer rate of the memory.
The storage controller can obtain the cache access volume per unit time from the second counter and the memory access volume per unit time from the third counter, and then determine the ratio of the cache access volume to the memory access volume. When the ratio is less than the second threshold, the hit rate of the processor's cache accesses is low and a large amount of data must be fetched from the memory; in this case, the cache page can be selected based on the priority information, thereby improving the cache hit rate. When the ratio is greater than or equal to the second threshold, the cache bandwidth is overloaded, and the cache page with the least dirty data can be selected to hold the data.
In the embodiments of the present application, the storage controller can obtain from the first counter the number of access requests sent by the processor per unit time. When that number is less than the first threshold, a cache page can be selected to hold the data to be stored by querying the priority information in the index information held in the cache. When that number is greater than or equal to the first threshold, the storage controller can obtain the cache access volume per unit time from the second counter and the memory access volume per unit time from the third counter, and determine the ratio of the cache access volume to the memory access volume. When the ratio is less than the second threshold, a cache page can be selected to hold the data to be stored by querying the priority information in the index information held in the cache; when the ratio is greater than or equal to the second threshold, the page with the least dirty data can be selected to hold the data. This further balances the cache hit rate against the bandwidth occupancy, thereby further improving the caching efficiency.
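The two-level decision described above can be sketched as a single policy function. This is a simplified model: the threshold values are platform-dependent assumptions, and the zero-memory-access guard is an added safety check not discussed in the embodiment.

```python
def choose_policy(requests: int, first_threshold: int,
                  cache_accesses: int, memory_accesses: int,
                  second_threshold: float) -> str:
    """Sketch of the two-level victim-selection policy: low load -> priority-based
    (LRU/FIFO/FBR); high load -> check the cache/memory access ratio."""
    if requests < first_threshold:
        return "priority"            # bandwidth is cheap; optimize for hit rate
    if memory_accesses == 0:
        return "least-dirty"         # guard: no memory traffic, cache is saturated
    ratio = cache_accesses / memory_accesses
    if ratio < second_threshold:
        return "priority"            # low hit rate; optimize for hit rate
    return "least-dirty"             # bandwidth overloaded; minimize write-backs
```

For instance, with a first threshold of 100 requests and a second threshold of 2.0, a busy period with a hit/miss ratio of 5 selects the least-dirty page, while the same load with a ratio of 1.5 falls back to priority-based replacement.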
The foregoing describes how the cache system of the embodiments of the present application selects a cache page when every cache page in the cache stores data, so as to store the memory data to be cached into the selected cache page. The above implementations are applicable to any of the cache structures shown in FIG. 3 to FIG. 5.
In the cache structure shown in FIG. 5, because the cache pages in the data array are decoupled from the storage positions in the tag array, the two can have different array sizes. A situation can then arise in which every cache page in the data array stores data while the tag array still has idle storage positions holding no tag information, as shown in FIG. 5. In this case, data from different memory pages must be stored into the same cache page. For a given data unit of a cache page, if that data unit is used to store data from a first memory page, it cannot be used to store data from a second memory page; if data from the first memory page needs to be stored in that data unit, the data from another memory page previously held in that data unit must be overwritten.
Based on the above scenario, in the embodiments of the present application, when the number of access requests sent by the processor per unit time is greater than or equal to the first threshold, a cache page to store the data to be cached can be selected based on the valid-bit information of the data to be cached and the valid-bit information and dirty-bit information of the data held in the cache pages.
Specifically, when the data held in memory page 01 needs to be stored into a cache page, suppose memory page 01 is mapped to the cache pages corresponding to set Set0 of the cache. The data of memory page 01 then needs to be stored into a cache page corresponding to set Set0 of the data array. The storage controller can query the cache pages corresponding to set Set0 of the data array and the storage positions corresponding to set Set0 of the tag array, and find that all the cache pages of set Set0 in the data array store data while the storage positions of set Set0 in the tag array still include idle positions. The storage controller can then determine, based on the valid-bit information and dirty-bit information of the data stored in the cache pages of set Set0, the valid dirty bits of the data stored in each cache page (only valid dirty data needs to be written back; invalid dirty data does not). The storage controller can then select the page with the fewest positions conflicting with memory page 01, based on the valid-bit information of the data of memory page 01 to be cached and the valid dirty-bit information of the data held in the cache pages of set Set0. A conflict here means that a data unit of cache page A that would be occupied when the data of memory page 01 to be cached is stored into it conflicts with a data unit occupied by valid dirty data currently stored in that cache page. For example, the data units occupied by the data of memory page 01 to be cached are shown in FIG. 8, and the corresponding valid-bit information is 0xFF0E410. Suppose that in FIG. 5, the valid-bit information of the data held in cache page B is 0xC975A450 and its dirty-bit information is 0x00700060, so its valid dirty bits are 0x00700040; the valid-bit information of the data held in cache page C is 0xFF8DAC20 and its dirty-bit information is 0x06980020, so its valid dirty bits are 0x06880020. It can thus be determined that the data units occupied by the data held in cache page B conflict the least with the data units occupied by the data of memory page 01 to be cached, so the data of memory page 01 to be cached can be stored into cache page B. It should be noted that, before the data of memory page 01 to be cached is stored into cache page B, the valid dirty data in cache page B that will be overwritten must be written back to the memory or to the lower-level cache. In addition, the index information Index02 stored in the cache must also be updated; specifically, the valid-bit information and the dirty-bit information in Index02 are updated.
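The conflict count reduces to ANDing the incoming page's valid mask with each candidate's valid-and-dirty mask and counting the surviving bits. The sketch below uses hypothetical masks chosen so one page clearly wins; they are not the masks from FIG. 5 or FIG. 8.

```python
def popcount(x: int) -> int:
    return bin(x).count("1")

def conflicts(incoming_valid: int, page_valid: int, page_dirty: int) -> int:
    """Units the incoming data would occupy that currently hold valid dirty data
    (and would therefore need a write-back before being overwritten)."""
    return popcount(incoming_valid & page_valid & page_dirty)

def pick_least_conflicting(incoming_valid: int, pages: dict) -> str:
    """Select the candidate page with the fewest conflicting data units."""
    return min(pages, key=lambda name: conflicts(incoming_valid, *pages[name]))

# Hypothetical (valid, dirty) mask pairs for two candidate pages:
pages = {
    "B": (0x0000FF00, 0x00000F00),   # 1 conflicting unit with the incoming mask
    "C": (0xFFFF0000, 0x0FF00000),   # 8 conflicting units
}
incoming = 0x0FF0E410                # hypothetical valid mask of the incoming page
```

With these assumed masks, page B conflicts in only 1 unit versus 8 for page C, so B is selected and only one unit needs to be written back.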
In the examples of the embodiments of the present application, the cache, the storage controller, and the processor can be integrated on the same chip, forming a system on chip (SoC). Alternatively, the processor and the cache can be integrated on one chip and the storage controller on another chip. In practical applications, the cache can also be integrated on a chip different from that of the processor; if such an off-chip cache adopts the same storage structure design as the on-chip cache provided in the embodiments of the present application and implements the same functions, the off-chip cache should also be regarded as falling within the protection scope of the embodiments of the present application.
Based on the cache system shown in FIG. 1, the mapping between memory pages and cache pages shown in FIG. 2, and the cache structures shown in FIG. 3 to FIG. 5, an embodiment of the present application further provides a caching method, which is applied to the storage controller shown in FIG. 1. Refer to FIG. 9, which shows a flow 900 of the caching method provided by an embodiment of the present application; the flow 900 includes the following steps.
Step 901: receive a data read request, and determine, based on the data read request, data that needs to be written from the memory into the cache.
In this embodiment, the data read request usually carries the address information of the data to be read, which includes tag (Tag) information, set (Set) information, and the like. Based on the instruction issued by the processor, the storage controller uses the set information to retrieve the tag array and finds the multiple pieces of index information in that set. The storage controller can then check whether any one of these pieces of index information includes the tag information carried in the data read request. If none of them includes the tag information carried in the data read request, the tag misses; that is, the cache does not hold the data the processor wants to read. In this case, the storage controller must fetch from the memory the data the processor wants to read.
The storage controller can then further determine, based on the tag information carried in the data read request, the data that needs to be written from the memory into the cache; that is, determine the position information, in the memory, of the memory page holding the data to be written. Then, based on the mapping between memory pages and cache pages shown in FIG. 2, the multiple cache pages that can hold the data to be written are determined.
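The set-indexed tag lookup in step 901 can be sketched as follows; the list-of-dicts representation of the tag array and the function name are illustrative assumptions.

```python
def lookup(tag_array, set_idx: int, tag: int):
    """Search the index entries of one set for a matching tag.
    Returns the way number on a hit, or None on a miss (data must come
    from memory)."""
    for way, entry in enumerate(tag_array[set_idx]):
        if entry is not None and entry.get("tag") == tag:
            return way
    return None

# A tiny 2-set, 2-way tag array; None marks an idle storage position.
tags = [
    [{"tag": 0x01}, {"tag": 0x11}],   # set 0
    [None,          {"tag": 0x12}],   # set 1
]
```

A hit returns the way holding the data; a miss (`None`) triggers the memory fetch and, subsequently, the page-selection steps 902 and 903.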
Next, it is detected whether all of the determined cache pages hold data. When all of them hold data, the cache page selected to hold the data to be written may store a large amount of dirty data; when the data to be read overwrites the original data held in the selected cache page, that large amount of dirty data must be written back to the memory or the lower-level cache, heavily occupying the cache bandwidth and thus reducing the data caching efficiency. To avoid this, step 902 can be performed.
Step 902: obtain the number of data access requests received per unit time.
The storage controller can determine the number of data access requests received per unit time. Specifically, the storage controller can be provided with the first counter shown in FIG. 1, which counts the number of data access requests received per unit time. The unit time here may also be called a clock cycle, and may be, for example, 1 s or 30 ms; it is not specifically limited.
The storage controller can obtain the number of data access requests received per unit time from the first counter. Usually, a data access request is initiated by the processor and may be a data read request or a data write request. Each time the processor sends a data access request to the storage controller, the counter can be incremented by 1.
Step 903: select a first page from the cache based on the number of data access requests received per unit time.
In the embodiments of the present application, a first threshold can be preset in the storage controller; the first threshold can also be called the maximum number of accesses per unit time, and can be set based on the cache bandwidth, the cache capacity, and the like.
When the number of data access requests received per unit time is less than the first threshold, the processor accesses the memory or the cache infrequently; the occupancy of the cache bandwidth used for data transfer between the cache and the processor is low, and the amount of data transferred over the cache bandwidth is within an affordable range. In this case, to improve the processor's access hit rate, the storage controller can select the cache page used to hold the data to be read based on the priority information of the data held in each of the determined cache pages. The priority information may include one of the following: LRU information, FIFO information, or FBR information.
When the number of data access requests received per unit time is greater than or equal to the first threshold, the processor accesses the memory or the cache frequently, and the occupancy of the cache bandwidth and interfaces used for data transfer between the cache and the processor is high. If the priority-based selection scheme is used, the data selected to be overwritten may contain a large amount of valid data, much of which is dirty. Writing the dirty data back also occupies excessive bandwidth resources; since the amount of data the cache bandwidth and interfaces can transfer per unit time is limited, data congestion may occur, reducing the processor's data access efficiency and the cache's storage efficiency, and in turn reducing the operating speed of the device or system. In this case, to avoid the data congestion caused by writing back too much dirty data, the storage controller can select the page with the least dirty data to hold the data. Specifically, the cache page used to hold the data to be read can be selected based on the dirty-bit information of the data held in each of the determined cache pages.
Step 904: store the data that needs to be written from the memory into the cache in the selected first page.
With the caching method shown in the embodiments of the present application, when every cache page holds data and the data to be cached must overwrite original data stored in the cache, the usage of the cache bandwidth can be determined by counting the number of access requests sent by the processor per unit time, and a cache page can then be selected based on the cache bandwidth usage, so that the data to be cached is stored into the selected cache page. In a specific implementation, during periods of high data traffic, a cache page whose stored data contains less dirty data can be selected, which reduces the occupancy of the memory access bandwidth and helps improve the data caching efficiency and the processor's data access efficiency.
Refer to FIG. 10, which shows a flowchart of another embodiment of the caching method provided by the present application. The flow 1000 of this caching method includes the following steps.
Step 1001: receive a data read request, and determine, based on the data read request, data that needs to be written from the memory into the cache.
Step 1002: obtain the number of data access requests received per unit time.
For the specific implementation of step 1001 and step 1002, refer to the descriptions of step 901 and step 902 shown in FIG. 9; details are not repeated here.
Step 1003: determine whether the number of data access requests is greater than or equal to the first threshold. When the number of data access requests is less than the first threshold, perform step 1004; when the number of data access requests is greater than or equal to the first threshold, perform step 1005.
Step 1004: select the first page used to hold the data to be written, based on the priority information of the data held in each of the determined cache pages.
Step 1005: determine whether the ratio of the cache access volume per unit time to the memory access volume per unit time is greater than or equal to the second threshold.
When the number of data access requests received per unit time is greater than or equal to the first threshold, the storage controller further selects the cache page used to hold the data to be read based on the cache access volume per unit time and the memory access volume per unit time. The cache access volume includes one of the following: the number of cache hits or the amount of data transferred between the cache and the processor; the memory access volume includes one of the following: the number of memory accesses or the amount of data transferred between the memory and the processor.
Specifically, the cache system 100 shown in FIG. 1 further includes a second counter and a third counter: the second counter counts the cache access volume per unit time, and the third counter counts the memory access volume per unit time. In addition, when the cache access volume and the memory access volume are the number of cache hits and the number of processor memory accesses respectively, only the second counter may be provided, without the third counter: the second counter counts the cache hits, and the number of processor memory accesses can be determined by subtracting the number of cache hits from the number of access requests sent by the processor.
The storage controller can obtain the cache access volume per unit time from the second counter and the memory access volume per unit time from the third counter, and then determine the ratio of the cache access volume to the memory access volume.
When the cache access volume is the number of cache hits and the memory access volume is the number of processor memory accesses, the second threshold can be the ratio of the maximum number of accesses the cache can sustain per unit time to the maximum number of accesses the memory can sustain per unit time; when the cache access volume is the amount of data transferred between the cache and the processor and the memory access volume is the amount of data transferred between the memory and the processor, the second threshold can be the ratio of the maximum data transfer rate of the cache to the maximum data transfer rate of the memory.
When the ratio of the cache access volume to the memory access volume is less than the second threshold, perform step 1004; when the ratio of the cache access volume to the memory access volume is greater than or equal to the second threshold, perform step 1006.
When the ratio of the cache access volume to the memory access volume is less than the second threshold, the hit rate of the processor's cache accesses is low and a large amount of data must be fetched from the memory; in this case, the page to be replaced can be determined from the cache based on the above priority information, thereby improving the cache hit rate.
Step 1006: select the first page used to hold the data to be written, based on the position information, in the cache pages, occupied by the data that needs to be written from the memory into the cache and the position information of the dirty data held in each page of the cache.
When the ratio is greater than or equal to the second threshold, the cache bandwidth is overloaded. In this case, one of the determined cache pages can be selected based on the valid-bit information and dirty-bit information of the data held in the determined cache pages and the position information of the data units, in the cache pages, that would be occupied by the data to be read, and the data to be read can then be stored in the selected cache page.
As can be seen from the embodiment shown in FIG. 10, by further introducing the ratio of the cache access volume to the memory access volume to select the first page to hold the data, the cache hit rate and the bandwidth occupancy can be further balanced, thereby further improving the caching efficiency.
In addition, in a possible implementation of this embodiment, for example in the cache structure shown in FIG. 5, all the cache pages may hold data and all the positions of the tag array shown in FIG. 5 may hold index information. When the data to be read is stored into one of the cache pages, it overwrites the original data previously held in that cache page, while the tag array still holds the index information of the original data previously held in that cache page. In this case, the index information of the previously stored original data must be updated to the index information of the data to be read.
Further, in a possible implementation of this embodiment, for example in the cache structure shown in FIG. 5, all the cache pages may hold data while the tag array shown in FIG. 5 has idle positions holding no index information. In this case, after the data to be read is stored into one of the cache pages and its index information is stored at an idle position of the tag array, the position information of the idle data units in the cache page holding the data to be read can also be obtained, and the second index information held in the cache can be updated according to the position information of the idle data units; the second index information indexes the original data in those idle data units.
It can be understood that, to implement the above functions, the storage controller includes corresponding hardware and/or software modules for performing each function. In combination with the algorithm steps of the examples described in the embodiments disclosed herein, the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application in combination with the embodiments, but such implementations should not be considered beyond the scope of the present application.
In this embodiment, the storage controller can be divided into functional modules according to the above method examples; for example, each functional module can correspond to one function, or two or more functions can be integrated into one cache control module. The integrated module can be implemented in the form of hardware. It should be noted that the division of modules in this embodiment is illustrative and is only a logical functional division; there may be other division manners in actual implementations.
When each functional module is divided corresponding to each function, FIG. 11 shows a possible composition of the storage controller 1100 involved in the above embodiments. As shown in FIG. 11, the storage controller 1100 may include a receiving module 1101, an obtaining module 1102, a selection module 1103, and a saving module 1104.
The receiving module 1101 is configured to receive a data read request and determine, based on the data read request, data that needs to be written from the memory into the cache; the obtaining module 1102 is configured to obtain the number of data access requests received per unit time; the selection module 1103 is configured to select a first page from the cache based on the number of data access requests; and the saving module 1104 is configured to store, in the first page, the data that needs to be written from the memory into the cache.
In a possible implementation, the selection module 1103 is further configured to: in response to the number of data access requests being greater than or equal to the first threshold, select the first page from the cache based on the dirty data stored in the pages of the cache.
In a possible implementation, the selection module 1103 is further configured to: in response to the number of data access requests being greater than or equal to the first threshold, select the first page based on the cache access volume per unit time and the memory access volume per unit time.
In a possible implementation, the cache access volume includes one of the following: the number of cache hits or the amount of data transferred between the cache and the processor; and the memory access volume includes one of the following: the number of memory accesses or the amount of data transferred between the memory and the processor.
In a possible implementation, the selection module 1103 is further configured to: determine the ratio of the cache access volume to the memory access volume, and select the first page from the cache based on the ratio of the cache access volume to the memory access volume.
In a possible implementation, the selection module 1103 is further configured to: in response to the ratio of the cache access volume to the memory access volume being greater than or equal to the second threshold, select the first page from the cache based on the position information, in the pages of the cache, occupied by the data that needs to be written from the memory into the cache and the position information of the dirty data held in each page of the cache.
In a possible implementation, the selection module 1103 is further configured to: in response to the ratio of the cache access volume to the memory access volume being less than the second threshold, select the first page from the cache based on the priority information of the data held in each page of the cache, where the priority information includes one of the following: least-recently-used information, first-in-first-out information, or access-frequency information.
In a possible implementation, the storage controller 1100 further includes a first updating module (not shown in the figure), configured to update first index information held in the cache, where the first index information indexes the data to be read that is held in the first page.
In a possible implementation, the storage controller 1100 further includes a second updating module (not shown in the figure), configured to obtain position information of the idle data units in the first page and update, according to the position information, second index information held in the cache, where the second index information indexes the original data in the data units of the first page corresponding to the position information.
The storage controller 1100 provided in this embodiment is configured to perform the caching method performed by the storage controller in the cache system, and can achieve the same effects as the above implementation methods.
The storage controller can implement or execute the various illustrative logical modules described in combination with the disclosure of the present application. The storage controller can also be a combination implementing computing functions, for example including an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
From the description of the above implementations, a person skilled in the art can understand that, for convenience and brevity of description, only the division of the above functional modules is used as an example for illustration. In practical applications, the above functions can be assigned to different functional modules as needed; that is, the internal structure of the apparatus can be divided into different functional modules to complete all or part of the functions described above.
In the several embodiments provided in the present application, it should be understood that the disclosed caching apparatus and method can be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; the division of modules is only a logical functional division, and there may be other division manners in actual implementations. For example, multiple modules or components can be combined or integrated into another apparatus, or some features can be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed can be implemented through some interfaces, and the indirect couplings or communication connections of apparatuses can be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separated, and the components shown as units may be one physical unit or multiple physical units; that is, they may be located in one place or distributed in multiple different places. Some or all of the units can be selected according to actual needs to achieve the objectives of the solutions of this embodiment.
In addition, the functional units in the embodiments of the present application can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit. The integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium. Based on such an understanding, the technical solutions of the embodiments of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, can be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or part of the steps of the methods of the embodiments of the present application. The aforementioned readable storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing is merely specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any variation or replacement readily conceivable by a person skilled in the art within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (13)
- A caching method, comprising: receiving a data read request, and determining, based on the data read request, data that needs to be written from a memory into a cache; obtaining the number of data access requests received per unit time; selecting a first page from the cache based on the number of data access requests; and storing, in the first page, the data that needs to be written from the memory into the cache.
- The caching method according to claim 1, wherein the selecting a first page from the cache based on the number of data access requests comprises: in response to the number of data access requests being greater than or equal to a first threshold, selecting the first page from the cache based on dirty data stored in the pages of the cache.
- The caching method according to claim 1, wherein the selecting a first page from the cache based on the number of data access requests comprises: in response to the number of data access requests being greater than or equal to a first threshold, selecting the first page based on a cache access volume per unit time and a memory access volume per unit time.
- The caching method according to claim 3, wherein the cache access volume comprises one of the following: a number of cache hits or an amount of data transferred between the cache and a processor; and the memory access volume comprises one of the following: a number of memory accesses or an amount of data transferred between the memory and the processor.
- The caching method according to claim 3 or 4, wherein the selecting the first page based on a cache access volume per unit time and a memory access volume per unit time comprises: determining a ratio of the cache access volume to the memory access volume; and selecting the first page from the cache based on the ratio of the cache access volume to the memory access volume.
- The caching method according to claim 5, wherein the selecting the first page from the cache based on the ratio of the cache access volume to the memory access volume comprises: in response to the ratio of the cache access volume to the memory access volume being greater than or equal to a second threshold, selecting the first page from the cache based on position information, in the pages of the cache, occupied by the data that needs to be written from the memory into the cache and position information of dirty data stored in each page of the cache.
- The caching method according to claim 6, wherein the selecting the first page from the cache based on the ratio of the cache access volume to the memory access volume comprises: in response to the ratio of the cache access volume to the memory access volume being less than the second threshold, selecting the first page from the cache based on priority information of the data stored in each page of the cache, wherein the priority information comprises one of the following: least-recently-used information, first-in-first-out information, or access-frequency information.
- A cache system, comprising: a cache, configured to store data from a memory and index information for indexing the data stored in the cache; and a storage controller, configured to: receive a data read request, and determine, based on the data read request, data that needs to be written from the memory into the cache; obtain the number of data access requests received per unit time; select a first page from the cache based on the number of data access requests; and store, in the first page, the data that needs to be written from the memory into the cache.
- The cache system according to claim 8, further comprising a first counter, wherein the first counter is configured to count the number of data access requests received by the storage controller per unit time.
- The cache system according to claim 9, further comprising a second counter, wherein the second counter is configured to count a cache access volume of the storage controller per unit time, and the cache access volume comprises one of the following: a number of cache hits or an amount of data transferred between the cache and a processor.
- The cache system according to claim 10, further comprising a third counter, wherein the third counter is configured to count a memory access volume of the storage controller per unit time, and the memory access volume comprises one of the following: a number of memory accesses or an amount of data transferred between the memory and the processor.
- A chip, comprising the cache system according to any one of claims 8 to 11.
- The chip according to claim 12, further comprising: a processor, configured to access the data stored in the cache system and to store processed data into the cache system.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202080101463.0A CN115668159A (zh) | 2020-07-30 | 2020-07-30 | 缓存方法、系统和芯片 |
PCT/CN2020/105696 WO2022021178A1 (zh) | 2020-07-30 | 2020-07-30 | 缓存方法、系统和芯片 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022021178A1 true WO2022021178A1 (zh) | 2022-02-03 |
Family
ID=80037068