CN113778693A - Cache operation method, cache operation device, electronic equipment and processor - Google Patents

Cache operation method, cache operation device, electronic equipment and processor

Info

Publication number
CN113778693A
CN113778693A (application number CN202111335951.0A; granted publication CN113778693B)
Authority
CN
China
Prior art keywords
information
cache
access
priority
cache block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111335951.0A
Other languages
Chinese (zh)
Other versions
CN113778693B (en
Inventor
Inventor not disclosed (不公告发明人)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Bilin Technology Development Co ltd
Shanghai Bi Ren Technology Co ltd
Original Assignee
Beijing Bilin Technology Development Co ltd
Shanghai Biren Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Bilin Technology Development Co ltd, Shanghai Biren Intelligent Technology Co Ltd filed Critical Beijing Bilin Technology Development Co ltd
Priority to CN202111335951.0A priority Critical patent/CN113778693B/en
Publication of CN113778693A publication Critical patent/CN113778693A/en
Application granted granted Critical
Publication of CN113778693B publication Critical patent/CN113778693B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5038Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/5021Priority

Abstract

A cache operation method, a cache operation device, an electronic device and a processor are provided. The cache operation method is applied to a cache comprising a plurality of cache blocks. The cache operation method comprises the following steps: receiving an access request for the cache, wherein the access request comprises access control information, the access control information comprises first priority information, first last access information and first transient information, the first priority information indicates the priority of data related to the access request, the first last access information indicates whether the access request is a last access, and the first transient information indicates whether the access request is a transient access; and operating the cache according to the access control information of the access request. The cache operation method can flexibly configure the cache replacement policy according to the scenario, improves the replacement effect in different scenarios, can improve the hit rate, and incurs low hardware overhead and a small chip area.

Description

Cache operation method, cache operation device, electronic equipment and processor
Technical Field
The embodiment of the disclosure relates to a cache operation method, a cache operation device, an electronic device and a processor.
Background
In a typical computer architecture, the instructions and data of a program are stored in memory, and the operating frequency of the processor is much higher than that of the memory, so obtaining data or instructions from memory takes hundreds of clock cycles. This often leaves the processor idle because it cannot continue executing the dependent instructions, causing performance loss. To improve efficiency and access speed, a cache device (or simply a cache) is usually used to store recently accessed data. The processor preferentially searches for data in the cache; if the data requested by the application program or software exists in the cache, this is called a cache hit, and otherwise a cache miss.
Disclosure of Invention
At least one embodiment of the present disclosure provides a cache operation method, applied to a cache including a plurality of cache blocks, where the method includes: receiving an access request to the cache, wherein the access request includes access control information, the access control information includes first priority information, first last access information and first transient information, the first priority information indicates a priority of data related to the access request, the first last access information indicates whether the access request is a last access, and the first transient information indicates whether the access request is a transient access; and operating the cache according to the access control information of the access request.
In a method provided by an embodiment of the present disclosure, for example, each cache block is provided with cache control information, and the cache control information includes at least a priority information item, a last access information item, and a transient information item. The plurality of cache blocks comprises at least one occupied, non-empty cache block; the second priority information recorded in the priority information item of the non-empty cache block indicates the priority of the non-empty cache block, the second last access information recorded in the last access information item of the non-empty cache block indicates whether the most recent access to the non-empty cache block was a last access, and the second transient information recorded in the transient information item of the non-empty cache block indicates whether the most recent access to the non-empty cache block was a transient access. Operating the cache according to the access control information of the access request comprises: operating the cache according to the access control information of the access request and the cache control information of the non-empty cache block in the plurality of cache blocks.
For example, in a method provided by an embodiment of the present disclosure, the access control information further includes first bypass information, and operating the cache according to the access control information of the access request and the cache control information of a non-empty cache block in the plurality of cache blocks includes: in response to the access request missing, acquiring the access control information in the access request; in response to the first bypass information indicating that the access request is not a bypass type access request, determining an operation policy for the access request based on the first priority information, the first last access information, and the first transient information in the access control information; and responding to the access request by using the plurality of cache blocks based on the second priority information in the cache control information of the non-empty cache blocks and the operation policy.
For example, in the method provided by an embodiment of the present disclosure, when the first priority information is a valid value, it indicates that the data related to the access request is of high priority, and when the first priority information is an invalid value, it indicates that the data related to the access request is of low priority; when the first last access information is a valid value, it indicates that the access request is the last access, and when the first last access information is an invalid value, it indicates that the access request is not the last access; when the first transient information is a valid value, it indicates that the access request is a transient access, and when the first transient information is an invalid value, it indicates that the access request is not a transient access; when the first bypass information is a valid value, it indicates that the access request is a bypass type access request, and when the first bypass information is an invalid value, it indicates that the access request is not a bypass type access request; when the second priority information is a valid value, it indicates that the non-empty cache block is of high priority, and when the second priority information is an invalid value, it indicates that the non-empty cache block is of low priority; when the second last access information is a valid value, it indicates that the last access to the non-empty cache block was a last access, and when the second last access information is an invalid value, it indicates that the last access to the non-empty cache block was not a last access; when the second transient information is a valid value, it indicates that the last access to the non-empty cache block was a transient access, and when the second transient information is an invalid value, it indicates that the last access to the non-empty cache block was not a transient access.
For example, in a method provided by an embodiment of the present disclosure, the operation policy includes a first policy, a second policy, a third policy, and a fourth policy. The first policy includes: if an idle cache block exists, allocating the data corresponding to the access request to the idle cache block; if no idle cache block exists and a low-priority cache block exists, replacing the low-priority cache block; if no idle cache block exists and no low-priority cache block exists, replacing a high-priority cache block. The second policy includes: if an idle cache block exists, allocating the data corresponding to the access request to the idle cache block; if no idle cache block exists and a low-priority cache block exists, replacing the low-priority cache block; if no idle cache block exists and no low-priority cache block exists, changing the access request into a bypass type access request and performing no replacement operation. The third policy includes: if an idle cache block exists, allocating the data corresponding to the access request to the idle cache block; if no idle cache block exists, changing the access request into a bypass type access request and performing no replacement operation. The fourth policy includes: changing the access request into a bypass type access request and performing no replacement operation and no allocation operation.
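The four operation policies above can be illustrated with a minimal sketch. This is illustrative only, not the patented hardware implementation; the function name, the policy numbering, and the block-list arguments are assumptions made for the example.

```python
def apply_policy(policy, free_blocks, low_prio_blocks, high_prio_blocks):
    """Decide how a missed, non-bypass access request is served.

    policy: 1..4, the first through fourth policies described above.
    Each *_blocks argument is a list of candidate cache block ids.
    Returns an (action, block_id) pair; block_id is None for bypass.
    """
    if policy == 1:
        if free_blocks:
            return ("allocate", free_blocks[0])     # use an idle block
        if low_prio_blocks:
            return ("replace", low_prio_blocks[0])  # evict a low-priority block
        return ("replace", high_prio_blocks[0])     # else evict a high-priority block
    if policy == 2:
        if free_blocks:
            return ("allocate", free_blocks[0])
        if low_prio_blocks:
            return ("replace", low_prio_blocks[0])
        return ("bypass", None)                     # no replacement performed
    if policy == 3:
        if free_blocks:
            return ("allocate", free_blocks[0])
        return ("bypass", None)
    # fourth policy: always bypass, no allocation or replacement
    return ("bypass", None)
```

The sketch shows how the four policies differ only in how far down the fallback chain (idle block, low-priority victim, high-priority victim, bypass) they are willing to go.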
For example, in a method provided by an embodiment of the present disclosure, where the access request is a read request, determining the operation policy for the access request based on the first priority information, the first last access information, and the first transient information in the access control information includes: determining the operation policy as the first policy in response to the first priority information being a valid value and the first last access information being an invalid value; determining the operation policy as the second policy in response to the first priority information, the first last access information and the first transient information all being invalid values, or the first priority information, the first last access information and the first transient information all being valid values; determining the operation policy as the third policy in response to the first priority information being an invalid value and the first transient information being a valid value; and determining the operation policy as the fourth policy in response to the first last access information being a valid value and the first transient information being an invalid value.
For example, in a method provided by an embodiment of the present disclosure, where the access request is a write request, determining the operation policy for the access request based on the first priority information, the first last access information, and the first transient information in the access control information includes: determining the operation policy as the first policy in response to the first priority information being a valid value and the first last access information being an invalid value; determining the operation policy as the second policy in response to the first priority information and the first transient information being invalid values, or the first priority information and the first last access information being valid values; and determining the operation policy as the third policy in response to the first priority information being an invalid value and the first transient information being a valid value.
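The policy-selection rules for read and write requests stated above can be condensed into a small decision function. This is a sketch of the stated rules only; the function name, the ordering of the checks, and the integer policy labels are assumptions made for the example.

```python
def select_policy(is_read, pri, last, trs):
    """Map the PRI, LAST/FW and TRS fields of an access request to one of
    the four operation policies (1..4), following the rules in the text."""
    if is_read:
        if pri == 1 and last == 0:
            return 1  # high priority, not last access
        if (pri, last, trs) in ((0, 0, 0), (1, 1, 1)):
            return 2  # all three invalid, or all three valid
        if pri == 0 and trs == 1:
            return 3  # low priority, transient
        if last == 1 and trs == 0:
            return 4  # last access, not transient
    else:  # write request
        if pri == 1 and last == 0:
            return 1
        if (pri == 0 and trs == 0) or (pri == 1 and last == 1):
            return 2
        if pri == 0 and trs == 1:
            return 3
    return None  # combination not covered by the stated rules
```

For write requests the three conditions together cover all eight bit combinations, so the `None` fallback is reached only for uncovered read combinations.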
For example, in a method provided by an embodiment of the present disclosure, responding to the access request by using the plurality of cache blocks based on the second priority information in the cache control information of the non-empty cache block and the operation policy includes: determining a free cache block and a non-empty cache block of the plurality of cache blocks; and responding to the access request by determining the priority of the non-empty cache block according to the second priority information of the non-empty cache block in response to the determined operation policy being the first policy or the second policy.
For example, in a method provided by an embodiment of the present disclosure, determining a priority of the non-empty cache block according to the second priority information of the non-empty cache block, and responding to the access request includes: and in response to the existence of a plurality of cache blocks with the same priority and the need of replacement operation, selecting one cache block from the plurality of cache blocks with the same priority for replacement according to a preset algorithm.
For example, in one embodiment of the present disclosure, the preset algorithm includes at least one of a first-in first-out (FIFO) algorithm, a least recently used (LRU) algorithm, a pseudo least recently used (PLRU) algorithm, and a least frequently recently used (LFRU) algorithm.
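As one example of such a preset algorithm, an LRU choice among equal-priority replacement candidates might look like the following sketch. It is illustrative only (not the patent's hardware logic), and the class and method names are assumptions; FIFO or PLRU could be slotted in behind the same interface.

```python
from collections import OrderedDict

class LRUPicker:
    """Tracks access recency so that, when several candidate cache blocks
    share the same priority, the least recently used one is chosen."""

    def __init__(self):
        self._order = OrderedDict()  # oldest access first, newest last

    def touch(self, block_id):
        """Record an access: move the block to the most-recent position."""
        self._order.pop(block_id, None)
        self._order[block_id] = True

    def pick_victim(self, candidates):
        """Return the least recently used block among `candidates`."""
        # A block that was never touched is treated as least recently used.
        untracked = [b for b in candidates if b not in self._order]
        if untracked:
            return untracked[0]
        for block_id in self._order:  # iterate oldest to newest
            if block_id in candidates:
                return block_id
```

Usage: call `touch` on every hit or allocation, and `pick_victim` with the set of equal-priority candidates when a replacement is needed.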
For example, in a method provided by an embodiment of the present disclosure, responding to the access request by using the plurality of cache blocks based on the second priority information in the cache control information of the non-empty cache block and the operation policy further includes: and responding to the data corresponding to the access request distributed to the selected cache block, and storing the first priority information, the first last access information and the first transient information in the access control information as the second priority information, the second last access information and the second transient information in the cache control information of the cache block respectively.
For example, in a method provided in an embodiment of the present disclosure, the operating the cache according to the access control information of the access request and the cache control information of a non-empty cache block in the plurality of cache blocks further includes: and managing the states and data of the cache blocks according to the access control information of the access request.
For example, in a method provided in an embodiment of the present disclosure, managing states and data of the plurality of cache blocks according to access control information of the access request includes: and in response to the hit of the access request, the access request is a read request and the first last access information of the read request is a valid value, marking the cache block as an invalid cache block after reading the data in the corresponding cache block.
For example, in the method provided in an embodiment of the present disclosure, managing the states and data of the plurality of cache blocks according to the access control information of the access request further includes: in response to the access request being a write request, the first transient information in the write request being a valid value, the first last access information in the write request being an invalid value, and the data corresponding to the write request being successfully allocated to a cache block, writing the data into the next-level memory cascaded with the cache when the cache block is idle; and in response to the access request being a write request, the first transient information in the write request being a valid value, the first last access information in the write request being a valid value, and the data corresponding to the write request being successfully allocated to a cache block, writing the data into the next-level memory cascaded with the cache when the cache block is idle, and marking the cache block as an invalid cache block.
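The two transient-write rules above can be sketched as follows. This is an illustration of the described behavior, not the patented mechanism; the dictionary-based block representation and the function name are assumptions, and "when the cache block is idle" is modelled simply as writing through immediately.

```python
def handle_transient_write(block, first_last, next_level):
    """After a transient write (TRS=1) has been allocated into `block`,
    the data is written through to the next-level memory; if the write
    was also the last access (FW=1), the block is additionally marked
    invalid so it can be reclaimed at once."""
    next_level[block["addr"]] = block["data"]  # write to next-level memory
    if first_last:
        block["valid"] = False                 # last access: free the block
    return block
```
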
For example, an embodiment of the present disclosure provides a method further including: in response to receiving a downgrade request, managing the priority of the cache block corresponding to the downgrade request.
For example, in a method provided by an embodiment of the present disclosure, managing the priority of the cache block corresponding to the downgrade request includes: in response to the second priority information of the cache block corresponding to the downgrade request being a valid value, modifying the second priority information of the cache block to an invalid value; and in response to the second priority information of the cache block corresponding to the downgrade request being an invalid value and the second last access information being a valid value, marking the cache block as an invalid cache block after the cache block is next accessed, and clearing the cache control information of the cache block.
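The two downgrade rules above amount to a small state transition, sketched below. The dictionary fields and the `invalidate_on_next_access` flag are illustrative names, not part of the patent's terminology.

```python
def handle_downgrade(block):
    """Apply a downgrade request to one cache block's control information.

    Rule 1: a high-priority block (pri=1) becomes low priority.
    Rule 2: a low-priority block whose recorded access was a last access
            is scheduled to be invalidated on its next access.
    """
    if block["pri"] == 1:
        block["pri"] = 0
    elif block["last"] == 1:
        block["invalidate_on_next_access"] = True
    return block
```
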
For example, in the method provided in an embodiment of the present disclosure, the operating the cache according to the access control information of the access request further includes: and responding to the first bypass information being a valid value, configuring the access request as a bypass type access request.
For example, in the method provided by an embodiment of the present disclosure, the access control information of the access request comes from the configuration of the software layer.
For example, in the method provided by an embodiment of the present disclosure, the data corresponding to the access request is data of an artificial intelligence application scenario.
At least one embodiment of the present disclosure further provides a cache operation apparatus, applied to a cache including a plurality of cache blocks, where the cache operation apparatus includes: a receiving unit configured to receive an access request to the cache, wherein the access request includes access control information, the access control information includes first priority information, first last access information, and first transient information, the first priority information indicates a priority of data related to the access request, the first last access information indicates whether the access request is a last access, and the first transient information indicates whether the access request is a transient access; and the processing unit is configured to operate the cache according to the access control information of the access request.
At least one embodiment of the present disclosure further provides an electronic device, including the cache operating apparatus provided in any embodiment of the present disclosure.
At least one embodiment of the present disclosure further provides a processor including the cache operating device and the cache provided in any embodiment of the present disclosure.
Drawings
To more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings of the embodiments will be briefly introduced below, and it is apparent that the drawings in the following description relate only to some embodiments of the present disclosure and are not limiting to the present disclosure.
Fig. 1 is a schematic flow chart of a cache operation method according to some embodiments of the present disclosure;
FIG. 2 is a software and hardware implementation diagram of a cache operation method according to some embodiments of the present disclosure;
fig. 3 is a schematic flowchart of an example of step S20 in fig. 1;
fig. 4 is one of exemplary flowcharts of step S22 in fig. 3;
FIG. 5 is a second exemplary flowchart of step S22 in FIG. 3;
fig. 6 is a flowchart illustrating an example of step S23 in fig. 3;
fig. 7 is a flowchart illustrating an example of step S233 in fig. 6;
fig. 8 is a schematic flowchart illustrating information management performed based on read requests and write requests in a cache operation method according to some embodiments of the present disclosure;
fig. 9 is a schematic flowchart of priority management based on a downgrade request in a cache operation method according to some embodiments of the present disclosure;
fig. 10 is a schematic application diagram of a cache operation method according to some embodiments of the present disclosure;
fig. 11 is a schematic block diagram of a cache operation apparatus according to some embodiments of the present disclosure;
fig. 12 is a schematic block diagram of an electronic device provided by some embodiments of the present disclosure; and
fig. 13 is a schematic block diagram of a processor provided in some embodiments of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings of the embodiments of the present disclosure. It is to be understood that the described embodiments are only a few embodiments of the present disclosure, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the described embodiments of the disclosure without any inventive step, are within the scope of protection of the disclosure.
Unless otherwise defined, technical or scientific terms used herein shall have the ordinary meaning as understood by one of ordinary skill in the art to which this disclosure belongs. The use of "first," "second," and similar terms in this disclosure is not intended to indicate any order, quantity, or importance, but rather is used to distinguish one element from another. Also, the use of the terms "a," "an," or "the" and similar referents do not denote a limitation of quantity, but rather denote the presence of at least one. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items. The terms "connected" or "coupled" and the like are not restricted to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", and the like are used merely to indicate relative positional relationships, and when the absolute position of the object being described is changed, the relative positional relationships may also be changed accordingly.
In a computing system, a multi-level cache is generally adopted, for example a first-level cache (L1 cache), a second-level cache (L2 cache), and a third-level cache (L3 cache). These caches are typically integrated within the processor, with the L1 cache closest to the processor core, the L2 cache next, and the L3 cache farthest. Accordingly, the L1 cache is the fastest, the L2 cache is slower, and the L3 cache is the slowest of the three. When data is needed, the processor first searches the L1 cache; if the data is not found there, it searches the L2 cache, and if still not found, the L3 cache. If the required data is found in none of the L1, L2 and L3 caches, it is fetched from memory.
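The lookup order described above can be sketched as a simple cascade. The dictionaries stand in for real cache structures and the function name is an assumption; this illustrates the search order only, not the patent's mechanism.

```python
def lookup(addr, l1, l2, l3, memory):
    """Search L1, then L2, then L3, then main memory, in that order.

    Returns (level_name, data) for the first level that holds `addr`.
    """
    for level, store in (("L1", l1), ("L2", l2), ("L3", l3)):
        if addr in store:
            return level, store[addr]  # cache hit at this level
    return "memory", memory[addr]      # all cache levels missed
```
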
In order to increase the hit rate, it is necessary to store data that is likely to be used recently in the cache as much as possible. The capacity of the cache is limited, so that when the space of the cache is full, a cache replacement policy is adopted to delete some data from the cache, and then new data is written into the released space. The cache replacement strategy is actually a data elimination mechanism, and a reasonable cache replacement strategy is adopted, so that the hit rate can be effectively improved. Of course, the cache replacement strategy also involves hardware overhead, which can have an impact on chip area and cost.
The cache replacement policy is typically implemented by the hardware itself, e.g., using some fixed (solidified) replacement algorithm. Commonly used replacement algorithms are the first-in-first-out algorithm (FIFO), the least recently used algorithm (LRU), the pseudo least recently used algorithm (PLRU), the least frequently recently used algorithm (LFRU), and the like. The basic unit of a cache is a cache block (Cache Line). These replacement algorithms all select an appropriate cache block for replacement, so as to preserve the hit rate to the maximum extent. Generally, simple strategies (e.g., FIFO, LRU, PLRU) have low hardware overhead and low power consumption; complex policies (e.g., LFRU) generally work better than simple policies, but at greater overhead.
A fixed hardware replacement strategy cannot guarantee the maximum cache hit rate in all scenarios. Some strategies can give the cache a good hit rate in most scenarios, but they often adopt more complex algorithms, require more logic units (i.e., a larger chip area) to implement, and consume more power during operation.
At least one embodiment of the disclosure provides a cache operation method, a cache operation device, an electronic device and a processor. The cache operation method can flexibly configure cache replacement strategies according to scenes, improves the replacement effect under different scenes, can improve the hit rate, and has small hardware cost and small required chip area.
Hereinafter, embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. It should be noted that the same reference numerals in different figures will be used to refer to the same elements that have been described.
At least one embodiment of the present disclosure provides a cache operation method, which is applied to a cache including a plurality of cache blocks. The cache operation method comprises the following steps: receiving an access request for the cache, wherein the access request comprises access control information, the access control information comprises first priority information, first last access information and first transient information, the first priority information indicates the priority of data related to the access request, the first last access information indicates whether the access request is the last access, and the first transient information indicates whether the access request is the transient access; and operating the cache according to the access control information of the access request.
Fig. 1 is a schematic flowchart of a cache operation method according to some embodiments of the present disclosure. As shown in fig. 1, the cache operation method includes the following operations.
Step S10: receiving an access request for the cache, wherein the access request comprises access control information, the access control information comprises first priority information, first last access information and first transient information, the first priority information indicates the priority of data related to the access request, the first last access information indicates whether the access request is the last access, and the first transient information indicates whether the access request is the transient access;
step S20: and operating the cache according to the access control information of the access request.
For example, the Cache operation method is applied to a Cache (Cache) including a plurality of Cache blocks (Cache Line).
For example, in step S10, the access request may be a request from a command queue or a request from any processing thread, which is not limited by the embodiments of the present disclosure. For example, the access request may be a read request or a write request. For example, the access request includes access control information, which may come from a configuration of the software layer. That is, in the codes and instructions of the software layer, the relevant information is written into the corresponding command, so that when the hardware layer executes the corresponding request, the request carries the access control information. For example, according to the characteristics of different data, preset information may be written in the corresponding command, so that the corresponding request carries preset access control information.
For example, the access control information includes first priority information, first last access information, and first transient information. In some examples, the access control information may further include first bypass information. The first priority information indicates a priority of data related to the access request, the first last access information indicates whether the access request is a last access (here, the access may refer to read data or write data), the first transient information indicates whether the access request is a transient access, and the first bypass information indicates whether the access request needs to be configured as a bypass (bypass) type of request. Here, the data corresponding to the bypass type access request is not stored in the cache.
For example, in some examples, the access control information may be represented using a 4-bit binary number. In bit [3:0], bit [0] represents first priority information, bit [1] represents first last access information, bit [2] represents first transient information, and bit [3] represents first bypass information. The following table shows the sign and meaning of the access control information.
bit      symbol    valid value (1)                               invalid value (0)
bit[0]   PRI       high priority                                 low priority
bit[1]   LAST/FW   last access (LAST for reads, FW for writes)   not the last access
bit[2]   TRS       transient access                              not a transient access
bit[3]   BYPS      bypass type access request                    not a bypass type access request
For example, in this example, when the first priority information has a valid value (e.g., bit [0] is 1, i.e., PRI = 1), it indicates that the access request-related data is of high priority; the first priority information indicates that the access request related data is of low priority when the first priority information is invalid (e.g., bit [0] is 0, i.e., PRI = 0). When the first LAST access information is a valid value (e.g. bit [1] is 1, i.e. LAST/FW = 1), it indicates that the access request is the LAST access; the first LAST access information indicates that the access request was not the LAST access when it is an invalid value (e.g. bit [1] is 0, i.e. LAST/FW = 0). Here, when the access request is a read request, the sign of the value in bit [1] is represented by LAST; when the access request is a write request, the sign of the numerical value in bit [1] is denoted by FW. The first transient information indicates that the access request is a transient access when the first transient information has a valid value (e.g., bit [2] is 1, i.e., TRS = 1); the first transient information indicates that the access request is not a transient access when it is an invalid value (e.g., bit [2] is 0, i.e., TRS = 0). When the first bypass information has a valid value (e.g. bit [3] is 1, that is, BYPS = 1), indicating that the access request is a bypass type access request; the first bypass information indicates that the access request is not a bypass type access request when the first bypass information has an invalid value (e.g., bit [3] is 0, i.e., BYPS = 0).
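A minimal sketch of packing and unpacking the 4-bit access-control word described above. The field layout (bit[0]=PRI, bit[1]=LAST/FW, bit[2]=TRS, bit[3]=BYPS) follows the text; the helper names are illustrative only.

```python
PRI, LAST_FW, TRS, BYPS = 0, 1, 2, 3  # bit positions within bit[3:0]

def encode_ctrl(pri=0, last=0, trs=0, byps=0):
    """Pack the four 1-bit fields into a control word."""
    return (pri << PRI) | (last << LAST_FW) | (trs << TRS) | (byps << BYPS)

def decode_ctrl(word):
    """Unpack a control word into its named 1-bit fields."""
    return {
        "PRI": (word >> PRI) & 1,
        "LAST/FW": (word >> LAST_FW) & 1,
        "TRS": (word >> TRS) & 1,
        "BYPS": (word >> BYPS) & 1,
    }
```

For example, a high-priority request that is also a last access encodes to `0b0011`, and a pure bypass request to `0b1000`.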
For example, when the cache operation method is used to operate on access requests corresponding to data from an artificial intelligence application scenario, the access control information of each access request may be configured according to the usage characteristics of the various types of data. For example, the first priority information of an access request corresponding to backward-propagated weight gradient data in a training scenario may be marked as a valid value (indicating high priority), and the first last access information of an access request corresponding to activation function data (activation data) in an inference scenario, or to backward-propagated activation function data and activation function gradients (activation gradients) in a training scenario, may be marked as a valid value (indicating a last read/write). For example, the first transient information of an access request corresponding to stream data (stream in/stream out data) may be marked as a valid value (indicating a transient access).
In the embodiment of the present disclosure, the representation manner and the symbols of each type of information in the access control information are not limited to the manner shown in the above table, and may be represented in any applicable manner, which is not limited in this respect. For example, in other examples, each type of information may be represented by a multi-bit binary number, not limited to a 1-bit binary number, or may be represented by a character string, which may be determined according to actual requirements. The configuration rule of the access control information of the access request is not limited to the rule described above, and may be flexibly configured according to the characteristics of various types of data in the actual application scenario, which is not limited in the embodiment of the present disclosure.
For example, each cache block of the cache is provided with cache control information, which may be stored in a tag block table (tag table). The tag block table comprises a plurality of tag entries and maps the plurality of main memory blocks in main memory to the plurality of tag entries according to a mapping rule. For a detailed description of the tag block table, reference may be made to conventional designs, which will not be repeated here. For example, some entries may be added to a general tag block table to store the cache control information, so that no additional storage structure is required. For example, in other examples, a separate table may be constructed to store the cache control information, thereby eliminating the need to modify the current tag block table. How the cache control information is stored may be determined according to actual needs, and embodiments of the present disclosure are not limited in this respect.
For example, the plurality of cache blocks in the cache includes at least one non-empty cache block, that is, a cache block that is occupied, currently stores valid data, and is not in an idle state. The cache control information includes at least a priority information item, a last access information item, and a transient information item. The priority information item records second priority information, the last access information item records second last access information, and the transient information item records second transient information. For example, the second priority information recorded in the priority information item of a non-empty cache block indicates the priority of that cache block, the second last access information recorded in its last access information item indicates whether the most recent access to it was the last access, and the second transient information recorded in its transient information item indicates whether the most recent access to it was a transient access.
For example, in some examples, a 3-bit binary number may be employed to represent the cache control information. Within bits [2:0], bit [0] carries the second priority information, bit [1] the second last access information, and bit [2] the second transient information. The following table shows the symbol and meaning of each field of the cache control information.
Table Two: fields of the cache control information
  bit [0]  PRI'       second priority information: 1 = high priority, 0 = low priority
  bit [1]  LAST'/FW'  second last access information: 1 = the most recent access to the cache block was the last access, 0 = it was not
  bit [2]  TRS'       second transient information: 1 = the most recent access was a transient access, 0 = it was not
For example, in this example, when the second priority information is a valid value (e.g., bit [0] is 1, i.e., PRI' = 1), it indicates that the non-empty cache block is of high priority; when it is an invalid value (e.g., bit [0] is 0, i.e., PRI' = 0), it indicates that the non-empty cache block is of low priority. When the second last access information is a valid value (e.g., bit [1] is 1, i.e., LAST'/FW' = 1), it indicates that the most recent access to the non-empty cache block was the last access; when it is an invalid value (e.g., bit [1] is 0, i.e., LAST'/FW' = 0), it indicates that it was not. When the second transient information is a valid value (e.g., bit [2] is 1, i.e., TRS' = 1), it indicates that the most recent access to the non-empty cache block was a transient access; when it is an invalid value (e.g., bit [2] is 0, i.e., TRS' = 0), it indicates that it was not.
For example, the cache control information for a non-empty cache block is derived from the access control information in the corresponding access request when data is written to that cache block. That is, while storing data corresponding to a certain access request in the non-empty cache block, the first priority information, the first last access information, and the first transient information in the access control information of the access request are stored as the second priority information, the second last access information, and the second transient information in the cache control information of the non-empty cache block, respectively. For example, the first priority information is stored as the second priority information, the first last access information is stored as the second last access information, and the first transient information is stored as the second transient information.
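This copying step can be sketched as follows; the dict-based model and the function name are illustrative assumptions, not from the disclosure:

```python
# Minimal sketch: when data for an access request is written into a cache
# block, the request's first priority / last access / transient information
# become the block's second priority / last access / transient information.
# The bypass bit (BYPS) is deliberately not recorded in cache control info.

def fill_cache_block(access_control):
    """Derive a block's cache control dict from a request's access control dict."""
    return {
        "PRI'": access_control["PRI"],           # first priority    -> second priority
        "LAST'/FW'": access_control["LAST/FW"],  # first last access -> second last access
        "TRS'": access_control["TRS"],           # first transient   -> second transient
    }
```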
It should be noted that there is no bypass information in the cache control information. This is because an access request whose data is stored in the cache is necessarily not a bypass type access request, so its first bypass information must be an invalid value, and there is no need to record it. If the first bypass information of an access request is a valid value, the access request is configured as a bypass type access request, so its corresponding data is not stored in the cache at all, and no cache control information is recorded for it.
For example, in step S20, the operation on the cache according to the access control information of the access request may include: and operating the cache according to the access control information of the access request and the cache control information of the non-empty cache blocks in the plurality of cache blocks.
As shown in fig. 2, the access control information of the access request, for example the first priority information, the first last access information, the first transient information, and the first bypass information, is first configured at the software layer. Then, when the instruction execution unit of the hardware layer executes the instruction corresponding to the request, the request carries the access control information. The cache (e.g., the L2 cache) is operated on based on the access control information of the access request and the cache control information of the non-empty cache blocks among the plurality of cache blocks, thereby implementing program control.
Fig. 3 is a flowchart illustrating an example of step S20 in fig. 1. As shown in fig. 3, in some examples, the step S20 may include the following operations.
Step S21: in response to a miss of the access request, acquiring the access control information in the access request;
step S22: in response to the first bypass information indicating that the access request is not a bypass type access request, determining an operation policy for the access request based on the first priority information, the first last access information, and the first transient information in the access control information;
step S23: responding to the access request with the plurality of cache blocks based on the second priority information in the cache control information of the non-empty cache blocks and the operation policy;
step S24: in response to the first bypass information being a valid value, configuring the access request as a bypass type access request.
For example, in step S21, if the access request misses, that is, the required data is not in the cache, it may be necessary to obtain the data from the next-level memory bank or from main memory and store the obtained data in the cache. Therefore, the access control information in the access request needs to be acquired so that subsequent operations on the cache can be performed based on it. For example, the acquired access control information includes at least the first priority information, the first last access information, and the first transient information; for the related contents, reference may be made to the description above, which is not repeated here.
For example, in step S22, if the first bypass information indicates that the access request is not a bypass type access request, that is, the required data needs to be stored in the cache, the operation policy for the access request is determined based on the first priority information, the first last access information and the first transient information in the access control information.
For example, the operational policies may include a first policy, a second policy, a third policy, and a fourth policy.
The first policy includes: if a free cache block exists, allocating the data corresponding to the access request to the free cache block; if no free cache block exists but a low-priority cache block exists, replacing the low-priority cache block; if neither a free cache block nor a low-priority cache block exists, replacing a high-priority cache block.
The second policy includes: if a free cache block exists, allocating the data corresponding to the access request to the free cache block; if no free cache block exists but a low-priority cache block exists, replacing the low-priority cache block; if neither a free cache block nor a low-priority cache block exists, changing the access request to a bypass type access request and performing no replacement operation.
The third policy includes: if a free cache block exists, allocating the data corresponding to the access request to the free cache block; if no free cache block exists, changing the access request to a bypass type access request and performing no replacement operation.
The fourth policy includes: changing the access request to a bypass type access request and performing neither the replacement operation nor the allocation operation.
For example, in each of the above policies, when there are multiple free cache blocks and data needs to be allocated to a free cache block, one free cache block may be selected according to a preset algorithm, or one may be selected at random. The preset algorithm is, for example, a general first-in first-out algorithm (FIFO), a least recently used algorithm (LRU), a pseudo least recently used algorithm (PLRU), a least frequently recently used algorithm (LFRU), or the like.
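The four policies can be sketched behaviorally. The following Python model (names and data model are illustrative assumptions, not from the disclosure) treats a cache set as a list of blocks, each marked "free", "low", or "high" (the block's priority):

```python
def apply_policy(policy, blocks):
    """Return the action a policy (1-4) takes on a cache set.

    blocks: list of strings, each "free", "low" (low priority), or "high".
    Returns ("allocate", i), ("replace", i), or ("bypass", None).
    """
    if policy == 4:  # fourth policy: always bypass, no replacement/allocation
        return ("bypass", None)
    if "free" in blocks:  # policies 1-3 allocate into a free block if possible
        return ("allocate", blocks.index("free"))
    if policy in (1, 2) and "low" in blocks:  # prefer a low-priority victim
        return ("replace", blocks.index("low"))
    if policy == 1:  # first policy only: fall back to a high-priority victim
        return ("replace", blocks.index("high"))
    return ("bypass", None)  # second/third policy: no victim, so bypass
```

For instance, with no free block and only high-priority blocks, the first policy still replaces, while the second policy degrades the request to a bypass access.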
As shown in fig. 4, in some examples, in case the access request is a read request, the above step S22 may include the following operations. At this time, the first LAST access information is denoted by LAST.
Step S221: in response to the first priority information being a valid value and the first last access information being an invalid value, determining the operation policy to be the first policy;
step S222: in response to the first priority information, the first last access information, and the first transient information all being invalid values, or all being valid values, determining the operation policy to be the second policy;
step S223: in response to the first priority information being an invalid value and the first transient information being a valid value, determining the operation policy to be the third policy;
step S224: in response to the first last access information being a valid value and the first transient information being an invalid value, determining the operation policy to be the fourth policy.
For example, if each policy is represented by the symbol shown in table three below, each type of information is represented by the symbol shown in table one above, and 0 represents an invalid value and 1 a valid value, then the manner of determining the operation policy may be as shown in table four below.
Table Three: symbols of the first to fourth operation policies (the policy names are used below in place of the original symbols).

Table Four: operation policy determination for read requests (reconstructed from steps S221 to S224; x = either value)
  PRI  LAST  TRS   operation policy
  1    0     x     first policy
  0    0     0     second policy
  1    1     1     second policy
  0    x     1     third policy
  x    1     0     fourth policy
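Steps S221 to S224 amount to a small decision table. A sketch under the assumption that each field is a 1-bit value (function name is illustrative):

```python
def read_policy(pri, last, trs):
    """Map a read request's (PRI, LAST, TRS) bits to an operation policy (1-4),
    following steps S221 to S224. The four cases cover all 8 bit combinations."""
    if pri == 1 and last == 0:
        return 1  # first policy: high priority, not the last access
    if (pri, last, trs) in ((0, 0, 0), (1, 1, 1)):
        return 2  # second policy: all fields invalid, or all fields valid
    if pri == 0 and trs == 1:
        return 3  # third policy: low priority, transient access
    if last == 1 and trs == 0:
        return 4  # fourth policy: last access, not transient
    raise ValueError("unreachable for 1-bit inputs")
```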
As shown in fig. 5, in some examples, in case the access request is a write request, the above step S22 may include the following operations. At this time, the first last access information is denoted by FW.
Step S225: in response to the first priority information being a valid value and the first last access information being an invalid value, determining the operation policy to be the first policy;
step S226: in response to the first priority information and the first transient information both being invalid values, or the first priority information and the first last access information both being valid values, determining the operation policy to be the second policy;
step S227: in response to the first priority information being an invalid value and the first transient information being a valid value, determining the operation policy to be the third policy.
For example, if each policy is represented by the symbol shown in table three, each type of information is represented by the symbol shown in table one, and 0 represents an invalid value and 1 a valid value, then the manner of determining the operation policy may be as shown in table five below.
Table Five: operation policy determination for write requests (reconstructed from steps S225 to S227; x = either value)
  PRI  FW  TRS   operation policy
  1    0   x     first policy
  0    x   0     second policy
  1    1   x     second policy
  0    x   1     third policy
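Steps S225 to S227 can likewise be sketched as a decision table over 1-bit fields (the function name is an illustrative assumption):

```python
def write_policy(pri, fw, trs):
    """Map a write request's (PRI, FW, TRS) bits to an operation policy (1-3),
    following steps S225 to S227; write requests never select the fourth policy."""
    if pri == 1 and fw == 0:
        return 1  # first policy: high priority, not the final write
    if (pri == 0 and trs == 0) or (pri == 1 and fw == 1):
        return 2  # second policy
    if pri == 0 and trs == 1:
        return 3  # third policy: low priority, transient write
    raise ValueError("unreachable for 1-bit inputs")
```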
For example, returning to fig. 3, in step S23, the access request is responded to with a plurality of cache blocks based on the second priority information in the cache control information of the non-empty cache block and the operation policy.
For example, as shown in fig. 6, in some examples, the step S23 may include the following operations.
Step S231: determining a free cache block and a non-empty cache block of a plurality of cache blocks;
step S233: and responding to the access request by determining the priority of the non-empty cache block according to the second priority information of the non-empty cache block in response to the determined operation policy being the first policy or the second policy.
For example, in some examples, step S23 may include step S232 in addition to steps S231 and S233 described above.
Step S232: responding to the access request in response to the determined operating policy being the third policy or the fourth policy.
For example, in step S231, the free cache blocks and non-empty cache blocks among the plurality of cache blocks are determined, that is, it is determined whether free cache blocks exist, whether non-empty cache blocks exist, and which cache blocks are free and which are non-empty. For example, a non-empty cache block is an occupied cache block in which valid data is stored, while a free cache block is an unoccupied one. For example, the free and non-empty cache blocks may be determined from information in the tag block table (tag table), for which reference may be made to conventional designs; this will not be detailed here.
For example, in step S232, if the determined operation policy is the third policy or the fourth policy, the access request can be responded to directly. Since under the third and fourth policies the operation does not depend on cache block priority, the response can be made without acquiring the second priority information of the non-empty cache blocks.
For example, in step S232, the specific manner of responding is as follows. When the determined operation policy is the third policy: if a free cache block exists, the data corresponding to the access request is allocated to the free cache block, that is, stored in it; if no free cache block exists, the access request is changed to a bypass type access request and no replacement operation is performed. For example, when there are multiple free cache blocks, one free cache block may be selected according to a preset algorithm, or one may be selected at random. The preset algorithm is, for example, a general first-in first-out algorithm (FIFO), a least recently used algorithm (LRU), a pseudo least recently used algorithm (PLRU), a least frequently recently used algorithm (LFRU), or the like. For example, when the determined operation policy is the fourth policy, the access request is directly changed to a bypass type access request, and neither the replacement operation nor the allocation operation is performed.
For example, in step S233, if the determined operation policy is the first policy or the second policy, the priority of the non-empty cache block is determined according to the second priority information of the non-empty cache block, and the access request is responded to.
For example, in step S233, the specific manner of responding is as follows. When the determined operation policy is the first policy: if a free cache block exists, the data corresponding to the access request is allocated to the free cache block; if no free cache block exists but a low-priority cache block exists, the low-priority cache block is replaced; if neither exists, a high-priority cache block is replaced. When the determined operation policy is the second policy: if a free cache block exists, the data corresponding to the access request is allocated to the free cache block; if no free cache block exists but a low-priority cache block exists, the low-priority cache block is replaced; if neither exists, the access request is changed to a bypass type access request and no replacement operation is performed.
For example, as shown in fig. 7, in some examples, the step S233 may further include the following operations.
Step S2331: in response to there being multiple cache blocks of the same priority and a replacement operation being required, selecting one of the cache blocks of the same priority for replacement according to a preset algorithm;
step S2332: in response to the data corresponding to the access request being allocated to the selected cache block, storing the first priority information, the first last access information, and the first transient information in the access control information as the second priority information, the second last access information, and the second transient information, respectively, in the cache control information of that cache block.
For example, in step S2331, the preset algorithm includes at least one of a first-in first-out algorithm (FIFO), a least recently used algorithm (LRU), a pseudo least recently used algorithm (PLRU), and a least frequently recently used algorithm (LFRU). When a low-priority cache block needs to be replaced and multiple low-priority cache blocks exist, one of them may be selected for replacement according to the preset algorithm; when a high-priority cache block needs to be replaced and multiple high-priority cache blocks exist, one of them may likewise be selected according to the preset algorithm. Of course, the embodiments of the present disclosure are not limited thereto; when there are multiple cache blocks of the same priority and a replacement operation is required, one cache block may also be selected at random for the replacement operation.
For example, in step S2332, after the data corresponding to the access request is allocated to the selected cache block, the first priority information, the first last access information, and the first transient information in the access control information are also stored as the second priority information, the second last access information, and the second transient information, respectively, in the cache control information of the cache block. That is, the first priority information is stored as the second priority information, the first last access information is stored as the second last access information, and the first transient information is stored as the second transient information. Thus, cache control information corresponding to the cache block may be generated to facilitate operation according to the operation policy at the next response to the access request.
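For illustration, step S2331 with a FIFO preset algorithm (one of the algorithms the text lists) might look like the following sketch; the bookkeeping structure `fill_order` is an assumption, not from the disclosure:

```python
from collections import deque

def pick_victim_fifo(candidates, fill_order):
    """Among several same-priority candidate blocks, pick the one filled first.

    candidates: set of block indices sharing the same priority.
    fill_order: deque of block indices, oldest fill first (assumed bookkeeping).
    """
    for block in fill_order:  # walk from oldest fill to newest
        if block in candidates:
            return block
    raise LookupError("no candidate present in fill order")

# Blocks 2 and 5 are both low priority; block 5 was filled before block 2.
victim = pick_victim_fifo({2, 5}, deque([3, 5, 2]))
```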
Returning to fig. 3, for example, in step S24, if the first bypass information is a valid value, the access request is configured as a bypass type access request. That is, the data corresponding to the access request is not stored in the cache; instead, the request is served in a bypass manner.
For example, when the cache operation method is applied to an artificial intelligence application scenario, the data corresponding to the access request is data of the artificial intelligence application scenario. The artificial intelligence application scene can be scenes such as artificial intelligence calculation (AI calculation), neural network calculation, machine learning and the like.
The above artificial intelligence calculation (AI calculation) may refer to artificial intelligence calculation implemented using a Graphics Processing Unit (GPU) or a General Purpose Graphics Processor (GPGPU).
For example, in some examples, for application scenarios (e.g., AI computation) in which the algorithm steps are explicit and the data flow is relatively clear, the user can determine in advance whether certain data stored in the L2 cache will be used again, whether it has a high access volume, and so on, and, when programming, make the instructions that access these application data carry access control information reflecting this knowledge. When the hardware executes such an instruction, the access control information is passed down to the L2 cache along with the request or control flow signals of each level of module. After obtaining the cache control information of the cache block in which the requested data is located, the L2 cache can determine the priority of that cache block during replacement, enabling the hardware to make an accurate replacement decision. The access control information in the access request is recorded along with information such as the tag entry of the cache block and is used for comparison when a new cache block later needs to be replaced.
It should be noted that, in the embodiment of the present disclosure, the cache operation method may be used to operate an L1 cache, an L2 cache, an L3 cache, or other caches at any levels in a multi-core large-scale computing chip, or to operate caches in other devices having a computing function, which is not limited in this regard. The specific operation policy, the setting of various types of information in the access control information, and the like may be determined according to different application scenarios and data characteristics, and are not limited to the above-described manner.
In the embodiment of the present disclosure, in the manner described above, the access control information of an access request may be configured directly by software, so as to determine the priority of the access request and mark information such as whether it is a transient access, whether it is the last access, and whether it is of the bypass type, and further to determine the replacement priority of the cache blocks in the cache, thereby determining a corresponding operation policy to implement the cache operation. The embodiment of the disclosure thus provides a method for flexibly configuring the cache replacement policy according to the scenario, which can improve the replacement effect in different scenarios and help the cache approach its maximum hit rate in all scenarios. Compared with a complex hard-wired replacement policy, the cache operation method provided by the embodiment of the disclosure can be implemented more directly in hardware, with smaller hardware overhead. For example, when the cache operation method is applied to an L2 cache, due to the shared nature of the L2 cache in the system and its distance from the instruction execution unit, the method can greatly improve the replacement effect of the L2 cache in different scenarios. Furthermore, since the size of the L2 cache is generally large, and the tag entry (tag) of each cache block is generally stored in a dedicated tag entry storage (tag ram), the cache control information of a cache block can be stored together with its tag entry, so the required chip area is also small.
In the embodiment of the present disclosure, besides defining the replacement priority, the cache operation method may also use different combinations of the various types of information in the control information (e.g., the access control information and the cache control information) to implement other functions, such as directly invalidating a cache block that is no longer used, or defining the data of a cache block as temporarily stored data, which may be cached and then actively evicted. The implementation of these functions is explained below.
For example, in step S20, the operating the cache according to the access control information of the access request and the cache control information of the non-empty cache block of the plurality of cache blocks may include: the state and data of the plurality of cache blocks are managed based on access control information of the access request. A flow example of performing information management based on a read request and a write request is shown in fig. 8, and includes the following operations.
Step S2281: in response to the access request hitting, the access request being a read request, and the first last access information of the read request being a valid value, marking the corresponding cache block as an invalid cache block after reading the data in it;
step S2282: in response to the access request being a write request, the first transient information in the write request being a valid value, the first last access information being an invalid value, and the data corresponding to the write request being successfully allocated to a cache block, writing the data to the next-level memory bank cascaded with the cache when the cache is idle;
step S2283: in response to the access request being a write request, the first transient information in the write request being a valid value, the first last access information being a valid value, and the data corresponding to the write request being successfully allocated to a cache block, writing the data to the next-level memory bank cascaded with the cache when the cache is idle, and marking the cache block as an invalid cache block.
For example, in step S2281, if the access request hits, the access request is a read request, and the first LAST access information of the read request is a valid value (LAST = 1), it indicates that the access request is the LAST access to the data stored in a certain cache block in the cache. Therefore, a corresponding cache block is marked as an invalid cache block after reading the data in the cache block, i.e., the cache block is invalidated (invalid). Thus, a cache block that is no longer used may be invalidated based on the first last access information in the access request.
For example, in step S2282, if the access request is a write request, the first transient information in the write request is a valid value (TRS = 1), the first last access information is an invalid value (FW = 0), and the data corresponding to the write request has been successfully allocated to a cache block, this indicates that the data written into the cache is temporary dirty data. Therefore, the data is written to the next-level memory bank cascaded with the cache when the cache is idle. In this way, temporary storage of data in the cache can be achieved, and such data can be actively evicted by the cache.
For example, in step S2283, if the access request is a write request, the first transient information in the write request is a valid value (TRS = 1), the first last access information is a valid value (FW = 1), and the data corresponding to the write request has been successfully allocated to a cache block, this indicates that the access request is a dynamic write-through request. Therefore, the data is written to the next-level memory bank cascaded with the cache when the cache is idle, and since the data has been completely written or the complete data has been merged, the cache block is marked as an invalid cache block, that is, invalidated. Here, a dynamic write-through request means that when the lower-level memory bank can immediately accept the write data, the current-level cache changes the request into a bypass access; otherwise, the request is first cached in the current-level cache and then actively written out when the lower-level memory bank can accept the write data.
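The three management steps above can be sketched as follows (illustrative Python model, not the patent's hardware implementation; "writeback" stands for writing the data to the next-level memory bank when the cache is idle):

```python
def manage_on_access(is_write, trs, last_fw):
    """Return the list of actions for a hit read / successfully allocated write,
    following steps S2281 to S2283."""
    actions = []
    if not is_write and last_fw == 1:   # S2281: last read -> invalidate after reading
        actions += ["read", "invalidate"]
    elif is_write and trs == 1:         # S2282/S2283: transient write -> write out
        actions.append("writeback")
        if last_fw == 1:                # S2283: data is complete -> also invalidate
            actions.append("invalidate")
    return actions
```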
For example, in some embodiments, the cache operation method may further include the following operations.
Step S30: in response to receiving the destage request, managing a priority of a cache block corresponding to the destage request.
For example, in step S30, the destage request differs from read and write requests: it does not access the data in a cache block but is used for managing the priority of the cache block, for example, for modifying the second priority information in its cache control information. The destage request carries address information by which a particular cache block can be located, so that the priority of that cache block can be managed.
Fig. 9 is a schematic flowchart of priority management based on a destage request in a cache operation method according to some embodiments of the present disclosure. As shown in fig. 9, the above step S30 may include the following operations.
Step S31: in response to the second priority information of the cache block corresponding to the destage request being a valid value, modifying the second priority information of the cache block to an invalid value;
step S32: in response to the second priority information of the cache block corresponding to the destage request being an invalid value and its second last access information being a valid value, marking the cache block as an invalid cache block after it is next accessed, and clearing the cache control information of the cache block.
For example, in step S31, if the second priority information of the cache block corresponding to the destage request is a valid value (PRI' = 1), the second priority information of the cache block is modified to an invalid value, that is, PRI' is set to 0. The cache block is thereby changed from high priority to low priority. For example, when PRI' = 1 and FW' = 1, PRI' needs to be modified to 0; when PRI' = 1 and FW' = 0, PRI' likewise needs to be modified to 0.
For example, in step S32, if the second priority information of the cache block corresponding to the destage request is an invalid value (PRI' = 0) and the second last access information is a valid value (FW' = 1), the cache block is marked as an invalid cache block after it is next accessed, and the cache control information of the cache block is cleared. In this way, dirty data that will no longer be used can be discarded directly in the cache.
If the second priority information of the cache block corresponding to the destage request is an invalid value (PRI' = 0) and the second last access information is an invalid value (FW' = 0), the second priority information of the cache block is not modified.
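The branch logic of steps S31 and S32 can be summarized in a short sketch. This is a minimal illustration only, assuming a software record per cache block with fields `pri` (PRI'), `fw` (FW'), and a deferred-invalidation flag; the `CacheBlock` and `handle_destage` names are hypothetical and not part of the disclosure.

```python
class CacheBlock:
    """Assumed per-block record holding the cache control information."""
    def __init__(self, pri=0, fw=0):
        self.pri = pri          # second priority information (PRI')
        self.fw = fw            # second last-access information (FW')
        self.invalidate_after_next_access = False
        self.valid = True

def handle_destage(block):
    """Manage the priority of the cache block named by a destage request."""
    if block.pri == 1:
        # Step S31: demote the block from high priority to low priority.
        block.pri = 0
    elif block.fw == 1:
        # Step S32: PRI' = 0 and FW' = 1 -> mark for invalidation after the
        # next access and clear the block's cache control information.
        block.invalidate_after_next_access = True
        block.pri = 0
        block.fw = 0
    # PRI' = 0 and FW' = 0: leave the control information unchanged.

def access(block):
    """A subsequent access that honors the deferred invalidation."""
    if block.invalidate_after_next_access:
        block.valid = False
```

Note that the destage request itself never touches the block's data; it only rewrites control state, which is why it can be serviced without a data-path access.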
Fig. 10 is an application diagram of a cache operation method according to some embodiments of the present disclosure. As shown in fig. 10, in some examples, when the cache operation method is used to operate on an L2 cache, the following flow may be employed.
First, a new request, which may be a read request or a write request, is received. Then, the first priority information (PRI), first LAST access information (LAST/FW), first transient information (TRS), and first bypass information (BYPS) are obtained from the access control information carried in the request, and a configuration table is looked up based on this information to determine whether the request hits.
If the new request misses (cache miss), BYPS is checked. If BYPS is a valid value, the request is configured as a bypass type access request, so it bypasses the L2 cache and accesses the next-level memory. If BYPS is an invalid value, it is determined whether a cache block needs to be allocated. If a cache block needs to be allocated, a cache block may be allocated or replaced based on the various operation policies and determinations described above; if not, no replacement/allocation operation is performed. Here, the operation policy may be determined and executed based on the first priority information (PRI), first LAST access information (LAST/FW), and first transient information (TRS) carried by the request, together with the second priority information (PRI'), second LAST access information (LAST'/FW'), and second transient information (TRS') of each cache block in the L2 cache. In addition to determining whether a cache block needs to be allocated, it is also necessary to determine whether the cache control information needs to be modified. For example, the cache control information of the corresponding cache block may be modified based on the access control information carried in the request, so as to update the cache control information.
If the new request hits (cache hit), the data is read from or written to the cache. It is also necessary to determine whether to invalidate the cache block, for example based on the cache control information of the corresponding cache block. If the cache block needs to be invalidated, the tag entry table (tag table) is modified to implement the invalidation.
For example, as shown in fig. 10, in steps Z1, Z2, Z3, and Z4, a corresponding determination needs to be made based on the access control information and/or the cache control information: in step Z1, based on BYPS; in step Z2, based on PRI, LAST/FW, TRS, and PRI'; in step Z3, based on PRI, LAST/FW, and TRS; in step Z4, based on LAST/FW and TRS. For the related description, reference may be made to the above description of the steps of the cache operation method, and details are not repeated here.
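The top-level decisions of the fig. 10 flow can be sketched as follows. This is a simplified illustration under assumed inputs (`hit`, `byps`, `needs_alloc`); the actual design also consults PRI, LAST/FW, TRS, and the per-block cache control information at steps Z2-Z4, which are collapsed here into a single `needs_alloc` flag.

```python
def on_request(hit, byps, needs_alloc):
    """Return the action taken for a new request, per the fig. 10 flow.

    hit        -- whether the configuration-table lookup hit
    byps       -- first bypass information (1 = valid value)
    needs_alloc -- whether the policy decision calls for allocating a block
    """
    if hit:
        # Cache hit: read or write the cache, then (not shown) decide
        # whether the block must be invalidated via the tag table.
        return "read/write cache"
    if byps == 1:
        # Step Z1: BYPS valid -> bypass the L2 cache entirely.
        return "bypass to next-level memory"
    if needs_alloc:
        # Steps Z2/Z3: allocate or replace per the operation policy.
        return "allocate/replace per policy"
    return "no replacement/allocation"
```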
It should be noted that, in the embodiments of the present disclosure, the cache operation method may include more or fewer steps, and the execution order of the steps is not limited: they may be executed in series or in parallel, as determined by actual requirements.
At least one embodiment of the present disclosure further provides a cache operation apparatus. The cache operation apparatus can flexibly configure the cache replacement policy according to the scenario, improves the replacement effect in different scenarios, can improve the hit rate, and incurs low hardware overhead and a small chip area.
Fig. 11 is a schematic block diagram of a cache operation apparatus according to some embodiments of the present disclosure. As shown in fig. 11, the cache operation apparatus 100 includes a receiving unit 110 and a processing unit 120. For example, the cache operation apparatus 100 is applied to a cache including a plurality of cache blocks.
The receiving unit 110 is configured to receive an access request for the cache. For example, the access request includes access control information, and the access control information includes first priority information, first last access information, and first transient information. The first priority information indicates the priority of the data related to the access request, the first last access information indicates whether the access request is a last access, and the first transient information indicates whether the access request is a transient access. For example, the receiving unit 110 may perform step S10 of the cache operation method shown in fig. 1. The processing unit 120 is configured to operate on the cache according to the access control information of the access request. For example, the processing unit 120 may perform step S20 of the cache operation method shown in fig. 1.
For example, the receiving unit 110 and the processing unit 120 may be hardware, software, firmware, or any feasible combination thereof. For example, the receiving unit 110 and the processing unit 120 may be dedicated or general-purpose circuits, chips, or devices, or may be a combination of a processor and a memory. The embodiments of the present disclosure do not limit the specific implementation forms of the receiving unit 110 and the processing unit 120.
It should be noted that, in the embodiment of the present disclosure, each unit of the cache operation apparatus 100 corresponds to each step of the cache operation method, and for the specific function of the cache operation apparatus 100, reference may be made to the description related to the cache operation method above, and details are not described here again. The components and structure of the cache operation apparatus 100 shown in fig. 11 are merely exemplary and not limiting, and the cache operation apparatus 100 may further include other components and structures as needed.
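As a software analogy of the apparatus of fig. 11 (the real units may equally be dedicated circuits, chips, or firmware), the division of labor between the receiving unit 110 and the processing unit 120 might be sketched as follows; all class, method, and field names here are assumptions for illustration only.

```python
class ReceivingUnit:
    """Analog of receiving unit 110."""
    def receive(self, access_request):
        # Step S10: accept the access request and hand over its
        # access control information (e.g. PRI, LAST/FW, TRS, BYPS).
        return access_request["access_control"]

class ProcessingUnit:
    """Analog of processing unit 120."""
    def __init__(self, cache_blocks):
        self.cache_blocks = cache_blocks   # list of cache-block records

    def operate(self, access_control):
        # Step S20: operate on the cache according to the access control
        # information; only a trivial placeholder decision is shown here.
        return "bypass" if access_control.get("byps") else "allocate"

class CacheOperationApparatus:
    """Analog of cache operation apparatus 100, wiring the two units."""
    def __init__(self, cache_blocks):
        self.receiving_unit = ReceivingUnit()
        self.processing_unit = ProcessingUnit(cache_blocks)

    def handle(self, access_request):
        ctrl = self.receiving_unit.receive(access_request)
        return self.processing_unit.operate(ctrl)
```

The one-to-one mapping of units to method-steps mirrors how the apparatus claims track the method claims.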
At least one embodiment of the present disclosure also provides an electronic device. The electronic device can flexibly configure the cache replacement policy according to the scenario, improves the replacement effect in different scenarios, can improve the hit rate, and incurs low hardware overhead and a small chip area.
Fig. 12 is a schematic block diagram of an electronic device provided in some embodiments of the present disclosure. As shown in fig. 12, the electronic device 200 includes a cache operation apparatus 210. For example, the cache operation apparatus 210 may be the cache operation apparatus 100 shown in fig. 11 and may perform the foregoing cache operation method. The electronic device 200 may be a computer, a server, or any other device with computing capability, as long as it is provided with a cache and operates on the cache; the embodiments of the present disclosure are not limited in this regard. For the related description of the electronic device 200, reference may be made to the above description of the cache operation apparatus 100, which is not repeated here.
At least one embodiment of the present disclosure also provides a processor. The processor can flexibly configure the cache replacement policy according to the scenario, improves the replacement effect in different scenarios, can improve the hit rate, and incurs low hardware overhead and a small chip area.
Fig. 13 is a schematic block diagram of a processor provided in some embodiments of the present disclosure. As shown in fig. 13, the processor 300 includes a cache operation apparatus 310 and a cache 320. For example, the cache 320 may be any level of cache, such as an L1 cache, an L2 cache, or an L3 cache. The cache operation apparatus 310 is configured to operate on the cache 320, for example, to read/write data and to manage the states and data of the plurality of cache blocks in the cache 320. The cache operation apparatus 310 may be the cache operation apparatus 100 shown in fig. 11. For example, the processor 300 may be a multi-core large-scale computing chip or a single-core computing chip; the embodiments of the present disclosure are not limited thereto. For the related description of the processor 300, reference may be made to the above description of the cache operation apparatus 100, which is not repeated here.
The following points need to be explained:
(1) The drawings of the embodiments of the present disclosure relate only to the structures involved in the embodiments of the present disclosure; for other structures, reference may be made to common designs.
(2) Without conflict, embodiments of the present disclosure and features of the embodiments may be combined with each other to arrive at new embodiments.
The above description covers only specific embodiments of the present disclosure, but the protection scope of the present disclosure is not limited thereto; the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (22)

1. A cache operation method, applied to a cache comprising a plurality of cache blocks, wherein the method comprises:
receiving an access request to the cache, wherein the access request includes access control information, the access control information includes first priority information, first last access information, and first transient information, the first priority information indicates a priority of data related to the access request, the first last access information indicates whether the access request is a last access, and the first transient information indicates whether the access request is a transient access;
and operating on the cache according to the access control information of the access request.
2. The method of claim 1, wherein each cache block is provided with cache control information comprising at least a priority information item, a last access information item and a transient information item,
the plurality of cache blocks includes at least one occupied, non-empty cache block, the second priority information recorded in the priority information item of the non-empty cache block indicates the priority of the non-empty cache block, the second last access information recorded in the last access information item of the non-empty cache block indicates whether the last access to the non-empty cache block is a last access, and the second transient information recorded in the transient information item of the non-empty cache block indicates whether the last access to the non-empty cache block is a transient access,
and operating on the cache according to the access control information of the access request comprises:
operating on the cache according to the access control information of the access request and the cache control information of the non-empty cache block in the plurality of cache blocks.
3. The method of claim 2, wherein the access control information further comprises first bypass information,
operating on the cache according to the access control information of the access request and the cache control information of the non-empty cache block in the plurality of cache blocks comprises:
in response to the access request missing, acquiring the access control information in the access request;
in response to the first bypass information indicating that the access request is not a bypass type access request, determining an operation policy for the access request based on the first priority information, the first last access information, and the first transient information in the access control information;
and responding to the access request by using the plurality of cache blocks based on the second priority information in the cache control information of the non-empty cache block and the operation policy.
4. The method of claim 3, wherein, when the first priority information is a valid value, it indicates that the data related to the access request is of high priority, and when the first priority information is an invalid value, it indicates that the data related to the access request is of low priority;
when the first last access information is a valid value, it indicates that the access request is a last access, and when the first last access information is an invalid value, it indicates that the access request is not a last access;
when the first transient information is a valid value, it indicates that the access request is a transient access, and when the first transient information is an invalid value, it indicates that the access request is not a transient access;
when the first bypass information is a valid value, it indicates that the access request is a bypass type access request, and when the first bypass information is an invalid value, it indicates that the access request is not a bypass type access request;
when the second priority information is a valid value, it indicates that the non-empty cache block is of high priority, and when the second priority information is an invalid value, it indicates that the non-empty cache block is of low priority;
when the second last access information is a valid value, it indicates that the last access to the non-empty cache block is a last access, and when the second last access information is an invalid value, it indicates that the last access to the non-empty cache block is not a last access;
and when the second transient information is a valid value, it indicates that the last access to the non-empty cache block is a transient access, and when the second transient information is an invalid value, it indicates that the last access to the non-empty cache block is not a transient access.
5. The method of claim 4, wherein the operation policies comprise a first policy, a second policy, a third policy, and a fourth policy;
the first policy comprises: if a free cache block exists, allocating the data corresponding to the access request to the free cache block; if no free cache block exists and a low-priority cache block exists, replacing the low-priority cache block; if no free cache block exists and no low-priority cache block exists, replacing a high-priority cache block;
the second policy comprises: if a free cache block exists, allocating the data corresponding to the access request to the free cache block; if no free cache block exists and a low-priority cache block exists, replacing the low-priority cache block; if no free cache block exists and no low-priority cache block exists, changing the access request to a bypass type access request and performing no replacement operation;
the third policy comprises: if a free cache block exists, allocating the data corresponding to the access request to the free cache block; if no free cache block exists, changing the access request to a bypass type access request and performing no replacement operation;
the fourth policy comprises: changing the access request to a bypass type access request without performing a replacement operation or an allocation operation.
6. The method of claim 5, wherein the access request is a read request, and determining the operation policy for the access request based on the first priority information, the first last access information, and the first transient information in the access control information comprises:
determining the operation policy as the first policy in response to the first priority information being a valid value and the first last access information being an invalid value;
determining the operation policy as the second policy in response to the first priority information, the first last access information, and the first transient information all being invalid values, or the first priority information, the first last access information, and the first transient information all being valid values;
determining the operation policy as the third policy in response to the first priority information being an invalid value and the first transient information being a valid value;
and determining the operation policy as the fourth policy in response to the first last access information being a valid value and the first transient information being an invalid value.
7. The method of claim 5, wherein the access request is a write request, and determining the operation policy for the access request based on the first priority information, the first last access information, and the first transient information in the access control information comprises:
determining the operation policy as the first policy in response to the first priority information being a valid value and the first last access information being an invalid value;
determining the operation policy as the second policy in response to the first priority information and the first transient information being invalid values, or the first priority information and the first last access information being valid values;
and determining the operation policy as the third policy in response to the first priority information being an invalid value and the first transient information being a valid value.
8. The method of claim 5, wherein responding to the access request with the plurality of cache blocks based on the second priority information in the cache control information of the non-empty cache block and the operation policy comprises:
determining the free cache blocks and the non-empty cache blocks among the plurality of cache blocks;
and in response to the determined operation policy being the first policy or the second policy, determining the priority of the non-empty cache block according to the second priority information of the non-empty cache block and responding to the access request.
9. The method of claim 8, wherein determining the priority of the non-empty cache block according to the second priority information of the non-empty cache block and responding to the access request comprises:
and in response to a plurality of cache blocks having the same priority and a replacement operation being required, selecting one cache block from the plurality of cache blocks having the same priority for replacement according to a preset algorithm.
10. The method of claim 9, wherein the preset algorithm comprises at least one of a first-in first-out algorithm, a least recently used algorithm, a pseudo least recently used algorithm, and a least frequently used algorithm.
11. The method of claim 5, wherein responding to the access request with the plurality of cache blocks based on the second priority information in the cache control information of the non-empty cache block and the operation policy further comprises:
and in response to the data corresponding to the access request being allocated to the selected cache block, storing the first priority information, the first last access information, and the first transient information in the access control information as the second priority information, the second last access information, and the second transient information in the cache control information of the cache block, respectively.
12. The method of any of claims 1-11, wherein operating on the cache according to the access control information of the access request and the cache control information of the non-empty cache block in the plurality of cache blocks further comprises:
managing the states and data of the plurality of cache blocks according to the access control information of the access request.
13. The method of claim 12, wherein managing the states and data of the plurality of cache blocks according to the access control information of the access request comprises:
in response to the access request hitting, the access request being a read request, and the first last access information of the read request being a valid value, marking the cache block as an invalid cache block after reading the data in the corresponding cache block.
14. The method of claim 12, wherein managing the states and data of the plurality of cache blocks according to the access control information of the access request further comprises:
in response to the access request being a write request, the first transient information in the write request being a valid value, the first last access information in the write request being an invalid value, and the data corresponding to the write request being successfully allocated to a cache block, writing the data into a next-level memory cascaded with the cache when the cache block is idle;
in response to the access request being a write request, the first transient information in the write request being a valid value, the first last access information in the write request being a valid value, and the data corresponding to the write request being successfully allocated to a cache block, writing the data into the next-level memory cascaded with the cache when the cache block is idle, and marking the cache block as an invalid cache block.
15. The method of any of claims 1-11, further comprising:
in response to receiving a destage request, managing a priority of a cache block corresponding to the destage request.
16. The method of claim 15, wherein managing the priority of the cache block to which the destage request corresponds comprises:
in response to the second priority information of the cache block corresponding to the destage request being a valid value, modifying the second priority information of the cache block to an invalid value;
and in response to the second priority information of the cache block corresponding to the destage request being an invalid value and the second last access information being a valid value, marking the cache block as an invalid cache block after the cache block is next accessed, and clearing the cache control information of the cache block.
17. The method of any of claims 4-11, wherein operating on the cache according to access control information of the access request further comprises:
and in response to the first bypass information being a valid value, configuring the access request as a bypass type access request.
18. The method of any of claims 1-11, wherein the access control information of the access request is from a configuration of a software layer.
19. The method of any of claims 1-11, wherein the data corresponding to the access request is data of an artificial intelligence application scenario.
20. A cache operation apparatus, applied to a cache comprising a plurality of cache blocks, wherein the cache operation apparatus comprises:
a receiving unit configured to receive an access request to the cache, wherein the access request includes access control information, the access control information includes first priority information, first last access information, and first transient information, the first priority information indicates a priority of data related to the access request, the first last access information indicates whether the access request is a last access, and the first transient information indicates whether the access request is a transient access;
and the processing unit is configured to operate the cache according to the access control information of the access request.
21. An electronic device comprising the cache operation apparatus of claim 20.
22. A processor comprising the cache operating apparatus of claim 20 and the cache.
CN202111335951.0A 2021-11-12 2021-11-12 Cache operation method, cache operation device, electronic equipment and processor Active CN113778693B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111335951.0A CN113778693B (en) 2021-11-12 2021-11-12 Cache operation method, cache operation device, electronic equipment and processor


Publications (2)

Publication Number Publication Date
CN113778693A true CN113778693A (en) 2021-12-10
CN113778693B CN113778693B (en) 2022-02-08

Family

ID=78957063


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5787490A (en) * 1995-10-06 1998-07-28 Fujitsu Limited Multiprocess execution system that designates cache use priority based on process priority
US6542966B1 (en) * 1998-07-16 2003-04-01 Intel Corporation Method and apparatus for managing temporal and non-temporal data in a single cache structure
US20160321185A1 (en) * 2015-04-28 2016-11-03 Intel Corporation Controlling displacement in a co-operative and adaptive multiple-level memory system
US20210182216A1 (en) * 2019-12-16 2021-06-17 Advanced Micro Devices, Inc. Cache management based on access type priority
WO2021143154A1 (en) * 2020-01-16 2021-07-22 华为技术有限公司 Cache management method and device
CN113424160A (en) * 2019-03-30 2021-09-21 华为技术有限公司 Processing method, processing device and related equipment




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: Room 0106-508, 1st floor, No.26, shangdixin Road, Haidian District, Beijing 100085
Patentee after: Beijing Bilin Technology Development Co.,Ltd.
Country or region after: China
Patentee after: Shanghai Bi Ren Technology Co.,Ltd.
Address before: Room 0106-508, 1st floor, No.26, shangdixin Road, Haidian District, Beijing 100085
Patentee before: Beijing Bilin Technology Development Co.,Ltd.
Country or region before: China
Patentee before: Shanghai Bilin Intelligent Technology Co.,Ltd.