CN115757203A - Memory access strategy management method and device, processor and computing equipment - Google Patents

Memory access strategy management method and device, processor and computing equipment

Info

Publication number
CN115757203A
Authority
CN
China
Prior art keywords
address
hit
access
access request
preset
Prior art date
Legal status
Granted
Application number
CN202310034457.3A
Other languages
Chinese (zh)
Other versions
CN115757203B (en)
Inventor
Name withheld at the inventor's request
Current Assignee
Moore Threads Technology Co Ltd
Original Assignee
Moore Threads Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Moore Threads Technology Co Ltd
Priority to CN202310034457.3A
Publication of CN115757203A
Application granted
Publication of CN115757203B
Legal status: Active


Classifications

    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The disclosure provides a memory access policy management method and apparatus, a processor, and a computing device. The method includes: updating an address management table based on the hit condition of historical access requests, where the address management table includes hit statistics corresponding to an address interval, and a historical access request is an access request directed to a historical target address within the address interval; determining an access policy parameter corresponding to the address interval based on the hit statistics; in response to receiving an access request for a target address within the address interval, adding the access policy parameter to the access request to update the access request; and sending the updated access request to the cache unit. The method enables adaptive, dynamic adjustment of access policy parameters, which helps optimize the cache access policy and improves data interaction efficiency and processor processing efficiency.

Description

Memory access strategy management method and device, processor and computing equipment
Technical Field
The present disclosure relates to the field of cache technologies, and in particular, to a memory access policy management method and apparatus, a processor, and a computing device.
Background
With the development of the related art, for processors such as GPUs (Graphics Processing Units) and CPUs (Central Processing Units), the operation speed is often much higher than the read/write speed of the memory. Therefore, one or more levels of cache are usually provided in such processors to bridge the mismatch between the processor's operation speed and the memory's read/write speed. Specifically, the cache may store data that is frequently used by the processor, reducing the latency of accessing such data and thereby increasing processing efficiency. However, since the cache is usually far smaller than the memory, it can accommodate only a small portion of the data in the memory, selected according to a preset access policy. How to better optimize the memory access policy has therefore become a key factor in optimizing the processing efficiency of the processor.
Disclosure of Invention
In view of the above, the present application provides a memory access policy management method, a memory access policy management apparatus, a processor and a computing device, which may alleviate, reduce or even eliminate the above problems.
According to an aspect of the present disclosure, there is provided a memory access policy management method, including: updating an address management table based on the hit condition of a historical access request, where the address management table includes hit statistics corresponding to an address interval, and the historical access request is an access request directed to a historical target address within the address interval; determining an access policy parameter corresponding to the address interval based on the hit statistics; in response to receiving an access request for a target address within the address interval, adding the access policy parameter to the access request to update the access request; and sending the updated access request to the cache unit.
In some embodiments, updating the address management table based on the hit condition of the historical access request includes: receiving the hit condition of the historical access request fed back by the cache unit, where the hit condition includes at least one of the following: information indicating whether the historical target address is hit, an average hit distance of the historical target address, and a cumulative number of times the historical target address is hit; and updating, in the address management table, the hit statistics corresponding to the address interval based on the hit condition.
In some embodiments, updating, in the address management table, the hit statistics corresponding to the address interval based on the hit condition includes: determining a hit statistic value corresponding to the historical access request based on the hit condition and a preset statistical mechanism, where the hit statistic value includes at least one of the following: a hit distance statistic of the historical target address and a hit count statistic of the historical target address; and updating, in the address management table, the hit statistics corresponding to the address interval based on the hit statistic value, where the hit statistics include at least one of the following: a statistical hit distance and a statistical hit count.
In some embodiments, determining, based on the hit statistics, the access policy parameter corresponding to the address interval includes: in response to the hit statistics satisfying a preset condition, setting the access policy parameter corresponding to the address interval to a preset parameter; and in response to the hit statistics not satisfying the preset condition, keeping the access policy parameter corresponding to the address interval at a default parameter.
In some embodiments, the above method further includes: in response to the access policy parameter corresponding to the address interval being a preset parameter and the hit statistics satisfying a preset degradation condition, restoring the access policy parameter corresponding to the address interval to a default parameter, where the preset degradation condition includes at least one of the following: the statistical hit count of the address interval within a preset time window satisfies a preset threshold condition, the variation trend of the statistical hit distance of the address interval satisfies a first preset trend condition, and the variation trend of the statistical hit count of the address interval satisfies a second preset trend condition.
In some embodiments, the above method further includes: in response to the number of address intervals having preset parameters reaching a preset threshold and the hit statistics of an address interval having default parameters satisfying the preset condition, restoring the access policy parameter of one of the threshold number of address intervals having preset parameters to the default parameter.
In some embodiments, the address management table includes a local cache portion and a non-local cache portion, and the method further includes: in response to the address interval in the local cache portion being replaced by an address interval in the non-local cache portion, the hit statistics corresponding to the replaced address interval are zeroed in the local cache portion.
In some embodiments, the access policy parameters include at least one of: access segment length, cache application policy, and replacement priority.
According to another aspect of the present disclosure, there is provided a memory access policy management apparatus, including: a first update module configured to update an address management table based on the hit condition of a historical access request, where the address management table includes hit statistics corresponding to an address interval, and the historical access request is an access request directed to a historical target address within the address interval; a determination module configured to determine an access policy parameter corresponding to the address interval based on the hit statistics; a second update module configured to, in response to receiving an access request for a target address within the address interval, add the access policy parameter to the access request to update the access request; and a sending module configured to send the updated access request to the cache unit.
According to yet another aspect of the present disclosure, there is provided a processor, including: an execution unit configured to initiate an access request for a target address; a cache unit configured to manage data from each address interval in the cache unit based on a memory access policy, the memory access policy including access policy parameters; and a memory access policy management unit configured to execute the memory access policy management method described in any embodiment of the foregoing aspect.
According to yet another aspect of the present disclosure, there is provided a computing device comprising a memory and a processor as described in the preceding aspect, wherein the memory comprises a plurality of address intervals.
With the access policy management method provided by the foregoing aspect of the disclosure, the access policy parameters corresponding to each address interval may be dynamically adjusted based on the hit conditions of historical access requests. When a new access request is received, the access request is updated by adding to it the adjusted access policy parameters of the corresponding address interval, and the updated access request is sent to the cache unit, so that the cache unit can perform the corresponding cache management operations according to the access policy parameters carried by the request. In this way, the access policy parameters of each address interval can be adapted at runtime, reducing the problems caused by all addresses sharing the same access policy parameters, such as one-shot or low-frequency data crowding out high-frequency data in the cache. It also removes the need to manually preset access policy parameters for each address or address interval in advance, avoiding both the extra labor and time cost of manual configuration and the mismatch between preset assumptions about data access and the actual runtime behavior. The caching method provided by the disclosure therefore helps improve the flexibility of the cache access policy and optimize it, which in turn helps improve data interaction efficiency and the processing efficiency of the processor.
These and other aspects of the application will be apparent from and elucidated with reference to the embodiment(s) described hereinafter.
Drawings
Further details, features and advantages of the present application are disclosed in the following description of exemplary embodiments, which description should be taken in conjunction with the accompanying drawings, in which:
FIG. 1 schematically illustrates an example block diagram of a computing device in the related art;
FIG. 2 schematically illustrates an example flow diagram of a memory access policy management method according to some embodiments of the present disclosure;
FIG. 3 schematically illustrates an example flow diagram of an access request processing procedure, in accordance with some embodiments of the present disclosure;
FIG. 4 schematically illustrates an example block diagram architecture of a memory access policy management unit, in accordance with some embodiments of the present disclosure;
FIG. 5 schematically illustrates an example block diagram of a memory access policy management apparatus in accordance with some embodiments of this disclosure;
FIG. 6 schematically illustrates an example block diagram of a processor architecture, in accordance with some embodiments of this disclosure;
fig. 7 schematically illustrates an example block diagram of a computing device in accordance with some embodiments of this disclosure.
Detailed Description
Fig. 1 schematically illustrates an example block diagram of a computing device 100 in the related art. As shown, the computing device 100 may include a processor 110 and a memory 120. Optionally, the processor 110 may be a CPU, a GPU, or the like, and may include an execution unit 111 and a cache unit 112. For example, the execution unit 111 may initiate an access request for an address in the memory 120 to obtain the data stored at that address. The cache entries of the cache unit 112 are first searched to determine whether the data of the address is present. If so, the address is regarded as hit, and the data is read directly from the cache entry of the cache unit 112 and returned to the execution unit 111. If not, the address is regarded as a miss; the data of the address is then requested from the memory 120, and the data returned by the memory 120 may be fed back to the execution unit 111 via the cache unit 112 and, optionally, stored in the cache unit 112 by replacing one of its cache entries.
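Illustratively, this hit/miss flow can be modeled with the following minimal C++ sketch; all identifiers (Memory, CacheUnit, access) are hypothetical illustrations rather than the patent's implementation, and entry replacement on a full cache is omitted for brevity.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Minimal model of the hit/miss flow of Fig. 1. Names are illustrative only.
struct Memory {
    std::unordered_map<uint64_t, std::vector<uint8_t>> blocks;  // address -> data block
};

struct CacheUnit {
    std::unordered_map<uint64_t, std::vector<uint8_t>> entries;  // cached data blocks
    Memory* downstream = nullptr;

    // Serves one access request and reports whether it hit in the cache.
    std::vector<uint8_t> access(uint64_t addr, bool& hit) {
        auto it = entries.find(addr);
        if (it != entries.end()) {  // hit: return the cached copy directly
            hit = true;
            return it->second;
        }
        hit = false;                                       // miss: fetch from memory,
        std::vector<uint8_t> data = downstream->blocks[addr];
        entries[addr] = data;                              // then optionally fill the cache
        return data;                                       // (replacement logic omitted)
    }
};
```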
In addition, optionally, the processor 110 may further include a Memory Management Unit (MMU) 113. The memory management unit 113 may implement the mapping between virtual addresses and physical addresses based on an address mapping table, which may take the form of a page table and may be stored locally in the memory management unit 113, in the memory 120, or partly in both. For example, the address in an access request issued by the execution unit 111 may be a virtual address, which is converted into a physical address by the memory management unit 113 and then provided to the cache unit 112 for processing. For example, the virtual address space and the physical address space may be divided in units of pages, each page having a preset size such as 4 KB, and each mapping entry in the address mapping table used by the memory management unit 113 may correspond to the address mapping of one page.
In general, the cache unit has a pre-designed cache size, i.e., the size of the total cache space. The cache may be divided into a number of cache lines, where the cache line size defines the amount of data one cache record can store; the cache size and the cache line size together determine the number of cache lines. To exchange data between the memory space and the cache space, a preset mapping relationship is usually established between cache addresses and memory addresses, such as direct mapping, fully associative mapping, or set-associative mapping. Specifically, under direct mapping, each data block in the memory can be mapped to only one specific cache line, so data blocks mapped to the same cache line compete for it; under fully associative mapping, each data block may be mapped to any cache line, so all data blocks compete for all cache lines; under set-associative mapping, each data block may be mapped to any cache line within one set of cache lines, so data blocks mapped to the same set compete for the several cache lines of that set. Under such a mapping mechanism, when an access request for a data block misses in the cache, the data block needs to be fetched from the memory, and one cache line is selected for replacement from the one or more cache lines that have a mapping relationship with the data block. Generally, to preserve subsequent read and write efficiency, the cache line least likely to be reused is selected for replacement.
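As an illustration of the three mapping relationships, the following C++ sketch shows how an address selects its candidate cache line(s); the line size, line count, and associativity are assumed example values, not values from the patent.

```cpp
#include <cstdint>

// How a memory address selects candidate cache lines under the three
// mapping schemes described above. All sizes are assumed example values.
constexpr uint64_t kLineSize = 64;    // bytes per cache line (assumed)
constexpr uint64_t kNumLines = 1024;  // total cache lines (assumed)
constexpr uint64_t kWays     = 4;     // lines per set, set-associative case (assumed)
constexpr uint64_t kNumSets  = kNumLines / kWays;

uint64_t block_index(uint64_t addr) { return addr / kLineSize; }

// Direct mapping: exactly one candidate line, so same-line blocks compete.
uint64_t direct_mapped_line(uint64_t addr) { return block_index(addr) % kNumLines; }

// Set-associative mapping: the block may occupy any of kWays lines in one set.
uint64_t set_index(uint64_t addr) { return block_index(addr) % kNumSets; }

// Fully associative mapping computes no index: any line is a candidate, and
// lookup compares the block tag against all kNumLines entries.
```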
By analyzing the related art, the applicant finds that current optimization strategies for cache access mainly include the following. First, a flexible cache line size may be employed. Although the cache line size and the number of cache lines are fixed at hardware design time, for a particular request sequence it may be preferable to read only part of a cache line's worth of data from downstream storage (e.g., memory or a downstream cache) to conserve bandwidth. For example, if the interval between adjacent requests in a sequence is one cache line length but the data actually required is only 1/4 of a cache line, the access segment length can be set to 1/4 of the cache line length through segmentation, i.e., only the required 1/4 cache line of data is read, avoiding invalid reads of downstream storage. Second, a flexible cache application policy may be employed. In some cases, if the number of reuses of certain addresses is known in advance, different cache application policies may be adopted to reduce the impact on cache access. For example, for one-shot data, a no-allocate cache application policy may be selected, i.e., a miss does not trigger a cache line application, which avoids replacing high-frequency data already resident in the cache and hurting its subsequent hits. Third, various cache line replacement algorithms may be employed to optimize the replacement process. Illustratively: (1) the Least Recently Used (LRU) algorithm records a reference time for each cache line, incrementing a global reference time counter on each access, and on each replacement evicts the cache line with the smallest recorded reference time; this algorithm finds the least recently used cache line most accurately, but requires comparing a large number of reference time records in hardware, which is costly. (2) The pseudo least recently used (PLRU) algorithm, a simplification of LRU, maintains through a binary tree whether each cache line has been accessed recently and updates the tree on each access; it loses little performance compared with LRU and is relatively friendly to hardware implementation. (3) The not recently used (NRU) algorithm records one bit per cache line, setting the bit to 1 when the line is accessed and clearing all bits once every bit has been set to 1; this algorithm also approaches the performance of the least recently used algorithm with smaller hardware overhead. However, relying only on the least recently used principle is often not enough to maximize cache utilization, because many data items are reused only a few times during program execution, e.g., only once, yet the initial access and the re-access may be closely spaced. Such data can crowd out other, more frequently used data in the cache and hurt memory access performance. To address this, replacement priorities may be employed as a supplement to the least-recently-used replacement principle.
That is, when replacing a cache line, the replacement priority of the corresponding data is consulted: data with a high replacement priority is retained in the cache as long as possible and is either protected from being replaced by data with a low replacement priority or replaced by it only later.
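A minimal C++ sketch of the NRU scheme above, supplemented with a replacement priority as just described, might look as follows; the set size, the priority encoding (larger value retained longer), and the two-round search are assumptions for illustration only.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// NRU victim selection supplemented by a replacement priority: prefer lines
// that are not recently used, and among those, the lowest-priority line.
struct LineState {
    bool    recently_used = false;  // NRU reference bit, set to 1 on access
    uint8_t priority      = 0;      // higher value = retained longer (assumed encoding)
    bool    valid         = false;
};

template <std::size_t Ways>
int pick_victim(std::array<LineState, Ways>& set) {
    for (int round = 0; round < 2; ++round) {
        int victim = -1;
        for (int i = 0; i < (int)Ways; ++i) {
            if (!set[i].valid) return i;             // free line: no replacement needed
            if (set[i].recently_used) continue;      // NRU: skip recently used lines
            if (victim < 0 || set[i].priority < set[victim].priority)
                victim = i;                          // lowest priority is evicted first
        }
        if (victim >= 0) return victim;
        for (auto& l : set) l.recently_used = false; // NRU reset: clear all bits, retry
    }
    return 0;  // not reached: after the reset, round 2 always finds a victim
}
```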
However, through analysis of the related art, the applicant finds that the various cache access policies described above are either fixed in hardware circuits during hardware design, or must be configured by a driver before an application runs, or require explicit definitions and statements at programming time to control the different access policies. For example, the replacement policy based on the least recently used principle mentioned above is usually implemented in hardware, while replacement priorities usually require software engineers to define, at program initialization, the priority corresponding to each address interval. Similar situations exist for the access segment length, the cache application policy, and so on. Thus, the above schemes are effective only when the usage frequency of each data block in memory can be clearly known in advance. In most cases, however, the usage frequency of each data block is difficult to estimate, or is influenced by the runtime state and the input, so that it is difficult to specify in advance the replacement priority, access segment length, cache application policy, and the like for each data block; in such cases the above schemes may be ineffective or may fail to achieve the desired optimization effect.
Based on the above considerations, the applicant proposes a new access policy management method, which can solve or alleviate all or part of the above problems.
Illustratively, FIG. 2 schematically illustrates an example flow diagram of a memory access policy management method 200 according to some embodiments of the present disclosure. Optionally, the access policy management method 200 may be performed by the aforementioned memory management unit 113, by a separate access policy management unit, or by a combination of both. As shown, the access policy management method may include steps 210 to 240, which are described in detail below.
At step 210, the address management table may be updated based on the hit conditions of historical access requests. The address management table may include hit statistics corresponding to an address interval, and a historical access request may be an access request directed to a historical target address within that address interval. For example, the memory space may be divided in advance into a plurality of address intervals, and the hit statistics corresponding to each address interval may be recorded in an address management table. For example, the address management table may take the form of a page table, where each page table entry corresponds to one address interval and may be allocated one or a set of counters to record the hit statistics. Optionally, the address space may be divided in the same way as the page division in the aforementioned memory management unit 113, which helps reduce the complexity of the processing logic; when the two division schemes coincide, the address management table here may be merged with the aforementioned address mapping table. Optionally, the address management table may further include an identifier characterizing each address interval so as to distinguish different address intervals; the identifier may be, for example, the start address of the address interval, a number, or another type of identifier. Optionally, the hit condition of a historical access request may include whether the historical target address it is directed to was hit, and, in the case of a hit, information such as the average hit distance and the historical hit count of the historical target address. Accordingly, the hit statistics may include a statistical hit distance, a statistical hit count, and so on, aggregated per address interval. As mentioned previously, data accesses may be based on data blocks, the size of which may match the cache line size, and each data block may be identified by an address. Each address interval may include several data blocks, so the hit statistics corresponding to an address interval may be maintained from the hit conditions of historical access requests for the various addresses (i.e., the various data blocks) within the address interval.
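Illustratively, one possible layout of such an address management table, assuming 4 KB address intervals aligned with the page division mentioned above, is sketched below in C++; the field names and counter widths are assumptions, not the patent's format.

```cpp
#include <cstdint>
#include <unordered_map>

// One possible in-memory layout for the address management table of step 210.
struct HitStatistics {
    uint32_t stat_hit_count    = 0;  // statistical hit count for the interval
    uint32_t stat_hit_distance = 0;  // statistical (average) hit distance
};

struct AddressManagementTable {
    // Intervals assumed to match the 4 KB page division mentioned above.
    static constexpr unsigned kIntervalShift = 12;

    // Keyed by an interval identifier derived from the address.
    std::unordered_map<uint64_t, HitStatistics> entries;

    uint64_t interval_of(uint64_t addr) const { return addr >> kIntervalShift; }

    HitStatistics& stats_for(uint64_t addr) { return entries[interval_of(addr)]; }
};
```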
At step 220, an access policy parameter corresponding to the address interval may be determined based on the hit statistics. Illustratively, the access policies that can be supported, for example the aforementioned optimizations such as flexible cache line size, flexible cache application policy, cache replacement based on the least recently used principle, and cache replacement combined with replacement priorities, may be agreed in advance between the execution unit and the cache unit of the processor. The supported access policies may be determined at hardware design time, through software-driven configuration, and so on. It will be appreciated that the various access policies may include corresponding access policy parameters, such as the access segment length, the cache application policy, and the replacement priority. Further illustratively, for a flexible cache line size, there may be a parameter related to the access segment length, where the access segment length indicates the length at which data at addresses within the address interval is read into the cache unit; it may conventionally equal the cache line length, and in some cases may be set to a value smaller than the cache line length, such as 1/4 or 1/2 of the cache line length. For a flexible cache application policy, there may be a parameter indicating the specific cache application policy, for example whether a no-allocate policy is used; conventionally, a policy of normally applying for cache may be adopted, while in some cases a no-allocate policy may be adopted. For cache replacement combined with replacement priorities, there may be a parameter related to the replacement priority, which indicates the priority with which data at the addresses in the address interval is replaced by other data in the cache unit. It may include, for example, a high replacement priority and a low replacement priority, where, in the cache unit, data from an address interval with the low replacement priority is replaced by other data before data from an address interval with the high replacement priority; there may also be more than two levels of replacement priority, where a low replacement priority may be used conventionally and a higher one in certain cases; and so on. The hit statistics may reflect the historical hit situation of each address interval, such as how frequently each address interval is accessed.
Thus, for example, different access policy parameters may be set for each address interval according to its hit statistics. For an address interval whose hit statistics indicate frequent access, a corresponding parameter may be set to give the interval a higher replacement priority, so that its data is not replaced by low-replacement-priority data or is replaced only later. For an address interval whose hit statistics indicate very infrequent or one-shot access, a corresponding cache application policy parameter may be set to indicate a no-allocate policy. For an address interval whose hit statistics indicate low access frequency, a corresponding parameter may be set to indicate an access segment length smaller than the cache line length; and so on. Optionally, the access policy parameter may be recorded in the address management table, or in a separate mapping table, such as a dedicated address interval-access policy parameter mapping table.
At step 230, in response to receiving an access request for a target address within the address interval, the access policy parameter may be added to the access request to update the access request. For example, when an access request is received from an execution unit of the processor, the address interval to which the target address belongs may be determined from the target address of the request; the address management table may then be queried to obtain the access policy parameter corresponding to that address interval; and the queried access policy parameter may be added to the access request, for example at an agreed location within the request, so as to inform the cache unit of the access policy parameter and guide its behavior.
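A minimal C++ sketch of this tagging step, assuming the parameters occupy a dedicated field of the request (the patent only requires an agreed location), might be:

```cpp
#include <cstdint>

// Policy parameters attached to a request in step 230. Field layout is an
// assumption; only "an agreed location" is required by the text above.
struct PolicyParams {
    uint8_t access_segment_len;   // in fractions/multiples of a cache line
    uint8_t replacement_priority; // larger value = retained longer (assumed)
    bool    apply_for_cache;      // false = no-allocate application policy
};

struct AccessRequest {
    uint64_t     target_address;
    PolicyParams params;          // the agreed-upon location for the parameters
};

// Builds the updated request of step 230 from the interval's parameters.
AccessRequest tag_request(uint64_t addr, const PolicyParams& interval_params) {
    return AccessRequest{addr, interval_params};  // ready to send to the cache unit
}
```

Keeping the parameters in one fixed field keeps the cache unit's decoding logic trivial, which is presumably why an agreed location is specified.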
At step 240, the updated access request may be sent to the cache unit. For example, after receiving the updated access request, the cache unit may carry out the subsequent operations according to the access policy parameters carried at the agreed location in the request, such as adjusting the replacement priority of the corresponding cache entry, reading a data block from the memory without applying for a cache line, or reading data from the memory according to the specified access segment length.
The method 200 can dynamically adjust the parameters of the cache access policy (also referred to in the present disclosure as access policy parameters), such as the replacement priority, the access segment length, and the cache application policy, according to the hit conditions of historical access requests. Therefore, even when the usage frequency of each data block is hard to predict or is influenced by the running state and the input, the access policy parameters can be adjusted autonomously and dynamically, enabling runtime optimization of the cache access policy. This improves the data access efficiency of the processor and its overall processing efficiency, while reducing labor cost: there is less dependence on manually setting access policy parameters for different address intervals during hardware configuration or software development, and the extra labor that such manual configuration entails is avoided.
In some embodiments, step 210 may include: receiving the hit condition of the historical access request fed back by the cache unit, where the hit condition may include at least one of the following: information indicating whether the historical target address was hit, the average hit distance of the historical target address, and the cumulative number of times the historical target address was hit; and updating, in the address management table, the hit statistics corresponding to the address interval based on the hit condition. For example, when a historical access request is transmitted to the cache unit, the cache unit may determine whether it hits and, in the case of a hit, record or update information such as the cumulative hit count and the average hit distance of the historical target address the request is directed to. The cache unit may then feed back the requested data block together with the hit condition. Upon receiving the hit condition from the cache unit, the hit statistics of the address interval that includes the historical target address may be updated accordingly. In this way, the hit condition of every access request can be obtained through the cache unit's reporting mechanism, and the hit statistics can be updated in real time. The hit statistics of each address interval in the address management table can thus be kept current, reflecting the real-time hit situation of each address interval, which helps ensure that the determined access policy parameters are accurate and suited to the current data access pattern. In addition, the average hit distance and the cumulative hit count of each address intuitively reflect how frequently the address is accessed. The average hit distance refers to the average time interval between hits on an address: when the address is hit multiple times, a time interval exists between every two hits, and the average of all these intervals is the address's average hit distance. The cumulative hit count refers to the number of times the data block of an address has been hit since it was stored in the cache.
In some embodiments, the hit statistics may be updated based on the hit condition as follows. A hit statistic value corresponding to the historical access request may be determined based on the hit condition and a preset statistical mechanism, where the hit statistic value may include at least one of the following: a hit distance statistic of the historical target address and a hit count statistic of the historical target address. The hit statistics corresponding to the address interval may then be updated in the address management table based on the hit statistic value, where the hit statistics may include at least one of the following: a statistical hit distance and a statistical hit count. For example, the preset statistical mechanism may be a preset scaling mechanism: for the cumulative hit count, every 2, 4, or 6 actual hits may be counted as 1 statistical hit, i.e., the hit count statistic increases by 1 whenever the actual cumulative hit count increases by 2, 4, or 6; for the average hit distance, it may be reduced by a preset ratio, for example 1/8 or 1/16, to obtain the corresponding hit distance statistic. Optionally, the conversion of hit conditions into hit statistic values according to the preset statistical mechanism may be performed at the external interface of the processor core. Such a preset statistical mechanism reduces the overhead of keeping statistics: it avoids overly frequent updates of the hit statistics (such as an excessively churning hit count statistic) and reduces the probability of statistic values growing too large (such as an oversized hit distance statistic). As mentioned above, each address interval may include a plurality of addresses (i.e., a plurality of data blocks), and the hit statistics of the address interval may be obtained by averaging the hit statistic values of all or part of the addresses within it. For example, the statistical hit distance may be the average of the hit distance statistics of all or part of the addresses in the address interval, where a smaller value indicates better temporal locality of the interval; the statistical hit count may be the average of the hit count statistics of all or part of the addresses in the interval, where a larger value indicates better spatial locality. In this example, the smaller the statistical hit distance and the larger the statistical hit count of an address interval, the higher the access heat of that interval, and the data of the addresses within it should preferentially reside in the cache unit. Optionally, other statistical mechanisms may be chosen according to specific requirements, and the hit statistic values of individual addresses may be converted into the hit statistics of address intervals by other means. Converting per-address statistics into per-interval statistics helps keep the data storage and processing overhead low.
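Illustratively, this scaling mechanism, using 4 raw hits per statistical hit and a 1/16 distance ratio (two of the candidate values named above), together with the per-interval averaging, could be sketched as:

```cpp
#include <cstdint>
#include <vector>

// Preset scaling mechanism: coarsen raw hit counts and hit distances before
// they enter the table. The ratios 4 and 16 are example values from the text.
constexpr uint32_t kHitCountScale    = 4;   // 4 raw hits -> 1 statistical hit
constexpr uint32_t kHitDistanceScale = 16;  // distance reduced to 1/16

uint32_t scaled_hit_count(uint32_t raw_cumulative_hits) {
    return raw_cumulative_hits / kHitCountScale;
}

uint32_t scaled_hit_distance(uint32_t raw_average_distance) {
    return raw_average_distance / kHitDistanceScale;
}

// Interval-level statistics are the average of the per-address values.
uint32_t interval_average(const std::vector<uint32_t>& per_address_stats) {
    if (per_address_stats.empty()) return 0;
    uint64_t sum = 0;
    for (uint32_t v : per_address_stats) sum += v;
    return (uint32_t)(sum / per_address_stats.size());
}
```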
It is understood that, besides the statistical hit distance and statistical hit count mentioned above, other types of hit statistics may be collected; for example, hit statistics may be kept separately per processor core, per module, per application, and so on. By comparison, however, since the statistical hit distance and statistical hit count reflect the access frequency of an address interval intuitively, accurately, and comprehensively, they yield a large benefit at a small cost, helping to balance cost against benefit and to obtain the expected improvement in access efficiency at low overhead.
In some embodiments, step 220 may include: in response to the hit statistics satisfying a preset condition, setting the access policy parameter corresponding to the address interval to a preset parameter; and in response to the hit statistics not satisfying the preset condition, keeping the access policy parameter corresponding to the address interval at a default parameter. Illustratively, the preset parameter may include at least one of: a preset access segment length (such as 1/4 cache line length, 1/2 cache line length, or 2 times the cache line length), a no-allocate cache application policy, a high replacement priority, and the like. Accordingly, the default parameter may include at least one of: a conventional access segment length equal to the cache line length, a cache application policy of normally applying for cache, a low replacement priority, and the like. Optionally, the preset condition may take various forms, such as being greater than or equal to a preset threshold, being less than or equal to a preset threshold, ranking within a preset top range, or being greater than or equal to the average of the hit statistics of all address intervals, and may be set according to the specific application requirements.
Illustratively, the memory access policy parameters may be determined based on hit statistics according to the following algorithm.
An intuitive approach is to give address intervals with a smaller statistical hit distance and a larger statistical hit count a higher replacement priority, a longer access segment length, and so on. For example, 2 access segment lengths and 3 replacement priorities may be preset, where the 2 access segment lengths equal the cache line length and twice the cache line length, respectively, and the 3 replacement priorities are low, medium, and high, with the likelihood of the corresponding data being replaced in the cache unit decreasing in order from low to high priority. Assuming that 64 address intervals are recorded in total, all recorded address intervals may be weighted and sorted according to their statistical hit distance and statistical hit count. The access segment length, replacement priority, and cache application policy may then be assigned according to the following table:
Rank (by weighted hit statistics):   1-8     9-16    17-32   33-48   49-56   57-64
Access segment length:               2x      2x      1x      1x      1x      -
Replacement priority:                High    Medium  High    Medium  Low     -
Apply for cache:                     Yes     Yes     Yes     Yes     Yes     No
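A C++ sketch of this first algorithm is given below; the weighting formula used for sorting is an assumption (the text only states that the two statistics are weighted and sorted), and the rank buckets mirror the table above.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct IntervalStats {
    uint64_t id;
    uint32_t stat_hits;      // statistical hit count (larger = hotter)
    uint32_t stat_distance;  // statistical hit distance (smaller = hotter)
};

struct Assigned {
    uint64_t id;
    uint8_t  segment_len_x = 1;     // access segment length, cache-line multiples
    uint8_t  priority      = 0;     // 2 = high, 1 = medium, 0 = low (assumed encoding)
    bool     apply_cache   = true;
};

std::vector<Assigned> assign_params(std::vector<IntervalStats> v) {
    // Hotter first; this additive weighting is an assumed stand-in for the
    // unspecified weighting of the two statistics.
    std::sort(v.begin(), v.end(), [](const IntervalStats& a, const IntervalStats& b) {
        int64_t sa = (int64_t)a.stat_hits - (int64_t)a.stat_distance;
        int64_t sb = (int64_t)b.stat_hits - (int64_t)b.stat_distance;
        return sa > sb;
    });
    std::vector<Assigned> out;
    for (size_t rank = 0; rank < v.size(); ++rank) {
        Assigned a{v[rank].id};
        size_t r = rank + 1;  // 1-based rank, as in the table above
        if      (r <= 8)  { a.segment_len_x = 2; a.priority = 2; }
        else if (r <= 16) { a.segment_len_x = 2; a.priority = 1; }
        else if (r <= 32) { a.segment_len_x = 1; a.priority = 2; }
        else if (r <= 48) { a.segment_len_x = 1; a.priority = 1; }
        else if (r <= 56) { a.segment_len_x = 1; a.priority = 0; }
        else              { a.apply_cache = false; }  // ranks 57-64: no-allocate;
        out.push_back(a);                             // table leaves the rest unset
    }
    return out;
}
```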
Another, slightly more complex algorithm gives all address intervals the same initial access policy parameters (such as the replacement priority) and then, in each subsequent statistical period, periodically raises the parameters of the address intervals whose hit statistics are "better than" the global average. Here, "better" may mean a smaller statistical hit distance, a larger statistical hit count, and so on. For example, suppose 4 address intervals initially receive a uniform replacement priority. After a statistical period of 10 ms, if the statistical hit counts of address intervals 1 and 2 are above the average statistical hit count of the 4 intervals, their replacement priorities are raised to high; address intervals 3 and 4, whose statistical hit counts are below the average, have their replacement priorities lowered to low. On this basis, an adjustment threshold may also be added, for example adjusting the access policy parameter only when the statistic is better than the average by a certain percentage, which is not elaborated here.
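A sketch of this second algorithm, assuming two priority levels and promotion strictly above the global average, might be:

```cpp
#include <cstdint>
#include <vector>

struct Interval {
    uint64_t id;
    uint32_t stat_hits;  // statistical hit count in the current period
    uint8_t  priority;   // 1 = high replacement priority, 0 = low (assumed)
};

// Run once per statistical period (e.g. every 10 ms, as in the example above):
// intervals above the global average are promoted, the rest demoted.
void periodic_adjust(std::vector<Interval>& intervals) {
    if (intervals.empty()) return;
    uint64_t sum = 0;
    for (const auto& iv : intervals) sum += iv.stat_hits;
    uint64_t avg = sum / intervals.size();             // global average this period
    for (auto& iv : intervals)
        iv.priority = (iv.stat_hits > avg) ? 1 : 0;    // "better than" average -> high
}
```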
The above two algorithms are only exemplary, and those skilled in the art can design other algorithms for determining access policy parameters from the hit statistics according to specific requirements. Depending on the application, the access policy parameter determination algorithm may be implemented as a fixed logic circuit; as a partially configurable circuit controlled by a few registers, for example allowing the user to set a high replacement priority for the top 5, 8, 10, or some other number of address intervals ranked by hit statistics; or as a more complex programmable module comprising a processor and instructions. The present disclosure does not limit the specific implementation of the algorithm.
In some embodiments, the method 200 may further include: in response to the access policy parameter corresponding to the address interval being a preset parameter and the hit statistics satisfying a preset degradation condition, restoring the access policy parameter corresponding to the address interval to a default parameter, where the preset degradation condition includes at least one of the following: the statistical hit count of the address interval within a preset time window satisfies a preset threshold condition, the variation trend of the statistical hit distance of the address interval satisfies a first preset trend condition, and the variation trend of the statistical hit count of the address interval satisfies a second preset trend condition. When the hit statistics of an address interval satisfy a preset degradation condition, the access policy parameter of that interval is restored from the preset parameter to the default parameter, for example restoring the replacement priority from high to low, restoring the access segment length from a specified length to the cache line length, or restoring the cache application policy from no-allocate to the normal application policy. This ensures real-time dynamic adjustment of each address interval's access policy parameters and avoids the problems that arise when an interval's parameters remain unadjusted long after being set to specific values, such as short-term hot data occupying cache resources for a long time and the currently hot data being unable to preferentially reside in the cache.
For example, the hit statistics of an address interval may change slowly. If the interval is accessed very frequently during some period, its statistical hit count may become very high and its statistical hit distance very low; and since the statistics change through continuous accumulation, the statistical hit count will not drop immediately, nor will the statistical hit distance rise immediately, once the period of frequent access ends. However, by counting hits within a preset time window before the current moment, and by observing the variation trends of the statistical hit distance and the statistical hit count, the recent access situation of the address interval can be learned in time. The end of a frequent-access period can thus be detected promptly, the access policy parameters adjusted promptly, and the problem avoided of the interval occupying cache space long-term and hurting the access efficiency of other hot data. For example, the trends of the statistical hit distance and the statistical hit count may be determined with sliding time windows: the statistics may be computed over two or more time windows before the current moment, where the windows may be of the same size but cover different periods, or of different sizes but share the current moment as their end; comparing the hit statistics across these windows reveals the trend. For example, assume two preset time windows, where the first covers the current moment and a first preset length of time before it, the second covers the current moment and a second preset length of time before it, and the first preset length is greater than the second. If the hit statistics (such as the statistical hit count or statistical hit distance) determined for the first window exceed those determined for the second window, the corresponding statistic may be indicated to be trending downward; otherwise it may be indicated to be trending upward. Alternatively, the two windows may share the same preset length, with the first covering the current moment and the preset length of time before it, and the second covering a second moment, earlier than the current one, and the preset length of time before that. In this case, if the hit statistics determined for the first window exceed those for the second, the corresponding statistic may be indicated to be trending upward; otherwise it may be indicated to be trending downward.
Further, taking the replacement priority as an example, the preset threshold condition may be "less than or equal to a certain threshold", the first preset trend may be an upward trend, and the second preset trend may be a downward trend. That is, for an address interval whose statistics indicate that future access volume will decrease, the interval may be adjusted from a high replacement priority to a low one. For example, no recent accesses to the addresses in the interval (e.g., a statistical hit count of zero within the preset time window before the current moment), a recent access count below a certain threshold, a statistical hit count trending downward (meeting the second preset trend), or a statistical hit distance trending upward (meeting the first preset trend) all indicate that the interval's access volume is declining. Taking the cache application policy as an example, the preset threshold condition may instead be "greater than or equal to a certain threshold", the first preset trend a downward trend, and the second preset trend an upward trend. That is, for an address interval whose statistics indicate that future access volume will increase, the interval may be adjusted from no-allocate to normally applying for cache. For example, recent accesses to the interval exceeding a certain threshold, a statistical hit count trending upward (meeting the second preset trend), or a statistical hit distance trending downward (meeting the first preset trend) all indicate that the interval's access volume is rising. In other words, the preset threshold condition and the first and second preset trends can be set flexibly according to specific requirements, which the present disclosure does not limit.
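Illustratively, the second windowing scheme above (two equal-length windows, one ending at the current moment and one ending earlier) can be sketched as follows in C++; the tick-based timestamps and window length are assumptions for illustration.

```cpp
#include <cstdint>
#include <deque>

// Hit timestamps (abstract ticks) for one address interval.
struct HitLog {
    std::deque<uint64_t> hit_times;

    uint32_t hits_in(uint64_t from, uint64_t to) const {
        uint32_t n = 0;
        for (uint64_t t : hit_times)
            if (t >= from && t < to) ++n;
        return n;
    }
};

enum class Trend { Rising, Falling, Flat };

// Compares two equal-length, back-to-back windows; assumes now >= 2 * window.
Trend hit_count_trend(const HitLog& log, uint64_t now, uint64_t window) {
    uint32_t recent = log.hits_in(now - window, now);               // first window
    uint32_t prior  = log.hits_in(now - 2 * window, now - window);  // second window
    if (recent > prior) return Trend::Rising;   // e.g. meets the second preset trend
    if (recent < prior) return Trend::Falling;  // e.g. triggers a degradation check
    return Trend::Flat;
}
```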
In some embodiments, the method 200 may further include: in response to the number of address intervals having preset parameters reaching a preset threshold and the hit statistics of an address interval having default parameters satisfying the preset condition, restoring the access policy parameter of one of the threshold number of address intervals having preset parameters to the default parameter. This avoids the disorder in management, and the resulting loss of data access efficiency, that can occur when too many address intervals have their access policy parameters set to specific preset parameters (such as a high replacement priority, a short access segment length, or a no-allocate policy). For example, for the replacement priority, if too many address intervals are given the high replacement priority, the intervals that are actually accessed most frequently can no longer be distinguished, and the residency of their data in the cache unit can no longer be well guaranteed. A preset threshold may therefore be set on the number of address intervals holding preset parameters; the threshold may be, for example, 8, 10, or 15, which the present disclosure does not limit. When the number of such intervals has reached the preset threshold and the hit statistics of a new address interval satisfy the preset condition, the new interval may displace one of the existing threshold number of intervals. For example, for the replacement priority, suppose at most 10 address intervals may be set to the high replacement priority; when 10 such intervals already exist and another interval satisfies the condition for the high priority, the least recently accessed of the 10 (e.g., the one with the lowest statistical hit count and/or the longest statistical hit distance) may have its priority restored from high to low. For the access segment length, suppose at most 20 intervals may be set to an access segment length of twice the cache line length; when 20 such intervals already exist and another interval satisfies the condition for that length, the least recently accessed of the 20 may have its access segment length restored from twice the cache line length to the cache line length. For the cache application policy, suppose at most 15 intervals may be set to the no-allocate policy; when 15 such intervals already exist and another interval satisfies the condition for no-allocate, the most recently accessed of the 15 (e.g., the one with the highest statistical hit count and/or the shortest statistical hit distance) may have its policy restored from no-allocate to normally applying for cache; and so on.
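A sketch of this capacity rule for the replacement-priority case, assuming the example limit of 10 intervals and demotion of the least recently accessed member, might be:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// At most kMaxPreset intervals may hold the preset parameter (e.g. high
// replacement priority); 10 follows the text's example.
constexpr size_t kMaxPreset = 10;

struct PresetMember {
    uint64_t interval_id;
    uint32_t stat_hits;      // lower  value ~ less recently accessed
    uint32_t stat_distance;  // longer value ~ less recently accessed
};

// Promotes a qualifying interval into the preset set. If the set is full,
// demotes and returns the least accessed member (UINT64_MAX if none demoted).
uint64_t promote(std::vector<PresetMember>& preset_set, PresetMember candidate) {
    uint64_t demoted = UINT64_MAX;
    if (preset_set.size() >= kMaxPreset) {
        auto victim = std::min_element(
            preset_set.begin(), preset_set.end(),
            [](const PresetMember& a, const PresetMember& b) {
                if (a.stat_hits != b.stat_hits) return a.stat_hits < b.stat_hits;
                return a.stat_distance > b.stat_distance;  // longer distance loses
            });
        demoted = victim->interval_id;  // restore this interval to default parameters
        preset_set.erase(victim);
    }
    preset_set.push_back(candidate);
    return demoted;
}
```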
In some embodiments, the aforementioned address management table may include a local cache portion and a non-local cache portion. Illustratively, the address management table may be stored in a specific location of the memory space, but in order to increase the access speed, a part of the address management table, i.e. the above-mentioned local cache part, may be locally cached, and the rest is a non-local cache part. The local cache part can be adjusted in real time according to the access condition of each address interval. In such an embodiment, the method 200 described above may further include: in response to the address interval in the local cache portion being replaced by an address interval in the non-local cache portion, the hit statistics corresponding to the replaced address interval are zeroed in the local cache portion. Illustratively, only the hit statistics for the address intervals in the local cache portion may be counted, which helps to reduce the overhead incurred by the data statistics. Because the access probability of address intervals that are not in the local cache portion is low, the gains from maintaining hit statistics and setting access policy parameters for them may be low. Illustratively, several counters for hit statistics may be maintained locally only, and each or each set of counters may correspond to an address interval in the local cache portion of the address management table, and when that address interval is replaced by another address interval, the respective counter may be set to zero and counting resumes. Further, in such embodiments, the access policy parameters may be determined and stored only for hit statistics of address intervals in the local cache portion. However, alternatively, the hit statistics and the access policy parameters may be counted and stored for all address intervals, and may be stored locally, for example, in the access policy management unit or the memory management unit, or may be written into the memory for storage.
To further facilitate understanding, FIG. 3 schematically illustrates an example flow diagram of an access request processing procedure 300, in accordance with some embodiments of the present disclosure. Illustratively, the process may be performed within a processor. As shown in FIG. 3, an execution unit may initiate an access request for a target address; the aforementioned access policy management unit, memory management unit, or another similar unit may then query the address management table, or a separate address interval-access policy parameter mapping table, to obtain the access policy parameter corresponding to the address interval in which the target address lies. The obtained access policy parameter may be added to the access request to obtain an updated access request, which may then be sent to the downstream cache unit. The cache unit may return response data and a hit condition: the response data may be fed back to the execution unit for processing, and the hit condition may be fed back to the access policy management unit, memory management unit, or other similar unit, so as to update the hit statistics and the access policy parameters in the address management table or the separate address interval-access policy parameter mapping table. The update process may proceed as in the foregoing embodiments and is not repeated here. In this way, the access policy parameters in the address management table can be adjusted and updated in real time.
Fig. 4 schematically illustrates an example block diagram of an access policy management unit 400 according to some embodiments of the present disclosure; the access policy management unit 400 may implement the various embodiments described above. As shown, the access policy management unit 400 may include an address interval-access policy parameter mapping table 410, which may record the access policy parameter corresponding to each address interval; optionally, this table may be combined with the address management table into a single table, or may be a separate table. In addition, the access policy management unit 400 may further include hit statistics translation logic 420 and statistics-access policy parameter translation logic 430. The hit statistics translation logic 420 may convert the hit conditions returned by the cache unit into hit statistics and the corresponding address intervals according to the processes described in the foregoing embodiments. The statistics-access policy parameter translation logic 430 may determine the corresponding access policy parameter based on the hit statistics and write it to, or update, the address interval-access policy parameter mapping table 410 according to the processes described in the foregoing embodiments. When a new access request is received, the address interval-access policy parameter mapping table 410 may be queried to obtain the corresponding access policy parameter, and the access request carrying the access policy parameter may be transmitted to the cache unit for processing. Optionally, the access policy management unit 400 may be a separate structure, or may be incorporated into the memory management unit 113 shown in fig. 1.
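The statistics-access policy parameter translation logic 430 can likewise be illustrated with a minimal sketch. The two-valued parameter and the numeric thresholds below are invented for illustration; only the promote/demote structure mirrors the preset condition and preset degradation condition discussed in the foregoing embodiments.

#include <cstdint>

// Illustrative policy parameters: a default value and one preset value
// (for example, a higher replacement priority). Values are assumptions.
enum class PolicyParam : uint32_t { kDefault = 0, kPreset = 1 };

// Hypothetical thresholds for the preset condition (promote) and the
// preset degradation condition (demote); not taken from this disclosure.
constexpr uint64_t kPromoteHits = 64;  // hits within the current window
constexpr uint64_t kDemoteHits  = 8;

// Statistics-to-access-policy-parameter translation for one interval.
PolicyParam translate(uint64_t window_hits, PolicyParam current) {
    if (current == PolicyParam::kDefault && window_hits >= kPromoteHits)
        return PolicyParam::kPreset;   // preset condition satisfied: promote
    if (current == PolicyParam::kPreset && window_hits < kDemoteHits)
        return PolicyParam::kDefault;  // degradation condition satisfied: demote
    return current;                    // otherwise keep the current parameter
}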
Fig. 5 schematically illustrates an example block diagram of an access policy management apparatus 500 according to some embodiments of the present disclosure. As shown in fig. 5, the access policy management apparatus 500 may include a first update module 510, a determination module 520, a second update module 530, and a sending module 540.
In particular, the first update module 510 may be configured to: update an address management table based on the hit condition of a historical access request, where the address management table includes hit statistics corresponding to an address interval, and the historical access request is an access request for a historical target address within the address interval. The determination module 520 may be configured to: determine, based on the hit statistics, the access policy parameter corresponding to the address interval. The second update module 530 may be configured to: in response to receiving an access request for a target address within the address interval, add the access policy parameter to the access request to update the access request. The sending module 540 may be configured to: send the updated access request to the cache unit.
It should be understood that the access policy management apparatus 500 may have the same or similar embodiments, and achieve the same or similar technical effects, as the access policy management method 200 described above. For the sake of brevity, they are not described here again.
Fig. 6 schematically illustrates an example block diagram of the structure of a processor 600 according to some embodiments of the present disclosure. As shown, the processor 600 may include an execution unit 610, an access policy management unit 620, and a cache unit 630. The execution unit 610 may be configured to initiate an access request for a target address; the cache unit 630 may be configured to manage the data from each address interval in the cache unit based on an access policy, which may include access policy parameters; and the access policy management unit 620 may be configured to perform the various embodiments of the access policy management method 200 described previously. Optionally, the processor 600 may be any of various types of processors, such as a CPU or a GPU, and may be implemented in various forms, such as a chip.
For example, the execution unit 610 may initiate an access request to a certain target address in the memory according to the requirements of a running application program or the like. After receiving the access request, the access policy management unit 620 may look up the access policy parameter corresponding to the address interval in which the target address is located, add the queried access policy parameter to the access request, and send the access request carrying the access policy parameter to the cache unit 630. The cache unit 630 may read the carried access policy parameter in a pre-agreed manner and perform the corresponding operations; one hypothetical interpretation of such a parameter is sketched after this paragraph. The cache unit 630 may feed back response data for the access request to the execution unit 610, and feed back the hit condition of the access request to the access policy management unit 620, so that the access policy management unit 620 updates the hit statistics as described in the foregoing embodiments and further updates the access policy parameter.
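As one hypothetical example of such a pre-agreed interpretation, the sketch below treats the replacement priority, mentioned among the access policy parameters in this disclosure, as controlling where a line is inserted into an LRU replacement list; the encoding and the insertion rule are assumptions for illustration, not a format defined by this disclosure.

#include <cstdint>
#include <list>

// One cache set with an LRU-ordered replacement list.
struct CacheSet {
    std::list<uint64_t> lru;  // front = most recently used line tag

    // Insert a line according to the replacement priority carried by the
    // request: a high-priority line enters at the MRU end and survives
    // longer; a low-priority line enters at the LRU end and is evicted
    // sooner, similar in spirit to LRU insertion-position policies.
    void insertLine(uint64_t tag, bool high_priority) {
        if (high_priority) lru.push_front(tag);  // MRU position
        else               lru.push_back(tag);   // LRU position
    }

    // Evict the line at the LRU end, returning its tag.
    uint64_t evictLine() {
        uint64_t victim = lru.back();
        lru.pop_back();
        return victim;
    }
};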
Fig. 7 schematically illustrates an example block diagram of a computing device 700 according to some embodiments of the present disclosure. As shown, the computing device 700 may include a memory 710 and a processor 720, where the memory 710 may include a main memory and the processor 720 may have the same or similar structure as the processor 600 shown in fig. 6. Illustratively, the processor 720 may interact with the memory 710 through its internal cache unit using the schemes described in the foregoing embodiments.
Variations to the disclosed embodiments can be understood and effected by those skilled in the art in practicing the claimed subject matter, from a study of the drawings, the disclosure, and the appended claims. In the claims, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.

Claims (11)

1. A memory access policy management method, characterized by comprising the following steps:
updating an address management table based on a hit condition of a historical access request, wherein the address management table comprises hit statistics corresponding to an address interval, and the historical access request is an access request for a historical target address within the address interval;
determining, based on the hit statistics, an access policy parameter corresponding to the address interval;
in response to receiving an access request for a target address within the address interval, adding the access policy parameter to the access request to update the access request;
and sending the updated access request to a cache unit.
2. The method of claim 1, wherein updating the address management table based on the hit condition of the historical access request comprises:
receiving the hit condition of the historical access request fed back by the cache unit, the hit condition comprising at least one of: information indicating whether the historical target address is hit, an average hit distance of the historical target address, and an accumulated hit count of the historical target address;
and updating, based on the hit condition, the hit statistics corresponding to the address interval in the address management table.
3. The method of claim 2, wherein updating, based on the hit condition, the hit statistics corresponding to the address interval in the address management table comprises:
determining, based on the hit condition and a preset statistics mechanism, a hit statistic corresponding to the historical access request, the hit statistic comprising at least one of: a hit distance statistic of the historical target address and a hit count statistic of the historical target address;
and updating, based on the hit statistic, the hit statistics corresponding to the address interval in the address management table, the hit statistics comprising at least one of: a statistical hit distance and a statistical hit count.
4. The method of claim 1, wherein determining, based on the hit statistics, the access policy parameter corresponding to the address interval comprises:
in response to the hit statistics satisfying a preset condition, setting the access policy parameter corresponding to the address interval to a preset parameter;
and in response to the hit statistics not satisfying the preset condition, keeping the access policy parameter corresponding to the address interval at a default parameter.
5. The method of claim 4, further comprising:
in response to the access policy parameter corresponding to the address interval being the preset parameter and the hit statistics satisfying a preset degradation condition, restoring the access policy parameter corresponding to the address interval to the default parameter, wherein the preset degradation condition comprises at least one of the following: within a preset time window, the statistical hit count of the address interval satisfies a preset threshold condition, the variation trend of the statistical hit distance of the address interval satisfies a first preset trend condition, and the variation trend of the statistical hit count of the address interval satisfies a second preset trend condition.
6. The method of claim 4, further comprising:
in response to the number of address intervals whose access policy parameter is the preset parameter reaching a preset threshold, and the hit statistics of an address interval whose access policy parameter is the default parameter satisfying the preset condition, restoring the access policy parameter of one of the threshold number of address intervals having the preset parameter to the default parameter.
7. The method of claim 1, wherein the address management table comprises a local cache portion and a non-local cache portion, and wherein the method further comprises:
in response to the address interval in the local cache portion being replaced by the address interval in the non-local cache portion, zeroing hit statistics corresponding to the replaced address interval in the local cache portion.
8. The method of any of claims 1 to 7, wherein the access policy parameters comprise at least one of: access segment length, cache application policy, and replacement priority.
9. An access policy management apparatus, characterized in that the apparatus comprises:
a first update module configured to: update an address management table based on a hit condition of a historical access request, wherein the address management table comprises hit statistics corresponding to an address interval, and the historical access request is an access request for a historical target address within the address interval;
a determination module configured to: determine, based on the hit statistics, an access policy parameter corresponding to the address interval;
a second update module configured to: in response to receiving an access request for a target address within the address interval, add the access policy parameter to the access request to update the access request;
a sending module configured to: send the updated access request to a cache unit.
10. A processor, wherein the processor comprises:
an execution unit configured to initiate an access request for a target address;
a cache unit configured to manage data from each address interval in the cache unit based on an access policy, the access policy comprising access policy parameters;
a memory access policy management unit configured to execute the memory access policy management method according to any one of claims 1 to 8.
11. A computing device comprising a memory and the processor of claim 10, wherein the memory comprises a plurality of address intervals.
CN202310034457.3A 2023-01-10 2023-01-10 Access policy management method and device, processor and computing equipment Active CN115757203B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310034457.3A CN115757203B (en) 2023-01-10 2023-01-10 Access policy management method and device, processor and computing equipment

Publications (2)

Publication Number Publication Date
CN115757203A 2023-03-07
CN115757203B (en) 2023-10-10

Family

ID=85348894

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310034457.3A Active CN115757203B (en) 2023-01-10 2023-01-10 Access policy management method and device, processor and computing equipment

Country Status (1)

Country Link
CN (1) CN115757203B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115952110A (en) * 2023-03-09 2023-04-11 浪潮电子信息产业股份有限公司 Data caching method, device, equipment and computer readable storage medium
CN117909258A (en) * 2024-03-18 2024-04-19 北京开源芯片研究院 Optimization method and device for processor cache, electronic equipment and storage medium
CN117909258B (en) * 2024-03-18 2024-05-14 北京开源芯片研究院 Optimization method and device for processor cache, electronic equipment and storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040034746A1 (en) * 2002-08-19 2004-02-19 Horn Robert L. Method of increasing performance and manageablity of network storage systems using optimized cache setting and handling policies
US20140379987A1 (en) * 2013-06-21 2014-12-25 Aneesh Aggarwal Dynamic memory page policy
CN109582214A (en) * 2017-09-29 2019-04-05 华为技术有限公司 Data access method and computer system
CN113392042A (en) * 2020-03-12 2021-09-14 伊姆西Ip控股有限责任公司 Method, electronic device and computer program product for managing a cache
US11467960B1 (en) * 2021-07-16 2022-10-11 Arm Limited Access frequency caching hardware structure
CN115033500A (en) * 2022-06-09 2022-09-09 北京奕斯伟计算技术股份有限公司 Cache system simulation method, device, equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CONG THUAN DO et al.: "A new cache replacement algorithm for last-level caches by exploiting tag-distance correlation of cache lines", Microprocessors and Microsystems *
LIU Bo; XIAO Jian; CAO Peng; YANG Miaomiao: "Optimization of data cache structure and cache management strategy in reconfigurable systems for media processing", Journal of Southeast University (Natural Science Edition), no. 06 *
LI Jiaxin; DENG Ning: "An SPM management strategy based on access counting", Computer Engineering, no. 09 *

Also Published As

Publication number Publication date
CN115757203B (en) 2023-10-10

Similar Documents

Publication Title
US11086792B2 (en) Cache replacing method and apparatus, heterogeneous multi-core system and cache managing method
TWI684099B (en) Profiling cache replacement
EP3210121B1 (en) Cache optimization technique for large working data sets
US9235508B2 (en) Buffer management strategies for flash-based storage systems
US6662272B2 (en) Dynamic cache partitioning
US7380047B2 (en) Apparatus and method for filtering unused sub-blocks in cache memories
EP1505506A1 (en) A method of data caching
US9501419B2 (en) Apparatus, systems, and methods for providing a memory efficient cache
CN106372007B (en) Cache utilization estimation
CN109154912B (en) Replacing a cache entry based on availability of an entry in another cache
CN110297787B (en) Method, device and equipment for accessing memory by I/O equipment
CN115757203B (en) Access policy management method and device, processor and computing equipment
CN110413545B (en) Storage management method, electronic device, and computer program product
WO2021062982A1 (en) Method and apparatus for managing hmb memory, and computer device and storage medium
US11928061B2 (en) Cache management method and apparatus
US20040030839A1 (en) Cache memory operation
US11334488B2 (en) Cache management circuits for predictive adjustment of cache control policies based on persistent, history-based cache control information
CN115495394A (en) Data prefetching method and data prefetching device
US20090157968A1 (en) Cache Memory with Extended Set-associativity of Partner Sets
KR101976320B1 (en) Last level cache memory and data management method thereof
CN115080459A (en) Cache management method and device and computer readable storage medium
CN116467353B (en) Self-adaptive adjustment caching method and system based on LRU differentiation
CN116107926B (en) Cache replacement policy management method, device, equipment, medium and program product
EP4261712A1 (en) Data elimination method and apparatus, cache node, and cache system
Factor et al. Multilevel cache management based on application hints

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant