CN116010109A - Cache resource allocation method and device, electronic equipment and storage medium - Google Patents
Cache resource allocation method and device, electronic equipment and storage medium
- Publication number
- CN116010109A (application number CN202310153348.3A)
- Authority
- CN
- China
- Prior art keywords
- shared cache
- identification information
- data request
- cache
- request
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
- G06F9/5016—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/544—Buffers; Shared memory; Pipes
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The disclosure relates to the technical field of electric digital data processing, and in particular to a cache resource allocation method and apparatus, an electronic device, and a storage medium. A processor system includes at least two levels of cache, the highest level of which is a shared cache comprising a plurality of shared cache groups. The method includes: in response to a first data request from any application, acquiring first identification information carried by the first data request; and in response to the first data request being the first received data request corresponding to the first identification information, allocating a preset number of shared cache groups to the first identification information. Because the number of groups is greater than the number of ways, allocating shared cache resources by group makes reasonable allocation easier to achieve and can improve the utilization of shared cache resources.
Description
Technical Field
The disclosure relates to the technical field of electric digital data processing, and in particular to a cache resource allocation method and apparatus, an electronic device, and a storage medium.
Background
The cache is an on-chip memory located between the CPU (Central Processing Unit)/GPU (Graphics Processing Unit) and main memory, providing fast, small-capacity data reading and writing. The data in the cache is a subset of main memory, placed in the cache through a mapping relation and located by comparing tag information. There are three main mapping relations. The first is direct mapping: a block address in memory can be mapped only to one fixed location in the cache. The second is fully associative mapping: a block address in memory may be mapped to any location in the cache. The third is set-associative mapping, a compromise between the first two: the cache is divided into sets, each having multiple ways; a block address in memory can be mapped only to one fixed set, but may be mapped to any way within that set. Because of the complexity of the physical implementation, the number of ways per set in a set-associative cache typically does not exceed 16 or 32. Table 1 shows an exemplary implementation of set-associative mapping. In the implementation shown in Table 1, the cache is divided into M+1 sets, each including N+1 ways.
TABLE 1
| | Way #0 | Way #1 | … | Way #N |
| Group #0 | Data | Data | … | Data |
| … | Data | Data | … | Data |
| Group #M | Data | Data | … | Data |
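To make the set-associative decomposition concrete, the following sketch (in Python, with assumed parameter values — a 64-byte line and 256 sets, neither taken from this disclosure) shows how a block address splits into a tag, a set index and an offset within the line:

```python
# Hypothetical parameters (not from the patent): 64-byte lines, 256 sets.
LINE_SIZE = 64
NUM_SETS = 256

def split_address(addr: int):
    """Split a byte address into (tag, set index, offset within the line)."""
    offset = addr % LINE_SIZE            # byte position inside the cache line
    block = addr // LINE_SIZE            # block (line) address
    set_index = block % NUM_SETS         # a block maps to exactly one set...
    tag = block // NUM_SETS              # ...and the tag tells blocks apart
    return tag, set_index, offset
```

Under direct mapping the way count per set is 1 and the set index alone fixes the location; under fully associative mapping the set count is 1 and only the tag matters; set-associative mapping sits between the two.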
Caching exploits the principle of locality in programs, which is divided into temporal locality and spatial locality. Temporal locality means that an address is likely to be accessed repeatedly within a period of time. Spatial locality means that once an address is accessed, nearby addresses are likely to be accessed as well. Because different programs exhibit different locality, their cache utilization also differs.
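As a hypothetical illustration of spatial locality (the line size, the access patterns and the crude unlimited-capacity model below are all assumptions, not part of the disclosure), one can count how often accesses land on an already-touched cache line:

```python
# Crude locality model (assumptions: unlimited capacity, 64-byte lines).
LINE = 64

def line_hits(addresses):
    """Count accesses whose cache line was already touched earlier."""
    seen, hits = set(), 0
    for a in addresses:
        line = a // LINE
        if line in seen:
            hits += 1
        seen.add(line)
    return hits

sequential = list(range(0, 1024))        # byte-by-byte walk: high spatial locality
strided = list(range(0, 1024 * 64, 64))  # one access per line: no line reuse
```

The sequential pattern hits on all but the first access to each line, while the strided pattern never revisits a line — the kind of difference that makes cache utilization program-dependent.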
How to improve the utilization of cache resources is a technical problem that urgently needs to be solved.
Disclosure of Invention
The present disclosure provides a technical scheme for cache resource allocation.
According to an aspect of the present disclosure, there is provided a cache resource allocation method for a processor system including at least two levels of cache, the highest level of the at least two levels of cache being a shared cache, the shared cache including a plurality of shared cache groups, the method including:
in response to a first data request from any application, acquiring first identification information carried by the first data request;
and in response to the first data request being the first received data request corresponding to the first identification information, allocating a preset number of shared cache groups to the first identification information.
In one possible implementation, the shared cache includes a plurality of shared cache channels;
the allocating, in response to the first data request being the first received data request corresponding to the first identification information, a preset number of shared cache groups to the first identification information includes:
in response to the first data request being the first received data request corresponding to the first identification information, allocating the preset number of shared cache groups in each of the plurality of shared cache channels to the first identification information.
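A minimal sketch of this step, assuming a naive bump allocator and invented names (`PRESET_SETS`, `on_first_request` and the other identifiers are illustrative, not from the disclosure), might reserve the preset number of groups in every channel on the first request carrying a given identifier:

```python
# Illustrative names and structures; the patent does not specify them.
PRESET_SETS = 32       # preset number of groups per channel (assumed)
NUM_CHANNELS = 4
allocations = {}       # identifier -> {channel: list of group indices}
next_free = [0] * NUM_CHANNELS   # naive per-channel bump allocator

def on_first_request(ident):
    """On the first request carrying `ident`, reserve PRESET_SETS groups
    in every shared-cache channel; later requests reuse the allocation."""
    if ident in allocations:          # not the first request: nothing to do
        return allocations[ident]
    per_channel = {}
    for ch in range(NUM_CHANNELS):
        start = next_free[ch]
        per_channel[ch] = list(range(start, start + PRESET_SETS))
        next_free[ch] = start + PRESET_SETS
    allocations[ident] = per_channel
    return per_channel
```

Each identifier thus receives its own disjoint range of groups in every channel, which is what isolates applications from one another without consuming ways.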
In one possible implementation, the preset number includes a first preset number and a second preset number, and the first preset number is smaller than the second preset number;
the allocating, in response to the first data request being the first received data request corresponding to the first identification information, a preset number of shared cache groups in each of the plurality of shared cache channels to the first identification information includes:
in response to the first data request being the first received data request corresponding to the first identification information, determining a first reference shared cache channel corresponding to the first identification information from the plurality of shared cache channels, wherein the first reference shared cache channel represents the reference shared cache channel corresponding to the first identification information;
and allocating the first preset number of shared cache groups in a first common shared cache channel to the first identification information, and allocating the second preset number of shared cache groups in the first reference shared cache channel to the first identification information, wherein the first common shared cache channel represents a shared cache channel, among the plurality of shared cache channels, other than the first reference shared cache channel.
In one possible implementation, the determining, in response to the first data request being the first received data request corresponding to the first identification information, a first reference shared cache channel corresponding to the first identification information from the plurality of shared cache channels includes:
in response to the first data request being the first received data request corresponding to the first identification information and there being, among the plurality of shared cache channels, shared cache channels that have not been determined as reference shared cache channels, determining the first reference shared cache channel corresponding to the first identification information from among the shared cache channels that have not been determined as reference shared cache channels.
In one possible implementation, the method further includes:
acquiring a first hit rate of the first reference shared cache channel for the first identification information and a second hit rate of the first common shared cache channel for the first identification information;
and adjusting the number of shared cache groups allocated to the first identification information according to the first hit rate and the second hit rate.
In one possible implementation manner, the adjusting the number of shared cache sets allocated to the first identification information according to the first hit rate and the second hit rate includes:
determining a ratio of the first hit rate to the second hit rate;
responsive to the ratio being greater than or equal to a first preset threshold, increasing a number of shared cache groups allocated to the first identification information in the plurality of shared cache channels; or, in response to the ratio being less than or equal to a second preset threshold, reducing the number of shared cache groups allocated to the first identification information in the plurality of shared cache channels; the first preset threshold is greater than the second preset threshold, and the first preset threshold and the second preset threshold are both greater than 1.
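The adjustment rule above can be sketched as follows; the threshold values, step size and lower bound are assumptions chosen for illustration — the disclosure only requires both thresholds to exceed 1 and the first to exceed the second:

```python
# Assumed constants, not values from the patent.
T_HIGH = 2.0   # first preset threshold
T_LOW = 1.25   # second preset threshold
STEP = 8       # groups added or removed per adjustment

def adjust_sets(allocated, ref_hit_rate, common_hit_rate):
    """Adjust the number of groups held by an identifier from the ratio of
    the reference channel's hit rate to the common channels' hit rate."""
    if common_hit_rate == 0:
        return allocated                   # avoid division by zero
    ratio = ref_hit_rate / common_hit_rate
    if ratio >= T_HIGH:                    # larger allocation clearly pays off
        return allocated + STEP
    if ratio <= T_LOW:                     # extra groups bring little benefit
        return max(STEP, allocated - STEP)
    return allocated
```

Intuitively, the reference channel holds the larger allocation, so a high ratio means the extra groups are producing extra hits and the allocation should grow; a ratio near 1 means they are wasted and can be reclaimed.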
In one possible implementation, the method further includes:
acquiring a first request address corresponding to the first data request;
in response to determining, according to the first request address, that a cache miss occurs in the local cache, acquiring a group mask, a group offset and a flag bit offset corresponding to the first identification information;
determining new group bits and new tag information corresponding to the first data request according to a second request address corresponding to the first data request, the group mask, the group offset and the flag bit offset, wherein the second request address is determined according to the first request address;
and searching for target data according to the channel information and the in-row offset address in the second request address, the new group bits and the new tag information.
In one possible implementation, the method further includes:
and remapping the first request address to obtain the second request address.
In one possible implementation manner, the determining new set of bits and new tag information corresponding to the first data request according to the second request address, the set mask, the set offset, and the flag bit offset corresponding to the first data request includes:
acquiring original group bits and original tag information from the second request address;
performing an AND operation on the original group bits and the group mask to obtain the relative position of the target data requested by the first data request within the plurality of shared cache groups corresponding to the first identification information;
determining a new group bit corresponding to the first data request according to the group offset and the relative position;
and determining new tag information corresponding to the first data request according to the original tag information, the flag bit offset and the designated bit in the original group bit.
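One hedged reading of these steps (the bit widths, the mask layout and the choice to fold the displaced group bits into the tag are assumptions used only for illustration) is the following sketch:

```python
# Assumed layout: 8 group-index bits per channel; names are invented.
SET_BITS = 8

def remap(orig_set, orig_tag, set_mask, set_offset, tag_bit_offset):
    """Derive (new group bits, new tag) for an identifier owning a
    sub-range of groups: mask the original group bits down to a relative
    position, rebase it at the identifier's group offset, and fold the
    displaced bits into the tag so addresses stay distinguishable."""
    relative = orig_set & set_mask                       # relative position
    new_set = set_offset + relative                      # new group bits
    displaced = orig_set & ~set_mask & ((1 << SET_BITS) - 1)
    new_tag = orig_tag | (displaced << tag_bit_offset)   # new tag information
    return new_set, new_tag
```

Moving the masked-off group bits into the tag is what keeps two addresses that formerly mapped to different groups from colliding after they are squeezed into the identifier's smaller group range.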
In one possible implementation, the searching for the target data according to the channel information and the in-row offset address in the second request address, and the new group bits and the new tag information includes:
in response to failing to find the target data according to the channel information, the new group bits, the new tag information and the in-row offset address, acquiring the target data from memory or an external memory, writing the target data into the shared cache group corresponding to the new group bits, and returning the target data in response to the first data request.
In one possible implementation, the first identification information includes any one of the following:
identification information determined according to a module called by the application; identification information determined according to context identification information; or identification information determined according to the address interval, in memory, of the target data requested by the first data request.
According to an aspect of the present disclosure, there is provided a cache resource allocation apparatus for a processor system including at least two levels of cache, the highest level of the at least two levels of cache being a shared cache, the shared cache including a plurality of shared cache groups, the apparatus comprising: a first acquisition module, configured to acquire, in response to a first data request from any application, first identification information carried by the first data request; and an allocation module, configured to allocate, in response to the first data request being the first received data request corresponding to the first identification information, a preset number of shared cache groups to the first identification information.
In one possible implementation, the shared cache includes a plurality of shared cache channels; the allocation module is configured to: in response to the first data request being the first received data request corresponding to the first identification information, allocate the preset number of shared cache groups in each of the plurality of shared cache channels to the first identification information.
In one possible implementation, the preset number includes a first preset number and a second preset number, and the first preset number is smaller than the second preset number; the allocation module is configured to: in response to the first data request being the first received data request corresponding to the first identification information, determine a first reference shared cache channel corresponding to the first identification information from the plurality of shared cache channels, wherein the first reference shared cache channel represents the reference shared cache channel corresponding to the first identification information; allocate the first preset number of shared cache groups in a first common shared cache channel to the first identification information; and allocate the second preset number of shared cache groups in the first reference shared cache channel to the first identification information, wherein the first common shared cache channel represents a shared cache channel, among the plurality of shared cache channels, other than the first reference shared cache channel.
In one possible implementation, the allocation module is configured to: in response to the first data request being the first received data request corresponding to the first identification information and there being, among the plurality of shared cache channels, shared cache channels that have not been determined as reference shared cache channels, determine the first reference shared cache channel corresponding to the first identification information from among the shared cache channels that have not been determined as reference shared cache channels.
In one possible implementation, the apparatus further includes: the second acquisition module is used for acquiring a first hit rate of the first reference shared cache channel aiming at the first identification information and a second hit rate of the first common shared cache channel aiming at the first identification information; and the adjusting module is used for adjusting the number of the shared cache groups allocated to the first identification information according to the first hit rate and the second hit rate.
In one possible implementation, the adjusting module is configured to: determining a ratio of the first hit rate to the second hit rate; responsive to the ratio being greater than or equal to a first preset threshold, increasing a number of shared cache groups allocated to the first identification information in the plurality of shared cache channels; or, in response to the ratio being less than or equal to a second preset threshold, reducing the number of shared cache groups allocated to the first identification information in the plurality of shared cache channels; the first preset threshold is greater than the second preset threshold, and the first preset threshold and the second preset threshold are both greater than 1.
In one possible implementation, the apparatus further includes: the third acquisition module is used for acquiring a first request address corresponding to the first data request; a fourth obtaining module, configured to obtain a group mask, a group offset, and a flag bit offset corresponding to the first identification information in response to determining that a cache miss occurs in the local cache according to the first request address; a determining module, configured to determine new group bits and new tag information corresponding to the first data request according to a second request address corresponding to the first data request, the group mask, the group offset, and the flag bit offset, where the second request address is determined according to the first request address; and the searching module is used for searching the target data according to the channel information and the in-row offset address in the second request address, the new group bit and the new tag information.
In one possible implementation, the apparatus further includes: and the remapping module is used for remapping the first request address to obtain the second request address.
In one possible implementation, the determining module is configured to: acquiring original group bits and original tag information from the second request address; performing AND operation on the original group bit and the group mask to obtain the relative positions of the target data requested by the first data request in a plurality of shared cache groups corresponding to the first identification information; determining a new group bit corresponding to the first data request according to the group offset and the relative position; and determining new tag information corresponding to the first data request according to the original tag information, the flag bit offset and the designated bit in the original group bit.
In one possible implementation, the search module is configured to: in response to failing to find the target data according to the channel information, the new group bits, the new tag information and the in-row offset address, acquire the target data from memory or an external memory, write the target data into the shared cache group corresponding to the new group bits, and return the target data in response to the first data request.
In one possible implementation, the first identification information includes any one of the following: identification information determined according to a module called by the application; identification information determined according to context identification information; or identification information determined according to the address interval, in memory, of the target data requested by the first data request.
According to an aspect of the present disclosure, there is provided an electronic apparatus including: one or more processors; a memory for storing executable instructions; wherein the one or more processors are configured to invoke the executable instructions stored by the memory to perform the above-described method.
According to an aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described method.
According to an aspect of the present disclosure, there is provided a computer program product including computer readable code, or a non-transitory computer readable storage medium carrying computer readable code, wherein when the code runs in an electronic device, a processor in the electronic device performs the above method.
In the embodiment of the disclosure, the processor system includes at least two levels of cache, the highest level of which is a shared cache including a plurality of shared cache groups. By acquiring, in response to a first data request from any application, first identification information carried by the first data request, and allocating, in response to the first data request being the first received data request corresponding to the first identification information, a preset number of shared cache groups to the first identification information, shared cache resources are allocated on a per-group basis. Since the number of groups is greater than the number of ways (e.g., the shared cache includes a plurality of shared cache channels, each containing 256 groups, while the number of ways in a group typically does not exceed 16 or 32), allocating shared cache resources by group makes reasonable allocation easier to implement and can improve the utilization of shared cache resources.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the technical aspects of the disclosure.
FIG. 1 shows a schematic diagram of the cache structure of a GPU.
Fig. 2 shows a flowchart of a method for allocating cache resources according to an embodiment of the present disclosure.
Fig. 3 is a schematic diagram illustrating allocation of identification information for a data request by groups in a cache resource allocation method according to an embodiment of the present disclosure.
Fig. 4 is a schematic diagram of a cache structure of a GPU according to an embodiment of the present disclosure.
Fig. 5 is a schematic diagram of a cache lookup method according to an embodiment of the disclosure.
Fig. 6 shows a block diagram of a cache resource allocation apparatus provided by an embodiment of the present disclosure.
Fig. 7 illustrates a block diagram of an electronic device 1900 provided by an embodiment of the disclosure.
Detailed Description
Various exemplary embodiments, features and aspects of the disclosure will be described in detail below with reference to the drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Although various aspects of the embodiments are illustrated in the accompanying drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration". Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The term "and/or" herein merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, A and/or B may mean: A exists alone, A and B exist together, or B exists alone. In addition, the term "at least one" herein means any one of a plurality, or any combination of at least two of a plurality; for example, including at least one of A, B and C may mean including any one or more elements selected from the set consisting of A, B and C.
Furthermore, numerous specific details are set forth in the following detailed description in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art have not been described in detail in order not to obscure the present disclosure.
Fig. 1 shows a schematic diagram of the cache structure of a GPU (Graphics Processing Unit). At least two levels of cache are typically present in a GPU system. In the example shown in fig. 1, the GPU system includes two levels of cache.
There are multiple arithmetic unit clusters (GPU clusters) on the GPU, and each cluster contains multiple arithmetic units. The Local Cache may be considered a first-level cache, accessed only by its corresponding arithmetic unit cluster. Fig. 1 includes K+1 arithmetic unit clusters (arithmetic unit cluster 0 to arithmetic unit cluster K) and K+1 local caches (local cache 0 to local cache K) in one-to-one correspondence with them. The local cache may also be referred to as a level-1 cache, an L1 cache, etc., which is not limited herein.
A communication module (interface) may be used to pass the request to the corresponding external memory block.
The Shared Cache may be considered a second-level cache and may be accessed by all the arithmetic unit clusters. The shared cache may also be referred to as a level-2 cache, an L2 cache, a global cache, etc., which is not limited herein. Fig. 1 includes L+1 shared cache channels, namely shared cache channel 0 to shared cache channel L.
In addition, fig. 1 includes L+1 DRAM (Dynamic Random Access Memory) banks, namely DRAM bank 0 to DRAM bank L.
Wherein, the relationship between the shared cache channel and the DRAM memory bank can be one-to-one or many-to-one.
Because the GPU can process many operations or applications in parallel, and the data locality of these operations and applications is not necessarily the same, their requirements on the cache differ; some applications may even share no data at all. Placing them all in the shared cache indiscriminately will therefore affect operating efficiency.
Because the shared cache may be used jointly by multiple threads, multiple cores or multiple different applications, and each application requires a different cache size or has different data characteristics, it is preferable to treat them differentially in the shared cache, avoid conflicts, and allocate cache resources reasonably. The related art adopts way partitioning to allocate different ways to different applications. Table 2 shows another exemplary implementation of set-associative mapping. In the implementation shown in Table 2, the cache is divided into M+1 sets, each comprising 4 ways.
TABLE 2
| | Way #0 | Way #1 | Way #2 | Way #3 |
| Group #0 | Data | Data | Data | Data |
| … | Data | Data | Data | Data |
| Group #M | Data | Data | Data | Data |
In one example of partitioning by way, way 0 and way 1 store only application A's data, way 2 stores only application B's data, and way 3 stores only application C's data, so that the different applications do not affect each other.
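The way-partition scheme above can be sketched as follows; the mapping and function names are illustrative, not part of the related art being described:

```python
# Hypothetical sketch of way partition: each of the 4 ways in a set is
# statically reserved for one application, so cache lines from different
# applications never evict each other.
WAY_OWNER = {0: "A", 1: "A", 2: "B", 3: "C"}  # ways 0/1 -> app A, way 2 -> B, way 3 -> C

def ways_for_app(app):
    """Return the ways an application is allowed to allocate into."""
    return [way for way, owner in WAY_OWNER.items() if owner == app]

print(ways_for_app("A"))  # -> [0, 1]
```

With only 4 ways per set, at most 4 requesters can be isolated this way, which is exactly the limitation discussed next.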
In practical physical implementations of this way-partitioning approach, the number of ways in each set is limited (typically no more than 16 or 32). When the number of requesters exceeds the number of ways, as is common in GPU applications, way-based allocation becomes a limitation and leads to unreasonable resource allocation.
The embodiments of the disclosure provide a cache resource allocation method. A processor system comprises at least two levels of caches, the highest level of which is a shared cache comprising a plurality of shared cache groups. In response to a first data request from any application, first identification information carried by the first data request is obtained; and in response to the first data request being the first-received data request corresponding to the first identification information, a preset number of shared cache groups are allocated to the first identification information. Since the number of groups is greater than the number of ways (for example, the shared cache includes a plurality of shared cache channels, each channel includes 256 groups, while the number of ways in a group typically does not exceed 16 or 32), allocating shared cache resources by group is easier to implement and improves the utilization of shared cache resources.
The cache resource allocation method provided by the embodiment of the present disclosure is described in detail below with reference to the accompanying drawings.
Fig. 2 shows a flowchart of a cache resource allocation method according to an embodiment of the present disclosure. The cache resource allocation method is used for allocating the resources of a cache (Cache). In one possible implementation manner, the execution subject of the cache resource allocation method may be a cache resource allocation apparatus; for example, the method may be executed by a terminal device, a server, or other electronic devices. The terminal device may be a User Equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like. In some possible implementations, the cache resource allocation method may be implemented by a processor invoking computer-readable instructions stored in a memory. As shown in fig. 2, the cache resource allocation method includes steps S21 to S22.
In step S21, in response to a first data request from any application, first identification information carried by the first data request is acquired.
In step S22, a preset number of shared cache groups are allocated to the first identification information in response to the first data request being a data request corresponding to the first identification information received for the first time.
In an embodiment of the disclosure, a processor system includes at least two levels of caches, a highest level of the at least two levels of caches being a shared cache, the shared cache including a plurality of shared cache sets.
The processor system may be a GPU (Graphics Processing Unit) system or a CPU (Central Processing Unit) system, which is not limited herein. Hereinafter, the processor system is exemplified as a GPU system.
In one possible implementation, the processor system may include a two-level cache. The first-level cache may be a local cache, and the second-level cache may be a shared cache.
In another possible implementation, the processor system may include a three-level cache. The first-level cache and the second-level cache may be local caches, and the third-level cache may be a shared cache.
In the disclosed embodiments, the shared cache may include at least one channel, i.e., the shared cache may include at least one shared cache channel. Wherein the shared cache channel represents a channel of the shared cache. Any shared cache channel can be accessed by different clusters of arithmetic units.
In one possible implementation, the shared cache may include a plurality of shared cache channels. For example, the number of shared cache channels may be 16, 24, 32, 48, etc., without limitation. In this implementation, each shared cache channel may comprise a plurality of sets, i.e., each shared cache channel may comprise a plurality of shared cache groups. For example, each shared cache channel may include 256 shared cache groups.
In the disclosed embodiments, each shared cache set may include multiple ways, respectively. For example, the number of ways in each shared cache set may be 4, 8, 16, 32, etc.
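Combining the example figures above (16 channels, 256 groups per channel, 16 ways per group) with an assumed cache line size gives the total shared-cache capacity; the 64-byte line size is an assumption for illustration, not stated in the disclosure:

```python
LINE_BYTES = 64  # assumed cache line size (not specified in the text)
channels, groups_per_channel, ways_per_group = 16, 256, 16

total_lines = channels * groups_per_channel * ways_per_group
total_bytes = total_lines * LINE_BYTES
print(total_bytes // (1024 * 1024))  # -> 4 (MiB of shared cache)
```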
In the embodiment of the disclosure, the first data request may be any data request issued by any application. The first identification information may represent the identification information carried by the first data request.
Any one application may issue a large number of data requests. The identification information carried by different data requests sent by the same application can be different or the same. The identification information carried by different data requests sent by different applications can be different or the same.
In one possible implementation manner, the first identification information includes any one of the following: identification information determined according to a module called by the application; identification information determined according to identification information of a context; or identification information determined according to the address interval, in the memory, of the target data requested by the first data request.
As one example of this implementation, the identification information of the data request may be determined from a module invoked by the application. Taking GPU as an example, the GPU module for application call may include a special purpose unit for processing coordinate transformation in GPU, a special purpose unit for performing texture compression in GPU, and the like, which is not limited herein. In this example, when the same application invokes the same module to issue two data requests, the identification information of the two data requests is the same; when the same application calls two different modules to send two data requests, the identification information of the two data requests is different; when two applications call the same module to respectively send out data requests, the identification information of the two data requests is the same; when two applications call two different modules to respectively send data requests, the identification information of the two data requests is different. For example, the first data request sent by the application A1 calling module M1 is identical to the identification information of the second data request sent by the application A1 calling module M1, the first data request sent by the application A1 calling module M1 is different from the identification information of the third data request sent by the application A1 calling module M2, the first data request sent by the application A1 calling module M1 is identical to the identification information of the fourth data request sent by the application A2 calling module M1, and the first data request sent by the application A1 calling module M1 is different from the identification information of the fifth data request sent by the application A2 calling module M2.
As another example of this implementation, the identification information of the data request may be determined from the identification information of the context. In this example, the identification information of the context may refer to the identification information of the application. In this example, the identification information of different data requests issued by the same application is the same, and the identification information of the data requests issued by different applications is different.
As another example of this implementation, the identification information of the data request may be determined according to an address interval of the target data in the memory, which is requested by the data request. Different applications may access the same segment of address in memory, and thus, the identification information of data requests issued by different applications may be the same. The same application may access different addresses in memory, and thus the identification information of different data requests issued by the same application may be different.
In the implementation manner, the first identification information is determined according to the module called by the application, or the first identification information is determined according to the identification information of the context, or the first identification information is determined according to the address interval of the target data requested by the first data request in the memory, so that the identification information of the data request can be reasonably determined, and more reasonable allocation of the shared cache resources is facilitated.
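The three ways of deriving identification information described above can be sketched as follows; the module table, context IDs, and address intervals are all invented for illustration:

```python
# Hypothetical module table and address intervals (not from the disclosure).
MODULE_IDS = {"coord_transform": 0, "texture_decompress": 1}
ADDR_RANGES = [(0x0000_0000, 0x8000_0000, 2),    # (start, end, ID)
               (0x8000_0000, 0x1_0000_0000, 3)]

def request_id(mode, module=None, context_id=None, address=None):
    if mode == "module":    # same module -> same ID, even across applications
        return MODULE_IDS[module]
    if mode == "context":   # ID follows the application (context)
        return context_id
    if mode == "address":   # ID follows the address interval of the target data
        for start, end, rid in ADDR_RANGES:
            if start <= address < end:
                return rid
    raise ValueError("unknown mode")

print(request_id("module", module="coord_transform"))  # -> 0
```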
Although the above implementations describe the manner in which the identification information of a data request is determined as above, those skilled in the art will appreciate that the present disclosure should not be limited thereto. The determination mode of the identification information of the data request can be flexibly determined by a person skilled in the art according to the actual application scene requirement and/or personal preference.
In one possible implementation, the shared cache includes a plurality of shared cache channels; the responding to the first data request is a data request corresponding to the first identification information received for the first time, allocates a preset number of shared cache groups to the first identification information, and includes: and responding to the first data request as the data request corresponding to the first identification information received for the first time, and respectively distributing the preset number of shared cache groups in the plurality of shared cache channels to the first identification information.
For example, the shared cache includes 16 shared cache channels, namely, shared cache channel 0 to shared cache channel 15, and then a preset number of shared cache groups in the 16 shared cache channels may be respectively allocated to the first identification information in response to the first data request being a data request corresponding to the first identification information received for the first time.
In this implementation manner, the first identification information is respectively allocated to a preset number of shared cache groups in the plurality of shared cache channels by responding to the first data request as the first received data request corresponding to the first identification information, so that it is beneficial to balance the requests obtained by each shared cache channel.
As one example of this implementation, the preset number includes a first preset number and a second preset number, and the first preset number is smaller than the second preset number; the responding to the first data request is a data request corresponding to the first identification information received for the first time, and respectively distributing a preset number of shared cache groups in the plurality of shared cache channels to the first identification information, including: responding to the first data request as the first received data request corresponding to the first identification information, and determining a first reference shared cache channel corresponding to the first identification information from the plurality of shared cache channels, wherein the first reference shared cache channel represents a reference shared cache channel (reference cache) corresponding to the first identification information; and allocating the first preset number of shared cache groups in a first common shared cache channel to the first identification information, and allocating the second preset number of shared cache groups in the first reference shared cache channel to the first identification information, wherein the first common shared cache channel represents a shared cache channel except the first reference shared cache channel in the plurality of shared cache channels.
In this example, the second preset number may be 2 times, 1.5 times, 3 times, etc. of the first preset number, without limitation. The first preset number may represent the preset number corresponding to a common shared cache channel, and the second preset number may represent the preset number corresponding to a reference shared cache channel. The first reference shared cache channel may represent the reference shared cache channel corresponding to the first identification information. The reference shared cache channels corresponding to different identification information may be different. The number of reference shared cache channels corresponding to any one piece of identification information may be one or more; for example, it may be one. For any one piece of identification information, the common shared cache channels may represent the shared cache channels other than the reference shared cache channel corresponding to that identification information. For example, if the number of shared cache channels is 16 and the number of reference shared cache channels is 1, the number of common shared cache channels is 15.
In one example, the first preset number is 16 and the second preset number is 32. In this example, each first common shared cache channel may allocate 16 shared cache groups to the first identification information, and the first reference shared cache channel may allocate 32 shared cache groups to the first identification information.
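With these example numbers (16 channels, 1 of them the reference channel), the total number of shared cache groups initially allocated to the first identification information works out as:

```python
common_channels = 15    # 16 shared cache channels minus 1 reference channel
first_preset, second_preset = 16, 32   # groups per common / reference channel

total_groups = common_channels * first_preset + 1 * second_preset
print(total_groups)  # -> 272
```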
In this example, the first reference shared cache channel corresponding to the first identification information is determined from the plurality of shared cache channels in response to the first data request being the first received data request corresponding to the first identification information, the first preset number of shared cache groups in the first common shared cache channel are allocated to the first identification information, and the second preset number of shared cache groups in the first reference shared cache channel are allocated to the first identification information, so that the size condition of shared cache resources required by the data request corresponding to the first identification information can be determined based on the reference shared cache channel, thereby being beneficial to improving the utilization rate of shared caches.
In one example, the determining, in response to the first data request being a first received data request corresponding to the first identification information, a first reference shared cache channel corresponding to the first identification information from the plurality of shared cache channels includes: and responding to the first data request as the first received data request corresponding to the first identification information, wherein the shared cache channels which are not determined as the reference shared cache channels exist in the plurality of shared cache channels, and determining the first reference shared cache channel corresponding to the first identification information from the shared cache channels which are not determined as the reference shared cache channels.
In this example, the reference shared cache channel should be chosen as evenly as possible for different identification information. For example, a total of 4 shared cache ways and identification information for 3 data requests. Wherein the 4 shared cache channels are respectively shared cache channel 0, shared cache channel 1, shared cache channel 2 and shared cache channel 3, and the identification information of the 3 data requests is respectively first identification information (ID 0), second identification information (ID 1) and third identification information (ID 2). For example, after selecting the shared cache way 2 as the reference shared cache way for the first identification information, the shared cache way 2 may be avoided, e.g. the shared cache way 3 may be selected as the reference shared cache way, when selecting the reference shared cache way for the second identification information. When selecting the reference shared cache way for the third identification information, the shared cache way 2 and the shared cache way 3 may be avoided, for example, the shared cache way 0 or the shared cache way 1 is selected as the reference shared cache way.
In this example, the first reference shared cache channel corresponding to the first identification information is determined from the shared cache channels which are not determined as reference shared cache channels by responding to the first data request as the first received data request corresponding to the first identification information and the shared cache channels which are not determined as reference shared cache channels exist in the plurality of shared cache channels, so that the utilization rate of shared cache resources can be improved.
In another example, the determining, in response to the first data request being a first received data request corresponding to the first identification information, a first reference shared cache channel corresponding to the first identification information from the plurality of shared cache channels includes: and responding to the first data request as the first received data request corresponding to the first identification information, and determining the shared cache channel with the largest residual capacity in the plurality of shared cache channels as a first reference shared cache channel corresponding to the first identification information.
In another example, the determining, in response to the first data request being a first received data request corresponding to the first identification information, a first reference shared cache channel corresponding to the first identification information from the plurality of shared cache channels includes: and responding to the first data request as the first received data request corresponding to the first identification information, and randomly selecting a first reference shared cache channel corresponding to the first identification information from the plurality of shared cache channels.
In another example, the reference shared cache channel may be selected sequentially for different identification information. For example, the first time shared cache way 0 is selected as the reference shared cache way, the second time shared cache way 1 is selected as the reference shared cache way, the third time shared cache way 2 is selected as the reference shared cache way, and so on.
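The "spread evenly, then fall back to sequential selection" behavior described in these examples might look like the following sketch; the class and its state handling are assumptions for illustration:

```python
class RefChannelPicker:
    """Pick a reference shared cache channel for each new identification info."""

    def __init__(self, num_channels):
        self.num_channels = num_channels
        self.used = set()   # channels already chosen as reference channels
        self.cursor = 0     # round-robin cursor

    def pick(self):
        # Prefer a channel not yet serving as a reference channel.
        for _ in range(self.num_channels):
            ch = self.cursor
            self.cursor = (self.cursor + 1) % self.num_channels
            if ch not in self.used:
                self.used.add(ch)
                return ch
        # Every channel already hosts a reference channel: plain round-robin.
        ch = self.cursor
        self.cursor = (self.cursor + 1) % self.num_channels
        return ch

picker = RefChannelPicker(4)
print([picker.pick() for _ in range(3)])  # -> [0, 1, 2]
```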
In one example, the method further comprises: acquiring a first hit rate of the first reference shared cache channel for the first identification information and a second hit rate of the first common shared cache channel for the first identification information; and adjusting the number of the shared cache groups allocated to the first identification information according to the first hit rate and the second hit rate.
In this example, the first hit rate and the second hit rate may be counted at a preset frequency, so that the shared cache resources allocated to the respective identification information may be adjusted at the preset frequency.
In this example, in the case where the number of normal shared cache channels is plural, the average or median of the hit rates of the respective normal shared cache channels for the first identification information may be determined as the second hit rate.
In this example, by acquiring the first hit rate of the first reference shared cache channel for the first identification information and the second hit rate of the first normal shared cache channel for the first identification information, and adjusting the number of shared cache groups allocated to the first identification information according to the first hit rate and the second hit rate, the shared cache resources allocated to the first identification information are dynamically adjusted based on the performance difference of the first reference shared cache channel and the first normal shared cache channel, so that the utilization rate of the shared cache resources can be further improved, and the running efficiency of different applications can be improved.
In one example, the adjusting the number of shared cache sets allocated to the first identification information according to the first hit rate and the second hit rate includes: determining a ratio of the first hit rate to the second hit rate; responsive to the ratio being greater than or equal to a first preset threshold, increasing a number of shared cache groups allocated to the first identification information in the plurality of shared cache channels; or, in response to the ratio being less than or equal to a second preset threshold, reducing the number of shared cache groups allocated to the first identification information in the plurality of shared cache channels; the first preset threshold is greater than the second preset threshold, and the first preset threshold and the second preset threshold are both greater than 1.
The first preset threshold value and the second preset threshold value can be configured through a register. The second preset threshold may be slightly greater than 1.
Because the number of shared cache groups allocated to the first identification information in the first reference shared cache channel is greater than the number allocated in each first common shared cache channel, if the additional shared cache resources bring a significantly higher hit rate (for example, the ratio is greater than or equal to the first preset threshold), increasing the cache resources can be considered to significantly benefit the hit rate, and the cache space for the first identification information may be increased. If the additional shared cache resources do not bring a significantly higher hit rate (for example, the ratio is less than or equal to the second preset threshold), increasing the cache resources can be considered of little benefit to the hit rate, and the shared cache resources allocated to the first identification information may be reduced.
In the above example, in the case where the ratio is smaller than the first preset threshold and larger than the second preset threshold, the number of shared cache sets corresponding to the first identification information may not be changed.
In addition, in order to maintain data consistency when the allocation of cache resources is changed, additional cache maintenance operations may be employed, such as a flush operation and an invalidate operation.
In the above example, the number of the shared cache groups allocated to the first identification information in the plurality of shared cache channels is increased by determining a ratio of the first hit rate to the second hit rate, in response to the ratio being greater than or equal to a first preset threshold, or the number of the shared cache groups allocated to the first identification information in the plurality of shared cache channels is decreased in response to the ratio being less than or equal to a second preset threshold, wherein the first preset threshold is greater than the second preset threshold, and both the first preset threshold and the second preset threshold are greater than 1, whereby the utilization rate of the shared cache resources can be further improved.
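A sketch of this adjustment rule, with hypothetical register-configured thresholds (both greater than 1, the second slightly greater than 1) and an assumed adjustment step size:

```python
def adjust_groups(current_groups, ref_hit_rate, common_hit_rate,
                  upper=1.5, lower=1.1, step=8):
    """Grow or shrink the shared cache groups allocated to one ID."""
    ratio = ref_hit_rate / common_hit_rate
    if ratio >= upper:     # extra capacity clearly helps: allocate more groups
        return current_groups + step
    if ratio <= lower:     # extra capacity barely helps: release some groups
        return max(step, current_groups - step)
    return current_groups  # in between: leave the allocation unchanged

print(adjust_groups(16, 0.90, 0.50))  # ratio 1.8 >= 1.5 -> 24
```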
In one possible implementation, the method further includes: acquiring a first request address corresponding to the first data request; responding to the first request address to determine that cache miss occurs in a local cache, and acquiring a group mask, a group offset and a flag bit offset corresponding to the first identification information; determining new group bits and new tag information corresponding to the first data request according to a second request address corresponding to the first data request, the group mask, the group offset and the flag bit offset, wherein the second request address is determined according to the first request address; and searching target data according to the channel information and the in-row offset address in the second request address, the new group bit and the new tag information.
In this implementation, the first request address may represent a request address carried by the first data request. The first request address may be a virtual address or a physical address.
As an example of this implementation, a set mask (set mask), a set offset (set offset), and a tag bit offset (tag shift) corresponding to the first identification information may be acquired from the ID-cache set mapping table. Wherein the group mask may be used to determine a number of shared cache groups allocated to the first identification information. For example, the set mask=0x0f may indicate that the number of shared cache sets allocated to the first identification information is 16. The group offset may represent a starting position of the shared cache group allocated to the first identification information. For example, the set offset=0x10 may indicate that the start position of the shared cache set of the first identification information is the 17 th set in the shared cache channel. The flag bit offset may be used to determine the shift amount of the tag.
In this implementation, the new set of bits represents the set of bits used to store the target data requested by the first data request. The new tag information may represent new tag information corresponding to the target data. The new tag information can be stored in the shared cache group corresponding to the target data for subsequent cache searching and hit judgment.
In this implementation, the sizes of the group masks corresponding to different identification information may be the same or different. For example, the group mask corresponding to each identification information is 0x0f. As another example, a data request may carry a request size of a shared cache group, and the number of shared cache groups allocated to identification information of the data request may be determined based on the request size.
In this implementation manner, by acquiring the first request address corresponding to the first data request, in response to determining that a cache miss occurs in the local cache according to the first request address, acquiring a set mask, a set offset and a flag bit offset corresponding to the first identification information, and determining a new set bit and new tag information corresponding to the first data request according to the second request address, the set mask, the set offset and the flag bit offset corresponding to the first data request, where the second request address is determined according to the first request address, and according to channel information and an intra-line offset address in the second request address, and the new set bit and the new tag information, target data is searched, so that allocation of a shared cache set based on identification information of the data request can be achieved.
As an example of this implementation, the method further comprises: and remapping the first request address to obtain the second request address.
In one example, addresses issued by the GPU (i.e., request addresses carried by data requests) may be scrambled to be equally distributed to different shared cache channels by address interleaving and hashing.
In this example, the second request address is obtained by remapping the first request address, so that the requests obtained by each shared cache channel can be balanced, and the utilization rate of the shared cache resource is improved.
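Such interleaving is often implemented by XOR-folding address bits into a channel index; the folding below is a generic assumption for illustration, not the disclosure's actual hash:

```python
NUM_CHANNELS = 16  # assumed power of two

def channel_of(addr):
    line = addr >> 8          # drop the 8-bit in-line offset
    h = 0
    while line:               # XOR-fold successive 4-bit fields of the line address
        h ^= line & (NUM_CHANNELS - 1)
        line >>= 4
    return h

print(channel_of(0x12345678))  # -> 7
```

Because every line-address bit participates in the fold, consecutive cache lines land on different channels, balancing the requests each shared cache channel receives.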
As another example of this implementation, the first request address is a virtual address; the method further comprises the steps of: and converting the virtual address to the physical address through the memory management unit to obtain a second request address. In this example, the second request address is a physical address.
As another example of this implementation, the first request address is a physical address, and the first request address may be directly taken as the second request address.
As an example of this implementation, the determining new set of bits and new tag information corresponding to the first data request according to the second request address, the set mask, the set offset, and the flag bit offset corresponding to the first data request includes: acquiring original group bits and original tag information from the second request address; performing AND operation on the original group bit and the group mask to obtain the relative positions of the target data requested by the first data request in a plurality of shared cache groups corresponding to the first identification information; determining a new group bit corresponding to the first data request according to the group offset and the relative position; and determining new tag information corresponding to the first data request according to the original tag information, the flag bit offset and the designated bit in the original group bit.
For example, the second request address includes 32 bits, represented in hexadecimal as 0x12345678. The upper 4 bits (0x1) are the channel information; bits 16-27 (0x234) are the original tag information; bits 8-15 (0x56) are the original group bits; the lower 8 bits (0x78) are the in-line offset address, used to determine that the request accesses the byte at offset 0x78 in the cache line.
For example, the group mask mask=0x0f, the group offset offset=0x10, and the flag bit offset tag shift=0x4.
Performing an AND operation on the original group bits 0x56 and the group mask 0x0f yields the relative position 0x06 of the target data requested by the first data request within the plurality of shared cache groups corresponding to the first identification information. Converting the original group bits and the group mask to binary gives 01010110 and 00001111, respectively; 01010110 & 00001111 = 00000110, so the relative position is 0x06.
Adding the group offset to the relative position yields the new group bits 0x16 corresponding to the first data request. That is, according to the group mask, bits 8-11 of the second request address are selected as the low bits of the new group bits, and an offset of 0x10 is added.
From the original tag information 0x234, the flag bit offset 0x4, and the designated bits 0x5 (the high bits of the original group bits), the new tag information 0x2345 can be obtained.
Thus, a new request address of 0x123451678 can be obtained. In this example, bits 8-11 of the new request address are determined according to the second request address, and a total of 16 shared cache groups are allocated to the first identification information.
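The whole worked example can be reproduced as a runnable sketch. The field widths (4-bit channel, 12-bit tag, 8-bit group bits, 8-bit in-line offset) follow the 32-bit layout above; the `group >> 4` term assumes, as in this example, that the bits above the 4-bit mask are the designated bits moved into the tag:

```python
def remap(addr32, set_mask, set_offset, tag_shift):
    channel = (addr32 >> 28) & 0xF    # channel information
    tag     = (addr32 >> 16) & 0xFFF  # original tag information
    group   = (addr32 >> 8)  & 0xFF   # original group bits
    offset  =  addr32        & 0xFF   # in-line offset address

    relative  = group & set_mask                   # position within allocated groups
    new_group = set_offset + relative              # new group bits
    new_tag   = (tag << tag_shift) | (group >> 4)  # keep high group bits in the tag

    return (channel << 32) | (new_tag << 16) | (new_group << 8) | offset

print(hex(remap(0x12345678, 0x0F, 0x10, 0x4)))  # -> 0x123451678
```

Note that the remapped address is 36 bits: the tag grows by the flag bit offset so that no address information is lost.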
In this example, the original group bits and the original tag information are obtained from the second request address; an AND operation is performed on the original group bits and the group mask to obtain the relative position of the target data requested by the first data request within the plurality of shared cache groups corresponding to the first identification information; the new group bits corresponding to the first data request are determined according to the group offset and the relative position; and the new tag information corresponding to the first data request is determined according to the original tag information, the flag bit offset, and the designated bits in the original group bits. In this way, shared cache groups can be allocated based on the identification information of the data request, and because the designated bits in the original group bits are preserved when the new tag information is determined, the integrity of the original address is maintained.
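The arithmetic in the worked example above can be sketched as a short program. This is an illustrative reconstruction only: the field widths and the helper name `remap_address` are assumptions drawn from the example, not from any concrete hardware implementation.

```python
def remap_address(addr: int, set_mask: int, set_offset: int, tag_shift: int):
    """Derive new group bits and new tag information from a second request address."""
    line_offset = addr & 0xFF            # low 8 bits: intra-line offset address
    orig_set    = (addr >> 8) & 0xFF     # bits 8-15: original group bits
    orig_tag    = (addr >> 16) & 0xFFF   # bits 16-27: original tag information
    channel     = (addr >> 28) & 0xF     # top 4 bits: channel information

    relative = orig_set & set_mask       # relative position among the allocated groups
    new_set  = relative + set_offset     # new group bits = group offset + relative position
    kept     = orig_set >> tag_shift     # designated bits masked out of the group bits
    new_tag  = (orig_tag << tag_shift) | kept   # preserved inside the new tag information

    new_addr = (channel << 32) | (new_tag << 16) | (new_set << 8) | line_offset
    return new_set, new_tag, new_addr

# Worked example from the text: 0x12345678 with mask 0x0f, offset 0x10, shift 0x4
print([hex(v) for v in remap_address(0x12345678, 0x0F, 0x10, 0x4)])
# → ['0x16', '0x2345', '0x123451678']
```

Because the new tag information absorbs the masked-out designated bits, the original address remains recoverable from the new tag and new group bits, which is the "integrity of the original address" noted above.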
Fig. 3 is a schematic diagram illustrating group-wise allocation for the identification information of a data request in a cache resource allocation method according to an embodiment of the present disclosure. In fig. 3, the request ID denotes the identification information carried by a data request, for example, the first identification information carried by the first data request. The group mask, group offset, and flag bit offset corresponding to the first identification information may be obtained from the ID-cache group mapping table. The ID-cache group mapping table shown in fig. 3 records, for each of ID0 through IDN, its group mask, group offset, and flag bit offset. The original tag information, the original group bits, and the intra-line offset address may be obtained from a request address (e.g., the second request address). A bitwise AND operation may be performed on the original group bits and the group mask to obtain the relative position of the target data requested by the first data request within the plurality of shared cache groups corresponding to the first identification information. The group offset may be added to the relative position to obtain the new group bits. The original tag information, the flag bit offset, and the designated bits in the original group bits can be processed through a shifter to obtain the new tag information. A cache lookup is then performed according to the channel information, the intra-line offset address, the new group bits, and the new tag information.
It should be noted that the above definitions of the group mask and the group offset are only an example, not the only possibility. For example, the same group allocation function may be implemented by computing new group bits = (original group bits + group offset) & group mask, with the mask and offset encoded accordingly, or in other equivalent ways.
As an example of this implementation, the searching for the target data according to the channel information and the intra-line offset address in the second request address, the new group bits, and the new tag information includes: in response to the target data not being found according to the channel information, the new group bits, the new tag information, and the intra-line offset address, acquiring the target data from a memory or an external memory, writing the target data into the shared cache group corresponding to the new group bits, and returning the target data to the first data request.
In this example, when the first data request is the first received data request corresponding to the first identification information, the target data will not be found in the shared cache channel. In that case, the target data may be obtained from the memory or the external memory, written into the shared cache group corresponding to the new group bits, and returned to the first data request.
In this example, in response to the target data not being found according to the channel information, the new group bits, the new tag information, and the intra-line offset address, the target data is obtained from the memory or the external memory, written into the shared cache group corresponding to the new group bits, and returned to the first data request, so that the target data is written into the shared cache group corresponding to the first identification information.
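A minimal sketch of this miss path, with plain dictionaries standing in for the shared cache and the backing memory (all names and structures here are illustrative assumptions, not a concrete implementation):

```python
def lookup_or_fill(shared_cache, memory, channel, new_set, new_tag, line_offset):
    """Return the requested byte; on a miss, fetch the line and fill the group."""
    entry = shared_cache.get((channel, new_set))
    if entry is not None and entry["tag"] == new_tag:
        return entry["data"][line_offset]          # shared-cache hit
    line = memory[(channel, new_tag, new_set)]     # miss: read from (external) memory
    shared_cache[(channel, new_set)] = {"tag": new_tag, "data": line}  # fill the group
    return line[line_offset]                       # return target data to the request
```

The first request carrying a given piece of identification information always takes the miss branch; a repeated request for the same address then hits in the shared cache group selected by the new group bits.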
Fig. 4 is a schematic diagram of a cache structure of a GPU according to an embodiment of the present disclosure. The parts overlapping with fig. 1 are not described again. In fig. 4, addresses issued by the GPU (i.e., request addresses carried by data requests) may be scrambled by address interleaving and hash operations so as to be evenly distributed across the different shared cache channels. In addition, for the identification information carried by a data request, a reference shared cache channel corresponding to that identification information can be selected from among the shared cache channels.
Fig. 5 is a schematic diagram of a cache lookup method according to an embodiment of the disclosure. As shown in fig. 5, a local cache lookup may be performed in response to a data request. If there is a hit in the local cache (i.e., the target data requested by the data request is found in the local cache), the target data is fetched from the local cache and returned to the data request. If there is a miss in the local cache (i.e., a cache miss occurs in the local cache), the first request address carried by the data request may be converted into a second request address by address interleaving and a hash operation. The shared cache channel corresponding to the target data can be determined according to the channel information in the second request address, and the data request can be sent to the corresponding shared cache channel. The corresponding group mask, group offset, and flag bit offset can be obtained from the ID-cache group mapping table according to the identification information carried by the data request. The original tag information, the original group bits, and the intra-line offset address may be obtained from the second request address. A bitwise AND operation is performed on the original group bits and the group mask to obtain the relative position of the target data requested by the data request within the plurality of shared cache groups corresponding to the identification information. The group offset may be added to the relative position to obtain the new group bits. The new tag information can be obtained by processing the original tag information, the flag bit offset, and the designated bits in the original group bits. A shared cache lookup is then performed on the shared cache channel corresponding to the target data according to the channel information, the intra-line offset address, the new group bits, and the new tag information. If there is a hit in the shared cache, the target data is fetched from the shared cache and returned to the data request.
If there is a miss in the shared cache (i.e., a cache miss occurs in the shared cache), the data request may be sent to the corresponding DRAM bank, the target data may be fetched from the DRAM bank and returned to the data request, and whether to fill the target data into the cache may be determined based on a cache request control signal.
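The whole flow of Fig. 5 can be condensed into one control path. The helper functions below are stand-ins for the operations described in the text (address interleaving/hash, field extraction, group remapping); their names, field widths, and the dictionary-based caches are assumptions made for illustration only.

```python
def interleave_hash(addr):            # stand-in for address interleaving + hash
    return addr                       # identity here; real hardware scrambles bits

def split_fields(addr):               # channel / original tag / original set / offset
    return (addr >> 28) & 0xF, (addr >> 16) & 0xFFF, (addr >> 8) & 0xFF, addr & 0xFF

def remap(orig_set, orig_tag, mask, offset, tag_shift):
    relative = orig_set & mask                       # position within allocated groups
    return relative + offset, (orig_tag << tag_shift) | (orig_set >> tag_shift)

def handle_request(req_id, addr, local, shared, dram, id_table):
    if addr in local:                                # 1. local cache lookup
        return local[addr]
    addr2 = interleave_hash(addr)                    # 2. first -> second request address
    channel, orig_tag, orig_set, off = split_fields(addr2)
    mask, offset, tag_shift = id_table[req_id]       # 3. per-ID remap parameters
    new_set, new_tag = remap(orig_set, orig_tag, mask, offset, tag_shift)
    entry = shared.get((channel, new_set))
    if entry is not None and entry["tag"] == new_tag:  # 4. shared cache lookup
        return entry["data"][off]
    line = dram[addr2 & ~0xFF]                       # 5. miss: fetch line from DRAM bank
    shared[(channel, new_set)] = {"tag": new_tag, "data": line}  # optional cache fill
    return line[off]
```

For example, with the ID-cache group mapping table entry (0x0f, 0x10, 0x4) from the earlier example, a request for address 0x12345678 misses both caches, is filled into group 0x16 of channel 0x1, and returns byte 0x78 of the fetched line.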
The cache resource allocation method provided by the embodiment of the present disclosure is described below through a specific application scenario. In this application scenario, in response to a first data request from any application, first identification information carried by the first data request is acquired. The first identification information includes any one of the following: identification information determined according to the module called by the application, identification information determined according to context identification information, and identification information determined according to the address interval, in a memory, of the target data requested by the first data request. If the first data request is the first received data request corresponding to the first identification information, and there are shared cache channels among the plurality of shared cache channels that have not yet been determined as reference shared cache channels, a first reference shared cache channel corresponding to the first identification information is determined from among the shared cache channels not yet determined as reference shared cache channels, where the first reference shared cache channel represents the reference shared cache channel corresponding to the first identification information. A first preset number of shared cache groups in each first common shared cache channel are allocated to the first identification information, and a second preset number of shared cache groups in the first reference shared cache channel are allocated to the first identification information, where a first common shared cache channel is any shared cache channel other than the first reference shared cache channel among the plurality of shared cache channels.
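The first-allocation step in this scenario can be sketched as follows. The channel list, group counts, and bookkeeping structures are illustrative assumptions; the only property taken from the text is that the first preset number is smaller than the second, so the reference channel holds the larger share.

```python
def allocate_for_new_id(id_info, channels, taken_refs, alloc,
                        first_preset=4, second_preset=16):
    """On first sight of an ID, pick an unused reference channel and allocate groups."""
    free = [ch for ch in channels if ch not in taken_refs]
    ref = free[0] if free else channels[0]   # prefer a channel no ID references yet
    taken_refs.add(ref)
    alloc[id_info] = {                        # more groups in the reference channel
        ch: (second_preset if ch == ref else first_preset) for ch in channels
    }
    return ref
```

Each newly seen identification information thus lands on a different reference shared cache channel while still receiving a smaller allocation in every common shared cache channel.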
In response to determining, according to the first request address, that a cache miss occurs in the local cache, the group mask, the group offset, and the flag bit offset corresponding to the first identification information are obtained. The first request address is remapped through address interleaving and a hash operation to obtain the second request address. The original group bits and the original tag information can be obtained from the second request address; an AND operation is performed on the original group bits and the group mask to obtain the relative position of the target data requested by the first data request within the plurality of shared cache groups corresponding to the first identification information; the new group bits corresponding to the first data request are determined according to the group offset and the relative position; and the new tag information corresponding to the first data request is determined according to the original tag information, the flag bit offset, and the designated bits in the original group bits. A cache lookup is then performed according to the channel information and the intra-line offset address in the second request address, the new group bits, and the new tag information.
The shared cache resources allocated to the first identification information may be adjusted at a preset frequency. For example, a first hit rate of the first reference shared cache channel for the first identification information and a second hit rate of the first common shared cache channel for the first identification information may be obtained, and the ratio of the first hit rate to the second hit rate determined. In response to the ratio being greater than or equal to a first preset threshold, the number of shared cache groups allocated to the first identification information in the plurality of shared cache channels is increased; or, in response to the ratio being less than or equal to a second preset threshold, the number of shared cache groups allocated to the first identification information in the plurality of shared cache channels is reduced. The first preset threshold is greater than the second preset threshold, and both thresholds are greater than 1.
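The periodic adjustment reduces to a simple ratio test. The threshold values, step size, and bounds below are illustrative assumptions; the text only requires that both thresholds exceed 1 and the first exceeds the second.

```python
def adjust_groups(n_groups, ref_hit_rate, normal_hit_rate,
                  upper=2.0, lower=1.2, step=4, max_groups=64, min_groups=4):
    """Resize an ID's shared-cache-group allocation from the two hit rates."""
    ratio = ref_hit_rate / normal_hit_rate
    if ratio >= upper:                            # larger share pays off markedly:
        return min(n_groups + step, max_groups)   # give the ID more groups
    if ratio <= lower:                            # extra groups bring little benefit:
        return max(n_groups - step, min_groups)   # shrink the allocation
    return n_groups                               # otherwise keep the allocation
```

A high ratio means the reference channel (which holds more groups for this ID) hits much more often than the common channels, so additional groups are worthwhile; a ratio near 1 means they are not.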
It will be appreciated that the above-mentioned method embodiments of the present disclosure may be combined with each other to form combined embodiments without departing from their principles and logic; for brevity, such combinations are not described in detail in the present disclosure. It will be appreciated by those skilled in the art that, in the above methods of the embodiments, the specific execution order of the steps should be determined by their functions and possible inherent logic.
In addition, the disclosure further provides a cache resource allocation apparatus, an electronic device, a computer-readable storage medium, and a computer program product, each of which may be used to implement any cache resource allocation method provided in the disclosure; the corresponding technical solutions and technical effects may be found in the corresponding descriptions of the method section and are not repeated here.
Fig. 6 shows a block diagram of a cache resource allocation apparatus provided by an embodiment of the present disclosure. In an embodiment of the disclosure, a processor system includes at least two levels of caches, a highest level of the at least two levels of caches being a shared cache, the shared cache including a plurality of shared cache sets. As shown in fig. 6, the cache resource allocation apparatus includes:
a first obtaining module 61, configured to obtain, in response to a first data request from any application, first identification information carried by the first data request;
And the allocation module 62 is configured to allocate a preset number of shared cache groups to the first identification information in response to the first data request being a data request corresponding to the first identification information received for the first time.
In one possible implementation, the shared cache includes a plurality of shared cache channels;
the allocation module 62 is configured to:
in response to the first data request being the first received data request corresponding to the first identification information, respectively allocate the preset number of shared cache groups in the plurality of shared cache channels to the first identification information.
In one possible implementation, the preset number includes a first preset number and a second preset number, and the first preset number is smaller than the second preset number;
the allocation module 62 is configured to:
in response to the first data request being the first received data request corresponding to the first identification information, determine a first reference shared cache channel corresponding to the first identification information from among the plurality of shared cache channels, wherein the first reference shared cache channel represents the reference shared cache channel corresponding to the first identification information;
And allocating the first preset number of shared cache groups in a first common shared cache channel to the first identification information, and allocating the second preset number of shared cache groups in the first reference shared cache channel to the first identification information, wherein the first common shared cache channel represents a shared cache channel except the first reference shared cache channel in the plurality of shared cache channels.
In one possible implementation, the allocation module 62 is configured to:
in response to the first data request being the first received data request corresponding to the first identification information and there being, among the plurality of shared cache channels, shared cache channels that have not been determined as reference shared cache channels, determine the first reference shared cache channel corresponding to the first identification information from among the shared cache channels not determined as reference shared cache channels.
In one possible implementation, the apparatus further includes:
the second acquisition module is used for acquiring a first hit rate of the first reference shared cache channel aiming at the first identification information and a second hit rate of the first common shared cache channel aiming at the first identification information;
And the adjusting module is used for adjusting the number of the shared cache groups allocated to the first identification information according to the first hit rate and the second hit rate.
In one possible implementation, the adjusting module is configured to:
determining a ratio of the first hit rate to the second hit rate;
responsive to the ratio being greater than or equal to a first preset threshold, increasing a number of shared cache groups allocated to the first identification information in the plurality of shared cache channels; or, in response to the ratio being less than or equal to a second preset threshold, reducing the number of shared cache groups allocated to the first identification information in the plurality of shared cache channels; the first preset threshold is greater than the second preset threshold, and the first preset threshold and the second preset threshold are both greater than 1.
In one possible implementation, the apparatus further includes:
the third acquisition module is used for acquiring a first request address corresponding to the first data request;
a fourth obtaining module, configured to obtain a group mask, a group offset, and a flag bit offset corresponding to the first identification information in response to determining that a cache miss occurs in the local cache according to the first request address;
A determining module, configured to determine new group bits and new tag information corresponding to the first data request according to a second request address corresponding to the first data request, the group mask, the group offset, and the flag bit offset, where the second request address is determined according to the first request address;
and the searching module is used for searching the target data according to the channel information and the in-row offset address in the second request address, the new group bit and the new tag information.
In one possible implementation, the apparatus further includes:
and the remapping module is used for remapping the first request address to obtain the second request address.
In one possible implementation, the determining module is configured to:
acquiring original group bits and original tag information from the second request address;
performing AND operation on the original group bit and the group mask to obtain the relative positions of the target data requested by the first data request in a plurality of shared cache groups corresponding to the first identification information;
determining a new group bit corresponding to the first data request according to the group offset and the relative position;
And determining new tag information corresponding to the first data request according to the original tag information, the flag bit offset and the designated bit in the original group bit.
In one possible implementation, the search module is configured to:
in response to the target data not being found according to the channel information, the new group bits, the new tag information, and the in-row offset address, acquire the target data from a memory or an external memory, write the target data into the shared cache group corresponding to the new group bits, and return the target data to the first data request.
In one possible implementation manner, the first identification information includes any one of the following:
the method comprises the steps of determining identification information according to a module called by the application, determining identification information according to context identification information, and determining identification information according to an address interval of target data requested by the first data request in a memory.
In some embodiments, functions or modules included in an apparatus provided by the embodiments of the present disclosure may be used to perform a method described in the foregoing method embodiments, and specific implementation and technical effects of the functions or modules may refer to the descriptions of the foregoing method embodiments, which are not repeated herein for brevity.
The disclosed embodiments also provide a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the above-described method. Wherein the computer readable storage medium may be a non-volatile computer readable storage medium or may be a volatile computer readable storage medium.
The disclosed embodiments also propose a computer program comprising computer readable code which, when run in an electronic device, causes a processor in the electronic device to carry out the above method.
Embodiments of the present disclosure also provide a computer program product comprising computer readable code, or a non-transitory computer readable storage medium carrying computer readable code, which when run in an electronic device, causes a processor in the electronic device to perform the above method.
The embodiment of the disclosure also provides an electronic device, including: one or more processors; a memory for storing executable instructions; wherein the one or more processors are configured to invoke the executable instructions stored by the memory to perform the above-described method.
The electronic device may be provided as a terminal, server or other form of device.
Fig. 7 illustrates a block diagram of an electronic device 1900 provided by an embodiment of the disclosure. For example, electronic device 1900 may be provided as a terminal or server. Referring to FIG. 7, electronic device 1900 includes a processing component 1922 that further includes one or more processors and memory resources represented by memory 1932 for storing instructions, such as application programs, that can be executed by processing component 1922. The application programs stored in memory 1932 may include one or more modules each corresponding to a set of instructions. Further, processing component 1922 is configured to execute instructions to perform the methods described above.
The electronic device 1900 may also include a power component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output (I/O) interface 1958. The electronic device 1900 may operate based on an operating system stored in memory 1932, such as the Microsoft server operating system (Windows Server™), the Apple graphical-user-interface-based operating system (Mac OS X™), the multi-user multi-process computer operating system (Unix™), the free and open-source Unix-like operating system (Linux™), the open-source Unix-like operating system (FreeBSD™), or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as memory 1932, including computer program instructions executable by processing component 1922 of electronic device 1900 to perform the methods described above.
The present disclosure may be a system, method, and/or computer program product. The computer program product may include a computer readable storage medium having computer readable program instructions embodied thereon for causing a processor to implement aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, mechanically encoded devices such as punch cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. Computer-readable storage media, as used herein, are not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses through fiber optic cables), or electrical signals transmitted through wires.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device, or to an external computer or external storage device, over a network such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards them for storage in a computer readable storage medium in the respective computing/processing device.
Computer program instructions for performing the operations of the present disclosure can be assembly instructions, instruction set architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the internet using an internet service provider). In some embodiments, aspects of the present disclosure are implemented by personalizing electronic circuitry, such as programmable logic circuitry, field programmable gate arrays (FPGAs), or programmable logic arrays (PLAs), with state information of computer readable program instructions, and the electronic circuitry can execute the computer readable program instructions.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The computer program product may be realized in particular by means of hardware, software or a combination thereof. In an alternative embodiment, the computer program product is embodied as a computer storage medium, and in another alternative embodiment, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK), or the like.
The foregoing descriptions of the various embodiments tend to emphasize the differences between them; for their identical or similar parts, the embodiments may be referred to one another, and for brevity these parts are not repeated herein.
If the technical solution of the embodiments of the present disclosure involves personal information, a product applying the technical solution of the embodiments of the present disclosure clearly informs users of the personal information processing rules and obtains their separate consent before processing the personal information. If the technical solution of the embodiments of the present disclosure involves sensitive personal information, a product applying the technical solution of the embodiments of the present disclosure obtains separate consent before processing the sensitive personal information and, at the same time, meets the requirement of "explicit consent". For example, a clear and prominent sign is set at a personal information acquisition device such as a camera to inform people that they are entering the personal information acquisition range and that personal information will be acquired; if a person voluntarily enters the acquisition range, the person is considered to consent to the acquisition of his or her personal information. Alternatively, on a device that processes personal information, with obvious identification/information used to inform users of the personal information processing rules, personal authorization is obtained by means of pop-up information or by requesting users to upload their personal information, or the like. The personal information processing rules may include information such as the personal information processor, the purpose of personal information processing, the processing mode, and the types of personal information to be processed.
The foregoing description of the embodiments of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the improvement of technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
Claims (14)
1. A method for allocating cache resources, wherein a processor system includes at least two levels of cache, a highest level of the at least two levels of cache is a shared cache, the shared cache includes a plurality of shared cache groups, the method comprising:
in response to a first data request from any application, acquiring first identification information carried by the first data request;
and in response to the first data request being the first received data request corresponding to the first identification information, allocating a preset number of shared cache groups to the first identification information.
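As a non-authoritative illustration of claim 1, the following sketch models the first-touch allocation: the first time a data request carrying a given piece of identification information arrives, a preset number of shared cache groups is reserved for that identifier; later requests with the same identifier reuse the existing allocation. All names (`PRESET_GROUPS`, `SharedCacheAllocator`) and the free-list policy are assumptions for illustration, not details from the patent.

```python
PRESET_GROUPS = 4  # "preset number" of shared cache groups per identifier (assumed value)

class SharedCacheAllocator:
    """Illustrative first-touch allocator of shared cache groups."""

    def __init__(self, total_groups):
        self.free_groups = list(range(total_groups))
        self.groups_by_id = {}  # identification info -> allocated group indices

    def on_request(self, ident):
        # Allocate only for the FIRST request carrying this identifier.
        if ident not in self.groups_by_id:
            take = self.free_groups[:PRESET_GROUPS]
            self.free_groups = self.free_groups[PRESET_GROUPS:]
            self.groups_by_id[ident] = take
        return self.groups_by_id[ident]
```

Under this sketch, two applications with different identifiers end up with disjoint sets of shared cache groups, which is the isolation effect the claim is after.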
2. The method of claim 1, wherein the shared cache comprises a plurality of shared cache channels;
wherein the allocating, in response to the first data request being the first received data request corresponding to the first identification information, a preset number of shared cache groups to the first identification information comprises:
in response to the first data request being the first received data request corresponding to the first identification information, respectively allocating the preset number of shared cache groups in the plurality of shared cache channels to the first identification information.
3. The method of claim 2, wherein the preset number comprises a first preset number and a second preset number, and the first preset number is less than the second preset number;
wherein the respectively allocating, in response to the first data request being the first received data request corresponding to the first identification information, a preset number of shared cache groups in the plurality of shared cache channels to the first identification information comprises:
in response to the first data request being the first received data request corresponding to the first identification information, determining a first reference shared cache channel corresponding to the first identification information from the plurality of shared cache channels, wherein the first reference shared cache channel represents the reference shared cache channel corresponding to the first identification information;
and allocating the first preset number of shared cache groups in a first common shared cache channel to the first identification information, and allocating the second preset number of shared cache groups in the first reference shared cache channel to the first identification information, wherein the first common shared cache channel represents a shared cache channel, among the plurality of shared cache channels, other than the first reference shared cache channel.
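The split described in claims 2 and 3 can be sketched as follows: each identifier receives groups in every shared cache channel, with a larger share in its reference channel. The concrete numbers and the function name are illustrative assumptions; the patent only requires that the first preset number be less than the second.

```python
FIRST_PRESET = 2   # groups per common (non-reference) channel (assumed)
SECOND_PRESET = 6  # groups in the reference channel; FIRST_PRESET < SECOND_PRESET

def allocate_across_channels(num_channels, reference_channel):
    """Return {channel index: number of shared cache groups} for one identifier."""
    return {ch: (SECOND_PRESET if ch == reference_channel else FIRST_PRESET)
            for ch in range(num_channels)}
```

Giving the reference channel a larger share while keeping a small footprint in every other channel lets the later hit-rate comparison (claims 5 and 6) measure whether the extra groups actually help.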
4. The method of claim 3, wherein the determining, in response to the first data request being the first received data request corresponding to the first identification information, a first reference shared cache channel corresponding to the first identification information from the plurality of shared cache channels comprises:
in response to the first data request being the first received data request corresponding to the first identification information, and shared cache channels that have not been determined as reference shared cache channels existing among the plurality of shared cache channels, determining the first reference shared cache channel corresponding to the first identification information from the shared cache channels that have not been determined as reference shared cache channels.
5. The method according to claim 3, wherein the method further comprises:
acquiring a first hit rate of the first reference shared cache channel for the first identification information and a second hit rate of the first common shared cache channel for the first identification information;
and adjusting the number of the shared cache groups allocated to the first identification information according to the first hit rate and the second hit rate.
6. The method of claim 5, wherein the adjusting the number of shared cache groups allocated to the first identification information according to the first hit rate and the second hit rate comprises:
determining a ratio of the first hit rate to the second hit rate;
in response to the ratio being greater than or equal to a first preset threshold, increasing the number of shared cache groups allocated to the first identification information in the plurality of shared cache channels; or, in response to the ratio being less than or equal to a second preset threshold, reducing the number of shared cache groups allocated to the first identification information in the plurality of shared cache channels; wherein the first preset threshold is greater than the second preset threshold, and the first preset threshold and the second preset threshold are both greater than 1.
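The adjustment rule of claim 6 can be sketched as a simple feedback step: compare the hit rate of the reference channel (which holds more groups for this identifier) against the common channels, and grow or shrink the allocation accordingly. The threshold values and the step size are illustrative assumptions; the claim only requires both thresholds to exceed 1 and the first to exceed the second.

```python
FIRST_THRESHOLD = 2.0   # assumed value; must be > SECOND_THRESHOLD
SECOND_THRESHOLD = 1.2  # assumed value; must be > 1

def adjust_groups(current_groups, ref_hit_rate, common_hit_rate, step=1):
    """Adjust the identifier's shared-cache-group count from the hit-rate ratio."""
    ratio = ref_hit_rate / common_hit_rate
    if ratio >= FIRST_THRESHOLD:
        # The channel with more groups hits far more often: extra capacity pays off.
        return current_groups + step
    if ratio <= SECOND_THRESHOLD:
        # Extra groups barely improve the hit rate: reclaim some capacity.
        return current_groups - step
    return current_groups  # within the dead band: leave the allocation alone
```

Requiring both thresholds to exceed 1 creates a dead band that avoids oscillating the allocation when the two hit rates are close.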
7. The method according to claim 1, wherein the method further comprises:
acquiring a first request address corresponding to the first data request;
in response to determining, according to the first request address, that a cache miss occurs in a local cache, acquiring a group mask, a group offset and a flag bit offset corresponding to the first identification information;
determining new group bits and new tag information corresponding to the first data request according to a second request address corresponding to the first data request, the group mask, the group offset and the flag bit offset, wherein the second request address is determined according to the first request address;
and searching for target data according to the channel information and the in-row offset address in the second request address, the new group bits and the new tag information.
8. The method of claim 7, wherein the method further comprises:
and remapping the first request address to obtain the second request address.
9. The method of claim 7, wherein the determining new group bits and new tag information corresponding to the first data request according to the second request address corresponding to the first data request, the group mask, the group offset and the flag bit offset comprises:
acquiring original group bits and original tag information from the second request address;
performing an AND operation on the original group bits and the group mask to obtain the relative position, in the plurality of shared cache groups corresponding to the first identification information, of the target data requested by the first data request;
determining the new group bits corresponding to the first data request according to the group offset and the relative position;
and determining the new tag information corresponding to the first data request according to the original tag information, the flag bit offset and designated bits in the original group bits.
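The steps of claim 9 amount to bit manipulation on the request address. The sketch below masks the original group bits to get the request's relative position inside the identifier's allocated groups, rebases it with the identifier's group offset, and folds the displaced group bits into the tag at the flag-bit offset. The field widths and the exact folding scheme (`displaced << flag_bit_offset` ORed into the tag) are assumptions for illustration; the patent does not spell them out.

```python
def remap(orig_group, orig_tag, group_mask, group_offset, flag_bit_offset):
    """Illustrative remapping of group bits and tag per claim 9 (assumed scheme)."""
    # AND the original group bits with the group mask: relative position of the
    # target data within the groups allocated to this identifier.
    relative = orig_group & group_mask
    # New group bits = the identifier's base group offset plus that position.
    new_group = group_offset + relative
    # Fold the group bits NOT covered by the mask into the tag, so the original
    # address stays recoverable when comparing tags (assumed folding).
    displaced = orig_group & ~group_mask
    new_tag = orig_tag | (displaced << flag_bit_offset)
    return new_group, new_tag
```

For example, with a 2-bit group mask, an original group `0b1011` keeps its low two bits as the relative position and contributes its high bits to the tag.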
10. The method of claim 7, wherein the searching for the target data according to the channel information and the in-row offset address in the second request address, the new group bits and the new tag information comprises:
in response to the target data not being found according to the channel information, the new group bits, the new tag information and the in-row offset address, acquiring the target data from a memory or an external memory, writing the target data into the shared cache group corresponding to the new group bits, and returning the target data to the first data request.
11. The method according to any one of claims 1 to 10, wherein the first identification information comprises any one of:
identification information determined according to a module called by the application, identification information determined according to context identification information, and identification information determined according to an address interval, in a memory, of the target data requested by the first data request.
12. A cache resource allocation apparatus, wherein a processor system comprises at least two levels of cache, a highest level of the at least two levels of cache being a shared cache, the shared cache comprising a plurality of shared cache sets, the apparatus comprising:
the first acquisition module is configured to, in response to a first data request from any application, acquire first identification information carried by the first data request;
the allocation module is configured to, in response to the first data request being the first received data request corresponding to the first identification information, allocate a preset number of shared cache groups to the first identification information.
13. An electronic device, comprising:
one or more processors;
a memory for storing executable instructions;
wherein the one or more processors are configured to invoke the executable instructions stored in the memory to perform the method of any one of claims 1 to 11.
14. A computer-readable storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by a processor, implement the method of any one of claims 1 to 11.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311076545.6A CN117093371B (en) | 2023-02-23 | 2023-02-23 | Cache resource allocation method and device, electronic equipment and storage medium |
CN202310153348.3A CN116010109B (en) | 2023-02-23 | 2023-02-23 | Cache resource allocation method and device, electronic equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310153348.3A CN116010109B (en) | 2023-02-23 | 2023-02-23 | Cache resource allocation method and device, electronic equipment and storage medium |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311076545.6A Division CN117093371B (en) | 2023-02-23 | 2023-02-23 | Cache resource allocation method and device, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116010109A true CN116010109A (en) | 2023-04-25 |
CN116010109B CN116010109B (en) | 2023-07-04 |
Family
ID=86037526
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311076545.6A Active CN117093371B (en) | 2023-02-23 | 2023-02-23 | Cache resource allocation method and device, electronic equipment and storage medium |
CN202310153348.3A Active CN116010109B (en) | 2023-02-23 | 2023-02-23 | Cache resource allocation method and device, electronic equipment and storage medium |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311076545.6A Active CN117093371B (en) | 2023-02-23 | 2023-02-23 | Cache resource allocation method and device, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (2) | CN117093371B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110072217A1 (en) * | 2009-09-18 | 2011-03-24 | Chi Hoang | Distributed Consistent Grid of In-Memory Database Caches |
US20110225372A1 (en) * | 2009-04-27 | 2011-09-15 | Lsi Corporation | Concurrent, coherent cache access for multiple threads in a multi-core, multi-thread network processor |
CN103562897A (en) * | 2011-06-10 | 2014-02-05 | 国际商业机器公司 | Store storage class memory information command |
CN114217861A (en) * | 2021-12-06 | 2022-03-22 | 海光信息技术股份有限公司 | Data processing method and device, electronic device and storage medium |
CN114928652A (en) * | 2022-04-29 | 2022-08-19 | 高德软件有限公司 | Map data transmission method, map data transmission device, electronic apparatus, storage medium, and program |
CN115052042A (en) * | 2022-06-07 | 2022-09-13 | 成都北中网芯科技有限公司 | Method for realizing high-performance multi-channel shared cache |
CN115098169A (en) * | 2022-06-24 | 2022-09-23 | 海光信息技术股份有限公司 | Capacity sharing-based instruction calling method and device |
CN115357196A (en) * | 2022-08-31 | 2022-11-18 | 鹏城实验室 | Dynamically expandable set-associative cache method, apparatus, device and medium |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102270180B (en) * | 2011-08-09 | 2014-04-02 | 清华大学 | Multicore processor cache and management method thereof |
US10002076B2 (en) * | 2015-09-29 | 2018-06-19 | Nxp Usa, Inc. | Shared cache protocol for parallel search and replacement |
CN106909515B (en) * | 2017-02-11 | 2020-09-18 | 苏州浪潮智能科技有限公司 | Multi-core shared last-level cache management method and device for mixed main memory |
US10789175B2 (en) * | 2017-06-01 | 2020-09-29 | Mellanox Technologies Ltd. | Caching policy in a multicore system on a chip (SOC) |
CN109857681B (en) * | 2017-11-30 | 2023-07-18 | 华为技术有限公司 | Cache address mapping method and related equipment |
US11086777B2 (en) * | 2019-04-01 | 2021-08-10 | Arm Limited | Replacement of cache entries in a set-associative cache |
CN112148665B (en) * | 2019-06-28 | 2024-01-09 | 深圳市中兴微电子技术有限公司 | Cache allocation method and device |
US10949352B1 (en) * | 2020-03-05 | 2021-03-16 | Nxp Usa, Inc. | Data processing system having a shared cache |
US11481332B1 (en) * | 2021-05-07 | 2022-10-25 | Ventana Micro Systems Inc. | Write combining using physical address proxies stored in a write combine buffer |
US11593109B2 (en) * | 2021-06-07 | 2023-02-28 | International Business Machines Corporation | Sharing instruction cache lines between multiple threads |
CN115061972B (en) * | 2022-07-05 | 2023-10-13 | 摩尔线程智能科技(北京)有限责任公司 | Processor, data read-write method, device and storage medium |
CN115168247B (en) * | 2022-09-02 | 2022-12-02 | 北京登临科技有限公司 | Method for dynamically sharing memory space in parallel processor and corresponding processor |
CN117093371B (en) * | 2023-02-23 | 2024-05-17 | 摩尔线程智能科技(北京)有限责任公司 | Cache resource allocation method and device, electronic equipment and storage medium |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117093371A (en) * | 2023-02-23 | 2023-11-21 | 摩尔线程智能科技(北京)有限责任公司 | Cache resource allocation method and device, electronic equipment and storage medium |
CN117093371B (en) * | 2023-02-23 | 2024-05-17 | 摩尔线程智能科技(北京)有限责任公司 | Cache resource allocation method and device, electronic equipment and storage medium |
CN116521095A (en) * | 2023-07-03 | 2023-08-01 | 摩尔线程智能科技(北京)有限责任公司 | Response output system, method, electronic device, storage medium, and program product |
CN116521095B (en) * | 2023-07-03 | 2023-09-08 | 摩尔线程智能科技(北京)有限责任公司 | Response output system, method, electronic device, storage medium, and program product |
Also Published As
Publication number | Publication date |
---|---|
CN117093371B (en) | 2024-05-17 |
CN117093371A (en) | 2023-11-21 |
CN116010109B (en) | 2023-07-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN116010109B (en) | Cache resource allocation method and device, electronic equipment and storage medium | |
US10152501B2 (en) | Rollover strategies in a n-bit dictionary compressed column store | |
US10572378B2 (en) | Dynamic memory expansion by data compression | |
TWI559217B (en) | Dynamic cache and memory allocation for memory subsystems | |
US10769073B2 (en) | Bandwidth-based selective memory channel connectivity on a system on chip | |
CN107003940B (en) | System and method for providing improved latency in non-uniform memory architectures | |
EP3249539B1 (en) | Method and device for accessing data visitor directory in multi-core system | |
CN111949681A (en) | Data aggregation processing device and method and storage medium | |
US11567661B2 (en) | Virtual memory management method and processor | |
US8935508B1 (en) | Implementing pseudo content access memory | |
CN107111560B (en) | System and method for providing improved latency in non-uniform memory architectures | |
US10997077B2 (en) | Increasing the lookahead amount for prefetching | |
CN111026680B (en) | Data processing system, circuit and method | |
CN116107926B (en) | Cache replacement policy management method, device, equipment, medium and program product | |
CN113805845A (en) | Random number sequence generation method and random number engine | |
CN108196786B (en) | Method and management device for storage system partitioning | |
CN112839071A (en) | Training system, training data access method and device, electronic device and medium | |
CN116166575B (en) | Method, device, equipment, medium and program product for configuring access segment length | |
CN117539636A (en) | Memory management method and device for bus module, electronic equipment and storage medium | |
CN117742957A (en) | Memory allocation method, memory allocation device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||