WO2015131395A1 - Cache, shared cache management method, and controller - Google Patents

Cache, shared cache management method, and controller

Info

Publication number
WO2015131395A1
Authority
WO
WIPO (PCT)
Prior art keywords
cache
cache block
priority
cores
block
Prior art date
Application number
PCT/CN2014/073052
Other languages
English (en)
Chinese (zh)
Inventor
郑礼炳
李景超
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 filed Critical 华为技术有限公司
Priority to PCT/CN2014/073052 priority Critical patent/WO2015131395A1/fr
Priority to CN201480000331.3A priority patent/CN105359116B/zh
Publication of WO2015131395A1 publication Critical patent/WO2015131395A1/fr

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02Addressing or allocation; Relocation
    • G06F12/08Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/084Multiuser, multiprocessor or multiprocessing cache systems with a shared cache

Definitions

  • The present invention relates to the field of computers, and in particular to a cache, a shared cache management method, and a controller. Background Art
  • In a multi-core system, a portion of the cache blocks in the shared cache is usually allocated to each core. When one of the cores misses on a cache access (a read or a write) and performs a cache block refill, the cache block to be replaced is determined from the cache blocks allocated to that core, and the original data in the block to be replaced is replaced with the data to be read or written.
  • the inventor has found that at least the following problems exist in the prior art:
  • A core can only determine the cache block to be replaced from its own allocated portion of the cache blocks. In practical applications it may happen that the cache blocks allocated to one core are frequently reused while the cache blocks allocated to other cores sit idle for long periods, resulting in low utilization of the shared cache and degraded system performance. Summary of the Invention
  • The present invention provides a cache, a shared cache management method, and a controller, in order to solve the prior-art problem of low shared-cache utilization caused by each core being able to determine the cache block to be replaced only from its own portion of the cache blocks.
  • the technical solution is as follows:
  • In a first aspect, a cache is provided, where the cache includes:
  • a cache unit, a status register, a priority calculation unit, and a controller;
  • the cache unit is connected to the status register and the controller respectively; the status register is connected to the cache unit and the priority calculation unit respectively; the priority calculation unit is connected to the status register and the controller respectively; the cache unit includes a shared cache and N shadow tags, the N shadow tags respectively corresponding to N cores of the processor, where N ≥ 2 and N is an integer;
  • the status register is configured to record first access information of the N cores to the cache unit, where the first access information includes: the number of times the shared cache is accessed, the number of cache blocks occupied in the shared cache, the number of access hits in the shared cache, and the number of access hits in the shadow tag;
  • the priority calculation unit is configured to calculate the respective replacement priorities of the N cores according to the first access information of the N cores recorded in the status register; the replacement priority is used to indicate the priority with which the cache blocks occupied by the corresponding core are replaced;
  • the controller is configured to, when an access to the shared cache misses and a cache block refill operation is performed, acquire the respective replacement priorities of the N cores from the priority calculation unit, and determine the cache block to be replaced from the cache blocks of the shared cache currently occupied by the core with the highest replacement priority among the N cores.
  • the controller is further configured to assign each of the N cores a respective target cache occupancy according to a performance target, the performance target including at least one of overall hit rate maximization, fairness, or quality of service;
  • the status register is further connected to the controller
  • the status register is further configured to record second access information corresponding to each cache block in the shared cache, where the second access information includes the number of times the block is occupied by each of the N cores;
  • the controller is configured to: when an access to the shared cache misses and a cache block refill operation is performed, obtain, from the status register, the second access information corresponding to each cache block currently occupied by the core with the highest replacement priority, and determine the cache block to be replaced according to the second access information corresponding to each cache block currently occupied by that core.
  • In conjunction with the second possible implementation of the first aspect, in a third possible implementation manner of the first aspect, the controller is configured to determine the cache block to be replaced from the second type of cache block according to a replacement algorithm.
  • A second aspect provides a shared cache management method, for use in the cache according to the first aspect or any possible implementation of the first aspect, where the method includes: when an access to the shared cache misses and a cache block refill operation is performed, acquiring the respective replacement priorities of the N cores of the processor from the priority calculation unit; the replacement priority is used to indicate the priority with which the cache blocks occupied by the corresponding core are replaced;
  • determining the cache block to be replaced from the cache blocks of the shared cache currently occupied by the core with the highest replacement priority.
  • the method includes:
  • Each of the N cores is assigned a respective target cache occupancy according to a performance goal, the performance target including at least one of overall hit rate maximization, fairness, or quality of service;
  • the status register is further configured to record second access information corresponding to each cache block in the shared cache, where the second access information includes the number of times the block is occupied by each of the N cores; determining the cache block to be replaced from the cache blocks of the shared cache currently occupied by the core with the highest replacement priority among the N cores includes: acquiring, from the status register, the second access information corresponding to each cache block currently occupied by the core with the highest replacement priority, and determining the cache block to be replaced according to that second access information, including:
  • determining the cache block to be replaced from the second type of cache block according to a replacement algorithm.
  • the third aspect provides a controller, which is used in the buffer according to the foregoing first aspect or any possible implementation manner of the first aspect, wherein the controller includes:
  • a first obtaining module configured to acquire, from the priority calculation unit, the respective replacement priorities of the N cores of the processor when an access to the shared cache misses and a cache block refill operation is performed; the replacement priority is used to indicate the priority with which the cache blocks occupied by the corresponding core are replaced;
  • a determining module configured to determine the cache block to be replaced from the cache blocks of the shared cache currently occupied by the core with the highest replacement priority among the N cores.
  • the controller includes:
  • An allocation module configured to allocate a respective target cache occupancy to the N cores according to a performance target, where the performance target includes at least one of overall hit rate maximization, fairness, or quality of service;
  • a detecting module configured to obtain the actual cache occupancy of the core with the highest replacement priority and detect whether the actual cache occupancy is greater than that core's target cache occupancy;
  • control module configured to: if the detection result is that the actual cache occupancy is not greater than the target cache occupancy, control the priority calculation unit to recalculate respective replacement priorities of the N cores.
  • the determining module includes:
  • an obtaining unit configured to acquire, from the status register, the second access information corresponding to each cache block currently occupied by the core with the highest replacement priority;
  • a determining unit configured to determine, according to the second access information corresponding to each cache block that is currently occupied by the kernel with the highest replacement priority, the cache block to be replaced;
  • the status register is further configured to record the second access information corresponding to each cache block in the shared cache, where the second access information includes a number of times occupied by the N cores.
  • the determining unit includes:
  • a first determining subunit configured to determine a first type of cache block from the cache blocks currently occupied by the core with the highest replacement priority, where the first type of cache block is the cache block least occupied by the core with the highest replacement priority;
  • a second determining subunit configured to determine a second type of cache block from the first type of cache block, where the second type of cache block is a cache block having the least total number of times occupied by the N cores;
  • a third determining subunit configured to determine, according to the replacement algorithm, the cache block to be replaced from the second type cache block.
  • The status register records the first access information of each of the N cores to the cache unit, the priority calculation unit calculates the respective replacement priorities of the N cores according to the first access information recorded by the status register, and the controller determines the cache block to be replaced in the shared cache according to the respective replacement priorities of the N cores. This solves the prior-art problem that a core can only determine the cache block to be replaced from its own portion of the cache blocks, which leaves shared-cache utilization low, and thereby achieves the effect of improving shared-cache utilization and system performance.
  • FIG. 1 is a schematic structural diagram of a buffer provided by an embodiment of the present invention.
  • FIG. 2 is a schematic structural diagram of a buffer according to another embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of a cache unit according to another embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a status register according to another embodiment of the present invention.
  • FIG. 5 is a flowchart of a shared cache management method according to an embodiment of the present invention;
  • FIG. 6 is a flowchart of a shared cache management method according to another embodiment of the present invention;
  • FIG. 7 is a schematic structural diagram of a controller according to an embodiment of the present invention;
  • FIG. 8 is a schematic structural diagram of a controller according to another embodiment of the present invention. Detailed Description
  • FIG. 1 is a schematic structural diagram of a buffer provided by an embodiment of the present invention.
  • the buffer may include: a cache unit 102, a status register 104, a priority calculation unit 106, and a controller 108;
  • the cache unit 102 is connected to the status register 104 and the controller 108 respectively; the status register 104 is connected to the cache unit 102 and the priority calculation unit 106 respectively; the priority calculation unit 106 is connected to the status register 104 and the controller 108 respectively; the cache unit 102 includes a shared cache 1022 and N shadow tags 1024, respectively corresponding to N cores of the processor; N ≥ 2, and N is an integer;
  • the status register 104 is configured to record first access information of the N cores to the cache unit 102, where the first access information includes: the number of times the shared cache 1022 is accessed, the number of cache blocks occupied in the shared cache 1022, the number of access hits in the shared cache 1022, and the number of access hits in the shadow tags 1024;
  • the priority calculation unit 106 is configured to calculate the respective replacement priorities of the N cores according to the first access information of the N cores recorded by the status register 104; the replacement priority is used to indicate the priority with which the cache blocks occupied by the corresponding core are replaced;
  • the controller 108 is configured to, when an access to the shared cache misses and a cache block refill operation is performed, acquire the respective replacement priorities of the N cores from the priority calculation unit, and determine the cache block to be replaced from the cache blocks of the shared cache currently occupied by the core with the highest replacement priority among the N cores.
  • In this way, the status register records each core's access information for the shared cache, the priority calculation unit calculates each core's replacement priority from that access information, and when the controller performs a cache block refill operation it determines the cache block to be replaced according to each core's replacement priority. The victim block can therefore be chosen according to each core's actual access behavior toward the shared cache, improving shared-cache utilization.
  • The buffer provided by this embodiment of the present invention records, through the status register, the first access information of each of the N cores to the cache unit; the priority calculation unit calculates the respective replacement priorities of the N cores according to the first access information recorded in the status register; and the controller determines the cache block to be replaced in the shared cache according to those replacement priorities. This solves the prior-art problem that a core can only determine the cache block to be replaced from its own portion of the cache blocks, leaving shared-cache utilization low, and achieves the effect of improving shared-cache utilization and system performance.
  • FIG. 2 is a schematic structural diagram of a buffer provided by another embodiment of the present invention.
  • the buffer can be applied to a multi-core system.
  • the buffer may include: a cache unit 202, a status register 204, a priority calculation unit 206, and a controller 208;
  • the cache unit 202 is connected to the status register 204 and the controller 208 respectively; the status register 204 is connected to the cache unit 202 and the priority calculation unit 206 respectively; the priority calculation unit 206 is connected to the status register 204 and the controller 208 respectively; the cache unit 202 includes a shared cache 2022 and N shadow tags 2024, respectively corresponding to N cores of the processor; N ≥ 2, and N is an integer;
  • As shown in FIG. 3, the processor includes four cores.
  • the cache unit includes a shared cache and four shadow tags;
  • the shared cache includes n cache blocks.
  • Each shadow tag also contains n storage units, and each cache block in the shared cache corresponds to one storage unit in each shadow tag.
  • Each cache block in the shared cache contains four fields: the ID (identity, identification number) of the core currently occupying the block, a valid flag, tag information, and data; each storage unit in a shadow tag contains two fields: a valid flag and tag information.
  • When a core accesses the cache, the corresponding cache block and shadow-tag storage unit are indexed by the low-order bits of the 64-bit address, and the valid flag and tag information are read from the indexed entry. If the valid flag is set and the tag information matches the high-order bits of the 64-bit address, the current access is determined to be a hit; otherwise, it is determined to be a miss.
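  • The hit check above can be sketched as follows. This is an illustrative sketch only: the entry layout, field names, and index width are assumptions, not taken from the patent.

```python
# Illustrative sketch of the address-split hit check described above.
# NUM_SETS and INDEX_BITS are assumed values for demonstration.

NUM_SETS = 256     # assumed number of cache blocks / shadow-tag entries
INDEX_BITS = 8     # low-order bits of the 64-bit address select the entry

def split_address(addr):
    """Split a 64-bit address into an index (low-order bits) and a tag
    (high-order bits)."""
    index = addr & ((1 << INDEX_BITS) - 1)
    tag = addr >> INDEX_BITS
    return index, tag

def is_hit(entries, addr):
    """An access hits when the indexed entry's valid flag is set and its
    stored tag matches the high-order part of the address."""
    index, tag = split_address(addr)
    entry = entries[index]
    return entry["valid"] and entry["tag"] == tag
```

The same check applies to both the shared cache's tag field and a shadow tag's storage unit, since both hold a valid flag and tag information.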
  • the status register 204 is configured to record first access information of the N cores to the cache unit 202, where the first access information includes: the number of times the shared cache 2022 is accessed, the number of cache blocks occupied in the shared cache 2022, the number of access hits in the shared cache 2022, and the number of access hits in the shadow tags 2024. When a core accesses the shared cache, the status register can record the core's hits on the shared cache and the shadow tags in real time using counters. In the configuration of the status register shown in FIG. 4, four counters are maintained for each core: an access counter (Cnt_Ac), a cache occupancy counter (Cnt_Size), a cache hit counter (Cnt_Shared_Hit), and a shadow-tag hit counter (Cnt_Shadow_Hit). Cnt_Ac records the number of times a core accesses the shared cache, Cnt_Size records the number of cache blocks the core occupies in the shared cache, Cnt_Shared_Hit records the number of the core's shared-cache access hits, and Cnt_Shadow_Hit records the number of the core's shadow-tag access hits. Taking core i's accesses to the shared cache and shadow tags as an example:
  • each counter is updated as follows: when processor core i accesses the shared cache (a read or write operation), if the shared cache is hit, Cnt_Ac(i) and Cnt_Shared_Hit(i) are each incremented by 1; if the shadow tag is hit, Cnt_Shadow_Hit(i) is incremented by 1. If the shared cache is missed and a free cache block exists, Cnt_Ac(i) and Cnt_Size(i) are each incremented by 1. If there is no free cache block, Cnt_Ac(i) is incremented by 1, the cache block to be replaced is determined from the cache blocks occupied by the core A with the highest replacement priority, a write-back operation is executed, and Cnt_Size(A) is decremented by 1 (with Cnt_Size(i) incremented accordingly, so if core A is core i, Cnt_Size(A) remains unchanged). If the shadow tag is missed, Cnt_Shadow_Hit(i) is unchanged.
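  • The counter updates above can be sketched as follows. The counter names follow the text; the `Counters` class and its method signatures are illustrative assumptions, and the increment of core i's occupancy on a takeover is implied by the "unchanged when A is i" rule rather than stated explicitly.

```python
# Sketch of the per-core counter updates described above (names assumed).
from collections import defaultdict

class Counters:
    def __init__(self):
        self.ac = defaultdict(int)           # Cnt_Ac: shared-cache accesses
        self.size = defaultdict(int)         # Cnt_Size: blocks occupied
        self.shared_hit = defaultdict(int)   # Cnt_Shared_Hit: cache hits
        self.shadow_hit = defaultdict(int)   # Cnt_Shadow_Hit: shadow-tag hits

    def on_shared_access(self, i, hit, free_block=False, victim_core=None):
        """Core i accesses the shared cache (read or write)."""
        self.ac[i] += 1
        if hit:
            self.shared_hit[i] += 1
        elif free_block:
            # Miss with a free block: core i occupies one more block.
            self.size[i] += 1
        elif victim_core is not None and victim_core != i:
            # Miss with no free block: a block of the highest-replacement-
            # priority core A (victim_core) is written back and taken over
            # by core i; if A == i, Cnt_Size is unchanged.
            self.size[victim_core] -= 1
            self.size[i] += 1

    def on_shadow_access(self, i, hit):
        """Shadow-tag accesses only bump the shadow hit counter on a hit."""
        if hit:
            self.shadow_hit[i] += 1
```

A short run through the three miss/hit cases shows the bookkeeping: a hit bumps `ac` and `shared_hit`; a miss with a free block bumps `ac` and `size`; a miss with a victim moves one unit of `size` from the victim core to the requester.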
  • the priority calculation unit 206 is configured to calculate the respective replacement priorities of the N cores according to the first access information of the N cores recorded by the status register 204; the replacement priority is used to indicate the priority with which the cache blocks occupied by the corresponding core are replaced;
  • the priority calculation unit 206 may calculate each core's replacement probability according to each core's first access information and a replacement-probability calculation model, forming a probability distribution in which the core with the highest replacement probability has the highest replacement priority.
  • the replacement probability calculation model is as follows:
  • Suppose the number of access misses in an interval is W, and within these W misses the proportion attributable to core i is Mi; then the number of core i's access misses in this interval is Mi × W. If none of the cache blocks occupied by core i at the beginning of the interval are replaced within it, then at the end of the interval the cache occupancy ratio of core i becomes Ci + (Mi × W)/m, where m is the total number of cache blocks and Ci is the proportion of cache blocks occupied by core i at the beginning of the interval.
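  • The projected-occupancy quantity above can be computed directly. Note that the page reproduces only the quantities W, Mi, m, and Ci, not the final probability formula, so the normalization step in this sketch (larger projected occupancy maps to larger replacement probability, normalized to sum to 1) is an illustrative assumption.

```python
# Sketch of the replacement-probability model. The projected_occupancy
# formula is from the text; the normalisation rule in
# replacement_probabilities is an assumption for illustration.

def projected_occupancy(c_i, m_i, w, m):
    """Occupancy ratio of core i at the end of the interval if none of its
    blocks are replaced: Ci + (Mi * W) / m."""
    return c_i + (m_i * w) / m

def replacement_probabilities(c, miss_share, w, m):
    """Assumed rule: the larger a core's projected occupancy, the larger
    its replacement probability; values are normalised into a distribution."""
    proj = [projected_occupancy(c[i], miss_share[i], w, m)
            for i in range(len(c))]
    total = sum(proj)
    return [p / total for p in proj]
```

Under this rule, the core whose occupancy would grow the most without intervention receives the highest replacement probability, and therefore the highest replacement priority.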
  • the controller 208 is configured to, when an access to the shared cache misses and a cache block refill operation is performed, acquire the respective replacement priorities of the N cores from the priority calculation unit, and determine the cache block to be replaced from the cache blocks of the shared cache currently occupied by the core with the highest replacement priority among the N cores.
  • the controller may determine the cache block to be replaced from the cache blocks of the shared cache currently occupied by the core with the highest replacement priority according to an LRU (Least Recently Used) replacement algorithm.
  • the controller may determine the cache block to be replaced according to other replacement algorithms, which is not specifically limited in this embodiment.
  • the controller 208 is further configured to: allocate a respective target cache occupancy to each of the N cores according to a performance target, where the performance target includes at least one of overall hit rate maximization, fairness, or quality of service; obtain the actual cache occupancy of the core with the highest replacement priority; detect whether the actual cache occupancy is greater than the target cache occupancy; and if the detection result is that the actual cache occupancy is not greater than the target cache occupancy, control the priority calculation unit to recalculate the respective replacement priorities of the N cores.
  • In a specific implementation, the controller may first allocate a target cache occupancy to each core according to performance targets (such as overall hit rate maximization, fairness, or quality of service). The controller then acquires the first access information from the cache unit and obtains from it the actual cache occupancy of the core with the highest replacement priority (which can be determined from the number of cache blocks that core occupies in the shared cache). If the actual cache occupancy is not greater than the target cache occupancy, the controller may send a control instruction to the priority calculation unit to control it to recalculate the respective replacement priorities of the N cores.
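  • The occupancy check that gates the refill can be sketched as below. The function and parameter names are illustrative assumptions; only the comparison rule (recalculate when actual occupancy does not exceed the target) comes from the text.

```python
# Sketch of the controller's occupancy check described above
# (names are illustrative assumptions).

def refill_check(actual_occupancy, target_occupancy, recalculate):
    """Return True if replacement may proceed with the current priorities,
    False if the priorities must first be recalculated."""
    if actual_occupancy <= target_occupancy:
        # The chosen victim core is not over its allocation, so evicting
        # one of its blocks would work against the performance target:
        # ask the priority calculation unit to recalculate instead.
        recalculate()
        return False
    return True
```

The check ensures a core is only victimized while it actually holds more of the shared cache than its performance-target allocation.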
  • the status register 204 is further connected to the controller 208.
  • the status register 204 is further configured to record second access information corresponding to each cache block in the shared cache 2022, where the second access information includes the number of times the block is occupied by each of the N cores;
  • the controller 208 is configured to: when an access to the shared cache misses and a cache block refill operation is performed, obtain, from the status register 204, the second access information corresponding to each cache block currently occupied by the core with the highest replacement priority, and determine the cache block to be replaced according to that second access information.
  • Specifically, the controller 208 is configured to: determine a first type of cache block from the cache blocks currently occupied by the core with the highest replacement priority, the first type of cache block being the cache block least occupied by that core; determine a second type of cache block from the first type of cache block, the second type of cache block being the cache block with the least total number of occupations by the N cores; and determine the cache block to be replaced from the second type of cache block.
  • Considering the number of times each cache block is occupied helps each cache block to be used evenly while still respecting reuse locality. Specifically, the status register records the number of times each cache block in the shared cache has been occupied by each core as that block's second access information. When a core subsequently misses on the shared cache and performs a cache block refill, the blocks least occupied by the core with the highest replacement priority are first determined from the blocks that core currently occupies; from those, the blocks with the least total number of occupations by all cores are determined; finally, the cache block to be replaced is determined from these blocks according to a replacement algorithm such as LRU. This method tends to select the cache block with the fewest reuses, thereby taking into account both the reuse locality of a cache block and the extent to which it is shared by all cores.
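  • The two-stage filtering above can be sketched as follows. The block representation (per-core occupation counts as the "second access information") and the final tie-break stub are illustrative assumptions; a real implementation would apply LRU among the final candidates.

```python
# Sketch of the two-stage victim selection described above.
# blocks: list of (block_id, counts) where counts[c] is the number of
# times core c has occupied the block (assumed representation).

def choose_victim(blocks, top_core):
    """top_core is the core with the highest replacement priority."""
    # First-type blocks: least occupied by the highest-priority core.
    least_own = min(counts[top_core] for _, counts in blocks)
    first_type = [b for b in blocks if b[1][top_core] == least_own]
    # Second-type blocks: of those, least occupied by all cores in total.
    least_total = min(sum(counts) for _, counts in first_type)
    second_type = [b for b in first_type if sum(b[1]) == least_total]
    # Final choice by a replacement algorithm such as LRU; as a stand-in
    # this sketch simply takes the first remaining candidate.
    return second_type[0][0]
```

For example, among blocks occupied (2, 5), (1, 1), and (1, 9) times by cores 0 and 1 respectively, with core 0 as the highest-priority victim core, the first filter keeps the last two blocks and the second filter selects the (1, 1) block.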
  • The buffer provided by this embodiment of the present invention records, through the status register, the first access information of each of the N cores to the cache unit; the priority calculation unit calculates the respective replacement priorities of the N cores according to the first access information recorded in the status register; and the controller determines the cache block to be replaced in the shared cache according to those replacement priorities. This solves the prior-art problem that a core can only determine the cache block to be replaced from its own portion of the cache blocks, leaving shared-cache utilization low, and achieves the effect of improving shared-cache utilization and system performance.
  • In addition, the buffer detects whether the actual cache occupancy of the core with the highest replacement priority exceeds that core's target cache occupancy, and when the actual occupancy is not greater than the target, controls the priority calculation unit to recalculate each core's replacement priority, thereby further improving shared-cache utilization.
  • The buffer also determines a first type of cache block from the cache blocks currently occupied by the core with the highest replacement priority (the blocks least occupied by that core), determines a second type of cache block from the first type (the blocks with the least total number of occupations by the N cores), and determines the cache block to be replaced from the second type. This takes into account both the reuse locality of a cache block and the extent to which it is shared by all cores, further improving the utilization of each cache block in the shared cache.
  • FIG. 5 is a flowchart of a shared cache management method according to an embodiment of the present invention.
  • the method can be used to control core accesses in a buffer as shown in FIG. 1 or FIG. 2.
  • the shared cache management method can include:
  • Step 302: When an access to the shared cache misses and a cache block refill operation is performed, acquire the respective replacement priorities of the N cores of the processor from the priority calculation unit; the replacement priority is used to indicate the priority with which the cache blocks occupied by the corresponding core are replaced;
  • Step 304: Determine the cache block to be replaced from the cache blocks of the shared cache currently occupied by the core with the highest replacement priority among the N cores.
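  • Steps 302 and 304 together form the refill path, which can be sketched minimally as below; the function names and data layout are illustrative assumptions, and `pick_victim` stands in for whatever replacement algorithm the embodiment uses.

```python
# Minimal sketch of the refill path of steps 302-304 (names assumed).

def handle_refill(priorities, blocks_by_core, pick_victim):
    """priorities: dict core -> replacement priority (higher means the
    core's blocks are replaced first).
    blocks_by_core: dict core -> list of blocks it currently occupies."""
    top_core = max(priorities, key=priorities.get)   # step 302
    return pick_victim(blocks_by_core[top_core])     # step 304
```

The key property is that the victim is always drawn from the blocks of the core whose replacement priority is currently highest, not from the requesting core's own allocation.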
  • The shared cache management method provided by this embodiment obtains the respective replacement priorities of the N cores from the priority calculation unit and determines the cache block to be replaced in the shared cache according to those priorities. This solves the prior-art problem that a core can only determine the cache block to be replaced from its own portion of the cache blocks, which leaves shared-cache utilization low, and achieves the effect of improving shared-cache utilization and system performance.
  • FIG. 6 is a flowchart of a shared cache management method according to another embodiment of the present invention. The method can be used to control core accesses in a buffer as shown in FIG. 1 or FIG. 2.
  • the shared cache management method can include:
  • Step 402: When an access to the shared cache misses and a cache block refill operation is performed, obtain the respective replacement priorities of the N cores of the processor from the priority calculation unit;
  • the replacement priority is used to indicate the priority with which the cache blocks occupied by the corresponding core are replaced.
  • The replacement priority is calculated by the priority calculation unit according to the first access information of each core recorded in the status register. For the specific process by which the status register records the first access information and the priority calculation unit calculates each core's replacement priority, refer to the description in the embodiment corresponding to FIG. 2; details are not repeated here.
  • Step 404: Obtain, from the status register, the second access information corresponding to each cache block currently occupied by the core with the highest replacement priority;
  • the status register is also used to record the second access information corresponding to each cache block in the shared cache; the second access information includes the number of times the block is occupied by each of the N cores.
  • Step 406: Determine, according to the second access information corresponding to each cache block currently occupied by the core with the highest replacement priority, the cache block to be replaced.
  • The controller may determine a first type of cache block from the cache blocks currently occupied by the core with the highest replacement priority, the first type of cache block being the cache block least occupied by that core; determine a second type of cache block from the first type of cache block, the second type of cache block being the cache block with the least total number of occupations by the N cores; and determine the cache block to be replaced from the second type of cache block according to a replacement algorithm.
  • each cache block is used on average to avoid the reused locality of the cache block.
  • This embodiment also determines the cache block to be replaced in combination with the number of times each cache block has been occupied. Specifically, the status register records the number of times each cache block in the shared cache has been occupied by each core as the second access information corresponding to that cache block.
  • When a subsequent core access to the shared cache misses and a cache block refill is performed, the controller first determines, among the cache blocks currently occupied by the core with the highest replacement priority, the cache blocks occupied the fewest times by that core; it then determines, among those, the cache blocks with the smallest total number of times occupied by all cores; and it finally determines the cache block to be replaced from the latter according to a replacement algorithm such as LRU.
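  • The three-stage selection just described (fewest occupations by the highest-priority core, then smallest total occupations, then a replacement algorithm such as LRU) can be sketched as follows. This is a hypothetical software model, not the hardware implementation: the `blocks` mapping and its (per-core counts, last-use time) layout are illustrative assumptions.

```python
def select_victim(blocks, top_core):
    """Pick the cache block to replace.

    blocks: {block_id: (per_core_counts, last_use_time)} for the blocks
            currently occupied by the core with the highest replacement
            priority; top_core is that core's index.
    """
    # Stage 1: blocks occupied the fewest times by the highest-priority core.
    fewest = min(counts[top_core] for counts, _ in blocks.values())
    first_type = {b: v for b, v in blocks.items() if v[0][top_core] == fewest}
    # Stage 2: among those, blocks with the smallest total occupation count.
    least_total = min(sum(counts) for counts, _ in first_type.values())
    second_type = {b: v for b, v in first_type.items() if sum(v[0]) == least_total}
    # Stage 3: break any remaining tie with a replacement algorithm (LRU here).
    return min(second_type, key=lambda b: second_type[b][1])
```

  • For example, with `blocks = {0: ([2, 5], 10), 1: ([1, 3], 7), 2: ([1, 1], 9)}` and `top_core = 0`, stage 1 keeps blocks 1 and 2 (one occupation each by core 0), stage 2 keeps block 2 (total 2 versus 4), and block 2 is replaced.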
  • Optionally, the controller allocates a respective target cache occupancy to each of the N cores according to a performance target, the performance target including at least one of overall hit rate maximization, fairness, or quality of service; obtains the actual cache occupancy of the core with the highest replacement priority; detects whether the actual cache occupancy is greater than the target cache occupancy; and, if the detection result is that the actual cache occupancy is not greater than the target cache occupancy, controls the priority calculation unit to recalculate the respective replacement priorities of the N cores.
  • Specifically, the controller may first allocate a target cache occupancy to each core according to a performance target (such as overall hit rate maximization, fairness, or quality of service). The controller further acquires the first access information from the cache unit and obtains from it the actual cache occupancy of the core with the highest replacement priority (which can be determined from the number of cache blocks in the shared cache currently occupied by that core).
  • If the actual cache occupancy is not greater than the target cache occupancy, the controller may send a control instruction to the priority calculation unit to control it to recalculate the replacement priority corresponding to each of the N cores.
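  • The occupancy check that triggers recalculation can be modeled as a small guard. This is a hedged sketch with illustrative names; the `recalculate` callback stands in for the control instruction sent to the priority calculation unit.

```python
def maybe_recalculate(actual_occupancy, target_occupancy, recalculate):
    """If the highest-replacement-priority core is not over its target
    allocation, ask the priority calculation unit to recalculate the
    replacement priorities of the N cores. Returns True if triggered."""
    if actual_occupancy <= target_occupancy:
        # Evicting from this core would push it below its performance
        # target, so redistribute replacement priority instead.
        recalculate()
        return True
    return False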
  • In summary, the shared cache management method provided by this embodiment obtains the replacement priorities of the N cores from the priority calculation unit and determines the cache block to be replaced in the shared cache according to those replacement priorities.
  • This solves the problem in the prior art that a core can only determine the cache block to be replaced from a corresponding part of the cache blocks, which leads to low utilization of the shared cache, and achieves the effect of improving shared cache utilization and system performance.
  • The method further detects the relationship between the actual cache occupancy of the core with the highest replacement priority and its target cache occupancy; when the actual cache occupancy is not greater than the target cache occupancy, the priority calculation unit is controlled to recalculate the replacement priority of each core, further improving the utilization of the shared cache.
  • The method also determines the first-type cache blocks from the cache blocks currently occupied by the core with the highest replacement priority, the first-type cache blocks being the cache blocks occupied the fewest times by that core; determines the second-type cache blocks from the first-type cache blocks, the second-type cache blocks being the cache blocks with the smallest total number of times occupied by the N cores; and determines the cache block to be replaced from the second-type cache blocks according to a replacement algorithm. This takes into account both the reuse locality of each cache block and the extent to which it is shared by all cores, further improving the utilization of each cache block in the shared cache.
  • Referring to FIG. 7, which is a schematic structural diagram of a controller according to an embodiment of the present invention, the controller can control core accesses in a cache as shown in FIG. 1 or FIG. 2. The controller can include:
  • a first obtaining module 501, configured to obtain, from the priority calculation unit, the respective replacement priorities of the N cores of the processor when an access to the shared cache misses and a cache block refill operation is performed, where a replacement priority is used to indicate the priority with which the cache blocks occupied by the corresponding core are replaced; and
  • a determining module 502, configured to determine the cache block to be replaced from the cache blocks of the shared cache currently occupied by the core with the highest replacement priority among the N cores.
  • The controller provided by this embodiment of the present invention obtains the replacement priorities of the N cores from the priority calculation unit and determines the cache block to be replaced in the shared cache according to those replacement priorities. This solves the problem in the prior art that a core can only determine the cache block to be replaced from a corresponding part of the cache blocks, which leads to low utilization of the shared cache, and achieves the effect of improving shared cache utilization and system performance.
  • Referring to FIG. 8, which is a schematic structural diagram of a controller according to another embodiment of the present invention, the controller can control core accesses in a cache as shown in FIG. 1 or FIG. 2. The controller can include:
  • a first obtaining module 601, configured to obtain, from the priority calculation unit, the respective replacement priorities of the N cores of the processor when an access to the shared cache misses and a cache block refill operation is performed, where a replacement priority is used to indicate the priority with which the cache blocks occupied by the corresponding core are replaced; and
  • a determining module 602, configured to determine the cache block to be replaced from the cache blocks of the shared cache currently occupied by the core with the highest replacement priority among the N cores.
  • Further, the controller includes:
  • an allocating module 603, configured to allocate a respective target cache occupancy to each of the N cores according to a performance target, where the performance target includes at least one of overall hit rate maximization, fairness, or quality of service;
  • a second obtaining module 604, configured to obtain the actual cache occupancy of the core with the highest replacement priority;
  • a detecting module 605, configured to detect whether the actual cache occupancy is greater than the target cache occupancy; and
  • a control module 606, configured to, if the detection result is that the actual cache occupancy is not greater than the target cache occupancy, control the priority calculation unit to recalculate the respective replacement priorities of the N cores.
  • The determining module 602 includes:
  • an obtaining unit 6021, configured to obtain, from the status register, the second access information corresponding to each cache block currently occupied by the core with the highest replacement priority; and
  • a determining unit 6022, configured to determine, according to the second access information corresponding to each cache block currently occupied by the core with the highest replacement priority, the cache block to be replaced;
  • where the status register is further configured to record the second access information corresponding to each cache block in the shared cache, and the second access information includes the number of times the cache block has been occupied by each of the N cores.
  • The determining unit 6022 includes:
  • a first determining subunit 6022a, configured to determine the first-type cache blocks from the cache blocks currently occupied by the core with the highest replacement priority, a first-type cache block being a cache block occupied the fewest times by the core with the highest replacement priority;
  • a second determining subunit 6022b, configured to determine the second-type cache blocks from the first-type cache blocks, a second-type cache block being a cache block with the smallest total number of times occupied by the N cores; and
  • a third determining subunit 6022c, configured to determine, according to a replacement algorithm, the cache block to be replaced from the second-type cache blocks.
  • The controller provided by this embodiment of the present invention obtains the replacement priorities of the N cores from the priority calculation unit and determines the cache block to be replaced in the shared cache according to those replacement priorities. This solves the problem in the prior art that a core can only determine the cache block to be replaced from a corresponding part of the cache blocks, which leads to low utilization of the shared cache, and achieves the effect of improving shared cache utilization and system performance.
  • The controller further detects the relationship between the actual cache occupancy of the core with the highest replacement priority and its target cache occupancy; when the actual cache occupancy is not greater than the target cache occupancy, the priority calculation unit is controlled to recalculate the replacement priority of each core, further improving the utilization of the shared cache.
  • The controller also determines the first-type cache blocks from the cache blocks currently occupied by the core with the highest replacement priority, the first-type cache blocks being the cache blocks occupied the fewest times by that core; determines the second-type cache blocks from the first-type cache blocks, the second-type cache blocks being the cache blocks with the smallest total number of times occupied by the N cores; and determines the cache block to be replaced from the second-type cache blocks according to a replacement algorithm. This takes into account both the reuse locality of each cache block and the extent to which it is shared by all cores, further improving the utilization of each cache block in the shared cache.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

A cache, a shared cache management method, and a controller, belonging to the field of computers. The cache comprises a cache unit, a status register, a priority calculation unit, and a controller. The status register is used to record first access information about N cores accessing the cache unit respectively. The priority calculation unit is used to calculate the respective replacement priorities of the N cores based on the first access information. The controller is used to determine a cache block to be replaced among the cache blocks in the shared cache currently occupied by the core with the highest replacement priority. The respective replacement priorities of the N cores are calculated based on the first access information recorded by the status register, and the cache block to be replaced in the shared cache is determined based on the replacement priorities. This solves the problem in the prior art that a core can only determine a cache block to be replaced from a corresponding part of the cache blocks, and achieves the effect of improving the utilization and system performance of the shared cache.
PCT/CN2014/073052 2014-03-07 2014-03-07 Cache, shared cache management method, and controller WO2015131395A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2014/073052 WO2015131395A1 (fr) 2014-03-07 2014-03-07 Cache, shared cache management method, and controller
CN201480000331.3A CN105359116B (zh) 2014-03-07 2014-03-07 Cache, shared cache management method, and controller

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2014/073052 WO2015131395A1 (fr) 2014-03-07 2014-03-07 Cache, shared cache management method, and controller

Publications (1)

Publication Number Publication Date
WO2015131395A1 true WO2015131395A1 (fr) 2015-09-11

Family

ID=54054398

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/073052 WO2015131395A1 (fr) 2014-03-07 2014-03-07 Cache, shared cache management method, and controller

Country Status (2)

Country Link
CN (1) CN105359116B (fr)
WO (1) WO2015131395A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210342461A1 (en) * 2017-09-12 2021-11-04 Sophos Limited Providing process data to a data recorder

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108614782B (zh) * 2018-04-28 2020-05-01 Shenzhen Huayang International Engineering Cost Consulting Co., Ltd. A cache access method for a data processing system
CN113505087B (zh) * 2021-06-29 2023-08-22 Institute of Computing Technology, Chinese Academy of Sciences A dynamic cache partitioning method and system considering both quality of service and utilization

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1804816A (zh) * 2004-12-29 2006-07-19 Intel Corporation Method for a programmer-controlled cache line eviction policy
CN101739299A (zh) * 2009-12-18 2010-06-16 Beijing University of Technology A dynamic fair partitioning method based on the shared cache of an on-chip multi-core processor
CN101916230A (zh) * 2010-08-11 2010-12-15 Suzhou Institute of the University of Science and Technology of China Performance optimization method for the last-level cache based on partition awareness and thread awareness
CN103150266A (zh) * 2013-02-20 2013-06-12 Beijing University of Technology An improved multi-core shared cache replacement method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB0516474D0 (en) * 2005-08-10 2005-09-14 Symbian Software Ltd Pre-emptible context switching in a computing device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1804816A (zh) * 2004-12-29 2006-07-19 Intel Corporation Method for a programmer-controlled cache line eviction policy
CN101739299A (zh) * 2009-12-18 2010-06-16 Beijing University of Technology A dynamic fair partitioning method based on the shared cache of an on-chip multi-core processor
CN101916230A (zh) * 2010-08-11 2010-12-15 Suzhou Institute of the University of Science and Technology of China Performance optimization method for the last-level cache based on partition awareness and thread awareness
CN103150266A (zh) * 2013-02-20 2013-06-12 Beijing University of Technology An improved multi-core shared cache replacement method

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210342461A1 (en) * 2017-09-12 2021-11-04 Sophos Limited Providing process data to a data recorder
US11620396B2 (en) * 2017-09-12 2023-04-04 Sophos Limited Secure firewall configurations
US11966482B2 (en) 2017-09-12 2024-04-23 Sophos Limited Managing untyped network traffic flows

Also Published As

Publication number Publication date
CN105359116B (zh) 2018-10-19
CN105359116A (zh) 2016-02-24

Similar Documents

Publication Publication Date Title
US9977623B2 (en) Detection of a sequential command stream
US9639466B2 (en) Control mechanism for fine-tuned cache to backing-store synchronization
US9280290B2 (en) Method for steering DMA write requests to cache memory
US9201796B2 (en) System cache with speculative read engine
TW200809608A (en) Method and apparatus for tracking command order dependencies
US7640399B1 (en) Mostly exclusive shared cache management policies
US10025504B2 (en) Information processing method, information processing apparatus and non-transitory computer readable medium
US9135177B2 (en) Scheme to escalate requests with address conflicts
CN105095116 (zh) Cache replacement method, cache controller, and processor
US8583873B2 (en) Multiport data cache apparatus and method of controlling the same
US9043570B2 (en) System cache with quota-based control
CN104951239B (zh) Cache drive, host bus adapter, and methods of using the same
WO2017052764A1 (fr) Memory controller for a multi-level system memory having a sectored cache
WO2014075428A1 (fr) Method and device for replacing data in a cache module
US8151058B2 (en) Vector computer system with cache memory and operation method thereof
WO2016015583A1 (fr) Memory management method and device, and memory controller
CN107592927 (zh) Managing a sectored cache
WO2015131395A1 (fr) Cache, shared cache management method, and controller
US9727474B2 (en) Texture cache memory system of non-blocking for texture mapping pipeline and operation method of texture cache memory
US10452548B2 (en) Preemptive cache writeback with transaction support
JP5699854B2 (ja) Storage control system and method, replacement scheme and method
JP2015191604A (ja) Control device, control program, and control method
US20050044321A1 (en) Method and system for multiprocess cache management
CN108920192B (zh) Method and device for implementing cache data consistency based on a distributed limited directory
JP7264806B2 (ja) System and method for identifying pendency of a memory access request at a cache entry

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 201480000331.3

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14884368

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14884368

Country of ref document: EP

Kind code of ref document: A1