WO2018161272A1 - A cache replacement method, apparatus and system - Google Patents

A cache replacement method, apparatus and system

Info

Publication number
WO2018161272A1
Authority
WO
WIPO (PCT)
Prior art keywords
cache
cache line
level
line
low
Prior art date
Application number
PCT/CN2017/075952
Other languages
English (en)
French (fr)
Inventor
于绩洋
方磊
蔡卫光
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2017/075952 priority Critical patent/WO2018161272A1/zh
Priority to CN201780023595.4A priority patent/CN109074320B/zh
Priority to EP17899829.0A priority patent/EP3572946B1/en
Publication of WO2018161272A1 publication Critical patent/WO2018161272A1/zh
Priority to US16/544,352 priority patent/US20190370187A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; Error correction; Monitoring
    • G06F 11/30 Monitoring
    • G06F 11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F 11/3037 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a memory, e.g. virtual memory, cache
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/12 Replacement control
    • G06F 12/121 Replacement control using replacement algorithms
    • G06F 12/123 Replacement control using replacement algorithms with age lists, e.g. queue, most recently used [MRU] list or least recently used [LRU] list
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F 12/0806 Multiuser, multiprocessor or multiprocessing cache systems
    • G06F 12/0811 Multiuser, multiprocessor or multiprocessing cache systems with multilevel cache hierarchies
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/12 Replacement control
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/12 Replacement control
    • G06F 12/121 Replacement control using replacement algorithms
    • G06F 12/126 Replacement control using replacement algorithms with special data handling, e.g. priority of data or instructions, handling errors or pinning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 Addressing or allocation; Relocation
    • G06F 12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/12 Replacement control
    • G06F 12/121 Replacement control using replacement algorithms
    • G06F 12/128 Replacement control using replacement algorithms adapted to multidimensional cache systems, e.g. set-associative, multicache, multiset or multilevel
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F 2201/885 Monitoring specific for caches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F 2212/10 Providing a specific technical effect
    • G06F 2212/1041 Resource optimization
    • G06F 2212/1044 Space efficiency improvement
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • The present application relates to the field of computers, and in particular to a cache replacement method, apparatus and system.
  • A CPU uses a multi-level cache to reduce the overhead (time or energy) of accessing data from primary storage (e.g., system memory).
  • The relationship between the levels of a multi-level cache is usually one of three types: inclusive, exclusive, and non-inclusive.
  • Inclusion means that each cache line in the high-level cache has a corresponding (identical) cache line in the low-level cache.
  • Exclusion means that the cache lines in the high-level cache and the cache lines in the low-level cache are mutually exclusive. Non-inclusion is a compromise between inclusion and exclusion: the cache lines in the high-level cache and the cache lines in the low-level cache partially overlap.
  • Under inclusion, the data in the high-level cache is a subset of the data in the low-level cache. If a cache line in the low-level cache is evicted, the corresponding cache line in the high-level cache is also invalidated (back invalidation) in order to maintain the inclusion relationship.
  • For example, an inclusion relationship can be adopted between the low-level cache and the high-level cache: the L2 cache contains the L1 cache.
  • If a cache line in the L1 cache is frequently accessed, it generally remains in the L1 cache for a long time and generates no cache miss.
  • However, the L2 cache is unaware of how cache lines are used in the L1 cache; the L2 cache only learns which data the L1 cache needs when the L1 cache incurs a cache miss. If a cache line is frequently accessed in the L1 cache, no cache miss occurs for it.
  • As a result, the cache line in the L2 cache that corresponds to this frequently accessed line appears unused for a long time and is evicted by the L2 replacement policy.
  • The eviction of the cache line in the L2 cache triggers the back invalidation of the corresponding cache line in the L1 cache.
  • Data that is frequently accessed in the L1 cache is thus invalidated, which causes cache misses in the L1 cache and degrades system performance.
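The back-invalidation problem described above can be reproduced with a toy software model (an illustration only, not part of the disclosure; all class and variable names are invented). Two inclusive levels each run an independent LRU policy, so a line that is hot in L1 goes stale in L2's recency order, is eventually evicted from L2, and the inclusion rule then drags the hot L1 copy out with it:

```python
from collections import OrderedDict

class InclusiveCaches:
    """Toy inclusive L1/L2 pair with independent LRU per level."""
    def __init__(self, l1_size, l2_size):
        self.l1 = OrderedDict()  # addr -> data, least recently used first
        self.l2 = OrderedDict()
        self.l1_size, self.l2_size = l1_size, l2_size
        self.back_invalidations = []

    def access(self, addr):
        if addr in self.l1:                    # L1 hit refreshes L1 recency only;
            self.l1.move_to_end(addr)          # the L2 copy's recency goes stale
            return "L1 hit"
        # L1 miss: fill from L2 (or memory), possibly evicting at both levels
        if addr not in self.l2:
            if len(self.l2) >= self.l2_size:
                victim, _ = self.l2.popitem(last=False)  # L2 evicts its LRU line...
                if victim in self.l1:                     # ...and inclusion forces a
                    del self.l1[victim]                   # back invalidation in L1
                    self.back_invalidations.append(victim)
            self.l2[addr] = True
        self.l2.move_to_end(addr)
        if len(self.l1) >= self.l1_size:
            self.l1.popitem(last=False)
        self.l1[addr] = True
        return "miss"
```

With two-entry caches, accessing A, then B, then hitting A repeatedly in L1 leaves A as the LRU line in L2; one more distinct access evicts A from L2 and back-invalidates the hottest L1 line.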
  • The present application discloses a cache replacement method, apparatus, and system that reduce the possibility that frequently accessed cache lines in the high-level cache are invalidated, by monitoring accesses to the high-level cache.
  • In a first aspect, the present application provides a cache replacement method in a computer, where the computer includes a high-level cache and a low-level cache, and the low-level cache and the high-level cache are in an inclusion relationship; that is, each cache line in the high-level cache has a corresponding identical cache line in the low-level cache.
  • The method includes: the processor selects a first cache line as the cache line to be replaced in the low-level cache and monitors whether a cache hit occurs in the high-level cache on the corresponding cache line of the first cache line. If such a hit occurs before a cache miss occurs in the low-level cache, the first cache line is retained in the low-level cache and a second cache line is selected as the cache line to be replaced.
  • The cache line to be replaced is the line that is moved out of the low-level cache when a cache miss occurs in the low-level cache, thereby making room for the missed cache line.
  • The corresponding cache line of the first cache line is a cache line identical to the first cache line; both contain the same data and correspond to the same access address.
  • A hit in the high-level cache on the corresponding cache line of the first cache line indicates that the data corresponding to the first cache line is being accessed (the first cache line and its corresponding cache line contain the same data).
  • In that case the cache line to be replaced is reselected in the low-level cache, which prevents the first cache line from being replaced in the short term, so that the corresponding cache line of the first cache line is not invalidated soon afterwards, ensuring a high cache hit rate in the high-level cache.
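A minimal sketch of the claimed selection logic, assuming a simple recency-ordered victim list (all names here are illustrative, not from the patent): the candidate victim is the low-level LRU line, and a high-level hit on it before any low-level miss causes a second line to be nominated instead.

```python
class MonitoredVictimPolicy:
    """Sketch: watch the low-level LRU victim; if its high-level copy is hit
    before a low-level miss, retain it and nominate the next-LRU line."""
    def __init__(self, lru_order):
        self.lru_order = list(lru_order)    # least recently used first
        self.candidate = self.lru_order[0]  # first cache line to be replaced

    def on_high_level_hit(self, line):
        # A hit on the monitored candidate before any low-level miss:
        # retain it and select a second cache line as the new candidate.
        if line == self.candidate:
            idx = self.lru_order.index(line)
            if idx + 1 < len(self.lru_order):  # guard for a one-line cache
                self.candidate = self.lru_order[idx + 1]
        return self.candidate
```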
  • The computer further includes a status register associated with the high-level cache, which saves the identifier of the cache line that is in the monitored state.
  • A cache line being in the monitored state indicates that the processor monitors whether that cache line incurs a cache hit in the high-level cache.
  • The method further includes: the processor writes the identifier of the first cache line into the status register.
  • The processor determines whether a hit on the corresponding cache line of the first cache line has occurred by comparing the identifier of the cache line that was hit in the high-level cache with the identifier of the first cache line held in the status register; if the two are the same, a hit has occurred on the corresponding cache line of the first cache line.
  • After the hit, the processor deletes the identifier of the first cache line recorded in the status register and terminates the monitoring of the corresponding cache line of the first cache line.
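The status-register comparison could be modeled in software roughly as follows (a hypothetical sketch; in the patent the register is hardware associated with the high-level cache, and the identifier would be something like a tag/index pair):

```python
class StatusRegister:
    """Single-entry register holding the identifier of the monitored line."""
    def __init__(self):
        self.monitored_id = None  # empty: nothing is being monitored

    def start_monitoring(self, line_id):
        # Write the identifier of the first cache line into the register.
        self.monitored_id = line_id

    def check_hit(self, hit_line_id):
        # Compare the identifier of the line hit in the high-level cache
        # with the stored identifier; a match ends the monitoring.
        if self.monitored_id is not None and hit_line_id == self.monitored_id:
            self.monitored_id = None  # delete the identifier, stop monitoring
            return True
        return False
```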
  • In another possible implementation, each cache line in the high-level cache corresponds to an indication flag. When the indication flag is in a first state (for example, 0), the corresponding cache line is not in the monitored state; when the indication flag is in a second state (for example, 1), the corresponding cache line is in the monitored state.
  • The method further includes: the processor looks up the corresponding cache line of the first cache line in the high-level cache and sets the indication flag corresponding to that cache line to the second state.
  • After a hit occurs on the corresponding cache line of the first cache line, or after the first cache line is invalidated, the processor sets the indication flag corresponding to the corresponding cache line of the first cache line to the first state and terminates the monitoring of that cache line. At any one time, only one cache line in the high-level cache is in the monitored state.
  • The indication flag can be implemented by extending the tag bits in the high-level cache.
  • The processor determines whether a hit on the corresponding cache line of the first cache line has occurred by checking whether the indication flag corresponding to the cache line that was hit is in the second state; if so, a hit has occurred on the corresponding cache line of the first cache line.
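The per-line indication flag can be sketched as a one-bit extension of the tag (again a toy software model; the patent implements this in hardware by extending the tag bits, and the names below are invented):

```python
class TaggedLine:
    """Cache line whose tag is extended with a one-bit 'monitored' flag:
    0 (first state) = not monitored, 1 (second state) = monitored."""
    def __init__(self, tag):
        self.tag = tag
        self.monitor_bit = 0

def on_cache_hit(line):
    # On a hit, check the indication flag of the hit line: if it is in the
    # second state, the monitored line was hit; clear it back to the first
    # state (monitoring ends) and report the hit.
    if line.monitor_bit == 1:
        line.monitor_bit = 0
        return True
    return False
```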
  • The method further includes: if, before a cache miss occurs in the low-level cache, no access to the corresponding cache line of the first cache line occurs in the high-level cache, then when the cache miss occurs in the low-level cache the processor moves the first cache line out of the low-level cache and moves the cache line that incurred the miss into the low-level cache.
  • After the first cache line is moved out of the low-level cache, the corresponding cache line of the first cache line in the high-level cache is invalidated.
  • If the high-level cache does not contain the corresponding cache line of the first cache line, no access to it (and thus no hit on it) can occur in the high-level cache before the cache miss occurs in the low-level cache; in that case the first cache line is likewise moved out of the low-level cache when the cache miss occurs.
  • The method further includes: after moving the first cache line out, the processor invalidates the corresponding cache line of the first cache line in the high-level cache.
  • In another possible implementation, the processor selects the first cache line as the cache line to be replaced in the low-level cache according to a least recently used (LRU) policy.
  • If a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, the method further includes: the processor updates the status of the first cache line in the low-level cache to most recently used (MRU).
  • When the corresponding cache line of the first cache line is hit in the high-level cache, the temporal and spatial locality of cache accesses suggests that the data of the first cache line is frequently accessed in the high-level cache. To avoid the corresponding cache line of the first cache line being invalidated shortly afterwards, the status of the first cache line is updated to MRU, which extends the time the first cache line remains in the low-level cache.
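The MRU update can be sketched against a recency list ordered least-recent-first (illustrative only; a hardware LRU would use age bits or a stack rather than a Python list):

```python
def promote_to_mru(lru_order, line):
    """Move a retained line to the most-recently-used end of a recency
    list (least recent first), deferring its eviction."""
    lru_order.remove(line)
    lru_order.append(line)
    return lru_order
```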
  • In a second aspect, the present application provides a cache replacement apparatus in a computer, where the computer includes a high-level cache and a low-level cache, and the low-level cache and the high-level cache are in an inclusion relationship.
  • The apparatus includes a selecting unit configured to select a first cache line as the cache line to be replaced in the low-level cache, the cache line to be replaced being the line moved out of the low-level cache when a cache miss occurs in the low-level cache.
  • The apparatus further includes a monitoring unit configured to monitor whether a hit on the corresponding cache line of the first cache line occurs in the high-level cache.
  • If such a hit occurs before a cache miss occurs in the low-level cache, the selecting unit is further configured to retain the first cache line in the low-level cache and select a second cache line as the cache line to be replaced.
  • The computer further includes a status register associated with the high-level cache, which saves the identifier of the cache line in the monitored state.
  • The apparatus also includes a writing unit; after the first cache line is selected as the cache line to be replaced in the low-level cache, the writing unit is configured to write the identifier of the first cache line into the status register.
  • The monitoring unit is configured to compare the identifier of the cache line that was hit in the high-level cache with the identifier of the first cache line saved in the status register; if the two are the same, a hit on the corresponding cache line of the first cache line has occurred.
  • Each cache line in the high-level cache corresponds to an indication flag; when the indication flag is in the first state, the corresponding cache line is not in the monitored state, and when the indication flag is in the second state, the corresponding cache line is in the monitored state.
  • The monitoring unit is further configured to look up the corresponding cache line of the first cache line in the high-level cache and set the indication flag corresponding to that cache line to the second state.
  • The apparatus further includes a writing unit. If no access to the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, the writing unit is further configured to move the first cache line out of the low-level cache when the cache miss occurs and to move the cache line that incurred the miss into the low-level cache.
  • The apparatus further includes an invalidating unit; after the writing unit moves the first cache line out of the low-level cache, the invalidating unit is configured to invalidate the corresponding cache line of the first cache line in the high-level cache.
  • The selecting unit is configured to select the first cache line in the low-level cache as the cache line to be replaced according to the least recently used (LRU) policy.
  • In a sixth possible implementation of the second aspect, if a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, the selecting unit is further configured to update the status of the first cache line in the low-level cache to most recently used (MRU).
  • The second aspect and its possible implementations are the apparatus implementations corresponding to the first aspect and its possible method implementations; the description of the first aspect or any of its possible implementations applies to the corresponding possible implementation of the second aspect, and details are not described herein again.
  • In a third aspect, the present application provides a cache replacement system including a high-level cache, a low-level cache, and a cache controller, where the low-level cache and the high-level cache are in an inclusion relationship.
  • The cache controller is configured to select a first cache line as the cache line to be replaced in the low-level cache (the cache line to be replaced being the line moved out of the low-level cache when a cache miss occurs there) and to monitor whether a hit on the corresponding cache line of the first cache line occurs in the high-level cache. If such a hit occurs before a cache miss occurs in the low-level cache, the cache controller retains the first cache line in the low-level cache and selects a second cache line as the cache line to be replaced.
  • The system further includes a status register associated with the high-level cache, which saves the identifier of the cache line in the monitored state.
  • The cache controller is further configured to write the identifier of the first cache line into the status register.
  • The cache controller is configured to compare the identifier of the cache line that was hit in the high-level cache with the identifier of the first cache line saved in the status register; if the two are the same, a hit on the corresponding cache line of the first cache line has occurred.
  • Each cache line in the high-level cache corresponds to an indication flag; when the indication flag is in the first state, the corresponding cache line is not in the monitored state, and when it is in the second state, the corresponding cache line is in the monitored state.
  • The cache controller is further configured to look up the corresponding cache line of the first cache line in the high-level cache and set the indication flag corresponding to that cache line to the second state.
  • If no access to the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, the cache controller is further configured to move the first cache line out of the low-level cache when the cache miss occurs and to move the cache line that incurred the miss into the low-level cache.
  • After moving the first cache line out of the low-level cache, the cache controller is further configured to invalidate the corresponding cache line of the first cache line in the high-level cache.
  • The cache controller is configured to select the first cache line in the low-level cache as the cache line to be replaced according to the least recently used (LRU) policy.
  • The cache controller is further configured to update the status of the first cache line in the low-level cache to most recently used (MRU).
  • The third aspect and its possible implementations are the system implementations corresponding to the first aspect and its possible method implementations; the description of the first aspect or any of its possible implementations applies to the corresponding possible implementation of the third aspect, and details are not described herein again.
  • In the technical solution of the present application, the processor monitors whether a hit on the corresponding cache line of the first cache line occurs in the high-level cache. If such a hit occurs, the corresponding cache line of the first cache line holds frequently accessed data (hot data) in the high-level cache, so the processor retains the first cache line in the low-level cache and reselects a second cache line as the cache line to be replaced. This ensures that the corresponding cache line of the first cache line, as hot data, is not invalidated in the high-level cache because of a replacement in the low-level cache; the cache hit rate of the high-level cache is raised, which improves overall system performance.
  • FIG. 1 is a schematic diagram showing the logical structure of a data processing system according to an embodiment of the invention.
  • FIG. 2 is an exemplary flowchart of a cache replacement method according to an embodiment of the invention.
  • FIG. 3 is a schematic diagram of a cache structure according to an embodiment of the invention.
  • FIG. 4 is an exemplary flowchart of a cache replacement method according to an embodiment of the invention.
  • FIG. 5 is an exemplary flowchart of a cache replacement method according to an embodiment of the invention.
  • FIG. 6 is a schematic diagram showing the logical structure of a cache replacement system according to an embodiment of the invention.
  • FIG. 7 is a schematic diagram showing the logical structure of a cache replacement device according to an embodiment of the invention.
  • FIG. 8 is a schematic diagram showing the logical structure of a cache replacement device according to an embodiment of the invention.
  • The low-level cache can select the cache line to be replaced based on the least recently used (LRU) policy.
  • Suppose a first cache line exists in the low-level cache (for example, the L2 cache) and the corresponding cache line of the first cache line exists in the high-level cache (for example, the L1 cache), where the corresponding cache line of the first cache line is a cache line identical to the first cache line: both contain the same data and correspond to the same access address.
  • If the processor frequently accesses the data corresponding to the first cache line, cache hits on the corresponding cache line of the first cache line occur frequently in the L1 cache. (The data corresponding to a cache line is the memory data the line contains; the first cache line and its corresponding cache line contain the same data.)
  • While the corresponding cache line of the first cache line is frequently hit in the L1 cache, that line remains in the L1 cache according to the LRU policy, and no access to the first cache line reaches the L2 cache. If the first cache line is not accessed in the L2 cache for a period of time, the risk that it will be replaced under the LRU replacement policy gradually increases. Once the first cache line is chosen as the cache line to be replaced in the L2 cache and a cache miss occurs there, the first cache line must be moved out of the L2 cache.
  • The corresponding cache line of the first cache line in the L1 cache is then back-invalidated. If the data corresponding to the first cache line is frequently accessed in the L1 cache, the L1 cache will soon incur a cache miss on that line, and the data corresponding to the first cache line must be written into the L2 cache and L1 cache again, which degrades overall system performance.
  • In the technical solution of this embodiment, the processor monitors whether a cache hit on the corresponding cache line of the first cache line occurs in the high-level cache. If, before the first cache line is replaced because of a cache miss in the low-level cache, a hit occurs on its corresponding cache line in the high-level cache, then by the temporal and spatial locality of cache accesses the corresponding cache line is likely to be accessed frequently in the high-level cache; the first cache line is therefore retained in the low-level cache, and a second cache line is reselected as the cache line to be replaced.
  • This avoids the back invalidation of the corresponding cache line of the first cache line, so that it is retained in the high-level cache, which keeps the overall cache hit rate high.
  • If the high-level cache contains the corresponding cache line of the first cache line but no cache hit on it occurs in the high-level cache before the cache miss occurs in the low-level cache, the corresponding cache line is not being accessed frequently in the high-level cache.
  • In that case the first cache line is replaced in the low-level cache, and the corresponding cache line of the first cache line in the high-level cache is back-invalidated.
  • Replacing the first cache line in the low-level cache means moving the first cache line out of the low-level cache and writing the cache line that incurred the cache miss into the low-level cache.
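The two outcomes of the monitoring window can be combined into one toy miss handler (hypothetical names only; `hit_seen` stands for whether the monitored high-level hit occurred before the low-level miss):

```python
def handle_low_level_miss(low_cache, high_cache, victim, miss_line, hit_seen):
    """Miss handling per the described flow (toy model): if the monitored
    victim saw no high-level hit, evict it, back-invalidate its high-level
    copy, and install the missed line in the low-level cache."""
    if hit_seen:
        return low_cache             # victim retained; caller picks a new victim
    low_cache.remove(victim)         # move the first cache line out of the low-level cache
    high_cache.discard(victim)       # back-invalidate its high-level copy, if present
    low_cache.append(miss_line)      # move the missed cache line into the low-level cache
    return low_cache
```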
  • FIG. 1 is a schematic structural diagram of a computer 100 according to an embodiment of the invention.
  • The computer can be any electronic device such as a portable computer, desktop computer, server, network device, tablet, cell phone, personal digital assistant (PDA), wearable device, or any combination thereof.
  • The data processing system 100 includes a processor 101 that is coupled to system memory 108.
  • The processor 101 can be a central processing unit (CPU), a graphics processing unit (GPU), or a digital signal processor (DSP).
  • The processor 101 can be a single-core or multi-core processor and includes two or more levels of cache.
  • The processor 101 includes a plurality of processor cores 102 (including a processor core 102-1, a processor core 102-2 and a processor core 102-N, collectively referred to as the processor cores 102) and three levels of cache: the L1 cache 103 (including L1 cache 103-1, L1 cache 103-2 and L1 cache 103-N, collectively referred to as the L1 cache 103), the L2 cache 104 (including L2 cache 104-1, L2 cache 104-2 and L2 cache 104-N, collectively referred to as the L2 cache 104), and the L3 cache 105.
  • The L1 cache 103 and the L2 cache 104 are private caches, and the L3 cache 105 is a shared cache: a private cache is exclusive to its corresponding processor core, while a shared cache can be shared by multiple processor cores.
  • The L1 cache 103 is a high-level cache relative to the L2 cache 104 and the L3 cache 105.
  • The L2 cache 104 is a high-level cache relative to the L3 cache 105.
  • The L2 cache 104 and the L3 cache 105 are low-level caches relative to the L1 cache 103, and the L3 cache 105 is a low-level cache relative to the L2 cache 104.
  • The low-level cache and the high-level cache are in an inclusion relationship: each cache line in the high-level cache also exists in the corresponding low-level cache.
  • The processor 101 may further include a cache controller 106, which is configured to select the corresponding data unit (cache line) according to the type of the request message and the address carried in the request, and to perform operations such as read, update, and fill.
  • Each level of cache can have its own control logic; that is, the cache controller 106 shown in FIG. 1 can be distributed across the caches at the different levels, or the cache structure can have one overall control logic. This is not limited in this embodiment of the present invention.
  • The cache controller 106 may be integrated as an internal component of the processor 101, or integrated into the processor core 102 with the processor core 102 implementing the functionality of the cache controller 106.
  • The processor 101 may further include a cache replacement policy module 118, which is a firmware module integrated in the processor 101; the processor 101 or the cache controller 106 executes the firmware code in the cache replacement policy module 118 to implement the technical solution of this embodiment of the present invention.
  • The cache replacement policy module 118 includes: (1) code for selecting a first cache line as the cache line to be replaced in the low-level cache; (2) code for monitoring, after the first cache line is selected as the cache line to be replaced in the low-level cache and before the first cache line is replaced because of a cache miss in the low-level cache, whether a cache hit on the corresponding cache line of the first cache line occurs in the high-level cache; and (3) code for retaining the first cache line in the low-level cache and selecting a second cache line as the cache line to be replaced if a cache hit on the corresponding cache line of the first cache line occurs before the cache miss occurs in the low-level cache.
  • The bus 113 is used to transfer information between the components of the data processing system 100.
  • The bus 113 can use a wired connection or wireless communication, which is not limited in this application.
  • The bus 113 is also connected to a secondary storage 107, an input/output interface 109, and a communication interface 110.
  • the storage medium of the auxiliary storage 107 may be a magnetic medium (for example, a floppy disk, a hard disk, a magnetic tape), an optical medium (for example, an optical disk), or a semiconductor medium (for example, a solid state disk (SSD)).
  • the auxiliary storage 107 may further include a remote memory separate from the processor 101, such as a network disk (including a network file system (NFS) or a cluster file system) accessed through the communication interface 110 and the communication network 111 using a network storage protocol.
  • the input/output interface 109 is connected to an input/output device 114 for receiving input information and outputting an operation result.
  • the input/output device 114 can be a mouse, a keyboard, a display, an optical drive, or the like.
  • the communication interface 110 uses a transceiver apparatus, such as but not limited to a transceiver, to communicate with other devices or the communication network 111, and may be interconnected with the network 111 in a wired or wireless form.
  • the network 111 can be the Internet, an intranet, a local area network (LAN), a wide area network (WAN), a storage area network (SAN), etc., or any combination of the above networks.
  • System memory 108 may include software such as operating system 115 (eg, Darwin, RTXC, LINUX, UNIX, OS X, WINDOWS, or embedded operating system (eg, Vxworks)), application 116, and cache replacement policy module 117, and the like.
  • the processor 101 or the cache controller 106 executes the cache replacement policy module 117 to implement the technical solution of the embodiment of the present invention.
  • the cache replacement policy module 117 includes: (1) code for selecting the first cache line as the cache line to be replaced in the low-level cache; (2) code for monitoring, after the first cache line is selected as the cache line to be replaced and before the first cache line is actually replaced because a cache miss occurs in the low-level cache, whether a cache hit on the corresponding cache line of the first cache line occurs in the high-level cache; and (3) code for, if a cache hit on the corresponding cache line of the first cache line occurs in the high-level cache before the cache miss occurs in the low-level cache, retaining the first cache line in the low-level cache and selecting the second cache line as the cache line to be replaced.
  • FIG. 2 is a schematic diagram of a cache structure according to an embodiment of the present invention.
  • a memory address is divided into three segments, namely, tag, index, and offset.
  • when the processor accesses data, it first searches the cache for the data corresponding to the memory address. If the data is present, the data is read directly from the cache. If there is no data corresponding to the memory address in the cache, the processor goes to the memory to obtain the data corresponding to the memory address and writes the acquired data into the cache.
  • the memory address is 32 bits (English: bit)
  • the cache capacity is 32 kilobytes (KB), organized as 64 groups with 8-way group associativity (that is, each group has 8 cache lines), and each cache line is 64 bytes in size.
  • Offset: the cache operates in units of cache lines, so the address of specific data inside a cache line needs to be determined by the offset. Since a cache line is 64 bytes, the lower 6 bits of the address can be used to represent the offset.
  • Index: the cache includes a total of 64 groups, so the middle 6 bits of the address can be used to determine which group the address belongs to.
  • Tag: when the cache is accessed, the index first determines which group the access address belongs to. After the group is determined, the tag bits of the cache lines in the group are compared with the tag bits of the access address. If a matching tag exists, the cache hits; if not, the cache misses. In FIG. 2, after the group is found according to the index, the tag bits of the 8 cache lines in the group (the remaining bits of the address, i.e., the upper 20 bits in FIG. 2) are compared with the tag bits of the access address.
  • the cache line can be uniquely determined based on the tag and index.
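The tag/index/offset split described above can be sketched in a few lines. The following is an illustrative sketch for the FIG. 2 parameters (32-bit address, 64-byte cache lines, 64 groups); the constant and function names are assumptions for illustration, not part of the patent.

```python
# Illustrative sketch of the FIG. 2 address split (names are assumptions).
OFFSET_BITS = 6  # 64-byte cache line -> lower 6 bits are the offset
INDEX_BITS = 6   # 64 groups -> middle 6 bits are the index

def split_address(addr: int):
    """Split a 32-bit memory address into (tag, index, offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)                  # lower 6 bits
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)   # middle 6 bits
    tag = addr >> (OFFSET_BITS + INDEX_BITS)                  # upper 20 bits
    return tag, index, offset
```

For example, `split_address(0xFFFFFFFF)` yields a 20-bit tag of 0xFFFFF, an index of 0x3F, and an offset of 0x3F, matching the split shown in FIG. 2.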
  • FIG. 2 merely illustrates the cache structure, and the embodiment of the present invention does not limit the actual specifications of the cache.
  • FIG. 3 is a flowchart of a cache replacement method 300.
  • the method 300 is applied to a computer system including a high-level cache and a low-level cache, where the low-level cache and the high-level cache are in an inclusion relationship; that is, the corresponding cache line of every cache line in the high-level cache also exists in the low-level cache.
  • method 300 includes:
  • S302 The processor selects the first cache line as the cache line to be replaced in the low-level cache.
  • when a cache miss occurs in the low-level cache and the processor cannot find free space to store the missed cache line, it needs to reuse space where a cache line is already stored; that is, the new cache line overwrites a cache line already in the low-level cache.
  • the cache line to be replaced is the cache line that is moved out of the low-level cache when a cache miss occurs in the low-level cache, so as to make room for the cache line on which the cache miss occurred.
  • the embodiment of the present invention does not limit the cache algorithm used to select the first cache line as the cache line to be replaced in the low-level cache. The cache algorithm may be first in first out (FIFO), last in first out (LIFO), LRU, pseudo-LRU (PLRU), random replacement (RR), segmented LRU (SLRU), least frequently used (LFU), LFU with dynamic aging (LFUDA), low inter-reference recency set (LIRS), adaptive replacement cache (ARC), multi-queue (MQ), or other algorithms.
  • the processor may select the first cache line as the to-be-replaced cache line in the low-level cache according to the LRU policy.
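As a concrete illustration of selecting the cache line to be replaced with the LRU policy mentioned above, the sketch below keeps the lines of one cache group ordered by recency; the line at the LRU end is the candidate victim. The class and method names are hypothetical, not from the patent.

```python
from collections import OrderedDict

class LRUGroup:
    """Hypothetical model of one cache group, ordered from LRU to MRU."""

    def __init__(self, lines):
        # Insertion order: the first element is the least recently used line.
        self._order = OrderedDict((tag, None) for tag in lines)

    def touch(self, tag):
        """Record an access: move the line to the MRU position."""
        self._order.move_to_end(tag)

    def victim(self):
        """Return the least recently used line, i.e. the cache line to be replaced."""
        return next(iter(self._order))
```

With lines B, C, D (B least recently used), B is the victim; after an access to B, C becomes the new victim.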
  • S304 The processor monitors whether a cache hit of the corresponding cache line of the first cache line occurs in the high-level cache.
  • after the processor selects the first cache line as the cache line to be replaced in the low-level cache, it monitors whether a cache hit on the corresponding cache line of the first cache line occurs in the high-level cache. According to the temporal and spatial locality of cache accesses, if the data corresponding to the first cache line is frequently accessed data (hot data) in the high-level cache, the corresponding cache line of the first cache line has a high probability of being hit in the high-level cache before a cache miss occurs again in the low-level cache.
  • if a cache miss occurs in the high-level cache but the corresponding cache line exists in the low-level cache, the data of that cache line may be directly written to the high-level cache; in this case, no cache miss occurs in the low-level cache. A cache miss occurs in the low-level cache only if a cache miss occurs in the high-level cache and the low-level cache also contains no corresponding cache line for the missed cache line.
  • because the space of the high-level cache is smaller than that of the low-level cache, if the first cache line is frequently accessed data in the high-level cache, the corresponding cache line of the first cache line has a high probability of being hit in the high-level cache before a cache miss occurs in the low-level cache.
  • the high-level cache identifies the cache line in the monitored state by using certain indication information.
  • a cache line being in the monitored state means that the processor monitors whether a cache hit occurs on that cache line in the cache.
  • the embodiment of the present invention does not limit the manner in which the corresponding cache line of the first cache line is in the monitored state.
  • the processor may use a status register to hold the identifier of the cache line in the monitored state; the identifier may be the tag bits and the index bits of the cache line. After selecting the first cache line as the cache line to be replaced in the low-level cache, the processor writes the identifier of the first cache line to the status register (the identifier of the first cache line is the same as that of its corresponding cache line).
  • monitoring whether a cache hit on the corresponding cache line of the first cache line occurs in the high-level cache may specifically be: comparing the identifier of the cache line on which a cache hit occurs in the high-level cache with the identifier of the first cache line saved in the status register; if the two are the same, a cache hit has occurred on the corresponding cache line of the first cache line.
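The status-register scheme just described can be sketched as follows: the register holds the (tag, index) identifier of the monitored line, and the identifier of every high-level hit is compared against it. All names here are illustrative assumptions, not part of the patent.

```python
class HitMonitor:
    """Sketch of monitoring via a single status register (assumed names)."""

    def __init__(self):
        self.status_register = None  # identifier of the monitored line, or None

    def watch(self, tag, index):
        """Start monitoring: write the victim's identifier to the status register."""
        self.status_register = (tag, index)

    def on_cache_hit(self, tag, index):
        """Called on every high-level cache hit. Returns True when the monitored
        line is hit, i.e. the victim choice should be revisited; monitoring is
        then canceled by clearing the register."""
        if self.status_register == (tag, index):
            self.status_register = None
            return True
        return False
```

A hit on any other line leaves the register untouched; only a hit whose identifier matches the register triggers the retain-and-reselect path described above.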
  • the processor may record the status of the cache line using an indication flag corresponding to the cache line.
  • for example, the tag field of the cache may be extended by 1 bit to serve as the indication flag. Each cache line in the high-level cache corresponds to an indication flag: when the indication flag is in the first state, the corresponding cache line is not in the monitored state; when the indication flag is in the second state, the corresponding cache line is in the monitored state. The processor only monitors cache lines whose indication flag is in the second state. After selecting the first cache line as the cache line to be replaced in the low-level cache, the processor searches for the corresponding cache line of the first cache line in the high-level cache and sets the indication flag of that cache line to the second state. Monitoring whether a cache hit on the corresponding cache line of the first cache line occurs in the high-level cache may specifically be: the processor monitors whether a hit event occurs on the cache line whose indication flag is in the second state.
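The indication-flag variant can be sketched by extending each cache line's metadata with a 1-bit flag whose second state (1) marks the line as monitored. The class and attribute names below are assumptions for illustration.

```python
class Line:
    """One high-level cache line with a 1-bit monitor flag (assumed layout)."""

    def __init__(self, tag):
        self.tag = tag
        self.flag = 0  # 0: first state (not monitored), 1: second state (monitored)

class HighLevelGroup:
    """Sketch of one high-level cache group using per-line indication flags."""

    def __init__(self, tags):
        self.lines = {tag: Line(tag) for tag in tags}

    def mark_monitored(self, tag):
        """Set the indication flag of the victim's corresponding line to state 2."""
        if tag in self.lines:
            self.lines[tag].flag = 1

    def on_hit(self, tag):
        """Returns True when the hit line was monitored; the flag is then reset."""
        line = self.lines.get(tag)
        if line is not None and line.flag == 1:
            line.flag = 0
            return True
        return False
```

Hits on unmonitored lines return False; a hit on the flagged line both reports the event and resets the flag, matching the cancel-monitoring behavior described above.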
  • the processor may select, according to the cache replacement algorithm, a second cache line other than the first cache line as the cache line to be replaced. For example, according to the LRU algorithm, the processor may select the least recently used cache line other than the first cache line as the second cache line.
  • the processor retains the first cache line in the low-level cache and reselects the second cache line as the cache line to be replaced. Because the corresponding cache line of the first cache line is hot data in the high-level cache, the processor can update the status of the first cache line to most recently used (MRU) in the low-level cache.
  • the high level cache cancels monitoring of the corresponding cache line of the first cache line. Specifically, if the high-level cache uses a status register to record the identifier of the first cache line, the identifier of the first cache line recorded in the status register is deleted.
  • if the high-level cache uses an indication flag to indicate whether the corresponding cache line of the first cache line is in the monitored state, with the first state indicating that the corresponding cache line is not monitored, the high-level cache sets the indication flag of the corresponding cache line of the first cache line to the first state to end the monitoring of that cache line.
  • when the cache miss occurs in the low-level cache, the first cache line is moved out of the low-level cache, and the cache line on which the cache miss occurred is moved into the low-level cache.
  • the processor invalidates the corresponding cache line of the first cache line in the high-level cache. Further, the processor cancels the monitoring of the corresponding cache line of the first cache line by the high-level cache; the specific manner is described above and is not repeated here. The processor also writes the missed cache line to the low-level cache and the high-level cache.
  • if the processor does not access the data of the first cache line before the cache miss occurs again in the low-level cache, that is, no hit on the corresponding cache line of the first cache line occurs in the high-level cache, then when the cache miss occurs in the low-level cache, the first cache line is moved out of the low-level cache, and the missed cache line is updated to the low-level cache and the high-level cache.
  • if the processor accesses the data corresponding to the first cache line, that is, a data access operation on the corresponding cache line of the first cache line occurs while the high-level cache no longer contains that cache line, the high-level cache has a cache miss on the corresponding cache line of the first cache line; the processor then goes to the low-level cache to find the first cache line and updates the data of the first cache line to the high-level cache. Because the low-level cache hits the first cache line, the processor retains the first cache line in the low-level cache and reselects the second cache line as the cache line to be replaced, thereby avoiding a subsequent back-invalidation of the corresponding cache line of the first cache line that was just updated to the high-level cache. Further, the status of the first cache line can be updated to most recently used (MRU) in the low-level cache.
  • the processor monitors whether a hit event on the corresponding cache line of the first cache line occurs in the high-level cache. If a hit on the corresponding cache line of the first cache line occurs, it indicates that the corresponding cache line is frequently accessed data (hot data) in the high-level cache, and the processor retains the first cache line in the low-level cache.
  • FIG. 4 is a schematic diagram of a cache state change according to an embodiment of the invention.
  • the cache system 400 includes a high-level cache 401 and a low-level cache 402.
  • the low-level cache 402 includes six cache lines B, C, D, E, F, and G, and the high-level cache 401 includes cache lines B, F, and C. The high-level cache 401 and the low-level cache 402 are in an inclusion relationship; that is, the cache lines B, F, and C in the high-level cache are also present in the low-level cache.
  • the low-level cache maintains a logical LRU chain, which may be implemented, for example, based on the hit counts or access times of the cache lines; the specific implementation of the LRU chain is not limited in this embodiment of the present invention.
  • in state 1, cache line G is the cache line to be replaced. When a cache miss occurs again in the low-level cache, the cache line to be replaced makes room for the cache line that caused the cache miss; that is, when a cache miss occurs again in the low-level cache 402, cache line G is moved out of the low-level cache 402.
  • in state 1, the processor initiates an access to data A. Because the high-level cache 401 does not include a cache line corresponding to data A, the high-level cache 401 has a cache miss for data A, and the processor searches the low-level cache 402 for data A. Since the low-level cache 402 also does not include data A, the low-level cache also has a cache miss for data A; the processor fetches data A from a lower-level cache or the memory, updates the cache line corresponding to data A into the low-level cache 402 and the high-level cache 401, and the cache system enters state 2.
  • in state 2, cache line A has been updated into the low-level cache 402 and the high-level cache 401, the cache line G to be replaced in state 1 has been moved out of the low-level cache 402, and cache line C has been moved out of the high-level cache 401. The newly added cache line A is at the head of the LRU chain, and cache line F is at the tail of the LRU chain, so cache line F becomes the cache line to be replaced of the low-level cache 402.
  • in state 2, after determining that cache line F in the low-level cache 402 is the cache line to be replaced, the processor monitors whether a hit event on cache line F occurs in the high-level cache 401. If, before cache line F is replaced out of the low-level cache 402 due to a cache miss in the low-level cache 402, a hit on cache line F occurs in the high-level cache 401, this indicates that cache line F is hot data in the high-level cache 401. To avoid back-invalidating cache line F in the high-level cache, the processor retains cache line F in the low-level cache 402 and selects cache line E as the cache line to be replaced, and the cache system enters state 3.
  • in state 3, cache line F is adjusted to the MRU position at the head of the LRU chain, and cache line E is at the tail of the LRU chain and is selected as the cache line to be replaced.
  • the processor monitors whether the high-level cache 401 hits cache line F before cache line F is removed from the low-level cache because of a cache miss in the low-level cache 402. If the high-level cache 401 hits cache line F, cache line F is hot data; to prevent cache line F from being back-invalidated, the processor retains cache line F in the low-level cache 402 and reselects cache line E as the cache line to be replaced, thereby ensuring that cache line F, as hot data, remains in the high-level cache 401. This increases the cache hit rate of the high-level cache and improves the overall performance of the processor system.
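The FIG. 4 walkthrough (states 1 to 3) can be replayed with a small model: the low-level cache keeps an LRU chain whose tail is the victim, a low-level miss evicts the victim, and a high-level hit on the monitored victim promotes it back to MRU and reselects the victim. This is an illustrative sketch with assumed names and a deliberately simplified high-level eviction rule (evict the last-listed line), not the patent's prescribed implementation.

```python
class TwoLevelCache:
    """Sketch of the FIG. 4 scenario; index 0 is the MRU end of the LRU chain."""

    def __init__(self, low, high):
        self.low = list(low)         # low-level LRU chain, tail = LRU end
        self.high = list(high)       # high-level cache contents
        self.victim = self.low[-1]   # cache line to be replaced

    def high_hit(self, tag):
        """High-level hit: if the monitored victim is hit, retain it in the
        low level (promote to MRU) and reselect the LRU-end line as victim."""
        if tag == self.victim:
            self.low.remove(tag)
            self.low.insert(0, tag)
            self.victim = self.low[-1]

    def low_miss(self, tag):
        """Low-level miss: evict the victim, insert the new line at MRU, and
        fill the high level (simplified eviction: drop the last-listed line)."""
        self.low.remove(self.victim)
        self.low.insert(0, tag)
        if tag not in self.high:
            self.high.pop()
            self.high.insert(0, tag)
        self.victim = self.low[-1]
```

Starting from state 1 (low chain A-ward order B..G with G as victim, high B/F/C), a miss on A yields state 2 (F is the new victim), and a high-level hit on F yields state 3 (F promoted to MRU, E is the victim), matching the figure walkthrough.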
  • FIG. 5 is an exemplary flowchart 500 of a cache replacement method according to an embodiment of the present invention.
  • the cache architecture includes a low-level cache and a high-level cache.
  • the low-level cache and the high-level cache respectively have respective control logics, as shown in FIG. 5 .
  • Method 500 includes:
  • the low-level cache selects the first cache line as the cache line to be replaced.
  • in this embodiment of the present invention, the low-level cache is assumed to use the LRU replacement policy to select the first cache line as the cache line to be replaced: the cache line in the LRU state is determined to be the first cache line.
  • the low-level cache sends a first notification message to the high-level cache.
  • the first notification message is used to notify the high-level cache that the first cache line has been selected by the low-level cache as the cache line to be replaced.
  • the first notification message includes the indication information of the first cache line, and the indication information may be specifically the index and the tag information of the first cache line, which is not limited by the embodiment of the present invention.
  • specifically, the low-level cache may, when updating the LRU state, send the first notification message to the high-level cache once the first cache line is in the LRU state.
  • after receiving the first notification message sent by the low-level cache, the high-level cache first searches for the corresponding cache line of the first cache line in the high-level cache according to the indication information of the first cache line. If the high-level cache does not include the corresponding cache line of the first cache line, the high-level cache may discard the first notification message; if the high-level cache includes the corresponding cache line of the first cache line, the high-level cache performs step S506.
  • this embodiment of the present invention assumes that both the high-level cache and the low-level cache include the first cache line; the case where the low-level cache includes the first cache line but the high-level cache does not has already been covered above. In the following description of the embodiments, unless otherwise stated, the high-level cache includes the first cache line.
  • step S506: before a cache miss occurs in the low-level cache, the high-level cache monitors whether a hit event on the corresponding cache line of the first cache line occurs in the high-level cache. If yes, step S508 is performed; if not, step S512 is performed.
  • the high-level cache may mark the corresponding cache line of the first cache line in a plurality of manners to indicate that the corresponding cache line of the first cache line is in a monitored state.
  • for example, the high-level cache may use a register to record the identifier of the first cache line (the first cache line has the same identifier as its corresponding cache line) or use a flag bit to mark the corresponding cache line of the first cache line. For instance, a 1-bit flag FI may be added to the tag of the cache line, with both the initial value and the reset value of the FI flag being 0.
  • if a hit on the corresponding cache line of the first cache line occurs, step S508 is performed, and the monitoring of the corresponding cache line of the first cache line is canceled: if the high-level cache uses a register to record the identifier of the first cache line, the identifier of the first cache line recorded in the register is deleted; if the high-level cache uses the FI flag bit to indicate that the corresponding cache line of the first cache line is in the monitored state, the FI flag corresponding to that cache line is set to zero.
  • specifically, if the high-level cache uses a register to record the identifier of the first cache line, the high-level cache compares the identifier of the cache line on which a cache hit event occurs with the identifier of the first cache line saved in the register; if the two are the same, a hit on the corresponding cache line of the first cache line has occurred, which triggers step S508, and the identifier of the first cache line recorded in the register is deleted. If the high-level cache uses the FI flag bit to indicate that the corresponding cache line of the first cache line is in the monitored state, then when a cache hit occurs in the high-level cache and the FI flag bit of the hit cache line is 1, a hit on the corresponding cache line of the first cache line has occurred, which triggers step S508, and the FI flag of that cache line is set to 0.
  • S508 The high level cache sends a second notification message to the low level cache.
  • the second notification message is used to indicate that a hit event for the corresponding cache line of the first cache line occurs in the high-level cache.
  • the low-level cache retains the first cache line, and selects the second cache line as the cache line to be replaced.
  • if the hit event on the corresponding cache line of the first cache line occurs in the high-level cache, then according to the temporal and spatial locality of cache accesses, the corresponding cache line of the first cache line is hot data in the high-level cache. To avoid the corresponding cache line of the first cache line being back-invalidated in the high-level cache, the low-level cache retains the first cache line and reselects the cache line to be replaced.
  • the low level cache may update the status of the first cache line to the MRU.
  • if the LRU policy is used, the second cache line, now in the LRU state, is determined to be the cache line to be replaced. It should be understood that the embodiment of the present invention is only illustrated with the LRU policy; the embodiment does not limit the cache algorithm used to select the second cache line as the cache line to be replaced.
  • S512: a cache miss on a third cache line occurs in the high-level cache.
  • because cache line replacement is relatively frequent, after receiving the first notification message, the high-level cache will soon have a cache miss. After the cache miss on the third cache line occurs, the high-level cache goes to the low-level cache to find out whether the low-level cache includes the corresponding cache line of the third cache line.
  • the low-level cache determines whether a cache miss of the corresponding cache line of the third cache line occurs.
  • if the low-level cache includes the corresponding cache line of the third cache line, the data of that cache line may be directly updated to the high-level cache; no cache miss occurs in the low-level cache, so the first cache line, as the cache line to be replaced, is not moved out of the low-level cache.
  • if the low-level cache does not include the corresponding cache line of the third cache line, the low-level cache has a cache miss on the corresponding cache line of the third cache line, needs to obtain the data corresponding to the third cache line from a lower-level cache or the memory, and performs step S516.
  • S516: the low-level cache moves out the first cache line and moves in the data corresponding to the third cache line.
  • because a cache miss on the corresponding cache line of the third cache line occurred, after the low-level cache obtains the data corresponding to the third cache line from the lower-level cache or the memory, the data corresponding to the third cache line replaces the current cache line to be replaced (the first cache line), and the data corresponding to the third cache line is also written to the high-level cache.
  • S518: the low-level cache sends an invalidation message to the high-level cache.
  • because the first cache line has been moved out of the low-level cache, to maintain the inclusion relationship, an invalidation message for the corresponding cache line of the first cache line needs to be sent to the high-level cache, for invalidating the corresponding cache line of the first cache line in the high-level cache.
  • the invalidation message may carry the identifier information of the first cache line, where the identifier information may be the index and tag information of the first cache line.
  • S520 The high-level cache invalidates the corresponding cache line of the first cache line.
  • after receiving the invalidation message for the corresponding cache line of the first cache line, the high-level cache invalidates the corresponding cache line of the first cache line. Specifically, the high-level cache may find the corresponding cache line of the first cache line according to the identifier carried in the invalidation message, and if the corresponding cache line is found, mark it as invalid. Further, the high-level cache terminates monitoring of the corresponding cache line of the first cache line.
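The message exchange of method 500 (first notification, second notification, invalidation) can be sketched as two cooperating controllers. The class and method names are hypothetical; real controllers would exchange these messages over the cache hierarchy's interconnect rather than direct method calls.

```python
class LowLevelCache:
    """Sketch of the low-level controller's side of method 500 (assumed names)."""

    def __init__(self, high):
        self.high = high
        self.victim = None

    def select_victim(self, line):
        """Select the cache line to be replaced and send the first notification."""
        self.victim = line
        self.high.notify_victim(line)

    def on_victim_hit_in_high(self):
        """Second notification received: retain the line, drop it as victim."""
        self.victim = None

    def miss(self, line):
        """Low-level miss: evict the current victim and send the invalidation."""
        evicted, self.victim = self.victim, None
        if evicted is not None:
            self.high.invalidate(evicted)
        return evicted

class HighLevelCache:
    """Sketch of the high-level controller's side of method 500 (assumed names)."""

    def __init__(self):
        self.low = None          # wired up after construction
        self.monitored = None
        self.invalidated = []

    def notify_victim(self, line):
        """First notification: start monitoring the victim's corresponding line."""
        self.monitored = line

    def hit(self, line):
        """On a hit to the monitored line, send the second notification."""
        if line == self.monitored:
            self.monitored = None
            self.low.on_victim_hit_in_high()

    def invalidate(self, line):
        """Invalidation message: back-invalidate the line and stop monitoring."""
        self.invalidated.append(line)
        self.monitored = None
```

In the FIG. 5 flow, a hit on the monitored victim F causes the low level to keep F and later pick E; only the line actually evicted on a low-level miss is back-invalidated in the high level.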
  • FIG. 6 is a schematic diagram of a logical structure of a cache replacement system 600 according to an embodiment of the present invention.
  • the system 600 includes a processor 101, and the processor 101 includes a processor core 102-1 and a processor core 102-2.
  • the system adopts a processor-core and three-level cache architecture, where L1 cache 103-1 and L2 cache 104-1 are private caches of processor core 102-1, L1 cache 103-2 and L2 cache 104-2 are private caches of processor core 102-2, and the L3 cache is a shared cache of processor core 102-1 and processor core 102-2.
  • Processor 101 is interconnected with system memory subsystem 108.
  • system 600 is merely illustrative.
  • the cache replacement strategy of the embodiments of the present invention is applicable to data processing systems including more cache levels (more than three) as well as to data processing systems including a two-level cache architecture.
  • the processor 101 also includes a cache replacement policy module 118, which is firmware integrated within the processor 101.
  • processor 101 may further include a cache controller 106.
  • the system of this embodiment of the present invention performs the method described in any of the embodiments of FIG. 3 and FIG. 5; the specific process is described above and is not repeated here.
  • FIG. 7 is a schematic diagram of a logical structure of a cache replacement device 700.
  • the device 700 is applied to a computer system including a high-level cache and a low-level cache, where the low-level cache and the high-level cache are in an inclusion relationship, according to an embodiment of the present invention.
  • the apparatus 700 includes a selection unit 702 and a monitoring unit 704.
  • the selecting unit 702 is configured to select the first cache line as the cache line to be replaced in the low level cache, and the cache line to be replaced is used to be moved out of the low level cache when a cache miss occurs in the low level cache.
  • the monitoring unit 704 is configured to monitor whether a hit on the corresponding cache line of the first cache line occurs in the high-level cache. If the hit on the corresponding cache line of the first cache line occurs in the high-level cache before the cache miss occurs in the low-level cache, the selection unit 702 is further configured to retain the first cache line in the low-level cache and select the second cache line as the cache line to be replaced.
  • device 700 can also include write unit 706 and invalidation unit 708.
  • the high level cache is associated with a status register that holds the identity of the cache line in the monitored state.
  • the write unit 706 is configured to write the identifier of the first cache line to the status register.
  • the monitoring unit 704 is configured to compare the identifier of the cache line in which the cache hit occurs in the high-level cache with the identifier of the first cache line saved in the status register. If the two are the same, it indicates that the corresponding cache line of the first cache line has occurred. Hit.
  • each cache line in the high-level cache corresponds to an indication flag: when the indication flag is in the first state, the corresponding cache line is not in the monitored state; when the indication flag is in the second state, the corresponding cache line is in the monitored state.
  • the monitoring unit 704 is further configured to search for the corresponding cache line of the first cache line in the high-level cache and set the indication flag of the corresponding cache line of the first cache line to the second state.
  • the write unit 706 is further configured to, when the cache miss occurs in the low-level cache, move the first cache line out of the low-level cache and move the cache line on which the cache miss occurred into the low-level cache.
  • the invalidation unit 708 is used to invalidate the corresponding cache line of the first cache line in the high-level cache.
  • the selection unit 702 is configured to select the first cache line as the cache line to be replaced in the low-level cache according to the least recently used (LRU) policy. If the hit on the corresponding cache line of the first cache line occurs in the high-level cache before the cache miss occurs in the low-level cache, the selection unit 702 is further configured to update the status of the first cache line in the low-level cache to most recently used (MRU).
  • the selecting unit 702, the monitoring unit 704, the writing unit 706 and the invalidating unit 708 may be specifically implemented by the processor 101 executing the code in the cache replacement policy module 118.
  • the embodiment of the present invention is an embodiment of the device corresponding to the method embodiment of FIG. 3 to FIG. 6.
  • the description of the embodiment of the embodiment of the present invention is applicable to the embodiment of the present invention, and details are not described herein again.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

Embodiments of the present invention provide a cache replacement method, apparatus and system in a computer. The computer includes a high-level cache and a low-level cache, and the low-level cache and the high-level cache are in an inclusive relationship. The method includes: a processor selects a first cache line in the low-level cache as a to-be-replaced cache line, and monitors whether a hit on the corresponding cache line of the first cache line occurs in the high-level cache; if a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, the processor retains the first cache line in the low-level cache and selects a second cache line as the to-be-replaced cache line. By monitoring accesses to the high-level cache, back invalidation of frequently accessed cache lines in the high-level cache is avoided, which improves the cache hit rate.

Description

Cache replacement method, apparatus and system
Technical Field
This application relates to the computer field, and in particular, to a cache replacement method, apparatus and system.
Background
A processor generally uses multiple levels of CPU cache to reduce the overhead (time or energy) of accessing data from primary storage (for example, system memory). The relationship between cache levels is usually one of three types: inclusive, exclusive, or non-inclusive (between inclusive and exclusive). Inclusive means that every cache line in the high-level cache has a corresponding cache line (with the same identifier) in the low-level cache. Exclusive means that the cache lines in the high-level cache and the cache lines in the low-level cache are mutually exclusive. Non-inclusive is a compromise between the two: some of the cache lines in the high-level cache are also present in the low-level cache.
In an inclusive cache, the data in the high-level cache is a subset of the data in the low-level cache. If a cache line is evicted from the low-level cache, the corresponding cache line in the high-level cache is also back-invalidated to maintain the inclusive relationship.
For a data cache, an inclusive relationship may be used between the low-level cache and the high-level cache because of the requirements of the coherence protocol. Take an L1 cache and an L2 cache as an example, where the L2 cache includes the L1 cache: if a cache line in the L1 cache is accessed frequently, that cache line generally stays in the L1 cache for a long time and no cache miss occurs on it. However, the L2 cache is unaware of how cache lines in the L1 cache are used; it learns which data the L1 cache needs only when the L1 cache suffers a cache miss. If a cache line is accessed frequently in the L1 cache and no cache miss occurs, the corresponding cache line in the L2 cache goes unused for a long time and is evicted from the L2 cache by the L2 replacement policy. The replacement in the L2 cache in turn triggers back invalidation of the corresponding cache line in the L1 cache; back-invalidating frequently accessed data in the L1 cache then causes L1 cache misses, degrading system performance.
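The pathology described above can be reproduced with a toy model. The sketch below is purely illustrative (the class, line names, and sizes are invented, and both levels use plain LRU): line A keeps hitting in L1, so L2 never sees an access to it, ages it out, and the resulting back invalidation turns A's next access into an L1 miss.

```python
from collections import OrderedDict

class InclusiveCaches:
    """Toy inclusive L1/L2 pair with plain LRU in each level (illustrative only)."""

    def __init__(self, l1_size, l2_size):
        self.l1 = OrderedDict()          # insertion order == LRU order (front = LRU)
        self.l2 = OrderedDict()
        self.l1_size, self.l2_size = l1_size, l2_size

    def access(self, line):
        if line in self.l1:                          # L1 hit: L2 is not informed
            self.l1.move_to_end(line)
            return "L1 hit"
        if line in self.l2:                          # L1 miss, L2 hit
            self.l2.move_to_end(line)
        else:                                        # miss in both levels: fill L2 first
            if len(self.l2) >= self.l2_size:
                victim, _ = self.l2.popitem(last=False)   # evict the L2 LRU line
                self.l1.pop(victim, None)                 # back-invalidate it in L1
            self.l2[line] = True
        if len(self.l1) >= self.l1_size:             # then fill L1
            self.l1.popitem(last=False)
        self.l1[line] = True
        return "L1 miss"

caches = InclusiveCaches(l1_size=2, l2_size=3)
for line in ["A", "B", "A", "C", "A", "D"]:
    caches.access(line)
# D's L2 fill evicted A (L2's LRU line) and back-invalidated it in L1,
# so the hot line A now misses even in L1:
assert caches.access("A") == "L1 miss"
```

Even though A hits in L1 on every intermediate access, L2 only ever observes the misses for B, C and D, so A silently becomes L2's LRU line.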
Summary
In view of this, this application discloses a cache replacement method, apparatus and system that, by monitoring accesses to the high-level cache, reduce the likelihood of back-invalidating frequently accessed cache lines in the high-level cache.
According to a first aspect, this application provides a cache replacement method in a computer. The computer includes a high-level cache and a low-level cache, and the low-level cache and the high-level cache are in an inclusive relationship, that is, every cache line in the high-level cache has a corresponding identical cache line in the low-level cache. The method includes: a processor selects a first cache line in the low-level cache as a to-be-replaced cache line, and monitors whether a cache hit on the corresponding cache line of the first cache line occurs in the high-level cache; if a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, the processor retains the first cache line in the low-level cache and selects a second cache line as the to-be-replaced cache line. The to-be-replaced cache line is the line that is moved out of the low-level cache when a cache miss occurs in the low-level cache, to make room for the missed cache line. The corresponding cache line of the first cache line is a cache line identical to the first cache line; the two include the same data and correspond to the same access address.
If a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, the data of the first cache line (the first cache line and its corresponding cache line include the same data) is frequently accessed data in the high-level cache. Reselecting the to-be-replaced cache line in the low-level cache prevents the first cache line from being replaced in the short term, so that its corresponding cache line will not be back-invalidated in the short term, keeping the hit rate of the high-level cache high.
According to the first aspect, in a first possible implementation of the first aspect, the computer further includes a status register associated with the high-level cache, used to save the identifier of the cache line that is in the monitored state. A cache line being in the monitored state means that the processor monitors whether a cache hit on that cache line occurs in the high-level cache. After the first cache line is selected in the low-level cache as the to-be-replaced cache line, the method further includes: the processor writes the identifier of the first cache line into the status register. The processor determines whether a hit on the corresponding cache line of the first cache line has occurred by comparing the identifier of the cache line on which a cache hit occurs in the high-level cache with the identifier of the first cache line saved in the status register; if the two are the same, a hit on the corresponding cache line of the first cache line has occurred.
After a hit on the corresponding cache line of the first cache line occurs, or after the first cache line is back-invalidated, the processor deletes the identifier of the first cache line recorded in the status register and stops monitoring the corresponding cache line of the first cache line.
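As a rough sketch of this status-register variant (the class, method names, and the (tag, index) tuple used as an identifier are assumptions for illustration, not from the patent), the register holds at most one monitored identifier, every high-level-cache hit is compared against it, and a match reports the event and terminates the monitoring:

```python
class MonitorRegister:
    """Single status register holding the identifier of the one monitored
    cache line, per the first implementation above (names illustrative)."""

    def __init__(self):
        self.monitored_id = None          # empty: nothing is being monitored

    def start_monitoring(self, line_id):
        # written when the low-level cache picks its victim (to-be-replaced line)
        self.monitored_id = line_id

    def on_high_level_hit(self, hit_line_id):
        """Called on every high-level-cache hit; returns True when the hit is
        on the monitored line, which also terminates the monitoring."""
        if self.monitored_id is not None and hit_line_id == self.monitored_id:
            self.monitored_id = None      # hit observed: stop monitoring
            return True
        return False

reg = MonitorRegister()
reg.start_monitoring(("tag_F", 5))                    # victim chosen in low-level cache
assert reg.on_high_level_hit(("tag_B", 2)) is False   # unrelated hit: ignored
assert reg.on_high_level_hit(("tag_F", 5)) is True    # victim's line was hit
assert reg.monitored_id is None                       # monitoring terminated
```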
According to the first aspect, in a second possible implementation of the first aspect, each cache line in the high-level cache corresponds to an indication flag bit. When the indication flag bit is in a first state (for example, 0), the corresponding cache line is not in the monitored state; when it is in a second state (for example, 1), the corresponding cache line is in the monitored state. After the first cache line is selected in the low-level cache as the to-be-replaced cache line, the method further includes: the processor searches the high-level cache for the corresponding cache line of the first cache line, and sets the indication flag bit corresponding to that cache line to the second state.
After a hit on the corresponding cache line of the first cache line occurs, or after the first cache line is back-invalidated, the processor sets the indication flag bit corresponding to the cache line to the first state and stops monitoring the corresponding cache line of the first cache line; at any given time, only one cache line in the high-level cache is in the monitored state.
Specifically, the indication flag bit may be implemented by extending the tag bits in the high-level cache. The processor determines whether a hit on the corresponding cache line of the first cache line has occurred by checking whether the indication flag bit corresponding to the hit cache line is in the second state; if it is, a hit on the corresponding cache line of the first cache line has occurred.
According to the first aspect or any one of the foregoing possible implementations, in a third possible implementation of the first aspect, the method further includes: if the high-level cache has not accessed the corresponding cache line of the first cache line before a cache miss occurs in the low-level cache, then when the cache miss occurs in the low-level cache, the processor moves the first cache line out of the low-level cache and moves the cache line on which the cache miss occurred into the low-level cache.
If the high-level cache includes the corresponding cache line of the first cache line, and no hit on that cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, then when the low-level cache miss occurs, the first cache line is moved out of the low-level cache and its corresponding cache line in the high-level cache is back-invalidated. If the high-level cache does not include the corresponding cache line of the first cache line, and the high-level cache does not access the corresponding cache line of the first cache line before the low-level cache miss, that is, no cache miss on the corresponding cache line of the first cache line occurs, then when the low-level cache miss occurs, the first cache line is moved out of the low-level cache.
According to the third possible implementation of the first aspect, in a fourth possible implementation of the first aspect, after the first cache line is moved out of the low-level cache, the method further includes: the processor invalidates the corresponding cache line of the first cache line in the high-level cache.
Because the first cache line has been moved out of the low-level cache, the processor back-invalidates the corresponding cache line of the first cache line to preserve the inclusive relationship between the low-level cache and the high-level cache.
According to the first aspect or any one of the foregoing possible implementations, in a fifth possible implementation of the first aspect, the processor selects the first cache line in the low-level cache as the to-be-replaced cache line according to a least recently used (LRU) policy.
According to the fifth possible implementation of the first aspect, in a sixth possible implementation of the first aspect, if a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, the method further includes: the processor updates the status of the first cache line in the low-level cache to most recently used (MRU).
Because a hit on the corresponding cache line of the first cache line occurred in the high-level cache, by the temporal and spatial locality of cache accesses, the corresponding cache line of the first cache line holds frequently accessed data in the high-level cache. To prevent it from being back-invalidated in the short term, the status of the first cache line is updated to MRU, extending the time the first cache line stays in the low-level cache.
According to a second aspect, this application provides a cache replacement apparatus in a computer. The computer includes a high-level cache and a low-level cache, and the low-level cache and the high-level cache are in an inclusive relationship. The apparatus includes: a selection unit, configured to select a first cache line in the low-level cache as a to-be-replaced cache line, where the to-be-replaced cache line is moved out of the low-level cache when a cache miss occurs in the low-level cache; and a monitoring unit, configured to monitor whether a hit on the corresponding cache line of the first cache line occurs in the high-level cache. If a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, the selection unit is further configured to retain the first cache line in the low-level cache and select a second cache line as the to-be-replaced cache line.
According to the second aspect, in a first possible implementation of the second aspect, the computer further includes a status register associated with the high-level cache, used to save the identifier of the cache line that is in the monitored state. The apparatus further includes a write unit; after the selection unit selects the first cache line in the low-level cache as the to-be-replaced cache line, the write unit is configured to write the identifier of the first cache line into the status register. The monitoring unit is configured to compare the identifier of the cache line on which a cache hit occurs in the high-level cache with the identifier of the first cache line saved in the status register; if the two are the same, a hit on the corresponding cache line of the first cache line has occurred.
According to the second aspect, in a second possible implementation of the second aspect, each cache line in the high-level cache corresponds to an indication flag bit; when the indication flag bit is in the first state, the corresponding cache line is not in the monitored state, and when it is in the second state, the corresponding cache line is in the monitored state. After the selection unit selects the first cache line in the low-level cache as the to-be-replaced cache line, the monitoring unit is further configured to search the high-level cache for the corresponding cache line of the first cache line and set the indication flag bit corresponding to that cache line to the second state.
According to the second aspect or any one of the foregoing possible implementations, in a third possible implementation of the second aspect, the apparatus further includes a write unit. If the high-level cache has not accessed the corresponding cache line of the first cache line before a cache miss occurs in the low-level cache, then when the low-level cache miss occurs, the write unit is further configured to move the first cache line out of the low-level cache and move the cache line on which the cache miss occurred into the low-level cache.
According to the third possible implementation of the second aspect, in a fourth possible implementation of the second aspect, the apparatus further includes an invalidation unit. After the write unit moves the first cache line out of the low-level cache, the invalidation unit is configured to invalidate the corresponding cache line of the first cache line in the high-level cache.
According to the second aspect or any one of the foregoing possible implementations, in a fifth possible implementation of the second aspect, the selection unit is configured to select the first cache line in the low-level cache as the to-be-replaced cache line according to the least recently used (LRU) policy.
According to the fifth possible implementation of the second aspect, in a sixth possible implementation of the second aspect, if a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, the selection unit is further configured to update the status of the first cache line in the low-level cache to most recently used (MRU).
The second aspect and its possible implementations are the apparatus implementations corresponding to the first aspect and its possible method implementations; the descriptions of the first aspect and its implementations apply correspondingly to the second aspect and its implementations and are not repeated here.
According to a third aspect, this application provides a cache replacement system. The system includes a high-level cache, a low-level cache and a cache controller, where the low-level cache and the high-level cache are in an inclusive relationship. The cache controller is configured to: select a first cache line in the low-level cache as a to-be-replaced cache line, where the to-be-replaced cache line is moved out of the low-level cache when a cache miss occurs in the low-level cache; monitor whether a hit on the corresponding cache line of the first cache line occurs in the high-level cache; and if a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, retain the first cache line in the low-level cache and select a second cache line as the to-be-replaced cache line.
According to the third aspect, in a first possible implementation of the third aspect, the system further includes a status register associated with the high-level cache, used to save the identifier of the cache line that is in the monitored state. After the first cache line is selected in the low-level cache as the to-be-replaced cache line, the cache controller is further configured to write the identifier of the first cache line into the status register. The cache controller is configured to compare the identifier of the cache line on which a cache hit occurs in the high-level cache with the identifier of the first cache line saved in the status register; if the two are the same, a hit on the corresponding cache line of the first cache line has occurred.
According to the third aspect, in a second possible implementation of the third aspect, each cache line in the high-level cache corresponds to an indication flag bit; when the indication flag bit is in the first state, the corresponding cache line is not in the monitored state, and when it is in the second state, the corresponding cache line is in the monitored state. After the first cache line is selected in the low-level cache as the to-be-replaced cache line, the cache controller is further configured to search the high-level cache for the corresponding cache line of the first cache line and set the indication flag bit corresponding to that cache line to the second state.
According to the third aspect or any one of the foregoing possible implementations, in a third possible implementation of the third aspect, if the high-level cache has not accessed the corresponding cache line of the first cache line before a cache miss occurs in the low-level cache, the cache controller is further configured to, when the low-level cache miss occurs, move the first cache line out of the low-level cache and move the cache line on which the cache miss occurred into the low-level cache.
According to the third possible implementation of the third aspect, in a fourth possible implementation of the third aspect, after moving the first cache line out of the low-level cache, the cache controller is further configured to invalidate the corresponding cache line of the first cache line in the high-level cache.
According to the third aspect or any one of the foregoing possible implementations, in a fifth possible implementation of the third aspect, the cache controller is configured to select the first cache line in the low-level cache as the to-be-replaced cache line according to the least recently used (LRU) policy.
According to the fifth possible implementation of the third aspect, in a sixth possible implementation of the third aspect, if a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, the cache controller is further configured to update the status of the first cache line in the low-level cache to most recently used (MRU).
The third aspect and its possible implementations are the system implementations corresponding to the first aspect and its possible method implementations; the descriptions of the first aspect and its implementations apply correspondingly to the third aspect and its implementations and are not repeated here.
According to the technical solutions disclosed in this application, after selecting the first cache line in the low-level cache as the to-be-replaced cache line, the processor monitors whether a hit event on the corresponding cache line of the first cache line occurs in the high-level cache. If such a hit occurs, the corresponding cache line of the first cache line is frequently accessed data (hot data) in the high-level cache; the processor then retains the first cache line in the low-level cache and reselects a second cache line as the to-be-replaced cache line. This ensures that the corresponding cache line of the first cache line, being hot data, is not invalidated in the high-level cache as a result of replacement in the low-level cache, which improves the hit rate of the high-level cache and the overall performance of the system.
Brief Description of Drawings
FIG. 1 is a schematic diagram of a logical structure of a data processing system according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a cache structure according to an embodiment of the present invention;
FIG. 3 is an exemplary flowchart of a cache replacement method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of cache state changes according to an embodiment of the present invention;
FIG. 5 is an exemplary flowchart of a cache replacement method according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a logical structure of a cache replacement system according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a logical structure of a cache replacement apparatus according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of a logical structure of a cache replacement apparatus according to an embodiment of the present invention.
Description of Embodiments
The following describes the embodiments of the present invention with reference to the accompanying drawings.
Under a traditional cache replacement policy, the low-level cache may select the to-be-replaced cache line based on the least recently used policy. Suppose a first cache line exists in the low-level cache (for example, the L2 cache) and its corresponding cache line exists in the high-level cache (for example, the L1 cache), where the corresponding cache line of the first cache line is a cache line identical to the first cache line: the two include the same data and correspond to the same access address. If the processor frequently accesses the data of the first cache line, cache hits on the corresponding cache line of the first cache line occur frequently in the L1 cache (the data of a cache line is the memory data the line contains; the first cache line and its corresponding cache line include the same data). If the corresponding cache line of the first cache line is hit frequently in the L1 cache, under the LRU policy that cache line persists in the L1 cache, so no access to the first cache line occurs in the L2 cache. If the first cache line goes unaccessed in the L2 cache for a period of time, under the LRU replacement policy its risk of being replaced in the L2 cache gradually grows. Once the first cache line is determined as the to-be-replaced cache line in the L2 cache, it is moved out of the L2 cache when an L2 cache miss occurs, and to maintain the inclusive relationship its corresponding cache line in the L1 cache is also back-invalidated. If the data of the first cache line is frequently accessed in the L1 cache, the L1 cache soon suffers a cache miss on the corresponding cache line of the first cache line, and the data of the first cache line has to be written into the L2 cache and the L1 cache again, degrading the overall performance of the system.
To solve the above problem, in the technical solutions disclosed in the embodiments of the present invention, after the low-level cache determines the first cache line as the to-be-replaced cache line, the processor monitors whether a cache hit on the corresponding cache line of the first cache line occurs in the high-level cache. If, before the first cache line is replaced because of a low-level cache miss, a hit on its corresponding cache line occurs in the high-level cache, then by the temporal and spatial locality of cache accesses that cache line is likely to be accessed frequently in the high-level cache; the first cache line is therefore retained in the low-level cache, and a second cache line is reselected as the to-be-replaced cache line. This avoids back-invalidating the corresponding cache line of the first cache line, keeps it in the high-level cache, and keeps the overall cache hit rate high. If the high-level cache includes the corresponding cache line of the first cache line, but no cache hit on it occurs in the high-level cache before the low-level cache miss, the corresponding cache line is not frequently accessed in the high-level cache; in that case, when the low-level cache miss occurs, the first cache line is replaced out of the low-level cache and its corresponding cache line in the high-level cache is back-invalidated. In the embodiments of the present invention, replacing the first cache line out of the low-level cache means moving the first cache line out of the low-level cache and writing the cache line on which the cache miss occurred into the low-level cache.
FIG. 1 is a schematic structural diagram of a computer 100 according to an embodiment of the present invention. The computer may be any electronic device, for example a portable computer, a desktop computer, a server, a network device, a tablet, a mobile phone, a personal digital assistant (PDA), a wearable device, or any combination thereof.
As shown in FIG. 1, the data processing system 100 includes a processor 101 connected to a system memory 108. The processor 101 may be a central processing unit (CPU), a graphics processing unit (GPU), or a digital signal processor (DSP).
The processor 101 may be a single-core or multi-core processor and includes two or more levels of cache. In this embodiment of the present invention, the processor 101 includes multiple processor cores 102 (processor core 102-1, processor core 102-2, ..., processor core 102-N, collectively processor cores 102) and a three-level cache architecture, where the L1 caches 103 (L1 cache 103-1, L1 cache 103-2, ..., L1 cache 103-N, collectively L1 caches 103) and the L2 caches 104 (L2 cache 104-1, L2 cache 104-2, ..., L2 cache 104-N, collectively L2 caches 104) are private caches and the L3 cache 105 is a shared cache. A private cache is exclusive to its corresponding processor core, while a shared cache can be shared by multiple processor cores. The L1 cache 103 is a high-level cache relative to the L2 cache 104 and the L3 cache 105, and the L2 cache is a high-level cache relative to the L3 cache. The L2 cache 104 and the L3 cache 105 are low-level caches relative to the L1 cache 103, and the L3 cache 105 is a low-level cache relative to the L2 cache 104. In this embodiment of the present invention, a low-level cache and a high-level cache are in an inclusive relationship: every cache line in a high-level cache also exists in the corresponding low-level cache.
The processor 101 may further include a cache controller 106. The cache controller 106 is configured to select the corresponding data unit (cache line) according to different types of message requests and the address information of the requests, and to read, update, or fill it. In a specific implementation, each cache level may have its own control logic, that is, the cache controller 106 shown in FIG. 1 may be deployed in a distributed manner across the different cache levels, or one cache architecture may have a single overall control logic; this is not limited in this embodiment of the present invention. The cache controller 106 may be integrated inside the processor 101 as an independent component, or may be integrated into a processor core 102, with the processor core 102 implementing the functions of the cache controller 106.
In an embodiment of the present invention, the processor 101 may further include a cache replacement policy module 118. The cache replacement policy module 118 is a firmware module integrated in the processor 101; the processor 101 or the cache controller 106 executes the firmware code in the cache replacement policy module 118 to implement the technical solutions of the embodiments of the present invention. The cache replacement policy module 118 includes: (1) code for selecting a first cache line in the low-level cache as the to-be-replaced cache line; (2) code for monitoring, after the first cache line is selected in the low-level cache as the to-be-replaced cache line and before it is replaced because of a low-level cache miss, whether a cache hit on the corresponding cache line of the first cache line occurs in the high-level cache; and (3) code for, if a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, retaining the first cache line in the low-level cache and selecting a second cache line as the to-be-replaced cache line.
The bus 113 is configured to transfer information between the components of the data processing system 100. The bus 113 may use a wired connection or wireless communication; this is not limited in this application. The bus 113 may also be connected to a secondary storage 107, an input/output interface 109, and a communication interface 110.
The storage medium of the secondary storage 107 may be a magnetic medium (for example, a floppy disk, a hard disk, or a magnetic tape), an optical medium (for example, an optical disc), or a semiconductor medium (for example, a solid state disk (SSD)). In some embodiments, the secondary storage 107 may further include remote memory separate from the processor 101, for example a network disk accessed over the communication network 111 through the communication interface 110 and a network storage protocol (including network or cluster file systems such as the Network File System (NFS)).
The input/output interface 109 is connected to an input/output device 114, configured to receive input information and output operation results. The input/output device 114 may be a mouse, a keyboard, a display, an optical drive, or the like.
The communication interface 110 uses a transceiver apparatus such as, but not limited to, a transceiver to communicate with other devices or the communication network 111; the communication interface 110 may be interconnected with the network 111 in a wired or wireless manner. The network 111 may be the Internet, an intranet, a local area network (LAN), a wide area network (WAN), a storage area network (SAN), or any combination of the foregoing networks.
Some features of the embodiments of the present invention may be completed/supported by the processor 101 executing software code in the system memory 108. The system memory 108 may include software, for example an operating system 115 (such as Darwin, RTXC, LINUX, UNIX, OS X, WINDOWS, or an embedded operating system such as VxWorks), an application program 116, and a cache replacement policy module 117.
In an embodiment of the present invention, the processor 101 or the cache controller 106 executes the cache replacement policy module 117 to implement the technical solutions of the embodiments of the present invention. The cache replacement policy module 117 includes: (1) code for selecting a first cache line in the low-level cache as the to-be-replaced cache line; (2) code for monitoring, after the first cache line is selected in the low-level cache as the to-be-replaced cache line and before it is actually replaced because of a low-level cache miss, whether a cache hit on the corresponding cache line of the first cache line occurs in the high-level cache; and (3) code for, if a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, retaining the first cache line in the low-level cache and selecting a second cache line as the to-be-replaced cache line.
FIG. 2 is a schematic diagram of a cache structure according to an embodiment of the present invention. When the cache is accessed, a memory address is divided into three fields: tag, index, and offset. When accessing data, the processor first looks up in the cache, based on the memory address, whether data corresponding to the address is present. On a hit, the data is read directly from the cache; if the cache holds no data corresponding to the address, the processor fetches the data from memory and writes the fetched data into the cache. As shown in FIG. 2, the memory address is 32 bits, and the cache capacity is 32 kilobytes (KB), organized as 64 sets with 8-way set associativity (that is, 8 cache lines per set) and a cache line size of 64 bytes.
offset: cache operations are performed at cache-line granularity, so the position of a piece of data within a cache line is determined by the offset. With a 64-byte cache line, the low 6 bits of the address serve as the offset.
index: used to find which set a cache line belongs to. In FIG. 2, the cache has 64 sets in total, so the middle 6 bits of the address are used to find the set an address belongs to.
tag: on a cache access, the index first determines which set the access address belongs to; once the set is determined, the tag bits of the lines in that set are compared with the tag bits of the access address. If a matching tag is found, a cache hit occurs; if no tag matches, a cache miss occurs. In FIG. 2, after the set is found by the index, the tag bits of the 8 cache lines in the set (the remaining address bits, namely the high 20 bits of the address in FIG. 2) are compared with the tag bits of the access address.
A cache line is uniquely determined by its tag and index.
It should be understood that FIG. 2 is merely an example of a cache structure; the embodiments of the present invention do not limit the actual cache geometry.
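For the geometry in FIG. 2 (32-bit address, 64-byte lines, 64 sets, leaving a 20-bit tag), the three fields can be extracted as in the following minimal sketch; the function name and constants are illustrative, not from the patent:

```python
LINE_BYTES = 64                             # 64-byte cache line -> 6 offset bits
NUM_SETS = 64                               # 64 sets            -> 6 index bits
OFFSET_BITS = LINE_BYTES.bit_length() - 1   # = 6
INDEX_BITS = NUM_SETS.bit_length() - 1      # = 6

def split_address(addr):
    """Split a 32-bit memory address into (tag, index, offset) per FIG. 2."""
    offset = addr & (LINE_BYTES - 1)                 # low 6 bits
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)   # middle 6 bits
    tag = addr >> (OFFSET_BITS + INDEX_BITS)         # remaining high 20 bits
    return tag, index, offset

# example: all three fields recovered from one 32-bit address
assert split_address(0x1234ABCD) == (0x1234A, 0x2F, 0x0D)
```

The tag and index together uniquely determine the cache line, matching the statement above.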
FIG. 3 is a flowchart of a cache replacement method 300 according to an embodiment of the present invention. The method 300 is applied to a computer system that includes a high-level cache and a low-level cache, where the low-level cache and the high-level cache are in an inclusive relationship, that is, the corresponding cache lines of all cache lines in the high-level cache also exist in the low-level cache. As shown in FIG. 3, the method 300 includes:
S302: The processor selects a first cache line in the low-level cache as the to-be-replaced cache line.
When the low-level cache is full and a cache miss occurs, the processor cannot find free space to hold the missed cache line and must reuse space that already holds a cache line, that is, overwrite an existing cache line in the low-level cache with the new one.
The to-be-replaced cache line is used to replace the missed cache line when a cache miss occurs in the low-level cache; that is, when the low-level cache miss occurs, the to-be-replaced cache line is moved out of the low-level cache to make room for the cache line on which the miss occurred.
The embodiments of the present invention do not limit the cache algorithm used to select the first cache line in the low-level cache as the to-be-replaced line. The algorithm may be first in first out (FIFO), last in first out (LIFO), LRU, pseudo-LRU (PLRU), random replacement (RR), segmented LRU (SLRU), least-frequently used (LFU), LFU with dynamic aging (LFUDA), low inter-reference recency set (LIRS), adaptive replacement cache (ARC), multi queue (MQ), or the like.
For example, the processor may select the first cache line in the low-level cache as the to-be-replaced cache line according to the LRU policy.
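A minimal sketch of LRU victim selection in one set of the low-level cache (the data structure and names are illustrative; the patent does not prescribe any of them): the line at the LRU end of the order is the current to-be-replaced line, a hit on the monitored line promotes it to MRU, and a fill evicts the current LRU line.

```python
from collections import OrderedDict

class LruSet:
    """One set of the low-level cache under plain LRU order (illustrative).
    The first entry is the LRU line, i.e. the current to-be-replaced line."""

    def __init__(self, lines):
        # `lines` is given from least to most recently used
        self.lines = OrderedDict((line, True) for line in lines)

    def victim(self):
        return next(iter(self.lines))        # line at the LRU end

    def touch(self, line):
        self.lines.move_to_end(line)         # promote the line to MRU

    def replace(self, new_line):
        self.lines.popitem(last=False)       # evict the LRU line
        self.lines[new_line] = True          # the fill arrives as MRU

s = LruSet(["G", "E", "F", "D", "C", "B"])   # G least recently used
assert s.victim() == "G"                     # G would be replaced next
s.touch("G")                                 # monitored hit: keep G, promote it
assert s.victim() == "E"                     # a second line becomes the victim
```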
S304: The processor monitors whether a cache hit on the corresponding cache line of the first cache line occurs in the high-level cache.
After selecting the first cache line in the low-level cache as the to-be-replaced cache line, the processor monitors whether a cache hit on the corresponding cache line of the first cache line occurs in the high-level cache. By the temporal and spatial locality of cache accesses, if the data of the first cache line is frequently accessed (hot) data in the high-level cache, there is a high probability that the corresponding cache line of the first cache line will be hit in the high-level cache before the low-level cache next suffers a cache miss.
If the high-level cache suffers a cache miss but the low-level cache contains the cache line corresponding to the missed line, the data of the missed cache line can be written directly into the high-level cache; in that case no cache miss occurs in the low-level cache. A low-level cache miss occurs only when the high-level cache misses and the low-level cache does not contain the corresponding cache line of the line that missed in the high-level cache. Because the high-level cache is smaller than the low-level cache, if the first cache line's data is frequently accessed in the high-level cache, there is a high probability that it will be hit in the high-level cache before a low-level cache miss occurs.
In the embodiments of the present invention, the high-level cache identifies the cache line in the monitored state through certain indication information. A cache line being in the monitored state means that the processor monitors whether a cache hit on that line occurs in the high-level cache; the embodiments of the present invention do not limit the manner of marking the corresponding cache line of the first cache line as being in the monitored state.
In an embodiment of the present invention, the processor may use a status register to save the identifier of the cache line in the monitored state; the identifier may be the tag bits and index bits of the cache line. After selecting the first cache line in the low-level cache as the to-be-replaced cache line, the processor writes the identifier of the first cache line into the status register (the first cache line and its corresponding cache line have the same identifier). Monitoring whether a cache hit on the corresponding cache line of the first cache line occurs in the high-level cache may specifically be: comparing the identifier of the cache line on which a cache hit occurs in the high-level cache with the identifier of the first cache line saved in the status register; if the two are the same, a cache hit on the corresponding cache line of the first cache line has occurred.
In another embodiment of the present invention, the processor may use an indication flag bit associated with each cache line to record the state of the cache line; for example, the tag field of the cache may be extended by one bit to serve as the indication flag. Each cache line in the high-level cache corresponds to one indication flag bit: when the bit is in the first state, the corresponding cache line is not in the monitored state; when it is in the second state, the corresponding cache line is in the monitored state. The processor monitors only cache lines whose indication flag bit is in the second state. After selecting the first cache line in the low-level cache as the to-be-replaced cache line, the processor searches the high-level cache for the corresponding cache line of the first cache line and sets its indication flag bit to the second state. Monitoring whether a cache hit on the corresponding cache line of the first cache line occurs in the high-level cache may specifically be: monitoring whether a hit event occurs on the corresponding cache line of the first cache line, whose indication flag bit is in the second state.
S306: If a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, retain the first cache line in the low-level cache and select a second cache line as the to-be-replaced cache line.
Specifically, the processor may select, according to the cache replacement algorithm, a second cache line other than the first cache line as the to-be-replaced cache line. For example, according to the LRU algorithm, the processor may select the least recently used second cache line other than the first cache line as the to-be-replaced cache line.
If the high-level cache includes the corresponding cache line of the first cache line, and a hit on it occurs in the high-level cache before the low-level cache miss, the corresponding cache line of the first cache line is frequently accessed (hot) data in the high-level cache. To prevent the first cache line, whose data is hot, from being back-invalidated, the processor retains the first cache line in the low-level cache and reselects a second cache line as the to-be-replaced cache line. Because the corresponding cache line of the first cache line is hot data in the high-level cache, the processor may update the status of the first cache line in the low-level cache to most recently used.
Further, after the cache hit on the corresponding cache line of the first cache line, the high-level cache cancels the monitoring of that cache line. Specifically, if the high-level cache uses a status register to record the identifier of the first cache line, the identifier of the first cache line recorded in the status register is deleted. If the high-level cache uses an indication flag bit to indicate whether the corresponding cache line of the first cache line is in the monitored state (first state: not monitored; second state: monitored), the high-level cache sets the indication flag bit of the corresponding cache line of the first cache line to the first state to end the monitoring of that cache line.
If the high-level cache does not access the corresponding cache line of the first cache line before a low-level cache miss occurs, then when the low-level cache miss occurs, the first cache line is moved out of the low-level cache and the cache line on which the miss occurred is moved into the low-level cache.
Specifically, if the high-level cache includes the corresponding cache line of the first cache line and no hit on it occurs in the high-level cache before the low-level cache miss, then when the low-level cache miss occurs, the first cache line is moved out of the low-level cache. To preserve the inclusive relationship between the high-level cache and the low-level cache, the processor back-invalidates the corresponding cache line of the first cache line in the high-level cache. Further, the processor cancels the monitoring of the corresponding cache line of the first cache line in the high-level cache, in the manner described above, which is not repeated here. The processor also writes the missed cache line into the low-level cache and the high-level cache.
If the high-level cache does not include the corresponding cache line of the first cache line, and the processor does not access the data of the first cache line before the low-level cache next suffers a cache miss, that is, no cache miss on the corresponding cache line of the first cache line occurs in the high-level cache, then when the low-level cache miss occurs, the first cache line is moved out of the low-level cache and the missed cache line is updated into the low-level cache and the high-level cache.
If the high-level cache does not include the corresponding cache line of the first cache line, and the processor accesses the data of the first cache line before the low-level cache next suffers a cache miss, the high-level cache suffers a cache miss on the corresponding cache line of the first cache line. The processor then looks up the first cache line in the low-level cache and updates the data of the first cache line into the high-level cache, so a hit on the first cache line occurs in the low-level cache. The processor retains the first cache line in the low-level cache and reselects a second cache line as the to-be-replaced cache line, which prevents the corresponding cache line of the first cache line, just updated into the high-level cache, from being back-invalidated. Further, the status of the first cache line in the low-level cache may be updated to most recently used (MRU).
According to the technical solutions disclosed in the embodiments of the present invention, after selecting the first cache line in the low-level cache as the to-be-replaced cache line, the processor monitors whether a hit event on the corresponding cache line of the first cache line occurs in the high-level cache. If such a hit occurs, the corresponding cache line of the first cache line is frequently accessed data (hot data) in the high-level cache; the processor retains the first cache line in the low-level cache and reselects a second cache line as the to-be-replaced cache line. This ensures that the corresponding cache line of the first cache line, being hot data, is not invalidated in the high-level cache as a result of replacement in the low-level cache, which improves the hit rate of the high-level cache and the overall performance of the system.
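Steps S302 to S306 can be sketched end to end as below. This is a simplified model under assumed names (the high-level cache is modeled as a plain set of line names, only one victim is monitored at a time, and the back-invalidation is a direct call), not the patent's implementation:

```python
from collections import OrderedDict

class MonitoredLowLevelCache:
    """Low-level cache that monitors its chosen victim in the high-level
    cache, per steps S302-S306 (structure and names are illustrative)."""

    def __init__(self, lines, high_level):
        # `lines` given from least to most recently used; front = LRU
        self.lines = OrderedDict((l, True) for l in lines)
        self.high_level = high_level                 # set of lines present in L1
        self.victim = next(iter(self.lines))         # S302: pick the LRU victim

    def on_high_level_hit(self, line):
        if line == self.victim:                      # S304/S306: victim's line hit
            self.lines.move_to_end(line)             # retain it, promote to MRU
            self.victim = next(iter(self.lines))     # select a second victim
            return True
        return False

    def on_miss(self, new_line):
        evicted = self.victim
        del self.lines[evicted]                      # move victim out of L2
        self.high_level.discard(evicted)             # back-invalidate it in L1
        self.lines[new_line] = True                  # fill the missed line
        self.victim = next(iter(self.lines))
        return evicted

l1 = {"F", "B", "A"}
l2 = MonitoredLowLevelCache(["F", "E", "D", "C", "B", "A"], l1)
assert l2.victim == "F"
assert l2.on_high_level_hit("F")     # hit in L1 before any L2 miss: F survives
assert l2.victim == "E"
assert l2.on_miss("H") == "E"        # a later miss evicts E instead; F stays hot
assert "F" in l1 and "F" in l2.lines
```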
FIG. 4 is a schematic diagram of cache state changes according to an embodiment of the present invention. As shown in FIG. 4, the cache system 400 includes a high-level cache 401 and a low-level cache 402. In state 1, the low-level cache 402 includes six cache lines B, C, D, E, F and G, and the high-level cache 401 includes three cache lines B, F and C; the high-level cache 401 and the low-level cache 402 are in an inclusive relationship, that is, the cache lines B, F and C in the high-level cache also exist in the low-level cache. The low-level cache maintains a logical LRU chain; in a specific implementation, this may be realized by recording how many times, or for how long, each cache line has gone unhit — the present invention does not limit the implementation form of the LRU chain. In state 1, the cache line at the tail of the LRU chain is cache line G, so under the replacement policy cache line G is the to-be-replaced cache line; when the low-level cache next suffers a cache miss, the to-be-replaced cache line is used to replace the missed cache line, that is, when the low-level cache 402 next suffers a cache miss, cache line G is moved out of the low-level cache 402.
In state 1, if the processor initiates an access to data A, the high-level cache 401 suffers a cache miss for data A because it does not contain a cache line for data A. The processor then looks up whether the low-level cache 402 contains data A; since the low-level cache 402 does not contain data A either, it also suffers a cache miss for data A. The processor fetches data A from a lower-level cache or from memory and updates the cache line for data A into the low-level cache 402 and the high-level cache 401, and the cache system enters state 2.
As shown in FIG. 4, in state 2, cache line A has been updated into the low-level cache 402 and the high-level cache 401, the to-be-replaced cache line G of state 1 has been moved out of the low-level cache 402, and cache line C has been moved out of the high-level cache 401. In state 2, the newly added cache line A is at the head of the LRU chain and cache line F is at its tail, so in state 2 cache line F is the to-be-replaced cache line of the low-level cache 402.
In state 2, after cache line F in the low-level cache 402 is determined as the to-be-replaced cache line, the processor monitors whether a hit event on cache line F occurs in the high-level cache 401. If a hit on cache line F occurs in the high-level cache 401 before cache line F is replaced in the low-level cache 402 because of a low-level cache miss, cache line F is hot data in the high-level cache 401. To prevent cache line F in the high-level cache from being back-invalidated, the processor keeps cache line F in the low-level cache 402 and selects cache line E as the to-be-replaced cache line, and the cache system enters state 3.
As shown in the figure, in state 3, because a hit event on cache line F occurred in the high-level cache, in the updated LRU chain of the low-level cache, cache line F has been promoted to MRU at the head of the chain, and cache line E, now at the tail of the chain, is selected as the to-be-replaced cache line.
According to the technical solutions disclosed in the embodiments of the present invention, once cache line F is determined as the to-be-replaced cache line of the low-level cache 402, the processor monitors whether a hit event on cache line F occurs in the high-level cache 401 before cache line F is moved out of the low-level cache because of a low-level cache miss. If such a hit event occurs in the high-level cache 401, cache line F is hot data; to prevent cache line F from being back-invalidated, the processor keeps cache line F in the low-level cache 402 and reselects cache line E as the to-be-replaced cache line. The hot cache line F thus stays in the high-level cache 401, which increases the hit rate of the high-level cache and improves the overall performance of the processor system.
FIG. 5 is an exemplary flowchart 500 of a cache replacement method according to an embodiment of the present invention. The cache architecture includes a low-level cache and a high-level cache, each with its own control logic. As shown in FIG. 5, the method 500 includes:
S502: The low-level cache selects a first cache line as the to-be-replaced cache line.
The policy by which the low-level cache selects the to-be-replaced cache line has been described above and is not repeated here. For ease of description, this embodiment assumes that the low-level cache uses the LRU replacement policy to select the first cache line as the to-be-replaced cache line: when a status update occurs, the cache line in the LRU state is determined as the first cache line.
S504: The low-level cache sends a first notification message to the high-level cache.
The first notification message notifies the high-level cache that the first cache line has been determined by the low-level cache as the to-be-replaced cache line. The first notification message includes indication information of the first cache line, which may specifically be the index and tag information of the first cache line; this is not limited in this embodiment of the present invention.
In a specific implementation, the low-level cache may send the first notification message to the high-level cache when the status is updated and the first cache line is changed to the LRU state.
After receiving the first notification message from the low-level cache, the high-level cache first looks up, according to the indication information of the first cache line, whether it contains the corresponding cache line of the first cache line. If the high-level cache does not contain the corresponding cache line of the first cache line, it may discard the first notification message. If it does, the high-level cache performs step S506. For ease of description, this embodiment assumes that both the high-level cache and the low-level cache contain the first cache line; the case where the low-level cache contains the first cache line but the high-level cache does not has been described in the embodiment of FIG. 3 and is not repeated here. In the following description of this embodiment, unless otherwise stated, the high-level cache is assumed to contain the first cache line.
S506: The high-level cache monitors whether a hit event on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache. If it occurs, step S508 is performed; if not, step S512 is performed.
In a specific implementation, the high-level cache may mark the corresponding cache line of the first cache line in various ways to indicate that it is in the monitored state. The high-level cache may use a register to record the identifier of the first cache line (which is the same as the identifier of its corresponding cache line), or use a flag bit to mark the corresponding cache line of the first cache line. For example, a 1-bit flag FI may be added to the tag of each cache line, with initial and reset value 0. When the high-level cache receives the first notification message, if the corresponding cache line of the first cache line exists in the high-level cache, the high-level cache sets the FI flag of that cache line to 1, indicating that the corresponding cache line of the first cache line is in the monitored state.
If a hit on the corresponding cache line of the first cache line occurs before the high-level cache receives the back-invalidation message for the first cache line from the low-level cache, step S508 is performed, and the monitoring of the corresponding cache line of the first cache line is cancelled: if the high-level cache uses a register to record the identifier of the first cache line, the identifier of the first cache line recorded in the register is deleted; if the high-level cache uses the FI flag to indicate that the corresponding cache line of the first cache line is in the monitored state, the FI flag of that cache line is set to 0.
Specifically, if the high-level cache uses a register to record the identifier of the first cache line, then after every cache hit event the high-level cache compares the identifier of the hit cache line with the identifier of the first cache line saved in the register; if the two are the same, a hit on the corresponding cache line of the first cache line has occurred, step S508 is triggered, and the identifier of the first cache line recorded in the register is deleted. If the high-level cache uses the FI flag to indicate that the corresponding cache line of the first cache line is in the monitored state, then when a cache hit occurs in the high-level cache and the FI flag of the hit cache line is 1, a hit on the corresponding cache line of the first cache line has occurred, step S508 is triggered, and the FI flag of that cache line is set to 0.
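The FI-flag variant of S506 can be sketched as follows. The class and function names are illustrative; only the 1-bit flag semantics (initial and reset value 0, set to 1 on the first notification message, checked and cleared on a hit) come from the text above:

```python
class HighLevelLine:
    """A high-level-cache line whose tag carries a 1-bit FI flag
    (initial and reset value 0), per the flag-bit variant above."""

    def __init__(self, tag):
        self.tag = tag
        self.fi = 0                     # 0: not monitored, 1: monitored

def on_first_notification(lines, victim_tag):
    """The low-level cache announced its victim: mark the matching line, if any."""
    for line in lines:
        if line.tag == victim_tag:
            line.fi = 1

def on_hit(line):
    """Returns True when the hit is on the monitored line (and clears the flag),
    i.e. when step S508 should be triggered."""
    if line.fi == 1:
        line.fi = 0                     # monitoring ends once the hit is seen
        return True
    return False

lines = [HighLevelLine("B"), HighLevelLine("F"), HighLevelLine("C")]
on_first_notification(lines, "F")       # low-level cache chose F's line as victim
assert [l.fi for l in lines] == [0, 1, 0]
assert on_hit(lines[0]) is False        # hit on B: not monitored, nothing sent
assert on_hit(lines[1]) is True         # hit on F: notify the low-level cache
assert lines[1].fi == 0                 # flag reset to 0
```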
S508: The high-level cache sends a second notification message to the low-level cache.
The second notification message indicates that a hit event on the corresponding cache line of the first cache line has occurred in the high-level cache.
S510: The low-level cache retains the first cache line and selects a second cache line as the to-be-replaced cache line.
Because a hit event on the corresponding cache line of the first cache line occurred in the high-level cache, by the temporal and spatial locality of cache accesses, the corresponding cache line of the first cache line is hot data in the high-level cache. To prevent it from being back-invalidated in the high-level cache, the low-level cache retains the first cache line and reselects the to-be-replaced cache line.
Further, the low-level cache may update the status of the first cache line to MRU. After the status update, the second cache line, now in the LRU state, is determined as the to-be-replaced cache line. It should be understood that this embodiment uses the LRU policy merely as an example; the embodiments of the present invention do not limit the cache algorithm used to select the second cache line as the to-be-replaced line.
S512: The high-level cache suffers a cache miss on a third cache line.
Because the high-level cache is small and closer to the processor core, cache-line replacement in it is relatively frequent, so soon after receiving the first notification message the high-level cache suffers a cache miss. After the cache miss on the third cache line occurs, the high-level cache looks up whether the low-level cache contains the corresponding cache line of the third cache line.
S514: The low-level cache determines whether a cache miss on the corresponding cache line of the third cache line occurs.
If the low-level cache contains the corresponding cache line of the third cache line, the data of that cache line can be updated directly into the high-level cache; no cache miss occurs in the low-level cache, so the first cache line, the to-be-replaced cache line, is not moved out of the low-level cache.
If the low-level cache does not contain the corresponding cache line of the third cache line, the low-level cache suffers a cache miss on it; the low-level cache must fetch the data of the third cache line from a lower-level cache or from memory, and performs step S516.
S516: The low-level cache moves out the first cache line and moves in the data of the third cache line.
Because a cache miss on the corresponding cache line of the third cache line occurred, after fetching the data of the third cache line from a lower-level cache or from memory, the low-level cache replaces the current to-be-replaced cache line (the first cache line) with the data of the third cache line and writes the data of the third cache line into the high-level cache.
S518: The low-level cache sends an invalidation message to the high-level cache.
To preserve the inclusive relationship, after moving out the first cache line the low-level cache must send the high-level cache an invalidation message for the corresponding cache line of the first cache line, used to invalidate that cache line in the high-level cache. The invalidation message may carry the identifier information of the first cache line, which may be its index and tag information.
S520: The high-level cache invalidates the corresponding cache line of the first cache line.
After receiving the invalidation message for the corresponding cache line of the first cache line, the high-level cache invalidates that cache line. Specifically, the high-level cache may look up the corresponding cache line of the first cache line according to the identifier of the first cache line carried in the invalidation message, and if the corresponding cache line is found, mark it as invalid. Further, the high-level cache terminates the monitoring of the corresponding cache line of the first cache line.
FIG. 6 is a schematic diagram of the logical structure of a cache replacement system 600 according to an embodiment of the present invention. As shown in FIG. 6, the system 600 includes a processor 101. The processor 101 includes two processor cores, processor core 102-1 and processor core 102-2, and a three-level cache architecture, where the L1 cache 103-1 and the L2 cache 104-1 are private caches of processor core 102-1, the L1 cache 103-2 and the L2 cache 104-2 are private caches of processor core 102-2, and the L3 cache 105 is a cache shared by processor core 102-1 and processor core 102-2. The processor 101 is interconnected with the system memory subsystem 108.
It should be understood that the system 600 is merely an example; the cache replacement policy of the embodiments of the present invention applies both to data processing systems with more cache levels (more than three) and to data processing systems with a two-level cache architecture.
The processor 101 further includes a cache replacement policy module 118, which is firmware integrated inside the processor 101.
Further, the processor 101 may also include a cache controller 106.
When the processor 101 or the cache controller 106 executes the cache replacement policy module 118, the method described in any one of the embodiments of FIG. 3 to FIG. 5 is performed; the specific procedures have been described above and are not repeated here.
FIG. 7 is a schematic diagram of the logical structure of a cache replacement apparatus 700 according to an embodiment of the present invention. The apparatus 700 is applied to a computer system that includes a high-level cache and a low-level cache, where the low-level cache and the high-level cache are in an inclusive relationship. As shown in FIG. 7, the apparatus 700 includes a selection unit 702 and a monitoring unit 704.
The selection unit 702 is configured to select a first cache line in the low-level cache as the to-be-replaced cache line, where the to-be-replaced cache line is moved out of the low-level cache when a cache miss occurs in the low-level cache.
The monitoring unit 704 is configured to monitor whether a hit on the corresponding cache line of the first cache line occurs in the high-level cache. If a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, the selection unit 702 is further configured to retain the first cache line in the low-level cache and select a second cache line as the to-be-replaced cache line.
As shown in FIG. 8, the apparatus 700 may further include a write unit 706 and an invalidation unit 708.
Optionally, the high-level cache is associated with a status register used to save the identifier of the cache line in the monitored state. After the selection unit 702 selects the first cache line in the low-level cache as the to-be-replaced cache line, the write unit 706 is configured to write the identifier of the first cache line into the status register. The monitoring unit 704 is configured to compare the identifier of the cache line on which a cache hit occurs in the high-level cache with the identifier of the first cache line saved in the status register; if the two are the same, a hit on the corresponding cache line of the first cache line has occurred.
Optionally, each cache line in the high-level cache corresponds to an indication flag bit; when the indication flag bit is in the first state, the corresponding cache line is not in the monitored state, and when it is in the second state, the corresponding cache line is in the monitored state. After the selection unit 702 selects the first cache line in the low-level cache as the to-be-replaced cache line, the monitoring unit 704 is further configured to search the high-level cache for the corresponding cache line of the first cache line and set the indication flag bit corresponding to that cache line to the second state.
If the high-level cache has not accessed the corresponding cache line of the first cache line before a cache miss occurs in the low-level cache, then when the low-level cache miss occurs, the write unit 706 is further configured to move the first cache line out of the low-level cache and move the cache line on which the miss occurred into the low-level cache. After the write unit 706 moves the first cache line out of the low-level cache, the invalidation unit 708 is configured to back-invalidate the corresponding cache line of the first cache line in the high-level cache.
Optionally, the selection unit 702 is configured to select the first cache line in the low-level cache as the to-be-replaced cache line according to the least recently used (LRU) policy. If a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, the selection unit 702 is further configured to update the status of the first cache line in the low-level cache to most recently used (MRU).
In this embodiment of the present invention, the selection unit 702, the monitoring unit 704, the write unit 706 and the invalidation unit 708 may be implemented by the processor 101 executing the code in the cache replacement policy module 118.
This embodiment of the present invention is the apparatus embodiment corresponding to the method embodiments of FIG. 3 to FIG. 6; the feature descriptions in the embodiments of FIG. 3 to FIG. 6 apply to this embodiment and are not repeated here.
The foregoing embodiments are merely intended to describe the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that modifications may still be made to the technical solutions described in the foregoing embodiments, or some technical features thereof may be replaced; such modifications or replacements do not make the corresponding technical solutions depart from the protection scope of the claims.

Claims (21)

  1. A cache replacement method in a computer, wherein the computer comprises a high-level cache and a low-level cache, and the low-level cache and the high-level cache are in an inclusive relationship, the method comprising:
    selecting a first cache line in the low-level cache as a to-be-replaced cache line, wherein the to-be-replaced cache line is moved out of the low-level cache when a cache miss occurs in the low-level cache;
    monitoring whether a hit on a corresponding cache line of the first cache line occurs in the high-level cache; and
    if a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, retaining the first cache line in the low-level cache and selecting a second cache line as the to-be-replaced cache line.
  2. The method according to claim 1, wherein the high-level cache is associated with a status register, and the status register is used to save an identifier of a cache line that is in a monitored state;
    after the selecting a first cache line in the low-level cache as a to-be-replaced cache line, the method further comprises: writing the identifier of the first cache line into the status register; and
    the monitoring whether a hit on a corresponding cache line of the first cache line occurs in the high-level cache comprises: comparing an identifier of a cache line on which a cache hit occurs in the high-level cache with the identifier of the first cache line saved in the status register, wherein if the two are the same, a hit on the corresponding cache line of the first cache line has occurred.
  3. The method according to claim 1, wherein each cache line in the high-level cache corresponds to an indication flag bit, wherein when the indication flag bit is in a first state, the corresponding cache line is not in a monitored state, and when the indication flag bit is in a second state, the corresponding cache line is in the monitored state; and
    after the selecting a first cache line in the low-level cache as a to-be-replaced cache line, the method further comprises: searching the high-level cache for the corresponding cache line of the first cache line, and setting the indication flag bit corresponding to the corresponding cache line of the first cache line to the second state.
  4. The method according to any one of claims 1 to 3, further comprising: if the high-level cache has not accessed the corresponding cache line of the first cache line before a cache miss occurs in the low-level cache, moving the first cache line out of the low-level cache when the cache miss occurs in the low-level cache.
  5. The method according to claim 4, wherein after the moving the first cache line out of the low-level cache, the method further comprises:
    invalidating the corresponding cache line of the first cache line in the high-level cache.
  6. The method according to any one of claims 1 to 5, wherein the selecting a first cache line in the low-level cache as a to-be-replaced cache line comprises:
    selecting the first cache line in the low-level cache as the to-be-replaced cache line according to a least recently used (LRU) policy.
  7. The method according to claim 6, wherein if a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, the method further comprises: updating a status of the first cache line in the low-level cache to most recently used (MRU).
  8. A cache replacement apparatus in a computer, wherein the computer comprises a high-level cache and a low-level cache, and the low-level cache and the high-level cache are in an inclusive relationship, the apparatus comprising:
    a selection unit, configured to select a first cache line in the low-level cache as a to-be-replaced cache line, wherein the to-be-replaced cache line is moved out of the low-level cache when a cache miss occurs in the low-level cache; and
    a monitoring unit, configured to monitor whether a hit on a corresponding cache line of the first cache line occurs in the high-level cache,
    wherein if a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, the selection unit is further configured to retain the first cache line in the low-level cache and select a second cache line as the to-be-replaced cache line.
  9. The apparatus according to claim 8, wherein the high-level cache is associated with a status register, and the status register is used to save an identifier of a cache line that is in a monitored state;
    the apparatus further comprises a write unit, wherein after the selection unit selects the first cache line in the low-level cache as the to-be-replaced cache line, the write unit is configured to write the identifier of the first cache line into the status register; and
    the monitoring unit is configured to compare an identifier of a cache line on which a cache hit occurs in the high-level cache with the identifier of the first cache line saved in the status register, wherein if the two are the same, a hit on the corresponding cache line of the first cache line has occurred.
  10. The apparatus according to claim 8, wherein each cache line in the high-level cache corresponds to an indication flag bit, wherein when the indication flag bit is in a first state, the corresponding cache line is not in a monitored state, and when the indication flag bit is in a second state, the corresponding cache line is in the monitored state; and
    after the selection unit selects the first cache line in the low-level cache as the to-be-replaced cache line, the monitoring unit is further configured to search the high-level cache for the corresponding cache line of the first cache line, and set the indication flag bit corresponding to the corresponding cache line of the first cache line to the second state.
  11. The apparatus according to any one of claims 8 to 10, further comprising a write unit,
    wherein if the high-level cache has not accessed the corresponding cache line of the first cache line before a cache miss occurs in the low-level cache, the write unit is further configured to move the first cache line out of the low-level cache when the cache miss occurs in the low-level cache.
  12. The apparatus according to claim 11, further comprising an invalidation unit, wherein after the write unit moves the first cache line out of the low-level cache, the invalidation unit is configured to invalidate the corresponding cache line of the first cache line in the high-level cache.
  13. The apparatus according to any one of claims 8 to 12, wherein the selection unit is configured to select the first cache line in the low-level cache as the to-be-replaced cache line according to a least recently used (LRU) policy.
  14. The apparatus according to claim 13, wherein if a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, the selection unit is further configured to update a status of the first cache line in the low-level cache to most recently used (MRU).
  15. A cache replacement system, wherein the system comprises a high-level cache, a low-level cache and a cache controller, and the low-level cache and the high-level cache are in an inclusive relationship;
    the cache controller is configured to:
    select a first cache line in the low-level cache as a to-be-replaced cache line, wherein the to-be-replaced cache line is moved out of the low-level cache when a cache miss occurs in the low-level cache;
    monitor whether a hit on a corresponding cache line of the first cache line occurs in the high-level cache; and
    if a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, retain the first cache line in the low-level cache and select a second cache line as the to-be-replaced cache line.
  16. The system according to claim 15, further comprising a status register, wherein the status register is associated with the high-level cache and is used to save an identifier of a cache line that is in a monitored state;
    after the first cache line is selected in the low-level cache as the to-be-replaced cache line, the cache controller is further configured to write the identifier of the first cache line into the status register; and
    the cache controller is configured to compare an identifier of a cache line on which a cache hit occurs in the high-level cache with the identifier of the first cache line saved in the status register, wherein if the two are the same, a hit on the corresponding cache line of the first cache line has occurred.
  17. The system according to claim 15, wherein each cache line in the high-level cache corresponds to an indication flag bit, wherein when the indication flag bit is in a first state, the corresponding cache line is not in a monitored state, and when the indication flag bit is in a second state, the corresponding cache line is in the monitored state; and
    after the first cache line is selected in the low-level cache as the to-be-replaced cache line, the cache controller is further configured to search the high-level cache for the corresponding cache line of the first cache line, and set the indication flag bit corresponding to the corresponding cache line of the first cache line to the second state.
  18. The system according to any one of claims 15 to 17, wherein if the high-level cache has not accessed the corresponding cache line of the first cache line before a cache miss occurs in the low-level cache, the cache controller is further configured to move the first cache line out of the low-level cache when the cache miss occurs in the low-level cache.
  19. The system according to claim 18, wherein after moving the first cache line out of the low-level cache, the cache controller is further configured to invalidate the corresponding cache line of the first cache line in the high-level cache.
  20. The system according to any one of claims 15 to 19, wherein the cache controller is configured to select the first cache line in the low-level cache as the to-be-replaced cache line according to a least recently used (LRU) policy.
  21. The system according to claim 20, wherein if a hit on the corresponding cache line of the first cache line occurs in the high-level cache before a cache miss occurs in the low-level cache, the cache controller is further configured to update a status of the first cache line in the low-level cache to most recently used (MRU).
PCT/CN2017/075952 2017-03-08 2017-03-08 Cache replacement method, apparatus and system WO2018161272A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
PCT/CN2017/075952 WO2018161272A1 (zh) 2017-03-08 2017-03-08 Cache replacement method, apparatus and system
CN201780023595.4A CN109074320B (zh) 2017-03-08 2017-03-08 Cache replacement method, apparatus and system
EP17899829.0A EP3572946B1 (en) 2017-03-08 2017-03-08 Cache replacement method, device, and system
US16/544,352 US20190370187A1 (en) 2017-03-08 2019-08-19 Cache Replacement Method, Apparatus, and System

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/075952 WO2018161272A1 (zh) 2017-03-08 2017-03-08 Cache replacement method, apparatus and system

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/544,352 Continuation US20190370187A1 (en) 2017-03-08 2019-08-19 Cache Replacement Method, Apparatus, and System

Publications (1)

Publication Number Publication Date
WO2018161272A1 true WO2018161272A1 (zh) 2018-09-13

Family

ID=63447134

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/075952 WO2018161272A1 (zh) 2017-03-08 2017-03-08 一种缓存替换方法,装置和系统

Country Status (4)

Country Link
US (1) US20190370187A1 (zh)
EP (1) EP3572946B1 (zh)
CN (1) CN109074320B (zh)
WO (1) WO2018161272A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109144431A (zh) * 2018-09-30 2019-01-04 Huazhong University of Science and Technology Data block caching method, apparatus, device and storage medium

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200226066A1 (en) * 2020-03-27 2020-07-16 Intel Corporation Apparatus and method for efficient management of multi-level memory
CN112612727B (zh) * 2020-12-08 2023-07-07 成都海光微电子技术有限公司 Cache line replacement method and apparatus, and electronic device
CN112667534B (zh) * 2020-12-31 2023-10-20 海光信息技术股份有限公司 Buffer storage apparatus, processor and electronic device
CN113760787B (zh) * 2021-09-18 2022-08-26 成都海光微电子技术有限公司 Multi-level cache data push system, method, device and computer medium
US20230102891A1 (en) * 2021-09-29 2023-03-30 Advanced Micro Devices, Inc. Re-reference interval prediction (rrip) with pseudo-lru supplemental age information
US20230244606A1 (en) * 2022-02-03 2023-08-03 Arm Limited Circuitry and method
CN117971718A (zh) * 2024-03-28 2024-05-03 北京微核芯科技有限公司 Cache replacement method for a multi-core processor and apparatus thereof

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060143400A1 (en) * 2004-12-29 2006-06-29 Steely Simon C Jr Replacement in non-uniform access cache structure
CN1940892A (zh) * 2005-09-29 2007-04-04 International Business Machines Corporation Circuit arrangement, data processing system and method for evicting cache lines
CN103885890A (zh) * 2012-12-21 2014-06-25 Huawei Technologies Co., Ltd. Cache block replacement processing method and apparatus in a cache
US20140280206A1 (en) * 2013-03-14 2014-09-18 Facebook, Inc. Social cache
CN105930282A (zh) * 2016-04-14 2016-09-07 北京时代民芯科技有限公司 Data caching method for NAND flash

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7805574B2 (en) * 2005-02-09 2010-09-28 International Business Machines Corporation Method and cache system with soft I-MRU member protection scheme during make MRU allocation
US7577793B2 (en) * 2006-01-19 2009-08-18 International Business Machines Corporation Patrol snooping for higher level cache eviction candidate identification
US20090106496A1 (en) * 2007-10-19 2009-04-23 Patrick Knebel Updating cache bits using hint transaction signals
EP2527987A1 (en) * 2008-01-30 2012-11-28 QUALCOMM Incorporated Apparatus and methods to reduce castouts in a multi-level cache hierarchy
US8347037B2 (en) * 2008-10-22 2013-01-01 International Business Machines Corporation Victim cache replacement
US8364898B2 (en) * 2009-01-23 2013-01-29 International Business Machines Corporation Optimizing a cache back invalidation policy
US9411728B2 (en) * 2011-12-23 2016-08-09 Intel Corporation Methods and apparatus for efficient communication between caches in hierarchical caching design
US8874852B2 (en) * 2012-03-28 2014-10-28 International Business Machines Corporation Data cache block deallocate requests in a multi-level cache hierarchy
US9361237B2 (en) * 2012-10-18 2016-06-07 Vmware, Inc. System and method for exclusive read caching in a virtualized computing environment
US20160055100A1 (en) * 2014-08-19 2016-02-25 Advanced Micro Devices, Inc. System and method for reverse inclusion in multilevel cache hierarchy
US9558127B2 (en) * 2014-09-09 2017-01-31 Intel Corporation Instruction and logic for a cache prefetcher and dataless fill buffer

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060143400A1 (en) * 2004-12-29 2006-06-29 Steely Simon C Jr Replacement in non-uniform access cache structure
CN1940892A (zh) * 2005-09-29 2007-04-04 International Business Machines Corporation Circuit arrangement, data processing system and method for evicting cache lines
CN103885890A (zh) * 2012-12-21 2014-06-25 Huawei Technologies Co., Ltd. Cache block replacement processing method and apparatus in a cache
US20140280206A1 (en) * 2013-03-14 2014-09-18 Facebook, Inc. Social cache
CN105930282A (zh) * 2016-04-14 2016-09-07 北京时代民芯科技有限公司 Data caching method for NAND flash

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3572946A4 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109144431A (zh) * 2018-09-30 2019-01-04 Huazhong University of Science and Technology Data block caching method, apparatus, device and storage medium
CN109144431B (zh) * 2018-09-30 2021-11-02 Huazhong University of Science and Technology Data block caching method, apparatus, device and storage medium

Also Published As

Publication number Publication date
EP3572946A4 (en) 2020-02-19
CN109074320B (zh) 2023-11-17
EP3572946A1 (en) 2019-11-27
CN109074320A (zh) 2018-12-21
US20190370187A1 (en) 2019-12-05
EP3572946B1 (en) 2022-12-07

Similar Documents

Publication Publication Date Title
WO2018161272A1 (zh) Cache replacement method, apparatus and system
US9043556B2 (en) Optimizing a cache back invalidation policy
US11106600B2 (en) Cache replacement based on translation lookaside buffer evictions
KR102043886B1 (ko) 프로파일링 캐시 대체
US9817760B2 (en) Self-healing coarse-grained snoop filter
JP4226057B2 (ja) 包含キャッシュにおける望ましくない置換動作を低減するための先行犠牲選択のための方法及び装置
US7380065B2 (en) Performance of a cache by detecting cache lines that have been reused
US8291175B2 (en) Processor-bus attached flash main-memory module
US10725923B1 (en) Cache access detection and prediction
KR20190058316A (ko) 예측에 기초하여 효율적으로 캐시 라인을 관리하는 시스템 및 방법
US20110320720A1 (en) Cache Line Replacement In A Symmetric Multiprocessing Computer
CN109154912B (zh) 根据另一个高速缓存中条目的可用性替换高速缓存条目
US11030115B2 (en) Dataless cache entry
US11507519B2 (en) Data compression and encryption based on translation lookaside buffer evictions
JP2017072982A (ja) 情報処理装置、キャッシュ制御方法およびキャッシュ制御プログラム
KR20180122969A (ko) 멀티 프로세서 시스템 및 이에 포함된 프로세서의 데이터 관리 방법
US7093075B2 (en) Location-based placement algorithms for set associative cache memory
JP5536377B2 (ja) コヒーレンス性追跡方法およびデータ処理システム(領域コヒーレンス配列に対する領域ビクティム・ハッシュの実装で強化されたコヒーレンス性追跡)
EP3885920A1 (en) Apparatus and method for efficient management of multi-level memory
KR102665339B1 (ko) 변환 색인 버퍼 축출 기반 캐시 교체
JP5045334B2 (ja) キャッシュシステム

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 201780023595.4

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17899829

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2017899829

Country of ref document: EP

Effective date: 20190819

NENP Non-entry into the national phase

Ref country code: DE