CN117312187A - Method for improving operation efficiency of cache consistency mechanism - Google Patents

Publication number
CN117312187A
Authority
CN
China
Prior art keywords
state
read
block
write
consistency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311418036.7A
Other languages
Chinese (zh)
Inventor
何震子
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Cetc Shentai Information Technology Co ltd
Original Assignee
Cetc Shentai Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Cetc Shentai Information Technology Co ltd
Priority to CN202311418036.7A
Publication of CN117312187A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806 Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0815 Cache consistency protocols
    • G06F12/0831 Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means
    • G06F12/0833 Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means in combination with broadcast means (e.g. for invalidation or updating)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention discloses a method for improving the operating efficiency of a cache coherence mechanism and belongs to the field of data processing. The invention adds state bits to the directory of the last-level cache (LLC) that encode four states: private read, private read-write, shared read, and shared read-write. Coherence operations are performed only for blocks in the shared read-write state. The invention also provides the transition conditions among the four states and a scheme for starting the coherence operation, thereby improving the operating efficiency of the multi-core coherence mechanism.

Description

Method for improving operation efficiency of cache consistency mechanism
Technical Field
The invention relates to the technical field of data processing, in particular to a method for improving the operation efficiency of a cache consistency mechanism.
Background
Modern computers commonly use multi-core processors, and a multi-core data coherence problem arises when several cores read and write data stored in the same region. In general, each core has its own private cache, and how to synchronize the data held in these private caches when necessary is the central problem of multi-core coherence research.
Modern computers place ever higher demands on the operating efficiency of the multi-core coherence mechanism, which depends on the design of the cache coherence protocol. Unlike broadcast protocols, conventional cache coherence protocols track every cached block. This makes each copy of a cached block easier to locate, but it also incurs many unnecessary operations. Eliminating coherence operations for cache blocks that have no coherence requirement is therefore an effective way to improve the operating efficiency of the cache coherence mechanism.
Disclosure of Invention
The invention aims to provide a method for improving the operation efficiency of a cache consistency mechanism, so as to solve the problems described in the Background section.
In order to solve the technical problems, the invention provides a method for improving the operation efficiency of a cache consistency mechanism, which comprises the following steps:
adding a 2-bit LS state field to the Tag RAM of the LLC to indicate four states: private read, private read-write, shared read, and shared read-write; updating the state in real time according to the coherence request transactions that each processor core issues to the LLC; and determining from the state whether the block needs coherence operations such as snooping;
adding an FR field to the Tag RAM of the LLC to record the ID of the first core that reads or writes the block, this ID being used to determine whether a core that subsequently operates on the block is the same core, and hence whether the state needs to be modified;
adding a 1-bit C bit to the Tag RAM of the LLC to indicate whether the block has been written into a private cache and whether coherence has been started for the block;
and performing no multi-core coherence tracking for the private read, private read-write, and shared read states, the coherence flow being applied only to the shared read-write state.
In one embodiment, a 2-bit state field is added to each block of the LLC, and the state changes according to the read and write requests that different cores issue to the LLC.
In one embodiment, the state transitions among the four states are irreversible: once a block has moved to the shared read-write state, it remains in that state until the block is replaced or the cache is reset.
In one embodiment, the state transition mechanism identifies, for each request, whether its source is the same core that issued the first request and whether the request is a read or a write, and uses both as the basis for the state transition.
In one embodiment, the coherence flow must be performed for blocks in the shared read-write state; before a cache block changes from any other state to the shared read-write state, the coherence flow must first be started.
In one embodiment, the bit width of the FR bits depends on the number of cores.
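The three Tag RAM fields can be pictured with a minimal sketch. The patent states only that the FR width depends on the number of cores; the formula used here (the smallest width that can name every core), the field order, and the helper names `fr_width`, `extra_tag_bits`, `pack` and `unpack` are assumptions made for illustration.

```python
LS_BITS = 2  # four states: private read, private read-write, shared read, shared read-write
C_BITS = 1   # single C bit


def fr_width(num_cores: int) -> int:
    """Assumed FR width: the fewest bits that can encode any core ID (at least 1)."""
    if num_cores < 1:
        raise ValueError("num_cores must be positive")
    return max(1, (num_cores - 1).bit_length())


def extra_tag_bits(num_cores: int) -> int:
    """Total bits added per LLC tag entry under the assumed layout."""
    return LS_BITS + fr_width(num_cores) + C_BITS


def pack(ls: int, fr: int, c: int, num_cores: int) -> int:
    """Pack the fields as [C | FR | LS], LS in the low bits (hypothetical layout)."""
    if not (0 <= ls < 4 and 0 <= fr < num_cores and c in (0, 1)):
        raise ValueError("field out of range")
    return ls | (fr << LS_BITS) | (c << (LS_BITS + fr_width(num_cores)))


def unpack(word: int, num_cores: int) -> tuple:
    """Inverse of pack: returns (ls, fr, c)."""
    w = fr_width(num_cores)
    ls = word & ((1 << LS_BITS) - 1)
    fr = (word >> LS_BITS) & ((1 << w) - 1)
    c = (word >> (LS_BITS + w)) & 1
    return ls, fr, c
```

Under these assumptions an 8-core system adds 6 bits per tag entry (2 for LS, 3 for FR, 1 for C), which illustrates why the hardware cost of the scheme is small.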
The method provided by the invention also specifies the transition conditions among the four states and the scheme for starting the coherence operation, thereby improving the operating efficiency of the multi-core coherence mechanism.
Drawings
Fig. 1 is a block diagram of the structure of the present invention.
Fig. 2 is a state transition diagram of the present invention.
Detailed Description
The method for improving the operation efficiency of the cache consistency mechanism provided by the invention is described in further detail below with reference to the accompanying drawings and specific embodiments. The advantages and features of the invention will become more apparent from the following description. It should be noted that the drawings are highly simplified and not drawn to precise scale; they serve only to illustrate the embodiments of the invention conveniently and clearly.
To improve the operating efficiency of the multi-core coherence mechanism, the invention provides a mechanism that performs coherence operations selectively, based on additional state bits. It effectively reduces unnecessary coherence operations and lowers resource consumption at little additional hardware cost.
As shown in Fig. 1, the invention uses spare bits in the Tag RAM of the LLC to add a 2-bit state field that indicates the state of the block. There are four states in total:
private read (PL): only one core has operated on the block, and it has only read it;
private read-write (PS): only one core has operated on the block, and its operations include a write;
shared read (SL): several cores have operated on the block, and all of them have only read it;
shared read-write (SS): several cores have operated on the block, and at least one of them has written it.
The state transitions among the four states are shown in Fig. 2. After its first cold-start miss fill, every block is in the non-coherent state (C bit clear). The initial state is PL if the cold-start access is a read and PS if it is a write; at the same time, FR records the ID of the core that first operates on the block (hereinafter referred to as the FR core).
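The cold-start rule can be sketched as follows. This is a minimal illustration, not the patented hardware; the `TagEntry` record and the `cold_fill` function are hypothetical names, and the states are represented as strings for readability.

```python
from dataclasses import dataclass


@dataclass
class TagEntry:
    ls: str   # one of "PL", "PS", "SL", "SS"
    fr: int   # ID of the first core that touched the block (the FR core)
    c: bool   # C bit; clear after a cold-start fill


def cold_fill(core_id: int, is_write: bool) -> TagEntry:
    """Initialise the extra tag fields on the first miss fill of a block."""
    initial_state = "PS" if is_write else "PL"
    return TagEntry(ls=initial_state, fr=core_id, c=False)
```

For example, `cold_fill(3, is_write=False)` yields a PL entry whose FR core is 3, so a later request can be compared against core 3 to decide whether the block is still private.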
When a core later issues a request to the block and hits, the system determines the state transition from two inputs: whether the new request is a read or a write, and whether the requesting core's ID matches the ID recorded in FR.
In the PL state, if the FR core issues a read request, the block state is unchanged; if the FR core issues a write request, the block state becomes PS; if another core issues a read request, the block state becomes SL; if another core issues a write request, the block starts the coherence flow and the state becomes SS;
in the SL state, if the FR core or any other core issues a read request, the block state is unchanged; if the FR core or any other core issues a write request, the coherence flow is started and the state becomes SS;
in the PS state, if the FR core issues a read request or a write request, the block state is unchanged; if another core issues a read request or a write request, the coherence flow is started and the state becomes SS;
in the SS state, no request changes the state of the block.
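The transition rules listed above can be written as a small function. This is a sketch under stated assumptions: the `LS` enum and the `next_state` signature are illustrative names, and the function returns a second value that flags whether the coherence flow has to be started for this request.

```python
from enum import Enum


class LS(Enum):
    PL = "private read"
    PS = "private read-write"
    SL = "shared read"
    SS = "shared read-write"


def next_state(state: LS, is_fr_core: bool, is_write: bool) -> tuple:
    """Return (new_state, start_coherence_flow) for a hit on the block."""
    if state is LS.SS:
        # SS is terminal until the block is replaced or the cache is reset.
        return LS.SS, False
    if state is LS.PL:
        if is_fr_core:
            return (LS.PS if is_write else LS.PL), False
        return (LS.SS, True) if is_write else (LS.SL, False)
    if state is LS.SL:
        # Any write, from the FR core or another core, forces SS.
        return (LS.SS, True) if is_write else (LS.SL, False)
    # state is LS.PS: only the FR core may keep the block private.
    if is_fr_core:
        return LS.PS, False
    return LS.SS, True
```

Every transition that lands in SS from another state reports `True` for the second value, which matches the requirement that the coherence flow be started before a block becomes shared read-write. No rule leaves SS, so the irreversibility of the mechanism is visible directly in the first branch.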
When a core later requests the block but misses, the request is handled non-coherently (the cache directory does not track it) if the block is in the PL, SL, or PS state, and coherently if the block is in the SS state.
When a cache block needs to be converted from another state to the SS state, the FR core first locks the cache block, and the other core issues a "restore coherence" request. After the FR core receives the request, it broadcasts the request to the other nodes if the block is currently in the SL state; if the block is currently in the PL or PS state, no broadcast is needed. The FR core and any other cores that received the broadcast then flush their private caches, check their miss records, and resolve any outstanding misses. Each of the other cores then sends a "restore coherence request complete" signal to the FR core. Once the FR core has received the signals from all other cores, it replies with a done signal to each of them. The restore-coherence flow is then complete, the state of the cache block is changed to SS, and the lock held by the FR core is released.
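The message order of this flow can be traced with a small simulation. It is a sketch only, and several points are assumptions rather than statements of the patent: the event names, the choice that the broadcast goes to every core other than the FR core and the requester, and the choice that the cores reporting completion are the requester plus the broadcast receivers. The case in which the FR core itself is the writer in the SL state is not described by the text and is rejected here.

```python
def restore_coherence(state: str, fr_core: int, requester: int, all_cores: list) -> list:
    """Return the ordered list of events that moves a block to SS."""
    if state not in ("PL", "PS", "SL"):
        raise ValueError("only PL, PS or SL blocks are converted to SS")
    if requester == fr_core:
        raise ValueError("flow with the FR core as requester is not modelled")

    events = [("lock", fr_core), ("request", requester, fr_core)]

    # Broadcast only when the block may already be held by several cores.
    receivers = []
    if state == "SL":
        receivers = [c for c in all_cores if c not in (fr_core, requester)]
        events += [("broadcast", fr_core, c) for c in receivers]

    # The FR core and the broadcast receivers flush and settle pending misses.
    events += [("flush", c) for c in [fr_core] + receivers]

    # Assumed set of cores that report completion to the FR core.
    ackers = [requester] + receivers
    events += [("complete", c, fr_core) for c in ackers]
    events += [("done", fr_core, c) for c in ackers]

    events += [("set_state", "SS"), ("unlock", fr_core)]
    return events
```

Calling `restore_coherence("PS", 0, 2, [0, 1, 2, 3])` produces a trace with no broadcast events, while the same call with `"SL"` adds a broadcast to cores 1 and 3. In both cases the state change to SS and the unlock are the last two events, reflecting that the block becomes shared read-write only after every completion signal has been answered.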
The above description is only illustrative of the preferred embodiments of the present invention and is not intended to limit the scope of the present invention, and any alterations and modifications made by those skilled in the art based on the above disclosure shall fall within the scope of the appended claims.

Claims (6)

1. A method for improving the operation efficiency of a cache consistency mechanism, characterized by comprising the following steps:
adding a 2-bit LS state field to the Tag RAM of the LLC to indicate four states: private read, private read-write, shared read, and shared read-write; updating the state in real time according to the coherence request transactions that each processor core issues to the LLC; and determining from the state whether the block needs coherence operations such as snooping;
adding an FR field to the Tag RAM of the LLC to record the ID of the first core that reads or writes the block, this ID being used to determine whether a core that subsequently operates on the block is the same core, and hence whether the state needs to be modified;
adding a 1-bit C bit to the Tag RAM of the LLC to indicate whether the block has been written into a private cache and whether coherence has been started for the block;
and performing no multi-core coherence tracking for the private read, private read-write, and shared read states, the coherence flow being applied only to the shared read-write state.
2. The method for improving the operation efficiency of a cache consistency mechanism according to claim 1, wherein a 2-bit state field is added to each block of the LLC, and the state changes according to the read and write requests that different cores issue to the LLC.
3. The method for improving the operation efficiency of a cache consistency mechanism according to claim 1, wherein the state transitions among the four states are irreversible: once a block has moved to the shared read-write state, it remains in that state until the block is replaced or the cache is reset.
4. The method for improving the operation efficiency of a cache consistency mechanism according to claim 1, wherein the state transition mechanism among the four states identifies, for each request, whether its source is the same core that issued the first request and whether the request is a read or a write, and uses both as the basis for the state transition.
5. The method for improving the operation efficiency of a cache consistency mechanism according to claim 1, wherein the coherence flow must be performed for blocks in the shared read-write state, and before a cache block changes from any other state to the shared read-write state, the coherence flow must first be started.
6. The method for improving the efficiency of cache coherency mechanism operation of claim 1, wherein the bit width of the FR bits depends on the number of cores.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311418036.7A CN117312187A (en) 2023-10-30 2023-10-30 Method for improving operation efficiency of cache consistency mechanism

Publications (1)

Publication Number Publication Date
CN117312187A true CN117312187A (en) 2023-12-29

Family

ID=89242711

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311418036.7A Pending CN117312187A (en) 2023-10-30 2023-10-30 Method for improving operation efficiency of cache consistency mechanism

Country Status (1)

Country Link
CN (1) CN117312187A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination