CN115934367A - Buffer processing method, snoop filter, multiprocessor system, and storage medium

Info

Publication number
CN115934367A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211633337.7A
Other languages
Chinese (zh)
Inventor
刘宗玺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Eswin Computing Technology Co Ltd
Original Assignee
Beijing Eswin Computing Technology Co Ltd
Application filed by Beijing Eswin Computing Technology Co Ltd
Priority to CN202211633337.7A
Publication of CN115934367A

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

Embodiments of the present application provide a buffer processing method, a snoop filter, a multiprocessor system, and a storage medium. The method comprises: in response to a read transaction issued by a request processor for looking up a target address of a cache line, determining, from a group of RAM sets, a target RAM set to which the target address points, and looking up the target address of the cache line in both the target RAM set and an eviction buffer; if the target address is found in neither the target RAM set nor the eviction buffer and it is detected that no storage space remains in the target RAM set, selecting a first RAM entry from the target RAM set; sending a first back invalidation message for the first RAM (random access memory) entry to a consistency control module, so that the consistency control module, according to the first back invalidation message, sends an invalidation transaction for the first RAM entry to a first CPU (central processing unit) core to which the first RAM entry points; and evicting the first RAM entry.

Description

Buffer processing method, snoop filter, multiprocessor system, and storage medium
Technical Field
The present application relates to the field of processors, and in particular, to a buffering method, a snoop filter, a multiprocessor system, and a storage medium.
Background
The use of multiple processors is becoming more common as a way to increase the computing power of new computer systems. Multiprocessor systems share system resources such as system memory and storage. Multiple processors typically access the same data in memory or storage and attempt to use that data at the same time. To make this possible, the processors track the use of data in order to maintain data consistency. One common scheme for maintaining data consistency in these systems is to employ snoop filters.
In a multiprocessor system that includes a snoop filter, when a CPU core needs a cache line that is not found in the core's own cache, a read transaction looking for that cache line is sent to the snoop filter so that the target address of the cache line can be looked up. The core components of the current snoop process are a request queue, a snoop filter, and a feedback circuit. Read transactions are placed in the request queue; the snoop filter processes the read transactions in the request queue to look up the target address of the corresponding cache line; and any transactions that the snoop filter generates during the lookup are sent back into the request queue through the feedback circuit to wait for execution.
However, when the random access memory (RAM) and the eviction buffer in the snoop filter are full and the target address of the cache line misses in the snoop filter, the snoop filter generates an invalidation transaction, which is sent to the request queue through the feedback circuit and queued behind the current read transaction. In terms of queue order, the invalidation transaction cannot be processed until the read transaction completes; in terms of execution logic, the read transaction cannot continue until the invalidation transaction completes. Each transaction therefore waits on the other, which results in deadlock.
Disclosure of Invention
The embodiment of the application provides a buffering method, a snoop filter, a multiprocessor system and a storage medium, and can solve the problem of deadlock.
The technical scheme of the application is realized as follows:
In a first aspect, an embodiment of the present application provides a buffering method applied to a snoop filter, where the snoop filter includes a group of RAM sets and an eviction buffer, and the method includes:
in response to a read transaction issued by a request processor for looking up a target address of a cache line, determining, from the group of RAM sets, a target RAM set to which the target address points, and looking up the target address of the cache line in both the target RAM set and the eviction buffer;
if the target address is found in neither the target RAM set nor the eviction buffer and it is detected that no storage space remains in the target RAM set, selecting a first RAM entry from the target RAM set;
sending a first back invalidation message for the first RAM entry to a consistency control module, so that the consistency control module, according to the first back invalidation message, sends an invalidation transaction for the first RAM entry to a first CPU core to which the first RAM entry points; and evicting the first RAM entry.
In a second aspect, an embodiment of the present application provides a snoop filter, including: a set of RAM sets, an eviction buffer, a controller, a result distribution circuit, a first sequential logic circuit, and a second sequential logic circuit; the output end of the controller is respectively connected with the input end of the group of RAM sets and the input end of the eviction buffer, the output end of the group of RAM sets and the output end of the eviction buffer are connected with the input end of the result distribution circuit, the output end of the result distribution circuit is connected with the input end of the first sequential logic circuit, the first sequential logic circuit is connected with the group of RAM sets in a bidirectional mode, and the output end of the first sequential logic circuit is connected with the input end of the consistency control module through the second sequential logic circuit;
the controller is used for responding to a read transaction which is sent by a request processor and used for searching a target address of a cache line, determining a target RAM set pointed by the target address from the group of RAM sets, and respectively searching the target address in the target RAM set and an eviction buffer to obtain a search result;
the result distribution circuit is used for transmitting, to the first sequential logic circuit, a search result indicating that the target address was found in neither the target RAM set nor the eviction buffer;
the first sequential logic circuit is used for detecting whether storage space remains in the target RAM set and, when it is detected that no storage space remains in the target RAM set, selecting a first RAM entry from the target RAM set and evicting the first RAM entry;
the second sequential logic circuit is configured to send a first back invalidation message for the first RAM entry to the consistency control module, so that the consistency control module sends an invalidation transaction for the first RAM entry to the first CPU core to which the first RAM entry points according to the first back invalidation message.
In a third aspect, an embodiment of the present application provides a multiprocessor system, where the multiprocessor system includes: a plurality of processors, a bus, and a consistency module, wherein the plurality of processors and the consistency module are connected through the bus; the consistency module comprises a request queue, a third-level cache, a snoop filter, and a consistency control module; wherein:
the request queue is configured to receive at least one read transaction sent by at least one of the plurality of processors through the bus, and queue the at least one read transaction to obtain a read transaction queue; the read transaction is used for searching a target address of a corresponding cache line;
the third-level cache and the snoop filter are used for simultaneously responding to the read transaction arranged at the first position in the read transaction queue and performing the search process of the target address of the corresponding cache line;
the third-level cache is further configured to, when the target address is found, return data corresponding to the found target address to the request processor that initiated the read transaction; and sending a miss message under the condition that the target address is not found.
In a fourth aspect, an embodiment of the present application provides a storage medium, on which a computer program is stored, where the computer program, when executed by a snoop filter, implements the buffering method as described above.
Embodiments of the present application provide a buffer processing method, a snoop filter, a multiprocessor system, and a storage medium that construct independent processing logic for invalidation transactions. That is, when the target address of a cache line is found in neither the target RAM set nor the eviction buffer of the snoop filter, and it is detected that no storage space remains in the target RAM set, a first back invalidation message for a first RAM entry in the target RAM set is sent to the consistency control module; the consistency control module sends the invalidation transaction to the first CPU core to which the first RAM entry points; and the first RAM entry can then be evicted. Because the invalidation transaction is handled by this independent processing logic, no execution-order dependency exists between the invalidation transaction and the read transaction, and deadlock between the two is avoided.
Drawings
Fig. 1 is a flowchart of a buffering method according to an embodiment of the present disclosure;
Fig. 2 is a flowchart of evicting a first RAM entry according to an embodiment of the present disclosure;
Fig. 3 is a flowchart of a method for evicting a cache according to an embodiment of the present disclosure;
Fig. 4 is a schematic structural diagram of a snoop filter according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a multiprocessor system according to an embodiment of the present application;
Fig. 6 is a schematic flowchart illustrating a lookup operation executed in parallel by a snoop filter and a third-level cache according to an embodiment of the present disclosure.
Detailed Description
The multiprocessor system may be any type of multiprocessor or multicore system, including personal computers, mainframe computers, hand-held computers, consumer electronics (mobile phones, hand-held gaming devices, wearable devices, etc.), network devices, automotive/avionic controllers, or other similar devices. However, this is not a limitation of the present application, i.e., any electronic device using the snoop filter proposed in the present application should fall within the scope of the present application.
There may be any number of processors in a multiprocessor system, each having at least one cache associated with the processor. In one embodiment, a multiprocessor system may have a fixed number of processors. In another embodiment, a multiprocessor system may have a slot or interface for any processor. The number of processors may be changed by adding or removing processors from the system.
In multiprocessor systems where a snoop filter is employed to maintain data coherency, a processor or core may send a coherency request, commonly referred to as a snoop, to other processors before accessing or modifying data. The snoop filter maintains a cache of data requests from each processor or core in order to track the contents of each processor's or core's cache. Whenever a processor retrieves data from memory, a coherency record containing the tag address of that data is stored as a RAM entry in a random access memory set (RAM set) of the snoop filter. The RAM in the snoop filter is also referred to as SFRAM.
So that the manner in which the above recited features and aspects of the present invention can be understood in detail, a more particular description of the embodiments of the invention, briefly summarized above, may be had by reference to the appended drawings, which are included to illustrate, but are not intended to limit the embodiments of the invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
In the following description, reference is made to "some embodiments", which describe a subset of all possible embodiments; it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict. It should also be noted that the terms "first/second/third" in the embodiments of the present application are used only to distinguish similar objects and do not imply a specific ordering of the objects. Where permitted, "first/second/third" may be interchanged in order or sequence, so that the embodiments of the present application described herein can be implemented in an order other than that shown or described herein.
An embodiment of the present application provides a buffering method, as shown in fig. 1, applied to a snoop filter, where the snoop filter includes a set of RAM sets and an eviction buffer, and the method may include:
S101, in response to a read transaction issued by a request processor for looking up a target address of a cache line, a target RAM set to which the target address points is determined from a group of RAM sets, and the target address of the cache line is looked up in both the target RAM set and an eviction buffer.
In this embodiment, the request processor may be a processor that needs to search for data of a cache line in a multiprocessor system, the request processor sends a read transaction for searching for a target address of the cache line, and after receiving the read transaction, the snoop filter determines, according to an index carried in the target address, a target RAM set to which the target address points from a group of RAM sets, and searches for the target address of the cache line in the target RAM set and an eviction buffer, respectively.
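As a non-limiting illustration, determining the target RAM set from the index carried in the target address can be sketched as follows. The line size, the number of RAM sets, and the names `decode_target_address` and `lookup` are assumptions made for this example only; the source does not specify the address layout.

```python
# Hypothetical geometry (not from the source): 64-byte cache lines, 1024 RAM sets.
LINE_OFFSET_BITS = 6
INDEX_BITS = 10


def decode_target_address(addr):
    """Split a target address into (set_index, tag)."""
    line_addr = addr >> LINE_OFFSET_BITS          # drop the byte offset within the line
    set_index = line_addr & ((1 << INDEX_BITS) - 1)
    tag = line_addr >> INDEX_BITS
    return set_index, tag


def lookup(ram_sets, eviction_buffer, addr):
    """Look the address up in the target RAM set and in the eviction buffer.

    ram_sets is a list of Python sets of tags; eviction_buffer is a set of
    (set_index, tag) pairs. Both are stand-ins for the hardware structures.
    """
    set_index, tag = decode_target_address(addr)
    hit_in_ram = tag in ram_sets[set_index]
    hit_in_buffer = (set_index, tag) in eviction_buffer
    return set_index, hit_in_ram, hit_in_buffer
```

In hardware the two lookups would proceed at the same time; the sequential Python calls here only model the results.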
In this embodiment, the request processor may be a CPU, a graphics processing unit (GPU), or another type of processor, and the request processor transmits the read transaction to the snoop filter through a corresponding interface. For example, the CPU is connected to SLVS_CPUs to issue read transactions, and the GPU is connected to SLVS_ACP to issue read transactions; SLVS_SNP may be connected to some other external processor to issue corresponding read transactions, and SLVS_BIB is connected to the internal eviction buffer to issue corresponding read transactions.
S102, if the target address is found in neither the target RAM set nor the eviction buffer and it is detected that no storage space remains in the target RAM set, a first RAM entry is selected from the target RAM set.
It should be noted that, because a coherency record containing the address of the data is stored in the snoop filter each time a processor retrieves data from memory, a miss in both the target RAM set and the eviction buffer means that the snoop filter holds no coherency record indicating that another processor has retrieved the data at the target address from memory. In this case, the requesting processor needs to retrieve the data at the target address from memory (the third-level cache and main memory), and the snoop filter needs to store a coherency record for the target address. The snoop filter therefore checks whether storage space remains in the target RAM set; if so, it may store the coherency record for the target address in the remaining storage space.
In the embodiment of the present application, if no remaining storage space is detected in the target RAM set, the snoop filter lacks space to store a coherency record for the target address, and an entry must be evicted from the snoop filter to accommodate the new transaction. At this point, a first RAM entry may be selected from the target RAM set.
In an embodiment of the present application, the first RAM entry may be selected from the target RAM set by a linear feedback shift register (LFSR).
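As a non-limiting illustration, pseudo-random victim selection with an LFSR can be sketched as follows. The 16-bit width, the tap positions (16, 14, 13, 11, a commonly used maximal-length choice), the seed, and the names `Lfsr16` and `select_victim_way` are assumptions made for this example; the source states only that an LFSR may be used.

```python
class Lfsr16:
    """16-bit Fibonacci LFSR with taps at bits 16, 14, 13 and 11."""

    def __init__(self, seed=0xACE1):
        if seed & 0xFFFF == 0:
            raise ValueError("seed must be non-zero, or the LFSR locks up at 0")
        self.state = seed & 0xFFFF

    def step(self):
        """Advance one cycle and return the new state."""
        s = self.state
        feedback = (s ^ (s >> 2) ^ (s >> 3) ^ (s >> 5)) & 1
        self.state = (s >> 1) | (feedback << 15)
        return self.state


def select_victim_way(lfsr, num_ways):
    """Pick the way (entry position) in a full RAM set that will be evicted."""
    return lfsr.step() % num_ways
```

An LFSR is attractive for this purpose because it needs only a shift register and a few XOR gates, and it requires no per-entry usage history.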
It should be noted that, regarding the order in which the data at the target address is retrieved, the snoop filter and the third-level cache may respond simultaneously to the read transaction issued by the request processor for looking up the target address of the cache line; the search continues in main memory only when the target address misses in both the snoop filter and the third-level cache. Looking up the target address in the snoop filter and the third-level cache at the same time improves lookup efficiency.
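As a non-limiting illustration, the rule that main memory is searched only after both lookups miss can be sketched as follows. The dictionaries standing in for the snoop filter and the third-level cache, and the name `parallel_lookup`, are assumptions for this example; in hardware the two lookups run concurrently rather than one after the other.

```python
def parallel_lookup(addr, snoop_filter, l3_cache):
    """Return both lookup results and whether main memory must be searched.

    snoop_filter maps an address to the id of a core holding the line;
    l3_cache maps an address to data. A missing key models a miss.
    """
    owner_core = snoop_filter.get(addr)   # None means a snoop filter miss
    l3_data = l3_cache.get(addr)          # None means a third-level cache miss
    need_main_memory = owner_core is None and l3_data is None
    return owner_core, l3_data, need_main_memory
```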
Further, if an identifier of a second CPU core holding the target address is found in the target RAM set or the eviction buffer, that second CPU core may hold the cache line required by the read transaction. In this case, a corresponding hit message is returned to the consistency control module, so that the consistency control module sends a snoop transaction to the second CPU core.
It should be noted that the consistency control module is configured to send out the snoop transaction after the snoop filter hits the target address, and the consistency control module is further configured to receive a response message of the snoop transaction, where the content of the specific response message may be selected according to an actual situation, and the embodiment of the present application is not limited specifically.
S103, a first back invalidation message for the first RAM entry is sent to the consistency control module, so that the consistency control module, according to the first back invalidation message, sends an invalidation transaction for the first RAM entry to a first CPU core to which the first RAM entry points; and the first RAM entry is evicted.
In this embodiment of the present application, after the first RAM entry is selected from the target RAM set, a first back invalidation message for the first RAM entry needs to be sent to the consistency control module. The consistency control module then determines the first CPU core to which the first RAM entry points and sends an invalidation transaction for the first RAM entry to that core, so that the first CPU core deletes the corresponding cached data and data consistency is maintained.
It should be noted that, when the target RAM set is full, the coherency control module issues an invalidation transaction the next time the target RAM set is read. This avoids the situation in which a full RAM set is selected for a later write, the information already held in that RAM set is overwritten, and the caches become inconsistent.
In an embodiment of the present application, after a first RAM entry is selected from a target set of RAM, the first RAM entry is also evicted.
In the embodiment of the present application, the specific process of evicting the first RAM entry is shown in fig. 2, and includes:
1. Detect whether there is buffer space in the eviction buffer.
2. If there is buffer space in the eviction buffer, evict the first RAM entry to the eviction buffer.
3. If there is no buffer space in the eviction buffer, evict the first RAM entry to the eviction buffer after a buffer entry in the eviction buffer is released.
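As a non-limiting illustration, the three steps of fig. 2 can be sketched as follows. The class name `EvictionBuffer`, the single pending slot, and the oldest-first release order are assumptions for this example; the source does not say which buffer entry is released or how a waiting RAM entry is held.

```python
from collections import deque


class EvictionBuffer:
    """Software stand-in for the eviction buffer (hypothetical model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = deque()
        self.pending = None  # RAM entry waiting for a buffer entry to be released

    def has_space(self):
        # Step 1: detect whether there is buffer space.
        return len(self.entries) < self.capacity

    def evict_into(self, ram_entry):
        """Return True if written immediately, False if it must wait."""
        if self.has_space():
            # Step 2: space exists, so the RAM entry is written right away.
            self.entries.append(ram_entry)
            return True
        if self.pending is not None:
            raise RuntimeError("this sketch models only one waiting RAM entry")
        # Step 3: no space, so the RAM entry waits for a release.
        self.pending = ram_entry
        return False

    def release_oldest(self):
        """Release one buffer entry, then admit the waiting RAM entry, if any."""
        released = self.entries.popleft()
        if self.pending is not None:
            self.entries.append(self.pending)
            self.pending = None
        return released
```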
In an embodiment of the present application, once the first RAM entry has been evicted, storage space remains in the snoop filter for writing a coherency record (i.e., adding a new RAM entry) for the new memory transaction that looks up the data at the target address in the third-level cache or main memory. At this point, in response to a write transaction issued by the requesting processor to write the target address, the identification of the requesting processor and the target address may be written into the target RAM set to form a second RAM entry.
It should be noted that, in the case that the result of the lookup of the read transaction in the target RAM set is a miss, the requesting processor initiates a write transaction in the target RAM set.
It should be noted that, in the process of maintaining the RAM set, when the contents in the RAM entry and the eviction buffer are updated, a relevant write transaction is also initiated. The scenario in which the contents in the RAM entry and the eviction buffer need to be updated may be a synchronization scenario in which a cache in the CPU is deleted, or a scenario in which the contents in the eviction buffer are written back to the RAM entry after hit in the eviction buffer, which may be specifically selected according to an actual situation, and the embodiment of the present application is not specifically limited.
It should be noted that a RAM entry is composed of the following fields: a field identifying the CPU core, a field indicating that the target address is shared by multiple cores, a field indicating that the target address is exclusive to a single core, and a field representing the target address. The present application is not limited to these field types; fields may be added, deleted, or modified as needed, which is not specifically limited in the embodiments of the present application.
Referring to Table 1, which shows an exemplary structure of a RAM entry, the RAM entry is composed of the following five fields:
Presence Vector (PV) field: identifies the CPU cores that hold the data at the target address. One bit is used per CPU core.
Sharing field: when this bit is set, the RAM entry is shared by multiple CPU cores, i.e., the RAM entry has multiple valid copies shared by multiple caches.
Exclusive field: when this bit is set, the RAM entry is exclusive to one CPU core, i.e., the RAM entry has only one valid copy outside of main memory.
req_early_addr_tc0[tag_HI_R] field and req_early_addr_tc0[tag_LO_R] field: hold the high and low bits of the tag of the target address.
Table 1: RAM entry structure
[Table image not reproduced: Figure BDA0004006294620000091]
It should be noted that, when the exclusive field bit is set, the PV field includes only one CPU core identifier; when the shared field bit is set, the PV field includes all of the CPU core identifications that share the RAM entry. The exclusive field and the shared field are mutually exclusive, namely, only one of the two fields can be set.
When the number of CPU cores sharing the RAM entry changes from one to a plurality, the exclusive field bit is reset and the shared field bit is set. And each time data of the target address is hit in the snoop filter, the identification of the corresponding requesting CPU core is added to the PV field of the corresponding RAM entry.
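As a non-limiting illustration, the RAM entry fields and the exclusive-to-shared transition can be sketched as follows. The class name `RamEntry`, the method names, and the use of a single `tag` value in place of the separate high and low tag fields are assumptions for this example; field widths are not given in the source.

```python
from dataclasses import dataclass


@dataclass
class RamEntry:
    """Software stand-in for one RAM entry (hypothetical layout)."""

    tag: int                 # stands in for the tag_HI / tag_LO pair
    pv: int = 0              # presence vector, one bit per CPU core
    shared: bool = False
    exclusive: bool = False

    def add_core(self, core_id):
        """Record that core_id now holds the line and update the state bits."""
        self.pv |= 1 << core_id
        holders = bin(self.pv).count("1")
        # The two bits are mutually exclusive: exactly one is set once
        # at least one core holds the line.
        self.exclusive = holders == 1
        self.shared = holders > 1

    def cores(self):
        """Return the ids of all cores whose PV bit is set."""
        return [i for i in range(self.pv.bit_length()) if (self.pv >> i) & 1]
```

For example, after `add_core(2)` the entry is exclusive to core 2; a later `add_core(0)` clears the exclusive bit and sets the shared bit.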
Further, for the snoop filter proposed in the present application, the steps shown in fig. 3 are also performed on its eviction buffer, specifically:
1. Detect the number of available buffers in the eviction buffer.
2. If the number is less than a preset threshold, select a first buffer entry from the eviction buffer and evict the first buffer entry.
3. Send a second back invalidation message for the first buffer entry to the consistency control module, so that the consistency control module, according to the second back invalidation message, sends an invalidation transaction for the first buffer entry to a second CPU core to which the first buffer entry points.
In the embodiment of the present application, the preset threshold may be 2 or other values, which are specifically selected according to actual situations, and the embodiment of the present application is not specifically limited.
It should be noted that presetting a threshold for the number of available buffers ensures that the eviction buffer never becomes completely full, which in turn ensures that RAM entries can be evicted into the eviction buffer, thereby improving the efficiency of the snoop filter.
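As a non-limiting illustration, the threshold check of fig. 3 can be sketched as follows. The function name `maintain_eviction_buffer`, the list used to model the buffer, the callback used to model the message to the consistency control module, and the oldest-first choice of the first buffer entry are assumptions for this example; the threshold of 2 is the example value given above.

```python
PRESET_THRESHOLD = 2  # example value mentioned in the text


def maintain_eviction_buffer(buffer_entries, capacity, send_back_invalidation,
                             threshold=PRESET_THRESHOLD):
    """Free one buffer entry when too few buffers remain available.

    Returns the evicted buffer entry, or None if no eviction was needed.
    """
    available = capacity - len(buffer_entries)       # step 1
    if available >= threshold:
        return None
    victim = buffer_entries.pop(0)                   # step 2 (oldest entry: an assumption)
    send_back_invalidation(victim)                   # step 3
    return victim
```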
It can be understood that, when the target address of the cache line is found in neither the target RAM set nor the eviction buffer of the snoop filter, and it is detected that no storage space remains in the target RAM set, a first back invalidation message for a first RAM entry in the target RAM set is sent to the coherency control module, the coherency control module sends an invalidation transaction to the first CPU core to which the first RAM entry points, and the first RAM entry may then be evicted. Because independent processing logic is constructed for the invalidation transaction, no execution-order dependency exists between the invalidation transaction and the read transaction, and deadlock between the two cannot occur.
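As a non-limiting illustration, the miss-and-full path can be modeled in software as follows, with back invalidation messages placed on a separate list rather than behind the read in a shared request queue. The class `SnoopFilterModel`, its data structures, the pseudo-random victim choice, the unbounded eviction buffer, and the folding of the later write transaction into the read handler are all simplifying assumptions for this example.

```python
import random


class SnoopFilterModel:
    """Behavioral sketch only; not the circuit described in the embodiments."""

    def __init__(self, num_sets, ways, seed=0):
        self.sets = [dict() for _ in range(num_sets)]  # tag -> holder core id
        self.ways = ways
        self.eviction_buffer = {}       # (set_index, tag) -> holder core id
        self.back_invalidations = []    # side channel to the coherency control module
        self.rng = random.Random(seed)  # stands in for the LFSR

    def handle_read(self, set_index, tag, requester):
        ram_set = self.sets[set_index]
        if tag in ram_set:
            return ("hit", ram_set[tag])
        if (set_index, tag) in self.eviction_buffer:
            return ("hit", self.eviction_buffer[(set_index, tag)])
        if len(ram_set) >= self.ways:
            # Miss with a full set: pick a victim, notify the coherency
            # control module on the side channel, and evict the victim.
            victim_tag = self.rng.choice(sorted(ram_set))
            holder = ram_set.pop(victim_tag)
            self.back_invalidations.append((set_index, victim_tag, holder))
            self.eviction_buffer[(set_index, victim_tag)] = holder
        ram_set[tag] = requester  # new coherency record for the target address
        return ("miss", None)
```

The read completes in a single call because the invalidation is never queued behind it, which is the property the independent processing logic is meant to provide.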
Based on the foregoing embodiments, the present application provides a snoop filter 1, as shown in fig. 4, where the snoop filter 1 includes: a set of RAM sets 10, an eviction buffer 11, a controller 12, a result distribution circuit 13, a first sequential logic circuit 14, and a second sequential logic circuit 15; the output end of the controller 12 is connected to the input end of the set of RAM sets 10 and the input end of the eviction buffer 11, the output end of the set of RAM sets 10 and the output end of the eviction buffer 11 are connected to the input end of the result distribution circuit 13, the output end of the result distribution circuit 13 is connected to the input end of the first sequential logic circuit 14, the first sequential logic circuit 14 is connected to the set of RAM sets 10 in a bidirectional manner, and the output end of the first sequential logic circuit 14 is connected to the input end of the coherency control module 2 through the second sequential logic circuit 15;
the controller 12 is configured to, in response to a read transaction issued by a request processor and used for finding a target address of a cache line, determine a target RAM set pointed to by the target address from the set of RAM sets, and respectively search the target address in the target RAM set and an eviction buffer to obtain a search result;
the result distribution circuit 13 is configured to transmit, to the first sequential logic circuit, a search result indicating that the target address was found in neither the target RAM set nor the eviction buffer;
the first sequential logic circuit 14 is configured to detect whether storage space remains in the target RAM set, and to select a first RAM entry from the target RAM set when it is detected that no storage space remains in the target RAM set;
the second sequential logic circuit 15 is configured to send a first back invalidation message for the first RAM entry to the coherency control module 2, so that the coherency control module 2 sends an invalidation transaction for the first RAM entry to the first CPU core to which the first RAM entry points according to the first back invalidation message; and evicting the first RAM entry.
In the embodiment of the application, the snoop filter is composed of a controller, a group of RAM sets, an eviction buffer and a plurality of sequential logic circuits, and the sequential logic circuits are connected with the controller, the group of RAM sets and the eviction buffer so as to realize the buffering processing method.
In the embodiment of the present application, the first sequential logic circuit is a sequential combination logic circuit that determines whether the currently pointed target RAM set is full, and if so, needs to select a first RAM entry from the target RAM set to be evicted to the eviction buffer.
In the embodiment of the present application, the second sequential logic circuit is a sequential combination logic circuit that determines whether a back invalidation (back invalidation) message is to be sent.
It should be noted that the snoop filter in the embodiment of the present application may further include a sequential logic circuit (not shown in fig. 4) for returning a miss signal when the target address misses in the target RAM set and the eviction buffer. The input end of the sequential logic circuit may be connected to the output end of the result distribution circuit, and may also be connected to the output end of the first sequential logic circuit, which may be specifically selected according to the actual execution logic, and this embodiment of the present application is not specifically limited.
Referring to fig. 4, the snoop filter 1 further comprises: a third sequential logic circuit 16; the third sequential logic circuit 16 is bidirectionally connected with the eviction buffer 11, and the input terminal of the third sequential logic circuit 16 is further connected with the output terminal of the set of RAM sets 10;
the third sequential logic circuit 16 is configured to detect whether there is a buffer space in the eviction buffer; if a buffer space exists in the eviction buffer, writing the first RAM entry into the eviction buffer; and if the buffer space does not exist in the eviction buffer, the first RAM entry is evicted to the eviction buffer after the buffer entry in the eviction buffer is released.
In an embodiment of the present application, the third sequential logic circuit is a sequential combination logic circuit that determines whether the selected first RAM entry can be written to the eviction buffer; it makes this determination by detecting whether buffer space exists in the eviction buffer. When there is buffer space in the eviction buffer, the first RAM entry can be written to the eviction buffer. When no buffer space exists in the eviction buffer, the first RAM entry cannot be written to the eviction buffer immediately; in this case, the third sequential logic circuit monitors in real time whether a buffer entry in the eviction buffer is released, and evicts the first RAM entry to the eviction buffer once it detects that a buffer entry has been released.
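The write-or-wait behaviour of the third sequential logic circuit can be illustrated with a minimal software sketch. It is only a behavioural model under stated assumptions, not the circuit itself: the names `EvictionBuffer`, `ThirdCircuit`, `evict` and `on_release`, the `capacity` parameter, and the use of a FIFO queue for entries that are waiting are all hypothetical and do not appear in the application.

```python
from collections import deque


class EvictionBuffer:
    """Fixed-capacity buffer for entries evicted from a RAM set.

    The capacity is an assumed parameter; the application does not fix it.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []

    def has_space(self):
        return len(self.entries) < self.capacity

    def release(self, entry):
        self.entries.remove(entry)


class ThirdCircuit:
    """Behavioural model: write immediately if space exists, otherwise wait."""

    def __init__(self, buffer):
        self.buffer = buffer
        self.pending = deque()  # assumption: waiting entries are served in FIFO order

    def evict(self, ram_entry):
        """Return True if the entry was written now, False if it must wait."""
        if self.buffer.has_space() and not self.pending:
            self.buffer.entries.append(ram_entry)
            return True
        self.pending.append(ram_entry)
        return False

    def on_release(self, released_entry):
        """Called when a buffer entry is released; admits one waiting entry."""
        self.buffer.release(released_entry)
        if self.pending and self.buffer.has_space():
            self.buffer.entries.append(self.pending.popleft())
```

In this sketch, an entry that cannot be written is held until `on_release` frees a slot, which mirrors the monitoring described above.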
Referring to fig. 4, the filter 1 further comprises: a fourth sequential logic circuit 17; the fourth sequential logic circuit 17 is bidirectionally connected to the eviction buffer 11, and the output terminal of the fourth sequential logic circuit 17 is further connected to the input terminal of the second sequential logic circuit 15;
the fourth sequential logic circuit 17 is configured to detect the number of available buffers in the eviction buffer; select a first buffer entry from the eviction buffer if the number is less than a preset threshold; and evict the first buffer entry;
the second sequential logic circuit 15 is configured to send a second back invalidation message for the first buffer entry to the consistency control module 2, so that the consistency control module 2 sends an invalidation transaction for the first buffer entry to the second CPU core to which the first buffer entry points according to the second back invalidation message.
In this embodiment, the fourth sequential logic circuit is a sequential combination logic circuit that determines how many buffers in the eviction buffer are currently unused. When the number of available buffers in the eviction buffer is less than a preset threshold, a back invalidation message is sent to the consistency control module through the second sequential logic circuit.
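The threshold check performed by the fourth sequential logic circuit can be sketched as follows. This is an illustrative model only: the function name `check_eviction_buffer`, the dictionary layout of a buffer entry and of the message, and the default policy of selecting the oldest entry are assumptions, since the application does not specify how the first buffer entry is selected.

```python
def check_eviction_buffer(entries, capacity, threshold, select=lambda e: e[0]):
    """Return (remaining_entries, back_invalidation_message_or_None).

    entries   -- list of dicts with "core_id" and "address" (assumed layout)
    capacity  -- total number of buffers in the eviction buffer
    threshold -- preset threshold on the number of available buffers
    select    -- victim selection policy; oldest-first is an assumption
    """
    available = capacity - len(entries)
    if available >= threshold or not entries:
        return list(entries), None

    # Fewer buffers are available than the threshold: evict one entry and
    # build the second back invalidation message for the core it points to.
    victim = select(entries)
    remaining = [e for e in entries if e is not victim]
    message = {
        "type": "back_invalidation",
        "core_id": victim["core_id"],
        "address": victim["address"],
    }
    return remaining, message
```

The returned message stands in for what the second sequential logic circuit would forward to the consistency control module.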
In some embodiments of the present application, the controller 12 is further configured to write the identification of the requesting processor and the target address in the target RAM set to form a second RAM entry in response to a write transaction issued by the requesting processor to write the target address.
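As a rough illustration of how a write transaction could form a new RAM entry, the sketch below records the requesting processor's identification together with the target address in the RAM set selected by that address. The names `RamEntry`, `set_index` and `handle_write`, the 64-byte line size, and the modulo set-indexing scheme are all assumptions made for the example; the application does not state how the target RAM set is derived from the address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RamEntry:
    core_id: int   # identification of the requesting processor
    address: int   # target address of the cache line


def set_index(address, num_sets, line_bytes=64):
    """Map an address to a RAM set (assumed: line-granular modulo indexing)."""
    return (address // line_bytes) % num_sets


def handle_write(ram_sets, core_id, address):
    """Write the requester's id and the target address into the target RAM set."""
    target_set = ram_sets[set_index(address, len(ram_sets))]
    entry = RamEntry(core_id, address)
    target_set.append(entry)
    return entry
```

For example, with four RAM sets and 64-byte lines, address 0x1040 falls in set 1 under this assumed mapping.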
In some embodiments of the present application, the output of the result distribution circuit 13 is further connected to the input of the coherency control module 2;
the result distribution circuit 13 is configured to transmit a search result that hits the target address in the target RAM set or the eviction buffer to the consistency control module 2, so that the consistency control module 2 sends a snoop transaction to the second CPU core that is found to contain the target address.
In the embodiment of the present application, the consistency control module is configured to issue snoop transactions and invalidation transactions and to receive the corresponding response messages. Because the consistency control module issues the invalidation transactions, the snoop filter does not need to re-queue them in the request queue through a feedback circuit to wait for execution. The processing logic of invalidation transactions is thereby separated from that of read transactions, which can avoid the deadlock problem caused by the two.
Based on the foregoing embodiments, an embodiment of the present application further provides a multiprocessor system 3. As shown in fig. 5, the system comprises: a plurality of processors 4, a bus 5 and a consistency module 6, wherein the processors 4 and the consistency module 6 are connected through the bus 5; the consistency module 6 comprises a request queue 61, a third-level cache 62, a snoop filter 1 and a consistency control module 2; wherein,
the request queue 61 is configured to receive at least one read transaction sent by at least one of the processors 4 through the bus 5, and queue the at least one read transaction to obtain a read transaction queue; the read transaction is used for searching a target address of a corresponding cache line;
the third-level cache 62 and the snoop filter 1 are configured to respond simultaneously to the read transaction at the head of the read transaction queue, and to look up the target address of the corresponding cache line;
the third-level cache 62 is further configured to, when the target address is found, return data corresponding to the found target address to the request processor that initiated the read transaction; and sending a miss message under the condition that the target address is not found.
It should be noted that the function description of the snoop filter refers to the snoop filter described in fig. 4, and is not repeated herein.
In the embodiment of the application, the consistency module can communicate with each processor through the bus.
It should be noted that the third-level cache and the snoop filter respond to the read transaction at the same time and look up the target address of the corresponding cache line, which can improve lookup efficiency. If the lookup hits in the third-level cache, the control circuit of the third-level cache returns the data corresponding to the target address to the requesting processor that issued the read transaction; if it hits in the snoop filter, the consistency control module sends a snoop transaction to the CPU core that may own the cache line; if it misses in both the third-level cache and the snoop filter, the main memory is accessed, and the control logic of the main memory is responsible for returning the data corresponding to the target address to the requesting processor that issued the read transaction.
For example, the flow in which the snoop filter and the third-level cache perform lookup operations in parallel is shown in fig. 6. First, SLVS_CPUS, SLVS_ACP, SLVS_SNP, or SLVS_BIB issues a read transaction to look up the target address of a cache line; when multiple read transactions enter the request queue at the same time, one read transaction is selected by the LFSR. The selected read transaction enters the third-level cache and the snoop filter simultaneously to look up the target address of the corresponding cache line. If the lookup hits in the third-level cache, the control circuit of the third-level cache returns the data corresponding to the target address to the requesting processor; if it hits in the snoop filter, the consistency control module subsequently sends a snoop transaction to the CPU core that may own the cache line; if both miss, the main memory is accessed, and the control logic of the main memory is responsible for returning the data needed by this read transaction to the requesting processor.
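The flow of fig. 6 can be approximated in software as below. This is a sketch under several assumptions: the application does not give the width or taps of the LFSR, so a common 16-bit Fibonacci LFSR (taps 16, 14, 13, 11) is used purely for illustration; the names `lfsr_step`, `select_transaction` and `resolve_read` are hypothetical; the third-level cache and the snoop filter are modelled as plain dictionaries; and the two lookups, which are parallel in hardware, appear here as two independent checks.

```python
def lfsr_step(state):
    """Advance a 16-bit Fibonacci LFSR by one step (assumed taps 16, 14, 13, 11)."""
    bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
    return (state >> 1) | (bit << 15)


def select_transaction(pending, state):
    """Pick one of several simultaneously arriving read transactions."""
    next_state = lfsr_step(state)
    return pending[next_state % len(pending)], next_state


def resolve_read(address, l3_cache, snoop_filter):
    """Look up an address in the third-level cache and the snoop filter.

    l3_cache     -- dict mapping address -> data (assumed model)
    snoop_filter -- dict mapping address -> id of the core that may own the line
    Returns a list of (action, payload) tuples.
    """
    actions = []
    if address in l3_cache:
        # Hit in the third-level cache: its control circuit returns the data.
        actions.append(("return_data", l3_cache[address]))
    if address in snoop_filter:
        # Hit in the snoop filter: a snoop transaction goes to the owning core.
        actions.append(("snoop", snoop_filter[address]))
    if not actions:
        # Miss in both: main memory supplies the data.
        actions.append(("access_main_memory", address))
    return actions
```

How a simultaneous hit in both structures is prioritised is not stated in the text above, so the sketch simply reports both actions.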
The various embodiments described above may be implemented as code and may be stored on a storage medium having stored thereon instructions that can be used to program a system to execute the instructions. The storage medium may include, but is not limited to, any type of disk including floppy disks, optical disks, compact disk read-only memories (CD-ROMs), compact disk rewritables (CD-RWs), and magneto-optical disks; semiconductor devices such as read-only memories (ROMs), random access memories (RAMs) such as dynamic random access memories (DRAMs) and static random access memories (SRAMs), erasable programmable read-only memories (EPROMs), flash memories, and electrically erasable programmable read-only memories (EEPROMs); magnetic or optical cards; or any other type of media suitable for storing electronic instructions.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of another like element in a process, method, article, or apparatus that comprises the element.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions to enable a multi-processor system (which may be a mobile phone, a computer, a server or a network device, etc.) to execute the above method.
The above description is only a preferred embodiment of the present application, and is not intended to limit the scope of the present application.

Claims (13)

1. A cache processing method applied to a snoop filter comprising a set of Random Access Memory (RAM) sets and an eviction buffer, the method comprising:
in response to a read transaction issued by a request processor for finding a target address of a cache line, determining a target RAM set pointed to by the target address from the set of RAM sets, and respectively searching the target address of the cache line in the target RAM set and the eviction buffer;
if the target address is not found in the target RAM set and the eviction buffer and it is detected that no storage space remains in the target RAM set, selecting a first RAM entry from the target RAM set;
sending a first back invalidation message aiming at the first RAM entry to a consistency control module, so that the consistency control module sends an invalidation transaction aiming at the first RAM entry to a first CPU core pointed by the first RAM entry according to the first back invalidation message; and evicting the first RAM entry.
2. The method of claim 1, wherein the evicting the first RAM entry comprises:
detecting whether a buffer space exists in the eviction buffer;
evicting the first RAM entry to the eviction buffer if there is buffer space in the eviction buffer;
if the buffer space does not exist in the eviction buffer, the first RAM entry is evicted to the eviction buffer after the buffer entry in the eviction buffer is released.
3. The method of claim 1, further comprising:
detecting a number of available buffers in the eviction buffer;
selecting a first buffer entry from the eviction buffer if the number is less than a preset threshold; and evicting the first buffer entry;
and sending a second back invalidation message aiming at the first buffer entry to the consistency control module, so that the consistency control module sends an invalidation transaction aiming at the first buffer entry to a second CPU core pointed by the first buffer entry according to the second back invalidation message.
4. The method of claim 1, wherein after the evicting the first RAM entry, the method further comprises:
in response to a write transaction issued by the requesting processor to write to the target address, writing the identification of the requesting processor and the target address in the target RAM set to form a second RAM entry.
5. The method of claim 1 or 4, wherein the RAM entry consists of the following fields: a field representing the identification of the CPU core, a field representing that the target address is shared by multiple cores, a field representing that the target address is held by a single core, and a field representing the target address.
6. The method of claim 1, wherein after the target address is looked up in the target RAM set and an eviction buffer, respectively, the method further comprises:
if the identification of the second CPU core containing the target address is found in the target RAM set or the eviction buffer, returning a corresponding hit message to the consistency control module, so that the consistency control module sends a snoop transaction to the second CPU core.
7. A snoop filter, the snoop filter comprising: a set of RAM sets, an eviction buffer, a controller, a result distribution circuit, a first sequential logic circuit, and a second sequential logic circuit;
the controller is used for responding to a read transaction which is sent by a request processor and used for searching a target address of a cache line, determining a target RAM set pointed by the target address from the group of RAM sets, and respectively searching the target address in the target RAM set and an eviction buffer to obtain a search result;
the result distribution circuit is used for transmitting, to the first sequential logic circuit, a search result indicating that the target address is not found in the target RAM set and the eviction buffer;
the first sequential logic circuit is used for detecting whether storage space remains in the target RAM set, and selecting a first RAM entry from the target RAM set under the condition that no storage space remains in the target RAM set; and evicting the first RAM entry;
the second sequential logic circuit is configured to send a first back invalidation message for the first RAM entry to the consistency control module, so that the consistency control module sends an invalidation transaction for the first RAM entry to the first CPU core to which the first RAM entry points according to the first back invalidation message.
8. The snoop filter as claimed in claim 7, wherein the filter further comprises: a third sequential logic circuit; the third sequential logic circuit is connected with the eviction buffer in a bidirectional mode, and the input end of the third sequential logic circuit is further connected with the output end of the group of RAM sets;
the third sequential logic circuit is configured to detect whether a buffer space exists in the eviction buffer; if a buffer space exists in the eviction buffer, writing the first RAM entry into the eviction buffer; and if the buffer space does not exist in the eviction buffer, the first RAM entry is evicted to the eviction buffer after the buffer entry in the eviction buffer is released.
9. The snoop filter as claimed in claim 7, wherein the filter further comprises: a fourth sequential logic circuit;
the fourth sequential logic circuit is configured to detect a number of available buffers in the eviction buffer; select a first buffer entry from the eviction buffer if the number is less than a preset threshold; and evict the first buffer entry;
the second sequential logic circuit is configured to send a second back invalidation message for the first buffer entry to the consistency control module, so that the consistency control module sends an invalidation transaction for the first buffer entry to a second CPU core to which the first buffer entry points according to the second back invalidation message.
10. The snoop filter of claim 7,
the controller is further configured to write the identification of the requesting processor and the target address in the target RAM set to form a second RAM entry in response to a write transaction issued by the requesting processor to write the target address.
11. The snoop filter of claim 7,
the result distribution circuit is used for transmitting a search result that hits the target address in the target RAM set or the eviction buffer to the consistency control module, so that the consistency control module sends a snoop transaction to the second CPU core that is found to contain the target address.
12. A multiprocessor system, characterized in that the system comprises: a plurality of processors, a bus and a consistency module, wherein the processors and the consistency module are connected through the bus; the consistency module comprises a request queue, a third-level cache, a snoop filter as claimed in any of claims 7 to 11, and a consistency control module; wherein,
the request queue is configured to receive at least one read transaction sent by at least one of the plurality of processors through the bus, and queue the at least one read transaction to obtain a read transaction queue; the read transaction is used for searching a target address of a corresponding cache line;
the third-level cache and the snoop filter are used for simultaneously responding to the read transaction arranged at the first position in the read transaction queue and performing the search process of the target address of the corresponding cache line;
the third-level cache is further configured to, when the target address is found, return data corresponding to the found target address to the request processor that initiated the read transaction; and sending a miss message under the condition that the target address is not found.
13. A storage medium having stored thereon a computer program which, when executed by a snoop filter, implements the method of any one of claims 1 to 6.
CN202211633337.7A 2022-12-19 2022-12-19 Buffer processing method, snoop filter, multiprocessor system, and storage medium Pending CN115934367A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211633337.7A CN115934367A (en) 2022-12-19 2022-12-19 Buffer processing method, snoop filter, multiprocessor system, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211633337.7A CN115934367A (en) 2022-12-19 2022-12-19 Buffer processing method, snoop filter, multiprocessor system, and storage medium

Publications (1)

Publication Number Publication Date
CN115934367A true CN115934367A (en) 2023-04-07

Family

ID=86551986

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211633337.7A Pending CN115934367A (en) 2022-12-19 2022-12-19 Buffer processing method, snoop filter, multiprocessor system, and storage medium

Country Status (1)

Country Link
CN (1) CN115934367A (en)

Similar Documents

Publication Publication Date Title
US6304945B1 (en) Method and apparatus for maintaining cache coherency in a computer system having multiple processor buses
US9720839B2 (en) Systems and methods for supporting a plurality of load and store accesses of a cache
US8706973B2 (en) Unbounded transactional memory system and method
US8015365B2 (en) Reducing back invalidation transactions from a snoop filter
KR100318789B1 (en) System and method for managing cache in a multiprocessor data processing system
US7281092B2 (en) System and method of managing cache hierarchies with adaptive mechanisms
CA1238984A (en) Cooperative memory hierarchy
CN1991793B (en) Method and system for proximity caching in a multiple-core system
US8209499B2 (en) Method of read-set and write-set management by distinguishing between shared and non-shared memory regions
US20030200404A1 (en) N-way set-associative external cache with standard DDR memory devices
JPH11506852A (en) Reduction of cache snooping overhead in a multi-level cache system having a large number of bus masters and a shared level 2 cache
CN113342709B (en) Method for accessing data in a multiprocessor system and multiprocessor system
US9645931B2 (en) Filtering snoop traffic in a multiprocessor computing system
US20020169935A1 (en) System of and method for memory arbitration using multiple queues
US6832294B2 (en) Interleaved n-way set-associative external cache
US7574566B2 (en) System and method for efficient software cache coherence
CN103076992A (en) Memory data buffering method and device
US6449698B1 (en) Method and system for bypass prefetch data path
CN114036089B (en) Data processing method and device, buffer, processor and electronic equipment
US20040117558A1 (en) System for and method of operating a cache
US5680577A (en) Method and system for processing multiple requests for data residing at the same memory address
US6839806B2 (en) Cache system with a cache tag memory and a cache tag buffer
CN114238171B (en) Electronic equipment, data processing method and device and computer system
KR100304318B1 (en) Demand-based issuance of cache operations to a processor bus
CN115934367A (en) Buffer processing method, snoop filter, multiprocessor system, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination