Disclosure of Invention
The embodiment of the invention provides a method and a device for maintaining Cache data consistency according to directory information, which can reduce the difficulty of NUMA system design and implementation.
In a first aspect, an embodiment of the present invention provides a method for maintaining Cache data consistency according to directory information, where the directory information is stored in a main memory and is used to record how data blocks in a main memory system area are cached by the Cache of each processor, and the method includes: receiving a data block state change request; determining a specified data block in the main memory system area according to the data block state change request; determining a target processor according to the directory information stored in the main memory, wherein the target processor is a processor that has stored the specified data block in its processor Cache; and sending a state change indication to the target processor, wherein the data block state change request is used for requesting the target processor to change the state of the specified data block cached in the Cache of the target processor.
With reference to the first aspect, in a first possible implementation manner of the first aspect, the determining a target processor according to directory information stored in the main memory includes: detecting whether specified directory information corresponding to the specified data block exists in a directory area of the main memory, wherein the directory area of the main memory is used for storing the directory information; and when the specified directory information exists in the directory area of the main memory, determining the target processor according to the specified directory information.
With reference to the first aspect, in a second possible implementation manner of the first aspect, the determining a target processor according to directory information stored in the main memory includes: detecting whether specified directory information corresponding to the specified data block exists in a directory cache of an interconnection chip; when the specified directory information does not exist in the directory cache of the interconnection chip, detecting whether the specified directory information corresponding to the specified data block exists in the directory area of the main memory; and when the specified directory information exists in the directory area of the main memory, determining the target processor according to the specified directory information.
With reference to the second possible implementation manner of the first aspect, in a third possible implementation manner of the first aspect, the method further includes: when the specified directory information exists in the directory cache of the interconnection chip, determining the target processor according to the specified directory information.
With reference to the second possible implementation manner of the first aspect, in a fourth possible implementation manner of the first aspect, the method further includes: when the specified directory information does not exist in the directory cache of the interconnection chip, updating the directory information in the directory cache.
With reference to any one of the second to fourth possible implementation manners of the first aspect, in a fifth possible implementation manner of the first aspect, the directory information stored in the directory cache is generated by mapping the directory information in the directory area of the main memory in a multi-way set associative manner.
With reference to the first aspect or any one of the first to fifth possible implementation manners of the first aspect, in a sixth possible implementation manner of the first aspect, the sending a state change indication to the target processor includes: when the data block state change request is a data block rewriting request, sending a failure indication to the target processor, wherein the failure indication is used to instruct the target processor to invalidate the specified data block cached in the Cache of the target processor; or when the data block state change request is a data block sharing request, sending a sharing indication to the target processor, wherein the sharing indication is used to instruct the target processor to modify the specified data block cached in the Cache of the target processor into a shared mode.
With reference to the first aspect or any one of the first to sixth possible implementation manners of the first aspect, in a seventh possible implementation manner of the first aspect, the directory information includes an access status field, an information status field, and a Cache indication field, where the access status field is used to indicate whether the directory information is being accessed, the information status field is used to indicate a status of the directory information, and the Cache indication field is used to indicate a status that a data block corresponding to the directory information is cached by a processor Cache.
In a second aspect, an embodiment of the present invention provides an apparatus for maintaining Cache data consistency according to directory information, where the directory information is stored in a main memory and is used to record how data blocks in a main memory system area are cached by the Cache of each processor, and the apparatus includes: a receiving unit, configured to receive a data block state change request; a specified data block determining unit, configured to determine a specified data block in the main memory system area according to the data block state change request; a target processor determining unit, configured to determine a target processor according to the directory information stored in the main memory, wherein the target processor is a processor that has stored the specified data block in its processor Cache; and a sending unit, configured to send a state change indication to the target processor, wherein the data block state change request is used for requesting the target processor to change the state of the specified data block cached in the Cache of the target processor.
With reference to the second aspect, in a first possible implementation manner of the second aspect, the target processor determining unit includes: a first detecting subunit, configured to detect whether specified directory information corresponding to the specified data block exists in a directory area of the main memory, where the directory area of the main memory is used to store the directory information; a first determining subunit, configured to determine the target processor according to the specified directory information when the specified directory information exists in the directory area of the main memory.
With reference to the second aspect, in a second possible implementation manner of the second aspect, the target processor determining unit includes: the second detection subunit is used for detecting whether the directory cache of the interconnection chip has the specified directory information corresponding to the specified data block; a third detecting subunit, configured to detect whether specified directory information corresponding to the specified data block exists in a directory area of the main memory when the specified directory information does not exist in a directory cache of an interconnect chip; a second determining subunit, configured to determine the target processor according to the specified directory information when the specified directory information exists in the directory area of the main memory.
With reference to the second possible implementation manner of the second aspect, in a third possible implementation manner of the second aspect, the second determining subunit is further configured to determine the target processor according to the specified directory information when the specified directory information exists in a directory cache of an interconnection chip.
With reference to the second possible implementation manner of the second aspect, in a fourth possible implementation manner of the second aspect, the target processor determining unit further includes: an updating subunit, configured to update the directory information in the directory cache of the interconnection chip when the specified directory information does not exist in the directory cache.
With reference to any one of the second to fourth possible implementation manners of the second aspect, in a fifth possible implementation manner of the second aspect, the directory information stored in the directory cache is generated by mapping the directory information in the directory area of the main memory in a multi-way set associative manner.
With reference to the second aspect or any one of the first to fifth possible implementation manners of the second aspect, in a sixth possible implementation manner of the second aspect, the sending unit is configured to: send a failure indication to the target processor when the data block state change request is a data block rewrite request, where the failure indication is used to instruct the target processor to invalidate the specified data block cached in the Cache of the target processor; or send a sharing indication to the target processor when the data block state change request is a data block sharing request, where the sharing indication is used to instruct the target processor to modify the specified data block cached in the Cache of the target processor into a shared mode.
With reference to the second aspect or any one of the first to sixth possible implementation manners of the second aspect, in a seventh possible implementation manner of the second aspect, the directory information includes an access status field, an information status field, and a Cache indication field, where the access status field is used to indicate whether the directory information is being accessed, the information status field is used to indicate a status of the directory information, and the Cache indication field is used to indicate a status that a data block corresponding to the directory information is cached by a processor Cache.
In the embodiment of the invention, a data block state change request is received; a specified data block in a main memory system area is determined according to the data block state change request; a target processor is determined according to directory information stored in the main memory, wherein the directory information is used for recording how data blocks in the main memory system area are cached by each processor Cache, and the target processor is a processor that has stored the specified data block in its processor Cache; and a state change indication is sent to the target processor, wherein the data block state change request is used for requesting the target processor to change the state of the specified data block cached in the Cache of the target processor. By adopting the embodiment of the invention, the directory information is stored in the main memory, so that the storage space of the interconnection chip need not be used to hold the directory information, which reduces the difficulty of designing and implementing the NUMA system.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the non-uniform memory access (NUMA) architecture system of the present invention may be formed by at least one multiprocessor system, wherein each multiprocessor system may include a main memory and at least two processors, and directory information may be stored in the main memory. The main memory comprises a directory area and a system area, wherein the directory area is used for storing directory information and the system area is used for storing the data blocks that need to be stored in the main memory. Each directory area can be used for storing the directory information of the data blocks in the system area of the main memory to which the directory area belongs, and can also be used for storing the directory information of the data blocks in the system areas of other main memories.
Referring to fig. 1, a schematic flow chart of an embodiment of the method for maintaining Cache data consistency according to directory information in the present invention is shown. The method comprises the following steps:
Step 101, receiving a data block state change request.
A processor in a NUMA system first receives a data block state change request, which may be sent by an application in the system. The data block state change request may be a data block rewriting request, a data block sharing request, or the like, where the data block rewriting request is used to request rewriting of a certain data block in the main memory, and the data block sharing request is used to request sharing of a certain data block in the main memory.
Step 102, determining the specified data block in the main memory system area according to the data block state change request.
After receiving the data block state change request, the processor first determines, according to the data block state change request, which data block in the main memory system area the request is directed at, that is, the specified data block.
For example, when the data block state change request carries an address range of a specified data block, the data block in the address range may be considered as the specified data block. When the data block state change request carries the start address of the specified data block and the size of the specified data block can be determined according to the data block state change request, the specified data block can be determined according to the start address of the specified data block and the size of the specified data block.
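The two ways of identifying the specified data block can be sketched as follows. This is only an illustrative sketch: the function name `specified_blocks`, the 64-byte block size, and the treatment of the end address as exclusive are assumptions made for the example and are not fixed by the embodiment.

```python
BLOCK_SIZE = 64  # assumed data block size in bytes; the embodiment does not fix a value


def specified_blocks(start_addr, size=None, end_addr=None, block_size=BLOCK_SIZE):
    """Return the base addresses of the data blocks covered by a request.

    The request is assumed to carry either an address range
    (start_addr, end_addr), with end_addr exclusive, or a start
    address plus a size in bytes.
    """
    if end_addr is None:
        if size is None:
            raise ValueError("request must carry an end address or a size")
        end_addr = start_addr + size
    if end_addr <= start_addr:
        raise ValueError("empty or inverted address range")
    first = (start_addr // block_size) * block_size      # align the start down to a block boundary
    last = ((end_addr - 1) // block_size) * block_size   # block that holds the last requested byte
    return list(range(first, last + block_size, block_size))
```

A request that is not aligned to a block boundary therefore yields every block it touches, for example two blocks for a 64-byte request starting at 0x1010.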
Step 103, determining a target processor according to directory information stored in the main memory, wherein the directory information is used for recording the condition that data blocks in a main memory system area are cached by each processor Cache, and the target processor refers to a processor which stores specified data blocks into the processor Cache.
After the specified data block is determined, the processor may detect whether specified directory information corresponding to the specified data block exists in a directory area of the main memory; and when the specified directory information exists in the main memory directory area, determining the target processor according to the specified directory information.
Each piece of directory information in the main memory directory area may correspond to a data block with a fixed size in the main memory, and the size of the data block may be preset according to needs, for example, may be the same as the size of each Cache line (Cacheline) in the processor Cache.
The directory information in the directory area may be generated by mapping in a multi-way set associative manner based on the principle of locality. In this way, directory information needs to be generated only for the data blocks in the main memory that may be cached by a processor Cache, rather than for all data blocks in the main memory, which greatly reduces the amount of directory information and the space occupied by the directory area. In addition, the directory information corresponding to a data block at a fixed address in the system area can be stored at a fixed address in the directory area, so that the storage address of the directory information can be determined easily from the address of the data block, which speeds up the lookup of the directory information.
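One possible way to derive the fixed location of a directory entry from a data block address is sketched below. The block size, number of sets, associativity, the use of a tag, and the names `directory_set` and `first_slot_of_set` are all assumptions chosen for illustration; the embodiment only states that the mapping is multi-way set associative.

```python
BLOCK_SIZE = 64   # assumed data block size in bytes
NUM_SETS = 1024   # assumed number of sets in the directory area
NUM_WAYS = 4      # assumed associativity (entries per set)


def directory_set(block_addr):
    """Split a data block address into (set index, tag).

    Blocks with the same set index compete for the NUM_WAYS entries
    of that set; the tag tells them apart.
    """
    block_number = block_addr // BLOCK_SIZE
    return block_number % NUM_SETS, block_number // NUM_SETS


def first_slot_of_set(block_addr):
    """Index (in entries) of the first slot of the set for this block."""
    set_index, _ = directory_set(block_addr)
    return set_index * NUM_WAYS
```

Because the set index is a pure function of the block address, a lookup only needs to inspect the few entries of one set instead of searching the whole directory area.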
The structure of the directory information may be as shown in fig. 2, where the directory information includes an access status field, an information status field, and a Cache indication field, where the access status field is used to indicate whether the directory information is being accessed, the information status field is used to indicate the status of the directory information, and the Cache indication field is used to indicate how the data block corresponding to the directory information is cached by a processor Cache. In general, the length of the access status field may be 1 bit, the length of the information status field may be 3 bits, and the length of the Cache indication field may be 4 bits. How the data block corresponding to the directory information is cached by a processor Cache refers to whether the data block is cached by any processor Cache, which processor Cache or Caches hold it, and the like.
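With the field widths given above (1 + 3 + 4 bits), one directory entry fits in a single byte. The sketch below packs and unpacks such an entry. The bit positions chosen here (access status in the most significant bit, Cache indication in the low four bits) and the function names are assumptions; the embodiment gives only the field lengths.

```python
ACCESS_BITS, STATE_BITS, CACHE_BITS = 1, 3, 4  # field widths stated in the text


def pack_entry(accessing, info_state, cache_indication):
    """Pack the three fields into one 8-bit value.

    Assumed layout: bit 7 = access status, bits 6..4 = information
    status, bits 3..0 = Cache indication.
    """
    if not 0 <= accessing < (1 << ACCESS_BITS):
        raise ValueError("access status must fit in 1 bit")
    if not 0 <= info_state < (1 << STATE_BITS):
        raise ValueError("information status must fit in 3 bits")
    if not 0 <= cache_indication < (1 << CACHE_BITS):
        raise ValueError("Cache indication must fit in 4 bits")
    return (accessing << 7) | (info_state << 4) | cache_indication


def unpack_entry(entry):
    """Return (access status, information status, Cache indication)."""
    return (entry >> 7) & 0x1, (entry >> 4) & 0x7, entry & 0xF
```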
In order to speed up the processing speed of data block rewriting, the NUMA system may also include an interconnect chip, and the interconnect chip is provided with a directory cache for storing directory information in a directory area of the main memory in a cache manner. In order to reduce the space of the directory cache, the directory information stored in the directory cache is generated by mapping the directory information in the directory area of the main memory in a multi-way set associative manner.
Therefore, after the specified data block is determined, the processor may also detect whether specified directory information corresponding to the specified data block exists in a directory cache of the interconnection chip; and when the specified directory information exists in the directory cache of the interconnected chip, determining the target processor according to the specified directory information. When the specified directory information does not exist in the directory cache of the interconnection chip, the processor detects whether the specified directory information corresponding to the specified data block exists in the directory area of the main memory; and when the specified directory information exists in the main memory directory area, determining the target processor according to the specified directory information.
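The two-level lookup described above can be sketched as follows. Both the directory cache of the interconnection chip and the directory area of the main memory are modeled as plain dictionaries keyed by block address, which is a simplification for illustration; the function name `find_directory_entry` is hypothetical. Refilling the directory cache after a miss corresponds to the optional update of the directory cache mentioned in the summary.

```python
def find_directory_entry(block_addr, directory_cache, directory_area):
    """Look in the directory cache first, then in the main memory directory area.

    Returns the directory entry, or None when neither store holds one.
    """
    entry = directory_cache.get(block_addr)
    if entry is not None:
        return entry  # hit in the directory cache of the interconnection chip
    entry = directory_area.get(block_addr)
    if entry is not None:
        # Miss in the directory cache: update it with the entry from main memory.
        directory_cache[block_addr] = entry
    return entry
```

A result of `None` means that no directory information exists for the block in either location.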
If the directory information comprises an access status field, an information status field and a cache indication field, when the target processor is determined according to the specified directory information, the content of the cache indication field of the specified directory information can be analyzed, and the target processor is determined according to the content of the cache indication field. For example, when it is determined that the specified data block is cached in the Cache of the first processor according to the Cache indication field, the first processor may be determined to be the target processor. When it is determined that the specified data block is cached in both the second processor Cache and the third processor Cache according to the Cache indication field, it may be determined that both the second processor and the third processor are the target processor.
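As a minimal sketch of this analysis, the Cache indication field can be read as a presence bitmask. This encoding is an assumption: the embodiment does not specify how the 4-bit field encodes processors. Here bit i being set is taken to mean that processor i (counting from 0) has cached the block, and `target_processors` is a hypothetical helper name.

```python
def target_processors(cache_indication, num_processors=4):
    """Return the IDs of processors whose Cache holds the block.

    Assumes bit i of the Cache indication field is set when
    processor i has cached the specified data block.
    """
    return [p for p in range(num_processors) if cache_indication & (1 << p)]
```

Under this encoding a field value of 0b0110 yields two target processors, and a value of 0 yields none.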
If the specified directory information exists neither in the directory cache of the interconnection chip nor in the directory area of the main memory, the specified data block is considered not to be cached by any processor Cache, and the processor can directly change the state of the specified data block according to the data block state change request.
Step 104, sending a state change indication to the target processor, wherein the data block state change request is used for requesting the target processor to change the state of the specified data block cached in the Cache of the target processor.
If it is determined according to the directory information that a target processor exists, the processor can send a state change indication to the target processor, and the data block state change request is used for requesting the target processor to change the state of the specified data block cached in the Cache of the target processor. The target processor can modify the state of the specified data block according to the data block state change request, so that the consistency of the data in the Caches of the processors can be ensured.
The state change indication may differ according to the type of the data block state change request. For example, when the data block state change request is a data block rewrite request, a failure indication is sent to the target processor, where the failure indication is used to instruct the target processor to invalidate the specified data block cached in the Cache of the target processor; or when the data block state change request is a data block sharing request, a sharing indication is sent to the target processor, where the sharing indication is used to instruct the target processor to modify the specified data block cached in the Cache of the target processor into a shared mode.
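The correspondence between request type and indication type can be written as a small lookup. The enum and function names below are hypothetical and serve only to illustrate the two cases named in the text.

```python
from enum import Enum


class Request(Enum):
    REWRITE = "rewrite"   # data block rewrite request
    SHARE = "share"       # data block sharing request


class Indication(Enum):
    FAILURE = "failure"   # tells the target processor to invalidate its cached copy
    SHARING = "sharing"   # tells the target processor to mark its cached copy as shared


def indication_for(request):
    """Choose the state change indication for a given request type."""
    mapping = {
        Request.REWRITE: Indication.FAILURE,
        Request.SHARE: Indication.SHARING,
    }
    return mapping[request]
```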
The target processor, upon receiving the state change indication, may change the state of the specified data block in accordance with the state change indication. For example, when the target processor receives a failure indication, the specified data block corresponding to the failure indication in the Cache of the target processor may be invalidated; and when the target processor receives the sharing indication, the specified data block corresponding to the sharing indication in the Cache of the target processor may be set to the shared mode.
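The behavior on the target processor side can be sketched as below. The processor Cache is modeled as a dictionary from block address to a state string, and the state names ("modified", "shared", "invalid") are assumptions borrowed from common coherence protocols; the embodiment names only the invalidated and shared outcomes.

```python
def apply_indication(cache_lines, block_addr, indication):
    """Apply a state change indication to one cached block.

    cache_lines maps block address -> state string. A block that is
    not present in the Cache is left untouched.
    """
    if indication not in ("failure", "sharing"):
        raise ValueError("unknown indication: %r" % (indication,))
    if block_addr not in cache_lines:
        return  # the block is not cached here, nothing to change
    if indication == "failure":
        cache_lines[block_addr] = "invalid"   # invalidate the cached copy
    else:
        cache_lines[block_addr] = "shared"    # keep the copy, mark it as shared
```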
In this embodiment, a data block state change request is received; a specified data block in a main memory system area is determined according to the data block state change request; a target processor is determined according to directory information stored in the main memory, wherein the directory information is used for recording how data blocks in the main memory system area are cached by each processor Cache, and the target processor is a processor that has stored the specified data block in its processor Cache; and a state change indication is sent to the target processor, wherein the data block state change request is used for requesting the target processor to change the state of the specified data block cached in the Cache of the target processor. By adopting this embodiment, the directory information is stored in the main memory, so that the storage space of the interconnection chip need not be used to hold the directory information, which reduces the difficulty of designing and implementing the NUMA system.
Corresponding to the embodiment of the method for maintaining the consistency of the Cache data according to the directory information, the embodiment of the invention also provides an embodiment of a device for maintaining the consistency of the Cache data according to the directory information.
Referring to fig. 3, a schematic structural diagram of an embodiment of the apparatus for maintaining Cache data consistency according to directory information in the present invention is shown, where the directory information is stored in a main memory, and the directory information stored in the directory Cache is generated by mapping the directory information in a directory area of the main memory in a multi-way set associative manner. The device can be used for executing the method for maintaining the consistency of the Cache data according to the directory information in the embodiment.
As shown in fig. 3, the apparatus may include: a receiving unit 301, a specified data block determining unit 302, a target processor determining unit 303, and a transmitting unit 304.
The receiving unit 301 is configured to receive a data block state change request; a specified data block determination unit 302, configured to determine a specified data block in the main memory system area according to the data block state change request; a target processor determining unit 303, configured to determine a target processor according to directory information stored in the main memory, where the target processor is a processor that has stored a specified data block into a processor Cache; a sending unit 304, configured to send a state change indication to the target processor, where the data block state change request is used to request the target processor to change a state of the specified data block cached in the Cache of the target processor.
Optionally, the target processor determining unit 303 includes: a first detecting subunit, configured to detect whether specified directory information corresponding to the specified data block exists in a directory area of the main memory, where the directory area of the main memory is used to store the directory information; a first determining subunit, configured to determine the target processor according to the specified directory information when the specified directory information exists in the directory area of the main memory.
Optionally, the target processor determining unit 303 includes: a second detecting subunit, configured to detect whether specified directory information corresponding to the specified data block exists in a directory cache of an interconnection chip; a third detecting subunit, configured to detect whether the specified directory information corresponding to the specified data block exists in a directory area of the main memory when the specified directory information does not exist in the directory cache of the interconnection chip; and a second determining subunit, configured to determine the target processor according to the specified directory information when the specified directory information exists in the directory area of the main memory. The second determining subunit may be further configured to determine the target processor according to the specified directory information when the specified directory information exists in the directory cache of the interconnection chip. The target processor determining unit 303 may further include: an updating subunit, configured to update the directory information in the directory cache when the specified directory information does not exist in the directory cache of the interconnection chip.
Optionally, the sending unit 304 is configured to: send a failure indication to the target processor when the data block state change request is a data block rewrite request, where the failure indication is used to instruct the target processor to invalidate the specified data block cached in the Cache of the target processor; or send a sharing indication to the target processor when the data block state change request is a data block sharing request, where the sharing indication is used to instruct the target processor to modify the specified data block cached in the Cache of the target processor into a shared mode. The directory information may include an access status field, an information status field, and a Cache indication field, where the access status field is used to indicate whether the directory information is being accessed, the information status field is used to indicate a status of the directory information, and the Cache indication field is used to indicate how the data block corresponding to the directory information is cached by a processor Cache.
In this embodiment, the apparatus for maintaining the consistency of Cache data according to directory information includes: a receiving unit, configured to receive a data block state change request; a specified data block determination unit for determining a specified data block in the main memory system area according to the data block state change request; a target processor determining unit configured to determine a target processor based on the directory information stored in the main memory, and a transmitting unit configured to transmit a state change instruction to the target processor. By adopting the embodiment, the directory information is stored in the main memory, so that the directory storage space of the interconnection chip can be prevented from being used for storing the directory information, and the difficulty in designing and realizing the NUMA system is reduced.
Those skilled in the art will readily appreciate that the techniques of the embodiments of the present invention may be implemented as software plus a required general purpose hardware platform. Based on such understanding, the technical solutions in the embodiments of the present invention may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above-described embodiments of the present invention do not limit the scope of the present invention. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.