CN105700953A - Multiprocessor cache coherence processing method and device - Google Patents

Multiprocessor cache coherence processing method and device

Info

Publication number
CN105700953A
Authority
CN
China
Prior art keywords
processor
page
descriptor
cache blocks
directory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410704522.XA
Other languages
Chinese (zh)
Other versions
CN105700953B (en)
Inventor
张广飞
崔晓松
黄勤业
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Hangzhou Huawei Digital Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Huawei Digital Technologies Co Ltd filed Critical Hangzhou Huawei Digital Technologies Co Ltd
Priority to CN201410704522.XA priority Critical patent/CN105700953B/en
Publication of CN105700953A publication Critical patent/CN105700953A/en
Application granted granted Critical
Publication of CN105700953B publication Critical patent/CN105700953B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses a multiprocessor cache coherence processing method and device. The method is applied to a multiprocessor system and comprises the following steps: when the last-level cache receives a first request from a first processor in the multiprocessor system to perform a read operation on a cache block, reading a first page descriptor corresponding to the cache block from memory; and if the first processor is determined, according to the first page descriptor, to be the first processor to access the cache block, adding a page tag to the first page descriptor to obtain a second page descriptor, wherein the page tag marks the first processor as the processor that accesses the cache block for the first time. The method provided by the invention solves the problem that existing methods for maintaining system coherence in a multiprocessor system waste a great deal of on-chip storage area.

Description

Multiprocessor cache coherence processing method and device
Technical field
The present invention relates to the field of electronic technology, and in particular to a multiprocessor cache coherence processing method and device.
Background technology
The progress of semiconductor technology has kept the number of transistors per unit area of an integrated circuit increasing after 2013, at a pace of roughly one doubling every three years. To make full use of the transistors added by each process advance, processor manufacturers have turned from single-core designs to multi-core/many-core designs, improving overall system performance by exploiting thread-level parallelism (Thread-Level Parallelism, TLP). Meanwhile, manufacturers also integrate multiple processor chips into one system to obtain higher performance; a system designed this way is called a multiprocessor system. For example, a current advanced processor chip can contain 12 processors, and four such processor chips can be integrated together, eventually forming a 48-core server-class computing system. Multi-core, many-core and multiprocessor structures are collectively referred to as multi-processing architectures.
Although multi-processing architectures further improve computer system performance, this design also faces problems never encountered in single-core processor design. One of them is the design of cache coherence. Current computer systems are often composed of multiple processors, and a single processor may itself contain multiple processor cores. Making the processors work together correctly is the primary problem. It can be subdivided into two sub-problems, namely memory consistency and cache coherence.
Memory consistency, also called the memory consistency model, defines the permitted behavior and ordering of read and write operations on shared storage. Among all memory consistency models, the strictest and most intuitive is the sequential consistency model (Sequential Consistency, SC). The mainstream general-purpose multi-core/multiprocessor systems of today (such as x86, x86-64 and SPARC) adopt an approximation of sequential consistency called Total Store Order (TSO).
The introduction of the sequential consistency model and the TSO model greatly eases the work of programmers, and most current general-purpose processors implement the TSO model. However, both models place new requirements on cache design, namely that caches must satisfy cache coherence. Cache coherence requires that the multiple caches in a computer system obey the "single writer, multiple readers" rule: at any time, for any piece of data, only the processor performing a write operation may cache the data while it is being written, whereas multiple processors may cache the data simultaneously when it is only being read. In a general-purpose computing system, a correct and efficient cache coherence design is a very important problem.
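The "single writer, multiple readers" rule can be stated as a simple invariant over the cached copies of a block. The following minimal C sketch is an editorial illustration only, not part of the patent; the block_state_t type and its fields are hypothetical.

#include <stdbool.h>
#include <stdint.h>

typedef struct {
    int      writer_id;    /* -1 if no processor is currently writing the block */
    uint64_t sharer_mask;  /* bit i set if processor i caches the block */
} block_state_t;

static bool coherence_invariant_holds(const block_state_t *b)
{
    if (b->writer_id < 0)
        return true;                                  /* read-only sharing: any number of readers */
    return b->sharer_mask == (1ULL << b->writer_id);  /* a writer must be the only processor caching it */
}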
As the number of processors integrated in a system grows, how to maintain cache coherence among the processors more efficiently has become a problem of common concern in academia and industry. There are two basic kinds of coherence protocol: snooping protocols (snoop) and directory protocols (directory). Because directory protocols scale better than snooping protocols, they are often employed in large-scale and ultra-large-scale systems.
Fig. 1 shows the logical structure of a system adopting a directory protocol. The system has a directory hardware unit that is connected to all processor caches through a network, and every update to coherence state must pass through the directory hardware. When processor A needs to perform a read operation on cache block M, processor A first sends a read request to the directory hardware; the directory hardware looks up the directory and finds that the cache of processor B holds the data, so it forwards a message to the cache of processor B; after the cache of processor B receives the message, it sends the data to the cache of processor A.
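The read flow of Fig. 1 can be pictured as a tiny simulation. Everything below (NPROC, dir_owner, read_block and so on) is an illustrative placeholder written for this description under assumed data structures, not hardware or an interface defined by the patent.

#include <stdio.h>

#define NPROC   4
#define NBLOCKS 16

static int dir_owner[NBLOCKS];      /* -1: no cached copy; otherwise the CPU holding the block */
static int cached[NPROC][NBLOCKS];  /* 1 if CPU i currently caches block j */

static void read_block(int requester, int block)
{
    int owner = dir_owner[block];
    if (owner >= 0 && owner != requester) {
        /* Another processor's cache holds the block: the directory forwards
         * the request and that cache supplies the data to the requester. */
        printf("directory: forward read of block %d to P%d for P%d\n",
               block, owner, requester);
    } else {
        /* No other cached copy: the data is supplied from memory. */
        printf("directory: block %d supplied from memory to P%d\n",
               block, requester);
        dir_owner[block] = requester;
    }
    cached[requester][block] = 1;   /* the requester now caches the block */
}

int main(void)
{
    for (int b = 0; b < NBLOCKS; b++) dir_owner[b] = -1;
    read_block(1, 5);  /* processor B reads block M first and caches it        */
    read_block(0, 5);  /* processor A reads block M: request forwarded to B    */
    return 0;
}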
The prior art provides a sparse directory scheme to solve the cache coherence problem in cache-coherent non-uniform memory access (Cache Coherent Non-Uniform Memory Access, ccNUMA) systems. As shown in Fig. 2, the sparse directory stores directory information on-chip in set-associative form; because its organization is very close to the structure of a cache, it is also known as a directory cache.
Again suppose processor A needs to perform a read operation on cache block M: processor A first sends a read request to sparse directory A; the directory lookup finds that the cache of processor B holds the data, so a message is forwarded to the cache of processor B; after the cache of processor B receives the message, it sends the data to the cache of processor A.
A sparse directory can adopt a very low associativity, so its access energy consumption is relatively low. However, because the sparse directory is a set-associative design, accesses to its directory sets are unbalanced, and the heavily accessed sets suffer directory conflicts. A directory conflict requires some directory blocks to be replaced (evicted) out of the sparse directory, and to maintain system coherence the cached data corresponding to the replaced directory blocks must be invalidated. In addition, the system must set aside dedicated storage to hold the cache coherence information, so the directory cache in this approach occupies a large amount of on-chip storage area.
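The conflict behavior described above follows directly from the set-associative organization. The C sketch below illustrates it under assumed parameters; DIR_SETS, DIR_WAYS and the helper names are hypothetical and not taken from the patent.

#include <stdint.h>

#define DIR_SETS 256
#define DIR_WAYS 4   /* a low associativity keeps lookups cheap */

static uint64_t dir_tag[DIR_SETS][DIR_WAYS];
static int      dir_valid[DIR_SETS][DIR_WAYS];

static void invalidate_tracked_copies(uint64_t victim_tag)
{
    /* Placeholder: a real system would send invalidations to every cache
     * listed as a sharer by the evicted directory entry. */
    (void)victim_tag;
}

static void dir_allocate(uint64_t block_addr)
{
    unsigned set = (unsigned)(block_addr % DIR_SETS);   /* set index from the address */
    for (int w = 0; w < DIR_WAYS; w++) {
        if (!dir_valid[set][w]) {                       /* a free way: no conflict */
            dir_valid[set][w] = 1;
            dir_tag[set][w]   = block_addr;
            return;
        }
    }
    /* Directory conflict: every way of this set is occupied, so a victim
     * entry is replaced and the cache blocks it tracked must be invalidated
     * to keep the system coherent -- the overhead described above. */
    invalidate_tracked_copies(dir_tag[set][0]);
    dir_tag[set][0] = block_addr;
}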
Summary of the invention
The present invention provides a multiprocessor cache coherence processing method and device, which solve the problem that existing methods for maintaining system coherence in a multiprocessor system waste a large amount of on-chip storage area.
In a first aspect, the present invention provides a multiprocessor cache coherence processing method. The method is applied to a multiprocessor system that includes multiple processors, a last-level cache LLC, a memory and a directory cache, wherein the last-level cache is connected to the multiple processors, the memory and the directory cache respectively. The method includes:
after receiving a first request from a first processor among the multiple processors to perform a read operation on a cache block, the last-level cache reads a first page descriptor corresponding to the cache block from the memory;
if it is determined according to the first page descriptor that the first processor is the first processor to access the cache block, a page tag is added to the first page descriptor to obtain a second page descriptor; the page tag marks the first processor as the processor that accessed the cache block for the first time.
With reference to the first aspect, in a first possible implementation, the method further includes:
when a second request from a second processor to perform a read operation on the cache block is received, obtaining the second page descriptor according to the second request;
updating the page tag in the second page descriptor to obtain a third page descriptor; the updated page tag marks the first processor as the processor that accessed the cache block for the first time and indicates that the cache block is operated on by multiple processors;
sending a directory allocation message to the directory cache, so that the directory cache allocates a directory entry, the directory entry being used to record the processors that access the cache block.
With reference to the first possible implementation of the first aspect, in a second possible implementation, after the third page descriptor is obtained, the method further includes:
storing the third page descriptor, and sending the third page descriptor to the first processor and the second processor for storage.
With reference to the first aspect or the first to second possible implementations of the first aspect, in a third possible implementation, the page tag includes: a first-access record FA for indicating whether any processor has accessed the cache block; a processor identifier for indicating the unique identifier of the processor that accessed the cache block for the first time; and a shareability scope SH for recording whether coherence for the cache block is maintained within one processor or among multiple processors.
With reference to the first aspect or the first to third possible implementations of the first aspect, in a fourth possible implementation, after the page descriptor corresponding to the cache block is updated, the method further writes the updated page descriptor back into the memory.
In a second aspect, the present invention also provides a multiprocessor cache coherence processing device. The device is applied in the last-level cache of a multiprocessor system that includes multiple processors, a last-level cache, a memory and a directory cache, wherein the last-level cache is connected to the multiple processors, the memory and the directory cache respectively. The device includes:
a reading unit, configured to read a first page descriptor corresponding to a cache block from the memory after receiving a first request from a first processor among the multiple processors to perform a read operation on the cache block;
an updating unit, configured to, if it is determined according to the first page descriptor that the first processor is the first processor to access the cache block, add a page tag to the first page descriptor to obtain a second page descriptor; the page tag marks the first processor as the processor that accessed the cache block for the first time.
With reference to the second aspect, in a first possible implementation, the updating unit is further configured to receive a second request from a second processor to perform a read operation on the cache block, obtain the second page descriptor according to the second request, and update the page tag in the second page descriptor to obtain a third page descriptor; the updated page tag marks the first processor as the processor that accessed the cache block for the first time and indicates that the cache block is operated on by multiple processors;
and the device further includes:
a directory cache requesting unit, configured to send a directory allocation message to the directory cache, so that the directory cache allocates a directory entry, the directory entry being used to record the processors that access the cache block.
With reference to the first possible implementation of the second aspect, in a second possible implementation, the device further includes:
a latch unit, configured to, after the third page descriptor is obtained, store the third page descriptor and send the third page descriptor to the first processor and the second processor for storage.
With reference to the first or second possible implementation of the second aspect, in a third possible implementation, the updating unit is configured to update the first-access record FA, the processor identifier and the shareability scope SH contained in the page tag, wherein FA indicates whether any processor has accessed the cache block, the processor identifier indicates the unique identifier of the processor that accessed the cache block for the first time, and the shareability scope SH records whether coherence for the cache block is maintained within one processor or among multiple processors.
With reference to the second aspect or the first to third possible implementations of the second aspect, in a fourth possible implementation, the updating unit is further configured to write the updated page descriptor back into the memory after the page descriptor corresponding to the cache block is updated.
One or more of the above technical solutions have at least the following technical effects:
In the method and device provided by the embodiments of the present invention, after the processor that first accesses a cache block has accessed it, a corresponding page tag is added directly into the page descriptor. The page tag indicates that a processor has already performed a read operation on the cache block, and when other processors subsequently perform read operations on the cache block, the follow-up handling can be carried out on the basis of this page tag. Cache coherence can thus be achieved through the page tag alone, avoiding accesses to the directory cache, reducing the number of directory cache accesses and the directory cache power overhead; nor is a directory entry needed to record how processors operate on the cache block, so multi-core coherence can be maintained with a smaller chip area, reducing the directory cache area overhead.
Brief description of the drawings
Fig. 1 is a schematic diagram of the logical structure of a prior-art system adopting a directory protocol;
Fig. 2 is a schematic diagram of the logical structure of a prior-art sparse directory system;
Fig. 3 is a schematic flowchart of a multiprocessor cache coherence processing method provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a multiprocessor system to which the scheme provided by the embodiments of the present invention is applicable;
Fig. 5 is a schematic structural diagram of a multiprocessor cache coherence processing device provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a multiprocessor system provided by an embodiment of the present invention.
Detailed description of the invention
To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
As shown in Fig. 3, an embodiment of the present invention provides a multiprocessor cache coherence processing method. The method is applied to a multiprocessor system that includes multiple processors, a last-level cache (Last Level Cache, LLC), a memory (main memory) and a directory cache, wherein the last-level cache is connected to the multiple processors, the memory and the directory cache respectively (the connection structure of the units in the multiprocessor system is shown in Fig. 4). The method includes:
Step 301: after receiving a first request from a first processor among the multiple processors to perform a read operation on a cache block, the last-level cache reads a first page descriptor (page descriptor) corresponding to the cache block from the memory;
Step 302: if it is determined according to the first page descriptor that the first processor is the first processor to access the cache block, a page tag is added to the first page descriptor to obtain a second page descriptor; the page tag marks the first processor as the processor that accessed the cache block for the first time.
In the embodiment of the present invention, in order to indicate the processor that accesses the cache block for the first time, the page tag may be implemented with the structure shown in Table 1:
Table 1
The first-access record (FirstAccess, FA) indicates whether any processor has accessed the cache block; a single bit suffices, one value indicating that a processor has accessed it and the other that none has.
The processor identifier (CPU_ID) indicates the unique identifier (ID) of the processor that accessed the cache block for the first time.
The shareability scope (Shareability, SH) records whether coherence for the cache block is maintained within one processor or among multiple processors; for example, 0 may indicate that coherence is maintained within only one processor, and 1 that it is maintained among multiple processors.
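The three fields of the page tag can be sketched as a packed structure. This is a minimal illustration only; the bit widths, the FA encoding (1 = accessed here) and the page_descriptor_t layout are assumptions, since the patent does not fix them.

#include <stdint.h>

typedef struct {
    uint64_t fa     : 1;   /* FirstAccess: assumed encoding, 1 = some processor has accessed the block */
    uint64_t cpu_id : 8;   /* unique ID of the processor that first accessed the block */
    uint64_t sh     : 1;   /* Shareability: 0 = within one processor, 1 = among multiple processors */
    uint64_t rest   : 54;  /* remaining descriptor bits (address translation, attributes) */
} page_descriptor_t;

/* Step 302 in miniature: record the first accessor in the page tag;
 * coherence is so far confined to that single processor. */
static inline void mark_first_access(page_descriptor_t *pd, unsigned cpu)
{
    pd->fa     = 1;
    pd->cpu_id = cpu;
    pd->sh     = 0;
}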
To ensure cache coherence, after the original descriptor of the cache block has been updated, the adjusted page descriptor also needs to be saved to the corresponding locations, so that processors that subsequently access the cache block can determine its access status from the adjusted page descriptor. The method provided by this embodiment therefore further includes:
The last-level cache stores the second page descriptor itself, and also sends the second page descriptor to the first processor and to the memory for storage. Because the original page descriptor was read from the memory, the adjusted page descriptor can be written back to the location of the original page descriptor.
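The first-access path (steps 301 and 302 plus the write-back just described) can be sketched as follows, reusing page_descriptor_t and mark_first_access() from the previous sketch. The descriptor_table, llc_copy and send_descriptor_to_cpu() names are illustrative placeholders assumed for this sketch, not structures defined by the patent.

#define NDESCRIPTORS 1024

static page_descriptor_t descriptor_table[NDESCRIPTORS];  /* stands in for memory */
static page_descriptor_t llc_copy[NDESCRIPTORS];          /* copies held by the LLC */

static void send_descriptor_to_cpu(unsigned cpu, page_descriptor_t pd)
{
    /* Placeholder: in hardware this is a message to the processor. */
    (void)cpu; (void)pd;
}

static void llc_handle_first_read(unsigned cpu, unsigned desc_idx)
{
    /* Step 301: read the page descriptor of the requested cache block. */
    page_descriptor_t pd = descriptor_table[desc_idx];

    if (!pd.fa) {
        /* Step 302: no processor has accessed the block yet, so add the
         * page tag (yielding the second page descriptor). */
        mark_first_access(&pd, cpu);

        /* Keep a copy in the LLC, send it to the requesting processor, and
         * write it back to the original descriptor location in memory. */
        llc_copy[desc_idx]         = pd;
        send_descriptor_to_cpu(cpu, pd);
        descriptor_table[desc_idx] = pd;
    }
}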
In a concrete application environment, the data handled by different processors are divided to a certain extent, so the case in which the same data is accessed by multiple processors is less common than the case in which it is accessed by a single processor. Nevertheless, to guarantee cache coherence among multiple processors, the prior art must allocate a corresponding directory entry in the directory cache to record the CPU_ID of the processors accessing the data, no matter whether the cache block (that is, the data) is operated on by one processor or by several; and every time a directory entry is allocated, it must cover the access status of every processor in the multiprocessor system (even if not all processors have accessed the cache block, the allocated directory entry must record the status of all processors). The prior-art method therefore wastes a large amount of directory cache resources.
In the method provided by the embodiment of the present invention, by contrast, after the processor that first accesses a cache block has accessed it, a corresponding page tag is added directly into the page descriptor. The page tag indicates that a processor has already performed a read operation on the cache block, and when other processors subsequently perform read operations on the cache block, the follow-up handling can be carried out on the basis of this page tag. Cache coherence can thus be achieved through the page tag alone, avoiding accesses to the directory cache, reducing the number of directory cache accesses and the directory cache power overhead; nor is a directory entry needed to record how processors operate on the cache block, so multi-core coherence can be maintained with a smaller chip area, reducing the directory cache area overhead.
Further, after the first processor has accessed the cache block as described above, the access status has been saved in each device. When the second processor accesses the same cache block, the page descriptor corresponding to the cache block is obtained first, and the page tag in the page descriptor shows that the second processor is not the first processor to access the cache block. In this case the method may further include the following steps (a brief code sketch follows this passage):
A. when a second request from the second processor to perform a read operation on the cache block is received, obtaining the second page descriptor according to the second request; updating the page tag in the second page descriptor to obtain a third page descriptor; the updated page tag marks the first processor as the processor that accessed the cache block for the first time and indicates that the cache block is operated on by multiple processors.
In this example, because the first processor is the processor that accessed the cache block for the first time, after receiving the second request the last-level cache obtains the page descriptor corresponding to the cache block and learns from its page tag that the processor that first accessed the cache block is the first processor. Having determined that the second processor and the first processor need to access the same cache block, the last-level cache must update the page tag in the original page descriptor; the updated page tag marks the first processor as the processor that accessed the cache block for the first time and indicates that the cache block is operated on by multiple processors.
Since cache coherence must now be guaranteed across the first processor and the second processor, a corresponding directory entry must further be requested from the directory cache.
B. sending a directory allocation message to the directory cache, so that the directory cache allocates a directory entry, the directory entry being used to record the processors that access the cache block.
Likewise, to ensure cache coherence, after the original descriptor of the cache block has been updated, the adjusted page descriptor needs to be saved to the corresponding locations, so that processors that subsequently access the cache block can determine its access status from the adjusted page descriptor. The method provided by this embodiment therefore further includes:
storing the third page descriptor, and sending the third page descriptor to the first processor and the second processor for storage.
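The second-access path (steps A and B above) continues the previous sketches. directory_allocate_entry() stands in for the directory allocation message and is an assumed helper, not an interface defined by the patent.

static void directory_allocate_entry(unsigned desc_idx,
                                     unsigned first_cpu, unsigned second_cpu)
{
    /* Placeholder for the directory allocation message: the directory cache
     * allocates an entry recording the processors that access the block. */
    (void)desc_idx; (void)first_cpu; (void)second_cpu;
}

static void llc_handle_second_read(unsigned cpu, unsigned desc_idx)
{
    /* Step A: obtain the second page descriptor according to the request. */
    page_descriptor_t pd = descriptor_table[desc_idx];

    if (pd.fa && pd.cpu_id != cpu) {
        /* Another processor accessed the block first: update the page tag to
         * note that the block is now operated on by multiple processors
         * (yielding the third page descriptor). */
        pd.sh = 1;

        /* Step B: request a directory entry covering both processors. */
        directory_allocate_entry(desc_idx, (unsigned)pd.cpu_id, cpu);

        /* Save the third page descriptor and send it to both processors. */
        llc_copy[desc_idx]         = pd;
        descriptor_table[desc_idx] = pd;
        send_descriptor_to_cpu((unsigned)pd.cpu_id, pd);
        send_descriptor_to_cpu(cpu, pd);
    }
}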
As shown in Fig. 5, in accordance with the above method, an embodiment of the present invention also provides a multiprocessor cache coherence processing device. The device is applied in the last-level cache of a multiprocessor system that includes multiple processors, a last-level cache, a memory and a directory cache, wherein the last-level cache is connected to the multiple processors, the memory and the directory cache respectively. The device includes:
a reading unit 501, configured to read a first page descriptor (page descriptor) corresponding to a cache block from the memory after receiving a first request from a first processor among the multiple processors to perform a read operation on the cache block;
an updating unit 502, configured to, if it is determined according to the first page descriptor that the first processor is the first processor to access the cache block, add a page tag to the first page descriptor to obtain a second page descriptor; the page tag marks the first processor as the processor that accessed the cache block for the first time.
The updating unit 502 may update the page tag in different ways according to the parameters of the page tag; one specific implementation may be as follows:
the updating unit 502 is configured to update the first-access record FA, the processor identifier and the shareability scope SH contained in the page tag, wherein FA indicates whether any processor has accessed the cache block, the processor identifier indicates the unique identifier (ID) of the processor that accessed the cache block for the first time, and the shareability scope SH records whether coherence for the cache block is maintained within one processor or among multiple processors.
Further, to ensure that subsequent processors can obtain the page descriptor of the cache block, the updated page descriptor also needs to be stored in the corresponding location after it is updated, which specifically includes:
the updating unit 502 is further configured to write the updated page descriptor back into the memory after the page descriptor corresponding to the cache block is updated.
When multiple processors access the same cache block, the updating unit 502 also needs to maintain cache coherence among the multiple processors, which may specifically be as follows:
the updating unit 502 is further configured to receive a second request from a second processor to perform a read operation on the cache block, obtain the second page descriptor according to the second request, and update the page tag in the second page descriptor to obtain a third page descriptor; the updated page tag marks the first processor as the processor that accessed the cache block for the first time and indicates that the cache block is operated on by multiple processors;
and the device further includes:
a directory cache requesting unit, configured to send a directory allocation message to the directory cache, so that the directory cache allocates a directory entry, the directory entry being used to record the processors that access the cache block.
To ensure cache coherence, after the original descriptor of the cache block has been updated, the adjusted page descriptor also needs to be saved to the corresponding locations, so that processors that subsequently access the cache block can determine its access status from the adjusted page descriptor. The device provided by this embodiment therefore also includes:
a latch unit, configured to, after the third page descriptor is obtained, store the third page descriptor and send the third page descriptor to the first processor and the second processor for storage.
As shown in Fig. 6, in accordance with the above method, an embodiment of the present invention also provides a multiprocessor system. The multiprocessor system includes multiple processors 601, a last-level cache 602, a memory 603 and a directory cache 604, wherein the last-level cache 602 is connected to the multiple processors 601, the memory 603 and the directory cache 604 respectively, wherein:
a first processor among the multiple processors 601 sends to the last-level cache 602 a first request to perform a read operation on a cache block;
In a practical application environment, each processor is provided with a hardware table walker (Hardware Table Walker, HTW) and a translation lookaside buffer (Translation Lookaside Buffer, TLB); when a processor needs to access a cache block stored in the memory, the TLB sends a request to obtain the page descriptor of that cache block.
After receiving the first request sent by the first processor, the last-level cache 602 reads the page descriptor (page descriptor) corresponding to the cache block from the memory 603; if it determines according to the first page descriptor that the first processor is the first processor to access the cache block, it adds a page tag to the first page descriptor to obtain a second page descriptor; the page tag marks the first processor as the processor that accessed the cache block for the first time; the last-level cache stores the second page descriptor and sends it to the first processor and the memory 603 for storage.
Further, when a second processor among the multiple processors 601 sends to the last-level cache 602 a second request to perform a read operation on the same cache block,
the last-level cache 602 obtains, according to the second request, the second page descriptor corresponding to the cache block, and updates the page tag in the second page descriptor to obtain a third page descriptor; the updated page tag marks the first processor as the processor that accessed the cache block for the first time and indicates that the cache block is operated on by multiple processors; the last-level cache 602 then sends a directory allocation message to the directory cache 604;
after receiving the allocation message sent by the last-level cache 602, the directory cache 604 allocates a directory entry according to the allocation message; the directory entry records the processors that access the cache block (namely, that the first processor and the second processor have accessed the cache block, while the other processors among the multiple processors have not).
One or more of the above technical solutions in the embodiments of the present application have at least the following technical effects:
In the solution provided by the present invention, after the processor that first accesses a cache block has accessed it, a corresponding page tag is added directly into the page descriptor. The page tag indicates that a processor has already performed a read operation on the cache block, and when other processors subsequently perform read operations on the cache block, the follow-up handling can be carried out on the basis of this page tag. Cache coherence can thus be achieved through the page tag alone, avoiding accesses to the directory cache, reducing the number of directory cache accesses and the directory cache power overhead; nor is a directory entry needed to record how processors operate on the cache block, so multi-core coherence can be maintained with a smaller chip area, reducing the directory cache area overhead.
The method of the present invention is not limited to the embodiments described in the detailed description; other embodiments derived by those skilled in the art from the technical solution of the present invention also belong to the scope of technical innovation of the present invention.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from the spirit and scope of the present invention. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include these changes and modifications.

Claims (10)

1. A multiprocessor cache coherence processing method, the method being applied to a multiprocessor system, the multiprocessor system including multiple processors, a last-level cache LLC, a memory and a directory cache, wherein the last-level cache is connected to the multiple processors, the memory and the directory cache respectively, characterized in that the method includes:
after receiving a first request from a first processor among the multiple processors to perform a read operation on a cache block, the last-level cache reads a first page descriptor corresponding to the cache block from the memory;
if it is determined according to the first page descriptor that the first processor is the first processor to access the cache block, a page tag is added to the first page descriptor to obtain a second page descriptor; the page tag marks the first processor as the processor that accessed the cache block for the first time.
2. The method of claim 1, characterized in that the method further includes:
when a second request from a second processor to perform a read operation on the cache block is received, obtaining the second page descriptor according to the second request;
updating the page tag in the second page descriptor to obtain a third page descriptor; the updated page tag marks the first processor as the processor that accessed the cache block for the first time and indicates that the cache block is operated on by multiple processors;
sending a directory allocation message to the directory cache, so that the directory cache allocates a directory entry, the directory entry being used to record the processors that access the cache block.
3. The method of claim 2, characterized in that, after the third page descriptor is obtained, the method further includes:
storing the third page descriptor, and sending the third page descriptor to the first processor and the second processor for storage.
4. The method of any one of claims 1 to 3, characterized in that the page tag includes: a first-access record FA for indicating whether any processor has accessed the cache block; a processor identifier for indicating the unique identifier of the processor that accessed the cache block for the first time; and a shareability scope SH for recording whether coherence for the cache block is maintained within one processor or among multiple processors.
5. The method of any one of claims 1 to 4, characterized in that, after the page descriptor corresponding to the cache block is updated, the method further writes the updated page descriptor back into the memory.
6. A multiprocessor cache coherence processing device, the device being applied in the last-level cache of a multiprocessor system, the multiprocessor system including multiple processors, a last-level cache, a memory and a directory cache, wherein the last-level cache is connected to the multiple processors, the memory and the directory cache respectively, characterized in that the device includes:
a reading unit, configured to read a first page descriptor corresponding to a cache block from the memory after receiving a first request from a first processor among the multiple processors to perform a read operation on the cache block;
an updating unit, configured to, if it is determined according to the first page descriptor that the first processor is the first processor to access the cache block, add a page tag to the first page descriptor to obtain a second page descriptor; the page tag marks the first processor as the processor that accessed the cache block for the first time.
7. The device of claim 6, characterized in that the updating unit is further configured to receive a second request from a second processor to perform a read operation on the cache block, obtain the second page descriptor according to the second request, and update the page tag in the second page descriptor to obtain a third page descriptor; the updated page tag marks the first processor as the processor that accessed the cache block for the first time and indicates that the cache block is operated on by multiple processors;
and the device further includes:
a directory cache requesting unit, configured to send a directory allocation message to the directory cache, so that the directory cache allocates a directory entry, the directory entry being used to record the processors that access the cache block.
8. The device of claim 7, characterized in that the device further includes:
a latch unit, configured to, after the third page descriptor is obtained, store the third page descriptor and send the third page descriptor to the first processor and the second processor for storage.
9. The device of claim 7 or 8, characterized in that the updating unit is configured to update the first-access record FA, the processor identifier and the shareability scope SH contained in the page tag, wherein FA indicates whether any processor has accessed the cache block, the processor identifier indicates the unique identifier of the processor that accessed the cache block for the first time, and the shareability scope SH records whether coherence for the cache block is maintained within one processor or among multiple processors.
10. The device of any one of claims 6 to 9, characterized in that the updating unit is further configured to write the updated page descriptor back into the memory after the page descriptor corresponding to the cache block is updated.
CN201410704522.XA 2014-11-26 2014-11-26 Multiprocessor cache coherence processing method and device Active CN105700953B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410704522.XA CN105700953B (en) 2014-11-26 2014-11-26 Multiprocessor cache coherence processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410704522.XA CN105700953B (en) 2014-11-26 2014-11-26 Multiprocessor cache coherence processing method and device

Publications (2)

Publication Number Publication Date
CN105700953A true CN105700953A (en) 2016-06-22
CN105700953B CN105700953B (en) 2019-03-26

Family

ID=56295866

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410704522.XA Active CN105700953B (en) 2014-11-26 2014-11-26 Multiprocessor cache coherence processing method and device

Country Status (1)

Country Link
CN (1) CN105700953B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116962259A (en) * 2023-09-21 2023-10-27 中电科申泰信息科技有限公司 Consistency processing method and system based on monitoring-directory two-layer protocol

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070168619A1 (en) * 2006-01-18 2007-07-19 International Business Machines Corporation Separate data/coherency caches in a shared memory multiprocessor system
CN101510191A (en) * 2009-03-26 2009-08-19 浙江大学 Multi-core system structure with buffer window and implementing method thereof
CN101859281A (en) * 2009-04-13 2010-10-13 廖鑫 Method for embedded multi-core buffer consistency based on centralized directory

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070168619A1 (en) * 2006-01-18 2007-07-19 International Business Machines Corporation Separate data/coherency caches in a shared memory multiprocessor system
CN101004711A (en) * 2006-01-18 2007-07-25 国际商业机器公司 Multiple processor system and method for providing its with high speed caches coherency
CN101510191A (en) * 2009-03-26 2009-08-19 浙江大学 Multi-core system structure with buffer window and implementing method thereof
CN101859281A (en) * 2009-04-13 2010-10-13 廖鑫 Method for embedded multi-core buffer consistency based on centralized directory

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116962259A (en) * 2023-09-21 2023-10-27 中电科申泰信息科技有限公司 Consistency processing method and system based on monitoring-directory two-layer protocol
CN116962259B (en) * 2023-09-21 2024-02-13 中电科申泰信息科技有限公司 Consistency processing method and system based on monitoring-directory two-layer protocol

Also Published As

Publication number Publication date
CN105700953B (en) 2019-03-26

Similar Documents

Publication Publication Date Title
KR101385430B1 (en) Cache coherence protocol for persistent memories
US10552339B2 (en) Dynamically adapting mechanism for translation lookaside buffer shootdowns
US8209499B2 (en) Method of read-set and write-set management by distinguishing between shared and non-shared memory regions
US8285969B2 (en) Reducing broadcasts in multiprocessors
US9384134B2 (en) Persistent memory for processor main memory
US9229878B2 (en) Memory page offloading in multi-node computer systems
US9405703B2 (en) Translation lookaside buffer
US20150186275A1 (en) Inclusive/Non Inclusive Tracking of Local Cache Lines To Avoid Near Memory Reads On Cache Line Memory Writes Into A Two Level System Memory
US20120102273A1 (en) Memory agent to access memory blade as part of the cache coherency domain
US20120159080A1 (en) Neighbor cache directory
US9424198B2 (en) Method, system and apparatus including logic to manage multiple memories as a unified exclusive memory
US9128856B2 (en) Selective cache fills in response to write misses
US9037804B2 (en) Efficient support of sparse data structure access
CN107003932B (en) Cache directory processing method and directory controller of multi-core processor system
CN114238171B (en) Electronic equipment, data processing method and device and computer system
CN105700953A (en) Multiprocessor cache coherence processing method and device
US9842050B2 (en) Add-on memory coherence directory
US20080104323A1 (en) Method for identifying, tracking, and storing hot cache lines in an smp environment
KR20110092014A (en) Method for managing coherence, coherence management unit, cache device and semiconductor device including the same
US11741017B2 (en) Power aware translation lookaside buffer invalidation optimization
WO2013101065A1 (en) Domain state
KR101446924B1 (en) Method for managing coherence, coherence management unit, cache device and semiconductor device including the same
Mittal A New Approach to Directory Based Solution for Cache Coherence Problem
Karakostas Improving the performance and energy-efficiency of virtual memory
Esteve García Analysis of opportunities for cache coherence in heterogeneous embedded systems

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200420

Address after: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Patentee after: HUAWEI TECHNOLOGIES Co.,Ltd.

Address before: 301, A building, room 3, building 301, foreshore Road, No. 310052, Binjiang District, Zhejiang, Hangzhou

Patentee before: Huawei Technologies Co.,Ltd.

TR01 Transfer of patent right