CN105700953B - Multiprocessor cache coherence processing method and device - Google Patents

Multiprocessor cache coherence processing method and device

Info

Publication number
CN105700953B
CN105700953B (application CN201410704522.XA)
Authority
CN
China
Prior art keywords
processor, page, cache block, page descriptor, cache
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410704522.XA
Other languages
Chinese (zh)
Other versions
CN105700953A (en)
Inventor
张广飞
崔晓松
黄勤业
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Hangzhou Huawei Digital Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Hangzhou Huawei Digital Technologies Co Ltd
Priority to CN201410704522.XA
Publication of CN105700953A
Application granted
Publication of CN105700953B
Status: Active
Anticipated expiration

Abstract

The present invention discloses a multiprocessor cache coherence processing method and device. The method is applied in a multiprocessor system and comprises: after a last-level cache (LLC) receives a first request from a first processor in the multiprocessor system to perform a read operation on a cache block, reading a first page descriptor corresponding to the cache block from memory; determining, according to the first page descriptor, that the first processor is the first processor to access the cache block; and then adding a page tag to the first page descriptor to obtain a second page descriptor, where the page tag marks the first processor as the processor that accessed the cache block for the first time. The method provided by the present invention addresses the problem that existing methods for maintaining coherence in a multiprocessor system waste a large amount of on-chip storage area.

Description

Multiprocessor cache coherence processing method and device
Technical field
The present invention relates to the field of electronic technology, and in particular to a multiprocessor cache coherence processing method and device.
Background art
Advances in semiconductor technology have allowed the number of transistors per unit area of an integrated circuit to keep growing at a steady pace even after 2013. To make full use of this progress and of the additional transistors, processor manufacturers have turned from single-core processor designs to multi-core/many-core designs, improving overall system performance by exploiting thread-level parallelism (Thread Level Parallelism, TLP). At the same time, manufacturers also integrate multiple processor chips into one system to obtain higher performance; a system of this design is referred to as a multiprocessor system. For example, a current high-end processor chip can contain 12 processors, and 4 such chips can be integrated together, ultimately forming a 48-core server-class computing system. Multi-core, many-core and multiprocessor structures are collectively referred to as multi-processing architectures.
Although multi-processing architectures further improve computer system performance, they also face problems never encountered in single-core processor design. One of them is the design of cache coherence: a current computer system often consists of multiple processors, and a single processor chip in turn contains multiple processor cores. Making these processors cooperate correctly becomes a primary problem. The problem can be subdivided into two sub-problems, namely memory consistency and cache coherence.
Memory consistency, also known as the memory consistency model, defines the allowed behavior and ordering of read and write operations on shared storage. The strictest and most direct of all memory consistency models is called sequential consistency (Sequential Consistency, SC). The currently dominant general-purpose multi-core/multiprocessor architectures (such as x86, x86-64 and SPARC) use an approximation of sequential consistency called Total Store Order (TSO).
The introduction of the sequential consistency model and the TSO model greatly simplifies the work of programmers, and most current general-purpose processors implement the TSO model. However, both models impose new requirements on cache design, namely the requirement of cache coherence. Cache coherence requires that the multiple caches in a computer system satisfy the "single writer, multiple readers" requirement: at any time, for any data, while a write operation is being performed only the processor performing the write may cache the data, whereas for read operations multiple processors may cache the data simultaneously. In a general-purpose computing system, a correct and efficient cache coherence design is a very important undertaking.
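The "single writer, multiple readers" requirement above can be stated as a small check. This is an illustrative sketch, not part of the patent; the per-cache permission encoding ('R' / 'W' / '-') is an assumption:

```python
# Hypothetical check of the "single writer OR multiple readers" (SWMR)
# invariant that cache coherence requires, given one access permission
# per processor cache for some block.

def swmr_holds(permissions):
    """permissions: one entry per processor cache, each 'R', 'W', or '-'."""
    writers = permissions.count('W')
    readers = permissions.count('R')
    # Either exactly one writer and no readers, or no writer at all.
    return (writers == 1 and readers == 0) or writers == 0

print(swmr_holds(['W', '-', '-']))   # one writer only -> True
print(swmr_holds(['R', 'R', '-']))   # multiple readers -> True
print(swmr_holds(['W', 'R', '-']))   # writer plus reader -> False
```

A coherence protocol's job is to keep every block's permission vector inside this invariant at all times.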
As the number of processors integrated in a system increases, how to maintain cache coherence among the processors more efficiently has become a topic of common concern in academia and industry. There are two basic protocol families for maintaining cache coherence: snooping protocols (snoop) and directory protocols (directory). Since directory protocols scale better than snooping protocols, they are often employed in large-scale and ultra-large-scale systems.
Fig. 1 shows the logical structure of a system using a directory protocol. The system has directory hardware connected through a network to the caches of all processors, and all updates to coherence state must pass through the directory hardware. When processor A needs to perform a read operation on cache block M, processor A first sends a read request to the directory hardware; the directory hardware looks up its state and finds that the cache of processor B holds the data, so it sends a message to the cache of processor B; after the cache of processor B receives the message, it sends the data to the cache of processor A.
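The directory-protocol read just described can be modeled in a few lines. Everything here (class names, the forwarding shortcut, the MEMORY map) is an assumed illustration of the message flow, not the patent's hardware design:

```python
# Toy model of a directory-protocol read miss: processor A's read goes
# to the directory, which finds that processor B holds block M and has
# B's cache forward the data to A.

class Directory:
    def __init__(self):
        self.sharers = {}  # block -> set of processor ids caching it

    def read(self, block, requester, caches):
        holders = self.sharers.setdefault(block, set())
        if holders:
            owner = next(iter(holders))   # some cache already has the data
            data = caches[owner][block]   # directory asks it to forward
        else:
            data = MEMORY[block]          # otherwise fetch from memory
        caches[requester][block] = data   # requester now caches the block
        holders.add(requester)
        return data

MEMORY = {'M': 42}
caches = {'A': {}, 'B': {'M': 42}}
directory = Directory()
directory.sharers['M'] = {'B'}            # B already caches block M

print(directory.read('M', 'A', caches))   # prints 42: A got it via B
print(sorted(directory.sharers['M']))     # ['A', 'B']
```

Note that every coherence action funnels through the `Directory` object, which is exactly the serialization point the real directory hardware provides.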
A sparse directory scheme is provided in the prior art to solve cache coherence problems in cache-coherent non-uniform memory access (Cache Coherent Non-Uniform Memory Access, ccNUMA) systems. As shown in Fig. 2, a sparse directory stores directory information on-chip in a set-associative form; because its organization closely resembles the structure of a cache, it is also known as a directory cache.
Again, suppose processor A needs to perform a read operation on cache block M: processor A first sends a read request to the sparse directory; the directory lookup finds that the cache of processor B holds the data, so a message is sent to the cache of processor B; after the cache of processor B receives the message, it sends the data to the cache of processor A.
A sparse directory can use a very low associativity, so its access energy consumption is low. However, because the sparse directory is set-associative, accesses are unevenly distributed across its directory sets, and sets under heavy access pressure give rise to directory conflicts. A directory conflict requires replacing some directory block out of the sparse directory, and to maintain system coherence, the cached data corresponding to the replaced directory block must be invalidated. In addition, the system needs to set aside dedicated storage space to hold cache coherence information, so with this method the directory cache occupies a large amount of on-chip storage area.
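Why a set-associative sparse directory suffers conflicts can be shown with a toy model; the parameters (2 ways, 4 sets) and the FIFO replacement choice are arbitrary assumptions for illustration:

```python
# Minimal sketch of directory conflicts in a set-associative sparse
# directory: each set holds only WAYS entries, so tracking one more
# block in a full set evicts an existing entry, and the evicted block's
# cached copies must be invalidated.

WAYS = 2          # associativity of each directory set (illustrative)
SETS = 4          # number of directory sets (illustrative)

directory = [[] for _ in range(SETS)]   # each set: list of tracked blocks
invalidated = []

def track(block):
    s = directory[block % SETS]         # set index from block address
    if block in s:
        return
    if len(s) == WAYS:                  # directory conflict: set is full
        victim = s.pop(0)               # replace one directory entry (FIFO)
        invalidated.append(victim)      # its cached copies get invalidated
    s.append(block)

for b in (0, 4, 8):                     # all map to set 0 -> conflict
    track(b)

print(invalidated)                      # [0]: block 0's entry was evicted
```

Even though the directory as a whole has 8 entries, three blocks that map to the same set already force an invalidation, which is the imbalance the passage describes.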
Summary of the invention
The present invention provides a multiprocessor cache coherence processing method and device, which address the problem that existing methods for maintaining coherence in a multiprocessor system waste a large amount of on-chip storage area.
In one aspect, the present invention provides a multiprocessor cache coherence processing method. The method is applied in a multiprocessor system that includes multiple processors, a last-level cache (LLC), a memory and a directory cache, where the last-level cache is connected to the multiple processors, the memory and the directory cache. The method comprises:
after the last-level cache receives a first request from a first processor among the multiple processors to perform a read operation on a cache block, reading a first page descriptor corresponding to the cache block from the memory;
determining, according to the first page descriptor, that the first processor is the first processor to access the cache block, and then adding a page tag to the first page descriptor to obtain a second page descriptor, the page tag marking the first processor as the processor that accessed the cache block for the first time.
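The first-access path above can be sketched as follows. The dict-based descriptor and the field names `fa`, `cpu_id` and `sh` (mirroring the patent's FA, CPU_ID and SH page-tag fields) are assumptions for illustration, not the hardware format:

```python
# Hedged sketch of the first-access path: on a read request, the LLC
# fetches the block's page descriptor from memory and, if no processor
# has touched the block yet, records the requester in a page tag
# instead of allocating a directory entry.

def llc_read(descriptor, requester):
    """descriptor: dict page descriptor; returns the updated descriptor."""
    if not descriptor.get('fa'):            # FA clear: first access ever
        descriptor['fa'] = 1                # mark that an access happened
        descriptor['cpu_id'] = requester    # remember the first accessor
        descriptor['sh'] = 0                # coherence scope: one processor
    return descriptor

first = {'fa': 0}                           # "first page descriptor"
second = llc_read(first, requester=3)       # "second page descriptor"
print(second)                               # {'fa': 1, 'cpu_id': 3, 'sh': 0}
```

The key point of the design is visible here: the common single-accessor case touches only the page descriptor, never the directory cache.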
With reference to the first aspect, in a first possible implementation, the method further comprises:
when a second request from a second processor to perform a read operation on the cache block is received, obtaining the second page descriptor according to the second request;
updating the page tag in the second page descriptor to obtain a third page descriptor, the updated page tag marking the first processor as the processor that accessed the cache block for the first time and indicating that the cache block is operated on by multiple processors;
sending a directory allocation message to the directory cache so that the directory cache allocates a directory entry, the directory entry being used to record the processors that access the cache block.
With reference to the first possible implementation of the first aspect, in a second possible implementation, after the third page descriptor is obtained, the method further comprises:
storing the third page descriptor, and sending the third page descriptor to the first processor and the second processor for storage.
With reference to the first aspect or the first to second possible implementations of the first aspect, in a third possible implementation, the page tag includes: an access record (FA), used to indicate whether any processor has accessed the cache block; a processor flag, used to indicate the unique identifier of the processor that accessed the cache block for the first time; and a coherence scope (SH), used to record whether coherence for the cache block is maintained within one processor or among multiple processors.
With reference to the first aspect or the first to third possible implementations of the first aspect, in a fourth possible implementation, after the page descriptor corresponding to the cache block is updated, the method further writes the updated page descriptor back into the memory.
In a second aspect, the present invention also provides a multiprocessor cache coherence processing device. The device is applied in the last-level cache of a multiprocessor system that includes multiple processors, the last-level cache, a memory and a directory cache, where the last-level cache is connected to the multiple processors, the memory and the directory cache. The device includes:
a reading unit, configured to read a first page descriptor corresponding to a cache block from the memory after receiving a first request from a first processor among the multiple processors to perform a read operation on the cache block;
an updating unit, configured to determine, according to the first page descriptor, that the first processor is the first processor to access the cache block, and then add a page tag to the first page descriptor to obtain a second page descriptor, the page tag marking the first processor as the processor that accessed the cache block for the first time.
With reference to the second aspect, in a first possible implementation, the updating unit is further configured to: when a second request from a second processor to perform a read operation on the cache block is received, obtain the second page descriptor according to the second request; and update the page tag in the second page descriptor to obtain a third page descriptor, the updated page tag marking the first processor as the processor that accessed the cache block for the first time and indicating that the cache block is operated on by multiple processors.
In this case, the device further includes:
a directory cache request unit, configured to send a directory allocation message to the directory cache so that the directory cache allocates a directory entry, the directory entry being used to record the processors that access the cache block.
With reference to the first possible implementation of the second aspect, in a second possible implementation, the device further includes:
a synchronization unit, configured to store the third page descriptor after it is obtained, and to send the third page descriptor to the first processor and the second processor for storage.
With reference to the first or second possible implementation of the second aspect, in a third possible implementation, the updating unit is configured to update the access record (FA), processor flag and coherence scope (SH) included in the page tag, where the access record is used to indicate whether any processor has accessed the cache block, the processor flag is used to indicate the unique identifier of the processor that accessed the cache block for the first time, and the coherence scope is used to record whether coherence for the cache block is maintained within one processor or among multiple processors.
With reference to the second aspect or the first to third possible implementations of the second aspect, in a fourth possible implementation, the updating unit is further configured to write the updated page descriptor back into the memory after the page descriptor corresponding to the cache block has been updated.
One or more of the above technical solutions have at least the following technical effects:
With the method and device provided by the embodiments of the present invention, after the processor that accesses a cache block for the first time has accessed the block, a corresponding page tag is added directly in the page descriptor. The page tag indicates that a processor has already performed a read operation on the cache block, so that when other processors later perform read operations on the same cache block, subsequent operations can proceed via the page tag. Cache coherence can thus be achieved through the page tag, avoiding accesses to the directory cache, which reduces the number of directory cache accesses and the directory cache's power overhead. Nor is an allocated directory entry used to record every processor's operations on the cache block, so multi-core coherence can be maintained with a smaller chip area, reducing the directory cache's area overhead.
Brief description of the drawings
Fig. 1 is a logical structure schematic diagram of a prior-art system using a directory protocol;
Fig. 2 is a logical structure schematic diagram of a prior-art sparse directory system;
Fig. 3 is a schematic flowchart of a multiprocessor cache coherence processing method provided by an embodiment of the present invention;
Fig. 4 is a structural schematic diagram of a multiprocessor system to which the solutions of embodiments of the present invention are applicable;
Fig. 5 is a structural schematic diagram of a multiprocessor cache coherence processing device provided by an embodiment of the present invention;
Fig. 6 is a structural schematic diagram of a multiprocessor system provided by an embodiment of the present invention.
Detailed description of embodiments
To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The embodiments of the present invention are described in further detail below with reference to the accompanying drawings of the specification.
As shown in Fig. 3, an embodiment of the present invention provides a multiprocessor cache coherence processing method. The method is applied in a multiprocessor system that includes multiple processors, a last-level cache (Last Level Cache, LLC), a memory (Main Memory) and a directory cache (Directory Cache), where the last-level cache is connected to the multiple processors, the memory and the directory cache (the connection structure of the units in the multiprocessor system is shown in Fig. 4). The method includes:
Step 301: after the last-level cache receives a first request from a first processor among the multiple processors to perform a read operation on a cache block, it reads a first page descriptor (Page Descriptor) corresponding to the cache block from the memory.
Step 302: the last-level cache determines, according to the first page descriptor, that the first processor is the first processor to access the cache block, and then adds a page tag to the first page descriptor to obtain a second page descriptor; the page tag marks the first processor as the processor that accessed the cache block for the first time.
In the embodiments of the present invention, in order to indicate the processor that accessed a cache block for the first time, a specific implementation of the page tag can be the structure shown in Table 1:
Table 1
FA | CPU_ID | SH
Here, the access record (First Access, FA) is used to indicate whether any processor has accessed the cache block; for example, 1 may indicate that the block has been accessed and 0 that it has not.
The processor flag (CPU_ID) is used to indicate the unique identifier (ID) of the processor that accessed the cache block for the first time.
The coherence scope (Shareability, SH) is used to record whether coherence for the cache block is maintained within one processor or among multiple processors; for example, 0 may indicate that coherence is maintained within one processor and 1 that it is maintained among multiple processors.
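The three page-tag fields of Table 1 could be packed into a small tag, for instance as follows; the bit widths chosen here (1-bit FA, 6-bit CPU_ID, 1-bit SH) are arbitrary assumptions, since the specification does not fix them:

```python
# Illustrative bit packing of the FA / CPU_ID / SH page-tag fields.
# Widths are assumed: bit 0 = FA, bits 1-6 = CPU_ID, bit 7 = SH.

def pack_tag(fa, cpu_id, sh):
    assert 0 <= cpu_id < 64                 # 6-bit processor ID (assumed)
    return (fa & 1) | ((cpu_id & 0x3F) << 1) | ((sh & 1) << 7)

def unpack_tag(tag):
    return {'fa': tag & 1,
            'cpu_id': (tag >> 1) & 0x3F,
            'sh': (tag >> 7) & 1}

tag = pack_tag(fa=1, cpu_id=5, sh=0)
print(tag)                                  # 11 (0b00001011)
print(unpack_tag(tag))                      # {'fa': 1, 'cpu_id': 5, 'sh': 0}
```

With such an encoding the whole tag fits in one byte, which is the kind of compactness the patent contrasts with full per-processor directory entries.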
To guarantee cache coherence, after the original descriptor of a cache block has been updated and adjusted, the page descriptor obtained after the adjustment also needs to be saved in the corresponding locations, so that processors that subsequently access the cache block can determine the block's access history from the adjusted page descriptor. The method provided by this embodiment therefore further includes:
The last-level cache stores a copy of the second page descriptor itself, and also sends the second page descriptor to the first processor and to the memory for storage. Because the original page descriptor was read from the memory, the adjusted page descriptor can be written back to the location of the original page descriptor.
In practical application environments, the data handled by different processors is partitioned to some degree, so the case where the same data is accessed by multiple processors is rarer than the case where it is accessed by a single processor. However, to guarantee cache coherence among multiple processors, the prior art allocates a corresponding directory entry in the directory cache (Directory Cache), recording the CPU_ID of the processors accessing the data, regardless of whether one processor or several operate on a cache block (that is, on a piece of data). Moreover, every time a directory entry is allocated, it must be able to record the access status of every processor in the multiprocessor system (that is, not all processors have accessed the cache block, but the allocated directory entry needs room to record all of them). The prior-art method therefore wastes a large amount of directory cache resources.
By contrast, in the method provided by the embodiments of the present invention, after the processor that accesses a cache block for the first time has accessed the block, a corresponding page tag is added directly in the page descriptor. The page tag indicates that a processor has already performed a read operation on the cache block, so that when other processors later perform read operations on the same cache block, subsequent operations can proceed via the page tag. Cache coherence can thus be achieved through the page tag, avoiding accesses to the directory cache, which reduces the number of directory cache accesses and the directory cache's power overhead. Nor is an allocated directory entry used to record every processor's operations on the cache block, so multi-core coherence can be maintained with a smaller chip area, reducing the directory cache's area overhead.
Further, after the first processor has accessed the cache block in the above method and the access status has been saved in the relevant devices, when a second processor accesses the same cache block it first obtains the page descriptor corresponding to the cache block, and can then determine from the page tag in the page descriptor that it is not the first processor to access the cache block. The method may therefore further comprise:
A. When a second request from the second processor to perform a read operation on the cache block is received, obtaining the second page descriptor according to the second request, and updating the page tag in the second page descriptor to obtain a third page descriptor; the updated page tag marks the first processor as the processor that accessed the cache block for the first time and indicates that the cache block is operated on by multiple processors.
In this example, because the first processor is the processor that accessed the cache block for the first time, after receiving the second request the last-level cache obtains the page descriptor corresponding to the cache block and learns from the page tag in the retrieved page descriptor that the processor that first accessed the cache block is the first processor. Having determined that the second processor and the first processor both need to access the same cache block, it updates the page tag in the original page descriptor; the updated page tag marks the first processor as the processor that accessed the cache block for the first time and indicates that the cache block is operated on by multiple processors.
Since cache coherence must now be guaranteed across the first processor and the second processor, a corresponding directory entry further needs to be requested from the directory cache.
B. Sending a directory allocation message to the directory cache so that the directory cache allocates a directory entry, the directory entry being used to record the processors that access the cache block.
Likewise, to guarantee cache coherence, after the original descriptor of the cache block has been updated and adjusted, the page descriptor obtained after the adjustment needs to be saved in the corresponding locations, so that processors that subsequently access the cache block can determine the block's access history from the adjusted page descriptor. The method provided by this embodiment therefore further includes:
storing the third page descriptor, and sending the third page descriptor to the first processor and the second processor for storage.
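Steps A and B above can be sketched as follows; the function name, the dict descriptor and the list standing in for the directory cache are illustrative assumptions:

```python
# Sketch of the second-access path: when a different processor reads
# the same block, the LLC flips SH to "shared among multiple
# processors" and only now asks the directory cache for an entry.

allocated_entries = []                       # stands in for the directory cache

def llc_second_read(descriptor, requester, block):
    if descriptor['fa'] and descriptor['cpu_id'] != requester:
        descriptor['sh'] = 1                 # now shared by multiple processors
        # directory allocation message: entry records the block's accessors
        allocated_entries.append((block, {descriptor['cpu_id'], requester}))
    return descriptor

second = {'fa': 1, 'cpu_id': 3, 'sh': 0}     # state after the first access
third = llc_second_read(second, requester=7, block='M')
print(third['sh'])                           # 1
print(allocated_entries)                     # [('M', {3, 7})]
```

A re-read by the first accessor itself takes neither branch, so the directory cache is still only consulted once genuine sharing begins.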
As shown in Fig. 5, following the above method, an embodiment of the present invention also provides a multiprocessor cache coherence processing device. The device is applied in the last-level cache of a multiprocessor system that includes multiple processors, the last-level cache, a memory and a directory cache, where the last-level cache is connected to the multiple processors, the memory and the directory cache. The device includes:
a reading unit 501, configured to read a first page descriptor (Page Descriptor) corresponding to a cache block from the memory after receiving a first request from a first processor among the multiple processors to perform a read operation on the cache block;
an updating unit 502, configured to determine, according to the first page descriptor, that the first processor is the first processor to access the cache block, and then add a page tag to the first page descriptor to obtain a second page descriptor, the page tag marking the first processor as the processor that accessed the cache block for the first time.
The updating unit 502 can update the page tag in different ways depending on the parameters of the page tag; one specific implementation may be:
the updating unit 502 is configured to update the access record (FA), processor flag and coherence scope (SH) included in the page tag, where the access record is used to indicate whether any processor has accessed the cache block, the processor flag is used to indicate the unique identifier (ID) of the processor that accessed the cache block for the first time, and the coherence scope (SH) is used to record whether coherence for the cache block is maintained within one processor or among multiple processors.
Further, to ensure that subsequent processors can obtain the page descriptor of the cache block, after the page descriptor has been updated, the updated page descriptor also needs to be stored in the corresponding location. Specifically:
the updating unit 502 is further configured to write the updated page descriptor back into the memory after the page descriptor corresponding to the cache block has been updated.
When multiple processors access the same cache block, the device also needs to further enforce cache coherence among the multiple processors, which may specifically be:
the updating unit 502 is further configured to: receive a second request from a second processor to perform a read operation on the cache block; obtain the second page descriptor according to the second request; and update the page tag in the second page descriptor to obtain a third page descriptor, the updated page tag marking the first processor as the processor that accessed the cache block for the first time and indicating that the cache block is operated on by multiple processors.
In this case, the device further includes:
a directory cache request unit, configured to send a directory allocation message to the directory cache so that the directory cache allocates a directory entry, the directory entry being used to record the processors that access the cache block.
To guarantee cache coherence, after the original descriptor of the cache block has been updated and adjusted, the page descriptor obtained after the adjustment also needs to be saved in the corresponding locations, so that processors that subsequently access the cache block can determine the block's access history from the adjusted page descriptor. The device provided by this embodiment therefore further includes:
a synchronization unit, configured to store the third page descriptor after it is obtained, and to send the third page descriptor to the first processor and the second processor for storage.
As shown in Fig. 6, following the above method, an embodiment of the present invention also provides a multiprocessor system. The multiprocessor system includes multiple processors 601, a last-level cache 602, a memory 603 and a directory cache 604, where the last-level cache 602 is connected to the multiple processors 601, the memory 603 and the directory cache 604, and in which:
a first processor among the multiple processors 601 sends, to the last-level cache 602, a first request to perform a read operation on a cache block.
In an actual application environment, each processor is provided with a hardware page table walker (Hardware Table Walker, HTW) and a translation lookaside buffer (Translation Lookaside Buffer, TLB); when a processor needs to access a cache block stored in the memory, the TLB sends the page descriptor request for that cache block.
After the last-level cache 602 receives the first request sent by the first processor, it reads the page descriptor (Page Descriptor) corresponding to the cache block from the memory 603; determines, according to this first page descriptor, that the first processor is the first processor to access the cache block; and then adds a page tag to the first page descriptor to obtain a second page descriptor. The page tag marks the first processor as the processor that accessed the cache block for the first time. The last-level cache stores the second page descriptor and sends it to the first processor and to the memory 603 for storage.
Further, a second processor among the multiple processors 601 sends, to the last-level cache 602, a second request to perform a read operation on the same cache block;
the last-level cache 602 then obtains the second page descriptor corresponding to the cache block according to the second request, and updates the page tag in the second page descriptor to obtain a third page descriptor. The updated page tag marks the first processor as the processor that accessed the cache block for the first time and indicates that the cache block is operated on by multiple processors. The last-level cache then sends a directory allocation message to the directory cache 604;
after receiving the allocation message sent by the last-level cache 602, the directory cache 604 allocates a directory entry according to the allocation message, the directory entry being used to record the processors that access the cache block (i.e., the first processor and the second processor have accessed the cache block, while the other processors among the multiple processors have not).
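The end-to-end flow just described for Fig. 6 can be modeled in a few lines; all names and data structures are illustrative assumptions rather than the hardware design:

```python
# End-to-end sketch: two processors read the same block. The first
# access is served with only a page-tag update written back to memory;
# the directory cache is touched only once sharing begins.

memory_descriptors = {'M': {'fa': 0, 'cpu_id': None, 'sh': 0}}
directory_cache = {}

def llc_handle_read(block, requester):
    d = memory_descriptors[block]                # descriptor read from memory
    if not d['fa']:                              # first access: page tag only
        d.update(fa=1, cpu_id=requester, sh=0)
    elif d['sh'] == 0 and d['cpu_id'] != requester:
        d['sh'] = 1                              # block becomes shared
        directory_cache[block] = {d['cpu_id'], requester}
    elif d['sh'] == 1:
        directory_cache[block].add(requester)    # later sharers join the entry
    memory_descriptors[block] = d                # descriptor written back

llc_handle_read('M', 1)                          # first processor's read
print(directory_cache)                           # {}: no directory entry yet
llc_handle_read('M', 2)                          # second processor's read
print(directory_cache)
```

After the second read the directory entry records exactly the two accessors, matching the parenthetical note above: the other processors are not listed because they never touched the block.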
The one or more technical solutions in the embodiments of the present application have at least the following technical effects:
In the solution provided by the present invention, after the processor that accesses a cache block for the first time has accessed the block, a corresponding page tag is added directly in the page descriptor. The page tag indicates that a processor has already performed a read operation on the cache block, so that when other processors later perform read operations on the same cache block, subsequent operations can proceed via the page tag. Cache coherence can thus be achieved through the page tag, avoiding accesses to the directory cache, which reduces the number of directory cache accesses and the directory cache's power overhead. Nor is an allocated directory entry used to record every processor's operations on the cache block, so multi-core coherence can be maintained with a smaller chip area, reducing the directory cache's area overhead.
The method of the present invention is not limited to the embodiments described in the detailed description; other embodiments obtained by those skilled in the art according to the technical solution of the present invention also belong to the scope of the technical innovation of the present invention.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from the spirit and scope of the present invention. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include these modifications and variations.

Claims (12)

1. A multiprocessor cache coherence processing method, applied in a multiprocessor system, the multiprocessor system comprising multiple processors, a last-level cache (LLC), a memory and a directory cache, wherein the last-level cache is connected to the multiple processors, the memory and the directory cache respectively, characterized in that the method comprises:
after the last-level cache receives a first request from a first processor among the multiple processors to perform a read operation on a cache block, reading a first page descriptor corresponding to the cache block from the memory;
if it is determined according to the first page descriptor that the first processor is the processor that first accesses the cache block, adding a page label to the first page descriptor to obtain a second page descriptor, the page label marking the first processor as the processor that first accesses the cache block.
2. The method according to claim 1, characterized in that the method further comprises:
upon receiving a second request from a second processor to perform a read operation on the cache block, reading the second page descriptor according to the second request;
updating the page label in the second page descriptor to obtain a third page descriptor, the updated page label marking the first processor as the processor that first accessed the cache block and the cache block as operated on by multiple processors;
sending a directory allocation message to the directory cache, so that the directory cache allocates a directory entry, the directory entry being used to record the processors that access the cache block.
3. The method according to claim 2, characterized in that, after obtaining the third page descriptor, the method further comprises:
storing the third page descriptor, and sending the third page descriptor to the first processor and the second processor for storage.
4. The method according to any one of claims 1 to 3, characterized in that the page label comprises: an access record FA, used to indicate whether a processor has accessed the cache block; a processor identifier, used to indicate the unique identifier of the processor that first accesses the cache block; and a coherence range SH, used to record whether coherence of the cache block is to be maintained among one or multiple processors.
5. The method according to any one of claims 1 to 3, characterized in that, after the page descriptor corresponding to the cache block is updated, the method further comprises writing the updated page descriptor back into the memory.
6. The method according to claim 4, characterized in that, after the page descriptor corresponding to the cache block is updated, the method further comprises writing the updated page descriptor back into the memory.
7. A multiprocessor cache coherence processing apparatus, applied in a last-level cache of a multiprocessor system, the multiprocessor system comprising multiple processors, the last-level cache, a memory and a directory cache, wherein the last-level cache is connected to the multiple processors, the memory and the directory cache respectively, characterized in that the apparatus comprises:
a reading unit, configured to read a first page descriptor corresponding to a cache block from the memory after receiving a first request from a first processor among the multiple processors to perform a read operation on the cache block;
an updating unit, configured to, if it is determined according to the first page descriptor that the first processor is the processor that first accesses the cache block, add a page label to the first page descriptor to obtain a second page descriptor, the page label marking the first processor as the processor that first accesses the cache block.
8. The apparatus according to claim 7, characterized in that the updating unit is further configured to: upon receiving a second request from a second processor to perform a read operation on the cache block, read the second page descriptor according to the second request; and update the page label in the second page descriptor to obtain a third page descriptor, the updated page label marking the first processor as the processor that first accessed the cache block and the cache block as operated on by multiple processors;
the apparatus further comprises:
a directory cache requesting unit, configured to send a directory allocation message to the directory cache, so that the directory cache allocates a directory entry, the directory entry being used to record the processors that access the cache block.
9. The apparatus according to claim 8, characterized in that the apparatus further comprises:
a synchronization unit, configured to, after the third page descriptor is obtained, store the third page descriptor and send the third page descriptor to the first processor and the second processor for storage.
10. The apparatus according to claim 8 or 9, characterized in that the updating unit is configured to update the access record FA, the processor identifier and the coherence range SH comprised in the page label, wherein the access record FA is used to indicate whether a processor has accessed the cache block; the processor identifier is used to indicate the unique identifier of the processor that first accesses the cache block; and the coherence range SH is used to record whether coherence of the cache block is to be maintained among one or multiple processors.
11. The apparatus according to any one of claims 7 to 9, characterized in that the updating unit is further configured to write the updated page descriptor back into the memory after the page descriptor corresponding to the cache block is updated.
12. The apparatus according to claim 10, characterized in that the updating unit is further configured to write the updated page descriptor back into the memory after the page descriptor corresponding to the cache block is updated.
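As a further illustration of the page label fields enumerated in claims 4 and 10 (access record FA, processor identifier, coherence range SH), the label can be pictured as a packed bitfield. The bit widths below are assumptions chosen for illustration only; the patent names the fields but does not specify an encoding.

```python
# Hypothetical bit layout for the page label of claims 4 and 10.
# Assumed widths: 1-bit FA, 6-bit processor identifier (up to 64 cores),
# 1-bit SH. The patent specifies the fields, not their encoding.

FA_BIT = 1 << 0                  # access record: block has been accessed
SH_BIT = 1 << 7                  # coherence range: block shared by >1 processor
CPU_SHIFT, CPU_MASK = 1, 0x3F    # first-accessor ID packed into bits 1..6

def pack_label(fa, cpu_id, sh):
    """Pack the three page label fields into one byte-sized tag."""
    tag = (FA_BIT if fa else 0) | (SH_BIT if sh else 0)
    return tag | ((cpu_id & CPU_MASK) << CPU_SHIFT)

def unpack_label(tag):
    """Recover (fa, cpu_id, sh) from a packed tag."""
    return (bool(tag & FA_BIT),
            (tag >> CPU_SHIFT) & CPU_MASK,
            bool(tag & SH_BIT))
```

Keeping the label this compact is consistent with the stated goal of the claims: the common single-reader case is tracked in the page descriptor itself rather than in a directory entry.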
CN201410704522.XA 2014-11-26 2014-11-26 A kind of multiprocessor buffer consistency processing method and processing device Active CN105700953B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410704522.XA CN105700953B (en) 2014-11-26 2014-11-26 A kind of multiprocessor buffer consistency processing method and processing device

Publications (2)

Publication Number Publication Date
CN105700953A CN105700953A (en) 2016-06-22
CN105700953B true CN105700953B (en) 2019-03-26

Family

ID=56295866

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116962259B (en) * 2023-09-21 2024-02-13 中电科申泰信息科技有限公司 Consistency processing method and system based on monitoring-directory two-layer protocol

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
US7475193B2 (en) * 2006-01-18 2009-01-06 International Business Machines Corporation Separate data and coherency cache directories in a shared cache in a multiprocessor system
CN101510191B (en) * 2009-03-26 2010-10-06 浙江大学 Implementing method of multi-core system structure with buffer window
CN101859281A (en) * 2009-04-13 2010-10-13 廖鑫 Method for embedded multi-core buffer consistency based on centralized directory

Legal Events

Date Code Title Description
C06 / PB01: Publication
C10 / SE01: Entry into substantive examination (entry into force of request for substantive examination)
GR01: Patent grant
TR01: Transfer of patent right

Effective date of registration: 2020-04-20

Address after: Huawei headquarters office building, Bantian, Longgang District, Shenzhen, Guangdong 518129

Patentee after: HUAWEI TECHNOLOGIES Co.,Ltd.

Address before: Room 301, Building A, Building 3, No. 301 Foreshore Road, Binjiang District, Hangzhou, Zhejiang 310052

Patentee before: Huawei Technologies Co.,Ltd.