CN105700953B - Multiprocessor cache coherence processing method and device - Google Patents
- Publication number
- CN105700953B · CN105700953A · CN201410704522.XA · CN201410704522A
- Authority
- CN
- China
- Prior art keywords
- processor
- page
- cache block
- page descriptor
- cache
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Abstract
The present invention discloses a multiprocessor cache coherence processing method and device. The method is applied in a multiprocessor system and comprises: after the last-level cache receives a first request from a first processor in the multiprocessor system to perform a read operation on a cache block, reading the first page descriptor corresponding to the cache block from memory; determining, according to the first page descriptor, that the first processor is the first processor to access the cache block, and then adding a page tag to the first page descriptor to obtain a second page descriptor, the page tag marking the first processor as the processor that first accessed the cache block. The method solves the problem that existing approaches to maintaining coherence in multiprocessor systems waste a large amount of on-chip storage area.
Description
Technical field
The present invention relates to the field of electronic technology, and in particular to a multiprocessor cache coherence processing method and device.
Background art
Advances in semiconductor technology have allowed the number of transistors per unit area of an integrated circuit to keep growing at a steady pace even after 2013. To make full use of this progress and the additional transistors, processor manufacturers have shifted from single-core designs to multi-core/many-core designs, improving overall system performance by exploiting thread-level parallelism (Thread Level Parallelism, TLP). At the same time, manufacturers also integrate multiple processor chips into one system to obtain higher performance; systems of this design are referred to as multiprocessor systems. For example, a state-of-the-art processor chip may contain 12 processors, and 4 such chips can be integrated together to form a 48-core server-class computing system. Multi-core, many-core, and multiprocessor structures are collectively referred to as multiprocessing architectures (multi-processing architecture).
Although multiprocessing architectures further improve computer system performance, they also face problems never encountered in single-core processor design, one of which is the design of cache coherence. A modern computer system is often composed of multiple processors, and each processor chip may itself contain multiple processor cores. For these processors to cooperate, correctness becomes the primary concern. The problem can be subdivided into two sub-problems: memory consistency and cache coherence (cache coherence).
Memory consistency, also known as the memory consistency model, defines the permitted behavior and ordering of read and write operations on shared storage. The strictest and most intuitive of all memory consistency models is the sequential consistency model (Sequential Consistency, SC). Current mainstream general-purpose multicore/multiprocessor architectures (such as x86, x86-64, and SPARC) adopt an approximation of sequential consistency called total store order (Total Store Order, TSO).
The introduction of the sequential consistency and TSO models greatly facilitates the work of programmers, and most general-purpose processors today implement the TSO model. Both models, however, place new requirements on cache design: the caches must satisfy cache coherence (cache coherence). Cache coherence requires the multiple caches in a computer system to satisfy the "single writer, multiple readers" invariant. That is, at any moment, for any datum, if a write operation is to be performed on it, only the processor performing the write may cache that datum; but if read operations are to be performed on it, multiple processors may cache the datum simultaneously. In a general-purpose computing system, a correct and efficient cache coherence design is a very important subject.
As the number of processors integrated in a system grows, how to maintain cache coherence among the processors more efficiently has become a subject of common concern in academia and industry. There are two basic protocol families for maintaining cache coherence: snooping protocols (snoop) and directory protocols (directory). Since directory protocols scale better than snooping protocols, they are often employed in large-scale and very-large-scale systems.
Fig. 1 shows the logical structure of a system using a directory protocol. The system has a piece of directory hardware connected through a network to the caches of all processors, and every update to a coherence state has to pass through the directory hardware. When processor A needs to perform a read operation on cache block M, processor A first sends a read request to the directory hardware; the directory lookup finds that the cache of processor B holds this data, so a message is sent to the cache of processor B; after the cache of processor B receives the message, it sends the data to the cache of processor A.
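The three-hop read handled by the directory hardware in Fig. 1 can be sketched as follows. The message flow matches the description above; the sharer-tracking structure and all names are assumptions for illustration, not the patent's hardware design.

```python
# Sketch of the directory-protocol read described for Fig. 1: processor A's
# read request goes to the directory, which forwards it to a cache that holds
# the block; that cache then supplies the data. All names are illustrative.

class Directory:
    def __init__(self):
        self.sharers = {}           # block -> set of caching processors

    def read(self, requester, block, caches):
        holders = self.sharers.setdefault(block, set())
        if holders:
            owner = next(iter(holders))     # forward to one current holder
            data = caches[owner][block]     # holder supplies the data
        else:
            data = f"mem[{block}]"          # otherwise fetch from memory
        caches[requester][block] = data     # requester now caches the block
        holders.add(requester)
        return data

caches = {"A": {}, "B": {"M": "v1"}}
directory = Directory()
directory.sharers["M"] = {"B"}              # directory knows B caches block M
print(directory.read("A", "M", caches))     # prints v1, forwarded from B's cache
```

Every coherence-state update funnels through the `Directory` object, mirroring how all state updates pass through the directory hardware in Fig. 1.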
The prior art provides a sparse directory (sparse directory) scheme to solve the cache coherence problem in cache-coherent non-uniform memory access (Cache Coherent Non-Uniform Memory Access, ccNUMA) systems. As shown in Fig. 2, a sparse directory stores the directory information on-chip in a set-associative form; because its organization is very close to the structure of a cache, it is also known as a directory cache (directory cache).
Again suppose processor A needs to perform a read operation on cache block M. Processor A first sends a read request to the sparse directory; the directory lookup finds that the cache of processor B holds this data, so a message is sent to the cache of processor B; after the cache of processor B receives the message, it sends the data to the cache of processor A.
A sparse directory can use a very low associativity, so its access energy cost is low. But because the sparse directory is set-associative, accesses are unevenly distributed across its directory sets (directory sets), putting heavy pressure on certain sets and causing directory conflicts (directory conflicts). A directory conflict requires some directory block to be replaced (replace) out of the sparse directory, and, to maintain system coherence, the cached data corresponding to the replaced directory block must be invalidated (invalidate). In addition, the system must set aside dedicated storage to hold the cache coherence information, so in this approach the directory cache occupies a large amount of on-chip storage area.
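The directory-conflict behavior described above — a full set forces a victim entry out, invalidating the cached copies it tracked — can be sketched as follows. The set count, associativity, and eviction policy are assumptions chosen for illustration.

```python
# Sketch of sparse-directory conflicts: directory entries are grouped into
# set-associative sets; when a set is full, a victim entry is replaced and
# the cached copies it tracked must be invalidated. Parameters are illustrative.

NUM_SETS, WAYS = 4, 2

class SparseDirectory:
    def __init__(self):
        self.sets = [dict() for _ in range(NUM_SETS)]  # block -> sharer set
        self.invalidated = []                          # blocks forced out

    def allocate(self, block, sharers):
        s = self.sets[hash(block) % NUM_SETS]
        if block not in s and len(s) == WAYS:          # directory conflict
            victim = next(iter(s))                     # evict oldest entry
            del s[victim]
            self.invalidated.append(victim)            # victim's copies invalidated
        s[block] = set(sharers)

d = SparseDirectory()
for blk in (0, 4, 8):          # blocks 0, 4, 8 all map to set 0
    d.allocate(blk, {0})
print(d.invalidated)           # prints [0]: the first-allocated block was evicted
```

The invalidation of block 0's cached copies happens even though no capacity pressure exists in the processors' caches themselves — the conflict is purely in the directory, which is the cost the patent's scheme aims to avoid.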
Summary of the invention
The present invention provides a multiprocessor cache coherence processing method and device, which solve the problem that existing approaches to maintaining coherence in multiprocessor systems waste a large amount of on-chip storage area.
In one aspect, the present invention provides a multiprocessor cache coherence processing method applied in a multiprocessor system. The multiprocessor system includes multiple processors, a last-level cache LLC, a memory, and a directory cache, where the last-level cache is connected to the multiple processors, the memory, and the directory cache. The method comprises:
after the last-level cache receives a first request from a first processor among the multiple processors to perform a read operation on a cache block, reading the first page descriptor corresponding to the cache block from the memory;
determining, according to the first page descriptor, that the first processor is the first processor to access the cache block, and then adding a page tag to the first page descriptor to obtain a second page descriptor, the page tag marking the first processor as the processor that first accessed the cache block.
With reference to the first aspect, in a first possible implementation, the method further comprises:
when a second request from a second processor to perform a read operation on the cache block is received, obtaining the second page descriptor according to the second request;
updating the page tag in the second page descriptor to obtain a third page descriptor, the updated page tag marking the first processor as the processor that first accessed the cache block and marking the cache block as operated on by multiple processors;
sending a directory allocation message to the directory cache so that the directory cache allocates a directory entry, the directory entry recording the processors that access the cache block.
With reference to the first possible implementation of the first aspect, in a second possible implementation, after the third page descriptor is obtained, the method further comprises:
storing the third page descriptor, and sending the third page descriptor to the first processor and the second processor for storage.
With reference to the first aspect or the first or second possible implementation of the first aspect, in a third possible implementation, the page tag includes: an access record FA, for indicating whether any processor has accessed the cache block; a processor identifier, for indicating the unique identifier of the processor that first accessed the cache block; and a coherence scope SH, for recording whether coherence for the cache block is maintained within one processor or among multiple processors.
With reference to the first aspect or the first to third possible implementations of the first aspect, in a fourth possible implementation, after the page descriptor corresponding to the cache block is updated, the method further writes the updated page descriptor back into the memory.
In a second aspect, the present invention also provides a multiprocessor cache coherence processing device applied in the last-level cache of a multiprocessor system. The multiprocessor system includes multiple processors, the last-level cache, a memory, and a directory cache, where the last-level cache is connected to the multiple processors, the memory, and the directory cache. The device includes:
a reading unit, configured to read the first page descriptor corresponding to a cache block from the memory after receiving a first request from a first processor among the multiple processors to perform a read operation on the cache block;
an updating unit, configured to determine, according to the first page descriptor, that the first processor is the first processor to access the cache block, and then add a page tag to the first page descriptor to obtain a second page descriptor, the page tag marking the first processor as the processor that first accessed the cache block.
With reference to the second aspect, in a first possible implementation, the updating unit is further configured to, when a second request from a second processor to perform a read operation on the cache block is received, obtain the second page descriptor according to the second request, and update the page tag in the second page descriptor to obtain a third page descriptor, the updated page tag marking the first processor as the processor that first accessed the cache block and marking the cache block as operated on by multiple processors;
the device then further includes:
a directory cache request unit, configured to send a directory allocation message to the directory cache so that the directory cache allocates a directory entry, the directory entry recording the processors that access the cache block.
With reference to the first possible implementation of the second aspect, in a second possible implementation, the device further includes:
a synchronization unit, configured to, after the third page descriptor is obtained, store the third page descriptor and send the third page descriptor to the first processor and the second processor for storage.
With reference to the first or second possible implementation of the second aspect, in a third possible implementation, the updating unit is configured to update the access record FA, the processor identifier, and the coherence scope SH included in the page tag, where the access record FA indicates whether any processor has accessed the cache block; the processor identifier indicates the unique identifier of the processor that first accessed the cache block; and the coherence scope SH records whether coherence for the cache block is maintained within one processor or among multiple processors.
With reference to the second aspect or the first to third possible implementations of the second aspect, in a fourth possible implementation, the updating unit is further configured to write the updated page descriptor back into the memory after the page descriptor corresponding to the cache block is updated.
At least one of the above technical solutions has the following technical effects:
With the method and device provided by the embodiments of the present invention, after the processor that first accesses a cache block has accessed the block, a corresponding page tag is added directly in the page descriptor. This page tag indicates that a processor has already performed a read operation on the cache block, so that when other processors subsequently perform read operations on the cache block, the follow-up operations can be carried out via the page tag. Cache coherence can thus be achieved through the page tag, avoiding accesses to the directory cache, which reduces the number of directory cache accesses and the directory cache power overhead. Nor is it necessary to allocate a directory entry to record the processors' operations on the cache block, so multicore coherence can be maintained with a smaller chip area, reducing the directory cache area overhead.
Description of the drawings
Fig. 1 is a schematic diagram of the logical structure of a prior-art system using a directory protocol;
Fig. 2 is a schematic diagram of the logical structure of a prior-art sparse directory system;
Fig. 3 is a schematic flowchart of a multiprocessor cache coherence processing method provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a multiprocessor system to which the scheme provided by an embodiment of the present invention is applicable;
Fig. 5 is a schematic structural diagram of a multiprocessor cache coherence processing device provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a multiprocessor system provided by an embodiment of the present invention.
Specific embodiments
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
The embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
As shown in Fig. 3, an embodiment of the present invention provides a multiprocessor cache coherence processing method applied in a multiprocessor system. The multiprocessor system includes multiple processors, a last-level cache (Last Level Cache, LLC), a memory (Main Memory), and a directory cache (Directory Cache), where the last-level cache is connected to the multiple processors, the memory, and the directory cache (the connection structure of the units in the multiprocessor system is shown in Fig. 4). The method comprises:
Step 301: after the last-level cache receives a first request from a first processor among the multiple processors to perform a read operation on a cache block, reading the first page descriptor (Page Descriptor) corresponding to the cache block from the memory;
Step 302: determining, according to the first page descriptor, that the first processor is the first processor to access the cache block, and then adding a page tag to the first page descriptor to obtain a second page descriptor, the page tag marking the first processor as the processor that first accessed the cache block.
In the embodiments of the present invention, to indicate the processor that first accessed a cache block, one specific implementation of the page tag is the structure shown in Table 1:
Table 1
FA | CPU_ID | SH
where the access record (First Access, FA) indicates whether any processor has accessed the cache block; for example, 1 can mark "accessed" and 0 "not accessed";
the processor identifier (CPU_ID) indicates the unique identifier (ID) of the processor that first accessed the cache block;
the coherence scope (Shareability, SH) records whether coherence for the cache block is maintained within one processor or among multiple processors; for example, 0 can indicate coherence maintained within one processor and 1 coherence maintained among multiple processors.
To guarantee cache coherence, after the original page descriptor of a cache block has been updated, the adjusted page descriptor must also be saved in the corresponding locations, so that processors that subsequently access the cache block can determine the access status of the cache block from the adjusted page descriptor. The method provided by this embodiment therefore further includes:
the last-level cache stores a copy of the second page descriptor itself, and also sends the second page descriptor to the first processor and the memory for storage. Because the original page descriptor was read from the memory, the adjusted page descriptor can be written back to the location of the original page descriptor.
In practical application environments, the data handled by different processors is partitioned to some extent, so the case in which the same data is accessed by multiple processors is rarer than the case in which it is accessed by a single processor. Nevertheless, to guarantee cache coherence across multiple processors, the prior art allocates a corresponding directory entry in the directory cache (Directory Cache) to record the CPU_IDs of the processors accessing the data, regardless of whether one processor or multiple processors operate on the cache block (i.e., the data). And every time a directory entry is allocated, the entry must cover the access status of every processor in the multiprocessor system (that is, even if not all processors have accessed the cache block, the allocated directory entry must be able to record the status of all of them). The prior-art method therefore wastes a large amount of directory cache resources.
In the method provided by the embodiments of the present invention, by contrast, after the processor that first accesses a cache block has accessed the block, a corresponding page tag is added directly in the page descriptor. This page tag indicates that a processor has already performed a read operation on the cache block, so that when other processors subsequently perform read operations on the cache block, the follow-up operations can be carried out via the page tag. Cache coherence can thus be achieved through the page tag, avoiding accesses to the directory cache, which reduces the number of directory cache accesses and the directory cache power overhead. Nor is it necessary to allocate a directory entry to record the processors' operations on the cache block, so multicore coherence can be maintained with a smaller chip area, reducing the directory cache area overhead.
Further, after the first processor has accessed the cache block in the above method and the access status has been saved in each device, when a second processor accesses the same cache block, the page descriptor corresponding to the cache block is obtained first, and the page tag in the page descriptor then makes it clear that the second processor is not the first processor to access the cache block. The method may therefore further comprise:
A. when a second request from the second processor to perform a read operation on the cache block is received, obtaining the second page descriptor according to the second request; and updating the page tag in the second page descriptor to obtain a third page descriptor, the updated page tag marking the first processor as the processor that first accessed the cache block and marking the cache block as operated on by multiple processors;
In this example, because the first processor is the processor that first accessed the cache block, after the last-level cache receives the second request, it obtains the page descriptor corresponding to the cache block and learns from the page tag in that page descriptor that the processor that first accessed the cache block is the first processor. Having determined that the second processor and the first processor both need to access the same cache block, it must update the page tag in the original page descriptor, the updated page tag marking the first processor as the processor that first accessed the cache block and marking the cache block as operated on by multiple processors.
Since cache coherence must now be guaranteed across the first processor and the second processor, a corresponding directory entry must further be requested from the directory cache.
B. sending a directory allocation message to the directory cache so that the directory cache allocates a directory entry, the directory entry recording the processors that access the cache block.
To guarantee cache coherence, after the original page descriptor of the cache block has been updated, the adjusted page descriptor must also be saved in the corresponding locations, so that processors that subsequently access the cache block can determine the access status of the cache block from the adjusted page descriptor. The method provided by this embodiment therefore further includes:
storing the third page descriptor, and sending the third page descriptor to the first processor and the second processor for storage.
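Steps A and B above — the first read tags the descriptor, and a read from a different processor marks it shared and only then allocates a directory entry — can be sketched end to end. The LLC, memory, and directory objects below are simplified stand-ins; all names are assumptions, not the patent's hardware interfaces.

```python
# End-to-end sketch of the embodiment's flow: the first read of a block tags
# its page descriptor with the requester; a read by a different processor
# marks the descriptor shared and only then allocates a directory entry.

class DirectoryCache:
    def __init__(self):
        self.entries = {}                   # block -> set of sharers

    def allocate(self, block, sharers):
        self.entries[block] = set(sharers)

class LastLevelCache:
    def __init__(self, memory, directory):
        self.memory = memory                # block -> page descriptor dict
        self.directory = directory

    def read(self, cpu, block):
        desc = self.memory[block]
        if "tag" not in desc:               # first access: tag only,
            desc["tag"] = {"fa": 1, "cpu_id": cpu, "sh": 0}   # no directory entry
        elif desc["tag"]["cpu_id"] != cpu and desc["tag"]["sh"] == 0:
            desc["tag"]["sh"] = 1           # now shared: allocate directory entry
            self.directory.allocate(block, {desc["tag"]["cpu_id"], cpu})
        self.memory[block] = desc           # write the descriptor back

directory = DirectoryCache()
llc = LastLevelCache({"M": {}}, directory)
llc.read(cpu=1, block="M")                  # first request: tag set, no entry
print(directory.entries)                    # prints {}
llc.read(cpu=2, block="M")                  # second processor: entry allocated
print(directory.entries)                    # prints {'M': {1, 2}}
```

Note how the directory cache is touched only on the transition from private to shared; for the common private-data case, no directory resources are consumed at all.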
As shown in Fig. 5, in accordance with the above method, an embodiment of the present invention also provides a multiprocessor cache coherence processing device applied in the last-level cache of a multiprocessor system. The multiprocessor system includes multiple processors, the last-level cache, a memory, and a directory cache, where the last-level cache is connected to the multiple processors, the memory, and the directory cache. The device includes:
a reading unit 501, configured to read the first page descriptor (Page Descriptor) corresponding to a cache block from the memory after receiving a first request from a first processor among the multiple processors to perform a read operation on the cache block;
an updating unit 502, configured to determine, according to the first page descriptor, that the first processor is the first processor to access the cache block, and then add a page tag to the first page descriptor to obtain a second page descriptor, the page tag marking the first processor as the processor that first accessed the cache block.
The updating unit 502 can update the page tag in different ways depending on the parameters of the page tag; one specific implementation may be:
the updating unit 502 is configured to update the access record FA, the processor identifier, and the coherence scope SH included in the page tag, where the access record FA indicates whether any processor has accessed the cache block; the processor identifier indicates the unique identifier (ID) of the processor that first accessed the cache block; and the coherence scope SH records whether coherence for the cache block is maintained within one processor or among multiple processors.
Further, to guarantee that subsequent processors can obtain the page descriptor of the cache block, after the page descriptor has been updated, the updated page descriptor must also be stored in the corresponding location, which specifically includes:
the updating unit 502 being further configured to write the updated page descriptor back into the memory after the page descriptor corresponding to the cache block is updated.
When multiple processors access the same cache block, the device must further maintain cache coherence among the multiple processors, which may specifically be:
the updating unit 502 being further configured to, when a second request from a second processor to perform a read operation on the cache block is received, obtain the second page descriptor according to the second request, and update the page tag in the second page descriptor to obtain a third page descriptor, the updated page tag marking the first processor as the processor that first accessed the cache block and marking the cache block as operated on by multiple processors;
the device then further including:
a directory cache request unit, configured to send a directory allocation message to the directory cache so that the directory cache allocates a directory entry, the directory entry recording the processors that access the cache block.
To guarantee cache coherence, after the original page descriptor of the cache block has been updated, the adjusted page descriptor must also be saved in the corresponding locations, so that processors that subsequently access the cache block can determine the access status of the cache block from the adjusted page descriptor. The device provided by this embodiment therefore further includes:
a synchronization unit, configured to, after the third page descriptor is obtained, store the third page descriptor and send the third page descriptor to the first processor and the second processor for storage.
As shown in Fig. 6, in accordance with the above method, an embodiment of the present invention also provides a multiprocessor system. The multiprocessor system includes multiple processors 601, a last-level cache 602, a memory 603, and a directory cache 604, where the last-level cache 602 is connected to the multiple processors 601, the memory 603, and the directory cache 604, in which:
a first processor among the multiple processors 601 sends to the last-level cache 602 a first request to perform a read operation on a cache block;
In a practical application environment, each processor is provided with a hardware page table walker (Hardware Table Walker, HTW) and a translation lookaside buffer (Translation Lookaside Buffer, TLB); when a processor needs to access a cache block stored in the memory, the TLB sends the request for the page descriptor of that cache block.
After the last-level cache 602 receives the first request sent by the first processor, it reads the page descriptor (Page Descriptor) corresponding to the cache block from the memory 603; determines, according to the first page descriptor, that the first processor is the first processor to access the cache block; adds a page tag to the first page descriptor to obtain a second page descriptor, the page tag marking the first processor as the processor that first accessed the cache block; stores the second page descriptor; and sends it to the first processor and the memory 603 for storage.
Further, when a second processor among the multiple processors 601 sends to the last-level cache 602 a second request to perform a read operation on the same cache block:
the last-level cache 602 obtains, according to the second request, the second page descriptor corresponding to the cache block; updates the page tag in the second page descriptor to obtain a third page descriptor, the updated page tag marking the first processor as the processor that first accessed the cache block and marking the cache block as operated on by multiple processors; and sends a directory allocation message to the directory cache 604;
the directory cache 604, after receiving the allocation message sent by the last-level cache 602, allocates a directory entry according to the allocation message, the directory entry recording the processors that access the cache block (namely that the first processor and the second processor have accessed the cache block, while the other processors among the multiple processors have not).
The one or more technical solutions in the embodiments of the present application have at least the following technical effects:
In the scheme provided by the present invention, after the processor that first accesses a cache block has accessed the block, a corresponding page tag is added directly in the page descriptor. This page tag indicates that a processor has already performed a read operation on the cache block, so that when other processors subsequently perform read operations on the cache block, the follow-up operations can be carried out via the page tag. Cache coherence can thus be achieved through the page tag, avoiding accesses to the directory cache, which reduces the number of directory cache accesses and the directory cache power overhead. Nor is it necessary to allocate a directory entry to record the processors' operations on the cache block, so multicore coherence can be maintained with a smaller chip area, reducing the directory cache area overhead.
The method of the present invention is not limited to the embodiments described in the specific embodiments; other embodiments obtained by those skilled in the art according to the technical solution of the present invention likewise belong to the scope of the technical innovation of the present invention.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.
Claims (12)
1. A multiprocessor cache coherence processing method, applied in a multiprocessor system, the multiprocessor system including multiple processors, a last-level cache LLC, a memory, and a directory cache, where the last-level cache is connected to the multiple processors, the memory, and the directory cache, characterized in that the method comprises:
after the last-level cache receives a first request from a first processor in the multiprocessor system to perform a read operation on a cache block, reading the first page descriptor corresponding to the cache block from the memory;
determining, according to the first page descriptor, that the first processor is the first processor to access the cache block, and then adding a page tag to the first page descriptor to obtain a second page descriptor, the page tag marking the first processor as the processor that first accessed the cache block.
2. The method according to claim 1, characterized in that the method further comprises:
when a second request from a second processor to perform a read operation on the cache block is received, obtaining the second page descriptor according to the second request;
updating the page label in the second page descriptor to obtain a third page descriptor, the updated page label marking the first processor as the processor that accessed the cache block for the first time and marking the cache block as operated on by multiple processors;
sending a directory allocation message to the directory cache, so that the directory cache allocates a directory entry, the directory entry being used to record the processors that access the cache block.
3. The method according to claim 2, characterized in that after obtaining the third page descriptor, the method further comprises:
storing the third page descriptor, and sending the third page descriptor to the first processor and the second processor for storage.
4. The method according to any one of claims 1 to 3, characterized in that the page label comprises: an access record FA, used to indicate whether a processor has accessed the cache block; a processor identifier, used to indicate the unique identifier of the processor that accessed the cache block for the first time; and a consistency range SH, used to record whether coherence for the cache block is maintained among one or multiple processors.
5. The method according to any one of claims 1 to 3, characterized in that after the page descriptor corresponding to the cache block is updated, the method further comprises writing the updated page descriptor back into the memory.
6. The method according to claim 4, characterized in that after the page descriptor corresponding to the cache block is updated, the method further comprises writing the updated page descriptor back into the memory.
7. A multiprocessor cache coherence processing apparatus, the apparatus being applied in a last-level cache of a multiprocessor system, the multiprocessor system comprising multiple processors, the last-level cache, a memory and a directory cache, wherein the last-level cache is connected to the multiple processors, the memory and the directory cache respectively, characterized in that the apparatus comprises:
a reading unit, configured to read a first page descriptor corresponding to a cache block from the memory after receiving a first request from a first processor in the multiprocessor system to perform a read operation on the cache block;
an updating unit, configured to, if it is determined according to the first page descriptor that the first processor is the first processor to access the cache block, add a page label to the first page descriptor to obtain a second page descriptor, the page label marking the first processor as the processor that accesses the cache block for the first time.
8. The apparatus according to claim 7, characterized in that the updating unit is further configured to, when a second request from a second processor to perform a read operation on the cache block is received, obtain the second page descriptor according to the second request, and update the page label in the second page descriptor to obtain a third page descriptor, the updated page label marking the first processor as the processor that accessed the cache block for the first time and marking the cache block as operated on by multiple processors;
the apparatus further comprises:
a directory cache request unit, configured to send a directory allocation message to the directory cache, so that the directory cache allocates a directory entry, the directory entry being used to record the processors that access the cache block.
9. The apparatus according to claim 8, characterized in that the apparatus further comprises:
a synchronization unit, configured to, after the third page descriptor is obtained, store the third page descriptor and send the third page descriptor to the first processor and the second processor for storage.
10. The apparatus according to claim 8 or 9, characterized in that the updating unit is configured to update the access record FA, the processor identifier and the consistency range SH comprised in the page label, wherein the access record FA is used to indicate whether a processor has accessed the cache block; the processor identifier is used to indicate the unique identifier of the processor that accessed the cache block for the first time; and the consistency range SH is used to record whether coherence for the cache block is maintained among one or multiple processors.
11. The apparatus according to any one of claims 7 to 9, characterized in that the updating unit is further configured to write the updated page descriptor back into the memory after the page descriptor corresponding to the cache block is updated.
12. The apparatus according to claim 10, characterized in that the updating unit is further configured to write the updated page descriptor back into the memory after the page descriptor corresponding to the cache block is updated.
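The flow recited in claims 1 to 3 (a first read labels the page descriptor, a read by a second processor updates the label and triggers directory-entry allocation, and the updated descriptor is written back) can be modelled roughly as follows. This is a hypothetical sketch: every name in it (`handle_read`, `DirectoryCache`, the `owner`/`shared` fields) is illustrative rather than taken from the patent.

```python
class DirectoryCache:
    """Toy directory cache: allocates entries only for shared blocks."""
    def __init__(self):
        self.entries = {}                 # block address -> set of sharer IDs

    def allocate(self, block, sharers):
        self.entries[block] = set(sharers)

def handle_read(memory, directory, block, cpu):
    desc = memory[block]                  # read page descriptor from memory
    if desc["owner"] is None:             # first request: label the descriptor
        desc["owner"] = cpu
    elif not desc["shared"] and desc["owner"] != cpu:
        desc["shared"] = True             # second request from another processor
        directory.allocate(block, {desc["owner"], cpu})
    memory[block] = desc                  # write the updated descriptor back
    return desc

mem = {0x40: {"owner": None, "shared": False}}
dc = DirectoryCache()
handle_read(mem, dc, 0x40, cpu=1)         # no directory entry allocated yet
handle_read(mem, dc, 0x40, cpu=2)         # block becomes shared; entry allocated
print(sorted(dc.entries[0x40]))           # [1, 2]
```

The point of the design, as modelled here, is that the directory cache only ever sees blocks that are actually shared; blocks read by a single processor cost it no storage.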
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410704522.XA CN105700953B (en) | 2014-11-26 | 2014-11-26 | A kind of multiprocessor buffer consistency processing method and processing device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105700953A CN105700953A (en) | 2016-06-22 |
CN105700953B true CN105700953B (en) | 2019-03-26 |
Family
ID=56295866
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410704522.XA Active CN105700953B (en) | 2014-11-26 | 2014-11-26 | A kind of multiprocessor buffer consistency processing method and processing device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105700953B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116962259B (en) * | 2023-09-21 | 2024-02-13 | 中电科申泰信息科技有限公司 | Consistency processing method and system based on monitoring-directory two-layer protocol |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7475193B2 (en) * | 2006-01-18 | 2009-01-06 | International Business Machines Corporation | Separate data and coherency cache directories in a shared cache in a multiprocessor system |
CN101510191B (en) * | 2009-03-26 | 2010-10-06 | 浙江大学 | Implementing method of multi-core system structure with buffer window |
CN101859281A (en) * | 2009-04-13 | 2010-10-13 | 廖鑫 | Method for embedded multi-core buffer consistency based on centralized directory |
- 2014-11-26 CN CN201410704522.XA patent/CN105700953B/en active Active
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9501425B2 (en) | Translation lookaside buffer management | |
US9418009B2 (en) | Inclusive and non-inclusive tracking of local cache lines to avoid near memory reads on cache line memory writes into a two level system memory | |
KR101385430B1 (en) | Cache coherence protocol for persistent memories | |
US8285969B2 (en) | Reducing broadcasts in multiprocessors | |
US20180239702A1 (en) | Locality-aware and sharing-aware cache coherence for collections of processors | |
US9916247B2 (en) | Cache management directory where hardware manages cache write requests and software manages cache read requests | |
CN109240945B (en) | Data processing method and processor | |
US9208088B2 (en) | Shared virtual memory management apparatus for providing cache-coherence | |
Tabbakh et al. | G-TSC: Timestamp based coherence for GPUs | |
US20120233409A1 (en) | Managing shared memory used by compute nodes | |
US9760489B2 (en) | Private memory table for reduced memory coherence traffic | |
CN105700953B (en) | A kind of multiprocessor buffer consistency processing method and processing device | |
US9037804B2 (en) | Efficient support of sparse data structure access | |
García-Guirado et al. | Energy-efficient cache coherence protocols in chip-multiprocessors for server consolidation | |
US9842050B2 (en) | Add-on memory coherence directory | |
Mojumder et al. | Halcone: A hardware-level timestamp-based cache coherence scheme for multi-gpu systems | |
US10482015B2 (en) | Ownership tracking updates across multiple simultaneous operations | |
KR101155127B1 (en) | Apparatus and method for memory management of multi-core system | |
Bae et al. | Dynamic directory table with victim cache: on-demand allocation of directory entries for active shared cache blocks | |
US11960399B2 (en) | Relaxed invalidation for cache coherence | |
Alkhamisi | Cache coherence issues and solution: A review | |
WO2013101065A1 (en) | Domain state | |
Karakostas | Improving the performance and energy-efficiency of virtual memory | |
Li et al. | A New Kind of Cache Coherence Protocol with SC-Cache for Multiprocessor | |
Merritt | Efficient Programming of Massive-memory Machines |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | |
TR01 | Transfer of patent right | |
Effective date of registration: 2020-04-20
Address after: 518129 Bantian HUAWEI headquarters office building, Longgang District, Shenzhen, Guangdong
Patentee after: HUAWEI TECHNOLOGIES Co.,Ltd.
Address before: Room 301, Building A, Building 3, No. 301 Foreshore Road, Binjiang District, Hangzhou, Zhejiang 310052
Patentee before: Huawei Technologies Co.,Ltd.