CN101958834A - On-chip network system supporting cache coherence and data request method - Google Patents


Info

Publication number
CN101958834A
Authority
CN
China
Prior art keywords
cache
node
request
buffer memory
host
Prior art date
Legal status
Granted
Application number
CN2010102940174A
Other languages
Chinese (zh)
Other versions
CN101958834B (en)
Inventor
王惊雷
汪东升
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Priority date
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN2010102940174A priority Critical patent/CN101958834B/en
Publication of CN101958834A publication Critical patent/CN101958834A/en
Application granted granted Critical
Publication of CN101958834B publication Critical patent/CN101958834B/en
Expired - Fee Related

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention discloses an on-chip network system supporting cache coherence. The system comprises a network interface unit and a router; the network interface unit connects the router with a multicore processor and a second-level (L2) cache. A coherence state cache connected to the multicore processor is added to the network interface unit to store and maintain the coherence state of the data blocks in the processor's first-level (L1) caches. An active directory cache connected to the L2 cache is also added to the network interface unit to cache and maintain the directory information of the L2 data blocks frequently accessed by the L1 caches. Coherence maintenance is thereby separated from the processor, directory maintenance is separated from the L2 cache, and the directory structure inside the L2 cache is eliminated, which simplifies the design and verification of the multicore processor, reduces on-chip storage overhead, and improves the processor's performance. The invention also discloses a data request method for the system.

Description

Network-on-chip supporting cache coherence and data request method
Technical field
The present invention relates to the technical field of computer system architecture, and in particular to a network-on-chip and a data request method supporting cache coherence.
Background
With single chips entering the billion-transistor era, the emphasis of architecture research has gradually shifted from realizing the required functionality with limited resources to making full use of the ever-growing transistor budget to design processors that meet demands such as high performance and low power consumption. Multicore processors provide an efficient, scalable way to exploit these transistor resources and have been embraced by both academia and industry. Integrating multiple (multicore) or many (manycore) processor cores on a single chip yields a large-scale multicore processor. The main challenges such processors face are design complexity, scalability, and memory access latency.
As multicore processors grow in scale, the on-chip memory system must supply them with large amounts of data. To reduce memory access latency and programming complexity, multicore processors usually adopt an on-chip memory system with a shared cache. Because each processor core usually contains a private cache, a cache coherence protocol must be used to maintain the coherence and integrity of the data in the private caches. As multicore processors scale up, bus structures and bus-based snooping coherence protocols can no longer satisfy the scalability requirements. To solve this problem, networks-on-chip and directory-based coherence protocols are used in place of buses and snooping protocols.
The directory coherence protocol is a basic communication mechanism of a multicore processor that guarantees the coherence and integrity of its data. Its implementation involves several components:
1. Processor core: maintains the coherence state of its private cache;
2. Shared cache: stores and maintains the directory information;
3. Network-on-chip: provides the transport service for coherence operations.
Because the directory coherence protocol is closely coupled to the processor, the shared cache, and the network-on-chip, it increases the difficulty of multicore processor design.
Heterogeneous multicore processors are widely used in industry thanks to their performance and power advantages, for example IBM's Cell processor and AMD's Bulldozer processor. However, coherence protocols are incompatible across the different cores of a heterogeneous multicore processor: some cores support a coherence protocol and some do not. IBM's PowerPC 755 supports the MEI (modified, exclusive, invalid) protocol; Intel's IA-32 series supports the MESI (modified, exclusive, shared, invalid) protocol; Sun's UltraSPARC supports a MOESI (exclusive modified, shared modified, exclusive clean, shared clean, invalid) protocol; and AMD64 processors support a MOESI (modified, owned, exclusive, shared, invalid) protocol that nevertheless differs greatly from the UltraSPARC variant. TI's DSPs provide only a simple facility for maintaining coherence between the processor and the L2 cache. Some embedded processors, such as the MIPS 4K and ARM7 series, contain private caches but do not support a coherence protocol, and most special-purpose processors and hardware accelerators do not either. These incompatible coherence protocols make heterogeneous multicore design exceptionally difficult.
Because the coherence protocol is closely coupled to the processor, the shared cache, and the network-on-chip, and especially because the protocols in a heterogeneous multicore processor are incompatible, every component must be redesigned for coherence maintenance whenever a new multicore processor is designed, which reduces component reusability and increases design difficulty.
As multicore processors scale up, the directory coherence protocol suffers a serious scalability problem: directory storage consumes part of the on-chip resources. Taking the full-map directory protocol as an example, with 64-byte data blocks the directory accounts for about 3% of the L2 cache storage in a 16-core processor; at 64 cores the ratio grows to 12.5%, and at 512 cores it reaches 50%. The directory storage overhead not only increases chip area and cost but also raises system power consumption, severely limiting the scalability of multicore processors.
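The growth of these figures follows from simple arithmetic for a full-map directory: each 64-byte (512-bit) data block carries a presence-bit vector with one bit per core. The sketch below reproduces the numbers under our own reading of the accounting, which the text does not spell out: the 3% and 12.5% figures match the vector-to-data ratio, while the 50% figure matches the vector's share of the combined vector-plus-data storage.

```python
def full_map_overhead(cores: int, block_bytes: int = 64) -> tuple[float, float]:
    """Storage cost of a full-map directory keeping one presence bit per
    core for every data block.  Returns two ratios:
    (vector bits / data bits, vector bits / (vector + data) bits)."""
    data_bits = block_bytes * 8          # a 64 B block holds 512 data bits
    vector_bits = cores                  # one presence bit per core
    return vector_bits / data_bits, vector_bits / (vector_bits + data_bits)

for n in (16, 64, 512):
    vs_data, vs_total = full_map_overhead(n)
    print(f"{n:3d} cores: {vs_data:.1%} of data, {vs_total:.1%} of total")
```

At 512 cores the vector is as large as the data block itself, which is why full-map directories stop being viable at that scale.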
In a directory coherence protocol, the directory usually resides in the last-level cache (e.g., the L2 cache of a two-level cache hierarchy), where every data block maintains a directory vector that tracks which processors cache the block. Every processor miss request must look up the directory information in the home node's L2 cache and perform the corresponding coherence operation. As the processor scales up, the directory access latency grows with it, seriously degrading performance; the directory coherence protocol thus also becomes a performance bottleneck of large-scale multicore processors.
Summary of the invention
(1) Technical problem to be solved
The technical problem to be solved by the present invention is that the coherence protocol makes multicore processor design complex, the directory hard to scale, and directory access latency high.
(2) Technical scheme
To solve the above technical problems, the invention provides a network-on-chip system supporting cache coherence, comprising a network interface unit that connects a router as well as a multicore processor and an L2 cache. A coherence state cache connected to the multicore processor is added to the network interface unit; it stores and maintains the coherence state of the data blocks in the first-level (L1) cache of each core of the multicore processor.
The coherence state cache comprises:
A coherence state memory, with the same number of storage lines as the L1 cache, which stores the coherence state of the L1 data blocks;
A processor interface, connected to the multicore processor, which extracts the request signals needed by the coherence state cache from the bus requests of the different processor core types, and converts the coherence state cache's responses or requests to the multicore processor into signals the processor can recognize;
A coherence protocol controller which, when an L1 miss request or a response passes through the network interface unit, obtains the miss request or response via the processor interface, extracts the address tag from it, and maintains the corresponding coherence state.
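As a rough illustration, the coherence state memory can be pictured as a tag-plus-state store with the same geometry as the L1 it shadows. The sketch below is hypothetical: the class and field names are ours, and a direct-mapped layout is assumed for simplicity, which the patent does not fix.

```python
from dataclasses import dataclass

@dataclass
class CSCLine:
    tag: int          # address tag of the shadowed L1 block
    state: str        # stable "I"/"S"/"M" or transient "IS"/"IM"

class CoherenceStateCache:
    """One storage line per L1 line; direct-mapped for simplicity."""

    def __init__(self, num_lines: int, line_bytes: int = 64):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.lines: dict[int, CSCLine] = {}

    def split(self, addr: int) -> tuple[int, int]:
        """Decompose an address into (set index, tag)."""
        block = addr // self.line_bytes
        return block % self.num_lines, block // self.num_lines

    def lookup(self, addr: int) -> str:
        """Return the coherence state tracked for addr ("I" if absent)."""
        index, tag = self.split(addr)
        line = self.lines.get(index)
        return line.state if line and line.tag == tag else "I"

    def allocate(self, addr: int, state: str) -> None:
        index, tag = self.split(addr)
        self.lines[index] = CSCLine(tag, state)
```

Because the store mirrors the L1's geometry, the protocol controller can track every private-cache line without consulting the processor itself.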
An active directory cache connected to the L2 cache is also added to the network interface unit; it caches and maintains the directory information of the data blocks in the L2 cache that are frequently accessed by the L1 caches.
The active directory cache comprises:
A directory memory, which caches the directory information of the L2 data blocks frequently accessed by the L1 caches: the address tag, directory state, and directory vector of each data block;
An L2 cache interface, connected to the L2 cache, which forwards the multicore processor's miss requests to the L2 cache and returns the L2 cache's responses to the active directory cache;
A directory controller, which obtains the multicore processor's L2 access requests via the L2 cache interface, looks up the directory information in the directory memory, and then decides according to the request type whether to send the request on to the local L2 cache.
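To make the directory memory layout concrete, one storage line can be sketched as a tag, a state, and a bit vector with one bit per core. This is an illustrative sketch under our own naming, not the patent's encoding:

```python
from dataclasses import dataclass

@dataclass
class DirectoryEntry:
    tag: int          # address tag of the L2 data block
    state: str        # directory state: "S" (shared) or "M" (modified)
    sharers: int = 0  # bit i set => core i's private L1 holds the block

    def add_sharer(self, core: int) -> None:
        self.sharers |= 1 << core

    def remove_sharer(self, core: int) -> None:
        self.sharers &= ~(1 << core)

    def sharer_list(self) -> list[int]:
        return [i for i in range(self.sharers.bit_length())
                if self.sharers >> i & 1]
```

Since the active directory cache holds entries only for frequently accessed blocks, the total number of such lines can be far smaller than one per L2 block, which is the source of the storage savings claimed above.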
The invention also provides a data request method using the above system, comprising the following steps:
S1: the coherence protocol controller of the requesting node captures an L1 miss request from the requesting node's processor;
S2: the coherence protocol controller of the requesting node looks up the address tag of the miss request in the coherence state memory and, according to the corresponding coherence state, sends a data request to the L2 cache of the home node;
S3: the directory controller of the active directory cache in the network interface unit of the home node's router captures the data request, looks up the corresponding directory information in its directory memory, and then decides according to the request type whether to send the data request to the home node's L2 cache;
S4: the L2 cache of the home node returns a message to the requesting node's processor according to the data request; when the message passes through the requesting node's router, the coherence protocol controller of that router captures it, changes or keeps the coherence state of the addressed data in the coherence state cache according to the message type, and returns the requested data contained in the message to the requesting node's processor.
Step S2 specifically comprises the following cases, according to the type of the miss request:
Read request: allocate a cache line for the request address in the coherence state memory of the requesting node's coherence state cache, set the coherence state to the transient state IS, and forward the request to the L2 cache of the home node; the transient state IS indicates that the read request has not yet completed and is waiting for the data response from that L2 cache;
Write request: if the request misses in the requesting node's coherence state cache, allocate a cache line for the address, set the coherence state to the transient state IM, and forward the write request to the L2 cache of the home node; the transient state IM indicates that the write request has not yet completed and is waiting for the write response from that L2 cache. If the request hits in the coherence state cache in the shared state, set the coherence state to IM and send a write-update request to the L2 cache of the home node. If it hits in the modified state, return a write response directly to the requesting node's processor; the state in the coherence state cache does not change;
Update request: set the coherence state in the requesting node's coherence state cache to IM, then send the update request to the home node;
Replacement and write-back requests: the requesting node's coherence state cache forwards the request directly to the L2 cache of the home node;
Replacement of a coherence state cache line: when a capacity conflict forces the requesting node's coherence state cache to replace a line, it sends an invalidation signal to the requesting node's processor, whose private L1 cache then returns an invalidation response or a write-back message according to its state. After the coherence state cache receives the processor's invalidation response or write-back message, it sends a replacement or write-back request to the L2 cache of the home node, and deletes the cache line from the coherence state cache once the home node's replacement response or write-back response arrives.
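The requesting-node cases of step S2 amount to a small transition table keyed on the miss type and the line's current state. The following sketch paraphrases them; the state and message names are our shorthand, not a literal encoding from the patent.

```python
def csc_on_l1_miss(state: str, request: str):
    """Coherence state cache reaction to an L1 miss at the requesting node.
    state: current CSC state for the line ("I", "S", or "M");
    request: "read", "write", "update", "replace", or "writeback".
    Returns (new CSC state, message sent to the home node, or None)."""
    if request == "read":
        return "IS", "read"              # IS: read outstanding, awaiting data
    if request == "write":
        if state == "S":
            return "IM", "write_update"  # shared copy: ask the home to update
        if state == "M":
            return "M", None             # already modified: ack locally
        return "IM", "write"             # miss: IM until the write response
    if request == "update":
        return "IM", "update"
    if request in ("replace", "writeback"):
        return state, request            # forwarded to the home node as-is
    raise ValueError(f"unknown request {request!r}")
```

Note how the only case that never leaves the node is a write hit in the modified state, which is exactly the case the text says is acknowledged locally.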
Step S3 specifically comprises the following cases, according to the type of the request:
Read request: if the request hits in the home node's active directory cache, add the requesting node's bit to the directory vector of the directory entry. If the directory state is shared, send a read request to the home node's L2 cache and, after obtaining its data response, forward the data to the requesting node's processor. If the directory state is modified, send a downgrade write-back request to the sharing node holding the data; when the directory controller receives the written-back data, it forwards the data to the requesting node's processor, writes the data back to the home node's L2 cache, and changes the directory state to shared. If the request misses in the active directory cache, add a directory entry, send a read request to the home node's L2 cache, forward the returned data to the requesting node's processor, and set the directory state to shared;
Write request: if the request hits in the home node's active directory cache in the shared state, send invalidation signals to the processors of all sharing nodes and a read request to the home node's L2 cache; after the directory controller collects all invalidation responses, it deletes the sharing nodes' bits from the directory vector, forwards the data response returned by the L2 cache to the requesting node's processor, changes the directory state in the active directory cache to modified, and adds the requesting node's bit to the directory vector. If the entry is in the modified state, send an invalidate-and-write-back request to the processor of the sharing node; when the directory controller receives the written-back data, it deletes the corresponding node's bit from the directory vector, forwards the data to the requesting node, and adds the requesting node's bit to the directory vector. If the request misses in the active directory cache, add a directory entry, send a read request to the home node's L2 cache, forward the returned data to the requesting node, change the directory state to modified, and add the requesting node's bit to the directory vector;
Replacement request: delete the to-be-replaced requesting node's bit from the home node's directory vector and return a replacement response to the requesting node; if it was the only sharing node, delete the directory entry from the home node's active directory cache;
Write-back request: delete the node's bit from the home node's directory vector, write the data back to the home node's L2 cache, return a write-back response to the requesting node, and delete the directory entry from the home node's active directory cache;
Replacement of an active directory cache entry: when a capacity conflict forces the home node's active directory cache to replace an entry, it sends invalidation requests to all sharing nodes. If the entry's directory state is shared, the directory controller deletes the directory entry after collecting all invalidation responses; if the state is modified, the controller writes the returned data back to the home node's L2 cache after receiving the write-back, then deletes the corresponding directory entry;
Invalidation request from the L2 cache: when an invalidation request from the home node's L2 cache arrives, if it misses in the node's active directory cache, an invalidation response is returned to the L2 cache directly; if it hits, the active directory cache performs the replacement operation above and, once it completes, returns an invalidation response or a write-back signal to the L2 cache and deletes the directory entry from the active directory cache.
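Step S3 can likewise be compressed into a decision function at the home node's directory controller. The sketch below covers the read and write cases; the action names and tuple shapes are illustrative assumptions, not the patent's interfaces.

```python
def adc_on_request(entry, req_type: str, requester: int):
    """Home-node active directory cache decision for a read or write request.
    entry: None on an ADC miss, else (directory_state, sharer_set);
    returns (actions, new directory state, new sharer set)."""
    if req_type == "read":
        if entry is None:                     # miss: allocate, read L2
            return ["alloc_entry", "read_l2", "fwd_data"], "S", {requester}
        state, sharers = entry
        if state == "S":                      # read L2, add the requester
            return ["read_l2", "fwd_data"], "S", sharers | {requester}
        # state == "M": downgrade the owner, write its data back to L2
        return (["downgrade_owner", "fwd_data", "writeback_l2"],
                "S", sharers | {requester})
    if req_type == "write":
        if entry is None:
            return ["alloc_entry", "read_l2", "fwd_data"], "M", {requester}
        state, sharers = entry
        if state == "S":                      # invalidate every sharer first
            return (["invalidate_sharers", "read_l2", "fwd_data"],
                    "M", {requester})
        # state == "M": invalidate the owner and forward its written-back data
        return ["invalidate_owner", "fwd_data"], "M", {requester}
    raise ValueError(f"unknown request {req_type!r}")
```

Every write path leaves the requester as the sole entry in the sharer set with the directory state modified, matching the invariant the text maintains case by case.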
Step S4 specifically comprises the following cases, according to the type of the response (or request) involved:
Read response: change the IS state in the requesting node's coherence state cache to the shared state, and return the data to the requesting node's processor;
Write response and update response: change the IM state in the requesting node's coherence state cache to the modified state, and return the write response or update response to the requesting node's processor;
Replacement response and write-back response: delete the cache line at the address from the requesting node's coherence state cache, and forward the response to the requesting node's processor;
Invalidation request: when the requesting node's coherence state cache receives an invalidation request from the home node's L2 cache, it forwards the request directly to the requesting node's processor;
Invalidation response: when the requesting node's coherence state cache receives an invalidation response from the requesting node's processor, it deletes the corresponding cache line and forwards the invalidation response to the home node.
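Symmetrically, the step S4 cases reduce to a response-side transition at the requesting node's coherence state cache. A sketch under the same shorthand as above, where "I" stands for the line being deleted from the cache:

```python
def csc_on_response(state: str, response: str):
    """Requesting-node CSC transition when a home-node message passes back
    through the local router.  Returns (new CSC state, what is forwarded
    to the processor); "I" means the line is deleted from the CSC."""
    if response == "read_reply":             # data for an outstanding read
        assert state == "IS", "read reply expects transient IS"
        return "S", "data"
    if response in ("write_reply", "update_reply"):
        assert state == "IM", "write/update reply expects transient IM"
        return "M", response
    if response in ("replace_reply", "writeback_reply"):
        return "I", response                 # line removed, reply forwarded
    if response == "invalidate_request":     # from the home node's L2
        return state, response               # forwarded to the processor
    raise ValueError(f"unknown response {response!r}")
```

The transient states IS and IM exist precisely so this handler can tell which stable state to settle into when the reply arrives.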
(3) Beneficial effects
By integrating a coherence state cache and an active directory cache in the network interface unit of the network-on-chip, the cache-coherent network-on-chip system proposed by the invention separates coherence maintenance from the processor and directory maintenance from the L2 cache, and eliminates the directory structure in the L2 cache. This simplifies the design and verification of the multicore processor, reduces the chip's storage and latency overhead, and improves the performance of the multicore processor.
Description of drawings
Fig. 1 is a structural diagram of the cache-coherent network-on-chip system of an embodiment of the invention;
Fig. 2 is a structural diagram of the coherence state cache in the cache-coherent network-on-chip system of an embodiment of the invention;
Fig. 3 is a structural diagram of the active directory cache in the cache-coherent network-on-chip system of an embodiment of the invention;
Fig. 4 is a flow chart of the data request method using the above system according to an embodiment of the invention.
Embodiments
The specific embodiments of the present invention are described in further detail below with reference to the drawings and examples. The following examples illustrate the invention but do not limit its scope.
In a multicore processor, cache coherence protocol operations are transported by the network-on-chip. A private cache (L1) miss request of the multicore processor is sent into the network-on-chip through the network interface unit; the reply message is likewise carried by the network-on-chip to the requesting node's router and returned to the processor through the network interface unit. Accesses to the directory and data in the shared L2 cache also pass through the network interface unit, as do the shared L2 cache's responses and invalidation messages to the processors' private caches, which are injected into the network-on-chip through the network interface unit and delivered to the corresponding processor. The network interface unit can therefore observe all coherence protocol operations in the system and process them further.
To separate coherence state maintenance from the processor, the invention adds a coherence state cache to the network interface unit, as shown in Fig. 1. The coherence state cache stores and maintains the coherence state of the data blocks in the local private L1 cache, so the processor's L1 cache operates in its own way without concern for coherence maintenance. The coherence state cache thus decouples the coherence protocol from the processor and is compatible with multicore processors using different coherence protocols.
As shown in Fig. 2, the coherence state cache comprises:
A coherence state memory, with the same number of storage lines as the L1 cache; each storage line stores the address tag and coherence state of an L1 line, preserving the coherence state of the L1 data blocks.
A processor interface, connected to the multicore processor, which extracts the request signals needed by the coherence state cache from the bus requests of the different processor core types, and converts the coherence state cache's responses or requests to the multicore processor into signals the processor can recognize.
A coherence protocol controller which, when an L1 miss request or a response passes through the network interface unit, obtains the request or response via the processor interface, extracts the address tag from it, and maintains the corresponding coherence state.
To separate directory storage and maintenance from the L2 cache, an active directory cache connected to the L2 cache is also added to the network interface unit, as shown in Fig. 1. The active directory cache stores and maintains the directory information of the L2 data blocks frequently accessed by the L1 caches, while the directory storage space and directory maintenance work in the L2 cache itself are eliminated. The active directory cache reduces the directory storage overhead and decouples the coherence protocol from the L2 cache; it also reduces directory access latency, improving system performance.
As shown in Fig. 3, the active directory cache comprises:
A directory memory, which caches the directory information of the L2 data blocks frequently accessed by the L1 caches; each storage line consists of the address tag, directory state, and directory vector of a data block. The purpose of the directory vector is to track which L1 caches hold the address; it reserves one sharing bit for every processor core that contains a private L1 cache.
An L2 cache interface, connected to the L2 cache, which forwards the multicore processor's miss requests to the L2 cache after they have accessed the active directory cache, and returns the L2 cache's responses to the active directory cache.
A directory controller, which obtains the multicore processor's L2 access requests via the L2 cache interface, looks up the directory information in the directory memory, and then decides according to the request type whether to send the request to the local L2 cache.
Compared with a traditional network-on-chip, the cache-coherent network-on-chip structure adds the coherence state cache and the active directory cache, both implemented in the network interface unit. The coherence state cache is the interface between the router and the processor; the active directory cache is the interface between the router and the L2 cache. Coherence state maintenance and directory maintenance are carried out by the coherence state cache and the active directory cache respectively. The main function of the network interface unit is to pack outgoing data and unpack incoming data; because the coherence state cache and the active directory cache work concurrently with the packing and unpacking units of the network interface, their access latency is hidden. This design does not change the router structure and has no effect on the topology or the routing algorithm, which increases the adaptability and flexibility of the cache-coherent network-on-chip structure. Through this structure the invention allows multicore processors and L2 caches to be attached directly to the network-on-chip, achieving seamless integration: the coherence protocol is transparent to both the multicore processor and the L2 cache.
The invention also discloses a data request method using the above system, by which the processor of a requesting node can request data from the L2 cache of a home node on the network. The requesting node is the node containing the processor that issues the data request; the home node is the node on the network whose L2 cache holds the data. As shown in Fig. 4, the method comprises:
Step S401: the coherence protocol controller of the requesting node captures an L1 miss request from the requesting node's processor.
Step S402: the coherence protocol controller of the requesting node looks up the address tag of the miss request in the coherence state memory and, according to the corresponding coherence state, sends a data request to the L2 cache of the home node.
Step S403: the directory controller of the active directory cache in the network interface unit of the home node's router captures the data request, looks up the corresponding directory information in the directory memory, and then decides according to the request type whether to send the data request to the home node's L2 cache.
Step S404: the L2 cache of the home node returns a message to the requesting node's processor according to the data request; when the message passes through the requesting node's router, the coherence protocol controller of that router captures it, changes or keeps the coherence state of the addressed data in the coherence state cache according to the message type, and returns the requested data contained in the message to the requesting node's processor.
The operating principle of the on-chip network system of the invention is as follows:
In the cache-coherent network-on-chip system described above, a processor's memory access miss request enters the coherence state cache as it passes through the network interface unit. The coherence protocol controller first looks up the address tag of the L1 request address in the coherence state cache and, according to the corresponding coherence state information, sends a request to the L2 cache of the home node; if the address tag is not present in the coherence state cache, it is added. The response the L2 cache returns to the processor is intercepted by the coherence protocol controller, which changes or keeps the coherence state in the coherence state cache according to the response type and forwards the requested data to the processor, completing one miss-handling operation. The L2 cache's data block invalidation requests to the processor and the processor's invalidation responses are likewise intercepted by the coherence protocol controller, which changes the coherence state accordingly. The specific coherence protocol operations are as follows:
Read request: a cache line is allocated for the request address in the coherence state memory of the requesting node's coherence state cache and its coherence state is set to IS (IS is a transient state indicating that the read request has not yet completed and is waiting for the data response from the L2 cache), and the request is forwarded to the home node's L2 cache.
Write request: if the requesting node's coherence state cache misses, a cache line is allocated for the address and its coherence state is set to IM (IM is likewise a transient state, indicating that the write request has not yet completed and is waiting for the write response from the L2 cache), and the write request is forwarded to the home node's L2 cache. If the coherence state cache hits in the Shared (S) state, the state is set to IM and a write-update (Update) request is sent to the home node. If it hits in the Modified (M) state, a write acknowledgment is returned directly to the requesting node's processor and the state in the coherence state cache is left unchanged.
Update request: the coherence state in the requesting node's coherence state cache is set to IM, and an update request is then sent to the home node. When an update request is received from the multi-core processor, the coherence state cache should be in the Shared (S) state.
Replacement and writeback requests: the requesting node's coherence state cache forwards the request directly to the home node's L2 cache. When a replacement request is received from the multi-core processor, the coherence state cache should be in the Shared (S) state; when a writeback request is received, it should be in the Modified (M) state.
Read response: the IS state in the requesting node's coherence state cache is changed to Shared (S), and the data is returned to the requesting node's processor.
Write response and update response: the IM state in the requesting node's coherence state cache is changed to Modified (M), and the write or update response is returned to the requesting node's processor.
Replacement response and writeback response: the cache line for the address is deleted from the requesting node's coherence state cache, and the acknowledgment is forwarded to the requesting node's processor.
Invalidation request: when the requesting node's coherence state cache receives an invalidation request from the home node's L2 cache, it forwards the request directly to the requesting node's processor.
Invalidation acknowledgment: when the requesting node's coherence state cache receives an invalidation acknowledgment from the multi-core processor, it deletes the corresponding cache line and forwards the acknowledgment to the home node.
Replacement (eviction): when a replacement occurs in the requesting node's coherence state cache because of a capacity conflict, an invalidation signal is sent to the requesting node's processor; the processor's private L1 cache then sends either an invalidation acknowledgment or a writeback message to the home node according to its state. After the requesting node's coherence state cache receives the invalidation acknowledgment or writeback message from the requesting node's processor, it sends a replacement or writeback request to the home node. Only after the replacement acknowledgment or writeback acknowledgment from the home node arrives is the cache line deleted from the requesting node's coherence state cache.
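The coherence state cache transitions above can be sketched as a small state machine. The model below is a minimal illustration under the patent's stated states (I, IS, IM, S, M); the class and method names, and the message tuples, are assumptions made for the sketch, not an API defined by the patent.

```python
# A minimal model of the coherence state cache (CSC) transitions described
# above. States: "I" (line absent), "IS"/"IM" (transient), "S" (Shared),
# "M" (Modified). Messages to the home node are recorded, not sent.

class CoherenceStateCache:
    def __init__(self):
        self.lines = {}          # address tag -> coherence state
        self.to_home = []        # messages destined for the home node

    # ---- requests coming from the local processor ----
    def read_miss(self, tag):
        # Allocate a line in transient state IS and forward the read
        # request to the home node's L2 cache.
        self.lines[tag] = "IS"
        self.to_home.append(("READ", tag))

    def write(self, tag):
        state = self.lines.get(tag, "I")
        if state == "I":
            # Miss: allocate in transient state IM, forward the write.
            self.lines[tag] = "IM"
            self.to_home.append(("WRITE", tag))
        elif state == "S":
            # Shared hit: go to IM and send a write-update request.
            self.lines[tag] = "IM"
            self.to_home.append(("UPDATE", tag))
        elif state == "M":
            # Modified hit: acknowledge locally; state is unchanged.
            pass

    # ---- responses arriving from the home node ----
    def response(self, kind, tag):
        if kind == "READ_RESP":                       # IS -> S
            self.lines[tag] = "S"
        elif kind in ("WRITE_RESP", "UPDATE_RESP"):   # IM -> M
            self.lines[tag] = "M"
        elif kind in ("REPL_RESP", "WB_RESP"):        # line removed
            self.lines.pop(tag, None)
```

A read miss thus passes through IS before settling in S, and any write passes through IM before settling in M, matching the transient-state discipline described above.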
In the above cache-coherent network-on-chip, the active directory cache holds the directory information of recently and frequently accessed data. All read and write miss requests from the multi-core processors result in accesses to the home node's L2 cache, and these accesses are captured by the active directory cache on the network interface. The directory controller of the active directory cache first looks up the directory information in the directory memory and then decides, according to the request type, whether to send a read or write request to the local L2 cache. It works as follows:
Read request: if the home node's active directory cache hits, the requesting node's bit is added to the directory vector. If the directory state is Shared (S), a read-data request is sent to the home node's L2 cache; after the L2 cache's data response is obtained, the data is forwarded to the requesting node's processor, completing the read. If the directory state is Modified (M), a downgrade-and-writeback request is sent to the sharing node that holds the data; when the directory controller receives the written-back data, it forwards that data to the requesting node's processor and writes it back to the home node's L2 cache, and the directory state becomes Shared (S). If the home node's active directory cache misses, a directory entry is added to the active directory cache, a read request is then sent to the home node's L2 cache, and after the cached data response is obtained, the requested data is forwarded to the requesting node's processor; the directory state becomes Shared (S).
Write request: if the home node's active directory cache hits in the Shared (S) state, an invalidation signal is sent to the processors of all sharing nodes, and a read request is sent to the home node's L2 cache. A write request is in fact also a read operation: a write instruction targets a single word while read and write requests operate on a whole cache line, so the write must first read the entire cache line, deliver it to the requesting node, and merge it with the written content into a new cache line. After the directory controller has collected all invalidation acknowledgments, it deletes the corresponding sharing-node bits from the directory vector, forwards the data response returned from the L2 cache to the requesting node's processor, changes the directory state in the home node's active directory cache to Modified (M), and adds the requesting node's bit to the directory vector. If the home node's active directory cache is in the Modified (M) state, an invalidate-and-writeback request is sent to the sharing node's processor, because the copy held by that node is at this point the only correct copy and must be written back. When the directory controller receives the written-back data, it deletes that node's bit from the directory vector, forwards the data to the requesting node, and adds the requesting node's bit to the directory vector. If the home node's active directory cache misses, a directory entry is added to the active directory cache, a read request is sent to the home node's L2 cache, and after the data response is obtained, the requested data is forwarded to the requesting node; the directory state becomes Modified (M) and the requesting node's bit is added to the directory vector.
Replacement request: the bit of the requesting node to be replaced is deleted from the directory vector, and a replacement acknowledgment is returned to the requesting node. If it was the only sharing node, the directory vector is deleted from the home node's active directory cache.
Writeback request: the node's bit is deleted from the home node's directory vector, the data is written back to the home node's L2 cache, a writeback acknowledgment is returned to the requesting node, and the directory vector is then deleted from the home node's active directory cache.
Replacement (eviction): when the home node's active directory cache performs a replacement because of a capacity conflict, invalidation requests are sent to all sharing nodes. If the directory state in the directory memory is Shared (S), the directory controller deletes the directory vector from the active directory cache after collecting all invalidation acknowledgments. If the directory state is Modified (M), the directory controller, upon receiving the written-back data, writes it back to the home node's L2 cache and then deletes the corresponding directory vector.
When an invalidation request from the home node's L2 cache is received, if the node's active directory cache misses, an invalidation acknowledgment is returned directly to the L2 cache. If the active directory cache hits, the active directory cache performs the replacement operation; after the replacement completes, an invalidation acknowledgment or a writeback signal is returned to the L2 cache, and the directory vector is deleted from the active directory cache.
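The directory controller's read/write handling above can likewise be sketched as a small model. This is a minimal illustration under the directory states the patent describes (Shared, Modified, plus a sharer vector); class, method, and message names are assumptions made for the sketch.

```python
# A minimal model of the active directory cache (ADC) read/write handling
# described above. Each entry holds a directory state ("S" or "M") and a
# sharer set (the directory vector). Messages to processors and to the
# local L2 cache are recorded rather than sent.

class ActiveDirectoryCache:
    def __init__(self):
        self.entries = {}        # tag -> [state, set of sharer node ids]
        self.msgs = []           # (destination, message, tag)

    def read(self, tag, node):
        entry = self.entries.get(tag)
        if entry is None:
            # Miss: add an entry, read from the local L2; state becomes S.
            self.entries[tag] = ["S", {node}]
            self.msgs.append(("L2", "READ", tag))
        elif entry[0] == "S":
            entry[1].add(node)
            self.msgs.append(("L2", "READ", tag))
        else:  # "M": downgrade-and-writeback from the owner, then share
            (owner,) = entry[1]
            self.msgs.append((owner, "DOWNGRADE_WB", tag))
            entry[0] = "S"
            entry[1].add(node)

    def write(self, tag, node):
        entry = self.entries.get(tag)
        if entry is None:
            # Miss: a write still reads the whole line from the L2 first.
            self.entries[tag] = ["M", {node}]
            self.msgs.append(("L2", "READ", tag))
        elif entry[0] == "S":
            # Invalidate every sharer, then fetch the line from the L2.
            for sharer in entry[1]:
                self.msgs.append((sharer, "INVALIDATE", tag))
            self.msgs.append(("L2", "READ", tag))
            entry[0] = "M"
            entry[1] = {node}
        else:  # "M": invalidate-and-writeback from the sole owner
            (owner,) = entry[1]
            self.msgs.append((owner, "INV_WB", tag))
            entry[1] = {node}
```

The model shows the key invariant: in the Modified state the directory vector holds exactly one owner, while in the Shared state it may hold any number of sharers.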
The above embodiments are only used to illustrate the present invention and do not limit it. Those of ordinary skill in the relevant technical field may make various changes and modifications without departing from the spirit and scope of the present invention; therefore all equivalent technical solutions also fall within the scope of the present invention, whose patent protection shall be defined by the claims.

Claims (8)

1. A network-on-chip system supporting cache coherence, comprising: a network interface unit, the network interface unit connecting to a router and connecting to a multi-core processor and a second-level (L2) cache, characterized in that a coherence state cache connected to the multi-core processor is additionally provided in the network interface unit, the coherence state cache being used to store and maintain the coherence state of the data blocks in the first-level (L1) cache of each core of the multi-core processor.
2. The cache-coherent network-on-chip system of claim 1, characterized in that the coherence state cache comprises:
a coherence state memory, having the same number of storage lines as the L1 cache, used to store the coherence state of the data blocks in the L1 cache;
a processor interface, connected to the multi-core processor, used to extract the request signals needed by the coherence state cache from the bus requests of different types of processor cores, and to convert responses or request signals from the coherence state cache into signals the multi-core processor can recognize;
a coherence protocol controller, used, when an access miss request or response of the multi-core processor's L1 cache passes through the network interface unit, to obtain the miss request or response via the processor interface, extract the address tag from it, and maintain the corresponding coherence state.
3. The cache-coherent network-on-chip system of claim 2, characterized in that the network interface unit further comprises an active directory cache connected to the L2 cache, used to cache and maintain the directory information of the data blocks in the L2 cache frequently accessed by the L1 cache.
4. The cache-coherent network-on-chip system of claim 3, characterized in that the active directory cache comprises:
a directory memory, used to cache the directory information of the data blocks in the L2 cache frequently accessed by the L1 cache, comprising the address tag, directory state, and directory vector of the data blocks;
an L2 cache interface, connected to the L2 cache, used to send the multi-core processor's access miss requests to the L2 cache, or to return the L2 cache's acknowledgments to the active directory cache;
a directory controller, used to obtain the multi-core processor's access requests to the L2 cache via the L2 cache interface, look up the directory information in the directory memory, and then decide according to the request type whether to send the access request to the local L2 cache.
5. A data request method using the system of claim 4, characterized by comprising the following steps:
S1: the coherence protocol controller of the requesting node captures an L1 cache access miss request from the requesting node's processor;
S2: the coherence protocol controller of the requesting node looks up the address tag of the miss request in the coherence state memory and sends a data request to the home node's L2 cache according to the corresponding coherence state;
S3: the directory controller of the active directory cache in the network interface unit of the home node's router captures the data request, looks up the directory information corresponding to the data request in the directory memory, and then decides according to the request type whether to send the data request to the home node's L2 cache;
S4: the home node's L2 cache returns a message to the requesting node's processor according to the data request; when the message passes through the requesting node's router, that router's coherence protocol controller captures the message, changes or preserves the coherence state of the data at the address tag in the coherence state cache according to the message type, and returns the requested data contained in the message to the requesting node's processor.
6. The data request method of claim 5, characterized in that step S2 specifically comprises: when the miss request is:
a read request: allocating a cache line for the request address in the coherence state memory of the requesting node's coherence state cache, setting the coherence state to the transient state IS, and simultaneously forwarding the request to the home node's L2 cache, the transient state IS indicating that the read request has not yet completed and is waiting for the data response from the L2 cache;
a write request: if the requesting node's coherence state cache misses, allocating a cache line for the address, setting the coherence state to the transient state IM, and forwarding the write request to the home node's L2 cache, the transient state IM indicating that the write request has not yet completed and is waiting for the write response from the L2 cache; if the requesting node's coherence state cache hits in the Shared state, setting the coherence state to IM and sending a write-update request to the home node's L2 cache; if it hits in the Modified state, returning a write acknowledgment directly to the requesting node's processor without changing the state in the coherence state cache;
an update request: setting the coherence state in the requesting node's coherence state cache to IM, then sending an update request to the home node;
a replacement or writeback request: the requesting node's coherence state cache forwards the request directly to the home node's L2 cache;
a replacement (eviction): when a replacement occurs in the requesting node's coherence state cache because of a capacity conflict, sending an invalidation signal to the requesting node's processor, whereupon the processor's private L1 cache sends an invalidation acknowledgment or a writeback message to the home node according to its state; after the requesting node's coherence state cache receives the invalidation acknowledgment or writeback message from the requesting node's processor, it sends a replacement or writeback request to the home node's L2 cache, and only after the replacement acknowledgment or writeback acknowledgment from the home node arrives is the cache line deleted from the requesting node's coherence state cache.
7. The data request method of claim 5, characterized in that step S3 specifically comprises: when the miss request is:
a read request: if the home node's active directory cache hits, adding the requesting node's bit to the directory vector of the directory information; if the directory state is Shared, sending a read-data request to the home node's L2 cache and, after obtaining the L2 cache's data response, forwarding the data to the requesting node's processor; if the directory state is Modified, sending a downgrade-and-writeback request to the sharing node that holds the data, and, when the directory controller receives the written-back data, forwarding it to the requesting node's processor and writing it back to the home node's L2 cache, the directory state becoming Shared; if the home node's active directory cache misses, adding a directory entry to the active directory cache, then sending a read request to the home node's L2 cache, and after obtaining the cached data response, forwarding the requested data to the requesting node's processor, the directory state becoming Shared;
a write request: if the home node's active directory cache hits in the Shared state, sending an invalidation signal to the processors of all sharing nodes and a read request to the home node's L2 cache; after the directory controller has collected all invalidation acknowledgments, deleting the corresponding sharing-node bits from the directory vector, forwarding the data response returned from the L2 cache to the requesting node's processor, changing the directory state of the home node's active directory cache to Modified, and adding the requesting node's bit to the directory vector; if the home node's active directory cache is in the Modified state, sending an invalidate-and-writeback request to the sharing node's processor, and, when the directory controller receives the written-back data, deleting that node's bit from the directory vector, forwarding the data to the requesting node, and adding the requesting node's bit to the directory vector; if the home node's active directory cache misses, adding a directory entry to the active directory cache, sending a read request to the home node's L2 cache, and after obtaining the L2 cache's data response, forwarding the requested data to the requesting node, the directory state becoming Modified and the requesting node's bit being added to the directory vector;
a replacement request: deleting the bit of the requesting node to be replaced from the home node's directory vector and returning a replacement acknowledgment to the requesting node; if it was the only sharing node, deleting the directory vector from the home node's active directory cache;
a writeback request: deleting the node's bit from the home node's directory vector, writing the data back to the home node's L2 cache, returning a writeback acknowledgment to the requesting node, and then deleting the directory vector from the home node's active directory cache;
a replacement (eviction): when the home node's active directory cache performs a replacement because of a capacity conflict, sending invalidation requests to all sharing nodes; if the directory state in the directory memory is Shared, the directory controller deletes the directory vector from the active directory cache after collecting all invalidation acknowledgments; if the directory state is Modified, the directory controller, upon receiving the written-back data, writes it back to the home node's L2 cache and then deletes the corresponding directory vector;
when an invalidation request from the home node's L2 cache is received: if the node's active directory cache misses, an invalidation acknowledgment is returned directly to the L2 cache; if the active directory cache hits, the active directory cache performs the replacement operation and, after the replacement completes, returns an invalidation acknowledgment or a writeback signal to the L2 cache and deletes the directory vector from the active directory cache.
8. The data request method of claim 5, characterized in that step S4 specifically comprises: when the response corresponding to the miss request is:
a read response: changing the IS state in the requesting node's coherence state cache to Shared and returning the data to the requesting node's processor;
a write response or update response: changing the IM state in the requesting node's coherence state cache to Modified and returning the write or update response to the requesting node's processor;
a replacement response or writeback response: deleting the cache line for the address from the requesting node's coherence state cache and forwarding the acknowledgment to the requesting node's processor;
an invalidation request: when the requesting node's coherence state cache receives an invalidation request from the home node's L2 cache, forwarding it directly to the requesting node's processor;
an invalidation acknowledgment: when the requesting node's coherence state cache receives an invalidation acknowledgment from the requesting node's processor, deleting the corresponding cache line and forwarding the invalidation acknowledgment to the home node.
CN2010102940174A 2010-09-27 2010-09-27 On-chip network system supporting cache coherence and data request method Expired - Fee Related CN101958834B (en)


Publications (2)

Publication Number Publication Date
CN101958834A true CN101958834A (en) 2011-01-26
CN101958834B CN101958834B (en) 2012-09-05

Family

ID=43485952



Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1171159A (en) * 1994-12-23 1998-01-21 英特尔公司 Cache coherent multiprocessing computer system with reduced power operating features
CN101458665A (en) * 2007-12-14 2009-06-17 扬智科技股份有限公司 Second level cache and kinetic energy switch access method
CN101694639A (en) * 2009-10-15 2010-04-14 清华大学 Computer data caching method


Cited By (55)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102591800B (en) * 2011-12-31 2015-01-07 龙芯中科技术有限公司 Data access and storage system and method for weak consistency storage model
CN102591800A (en) * 2011-12-31 2012-07-18 龙芯中科技术有限公司 Data access and storage system and method for weak consistency storage model
CN102662885A (en) * 2012-04-01 2012-09-12 天津国芯科技有限公司 Device and method for maintaining second-level cache coherency of symmetrical multi-core processor
CN102662885B (en) * 2012-04-01 2015-09-23 天津国芯科技有限公司 Symmetrical multi-core processor safeguards the conforming devices and methods therefor of L2 cache
CN102819498A (en) * 2012-08-15 2012-12-12 上海交通大学 Method of constructing consistency protocol of cache, many-core processor and network interface unit
CN102819498B (en) * 2012-08-15 2015-01-07 上海交通大学 Method of constructing consistency protocol of cache, many-core processor and network interface unit
CN103885890A (en) * 2012-12-21 2014-06-25 华为技术有限公司 Replacement processing method and device for cache blocks in caches
CN103885890B (en) * 2012-12-21 2017-04-12 华为技术有限公司 Replacement processing method and device for cache blocks in caches
CN103440223B (en) * 2013-08-29 2017-04-05 西安电子科技大学 A kind of hierarchical system and its method for realizing cache coherent protocol
CN103440223A (en) * 2013-08-29 2013-12-11 西安电子科技大学 Layering system for achieving caching consistency protocol and method thereof
CN105580308B (en) * 2013-09-06 2018-12-28 萨基姆防卫安全 Method for managing cache coherence
CN105580308A (en) * 2013-09-06 2016-05-11 萨基姆防卫安全 Method of managing consistency of caches
CN104462007A (en) * 2013-09-22 2015-03-25 中兴通讯股份有限公司 Method and device for achieving cache consistency between multiple cores
CN104462007B (en) * 2013-09-22 2018-10-02 南京中兴新软件有限责任公司 The method and device of buffer consistency between realization multinuclear
CN104360981A (en) * 2014-11-12 2015-02-18 浪潮(北京)电子信息产业有限公司 Design method of multi-core multiprocessor platform orientated Cache consistency protocol
CN104360981B (en) * 2014-11-12 2017-09-29 浪潮(北京)电子信息产业有限公司 Towards the design method of the Cache coherence protocol of multinuclear multi processor platform
CN105740164A (en) * 2014-12-10 2016-07-06 阿里巴巴集团控股有限公司 Multi-core processor supporting cache consistency, reading and writing methods and apparatuses as well as device
US10409723B2 (en) 2014-12-10 2019-09-10 Alibaba Group Holding Limited Multi-core processor supporting cache consistency, method, apparatus and system for data reading and writing by use thereof
CN105740164B (en) * 2014-12-10 2020-03-17 阿里巴巴集团控股有限公司 Multi-core processor supporting cache consistency, reading and writing method, device and equipment
CN106201980A (en) * 2015-01-21 2016-12-07 联发科技(新加坡)私人有限公司 Processing unit and processing method thereof
CN106155853B (en) * 2015-03-23 2018-09-14 龙芯中科技术有限公司 The verification method of processor IP, device and system
CN106155853A (en) * 2015-03-23 2016-11-23 龙芯中科技术有限公司 The verification method of processor IP, device and system
CN105488012A (en) * 2015-12-09 2016-04-13 浪潮电子信息产业股份有限公司 Consistency protocol design method based on exclusive data
CN105488012B (en) * 2015-12-09 2021-05-18 浪潮电子信息产业股份有限公司 Consistency protocol design method based on exclusive data
CN107038123A (en) * 2015-12-10 2017-08-11 Arm 有限公司 Snoop filter for the buffer consistency in data handling system
CN107038123B (en) * 2015-12-10 2021-11-30 Arm 有限公司 Snoop filter for cache coherency in a data processing system
CN107229593B (en) * 2016-03-25 2020-02-14 华为技术有限公司 Cache consistency operation method of multi-chip multi-core processor and multi-chip multi-core processor
CN107229593A (en) * 2016-03-25 2017-10-03 华为技术有限公司 The buffer consistency operating method and multi-disc polycaryon processor of multi-disc polycaryon processor
CN107341114B (en) * 2016-04-29 2021-06-01 华为技术有限公司 Directory management method, node controller and system
CN105915619A (en) * 2016-04-29 2016-08-31 中国地质大学(武汉) High-performance memory caching method for cyberspace information services considering access heat
CN105915619B (en) * 2016-04-29 2019-07-05 中国地质大学(武汉) High-performance memory caching method for cyberspace information services considering access heat
CN107341114A (en) * 2016-04-29 2017-11-10 华为技术有限公司 Directory management method, node controller and system
CN108804348A (en) * 2017-05-02 2018-11-13 迈络思科技有限公司 Calculating in parallel processing environment
CN108804348B (en) * 2017-05-02 2023-07-21 迈络思科技有限公司 Computing in a parallel processing environment
CN109213641B (en) * 2017-06-29 2021-10-26 展讯通信(上海)有限公司 Cache consistency detection system and method
CN109213641A (en) * 2017-06-29 2019-01-15 展讯通信(上海)有限公司 Cache consistency detection system and method
CN108694156B (en) * 2018-04-16 2021-12-21 东南大学 On-chip network traffic synthesis method based on cache consistency behavior
CN108694156A (en) * 2018-04-16 2018-10-23 东南大学 On-chip network traffic synthesis method based on cache consistency behavior
CN109684237A (en) * 2018-11-20 2019-04-26 华为技术有限公司 Data access method and device based on multi-core processor
CN109684237B (en) * 2018-11-20 2021-06-01 华为技术有限公司 Data access method and device based on multi-core processor
CN110225008A (en) * 2019-05-27 2019-09-10 四川大学 SDN network state consistency verification method under a kind of cloud environment
CN110225008B (en) * 2019-05-27 2020-07-31 四川大学 SDN network state consistency verification method in cloud environment
CN110647532A (en) * 2019-08-15 2020-01-03 苏州浪潮智能科技有限公司 Method and device for maintaining data consistency
CN112559433A (en) * 2019-09-25 2021-03-26 阿里巴巴集团控股有限公司 Multi-core interconnection bus, inter-core communication method and multi-core processor
CN112559433B (en) * 2019-09-25 2024-01-02 阿里巴巴集团控股有限公司 Multi-core interconnection bus, inter-core communication method and multi-core processor
CN113495854A (en) * 2020-04-03 2021-10-12 阿里巴巴集团控股有限公司 Method and system for implementing or managing cache coherence in a host-device system
CN111651375A (en) * 2020-05-22 2020-09-11 中国人民解放军国防科技大学 Method and system for realizing consistency of cache data of multi-path processor based on distributed finite directory
CN111930527A (en) * 2020-06-28 2020-11-13 绵阳慧视光电技术有限责任公司 Method for maintaining cache consistency of multi-core heterogeneous platform
CN111930527B (en) * 2020-06-28 2023-12-08 绵阳慧视光电技术有限责任公司 Method for maintaining cache consistency of multi-core heterogeneous platform
CN115514772A (en) * 2022-11-15 2022-12-23 山东云海国创云计算装备产业创新中心有限公司 Method, device and equipment for realizing cache consistency and readable medium
CN115514772B (en) * 2022-11-15 2023-03-10 山东云海国创云计算装备产业创新中心有限公司 Method, device and equipment for realizing cache consistency and readable medium
WO2024103907A1 (en) * 2022-11-15 2024-05-23 山东云海国创云计算装备产业创新中心有限公司 Method and apparatus for achieving cache consistency, device, and readable medium
CN116167310A (en) * 2023-04-25 2023-05-26 上海芯联芯智能科技有限公司 Method and device for verifying cache consistency of multi-core processor
CN118260304A (en) * 2024-05-29 2024-06-28 山东云海国创云计算装备产业创新中心有限公司 Request processing method, device, equipment and medium based on cache consistency directory
CN118260304B (en) * 2024-05-29 2024-08-13 山东云海国创云计算装备产业创新中心有限公司 Request processing method, device, equipment and medium based on cache consistency directory

Also Published As

Publication number Publication date
CN101958834B (en) 2012-09-05

Similar Documents

Publication Publication Date Title
CN101958834B (en) On-chip network system supporting cache coherence and data request method
CN100495361C (en) Method and system for maintenance of memory consistency
JP4123621B2 (en) Main memory shared multiprocessor system and shared area setting method thereof
JP4848771B2 (en) Cache coherency control method, chipset, and multiprocessor system
US6115804A (en) Non-uniform memory access (NUMA) data processing system that permits multiple caches to concurrently hold data in a recent state from which data can be sourced by shared intervention
CN103049422B (en) Method for building multi-processor node system with multiple cache consistency domains
EP0817074B1 (en) Multiprocessing system employing a three-hop communication protocol
EP0817070B1 (en) Multiprocessing system employing a coherency protocol including a reply count
CN101354682B (en) Apparatus and method for resolving directory access conflicts in a multiprocessor
JP2000227908A (en) Non-uniform memory access(numa) data processing system having shared intervention support
US9208091B2 (en) Coherent attached processor proxy having hybrid directory
CN102063406B (en) Network shared Cache for multi-core processor and directory control method thereof
US9009446B2 (en) Using broadcast-based TLB sharing to reduce address-translation latency in a shared-memory system with electrical interconnect
US20140250276A1 (en) Selection of post-request action based on combined response and input from the request source
JPH10171710A (en) Multi-process system for executing effective block copying operation
JPH10187645A (en) Multiprocess system constituted for storage in many subnodes of process node in coherence state
JPH10149342A (en) Multiprocess system executing prefetch operation
CN101635679B (en) Dynamic update of route table
US9183150B2 (en) Memory sharing by processors
CA2505259A1 (en) Methods and apparatus for multiple cluster locking
EP2771796B1 (en) A three channel cache-coherency socket protocol
US6965972B2 (en) Real time emulation of coherence directories using global sparse directories
KR19990085485A (en) Adaptive Granularity Method for Merging Micro and Coarse Communication in Distributed Shared Memory Systems
CN117290285A (en) On-chip cache consistency maintenance device and method of multi-core chiplet architecture
CN116795767A (en) Construction method of a multi-core shared Cache coherence protocol based on the CHI protocol

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120905

Termination date: 20210927

CF01 Termination of patent right due to non-payment of annual fee