WO2001063240A2 - Maintaining high snoop traffic throughput and preventing cache data eviction during an atomic operation - Google Patents
Maintaining high snoop traffic throughput and preventing cache data eviction during an atomic operation
- Publication number
- WO2001063240A2 (PCT/US2001/004147)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- cache
- address
- atomic
- entry
- request queue
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
- G06F12/0831—Cache consistency protocols using a bus scheme, e.g. with bus monitoring or watching means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0844—Multiple simultaneous or quasi-simultaneous cache accessing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/12—Replacement control
- G06F12/121—Replacement control using replacement algorithms
- G06F12/126—Replacement control using replacement algorithms with special data handling, e.g. priority of data or instructions, handling errors or pinning
Definitions
- the present invention relates to an apparatus and method for maintaining high snoop filtering throughput and preventing cache data eviction during an atomic operation.
- FIG. 1 illustrates, in block diagram form, a typical prior art multi-processor
- System 30 includes a number of Processors, 32a, 32b, 32c, coupled via a shared Bus 35 to Main Memory 36.
- Each Processor 32 has its own non-blocking Cache 34, which is N-way set associative.
- Each cache index includes data and a tag to identify the memory address with which the data is associated.
- coherency bits are associated with each item of data in the cache to indicate the cache coherency state of the data entry.
- each cache data entry can be in one of four states: M, O, S, or I. The I state indicates invalid data.
- The owned state, O, indicates that the data associated with a cache index is valid, has been modified from the version in memory, is owned by a particular cache, and that another cache may have a shared copy of the data.
- The processor with a requested line in the O state responds with data upon request from other processors.
- The shared state, S, indicates that the data associated with a cache index is valid, and one or more other processors share a copy of the data.
- The modified state, M, indicates valid data that has been modified since it was read into cache and that no other processor has a copy of the data.
- Cache coherency states help determine whether a cache access request is a miss or a hit.
- a cache hit occurs when one of the ways of a cache index includes a tag matching that of the requested address and the cache coherency state for that way does not indicate invalid data.
- a cache miss occurs when none of the tags of an index set matches that of the requested address or when the way with a matching tag contains invalid data.
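- As a rough illustration of the hit and miss rules just described, the following Python sketch models a single index of a set-associative cache. The names `State`, `Way`, and `lookup`, the four-way set, and the tag values are hypothetical and chosen only for illustration; none of them appear in the specification.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    M = "modified"
    O = "owned"
    S = "shared"
    I = "invalid"


@dataclass
class Way:
    tag: int
    state: State = State.I


def lookup(ways, tag):
    """Return True on a hit: some way has a matching tag and is not invalid."""
    return any(w.tag == tag and w.state is not State.I for w in ways)


# One index (set) of a hypothetical 4-way cache.
index_set = [Way(0x1A, State.S), Way(0x2B, State.I), Way(0x3C, State.M), Way(0x00)]

hit_shared = lookup(index_set, 0x1A)     # matching tag, valid state
miss_invalid = lookup(index_set, 0x2B)   # matching tag, but invalid data
miss_no_tag = lookup(index_set, 0x4D)    # no way holds this tag
```

- A matching tag alone is not sufficient: the second lookup misses because the way holding tag `0x2B` is in the I state.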
- Figure 2 illustrates how MOSI cache coherency states transition in response to various types of misses.
- The events causing transitions between MOSI states are indicated using the acronyms IST, ILD, FST and FLD.
- ILD indicates an Internal LoaD; i.e., a load request from the processor associated with the cache.
- IST indicates an Internal STore.
- FLD indicates that a Foreign LoaD caused the transition; i.e., a load request to the cache coming from a processor not associated with the cache.
- FST indicates a Foreign STore.
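- The text here does not reproduce the transitions drawn in Figure 2, so the sketch below encodes a conventional textbook MOSI protocol instead, which may differ in detail from the figure. The table `TRANSITIONS` and the function `next_state` are hypothetical names, and the assumption that unlisted (state, event) pairs leave the state unchanged is the sketch's own.

```python
# Textbook MOSI transitions keyed by (current state, event).
# Assumption: any pair not listed leaves the state unchanged.
TRANSITIONS = {
    ("I", "ILD"): "S",  # internal load miss brings the line in shared
    ("I", "IST"): "M",  # internal store miss brings the line in modified
    ("S", "IST"): "M",  # storing to a shared line upgrades it
    ("S", "FST"): "I",  # a foreign store invalidates the local copy
    ("O", "IST"): "M",
    ("O", "FST"): "I",
    ("M", "FLD"): "O",  # a foreign load turns a modified line into owned
    ("M", "FST"): "I",
}


def next_state(state, event):
    """Return the coherency state after `event` is applied to `state`."""
    return TRANSITIONS.get((state, event), state)


after_foreign_load = next_state("M", "FLD")
after_foreign_store = next_state("S", "FST")
```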
- snooping refers to the process by which a processor in a multi-processor system determines whether a foreign cache stores a desired item of data.
- A snoop represents a potential, future request for an eviction, e.g., an FLD or an FST, on a particular address.
- Each snoop indicates the desired address and operation. Every snoop is broadcast to every Processor 32 within System 30, but only one Processor 32 responds to each snoop.
- the responding Processor 32 is the one associated with the Cache 34 storing the data associated with the desired address.
- Each Processor 32 within System 30 includes an External Interface Unit (EIU), which handles snoop responses.
- FIG. 3 illustrates, in block diagram form, EIU 40 and its coupling to Bus 35 and Cache 34.
- EIU 40 receives snoops from Bus 35.
- EIU 40 forwards each snoop onto Cache Controller 42, which stores the snoop in Request Queue 46 until it can be filtered.
- Snoop filtering involves determining whether a snoop hits or misses in Cache 34 and indicating that to EIU 40.
- the latency between receipt of a snoop by EIU 40 and a response to it can be quite long under the best of circumstances. Snoop latency usually increases from its theoretical minimum in response to other pending cache access requests, such as a pending atomic operation, for example.
- An atomic operation refers to a computational task that should be completed without interruption.
- Processors 32 typically implement atomic operations as two sub-operations on a single address, one sub-operation on the address following the other without interruption.
- One atomic operation, for example, is an atomic load, which is a load followed immediately and without interruption by a store to the same address.
- Some processors cease filtering snoops, even though most snoops are for addresses other than that associated with the pending atomic operation. Two factors necessitate this approach.
- Cache includes a single data-and-tag read-write port, which, in response to a hit permits modification of both a cache line's data and tag.
- An embodiment of the present invention provides an apparatus that permits snoop filtering to continue while an atomic operation is being executed.
- the apparatus includes first and second request queues and a cache.
- the first request queue tracks cache access requests, while the second request queue tracks snoops that have yet to be filtered.
- the cache includes a dedicated port for each request queue.
- the first port is dedicated to the first request queue and is a data-and-tag port, permitting modification of cache contents.
- the second port is dedicated to the second request queue and is a tag-only port. Because the second port is a tag-only port, snoop filtering can continue during an atomic operation without fear of any modification of the data associated with the atomic address.
- the apparatus includes a first request queue, a second request queue, and an atomic address block.
- the first request queue stores an entry for each cache access request. Each entry includes a first set of address bits and an atomic bit.
- the first set of address bits represents a first cache address associated with the cache access request and the atomic bit indicates whether the cache access request is associated with the atomic operation.
- the second request queue stores an entry for each cache eviction request. Each entry of the second request queue includes a second set of address bits indicating a second cache address associated with the cache eviction request.
- the atomic address block prevents eviction of a third cache address during an atomic operation on the third cache address.
- the atomic address block receives and analyzes a first set of signals representing a first entry of the first request queue to determine whether they represent the atomic operation. If so, the atomic address block sets a third set of address bits to a value representative of the first cache address.
- the atomic address block receives and analyzes a second set of signals representing the second set of address bits to determine whether the second set of address bits represent a same cache address as the third set of address bits. If so, the atomic address block stalls servicing of the second request queue, thus preventing eviction of data from the cache upon which an atomic operation is being performed.
- Figure 2 illustrates the states of the prior art MOSI cache coherency protocol.
- Figure 3 illustrates a prior art External Interface Unit and its relationship with a cache.
- FIG. 4 illustrates Snoop Filtering Circuitry in accordance with an embodiment of the invention.
- Figure 5 illustrates a Cache Access Request Queue of the Snoop Filtering Circuitry of Figure 4.
- Figure 6 illustrates a Snoop Filtering Request Queue of the Snoop Filtering Circuitry of Figure 4.
- Figure 7 is a block diagram of the Atomic Address Register and the Control
- FIG. 8 illustrates an entry of the Atomic Address Register utilized in accordance with an embodiment of the invention.
- FIG. 9 is a block diagram of the Address Write Circuitry of the Control Circuitry of Figure 7.
- Figure 10 is a block diagram of the Lock Bit Control Circuitry of the Control Circuitry of Figure 7.
- Figure 11 illustrates an Eviction Queue of the Snoop Filtering Circuitry of Figure 4.
- Figure 12 is a block diagram of the Atomic Hit Detection Circuitry of the Control Circuitry of Figure 7.
- FIG. 4 illustrates in block diagram form a portion of a Processor 33 of a multi-processor system 50.
- Processor 33 improves snoop latency by continuing to filter snoops during the pendency of an atomic operation.
- Processor 33 achieves this improvement using Cache 37, Cache Access Request Queue 52 and Snoop Filtering Request Queue 54.
- Cache Controller 43 uses Cache Access Request Queue 52 to track native, or internal, cache access requests and Snoop Filtering Request Queue 54 to filter snoops.
- Processor 33 includes Atomic Address Block 56, which protects the atomic address from eviction during the atomic operation.
- Atomic Address Block 56 detects the beginning of an atomic operation by monitoring cache access requests from the Cache
- Atomic Address Block 56 then monitors the Eviction Queue 58 to detect when eviction of the atomic address is requested. Atomic Address Block 56 prevents eviction of the atomic address by asserting a Stall signal, which causes Cache Controller 43 to stall selection of eviction requests from Eviction Queue 58.
- Cache Access Request Queue 52 is preferably realized as a memory device storing an entry for each outstanding request for access to Cache 37.
- Figure 5 illustrates an entry 60 of Cache Access Request Queue 52.
- the maximum number of entries Cache Access Request Queue 52 can support is a design choice.
- Entry 60 contains information about a
- Address bits 62 and Tag bits 63 indicate the memory address to which the request seeks access.
- Atomic bit 64 indicates whether or not the cache access request is a sub-operation of an atomic operation.
- Ld/Store bit 65 indicates whether the cache access request is for a load or store operation.
- Valid bit 66 indicates
- Cache Controller 43 controls the contents of Cache Access Request Queue 52.
- Cache Controller 43 also controls the contents of Snoop Filtering Request Queue 54.
- Snoop Filtering Request Queue 54 is realized as a memory device storing an entry for each outstanding snoop.
- Figure 6 illustrates an entry 70 of Snoop Filtering
- Entry 70 contains information about a single outstanding snoop, and includes Address bits 72, Tag bits 73, FLD/FST bit 74, and Valid bit 76. Address bits 72 and Tag bits 73 indicate the memory address to which the snoop seeks access. FLD/FST bit 74 indicates whether the snoop is associated with a foreign load or a foreign store. Valid bit 76 indicates whether or not the associated entry is valid.
- Figure 11 illustrates an entry 55 of Eviction Queue 58.
- the maximum number of entries Eviction Queue 58 can support is a design choice.
- Entry 55 contains information about a single outstanding eviction request and includes Address bits 57 and Valid bit 59.
- Address bits 57 indicate the memory address on which the eviction will be performed.
- Valid bit 59 indicates whether or not the associated entry is valid.
- Cache Controller 43 stalls servicing of Eviction Queue 58 in response to a Stall signal from Snoop Filtering Circuitry 51.
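- The three queue entry formats described above can be summarized as plain records. This Python sketch lists only the fields named in the text (entry 60 may carry more than is shown). The class names, the integer and boolean field types, and the encoding of the Ld/Store and FLD/FST bits as `True` for a store are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class CacheAccessEntry:       # entry 60 of Cache Access Request Queue 52
    address: int              # Address bits 62
    tag: int                  # Tag bits 63
    atomic: bool              # Atomic bit 64
    is_store: bool            # Ld/Store bit 65 (assumed: True means store)
    valid: bool               # Valid bit 66


@dataclass
class SnoopEntry:             # entry 70 of Snoop Filtering Request Queue 54
    address: int              # Address bits 72
    tag: int                  # Tag bits 73
    foreign_store: bool       # FLD/FST bit 74 (assumed: True means FST)
    valid: bool               # Valid bit 76


@dataclass
class EvictionEntry:          # entry 55 of Eviction Queue 58
    address: int              # Address bits 57
    valid: bool               # Valid bit 59


# Example: the load half of an atomic operation and an unrelated eviction.
atomic_load = CacheAccessEntry(address=0x40, tag=0x7, atomic=True,
                               is_store=False, valid=True)
eviction = EvictionEntry(address=0x80, valid=True)
```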
- Figure 7 illustrates, in block diagram form, Atomic Address Block 56 and its coupling to Cache Access Request Queue 52, Snoop Filtering Request Queue 54 and Eviction Queue 58.
- Atomic Address Block 56 includes Atomic Address Register 80, Address Write Circuitry 100, Lock Bit Control Circuitry 110 and Atomic Hit Detection Circuitry 130.
- Address Write Circuitry 100 and Lock Bit Control Circuitry 110 monitor the cache access requests coupled to Cache 37 by Cache Access Request Queue 52.
- Address Write Circuitry 100 stores the atomic address in Atomic Address Register 80.
- Lock Bit Control Circuitry 110 responds to the same circumstances by locking the atomic address to prevent access to the data during the pendency of the atomic operation.
- Atomic Hit Detection Circuitry 130 monitors eviction requests from
- Atomic Address Register 80 is preferably realized as a memory device storing an entry 90 for each atomic operation which Processor 33 allows to be simultaneously pending. In a preferred embodiment, Processor 33 permits just one atomic operation to be pending at a time.
- Figure 8 illustrates an entry 90 of Atomic Address Register 80. Entry 90 includes Address & Tag bits 92, and Lock bit 94.
- Lock Bit 94 indicates whether the atomic address may be accessed. Lock bit 94 is asserted when a cache access request associated with the first sub-operation of an atomic operation is coupled from Cache Access Request Queue 52 to Cache 37. Lock bit 94 is deasserted upon completion of the second sub-operation of the atomic operation. Thus, Lock bit 94 also indicates the validity of the contents of Atomic Address Register 80.
- Lock Bit Control Circuitry 110 controls the state of Lock bit 94 of Atomic Address Register 80. Lock Bit Control Circuitry 110 monitors the signals coupled to Cache 37 on lines 112 by Cache Access Request Queue 52.
- The signals on lines 112 represent a single entry 60 of Cache Access Request Queue 52. If the signals on lines 112 indicate that the cache access request represents the first sub-operation of an atomic operation, then Lock Bit Control Circuitry 110 modifies Lock bit 94 to indicate that the atomic address is unavailable. On the other hand, if the signals on lines 112 indicate that the cache access request represents completion of the second sub-operation of the atomic operation, then Lock Bit Control Circuitry 110 modifies Lock bit 94 to indicate that the atomic address is available; i.e., that Entry 90 is no longer valid.
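- A minimal behavioral sketch of this lock and unlock rule follows. It assumes, as in the atomic load example given earlier, that the first sub-operation is a load and the second is a store, so the Ld/Store bit distinguishes them. The names `AtomicAddressRegister` and `observe_request` and the `completed` flag are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AtomicAddressRegister:   # models entry 90
    address_and_tag: int = 0   # Address & Tag bits 92
    lock: bool = False         # Lock bit 94


def observe_request(reg, atomic, is_store, completed=False):
    """Update the Lock bit from one cache access request.

    Assumption: the load is the first sub-operation and the store is
    the second, so the Ld/Store bit tells the two apart.
    """
    if not atomic:
        return                 # non-atomic requests never touch the lock
    if not is_store:
        reg.lock = True        # first sub-operation issued: lock
    elif completed:
        reg.lock = False       # second sub-operation completed: unlock


reg = AtomicAddressRegister()
observe_request(reg, atomic=True, is_store=False)
locked_after_load = reg.lock
observe_request(reg, atomic=True, is_store=True, completed=True)
locked_after_store = reg.lock
```

- Because the Lock bit is cleared only on completion of the second sub-operation, it also serves as the validity indicator for the register contents.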
- Atomic Hit Detection Circuitry 130 protects data associated with an atomic address from eviction during the atomic operation.
- Atomic Hit Detection Circuitry 130 identifies an eviction request for the atomic address by comparing the atomic address stored within Atomic Address Register 80 to the signals on line 53, which represent the Address bits 57 of a single entry 55 of Eviction Queue 58 (see Figure 11). If the two addresses match while the atomic address is locked, then Atomic Hit Detection Circuitry 130 asserts its Stall signal, which is coupled to Cache Controller 43 on line 138.
- Cache Controller 43 responds to assertion of the Stall signal by stalling selection of eviction requests in Eviction Queue 58.
- Cache Controller 43 resumes servicing of eviction requests when the Stall signal is deasserted.
- Atomic Hit Detection Circuitry 130 de-asserts the Stall signal when the atomic operation is completed.
- FIG. 9 illustrates Address Write Circuitry 100 in block diagram form.
- Address Write Circuitry 100 is preferably realized as a series of parallel Latches 104, each with an associated logical AND gate 103, although only one of each is illustrated.
- Each Latch 104 stores a single bit of an address and tag pair.
- the D input of each Latch 104 is coupled to a line of lines 102b, which represents a bit of the Address and Tag bits of Cache Access Request Queue 52.
- the enable input of Latch 104 is controlled by the output of a logical AND gate 103.
- Logical AND gate 103 enables Latch 104 whenever the current cache access request from Cache Access Request Queue 52 represents a valid request for an atomic operation.
- logical AND gate 103 brings its output active whenever the signals on line 102c representing the Valid bit 66 and the signals on line 102a representing Atomic bit 64 are active. (See Figure 5) Thus, when the signals on lines 102a and 102c indicate a valid request for an atomic operation is being serviced, then the signals on lines 102b are latched by Latches 104.
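- The gating described above can be modeled behaviorally as follows. This is a sketch only: the bank of parallel Latches 104 is collapsed into a single integer, clocking is ignored, and the class name `AddressWriteCircuitry` and the method `sample` are hypothetical.

```python
class AddressWriteCircuitry:
    """Behavioral model of Latches 104 gated by logical AND gate 103."""

    def __init__(self):
        self.latched = 0  # value held for Atomic Address Register 80

    def sample(self, address_and_tag, atomic_bit, valid_bit):
        enable = atomic_bit and valid_bit      # output of AND gate 103
        if enable:
            self.latched = address_and_tag     # latches capture lines 102b
        return self.latched


awc = AddressWriteCircuitry()
after_plain = awc.sample(0x1000, atomic_bit=False, valid_bit=True)
after_atomic = awc.sample(0x2000, atomic_bit=True, valid_bit=True)
after_invalid = awc.sample(0x3000, atomic_bit=True, valid_bit=False)
```

- Only the second request, which is both valid and atomic, changes the stored address; the other two leave the latches untouched.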
- Lock Bit Control Circuitry 110 includes logical multiplexer (MUX) 150 and Select Control Circuitry 152.
- MUX 150 determines the state of the Lock bit 94 to be written in Atomic Address Register 80.
- MUX 150 indicates that the Lock bit 94 should be locked.
- Select Control Circuitry 152 selects between the I1 and I0 inputs using First Select Control Circuit 151 and Zero Select Control Circuitry 156.
- First Select Control Circuit 151 controls when the I1 input is selected by controlling the S1 signal on line 155.
- First Select Control Circuit 151 is realized as a pair of logical AND gates 153 and 154. Logical AND gate 153 asserts its
- Zero Select Control Circuitry 156 controls when the I0 input of MUX 150 is selected by controlling the S0 signal on line 157.
- Zero Select Control Circuitry 156 includes one Zero Select Circuit 156a for each entry of Cache Access Request Queue 52.
- Figure 10 illustrates a single instance of a Zero Select Control Circuit 156a.
- Zero Select Circuit 156a examines its associated entry to determine whether the associated cache access request just completed. Comparator 158 performs this task. If the addresses match and the cache access request entry is associated with the second sub-operation of an atomic operation, as represented by signals representing the Atomic bit 64 and Ld/Store bit 65 of the cache access request entry 60, then logical AND 160 asserts the S0 signal on line 157, thereby unlocking the Lock bit 94 of Atomic Address Register 80.
- FIG. 12 illustrates Atomic Hit Detection Circuitry 130 in block diagram form.
- Atomic Hit Detection Circuitry 130 signals an eviction request cache hit to Cache Controller 43 via the Stall signal on line 138.
- Atomic Hit Detection Circuitry 130 includes Comparator 170 and logical AND gate 172.
- Comparator 170 compares the address of the eviction request, which is represented by the signals on line 53, with the atomic address, which is represented by signals on line 92. Just because the eviction address and the atomic address match does not necessarily mean that Eviction Queue 58 should be stalled. Eviction should be stalled only if the atomic operation is still pending.
- Logical AND gate 172 determines whether this is the case by asserting its output, the Stall signal on line 138, only if the Lock bit 94 is asserted.
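- The comparator and AND gate combination reduces to a single boolean expression. The following Python sketch shows it; the function name `stall_signal` and the example addresses are hypothetical.

```python
def stall_signal(eviction_address, atomic_address, lock_bit):
    """Return True when servicing of Eviction Queue 58 should stall."""
    match = eviction_address == atomic_address   # Comparator 170
    return match and lock_bit                    # logical AND gate 172


# Matching address while the atomic operation is pending: stall.
stall_pending = stall_signal(0x40, 0x40, lock_bit=True)
# Matching address after the atomic operation has completed: no stall.
stall_stale = stall_signal(0x40, 0x40, lock_bit=False)
# Different address during the atomic operation: no stall.
stall_other = stall_signal(0x80, 0x40, lock_bit=True)
```

- Gating the comparison with the Lock bit keeps a stale address left in Atomic Address Register 80 from stalling evictions after the atomic operation has finished.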
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2001562159A JP2003524248A (en) | 2000-02-25 | 2001-02-09 | Method and apparatus for preventing cache data eviction during atomic operations and maintaining high snoop traffic performance |
EP01908995A EP1275045A4 (en) | 2000-02-25 | 2001-02-09 | Apparatus and method for maintaining high snoop traffic throughput and preventing cache data eviction during an atomic operation |
AU2001236793A AU2001236793A1 (en) | 2000-02-25 | 2001-02-09 | Apparatus and method for maintaining high snoop traffic throughput and preventing cache data eviction during an atomic operation |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/513,033 US6347360B1 (en) | 2000-02-25 | 2000-02-25 | Apparatus and method for preventing cache data eviction during an atomic operation |
US09/513,034 US6389517B1 (en) | 2000-02-25 | 2000-02-25 | Maintaining snoop traffic throughput in presence of an atomic operation a first port for a first queue tracks cache requests and a second port for a second queue snoops that have yet to be filtered |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2001063240A2 (en) | 2001-08-30 |
WO2001063240A3 (en) | 2002-01-17 |
Family
ID=27057731
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2001/004147 WO2001063240A2 (en) | 2000-02-25 | 2001-02-09 | Maintaining high snoop traffic throughput and preventing cache data eviction during an atomic operation |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP1275045A4 (en) |
JP (1) | JP2003524248A (en) |
AU (1) | AU2001236793A1 (en) |
WO (1) | WO2001063240A2 (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2013188414A2 (en) | 2012-06-15 | 2013-12-19 | Soft Machines, Inc. | A method and system for filtering the stores to prevent all stores from having to snoop check against all words of a cache |
GB2516092A (en) * | 2013-07-11 | 2015-01-14 | Ibm | Method and system for implementing a bit array in a cache line |
JP2015102973A (en) * | 2013-11-22 | 2015-06-04 | 富士通株式会社 | Arithmetic processing apparatus and control method of arithmetic processing unit |
FR3048526A1 (en) * | 2016-03-07 | 2017-09-08 | Kalray | ATOMIC LIMITED RANGE TRAINING AT AN INTERMEDIATE CACHE LEVEL |
US9904552B2 (en) | 2012-06-15 | 2018-02-27 | Intel Corporation | Virtual load store queue having a dynamic dispatch window with a distributed structure |
US9928121B2 (en) | 2012-06-15 | 2018-03-27 | Intel Corporation | Method and system for implementing recovery from speculative forwarding miss-predictions/errors resulting from load store reordering and optimization |
US9965277B2 (en) | 2012-06-15 | 2018-05-08 | Intel Corporation | Virtual load store queue having a dynamic dispatch window with a unified structure |
US9990198B2 (en) | 2012-06-15 | 2018-06-05 | Intel Corporation | Instruction definition to implement load store reordering and optimization |
US10019263B2 (en) | 2012-06-15 | 2018-07-10 | Intel Corporation | Reordered speculative instruction sequences with a disambiguation-free out of order load store queue |
US10048964B2 (en) | 2012-06-15 | 2018-08-14 | Intel Corporation | Disambiguation-free out of order load store queue |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2006155080A (en) * | 2004-11-26 | 2006-06-15 | Fujitsu Ltd | Memory controller and memory control method |
US9471323B2 (en) | 2013-10-03 | 2016-10-18 | Intel Corporation | System and method of using an atomic data buffer to bypass a memory location |
GB2554442B (en) | 2016-09-28 | 2020-11-11 | Advanced Risc Mach Ltd | Apparatus and method for providing an atomic set of data accesses |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4783736A (en) * | 1985-07-22 | 1988-11-08 | Alliant Computer Systems Corporation | Digital computer with multisection cache |
US5765199A (en) * | 1994-01-31 | 1998-06-09 | Motorola, Inc. | Data processor with alocate bit and method of operation |
US6049851A (en) * | 1994-02-14 | 2000-04-11 | Hewlett-Packard Company | Method and apparatus for checking cache coherency in a computer architecture |
US6098156A (en) * | 1997-07-22 | 2000-08-01 | International Business Machines Corporation | Method and system for rapid line ownership transfer for multiprocessor updates |
US6145059A (en) * | 1998-02-17 | 2000-11-07 | International Business Machines Corporation | Cache coherency protocols with posted operations and tagged coherency states |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5966729A (en) * | 1997-06-30 | 1999-10-12 | Sun Microsystems, Inc. | Snoop filter for use in multiprocessor computer systems |
-
2001
- 2001-02-09 WO PCT/US2001/004147 patent/WO2001063240A2/en not_active Application Discontinuation
- 2001-02-09 AU AU2001236793A patent/AU2001236793A1/en not_active Abandoned
- 2001-02-09 EP EP01908995A patent/EP1275045A4/en not_active Withdrawn
- 2001-02-09 JP JP2001562159A patent/JP2003524248A/en active Pending
Non-Patent Citations (1)
Title |
---|
See also references of EP1275045A2 * |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9904552B2 (en) | 2012-06-15 | 2018-02-27 | Intel Corporation | Virtual load store queue having a dynamic dispatch window with a distributed structure |
US9965277B2 (en) | 2012-06-15 | 2018-05-08 | Intel Corporation | Virtual load store queue having a dynamic dispatch window with a unified structure |
US10592300B2 (en) | 2012-06-15 | 2020-03-17 | Intel Corporation | Method and system for implementing recovery from speculative forwarding miss-predictions/errors resulting from load store reordering and optimization |
EP2862060A4 (en) * | 2012-06-15 | 2016-11-30 | Soft Machines Inc | A method and system for filtering the stores to prevent all stores from having to snoop check against all words of a cache |
US10048964B2 (en) | 2012-06-15 | 2018-08-14 | Intel Corporation | Disambiguation-free out of order load store queue |
US9928121B2 (en) | 2012-06-15 | 2018-03-27 | Intel Corporation | Method and system for implementing recovery from speculative forwarding miss-predictions/errors resulting from load store reordering and optimization |
US9990198B2 (en) | 2012-06-15 | 2018-06-05 | Intel Corporation | Instruction definition to implement load store reordering and optimization |
WO2013188414A2 (en) | 2012-06-15 | 2013-12-19 | Soft Machines, Inc. | A method and system for filtering the stores to prevent all stores from having to snoop check against all words of a cache |
US10019263B2 (en) | 2012-06-15 | 2018-07-10 | Intel Corporation | Reordered speculative instruction sequences with a disambiguation-free out of order load store queue |
GB2516092A (en) * | 2013-07-11 | 2015-01-14 | Ibm | Method and system for implementing a bit array in a cache line |
JP2015102973A (en) * | 2013-11-22 | 2015-06-04 | 富士通株式会社 | Arithmetic processing apparatus and control method of arithmetic processing unit |
EP3217288A1 (en) * | 2016-03-07 | 2017-09-13 | Kalray | Atomic instruction having a local scope limited to an intermediate cache level |
FR3048526A1 (en) * | 2016-03-07 | 2017-09-08 | Kalray | ATOMIC LIMITED RANGE TRAINING AT AN INTERMEDIATE CACHE LEVEL |
US11144480B2 (en) | 2016-03-07 | 2021-10-12 | Kalray | Atomic instruction having a local scope limited to an intermediate cache level |
Also Published As
Publication number | Publication date |
---|---|
EP1275045A4 (en) | 2005-12-21 |
AU2001236793A1 (en) | 2001-09-03 |
WO2001063240A3 (en) | 2002-01-17 |
EP1275045A2 (en) | 2003-01-15 |
JP2003524248A (en) | 2003-08-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP3533355B2 (en) | Cache memory system | |
US5555398A (en) | Write back cache coherency module for systems with a write through cache supporting bus | |
US6625698B2 (en) | Method and apparatus for controlling memory storage locks based on cache line ownership | |
US5774700A (en) | Method and apparatus for determining the timing of snoop windows in a pipelined bus | |
US6389517B1 (en) | Maintaining snoop traffic throughput in presence of an atomic operation a first port for a first queue tracks cache requests and a second port for a second queue snoops that have yet to be filtered | |
US6347360B1 (en) | Apparatus and method for preventing cache data eviction during an atomic operation | |
US5426765A (en) | Multiprocessor cache abitration | |
US5875472A (en) | Address conflict detection system employing address indirection for use in a high-speed multi-processor system | |
EP0667578B1 (en) | Apparatus and method for checking cache coherency with double snoop mechanism | |
US4484267A (en) | Cache sharing control in a multiprocessor | |
US5197139A (en) | Cache management for multi-processor systems utilizing bulk cross-invalidate | |
US6292705B1 (en) | Method and apparatus for address transfers, system serialization, and centralized cache and transaction control, in a symetric multiprocessor system | |
US5860111A (en) | Coherency for write-back cache in a system designed for write-through cache including export-on-hold | |
US6587931B1 (en) | Directory-based cache coherency system supporting multiple instruction processor and input/output caches | |
US5893921A (en) | Method for maintaining memory coherency in a computer system having a cache utilizing snoop address injection during a read transaction by a dual memory bus controller | |
US6374332B1 (en) | Cache control system for performing multiple outstanding ownership requests | |
US5963978A (en) | High level (L2) cache and method for efficiently updating directory entries utilizing an n-position priority queue and priority indicators | |
US5832276A (en) | Resolving processor and system bus address collision in a high-level cache | |
JPH0561770A (en) | Coherent means of data processing system | |
US6553442B1 (en) | Bus master for SMP execution of global operations utilizing a single token with implied release | |
JP2002163148A (en) | Multi-processor system with cache memory | |
US5214766A (en) | Data prefetching based on store information in multi-processor caches | |
EP1275045A2 (en) | Apparatus and method for maintaining high snoop traffic throughput and preventing cache data eviction during an atomic operation | |
JP2002259211A (en) | Method for maintaining coherency in cache hierarchy, computer system and processing unit | |
US6615321B2 (en) | Mechanism for collapsing store misses in an SMP computer system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A2 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A2 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
DFPE | Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101) | ||
AK | Designated states |
Kind code of ref document: A3 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A3 Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG |
|
ENP | Entry into the national phase |
Ref country code: JP Ref document number: 2001 562159 Kind code of ref document: A Format of ref document f/p: F |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2001908995 Country of ref document: EP |
|
WWP | Wipo information: published in national office |
Ref document number: 2001908995 Country of ref document: EP |
|
REG | Reference to national code |
Ref country code: DE Ref legal event code: 8642 |
|
WWW | Wipo information: withdrawn in national office |
Ref document number: 2001908995 Country of ref document: EP |