US7222222B1 - System and method for handling memory requests in a multiprocessor shared memory system - Google Patents
- Publication number: US7222222B1 (application US10/601,030)
- Authority: United States (US)
- Prior art keywords: request, data, requests, memory, requesting
- Legal status: Expired - Lifetime, expires (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0815—Cache consistency protocols
- G06F12/0817—Cache consistency protocols using directory methods
- G06F12/0828—Cache consistency protocols using directory methods with concurrent directory accessing, i.e. handling multiple concurrent coherency transactions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/16—Handling requests for interconnection or transfer for access to memory bus
- G06F13/1605—Handling requests for interconnection or transfer for access to memory bus based on arbitration
- G06F13/1642—Handling requests for interconnection or transfer for access to memory bus based on arbitration with request queuing
Definitions
- the present invention generally relates to methods and apparatus for use in a shared memory multiprocessor data processing system; and, more particularly, relates to an improved mechanism for managing memory requests in a system that includes multiple processing nodes coupled to a shared main memory.
- IPs: Instruction Processors
- I/O: Input/Output
- cache memory systems are often coupled to one or more of the IPs for storing data signals that are copied from main memory or from other cache memories. These cache memories are generally capable of processing requests faster than the main memory while also serving to reduce the number of requests that the main memory must handle. This increases system throughput.
- While the use of cache memories increases system throughput, it causes other design challenges.
- some mechanism must be utilized to ensure that all IPs are working from the same (most recent) copy of the data. For example, if a data item is copied, and subsequently modified, within a cache memory, another IP requesting access to the same data item must be prevented from using the older copy of the data item stored either in main memory or the requesting IP's cache. This is referred to as maintaining cache coherency. Maintaining cache coherency becomes more difficult as more cache memories are added to the system since more copies of a single data item may have to be tracked.
- Another problem related to that described above involves ensuring that a fair priority scheme is implemented which allows all processors to have relatively timely access to shared data. For example, consider the situation wherein data from the shared memory is copied to a first cache to allow one or more processors coupled to the cache to access this data. Before all of these processors have had the opportunity to access the data within the first cache, the first cache is forced to relinquish control over this data. This may occur because another processor that is coupled to a different cache requires access to the data. The data is therefore copied to this different cache only to be copied back to the first cache a short time later because the original processors still require access to the data. This repeated transfer of data, or “memory thrashing”, decreases system throughput.
- a software lock can be implemented by designating a location within shared memory as a lock cell that is used to control access to shared protected data.
- a processor cannot gain access to the protected data without first activating the software lock. This can be accomplished using an indivisible read-modify-write operation that tests the software lock for availability. If the lock is available, the lock cell data is set to a predetermined state to activate the lock. After the processor has completed reading and/or updating the protected data, the lock cell is deactivated, allowing another processor within the system to acquire the lock cell and access the protected data.
- problems can exist when it is time to deactivate the lock cell. Assume, for example, that one or more processors are repeatedly testing the lock cell for availability, as may be performed within software looping constructs. Because these read requests are repeatedly being issued to test the state of the lock cell, it may not be possible for the processor that has activated the lock to readily gain access to the lock cell to deactivate the lock. This results in a temporary deadlock situation.
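The lock-cell protocol described above can be sketched in a minimal Python model. The class and method names are hypothetical, and the hardware's indivisible read-modify-write is emulated here by an internal mutex:

```python
import threading

class LockCell:
    """Models a software lock cell guarding shared protected data.
    The indivisible read-modify-write (test-and-set) is emulated with
    an internal mutex; in hardware it is a single atomic operation."""
    AVAILABLE, ACTIVE = 0, 1

    def __init__(self):
        self._cell = LockCell.AVAILABLE
        self._rmw = threading.Lock()  # stands in for bus-level atomicity

    def test_and_set(self):
        """Indivisibly test the cell and activate it if available.
        Returns True if the lock was acquired."""
        with self._rmw:
            if self._cell == LockCell.AVAILABLE:
                self._cell = LockCell.ACTIVE
                return True
            return False

    def deactivate(self):
        with self._rmw:
            self._cell = LockCell.AVAILABLE

lock = LockCell()
assert lock.test_and_set()      # first requester activates the lock
assert not lock.test_and_set()  # polling requesters keep failing ...
lock.deactivate()               # ... until the holder deactivates it
assert lock.test_and_set()
```

A processor spinning on `test_and_set()` models the software polling loop described above: the stream of repeated test requests is what can delay the lock holder's own access to the cell, producing the temporary deadlock the invention addresses.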
- a system and method are provided for tracking memory requests within a data processing system.
- the system includes a request tracking circuit that is coupled to receive requests for data from multiple processors. After a request is received and before it is forwarded to the memory for processing, a record is created within the request tracking circuit that stores request information. For example, this information may identify the request address, the processor that issued the request, as well as the request type.
- the request tracking circuit determines whether any other requests are pending for the same memory address. If not, the request is forwarded to the memory. Otherwise, a request is not issued to memory, and instead, the newly-created record is associated with one or more other records tracking requests to the same address. In one embodiment, this association is created by forming a linked list of records. These records may be linked in an order that indicates the time-order in which the respective requests were received.
- When data is received from memory as the result of a request, the data is forwarded to the processor that initiated the request. In one embodiment, the request tracking circuit then deletes the record for this request.
- any additional request that is linked to this request is processed next as the current request.
- a request is issued to the processor that was most-recently provided the data.
- This request solicits the return of the data along with the return of any access rights (e.g., read-only or read/write access rights) that will be needed to fulfill the current request.
- the request type that will be used to solicit return of the data is selected based on the access rights that were granted to the processor that most recently retained the data, and on the access rights that are being requested by the current request. The request type may further be based on the access rights that were granted by the memory for the data.
- any returned data is forwarded to the processor that issued the current request.
- the request for, and the subsequent transfer of, the data to this processor is performed during an indivisible operation. This prevents some other processor or the memory itself from making an intervening request that intercepts the data. This mechanism thereby ensures that requests for data are processed in an order of receipt so that a temporary deadlock situation does not arise.
- After the current request is processed, it may be deleted. Then, if the current request was linked to still another request, the next request in the linked list becomes the current request and is processed in the above-described manner. Processing continues until all requests within the linked list have been processed.
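The record creation and linked-list processing described above can be illustrated with a small Python sketch. All names are hypothetical, and a per-address deque stands in for the hardware's index-linked list of records:

```python
from collections import deque

class RequestTracker:
    """Sketch of the request tracking circuit: one record per request,
    records for the same address linked in order of receipt, and only
    the head of each list issued to memory. Names are illustrative."""
    def __init__(self, memory_port):
        self.pending = {}               # address -> deque of (processor, req_type)
        self.memory_port = memory_port  # callable that issues a request to memory

    def receive(self, address, processor, req_type):
        record = (processor, req_type)
        if address in self.pending:
            # A request for this address is already outstanding:
            # link the new record instead of issuing a second request.
            self.pending[address].append(record)
        else:
            self.pending[address] = deque([record])
            self.memory_port(address, processor, req_type)

    def data_returned(self, address, data):
        """Memory returned the data: satisfy linked requests in time-order."""
        responses = []
        queue = self.pending.pop(address)
        while queue:
            processor, _ = queue.popleft()
            # Soliciting and forwarding the data to the next processor is
            # indivisible, so no intervening request can intercept it.
            responses.append((processor, data))
        return responses

issued = []
tracker = RequestTracker(lambda a, p, t: issued.append(a))
tracker.receive(0x1000, "IP0", "write")
tracker.receive(0x1000, "IP1", "read")  # linked, not issued to memory
assert issued == [0x1000]               # single request pending per address
assert tracker.data_returned(0x1000, "cacheline") == [
    ("IP0", "cacheline"), ("IP1", "cacheline")]
```

Note how the second request never reaches `memory_port`, and how the responses come back in order of receipt, matching the ordered, deadlock-avoiding behavior described above.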
- Each processing node may include multiple processors having dedicated caches.
- Each of the processing nodes may further include a shared cache. Requests issued by the multiple processors of a processing node are tracked when requested data is not resident with the requested access rights within any of the caches in the processing node.
- the data may not be returned because the first processor already aged the data from its dedicated caches to the shared cache. If this is the case, data retrieved from the shared cache is returned to the next processor.
- the memory utilizes a dual memory channel architecture for providing requests to, and receiving requests from, the multiple processors.
- With this architecture, it may be possible for the memory to issue a request for return of data to the multiple processors while one or more of the multiple processors are requesting the same data from the memory.
- the request tracking circuit will be tracking one or more requests for this data that were issued by one of the multiple processors.
- When the request from memory is received, the request tracking circuit generates a record to track this memory request, which is then linked to the one or more requests for the same data in a manner similar to that described above.
- the requested data will be provided by the memory, and the linked list of requests will be processed in the above-discussed manner.
- When the record storing the memory request is encountered, the data is retrieved from one of the multiple processors or the shared cache, and is returned to the main memory.
- data may be provided by the memory as the result of a processor request before all coherency actions are performed for this data.
- data may be provided by memory before other read-only copies of the data that are stored elsewhere within the data processing system are invalidated.
- the request tracking circuit tracks the outstanding invalidation activities so that the data will not be returned to memory until these activities are completed. This prevents memory incoherency and inconsistency problems from arising.
- the inventive system and method provides a mechanism to process requests for the same memory data in an ordered manner. Moreover, the system prevents the memory from re-acquiring the data before any previously-issued requests from one or more of the multiple processors within the same processing node are handled. This prevents the occurrence of temporary deadlock situations that arise when memory thrashing is occurring.
- a method for use in a system having multiple processors coupled to a memory.
- the method includes the steps of receiving multiple requests for data from the multiple processors, and if ones of the multiple requests are requesting the same data, creating a respective linked list to record the ones of the multiple requests.
- the method further includes issuing one of the requests recorded by each linked list to the memory.
- a method of processing requests to a memory includes receiving a request for data stored in the memory, and if the request is requesting the same data as another request that is already pending to the memory, linking the request to the other pending request. These receiving and linking steps are repeated for any additional requests issued to the memory.
- a system for processing requests to a memory includes multiple requesters to issue requests for data to the memory.
- the multiple requesters are the processors within a processing node.
- the system further includes a request tracking circuit coupled to the multiple requesters to retain a record of each request until the request is completed, and to associate a request with any other one or more requests for the same data so that a single request for any given data is pending to memory at a given time.
- a data processing system that includes a memory, and a processing node coupled to the memory to issue requests for data to the memory, wherein the processing node includes a request tracking circuit to record, in time-order, requests issued for the same data, and to prevent all but one of the requests for the same data from being issued to the memory at a given time.
- a system for processing requests to a memory includes processing means for issuing the requests to the memory.
- the system also includes request tracking means for receiving the requests, for forming an association between any of the requests that are requesting the same data, and for allowing only one of the associated requests to be issued to the memory.
- FIG. 1 is a block diagram of an exemplary data processing system of the type that may employ the current invention.
- FIG. 2 is a block diagram of one embodiment of the current invention that is adapted for use within a data processing platform similar to that of FIG. 1 .
- FIGS. 3A and 3B , when arranged as shown in FIG. 3 , are a flow diagram illustrating one method according to the current invention.
- FIG. 1 is a block diagram of an exemplary data processing system that may employ the current invention.
- the system includes a Storage Coherency Director (SCD) 100 that provides the main memory facility for the system.
- SCD 100 may include random access memory (RAM), read-only memory (ROM), and any other type of memory known in the art.
- SCD 100 may be subdivided into multiple subunits (not shown) in a manner largely beyond the scope of the current invention.
- SCD is a directory-based storage unit.
- SCD retains information in directory 101 that indicates where the latest copy of data resides within the system. This is necessary since data from SCD 100 may be copied into any of the various cache memories within the system.
- Directory 101 tracks the latest copy of the data to ensure that every processor is operating from this copy.
- directory 101 includes a directory entry that tracks the location of each 128-byte block of memory within the SCD, where a 128-byte block is referred to as a cache line.
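As a rough illustration only (field names and state values are assumptions, not the patent's), the directory's per-cache-line tracking might be modeled as:

```python
CACHE_LINE = 128  # bytes per directory-tracked block (a cache line)

class Directory:
    """Toy model of directory 101: one entry per 128-byte cache line,
    recording where the latest copy of the line resides."""
    def __init__(self):
        self.entries = {}  # cache-line number -> {"state": ..., "holder": ...}

    def line_of(self, address):
        return address // CACHE_LINE

    def lookup(self, address):
        # Default: the latest copy resides in the SCD itself.
        return self.entries.get(self.line_of(address),
                                {"state": "present", "holder": "SCD"})

    def record_grant(self, address, node, state):
        self.entries[self.line_of(address)] = {"state": state, "holder": node}

d = Directory()
assert d.lookup(0x2040)["holder"] == "SCD"
d.record_grant(0x2040, "PND102A", "owned")
# Addresses 0x2000-0x207F fall in the same 128-byte cache line.
assert d.lookup(0x2000) == {"state": "owned", "holder": "PND102A"}
```

The key point is granularity: any access within the 128-byte block maps to the same directory entry, which is why a single entry suffices to track the latest copy of that cache line.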
- the SCD of the current embodiment includes an SCD response channel 103 and an SCD request channel 105 .
- the SCD request channel 105 is coupled to an acknowledge tracker 107 . The use of these channels and the acknowledge tracker is discussed below.
- SCD is coupled to one or more Processor Node Directors (PND) shown as PNDs 102 A and 102 B.
- the system of the current invention may include more or fewer PNDs than are shown in FIG. 1 .
- Each PND is coupled to SCD 100 over one or more high-speed SCD interfaces shown as interfaces 109 A and 109 B. Each of these interfaces includes data, address, and function lines.
- Each PND includes logic to interface to the high-speed SCD interface, and further includes logic to interface to a respective processor bus such as processor buses 104 A and 104 B.
- Each PND may further include shared cache and all supporting logic, shown as shared cache logic 106 A and 106 B, respectively.
- This cache logic may include a Third-Level Cache (TLC), a Fourth-Level Cache (4LC), or some other type of cache memory.
- each of PNDs 102 A and 102 B is coupled to a respective processor bus 104 A and 104 B, which may utilize any type of bus protocol.
- Each processor bus further couples to multiple local cache memories through respective Bus Controllers (BCs) 114 .
- Each BC controls the transfer of data between a processor bus and a respective one of the Second-Level Caches (SLCs) 108 .
- SLCs 108 A– 108 D are coupled to processor bus 104 A through BCs 114 A– 114 D, respectively.
- SLCs 108 E– 108 H are coupled to processor bus 104 B through BCs 114 E– 114 H, respectively.
- these local SLCs may be Third-Level Caches.
- Each SLC 108 is also coupled to a respective one of the Instruction Processors (IPs) 110 A– 110 H over a respective interface 112 A– 112 H.
- SLC 108 A is coupled to IP 110 A via interface 112 A
- SLC 108 B is coupled to IP 110 B via interface 112 B, and so on.
- An IP may be any type of processor such as a 2200TM processor commercially available from Unisys Corporation, a processor commercially available from Intel Corporation, or any other processor known in the art.
- Each IP may include one or more on-board caches.
- each IP includes a First-Level Cache (FLC).
- each IP resides on a single Application Specific Integrated Circuit (ASIC) device with a respective SLC 108 .
- an IP may be coupled to a respective SLC over an external interface.
- the associated BC may or may not be integrated with the SLC logic.
- a PND, its respective processor bus, and the entities coupled to the processor bus may be referred to as a “processing node”.
- PND 102 A, processor bus 104 A, and all entities associated with processor bus 104 A, including BCs 114 A– 114 D, SLCs 108 A– 108 D, and IPs 110 A– 110 D, may be referred to as processing node 120 A.
- PND 102 B, processor bus 104 B, and all entities associated with processor bus 104 B comprise a second processing node 120 B.
- Other processing nodes may exist within the system, and are not shown in FIG. 1 for simplicity.
- an IP is accessing programmed instructions and data from SCD 100 and its respective caches. For example, when IP 110 A requires access to a memory address, it first attempts to retrieve this address from its internal cache(s) such as its FLC. If the requested address is not resident in the FLC, a request is sent to the respective SLC 108 A. If the requested data is likewise not resident within the SLC, the SLC forwards the request to the processor bus 104 A.
- all SLCs on a processor bus implement a snoop protocol to monitor, or “snoop”, the processor bus for requests.
- SLCs 108 B– 108 D snoop the request that is driven onto processor bus 104 A by BC 114 A. If any of these SLCs has a modified copy of the requested cache line, it will be returned to requesting SLC 108 A via processor bus 104 A. Additionally, SLCs 108 B– 108 D may have to invalidate any stored copies of the data depending on the type of request made by SLC 108 A. This is discussed further below.
- PND 102 A also snoops the request from SLC 108 A. In particular, PND 102 A determines whether any other SLC responds to the request by providing modified data on processor bus 104 A. If not, data that is retrieved from cache 206 of shared cache logic 106 A is provided by PND 102 A to SLC 108 A.
- data requested by IP 110 A is not resident within any of the cache memories associated with processor bus 104 A. In that case, PND 102 A must forward the request to SCD 100 .
- SCD 100 determines the location of the current copy of the requested data using information stored within its directory 101 . The most current copy may reside within the SCD itself. If so, the SCD provides the data directly to PND 102 A. In one embodiment, this is accomplished via SCD response channel 103 .
- the requested data may instead be stored within another cache memory of a different processing node.
- the way in which the request is handled depends on the type of request that has been made by IP 110 A, and the type of access rights that have been acquired by the other cache memory.
- If IP 110 A is requesting “ownership” of the data so that a write operation can be performed, and another processing node 120 currently retains ownership of the data, the SCD issues a port Snoop and Invalidate (S&I) request.
- this type of request is issued via request channel 105 , although in a different embodiment, this request may be issued on response channel 103 . This request will cause the processing node to invalidate any stored data copies, and return updated data to SCD 100 so that this updated copy may be forwarded to PND 102 A.
- the IP 110 A may be requesting ownership of data that is retained by one or more other processing nodes 120 as read-only data.
- an invalidation request is issued to these one or more processing nodes.
- the invalidation request causes the nodes to invalidate their copies of the data so these copies may no longer be used.
- this type of request is issued on response channel 103 , although this need not be the case.
- IP 110 A may be requesting read-only access of data that is retained with ownership privileges by another node.
- SCD 100 issues a port snoop request.
- this request is issued via request channel 105 to cause the other node to return any updated data copy to SCD.
- This type of request could be issued on the response channel 103 in an alternative embodiment.
- this processing node may, in some cases, retain a read-only copy of the data. In other cases, all retained copies are invalidated.
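The three cases above can be summarized in a small decision function. The request-type strings are taken from the description; the function itself is an illustrative sketch, not logic disclosed by the patent:

```python
def scd_return_request(requested, held_elsewhere):
    """Select the SCD request type used to solicit return of a cache line,
    per the three cases described above. Naming is illustrative."""
    if requested == "ownership" and held_elsewhere == "ownership":
        # The other node must invalidate its copies and return updated data.
        return "port snoop and invalidate (S&I)"
    if requested == "ownership" and held_elsewhere == "read-only":
        # Read-only copies merely need to be invalidated.
        return "invalidation request"
    if requested == "read-only" and held_elsewhere == "ownership":
        # The owner returns any updated data; it may keep a read-only copy.
        return "port snoop"
    return None  # no remote copies to recall: the SCD supplies the data directly

assert scd_return_request("ownership", "ownership") == "port snoop and invalidate (S&I)"
assert scd_return_request("ownership", "read-only") == "invalidation request"
assert scd_return_request("read-only", "ownership") == "port snoop"
```

This makes explicit that the solicitation type depends on both the rights being requested and the rights previously granted, as the description states.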
- Any of the above-described request types may be issued by SCD 100 to a processing node over an SCD interface 109 .
- these requests are received by the respective PND 102 .
- this PND may determine, based on stored state bits, whether any of the SLCs 108 within the processing node stores a valid copy of the requested cache line. If so, a request will be issued on the respective processor bus 104 to prompt return of any modified data. Based on the scenario, this request may also result in invalidation of the stored copies, or the conversion of these copies to read-only data. Any updated data will be returned to SCD 100 .
- some time may elapse between the time a processing node makes a request for data and the time the data is delivered to the processing node.
- By the time the data is delivered, more than one request for that data may have been issued by the IPs within the processing node.
- a request may be received from the SCD to relinquish control over the data in any of the ways discussed above.
- a request from SCD 100 could be honored before all previously pending requests from the IPs within the processing node were handled. This could result in data thrashing, since after the data is copied from the processing node to the SCD, the processing node must immediately make another request to get the data back.
- a lock cell may implement a software-lock associated with, and protecting, shared data.
- the shared data must not be accessed without first gaining authorization by activating the lock cell. This is accomplished by performing an autonomous test-and-set operation whereby the processor tests the state of the lock cell to determine whether it is available. If it is available, the processor sets the lock cell to an activated state to acquire access to the protected shared data. The processor must deactivate the lock cell before another processor can access the protected data.
- this type of preemption can occur for some period of time, since no mechanism is provided to prioritize requests to the same cache line within a given processing node.
- the current invention provides a system and method for ordering requests for the same cache line so that a request pending within a processing node will be honored before any subsequently-received request from SCD 100 is processed for the same cache line. This system and method is described in reference to the following drawings.
- FIG. 2 is a block diagram of logic within a PND 102 according to the current invention. Although PND 102 A is shown and described, it will be understood that this discussion applies to any other PND within a data processing system of the type shown in FIG. 1 .
- the logic of FIG. 2 includes a request tracking circuit 280 (shown dashed) that is provided to track outstanding invalidation operations so that data is not written from a PND in a manner that will cause another processor to reference outdated data. This is discussed further below.
- FIG. 2 The logic of FIG. 2 may best be understood by considering the following example. Assume that IP 110 A is requesting ownership of a cache line for update purposes. A cache miss results in SLC 108 A, and a request is therefore issued on processor bus 104 A to request the data. This request will cause any other SLC on processor bus 104 A to return any updated data copy to SLC 108 A on the processor bus. If the request is for data ownership, it will also result in invalidation of any other copies retained by the SLCs of processing node 120 A.
- When the request is provided to processor bus 104 A, it is also received by input queue 200 of PND 102 A. In response, bus control logic 201 provides a request to pipeline logic 210 , which, in turn, initiates a request to cache control logic 202 of shared cache logic 106 A (shown dashed). If the requested data resides within cache 206 , it will be retrieved so that it can be provided to SLC 108 A if none of the other SLCs on processor bus 104 A returns an updated copy.
- Cache control logic 202 forwards information associated with the cache miss to Local Tracker (LT) control logic 203 .
- LT control logic creates a request entry for the request within a storage device referred to as Local Tracker (LT) 212 .
- LT 212 includes multiple addressable entries shown as entries 0 through N.
- LT 212 includes storage space for sixty-four entries, although an LT of a different size may be utilized as well. Each entry may be addressed using an index value. For instance, LT entry 0 is addressed using an index value of “zero”, LT entry 1 is addressed using an index value of “one”, and so on.
- Each LT entry includes multiple fields.
- An address field 220 stores the request address. In the current embodiment, this address will identify a cache line of memory within SCD 100 , wherein a cache line is an addressable contiguous memory portion containing 128 bytes. In another embodiment, any other contiguous portion of memory may be identified by the address.
- the LT entry further stores a function field 222 that identifies a request type. In this example, the request is a write request. Other types of requests may be tracked, as will be discussed below.
- Also included in an LT entry is a processor ID field 224 indicating which processor issued the request. In the current example, processor 110 A is identified within this field.
- An additional response type field 226 , which is initially left unused, is used to track request responses in a manner to be discussed below.
- each LT entry includes a link field 228 that is provided to link the current LT entry to any subsequently created entry associated with a request for the same cache line.
- the link field may be set to the index value that identifies a latter-created LT entry, as will be described below.
- Requests are linked in this manner to order the requests for the same cache line according to time-order. If a request entry is already stored within LT 212 for a given cache line such that a linked list is created in this manner, LT will prevent the subsequent request from being issued to SCD 100 . Thus, only one request for a given cache line will be pending to SCD at any given time.
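A Python sketch of the LT structure may help here. The field names follow the description above; the allocation and chain-search logic are illustrative assumptions, not disclosed circuitry:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LTEntry:
    """One Local Tracker entry; field names follow the description above."""
    address: int = 0            # cache-line address (field 220)
    function: str = ""          # request type (field 222)
    processor_id: str = ""      # requesting IP (field 224)
    link: Optional[int] = None  # index of next entry for same line (field 228)
    valid: bool = False         # valid bit (field 236)

class LocalTracker:
    """Sketch of LT 212: sixty-four indexed entries, with same-line requests
    chained through the link field so only the head request goes to the SCD."""
    def __init__(self, size=64):
        self.entries = [LTEntry() for _ in range(size)]

    def create(self, address, function, processor_id):
        """Allocate an entry; returns (index, issue_to_scd)."""
        tail = None
        for i, e in enumerate(self.entries):
            if e.valid and e.address == address and e.link is None:
                tail = i  # last entry in the existing chain for this line
        idx = next(i for i, e in enumerate(self.entries) if not e.valid)
        self.entries[idx] = LTEntry(address, function, processor_id, None, True)
        if tail is not None:
            self.entries[tail].link = idx  # link prior request to this one
            return idx, False              # do not issue a second SCD request
        return idx, True

lt = LocalTracker()
i0, issue0 = lt.create(0x80, "write", "IP110A")
i1, issue1 = lt.create(0x80, "read", "IP110B")
assert issue0 and not issue1
assert lt.entries[i0].link == i1  # first entry now points to the second
```

Because the link field holds the index of the later entry, the chain encodes time-order directly, and a simple chain walk reproduces the request sequence when the data returns.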
- Each LT entry further includes a conflict flag 235 , which will be used in the manner discussed below to maintain memory coherency. In the current example, this flag is left unused.
- the LT entry further includes a deferred identifier (DID) field 238 that stores a deferred identifier. This identifier was provided by SLC 108 A to PND 102 A along with the initial request, and will be used to match the request to a response, as will be discussed below.
- each LT entry includes a valid bit in field 236 that is set when a valid entry is created within LT 212 . This valid bit is cleared when the entry is later removed from the LT.
- a transaction identifier is included with the request. This transaction identifier is set to the index value for the LT entry that is tracking this request. This transaction identifier will be used to match a response from SCD 100 with the request information stored within LT 212 , as will be described below.
- When the request gains priority, the request and transaction identifier are transferred via interface 109 A to SCD 100 for processing.
- Assume now that IP 110 B makes a read request for the same cache line. In the manner described above, this request results in a miss on processor bus 104 A, and also results in a cache miss to cache 206 .
- the current invention addresses this situation by creating an entry in LT 212 for the request issued by SLC 108 B. Specifically, when the request results in a miss to cache 206 , the request information is provided to LT control logic 203 .
- LT control logic 203 searches LT 212 to determine whether an entry exists for the current cache line. The request entry for IP 110 A is located. LT control logic 203 then makes a second entry for the cache line. This entry identifies IP 110 B in processor ID field 224 , and further identifies the request as a read request without ownership in function field 222 .
- the valid bit in field 236 is activated, and the address field 220 is set to include the address of the cache line. Response type field 226 and conflict flag 235 remain unused.
- the link field of the request entry for IP 110 A is set to point to this newly created entry. In one embodiment, if the newly created entry is created within storage location “two” of LT 212 , for example, the link field of the first entry is set to “two”, and so on.
- Because a request entry exists within LT 212 for the current cache line, the request issued by IP 110 B will not result in the issuance of a request to SCD 100 . In addition, PND 102 A issues a deferred response to IP 110 B on processor bus 104 A indicating that the request cannot be satisfied at this time.
- directory 101 is referenced to determine whether any of the one or more other processing nodes within the system stores a copy of the requested data. In the current example, it will be assumed the most recent copy of the requested data is available within SCD 100 . This data is provided to PND 102 A along with the original transaction identifier and a response type of ownership-with-data. This response type indicates that there is no outstanding response associated with the data. Other cases involving the return of data while some responses are still outstanding are discussed below.
- the transaction identifier provided with the response is used by LT control logic 203 to reference LT 212 and retrieve the deferred identifier for this request from DID field 238 .
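Because the transaction identifier is simply the index of the LT entry (as stated above), matching a response to its request requires no associative search. A minimal Python sketch, with an assumed dict-based entry layout ("did" standing in for DID field 238):

```python
lt = [None] * 64                      # stand-in for LT 212

def issue_request(index, address, deferred_id):
    """Record a request; the LT index itself is the transaction id."""
    lt[index] = {"address": address, "did": deferred_id, "valid": True}
    return index                      # transaction identifier sent to SCD 100

def handle_response(transaction_id):
    """Match a response to its request with a direct array lookup."""
    entry = lt[transaction_id]        # no associative search is needed
    return entry["did"]               # deferred identifier (DID field 238)
```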
- the returned data is routed from input queue 240 to output queue 242 , and is provided on processor bus 104 A. In one embodiment, this data is provided to SLC 108 A during what is known as a “deferred phase”.
- a deferred phase is one of the ways a PND 102 provides data following the issuance of a deferred response.
- PND 102 places an encoded value on processor bus 104 A indicating that a deferred phase is occurring, along with the deferred identifier retrieved from LT 212 .
- the deferred identifier is used by the target SLC to match the returned data with the original cache line request.
- when SLC 108 A receives and processes the deferred phase, the data will be forwarded to IP 110 A to satisfy the initial request.
- In addition to providing the data to processor bus 104 A, PND 102 A also routes the data and address to pipeline logic 210 , which initiates a request to cache tag logic 204 . A replacement operation is initiated to update the cache tag logic 204 and store the data to cache 206 . Finally, the address and transaction identifier are provided to LT control logic 203 . LT control logic 203 uses the transaction identifier that was returned with the data to remove the first request entry associated with IP 110 A from LT 212 by clearing valid bit 236 .
- When LT control logic 203 removes an entry from LT 212 , it is determined whether the entry being removed is linked to any other entry in the LT. If it is, LT control logic 203 begins the process of unlinking all of the requests within that linked list of entries as follows. LT control logic 203 first determines what type of action must be taken to satisfy the request that is associated with the next entry in the linked list. The type of action taken depends on the type of access rights that have been granted by SCD 100 to processing node 120 A for the requested data, on the type of access rights that have been granted to one or more of the units within the processing node for the requested data, and on the type of access rights requested by the next entry in the linked list. In one embodiment, LT control logic 203 includes a lookup table that is referenced with this information to determine the course of action LT control logic 203 should take.
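Such a lookup table can be sketched as a simple mapping from access rights to an unlink action. The key structure and action names below are illustrative assumptions; the patented encoding is not specified here.

```python
# Hypothetical unlink-action lookup table: keyed by the rights held by the
# entity currently retaining the data and the rights requested by the next
# entry in the linked list.

UNLINK_ACTIONS = {
    ("ownership", "read_only"): "snoop_for_shared_copy",
    ("ownership", "ownership"): "snoop_and_invalidate",
    ("read_only", "ownership"): "invalidate_read_only_copy",
    ("read_only", "read_only"): "read_from_cache",
}

def unlink_action(granted_rights, requested_rights):
    """Return the action LT control logic should take for the next entry."""
    return UNLINK_ACTIONS[(granted_rights, requested_rights)]

# Example: IP 110A owns the line and IP 110B requests read-only access,
# so a snoop soliciting a shared copy is issued on the processor bus.
```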
- the lookup table used to control the unlinking of LT entries may be programmable, and may be stored within a memory such as LT control store 288 .
- This lookup table could be modified using a scan-set interface, as is known in the art.
- the type of actions taken to unlink the entries within LT can change as the needs of the system change. For example, if different types of processors are coupled to processor bus 104 A, the types of requests that may be issued to obtain data in various situations may change. This can be accomplished merely by modifying the control store. As a general rule, normal processing activities must be halted before modifying LT control store 288 so as to avoid the occurrence of errors.
- IP 110 A has been granted ownership to the data, and IP 110 B is requesting read-only access.
- LT control logic 203 determines which actions to take. In this instance, LT control logic 203 prompts bus control logic 201 to issue a request on processor bus 104 A to snoop the cache line for a shared copy. This request, also referred to as a “snoop”, directs SLC 108 A to return any modified copy of the cache line on processor bus 104 A. A copy of this data may be retained by IP 110 A and SLC 108 A for read-only purposes.
- a deferred reply is a mechanism for providing data to one of the SLCs 108 in response to a deferred request.
- the deferred reply includes the deferred identifier from field 238 of the current LT request entry for SLC 108 B.
- this identifier allows the targeted SLC to match the data that accompanies the reply to a previous request.
- This deferred reply also indicates the type of access rights being granted with the data.
- the data is provided to SLC 108 B with read-only access rights.
- a request was issued on processor bus 104 A to obtain the cache line from SLC 108 A followed by a deferred reply to provide that data to SLC 108 B.
- This request and deferred reply are autonomous, meaning that no other requests or other types of operations are allowed to gain access to processor bus 104 A after the request and before the deferred reply.
- SLC 108 B will be the next entity to gain access to the cache line, and will prevent any other request from intervening to obtain this cache line. For example, this prevents a request from a different SLC on processor bus 104 A from being received by PND 102 A and thereafter preempting the servicing of the request from SLC 108 B.
- this autonomy is achieved when bus control logic 201 asserts a bus priority signal on processor bus 104 A during both the request and the deferred reply. This signal prevents any other unit on processor bus 104 A from gaining control over the processor bus to make a request.
- PND 102 A initiates a replacement operation. Any updated data returned by SLC 108 A is forwarded to pipeline logic 210 , which initiates a request to cache tag logic 204 . The tag information is updated based on the response to the request, and the updated data is stored to cache 206 .
- LT control logic 203 removes the request entry for SLC 108 B from LT 212 by clearing the valid bit in field 236 . If this entry is linked to still another entry, the process described above may be repeated. That is, an unlinking process is initiated for any next entry in the list. This unlinking process will solicit the return of the cache line from whichever entity most recently received this data. In the current example, this is SLC 108 B. This request for return of the data will be followed by a deferred reply that provides the data to the entity identified in this next LT entry.
- a request and subsequent deferred reply are issued in a manner that is dictated by the access rights granted to the entity that retains the data at the time of the request, and is further based on the access rights requested by the next LT entry.
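The unlinking loop described above can be summarized in a short sketch. This is an assumed software model of the hardware sequence: for each entry in the linked list, a request is issued to solicit the data, a deferred reply delivers it to the entry's requester, and the entry is invalidated; the dict-based entry layout and callback names are hypothetical.

```python
# Walk a linked list of tracker entries: request the data, deliver it via
# a deferred reply, clear the valid bit, and follow the link field.

def process_linked_list(tracker, head, issue_request, issue_deferred_reply):
    idx = head
    while idx is not None:
        entry = tracker[idx]
        issue_request(entry)          # solicit data from its last recipient
        issue_deferred_reply(entry)   # deliver data to this entry's SLC
        entry["valid"] = False        # remove the entry from the tracker
        idx = entry["link"]           # proceed to the next linked entry

tracker = {0: {"id": "SLC 108A", "link": 1, "valid": True},
           1: {"id": "SLC 108B", "link": None, "valid": True}}
order = []
process_linked_list(tracker, 0,
                    lambda e: order.append(("request", e["id"])),
                    lambda e: order.append(("reply", e["id"])))
# Each request is immediately followed by its deferred reply, mirroring
# the autonomous request/reply pairing described above.
```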
- the foregoing example describes the situation wherein SLC 108 A was granted ownership and SLC 108 B was requesting read access.
- SLC 108 B may be requesting ownership.
- PND 102 A will issue a request on processor bus 104 A to cause SLC 108 A to return the data copy and invalidate all copies of the data retained by SLC 108 A and IP 110 A.
- SLC 108 A retains a read-only copy of the data and SLC 108 B requests ownership.
- a request is issued on processor bus 104 A to cause SLC 108 A to invalidate the read-only copy.
- Pipeline logic 210 causes cache control logic 202 to perform a cache read to obtain the cache line with ownership privileges from cache 206 . If processing node 120 A does not own the cache line, the request from SLC 108 B cannot be satisfied. Therefore, instead of returning the data to SLC 108 B, bus control logic 201 issues a retry response. At this time, the LT entry is removed from LT 212 .
- In response to the retry indication, SLC 108 B will either then, or at some later time, re-issue the original request for the cache line to processor bus 104 A. When this request is re-issued, a miss occurs to processor bus 104 A and to cache 206 . Therefore, a request entry is created within LT 212 in the manner discussed above and a request for ownership of the data is issued to SCD 100 . Data returned from SCD 100 will be handled in the manner previously described.
- SCD 100 returns data and ownership for the cache line requested by SLC 108 A.
- the data is forwarded by PND 102 A to SLC 108 A for processing.
- IP 110 A updates the data.
- SLC 108 A then returns the updated data back to the PND. This could occur because the SLC is explicitly writing the data from its cache during a write back operation, or because another processor on processor bus 104 A requested the data, and SLC 108 A is responding with an updated copy that is provided via processor bus 104 A.
- LT control logic 203 will determine that the original replacement operation to cache 206 should be aborted and the data discarded, since this replacement operation is now associated with an outdated copy of the cache line. Instead, the updated data from IP 110 A will be stored to cache 206 .
- the LT control logic 203 will process the original request entry associated with SLC 108 A in the manner discussed above. That is, after the original request entry is removed from LT 212 , any LT entries linked to this entry are likewise unlinked and removed. As each entry is unlinked, a request for data is issued to processor bus 104 A, followed by a deferred reply that is autonomously associated with the request for data. During this process, a request for data made to processor bus 104 A may result in a miss if that data was returned to PND 102 A during a write back operation. If this occurs, the subsequent deferred response will provide a copy of the data retrieved from cache 206 to whichever SLC 108 is associated with the next request in the linked list.
- the cache may be full when the write back operation is presented by SLC 108 A to pipeline logic 210 .
- the updated data from SLC 108 A cannot be stored to the cache, and instead must be transferred to SCD 100 . If the original request entry did not receive a split response, this data may be provided directly to SCD 100 without delay. If the original request entry did receive a split response, however, the data must be transferred from input queue 200 to one of SCD output buffers 258 .
- LT control logic 203 creates an entry in the one of the output buffer registers 260 that corresponds with the SCD output buffer storing the data.
- This entry contains control bits that activate hold line 262 , thereby preventing the transfer of the updated data to SCD. These hold lines will remain activated until the invalidate-complete response is received from SCD for the original request. When this response is received, LT control logic 203 clears the control bits to deactivate hold line 262 , thereby allowing the transfer of data to occur.
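The hold-line gating can be sketched with a simple flag-based model. The class and method names are assumptions for illustration: data staged in an SCD output buffer cannot be sent while the hold is asserted, and the invalidate-complete response clears it.

```python
# Sketch of hold line 262: a staged cache line is blocked from transfer
# to the SCD until the invalidate-complete response clears the hold.

class SCDOutputBuffer:
    def __init__(self, data):
        self.data = data
        self.hold = True              # control bits activate hold line 262

    def try_send(self):
        """Return the data if the transfer is allowed, else None."""
        return None if self.hold else self.data

    def invalidate_complete(self):
        """LT control logic clears the control bits on invalidate-complete."""
        self.hold = False

buf = SCDOutputBuffer(b"updated cache line")
# While the hold line is active, the transfer does not occur.
# After invalidate_complete(), try_send() returns the staged data.
```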
- LT control logic 203 creates an entry in LT 212 to track the port memory write operation.
- Format field 222 is set to indicate that the entry is associated with a port memory write. This entry is linked to the last entry in the linked list for this cache line in the manner discussed above.
- a linked list of entries containing a port memory write is processed as follows.
- LT control logic 203 will eventually remove the original request entry from LT 212 . Recall that if this request entry is associated with a split response, the removal of this entry will not occur until the associated invalidate-complete response is received. At this time, the replacement operation for the associated data is aborted if the abort flag is set, as is the case in the current example. Thereafter, any linked request entries are unlinked as discussed above. When an entry for the port memory write operation is encountered, the unlinking process is halted. This entry remains stored within LT until a response is returned from SCD 100 indicating the port memory write operation completed successfully. At this time, any LT entries linked to the port memory write entry will be unlinked in the manner discussed above.
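The stall at a port memory write entry can be sketched as follows. The entry tagging ("kind" field) and function name are illustrative assumptions: unlinking walks the list normally but halts at a port memory write entry until the SCD acknowledges the write.

```python
# Walk a linked list of LT entries, halting at a port-memory-write entry
# until the SCD has acknowledged that the write completed successfully.

def unlink_entries(entries, head, write_acknowledged):
    processed = []
    idx = head
    while idx is not None:
        entry = entries[idx]
        if entry["kind"] == "port_memory_write" and not write_acknowledged:
            break                     # stall until the SCD response arrives
        processed.append(idx)
        idx = entry["link"]
    return processed

entries = {0: {"kind": "request", "link": 1},
           1: {"kind": "port_memory_write", "link": 2},
           2: {"kind": "request", "link": None}}
# Without the acknowledgement, only the leading request entry is processed;
# once acknowledged, the port write and its linked entries are unlinked too.
```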
- LT control logic 203 causes a deferred reply to be issued to processor bus 104 A.
- LT control logic 203 further creates an entry within LT 212 for this read request. This new entry is linked to the port memory write entry.
- updated data may be placed on processor bus 104 A by one of the SLCs within the processing node while a port memory write entry is stored within LT 212 .
- This may occur either because an SLC is performing a write back operation to cache 206 , or because the SLC is responding to a request for the data that was issued by another SLC on the processor bus. In either case, this data cannot be stored to cache 206 because the cache is full as discussed above, and a port memory write operation must be scheduled.
- the data is transferred into an available one of SCD output buffers 258 , and LT control logic 203 initializes one of the output buffer registers 260 to activate hold line 262 for this data.
- a second port memory write entry is created in LT 212 , and is linked to the linked list of entries for this cache line. This second port memory write operation will not be allowed to complete until an acknowledgement is received from SCD 100 that the first port memory operation was successfully processed.
- the port memory write entry is removed from LT 212 .
- Any entry linked to this entry is unlinked and processed by issuing a request followed by a deferred reply to whichever SLC and IP are indicated by processor ID field 224 of the entry. It is possible that during this unlinking process, a request for the data will result in a miss both to processor bus 104 A and to cache 206 . In this case, a retry response is issued as the deferred reply. In response to this retry response, the SLC receiving this response will re-issue the original request.
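The deferred-reply decision just described reduces to a small rule, sketched here under assumed names: if the solicited request misses both the processor bus and the cache, a retry is issued instead of data, prompting the SLC to re-issue its original request.

```python
# Decide what the deferred reply carries: the data if it was located on
# the processor bus or in the cache, or a retry indication otherwise.

def deferred_reply(bus_hit, cache_hit, data):
    if not bus_hit and not cache_hit:
        return ("retry", None)        # the SLC will re-issue its request
    return ("data", data)
```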
- the S&I request may by-pass the response. Assume for this example that the request does, in fact, by-pass the earlier issued response that includes the data.
- LT control logic 203 searches LT 212 for an entry associated with the requested cache line. If a request entry exists, indicating a request from SCD 100 by-passed an associated response in the above-described manner, an entry is created to record this SCD request. Because this entry is associated with a request from SCD 100 instead of one of the SLCs, this entry is created within a Remote Tracker (RT) 252 rather than LT 212 .
- RT Remote Tracker
- RT 252 is a storage device used to track all SCD requests that must be delayed because they are requesting the same cache line that is already associated with an entry within LT. In one embodiment, RT 252 is capable of storing sixty-four entries, although any other storage capacity may be utilized in the alternative.
- a RT entry includes information provided with the SCD snoop request such as the cache line address, the snoop request type, and the identity of the processing node that initiated the snoop request.
- a valid RT entry is designated by setting a valid bit stored within the entry.
- the newly created RT entry is linked to the LT conflict entry for this cache line by storing the number of the RT entry within link field 228 of the LT entry along with an indication that the linked entry is stored in RT 252 instead of LT 212 . In the current example, this entry is linked to the request entry that was created because of the request from SLC 108 B.
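Because a link field may now point into either LT 212 or RT 252, the link must carry both an index and an indication of which tracker holds the target. A minimal sketch, with the (store, index) tuple encoding assumed for illustration:

```python
# Sketch of a tagged link field: the LT entry's link field records both
# the index of the next entry and whether that entry lives in LT or RT.

def link_to_rt_entry(lt_entry, rt_index):
    lt_entry["link"] = ("RT", rt_index)

def link_to_lt_entry(lt_entry, lt_index):
    lt_entry["link"] = ("LT", lt_index)

entry = {"address": 0x2000, "valid": True, "link": None}
link_to_rt_entry(entry, 7)
store, index = entry["link"]
# The unlinking logic can now dispatch to the correct tracker.
```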
- LT control logic 203 begins the process of unlinking the entry in RT 252 for this cache line. During this process, LT control logic 203 causes a request to be issued on processor bus 104 A for the data. The type of request that is issued will depend on the access rights currently granted for this data, and on the type of request issued by SCD 100 . In the manner discussed above, this request type may be determined using a lookup table.
- This lookup table may be programmable, and may be stored within a control store memory such as RT control store 290 of FIG. 2 . In another embodiment, this lookup table may instead be retained within LT control store 288 .
- system operations may be revised as needed. For example, the unlinking process may be changed to accommodate system updates.
- Both LT and RT control stores may be programmed using a scan-set interface, as is known in the art.
- LT control logic 203 causes bus control logic 201 to issue a request for SLC 108 B to invalidate its copy. Any updated copy of the data will be obtained from cache 206 for return to SCD 100 . The copy within cache 206 will be invalidated, and the entry will be removed from RT 252 by clearing the valid bit.
- one of SLCs 108 A– 108 D may own the cache line when the request from RT 252 is unlinked.
- LT control logic 203 causes bus control logic 201 to issue a request for return of any modified data copy from the SLC to PND 102 A and to invalidate its data copies.
- PND forwards the modified data to SCD 100 , and further invalidates its copy within cache 206 .
- the RT entry is removed from RT 252 .
- SCD 100 is requesting return of ownership only, while allowing the processing node to retain a read-only copy of the data.
- LT control logic 203 causes bus control logic 201 to issue a request for return of ownership and any updated copy of the cache line.
- the SLC is allowed to retain a read-only copy of the data.
- PND 102 A returns any updated copy of the cache line with ownership to SCD 100 .
- the SLC that most recently retained the data may have stored the updated data back to cache at a time when a replacement operation could not be performed.
- a port memory write operation occurs in the manner discussed above, and an LT entry is created to track the port memory write operation.
- the request to processor bus 104 A for the data will result in a miss, as will a request to cache 206 .
- LT control logic 203 will locate the port memory write entry within LT 212 and re-link the RT entry to this entry.
- when the response for the port memory write operation is received from SCD 100 indicating the write operation is complete, the RT entry may be unlinked. This causes a request to again be issued to processor bus 104 A for the data. Another miss will occur, and a response will be issued to SCD 100 indicating the processing node does not retain the data.
- a RT entry may be linked to a LT entry through the use of link field 228 within the LT entry.
- each RT entry includes a link field 284 .
- An RT entry may be linked to an LT entry in a manner similar to that described above. That is, the RT link field 284 is set to point to an entry within LT that is associated with the same cache line. This may occur as follows. Assume that after the RT entry of the current example has been created, and before the requested cache line is returned to PND 102 A, yet another request for the same cache line is received from SLC 108 C. This new request will be stored within LT 212 in the manner discussed above. Field 284 of the current RT entry will be set to point to this new LT request entry. This new LT request entry may further point to still another LT entry if another request for the same cache line is received.
- LT control logic 203 begins the process of unlinking the next LT entry.
- LT control logic 203 signals bus control logic 201 to issue a request on processor bus 104 A for the data.
- the current LT entry is then removed from LT 212 .
- the request will result in a miss to processor bus 104 A and a miss to cache 206 .
- a retry response will be issued to SLC 108 C.
- This retry response will cause SLC 108 C to re-issue the request for the cache line on processor bus 104 A.
- This request will again result in a miss on both processor bus 104 A and to cache 206 .
- LT control logic 203 will create a request entry within LT, and a request for the cache line will be issued to SCD 100 .
- the unlinking process discussed above could continue for additional LT entries. For example, when the LT entry for SLC 108 C is removed from LT 212 , a next LT entry in the list could be unlinked in a manner similar to that discussed above.
- a retry response will be issued on processor bus 104 A, and the LT entry will be removed from LT 212 . This retry response will cause the target SLC to issue another request for the cache line, which will result in a miss on processor bus 104 A and a miss to cache 206 .
- a LT entry will be created within LT 212 that is linked to the request entry created for SLC 108 C.
- a linked list of request entries may include multiple LT entries. However, this linked list will include, at most, one RT entry. This is because SCD 100 will not issue a request for return of a cache line while another request issued by SCD for the same cache line is still outstanding.
- Split responses are tracked by the PND in a special manner using LT 212 as follows. Assume that PND 102 A issues a request for data and ownership to SCD 100 . Directory 101 indicates that one or more other processing nodes within the system retain a read-only copy of this data. These copies must be invalidated so that processing node 120 A can update the requested data. Therefore, SCD issues one or more invalidation requests to these other processing nodes to invalidate the read-only copies.
- Before SCD receives an acknowledgement from these other processing nodes indicating that the one or more invalidation operations have completed, SCD provides the requested data to PND 102 A via SCD response channel 103 and interface 109 A.
- the data is provided along with the original transaction identifier, and a response type of “data-with-invalidate-pending”, which indicates that the data is being provided before the invalidation operations have been completed.
- When PND 102 A receives the data, it is processed in the manner discussed above. That is, a transaction identifier provided with this response is used to address LT 212 to obtain the deferred identifier for the request. This identifier is used to issue a deferred phase along with the data to the SLC 108 that issued the initial request. This data can be forwarded to the requesting IP to allow that processor to continue processing activities. In addition, a replacement operation is scheduled to store the returned data to cache 206 and update cache tag logic 204 .
- LT control logic 203 updates the entry, setting response type field 226 to a response type of invalidate-pending. This records that invalidation operations are outstanding for this request.
- SCD 100 issued one or more invalidation requests to one or more other processing nodes to request invalidation of the read-only copies of the current cache line data.
- a PND of a processing node receives an invalidation request from SCD, all read-only copies of the data stored within an IP, SLC, or the shared cache of that processing node will be invalidated. The PND will then respond to SCD 100 with an invalidation acknowledge, which, in one embodiment, is issued on SCD response channel 103 .
- Response channel 103 is coupled to acknowledge tracker 107 , which is tracking all outstanding invalidation activities for the cache line.
- acknowledge tracker 107 signals SCD request channel 105 to issue an acknowledgement that is referred to as an invalidate-complete response. This response is sent via response channel 103 and interface 109 A to input queue 240 of PND 102 A.
- An invalidate-complete response includes a transaction identifier.
- LT control logic 203 utilizes this transaction identifier to address LT 212 and obtain the associated request entry, which will have a response type in field 226 of invalidate-pending. Because the outstanding invalidate-complete response has been received for the cache line, the request entry may now be removed from LT 212 . This is accomplished by clearing the valid bit for this entry. At this time, any linked entries may be unlinked in the manner discussed above.
- a second type of entry known as a “conflict entry” may also be linked to the linked list of entries.
- a conflict entry is created after data has been provided to a processing node with a split response in the manner discussed above.
- the IP identified in processor ID field 224 of this request entry becomes known as an “invalidate-pending” processor.
- This IP will be considered an invalidate-pending processor as long as there is at least one request entry within LT 212 for that IP having a response type in field 226 of invalidate-pending.
- the invalidate-pending LT entries for IPs 110 A– 110 D are tracked by vector registers 250 A– 250 D, respectively. In one embodiment, these registers store a master-bitted value for this purpose.
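One reading of the master-bitted tracking just described can be sketched with a per-IP bit vector: each bit position corresponds to an LT entry index, set while that entry is invalidate-pending. The bit-per-index encoding and function names are assumptions.

```python
# Sketch of vector registers 250A-250D: one register per IP, one bit per
# LT entry. An IP remains "invalidate-pending" while any bit is set.

vector_registers = {"IP 110A": 0, "IP 110B": 0}

def set_invalidate_pending(ip, lt_index):
    vector_registers[ip] |= (1 << lt_index)

def clear_invalidate_pending(ip, lt_index):
    vector_registers[ip] &= ~(1 << lt_index)

def is_invalidate_pending(ip):
    return vector_registers[ip] != 0

set_invalidate_pending("IP 110A", 3)
set_invalidate_pending("IP 110A", 9)
clear_invalidate_pending("IP 110A", 3)
# Entry 9 is still outstanding, so IP 110A remains invalidate-pending.
```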
- a conflict entry is created in LT 212 by LT control logic 203 .
- This type of entry is differentiated from request entries by setting a conflict flag in field 235 .
- This entry further includes address field 220 , which stores the address of the updated cache line.
- Processor ID field 224 stores an identifier indicating which invalidate-pending processor provided the data written to cache 206 .
- Link field 228 is used in the manner discussed above to link this entry to any future LT entry that is associated with the current cache line. This may include additional request and/or conflict entries as discussed above.
- valid bit in field 236 is activated to indicate the LT entry is valid.
- When LT control logic 203 creates a conflict entry within LT, an associated entry is created within snapshot device 232 .
- This snapshot entry records all potential conflicts that may exist for the cache line associated with this entry. This cache line will not be allowed to exit the processing node until all of the potential conflicts recorded by the conflict and associated snapshot entries have been cleared.
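The snapshot behavior can be sketched as a captured set of outstanding requests: the conflict clears only when every request recorded at creation time has received its invalidate-complete response. The set-based model and names are illustrative assumptions.

```python
# Sketch of a snapshot entry: it records the invalidate-pending requests
# outstanding when the conflict entry was created; the cache line may not
# exit the processing node until all recorded conflicts have cleared.

class SnapshotEntry:
    def __init__(self, pending_request_ids):
        self.pending = set(pending_request_ids)   # potential conflicts

    def invalidate_complete(self, request_id):
        """Clear one conflict when its invalidate-complete arrives."""
        self.pending.discard(request_id)

    def cleared(self):
        return not self.pending       # line may now leave the node

snapshot = SnapshotEntry({1, 4})
snapshot.invalidate_complete(1)
# Request 4 is still outstanding, so the conflict entry is retained.
```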
- a conflict entry of the type described above may also be created to record the occurrence of a request that is received from SCD 100 .
- SCD issues an S&I request for a cache line to PND 102 A.
- the PND will issue a request on processor bus 104 A for return of the data, and will further read cache 206 .
- the requested data is either obtained from an invalidate-pending processor, or resides within cache 206 and is associated with a conflict entry within LT 212 .
- LT control logic 203 creates a conflict entry within LT 212 for the cache line. This LT entry will be linked to the linked list of request and/or conflict entries associated with the same cache line.
- an associated entry is created within snapshot device 232 to store any potential conflicts that may exist for the current cache line.
- an entry is created in RT 252 .
- the newly created RT entry is linked to the LT conflict entry for this cache line by storing the number of the RT entry within link field 228 of the LT entry along with an indication that the linked entry is stored in RT 252 instead of LT 212 .
- Processing of a linked list containing conflict entries occurs as follows.
- the first request entry in the linked list is processed only after all invalidation operations associated with the data have been completed. This means that in the case of split responses, a request entry is not removed from LT until the associated invalidate-complete response is received from SCD 100 . Thereafter, the unlinking of request entries proceeds in the manner discussed above. This is generally accomplished using request and deferred reply operations that are autonomously linked as previously described.
- when a conflict entry is encountered, the unlinking stalls.
- a conflict entry is not removed from LT 212 until all invalidate-pending request entries being tracked by this conflict entry and the associated snapshot are cleared. This occurs when corresponding ones of the invalidate-complete responses are received from SCD 100 .
- eventually, a conflict entry is removed from LT. If the removed conflict entry points to a RT entry, the unlinking of this RT entry occurs as follows. The RT entry is removed from RT 252 , and LT control logic 203 signals bus control logic 201 to re-issue the request for the cache line on processor bus 104 A. Pipeline logic 210 will also initiate a request to cache control logic 202 . These requests will result in a processor bus miss, and a hit to cache 206 . LT control logic 203 will determine that all conflicts have been cleared for the current cache line, and the data from cache 206 will be forwarded to output queue 230 for transfer to SCD 100 .
- LT control logic 203 unlinks the next LT conflict entry on the linked list by re-issuing a request for the cache line to processor bus 104 A, and by initiating a read to cache 206 . Because the cache line was returned to SCD 100 during the previous transaction, both operations will result in a miss. As a result, PND 102 A will issue a retry indication to processor bus 104 A, causing this IP to re-issue the request for this cache line.
- When this request is re-issued, a request entry will be created within LT in the manner discussed above, and a request will be made to SCD for the cache line. This process effectively converts the conflict entry into a request entry. Any subsequent conflict entries in the linked list can be converted to request entries in a similar manner. These additional request entries will be linked to the request entry that results in the request to SCD 100 .
- the above-described invention provides a system and method for ordering the processing of requests for the same cache line that originate within the same processing node before a request that is received from SCD 100 . This prevents data thrashing that can occur because data is transferred from a processing node as a result of a request from SCD that is received after an earlier request by an IP for the same processing node.
- the current invention can significantly reduce the time required to deactivate the lock cell, allowing additional processing to occur on the protected data.
- the current invention further provides a mechanism for linking, and later processing, multiple types of entries, including various request entries, port memory write entries, conflict entries, and entries from SCD requests in a manner that ensures memory coherency is maintained.
- FIGS. 3A and 3B , when arranged as shown in FIG. 3 , are a flow diagram illustrating one method according to the current invention.
- shared cache logic receives a request for data from one of multiple requesters that are coupled to shared cache logic.
- the multiple requesters include multiple processors within the same processing node. If the request results in a miss to the shared cache logic, and further assuming none of the other multiple requesters retains a modified copy of the requested data, the shared cache logic issues a response to the requester indicating the data will be provided at a later time ( 302 ).
- a record is created for the request ( 304 ).
- This record includes information identifying the requester, the type of the request, and the address of the requested data. If another request is already pending for the same data, the newly created record is linked to the record created for the previous request. Otherwise, if another request is not pending for this data, a request for the data is issued to a main memory that is coupled to the shared cache logic ( 306 ). Steps 300 – 306 may be repeated as needed, with any records for the same data being linked together into a linked list ( 308 ). While this process is occurring, a request for data may be received from the main memory. If this request results in a miss to shared cache logic, and if none of the multiple requesters retains a copy of the requested data, a record is created for this memory request ( 310 ). This record will be linked to any records already existing for the same data.
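Steps 300 through 308 can be summarized in a short sketch: on a miss, a record is created and linked to any pending record for the same data, and a memory request is issued only when no earlier request for that data is outstanding. The record layout and function names are assumptions.

```python
# Sketch of steps 300-308: create a record per miss, link records for the
# same data into a list, and issue at most one request to main memory.

records = []                          # simple stand-in for the tracker

def handle_miss(requester, request_type, address, issue_to_memory):
    record = {"requester": requester, "type": request_type,
              "address": address, "link": None}
    prior = [r for r in records if r["address"] == address]
    records.append(record)
    if prior:
        prior[-1]["link"] = record    # steps 304/308: extend the linked list
    else:
        issue_to_memory(address)      # step 306: only the first request goes out

issued = []
handle_miss("IP0", "read", 0x40, issued.append)
handle_miss("IP1", "read", 0x40, issued.append)
# Only one request to main memory is issued for the shared cache line.
```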
- this data When data is received by shared cache logic from the main memory, this data is forwarded to whichever requester first requested the data, as identified by the oldest record that is associated with this data ( 312 ). Additionally, a replacement operation is scheduled to store the data to the cache and update the cache tag logic.
- Main memory signals the completion of the invalidation operations at a later time via an invalidate-complete response provided to shared cache logic.
- A request is then issued to the requester that most recently retained the data. This request solicits the return of that data to shared cache logic ( 316 ). If the current record identifies one of the multiple requesters, and if the issued request results in a miss both to the target requester and to shared cache logic, a retry response is issued to the requester identified in the current record, causing this requester to re-issue the request to the shared cache at a later time. This will eventually result in the issuance of another request to memory, as was discussed above in reference to FIG. 2 . This request will be handled as described in step 300 et seq. Otherwise, if a cache miss does not result, the returned data is provided to the identified requester ( 318 ).
- If the current record does not identify one of the multiple requesters but instead identifies the main memory, and if a miss occurs (as may result from a previously issued port memory write operation, described above in regard to FIG. 2 ), the current record is linked to the request entry previously created for that operation. Otherwise, if a miss does not occur, the data returned as a result of the request is forwarded to main memory ( 320 ).
- The current record is then removed from the linked list. If another record remains in the current linked list, the next record in the list becomes the current record ( 322 ). Any requests received during processing of the linked list, or any time thereafter, are handled according to step 300 et seq. ( 324 ), as indicated by arrow 325 .
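The oldest-first servicing of linked records (steps 312 – 322 ) can be modeled as follows. This is an illustrative sketch only; the callback names are hypothetical, and corner cases such as retry responses and port memory write conflicts are omitted.

```python
from collections import deque

def service_linked_records(records, forward_data, request_return):
    """Illustrative model of steps 312-322: service linked request
    records oldest-first once data arrives from main memory.

    The oldest record receives the memory data directly (312).  For
    each later record, a request is issued to the requester that most
    recently retained the data (316), and the returned data is then
    provided to the requester identified in that record (318).  Each
    record is removed from the list after it is serviced (322)."""
    last_holder = None                       # None means main memory holds the data
    while records:
        current = records.popleft()          # oldest remaining record first
        if last_holder is None:
            forward_data(current["requester"])   # step 312: forward memory data
        else:
            request_return(last_holder)          # step 316: solicit data return
            forward_data(current["requester"])   # step 318: pass data along
        last_holder = current["requester"]
```

In this sketch, ownership of the data effectively passes down the linked list from requester to requester, which is what keeps the shared memory coherent while multiple requests for the same data are outstanding.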
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Memory System Of A Hierarchy Structure (AREA)
Abstract
Description
Claims (18)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/601,030 US7222222B1 (en) | 2003-06-20 | 2003-06-20 | System and method for handling memory requests in a multiprocessor shared memory system |
US11/784,238 US7533223B1 (en) | 2003-06-20 | 2007-04-06 | System and method for handling memory requests in a multiprocessor shared memory system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/601,030 US7222222B1 (en) | 2003-06-20 | 2003-06-20 | System and method for handling memory requests in a multiprocessor shared memory system |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/784,238 Continuation US7533223B1 (en) | 2003-06-20 | 2007-04-06 | System and method for handling memory requests in a multiprocessor shared memory system |
Publications (1)
Publication Number | Publication Date |
---|---|
US7222222B1 true US7222222B1 (en) | 2007-05-22 |
Family
ID=38049655
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/601,030 Expired - Lifetime US7222222B1 (en) | 2003-06-20 | 2003-06-20 | System and method for handling memory requests in a multiprocessor shared memory system |
US11/784,238 Expired - Lifetime US7533223B1 (en) | 2003-06-20 | 2007-04-06 | System and method for handling memory requests in a multiprocessor shared memory system |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/784,238 Expired - Lifetime US7533223B1 (en) | 2003-06-20 | 2007-04-06 | System and method for handling memory requests in a multiprocessor shared memory system |
Country Status (1)
Country | Link |
---|---|
US (2) | US7222222B1 (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8769232B2 (en) * | 2011-04-06 | 2014-07-01 | Western Digital Technologies, Inc. | Non-volatile semiconductor memory module enabling out of order host command chunk media access |
USD826162S1 (en) | 2017-06-12 | 2018-08-21 | Norman R. Byrne | Electrical receptacle |
USD870672S1 (en) | 2017-06-12 | 2019-12-24 | Norman R. Byrne | Electrical receptacle |
USD977431S1 (en) | 2019-09-06 | 2023-02-07 | Norman R. Byrne | Electrical extension cord |
- 2003-06-20 US US10/601,030 patent/US7222222B1/en not_active Expired - Lifetime
- 2007-04-06 US US11/784,238 patent/US7533223B1/en not_active Expired - Lifetime
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5995967A (en) * | 1996-10-18 | 1999-11-30 | Hewlett-Packard Company | Forming linked lists using content addressable memory |
US6820086B1 (en) * | 1996-10-18 | 2004-11-16 | Hewlett-Packard Development Company, L.P. | Forming linked lists using content addressable memory |
US6434641B1 (en) * | 1999-05-28 | 2002-08-13 | Unisys Corporation | System for reducing the number of requests presented to a main memory in a memory storage system employing a directory-based caching scheme |
US6611906B1 (en) * | 2000-04-30 | 2003-08-26 | Hewlett-Packard Development Company, L.P. | Self-organizing hardware processing entities that cooperate to execute requests |
US6546465B1 (en) * | 2000-08-31 | 2003-04-08 | Hewlett-Packard Development Company, L.P. | Chaining directory reads and writes to reduce DRAM bandwidth in a directory based CC-NUMA protocol |
US6973550B2 (en) * | 2002-10-02 | 2005-12-06 | Intel Corporation | Memory access control |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080005257A1 (en) * | 2006-06-29 | 2008-01-03 | Kestrelink Corporation | Dual processor based digital media player architecture with network support |
US20100005470A1 (en) * | 2008-07-02 | 2010-01-07 | Cradle Technologies, Inc. | Method and system for performing dma in a multi-core system-on-chip using deadline-based scheduling |
US8151008B2 (en) | 2008-07-02 | 2012-04-03 | Cradle Ip, Llc | Method and system for performing DMA in a multi-core system-on-chip using deadline-based scheduling |
US9032104B2 (en) | 2008-07-02 | 2015-05-12 | Cradle Ip, Llc | Method and system for performing DMA in a multi-core system-on-chip using deadline-based scheduling |
US20150339398A1 (en) * | 2009-12-15 | 2015-11-26 | At & T Intellectual Property I, L.P. | Footprint Tracking Of Contacts |
US9922127B2 (en) * | 2009-12-15 | 2018-03-20 | At&T Intellectual Property I, L.P. | Footprint tracking of contacts |
US10228869B1 (en) * | 2017-09-26 | 2019-03-12 | Amazon Technologies, Inc. | Controlling shared resources and context data |
US10298496B1 (en) | 2017-09-26 | 2019-05-21 | Amazon Technologies, Inc. | Packet processing cache |
US10911358B1 (en) | 2017-09-26 | 2021-02-02 | Amazon Technologies, Inc. | Packet processing cache |
Also Published As
Publication number | Publication date |
---|---|
US7533223B1 (en) | 2009-05-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7533223B1 (en) | System and method for handling memory requests in a multiprocessor shared memory system | |
US7366847B2 (en) | Distributed cache coherence at scalable requestor filter pipes that accumulate invalidation acknowledgements from other requestor filter pipes using ordering messages from central snoop tag | |
US6199144B1 (en) | Method and apparatus for transferring data in a computer system | |
US6990559B2 (en) | Mechanism for resolving ambiguous invalidates in a computer system | |
US6625698B2 (en) | Method and apparatus for controlling memory storage locks based on cache line ownership | |
US5652859A (en) | Method and apparatus for handling snoops in multiprocessor caches having internal buffer queues | |
US7047322B1 (en) | System and method for performing conflict resolution and flow control in a multiprocessor system | |
US6189078B1 (en) | System and method for increasing data transfer throughput for cache purge transactions using multiple data response indicators to maintain processor consistency | |
US7003635B2 (en) | Generalized active inheritance consistency mechanism having linked writes | |
US6374332B1 (en) | Cache control system for performing multiple outstanding ownership requests | |
US7284097B2 (en) | Modified-invalid cache state to reduce cache-to-cache data transfer operations for speculatively-issued full cache line writes | |
US20050188159A1 (en) | Computer system supporting both dirty-shared and non dirty-shared data processing entities | |
JPH10254773A (en) | Accessing method, processor and computer system | |
US20140068201A1 (en) | Transactional memory proxy | |
US6477620B1 (en) | Cache-level return data by-pass system for a hierarchical memory | |
US10761987B2 (en) | Apparatus and method for processing an ownership upgrade request for cached data that is issued in relation to a conditional store operation | |
US7051163B2 (en) | Directory structure permitting efficient write-backs in a shared memory computer system | |
US7260677B1 (en) | Programmable system and method for accessing a shared memory | |
EP3644190B1 (en) | I/o coherent request node for data processing network with improved handling of write operations | |
US6202126B1 (en) | Victimization of clean data blocks | |
US7024520B2 (en) | System and method enabling efficient cache line reuse in a computer system | |
US6892290B2 (en) | Linked-list early race resolution mechanism | |
US7000080B2 (en) | Channel-based late race resolution mechanism for a computer system | |
US7032079B1 (en) | System and method for accelerating read requests within a multiprocessor system | |
US7065614B1 (en) | System and method for maintaining memory coherency within a multi-processor data processing system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: UNISYS CORPORATION, MINNESOTA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:VARTTI, KELVIN S.;WEBER, ROSS M.;REEL/FRAME:014226/0811 Effective date: 20030618 |
|
AS | Assignment |
Owner name: CITIBANK, N.A., NEW YORK Free format text: SECURITY AGREEMENT;ASSIGNORS:UNISYS CORPORATION;UNISYS HOLDING CORPORATION;REEL/FRAME:018003/0001 Effective date: 20060531 |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
AS | Assignment |
Owner name: UNISYS CORPORATION, PENNSYLVANIA Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CITIBANK, N.A.;REEL/FRAME:023312/0044 Effective date: 20090601 Owner name: UNISYS HOLDING CORPORATION, DELAWARE Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CITIBANK, N.A.;REEL/FRAME:023312/0044 Effective date: 20090601 Owner name: UNISYS CORPORATION,PENNSYLVANIA Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CITIBANK, N.A.;REEL/FRAME:023312/0044 Effective date: 20090601 Owner name: UNISYS HOLDING CORPORATION,DELAWARE Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CITIBANK, N.A.;REEL/FRAME:023312/0044 Effective date: 20090601 |
|
AS | Assignment |
Owner name: UNISYS CORPORATION, PENNSYLVANIA Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CITIBANK, N.A.;REEL/FRAME:023263/0631 Effective date: 20090601 Owner name: UNISYS HOLDING CORPORATION, DELAWARE Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CITIBANK, N.A.;REEL/FRAME:023263/0631 Effective date: 20090601 Owner name: UNISYS CORPORATION,PENNSYLVANIA Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CITIBANK, N.A.;REEL/FRAME:023263/0631 Effective date: 20090601 Owner name: UNISYS HOLDING CORPORATION,DELAWARE Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:CITIBANK, N.A.;REEL/FRAME:023263/0631 Effective date: 20090601 |
|
AS | Assignment |
Owner name: DEUTSCHE BANK TRUST COMPANY AMERICAS, AS COLLATERA Free format text: PATENT SECURITY AGREEMENT (PRIORITY LIEN);ASSIGNOR:UNISYS CORPORATION;REEL/FRAME:023355/0001 Effective date: 20090731 |
|
AS | Assignment |
Owner name: DEUTSCHE BANK TRUST COMPANY AMERICAS, AS COLLATERA Free format text: PATENT SECURITY AGREEMENT (JUNIOR LIEN);ASSIGNOR:UNISYS CORPORATION;REEL/FRAME:023364/0098 Effective date: 20090731 |
|
FPAY | Fee payment |
Year of fee payment: 4 |
|
AS | Assignment |
Owner name: GENERAL ELECTRIC CAPITAL CORPORATION, AS AGENT, IL Free format text: SECURITY AGREEMENT;ASSIGNOR:UNISYS CORPORATION;REEL/FRAME:026509/0001 Effective date: 20110623 |
|
AS | Assignment |
Owner name: UNISYS CORPORATION, PENNSYLVANIA Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:DEUTSCHE BANK TRUST COMPANY;REEL/FRAME:030004/0619 Effective date: 20121127 |
|
AS | Assignment |
Owner name: UNISYS CORPORATION, PENNSYLVANIA Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:DEUTSCHE BANK TRUST COMPANY AMERICAS, AS COLLATERAL TRUSTEE;REEL/FRAME:030082/0545 Effective date: 20121127 |
|
FPAY | Fee payment |
Year of fee payment: 8 |
|
AS | Assignment |
Owner name: CITIBANK, N.A., NEW YORK Free format text: CORRECTIVE ASSIGNMENT TO CORRECT THE APPLICATION NUMBER 10/604030 PREVIOUSLY RECORDED AT REEL: 018003 FRAME: 0001. ASSIGNOR(S) HEREBY CONFIRMS THE ASSIGNMENT;ASSIGNORS:UNISYS CORPORATION;UNISYS HOLDING CORPORATION;REEL/FRAME:038519/0224 Effective date: 20060531 |
|
AS | Assignment |
Owner name: WELLS FARGO BANK, NATIONAL ASSOCIATION, AS COLLATE Free format text: PATENT SECURITY AGREEMENT;ASSIGNOR:UNISYS CORPORATION;REEL/FRAME:042354/0001 Effective date: 20170417 Owner name: WELLS FARGO BANK, NATIONAL ASSOCIATION, AS COLLATERAL TRUSTEE, NEW YORK Free format text: PATENT SECURITY AGREEMENT;ASSIGNOR:UNISYS CORPORATION;REEL/FRAME:042354/0001 Effective date: 20170417 |
|
AS | Assignment |
Owner name: JPMORGAN CHASE BANK, N.A., AS ADMINISTRATIVE AGENT, ILLINOIS Free format text: SECURITY INTEREST;ASSIGNOR:UNISYS CORPORATION;REEL/FRAME:044144/0081 Effective date: 20171005 Owner name: JPMORGAN CHASE BANK, N.A., AS ADMINISTRATIVE AGENT Free format text: SECURITY INTEREST;ASSIGNOR:UNISYS CORPORATION;REEL/FRAME:044144/0081 Effective date: 20171005 |
|
AS | Assignment |
Owner name: UNISYS CORPORATION, PENNSYLVANIA Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:WELLS FARGO BANK, NATIONAL ASSOCIATION (SUCCESSOR TO GENERAL ELECTRIC CAPITAL CORPORATION);REEL/FRAME:044416/0358 Effective date: 20171005 |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 12TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1553); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 12 |
|
AS | Assignment |
Owner name: UNISYS CORPORATION, PENNSYLVANIA Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:WELLS FARGO BANK, NATIONAL ASSOCIATION;REEL/FRAME:054231/0496 Effective date: 20200319 |