US20090187695A1 - Handling concurrent address translation cache misses and hits under those misses while maintaining command order


Info

Publication number
US20090187695A1
US20090187695A1 (application US12/351,900)
Authority
US
United States
Prior art keywords
address translation
cache
command
processing unit
misses
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US12/351,900
Inventor
John D. Irish
Chad B. McBride
Ibrahim A. Ouda
Andrew H. Wottreng
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp filed Critical International Business Machines Corp
Priority to US12/351,900 priority Critical patent/US20090187695A1/en
Publication of US20090187695A1 publication Critical patent/US20090187695A1/en
Abandoned legal-status Critical Current

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02: Addressing or allocation; Relocation
    • G06F12/08: Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/10: Address translation
    • G06F12/1027: Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]


Abstract

Apparatus handles concurrent address translation cache misses and hits under those misses while maintaining command order based upon virtual channel. Commands are stored in a command processing unit that maintains ordering of the commands. A command buffer index (CBI) is assigned to each address being sent from the command processing unit to an address translation unit. When an address translation cache miss occurs, a memory fetch request is sent. The CBI is passed back to the command processing unit with a signal to indicate that the fetch request has completed. The command processing unit uses the CBI to locate the command and address to be reissued to the address translation unit.

Description

  • This application is a continuation application of Ser. No. 11/420,884 filed on May 30, 2006.
  • FIELD OF THE INVENTION
  • The present invention relates generally to the data processing field, and more particularly, relates to a method and apparatus for handling concurrent address translation cache misses and hits under those misses while maintaining command order when architecturally required.
  • DESCRIPTION OF THE RELATED ART
  • Load and store commands in an input/output (I/O) subsystem target multiple sources and destinations, respectively. A virtual channel number distinguishes these targets. All commands that share a virtual channel number are required to maintain a user-defined order, which may range from relaxed to strict ordering. The ordering policy is not known until after address translation is complete, so a strict ordering policy is assumed throughout the translation process, which means all commands must complete in the order in which they are issued within a virtual channel. Commands with different virtual channels can complete in a different order or can pass each other.
  • Incoming load and store commands signal to the I/O interface a virtual address, which corresponds to that I/O device's view of memory. This virtual address needs to be translated into a real address corresponding to the processor's view of the memory map.
  • Typically an address translation cache is used, as part of the translation process, to take advantage of the temporal and spatial locality of the I/O command addressing. In a system where the address translation unit uses a cache to hold the translation table entries, a cache miss will affect the command flow due to the added time required for performing a memory fetch and then re-translating the address of the corresponding command.
  • Ideally, the translation unit continues to enable translations to occur while a cache miss is being handled. This is referred to as hits under a miss for those commands that have translation table hits.
  • Some known arrangements only allow one miss at a time while allowing multiple hits under that miss. This solution is not generally effective since it fails to take advantage of the memory fetch pipeline, which allows multiple memory fetches in process at a time.
  • A need exists for an effective mechanism that allows handling concurrent misses where another translation cache miss occurs under the current translation cache miss and then to continue allowing translations under multiple misses. A need exists for such mechanism that enables maintaining command order based upon virtual channel and that prevents a single virtual channel from consuming all of the miss handling resources.
  • SUMMARY OF THE INVENTION
  • Principal aspects of the present invention are to provide apparatus for handling concurrent address translation cache misses and hits under those misses while maintaining command order based upon virtual channel (VC). Other important aspects of the present invention are to provide such apparatus for handling concurrent address translation cache misses and hits under those misses while maintaining command order based upon virtual channel substantially without negative effect and that overcome many of the disadvantages of prior art arrangements.
  • In brief, apparatus are provided for handling concurrent address translation cache misses and hits under those misses while maintaining command order based upon virtual channel. Commands are stored in an input command queue in a command processing unit maintaining ordering of the commands. A command buffer index (CBI) is assigned to each address being sent from the command processing unit to an address translation unit. When an address translation cache miss occurs, a memory fetch request is sent. When the cache table entry is loaded into the cache, the CBI is passed back to the command processing unit with a signal to indicate that the fetch request has completed. The command processing unit uses the CBI to locate the command and address to be reissued to the address translation unit.
  • In accordance with features of the invention, the CBI is stored in the address translation unit in a mapping array coupled to a miss fetch unit. The mapping array is indexed by a unique command identifier (CI) for the memory fetch request. The address translation cache miss occurs due to a segment table cache miss or a page table cache miss. The memory fetch request is sent to get a page table entry or a segment table entry depending on the type of cache miss. Additional information stored with the CBI in the mapping array includes the fetch type (segment or page table fetch), a segment table cache set used for indexing into the cache, a page table cache set used for indexing into the cache, and an input/output identification (IOID).
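The mapping-array entry described above can be sketched as a small record, with the array indexed by the command identifier (CI) of the memory fetch. This is a minimal illustration; the field names and types are assumptions, not taken from the patent.

```python
from dataclasses import dataclass

@dataclass
class MissMapEntry:
    cbi: int          # command buffer index, handed back on fetch completion
    fetch_type: str   # "segment" or "page" table fetch
    seg_set: int      # segment table cache set used for indexing
    page_set: int     # page table cache set used for indexing
    ioid: int         # identification of the issuing I/O device

# The mapping array: CI -> entry
miss_map: dict[int, MissMapEntry] = {}

# A page-table miss for the command at CBI 3, tagged with CI 7:
miss_map[7] = MissMapEntry(cbi=3, fetch_type="page", seg_set=0, page_set=25, ioid=1)
assert miss_map[7].cbi == 3
```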
  • In accordance with features of the invention, the command processing unit stops issuing address translation requests when a virtual channel has a predefined number of outstanding address translation requests. This prevents all miss fetch resources from being consumed by a single virtual channel.
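This per-virtual-channel throttle can be sketched as a simple check before issue. An illustrative model only; the limit value and names are assumptions.

```python
VC_LIMIT = 4  # assumed predefined per-VC limit on outstanding requests

def can_issue(outstanding: dict[int, int], vc: int, limit: int = VC_LIMIT) -> bool:
    """True if the virtual channel is below its outstanding-request limit."""
    return outstanding.get(vc, 0) < limit

outstanding = {0: 4, 1: 2}       # VC 0 has 4 requests in flight, VC 1 has 2
assert not can_issue(outstanding, 0)  # VC 0 at the limit: stalled
assert can_issue(outstanding, 1)      # VC 1 may still issue
```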
  • In accordance with features of the invention, a limit on the number of cache misses for a single congruence class prevents over-allocating a cache set for a congruence class. When a congruence class is fully allocated to outstanding misses, an additional miss to that congruence class will be denied and the command processing unit will re-issue that miss at a later time. Other conditions exist that force the command processing unit to re-issue or stall translation requests that have translation cache misses. Examples are subsequent misses with the same IOID, VC and IO Bus as a previous miss that is in process. Also, commands that hit under a cache miss may or may not be allowed to continue depending on the storage ordering (SO) bits found in the page table entry (PTE). These bits dictate the ordering rules for commands using that PTE.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The present invention together with the above and other objects and advantages may best be understood from the following detailed description of the preferred embodiments of the invention illustrated in the drawings, wherein:
  • FIGS. 1A and 1B together illustrate apparatus for handling concurrent address translation cache misses and hits under those misses while maintaining command order based upon virtual channel in accordance with the preferred embodiment.
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • In accordance with features of the invention, a problem solved is that of handling concurrent misses to a translation cache, keeping track of miss correspondence to memory fetch data, while allowing hits under those misses as well as preventing a single VC from consuming all miss fetch resources. A method is provided for handling concurrent address translation cache misses and hits under those misses while maintaining command order when required. Commands must be performed in order if they are from the same I/O bus, same virtual channel or same I/O device, and if the page table storage ordering bits indicate strict ordering. The invention also accommodates concurrent hardware and software loading the cache.
  • In accordance with features of the invention, an I/O command queue and translation cache structure are provided that allows concurrent cache misses and hits under those misses without allowing a single virtual channel to consume all of the miss fetch resources. There is a predefined limit to the number of cache misses for a single VC which, when reached, stalls the traffic for the VC. The command processing unit and the translation unit both need to be aware of this predefined limit.
  • In accordance with features of the invention, for an M-way segment table cache and an N-way page table cache, a congruence class of the segment table cache can have at most M outstanding misses and a congruence class of the page table cache at most N outstanding misses, at which point all translation requests are denied and re-issued. For example, with a 4-way segment cache and an 8-way page cache, 8 outstanding misses to set 25 of the page table cache and 0 outstanding misses to the segment table cache would result in a stall in which no additional commands are accepted by the address translation unit until at least one of the current outstanding misses completes.
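The stall condition in this example can be modeled as follows. This is a sketch; the dictionary keyed by cache type and congruence class is an assumed bookkeeping structure, and the way counts mirror the 4-way/8-way example above.

```python
SEG_WAYS, PAGE_WAYS = 4, 8  # ways per congruence class, per the example

def translation_stalled(outstanding: dict) -> bool:
    """True when any congruence class has as many outstanding misses as
    its cache has ways, so no further translation requests are accepted."""
    return any(count >= (SEG_WAYS if cache == "segment" else PAGE_WAYS)
               for (cache, cclass), count in outstanding.items())

# 8 outstanding misses to set 25 of the page cache, none to the segment cache:
assert translation_stalled({("page", 25): 8, ("segment", 25): 0})
assert not translation_stalled({("page", 25): 7})  # one slot still free
```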
  • When there is a miss, subsequent translations that hit in the cache can proceed and complete if the accesses came from a different I/O bus, a different virtual channel or a different I/O device or if the page table storage ordering bits indicate that the accesses need not be in strict order.
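This completion rule for hits under a miss can be expressed as a predicate. Field and parameter names are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Cmd:
    bus: int    # I/O bus the access came from
    vc: int     # virtual channel number
    ioid: int   # I/O device identification

def hit_may_complete(hit: Cmd, miss: Cmd, so_strict: bool) -> bool:
    """A translation that hits while a miss is outstanding may complete if
    it differs from the miss in I/O bus, virtual channel, or I/O device,
    or if the page-table storage ordering (SO) bits permit relaxed order."""
    different_source = (hit.bus != miss.bus or hit.vc != miss.vc
                        or hit.ioid != miss.ioid)
    return different_source or not so_strict

pending_miss = Cmd(bus=0, vc=1, ioid=2)
assert hit_may_complete(Cmd(bus=0, vc=3, ioid=2), pending_miss, so_strict=True)
assert not hit_may_complete(Cmd(bus=0, vc=1, ioid=2), pending_miss, so_strict=True)
assert hit_may_complete(Cmd(bus=0, vc=1, ioid=2), pending_miss, so_strict=False)
```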
  • Having reference now to the drawings, in FIGS. 1A and 1B, there is shown an apparatus for handling concurrent address translation cache misses and hits under those misses while maintaining command order based upon virtual channel or a central processor unit (CPU) generally designated by the reference 100 in accordance with the preferred embodiment. CPU 100 includes a command processing unit generally designated by the reference 102 shown in FIG. 1B; and an I/O address translation unit generally designated by the reference 104, an embedded processor 106 together with software 108, a memory 110, and an Element Interconnect Bus (EIB) 112 shown in FIG. 1A.
  • Addresses for commands are passed from the command processing unit 102 to the I/O address translation unit 104 in the order that the commands are sent from an IO device 116. This ordering is assumed to be strict ordering because the ordering rules for the command have not yet been read from the page table entry.
  • Referring to FIG. 1B, command processing unit 102 includes an input command queue 118 and a command buffer index function 120. The input command queue 118 maintains ordering of the commands from the IO device 116. The command buffer index function 120 assigns a command buffer index (CBI) for each address of the commands to be sent to the I/O address translation unit 104 in FIG. 1A.
  • Command processing unit 102 includes a translate interface input control 122 coupled to the input command queue 118 and the command buffer index function 120 of the preferred embodiment. The translate interface input control 122 provides an address and a command buffer index (CBI) for the address to a translation pipeline 140 of the I/O address translation unit 104 in FIG. 1A. The translate interface input control 122 provides the address and command buffer index (CBI) to a pipeline 126 coupled to a translate interface output control 130. Command processing unit 102 includes the translate interface output control 130 coupled between the address and CBI pipeline 126 and an output command buffer 132. Output command buffer 132 is coupled to an IOC 134. The I/O address translation unit 104 provides the translate interface output control 130 with a hit or miss translation result, a translated address, the CBI, and a CLEAR 141 signal to indicate that a fetch request for a cache miss has completed when a cache table entry is loaded into the cache. The translate interface output control 130 provides a miss command reissue control signal to the translate interface input control 122.
  • In the preferred embodiment the Input Command Queue 118 is a circular buffer with a single head pointer, a speculative tail pointer and a main tail pointer. Commands are added to the queue at the head and are removed from the queue at the main tail pointer. If translation is stalled for all virtual channels, no commands are sent to be translated. Otherwise, the command pointed to by the speculative tail pointer is sent to the I/O address translation unit to be translated and then the speculative tail pointer is advanced towards the head pointer. However, if translation is stalled for a specific virtual channel corresponding to the command pointed to by the speculative tail pointer, this command is not sent to the I/O address translation unit, but the speculative tail pointer is still advanced towards the head pointer. In addition to this circular buffer there is a list of completion flags, one per queue entry, which indicate that the command at that entry has completed address translation. When a command completes address translation successfully, and the main tail pointer is pointing to that command, the main tail pointer is advanced toward the head pointer to the next command that has not completed translation (i.e., the completion flag is not asserted). All completion flags, for completed commands that get bypassed, are then de-asserted. When a command completes address translation successfully and the command is between the main tail pointer and the head pointer in the command queue, then the completion flag for that entry is asserted. When a command gets a cache miss, the completion flag remains de-asserted. When the CLEAR 141 signal is asserted, the speculative tail pointer is set to the CBI value sent with the CLEAR 141 signal and then advances toward the head pointer re-issuing requests for the commands that have not completed translation. Other implementations are available, such as the use of linked lists and separate command queues for each virtual channel.
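The pointer behaviour of the queue described above can be sketched as a simplified software model. This is illustrative only: the hardware uses a fixed-size circular buffer and per-VC stall conditions, while this model grows a Python list, uses plain indices with the CBI equal to the entry index, and omits the de-assertion of bypassed completion flags.

```python
class InputCommandQueue:
    """Simplified model of the input command queue: head, speculative tail,
    main tail, and per-entry completion flags."""

    def __init__(self):
        self.entries = []     # commands, indexed by CBI
        self.done = []        # completion flags, one per entry
        self.spec_tail = 0    # speculative tail: next command to send
        self.main_tail = 0    # main tail: oldest not-yet-completed command

    def enqueue(self, cmd):
        """Add a command at the head; its index serves as its CBI."""
        self.entries.append(cmd)
        self.done.append(False)
        return len(self.entries) - 1

    def next_to_translate(self):
        """Return the CBI at the speculative tail and advance it toward the
        head, skipping commands that have already completed translation."""
        while self.spec_tail < len(self.entries) and self.done[self.spec_tail]:
            self.spec_tail += 1
        if self.spec_tail >= len(self.entries):
            return None
        cbi = self.spec_tail
        self.spec_tail += 1
        return cbi

    def complete(self, cbi):
        """Assert the completion flag; if the main tail points at a completed
        command, advance it past all in-order completed commands."""
        self.done[cbi] = True
        while self.main_tail < len(self.entries) and self.done[self.main_tail]:
            self.main_tail += 1

    def clear(self, cbi):
        """CLEAR signal: rewind the speculative tail to the CBI whose miss
        fetch completed, so untranslated commands are re-issued."""
        self.spec_tail = cbi

q = InputCommandQueue()
a, b = q.enqueue("load A"), q.enqueue("store B")
assert q.next_to_translate() == a and q.next_to_translate() == b
q.complete(b)          # hit under A's miss: B finishes but is not retired
assert q.main_tail == a
q.clear(a)             # A's fetch data arrived: re-issue from CBI a
assert q.next_to_translate() == a
```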
  • As the addresses are passed from the command processing unit 102, two types of address translation misses can occur including a segment table cache miss and a page table cache miss. When a translation cache miss occurs the I/O address translation unit 104 performs a memory fetch to get the page or segment table entry depending on the type of the cache miss. Since this logic is pipelined, addresses are presented to the translation logic continually, so even when a cache miss occurs, addresses following that miss still are processed.
  • Referring to FIG. 1A, I/O address translation unit 104 includes a translation pipeline 140 providing a plurality of signals to the translate interface output control 130 in FIG. 1B, including translation results (hit/miss), a translated address, a command buffer index (CBI), and a CLEAR 141, which indicates to the command processing unit 102 that it should re-issue a translation request for a given command indexed by a CBI. The translation pipeline 140 is coupled to the EIB bus 124, a page cache 142, such as a 4-way page cache, a segment cache 144, and a miss fetch unit 146. The miss fetch unit 146 is coupled to a mapping function that maps a command buffer index (CBI) to a command identifier (CI), referenced as CBI to CI mapper 148; on a cache miss, the miss fetch unit 146 passes the CBI to the CBI to CI mapper 148. The miss fetch unit 146 applies a fetch request to the memory 110 via the EIB 112. A fetch data handler 150 is coupled to the CBI to CI mapper 148 and receives fetch data from memory 110 via the EIB 112.
  • The invention provides a method of implementing a miss-under-miss for I/O commands. Addresses that get cache hits during an outstanding miss are called hits-under-miss. When a miss occurs while another miss is being handled, this is called a miss-under-miss. The process of the invention is as follows:
  • Initially, every address from the translate interface input control 122 that is sent to the address translation unit 104 is assigned a Command Buffer Index (CBI) by the command buffer index function 120 of the command processing unit 102. The CBI is the location of the command in the command processing unit's buffer or input command queue 118. This CBI is used when the entry for that miss has been loaded into the cache 142 and the command's address needs to be re-issued to the I/O address translation unit 104.
  • The command processing unit 102 sends an address and CBI from the translate interface input control 122 to the translation pipeline 140 of the I/O address translation unit 104. The segment table cache 144 is searched for the corresponding segment table entry, and the page table cache 142 is searched for the corresponding page table entry.
  • When an address translation cache miss occurs, a memory fetch request is sent by the miss fetch unit 146 to the memory controller or memory 110 via the EIB 112. These memory fetches have unique identifiers so that when the return data comes back, the unit that sent the request accepts the data based on a return tag match. This unique identifier is called the CI, or command identifier. The CBI is stored in a mapping array 148 which is indexed by the CI, so that when the return data comes back from memory 110, the translation logic or fetch handler 150 knows where to put the data and can also send the CBI back to the command processing unit 102 so that the command can be re-issued.
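  • The CI tagging and return-tag match above can be sketched as follows; the class and method names are illustrative assumptions for this sketch:

```python
class CbiToCiMapper:
    """Minimal model of the CBI to CI mapping array 148: each in-flight
    memory fetch carries a unique command identifier (CI), and the array
    remembers which command buffer index (CBI) waits on each CI."""

    def __init__(self, max_outstanding):
        self.table = {}                           # CI -> CBI for in-flight fetches
        self.free_cis = list(range(max_outstanding))

    def issue_fetch(self, cbi):
        """Allocate a CI tag for a miss fetch and record the waiting CBI."""
        ci = self.free_cis.pop()
        self.table[ci] = cbi
        return ci                                 # tag sent with the fetch request

    def on_return_data(self, ci):
        """Return data arrives tagged with its CI: recover the CBI so the
        command processing unit can re-issue the command, then free the tag."""
        cbi = self.table.pop(ci)
        self.free_cis.append(ci)
        return cbi
```

Because the table is indexed by CI rather than searched by address, any number of fetches (up to the tag pool size) can be outstanding at once, which is what permits misses under misses.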
  • As the address translation cache 142, 144 gets cache misses, the memory fetch requests are sent out to the memory controller even when more than one memory fetch request is outstanding. The only stipulation is that when the address translation cache 142, 144 sees that a congruence class has as many outstanding misses as there are ways in the respective cache, the address translation cache 142, 144 indicates to the translate interface input control 122 of the command processing unit 102 that the translation request was denied and that the command will need to be re-issued, because all of the resources for that congruence class may be consumed.
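  • The stipulation above amounts to a per-congruence-class counter check; a hedged sketch, in which the function name and the dictionary representation of outstanding misses are assumptions:

```python
def translation_denied(outstanding_misses, congruence_class, num_ways):
    """Deny a translation request when its congruence class already has as
    many outstanding misses as the cache has ways, since every way in that
    class may already be reserved by a pending fill.

    outstanding_misses maps congruence class -> count of pending misses."""
    return outstanding_misses.get(congruence_class, 0) >= num_ways
```

For the 4-way page cache 142, for example, a fourth outstanding miss to the same congruence class would cause subsequent requests to that class to be denied until a fill completes.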
  • Along with the CBI, additional translation information needs to be stored in the CBI to CI mapping array 148 to help the address translation unit 104 update the cache. This additional information is the following: the fetch type (segment or page table fetch); the Segment Table Cache Set, which is used for indexing into the cache 144; the Page Table Cache Set, which is used for indexing into the cache 142; and the IOID, the identification of a particular I/O device 116.
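  • One entry of the mapping array 148 can therefore be pictured as a small record. The field list follows the text above; the names and types are illustrative assumptions:

```python
from dataclasses import dataclass


@dataclass
class MappingEntry:
    """Sketch of one CBI to CI mapping array 148 entry, indexed by CI."""
    cbi: int                 # command buffer index, returned with CLEAR
    fetch_type: str          # "segment" or "page" table fetch
    segment_cache_set: int   # index into the segment table cache 144
    page_cache_set: int      # index into the page table cache 142
    ioid: int                # identification of the requesting I/O device 116
```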
  • Once the page table or segment table entry has been loaded into the appropriate cache 142 or 144, the CBI is passed back to the translate interface output control 130 of the command processing unit 102 with a CLEAR 141 signal to indicate that the fetch has completed and that the command processing unit can re-issue the address translation request because the cache entry has been loaded.
  • The command processing unit 102 then uses the CBI to locate the command and address that needs to be re-issued to the I/O address translation unit 104.
  • When the address together with the CBI is re-issued to the translation unit 104, a cache hit should result in the appropriate cache 142 or 144 that had the previous cache miss. The hits under a miss to the same VC, IOID or I/O bus are re-translated after the miss is translated. Other, more elaborate schemes could track and not re-issue the commands that have completed translation and are already stored in the output command buffer 132.
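  • The overall miss-and-re-issue handshake can be summarized in a short sketch. All names, the dictionary-based cache, and the 4 KB page split are assumptions made only for illustration:

```python
def try_translate(cache, pending, addr, cbi, ci):
    """Look up a page table entry; on a miss, park the CBI under fetch
    tag CI so the fetch data handler knows whom to wake later."""
    page = addr >> 12                 # assume 4 KB pages for the sketch
    if page in cache:
        return ("hit", cache[page])
    pending[ci] = cbi                 # remember which command waits on this fetch
    return ("miss", ci)


def fetch_returned(cache, pending, ci, page, entry):
    """Fetch data arrives: install the entry in the cache, then signal
    CLEAR with the stored CBI so the command processing unit re-issues
    the translation request, which should now hit."""
    cache[page] = entry
    return pending.pop(ci)            # the CBI sent back with CLEAR 141
```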
  • While the present invention has been described with reference to the details of the embodiments of the invention shown in the drawing, these details are not intended to limit the scope of the invention as claimed in the appended claims.

Claims (13)

1. (canceled)
2. The apparatus for handling concurrent address translation cache misses as recited in claim 8 wherein said command processing unit is responsive to a predefined number of outstanding address translation cache misses for a virtual channel in order to avoid having a single virtual channel consume all fetch miss resources; and said address translation cache miss includes a segment table cache miss or a page table cache miss.
3. The apparatus for handling concurrent address translation cache misses as recited in claim 2 wherein said memory fetch request is for a page table entry or a segment table entry based upon said page table cache miss or said segment table cache miss.
4. The apparatus for handling concurrent address translation cache misses as recited in claim 2 wherein said address translation unit further includes a mapping array coupled to said miss fetch unit for storing the CBI, said mapping array being indexed by a unique command identifier (CI) for the memory fetch request.
5. The apparatus for handling concurrent address translation cache misses as recited in claim 4 wherein said address translation unit includes a fetch handler coupled to said mapping array.
6. The apparatus for handling concurrent address translation cache misses as recited in claim 4 wherein said mapping array stores additional information with the CBI including a page or segment table fetch, a segment table cache set or a page table cache set used for indexing into the cache, and an input/output identification (IOID).
7. The apparatus for handling concurrent address translation cache misses as recited in claim 8 wherein said predefined number of outstanding address translation cache misses for a given congruence class is based upon a number of ways and a type of segment or page cache miss of said address translation cache.
8. An apparatus for handling concurrent address translation cache misses and hits under those misses while maintaining command order comprising:
a command processing unit;
said command processing unit including an input command queue for storing commands and maintaining ordering of the commands;
said command processing unit including a command buffer indexing function in said command processing unit assigning a command buffer index (CBI) to each address being sent from said command processing unit to an address translation unit;
said command processing unit including a translate interface input control for issuing an address and the CBI of address translation requests to said address translation unit;
said address translation unit including a translation pipeline coupled to an address translation cache;
said address translation unit including a miss fetch unit coupled to said translation pipeline for sending a memory fetch request when an address translation cache miss occurs;
said command processing unit being responsive to a predefined number of outstanding address translation cache misses for a given congruence class, to reissue address translation requests to said address translation unit at a later time based on an assertion of a CLEAR signal;
said address translation unit sending the CBI with said CLEAR signal to said command processing unit, said CLEAR signal to indicate that the memory fetch request has completed when a cache table entry is loaded into the cache;
said command processing unit, responsive to the CBI with said CLEAR signal, using the CBI to locate the command and address to reissue an address translation request for the previous address translation cache miss to said address translation unit; and
said command processing unit, responsive to reissuing the address translation requests for the previous address translation cache miss, reissues address translation requests to said address translation unit for hits under a previous address translation cache miss with a same virtual channel, I/O Bus and I/O device.
9. The apparatus for handling concurrent address translation cache misses as recited in claim 8 wherein said miss fetch unit sends another memory fetch request when another address translation cache miss occurs before a previous memory fetch has completed.
10. The apparatus for handling concurrent address translation cache misses as recited in claim 8 wherein said translate interface input control of said command processing unit continues issuing address translation requests to said address translation unit for commands from a different input/output bus.
11. The apparatus for handling concurrent address translation cache misses as recited in claim 8 wherein said translate interface input control of said command processing unit continues issuing address translation requests to said address translation unit for commands from a different virtual channel.
12. The apparatus for handling concurrent address translation cache misses as recited in claim 8 wherein said translate interface input control of said command processing unit continues issuing address translation requests to said address translation unit for commands from a different input/output device.
13-19. (canceled)
US12/351,900 2006-05-30 2009-01-12 Handling concurrent address translation cache misses and hits under those misses while maintaining command order Abandoned US20090187695A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US12/351,900 US20090187695A1 (en) 2006-05-30 2009-01-12 Handling concurrent address translation cache misses and hits under those misses while maintaining command order

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US11/420,884 US7539840B2 (en) 2006-05-30 2006-05-30 Handling concurrent address translation cache misses and hits under those misses while maintaining command order
US12/351,900 US20090187695A1 (en) 2006-05-30 2009-01-12 Handling concurrent address translation cache misses and hits under those misses while maintaining command order

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US11/420,884 Continuation US7539840B2 (en) 2006-05-30 2006-05-30 Handling concurrent address translation cache misses and hits under those misses while maintaining command order

Publications (1)

Publication Number Publication Date
US20090187695A1 true US20090187695A1 (en) 2009-07-23

Family

ID=38791762

Family Applications (2)

Application Number Title Priority Date Filing Date
US11/420,884 Expired - Fee Related US7539840B2 (en) 2006-05-30 2006-05-30 Handling concurrent address translation cache misses and hits under those misses while maintaining command order
US12/351,900 Abandoned US20090187695A1 (en) 2006-05-30 2009-01-12 Handling concurrent address translation cache misses and hits under those misses while maintaining command order

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US11/420,884 Expired - Fee Related US7539840B2 (en) 2006-05-30 2006-05-30 Handling concurrent address translation cache misses and hits under those misses while maintaining command order

Country Status (1)

Country Link
US (2) US7539840B2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110283041A1 (en) * 2009-01-28 2011-11-17 Yasushi Kanoh Cache memory and control method thereof
US20110320761A1 (en) * 2010-06-25 2011-12-29 International Business Machines Corporation Address translation, address translation unit data processing program, and computer program product for address translation

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7716423B2 (en) * 2006-02-07 2010-05-11 International Business Machines Corporation Pseudo LRU algorithm for hint-locking during software and hardware address translation cache miss handling modes
US8386748B2 (en) * 2009-10-29 2013-02-26 Apple Inc. Address translation unit with multiple virtual queues
US8661200B2 (en) * 2010-02-05 2014-02-25 Nokia Corporation Channel controller for multi-channel cache
US20110197031A1 (en) * 2010-02-05 2011-08-11 Nokia Corporation Update Handler For Multi-Channel Cache
JP5625714B2 (en) * 2010-10-07 2014-11-19 富士通セミコンダクター株式会社 Simulation apparatus, program, storage medium, and method
US10318435B2 (en) * 2017-08-22 2019-06-11 International Business Machines Corporation Ensuring forward progress for nested translations in a memory management unit
US11669267B2 (en) * 2018-02-09 2023-06-06 Western Digital Technologies, Inc. Completion entry throttling using host memory
US11775444B1 (en) 2022-03-15 2023-10-03 International Business Machines Corporation Fetch request arbiter

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5418922A (en) * 1992-04-30 1995-05-23 International Business Machines Corporation History table for set prediction for accessing a set associative cache
US20030084273A1 (en) * 2001-10-25 2003-05-01 International Business Machines Corp. Processor and method of testing a processor for hardware faults utilizing a pipeline interlocking test instruction
US20040215921A1 (en) * 2003-04-24 2004-10-28 International Business Machines Corporation Zero cycle penalty in selecting instructions in prefetch buffer in the event of a miss in the instruction cache



Also Published As

Publication number Publication date
US20070283121A1 (en) 2007-12-06
US7539840B2 (en) 2009-05-26


Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO PAY ISSUE FEE